OpenAI, xAI, Google sued for training chatbots with 'pirated books'

The lawsuit, filed by an author-journalist, alleged that the companies copied and used protected literary works to develop large language models that power commercial chatbots

OpenAI, ChatGPT
OpenAI's ChatGPT. (Representational image from files)
Akshita Singh New Delhi
3 min read Last Updated : Dec 23 2025 | 5:02 PM IST
A group of US journalists and writers have sued several leading artificial intelligence (AI) companies, including OpenAI and Elon Musk's xAI, for allegedly using copyrighted books without permission to train their AI systems.
 
The petitioners, including New York Times reporter John Carreyrou, filed the lawsuit on Monday (local time) in a federal court in California.
 
Besides OpenAI and xAI, other defendants include Google, Anthropic, Meta Platforms, and AI search startup Perplexity.
 

What does the complaint allege?

 
The lawsuit, filed by Carreyrou, an investigative journalist known for uncovering fraud at Silicon Valley blood-testing startup Theranos, and three others, alleged that the companies copied and used protected literary works to develop large language models that power commercial chatbots, without securing licences or compensating authors.
 
It described the alleged conduct as deliberate copyright infringement.
 
“This case concerns a straightforward and deliberate act of theft that constitutes copyright infringement,” the filing states. It adds that the companies “illegally copied vast quantities of copyrighted books without permission and then used those stolen copies to build and train their commercial large language models".
 
According to the petitioners, the defendants accessed pirated copies of books through so-called shadow libraries, including LibGen, Z-Library and OceanofPDF. These copies were allegedly reproduced, analysed, re-copied and embedded into AI systems to speed up commercial development.
 
“The Copyright Act prohibits exactly this conduct,” the complaint reads.
 
The lawsuit claimed that the alleged infringement affected hundreds of authors, including bestselling writers and Pulitzer Prize-winning journalists.
 

Why is this case significant?

 
The filing marked the first copyright lawsuit to name xAI as a defendant. It added to a growing list of legal challenges brought by authors, artists and publishers against technology firms over the use of copyrighted material in AI training.
 
The plaintiffs argued that existing class-action settlements fail to reflect the scale of the alleged infringement.
 
“The danger is not hypothetical,” the complaint states, referring to a pending class action against Anthropic. It notes that authors in that case are expected to receive about $3,000 per work, before legal costs, which it describes as “a tiny fraction (just 2 per cent)” of the Copyright Act’s statutory damages ceiling of $150,000 per infringed work.
 
“LLM companies should not be able to so easily extinguish thousands upon thousands of high-value claims at bargain-basement rates,” the filing reads.
 

What are the petitioners seeking?

 
The lawsuit makes clear that the authors are not pursuing a class-action route. Instead, they want individual claims assessed by a jury.
 
“Under established Supreme Court precedent, ‘the amount of statutory damages is a question for the jury'," the complaint states. It adds that the Copyright Act allows authors to hold alleged infringers accountable without relying on class settlements.
 
“This is not how Plaintiffs plan to proceed,” the filing reads.
 

What have AI companies said earlier?

 
AI firms have repeatedly argued that using copyrighted material to train AI models qualifies as fair use, as the systems generate new and transformative outputs rather than reproducing original works.
 
In an earlier case cited by Reuters, a US judge found that Anthropic’s use of copyrighted books for AI training amounted to fair use. However, the court ruled that the company violated copyright law by storing millions of pirated books in a central database, regardless of whether they were ultimately used for training.
 
Carreyrou later told the court that using pirated books to build AI systems was Anthropic’s “original sin”, according to Reuters.
*Subscribe to Business Standard digital and get complimentary access to The New York Times

Smart Quarterly

₹900

3 Months

₹300/Month

SAVE 25%

Smart Essential

₹2,700

1 Year

₹225/Month

SAVE 46%
*Complimentary New York Times access for the 2nd year will be given after 12 months

Super Saver

₹3,900

2 Years

₹162/Month

Subscribe

Renews automatically, cancel anytime

Here’s what’s included in our digital subscription plans

Exclusive premium stories online

  • Over 30 premium stories daily, handpicked by our editors

Complimentary Access to The New York Times

  • News, Games, Cooking, Audio, Wirecutter & The Athletic

Business Standard Epaper

  • Digital replica of our daily newspaper — with options to read, save, and share

Curated Newsletters

  • Insights on markets, finance, politics, tech, and more delivered to your inbox

Market Analysis & Investment Insights

  • In-depth market analysis & insights with access to The Smart Investor

Archives

  • Repository of articles and publications dating back to 1997

Ad-free Reading

  • Uninterrupted reading experience with no advertisements

Seamless Access Across All Devices

  • Access Business Standard across devices — mobile, tablet, or PC, via web or app

More From This Section

Topics :Artificial intelligenceElon MuskGooglecopyright violationOpenAIBS Web Reports

First Published: Dec 23 2025 | 5:02 PM IST

Next Story