OpenAI, xAI, Google sued for training chatbots with 'pirated books'

The lawsuit, filed by an investigative journalist and three others, alleged that the companies copied and used protected literary works to develop large language models that power commercial chatbots

OpenAI's ChatGPT. (Representational image from files)

Akshita Singh New Delhi

Last Updated: Dec 23 2025 | 5:02 PM IST

A group of US journalists and writers has sued several leading artificial intelligence (AI) companies, including OpenAI and Elon Musk's xAI, for allegedly using copyrighted books without permission to train their AI systems.

The petitioners, including New York Times reporter John Carreyrou, filed the lawsuit on Monday (local time) in a federal court in California.

Besides OpenAI and xAI, other defendants include Google, Anthropic, Meta Platforms, and AI search startup Perplexity.

What does the complaint allege?

Carreyrou, an investigative journalist known for uncovering fraud at Silicon Valley blood-testing startup Theranos, filed the lawsuit with three others. It alleged that the companies copied and used protected literary works to develop the large language models that power commercial chatbots, without securing licences or compensating authors.

It described the alleged conduct as deliberate copyright infringement.

“This case concerns a straightforward and deliberate act of theft that constitutes copyright infringement,” the filing states. It adds that the companies “illegally copied vast quantities of copyrighted books without permission and then used those stolen copies to build and train their commercial large language models”.

According to the petitioners, the defendants accessed pirated copies of books through so-called shadow libraries, including LibGen, Z-Library and OceanofPDF. These copies were allegedly reproduced, analysed, re-copied and embedded into AI systems to speed up commercial development.

“The Copyright Act prohibits exactly this conduct,” the complaint reads.

The lawsuit claimed that the alleged infringement affected hundreds of authors, including bestselling writers and Pulitzer Prize-winning journalists.

Why is this case significant?

The filing marked the first copyright lawsuit to name xAI as a defendant. It added to a growing list of legal challenges brought by authors, artists and publishers against technology firms over the use of copyrighted material in AI training.

The plaintiffs argued that existing class-action settlements fail to reflect the scale of the alleged infringement.

“The danger is not hypothetical,” the complaint states, referring to a pending class action against Anthropic. It notes that authors in that case are expected to receive about $3,000 per work, before legal costs, which it describes as “a tiny fraction (just 2 per cent)” of the Copyright Act’s statutory damages ceiling of $150,000 per infringed work.

“LLM companies should not be able to so easily extinguish thousands upon thousands of high-value claims at bargain-basement rates,” the filing reads.

What are the petitioners seeking?

The lawsuit makes clear that the authors are not pursuing a class-action route. Instead, they want individual claims assessed by a jury.

“Under established Supreme Court precedent, ‘the amount of statutory damages is a question for the jury’,” the complaint states. It adds that the Copyright Act allows authors to hold alleged infringers accountable without relying on class settlements.

“This is not how Plaintiffs plan to proceed,” the filing reads.

What have AI companies said earlier?

AI firms have repeatedly argued that using copyrighted material to train AI models qualifies as fair use, as the systems generate new and transformative outputs rather than reproducing original works.

In an earlier case cited by Reuters, a US judge found that Anthropic’s use of copyrighted books for AI training amounted to fair use. However, the court ruled that the company violated copyright law by storing millions of pirated books in a central database, regardless of whether they were ultimately used for training.

Carreyrou later told the court that using pirated books to build AI systems was Anthropic’s “original sin”, according to Reuters.
