OpenAI, Microsoft Sued by Publishers for Scraping News Articles

Bloomberg Law News· June 25, 2026

A coalition of publishers representing nearly 400 newspapers has filed a lawsuit against OpenAI and Microsoft, alleging the unauthorized scraping of news content to train generative AI models. The complaint, filed in the Southern District of New York, asserts that the defendants built multi-billion dollar products like ChatGPT and Microsoft Copilot using original reporting without providing compensation or seeking permission. This legal action represents a significant escalation in the publishing sector's efforts to protect local journalism from being undermined by the rapid expansion of artificial intelligence.

The lawsuit, filed by publishers including Richner Communications, Inc., claims that OpenAI and Microsoft "systematically and secretly crawled" newspaper websites to copy articles and stories onto their servers for AI training. The complaint alleges that the defendants stripped out copyright management information and reproduced original works in response to user prompts, generating massive market value while the publishers received no financial return. Represented by Platkin LLP, the plaintiffs argue that the AI boom could be a "death knell for local journalism" if developers are not held accountable for the misuse of proprietary content.

Matthew Platkin, the former New Jersey Attorney General leading the legal effort, emphasized that this suit is the largest to date focused on local and regional newspapers. He noted that while previous litigation like the New York Times case was trailblazing, it is essential to bring local outlets to the table to prevent their "extinction." The publishers contend they have spent billions of dollars to produce and protect their reporting—often behind paywalls—only to have it harvested by the defendants to build competing commercial products.

In defense of its practices, OpenAI spokesperson Drew Pusateri stated that the company’s models are trained on publicly available data and are grounded in the legal principle of fair use. Microsoft did not immediately respond to the allegations, which include claims of copyright infringement and violations of the Digital Millennium Copyright Act. The publishers are seeking statutory damages and injunctive relief, joining a broader wave of litigation from other major content providers like CNN and Reddit who are also challenging the data-gathering methods of AI firms.

The outcome of this litigation could have profound implications for the Publishing & Content sector, particularly regarding how revenue is shared between tech platforms and content creators. Platkin argued that it would be inequitable for legal resolutions to only benefit the largest market players while ignoring the local outlets that perform essential reporting. As the industry faces existential threats from automated content generation, this case seeks to establish a precedent that protects the intellectual property of news organizations of all sizes.

Read the full story at Bloomberg Law News

Summary generated by RabbitReport AI from public reporting. The full article and original reporting belong to Bloomberg Law News.