OpenAI and Microsoft have been sued by another news organization for using articles to train its artificial intelligence systems, this time by the Center for Investigative Reporting.
In a lawsuit filed in New York district court, the nonprofit newsroom, which produces Mother Jones and Reveal, alleges the Sam Altman-led firm “copied, used, abridged, and displayed” the Center for Investigative Reporting’s content without consent or compensation, in violation of copyright laws. It argues that users of AI tools from OpenAI and Microsoft can obtain variants of copyright-protected material, undercutting the market for articles from the organization.
The lawsuit is at least the fifth from a news organization against OpenAI over novel copyright issues associated with training its AI system. In recent months, similar complaints have been brought by The New York Times, Chicago Tribune and Denver Post. It expands a multifront legal battle that may have far-reaching implications on the news publishing industry, with the financial viability of media in a landscape in which readers can bypass direct sources in favor of search results generated by AI tools at stake.
Amid financial strife in the media industry, some news publishers have chosen to cut licensing deals with the tech giant. In May, OpenAI inked a deal with News Corp. that will bring content from its stable of media outlets to ChatGPT and other products from the company. Under the partnership, OpenAI has permission to display content from News Corp. mastheads in response to user questions. The agreement came amid a flurry of other deals last month with People owner Dotdash Meredith and Reddit.
Unlike other organizations that license material, CIR chief executive Monika Bauerlein said OpenAI and Microsoft “started vacuuming up our stories to make their product more powerful” without permission or compensation. She added, “This free rider behavior is not only unfair, it is a violation of copyright.”
According to the complaint, more than 17,000 webpage addresses from Mother Jones are included in OpenAI’s data set to train AI products from the company and Microsoft. This allows ChatGPT and Microsoft’s Copilot, the lawsuit alleges, to provide responses that “mimic copyright-protected works of journalism” when a user asks about a current event.
The complaint nods to other lawsuits brought by news organizations against OpenAI. When CIR attempted to obtain the same regurgitations detailed in a case initiated by the Daily News, it says it received a message stating, “I’m sorry, but I can’t generate the original ending for the article or any copyrighted content.” The lawsuit claims that OpenAI recently changed ChatGPT to reduce the frequency of outputs that copy material from publishers.
Still, ChatGPT and Copilot repeatedly provide “highly detailed abridgements of copyright-protected news articles” from CIR, according to the complaint.
“In some instances, the initial response will summarize the article in substantial detail,” the lawsuit states. “Further, when prompted by the user to provide more information about one or more aspects of that abridgement, ChatGPT or Copilot will provide additional details, often in the format of a bulleted list of main points.”
By rewriting articles and providing them to users, CIR argues that OpenAI and Microsoft harm the market for those pieces by reducing the incentives to go to the original source. This reduces subscription, licensing and advertising revenue, the lawsuit says, and allows the tech companies to monetize news publishers’ content at the expense of copyright owners.
Pointing to OpenAI striking deals with The Associated Press, The Atlantic and News Corp., among others, CIR alleges that the company knows it’s under the obligation to obtain licenses to use copyright-protected content to train its AI system. CIR says it hasn’t been offered such a deal.
The complaint brings several copyright infringement claims, which carry damages up to $150,000 per infringed work. It seeks a court order forcing OpenAI and Microsoft to remove copies of copyrighted content from its training data sets.
On Tuesday, OpenAI said it would delay the launch of voice features for its chatbot to conduct further safety testing. After it unveiled a demo of the tool last month, legal action was threatened by Scarlett Johansson, who said that the company copied her voice for one of its AI personas.