Commentary: An AI firm won a lawsuit for copyright infringement — but may face a huge bill

June 27, 2025

To judge from the reaction among the AI crowd, a federal judge’s Monday ruling in a copyright infringement case was a clear win for all the AI firms that use published material to “train” their chatbots.

“We are pleased that the Court recognized that using works to train [large language models] was transformative — spectacularly so,” Anthropic, the defendant in the lawsuit, boasted after the ruling.

“Transformative” was a key word in the ruling by U.S. Judge William Alsup of San Francisco, because it’s a test of whether using copyrighted works falls within the “fair use” exemption from copyright infringement. Alsup ruled that using copyrighted works to train bots such as Anthropic’s Claude is indeed fair use, and not a copyright breach.

(Anthropic) could have purchased books, but it preferred to steal them.

— U.S. Judge William Alsup

Anthropic had to acknowledge a troubling qualification in Alsup’s order, however. Although he found for the company on the copyright issue, he also noted that it had downloaded copies of more than 7 million books from online “shadow libraries,” which included countless copyrighted works, without permission.

That action was “inherently, irredeemably infringing,” Alsup concluded. “We will have a trial on the pirated copies…and the resulting damages,” he advised Anthropic ominously: Piracy on that scale could expose the company to judgments worth untold millions of dollars.


What looked superficially like a clear win for AI companies in their long battle to use copyrighted material without paying for it to feed their chatbots now looks clear as mud.

That’s especially true when Alsup’s ruling is paired with a ruling issued Wednesday by U.S. Judge Vince Chhabria, who works out of the same San Francisco courthouse.

In that copyright infringement case, brought against Meta Platforms in 2023 by comedian Sarah Silverman and 12 other published authors, Chhabria also ruled that Meta’s training its AI bots on copyrighted works is defensible as fair use. He granted Meta’s motion for summary judgment.

But he provided plaintiffs in similar cases with a roadmap to winning their claims. He ruled in Meta’s favor, he indicated, only because the plaintiffs’ lawyers failed to raise a legal point that might have given them a victory. More on that in a moment.

“Neither case is going to be the last word” in the battle between copyright holders and AI developers, says Adam Moss, a Los Angeles attorney specializing in copyright law. With more than 40 lawsuits on court dockets around the country, he told me, “it’s too early to declare that either side is going to win the ultimate battle.”

With billions of dollars, even trillions, at stake for AI developers and the artistic community, no one expects the law to be resolved until the issue reaches the Supreme Court, presumably years from now. But it’s worthwhile to look at these recent decisions — and a copyright lawsuit filed earlier this month by Walt Disney Co., NBCUniversal and other studios against Midjourney, another AI developer — for a sense of how the war is shaping up.

To start, a few words about chatbot-making. Developers feed their chatbot models on a torrent of material, much of it scraped from the web — everything from distinguished literary works to random babbling — as well as collections holding millions of books, articles, scientific papers and the like, some of it copyrighted. (Three of my eight books are listed in one such collection, without my permission. I don’t know if any have been “scraped,” and I’m not a party to any copyright lawsuit, as far as I know.)

The goal is to “train” the bots to extract facts and detect patterns in the written material that can then be used to answer AI users’ queries in a semblance of conversational language. There are flaws in the process, of course, including the bots’ tendency when they can’t find an answer in their massive hoard of data to make something up.

In their lawsuits, writers and artists maintain that the use of their material without permission to train the bots is copyright infringement, unless they’ve been paid. The AI developers reply that training falls within the “fair use” exemption in copyright law, which depends on several factors — if only limited material is drawn from a copyrighted work, if the resulting product is “transformative,” and if it doesn’t significantly cut into the market for the original work.

That brings us to the lawsuits at hand.

Three authors — novelist Andrea Bartz and nonfiction writers Charles Graeber and Kirk Wallace Johnson — sued Anthropic for using their works without permission. In their lawsuit, filed last year, it emerged that Anthropic had spent millions of dollars to acquire millions of print books, new and used, to feed its bots.

“Anthropic purchased its print copies fair and square,” Alsup wrote. It’s generally understood that the owners of books can do almost anything they wish with them, including reselling them.

But Anthropic also downloaded copies of more than 7 million books from online “shadow libraries,” which include countless copyrighted works, without permission.

Anthropic “could have purchased books, but it preferred to steal them to avoid ‘legal/practice/business slog,’” Alsup wrote. (He was quoting Anthropic co-founder and CEO Dario Amodei.)

Anthropic told me by email that “it’s clear that we acquired books for one purpose only — building LLMs — and the court clearly held that use was fair.”

That’s correct as far as it goes. But Alsup found that Anthropic’s goal was not only to train LLMs, but to create a general library “we could use for research” or to “inform our products,” as an Anthropic executive said, according to legal papers.

Chhabria’s ruling in the Meta case presented another wrinkle. He explicitly disagreed with Alsup about whether using copyrighted works without permission to train bots is fair use.

“Companies have been unable to resist the temptation to feed copyright-protected materials into their models—without getting permission from the copyright holders or paying them,” Chhabria wrote. He posed the question: Is that illegal? And answered, “Although the devil is in the details, in most cases the answer will be yes.”

Chhabria’s rationale was that a flood of AI-generated works will “dramatically undermine the market” for the original works, and thus “dramatically undermine the incentive for human beings to create things the old-fashioned way.”

Protecting the incentive for human creation is exactly the goal of copyright law, he wrote. “While AI-generated books probably wouldn’t have much of an effect on the market for the works of Agatha Christie, they could very well prevent the next Agatha Christie from getting noticed or selling enough books to keep writing.”

Artists and authors can win their copyright infringement cases if they produce evidence showing the bots are affecting their market. Chhabria all but pleaded for the plaintiffs to bring some such evidence before him:

“It’s hard to imagine that it can be fair use to use copyrighted books…to make billions or trillions of dollars while enabling the creation of a potentially endless stream of competing works that could significantly harm the market for those books.”

But “the plaintiffs never so much as mentioned it,” he lamented.

As a result, he said, he had no choice but to give Meta a major win against the authors.

I asked the six law firms representing the authors for their response to Chhabria’s implicit criticism of their lawyering, but heard back from only one — Boies Schiller Flexner, which told me by email, “despite the undisputed record of Meta’s historically unprecedented pirating of copyrighted works, the court ruled in Meta’s favor. We respectfully disagree with that conclusion.”

All this leaves the road ahead largely uncharted. “Regardless of how the courts rule, I believe the end result will be some form of licensing agreement,” says Robin Feldman, director of the Center for Innovation at UC Law San Francisco. “The question is where will the chips fall in the deal and will smaller authors be left out in the cold.”

Some AI firms have reached licensing agreements with publishers allowing them to use those publishers’ copyrighted works to train their bots. But the nature and size of those agreements may depend on how the underlying issues of copyright infringement play out in the courts.

Indeed, Chhabria noted that filings in his court documented that Meta was trying to negotiate such agreements until it realized that a shadow library it had downloaded already contained most of the works it was trying to license. At that point it “abandoned its licensing efforts.” (I asked Meta to confirm Chhabria’s version, but didn’t get a reply.)

The truth is that the AI camp is simply trying to get for free something it should be paying for. Never mind the trillions of dollars in revenue these firms say they expect over the next decade — they claim that licensing will be so expensive it will stop the march of this supposedly historic technology dead in its tracks.

Chhabria aptly called this argument “nonsense.” If using books for training is as valuable as the AI firms say it is, he noted, then surely a market for book licensing will emerge. That is, it will — if the courts don’t give the firms the right to use stolen works without compensation.