Anthropic owes authors $1.5B for pirating work — but the claims process is a Kafkaesque mess

· Vox

Earlier this year, the author Maureen Johnson was fighting with Anthropic.

Specifically, she was wrestling with the Anthropic copyright settlement website. 

Johnson is the author of 28 books, most of them YA and many of them bestsellers. The AI company Anthropic owes her an estimated $3,000 per book (to be split 50-50 with her publisher) for several of them. The payouts are part of a first-of-its-kind settlement that was handed down last fall, in which Anthropic admitted that it downloaded millions of pirated, copyrighted books to train its AI models without authors’ permission. (According to the New York Times, “As part of the settlement, Anthropic said it did not use any pirated works to build A.I. technologies that were publicly released.”) A judge found that the use of those books without authorial permission constituted fair use, but the piracy did not. Similar suits are pending against Meta and OpenAI. (Disclosure: Vox’s Future Perfect is funded in part by the BEMC Foundation, whose major funder was also an early investor in Anthropic; they don’t have any editorial input into our content.)

Key takeaways

  • Anthropic owes a class of half a million authors $1.5 billion as a legal settlement for downloading pirated books to train its AI model.
  • However, Anthropic’s data set was so buggy that authors had a hard time navigating the website set up to administer the claim.
  • Plus, that $1.5 billion works out to a very small amount for each individual author in the class, particularly after they’ve split the payout with their publishers.
  • The settlement will go to court for a fairness hearing on May 14.

The class-action lawsuit was intended to level the playing field between individual authors and one of the most valuable companies in the world. To distribute the money to authors, Anthropic and the plaintiffs’ lawyers worked with a claims administrator (a company that specializes in managing compensation claims) to set up a website that authors can use to access a small piece of the record-breaking $1.5 billion payout.

But Johnson, like other authors who spoke to Vox, quickly hit a snag: The claims site is glitchy and unreliable, forcing people to jump through endless hoops to collect the money they’re owed. By March, she had already submitted claims for her 14 eligible titles twice, spending 90 minutes each time to painstakingly fill out the forms. 

Now, the claims administrator was telling her they couldn’t find either of her entries. They escalated her through several layers of management, each of whom repeated the same thing. 

“It was getting more and more surreal, how little this system worked,” Johnson said.

Eventually, Johnson connected with an employee who she said spent the entire call giggling. He told her that he had found her first claim submission from February, but not the new one. 

“This system is really fluky,” Johnson said she told him. “It’s just not well-programmed.”

In response, Johnson said the employee giggled again. “Coding is hard,” he told her.

Johnson is not alone in her frustrating experience. Authors had six months to register their claims for Anthropic’s payout, and a lot of them struggled to do so.

Anthropic regularly touts its ethical and philanthropic bona fides. (The company is here to serve humanity’s long-term well-being! It’s the safe and responsible AI company! Claude helped NASA’s Perseverance rover travel on Mars!) But the good it is doing is based on stolen work — and the people who created that work are having trouble getting the very small recourse that they are owed.

“Everyone agrees it’s not the best data.”

All of the popular large language models were trained on books; that was the only way to get them enough high-quality text to start generating their own. Most of those books were downloaded from pirate libraries, in at least one instance on the grounds that it would simply be too expensive to pay for each title. As it became increasingly clear that this was the case, the class action lawsuits began rolling in.

Bartz et al. v. Anthropic PBC was the first to be settled. In September 2025, a judge approved a $1.5 billion settlement between Anthropic and the nearly half a million writers it had determined belonged to the class. Things got tricky, however, when it came time to determine who those half a million writers were.

They had to be authors of books that appeared in one of the three pirated databases Anthropic used in 2021. But trying to create a comprehensive list from those databases proved difficult. Anthropic hadn’t created its own records as it fed pirated books into its training corpus, so lawyers on both sides had to rely on the pirate sites’ own data. And they had to do it quickly, because the trial came with strict deadlines. 

“It’s, like, crowdsourced pirate library metadata,” Dave Hansen, executive director of the advocacy group Authors Alliance, told Vox. (Authors Alliance has filed amicus briefs in the Bartz case and published extensive technical explainers for authors.) “I wouldn’t rely on that for almost anything, much less administering legal claims in a large and important lawsuit. But that was kind of the best that they had given the data sources being used.” 

“I think everyone agrees it’s not the best data, but it’s the best that they could do on the time frame,” publishing industry reporter Jane Friedman told Vox. “I think it was just the reality for class counsel. The judge was really expediting matters, and so they did the best they could in the time that they had.”

Neither Anthropic, its lawyers, the class counsel for this case, nor the claims administrator responded to a request for comment from Vox. But it appears that the plaintiffs’ lawyers and the claims administrator worked together to narrow down Anthropic’s starting list of 7 million books to only titles that were under US copyright in 2022.

“Then they used a bunch of other industry sources to enrich that data so that they had more information about current publishers, and then used that to generate contact info,” Hansen said. “At that scale, it’s really hard to get 100 percent accuracy.” He added, “One of my bigger criticisms of how this settlement and process has gone is the data. They just haven’t been very transparent about it.”

From there, the claims administrator and class counsel used that wonky list to build their glitchy website, which is how Maureen Johnson eventually found herself on the phone with a giggling man who told her coding was hard. Other authors were in a similar boat.

“I have 19 titles in the database,” said Christopher Moore, the author of zany comedic novels like Lamb: The Gospel According to Biff, Christ’s Childhood Best Friend. After he had done the paperwork for 18 of them, he had to walk away from his computer. When he came back the next day to finish the paperwork for the 19th book, everything had been deleted.

A month went by after he submitted the form a second time, Moore said. “And I got another notice: what about these other titles?” Most of the titles belonged to one of the four other Christopher Moores working as authors. One was actually his, Moore said, “but it showed it with some weird Texas copyright.” He filed the claim anyway and is still waiting to hear back. 

April Henry, who writes YA mysteries, also found unusual copyright holders on her books. “One of the books on the list appeared to be an audiobook and showed the narrator as one of the copyright holders,” she said.

Meanwhile, she is struggling to figure out how to handle the seven of her 22 books that she wrote with a co-author. “No one ever had it in their contract that you’re going to split the rights to a legal settlement,” Henry said. “You know what I mean?”

And as authors struggle to navigate the claims process, they’re doing so with mixed emotions.

“That’s not a lot for your entire catalog.”

Johnson is still furious about her experience with the claim administrator’s website. “Your AI monster ate all of our work,” she said, addressing Anthropic. “Now you’re trying to pay us off with this […] piece of garbage that doesn’t work.”

For many authors, the money didn’t seem like enough, considering that their life’s works had been taken without their permission. The full settlement of $1.5 billion sounds like a lot. But split among so many copyright holders, it doesn’t go all that far. There’s also the fact that the $3,000 number is just an estimate of what authors’ payouts will eventually look like. In reality, there is a flat amount of cash available for the class, and the more people participate in the class, the smaller the share available to each participant gets.

“When you think $3,000 a book times 22 books, you’re like, ‘I get $66,000,’” Henry said. But then there’s the money that goes to the publishers, and any money that goes to a co-author. “In some cases, it’s going to end up being like $500 a book,” Henry said. “At first you’re like, ‘What a windfall!’ But it doesn’t seem like a windfall.”
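To see how the headline figure shrinks, here is a rough back-of-the-envelope sketch of the arithmetic. It is an illustration under stated assumptions, not the settlement's actual formula: it assumes the fund is divided evenly across claimed works, that roughly 500,000 works are claimed (the count that would produce the $3,000 estimate), that publishers take 50 percent, and that co-authors split evenly. It ignores legal fees and administrative costs. The function names are hypothetical.

```python
from fractions import Fraction

SETTLEMENT_FUND = 1_500_000_000  # total settlement in dollars, per the article


def per_work_estimate(fund, claimed_works):
    """Divide a flat fund evenly across claimed works.

    Assumption: an even split with no fees or costs deducted.
    """
    return Fraction(fund, claimed_works)


def author_share(per_work, publisher_fraction=Fraction(1, 2), authors=1):
    """Amount one author keeps after the publisher's cut and an even co-author split.

    Assumption: 50-50 publisher split and equal co-author shares; real
    contracts may differ.
    """
    return per_work * (1 - publisher_fraction) / authors


# Assumed count of claimed works that yields the $3,000 estimate.
per_work = per_work_estimate(SETTLEMENT_FUND, 500_000)

print(float(per_work))                           # 3000.0 per work
print(float(author_share(per_work)))             # 1500.0 after the publisher split
print(float(author_share(per_work, authors=2)))  # 750.0 with one co-author

# The pot is fixed, so more claimed works means less per work.
print(float(per_work_estimate(SETTLEMENT_FUND, 600_000)))  # 2500.0
```

Under these assumptions, a solo author keeps about half of the estimate, and a co-authored title pays each writer about a quarter of it. Any deductions the sketch ignores would push the numbers lower.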

“For me, it’s an entire career, and it’ll come down to under $30,000,” Moore said. “That’s not a lot for your entire catalog.” 

Then there’s the question of what the new world Anthropic helped build with all those stolen books will look like for authors. “We have no idea what the long-term damage of this is to artists,” Moore said. “I’m on the downhill slope of my career, so there’s not that much that they can take from me. But if someone’s strong in the middle of their career, they could really be hurt by this.”

On May 14, the settlement will receive a fairness hearing, where the judge is set to review a number of author complaints, including what they describe as “inadequate compensation relative to the damage.” 

In the meantime, Anthropic remains one of tech’s biggest players, currently valued at $900 billion. According to the industry headlines: “Anthropic’s Claude claws its way towards the top of the AI market.”
