Meta Stole My Books—And CEO Mark Zuckerberg Approved It

March 25, 2025

Like any author, I have always had a complicated relationship with LibGen. For those unfamiliar, LibGen—or Library Genesis—is essentially a digital warehouse of stolen intellectual property, neatly stacked with pirated books, academic papers, and various works authors and publishers never approved.

Most authors, like me, quietly grit their teeth. We sigh. We tolerate.

But now, there is something far harder to stomach.

Meta, led by founder and CEO Mark Zuckerberg, deliberately and explicitly authorized a raid on LibGen—and Anna’s Archive, another massive digital pirate haven—to train its latest AI model, Llama 3.

The Heist

Court documents, surfaced recently through stellar investigative reporting by Alex Reisner at The Atlantic, expose the ugly underbelly of Meta’s so-called “innovation.”

The gist is as follows: senior Meta management recognized that the company urgently needed high-quality content to train its large language model (LLM). As one email chillingly admitted, “books are actually more important than web data.”

Meta staff turned to LibGen, home to more than 7.5 million pirated books and 81 million stolen research papers, to fill that gap. They did the same with Anna’s Archive. After internal discussions, Zuckerberg himself greenlit the theft.

You read correctly. The founder of Facebook personally sanctioned piracy.

To call this unethical would do a disservice to the word unethical.

The Author’s Plight

But let’s put aside sarcasm for a moment. Meta’s actions are not merely an irritation for authors. They ought to be considered a moral crisis that deserves everyone’s attention, including that of the U.S. Government’s National Intellectual Property Rights Coordination Center.

Authors already earn very little from books. For most—including yours truly—the financial reward of writing a book is modest at best. It covers a few mortgage payments, perhaps a little more if luck shines.

Many authors will invest years into their research, writing, and revisions, followed by months (or years) of promotional engagements.

For the authors that I am friends with and talk to, it’s never about a fast payday. In fact, it’s never really about the money. It’s about the ideas, the learning, and contributing something meaningful to society.

Book writing really is part of an author’s sense of purpose.

Affordable Book Act

Zuckerberg and Meta’s decision to steal dismisses all of that work as nothing more than cheap fuel for AI. It is not simply unfair; it is exploitative.

It goes further than the books being used without permission. Meta—with its $164.5 billion in 2024 revenues and almost $62.4 billion in profits—could have easily negotiated agreements with publishers and authors.

The company might even have led the industry in sourcing LLM training data by creating licensed arrangements that respected authors’ rights.

Ethics and legality seem to be missing from Meta’s core values statements.

Coincidentally, Meta’s “focus on long-term impact” core value states: “We emphasize long-term thinking that encourages us to extend the timeline for the impact we have, rather than optimizing for near-term wins.”

It seems very clear that Meta was indeed optimizing for near-term wins in this case.

Willful Disregard

As Reisner reported, when Meta’s engineers realized they needed high-quality content to make Llama 3 competitive, the team did not hesitate to steal. The quick fix was obvious.

Why pay authors and publishers fairly when Meta engineers could exploit their intellectual property for free?

I decided to check Alex Reisner’s handy tool, which reveals if one’s books might be caught up in the LibGen heist. The result?

All five of my books were pirated and included in Meta’s dataset. They were swept up in the Anna’s Archive heist as well.

Meta, predictably, has retreated behind the tired, old “fair use” defense in its dealings with lawyers and judges. Its argument suggests that because Llama 3 allegedly “transforms” these stolen texts into new outputs, this colossal act of theft becomes justified.

However, fair use arguments were meant for education, commentary, and criticism, not corporate exploitation for commercial profit at a breathtaking scale.

Based on their 2024 financials, Meta is not some struggling teacher in Boise, Idaho, photocopying textbook pages for their students. Meta ranks among the top 10 most valuable companies in the world. Meta’s market capitalization was roughly $1.8 trillion as of this writing.

Yet, amazingly, it deliberately decided to steal from individual creators and authors, most of whom are far from wealthy.

Next Steps

Some creators have filed a major class-action lawsuit alleging copyright infringement and unfair competition against Meta. This litigation might define how companies can acquire data for their LLMs in the future. (Disclosure: I am not part of this litigation.)

AI and tech companies will continue to face scrutiny for their LLM data-sourcing practices. The industry’s voracious hunger for data often eclipses ethical considerations.

Meta’s decision spotlights the broader recklessness prevalent across the AI ecosystem. While Meta might currently be the poster child for data theft, other firms—some we do not know about yet—are almost certainly guilty of similar sins.

We urgently need transparency and robust ethical guidelines for LLM training.

Companies must develop sustainable, lawful partnerships with content creators, authors, publishers, and the like.

Tech companies must be required to respect copyrights, intellectual property, and the simple human dignity behind creative effort.

Innovation cannot excuse exploitation.

How we treat creators today determines the future of our knowledge, art, and ideas for tomorrow.

Meta’s conduct establishes a perilous precedent, one that might lead individuals to reconsider their willingness to engage in public discourse, such as writing books and articles. And that would be a terrible shame.

Ultimately, Zuckerberg’s decision to approve the heist must be characterized as unlawful.

All creators deserve better. All authors deserve fairness.

I’m no Pollyanna, but the future should not be built on stolen ideas.