Amazon reported large amounts of child sexual abuse material found in AI training data
January 29, 2026
Amazon.com Inc. reported hundreds of thousands of pieces of content last year that it believed included child sexual abuse, which it found in data gathered to improve its artificial intelligence models. Though Amazon removed the content before training its models, child safety officials said the company has not provided information about its source, potentially hindering law enforcement from finding perpetrators and protecting victims.
Throughout last year, Amazon detected the material in its AI training data and reported it to the National Center for Missing and Exploited Children, or NCMEC. The organization, which was established by Congress to field tips about child sexual abuse and share them with law enforcement, recently started tracking the number of reports specifically tied to AI products and their development. In 2025, NCMEC saw at least a 15-fold increase in these AI-related reports, with “the vast majority” coming from Amazon. The findings haven’t been previously reported.
An Amazon spokesperson said the training data were obtained from external sources, and the company doesn’t have details about their origin that could aid investigators. It’s common for companies to use data scraped from publicly available sources, such as the open web, to train their AI models. Other large tech companies have also scanned their training data and reported potentially exploitative material to NCMEC. However, the clearinghouse pointed to “glaring differences” between Amazon and its peers. The other companies collectively made just “a handful of reports,” and provided more detail on the origin of the material, a top NCMEC official said.
In an emailed statement, the Amazon spokesperson said that the company is committed to preventing child sexual abuse material across all of its businesses. “We take a deliberately cautious approach to scanning foundation model training data, including data from the public web, to identify and remove known [child sexual abuse material] and protect our customers,” the spokesperson said.
The spike in Amazon’s reports coincides with a fast-moving AI race that has left companies large and small scrambling to acquire and ingest huge volumes of data to improve their models. But that race has also complicated the work of child safety officials — who are struggling to keep up with the changing technology — and challenged regulators tasked with safeguarding AI from abuse. AI safety experts warn that quickly amassing large data sets without proper safeguards comes with grave risks.
Amazon accounted for most of the more than 1 million AI-related reports of child sexual abuse material submitted to NCMEC in 2025, the organization said. It marks a jump from the 67,000 AI-related reports that came from across the tech and media industry a year prior, and just 4,700 in 2023. This category of AI-related reports can include AI-generated photos and videos, or sexually explicit conversations with AI chatbots. It can also include photos of real victims of sexual abuse that were collected, even unintentionally, in an effort to improve AI models.
Training AI on illegal and exploitative content raises newfound concerns. It could risk shaping a model’s underlying behaviors, potentially improving its ability to digitally alter and sexualize photos of real children or create entirely new images of sexualized children that never existed. It also raises the threat of continuing the circulation of the images that models were trained on — re-victimizing children who have suffered abuse.
The Amazon spokesperson said that, as of January, the company is “not aware of any instances” of its models generating child sexual abuse material. None of its reports submitted to NCMEC was of AI-generated material, the spokesperson added. Instead, the content was flagged by an automatic detection tool that compared it against a database of known child abuse material involving real victims, a process called “hashing.” Approximately 99.97% of the reports resulted from scanning “non-proprietary training data,” the spokesperson said.
Amazon believes it over-reported these cases to NCMEC to avoid accidentally missing something. “We intentionally use an over-inclusive threshold for scanning, which yields a high percentage of false positives,” the spokesperson added.
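To illustrate the kind of hash-based matching the spokesperson described, the sketch below shows a simple perceptual “average hash” compared against a set of known hashes using a tunable Hamming-distance threshold. It is an assumption-laden illustration, not Amazon’s actual tooling: the function names, the placeholder hash value, and the threshold are hypothetical, and production systems rely on more robust hashes and vetted hash lists distributed by clearinghouses such as NCMEC.

# Minimal sketch of hash-based scanning of training images against a database
# of known-image hashes. Illustrative assumptions only; not Amazon's tooling.
from PIL import Image  # pip install Pillow


def average_hash(path: str, size: int = 8) -> int:
    """Downscale to size x size grayscale, then set one bit per pixel
    depending on whether it is brighter than the mean."""
    pixels = list(Image.open(path).convert("L").resize((size, size)).getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def matches_known_hash(candidate: int, known_hashes: set[int], max_distance: int) -> bool:
    """Flag the candidate if it is within max_distance bits of any known hash.
    A larger max_distance is 'over-inclusive': it catches slightly altered
    copies but also yields more false positives."""
    for known in known_hashes:
        if bin(candidate ^ known).count("1") <= max_distance:
            return True
    return False


# Hypothetical usage: scan a batch of training images before ingestion.
# In practice, known_hashes would come from a vetted hash database.
known_hashes = {0x8F3C_1E07_00FF_A5A5}  # placeholder value, not a real entry
for path in ["example_training_image.jpg"]:  # placeholder path
    if matches_known_hash(average_hash(path), known_hashes, max_distance=10):
        print(f"flag for review/report: {path}")

In this sketch, raising max_distance mirrors the “over-inclusive threshold” the spokesperson described: more near-matches are flagged for removal and reporting, at the cost of a higher false-positive rate.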
The AI-related reports received last year are just a fraction of the total number submitted to NCMEC. The larger category of reports also includes suspected child sexual abuse material sent in private messages or uploaded to social media feeds and the cloud. In 2024, for example, NCMEC received more than 20 million reports from across industry, with most coming from Meta Platforms Inc. subsidiaries Facebook, Instagram and WhatsApp. Not all reports are ultimately confirmed as containing child sexual abuse material, referred to with the acronym CSAM.
Still, the volume of suspected child sexual abuse material that Amazon detected across its AI pipeline in 2025 stunned child safety experts interviewed by Bloomberg News. The hundreds of thousands of reports made to NCMEC marked a drastic surge for the company. In 2024, Amazon and all of its subsidiaries made a total of 64,195 reports.
“This is really an outlier,” said Fallon McNulty, the executive director of NCMEC’s CyberTipline, the entity to which U.S.-based social media platforms, cloud providers and other companies are legally required to report suspected child sexual abuse material. “Having such a high volume come in throughout the year begs a lot of questions about where the data is coming from, and what safeguards have been put in place.”
McNulty, speaking in an interview, said she has little visibility into what’s driving the surge of sexually exploitative material in Amazon’s initial training data sets. Amazon has provided “very little to almost no information” in its reports about where the illicit material originally came from, who had shared it, or whether it remains actively available on the internet, she said.
Although Amazon is not required to share this level of detail, the lack of information makes it impossible for NCMEC to track down the material’s origin and work to get it removed, McNulty said. It also hampers law enforcement agencies tasked with searching for sex offenders and for children in active danger. “There’s nothing then that can be done with those reports,” she said. “Our team has been really clear with [Amazon] that those reports are inactionable.”
When asked why the company didn’t disclose information about the possible origin of the material, or other key details, the Amazon spokesperson replied, “because of how this data is sourced, we don’t have the data that comprises an actionable report.” The spokesperson did not explain how the third-party data were sourced or why the company did not have sufficient information to create actionable reports. “While our proactive safeguards cannot provide the same detail in NCMEC reports as consumer-facing tools, we stand by our commitment to responsible AI and will continue our work to prevent CSAM,” the spokesperson said.
NCMEC, a nonprofit, receives funding from the U.S. government and private industry. Amazon is among its funders and holds a corporate seat on its board.
Amazon’s Bedrock offering, which gives customers access to various AI models so they can build their own AI products, includes automated detection of known child sexual abuse material, rejecting and reporting positive matches. The company’s consumer-facing generative AI products also allow users to report content that escapes its controls.
The Seattle-based tech giant scans for child sexual abuse material across its other businesses, too, including its consumer photo storage service. Amazon’s cloud computing division, Amazon Web Services, also removes child sexual abuse material when it’s discovered on the web services it hosts. McNulty said AWS submitted far fewer reports than came from Amazon’s AI efforts. Amazon declined to break out specific reporting data across its various business units, but noted it would share broad data in March.
Amazon was not the only company to spot and report potential child sexual abuse material from its AI workflows last year. Alphabet Inc.’s Google and OpenAI told Bloomberg News that they scan AI training data for exploitative material — a process that has surfaced potential child sexual abuse material, which the companies then reported to NCMEC. Meta and Anthropic PBC said they, too, search training data for child sexual abuse material. Meta did not comment on whether it had identified the material, but said it would report to NCMEC if it did. Anthropic said it has not reported such material out of its training data. Meta and Google said that they’ve taken efforts to ensure that reports related to their AI workflows are distinguishable from those generated by other parts of their business.
Griffin and Day write for Bloomberg.