Amazon’s Vision of ‘Billions’ of AI Agents Inside Every Company Seems Premature

December 21, 2025

At its annual AWS re:Invent conference earlier this month, Amazon unsurprisingly focused on AI, with particular emphasis on building and managing agents.

In his keynote, AWS CEO Matt Garman argued that “the true value of AI has not yet been unlocked,” but said we’ll one day live in a world with “billions of agents inside every company and across every imaginable field.”

“Getting to a future of billions of agents, where every organization is getting real-world value and results from AI is going to require us to push the limits of what’s possible with the infrastructure,” Garman added. “We want to reimagine every single process in the way that all of us work.”

A New Frontier

Indeed, the bulk of his keynote focused on the infrastructure for AI—new instances with chips from AMD, Intel, and Nvidia; new versions of Amazon’s own chips; and new AI models.

On the hardware side, Garman talked about setting up “AI factories” for building agents and applications, whether running Nvidia hardware or AWS’s own Trainium chips. The majority of inference in Amazon’s Bedrock AI development platform is now done on Trainium, including all use of Anthropic’s Claude, with over 1 million Trainium chips now in use, he said.

At the show, Amazon announced the general availability of Trainium3 UltraServers, which use a new chip built on TSMC’s 3nm process. Amazon says they deliver up to 4.4x the compute performance, 4x the energy efficiency, and almost 4x the memory bandwidth of Trainium2 UltraServers. The new servers are available with up to 144 Trainium3 chips, which can deliver up to 362 FP8 PFLOPs with 4x lower latency.


Separately, the company also announced Graviton5, the new generation of its Arm-based chip. This version offers 192 cores per package and a 5x larger L3 cache.

On the model side, the company announced a series of new AI models, including Nova 2 Pro, Nova 2 Lite, Nova 2 Sonic (a speech-to-speech model), and Nova 2 Omni, a multimodal model that accepts text, image, video, and speech inputs and generates both text and image outputs.

Garman described these as “frontier models”—meant to be on the leading edge but still “cost-optimized.” In other words, he didn’t claim they had the absolute best performance, but said they gave similar performance at a lower cost.

The most interesting announcement in this area, to me, was Nova Forge, which would allow organizations to create their own models by starting from an early checkpoint of a Nova model and then mixing their own data sets into the rest of the pre-training process.

This is an alternative to fine-tuning a fully pre-trained model with custom data, which can eventually cause it to “forget” knowledge acquired during pre-training, or to retrieval-augmented generation (RAG), which can improve results in a specific domain by connecting the model to a proprietary knowledge base but doesn’t enhance its underlying reasoning abilities. It’s an interesting concept; I’m curious to see whether it really improves the domain-specific models companies use in the real world.
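To make the distinction concrete, here is a minimal, purely illustrative sketch of how RAG typically works: the model itself is untouched, and domain knowledge is injected at query time by retrieving relevant documents and prepending them to the prompt. Everything here (the toy corpus, the retrieve helper, the prompt format) is hypothetical and not tied to any Amazon API; the checkpoint approach Amazon describes for Nova Forge would instead change the model itself by resuming pre-training with proprietary data mixed in.

```python
# Illustrative only: a toy retrieval-augmented generation (RAG) flow.
# The corpus, scoring, and prompt format are hypothetical; a real system
# would use embeddings and a vector store, and call an actual LLM API.

corpus = [
    "Policy QX-12: field sensors must be recalibrated every 90 days.",
    "Policy QX-13: calibration logs are retained for seven years.",
    "Cafeteria hours are 8am to 3pm on weekdays.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend retrieved context so the (unchanged) model can ground its answer."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Use only this context:\n{context}\n\nQuestion: {query}\nAnswer:"

query = "How often must field sensors be recalibrated?"
print(build_prompt(query, retrieve(query, corpus)))

# By contrast, the Nova Forge approach would mix proprietary data into
# continued pre-training from an early checkpoint, changing the model's
# weights rather than its prompt -- no retrieval step at inference time.
```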

Agents Everywhere

It seems like everyone is talking about agents these days, and AWS is no exception. There was a lot of talk about “frontier agents,” which Amazon defines as agents that are autonomous, can run for hours or days without intervention, and deliver complete outcomes rather than individual tasks.

At the event, the company introduced a batch of new features for Bedrock AgentCore, its platform for building, deploying, and managing agents. The big addition is policy control, which Garman compared to the guardrails or boundaries you’d give a teenager who just got a driver’s license. Other new features include memory, so an agent can recall previous conversations.

Much of this is still in preview, and like the other major agent platforms, Google Agentspace (now part of Gemini Enterprise) and Microsoft Agent 365, it all seems a little early to me. The concepts are good, but I think it will take more real-world use before anyone really knows how to build these agents well. Only then will these platforms be ready for large-scale deployment.

Similarly, the company announced general availability of Nova Act, a service that helps developers build, deploy, and manage fleets of agents. But I was a bit taken aback by the company’s claim that such agents would be “over 90% reliable.” That may sound good, but a failure rate approaching 10% isn’t good enough for agents that are supposed to run without supervision.

Meanwhile, Garman spent some time talking about the company’s Quick Suite agents, which connect a chatbot with company data, including both structured data from databases and unstructured data from applications like Microsoft 365. This can be used for things like research and data workflows, he said. In addition, the company offers Amazon Connect for self-service customer-service agents.

But much of the discussion focused on tools for developers, including new features for Kiro, the company’s AI-integrated development environment, and Transform, a tool for migrating code from legacy languages to more modern systems. Most interesting were a Kiro agent that helps development teams create projects; a DevOps Agent designed to help developers respond to incidents, identify root causes, and prevent future issues; and a security agent designed to protect applications through code analysis and penetration testing.

“The reality is, building and scaling these amazing agentic systems are harder than the problems they are trying to solve,” AWS VP of Agentic AI Swami Sivasubramanian said in a separate re:Invent keynote. The new tools from Amazon and the others are steps in the right direction, but I think there’s still a lot of room for improvement.
