As Meta fades in open-source AI, Nvidia senses its chance to lead

December 16, 2025




ZDNET key takeaways 

  • Nvidia’s Nemotron 3 claims advances in accuracy and cost efficiency.
  • Reports suggest Meta is leaning away from open-source technology.
  • Nvidia argues it’s more open than Meta with data transparency.

Seizing upon a shift in the field of open-source artificial intelligence, chip giant Nvidia, whose processors dominate AI, has unveiled the third generation of its Nemotron family of open-source large language models.

The new Nemotron 3 family scales the technology from what had been one-billion- and 340-billion-parameter models (parameters being the number of neural weights) to three new sizes: 30 billion parameters for Nano, 100 billion for Super, and 500 billion for Ultra.

Also: Meta’s Llama 4 ‘herd’ controversy and AI contamination, explained

The Nano model, available now on the HuggingFace model-hosting platform, quadruples throughput in tokens per second and extends the context window — the amount of data that can be manipulated in the model’s memory — to one million tokens, seven times as large as its predecessor’s.

Nvidia emphasized that the models aim to address several concerns of enterprise users of generative AI, chief among them accuracy and the rising cost of processing an increasing number of tokens each time AI makes a prediction.

“With Nemotron 3, we are aiming to solve those problems of openness, efficiency, and intelligence,” said Kari Briski, vice president of generative AI software at Nvidia, in an interview with ZDNET before the release. 

Also: Nvidia’s latest coup: All of Taiwan on its software

The Super version of the model is expected to arrive in January, and Ultra is due in March or April.

Llama’s waning influence

Nvidia, Briski emphasized, has a growing presence in open source. “This year alone, we had the most contributions and repositories on HuggingFace,” she told me.

It’s clear to me from our conversation that Nvidia sees a chance not only to boost enterprise usage, thereby fueling chip sales, but also to seize leadership in the open-source development of AI.

After all, this field looks like it might lose one of its biggest stars of recent years, Meta Platforms.

Also: 3 ways Meta’s Llama 3.1 is an advance for Gen AI

When Meta, owner of Facebook, Instagram, and WhatsApp, debuted its open-source Llama gen AI technology in February 2023, it was a landmark event: a fast, capable model with some code available to researchers, versus the “closed-source,” proprietary models of OpenAI, Google, and others.

Llama quickly came to dominate developer attention in open-source tech as Meta unveiled fresh innovations in 2024 and scaled up the technology to compete with the best proprietary frontier models from OpenAI and the rest.

But 2025 has been different. The company’s rollout of the fourth generation of Llama, in April, was greeted with mediocre reviews and even a controversy over how Meta developed the program.

These days, Llama models don’t show up in the top 100 models on LMSYS’s popular LMArena leaderboard, which is dominated by proprietary models (Gemini from Google, xAI’s Grok, Anthropic’s Claude, and OpenAI’s GPT-5.2) and by open-source models such as those from DeepSeek, Alibaba’s Qwen family, and the Kimi K2 model developed by Beijing-based Moonshot AI.

Also: While Google and OpenAI battle for model dominance, Anthropic is quietly winning the enterprise AI race

Charts from the third-party firm Artificial Analysis show a similar ranking. Meanwhile, the recent “State of Generative AI” report from venture capital firm Menlo Ventures blamed Llama for helping to reduce the use of open-source models in the enterprise.

“The model’s stagnation — including no new major releases since the April release of Llama 4 — has contributed to a decline in overall enterprise open-source share from 19% last year to 11% today,” they wrote.

Is Meta closing up?

Leaderboard scores can come and go, but after a broad reshuffling of its AI team this year, Meta appears poised to place less emphasis on open source. 

A forthcoming Meta project code-named Avocado, wrote Bloomberg reporters Kurt Wagner and Riley Griffin last week, “may be launched as a ‘closed’ model — one that can be tightly controlled and that Meta can sell access to,” according to their unnamed sources. 

The move to closed models “would mark the biggest departure to date from the open-source strategy Meta has touted for years,” they wrote.

Also: I tested GPT-5.2 and the AI model’s mixed results raise tough questions

Meta’s Chief AI Officer, Alexandr Wang, installed this year after Meta invested in his previous company, Scale AI, “is an advocate of closed models,” Wagner and Griffin noted. (An article over the weekend by Eli Tan of The New York Times suggested that there have been tensions between Wang and various product leads for Instagram and advertising inside Meta.)

When I asked Briski about Menlo Ventures’s claim that open source is struggling, she replied, “I agree about the decline of Llama, but I don’t agree with the decline of open source.”

Added Briski, “Qwen models from Alibaba are super popular, DeepSeek is really popular — I know many, many companies that are fine-tuning and deploying DeepSeek.”

Focusing on enterprise challenges

While Llama may have faded, it’s also true that Nvidia’s own Nemotron family has not yet reached the top of the leaderboards. In fact, the family lags DeepSeek, Kimi, Qwen, and other increasingly popular offerings.

Also: Gemini vs. Copilot: I tested the AI tools on 7 everyday tasks, and it wasn’t even close

But Nvidia believes it is addressing many of the pain points that plague enterprise deployment, specifically.

One focus of companies is to “cost-optimize,” with a mix of closed-source and open-source models, said Briski. “One model does not make an AI application, and so there is this combination of frontier models and then being able to cost-optimize with open models, and how do I route to the right model.” 

The selection of models, from Nano to Ultra, is meant to cover that breadth of task requirements, as in the routing sketch below.
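To make the routing pattern concrete, here is a minimal sketch in Python. The model names, the complexity heuristic, and the threshold are invented for illustration; a production router would use learned classifiers and real cost data, and nothing here reflects Nvidia’s or any vendor’s actual implementation.

```python
# A minimal sketch of the cost-routing pattern Briski describes: send easy
# queries to a cheap, self-hosted open model and escalate hard ones to a
# frontier model. All names and heuristics here are hypothetical.

FRONTIER_MODEL = "frontier-large"   # hypothetical closed, high-accuracy model
OPEN_MODEL = "open-small"           # hypothetical self-hosted open model


def estimate_complexity(query: str) -> float:
    """Crude stand-in for a real router: long, multi-step queries score higher."""
    steps = query.count("?") + query.lower().count(" then ")
    return min(1.0, len(query) / 2000 + 0.2 * steps)


def route(query: str, threshold: float = 0.5) -> str:
    """Pick the cheaper model unless the query looks too complex for it."""
    return FRONTIER_MODEL if estimate_complexity(query) > threshold else OPEN_MODEL


print(route("Summarize this paragraph."))                # -> open-small
print(route("Plan the migration? Then write the SQL?"))  # -> frontier-large
```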

The second challenge is to “specialize” AI models for a mix of tasks in the enterprise, ranging from cybersecurity to electronic design automation and healthcare, Briski said.

“When we go across all these verticals, frontier models are really great, and you can send some data to them, but you don’t want to send all your data to them,” she observed. Open-source tech, then, running “on-premise,” is crucial, she said, “to actually help the experts in the field to specialize them for that last mile.”

Also: Get your news from AI? Watch out – it’s wrong almost half the time

The third challenge is the exploding cost of tokens, the units of text, images, sound, and other data that a model generates piece by piece when it makes predictions.

“The demand for tokens from all these models being used is just going up,” said Briski, especially with “long-thinking” or “reasoning” models that generate verbose output.

“This time last year, each query would take maybe 10 LLM calls,” noted Briski. “In January, we were seeing each query making about 50 LLM calls, and now, as people are asking more complex questions, there are 100 LLM calls for every query.”
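Taken at face value, those figures imply roughly a tenfold jump in per-query cost over a year. A quick back-of-envelope calculation, with the tokens-per-call and pricing numbers assumed purely for illustration:

```python
# Back-of-envelope math on Briski's figures: LLM calls per query grew from
# about 10 a year ago to 100 today. The tokens-per-call and price constants
# below are assumptions for illustration, not real vendor pricing.
calls_per_query = {"late 2024": 10, "January 2025": 50, "late 2025": 100}
TOKENS_PER_CALL = 2_000            # assumed average (prompt + completion)
PRICE_PER_MILLION_TOKENS = 1.00    # assumed blended price, in dollars

for period, calls in calls_per_query.items():
    tokens = calls * TOKENS_PER_CALL
    cost = tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS
    print(f"{period}: {calls} calls -> {tokens:,} tokens -> ${cost:.2f}/query")

# At fixed per-token pricing, 10x the calls per query means 10x the cost.
```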

The ‘latent’ advantage

To balance demands such as accuracy, efficiency, and cost, the Nemotron 3 models improve upon a popular cost-control approach called “mixture of experts” (MoE), in which the model turns groups of its neural network weights on and off so that only some run at a time, reducing computing effort.

The fresh approach, called “latent mixture of experts” and used in the Super and Ultra models, compresses the memory used to store the data that the model’s multiple “expert” neural networks share while they work.
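For the shape of the idea, here is a minimal, hypothetical sketch of mixture-of-experts routing with a latent compression step, written in NumPy. The dimensions, the number of active experts, and the projection scheme are illustrative assumptions, not Nvidia’s published architecture.

```python
# Sketch of mixture-of-experts (MoE) routing with a latent compression step.
# Standard MoE activates only the top-k experts per token; here the token is
# also projected into a smaller latent space before the experts run, which is
# one way to shrink the memory the experts share. All sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
D, LATENT, N_EXPERTS, TOP_K = 64, 16, 8, 2

router = rng.standard_normal((D, N_EXPERTS))      # token -> expert scores
down_proj = rng.standard_normal((D, LATENT))      # compress: 64 -> 16 dims
experts = rng.standard_normal((N_EXPERTS, LATENT, LATENT))
up_proj = rng.standard_normal((LATENT, D))        # expand back: 16 -> 64 dims


def latent_moe(token: np.ndarray) -> np.ndarray:
    scores = token @ router
    top = np.argsort(scores)[-TOP_K:]             # only 2 of 8 experts run
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()
    latent = token @ down_proj                    # work in the small space
    mixed = sum(w * (latent @ experts[i]) for w, i in zip(weights, top))
    return mixed @ up_proj


print(latent_moe(rng.standard_normal(D)).shape)   # (64,)
```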

Also: Sick of AI in your search results? Try these 8 Google alternatives

“We’re getting four times better memory usage by reducing the KV-cache,” compared to the prior Nemotron, said Briski, referring to the key-value cache, the part of a large language model that stores intermediate attention data for tokens it has already processed so they don’t have to be recomputed as the model generates output.
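A rough, generic calculation shows why a smaller KV cache matters at a million-token context window. Every figure below (layer count, heads, head dimension, precision) is an assumed configuration for illustration, not Nemotron’s actual design.

```python
# Rough KV-cache sizing. The cache grows linearly with context length, so at
# long contexts it can dwarf everything else in memory. All values assumed.
LAYERS, KV_HEADS, HEAD_DIM = 48, 8, 128
BYTES_PER_VALUE = 2                  # fp16/bf16 precision
CONTEXT = 1_000_000                  # Nano's advertised 1M-token window

per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE  # 2 = keys + values
full_cache_gb = per_token * CONTEXT / 1e9
print(f"{per_token} bytes/token -> {full_cache_gb:.0f} GB at 1M tokens")
print(f"with 4x compression: {full_cache_gb / 4:.0f} GB")
```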

[Diagram: standard mixture of experts versus latent mixture of experts. Source: Nvidia]

The more-efficient latent MoE should deliver greater accuracy at a lower cost while preserving latency, how quickly the first token comes back to the user, and bandwidth, the number of tokens transmitted per second.

In data provided by Artificial Analysis, said Briski, Nemotron 3 Nano surpasses a top model, OpenAI’s GPT-OSS, in terms of accuracy of output and the number of tokens generated each second. 

More detail on these innovations is available in a separate technical Nvidia blog post on Nemotron 3.

[Chart: intelligence versus output speed across models. Source: Artificial Analysis]

Open-sourcing the data

Another big concern for enterprises is the data that goes into models, and Briski said the company aims to be much more transparent with its open-source approach. 

“A lot of our enterprise customers can’t deploy with some models, or they can’t build their business on a model that they don’t know what the source code is,” she said, including training data.

The Nemotron 3 release on HuggingFace includes not only the model weights but also the 3 trillion tokens of training data that Nvidia used for pre-training, post-training, and reinforcement learning. (As the model card states, the Nano model required 10 trillion tokens in total for training, testing, and evaluation, but not all of those data sets can be shared, Nvidia explained.)
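For illustration, pulling one of the released data sets from HuggingFace might look like the sketch below; the dataset identifier is a placeholder, and the real repository names are listed on Nvidia’s model card.

```python
# Hypothetical example of streaming an openly released training data set with
# the HuggingFace `datasets` library. The dataset ID is a placeholder, not
# Nvidia's actual repository name.
from datasets import load_dataset

ds = load_dataset("nvidia/nemotron-pretraining-sample",  # placeholder ID
                  split="train", streaming=True)
for row in ds.take(3):   # stream a few rows rather than downloading terabytes
    print(row)
```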

There is a separate data set for “agentic safety,” which the company says will provide “real-world telemetry to help teams evaluate and strengthen the safety of complex agent systems.”

“If you consider the data sets, the source code, everything that we use to train is open,” said Briski. “Literally, every piece of data that we train the model with, we are releasing.”

Also: Meta inches toward open source AI with new Llama 3.1

Meta’s team has not been as open, she said. “Llama did not release their data sets at all; they released the weights,” Briski told me. When Nvidia partnered with Meta last year to convert the Llama 3.1 models into smaller Nemotron models via a popular approach known as “distillation,” she said, Meta withheld resources from Nvidia.

“Even with us as a great partner, they wouldn’t even release a sliver of the data set to help distill the model,” she said. “That was a recipe we kind of had to come up with on our own.”

Nvidia’s emphasis on data transparency may help to reverse a worrying trend toward diminished openness. Scholars at MIT recently conducted a broad study of code repositories on HuggingFace and reported that truly open-source postings are on the decline, citing “a clear decline in both the availability and disclosure of models’ training data.”

As lead author Shayne Longpre and team pointed out, “The Open Source Initiative defines open source AI models as those which have open model weights, but also ‘sufficiently detailed information about their [training] data’,” adding, “Without training data disclosure, a released model is considered ‘open weight’ rather than ‘open source’.”

What’s at stake for Nvidia, Meta

It’s clear Nvidia and Meta have different priorities. Meta needs to make a profit from AI to reassure Wall Street about its planned spending of hundreds of billions of dollars to build AI data centers. 

Nvidia, the world’s most valuable company, needs to ensure it keeps developers hooked on its chip platform, which generates the majority of its revenue.

Also: US government agencies can use Meta’s Llama now – here’s what that means

Meta CEO Mark Zuckerberg has suggested Llama is still important, telling Wall Street analysts in October, “As we improve the quality of the model, primarily for post-training Llama 4, at this point, we continue to see improvements in usage.”

However, he also emphasized moving beyond merely having a popular LLM, pointing to the new directions his newly formed Meta Superintelligence Labs (MSL) will take.

“So, our view is that when we get the new models that we’re building in MSL in there, and get, like, truly frontier models with novel capabilities that you don’t have in other places, then I think that this is just a massive latent opportunity.”

As for Nvidia, “Large language models and generative AI are the way that you will design software of the future,” Briski told me. “It’s the new development platform.”

Support is key, she said, and, in what could be taken as a dig at Meta’s retreat from openness, though it was not intended as one, Briski invoked the words of Nvidia founder and CEO Jensen Huang: “As Jensen says, we’ll support it as long as we shall live.”

 
