AI Markets Were Deceived To Believe In DeepSeek’s Low Training Costs; They Are Actually 400 Times Higher Than The Reported Figure

February 2, 2025

The controversy around DeepSeek’s training costs for its R1 model shook up the markets, but the reported figure appears to have been misleading, and the actual numbers are surprising.

DeepSeek’s Training Costs Are Said To Be Significantly Higher Than The Reported “$5 Million” Figure; They Have Access To High-End Hardware

The research firm SemiAnalysis has conducted an extensive analysis of what’s actually behind DeepSeek in terms of training costs, refuting the narrative that R1 has become so efficient that the compute resources from NVIDIA and others are unnecessary. Before we dive into the actual hardware used by DeepSeek, let’s take a look at what the industry initially perceived. It was claimed that DeepSeek spent only “$5 million” on its R1 model, which is on par with OpenAI’s o1, and this triggered a retail panic that was reflected in the US stock market. Now that the dust has settled, let’s take a look at the actual figures.


For those unaware, DeepSeek was said to be a side project of the Chinese hedge fund High-Flyer, and the report by SemiAnalysis claims that they purchased 10,000 units of NVIDIA’s A100 back in 2021, when export restrictions weren’t that aggressive. DeepSeek then evolved into a separate entity since the parent company, High-Flyer, decided to spin the project off, and that’s when things actually took off. With that, they started accumulating computing resources, which we’ll discuss next.

Image Credits: SemiAnalysis

The report says that DeepSeek has around 10,000 of NVIDIA’s “China-specific” H800 AI GPUs and 10,000 of the higher-end H100 AI chips. Moreover, the firm has invested in NVIDIA’s H20 AI accelerators, and it has a “pool” of resources shared between DeepSeek and High-Flyer for “trading, inference, training, and research.” This translates into approximately $1.6 billion in CapEx for DeepSeek, with operating costs rumored to be around $944 million. Taken together, these figures are roughly four hundred times higher than what the markets initially perceived.
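As a rough sanity check on that multiple, the quoted figures can be compared directly. The short Python sketch below uses only the numbers cited in this article ($5 million reported, $1.6 billion in CapEx, $944 million in operating costs). The variable names are illustrative, and it assumes the CapEx and operating-cost figures can simply be added together, which is a simplification rather than something the report states.

```python
# Figures as quoted in this article, in US dollars.
reported_training_cost = 5_000_000      # the widely circulated "$5 million" figure
capex_estimate = 1_600_000_000          # SemiAnalysis CapEx estimate
opex_estimate = 944_000_000             # rumored operating costs

# Ratio of CapEx alone to the reported figure.
capex_multiple = capex_estimate / reported_training_cost

# Ratio of CapEx plus operating costs to the reported figure
# (assumption: the two estimates are additive and comparable).
total_multiple = (capex_estimate + opex_estimate) / reported_training_cost

print(f"CapEx only:   {capex_multiple:.1f}x")
print(f"CapEx + OpEx: {total_multiple:.1f}x")
```

On these numbers, CapEx alone works out to about 320 times the reported figure, and CapEx plus operating costs to about 509 times, so “four hundred times” is best read as a round figure sitting between the two.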


For clarification, the initial figure is said to cover only a “specific part” of the training costs, likely associated with running the final model. The one thing DeepSeek was actually good at was capitalizing on local talent through recruitment events at top local universities, with salaries of over $1.3 million for specific employees. The brains behind DeepSeek’s R1 model were indeed capable of coming up with an efficient solution to compete with the likes of OpenAI, but the “misreported” financial figures acted as a catalyst in last week’s black swan event.

SemiAnalysis has also conducted extensive testing of DeepSeek’s AI model, and its report is worth checking out for the interesting details it covers.