In short: China’s DeepSeek recently threw the multibillion-dollar AI market into chaos with the release of its R1 model, which is said to compete with OpenAI’s o1 despite being trained on 2,048 Nvidia GPUs at a cost of $5.576 million. A new report claims the firm’s true costs were closer to $1.6 billion and that DeepSeek has access to around 50,000 Hopper graphics cards.
DeepSeek’s claim that it was able to train R1 with a fraction of the resources spent by big tech companies investing in AI caused Nvidia to lose a record $600 billion in market value in a single day. If the Chinese startup could build a powerful model without spending billions of dollars on Team Green’s AI GPUs, what would stop others from doing the same?
But did DeepSeek really create its Mixture-of-Experts model – still at the top of the Apple App Store charts today – for such a low price? SemiAnalysis claims it didn’t.
According to the market intelligence firm, DeepSeek has access to around 50,000 Hopper graphics cards, including 10,000 H800s and 10,000 H100s, and has placed orders for many more of the China-specific H20s. The GPUs are shared between DeepSeek and High-Flyer – the quantitative hedge fund behind the startup – and are spread across multiple locations, where they are used for trading, research, training, and inference.
Courtesy of SemiAnalysis

SemiAnalysis reports that DeepSeek has invested far more than the $5.5 million figure that sent the stock markets into a tailspin, and that pre-training costs make up only a small portion of the total. The company has invested around $1.6 billion in servers, with $944 million spent on operating costs, and its GPU investments alone are worth more than $500 million.
Anthropic’s Claude 3.5 Sonnet, for example, cost tens or hundreds of millions of dollars to develop, yet the company still needed to raise billions from Google and Amazon.
DeepSeek sources its talent exclusively from China, in contrast to reports that other Chinese tech firms, such as Huawei, are attempting to poach overseas workers, with Taiwanese TSMC employees being the most sought-after targets. DeepSeek reportedly pays promising candidates salaries of more than $1.3 million, far higher than what competing Chinese AI companies offer.
DeepSeek also has the advantage of running its own data centers rather than relying on external cloud providers, which allows for greater experimentation and innovation in its AI product stack. SemiAnalysis claims it is the best “open weights” laboratory today, beating Meta’s Llama, Mistral, and others.
Masthead credit: Solen Feyissa