DeepSeek is now a viral sensation.
Chinese artificial intelligence lab DeepSeek has entered the mainstream consciousness after its chatbot app topped the Apple App Store charts this week. DeepSeek’s AI models, which were trained using compute-efficient methods, have led Wall Street analysts and technologists to question whether the U.S. can maintain its lead in AI and whether demand for AI chips can be sustained.
Where did DeepSeek come from, and how did the company achieve international fame so quickly?
DeepSeek’s trader roots
DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.
AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Wenfeng began trading while studying at Zhejiang University. In 2019, he launched High-Flyer Capital Management, a hedge fund focused on developing and deploying AI algorithms.
In 2023, High-Flyer launched DeepSeek as a lab dedicated to researching AI tools separate from its financial business. With High-Flyer as one of its investors, the lab was spun off into its own company, also called DeepSeek.
DeepSeek built its own data center clusters for model training from day one. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100 chip available to U.S. firms.
DeepSeek’s technical team is said to skew young. According to reports, the company aggressively recruits PhD AI researchers from top Chinese universities. The New York Times reports that DeepSeek also hires people with no computer science background to help its technology understand a wider variety of topics.
DeepSeek’s strong models
DeepSeek released its first set of models (DeepSeek Coder, DeepSeek Chat, and DeepSeek LLM) in November 2023. But the AI industry didn’t start to take notice until last spring, when the startup released its next-gen DeepSeek V2 family of models.
DeepSeek V2, a general-purpose text-and-image-analyzing system, performed well on AI benchmarks and was cheaper to run than comparable models at the time. It forced DeepSeek’s domestic competitors, including ByteDance and Alibaba, to lower their usage prices and make some models free.
DeepSeek V3, launched in December 2024, has only added to DeepSeek’s fame. According to DeepSeek’s internal benchmark testing, DeepSeek V3 outperforms both openly available, downloadable models like Meta’s Llama and “closed” models that can only be accessed via an API, such as OpenAI’s GPT-4o.
DeepSeek’s “reasoning model” R1 is equally impressive. Released in January, R1 is said to perform as well as OpenAI’s o1 on key benchmarks.
As a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models usually take a few seconds or minutes longer to reach a solution than non-reasoning models. The upside is that they tend to be more reliable in areas such as physics and science.
There is, however, a downside to R1, DeepSeek V3, and DeepSeek’s other models. Because they are Chinese-developed AI, they are subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for example, R1 will not answer questions about Tiananmen or Taiwan’s independence.
A disruptive approach
DeepSeek’s business model is unclear. The company prices its products and services well below market value and gives others away for free.
According to DeepSeek, efficiency breakthroughs have enabled it to maintain extreme cost-competitiveness. Some experts, however, dispute the figures provided by the company.
Whatever the case, developers have taken to DeepSeek’s models, which aren’t open source as the term is commonly understood but are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms that hosts DeepSeek’s models, developers on Hugging Face have created more than 500 “derivatives” of R1 that have garnered a combined 2.5 million downloads.
DeepSeek’s success against bigger and more established competitors has been described as “upending AI.” As a result of that success, Nvidia’s stock dropped 18% on Monday, and OpenAI CEO Sam Altman was prompted to respond publicly.
What the future holds for DeepSeek is unclear. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence.
Kyle Wiggers is a senior reporter at TechCrunch with a special interest in artificial intelligence. His writing has appeared in VentureBeat and Digital Trends, as well as a range of gadget blogs including Android Police, Android Authority, Droid-Life, and XDA-Developers. He lives in Brooklyn with his partner, a piano teacher, and occasionally plays the piano himself, if mostly unsuccessfully.