DeepSeek claims that its ‘reasoning’ model beats OpenAI’s o1 on certain benchmarks

DeepSeek, a Chinese AI lab, has released an open version of DeepSeek R1, its reasoning model. It claims that this version performs as well as or better than OpenAI’s o1 on certain AI benchmarks.

R1 is available on the AI development platform Hugging Face under an MIT license, which means it can be used commercially without restrictions. According to DeepSeek, R1 beats o1 on the benchmarks AIME, MATH-500, and SWE-bench Verified. AIME uses other models to assess a model’s performance, MATH-500 is a collection of word problems, and SWE-bench Verified focuses on programming tasks.

As a reasoning model, R1 effectively checks its own facts, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models usually take seconds to minutes longer to reach a solution than a typical nonreasoning model. The upside is that they tend to be more reliable in domains such as physics and science.

R1 contains 671 billion parameters, DeepSeek revealed in a technical report. Parameters roughly correspond to a model’s problem-solving abilities, and models with more parameters tend to perform better than those with fewer.

671 billion parameters may seem large, but DeepSeek has also released “distilled” versions of R1 that range in size from 1.5 billion to 70 billion parameters. The smallest version can run on a laptop. The full R1 requires beefier hardware, but it is available through DeepSeek’s API at prices 90% to 95% lower than OpenAI’s o1. Clem Delangue, the CEO of Hugging Face, said in a post on X on Monday that developers had created more than 500 “derivative” models of R1, which have been downloaded 2.5 million times combined, five times as many downloads as the official R1.

Only a few days after its release, more than 500 derivative models of @deepseek_ai have been created all over the world on @huggingface, with 2.5 million downloads (5x the original weights).

Open-source decentralized AI is powerful!

— clem (@ClementDelangue) January 27, 2025.

R1 has a downside. As a Chinese model, it is subject to benchmarking by China’s internet regulator to ensure that its responses “embody core socialist values.”

Filtering on R1 in action. Image credit: DeepSeek

Many Chinese AI systems, including other reasoning models, refuse to respond to topics that could raise the ire of regulators in the country, such as speculation about the Xi Jinping regime.

R1 arrives just days after the outgoing Biden administration proposed stricter export rules and restrictions on AI technologies for Chinese companies. Companies in China are already prohibited from purchasing advanced AI chips. If the new rules take effect as written, they will face stricter limits on both the semiconductor technology and the models required to bootstrap sophisticated AI systems.

In a policy paper last week, OpenAI urged the U.S. government to support the development and deployment of U.S. AI systems, lest Chinese models match or surpass them. In an interview with The Information, OpenAI’s vice president of policy, Chris Lehane, cited High Flyer Capital Management, DeepSeek’s corporate parent, as a company of particular concern.

At least three Chinese labs (DeepSeek, Alibaba, and Kimi, which is owned by Chinese unicorn Moonshot AI) have produced models that they claim are comparable to o1. Notably, DeepSeek announced a preview of R1 in November.

This story was originally published on January 20 and updated on January 27 with additional information.

Kyle Wiggers is a senior reporter at TechCrunch with a special interest in artificial intelligence. His writing has appeared in VentureBeat, Digital Trends, and a variety of gadget blogs, including Android Police, Android Authority, Droid-Life, and XDA-Developers. He lives in Brooklyn with his partner, a piano teacher, and plays the piano himself occasionally, though mostly unsuccessfully.
