DeepSeek claims its reasoning model beats OpenAI’s o1 on certain benchmarks


Chinese AI lab DeepSeek has released an open version of DeepSeek-R1, its so-called reasoning model, which it claims outperforms OpenAI’s o1 on certain AI benchmarks.

R1 is available from the AI development platform Hugging Face under an MIT license, meaning it can be used commercially without restrictions. According to DeepSeek, R1 beats o1 on the AIME, MATH-500 and SWE-bench Verified benchmarks. AIME uses other models to evaluate a model’s performance, while MATH-500 is a collection of word problems. SWE-bench Verified, meanwhile, focuses on programming tasks.

Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer to arrive at solutions than a typical non-reasoning model (usually seconds to minutes longer). The upside is that they tend to be more reliable in domains such as physics, science and mathematics.

R1 contains 671 billion parameters, DeepSeek revealed in a technical report. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer.

671 billion parameters is massive, but DeepSeek also released “distilled” versions of R1 ranging in size from 1.5 billion to 70 billion parameters. The smallest can run on a laptop. The full R1 requires more powerful hardware, but it is available through DeepSeek’s API at prices 90%-95% lower than OpenAI’s o1.
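To give a rough sense of why the smallest distilled model fits on a laptop while the full R1 does not, the sketch below estimates the memory needed just to hold the weights at the parameter counts quoted above. The storage precisions (`fp16`, `int8`, `int4`) and the helper `weight_memory_gb` are illustrative assumptions, not figures or tools from DeepSeek, and the estimate ignores activations, caches and runtime overhead.

```python
# Back-of-the-envelope, weights-only memory estimate.
# Assumption: each parameter is stored at one of the precisions below;
# the article does not say which format any given release uses.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate gigabytes (10^9 bytes) needed to store the weights alone."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Parameter counts quoted in the article; the labels are descriptive only.
MODELS = {
    "R1 (full)": 671e9,
    "largest distilled": 70e9,
    "smallest distilled": 1.5e9,
}

if __name__ == "__main__":
    for name, params in MODELS.items():
        sizes = ", ".join(
            f"{precision}: {weight_memory_gb(params, precision):,.1f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{name:>20} -> {sizes}")
```

Under these assumptions, the 1.5-billion-parameter model needs only a few gigabytes for its weights, whereas the full model needs hundreds of gigabytes even at the lowest precision listed.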

R1 does have a downside. Because it is a Chinese model, it is subject to benchmarking by China’s internet regulator to ensure its responses “embody core socialist values.” R1 will not answer questions about, for example, Tiananmen Square or Taiwan’s autonomy.

R1’s filtering at work. Image credits: DeepSeek

Many Chinese AI systems, including other reasoning models, decline to respond to topics that might draw the ire of regulators in the country, such as speculation about the Xi Jinping regime.

R1 arrives just days after the outgoing Biden administration proposed harsher export rules and restrictions on AI technologies for Chinese companies. Companies in China are already barred from buying advanced AI chips, but if the new rules go into effect as written, they will face stricter caps on both the semiconductor technology and the models needed to power sophisticated AI systems.

In a policy document last week, OpenAI urged the US government to support the development of US AI, lest Chinese models match or surpass them in capability. In an interview with The Information, Chris Lehane, OpenAI’s vice president of policy, singled out DeepSeek’s corporate parent, High Flyer Capital Management, as an organization of particular concern.

So far, at least three Chinese labs (DeepSeek, Alibaba and Kimi, which is owned by the Chinese unicorn Moonshot AI) have produced models that they claim rival o1. (Of note, DeepSeek was the first: it announced a preview of R1 in late November.) In a post on X, Dean Ball, an AI researcher at George Mason University, said the trend suggests Chinese AI labs will continue to be “fast followers.”

“The impressive performance of DeepSeek’s distilled models (…) means that very capable reasoners will continue to proliferate widely and be runnable on local hardware,” Ball wrote, “far from the eyes of any top-down control regime.”


