
Why DeepSeek’s new AI model thinks it’s ChatGPT


Earlier this week, well-funded Chinese AI lab DeepSeek released an “open” AI model, DeepSeek V3, that beats many rivals on popular benchmarks. The model is large but efficient, handling text-based tasks like coding and essay writing with ease.

It also seems to think it’s ChatGPT.

Posts on X, along with TechCrunch’s own tests, show that DeepSeek V3 identifies itself as ChatGPT, OpenAI’s AI-powered chatbot platform. Asked to elaborate, DeepSeek V3 insists it is a version of OpenAI’s GPT-4 model released in June 2023.

The delusion runs deep. If you ask DeepSeek V3 a question about DeepSeek’s own API, it will instead give you instructions on how to use OpenAI’s API. DeepSeek V3 even tells some of the same jokes as GPT-4, down to the punchlines.

So what’s going on?

Models like ChatGPT and DeepSeek V3 are statistical systems. Trained on billions of examples, they learn patterns in those examples to make predictions, such as how the phrase “to whom” in an email typically precedes “it may concern.”
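To make the idea concrete, here is a toy sketch of that kind of pattern-learning, boiled down to counting which word tends to follow which. This is an illustration of the statistical principle only, not how DeepSeek or OpenAI actually train their models, and the tiny corpus is made up.

```python
from collections import Counter, defaultdict

# Toy corpus; real models train on billions of examples, not a handful.
corpus = "to whom it may concern please find attached the report to whom it may concern"

# Count bigrams: for each word, tally which word follows it.
follows = defaultdict(Counter)
words = corpus.split()
for prev, nxt in zip(words, words[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next word seen after `word` in training."""
    return follows[word].most_common(1)[0][0]

print(predict_next("whom"))  # -> "it", because "whom it" dominates the counts
```

A model trained this way can only reproduce the statistics of whatever text it was fed, which is exactly why the provenance of training data matters so much here.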

DeepSeek hasn’t revealed much about the sources of DeepSeek V3’s training data. But there’s no shortage of public datasets containing text generated by GPT-4 via ChatGPT. If DeepSeek V3 was trained on these, the model may have memorized some of GPT-4’s outputs and is now regurgitating them verbatim.

“It’s clear that the model is seeing raw responses from ChatGPT at some point, but it’s not clear where,” Mike Cook, an artificial intelligence researcher at King’s College London, told TechCrunch. “It could be ‘accidental’ … but unfortunately, we have seen instances of people training their models directly on the outputs of other models to try to piggyback off their knowledge.”

Cook noted that the practice of training models on the outputs of rival AI systems can be “very bad” for model quality, because it can lead to hallucinations and misleading answers like those above. “Like making a copy of a copy, we lose more and more information and connection to reality,” Cook said.

It may also conflict with the terms of service of those systems.

OpenAI’s terms of service prohibit users of its products, including ChatGPT customers, from using outputs to develop models that compete with OpenAI’s own.

Neither OpenAI nor DeepSeek immediately responded to requests for comment. However, OpenAI CEO Sam Altman posted what appeared to be a dig at DeepSeek and other competitors on X on Friday.

“It’s (relatively) easy to copy something you know works,” Altman said. “It’s hard to do something new, risky, and challenging if you don’t know if it’s going to work.”

To be clear, DeepSeek V3 is far from the first model to misidentify itself. Google’s Gemini and others sometimes claim to be competing models. For example, prompted in Mandarin, Gemini says it is Chinese company Baidu’s Wenxinyiyan chatbot.

That’s because the web, where AI companies source the bulk of their training data, is becoming littered with AI slop. Content farms are using AI to create clickbait. Bots are flooding Reddit and X. By one estimate, 90% of the web could be AI-generated by 2026.

This “contamination,” if you will, has made it quite difficult to thoroughly filter AI outputs from training datasets.

It’s certainly possible that DeepSeek trained DeepSeek V3 directly on ChatGPT-generated text. Google was once accused of doing the same, after all.

Heidy Khlaaf, director of engineering at consulting firm Trail of Bits, said the cost savings from “distilling” an existing model’s knowledge can be attractive to developers, regardless of the risks.

“Even with internet data now brimming with AI outputs, other models that accidentally train on ChatGPT or GPT-4 outputs would not necessarily demonstrate outputs reminiscent of OpenAI’s customized messages,” Khlaaf said. “It would not be surprising if DeepSeek performed distillation partially using OpenAI models.”
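For readers unfamiliar with the term, “distillation” broadly means training a smaller student model to imitate a teacher model’s output probability distribution rather than raw labels. The sketch below shows the core loss term under common assumptions (softened softmax, KL divergence); the logits are made up for illustration, and this is not a claim about any company’s actual pipeline.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the student's.

    Minimizing this pushes the student to reproduce the teacher's token
    probabilities -- including, in principle, quirks like canned
    self-descriptions.
    """
    p = softmax(teacher_logits, temperature)  # teacher's softened probs
    q = softmax(student_logits, temperature)  # student's softened probs
    return float(np.sum(p * (np.log(p) - np.log(q))))

# A student that matches the teacher exactly incurs zero loss;
# a mismatched student incurs a positive one.
teacher = [2.0, 1.0, 0.1]
print(distillation_loss(teacher, teacher))          # 0.0
print(distillation_loss(teacher, [0.1, 1.0, 2.0]))  # > 0
```

The point of the example is simply that a student optimized this way inherits whatever the teacher says, which is one way a model could end up echoing another model’s identity.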

More likely, lots of ChatGPT/GPT-4 data made its way into the DeepSeek V3 training set. That means the model can’t be trusted to self-identify, for one. But more concerning is the possibility that, by uncritically absorbing and iterating on GPT-4’s outputs, DeepSeek V3 could exacerbate some of that model’s biases and flaws.




