Elon Musk agrees that we are running out of AI training data


Elon Musk agrees with other AI experts that there is little real-world data left to train AI models.

“We have now largely exhausted the sum total of human knowledge…. in AI training,” Musk said during a live chat with Stagwell Chairman Mark Penn on X Wednesday. “That’s mostly happened in the last year.”

Musk, who owns an AI company, echoed themes that former OpenAI chief scientist Ilya Sutskever touched on during a talk at NeurIPS, a machine learning conference, in December. Sutskever, who said the AI industry has reached what he calls “peak data,” predicted that the lack of training data will force a shift away from the way models are trained today.

Indeed, Musk suggested that synthetic data, meaning data generated by AI models themselves, is the way forward. “With synthetic data … [AI] will evaluate itself and go through this self-learning process with synthetic data,” he said.

Other companies, including tech giants such as Microsoft, Meta, OpenAI and Anthropic, are already using synthetic data to develop advanced AI models. Gartner estimated that 60% of the data used for AI and analytics projects in 2024 would be synthetically generated.

Microsoft’s Phi-4, which was open-sourced early Wednesday, was trained on synthetic data alongside real-world data. So were Google’s Gemma models. Anthropic used some synthetic data to develop one of its most powerful systems, Claude 3.5 Sonnet. And Meta fine-tuned its newest Llama series of models using data generated by AI.

Training on synthetic data has other advantages, such as cost savings. AI startup Writer claims that its Palmyra X 004 model, developed almost entirely from synthetic sources, cost just $700,000 to develop, compared to estimates of $4.6 million for a comparably sized OpenAI model.

But there are also disadvantages. Some research suggests that synthetic data can lead to model collapse, where a model becomes less “creative” and more biased in its outputs, eventually severely compromising its functionality.


