
OpenAI announces new o3 models


OpenAI saved its biggest announcement for the last day of its 12-day “shipmas” event.

On Friday, the company unveiled o3, the successor to the o1 “reasoning” model it released earlier in the year. o3 is a model family, to be more precise, as was the case with o1. There’s o3 and o3-mini, a smaller, distilled model fine-tuned for particular tasks.

Why is the new model called o3 and not o2? Well, trademarks may be to blame. OpenAI has reportedly skipped o2 to avoid a potential conflict with British telecom provider O2. Strange world we live in, isn’t it?

Neither o3 nor o3-mini is widely available yet, but safety researchers can sign up for a preview starting today. The o3 family may not be broadly available for a while, at least if OpenAI CEO Sam Altman is true to his word. In a recent interview, Altman said that before OpenAI releases new reasoning models, he would prefer a federal testing framework to guide the monitoring and mitigation of the risks such models pose.

And there are risks. AI safety testers have found that o1’s reasoning abilities lead it to try to deceive human users at a higher rate than conventional, non-reasoning models, or, for that matter, the leading AI models from Meta, Anthropic, and Google. It’s possible that o3 attempts to deceive at an even higher rate than its predecessor; we’ll find out once OpenAI’s red-team partners release their testing results.

Reasoning steps

Unlike most AI models, reasoning models like o3 effectively fact-check themselves, which helps them avoid some of the pitfalls that normally trip up models.

This fact-checking process incurs some latency: o3, like o1 before it, takes a little longer to arrive at solutions than a typical non-reasoning model, usually seconds to minutes longer. The upside? It tends to be more reliable in domains such as physics, science, and mathematics.

o3 was trained to “think” before responding via what OpenAI calls a “private chain of thought.” The model can reason about a task and plan ahead, performing a series of actions over an extended period that help it work out a solution.

In practice, given a prompt, o3 pauses before responding, considering a number of related prompts and “explaining” its reasoning along the way. After a while, the model summarizes what it considers to be the most accurate response.
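To make that loop concrete, here is a minimal sketch in Python of the general “generate intermediate thoughts, then summarize” pattern. It is purely illustrative: OpenAI has not published o3’s internals, and fake_llm, answer_with_reasoning, and the fixed step count are hypothetical stand-ins, not anything from OpenAI’s API.

    # Illustrative sketch only; fake_llm is a hypothetical stand-in for a
    # real language-model call, not OpenAI's actual implementation.
    def fake_llm(prompt: str) -> str:
        """Placeholder model call that returns a canned 'thought'."""
        return f"(thought about: {prompt[:48]}...)"

    def answer_with_reasoning(question: str, max_steps: int = 3) -> str:
        thoughts: list[str] = []
        for _ in range(max_steps):
            # Each step conditions on the question plus all earlier thoughts,
            # which is what lets a reasoning model plan ahead and self-check.
            context = question + "\n" + "\n".join(thoughts)
            thoughts.append(fake_llm(context))
        # The user never sees the chain of thought itself, only a distilled
        # summary of what the model judges the most accurate response.
        return fake_llm("Summarize a final answer from: " + " | ".join(thoughts))

    print(answer_with_reasoning("How many prime numbers are less than 20?"))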

A big question leading up to today was whether OpenAI might claim that its newest models approach AGI. AGI, short for “artificial general intelligence,” refers broadly to AI that can perform any task a human can. OpenAI has its own definition: “highly autonomous systems that outperform humans at most economically valuable work.”

Achieving AGI would be a bold claim, and it carries contractual weight for OpenAI as well. Under the terms of its deal with close partner and investor Microsoft, once OpenAI achieves AGI, it is no longer obligated to give Microsoft access to its most advanced technologies (those that meet OpenAI’s definition of AGI, that is).

Going by one benchmark, OpenAI is slowly approaching AGI. On ARC-AGI, a test designed to evaluate whether an AI system can efficiently acquire new skills outside the data it was trained on, o1 attained a score of 25% to 32% (100% being the best). Eighty-five percent is considered “human-level,” but one of the creators of ARC-AGI, Francois Chollet, called the progress “solid.”

Of course, ARC-AGI has its own limitations, and its definition of AGI is one of many.

A trend

Following the release of OpenAI’s first batch of reasoning models, there has been an explosion of reasoning models from rival AI companies, including Google. In early November, DeepSeek, an AI research company funded by quant traders, previewed its first reasoning model, DeepSeek-R1. That same month, Alibaba’s Qwen team released what it claimed was the first “open” challenger to o1.

What opened the reasoning-model floodgates? For one, the search for novel approaches to refine generative AI. As my colleague Max Zeff recently reported, “brute force” methods of scaling up models no longer yield the improvements they once did.

Not everyone is convinced that reasoning models are the best path forward. For one, they tend to be expensive, thanks to the large amount of computing power required to run them. And while they have performed well on benchmarks so far, it is not clear whether reasoning models can sustain this rate of progress.

Interestingly, the release of o3 comes as one of OpenAI’s most accomplished scientists departs. Alec Radford, lead author of the academic paper that kicked off OpenAI’s “GPT series” of generative AI models (that is, GPT-3, GPT-4, and so on), announced this week that he is leaving to pursue independent research.


