Physical Address

304 North Cardinal St.
Dorchester Center, MA 02124

Openai O3-Mini launches the latest ‘reasoning’ model


On Friday, Openai launched the new AI “Reasoning” model, O3-mini, the company’s newest O my family of thinking models.

Open First previously reviewed the model in December Along with a more skilled system called O3, the launcher’s company, ambitions and challenges – seemingly growing seemingly from the day.

Openai struggles with the perception of the EI race Chinese companies such as DeepSeekwhich Openai claims can steal IP. Nevertheless, Chatgpt managed to win over Maker’s scores and works Lift their relationship with Washington to the shore As followed by the same time as follows Ambitious data center projectIt is also reported to be set out For one of the largest financing tours of a technological company in history.

It brings us to O3-Mini. Openai makes a new model like both “strong” and “favorable.”

“Today’s launch marks (…) An important step towards expanding an advanced AI in the service of our mission,” he said.

More efficient justification

Unlike the largest language models, check themselves before the result of the results of substantial models such as O3-mini. This helps them Avoid some traps normally walking models. These justification models take a little longer to arrive in solutions, but although trading is tended to be more reliable, although it is in domains such as physics.

O3-mini is good for special problems for programming, math and science. Openai, the model is mainly equal to the O1 family, O1 and O1-mini, but it works faster, but it works faster and less costs.

The company claimed that foreign testers preferred the answers over the O3-mini of more than more than half-thousand. O3-mini also made 39% less “main mistake” on “Hard Real World Questions” A / B tests The O1-mini and mini-mini and prepared “clearer” answers while responding about 24% faster.

It will be available for all users through O3-Mini Chatgpt Users who began on Friday, but the company pays 150 queries a day, Chatgpt Pro subscribers will get unlimited access. O3-mini, Chatgpt Enterprise and Chatgpt Edu said in a week (no words) Chatgpt Gov).

Users with Premium Chatgpt plans can choose O3-mini using the drop-down menu. Free users can click on the new “cause” button on the chat bar or tap the “reducing” to the Chatgpt.

Starting on Friday, O3-mini will also be available through Openai’s API to choose developers, but will not be supported to analyze the images first. The Devs can choose the level of “key, medium or high) to” make “the” more thinking “based on the need for use and delay.

O3-mini receives a million priests of entrance to a million priests and a million token of one million token in about 750,000 words. This is 63% cheaper than O1-Mini and are competitive with DeepSeek R1 R1 Reason model prices. Deepselaek rose to the signs of a million in a million inputs and a million dollars.

In ChatGpt, O3-mini, Openai’s “a balanced trade between speed and accuracy” is in line with secondary thinking efforts. Paid users will have a “O3-mini-high” selection choice in the model selector called “Ali Zaka” in exchange for slower answers.

Regardless of which version of O3-mini ChatGPT users will work with the search to find the actual response to the relevant web sources links. Openai warns that the functionality is “prototype” because it works for searching for searching for justifying models.

“O1, although the extensive general knowledge remains a substantial model, it provides a special alternative to technical areas that require accurate and speed,” he wrote in a blog post. “The release of O3-mini is taking another step in Openai’s mission to push the borders of effective intelligence.”

Caveats are too much

O3-mini is not the strongest model of Openai to this day, and the DeepSeek’s R1 R1 justification model in each benchmark.

O3-mini, in 2024 in AIME 2024 R1 in R1, how well the models meet and meet complex instructions and only with high thinking works. Both R1’s Programming-oriented test on SWEN-DENCH (.1 point), again hit by high thinking. Low reasoning effort, R1 in R1 in Gpqa Diamond, which tests models with physics, doctoral level physics, biology and chemistry questions.

To be fair, O3-mini responds to many inquiries at a competitive low price and delay. In the article, Openai compares his performance to the O1 family:

“With low thinking efforts, O3-mini can get the performance compared to O1-mini, the average effort, O3-mini can get comparable performance with O3,” Openai writes. “Openai writes O3-mini, O3- Mini, faster response, complies with performance in math, coding and science. Meanwhile, with high thinking efforts, O3-mini, both O1-mini, but also superior to O1. “

It should be noted that the performance advantage over O3-mini O1 is thin in some areas. In 2024, in AIME 2024, O3-mini O1 strikes only 0.3 percentage points when effort is made for high thinking. And GPGA will not even overturn the O3-mini in Diamond, even high thinking.

Think about the “thinking” thinking “thinking” thinking “thinking” thinking “means” thinking “means” thinking “means” thinking “methods” means “thinking” methods “in” thinking “methods”. According to the company, O3-mini “significantly coincided”, one of the flagship of Openai, GPT-4OAbout “challenging security and jailbreak assessments.”



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *