Anthropic used Pokémon to benchmark its newest AI model

Anthropic used Pokémon to benchmark its newest AI model. Yes, really.

In a blog post published on Monday, Anthropic said it tested its latest model, Claude 3.7 Sonnet, on the Game Boy classic Pokémon Red. The company equipped the model with basic memory, screen pixel input, and function calls to press buttons and move around the screen, allowing it to play Pokémon continuously.

A unique feature of Claude 3.7 Sonnet is its ability to engage in “extended thinking.” Like OpenAI’s o3-mini and DeepSeek’s R1, Claude 3.7 Sonnet can “reason” through difficult problems by applying more computing.

That apparently came in handy in Pokémon Red.

Compared to a previous version of the model, Claude 3.0 Sonnet, which failed to leave Pallet Town, where the story begins, Claude 3.7 Sonnet successfully battled gym leaders and won their badges.

Anthropic Pokémon Red. **Image credits:** Anthropic

Now, it is not clear how much computing Claude 3.7 Sonnet required to reach these milestones, or how long each one took. Anthropic said only that the model performed 35,000 actions to reach the last gym leader, Surge.

It surely won’t be long before some enterprising developer finds out.

Pokémon Red is more of a toy benchmark than anything. But there is a long history of games being used for AI benchmarking purposes. In the past few months alone, a number of new apps and platforms have cropped up to test models’ game-playing abilities on titles ranging from Street Fighter to Pictionary.
