Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
So-called reasoning AI models are becoming easier and cheaper to develop.
On Friday, NovaSky, a team of researchers based at UC Berkeley’s Sky Computing Lab, released Sky-T1-32B-Preview, a reasoning model that is competitive with an earlier version of OpenAI’s o1 on a number of key benchmarks. Sky-T1 appears to be the first truly open-source reasoning model in the sense that it can be replicated from scratch; the team released the dataset they used to train it as well as the necessary training code.
“Impressively, the Sky-T1-32B-Preview was trained for less than $450,” the team wrote in a blog post, “demonstrating that it is possible to cost-effectively and efficiently replicate higher-order thinking abilities.”
Unlike most AI models, reasoning models effectively fact-check themselves, which helps them avoid some of the pitfalls that usually trip models up. Reasoning models take a little longer to reach solutions than typical non-reasoning models, usually seconds to minutes longer. The advantage is that they tend to be more reliable in areas such as physics, science and mathematics.
The NovaSky team says it used another reasoning model, Alibaba’s QwQ-32B-Preview, to generate the initial training data for Sky-T1. It then “curated” the data mix and used OpenAI’s GPT-4o-mini to convert the data into a more workable format. Training the 32-billion-parameter Sky-T1 took about 19 hours using 8 Nvidia H100 GPUs. (Parameters roughly correspond to a model’s problem-solving skills.)
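As a rough sanity check on the figures above, the short Python sketch below works out the total GPU-hours of the run and the hourly price that a $450 ceiling would imply. It assumes the quoted cost covers only GPU rental for the 19-hour training run; the source does not state a per-hour rate, so the resulting number is derived rather than reported, and the function names are illustrative.

```python
def gpu_hours(num_gpus: int, wall_clock_hours: float) -> float:
    """Total GPU-hours consumed by a run that keeps every GPU busy throughout."""
    return num_gpus * wall_clock_hours


def implied_hourly_rate(total_cost_usd: float, num_gpus: int,
                        wall_clock_hours: float) -> float:
    """Price per GPU-hour implied by a total cost (assumes cost is GPU rental only)."""
    return total_cost_usd / gpu_hours(num_gpus, wall_clock_hours)


# Figures quoted in the article.
NUM_GPUS = 8             # Nvidia H100 GPUs
TRAINING_HOURS = 19      # approximate wall-clock training time
COST_CEILING_USD = 450   # "less than $450"

total = gpu_hours(NUM_GPUS, TRAINING_HOURS)
rate = implied_hourly_rate(COST_CEILING_USD, NUM_GPUS, TRAINING_HOURS)

print(f"{total:.0f} GPU-hours")                         # 152 GPU-hours
print(f"implied ceiling: ${rate:.2f} per GPU-hour")     # implied ceiling: $2.96 per GPU-hour
```

Under that assumption, the run works out to 152 GPU-hours, so "less than $450" corresponds to paying under roughly $2.96 per H100-hour.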
According to the NovaSky team, Sky-T1 performs better than an early preview version of o1 on MATH500, a set of “competition-level” math problems. The model also outperforms the o1 preview on a set of challenging problems from LiveCodeBench, a coding benchmark.
However, Sky-T1 falls short of the o1 preview on GPQA-Diamond, which contains physics, biology and chemistry questions that a PhD graduate would be expected to know.
It is also important to note that OpenAI’s GA release of o1 is a more powerful model than the preview version of o1, and that OpenAI is expected to release an even better-performing reasoning model, o3, in the coming weeks.
But the NovaSky team says Sky-T1 marks only the beginning of its journey to develop open-source models with advanced reasoning capabilities.
“Moving forward, we will focus on developing more efficient models that retain strong inference performance and explore advanced techniques that further increase the efficiency and accuracy of the models during testing,” the team wrote in the post. “Stay tuned as we make progress on these exciting initiatives.”