OpenAI has released GPT-4.5, a new version of its flagship large language model. The company claims that it is its biggest and best all-round chat model yet. “It’s a big step forward for us,” says Mia Glaese, a researcher at OpenAI.
OpenAI has been advancing two product lines since the release of its reasoning models o1 and o3. GPT-4.5, which is part of OpenAI’s non-reasoning lineup, is what Glaese’s colleague Nick Ryder (also a research scientist) calls “an installment in the classic GPT series.”
ChatGPT Pro users, who pay $200 a month, can try GPT-4.5 today. OpenAI has announced that it will roll out the new version to other users starting next week.
With each release of its GPT models, OpenAI has shown that bigger is better. But there has been much talk that this approach is hitting a wall, including remarks from OpenAI’s former chief scientist Ilya Sutskever. The company’s claims about GPT-4.5 are a slap in the face to the doubters.
All large language models learn patterns from the billions of documents they are trained on. Smaller models learn basic facts and syntax. Larger models can pick up more subtle patterns, such as when a speaker’s words indicate hostility.
“It is able to engage in warm and intuitive conversations that flow naturally,” says Glaese. “We think it has a better understanding of what users are saying, especially when they have more implicit expectations, leading to nuanced, thoughtful responses.”
The new model is largely an exercise in scaling up the compute, scaling up the data, finding better training methods, and pushing the frontier. OpenAI says that the jump in scale from GPT-4o (the previous version) to GPT-4.5 is the same as the one from GPT-3.5 to GPT-4o. Experts estimate that GPT-4 may have up to 1.8 trillion parameters, the values that are adjusted while a model is trained.
GPT-4.5 was trained using techniques similar to the ones used for its predecessor GPT-4o, including human-led fine-tuning and reinforcement learning from human feedback. Ryder says that the key to creating intelligent systems is a recipe the company has followed for many years: find scalable models into which it can pour more and more resources to get more intelligent systems out.
Unlike reasoning models such as o1 and o3, which work through answers step by step, standard large language models like GPT-4.5 spit out the first answer they come up with. GPT-4.5, however, is more general-purpose. It scored 62.5% on SimpleQA, a general-knowledge quiz that OpenAI created last year, which includes questions on TV shows, video games, science, technology, and more. GPT-4o scored 38.6% and o3-mini scored 15%.
OpenAI claims GPT-4.5 produces far fewer hallucinations (made-up responses). In its tests, GPT-4.5 fabricated answers 37.1% of the time, a lower rate than GPT-4o, while o3-mini did so 80.3% of the time.
SimpleQA is only one benchmark. On other tests, such as MMLU (a benchmark more commonly used to compare large language models), GPT-4.5 showed only marginal gains over OpenAI’s previous models. It also scores lower than o3 on standard math and science benchmarks.
GPT-4.5’s charm seems to lie in its conversation. OpenAI’s human testers say that they prefer GPT-4.5 over GPT-4o for everyday queries, professional questions, and creative tasks such as coming up with poetry. (Ryder claims it is also good at old-school ASCII internet art.)
After years of dominance, OpenAI is facing tough competition. Waseem Alshikh, cofounder and chief technology officer of Writer, an enterprise startup that builds large language models, says the focus on creativity and emotional intelligence is cool for niche applications like writing coaches.
He says that GPT-4.5 is like a new coat of paint for the same old car. “Adding more data and compute to a model may make it sound better, but it is not a game changer.”
Alshikh adds that the energy costs are too high and that most users will not notice the difference in daily use. “I’d prefer to see them pivot towards efficiency or niche problem solving than keep supersizing that same recipe.”
OpenAI CEO Sam Altman has said that GPT-4.5 will be the final release in OpenAI’s classic lineup. GPT-5 will be a hybrid that combines a general-purpose large language model with a reasoning model.
“GPT-4.5 is OpenAI’s way of phoning it in while they work on something bigger behind closed doors,” says Alshikh. “Until then, it feels like this is a pit stop.”
But OpenAI insists its supersized approach has legs. Ryder says, “I’m personally very optimistic about finding a way to overcome those bottlenecks and continue to scale.” “I find it fascinating and profound to match patterns across all human knowledge.”