On Epoch AI’s FrontierMath benchmark, o3 solved 25.2 percent of the problems, while no other model had exceeded 2 percent. This suggests a leap forward in mathematical reasoning capabilities compared to the previous model.

Benchmarks vs. Real-World Value
Ideal applications for a PhD-level AI model include analyzing medical data, supporting climate models, and handling routine research work. If accurate, the high prices reported by The Information suggest that OpenAI believes these systems can provide substantial value for businesses. SoftBank, a major OpenAI investor, has reportedly committed to spending $3 billion on OpenAI’s agent products this year alone. This indicates significant business interest despite the costs.
OpenAI is facing financial pressures that may affect its premium pricing strategy. According to reports, the company lost $5 billion covering operating costs and other expenses associated with running its services.
News of OpenAI’s stratospheric pricing plans comes after years of relatively affordable AI services that have conditioned users to expect powerful capabilities at relatively low cost. ChatGPT Plus costs $20 per month, while Claude Pro is $30. Both are tiny fractions of the proposed enterprise tiers. Even ChatGPT Pro’s $200 per month is small compared to the proposed fees. It is unclear whether the performance difference between these tiers will match the thousandfold price difference.
Despite their benchmark performance, these simulated reasoning models still struggle with confabulations: instances where they generate plausible-sounding but factually incorrect information. This is a major concern for research applications that require accuracy and reliability. A $20,000 monthly investment raises the question of whether organizations can trust these systems not to introduce subtle errors into high-stakes research.
In reaction to the news, many people joked on social media that businesses could hire a PhD student for much less. “In case you have forgotten,” xAI developer Hieu Pham wrote in a viral tweet, “most PhD students, including the brightest stars who can do way better work than any current LLMs—are not paid $20K / month.”
Although these systems show strong abilities on specific benchmarks, “PhD-level” remains largely a marketing term. These models can process and synthesize information at impressive speeds. However, questions remain about how well they can handle the creative thinking, intellectual skepticism, and original research that are all part of doctoral-level work. On the other hand, they will never tire or need health insurance, and they will likely continue to improve and drop in cost over time.