News

Opera Mini launches AI-powered update to compete with Google and Microsoft...

AI Observer
News

Introducing Gemini 2.0: our new AI model for the agentic era

AI Observer
News

Why ‘Beating China’ in AI Brings Its Own Risks

AI Observer
News

AI means the end of internet search as we’ve known it

AI Observer
News

How optimistic are you about AI’s future?

AI Observer
News

State-of-the-art video and image generation with Veo 2 and Imagen 3

AI Observer
News

What’s next for AI in 2025

AI Observer
Natural Language Processing

Virtual Personas for Language Models via an Anthology of Backstories

AI Observer
News

Why Apple Intelligence Might Fall Short of Expectations?

AI Observer
Natural Language Processing

Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination

AI Observer
Natural Language Processing

FACTS Grounding: A new benchmark for evaluating the factuality of large...

AI Observer

Featured

News

OpenAI’s Deep Research is more accurate than you in fact-finding, but...

AI Observer
News

OpenAI releases new simulated reasoning models with full access to tools

AI Observer
News

xAI adds a memory feature to Grok

AI Observer
AI Hardware

Congress wants to know if Nvidia superchips slipped through Singapore to...

AI Observer
AI Observer

OpenAI’s Deep Research is more accurate than you in fact-finding, but...

Wei and team don't directly offer any hypothesis about why Deep Research fails almost half the time, but the implicit answer is in the scaling of its ability with more compute. As they run more parallel tasks, and ask the model to evaluate multiple answers, the accuracy scales...