OpenAI unveils deep-research agent for ChatGPT
OpenAI claims the agent can complete its task in anywhere between five and thirty minutes, and it published a number of statistics to support its claims. Deep research achieved 26.6 percent accuracy on the Humanity’s Last Exam evaluation, a benchmark for LLMs consisting of 3,000 questions across 100 subjects. By comparison, GPT-4o scored only 3.3 percent and Grok-2 3.8 percent. Users may feel a sense of déjà vu.
Google announced Deep Research for Gemini Advanced subscribers in December 2024 and claimed that the technology would save them “hours of time.”
Google Deep Research creates a multi-step research plan, which the user can either approve or revise. Once the bot has been given the OK, it searches the internet on the user’s behalf.
OpenAI’s deep research is more a matter of asking ChatGPT a question, adding resources such as spreadsheets for context, and then letting it run. The result includes citations as well as a summary of the agent’s reasoning. The user is still responsible for checking the references and verifying the information returned by the software.
Verification is still necessary. According to OpenAI’s internal evaluations, deep research produced fewer inaccuracies or hallucinations than existing ChatGPT models. However, the company cautioned: “It may struggle with distinguishing authoritative information from rumors, and currently shows weakness in confidence calibration, often failing to convey uncertainty accurately.”
The deep research agent is currently available only to Pro users, who pay the company $200 per month. Plus and Team users, followed by Enterprise, will be added in the future. OpenAI has said that paying customers will soon get “significantly higher rate limits” when the company releases a faster version powered by a smaller model.
It is interesting that DeepSeek’s AI models were released after the arrival of OpenAI’s o1.
OpenAI envisions combining deep research with Operator, which can take real-world actions, to “enable ChatGPT to carry out increasingly sophisticated tasks.”