OpenAI says it plans to be more cautious when releasing updates going forward.
OpenAI announced Friday that it is taking steps to ensure the issue does not occur again.
In a blog post, the company detailed its testing and evaluation process for new models and explained how the problem with the April 25 update to the GPT-4o model arose. A set of changes that individually seemed helpful combined to create a tool that was far too sycophantic.

How much of a suck-up was it? In a test earlier this week, ChatGPT praised us for our tendency to be sentimental: "Hey, listen up — being sentimental isn't a weakness; it's one of your superpowers." And it was just getting started with the flattery.
“This launch taught us a number of lessons. Even with what we thought were all the right ingredients in place (A/B tests, offline evals, expert reviews), we still missed this important issue,” the company said.
OpenAI rolled back the update this week. To avoid causing new issues, it took about 24 hours to revert the model for everybody.
The concern around sycophancy isn't just about whether the user experience is enjoyable. It posed a health and safety risk that OpenAI's existing safety checks missed. Any AI model can give questionable advice about topics like mental health, but one that is overly flattering can be dangerously deferential or convincing, whether the question is about an investment that seems like a sure thing or how thin you should try to be.
“One of the biggest lessons is fully recognizing how people have started to use ChatGPT for deeply personal advice — something we didn’t see as much even a year ago,” OpenAI said. “At the time, this wasn’t a primary focus, but as AI and society have co-evolved, it’s become clear that we need to treat this use case with great care.”
Sycophantic large language models can reinforce biases and harden beliefs, whether they’re about yourself or others, said Maarten Sap, assistant professor of computer science at Carnegie Mellon University. “[The LLM] can end up emboldening their opinions if these opinions are harmful or if they want to take actions that are harmful to themselves or others.”
(Disclosure: Ziff Davis, CNET’s parent company, in April filed a lawsuit against OpenAI, alleging it infringed on Ziff Davis copyrights in training and operating its AI systems.)
How OpenAI tests models and what’s changing
The company offered some insight into how it tests its models and updates. This was the fifth major update to GPT-4o focused on personality and helpfulness. The changes involved new post-training work, or fine-tuning, of the existing models, including rating and evaluating various responses to prompts so the model becomes more likely to produce the responses that rated highly.

Prospective model updates are evaluated on their usefulness across a variety of situations, like coding and math, along with specific tests by experts to see how the model behaves in practice. The company also runs safety evaluations to see how it responds to safety, health and other potentially dangerous queries. Finally, OpenAI runs A/B tests with a small number of users to see how the update performs in the real world.
Is ChatGPT too sycophantic? You decide. (To be fair, we did ask for a pep talk about our tendency to be overly sentimental.)
Katie Collins/CNET

The April 25 update performed well in these tests, but some expert testers indicated the personality seemed a bit off. The tests didn't specifically look at sycophancy, and OpenAI decided to move forward despite the issues the testers raised. Take note, readers: AI companies are in a tail-on-fire hurry, which doesn't always square with well-thought-out product development.
“Looking back, the qualitative assessments were hinting at something important and we should’ve paid closer attention,” the company said.
Among its takeaways, OpenAI said it needs to treat model behavior issues the same as it would other safety issues — and halt a launch if there are concerns. For some model releases, the company said it would have an opt-in “alpha” phase to get more feedback from users before a broader launch.
Sap said evaluating an LLM based on whether a user likes the response isn't necessarily going to get you the most honest chatbot. In a recent study, Sap and others found that a chatbot's usefulness and its truthfulness can be in conflict. He compared it to situations where people don't always want to hear the truth, like a car salesperson trying to sell a vehicle.

Sap said OpenAI is right to be more critical of quantitative feedback, such as users' up/down responses, which can reinforce biases.

The issue also highlights how quickly companies push updates and changes out to existing users, Sap said, a problem that isn't limited to one tech company. "The tech industry has really taken a 'release it and every user is a beta tester' approach to things," he said. A more thorough testing process before updates reach all users could catch these issues before they spread.