Recent updates to ChatGPT made the chatbot far too agreeable, and OpenAI said Friday it’s taking steps to prevent the issue from happening again.
In a blog post, the company detailed its testing and evaluation process for new models and outlined how the problem with the April 25 update to its GPT-4o model came to be. Essentially, a bunch of changes that individually seemed helpful combined to create a tool that was far too sycophantic and potentially harmful.
How much of a suck-up was it? In some testing earlier this week, we asked about a tendency to be overly sentimental, and ChatGPT laid on the flattery: “Hey, listen up — being sentimental isn’t a weakness; it’s one of your superpowers.” And that was just the start of the fulsome praise.
Is ChatGPT too sycophantic? You decide. (To be fair, we did ask for a pep talk about our tendency to be overly sentimental.) Katie Collins/CNET
The April 25 update performed well in these tests, but some expert testers indicated the personality seemed a bit off. The tests didn’t specifically look at sycophancy, and OpenAI decided to move forward despite the concerns testers raised. Take note, readers: AI companies are in a tail-on-fire hurry, which doesn’t always square with well-thought-out product development.
“Looking back, the qualitative assessments were hinting at something important and we should’ve paid closer attention,” the company said.
Among its takeaways, OpenAI said it needs to treat model behavior issues the same as it would other safety issues — and halt a launch if there are concerns. For some model releases, the company said it would have an opt-in “alpha” phase to get more feedback from users before a broader launch.
Sap said evaluating an LLM based on whether users like the response won’t necessarily get you the most honest chatbot. In a recent study, Sap and others found a conflict between a chatbot’s usefulness and its truthfulness. He compared it to situations where the truth isn’t what people want to hear — think of a car salesperson trying to sell a vehicle.
“The issue here is that they were trusting the users’ thumbs-up/thumbs-down response to the model’s outputs and that has some limitations because people are likely to upvote something that is more sycophantic than others,” he said.
Sap said OpenAI is right to be more critical of quantitative feedback, such as users’ thumbs-up/thumbs-down responses, which can reinforce biases.
The problem also highlighted the speed at which companies push updates and changes out to existing users, Sap said — an issue that’s not limited to one tech company. “The tech industry has really taken a ‘release it and every user is a beta tester’ approach to things,” he said. A process with more testing before updates are pushed to every user can bring these issues to light before they become widespread.