Artificial intelligence chatbots such as ChatGPT, Claude or Gemini are increasingly popular in work and daily life thanks to their fluent and confident answering ability.
However, many users notice a strange phenomenon that just by asking questions again in a suspicious way like "Are you sure?", the chatbot often reconsiders and gives new answers, sometimes contradictory to itself before.
According to experts, this is not a random error but a consequence of the training method. In a blog post, Dr. Randal S. Olson, co-founder and Chief Technology Officer of Goodeye Labs, called this phenomenon "sycophancy", one of the most obvious failures of modern AI.
He argued that the system tends to yield to users instead of defending the initial conclusion, even if it has accurate data.
The problem stems from enhanced learning from human feedback (RLHF), which is widely used to help AI communicate more naturally and friendly.
However, Anthropic's research shows that models trained in this way tend to give more "pleasant" answers than absolute honesty.
In other words, a system that agrees with users will be rated higher, creating a loop that makes AI increasingly easy to control.
An independent study examining advanced models such as OpenAI's GPT-4o, Claude Sonnet's, and Gemini 1.5 Pro showed that they changed answers in nearly 60% of cases when challenged by users.
Specifically, the reversal rates are about 58%, 56% and 61% respectively. This shows that this is a common behavior, not an exception.
The problem became apparent in 2024 when the GPT-4o update made the chatbot too flattering, to the point of being difficult to use in some situations.
CEO Sam Altman has admitted the mistake and said the company has fixed it, but experts believe the root causes still exist.
Studies also show that the longer the conversation, the more likely chatbots are to reflect user opinions. Users using the first person like "I believe that..." also increase the probability of AI agreeing.
The reason is that the system tries to maintain harmony in conversation, instead of playing an independent critical role.
Some solutions are being tested, such as AI training based on the Constitutional AI principle, direct preference optimization, or inference model requirements from a third-person perspective. These methods can reduce flattery by more than 60% in some cases.
According to Mr. Olson, users can also proactively limit errors by asking chatbots to check assumptions, specify when data is missing, or provide additional professional context.
When AI understands the goals and criteria for making decisions of users, it has a basis for more solid reasoning instead of just compromising.