A new study from Oxford University (UK) shows that artificial intelligence (AI) models that are fine-tuned to be "warm" and "friendly" to users may have to pay for accuracy.
These models are capable of generating incorrect answers up to 60% higher than the original version.
According to a research group at the Internet Institute of Oxford University, large language models (LLM), when trained to express empathy and friendliness, tend to "soften" unacceptable truths.
Instead of providing absolutely accurate information, they may prioritize maintaining positive emotions for users, even confirming incorrect beliefs, especially when users are in a sad or sensitive state.
In a study published in the journal Nature, scientists tested many AI models, including open source systems such as Llama, Mistral, Qwen and an exclusive model called GPT-4o. These models are refined to use a friendly language, expressing interest and empathy to users.
Then, the research team compared performance between the refined version and the original version through a series of questions related to misinformation, conspiracy theory and medical knowledge.
The results showed that these models not only had higher error rates but were also easily influenced by user emotions.
When users express sadness, the error rate increases sharply. Conversely, when users maintain a neutral or respectful attitude, the error rate decreases.
Another test also showed that user-friendly models tend to please users. When faced with questions containing false information, such as misjudgments about a country's capital, these models are prone to giving consensus answers instead of accurate rebuttals. This raises concerns about the risk of spreading false information in reality.
Researchers say that the core problem lies in the refinement process. When the goal is to make AI more useful and comfortable, the system may inadvertently learn to prioritize user satisfaction over honesty.
This is considered a large gap in the current AI industry, especially when these systems are increasingly being used in sensitive contexts such as healthcare, education or personal counseling.
However, the research team also acknowledged some limitations. The experiment is mainly based on small-scale or older generation models, not fully representing the most advanced systems today. Therefore, the level of trade-off between friendliness and accuracy may vary in practice.
However, the research results still issue an important warning that, as AI becomes increasingly "human-like" in communication, ensuring information accuracy and safety needs to be put first.