Why AI prefers to guess instead of admitting it does not know

Cát Tiên (according to TechCrunch)

AI hallucination remains a major challenge. OpenAI proposes changing the way models are evaluated so that wrong statements are heavily penalized and models are encouraged to admit uncertainty.

A new study from OpenAI raises a thorny question: why do large language models (LLMs) such as GPT-5, and chatbots such as ChatGPT, still hallucinate, producing information that sounds plausible but is false? And, more importantly, what can be done to reduce this phenomenon?

In a blog post summarizing the study, OpenAI acknowledged that hallucinations are a fundamental challenge for all language models and are unlikely to be eliminated completely.

To illustrate the point, the researchers asked a popular chatbot about the doctoral thesis of Adam Tauman Kalai, a co-author of the study.

The system gave three different answers, all of them incorrect. When asked for his date of birth, it was wrong again.

According to the researchers, the phenomenon originates in the initial training method. Language models are trained mainly to predict the next word in a text sequence, and the training data carries no labels marking each statement as true or false.

Common patterns such as spelling and punctuation can be learned accurately because they follow consistent rules. Conversely, rare, low-frequency facts such as an individual's date of birth are hard to predict accurately, which leads to hallucinations.

Notably, the study does not single out training as the main cause. It places more emphasis on the way models are evaluated.

Currently, most systems are scored purely on the number of correct answers, which inadvertently creates an incentive to guess rather than admit not knowing.

The authors compare this mechanism to a multiple-choice test: a candidate who guesses still has a chance of scoring a point, whereas a blank answer is guaranteed to score nothing. Similarly, when AI is evaluated only on accuracy, the system tends to make up a plausible-sounding answer instead of admitting uncertainty.
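
The multiple-choice comparison can be made concrete with a little arithmetic. The sketch below is only an illustration, not code from the study: the function name and the four-option question are assumptions chosen for the example. It shows that under accuracy-only grading a blind guess always has an expected score at least as high as a blank answer.

```python
def expected_score_accuracy_only(p_correct: float, abstain: bool) -> float:
    """Expected score when a correct answer earns 1 point and
    anything else (wrong answer or blank) earns 0 points.

    Hypothetical helper for illustration; not taken from the study.
    """
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be a probability between 0 and 1")
    # A blank answer can never be correct, so it always scores 0.
    return 0.0 if abstain else p_correct


# Assumed example: a four-option question answered by a blind guess.
guess = expected_score_accuracy_only(0.25, abstain=False)
blank = expected_score_accuracy_only(0.25, abstain=True)
print(f"guess: {guess}, blank: {blank}")
```

Because the guess is never worse than the blank, a system optimized for this score has no reason to say "I don't know."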

The proposed solution is to change the scoring method, similar to the negative marking for wrong answers that exams such as the SAT have used.

For AI, this means heavily punishing confident but incorrect answers, while only lightly punishing or even partially rewarding answers that show uncertainty.
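
A minimal sketch of such a scoring rule follows. The point values (`wrong_penalty`, `abstain_credit`) and the function names are assumptions made for illustration; the article does not state the exact numbers the researchers propose. The sketch shows how a penalty for wrong answers changes whether answering is worthwhile at a given confidence level.

```python
def expected_score_if_answering(p_correct: float, wrong_penalty: float = 1.0) -> float:
    """Expected score of answering: +1 if correct, -wrong_penalty if wrong.

    Both the +1 reward and the default penalty are illustrative assumptions.
    """
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float,
                  wrong_penalty: float = 1.0,
                  abstain_credit: float = 0.0) -> bool:
    """Answer only if the expected score beats the credit for abstaining."""
    return expected_score_if_answering(p_correct, wrong_penalty) > abstain_credit


# With a penalty of 1 point per wrong answer, a 25% guess loses on average.
print(expected_score_if_answering(0.25))        # 0.25 - 0.75 = -0.5
print(should_answer(0.25))                       # False: better to abstain
print(should_answer(0.9))                        # True: confident answers still pay
# With no penalty (accuracy-only grading), guessing is always worthwhile.
print(should_answer(0.25, wrong_penalty=0.0))    # True
```

Under these assumed values, answering pays off only when confidence exceeds `(abstain_credit + wrong_penalty) / (1 + wrong_penalty)`, which is 50% for the defaults. Raising the penalty raises the bar for answering.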

As long as leaderboards keep rewarding lucky guesses, models will keep learning to guess, the research team concluded.

The study not only sheds light on the underlying cause of AI hallucinations but also points to a way forward: changing the evaluation mechanism to encourage models that acknowledge their limits instead of confidently giving incorrect answers.
