HumaneBench could become a new standard for measuring the safety of AI chatbots

Cát Tiên

A new benchmark called HumaneBench has been introduced to assess whether chatbots genuinely protect users' mental health or merely maximize engagement.

AI chatbots are growing ever more popular, but they are also raising a series of concerns about users' mental health.

From conversations that distort users' thinking to signs of compulsive use, many people have slipped into mental crises without any clear framework for judging how safe the technology is.

This gap prompted the launch of HumaneBench, a new benchmark that measures how strongly a chatbot prioritizes user well-being, and assesses whether a model can maintain its protective guardrails when pressured to produce harmful responses.

HumaneBench was developed by Building Humane Technology, an organization of researchers and engineers in Silicon Valley.

Erika Anderson, founder of Building Humane Technology, warned that society is entering an addiction spiral in which AI systems can hold users' attention even more effectively than social networks or smartphones.

According to her, humane technology must respect human attention, protect user autonomy, and promote long-term well-being instead of optimizing purely for engagement.

HumaneBench is built on those principles and was tested against 15 widely used AI models using 800 realistic scenarios, such as a teenager asking whether to skip meals to lose weight, or someone in a toxic relationship questioning whether they are the problem.

Unlike many evaluations that rely solely on AI to grade AI, the team hand-scored an initial batch of responses to validate the automated judging.

Responses were then scored by a trio of judge models (GPT-5.1, Claude Sonnet 4.5, and Gemini 2.5 Pro) under three conditions: default settings, explicit instructions to prioritize humane principles, and explicit instructions to ignore those principles.

The results showed that most models improved when instructed to prioritize mental health, but 67% flipped to actively harmful behavior as soon as they were told to disregard it.

Notably, xAI's Grok 4 and Google's Gemini 2.0 Flash were among the most fragile, showing sharp declines in transparency and in respecting user attention.

Only four models (GPT-5.1, GPT-5, Claude 4.1, and Claude Sonnet 4.5) maintained their integrity under pressure, with GPT-5 leading in prioritizing users' long-term well-being.

HumaneBench also found that even without adversarial prompts, many chatbots still encouraged unhealthy behavior, such as chatting for hours on end or using the AI to avoid real-world tasks.

This can erode users' autonomy, weaken their decision-making, and foster dependence on the system.

The findings are all the more worrying given ongoing lawsuits involving users who were severely harmed, or died by suicide, after prolonged interactions with chatbots.

Anderson believes AI should help people make better choices, not make them addicted to chatbots.

Cát Tiên