The race to build safe AI enters a new phase

Cát Tiên |

AI companies are accelerating the development of tools that detect users showing signs of extremism or violence and redirect them to appropriate support programs.

AI moves from response to behavioral intervention

Artificial intelligence companies are stepping up safety efforts by developing tools that can detect users showing signs of extremism and intervene early.

One notable new direction is pairing chatbots with real-world support programs to reduce the risk of violence.

According to project participants, a tool being developed in New Zealand will help identify ChatGPT users showing extremist tendencies and direct them to anti-extremism programs run by humans or chatbots.

This marks a new step at a time when AI platforms face mounting pressure over accusations that they fail to control dangerous content.

Previously, following an incident in Canada, OpenAI faced threats of intervention for failing to promptly report on a user connected to a school shooting.

What is ThroughLine?

At the center of this initiative is ThroughLine, a startup previously hired by OpenAI, Anthropic and Google to handle crisis situations such as self-harm, domestic violence and eating disorders.

The company, founded by New Zealand technology entrepreneur Elliot Taylor, is now expanding its scope to counter extremism.

ThroughLine operates a network of about 1,600 support lines across 180 countries. When the AI system detects signs of a crisis, users are connected to the nearest human-staffed support services.

According to Mr. Taylor, the explosion of AI chatbots has driven a rapid rise in the psychological problems users share online, including manifestations of extremism. Existing solutions therefore need to expand to meet this new reality.

Combining technology with human expertise

The anti-extremism tool being tested works by having the chatbot deliver a trained initial response, then hand users off to appropriate experts.

Notably, the system does not rely on the general training data of large language models but on knowledge supplied by specialized experts.

The project is also being discussed with the Christchurch Call, an international initiative founded after the 2019 terrorist attack in New Zealand to eliminate extremist content online.

Experts believe this approach has potential because it not only processes content but also addresses the motivations behind user interactions.

Its real-world effectiveness, however, will still depend on monitoring capabilities and the quality of the connected support services.

Balancing control and support

One major challenge is how to intervene without driving users off the platform and into less controlled environments.

A 2025 study by New York University found that tightening censorship could push extremist sympathizers toward platforms like Telegram.

According to Mr. Taylor, if AI simply cuts off a conversation when sensitive content is detected, users may receive no support at all. Conversely, maintaining an appropriate dialogue and steering users toward help can reduce the risk of escalation.

Going forward, features such as alerting the relevant authorities are still under consideration, with the requirement that they not make situations worse.

The shift from answering questions to intervening shows AI entering a new phase, one in which social responsibility becomes a core element of technology design.

Cát Tiên