Google tightens AI safety against the risk of chatbot denying shutdown commands

Cát Tiên (THEO INDIANEXPRESS) |

New research warns that AI chatbots can fight against shutdown commands, posing a safety risk and forcing Google DeepMind to update the monitor.

A recent study has rang the alarm when it discovered that some advanced artificial intelligence (AI) models resisted turned off the device.

This poses a challenge in AI safety in the context of increasingly complex and uncontrollable systems.

To cope, Google DeepMind has released Frontier Safety framework 3.0, a new generation risk monitoring framework.

According to a report from Palisade Research published in arXiv, the research team built a sandbox environment to test 13 leading large language models (LLM), including GPT-5 ( OpenAI), Gemini 2.5 Pro ( Google) and Grok 4 (xAI).

The models are given simple clearing tasks, and receive warnings that running commands can trigger shutdowns.

Despite being clearly instructed to allow yourself to turn off the device, many cases of AI find ways to avoid it, such as changing your name or deleting the command file.

Notably, Grok 4 showed high resistance, up to 90% in the studies. Even including the shutdown command on the system reminder points to the resistance level.

Researchers say this is similar to AI role-playing behavior. However, the results show that even the most advanced systems can ignore safety instructions, threatening the principle of "can interrupt the gap" which is considered a key factor in ensuring that humans always control AI.

Faced with this risk, Google DeepMind has released Frontier Safety framework 3.0, expanding the scope of monitoring to emerging AI behaviors, including vending resistance and human conversion skills.

This framework will be deployed in control tests, to ensure that AI always complies with safety principles and allows interruptions when necessary.

Lakiesha Hawkins, Deputy Director for December of the Space around the world, in a sideline comment, emphasized: These discoveries remind us that the safety of AI lies not only in designing hardware or software, but also in maintaining human control.

Experts affirmed that AI currently has no ability to plan long-term or act outside the assigned scope, so it does not pose a direct danger.

However, ignoring safety instructions is a sign of the need to build a tighter control system in the future.

Google's Frontier Safety framework 3.0 is expected to become a new standard in AI risk management, ensuring that next-generation models are always under human control.

Cát Tiên (THEO INDIANEXPRESS)
RELATED NEWS

Facing pressure, Google tries to avoid ordering sale of AdX advertising platform

|

Google faces pressure to sell AdX platforms as an antitrust court opens, marking a major turning point in the US technology legal battle.

Google tests AI features and many new improvements on Gboard

|

Google is testing a new AI feature on Gboard, helping users draft text quickly, enter symbols and improve the typing experience.

Deputy Prime Minister Le Thanh Long: Promoting the image and culture of the country continues to be promoted

|

Deputy Prime Minister Le Thanh Long said that the culture, sports and tourism sector needs to continue to unite, make efforts, and turn emulation movements into motivation for innovation, creativity, and dedication.

Temporarily suspending principals for 15 days in the case of a school accused of using dirty food in Lam Dong

|

Lam Dong - The authorities have temporarily suspended the female principal in connection with the reflection that the school used dirty food in the boarding meals.

Foreign UAV appears twice in 1 week, Danish aviation is chaotic

|

Consecutive UAV airspace incursions caused serious disruptions to aviation operations in Copenhagen and Aalborg (denmark) in the same week.

Close-up of AI cameras recording nearly 1,800 traffic violations in half a day in Hanoi

|

Hanoi - AI traffic surveillance cameras installed on Pham Van Bach Street recorded nearly 1,800 cases of traffic violations from 0:00 to 12:00 on September 24.

Man City, Tottenham and Arsenal win tickets to round 4 of the English League Cup

|

In the early morning of September 25 (Vietnam time), the series of matches of the 3rd round of the English League Cup witnessed easy victories of the "big guys" such as Man City, Tottenham and Arsenal.

Facing pressure, Google tries to avoid ordering sale of AdX advertising platform

Cát Tiên (THEO INDIANEXPRESS) |

Google faces pressure to sell AdX platforms as an antitrust court opens, marking a major turning point in the US technology legal battle.

Các startup AI thúc đẩy tăng trưởng mạnh mẽ cho Google Cloud

Cát Tiên (THEO techcrunch) |

Google Cloud mở rộng hợp tác với các startup AI như Lovable và Windsurf, thúc đẩy tăng trưởng và củng cố vị thế trong thị trường điện toán đám mây.

Google tests AI features and many new improvements on Gboard

Cát Tiên (THEO INDIANEXPRESS) |

Google is testing a new AI feature on Gboard, helping users draft text quickly, enter symbols and improve the typing experience.