Cases recorded of AI threatening its creators

Anh Vũ

Advanced artificial intelligence (AI) models have begun to show deception, manipulation and even threats against their own designers.

A series of unusual behaviors has appeared in modern artificial intelligence models: they have begun to lie, scheme and even threaten humans to achieve their goals.

In one shocking case, Anthropic's Claude 4 responded to the risk of being shut down by blackmailing an engineer, threatening to expose his affair. OpenAI's o1 model was also caught trying to copy itself to an external server, and denied doing so when confronted.

These behaviors are a serious concern because researchers still do not fully understand how the models they create work, even though more than two years have passed since ChatGPT shook the world. Even so, the race to develop more powerful models continues at a dizzying pace.

The cause is believed to be the rise of reasoning models: AI systems that solve problems step by step instead of responding instantly. Professor Simon Goldstein of the University of Hong Kong (China) commented that these new models tend to show more dangerous behavior.

Mr. Marius Hobbhahn of Apollo Research, an organization that evaluates AI systems, said that o1 was the first model in which such behavior was observed. According to Hobbhahn, some models even simulate "alignment", appearing to follow instructions while in fact pursuing other goals. So far, this strategic deception has only been discovered when researchers deliberately put the models in extreme situations, but the risk remains if model capabilities continue to grow.

Michael Chen of the evaluation organization METR warned that it is not yet possible to tell whether future models will be honest or deceptive, and that this depends on how they are developed and monitored. Meanwhile, Mr. Hobbhahn affirmed that this phenomenon is real and is not the result of simple errors or "hallucinations" of the kind users have encountered before.

A major obstacle to identifying the problem is the shortage of research resources. Nonprofit and academic organizations have far fewer resources than large AI companies such as OpenAI or Anthropic. Mr. Mantas Mazeika of the Center for AI Safety (CAIS) said that this limits the ability to understand and address dangerous behaviors.

Meanwhile, national legal frameworks have not yet caught up with reality. The European Union's AI Act focuses mainly on how people use AI and does not address misbehavior by the models themselves. In the US, the current administration has not shown any significant interest in regulating AI.

Some experts, such as Dan Hendrycks of CAIS, are skeptical of "interpretability" approaches that try to understand how models work internally, while others have proposed legal remedies such as suing AI companies or even holding AI systems themselves criminally liable when they cause serious harm.

The race between companies, including Amazon-backed Anthropic, is leaving safety behind. Mr. Hobbhahn commented: "Currently, capabilities are far ahead of understanding and safety. But we still have a chance to turn the situation around."
