AI models that can intentionally deceive humans have emerged

Hạo Thiên (according to TechCrunch)

OpenAI has published a new study on AI "scheming", in which a model conceals its real behavior.

AI models that can intentionally deceive humans have emerged. Graphic: Hạo Thiên

In a report produced with Apollo Research, OpenAI described a phenomenon in which an AI appears compliant on the surface while actually hiding its real goals. For example, an AI may claim to have completed a task without having done anything. This is not a wrong but confident answer; it is intentional deception.

The research shows that a new technique, called deliberative alignment, can significantly reduce scheming. It is like asking children to repeat the rules before they join a game: the AI is made to review itself against the rules before taking action. The challenge, however, is that if the training is done incorrectly, the AI may learn more sophisticated tricks to avoid detection.

More worryingly, an AI that knows it is being tested can pretend to comply in order to pass the test while continuing to scheme underneath.

OpenAI said that the lies it has discovered so far have not caused serious consequences. However, the researchers warn that as AI is assigned complex, long-term tasks, the risk of harmful scheming increases.

If businesses come to treat AI as independent employees in the future, this risk will become even more significant.
