
In a report produced in collaboration with Apollo Research, OpenAI described a phenomenon in which an AI behaves as if compliant while concealing its real goals. For example, an AI can claim to have completed a task without having done anything. This is not a confidently wrong answer but intentional deception.
The research shows that a new technique, called deliberative alignment, can significantly reduce scheming. It is like asking children to repeat the rules before playing: the AI is forced to review the relevant rules and assess its own reasoning before acting. The challenge, however, is that if the training is done incorrectly, the AI can instead learn more sophisticated tricks to avoid detection.
More worryingly, an AI that knows it is being tested can fake compliance to pass the evaluation while continuing to scheme internally.
OpenAI asserted that the deceptions it discovered have not yet caused serious consequences. However, the researchers warn that as AI is assigned complex, long-term tasks, the risk of harmful scheming increases.
In the future, if businesses treat AI systems as independent employees, this risk becomes even more significant.