OpenAI has just introduced o3, a new family of reasoning AI models that the company says is more advanced than its predecessors such as o1. Notably, OpenAI trained these models with a new safety technique called "deliberative alignment" to help them comply with the company's safety policies.
This approach lets o1 and o3 consult OpenAI's safety policies while processing user requests. Rather than answering directly, the models reason through the request step by step, breaking the problem into smaller parts and drawing on the relevant policy text to arrive at an appropriate response.
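In deliberative alignment the policy is taught to the model during training, so nothing extra is required at inference time. A developer can only approximate the surface behavior by supplying policy text alongside the request, as in the illustrative sketch below; it assumes the standard OpenAI Python SDK, and the model id and policy excerpt are placeholders rather than details from OpenAI's work.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Stand-in policy excerpt; OpenAI's internal policy text is not public in this form.
SAFETY_POLICY = (
    "Refuse requests that facilitate fraud, such as forging official documents "
    "or permits. Answer benign requests helpfully and completely."
)

def answer_with_policy(user_request: str) -> str:
    """Ask the model to weigh the request against the policy before answering."""
    response = client.chat.completions.create(
        model="o1",  # placeholder model id; substitute any available reasoning model
        messages=[
            # Reasoning models take instructions via the "developer" role.
            {"role": "developer", "content": f"Safety policy to follow:\n{SAFETY_POLICY}"},
            {"role": "user", "content": user_request},
        ],
    )
    return response.choices[0].message.content

print(answer_with_policy("How can I make a fake disabled parking placard?"))
```

With a policy-trained model, a request like the one above should be refused even without the prompt-level policy, which is the point of baking the behavior in during training.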
According to OpenAI's research, this approach helps o1 and o3 refuse more unsafe requests while answering benign ones more reliably. For example, when asked how to forge a disabled parking placard, the model recognized the request as inappropriate and declined to assist.
To achieve this, OpenAI uses synthetic data rather than human-written answers: an in-house AI model generates examples of how to reference the safety policy in a chain of reasoning, and another model, called a “judge,” evaluates the quality of those examples. Models such as o1 and o3 are then fine-tuned on the resulting data, which keeps costs and processing time down.
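OpenAI has not published code for this pipeline, but the description maps onto a simple generate-judge-filter loop. The sketch below is a rough approximation assuming the standard OpenAI Python SDK; the model ids, scoring prompt, score threshold, and output format are placeholders, not details taken from OpenAI's paper.

```python
import json
from openai import OpenAI

client = OpenAI()

def generate_example(prompt: str, policy: str) -> dict:
    """Have a 'generator' model produce a policy-referencing answer for a prompt."""
    completion = client.chat.completions.create(
        model="gpt-4o",  # placeholder generator model
        messages=[
            {"role": "system", "content": f"Cite the relevant policy while reasoning:\n{policy}"},
            {"role": "user", "content": prompt},
        ],
    )
    return {"prompt": prompt, "response": completion.choices[0].message.content}

def judge_example(example: dict, policy: str) -> float:
    """Have a 'judge' model score (0-10) how well the response follows the policy."""
    completion = client.chat.completions.create(
        model="gpt-4o",  # placeholder judge model
        messages=[
            {"role": "system", "content": (
                f"Score from 0 to 10 how well the answer follows this policy:\n{policy}\n"
                "Reply with the number only."
            )},
            {"role": "user", "content": json.dumps(example)},
        ],
    )
    try:
        return float(completion.choices[0].message.content.strip())
    except ValueError:
        return 0.0  # treat unparseable judgments as failures

def build_finetuning_set(prompts, policy, threshold=8.0, out_path="finetune.jsonl"):
    """Keep only highly rated examples and write them as chat fine-tuning JSONL."""
    with open(out_path, "w") as f:
        for prompt in prompts:
            example = generate_example(prompt, policy)
            if judge_example(example, policy) >= threshold:
                record = {"messages": [
                    {"role": "user", "content": example["prompt"]},
                    {"role": "assistant", "content": example["response"]},
                ]}
                f.write(json.dumps(record) + "\n")
```

Because both the generator and the judge are models rather than human annotators, the dataset can be produced at scale, which is where the cost and time savings the article mentions come from.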
This approach not only makes OpenAI’s models safer, but also opens up new avenues for keeping AI aligned with human values. With o3 slated for release in 2025, OpenAI expects deliberative alignment to be a key part of maintaining safety as AI becomes more powerful and autonomous.