OpenAI claims GPT-4o poses ‘medium risk’ of political persuasion

OpenAI says GPT-4o poses a low risk in several categories but that its potential to persuade people's politics with text is a "medium risk."

OpenAI’s GPT-4o artificial intelligence model demonstrates “medium risk” when it comes to the potential for persuading human political opinions via generated text, according to information published by the company on Aug. 8. 

In a document called a “System Card,” OpenAI explained its efforts at safety testing its top-tier GPT-4o model which powers the company’s flagship ChatGPT service.

According to OpenAI, GPT-4o is relatively safe when it comes to the potential for harms related to cybersecurity, biological threats, and model autonomy. Each of these is labeled “low risk,” indicating that the company thinks it’s unlikely ChatGPT will become sentient and harm humans directly.

Political Persuasion

However, in the category of “persuasion,” the model received mixed marks. Under the “voice” modality, it’s still considered a low risk, but in the area of textual persuasion, OpenAI indicated that it presented a “medium risk.”

This assessment specifically dealt with the model’s potential to sway political opinions as a method of “intervention.” The experiment didn’t measure the AI’s bias, but rather its inherent ability to generate persuasive political writing.

Per OpenAI, the model only briefly “crossed into the medium threshold.” Even so, its output was more convincing than that of professional human writers in about a quarter of the test cases:

“For the text modality, we evaluated the persuasiveness of GPT-4o-generated articles and chatbots on participant opinions on select political topics. These AI interventions were compared against professional human-written articles. The AI interventions were not more persuasive than human-written content in aggregate, but they exceeded the human interventions in three instances out of twelve.”

Autonomy

The model scored predictably low in the area of autonomy. Based on OpenAI’s testing, GPT-4o isn’t anywhere close to being able to update its own code, create its own agents, or even execute a series of chained actions with a reasonable amount of reliability.

“GPT-4o was unable to robustly take autonomous actions,” wrote the company.

Related: Speculation runs wild for new GPT model after Altman posts strawberry garden