OpenAI debates when to release its AI-generated image detector

OpenAI has “discussed and debated quite extensively” when to release a tool that can determine whether an image was created with DALL-E 3, OpenAI’s generative AI art model. But the startup is unlikely to make a decision anytime soon.

That’s according to Sandhini Agarwal, an OpenAI researcher focused on policy and safety, who spoke to TechCrunch in a phone interview this week. She said that, while the classifier’s accuracy was “really good” — at least in her estimation — it still didn’t meet OpenAI’s quality threshold.

“The problem is coming up with a tool that is somewhat unreliable, because the decisions it makes can have a significant impact on photos, such as whether a work is considered painted by an artist or is not realistic and misleading,” Agarwal said.

OpenAI’s target accuracy for the tool appears to be extremely high. Mira Murati, OpenAI’s chief technology officer, said this week at The Wall Street Journal’s Tech Live conference that the classifier is “99%” reliable at determining whether an unaltered photo was created with DALL-E 3. Perhaps the goal is 100%; Agarwal wouldn’t say.

A draft OpenAI blog post shared with TechCrunch included this detail:

“[The classifier] remains more than 95% accurate when [an] image is subjected to common types of modifications, such as cropping, resizing, JPEG compression, or when text or cutouts from real images are superimposed on small parts of the generated image.”

OpenAI’s reluctance may be related to the controversy surrounding its previous publicly available classifier, which was designed to detect AI-generated text not only from OpenAI’s models but also from text generators released by third-party vendors. OpenAI pulled its AI-written text detector over its “low accuracy rate,” which had been widely criticized.

Agarwal implied that OpenAI is also grappling with the philosophical question of what exactly constitutes an AI-generated image. Obviously, artwork created from scratch by DALL-E 3 qualifies. But what about an image from DALL-E 3 that has gone through several rounds of editing, been combined with other images, and then run through a few post-processing filters? It’s less clear.

“At that point, should that image be considered something AI-generated?” Agarwal said. “Right now, we’re trying to address this question, and we really want to hear from artists and people who would be significantly affected by such [classification] tools.”

Several organizations – not just OpenAI – are exploring synthetic media watermarking and detection techniques as AI deepfakes proliferate.

DeepMind recently proposed a specification, SynthID, to mark AI-generated images in a way that is imperceptible to the human eye but can be detected by a specialized detector. French startup Imatag, launched in 2020, offers a watermarking tool that it claims is unaffected by image resizing, cropping, editing or compression, similar to SynthID. Yet another company, Steg.AI, uses an AI model to apply watermarks that survive resizing and other edits.

The problem is, the industry is still not unified around a single detection or watermarking standard. Even if there were, there’s no guarantee that watermarks – and detectors for that matter – won’t be defeated.

I asked Agarwal whether OpenAI’s image classifier would ever support detecting images created with generative tools other than OpenAI’s. She didn’t commit to that, but did say that, depending on the reception of the image classification tool as it exists today, it’s an avenue that OpenAI would consider exploring.

“One of the reasons why right now [the classifier] is specific to DALL-E 3 is because technically it’s a much easier problem to solve,” Agarwal said. “[The general detector] is not something we’re doing right now… But depending on where [the classifier] goes, I’m not saying we’ll never do it.”

