
Researchers Develop Statistical Method That Strengthens Detection of AI Text

Text generated by AI models like GPT-4 and Claude is getting increasingly difficult to tell apart from human writing. Researchers at Penn and Northwestern developed a statistical method that tests how well “watermarking” methods catch AI content. Their approach could shape how media, schools and governments manage authorship and fight misinformation.

May 09, 2025 05:53 EDT

The battle to distinguish human writing from AI-generated text is intensifying. As models like OpenAI’s GPT-4, Anthropic’s Claude and Google’s Gemini blur the line between machine and human authorship, a team of researchers has developed a new statistical framework to test and improve the “watermarking” methods used to spot machine-made text.

Their work has broad implications for media, education and business, where detecting machine-written content is becoming increasingly important for fighting misinformation and protecting intellectual property.

“The spread of AI-generated content has sparked big concerns about trust, ownership and authenticity online,” said Weijie Su, a professor of statistics and data science at the University of Pennsylvania Wharton School who co-authored the research. The project was partially funded by the Wharton AI & Analytics Initiative.

Published in the Annals of Statistics, a leading journal in the field, the paper examines how often watermarking fails to catch machine-made text — known as a Type II error — and uses advanced math, called large deviation theory, to measure how likely those misses are. It then applies “minimax optimization,” a method for finding the most reliable detection strategy under worst-case conditions, to boost its accuracy.
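The distinction matters in practice: a watermark detector is a hypothesis test, and its two failure modes trade off against each other. As a rough illustration of the Type II error the paper studies, here is a toy Python simulation, not the paper’s actual construction: it assumes a per-token score that is Uniform(0, 1) for human text and, purely as a stand-in, Beta(2, 1)-distributed for watermarked text, then measures how often the test misses the watermark at a fixed false-alarm rate.

```python
# Toy simulation of watermark detection as a hypothesis test.
# Assumption for illustration only: human text yields per-token scores
# that are i.i.d. Uniform(0, 1); watermarked text yields scores that
# skew high (here, the max of two uniforms, i.e. Beta(2, 1)).
import numpy as np

rng = np.random.default_rng(0)
n_tokens, n_trials, alpha = 25, 10_000, 0.01

# Null hypothesis: human text. Sum the per-token scores per document.
null_scores = rng.uniform(size=(n_trials, n_tokens)).sum(axis=1)

# Alternative: watermarked text, with stochastically larger scores.
alt_scores = rng.uniform(size=(n_trials, n_tokens, 2)).max(axis=2).sum(axis=1)

# Calibrate the rejection threshold so the Type I error (flagging a
# human as AI) stays at alpha ...
threshold = np.quantile(null_scores, 1 - alpha)

# ... then the Type II error is how often watermarked text slips under it.
type_ii = np.mean(alt_scores <= threshold)
print(f"Estimated Type II error at alpha={alpha}: {type_ii:.3f}")
```

Raising n_tokens makes the estimated Type II error collapse toward zero; large deviation theory characterizes how fast that decay happens, which is what allows the paper to compare detection rules against one another.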

Spotting AI-made content is a big concern for policymakers. AI-generated text is already being used in journalism, marketing and law, sometimes openly and sometimes in secret. While it can save time and effort, it also carries risks such as spreading misinformation and violating copyright.

Do AI detection tools still work?

Traditional AI detection tools look at writing style and patterns, but the researchers say these no longer work well because AI has become so much better at sounding like a real person.

“Today’s AI models are getting so good at mimicking human writing that traditional tools just can’t keep up,” said Qi Long, a professor of biostatistics at the University of Pennsylvania, who co-authored the research.

While the idea of embedding watermarks into the AI’s word selection process isn’t new, the study provides a rigorous way to test how well that approach works.

“Our approach comes with a theoretical guarantee — we can show, through math, how well the detection works and under what conditions it holds up,” Long added.

The researchers, who include Feng Ruan, a professor of statistics and data science at Northwestern University, suggest watermarking could play an important role in shaping how AI-generated content is governed, especially as policymakers push for clearer rules and standards.

Former U.S. President Joe Biden’s October 2023 executive order called for watermarking AI-generated content, tasking the Department of Commerce with helping to develop national standards. In response, companies like OpenAI, Google and Meta have pledged to build watermarking systems into their models.

How to effectively watermark AI-generated content

The study’s authors, who include Penn postdoctoral researchers Xiang Li and Huiyuan Wang, argue that effective watermarking must be hard to remove without changing the meaning of the text, yet subtle enough that readers never notice it.

“It’s all about balance. The watermark has to be strong enough to detect, but subtle enough that it doesn’t change how the text reads,” said Su.

Rather than tagging specific words, many methods influence how the AI selects them, building the watermark into the model’s writing style. This makes the signal more likely to survive paraphrasing or light edits.
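For intuition, here is a minimal Python sketch of one widely cited scheme of this kind, the “green list” watermark of Kirchenbauer et al., in which a keyed hash of the previous token pseudorandomly marks part of the vocabulary “green” and the sampler nudges word choice toward those tokens. It illustrates the general idea rather than the paper’s specific framework, and the function names and parameters (green_mask, delta, fraction) are invented for this sketch.

```python
# Sketch of a "green list" watermark embedded in token selection.
# Illustrative only; names and parameters are invented for this example.
import hashlib
import numpy as np

def green_mask(prev_token: int, vocab_size: int, fraction: float = 0.5) -> np.ndarray:
    """Pseudorandomly mark a fraction of the vocabulary 'green', keyed
    on the previous token so a detector can recompute the same list."""
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16) % 2**32
    return np.random.default_rng(seed).random(vocab_size) < fraction

def watermarked_sample(logits: np.ndarray, prev_token: int, delta: float = 2.0) -> int:
    """Sample the next token after adding a small bonus `delta` to the
    logits of green tokens, biasing style rather than tagging words."""
    biased = logits + delta * green_mask(prev_token, len(logits))
    probs = np.exp(biased - biased.max())
    probs /= probs.sum()
    return int(np.random.default_rng().choice(len(logits), p=probs))
```

A detector with the same key recomputes each green list and counts how often the text landed on green tokens: human writing hovers near the base fraction, while watermarked output runs consistently above it, which is also why the signal tends to survive light paraphrasing.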

At the same time, the watermark has to blend naturally into the AI’s usual word choices, so the output remains smooth and human-like — especially as models like GPT-4, Claude and Gemini become increasingly difficult to tell apart from real writers.

“If the watermark changes the way the AI writes — even just a little — it defeats the point,” Su said. “It has to feel completely natural to the reader, no matter how advanced the model is.”

The study helps address this challenge by offering a clearer, more rigorous way to evaluate how well watermarking performs — an important step toward improving detection as AI-generated content becomes harder to spot.

[Knowledge@Wharton first published this piece.]

The views expressed in this article are the author’s own and do not necessarily reflect Fair Observer’s editorial policy.
