(367) Study: AI Detectors Are Accurate, Flag Work with AI "Polish"
Plus, another ad for an AI humanizer designed to help students get away with cheating. Plus, look away, "The View" talks AI and Cheating.
Issue 367
Subscribe below to join 4,703 (+4) other smart people who get “The Cheat Sheet.” New Issues every Tuesday and Thursday.
The Cheat Sheet is free, although patronage through paid subscriptions is what makes this newsletter possible. Individual subscriptions start at $8 a month ($80 annual), and institutional or corporate subscriptions are $250 a year. You can also support The Cheat Sheet by giving through Patreon.
Study: AI Detectors Flag Work that Has Used AI to “Polish” Writing
A relatively new research paper from Shoumik Saha and Soheil Feizi at the University of Maryland tested AI detectors on human work, AI-created text, and human writing that had been “polished” by AI to various degrees. The title is:
Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing
The team found that, by and large, AI detection is pretty good at telling the difference between human-created writing and AI-created text. Which, since you read The Cheat Sheet, ought to be no surprise. AI detection works.
When the team set a group of 12 AI detectors on simple classification — AI or not AI — the results were solid:
We evaluate these detectors on 300 samples of our ‘no-polish-HWT’ (pure human-written) set and 300 samples of pure AI-generated texts from the dataset of Zhang et al. (2024). Most detectors achieve 70% − 88% accuracy, with a false positive rate of 1% − 8%
The debate as to whether AI detectors “work” should be over. But it’s worth noting that, as is common practice in similar research papers, the tests covered lab-grown detectors that no one uses in the real world. So, the averages are distorted.
The paper did test three detection systems that people can, and actually do, use: GPTZero, ZeroGPT, and Pangram. Those three appear in the last section of a graphic in the paper on detection accuracy.
You’ll notice that, of these three, Pangram was the most accurate and had zero FPR — false positive rate. We’ve seen similar results before (see Issue 357).
What is concerning, however, is the FPR of the other two. According to this, at straight AI detection, ZeroGPT had a false positive rate of 4%, while GPTZero had a very troubling 8%. We’ve seen results like those before too.
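For readers who want those metrics made concrete, here is a minimal sketch of how accuracy and false positive rate are computed. The counts below are hypothetical, not figures from the paper; the false positive rate is simply the share of genuinely human-written samples that get flagged as AI.

```python
# Illustrative only: hypothetical counts, not figures from the paper.

def rates(true_positives, false_negatives, true_negatives, false_positives):
    """Return (accuracy, false_positive_rate) from confusion-matrix counts."""
    total = true_positives + false_negatives + true_negatives + false_positives
    accuracy = (true_positives + true_negatives) / total
    # False positive rate: share of genuinely human-written samples wrongly flagged as AI.
    fpr = false_positives / (false_positives + true_negatives)
    return accuracy, fpr

# A hypothetical detector that catches 240 of 300 AI texts and wrongly flags 12 of 300 human texts:
acc, fpr = rates(true_positives=240, false_negatives=60,
                 true_negatives=288, false_positives=12)
print(f"accuracy: {acc:.0%}, false positive rate: {fpr:.0%}")  # accuracy: 88%, false positive rate: 4%
```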
The meat of the paper is not whether AI detectors can spot AI, but how they handle “polished” text: human-written work that was then fixed, edited, or improved with AI. From the paper:
an overlooked challenge is AI-polished text, where human-written content undergoes subtle refinements using AI tools. This raises a critical question: should minimally polished text be classified as AI-generated? Such classification can lead to false plagiarism accusations and misleading claims about AI prevalence in online content.
This is important because of the question it raises but does not answer: should “polishing” text with AI be considered using AI? And, like most good questions, the answer is probably that it depends. To me, it depends on who the writing is for, what it’s for, and what the expectations are. I can see circumstances in which zero AI use, even for polishing, is expected. I can see the inverse as well.
But for this research, the team found that, by and large, AI detectors tend to flag AI polishing as AI text, which it is. Again, the question is whether you think using AI for editing or polish is appropriate or not.
Nonetheless, the authors of this particular paper see AI detectors flagging AI polishing as a shortcoming, a quasi-failure. I, on the other hand, tend to prefer that an AI detector flag any work that has been manipulated by AI and let me, or the reviewer, sort it out. I’d prefer that to the detection engine deciding on its own what counts as a significant change versus minor polish, which may not even be the same thing to two different people.
From the paper:
Our findings reveal that detectors frequently flag even minimally polished text as AI-generated, struggle to differentiate between degrees of AI involvement
And:
minimal polishing with GPT-4o can lead to detection rates ranging from 10% to 75%, depending on the detector. Furthermore, detectors struggle to differentiate between minor and major AI refinements
Fair, I think.
More:
Though most detectors can achieve a low FPR on pure [human text], most of them give a high FPR for any polishing, especially for extremely minor and minor polishing.
To be clear, I am not sure that flagging text with even “extremely minor and minor” AI use is a false positive. I mean, AI fingerprints are there. And I know I said this, but I prefer a flag that I can review or ask about, as opposed to not getting one at all.
The paper also points out that the system someone uses to AI-edit or AI-polish matters. Revisions from some AI models are flagged as AI more often than others, about which the paper offers:
such an imbalance can create unfair scenarios, where a student using Llama-2 is flagged for minor polishing while another using Llama-3.1 is found innocent.
“Found innocent” is a big tip-off about the bias of the paper’s authors.
Still, I agree that one of the biggest sins in this current era of AI ubiquity is that students who know about, or can afford to pay for, better AI get better results, including, in some cases, better odds of not being caught. That is simply unfair.
The paper again:
A key concern is the high false positive rate associated with minimally polished text. Many detectors classify such lightly edited content as AI-generated, which poses serious risks of unjust accusations of plagiarism or academic dishonesty.
I’ll be the broken record: I am not sure that’s a false positive. The authors clearly think it is one, that using AI to “polish” ought to be fine in any and all circumstances, and that no one should even want to know.
The authors aren’t entirely disconnected from the universe I understand, however, as they later write:
rather than producing a definitive label, detectors should output prediction probabilities, enabling users to better interpret and trust the system’s verdict.
I agree. A probability, a likelihood of AI use in a document, would be helpful in assessing what happened. Some companies already do this; I am pretty sure Pangram does.
There’s also this from the paper:
we emphasize the importance of interpretability and human oversight in detection tools. Developing interpretable detectors that can highlight suspicious segments or stylistic anomalies will empower users to make informed decisions. In high-stakes scenarios, integrating human-in-the loop review mechanisms can further enhance the reliability and fairness of the process
This is annoying since no one ever suggested using any AI detector without a human-in-the-loop review system. AI detection is the airport bag screener. It flags. A human has to open the bag and look around, have a conversation.
But whatever on that point.
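For what it’s worth, here is a minimal, purely illustrative sketch of that kind of workflow. The names (DetectionReport, triage) and the threshold are invented for the example and are not any vendor’s actual API or the paper’s method; the point is just that a probability plus a review threshold triggers a human conversation rather than an automatic verdict.

```python
# Illustrative sketch only; not any vendor's API or the paper's method.
from dataclasses import dataclass

@dataclass
class DetectionReport:
    ai_probability: float   # detector's estimated likelihood of AI involvement
    flagged_passages: list  # segments the detector found suspicious, if any

def triage(report: DetectionReport, review_threshold: float = 0.5) -> str:
    """Route a submission based on a probability, leaving judgment to a person."""
    if report.ai_probability >= review_threshold:
        # The bag gets opened: a person reviews the flagged passages and talks
        # to the writer before anything is decided.
        return "send to human review"
    return "no action"

print(triage(DetectionReport(ai_probability=0.82, flagged_passages=["intro paragraph"])))
# -> send to human review
```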
The paper is good, especially for its dataset of AI-altered papers. The team did great work, in my view. And at a minimum, these two takeaways are important:
Common polishing or editing tools such as Grammarly can, and do, set off AI detection systems.
It is important for reviewers, i.e., teachers, to be very clear about what levels of AI use are allowed or expected, including things such as “minor polish.”
Another Look at Trojan Horse Advertising by an AI Humanizer
If you’re not familiar, AI humanizers use AI to rewrite or paraphrase text for the clear purpose of trying to avoid detection by an AI detector.
They are fingerprint smudging as a business, and they are, in some ways, more morally objectionable than generative AI companies, which have many plausible beneficial uses. By contrast, if you’re using a humanizer, there’s only one reason: you’re trying to not get caught.
Humanizers are abundant and competitive. So, they advertise their get-away-with-it services. Here’s a good example: an article with the helpful-sounding title, “Do UCs Check for AI? Insights for Students’ Awareness.”
If you’re observant, you may wonder why an article about the University of California system is running in an outlet called “Eye on Annapolis.” But the answer soon presents itself. The article is not an article at all; it’s an ad for getting away with cheating.
After laying out that yes, in fact, UC does check for AI, and how you’re likely to be caught, the piece has a section:
How to Submit Authentic and Human-Friendly UC Content
Human-friendly is a thing now, I guess. And then we get this advice:
you should submit authentic application essays, PIQs, and research papers. There are various ways to ensure this
You should submit authentic work. There are not — for the record — several ways to do this. But, being super friendly, the article continues:
Write your work without the help of AI – You can still do your work without the help of AI. Past students did it before the invention of AI. The UC system sets tests that are reasonable to tackle unless you are not prepared. The scholarship committees and teachers are confident that students can succeed without AI.
Use an AI humanizer – Well, AI can really simplify a student’s work and save time. However, the written material AI produces lacks authenticity and a human touch. To avoid being caught by the UC AI detection system, use a reliable AI humanizer to rewrite your content. Now that you are interested in this option, click here to find out more.
So, you can submit “authentic” work by using AI, then using a humanizer, specifically “to avoid being caught.” The link, which I left in, goes to a humanizer. Because of course it does.
Let me underline, use this humanizer “to avoid being caught.” Subtle, it is not. But there’s also:
So, how then can you escape the detection? You can use an AI humanizer tool to bypass this. A reliable humanizer will bypass Turnitin, GPTZero, Originality AI, and Copyleaks, which are popular tools used by colleges and universities to detect AI-generated content.
They literally say the point of their product is to “escape the detection.”
And:
Do UCs check for AI? Yes, it does, and you can bypass the detectors with a reliable AI humanizer.
Bypass the detectors.
It’s as amazing as it is disgusting.
“The View” Discusses AI in Schools
The TV show “The View” decided to discuss AI and cheating in schools a few weeks ago. Their jumping-off point was the fantastic New York Magazine article (see Issue 364).
I don’t have much to say about their chatter, except that the people around the table have to be the least informed people on this subject anywhere. Like, anywhere.
Genius suggestions included getting teachers to design assignments that AI can’t solve. Or how ‘AI won’t take your job, someone who knows how to use AI will take your job.’
I know, it’s The View. And no one expects much. But still.
A note on the “lab-grown” detectors point above: Binoculars, at least, is readily available for laypeople to use: https://huggingface.co/spaces/tomg-group-umd/Binoculars
And this is neither the first nor the last time cheating tech has ended up in articles about academic integrity. Here’s another example: https://www.cnet.com/tech/services-and-software/how-to-detect-ai-writing-tips-and-tricks-to-tell-if-something-is-written-with-ai/ (Smodin is not in the same market category as GPTZero: one is an AI writing tool with other services, the other is AI detection only.)