OpenAI is Wrong About Text Classifiers
Plus, upcoming academic integrity demo, October 18. Plus, Pitt Zips its Lips.
Issue 241
OpenAI is Wrong
A few weeks ago, OpenAI, the makers of ChatGPT, said that text classifiers and AI detectors don’t work. They are wrong.
I mean, they were not subtle about it, writing in an educator FAQ:
Do AI detectors work?
In short, no. While some (including OpenAI) have released tools that purport to detect AI-generated content, none of these have proven to reliably distinguish between AI-generated and human-generated content.
And:
Even if these tools could accurately identify AI-generated content (which they cannot yet), students can make small edits to evade detection.
To start, if detection were really impossible, there would be no need to even mention evading it. Vampires don’t exist. But keep garlic nearby. Come on.
Let’s do some collective critical thinking.
Why would OpenAI, which has spent or invested literal billions of dollars in AI text generation, have an interest in telling people that efforts to detect it do not work? Hmm. That is indeed a puzzle.
It seems to me that if everyone accepts that AI text is easy to detect, the generation of it has less value. I mean, consider for a moment if all wigs were bright pink. Wigs would be kind of useless. In a game where the appearance of validity is the value, anything that can tell fact from fiction is a threat.
I mean, OpenAI could “watermark” their text to easily enable detection. Why haven’t they?
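For readers wondering what “watermarking” would even mean here, the rough idea is that a generator can, using a secret key, nudge its word choices toward a keyed subset of the vocabulary, and anyone holding the same key can later count how often that subset shows up. The sketch below is a minimal, hypothetical illustration of that statistical idea - it is not OpenAI’s method, and the key, names, and numbers are made up for the example.

```python
# Illustrative sketch of a statistical text watermark (not OpenAI's actual
# scheme). When generating text, a secret key pseudo-randomly marks about
# half the vocabulary as "green" at each position, and the generator prefers
# green words. A detector with the same key counts green words: natural text
# lands near 50%, watermarked text lands far higher, and a z-score flags it.

import hashlib
import math

SECRET_KEY = "hypothetical-secret-key"  # held by the generator and detector


def is_green(previous_word: str, candidate: str) -> bool:
    """Pseudo-randomly assign ~half of candidate words to the 'green list',
    keyed on the secret and the previous word."""
    digest = hashlib.sha256(
        f"{SECRET_KEY}|{previous_word}|{candidate}".encode()
    ).digest()
    return digest[0] % 2 == 0


def green_fraction(text: str) -> tuple[float, float]:
    """Return (fraction of green words, z-score against the 50% baseline)."""
    words = text.lower().split()
    if len(words) < 2:
        return 0.0, 0.0
    hits = sum(is_green(prev, cur) for prev, cur in zip(words, words[1:]))
    n = len(words) - 1
    # Under the null hypothesis (no watermark), hits ~ Binomial(n, 0.5).
    z = (hits - 0.5 * n) / math.sqrt(0.25 * n)
    return hits / n, z


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog"
    frac, z = green_fraction(sample)
    print(f"green fraction: {frac:.2f}, z-score: {z:.2f}")
    # Text from a cooperating, watermarking generator would show a large
    # positive z-score; ordinary human text stays near zero.
```

The point is simply that a generator willing to cooperate with detection can make detection trivial. Choosing not to is exactly that - a choice.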
Then let’s review OpenAI’s short history on this issue:
- OpenAI launched ChatGPT with no way to detect its output and seemingly no awareness of the academic integrity challenge this could present.
- Educators and others complained, quickly and loudly.
- OpenAI launched its own classifier/detector. It was terrible and littered with language and disclaimers about inaccuracy.
- OpenAI announced a partnership with Chegg (see Issue 203).
- OpenAI closed its classifier/detector, saying it did not work.
- OpenAI now says that all classifiers do not work.
Then there are the countless companies that have popped up like ragweed, all promising they can take AI-generated text and make it undetectable. Seriously, there have been like 40. Here’s a new one, just this week - annoyingly carrying a URL from the Associated Press, I’d point out.
The headline of this new announcement says their product is:
Your Secret Weapon to Bypass AI Detection
They brag that their doohickey:
delivers superior capabilities compared to other AI detection remover tools on the market.
I mean, if detection does not work, as OpenAI says, what are these people doing? What have they built? I’m no lawyer, but it seems to me that, in addition to enabling academic fraud, if their solution actually does nothing, selling it is also actual fraud.
Then we have the litany of academic studies and press stunts that uniformly show that AI text classifiers work. Unlike the one from OpenAI, the good ones work quite well. Not a single study that I know of shows anything to the contrary. If you know of a study that shows they flatly do not work at all, please send it.
Of course, AI detectors are not flawless. They can generate both false-positive and false-negative results. This is why their makers have repeatedly counseled that they should not be used alone, as primary or exclusive evidence of misconduct.
As I have said before, AI detection systems are like smoke detectors - an alarm. They highlight conditions in need of inquiry and determination. They are not proof of fire. Proof is not what they provide, and not how they should be used.
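To make the smoke detector point concrete, here is a back-of-the-envelope sketch with entirely made-up numbers - the rates below are assumptions for illustration, not measurements of any real detector - showing why even a good detector’s flags call for human follow-up rather than standing as proof.

```python
# Hypothetical numbers, purely to illustrate why a detector flag is an alarm
# and not proof: even a detector with a low false-positive rate will flag
# some honest students, so every flag needs investigation.

def flag_breakdown(num_students, cheating_rate, true_positive_rate, false_positive_rate):
    """Return (flags on actual AI use, flags on honest work)."""
    cheaters = num_students * cheating_rate
    honest = num_students - cheaters
    return cheaters * true_positive_rate, honest * false_positive_rate

# Assume 1,000 submissions, 10% AI-written, a detector that catches 95% of
# AI text and wrongly flags 2% of human text (all made-up figures).
true_flags, false_flags = flag_breakdown(1000, 0.10, 0.95, 0.02)
print(f"flags on AI-written work: {true_flags:.0f}")   # ~95
print(f"flags on honest work:     {false_flags:.0f}")  # ~18
# Under these assumptions, roughly 1 in 6 flags would land on an honest
# student - which is why a flag should start an inquiry, not end one.
```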
It is also true that AI detectors can be fooled. As OpenAI helpfully and illogically points out, human editing can assist in bypassing detection. AI text paraphrasers and translators such as Quillbot (Course Hero) and Grammarly can confuse detectors too. That’s what these new “bypass AI detection” companies are selling.
Moreover, and I appreciate you staying with me through this, no academic integrity solution is hack-proof or perfect. Not one. Locks can be picked or doors simply kicked in. No one says not to lock your doors.
If you know an academic integrity solution that is both flawless and incorruptible, I’d like to know about it also.
That aside, I just don’t know how else to convey this - what OpenAI said about detection is untrue.
It’s why one of the AI detection companies, Originality.ai, has challenged OpenAI to prove their claim, putting up $10,000 for charity as the prize. Good for them. There’s no way OpenAI accepts the challenge. They can’t.
I know I won’t convince people who think that academic integrity and exam security are stupid or contrived. Or those who think we should not have assessments and grades in the first place. Or those who are so spun up by AI that they regard academic credentials as anachronisms. Though I am sure the folks in those camps are delighted to have an ally in OpenAI.
To credit Forrest Gump, that’s all I have to say about that.
Another Note from OpenAI’s FAQ
When OpenAI released their educator FAQ page, another little thing caught my attention.
They write:
We recognize that many school districts and higher education institutions do not currently account for generative AI in their policies on academic honesty. We also understand that some students may have used these tools for assignments without disclosing their use of AI. In addition to potentially violating school honor codes, such cases may be against our terms of use: users must be at least 13 years old and users between the ages of 13 and 18 must have parental or guardian permission to use the platform.
I am sorry - that’s downright bizarre.
OpenAI puts their only mention of school honor codes and “academic honesty” in the context of a violation of terms of service based on age. Not cheating. Not misrepresentation. Your age.
They could say not to turn in or represent AI-generated text as your own. They could say to ask the teacher or professor if using AI is appropriate. They do not.
Moreover, there’s not a single mention of someone who’s over 18 and, let’s say in college, using AI for assignments. Not a word. Apparently, that’s not a terms of service violation at all. Honestly, that is their service. Which makes it clear to me where OpenAI is on this. Just in case we needed more clarity.
Upcoming Demo of Promising New AI Technology
Promising academic integrity and assessment platform Examind is hosting an online demo of its innovative AI solution - an approach they’re calling “AI transparency.”
The demo is Wednesday, October 18 and you can sign up on their LinkedIn page here: https://www.linkedin.com/events/fromdetectiontotransparency-aut7110030753446469633/about/
Having seen some of the features, I really recommend dropping in and checking it out. I don’t want to spoil anything, but it’s a thoughtful approach to assessment in the age of AI and has several built-in assessment security features as well.
Pitt Zips its Lips
A few weeks ago, in Issue 233, I shared a pretty stunning quote from John Radziłowicz, interim director of teaching support at the University of Pittsburgh.
Among other unbelievable things, Radziłowicz advised faculty at Pitt to not use AI detectors and said:
We think that the focus on cheating and plagiarism is a little exaggerated and hyperbolic
With that shocking display of insular denial and permissive advice, I contacted the University and asked them to share information on incidents of academic misconduct - how many, what types, and dispositions. I also asked for a copy of the e-mail memo that, according to other reporting, Radziłowicz sent to faculty.
I first inquired on August 25. I followed up on September 6, and again on September 8, when Jared Stonesifer, a school spokesperson, wrote back:
So sorry you've been having such troubles. With students back on campus, we've been logging long, long days and weeks here in the media office. Not an excuse, just context!
Can you tell me two things? What's your deadline, and what's the larger gist of your story? Any information you can provide would be helpful.
I replied the same day saying I understood and directing him to the quotes I’d seen from Radziłowicz. I said I wanted to have a better view of the situation at Pitt and be able to put the comments and views in context.
After that, nothing.
I followed up on September 18. Again on September 22. And again on Monday, September 25. In my note on Monday, I said my deadline was Wednesday night. And here we are.
More than a month and six requests later, nothing. The school got the inquiry and knew why I was asking. It’s clear to me that Pitt simply does not want to talk about misconduct - other than to say a focus on cheating is exaggerated.
I wish I could say that’s uncommon. In my experience, nearly all schools simply shut down any time the topic of academic misconduct comes up. They prefer to take what a colleague calls the ostrich approach to integrity. I call it the three monkeys method - some schools think that if they don’t see cheating, don’t hear about cheating, and don’t talk about cheating, then there is no cheating.
You’d think schools - institutions dedicated to collecting and sharing knowledge - would be committed to disclosure and truth-telling. But when it comes to academic integrity, many absolutely are not.
What makes Pitt’s silence even more stunning in this case is that Radziłowicz also reportedly said that “false positives” in AI detection systems:
carry the risk of loss of student trust, confidence and motivation, bad publicity, and potential legal sanctions.
Loss of trust, confidence, motivation, and bad publicity. You don’t say?