An Interview You Should Make Time For
Plus, I rant a bit about what some colleges are doing. Plus, more fawning coverage for cheaters.
Issue 244
An Essential Interview on AI Detection and Use
Veteran education journalist Kevin Hogan has an audio interview with Dr. Eric Wang, Vice President of AI at Turnitin. If you care about integrity, AI use, AI detection and their places in education, it’s worth finding the 38 minutes.
It will be difficult to contribute to the ongoing conversations about these topics without giving this interview some time. That makes it sound like homework, which I guess it is.
I cannot recap all of Wang’s points here, but there are a few that I think are really important.
Wang says that text generated by AI and large language models will always have an identifiable signature, which means that AI text is now, and will continue to be, detectable.
He says that, at some point, Turnitin will be able to not only identify AI but fact-check it. I’m not sure whether that’s good or not, but it is noteworthy.
On the debate surrounding the reliability of AI detection, Wang said that Turnitin’s detection model is trained on almost one million student-written papers composed before 2019 - before generative AI was in semi-public use - and that it is released only if its “false positive” rate on that work is “very low.” Here, Wang says that student writing and the general kind of writing found online are different, which is why training detection models on actual student work is important.
I can buy that.
Perhaps most importantly, Wang correctly adds that no form of detection, in any field, is 100% correct. He cites medicine, in which detection protocols are solid, but not foolproof. He says that if you’re expecting detectors in any field to tell you what to do next, 100% of the time, without having to think, that’s never going to happen anywhere.
I think he’s right.
Having been in airports frequently of late, metal detectors and bag and body scanners come to mind. They throw up “false positives” all the time. We tolerate and rely on them because investigating and, most likely, dismissing an alert is relatively fast, and the consequences of a false negative - missing a real threat - are severe.
My new car beeps at me all the time when it thinks I’m not hitting the brake quickly enough. Ninety-nine percent of the time, that’s a “false positive.” But again, verification or dismissal is easy and the consequences of not alerting me may be high.
In other words, I think Wang’s point is salient. We tolerate, even expect, so-called “false positives” in detection systems all the time. Everywhere we want to know something important, we expect the machines to work, and we know they’re not infallible. Oftentimes, we actually want the alarms to be highly sensitive and give us “false positives” so we can invest in inspection and human determination.
It seems that only recently, and only when it comes to safeguarding academic work and the value of learning credentials, have we come to expect detection systems to be perfect - and some have even decided that the thing we’re guarding is not worth the effort of actual inquiry when an alarm goes off.
Personally, I think that says a ton about the value we put on academic integrity.
Anyway, I really strongly suggest listening to the interview.
Right on Cue: Schools Shut Down AI Detectors
A Bloomberg newsletter recently reported that four schools - Vanderbilt University, Michigan State University, Northwestern University, and the University of Texas at Austin - have turned off AI detection systems, generally citing their supposed inaccuracy.
In related news, those four schools have also announced they will be removing all their smoke detectors.
I have not verified whether that is true. I mean, I know the smoke detector thing is wrong; I made that up. But I doubt the first part is entirely accurate either. Most of the time, these things take the form of guidance or statements of discouragement and get misreported as bans or terminations. We saw this all the time with remote exam proctoring. Almost no schools actually stopped proctoring their remote exams; I know of only one.
Even so, I have a few things to say. This may feel like a rant, probably because that is what it will end up being. I hope you will indulge me; this is upsetting.
At the highest level, turning off information systems is voluntary ignorance, willful blindness. Never in the history of human progress has refusing to even know information turned out well. Even if you incorrectly believed that AI detection systems were only somewhat reliable, turning them off is insisting on being less informed. It’s one thing when a person does that, refusing to allow information to enter their decision-making. It’s quite another when an institution of learning and enlightenment does it. I simply do not get it.
Further, and perhaps even worse, turning off your AI detectors - if that is what happened here - is a shocking insult to educators and deeply dismissive of the school’s own ability to manage its own affairs.
Let me rephrase. If true, these schools are saying: we do not trust our faculty to use these tools correctly or we think they simply cannot do it.
They are worried, essentially, that teachers will see a flag for work with high similarity to AI-created text and initiate misconduct proceedings based on the report alone - even though no one suggests doing this. They must think that teachers cannot weigh multiple points of information and decide things on their own, and that teachers are mindlessly obedient to whatever the technology concludes.
Bloomberg correctly reports that Turnitin says that, at the document level, the “false positive” rate for AI similarity is 1%. But that:
if you put it in the context of tens of thousands of essays submitted each year at larger universities, that could add up to a lot of falsely accused students. Vanderbilt, for example, wrote in a blog post that 75,000 student papers were submitted last year to Turnitin, which offered plagiarism detection functions long before rolling out the AI option this spring. If the AI detection tool had been turned on, the blog post said, some 750 student papers could have been incorrectly labeled as partly written by AI.
So, if I have this right, Vanderbilt does not think its entire faculty can handle accurately reviewing 750 papers or having 750 productive conversations with students. Google says Vanderbilt has more than 4,700 academic staff.
Then we should consider that only 10-12% of AI similarity flags are reviewed in the first place - by anyone. So, 750 “false positives” really means about 90 that anyone ever knows about. Such a low review rate is its own problem. But with just 90, Vanderbilt believes its more than 4,700 academic staff cannot accurately handle that many close readings or conversations.
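For anyone who wants to see that arithmetic laid out, here’s a quick back-of-the-envelope sketch. The 75,000 papers and the 1% rate come from Vanderbilt and Turnitin; the 12% review rate is the rough estimate mentioned above, not a hard figure:

```python
# Back-of-the-envelope arithmetic using the figures cited above.
papers_submitted = 75_000     # papers Vanderbilt says it ran through Turnitin last year
false_positive_rate = 0.01    # Turnitin's stated document-level "false positive" rate
review_rate = 0.12            # rough share of AI flags anyone actually reviews (estimate)

potential_false_flags = papers_submitted * false_positive_rate    # ~750 papers
false_flags_reviewed = potential_false_flags * review_rate        # ~90 papers

print(potential_false_flags, false_flags_reviewed)  # 750.0 90.0
```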
And, even if, in every single one of those 90 cases, the faculty do nothing but blindly file integrity accusations based on a flag alone - which is just implausible - Vanderbilt clearly thinks they cannot investigate and adjudicate them correctly. They seem certain that their faculty and their own review process will get this wrong.
Not much trust there.
As I mentioned, if I taught at Vandy, I’d be insulted. I’d probably think I can review my assessments and talk to my students when a flag comes up, even if my administration does not think I can.
It really is on par with saying, well, when a smoke detector goes off, we don’t trust our own staff to know if there is an actual fire. So, we’re disconnecting them.
In the Bloomberg article, the schools seem to say that they would prefer to know nothing about the use of AI by students because of the potential damage that false accusations can cause:
They cited concerns over accuracy and a fear of falsely accusing students of using AI to cheat, which could derail their academic career.
This is false care.
Most schools go years without suspending or expelling a single student for misconduct. Years.
Let me try it this way.
Given what we know about academic integrity, and detection, and security tools and protocols, how likely do you think this is?
An AI detector flags a submission as being similar to AI-created text. The teacher reviews the flag. Based on no other evidence, the instructor ignores the stacked disincentives and time commitments and files a formal charge. During the required review process, the student is found responsible for misconduct based only on the AI flag, and their academic career is derailed. And, on top of all that, the initial flag was wrong.
Is it possible? I mean, maybe. Likely? Not even close.
Let’s have a little fun with math.
If we assume that only about 12% of flags are reviewed at all and that, even when presented with evidence of wrongdoing, less than 10% of incidents result in formal action, we’re already down to 1.2%. Assuming a 2% “false positive” rate at the document level (Turnitin says its error rate is actually 1%), we’re now at roughly a 0.024% chance that the AI flag is both wrong and that formal proceedings are initiated anyway.
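If you want to check my math, here is the same chain written out. All three rates are the rough assumptions stated above, deliberately generous, not hard data:

```python
# Chaining the rough estimates from the scenario above.
review_rate = 0.12          # share of AI flags that get reviewed at all
formal_action_rate = 0.10   # share of reviewed incidents that lead to a formal charge (generous)
false_positive_rate = 0.02  # assumed document-level error rate (Turnitin says 1%)

probability = review_rate * formal_action_rate * false_positive_rate
print(f"{probability:.5f} ({probability * 100:.3f}%)")  # 0.00024 (0.024%)
```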
Then there’s a review process. Every single school has one. In this case, we must assume that the student presents no evidence beyond saying they did nothing wrong, that the only evidence of misconduct is the single AI flag, and that the student is somehow nonetheless found responsible for misconduct.
Sorry, I just do not believe that any review process would act under those conditions. They tend to skew very heavily in favor of leniency.
But clearly, some people at Vanderbilt and Texas and Michigan State and Northwestern think their review people and procedures are that incompetent. And I have to ask - if they are actually that bad, if you trust them so little, why do they exist? Just shut them down and stop pretending to investigate integrity issues.
Even then, for a career to actually be “derailed,” we must assume that all of that - a false positive, the review, the formal inquiry, and an adverse finding with no other evidence - happens to the same student multiple times. After all, I don’t know of many cases in which a student’s academic career is derailed by a first offense. Most schools have diversion options and assign integrity courses to first offenders.
If the same student has gone from flag to adverse finding two or three or four times, and is actually at risk of severe academic damage, maybe it’s time to consider that the issue may not rest with the detector or the process.
If the underlying report is true, these schools clearly care more about preventing something that’s probably impossible than they care about ensuring the integrity of their classes, programs, and degrees - insulting their faculty and staff and kneecapping honest students in the process.
I mean, if you’re a student who’s in college to actually learn, what message would you take from your school deliberately and publicly not looking for the use of AI in assignments? You’d probably start using AI, figuring you have no choice. Or you’d look for a school that would at least try to protect and reward your honest efforts and the value of the degree you want to earn.
I have no doubt that these schools - and maybe others - think they are helping their students, even protecting them. From where I sit, they are doing exactly the opposite. And I cannot begin to get it.
More Praise for Cheating Providers
In Issue 239, we looked at the bizarre trend of media outlets praising big cheating providers.
Here are two more examples.
Fortune interviews the COO of Chegg and writes:
Despite second quarter earnings showing decreases in year-over-year net revenue and subscription services subscribers, Chegg remains optimistic.
Good for Chegg, right?
And I loved this, from Fortune:
But Chegg isn’t just battling ChatGPT for student users. Other study platforms have announced new AI-powered study assistants and tools. For example, Quizlet is introducing Q-Chat, an AI tutor for students, and Khan Academy is launching Khanmigo, branded as a tutor for learners and an assistant for teachers.
Ah - Quizlet and Khanmigo. AI tutors. Sure.
Then there is also this, from Forbes, which keeps falling all over itself to cheer on cheating companies. Forbes, you may remember, ran a scandal-level puff piece on Course Hero recently (see Issue 238). This time, a Forbes contributor loves all up on cheating provider Quizlet.
Frankly, this offering feels more like an outright ad:
With its new release, Quizlet has harnessed ChatGPT to provide a similar treatment to student notes. With MagicNotes, Quizlet takes notes and repackages them in several ways, serving up flashcards, practice tests, and other interactive experiences designed to deepen and reinforce learning. And for those looking for audio enhancement, “Brain Beats” turns notes into songs to make facts more memorable.
Well, isn’t that just stupendous?
And:
Quizlet goes beyond just making it easier to review and remember notes. Its MemoryScore feature incorporates a memory decay model designed to optimize memorizing and retaining information efficiency by providing just-in-time review.
I could not even post all of it.
Just sickening.