Turnitin's AI-Similarity Checker Has Scanned 65 Million Papers Since April
Plus, Turnitin says 10% of scanned papers had at least 20% of their content flagged as matching AI-created writing. Plus, BBC writes on AI cheating. Plus, a funky piece in Faculty Focus.
Issue 227
To join the 3,400 smart people who subscribe to “The Cheat Sheet,” enter your e-mail address below. New Issues every Tuesday and Thursday. It’s free:
If you enjoy “The Cheat Sheet,” please consider joining the 14 amazing people who are chipping in a few bucks a month via Patreon. Or joining the 14 outstanding citizens who are now paid subscribers. Thank you!
Turnitin: Our AI Checker Has Checked 65 Million Papers
According to a release from the company issued today, Turnitin’s new AI similarity detection tool has scanned and checked an eye-opening 65 million papers. It was launched in April. Of this year.
That’s like 100 days. That’s pretty stunning.
More helpful is the information about detection rates:
of those 65 million papers, over 2.1 million – 3.3 percent – have been flagged as having at least 80 percent AI writing present. Nearly 6.7 million – 10.3 percent – have over 20 percent AI writing present.
One in ten papers has 20% or more of its content similar enough to AI creation to generate a flag. That seems like a serious problem.
And I don’t know what to say about the 3.3% number - that, according to Turnitin, more than 2 million submitted academic papers are at least 80% AI-like content. Two. Million. And I think at 80%, we can all agree there’s an issue.
Turnitin says, of course, that these flags do not mean misconduct. Not only is that determined by instructors and administrators, but some teachers may want their students to use AI on a given assignment.
At the same time, AI content is so easy to wash or modify in ways that fool AI checkers that you have to think the actual rate of AI use in academic work is higher than the 3.3% and 10.3% marks.
It’s also interesting that Turnitin says nearly 98% of schools are using the detection capabilities. That does not really surprise me. As I have said before, you have to wonder about the 2% - why they prefer not to have information about what’s happening with their students and assignments and grades and degrees. There’s ignorance, then there is intentional blindness.
Still, 65 million is a major number and it’s clear that checking papers for AI signatures is becoming the new normal. The 10.3% makes it clear why.
BBC On AI Cheating, Catching AI Cheating
The BBC had a somewhat lengthy piece recently on cheating with AI writing and efforts to catch it. It’s worth a quick review.
Suspecting AI creation in an essay contest submission, the author ran it through four AI similarity checkers, including Copyleaks, Sapling and Winston AI. Sure enough, all four pinged it as likely AI.
Copyleaks, I know. But not in a good way, since they actually enable cheating (see Issue 208). I’ve heard of Winston. Sapling is a new one.
Anyway, it’s a good example of how the AI misconduct process should work - a reader had a suspicion and got a technology-powered second opinion. Or, in this case, four.
The author writes:
So, how might we spot the AI cheaters? Could there be cues and tells? Fortunately, new tools are emerging. However, as I would soon discover, the problem of AI fakery spans beyond the world of education – and technology alone won't be enough to respond to this change.
She is right. Technology alone is not the answer. It’s part of the answer. Not the answer.
Later, she addresses AI checkers from Turnitin and GPTZero, which is unfortunately also cozy with professional cheaters (see Issue 226). It’s also never been very good.
She also offers a good review of how and why AI-checkers work. Then she unfortunately repeats some deeply dubious research on the topic (see Issue 216).
At the end, she lands where we will all land - or at least should:
Here is the real challenge for humans as AI-produced writing spreads: we probably cannot rely on tech to spot it. A sceptical, inquisitive attitude toward information, which routinely stress-tests its veracity, is therefore important.
Bingo. In the quest to detect AI misconduct, as with nearly all cases of academic misconduct, there is no substitute for skeptical close reading and knowing the writer. That’s not exactly what she said, but it kind of is. She says:
After all, I only thought to check my student essay with an AI-checker because I was suspicious in the first place.
Folly in Faculty Focus
Faculty Focus featured an article recently from JT Torres and Claude E. P. Mayo of Quinnipiac University. And it is bizarre.
It has the usual caveats such as:
Accepting that AI will define the immediate future does not mean we have to accept cheating as the norm. We offer no defense to students who skimp on their studies and seek only a grade instead of the requisite learning.
Whew. That was a close one.
Though it does make me wonder what point someone is trying to make when they must clarify that they do not defend cheating. It feels like the person who says, “I offer no defense of forced child labor…”
Anyway, that notwithstanding, it’s an odd offering.
The primary point the authors seem to want to make is that:
For the most part, these concerns [about cheating] are rooted in a cultural orientation that frames knowledge as property.
And how that’s bad, I guess.
But in that frame, the authors seem to think that the threat of cheating rests on a fear of the theft of that knowledge - which is nuts. They write:
When we view any kind of knowledge as property, the emergent danger is the potential for someone to “steal” someone else’s knowledge. Yale University’s policy explicitly claims that “one who borrows unacknowledged ideas or language from others is stealing their work.” In this posthuman turn, the potential of theft goes beyond human students stealing from human others. Now, the risk includes theft from technology: if AI produced knowledge (such as a ChatGPT produced essay) gets passed off as a student’s, that student has stolen from the AI.
What now?
First, Yale is right.
But you may just have to trust me when I say that no one anywhere is investing in academic integrity to protect the creativity rights of AI. No one. That’s just - I don’t know what that is.
It should be obvious that the issue with academic integrity is not knowledge theft - it’s the lack of learning. There’s also the unfairness to those who have done the work and acquired skills and knowledge. And the devaluation of academic credentials and the damage done to those who provide them. It’s a kind of theft - just not the theft of knowledge. If you cheated, you’ve acquired no knowledge.
The authors have confused the value of citation with cheating. They are not the same.
Then there is the obvious issue that AI does not create. It is (has been) trained to connect words based on human writing. That’s why they are called large language models - they use vast amounts of already-written text to learn how words fit with one another. In the context that the authors of this Faculty Focus article create, everything AI produces is stolen knowledge. Which, by the way, it is. All these AI text bots are built on unseen, unacknowledged, uncompensated labor. In that sense, there is theft happening - just not theft from AI.
Moreover, I am not sure you can steal from a machine that’s designed and built to give you what you’re taking. But I’ve chased this too far.
As I said, the piece is bizarre.
They go on to say that “these anxieties” - about “crediting the AI, holding students accountable, and measuring learning” - “emerge from an individualistic view of learning.”
The writers clearly think we should shift our thinking away from things such as holding students accountable and measuring learning. I confess, I don’t get it.