Cheating Increases At George Washington University
Plus, even more evidence that AI detectors work. Plus, how many students are using ChatGPT on homework, tests and other academic work? Plus, Grammarly e-mails me.
Issue 264
If you enjoy “The Cheat Sheet,” please consider joining the 15 amazing people who are chipping in a few bucks via Patreon. Or joining the 30 outstanding citizens who are now paid subscribers. Thank you!
Cheating Increases at George Washington
According to this strong reporting in the student paper at George Washington University (GW), the school is dealing with:
a recent rise in reports of student academic integrity violations
No numbers were in the article.
But credit is due to GW for discussing the issue of academic misconduct. Credit is due also to the student newspaper — The Hatchet. Without student reporting, we’d know next to nothing about academic misconduct at American schools.
From the article:
Christy Anthony, the director of the Office of Student Rights and Responsibilities, said the office this fall has seen an increase in cases of cheating, the category of academic integrity violations that typically involve the misuse of AI.
Though by no means conclusive, the increase at GW is in line with reports we’ve seen at other schools — that a return to in-person classes and assessments has not dampened misconduct rates (see Issue 249). That genie just may not go back in that bottle.
Continuing:
Anthony said the increase accounts for several instances of cheating that each involved a “large number” of students in the same class but that not all violations included reports of AI use.
“This fall, the first since generative artificial intelligence was broadly available at little to no cost, Student Rights & Responsibilities has seen an increase in cases under ‘cheating’ without seeing a similar increase across other categories of academic integrity violations such as plagiarism,” Anthony said in an email.
There are a handful of other noteworthy pieces to this story. One is that:
The Office of the Provost released a set of guidelines for AI use in April that by default banned the use of AI on work submitted to professors for “evaluation”
Interesting. I left the link in, should you want to check it out.
The article also covers a few instances of professors having to deal with AI and cheating, including:
Ronald Bird, an adjunct professor of economics, said he has increased the number of questions on his tests because students who rely on looking up answers using ChatGPT score highly on the first few questions but don’t have time to finish.
“I tend to give more questions so that you’ve got to know something so you can move through them fairly rapidly,” Bird said.
He said he has seen a “couple of instances” where a student doesn’t submit the exam at the end of class and revises their answers after leaving the room. Bird said he can see the timestamps for every answer change because he tests through Blackboard, the University’s primary online education platform.
Yup.
And:
Eric Saidel, an assistant professor of philosophy, said he can identify work produced using AI tools when the paper uses “more sophisticated” language or includes examples that the class did not discuss. He said it is difficult to determine cheating through AI compared to traditional plagiarism because he cannot find a copy of the work online.
True and, unfortunately, true. Which is why a score from an AI similarity classifier should be one piece of the puzzle, not the whole picture. Insightful, not determinative.
There’s more:
Robert Stoker, the associate chair of the political science department and a professor of political science, public policy and public administration, said this summer he decided to prohibit AI use in his classes and that professors in the political science department consulted one another, researched and drafted individualized AI policies.
And:
Stoker said he aims to prevent cheating by administering tests in class and requiring students to clear their desks. He said he allows students to collaborate on papers and out-of-class assignments when he can’t ensure he will be able to detect when students consult each other.
“What I want to really avoid is a situation where some students can take advantage of an opportunity and other students cannot, because then I’m not protecting the honest students and that, to me, is the most important thing about academic integrity,” Stoker said.
Yes. Thank you. Say it again — protecting the honest students is the most important part of academic integrity.
But there’s also this, from Alexa Alice Joubin, a professor of English and a Columbian College of Arts & Sciences Faculty Administrative Fellow. She says:
faculty are “incredibly frustrated” because the online AI detection tools designed to help instructors check for AI-generated work can’t reliably catch instances of the academic dishonesty.
So frustrating. And manifestly untrue. But repeated so often.
Anyway, the article is really good and the takeaway is that cases of misconduct continue to increase — you know, in case anyone is interested.
Yes, AI Detectors Work
I am sure you are tired of hearing it. I am tired of writing it.
AI similarity detectors work.
Not all of them. Some are junk. But there are good ones and they work quite well, in fact.
But in case you want more evidence of their efficacy, eCampus News is going back through its most-read stories of 2023 and just republished this piece on a test of AI detection systems. I missed it when it was first published in June.
In any case, the article is short enough to scan and was written by:
Matthew Flugum, Doctoral Student of Education at Winona State University and Digital Learning Coordinator, Edina Public Schools & Steven M. Baule, Ed.D., Ph.D., Faculty Member, Winona State University
The duo say they tested four systems — Turnitin, ZeroGPT, Quill, and AI Text Classifier — on 28 papers composed by AI and 17 papers in a “control group.” Let me say here, hallelujah. Someone finally tested a real system that people actually use.
When Flugum and Baule tested Turnitin’s system, they found:
Of 28 fully AI-derived assignments, 24 of 28 were determined to be 100 percent AI generated. The other four ranged from zero to 65 percent AI-derived. The size of the papers ranged from 411 to 1368 words.
I cannot recall exactly, but I think Turnitin’s minimum word count is 300 words. And as it is with all text classifiers, shorter passages are more challenging. Still, flagging 24 of 28 papers as 100% AI-generated is pretty spot on. It does seem that Turnitin whiffed on one of the papers, giving it a zero. If that zero was the only real miss, the overall detection rate was 27 of 28, or 96.4%. At worst, counting all four imperfect scores as misses, it was 24 of 28, about 86%.
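To show my work, here is a minimal Python sketch of that arithmetic, using only the counts reported in the article; the best-case/worst-case framing is mine, not the authors’:

```python
# Sanity check on the Turnitin detection rates discussed above.
# Counts come from the eCampus News write-up: 28 AI-written papers,
# 24 flagged as 100% AI, and one apparently scored as 0% AI.

total_ai_papers = 28
flagged_100_percent = 24
clear_miss = 1  # the paper Turnitin scored as zero

# Best case: only the 0% paper counts as a miss.
best_case = (total_ai_papers - clear_miss) / total_ai_papers

# Worst case: all four non-100% scores count as misses.
worst_case = flagged_100_percent / total_ai_papers

print(f"best case:  {best_case:.1%}")   # 96.4%
print(f"worst case: {worst_case:.1%}")  # 85.7%
```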
As for the “control group,” the authors share that:
Of those papers, which ranged from 731 to 3183 words, the AI-derived scores ranged from zero to 28 percent. Ten of the papers showed no AI content, and four showed single-digit percentages of AI derived materials. One paper showed 14 percent AI material and the other 28 percent AI material. The highest AI-derived score came from a student whose first language was not English. According to the Turnitin site, currently the tool only detects for AI generation in English language submissions. One of the papers was returned without a similarity score.
So, 14 of 17 papers in this group were tagged with scores either in the single digits or at zero. The highest false-positive indicator was 28%, which may well trigger a review but is hardly conclusive. And though the authors do not say it, it is likely related to the language context, which we got into a bit in Issue 216.
In any case, if we remove the paper with no score from the set, Turnitin got what we think is human writing right 87.5% of the time (14 of 16) — scores of zero or single digits. If you also count the 14% score, which you probably should, that accuracy rate rises to about 94% (15 of 16).
So, Turnitin, in this test, was at least better than 80% accurate at spotting AI-generated text and text that we’re assuming was human-created. Probably in the range of 94-96% in both cases.
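The same quick check works for the control group; again, a minimal sketch using only the counts from the quoted passage:

```python
# Sanity check on the control-group ("human") rates discussed above.
# One of the 17 papers was returned without a score, so 16 remain.

scored_papers = 17 - 1
zero_or_single_digit = 14  # 10 papers at zero, 4 in single digits

strict = zero_or_single_digit / scored_papers                  # scores of 0 or single digits only
with_14_percent = (zero_or_single_digit + 1) / scored_papers   # also count the 14% paper

print(f"strict:           {strict:.1%}")           # 87.5%
print(f"counting the 14%: {with_14_percent:.1%}")  # 93.8%, the ~94% above
```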
But it wasn’t just Turnitin that scored well. According to the article:
the Quill tool appears to provide accurate predictions for those limited selections of prose. All were identified as AI-derived.
Even ZeroGPT had its moments, catching the paper that Turnitin missed:
Taking one of the 100 percent generated documents that was returned by Turnitin as 0 percent AI-derived, ZeroGPT returned it as 97.79 percent AI-generated.
Also noteworthy is that the authors repeat:
University librarians have reported students looking for AI-generated reference lists containing sources by an actual author and with an actual title, but the author and title were not related.
The bottom line here is that, once again, AI detection works. And I renew my call for any evidence to the contrary — that automated AI similarity checkers cannot discern human-created text from AI-created text. I have not seen any.
Just One More Note on the Same Topic
I’ll be quick here.
The good folks over at PCGuide, who we may assume know something about technology, recently published an article with the title:
Can ChatGPT be detected by Turnitin? – Turnitin AI detection explained
And while I wish they’d spent a little less time telling people how to bypass or confuse the detection systems, PCGuide says:
Can ChatGPT be detected by Turnitin?
Without a doubt, yes!
They were a little excited.
Turnitin is not the only way to detect ChatGPT, but it is one way.
Use of GPT: What We Know
In the recent Issue 261, I was struck by the reporting on “teens” admitting to using ChatGPT with their schoolwork:
Pew also asked teens whether they had ever used ChatGPT to help with their schoolwork. Only a small minority — 13 percent — said they had.
I mentioned at the time that 13% seemed out of the norm, based on other survey data. So, I went back and pulled together other numbers from this year on the use of ChatGPT or similar tools.
In Issue 261, from Stanford University — about 20% of high school students admitted to using AI in an unauthorized way on schoolwork or tests in the past month.
In Issue 190, in a survey from intelligent.com — 30% of college students have used ChatGPT on written homework. Of this group, close to 60% use it on more than half of their assignments.
In Issue 184, in an informal poll at Stanford University — around 17% of Stanford student respondents reported using ChatGPT to assist with their fall quarter assignments and exams.
In Issue 198, in a survey by Best Colleges — about 22% of students acknowledge using AI on school work. And nearly a third (32%) said they intended to use AI for school work or would continue to do so.
In Issue 206, in a survey commissioned by Turnitin — nearly half, or 45 percent, said they were “personally aware of students using ChatGPT or similar services in ways that educators or schools may find inappropriate, or in ways that may violate academic rules or expectations.”
In Issue 214, in a survey of Cambridge students conducted by The Cambridge Tab — almost half (49 per cent) had admitted using the AI chatbot to help complete work for their degree.
In Issue 232, in the 2023 edition of a national survey of college students in Canada — a solid 40% said they “had seen cheating with AI” and 16% said they saw it “all the time.”
In Issue 236, more than half (52 per cent) of Canadian students aged 18+ surveyed by KPMG are using generative AI to help them in their schoolwork.
In Issue 260, in a survey from Tyton Partners — more than 30% of students report using generative AI for “summarizing or paraphrasing text”, “answering homework questions”, or “assisting with writing assignments.”
Though not covered here, a survey by Study.com found that 89% of respondents had used ChatGPT for schoolwork, and “48% of students admitted to using ChatGPT for an at-home test or quiz, 53% had it write an essay, and 22% had it write an outline for a paper.”
Compiled, those numbers are 20%, 30%, 17%, 22%, 45%, 49%, 40%, 52%, 89% and more than 30%. Even counting Pew’s low finding, the average is about 37% — more than one in every three.
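If you want to check that average, here is a quick sketch; it treats Tyton’s “more than 30%” conservatively as 30, includes Pew’s 13%, and uses my own shorthand labels for the surveys:

```python
# Averaging the self-reported AI-use figures compiled above.
# Labels are shorthand for the surveys cited, not official titles.

surveys = {
    "Pew (Issue 261)": 13,
    "Stanford (Issue 261)": 20,
    "intelligent.com (Issue 190)": 30,
    "Stanford poll (Issue 184)": 17,
    "Best Colleges (Issue 198)": 22,
    "Turnitin (Issue 206)": 45,
    "Cambridge Tab (Issue 214)": 49,
    "Canada survey (Issue 232)": 40,
    "KPMG (Issue 236)": 52,
    "Tyton Partners (Issue 260)": 30,  # "more than 30%" taken as 30
    "Study.com": 89,
}

average = sum(surveys.values()) / len(surveys)
print(f"average: {average:.1f}%")  # 37.0%, more than one in three
```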
Department of Corrections Department - Kinda
In the last Issue, I wrote about Grammarly:
To work, Grammarly use has to be normal and routine and not cheating. No wonder they’d want to try to convince the school that using generative AI to modify a student’s work is not cheating.
A representative of Grammarly wrote in to clarify that Grammarly is not generative AI but “powered by AI,” saying the student in this case:
did use the product's suggestions that provide recommended changes, which are powered by AI but do not use generative AI. Given the differences in how educators might engage with AI vs. generative AI, I think it's important to note the distinction
First, it’s good to be clear that the student who was accused of misconduct for using Grammarly did in fact use it. She had more or less denied it.
Otherwise, fair. Though not entirely clear either, as it seems Grammarly wants to draw a line between AI that generates suggested text and generative AI that writes text outright. They are distinct, sure. But in this case, the distinction may not come with much difference.
Further complicating things, though, in July Grammarly announced GrammarlyGO:
new features that guide students in leveraging generative AI to augment—not replace—their critical thinking and communication skills.
And:
the on-demand, generative AI-powered tool that works everywhere students do, in time for the 2023/24 school year.
The announcement says that with the new feature:
students can learn to use generative AI conscientiously—helping educational institutions uphold academic integrity
Though, to be honest, the commitment to integrity is really suggested, not hard-wired. For example, it says:
Grammarly will now offer customers the ability to generate a citation for text generated by tools such as GrammarlyGO and ChatGPT.
Offer.
And:
GrammarlyGO will discourage students from generating long-form text if they attempt to do so, with messaging within the product that redirects them to ideate with GrammarlyGO instead while reminding them to adhere to their institution’s or instructor’s policies.
Discourage and remind.
Pretty weak. Though Grammarly now lets students access generative AI tools anywhere they go — just in time for school. But it’s not for cheating, they will remind you.
I asked the Grammarly representative for an interview on the topic of AI and misconduct. I’ll let you know if that happens.