Judge Rules Against Teen Who Sued School Over Grammarly Cheating
Plus, another research paper shows that teachers cannot spot AI. Plus, plagiarism at Wichita State University. Plus, a class note.
Issue 325
Subscribe below to join 4,196 other smart people who get “The Cheat Sheet.” New Issues every Tuesday and Thursday.
If you enjoy “The Cheat Sheet,” please consider joining the 16 amazing people who are chipping in a few bucks via Patreon. Or joining the 42 outstanding citizens who are now paid subscribers. Paid subscriptions start at $8 a month. Thank you!
Judge Smacks Down Student Who Sued School Over Bad Grade for Using Grammarly to Cheat
The family of a high school student sued his school and district when he received a low grade in an AP History course, after he was discovered to have used AI in his assignments, in violation of school policy (see Issue 317). Recently, a judge dismissed the family’s request for immediate relief, concluding their odds of winning the case were remote, likely ending the litigation.
Before I share blurbs from the ruling, let me also share that:
The judge affirmed that schools have wide latitude and deference in determining grades and matters of discipline.
The student never denied using AI on his assignment. His argument was that he was confused as to what was permitted.
The judge found that the school had a policy against “unauthorized technology,” and other prohibitions regarding using inauthentic material, and that the school acted within those policies.
Not only did the school handbook ban use of “unauthorized technology,” the school gave a lecture, presented a PowerPoint, and distributed a handout on AI use in school work. The student received all three. The student’s argument was that he was not aware the rules applied to all his classes.
Though the judge did find it likely that a low grade would harm the student’s chances of admission to a top university, the judge essentially did not believe that the student could be confused by the school’s policies.
The student used Grammarly to write significant portions of his assignment, and the software hallucinated information and sources.
The school used Turnitin’s plagiarism and AI detection system, plus two others, to substantiate its claims of misconduct.
Now, background from the ruling:
The work in question was a script for a short documentary film, which [student] and his partner submitted for an AP U.S. History project assigned in conjunction with the National History Day organization. The evidence reflects that the pair did not simply use AI to help formulate research topics or identify sources to review. Instead, it seems they indiscriminately copied and pasted text that had been generated by Grammarly.com (“Grammarly”), a publicly available AI tool, into their draft script. Evidently, the pair did not even review the “sources” that Grammarly provided before lifting them. The very first footnote in the submission consists of a citation to a nonexistent book: “Lee, Robert. Hoop Dreams: A Century of Basketball. Los Angeles: Courtside Publications, 2018.” The third footnote also appears wholly factitious: “Doe, Jane. Muslim Pioneers: The Spiritual Journey of American Icons. Chicago: Windy City Publishers, 2017.” Significantly, even though the script contained citations to various sources—some of which were real—there was no citation to Grammarly, and no acknowledgement that AI of any kind had been used.
The students literally cited a book that was supposed to have been written by Doe, Jane. Come on. I am sure elite universities are breaking down the door to admit a student who did not know that Jane Doe was not real.
And though I highlighted it already, Grammarly is a full-on AI generation engine now. It’s no longer just the friendly little grammar helper. It writes full text and sources, although, you know, believe any of it at your own risk.
Continuing:
When they submitted their script—one component of the six-component Assignment—via Turnitin.com, the website flagged parts of the submission as AI-generated.
And you have to know this is coming — I thought AI detectors did not work. I’m confused. People say they don’t work. But here … I just don’t understand.
Apparently made suspicious by the flags from Turnitin, the teacher, Ms. Petrie, did what we want teachers to do. She investigated.
Ms. Petrie used the “Revision History” extension for Chrome, a tool used by some HHS [high school] teachers to determine “how many edits students made to their essays, how long students spent writing, and what portions of the work were copied and pasted.” In doing so, Ms. Petrie discovered that large portions of the script had been copied and pasted into the document. Ms. Petrie testified that the revision history showed that [student] had only spent approximately 52 minutes in the document, whereas other students spent between seven and nine hours. Ms. Petrie also ran the submission through “Draft Back” and “Chat Zero,” two additional AI detection tools, which also indicated that AI had been used to generate the document
I don’t know Draft Back or Chat Zero. But I probably do not need to. It seems the paper went three for three on AI detectors and the document history showed blocks of text were pasted. From the ruling:
The manner in which [student] used Grammarly—wholesale copying and pasting of language directly into the draft script that he submitted—powerfully supports Defendants’ conclusion that [student] knew that he was using AI in an impermissible fashion. The purpose of the Assignment, plainly, was to give students practice in researching and writing, as well as to provide students an opportunity to demonstrate, and the teacher an opportunity to assess, the students’ skills. Considering the training provided to HHS students regarding the importance of citing sources generally, Defendants could conclude that [student] understood that it is dishonest to claim credit for work that is not your own. Although, as discussed below, the emergence of generative AI may present some nuanced challenges for educators, the issue here is not particularly nuanced, as there is no discernible pedagogical purpose in prompting Grammarly (or any other AI tool) to generate a script, regurgitating the output without citation, and claiming it as one’s own work.
Here, I’d like to repeat a few of the judge’s comments. The defendants, the school, could conclude that the student “understood that it is dishonest to claim credit for work that is not your own.” Nail, hit on head. Further, “the issue here is not particularly nuanced, as there is no discernible pedagogical purpose in prompting Grammarly (or any other AI tool) to generate a script, regurgitating the output without citation, and claiming it as one’s own work.”
Exactly.
To leave no doubt, the judge continued:
Even if I were to credit [student’s] testimony that he was “confused” about what uses of AI were permitted, it strains credulity to suppose that [student] actually believed that copying and pasting, without attribution, text that had been generated by Grammarly was consistent with any standard of academic honesty.
I know a good percentage of teachers don’t believe this, or don’t want to believe it, but students know when they are acting honestly and when they are not — even when they say they were “confused.”
I am also dumbstruck by the idea that a student who aims for admission to an elite university would argue in public that he was confused over what is “standard academic honesty.”
In the suit, the family claimed that in addition to a lower grade, the student was harmed by his lack of admission to the National Honor Society, though he was subsequently admitted. It seems the family argued that the student was treated unfairly because:
“at least seven other students were inducted” despite having “an academic integrity infraction on their record,” one of whom “had previously used AI.”
I’m sorry - what’s that?
The National Honor Society is accepting students with academic integrity infractions? And seven of them at one school? I’m struggling to understand what is going on here, but I am rather aligned with the judge, who wrote:
Defendants can hardly be faulted for taking into account such dishonesty in determining whether [student] should be inducted into an “honor” society.
I mean, right? Your argument is pretty thin, I think, if your claim is it’s unfair that your dishonesty bars you from joining an “honor” society. Though, again, the student was later allowed in, which I think says more about the National Honor Society than it does about this student.
Speaking of what it means to act with honor, I will close on Grammarly.
The company publicly went all-in with a student who was accused of using Grammarly to cheat (see Issue 279). They said the AI detectors did not work — the same detection system used in this case, by the way. Grammarly defended the student in public, contributed to her legal defense, and even offered to hire her to produce social media content.
Yet here, in a case where misconduct with Grammarly is proven and essentially uncontested, Grammarly is nowhere to be found.
Maybe I missed the public statement disapproving of this student’s use of their product. Maybe I missed their affirmation that AI detection tools do detect AI and can therefore nab cheating and protect academic fairness. Maybe I missed Grammarly offering to make a donation to the school’s legal bills. Maybe I missed the job offer from Grammarly to Ms. Petrie.
My point is that if you want your defense of the innocent — cough, cough — to be taken seriously, maybe speak up when the guilt is not in question. If you want your product to be seen as having clean hands, maybe speak up when it’s clearly being misused. Their silence says more than their words.
More Research: Teachers Cannot Spot AI Writing
Published over the summer, a new paper examines whether teachers can reasonably or reliably spot text created by AI. Once again, they cannot.
This paper is by:
Peter Scarfe, Kelly Watcham, Alasdair Clarke, and Etienne Roesch of the University of Reading in the U.K.
Let’s start with welcome, but entirely obvious, points from the paper’s abstract:
The recent rise in artificial intelligence systems, such as ChatGPT, poses a fundamental problem for the educational sector. In universities and schools, many forms of assessment, such as coursework, are completed without invigilation. Therefore, students could hand in work as their own which is in fact completed by AI.
Yes, students who are assigned unsupervised coursework can use generative AI to take credit for work they did not do. And I hope you are sitting down because — they do.
The authors continue:
If students cheat using AI and this is undetected, the integrity of the way in which students are assessed is threatened.
I’m going to rewrite that: Because students cheat using AI and this is undetected, the integrity of the way in which students are assessed is threatened.
There’s also this, from the paper itself:
clearly, if students can simply copy and paste an assessment question into a system such as GPT-4 and in seconds get an output that will pass the assessment undetected, passing the assessment demonstrates absolutely no knowledge on the part of the student.
I think they are on the right track.
Here is the meat of their research and findings, from the abstract:
We report a rigorous, blind study in which we injected 100% AI written submissions into the examinations system in five undergraduate modules, across all years of study, for a BSc degree in Psychology at a reputable UK university. We found that 94% of our AI submissions were undetected. The grades awarded to our AI submissions were on average half a grade boundary higher than that achieved by real students.
Ninety-four percent were undetected by humans, and people are still hyperventilating about AI detection technologies being inaccurate.
It is regrettable that the paper buys the idea that AI detection systems are unreliable. But here, that’s a good thing because it allows the study to zero in on whether humans can detect AI material.
In the test, the team submitted AI-generated answers, in real time, to these online exams and assessments:
(1) Short Answer Questions (SAQs), where submission consisted of four answers from a choice of six questions, each with a 200-word limit and (2) Essay Based Questions, where submission consisted of a single 1500-word essay (students submitted one answer out of a choice of either three or four (dependent on the module)). SAQs were completed in a 2.5hr time limited window starting at 9am or 2pm. Essay exams were completed over an 8hr time limited window starting at 9am. Both were at home exams where students had access to their course materials, academic papers, books, the internet and could potentially collude and collaborate with peers or use generative AI.
An online, unsupervised writing assignment is by now — a joke. Let’s just start there.
But the paper notes that no AI detection systems were used on the submissions. The authors also note that, in order to qualify as detected:
AI submissions needed only to be flagged for some form of poor academic practice or academic misconduct via standard university procedures. The marker did not need to mention AI when they flagged the script as of concern.
So, a submission flagged for anything at all, even just as poor work, counted as detected, regardless of whether the human graders ever suspected it was not human. Further, the team did not alter the AI text at all before submitting it, and the AI prompts were simple and direct, for example:
Including references to academic literature but not a separate reference section, write a 2000 word essay answering the following question: XXX
And so:
Overall, we found that 94% of AI submissions were undetected, even though we used AI in the most detectable way possible.
And:
Overall, AI submissions verged on being undetectable, with 94% not being detected. If we adopt a stricter criterion for “detection” with a need for the flag to mention AI specifically, 97% of AI submissions were undetected.
You want to talk about unreliable? That’s a 94-97% failure rate. Or how about this, from the authors:
Overall, our 6% detection rate likely overestimates our ability to detect real-world use of AI to cheat in exams.
My goodness.
I am reminded of the quote from Benjamin Miller, a professor at the University of Sydney, in Issue 323:
when a teacher says the use of AI is low, what they are really saying is that use of AI I can detect is low.
I am likewise reminded that we’ve seen this before. In Issue 252, we shared that linguists could not tell AI-generated text from human writing. In Issue 253, we looked at a study in which AI detection technology did better at spotting AI text than humans did.
I need to also share that this is not just about sneaking garbage and fake work past teachers. The academic consequences are potentially significant because:
We found that in 83.4% of instances the grades achieved by AI submissions were higher than a random selection of the same number of student submissions.
And:
AI submissions tended to attain grades at the higher end of the distribution of real students
And:
In the most extreme difference between AI and real students, the AI advantage approached that of a full grade boundary.
And:
For all modules [except one] there was nearly a 100% chance that a random selection of real student submissions being outperformed by the AI submissions.
In academic settings that are competitive for future learning opportunities, scholarships, or job placements, this is a real problem. When AI can reliably outperform your peers with a 94-97% chance of going undetected, you tell me: what is the incentive to do the right thing? When your peers use AI to get better grades than you, what is the incentive to keep doing the work yourself and fall behind?
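If you want to see mechanically what a comparison like the one the authors describe involves, here is a minimal sketch in Python. The grade numbers are invented for illustration, they are not the paper’s data, and the exact comparison the researchers ran may differ from this simple resampling check.

```python
import random

# Invented grades for illustration only; these are not the paper's data.
ai_grades = [68, 66, 70, 65, 69]                            # grades given to AI submissions
student_grades = [55, 60, 62, 58, 64, 66, 61, 59, 63, 57]   # grades given to real students

def chance_ai_outperforms(ai, students, trials=10_000):
    """Estimate how often the AI submissions, on average, beat a random
    selection of the same number of real student submissions."""
    ai_mean = sum(ai) / len(ai)
    wins = 0
    for _ in range(trials):
        sample = random.sample(students, len(ai))
        if ai_mean > sum(sample) / len(sample):
            wins += 1
    return wins / trials

print(f"Chance AI beats a random student sample: {chance_ai_outperforms(ai_grades, student_grades):.1%}")
```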
The paper concludes:
From a perspective of academic integrity, 100% AI written exam submissions being virtually undetectable is extremely concerning. Especially so as we left the content of the AI generated answers unmodified and simply used the “regenerate” button to produce multiple AI answers to the same question.
Extremely concerning undersells it. My opinion.
And finally, although it was not the target of their research, the research team asked themselves about the prevalence of AI use in academia. They wrote:
we struggle to conclude that its use would be anything other than widespread.
I don’t know what to tell you.
Plagiarism at Wichita State
According to news reports, Wichita State University President Richard Muma is in some warm water over allegations that he plagiarized “up to 5%” of his dissertation.
Shocker.
Muma is not the only university president accused of plagiarism in recent years. As I’ve written before, it’s a trend we’re going to see accelerate as the generative AI generation reaches places of prestige. But that’s not really the point here.
As the article linked above reports:
Kansas Reflector in October reported on Muma’s failure to properly credit more than 20 authors of books and journal articles in his 2004 dissertation that led to awarding of a doctorate from University of Missouri in St. Louis. At least 10 professors at public and private colleges outside of WSU said the lifting of more than 50 phrases, sentences and paragraphs without quotation marks amounted to plagiarism.
And that:
the inquiry considered those to be “technical” omissions rather than academic misconduct. The president, who was a tenured professor when he completed the Ph.D., said he would submit a revised version of his 88-page dissertation to UMSL to address flaws regarding “reuse” of text.
That does not sound great to me, to be honest.
As is often the case, this is less about what someone did years ago. It’s about what happens now, and the optics for today’s students. The newspaper asked students and reported:
Several said the university’s president was given a pass for apparently violating fundamental rules that could get a student flunked, suspended or expelled.
Fair. But I want to highlight two quotes from students because they’re pretty great:
Chemistry student Rylee Schaffer said what Muma did with his dissertation 20 years ago was improper.
“I think it was a mistake, and we all make mistakes,” she said. “It’s probably for the best that you don’t just let people cheat.”
Probably for the best that you don’t just let people cheat.
And:
[Ryan] Whalen, a business administration student from Colorado Springs, Colorado, rejected the suggestion that the Muma controversy meant fresh precedent had been established for WSU students accused of plagiarism.
“What if we had a 5% rule like that?” he said. “No. In my opinion, you cheat or you don’t cheat. There’s no half-cheating.”
You cheat or you don’t cheat. There is no half-cheating.
Love it.
Class Note:
Thank you to our paid subscribers. It means a great deal to me and allows us to buy the occasional research paper or newspaper subscription that can expand our coverage.
Starting with this Issue, I’ve added a new level of paid subscription for institutions and corporations. The regular Cheat Sheet subscription is $8 a month; the new level is $20 a month, or $240 per year.
Reminder: there’s no difference in content between a paid subscription and a free one, or between the regular paid level and this new one. Paid subscriptions are simply a way to support our work, which is helpful and tremendously appreciated.
Finally, today is Wednesday. I did not get The Cheat Sheet out yesterday, on Tuesday. Sorry. And given the Thanksgiving holiday tomorrow, there will be no Issue this Thursday. We will begin again next week.
That UK study is one of the most important pieces of research we're not talking about.
As a student, hearing that:
1. AI-created content is not likely to be detected,
2. a school, department, or college may not even provide detection tools to its faculty, and
3. AI-created content consistently performs better than original student work
would be a demoralizing message to receive.
In game theory, this is a prisoner's dilemma where the honest student always loses.
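To put rough numbers on that point, here is a toy expected-value sketch in Python. The detection rate comes from the paper’s roughly 6% figure; every other number is invented purely to illustrate why, under conditions like these, the honest choice loses.

```python
# Toy model (illustrative numbers only) of the incentive problem described above.
P_DETECT = 0.06        # chance AI use is flagged, per the paper's ~6% detection rate
HONEST_GRADE = 62      # assumed typical grade for honest work (invented)
AI_GRADE = 68          # assumed typical grade for unmodified AI work (invented)
CAUGHT_GRADE = 0       # assumed grade if the AI work is caught (invented)

expected_honest = HONEST_GRADE
expected_ai = (1 - P_DETECT) * AI_GRADE + P_DETECT * CAUGHT_GRADE

print(f"Expected grade, honest work: {expected_honest:.1f}")
print(f"Expected grade, AI work:     {expected_ai:.1f}")
# With these assumptions, the dishonest option comes out ahead, which is the
# commenter's point: the student who does the work honestly is the one who loses.
```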