Research Edition: Students Cheating with Chegg and Successful Interventions
Plus, stats at UC Davis. Plus, AI detectors work. Again.
Issue 256
To join the 3,695 smart people who subscribe to “The Cheat Sheet,” enter your e-mail address below. New Issues every Tuesday and Thursday. It’s free:
If you enjoy “The Cheat Sheet,” please consider joining the 14 amazing people who are chipping in a few bucks via Patreon. Or joining the 27 outstanding citizens who are now paid subscribers. Thank you!
Research: Exam Security Such as Proctoring Reduces Cheating with Chegg
In Issue 255, we skimmed the research from Jenelle Conaway, an accounting professor at George Mason University, which found that:
In periods of fewer online exam safeguards, 13–25 percent of intermediate accounting students are identified as using Chegg during exams. Corroborating evidence shows an anomalous improvement in student performance in online exams with minimal safeguards, which is attenuated by an increase in mitigation policies.
Here, we will dig in more deeply.
To begin, the paper credits a co-author:
Taylor Wiesen, University of Southern California
Chegg is Not Alone
Also important is that the authors correctly address the cheating resources they are examining:
This study focuses on the challenge academic resource sites (ARS) pose to online learning. These websites, such as Chegg and Course Hero, are online repositories of lecture notes, textbook problems, homework help, test questions, etc
They share further:
Examples of popular websites are Chegg, Course Hero, and Quizlet, although lesser-known sites abound.
Indeed, Chegg isn’t the only “academic resource site” that sells answers to homework and test questions. They got the big three.
Educating Me on Chegg
Also, big, big credit to the research team for educating me on Chegg. I knew, for example, that Chegg stopped assisting professors and other educators in academic integrity inquiries (see Issue 152). Now, they limit their available information to simple time stamps of when information was posted or accessed but protect their customers by refusing to identify them.
Someone I know once equated this absurd policy to a pawn shop that, when the police arrive to investigate a case of stolen property, will tell the authorities when it bought the item and when it sold it, but refuse to identify either party. You'll know your property was sold, and sold again, but you'll never get it back or see the thieves caught, because the pawn shop knows that if it identified the people who sell it stolen goods and the people who buy them, the market in ill-gotten treasure would close up. That's Chegg and Course Hero.
Anyway, what I did not know was that Chegg’s policy is actually worse than that. According to this new research:
Chegg provides information about questions to their copyright holder. Therefore, instructors can only obtain information about their custom test questions (i.e., not verbatim test bank content).
You have got to be kidding me.
So, Chegg won’t even tell a teacher if their test was compromised if the teacher does not personally own the copyrights to the questions. If an exam is created by a college department, borrowed from a colleague in another section, or written by a test publisher, Chegg will tell a teacher nothing at all. I did not know that. But I do know that Chegg will be happy to tell you how seriously they take academic integrity.
Anyway, back to this research.
Baseline Findings — Cheating with Chegg, Answers on Demand
If you recall, this paper examined three different course sections for compromised final exam questions and answers on Chegg — finding that 13% used Chegg during the test in one semester and nearly 25% in another. Here’s more detail:
In Spring 2020, ten students, or nearly 13 percent of students enrolled in the course, are individually identified as using Chegg during an exam. A single student submitted two exam problems. The compromised questions are then viewed by nine other students. Chegg experts provided a solution to these problems in 27 minutes, on average. Both the submissions and solutions occurred during the exam window. Students frequently refreshed the website in anticipation of the solution or rechecked the answers, as the questions were viewed an average of 10 times per student
And:
Three students submitted eight exam problems, which were then viewed by 11 other students. Although wait times for a solution increased to an average of 53 minutes, the question submissions and answers still occurred during the exam window.
To belabor the point, Chegg answers exam questions live, during the exam, and shares those answers with its paying customers, also during the exam. So does Course Hero.
From the study:
This paper highlights an emerging method of academic dishonesty in online exams— the use of on-demand services from ARS such as Chegg. In the current environment, students have access to an enormous amount of publisher and instructor-created content for a nominal fee. Alarmingly, ARS offer services that answer newly submitted questions in real time, potentially during examinations. These websites pose a threat to academic integrity for any out-of-class assignment, including online assessments. This study documents that these websites are widely used by students during online exams.
Further:
For this paper’s sample, Chegg took between 15 and 94 minutes to provide solutions to intermediate accounting exam computational problems.
In other words, if a teacher leaves an online exam open without advanced security measures for three, four, or twenty-four hours, it’s going to be compromised. In the words of the authors:
Collectively, these findings suggest that, absent the implementation of meaningful mitigation measures, online exams are vulnerable
Yes, they are.
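To make the mechanics concrete, here is a minimal sketch of the kind of timestamp check involved: if a question was posted, answered, or viewed on an ARS while the exam window was still open, that is a red flag. To be clear, this is not the authors' code, and the event names, times, and exam window below are invented purely for illustration.

```python
from datetime import datetime

# Hypothetical exam window (illustrative only).
exam_start = datetime(2020, 5, 11, 9, 0)
exam_end = datetime(2020, 5, 11, 13, 0)  # a four-hour availability window

# Hypothetical ARS activity for one matched exam question: (event, timestamp).
ars_activity = [
    ("question_posted", datetime(2020, 5, 11, 9, 42)),
    ("solution_posted", datetime(2020, 5, 11, 10, 9)),   # answered ~27 minutes later
    ("question_viewed", datetime(2020, 5, 11, 10, 15)),
    ("question_viewed", datetime(2020, 5, 11, 14, 30)),  # after the window closed
]

def in_window(ts, start, end):
    """True if an ARS event happened while the exam was still open."""
    return start <= ts <= end

# Any in-window posting, answering, or viewing is evidence the exam was compromised live.
for event, ts in ars_activity:
    if in_window(ts, exam_start, exam_end):
        print(f"{event} at {ts:%H:%M} fell inside the exam window")
```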
Examination of Exam Security Policies
What’s most fascinating and important about this study — yes, I buried the lead again — is that the research team ran exams under three different sets of exam practices with escalating security:
A - time limit, strict availability window, require affirmation of the honor code
B - custom questions, multiple versions of the exam
C - video monitoring and lock-down browser
The exams were administered under policy A, then again with policies A + B, then a third time with A + B + C.
In other words, the core finding is that students accessed Chegg during an online exam even with a time limit, a strict availability window, and a pre-test affirmation of the honor code in place. So, yeah.
Surprising no one, students who cheated with Chegg or by other means on unsecured online exams gained a quantifiable advantage. According to the study:
the data continue to exhibit abnormally high exam scores in online exams with minimal exam policies. Scores under policies A are estimated at 8.5 percentage points higher than other exams
This link between higher test scores and unsecured online exams is essentially uncontroverted. I don't know the exact score on this point, but it's like 203-2 or something like that.
When the team delivered the exams under the second set of policies, A + B, cheating with Chegg actually increased. In my view, there are two reasons for this. One, most of the integrity interventions in A + B are aimed at crimping collusion cheating, the sharing of answers during a test. Using Chegg is different. Moreover, the more a policy closes off cheating by answer sharing, the more it may drive students to another method. In this case, Chegg.
Two, by summer of 2020, Chegg was exceptionally well known and growing in popularity. Plus, it was a summer course and I think that matters. We will call that point two and a half.
The takeaway is that rewriting questions, setting time limits, using multiple exam versions, and affirming the honor code did not stop Chegg cheating. Though we cannot say what the use of Chegg would have been without these measures, it's clear they were not strongly successful deterrents.
A + B + C = Zero
So, what happened when the research team added a lock-down browser and exam proctoring to the exam integrity mix? The authors report:
In Fall 2020 and Spring 2021, no exam questions are identified as compromised. This period coincided with the first availability of video and audio monitoring and lock-down browser policies for online exams (policies C), which, when combined with the previously deployed policies, represented the maximum online proctoring capabilities available in this course and significantly hindered the use of ARS.
Let’s try it this way. Again, from the paper:
In this study, the use of Chegg came to an identifiable end in semesters where lock-down browsers and video and audio monitoring were implemented in addition to other policies.
You don’t say.
When students could not open new windows or tabs during the exam, and knew their activities were subject to review, they stopped using Chegg. Fascinating.
The research team writes:
Exam performance analyses support our Chegg analysis by demonstrating that online exams are vulnerable, absent the implementation of meaningful mitigation policies: student performance is highest on online exams with the fewest safeguards and attenuated by a progressive increase in policies, after controlling for student quality (GPA) and academic pressure (course load and summer courses).
And:
there is no difference between scores from face-to-face exams and online exams with policies A + B + C
Even after the paper controls for course load and GPA, scores on unsupervised exams were higher, and they returned to normal once proctoring was introduced.
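If "controlling for" sounds abstract, here is a rough sketch of the shape of that exercise: regress exam scores on the policy regime while including GPA, course load, and a summer-term flag as covariates. This is not the authors' model, and the data below are randomly generated placeholders, there only so the code runs.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Placeholder data, randomly generated purely to make the sketch runnable.
# The real analysis used the authors' own course records, which I do not have.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "score": rng.uniform(50, 100, n),                # exam score (percent)
    "regime": rng.choice(["A", "A+B", "A+B+C"], n),  # which policy set applied
    "gpa": rng.uniform(2.0, 4.0, n),                 # control: student quality
    "credit_hours": rng.integers(6, 19, n),          # control: course load
    "summer": rng.integers(0, 2, n),                 # control: summer term flag
})

# Regress score on the policy regime while holding the controls fixed.
# This is roughly what "controlling for GPA and course load" means.
model = smf.ols("score ~ C(regime) + gpa + credit_hours + summer", data=df).fit()
print(model.params)
```

The coefficient on each regime level, relative to the baseline, is then an estimate of the score gap that remains after the controls are accounted for.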
I think this team from George Mason and Southern Cal may be on to something.
We’ve seen similar results in many research studies over the past few years (see Issue 169). That proctoring or supervising assessments reduces misconduct is both obvious and proven.
Miscellany
In addition to their exam data, the research duo also conducted a survey of students, finding, among other data points:
Seventy-one percent of survey participants are familiar with Chegg, and most use ARS—including 10 percent who acknowledge doing so during an exam or quiz. Collectively, the results should be concerning to educators seeking to maintain academic integrity in online learning formats
Ten percent say they use Chegg and/or other cheating sites during an exam or quiz. And this is self-reported, which we know undercounts actual cheating behavior. Ten. Per. Cent. And this is just one kind of cheating.
The survey also found that:
63 percent (37) participants agree that academic dishonesty is more prevalent during online exams
The sample size is small. But we don't even need to see this data point again to know that people think it's easier to cheat in online tests (see Issue 36 for just one other example).
Conclusion
Once again, we see a very clear trade-off between exam security and cheating in online exams. We can have security, or we can have cheating. I know, no security feature or policy is cheat-proof, but the balance remains: every step back on one side yields more of the other.
Also once again, we see that Chegg, Course Hero and others who sell answers during exams are pernicious. They devour academic integrity and academic standards for money, penalizing honest students in the process.
I am not sure we should have needed reminders on either point. But it's obvious that we are in desperate need of them on both.
University of California, Davis Has 1,600 Integrity Cases Annually
According to a little nothing of an article in the student paper at the University of California, Davis:
about 1600 students are referred to the Office of Student Support and Judicial Affairs (OSSJA) for alleged academic misconduct each year, usually for plagiarism, cheating, unauthorized collaboration, or not following directions.
I confess I have no idea whether that’s a ton or very few. Though, were I at the school, I’d publish that number everywhere because few things deter misconduct more than the possibility of being caught and, in that context, 1,600 feels sizeable.
I also think the number speaks to the administrative burden schools carry in reviewing and adjudicating these cases. The caseload, and the pressure to resolve cases quickly, can lead to quick dismissals or very lenient penalties, such as required online integrity courses or promises not to do it again. Few things incentivize misconduct more than hearing that people get away with it.
But mostly, I thought the number was interesting and worth sharing.
New AI Checker Caught Papers “With Unprecedented Accuracy”
I went back and forth on whether to write on this since it’s not really related to academic integrity.
But I am going to share it quickly because, in all honesty, the folks who insist that AI detection systems do not work have gotten under my skin a little bit, and this allows me to say again: all the available evidence shows they do work and, in most cases, work very well.
For examples, see Issue 253 or Issue 250.
Anyway, here’s an article from the journal Nature. It discusses an AI detector that was trained on writing in science journals. The tool was then tested on hundreds of real paper abstracts, titles, and introductions, as well as ones generated by ChatGPT. The research showed:
When tested on introductions written by people and those generated by AI from the same journals, the tool identified ChatGPT-3.5-written sections based on titles with 100% accuracy. For the ChatGPT-generated introductions based on abstracts, the accuracy was slightly lower, at 98%. The tool worked just as well with text written by ChatGPT-4, the latest version of the chatbot.
And:
The new ChatGPT catcher even performed well with introductions from journals it wasn’t trained on, and it caught AI text that was created from a variety of prompts, including one aimed at confusing AI detectors.
This shows that, in addition to general-use detection systems, those trained on narrow types of written material can be highly effective. That’s good news.
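For the curious, here is a generic sketch of what a narrowly trained detector looks like: a plain text classifier fit only on one genre of writing, in this case journal-style introductions. This is emphatically not the tool the Nature article covers; the training snippets and model choice are placeholders meant only to show the idea.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder training texts; a real detector would use thousands of
# introductions from one discipline's journals plus AI-generated counterparts.
human_texts = ["In this study we examine ...", "Prior work has established ..."]
ai_texts = ["This paper explores the role of ...", "Recent advances have enabled ..."]

texts = human_texts + ai_texts
labels = [0] * len(human_texts) + [1] * len(ai_texts)  # 0 = human, 1 = AI

# Word-frequency features plus a linear classifier: simple, but training only
# on one narrow genre is what lets a detector like this specialize.
detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(texts, labels)

print(detector.predict(["We investigate the effect of ..."]))  # prints [0] or [1]
```

Because everything the model ever sees comes from one narrow genre, it can key on the small stylistic tells of that genre, which is a plausible reason purpose-built detectors outperform general-purpose ones.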
Other detection systems did not fare as well as the purpose-built one. From the paper:
By contrast, the AI detector ZeroGPT identified AI-written introductions with an accuracy of only about 35–65%, depending on the version of ChatGPT used and whether the introduction had been generated from the title or the abstract of the paper. A text-classifier tool produced by OpenAI, the maker of ChatGPT, also performed poorly — it was able to spot AI-written introductions with an accuracy of around 10–55%.
As I’ve written repeatedly, those systems are terrible and do not represent AI classifiers overall. OpenAI’s text classification system was abysmal and ZeroGPT is too. They’re just bad. Neither one could catch a Little League pop fly.
Class Note: This Thursday is Thanksgiving in the United States, so there will be no Issue of The Cheat Sheet on Thursday. We will pick back up next Tuesday. Gobble, gobble.