Misconduct Cases Rise at University of Maryland
Maryland says it confirmed at least 50 cases of cheating with AI and issued 105 sanctions. Plus, PwC fined for exam cheating. Again.
Issue 259
If you enjoy “The Cheat Sheet,” please consider joining the 15 amazing people who are chipping in a few bucks via Patreon. Or joining the 28 outstanding citizens who are now paid subscribers. Thank you!
Misconduct Cases Increase 9% at Maryland, Sanctions Increase 11%
According to some quality reporting by The Diamondback, the student newspaper at the University of Maryland, the number of referrals for academic misconduct has gone up by 9% while sanctions for misconduct are up 11%.
Before digging into the numbers, I will give credit to Maryland for releasing their numbers and discussing them in public. And credit to The Diamondback for rationally reporting on them.
The paper chose to lead with the cases of AI-related misconduct, which I will get to. But to me, the overall numbers are more interesting. According to the reporting:
the student conduct office received 679 referrals in the 2022-23 academic year — a nearly 9 percent increase in total referrals from 2021-22
As I say often, I have no idea whether 679 is the right number. Still, I think the upward trendline ought to be deeply concerning because we were told misconduct cases were supposed to go down after students went back to physical classrooms post-pandemic. But in many cases, they have not. They’ve continued to increase (see Issue 249 or Issue 181). Not everywhere, but often enough that we can say that cheating is not retreating.
Again, I'll mention the administrative burden that processing and concluding 679 cases must impose. And a reminder that only an estimated 2% of cheating ever reaches a formal case referral; most cheating, by far, is never caught, never challenged, or is resolved informally beforehand.
In any case, more on the numbers:
The office also issued more than 1,300 sanctions in the 2022-23 academic year, marking an 11 percent increase from the total number of sanctions in the 2021-22 academic year.
As for what the paper chose as its leading point, the cheating with AI, the reporting says there were:
73 referrals for artificial intelligence-related academic integrity violations during the 2022-23 academic year.
Out of these referrals, 13 cases have been dismissed or the student was found not responsible. Ten cases are still pending as of Sept. 21. The student conduct office issued 105 academic sanctions across the 50 remaining, confirmed AI-related cases.
It continues:
Education policy assistant professor Jing Liu said he has seen a rise in ChatGPT-related academic integrity violations.
“I do think now there is a surge of plagiarism using ChatGPT,” Liu said.
I also found it noteworthy that, according to the reporting:
In addition to the increase in the number of AI-related cases, students with AI-related violations received, on average, harsher sanctions from the student conduct office.
Twenty-four students who were referred to the student conduct office for AI-related violations received an “XF” mark on their transcript, denoting failure with academic dishonesty. Nearly half of all confirmed AI-related cases carried an “XF” mark in the 2022-23 year. Less than 15 percent of all academic sanctions in 2022-23 included an “XF.”
Don’t have much to add, just find it interesting.
The article then artfully gets into the challenges of detecting cheating with generative AI such as ChatGPT:
One of the challenges for the student conduct office is detecting the use of AI.
While AI-detection technologies such as Turnitin and GPTZero have emerged in recent months, there are many questions about their accuracy.
And that:
The subpar accuracy could lead to false negatives, or not detecting actual uses of AI.
Good point. Even with people sowing and fertilizing seeds of doubt about AI detection, it’s remarkable that professors at Maryland were able to file 73 misconduct cases related to AI and that, of the 63 cases resolved so far, 79% (50 of 63) have been upheld.
But the article also, of course, quotes some knuckleheads about AI detection systems and false accusations and bias:
It could also accuse a student of using AI when they did not, according to MPower professor Philip Resnik. Resnik said false positives are more harmful to students.
And that:
Using AI-detection technologies could also unfairly penalize students whose first language is not English [said another professor]
For the billionth time, AI detectors do not make accusations. And we’ve discussed the English-bias point a few times (see Issue 216). In other words, neither point is compelling.
Importantly though, the school’s Student Conduct Director, James Bond — no, really — kind of dismisses the nonsense about false positives and accusations:
While this university allows faculty members to use detection technologies for suspected cases, the student conduct office will only use an AI detection tool if it is used by a faculty member in deciding to refer a student to the office, Bond said.
Bond acknowledged the technology’s shortcomings. He noted that the results of the AI detector, if it is used by an instructor, will be one of many factors considered in a student conduct case. Follow-up conversations with the student are the most important part of the process, he said.
“We know that no detector is perfect and that's why we don't just base it on a detector,” Bond said.
This Bond guy deserves an award because — exactly.
No detector is perfect, and no case of suspected misconduct should ever be decided on the outcome of a detector alone. As Bond says, as everyone says, it should be “one of many factors considered,” and follow-up conversations are “the most important part of the process.”
For the life of me, I do not understand why this is so difficult for some people to understand. Like all tools of detection, they are meant to be informative, not determinative.
I mean, imagine the chaos that would ensue if every person who set off a metal detector at an airport were convicted of terrorism. We don’t use detection tools that way — not ever. Yet some people are just determined that a “false positive” from an AI detector will ruin careers and lives. I don’t get it. It’s so illogical that it seems intentional.
The Diamondback also shares:
According to an April study by the Technical University of Darmstadt in Germany, the most effective AI detection tool achieved less than 50 percent accuracy in detecting usage.
But the takeaway here should be, in my opinion, that cheating cases are not receding. In fact, quite the opposite. And, of course, that students are using AI to cheat, probably far more frequently than we realize.
On the University of Darmstadt Study
I will probably have more on this soon, as I’ve only just skimmed the study after seeing it in The Diamondback article discussed above.
It does say that the most effective tool achieved less than 50% accuracy in detecting usage. True.
But, as we’ve seen before, the details matter.
Most obviously, this point only shares success detecting usage — the indication of the presence of text that is similar to what AI would create. That is only half of what these detectors do. They also classify text as likely human. And because most classifiers are engineered to err in the direction of not flagging material, “detecting usage” is where they will be less accurate. In other words, the portion of the study that was highlighted was the portion that would inevitably yield the worst results.
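To make that asymmetry concrete, here is a toy sketch in Python. It is not modeled on any real detector; the score distributions and the deliberately strict flagging threshold are assumptions I made up purely to illustrate why a classifier built to avoid flagging humans will, by design, miss a chunk of AI-written text:

```python
# Toy illustration only -- not any real detector's model or numbers.
# A classifier tuned to almost never flag human writing (to avoid false
# positives) will necessarily let some AI-written text slip through, so
# judging it only on "detecting usage" shows its weakest side.
import random

random.seed(0)

# Hypothetical score distributions: human text tends to score low,
# AI-generated text tends to score high, with some overlap.
human_scores = [random.gauss(0.2, 0.15) for _ in range(1000)]
ai_scores = [random.gauss(0.8, 0.15) for _ in range(1000)]

THRESHOLD = 0.98  # deliberately strict: flag as "AI" only when very confident

humans_flagged = sum(score > THRESHOLD for score in human_scores)  # false positives
ai_flagged = sum(score > THRESHOLD for score in ai_scores)         # "detecting usage"

print(f"Human text wrongly flagged: {humans_flagged / 1000:.1%}")  # near zero by design
print(f"AI text actually flagged:   {ai_flagged / 1000:.1%}")      # well below 100%
```

With a threshold that strict, the false-positive rate is essentially zero, but the “detecting usage” number looks unimpressive — which is exactly the half of the picture the study’s headline figure emphasizes.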
I will leave you to judge whether sharing just that half of the information was intentional.
Most importantly, my early read shows that the study did not test three of the most reliable detection systems: Turnitin, Crossplag (now Inspera), and Winston AI. The authors of the paper once again tested only very poor systems, not the ones actually used by schools.
Again, I’ll get into this more later. But if I wanted to test how fast animals are and I set a race between a sloth, a tortoise, a garden snail, and a starfish — let’s just say my conclusions may be slightly off.
And again, I will leave it to you to decide if that’s intentional.
Major Accounting Firm Fined for Exam Misconduct. Again.
This time, PwC in China and Hong Kong will pay a $7 million settlement related to its employees cheating on mandatory internal training exams.
From the coverage:
More than 1,000 workers at PwC China and PwC Hong Kong engaged in training-exam misconduct from 2018 to 2020, according to the [regulator]. The employees during that period improperly shared their answers to online tests for mandatory internal training courses related to the units’ U.S. auditing curriculum.
As alluded to, this is just the latest example of auditing and accounting firms facing steep fines for exam cheating — see Issue 254, Issue 131, or Issue 97.
Though I am sure it does not matter what I think, I suspect that the only difference between these accounting scandals and business as usual in nearly any other field is that in this case, a regulator decided to look.