Mira is an international business student at the Artevelde University of Applied Sciences in Belgium. She recently received feedback on one of her papers and was shocked to see her instructor's note that an artificial intelligence detector had flagged 40 percent of the paper as written by a bot. The feedback left her confused, and she turned to Twitter for help.
“I just knew I hadn’t used AI, and it made me worried about the quality of my own work,” Mira, whose last name has been withheld to protect her identity, told The Daily Beast. She ended up discussing it with her professor, telling him that she didn’t know how she could prove she wrote the paper. He agreed to check it again—but she hasn’t heard back from him yet.
This is obviously stressful for her. If the detector continues to flag it at 40 percent, she'll fail the assignment and potentially the whole class. The episode has also made Mira more scared of writing essays in the future. As a non-native English speaker, she finds it makes her struggles with the language all the more difficult.
This is far from the only such incident to have occurred with AI detectors. Since the advent of ChatGPT, students have taken to social media to share their concerns and experiences after being unjustly accused of using chatbots to write their assignments. Multiple students also posted that they ended up dropping classes after being falsely flagged for submitting AI-generated work because of how it impacted both their grades and mental health.
Criticism of this technology has led educational institutions, educators, and experts to wonder what needs to be done about both the rise of chatbots and the tools used to fight them. The exponential rise of generative AI initially sparked concerns about how students would use such tools to generate their work, which is why detectors became common so fast. But the flaws in these detectors suggest that this approach to curbing AI-generated work may have more consequences than benefits for the education system.
Vinu Sankar Sadasivan, a computer scientist at the University of Maryland who co-authored a pre-print paper on the reliability of AI detectors, told The Daily Beast that the fast-paced nature of AI’s growth and adoption has created a number of massive and unexpected challenges—most of which caught educators and students alike completely by surprise.
“The rapid and somewhat unexpected emergence of potent language models, such as ChatGPT, even caught the AI community off guard,” Sadasivan said. “The unregulated utilization of these models indeed presents the risk of malicious outcomes, such as plagiarism.”
He added that the explosion in popularity and hype surrounding AI pushed educational institutions to use AI detectors without completely understanding how they worked, or whether they were even reliable in the first place. This has led to instances in which teachers accused students of plagiarism even when they hadn't cheated, as evidenced by viral posts on Twitter.
The New School of Education
Janelle Shane, an AI researcher and author of the book You Look Like a Thing and I Love You: How AI Works and Why It’s Making the World a Weirder Place, takes a more nuanced view, shaped by her own experience with the detectors. While Shane originally enjoyed using the tools and found it interesting to see how they evaluated text, she told The Daily Beast that she changed her mind after she saw how “these detectors were being used and that false positives weren’t rare.”
“It wasn’t too hard for me to find false positives in my own book as well which I knew I had written myself,” she said.
This becomes particularly problematic when detectors are used in cases that can have major consequences for someone's life, such as accusations of academic dishonesty and plagiarism. After Shane posted her thoughts about AI detectors, she received multiple responses from students sharing their own experiences.
“ChatGPT doesn’t have any memory between sessions but it’ll give you a definitive answer, which is nonsensical,” Shane said of one case a student shared. In that instance, a teacher pasted an assignment into ChatGPT and asked whether the generator had created that content, something ChatGPT has no way to determine.
This becomes even more problematic for neurodivergent students and non-native English speakers like Mira. In fact, a Stanford study published in July 2023 in the journal Patterns found that AI detectors have an obvious “bias” against the latter group. Shane further points out that a similar “bias” has been observed against work by neurodivergent writers.
Rua M. Williams, an associate professor of UX design at Purdue University, recently shared that someone had replied to their email assuming AI had written the message. Williams responded, pointing out that the text probably read that way because Williams is autistic.
“I do think that there is currently an AI panic amongst people, especially instructors, that makes them more suspicious of the authenticity of the words they are reading,” Williams told The Daily Beast. “They are thus more likely to deploy that suspicion against people who naturally use language a bit differently, such as neurodivergent people and English-as-second-language people.”
Alain Goudey, associate dean for digital at NEOMA Business School, also points out that non-native English speakers often find their work falsely flagged because AI detectors’ algorithms work by evaluating a text’s “perplexity,” a measure of how predictable the text is to a language model.
“Common English words lower the perplexity score, making a text likely to be flagged as AI-generated,” Goudey told The Daily Beast. “Conversely, complex or fancier words lead to a higher perplexity score, classifying a text as human-written.”
He added that since non-native English speakers tend to use more straightforward words, their work can end up flagged as AI-generated. For non-native English speakers, who are already doing the extra work of learning a language, this additional burden can be exhausting and puts them at a further disadvantage.
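To make the mechanism Goudey describes concrete, here is a minimal sketch of a perplexity-based check, written against the Hugging Face transformers library with GPT-2 as the scoring model. Commercial detectors are proprietary and more elaborate, and the flagging threshold below is purely hypothetical, chosen only to illustrate the logic.

```python
# A minimal sketch of a perplexity-based AI-text heuristic, using the
# Hugging Face transformers library and GPT-2. Real detectors are more
# sophisticated; the threshold here is purely illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Return the model's perplexity for `text` (lower = more predictable)."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes the model return the average cross-entropy
        # loss over the tokens; exponentiating it gives perplexity.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return torch.exp(loss).item()

# Hypothetical cutoff: highly predictable text gets flagged as AI-written.
THRESHOLD = 30.0

def naive_flag(text: str) -> bool:
    return perplexity(text) < THRESHOLD
```

Run on a paragraph of plain, common-word English, the score drops; run on idiosyncratic prose, it rises. That is exactly why writers who favor simple, textbook vocabulary, including many non-native speakers, are disproportionately flagged by this kind of heuristic.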
This is something that T. Wade Langer Jr., a humanities professor at the University of Alabama Honors College, has recognized among his own students. He told The Daily Beast that rather than immediately believing the detectors, he uses them as a starting point for conversations with his students to hear their side of the story. He doesn’t rule the tools out entirely, mostly because of how prevalent and popular AI generators have become. However, he said that “our policy is a conversation, not failure.”
“Anytime a question of academic misconduct is addressed, there is some strain on mental health,” he said. “This is why educators and administrators must proceed with curiosity rather than judgment, inviting a conversation to understand and discern the truth of one’s academic integrity, rather than pass an outright judgment.”
Without a proper understanding of the consequences, researchers like Sadasivan worry that the long-term impact of these tools will be to stifle creativity and perpetuate further biases. But rather than treating these flaws as reasons to ban or remove the technology, experts are pushing to re-evaluate exactly how it can be used.
AI is progressing at a speed that educational institutions seem unable to keep up with. This has led to a reliance on short-term solutions like AI detectors, but now that their consequences are coming to light, critics are quick to point out the dangers of relying on them. As technology outpaces traditional education, teachers will need to keep up, and this potentially comes at the expense of students. That’s why experts are pushing for a change in how both AI generators and detectors are perceived, and for making sure they aren’t used in a way that causes harm.
“Like any other resource educators use, I think the biggest concern would be to use the resource as a litmus test or definitive standard to judge a student,” Langer said. “It takes more time and effort to have a conversation versus rendering a grade [or] verdict. But instructional integrity demands due diligence, just as much as academic integrity does.”