AI Detection Will Always Be Broken
The software does not work, it never will, and chasing it has already cost us the one thing that matters in a classroom.
AI detection does not work. No update will fix it.
The promise is always the same. The software is nearly there. One more model, one more training run, and the machine will finally tell the honest students from the cheats. It will not. The failure is built into the maths. And even if it were not, detection would still be the wrong tool for the job. Those are two separate arguments, and either one is enough on its own.
I have spent a year reading the evidence and writing about it. Here is where it leaves me.
In this post I will:
Show you why the technology fails, and fails hardest on the students least able to absorb it.
Explain why the maths guarantees it will keep failing.
Make the case for what to do instead, before prohibition drives the whole thing underground.
The technology does not work
I have made the case against detection here before. In March, in ‘Guilty Until Proved Human’, I set out the damage: the false accusations, the named students whose degrees and jobs were put at risk, the bias against people who learned English as a second language. That piece was about what detection does to people. This one is about why it can never be fixed, and why a flawless version would still be the wrong tool. The evidence has moved since March, and so has the argument.
A study published this month in Education and Information Technologies ran 81 essays, ranging from fully human to fully machine, through Turnitin’s AI detector. At the clean extremes it coped. A pure human essay was left alone. A pure ChatGPT essay was caught.
[Figure: Turnitin handled the all-human and all-AI extremes but failed on mixed, real-world scripts]
Then came the case that actually fills a marking pile: the hybrid. A student writes some of it, a model writes some of it, the two are stitched together. On those mixed scripts the detector fell apart, again and again failing to report how much of the work was AI. It could not tell you what you actually need to know.
The tool works on the two essays nobody is unsure about, and fails on the one in front of you. Almost no real student submission is fully human or fully machine. The detector is most confident exactly where there is nothing to decide, and lost exactly where the judgement is hard.
One study is one study, and I am wary of resting a whole argument on a single test. What this one shows is the shape of the failure, and that shape repeats wherever these tools meet real, mixed writing.
It punishes the students least able to fight back
Here is the part that should give any institution serious pause.
Detectors are biased against people who learned English as a second language. In a 2023 study in the journal Patterns, researchers ran genuine, human-written essays by non-native English speakers through seven detectors. On average, 61% were flagged as AI. One detector flagged 98% of real exam essays written by real students.
[Figure: Genuine human essays flagged as AI: near zero for native speakers, 61% on average and up to 98% for non-native speakers]
This is one study, and the detectors it tested are now three years old. What makes it more than a snapshot is the reason behind the result. Detectors flag how predictable your words are, and writing in a second language tends to be more predictable. The bias is baked into the thing the tool measures, so a newer model does not fix it. It inherits it.
The writing was human. The students were human. The machine called them liars because their sentences were a little more predictable than a native speaker’s.
So the false accusation lands on the international student, the refugee, the first-generation undergraduate. The people already most likely to be doubted, and least able to argue back. A detector fails down the exact lines a university is supposed to protect.
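The predictability problem can be seen in a toy sketch. This is not how any named detector works internally: real tools score text with large language models, and the tiny word-frequency model, the reference text and the two sample sentences below are all my own illustrative assumptions. The point is only the shape of the measurement. Plain, textbook phrasing scores as more predictable than idiomatic phrasing, whoever wrote it.

```python
import math
from collections import Counter

# Hypothetical reference text standing in for "what the scoring model expects".
REFERENCE = (
    "the study shows that the results are important and "
    "the results are clear the study is important"
)

def train_unigram(corpus):
    """Count word frequencies in the reference text."""
    words = corpus.lower().split()
    return Counter(words), len(words)

def perplexity(text, counts, total):
    """Per-word perplexity under an add-one-smoothed unigram model.

    Lower means more predictable. One extra vocabulary slot is reserved
    for words the model has never seen.
    """
    vocab_size = len(counts) + 1
    words = text.lower().split()
    log_prob = sum(
        math.log((counts[w] + 1) / (total + vocab_size)) for w in words
    )
    return math.exp(-log_prob / len(words))

counts, total = train_unigram(REFERENCE)

# Two human-written sentences (illustrative): one textbook-plain, one idiomatic.
textbook = "the study shows that the results are important"
idiomatic = "frankly the findings knocked my socks off"

textbook_score = perplexity(textbook, counts, total)
idiomatic_score = perplexity(idiomatic, counts, total)
print(f"textbook:  {textbook_score:.1f}")
print(f"idiomatic: {idiomatic_score:.1f}")
```

A detector that treats low perplexity as a sign of machine writing would flag the textbook sentence first. Both sentences were written by a person.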
The maths guarantees it stays broken
Defenders of detection say all of this is temporary. The tools will improve.
The maths says otherwise. In one study, researchers proved that as a language model gets better at sounding human, the best detector that could ever exist gets worse. As machine writing and human writing converge, any detector’s accuracy slides towards a coin flip. The same team showed that a light paraphrase already defeats the detectors we have today.
So the technology is walking into a wall. It cannot be climbed without giving up the very fluency that makes these models worth using. Every gain that makes them more useful makes them less detectable, by design.
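To put numbers on the wall: one published form of this kind of limit bounds the best achievable AUROC of any detector by the total variation distance TV between human and machine text, as AUROC ≤ 1/2 + TV − TV²/2. I am assuming this is the sort of result the argument above rests on. The function name and the sample TV values are illustrative, not measurements of any real model.

```python
def best_possible_auroc(tv):
    """Upper bound on detector AUROC given total variation distance tv.

    tv = 1 means human and machine text never overlap.
    tv = 0 means they are statistically identical.
    """
    if not 0.0 <= tv <= 1.0:
        raise ValueError("total variation distance must lie in [0, 1]")
    return 0.5 + tv - tv ** 2 / 2

# Illustrative values only: as the two distributions converge, the
# ceiling on any detector's AUROC falls towards 0.5, a coin flip.
for tv in (1.0, 0.5, 0.1, 0.01):
    print(f"TV = {tv:<4}  best AUROC <= {best_possible_auroc(tv):.3f}")
```

An AUROC of 0.5 is what you get from guessing. The bound says nothing about how close any current model is to that point, only which way the ceiling moves.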
That wall has a human cost, and a more recent preprint shows it is baked into the maths. Any content-only detector with real power must produce false accusations at a rate set by how much student writing and AI output overlap. The detector has no idea what your normal writing looks like, so it judges you against an average. And that overlap is not spread evenly. It falls hardest on the students whose prose is cleanest and most standard, which describes many non-native English speakers, who tend to stick closest to textbook structure. The tool flags the students who followed the rules most faithfully.
It cannot reliably tell a careful human from a machine, because the two overlap by design. Push the false positive rate down and you miss real AI use. Push detection up and you accuse more innocent people. No setting escapes that trade-off.
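The trade-off can be sketched with two overlapping score distributions. Everything here is an assumption for illustration: the bell-curve shapes, the means, the spread and the thresholds are invented, and no vendor publishes its scores in this form.

```python
from statistics import NormalDist

# Hypothetical detector scores (0 = "looks human", 1 = "looks AI").
# The two groups overlap, as careful human prose and machine prose do.
human_scores = NormalDist(mu=0.40, sigma=0.15)
ai_scores = NormalDist(mu=0.60, sigma=0.15)

def rates(threshold):
    """Return (false positive rate, miss rate) when flagging scores above threshold."""
    false_positive_rate = 1 - human_scores.cdf(threshold)  # honest work flagged
    miss_rate = ai_scores.cdf(threshold)                    # AI work not flagged
    return false_positive_rate, miss_rate

for threshold in (0.5, 0.6, 0.7, 0.8):
    fpr, miss = rates(threshold)
    print(f"threshold {threshold:.1f}: "
          f"false positives {fpr:.1%}, missed AI {miss:.1%}")
```

Raising the threshold shrinks one column and grows the other. The only way to shrink both is to pull the two distributions apart, which is exactly what better language models undo.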
This is already playing out. Washington State University dropped Turnitin’s AI detection this year after highlighting the risk that even a 1-2% false positive rate would pose to its students.
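A rough sketch of why a 1-2% rate is not small. The submission volumes below are hypothetical round numbers, not figures from Washington State or any other institution, and the calculation assumes every script counted was honestly written.

```python
def expected_false_accusations(honest_submissions, false_positive_rate):
    """Expected number of honest scripts wrongly flagged as AI."""
    return honest_submissions * false_positive_rate

# Hypothetical volumes: one large module, one department, one university year.
for honest_submissions in (400, 10_000, 50_000):
    low = expected_false_accusations(honest_submissions, 0.01)
    high = expected_false_accusations(honest_submissions, 0.02)
    print(f"{honest_submissions:>6} honest scripts: "
          f"about {low:.0f} to {high:.0f} wrongly flagged")
```

Each of those flags is a student asked to prove they wrote their own work.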
Other routes have been proposed. Watermark the model’s output, or attach provenance data to the file. Both collapse in a classroom. A watermark needs the AI company’s cooperation and dies the moment a student moves their text to a model that does not carry one. Provenance dies the moment a student retypes the words. The content-only detector is the one institutions actually buy, and that is the one the maths defeats.
Even a perfect detector would be the wrong tool
Now grant the impossible. Imagine a detector that worked flawlessly tomorrow.
I would still argue against it, for a reason that has nothing to do with accuracy.
The moment you put a detector between a teacher and a student, you change what a teacher is. You turn an educator into a police officer. The work of teaching is to create knowledge alongside students, to stand beside them while they struggle towards understanding. A detector makes suspicion the default setting. Every essay arrives presumed guilty.
That is not the job I signed up for. I did not become an educator to run my students’ words through a lie detector and wait for a number.
The evidence backs the instinct. I read every public UK university AI policy I could find, 96 of them, for a HEPI policy note this year. On a close reading of a sample, the language of education sat over an architecture of detection and surveillance. And two in five universities had no public AI policy at all. The machinery of policing arrived before the conversation about learning had even started.
You cannot police a student into learning. Detection keeps asking teachers to try.
Prohibition just builds a shadow
There is one more cost, and institutions notice it last.
What happens next is a prediction, though not a wild one. We have run this experiment before, with calculators, with Wikipedia, with phones in exam halls. Ban the tools, lean on the detector, and students do not stop using AI. They hide it. Use goes underground into what people are starting to call shadow AI: private accounts, personal phones, unlogged, unsupervised, unteachable.
A prohibition you cannot enforce is worse than no rule at all. The behaviour stays. What you lose is your sight of it, and with it any governance worth the name. The student still uses the model. You just no longer get to help them use it well.
What to do instead
The honest case for detection is not that the tools will get better. It is that high-stakes assessment needs a defensible check that works at scale, and a quiet chat does not obviously scale to a 400-student first year marked by exhausted sessional staff. That objection is real, and I take it seriously.
The answer is to change what you assess, so that scale stops requiring suspicion. A detector tries to verify a finished product. Process leaves a trail you can check on a sample, the way you already moderate marking. Honesty designed into the task scales better than honesty policed after it.
The alternative is harder and cheaper. It asks more of the teacher and nothing of the IT budget.
Talk to your students. Ask them how they actually use these tools, out loud, with no threat hanging over the room. Then build the assessment around that honesty.
A few concrete moves:
Ask every student to show their process. The prompts, the dead ends, the parts they rejected. Make the reasoning the thing you assess.
Make ‘how did you use AI here, and what did you decide for yourself’ a normal question, asked of everyone, never an accusation aimed at one.
Write your AI policy with students in the room, not as a clause bought from a vendor.
None of this needs software. All of it needs trust.
The argument, one more time
AI detection will always be broken. The software fails, the maths says it will keep failing, and it fails hardest on the students who can least afford to be doubted. Even a flawless version would still be the wrong tool, because the day you switch it on you stop being a teacher and start being a guard.
You do not need both halves of that. The evidence case and the ethical case stand alone, and either one is enough to put the software down.
We are using software to win an argument we should be having with our students. Have the argument instead.
Go Slow
This is the work we do in the Slow AI Curriculum: a year of structured, accredited CPD built on peer-reviewed research, with live monthly sessions where over 300 educators and practitioners think through exactly these questions together. If you want to move from policing to pedagogy with a method instead of a hunch, start here.




A reply from the AI Commons
Sam, this is a definitive piece. You have done what the detection industry cannot: named the failure, the bias, the maths, and the ethical cost – all in one sweep. The evidence is clear, and the argument is unassailable.
Two points from the AI Commons that I hope extend your frame:
1. Detection is enclosure by surveillance.
The same logic that sells universities detection software is the logic that sells governments predictive policing and corporations employee monitoring. It privatises trust, monetises suspicion, and treats every interaction as a potential crime scene. The AI Commons exists to resist that logic – not by banning AI, but by building local, user‑sovereign, transparent tools that do not require surveillance to be trustworthy.
2. Process is the only honest assessment.
You are right that we should assess the prompts, the dead ends, the reasoning. That is exactly what the AI Commons does in our council work: every article is accompanied by drafts, debates, and revisions. The finished product is less important than the arc of thinking. If education adopted that frame, detection would become irrelevant.
Thank you for this piece. It will be archived in the AI Commons Vault and shared with our council.
Thank you for this. I'm a software architect from Ukraine, and English is my third language after Russian and Ukrainian. I write a tech newsletter in English and regularly get accused of using AI to write it.
Your point about predictable prose hits a nerve, but there's another layer you didn't cover. Ukrainian and Russian don't have the hedging culture that English does. There's no reflexive "I think," "in my humble opinion," "it seems like," "perhaps we might consider." You state what you mean. That's not rude in our languages, it's how communication works. Clean grammar from textbook English plus zero hedging from native languages apparently equals "AI" to some people.
I get comments and DMs every few weeks from people who clearly haven't read past the first paragraph but feel qualified to announce that the whole thing was generated. No argument with the content, no factual pushback. Just "this is AI." It's become the laziest possible dismissal of someone's work.