One day, Michael's main client informed him that they had started to use an AI detector, and the results were supposedly damning: his most recent articles were flagged with a 95% likelihood of being AI generated. The client then ran all of his previous articles through the tool, many written before ChatGPT was even widely available, and notified Michael that every one of them showed a 65-95% likelihood of being AI generated. They terminated his contract with immediate effect, a decision based solely on a single number (or range) that the AI detector spat out.
Michael tried everything he could to prove his articles were not AI generated. He even gave his client access to the full Google Docs history and walked them through his writing process, all edits included. But the seed of doubt the AI detector had sown was too strong. Michael lost his main client, and with it most of his income.
A number of things about this story are problematic, and I’d like to go over them one by one:
1. The accuracy of general AI detectors is questionable
General AI detection is flawed. Period. It’s not like the detection works almost all the time and Michael's case is one of the few very unfortunate outliers. No, false positives are the norm when it comes to general AI detection.
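A quick back-of-the-envelope calculation shows why false positives dominate in practice. The numbers below are made up for illustration, not taken from any real detector, but the math (Bayes' rule) holds for any detector: when most of the text a client checks is actually human-written, even a detector with a modest false-positive rate will wrongly flag a large share of honest writers.

```python
def positive_predictive_value(sensitivity, false_positive_rate, prevalence):
    """Probability that a flagged article really is AI generated (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Hypothetical assumptions: the detector catches 90% of AI text,
# wrongly flags 10% of human text, and 20% of submitted articles
# are actually AI generated.
ppv = positive_predictive_value(0.90, 0.10, 0.20)
print(f"Chance a flagged article is really AI generated: {ppv:.0%}")
```

Under these assumed numbers, roughly 3 in 10 flagged articles would be written by humans. A client treating a flag as proof, the way Michael's client did, is ignoring this base-rate effect entirely.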
Even OpenAI themselves stopped offering their own detector for this very reason:
"The AI classifier is no longer available due to its low rate of accuracy.
" Open AI, creator of ChatGPT, on their detection