“Mr. and Mrs. Dursley, of number four, Privet Drive, were proud to say that they were perfectly normal, thank you very much. They were the last people you’d expect to be involved in anything strange or mysterious because they just didn’t hold with such nonsense.”
No, I’m not diving into the nostalgic world of Harry Potter. Instead, I’m using this classic opening to make a point.
After AI detectors infamously labelled the US Constitution as AI-generated, I decided to have some fun with it. And lo and behold, the algorithm looked at one of the most beloved literary works and said, “Nah, too good to be human.” So apparently, J.K. Rowling’s masterpiece was penned by a robot.
*Cue the eye rolls*
Now, unless Rowling had a time machine, she could hardly have used AI to pen a classic back in ’97.
But the results from these AI detectors are both absurd and mildly interesting. Where do these AI detectors miss the mark? And how can we, mere humans, do a better job at accurately identifying AI-generated content? Let’s dive into the world of AI detectors to answer these questions.
Understanding AI Detectors
When ChatGPT was released to the public for free, it caused a commotion. Students worldwide were fascinated with the idea of using AI to complete their assignments. Alarmed by this misuse of AI and its potential to kill jobs, Edward Tian, a senior at Princeton University, decided to create a new app, GPTZero. It is an AI detector tool that uses ChatGPT against itself to determine the authenticity of content.
“The motivation was that there was so much buzz around ChatGPT at the time, and someone needed to make a tool to detect whether something was AI or human generated,” Tian explains.
AI detectors use advanced machine learning algorithms to analyse patterns, phrases, and other nuances to determine whether a text is AI-generated or human-written. But there’s a catch. No tool out there can guarantee 100% accuracy. Why? Because they run on probabilities.
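To see why “running on probabilities” rules out guarantees, here is a deliberately crude sketch in Python. This is my own illustration, not how GPTZero or any real detector works; real tools score text with language-model signals such as perplexity and burstiness. But the shape of the problem is the same: compute one statistic, compare it to a threshold, and accept that any threshold will misfire on some human writing.

```python
from collections import Counter

def ai_likelihood_score(text: str) -> float:
    """Toy 'detector': scores text by how repetitive its word
    distribution is, a crude stand-in for the low-variation
    signal real detectors look for."""
    words = text.lower().split()
    if len(words) < 2:
        return 0.0
    counts = Counter(words)
    # Fraction of the text made up of words that occur more than once.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(words)

# A threshold turns the score into a verdict -- and no threshold
# separates humans from machines cleanly.
sample = "the cat sat on the mat the cat sat"
verdict = "AI?" if ai_likelihood_score(sample) > 0.5 else "human?"
```

Swap in a more sophisticated statistic and a bigger model and you have the same architecture as a commercial detector, with the same fundamental limitation: it outputs a probability, not a proof.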
The Problem with AI Detectors
“GPTZero is not foolproof,” says Tian. “We put a disclaimer not to make any definitive academic decisions off the data.”
However, the plight of AI detectors does not just affect students. It also takes a toll on writers.
A fellow writer took to Reddit to share his experience with AI detectors and clients. “The fun part is that each detector has its own opinion of how much of my content is AI-written. One says 69%, the other says 50, and yet another says 12. I spent 2 hours trying to ‘humanise’ my ALREADY HUMAN work to appease AI,” he explained, highlighting the inconsistency and frustration these tools can cause.
Moreover, according to research by Stanford scholars (published via Cornell University’s arXiv), GPT detectors consistently misclassify non-native English writing samples as AI-generated while correctly identifying writing from native speakers. The same study found that simple prompting techniques can both reduce this bias and trick the detectors outright. In other words, GPT detectors can unfairly penalise writers with a more limited linguistic range.
Data company Appen also conducted a benchmarking experiment for AI detectors. It selected several popular APIs for the experiment, including GPTZero, Sapling AI, and OpenAI’s GPT-2 detector. The results revealed that while some tools performed better than others in certain areas, none of them met the expected benchmark of 95% accuracy. The false positive rates of these tools ranged from 16.67% to a whopping 70%.
In fact, OpenAI withdrew its own AI detection tool in 2023 due to its low accuracy rate. It correctly flagged AI-written text only 26% of the time, while mislabelling human writing as AI 9% of the time.
These studies draw attention to one significant yet often overlooked aspect: AI detectors are unreliable.
“I like to see them as moody teenagers who decide when they want to mess with you. Sometimes, they might highlight your text in that jarring red colour, while other times, you’ll see the same text passing as human-generated content,” explains my colleague Akshada Scott.
So, should you accept every piece of content out there as human-written? Absolutely not. There are several other ways to check if something is AI-generated or human-written. Let’s explore them in detail.
How to Detect AI Content Without Using an AI Detector
Now that we’ve established that AI detectors aren’t always spot on at detecting AI-generated content, here are some clues to watch out for that could suggest a text is written by a bot:
1. Repetitive Writing
AI-generated text often contains repeated phrases. These models generate text based on patterns learned from vast amounts of data and, therefore, cannot spot redundancy the way a human writer can.
Don’t just take my word for it. OpenAI admits to this limitation in its own blog: “The model is often excessively verbose and overuses certain phrases, such as restating that it’s a language model trained by OpenAI. These issues arise from biases in the training data (trainers prefer longer answers that look more comprehensive) and well-known over-optimisation issues.”
AI tools anchor their output to the keywords in the prompt and tend to repeat them throughout. So, if a word or phrase appears too often in a text, chances are it is AI-generated.
For example, an AI-generated article on productivity apps might work the phrase “boost your productivity” into nearly every paragraph.
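You can even check this signal yourself. The toy Python sketch below (my own illustration, not any detector’s actual code) surfaces the most heavily repeated multi-word phrases in a text:

```python
from collections import Counter

def top_repeated_phrases(text: str, n: int = 3, min_count: int = 2):
    """Return every n-word phrase that recurs at least min_count
    times, most frequent first."""
    words = text.lower().split()
    ngrams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    return [(p, c) for p, c in counts.most_common() if c >= min_count]

sample = ("Boost your productivity with these apps. These apps will "
          "boost your productivity every day, because nothing helps you "
          "boost your productivity like the right apps.")
print(top_repeated_phrases(sample))
# The phrase "boost your productivity" tops the list with 3 hits.
```

A high count isn’t proof of AI authorship, of course; it’s just one clue to weigh alongside the others below.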
2. Generic Information
Whether you’re analysing an article or a student’s assignment, a simple way to determine if it’s written by AI is by evaluating the depth of information. AI tools generate text based on generic information from the internet. This text might seem very well-written, but it is mostly just a bunch of filler words strung together that add no real value.
Human-written content, on the other hand, is more sophisticated, with data, quotes, and insights to support the arguments.
Here’s what generic information from an AI tool looks like:
- Which foods should you include in your diet?
- What food options should you avoid?
- Is there an ideal time to consume or avoid these foods?
This type of content lacks value and adds the extra burden of research on the reader.
According to digital marketing expert Neil Patel, AI-generated content lacks DRIVE (data, review, insights, visuals, and energy). “Without DRIVE, AI will just crank out content that no one will read. It’s why human-written content outranks AI-written content 94.12% of the time.”
3. Inaccurate Facts
AI tools are notorious for generating information out of thin air. For example, when I asked ChatGPT for statistics on the success rate of AI detectors, it returned multiple studies on accuracy rates, false positives and negatives, and detection challenges.
However, it didn’t specify the names of the studies or the sources of the data. Moreover, these AI models have limited knowledge up to a specific month or year and cannot generate relevant information beyond it. So, anytime you come across inaccurate or outdated content, it might be AI-generated.
Detecting AI Content Smartly
AI is evolving rapidly with constant upgrades and advancements. As such, AI tools are increasingly mimicking human writing, making it difficult to detect. While AI detectors offer some help, their reliability is questionable.
“The debate of AI vs. human-generated content is farcical at this point. Many marketers worry they’ll be penalised for using AI. In reality, Google doesn’t penalise AI content but rather content that fails to add value. The recent Google algorithm leak reinforced that EEAT remains the primary ranking factor. This means content must provide value and effectively resolve the user’s query. That’s how we should evaluate content, not by whether it was created by a bot or a human. Currently, AI content alone cannot add significant value and is simply a useful tool in the hands of a skilled writer,” says Ukti’s founder, Deepika Pundora.