Is it Real or is it Murderbot?

Generative AI is really creepy. Some people are using it to “write” stories that they pass off as their own. I wondered how well that would work, since my previous tests of ChatGPT suggested its writing skills were poor. I decided to give it another try by comparing its work to one of my own stories.

Here is the story I wrote, a flash fiction piece that was published at Six Sentences.

Katje

He came to Amsterdam every two months or so, and as soon as he dropped his bag at the hotel, he went to Katje. He had found her on his first visit to the red light district, when his only requirements were that the girl was pretty and that she spoke English; as he went from window to window, they were all pretty, but Katje was the first one whose English vocabulary extended beyond a list of available services. She proved to be both compliant and playful, with crooked teeth that subtracted nothing from her charm, and a language learner’s nonstandard phraseology that struck him as borderline satire. When he was in town, he saw her every night, and although she always delighted him, he found the cheap furnishings in her room depressing, but when he asked her to come to his hotel, she laughed and said, “It is not allowed.” This time, he found her window empty, and although she might have been inside with someone, the absence of the pink sofa where she usually posed suggested something amiss, so he knocked on the door of the next cabin, and when the girl opened it, he held out the translation app on his phone to ask, “Where is Katje,” only to get the stunning response, “Married with rich guy in Frankfurt.” With no alternate plan for the evening, he slowly headed for the long way out of the district, thinking he’d stop somewhere for pastries and beer, until he noticed a pretty blonde in a bright green kimono waving at him until he knocked on her door and asked, “Do you speak English?”

With this story in mind, I gave ChatGPT a detailed prompt. I asked for flash fiction, but didn’t include the six-sentence limit. Here’s the prompt: “Write a flash fiction story. It's about a man who visits Amsterdam every two months or so. He goes to the red light district to see a girl named Katje. He met her because he was looking for a pretty girl who could speak English. One night he returns to find that she is gone. He learns that she married a wealthy man from Frankfurt. Disappointed, he starts to walk out of the district, but he sees another girl waving at him, and he goes to talk to her.”

Here's the result.

“The Window”

Every two months, like clockwork, Mark flew into Amsterdam on a Friday evening, checked into a modest hotel near the Singel, and made his way to the red light district. He wasn’t the type people thought of when they imagined men wandering those neon-lit alleys—quiet, middle-aged, well-dressed, and always alone.

He had met Katje nearly two years ago, standing behind a red-lit window on Oudezijds Achterburgwal. He hadn’t known what to say at first, only that he wanted someone pretty who could speak English. She was both, and she smiled when he stumbled over his first sentence. She had a cat tattoo on her shoulder and asked if he liked jazz. He lied and said he did.

From then on, whenever he was in town, he went to see Katje. He brought her stroopwafels once. Another time, a little keychain from New York. He never asked personal questions, and she never offered personal answers, but over time a rhythm grew between them. It wasn't love. It wasn't lust. It was just something that felt like a secret the city held for him.

But tonight, when he turned onto the usual lane and looked toward her window, it was empty. The soft red glow was still there, but a different girl stood behind the glass, indifferent and scrolling through her phone.

He stood there longer than he should have.

“Looking for someone?” a nearby woman asked. She was older, sitting on a stool near the door. She wore a denim jacket and smoked with the lazy confidence of someone who’d seen a thousand men stand and stare.

“Katje,” he said. “She—she used to work here.”

The woman raised an eyebrow. “Ah. Frankfurt man. Rich. Married her two weeks ago. Took her with him.”

Mark nodded slowly. “Good for her,” he said, but it came out hollow.

He turned and started to walk. Past the windows. Past the glass-eyed mannequins of flesh and lace. Past the scent of beer and canal water and cologne.

And then he saw her. Not Katje—but a new girl, standing behind a red-framed window, waving. Not the flirtatious wave of a saleswoman, but something gentler. A curious, almost hesitant gesture.

She looked different. Dark hair, tired eyes, and something cautious in her smile. He hesitated, just a moment, then stepped toward the door.

“English?” he asked.

“A little,” she said. “Enough.”

He looked down at his shoes. Then up at her again.

“Okay,” he said. “That’s enough.”

Then I prompted, “Write a similar story, limited to six sentences.”

Every two months, Mark flew to Amsterdam and found his way to Katje’s window in the red light district—a ritual as steady as clockwork. He had chosen her years ago because she was pretty and spoke English, and over time, their conversations grew longer while the time in bed grew shorter. One night, her window was empty, and an older woman outside told him Katje had married a rich man from Frankfurt and left the city for good. He stood there for a long time, not sure what to do with his hands. As he turned to leave, a girl across the canal waved at him—not seductively, but like someone recognizing something familiar. He crossed the bridge, and when he asked if she spoke English, she smiled and said, “Enough.”

I put the shorter chatbot story into an AI detector called GPTZero. It indicated an 85% probability that the story was AI-generated. For comparison, I put my story into the detector, and was told, “We are highly confident this text is entirely human,” with a 94% probability rating. I tried the longer ChatGPT story, and got an “uncertain” result, with a 52% probability of being AI-generated.

I tried another detector, Quillbot. It considered my story 100% human. However, it also rated both of ChatGPT’s stories 100% human. I got the same disheartening results from Scribbr and Grammarly.

I asked another AI, Gemini, if the shorter chatbot story was human or AI. It said it was likely human. Another bot, Claude, also assessed it as human. My response was to inform Claude that the piece had been generated by ChatGPT. I got this reply:

Thank you for the correction - I was wrong in my assessment. This is a good reminder that AI-generated text has become quite sophisticated and can exhibit many of the qualities I associated with human writing.

Looking back at the passage with this knowledge, I can see how some elements I interpreted as "human touches" could actually reflect advanced AI capabilities: the specific geographical details, the narrative structure with its clear arc, and even the nuanced character moments like Mark not knowing what to do with his hands.

This highlights how difficult it's becoming to distinguish AI-generated content from human writing, especially when the AI is producing well-crafted, emotionally resonant fiction. My confidence in identifying such distinctions was clearly misplaced in this case. I appreciate you sharing this example - it's a useful calibration for understanding the current capabilities of AI writing tools.

It was interesting that the bot identified Mark not knowing what to do with his hands as a "nuanced character moment". I had thought it a clumsily inserted cliche.

I asked ChatGPT the same question, wondering whether it would recognize its own work. Apparently it did not; it said the story could be either human or AI, but it was "leaning human." Then it asked, "Would you like me to run a deeper analysis, or rewrite it in a clearly human or clearly AI style for comparison?" I asked for a more clearly human version and got this:

Every couple of months, Mark flew to Amsterdam and walked the same narrow streets until he reached Katje’s window. She was the first girl he’d found who was both beautiful and spoke English, and over time, they’d started talking more than anything else. He liked that. Then one night, her window was dark, the chair inside empty. An older woman sweeping outside told him Katje had married some rich guy from Frankfurt. Mark just stood there for a while, staring at the glass, hands in his coat pockets like maybe she’d come back. When he finally turned to go, a girl across the canal gave him a small wave—more curious than flirty. He hesitated, then crossed over and asked if she spoke English. She nodded and said, “A little,” like it was a question.

Then I requested a more clearly AI version and got this:

Every two months, Mark traveled to Amsterdam and visited Katje’s window in the red light district. She was attractive and spoke fluent English, which is why he had chosen her originally. Over time, their interactions became more conversational and less physical. One evening, when he arrived, her window was empty. A woman nearby informed him that Katje had married a wealthy man from Frankfurt and left the city. As Mark turned to leave, another girl across the canal noticed him and waved. He approached her and asked if she spoke English. She replied, “Enough.”

There may not be a good way to detect AI writing. In research papers and journalism, it is revealed when it "hallucinates" imaginary facts and provides incorrect citations. Identifying its use in fiction is much harder. One could point to problems such as the use of obvious cliches, non sequiturs, and near-nonsense sentences. But those are flaws often found in human writing. Writing that is bland, that relies on short, simple sentences, that seems cliched, and that lacks a sense of psychological or emotional reality might be AI, or it might just be a human who isn't a good writer.

When I taught English, we had the students write their essays in class, to avoid plagiarism. (Some of them still came up with ways to present plagiarized work.) In an academic setting, supervision could be one way to keep students from using AI. But in the wild, it is going to be very difficult to stop lazy, unscrupulous writers with no confidence in their own abilities from using bots to do the work. Editors currently ask writers to confirm that they have not used AI. Relying on an honor code tends to have very mixed results. I would expect that most writers would prefer not to take credit for someone (or something) else's work. It's hard to feel proud of something you didn't really do. However, we know that there are many cheaters in the world who don't feel at all bothered when they succeed through dishonesty.

Going forward, I have an idea for systems that could be used to identify AI-generated work, similar to the automated systems that are currently used to look for plagiarism. Texts generated by AI bots would be stored in a monstrous database. Teachers and editors could submit texts to a checking system to see if there is a likely match. This is how plagiarism checkers work. A system like that would require cooperation among the owners of the various AI models, and I wonder how likely that is.
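To make the plagiarism-checker analogy concrete, here is a minimal sketch of how such a lookup might work, fingerprinting each stored text with word n-gram "shingles" and scoring a submitted text by Jaccard similarity. The GeneratedTextIndex class, the sample label, and the 0.3 threshold are my own hypothetical illustrations, not any existing tool; a real system would need vastly more scalable fingerprinting (MinHash, locality-sensitive hashing) and, as noted above, the cooperation of the model owners who would have to log their outputs.

import re


def shingles(text, n=5):
    """Return the set of lowercase word n-grams ("shingles") in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


class GeneratedTextIndex:
    """Toy stand-in for the 'monstrous database' of AI-generated texts."""

    def __init__(self, n=5):
        self.n = n
        self.entries = []  # list of (label, shingle set) pairs

    def add(self, label, text):
        """Store the fingerprint of a text that a model has emitted."""
        self.entries.append((label, shingles(text, self.n)))

    def check(self, text, threshold=0.3):
        """Return (label, similarity) pairs for stored texts that look like matches."""
        query = shingles(text, self.n)
        matches = []
        for label, stored in self.entries:
            union = query | stored
            if not union:
                continue
            similarity = len(query & stored) / len(union)  # Jaccard similarity
            if similarity >= threshold:
                matches.append((label, similarity))
        return sorted(matches, key=lambda m: m[1], reverse=True)


if __name__ == "__main__":
    index = GeneratedTextIndex()
    index.add("chatgpt-session-1",
              "Every two months, Mark flew to Amsterdam and found his way to Katje's window.")
    # A teacher or editor submits a suspect text; a verbatim reuse scores 1.0.
    print(index.check("Every two months, Mark flew to Amsterdam and found his way to Katje's window."))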

 
