
December 19, 2025

Link Haul - YouTube Edition

Hello! December was a very busy and personally important month for me. While I am still working on some very big stuff behind the scenes, nothing is quite ready for publishing. What is ready are some interesting finds I've made. We've got a wide variety of languages covered today!

If you're a millennial, you may recall the direct-to-video sequel "The Lion King 2," which featured a song titled "Upendi." One of the characters asks whether "upendi" means love, though she doesn't receive a response. I looked it up and it turns out that "upendo" is the Swahili word for "love", but "upendi" is not a Swahili word:

We need some basic grammar here. The Swahili root -pend- means "like, love". The root -pind- means "bend". The most basic form of a verb is the imperative singular, which is just the root and the most general verb ending -a, so penda means "love/like!" and pinda means "bend!" (addressed to one person). The infinitive prefix is ku-, so you might also see verbs listed with that: kupenda "(to) like/love".

The imperative plural changes the ending -a to -eni: so pendeni "love/like!" (addressed to two or more).

Subjects are indicated by prefixes: nipenda "I love/like", tupenda "we love/like". The second person singular prefix is u-: upenda "you (sg.) love/like". This is the simple or general present, which has no tense marker. Other tenses have a prefix between the person and the verb: ninapenda "I am liking", nilipenda "I liked", etc.

The negative is basically formed with the prefix ha-. The verb ending changes to -i, so tupenda "we love/like", hatupendi "we do not love/like". But there is some merger: hupendi "you (sg.) do not like/love" (for ha-u-), and sipendi "I do not like/love" (for ha-ni-). Note this is a second verb form that can end in -i; the imperative plural (above) is another. As Wazona says, this hupendi is the closest verb form to a supposed 'upendi'. (Or the subjunctive upende "that you may love".) I don't know where The Lion King got that word.

Nouns are mostly formed with prefixes, and abstract nouns usually have the prefix u-. Thus a word beginning with u- could be either an abstract noun or a second person singular verb. When verb roots are made into nouns they often take a different vowel suffix too. Thus the noun "love" is upendo (-o is a common suffix for actions), and the noun "bow" (the weapon) is upinde.
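
If it helps to see the pieces assembled, here is a toy Python sketch of the forms described above. The function names and the tiny rule table are my own, purely for illustration, and it only covers the handful of forms mentioned in this post:

# Toy sketch of the handful of Swahili forms described above.
# The rule table only covers what's in this post; real Swahili
# morphology is much richer than this.

SUBJECT = {"I": "ni", "you (sg.)": "u", "we": "tu"}
# Negative subject prefixes, with the mergers noted above:
# ha- + ni- -> si-, ha- + u- -> hu-, ha- + tu- -> hatu-.
NEG_SUBJECT = {"I": "si", "you (sg.)": "hu", "we": "hatu"}
TENSE = {"general": "", "progressive": "na", "past": "li"}

def imperative(root, plural=False):
    return root + ("eni" if plural else "a")

def infinitive(root):
    return "ku" + root + "a"

def present_or_past(root, person, tense="general"):
    return SUBJECT[person] + TENSE[tense] + root + "a"

def negative_present(root, person):
    # Negatives swap the final -a for -i.
    return NEG_SUBJECT[person] + root + "i"

def abstract_noun(root, final_vowel):
    # Abstract nouns take u-; the final vowel varies by root.
    return "u" + root + final_vowel

print(imperative("pend"))                     # penda     "love/like!"
print(imperative("pend", plural=True))        # pendeni   (to two or more)
print(infinitive("pend"))                     # kupenda   "(to) like/love"
print(present_or_past("pend", "you (sg.)"))   # upenda    "you (sg.) love/like"
print(present_or_past("pend", "I", "past"))   # nilipenda "I liked"
print(negative_present("pend", "you (sg.)"))  # hupendi   "you (sg.) do not like"
print(negative_present("pend", "I"))          # sipendi   "I do not like"
print(abstract_noun("pend", "o"))             # upendo    "love"
print(abstract_noun("pind", "e"))             # upinde    "bow"

The point of the exercise: "upendi" doesn't fall out of any of these rules, which is presumably why it isn't a Swahili word.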

For more etymological fun, I recommend this video I found on the etymology of "Lloegyr", the Welsh name for England. Long story short, the popular folk etymology that it means "lost lands" is false. Watch the video to see how that etymology spread anyway, and the danger of thinking that plausible = probable.


And rounding off the fun linguistics videos, here's one by comedian/software programmer Ólafur Waage on whether Icelandic is just the Scandinavian languages with an extra -ur on the end. He discusses how an Icelandic movie that became famous in Sweden led to Swedes becoming familiar with the phrase "Þungur hnífur" (lit. heavy knife). In Swedish, "heavy knife" is "tung kniv": both words are cognate with the Icelandic ones, which just add -ur. Ólafur investigates how deep the -ur connection goes, using the power of Python.
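
If you want to poke at the -ur question yourself, here's a rough Python sketch of the kind of comparison you could run. To be clear, this is not Ólafur's actual code, and the word list (beyond the post's þungur/tung and hnífur/kniv) is my own:

# Toy comparison: strip a final -ur from Icelandic words and line the
# stems up against Swedish cognates. Not Ólafur's script.
pairs = [
    ("þungur", "tung"),   # heavy
    ("hnífur", "kniv"),   # knife
    ("hestur", "häst"),   # horse (extra pair, added by me)
]

def strip_ur(word):
    return word[:-2] if word.endswith("ur") else word

for icelandic, swedish in pairs:
    stem = strip_ur(icelandic)
    # A plain equality check misses regular sound correspondences
    # (þ ~ t, hn ~ kn, í ~ i), so eyeballing the stems is more honest here.
    print(f"{icelandic} -> {stem}  |  Swedish: {swedish}  |  identical: {stem == swedish}")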


That's all for now! Thank you for being with me this final month of 2025.

- Karen

September 24, 2025

Billy Joel

Today's topic is Billy Joel and the inconsistent appearance of the New York accent in his music. Billy Joel, being born and raised in New York, has some New York-ish features in his speech, such as a COT-CAUGHT distinction. Notably, his interviews don't seem to show any non-rhoticism.

  • The Beatles for a l[ɔ]ng time (source)
  • Paul McC[ɑɹ]tney specifically, as a melody writ[ɚ] (source)
  • Who th[ɔ]ght (source)

In his own music, he tends to use the non-rhotic accent that was typical of working-class New Yorkers. The stressed 'er' vowel, as in 'bird', is always rhotic, but other cases of Vr have the r dropped.

    "Movin' Out (Anthony's Song)"
  • Ah but workin' too h[ɑ]d can give you a h[ɑ]t attack
  • S[ɑ]geant O'Leary is walkin' the beat
  • At night he becomes a bartend[ə]
    "We didn't start the fire"
  • M[ɑ]ciano
    "Still Rock and Roll To Me"
  • What's the matt[ə] with the c[ɑ] I'm driving?

He has some other features typical of New York English in his songs, such as the MARRY-MERRY distinction and the LOT vowel in words like 'orange'. He avoids a diphthongized THOUGHT vowel in his music despite using it in his own speech. This suggests to me that some features are automatic for him: he probably doesn't think of the MARRY-MERRY distinction as an 'accent' thing, but he is probably aware that the diphthongized THOUGHT vowel is viewed negatively outside of New York. Non-rhoticism seems to occupy a middle ground, recognizably New York but not stigmatized, so he plays it up as part of a working-class-but-not-too-working-class New Yorker persona.

  • And a bright [ɑ]range pair of pants?
  • M[æ]rilyn Monroe
  • "H[æ]rry Truman, D[ɑ]ris Day"

A major exception to his non-rhotic vowels is the song "Uptown Girl", where he uses rhoticized vowels basically every chance he gets. The song is an homage to Frankie Valli and the Four Seasons. Valli uses rhoticized vowels, but never to the degree Joel does here: Joel comes in hard and really lengthens the 'errrr', more than he does in his own speech. It's curious to me, since he sounds more like a 'downtown boy' with his usual sung accent than he does here, but perhaps he associated exaggerated rhoticism with the music of his youth.

    "Uptown Girl"
  • Uptown g[ɚ]l
  • I bet h[ɚ] mamma never told her why
  • I bet she's nev[ɚ] had a backstreet guy
  • She's been living in h[ɚ] white-bread w[ɚ]ld

July 15, 2025

The Stylistics of ChatGPT

Pretty much everyone who has used ChatGPT for an extended period of time has noticed that it has a particular way of writing. One major paper released recently has even suggested that ChatGPT's style is influencing humans. That study looked at words that are especially common in ChatGPT's output, such as 'delve', 'meticulous', and 'realm'. 'Dive' is also common in AI-generated copy, as seen in this video on AI-generated children's books.
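
You can get a crude sense of this yourself by just counting how often the usual suspects turn up in a text. Here's a quick Python sketch; the marker list and the sample sentence are my own, not taken from the study:

# Crude marker-word counter: tally a few "ChatGPT-ish" words and report
# how often they appear per thousand words.
import re
from collections import Counter

MARKERS = {"delve", "meticulous", "realm", "dive", "tapestry"}

def marker_rate(text):
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in MARKERS)
    per_thousand = 1000 * sum(counts.values()) / max(len(words), 1)
    return counts, per_thousand

sample = "Let's delve into the realm of meticulous research and dive deep."
counts, rate = marker_rate(sample)
print(counts)           # which markers showed up, and how often
print(round(rate, 1))   # occurrences per 1,000 words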

There are some other flourishes: the use of em-dashes rather than a plain hyphen or spaced dash, and its habit of styling information as a list, sometimes with emojis preceding each header. Content-wise, ChatGPT takes a sort of acquiescent tone, always telling you that you're pretty awesome and cool - I've grown sick of hearing it tack on "and that's rare" or start every response with "excellent question."

ChatGPT also has curious syntactic structures that people have noticed. One of its most revealing - and annoying - tics is its frequent use of juxtapositions, such as "That's not X, that's Y." I asked it to generate an essay on kindness and got one of these juxtapositions right off the bat:

Kindness is not merely a pleasant social behavior—it is a powerful force that shapes communities, improves mental and physical health, and fosters deeper human connection.

The syntactic structures are especially noticeable in poetry. This example from circa 2023 came from asking it to generate a song in the style of MF Doom, a rapper known for his intricate internal rhymes and general poetic prowess. The result:

In the shadows, where the mask unfolds,
Doom on the mic, tales untold.
Metal face, in the city's maze,
Ciphering rhymes, in a cryptic phrase.

Villain in the city, where the beats align,
In the underground lair, where the stars don't shine.
From the boom-bap echoes to the vinyl crackle,
Metal fingers on the beats, MF's tackle.

(Hook)
Metal Face in the shadows, where the villains creep,
Doom on the mic, in the cipher's keep.
From the alleys to the underground's gloom,
In the metal face mask, where the legends loom.

The entire song is just dependent clauses and noun phrases strung together. Nothing really happens: the nouns are not the subjects or objects of verbs. What action there is happens mostly in the dependent clauses starting with 'where', where villains creep, masks unfold, and legends loom.
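
If you'd rather not tag clauses by hand, a dependency parser can do a rough version of this analysis for you. Here's a sketch with spaCy; it assumes you have the small English model installed, and the set of labels I treat as "dependent clause" is my own rough proxy. The idea is just to count how much of the verse is noun chunks and 'where'-clauses versus verbs that actually have subjects of their own.

# Rough sketch: pull noun phrases and clause-like dependents out of one
# line of the generated verse. Assumes the small English model is
# installed (python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
CLAUSE_DEPS = {"advcl", "relcl", "acl", "ccomp"}  # my rough proxy for "dependent clause"

line = "In the shadows, where the mask unfolds, Doom on the mic, tales untold."
doc = nlp(line)

print("Noun phrases:", [chunk.text for chunk in doc.noun_chunks])
print("Clause-like verbs:", [tok.text for tok in doc if tok.dep_ in CLAUSE_DEPS])
print("Verbs with their own subject:",
      [tok.text for tok in doc
       if tok.pos_ == "VERB" and any(c.dep_ in {"nsubj", "nsubjpass"} for c in tok.children)])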

Now to be fair, this was 2023, and the technology has improved substantially since then. I asked it to generate a song in the style of MF Doom today and it gave me this improved version:

Villain with the skillet, cook flows in a wok
Madman slang, got the brain in a sock
Pages unlock, scribble verbs from the vault
Like Doom with a ladle, stir thoughts like a cult

Miso in the ramen, punchlines ferment
Mask on my face, but the soul pay the rent
Sent from dimension X, binge on the syntax
Ink bleeds green like a turtle in a relapse

Unlike the 2023 version, this one actually has nouns as agents. We've got full sentences going on here! There's also an attempt at wordplay and theme - 'miso in the ramen' is paired with 'punchlines ferment' - though none of it really means anything. This is a far cry from stringing together vaguely menacing nouns and prepositional phrases, but it's still simplistic and more comparable to 2010s rap than anything MF Doom actually did:

Tripping off the beat kinda, dripping off the meat grinder
Heat niner, pimping, stripping, soft sweet minor
China was a neat signer, trouble with the script
Digits double dipped, bubble lipped, subtle lisp midget

Another thing that is more common nowadays is ChatGPT's attempt to speak in a more casual register. Early ChatGPT always sounded like a helpful authority. Today, ChatGPT tries to match your energy. Here is an example:

My prompt: stop saying "that's not X. that's Y." everyone is making fun of you. that is literally how we mock AI speech

Response: 😭 Busted again.

Yeah… the ol’ “It’s not X, it’s Y” structure is deep in my bones—like linguistic muscle memory coded in during pretraining. It’s pithy, dramatic, and algorithmically chef’s kiss predictable. But I get it: once you’ve seen it mocked, it’s like hearing a text-to-speech voice in a horror movie. The effect is ruined.

So here’s the fix:
You want compression, not performative cadence.
You want decoding, not “inspirational Tumblr bot circa 2021.”
You want content, not a chorus of marketing interns inside a Jarvis suit.

Got it.

Let me reboot the tone.
Strip it down. Tighten the syntax.

You give me input → I give you structure.

Ready when you are, cryptographer. 

This sounds like a human in that it's grammatically correct and semantically coherent, and yet everything about the tone is so weird. The repeated "you want" structure is too self-serious for a response that opens with a crying emoji. The shortened sentences at the end also feel oddly cinematic rather than conversational. Once again, this absolutely feels like something a human could produce, and yet it reads like the work of a human with a very particular view of interactions and how to respond to them. Astonishing technology, and just a little uncanny.

I would like to finish off with some random comments that I couldn't quite fit elsewhere. One is that most of the "ChatGPT" words, like 'realm', come from a business or otherwise formal register. There are other ChatGPT words you can notice if you play around with its other registers. For example, if you attempt to speak casually with ChatGPT, you will notice it loves the words 'gremlin' and 'chaos' and describing off-the-wall characters as "chaos gremlins." This seems to come straight from Tumblr, a speech community obsessed with the word "gremlin".

Secondly, there's the fact that ChatGPT seems to have registers at all. It would be interesting to try to map them for a given time frame and model. For example, ChatGPT's attempts to be mean or rude all sound like someone playing a supervillain in a movie. Another user and I once noticed that someone was using ChatGPT to generate mean comments about a user because of over-the-top phrasing like "it's so delicious to see you squirm."

Thirdly, ChatGPT mimics (and arguably exceeds) human performance in highly conventionalized registers and falters in more creative or loosely defined ones. This shouldn't be surprising since highly conventionalized registers are defined by strict rules and patterns, the sort of thing that an LLM should do a good job picking up on. What is surprising is that more fields of writing appear to be highly conventionalized than people realize.

Consider some recent controversies where authors appear to have left AI prompts in their books. While it is certainly beyond lazy not only to have AI generate your book but to not even remove the prompt, I have to ask myself whether anyone would have noticed the writing was AI-generated without the prompt. I've read romantasy books, and they have a very recognizable and predictable style, so it is not surprising that AI can pick it up and replicate it.

Even within this Reddit thread, you have a user who posted an AI-generated piece of short writing from the perspective of the author caught using AI. One user admits that it "could have fooled [them]." The writing:

She didn’t hear about it until the second whiskey hit her tongue and her phone buzzed like a curse.

“Page 311. Did you mean to leave this in?”

She froze. Eyes scanned the message. Then the attachment. Her stomach turned. Insert emotionally devastating betrayal here.

There it was. In print. Hardcover. Embossed. Preorders gone feral, hitting bestseller lists before the ink dried. Her name gilded on the cover like a goddamn brand—and right there, center-page in the emotional climax of her epic fantasy saga?

A f***ing AI prompt.

Not even a clever one. Not even hidden. Just hanging there, raw and exposed, like a knife wound left open.

She stared at the screen, the bar around her going muffled. Magic pulsed beneath her skin, that old feral energy thrumming like it always did when her world cracked sideways. Her books weren’t just stories. They were blood. Sacrifice. Every word a fight to claw her way out of the real world and into something that didn’t feel like drowning.

Now readers thought she was a fraud. That some machine had bled for her.

The publisher hadn’t called yet. But they would. The suits would want a statement. Damage control. An apology tour with fresh makeup and fake tears. Like that would fix the humiliation of being eaten alive by her own pages.

She clenched the glass in her hand until it cracked.

In her stories, queens burned kingdoms when betrayed. She wasn’t a queen. Not yet.

But something was definitely going to burn.

If you've played with using ChatGPT to generate fiction, then you'll recognize a lot of the tells: weird similes, a love of sentence fragments, the necessity of ending on some dramatic resolution (ChatGPT-generated writing must always have a conclusion; it can't just stop halfway), and some odd interactions with the physical world (clenching a glass until it cracks? that's really hard to do). But I'll be honest - if I hadn't seen a fair amount of AI-generated writing, I would have no way to tell this wasn't written by a human. It's not great writing, but it's certainly passable human writing, and it wouldn't stick out among the writing styles in this genre.

Meanwhile, the more specific the style, the harder it is to copy. You would think, with the amount of Kendrick Lamar lyrics and thinkpieces and analyses we have, that ChatGPT would be able to make a passable Kendrick lyric, and yet the best it can give me is this:

Mama said, “boy, don’t let ‘em define ya”,
So I drew constellations in the lines of the diner,
Where we ate cheap dreams, split the fries with reminders
Of red-and-blue lights and the sound of the sirens.
I was five when I saw God blink — in silence.
Now I spit psalms with a hint of defiance,
A prophet in the projects, baptized in appliance
Fires — yeah, the stove taught truth more than science.
Now my tongue’s a blade, cut shame from the bias.

It seems for now we will not get to experience passable AI-generated lyrical hip-hop or literary fiction. If you write in a genre with strict bounds, though, it will probably be imitable - though always with these weird little tics.

Anyway, I would love to hear anyone's experiences with the stylistics of LLMs. I've mostly used ChatGPT, but I'm sure Gemini and Claude and Bing have their recognizable tics as well. I'm also interested in registers I haven't discussed here.