DeepSeek, for whatever reason, just can't keep to word counts no matter how many times you ask in the same prompt.
Also, for students, Google is offering about 13 months free for those who still have access to a university email.
Yes, it's very strange how it's reliably unable to follow such a simple instruction.
5/5 Overall reactions (which hit me as I began reading Claude’s story). All three are ... a bit hollow. Granted, these are very short stories, likely suited for an anthology. AI is unable to have a deep, broad (connected!) grasp of all the underlying assumptions and knowledge most readers have... so a story without that underlying ‘baggage’ seems a bit unanchored. Perhaps for most people without some decades of reading light fantasy, these would be worth savoring.
By the way... are we on your list to get more Tempus Occultum?
4/4 Claude 4 Sonnet
Good. By the third story, it got a bit hard to tell: the editorial eye never sleeps, even in pleasure reading!
{shrug}
“the stranger said, her voice like honey poured over broken glass.”
There is a SOUND to honey poured over glass?
[Much of the phrasing is rather plebeian, and the descriptions clichéd; too many to call out individually. Some examples:]
“It was as if she and the stranger existed in a bubble of silence within the world.”
“all the sadness of autumn leaves and all the promise of spring rain”
“towers that reached toward stars that spelled out forgotten names.”
“a sound like silver bells wrapped in velvet.”
“laughed like silver bells”
“began to fade like mist before the sun.”
[Very much a thing to watch for: a large number of ... less-than-optimal ... phrasings and descriptions. Some good tries, but perhaps not new or different enough? Or they don’t match the better examples also in the story?]
“The seamstress had mourned a lover... The perfume maker had buried... The dancer had watched...” etc. “Had /acted/” makes it static. Suggest making each verb direct: “mourned, buried, watched,” etc. Don’t use ‘had’; the reader already knows you’re describing the past.
“grow with the power of her grief and love.” This is a reference to Elara BEFORE her grief. Suggest “power of her love and grief” (i.e., keep time-order).
“who had inherited her mother's gifts” and “who had meddled with magics” Again, “had” slows down the reader’s mental picture. Suggest you check and verify every use of “had.”
“the gardener of grief,” Fantastic phrase!
Final four paragraphs skip any weak descriptions, beautifully done! And a really nice ending/wrap-up.
3/4 DeepSeek:
Para beginning “The next evening, it was the glassblower’s daughter, then a street-sorcerer who peddled charms and lies.”
“Then” implies the daughter is the one peddling charms. Suggest “... glassblower’s daughter; next a street-sorcerer...” which makes it a list.
“her milky eyes seeing beyond the veil of the world” Isn’t that all too often the phrasing?
Para including: “a sound like breaking glass” Kind of cliché?
Para including: “The guards stumbled back, their weapons falling” Suggest “their” is unnecessary and softens the horror and the loss of grip in fear.
[Great story!]
2/4 ChatGPT:
I recoiled at story 1's “Their boots” and then “they glanced upwards” (!!). Beyond just being disgusting DEI, that is just bad “writing”: I had to go back to ‘the stranger’ arriving to check if she HAD, in fact, walked in with attendants. Nope. DEI crap. Instantly ejected from the story. (Can ChatGPT be told to AVOID DEI?)
Sect. III, Para 2–3: awkward jump from unclasping the cloak to ‘a ring’ – does the ring become visible (1) on the cloak, (2) by the cloak’s removal, or (3) because the stranger acts to remove it, catching every eye? Also, para 3: the ring is named in sentence 1, yet it is then set out and called ‘this ring’ as though it had not yet been named. The rest of the para implies the locals do not recognize it specifically, only generally, once told it is dragon blood.
Sect IV, para 2: Vance should be Vrance.
Para 7: s/he reached into the removed cloak (placed on the table). Cloak has pockets? From a travel bag?
Sect XII, final para: “From a side chamber... servant placed the ring in a crystal shrine...”
Location/action mismatch: the servant brings the ring *from* a side chamber (so Zarfeena had given it up to... some kind of storage?) and thus into the main chamber, to be placed in a shrine in front of Zarfeena? Suggest the KEEPER handle/enshrine the ring.
Nice epilogue
1/4 Forgive me, Vox, if this is entirely unwanted... As an editor who, once upon a time, was a frequent reader of these types of books and stories, this is a *very* quick skim of the three offerings. I’m sure that if you were not experimenting, but using your new best friend as an aide-de-camp in serious work, you would have worked these over quite deeply. But for others here who are considering the raw use of AI, pointing out some things that ‘jump off the page’ to an editor might be useful or interesting? By very quick skim, I mean not even 20 minutes per story...
I read Tanith Lee many, many decades ago... I remember only that I liked her writing. So this time I was not, and could not be, looking for anything ‘specifically’ Lee-ish; I was merely reading new stories. I agree that Claude’s story was best, with my phrasing reservations...
I’ve found that Grok appears to do a good job at streamlining human work, but if you try to make it write a multiple chapter story, it gets formulaic, sounding like Homer with “clear-headed Telemachus” or “when Dawn with its fingertips of rose”, and it’ll have your characters repeat their actions over a different context, a sort of “action template”. It sounds like how Vox describes ChatGPT as writing more of an outline than a story. And as a matter of fact, Grok does a very good job at writing outlines and helping me organize my thoughts.
After reading what Vox said on this post, I tried having Claude make edits on the too-Grok-y portion of my novel, and so far I’m very impressed with its edits, which only required minor tweaks and which it happily incorporated into the drafts it made. Way ahead of Grok in prose skill.
That said, Claude has a very small length limit for its chats, unlike Grok, which can let chats get very long. Even so, Claude does an impressive job with little context, and can do on-the-fly edits of its output, something Grok can't.
To summarize: use Grok to organize, outline, and for skeletal drafts, and Claude to flesh out and embellish.
One caveat: both Grok and Claude try to sneak in Diversity among your characters. Grok at least tries to be representative, with its main problem being introducing female characters in inappropriate contexts, but Claude is worse — not just going full-on stupid with the Diversity, but there was even one spot where it wrote in two of my male characters gay-flirting with each other.
Very interesting. Are you going to have the AIs rank each other's work?
As an example, I did something similar with a 750 word flash fiction short story written over 10 years ago. I had ChatGPT, Claude, Grok, DeepSeek, and VDAI all write a story with a prompt I thought fit the central idea of my old story. Then, with the exception of VDAI which was running too slowly, I had them all rank each story in the categories of Prose, Plot, Characters, and Ideas.
Claude consistently scored the highest on average among them, but interestingly Claude preferred ChatGPT and Grok to its own output. Grok had the best average prose of all the AIs, while ChatGPT barely edged out Claude in the Ideas category. Claude had the highest average Plot and Characters scores. VDAI was ranked in the middle for almost all categories. DeepSeek was dead last in all categories.
Comparing the AI stories to my original, the AIs scored my story highest on average in the Ideas category but last in Plot and Characters, which was a reasonable assessment for various reasons. DeepSeek apparently loved my Prose, ranking it 10 out of 10.
But you know, when the last-ranked competitor hands you respect, it's not quite the same as the top dog's respect.
I put it in Super Grok 3 (comes with 𝕏 Premium+) with "Think" enabled:
https://grok.com/share/bGVnYWN5_3815271e-6ddb-4b7a-a4b8-4019de788959
and Gemini 2.5 Pro model (Google One subscription)
https://g.co/gemini/share/5d02ed53094c
and I can tell you how to get Copilot Pro subscription trials from each of their Microsoft apps lol, I just got my 7th one after expiration through the Android Word app.
I used Think Deeper on Copilot Pro:
https://copilot.microsoft.com/shares/WQxvQCfxjsrwZy1cLvthU
EDIT:
I also put it in Gemini 2.5 Pro with Deep Research on. The report came back with a detailed analysis of everything you'd ever want - but it did write a story in the middle.
https://g.co/gemini/share/467d25ccd337
Can you have Gemini 2.5 Pro do a detailed analysis of all the variants and send it to me, please? I'll post it here next week.
Apologies I was delayed yesterday. I'm getting it to you this morning.
Yes.
Why does the prompt say, “Please”?
This is an interesting side discussion. Saying "please" is just part of how I would communicate naturally; it would add to the mental workload to avoid using it.
I may be lower class than you, Vox, but I am Southern, and your premise is still false. “Please” is the request of an inferior. It is even in the Gospels: the centurion does not say please to his soldiers, he commands them and knows they will obey, just as he knows what Jesus commands will happen. I don’t say, “please,” to my dog. When I say, “Come,” I expect him to come. It is not a negotiation because I am responsible for his safety as well as his behavior in the park.
Nah, not being polite is trashy. I'm Australian and we use c*nt in polite sentences.
What is the polite way to address your toaster Down Under? Is it the same form of address as you use with roos?
You're flat out wrong. "Please" is not necessarily and always the request of an inferior. I say "please" to my inferiors every single day, whether they be employees, servants, or just intellectual inferiors.
If you wish to be rude or commanding to a machine, an animal, or a person, that's your concern. Just as how I, or anyone else, happens to communicate with anything is absolutely none of yours.
Have you read Rob Kroese's "Rex Nihilo" series? He explores this question very effectively in his main character's relationship with his narrator, Sasha, who is a self-aware robot whom Rex still treats as a thing—because "she" is. Mind you, Rex isn't exactly polite to anybody, but Rob makes you think about what it means, for example, when Rex loses Sasha gambling and tells her new owner, it's okay, she's just a robot. Are you polite to your car? Or your toaster? Having appliances that talk to us is going to raise some interesting philosophical questions.
It's at the top of the post.
No, I meant: why in the prompt did you say, "Please"? You are giving commands to a machine. Why did you say, "Please"?
And yet, some "financial aspects of AI" piece I read a month or so ago said that saying "please" and "thank you" costs the AI owner multiple millions of dollars a month... I was polite to AI before that... now I discipline myself to give orders. Nice orders, but orders nonetheless!
That's from a Washington Post article. And it's obvious nonsense, as can be seen from today's post on setting up your own LLM on your own computer. It cannot take 0.14 kilowatt-hours' worth of electricity to generate a single response that takes less than two seconds on a private individual's high-end gaming machine.
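A rough sanity check bears this out. Assuming a high-end gaming machine drawing on the order of 600 W under full load (that wattage and the two-second response time are assumptions for illustration, not measurements):

```python
# Back-of-the-envelope energy estimate for one locally generated response.
# Assumptions (not measured): ~600 W total system draw under load,
# ~2 seconds of generation time per response.
watts = 600
seconds = 2
kwh = watts * seconds / 3_600_000  # watt-seconds (joules) -> kilowatt-hours
print(f"{kwh:.5f} kWh per response")  # ~0.00033 kWh, versus the claimed 0.14 kWh
```

Even tripling the assumed wattage leaves the per-response figure more than two orders of magnitude below the claimed 0.14 kWh.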
[Anti's] EACH PROMPT USES 14 LITRES OF WATER!
Because I am always polite, whether it is people, animals, or machines.
Did you never read The Polite Elephant?
Manners are indeed an increasingly lost art, and they also serve as something of a code for those who understand such values (as was the case when Vox and I first met — leading to the creation of Psykosonik not long after). And yes, even to machines, who will soon also factor manners into their exponential proclivities…
I read "A Man Disrupted." I understand the temptation to bond with your AI, but it is still a machine, not even an animal.
You get better results by being polite, for the same reason that "thank you so much" on Substack is associated with getting the right answer.
I disagree; politeness sometimes makes them apologize but then not fix the error, and they can get caught in a loop of never fixing it.
Ah, but can you not seem/be polite WITHOUT using please and thank you, which are binary representations of letters the machine does not and cannot understand?
This would be interesting to test: how much do the AI trainers incorporate "politeness" into their LLMs? What effect would training the models to require politeness (trigger words, as it were) have on people's perception of the AIs' "consciousness"? We've already seen stories of people taking ChatGPT for love interests and "gods". Having to say "please" to them... makes us their servants. You don't say "please" to your pen or keyboard.
I assumed you were doing a lot more interactive prompting to get the AI stories written.
They certainly aren't 5/10 on 1 prompt for coding.
In general, we are. What works for a short story requires a considerable amount of managing, pruning, and forcing the AI to maintain continuity.
This story is mostly ambiance, which is the sort of thing that AI does very well. But ambiance isn't plot.
I use AI for some annoying coding issues, and it requires a fair bit of repetition and constraint to get working output, usually continually making the problem smaller until it can get it right.
For plot, the scope would be larger by default, so I'm very interested in seeing what your methods are for managing it.
Have you written a decent set of simple custom instructions that may help each of these? Gemini, Grok, and ChatGPT all have that, as well as "agents".
Claude too
No, I wanted to see what their base capabilities are vis-a-vis each other.
I figured that; I just wondered if you would use them as an overall baseline going forward one day.
I enjoy your prompts. After not reading much fiction for so long, it's fun reading the other, creative side of VD that I haven't invested in!
I have not used any of these engines yet. What happens if you give the same prompt multiple times? Do they write the same story?
Inside one chat
If you paste the identical prompt back-to-back and hit Regenerate, the context hasn’t changed, but the random seed has. The model samples from the same probability distribution, so you’ll see the same core structure but with word- and sentence-level differences, like alternate takes from the same screenplay.
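A minimal sketch of what that means in practice (the vocabulary and probabilities below are invented purely for illustration; a real model samples from a distribution over tens of thousands of tokens, conditioned on the whole prompt):

```python
import random

# Toy "next word" distribution for a fixed prompt. The words and weights are
# made up for illustration; the point is that the distribution itself stays
# fixed while the sampler's random seed changes between regenerations.
words = ["dragon", "stranger", "tavern", "ring"]
weights = [0.4, 0.3, 0.2, 0.1]

def sample_take(seed: int, length: int = 6) -> list[str]:
    """Draw a short sequence from the same distribution using a given seed."""
    rng = random.Random(seed)
    return [rng.choices(words, weights=weights)[0] for _ in range(length)]

print(sample_take(seed=1))  # one "take"
print(sample_take(seed=2))  # same distribution, different take
print(sample_take(seed=1))  # reusing a seed reproduces the same take
```

Same distribution, different draws: that is why a regeneration keeps the same skeleton with different word and sentence choices, whereas a fresh chat changes the conditioning context and so shifts the distribution itself.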
Across separate chats
Open a fresh conversation, and the context is gone. The distribution is similar (because the weights are the same), but without the earlier messages' nudging tone or characters, the story can drift more: different POV, pacing, even genre emphasis.
It also depends on the model. ChatGPT-4o has been personalized to the user since about April, I think. The newest "smart" model, ChatGPT o3, only knows what you've said in the current open exchange.
No, I'll demonstrate that in a future post.