When I worked on Textfyre, Michael Gentry and I constructed MS Word documents that contained all aspects of the story Jack Toresal and The Secret Letter. Jon Ingold used the same template for Shadow in the Cathedral. I am now adapting that template to the way I develop stories in Sharpee, using Claude as a code generator. This document describes the strict guardrails I use to prevent the project from slipping into AI-generated prose, puzzle mechanics, plot, theme, and even grammar and punctuation.
The Problem
Large language models are trained to be helpful. When you ask one to implement a scene in an interactive fiction game, its instinct is to fill in every gap — invent a room description, draft NPC dialogue, design a puzzle, name the tavern. The result is technically functional and creatively hollow. The AI writes competent prose that sounds like no one in particular, designs puzzles with no authorial intent behind them, and builds a world that feels generated rather than authored.
Interactive fiction is an authored medium. The text is the game. Every room description sets tone. Every line of dialogue reveals character. Every puzzle encodes the author’s understanding of how the player thinks. These are not implementation details to be delegated — they are the work itself.
The solution is not to stop using AI for interactive fiction. It’s to draw a hard line between creative authority and technical implementation, and to enforce that line through process.
Creative Boundary Constraints
The methodology enforces a set of non-negotiable constraints on what the AI may produce. In practice, these are codified in a CLAUDE.md configuration file that the model reads at the start of every session.
- No generated prose. All player-facing text is the author’s responsibility.
- No narrative suggestions. Story direction, plot, and theme come from the author, not the model.
- No puzzle design. Puzzle mechanics are authorial craft, not implementation logic.
- No dialogue. Every character’s voice belongs to the author.
- No world-building. The fictional world is defined in specification documents, not improvised by the model.
- No unspecified implementation. If it is not in the spec, it is not built.
- Placeholder discipline. When player-facing text is needed but not yet written, the model inserts a TODO marker and moves on.
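To give a sense of the shape, here is a minimal sketch of how such a CLAUDE.md section might read. The wording below is illustrative rather than a quote from any real project file; what matters is that the rules are short, imperative, and re-read at the start of every session.

```markdown
# Creative boundary constraints

You are a code generator for an interactive fiction project.
You implement specifications. You do not author content.

- Never write player-facing prose, descriptions, names, or dialogue.
- Never suggest plot, theme, puzzle, or world-building ideas.
- Implement only what the spec describes; nothing extra.
- If the spec does not cover something, insert a `TODO: author text`
  (or `TODO: author decision`) placeholder and continue.
- When a spec is missing or incomplete, stop and list exactly what is needed.
```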
Why strict boundaries matter
There is a practical reason and an ethical one.
Practically, the most dangerous thing an AI can do to a creative project is produce work that is almost good enough. Near-adequate generated text is harder to replace than a TODO placeholder. It creates inertia — the project feels further along than it is, and the author begins editing AI prose instead of writing their own. Over time, the authorial voice drifts toward something generic and indistinct. A placeholder is honest. Generated prose obscures a gap behind competent but unauthored text.
Ethically, every commercial large language model was trained on copyrighted creative work — novels, short stories, screenplays, game scripts, and other narrative texts produced by working authors. When an LLM generates prose, dialogue, or world-building, it is drawing on statistical patterns learned from that intellectual property. Using AI-generated narrative in a published work means the creative substance of the project is derived, however indirectly, from the uncredited and uncompensated labor of other writers. Constraining the AI to technical implementation — code structure, state management, engine integration — avoids this problem entirely. The author’s words are the author’s own. The AI contributes engineering, not narrative.
The Development Cycle
The workflow follows a strict cycle. The author drives every creative decision. Claude handles the technical implementation.
1. Story
The author defines the story: world, characters, themes, tone, and arc. This is pure creative work. Claude is not involved.
2. Chapters / Scenes
The author breaks the story into implementable units — chapters, scenes, or locations. This decomposition is itself a creative decision (pacing, structure, what the player experiences and in what order). Claude is not involved.
3. Specification
This is where the method lives or dies. The author writes detailed specs for each unit. A scene spec should include everything Claude needs to implement it without inventing anything:
- Room/location descriptions — The actual text the player will see.
- Objects and scenery — What’s in the space, what can be examined, what can be taken or used.
- NPCs and dialogue — Who’s present, what they say, how they respond to player actions.
- Puzzle mechanics — Trigger conditions, solution steps, failure states, hints.
- State tracking — What variables change, what flags get set, what consequences carry forward.
- Exits and connections — Where the player can go from here.
The spec is the contract. If it’s not in the spec, Claude doesn’t implement it — it leaves a TODO. It may point out empty slots in the template, but it will not point out missing design elements.
Writing specs at this level of detail takes real effort. But here’s the thing: you’d have to make all these decisions anyway. The spec just forces you to make them before implementation rather than discovering mid-coding that you haven’t figured out what the blacksmith says when you show him the amulet.
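For illustration, here is a minimal sketch of what one of these scene specs might look like. The field names, the variable, and the specifics are invented for the example; real specs run much longer, and the prose sections contain the author’s finished text rather than summaries.

```markdown
## Scene: Market Square (Dawn)

### Room description
[Author's finished prose, pasted here verbatim.]

### Objects and scenery
- Fountain: examinable. Description still TODO.
- Fruit cart: examinable; searchable only while the merchant is distracted.

### NPCs and dialogue
- Merchant: present from the start. Opening line, responses to SHOW AMULET,
  and refusal text written out in full here.

### Puzzle mechanics
- Trigger: player shows the amulet to the merchant.
- Solution steps, failure states, and hints listed explicitly.

### State tracking
- merchant_suspicion increases on each failed attempt; carries into later chapters.

### Exits and connections
- North: alley (always open). East: tavern (locked until evening).
```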
4. Implementation Plan
Claude reads the specs and proposes an implementation plan:
- What files need to be created or modified.
- What engine features and data structures will be used.
- How state tracking will work technically.
- What order to build things in.
- What’s fully specified vs. what has gaps.
The plan is presented to the author for review. Claude does not write code until the plan is approved.
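As a rough illustration (not a transcript of a real plan, and with file names invented for the example), a plan for a single scene might look something like this:

```markdown
## Implementation Plan: Market Square (Dawn)

1. Files: add scenes/market-square.ts; extend state/flags.ts.
2. Engine features: standard room and object definitions; a conversation
   handler for the merchant; a flag check for the locked tavern door.
3. State: merchant_suspicion counter persisted in world state.
4. Build order: room and scenery first, then exits, then the merchant
   conversation, then the puzzle trigger.
5. Fully specified: room text, scenery, exits. Gaps: fountain description,
   behavior when the player goes south. Both will be TODOs.
```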
5. Review
The author reviews the implementation plan:
- Does the plan correctly interpret the spec?
- Are the technical choices sound?
- Are there gaps Claude identified that need spec work before proceeding?
- Does the plan’s scope match what the author intended?
The author approves, revises, or sends Claude back to re-read the specs.
6. Execute
Claude implements the approved plan. During implementation:
- Code follows the spec exactly. Where the spec provides text, that text is used verbatim.
- Where text is referenced but not yet written, Claude uses 'TODO: author text' placeholders (see the code sketch after this list).
- Where implementation reveals a spec gap (e.g., “what happens if the player tries to go north here?”), Claude flags it and keeps going.
- Claude does not improvise. If something isn’t covered, it’s a placeholder, not an invention.
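To make the placeholder discipline concrete, here is a small TypeScript sketch. It is a toy data shape with invented names, not Sharpee’s actual API; the point is the pattern: spec text used verbatim, explicit markers where text is missing, and gaps flagged rather than filled.

```typescript
// Toy sketch with invented names; not Sharpee's real API.

type RoomDef = {
  id: string;
  description: string;
  objects: { id: string; description: string }[];
  exits: Record<string, string>;
};

// Unwritten player-facing text becomes an explicit, searchable marker.
const TODO = (note: string): string => `TODO: ${note}`;

const marketSquare: RoomDef = {
  id: "market-square",
  // Where the spec provides text, it is pasted in verbatim.
  description: "<room description copied verbatim from the scene spec>",
  objects: [
    {
      id: "fountain",
      // The spec lists a fountain but its description is not written yet.
      description: TODO("author text: fountain description"),
    },
  ],
  exits: {
    north: "alley",
    east: "tavern",
    // Spec gap, flagged rather than filled: the spec never says what
    // happens if the player tries to go south from here.
    south: TODO("author decision: is south blocked, and with what refusal text?"),
  },
};
```

The marker string is deliberately uniform, so a search for TODO: gives an honest list of everything still waiting on the author.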
7. Repeat
Return to step 2 or 3 for the next unit. Each cycle produces implemented, tested code that faithfully represents the author’s specifications — with honest gaps where specs are still needed.
The Gap Conversation
The key to this workflow is what happens when Claude hits a gap. Instead of filling it in, Claude stops and says exactly what’s needed. Some real examples from my projects:
- “There’s no scene spec for the throne room. I need: room description, exits, interactive objects, and NPC placement.”
- “The puzzle mechanics for the finale aren’t specified. I need: trigger conditions, solution steps, and failure states.”
- “The character’s dialogue for this encounter isn’t written. I need their lines before I can implement the conversation.”
This creates an iterative loop. You write, Claude implements, Claude tells you what’s missing, you fill it in, repeat. Nothing is generated. Everything is authored. The AI becomes a very fast, very literal collaborator that keeps asking you the right questions.
Spec Document Hierarchy
I’ve found it helps to organize specs at these levels, with higher levels informing lower ones:
| Level | Contents | Example |
|---|---|---|
| World | Setting, magic systems, physics, rules | How the magic system works |
| Characters | Personality, relationships, knowledge, voice | Why the merchant distrusts strangers |
| Chapters/Scenes | Descriptions, atmosphere, objects, NPCs, events | The market square at dawn |
| Mechanics | Puzzles, choices, branches, consequences, state | Trust meter thresholds |
| Implementation | Technical patterns, data structures, engine APIs | Conversation tree format |
A scene spec can reference character profiles and world rules rather than restating them. This keeps individual specs focused while maintaining consistency.
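As a concrete example of the boundary between the Mechanics and Implementation levels: “trust meter thresholds” means the author decides the numbers and what crossing them means, while the code that reads the meter is plumbing. A toy sketch follows, again with invented names rather than Sharpee’s real API, and with placeholder values standing in for the spec’s actual numbers:

```typescript
// Toy sketch with invented names; the thresholds and their meanings come
// from the author's Mechanics spec, never from the model.

type WorldState = { merchantTrust: number };

// Placeholder values; the real numbers live in the Mechanics spec.
const TRUST_THRESHOLDS = { wary: 2, helpful: 5 } as const;

function merchantGreeting(state: WorldState): string {
  if (state.merchantTrust >= TRUST_THRESHOLDS.helpful) {
    return "TODO: author text - merchant greets a trusted regular";
  }
  if (state.merchantTrust >= TRUST_THRESHOLDS.wary) {
    return "TODO: author text - merchant is civil but guarded";
  }
  return "TODO: author text - merchant distrusts this stranger";
}
```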
What This Gets You
This methodology works because it plays to the strengths of both author and AI:
The author provides creative vision, voice, world-building, puzzle design, and narrative craft — things that require artistic intent and cannot be meaningfully delegated to a model.
The AI provides technical implementation, code structure, engine integration, state management, and the ability to turn specifications into working software quickly — things that are genuinely well-suited to an AI coding assistant.
The boundary between these roles is the specification document. Everything above the spec is the author’s domain. Everything below it is the AI’s. The spec itself is written by the author and interpreted — never extended — by the AI.
The result is interactive fiction that is fully authored — every word, every puzzle, every story beat chosen by a human — but implemented at a pace that would be difficult to achieve solo.
Practical Notes
A few things I’ve learned along the way:
- Session length matters. AI context windows are finite. For sessions where you’re implementing scenes with lots of author-text boundaries, keep sessions shorter. For pure technical work (engine features, refactoring, fixing tests), you can let them run longer. The creative constraints are what degrade first.
- Correct early and often. When Claude slips and invents something, call it out immediately. A quick reference to the violated constraint is enough. The correction reinforces the boundary for the rest of the session.
- Write session summaries. At the end of each working session, have Claude document what was implemented, what gaps were found, and what’s next. This lets you start fresh sessions without losing progress. (A simple template is sketched after this list.)
- The spec effort is real. This methodology front-loads the creative work. Writing detailed specs before implementation is harder than winging it. But the specs serve double duty as design documents, and you end up with a much more coherent game.
- The AI will still try to help. Models are trained to be helpful. Even with explicit constraints, there is pressure to fill gaps. The boundary constraints and the process together create enough friction to keep things on track, but vigilance is part of the workflow.
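A session summary does not need to be elaborate. A simple structure along these lines is enough (headings illustrative):

```markdown
# Session summary

## Implemented
- Market Square: room, scenery objects, exits.

## Gaps found (need author specs)
- Fountain description.
- Behavior when the player goes south from the square.

## Next
- Merchant conversation, once the dialogue spec is written.
```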
Looking for Feedback
I’m curious what the community thinks about this approach:
- Does the spec-driven cycle make sense for how you think about IF development?
- Are there creative decisions I’m putting on the wrong side of the line?
- Has anyone else found workflows that preserve authorial voice while using AI tools?
- What am I missing?
This is a living methodology — I’m still refining it with every project. Happy to answer questions about how it works in practice.