Story Development with Claude: A Methodology for Authored Interactive Fiction

I think that’s still one more than TADS3.

I really don’t know ZIL or GPT’s code generation well enough to speculate. At least in the case of Claude Code approaching TADS3/adv3 code, the problem generally seems to be a reciprocal relationship between being unable to design well to a vague spec on one hand, and having a tendency to start tail-chasing when given multiple very narrow but difficult-to-satisfy requirements or test cases on the other. That is, asserting that the code must do foo will result in Claude producing something that absolutely does foo; adding the requirement that it also do bar will get you that too, but sometimes at the cost of breaking corner cases of foo. And in particular it seems to be bad at independently concluding that its initial scoping of the problem was wrong and it needs to re-frame (e.g., recognizing that it needs to modify the command execution loop itself, because it’s ping-ponging back and forth trying to do mutually exclusive things at different ends of an individual execution cycle; that is, recognizing that foo and bar are orthogonal).

I think this might illustrate a fundamental difference in our approaches to this kind of thing. I don’t know how to reconcile “does an admirable job” with it producing code that is fundamentally syntactically incorrect. And I want to add that this isn’t intended to say you’re wrong or that AI code generation is useless or anything like that. I’m just trying to convey the fact that I just apparently fundamentally look at this a different way. Like I understand that something like spurious single quoting isn’t an insurmountable problem or anything like that; you could write a bash script to fix it.

But, again, this is the kind of thing that if a junior programmer handed me work like that I wouldn’t think they’d done a surprisingly good job. I’d think they’d failed to satisfy even the barest minimum degree of competence. The fact that the work is being done not by a junior programmer but automatically is impressive in and of itself…but that’s a separate question.

3 Likes

I agree it’s hard to think of it as a competent worker when it can’t cobble together working code on its own. But the way I look at it is, if I can hand a problem off to the tool and get back an almost-solution that doesn’t compile until I fix a handful of mismatched brackets, it has still done 99% of the work for me. Yeah, it’s frustrating to watch it make mistakes that would’ve been trivial for me to avoid, but that’s just it: they’re trivial for me to fix, too, and all the nontrivial parts have been taken care of.

I think of it more like a bright, eager intern who still needs supervision. If I can hand off a problem that would take me a full day, and the tool has it done in 3 minutes but I need to clean up its code, then I’ve traded 8 hours of coding for 3 minutes of waiting, 5 minutes of editing, and 4 cents of API charges, which is a pretty amazing deal.

3 Likes

In my experience, if a problem would take me a full day to solve on my own, then debugging someone else’s non-working solution to the same problem probably takes a day and a half—a full day to figure out their code and where the bugs are, and a half day to redo those parts.

Which, again, I think just means I’m not the target audience for this. I enjoy programming more than I enjoy babysitting and managing.

1 Like

I agree, but this isn’t like that. These are superficial syntax bugs, not logic errors. I can correct them just by scrolling through the file and seeing where the syntax highlighting goes wonky.

4 Likes

That specific example is, yes.

But, for example, I had code that was using a flag on a class as a sort of mutex: check the flag on entering a context and immediately return if it’s set; otherwise set it, do your work, clear the flag, and return. Claude decided to re-use the flag elsewhere, setting it but never clearing it. That suddenly caused a bunch of wonky behavior not in the code it had just added (which passed the checks it knew how to perform), but in code that had previously been working.
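For readers who want the failure mode spelled out: here’s a minimal Python sketch of that guard pattern (the original was TADS3/adv3; the class and method names here are purely illustrative). The guard only works if every set of the flag is balanced by a clear; one unbalanced set elsewhere silently disables previously working code.

```python
class Dispatcher:
    """Illustrative reentrancy guard: a flag checked on entry,
    set while working, and cleared on exit."""

    def __init__(self):
        self.busy = False  # the guard flag

    def run(self, work):
        if self.busy:          # already inside this context: bail out
            return None
        self.busy = True
        try:
            return work()      # do the actual work
        finally:
            self.busy = False  # the balancing clear that makes the guard safe

    # The failure mode described above: unrelated code re-uses the flag,
    # setting it for its own purposes but never clearing it...
    def unrelated_feature(self):
        self.busy = True       # ...so every later run() silently no-ops
```

After a single call to `unrelated_feature()`, every subsequent `run()` bails out at the entry check, so the breakage shows up far from the line that caused it.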

But more generally I sorta feel like…well, if you hand the thing a problem you know the answer to and you can immediately spot mistakes, why would you expect that you’re seeing all the mistakes it made? I mean when I look at my own code I often have trouble identifying mistakes I’ve made, knowing full well what all the moving pieces are, what the motivation is, what my thinking was, and so on.

I guess if all you ever want out of an AI coding assistant is to blaze through trivial stuff where going back and cleaning up where it messed the bed isn’t a big deal…okay, sure. But I thought the selling point was that it was going to be better than me at the nuts and bolts nonsense.

There are things that it legitimately seems to be good and useful at, like acting as a sort of you-know-what-I-mean mode for grepping through code. Like saying here’s thousands of lines of library code, I’m seeing this exception, where’s that coming from. That kind of thing.

But for actual code generation it seems at best wildly inconsistent. Which is approximately the last thing I want out of coding.

2 Likes

Not to be rude, but I am sensing a real axe to grind.

I’ve personally found it a relaxing exercise to let it handle broad strokes. Usually I already have a good idea of what I want to build, and how to roughly build it, but I personally lack knowledge regarding all the different libraries that are out there, their specific function calls, etc.

It means I can focus on the overarching logic and problem solving, without needing to switch brain modes to write syntax.

That is not to say I don’t enjoy writing syntax (it keeps my thinking engaged, at least, though I do this with Dialog now), but it’s a different mindset I need to switch into, whereas I can usually keep good control of the code if I maintain the high-level view and validate what the LLM generates.

Though when I do notice it has diverged from my needs, obviously it’s time to jump in and alter the syntax. I must admit it feels so much more relaxing than needing to spend considerable time on both architecture and implementation, and allows me to test assumptions and fail faster without needing to rewrite large sections of code.

All I can say is this hasn’t been my experience with GPT 5.2 and later in Copilot, or with the latest Claude Opus on the rare occasions I’ve felt like paying 3x credits to use it. The newer models work a lot better than what you’re describing, at least for every problem I’ve thrown at them.

1 Like

I’m enjoying the banter going back and forth here. Seems to be three zones of opinion:

  • AI is not worth it → AI creates more problems > PUT IT IN PARK, TURN OFF THE IGNITION
  • AI should only handle limited tasks → use it with care > GIVE IT A BIT OF GAS, PREPARE TO BRAKE
  • AI’s potential is untapped → we need to push the limits > PEDAL TO THE METAL!

And people can fall anywhere on that spectrum.


@DavidC Anyway, back on track. I’ve been trying to digest what’s being proposed here and I don’t see it as a game changer for me… but that doesn’t mean it’s not for other authors. For me, my biggest concern is that it feels so nebulous in execution. I feel like I have to relinquish control somehow and when you say things like…

I don’t really care to write code anymore.

…I was surprised. So authors using your system become writer/directors primarily. I think removing the programming element is where some might feel unease (setting aside ethical concerns).

If I were to use AI to create a game and still have complete authorial control (without programming), I would want the AI part hidden completely. I would want a web interface front end that gives me a dynamic spreadsheet spec editor, push the button, out pops the game file. That would get my foot in the door. Maybe that’s food for thought.

Change always brings about resistance, unfortunately, and that’s not exclusive to AI usage. People push back on new ideas for all sorts of reasons. All you can do is make something awesome to shut up the naysayers, I say. :wink:

1 Like

I would add one more, but otherwise 100% agree.

AI’s potential is untapped: We need to constrain AI to fully unlock its potential. I think that this will be a game changer.

1 Like

This is where all software engineering is going, like it or not. In five years, most new software engineering jobs will be GenAI overseers. The days of reading and writing lines or blocks or modules of code are ending. This is a delivery model I recently hypothesized.

How we integrate it into creative endeavors is a very important question.

You know, despite all the discussion surrounding AI, regarding ethics, feasibility, use cases, etc., which is fascinating in and of itself, I occasionally get confronted with 9-year-old me who, despite everything, does think it’s really cool, because honestly… it’s kind of sci-fi, isn’t it? We’re kind of living in sci-fi.

1 Like

So… > DRIVE A LITTLE OVER THE SPEED LIMIT SO YOU DON’T LOOK LIKE A PUSSY

Yeah, that could be a 4th spot on that spectrum. :wink:

1 Like

We will still be allowed to spend time tinkering with code at work. But it will be a little like the CFO at a Fortune 500 company spending 2 hours fixing a spreadsheet. Nobody says: “Don’t you have people for that?” It’s just relaxation time and important for their mental health. You can’t go 200mph, 8 hours a day.

I have roadmap ideas, including pooling the results of several spec-developed games into the training of a small language model specific to Sharpee. I save a summary of every session and have over a thousand such markdown files. I’ll have a few thousand more in four to six months. The vestiges of a system you described are in there somewhere.

1 Like

This has been a very congenial discussion. Given the times we live in, that’s a blessing.

4 Likes

It has been, despite me having to say honestly that I am just ever so slightly bewildered by the seemingly gung-ho attitude both you and @Urd have towards AI. But it has been good and informative all the same.

1 Like

I am probably tainted. I don’t live in the U.S. but I work for a U.S. based tech company.
The AI revolution is in full swing, and I think that a lot of people don’t realise the level of industrialisation that has already been achieved. The days of being a code typist are already over.

The only ethical issue with AI is that the ones with the most capital win. The investor question of ‘if I give you more money can you make it happen faster?’ has taken a whole new meaning. Because now the answer is almost always ‘yes’. Just throw more at it.

Which feeds directly into the next question I personally feel strongly about: How can I put this to the benefit of as many people as possible, preferably including myself?

So far my answer has been to rapidly build open source and free software. It’s coming along quite well.

Software is a means of production, which I hope can be put to the good of all.

1 Like

That’s a tough one. It’s worse than when the internet came along. The saddest posts are the ones where somebody is begging for access because 20 dollars a month blows their budget.
We’ve basically reached the point in the western world where (at least in tech), if you are not spending at least 100 USD per month, you are at a real disadvantage.

I really don’t know how this is gonna be solved. With every step we take, we increase inequality and widen the gaps.

Check this piece: How StrongDM’s AI team build serious software without even looking at the code

It’s a question at the core of much of Transhumanist literature (and before I claim the role of philosopher in the room, I never got my degree, @VictorGijsbers might have opinions).

With rapid advancement of technology, how do you maintain equal access for all? It’s something I am genuinely concerned with, especially since it’s a very vulnerable position to be reliant on 4 big tech companies right now, most (or all?) of which are from the US.

So, one of my running experiments is to figure out how to use sparse hardware to run an agentic framework that can at least somewhat assist in programming, so that we aren’t beholden to the whims of Anthropic alone.

2 Likes