Glulxe with LLM-based User Input Parsing

Hi all,

Like a lot of folks, I’ve been thinking about how the parser is both the charm of IF and, sometimes, the wall in front of it. The way I see it now, the game doesn’t really care about the exact words you type; it wants a normalized command. So I tried putting an LLM at the Glk I/O layer and left the VM and game files alone.

What I built is a small experiment: a modification to Andrew Plotkin’s CheapGlk, with Glulxe linked against it. The Glk side intercepts your input, reads a few recent lines of game output for context, asks an OpenAI-compatible API what the intent likely is, and then hands the game a plain command like “take key” or “north”. Glulxe itself is basically untouched; I only added OpenSSL to the build and changed the link to the modified CheapGlk.
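To make the flow concrete, here is a minimal sketch of that normalization step. This is my own illustration, not code from the repo (the real implementation lives in C inside CheapGlk); function names, the prompt wording, and the endpoint handling are all assumptions, but the request/response shape is the standard OpenAI-compatible chat-completions format.

```python
# Hypothetical sketch of the Glk-side normalization step: recent game
# output plus the player's raw input go to an OpenAI-compatible chat
# endpoint, which replies with one plain parser command.
import json
import urllib.request

SYSTEM_PROMPT = (
    "You translate natural-language player input into a single classic "
    "IF parser command (e.g. 'take key', 'north'). Reply with the "
    "command only."
)

def build_payload(recent_output, raw_input, model="google/gemini-2.5-flash"):
    """Assemble the chat-completions request body."""
    context = "\n".join(recent_output[-10:])  # a few recent lines for context
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user",
             "content": f"Game output:\n{context}\n\nPlayer typed: {raw_input}"},
        ],
    }

def extract_command(response):
    """Pull the normalized command out of a chat-completions response."""
    return response["choices"][0]["message"]["content"].strip()

def normalize(recent_output, raw_input, endpoint, api_key, transport=None):
    """Send the payload; `transport` is injectable so the logic is testable."""
    payload = build_payload(recent_output, raw_input)
    if transport is None:
        req = urllib.request.Request(
            endpoint,
            data=json.dumps(payload).encode(),
            headers={"Authorization": f"Bearer {api_key}",
                     "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return extract_command(json.load(resp))
    return extract_command(transport(payload))
```

The game only ever sees the string that `extract_command` returns, which is why no changes to the VM or game files are needed.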

Here’s a quick gameplay demo GIF (playing Try Again by Tom Devereaux): https://log.beshr.com/playing-if-with-natural-language/demo.gif.

In practice, depending on the model you use, you can type naturally, use pronouns, even type in another language, and the interpreter passes a standard command to the game.

It tries to keep context straight. If a game asks for raw text, you can bypass the LLM by wrapping your input in brackets. So for a name prompt: [Beshr]. The idea is to keep games fully compatible, no changes to game files, just a more forgiving input path.
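The bracket escape hatch is simple enough to sketch. This is an illustrative routing rule in my own words, assuming (as described above) that bracketed input skips the LLM and reaches the game verbatim with the brackets stripped:

```python
# Hypothetical sketch of the bracket-bypass rule: input wrapped in
# [brackets] goes to the game as raw text (brackets removed); everything
# else is routed through the LLM normalization step.
def route_input(raw, llm_normalize):
    text = raw.strip()
    if text.startswith("[") and text.endswith("]") and len(text) > 2:
        return text[1:-1]          # raw text, e.g. a name at a name prompt
    return llm_normalize(text)     # normalized parser command
```

So at a “What is your name?” prompt, typing `[Beshr]` delivers `Beshr` untouched, while ordinary input still gets interpreted.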

A few caveats. Interpretation isn’t perfect or always consistent. There is latency, often around 0.5s. Hosted models cost money over long sessions, though you can point the interpreter at local models through any OpenAI-compatible endpoint instead. Model quality matters a lot; I’ve had decent results with google/gemini-2.5-flash. Smaller models can work but probably need tuning for this specific task.

If you want to try it:

What I’m trying to understand is whether this lowers friction without flattening the interesting parts of IF. Does it keep puzzle-solving intact? Is the extra uncertainty from interpretation amusing or just annoying? Also curious which types of games break this quickly, room ambiguity, heavy conversation systems, custom verbs, that sort of thing.

If you have thoughts on evaluation, I’d love to hear them. Reports about specific games that worked well or failed would be very helpful too!

More details and a longer write-up here: Playing IF Games with Natural Language.


Is there an online version or something we can use to test this, without all of that setup?

Not yet, unfortunately. I’m trying to build Glulxe with my modified CheapGlk for WebAssembly so games can run in the browser (without changing the code too much). If that works, I can hopefully put up a web page somewhere to allow client-side runs (bring-your-own-key style for LLM usage), which should make testing easier.

Turning >TAKE THE DIAMOND AND THE WATCH into >TAKE DIAMOND shows the obvious fault in this approach, to me.


I think that’s a case of the author being unfamiliar with the current (well, decades-old, but never mind) state of the art. The LLM prompt includes:

MULTI-OBJECT HANDLING:
- Games often don't support multiple objects in one command
- 'wear winter clothes' → wear coat (pick ONE logical item)
- 'put on coat and boots' → wear coat (games process one at a time)
- Choose the most important/first item mentioned in scene

Since almost every Glulx game is written in Inform, and almost every Inform game supports multiple-object handling, the LLM could be given much better instructions in this regard, but I don’t think this undermines the core point of the exercise.


I suppose you’d still have to worry about the exceptions—do Superglús games support multiple objects?

I didn’t consider it being the deliberate result of a prompt precisely because it seemed like such a strange choice to make…


@Dissolved Indeed, as @jwalrus guessed, it’s my lack of familiarity that explains why the system prompt looks like that :slight_smile: I thought I could normalize the behavior by splitting multiple commands per line into individual commands that run in sequence (the LLM joins them with “then”, and the Glk layer splits that output and passes the commands back one after another). I’ve now removed this extra, unnecessary handling entirely, left it up to the VM and game to take care of, and adjusted the system prompt accordingly.
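For anyone curious, a rough sketch of what that now-removed splitting step might have looked like (my own reconstruction from the description above, not the actual code):

```python
# Reconstruction of the removed handling: the LLM was asked to join
# multiple commands with "then", and the Glk layer split that output and
# fed the commands to the game one per turn.
def split_commands(llm_output):
    return [c.strip() for c in llm_output.split(" then ") if c.strip()]
```

Dropping this and letting the VM and game handle `TAKE DIAMOND AND WATCH` natively is simpler and matches what Inform parsers already do well.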

I have finally managed to build to WASM and have prepared a simple web page (available in the repo and hosted at webGlulxe - Interactive Fiction Player with LLM support). There might be some issues with my WASM build unrelated to Glulxe or CheapGlk; please let me know if you see any strange behavior. For the LLM part, you’ll need to set the configuration (with your API key) before you can test it. The LLM API calls happen entirely on the client side (as does Glulxe itself, as a WASM binary).
