Accelerating Inform 6 games on 8-bit interpreters

ethern0t · March 23, 2026, 2:17pm

This is getting off-topic but some things I’d love to see…

Many/all “classic” games come with an input file and a transcript of expected behaviors.
If the game uses random numbers, have a mode where the random number sequence is really dumb and deterministic, and build the transcript against that sequence.

So for example, we all sit down and agree that if you call @random(-28785) the RND seed will start at 0 and increase by 1 on each call (and the result is modulo the range, plus 1). That way, it’ll be a lot easier to verify interpreters work 100% consistently with each other. Now for existing story files we don’t have the source to, there might be something in the transcript that tells the game to make the magic @random call. Or there’s a parameter you run the interpreter with that forces it to ignore @random calls with a value less than or equal to zero and always produce the same sequence of numbers.
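A minimal Python sketch of that counter scheme, for illustration only. The names `CounterRng` and `MAGIC_SEED` are made up here, and two details are assumptions the post leaves open: the result is taken from the counter before it is incremented, and the counter wraps at 16 bits.

```python
import random

MAGIC_SEED = -28785  # the special @random argument proposed above
# As an unsigned 16-bit word, -28785 is 0x8F8F (both bytes are 0x8F).


class CounterRng:
    """Deliberately predictable generator: an incrementing 16-bit counter."""

    def __init__(self):
        self.counter = 0
        self.predictable = False

    def random(self, arg):
        """Model @random: a positive arg yields 1..arg, MAGIC_SEED arms the counter."""
        if arg == MAGIC_SEED:
            self.counter = 0
            self.predictable = True
            return 0  # @random returns 0 when it is used to seed
        if arg <= 0:
            return 0  # other seeding requests are ignored in this sketch
        if not self.predictable:
            return random.randint(1, arg)  # ordinary play, not reproducible
        result = (self.counter % arg) + 1
        self.counter = (self.counter + 1) & 0xFFFF
        return result
```

After `rng.random(MAGIC_SEED)`, repeated calls to `rng.random(6)` walk through 1, 2, 3, 4, 5, 6 and start over, which is the whole point: any interpreter can reproduce it with one increment and one modulo.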

The hard parts I find are:

- Starting with the correct version of a story file
- Having a walkthrough suitable for machine interpretation
- Dealing with randomness

For example, months ago I stumbled across a github named “mojozork” where they had spent a weekend writing a Z3 interpreter and it included a walkthrough, but I had to basically use their RNG exactly for it to be useful to me.

Individual interpreter authors seem to cobble together their own solutions to these things. I’d love to be able to pull down a story file, a transcript, and run them through either of my interpreters to make sure they play the game the same way others do. It doesn’t even have to be a full actual walkthrough of the game, it just needs to exercise a significant chunk of it (so ideally, not a walkthrough but a recorded play session that tests failure cases too).

These sorts of standard runs would be useful for profiling, identifying hot spots, etc., as well, which is the actual topic of this thread.

-Dave

ps. bonus points if you can figure out why I picked a bizarre special random number seed, keeping in mind I’ve been working on eight bit interpreters lately

zarf · March 23, 2026, 2:34pm

I tried to do this in Glulx by shipping a particular RNG (xoshiro128**) in the reference interpreter and recommending it for the VM’s deterministic mode. Of course I picked one that was easy to implement on 32-bit machines, not 8-bit machines, so you probably don’t want to steal it directly.

The simple increment function is probably the best you can do on 8-bit hardware, but it will cause problems in some games. (E.g. if a game checks two 50% probabilities every turn.)

ethern0t · March 23, 2026, 2:41pm

Picking a small prime number for your step value might be good enough then. Every prime other than 2 is odd, so if you call @random(2) over and over you’d get 0/1 back consistently (before the plus 1).

For the purposes of testing, it doesn’t have to truly be random, it just needs to be consistent and reproducible.

If we really needed a high quality random number for whatever reason, there’s no reason it couldn’t use 16-bit signed arithmetic just like the Z machine itself uses. (I imagine most 8-bit terps use existing library code at least for the modulo of the result)
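A small sketch of the stepped counter, to make its behaviour concrete. The function name `stepped_sequence` and the step value 7 are arbitrary choices for illustration; the 16-bit wrap is an assumption. One thing it shows: with any odd step the parity of the seed flips on every call (65536 is even, so this survives the wrap), which means @random(2) strictly alternates. That is consistent and reproducible, but worth keeping in mind.

```python
def stepped_sequence(step, upper, count, start=0):
    """Return `count` results in 1..upper from a 16-bit counter advanced by `step`."""
    seed = start & 0xFFFF
    results = []
    for _ in range(count):
        results.append((seed % upper) + 1)
        seed = (seed + step) & 0xFFFF  # wrap like a 16-bit register
    return results


# Step 7 against a range of 2: the two outcomes simply alternate.
coin_flips = stepped_sequence(7, 2, 6)   # [1, 2, 1, 2, 1, 2]
# Step 7 against a range of 4 visits every value, in a fixed order.
one_in_four = stepped_sequence(7, 4, 4)  # [1, 4, 3, 2]
```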

-Dave

ethern0t · March 23, 2026, 3:19pm

Took a look at your Python script - the patching isn’t as horrible as I’d expected. It also serves as a nice form of documentation of what the patches look like.

I also took a look at zmachine.s and confirmed Z80 assembly is just as alien to me today as it was in 1981.

However, I do like the way you did the instruction dispatch. I currently do a two-way dispatch, once to parse the operands, and another to jump to the actual common instruction handler code. Your way is much nicer and easier to extend - it’s a single dispatch per instruction, but you avoid having a ton of excess code by having common entry points for “colder” paths. If a particular instruction is important enough, the ss/sv/vs/vv variants can get unique entry points, otherwise they all point to the same code that uses a slower path.
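To illustrate the idea (not the actual zmachine.s code, which is Z80 assembly), here is a rough Python sketch of a single dispatch table for long-form 2OP instructions. In that encoding bit 6 and bit 5 of the opcode byte say whether the first and second operands are small constants or variables, and the low five bits are the opcode number. Everything else is hypothetical: the `Machine` class, the handler names, and the flat variable array, which ignores the stack and locals.

```python
SMALL, VAR = 0, 1  # operand kinds in long-form 2OP instructions


class Machine:
    def __init__(self, code):
        self.mem = bytes(code)
        self.pc = 0
        self.vars = [0] * 256  # flat variable file, a simplification

    def fetch(self):
        value = self.mem[self.pc]
        self.pc += 1
        return value


def op_store(m, variable, value):
    m.vars[variable] = value


OPS = {13: op_store}  # only 2OP:13 (@store) is modelled here


def generic_2op(m, byte):
    """Cold path: decode the operand kinds at run time, then call the operation."""
    a = m.fetch()
    b = m.fetch()
    if byte & 0x40:
        a = m.vars[a]
    if byte & 0x20:
        b = m.vars[b]
    OPS[byte & 0x1F](m, a, b)


def store_ss(m, byte):
    """Hot path: @store with two small constants, no decoding needed."""
    variable = m.fetch()
    m.vars[variable] = m.fetch()


def make_dispatch_table(hot_handlers, generic_handler):
    """Build a 128-entry table indexed by the long-form opcode byte (0x00-0x7F)."""
    table = []
    for byte in range(0x80):
        key = (byte & 0x1F,
               VAR if byte & 0x40 else SMALL,
               VAR if byte & 0x20 else SMALL)
        table.append(hot_handlers.get(key, generic_handler))
    return table


def step(m, table):
    byte = m.fetch()
    table[byte](m, byte)  # one dispatch per instruction
```

Promoting another variant to its own entry point is then one more entry in `hot_handlers`, for example `{(13, SMALL, SMALL): store_ss}`; everything not listed keeps sharing the slower generic handler.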

I guess I’m not surprised that the two @store variants (opcode 13 of 2OP) are like 5+% of all opcodes executed, but that’s really good information to know!

-Dave

Mike_G · March 23, 2026, 3:51pm

For my interpreter’s integration tests I include the seed for my prng for reproducible runs but clearly that only works with my specific algorithm. Why not take a different approach: output random values to their own log. Terps that want truly reproducible behavior could read that log alongside a transcript and substitute the values whenever a random value is called for. It seems more flexible than tweaking algorithms, and allows multiple branch testing.
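A sketch of what record and replay could look like, with made-up class names (`RecordingRng`, `ReplayRng`) and an in-memory list standing in for whatever log file format people would agree on. Python's `random.Random` stands in for the recording interpreter's own generator.

```python
import random


class RecordingRng:
    """Wraps any generator and logs every value it hands to the game."""

    def __init__(self, source=None):
        self.source = source if source is not None else random.Random()
        self.log = []

    def random(self, upper):
        value = self.source.randint(1, upper)
        self.log.append(value)
        return value


class ReplayRng:
    """Ignores its own algorithm and feeds back previously logged values."""

    def __init__(self, log):
        self.values = iter(log)

    def random(self, upper):
        try:
            value = next(self.values)
        except StopIteration:
            raise RuntimeError("random log exhausted: run has diverged") from None
        if not 1 <= value <= upper:
            raise ValueError(f"logged value {value} does not fit range 1..{upper}")
        return value
```

The two error cases double as a cheap divergence check: if the replaying interpreter asks for more values than were logged, or for a range the logged value does not fit, it is no longer following the original run.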

zarf · March 23, 2026, 4:05pm

I sorta like that.

The log could be part of the transcript, but it could equally well just be a separate file of numbers. You can assume the interpreter will pull the right number of random calls off the list each turn, because it’s running deterministically, right…?

Mike_G · March 23, 2026, 4:08pm

Right, because any random values are guaranteed to match the values used by the original run regardless of algorithm. Unless you edit it…

Mike_G · March 23, 2026, 4:16pm

I’ve actually used brute force loops incrementing the prng seed in my tests to locate a working seed for a given walkthrough; for example, getting the urchin in The Lurking Horror to appear in a given room at a given time so that my scripted input works.
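Roughly, such a search could look like the sketch below. All of it is hypothetical: `lcg_rng` is a toy 16-bit generator with arbitrary constants standing in for an interpreter's real PRNG, and `urchin_on_turn_three` is an invented condition, not the actual game logic.

```python
def lcg_rng(seed):
    """Toy 16-bit LCG standing in for an interpreter's own PRNG."""
    state = seed & 0xFFFF

    def rand(upper):
        nonlocal state
        state = (state * 25173 + 13849) & 0xFFFF
        return (state >> 8) % upper + 1  # use the high byte, map to 1..upper

    return rand


def urchin_on_turn_three(rand):
    """Invented scenario: a 1-in-10 roll per turn must first succeed on turn 3."""
    rolls = [rand(10) for _ in range(3)]
    return rolls[0] != 1 and rolls[1] != 1 and rolls[2] == 1


def find_seed(scenario, limit=0x10000):
    """Try seeds in order and return the first one the scenario accepts."""
    for seed in range(limit):
        if scenario(lcg_rng(seed)):
            return seed
    return None
```

In a real test the scenario would run the interpreter against the scripted input and check the transcript, so each candidate seed costs a full playthrough up to the point of interest.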

A log of random values is actually much more powerful. It allows editing of individual values by hand to achieve specific results that would be incredibly hard to replicate in normal play.

ethern0t · March 23, 2026, 4:44pm

I guess my concern here is that some games may make a LOT of random calls, which could be a large amount of extra input to process. I guess it depends on whether games call random once every few turns, or potentially several times per turn. But we already have a potentially large input source.

Maybe a simpler idea – have a short list (a prime number of entries) that is the seed value for each random call, and just keep repeating the same list? So for example, every interpreter could have a list of 17 “values” that are picked over and over in order and reduced modulo the random upper range?
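As a sketch of that idea: the 17 values below are arbitrary placeholders (an agreed standard list would have to be chosen by convention), and `TableRng` is a made-up name.

```python
# 17 arbitrary 16-bit values; purely illustrative, not a proposed standard.
SEED_TABLE = (40503, 1217, 29, 65521, 7919, 12289, 257, 31337, 4099,
              52361, 613, 24593, 9973, 60013, 18181, 3571, 45007)


class TableRng:
    """Cycles through a fixed table and reduces each entry to 1..upper."""

    def __init__(self, table=SEED_TABLE):
        self.table = tuple(table)
        self.index = 0

    def random(self, upper):
        raw = self.table[self.index]
        self.index = (self.index + 1) % len(self.table)
        return raw % upper + 1
```

The state is a single index below 17, so it fits in one byte on an 8-bit machine, and the table itself costs 34 bytes.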

heasm66 · March 23, 2026, 5:19pm

This is a bit OT, but…

Inform6 inserts routines in the order they are defined and the veneer routines are compiled last so these are always at the end of the routines (before the high strings area starts).
They are quite easy to replace in your own project if you want. I use modified veneers that don’t print text messages in my reverse-engineered version of Curses, which I use for size optimization tests: heasm66/Curses_i6 on GitHub ("Reverse engineered and reconstructed Curses! from release 16").

Draconis · March 23, 2026, 5:48pm

For the first one, I imagine a lot of game authors have this for their regression tests. (I have at least one for each game I’ve released.) It would be nice to start a convention of releasing this alongside source code.

For the second one, a cautionary anecdote:

In Balances, one of the demonstration games released with Inform 6, this block of code is run every turn after completing a certain puzzle.

          if (location ~= Up_Road or Track || random(6)~=1) rfalse;
          if (random(4)==1 && self hasnt general)
          {   move feather to location; give self general;
              "^A tortoise-feather flutters to the ground before you!";
          }

In other words, a crucial item only appears if random(6) == 1 and random(4) == 1. This means it should be a 1 in 24 chance, right?

But multiple early Z-machine interpreters handled “fixed seed” mode for the PRNG by returning an incrementing counter modulo the seed. Which means that when debugging with a fixed seed, that chance is actually impossible! This algorithm will never return two odd numbers back to back.
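The effect is easy to reproduce. The sketch below models the simplest version of such a generator, a counter reduced modulo the requested range, which is an assumption rather than the exact algorithm of any particular interpreter. The function names are made up, and the location test is assumed to pass every turn.

```python
def counter_rng():
    """Incrementing counter, reduced modulo the requested range."""
    counter = 0

    def rand(upper):
        nonlocal counter
        value = counter % upper + 1
        counter += 1
        return value

    return rand


def feather_turns(rand, turns):
    """Return the turns on which both Balances checks would pass."""
    hits = []
    for turn in range(turns):
        if rand(6) != 1:
            continue  # mirrors `random(6)~=1` leading to rfalse
        if rand(4) == 1:
            hits.append(turn)
    return hits


# random(6) == 1 only happens when the counter is a multiple of 6, hence even.
# The very next call sees an odd counter, and an odd number modulo 4 is 1 or 3,
# so random(4) returns 2 or 4 and the second check can never pass.
no_feather = feather_turns(counter_rng(), 100000)  # []
```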

So whatever really dumb and deterministic algorithm you use, it needs to be sufficiently random to avoid pitfalls like this. Currently, aamrun and dgdebug (two tools used in Dialog debugging) both implement exactly the same PRNG as dfrotz, so that output will be consistent between the three:

seed = 0x15a4e35L * seed + 1;
r = (seed >> 16) & 0x7fff;
r = from + (r % (to - from + 1));
return r;

Which works great on a 32-bit machine, but less so on an 8-bit one.
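For test harnesses written in Python, the snippet quoted above can be ported as below. This is a sketch, not code taken from dfrotz, aamrun or dgdebug: the class name `QuotedLcg` is made up, and the seed is masked to 32 bits as an assumption. The mask should not change the results, because the output only uses bits 16 to 30 of the seed, and those depend only on the low 32 bits of the multiplication.

```python
class QuotedLcg:
    """Python port of the quoted snippet, with the seed held as 32 bits."""

    def __init__(self, seed=0):
        self.seed = seed & 0xFFFFFFFF

    def random(self, low, high):
        """Return a value in low..high inclusive."""
        self.seed = (0x15A4E35 * self.seed + 1) & 0xFFFFFFFF
        r = (self.seed >> 16) & 0x7FFF
        return low + r % (high - low + 1)


rng = QuotedLcg(0)
first_two = [rng.random(1, 6), rng.random(1, 6)]  # [1, 5]
```

The cost on an 8-bit CPU comes from the multiplication of a 32-bit seed by a 25-bit constant, which has to be built out of shifts and adds one byte at a time.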

zarf · March 23, 2026, 6:10pm

Balances is exactly the example I was thinking of when I mentioned some games having problems. :)

It was actually worse than that, as I recall: one interpreter used a rand() call (very old libc standard with terrible randomness). The results alternated between odd and even numbers even when it wasn’t in “fixed seed” mode.

zarf · March 23, 2026, 6:14pm

I guess the up side is that nobody ever mentions an example other than Balances, so this may not be a common problem.

Dbug · March 23, 2026, 6:17pm

Well, I learnt things today: I did not realize that Inform was basically a virtual processor implementation. I thought these adventure game systems were more like “domain specific languages” made specially for adventure games, not actual Turing complete assembly code.

Draconis · March 23, 2026, 6:25pm

Yeah, one of the key innovations in the microcomputer era that made Infocom and Adventure International so successful was compiling their DSL to bytecode for a virtual machine. That way, when a new computer came on the market, they only had to port their VM interpreter to the new platform, and their entire library of games would now be compatible with it.

In the 21st century that’s less of a concern—everywhere has a C compiler (and a Java interpreter and a Python interpreter), and even if you just release executables for Windows, Mac, and Linux, that will cover 99.9% of the market. But now these VMs have other benefits: interpreters can keep up with new processors and new OS versions without game authors needing to keep rebuilding their games, and once someone wrote a Z-machine interpreter that runs client-side in a web browser, suddenly every Z-machine game could be played straight from the web.

So, they’ve stuck around, which means new games can also still be run on old Apple ][s and so on (if they’re not too greedy with memory).

ethern0t · March 23, 2026, 7:04pm

How do people here actually develop for retro platforms? I’ve done most of my work so far using Virtual ][ because it’s highly regarded and runs on a Mac, but it seems to have little in the way of debugging support other than “hardware” breakpoints I use to stop at known places in my code.

I’m writing all of my code in VS Code and switching to a terminal window to run make. It would be cool to do source level debugging connected to an emulator though. I have an old Atari 2600 and C64 emulator I cobbled together years ago that I might mutate into an Apple ][ emulator to give me better debugging tools, but I figure there’s got to be something out there already?

Unfortunately while I use Windows machines all day for my day job, I don’t have access to any after hours. Just Macs.

-Dave

Angstsmurf · March 23, 2026, 8:25pm

I don’t actually develop for retro platforms, but in my experience the MAME debugger outshines the rest, especially on the Mac.

fredrik · March 23, 2026, 9:46pm

An important point nowadays is safety. As far as I know, a Z-code file can’t contain a virus or other malicious code. As I write this, I start to realize I might be wrong though. If you know of a weakness in a certain Z-code interpreter, you may be able to craft a Z-code file that makes the interpreter do something the interpreter author didn’t intend. Can this be used to write truly malicious code? Not sure. And is e.g. the TADS3 format more or less safe?

Draconis · March 23, 2026, 9:52pm

It’s certainly possible for there to be a vulnerability in a particular interpreter, but I would agree that Z-machine files are much safer than plain executables. They’re only dangerous if there’s a specific bug that can be exploited, while executables are dangerous by design.

fredrik · March 23, 2026, 9:57pm

I mainly use VICE when debugging for Commodore 8-bits. It’s really good: not perfect, but usually good enough. I’ve found rough edges and incorrect behaviours on the C128, like REU handling and VDC timing. The C64 is a simpler machine, and the C64 part of VICE is better tested. In the monitor of VICE, you can do things like set a breakpoint when the program stores something in a certain address range, but only break if accumulator is > $7f and RAM at address $8000 holds the value $ff. And you have the powerful “CPU history” command, which allows you to look at the CPU state several thousand instructions before the current state.

A few times, when debugging really tricky stuff, I’ve used C64 Debugger (available at CSDB). It’s very powerful, but it comes with a learning curve.