How much work is writing a Z-Machine interpreter?

mulehollandaise · August 2, 2020, 6:04am

I have never attempted to write a Z-Machine interpreter, and I doubt I ever will; but I’m toying with the idea of commissioning a Z-machine interpreter for an old 80s computer. Before I go any further with that project, though, I need more information on it; and I don’t recall reading anywhere, SPAG or other, the perspective of people who create interpreters. So, I have a few questions, but feel free to just talk about it generally.

For anyone who has done so, how hard is it to write a Z-machine interpreter? What are the steps involved? Being comfortable with the target platform, reading the Z-Machine spec, following it page by page and writing the functions as you go? Or is there another way? (A general structure for an interpreter, that floats around somewhere?) How do you do it?

I feel like people, some with not a lot of experience in IF, write Z-machines in whatever language as an exercise; GitHub is absolutely full of implementations in lots of different languages or for different targets. Yet, for me, the task feels daunting; I don’t even know my way around the I6 library after 13 years of playing with it, so an interpreter… But Infocom’s interpreters from the 80s are, like, 12kb of code, no more. So, how hard is it really?

Are there any traps? Do you have to know your target really well in order to have acceptable performance? Is just-in-time compilation a must-have, or just an optional, advanced trick?

I’m thinking of commissioning a V3 implementation, which should be enough for my needs; am I right in thinking that V5 is much harder, since you have time-based events, text styles, etc? Or is it just a dozen more opcodes to support, which is easy once you have the structure?

And finally, an impossible question: any idea of how many hours of work are involved in creating a Z-Machine v3 interpreter?

Thank you all for your input

lft · August 2, 2020, 2:07pm

Something like that. The Z-machine specification is excellent! There might be some rough edges, but overall it’s incredibly clear and thorough.

One could also use an existing open-source interpreter as a starting point.

As always with this kind of project, you’ll get 90% of the way in 10% of the time. It’s easy to lose motivation when only the obscure bugs and hairy corner cases remain.

To get all the way, I would recommend creating an automated setup where you run an existing game in your interpreter, and feed it input from a walkthrough. Log the state of the interpreter, e.g. the current program address. Then you patch an existing interpreter, such as dumbfrotz, to log the same thing. By comparing the logs from the two interpreters, you can easily see where their behaviour diverges.
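A minimal sketch of that log comparison, assuming both interpreters write one line per executed instruction in the same format (the `pc=...` format in the test data is made up for illustration, not something dumbfrotz emits out of the box):

```python
from itertools import zip_longest


def _clean(line):
    """Strip line endings so logs written on different platforms still compare equal."""
    return None if line is None else line.rstrip("\r\n")


def first_divergence(reference_lines, candidate_lines):
    """Return (index, reference_line, candidate_line) for the first mismatch.

    Both arguments are iterables of strings, e.g. open file objects.
    A missing line (one log is shorter than the other) is reported as None.
    Returns None when the two logs are identical.
    """
    pairs = zip_longest(reference_lines, candidate_lines)
    for index, (ref, cand) in enumerate(pairs):
        ref, cand = _clean(ref), _clean(cand)
        if ref != cand:
            return index, ref, cand
    return None
```

Because it takes any iterable of lines, you can pass two open log files directly and it stops reading at the first mismatch, which matters when a full walkthrough produces millions of lines.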

If the entire game doesn’t fit in RAM, then disk i/o will be the performance bottleneck.
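To illustrate why: an interpreter that can't hold the whole story file typically keeps only a few pages of it in memory and fetches the rest on demand. Here is a rough sketch of a least-recently-used page cache. The 512-byte page size, the `read_page` callback and the class name are all assumptions for the example, not taken from any particular interpreter:

```python
from collections import OrderedDict


class PagedStory:
    """Serve story-file bytes from a small LRU cache of fixed-size pages."""

    def __init__(self, read_page, max_pages, page_size=512):
        self.read_page = read_page    # hypothetical callback: page number -> bytes
        self.max_pages = max_pages    # how many pages fit in RAM at once
        self.page_size = page_size
        self.cache = OrderedDict()    # page number -> page contents, oldest first
        self.disk_reads = 0           # counts the slow operations

    def read_byte(self, addr):
        page_no, offset = divmod(addr, self.page_size)
        page = self.cache.get(page_no)
        if page is None:
            # Cache miss: this is the expensive disk access.
            page = self.read_page(page_no)
            self.disk_reads += 1
            self.cache[page_no] = page
            if len(self.cache) > self.max_pages:
                self.cache.popitem(last=False)  # evict the least recently used page
        else:
            self.cache.move_to_end(page_no)     # mark the page as recently used
        return page[offset]
```

Every miss costs a disk read, so the number of pages you can keep resident largely decides how playable the game feels.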

Much of that is optional. V3 has some special quirks of its own, like a fixed status bar format. I’d probably start with an existing game that I wanted to run, and let that decide what story format to implement.
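Finding out which story format a given game uses is easy: the version number is the first byte of the story file's 64-byte header. A small sketch (the function name is made up for the example):

```python
HEADER_SIZE = 64  # the Z-machine header occupies the first 64 bytes of the story file


def story_version(story_data):
    """Return the Z-machine version number stored in byte 0 of the header."""
    if len(story_data) < HEADER_SIZE:
        raise ValueError("too short to be a Z-machine story file")
    return story_data[0]
```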

Dannii · August 2, 2020, 2:19pm

Writing an interpreter for an 80s computer will be considerably more complex than a modern one, unless you pick one of the later more powerful ones like the Apple IIGS. This is because you’ll have to care about memory limits. A JIT is definitely not essential and probably not feasible for an 80s era system.

There’s an active little C64 community right here on this forum - unless you have specific ideas for a different platform, why not consider joining up and helping their projects?

mulehollandaise · August 2, 2020, 2:44pm

@lft Thank you for your perspective

@Dannii Thank you! (I guess I only saw JIT mentioned in desktop interpreters, not older ones; that explains it.) Yes, I’m thinking of an old computer that never had a Z-Machine written for it. I’m just trying to figure out how much of an undertaking it would be so I can be more precise with the eventual person who would attempt it, regarding what is needed and how much work it’d represent, which would help them quote a number for me.

Dannii · August 2, 2020, 2:54pm

For an experienced programmer who is familiar with the Z-Machine format as well as the target machine, I’d guess that a dozen hours might be enough for basic functionality. But it would take a lot more to include full IO support, save and restore, multiple Zcode versions, etc.

jcompton · August 2, 2020, 2:59pm

You may have a slightly distorted understanding of the nature of low-level programming for very constrained systems.

I also suggest that you just say the model of the computer you’re thinking of, because there are at least two active posters to this forum who could give a pretty authoritative assessment of the difficulty level relative to, say, writing a new interpreter for a Commodore 64 or Apple II.

Mike_G · August 2, 2020, 3:08pm

I’ve written several (non-public) interpreters. Doing only V3 on a modern architecture shouldn’t be very hard for an experienced programmer who can read and understand the z-machine spec. Doing it on an 80’s platform is definitely harder.

Most of the time required on a modern architecture would be used in becoming familiar with the spec’s terms and general layout of the z-machine as you implement it in small chunks. The recommendation above to try it on a specific game is good. I first started with Zork I, and it felt very rewarding to me that when I finally got as far as decoding text, the first string printed to the screen is “ZORK”. Most of the implementation is not very difficult. The trickiest areas are probably text decoding and the object manipulation instructions (object instructions are a pain and my least favorite area to work on). On older hardware, being very familiar with the target platform would be key.
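To give an idea of what text decoding involves, here is a simplified sketch for V3 strings, written from my reading of the spec rather than taken from any of those interpreters. Text is packed as three 5-bit Z-characters per 16-bit word, with the top bit marking the last word. The function names and the optional `abbrev` callback are hypothetical, and the sketch assumes the default alphabet table:

```python
A0 = "abcdefghijklmnopqrstuvwxyz"
A1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# Entries for Z-chars 6..31 of alphabet A2; the first slot (Z-char 6) is the
# escape code and is handled separately, so the space there is a placeholder.
A2 = " \n0123456789.,!?_#'\"/\\-:()"


def unpack_zchars(data, addr=0):
    """Split big-endian 16-bit words into 5-bit Z-characters until the end bit."""
    zchars = []
    while True:
        word = (data[addr] << 8) | data[addr + 1]
        addr += 2
        zchars += [(word >> 10) & 31, (word >> 5) & 31, word & 31]
        if word & 0x8000:  # top bit set: this was the last word of the string
            return zchars


def decode_zstring(data, addr=0, abbrev=None):
    """Decode a V3 Z-string; abbrev(index) may supply abbreviation text."""
    zchars = unpack_zchars(data, addr)
    out = []
    alphabet = 0  # 0 = A0, 1 = A1, 2 = A2; V3 shifts apply to one character only
    i = 0
    while i < len(zchars):
        z = zchars[i]
        i += 1
        if z == 0:
            out.append(" ")
        elif z in (1, 2, 3):
            # Abbreviation: the next Z-char selects one of 32 entries per bank.
            if i < len(zchars):
                index = 32 * (z - 1) + zchars[i]
                i += 1
                out.append(abbrev(index) if abbrev else "")
        elif z in (4, 5):
            alphabet = z - 3  # temporary shift to A1 or A2
            continue          # keep the shift for the next Z-char
        elif alphabet == 2 and z == 6:
            # Escape: the next two Z-chars form a 10-bit ZSCII code.
            # chr() is only right for the ASCII range (a simplification).
            if i + 1 < len(zchars):
                out.append(chr((zchars[i] << 5) | zchars[i + 1]))
            i += 2
        else:
            out.append((A0, A1, A2)[alphabet][z - 6])
        alphabet = 0  # shifts last for a single character
    return "".join(out)
```

For example, the four bytes `7E 97 C0 A5` decode to "zork". A real interpreter also has to follow abbreviations into the abbreviation table and map the full ZSCII character set, which is where most of the fiddly cases live.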

Being familiar with Inform is not a requirement, or even particularly useful, as the interpreter is lower level.

Mike_G · August 2, 2020, 3:14pm

Any programmer will tell you, code size means nothing. The size of a puzzle does not determine its difficulty.

You can fit a lot of tiny programs into less than 1KB of code: 1KB = 2^8192 possible programs, but only one is likely to do what you want.
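The arithmetic behind that figure, as a quick sketch (the function name is just for illustration): 1 KB is 1024 bytes of 8 bits each, so there are 8192 bits that can each be 0 or 1.

```python
def possible_programs(size_bytes):
    """Number of distinct bit patterns that fit in size_bytes bytes."""
    bits = size_bytes * 8
    return 2 ** bits


if __name__ == "__main__":
    count = possible_programs(1024)
    # Printing the number itself is not very enlightening; its length is.
    print(len(str(count)), "decimal digits")
```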

8bitAG · August 2, 2020, 3:28pm

I would also check that there aren’t any active projects already for the machine you want an interpreter for. It’d have to be a pretty obscure (or low spec) machine for someone to have not already had a stab at starting some work on an interpreter.

mulehollandaise · August 2, 2020, 3:51pm

Sorry y’all about the 12kb comment, it made me look ignorant and like I don’t know what the demoscene is. I meant it more like “it’s not a sprawling project with tons of files and dependencies, it’s a medium-size project with a clearly defined spec and dozens of people doing it every year”.

The machine I’m eyeing up is the Thomson TO line of computers. Millions were sold in France and hundreds in Italy and that’s about it, which explains why there’s no z-machine for it; its processor is the same as the CoCo (for which there is an Infocom terp and source code) but I don’t know how much that helps. There are a few people still programming for that platform, but most probably they’re not really familiar with the Z-Machine.

(Can anyone remind me what the minimum RAM requirements are for a z3 terp? There are several computers in that line, broadly compatible, and I’m wondering how many of the older models could potentially support a terp.)

fredrik · August 2, 2020, 4:10pm

Oh, not too much work.

We started to write our interpreter for the C64 in March of 2018, and we had it working well enough to play through Hollywood Hijinx at the beginning of June. I would say we spent most of our free time on this project for ~2 months, maybe 300-400 hours. Then we started the process of testing, fixing bugs and optimizing it for both size and speed, and adding some nifty features. We probably spent like 300-400 hours on this before we released version 1.0 at the end of December. Our optimization efforts resulted in the playthrough of HH going about twice as fast, and the interpreter can now execute up to ~2500 simple Z-code instructions per second. This interpreter supports v3, v4, v5 and v8. The interpreter supports timed input, text colours, and it uses all 40 characters per line, something that (AFAIK) no Infocom interpreter ever did. Also, it supports a custom character set and languages which use accented characters.

We were two people, programmers by trade for 20+ years, with intimate knowledge of the C64, but without much detailed knowledge of the Z-machine when we started out.

What makes it quite challenging to write a good Z-code interpreter for an 8-bit machine is that it has to be both small and fast.

If you only go for z3, you don’t care about optimizations or nice-to-have features, and you know the target platform well, I think you could write an interpreter in as little as 200-300 hours, with an acceptable level of quality. YMMV, and this is just a guesstimate. If you expect to be done in a dozen hours though, you’re in for a surprise.

mulehollandaise · August 2, 2020, 5:09pm

@fredrik thanks for your estimates! I wasn’t expecting 12 hours, more like 60-100h. Sounds like I was off by a factor of 4?
In any case, thank you all for your input. There’s not enough post-mortems/interviews/discussions on the work behind technical tools!

eriktorbjorn · August 3, 2020, 6:04am

There’s not enough post-mortems/interviews/discussions on the work behind technical tools!

Brian Moriarty, who wrote a couple of the interpreters, appeared on episode 30 of The CoCo Crew Podcast. One of the things he said at around 1 hour and 40 minutes into it was:

So as soon as the test suite was running, I downloaded all of the virtual games onto Tandy disks - CoCo disks - and our testing department started testing all the games on it, and there was no errors found, it ran them perfectly. So that was the advantage of our virtual technology is that one person working for, you know, one month or less could get all the games running on that system. So as soon as the interpreter was certified, we started manufacturing. (Emphasis added.)

The source code for the CoCo interpreter that they were discussing has since been published on-line.

jcompton · August 3, 2020, 1:13pm

That’s not terribly far from Fredrik’s 200-ish hour guesstimate on writing a z3-only interpreter (which is what most of the official Infocom interpreters were).

Moriarty is also describing a situation in which said interpreter is someone’s full-time employment (or suitably significantly paid contract work), and with the full on-call resources of most of the people who designed the system and had written previous interpreters, and their codebases, etc. etc.

Also, those interpreters were only expected to support the sum total of Infocom releases up to that point. Nobody was gonna be put out if it wasn’t strictly compliant enough to run A Change In the Weather or whatever.

Piergiorgio_d_errico · August 5, 2020, 10:18am

Perhaps less in the case of the minizork interpreter. The only means of fitting minizork’s 50K and the zip 'terp together in the 64K of the C=64 is to use the RAM below the ROM, and said ROMs are two non-contiguous 8K banks.

Best regards from Italy,
dott. Piergiorgio.

fredrik · August 5, 2020, 11:15am

Infocom’s early z3 interpreter was ~8 KB on the C64. The newer version, which had a few more bells and whistles, was a bit bigger than this.

The C64 has, as the name implies, 64 KB of RAM, which is all one large contiguous area. The first 52 KB are easily accessible from machine language (AKA assembler or machine code). Then there’s 4 KB which are normally hidden by I/O chips like the video chip and the sound chip, and 8 KB which are normally hidden by Kernal ROM. Both of these areas can be accessed in ML, it’s just a little more work than the rest.
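As a sanity check on those numbers, here is the same layout as a small table. The hex addresses are the standard C64 memory map and are added here for illustration; the region names are made up for the example:

```python
KB = 1024

# (name, start address, end address exclusive, note)
C64_RAM_REGIONS = [
    ("main RAM", 0x0000, 0xD000, "easily accessible"),
    ("RAM under I/O", 0xD000, 0xE000, "normally hidden by the I/O chips"),
    ("RAM under Kernal ROM", 0xE000, 0x10000, "normally hidden by Kernal ROM"),
]


def region_sizes_kb(regions):
    """Map each region name to its size in KB."""
    return {name: (end - start) // KB for name, start, end, _note in regions}


def total_kb(regions):
    """Total RAM covered by the listed regions, in KB."""
    return sum(region_sizes_kb(regions).values())
```

The three regions come out as 52, 4 and 8 KB, which adds up to the full 64 KB.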

Infocom’s regular z3 interpreters never used the last 12 KB. The Minizork interpreter must have used it, and their z5 interpreters used it.

Ozmoo uses all of the RAM, and can also run in a kind of Mini-Zork mode, where the whole story file stays resident in RAM. This allows for a z3 story file up to ~53 KB in size, IIRC.