Z-machine standard: unclear aspects/ambiguities

NB: Please note that the original version of this message contained URLs to the relevant sections of the Z-machine standard, but the forum engine wouldn’t allow me to include them. I sincerely apologize for any inconvenience caused by this.

Hello everyone!

Recently, I’ve been working on an attempt to implement a Z-machine interpreter in .NET Core, as a hobby project. This is mostly to practice my programming skills in my spare time, as well as to scratch a certain itch - I have never played any IF game, but the concept of the Z-machine has always seemed fascinating to me, ever since I learned about it. So I’m not looking to use an existing interpreter because my primary interest is coding. And I’ve been trying to avoid looking at any existing interpreter’s source code, if possible, and instead use the Z-machine 1.1 standard (at inform-fiction dot org) as my primary reference, complemented by sample Z-machine story files for unit and integration testing.

It appears, however, that there are certain things in the aforementioned standard that are not explained very clearly; there also seem to be some ambiguities here or there. So I’m posting this message in hopes that other developers out there could help me clarify these ambiguities, if possible. In case the standard I linked to is outdated, I will also be grateful if someone could provide me a link to the latest version (I couldn’t find anything newer than 1.1).

I apologize in advance if anything I’m going to ask has already been discussed elsewhere on this site. I’ve been mostly using Google to look for additional information, and I have already been able to clarify a few less obscure moments, some of them with the help of this site. I also think it’s best to collect all the questions in a single thread so that I can easily look up the answers later. But please don’t hesitate to provide a link in case there is an existing thread where a specific issue has already been properly clarified.

With all that in mind, here’s the list of things I could use help with, in no particular order:

1. Table locations

Throughout the entire standard, there are not too many mentions of what kind of table can be stored where:

  • An example memory map at the end of S.1 shows that the abbreviation table might be located in dynamic memory, which means it can theoretically be altered by the game during play. Looking at the sample files, this actually seems to be the case.
  • S.13.1 also mentions that the default dictionary is stored in static memory, which is also confirmed by examining samples.
  • It is also quite clear that certain tables (e.g. the object table) must be located in dynamic memory for the game to properly function.

However, as far as I’m aware, there is no mention of what kind of memory the custom alphabet table (address in byte $34 of the header) or the Unicode translation table (word 3 in header extension table) can be stored in. Am I to assume that they can also be located in dynamic memory, and so it is theoretically not safe to deserialize them in advance? Unfortunately, none of my sample files feature any of these two tables, so I’m unable to confirm or deny that.
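
To make the caching question concrete, here is a minimal sketch of one possible policy: treat a table as safe to deserialize in advance only if it starts at or above the static memory base (header word $0E). The helper names are made up for illustration, and the policy itself is an assumption, not something the standard prescribes.

```python
def read_word(mem, addr):
    """Read a big-endian 16-bit word from Z-machine memory."""
    return (mem[addr] << 8) | mem[addr + 1]

def static_base(mem):
    """Byte address where static memory begins (header word $0E)."""
    return read_word(mem, 0x0E)

def alphabet_table_address(mem):
    """Address of the custom alphabet table (header word $34), or 0 if absent."""
    return read_word(mem, 0x34)

def can_cache_table(mem, addr):
    """Hypothetical policy: a table that starts in dynamic memory may be
    rewritten by the game, so only tables at or above the static base
    are deserialized once and reused."""
    return addr >= static_base(mem)
```

A table that fails this check would simply be re-read from memory each time it is needed.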

2. What’s legal to do during an interrupt routine

S.6.1.1.3 states:

It is illegal to save the game (either with save or save_undo) during an “interrupt routine” (one coming about through timed input, sound effect termination or newline interrupts).

It does not mention, however, whether it is legal to perform some other action that could alter the entire game state, e.g. restore(_undo), restart, or quit. Am I to assume these kinds of operations are legal? Also, is it legal to use partial save (the version with 3-4 operands)? I’d assume it is because it only reads a chunk of memory and does not mess with the state of the Z-machine itself, but the standard does not make it clear.

3. User stack in version 6

S.6.6 states:

In Version 6, the Z-machine understands a third kind of stack: a “user stack”, which is a table of words in dynamic memory. The first word in this table always holds the number of spare slots on the stack (so the initial value is the capacity of the stack).

It would be logical to assume that values are pushed onto a user stack in right-to-left order (e.g. for a stack with capacity N at address A, the first word pushed will be at address A + 2*N, the second one at A + 2*(N-1), and so on), but it’s not explicitly stated anywhere. Is my assumption correct?
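
The layout I have in mind can be sketched roughly like this. It is only my reading of S.6.6, the function names are invented for the example, and underflow is deliberately left unhandled:

```python
def read_word(mem, addr):
    """Read a big-endian 16-bit word."""
    return (mem[addr] << 8) | mem[addr + 1]

def write_word(mem, addr, value):
    """Write a big-endian 16-bit word."""
    mem[addr] = (value >> 8) & 0xFF
    mem[addr + 1] = value & 0xFF

def push_user_stack(mem, stack, value):
    """Push onto the user stack at address `stack`.
    Returns False when there are no spare slots left."""
    spare = read_word(mem, stack)
    if spare == 0:
        return False
    # The first push lands at stack + 2*N, the next at stack + 2*(N-1), ...
    write_word(mem, stack + 2 * spare, value)
    write_word(mem, stack, spare - 1)
    return True

def pull_user_stack(mem, stack):
    """Remove and return the most recently pushed word (no underflow check)."""
    spare = read_word(mem, stack) + 1
    value = read_word(mem, stack + 2 * spare)
    write_word(mem, stack, spare)
    return value
```

With a capacity of 3 at address 4, the first pushed word would end up at address 10 and the second at address 8.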

4. Echoed input and newline interrupt (version 6)

S.7.1.1.1 states that in all versions, including version 6, the player’s input should be echoed to output stream 1 (the screen).

The question: If the current window has a newline interrupt routine set up, and wrapping is enabled, should the newline interrupt routine be triggered when the cursor reaches the right margin while player input is being echoed?

5. Output stream 3 and header word $30

S7.1.2.1 states that in version 6, after stream 3 is closed, the total width of printing (in units) must then be stored in the word at $30 in the header.

The question: What kind of width should be written in case the output was formatted (by giving the output_stream opcode a third operand): the width of a single line, or the total width of all lines combined?

6. Line count and interrupt countdown

S8.8.3.2 mentions, among others, a “line count” and an “interrupt countdown” as window properties for version 6. Then, S8.8.3.2.2 mentions (quote):

…if the interrupt countdown is set to a non-zero value (which by default it is not), then the line count is decremented on each new-line, and when it hits zero the routine whose packed address is stored in the “newline interrupt routine” property is called before text printing resumes.

It seems as if there is a typo there - it probably should be “interrupt countdown” instead of “line count” that gets decremented. This is only logical, as it is a countdown, after all - i.e. a counter which counts down. Am I correct?

Assuming the answer to my previous question is “yes”, it is also not clear how exactly “line count” is supposed to be made use of: S.8.8.3.2.6 says that the interpreter should use it to see when it should print “[MORE]”, and a line count of -999 means “never print [MORE]”. But what should be done if line count is not -999? The standard does not seem to mention that at all.

7. Non-existing sound effects

The standard explains well that the game needs to make sure a picture is available before attempting to draw it, and that attempting to draw a non-existent picture is illegal. It does not say anything similar for sound effects, though. Am I to assume that if a sound effect is not available, any call to sound_effect (if it’s not a beep) should simply be ignored?

8. read_mouse and menu word

The specification for the read_mouse opcode dictates that one of the words written to the resulting table should contain the number of the selected menu and its item. But it does not say what should be written there if the mouse pointer does not point to a menu item, or if it points to a menu heading but not the menu contents. Should the word be left unchanged then, or zeroed out?

9. Purpose of exposing system menu numbers (0-2)

S10.4.2 states:

Menus are numbered from 0 upwards. 0, 1 and 2 are reserved for the interpreter to manage (this system has only been implemented on the Macintosh, wherein 0 is the Apple menu, 1 the File menu and 2 the Edit menu).

But what is the exact purpose of making the game aware these menus even exist? The game cannot make an assumption of what the contents of these menus might be (it could differ between OS versions, for example), and there is no way to communicate this information to the game. Is it just a quirk of the original implementation?

10. Undefined EXT opcodes

The following notes in S.14.2 and 14.2.1 are quite perplexing:

Formally, it is illegal for a game to contain an opcode not specified for its version. <…> However, extended opcodes in the range EXT:29 to EXT:255 should be simply ignored.

The big problem here is that the exact format of each instruction, like the presence of store and/or branch data, depends on the particular opcode. This means that if an unknown opcode is encountered, it is not possible to properly deserialize an instruction from the data stream because there is no way to know whether we’ve read it all, or if there’s additional store/branch data remaining - which might lead to invalid data being read when the next instruction is deserialized. Ergo, I have no idea how I am supposed to ignore illegal opcodes.

11. Leftover input from read

The spec for the read opcode, in particular, the timed input part, says:

In Version 4 and later, if the operands time and routine are supplied (and non-zero) then the routine call routine() is made every time/10 seconds during the keyboard-reading process. If this routine returns true, all input is erased (to zero) and the reading process is terminated at once.

Pay attention to the second part: if the interrupt routine returns true, then all the input already in the buffer should be erased. And compare that to what it says next, when mentioning the new format of the text buffer in version 5 and later (emphasis mine):

Moreover, if byte 1 contains a positive value at the start of the input, then read assumes that number of characters are left over from an interrupted previous input, and writes the new characters after those already there.

However, I’m unable to understand how this might be possible - won’t the buffer always be empty because all input gets erased in case it is interrupted? The logical assumption is that it should only be erased in version 4 and kept in later versions, but this is not explicitly stated anywhere. Further clarification is necessary.
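
For reference, this is roughly how I picture the "leftover" behaviour on the version 5 text buffer (byte 0 = maximum number of characters, byte 1 = number of characters present, text from byte 2). It is a sketch only: the function name is made up, and whether the leftover characters count against the maximum is my own assumption.

```python
def append_input(mem, text_buffer, typed):
    """Write newly typed ZSCII codes after any characters already counted
    in byte 1 of a V5+ text buffer. Returns the new character count."""
    max_chars = mem[text_buffer]
    leftover = mem[text_buffer + 1]
    # Assumption: leftover characters use up part of the buffer's capacity.
    room = max(0, max_chars - leftover)
    accepted = typed[:room]
    for offset, code in enumerate(accepted):
        mem[text_buffer + 2 + leftover + offset] = code
    mem[text_buffer + 1] = leftover + len(accepted)
    return leftover + len(accepted)
```

So a buffer that already holds "go" (byte 1 = 2) would hold "go north" after the player types " north".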

12. scan_table

Two things here:

  • The spec for the scan_table opcode says that an additional argument can be used to configure whether the table to be scanned contains words or bytes. Now, suppose the first operand (the value to search for) resolves to a word (i.e. it is anything but a “small constant”). In this case, should we immediately return 0 if the high byte of this word is non-zero, or should we truncate the word to a byte and perform the search as usual?
  • From the default value of the third argument ($82 = 10000010b), it appears that the field length of the table is always given in bytes, even if we’re searching for a word. Is it legal in this case to have a field length of 1, and if it is, what should be done in this scenario?

13. Empty object names

In my sample files, some objects in the object table have a zero-length name (i.e. the value of the first byte in the property table is 0). Is it legal to print these objects with print_obj?

14. Some dictionary words have invalid format

In some of my sample files, the default dictionary contains words which do not have the high bit set in the final word of the Z-encoded string. Some of them appear to be exact duplicates of the neighboring word, while a few others are unique. In theory, such words are not able to be looked up, and they also break the binary search algorithm because they violate S13.5, which states that all words must come in numerical order of encoded text. Even though the latter is not a problem for any modern software (and I use a hash table instead of binary search anyway), the purpose of these entries is still a bit unclear. What I’d like to know is whether I’m correct in my assumption that such a word should never be found by dictionary lookup. If necessary, I can provide concrete examples as well.
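
In case it helps anyone reproduce this, the check I'm doing boils down to something like the following sketch (function names are mine; the bit layout of three 5-bit Z-characters plus a top "end" bit per 2-byte word is from S3.2):

```python
def unpack_zwords(data):
    """Split Z-encoded bytes into one (end_bit, [z1, z2, z3]) tuple per 2-byte word."""
    out = []
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]
        zchars = [(word >> 10) & 0x1F, (word >> 5) & 0x1F, word & 0x1F]
        out.append((word >> 15, zchars))
    return out

def entry_is_terminated(entry):
    """True if the last 2-byte word of a dictionary entry has its top bit set."""
    return unpack_zwords(entry)[-1][0] == 1
```

For a V3 entry (4 bytes of encoded text), `entry_is_terminated(bytes([0x18, 0xC6, 0x94, 0xA5]))` is true, while the same bytes with the last word changed to `0x14A5` fail the check.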

15. store opcode corner case

The store opcode features two operands, the first of which is a variable number indicating where to put the result, while the second operand provides the value to put in the location referred to by the first operand.

Now, suppose the first operand indicates that the value should be put into the top of the stack (NB: not pushed), and the second operand is of type “variable” with value 0, which indicates a value should be popped from the stack. There seems to be a conflict here, as the value we’re updating literally ceases to exist during the update process. If this is legal, what should happen? The logical course of action would be to replace the new value now on top of the stack (i.e. we essentially “pop” a value from under the top of the stack with such an operation), but this is not adequately explained - clarification is needed.

Final note

So far, that’s all I’ve found, but I guess there’s likely to be more as I continue to develop my personal project (I don’t even have a working backend yet). I will post further questions here in this thread as they appear, but for now I will be very grateful if someone is able to help me clarify at least some of the above issues.

Thank you very much.

I’ll cherry-pick this to respond to because it’s pretty simple to answer.

Yes, you’re correct: you can’t be sure if they’re store/branch, so you can’t reliably ignore them. I think all interpreters just assume no store/branch and hope for the best.

However, the Zoom interpreter has made use of EXT:128-131. I’ll quote its README below, but in short, EXT:130 is a store, and the rest are no store/branch. You probably won’t ever run into them unless you use the zmark benchmark program included with Zoom, but at least these have been in existence for some time, so probably worth handling EXT:130. Although, even the zmark program doesn’t use EXT:130!

From the Zoom README:

start_timer (EXT:128) (no arguments, neither branch nor store)
This makes a note of the time this instruction is used. Normally this
will be the CPU clock time returned by clock().

stop_timer (EXT:129)
This makes a note of the time this instruction is used, storing it
seperately from the time marked by start_timer.

read_timer (EXT:130) (store)
This stores the difference between the start time and end time, as
defined by using the instruction above, in the variable. This time is
in centiseconds.

print_timer (EXT:131)
This displays the difference between the start time and end time as a
decimal number of seconds, to centisecond precision.
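
A rough sketch of the "assume no store/branch and hope for the best" approach, with EXT:130 treated as a store as described above. The operand-type encoding (two bits per operand, `11` meaning omitted) is from S.4 of the standard; the function and constant names are just for illustration.

```python
# EXT opcodes known to be followed by a store byte (Zoom's read_timer).
ZOOM_STORE_OPCODES = {130}

def skip_unknown_ext(mem, pc, store_opcodes=ZOOM_STORE_OPCODES):
    """pc points at the $BE byte of an extended instruction.
    Returns the address of the next instruction, assuming the opcode has
    no branch data and stores only if it is listed in store_opcodes."""
    assert mem[pc] == 0xBE
    opcode = mem[pc + 1]
    types = mem[pc + 2]
    pc += 3
    for shift in (6, 4, 2, 0):
        kind = (types >> shift) & 0b11
        if kind == 0b11:
            break  # omitted: no further operands follow
        pc += 2 if kind == 0b00 else 1  # large constant = 2 bytes, otherwise 1
    if opcode in store_opcodes:
        pc += 1  # one byte naming the result variable
    return pc
```

If the guess about store/branch bytes is wrong, the next instruction will be decoded from the wrong offset, which is exactly the risk discussed above.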

The best reasons to do a Z-machine implementation. :)

I am not the top expert on this stuff (honest) but I will say:


It is illegal to save the game (either with save or save_undo ) during an “interrupt routine”

The rationale is that a save here would have to contain extra information: the buffer position and length the current @read is reading into. There’s no place in the save file to put that info. So it can’t be done.

restore/restart/quit in the interrupt routine doesn’t have these problems. (The interpreter just has to forget where the current @read is reading into!) So there’s no reason for the spec to forbid these cases. However, I bet nobody’s ever tested them either. If an interpreter supports them, it’s very much going the extra mile.


Where’s that from?

Thanks, I will keep those in mind.

Yeah, it is pretty much obvious why saving in this case is explicitly forbidden, but other actions such as restore might also need special handling if invoked while reading user input, depending on the implementation. That’s why I had to clarify this. Technically, it wouldn’t be very difficult to support them, but it would complicate the architecture a bit in my case. I would like to ensure that everything that is theoretically legal is properly implemented, although it might be hard to test corner cases in practice.

Here are a couple examples. All dictionary words are numbered from 0 upwards.

  1. sampler1_R55.z3 (from Appendix C) - word #433 (“multi”), unique;
  2. zork1-r15-sUG3AU5.z2 (from eblong dot com) - word #347 (“nasty”), a duplicate of word #348; word #514 (“storm”), unique;

There is also a zork1.dat I got from somewhere I can’t actually remember (basically, by random googling) when setting up the project so that I could have something to run initial tests on (I missed the links in Appendix C at first for some unknown reason, and the site I got the 2nd file from I did not discover until much later). The file length in the header (multiplied by 2, for it is a V3 file) is 84876 bytes, and the checksum is 41257 (hex $A129). Actual file length is 92160 bytes. There are unique words #200 “fcd#” and #437 “pdp1” in the dictionary that both do not have the highest bit set in the last 2-byte word.

I first thought it was a bug somewhere in my code, and I cannot rule it out with absolute certainty even now, but it seems strange that the dictionary reader reacts to these particular words specifically, so I have to assume something is indeed wrong with the data. My suspicions are additionally corroborated by the fact that in the last case the problematic words in question are rather unusual: perhaps some kind of debugging commands or similar.

That’s the length/checksum of the common zork1-r88-s840726 which is on the Masterpieces CD and everywhere else. It’s about the most canonical Z3 file in existence, so if it doesn’t follow the spec, then the spec is wrong. :)

There are unique words #200 “fcd#” and #437 “pdp1” in the dictionary that both do not have the highest bit set in the last 2-byte word.

Both of those dict words end with a non-letter, but they’re not debug commands. I suspect what’s going on here is that a multi-Z-character sequence has been truncated.

(The ZIL source defines these words as PDP10 and FCD#3, in fact. So they’re definitely truncated.)

The Z-Machine spec is not completely exhaustive, but it’s pretty good. You can actually implement a working interpreter from it, which is more than can be said for some other IF formats. But there are some ambiguities, and you can’t expect them to be all resolved. You can look through the forums to see if there are any past discussions on an issue, or ask a question if you think it’s an important issue, but most of the time you’re probably best just making a choice, implementing it, and then seeing if it causes issues later.

But in addition to the spec, make sure you’re also using:

Also version 6 is ignored by most people. It should only be implemented by masochists. If you have questions you’re mostly on your own.

Table locations

However, as far as I’m aware, there is no mention of what kind of memory the custom alphabet table (address in byte $34 of the header) or the Unicode translation table (word 3 in header extension table) can be stored in. Am I to assume that they can also be located in dynamic memory, and so it is theoretically not safe to deserialize them in advance? Unfortunately, none of my sample files feature any of these two tables, so I’m unable to confirm or deny that.

You’ll have to decide what kind of interpreter you want to write: one that makes minimal assumptions and allows for anything, or one that takes short cuts. Despite a lot of potential for dynamic compilation in the Z-Machine, almost no storyfiles do any of it. My interpreter (ZVM) just reads those arrays in at the start, and any changes to them are ignored later on. I don’t even account for the object property table address being moved, even though 12.4 explicitly says that is legal.

I have a vague memory that abbreviations might be altered in some cases, so I made those dynamic, but I’m not certain. Because the Z-Machine doesn’t provide any way to encode arbitrary length texts, a game would need to include its own encoding functions in order to write dynamic text back as encoded text. It would be easier to copy an encoded text around, but it would have to be the right length to fit in the destination. It might be safe to cache abbreviations and text in dynamic memory - something to consider next time I work on ZVM.

Undefined EXT opcodes

It’s best to throw a fatal error for any unrecognised opcodes.

scan_table

The spec for the scan_table opcode says that an additional argument can be used to configure whether the table to be scanned contains words or bytes. Now, suppose the first operand (the value to search for) resolves to a word (i.e. it is anything but a “small constant”). In this case, should we immediately return 0 if the high byte of this word is non-zero, or should we truncate the word to a byte and perform the search as usual?

Don’t truncate a value without the spec telling you to. So yes you could implement that kind of optimisation here.

From the default value of the third argument ($82 = 10000010b), it appears that the field length of the table is always given in bytes, even if we’re searching for a word. Is it legal in this case to have a field length of 1, and if it is, what should be done in this scenario?

Don’t overthink it, a game is very unlikely to be doing something stupid. Do whatever makes most sense to you. I’d do whatever requires the least special-cased code. In that case it would be to make the loop increment the index by one byte but look for a matching word - which means it’s checking two fields at once.
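
Here is a minimal sketch of that "least special-cased" approach. It is one possible reading, not a reference implementation, and the function name and signature are mine: the search value is never truncated, and the loop always advances by the field length in bytes, whatever the comparison width.

```python
def scan_table(mem, x, table, length, form=0x82):
    """Return the address of the first field matching x, or 0 if none does.
    Bit 7 of `form` selects word comparison; the low 7 bits give the
    field length in bytes."""
    compare_words = bool(form & 0x80)
    field_len = form & 0x7F
    addr = table
    for _ in range(length):
        if compare_words:
            value = (mem[addr] << 8) | mem[addr + 1]
        else:
            value = mem[addr]
        # x is not truncated, so a byte search for x > 255 never matches.
        if value == x:
            return addr
        addr += field_len
    return 0
```

With a form of $81 (words, field length 1), consecutive comparisons overlap by one byte, which is the "checking two fields at once" behaviour.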

store opcode corner case

Now, suppose the first operand indicates that the value should be put into the top of the stack (NB: not pushed), and the second operand is of type “variable” with value 0, which indicates a value should be popped from the stack. There seems to be a conflict here, as the value we’re updating literally ceases to exist during the update process. If this is legal, what should happen? The logical course of action would be to replace the new value now on top of the stack (i.e. we essentially “pop” a value from under the top of the stack with such an operation), but this is not adequately explained - clarification is needed.

While in general you’re meant to evaluate operands from left to right, the only sensible option here would be to pop first and then mutate the new top of stack. @store and the other opcodes like it can be implemented as regular storers - if you put the complexity into their disassembly rather than into their run code.
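
In sketch form, with a plain Python list standing in for the evaluation stack and a dict for the other variables (both are simplifications for illustration, and the names are made up), the pop-first order looks like this:

```python
def load_operand(stack, variables, kind, value):
    """Resolve one operand. A 'var' operand naming variable 0 pops the stack."""
    if kind == "const":
        return value
    if value == 0:
        return stack.pop()
    return variables[value]

def op_store(stack, variables, dest, kind, value):
    """@store: operands are loaded first, so any pop happens before the write."""
    resolved = load_operand(stack, variables, kind, value)
    if dest == 0:
        stack[-1] = resolved  # overwrite the new top of stack in place, no push
    else:
        variables[dest] = resolved
```

Starting from a stack of `[1, 2, 3]`, storing "variable 0" into variable 0 leaves `[1, 3]`. A stack with a single entry would underflow here, which this sketch does not guard against.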

The following is a mix of experience and opinion. Take it with a grain of salt. :grinning:

  1. Table locations - If the standard doesn’t say, then assume the table can appear in either static or dynamic memory. Non-default dictionaries can be in dynamic memory. Even if current usages don’t place tables in dynamic memory, there’s nothing preventing them from being placed there.

  2. Interrupts are nebulous beasts in the z-machine. There are three types and honestly I feel they should have specific rules for each - namely output during a sound interrupt probably shouldn’t be allowed as it has the ability to mess with the display of player input if such is happening concurrently. Newline interrupts are an enormous headache and luckily are relegated to V6 games. Restore, Restart and Quit are definitely legal during timed input interrupts as this is required for Border Zone to run properly. The instructions for catch and throw should be illegal as well as save_undo due to the z-machine’s save state not being able to record whether a player input is in progress.

  3. You are correct. I feel you can deduce this from the underflow behavior mentioned in the standard, but it is correct regardless.

  4. Good question - newline interrupts are a royal pain.

  5. I believe it is all lines combined. Again V6 is painful.

  6. You are correct - it should be interrupt countdown, not line count. If line count is not -999, then the line count should be checked against how many lines the interpreter can fit on the screen at once, so a large output can be paused with a [MORE] prompt or similar for the player to read.

  7. I wouldn’t make missing sound effects an error as games generally are playable without sound, but you could pop up a warning to the player (maybe make it suppressible).

  8. It shouldn’t matter if you don’t write anything, as the contents of that word will generally only be inspected when a menu click keypress occurs.

  9. It is a quirk of the original Mac implementation.

  10. Yeah, there’s no reliable way to guess what each extended opcode may do. Ignore and hope for the best!

  11. This is a wording issue. The left over input isn’t from an interrupted read, but rather one which was typically terminated via something other than a newline - Beyond Zork uses this to implement a set of definable macro keys.

  12. Remember large constants can still be small numbers, so in general, this can still work. However in the case you mention where the high byte is non-zero, it’s somewhat ambiguous. I would just assume a field length of one byte is still legal and just won’t match the searched for value. I see no reason to make it an error.

  13. Empty object names are legal to print and print nothing. This definitely occurs in extant games.

  14. I’ve tested this before. It is my firm belief that these are bugs in the original games. Those words are not able to be looked up in Infocom’s original interpreters.

  15. Loading of operands happens before execution of the instruction, so the stack value referenced by the second operand is popped. The value is then stored in the new top of the stack (without pushing).

I like this better than my own answer.

Ah, thanks – that clarifies that.

You may find something useful from this old thread I made about undefined Z-machine behavior.

Wow, I’d never expect to get this many answers so quickly! Thank you very much everyone!

The problem is that whether there is a sequence or not, S3.2 clearly states that the highest bit in the last word marks the end of the text. It does mention sequence truncation later in S3.6.1, but only in the context of decoding ZSCII characters. The spec doesn’t say anywhere that the end bit could be omitted; ergo, my assumption was that these two things (the end bit and multi Z-character constructions) are unrelated.

At the same time, S.13 gives the exact length (in bytes) of every encoded dictionary word, but it still does not say anything about the end bit, so I had to assume that normal decoding rules still apply there.

And it seems my assumptions were right after all.

The errata seems to be very helpful, thank you. As for the test, I’ve already found it before and am going to put it into use whenever I’m actually able to run it.

I’m not sure if I can agree, at least in theory (since I haven’t begun working on the frontend yet). S.8 gives a rather thorough description of the V6 screen model, and aside from what I mentioned earlier there doesn’t seem to be much room for frivolous interpretation there. Yes, it is complex, but ultimately its implementation should just be a matter of time. But if it’s not very popular, I guess I may have to resort to looking at the source code of existing implementations, even though it would take away some of the challenge.

Sounds interesting. I will check it out, thanks.

Note that praxix uses undefined behavior of the print_table instruction, see the discussion in Additional documentation for @print_table opcode?

This only ever happens with multi Z-character constructions. I believe that whatever truncated the dictionary text had a bug that failed to set the high bit when in the middle of a construction, e.g. nasty-knife, and storm-tossed.

This should be Zork I Release 88 Serial 840726, with additional zero padding. In my opinion it is a good story to begin testing with. The first thing it will print upon starting is ZORK. :grinning:

Fun trivia - if you stub out the random number generator to just return zero (or one… my memory is failing me here), the game is unwinnable because neither you nor the troll can hit each other in combat.

You want to be careful about this. The spec is just a document that some people wrote.

When you’re doing this kind of archeology, remember:

  • The spec can be wrong.
  • Modern interpreters can be wrong.
  • The modern Inform and ZILF compilers can be wrong.
  • Infocom’s interpreters can be wrong.
  • Infocom’s game files can be wrong.
  • Infocom’s compiler can be wrong.

There’s a couple of cases where Infocom’s game files and interpreter are wrong, but they cancel each other out so that the original game worked. (The Beyond Zork rotating-mirror bug is a classic.) That requires special-casing in modern interpreters.

For this dictionary situation, Infocom’s game files don’t match their interpreter (the words are not recognized) so we know Infocom made some mistake. Looking closely, we observe that the game files have a consistent discrepancy which implies a bug in Infocom’s compiler. We also observe that Infocom’s interpreters match modern interpreters (modern interpreters don’t recognize those words either). So there’s a clear path forward, but you don’t get there by blindly sticking to the spec.

(If we were updating the spec document, it would be worth a footnote here explaining what we just learned.)

Yeah, I’ve seen that one already while searching for more info on print_form and print_table.

You are right. That’s why I had to start this thread in the first place. =)

Here’s one more minor issue I’ve stumbled upon recently:

16. The verify opcode

This opcode is listed as being available from version 3, and it says that the interpreter should use word $1A in the header to calculate the checksum and compare it against the known value in word $1C. The problem is that S.11 also says that some early Version 3 files do not contain length and checksum data. What does this mean, exactly? (e.g. that both words will be 0, or perhaps they will contain some unrelated data pertaining to e.g. the object table or some such).

It would also be logical to assume that any file that does not contain this information also does not contain the verify opcode, but does anyone know if that’s actually the case? If not, it’s unclear what the interpreter should do if it sees that the length and checksum values are not provided, assuming it is possible to find that out in the first place.

They are zero in every example I’ve looked at (that does not contain a verify instruction).

This is true in every example I’ve looked at.

The only way would be if they are zero. There is no other indicator.

Edit: I think Infocom was forward thinking enough to leave unused header bits zeroed until needed.

Additional Edit: Thinking further on this, it is (however unlikely) entirely possible for the checksum to legitimately be zero, so the only thing you can do when encountering the verify opcode is naively calculate and compare to the header value.
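
A sketch of that naive calculation, for reference. The scale factors (2 for V1-3, 4 for V4-5, 8 for V6 and later) and the rule "sum of bytes from $40 onwards, modulo $10000" are from S.11 of the standard; the function names are made up, and clamping to the actual file size is an extra safety assumption. Note that `story` should be the original file image, not dynamic memory the game has since modified.

```python
def file_length(story):
    """Decode the file length from header word $1A."""
    version = story[0]
    raw = (story[0x1A] << 8) | story[0x1B]
    if version <= 3:
        scale = 2
    elif version <= 5:
        scale = 4
    else:
        scale = 8
    return raw * scale

def verify(story):
    """Compare the computed checksum with header word $1C."""
    # Assumption: never read past the end of the file we actually have.
    length = min(file_length(story), len(story))
    checksum = sum(story[0x40:length]) & 0xFFFF
    expected = (story[0x1C] << 8) | story[0x1D]
    return checksum == expected
```

Using the header length rather than the real file size matters for padded files such as the 92160-byte Zork I mentioned earlier.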

I see. Then there’s nothing to worry about here.

I don’t actually know this story. What was it?

See thread: Beyond Zork passing invalid values to @get_prop_addr

Something else I’ve been wondering about recently:

17. Address truncation

Apparently, according to this thread: Index to @loadw: signed or unsigned?, the final address of the word indicated by the opcode loadw is supposed to be truncated to the bottom 16 bits. But what about the other opcodes? (Logically, at least storew should be expected to do the same.)

S.1 says that the total of dynamic plus static memory must not exceed 64K, so my current assumption is that any opcode that does not accept a packed address as an argument should be handled this way. Is this correct?
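
To be clear about what I mean by truncation, here is a small sketch. The helper names are hypothetical, and applying the same wrap-around to byte accesses is my assumption rather than something the standard states:

```python
def loadw_address(array, word_index):
    """Effective address for loadw/storew, wrapped to the bottom 16 bits."""
    return (array + 2 * word_index) & 0xFFFF

def loadb_address(array, byte_index):
    """Effective address for loadb/storeb, wrapped the same way (assumed)."""
    return (array + byte_index) & 0xFFFF
```

With this, an index of $FFFF behaves like -1: `loadw_address(0x1000, 0xFFFF)` gives $0FFE, the word just before the array.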
