As a fun coding exercise and a learning opportunity, I decided to write my own Z-Machine interpreter, and I’m finding the reformatted standard at https://zspec.jaredreisinger.com/ extremely helpful in the endeavor. I understand from reading other threads here that the standard is not necessarily complete, that there are scenarios where it could be either ambiguous or actually wrong, and that most popular interpreters make assumptions to fill in the gaps. There are a couple of things I came across that I’d like to ask about, though:
where the text-length is the number of 2-byte words making up the text, which is stored in the usual format.
I understand the short name is represented as a ZString, but then the length byte here seems redundant, since ZStrings mark their final 2-byte word by setting the top bit of that word’s first byte. Is this byte just an optimization so you can skip the name without decoding the ZString, or are there cases where parsing the ZString is not safe, or where its length could disagree with this byte?
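For concreteness, here are the two approaches I’m weighing, sketched in C against a hypothetical flat memory array (the names memory, skip_name_by_length, and skip_name_by_zstring are all made up, not from my actual code):

#include <stdint.h>

extern uint8_t memory[];   /* hypothetical flat view of the story file */

/* Skip the short name using the text-length byte. */
static uint32_t skip_name_by_length(uint32_t prop_table_addr)
{
    uint8_t text_len = memory[prop_table_addr];      /* length in 2-byte words */
    return prop_table_addr + 1 + 2u * text_len;      /* address of the first property entry */
}

/* Skip the short name by scanning for the Z-string terminator bit.
 * Note this assumes at least one word of text; if a zero-length name
 * (text-length 0) is legal, there would be nothing to scan at all. */
static uint32_t skip_name_by_zstring(uint32_t prop_table_addr)
{
    uint32_t addr = prop_table_addr + 1;
    while ((memory[addr] & 0x80) == 0) {             /* top bit of the word's first byte */
        addr += 2;
    }
    return addr + 2;
}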
§12.4.1 also states that after the header, the data for any property entry can be 1-8 bytes for V1-3 and 1-64 bytes for V4+, but then the entry for put_prop in §15 (Dictionary of opcodes) states:
As with get_prop the property length must not be more than 2: if it is, the behaviour of the opcode is undefined.
Unless I’m missing something, none of the other opcodes seem to interact with the property table, so if get_prop and put_prop can only have a maximum length of 2 bytes, under what scenario are property data lengths of 3-64 bytes supported?
I’m writing my interpreter as a terminal application, but it seems most terminal applications these days don’t have the ability to change the terminal’s font; that functionality is instead reserved for the terminal emulator itself. From what I’ve been able to tell, certain specific terminals might support bespoke control codes that can potentially do this (xterm-based terminals seem at one point to have supported such a control code, though I’ve also seen references saying most implementations treat it as a no-op now).
It is legal for a game to change font at any time, including halfway through the printing of a word.
How is this typically handled these days? Are modern interpreters just not terminal based, or do they implement their own terminal emulator to allow this functionality?
From what I can tell, there is also no terminal-agnostic way to check what font the terminal is using, so I’m not sure how I should set the memory header IROM bits, since I don’t think there’s any way for me to tell what font I’m using (and whether it’s fixed or variable width, for instance). I’m also not sure there’s a way for me to support Font 3. Ideally, I’d like to follow the standard as closely as possible, so is there a recommendation I could follow in this case?
No other opcodes interact with it directly, but get_prop_addr can give you the byte address of the property value, and it can then be read like any other array in memory. This is how Inform handles properties with array values, like name (which holds an array of dictionary words).
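For example, here is a rough sketch with made-up helper names (read_word and walk_name_property aren’t from any real interpreter): once the game has called get_prop_addr and get_prop_len, walking a long property such as name is just ordinary word reads as far as the interpreter is concerned.

#include <stdint.h>

extern uint8_t memory[];   /* hypothetical flat view of the story file */

static uint16_t read_word(uint32_t addr)
{
    return (uint16_t)((memory[addr] << 8) | memory[addr + 1]);
}

/* The game typically does:
 *     addr = get_prop_addr(obj, prop)   -> byte address of the property data
 *     len  = get_prop_len(addr)         -> its length in bytes
 * and then loops over the data with loadw/loadb, which to the interpreter
 * are plain memory reads: */
static void walk_name_property(uint32_t prop_addr, uint16_t prop_len_bytes)
{
    for (uint16_t i = 0; i < prop_len_bytes / 2; i++) {
        uint16_t dict_word_addr = read_word(prop_addr + 2u * i);
        /* ... look up or print the dictionary word at dict_word_addr ... */
        (void)dict_word_addr;
    }
}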
If the terminal can’t change font, then the easy answer is, just don’t change it! An interpreter isn’t required to support any font-changing at all, and this is why. It’s nice if you can reflect the style changes somehow (e.g. use underlining for emphasis if the terminal supports it), but if you can’t, so be it.
Just store 0 from set_font to indicate that it’s not available.
The game’s only changing between variable, fixed, and “runic” (font 3); and while you’re right that you can’t figure out what font you’re using (proportional vs fixed) in a portable way, when in a terminal, I think it’s pretty safe to just assume that fonts 1 and 4 are identical (fixed). That means changing fonts between them is a no-op. Font 2 is supposed to be ignored, so that just leaves font 3.
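Put together, the set_font handling in a terminal interpreter can be tiny. Here’s a sketch of that policy (the names do_set_font, have_char_gfx, and so on are made up; only the store-0-when-unavailable behaviour comes from the standard):

#include <stdint.h>

enum { FONT_NORMAL = 1, FONT_PICTURE = 2, FONT_CHAR_GFX = 3, FONT_FIXED = 4 };

static uint16_t current_font = FONT_NORMAL;
static int have_char_gfx;   /* nonzero if you can render font 3 somehow */

/* Returns the previous font ID on success, 0 if the requested font is unavailable. */
static uint16_t do_set_font(uint16_t requested)
{
    uint16_t previous = current_font;

    switch (requested) {
    case FONT_NORMAL:
    case FONT_FIXED:
        /* Everything in a terminal is fixed-pitch anyway, so switching
         * between fonts 1 and 4 is just bookkeeping. */
        current_font = requested;
        return previous;
    case FONT_CHAR_GFX:
        if (have_char_gfx) {
            current_font = requested;
            return previous;
        }
        return 0;
    default:
        return 0;   /* font 2 and anything unknown: unavailable */
    }
}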
For font 3, I used Unicode characters to mimic the runes. Most glyphs are pretty well covered. You can see what I found in the Bocfel source. This has the caveat that the terminal/font must support Unicode, and of course you have to know what encoding to use. I just assume UTF-8, but here you are getting into some non-universal support.
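If you do assume UTF-8, the encoding side is small. Here’s a minimal sketch for BMP code points (which, as far as I know, is all the Z-machine’s Unicode handling covers); put_utf8 is just an illustrative name:

#include <stdint.h>
#include <stdio.h>

/* Emit a 16-bit code point as UTF-8 (1-3 bytes).  Surrogate values
 * (0xD800-0xDFFF) are not valid code points and are not checked here. */
static void put_utf8(uint16_t cp)
{
    if (cp < 0x80) {
        putchar(cp);
    } else if (cp < 0x800) {
        putchar(0xC0 | (cp >> 6));
        putchar(0x80 | (cp & 0x3F));
    } else {
        putchar(0xE0 | (cp >> 12));
        putchar(0x80 | ((cp >> 6) & 0x3F));
        putchar(0x80 | (cp & 0x3F));
    }
}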
You’re right that this is not universally supported, just as with Unicode/UTF-8, so you do have to incorporate some platform-specific code. On Unix systems you generally have access to terminal capabilities:
#include <curses.h>
#include <stdlib.h>
#include <term.h>
#include <unistd.h>

void f(void)
{
    if (setupterm(NULL, STDOUT_FILENO, NULL) != OK) {
        exit(1);
    }

    char *italic = tigetstr("sitm");
    if (italic == NULL || italic == (char *)-1) {
        // no italic
    } else {
        putp(italic);
        puts("This is in italic");
    }
}
This requires curses, which is (or was?) part of POSIX, so it should be widely portable, at least to Unix systems. And it doesn’t matter which terminal you’re using: as long as $TERM is a valid value for your terminal, this ought to find the capabilities your terminal supports, including color. The terminfo(5) man page lists the capabilities you can query. It’s not something you have to program differently for each terminal, thankfully.
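For example, color is queried the same way. Here’s a sketch that assumes setupterm has already been called as above (color_demo is just an illustrative name):

#include <curses.h>
#include <stdio.h>
#include <term.h>

void color_demo(void)
{
    char *setaf = tigetstr("setaf");   /* set foreground color */
    char *sgr0 = tigetstr("sgr0");     /* turn all attributes off */

    if (setaf != NULL && setaf != (char *)-1) {
        putp(tparm(setaf, 1));         /* color 1 is conventionally red */
        fputs("This is probably red", stdout);
    }
    if (sgr0 != NULL && sgr0 != (char *)-1) {
        putp(sgr0);
    }
    putchar('\n');
}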
If you want maximum portability you have to sacrifice styled output, but if you’re willing to at least include some conditionally-compiled platform-specific code, you can do it pretty easily.
Huge thank you to everyone who has replied so far! Although I’ve written plenty of backend software in my day, I’ve never really done anything with styling for a CLI app so this was all a bit outside my wheelhouse.
Though I do like to think I’m pretty good about keeping memory usage and performance in mind, it’s been a long time since I’ve worked in any kind of memory- or CPU-constrained environment, so I tend to forget that skipping a small handful of bytes can actually be super impactful!