The Å-machine was always intended to support a 32-bit word size eventually. The second byte of the header gives the word size in bytes, which is currently always 2. This is good for compatibility with retro machines like the C64 (and with the Z-machine, which the 16-bit Å-machine architecture borrows a lot from). But it’s also very limiting; it would be nice to have a 32-bit option as well, for larger games.
Some parts of this are easy enough: just use 32-bit pointers instead of 16-bit ones, for example. But the Å-machine is a tagged architecture. Unlike on the Z-machine, where a 16-bit word could plausibly be a byte address, a packed address, an integer, an object number, or something else entirely, every 16-bit word on the Å-machine has one and only one meaning.[1]
For example, here is the encoding of literal values:
000vvvvv vvvvvvvv object (v > 0)
001vvvvv vvvvvvvv dictionary word (v < $1E00)
00111110 vvvvvvvv character
00111111 00000000 [] (nil, the empty list)
00111111 00111111 sentinel for unused memory
00111111 vvvvvvvv (reserved) (all other v)
01vvvvvv vvvvvvvv integer
All literal values start with 0; 1 means a reference.
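As a rough illustration of the table above, here is a Python sketch that decodes a single 16-bit word. The function name `classify` and the string labels are made up for this example and aren't part of any Å-machine interpreter. The table doesn't say how references are laid out, so the sketch just returns their low 15 bits untouched; it also reports the all-zero word as "unlisted", since the table only covers objects with v > 0.

```python
def classify(word: int) -> tuple:
    """Return (kind, payload) for one 16-bit word, following the table above."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit word")
    if word & 0x8000:                 # 1....... ........  reference
        return ("reference", word & 0x7FFF)   # payload layout not covered here
    if word & 0x4000:                 # 01vvvvvv vvvvvvvv  integer (14 bits)
        return ("integer", word & 0x3FFF)
    if word == 0:                     # not listed in the table (objects need v > 0)
        return ("unlisted", 0)
    if word < 0x2000:                 # 000vvvvv vvvvvvvv  object
        return ("object", word)
    if word < 0x3E00:                 # 001vvvvv vvvvvvvv  dictionary word, v < $1E00
        return ("dictionary word", word & 0x1FFF)
    low = word & 0xFF
    if word < 0x3F00:                 # 00111110 vvvvvvvv  character
        return ("character", low)
    if low == 0x00:                   # 00111111 00000000  nil
        return ("nil", low)
    if low == 0x3F:                   # 00111111 00111111  unused-memory sentinel
        return ("sentinel", low)
    return ("reserved", low)          # 00111111 vvvvvvvv  everything else
```

Each branch peels off one row of the table, so every possible word lands in exactly one category.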
Now, we could just keep these tags intact and increase the range of each type of value by a factor of 2^16. But that seems like a wasted opportunity. This system was designed to fit as many values into 15 bits as possible, which is why it doesn’t have things like signed integers.
The current system allows 256 different characters, for example, which are assigned by a Unicode translation table in the story file. If we keep this particular one-byte tag and expand the one-byte value to three bytes, then we can support all of Unicode!
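To make that concrete, here is a small Python sketch of what such a 32-bit character word might look like. This layout is only an assumption for illustration: it keeps the existing tag byte (00111110) in the top byte and stores the Unicode code point directly in the remaining 24 bits. The names `CHAR_TAG`, `encode_char32`, and `decode_char32` are hypothetical.

```python
CHAR_TAG = 0x3E  # 00111110, the same tag byte the 16-bit format uses


def encode_char32(ch: str) -> int:
    """Pack one character into a hypothetical 32-bit tagged word."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return (CHAR_TAG << 24) | ord(ch)


def decode_char32(word: int) -> str:
    """Unpack a hypothetical 32-bit character word."""
    if word >> 24 != CHAR_TAG:
        raise ValueError("not a character word")
    # chr() rejects values above U+10FFFF, the highest Unicode code point
    return chr(word & 0xFFFFFF)
```

The highest code point, U+10FFFF, needs 21 bits, so 24 bits leave room to spare.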
Currently, a full quarter of the value space is allocated to unsigned integers. What if we gave half of that chunk to signed integers, and the other half to floats? We’d need to figure out a 29-bit floating-point format, but that seems like a sufficient number of bits for your average parser IF.
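Here is one way that split could work, sketched in Python. Everything in it is an assumption rather than a proposal for the actual format: the tags 010 (signed integer) and 011 (float) are just the two halves of the current 01 integer space, and the 29-bit float is simply an IEEE 754 single-precision value with its three lowest mantissa bits dropped. All names are hypothetical.

```python
import struct

INT_TAG = 0b010      # assumed: first half of the old "01" integer space
FLOAT_TAG = 0b011    # assumed: second half
PAYLOAD_BITS = 29
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1


def encode_int(n: int) -> int:
    """Store n as a 29-bit two's-complement payload."""
    if not -(1 << 28) <= n < (1 << 28):
        raise OverflowError("does not fit in 29 signed bits")
    return (INT_TAG << PAYLOAD_BITS) | (n & PAYLOAD_MASK)


def decode_int(word: int) -> int:
    payload = word & PAYLOAD_MASK
    # Sign-extend from bit 28
    return payload - (1 << PAYLOAD_BITS) if payload & (1 << 28) else payload


def encode_float(x: float) -> int:
    """Truncate a binary32 float to 29 bits by dropping 3 mantissa bits."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return (FLOAT_TAG << PAYLOAD_BITS) | (bits >> 3)


def decode_float(word: int) -> float:
    bits = (word & PAYLOAD_MASK) << 3
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

Under these assumptions, signed integers range from -2^28 to 2^28 - 1, and floats keep the full single-precision exponent range with a 20-bit mantissa, which is roughly six decimal digits of precision.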
Routine storage and packed string storage are limited to eight megabytes each (24-bit pointers to individual bytes, with the first bit marking it as a full 24-bit pointer instead of a shorter one). That could trivially be doubled by requiring all pointers to be in “long” form.
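The arithmetic behind those two figures, as a quick Python check (the helper name `addressable_mib` is just for this example):

```python
def addressable_mib(address_bits: int) -> int:
    """Size in MiB of a byte-addressed space with the given number of address bits."""
    return (1 << address_bits) // (1024 * 1024)


# Today: 24-bit pointer, but one bit is spent flagging the long form
current = addressable_mib(24 - 1)   # 8

# If every pointer were long, all 24 bits could carry address
all_long = addressable_mib(24)      # 16
```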
The Å-machine currently uses one-byte opcodes, which means there aren’t a lot available. Is that worth changing? We haven’t hit the limit yet.
It has 64 registers and 64 “env slots” (local variables for each routine). Is that worth changing? Right now those registers (unlike the Z-machine ones) are only used for routine arguments and temporaries; global variables are stored as an array in RAM instead.
Which limits need to be loosened, and which should stay consistent? If we’re increasing the word size to 32 bits, I don’t think we necessarily need to worry about file size any more; but consistency with the 16-bit format makes it easier for interpreters to support both, so we also shouldn’t change things just for the sake of changing them.
[1] This is why it only supports 14-bit unsigned integers instead of the Z-machine’s 16-bit signed integers.