Bit #  7 6 5 4 3 2 1 0
2OP    0 m m o o o o o  2-operand (short-form)
1OP    1 0 m m o o o o  1-operand
0OP    1 0 1 1 o o o o  0-operand
EXT    1 1 o o o o o o  extended (0-4 operand)
(m=mode bits, o=operator bits)
And the Z-machine Standards Document:
4.3
Each instruction has a form (long, short, extended or variable) and an operand count (0OP, 1OP, 2OP or VAR).
If the top two bits of the opcode are $$11 the form is variable; if $$10, the form is short. If the opcode
is 190 ($BE in hexadecimal) and the version is 5 or later, the form is "extended". Otherwise, the form is "long".
§4.3 translated to the same table format:
Bit #  7 6 5 4 3 2 1 0
2OP    0 m m o o o o o  2-operand (long-form)
1OP    1 0 m m o o o o  1-operand (short-form)
0OP    1 0 1 1 o o o o  0-operand
VAR    1 1 o o o o o o  variable (0-4 operand)
(m=mode bits, o=operator bits)
I have always thought that short-form and long-form described the mode bits (1 or 2 bits per operand). As the Standard describes it, it’s the other way around, and to specify a large constant you use the short form.
Is this just a historical artifact or is there any other explanation?
I’m not sure if the Infocom manuals were available when the Z-Machine was being reverse engineered and the Standard written. It wouldn’t surprise me if that explains the differences here.
The form of an instruction refers to the whole instruction:
Form             Structure
long             opcode operand operand
short            opcode [operand]
variable         opcode types [operands]
double variable  opcode types types2 [operands]
extended         0xBE opcode types [operands]
So variable form instructions have a variable number of operands, and need the operand types byte to specify their types. Extended form instructions have an extra byte to specify the opcode number, because there were too many opcodes to number with only 5 bits. As for long and short form: if the instruction has 2 operands it’s long, and if it has 0 or 1 it’s short.
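As a rough illustration of the rules above, here is a minimal Python sketch that classifies the first byte of an instruction. The function names `instruction_form` and `operand_count` are made up for this example; the bit tests follow my reading of §4.3 of the Standard, and the sketch ignores the double variable case, which only changes how many type bytes follow.

```python
def instruction_form(opcode: int, version: int = 5) -> str:
    """Return the form implied by the first byte of an instruction."""
    # 0xBE only means "extended" from version 5 onwards.
    if opcode == 0xBE and version >= 5:
        return "extended"
    top_two = opcode >> 6
    if top_two == 0b11:
        return "variable"
    if top_two == 0b10:
        return "short"
    return "long"


def operand_count(opcode: int, version: int = 5) -> str:
    """Return the operand count class (0OP, 1OP, 2OP or VAR)."""
    form = instruction_form(opcode, version)
    if form == "long":
        return "2OP"
    if form == "short":
        # Bits 5-4 hold the operand type; 11 means "no operand".
        return "0OP" if (opcode >> 4) & 0b11 == 0b11 else "1OP"
    if form == "variable":
        # Bit 5 clear: a 2OP opcode encoded in variable form.
        return "VAR" if opcode & 0x20 else "2OP"
    return "VAR"  # extended


for byte in (0x14, 0x8C, 0xB0, 0xC1, 0xE0, 0xBE):
    print(f"{byte:#04x}: {instruction_form(byte)} / {operand_count(byte)}")
```

Note that the same 2OP opcode number can show up in either long or variable form, which is why form and operand count are reported separately.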
The quirk is that long form instructions have only one bit to specify the type of each operand, but there are three types of operands. So long form instructions can’t have large constants. If you need to use a large constant, you construct a variable form instruction instead. It’s a bit odd that you can encode the 2OP instructions as either long form or variable form. All 2OP instructions could simply have been encoded as variable form, but the designers must have figured that the long form would be used frequently enough to be worth the complexity. The same goes for including je as a 2OP instruction even though it can be used with 2-4 operands.
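To make the trade-off concrete, here is a small Python sketch of an encoder for 2OP instructions. It is only an illustration: `encode_2op` and the `"small"`/`"large"`/`"variable"` labels are names invented for this example, store and branch bytes are left out, and the type codes (00 large constant, 01 small constant, 10 variable, 11 omitted) are taken from my reading of section 4 of the Standard.

```python
# 2-bit operand type codes used in the types byte of variable form.
TYPE_BITS = {"large": 0b00, "small": 0b01, "variable": 0b10, "omitted": 0b11}


def encode_2op(opnum: int, operands: list[tuple[str, int]]) -> bytes:
    """Encode a 2OP opcode number with two (kind, value) operands.

    Uses long form when possible and falls back to variable form
    when either operand is a large constant.
    """
    if len(operands) != 2 or not 0 <= opnum <= 0x1F:
        raise ValueError("expected a 5-bit opcode number and two operands")
    kinds = [kind for kind, _ in operands]

    if "large" not in kinds:
        # Long form: bit 6 / bit 5 set means the operand is a variable,
        # clear means a small constant. There is no code for "large".
        first = opnum
        if kinds[0] == "variable":
            first |= 0x40
        if kinds[1] == "variable":
            first |= 0x20
        return bytes([first, operands[0][1], operands[1][1]])

    # Variable form: top bits 11, bit 5 clear (still a 2OP opcode),
    # followed by a types byte with two bits per operand slot.
    types = 0
    for kind in kinds + ["omitted", "omitted"]:
        types = (types << 2) | TYPE_BITS[kind]
    out = bytearray([0xC0 | opnum, types])
    for kind, value in operands:
        if kind == "large":
            out += value.to_bytes(2, "big")  # large constants take 2 bytes
        else:
            out.append(value)
    return bytes(out)


# Opcode number 0x14 with a variable and a small constant fits long form.
print(encode_2op(0x14, [("variable", 1), ("small", 5)]).hex(" "))
# The same opcode with a large constant has to use variable form.
print(encode_2op(0x14, [("large", 1000), ("variable", 1)]).hex(" "))
```

In this sketch the long form costs three bytes, while the variable form needs an extra types byte, which is presumably the saving the long form was meant to buy.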