I’m working on a new interpreter to learn Crystal, and I’m running into a problem. I figured a good first step would be to disassemble story files and match the output of the venerable txd.
However, I can’t seem to find the conditions to detect where code ends and text starts. My conditions for the end of a routine are:
a decode error
a return opcode, no pending jumps past it, and a valid routine header byte after it (a local-variable count in the 0-15 range)
if on an “unreachable” instruction, a return opcode with a zero byte after
if on an “unreachable” instruction, a return opcode with a valid ZSCII string after
when static strings begin (easy, but the offset is only given in the header on V6 and up)
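For anyone following along, the routine-byte check in the list above could be tightened roughly like this. It is only a sketch in Python rather than Crystal, and `packed_alignment` and `looks_like_routine_start` are names I made up for illustration. It assumes the usual packed-address scaling (2 for V1-3, 4 for V4-7, 8 for V8) and that V1-4 routines carry a 2-byte initial value per local after the count byte. It says nothing about how txd itself does it.

```python
def packed_alignment(version: int) -> int:
    """Byte alignment implied by packed routine addresses (assumed scaling)."""
    if version <= 3:
        return 2
    if version <= 7:
        return 4
    return 8  # V8


def looks_like_routine_start(mem: bytes, addr: int, version: int) -> bool:
    """Cheap plausibility test for a routine header at addr (hypothetical helper)."""
    if addr >= len(mem) or addr % packed_alignment(version) != 0:
        return False
    local_count = mem[addr]
    if local_count > 15:
        return False
    # V1-4 store a 2-byte initial value for each local right after the count.
    header_len = 1 + (2 * local_count if version <= 4 else 0)
    # There must be at least one byte of code after the header.
    return addr + header_len < len(mem)
```

The alignment test throws away a lot of false positives that the 0-15 range alone lets through, since roughly one byte in sixteen of arbitrary data passes the range check.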
However, several Infocom files do not match any of these, and even checking for a ZSCII string is flimsy because any byte sequence decodes as a valid ZSCII string (right now, I check whether it starts with an uppercase letter, which does not cover all cases). I’m at my wit’s end here: how can txd reliably know where code ends on V5 and earlier?
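To illustrate why the string check is so weak: a Z-string is just a run of 2-byte words, each holding three 5-bit Z-characters, and it ends at the first word with its top bit set. A rough Python sketch follows (the function names `zstring_end` and `zchars` are invented for this post). It never rejects any input on content, which is exactly the problem.

```python
from typing import List, Optional


def zstring_end(mem: bytes, addr: int) -> Optional[int]:
    """Return the address just past the Z-string at addr, or None if no end bit is found."""
    while addr + 1 < len(mem):
        word = (mem[addr] << 8) | mem[addr + 1]
        addr += 2
        if word & 0x8000:  # top bit marks the last word of the string
            return addr
    return None


def zchars(mem: bytes, addr: int) -> List[int]:
    """Unpack the raw 5-bit Z-characters of the string at addr (no alphabet decoding)."""
    out: List[int] = []
    while addr + 1 < len(mem):
        word = (mem[addr] << 8) | mem[addr + 1]
        out += [(word >> 10) & 0x1F, (word >> 5) & 0x1F, word & 0x1F]
        addr += 2
        if word & 0x8000:
            break
    return out


data = bytes([0x11, 0xAA, 0xC6, 0x34])
print(zstring_end(data, 0))  # 4
print(zchars(data, 0))       # [4, 13, 10, 17, 17, 20]
```

The only structural failure is running off the end of the file without meeting an end bit, so any stronger test has to be a heuristic on the decoded characters or on where the string sits relative to known code.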
The goals of a disassembler are different from those of an interpreter. An interpreter always knows how to interpret a block of memory, because the bytecode it has just executed tells it what to do with it. I don’t know how txd works, but it may only detect text that is printed from routines.
txd finds strings both in code and in the string section, but it’s possible that it misses some of them in some game files. I’m certain that it uses heuristics tuned to Infocom and Inform-generated files. You’d have to look at its source code to see what they are, though.