TXD and ZTools

I’ve been tinkering with a Z-machine disassembler of my own and have used TXD a fair bit. Along the way I’ve found some bugs and some opcodes that aren’t supported in TXD. Is TXD still maintained? There is a repository on GitLab (David Griffith / ztools · GitLab). Is this the “official” one, and can I report issues there?

The things I’ve found so far:

  1. The EXT opcodes print_unicode and check_unicode aren’t defined (version 5-8).
  2. The EXT opcode set_true_colour is not defined for either version 6 or version 5, 7-8 (different definitions).
  3. The EXT opcode buffer_screen is not defined for version 6.
  4. The 2OP opcode throw is only defined for version 6. It should also be valid for versions 5, 7-8.
  5. The 0OP opcode restart should be defined as a RETURN-type (it is currently defined as a PLAIN-type). I’m not entirely sure about this one, but one Dialog routine ended with restart and disassembles correctly when restart is retyped.
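
For reference, points 1 to 3 can be summarised as a small lookup. This is only a sketch: ext_opcode_name is a made-up helper, not TXD's caseline table, and the opcode numbers are the ones I believe the Z-Machine Standards Document 1.1 assigns, so check them against the spec before relying on them.

```c
#include <stddef.h>

/* Hypothetical helper (not part of TXD): returns the name of one of the
 * EXT opcodes reported as missing above, or NULL if the opcode is not one
 * of them or is not valid for this Z-machine version.
 * Opcode numbers are assumed from the Z-Machine Standard 1.1. */
const char *ext_opcode_name(int version, int opcode)
{
    if (version < 5 || version > 8)
        return NULL;                /* EXT opcodes only exist from version 5 */

    switch (opcode) {
    case 0x0b: return "print_unicode";
    case 0x0c: return "check_unicode";
    case 0x0d: return "set_true_colour";  /* v6 takes an extra window operand */
    case 0x1d: return (version == 6) ? "buffer_screen" : NULL;
    default:   return NULL;
    }
}
```

Note that set_true_colour needs two operand definitions in a real table (one for version 6, one for 5, 7-8); the sketch only answers whether the name exists.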

Point 5 above is a bit more complicated.

To get my disassembly to work, I treat quit and restart as RETURN-types if the next opcode is not a valid opcode; otherwise they are treated as PLAIN-types.

Example from Trinity (Release 12 / Serial Number 860926):

Source code:
<ROUTINE FINISH ("AUX" X)
	 <CRLF>
	 <V-SCORE>
	 <TELL CR
"Do you want to restart the story, restore a saved position, or quit?" CR CR>
	 <REPEAT ()
		 <TELL "[Type RESTART, RESTORE or QUIT.] >">
	 	 <READ ,P-INBUF ,P-LEXV>
	 	 <SET X <GET ,P-LEXV 1>>
	 	 <COND (<EQUAL? .X ,W?RESTART>
			<SET X <RESTART>>
			<FAILED "RESTART">)
	       	       (<EQUAL? .X ,W?RESTORE>
			<SET X <RESTORE>>
			<FAILED "RESTORE">)
	       	       (<EQUAL? .X ,W?Q ,W?QUIT>
			<CRLF>
			<QUIT>)>>>

TXD disassembles this as:
Routine 28cfc, 1 local (0000)

28cff:  bb                      new_line        
28d00:  88 51 88 00             call_1s         14620 -> sp
28d04:  bb                      new_line        
28d05:  b2 ...                  print           "Do you want to restart the story, restore a saved position, or quit?"
28d32:  bb                      new_line        
28d33:  bb                      new_line        
28d34:  b2 ...                  print           "[Type RESTART, RESTORE or QUIT.] >"
28d5f:  e4 af b3 67             read            ga3 g57
28d63:  4f 67 01 01             loadw           g57 #01 -> local0
28d67:  c1 8f 01 e0 31 50       je              local0 "restart" ~28d7b
28d6d:  b7                      restart         
28d6e:  0d 01 01                store           local0 #01
28d71:  d9 0f 51 80 f6 09 00    call_2s         14600 s083 -> sp
28d78:  8c ff bb                jump            28d34
28d7b:  c1 8f 01 e0 3a 4e       je              local0 "restore" ~28d8d
28d81:  b6 01                   restore         -> local0
28d83:  d9 0f 51 80 f6 0c 00    call_2s         14600 s084 -> sp
28d8a:  8c ff a9                jump            28d34
28d8d:  c1 83 01 de 30 de 81 3f a0 
                               je              local0 "q" "quit" ~28d34
28d96:  bb                      new_line        
28d97:  ba                      quit            

orphan code fragment:

28d98:  8c ff 9b                jump            28d34

Here 'quit' is not the end of the routine; the jump (part of the REPEAT statement) is.

Example from the Dialog version of Craverly Heights:

Source code:
%% RESTART

(understand command [restart])
(perform [restart])
	Restart the game from the beginning? \(
	(if) (library links enabled) (then) (link) y (else) y (endif)
	(no space) / (no space)
	(if) (library links enabled) (then) (link) n (else) n (endif)
	\)
	(if) (yesno) (then)
		(restart)
		Failed to restart.
	(endif)
	(stop)

(parse game over [restart])
	(restart)

TXD won't disassemble these, but if I apply the changes above to my still unfinished program, I get:
0A93C 00                       Routine: 0A93C, 0 locals
0A93D F9 13 0B 7B 10 5B BC     call_vn
0A944 F9 26 0B 42 38 45 35     call_vn
0A94B D9 2F 0C 25 45 24        call_2s
0A951 C1 83 24 21 0F 3E 71 F4  je
0A959 C1 8F 24 21 15 ED        je
0A95F C1 8F 24 21 16 E4        je
0A965 C1 8F 24 21 97 C3        je
0A96B B1                       rfalse
0A96C DA 1F 0B 6B 08           call_2n
0A971 E1 9B 13 03 14           storew
0A976 DA 0F 0B F7 5B C7        call_2n
0A97C CD 4F 12 5B C2           store
0A981 2D 15 14                 store
0A984 8B 0E F4                 ret
0A987 8B 18 76                 ret
0A98A B7                       restart
0A98B DA 1F 0B 6B 06           call_2n
0A990 CD 4F 12 5B C0           store
0A995 2D 15 14                 store
0A998 8B 23 9D                 ret

Padding:
0A99B 00 

0CF4C 00                       Routine: 0CF4C, 0 locals
0CF4D 4F 14 00 14              loadw
0CF51 B7                       restart

Padding:
0CF52 00 00 

In the first routine, restart is followed by a valid opcode and is treated as a PLAIN-type,
and in the second routine restart is followed by an invalid opcode and is treated as a RETURN-type.

The problem here is trying to figure out where the routine ends, right? I haven’t looked at the txd code, but it seems like this is the same problem for QUIT and RESTART as it is for RETURN. Any of them can occur in the middle of a function, if there’s branching.

Exactly! Usually txd moves the goalposts for the end of a routine when it encounters a branching instruction. But restart is defined as PLAIN (can’t be at the end of a function), while quit is defined as RETURN (can be the end of a function).
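
A rough sketch of the goalpost idea, combined with the lookahead rule for quit and restart described earlier in the thread. This is a simplified model, not TXD's actual code: struct insn, the kind names and routine_end are all made up for illustration, and next_valid is assumed to be computed elsewhere by trying to decode the following bytes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction model, not TXD's data structures. */
enum kind { PLAIN, BRANCH, JUMP, RETURN, QUIT_OR_RESTART };

struct insn {
    unsigned long addr;    /* address of the instruction                 */
    enum kind kind;
    unsigned long target;  /* branch/jump target (BRANCH and JUMP only)  */
    bool next_valid;       /* do the following bytes decode as an opcode? */
};

/* Returns the index of the instruction that ends the routine,
 * or n if no end was found among the n instructions. */
size_t routine_end(const struct insn *code, size_t n)
{
    unsigned long high = 0;  /* furthest address some branch can reach */

    for (size_t i = 0; i < n; i++) {
        const struct insn *in = &code[i];

        /* A branch or jump target moves the goalposts forward. */
        if ((in->kind == BRANCH || in->kind == JUMP) && in->target > high)
            high = in->target;

        /* quit/restart only count as an end when nothing valid follows. */
        bool ends = in->kind == RETURN || in->kind == JUMP ||
                    (in->kind == QUIT_OR_RESTART && !in->next_valid);

        /* An ending instruction only closes the routine once we are
         * at or past every known branch target. */
        if (ends && in->addr >= high)
            return i;
    }
    return n;
}
```

Fed the tail of the Trinity routine above (je at 28d8d, new_line, quit at 28d97 with a valid opcode after it, jump at 28d98), this model ends the routine at the jump rather than at quit. For the second Craverly Heights routine, restart at 0CF51 is followed by padding, so it ends the routine.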

Seems like restart should be defined as RETURN, then.

(I took a quick look at the code.)

I’ve made a ZTools 7/4 Beta for Win32. Works with Trinity and Craverly Heights above (and other test cases I have). Source and binaries at: GitHub - heasm66/ztools_7_4_beta_win32

I’ve found a new bug. If there’s data, probably an array, directly following the dictionary, it gets wrongly identified as a dictionary entry. The code snippet below is from The Job:

Routine 25d4, 4 locals

 25d5:  36 04 1e 00             MUL             #04,G0e -> -(SP)
 25d9:  d4 2f 10 66 00 00       ADD             #1066,(SP)+ -> -(SP)
 25df:  55 00 02 01             SUB             (SP)+,#02 -> L00
 25e3:  4f 01 00 03             LOADW           L00,#00 -> L02
 25e7:  cd 4f 04 25 84          STORE           L03,"c lcjibcfdbfd forb
c   cs
coddnd?bapbpvc2  j LKn ?8ye y"
 25ec:  a0 56 d7                JZ              G46 [TRUE] 2604
 25ef:  f7 a7 03 04 10 02 d1    SCAN_TABLE      L02,L03,#10 -> L01 [TRUE] 2605
 25f6:  c1 83 03 1d bf 1e 34 48 JE              L02,"floor","ground" [FALSE] 2604
 25fe:  0d 31 06                STORE           G21,#06
 2601:  8c 00 16                JUMP            2618
 2604:  b1                      RFALSE
 2605:  75 02 04 02             SUB             L01,L03 -> L01
 2609:  be 02 8f 02 ff ff 02    LOG_SHIFT       L01,#ffff -> L01
 2610:  58 02 08 00             MOD             L01,#08 -> -(SP)
 2614:  54 00 01 31             ADD             (SP)+,#01 -> G21
 2618:  d0 2f 25 a4 31 30       LOADB           #25a4,G21 -> G20
 261e:  b0                      RTRUE

dict_end is actually the first byte after the end of the dictionary (code_base). If an array starts at dict_end, its address can wrongly be identified as a dictionary entry. I changed ‘>’ to ‘>=’ in the function in_dictionary (txd.c):

if (word_address < dict_start || word_address >= dict_end)
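
To make the off-by-one concrete, here is a minimal standalone sketch of just the bounds test. in_dictionary_range is a stand-in name, it takes the bounds as parameters instead of using TXD's own variables, and whatever else the real in_dictionary does is left out.

```c
#include <stdbool.h>

/* Stand-in for the bounds test only (hypothetical name and signature).
 * The dictionary occupies the half-open range [dict_start, dict_end):
 * dict_end is one past the last dictionary byte, so it must be rejected. */
bool in_dictionary_range(unsigned long word_address,
                         unsigned long dict_start,
                         unsigned long dict_end)
{
    if (word_address < dict_start || word_address >= dict_end)
        return false;
    return true;
}
```

With the old ‘>’ comparison, an address equal to dict_end would pass the test, which is exactly the case of an array placed directly after the dictionary.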

I’ve made a new beta, referenced above.

Somehow I missed this thread. That ztools repository of mine wasn’t intended to be official. Though I have made modifications to help me with extracting CGA and EGA data from Infocom’s V6 graphics files. I suppose I could adopt the package, modernize the build procedure, and make it official.

I’ve also reported this on Russotto’s repository.

If you have code in an object like the following:

default:
    #ifdef SceneryReply;
    if(SceneryReply ofclass string)
        print_ret (string) SceneryReply;
    i = location.&cheap_scenery;
    w1 = self.number;
    if(SceneryReply(i-->w1, i-->(w1 + 1)))
        rtrue;
    #endif;
    "No need to concern yourself with that.";

SceneryReply can be either a string or a routine, so when it is a routine the compiler still emits the unreachable code print_ret (string) SceneryReply;. When TXD tries to decompile this, it writes a message to stderr: Warning: printing of nonexistent string. The decompiled snippet looks like this:

Routine 43b4, 2 locals

 43b5:  41 f9 01 43             JE              Ge9,#01 [FALSE] 43ba
 43b9:  b1                      RFALSE
 43ba:  e0 07 42 6c 10 bf 04 00 CALL_VS         109b0 (#10bf,#04) -> -(SP)
 43c2:  a0 00 c7                JZ              (SP)+ [TRUE] 43ca
 43c5:  8d 10 bf                PRINT_PADDR     42fc
 43c8:  bb                      NEW_LINE
 43c9:  b0                      RTRUE
 43ca:  e0 23 41 f1 10 00 4a 01 CALL_VS         107c4 (G00,#004a) -> L00
 43d2:  51 fb 08 ff             GET_PROP        Geb,#08 -> Gef
 43d6:  2d 02 ff                STORE           L01,Gef
 43d9:  54 02 01 00             ADD             L01,#01 -> -(SP)
 43dd:  6f 01 00 00             LOADW           L00,(SP)+ -> -(SP)
 43e1:  6f 01 02 00             LOADW           L00,L01 -> -(SP)
 43e5:  e0 2b 10 bf 00 00 00    CALL_VS         42fc ((SP)+,(SP)+) -> -(SP)
 43ec:  a0 00 41                JZ              (SP)+ [FALSE] RTRUE
 43ef:  b3 ...                  PRINT_RET       "No need to concern yourself
with that."

I’ve changed it so that it prints PRINT_PADDR [invalid string: 42fc] instead.

Inform6 also accepts a statement like @store $ffff $0000; which produces this nonsense:

 43ba:  cd 1f ff ff             STORE           Gffef,#00

I’ve changed it to:

43ba:  cd 1f ff ff             store           [invalid variable: ffff] #00
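
The idea behind that change, as a minimal sketch. format_variable is a hypothetical helper, not the function in txd.c, and it follows the L00/G00 style of the Inform-syntax listings above. Variable numbers are one byte in valid Z-code (0 is the stack, 1-15 are locals, 16-255 are globals), so anything above 0xff is reported instead of being wrapped into a nonsense global such as Gffef.

```c
#include <stdio.h>

/* Hypothetical formatter (not TXD's own code). Writes a printable name
 * for a variable number into buf. Simplification: the stack is always
 * shown as "(SP)", without the push/pop decoration TXD prints. */
void format_variable(unsigned value, char *buf, size_t size)
{
    if (value == 0)
        snprintf(buf, size, "(SP)");
    else if (value < 0x10)
        snprintf(buf, size, "L%02x", value - 1);      /* locals 1-15   */
    else if (value <= 0xff)
        snprintf(buf, size, "G%02x", value - 0x10);   /* globals 16-255 */
    else
        snprintf(buf, size, "[invalid variable: %04x]", value);
}
```

For example, operand byte 04 gives L03 and 56 gives G46, matching the listing from The Job, while ffff gives [invalid variable: ffff].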

A new win32-binary is at the link above.

The policy we’ve landed on is that I6 compiles any assembly statement, even if it produces invalid code.

  • The Z-spec might change. (Unlikely at this late date but you never know.)
  • Someone might want to experiment with an extended Z-machine.
  • Someone might want to test questionable Z-code on various interpreters.

So it seems sensible that TXD should be able to parse as much bad Z-code as possible, printing “invalid string” or “invalid variable” or whatever.

Tangentially: maybe I6 should have an “assembly” format that puts arbitrary bytes directly into the routine!

@@ $CD $1F $FF $FF;

That would make experimentation easier…

I guess there’s an extremely limited user group for injecting bytes, but it would be powerful and, I imagine, not too hard to implement.

My assembler has a directive which inserts arbitrary bytes, and it’s been very useful for generating instructions which might otherwise be difficult to create.

Having something like that in I6 would be great, removing the need for a separate tool for Z-machine experimentation.

The opcode put_prop has its second operand defined as NUMBER, but isn’t this always a property number, or could it be any other kind of number?

Compare the definitions of put_prop and get_prop:

caseline (0x23, "PUT_PROP",        OBJECT,   NUMBER,   ANYTHING, NIL,      NONE,   PLAIN);
caseline (0x11, "GET_PROP",        OBJECT,   PROPNUM,  NIL,      NIL,      STORE,  PLAIN);

Seems to me PROPNUM would be right. It should be a property number (and is considered an error if it’s anything else). It technically can, of course, be any number, but from looking at the TXD source code, PROPNUM falls back to printing numbers if it can’t look the property up, so that should be just fine here.
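
The fallback I mean looks roughly like this. It is a sketch of the behaviour, not the code in txd.c: format_propnum and the names table are made up, and the "#08" number style is just copied from the listings in this thread.

```c
#include <stdio.h>

/* Hypothetical sketch of a PROPNUM-style operand printer.
 * names is an assumed table of property names indexed by property
 * number; NULL entries (or numbers past the table) are unknown. */
void format_propnum(unsigned prop, const char *const *names, size_t count,
                    char *buf, size_t size)
{
    if (prop < count && names[prop] != NULL)
        snprintf(buf, size, "%s", names[prop]);   /* known property name */
    else
        snprintf(buf, size, "#%02x", prop);       /* fall back to number */
}
```

So typing the operand as PROPNUM never loses information: when no name is known, the output is the same plain number that NUMBER would have produced.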

Changed PUT_PROP and made a new compilation. Next up is to make it work with files compiled with zilf/zapf or Dialog without using the -g switch.

Pretty much every assembler for real hardware I’ve encountered allowed for arbitrary byte and word sequences put wherever I like. I’d call an assembler incomplete if it couldn’t do that.
