Z-machine dictionary alignment trouble

otistdog · February 6, 2022, 2:50am

I agree with fredrik’s analysis on the bug report thread that the “warning” code does not seem to perform its intended function – both because a) the logic is incorrect and b) it is only shown in debugging builds and the situation that it tries to detect will be altered anyway for a release build. Fixing it is irrelevant; the goal at this point should probably be to remove that code from the library entirely.

Other than that, I’ll only say that option 3 preserves backwards-compatibility for all previous I6 code (minus any that suffer from this issue and do not have room for even one more byte in readable memory) and conforms with author expectations set by the DM4 for all future I6 code. And that I’ve already posted a proof-of-concept of some logic for the compiler that seems to cover the general case elucidated so well by fredrik and zarf.

The logic of the revised “warning” code in fredrik’s post is essentially identical to that in mine. [EDIT: On another review, I see that fredrik’s code above is the same that he posted a few weeks ago on the bug report thread, so credit where it’s due, which is with him. The only difference I can see is that my version uses the compiler source’s idiom. I also see that he anticipated the even-sized dictionary entry case, as well. (Sorry about that, @fredrik. I didn’t read your second version of the “warning” code there as closely as I should have the first time through.)][EDIT 2: I can also see that fredrik reported this issue (https://github.com/DavidKinder/Inform6/issues/130) to the compiler project in August of last year.]

The important question, though, is: Why use it to produce a nonfunctional warning when it can be used to just prevent the problem in the first place? As I once saw it put humorously: “If only I had a computer to handle these troublesome details for me!”

Piergiorgio_d_errico · February 6, 2022, 12:27pm

on no.1 I don’t have delved so deep into standard library’s innards, aside the historical work, but OTOH, I have a complete mirror of IFarchive’s inform games sources, AND *grep(1).

if you can provide some example of code whose can be broken, I can check into that large sample of inform 6 sources via grep/rgrep, so a sort of weighing of the cons of breaking old sources can be done.

Best regards from Italy,
dott. Piergiorgio.

fredrik · February 6, 2022, 1:22pm

Any games that use the grammar property. And library extension too, of course. Of course, most game authors don’t upload their source code to if-archive.

I wouldn’t be so sure that anyone wanting to use the grammar property will read the latest addendums to the Inform docs. DM4 is available in multiple locations, and has been available as a printed book too. The more recent additions to the library documentation are a bit harder to find than the DM4, and there are no pointers in DM4 saying always check site X and Y for the latest changes to the docs.

zarf · February 6, 2022, 4:00pm

The 6/12 library has departed from and extended the DM4 in lots of ways. Even the 6/11 library did, a bit, really.

If someone wants to write an “I6 Library Addendum” document, I’ll add it to inform-fiction.org.

otistdog · February 6, 2022, 5:14pm

I misremembered the details. The pattern that I recalled seeing may or may not have been motivated by bug fixes; it may have been in place since the original implementation.

If you review the construct_storyfile_z() function found in the tables.c source file, you’ll see that there are points where the routine that writes the Z-machine story file checks to see whether certain kinds of alignment will or will not happen and adds padding bytes accordingly. Most of the time it is seeking to make sure that a given alignment does happen in order to prevent problems – in the case at hand, it would seek to make sure that the given alignment doesn’t happen.

An example:

/*  -------------------- Objects and Properties ------------------------ */

/* The object table must be word-aligned. The Z-machine spec does not
   require this, but the RA__Pr() veneer routine does.
*/
while ((mark%2) != 0) p[mark++]=0;

Another:

/*  ----------------- A gap before the code area ----------------------- */
/*  (so that it will start at an exact packed address and so that all    */
/*  routine packed addresses are >= 256, hence long constants)           */
/*  -------------------------------------------------------------------- */

while ((mark%length_scale_factor) != 0) p[mark++]=0;
while (mark < (scale_factor*0x100)) p[mark++]=0;
if (oddeven_packing_switch)
    while ((mark%(scale_factor*2)) != 0) p[mark++]=0;

A third:

/*  ------------------ Another synchronising gap ----------------------- */

if (oddeven_packing_switch)
{   if (module_switch)
         while ((mark%(scale_factor*2)) != 0) mark++;
    else
         while ((mark%(scale_factor*2)) != scale_factor) mark++;
}
else
    while ((mark%scale_factor) != 0) mark++;

Those are the only three that I spotted in a quick sweep of the 6.36 version, but clearly the pattern is well-established at the compiler level, and this case very much seems to fit the same mold as the ones that prompted the above code.

otistdog · February 8, 2022, 8:47pm

In further response DavidG’s original post: Understanding that the current definitions of routines Dword__No() and No__Dword() do work under Glulx for all currently-deployed purposes in StdLib 6.12.5, I can’t help but notice that the meaning of the routines changes significantly between Z-Machine and Glulx as currently written.

As zarf explains, the current Glulx definitions (going back to at least StdLib 6/11) are identity functions:

! In Glulx, dictionary entries *are* addresses.
[ Dword__No w; return w; ];
[ No__Dword n; return n; ];

Meanwhile, the current Z-machine definitions are translation functions:

[ Dword__No w; return (w-(HDR_DICTIONARY-->0 + 7))/9; ];
[ No__Dword n; return HDR_DICTIONARY-->0 + 7 + 9*n; ];

in which routine Dword__No() converts the memory address of a dictionary word to the zero-indexed ordinal position of the word within the dictionary table, and No__Dword() does the converse (without bounds checking). The comparable translation routines within Glulx (using zarf’s post as a guide) would be along the lines of:

[ No__Dword entry_num ;
    ! per zarf https://intfiction.org/t/printing-arbitrary-z-machine-dictionary-entries/54446
    ! note: breaks if 4-byte chars due to Unicode
    return #dictionary_table + WORDSIZE + entry_num * (7+DICT_WORD_SIZE);
];

[ Dword__No wdaddr ;
    ! note: breaks if 4-byte chars due to Unicode
    return (wdaddr - (#dictionary_table + WORDSIZE)) / (7+DICT_WORD_SIZE);
];

The Glulx Inform Technical Reference (The Glulx Inform Technical Reference) explains how the dictionary structure changes when $DICT_CHAR_SIZE=4 is used, but it’s not clear how use of that setting can be directly detected at run-time. Is there any kind of compiler-defined value or injected constant that would distinguish these states? If not, would it be possible to distinguish these states by, for example, checking whether byte pattern of the first word in the dictionary starts with $60 $00 $00 $00?

@DavidG, FYI it also looks like the Z-machine definitions for these routines must be updated to account for the possibility of dictionary entry lengths being other than 9 bytes. They are used by ParseToken__(), NounDomain() and PrintCommand().

DavidG · February 12, 2022, 8:30am

Shifting slightly off topic to fix the entry length concern… It looks like this will address the possibility of dictionary entry lengths being other than 9 bytes for Z-machine:

! Cribbed from ShowDictSub()
[ Dword__No w dp;
        dp = HDR_DICTIONARY-->0;
        dp = dp + dp->0 + 1;
        return (w-(HDR_DICTIONARY-->0 + 7))/(dp->0);
];

[ No__Dword n dp;
        dp = HDR_DICTIONARY-->0;
        dp = dp + dp->0 + 1;
        return HDR_DICTIONARY-->0 + 7 + (dp->0) *n;
];

Comments?

otistdog · February 12, 2022, 12:51pm

[EDIT: I was misinterpreting what field means what in the dictionary header format diagram in ZMS when I first posted this reply. The lines adjusting dp look correct, leaving the variable set to the address of the byte holding the size (in bytes) of each dictionary entry, even in cases where the number of word-separators is non-standard.

However, I do think that it is more efficient to use the pre-existing DICT_WORD_SIZE (see below) to obtain this value instead of calculating it on each call, and the + 7 in the current logic of the return statement seems to ignore the possibility that there is a non-standard number of word-separators.]

I’m curious now about the value of DICT_WORD_SIZE as seen in some of the library code for Glulx. This value does not seem to be set anywhere as a global constant, as the capitalized spelling would imply by convention, but neither does it have the # prefix implying that it is a compiler-level pseudo-constant. Where does this variable get its value? It appears to be available even in Z-machine.

otistdog · February 12, 2022, 2:27pm

Since you have brought up this issue on the thread about governing philosophy (What is the governing philosophy regarding changes to Inform 6?), will you clarify your analysis of this issue with respect to the priority of values that you’ve listed there? Am I correct in thinking (based on your comments in the thread on the compiler site) that you count the current behavior here as a case registering no higher than

Fix code generation bugs where the generated code doesn’t follow the language spec.

and that therefore

Backwards compatibility. If possible, old source code should compile exactly the same game file as before.

takes precedence?

If so, why would this issue not count under

Fix code generation bugs where the generated code doesn’t work at all.

? In this case, the generated code does follow the language specification, but the generated code may not function as expected due to a coincidental interaction between Z-machine number representation/arithmetic and the exact location of the dictionary within Z-machine memory. Given that it is almost certain that no author historically affected by this issue planned on the ambiguous return value problem being present, what is the ostensible value of preserving the status quo of the compiler in this case?

DavidG · February 12, 2022, 11:29pm

otistdog:

EDIT : I was misinterpreting what field means what in the dictionary header format diagram in ZMS when I first posted this reply. The lines adjusting dp look correct, leaving the variable set to the address of the byte holding the size (in bytes) of each dictionary entry, even in cases where the number of word-separators is non-standard.

However, I do think that it is more efficient to use the pre-existing DICT_WORD_SIZE (see below) to obtain this value instead of calculating it on each call, and the + 7 in the current logic of the return statement seems to ignore the possibility that there is a non-standard number of word-separators.]

I’m curious now about the value of DICT_WORD_SIZE as seen in some of the library code for Glulx. This value does not seem to be set anywhere as a global constant, as the capitalized spelling would imply by convention, but neither does it have the # prefix implying that it is a compiler-level pseudo-constant. Where does this variable get its value? It appears to be available even in Z-machine.

DICT_WORD_SIZE is set by the Inform 6 compiler. You can see what its value is for a run of the compiler by adding --list to the invocation. Unfortunately, that is always 6. To test, add this to the Initialise() routine:

        dp = HDR_DICTIONARY-->0;
        dp = dp + dp->0 + 1;
        el = dp->0;
        print "foo: ", el, "^";
        print "bar: ", DICT_WORD_SIZE, "^";

For inform \$ZCODE_LESS_DICT_DATA=0 --list test.inf, you get this at the beginning:

foo: 9
bar: 6

For inform \$ZCODE_LESS_DICT_DATA=1 --list test.inf, you get this at the beginning:

foo: 8
bar: 6

otistdog · February 13, 2022, 7:02am

Thank you very kindly for pointing out the source of that value, DavidG!

I think I have it straight now: DICT_WORD_SIZE means the size in bytes of the character storage of the dictionary word, which is not the same as the number of ZSCII characters stored in those bytes for Z-Machine, nor the same as the dictionary entry size which must include the additional bytes for #dict_par1, #dict_par2 and unless suppressed #dict_par3.

It seems that the memory setting ZCODE_LESS_DICT_DATA does not act as a virtual constant, nor any others mentioned by --list (for Z-Machine) except for NUM_ATTR_BYTES and INDIV_PROP_START. Usefully, however, DICT_CHAR_SIZE seems to act as a virtual constant when compiling for Glulx, so there’s no need for an indirect test to determine that state of affairs.

The only potential remaining problem that I can see is that if a non-standard number of word-separators is specified, then the offset from HDR_DICTIONARY-->0 (aka dict_start) to the address of the first dictionary word will be something other than 7. [EDIT: It turns out that, while dict_start is initially set to the value of HDR_DICTIONARY-->0, it is modified only a couple of lines later, so these are not equivalent values in practice.] Integer division makes the difference moot for Dword__No(), but only up to a certain point. No__Dword() is a different story. [EDIT 2: I had the two function names reversed in those last two sentences. I apologize for any confusion caused.]

DavidG · February 14, 2022, 7:02am

It looks like I can replace 7 with dp->0 + 4.

Now, suppose instead of 7, I put 99. Can you come up with some situations that would cause serious trouble? What I can come up at this late hour is PUT ROCK IN BOX followed by PUT WATCH IN BOX WITH ROCK. The first command works as expected. Then I get this for the second:

I only understood you as far as wanting to put the rock hat the box.
(bad)

With a value of zero in there, I get this for the second command:

I only understood you as far as wanting to put the watch in the box.
(good)

fredrik · February 14, 2022, 7:11am

Why would you do that?

This is a value stating a fact. Changing it to 99 is like saying pi is 7 - you can do it, but any calculations using that value are guaranteed to be off.

otistdog · February 14, 2022, 7:13pm

Unless I’m misunderstanding, the way that these routines are used within the library is as a complementary pair. In routine ParseToken__(), Dword__No() encodes the index position of the dictionary entry (a nice, low number) and adds it to the constant REPARSE_CODE as additional information:

    ! If we've run out of the player's input, but still have parameters to
    ! specify, we go into "infer" mode, remembering where we are and the
    ! preposition we are inferring...

    if (wn > num_words) {
        if (inferfrom==0 && parameters<params_wanted) {
            inferfrom = pcount; inferword = token;
            pattern-->pcount = REPARSE_CODE + Dword__No(given_tdata);   ! <---- HERE
        }

        ! If we are not inferring, then the line is wrong...

        if (inferfrom == 0) return -1;

        ! If not, then the line is right but we mark in the preposition...

        pattern-->pcount = REPARSE_CODE + Dword__No(given_tdata);   ! <---- OR HERE
        return GPR_PREPOSITION;
    }

    o = NextWord();

    pattern-->pcount = REPARSE_CODE + Dword__No(o);   ! <---- OR HERE

In NounDomain(), the dictionary entry index position is extracted and fed to No__Dword() to obtain the dictionary word’s memory address:

        else {
            ! An inferred preposition.
            parse2-->1 = No__Dword(pattern-->j - REPARSE_CODE);   ! <---- HERE
            #Ifdef DEBUG;
            if (parser_trace >= 5) print "[Using preposition '", (address) parse2-->1, "']^";
            #Endif; ! DEBUG
        }

Likewise in PrintCommand():

[ PrintCommand from i k spacing_flag;
    if (from == 0) {
        i = verb_word;
        if (LanguageVerb(i) == 0)
            if (PrintVerb(i) == 0) print (address) i;
        from++; spacing_flag = true;
    }

    for (k=from : k<pcount : k++) {
        i = pattern-->k;
        if (i == PATTERN_NULL) continue;
        if (spacing_flag) print (char) ' ';
        if (i ==0 ) { print (string) THOSET__TX; jump TokenPrinted; }
        if (i == 1) { print (string) THAT__TX;   jump TokenPrinted; }
        if (i >= REPARSE_CODE)
            print (address) No__Dword(i-REPARSE_CODE);   ! <---- HERE
        else
            if (i in compass && LanguageVerbLikesAdverb(verb_word))
                LanguageDirection (i.door_dir); ! the direction name as adverb
            else
                print (the) i;
      .TokenPrinted;
        spacing_flag = true;
    }
];

So long as the two routines reverse each other’s mappings (without causing other issues like signed integer overflow), they will function in their current use within the library. That’s why the current Glulx implementations (just the identity function) don’t cause a problem. However, if these are to be true library routines, they should have a consistent functional meaning across Z-machine and Glulx, even if the implementation details differ. That’s why I’m saying that the Glulx implementations should be modified.

Routine Dword__No() should:

take the dictionary word’s memory address as input (call it word_addr)
determine the correct starting memory address for the first word in the dictionary (call it first_word_addr)
determine the correct memory size of each dictionary address (call it entry_len)
use the difference between word_addr and first_word_addr, divided by entry_len, to determine the word’s dictionary index position

Routine No__Dword() should:

take the dictionary word’s index position as input (call it word_index)
determine the correct starting memory address for the first word in the dictionary (call it first_word_addr)
determine the correct memory size of each dictionary address (call it entry_len)
use the product of word_index and entry_len, added to first_word_addr, to determine the word’s memory address

There are many places in the library where work is minimized by using constant expressions that mask informational details of what’s going on. As the compiler changes, some of the assumptions on which these expressions rely are becoming untrue. The new $ZCODE_LESS_DICT_DATA setting is breaking the assumption of the value of entry_length being 9. (I would suppose that those using Z3 format, which uses only 4 bytes of character storage and has an entry size of 7, have already run into this trouble if they ever tried to use Dword__No() for its ostensible function.) I’m pointing out that the value of the offset between the dictionary’s start address and the first word’s address is also being assumed to always be 7, but that’s not a valid assumption if a non-standard number of word-separators is used.

I’m not sure that it’s even possible to modify the number and values of the word-separator characters in Inform right now; the current list of three may just be hardcoded in the compiler. I do know that the Z-machine format allows for lists of different sizes. It seems entirely possible that future compilers may allow the author to modify these from the current default. Since you’re modifying the routines to take into account the correct entry size (which, happily, would fix the Z3 issue), it makes sense to me to future-proof them while you’re at it.

Using Inform 6.36 and StdLib 6.12.5, I’ve done unit testing for the following for Z-machine with and without $ZCODE_LESS_DICT_DATA=1 specified and for Glulx with and without $DICT_CHAR_SIZE=4 specified:

#Ifdef TARGET_ZCODE;

[ Dword__No word_addr;
    ! takes advantage of work done by ZZInitialise() to reduce overhead
    return (word_addr-dict_start)/dict_entry_size;
];

[ No__Dword word_index;
    ! takes advantage of work done by ZZInitialise() to reduce overhead
    return dict_start + (dict_entry_size * word_index);
];

#Endif;


#Ifdef TARGET_GLULX;

[ Dword__No word_addr   first_word_addr entry_len;
    first_word_addr = #dictionary_table + WORDSIZE; ! See Glulx Inform Technical Reference section 4
    ! following assumes only legal values for DICT_CHAR_SIZE are 1 and 4; also assumes number of dict_parN bytes is fixed at 3
    entry_len = (DICT_WORD_SIZE * DICT_CHAR_SIZE) + 7; ! 7 = 2 per dict_parN short plus 1 type byte
    if (DICT_CHAR_SIZE == 4) entry_len = entry_len + 5; ! additional short and 3 extra bytes in this case

    return (word_addr-first_word_addr)/entry_len;
];

[ No__Dword word_index   first_word_addr entry_len;
    first_word_addr = #dictionary_table + WORDSIZE; ! See Glulx Inform Technical Reference section 4
    ! following assumes only legal values for DICT_CHAR_SIZE are 1 and 4; also assumes number of dict_parN bytes is fixed at 3
    entry_len = (DICT_WORD_SIZE * DICT_CHAR_SIZE) + 7; ! 7 = 2 per dict_parN short plus 1 type byte
    if (DICT_CHAR_SIZE == 4) entry_len = entry_len + 5; ! additional short and 3 extra bytes in this case

    return first_word_addr + (entry_len * word_index);
];

#Endif;

Note that this thread is quite long and involves two distinct and essentially unrelated topics: the first (and original) is about conversion of dictionary index entries to word addresses and vice versa, and the second is about how to resolve the .grammar() routine ambiguous return value problem identified by fredrik. Doesn’t it make sense to split these into two threads at this point?

DavidG · February 14, 2022, 9:04pm

I’m doing that to get a handle on just how things would break if the actual length of the dictionary differs from what is assumed.

otistdog · February 15, 2022, 8:43pm

FYI, see the thread at Make #dictionary_table and ZCODE_LESS_DICT_DATA available when compiling for Z-machine? regarding the existence of virtual constant DICT_ENTRY_BYTES.

DavidG · February 15, 2022, 9:17pm

Got it! Let’s bring this thread back to the original revised topic of what to do about dictionary alignment.

DavidG · February 19, 2022, 11:24pm

Could I get some commentary from @DavidK and @zarf about the proposed fix for the dictionary alignment trouble? I think @otistdog has made a very compelling case for it, particularly in this post.

otistdog · February 20, 2022, 12:21am

The question here is deciding what type of action is to be taken and in what part(s) of the system (i.e. compiler, library, documentation, user source code) that action should be taken, correct?

This is the relevant part of the current implementation for handling .grammar() return values in the library (in Parser Letter A, with comments added):

    i = RunRoutines(actor, grammar); ! consult .grammar() routine of actor and store return value
    #Ifdef DEBUG;
    if (parser_trace >= 2 && actor.grammar ~= 0 or NULL)
        print " [Grammar property returned ", i, "]^"; ! debug output if TRACE at least 2
    #Endif; ! DEBUG

    #Ifdef TARGET_ZCODE;
    ! This section looks for -'verb' case by elimination
    if ((i ~= 0 or 1) && ! not true or false AND
        (UnsignedCompare(i, dict_start) < 0 ||
         UnsignedCompare(i, dict_end) >= 0 ||
         (i - dict_start) % dict_entry_size ~= 0)) { ! not a real dictionary word
        usual_grammar_after = j; ! notes current verb_wordnum for later use if needed
        i=-i;                    ! flips -'verb' value returned from .grammar() to 'verb' value
    }

    #Ifnot; ! TARGET_GLULX
    if (i < 0) { usual_grammar_after = j; i=-i; } ! Glulx version uses simplified but vulnerable test
    #Endif;

Note that Glulx is theoretically vulnerable to its own issues if the dictionary is placed so that it spans the 32-bit maximum signed integer value. At present, several things break when trying to do that because Glulx makes some different assumptions, but if the library is adjusted to make it work basically the same as Z-machine outside of the above portion of the code (a job which takes several changes to veneer and library code), then using the current logic any 'verb' values returned for words with addresses higher than MAX_POSITIVE_INTEGER will be handled incorrectly due to the signed comparison, i.e. assumed to be a -'verb' case.

The “evil twin” problem (where -'verb1' == 'verb2' due to unlucky word address alignment) is fundamentally a specification problem, as fredrik asserts. It cannot be fixed by the library. The presence of an evil twin dictionary entry means that the library cannot tell which twin was meant as the return value after checking the .grammar() routine. It is true that the library can be changed to accord with a new specification, but that would be an adaptation to a specification change, not a library bug fix.

I still think that it makes the most sense to leave the (flawed but not exactly broken) specification intact and to have the compiler eliminate any prospective “evil twin” dictionary entry by preventing the undesired alignment. I don’t know that I would call it fixing a compiler bug, but it would be the compiler doing its part to support the spec as it stands, and the compiler seems to me to be the best-situated part of the system for detecting and avoiding the issue. The logic for detection is not very complex and the impact of nudging the dictionary is negligible.

zarf · February 20, 2022, 1:02am

I have read the argument, which is the same as it was when this topic started, and I have not changed my mind.