Translates a ZSCII word to the internal, Z-encoded, text format suitable for use in
a @tokenise dictionary. The text begins at from in the zscii-text and is length
characters long, which should contain the right length value (though in fact the
interpreter translates the word as far as a zero terminator). The result is 6 bytes long
and usually represents between 1 and 9 letters.
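For anyone who wants to see what those 6 bytes look like, here is a minimal Python sketch of the encoding step. It is not interpreter code: the function name `encode_word` is made up for illustration, it assumes the default alphabet table of a V5 story file, and it treats Python's `ord()` value as the ZSCII code, which only holds for plain ASCII.

```python
# Sketch of dictionary-form Z-encoding (V4 and later: 9 Z-characters in 6 bytes).
# Assumes the default alphabet table; encode_word is a hypothetical name.
A0 = "abcdefghijklmnopqrstuvwxyz"      # Z-characters 6..31 in alphabet A0
A2 = "0123456789.,!?_#'\"/\\-:()"     # Z-characters 8..31 in alphabet A2


def encode_word(text, zchar_count=9):
    zchars = []
    for ch in text:
        if ch in A0:
            zchars.append(A0.index(ch) + 6)
        elif ch in A2:
            # Z-character 5 shifts to alphabet A2 for the next character only.
            zchars += [5, A2.index(ch) + 8]
        else:
            # Anything else: shift to A2, escape code 6, then the 10-bit
            # ZSCII code as two 5-bit halves (ord() is an ASCII-only stand-in).
            code = ord(ch)
            zchars += [5, 6, (code >> 5) & 0x1F, code & 0x1F]

    # Pad with Z-character 5 and truncate to the fixed dictionary length.
    zchars = (zchars + [5] * zchar_count)[:zchar_count]

    out = bytearray()
    for i in range(0, zchar_count, 3):
        word = (zchars[i] << 10) | (zchars[i + 1] << 5) | zchars[i + 2]
        if i + 3 >= zchar_count:
            word |= 0x8000             # top bit marks the final 2-byte word
        out += word.to_bytes(2, "big")
    return bytes(out)


print(encode_word("south").hex())      # 629a65a594a5
```

Plain letters cost one Z-character each, while digits and punctuation cost two or more, which is why the 6 bytes only "usually" hold 9 letters.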
I think I understand what the opcode actually does (looks at substring and puts the equivalent of a dictionary word entry into an array?), but I don’t understand how this opcode would be used in practice. Isn’t the dictionary static at runtime (so that a new word generated by this opcode can’t be added)?
The Z-Machine Standards Document doesn’t seem to say anything more about this opcode than DM4, other than to note that it is one of the 10 rarest opcodes found in Infocom games. Is this just part of the parser machinery? (And, if so, how can it appear only 7 times in Infocom works?)
You can pass the address of a custom dictionary to the @tokenise opcode, and this address can be in RAM if you want. So this opcode is a way to put arbitrary text into a custom dictionary, for customizing parsing at run-time.
@Draconis, OK, interesting. So you would insert the result of the @encode_text call as a “row” in a supplemental dictionary array (which I guess involves some research in the Z-Machine Standards first), and then you could pass that supplemental dictionary array as the third parameter of a @tokenise call?
… But then wouldn’t any custom dictionary have to be a “complete” dictionary in the sense that the @tokenise call could use only either the custom dictionary or the game’s normal dictionary when filling out the parse array? In other words, if using the custom dictionary then standard words like ‘south’ would end up with zero as their word value in the parse array, wouldn’t they? And how would any of the custom dictionary words be made useful when routines like WordAddress(), DictionaryLookup(), print (address), etc. look only in the normal dictionary? (…don’t they?)
The value of a 'word' literal in I6, as I understand it, is its memory address. So routines like print (address) take a memory address as their argument, and they don’t care where that address is in memory, as long as it points to an encoded dictionary word.
Now, using an alternate dictionary is annoying and complicated, because all the objects’ name properties and such are filled with pointers into ROM. So I’m not really sure what effect would warrant it; on a hunch, I checked the cubes in Spellbreaker, and they just store a table of strings and check it during parsing instead of altering the dictionary.
@Draconis, I wasn’t thinking straight when I listed WordAddress() up there – it doesn’t have any link to the normal dictionary at all, really. Thinking about things again in light of your comments, maybe really only DictionaryLookup() would be a problem due to hardcoding, and maybe not that much of one.
At any rate, thank you very much for the information – especially the comment on the nature of word literals. I’ve never really looked at them in that way before. Some of the underlying machinery is a lot clearer now. […such as why it’s “print (address)”.]
At least now I see how it would be possible (if, as you say, not necessarily simple) to put the opcode to use. Thanks again!
And no, the dictionary doesn’t have to be complete. You can check words against one dictionary first, setting all unrecognized words to 0 in the parse array. You can then check the sentence again against another dictionary, and provide a special flag to tell the @tokenise opcode to just skip unknown words, so they retain the value they had before in the parse array.
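The two-pass idea can be modelled in a few lines of Python. This is only a toy: the dictionaries are plain Python dicts, the addresses are invented, and `tokenise` here is a stand-in function, not the real opcode, which works on text and parse buffers in Z-machine memory.

```python
# Toy model of two @tokenise passes over the same input.
# All addresses and the tokenise() helper are hypothetical.
def tokenise(words, dictionary, parse, skip_unknown=False):
    for i, word in enumerate(words):
        address = dictionary.get(word, 0)   # 0 means "not in this dictionary"
        if address == 0 and skip_unknown:
            continue                        # leave the earlier result untouched
        parse[i] = address
    return parse


main_dictionary = {"go": 0x3A10, "south": 0x3B52}   # made-up ROM addresses
custom_dictionary = {"xyzzy": 0x8000}               # made-up RAM address

words = ["go", "xyzzy", "south"]
parse = [0] * len(words)

tokenise(words, main_dictionary, parse)                        # first pass
tokenise(words, custom_dictionary, parse, skip_unknown=True)   # second pass
print([hex(address) for address in parse])
```

After the second pass every slot holds a non-zero address, even though neither dictionary knows all three words.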
You can also signal whether a custom dictionary is sorted or not, so it’s easy to just add words in any order. (The standard dictionary is always sorted.)
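As far as I read the Standards document, the "unsorted" signal is a negative entry count in the dictionary header. Below is a hedged Python sketch of laying out such a table as raw bytes; `build_dictionary` and its parameters are invented for illustration, the entries are assumed to be already-encoded 6-byte words, and a real game would build the table in Z-machine RAM instead.

```python
import struct


# Sketch of a V4+ dictionary table: separator list, entry length,
# signed entry count, then the entries. build_dictionary is a hypothetical name.
def build_dictionary(encoded_words, separators=b',."', data_bytes=1, is_sorted=False):
    entry_length = 6 + data_bytes           # 6 bytes of encoded text plus data
    if is_sorted:
        encoded_words = sorted(encoded_words)
        count = len(encoded_words)
    else:
        count = -len(encoded_words)         # negative count: entries are unsorted
    header = (
        bytes([len(separators)]) + separators
        + bytes([entry_length])
        + struct.pack(">h", count)          # signed 16-bit, big-endian
    )
    body = b"".join(word + bytes(data_bytes) for word in encoded_words)
    return header + body


table = build_dictionary([bytes.fromhex("629a65a594a5"), bytes.fromhex("1111111111d1")])
print(table.hex())
```

With an unsorted table, appending a word is just a matter of writing one more entry and adjusting the count, at the cost of a slower lookup.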
@fredrik, this is very interesting. I was thinking that a custom routine would be needed to fill in only zero word values in the parse array, but it’s just one of the features left out of DM4’s discussion of the @tokenise opcode. (And now I see the details in the Z-Machine Standards document, so thank you very much for pointing it out!)