How can I find out the length of a word in the dictionary (I6: Z-code or Glulx)? DICT_WORD_SIZE in Z-code returns 6 instead of 9.
!% -v8
[ Main key;
print DICT_WORD_SIZE, "^";
@read_char 1 -> key;
];
In the Z-machine, DICT_WORD_SIZE is the number of bytes in the dictionary word. It's always 6 (unless you go back to v3).
Z-machine encoding is variable-width. Those six bytes hold nine Z-characters, which is enough for nine letters, but only four punctuation marks (if I remember this correctly). So "aliphatic" fits in a word but "moon-calf" does not.
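To make the counting concrete, here is a minimal Python sketch of the cost rule described above. It assumes the default alphabet table for v2 and later; the function names `zchar_cost` and `fits_in_dictionary` are made up for this illustration and are not part of Inform or any interpreter.

```python
# Default Z-machine alphabet rows (v2 and later). The A2 row is listed
# without its two reserved slots (the ZSCII escape and the newline).
A0 = "abcdefghijklmnopqrstuvwxyz"
A1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
A2 = "0123456789.,!?_#'\"/\\-:()"


def zchar_cost(word):
    """Count the 5-bit Z-characters needed to encode `word`."""
    cost = 0
    for ch in word:
        if ch == " " or ch in A0:
            cost += 1  # space or a direct A0 code
        elif ch in A1 or ch in A2:
            cost += 2  # one shift plus one code
        else:
            cost += 4  # shift to A2, escape, then two 5-bit halves
    return cost


def fits_in_dictionary(word, limit=9):
    """True if `word` needs no more than `limit` Z-characters (9 in v4+)."""
    return zchar_cost(word) <= limit


print(zchar_cost("aliphatic"), fits_in_dictionary("aliphatic"))  # 9 True
print(zchar_cost("moon-calf"), fits_in_dictionary("moon-calf"))  # 10 False
```

The hyphen lives in the A2 row, so it costs two Z-characters, which is what pushes "moon-calf" to ten.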
(The Z-machine parser code never uses DICT_WORD_SIZE for anything, anyhow. The dictionary is handled by the interpreter directly.)
In Glulx, DICT_WORD_SIZE is treated slightly differently. It's a number of characters, and characters are normally stored in the dictionary one-per-byte, so it's also the number of bytes. But if you use the experimental setting DICT_CHAR_SIZE=4, then characters are stored in the dictionary one-per-word. So then a dict word has 4*DICT_WORD_SIZE bytes.
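As a small worked example of that arithmetic, the following Python sketch computes the bytes used by the text of one Glulx dictionary word. The helper name is hypothetical, and the value 9 used in the calls is only an assumed DICT_WORD_SIZE for illustration.

```python
def glulx_word_text_bytes(dict_word_size, dict_char_size=1):
    """Bytes occupied by the characters of one Glulx dictionary word."""
    if dict_char_size not in (1, 4):
        raise ValueError("DICT_CHAR_SIZE is either 1 or 4")
    return dict_word_size * dict_char_size


# Assumed DICT_WORD_SIZE of 9, for illustration only.
print(glulx_word_text_bytes(9))     # 9  (one byte per character)
print(glulx_word_text_bytes(9, 4))  # 36 (one 4-byte word per character)
```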
I know this is a little clumsy. It makes the library code simple, though.
(The standard I6 parser does not support DICT_CHAR_SIZE=4 yet.)
Those six bytes can store nine letters, but only four punctuation marks (if I remember this correctly).
And each accented character takes the place of four characters. There is not much room left if you have an accented character in your word, so we don't put accents in dictionary words.
You could define a new Alphabet Table (section 3.5 of the Z-machine standard). Since you never need to store uppercase letters in the dictionary anyway, those 26 codepoints could be used to encode accented lowercase letters using two slots instead of four. Meanwhile, uppercase letters in your string constants will occupy more space, but again the accented lowercase letters will occupy less, so the net effect is probably still a decrease in size.
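The trade-off can be sketched in Python by comparing the cost of a word under the default uppercase row and under a replacement row of accented lowercase letters. The replacement row below is hypothetical and fills only 16 of the 26 slots; the function name is also invented for this sketch.

```python
import unicodedata

A0 = "abcdefghijklmnopqrstuvwxyz"
A2 = "0123456789.,!?_#'\"/\\-:()"
DEFAULT_A1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# Hypothetical replacement for the uppercase row (16 of 26 slots shown).
ACCENTED_A1 = unicodedata.normalize("NFC", "àâäçéèêëîïôöùûüÿ")


def zchar_cost(word, a1_row):
    """Count Z-characters for `word`, given the contents of the second row."""
    cost = 0
    for ch in unicodedata.normalize("NFC", word):
        if ch == " " or ch in A0:
            cost += 1  # direct code
        elif ch in a1_row or ch in A2:
            cost += 2  # shift plus code
        else:
            cost += 4  # four-Z-character escape sequence
    return cost


for word in ("élève", "Hello"):
    print(word, zchar_cost(word, DEFAULT_A1), zchar_cost(word, ACCENTED_A1))
# élève 11 7
# Hello 6 8
```

With the default table "élève" needs 11 Z-characters and would be truncated in a nine-Z-character dictionary word; with the accented row it needs 7. The capital H in "Hello" becomes more expensive, as described above.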
I need to find the right place in the libraries to put it.
The DM4 TABLE 2B: HIGHER ZSCII CHARACTER SET says that the cedilla for ç is "@,c" but it's "@cc".
The Zcharacter directive used in this way makes these characters available in ZSCII, but not part of the alphabet table. This means each of these characters uses up four times as much space as the cheapest characters (normally lowercase a-z and space). You may want to create your own alphabet table to fix this.
You should perform all Zcharacter directives at the very beginning of the game, before any strings are declared, including Story and Headline. Otherwise, mayhem ensues.
For the Swedish translation of the library, I created a file called SweAlpha.h, to be included at the very start of the game source.
I put the Zcharacter directive in the libraries, at the beginning of the frenchU.h file (the equivalent of english.h), and it works. It's better than putting it in the source code of the game.
It seems that there is a difference between the Zcharacter directive in the libraries and the Zcharacter table in the source code. And it seems that with the Zcharacter directive, I am only allowed 10 accented characters. I'll check it out.
If I'm not misreading the docs, Inform will let you specify 26 characters to be encoded with a single Z-character, 49 to be encoded with 2 Z-characters, and the rest with 4 Z-characters. As this matters most for the dictionary, it makes sense for row 1 of the alphabet table to be only lower case, but it can be any lower case characters. For works in other languages it would pay to consider carefully the frequency distribution of the alphabet for the language. You could swap some of the less used non-accented characters in row 1 for other more frequently used accented characters.
For Swedish, I ended up putting the accented characters in the same positions as the characters I removed from the first row. In this way, interpreters which can't handle custom alphabet tables will still display something that can be read (kinda), and a few strings that are encoded so early that you can't stop them from getting it wrong (like "Class" and "Object" IIRC) will still display correctly or almost correctly.
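One rough way to look at the frequency distribution is to count letters in a sample of the game's text and take the most common ones as candidates for the single-Z-character row. The Python sketch below does only that; the function name and the sample string are invented for illustration, and it does not decide where in the row each letter should go.

```python
import unicodedata
from collections import Counter


def pick_row_candidates(sample_text, size=26):
    """Return the `size` most frequent lowercase letters in `sample_text`."""
    text = unicodedata.normalize("NFC", sample_text).lower()
    counts = Counter(ch for ch in text if ch.isalpha())
    return "".join(ch for ch, _ in counts.most_common(size))


# Tiny made-up sample; a real run would use the game's own vocabulary.
print(pick_row_candidates("été élève", size=3))
```

A real choice would also weigh dictionary words more heavily than printed text, since the nine-Z-character limit only applies to the dictionary.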
The maximum word length for v3 is 6 characters, but the dictionary seems to use 7 characters.
I use the No__Dword() function from the standard library, which I modified to work with v3, but for the result to be correct, the value 9 must be replaced by 7 and not by 6:
The length of a dictionary entry is 7 or 9 bytes(*). Part of that is the word text (6 or 9 Z-characters, but they fit into fewer bytes than that). The rest is various flags.
(* Actually the Z-machine allows them to be longer, but Inform always generates 7 bytes for v3 games and 9 for v4+.)
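The 7 and 9 can be reproduced with a little arithmetic: Z-characters are 5 bits each and are packed three to a 2-byte word, and Inform adds three bytes of flags after the text. Here is a minimal Python sketch of that calculation; the function names are hypothetical.

```python
def dict_text_bytes(zchars):
    """Bytes needed for `zchars` Z-characters, packed three per 2-byte word."""
    if zchars % 3 != 0:
        raise ValueError("dictionary text is a whole number of 3-Z-character words")
    return zchars // 3 * 2


def inform_entry_length(version, flag_bytes=3):
    """Dictionary entry length in bytes as generated by Inform."""
    zchars = 6 if version <= 3 else 9  # word text: 6 Z-characters in v3, 9 in v4+
    return dict_text_bytes(zchars) + flag_bytes


print(inform_entry_length(3))  # 7  (4 text bytes + 3 flag bytes)
print(inform_entry_length(8))  # 9  (6 text bytes + 3 flag bytes)
```

This is why a routine that steps through the dictionary has to use 7 in v3 rather than 6: the step is the entry length, not the number of letters.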