New feature that helps identify multiple occurrences of identical strings, or parts of strings, that are suitable for turning into constants.
New feature that illustrates how the abbreviations are applied to the strings.
New feature that shows statistical information on where in memory strings are located and how much space is lost to padding when strings are aligned.
New feature that auto-detects whether the source code is Inform6 or Zilf.
New feature that can generate abbreviations from output extracted from binaries with TXD and Infodump from ZTools.
I thought it could be interesting to see what optimization is possible with the latest versions of Inform6 (in some examples I’m gonna use new features that are currently only available in pre-release form) with the help of ZAbbrevMaker.
PunyInform also has a document, PunyInform Game Author’s Guide, with a couple of useful tips and tricks that I will use as reference.
Baseline - with debug code and commands
If you compile with the -D switch (or define the constant DEBUG inside the code) we get the maximum size, with all the extra code and error checks that are available.
Every dictionary word in Z-code has four or six bytes to store the word string and then three bytes of data. The last of these data bytes hasn’t been used since ancient times in the standard library (grammar version 1) or in PunyInform and is always 0. This unused byte on each dictionary entry can be removed with the compiler switch $ZCODE_LESS_DICT_DATA=1.
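As a rough check on what this switch can save, the arithmetic can be sketched in Python. The function name and the 1000-word dictionary are made up for illustration; the entry sizes follow the description above (four bytes of word text in version 3, six bytes in later versions).

```python
def dictionary_bytes(word_count, zcode_version, less_dict_data=False):
    """Estimate the size of the dictionary entries in bytes (header excluded)."""
    text_bytes = 4 if zcode_version <= 3 else 6   # encoded word text
    data_bytes = 2 if less_dict_data else 3       # data bytes per word
    return word_count * (text_bytes + data_bytes)

# Hypothetical game with 1000 dictionary words, compiled to version 5
saved = dictionary_bytes(1000, 5) - dictionary_bytes(1000, 5, less_dict_data=True)
print(saved)  # one byte saved per dictionary word
```

The saving is exactly one byte per dictionary word, so it only matters for games with a large vocabulary.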
Generate full set of 96 abbreviations with Inform6
If we set the compiler switch $MAX_ABBREVS=96, Inform6 will generate a full set of 96 abbreviations. We then apply these generated abbreviations to the game and recompile.
Use all 96 abbreviations generated by ZAbbrevMaker
ZAbbrevMaker is a tool that generates a more optimal set of abbreviations. The z-machine standard allows up to 96 abbreviations; using more than the default 64 requires the compiler switch $MAX_ABBREVS=96. We apply the 96 abbreviations generated by ZAbbrevMaker to the game and recompile.
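To see why the choice of abbreviations matters, here is a simplified Python model of what a single abbreviation gains. It assumes that every use of an abbreviation costs 2 z-chars and that the abbreviated text is stored once; it ignores rounding to whole 2-byte words and the way abbreviations compete for the same text, so it is not how Inform6 or ZAbbrevMaker actually score candidates.

```python
def abbreviation_gain(zchar_cost, occurrences):
    """Net z-chars saved by one abbreviation (simplified model).

    Each occurrence is replaced by a 2 z-char reference, and the
    abbreviated text itself still has to be stored once.
    """
    return occurrences * (zchar_cost - 2) - zchar_cost

# " the" costs 4 z-chars in the default alphabet (space + three letters);
# the 250 occurrences are a made-up figure.
print(abbreviation_gain(zchar_cost=4, occurrences=250))
```

Three z-chars fit in two bytes, so dividing the result by 1.5 gives a rough byte figure.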
Use an optimized alphabet with help of ZAbbrevMaker
The normal alphabet is optimized to give the lowest cost for the letters a-z. If you actually count the frequency of the characters in the text, letters like j, q, x or z are often used less than comma, full stop or T. If we run ZAbbrevMaker with the switch -a we get the following alphabet for this game:
! Custom-made alphabet. Insert at beginning of game.
Zcharacter
"abcdefghi.klmnop,rstuvw'yT"
"ABCDEFGHIJKLMNOPzRSjUVWxYZ"
"012q456789*>!?_<]/[-:()";
The new alphabet should be inserted as early as possible because it is applied from the insertion point and forward. We now recompile with this alphabet and a new set of recalculated abbreviations.
Be sure to test that the interpreter on the platform you’re aiming for can handle a custom alphabet. According to the z-machine standard, a custom alphabet is only valid for version 5 and later. Even though it might work on modern interpreters, it is not certain that older interpreters, or ones for retro platforms, will handle it.
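The effect of swapping characters between the alphabet rows can be estimated with a small Python sketch. It uses a simplified cost model: space and first-row characters cost 1 z-char, newline and characters in the other two rows cost 2, and everything else costs 4. Abbreviations are ignored, as are any reserved entries Inform itself keeps in the third row. The sample sentence is made up.

```python
DEFAULT_ROWS = (
    "abcdefghijklmnopqrstuvwxyz",
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "0123456789.,!?_#'\"/\\-:()",
)
# The three rows from the Zcharacter directive above
CUSTOM_ROWS = (
    "abcdefghi.klmnop,rstuvw'yT",
    "ABCDEFGHIJKLMNOPzRSjUVWxYZ",
    "012q456789*>!?_<]/[-:()",
)

def zchar_cost(text, rows):
    """Count z-chars needed for text under the given alphabet rows."""
    first, second, third = rows
    total = 0
    for ch in text:
        if ch == " " or ch in first:
            total += 1      # space and first-row characters
        elif ch == "\n" or ch in second or ch in third:
            total += 2      # shift + character
        else:
            total += 4      # escape sequence for anything else
    return total

sample = "The cat, it sat."
print(zchar_cost(sample, DEFAULT_ROWS))  # 19
print(zchar_cost(sample, CUSTOM_ROWS))   # 16
```

The gain comes from T, comma and full stop moving into the first row. The price is that the displaced characters get more expensive: j and q now cost 2 z-chars, and Q costs 4 because it is no longer in any row.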
Remove SYMBOL TABLE from compiled game
This feature is not yet available but is coming in version 6.42 of Inform6.
Inform compiles in the names of the symbols in a table. These names are used to give better and more informative error messages. Hopefully this is not necessary in the final released version of the game and can be removed with the compiler switch $OMIT_SYMBOL_TABLE=1.
Move text from high strings to inline text in code
This feature is not yet available but is coming in version 6.42 of Inform6.
Inform has a cut-off length of 32 characters that decides when a string is stored in the high strings area instead of inline in the z-code. The opcode for printing inline text takes less space, and inline strings waste less memory from version 4 onward because high strings must be padded to fit packed addressing. The cut-off length can be changed with the compiler switch $ZCODE_MAX_INLINE_STRING.
Beware that very long inline strings could lead Inform to construct jumps larger than 8192, which produces a “branch out of range” compile error.
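The padding cost of high strings can be illustrated with a short Python sketch. It assumes each high string starts on a packed-address boundary and is padded up to the next one; the multipliers are the packed-address scale factors from the z-machine standard (2 for version 3, 4 for versions 4 and 5, 8 for version 8). The function names are made up.

```python
import math

# Packed-address multiplier per Z-code version
PACKED_DIVISOR = {3: 2, 4: 4, 5: 4, 8: 8}

def encoded_bytes(zchars):
    """Z-chars are packed three to a 2-byte word."""
    return 2 * math.ceil(zchars / 3)

def high_string_padding(zchars, zcode_version):
    """Padding bytes needed so the next high string starts on a packed address."""
    size = encoded_bytes(zchars)
    return -size % PACKED_DIVISOR[zcode_version]

# A string of 13 z-chars occupies 10 bytes
for version in (3, 5, 8):
    print(version, high_string_padding(13, version))
```

In version 3 the padding is always zero, because every encoded string is already a multiple of 2 bytes. In version 5 a string loses either 0 or 2 bytes, and in version 8 it can lose up to 6.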
ZAbbrevMaker has a switch, --onlyrefactor, that generates a list of all strings, or parts of strings, that appear multiple times in the text. For example, this game produces this (extract):
Long repeated strings:
3x106 z-chars (~ 134 bytes), ( end ) ". Of course, if someone brought me a tasty treat I might be inclined to do them a favour in return.~"
2x106 z-chars (~ 71 bytes), ( full ) " out of your hands, then briefly considers it before dropping it dismissively. ~Is that all ye got?~"
2x 99 z-chars (~ 66 bytes), ( full ) "The safety railing cannot be traversed. It wouldn't be much of a ~safety~ railing otherwise."
2x 94 z-chars (~ 53 bytes), (mixed ) ", although it does feel like if you pushed it then it would return to its original position."
...
If we replace the string ". Of course, if someone brought me a tasty treat I might be inclined to do them a favour in return.~" with a constant, regenerate the abbreviations and recompile, we save 134 bytes.
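For the simplest case, strings that are repeated in full, the idea can be sketched in a few lines of Python. This is not ZAbbrevMaker's algorithm, which also finds repeated parts of strings; the function name, the threshold and the sample list are made up for illustration.

```python
from collections import Counter

def repeated_strings(strings, min_length=20):
    """Return (count, text) pairs for strings that occur more than once."""
    counts = Counter(s for s in strings if len(s) >= min_length)
    repeats = [(n, s) for s, n in counts.items() if n > 1]
    # Most frequent first, then longest first
    return sorted(repeats, key=lambda item: (-item[0], -len(item[1])))

game_strings = [
    "The safety railing cannot be traversed.",
    "You can't go that way.",
    "The safety railing cannot be traversed.",
    "Taken.",
    "Taken.",
]
for count, text in repeated_strings(game_strings):
    print(f"{count}x {text}")
```

Short strings such as "Taken." are filtered out by the length threshold, since those are better left to the abbreviations.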
You really should do a test with the abbreviation list generated by inform -u $MAX_ABBREVS=96. That would let you directly compare Inform’s 96-list with ZAbbrevMaker’s 96-list.
That is true. (The line is #define MAX_ABBREV_LENGTH 64.) I don’t think the question of that limit has come up before.
I guess the thinking was that the author would notice large shared chunks of text and turn them into string constants or routines. The abbreviation mechanism was (notionally) for finding little pieces of text that were too common or annoying to do that for.
Do you think it’s worth making that a dynamic allocation?
That’s correct. The sensible thing would be to refactor and convert them to string constants or routines.
On the other hand, what harm would it do for Inform6 to allow longer abbreviations? As far as I know there are no restrictions on the length in the z-machine standard. One idea could be to allow abbreviations longer than MAX_ABBREV_LENGTH but issue a warning when an abbreviation exceeds it?
I’m gonna limit abbreviations to 64 characters in the next version when producing for Inform6, but make it adjustable.
Like with a lot of the I6 compiler’s limits, it was designed to save memory on machines with very little RAM. It’s not a problem for the generated Z-code, and nowadays, I don’t think anyone would notice the difference if it was raised from 64 to 1024 or whatever. (But it has to be an error rather than a warning because, if I understand right, it would overflow an internal buffer in the compiler.)
Note that this is a 64 byte limit, not a 64 character limit.
Both of the examples below are too long:
! the letter 'é' 23 times
Abbreviate "@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e@'e";
! the letter 'a' 64 times
Abbreviate "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
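A tool that generates abbreviations could check this before writing them out. The Python sketch below measures the abbreviation as it is written in the source file, so @'e counts as three bytes. Since the two examples show that 64 bytes is already rejected, it treats only shorter strings as safe; that 63 bytes is actually accepted is an assumption of the sketch, and the function names are made up.

```python
MAX_ABBREV_LENGTH = 64  # value of the #define quoted earlier in the thread

def source_bytes(abbreviation):
    """Length in bytes of the abbreviation as written in the source file."""
    return len(abbreviation.encode("utf-8"))

def fits_inform6(abbreviation):
    # 64 bytes is already too long according to the examples above,
    # so only strictly shorter strings are treated as safe (assumption).
    return source_bytes(abbreviation) < MAX_ABBREV_LENGTH

e_acute = "@'e" * 23   # 23 characters on screen
print(source_bytes(e_acute), fits_inform6(e_acute))    # 69 False
print(source_bytes("a" * 64), fits_inform6("a" * 64))  # 64 False
```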
I understand how the current MAX_ABBREV_LENGTH is/was useful to keep memory usage down while running the compiler, but I’m not sure limiting abbreviations based on memory used while compiling is a useful metric for a user creating a game.
Some sort of (possibly adjustable) limit would be nice (in both Inform and ZAbbrevMaker), at least as a warning, because long repeated strings probably shouldn’t be abbreviations.
If there are long repeated strings, they should definitely be converted to constants or refactored to routines. A warning in ZAbbrevMaker when one or more abbreviations exceed a limit is a fair compromise. (I think a hard limit that generates an error is unnecessary, because long abbreviations are perfectly legal according to the standard.)