Customized Infocom game won't fit on version 3 Z-machine

Hello,

I’m trying to customize The Lurking Horror, by Dave Lebling, using the source code that Jason Scott released and the ZILF compiler by Jesse McGrew. Unfortunately, after optimizing abbreviations with zapf, I am over the limit for a version 3 Z-machine. This is the error I get when trying to assemble:

…/…/…/zilf-0.9.0-linux-x64/bin/zapf lurkinghorror.zap
ZAPF 0.9
Reading lurkinghorror.zap
Reading ./lurkinghorror_freq.zap
Reading ./lurkinghorror_data.zap
Reading ./lurkinghorror_str.zap
Measuring…
warning: packed address overflow for string: STR?19
warning: packed address overflow for string: STR?20
warning: packed address overflow for string: STR?21
warning: packed address overflow for string: STR?29
warning: packed address overflow for string: STR?489
warning: packed address overflow for string: STR?493
warning: packed address overflow for string: STR?199
warning: packed address overflow for string: STR?145
warning: packed address overflow for string: STR?340
warning: packed address overflow for string: STR?341
warning: packed address overflow for string: STR?339
Assembling
error: file length of 131306 exceeds platform limit by 234 bytes

So close!
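
For reference, the arithmetic behind that error can be reproduced in a few lines. This is only a sketch: the size ceilings (128 KiB for version 3, 256 KiB for version 5) and the packed-address multipliers (2 and 4) are taken from the Z-machine standard, while the function names and the sample address 131080 are made up for illustration.

```python
# Story-file size ceilings in bytes, per the Z-machine standard.
MAX_STORY_BYTES = {3: 128 * 1024, 5: 256 * 1024}

# A 16-bit packed address P stands for byte address P * multiplier.
PACKED_MULTIPLIER = {3: 2, 5: 4}


def overshoot(file_length: int, version: int) -> int:
    """Bytes over the ceiling (a negative result means room to spare)."""
    return file_length - MAX_STORY_BYTES[version]


def packed_address_fits(byte_address: int, version: int) -> bool:
    """True if a routine or string at byte_address can be reached
    through a 16-bit packed address in the given version."""
    return byte_address <= 0xFFFF * PACKED_MULTIPLIER[version]


print(overshoot(131306, 3))            # 234, matching the zapf error
print(overshoot(131306, 5) < 0)        # True: the same file fits in version 5
print(packed_address_fits(131080, 3))  # False (hypothetical string address)
print(packed_address_fits(131080, 5))  # True
```

The "packed address overflow" warnings are probably the same problem seen from the other side: strings placed past the last address a version 3 packed address can reach.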

Anyhow, I’ve been looking into migrating to version 5, but simply replacing <VERSION ZIP> with <VERSION 5> in lurkinghorror.zil introduces errors:

…/…/…/zilf-0.9.0-linux-x64/bin/zilf -c ./lurkinghorror.zil
ZILF 0.9 built 8/11/19 6:31:00 AM
[error ZIL0405] /home/mantas/Documents/machine_learning/projects/2020/ethical_rl/labeling_code/infocom_files/GamesBeingAnnotated/lurkinghorror/misc.zil:327: USL is not supported in this Z-machine version
[error ZIL0405] /home/mantas/Documents/machine_learning/projects/2020/ethical_rl/labeling_code/infocom_files/GamesBeingAnnotated/lurkinghorror/verbs.zil:222: USL is not supported in this Z-machine version
[error ZIL0207] /home/mantas/Documents/machine_learning/projects/2020/ethical_rl/labeling_code/infocom_files/GamesBeingAnnotated/lurkinghorror/cs.zil:2960: undefined global or constant: A?SOUTH
[error ZIL0207] /home/mantas/Documents/machine_learning/projects/2020/ethical_rl/labeling_code/infocom_files/GamesBeingAnnotated/lurkinghorror/cs.zil:3418: undefined global or constant: A?NORTH
46 warnings (46 suppressed)
4 errors

It looks like some opcodes changed / are no longer available in version 5. I see the USL opcode mentioned here, but I’m not sure how to figure out what it does and how to go about finding a version 5 replacement. Any advice would be greatly appreciated. Also, is migrating something like The Lurking Horror to version 5 feasible in the first place?

Thanks,

Mantas

Not completely unexpected, given that The Lurking Horror appears to be Infocom’s largest version 3 game. Dave Lebling did say in at least one interview that “some good scary stuff got cut out of it or never implemented due to size restrictions”.

USL is the opcode to force an update of the status line. In version 5 the status line has to be handled by the game itself, not the interpreter, which renders the opcode obsolete.

From what I understand, the parser from a version 3 game will not work when compiled as version 5 because some of the data structures the compiler generates are a bit different. I have no idea how much work it would be to update it, but obviously it could be done. Otherwise there would have been no version 5 “Solid Gold” re-releases. (So comparing those versions may possibly be helpful?)

But it’s clearly not as simple as just copying the parser from an existing version 5 game either, because the parser may contain game-specific hacks. E.g. the Lurking Horror LIT? routine has some special cases for light to shine through open doors.

I guess some space could be saved by removing things like the “OOPS” command?

There are differences between z3 and z5 opcodes, primarily that z3 uses bytes (8-bit) and z5 uses words (16-bit). The parser in “Lurking Horror” is designed for z3, and you’ll need to modify it carefully to run it under z5. There are some hints on how to do this in the zillib parser distributed with ZILF (it has compiler options that make it work in both z3 and z5).

It’ll probably be easier to save the extra 234 bytes by modifying the text. It might be possible to rephrase passages so that they make better use of the abbreviations and shed the excess bytes…

I have a better solution! I wrote a script that finds more optimised abbreviations for a game. (More optimised as in more optimised than Inform’s -u switch, and DEFINITELY more optimised than Infocom, who basically only used whole words.)

You can find it on my github: https://www.github.com/hlabrand/retro-scripts. I can also do it for you, I just need to extract the game text. Let me know!

Is it better than ZILF’s, though? I seem to recall Jesse remarking that ZILF is already more aggressive than both Zilch and Inform.

For Zork II there are the original Zilch versions of zork2.zap, zork2dat.zap, zork2freq.xzap & zork2str.zap, and below is the ZILF version of zork2_freq.zap

        ; Frequent words file for zork2.zap

        .FSTR FSTR?1," the "		; 851x, saved 2548
        .FSTR FSTR?2,"The "		; 418x, saved 1249
        .FSTR FSTR?3," is "		; 458x, saved 912
        .FSTR FSTR?4," you"		; 388x, saved 772
        .FSTR FSTR?5," and "		; 241x, saved 718
        .FSTR FSTR?6,"You "		; 227x, saved 676
        .FSTR FSTR?7,"here"		; 264x, saved 524
        .FSTR FSTR?8," to "		; 258x, saved 512
        .FSTR FSTR?9,". "		; 470x, saved 467
        .FSTR FSTR?10," of "		; 224x, saved 444
        .FSTR FSTR?11,"n't "		; 144x, saved 427
        .FSTR FSTR?12,", "		; 429x, saved 426
        .FSTR FSTR?13," with "		; 81x, saved 318
        .FSTR FSTR?14," in"		; 320x, saved 317
        .FSTR FSTR?15," a "		; 295x, saved 292
        .FSTR FSTR?16,"Wizard"		; 57x, saved 278
        .FSTR FSTR?17," that"		; 94x, saved 277
        .FSTR FSTR?18,"his "		; 135x, saved 266
        .FSTR FSTR?19,"are "		; 122x, saved 240
        .FSTR FSTR?20,"room"		; 121x, saved 238
        .FSTR FSTR?21,"all"		; 234x, saved 231
        .FSTR FSTR?22," from"		; 74x, saved 217
        .FSTR FSTR?23,"thing"		; 73x, saved 214
        .FSTR FSTR?24,"appears"		; 41x, saved 198
        .FSTR FSTR?25,".."		; 89x, saved 174
        .FSTR FSTR?26," of"		; 168x, saved 165
        .FSTR FSTR?27," be"		; 164x, saved 161
        .FSTR FSTR?28,"out"		; 162x, saved 159
        .FSTR FSTR?29," for"		; 79x, saved 154
        .FSTR FSTR?30,"and"		; 156x, saved 153
        .FSTR FSTR?31,"round"		; 50x, saved 145
        .FSTR FSTR?32,"large"		; 49x, saved 142
        .FSTR FSTR?33,"have"		; 72x, saved 140
        .FSTR FSTR?34," on"		; 141x, saved 138
        .FSTR FSTR?35," do"		; 138x, saved 135
        .FSTR FSTR?36,"dragon"		; 35x, saved 134
        .FSTR FSTR?37," see"		; 66x, saved 128
        .FSTR FSTR?38,"north"		; 44x, saved 127
        .FSTR FSTR?39,"Room"		; 44x, saved 127
        .FSTR FSTR?40,"which"		; 43x, saved 124
        .FSTR FSTR?41,"direction"		; 19x, saved 124
        .FSTR FSTR?42," it"		; 127x, saved 124
        .FSTR FSTR?43," through"		; 21x, saved 118
        .FSTR FSTR?44,"the"		; 118x, saved 115
        .FSTR FSTR?45,"Frobozz"		; 20x, saved 112
        .FSTR FSTR?46,"some"		; 57x, saved 110
        .FSTR FSTR?47," to"		; 109x, saved 106
        .FSTR FSTR?48,"light"		; 37x, saved 106
        .FSTR FSTR?49,"Oddly-angled "		; 9x, saved 102
        .FSTR FSTR?50," no"		; 102x, saved 99
        .FSTR FSTR?51,"A "		; 101x, saved 98
        .FSTR FSTR?52,"side"		; 50x, saved 96
        .FSTR FSTR?53,"edge"		; 50x, saved 96
        .FSTR FSTR?54,"princess"		; 17x, saved 94
        .FSTR FSTR?55,"volcano"		; 20x, saved 93
        .FSTR FSTR?56,"look"		; 47x, saved 90
        .FSTR FSTR?57,"**"		; 16x, saved 88
        .FSTR FSTR?58,"passage"		; 19x, saved 88
        .FSTR FSTR?59,"It"		; 91x, saved 88
        .FSTR FSTR?60,"west"		; 45x, saved 86
        .FSTR FSTR?61,"open"		; 44x, saved 84
        .FSTR FSTR?62,"ring"		; 44x, saved 84
        .FSTR FSTR?63,"unicorn"		; 18x, saved 83
        .FSTR FSTR?64,"already"		; 18x, saved 83
        .FSTR FSTR?65," has"		; 43x, saved 82
        .FSTR FSTR?66,"."""		; 42x, saved 80
        .FSTR FSTR?67," at"		; 83x, saved 80
        .FSTR FSTR?68,"close"		; 28x, saved 79
        .FSTR FSTR?69,"but"		; 82x, saved 79
        .FSTR FSTR?70,"He "		; 41x, saved 78
        .FSTR FSTR?71," would"		; 21x, saved 78
        .FSTR FSTR?72,"very "		; 27x, saved 76
        .FSTR FSTR?73,"enter"		; 27x, saved 76
        .FSTR FSTR?74," lead"		; 27x, saved 76
        .FSTR FSTR?75," over"		; 27x, saved 76
        .FSTR FSTR?76,"east"		; 39x, saved 74
        .FSTR FSTR?77," with"		; 26x, saved 73
        .FSTR FSTR?78,"You"		; 38x, saved 72
        .FSTR FSTR?79," an"		; 74x, saved 71
        .FSTR FSTR?80,"demon"		; 25x, saved 70
        .FSTR FSTR?81,"hat"		; 72x, saved 69
        .FSTR FSTR?82," deposit "		; 11x, saved 68
        .FSTR FSTR?83," can"		; 36x, saved 68
        .FSTR FSTR?84,"door"		; 36x, saved 68
        .FSTR FSTR?85,"This"		; 24x, saved 67
        .FSTR FSTR?86," like"		; 24x, saved 67
        .FSTR FSTR?87,"way"		; 70x, saved 67
        .FSTR FSTR?88,"

"		; 35x, saved 66
        .FSTR FSTR?89,"low"		; 69x, saved 66
        .FSTR FSTR?90,"huge "		; 23x, saved 64
        .FSTR FSTR?91," fee"		; 34x, saved 64
        .FSTR FSTR?92,"visible"		; 14x, saved 63
        .FSTR FSTR?93,"receptacle"		; 9x, saved 62
        .FSTR FSTR?94,"enormous "		; 10x, saved 61
        .FSTR FSTR?95,"explosion"		; 10x, saved 61
        .FSTR FSTR?96,"floor"		; 22x, saved 61
WORDS::
        FSTR?1
        FSTR?2
        FSTR?3
        FSTR?4
        FSTR?5
        FSTR?6
        FSTR?7
        FSTR?8
        FSTR?9
        FSTR?10
        FSTR?11
        FSTR?12
        FSTR?13
        FSTR?14
        FSTR?15
        FSTR?16
        FSTR?17
        FSTR?18
        FSTR?19
        FSTR?20
        FSTR?21
        FSTR?22
        FSTR?23
        FSTR?24
        FSTR?25
        FSTR?26
        FSTR?27
        FSTR?28
        FSTR?29
        FSTR?30
        FSTR?31
        FSTR?32
        FSTR?33
        FSTR?34
        FSTR?35
        FSTR?36
        FSTR?37
        FSTR?38
        FSTR?39
        FSTR?40
        FSTR?41
        FSTR?42
        FSTR?43
        FSTR?44
        FSTR?45
        FSTR?46
        FSTR?47
        FSTR?48
        FSTR?49
        FSTR?50
        FSTR?51
        FSTR?52
        FSTR?53
        FSTR?54
        FSTR?55
        FSTR?56
        FSTR?57
        FSTR?58
        FSTR?59
        FSTR?60
        FSTR?61
        FSTR?62
        FSTR?63
        FSTR?64
        FSTR?65
        FSTR?66
        FSTR?67
        FSTR?68
        FSTR?69
        FSTR?70
        FSTR?71
        FSTR?72
        FSTR?73
        FSTR?74
        FSTR?75
        FSTR?76
        FSTR?77
        FSTR?78
        FSTR?79
        FSTR?80
        FSTR?81
        FSTR?82
        FSTR?83
        FSTR?84
        FSTR?85
        FSTR?86
        FSTR?87
        FSTR?88
        FSTR?89
        FSTR?90
        FSTR?91
        FSTR?92
        FSTR?93
        FSTR?94
        FSTR?95
        FSTR?96

        .ENDI

At a quick glance they look pretty similar.
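
As an aside, the "saved" figures in the listing above look consistent with a simple Z-character count. The sketch below is inferred from those numbers, not taken from ZILF's source, so treat the formula as an assumption: each occurrence is replaced by a 2-Z-character abbreviation reference, and the abbreviation's own text has to be stored once. The character costs follow the default alphabet table in the Z-machine standard; the function names are hypothetical.

```python
# Default Z-machine alphabet rows (versions 2 and later).
A0 = "abcdefghijklmnopqrstuvwxyz"
A1 = A0.upper()
A2 = "\n0123456789.,!?_#'\"/\\-:()"


def zchar_cost(text: str) -> int:
    """Number of 5-bit Z-characters needed to encode text."""
    cost = 0
    for ch in text:
        if ch == " " or ch in A0:
            cost += 1   # space and lowercase letters: one Z-character
        elif ch in A1 or ch in A2:
            cost += 2   # shift + character
        else:
            cost += 4   # shift + escape + two halves of a 10-bit ZSCII code
    return cost


def estimated_saving(text: str, count: int) -> int:
    """Assumed formula: every occurrence shrinks to a 2-Z-character
    reference, minus the one-off cost of storing the abbreviation."""
    cost = zchar_cost(text)
    return count * (cost - 2) - cost


print(estimated_saving(" the ", 851))         # 2548, as in FSTR?1
print(estimated_saving("Wizard", 57))         # 278, as in FSTR?16
print(estimated_saving("**", 16))             # 88, as in FSTR?57
print(estimated_saving("Oddly-angled ", 9))   # 102, as in FSTR?49
```

Three Z-characters are packed into each 2-byte word, so these figures are in Z-characters rather than bytes.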

What would your script need to generate the abbreviations?

I’d guess it’s the strings from all the PRINT statements in zork2.zap plus all the strings in zork2str.zap?

The ZILCH version of Zork II is 90 112 bytes and the ZILF version is 90 368 bytes.

(I don’t have a graphic of the frying-pan-wielding ZILF flipping a table, but if I did, I’d post it)

To be honest I haven’t tried comparing with ZILF or ZILCH (I hadn’t realized they had their own abbreviation-making utilities; I thought OP meant “with Infocom’s abbreviations”). All I know is that it freed 3-4k (out of 128k) compared to Inform. It might be that their algorithms find something better!

The script is made to run on “gametext.txt”, the output of the inform compiler, with options to skip the first few lines (abbreviations) and the last few lines (vocabulary). If you have a file with all statements between quotes + zork2str, and set it to skip 0 lines at the top and the bottom, it should be good?

I could easily generalize it and support command line options and different formats; it was just tailored to my own needs for now :)

It shouldn’t be too hard to pre-process the ZAP files to generate a gametext.txt that can be used in your script. It would be fun to benchmark a ZIL file with the three different abbreviation files (Infocom’s original, ZILF’s and the one generated by your script).

It’s been on my todo list, but now that other people are interested, I might give it a go very soon :)

Hi all,

I downloaded the zork2 repository from github, then did
cat *.zap | grep -o '".*"' | sed 's/^"\(.*\)"$/\1/' >gametext.txt
to extract everything between quotes in all zap files. (The output is attached, if someone wants to confirm this is the right text…)
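
A line-based pipeline cannot see strings that span several lines (like FSTR?88 in the listing above) or tell a doubled quote from the end of a string (as in FSTR?66). Here is a rough Python sketch of the same extraction that handles both cases. It assumes a doubled quote is how ZAP escapes a literal quote, which is what the listing suggests; the function name is made up, and a quote inside a `;` comment would still confuse it.

```python
import re

# A quoted ZAP string: anything between quotes, where "" stands for
# a literal quote. The character class also matches newlines, so
# strings that span several lines are captured whole.
QUOTED = re.compile(r'"((?:[^"]|"")*)"')


def extract_strings(zap_source: str) -> list[str]:
    """Return the contents of every quoted string, with "" unescaped."""
    return [m.group(1).replace('""', '"') for m in QUOTED.finditer(zap_source)]


sample = (
    '\t.FSTR FSTR?66,"."""\t\t; 42x, saved 80\n'
    '\t.FSTR FSTR?88,"\n\n"\t\t; 35x, saved 66\n'
)
print(extract_strings(sample))  # ['."', '\n\n']
```

To build a gametext.txt this way, one would read each .zap file, call `extract_strings` on its contents, and write the results out one per line (with some convention for embedded newlines).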

I then ran my abbreviation-finding script (ignoring the last 705 lines, which seemed to be vocabulary), which output the attached freq file. By my calculations, my script finds savings of 19870 units, and ZILF finds 19630 units. So uh, my script beats ZILF by 80ish bytes? Which is a pretty low amount :D

I do wonder, however, if my gametext.txt is correct or not… ZILF’s abbreviations show that “**” appears 16 times, when it only appears once in my gametext.txt file. Maybe my gametext file needs to be tweaked, which would yield a very little bit more savings?

@heasm66, are you able to compile zork2 with my abbreviation file and let us know the file size?

I am also very intrigued by the difference in file size between ZILCH and ZILF (presumably each with its own set of abbreviations?). I counted, and Infocom’s abbreviations only save 15921 units, which is about 1.2kb less than ZILF’s abbreviations and mine. Why would the file size be smaller with Infocom’s compiler?

gametext.txt (83.4 KB)
mygame_freq.txt (6.5 KB)

Presumably the difference comes from something other than text, e.g. the size of the generated tables or routines. Looking at the memory map would give a hint: is the difference in RAM, static memory (pure tables), or high memory (routines and strings)?

If it’s high memory, one possibility is that ZILF is generating bigger routines. Its optimizer is generally at least as aggressive as ZILCH’s, but it might be generating extra instructions to clean up the stack.
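
One way to get at that memory map is to read it from the story-file header. The sketch below relies on the header layout in the Z-machine standard (version in byte 0x00, high memory base in the word at 0x04, static memory base in the word at 0x0E). The function name and the region arithmetic are illustrative, and it assumes the usual layout where static memory ends where high memory begins.

```python
import struct


def memory_map(story: bytes) -> dict:
    """Split a story file into its three regions using header fields."""
    version = story[0]
    (high_base,) = struct.unpack_from(">H", story, 0x04)
    (static_base,) = struct.unpack_from(">H", story, 0x0E)
    return {
        "version": version,
        "dynamic": static_base,          # RAM: bytes 0 .. static_base - 1
        "static": high_base - static_base,
        "high": len(story) - high_base,  # routines and strings
    }


# Tiny fake story file, just to exercise the function.
fake = bytearray(1000)
fake[0] = 3
struct.pack_into(">H", fake, 0x04, 600)  # high memory base
struct.pack_into(">H", fake, 0x0E, 200)  # static memory base
print(memory_map(bytes(fake)))
# {'version': 3, 'dynamic': 200, 'static': 400, 'high': 400}
```

Running this on the ZILCH and ZILF builds and comparing the three numbers would show which region accounts for the difference.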

Thanks everyone for your replies!

It turns out compiling without sound effects lets the modifications fit. This leaves the Z-code file at around 200 bytes under the limit for V3. I was hoping to make further modifications, probably no more than 2 KB, but modifying the parser for V5 seems pretty challenging based on a diff of the original and solid gold versions of Wishbringer and HHGTTG (I’m very new to IF). I’ll look into removing the OOPS/AGAIN commands–thanks for the suggestion @eriktorbjorn! Any idea how much space this would save?

I’m afraid not. It just seemed like one of the more expendable parser features to me [cue screams of protest here] since I’ve hardly ever used it myself. (I may have used AGAIN from time to time, though probably more often in games that have random combat in them.)

The ZILCH version of Zork 2 I quoted is the DAT-file version 48 compiled 1984-09-04. There will be differences between ZILCH and ZILF that affect the game size outside of the abbreviations.

I’m gonna compile the same ZIL-source with ZILF and try the different abbreviation-files to get a more accurate benchmark test of the effectiveness of the abbreviations.

Ok. Now I’ve run some tests…

Source for the test is this version of Zork II. This repository contains the original Infocom files slightly modified to compile with ZILF and with some bug fixes (Thank you, @eriktorbjorn!).

And without much further ado, here are the results:

Zork 2 without abbreviations                103.462 bytes
Zork 2 with the ZILCH abbrev-file            92.192 bytes (saved 11.268 bytes)
Zork 2 with the ZILF abbrev-file             90.368 bytes (saved 13.094 bytes)
Zork 2 with the mygame_freq.txt abbrev-file  89.556 bytes (saved 13.906 bytes)

@mmazeika It appears that you could save about 1% extra by using the algorithm by @mulehollandaise to build the abbreviations.
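
A back-of-the-envelope check of that 1% figure, using the Zork II sizes from the table above. The last step carries the same ratio over to The Lurking Horror, which is purely an assumption: the game's text is different and the ratio may well not transfer.

```python
# Zork II sizes in bytes, from the results table.
zork2_zilf = 90368     # with the ZILF abbrev-file
zork2_new = 89556      # with the mygame_freq.txt abbrev-file

gain = zork2_zilf - zork2_new
ratio = gain / zork2_zilf
print(gain)                   # 812
print(f"{ratio:.2%}")         # 0.90%

# Hypothetical extrapolation to the failing Lurking Horror build.
lurking_size = 131306         # from the zapf error message
needed = 234                  # bytes over the version 3 limit
projected = round(lurking_size * ratio)
print(projected > needed)     # True
```

If the ratio held, the projected saving would be several times the 234 bytes that were needed, but only an actual build can confirm it.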

A little analysis between the different algorithms. I’ll call them ZILCH, ZILF and NEW.

ZILF and NEW are able to find abbreviations that start with a space, like " the ", " is ", " a " and " and ". " the " alone saves almost 1000 bytes compared to "the ".

ZILCH only works with full words, and NEW is better than ZILF at finding parts of words. "ing ", for example, is a phrase that ZILF is missing.

One strange thing is that ZILF finds 28 occurrences of "Oddly-angled " while NEW finds 28 occurrences of “Oddly-angl”. Is there a 10-character cut-off in NEW? There could be more to gain if NEW identified longer phrases. The 10 occurrences of “Crystal sp” could be “Crystal sphere”. EDIT: When I try it, “Crystal sphere” saves 6 bytes and "Oddly-angled " adds 2 bytes. Presumably this is because "ed " and “here” are already abbreviated.