I created a new Python-based ZIL/ZILF-to-.z compiler. It does the traditional six or so steps in one program. For the Infocom games with released source, it produced smaller .z files than the Infocom system did. It's available for install via pip. See awohl dot com (I can't post links).
Was AI used to generate most of the code?
Seems like it.
Claude is listed as a co-author.
I appreciate the eagerness and interest, and I myself am a big fan of announcing things before they’re ready for public consumption (hence the name). I’m even a believer in AI coding assistants.
However, in my experience, AI coding assistants are especially prone to hallucinating when it comes to ZIL and MDL, because there are so few examples of valid code in these languages. For example, every difference this document identifies between the syntax of “MDL ZIL” and ZILF is incorrect, and several of the operations listed in this file do not exist.
That said, this project would benefit greatly from some concrete examples of what it can compile and what sort of output it produces. I see several places in the repo where one might look for working examples: examples, games, sample_z, tests/test-games, tests/test-games/examples, etc. Which of those can actually be compiled and run to completion?
One thing I found helpful while developing ZILF was to set up automated end-to-end tests that compile a project, run it in an interpreter with a predictable RNG, feed in a command file based on a walkthrough, and compare the expected output to a known-good “golden” transcript. This is useful not only for confirming that the compiler works, but also for making sure that the changes you make to support new games don’t break the games that were already working.
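As a rough illustration (this is not ZILF's actual harness, and the dfrotz interpreter with its -s seed flag is just one assumed choice; check your interpreter's documentation, since flags vary), that loop can be a few lines of Python:

```python
import subprocess
from pathlib import Path

def run_golden_test(story: Path, walkthrough: Path, golden: Path) -> bool:
    """Play a compiled .z file against a walkthrough and diff the transcript.

    Assumes a dumb-terminal interpreter (dfrotz here) is on PATH and that
    -s pins its RNG seed so the output is reproducible.
    """
    result = subprocess.run(
        ["dfrotz", "-s", "123", str(story)],
        input=walkthrough.read_text(),  # one game command per line
        capture_output=True,
        text=True,
        timeout=60,
    )
    if result.stdout != golden.read_text():
        # Save the actual transcript next to the golden one for diffing.
        golden.with_suffix(".actual").write_text(result.stdout)
        return False
    return True

# Hypothetical paths, just to show the shape of a test case:
if __name__ == "__main__":
    ok = run_golden_test(
        Path("build/game.z3"),
        Path("tests/game.walkthrough"),
        Path("tests/game.golden"),
    )
    print("PASS" if ok else "FAIL")
```

Once a game passes, check its transcript in and run the harness on every change, so a regression in an already-working game shows up immediately.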
AI coding assistants can be very effective when they have objective tests to work with, so you may want to do this yourself too. If you want to test against Infocom’s games from historicalsource, I’d recommend starting with these:
Zork I
Zork II
Zork III
Sherlock
Beyond Zork
Zork Zero
Bureaucracy
Zork (German)
…as well as the ZILF samples Cloak of Darkness and Advent.
As you move down that list, you may find that many of the things Claude dismissed as “unnecessary”, “rare”, or “complex” are more important than they first seemed.
I also find it a bit alarming that your LLM reports:
Original Infocom source code works perfectly
- Zork (if we had source): ✓ Would compile
It seems to be asserting that based on no evidence whatsoever! A compiler needs to be tested by actually compiling things and examining the output, not just claiming that they “would compile”. And the three Zorks are in fact the only Infocom games whose source is released under an open license (MIT), letting them be incorporated into test suites.
Excuse my ignorance, but how do you know it would compile if you don’t have the source? And how do you know that the source code works perfectly if you don’t have the source code and you can’t compile it?
Yes, AIs will try to weasel out of any to-do list. However, I have been making compiler test suites since the 1970s, and I know that with an AI you have to ask how many of the, say, 117 inputs produce proper output (and then verify it). Claude told me at least 10 times that we had complete coverage of V1-V8 (except V7, which it didn't find examples of). Yet when I went to play a higher-V-version example, it didn't work at all.
I did try running a game or two for each .z version, but I didn't make a systematic list of what worked, and I'm not aware of anything that doesn't work. I'll check the list you mentioned and more.
Here is the list of what worked out of the box from your list:
Compilation Results Summary
Successful Compilations
| Game | Version | Size | Notes |
|---|---|---|---|
| Zork I | z3 | 35,884 bytes | Full compilation |
| Enchanter | z4 | 37,928 bytes | Full compilation |
| ZILF Hello | z3 | 638 bytes | Standalone sample |
| ZILF Mandelbrot | z5 | 944 bytes | Standalone sample |
| ZILF Name | z3 | 1,006 bytes | Standalone sample |
Problems being fixed now:
Failed Compilations - Issues to Fix
| Game | Issue | Root Cause |
|---|---|---|
| Zork II | Unexpected character : at line 3580 | Lexer angle_depth goes negative (-1), causing comment parsing to fail. The : appears in a string inside a commented form ;<COND …> |
| Zork III | Unexpected character : at line 3496 | Same issue as Zork II |
| Beyond Zork | Unexpected character : at line 3511 | Same issue - lexer state corruption |
| Zork Zero | Unexpected character : at line 128 | Same issue |
| Bureaucracy | Unexpected character : at line 187 | Same issue |
| Zork German | Unexpected character % at line 2068 | % used in atom names for special German characters (e.g., SKARAB%AUS) - not handled by lexer |
| Sherlock | Missing file debug.zil | The repository is incomplete |
| ZILF Cloak | Missing parser.zil | Needs zillib - uses <INSERT-FILE “parser”> |
| ZILF Advent | Missing parser.zil | Needs zillib - uses <INSERT-FILE “parser”> |
Bugs to Fix (a rough sketch of these fixes follows this list)
- Lexer angle_depth tracking (zilc/lexer/lexer.py)
  - The angle_depth counter goes negative when there are unbalanced > characters
  - This corrupts the lexer state and causes form comments (;<…>) to be parsed incorrectly
  - A : inside a string in a commented form is then treated as a token
- % character in atoms (zilc/lexer/lexer.py:472)
  - The Zork German source uses % in atom names for umlaut encoding (e.g., SKARAB%AUS for Skarabäus)
  - Need to add % to valid atom characters in is_atom_char() or handle it specially
- Missing library support
  - ZILF samples using <INSERT-FILE “parser”> need the zillib path to be included
  - The compiler needs to search include paths, similar to ZILF's -i flag
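Here's a rough sketch of what those fixes might look like. I haven't written the final code yet, so the shapes below (the ATOM_CHARS set, the Lexer class, resolve_insert_file) are illustrative rather than the actual contents of zilc/lexer/lexer.py:

```python
from pathlib import Path

# Sketch only: the real zilc/lexer/lexer.py is structured differently.
ATOM_CHARS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-?!.")

def is_atom_char(ch: str) -> bool:
    # Fix 2: accept % so atoms like SKARAB%AUS (Skarabäus) lex correctly.
    return ch.upper() in ATOM_CHARS or ch == "%"

class Lexer:
    def __init__(self) -> None:
        self.angle_depth = 0    # nesting depth of <...> forms
        self.in_string = False  # set while scanning a string literal

    def on_open_angle(self) -> None:
        if not self.in_string:  # Fix 1a: brackets inside strings don't nest
            self.angle_depth += 1

    def on_close_angle(self) -> None:
        if self.in_string:
            return
        # Fix 1b: never let the depth go negative; a stray > would otherwise
        # corrupt the state used to skip commented forms like ;<COND ...>,
        # which is how a : inside a string ends up treated as a token.
        if self.angle_depth == 0:
            raise SyntaxError("unbalanced '>'")
        self.angle_depth -= 1

def resolve_insert_file(name: str, include_paths: list[Path]) -> Path:
    # Fix 3: for <INSERT-FILE "parser">, search include directories,
    # analogous to ZILF's -i flag.
    for base in include_paths:
        candidate = base / f"{name}.zil"
        if candidate.exists():
            return candidate
    raise FileNotFoundError(f"{name}.zil not found on include path")
```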
“Running” as in running the compiler, or did you play the resulting games? I wouldn’t check them off the list until you know they can be played through to a winning outcome, at the very least.
“Worked” in what sense?
If your Zork I compilation comes out to only 35 KB, I guarantee it’s not a complete and functional game.
I am more used to making test data for conventional programming languages: test programs with expected output. Is there any automated testing for .z files?
The z-code part of Zork I and Enchanter might match your numbers, but the complete playable games should be around 80-90 KB. I very much doubt the objects get compiled correctly.
Well, there’s the approach I took with ZILF’s integration tests, which you can see as Zilf.Tests.Integration in the ZILF repo.
But before automated tests, I’d suggest trying some manual tests. Compile the game, play it, and see how far you can get. Walkthroughs are easy to find on Google.
That’s what I did to test Zork I for Glulx - I don’t have automated tests for Glulx yet, so I found a walkthrough and iterated on the compiler until I could make it to the end of the game without any crashes or bugs.