Hm, well, there were 97 distinct characters in the input corpus. So a minimal definition of the encoding (à la JPEG DHT markers) would require 16 bytes (for a list of counts of codes of each length, assuming you concur with JPEG’s implementation decision to limit Huffman codes to 16 bits) plus 97 bytes for the code-order list of the input characters, plus maybe a couple of bytes for counts and suchlike overhead (call it 115 bytes or so).
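Concretely, the stored form would be something like this (a sketch in C; the struct name and exact framing are mine, not anything from the JPEG spec):

```c
#include <stdint.h>

/* Stored form of the table, in the spirit of a JPEG DHT segment. */
struct huff_table {
    uint8_t bits[16];    /* count of codes of each length 1..16: 16 bytes */
    uint8_t huffval[97]; /* the 97 symbols, in code order:       97 bytes */
};                       /* 113 bytes, plus a couple bytes of framing */
```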
A simple decoder requires additional tables that can be derived from these (construction sketched below):
- list of the maximum code value for each code length (16 two-byte entries, so 32 bytes);
- starting offset in the code-order list of the symbols for each code length (so 16 bytes, or maybe 32 if you want to allow for more than 256 symbols).
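For concreteness, here's roughly how a decoder would build those tables from the stored counts, following the table-construction procedure in the JPEG spec's Annex F (a sketch; I also compute a small mincode array here, which libjpeg, if I recall, avoids storing separately by folding it into the offset table):

```c
#include <stdint.h>

/* Build the decoder's derived tables from the counts-per-length.
   Tables are indexed 1..16 by code length; a sketch, not production code. */
void build_tables(const uint8_t bits[16],
                  int32_t maxcode[17],  /* max code value per length */
                  int32_t mincode[17],  /* min code value per length */
                  uint16_t valptr[17])  /* offset into huffval per length */
{
    int32_t code = 0;
    uint16_t k = 0;
    for (int l = 1; l <= 16; l++) {
        valptr[l]  = k;
        mincode[l] = code;
        code += bits[l - 1];
        k    += bits[l - 1];
        /* -1 marks lengths with no codes, so decoding skips past them */
        maxcode[l] = bits[l - 1] ? code - 1 : -1;
        code <<= 1;  /* canonical codes: each length starts at double */
    }
}
```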
(JPEG decoders typically build some additional derived tables for more speed, but you could dispense with that if you only need decoding at human reading speed, even on 70s micros.)
If I were trying to do this in the confines of a Z-machine-like setting, I’d likely insist that the compiler create these additional tables, just because the extra code required to compute them from the minimal representation would take more than 64 bytes. Also, if you did that, I think you could skip storing the list-of-counts array, so the net hit is 16 bytes less. So while I didn’t spend time on specifying an exact table representation, I’d expect it to fall short of 200 bytes for this example.
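And the decode loop itself really is tiny, per the DECODE procedure in the JPEG spec's Annex F; something like this, where next_bit() is an assumed helper delivering the compressed stream one bit at a time (no bounds checks, since this is only a sketch):

```c
#include <stdint.h>

extern int next_bit(void);  /* assumed: returns the next stream bit, 0 or 1 */

uint8_t decode_symbol(const uint8_t huffval[],
                      const int32_t maxcode[17],
                      const int32_t mincode[17],
                      const uint16_t valptr[17])
{
    int l = 1;
    int32_t code = next_bit();
    while (code > maxcode[l]) {          /* grow the code until it fits */
        code = (code << 1) | next_bit();
        l++;
    }
    return huffval[valptr[l] + (code - mincode[l])];
}
```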
It might well be that the JPEG spec (written in the late 80s) reflects knowledge about how to implement Huffman decoding efficiently that didn’t exist ten years earlier, so maybe if somebody had been trying to do this in the 70s they’d have come to unnecessarily pessimistic conclusions about how big the table needed to be. But it doesn’t look to me like the table really needs to be much bigger than what it’d take to store an explicit representation of a Z-machine alphabet.