Highly optimized abbreviations computed efficiently

You shaved 600 bytes off with that? Wow, good job! I’ll definitely look into it.
I guess the greedy approach isn’t the best one, even with the tiebreaker rule. The search continues!! :smiley:

Very good. I spent way too much time playing with this abbreviations stuff. I got it down to one build of the suffix tree (I’m now using the SA-IS suffix array algorithm with some custom routines on top to do the traversal) and some extra scoring logic, then I tried improving the abbreviations. What mine does is

  1. Identify twice as many winners as there are abbreviation slots, using the old logic. Any abbreviation eliminated because it completely contains an earlier winner is saved.

  2. Go through each of the abbreviations that have been saved, in order of their original score, and re-score the entire set with the long abbreviation placed above the first abbreviation it fully contains. If the score for the top 96 in the re-scored set is higher, use the abbreviation; otherwise, discard it.

  3. If any abbreviations in the second half of the set of 192 have a higher score than those in the set of 96, move them up.

This is all very ad-hoc, alas. I’ve looked to see if there are any good algorithms for this, but I’ve only found some which assume abbreviations can themselves be abbreviated, and a handwavy reference to the problem being NP-complete (whether in the number of abbreviations or the text size, I don’t know).

My best for Heart of Ice is 223,332 in about 7.6 s, and 88,566 for Zork 2 in 1.8 s (on my iMac). Almost all the time is in the re-scoring.

I’ve put my code up on Gitlab: Matthew T. Russotto / ZilAbbrs · GitLab

4 Likes

I’ve made a small modification of the algorithm. In the one below it’s only point 6 that is new.

1.   Collect all text strings from the zap-files in a vector (of strings). In C++ this
     becomes string1 string2 ... separated by 0 in an array of characters.

2.   Split all strings into unique keys of length 2 or longer.

3.   Calculate cost in z-chars for each key.

4.1. Count occurrences of each key in text.

4.2. Keep track of key with highest savings.

4.3. Save best candidate.

4.4. Remove best candidate from text.

4.5. Remove all keys with savings < LOW_SCORE_CUTOFF.

4.6. Repeat from 4.1 until we have up to 192 candidates in the best_candidates list.
     There could be fewer, if we run out of valid keys with savings > LOW_SCORE_CUTOFF.

5.   Find all candidates that overlap keys in best_candidates.

6.   Sort best_candidates and order top 96 candidates from longest to shortest.
        1-96 : Longest to shortest
        97-  : Descending savings value

7.   Calculate combined savings for 96 top keys in best_candidates.

8.1. Pick an overlapper and place it before the longest key it overlaps in new_best_candidates.

8.2. Calculate combined savings for 96 top keys in new_best_candidates. 

8.3. If new_score <= old_score, discard the overlapper and repeat from 8.1.

8.4. Use new_best_candidates as best_candidates.

8.5. Repeat from 8.1 until there are no more overlappers.

9.   Print 96 best_candidates.
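The greedy core (steps 4.1–4.6) can be sketched in C++ like this. The assumptions here are mine, not the post’s: each character costs one z-char and an abbreviation reference costs two, so a key of length L occurring N times saves roughly N*(L-2) z-chars; the cutoff value and the key-length cap of 10 are placeholders, and a real implementation would use a suffix tree/array instead of this brute-force scan.

```cpp
#include <cstddef>
#include <string>
#include <vector>

const int LOW_SCORE_CUTOFF = 1;  // step 4.5's cutoff (value assumed here)

// Count non-overlapping occurrences of key in text.
int countOccurrences(const std::string& text, const std::string& key) {
    int n = 0;
    for (std::size_t pos = text.find(key); pos != std::string::npos;
         pos = text.find(key, pos + key.size()))
        ++n;
    return n;
}

// Steps 4.1-4.6: repeatedly pick the key with the highest naive savings,
// remove it from the text, and stop at the candidate limit or when no
// key clears the cutoff.
std::vector<std::string> greedyCandidates(std::string text, std::size_t limit) {
    std::vector<std::string> best;
    while (best.size() < limit) {
        std::string bestKey;
        int bestSave = LOW_SCORE_CUTOFF;
        for (std::size_t i = 0; i < text.size(); ++i)
            for (std::size_t len = 2; len <= 10 && i + len <= text.size(); ++len) {
                std::string key = text.substr(i, len);       // step 2: keys of length >= 2
                if (key.find('\0') != std::string::npos) continue;
                int save = countOccurrences(text, key) * ((int)len - 2); // 4.1: naive savings
                if (save > bestSave) { bestSave = save; bestKey = key; } // 4.2: track best
            }
        if (bestKey.empty()) break;                          // 4.6: ran out of valid keys
        best.push_back(bestKey);                             // 4.3: save best candidate
        for (std::size_t pos; (pos = text.find(bestKey)) != std::string::npos; )
            text.replace(pos, bestKey.size(), 1, '\0');      // 4.4: remove from text
    }
    return best;
}
```

Overwriting each winner with a 0 separator (step 4.4) is what keeps later counts from re-counting already-abbreviated text.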

This is to stop it from, for example, picking “receptacl” instead of “receptacle.” because “e.” higher up has lowered the latter’s savings score.

Zork 2 is now down to 88,440 bytes and Heart of Ice is down to 223,004 bytes with this modification.

I made a small improvement to the above algorithm. In stage 8.1, instead of placing the overlapper immediately before the key it overlaps, I place it directly after the key that is one z-char longer, thereby maintaining the descending order of length for the top 96. This made a small gain for Zork II and saved 10 more bytes (down to 88,434 bytes). But when I tried it with Mini-Zork II the reverse happened; see the table below.

Zork II
Method  Org. size  Saved z-chars  Calc. saved bytes  Actual saved bytes  Diff  Game size
1         103,462         22,426             14,951              15,018    67     88,444
2         103,462         22,491             14,994              15,028    34     88,434

Mini-Zork II, Release 13
Method  Org. size  Saved z-chars  Calc. saved bytes  Actual saved bytes  Diff  Game size
1          60,258         10,269              6,846               6,842    -4     53,416
2          60,258         10,305              6,870               6,794   -76     53,464

This is probably due to how many “wasted” z-chars the abbreviations reclaim or introduce.

As I understand it, each string in the game always occupies an even number of bytes (2, 4, 6, 8, …). Every two-byte word contains 3 z-chars. If you have a string of 13 z-chars there will be two “wasted” slots in the last word. If you then abbreviate something in this string that brings it down to a length divisible by 3, you will reclaim these two slots. The other way around, you will “waste” slots if you have a string divisible by 3 that the abbreviation changes to a length that is not divisible by 3.
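The padding arithmetic can be written down directly; a small sketch with helper names of my own:

```cpp
// Z-characters are packed three to a 2-byte word, and every string is
// padded out to a whole number of words.
int wasted(int zchars) { return (3 - zchars % 3) % 3; }        // unused slots in the last word
int stringBytes(int zchars) { return 2 * ((zchars + 2) / 3); } // words * 2, rounded up
```

So shortening a 13-z-char string by a single z-char reclaims the two wasted slots and saves a whole word (2 bytes), while shortening a 12-z-char string by two z-chars (to 10) saves no bytes at all.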

Normally this would even out, I think, but you could maybe introduce a bonus/penalty for well/badly behaved abbreviations. This would have to be done on every string the abbreviation is applied to and summed up for each key. The cost is increased/decreased by a value between -2 and +2 for each time the abbreviation is used.

This, just to squeeze out the last bytes…

Am I thinking about this right, or am I missing something?

It’s definitely possible to determine the exact savings of a set of abbreviations, taking into account the wasted Z-characters, but you can’t really do it for a single abbreviation because other abbreviations in the same strings affect it.

I made some small changes to the algorithm:

  1. Overlappers are sorted from shortest to longest before they are tested against the top 96 to see if they qualify.
  2. Re-sort the bottom half (candidates 97–) of the BestCandidatesList.
  3. Overlappers that don’t qualify for a top spot don’t get discarded right away. Instead they are placed last in the list and only discarded if, during recalculation, they get a savings score below 0.
The result:
'   Zork II:
'      No Abbrev.   103,462
'      Zilf 0.9      90,368
'      Zilf beta     89,454
'      Sort once     88,384
'      Keep sorted   88,392
'
'   Mini-Zork II, Release 13:
'      No Abbrev.    60,258
'      Zilf 0.9      54,170
'      Zilf beta     53,846
'      Sort once     53,404
'      Keep sorted   53,468
'
'   Heart of Ice:
'      No Abbrev.   283,184
'      Zilf 0.9     232,408
'      Zilf beta
'      Sort once    222,992
'      Keep sorted  222,588
'
'   Trinity:
'      No Abbrev.   281,580
'      Zilf 0.9     257,408
'      Zilf beta    256,908
'      Sort once    255,232
'      Keep sorted  255,308 

It’s irritating that there’s a difference between whether the top 96 of the BestCandidatesList is only sorted once, before the overlappers are tested, or re-sorted after each overlapper is tested.

I don’t think this difference is only due to “wasted” z-chars; these should even out. I’m leaning more towards it being interference from partial overlappers that changes the result depending on which order the list is sorted in.

Mini-Zork II has a fair amount of globals for common texts. At what point - if any - does that start to work against the abbreviation finder? E.g. I could add a global for "You can’t ", but I think that’s a text that the better abbreviation finders detect on their own, even if ZILF 0.9 doesn’t.

The shortest one is <GLOBAL PERIOD-CR ".|"> which it uses for things like <TELL "You can't move the " D ,PRSO ,PERIOD-CR>, i.e. where it really can’t be made part of a longer string.

I think you’re about there. The GLOBALs aren’t as much of a savings as an abbreviation. I don’t have the math here right now, but the biggest reason is that the global only replaces strings in print instructions (.PRINTI, .PRINTR in ZAP).

I identified a couple of strings that I thought would save about 76 bytes, but the net gain was only 6 bytes because the abbreviations were less effective.

EDIT: I don’t think this would be a problem in bigger games. There the limit of 96 abbreviations is the hindrance and there’s plenty of room for GLOBALs.

Can you see if any of the current ones does more harm than good?

I can’t see any obvious ones. The PERIOD-CR is a borderline case. As I count it, there are 65 “.|” in the files (4 in DESCs). Maybe it would be picked up as an abbreviation and save 65*(4-2)-4 = 126 z-chars. This would replace an abbreviation that saves around 50, for a net gain of about 75 z-chars. From this we would need to subtract what is currently saved by the GLOBAL. It would only be a couple of bytes either way. Not worth the trouble, I say…

Do note the cost of the alternatives here. We’re talking about a string consisting of a dot and a newline character.

If we don’t have an abbreviation for this specific string, it takes up four five-bit codes, using up 4 bytes.

If we have an abbreviation, it takes up two five-bit codes, using up 2 bytes. This also means it uses one slot in the abbreviation table, so some other string we would benefit from creating an abbreviation for won’t get an abbreviation.

If we use a global, each reference to that global uses up 1 byte, due to the way Z-code is encoded.

If we use this string in say 50 places, the savings are big enough to matter.
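Taking the per-use figures from this post at face value (they are estimates, not exact Z-machine encoding sizes), the three options compare like this for a string printed in many places:

```cpp
// Byte cost of printing ".|" in `uses` places, per the estimates above.
// The abbreviation also consumes one of the 96 table slots, and the
// global additionally stores the string itself once.
int rawBytes(int uses)    { return 4 * uses; } // four 5-bit codes per use
int abbrevBytes(int uses) { return 2 * uses; } // two 5-bit codes per use
int globalBytes(int uses) { return 1 * uses; } // one byte per reference
```

At 50 uses that is 200 vs. 100 vs. 50 bytes.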

I think globals will cost a little more than 1 byte?

If you have a string like:
“You can’t do that.|Try something different instead.”

It will code in zap as:

PRINTR "You can't do that.|Try something different instead."

opcode + address + length of string, I think.

If we use a global for “.|” it will compile to:

PRINTI "You can't do that"
PRINT PERIOD-CR
PRINTR "Try something different instead."

3 opcodes + 3 addresses + length of strings - 4 z-chars.

The calculation becomes different if the PERIOD-CR is at the end of the string.

Then there’s the issue of whether the +/- 4 z-chars makes the string fit neatly into the 2-byte words, before or after the change.

I think it’s hard to know beforehand how much you will save or lose, especially in a small game like Mini-Zork II where the abbreviations get a bit exhausted at the end and only save between 40–50 z-chars, so there is no great abbreviation waiting in the wings to replace the global.

I’m a bit tired, so the math can be very off. If so, I apologize beforehand…

Oh, I was only talking about the case where a certain string is printed in many places, not where it’s part of longer strings. When replying to player actions, it’s pretty common to end a message with printing an object name or a pronoun and then a dot and a newline, and that’s exactly the reason this global was created (see Torbjörn’s post).

I’m not sure what I’m arguing here. :grinning:

I certainly think the globals are helpful, and removing the PERIOD-CR will probably add a few bytes, but there aren’t that many good globals/abbreviations left. Any new global will cannibalize the abbreviations and the net gain won’t be much.

1 Like

By improving the parse (how the text is broken into abbreviations and literals), I have gotten my previous best of 222,984 for Heart of Ice down to 222,672 (same abbreviation file; I never matched Henrik’s 222,588, but it is likely to improve that set as well). It turns out an optimal parse given a set of abbreviations is tractable, and an algorithm was known in 1973*. With an optimal parse the order of abbreviations does not matter.
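The optimal parse is a short dynamic program in the spirit of the Wagner paper: for every prefix of a string, take the cheaper of ending with a literal character or with any abbreviation that matches there. A sketch, under a simplifying assumption of my own that every literal costs 1 z-char and an abbreviation reference costs 2 (real costs depend on alphabets and shift characters):

```cpp
#include <algorithm>
#include <climits>
#include <string>
#include <vector>

// dp[i] = minimum z-chars needed to encode the prefix s[0..i).
int minZchars(const std::string& s, const std::vector<std::string>& abbrevs) {
    const int n = (int)s.size();
    std::vector<int> dp(n + 1, INT_MAX);
    dp[0] = 0;
    for (int i = 1; i <= n; ++i) {
        dp[i] = dp[i - 1] + 1;                 // encode s[i-1] as a literal
        for (const auto& a : abbrevs) {        // or end an abbreviation at i
            int len = (int)a.size();
            if (len <= i && s.compare(i - len, len, a) == 0)
                dp[i] = std::min(dp[i], dp[i - len] + 2);
        }
    }
    return dp[n];
}
```

With abbreviations “aa” and “abb”, the text “aabb” parses greedily as “aa”+“b”+“b” for 4 z-chars, but the optimal parse “a”+“abb” costs 3 — which is why, with an optimal parse, the order of the abbreviations stops mattering.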

Attached is a ZILF patch; be aware that this has not been well-tested at all, even by my own cowboy-coding standards.

optimalparse.txt (4.1 KB)

  • R. A. Wagner, “Common phrases and minimum-space text storage,” Commun. ACM 16(3), 1973, pp. 148–152
5 Likes

This is good!

If I apply your patch then, as I understand it, Zilf uses the best combination of abbreviations (from the full set) for each individual string. It improves the result, and I’ve updated my list with a couple of examples below. Heart of Ice is now down to 222,248 bytes.

Updated list (third column is with patch applied)
'   Zork II:
'      No Abbrev.   103,462
'      Zilf 0.9      90,368
'      Zilf beta     89,454
'      Sort once     88,384  88,336
'      Keep sorted   88,392  88,348
'
'   Mini-Zork II, Release 14:
'      No Abbrev.    60,316
'      Zilf 0.9      54,234
'      Zilf beta     53,906
'      Sort once     53,522  53,498
'      Keep sorted   53,560  53,488
'
'   Heart of Ice:
'      No Abbrev.   283,184
'      Zilf 0.9     232,408
'      Zilf beta
'      Sort once    222,992
'      Keep sorted  222,588 222,248
'
'   Trinity:
'      No Abbrev.   281,580
'      Zilf 0.9     257,408
'      Zilf beta    256,908
'      Sort once    255,232
'      Keep sorted  255,308 

This one is 222,208 with the old-style parse, 221,920 with the optimal parse. The algorithm is rather slow and brute-forcish, 40 s on my machine. Savings values in the file are the “naive” savings values rather than the true ones.

heartice_freq.zap.opt.221920.txt (6.8 KB)

This one was even more brute force, trying every possible abbreviation rather than just branching nodes. It took 65 minutes, so obviously not very practical, but the differences might provide some ideas. Size is 222,152 with standard parse, 221,868 with optimal parse.

heartice_freq.zap.opt.221868.txt (6.8 KB)

The differences are minimal. The only ones I can see are “ towards” instead of “ towards ” and “ against” instead of “ain”. The order is also a bit different, but that shouldn’t matter with “OptimalParse”, should it?

I’m gonna run my algorithms tonight to compare, but they are only experimental. They don’t use a suffix tree/array and are very slow for files this big (65–75 min). I have a limit of 60 characters for abbreviations, which keeps them from finding the “Crashing through a thicket…” string that I’m gonna remove.

I had a look at the Zapf code, but it has a problem with repeating phrases: “aaaaaaaaaaaaaaaaaaaa” is identified as 10x“aaaaaaaaaa”, which I haven’t found a way to patch yet.

(There is also a scanned PDF of the Wagner paper here, if anyone is interested…)

There are some claims in the literature that for English text, disallowing abbreviations with leading spaces or trailing spaces (but not both) improves compression (presumably by reducing overlap). I had tried this with earlier algorithms, with terrible results, but possibly it will work given my current algorithm:

  1. Calculate abbreviations, score by naive savings

  2. Put them in a max-heap.

  3. Remove top of heap abbreviation, add to the set of best abbreviations and parse entire corpus.

  4. Compute change in savings (using optimal parse) and declare that to be the new score for the current abbreviation.

  5. If the new score is less than the current score for the top-of-heap, remove the current abbreviation from the best_abbreviations set and throw it back on the heap.

  6. Repeat from step 3 until enough abbreviations are found or the heap is empty.
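The loop above can be sketched with a standard max-heap; `trueGain` is my stand-in for “parse the entire corpus with the candidate added to the current best set and measure the change in savings” (step 4), since the real optimal-parse scorer is a program of its own:

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <vector>

struct Cand { int score; std::string abbr; };
struct ByScore {                                  // max-heap on score
    bool operator()(const Cand& a, const Cand& b) const { return a.score < b.score; }
};

// Lazy greedy selection: only the current top candidate ever pays for
// the expensive re-scoring (steps 3-6 above).
std::vector<Cand> pickAbbrevs(
    std::vector<Cand> scored, std::size_t wanted,
    const std::function<int(const std::string&, const std::vector<Cand>&)>& trueGain) {
    std::priority_queue<Cand, std::vector<Cand>, ByScore>
        heap(ByScore{}, std::move(scored));       // steps 1-2: naive scores
    std::vector<Cand> best;
    while (best.size() < wanted && !heap.empty()) {
        Cand top = heap.top(); heap.pop();        // step 3: tentative winner
        top.score = trueGain(top.abbr, best);     // step 4: true delta-savings
        if (!heap.empty() && top.score < heap.top().score)
            heap.push(top);                       // step 5: demote, retry later
        else
            best.push_back(top);                  // step 6: accept
    }
    return best;
}
```

Because the heap is keyed on naive scores, a candidate whose true gain falls below the next naive score is simply thrown back and reconsidered later with its corrected score.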

1 Like

I finally got around to implementing your new algorithm. I’ve added a little twist to step 5.

5.1 Remove all abbreviations that have a score lower than the latest delta-savings and throw them back on the heap.

This doesn’t add much time to the algorithm (<5 s on Heart of Ice) and the result matches your brute-force method. My timings below are without using the Boyer–Moore–Horspool algorithm and generally not very streamlined in the indexing part. I’m also using the “MoreComplexDataStructures 1.9.1” NuGet package for the max-heap and I don’t know how it performs. Anyway, what I mean to say is that the times with and without throwing back low-scorers (step 5.1) only differ marginally.

Results
'   Zork II:        Zilf 0.9* Optimal Parse**
'      No Abbrev.   103,462
'      Zilf 0.9      90,368
'      Zilf beta     89,454
'      Sort once     88,384   88,336
'      Keep sorted   88,392   88,348
'      Max heap      88,278   88,252 58 s
'
'   Mini-Zork II, Release 14:
'      No Abbrev.    60,316
'      Zilf 0.9      54,234
'      Zilf beta     53,906
'      Sort once     53,522   53,498
'      Keep sorted   53,560   53,488
'      Max heap      53,506   53,428 27 s
'
'   Heart of Ice:
'      No Abbrev.   283,184
'      Zilf 0.9     232,408
'      Zilf beta
'      Sort once    222,992
'      Keep sorted  222,588  222,248
'      Max heap     222,148  221,880 183 s
'
'   Trinity:
'      No Abbrev.   281,580
'      Zilf 0.9     257,408
'      Zilf beta    256,908
'      Sort once    255,232
'      Keep sorted  255,308 
'      Max heap     255,024  254,988 103 s

*  Abbreviations are sorted from longest to shortest.
** Zilf 0.9 patched with "Optimal Parse".