I6 compiler code generation improvements

I have spent the past few days making the I6 compiler smarter about skipping dead code. By dead code, I mean lines like this:

if (0) {
	print "Text.";
	return;
}

The release version of I6 will compile the print and return opcodes, even though the branch is guaranteed to skip over them. With my changes, this whole stanza compiles to nothing.

You’d think that’s pretty silly, but in fact the I7 compiler is sloppy about code generation, so I7 games are full of this stuff.

Also, I6 library code sometimes uses if (CONSTANT)... to add optional features. With this change, the feature code gets discarded if CONSTANT is zero.
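A minimal sketch of that pattern, as a standalone program with no library. The constant name SHOW_HINTS is made up for illustration and does not come from any real library:

```inform6
! Hypothetical feature switch; set to 1 to enable the optional code.
Constant SHOW_HINTS = 0;

[ Main;
    print "You are in a small room.^";
    if (SHOW_HINTS) {
        ! With SHOW_HINTS equal to 0, this block can never run.
        print "(Hint: try examining the door.)^";
    }
];
```

With the constant at 0, the hint block is the kind of code that the change described here would discard. Setting it to 1 brings the block back without touching the routine.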


Why am I posting? Because this is a pretty big change. I had to dig into the code generation logic in funky and complicated ways. (In contrast, the 6.36 code changes were relatively shallow.)

I spent a lot of time diffing assembly dumps to check my work. But I could still have made a mistake. So I’d like some eyes on this change before it gets pushed!

The source code is in this branch on GitHub. Download it, build it, and test it on your favorite I6 source code.

Some notes:

  • “Statement can never be reached” warnings are smarter now, but can sometimes appear in the wrong place.
  • This change has nothing to do with #ifdef. Code which is #ifdef'd out has always been compiled to nothing.
  • This change does not optimize out statements like if (test()) {} – that is, empty code blocks.
  • Inline assembly is compiled as written; it won’t be optimized out.
  • If you compile with the -a (assembly) switch, you’ll see a lot of apparently trivial jumps:
   27  +0003d <*> jump         L0 
   30  +00040    .L0

Don’t worry, these get stripped out at backpatch time. You can use txd to verify this.

If you want more info on the logic of my changes, see this comment in the branch code.


Some statistics for three games: Adventure (Advent), a one-room I7 game (I7-min), and Library of Horror (Library, built with PunyInform).

Game        Code seg   Savings
Library.z3     19396        36
Advent.z5      67720       384
Advent.ulx    101802       603
I7-min.z8     278008       856
I7-min.ulx    387062      1206

The second column is the size of the compiled game’s code segment (function code only), when compiling in my branch. The third column is the number of bytes saved over 6.36.

If you’re curious, the main savings in the I6 library come from the FullScoreSub() and AttemptToTakeObject() routines. In FullScoreSub(), if TASKS_PROVIDED is 1 (the default value), all the code that prints out task scores can be skipped. In AttemptToTakeObject(), if SACK_OBJECT is 0 (the default), all of the putting-in-your-sack-to-make-room code can be skipped.

(Yes, TASKS_PROVIDED is backwards: 1 means “no tasks”, 0 means “tasks are provided”. Sorry, not my fault.)

Amusing footnote: while checking for stripped code in the I7 game, I noticed that the entire body of FileIO_PutC() was stripped after the second line. Hey, look, the second line is an unconditional return! That’s a bug that’s been around for years.

In the I6 library, the AnalyseToken() routine also has a bug – the last two lines can never be reached. This is fairly irrelevant because this routine is only called under Grammar__Version 1, which is obsolete. However, it looks like this bug was picked up by the Metro84 library; see AdjectiveAddress() here.

A caveat: This change makes some jump Label patterns illegal which were previously legal in I6. I’m not thrilled about this – I never want an I6 change to make it harder to compile old code. However, the problematic cases are really obscure. I’m fairly sure that nobody has written code like this.

The summary is that if an entire code block is stripped out, labels within it are stripped out. So you can no longer do this:

if (0) {
	.Label;
	return;
}
jump Label;

But this only applies to entire code blocks. The following is still legal, if silly:

jump Label2;
.Label1;
print "world";
return;
.Label2;
print "Hello";
jump Label1;

You can jump into and out of code blocks freely, as long as the block hasn’t been entirely eliminated.

while (val < 10) {
	val++;
	.Loop1;
	if (val == 3) jump Loop2;
}
while (val < 10) {
	val++;
	.Loop2;
	if (val == 6) jump Loop1;
}

(Yeah, I’ve spent a lot of time coming up with deeply pointless code. See this file for example.)

If it turns out that some old code is affected by this problem, I can put in some sort of backwards-compatibility “don’t optimize” switch.

The PunyInform library is already quite careful to exclude code that’s not needed for the current build. Still, this saves 74/68 (z3/z5) bytes for minimal.inf, the game skeleton which by default disables all bells and whistles of the library. This is ~0.3% of the current size - not an extreme amount but certainly welcome! This compiler change also means we can change some #ifdefs to ifs, which I think will increase readability. Thanks!
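To make the #ifdef-versus-if comparison concrete, here is a rough sketch showing both forms side by side. The option names OPT_DEBUG_BANNER and OPT_VERBOSE are invented for this example and are not PunyInform options:

```inform6
! Hypothetical option names, for illustration only.
Constant OPT_VERBOSE = 0;

[ Main;
    ! Conditional compilation: tests whether the symbol is defined at all.
    ! OPT_DEBUG_BANNER is not defined here, so this line is skipped.
    #Ifdef OPT_DEBUG_BANNER;
    print "Debug build.^";
    #Endif;

    ! Ordinary if statement: tests the value of a constant,
    ! which must therefore always be defined.
    if (OPT_VERBOSE) {
        print "Verbose mode.^";
    }
    print "Done.^";
];
```

The two forms are not interchangeable in every case. #Ifdef asks whether a symbol exists, while if asks whether its value is nonzero, and code inside an if presumably still has to parse cleanly even when it ends up stripped.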

It saves surprisingly little space in I7. In theory, is there anything that could be done to improve it further? There must be loads of unused stuff in a mostly empty I7 project.

I haven’t looked at the code yet, but going only by zarf’s explanation, this fix would not eliminate functions without a caller, or redundant functions, or data that is not used.

That is on the assumption that it checks for if (mystaticbooleanexpression == null) and excises that without doing something else. And also the assumption that I haven’t missed something in his explanation, of course.

Correct.

The $OMIT_UNUSED_ROUTINES setting gets rid of functions without a caller. That is usually helpful for I7 games, which have a lot of template infrastructure like floating-point and unit-conversion which are not used by every game.
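For plain I6 source, one way to turn the setting on is a `!%` header comment at the very top of the file. This is a small sketch assuming that route; the setting can also be passed on the command line. The routine name NeverCalled is made up:

```inform6
!% $OMIT_UNUSED_ROUTINES=1

[ Main;
    print "Hello.^";
];

! Nothing calls this routine, so with the setting above it is a
! candidate for being left out of the story file.
[ NeverCalled;
    print "This routine has no caller.^";
];
```

Compiling the same file with and without the header line and comparing the output sizes is a quick way to see the effect.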

This change provides I7 games with a lot of very small improvements. For example, if you write

This is the tiny rule:
	rule succeeds.

…this compiles to the I6 function:

[ R_800 ;
    if (debug_rules) DB_Rule(R_800, 800);
    ! [2: rule succeeds]
    RulebookSucceeds(); rtrue;
    rfalse;
];

The final rfalse can’t be reached, so the change saves one byte. For a game with a lot of rules, it can add up.

It also saves a byte on routines that end like this:

if (test) return X;
else return Y;
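As a complete routine, that shape looks like the sketch below. The routine name Bigger is hypothetical. The saved byte is presumably the implicit return that would otherwise follow the if/else at the end of the routine, though that is an inference and not something stated above:

```inform6
[ Bigger a b;
    if (a > b) return a;
    else return b;
    ! Both branches return, so nothing after the if/else can be reached.
];

[ Main;
    print Bigger(3, 7), "^";   ! prints 7
];
```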

However, this isn’t primarily intended for I7 games, which usually target Glulx anyhow. The people counting bytes are the v3 users.

Cool, $OMIT_UNUSED_ROUTINES seems to reduce I7 story file size by about 13%, which is significant. It’s still relevant for reducing loading times and saving bandwidth when playing online.

Cool. Thanks, Andrew.

(For I7-ers who want to play along, the relevant thing beyond building the new inform6 is Use OMIT_UNUSED_ROUTINES of 1. in your code.)

PR with a detailed technical description of the compiler change: “Strip dead branch code” by erkyrath, Pull Request #164 in DavidKinder/Inform6 on GitHub.
