Next steps for Inform 6 compiler

You would probably need to recognize each pattern and optimize it individually. Is it worth it? I don’t know.

The compiler doesn’t have a good way to recognize patterns across more than one basic operation.

However, this specific case is generated within one operation. The code looks like

assemblez_2_to(get_prop_addr_zc, AO,
    ET[ET[below].right].value, temp_var1);
if (!void_flag) write_result_z(Result, temp_var1);

So we could add another case saying, roughly, if Result is stack_pointer and (!void_flag) then do this instead:

assemblez_2_to(get_prop_addr_zc, AO,
    ET[ET[below].right].value, stack_pointer);

This is conceptually simple. The only annoying thing is that this pattern appears in about twenty places in the compiler code (representing various operations). So the full fix is fairly messy, and it’s probably worth refactoring a little to keep things tidier.
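One way the refactoring might look is a single helper that every affected operation calls, so the "store straight to the stack" decision lives in one place. The following is only a toy model, not compiler code: the `operand` type, `emit_2_to`, `emit_store`, `emit_2_with_result` and the `emitted` log are all hypothetical stand-ins that mimic the shape of `assemblez_2_to` and `write_result_z` above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the compiler's operand type and assembler. */
typedef enum { OPERAND_STACK, OPERAND_TEMP1, OPERAND_VAR } operand_kind;
typedef struct { operand_kind kind; int value; } operand;

char emitted[8][64];      /* log of "assembled" instructions */
int emitted_count = 0;

void format_operand(operand o, char *buf, size_t n) {
    switch (o.kind) {
    case OPERAND_STACK: snprintf(buf, n, "sp"); break;
    case OPERAND_TEMP1: snprintf(buf, n, "temp_var1"); break;
    default:            snprintf(buf, n, "var%d", o.value); break;
    }
}

/* Toy version of a two-operand instruction with a store target. */
void emit_2_to(const char *opcode, operand a, operand b, operand dest) {
    char sa[16], sb[16], sd[16];
    if (emitted_count >= 8) return;          /* log is full */
    format_operand(a, sa, sizeof sa);
    format_operand(b, sb, sizeof sb);
    format_operand(dest, sd, sizeof sd);
    snprintf(emitted[emitted_count++], sizeof emitted[0],
             "%s %s %s -> %s", opcode, sa, sb, sd);
}

/* Toy version of copying a temporary into the real result. */
void emit_store(operand dest, operand src) {
    char sd[16], ss[16];
    if (emitted_count >= 8) return;
    format_operand(dest, sd, sizeof sd);
    format_operand(src, ss, sizeof ss);
    snprintf(emitted[emitted_count++], sizeof emitted[0],
             "store %s <- %s", sd, ss);
}

/* Shared helper: when the result is wanted on the stack, store there
   directly; otherwise go through temp_var1 as before. */
void emit_2_with_result(const char *opcode, operand a, operand b,
                        operand result, int void_flag) {
    operand temp = { OPERAND_TEMP1, 0 };
    if (!void_flag && result.kind == OPERAND_STACK) {
        emit_2_to(opcode, a, b, result);
        return;
    }
    emit_2_to(opcode, a, b, temp);
    if (!void_flag) emit_store(result, temp);
}
```

With a helper like this, each of the twenty or so call sites would shrink to one call, and the stack case would save one instruction wherever it applies.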

I have somewhat revived this thread (see Code generation optimizations · Issue #87 · DavidKinder/Inform6 · GitHub) and implemented optimizations for a couple of things:

  • @get_prop and @get_prop_addr now skip the intermediate store of the result in a temp variable when the result is destined for the stack.
  • x=x+1 is compiled as x++ (@add x 1 -> x becomes @inc x), and similar.

These changes save 460 bytes on Curses and 284 bytes on Advent. I have tested with a playthrough of Curses without issues, but more testing is obviously needed (anybody interested?).
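The second optimization in the list can be pictured as a small rewrite on a single instruction. This is a sketch on a toy instruction type, not the actual patch: `instr`, `arg` and `simplify_step` are hypothetical names, and treating `x=x-1` as one of the "similar" cases is an assumption.

```c
#include <stdbool.h>

/* Hypothetical, simplified instruction model. ARG_VAR is a named
   local or global variable (the stack is left out of this toy). */
typedef enum { OP_ADD, OP_SUB, OP_INC, OP_DEC } opcode;
typedef enum { ARG_VAR, ARG_CONST } arg_kind;
typedef struct { arg_kind kind; int value; } arg;
typedef struct { opcode op; arg a; arg b; arg dest; } instr;

bool same_var(arg x, arg y) {
    return x.kind == ARG_VAR && y.kind == ARG_VAR && x.value == y.value;
}

bool is_one(arg x) {
    return x.kind == ARG_CONST && x.value == 1;
}

/* Rewrites "@add x 1 -> x" as "@inc x" and (by assumption)
   "@sub x 1 -> x" as "@dec x". After a rewrite only in->a is
   meaningful. Returns true if the instruction was changed. */
bool simplify_step(instr *in) {
    if (in->op != OP_ADD && in->op != OP_SUB) return false;
    if (!same_var(in->a, in->dest)) return false;  /* must update in place */
    if (!is_one(in->b)) return false;              /* step must be exactly 1 */
    in->op = (in->op == OP_ADD) ? OP_INC : OP_DEC;
    return true;
}
```

The short form drops the constant operand and the store target, which is where the byte savings would come from.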

I’m also happy to hear more suggestions for improvement (other than the ones already suggested in this thread or on GitHub). I’m trying to cram multiple improvements together so as to minimize the testing needed (one testing phase for all of them, instead of testing each change individually).

It looks like Glulx and Glulxe have been ready for 32-bit “unencoded unicode strings” for a couple decades, since Glulx Spec 3.0.0 and Glulxe 0.4.0, both from 2006-08-13 (and I see tests for dealing with strings beginning 0xE2 in Git and Quixe, too).

The Inform 6 compiler’s default is compressed strings (beginning 0xE1) but with the -~H flag you can get unencoded strings.

It’d be nice if the compiler could create unencoded unicode strings. (I find myself also wishing the spec had left the values of the 3 bytes after the 0xE2 unspecified so it could encode the string’s length, with anything ≥ FFFFFE getting FFFFFF, not that exceeding 16M is likely to be an issue… but it does specify they should be 0).
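For reference, the layout described above (a 0xE2 type byte, three zero padding bytes, big-endian 32-bit characters, and a 32-bit zero terminator) is simple to produce. Here is a minimal sketch of an encoder; the function names `glulx_e2_size` and `glulx_e2_encode` are made up for illustration and are not part of the Inform 6 compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed: 4 (type + padding) + 4 per character + 4 (terminator). */
size_t glulx_e2_size(size_t nchars) {
    return 4 + 4 * nchars + 4;
}

/* Writes an unencoded Unicode string into out. The characters must be
   nonzero, since a zero word ends the string. Returns the number of
   bytes written, or 0 if the buffer is too small. */
size_t glulx_e2_encode(const uint32_t *chars, size_t nchars,
                       uint8_t *out, size_t cap) {
    size_t need = glulx_e2_size(nchars);
    size_t pos = 0;
    size_t i;
    if (cap < need) return 0;

    out[pos++] = 0xE2;                 /* string type byte */
    out[pos++] = 0;                    /* three padding bytes, */
    out[pos++] = 0;                    /* which the spec requires */
    out[pos++] = 0;                    /* to be zero */

    for (i = 0; i < nchars; i++) {     /* big-endian 32-bit characters */
        uint32_t c = chars[i];
        out[pos++] = (uint8_t)(c >> 24);
        out[pos++] = (uint8_t)(c >> 16);
        out[pos++] = (uint8_t)(c >> 8);
        out[pos++] = (uint8_t)c;
    }

    out[pos++] = 0;                    /* 32-bit zero terminator */
    out[pos++] = 0;
    out[pos++] = 0;
    out[pos++] = 0;
    return pos;
}
```

A two-character text takes 16 bytes in this form, and every character sits at a fixed offset (4 + 4·i), which is what makes indexed access cheap.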

It wouldn’t have any advantages until other things were written to make use of it, of course… but those things aren’t likely to get written without it. So I thought I’d toss it out there.

If there were a use for it, how would you want to use it? Surely not changing the encoding of every string in the game.

Got it in one. I’m hoping that someday Inform 7 could offer a compilation option to create CONSTANT_UNPACKED_TEXT_STORAGE (on 32-bit targets) as part of getting toward a world where iterating through the characters of a text wouldn’t bring a modern computer to its knees. (There are other interventions that wouldn’t require any change to Inform 6 that could play a bigger role but this would still help.)

String storage space in the game file is liable to be around six times greater. I find this easy enough to countenance in something that would be an option.

But there are (relatively) so few strings that you need to transmute. I’m sure this can be better targeted.

I’m surprised this is such a problem - it suggests some inefficiency on the Inform 7 side. Decompressing a string into a buffer and then iterating through it shouldn’t be that taxing.

AFAIK, Inform 7 does not have a built-in “iterate through the characters of this string” feature, so when you emulate it by writing:

let foo be "text";
let count be the number of characters in foo;
repeat with n running from 1 to count:
    let c be the character number n in foo;
    [...]

every time you ask for character number n it prints the string to an array, then immediately deletes it again.
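To make the cost concrete, here is a toy model in C of the two access patterns: unpacking the whole text on every character lookup versus unpacking once and walking the buffer. It is not Inform 7's runtime code. `unpack_text` is a hypothetical stand-in for "print the string to an array", the 256-byte buffer is an arbitrary limit, and the length is taken with `strlen` for simplicity.

```c
#include <stddef.h>
#include <string.h>

size_t chars_unpacked = 0;   /* total characters written by unpack_text */

/* Stand-in for printing a text to an array: copies it and counts work. */
size_t unpack_text(const char *text, char *buf, size_t cap) {
    size_t n = strlen(text);
    if (n >= cap) n = cap - 1;
    memcpy(buf, text, n);
    buf[n] = '\0';
    chars_unpacked += n;
    return n;
}

int is_vowel(char c) {
    return c != '\0' && strchr("aeiou", c) != NULL;
}

/* Pattern from the thread: each lookup unpacks the whole text again. */
char char_at_slow(const char *text, size_t index) {
    char buf[256];
    size_t n = unpack_text(text, buf, sizeof buf);
    return index < n ? buf[index] : '\0';
}

size_t count_vowels_slow(const char *text) {
    size_t n = strlen(text), count = 0, i;
    for (i = 0; i < n; i++)
        if (is_vowel(char_at_slow(text, i))) count++;
    return count;
}

/* Alternative: unpack once, then index the buffer directly. */
size_t count_vowels_fast(const char *text) {
    char buf[256];
    size_t n = unpack_text(text, buf, sizeof buf), count = 0, i;
    for (i = 0; i < n; i++)
        if (is_vowel(buf[i])) count++;
    return count;
}
```

For a text of n characters the first pattern writes n·n characters in total and the second writes n, so the gap grows quickly with text length.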

Can’t it be converted once with “the substituted form of” and then accessed character by character efficiently?

Or some kind of “with unpacked …” block structure phrase.

I don’t think I’ve seen people talking about the speed of Inform’s text for years. That could be because our devices have gradually crept up, or it could be that no one uses Aaron Reed’s Smarter Parser anymore.

Either way, I do agree that Inform should have faster text functions! But this isn’t the way to do it. Printing a compressed text to a buffer only takes about three Glulx instructions at a minimum. What will improve things is a better flex system, and inlining the array accesses.
