Next steps for Inform 6 compiler

zarf · May 23, 2021, 11:21pm

Now that 6.35 is out, I am thinking about what should go into the next release.

The big thing I’d love – about 15 years overdue – is to revamp the memory system from “allocate everything at the start” to “reallocate as you go.” This would (eventually) completely get rid of the dreaded

The memory setting FOO (which is N at present) has been exceeded. Try running Inform again…

I’d tackle this incrementally – one memory setting at a time. There are about 30 such settings, plus a few more hard limits which can only be updated by rebuilding the compiler.

Today I experimentally knocked out the MAX_CLASSES setting. (This patch is in a work branch, not yet merged.) That took a couple of hours of rearranging code. It’s not a trivial cut-and-paste task, but I could probably run through the most important settings over the course of a few months, dipping into one whenever I have a free evening.
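A minimal sketch of the "reallocate as you go" idea, for readers who want to picture it. This is not the compiler's actual code: the `growlist` type, its function names, and the starting capacity of 16 are all assumptions made for illustration. Instead of sizing a table once from a memory setting, the table doubles whenever an entry would not fit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable table; none of these names come from the I6 source. */
typedef struct {
    void  *data;      /* heap block, NULL until first use */
    size_t item_size; /* bytes per entry, must be nonzero */
    size_t capacity;  /* number of entries currently allocated */
} growlist;

void growlist_init(growlist *gl, size_t item_size)
{
    assert(item_size > 0);
    gl->data = NULL;
    gl->item_size = item_size;
    gl->capacity = 0;
}

/* Ensure room for at least `needed` entries, doubling the capacity as required.
   Returns 0 on success, -1 if the request overflows or realloc fails.
   On failure the existing block is left untouched. */
int growlist_ensure(growlist *gl, size_t needed)
{
    size_t newcap;
    void *newdata;

    if (needed <= gl->capacity)
        return 0;

    newcap = gl->capacity ? gl->capacity : 16; /* assumed starting size */
    while (newcap < needed) {
        if (newcap > SIZE_MAX / 2)
            return -1;
        newcap *= 2;
    }
    if (newcap > SIZE_MAX / gl->item_size)
        return -1;

    newdata = realloc(gl->data, newcap * gl->item_size);
    if (newdata == NULL)
        return -1;

    /* Zero the newly added tail so fresh entries start in a known state. */
    memset((char *)newdata + gl->capacity * gl->item_size, 0,
           (newcap - gl->capacity) * gl->item_size);

    gl->data = newdata;
    gl->capacity = newcap;
    return 0;
}

void growlist_free(growlist *gl)
{
    free(gl->data);
    gl->data = NULL;
    gl->capacity = 0;
}
```

A caller would run `growlist_ensure(&list, count + 1)` before storing entry number `count`, and must re-read `list.data` afterwards because `realloc` may move the block. That pointer invalidation is presumably part of why converting each setting means rearranging code rather than cut-and-paste.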

(Question for the I6 users here: which are the most important memory settings? Which ones do you run into first when extending your game?)

zarf · May 23, 2021, 11:22pm

Other ideas:

I have an itch to improve the -a assembly output. (I started implementing this too.) This doesn’t affect code generation at all; it’s a matter of making the assembly trace output more readable.

Currently if you compile a line like

print (name) obj.owner;

…the -a output looks like this (also using -G -~S):

11  +00017 <*> callfii      long_10 long_5 byte_4 stackptr 
11  +00024     callfi       long_6 stackptr zero_

This is really not very enlightening. In fact it’s actively misleading, because the generated game file doesn’t contain the constants 10, 5, or 6 for this routine! Those are temporary values that get backpatched later in compilation.

I intend to update this to read:

11  +00017 <*> callfii      long_10 (veneer routine: RV__Pr) long_5 (internal object: obj) byte_4 (owner) stackptr 
11  +00024     callfi       long_6 (veneer routine: PrintShortName) stackptr zero_

The numeric values are still temporary, but at least you know what they refer to. It’s much easier to compare this assembly to the original source.

This assembly output happens inline as code is parsed. So there are limits on how much info we can generate at this stage. We don’t know final addresses yet. Forward-declared symbols have no type information. But we can at least say what symbol name every assembly operand represents.
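As a rough illustration of the annotation described above, the sketch below formats one operand with its symbol name attached. The `trace_operand` struct, the `ref_kind` enum, and `format_operand` are hypothetical names invented for this example and do not reflect how the compiler stores operands. Only the output wording is taken from the proposed trace lines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of what a backpatched operand refers to. */
typedef enum { REF_NONE, REF_VENEER, REF_OBJECT, REF_SYMBOL } ref_kind;

/* Hypothetical operand record: the raw trace text plus an optional name. */
typedef struct {
    const char *text; /* e.g. "long_10" or "stackptr" */
    ref_kind    kind; /* what the temporary value stands for */
    const char *name; /* symbol name, or NULL if none is known */
} trace_operand;

/* Write the operand into buf, appending the symbol it represents when known.
   Returns snprintf's result: the length the full text needs. */
int format_operand(char *buf, size_t buflen, const trace_operand *op)
{
    assert(buf != NULL && op != NULL && op->text != NULL);

    /* No usable name: print the operand exactly as the old trace did. */
    if (op->kind == REF_NONE || op->name == NULL || strlen(op->name) == 0)
        return snprintf(buf, buflen, "%s", op->text);

    switch (op->kind) {
    case REF_VENEER:
        return snprintf(buf, buflen, "%s (veneer routine: %s)",
                        op->text, op->name);
    case REF_OBJECT:
        return snprintf(buf, buflen, "%s (internal object: %s)",
                        op->text, op->name);
    default:
        return snprintf(buf, buflen, "%s (%s)", op->text, op->name);
    }
}
```

Falling back to the bare operand when no name is known fits the limits mentioned above: a forward-declared symbol could still be printed by name even though its type is not yet available.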

(Question for the crowd: This isn’t going to break anybody’s project, is it? I don’t think that anybody is trying to parse inform -a output. But if you are, let me know.)

zarf · May 23, 2021, 11:33pm

As to other stuff… I don’t have a lot on my list.

Issues · DavidKinder/Inform6 · GitHub has a suggestion about smarter type warnings. That seems like a good idea.

When poking around the repository, I found Inform6/fork-change-notes at fork-i64 · erkyrath/Inform6 · GitHub. This is an old changelist. It’s labelled “6.40”, but it’s actually an updated 6.31 tree that got abandoned back in 2006-ish.

I don’t want to take this file as a TODO list. (Some items have been independently reinvented already.) But if you see any ideas that you want to make an argument for, go for it.

Dannii · May 23, 2021, 11:40pm

Maybe some more optimisations?

I just noticed yesterday that it outputs dead code in a situation like this. It converts the branch to an unconditional jump, but still outputs the code inside.

if (0) {
    ...
}

(Actually technically the branches were more complex, possibly it’s smart enough already to know not to output a branch with a simple false constant like that?)

zarf · May 23, 2021, 11:42pm

That’s actual work! I’m not sure I have the brain cells to tackle it. But feel free…

zarf · May 23, 2021, 11:48pm

“possibly it’s smart enough already to know not to output a branch with a simple false constant like that?”

It’s not even that smart. But I suspect that authors would avoid writing such simple constants anyhow. (If you want code to be conditional on a compile-time constant, you’d use #ifdef rather than if.)

What more complex conditions are you running into, in real-life cases?

Dannii · May 23, 2021, 11:52pm

Inform 7’s filter functions:

[ Noun_Filter_63 
    v ! value parsed
    n ! saved value of noun
    ;
    v = Kind_GPR_79();
    if (v == GPR_NUMBER) {
        n = noun; noun = parsed_number;
        if (~~(((true) && (true)))) v = GPR_FAIL;
        noun = n;
    }
    return v;
];

It’s not a big issue, I had to adjust my secret project to account for orphaned blocks, but it wasn’t hard at all.

zarf · May 23, 2021, 11:55pm

Ooh, yeah, that’ll certainly make for a lot of dead code. I had forgotten about those.

The good news is that the compiler is smart enough to condense the expression down to “if false”. So fixing the simple case will get quite a lot of mileage on I7 code.
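To show how a condition like `~~((true) && (true))` can collapse to a constant, here is a small constant-folding sketch. It is an assumption-laden illustration, not the compiler's implementation (later posts in this thread note that I6 has no intermediate representation). The `expr` tree and the `fold` function are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression tree, invented for this sketch. */
typedef enum { EX_CONST, EX_VAR, EX_NOT, EX_AND } expr_kind;

typedef struct expr {
    expr_kind kind;
    int value;                /* EX_CONST only: 0 is false, nonzero is true */
    const struct expr *left;  /* operand of EX_NOT, left side of EX_AND */
    const struct expr *right; /* right side of EX_AND */
} expr;

typedef enum { FOLD_FALSE, FOLD_TRUE, FOLD_UNKNOWN } fold_result;

/* Reduce an expression to a compile-time truth value where possible. */
fold_result fold(const expr *e)
{
    fold_result l;

    assert(e != NULL);
    switch (e->kind) {
    case EX_CONST:
        return e->value ? FOLD_TRUE : FOLD_FALSE;
    case EX_NOT:
        l = fold(e->left);
        if (l == FOLD_UNKNOWN)
            return FOLD_UNKNOWN;
        return (l == FOLD_TRUE) ? FOLD_FALSE : FOLD_TRUE;
    case EX_AND:
        /* Short-circuit semantics: a false left side decides the result,
           a true left side defers to the right side. An unknown left side
           is left alone, since it may have side effects at run time. */
        l = fold(e->left);
        if (l == FOLD_FALSE)
            return FOLD_FALSE;
        if (l == FOLD_TRUE)
            return fold(e->right);
        return FOLD_UNKNOWN;
    default:
        return FOLD_UNKNOWN; /* variables are only known at run time */
    }
}
```

Once a condition folds to `FOLD_FALSE`, the remaining question is whether the guarded block can be skipped entirely rather than emitted behind an unconditional jump.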

Dannii · May 24, 2021, 12:00am

One difficulty is with dynamic branches/jumps and even inter-function jumps: the compiler can’t simply remove such blocks without potentially breaking code.

But dynamic branches and inter-function jumps must be super uncommon, does anything other than Glulxercise actually use them? We could say that files which use such features need to turn on a setting, and if the setting isn’t used then optimisations like dead code removal could be made.

If you like this idea I could look at adapting my algorithm for I6. (One difficulty would be flexible arrays, I’m not sure how you’d do them in C. Though if I’d need to do backpatching that would be a bigger difficulty!)

zarf · May 24, 2021, 12:13am

Dynamic branches (as in the computed_jump tests in Glulxercise) must be written in assembly at present. (Right?)

Inter-function jumps, same thing. That is, I think you could only do this if you laid out your target function with hand-crafted assembly.

“We could say that files which use such features need to turn on a setting, and if the setting isn’t used then optimisations like dead code removal could be made.”

That would work.

Another possibility is to define a convention: any jump label is assumed to be “live” even if the compiler doesn’t see a jump to it. So you could avoid dead code removal by writing:

if (false) {
  .Label;
    do_stuff();
}
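The label convention can be stated as a one-line rule. The sketch below expresses it in C under stated assumptions: `block_info`, `cond_state`, and `block_is_removable` are hypothetical names, and the real compiler may track this information quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what is known about an if-block's condition. */
typedef enum { COND_FALSE, COND_TRUE, COND_RUNTIME } cond_state;

/* Hypothetical per-block record, invented for this sketch. */
typedef struct {
    cond_state cond;    /* result of folding the condition, if any */
    int label_count;    /* number of jump labels declared inside the block */
} block_info;

/* A block may be dropped only when its condition is known to be false
   and it declares no labels. Any label is treated as "live", even if
   the compiler never sees a jump to it. */
bool block_is_removable(const block_info *blk)
{
    assert(blk != NULL);
    if (blk->cond != COND_FALSE)
        return false;
    return blk->label_count == 0;
}
```

Under this rule the `if (false)` block with `.Label;` above would be kept, while a label-free `if (0) { ... }` would be dropped.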

Dannii · May 24, 2021, 12:18am

Yeah that sounds reasonable.

I6 doesn’t have any sort of intermediate representation, does it? In that case I rescind my offer to help.

zarf · May 24, 2021, 12:22am

I promise there is no intermediate representation whatsoever. :)

ramstrong · May 24, 2021, 4:53am

Would something like this work?

https://cc65.github.io/doc/tgi.html

Or maybe define our own custom fixed width font from inside the source code to act as graphic characters?

Warrigal · May 24, 2021, 5:17am

I don’t think I’ve ever hit any Inform 6 limits, until recently when I had to increase MAX_FLOATING_OBJECTS from 32 to 40, although that may have been a PunyInform limit.

zarf · May 24, 2021, 5:34am

MAX_FLOATING_OBJECTS is a library constant, not a compiler setting.

Warrigal · May 24, 2021, 10:52am

Oh, yeah. That makes sense. I’d better shut my mouth before I put my foot in it again.

fredrik · May 24, 2021, 11:13am

Better optimized for-loops would be nice, using Z-code @inc_chk or @dec_chk where possible to save one instruction.

zarf · May 24, 2021, 2:49pm

I will write up a github issue about optimizations. Again, no promises about getting to them. :)

zarf · May 24, 2021, 3:47pm

I think you’re suggesting a new graphics API for games to use? That would require new interpreters. For this thread, I’m only looking at possible improvements to the I6 compiler and language.

ramstrong · May 24, 2021, 4:06pm

Ah, OK. I thought doing it would be no problem because of this:
http://adamcadre.ac/gull/gull-2l.html
Thank you for replying.