Why does the Dialog compiler dislike this?

Draconis · August 25, 2020, 6:47pm

I’ve been working on an extension to Dialog allowing custom structs (what Inform calls “composite kinds of value”, single values made up of multiple numbers). Most of it works perfectly. However, there’s a certain thing that seems to cause serious problems for the compiler.

If I have a parsing predicate like this:

(interface (understand $<Words as frequency $>Freq))
(understand $Words as frequency $Freq)
	%% get $First and $Second somehow
	($Freq = [#kovmarker @frequency $First $Second])

…then everything works fine. But if the predicate looks like this:

(interface (understand $<Words as frequency $>Freq))
(understand $Words as frequency $Freq)
	%% get $First and $Second somehow
	(struct $Freq has tag @frequency and contents [$First $Second])

(struct $Struct has tag $Tag and contents $Contents)
	($Struct = [#kovmarker $Tag | $Contents])

…then the debugger gives a truly frightening litany of hundreds and hundreds of warnings, along the lines of

Warning: stdlib.dg, line 1776: Argument 1 of now-expression can be unbound, leading to runtime errors.

What’s so different about these two implementations? I’d thought it would be nice and elegant to have a single predicate that can pack or unpack a struct, depending on which arguments are bound and which aren’t. It does work as expected in my test cases, but the warnings make me think I’m likely preventing some optimizations. (If nothing else, I don’t want to frighten people who use the extension, or cause their own warnings to be lost in the flood.)
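
To illustrate the two directions I have in mind, here’s a minimal sketch. The predicate name (demo packing and unpacking) is made up for this example, and 3 and 7 are placeholder values:

(demo packing and unpacking)
	%% Pack: the tag and contents are bound, $Packed is unbound on entry.
	(struct $Packed has tag @frequency and contents [3 7])
	%% $Packed now unifies with [#kovmarker @frequency 3 7].
	%% Unpack: the struct is bound, $Tag and $Contents are unbound on entry.
	(struct $Packed has tag $Tag and contents $Contents)
	%% $Tag is now @frequency and $Contents is [3 7].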

lft · August 25, 2020, 7:18pm

Ah, yes. The compiler tries to track which expressions can end up unbound at runtime, and occasionally there is a false positive like this.

If you know that an expression is going to end up bound no matter what, then the quick fix is to add a (fully bound $) query towards the end of the rule body. Then the compiler will know that after that query succeeds, the expression is indeed known to be bound, and this fact will propagate to the rest of the analysis and prevent the warnings.
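
As a rough sketch, applied to the rule from the question, and assuming $Freq is the expression the compiler can’t prove to be bound, that might look like this:

(understand $Words as frequency $Freq)
	%% get $First and $Second somehow
	(struct $Freq has tag @frequency and contents [$First $Second])
	%% Runtime check: fails unless $Freq contains no unbound variables.
	%% After it succeeds, the compiler treats $Freq as bound.
	(fully bound $Freq)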

The downside is that the expression will be checked at runtime, which has a (tiny) performance impact. But if the compiler gets smarter in a later version, then the query will be optimized away.

I’m going to limit the number of warnings that are printed when this happens, because the root cause is almost invariably covered by the first one.

Draconis · August 25, 2020, 7:23pm

Yep, adding a (fully bound $) fixed it!

Suggestion, then: since the interface here indicates that $Freq should always be fully bound after calling the predicate, perhaps warn only at the point where that interface can get violated, since that’s presumably where the error is?

lft · August 25, 2020, 7:24pm

That’s often what the first warning says (though not always).