Next steps for Inform 6 compiler

Very good, thanks.

Advent.inf looks fine with 6.12.5.

I haven’t been worrying about I7 code (since the code generated by I7 produces thousands of warnings anyhow).

But it turns out that this change doesn’t add any new warnings for a (one-line) I7 game. So that’s nice. In fact it gets rid of one spurious warning.

Is it possible the compiler could auto-generate a routine when a name list is too long for z3? Just being able to write name lists and ‘forget about it’ would be a great relief.

The compiler doesn’t know how name routines are supposed to work. That’s a library feature.

The feature doesn’t necessarily work the same in different libraries, either. The original Inform library’s parse_name routines are called in several circumstances with various global variables set. I don’t know how much of that is replicated in PunyInform, etc. The compiler can’t account for any of it, certainly.

Anyone have opinions about this I6 bug report? “"or" operator violates DM4, DeMorgan's law” (Issue #112, DavidKinder/Inform6 on GitHub)

The title on that report doesn’t get at the real issue (the DM4 doesn’t specify the “right” behavior). Here’s the whole situation:

The manual describes the “or” clause of comparisons (§1.8):

if (alpha == 3 or 4) ...
! true if (alpha==3 || alpha==4)

if (x > 100 or y) ...   
! true if (x > 100 || x > y)
! that is, if x is bigger than the minimum of 100 and y

if (x < 100 or y) ...   
! true if (x < 100 || x < y)
! that is, if x is less than the maximum of 100 and y
! (this is not mentioned in the manual, but it's implied)

The first form (== or) is sometimes useful. The other forms (> or, < or) are kind of contrived; I’ve never seen them used.
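
As a rough illustration of the positive-sense forms described above, here is a small Python model (not the compiler's implementation). It reads `left op a or b` as "the comparison holds for at least one alternative". The helper name `cmp_or` is made up for this sketch.

```python
import operator

def cmp_or(op, left, *alternatives):
    """Model of `left op a or b or ...` for ==, > and <:
    true if the comparison holds for at least one alternative."""
    return any(op(left, alt) for alt in alternatives)

# Brute-force check of the min/max readings over a small range of integers.
for x in range(-5, 6):
    for y in range(-5, 6):
        # (x > 2 or y) behaves like x > min(2, y)
        assert cmp_or(operator.gt, x, 2, y) == (x > min(2, y))
        # (x < 2 or y) behaves like x < max(2, y)
        assert cmp_or(operator.lt, x, 2, y) == (x < max(2, y))

# (alpha == 3 or 4) with alpha = 4
print(cmp_or(operator.eq, 4, 3, 4))
```

The loop only confirms the "minimum" and "maximum" readings for integers in a small range. It says nothing about how the compiler generates code.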

The manual also describes the converse of the first form:

if (alpha ~= 5 or 7 or 9) ...
! true if alpha is not equal to any of these values
! equivalent to ~~(alpha == 5 or 7 or 9)

This is the way we’d say it in English (“alpha is not five or seven or nine…”). It’s not strictly correct for Boolean logic. If you wrote out the condition, DeMorgan’s law says you should use “and”, not “or”:

if (alpha ~= 5 && alpha ~= 7 && alpha ~= 9) ...

The I6 compiler doesn’t require you to distinguish “and” and “or” in this way, because that would be fussy. It’s clear the way it is.
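
To make the De Morgan point concrete, here is a sketch in Python (a model of the behaviour described above, not compiler code; `ne_or` is a hypothetical helper name). It checks that "not equal to any of these" matches the written-out “and” form, and that a literal “or” of inequalities would be true for every value.

```python
def ne_or(left, *alternatives):
    """Model of `left ~= a or b or ...`, i.e. ~~(left == a or b or ...)."""
    return not any(left == alt for alt in alternatives)

for alpha in range(12):
    # Matches the written-out form with "and" (De Morgan's law).
    assert ne_or(alpha, 5, 7, 9) == (alpha != 5 and alpha != 7 and alpha != 9)
    # A literal "or" of the inequalities is true for every alpha,
    # because alpha cannot equal 5, 7 and 9 at the same time.
    assert (alpha != 5 or alpha != 7 or alpha != 9)

print(ne_or(7, 5, 7, 9), ne_or(8, 5, 7, 9))
```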

The problem comes when you use >= and <= with or:

if (x >= 100 or y) ...   
! equivalent to ~~(x < 100 or y)

if (x <= 100 or y) ...   
! equivalent to ~~(x > 100 or y)

In one sense this is perfectly consistent with the way ~= or is defined – it’s the converse of an existing operator. (notin or is also defined as the converse of in or.)

In another sense it’s very surprising, because (x > 100 or y) checks the minimum of 100 and y, but (x >= 100 or y) checks the maximum.

That is, you might expect these lines to be equivalent:

if (x >= 100 or y) ...   
if ((x > 100 or y) || (x == 100 or y)) ...   

But in fact they are not.
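
A quick way to see where the two lines disagree is to model both readings and compare them. This Python sketch assumes the behaviour described above, with `(x >= a or b)` treated as `~~(x < a or b)`. The function names are invented for the example.

```python
def ge_or_current(x, a, b):
    # Behaviour described in the post: (x >= a or b) is ~~(x < a or b).
    return not (x < a or x < b)

def ge_or_expanded(x, a, b):
    # The expansion one might expect: (x > a or b) || (x == a or b).
    return (x > a or x > b) or (x == a or x == b)

y = 50
disagree = [x for x in range(0, 150)
            if ge_or_current(x, 100, y) != ge_or_expanded(x, 100, y)]
print(disagree[0], disagree[-1])   # 50 99
```

With y = 50 the two readings differ exactly for values from the smaller bound up to, but not including, the larger one: the first reading tests against the maximum of 100 and y, the second against the minimum.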

I agree this is an unhappy outcome. However, changing it would be unhappy in a different way. ((x >= 100 or y) would no longer be equivalent to ~~(x < 100 or y).) And while I’ve never seen these forms used in real life, there might be old game code that relies on them.

So is it worth making this change? I’m ambivalent. And I’m inclined to err on the side of “leave it alone” for ambivalent cases. But I wanted to pass the question around.

EDIT: I probably meant “inverse”, not “converse”. Sorry. Been a long time since the basic logic course.

EDIT: Bonus points if you first learned about De Morgan’s Law from the “Propositional Calculus” chapter of Gödel, Escher, Bach.

Very reasoned. I would trust your judgement.

Using or with <, >, <=, >= is so very rarely useful. I can see two possible cases here:

  • It’s used in some old source code files, and they will break. Better not change it.

  • No one ever uses it. Changing it is pointless.

In any case, don’t alter the basic behaviour of an operator in a language that’s been actively used for 20+ years, unless the current behaviour is so broken that you can be sure the operator has never been used for anything.

I don’t currently use Inform, but I would be wary of changing anything that has been used for so long in its current state.

I also can’t help wondering how such a change might affect Inform 7.

Agreed – it would be a bad thing to change long-standing behavior in a way that would likely break any code that had used it… and good to prominently document “behavior here is weird and surprising but being left alone because legacy”.

Fair question. The I7 templates use “a == x or y” and “a ~= x or y” quite a bit, but they never use or with the </>/<=/>= operators.

I will play devil’s advocate.

First, I note that DM4 p. 19 (DM4 §1: Routines) designates or as a “special operator” which “gives alternate possibilities.” I also note that, per DM4 Table 1B (DM4 Tables), or claims highest precedence (above other logical operators).

Second, I will attempt to pin down the functional meaning of the or keyword. An initial definition, based on a naive reading of DM4:

"creates a logical OR applied to a set of comparisons, with the left sides identical and the right sides defined by the elements in the 'or'-ed list"

Third, I observe that the proposed definition can’t be the whole story. As you note, a condition like

(ball notin red_bucket or blue_bucket)

functionally means the same as

  • ((ball notin red_bucket) && (ball notin blue_bucket)) [using && between individual conditions], or
  • ~~((ball in red_bucket) || (ball in blue_bucket)) [per DeMorgan’s Law applied to preceding], or
  • ~~(ball in red_bucket or blue_bucket) [per attempted definition of or applied to preceding?]

(I didn’t trace the assembly for these, but I did test all four of those expressions in code.)

The equivalence of all four expressions above was intuitive to me, but my intuition is wrong; the original expression doesn’t actually fit with the other three equivalents. The translations

  • ((ball notin red_bucket) || (ball notin blue_bucket)), or
  • ((~~(ball in red_bucket)) || (~~(ball in blue_bucket)))

both of which would be consistent with the original expression per the definition of the or keyword listed above, would NOT work as intuitively expected. For example, they would evaluate to true when the ball is in the red_bucket (because it’s not in the blue_bucket). Both of these will always evaluate to true because both conditions can’t be simultaneously unsatisfied by the (well-formed) object tree.
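
The bucket example can be tabulated with a small Python model. The strings below are hypothetical stand-ins for the objects, and the ball has exactly one parent, as in a well-formed object tree. This is only a truth table for the readings discussed above, not a trace of what the compiler emits.

```python
# Hypothetical stand-ins for the objects in the example.
RED, BLUE, FLOOR = "red_bucket", "blue_bucket", "floor"

def readings(parent_of_ball):
    """Evaluate the competing readings of (ball notin red_bucket or blue_bucket)."""
    in_red = parent_of_ball == RED
    in_blue = parent_of_ball == BLUE
    return {
        "and_of_notin": (not in_red) and (not in_blue),
        "not_of_or": not (in_red or in_blue),
        "naive_or_of_notin": (not in_red) or (not in_blue),
    }

for place in (RED, BLUE, FLOOR):
    r = readings(place)
    assert r["and_of_notin"] == r["not_of_or"]   # De Morgan's law
    assert r["naive_or_of_notin"] is True        # never false for a single parent
    print(place, r["not_of_or"], r["naive_or_of_notin"])
```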

This implies that my first attempt to define the or keyword is incorrect… but that definition also seems to be applicable to the last of the set of three equivalents. So what is the formal definition of or supposed to be? Does it work differently when faced with a negative-case operator like notin or ~=? For example, does the compiler translate (x notin y or z) to ~~(x in y or z) as a first step? (This seems like it might be the case, but if so wouldn’t that constitute a violation of declared operator precedence?)

Based on the notin example and the ~= example on DM4 p.19-20, the whole point of the or keyword seems to be to allow for intuitive expressions which are not necessarily DeMorgan’s Law compatible, i.e. or does not always do the same thing as || – on purpose. As you say, it’s “clear the way it is” in the above example.

Fourth, I note that cases of the >= and <= operators used with or seem different from the above example in ways that may be important:

  • They are arithmetic tests instead of graph tests.
  • It is possible for both conditions to be satisfied simultaneously.
  • They assume an ordering of elements in the comparison set (i.e. integers, as opposed to objects).
  • The ordering of elements is tested in reverse if the operator is less-than-based instead of greater-than-based.

To emphasize that last difference, per your post the current compiler assumes that x >= y is always logically identical to ~~(x < y). This holds true for single propositions. However, it doesn’t necessarily hold true when there are multiple propositions, as you point out. The same pattern of “logically negate the comparison operator, then negate the or construction” may not be desirable for these operators.

If the definition of the or keyword listed above is used, then ~~(x >= y or 100) should be different from (x < y or 100), which you acknowledge. As a specific example, if x is 60 and y is 50, the former expression, naively translated as

~~((x >= y) || (x >= 100))

evaluates to false, while the latter expression, naively translated as

((x < y) || (x < 100))

evaluates to true.
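
The arithmetic for x = 60, y = 50 can be checked directly. This Python snippet simply evaluates the two naive translations given above; it makes no claim about what the compiler itself produces.

```python
x, y = 60, 50

former = not ((x >= y) or (x >= 100))   # naive reading of ~~(x >= y or 100)
latter = (x < y) or (x < 100)           # naive reading of (x < y or 100)

print(former, latter)   # False True
```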

I suspect that the potentially inconsistent behavior of or with respect to negative-case operators is by design, making or closer to its spoken English meaning rather than its mathematical meaning (which is covered by ||). (It does not seem coincidental that the second example of its use on p. 19 employs the negative-case operator ~=.) I suspect that the issue cropping up with arithmetic comparison operators is a simple oversight. It seems to me that it is possible to improve consistency of the I6 or keyword’s function: All positive-case comparisons can adhere to the naive definition of or listed above.

Yes, this might break somebody’s code. Would it be so bad to require authors to use (x < y or 100) if that’s what they want, instead of depending on the perhaps unexpected translation of (x >= y or 100) into < operator terms?

Backwards compatibility is great, but doesn’t the ITM have a quote about Professor Nelson thinking that he erred too much on the side of caution with respect to breaking it while developing I6? Can’t anyone who insists on compiling old code without the slightest update be given the minor burden of digging out an old version of the compiler from the IF Archive?

In any case, it seems worth an entry in the I6 Reference Addendum to provide a specific definition of the or operator’s function.

I’d say keep the existing behaviour as it is, but any use of or with >, < etc should cause a compiler warning. Possibly also notin and any other operators with confusing results.

I agree on both counts.

Yes, this might break somebody’s code. Would it be so bad to require authors to use (x < y or 100) if that’s what they want

The question is not whether it would be bad, but whether it would be worse than leaving the current behavior.

We’re not developing I6 now. We’re supporting I6, which has been stable for 25 years since he wrote that.

A warning is an excellent idea.

Thinking about it more, or really restating what’s been said above:

The current behavior is not confusing for positive-sense comparisons (x == 4 or 5; rock in sack or chest.) Nor is it confusing for negative-sense comparisons (x ~= 4 or 5; rock notin sack or chest.) There’s only one sensible way to read these expressions and that’s how the compiler understands them.

The behavior of x < y or z takes some thought. (For a start, I have to switch to variables, because writing x < 4 or 5 is pointless!) But this form is both documented in the manual and consistent with x == y or z in an obvious way. This extends directly to x > y or z.

Things only get weird because the compiler treats <= and >= as negative-sense comparisons, and that’s not how we normally think of them.

So my conclusion is:

  • Leave the current behavior;
  • Generate a warning for <= or and >= or only;
  • Document.
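
Until such a warning exists, authors could scan their own source for the two forms in question. The following Python sketch is a crude, hypothetical line-based check and is not how the compiler would do it (a real warning would belong in the expression parser). The pattern and the function name are assumptions made for illustration, and it does not understand strings, comments, or conditions split across lines.

```python
import re

# Crude pattern: a <= or >= operator followed, within the same comparison
# (no parentheses, |, & or ; in between), by the word "or".
SUSPECT = re.compile(r"(<=|>=)[^()|&;]*?\bor\b")

def suspicious_lines(source):
    """Return 1-based line numbers that appear to use `<= ... or` or `>= ... or`."""
    return [n for n, line in enumerate(source.splitlines(), start=1)
            if SUSPECT.search(line)]

sample = """\
if (x >= 100 or y) print "a";
if (x > 100 or y) print "b";
if (x == 3 or 4 && y <= 5) print "c";
if (x <= y or z) print "d";
"""
print(suspicious_lines(sample))   # [1, 4]
```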

Fair enough.

Though now I’m left trying to imagine what the warning would say…

WARNING (line 145): Unless you meant to type "~~(x < y or 100)" where you typed "(x >= y or 100)", you probably should have typed "((x >= y) || (x >= 100))".

I think zarf’s last point (document) is the thing that’s crucial. I know that I was recently confounded by an expression involving (x ~= a or b) producing unexpected results, so I broke it down into (x ~= a && x ~= b). I don’t remember exactly what the original expression was. It was probably doing the right thing, it just wasn’t what I expected from the minimal documentation in the DM4.

This discussion makes me wonder whether the original operator should have been “or” for the positive-sense comparisons (x == a or b) and “and” for the negative-sense comparisons (x ~= a and b). In that way, it could have given a compiler warning, but it’s too late to change that now.

I thought about that, but decided the only conceivable result would be that authors would pick and/or blindly and get it wrong half the time.

I think if there were an “and” keyword it would make sense to treat “and” and “or” as synonyms in == and ~= expressions, since only one sense is ever useful and they’re treated equivalently in that context in spoken English, but I would treat a < b or c differently from a < b and c, since both are useful and they aren’t treated the same in spoken English.

I’d keep the invariant that ~~ always negates, which means that ~~(a < b or c) is the same as a >= b and c. (Incidentally, since “and” and “or” would be treated identically for the equality operators, you would be able to flip “and” and “or” unconditionally when negating.)
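
The proposed invariant can be sanity-checked with a Python model. Note that the “and” keyword is hypothetical (I6 has no such keyword), and the function names below are invented for this sketch.

```python
def lt_or(a, b, c):
    """a < b or c: true if a is below at least one of b, c."""
    return a < b or a < c

def ge_and(a, b, c):
    """a >= b and c (hypothetical keyword): true if a is at least both b and c."""
    return a >= b and a >= c

def eq_list(a, b, c):
    """a == b or c (or, equivalently under the proposal, a == b and c)."""
    return a == b or a == c

def ne_list(a, b, c):
    """a ~= b or c (or a ~= b and c): a differs from both b and c."""
    return a != b and a != c

values = range(-3, 4)
for a in values:
    for b in values:
        for c in values:
            # ~~ always negates: ~~(a < b or c) matches a >= b and c.
            assert (not lt_or(a, b, c)) == ge_and(a, b, c)
            # For equality the keyword choice is irrelevant; negation flips == and ~=.
            assert (not eq_list(a, b, c)) == ne_list(a, b, c)

print("invariant holds on the sampled range")
```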

I’ve added this comment to issue #87:

Adding this …

Looking at output generated with -a, there are a lot of lines like this (not always get_prop or get_prop_addr, but other opcodes too):

 3800  +03265 <*> get_prop     thing short_50 (each_turn) -> TEMP1 
 3800  +03269     push         TEMP1 

and

 3479  +02ec9 <*> get_prop_addr o1 short_1 -> TEMP1 
 3479  +02ecd     store        p1 TEMP1 

Couldn’t this be simplified to (at least for Z-code):

 3800  +03265 <*> get_prop     thing short_50 (each_turn) -> sp 

and

 3479  +02ec9 <*> get_prop_addr o1 short_1 -> p1

respectively, saving 4 bytes per occurrence?

Please let me know if I’m wrong so I can remove the comment in that case…

Let me know if I’m wrong but I feel like the first example at least could not be simplified like that. get_prop, as a store opcode, takes a variable to store to. Storing and pushing are quite different things, so pushing the property value you’ve gotten is a separate step.