Question about Inform dict parsing

A couple of months ago, @fredrik.ramsberg posted an I6 bug report:

When compiling to v3, the compiler (v6.41) fails to give 'shelves//p' the plural flag.
When compiling to v3 or v5, the compiler fails to give 'superstructures//p' the plural flag.

(Also turned out to be true for 'superstructures//p' in Glulx.) Basically, the compiler would stop looking at the dict word after 6 or 9 characters, thus missing the //p suffix.

I agree that’s a pretty silly policy. It also leads to other inconsistencies. For example, if you write the dict word 'superstructüres' in Z-code, the compiler won’t notice the invalid character because it’s past the ninth character.

Okay, this is all easy to fix. I have a patch which tells I6 to scan the entire dict word. It limits the length of the generated dict entry, but not the source text scan.
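
To make the difference concrete, here is a rough Python sketch of the two scanning strategies. It is not the actual I6 compiler code: the function names and the `limit` parameter (standing in for the 6- or 9-character dictionary resolution) are made up for illustration, only the `//p` flag is modeled, and the exact behavior of the old scan at the boundary (a word of exactly `limit` characters) is an assumption.

```python
def scan_old(word, limit):
    """Assumed old behavior: stop scanning once `limit` characters are stored."""
    stored = ""
    i = 0
    while i < len(word) and len(stored) < limit:
        if word.startswith("//", i):
            # Reached the flag suffix before hitting the limit.
            return stored, "p" in word[i + 2:]
        stored += word[i]
        i += 1
    # Hit the limit (or the end of the word) without ever seeing "//".
    return stored, False


def scan_new(word, limit):
    """Patched behavior: scan the whole source word, then truncate the entry."""
    text, _, flags = word.partition("//")
    return text[:limit], "p" in flags
```

With these definitions, `scan_old("superstructures//p", 9)` gives `("superstru", False)` while `scan_new("superstructures//p", 9)` gives `("superstru", True)`. Only the source scan changes; the stored entry is the same length either way.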

However! It turns out this meaningfully affects the behavior of Inform 7 games.

As you know, I7 likes to add plural kind names to all objects. (There will be ways to tweak this in the next release, but it will still be the default behavior.) This means that every direction object has the synonym 'directions//p'. Which is, whoops, ten characters long, so it gets truncated. (Under both Z-code and Glulx, which default to nine-character dict words.)

With the current I6 back end, the 'direction' word is not plural. With my updated I6, it is plural.

This gives us this change in behavior:

[current behavior]
>get direction
Which do you mean, the north, the northeast, the northwest, the south, the southeast, the southwest, the east, the west, the up, the down, the inside or the outside?

[with patch]
>get direction
What do you want to get those things from?

Now, out of the box this isn’t very important. You can’t take direction objects. So this is just a matter of what confusing error you get from a command that experienced players will never type anyhow.

But it’s really hard to tell how this will affect games that define custom actions around directions. (Maybe a LOOK NORTH command.) Verb testing is always fragile in IF; any change like this could upset somebody’s carefully-tuned pile of grammar hacks.

It’s true that any major I7 release can upset game grammar. However, we’re trying to set a policy that you can always download the latest I7 release and then set your project to “9.3” (or whatever) to get the old behavior. But the included I6 is used for all projects, regardless of I7 settings. So this would be a bit of behavior that you can’t revert.

What do people think? Is it worth breaking backwards compatibility in this way? Too big a change? Should I maybe put the I6 bug fix behind a compiler setting (to make it opt-in)?

This is definitely above my pay grade and for wiser authors than me.

However, as I use I6 and Punyinform, I would suggest the I6 bug fix option behind a compiler setting similar to the options available in Punyinform.

It seems non-ideal to me to put a compiler bugfix behind a flag when the problem is the input it’s given.

I assume it would be defining both direction and directions//p? Do both end up in the dictionary? Does it merge them?

Can this bugfix be delayed in I7 until I7 fixes its side of it (maybe by no longer defining the plural)?

Can I6 show a warning when it would have two dictionary entries that differ only by plurality?

Would it make sense for a dictionary entry to have a flag which means that the word in the buffer should be checked character by character? Then dictionary entry lengths wouldn’t matter, and it would be simpler than requiring all authors to manually check the buffer themselves.

I’m not sure about the dictionary resolution in Glulx, but at least when compiling to Z-code you won’t have more than nine characters. In this case, “direction” and “directions” are the same dictionary word, as the first nine characters are the same. If one of them has the plural-flag, that dictionary word will get the plural-flag.

Two dictionary entries can’t differ only by plurality. When the player types “direction”, the tokenizer (part of the Z-machine interpreter) must find the dictionary word it corresponds to. That’s a single word, not a list of possible matches. And it can’t look at all characters, since the Z-machine simply can’t store more than the first nine characters in the dictionary. If you add the word “superstructures//p” to the dictionary, it’s stored as “superstru” with the plural flag. The rest of the word simply isn’t stored.

Right, I know about the truncation, I just didn’t know if it would compile two entries or not.

Does it always skip the singular entry, or does it skip the first entry? Someone could experiment by defining directions before direction…

Adjust my recommendation about a warning to be that it should warn if given two entries that differ only by plurality, rather than if it actually ends up compiling two, which it can’t.

My idea about the new flag would be that it would have to store the additional characters elsewhere, perhaps in a routine. It would just automate what you can already do manually now.

(It’s been a very long time since I’ve done any parsing in I6, so I’m just speaking in general terms. I don’t remember what the recommended ways of dealing with the entry length are, I just know that it is possible. It might be too big of a change to automate, but it would be worth considering, wouldn’t it?)

As far as I can tell, it shouldn’t skip any of them. I think you need to consider the plural-flag as saying “this word may be plural”. If you write ‘foo’ in one place and ‘foo//p’ in another place in your game, the first word will put the word in the dictionary, and the second word will add the plural flag. Or it should, unless there’s a bug.
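
A small Python sketch of that merging, for illustration only. The `add_dict_word` helper and the plain dict standing in for the game dictionary are hypothetical, the nine-character resolution is assumed, and only the plural flag is tracked. It models the whole-word scan, so the `//p` is seen even on long words.

```python
def add_dict_word(dictionary, word, limit=9):
    """Add a source word like 'foo' or 'foo//p' to a toy dictionary.

    The key is the truncated text; the value is the merged plural flag.
    """
    text, _, flags = word.partition("//")
    key = text[:limit]
    # Flags are OR-ed together: once any mention is plural, the entry is.
    dictionary[key] = dictionary.get(key, False) or ("p" in flags)
    return key


toy_dict = {}
for source_word in ["foo", "foo//p", "direction", "directions//p"]:
    add_dict_word(toy_dict, source_word)
```

After the loop, `toy_dict` holds just two entries, `"foo"` and `"direction"`, both flagged plural. 'direction' and 'directions//p' land on the same key because they share their first nine characters.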

Ah, so it merges rather than skips. Then I think it should definitely warn when it would be merging words with different pluralities that aren’t the same in the source text and only become the same when truncated. That’s different from the “foo” case where it’s unquestionably the same word.

It’s not ideal, but it may be better than causing problems with I7 projects in progress.
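
Roughly what I have in mind for the check, as a Python sketch. This is purely hypothetical: I6 has no such warning, `find_plural_collisions` is a made-up name, and the nine-character resolution is assumed.

```python
from collections import defaultdict


def find_plural_collisions(words, limit=9):
    """Return truncated keys where different source words with different
    plurality collapse into one dictionary entry."""
    groups = defaultdict(set)
    for word in words:
        text, _, flags = word.partition("//")
        groups[text[:limit]].add((text, "p" in flags))

    collisions = {}
    for key, entries in groups.items():
        texts = {text for text, _ in entries}
        plurals = {plural for _, plural in entries}
        # 'foo' plus 'foo//p' is one source word, so it is not reported.
        if len(texts) > 1 and len(plurals) > 1:
            collisions[key] = sorted(texts)
    return collisions
```

Given `["foo", "foo//p", "direction", "directions//p"]`, this reports only the `"direction"` key, listing `direction` and `directions` as the words that merged.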

Correct. That’s the way it’s always worked. Flags defined with the Dictionary directive are merged in as well.

That would be a new warning, invalidating something that used to be legal. I don’t think I want to go there.

It doesn’t address my question anyhow. The default I7 build doesn’t contain the dict word 'direction', only 'directions//p'. And I7 users don’t even see warnings.

I don’t think that’s a planned change. Anyhow, it still leaves the problem of putting the new I6 into the “universal” I7 IDE, which is supposed to still support 9.3, 10.1, etc without change.

If I put this change behind an I6 option, then older projects won’t be affected.

Oh, so it doesn’t try to put direction in the dictionary? That’s weird! But it does mean it would be simpler to fix on the I7 side as the plural flag could just be removed.

Are there other words that will be similarly affected though?

Oh, yeah, having the one I6 compiler for old I7 versions complicates things. Maybe a compiler option is the best of a lot of poor options.

It’s an interesting “feature” of Inform 7 that people (namely me) have been needing to hack around for years: Inform puts plural names of kinds into the dictionary, but not singular names. Neither the fact that this exposes the programmer’s internal names in a way that can’t be overridden, nor the fact that the standard library defines at least three basic types with nine-letter names (container, supporter, direction), was considered a good enough reason to change it.

(Show of hands: how many I7 developers here would expect the word “container” to automatically be recognized, and always plural, in every single I7 project?)

We’re finally going to be able to turn it off in an upcoming release, but the default won’t change. Small victories.

One, maybe inelegant, solution would be to let I6 behave as before and only issue a warning that a dictionary word gets truncated and its defined plural flag won’t be recognized.

The warning could also suggest a solution (the code needs to be rephrased, e.g. 'superstructures//p' becomes 'superstru//p').

Could be tricky when the word contains characters that need multiple Z-chars, though.

A compiler option could instruct the compiler to do this automatically and not issue the warning.

Another way is to change the behaviour but have a compiler option that instructs the compiler to use the old way.

A compiler flag sounds like the least bad option to me.
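
On the multi-Z-char caveat above, here is a rough Python sketch of why a suggested rewrite can't simply cut the word at nine characters. It assumes the default Z-machine alphabet table (version 2 and later), where lowercase letters cost one Z-char, the digits and punctuation in alphabet A2 cost two (shift plus character), and anything else costs four (a ZSCII escape). Custom alphabet tables are ignored, and the function names are made up.

```python
A0 = "abcdefghijklmnopqrstuvwxyz"
A2 = "0123456789.,!?_#'\"/\\-:()"  # default A2, minus the escape and newline slots


def zchar_cost(ch):
    """Z-chars needed for one character of a (lowercased) dictionary word."""
    ch = ch.lower()
    if ch in A0:
        return 1
    if ch in A2:
        return 2  # shift to A2, then the character
    return 4      # shift, escape, then a 10-bit ZSCII code in two Z-chars


def stored_prefix(text, resolution):
    """Characters of `text` that fit completely in `resolution` Z-chars
    (6 for v3, 9 for v4 and later)."""
    used = 0
    kept = ""
    for ch in text:
        cost = zchar_cost(ch)
        if used + cost > resolution:
            break
        used += cost
        kept += ch
    return kept
```

For example, `stored_prefix("don't-go", 9)` keeps only `don't-g`: the apostrophe and hyphen each use two Z-chars, so the nine-Z-char budget runs out after seven characters.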

It occurs to me that we could default to the correct behavior, but have a compiler flag for the older buggy behavior. This would work out acceptably if we could update the IDE’s I7 compilers (9.1 through 10.1) to apply the “please be buggy” compiler flag.

We’ve always operated under the assumption that old I7 compilers would never need to be updated. This would hopefully be a small change though.

I’m not certain, but I don’t think the retrospective compilers ni.c includes the code which runs I6. So adding a CLI flag shouldn’t involve any changes to the retrospective compilers.

I was thinking about adding a !% line to the generated I6 source. But it should work either way, yeah.