Disambiguation on Demand

This is a proposal for an extension. I was thinking this work was beyond my abilities, but I realized how many bits of it I can steal from other places.

Aaron Reed’s “Remembering” has a technique for switching off disambiguation completely.

Example 356: “Walls and Noses” shows how to access the list of disambiguation choices.

The main thing that is missing is a way to trigger a disambiguation event. I’m hoping this is possible with access to I6 - if anyone has any tips, I’d like to hear about them.

So, this is what I think the extension should do:

Provide a hook for translating a snippet into an object, performing disambiguation if necessary. Something like this:

[code]Keywording is an action applying to one visible thing.

After reading a command when the player's command matches "[any known thing]":
	[hypothetical phrase: the hook this extension would need to provide]
	let item be the matched text, as an object;
	try keywording item.[/code]

Provide a means for disabling disambiguation:

Refusal to disambiguate when remembering: say "You don't know anything about that."

Possibly add some conveniences for use with the “asking which do you mean” activity such as the “to decide whether N fits the parse list” function in “Walls and Noses.”

What do you think? Is it possible? Is it worthwhile? Would you use it?

[size=200]Yes.[/size]

If the prose asks the player a question, it’d be nice to be able to piggyback on the disambig code to understand an answer. I.e., use some code that can understand a given property as referring to a thing, like “A thing can be relevant. Understand the relevant property as referring to a thing.” (Typing from memory.) The author can set relevant/irrelevant on any objects in the game he likes, then prime the parser with " RELEVANT", like “Rule for reading a command when the primed command is not empty: change the text of the player’s command to the primed command; now the primed command is empty.” The WDYM (“which do you mean”) question then appears on its own, ready to prefer any relevant object; failing that, the parser will still understand any other command that is typed in.
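
For concreteness, here is a minimal sketch of that priming idea. It is untested and assumes the 6G60-era “indexed text” kind (on later builds, “text” would presumably take its place). The names “relevant” and “primed command” are just the ones used above, not anything built in. I’ve written “describing” rather than “referring to” so that the bare word RELEVANT can stand for an object, and used a character count for the emptiness test since I’m not sure “is empty” compiles everywhere.

[code]A thing can be relevant or irrelevant. A thing is usually irrelevant.
Understand the relevant property as describing a thing.

[Hypothetical variable: holds a command to feed the parser in place of typed input.]
The primed command is an indexed text that varies.

[A "for" rule stands in for the usual keyboard read while a command is primed.]
Rule for reading a command when the number of characters in the primed command is greater than 0:
	change the text of the player's command to the primed command;
	now the primed command is "".[/code]

The story would then mark a few objects relevant and set the primed command to something like "examine relevant". If more than one relevant object is in scope, the parser should ask its usual “Which do you mean” question and read the player’s answer normally.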

Still, you’re looking to dig into the I6 template code. You know where Appendix A and Appendix B are on the Inform site? The latter especially will give you a guided tour of the parser code. Well, to a point. The DM4 will do the rest.

I would also be interested in this. (I have a vague project in mind which involves basically turning everything into a keyword, which it seems like this might be helpful for.)

If you don’t, they’re here; Appendix A is “The standard rules web extract” and Appendix B is “the I6T web extract.”

(This is a mild peeve I have about the I7 website, that these things are impossible to find without a walkthrough, and even then the name that appears in the room description isn’t the one you’re looking for. Apparently “web extract” doesn’t mean “extract we put on the web” but has something to do with a “web” as a “literate program,” explained here; but is putting this stuff under a “Webs” tab on the I7 website really the best way to help people find it? It seems like one of those things, like where the mess hall is or what a Code Red means, that you pick up from other folks.)

Wow, I see more support than I expected, but also more challenges! I will have a look into this and see if it is something I think I can achieve.

I’m just browsing through Disambiguation Control. It looks like Jon Ingold completely reimplemented the NounDomain function. Is he available for comment?

Tell me more about priming the parser. Can it be done without indexed text?

I’ve been looking at the I6 parser code and it looks like there are three buffers, where buffer and buffer2 are used during disambiguation and buffer3 is used during “oops” and “again.” Would it be possible to write I7 hooks for manipulating them? Do they behave like indexed text? Can a text be written to a buffer, or does it need to be converted to indexed text first?

Perhaps I’m being obsessive about avoiding indexed text, but I like to keep static memory static.

I don’t know. I don’t think I7 has a counterpart to the plain I6 array, esp. the byte array. You could look at Appendix A for the phrase “To change the text of the player’s command to” and see what it does, as it converts the indexed text to the input buffer.

Mmm… good luck. I think you’re in for a world of hurt with that parser. I feel it should be rewritten in I7 just for transparency and adding hooks, but it obviously would be a much bigger piece of code as a result. (My argument isn’t with its architecture so much as a few of the additions bolted on. It being GPR based is actually pretty slick.)

And I thought Aaron Reed had done something with disambig mode, like fix a bug or something in it which involved replacing a big swathe of template code… but I can’t remember. Maybe it’s in one of his extensions?

Hey there -

It’s been a while since I wrote DC so I’m not entirely sure I remember how it works, which isn’t a great starting point. It’s also - as far as I can see - pretty underused, so it may well be buggy. But looking through, I think I reimplemented NounDomain for a few reasons:

  • to spot and flag when the parser was guessing nouns without any typed information, to prevent disambiguation rules from getting too excited

  • to check the word after a noun matches the right preposition token (to avoid the parser passing a line when it’s about to fail; this is really a bug in the core Inform parser, as it leads to “You can’t see any such thing.” errors for input like PUT BOTTLE AWAY)

  • to allow the game to say “You can’t use multiple items here, which one do you mean exactly?”, and receive disambiguation input, and use it

Anyway, none of that should be too important for what you want. At the top of either version of NounDomain, there’s a call to the function SearchScope, and I think that’s what actually does the parsing and fills up the match_list array with possible objects. The number_matched variable is set to how many matches get made. If this is 0, the token failed. If 1, the token was unique. Anything higher, and the game asks to disambiguate, which is done by horrible hacky code in NounDomain that prints a question, reads in text, and then copies it into the buffer before calling a reparse.

I think you want to take the pieces of NounDomain you need and write a short version of that routine that hooks in. You’ll probably need to use the buffer for the input text, and then do the library parse thing, that writes out the parse table. You’ll need to set wn to where you want SearchScope to start working.

But it should be possible!

cheers
jon

Thanks, Jon, that’s really helpful!

As for rewriting the parser, I agree with you, Ron! After studying how commas are handled, it does seem horribly over-complicated. Creating an “orders” equivalent in I7 would be complicated, but it seems like it could be done a lot better than it is now. And the ability to have commas in I7 grammar lines seems worthwhile.

What’s GPR?

inform-fiction.org/I7Downloa … ndix-B.pdf

General Parsing Routine. Inform 7 knows them as Understand tokens. Inform 6 can stick functions into variables and properties, pass them to other functions, etc., and the parser makes good use of that. So it isn’t just the case that the parser is this bit of code sitting in a library, built to serve the author’s code above it; the parser is also a framework which treats bits of the author’s code as the parser’s own little library. Whenever we write Understand “foobar [something preferably held]”, there’s a ‘preferably held’ function/rule/to-phrase/whatchamacallit that the parser calls to ask, “do you recognize this stuff starting at word number X?” The function returns the object that matches the words if there is one; or sets its corresponding global variable to the number or value if it recognizes one; or returns a KOV like GPR_PREPOSITION (matched, but with no value to hand back) or GPR_FAIL (didn’t recognize the words) or whatever. Such a function can be written by the I7 author (see 25.22 “Inform 6 Understand tokens” in WwI).

I don’t think the I7 manual tells us about wn – the “word number” variable so important to GPRs – or about the KOVs, or all that other stuff. But it’s possible to use that with I6-I7 glue code to get the parser to do all sorts of interesting things.

In that respect, the parser is quite extensible. Just make new tokens, each of which is an arbitrary function.
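
To make that concrete, a toy token might look like the sketch below. It’s untested, and everything named squiggle, SQUIGGLE_TOKEN or “drawing squiggles” is made up for illustration. The routine only checks for one fixed word, which plain Understand text could do anyway, but it shows the shape: look at the word at wn, then return either GPR_FAIL or a success code.

[code]Drawing squiggles is an action applying to nothing.
Report drawing squiggles: say "You draw a squiggle."

[Tie the I7 token name to an I6 routine, as in the "Inform 6 Understand tokens" section.]
The Understand token squiggle translates into I6 as "SQUIGGLE_TOKEN".
Understand "draw [squiggle]" as drawing squiggles.

Include (-
[ SQUIGGLE_TOKEN w;
	w = NextWordStopped();  ! read the word at wn, then advance wn
	if (w == 'squiggle') return GPR_PREPOSITION;  ! matched, no value to pass back
	return GPR_FAIL;  ! not recognized
];
-).[/code]

A token that produces an object would return that object instead, and one that produces a number would set parsed_number and return GPR_NUMBER, going by the DM4 conventions.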

Granted, I don’t think any of this helps you with the special disambig-retry loop buried in the parser.