Stupid (adv3) parser tricks: disambiguating between multiple noun-as-verb usages

Piergiorgio_d_errico · July 9, 2023, 10:33am

Thanks for the clarification, jbg.

I’m not sure I’ve fully understood the layering of issues, but AFAIK the simplest approach, once PEBBLE/STONE etc. has been identified as a verb, is to compare the verb actually typed against the synonym list (that is, a simple string comparison). But AFAICT the parser engine doesn’t preserve the text of the verb, which leads to this little thing:

>drop die
Dropped.

>take it
Taken.

>drop it
Dropped.

>pick up it
Taken.

>drop it
Dropped.

>get it
Taken.

I tried (and failed) to figure out where to read the actual verb used, so that I could change the stock answer to “picked” and “got” accordingly… (besides, I make heavy use of ad prose, so the lack of verb disambiguation in general is a bit annoying for me, hence my watching this thread…)

Best regards from Italy,
dott. Piergiorgio.

johnnywz00 · July 9, 2023, 2:13pm

@Piergiorgio_d_errico Are you looking for getEnteredVerbPhrase?
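
A minimal, untested sketch of how that suggestion might be used to vary the stock TAKE acknowledgment. It assumes that Action.getEnteredVerbPhrase() returns the typed verb phrase with the noun phrases replaced by role placeholders (something like 'pick up (dobj)'), and that overriding okayTakeMsg on playerActionMessages is the right hook in your game. The replacement strings are made up for illustration, and I have not checked how the method behaves for implied or nested actions.

```tads3
#include <adv3.h>
#include <en_us.h>

/*
 *   Sketch only: vary the default TAKE acknowledgment according to the
 *   verb the player typed.  ASSUMPTION: getEnteredVerbPhrase() returns a
 *   canonical string such as 'pick up (dobj)' or 'get (dobj)'.
 */
modify playerActionMessages
    okayTakeMsg()
    {
        /* the verb phrase as entered, in canonical form */
        local vp = gAction.getEnteredVerbPhrase();

        if (vp.startsWith('pick'))
            return 'Picked up. ';
        if (vp.startsWith('get'))
            return 'Got. ';

        /* TAKE, or anything unrecognized: keep the stock message */
        return inherited();
    }
;
```

Anything that does not start with 'pick' or 'get' falls through to the library default, so an unexpected phrase format degrades to the usual "Taken."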

johnnywz00 · July 9, 2023, 2:20pm

I think before testing I had a wrong understanding of getVocabMatchList… it doesn’t help determine what the grammar matches, just which sim objects to consider…

jjmcc · July 10, 2023, 2:01pm

Ok ok, you have convinced me. I do not have refactoring ahead! Given I am getting what I want now in my much more focused instance of this problem, I cannot justify the head spinning so far on display. I do wish you all the best of luck though, and should you crack it, reserve the right to quietly crib your work later. :]

So sometimes there are relapses, yeah? Those def happen.

jbg · July 15, 2023, 2:46am

Update(-ish): my current thinking is that the solution I’ll end up using is actually a combination of using dobj-only verb rules with high badness and a tweaked executeCommand().

After digging around, it requires a surprising amount of bespoke/duplicated effort to get noun phrase resolution if you’re not letting the parser handle it for you.

So I think the “real” noun-as-verb actions will be handled as discussed earlier in the thread, and I’ll have the catch-all set a flag on the resolved action, and then have executeCommand() check for the flag and throw an exception (caught by the handlers which are also in executeCommand()) to handle the fall-through.
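
A rough sketch of the flag-plus-exception pieces described above. Every name here (NounAsVerbFallThrough, isNounAsVerbCatchAll, checkNounAsVerbFallThrough) is hypothetical and not part of adv3, and the matching changes to executeCommand() are only indicated in comments, not shown.

```tads3
#include <adv3.h>
#include <en_us.h>

/* Hypothetical: signals that only the catch-all rule matched. */
class NounAsVerbFallThrough: Exception
    construct(action) { action_ = action; }

    /* the resolved catch-all action that triggered the fall-through */
    action_ = nil
;

modify Action
    /* Hypothetical flag; the catch-all action would set this to true. */
    isNounAsVerbCatchAll = nil
;

/*
 *   Hypothetical helper, meant to be called from a modified
 *   executeCommand() after the action is resolved and before it runs.
 *   The same executeCommand() would need a matching handler:
 *
 *       catch (NounAsVerbFallThrough exc) { ...handle fall-through... }
 */
checkNounAsVerbFallThrough(action)
{
    if (action != nil && action.isNounAsVerbCatchAll)
        throw new NounAsVerbFallThrough(action);
}
```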

This all seems to work so far, but I need to code up a better regression test before I commit something that fiddles around with the main parser loop.

As a side effect I think executeCommand() will end up being a singleton with a bunch of methods instead of a single big function with labels.

jbg · July 20, 2023, 1:35am

The latest complication: distinguishing between valid noun-as-verb actions and “normal” valid but incomplete verb phrases.

One of the uses I have for the noun-as-verb stuff is non-compass travel. Entering a room name is treated as a travel action to the named location. For example, >RED ROOM is treated as >ENTER THE RED ROOM. The problem is in handling cases where there’s a collision between part of the room name and a regular verb. E.g.:

Middle of the Hallway
This is the middle of the hallway.  One end of the hallway gets darker, the
other lighter.

>light end of the hallway
You see no end of the hallway here.

>light
What do you want to light?

>dark end of the hallway
You see no dark end of the hallway here.

>dark end of the hallway
Dark End of the Hallway
This is the dark end of a hallway.  Doors here lead into the red room and the
green room.  You can also move down the hall to the middle of the hallway.

Several confusing things here.

The first command is an attempt to move into the location called “light end of the hallway”. This is a valid location name, and it will work if you use, for example, >ENTER LIGHT END OF THE HALLWAY. But the input is parsed as using the verb >LIGHT on “end of the hallway”, which fails (“You see no end of the hallway here”).

>LIGHT would work if “light” wasn’t a verb (>DARK works to move into the dark end of the hallway), but instead it’s parsed as the verb.

Entering >LIGHT as the second command results in a disambiguation prompt. Inside the disambiguation prompt, the third command, “dark end of the hallway”, is parsed as an object to give >LIGHT, despite being a valid, complete noun-as-verb action.

The fourth command, not in disambiguation, “correctly” handles the noun phrase as a noun-as-verb action and moves the player to the named location.

At first glance it looks like executeAction() (another one of the big flat functions used by the parser, in this case called from within executeCommand()) also needs to be modified to handle this.

Wheee!

jbg · July 20, 2023, 8:57am

Current state of madness: considering re-writing all of this so that noun-as-verb stuff goes in a separate dictionary, and executeCommand() tries both.

johnnywz00 · July 20, 2023, 12:45pm

That’s some devotion to providing those shortcuts!

jbg · July 20, 2023, 11:44pm

The main motivation wasn’t providing shortcuts so much as wanting the narrative/gameplay effect. When the player gets conked on the head and wakes up in a strange location so they don’t know which way is north, that kind of thing. It also helps with things like hallways and outdoor locations where trying to align everything at four or eight compass points can either feel artificial or be awkward (if you’re mostly using the canonical four directions and occasionally using a diagonal, those diagonals can turn into accidental puzzles just because the player isn’t used to having to use them).

That aside, it looks like dictionaries don’t get me what I want either. There seems to be very little documentation on them: the page from the System Manual seems to be all there is. It mentions the idea of using multiple dictionaries “to create different parsing modes, each having their own separate vocabulary words”, but the documentation provides no examples and there don’t appear to be any in the library source.

But what it looks like is that dictionary instances only hold noun phrase stuff, not action/verb phrases, and there doesn’t appear to be any way to enable/disable productions in the middle of command processing. Most of this stuff isn’t T3 code (that is, the stuff that gets fed into t3make) but compiled-in behavior of the interpreter.

The point of contact (between implementor-supplied T3 code and interpreter-supplied compiled C stuff) is in e.g. commandPhrase.parseTokens(), which takes the list of tokens (from the player input) and a dictionary instance as its arguments. It returns a list of matching actions which is then further processed in ways that don’t matter here.

The problem is that commandPhrase.parseTokens() prunes the action list. So in the example I used earlier, tokenizing >LIGHT and giving it to parseTokens() returns a list containing only one possible action, predicate(Light), because it has evaluated the badness of all of the productions and discarded all of the noun-as-verb productions.

This means I can’t just get the results from parseTokens() and sort through them with my own code. If the badness on the noun-as-verb productions is lowered, they’ll “win” and parseTokens() will return them instead (in this case there will be multiple candidates, and one will be selected in executeCommand() after noun resolution), and predicate(Light) won’t be on the resulting list.
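
For anyone who wants to see the pruning for themselves, here is a small debugging sketch. showParseCandidates() is a hypothetical helper, not a library function. It assumes the stock adv3 names cmdTokenizer and cmdDict, and that each match tree returned by commandPhrase.parseTokens() supports resolveFirstAction(), as the library's own executeCommand() appears to rely on. Untested.

```tads3
#include <adv3.h>
#include <en_us.h>

/*
 *   Debugging sketch (hypothetical helper): report how many match trees
 *   parseTokens() returns for an input string, and whether each one
 *   resolves to LightAction.
 */
showParseCandidates(str)
{
    /* tokenize and parse roughly the way the main command loop does */
    local toks = cmdTokenizer.tokenize(str);
    local matches = commandPhrase.parseTokens(toks, cmdDict);

    "<<matches.length()>> match(es) for \"<<str>>\"\n";

    foreach (local m in matches)
    {
        /* the first action this match tree would resolve to, if any */
        local action = m.resolveFirstAction(gPlayerChar, gPlayerChar);

        if (action == nil)
            "  (no action)\n";
        else if (action.ofKind(LightAction))
            "  LightAction\n";
        else
            "  some other action\n";
    }
}
```

Calling showParseCandidates('light') and showParseCandidates('dark end of the hallway') from a debug verb should make it easy to compare what survives in each case.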

And I don’t think there’s any way to dynamically modify the badness of a grammatical rule, which I think would solve the problem (by returning a different badness for noun-as-verb rules based on whether or not processing is happening in disambiguation).

So…back to the drawing board.

jbg · July 21, 2023, 12:48am

Hm. I haven’t actually implemented anything yet, but now I’m thinking that maybe, instead of defining noun-as-verb actions as “normal” grammatical productions, I should “manually” create a top-level production exclusively for noun-as-verb evaluation. The convenience macro (DefineNounAsVerb() in the examples above) would then tack additional rules onto it via GrammarProd.addAlt() (instead of using adv3’s DefineAction macros), and executeCommand() would call parseTokens() on that production before or after commandPhrase.parseTokens() (depending on whether or not we’re in disambiguation).

Don’t know what side-effects parseTokens() has (in terms of setting things other than the return value), though.

jbg · July 21, 2023, 1:57am

Ooof. And of course GrammarProd is one of those things that’s a) not well documented, b) the documentation that exists doesn’t include any complete, worked examples, and c) the examples that are given don’t compile as presented.

The example in the System Manual is:

     local prod = new GrammarProd();
     prod.addAlt('noun->n1', new NounPhraseProd(), cmdDict, symtab);

…which won’t compile (symtab isn’t a global symbol). If you omit it (it’s an optional argument) or supply one, the result will throw a runtime error (“intrinsic class exception: code compilation failed”).

At this point I’m pretty sure that if I’d just started out to write a bespoke interpreter in node/typescript/js from scratch, I’d be done by now.

johnnywz00 · July 21, 2023, 4:00am

I can see wanting to avoid compass directions for certain reasons, but can’t you just make it clear via instructions or early footnote that players can ENTER or GO TO any location they read about? Seems like ‘light end of the hallway’ is more or less a shortcut for ‘go to light end of the hallway’…
(If you’re using GO TO for some kind of long-range pathfinding, surely some other phrase can be used for accessing an immediate location)

jbg · July 21, 2023, 8:53pm

Yeah, those are allowed also. I guess I just see it more of an interface design thing than a shortcut thing. The first design case was non-compass travel, but that got me thinking of it generally. So more or less everything you can interact with in any way has a “default” interaction that you get if you use its name by itself without any verb. Sort of like having everything clickable in a point-and-click or 3d game.

johnnywz00 · July 24, 2023, 12:31pm

As far as

>LIGHT
What do you want to light?

I was wondering if, at the point soon after parseTokens where it calls CommandRanking.sortByRanking, and gives you match = rankings[1], before proceeding you could test

     local ac = match.resolveFirstAction(issuingActor, targetActor);
     if (ac.dobjMatch.ofKind(EmptyNounPhraseProd)) ...

and then look at rankings[2] if the winner matched with an empty noun phrase. That may not even be helpful, it was just a thought.

EDIT: I didn’t even check to see if grammars with badness even show up in the rankings list, so if not, this probably fails immediately.

jbg · July 24, 2023, 10:03pm

Yeah, that’s the problem. parseTokens() returns a pruned list. In the case that the badness of LightAction is lower than whatever noun-as-verb actions the command might otherwise resolve as, the list returned by parseTokens() will only contain LightAction. With the badness situation reversed (by lowering the badness of the noun-as-verb rules, for example) the opposite will happen: the list returned by parseTokens() will contain only the noun-as-verb actions and not LightAction.

This could be made to work if there was some way of dynamically changing the badness of productions, but I don’t think there’s any way to do that (short of making changes to the interpreter code). So I think this will require building a separate production tree and separately evaluating both the results of commandPhrase.parseTokens() and [the base of the other parse tree].parseTokens(). But the documentation about how to actually build a separate parse tree like this appears to be non-existent.
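
One possible shape for a separate tree, sketched with an ordinary static grammar statement rather than GrammarProd.addAlt(). This is a different technique from the dynamic one discussed above, and it is untested. The names nounAsVerbPhrase and parseBothWays are hypothetical. singleNoun, BasicProd, commandPhrase and cmdDict are assumed to be the stock adv3 English library definitions.

```tads3
#include <adv3.h>
#include <en_us.h>

/*
 *   Hypothetical top-level production used only for noun-as-verb input.
 *   It is not an alternative of predicate or commandPhrase, so it does
 *   not compete on badness with the ordinary verb rules.
 */
grammar nounAsVerbPhrase(main):
    singleNoun->dobjMatch
    : BasicProd
;

/*
 *   Hypothetical helper: parse one token list against both trees and
 *   return both result lists, so the caller can pick based on context
 *   (for example, whether an object prompt is active).
 */
parseBothWays(toks)
{
    return [commandPhrase.parseTokens(toks, cmdDict),
            nounAsVerbPhrase.parseTokens(toks, cmdDict)];
}
```

The second list only holds unresolved noun phrase matches. Noun resolution and mapping the result to an action would still have to be done by hand, which is the bespoke effort mentioned earlier in the thread.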

johnnywz00 · July 24, 2023, 10:48pm

Spitballing… what about removing badness from noun-as-verb, but manipulating the rankings list, to use the next match after noun-as-verb only if it doesn’t contain an empty noun phrase? Assuming that with no badness, the noun-as-verb will typically come up first in the list. Noun-as-verb actions could have a marker property (or simply check if they’re of the subclass), and they always get removed from the head of the list unless the other alternatives (that is, what their resolveFirstAction() returns) have no noun phrase…

jbg · July 24, 2023, 11:08pm

That just produces the opposite problem. parseTokens() never returns a heterogeneous list of actions. It does basically everything except noun resolution, so it never (as far as I know…don’t know if there are any obscure corner cases or if I’m just misunderstanding something) returns a list containing multiple options unless the options vary only by noun resolution.

So in the case of the noun-as-verb stuff, as written if you have three different noun-as-verb productions of equal badness, parseTokens() will return a list containing all three (plus the catch-all), because the only difference between them (grammatically) is determined via noun resolution. That’s not true of >LIGHT versus >LIGHT END OF THE HALLWAY (as a noun-as-verb). parseTokens() gives you either LightAction or a list of the noun-as-verb actions, but there isn’t any way of getting it to defer judgement and give you a list containing all of the above.

inventor200 · July 24, 2023, 11:19pm

Omg so I’m not crazy!!

I just treat GrammarProd as just some out-of-bounds stuff. It’s like a completely new side-language from TADS 3, and doesn’t seem to fit in anywhere else.

That’s a mood. The main reason why I tend to come back to TADS 3 is doing all the front-end and accessibility stuff is not what I want to be focusing on when I’m trying to make the mechanics. That’s why I’m glad there’s an interpreter, usually.

jbg · July 24, 2023, 11:33pm

Not sure if this will make the situation clearer or if it will be more confusing, but:

In this particular case (>LIGHT versus a noun-as-verb noun phrase that would match “light” if >LIGHT wasn’t an action), I think you could just kludge around it by adjusting the badness such that parseTokens() will return both the LightAction and the noun-as-verb actions in the list. This would break the fall-through behavior of the noun-as-verb stuff (taking us back to the old >ASK (or whatever) problem). But you could I guess implement logic in executeCommand() to handle that specific case.

The problem then becomes that you then would have to figure out all the points of potential conflict like this (“light” as a verb versus a noun-as-verb) and make sure that all of the noun-as-verb verb rules are defined with matching badness values. Assuming that’s possible (that is, assuming there’s no possible conflict that involves matching multiple productions with different badness scores).
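
If you did go down that road, one way to keep the values from drifting apart is to put the badness in a single macro and use it in every noun-as-verb rule. A minimal sketch: the action name, the macro name, the value 500 and the verbPhrase text are all made up for illustration.

```tads3
#include <adv3.h>
#include <en_us.h>

/* Hypothetical shared value; every noun-as-verb rule would use it. */
#define NOUN_AS_VERB_BADNESS 500

/* Hypothetical dobj-only action for "room name as travel command". */
DefineTAction(NounAsVerbTravel);

VerbRule(NounAsVerbTravel)
    [badness NOUN_AS_VERB_BADNESS] singleDobj
    : NounAsVerbTravelAction
    verbPhrase = 'go/going (to what)'
;
```

This only keeps the noun-as-verb rules consistent with each other. It does nothing about a conflict that involves productions with different badness scores.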

jbg · July 24, 2023, 11:45pm

I think it’s both better and worse if you’ve got experience in formal grammars/compiler design. The basic ideas are more or less the same, but T3 kinda truncates the top of the parse process. This is presumably for performance reasons (and the belief that a small number of people writing games will care about this sort of thing), but it’s irritating to be banging my head against this sort of thing for so long, when resolving it in a lex/yacc (or flex/bison, or generic EBNF-ish system) grammar would involve just replacing a single top-level (or at least high-level) production.