ParseNoun entry point

fredrik · October 17, 2021, 7:35pm

According to DM4:

ParseNoun(obj)

To do the job of parsing the name property (if parse_name hasn’t done it already). This takes one argument, the object in question, and returns a value as if it were a parse_name routine.

Routine returns The number of words matched, or 0 if there is no match, or −1 to decline to make a decision and give the job back to the parser. Note that if −1 is returned, the word number variable wn must be left set to the first word the parser should look at — probably the same value it had when ParseNoun was called, but not necessarily.

I’m reading the specification and looking at the code that calls ParseNoun (line 4544 in parser.h in library v6.12.5), and I can’t get my head around how it’s supposed to work and how it might be put to use in a game.

The way I read the specification, it seems wn can have any value if I return 0 or a positive number, just as in parse_name. I don’t find that to be the case though. And I’ve tried to write an example which consumes a word (increases wn by 1) and returns -1, but I find that the parser gets all confused after this.

Can anyone enlighten me? Can anyone provide a real-world example of a ParseNoun routine that returns -1 and increases wn?

otistdog · October 18, 2021, 7:49am

Contrary to the description in DM4, I don’t think that ParseNoun() is free to leave wn in an arbitrary place. It seems to be allowed for parse_name() only because wn is saved before the call and restored if the return result is negative. This appears to have been true even for StdLib 6/9.

With the following changes (note: tested vs StdLib 6/11, not 6.12.5 but they seem substantially similar), I think it can be made to work closer to the way that it is described in DM4:

...
#Ifdef DEBUG;
if (threshold >= 0 && parser_trace >= 5) print "    ParseNoun returned ", threshold, "^";
#Endif; ! DEBUG
! MODIFIED - don't arbitrarily increase wn when ParseNoun() returns -1
if (threshold > 0) { k = threshold; wn = j + k; jump MMbyPN; } ! MODIFIED - force wn to match claimed consumption
if (threshold == 0 || Refers(obj,wn) == 0) { ! MODIFIED - check wn instead of wn-1
  .NoWordsMatch;
    if (indef_mode ~= 0) {
        k = 0; parser_action = NULL; wn = j; ! MODIFIED - restore wn to pre-ParseNoun() state
        jump MMbyPN;
...
if (threshold < 0) {
    threshold = wn - j; ! MODIFIED - set threshold to reflect any words consumed by ParseNoun()
    w = NextWord(); ! MODIFIED (order) - ensure w contains actual first word of noun phrase if ParseNoun() moved wn
    dict_flags_of_noun = (w->#dict_par1) & $$01110100;
    while (Refers(obj, wn-1)) {
...

With that in place, it becomes possible to use it productively by leaving wn where you want it and returning -1. I worked up an example to parse the word 'your' as a possessive pronoun based on an owner property for objects, for use when talking to an NPC.

fredrik · October 22, 2021, 12:05pm

I have filed an issue for this.

DavidG · January 9, 2022, 7:00am

What is the likelihood of trouble with existing code if I put these changes into the Standard Library?

fredrik · January 9, 2022, 9:46am

When you say this makes it work closer to the way it is described in DM4, what does that mean? Do you know something that still doesn’t work according to the specification after this change?

otistdog · January 9, 2022, 6:00pm

Unfortunately, I’m not sure exactly what I meant by that – it’s been a couple of months and I’ve forgotten a lot of the details that were in mind at the time.

There’s not much in DM4 about ParseNoun(). Looking it over and checking my notes, nothing is prompting memories of specific remaining issues. I may have just been hedging my bets because there was the possibility of interaction with other parts of the code with which I’m not familiar.

I did find my notes describing the exact nature of the problem with the current code, which the posted fix is intended to address. For the benefit of @DavidG, they are as follows:

After the call to ParseNoun() from TryGivenObject(), there’s a line

if (threshold < 0) wn++;

that advances wn by 1 when ParseNoun() has returned a negative number.

This doesn’t make any sense until one looks at the logic for checking .name parameters later in the routine. There’s an odd order of operations that works out if and only if wn was undisturbed by the call to ParseNoun().

The later line

if (threshold == 0 || Refers(obj,wn-1) == 0) {

compensates for the forced advancement, checking that the word under wn as it was immediately after return from ParseNoun() can be found in the object’s .name property.

So far, so good. However, things get out of sync in the next block:

if (threshold < 0) {
    threshold = 1;
    dict_flags_of_noun = (w->#dict_par1) & (DICT_X654+DICT_PLUR);!$$01110100;
    w = NextWord();
    while (Refers(obj, wn-1)) {
        threshold++;
        if (w)
           dict_flags_of_noun = dict_flags_of_noun | ((w->#dict_par1) & (DICT_X654+DICT_PLUR));
        w = NextWord();
    }
}

The code assumes that the previous Refers() check cleared the first word of the noun phrase, so threshold can be set to 1 to reflect this.

The code also assumes that dict_flags_of_noun can be meaningfully set from the contents of local variable w, which is not the case if ParseNoun() has moved wn before returning -1. The code implicitly expects that the word loaded into local variable w (by an earlier call to NounWord()) is the first word of the noun phrase, but note that the contents of w have not been updated since the earlier call to NounWord(), before the call to ParseNoun(). (It is an OK assumption if wn was not disturbed by ParseNoun().)

The code then updates w with a call to NextWord(). If wn was undisturbed by ParseNoun(), this would load the second word of the noun phrase. If wn was moved by ParseNoun(), then a fault is introduced. (For example, if wn is left at the last word of input.)

Consider the case where ParseNoun() has been written to skip the word 'your' if the actor is the holder of an object. The command >BOB, DROP YOUR PENCIL will get a call to TryGivenObject() with wn set to 4 (pointing to 'your'). The ParseNoun() routine always returns -1 (to allow normal processing of the .name property), but when the word is not 'your' it rewinds wn to put it back to the same place that it was when it was called.

After line

threshold = ParseNoun(obj);

then threshold is -1 and wn is 5 (because it is deliberately skipping a word). The line

if (threshold < 0) wn++;

increments wn to 6. Then the condition in line

if (threshold == 0 || Refers(obj,wn-1) == 0) {

evaluates to false (as desired), because the second parameter of the Refers() check evaluates to 5, and this points to 'pencil' which will return true for the pencil object. But when it gets to

if (threshold < 0) {
    threshold = 1;
    dict_flags_of_noun = (w->#dict_par1) & (DICT_X654+DICT_PLUR);!$$01110100;
    w = NextWord();
    while (Refers(obj, wn-1)) {

the if block is entered, threshold is set to 1 (OK in this case), dict_flags_of_noun is updated based on 'your' (also OK in this case), but then w is set to 0 (because wn is at 6 before the call to NextWord(), past end of input) and wn is increased to 7. The while loop check attempts to look at word 6 of input (which is not there), fails, and TryGivenObject() returns 1. This is an insufficient number of words to consume the whole noun phrase, resulting in the parser’s confusion that is noted in the original post for this thread. (In this case, the grammar for ##Drop fails and the command is converted to an ##Answer action.)

For >BOB, DROP PENCIL, everything works out. ParseNoun() leaves wn at 4 ('pencil'). Then wn is incremented to 5. The first Refers(obj, 4) check sees 'pencil' and passes. Local variable threshold is set to 1 to reflect this. The next call to NextWord() returns 0, so the while loop is skipped. TryGivenObject() returns 1 (consuming the single word 'pencil'), and the grammar line for ##Drop parses OK.

DavidG · January 9, 2022, 10:43pm

I think I understand now. I started a branch, Files · ParseNoun_fix · David Griffith / inform6lib · GitLab, and am awaiting @fredrik’s confirmation that it’s correct.

DavidG · January 10, 2022, 9:21am

Whoops! I got confused over who posted the fix. @otisdog, I’m not certain if I got your fix correct. Would you please take a look at https://gitlab.com/DavidGriffith/inform6lib/-/issues/126 and check the branch I reference there to make sure I did it right?

otistdog · January 11, 2022, 11:20pm

From what I can tell, this reflects the changes from my post above. However, the revised code is derived from StdLib 6.12, whereas mine, as noted, was derived from 6/11.

The significant difference between the two appears to be the inclusion of a branch to allow state changes via the LibraryExtensions subsystem, of which I have been unable to find much in the way of documentation. The branch flow returns to the main line of TryGivenObject() before these changes, so there is a possibility of interaction between that branch and the changes I’ve laid out. I can’t make any predictions about what form that interaction might take, but if the default state is for the branch to make no changes, then there should by default be no interaction.

Note that there is a line in the revised code that you posted:

dict_flags_of_noun = (w->#dict_par1) & $$01110100;

which seems to reverse some changes made since 6/11. I think the 6.12 version of that line is:

dict_flags_of_noun = (w->#dict_par1) & (DICT_X654+DICT_PLUR);!$$01110100;

I believe that the two versions are functionally identical, but the new constants should probably be kept in place.

DavidG · January 11, 2022, 11:59pm

Thanks. I’ve cleared things up and pushed this fix out.