Run-time problem P39 when changing the number of input words in Counterfeit Monkey

otistdog · November 18, 2023, 7:52pm

Oh, wow – I really was not paying enough attention when writing that first set of untested code. I’ve corrected it above.

Please accept my apologies for the confusion.

Angstsmurf · November 18, 2023, 9:30pm

Thanks a lot for looking into this!

As far as I understood you, you are saying that the condition

if the player's command includes "[someone talk-eligible]":

won’t work as intended (that is, be true only if the player’s command includes words that can be understood as the current interlocutor) because of an Inform bug. And that it will cause the run-time problem in question when we then try to set a variable to the non-existent matched text.

In other words, we have to find an alternative way to check for this.

drpeterbatesuk · November 18, 2023, 11:54pm

otistdog:

The routine runs a loop using the provided test routine, in this case Consult_Grammar_1434(). The routine is fed parameters (1,0), then (2,0) on to (5,0), looking for a result that is not GPR_FAIL. In this case, it does get such a result on the fifth cycle, i.e. on the call Consult_Grammar_1434(5,0).

The result that it receives is the object ID for the gift shop volunteer object. I think that this is happening because of a parser inference, but I haven’t traced it out. It makes sense, though, because only one object can be the current interlocutor.

As I understand it, the parser’s logic of this type will only make an inference of that type when thinks that it’s past the end of the entered command. If that’s correct, then in theory the parsing should fail instead of returning the current interlocutor; inference should not be happening because the adjusted player's command is >ASK WHAT IS WORTH SEEING, so ParseToken() should be rejecting the word “seeing” as applicable to the gift shop volunteer.

But the critical issue is that if ParseTokenStopped() has (rightly or wrongly) made a match on the final word of a 5 word command, wn should now be beyond the end of the player’s command (6) so SnippetIncludes() should return 5*100+6-5 = 501, which is a valid snippet representing the last word of a 5 word command…?

otistdog · November 19, 2023, 3:33am

When that calculation is made, wn == 5. Consult_Grammar_1434() forcibly sets wn to 5 as its first operation, and the invoking routine SnippetIncludes() makes no effort to save it and restore a previous value.

As far as I can tell, the contents of parse and buffer are as desired (i.e. reflecting >ASK WHAT IS WORTH SEEING) when the action moves to ParseTokenStopped(). ParseTokenStopped() does not halt early because it sees the request as in-bounds for the current contents.

The particular logical fault seems to be within TryGivenObject(), which has the block:

!  If input has run out then always match, with only quality 0 (this saves
!  time).

	if (wn > num_words) {	! stale value of num_words for some reason
		if (nomatch) return 0;
	    if (indef_mode ~= 0)
	        dict_flags_of_noun = $$01110000;  ! Reject "plural" bit
	    MakeMatch(obj,0);
	    #Ifdef DEBUG;
	    if (parser_trace >= 5) print "    Matched (0)^";
	    #Endif; ! DEBUG
	    return 1;
	}

If the comparison condition is changed to

	if (wn > WordCount()) {	! check current parse array value

then the RTP does not occur.

Here’s the inclusion for 6M62:

TryGiveObject (modified)

Include (-

! ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
! Parser.i6t: TryGivenObject
! ==== ==== ==== ==== ==== ==== ==== ==== ==== ====


[ TryGivenObject obj nomatch threshold k w j;
	#Ifdef DEBUG;
	if (parser_trace >= 5) print "    Trying ", (the) obj, " (", obj, ") at word ", wn, "^";
	#Endif; ! DEBUG

	if (nomatch && obj == 0) return 0;

! if (nomatch) print "*** TryGivenObject *** on ", (the) obj, " at wn = ", wn, "^";

	dict_flags_of_noun = 0;

!  If input has run out then always match, with only quality 0 (this saves
!  time).

	if (wn > WordCount()) {	! MODIFIED
		if (nomatch) return 0;
	    if (indef_mode ~= 0)
	        dict_flags_of_noun = $$01110000;  ! Reject "plural" bit
	    MakeMatch(obj,0);
	    #Ifdef DEBUG;
	    if (parser_trace >= 5) print "    Matched (zero)^";
	    #Endif; ! DEBUG
	    return 1;
	}

!  Ask the object to parse itself if necessary, sitting up and taking notice
!  if it says the plural was used:

	if (obj.parse_name~=0) {
	    parser_action = NULL; j=wn;
	    k = RunRoutines(obj,parse_name);
	    if (k > 0) {
	        wn=j+k;

	      .MMbyPN;

	        if (parser_action == ##PluralFound)
	            dict_flags_of_noun = dict_flags_of_noun | 4;

	        if (dict_flags_of_noun & 4) {
	            if (~~allow_plurals) k = 0;
	            else {
	                if (indef_mode == 0) {
	                    indef_mode = 1; indef_type = 0; indef_wanted = 0;
	                }
	                indef_type = indef_type | PLURAL_BIT;
	                if (indef_wanted == 0) indef_wanted = INDEF_ALL_WANTED;
	            }
	        }

	        #Ifdef DEBUG;
	        if (parser_trace >= 5) print "    Matched (", k, ")^";
	        #Endif; ! DEBUG
	        if (nomatch == false) MakeMatch(obj,k);
	        return k;
	    }
	    if (k == 0) jump NoWordsMatch;
	}

	! The default algorithm is simply to count up how many words pass the
	! Refers test:

	parser_action = NULL;

	w = NounWord();

	if (w == 1 && player == obj) { k=1; jump MMbyPN; }

	if (w >= 2 && w < 128 && (LanguagePronouns-->w == obj)) { k = 1; jump MMbyPN; }

	if (Refers(obj, wn-1) == 0) {
	    .NoWordsMatch;
	    if (indef_mode ~= 0) { k = 0; parser_action = NULL; jump MMbyPN; }
	    rfalse;
	}

	threshold = 1;
	dict_flags_of_noun = (w->#dict_par1) & $$01110100;
	w = NextWord();
	while (Refers(obj, wn-1)) {
		threshold++;
		if (w)
		   dict_flags_of_noun = dict_flags_of_noun | ((w->#dict_par1) & $$01110100);
		w = NextWord();
	}

	k = threshold;
	jump MMbyPN;
];

-) instead of "TryGivenObject" in "Parser.i6t".

None of the other code from previous posts matters for resolving the issue.

I think this does qualify as an actual bug… you could argue that it’s a consequence of the particular path to TryGivenObject(), but really the routine is relying on a cached value that it doesn’t cost much to refresh or read directly.

drpeterbatesuk · November 19, 2023, 8:06am

That said, the parser does rely throughout on the num_words global being refreshed every time the input & parse buffers are altered, so one could argue that the root of the error lies in not having num_words = WordCount(); players_command = 100 + num_words; immediately after VM_Tokenise(buffer,parse) somewhere, as by the parser’s conventions should always happen.

Angstsmurf · November 19, 2023, 12:54pm

It works. Wonderful!

This allows us to simplify the new strip interlocutor from input rule. Now cut the matched text works as intended, and we no longer need to use a regular expression.

drpeterbatesuk · November 20, 2023, 11:00am

Does this fix also work- which I would humbly suggest fixes the fundamental issue at source (therefore without leaving other hostages to fortune): i.e. that num_words is not updated after splicing a snippet, although players_command is:

Modified SpliceSnippet

Include (-
Replace SpliceSnippet;
-) after "Definitions.i6t".
Include (-
[ SpliceSnippet snip t i w1 w2 nextw at endsnippet newlen;
	w1 = snip/100; w2 = w1 + (snip%100) - 1;
	if ((w2<w1) || (w1<1)) {
		if ((w1 == 1) && (w2 == 0)) return;
		return RunTimeProblem(RTP_SPLICEINVALIDSNIPPET, w1, w2);
	}
	@push say__p; @push say__pc;
	nextw = w2 + 1;
	at = WordAddress(w1) - buffer;
	if (nextw <= WordCount()) endsnippet = 100*nextw + (WordCount() - nextw + 1);
	buffer2-->0 = 120;
	newlen = VM_PrintToBuffer(buffer2, 120, SpliceSnippet__TextPrinter, t, endsnippet);
	for (i=0: (i<newlen) && (at+i<120): i++) buffer->(at+i) = buffer2->(WORDSIZE+i);
	#Ifdef TARGET_ZCODE; buffer->1 = at+i; #ifnot; buffer-->0 = at+i; #endif;
	for (:at+i<120:i++) buffer->(at+i) = ' ';
	VM_Tokenise(buffer, parse);
	num_words = WordCount();                           ! #### INSERTED STATEMENT ####
	players_command = 100 + WordCount();
	@pull say__pc; @pull say__p;
];
-)  after "Output.i6t".

EDIT: I see that this bug persists in the Ver 10.1.2 version of SpliceSnippet(). I have reported it.

Angstsmurf · November 20, 2023, 3:18pm

Hm, even with this fix active, I still get run-time problem P39 by typing the thing that originally led me down this path: >ASK WHAT’S WORTH SEEING (note the curly apostrophe):

*** Run-time problem P39: Attempt to say a snippet value which is currently invalid: words 5 to 4.

And it is immediately followed by

*** Run-time problem P40: Attempt to splice a snippet value which is currently invalid: words 5 to 4.

EDIT: I guess this has something to do with the interpreter replacing the Unicode curly apostrophe with a question mark, which the built-in extension Punctuation Removal then replaces with a space, creating an extra word (“WHAT?S” becomes “WHAT S”.)

otistdog · November 20, 2023, 3:55pm

To clarify, when you say “this fix” is active, do you mean the one from me or the one from drpeterbatesuk? (I agree that his fix should also address the issue, but I want to make sure that I’m looking at the same code as you.)

Angstsmurf · November 20, 2023, 4:00pm

I have only tested with Peter Bates’ fix yet.

EDIT: Now I tested with your (otisdog’s) fix as well. With that, there is no run-time problem with ASK WHAT’S WORTH SEEING.

otistdog · November 20, 2023, 4:57pm

As I said, I agree that the fix by drpeterbatesuk should address the issue, but shoulds and oughts don’t always carry much weight in parser code. It seems that there are additional places where WordCount() and num_words can get out of sync. These may be introduced by other code in Counterfeit Monkey (or the extensions it uses) instead of template code. (EDIT: See following post.)

I also agree with the good doctor that every other use of num_words is a hostage of fortune if only the issue in TryGivenObject() is addressed, but your experience suggests that they are not causing any observable problems.

In any case, these RTPs are still both coming from the new strip interlocutor from input rule when only SpliceSnippet() is modified – the second regarding splicing is from the line:

		cut the matched text;

I don’t see either RTP if the modified version of TryGivenObject() is in place.

I count between 20 and 25 places where template code reads the value of num_words for comparison or computation. Arguably, all of these could be replaced by calls to WordCount() and the global num_words retired. Arguably, a standardized routine could be set up to be invoked after any modification to the command buffer to ensure retokenization and update of parse array-dependent values such as players_command and num_words, and the multiplicity of inline repetitions for these operations eliminated.

EDIT: Here’s some code to assist in diagnosing:

To decide which number is num_words cache:
	(- num_words -).

To decide which number is num_words computed:
	(- (WordCount()) -)

To check word count disparity:
	unless num_words cache is num_words computed:
		say "<MISMATCH for num_words: cache = [num_words cache], computed = [num_words computed]>".

If you add the phrase check word count disparity; to any point in your code, you can see whether a mismatch has developed there.

EDIT 2: I misdiagnosed the source of the second RTP. Corrected above.

otistdog · November 20, 2023, 6:03pm

FYI: In the version of the github code I downloaded this morning, the version of drpeterbatesuk’s inclusion in Character Models.i7x takes the form (with comment added):

Include (-
Replace SpliceSnippet;
-) after "Definitions.i6t".

Include (-
...

-)  after "Output.i6t". [note placement]

Per my testing, it prevents both RTPs (without a modification of TryGivenObject) if the location specified for the inclusion is changed to:

-)  after "Snippets" in "Parser.i6t".

[EDIT: Addendum: It prevents both RTPs for the command >ASK WHAT’S WORTH SEEING (with straight apostrophe). It does not prevent either RTP (39/RTP_SAYINVALID_SNIPPET or 40/RTP_SPLICINVALIDSNIPPET) for the command >ASK WHAT?S WORTH SEEING, which is what the parser actually sees when a curly apostrophe is used. You did say to note that!]

drpeterbatesuk · November 21, 2023, 12:42am

Oops, should have checked that sequencing!

drpeterbatesuk · November 21, 2023, 12:51am

Yes, it’s all a bit Heath Robinson at present. If the parse buffer is only ever updated by VM_Tokenise(buffer, parse) (I suspect that’s true although I haven’t checked) this code could simply be included in that function.

Angstsmurf · November 21, 2023, 5:14pm

I don’t know what I’m doing wrong, but I can’t get this to work. In my testing, ASK WHAT’S WORTH SEEING with a curly apostrophe will still cause a RTP with this change.

(I’ve only tested this in the macOS Inform IDE, though.)

However, the modified TryGivenObject() still seems to fix all RTPs (both when used on its own and in combination with the replaced SpliceSnippet()), so I’m going with that for now.

drpeterbatesuk · November 21, 2023, 11:06pm

How about with a straight apostrophe?

I assume then that there is also another routine other than SpliceSnippet which is altering the parse buffer without updating num_words?

How does one get to the relevant part of Counterfeit Monkey so that I can debug this (I have never played it). Is it easy?

otistdog · November 22, 2023, 3:09am

You’re not doing anything wrong. Despite the fact that you said to note that the RTPs were still happening only with curly apostrophes, I was not properly checking that case. I confirm that when the command is >ASK WHAT?S WORTH SEEING (or the version with curly apostrophe, which presents the same actual input to the parser), the RTPs still occur with only the modification to SpliceSnippet() in place, regardless of its specified location.

To attempt to be ultra-clear to all involved (including myself), the following tables show what I am seeing. A command was run for each boolean cell across two compilations, one with just the TryGivenObject() fix and one with just the SpliceSnippet() fix:

Table 1 - P39 (RTP_SAYINVALIDSNIPPET) RTPs

											
----------    situation identification    ----------	----  fixed by modification?  ----
Case #		Entered Command								TryGivenObject?		SpliceSnippet?								
==========================================================================================
1			>ASK WHAT'S WORTH SAYING [straight]			YES					YES
2			>ASK WHATʼS WORTH SEEING [curly]			YES					NO
3			>ASK WHAT?S WORTH SEEING [question mark]	YES					NO


Table 2 - P40 (RTP_SPLICEINVALIDSNIPPET) RTPs

											
----------    situation identification    ----------	----  fixed by modification?  ----
Case #		Entered Command								TryGivenObject?		SpliceSnippet?								
==========================================================================================
1			>ASK WHAT'S WORTH SAYING [straight]			YES					YES
2			>ASK WHATʼS WORTH SEEING [curly]			YES					NO
3			>ASK WHAT?S WORTH SEEING [question mark]	YES					NO

Note that, in theory, cases #2 and #3 are logically identical because the curly apostrophe is changed to a question mark on its way into the command buffer.

The reason the RTPs are still occurring in the final column is still the same basic failure mode that I specified above: It is due to a call path originating from the same culprit rule (new strip interlocutor from input rule), starting with routine SNIPPET_TY_to_TEXT_TY. For the P39 error, the chain is:

SNIPPET_TY_to_TEXT_TY -> BlkValueCast -> TEXT_TY_Support -> TEXT_TY_Cast -> TEXT_TY_CastPrimitive -> PrintSnippet

This call chain does not involve SpliceSnippet(). The chain results from the line:

	let M be the substituted form of the matched text; [P39!]

which becomes:

    ! [5: let m be the substituted form of the matched text]
    tmp_0 = I7SFRAME;
	BlkValueCopy(tmp_0, TEXT_TY_SubstitutedForm((I7SFRAME+WORDSIZE*2), SNIPPET_TY_to_TEXT_TY((I7SFRAME+WORDSIZE*4),matched_text)));

I have to apologize for the confusion that I have contributed on this thread. As evidenced by my very first post (in which I mistyped a routine name), I have not been in top form the last few days. I should have noticed that the good doctor’s inclusion alters routine SpliceSnippet() but not SnippetIncludes() – it is the latter routine that sets matched_text incorrectly, in the manner originally diagnosed.

If only SpliceSnippet() is modified, then at this point in the rule’s logic, SnippetIncludes() has still set global matched_text to 500, and this illegal snippet value is still being fed to PrintSnippet() during the attempt to cast the matched text snippet to text. That does not happen with my inclusion, because it alters the return value from TryGivenObject(), which ensures that ParseTokenStopped() returns a GPR_FAIL result instead of an object ID, and thus the Consult_Grammar_NNNN() routine is not confused into thinking that it has spotted something matching the search topic in the player's command.

I’ve put in several hours on this one, and I think I’ve solved your reported problem in a definitive manner. It shouldn’t hurt to apply drpeterbatesuk’s inclusion along with the original fix, and doing so might prevent as-yet-unidentified trouble.

Angstsmurf · November 22, 2023, 6:18am

In a (6M62) debug build, answer the initial questions (Y, ANDRA), and type >GONEAR VOLUNTEER. Type ACTIONS and RULES ALL if you desire debug output. Then >ASK WHAT’S WORTH SEEING.

Angstsmurf · November 22, 2023, 6:24am

If it wasn’t clear, I am super grateful for all your hard work. Sorry for leading you down such a rabbit hole. As far as I am concerned, the bugs are fixed.

otistdog · November 22, 2023, 2:14pm

OK, great. It wasn’t clear that you considered the issue to be solved because no solution is marked.

It was a good problem, and I think at least two bug fixes will come out of it, so I’m happy to have put in the time.

@drpeterbatesuk, Another way to look at the cause here is that SnippetIncludes() is accepting an object ID returned from ParseTokenStopped() as evidence that text matching that object can be found in the snippet, which, as this thread illustrates, may not be true.

If this case were to return false via code something like

[ SnippetIncludes test snippet w1 w2 wlen i j;
	...
		for (i=w1, j=wlen: j>0: i++, j--) {
			rv = (test)(i, 0); 
			if (rv ~= GPR_FAIL && i - wn ~= 0 ) return i*100+wn-i;	! don't accept zero-length "match"
		}
	}
	rfalse;
];

then matched_text would never get an illegal value due to inference. This would head off trouble in related routines such as SpliceSnippet() (though to be clear I think that your proposed change to that routine is definitely warranted).