Oh, wow – I really was not paying enough attention when writing that first set of untested code. I’ve corrected it above.
Please accept my apologies for the confusion.
Oh, wow – I really was not paying enough attention when writing that first set of untested code. I’ve corrected it above.
Please accept my apologies for the confusion.
Thanks a lot for looking into this!
As far as I understood you, you are saying that the condition
if the player's command includes "[someone talk-eligible]":
won’t work as intended (that is, be true only if the player’s command includes words that can be understood as the current interlocutor) because of an Inform bug. And that it will cause the run-time problem in question when we then try to set a variable to the non-existent matched text.
In other words, we have to find an alternative way to check for this.
But the critical issue is that if ParseTokenStopped() has (rightly or wrongly) made a match on the final word of a 5 word command, wn should now be beyond the end of the player’s command (6) so SnippetIncludes() should return 5*100+6-5 = 501, which is a valid snippet representing the last word of a 5 word command…?
When that calculation is made, wn == 5
. Consult_Grammar_1434()
forcibly sets wn
to 5
as its first operation, and the invoking routine SnippetIncludes()
makes no effort to save it and restore a previous value.
As far as I can tell, the contents of parse
and buffer
are as desired (i.e. reflecting >ASK WHAT IS WORTH SEEING) when the action moves to ParseTokenStopped()
. ParseTokenStopped()
does not halt early because it sees the request as in-bounds for the current contents.
The particular logical fault seems to be within TryGivenObject()
, which has the block:
! If input has run out then always match, with only quality 0 (this saves
! time).
if (wn > num_words) { ! stale value of num_words for some reason
if (nomatch) return 0;
if (indef_mode ~= 0)
dict_flags_of_noun = $$01110000; ! Reject "plural" bit
MakeMatch(obj,0);
#Ifdef DEBUG;
if (parser_trace >= 5) print " Matched (0)^";
#Endif; ! DEBUG
return 1;
}
If the comparison condition is changed to
if (wn > WordCount()) { ! check current parse array value
then the RTP does not occur.
Here’s the inclusion for 6M62:
Include (-
! ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
! Parser.i6t: TryGivenObject
! ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
[ TryGivenObject obj nomatch threshold k w j;
#Ifdef DEBUG;
if (parser_trace >= 5) print " Trying ", (the) obj, " (", obj, ") at word ", wn, "^";
#Endif; ! DEBUG
if (nomatch && obj == 0) return 0;
! if (nomatch) print "*** TryGivenObject *** on ", (the) obj, " at wn = ", wn, "^";
dict_flags_of_noun = 0;
! If input has run out then always match, with only quality 0 (this saves
! time).
if (wn > WordCount()) { ! MODIFIED
if (nomatch) return 0;
if (indef_mode ~= 0)
dict_flags_of_noun = $$01110000; ! Reject "plural" bit
MakeMatch(obj,0);
#Ifdef DEBUG;
if (parser_trace >= 5) print " Matched (zero)^";
#Endif; ! DEBUG
return 1;
}
! Ask the object to parse itself if necessary, sitting up and taking notice
! if it says the plural was used:
if (obj.parse_name~=0) {
parser_action = NULL; j=wn;
k = RunRoutines(obj,parse_name);
if (k > 0) {
wn=j+k;
.MMbyPN;
if (parser_action == ##PluralFound)
dict_flags_of_noun = dict_flags_of_noun | 4;
if (dict_flags_of_noun & 4) {
if (~~allow_plurals) k = 0;
else {
if (indef_mode == 0) {
indef_mode = 1; indef_type = 0; indef_wanted = 0;
}
indef_type = indef_type | PLURAL_BIT;
if (indef_wanted == 0) indef_wanted = INDEF_ALL_WANTED;
}
}
#Ifdef DEBUG;
if (parser_trace >= 5) print " Matched (", k, ")^";
#Endif; ! DEBUG
if (nomatch == false) MakeMatch(obj,k);
return k;
}
if (k == 0) jump NoWordsMatch;
}
! The default algorithm is simply to count up how many words pass the
! Refers test:
parser_action = NULL;
w = NounWord();
if (w == 1 && player == obj) { k=1; jump MMbyPN; }
if (w >= 2 && w < 128 && (LanguagePronouns-->w == obj)) { k = 1; jump MMbyPN; }
if (Refers(obj, wn-1) == 0) {
.NoWordsMatch;
if (indef_mode ~= 0) { k = 0; parser_action = NULL; jump MMbyPN; }
rfalse;
}
threshold = 1;
dict_flags_of_noun = (w->#dict_par1) & $$01110100;
w = NextWord();
while (Refers(obj, wn-1)) {
threshold++;
if (w)
dict_flags_of_noun = dict_flags_of_noun | ((w->#dict_par1) & $$01110100);
w = NextWord();
}
k = threshold;
jump MMbyPN;
];
-) instead of "TryGivenObject" in "Parser.i6t".
None of the other code from previous posts matters for resolving the issue.
I think this does qualify as an actual bug… you could argue that it’s a consequence of the particular path to TryGivenObject()
, but really the routine is relying on a cached value that it doesn’t cost much to refresh or read directly.
That said, the parser does rely throughout on the num_words global being refreshed every time the input & parse buffers are altered, so one could argue that the root of the error lies in not having num_words = WordCount(); players_command = 100 + num_words;
immediately after VM_Tokenise(buffer,parse) somewhere, as by the parser’s conventions should always happen.
It works. Wonderful!
This allows us to simplify the new strip interlocutor from input rule. Now cut the matched text
works as intended, and we no longer need to use a regular expression.
Does this fix also work- which I would humbly suggest fixes the fundamental issue at source (therefore without leaving other hostages to fortune): i.e. that num_words
is not updated after splicing a snippet, although players_command
is:
Include (-
Replace SpliceSnippet;
-) after "Definitions.i6t".
Include (-
[ SpliceSnippet snip t i w1 w2 nextw at endsnippet newlen;
w1 = snip/100; w2 = w1 + (snip%100) - 1;
if ((w2<w1) || (w1<1)) {
if ((w1 == 1) && (w2 == 0)) return;
return RunTimeProblem(RTP_SPLICEINVALIDSNIPPET, w1, w2);
}
@push say__p; @push say__pc;
nextw = w2 + 1;
at = WordAddress(w1) - buffer;
if (nextw <= WordCount()) endsnippet = 100*nextw + (WordCount() - nextw + 1);
buffer2-->0 = 120;
newlen = VM_PrintToBuffer(buffer2, 120, SpliceSnippet__TextPrinter, t, endsnippet);
for (i=0: (i<newlen) && (at+i<120): i++) buffer->(at+i) = buffer2->(WORDSIZE+i);
#Ifdef TARGET_ZCODE; buffer->1 = at+i; #ifnot; buffer-->0 = at+i; #endif;
for (:at+i<120:i++) buffer->(at+i) = ' ';
VM_Tokenise(buffer, parse);
num_words = WordCount(); ! #### INSERTED STATEMENT ####
players_command = 100 + WordCount();
@pull say__pc; @pull say__p;
];
-) after "Output.i6t".
EDIT: I see that this bug persists in the Ver 10.1.2 version of SpliceSnippet(). I have reported it.
Hm, even with this fix active, I still get run-time problem P39 by typing the thing that originally led me down this path: >ASK WHAT’S WORTH SEEING (note the curly apostrophe):
*** Run-time problem P39: Attempt to say a snippet value which is currently invalid: words 5 to 4.
And it is immediately followed by
*** Run-time problem P40: Attempt to splice a snippet value which is currently invalid: words 5 to 4.
EDIT: I guess this has something to do with the interpreter replacing the Unicode curly apostrophe with a question mark, which the built-in extension Punctuation Removal then replaces with a space, creating an extra word (“WHAT?S” becomes “WHAT S”.)
To clarify, when you say “this fix” is active, do you mean the one from me or the one from drpeterbatesuk? (I agree that his fix should also address the issue, but I want to make sure that I’m looking at the same code as you.)
I have only tested with Peter Bates’ fix yet.
EDIT: Now I tested with your (otisdog’s) fix as well. With that, there is no run-time problem with ASK WHAT’S WORTH SEEING.
As I said, I agree that the fix by drpeterbatesuk should address the issue, but shoulds and oughts don’t always carry much weight in parser code. It seems that there are additional places where (EDIT: See following post.)WordCount()
and num_words
can get out of sync. These may be introduced by other code in Counterfeit Monkey (or the extensions it uses) instead of template code.
I also agree with the good doctor that every other use of num_words
is a hostage of fortune if only the issue in TryGivenObject()
is addressed, but your experience suggests that they are not causing any observable problems.
In any case, these RTPs are still both coming from the new strip interlocutor from input rule
when only SpliceSnippet()
is modified – the second regarding splicing is from the line:
cut the matched text;
I don’t see either RTP if the modified version of TryGivenObject()
is in place.
I count between 20 and 25 places where template code reads the value of num_words
for comparison or computation. Arguably, all of these could be replaced by calls to WordCount()
and the global num_words
retired. Arguably, a standardized routine could be set up to be invoked after any modification to the command buffer
to ensure retokenization and update of parse
array-dependent values such as players_command
and num_words
, and the multiplicity of inline repetitions for these operations eliminated.
EDIT: Here’s some code to assist in diagnosing:
To decide which number is num_words cache:
(- num_words -).
To decide which number is num_words computed:
(- (WordCount()) -)
To check word count disparity:
unless num_words cache is num_words computed:
say "<MISMATCH for num_words: cache = [num_words cache], computed = [num_words computed]>".
If you add the phrase check word count disparity;
to any point in your code, you can see whether a mismatch has developed there.
EDIT 2: I misdiagnosed the source of the second RTP. Corrected above.
FYI: In the version of the github code I downloaded this morning, the version of drpeterbatesuk’s inclusion in Character Models.i7x
takes the form (with comment added):
Include (-
Replace SpliceSnippet;
-) after "Definitions.i6t".
Include (-
...
-) after "Output.i6t". [note placement]
Per my testing, it prevents both RTPs (without a modification of TryGivenObject
) if the location specified for the inclusion is changed to:
-) after "Snippets" in "Parser.i6t".
[EDIT: Addendum: It prevents both RTPs for the command >ASK WHAT’S WORTH SEEING (with straight apostrophe). It does not prevent either RTP (39/RTP_SAYINVALID_SNIPPET
or 40/RTP_SPLICINVALIDSNIPPET
) for the command >ASK WHAT?S WORTH SEEING, which is what the parser actually sees when a curly apostrophe is used. You did say to note that!]
Oops, should have checked that sequencing!
Yes, it’s all a bit Heath Robinson at present. If the parse buffer is only ever updated by VM_Tokenise(buffer, parse) (I suspect that’s true although I haven’t checked) this code could simply be included in that function.
I don’t know what I’m doing wrong, but I can’t get this to work. In my testing, ASK WHAT’S WORTH SEEING with a curly apostrophe will still cause a RTP with this change.
(I’ve only tested this in the macOS Inform IDE, though.)
However, the modified TryGivenObject()
still seems to fix all RTPs (both when used on its own and in combination with the replaced SpliceSnippet()
), so I’m going with that for now.
How about with a straight apostrophe?
I assume then that there is also another routine other than SpliceSnippet which is altering the parse buffer without updating num_words?
How does one get to the relevant part of Counterfeit Monkey so that I can debug this (I have never played it). Is it easy?
You’re not doing anything wrong. Despite the fact that you said to note that the RTPs were still happening only with curly apostrophes, I was not properly checking that case. I confirm that when the command is >ASK WHAT?S WORTH SEEING (or the version with curly apostrophe, which presents the same actual input to the parser), the RTPs still occur with only the modification to SpliceSnippet()
in place, regardless of its specified location.
To attempt to be ultra-clear to all involved (including myself), the following tables show what I am seeing. A command was run for each boolean cell across two compilations, one with just the TryGivenObject()
fix and one with just the SpliceSnippet()
fix:
Table 1 - P39 (RTP_SAYINVALIDSNIPPET) RTPs
---------- situation identification ---------- ---- fixed by modification? ----
Case # Entered Command TryGivenObject? SpliceSnippet?
==========================================================================================
1 >ASK WHAT'S WORTH SAYING [straight] YES YES
2 >ASK WHATʼS WORTH SEEING [curly] YES NO
3 >ASK WHAT?S WORTH SEEING [question mark] YES NO
Table 2 - P40 (RTP_SPLICEINVALIDSNIPPET) RTPs
---------- situation identification ---------- ---- fixed by modification? ----
Case # Entered Command TryGivenObject? SpliceSnippet?
==========================================================================================
1 >ASK WHAT'S WORTH SAYING [straight] YES YES
2 >ASK WHATʼS WORTH SEEING [curly] YES NO
3 >ASK WHAT?S WORTH SEEING [question mark] YES NO
Note that, in theory, cases #2 and #3 are logically identical because the curly apostrophe is changed to a question mark on its way into the command buffer.
The reason the RTPs are still occurring in the final column is still the same basic failure mode that I specified above: It is due to a call path originating from the same culprit rule (new strip interlocutor from input rule
), starting with routine SNIPPET_TY_to_TEXT_TY
. For the P39 error, the chain is:
SNIPPET_TY_to_TEXT_TY -> BlkValueCast -> TEXT_TY_Support -> TEXT_TY_Cast -> TEXT_TY_CastPrimitive -> PrintSnippet
This call chain does not involve SpliceSnippet()
. The chain results from the line:
let M be the substituted form of the matched text; [P39!]
which becomes:
! [5: let m be the substituted form of the matched text]
tmp_0 = I7SFRAME;
BlkValueCopy(tmp_0, TEXT_TY_SubstitutedForm((I7SFRAME+WORDSIZE*2), SNIPPET_TY_to_TEXT_TY((I7SFRAME+WORDSIZE*4),matched_text)));
I have to apologize for the confusion that I have contributed on this thread. As evidenced by my very first post (in which I mistyped a routine name), I have not been in top form the last few days. I should have noticed that the good doctor’s inclusion alters routine SpliceSnippet()
but not SnippetIncludes()
– it is the latter routine that sets matched_text
incorrectly, in the manner originally diagnosed.
If only SpliceSnippet()
is modified, then at this point in the rule’s logic, SnippetIncludes()
has still set global matched_text
to 500
, and this illegal snippet value is still being fed to PrintSnippet()
during the attempt to cast the matched text
snippet to text. That does not happen with my inclusion, because it alters the return value from TryGivenObject()
, which ensures that ParseTokenStopped()
returns a GPR_FAIL
result instead of an object ID, and thus the Consult_Grammar_NNNN()
routine is not confused into thinking that it has spotted something matching the search topic in the player's command
.
I’ve put in several hours on this one, and I think I’ve solved your reported problem in a definitive manner. It shouldn’t hurt to apply drpeterbatesuk’s inclusion along with the original fix, and doing so might prevent as-yet-unidentified trouble.
In a (6M62) debug build, answer the initial questions (Y, ANDRA), and type >GONEAR VOLUNTEER. Type ACTIONS and RULES ALL if you desire debug output. Then >ASK WHAT’S WORTH SEEING.
If it wasn’t clear, I am super grateful for all your hard work. Sorry for leading you down such a rabbit hole. As far as I am concerned, the bugs are fixed.
OK, great. It wasn’t clear that you considered the issue to be solved because no solution is marked.
It was a good problem, and I think at least two bug fixes will come out of it, so I’m happy to have put in the time.
@drpeterbatesuk, Another way to look at the cause here is that SnippetIncludes()
is accepting an object ID returned from ParseTokenStopped()
as evidence that text matching that object can be found in the snippet, which, as this thread illustrates, may not be true.
If this case were to return false
via code something like
[ SnippetIncludes test snippet w1 w2 wlen i j;
...
for (i=w1, j=wlen: j>0: i++, j--) {
rv = (test)(i, 0);
if (rv ~= GPR_FAIL && i - wn ~= 0 ) return i*100+wn-i; ! don't accept zero-length "match"
}
}
rfalse;
];
then matched_text
would never get an illegal value due to inference. This would head off trouble in related routines such as SpliceSnippet()
(though to be clear I think that your proposed change to that routine is definitely warranted).