Disambiguation check failure (possibly due to truncating)

Gwen · January 1, 2023, 1:03am

I want to apologize in advance in case I’m not including everything that’s relevant to this issue since I’m not really sure what is going on.

In short, the problem is that after the parser asks me to disambiguate between two doors, it responds with “You can’t see any such thing”. For example:

Room 01
You see Bedroom Door and Front Door.

Go through door

Do you mean Bedroom Door or Front Door?

Bedroom Door

You can’t see any such thing.

When I try to reproduce this in a minimal project (with only the doors and the room, and none of the other code in my current project), the problem doesn’t arise. I turned on trace 5 in both projects to see what the difference was, and noticed that in the minimal project the trace returns:

Go through door

[ “go” go / “through” through / “door” door ]

Which do you mean Bedroom Door or Front Door?

Bedroom Door

[token resulted in re-parse request]
[ “go” go / “through” through / “bedroom” bedroom / “door” door / “door” door ]

My current project, however, results in:

[token resulted in re-parse request]
[ “go” go / “through” through / “bedr” ? ]

That is what I get after trying to disambiguate. This seems to indicate that for some reason the parser is truncating “bedroom door” to “bedr”, so that it fails to find any object matching that name. But I’m unsure whether this is truly what’s going on and, if so, what in my code could be causing it.

I’ve checked whether my “understand … as …” statements were affecting this, but the minimal project doesn’t show the problem with them included. I also wondered whether other things containing the word “bedroom” were responsible, so I added a bedroom room to the minimal project, and that didn’t reproduce the problem either. So I’m assuming something else in my project is causing this, but I’m really not sure where to start looking. If anyone has any insight into what could be causing this truncation, or knows whether truncation is even the problem, that would be greatly appreciated.

Draconis · January 1, 2023, 1:46am

Do you have any “after reading a command” rules?

Gwen · January 1, 2023, 3:39am

Yes, I’m using this code:

After reading a command:
	let the command read be "[the player's command]";
	replace the text "'s" in the command read with " [']s";
	change the text of the player's command to the command read.

This is to read apostrophes. However, including just this code in the minimal version doesn’t reproduce the problem.

Draconis · January 1, 2023, 4:21am

In that case the best I can recommend is:

Make a “test me” script that will reproduce the problem (test me with "trace 5 / go through door" or the like)
Comment out volumes of your code one by one and see if the problem persists
Then narrow it down to books, then parts, then chapters, then sections

Or, decide at what point it’s narrowed down enough to post an example here that reproduces the problem, and we can take a look at it. I have no idea what would be truncating the words like that except “after reading a command” rules and certain use options.
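
As a rough sketch of the first step, a test script along these lines could be added to the story. The third command is an assumption based on the transcript in the first post (answering the disambiguation question with “bedroom door”), so adjust the commands to match your own project:

```inform7
[Hypothetical script: the commands mirror the transcript earlier in the thread.]
Test me with "trace 5 / go through door / bedroom door".
```

Typing TEST ME in a non-release build should then replay the three commands, so the trace output can be compared each time a volume, book or section is commented out.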

drpeterbatesuk · January 1, 2023, 12:01pm

What extensions are you using, if any?

The pasting of responses to ‘which do you mean’ questions into the player’s command happens at a very basic parser level, in NounDomain(), so there’s something very weird going on here.

Try:

First after reading a command: say "The player's command started as [the player's command].".
Last after reading a command: say "The player's command finished as [the player's command].".

as the last lines in your story,
just to double check nothing is interfering with the player’s command after it is reconstructed…

drpeterbatesuk · January 1, 2023, 12:41pm

Hmm. Is your new project in Ver 10? This seems to be a major bug introduced in Ver 10, whereby the player’s command is not correctly reset after a ‘which do you mean…’ insertion. Try compiling your project in Ver 9.3 (if possible) and see if it then works…

drpeterbatesuk · January 1, 2023, 6:34pm

The problem is that after the reply to the ‘Which do you mean…’ question has been inserted into the buffer holding the player’s command, this modified command buffer has to be retokenised, so that the parse buffer holds an updated list of words and their lengths. This ordinarily happens eventually when the full parser is rerun on the modified command, meaning there usually isn’t an issue, but there is a short window between the command buffer being updated and the full parser being rerun where the command buffer and the parse buffer are potentially out of sync, and this is where the problem lies…

In your given example, the original command ‘go through door’ is tokenised into 3 words of length 2, 7 and 4 characters. After ‘Bedroom Door’ is typed in response to the ‘Which do you mean…’ question, the command buffer reads “go through bedroom door door”, which should immediately be retokenised to a parse buffer of 5 words of length 2, 7, 7, 4, 4. In V10 this retokenising step is incorrectly omitted. NounDomain() then runs the ‘After reading a command’ rules prior to rerunning the full parser on the new command.

This means that when your ‘After reading a command’ rule runs at the end of NounDomain(), Inform believes the player’s command is still only 3 words, of length 2, 7 and 4 characters, and so “[the player’s command]” comes out as “go through bedr”. Your rule stores this in the command read, then sets it in stone as the typed command in the command buffer with ‘change the text of the player’s command to the command read’. ‘Change the text of the player’s command to…’ always immediately retokenises the new command text (in this case to the three words of length 2, 7, 4: ‘go’ ‘through’ ‘bedr’) to avoid the command buffer and the parse buffer getting out of sync.

Because the problem is created by a combination of this bug and your ‘After reading a command’ rule, the problem should disappear if you comment out the ‘After reading a command’ rule (unless there is a similar one somewhere else in an extension you are using).

However, to keep your rule, you need a workaround to make it work in Ver 10:

To retokenise: (- VM_Tokenise(buffer, parse); players_command = 100 + WordCount(); -).

After reading a command:
	retokenise;
	let the command read be "[the player's command]";
	replace the text "'s" in the command read with " [']s";
	change the text of the player's command to the command read.

EDIT: if you have (or might have, including in extensions) more than one ‘After reading a command’ rule, you’d be better doing this instead, at the end of the story file, to make sure that retokenising happens only once and before any other ‘After reading a command’ rules get a chance to run:

First after reading a command: retokenise.

This forces an extra retokenising of the command buffer after every reading of a command, which is wasted time and effort for most commands, but fixes the problem after NounDomain() has requested a clarification.
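
Putting the pieces from this post together, the whole workaround might be arranged as below. This is only a sketch that reuses the retokenise phrase and the apostrophe rule quoted earlier in the thread; the bracketed comments are added for illustration, and it assumes that no extension declares a later ‘First after reading a command’ rule of its own:

```inform7
To retokenise: (- VM_Tokenise(buffer, parse); players_command = 100 + WordCount(); -).

[The apostrophe rule from earlier in the thread, with no retokenise line of its own.]
After reading a command:
	let the command read be "[the player's command]";
	replace the text "'s" in the command read with " [']s";
	change the text of the player's command to the command read.

[Kept at the very end of the story so that it is listed ahead of every other rule in the rulebook.]
First after reading a command: retokenise.
```

With this arrangement the retokenising happens once per command, before the apostrophe rule reads the player’s command.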

I think the only other workaround, pending a formal bugfix from @GrahamNelson, would be to hack in a replacement NounDomain() routine that includes the missing retokenising, which you might not fancy.

But if you did, this patched NounDomain() (containing just one additional line of code) will do the job…

EDIT: I’ve inserted a call to the I6 entry point LanguageToInformese() in order to fully reconstruct the original version of NounDomain(). This is an I6 routine provided primarily for translators of Inform to non-English languages, to bring input more into line with the parser’s generally anglocentric expectations, for example by separating suffixes/affixes into separate words or removing inflections (see DM4 for more details). In the standard English Inform, it’s an empty routine: [ LanguageToInformese; ]; The kinds of things that might have been done in this routine are now generally (as in the case of this thread) done in ‘After reading a command’ rules.

Include (-

[ NounDomain domain1 domain2 context dont_ask first_word i j k l answer_words marker;
	if ((parser_trace >= 4)) {
		print "   [NounDomain called at word ";
		print wn;
		print " (domain1 ";
		PrintShortName(domain1);
		print ", domain2 ";
		PrintShortName(domain2);
		print ")^";
		print "   ";
		if (indef_mode) {
			print "seeking indefinite object: ";
			if (((indef_type)&(OTHER_BIT))) {
				print "other ";
			}
			if (((indef_type)&(MY_BIT))) {
				print "my ";
			}
			if (((indef_type)&(THAT_BIT))) {
				print "that ";
			}
			if (((indef_type)&(PLURAL_BIT))) {
				print "plural ";
			}
			if (((indef_type)&(LIT_BIT))) {
				print "lit ";
			}
			if (((indef_type)&(UNLIT_BIT))) {
				print "unlit ";
			}
			if ((indef_owner ~= 0)) {
				print "owner:";
				PrintShortName(indef_owner);
			}
			print "^";
			print "   number wanted: ";
			if ((indef_wanted == INDEF_ALL_WANTED)) {
				print "all";
			} else {
				print indef_wanted;
			}
			print "^";
			print "   most likely GNAs of names: ";
			print indef_cases;
			print "^";
		} else {
			print "seeking definite object^";
		}
	}
	(match_length = 0);
	(number_matched = 0);
	(match_from = wn);
	SearchScope(domain1, domain2, context);
	if ((parser_trace >= 4)) {
		print "   [ND made ";
		print number_matched;
		print " matches]^";
	}
	(wn = (match_from + match_length));
	if ((number_matched == 0)) {
		(wn)++;
		rfalse;
	}
	if ((match_from <= num_words)) {
		if ((number_matched == 1)) {
			(i = (match_list-->(0)));
			return i;
		}
		if ((wn <= num_words)) {
			(i = NextWord());
			(wn)--;
			if ((i ~= AND1__WD or AND2__WD or AND3__WD or comma_word or THEN1__WD or THEN2__WD or THEN3__WD or BUT1__WD or BUT2__WD or BUT3__WD)) {
				if ((lookahead == ENDIT_TOKEN)) {
					rfalse;
				}
			}
		}
	}
	(number_of_classes = 0);
	if ((number_matched == 1)) {
		(i = (match_list-->(0)));
		if ((((indef_mode == 1)) && ((((indef_type)&(PLURAL_BIT)) ~= 0)))) {
			if ((context == MULTI_TOKEN or MULTIHELD_TOKEN or MULTIEXCEPT_TOKEN or MULTIINSIDE_TOKEN or NOUN_TOKEN or HELD_TOKEN or CREATURE_TOKEN)) {
				BeginActivity(DECIDING_WHETHER_ALL_INC_ACT, i);
				if (((ForActivity(DECIDING_WHETHER_ALL_INC_ACT, i)) && (RulebookFailed()))) {
					rfalse;
				}
				EndActivity(DECIDING_WHETHER_ALL_INC_ACT, i);
			}
		}
	}
	if ((number_matched > 1)) {
		(i = 1);
		if ((number_matched > 1)) {
			for ((j = 0):(j < (number_matched - 1)):(j)++) {
				if ((Identical((match_list-->(j)), (match_list-->((j + 1)))) == 0)) {
					(i = 0);
				}
			}
		}
		if (i) {
			(dont_infer = 1);
		}
		(i = Adjudicate(context));
		if ((i == -1)) {
			rfalse;
		}
		if ((i == 1)) {
			rtrue;
		}
		(dont_infer_pronoun = 1);
	}
	if ((i ~= 0)) {
		if (dont_infer) {
			return i;
		}
		if ((inferfrom == 0)) {
			(inferfrom = pcount);
		}
		((pattern-->(pcount)) = i);
		return i;
	}
	if (dont_ask) {
		return (match_list-->(0));
	}
	if ((match_from > num_words)) {
		jump Incomplete;
	}
	BeginActivity(ASKING_WHICH_DO_YOU_MEAN_ACT);
	if (ForActivity(ASKING_WHICH_DO_YOU_MEAN_ACT)) {
		jump SkipWhichQuestion;
	}
	(j = 1);
	(marker = 0);
	for ((i = 1):(i <= number_of_classes):(i)++) {
		while (((((match_classes-->(marker)) ~= i)) && (((match_classes-->(marker)) ~= (-(i)))))) {
			(marker)++;
		}
		if ((~~(((match_list-->(marker)) has animate)))) {
			(j = 0);
		}
	}
	if (j) {
		PARSER_CLARIF_INTERNAL_RM(65);
	} else {
		PARSER_CLARIF_INTERNAL_RM(66);
	}
	(j = number_of_classes);
	(marker = 0);
	for ((i = 1):(i <= number_of_classes):(i)++) {
		while (((((match_classes-->(marker)) ~= i)) && (((match_classes-->(marker)) ~= (-(i)))))) {
			(marker)++;
		}
		(k = (match_list-->(marker)));
		if (((match_classes-->(marker)) > 0)) {
			DefArt(k);
		} else {
			IndefArt(k);
		}
		if ((i < (j - 1))) {
			print ", ";
		}
		if ((i == (j - 1))) {
			if (((KIT_CONFIGURATION_BITMAP)&(SERIAL_COMMA_TCBIT))) {
				if ((j ~= 2)) {
					print ",";
				}
			}
			PARSER_CLARIF_INTERNAL_RM(72);
		}
	}
	print "?^";
	.SkipWhichQuestion;
	EndActivity(ASKING_WHICH_DO_YOU_MEAN_ACT);
	.WhichOne;
	(answer_words = Keyboard(buffer2, parse2));
	(first_word = (parse2-->(1)));
	if ((first_word == ALL1__WD or ALL2__WD or ALL3__WD or ALL4__WD or ALL5__WD)) {
		if ((context == MULTI_TOKEN or MULTIHELD_TOKEN or MULTIEXCEPT_TOKEN or MULTIINSIDE_TOKEN)) {
			(l = (multiple_object-->(0)));
			for ((i = 0):(((i < number_matched)) && (((l + i) < MATCH_LIST_WORDS))):(i)++) {
				(k = (match_list-->(i)));
				((multiple_object-->(((i + 1) + l))) = k);
			}
			((multiple_object-->(0)) = (i + l));
			rtrue;
		}
		PARSER_CLARIF_INTERNAL_RM(67);
		jump WhichOne;
	}
	for ((i = 1):(i <= answer_words):(i)++) {
		if ((WordFrom(i, parse2) == comma_word)) {
			VM_CopyBuffer(buffer, buffer2);
			jump RECONSTRUCT_INPUT;
		}
	}
	if ((first_word == 0)) {
		(j = wn);
		(first_word = LanguageIsVerb(buffer2, parse2, 1));
		(wn = j);
	}
	if ((first_word ~= 0)) {
		(j = (first_word->(#dict_par1)));
		if ((((0 ~= ((j)&(1)))) && ((~~(LanguageVerbMayBeName(first_word)))))) {
			VM_CopyBuffer(buffer, buffer2);
			jump RECONSTRUCT_INPUT;
		}
	}
	(k = (WordAddress(match_from) - buffer));
	(l = ((buffer2-->(0)) + 1));
	for ((j = ((buffer + INPUT_BUFFER_LEN) - 1)):(j >= ((buffer + k) + l)):(j)-- ) {
		((j->(0)) = (j->((-(l)))));
	}
	for ((i = 0):(i < l):(i)++) {
		((buffer->((k + i))) = (buffer2->((WORDSIZE + i))));
	}
	((buffer->(((k + l) - 1))) = 32);
	((buffer-->(0)) = ((buffer-->(0)) + l));
	if (((buffer-->(0)) > (INPUT_BUFFER_LEN - WORDSIZE))) {
		((buffer-->(0)) = (INPUT_BUFFER_LEN - WORDSIZE));
	}
	.RECONSTRUCT_INPUT;
	(num_words = WordCount());
	(players_command = (100 + num_words));
	(wn = 1);
	!########################## INSERTION COMMENCES #################################
	LanguageToInformese();
	VM_Tokenise(buffer, parse);
	!############################ INSERTION FINISHES ###################################
	(num_words = WordCount());
	(players_command = (100 + num_words));
	(actors_location = ScopeCeiling(player));
	FollowRulebook((Activity_after_rulebooks-->(READING_A_COMMAND_ACT)));
	return REPARSE_CODE;
	.Incomplete;
	if ((context == CREATURE_TOKEN)) {
		PARSER_CLARIF_INTERNAL_RM(68, actor);
	} else {
		PARSER_CLARIF_INTERNAL_RM(69, actor);
	}
	print "^";
	(answer_words = Keyboard(buffer2, parse2));
	for ((i = 1):(i <= answer_words):(i)++) {
		if ((WordFrom(i, parse2) == comma_word)) {
			VM_CopyBuffer(buffer, buffer2);
			jump RECONSTRUCT_INPUT;
		}
	}
	(first_word = (parse2-->(1)));
	if ((first_word == 0)) {
		(j = wn);
		(first_word = LanguageIsVerb(buffer2, parse2, 1));
		(wn = j);
	}
	if ((first_word ~= 0)) {
		(j = (first_word->(#dict_par1)));
		if ((((0 ~= ((j)&(1)))) && ((~~(LanguageVerbMayBeName(first_word)))))) {
			VM_CopyBuffer(buffer, buffer2);
			jump RECONSTRUCT_INPUT;
		}
	}
	if ((inferfrom ~= 0)) {
		for ((j = inferfrom):(j < pcount):(j)++) {
			if (((pattern-->(j)) == PATTERN_NULL)) {
				continue;
			}
			(i = (WORDSIZE + (buffer-->(0))));
			((buffer-->(0)))++;
			((buffer->((i)++)) = 32);
			if ((parser_trace >= 5)) {
				print "[Gluing in inference at ";
				print j;
				print " with pattern code ";
				print (pattern-->(j));
				print "]^";
			}
			((parse2-->(1)) = 0);
			if (((((pattern-->(j)) >= 2)) && (((pattern-->(j)) < REPARSE_CODE)))) {
				if ((dont_infer_pronoun == 0)) {
					PronounNotice((pattern-->(j)));
					for ((k = 1):(k <= (LanguagePronouns-->(0))):(k = (k + 3))) {
						if (((pattern-->(j)) == (LanguagePronouns-->((k + 2))))) {
							((parse2-->(1)) = (LanguagePronouns-->(k)));
							if ((parser_trace >= 5)) {
								print "[Using pronoun '";
								print (address) (parse2-->(1));
								print "']^";
							}
							break;
						}
					}
				}
			} else {
				((parse2-->(1)) = VM_NumberToDictionaryAddress(((pattern-->(j)) - REPARSE_CODE)));
				if ((parser_trace >= 5)) {
					print "[Using preposition '";
					print (address) (parse2-->(1));
					print "']^";
				}
			}
			if (((parse2-->(1)) ~= 0)) {
				(k = (buffer + i));
				(k = Glulx_PrintAnyToArray((buffer + i), (INPUT_BUFFER_LEN - i), (parse2-->(1))));
				(i = (i + k));
				((buffer-->(0)) = (i - WORDSIZE));
			}
		}
	}
	(i = (WORDSIZE + (buffer-->(0))));
	((buffer-->(0)))++;
	((buffer->((i)++)) = 32);
	for ((j = 0):(j < (buffer2-->(0))):((i)++,(j)++)) {
		((buffer->(i)) = (buffer2->((j + WORDSIZE))));
		((buffer-->(0)))++;
		if (((buffer-->(0)) == INPUT_BUFFER_LEN)) {
			break;
		}
	}
	jump RECONSTRUCT_INPUT;
];
-) replacing "NounDomain".

Zed · January 1, 2023, 6:47pm

wow, nice find.

Gwen · January 2, 2023, 10:31pm

Thank you for this very in-depth explanation of the issue, as well as the multiple solutions. I will probably stick with the simple retokenising solution you suggest (paired with the ‘first after reading a command’ rule), but hopefully this will get fixed. I’m unfamiliar with the bug reporting process for I7. Have you already reported this, or if not, do you know where I should go to report it?

drpeterbatesuk · January 2, 2023, 10:35pm

Glad to have helped!
Yes, I have reported it. The bug reporting site is currently here.