(Bug report) When typing a command acting on a single indefinite object using any matching plural name, the parser interprets this as 'all' matching objects rather than just one

This is a particular problem for words like ‘sheep’ or ‘fowl’ which are ambiguously plural, e.g.

"Counting Sheep" by PB

The Pasture is a room.

A sheep is a kind of thing.

The Herdwick, the Scottish Blackface and the Texel are sheep in the pasture.

A bird is a kind of thing. Understand "fowl" as the plural of bird.

The duck, the drake, the goose and the gander are birds in the pasture.

The red cow, the brown cow and the black cow are things in the pasture.

The bay horse, the dappled horse and the piebald horse are things in the pasture.
Understand "horses" as the plural of the bay.
Understand "horses" as the plural of the dappled.
Understand "horses" as the plural of the piebald.

after which:

Counting Sheep
An Interactive Fiction by PB
Release 1 / Serial number 240109 / Inform 7 v10.1.2 / D

Pasture
You can see a Herdwick, a Scottish Blackface, a Texel, a duck, a drake, a goose, a gander, a red cow, a brown cow, a black cow, a bay horse, a dappled horse and a piebald horse here.

>take a sheep
Herdwick: Taken.
Scottish Blackface: Taken.
Texel: Taken.

>drop one sheep
Texel: Dropped.
Scottish Blackface: Dropped.
Herdwick: Dropped.

>take one bird
You can’t see any such thing.

>take one birds
duck: Taken.
drake: Taken.
goose: Taken.
gander: Taken.

>drop one fowl
gander: Dropped.
goose: Dropped.
drake: Dropped.
duck: Dropped.

>take one cow
(the red cow)
Taken.

>take one horses
bay horse: Taken.
dappled horse: Taken.
piebald horse: Taken.

>drop two horses
piebald horse: Dropped.
dappled horse: Dropped.

This seems to be a longstanding bug, previously discussed here but never fixed.

I’ve reported this in the bug tracker.

Adding

A venomous snake is a kind of thing.

The adder, the viper and the rattlesnake are venomous snakes in the pasture.

gives

>take one venomous
adder: Taken.
viper: Taken.
rattlesnake: Taken.

>drop one snakes
rattlesnake: Dropped.
viper: Dropped.
adder: Dropped.

>take one venomous snakes
adder: Taken.
viper: Taken.
rattlesnake: Taken.

because both ‘venomous’ and ‘snakes’ are added as plural names to all these venomous snakes:

Array name_array27 --> [ 'adder'; 'venomous//p'; 'snakes//p'; ];
Array name_array28 --> [ 'viper'; 'venomous//p'; 'snakes//p'; ];
Array name_array29 --> [ 'rattlesnake'; 'venomous//p'; 'snakes//p'; ];

But interestingly

A stick insect is a kind of thing.

The green stick insect, the yellow stick insect and the orange stick insect are stick insects in the pasture.

yields

>take one stick
(the green stick insect)
Taken.

>take one insects
yellow stick insect: Taken.
orange stick insect: Taken.

because ‘stick’ and ‘insect’ are added as singular names to the insects’ name property, whereas ‘insects’ but not ‘stick’ is added as plural name:

Array name_array30 --> [ 'green'; 'stick'; 'insect'; 'insects//p'; ];
Array name_array31 --> [ 'yellow'; 'stick'; 'insect'; 'insects//p'; ];
Array name_array32 --> [ 'orange'; 'stick'; 'insect'; 'insects//p'; ];

It seems that when words of an object’s name also form part of the plural of the kind name, they are not added as plural names. Generally, the last word in the plural kind name (in this case, ‘insects’) will be added because it is pluralised and ‘insects’ doesn’t clash with ‘insect’.

This can be overridden by, for example, Understand "stick" as the plural of the green stick insect. In that case ‘stick’ is matched as a plural from within a parse_name property. The same applies to Understand "stick" as the plural of a stick insect.

Things get really wild if we have

A stick insect is a kind of thing.

The green stick insect, the yellow stick insect and the orange stick insect are stick insects in the pasture.
The purple insect and the violet insect are stick insects in the pasture.

which is compiled to

Array name_array30 --> [ 'green'; 'stick'; 'insect'; 'insects//p'; ];
Array name_array31 --> [ 'yellow'; 'stick'; 'insect'; 'insects//p'; ];
Array name_array32 --> [ 'orange'; 'stick'; 'insect'; 'insects//p'; ];
Array name_array33 --> [ 'purple'; 'insect'; 'stick//p'; 'insects//p'; ];
Array name_array34 --> [ 'violet'; 'insect'; 'stick//p'; 'insects//p'; ];

and results in:

>take one stick
green stick insect: Taken.
yellow stick insect: Taken.
orange stick insect: Taken.
purple insect: Taken.
violet insect: Taken.

>drop one insects
violet insect: Dropped.
purple insect: Dropped.
orange stick insect: Dropped.
yellow stick insect: Dropped.
green stick insect: Dropped.

because the parser interprets this as ‘take all things matching “stick”’ when even just one object matches ‘stick’ as a plural name.

So, for example, also with

The green stick insect, the yellow stick insect and the orange stick insect are stick insects in the pasture.
Understand "stick" as the plural of the green stick insect.

we get

>take one stick
green stick insect: Taken.
yellow stick insect: Taken.
orange stick insect: Taken.

even though the green stick insect is the only one recognised as having “stick” as a plural as well as a singular name.

EDIT: Similarly, in the case of an ambiguous plural like ‘sheep’: after A Herdwick sheep is a sheep in the pasture, ‘sheep’ will be added to the Herdwick’s name property only as a singular name and not as a plural name. Unless there are other things in scope that do have ‘sheep’ as a plural name, ‘take one sheep’ will then be parsed correctly and only one sheep will be taken:

The Herdwick sheep, the Scottish Blackface sheep and the Texel sheep are sheep in the pasture.

>take one sheep
(the Herdwick sheep)
Taken.

but

The Herdwick sheep, the Scottish Blackface sheep and the Texel are sheep in the pasture.

>take one sheep
Herdwick sheep: Taken.
Scottish Blackface sheep: Taken.
Texel: Taken.

because in the latter case the Texel has ‘sheep’ as a plural name (‘sheep//p’).

@otistdog has kindly reminded me that the issue of plural forms is even more nuanced than described above!

In the situation where a name word is flagged as a plural in a name property (as it is when it is part of a plural kind name that doesn’t clash with any of the other names of an object of that kind), what counts is the consequence for construction of the dictionary: if a word is flagged just once anywhere in the source as being a plural form, then that word’s entry will end up flagged in the dictionary as a (potential) plural form. Any given word can have only one entry in the dictionary, which must either be flagged as a potential plural form or not. This is because when (for example) the I6 compiler encounters a ‘sheep//p’ entry in the name property of the Texel object declaration in I6 source, it takes this as an instruction to (i) enter ‘sheep’ into the game dictionary if it’s not already there, (ii) flag ‘sheep’ in the game dictionary as a potential plural form, and (iii) compile ‘sheep’ into the name property of the Texel game object. Note that ‘sheep’ is not flagged as plural in the name property of the Texel game object, which just contains ‘sheep’; it is the dictionary entry for ‘sheep’ that is flagged as a potentially plural form, as a side effect of the ‘//p’ suffix in the I6 source.

The side-effect of that is that this word is now always treated as a potential plural form for any object using that name word anywhere in the game. In the example given above, then, it makes no difference where the Texel is- merely the fact that it exists marks ‘sheep’ as a potentially plural form and ‘take one sheep’ will therefore take all things in scope matching the word ‘sheep’ in their name whether the Texel is in scope or not.
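
As a rough illustration of that side effect, here is a toy model in Python. It is not the I6 compiler or the real parser: the function names (build_dictionary, take_one) and the data layout are invented for this sketch, and picking the first match for an unflagged word stands in for the parser's real disambiguation.

```python
# Toy model only: not the real I6 compiler or parser.
# One dictionary entry per word; a '//p' suffix anywhere flags that entry.

def build_dictionary(name_arrays):
    """Map each word to True if any name array marks it with '//p'."""
    plural_flag = {}
    for names in name_arrays.values():
        for entry in names:
            word, _, suffix = entry.partition("//")
            plural_flag[word] = plural_flag.get(word, False) or suffix == "p"
    return plural_flag

def take_one(word, in_scope, name_arrays, plural_flag):
    """Return the objects a command like 'take one <word>' would act on."""
    matches = [obj for obj in in_scope
               if word in {e.partition("//")[0] for e in name_arrays[obj]}]
    if plural_flag.get(word, False):
        return matches        # flagged word: treated as 'all matching'
    return matches[:1]        # unflagged word: a single object (simplified)

name_arrays = {
    "Herdwick sheep": ["herdwick", "sheep"],
    "Scottish Blackface sheep": ["scottish", "blackface", "sheep"],
    "Texel": ["texel", "sheep//p"],
}
flags = build_dictionary(name_arrays)

# The Texel is not in scope, yet its '//p' still flags 'sheep' globally.
taken = take_one("sheep", ["Herdwick sheep", "Scottish Blackface sheep"],
                 name_arrays, flags)
print(taken)
```

In this sketch both sheep in scope are returned even though neither carries the plural marker itself; removing the Texel from name_arrays and rebuilding the dictionary makes the same call return a single sheep.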

This is different again if a plural form is marked as such using (for example) Understand "stick" as the plural of the green stick insect. As mentioned above, an Understand phrase like this is compiled not to a ‘stick//p’ entry in the green stick insect’s name property, but to a parse_name routine that matches the word ‘stick’ from the dictionary (which will remain a singular dictionary form unless marked as a plural form separately elsewhere in the source) and flags to the parser by a special means that in the context of this particular object ‘stick’ should be regarded as a potentially plural form. The details are given in the DM4 p. 214.

Plurals recognised in this way, via an Understand ... as the plural of ... phrase, do behave as described in the original post above, i.e. if the word is matched as plural for an object in scope then the word will be treated as a plural applying to all objects in scope matching the same word.

So, in the example above, if the green stick insect is moved out of scope then ‘stick’ no longer behaves as a plural for the yellow or orange stick insect, and ‘take a stick’ or ‘take one stick’ will take the yellow or the orange stick insect; ‘take the stick’ or ‘take stick’ will result in ‘Which do you mean, the yellow stick insect or the orange stick insect?’. When the green stick insect is in scope, ‘take one stick’, ‘take a stick’, ‘take stick’ and ‘take the stick’ all result in every stick insect present being taken.
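
The scope-dependent behaviour can be sketched with another toy model in Python. Again this is only an illustration under assumptions: resolve and its arguments are hypothetical names, the plural table stands in for what a parse_name routine reports for each object, and the ‘insects//p’ dictionary plural is left out.

```python
# Toy model only: per-object plural matching, loosely as with parse_name.

def resolve(word, in_scope, singular, plural, indefinite=True):
    """Return (outcome, objects) for a command naming `word`."""
    matches = [o for o in in_scope
               if word in singular.get(o, ()) or word in plural.get(o, ())]
    if any(word in plural.get(o, ()) for o in in_scope):
        return "all", matches     # one plural match makes the word plural for all
    if len(matches) > 1 and not indefinite:
        return "ask", matches     # 'Which do you mean...?'
    return "one", matches[:1]     # indefinite: a single object (simplified)

singular = {
    "green stick insect": {"green", "stick", "insect"},
    "yellow stick insect": {"yellow", "stick", "insect"},
    "orange stick insect": {"orange", "stick", "insect"},
}
# Understand "stick" as the plural of the green stick insect.
plural = {"green stick insect": {"stick"}}

everyone = list(singular)
without_green = ["yellow stick insect", "orange stick insect"]

print(resolve("stick", everyone, singular, plural))
print(resolve("stick", without_green, singular, plural))
print(resolve("stick", without_green, singular, plural, indefinite=False))
```

With the green stick insect in scope the sketch reports all three insects; without it, an indefinite command gets one insect and a definite one gets a disambiguation request.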

I think part of what you’re seeing here results from the fact that the parser doesn’t really pay much attention to the details of descriptor words (including the indefinite article). Would these changes help? (They’re in 6M62 format, but the few changed lines are marked.)

some parser guts
Include (-

! ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
! Parser.i6t: Parse Token Letter F
! ==== ==== ==== ==== ==== ==== ==== ==== ==== ====

	! Happy or unhappy endings:

  .PassToken;
	if (many_flag) {
	    single_object = GPR_MULTIPLE;
	    multi_context = token;
	}
	else {
	    if (indef_mode == 1 && indef_type & PLURAL_BIT ~= 0) {
		  	if (token == MULTIEXCEPT_TOKEN or MULTIINSIDE_TOKEN) multi_context = token;
	        if (indef_wanted < INDEF_ALL_WANTED && indef_wanted > 1) {
	            multi_had = 1; multi_wanted = indef_wanted;
	            etype = TOOFEW_PE;
	            jump FailToken;
	        }
	    }
	}
	return single_object;

  .FailToken;

	! If we were only guessing about it being a plural, try again but only
	! allowing singulars (so that words like "six" are not swallowed up as
	! Descriptors)

	if (allow_plurals && indef_guess_p == 1) {
	    #Ifdef DEBUG;
	    if (parser_trace >= 4) print "   [Retrying singulars after failure ", etype, "]^";
	    #Endif;
	    prev_indef_wanted = indef_wanted;
	    allow_plurals = false;
	    wn = desc_wn;
	    jump TryAgain;
	}

	if ((indef_wanted > 1 || prev_indef_wanted > 1) && (~~multiflag)) etype = MULTI_PE;	! MODIFIED

	return GPR_FAIL;

]; ! end of ParseToken__

-) instead of "Parse Token Letter F" in "Parser.i6t".

Include (-

! ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
! Parser.i6t: Parsing Descriptors
! ==== ==== ==== ==== ==== ==== ==== ==== ==== ====

[ Descriptors  o x flag cto type n;
	ResetDescriptors();
	if (wn > num_words) return 0;

	for (flag=true : flag :) {
	    o = NextWordStopped(); flag = false;

	   for (x=1 : x<=LanguageDescriptors-->0 : x=x+4)
	        if (o == LanguageDescriptors-->x) {
	            flag = true;
	            type = LanguageDescriptors-->(x+2);
	            if (type ~= DEFART_PK) indef_mode = true;
	            indef_possambig = true;
	            indef_cases = indef_cases & (LanguageDescriptors-->(x+1));
		if (type == INDEFART_PK && indef_cases & $$111000111000) indef_wanted = 1;	! ADDED

	            if (type == POSSESS_PK) {
	                cto = LanguageDescriptors-->(x+3);
	                switch (cto) {
	                  0: indef_type = indef_type | MY_BIT;
	                  1: indef_type = indef_type | THAT_BIT;
	                  default:
	                    indef_owner = PronounValue(cto);
	                    if (indef_owner == NULL) indef_owner = InformParser;
	                }
	            }

	            if (type == light)  indef_type = indef_type | LIT_BIT;
	            if (type == -light) indef_type = indef_type | UNLIT_BIT;
	        }

	    if (o == OTHER1__WD or OTHER2__WD or OTHER3__WD) {
	        indef_mode = 1; flag = 1;
	        indef_type = indef_type | OTHER_BIT;
	    }
	    if (o == ALL1__WD or ALL2__WD or ALL3__WD or ALL4__WD or ALL5__WD) {
	        indef_mode = 1; flag = 1; indef_wanted = INDEF_ALL_WANTED;
	        if (take_all_rule == 1) take_all_rule = 2;
	        indef_type = indef_type | PLURAL_BIT;
	    }
	    if (allow_plurals) {
	    	if (NextWordStopped() ~= -1 or THEN1__WD) { wn--; n = TryNumber(wn-1); } else { n=0; wn--; }
	        if (n == 1) { indef_mode = 1; flag = 1; indef_wanted = 1; }	! MODIFIED
	        if (n > 1) {
	            indef_guess_p = 1;
	            indef_mode = 1; flag = 1; indef_wanted = n;
	            indef_nspec_at = wn-1;
	            indef_type = indef_type | PLURAL_BIT;
	        }
	    }
	    if (flag == 1 && NextWordStopped() ~= OF1__WD or OF2__WD or OF3__WD or OF4__WD)
	        wn--;  ! Skip 'of' after these
	}
	wn--;
	return 0;
];

[ SafeSkipDescriptors;
	@push indef_mode; @push indef_type; @push indef_wanted;
	@push indef_guess_p; @push indef_possambig; @push indef_owner;
	@push indef_cases; @push indef_nspec_at;
	
	Descriptors();
	
	@pull indef_nspec_at; @pull indef_cases;
	@pull indef_owner; @pull indef_possambig; @pull indef_guess_p;
	@pull indef_wanted; @pull indef_type; @pull indef_mode;
];

-) instead of "Parsing Descriptors" in "Parser.i6t".

Then you get interaction like:

>TAKE ALL HORSES
bay horse: Taken.
dappled horse: Taken.
piebald horse: Taken.

>DROP A HORSE
(the piebald horse)
Dropped.

>DROP ONE HORSE
(the dappled horse)
Dropped.

>TAKE SOME HORSES
dappled horse: Taken.
piebald horse: Taken.

I think your diagnosis and remedy are correct.

At least, so far I haven’t managed to break it!

With this fix in place, the ambiguous ‘take sheep’ (which could in English mean ‘take the sheep’ (singular) or ‘take the sheep’ (plural), equivalent to ‘take all the sheep’) is parsed with a singular meaning, which aligns with the parser’s general prejudice for interpreting ambiguous phrasings as having singular meaning where possible e.g.:

"Four Candles" by PB

The Agent's Office is a room.  The table is a supporter in the office.

A script is a kind of thing. A script called Four Candles is on the table.
After printing the name of Four Candles when listing nondescript items: say " (a comedy script for The Two Ronnies)".
A candle is a kind of thing. Four candles are on the table.

after which:

Four Candles
An Interactive Fiction by PB
Release 1 / Serial number 240117 / Inform 7 build 6M62 (I6/v6.41 lib 6/12N) SD

Agent’s Office
You can see a table (on which are Four Candles (a comedy script for The Two Ronnies) and four candles) here.

>take four candles
Taken.

>i
You are carrying:
Four Candles

The way this works under the hood is that on parsing the command ‘take four candles’ the parser

(i) initially assumes that we are looking for four items matching ‘candles’ and looks to create a list of items matching that word
(ii) each time it finds an object matching ‘candles’ it goes back to check whether its name actually could match the full phrase ‘four candles’
(iii) if the parser finds at least one object that matches ‘four candles’, it assumes that the player meant to refer to (one of) that/those object(s) rather than to any four objects matching just ‘candles’.

Hence here it chooses taking ‘Four Candles’ over taking four ‘candles’.
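
Steps (i) to (iii) can be sketched as a toy model in Python. This is an assumption-laden simplification, not the real parser: take_counted is a hypothetical name, each object is reduced to the set of single words it answers to, and the order-dependent effects of walking the object tree are ignored.

```python
# Toy model only: illustrates steps (i)-(iii), not the real parser.

def take_counted(count_word, count, noun, objects):
    """objects: list of (label, set of single words the object answers to)."""
    # (i) assume we want `count` items matching the noun alone
    by_noun = [(label, words) for label, words in objects if noun in words]
    # (ii) for each noun match, check whether it also matches the number word
    full = [label for label, words in by_noun if count_word in words]
    # (iii) any full-phrase match wins over counting out `count` items
    if full:
        return full
    return [label for label, _ in by_noun][:count]

candles = [("candle", {"candle", "candles"})] * 4
script = ("Four Candles", {"four", "candles"})

# Understood only as the whole phrase, so neither word matches alone.
handles_phrase_only = ("fork 'andles", {"fork", "andles"})
# Understand "four" or "candles" as the fork 'andles.
handles_either_word = ("fork 'andles", {"fork", "andles", "four", "candles"})

print(take_counted("four", 4, "candles", [script] + candles))
print(take_counted("four", 4, "candles", candles + [handles_phrase_only]))
print(take_counted("four", 4, "candles", candles + [handles_either_word]))
```

The first call picks the script, the second falls through to the four candles because the fork 'andles never match ‘candles’ alone, and the third picks the fork 'andles once they answer to each word separately.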

This method of working leads to some interesting side effects. If we replace the Four Candles script with some fork 'andles:

Some fork 'andles are on the table.
Understand "four candles" as the fork 'andles.

then we get

Agent’s Office
You can see a table (on which are four candles and some fork 'andles) here.

>take four candles
candle: Taken.
candle: Taken.
candle: Taken.
candle: Taken.

because, as written, the parser will only match the full phrase “four candles” with the fork 'andles object, not ‘four’ or ‘candles’ alone. It therefore can’t match the fork 'andles against just the word ‘candles’ (step (ii)), so it never goes back to check whether they might actually match ‘four candles’. The fork 'andles remain unmatched and the parser goes on to process the list of four candles that it has matched. Change the Understand phrase to Understand "four" or "candles" as the fork 'andles and the parser will treat the fork 'andles object in the same way as it did the script, because it can now match just ‘candles’ alone against the fork 'andles object:

Agent’s Office
You can see a table (on which are four candles and some fork 'andles) here.

>take four candles
Taken.

>i
You are carrying:
some fork 'andles

If we have the script, the candles and the fork 'andles (matched only against the full phrase “four candles”) present, then even more weirdness ensues:

Some fork 'andles are on the table.
Understand "four candles" as the fork 'andles.
A candle is a kind of thing. Four candles are on the table.
A script is a kind of thing. A script called Four Candles is on the table. After printing the name of Four Candles when listing nondescript items: say " (a comedy script for The Two Ronnies)".

Agent’s Office
You can see a table (on which are some fork 'andles, four candles and Four Candles (a comedy script for The Two Ronnies)) here.

>take four candles
Taken.

>i
You are carrying:
Four Candles

Here the parser is not making a match against the fork 'andles, and chooses Four Candles as the sole match…but switch the order in which the script and the fork 'andles are declared (and therefore switch which is matched against first by the parser as it works its way through the object tree):

A script is a kind of thing. A script called Four Candles is on the table. After printing the name of Four Candles when listing nondescript items: say " (a comedy script for The Two Ronnies)".
A candle is a kind of thing. Four candles are on the table.
Some fork 'andles are on the table.
Understand "four candles" as the fork 'andles.

Agent’s Office
You can see a table (on which are Four Candles (a comedy script for The Two Ronnies), four candles, and some fork 'andles) here.

>take four candles
Four Candles: Taken.
fork 'andles: Taken.

and now both the fork 'andles and Four Candles are matched. This behaviour relates to the method discussed above by which the parser approaches ‘four candles’.

In the latter example, where the parser tries matching Four Candles first, it begins by trying a match against ‘candles’ alone, then, having matched against ‘candles’, steps back to try matching against ‘four candles’. Having made a match, it thereafter matches any in-scope objects not yet considered (in this case the candles and fork 'andles) against ‘four candles’, not just ‘candles’ alone. Consequently the fork 'andles are matched, but the candles (which have no ‘four’ in their name) are not.

However, in the former example, where the parser tries matching against the fork 'andles first, it begins by matching against ‘candles’ alone and makes no match. It then goes on to make successful matches of the four candles against ‘candles’ alone, but then matches Four Candles against ‘candles’ alone, steps back to match against ‘four candles’ and makes a match. It therefore rejects the previous matches against the candles, and it doesn’t go back to retry matching any of the previously considered objects against ‘four candles’, so it misses the opportunity to make a match against the fork 'andles. This bug is a side-effect of the fork 'andles matching against ‘four candles’ but not ‘candles’ alone.

We would probably expect that, rather than taking both Four Candles and the fork 'andles, the parser would ask ‘Which do you mean, the Four Candles or the fork 'andles?’. This apparent misbehaviour occurs because the word ‘candles’ is compiled in the dictionary tagged as a plural form (because it appears as ‘candles//p’ in the name properties of the candles), so having rejected ‘take four candles’ as meaning ‘take four (objects matching) candles’, the command is interpreted by the parser as ‘take (all objects matching) four candles’.

If we avoid ‘candles’ being tagged as a plural form in the dictionary by omitting the candles:

Some fork 'andles are on the table.
Understand "four candles" as the fork 'andles.
A candle is a kind of thing.
A script is a kind of thing. A script called Four Candles is on the table. After printing the name of Four Candles when listing nondescript items: say " (a comedy script for The Two Ronnies)".

Agent’s Office
You can see a table (on which are some fork 'andles and Four Candles (a comedy script for The Two Ronnies)) here.

>take four candles
Taken.

>i
You are carrying:
Four Candles

and switching the order of declaration (so that both Four Candles and fork 'andles get matched- see above):

A script is a kind of thing. A script called Four Candles is on the table. After printing the name of Four Candles when listing nondescript items: say " (a comedy script for The Two Ronnies)".
A candle is a kind of thing.
Some fork 'andles are on the table.
Understand "four candles" as the fork 'andles.

Agent’s Office
You can see a table (on which are Four Candles (a comedy script for The Two Ronnies) and some fork 'andles) here.

>take four candles
Which do you mean, Four Candles or the fork 'andles?

If you would prefer treatment of words like “sheep” and “fish” to default to the plural interpretation (assuming the declarations support the word being marked as plural), then it’s possible to modify TryGivenObject():

6M62 inclusion
Include (-

! ==== ==== ==== ==== ==== ==== ==== ==== ==== ====
! Parser.i6t: TryGivenObject
! ==== ==== ==== ==== ==== ==== ==== ==== ==== ====

[ TryGivenObject obj nomatch threshold k w j;
	#Ifdef DEBUG;
	if (parser_trace >= 5) print "    Trying ", (the) obj, " (", obj, ") at word ", wn, "^";
	#Endif; ! DEBUG

	if (nomatch && obj == 0) return 0;

! if (nomatch) print "*** TryGivenObject *** on ", (the) obj, " at wn = ", wn, "^";

	dict_flags_of_noun = 0;

!  If input has run out then always match, with only quality 0 (this saves
!  time).

	if (wn > num_words) {
		if (nomatch) return 0;
	    if (indef_mode ~= 0)
	        dict_flags_of_noun = $$01110000;  ! Reject "plural" bit
	    MakeMatch(obj,0);
	    #Ifdef DEBUG;
	    if (parser_trace >= 5) print "    Matched (0)^";
	    #Endif; ! DEBUG
	    return 1;
	}

!  Ask the object to parse itself if necessary, sitting up and taking notice
!  if it says the plural was used:

	if (obj.parse_name~=0) {
	    parser_action = NULL; j=wn;
	    k = RunRoutines(obj,parse_name);
	    if (k > 0) {
	        wn=j+k;

	      .MMbyPN;

	        if (parser_action == ##PluralFound)
	            dict_flags_of_noun = dict_flags_of_noun | 4;

	        if (dict_flags_of_noun & 4) {
	            if (~~allow_plurals) k = 0;
	            else {
	                if (indef_mode == 0) {
	                    indef_mode = 1; indef_type = 0; indef_wanted = 0;
	                }
	                indef_type = indef_type | PLURAL_BIT;
	                if (indef_wanted == 0) indef_wanted = INDEF_ALL_WANTED;
	            }
	        }

	        #Ifdef DEBUG;
	        if (parser_trace >= 5) print "    Matched (", k, ")^";
	        #Endif; ! DEBUG
	        if (nomatch == false) MakeMatch(obj,k);
	        return k;
	    }
	    if (k == 0) jump NoWordsMatch;
	}

	! The default algorithm is simply to count up how many words pass the
	! Refers test:

	parser_action = NULL;

	w = NounWord();

	if (w == 1 && player == obj) { k=1; jump MMbyPN; }

	if (w >= 2 && w < 128 && (LanguagePronouns-->w == obj)) { k = 1; jump MMbyPN; }

	if (Refers(obj, wn-1) == 0) {
	    .NoWordsMatch;
	    if (indef_mode ~= 0) { k = 0; parser_action = NULL; jump MMbyPN; }
	    rfalse;
	}

	threshold = 1;
	! BEGIN MODIFICATION
	if (obj hasnt ambigpluralname)
		dict_flags_of_noun = (w->#dict_par1) & $$01110100;
	! END MODIFICATION
	w = NextWord();
	while (Refers(obj, wn-1)) {
		threshold++;
		if (w && obj hasnt ambigpluralname)	! MODIFIED
		   dict_flags_of_noun = dict_flags_of_noun | ((w->#dict_par1) & $$01110100);
		w = NextWord();
	}

	k = threshold;
	jump MMbyPN;
];

-) instead of "TryGivenObject" in "Parser.i6t".

The I6 ambigpluralname attribute is mapped to the I7 ambiguously plural property in the Standard Rules, so adding:

A sheep is usually ambiguously plural.

will tell the parser to prefer the singular interpretation of the word.

It might be necessary to make a similar modification to DetectPluralWord() in some cases.