Recognize and summarily block informal names

Hi folks! What I would like to do is to recognize and summarily block certain informal names. At the same time, I’d like to display their full names, possibly with a title. The code example at the bottom of this post more or less works in Inform 10+ (but not earlier – there are problems referring to a “topic of a person” variable), but I’d like to understand if the approach can be improved. In particular, what are the pros and cons of doing something like:

Understand "joe/joey/joseph" as a mistake.  ("You're scared that Doctor Joseph Sneed will discipline anyone foolish enough to address him by anything less than his full title.")

I don’t really think this is a mistake, in the sense that it’s a valid input to the game. And I’m not confident that mistake text can refer to the player’s input, for example to identify which form of the informal name was used.

In addition, I wonder if there’s a way to refer to the “topic of a person” variable so that I can avoid the repetition in

The informal-name-topic of Sneed is "joseph/joe/joey".  Understand "joseph/joe/joey" as Sneed.

Thanks!

Yields:

Informal Names

An Interactive Fiction

Release 1 / Serial number 231216 / Inform 7 v10.1.2 / D

Laboratory

A spartan laboratory with worn wooden counters running the length of the wall.

You can see a beaker, Madame Rose and Doctor Joseph Sneed here.

>x rose

You've never referred to Madame Rose as Rose, and you're not about to start now.

>x madame

You see nothing special about Madame Rose.

>give beaker to joe

You've never referred to Doctor Joseph Sneed as Joe, and you're not about to start now.

>give beaker to sneed

(first taking the beaker)

Doctor Joseph Sneed doesn't seem interested.

This is going to look dumb, but I bet it works!

A nickname is a kind of thing.

A nickname has a person called the nick-referent.

Before doing something:
	if the noun is a nickname and the second noun is a nickname:
		say "You've never referred to [nick-referent of the noun] as '[the noun][quotation mark][unicode 8212]nor [nick-referent of the second noun] as '[the second noun][quotation mark][unicode 8212]and you're not about to start now." instead;
	else if the noun is a nickname:
		say "You've never referred to [nick-referent of the noun] as '[the noun]' before[unicode 8212]and you're not about to start now." instead;
	else if the second noun is a nickname:
		say "You've never referred to [nick-referent of the second noun] as '[the second noun]' before[unicode 8212]and you're not about to start now." instead;

Joey is a nickname. Joey is part of Doctor Joseph Sneed.

Joe is a nickname. Joe is part of Doctor Joseph Sneed.

Oh but this would still allow referring to Doctor Joseph Sneed as plain old “Joseph”! Hold on.

Doctor Sneed is a man in Example Location. Printed name of Doctor Sneed is "Doctor Joseph Sneed". Understand "Joseph Sneed" as Doctor Sneed.

Joe is a nickname. Joe is part of Doctor Sneed.

Joey is a nickname. Joey is part of Doctor Sneed.

Joseph is a nickname. Joseph is part of Doctor Sneed.

I guess this doesn’t cover inputs like “Doctor Joe” or “Joey Sneed,” though. I guess that you’d have to implement separate nickname objects for every unacceptable input you can come up with.

Maybe it wasn’t such a hot idea after all!


(EDIT: Note that for the problem of interdicting the word choices for objects, mistakes are not really usable as a solution. See discussion in follow-on post below.)

Pros:

  • Mistakes do not count as turns, which may be desirable.

Cons:

  • Mistakes can only exclude cases, so all wrong forms of address must be covered.
  • Mistake detection operates on topics, so covering all of the possible combinations might be a pain.

You might consider looking into the Subcommands extension by Daniel Stelzer. It assigns the snippet of text matching a noun or second noun to the corresponding object(s) while parsing a command, which can then be checked to see whether they take a correct form. You would need to write special logic to prevent normal action processing, if that’s desired.


A couple of suggestions:

“The noun” is shorthand for “the noun part of the current action” (likewise “the second noun”), so cutting every “part of the current action” will simplify your code:

Before doing something when the noun is a person or the second noun is a person (this is the block informally referring rule):

You can define a token and use it both as a topic property and in an Understand phrase. That avoids the duplication, and also the typo in your code where rosie is omitted from one copy of the topic:

Understand "rose/rosie" as "[rose-informal]". The informal-name-topic of Rose is "[rose-informal]".  Understand "[rose-informal]" as Rose.

I worked up an example (in 6M62) using Subcommands that implements a correct address rulebook. [EDIT: Some modifications to the last section are posted further down; see discussion there.] If an improper form of address is used, then the turn sequence is aborted before an action is generated:

"Correct Address"

Include Subcommands by Daniel Stelzer.

Place is a room.

Doctor Joseph Sneed is a man. He is in Place. The description is "Old Doc Joe 'Sneezy' Sneed is a real stickler for correct forms of address." Understand "old" or "doc" or "joe" or "joey" or "sneezy" as Doctor Joseph Sneed.

Correct address rules are a person based rulebook. The correct address rules have outcomes proper address (success) and improper address (failure - the default).

Correct address for Doctor Joseph Sneed:
	if the subcommand of Doctor Joseph Sneed matches "doctor/-- joseph/-- sneed":
		proper address.

A turn sequence rule (this is the correct address required rule): [modify output as desired]
	if the person asked is not the player:
		follow the correct address rules for the person asked;
		if the outcome of the rulebook is the improper address outcome:
			instead say "Disallowed (person asked).";
	if the noun is not nothing:
		follow the correct address rules for the noun;
		if the outcome of the rulebook is the improper address outcome:
			instead say "Disallowed (noun).";
	if the second noun is not nothing:
		follow the correct address rules for the second noun;
		if the outcome of the rulebook is the improper address outcome:
			instead say "Disallowed (second noun).".

The correct address required rule is listed before the declare everything initially unmentioned rule in the turn sequence rules.

[Nouns aren't yet assigned when the correct address rules run, so a little I6 delving is necessary...]
To decide which object is parser input 1:
	(- parser_results-->INP1_PRES -).

To decide which object is parser input 2:
	(- parser_results-->INP2_PRES -).

This is the assign nouns rule: [a shortcut, not extensively tested]
	now the noun is parser input 1;
	now the second noun is parser input 2.

The assign nouns rule is listed before the correct address required rule in the turn sequence rules.

The player carries a test tube.

Test me with "x doctor sneed / give test tube to joseph sneed / doctor joseph sneed, take test tube / x joey / give test tube  to old doc joe / sneezy, take test tube".

Why not just write:

if parser input 1 is not nothing:

etc. ?

EDIT see discussion below: is it that noun and second noun need to be set anyway in order to use the subcommand of ...?


This is dubious, because:

(i) parser_results-->INPx_PRES is undefined (although in practice it will likely be 0) for code-generated actions (e.g. “try taking the aspidistra”). These directly set the globals inp1 and inp2 when they are meaningful to the action (when not meaningful, inp1 and inp2 are defined as 0). This includes any individual actions generated from a multiple object list. parser_results-->INPx_PRES are (potentially) set only for directly parser-generated actions. EDIT: Thinking about it, though, given that we are doing this in a turn sequence rule before action generation, this code will only run before parser-generated actions, and not before code-generated actions.

(ii) The contents of parser_results-->INP2_PRES is undefined when parser_results-->NO_INPS_PRES is 1, as also is parser_results-->INP1_PRES when parser_results-->NO_INPS_PRES is 0. Although again, in practice, when not meaningful and therefore undefined they appear likely to be 0.

(iii) even when containing a meaningful result of a parser-generated action, parser_results-->INPx_PRES may not be a valid I7 object- it may be 1 to indicate this part of the action is a value or topic (found, in an I7 context, in the I6 global parsed_number, which in the case of a topic is equivalent to the snippet variable the topic understood.) It may also be 0 to indicate the parser has parsed multiple objects and written them to the multiple object list (e.g. after Take all) but 0 is at least equivalent to nothing and can therefore be assigned to the I7 object variables noun or second noun without provoking a runtime error.

So, in general, the I7 variable noun should be assigned inp1, not parser_results-->INP1_PRES, and then only when inp1 is not 1 (in which case it should always be 0 (equivalent to nothing) or a valid I7 object). Likewise with second noun and inp2.

EDIT: However, I see that both the assign nouns rule and the correct address required rule are scheduled to run in the turn sequence rules before the generate action rule, which means that none of the global variables set under the generate action rule (which includes inp1 and inp2 as well as noun and second) are yet available. So if noun and second are to be prematurely set here, it would need to be done carefully to maintain type-safety, as is normally done under the generate action rule. Probably better just to make sure that parser input 1 and parser input 2 are nothing or a valid I7 object and use these in the correct address required rule, without setting noun or second- or thinking about it, am I missing that noun and second need to be set to use the subcommand of ...?
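
A minimal, untested sketch of that last suggestion, assuming the turn sequence setup from the Correct Address example: read parser_results directly, but hand back nothing unless the slot is in use and holds an object. The phrase "guarded parser input" and the I6 routine GuardedParserInput are names made up for this illustration; only parser_results, NO_INPS_PRES and INP1_PRES come from the template code.

To decide which object is guarded parser input (N - a number):
	(- GuardedParserInput({N}) -).

Include (-
! Hypothetical helper, not part of the I6 template.
! Returns slot n (1 or 2) of parser_results, or nothing if the slot is
! unused or holds 1 (a value or topic rather than an object).
! A 0 (multiple object list) also comes back as nothing.
[ GuardedParserInput n v;
	if (n < 1 || n > parser_results-->NO_INPS_PRES) return nothing;
	v = parser_results-->(INP1_PRES + n - 1); ! relies on INP2_PRES == INP1_PRES + 1
	if (v == 1) return nothing;
	return v;
];
-).

The correct address required rule could then test "if guarded parser input 1 is not nothing" and follow the correct address rules for that object, without touching noun or second noun. Commands that produce a multiple object list (e.g. TAKE ALL) would not be checked at all under this sketch.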

small-print parser output is summarised in more detail below:

Parser Output Variables

Section - Output Variables

[
We use almost identically the same parser as that in the I6 library, since it is a well-proven and understood
algorithm. The I6 parser returns some of its results in a supplied array- parser_results- but others are in global variables:

(1) The parser_results array holds four words, used as indexed by the constants below.
(a) The action can be a valid I6 action number, or an I6 “fake action”, a concept not used overtly in I7. Most valid I6 actions
correspond exactly to I7 actions, but in principle it is possible to define (say) extra debugging commands entirely at the I6 level.
(b) The count NO_INPS_PRES is always 0, 1 or 2, and then that many of the next two words are meaningful. inp1 and inp2 are
initialised to 0 during action generation, then if meaningful assigned parser_results-->INPx_PRES.
(c) Each meaningful “inp” value is either 0, meaning “put the multiple object list here”; or 1, meaning “not an object but a value”; or
a valid I6 object. (We use the scoping rules to ensure that any I6 object visible to the parser is also a valid I7 object,
so – unlike with actions – we need not distinguish between the two.)

(2) The global variable actor is set to the person asked to carry out the command, or is the same as player if nobody was mentioned.
Thus it will be the object for Floyd in the command FLOYD, GET PERMIT, but will be just player in the command EAST.

(3) The global variables special_number1 and, if necessary, special_number2 hold values corresponding to the first and second
of the “inps” to be returned as 1. Thus, if one of the “inps” is a value and the other is an object, then special_number1 is that value;
only if both are values rather than objects will special_number2 be used. There is no indication of the kind of these values: I6 is typeless.

(4) At most one of the “inps” is permitted to be 0, referring to a multiple object list. (And a multiple value list is forbidden.)
If this happens, the list of objects is stored in an I6 table array (i.e., with the 0th word being the number of subsequent words) called multiple_object, and the parser will have set the toomany_flag if an overflow occurred – that is, if the list was truncated because
it originally called for more than 63 objects.

(5) The global variable meta is set if the action is one marked as such in the I6 grammar. A confusion in the design of I6 is that
being out of world, as we would say in I7 terms, is associated not with an action as such but with the command verb triggering it.
(This in practice caused no trouble since we never used, say, the word SAVE for both saving the game and saving, I don’t know,
box top coupons.) The state of meta returned by the I6 parser does not quite correspond to I7’s “out of world” concept, so we
will alter it in a few cases.

Some of these conventions are a little odd-looking now: why not simply have a larger results array, rather
than this pile of occasionally used variables? The reasons are purely historical: the I6 parser developed
gradually over about a decade.

Constant ACTION_PRES = 0; ! index into parser_results-->
Constant NO_INPS_PRES = 1;
Constant INP1_PRES = 2;
Constant INP2_PRES = 3; ! NB Parser.i6t code assumes this is INP1_PRES + 1
]

Section - Parser Output Variables and the Current Action

[

For what are, again, historical reasons to do with the development of I6, the current action is recorded in a slate of global variables:

(1) actor is as above; action is the I6 action number or fake action number, though in I7 usage no fake actions should ever reach this point.

(2) act_requester is the person (in practice, the player) requesting that another actor should perform the action, or nothing if the action is requested by the computer or is the actor’s own choice. In practice, then, this is a binary: the act_requester is ALWAYS either the player or nothing. In the case of a stored action, this binary (with a requested action being ‘1’) is stored as the least-significant bit of the REQUEST element of a stored action array (the 4th and 5th bits being used to indicate a stored action with a topic in either the inp1 (4th bit) or inp2 (5th bit) positions). In an action immediately generated by the command FLOYD, MOP FLOOR, the act_requester is the player and the actor is Floyd (this corresponds to actions described as ‘asking to try…’). But for the action subsequently generated (Floyd mopping the floor), or for an action arising from a normal command (‘(the player) mopping the floor’) or from a try phrase such as “try Floyd mopping the floor”, act_requester is nothing, because in each case by the time the action is carried out it is either the actor’s own decision to do this, or simply the will of the computer. (The computer, of course, represents the will-power of all characters other than the player.)

(3) inp1 and inp2 are global variables whose contents mean the same as those of parser_results-->INP1_PRES and parser_results-->INP2_PRES.
(This is not duplication, because actions also arise from “try” rather than the parser, in which case parser_results-->INP1_PRES and parser_results-->INP2_PRES may not be set, but inp1 and inp2 may be.)

(4) The variable multiflag is set during the processing of a multiple object list, and clear otherwise. (It is used for instance by the Standard Rules to give more concise reports of some successful actions.)
Note that it remains set during any knock-on actions caused by actions in the multiple object list: the Generate Action Rule which translates parser results to action globals is the only place where multiflag is set or cleared.

(5) noun and second are global variables which are initially set equal to inp1 and inp2 when the latter hold valid object numbers, and are equal to nothing otherwise. (This is not duplication either, because it provides us with type-safe access to objects: there is no KOV which can safely represent inp1 and inp2, but noun and second are valid for the I7 kind of value “object”.) If a computer-converted action does not take a second noun (e.g. when a Remove action is converted to a Take action in ‘Carry out removing’), noun and inp1 will be set to the object currently being taken and second will be set to zero, but inp2 will not (because the action is not generated de novo and so inp1 and inp2 are not reinitialised). inp2 will remain as the inp2 generated by the base action (in this case, what the objects are being taken from). Conversely, if for example a Remove action is diverted in ‘Before removing something: try taking the noun instead’ then both inp2 and second will be set to zero in generating the Take action.

The Generate Action Rule creates this set of variables for the action or multiple action(s) suggested by the parser: each action is sent on to BeginAction for processing. Once done, we reset the above variables in what might seem an odd way: we allow straightforward actions by the player to remain in the variables, but convert requests to other people to the neutral “waiting” action carried out by the player (which is the zero
value for actions).
Now, in a better world, we would always erase the action like this, because an action once completed ought to be forgotten. The value of noun ought to be visible only during the action’s processing. But in practice many I7 users write “every turn” rules which are predicated on what the turn’s main action was: say, “Every turn when going: …” The every turn stage is not until later in the turn sequence, so such rules can only work if we keep the main parser-generated action of the turn in the action variables when we finish up here: so that’s what we do. (Note that BeginAction preserves the values of the action variables, storing copies on the stack, so whatever may have happened during action processing, we finish this routine and move on through the turn sequence with the same action variable values that we set at the beginning from the parser.)

So in summary:

*actor (player or an NPC) (in I7 == ‘the person asked’/‘the person reaching’, or during action processing ‘actor’)
act_requester (0 =>(computer’s/actor’s own) volition, otherwise player.)
*multiple_object--> (table, -->0 is number of entries)
*toomany_flag (overflow flag for above)
*meta (out-of-world action)
multiflag (set when multiple action processing begins, remains set until end of multiple action processing)

*parser_results   -->ACT_PRES   -->NO_INPS_PRES    -->INP1_PRES                             -->INP2_PRES
                        |                              |                                        |
                        |                              |   _____try  doing  something ...____   |
                        |                              |  /                                  \  |
                        |                             inp1____________  *parsed_number        inp2____________  *parsed_number
                        |                             /       \       \     |                 /       \       \     |
                        |                         object?      0       1    |              object?     0       1    |  
                        |                          /           |        \   |               /          |        \   | 
                      action                    noun       *multiple  *special_         second     *multiple  *special_
                                                             object    number1                       object    number2
                                                              list                                    list              
* set by parser, rest set by action setup rules

parsed_number is the result of a token returning GPR_NUMBER, or indeed in I7 any token that returns a value which is not an object. These are represented in Inform 7 by phrases like “the topic understood”, “the time understood” etc. ‘the topic understood’ is defined in the standard rules as a snippet which varies, translated into I6 as “parsed_number”, but the rest seem to be special compiler-generated phrases, which translate either to parsed_number itself or in say phrases to a routine to print parsed_number in the context of its KOV.

special_number1 and special_number2 may be values, topics, numbers, times- indeed, anything that’s not an object. These did not exist in the original I6 parser- the numerical values of these values would just be stored in noun and second. With Inform 7, to ensure that I7 type-safety is maintained, noun and second can only be an object or zero, so other numerical values are stored in special_number1 and special_number2 instead.
However, in practice Inform 7 (unlike Inform 6) only allows an action to contain a single value that is not an object, which is always found in parsed_number, so special_number1 and special_number2 are redundant.
Although redundant, special_number1 will contain the same number as parsed_number, except in the case of a topic understood, for which INPx_PRES and inpx will be 1 but the snippet is stored only in parsed_number, and special_number1 may either be 0 or, if the first word of the topic is a dictionary word, the address of that dictionary word. When a command comprises a sequence that fails all grammar parsing following a person being addressed (e.g. ‘Darcy, what is your opinion of Mr Bennet’), auto-generating an ‘answering Darcy that “…”’ action, then special_number1 will be 0 even if the first word of the topic is a dictionary word. Since there is only ever one non-object value in I7 actions, special_number2 is never used and is always 0.

]

Section - Monitoring parser/action variables

Trace-level is a number which varies. The trace-level variable translates into I6 as "parser_trace".

First before doing something when trace-level is at least 6:
	say "[bold type][the current action][roman type][line break]";
	say "NO_INPS_PRES: [no_inps_pres]."; [number of valid inputs parsed- 0, 1 or 2]
	say "INP1_PRES: [inp1_pres]."; [working variables for the parser, representing presumptive 1st and 2nd input object/values]
	say "INP2_PRES: [inp2_pres]."; [0=> use multiple object list; 1=>value not object (in parsed_number), else object]
	say "multiple_object-->0: [multiple_object-->0]."; [first entry in multiple object list, == number of entries]
	say "inp1: [inp1]."; [set by action generator, taken either from parser equivalents or from 'try ...' invocation]
	say "inp2: [inp2]."; [0=> use multiple object list; 1=>value not object (in parsed_number), else object]
	say "noun: [noun]."; [== I7 'noun': set by action generator, to be object if inp1/2 are objects, else zero]
	say "second: [second noun]."; [== I7 'second noun']
	say "special_number1: [special_number1]."; [redundant in I7 == parsed_number, or zero if parsed_number is topic understood]
	say "special_number2: [special_number2]."; [redundant in I7, always zero]
	say "parsed_number: [parsed_number]."; [holds any action parameter that is not an object (only 1 such allowed in I7)]
	say line break; [NB unlike rest, parsed_number and special_number1 aren't reset to 0 each action]
	continue the action. [when 0 or a stale/unrelated number, 'the topic understood' may be an invalid snippet]

To say no_inps_pres: (- print parser_results-->NO_INPS_PRES; -).
To say inp1_pres: (- print parser_results-->INP1_PRES; -).
To say inp2_pres: (- print parser_results-->INP2_PRES; -).
To say inp1: (- print inp1; -).
To say inp2: (- print inp2; -).
To say special_number1: (- print special_number1; -).
To say special_number2: (- print special_number2; -).
To say parsed_number: (- print parsed_number; -).
To say multiple_object-->0: (- print multiple_object-->0; -).


No, it’s because the original version that I crafted stole the logic from GENERATE_ACTION_R() to do the same processing of parser_results-->INPx_PRES and inpX values through to noun values that it normally does. In the interest of making the inclusions look less scary, I stripped it down to direct access before posting, but in the process left functionality at the mercy of potentially undefined (or misdefined) internal globals, as you say, and also managed to break it. ([Rule "haste makes waste rule" applies.])

The concerns about use of internals that you point out are absolutely valid. It worked for the test me, but to be safer:

[Nouns aren't yet assigned when the correct address rules run, so a little I6 delving is necessary...]

The pregenerate nouns rule translates into I6 as "ConstructNouns".

Include (-

! extracted from GENERATE_ACTION_R
[ ConstructNouns    i ;

	inp1 = 0; inp2 = 0; multiflag = false;
	if (parser_results-->NO_INPS_PRES >= 1) {
		inp1 = parser_results-->INP1_PRES;
		if (inp1 == 0) multiflag = true;
	}
	if (parser_results-->NO_INPS_PRES >= 2) {
		inp2 = parser_results-->INP2_PRES;
		if (inp2 == 0) multiflag = true;
	}

	if (inp1 == 1) noun = nothing; else noun = inp1;
	if (inp2 == 1) second = nothing; else second = inp2;

];

-).

This is the assign nouns rule:
	follow the pregenerate nouns rule.

The assign nouns rule is listed before the correct address required rule in the turn sequence rules.

Arguably it would be good to do the same processing of the multiple action processing rulebook in the above as is seen in the generate action rule, but since changes wouldn’t be reversible, it’s skipped.

Side question: In the documentation that you posted above, does Output Variables 1b have a typo with respect to inpX and parser_results-->INPx_PRES values? (I mean where it says “… then if meaningful assigned to …”)

A small note for @Draconis: I noticed that if an object is both the noun and the second noun, then it ends up with the same subcommand for both instances, even if different words were used. In this case, a command like >GIVE JOE TO SNEED ends up passing muster because the parsed_snippet value for the person ends up corresponding to “SNEED” only. I don’t think there’s an easy way to change this given the approach, but you may want to add a note about this in the extension’s documentation.

Should be ‘…then if meaningful assigned…’ (I’ve edited)

Many thanks for the responses, folks. I read this forum regularly and it’s awesome that some of the most knowledgeable people contribute so generously.

@drpeterbatesuk I like the simplifications you suggest and will incorporate them.

@otistdog thanks for digging deep into this question (I’m glad I asked it!) but could I request more context? I’m trying to understand the relative advantages of each approach. It looks like the rulebook you propose would

  1. Give flexibility defining the correct form of address, including things like considering location, items worn, etc.
  2. Give flexibility defining when the correct form of address applies, including things like current scene, etc.

Is there a more structural advantage – a class of inputs that would be handled with the rulebook but not otherwise – that I’m not appreciating?

Thanks!

Yeah, it’s an unfortunate limitation and not one I can think of a good way to fix. (Or rather—I can think of technical ways to fix it, but none of them make things easy on the end user, since they all involve making multiple properties that need to be checked.)

As I understand the problem you’ve presented, it’s that you want to allow only certain combinations of words to be processed normally as an object or addressee in a player command. You want other words to be recognized as applicable to some objects but not accepted as part of legitimate commands.

I never really make use of the Understand ... as a mistake... construction, so I didn’t notice that the example of it that you proposed in your original post:

Understand "joe/joey/joseph" as a mistake.  ("You're scared...

isn’t really a workable approach in the first place. Trying that approach (after correcting the placement of the parenthesized part) generates a Problem Message indicating that the first word of the topic can’t have alternates. Even if done on a per-word basis as in:

Understand "joe" as a mistake.  ("You're scared...
Understand "joey" as a mistake.  ("You're scared...
...

it won’t do what you expect, because the mistake topic is supposed to match the entire command. A command like >X JOEY won’t be recognized as a mistake, and a command like >JOEY, LOOK will be handled (probably unexpectedly) as a multi-part command triggering a mistake then a looking action.

As you note, the nature of rulebooks means that what constitutes correct address at any point can vary based on circumstances through addition of superseding rules, but the rulebook approach is not perfect. For example, if using the default telling it about action the command >TELL MADAME ROSE ABOUT JOEY will pass muster as correct address because the word “JOEY” is being processed as a topic, not an object. I don’t think that would be caught by the informal-name-topic approach, either, because even though the search for disallowed words is done purely on the basis of the command text, it is done using the informal-name-topic coming from the noun (here Madame Rose). You could implement new object-based actions and grammar lines for the >TELL command to try to address this.
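
A rough, untested sketch of that last idea, assuming the Subcommands-based correct address rules from the earlier example are in place. The action name "informing it about" is invented for illustration, and the report text is only a placeholder:

Informing it about is an action applying to two visible things.

Understand "tell [someone] about [any person]" as informing it about.

Report informing someone about someone:
	say "[The noun] listens politely to what you have to say about [the second noun]."

With a grammar line like this, JOEY in >TELL MADAME ROSE ABOUT JOEY should be parsed as an object (the second noun) instead of as a topic, so the same address checks could apply to it. I would expect the object-based line to be tried ahead of the built-in topic-based one, but that ordering, and whether Subcommands records a subcommand for an "[any person]" match, would need checking.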

The core problems I perceived are:

  1. matching player command text to objects
  2. examining matched text to see if it meets certain criteria
  3. responding appropriately (i.e. reflecting the incorrect usage) when criteria are not met
  4. minimizing the work of defining correct address criteria

Items 1 and 3 call for functionality that is not provided out-of-the-box; that’s why Subcommands is used. Items 2 and 4 together suggest that definition of correct forms of address (a presumably limited subset of allowed name word combinations) is the more desirable approach; that’s why a correct address rulebook is used.

I’m going to mark @otistdog’s final summary as an answer, since anybody with similar questions will almost certainly be able to use this thread to work through a solution. Thanks again, everybody!
