Possible to help tokenizer with "St. Jude"?

My game takes place in a monastery, so there are a few statues of saints around. A play tester tried “X ST. JUDE” to examine the statue. I have ‘st’ and ‘saint’ in the name property, but I can’t usefully put ‘st.’ in there, because the tokenizer has long since decided that the period is a full stop between actions, as in “north. south”. This confused my (not-IF-experienced) tester, who didn’t try “st Jude” or “saint Jude”. Is there any reasonable way to help them out without hacking deeply into the tokenizer?

My best idea so far is to have a BeforeParsing that scans the input for “ST.” and changes the period to something like a “Z”, and then put stz as an alias. But perhaps there’s something easier.

(Or does BeforeParsing happen before tokenizing?)

The usual I6 approach is to run through the command, look for sequences like s t ., and turn the period into a space. Then re-tokenize and re-parse the whole thing. Unfortunately, I don’t remember off the top of my head which entry point routines are used for this.

Daniel’s advice is sound. BeforeParsing is a good entry point for this.

Tokenization has already happened (the opcode that is used to read player input also tokenizes it), but if you’re compiling to z5 you can re-tokenize input after modification.
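A minimal sketch of the re-tokenizing route, for z5 and later only. It assumes the usual v5 input-buffer layout (character count in buffer->1, text starting at buffer->2) and that the game's input and parse arrays are named buffer and parse; treat both as assumptions to check against your library version. It only touches a period that follows a standalone "st", so "east." is left alone. Note that it changes the text in the input buffer itself.

```inform6
#IfV5;
[ BeforeParsing _i _len;
  _len = buffer->1;                 ! number of characters typed
  ! The typed text occupies buffer->2 .. buffer->(_len+1).
  for (_i = 2 : _i <= _len - 1 : _i++) {
    if (buffer->_i == 's' && buffer->(_i+1) == 't' && buffer->(_i+2) == '.'
        && (_i == 2 || buffer->(_i-1) == ' ')) {
      buffer->(_i+2) = ' ';         ! turn the period into a space
    }
  }
  @tokenise buffer parse;           ! rebuild the parse array from the edited text
];
#EndIf;
```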

On the other hand, you can just scan the parse array, and if you find ‘st’ followed by ‘.//’ you remove the entry for the ‘.//’, shortening the array. This works equally well in z3.
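A rough sketch of that parse-array approach, as an alternative BeforeParsing routine. It relies on the standard Z-machine parse table layout (word count in parse->1, then one two-word entry per input word) and assumes the parser reads the word count from parse->1 after BeforeParsing returns; I haven't verified that second assumption against the PunyInform source.

```inform6
[ BeforeParsing _i _j _len;
  _len = parse->1;                  ! number of words in the parse array
  for (_i = 2 : _i <= _len : _i++) {
    ! Entry n keeps its dictionary address in parse-->(2n-1)
    ! and its length/position bytes in parse-->(2n).
    if (parse-->(2*_i - 1) == './/' && parse-->(2*_i - 3) == 'st') {
      ! Shift every later entry down one slot, overwriting the period.
      for (_j = _i : _j < _len : _j++) {
        parse-->(2*_j - 1) = parse-->(2*_j + 1);
        parse-->(2*_j)     = parse-->(2*_j + 2);
      }
      _len--;
      parse->1 = _len;              ! the array is now one word shorter
    }
  }
];
```

With this, “X ST. JUDE” reaches the parser as “X ST JUDE”, and the input buffer is left untouched.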

This seems to work fine:

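[ BeforeParsing _i _len;
  _len = NumberWords();
  ! Start at word 2 so WordValue(_i-1) never reads before word 1.
  for (_i = 2 : _i <= _len : _i++) {
    if (WordValue(_i) == './/' && WordValue(_i-1) == 'st') {
      ! Overwrite the period's dictionary entry with 'st'.
      (parse-2)-->(_i*2) = 'st';
    }
  }
];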

It’s turning it into “X ST ST JUDE”, but that works fine.

Thanks, @Draconis and @fredrik !

That solution was my first impulse. Then I thought it would produce ugly feedback to the player if the command isn’t fully understood. Now I think that’s not a problem - if the parser prints part of player input, it should print the text in the input buffer, and that hasn’t been modified. Yeah, I think it’s a solid solution.

The orPrefixSuffix extension is an example of handling this. It’s a standard library extension, but it should be easy to port to Puny. It’s documented on page 21 of the orLibrary User’s Guide. In a nutshell, it just scans the input text and replaces periods which follow known abbreviations. Here’s the text from the orLug:

orPrefixSuffix

This extension allows the use of prefixes and suffixes which are followed by
periods (for example, “Col. Mustard” or “Mr. Anderson”).

Normal Behavior

Consider an object named “Mrs. Robinson”…

object -> mrsRobinson "Mrs. Robinson" has proper
 with description "Loved by Beatles. And Jesus.",
name 'mrs' 'robinson';

In normal English, we might refer to the character like so:

examine Mrs. Robinson

But the default behavior of the Standard Library is to stop parsing the input at
the period and treat the text as two separate commands (EXAMINE MRS and
ROBINSON), producing the following:

Loved by Beatles. And Jesus.
That’s not a verb I recognize.

Revised Behavior

This extension scans user input and removes periods which follow common
general prefixes and suffixes (e.g. “Mr.”, “Mrs.”, “Dr.”, “Col.”, “Jr.”…). This addresses
the parser’s confusion:

examine Mrs. Robinson
Loved by Beatles. And Jesus.

Simply include this extension to enable this behavior. No additional changes
are needed.

Thanks, @onyxring. A generalized solution is overkill for this particular case (and for a z3 output, I’m pressed tight for space!), but I’ve been reading about orLibrary and it sounds great. If I next work on a non-Puny Inform 6 project, I’ll certainly be looking at it more carefully!

Also, the orLibrary solution uses re-tokenization, which can’t be done in z3. (Of course, this doesn’t matter for orLibrary, since it builds on the standard library, which can’t be used to make z3 games anyway.)
