[WRONG DIAGNOSIS] Best Practices For Inconsistent Behaviors? (parsers)

EDIT: Never mind; I just had a lapse of scientific method.

ORIGINAL POST:
So I’m seeing a weird problem from testers where some bugs only appear on certain operating systems running certain interpreters. It’s not one bug in particular, either; I’ve found a few of them, now. So far, they’re just weird vocabulary match problems.

I have systems in place that are supposed to fix the bugs (and should, logically speaking), but some of them seem to work only on my machine.

With about a week left to finish this game before Spring Thing, I’m wondering what the suggestions are. These bugs don’t really affect gameplay mechanics; they mostly create weird disambiguation questions.

Is this a common problem in parser game authoring? Does anyone have any common solutions? I’m making a note to tweak and tune this game on different interpreters (as standard practice), but I was mostly expecting differences in text formatting, and not differences in functionality.

…I’ve never had that happen and based on my limited understanding of how interpreters work, I also would have thought that it couldn’t happen? Sorry, not a helpful response but does make me wonder if there’s something exotically weird happening - do you have like one outlier tester or one interpreter that’s different from the others, or are things behaving differently across the board?

Okay so, I did some more testing for myself, and decided to try replicating the bug in a non-debug version of the game…

…and there’s the bug. On my terp, and on my machine.

So now I might just be a moron, and need to stick a bit closer to the scientific method. There might be something about the debug version which is specifically preventing this bug from appearing.

Yeah, I don’t mean to make anyone paranoid. I have experience as a Java dev, and have seen many bug reports that only popped up because someone was using a different Java VM, or the dev was using a rarer VM version.

So my gut response here was “Oh no, the curse follows me here, too.”

My bad.

Is something named “tree” in the picture?

There is one place where different OSes/terps can be expected to have different behavior: you shouldn’t expect to get the same sequence of random numbers even if you start with the same seed. One can use Danni’s Xorshift to compensate for that if it’s relevant. Other than that, if there’s a difference, it probably reflects one or another of the terps having a bug. (Of course, one wouldn’t expect randomness to affect parser disambiguation unless one is up to something really weird.) [ oops, I was being all Inform 7-centric again. ]
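
For anyone wondering what "bring your own generator" looks like in practice, here is a minimal sketch of a 32-bit xorshift generator in Python. It is not the Inform 7 extension mentioned above, just the textbook 13/17/5 shift variant; the function names are made up for illustration. Because every step is plain integer arithmetic, the same seed gives the same sequence on any interpreter.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def xorshift32(state: int) -> int:
    """Advance a 32-bit xorshift state by one step (shifts 13, 17, 5)."""
    state &= MASK32
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


def sequence(seed: int, count: int) -> list[int]:
    """Return the first `count` values produced from a nonzero seed."""
    if seed & MASK32 == 0:
        # A zero state would stay zero forever, so reject it up front.
        raise ValueError("seed must be nonzero")
    values = []
    state = seed
    for _ in range(count):
        state = xorshift32(state)
        values.append(state)
    return values
```

Calling `sequence(1, 3)` twice returns the same three numbers both times, which is the whole point: the game owns the generator instead of relying on whatever the interpreter provides.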

Nope. Apparently the cat player character has different disambiguation solutions from the normal player character, specifically where doors and handles are concerned.

Still need to investigate why. But at least I can reproduce it in a debug version.

Was just completely unexpected, is all.

EDIT: I tried to delete this thread, but apparently it won’t let me. Oh well. My mistake will be plastered on the forum for all to see, I guess.

That’s intriguing! Keep us posted, please. Seems likely that different things are in scope for the two cases, but why they’d be different, I don’t know, unless maybe you’ve implemented the cat being able to see things in dim light the regular player can’t or something…

It’s part of the process!

Please correct me if I am wrong, but I guess the most reliable interpreters are Z-machine interpreters, because their details have been fixed for so many years. Perhaps Glulxe and TADS are close? The category of your post is “General Design Discussions”, so I hope it is okay that I mention several systems below…

Old open source interpreters are probably the most reliable(?)

Sometimes people think they behave differently because they forgot to mention a detail. For instance, I mentioned a lot of bugs in a Quest 5 game review until I found out that the combat system became buggy when I activated SCRIPT ON to produce a transcript.

Another known example is the Scare Adrift 4 interpreter which is actually highly compatible but you may have to deactivate some “smart” GLK abbreviations with the command: GLK ABBREVIATIONS OFF

So this is for OPEN DOOR in TADS Adv3Lite. Adv3Lite allows you a last-chance opportunity to disqualify objects from matching, by providing the context of what other objects happened to also match, and how the parser handled the input itself.

The cat cannot open doors, the normal player can. Door handles are supposed to disqualify themselves from matching when in the presence of door matches.

However, there’s a weird quirk: when the cat cannot open a door, the door handle doesn’t consider the door’s vocabulary match legitimate, so it barges in like “The door wouldn’t have worked anyway! It’s me you’re looking for!” But the door still needs to know whether the player meant to refer to it (instead of the handle), because that determines which of the two objects reports the error.

So now I need to figure out exactly how this is happening, and maybe adjust the last-chance disqualification round when the player is the cat, specifically.

Yeah, originally I didn’t think I was having a TADS-specific error, and was asking for general advice. However, apparently I am having a TADS-specific error, but the site isn’t letting me delete this thread, so now I’m just some TADS user barging into the General Design Discussions category with a problem which is not General in Design.

Kill me, lol.

HA! One of my testers got a very similar bug last week! Their specific version of the interpreter started being weird while the transcript was recording, but then it didn’t happen after they restarted the interpreter program, lol.

FWIW even if you can’t delete the thread you should be able to change the category (though the multi system discussion has been interesting to me at least!)

If I fail to fix this bug, I might start a new thread in the correct category.

Right?

This is also why I’m hesitant to change the category. The thread has gathered some interesting info from other systems, and I don’t want to bury this on the TADS side of the forum.

For what it’s worth, I fixed it.

For TADS Adv3Lite users
For every door handle, I had procedurally added vocab from the door it was attached to. For some reason, if the door action was illogical, using the filterResolveList method didn’t actually affect the final list of matches. So I nuked it all by adding the door vocab to the handle, but only after splitting the vocab and marking each word with [weak], in order to always force the door to have priority, unless the handle was specifically matched with “handle”.

Apparently, the [weak] token drops matches before filterResolveList does, and will drop matches even when all actions are illogical. At least, this is how the behavior looks from the surface.

For everyone else who might still be curious (I tried to make this funny)
TADS Adv3Lite (and maybe Adv3?) lets an author review the nouns that matched against the player’s input, and use logical context to disqualify certain nouns, before the rest of the parser system tries to use disambiguation.

If an action fails, it sends an “illogical” signal back to the parser. Usually when this happens, the parser says “Hm. Okay. Maybe I did it wrong. Let’s try a slightly different approach just in case.” Time turns back, and it tries to interpret the player’s input differently this time. This continues until one of these branches either succeeds, or all of them fail.

Apparently, if there is no way for the player’s input to make any sense at all, the parser doesn’t even give the author the decency of disqualifying objects from matching. It simply decides that everything is equally guilty of failure. However, this means that it will still send the player a disambiguation message, basically saying “Hello, player. I’m sorry, but would you prefer Failure A, or Failure B for this turn?”

Which—unfortunately—causes testers to write “why is it asking me if I want to open the door or the door handle?”

And then when the tester makes the choice, the parser reveals the true horror of the universe: All roads lead to failure!

So, I have a door matching inputs like “freezer exit door”, and the handle matching “freezer exit door handle”. The handle’s match words are actually generated procedurally (because I can’t be bothered to manually add handles to every door) by just slapping “handle” onto the end of the door’s match words.

Unfortunately, when you have a cat that cannot open large doors, and is controlled by a player who types OPEN FREEZER, the parser will see nothing but doom and terror on the horizon, and it asks “Would you like to fail to open the door, or fail to open the door handle?”

Normally, I have it so all door handles will check the list of match nouns for doors, and if handles find a door in the list, then they all disqualify themselves, because it makes more sense for a player to refer to a door, unless the request is so specific that a door would fail to match, leaving only handles in the match list.

Unfortunately, as I explained before, the parser will not let this voluntary disqualification happen if every permutation of the player’s request will fail.
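
To make the failure mode concrete, here is a toy Python model of the behavior as it looks from the outside. It is not Adv3Lite code and does not mirror the library’s internals; `Match`, `drop_handles_if_door_present`, and `resolve` are hypothetical names, and the "skip the filter when nothing is logical" rule is just the observed behavior described above, not something confirmed from the parser source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    name: str
    kind: str      # "door" or "handle" in this toy model
    logical: bool  # would the action be logical for this object?


def drop_handles_if_door_present(matches: list[Match]) -> list[Match]:
    """The voluntary disqualification: handles bow out if any door matched."""
    if any(m.kind == "door" for m in matches):
        return [m for m in matches if m.kind != "handle"]
    return list(matches)


def resolve(matches: list[Match]) -> list[Match]:
    """Assumed behavior: the filter only runs if at least one match is logical."""
    if not any(m.logical for m in matches):
        # Everything fails, so the author's filter never gets a say.
        return list(matches)
    return drop_handles_if_door_present(matches)


def needs_disambiguation(matches: list[Match]) -> bool:
    """More than one survivor means the player gets asked which one."""
    return len(resolve(matches)) > 1
```

With the normal player, both the door and the handle are logical targets, the filter runs, and only the door survives. With the cat, both are illogical, the filter is skipped, and two candidates survive, so the "which do you mean" question appears.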

However, there’s another system in Adv3Lite, which lets you mark matching words of a noun as “weak”. This makes it so these words in other nouns will always be given match priority, and this seems to happen before the voluntary-disqualification stage. The difficulty is that for procedurally-generated door handles, I need to do some on-the-spot text editing to add the phrase [weak] after every match word in the handle, which also appears in the door’s match word list.

This way, the match words of the door are guaranteed to have match priority, long before the voluntary disqualification stage.
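
The on-the-spot text editing can be sketched roughly like this in Python. This is only an illustration: it treats vocab as a plain space-separated word list, which is much simpler than a real Adv3Lite vocab string, and `handle_vocab` and `matches` are hypothetical helpers. The matching rule (an object only matches if at least one typed word is a non-weak word of that object) is my working assumption about how weak words behave.

```python
WEAK = "[weak]"


def handle_vocab(door_words: str) -> str:
    """Build handle vocab: every door word marked weak, plus 'handle'."""
    weak_words = [word + WEAK for word in door_words.split()]
    return " ".join(weak_words + ["handle"])


def matches(vocab: str, typed: str) -> bool:
    """Toy matcher: all typed words must be known, and one must be strong."""
    strong, weak = set(), set()
    for token in vocab.split():
        if token.endswith(WEAK):
            weak.add(token[: -len(WEAK)])
        else:
            strong.add(token)
    typed_words = typed.lower().split()
    known = strong | weak
    return all(w in known for w in typed_words) and any(
        w in strong for w in typed_words
    )
```

Under these assumptions, `handle_vocab("freezer exit door")` gives `freezer[weak] exit[weak] door[weak] handle`, so OPEN FREEZER can only ever match the door, while OPEN FREEZER DOOR HANDLE can only match the handle.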

For whatever it’s worth, this entire thread caused me to erase part of my notes for you. Well done.

Yeah, this bug has been very persistent, but I only realized just now that it only happens to the cat, and I’m usually testing the game as the normal player. Usually, the behavior should be nearly identical, but doors specifically do different things around the cat player.

I’m sorry that this bug has appeared for you in testing, lol.

Oh, please. No worries. That’s the whole point of testing!

Wait, so what happened? If I open the freezer, is the cat alive or dead? Or both?

Very much alive, and will make it everyone else’s problem in short order.

Trust me, cats CAN open doors, and indeed I don’t consider a cat trying to open a door handle a failure, because, well, cats understand gravity, and they discover that when they jump on a handle, their weight is enough to turn it… (go figure, resorting to an upside-down lock, giving a kitchen door whose handle has to be turned upward to open… a nice puzzle, at least…)

Best regards from Italy,
dott. Piergiorgio.

The doors in question here take quite a bit of force to pull open, because springs pull them closed. The freezer doors are magnetically sealed. In both door designs, the handle does not rotate down to open, and must be pulled sideways along a spring-resistance track.

These are based on doors I have encountered in real life, which I had difficulty opening. I can’t fathom a cat succeeding.

Now if these were normal swing doors found in most homes, then absolutely yes a cat can open them. Heck, I’ve seen dogs open typical house doors, even ones that have only a doorknob. But there are not such free-swinging doors in the game I’m working on.

In general we discourage people from deleting topics that they feel are no longer relevant, especially if they have replies. Someone in the future will have the same problem and find the research useful.

Well, I hope this absolute madhouse of a thread can be a future resource, then! :laughing:
