Quantifier misplaced in Regex

I am having an issue with the following line:

If x matches the regular expression "[a]|[b]|[c]", case-insensitively:
      foo;

When the scenario that triggers this line encounters the texts in the order listed, there is no problem, but if the result is something like a c b, or b c a… it gives me an error saying that there is a misplaced quantifier. I don’t know if this is an Inform-specific question, or if it is a more general Regex question, as I do tend to get a little flustered on Regex even in other languages, but if anyone has any ideas what is happening here, please let me know.

If it is possible that the error is in the details of what is being passed into these texts, or in the “foo;” above, I can post the whole example code, but there’s a lot of other stuff going on in it, so I thought I’d save getting sidetracked on that in case it’s diagnosable from just this snippet for now.

(First, it’s “case insensitively” without a hyphen.)

When you match against the string “[a]|[b]|[c]”, Inform substitutes the variables a, b, and c before interpreting the result as a regexp. So you are probably generating an illegal regexp here. I don’t know what your values are, but if you do (for example)

	let a be "+";
	let b be "B";
	let c be "C";
	if x matches the regular expression "[a]|[b]|[c]", case insensitively:
		say "Match."

…then the regexp would be “+|B|C” which is not valid regexp syntax and causes a runtime “quantifier misplaced” error.
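For illustration, here is a minimal sketch of one way to guard against this, assuming the substituted texts are only ever meant to be plain words. The phrase name "is pattern-safe" and the variable "candidate" are made up for this example and are not part of Inform or any extension. The angle brackets are Inform's notation for a character class, since square brackets are reserved for substitutions.

```inform7
To decide whether (T - text) is pattern-safe:
	[An empty text would leave an empty alternative in the pattern, so treat it as unsafe here.]
	if T is empty, decide no;
	[Reject anything other than letters, digits and spaces.]
	if T matches the regular expression "<^a-zA-Z0-9 >", decide no;
	decide yes.

When play begins:
	let candidate be "+";
	if candidate is pattern-safe:
		say "Safe to use as a pattern.";
	otherwise:
		say "Not safe to use as a pattern."
```

Checking each text this way before it is substituted into the larger pattern should at least show which value is the odd one out.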

What are you trying to do?

Hm… I thought it might be something like that, but none of the characters in the text substitution are ever special characters, so I am still at a loss. I hoped not to have to share the whole application because there’s so much to explain in it that it could sidetrack this specific question… but it is in the rant below. Be aware that the code is not complete in other ways, and in its final form (if there ever is one) it will likely be written in a more optimized way. I’m just trying to get the basic premise working first at this point.

The long and short of it is that I am trying to “parse” the player’s command to do a look-up with regular expressions against the printed names of visible things. As to WHY I would do such a thing, if you’ve read my other threads you might have some idea already, but a little more is posted in the rant.

[rant]Essentially, I am exploring a system where the printed name of a thing is the only way the player and the game will be able to interact with said thing. In the end, I am not actually creating a “new parser,” but still using the existing parser… but the functionality below takes the player’s command, “parses” it with regular expressions to flag objects matched on criteria, gives each matched item the property value “disambiguate-yes”, and then passes the command “[command] disambiguate-yes” to the real parser. I hope that made sense.

As to WHY I would do this… I’m sure anyone who has been in my topics before is aware I am trying to circumvent limitations to the engine, and after lots of testing, I have discovered that adding each thing as an object, a limited number of things with tables of data that swap properties via rules, individual rules for each object, or individual properties for each object, are all equally limiting in the scope of the application. With these options, no matter how I slice it, the number of “things” (not Inform things, but effective things presented to the player) is too limited for my needs. Even if I really never reach that limit, and I am aiming too high, performance is impacted negatively even quite short of that upper limit.

So, I finally figured out that Inform doesn’t choke on individual texts the way it does on individual object, property, table, or rule records. So, I’m trying to build everything in the game entirely out of texts. You might think this is a bad or crazy way to do things and want to talk me out of it… but, again, I tested all the other element types Inform offers, and texts were the only ones that were truly scalable. So, I need to create a system of rules that will “parse” these texts and apply logic to them as if they are properties, and use those to apply logic against the “props” (the actual things to which the text “properties” apply).

Anyway, as part of this, I need a way for the existing parser to be passed useful information about props that have no normal properties. The printed name property will be the only thing available to the normal parser to understand what the player means, so I need to break that into words with regular expressions and compare the two.
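As a rough sketch of the word-by-word comparison described here, the following uses Inform's built-in "number of words in" and "word number ... in" phrases to split the typed text, then checks each word against a name. This is not the approach used in the code below. The phrase name and the variables "label", "typed" and "W" are invented for the example, and it assumes each word contains only plain letters or digits, since anything else would be read as regex syntax once substituted.

```inform7
To decide whether (label - text) shares a word with (typed - text):
	[Walk through the typed text one word at a time.]
	repeat with N running from 1 to the number of words in typed:
		let W be word number N in typed;
		[\b anchors the match at word boundaries, so "apple" does not match inside "Appleworth".]
		if label matches the regular expression "\b[W]\b", case insensitively:
			decide yes;
	decide no.
```

With this, a test such as "if the printed name of zz shares a word with the typed text" would not depend on the order in which the words were typed, and an empty text simply yields no match because the loop never runs.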

Phew… in any case, here is the (under heavy construction) piece of the application that does this, with test “game”:

[code]
Volume 1 - Settings

Use MAX_OBJECTS of 10000.
Use MAX_PROP_TABLE_SIZE of 2000000.
Use MAX_STATIC_DATA of 1000000.
Use MAX_SYMBOLS of 40000.
Use MAX_ARRAYS of 40000.
Use MAX_NUM_STATIC_STRINGS of 40000.
Use DICT_WORD_SIZE of 20.
Use dynamic memory allocation of at least 16384.
Use maximum text length of at least 3000.

Volume 2 - Compatibility Fixes

To consider (x - a nothing based rule):
	follow x.

Volume 3 - Extensions

Book 1 - Numbered Disambiguation Choices by Aaron Reed

Include Numbered Disambiguation Choices by Aaron Reed.

The Numbered Disambiguation Choices don’t use number rule is not listed in any rulebook.

The Numbered Disambiguation Choices preface disambiguation objects with numbers rule is not listed in any rulebook.

Before printing the name of something (called macguffin) that is not the person asked while asking which do you mean (this is the New Numbered Disambiguation Choices preface disambiguation objects with numbers rule):
	if disambiguation-busy is false:
		now disambiguation-busy is true;
		add macguffin to the list of disambiguables, if absent;
		now the disambiguation id of macguffin is the number of entries in list of disambiguables;
		say "[before disambiguation number text][the number of entries in list of disambiguables][after disambiguation number text]".

Book 2 - Disambiguation Control by Jon Ingold

Include Disambiguation Control Fix by Jon Ingold.

Use disambiguation list length of at least 100.

The block suggestion rule is listed first in the should the game suggest rules. This is the block suggestion rule: abide by the new suggestion rules. The new suggestion rules are a rulebook. The new suggestion rules have outcomes it is an excellent suggestion, it is a good suggestion, it is a passable suggestion, it is a bad suggestion (failure - the default) and never.

new suggestion taking something (called x):
	if x is a part of something:
		never;
	if x is held by the player:
		never;
	otherwise:
		it is an excellent suggestion;

Volume 4 - Text-based Disambiguation

Understand the printed name property as describing a thing.

Disambiguate-command is a text that varies.

Disambiguate-flag is a kind of value. Disambiguate-flag are disambiguate-yes and disambiguate-no. A thing has a disambiguate-flag. A thing is usually disambiguate-no. Understand the disambiguate-flag property as describing a thing.

text-disambiguation-applicable is a truth state that varies. text-disambiguation-applicable is true.

Understand "take/look/touch" as "[commands]".

After reading a command while text-disambiguation-applicable is true:
if the player’s command includes “[commands]”:
if the noun is nothing:
let disambiguate-command be text;
let disambiguateTextOne be text;
let disambiguateTextTwo be text;
let disambiguateTextThree be text;
now disambiguateTextOne is “asdfjklimpossiblenever”;
now disambiguateTextTwo is “asdfjklimpossiblenever”;
now disambiguateTextThree is “asdfjklimpossiblenever”;
let command-understood be text;
let command-understood be “[the matched text]”;
now disambiguate-command is “[command-understood]”;
cut the matched text;
let original-command-nouns be text;
let original-command-nouns be the player’s command;
if the player’s command matches the regular expression “\w”:
let first-match be text;
let first-match be “[the matched text]”;
now disambiguateTextOne is “[first-match]”;
cut the matched text;
if the player’s command matches the regular expression “\w”:
let second-match be text;
let second-match be “[the matched text]”;
now disambiguateTextTwo is “[second-match]”;
cut the matched text;
if the player’s command matches the regular expression “\w”:
let third-match be text;
let third-match be “[the matched text]”;
now disambiguateTextThree is “[third-match]”;
cut the matched text;
let exact-pass be a thing;
repeat with exact running through visible things:
if the printed name of exact matches the regular expression “^[original-command-nouns]”, case insensitively:
now exact-pass is exact;
now exact is disambiguate-yes;
now text-disambiguation-applicable is false;
repeat with partial running through visible things:
if the printed name of partial matches the regular expression “[the printed name of exact-pass]”, case insensitively:
now partial is disambiguate-yes;
if text-disambiguation-applicable is true:
say “[disambiguateTextOne] [disambiguateTextTwo] [disambiguateTextThree]”;
repeat with zz running through visible things:
if the printed name of zz matches the regular expression “[disambiguateTextOne]|[disambiguateTextTwo]|[disambiguateTextThree]”, case-insensitively:
now zz is disambiguate-yes;
now text-disambiguation-applicable is false;
if text-disambiguation-applicable is true:
say “[2]”;
repeat with zz running through visible things:
if the printed name of zz matches the regular expression “[disambiguateTextOne]|[disambiguateTextTwo]”, case-insensitively:
now zz is disambiguate-yes;
now text-disambiguation-applicable is false;
if text-disambiguation-applicable is true:
say “[1]”;
repeat with zz running through visible things:
if the printed name of zz matches the regular expression “[disambiguateTextOne]”, case-insensitively:
now zz is disambiguate-yes;
now text-disambiguation-applicable is false;
change the text of the player’s command to “[disambiguate-command] disambiguate-yes”;

First every turn:
	now every thing is disambiguate-no;
	now text-disambiguation-applicable is true;
	follow the Numbered Disambiguation Choices reset disambiguables rule.

Rule for printing a parser error:
	now every thing is disambiguate-no;
	now text-disambiguation-applicable is true;
	follow the Numbered Disambiguation Choices reset disambiguables rule;
	continue the action;

Volume 5 - The Test Game

The kitchen is a room.

The player is in the kitchen.

a prop1 is in the kitchen. The printed name of prop1 is "red apple".

a prop2 is in the kitchen. The printed name of prop2 is "green apple".

a prop3 is in the kitchen. The printed name of prop3 is "orange apple".

a prop4 is in the kitchen. The printed name of prop4 is "yellow orange".

a prop5 is in the kitchen. The printed name of prop5 is "apple".

a prop6 is in the kitchen. The printed name of prop6 is "orange".

a prop7 is in the kitchen. The printed name of prop7 is "green apple".

a prop8 is in the kitchen. The printed name of prop8 is "unique".

a prop9 is in the kitchen. The printed name of prop9 is "take".

a prop10 is in the kitchen. The printed name of prop10 is "red orange apple".

a prop11 is in the kitchen. The printed name of prop11 is "green apple orange".

dummy1 is a person in the kitchen. The printed name of dummy1 is "Red".

dummy2 is a person in the kitchen. The printed name of dummy2 is "Mrs Appleworth".

dummy3 is a person in the kitchen. The printed name of dummy3 is "Mr Orange".

dummy4 is a person in the kitchen. The printed name of dummy4 is "Orange Mr Apple".

dummy5 is a person in the kitchen. The printed name of dummy5 is "Red Mr Apple".

dummy6 is a person in the kitchen. The printed name of dummy6 is "Orange Mrs Appleworth".

dummy7 is a person in the kitchen. The printed name of dummy7 is "Red Mrs Appleworth".
[/code][/rant]

Slight digression: how did you test having hundreds of pieces of text in the game? I’m asking because Inform will store lots of identical text much, much more efficiently than lots of different text.

The texts were unique. I guess my results are not 100% definitive, but as far as I can tell they are all different, so they can’t be stored as identical. Since I didn’t have a good text editor with powerful enough regex, I made my own in JavaScript and created content like this:

text1 is a text. text1 is “1=2,2=3,3=4,4=5,5=6,6=7,7=8,8=9,9=10,10=11,11=12,12=13,13=14,14=15,15=16,16=17,17=18,18=19,19=20”.

I applied a regex and copy script to it that incremented all the numbers, not just the number of the text name, so I ended up with a bunch of results up through:

Text12503 is “2501=2502,2502=2503,2503=2504,2504=2505,2505=2506,2506=2507,2507=2508,2508=2509,2509=2510,2510=2511,2511=2512,2512=2513,2513=2514,2514=2515,2515=2516,2516=2517,2517=2518,2518=2519,2519=2520”.

And everything in between. Firefox couldn’t handle creating 12503 at once, so the numbers inside the quotes had to start over repeatedly, so I guess there might be some duplicates after all… even still, if there are 2500 unique texts, that’s still 4 times as many things as with the other elements, as that seemed to crap out at about 600.

Edit: And I’m not saying 2500 would even be the limit, even if duplicates were being stored more efficiently… 2500 unique texts don’t slow the application down AT ALL… so, in theory, there could be as many texts as the gblorb has megabytes to hold. This simply is not true of the other elements… you can’t increase the memory settings for the other elements enough to get past a few hundred of them without running into problems.

I know my application where this bug is happening is kind of out there, so that’s why I didn’t want to post it to take away from the question at hand.

I still can’t figure out why the regex doesn’t work. I can solve the problem by making multiple “if” statements each checking different regex values for a single match instead, but that’s pretty annoying and ugly… I don’t know why the “or” operator internal to the regex is failing this way. Hm…

Edit: Nope. I was wrong. Even this causes the “quantifier misplaced” error message when the words are listed in a different order than they are found in the printed name in question:

if the printed name of zz matches the regular expression "[disambiguateTextOne]", case insensitively or the printed name of zz matches the regular expression "[disambiguateTextTwo]", case insensitively or the printed name of zz matches the regular expression "[disambiguateTextThree]", case insensitively:

There are no special characters that should be operating like operators or anything. Even if the 3 substitutions are “apple” “orange” and “red”, this error appears.

Never mind this… there wasn’t anything technically wrong with any of the regex… I needed to add another check in my logic above, where I was trying to run a regex against a “nothing” value, or something like that. Once I added a conditional to prevent that lookup, the errors stopped. That being said, it was hard to figure out which regex was causing the error and which line it was coming from. I have no idea how hard it must be to make the compiler give line numbers for anything, let alone a regex error, so I can’t complain, but anyway, I found it manually. Never mind this question.
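For anyone who lands here with the same message, this is a minimal sketch of the kind of guard described above, assuming the problem value was an empty text. The phrase name and the variables "pattern" and "candidate" are hypothetical. "disambiguate-yes" is the property value from the code earlier in the thread. The extra "say" line is only a debugging aid for working out which pattern is in play when a run-time regex problem appears.

```inform7
To flag visible things matching (pattern - text):
	[Skip the lookup entirely when there is nothing to match on.]
	if pattern is empty:
		stop;
	[Print the pattern first, so a run-time regex problem can be traced to its source.]
	say "(matching on: [pattern])[line break]";
	repeat with candidate running through visible things:
		if the printed name of candidate matches the regular expression pattern, case insensitively:
			now candidate is disambiguate-yes.
```

It would be invoked as, for example, "flag visible things matching disambiguateTextOne;" in place of each inline loop.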