Text variable matching question (includes regex)

StartedTheVoid · July 19, 2021, 2:34am

I want to string random words together: randomly choose one from List A, one from List B, and so forth. But I don’t want similar words from List A and List B to be chosen together.

e.g. ‘Intensely’ and ‘Intense’.

So what I am trying to do is use a While loop to keep choosing a new random word from List B until it doesn’t match the word from List A. But the words don’t precisely match, they are only similar for a certain number of characters.

In another programming language I might do something like:

$WordA = (random choice from list A)
$WordB = (random choice from list B)

# assuming all words are more than 6 characters:
$SignificantChars = 6

# compare only the first $SignificantChars characters of each word
While ( $WordA.Substring(0, $SignificantChars) -eq $WordB.Substring(0, $SignificantChars) ) {
      $WordB = (random choice from list B)
}

But I cannot find any way in Inform 7 to extract or even refer to a substring of a text variable.

What I have resorted to, for a substring comparison, is using a regular expression to remove the last x characters from a copy of a text variable, instead of comparing the first x characters.

Is there a more proper way to do this?

Test Room is a room.

Instead of doing anything, say "[TestText]" instead.

To say TestText:
	Let Text1 be "colorful";
	Let Text2 be "colorize";
	Let Text2Match be "[Text1]";
	Replace the regular expression ".{3}$" in Text2Match with ""; 
	Say "Matching whether '[Text2]' includes '[Text2Match]', which is the shortened form of '[Text1]':[line break]";
	If Text2 matches the text Text2Match:
		Say "Matches.";
	Otherwise:
		Say "Doesn't match.";

severedhand · July 19, 2021, 4:15am

I’m just taking a step back from the regex for a moment to look at the problem.

Are list A and list B the same size (i.e. is there a one-to-one relationship between the words in each list)? If so, a two-column table might be an easier way to go.

If no… even if the lists aren’t the same length, do you ever have more than one match between lists? e.g. If colorful is in list A, is there ever a chance there will be more than one match for it in list B? Again if no, a two-column table would still work.

If you really want and need two freely definable word lists where you can add between 0 and N words to list B that will match a particular word in list A, or vice versa, yeah, regex seems a good solution.

At the moment you’re doing that subtractively, which looks fine to me. I can assemble regexes but I don’t use them frequently, so I’m not great at instinctively picking the best way to solve a problem. But I found you can do it additively as well, as I just tested.

The additive way below has its own problem, because here, the word prefixes all have to be five letters. But it swaps the onus to the other end of the word: the suffixes can now be any length.

Test Room is a room.

Instead of doing anything, say "[TestText]" instead.

To say TestText:
	Let Text1 be "colorful";
	Let Text2 be "colorize";
	Let Text2Match be "[Text1]";
	if Text1 matches the regular expression "^(.....)":
		now Text2Match is "[text matching regular expression]";
	Say "Matching whether '[Text2]' includes '[Text2Match]', which is the shortened form of '[Text1]':[line break]";
	If Text2 matches the text Text2Match:
		Say "Matches.";
	Otherwise:
		Say "Doesn't match.";

-Wade

severedhand · July 19, 2021, 4:18am

Btw I see you’re replying - I just edited my post while you were.

-Wade

StartedTheVoid · July 19, 2021, 4:34am

No.

I thought of that, but didn’t feel like learning about tables right now. Good idea. I thought of a table, making sure similar words had the same index numbers, then comparing the index numbers rather than the text. I also thought of trying to assign each one a bit position, then using binary math to include / exclude words. I decided substring was the easiest thing, then spent hours and hours failing to find any substringish functionality.

Not likely. Probably not. - Actually I didn’t get you. Yes, several words between lists will be ‘similar’.

That’s what I figured.

if Text1 matches the regular expression "^(.....)":

:o

LOL… I didn’t think of that.

Now I want to try to assign regexes to text variables so I can re-use them and change the pattern in only one place. Not sure if ‘matches the regular expression "[MyRegex]"’ is going to work…
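A minimal sketch of that idea, untested. It assumes a text variable can be handed to the matching phrase directly, with no quotes or brackets around it. MyRegex is the name floated above; "stem-text of" is a hypothetical say phrase made up for illustration.

```inform7
MyRegex is a text that varies. MyRegex is "^(.....)".

To say stem-text of (txt - a text):
	[the pattern lives in one place, so changing MyRegex changes every use]
	if txt matches the regular expression MyRegex:
		say the text matching subexpression 1;
	otherwise:
		say txt.
```

With that in place, "[stem-text of Text1]" would print the first five characters of Text1, or all of Text1 if it is shorter than five characters.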

FYI my word lists are like this…

ImpossibleDoorAdverb is a kind of value.  ImpossibleDoorAdverbs are intensely, impossibly, hypnotically, kaleidoscopically, energetically, brightly, dizzyingly, colorfully.

ImpossibleDoorAdj is a kind of value.  ImpossibleDoorAdjs are intense, impossible, hypnotic, kaleidoscopic, energetic, bright, dizzying, colorful, iridescent, glowing, swirling, whirling, multicolored, prismatic, polychromatic.

[ 'd--r' below because I can't use 'door' as a value.  I'll substitute - for o later.]
ImpossibleDoorNoun is a kind of value.  ImpossibleDoorNouns are d--r, portal, blur, rainbow, colors, lights, array.

ImpossibleDoorPrep is a kind of value.  ImpossibleDoorPreps are of light, of colors, of rainbows, of prisms.

Using those lists, I am trying to build a dynamic phrase:

(optional) Adverb
Adj
(optional) Adj
Noun
(optional) Prep phrase

If the word lists did not include similar words and I didn’t want some of them optional, I could apparently just do this:

	SayImpossibleDoorAdverb is initially "[a random ImpossibleDoorAdverb]";
	SayImpossibleDoorAdj is initially "[a random ImpossibleDoorAdj]";
	SayImpossibleDoorNoun is initially "[a random ImpossibleDoorNoun]";
	SayImpossibleDoorPrep is initially "[a random ImpossibleDoorPrep]";
		
	To say SayImpossibleDoorDesc:
		Say "[SayImpossibleDoorAdverb] [SayImpossibleDoorAdj] [SayImpossibleDoorNoun] [SayImpossibleDoorPrep]";

But to handle optional words, and because the word choices have ‘similar’ wording, I need a way to detect that similarity and then re-select each item when necessary, to avoid “colorfully colorful” etc…

BTW, that code snippet works because I discovered this:

"Random Word Test"

Test room is a room.

PsychedelicWord is a kind of value.  PsychedelicWords are intense, impossible, hypnotic, kaleidoscopic, energetic, bright, dizzying, colorful, iridescent, glowing, swirling, whirling, multicolored, prismatic, polychromatic.

SelectedWord is initially "[a random PsychedelicWord]";  [ LOL 'initially' , it keeps randomly changing! ]

Instead of doing anything, say "[SelectedWord]."

Test me with "wait / g / g / g / g / g"

An initially randomized variable that constantly changes. Well, I am trying to describe a psychedelically impossible door, I guess that fits.

When I am done with my entire phrase builder, I will post it for public flogging. Surely then someone will point me to an extension that very elegantly does the same thing.

But, working this out on my own is a necessary part of me learning Inform 7.

severedhand · July 19, 2021, 4:43am

Yeah that’s all good stuff! You’ve already shown me a lot of options I hadn’t thought of.

About the only angles I don’t think you mentioned, that might be able to help somewhere, are Definitions and To Decide phrases. Have a glance at 11.6 and 11.17 in the docs. For instance, you could create a To Decide phrase that determines whether two words match in the A-list/B-list sense. That then gives you an easy way to write the test in future code, e.g. “if text1 listmatches text2”, once you’ve created a “To decide whether blah listmatches blah2:” phrase.
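Something like this rough, untested sketch, reusing the five-letter prefix regex from my earlier post. The parameter names txt_a and txt_b and the variable "stem" are just placeholders. Note that "matches the text" finds the stem anywhere inside the second word, not only at its start, and that a first word shorter than five letters never counts as a match.

```inform7
To decide whether (txt_a - a text) listmatches (txt_b - a text):
	[take the first five characters of txt_a as its stem]
	if txt_a matches the regular expression "^(.....)":
		let stem be the text matching subexpression 1;
		[a conflict is any appearance of that stem inside txt_b]
		if txt_b matches the text stem, decide yes;
	decide no.

Instead of jumping:
	if "colorfully" listmatches "colorful":
		say "Too similar.";
	otherwise:
		say "Fine together."
```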

Happy hacking.

-Wade

StartedTheVoid · July 19, 2021, 4:46am

One last thing I’ll point out along these lines: a variable that is set to some random choice CHANGES every time you refer to it, and that is a big challenge. Thankfully, you can freeze the value:

Now SelectedWord is the substituted form of SelectedWord; [unrandomize the variable to keep its value]

or

Let SelectedWord be the substituted form of "[a random ImpossibleDoorAdj]";

I really want to leverage this randomized behavior, but I can’t, because again, it keeps changing its value. I understand why it works like this, but man, that’s weird.

Zed · July 19, 2021, 7:53am

That is hilarious! I absolutely would have expected that to generate an error. And if it didn’t, of course it would just get assigned some single value.

drpeterbatesuk · July 19, 2021, 8:55am

If it helps, here are some string functions in I7:

Part 3 - String Functions

To decide what text is the sub-text of (txt_f - a text) from (start - a number) for (no_chars - a number):
	let len_txt be the text-length of txt_f;
	if start < 1, now start is 1;
	if start > len_txt, decide on "";
	if no_chars + start > len_txt + 1:
		now no_chars is len_txt - start + 1;
	if no_chars < 1, decide on "";
	if txt_f matches the regular expression ".{[start - 1]}(.{[no_chars]})":
		decide on "[the text matching subexpression 1]";
	else:
		decide on "";

To decide what text is the clip-text of (txt_f - a text) from (start - a number) for (no_chars - a number):
	decide on the sub-text of txt_f from start for no_chars;
		
To decide what text is the mid-text of (txt_f - a text) from (start - a number) for (no_chars - a number):
	decide on the sub-text of txt_f from start for no_chars;
		
To decide what  number is the text-length of (txt_f - a text):
	decide on the number of characters in txt_f;
	
To decide what number is the text-pos-alt of (txt_n - a text) in (txt_h - a text):
	if txt_h matches the regular expression "(([txt_n]).*)":
		decide on 1 + text-length of txt_h - text-length of the text matching subexpression 1;
	else:
		decide on 0;

To decide what number is the text-pos of (txt_n - a text) in (txt_h - a text):
	if txt_h matches the regular expression "((.*?)([txt_n]))":
		decide on 1 + text-length of the text matching subexpression 2;
	else:
		decide on 0;
		
To decide what number is the text-len-matched of (txt_n - a text) in (txt_h - a text):
	if txt_h matches the regular expression txt_n:
		decide on  text-length of the text matching regular expression;
	else:
		decide on 0;

[succinctly duplicates 'the text matching regular expression' following regular expression matching]	
To decide what text is the text-matched of (txt_n - a text) in (txt_h - a text):
	if txt_h matches the regular expression txt_n:
		decide on the text matching regular expression;
	else:
		decide on "";
		
To decide what text is the clip-text of (txt_f - a text) from (start - a number) to (end - a number):
	decide on the sub-text of txt_f from start for end - start + 1;

To decide what text is the sub-text of (txt_f - a text) from (start - a number) to (end - a number):
	decide on the clip-text of txt_f from start to end;
	
To decide what text is the mid-text of (txt_f - a text) from (start - a number) to (end - a number):
	decide on the clip-text of txt_f from start to end;
	
To decide what text is the left-text of (txt_f - a text) for/of (len - a number):
	decide on the sub-text of txt_f from 1 for len;
	
To decide what text is the right-text of (txt_f - a text) for/of (len - a number):
	decide on the sub-text of txt_f from the text-length of txt_f - len + 1 for len;

[copy a text over (and beyond if necessary) another, starting from a given position (inclusive)]	
To decide what text is the text-copy of (txt_f - a text)  to  (txt_t - a text)  from  (start - a number):
	decide on "[clip-text of txt_t from 1 to start - 1][txt_f][clip-text of txt_t from start + text-length of txt_f  to text-length of txt_t]";
	
[remove a text of given length starting at a given position (inclusive)]
To decide what text is the text-cut of (txt_f - a text) from (start - a number) for (len - a number):
	decide on "[left-text of txt_f for start - 1][right-text of txt_f for ((text-length of txt_f - start) - len) + 1]";
	
[remove a text of given length between two positions (inclusive)]
To decide what text is the text-cut of (txt_f - a text) from (start - a number) to (end - a number):
	decide on "[left-text of txt_f for start - 1][right-text of txt_f for text-length of txt_f - end]";

[insert a text starting at a given position (inclusive)]	
To decide what text is the text-insert of (txt_f - a text)  to  (txt_t - a text)  at  (start - a number):
	decide on "[clip-text of txt_t from 1 to start - 1][txt_f][clip-text of txt_t from start to text-length of txt_t]";

[replace the first match of a regular expression with another text, unlike 'replace the regular expression...', which replaces all occurrences]
To decide what text is the text-replace-alt of (txt_n - a text) in (txt_h - a text) by/with (txt_r - a text):
	if txt_h matches the regular expression "((.*?)([txt_n]))":
		let pos be 1 + the number of characters in the text matching subexpression 2;
		let len be the number of characters in the text matching subexpression 3;
		decide on "[left-text of txt_h for pos - 1][txt_r][right-text of txt_h for ((text-length of txt_h - pos) - len) + 1]";
	else:
		decide on txt_h;
	
[replace the first match of a regular expression with another text, unlike 'replace the regular expression...', which replaces all occurrences]
To decide what text is the text-replace of (txt_n - a text) in (txt_h - a text) by/with (txt_r - a text):
	if txt_h matches the regular expression "((.*?)([txt_n]))":
		let pos be 1 + the number of characters in the text matching subexpression 2;
		let len be the number of characters in the text matching subexpression 3;
		decide on "[left-text of txt_h for pos - 1][txt_r][clip-text of txt_h from pos + len to text-length of txt_h]";
	else:
		decide on txt_h;

To decide what text is the text-cat of (txt1 - a text) and/to (txt2 - a text):
	decide on "[txt1][txt2]";

To decide what text is the text-add of (txt1 - a text) and/to (txt2 - a text):
	decide on "[txt1][txt2]";

For your specific example, you would use e.g. ‘the left-text of WordA for 6’ to get the first six characters of WordA.
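As a rough sketch of how that could slot into the re-selection loop from the original question (untested, and relying on the left-text phrase defined above): the phrase name "unlike-adjective for" and the variables StemA and StemB are made up for illustration, and it assumes every word in both lists has at least six characters.

```inform7
To decide what text is the unlike-adjective for (WordA - a text):
	let StemA be the left-text of WordA for 6;
	let WordB be the substituted form of "[a random ImpossibleDoorAdj]";
	let StemB be the left-text of WordB for 6;
	[keep drawing until the first six characters differ]
	while StemB exactly matches the text StemA:
		now WordB is the substituted form of "[a random ImpossibleDoorAdj]";
		now StemB is the left-text of WordB for 6;
	decide on WordB.

Instead of jumping:
	let WordA be the substituted form of "[a random ImpossibleDoorAdverb]";
	say "[WordA] [the unlike-adjective for WordA]."
```

Taking "the substituted form of" each draw matters here: without it, WordB would be re-randomized every time it is read, so the word that passed the check might not be the word that gets printed.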

StJohnLimbo · July 19, 2021, 10:23am

I don’t know if this helps, but you could do the following (and you’d not have to worry about say phrases and the substituted form):

Test room is a room.

PsychedelicWord is a kind of value.  PsychedelicWords are intense, impossible, hypnotic, kaleidoscopic, energetic, bright, dizzying, colorful, iridescent, glowing, swirling, whirling, multicolored, prismatic, polychromatic.

SelectedWord is a PsychedelicWord that varies.

When play begins:
	now SelectedWord is a random PsychedelicWord.

Report jumping:
	now SelectedWord is a random PsychedelicWord. 

Instead of waiting, say "[SelectedWord]."

Test me with "z / z / jump / z / z / jump / z".

StartedTheVoid · July 19, 2021, 1:39pm

Yes, that is helpful. Keeps me from having to keep using ‘the substituted form of’. Thanks!

StartedTheVoid · July 19, 2021, 1:41pm

Whoa… There is a lot to learn here. Thank you!!

StartedTheVoid · July 19, 2021, 2:02pm

Super big thanks to you all. All this has helped me more elegantly create a system of dynamically generated descriptions. In this case for a key and a door.

This is just a test game I am fiddling with to learn things; it’s not my magnum opus, so my dynamic verbiage isn’t the best. And I have to learn how to sort out a / an, or fold them into my word choices.

But I think this is a rather cool effect, if used judiciously.

(The asterisks are mine, to highlight the dynamic text.)

Left room
This room is on the left.  There are archways to the east and the south.  The southern archway is faintly shimmering.  To the west is an interesting looking door.  On the floor appears a symbol.  A holographic display floats in the air near one wall.

>e

Right room
This room is on the right.  There are archways to the west and the south.  The southern archway is faintly shimmering.  To the east is an interesting looking door.  On the floor appears a symbol.  A holographic display floats in the air near one wall.

You can see an interesting key here.

>take interesting key
As you grasp the key, you feel a slight vibration in your hand.

Taken.

>x interesting key
It seems to be an ordinary key, however its outline faintly shimmers.

>s
You feel a strong vibration in your hand.

Down room
There is a brightly shimmering archway to the north.  On the floor appears a symbol.  A holographic display floats in the air near one wall.  Inset into the south wall appears to be a vague door made of flickering colors.  It seems rather impossible.

>i
You are carrying:
  an impossible key

>x impossible key
While it feels like the same metal key, visually it now appears to be a key-shaped outline filled by * energetically glowing lights *.  You can barely concentrate while looking at it.

Your hand moves toward the * glowing light of rainbows * seemingly of its own accord.

>x key
While it feels like the same metal key, visually it now appears to be a key-shaped outline filled by * intensely colorful rainbows * .  You can barely concentrate while looking at it.

>x key
While it feels like the same metal key, visually it now appears to be a key-shaped outline filled by * colorfully dizzying lights *.  You can barely concentrate while looking at it.

>x key
While it feels like the same metal key, visually it now appears to be a key-shaped outline filled by * intensely bright prisms *.  You can barely concentrate while looking at it.

The key in your hand seems attracted to the * bright kaleidoscopic door *.

>x key
While it feels like the same metal key, visually it now appears to be a key-shaped outline filled by * impossibly swirling prisms *.  You can barely concentrate while looking at it.

>x door
As you look directly at the door, it dissolves into an * energetically swirling light of prisms *, dazzling your eyes and momentarily dampening all consciousness.  As consciousness returns, you feel this this is quite impossible.

>x door
As you look directly at the door, it dissolves into an * colorful blur of light *, dazzling your eyes and momentarily erasing all thoughts.  As thought returns, you feel this this is quite impossible.

Your hand moves toward the * hypnotically dizzying rainbow of light * seemingly of its own accord.

>x door
As you look directly at the door, it dissolves into an * swirling light of colors *, dazzling your eyes and momentarily erasing all awareness.  As awareness returns, you feel this this is quite impossible.

>x door
As you look directly at the door, it dissolves into an * impossibly polychromatic array of colors * dazzling your eyes and momentarily dampening your mind.  As your mind returns, you feel this this is quite impossible.

The key is gently pulling your hand toward the * energetically intense colorful door of prisms *.

>open door
Your dazzled eyes cannot focus on the * intense portal of light *.  Your consciousness fades away.

(You can't recall what you were about to do)

Your hand moves toward the * colorful rainbow of prisms * seemingly of its own accord.

>touch door
Your dazzled eyes cannot focus on the * glowing light of colors *.  Your focus goes elsewhere.

(You can't recall what you were about to do)

>

Now, the trick for the player is to work out how to resolve this door puzzle when they can’t directly focus on it.

Zed · July 19, 2021, 3:59pm

There are a couple of potential advantages to skipping defining values here and just using phrases.

To say the/a/-- psychedelic word:
	say "[one of]intense[or]impossible[or]hypnotic[or]kaleidoscopic[or]energetic[or]bright[or]dizzying[or]colorful[or]iridescent[or]glowing[or]swirling[or]whirling[or]multicolored[or]prismatic[or]polychromatic[at random]".

You get a small bonus for free with the one of…at random structure: it won’t return the same value twice in a row. And you don’t have to worry about name conflicts with “door”.

StartedTheVoid · July 19, 2021, 6:05pm

This would work, but I’m trying to combine words from several lists and I don’t want a word selected from one list to be too similar to one selected from another list, e.g. “colorfully colorful”. So I have this fairly lengthy bit of code to select the next word in the ‘phrase’, check its first few characters to see if the phrase already includes them, and if so, choose another word. Then move on to the next word, then the next.

Plus I do this exact selection in several different areas of text, so I like the text substitution instead of “saying” all those choices each time.

I love that I am getting so many helpful ideas and suggestions though. Either I ask a pretty compelling question or you all are pretty obsessive. Either way, I like it!!

Zed · July 19, 2021, 6:22pm

This doesn’t stop you from doing that. Saying isn’t really saying until say says it is: in a “to say” phrase, say means return this string. Within a rule, say means output this text.

To say blah: is very close to the same as To decide what text is blah:. A To say phrase can contain arbitrary code. A To say phrase doesn’t have to end with saying anything – it doesn’t have to say anything at all.

You can do text substitutions using “[blah]” with To decide what text is blah just like with To say blah. So far as I can tell, the only relevant difference is the return mechanism. With To decide, decide on immediately returns. With To Say, you don’t have an immediate return option – if you have ifs or loops involved it might be harder to use for a given case.

But To say also offers for free an easy way to cumulatively build an output string. I fibbed above when I said say meant return in a To say. What it means is more like “concatenate this to the invisible return result”. You can have multiple say statements and they all just add to the result, which gets returned when the code block ends (with an empty string if nothing was said at all).
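To make the difference concrete, here is a small untested sketch. The phrase names door-flourish and door-flourish-text are invented for illustration. The To say version builds its result from several say statements; the To decide version returns as soon as it hits a decide on.

```inform7
To say door-flourish:
	[each say adds to the accumulated result]
	say "[one of]bright[or]glowing[at random]";
	if a random chance of 1 in 2 succeeds:
		say " swirling";
	say " portal".

To decide what text is door-flourish-text:
	[decide on returns immediately, skipping the rest of the phrase]
	if a random chance of 1 in 2 succeeds, decide on "swirling portal";
	decide on "glowing portal".

Instead of waiting:
	say "You glimpse a [door-flourish], then a [door-flourish-text]."
```

Both are invoked the same way inside quoted text, which is why either one can drive a substitution.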

drpeterbatesuk · July 19, 2021, 6:22pm

rockwalrus · July 19, 2021, 6:35pm

This could be a good use for relations, like in this utterly untested code snippet:

Conflict relates various texts to various texts.

The verb to conflict with means the conflict relation.

"colorfully" conflicts with "colorful".

If first word conflicts with second word:
     ....

StartedTheVoid · July 19, 2021, 6:36pm

Ah ok, I get you. Thanks.

StartedTheVoid · July 19, 2021, 6:37pm

Wow. That idea is way off my radar. I’ll certainly look into this. That looks like pretty much exactly what I am trying to do (say).

rockwalrus · July 19, 2021, 6:50pm

You could then say something like:

Compatibility relates a text (called A) to a text (called B) when A does not conflict with B. The verb to be compatible with means the compatibility relation.