Lack of randomness sometimes when compiling for Glulx

It’s not so much this (sequential results within one of the 2^n fixed streams appear random) as the choice from 2^n values after a predictable number of calls to the RNG since start-up. The most obvious case is the very first call, but, for example, on Glulxe a request for a random number between 1 and 4 made as the fourth call to the RNG since start-up will always return 4. This isn’t influenced by the ranges of the preceding calls; it’s simply a function of being the fourth call.


Right, so because it only happens if the range is a power of two, that’s why it doesn’t matter most of the time. And if the sequencing of random calls is at all dependent on the player, then it would be almost impossible to end up with the same sequence again.

The main time where this could cause issues is with randomness at startup. If a mystery game chose its villain at random and there were 8 candidates, then only 2 of them would ever be chosen (on these flawed interpreters).

It’s quite odd that you can see this happen in Lectrote, because Emscripten doesn’t just use the browser’s random function (which should be good enough), it actually uses the crypto API for cryptography-quality randomness. So the flaw must be in how Git then processes that randomness before returning it to the game.

Right, right. So forget using the same algorithm in every interpreter.

That seems like way more work for everybody than just fixing the flaw. As you say, it’s only in C code and we’ve got it right here.


I was more thinking about what we could do to protect Inform from these bad implementations. Inform currently does this, which it describes as being for bad Z-Machine random functions. I don’t know if it would actually help with that, but it definitely does nothing for this Glulx flaw.

But what would help is requesting a random number of random numbers at startup. If we requested a random number up to a prime, then called the RNG that many more times, would that help?

@drpeterbatesuk Any idea what the distribution would be if the first random request is for 7 or 13?

Hmm… You made me doubt myself and go back and check, but it’s true; and not only that, the 4 sequences produced by a range of 1-4 repeatedly called from startup are exactly the same as when running the same test with Git selected in the Windows IDE…

Lectrote (Git):
Table of PseudoRandom Results
(row 1) | 1144133243 | 3 |
(row 2) | 4342242124 | 2 |
(row 3) | 2342424322 | 4 |
(row 4) | 3144311441 | 4 |

IDE (Git):
Table of PseudoRandom Results
(row 1) | 2342424322 | 2 |
(row 2) | 1144133243 | 2 |
(row 3) | 3144311441 | 3 |
(row 4) | 4342242124 | 2 |

This is the rather kludgey workaround I demonstrated in my post above- calling the RNG a random number of times between 1 and a high prime at startup…

For a random range of calls between 1 and n, if n is prime (or possibly simply odd, i.e. indivisible by 2) then the number of unique streams generated is n*4. If n is divisible by 2, it’s n/4, with a floor of 4. (This is an empirical observation, not based on any mathematical analysis of the RNG algorithm.)

EDIT: so, for example calling the RNG between 1 and 64 times leads to 16 unique streams, but calling it between 1 and 19 times leads to 76 unique streams.

I don’t really understand, but you’re saying it would need to be a high prime, not just 7 or 13?

Yeah, and that line dates back to the Inform 5 library! Who the hell knows what interpreter flaw it was meant to accommodate. Very possibly an early-1990s implementation of rand().

Having Inform’s runtime try to compensate for interpreter flaws future and past seems like the worst solution.


It seemed to me that the trade-off was between keeping the maximum potential number of calls low enough to avoid any significant delay at start-up, and generating enough unique streams to guarantee an even random distribution of values across all streams for likely ranges of 1 to 2^n.

I should say that the calculation for the number of unique streams generated that I described above was for calls to the RNG for a range of 1-4. For a call of 1-8, the number of streams generated would be twice that, and so on.

For n==2, and therefore a range of 1-4, I guess you’d only need ~100 streams, and therefore, say, a random number of calls during startup of between 1 and 23. For n==6 and therefore a range of 1-64, you’d obviously need a lot more. I haven’t done the maths, but in the solution I posted above I semi-arbitrarily used a range of between 1 and 32633 (a high prime just below 32767).

For n=16 and therefore a call to the RNG for a random number in the range of 1 to 65536, there would be 32633 * 65536 unique possible streams generated, which is statistically still more than enough I should think.

For substantially larger ranges than 1 to 65536, I’d have to think hard about the stats to be sure that 32633 is big enough.

I feel this is correct.

Either the interpreters are fixed, or, ideally, Inform brings randomness fully in-house.

I think it’s worth keeping perspective on this: we obviously don’t need crypto-level randomness for IF, just a plausible level.

Inform 7 currently doesn’t even implement BODMAS, so perfect randomness might not be the highest development priority :slight_smile:

I’ve got an adaptation of Dannii’s Xorshift that replaces Inform’s normal RNG and lets you create other RNGs, each starting with its own independent seed. I’ve always wanted to check that it actually passes suitable unpredictability tests before shipping it. It was inspired by a thing I’ve heard Roguelike devs talk about: having the map generator use a separate RNG from everything else, so you can reuse its seed and reliably get the same map during testing while letting other stuff be as random as normal.

Amidst a billion other things I’ll try to get that out.

(Hey, bonus. I wrote it a long time ago but I just looked and it doesn’t make me cringe as much as my old Inform code usually does. Of course, it is short…)

[ Edited to add results: looks not bad.

% ent -b b19
Entropy = 1.000000 bits per bit.

Optimum compression would reduce the size
of this 640000000 bit file by 0 percent.

Chi square distribution for 640000000 samples is 3.38, and randomly
would exceed this value 6.60 percent of the times.

Arithmetic mean value of data bits is 0.5000 (0.5 = random).
Monte Carlo value for Pi is 3.140805679 (error 0.03 percent).
Serial correlation coefficient is -0.000071 (totally uncorrelated = 0.0).

]


If we want to put something into Inform 7’s startup code then we have to try to make it as good as possible for all potential ranges. If we need to potentially make 30K+ random() calls at startup, that’s getting to the point where a gestalt would make sense, so that we can avoid them when they’re not needed.

Alternatively, Inform should just switch to its own RNG, only using Glulx’s for seeding it (calling @random once with a high prime value should be safe, right?). An Xorshift* type RNG wouldn’t be hard to implement and wouldn’t be too slow.

I agree with zarf; trying to ensure the Inform library can compensate for specific documented flaws in interpreters (potentially even using a gestalt!) seems like a worse solution than just fixing the interpreters.


We need to do both. Some interpreters have stalled (I don’t think anyone is working on an Android interpreter right now), some only get updated once a year (Gargoyle), and even if an interpreter is updated, your distribution channel might be delayed (Gargoyle in Ubuntu).

It may only affect a few authors, but having an N=4 “one of … at random” phrase be non-random is a very substantial bug, so Inform needs some kind of solution until the interpreters can catch up.

I would’ve thought that just asking for a random number up to 7 or 13 would have scrambled the RNG enough that the “one of” phrases would be random. But if I understand what Peter was saying, that wouldn’t be enough to produce a big number of distinct sequences. We might decide that’s okay for Inform, as most authors aren’t making long random lists out of power-of-two random choices, but if someone is doing that we’d also need to ensure they can install an extension like Xorshift that can intercept the interpreter’s algorithm.

(I am disagreeing with my last post, in that I’m now saying we can probably do a simpler compromise in Inform, as long as Xorshift can be installed.)


Well, it looks like Inform already does 100 random calls at startup. That could be changed to doing random(97) calls instead with no loss of performance, which would be well more than enough to ensure a four-option “one of” is truly random.

For this particular bug, doing a constant number of calls to random() does nothing, you’re still limited to the same 4 sequences. Changing 100 to 97 might change the original example from “Latte” to “Cappuccino”, but it would still be non-random.

We want some variation of the solution at the bottom of this post, the question is what number to give to the first random() call: 7, 32633, or something in between.

We also need to know if it’s safe to just get an initial number of 7 or 13 or whatever, or if the flawed interpreters give biased results (less random low bits?). We might need to do something like calling random(28000)/4000.

Right, so do random(97) calls: 97 is the largest prime less than 100, so done this way you’d be guaranteed to make fewer calls than the current number.


Oh, sorry, I misread your post. Yes, indeed random(97) calls won’t be worse than what we’re doing now. But we may be able to do even less, and make startup a little bit faster (if Zarf is right that these 100 calls to random() are very archaic and probably not needed now.)

Does one call to random(97) have a fair distribution in these interpreters? I haven’t tested that yet.

For what it’s worth, the Linux IDE doesn’t seem to have this issue – the test code from drpeterbatesuk’s original post generates a unique sequence with each run (about 30 of them in my test) and is certainly not limited to four such sequences.

That’s on 6M62… on 10.1.2 the test code generates an internal error (claiming the cause to be something in kit code). The compiler doesn’t seem to like the lines:

	let c be the count corresponding to a randnumber of t in the Table of PseudoRandom Results;
	now the count corresponding to a randnumber of t in the Table of PseudoRandom Results is c + 1;

but it will compile and run if the logic is changed to first choose a row... in that segment and then reference the count entry, in which case the test program shows the same (non-problematic) behavior as the 6M62 IDE.

Presumably this means that the Linux IDE interpreters use the POSIX random function, as with the Mac version?
