Lack of randomness sometimes when compiling for Glulx

severedhand · September 27, 2023, 6:02am

If my program’s already using @Dannii 's Xorshift, is it avoiding the whole problem being talked about in this thread, or just part of it?

-Wade

Dannii · September 27, 2023, 6:06am

If you’re getting the initial Xorshift seed from the Glulx RNG, then it might be affected, in that there may be fewer initial actual seeds than you thought. I don’t know enough yet to say how in detail.

severedhand · September 27, 2023, 6:09am

Methinks I’ll wait 'til the dust settles.

-Wade

drpeterbatesuk · September 27, 2023, 7:04am

Have you managed to do that in 10.1.2, and if so how?

drpeterbatesuk · September 27, 2023, 7:14am

Yeah, I’d noticed that and put in the same fix. I’m getting round to reporting it.

One woe doth tread upon another’s heels, so fast they follow:

Hamlet, Act 4, Scene 7

Zed · September 27, 2023, 7:18am

Same way I do everything: look up the I7 phrases in the Standard Rules, look what I6 it uses, look those up in the kits… and then shamelessly steal from Dannii (his Xorshift extension in this case. It’s 9.3 but shouldn’t be hard to update. But, yes, I’ve been playing with an RNG replacing Inform’s in 10.2 tonight. I also threw in my fix allowing numbers across a 2**32-1 range on glulx instead of 2**31-1.

DavidK · September 27, 2023, 7:28am

Thinking about how to fix Glulxe (Git can wait for later):

When building with OS_UNIX set, everything is okay as-is. By default the POSIX random() function is used, which is fine. When COMPILE_RANDOM_CODE is used we get the simple RNG that has issues, but it’s very helpful to have a way to build Glulxe as a command-line tool on any platform that is completely reliable in its RNG output. So let’s leave that alone.
When building for Windows, we can keep the current RNG if a seed is set (not ideal, but will do for now) and call the Windows API for properly random numbers if no seed set (which is the common case). This I can do and make a test build available soon. We might as well just remove the Windows specific code from osdepend.c in the generic Glulxe distribution for the same reason that we removed the Windows start-up code from there, it’s just too platform specific and requires many other bits.
That leaves OS_MAC in osdepend.c. This I know nothing about. Zarf, is that code even relevant any more?

drpeterbatesuk · September 27, 2023, 1:44pm

Having done a bit of statistical theoretical digging, descending into the weeds of large number theory and Monte Carlo simulations, I have come to the tentative conclusion that in order to be confident of a reasonably even distribution of random outcomes across all possibilities- by which I mean that for any given call to the RNG for a random number in the range 2^n, e.g. the first call, there is a >95% likelihood that no outcome is 5% more or 5% less likely than it should be if all 2^n possibilities were truly equally and randomly distributed- you need to have a bare minimum of 2^n * ~5,000 sequences, meaning that we need to run the RNG a minimum of random (~5000) times to prime it.

Given that, and building in a margin for comfort and higher accuracy, my original educated guess of random(32633) calls to prime the RNG doesn’t look too far off the mark. It doesn’t seem to introduce a noticeable delay to start up despite my 10-year-old laptop.

Obviously, retro 1980s/early 90s hardware would be a different case, but games running on those machines would likely be using Z-machine anyway.

zarf · September 27, 2023, 3:26pm

I’d like to update the COMPILE_RANDOM_CODE RNG to one that is reliable but not quite as bad. Good old-fashioned MT19937?

Nope. TickCount() was a way to get the system clock on early-2000s MacOS, but it’s long deprecated and time() is adequate. We can drop OS_MAC entirely and use OS_UNIX on MacOS.

zarf · September 27, 2023, 3:28pm

This seems like the right strategy for Unix as well – built-in RNG if the seed is set, otherwise call random() seeded by clock time.

J_J_Guest · September 27, 2023, 4:31pm

I’ve been following this topic with interest, though a lot of it is going over my head.

My latest game has a few random elements, and when I play it in Windows Glulxe it seems that those random elements happen more frequently than when I play it in the I7 GUI. I’m not sure if this has anything to do with the issues you’ve found.

With the IFComp deadline approaching I’d be really grateful if one of you could summarise your findings from this thread in simple layman’s terms. Also, does this Glulxe issue affect the browser interpreters exported by Inform or used on the IFComp website?

Zed · September 27, 2023, 4:55pm

Here’s a thought for the initial seed on glulx, at least for when the terp supports a clock:
as early as possible, call glk_current_time and seed the RNG with the microseconds value. Then, a few times more, scattered amongst other parts of the sequence, get the current time again, hash the microseconds with a large random number generated by the RNG, re-seed the RNG with the result. Since other stuff is happening between each iteration of this and the terp isn’t the only process on the machine, I’m guessing there should be decent variety amongst the microseconds results.

DavidK · September 27, 2023, 7:09pm

That sounds good to me, not that I know enough number theory to really say anything useful on the topic. Are you happy to work up a patch for Glulxe? The code in osdepend.c that’s conditional on WIN32 can just be removed.

Hanna · September 27, 2023, 7:12pm

Is there a reason to keep both variants or is this a historical artifact? If the built-in RNG is updated to use a good algorithm, it seems simpler to always use it on all platforms, just switching between fixed seed and seed taken from the OS (time / entropy / etc.). That way, it’s just a single runtime toggle between reproducible and unpredictable behavior. That’s what I’ve been doing in the Glulx interpreter I’m working on (not yet released).

Changing the algorithm makes a lot of sense to me, but I would suggest more modern algorithms. Mersenne Twister was historically popular and isn’t quite bad enough for all projects that were already using it to switch away, but there are no longer any technical reasons to adopt it for anything not currently using MT. There are several alternatives that are not just faster and give higher quality results, but are also simpler (less code, less run-time state) and more robust to “bad” seeds and particular yet realistic usage patterns.

The xorshift family of algorithms has already been mentioned in this thread, though I must admit I don’t know off-hand which variant should be preferred and how exactly seeding should be done (an internal state of 0 must be avoided to get any randomness at all, but I’m not sure if there’s anything else to take care of).

Another alternative, which I have a soft spot for, is PCG (permuted congruential generator). It has a nice minimal reference implementation in C which handles seeding and also “uniform random integer between 0 and N” (the random() % N expression that’s every programmer’s first instinct has some bias, especially for large N).

DavidK · September 27, 2023, 7:14pm

I think it’s pretty unlikely to be linked, what’s under discussion here is a lot more obscure than the random number generator being so biased that some numbers come up lots more than others.

No, they’re written in Javascript, so will have a completely different random number generator. Which isn’t to say that that might not have its own problems, but at least it won’t be this problem.

DavidK · September 27, 2023, 7:20pm

I’d favour Zarf’s plan over having one RNG algorithm in the Glulxe source code, since modern operating systems (sometimes) provide access to better RNG algorithms: on Windows you can call CryptGenRandom() or BCryptGenRandom(), for example.

zarf · September 27, 2023, 8:01pm

Also, implementing this RNG in every interpreter is a lot more work than implementing it in C (Glulxe).

Hanna · September 27, 2023, 8:09pm

What is the value of “even better” RNG quality on some platforms once one has a high quality portable RNG? Nowadays, it’s not hard to achieve very high quality portably – not just “good enough” but better than most non-cryptographic uses will ever need. Adopting one of those options is important for the sake of platforms without a good RNG, but once it’s done, having the same behavior across all platforms seems appealing both for users and maintainers.

I mention maintenance because curating a set of #ifdefs to distinguish “good” platform RNGs from “bad” ones sounds like a Not Fun game of whack-a-mole. It’s not a trivial decision, in any case. I believe the Windows APIs you mention provide cryptographically secure randomness, which is of course a high standard (and has advantages over “current time in seconds” for seeding another RNG), but also seems very overkill for IF use cases. On the other hand, the random() function as implemented in glibc (and probably other libcs) is much better than Windows rand() but by modern standards it isn’t good either.

To be clear, I’m not talking about making every interpreter out there agree on one algorithm. It’s just that if an interpreter already has (or now gains) a sufficiently good and portable RNG, that interpreter might as well use that RNG on all platforms instead of having two or more ways of generating random numbers. What’s good for the goose is good for the gander.

DavidK · September 27, 2023, 8:14pm

Attached is a test build of Windows Glulxe. If no random seed is set, this uses rand_s(), which uses the cryptographic random number generator in Windows. (If the seed is set it just uses rand(), which is rubbish, but this is just for testing purposes.) @drpeterbatesuk, would you give this a try?

glulxe_rng_test.zip (975.9 KB)

Dannii · September 27, 2023, 8:29pm

Good to know that 5000 isn’t a performance problem, but it still seems like overkill. I don’t we need to try to completely fix this issue in I7, just enough so that the most obvious effects of the interpreter flaw won’t be visible.

But there’s still the question of how to ensure the initial random() call to determine how many more calls to make isn’t biased.