Lack of randomness sometimes when compiling for Glulx

Zed · September 27, 2023, 9:12pm

PCG gets a harsh writeup here. I’m not nearly well-versed enough in this area to be able to judge myself whether it’s fair.

I’m afraid I hadn’t realized something Peter already knew: you can tell 10.1 you’re replacing random and your version will even be in auto.inf… but you end up getting the inform6 default random but not your own. If you manually put Replace random; before your own version in the auto.inf and recompile it, you’ll get your own. But you can’t use Replace in an inclusion in 10.X. So, no, it isn’t easy to update Xorshift to 10.1 and I wasn’t doing what I thought I was doing.

Reported I7-2388

drpeterbatesuk · September 27, 2023, 9:35pm

It sounds like overkill until you crunch the stats and realise that by using primes lower than 5000 you have a reasonable chance of significantly biased so-called random choices: e.g. if you have 4 choices for your password with theoretically a probablility of 0.25 each, in practice the probabilities may be 0.23, 0.26, 0.27, 0.24.

You could argue that’s ‘good enough’ for most IF scenarios, but given that there’s not a significant performance hit from targeting more uniform randomness, I’ll stick with my suggestion of random (32633) calls to prime the RNG.

I don’t think using the ‘dodgy’ algorithm to determine random(32633) is a problem here- we’re avoiding its known weakness with even numbers and in particular 2^n.

drpeterbatesuk · September 27, 2023, 10:08pm

@DavidK
This passes the tests the previous version failed. I assume you will be able to make a build at some point that works with the Windows IDE and can replace the current version in Inform’s Interpreters folder?

jbg · September 27, 2023, 10:24pm

The thing you’re really worried about is not that the results are biased, but that they’re correlated. If you have an uncorrelated but biased source you can get unbiased outputs at the cost of a little overhead and a somewhat slower PRNG.

Dannii · September 27, 2023, 10:45pm

randomtest.ulx (1.8 KB)
Here’s a simple test to check how long it takes to call random() up to times.

The good news is that takes at most a few milliseconds even in Quixe.

The bad news is that when running it in Windows Git 1.3.4 I always get a number of about 24000 ± 1000. So simply calling count = random(32633); isn’t good enough (and also means most of the random calls it makes are pointless.)

Actually it’s worse than that. It seems like when I run it every 2 seconds or so, the count is just increasing by 10-20. It is strictly increasing. I’m sure I saw it jump around before, but it’s not doing that now.

I wonder if it’s being seeded with the time, so the very first call to random() is basically completely unscrambled? Maybe keeping the hundred calls to random() that I7 does would help.

Edit: 30 minutes later and it’s ticked over and is now getting a count of ~300. And is still strictly increasing. Maybe I was wrong and it wasn’t jumping around when I first tried it?

Source code

Global count = 0;
Global mainwin = 0;
Global timeend = 0;
Global timestart = 0;
Array gg_result --> 4;

[ glk_current_time _vararg_count;
  ! glk_current_time(&{int, uint, int})
  @glk 352 _vararg_count 0;
  return 0;
];

[ glk_set_window _vararg_count;
  ! glk_set_window(window)
  @glk 47 _vararg_count 0;
  return 0;
];

[ glk_window_open _vararg_count ret;
  ! glk_window_open(window, uint, uint, uint, uint) => window
  @glk 35 _vararg_count ret;
  return ret;
];

[ Main i;
    @setiosys 2 0; ! select Glk I/O system
    mainwin = glk_window_open(0, 0, 0, 3, 0);
    glk_set_window(mainwin);

    count = random(32633);
    print "Calling random() ", count, " times.^";

    glk_current_time(gg_result);
    timestart = (gg_result-->1) * 1000000 + gg_result-->2;

    for (i = 0: i < count: i++) random(i);

    glk_current_time(gg_result);
    timeend = (gg_result-->1) * 1000000 + gg_result-->2;

    print "Took ", (timeend - timestart), " microseconds^";
];

drpeterbatesuk · September 28, 2023, 12:03am

I’ve checked this out with the versions of Git and Glulxe shipped with the IDE and by making direct calls on startup to @random, after Inform has done its initialisation to seed the RNG.

There is a clear difference between Git and Gluxe.

Glulxe seems to perform OK from the start, but Git requires some time to ‘warm up’ into an apparently random sequence, the initial result of random(32633) varying over time within a range of several thousand, or hi/lo, that drifts over time, as you’ve noted.

Running the RNG 100 times before calling random(32633) seems to sort Git out.

Conclusion: the implementation of this RNG is even more problematic in Git than in Glulxe!

EDIT: so, as a more reliable workaround within Inform, prime the RNG by calling it 100 times THEN call it random(32633) times! (post of workaround updated accordingly)

zarf · September 28, 2023, 2:08am

@DavidK, I have done this. See branch: https://github.com/erkyrath/glulxe/tree/new-rng.

I simplified the platform-dependent part of the file. Instead of each platform defining its own glulx_setrandom() and glulx_random(), now each platform can define a RAND_SET_SEED() and RAND_GET() macro. (These are currently one-line calls in all cases, so the macros are not messy.)

I kept the old behavior for Windows:

#define RAND_SET_SEED() (mt_seed_random(GetTickCount() ^ time(NULL)))
#define RAND_GET() (mt_random())

Can you put in a patch that replaces that with your new rand_s() implementation? Apply it to the new-rng branch for now; I’ll move all this into the main branch at once.

Or you could just paste the code here; as I said, it’s probably just a couple of lines.

zarf · September 28, 2023, 2:19am

On the other subjects (I see RNGs are reliably a contentious subject!)

My current implementation in the new-rng branch is MT19937. I’m not married to that algorithm, but I had it in front of me and I wanted to get this patch in tonight. I can take a look at xorshift and PCG, but I gotta say, if there’s ongoing arguments in the literature, I’d rather dodge that whole mess.

My instinct is that provided a “good enough” RNG is appropriate for testing and debugging – which are the big use cases for setting a particular (nonzero) RNG seed. But for “real world” use, if there’s a better source available, we should use it. (And there is a better source available on Mac, Windows, and Linux, so yay.)

The previous algorithm was not “good enough” because, look, someone noticed it being bad. That’s a sufficient reason to change it up.

There’s currently two cases in the code (Windows and Unix) and I expect that’s where it will stop. (I could look up MacOS native RNG calls, distinct from POSIX… but I don’t think I will.)

So that’s not a big maintenance hassle.

As for priming the RNG with a bazillion calls… no. No thanks.

Dannii · September 28, 2023, 2:32am

What do you think Inform should do then, while we wait for the interpreters to catch up? (If they ever do… the Android terps aren’t being maintained.)

Draconis · September 28, 2023, 3:36am

Do the Android interpreters show this issue? I thought it was Windows-only.

Instead of making every game call the RNG thousands of times, it sounds like this happens any time you call @random with a power of 2 from the starting state, right? In that case, Inform could test for the bug by calling @random 4096 (or 2^14 or whatever big power of 2 you like) and looking at the first value returned. If I understand right, the affected terps should always return the same value for this call, which Inform can check for; the odds of an unaffected terp returning that same value are miniscule.

Dannii · September 28, 2023, 3:51am

I haven’t tested any of the Android terps, but since it happens in Lectrote, it could happen in them too. (Emscripten compiles software as if it were a Linux platform.) It probably depends on which compile settings were enabled.

And unfortunately I don’t think we can do any test to spot these flawed RNGs. Git has the problem of always starting with 3, but Glulxe in the Windows I7 IDE has different results: there are still only 4 distinct sequences, but each of them start with a different number.

Zed · September 28, 2023, 4:44am

In the reading I did last night, xoroshiro128** and CMWC seemed like possibly suitable candidates.

Dannii · September 28, 2023, 5:02am

Yes, it seems xoshiro/xoroshiro are the successors to xorshift (and were invented after I wrote the Xorshift extension.) The creators recommend xoshiro128++ or xoshiro128** for 32 bit generators. Interestingly the ** variant only uses scrambling multipliers of 5 and 9! When I implemented xorshift32* I had to use a scrambler of 1597334677.

xorshift128+ is used by most JS engines, and Java includes xoroshiro128++ and xoshiro256++. So we’d be in good company if we used any of these variants (or even xoroshiro64**) (except the single + variants, which are only for generating floating point numbers). And they’d be faster and use less memory than MT.

drpeterbatesuk · September 28, 2023, 7:14am

Sadly, it’s not that simple. The first result is not necessarily fully predictable. It might be if calling for a random number between 1 and 4 on a particular platform (with the Windows IDE version of Glulxe it is always 3, but with Git it could be any of 1-4, and the fully predictable results come later in the sequence).

It would be necessary to hold a bank of sequences representing the particular possible starting random number streams determined empirically for each platform/terp combination and check for one of those.

But then what, if Inform finds it’s running on an affected platform? There’s currently not a simple way of implementing an in-house alternative RNG across all versions.

Kludgey though it is, the ‘prime the RNG by running it gazillions of times’ workaround I have posted remains at present the most practical sticking-plaster.

I don’t like it, and do hope someone can come up with a better one, because as @Dannii has pointed out, we ideally would have a good solution for legacy terps.

DavidK · September 28, 2023, 7:32am

Yes, that’s the plan.

Dannii · September 28, 2023, 7:37am

The I6 system function random is quite overloaded. Call it with a negative number to seed the VM’s RNG. Call it with one positive number to request a random number up to that, or with multiple values to return one of them. I hadn’t realised it was overloaded like that! I didn’t account for any of that in my Xorshift extension, but if we were to implement a RNG in I7’s kit code then it really should account for all of them.

@zarf Would it be at all feasible to split up it up into three veneer/system functions, so that you could replace only the seed and get a number parts (leaving the one-of-several as the default implementation, which would in turn use the random number function), and you could replace them separately? Because I don’t how we could reimplement random(1, 2, 3) - is there even a way for a Glulx function to know how many arguments were given to it? It would be great if random was treated like a macro that determined what it was doing then called the appropriate replaced function or native code.

Or alternatively, for I7 can we just depreciate the multi arg form? That would be much simpler.

drpeterbatesuk · September 28, 2023, 9:59am

Here is an evil method to replace the system random() function in Ver 10.

It relies on the fact that one of the few remaining ways to directly inject I6 code into the final I6 source (auto.inf) is with the Include (- ... -) when defining <a kind> | <an object> syntax.

The insertion parasitises the compiler’s random-kind class declaration header to inject a Replace random; statement, then declares a dummy-kind class declaration header to absorb the orphaned remainder of the compiler’s random-kind class declaration:

A random-kind is a kind of object.

Include (-; 

Replace random; 
	
Class dummy_kind 
	class K0_kind
-) when defining a random-kind.

which leads to this in the final I6:

Class K16_random_kind
  class K0_kind
; 

Replace random; 
	
Class dummy_kind 
	class K0_kind
    with plural bc_U167
    with short_name bc_U168
    with article bc_U169
    with list_together bc_U170
;

This works because class declarations come early enough in the final I6 source that no code calling random() (which would otherwise compile the system random() function) has yet been compiled. The I6 compiler is therefore happy to accept the directive to replace the system random() function with our own.

EDIT Evil scientists among you will immediately notice that this provides a method to directly inject ANY undigested code into the final I6 source (auto.inf), bypassing the I6->Inter->I6 compilation process that spits out some otherwise valid I6 code, either because

i) it isn’t supported by the I6->Inter compiler, but is supported by the final I6->story compiler
ii) it has dependencies on things only created late in the I6->Inter->I6 compilation chain

Dannii · September 28, 2023, 10:17am

That kind of inclusion doesn’t work when compiling to C, and will probably be removed in the future.

drpeterbatesuk · September 28, 2023, 10:28am

I know, but for the present it enables those using Ver10 to compile stories that will work properly on legacy interpreters (with the dodgy PRNG) by implementing an in-house random() function, presenting an alternative to the kludgey ‘run the RNG a gazillion times on startup’ workaround.

Hopefully by the next release of Inform the issue will have a more formal resolution.

drpeterbatesuk · September 28, 2023, 12:13pm

Ibid.

One woe doth tread upon another’s heels, so fast they follow:

Hamlet, Act 4, Scene 7