Ah, so 0.4% would be a better approximation. That’s about equivalent to adding another per-object variable to an average game, right? Which costs two bytes per object in the game, plus some overhead.
I’m also currently contemplating adding 256 bytes of scratch space for counting characters (print text to a buffer, look at its length), so I should think about whether that’s an acceptable cost or not.
For what it’s worth, I fed its output to the PractRand test suite and it failed (after several gigabytes of output, but before exhausting the full 2^32 period - see details below). That’s the full u16 outputs, before mapping into a smaller range, because that’s the format the test suite expects. This is tangential to your main point and doesn’t mean it’s a bad PRNG for a Z-machine interpreter, but I was curious and already had a build of PractRand lying around, so testing it was a matter of five minutes.
RNG_test using PractRand version 0.95
RNG = RNG_stdin16, seed = unknown
test set = core, folding = standard (16 bit)

rng=RNG_stdin16, seed=unknown
length= 128 megabytes (2^27 bytes), time= 3.1 seconds
  no anomalies in 159 test result(s)

rng=RNG_stdin16, seed=unknown
length= 256 megabytes (2^28 bytes), time= 7.1 seconds
  no anomalies in 172 test result(s)

rng=RNG_stdin16, seed=unknown
length= 512 megabytes (2^29 bytes), time= 14.4 seconds
  no anomalies in 185 test result(s)

rng=RNG_stdin16, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 27.9 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:all                   R=  +5.1  p =  2.8e-4   unusual
  ...and 197 test result(s) without anomalies

rng=RNG_stdin16, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 52.7 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(3,14-0)              R=  +8.3  p =  2.4e-7   mildly suspicious
  FPF-14+6/16:(4,14-0)              R=  +8.4  p =  2.2e-7   mildly suspicious
  FPF-14+6/16:all                   R= +10.7  p =  1.7e-9   VERY SUSPICIOUS
  ...and 207 test result(s) without anomalies

rng=RNG_stdin16, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 103 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(2,14-0)              R=  +9.9  p =  8.9e-9   very suspicious
  FPF-14+6/16:(3,14-0)              R= +15.5  p =  5.7e-14  FAIL
  FPF-14+6/16:(4,14-0)              R= +13.4  p =  5.3e-12  VERY SUSPICIOUS
  FPF-14+6/16:(5,14-0)              R=  +9.1  p =  4.5e-8   suspicious
  FPF-14+6/16:all                   R= +19.8  p =  4.7e-18  FAIL !
  ...and 217 test result(s) without anomalies
Output size and state size are probably important factors, yeah. With a period like 2^64 and an even larger internal state, it’s hard for any test to explore a significant fraction of the generator’s output, so only the most glaring issues get caught. The same tests are a much higher bar to clear for a PRNG with e.g. only 64 bits of state.
Unfortunately a flat histogram doesn’t mean a good PRNG. The “PRNG” function prng(x) { return(((x + 1) % 6) + 1); } would pass your test, for example.
In general you want even a just-for-gamedev PRNG both to have a random-ish distribution (what you’re testing for above) and to produce independent-ish outputs.
For what it’s worth, I wrote a TADS3 module for statistical testing back when I was writing my own instantiable PRNG code for TADS3. It provides a couple of common stat tests for this sort of thing, including an implementation of Pearson’s chi-square test (for the distribution) and the Wald-Wolfowitz runs test (for independence).
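The chi-square statistic at the heart of the first test is tiny; here’s the idea as an illustrative Rust sketch (the module itself is TADS3, this is just for flavor):

/// Pearson’s chi-square statistic for observed counts against a uniform
/// expectation, e.g. tallies of a million d6 rolls. Compare the result to
/// the chi-square distribution with counts.len() - 1 degrees of freedom
/// to get a p-value.
fn chi_square_uniform(counts: &[u64]) -> f64 {
    let total: u64 = counts.iter().sum();
    let expected = total as f64 / counts.len() as f64;
    counts
        .iter()
        .map(|&observed| {
            let diff = observed as f64 - expected;
            diff * diff / expected
        })
        .sum()
}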
I don’t know of any more recent version than what’s on SourceForge; the 0.95-pre version from there is what I’m using (compiled from source following these instructions, I believe).
I built that same version, but haven’t tested anything yet. If you’d like to test another one, I’m playing with this Lehmer MCG, which is a decent amount simpler than the Weyl, but it does use a single 64-bit multiply.
/// Multiplier used by the random number generator.
const KEY: u64 = 0xDA94_2042_E4DD_58B5;

/// Multiplier used to mix seeds.
const MIX: u64 = 0x9E37_79B9_7F4A_7C15;

/// Mixes a seed and forces it odd; a Lehmer generator modulo 2^64 needs
/// an odd state to reach its full period.
fn mix_seed(seed: u64) -> u64 {
    seed.wrapping_mul(MIX) | 1
}

/// Pseudorandom number generator.
///
/// A Lehmer multiplicative congruential generator.
#[derive(Clone, Debug)]
pub(super) struct Random {
    state: u64,
}

impl Random {
    /// Returns a pseudorandom number from 1-`range`.
    ///
    /// Uses rejection sampling to provide balanced output.
    #[allow(clippy::cast_possible_truncation)]
    pub(super) fn next(&mut self, range: u16) -> u16 {
        let reject_threshold = u16::MAX - u16::MAX % range;
        loop {
            self.state = self.state.wrapping_mul(KEY);
            // Take the top 16 bits; the high bits of an MCG are the strongest.
            let output = (self.state >> 48) as u16;
            // Powers of two never need rejection; otherwise reject outputs
            // beyond the largest multiple of `range` to avoid modulo bias.
            if range & (range - 1) == 0 || output < reject_threshold {
                return output % range + 1;
            }
        }
    }

    /// Returns a new [`Random`] based on a seed value.
    pub(super) fn new(seed: u64) -> Random {
        Random {
            state: mix_seed(seed),
        }
    }

    /// Resets the generator to use a new seed.
    pub(super) fn reset(&mut self, seed: u64) {
        self.state = mix_seed(seed);
    }
}
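Hypothetical usage, from the parent module since everything is pub(super); the seed value here is arbitrary:

let mut rng = Random::new(0x5EED);
let roll = rng.next(6); // uniform in 1..=6, per the doc comment above
assert!((1..=6).contains(&roll));
rng.reset(0x5EED); // same seed, so the sequence repeats from the start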
When you build a binary to provide input to stdin for PractRand, do you just build in a fixed seed and output in an endless loop? I can’t find any docs that describe what it wants.
Never mind, I got it working. Trying some variations now.
I’m testing without the Z-machine’s +1 range peculiarity or the rejection loop, obviously.
Increasing the internal state to 128 bits and setting KEY to
0x0FC9_4E3B_F4E9_AB32_8664_58CD_56F5_E605
and the initial state to
0x9E37_79B9_7F4A_7C15
yields the following results (a sketch of this variant follows the listing):
RNG_test using PractRand version 0.95
RNG = RNG_stdin16, seed = unknown
test set = core, folding = standard (16 bit)

rng=RNG_stdin16, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.3 seconds
  no anomalies in 159 test result(s)

rng=RNG_stdin16, seed=unknown
length= 256 megabytes (2^28 bytes), time= 5.5 seconds
  no anomalies in 172 test result(s)

rng=RNG_stdin16, seed=unknown
length= 512 megabytes (2^29 bytes), time= 11.0 seconds
  no anomalies in 185 test result(s)

rng=RNG_stdin16, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 21.2 seconds
  no anomalies in 198 test result(s)

rng=RNG_stdin16, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 40.3 seconds
  no anomalies in 210 test result(s)

rng=RNG_stdin16, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 76.4 seconds
  no anomalies in 222 test result(s)

rng=RNG_stdin16, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 150 seconds
  Test Name                         Raw       Processed     Evaluation
  BCFN(2+0,13-0,T)                  R=  +8.3  p =  5.7e-4   unusual
  ...and 232 test result(s) without anomalies

rng=RNG_stdin16, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 294 seconds
  no anomalies in 244 test result(s)
I’d love to get results like this with only 64-bit state.
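Here’s roughly what that variant looks like as code; a sketch only, and taking the top 16 bits of the state is my assumption, by analogy with the 64-bit version:

const KEY: u128 = 0x0FC9_4E3B_F4E9_AB32_8664_58CD_56F5_E605;

struct Random128 {
    state: u128,
}

impl Random128 {
    fn new() -> Self {
        // Fixed odd starting point, per the constants above.
        Random128 {
            state: 0x9E37_79B9_7F4A_7C15,
        }
    }

    fn next_raw(&mut self) -> u16 {
        self.state = self.state.wrapping_mul(KEY);
        // Assumed output: the top 16 bits, as in the 64-bit version.
        (self.state >> 112) as u16
    }
}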
Yeah, I output in an infinite loop. For the seed, I just have it pick a fresh random seed from the OS every time it runs. In Rust this generally requires an external library (getrandom is the standard choice). That’s a bit annoying for quick hacks, so for 64-bit seeds and smaller I usually abuse the standard library’s hash randomization: std::hash::RandomState::new().hash_one(()) is not an RNG by any means, but it’s okay-ish for picking a non-deterministic 64-bit value once per process.
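To make that concrete, here’s a rough sketch of a feeder binary (assuming the Lehmer generator from upthread, raw outputs, no range mapping), piped as ./feeder | RNG_test stdin16:

use std::hash::{BuildHasher, RandomState};
use std::io::{self, Write};

const KEY: u64 = 0xDA94_2042_E4DD_58B5;
const MIX: u64 = 0x9E37_79B9_7F4A_7C15;

fn main() -> io::Result<()> {
    // Fresh non-deterministic seed per process via hash randomization,
    // forced odd as the Lehmer state requires.
    let mut state = RandomState::new().hash_one(()).wrapping_mul(MIX) | 1;
    let mut out = io::BufWriter::new(io::stdout().lock());
    // Stream raw little-endian u16 outputs until PractRand closes the pipe.
    loop {
        state = state.wrapping_mul(KEY);
        out.write_all(&((state >> 48) as u16).to_le_bytes())?;
    }
}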
You’re probably not gonna get that with a standard MCG or LCG. I don’t know of any studies for 16-bit output specifically, but for 32-bit and 64-bit output, the MCGs that pass tests with flying colors have around 64 more bits of state than they output.
The standard “fix” to get more mileage out of a given amount of state is to produce the output with a more complicated function than just taking some subset of the bits verbatim. Various xorshift variants do this on top of plain xorshift, and PCG does it on top of an LCG.
Of course, anything that involves a 64-bit multiply is still pretty painful for 16-bit targets. Doable, but no longer very cost-effective compared to other ways of updating the same amount of state.
Yeah, I noticed 16-bit output is not generally considered or tested.
Very true. This is mainly for my interpreter, which targets 32- and 64-bit machines. I unfortunately sort of hijacked this thread.
I have gotten one result I consider interesting. I cheated:
I used a 64-bit state but a 128-bit constant multiplier, and I perform a 128-bit multiplication. I update the state to the lower 64 bits of the result, but I return the uppermost 16 bits of the result (not of the state). That performs great until it fails at the 16-gigabyte test.
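In code, something like this, presumably (a sketch; the actual 128-bit multiplier isn’t given, so the constant from upthread stands in):

/// Stand-in 128-bit multiplier; the real constant used isn’t specified.
const KEY128: u128 = 0x0FC9_4E3B_F4E9_AB32_8664_58CD_56F5_E605;

struct Random {
    state: u64, // only 64 bits persist between calls
}

impl Random {
    fn next_raw(&mut self) -> u16 {
        // Widen the state and do a full 128-bit (wrapping) multiply.
        let product = u128::from(self.state).wrapping_mul(KEY128);
        // The new state is the lower 64 bits of the result...
        self.state = product as u64;
        // ...but the output is the uppermost 16 bits of the result.
        (product >> 112) as u16
    }
}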
Yeah, I gave up on those. I did find that xorshift64* works great with only 64 bits of state. It passed 128 gigabytes with no anomalies. It may have gone further, but I didn’t see the point.
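For reference, the standard xorshift64* step (Marsaglia shifts 12/25/27 plus Vigna’s multiplier); narrowing to the top 16 bits is my assumption, matching the rest of this thread:

/// xorshift64*: a 64-bit xorshift state with a multiplicative output scramble.
struct XorShift64Star {
    state: u64, // must be seeded nonzero
}

impl XorShift64Star {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.state = x;
        // The multiply only scrambles the output; the state update is plain xorshift.
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }

    /// 16-bit output for PractRand’s stdin16 format (assumed: keep the top bits).
    fn next_u16(&mut self) -> u16 {
        (self.next_u64() >> 48) as u16
    }
}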