I figured that this would be faster. However, I’m getting mixed results. For Glulxercise (the VM unit test), it runs about 9% faster (in both Firefox and Safari). However, Reliques of Tolti-Aph (our favorite I7-generated torture test) is about 9% slower on Safari. For Firefox, RoTA is about 5% faster when compiling code; it’s more or less a wash when executing cached code.
(I have not yet tested this with iOS Safari (mobile WebKit) – I really should, since that’s the real-life low-end platform.)
I guess the diagnosis here is that making main memory a byte array is faster when you’re running code that accesses main memory a lot. (Glulxercise does this, because it exercises every VM opcode on main memory, local variables, and the stack.) However, that’s not normal game code; the I7/I6 compiler is good about keeping most accesses in locals and on the stack. So in real life, the gain is outweighed by the overhead (presumably the shifting and masking needed to assemble 32-bit values out of individual bytes).
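To make the trade-off concrete, here’s a rough sketch of what a 32-bit read and write look like against a flat byte array. The names (memmap, read32, write32) and the memory size are illustrative, not Quixe’s actual identifiers; Glulx is big-endian, so every word access has to assemble or scatter four bytes:

```javascript
// Hypothetical sketch: Glulx main memory as a flat Uint8Array.
const memmap = new Uint8Array(1024);

// Glulx stores words big-endian, so a 32-bit read is four byte
// fetches plus shifts.
function read32(addr) {
    return ((memmap[addr] << 24)
        | (memmap[addr + 1] << 16)
        | (memmap[addr + 2] << 8)
        | memmap[addr + 3]) >>> 0;  // >>> 0 forces an unsigned result
}

function write32(addr, val) {
    memmap[addr]     = (val >>> 24) & 0xFF;
    memmap[addr + 1] = (val >>> 16) & 0xFF;
    memmap[addr + 2] = (val >>> 8) & 0xFF;
    memmap[addr + 3] = val & 0xFF;
}

write32(16, 0xDEADBEEF);
// read32(16) now returns 0xDEADBEEF
```

That per-word shuffling is cheap when the JIT likes it, but it’s pure overhead for code that mostly lives in locals and the stack.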
Anybody have experience with this?
Thinking out loud here:
- Byte arrays are more compact, I assume. Is the speed hit worth the reduced memory profile?
- I guess I could make the stack and locals into Uint32Arrays. That would be more work. (My existing stack code relies heavily on the array.push() method. Uint32Array is fixed-size and has no push method.)
- Crap, I didn’t array-ize the save-undo routine. I hope the overhead cost isn’t all there! (Quick test) Nope.
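On the second bullet: losing array.push() isn’t fatal; a fixed-size Uint32Array plus a manually maintained stack pointer gives the same interface. A minimal sketch (the names, the size limit, and the overflow checks are my own invention, not Quixe code):

```javascript
// Hypothetical sketch: a push/pop stack over a fixed-size Uint32Array.
const STACK_SIZE = 4096;
const stack = new Uint32Array(STACK_SIZE);
let sp = 0;  // index of the next free slot

function stackPush(val) {
    if (sp >= STACK_SIZE)
        throw new Error('stack overflow');
    stack[sp++] = val >>> 0;  // coerce to uint32, as the typed array does
}

function stackPop() {
    if (sp <= 0)
        throw new Error('stack underflow');
    return stack[--sp];
}

stackPush(7);
stackPush(42);
// stackPop() returns 42, then 7
```

The cost is that every push/pop site has to go through these helpers (or inline the sp arithmetic), which is the “more work” part.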