At least for me it is now slow-but-playable with Lectrote. A full-game regtest takes about 40 seconds for this version versus a little over two minutes for the old one. I’ve also fixed a couple of bugs.
It would be nice if people could test it in web browsers and on portable devices. Report any bugs on GitHub or in this thread.
There is still some really low-hanging fruit in terms of rewriting a little of the template code. I haven’t gotten around to building a profiling Glulxe yet, but I’d appreciate seeing any profiles anyone can generate.
Here is a profile report of a run through the entire game, with the default ten slowest functions. Here is a zip with all the files used to create it, including a game compiled without Ultra Undo (otherwise the profiling will not work), in case anyone wants to have a proper look at it. It is interesting that it seems to spend so much time in VM_Save_Undo.
In general, it seems that the slowest remaining parts are approaching and conversing.
Approaching is when you type GO TO HOSTEL rather than N. The slow part isn’t really so much the actual pathfinding as the fact that the game will walk you to your destination silently, step by step. GO TO HOSTEL might translate to “Silently try going N, E, N, E, E”. The single slowest command in the testing script for the game is asking Higgate how we might return a book. That will make both you and her approach the Language Studies Seminar Room, step by step. For each of these steps (one step for you, one for her) the game makes basically the same checks as when you walk into a room normally. That is what makes it slow. A lot of these checks might be unnecessary; there is already code that skips certain parts depending on “if the player is hurrying,” i.e. approaching. There might also be situations where it makes sense to just teleport the player to the destination rather than walking step by step.
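To make the cost concrete, here is a minimal sketch of the idea, not the game's actual Inform 7 code. The map, the room names and the `per_step_checks` number are all made up for illustration. The route is found once, but the full set of movement checks is paid again on every step of the walk:

```python
from collections import deque

# Hypothetical map: room -> {direction: neighbouring room}.
ROOMS = {
    "Park": {"N": "Street"},
    "Street": {"S": "Park", "E": "Square"},
    "Square": {"W": "Street", "N": "Hostel"},
    "Hostel": {"S": "Square"},
}

def find_route(start, goal):
    """Breadth-first search; returns the list of directions to walk."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        room, path = queue.popleft()
        if room == goal:
            return path
        for direction, nxt in ROOMS[room].items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [direction]))
    return None  # no route exists

def approach(start, goal, per_step_checks):
    """Walk the route room by room, paying the checks on every step."""
    room = start
    checks_run = 0
    for direction in find_route(start, goal) or []:
        checks_run += per_step_checks  # stand-in for the normal going rules
        room = ROOMS[room][direction]
    return room, checks_run
```

With two actors approaching, the per-step cost doubles. A teleport would amount to skipping the loop and paying the checks once at the destination, if the skipped rooms really have nothing to say.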
The conversation system still runs through a lot of quips every turn to see which ones are viable, i.e. allowed to be said. In theory that should not be necessary: it is basically a conversation tree of interconnected quips. It is analogous to room connections, and how the game doesn’t have to look through every room in the game each turn to see which ones you can go to. This might be quite a bit of work to change, though.
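A rough sketch of the difference, with invented quip names and the simplifying assumption that a quip is viable only if it directly follows the current one (the real rules in the game involve more conditions than this):

```python
from collections import defaultdict

# Hypothetical quips as (name, quip it directly follows) pairs.
QUIPS = [
    ("greet", None),
    ("ask about book", "greet"),
    ("ask about weather", "greet"),
    ("ask how to return book", "ask about book"),
]

def viable_by_scan(current):
    """Check every quip in the game each turn."""
    return [name for name, follows in QUIPS if follows == current]

# Built once, much like the exits of a room.
FOLLOW_UPS = defaultdict(list)
for name, follows in QUIPS:
    FOLLOW_UPS[follows].append(name)

def viable_by_links(current):
    """Look only at the quips linked from the current one."""
    return list(FOLLOW_UPS.get(current, []))
```

Both functions give the same answer, but the first one touches every quip in the game on every turn, while the second only touches the neighbours of the current quip.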
Is there perhaps a depth limit being hit when tracing VM_Save_Undo (which is from Ultra Undo, is it not?)? It looks like it’s 5 ops and a whopping 220 ms per call, but looking at it I’d assume it’s the child calls, which involve file I/O and serialising the entire game state, that make it take that long.
Just to clarify: the profiled game is not using Ultra Undo. The profiling code will not work with the @restart, @restore, @restoreundo, or @throw opcodes, so for testing purposes Ultra Undo was commented out.
Sorry, I don’t quite understand this. An undo save state of the game seems to be 128 K. The interpreter is copying this in memory 549 times. That is the kind of thing you’d think would be almost instant on a modern computer, but here it takes over two minutes. Is it doing some kind of compression on this or something? Is serializing the game state really that CPU-intensive?
The VM does very simple (run-length encoding) compression. Keep in mind that when you see a 128K undo state, that’s after compression. CF’s RAM use is a bit over 4 megabytes. The interpreter is running through that much data, comparing it to the original game file, and squishing it down to 128K.
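For anyone who wants to see the shape of that work, here is a simplified sketch in Python. It is not Glulxe's actual C code, and the function names are made up. It assumes the scheme is: XOR each byte of RAM with the original game file, then store each run of zero bytes as a zero followed by a count byte (run length minus one):

```python
def compress_ram(original: bytes, current: bytes) -> bytes:
    """XOR current RAM against the original, run-length encode the zeros."""
    out = bytearray()
    zeros = 0
    for i, byte in enumerate(current):
        base = original[i] if i < len(original) else 0
        diff = byte ^ base
        if diff == 0:
            zeros += 1
            if zeros == 256:              # a count byte holds at most 255
                out += bytes([0, 255])
                zeros = 0
        else:
            if zeros:
                out += bytes([0, zeros - 1])
                zeros = 0
            out.append(diff)
    if zeros:
        out += bytes([0, zeros - 1])
    return bytes(out)

def decompress_ram(original: bytes, packed: bytes) -> bytes:
    """Expand the zero runs, then XOR with the original again."""
    diff = bytearray()
    i = 0
    while i < len(packed):
        if packed[i] == 0:
            diff += bytes(packed[i + 1] + 1)  # that many zero bytes
            i += 2
        else:
            diff.append(packed[i])
            i += 1
    return bytes(d ^ (original[j] if j < len(original) else 0)
                 for j, d in enumerate(diff))
```

The output is small when little has changed, but the loop still visits every byte of RAM on every save, so the cost scales with the 4 megabytes, not with the 128K result.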
Could you try adding “#define SERIALIZE_CACHE_RAM (1)” to the glulxe.h header file and recompiling the interpreter? That might improve that.
Also, using Dannii Willis’s patch to make Glulxe profiling compatible with Ultra Undo, it is clear that it makes no difference performance-wise to write the game states to disk instead of keeping them in memory.
I think the discussion was based on Angstsmurf’s suggestion that the performance hit amounted to 75 seconds on the Counterfeit Monkey test case. It may be a case of a really small, negligible hit adding up, as it seems like the cost is paid on every read/write of a byte, 16-bit value, or 32-bit value from the game’s address space.
As I understand it, VERIFY_MEMORY_ACCESS changes the interpreter rather than the game; is that out of scope for the project? Also, I note that github.com/DavidKinder/Git claims to be a faster interpreter. Is it? And if the interpreter can be changed, what kinds of solutions can be considered? Could, say, hardware memory protection be leveraged, at least where available?