Git redux

Hi all! I wrote the Glulx interpreter “Git” back in the day (David Kinder now maintains it) and I’m thinking of revisiting it. It could be made a lot faster, and almost as importantly, it needs a new name!

Do we really still need a speed improvement for Glulx? I think so, because even though computers keep getting faster, there are a number of factors working in the other direction:

  • People want to play games on the web
  • People want to play games on mobile devices
  • Inform 7’s library is much bigger and more complex than Inform 6’s
  • Some game authors are also writing bigger and more complex games

“Counterfeit Monkey” is a great game, but it feels sluggish on a desktop machine, and it’s really slow on the web or on an iPad. Would there be value in a 10x speedup? Sure!

For web-based IF, there’s Quixe, which is already pretty sophisticated—I don’t see any big gains to be made there. But Javascript is just not that great as a target language, so it’s always going to run a bit slow. It might be worth exploring a server-side solution as an alternative: run the game in a fast native interpreter on a beefy server, while the player’s web browser just handles the I/O and graphics. (TADS 3 uses this approach.)

Okay, how about native code? What’s needed is real JIT compilation (Git is just a faux-JIT; it translates Glulx bytecode into a simpler internal format), but JIT is hard, so we need to leverage a library. The two obvious ones are LLVM and Java 7: both of those do really fancy runtime optimization, and can run interpreted code at near-native speed. Either one would be a fun project.
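To make the "simpler internal format" idea concrete, here is a minimal sketch of pre-decoding. It uses a made-up four-opcode stack machine, not Glulx opcodes and not Git's actual internal format; all names are invented for illustration. The bytes are parsed once into (handler, operand) pairs, so the run loop never has to decode them again. It only handles straight-line code.

```python
# Toy opcodes, invented for this sketch (these are NOT Glulx opcodes).
PUSH, ADD, MUL, HALT = 0, 1, 2, 3


def predecode(bytecode):
    """Decode raw bytes once into (handler, operand) pairs."""
    def push(stack, n):
        stack.append(n)

    def add(stack, _):
        stack.append(stack.pop() + stack.pop())

    def mul(stack, _):
        stack.append(stack.pop() * stack.pop())

    handlers = {ADD: add, MUL: mul}
    program, i = [], 0
    while i < len(bytecode):
        op = bytecode[i]
        if op == HALT:
            break
        if op == PUSH:
            # PUSH carries a one-byte immediate operand.
            program.append((push, bytecode[i + 1]))
            i += 2
        else:
            program.append((handlers[op], None))
            i += 1
    return program


def run(program):
    """Execute the pre-decoded program; no byte parsing happens here."""
    stack = []
    for handler, operand in program:
        handler(stack, operand)
    return stack


# (2 + 3) * 4
result = run(predecode(bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])))
```

A real JIT would go one step further and emit machine code instead of a list of handlers.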

But take a step back; do we want to install a huge and heavyweight interpreter just to play Glulx games? Probably not. And neither LLVM nor Java 7 is going to work well on the iPad, which a lot of people want to use. (Getting them to work on Android wouldn’t be a picnic either.) Looking around a bit, I found that GNU Lightning is still going—there’s even a new maintainer working on Lightning 2.0. This is a really small and lightweight JIT library; it doesn’t do any optimization, but it’s possible to get pretty good results (Racket uses it: see http://benchmarksgame.alioth.debian.org/u32/benchmark.php?test=all&lang=racket&lang2=gcc for benchmarks); and I think it can be made to work within the restrictions of iOS.

So I’m leaning towards Lightning. It seems like the best approach for iOS, and could be useful in a server solution too. On the other hand, if people are more interested in server-side IF, I think Java is the best approach for that (more secure, and easier to deploy on services like App Engine).

What do you think? Worth doing? Interested in contributing?

Iain

Unfortunately, GNU Lightning doesn’t support x86_64 yet, which would be a setback (Wikipedia says it does, but it refuses to build on my 64-bit machine). I’m not sure about the status of its ARM port given the new maintainership.

I checked on the mailing list, and it looks like there is some support for ARM now: lists.gnu.org/archive/html/light … 00006.html

But it’s clearly not a super active project, and I haven’t tried it out yet. It is reassuring that Racket seems to be using it successfully, but they’re using an older (and forked) version.

Not sure I care so much about x86_64 right now; for our purposes, what’s wrong with 32-bit?

To clarify, I feel like 64-bit desktops don’t have much of a problem right now, at least when running Glulxe locally. It’s little ARM tablets and web browsers that are the problem.

Ah good to hear that the ARM port is out now (I didn’t check too deeply). As for 64-bit desktops, I understood that it was not your primary concern but I thought I’d bring it up anyway.

(I should add that I’m not really qualified to give really good feedback on your real questions. I think overall it would be a welcome thing, esp. since I use an ARM netbook for IF-ing and it is, indeed, slow)

It’s a data point! Are you using Windows or Linux? And how do you feel about a native app versus a web app?

I’m not sure that’s right. Take a look at asm.js. asmjs.org/spec/latest/

Mozilla has a prototype implementation of a hyper-fast asm.js AOT/JIT compiler, as well as a plugin to make LLVM/Emscripten generate asm.js-compliant code. IMO, this will probably be standardized instead of PNaCl. asmjs.org/faq.html

I definitely disagree with this.

IMO, getting a fast LLVM-based IF interpreter would be the best bet; it would work with Emscripten, so it would work on iOS, and it will only get faster over time.

An unrelated idea might be to try to develop a good perf profiler for complex games like Counterfeit Monkey, to figure out where all the time is being spent and optimize it with caches/indexes.

Huh, interesting! I’ve just started hearing about asm.js but hadn’t looked into it much.

One question I have that isn’t addressed in that FAQ—I guess it’s an Emscripten question—is whether one’s LLVM code can generate further LLVM code at runtime, and have that be compiled and optimized (presumably into high-quality asm.js, and then that gets compiled and optimized for the machine). In other words, is the LLVM JIT API available in Emscripten? I was figuring on using runtime JIT for Glulx.

Maybe it wouldn’t be needed, though. You could maybe translate all the Glulx code up front, although that would be tricky to get right. Or maybe you could translate it into asm.js rather than LLVM IR code at runtime. In fact that second approach could be done in Quixe—maybe that’s the route to getting a big speedup there.

Silly thought: asm.js as an additional back-end for Lightning.

As mentioned on ifMUD, Zifmia is pretty stable these days. It’s been on my “to do” list to make something public so people can grab the Zifmia Support I7 extension, compile their glulx game, upload the ulx file, and play the game in the web interface.

I should have some time for this sometime soon. I have all of the working parts. I just need a clean website and a safe upload facility. I’m not sure if a ulx file could be made to hack my server. I think that’s not possible, but I’m not sure. It’s running in managed code so I think it’s safe.

The only caveat is that your game cannot use glk extensions. It has to be the basic out of the box glulx. The normal channels of FyreVM like Main, Location, Score, Turn, Time are all built-in and special coding is not required. You just need to add the Zifmia Support extension and compile.

Maybe this week. Probably next.

David C.
www.textfyre.com

OK, that idea is crazy but might actually work.

  • Write a C++ Glulx interpreter that uses Lightning to generate code dynamically.
  • That should immediately give a faster desktop interpreter (though maybe not 64-bit).
  • That should also give a fast Android interpreter.
  • If it can be written in a way that’s acceptable to Apple, great! Native iOS terp.
  • Update Lightning to emit asm.js “machine code”.
  • Compile the whole thing with Emscripten, and that’s a fast JS terp.
  • PROFIT

Oh, and the JS terp could be used server-side with Node too—a bit safer than running a native app. Man, I’m starting to buy into this whole Javascript business now.

David, I registered on Textfyre and played a bit of Shadow in the Cathedral, and it’s a pretty decent experience! I’m getting maybe 1000-1500ms lag between prompts. Do you know how much of that is network latency, how much is stuff like database access, and how much is the actual game code? If it was possible to pare it down to 500ms I think it’d feel really slick.

That’s a very good idea. The trick would be to connect the interpreter with a symbol table, so it can report the Inform function names that are eating up time. I think Inform can generate debugging symbols but that’s all I know about it.
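A minimal sketch of that symbol-table lookup, assuming the interpreter can hand over a list of sampled program-counter values. The addresses and routine names below are invented, and a real table would have to be parsed from Inform's debug output, which this sketch does not attempt. Each sample is attributed to the routine with the nearest start address at or below it.

```python
import bisect
from collections import Counter

# Hypothetical symbol table: (start address, routine name), sorted by address.
# These values are made up; real ones would come from the compiler's debug file.
SYMBOLS = [
    (0x0100, "Main"),
    (0x0240, "ParseCommand"),
    (0x0900, "FindVisibleObjects"),
]
STARTS = [addr for addr, _ in SYMBOLS]


def routine_for(pc):
    """Return the routine whose start address is the closest one at or below pc."""
    i = bisect.bisect_right(STARTS, pc) - 1
    return SYMBOLS[i][1] if i >= 0 else "<unknown>"


def profile(pc_samples):
    """Count how many sampled program-counter values fall inside each routine."""
    return Counter(routine_for(pc) for pc in pc_samples)


samples = [0x0100, 0x0250, 0x0950, 0x0951, 0x0A00, 0x0050]
report = profile(samples)
```

Sorting `report.most_common()` would then give the "which functions are eating up time" list, at least to the accuracy of the sampling.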

I don’t know much about Emscripten, but I think it does not offer a runtime JIT API. sns.cs.princeton.edu/2012/04/jav … y-scripts/ “We also had to disable all assembly routines and just-in-time (JIT) compiling features of SpiderMonkey, since assembler is not available in JavaScript.”

Yes, indeed. Gnusto did this for Z-code with great results; adding a JIT-to-JS system for Quixe seems like a great idea (assuming it hasn’t already happened; I’m really unfamiliar with Quixe’s internals).

The asm.js FAQ does explicitly state that you can dynamically invoke the asm.js compiler:

I’ve lost track of the goal here. IMO, git is already fast enough on x86 desktops, so I don’t see the benefit of Lightning. If you want a fast interpreter on Android, I bet you could compile git for ARM with a few tweaks and run it with Android’s NDK; I bet it would run Counterfeit Monkey well enough on a Nexus 4.

If you want to JIT on iOS or Windows RT, your only option is to JIT to JS. IMO, the easiest path to that is to add JIT to Quixe, a la Gnusto for Glulxe. (Guixe.) That would also work on Android, and on desktop web browsers, which would make large games like Counterfeit Monkey and Blue Lacuna more accessible to more players on more devices. (Nothing to install.)

FyreVM has a profiler. I have no idea how to use it. It just sits there, neglected. (vaporware wrote it).

David C.
www.textfyre.com

Shadow is a huge game and although it has a bunch of Kindle-induced optimizations, it’s still a pretty big game. If I had to do it over, we’d build it and test it on Zifmia at the same time and avoid as many issues as possible.

Most small and average-sized games would run much more quickly.

The biggest performance hit is when it reloads the engine (serializes from a binary file).

David C.
www.textfyre.com

If you’re comfortable in JS you can manually write JS and get all the benefits of asm.js – this is what I’m planning to do with my ifvms.js project. You may even get better results than emscripten (though I’ve seen reports that compiling a C JS engine in emscripten and running your JS in that is faster than running it normally, which makes no sense!)

Quixe already uses a JIT, though in my opinion it has limited potential for speed improvements. For best performance you want to minimise the number of times that you switch between running your JITted bytecode and the engine’s support code. My ifvms.js code builds a syntax tree as it disassembles, which can then be manipulated. I use this to identify the opcodes for loops so that real JS loops can be emitted, rather than "return"ing to the support code and having it run the JIT fragment repeatedly. The second problem is (bytecode) function calls. In general we must manually call functions so that we can send control to Glk and so that we can serialise the stack when saving. It should be possible, however, to identify which functions are non-halting and call them directly in JIT code so that we can avoid all the support code overhead. (Non-halting in the “does not call @save or @glk” sense, not the infinite-loop sense; I know that’s impossible!) This idea is basically a generalisation of accelerated functions: any function that fulfils the requirements in the spec for accelerated functions would be automatically identified and called without overhead.
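One way the "non-halting" check could be sketched is as a fixed-point pass over a call graph. This is an illustration under assumptions, not how ifvms.js or Quixe does it: the input format (a dict of opcode names and callee names per function) is hypothetical, and only @glk and @save are treated as blocking, as in the description above. Any call whose target is unknown is conservatively treated as unsafe.

```python
def find_direct_callable(functions):
    """Return the functions that never reach @glk or @save, directly or via callees.

    functions maps name -> {"ops": set of opcode names, "calls": set of callee names}.
    This input shape is an assumption made for the sketch.
    """
    blocking = {"glk", "save"}
    # Start with the functions that use a blocking opcode themselves.
    unsafe = {name for name, f in functions.items() if f["ops"] & blocking}
    changed = True
    while changed:
        changed = False
        for name, f in functions.items():
            if name in unsafe:
                continue
            # Calling an unsafe or unknown function makes the caller unsafe too.
            if any(c in unsafe or c not in functions for c in f["calls"]):
                unsafe.add(name)
                changed = True
    return set(functions) - unsafe


# Invented example program.
example = {
    "Add":      {"ops": {"add", "return"},  "calls": set()},
    "Sum":      {"ops": {"call", "return"}, "calls": {"Add"}},
    "Print":    {"ops": {"glk", "return"},  "calls": set()},
    "Describe": {"ops": {"call", "return"}, "calls": {"Print", "Sum"}},
    "Recurse":  {"ops": {"call", "return"}, "calls": {"Recurse"}},
    "Dynamic":  {"ops": {"call", "return"}, "calls": {"<computed>"}},
}
safe = find_direct_callable(example)
```

Here `Add`, `Sum` and `Recurse` come out as directly callable, while `Describe` is tainted by `Print` and `Dynamic` by its unknown call target.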

Doing all of that through emscripten could be tricky…

I’ve been holding off making the Glulx version of ifvms.js because I wanted to write my own Glk implementation first, but I’m thinking maybe I should keep using Zarf’s. If anyone wants to help with that (or with Parchment in general), I’d love some contributors!

So, returning to this thread after a PR-IF evening…

(a) Yay.

(b) I6 generates debug symbols. (EmacsUser just submitted a big patch to change over from the old I6 debug file format, which was binary, to a way-more-verbose XML format.)

(c) I agree that for very large games, some degree of profiling and optimization is necessary. I have consciously been building Hadean Lands with speed in mind, since it’s aimed at iOS devices. The current version takes slightly more than a second per move, even for complex, multi-stage, goal-seeking actions. I am happy with this.

(d) Dannii has summed up the current JS JIT situation. I think that Quixe could be improved to have the same sort of “use real loops” “identify simple function calls” approach, but it’s just an idea at this point.

(e) It is somewhat difficult to pre-compile an entire Glulx file. You need a list of all the function entry points (from the debug-symbol output, I guess). Then you need to identify every possible code path in a function. In theory a function could include a computed jump, which is impossible to precompile. I don’t think I6 ever does this by itself, though.
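A rough sketch of the "identify every possible code path" step, using a toy instruction format rather than real Glulx encoding: each address holds one (opcode, operand) pair and the opcode names are invented. Starting from a function entry point, a worklist follows jumps, both sides of conditional branches, and fall-through, and records any computed jump as a spot where static tracing has to give up.

```python
def trace_function(code, entry):
    """Walk the statically reachable instructions of one function.

    code maps address -> (opcode, operand); addresses step by 1 in this toy format.
    Returns (reachable addresses, addresses of computed jumps).
    """
    reachable, computed = set(), set()
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        if addr in reachable or addr not in code:
            continue
        reachable.add(addr)
        op, operand = code[addr]
        if op == "return":
            continue
        if op == "jump":
            worklist.append(operand)
        elif op == "jump_computed":
            # Target is only known at run time, so tracing stops here.
            computed.add(addr)
        elif op == "branch":
            # Conditional: either the target or the next instruction runs.
            worklist.extend([operand, addr + 1])
        else:
            worklist.append(addr + 1)
    return reachable, computed


# Invented example function.
code = {
    0: ("load", None),
    1: ("branch", 4),
    2: ("add", None),
    3: ("jump", 0),
    4: ("jump_computed", None),
    5: ("add", None),
    6: ("return", None),
}
reachable, computed = trace_function(code, 0)
```

In this example addresses 5 and 6 are never reached statically; they might be the real target of the computed jump at 4, which is exactly the case that would need a runtime fallback.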

(f) On the other hand, you could start up a game like Counterfeit Monkey, play three turns, and I bet you’d have a record of 98% of the critical code paths.

I’m running Linux (plain old Debian). I think a native application is usually preferable for me.