Git redux

I want to embed WebKit for a Glk implementation so there won’t always be a big difference :wink:

Some of this sort of work has been done: Glulxe can be built with profiling support, so that you can see where the time is going in terms of Inform 6 routines. A few years back there was work to profile a few large games (I used Alabaster and Eric Eve’s Nightfall, as I recall), which led to the Glulx acceleration opcodes being added (see!topic/ to the Glulx specification.

The current state of the acceleration opcodes pretty much represent as much of a win as is probably possible by speeding up the low level Inform 6 routines. Profiling after adding acceleration indicated that nearly all the remaining time was being spent inside Inform 7 generated code, so any further gains from profiling will most likely involve optimizing Inform 7’s algorithms.

The goal is just to be faster, hopefully significantly faster. There are definitely a lot of smaller and/or older devices that would benefit. And I’d draw a comparison with Javascript—that was “fast enough” 5 years ago, but when V8 initiated the Javascript Wars, that enabled a bunch of cool new applications.

As to the approach I outlined, I like the idea of doing something that could address a few different needs at once, even if some of them are a bit speculative (especially the iOS native app).

Wrap it twice and maybe it’ll get even faster! :slight_smile:

I’m definitely not comfortable in plain JS, so I’ll leave that to you! (I tried to contribute a bit to Quixe, but I don’t think I actually saved zarf much time.)

Git gets pretty good leverage out of a cheap-ass version of the same idea—it doesn’t build a syntax tree, it just remembers a label for each Glulx opcode. Then for branches with immediate offsets, it just emits a “goto” into the gitcode. For function calls, it could emit a direct jump to any function that’s already compiled (which will be many of them, if you recursively compile any newly-encountered functions); can’t remember if I got around to doing that.

Building a syntax tree definitely enables a bunch of other optimizations, like transforming the stack into local variables. Though opcodes like “stkroll” get in the way of that, and I don’t know how common they are. That’s why I was initially thinking of LLVM or Java; just directly translate Glulx into stack-based code and let their clever optimizers figure it out.

I’ve been trying to think if there’s useful overlap between what we’re working on. Maybe benchmarks and test harnesses—I was a bit slapdash with that for the original Git, and this time round I’d like to have a a good suite of test programs. And it might be useful to have something I’m going to call “Traceglk”, which can capture and replay a program’s entire Glk interaction, rather than just using Cheapglk.

Yep. I think I’ll save the whole-game-precompilation idea as a backup plan. I think there is value in sweating that 2% and handling every case, just as JS interpreters today have to allow for all sorts of crazy edge cases, because JS is crazy. I still feel a little bad for punting on 1- and 2-byte locals in Git; they weren’t impossible to support, just really tedious. (On the other hand, I’m so glad you deprecated those!)

Better still, just plug the stack pushing instruction into the stack pulling instruction and eliminate it entirely!

I think Zarf has tests pretty well covered so far. As to tracing glk interaction, RemGlk might be able to substitute?

Hey, it’s not too crazy! I only have a few compatibility things in ifvms.js, and they were really easy to fix. Things get more complex when you involve the DOM, but that’s what jQuery is for. :wink:

Speaking of which, can we work together to port GlkOte to jQuery? I made a start at it but I couldn’t figure it all out. Parchment’s Glulx support is faulty because I’m trying to use jQuery and Prototype at the same time.

Oog, yeah, that needs to be done. Prototype was a great idea once.

Oh, I didn’t mean your interpreters, I meant interpreters of Javascript—V8 and the like. I’m sure your code is polite and well-behaved but there’s all sorts of horrible stuff that an obnoxious JS app could do. :slight_smile:

I started converting the code in a branch:

Just a beginning, so far. Does not in any way work yet.

Awesome! Thanks so much.

Okay, I think I settled on an approach. I’ve been reading around a bit and LuaJIT seems to have absolutely startling performance, like half the speed of C, which is completely ludicrous for a language like Lua. It’s much smaller than LLVM and more portable. (Thanks to hjalfi on for the tip.)

I had figured on mostly using C or C++, but to minimise context switching, it makes sense to actually write most of the interpreter in LuaJIT itself (using some low-level intrinsics, so it won’t be portable Lua code). Essentially the Lua analogue of what Dannii is doing in Javascript. Not what I had in mind at all, but the potential performance seems so good that it’s hard to pass up.

So this is maybe a little redundant with the JS approach, but I think LuaJIT is going to be quite a bit faster on desktop and on Android, compared to running locally in Node, at least until asm.js takes off everywhere. And what the hell, it’s fun and I get to learn Lua.

Not sure if you’re familiar with RedBox, but their kiosks’ GUI is all in Lua embedded in C#.

I personally think Lua is evil, but one can not argue with large performance gains and learning new things.

David C.

pypy is getting faster every day too, you should check it out as well Iain!

Okay, some code starting to appear here:

I’m not going to analyse the branches and reconstruct high-level loop constructs—I’ll just emit spaghetti GOTOs, which LuaJIT should handle happily. Initially I’ll do all the stack operations explicitly, so it’s always in a suitable state for save and undo. Then as an optimization, well-behaved functions (fixed stack usage, no indirect branches, no calls except to other well-behaved functions) will store the whole Glulx stack frame in Lua local variables. I have one unfair advantage over Quixe: I can just have all Glulx calls be real Lua function calls, without worrying about blocking the main browser thread.

Lua is a really interesting language! There’s a lot of implicit and dynamic stuff going on, so it’s easy to make mistakes—more so than in Python, I’d say, and maybe almost as much as in Javascript. (In fact learning Lua might help me finally get my head around JS.) But it’s also really expressive. You can write in an OO style that feels very close to TADS, in both syntax and semantics; so it might make a decent language for IF, except I’m not sure if it’s possible to snapshot the program state.

Yeah, if your language supports GOTO then there’s no need to reconstruct loops. I think someone, probably Zarf, had experimented with faking GOTO in JS using SWITCH statements, but I think he concluded it was more trouble than it was worth.

I haven’t touched Lua in a while, but it looks like you’re having fun! From the inclusion of remglk are you thinking of making gift output JSON glk instructions?

I don’t know what Zarf was playing around with, but although JS doesn’t have “goto” it does have labels which can sometimes do the same work. Somebody actually wrote a preprocessor for JavaScript to add goto support.

I haven’t even looked at RemGlk yet, I just included it in my bundle of stuff! I’m vaguely thinking it’ll be useful either for server-side code, or for scripting Glk when running the tests. Long way off either of those though.

It’s crazy that Javascript doesn’t have GOTO. I wonder how Emscripten compiles branchy code? It looks like you could maybe do a certain amount with labeled break/continue statements. I’m not sure if that’s fully general, but it should work if all the branch paths are hierarchical, and I’d expect Inform-generated code to satisfy that condition.

The good thing is that Inform code rarely needs gotos… there are only a few places in the parser which contain them. So a few optimisations can turn if statements and loops into nice code for JS.

As for emscripten, maybe there is a LLVM mode that avoids gotos?

The general construct for faking GOTO in a language which doesn’t have it is a switch statement in a loop:

// start of function
int label = START;
while (1) {
switch (label) {
case START:

case LABEL1:

case LABEL2:

break; // end of function

Wrap every goto-containing function in one of these, and then your goto idiom is “label = LABELN; continue;” It doesn’t handle jumps between functions, but as far as I know no C-like language does.

This is of course related to the infamous Duff’s Device, c.f.

It looks like the I6 library has a moderate number of gotos, and I7 generates more of them in some cases – for example, any

say “foo[if X]bar[else]baz[end if].”

So it’s not going away, sadly.

I did try this in Quixe, but it was just based on the branch and jump opcodes in the Glulx file – I didn’t try to reconstruct high-level loops. I think the test showed that it was slower than the model I’ve got now… but (a) that was early in development, so maybe things shifted; (b) that was several versions of Safari and Firefox ago, so maybe their optimizers have improved.

(Also © the stack-to-local-variable optimization I implemented is easier, albeit less ambitious, when you’re working with short loopless code fragments.)

I tracked down the Emscripten docs to satisfy my curiosity:

It’s exactly as zarf describes, a big switch statement with a variable to hold the next label. Interesting read! And some of the notes on optimization are very relevant to this discussion. It doesn’t bother doing any optimizations that the original C-to-LLVM compiler should have made, so “that leaves two major optimizations that are left for Emscripten to perform:”

That Relooper sounds like the same trick Dannii is using. Nifty.

I’ve been looking at asm.js a little more, and one of the things that might need to change for our JS interpreters to use it would be for the stack and main memory to be stored in a single memory buffer. This probably isn’t too dissimilar to how most C/C++ interpreters do it however. If you preallocate memory for the stack and locals, how much do you use?