Parchment update: ifvms.js ZVM new default engine

Hello everyone!

Tonight I have uploaded my new Z-Machine implementation to Parchment. I would appreciate your help in testing it. To switch to ZVM, just add &vm=zvm to the end of a normal Parchment URL. Once I’ve fixed a few minor things, and fixed any bugs you help find, I plan to make ZVM the default Z-Machine implementation in Parchment, with Gnusto kept only as a backup (for version 3 etc.)

Some things aren’t supported yet:

  • Colours/styles in the status window
  • A few rarely used opcodes (one of which has never been supported by Gnusto): @catch, @throw, @get_cursor
  • Z-machine versions other than 5 and 8 (I have no plans for the other versions, sorry, but I do have plans for Glulx! :wink: )
  • Quotation boxes won’t be displayed
  • The main window’s width is too wide

I also have some questions for the experts among you:

  • What do the operands of @throw mean?
  • The monospace header bit is clearly independent of other UI stuff. Are @set_text_style and @set_font also independent? Or, would @set_text_style 8 then make @set_font return 4?

Hi Dannii,

I have problems getting your example running on Chrome 15 (Uncaught Error: INDEX_SIZE_ERR: DOM Exception 1 in /?story=stories/troll.z5&vm=zvm:13)

as for @throw:

As I understand it, @catch stores a value identifying your current stack frame. That value is then given to @throw as its second operand, together with a routine return value as its first, which is used as the return value once the stack has been unwound back to the saved frame.
(This is how I implemented it in ZMPP.)

Hope that was not too confusing of a sentence :wink:,


Very cool. What are the benefits compared to Gnusto?

To augment Wei-ju’s response: if you implement Quetzal, @catch is required to return the number of frames on the call stack (it uses the phrase “system stack”, but it seems obvious that the call stack is meant). So @catch in Main() returns 1; if you call another function, @catch in that function returns 2, etc. See §6.2 of Quetzal 1.4.
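Putting the two answers together, the Quetzal-style semantics can be sketched like this. This is a hypothetical illustration, not ZVM's or ZMPP's actual code; the function names and frame representation are invented:

```javascript
// Sketch of Quetzal-style @catch/@throw: @catch returns the number of frames
// on the call stack, and @throw unwinds back to that depth and then returns
// a value as if the routine at that depth had returned it.
var callStack = [];            // one entry per active routine call

function callRoutine(name) {   // push a frame, as a routine call would
    callStack.push({ routine: name });
}

function opCatch() {           // @catch: the current call-stack depth
    return callStack.length;   // 1 in Main(), 2 one call deeper, etc.
}

function opThrow(value, frame) {
    // @throw: discard frames until the stack is 'frame' deep again,
    // then return 'value' from the routine that did the @catch.
    if (frame < 1 || frame > callStack.length) {
        throw new Error('Bad frame in @throw: ' + frame);
    }
    callStack.length = frame;  // unwind deeper frames
    callStack.pop();           // return from the catching routine itself
    return value;
}
```

So with Main() → f() → g(), a @catch in f() yields 2, and a later @throw 99 2 from g() drops back to Main() with 99 as f()'s return value.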

I see nothing in the standard linking text styles and fonts, and I did not implement it that way. More importantly, the gold standard, Windows Frotz, does not appear to link them, which is as close to a de facto answer as you can get when the standard itself is silent.

Thanks for your comments everyone. I thought @throw worked that way. Gnusto has implemented it wrongly.

I’ve now more thoroughly tested it in Chrome and have pushed an update so that it works in Chrome, FF and IE. However I haven’t tested every file in existence, and the test suites aren’t comprehensive enough yet, so, if you use Parchment, I would appreciate you trying ZVM next time you use it.

Juhana, ifvms.js is a new way of writing JIT VMs, by using Abstract Syntax Trees. My plan is that there will be a common core, which ZVM and a future VM for Glulx and maybe even one for TADS would be built on top of. The big advantage is that an AST is much more manipulable than the JIT systems of Gnusto and Quixe, which just write out code to run. Instructions can easily be reordered, and structures like loops can be identified and converted into real JS loops, rather than a series of branches and jumps. Previously, if you had a loop, the JITer would have to stop before and after it, so that you would have three code sections. ifvms.js can compile a loop in place, with the code before and after it, and a JS loop in the middle which will run itself, rather than the VM framework telling it to run multiple times.

Here’s an example of the code it generates, which has a while loop inside a block if statement. This would be close to impossible under Gnusto. It’s not pretty code, but it’s fast.
(But not fast enough. The .incdec() function needs to be inlined, and the push/pop can be optimised too.)

if(!(!(m.getUint16(2985)==1))) {
	l[5] = 0;
	while(!(!(e.U2S(l[5])<8))) {
		s.push(m.getUint16(3559+2*e.U2S(l[5])));
		m.setUint16(l[0]+2*e.U2S(l[5]), s.pop());
		e.incdec(6,1)
	};
	m.setUint16(2985, 0);
	e.ret(1);
	return
};

Dannii, apologies if this is off-topic, but the other day I was showing someone an IF (“Cold Iron,” FWIW) on the iPad and it froze up on a “press any key to continue” event. Apparently on the iPad’s browser, you bring up the keyboard by clicking in a text field, and since there’s no text field in a “press any key to continue” event, there was no way to bring up the keyboard and press a key without losing focus – at least not one we could think of.

Do you know a way around this?

(I was actually a little surprised that we got as far as we did on the iPad’s browser, because I’d been under the impression that it just didn’t play nice with Parchment at all.)

Turns out there’s an invisible text entry field, either right at the last character printed in the status window, or at the bottom edge of the story window. If you feel around you can find it.

(I’ve done this with . I haven’t tested it with Cold Iron.)

Hm. May I suggest that it might be a little more user-friendly to make the field less invisible, or to provide some other workaround? I guess that the workaround would show up whenever anyone uses Parchment, even if they don’t need it, but what with iThings being all the rage now it seems like it might be worth it for outreach’s sake.

(I don’t have an iPad so I can’t look for the text field myself.)

(Definitely OT, but I was impressed at how much of my friend’s totally naive input Cold Iron understood. Some of it was built-in I7 parsing, like “pick up the book,” but I didn’t have to say “No, the game won’t understand you unless you say this exactly” as much as I expected.)

Making it mobile friendly is something I’m planning, but I’ve been waiting to get the new VM to the working stage first. I have an Android phone, but they’re similar enough that I expect fixing it for Android will fix it for Apple devices too.

Cool, thanks.

I don’t have a CS background, and I don’t know how to formulate these questions in a way that Google could answer, so I thought I’d ask you.

It seems like the naive approach to a VM would be to load the data and then just start running opcodes. Read the current one, call the appropriate (native code) function to handle it & update the global state, and repeat until finished. (I believe this is called a bytecode interpreter.)
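That naive dispatch loop can be sketched in a few lines. This is a toy machine with made-up opcodes, not real Z-code, just to make the shape concrete:

```javascript
// Minimal bytecode interpreter loop: read the current opcode, dispatch
// through a switch to a native handler, update state, repeat until halt.
// Toy opcodes: 0 = halt, 1 = push the next byte, 2 = add top two values.
function interpret(bytecode) {
    var pc = 0, stack = [];
    while (true) {
        var op = bytecode[pc++];
        switch (op) {                   // one trip through the switch per opcode
            case 0:
                return stack.pop();     // halt: result is top of stack
            case 1:
                stack.push(bytecode[pc++]);  // push literal operand
                break;
            case 2:
                stack.push(stack.pop() + stack.pop());  // add
                break;
            default:
                throw new Error('Unknown opcode ' + op);
        }
    }
}
```

For example, `interpret([1, 2, 1, 3, 2, 0])` pushes 2 and 3, adds them, and halts with 5. Every single opcode pays the dispatch cost, which is exactly the overhead the JIT approaches below avoid.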

I infer from your efforts (and the use of JIT systems in Gnusto and Quixe) that this is inadequate, presumably for performance reasons. It seems like the JIT capabilities in modern JavaScript VMs would optimize the native code functions, which implies that the bottleneck must be caused by the overhead of running every opcode through a large switch statement.

So Gnusto / Quixe improve the situation by being more clever: instead of calling one function per opcode, they look ahead to see what the next opcode will do, and they keep looking until they find an operation with unpredictable results - like a branch that depends on a value known only at runtime. Then they write out a new function which does the work of the collected opcodes as a text string, and hand that string off to the JS eval() function for execution.

The new function benefits from all the optimizations that the JIT compiler can bring to bear, and because it is somewhat longer, there is more for the compiler to work with. That might yield better results, but in general the savings comes from staying away from the switch statement.

Assuming I understand that correctly, it seems like IFVM improves the situation by moving up a level of abstraction - reordering and replacing instructions with a more efficient JS representation. You still write out code for eval to run, but it’s better code - the kind that the JIT compiler knows how to optimize.

Is this analogous to what emscripten does with its “relooper” algorithm? Except with Z-Machine bytecode instead of LLVM bitcode?

It seems like the logical extension of this would be to dispense with any “runtime” interpretation and instead to decompile the bytecode into a set of equivalent JS functions, which could be executed independently from the original bytecode.

Is that a viable technique? Is that what you’re doing here?

That’s correct. They go one step farther too: the string that gets eval’ed is a Javascript function definition, not just a sequence of operations. The compiled function gets cached, using the first opcode address as a key. So future execution of those opcodes doesn’t have to get recompiled.

The number of opcodes that get compiled into a single JS function is hairy. You’re roughly correct that it goes until the next branch. However, you can be more or less clever about it. In Quixe, I continue along the non-branchy path of conditional jumps. I stop at unconditional jumps, function calls, function returns, and a few other cases too obscure to be interesting.
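The compile-and-cache scheme can be sketched like this. It is a hedged illustration, not Quixe's actual code: the two toy opcodes and the state object are invented, but the shape — build a function body as a string, compile it, cache it by the address of its first opcode — is the technique described above:

```javascript
// Compile a straight-line run of opcodes into one JS function, and cache
// the compiled function keyed by the address of its first opcode, so the
// same opcodes never need recompiling.
// Toy opcodes: 1 = add the next byte to the accumulator, 2 = stop.
var cache = {};

function compileFrom(bytecode, addr) {
    if (cache[addr]) return cache[addr];   // already compiled this run
    var body = 'var acc = state.acc;\n';
    var pc = addr;
    while (bytecode[pc] === 1) {           // straight-line code: keep going
        body += 'acc += ' + bytecode[pc + 1] + ';\n';
        pc += 2;
    }
    body += 'state.acc = acc;';            // hit a stop: end the function
    var fn = new Function('state', body);  // compile the string, like eval
    cache[addr] = fn;
    return fn;
}
```

So `compileFrom([1, 5, 1, 7, 2], 0)` produces one function that adds 5 and 7 in a single call, and a second request for address 0 comes straight from the cache.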

What Dannii is doing is adding another layer of abstraction, so that the JS code can be put together with more flexibility than “append opcodes until you decide to stop”.

If the JS runtime was perfectly efficient, this wouldn’t matter – executing a lot of small simple JS functions would take the same amount of CPU as a few, larger, more structured JS functions. But of course these things aren’t perfectly efficient, so it’s worth taking this path.

Cool. That makes sense - since you know where the jump would go, you can write the actual value into the function and keep going.

I am easing my way into a T3 VM, and just got to the opcode handling stage. I had a rough sense of how it “should” look (based on Frotz / Glulxe / Git), so reading through the Quixe source was a bit mind-blowing. I appreciate your insight into what it’s doing, and why.

I think the T3 VM structure should really lend itself to this approach, since there are no non-local branches. Hence the JS output function could correspond more or less exactly to its bytecode counterpart, and there are a limited number of functions defined in the bytecode.

When I defined the Glulx VM, I said “no non-local branches”. But it turns out this restriction doesn’t matter at all (and I’ve deleted it). Quixe doesn’t care what locality is. Branching to an address exits the JS function, and the next JS function to be called is found (in the cache) by the new address.

Well, if jumps are guaranteed to be local, you can represent the bytecode function as a single JS function, rather than a set of functions.

E.g. this function:

foo() {
	local x = 1;
	while (x < 10)
		x += 1;
}

becomes something like this:

var lcl;
var jump = 0x0A;
while (jump) {
	switch (jump) {
		case 0x0A:
			lcl = 1;
			// fall through
		case 0x0B:
			if (lcl >= 10) {
				jump = 0x0C;
				break;
			}
			lcl += 1;
			jump = 0x0B;
			break;
		case 0x0C:
		case 0xFF:
			jump = undefined;
	}
}
Which is several kinds of ugly, but has fewer function calls. It strikes me as not obviously terrible from a performance perspective. Loops suffer from the need to test all the case labels between the start of the function and the start of the loop, but the cost is proportional to the number of branches.

If there’s a 1:1 mapping between bytecode functions and JS output, I see an opportunity for optimized code replacement in key areas. You could hash the planned function and compare it to the hash of known adv3 functions, then swap in the hand-coded version if available.
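The hashing idea might look something like this. Everything here is hypothetical — the hash function, the table, and the replacement are all invented for illustration, not taken from any existing VM:

```javascript
// Hash a function's bytecode and, if it matches a known library function
// (e.g. a standard adv3 routine), run a hand-coded JS version instead of
// compiling the bytecode at all.
function hashBytes(bytes) {
    var h = 0;
    for (var i = 0; i < bytes.length; i++) {
        h = ((h * 31) + bytes[i]) >>> 0;   // simple 32-bit rolling hash
    }
    return h;
}

var accelerated = {};  // hash -> hand-coded JS implementation
accelerated[hashBytes([1, 2, 3])] = function () { return 'fast path'; };

function lookup(bytes) {
    // Returns the hand-coded replacement, or null to compile normally.
    return accelerated[hashBytes(bytes)] || null;
}
```

A real version would need to compare the full bytecode on a hash hit (to rule out collisions), but the lookup itself is cheap enough to do once per function at compile time.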

Quixe started out with that switch structure, but early testing convinced me that it was slower than a set of small functions.

Here’s a tip for JS optimizers: jsperf lets you compare algorithms by running each of them and timing how many times per second they’re executed.

Very interesting. Thanks, that’s good to know!

I saw that site the other day but assumed it was infected with malware because Chrome prompted me to load the JVM plugin, and there was no obvious reason for it to require Java.

The jsperf FAQ covers that objection, at least, though it’s somewhat uncomfortable to discover a Java applet that isn’t either terrible or malicious.

Still, that seems quite handy. Thanks!

Zarf’s covered a lot of the theory so I won’t repeat it. But I will add a few things.

Modern JIT VMs (including JS ones) generally don’t JIT compile everything. They have a layered approach, starting with a bytecode interpreter. As our web interpreters are VMs inside VMs, we skip that stage, because it will probably never have the potential to be worth it (though that is unproven, and if anyone wanted to try adding that layer I wouldn’t be opposed). Our VMs JIT everything, but browser VMs JIT only the code they have determined is run most frequently. They do that by tracing when code is run. Code within a function is easily traced, but they can also trace function calls. Our VMs, however, have one set of code, the VM itself, and another set, the JIT code. The JIT functions are all stored as members of an array, and the VM has a function which says “run the function stored in JITCODE[PROGRAMCOUNTER]”. The browser VMs can trace our VM code, and they can trace the JIT code, but I don’t think they can trace the link between them, because the function call depends on a variable which could, from the browser’s perspective, be anything at all. To let the browser’s JIT do its magic we must aim to reduce the number of transitions from our VM to JIT code, and we do that by making our JIT code stop as rarely as we can.

The problem is that branches and jumps are gotos. If JS had a goto statement it would be very easy to produce long stable JIT code, but it doesn’t. Because gotos can go basically anywhere, whenever we get a branch or a jump the JIT must reluctantly stop. Inform doesn’t have a goto command either (ignoring the fact that you could use assembly to compile a branch/jump), but it does have if statements and loops, and those structures must be assembled by Inform into branches and jumps. What ifvms.js does is recognise those patterns (idioms, in CS terminology) and convert them back into JS ifs and loops. Because we normally use only those Inform structures, and we don’t manually assemble our own jumps, the bytecode those structures are assembled into is very regular, and easy to detect. With those idioms we can eliminate the branches/jumps and continue JIT compiling. ifvms.js is essentially a decompiler, turning low level branches into higher level control structures. At the moment it only recognises idioms for if blocks and while loops, but do/while loops will be easy to add.
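The idiom detection can be sketched on a toy AST. This is a hypothetical illustration, not ifvms.js’s real data structures: a branch node whose block ends in a jump back to the branch is the assembled form of a while loop, otherwise it is just an if block:

```javascript
// Recognise the while-loop idiom: a conditional branch over a block whose
// last instruction is a jump back to the branch itself.
function isWhileLoop(branch) {
    var block = branch.block;
    var last = block[block.length - 1];
    return !!last && last.op === 'jump' && last.target === branch.addr;
}

// Decompile a branch node into a JS control structure.
function decompile(branch) {
    if (isWhileLoop(branch)) {
        // Drop the explicit jump back and emit a real JS loop.
        var body = branch.block.slice(0, -1).map(function (n) { return n.code; });
        return 'while(' + branch.cond + '){' + body.join(';') + '}';
    }
    // No jump back: the branch compiles over the block, i.e. an if block.
    var all = branch.block.map(function (n) { return n.code; });
    return 'if(' + branch.cond + '){' + all.join(';') + '}';
}
```

The same block of instructions becomes `while(...){...}` when its last instruction jumps back to the condition, and `if(...){...}` when it doesn’t — which is exactly the single-instruction difference described above.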

Branches and jumps can be taken care of with idioms, which leaves the other main source of JIT stoppages: function calls. To handle them we need function acceleration, and as well as implementing Glulx’s acceleration system, I also plan to experiment with pattern matching, as vaporware’s .NET VM does (ZLR, I think?). Ben’s idea of hashing functions is the same idea. As there will be only a few accelerated veneer functions, I think it is possible that the browser VMs will be able to trace down into the accelerated veneer and back again. It’s possible that checking whether two ASTs are the same will work efficiently, but I’m not sure. The only thing we must always stop for is input, and in those cases both the JIT and the VM stop, handing control back to the UI. (Though potentially with a threaded UI system the VM could do background compilation.)

Ben, your example is far more complicated than it needs to be, because loops are really very simple. The condition of the loop is made into a branch whose target is the instruction after the loop. Then, immediately before that target, a jump instruction is added whose own target is the branch instruction. Without that jump instruction it is identical to an if block. To recognise the loop all we need to do is test whether the last statement in the block is a jump that goes back to the condition. The bytecode of do/while loops is simpler because there is no jump, but a little harder to identify because nothing in the bytecode marks where the do statement started - only the target of the final branch.

jsperf is indeed a brilliant website and I use it myself. I don’t think it will work well for testing a whole IF VM though. My plan is to write a test suite using the Glulx timer opcodes, with the same statistical maths as jsperf.

Correction: Inform 6 does. The parser uses it heavily. It’s not exactly pretty, but then it’s old code.

The early testing I mentioned didn’t use jsperf, but did the same thing: ran a test case a thousand times, using Javascript’s timestamp calls to measure the duration. It’s generally easier to throw that sort of thing together than to integrate it with somebody’s test suite.
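That kind of quick harness is only a few lines. A minimal sketch, with placeholder candidate functions (the two candidates here are invented and happen to compute the same sum, so their results can be compared):

```javascript
// Run a candidate function many times and measure the elapsed time with
// Date.now() timestamps, the throw-it-together approach described above.
function timeIt(fn, iterations) {
    var start = Date.now();
    for (var i = 0; i < iterations; i++) fn();
    return Date.now() - start;   // elapsed milliseconds
}

// Two placeholder algorithms to compare: a loop vs a closed-form sum.
function candidateA() { var s = 0; for (var i = 0; i < 100; i++) s += i; return s; }
function candidateB() { return (99 * 100) / 2; }
```

Call `timeIt(candidateA, 100000)` and `timeIt(candidateB, 100000)` and compare. Note that jsperf-style tools do this more carefully (warm-up runs, statistical significance), which matters because JIT warm-up can skew the first iterations.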