I7 profiling war stories

People sometimes ask how they can figure out what parts of their code might be slow.

The answer is “Well, you can use the glulxe profiler, but it’s annoying to do.” This is still true, unfortunately. (It requires using a couple of command-line tools, and browsing through your generated I6 code.) But if you’re writing a large game, you may have to do it anyhow.

I just did a profiling run on my (in progress) Hadean Lands code. According to the output, the function Prop_198 was eating a huge number of cycles. What was Prop_198? Turns out it was derived from this line of I7:

Definition: a container is empty rather than non-empty if nothing is in it.

This compiles as a test that loops through every thing to see if its parent is the given container. (My code currently has 412 things.)

I changed it to:

Definition: a container is empty rather than non-empty if the first thing held by it is nothing.

This compiles as a test that takes one line of I6. (Constant time, a couple of CPU cycles.)
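For anyone who wants to see why the two definitions differ so much, here is a rough model in Python rather than I6. It is only a sketch under assumptions: the `Obj` class, `move`, and both `is_empty_*` functions are made-up names, and the parent/child/sibling pointers merely imitate the kind of object tree the I6 layer works with. It is not the code Inform actually generates.

```python
class Obj:
    """A node in a toy object tree (hypothetical model, not Inform's own code)."""

    def __init__(self, name):
        self.name = name
        self.parent = None   # the object holding this one
        self.child = None    # the first object held by this one
        self.sibling = None  # the next object held by the same parent


def move(obj, dest):
    """Put a parentless obj at the front of dest's chain of children."""
    obj.parent = dest
    obj.sibling = dest.child
    dest.child = obj


def is_empty_scan(container, all_things):
    """Like 'if nothing is in it': checks the parent of every thing, O(n)."""
    return not any(t.parent is container for t in all_things)


def is_empty_child(container):
    """Like 'the first thing held by it is nothing': one pointer check, O(1)."""
    return container.child is None


box = Obj("box")
crate = Obj("crate")
things = [Obj(f"thing{i}") for i in range(412)]  # 412 things, as in the post
move(things[0], box)

print(is_empty_scan(box, things), is_empty_child(box))      # both report "not empty"
print(is_empty_scan(crate, things), is_empty_child(crate))  # both report "empty"
```

Both tests give the same answer, but the first does work proportional to the number of things in the game on every call, while the second does the same small amount of work however big the game gets.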

I don’t have a moral here. Profiling is still no fun. The instructions are in the comments on github.com/erkyrath/glulxe/blob … analyze.py .

This is what makes me wary of Inform 7. If not even the experts are sure what’s going on under the hood, how’s a newbie going to know?

Newbies don’t need to know. Hadean Lands is a massive undertaking that Zarf is writing more or less full-time. It’s very unlikely that you’ll run into this kind of problem for a long time, if ever.

As a reference, I’ve been writing games with Inform for almost 7 years now and haven’t needed to touch a profiler once.

I’m a newbie as far as Inform 7 is concerned and I don’t have a clue about half the stuff that goes on under the bonnet, but I don’t need to know provided it doesn’t affect any games I’m writing. I have a TV which I use but I don’t need to understand how it works to watch it.

It only really becomes a major factor when you want to build something efficient and/or high-performance, such as extensions. Still, speaking purely for myself, I consider it common courtesy to ensure your games will run at reasonable speeds even on low-performance equipment.

The only game I’ve seen so far that could really benefit from something like this is Counterfeit Monkey. It’s the only I7 game I’ve ever seen that does such extraordinary things that it’s horribly, horribly slow on any platform that isn’t a desktop Glulx/Git interpreter. But of course, it’s probably impossible to make it any more efficient. I’m sure Emily Short already did everything she could to optimise it; the game is simply gargantuan in its operations, and that’s all there is to it.

Slightly different take on the thread: interesting to see what sort of a difference that makes, Zarf. I wonder whether these little findings might not be compiled into a document: “Is your I7 project running too slowly? In need of optimisation? Here are X tricks of the trade that might help!”.

As others have said, a “newbie” may not need to know this. However, there’s no doubt that if you read through the bug fix comments or just correlate many posts in this particular forum, Inform itself is a kludgy system. It’s a cool system. It does a lot of neat things. But if you are a programmer who likes elegant code that is eminently maintainable, has high cohesion, and low coupling – Inform is almost a case study in what to avoid. Part of that is due to a historical pedigree that it still attempts to maintain.

But, again, much (if not all) of that can be hidden from people using the system to develop games with it. There are many game engines out there that are quite frightening if you look behind the scenes. The only time that matters is when those scary bits start to limit the future development of the system because of bad code accretion around foundational bits of code. In that case, potentially anyone is affected, whether “newbie” or not. I don’t think Inform has hit that point yet and, given the venue it works within, it may never do so.

The question of whether a “newbie” needs to know this stuff is actually a side issue, though no one seems to be steering away from it. After all, it’s possible to use a system for a while, and so no longer be a “newbie”, and yet still have almost no visibility into how things work under the hood. How much that matters depends on how far the implementation model is exposed to users in a way that lets them do what is needed, such as, in this case, optimization. Sometimes “exposing the implementation model” is nothing more than having good documentation on hand about the internals and about how people who encountered similar problems solved them. That’s the basis of patterns, after all.

Kerkerkruip too can be slow on old or less powerful hardware.

Idea for comp: who can make the slowest game using Inform 7.

I’m not sure I follow. Could you expand on what you mean by “bad code accretion”?

In point of fact, there used to be such a document on raif. I’d say it’s so outdated it’s meaningless now, however: much of it optimizes for things that are now less necessary.

Under, In Erebus was astoundingly slow on the web interpreters and, I understand, also slow on desktops: Jenni Polodna said it took her ten minutes to get through thirty or forty commands from the walkthrough. I suspect this had something to do with dynamically created objects piling up, but I don’t know.

Or if you write a game with 80 rooms and 400 objects. But occasionally someone does this. Someone other than me or Emily, I mean. That’s why I bring it up.

I don’t know if Emily did a profiling pass over Counterfeit Monkey. I guess I could snarf the source code and try it myself.

Other common statements that loop over all objects:

Group (foo) together. [In the “listing contents of something” activity.]

say “[The list of things in the box].” [You’d think this would only loop over the contents of the box, right? Sorry.]

It’s useful info to me - One Eye Open was unplayable online during its comp year due to being too bloody slow, and I wish I’d checked it with something like this back then.

(I’m fairly sure OEO is playable online now, but that’s due to server improvements rather than code improvements. Once I finally do the post-comp release, I’ll be making a pass like this on it.)

Is there a way to write this that only loops through the box’s contents? If not, is it possible to write a simple I6 routine to return a list containing the contents of an object?
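To make the question concrete, here is a rough sketch of what "only loop through the box’s contents" means, again in Python rather than I6. Everything in it is an assumption for illustration: `Node`, `put_in`, and `contents` are invented names, and the child/sibling chain only imitates the way I6’s built-in `child` and `sibling` functions step through an object’s contents. It says nothing about how the I7 list-writer actually works.

```python
class Node:
    """A toy object with a child/sibling chain (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.child = None    # the first object held
        self.sibling = None  # the next object held by the same holder


def put_in(item, holder):
    """Insert item at the front of holder's chain of contents."""
    item.sibling = holder.child
    holder.child = item


def contents(holder):
    """Yield only the objects directly held by holder, following siblings."""
    item = holder.child
    while item is not None:
        yield item
        item = item.sibling


box = Node("box")
for name in ("key", "coin", "lamp"):
    put_in(Node(name), box)

names = [n.name for n in contents(box)]
print(names)
```

The walk touches one object per item actually in the box, so its cost depends on how full the box is and not on how many objects the game defines.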

I’m hacking around with it now. The I7 list-writer is much more general than the I6 version, but this is the cost. I’m not sure it’s completely fixable.

There’s lots of room for improvement. Some I’ve found recently are:

If you use a custom relation, consider whether you could turn it into an alias of containment.

The entire indexed text system is ripe for a rewrite.

I don’t quite understand. I see how this would help, but wouldn’t that break most relations (by not allowing them to be used on anything with a hierarchy)?

Yes, that strategy can only be used for non-physical objects. For example, we did this for windows in Flexible Windows and made it 30 times faster.

Fortunately, according to the Mantis list, this is already done: everything in the new version will just be of type ‘text’.