Heap space exhausted: What do you do now?

If you’ve done anything especially elaborate in Dialog, you’ve probably come across the dreaded error: “Technical trouble: Heap space exhausted. Attempting to recover with UNDO.” Or maybe it was the auxiliary heap, or the long-term heap. When this happens, what do you do?

Well, the easiest answer is to increase the size of the appropriate heap on the command line. But if that doesn’t work, or if you’re already at the Z-machine file size limit, then what do you do?

This post is intended as a reference on Dialog’s three heaps, how they work, and what it means if they run out.

But first…

Make sure you’re using version 1a/01 or later. Previous versions have a nasty bug in the Z-machine runtime that crashes the game if the heap size pushes the wordmaps past address $8000. In 1a/01 and later, you can safely make the heaps much larger.

What does the main heap do?

The main heap is basically your program’s RAM. Every time you create a new variable (with $Something), or create a new list (with [Elements] or [Head | Tail]), or create a new dictionary word (with (get input $) or (join words $ into $); words that already exist in the game’s dictionary don’t use the heap), it takes up a certain amount of memory on the main heap.

If the main heap is exhausted, it means you have too many variables, lists, or dictionary words in existence at the same time. Once you create one of these things, it sticks around until its space is reclaimed.

(As a side note, lists are internally stored as LISP-style cons cells. A list like [1 2 3] is actually three blocks on the heap: [1 | X], X = [2 | Y], Y = [3 | []]. This means a list of N elements needs O(N) space. This is also why the basic way to build a list is [Head | Tail]: that’s internally how all lists work.)

How is main heap space reclaimed?

The heap isn’t really a heap at all, in the computer science sense—it’s just a huge block of memory, and the only metadata stored about it is its current size. Any new variables, lists, and dictionary words created are put at the end, and then the size increased appropriately. So the data structure is more like a stack (just a stack with varying sizes of elements).

Every time a choice point is created, the current size of the heap is recorded. Then, when Dialog backtracks to that choice point, the size is changed back to that value. Anything stored after that point is now stranded and can no longer be accessed, and it’ll get overwritten the next time new variables, lists, and dictionary words are created.

This means that the only way to reclaim heap space is to backtrack (or (stop), which has the same effect). Before version 0h/03, disambiguation questions in the standard library were handled like this (very simplified for the sake of example):

(parse commandline $Words)
	(collect $Action)
		*(understand $Words as $Action)
	(into $Options)
	(if) (empty $Options) (then)
		Sorry, that couldn't be parsed.
	(elseif) ($Options = [$Single]) (then)
		(try $Single)
	(else)
		Did you want to (describe options $Options)?
		(get input $Disambig)
		{
			($Disambig disambiguates $Options into $Action)
			(try $Action)
		(or)
			(parse commandline $Disambig)
		}
	(endif)

In other words, if the player’s input $Disambig unambiguously selects one of the options, then try that one. Otherwise, just parse $Disambig as a new action. Seems good, right? But the problem is—if $Disambig is parsed as a new action, then all those variables like $Options just stick around on the heap! And this means asking the same disambiguation question over and over will eventually exhaust the heap and crash the game.

So it’s important to make sure you backtrack periodically to reclaim heap space. That’s why the proper way to make an infinite loop is:

*(repeat forever)
(do something)
(fail)

Rather than:

(loop)
	(do something)
	(loop)

Backtracking to the (repeat forever) ensures that the heap space from each instance of (do something) can be reclaimed.

This can also be exploited if you have a big complicated process, like drawing an automap, that consumes a lot of memory. You can do this:

{ (draw automap) (fail) (or) }

Which draws the automap, then backtracks, then succeeds. This way, all the memory used by the automapping process is freed immediately afterward.

What does the auxiliary heap do?

The auxiliary heap is where choice points are stored, along with an internal structure called an “environment frame”. If you’re used to languages like C, you can think of this as the call stack. Every time you create a choice point (by making a query, or using (or)), and every time you query a predicate that is not in tail position (i.e. the last line of a predicate), it uses a bit of memory on the auxiliary heap.

If the auxiliary heap is exhausted, it means you have too many choice points and non-tail calls active at the same time.

How is auxiliary heap space reclaimed?

Any choice points created during a particular query are usually discarded when that query succeeds or fails, but multi-queries keep them around indefinitely. Choice points created during a multi-query are only discarded when its parent query succeeds or fails (assuming it’s not a multi-query itself!), or when they’re explicitly “cut” using (just) or (stop).

This is the other reason not to implement infinite loops like this:

(loop)
	(do something)
	(loop)

On the Z-machine and Å-machine, this (loop) call won’t create an environment frame, because it’s in tail position. But the debugger doesn’t do tail call optimization, so this will eventually exhaust the auxiliary heap and crash. It’s not a coincidence that (repeat forever) was added in the same update as the debugger was!

What does the long-term heap do?

The long-term heap is where the contents of global and per-object variables are stored. Normally, if you create a list or dictionary word, and then backtrack past its creation, that list or dictionary word is gone, and its main heap space can be reclaimed:

{
	($X = [1 2 3])
	(fail)
(or)
	%% At this point, $X is unbound
}

But if we stashed them in global or per-object variables, they need to stick around even after backtracking. And this means storing them somewhere other than the main heap, since the main heap doesn’t keep metadata.

This also gives us a nice way to reclaim main heap space after an expensive operation:

(global variable (temporary result $))

{
	(very expensive computation produces $X)
	(now) (temporary result $X)
	(fail)
(or)
	(temporary result $X)
	(now) ~(temporary result $)
}

After doing our very expensive computation, we stash the result on the long-term heap, backtrack to free main heap space, then recover our result back to the main heap again.

How is long-term heap space reclaimed?

This one is more akin to the memory model you’ll find in modern languages, like Python. It allocates and deallocates blocks of memory as needed, cleaning them up whenever those variables get reassigned (or unassigned completely).

The long-term heap is typically the smallest of the three, because usually you’re not storing that many lists and dictionary words in variables. (Only lists and dictionary words get stored here; anything saved to the long-term heap must be fully bound, and any bound value except a list or an unknown dictionary word can fit in a single register and thus doesn’t need any heap space.) If it runs out, either increase the size, or question why you’re stashing so many lists.

Conclusion

Dialog’s heaps aren’t documented very well, but each one serves a very specific purpose, and understanding those purposes gives you lots of new tools to use to avoid exhausting them. Now go use this to make even larger games!

10 Likes

6 posts were split to a new topic: Compiling Dialog 1a/01

As a side note, the Dialog language has a shorthand for { (predicate) (fail) (or) }, which is (exhaust) (predicate). In other words, this is exactly equivalent to (exhaust) (draw automap). This is one reason why you might want to put a simple (not multi-) query inside an (exhaust)!

So far in my projects I’ve been sticking to the { (predicate) (fail) (or) } syntax, because I think it’s clearer about what’s going on, but there’s no difference in the generated code. Use whichever one you find more readable.

2 Likes

A post was merged into an existing topic: Compiling Dialog 1a/01