[Z-MACHINE] Stack Distinctions

I’m trying to make sure I use the appropriate stack terms unambiguously.

Section 6.1 of the spec says:

The “state of play” is defined as the following: the contents of dynamic memory; the contents of the stack; the value of the program counter (PC), and the “routine call state” (that is, the chain of routines which have called each other in sequence, and the values of their local variables).

What I’m trying to get is the idea of the “routine call state.” Normally it seems how things are worded in these contexts is you have a call stack and an evaluation stack. The call stack could also be called a routine stack, I guess, but I’ve not heard of it called a “routine call state.”

Also the above quote says “the contents of the stack” (no qualifier on the stack type) as distinct (perhaps?) from the “routine call state.”

Section 6.5 seems to clarify a bit:

A “stack frame” is an index to the routine call state (that is, the call-stack of return addresses from routines currently running, and values of local variables within them).

So then the “routine call state” is a call stack, I guess. So then does that mean the “contents of the stack” (again, the non-qualified one) from 6.1 is referring to an evaluation stack?

The remarks in that section of the specification refer to “the game’s main stack.” But it’s not clear if this is meant to be distinct from the anonymously named stack in 6.1 (which may be the evaluation stack) or what the “main stack” actually means. It does further say:

on some implementations the game’s main stack is also used to store the routine call state (i.e. the game stack and the call-stack are the same) but this need not be true.

So the “main stack” can be the “call stack” (which is the “routine call state”) but doesn’t have to be. This would suggest on some implementations the opposite is the case: that the “main stack” and the “call-stack” are different. But then what is the “main stack” in that case?

Reading the guide “The Z-Machine, and How to Emulate It”, the following is stated:

Each frame contains a program counter (PC), up to 15 local variables and a routine stack, and also some administrative information.

So the stack frame “contains” the “routine stack.” Is that a looser way of saying what the spec says, i.e., that the “stack frame is an index to the routine call state”?

I fear the wording leaves a lot of room for ambiguity, contrary to what you want in a specification, but I say that cognizant of the fact that I’m perhaps missing something or not reading it well enough.

I’m definitely open to being corrected and educated here!

1 Like

The “routine call state” doesn’t necessarily have to be stored as a stack of memory. It could be a stack of structs or objects. So in the context of section 6.5 and @catch and @throw, it’s an implementation detail. I used to implement the call stacks with structs in ZVM. But I’d recommend following the Quetzal stack format for the interpreter’s own internal use as it’s just much more straightforward.

In the interest of clarity, I think it would be better for the specification to consistently describe one particular representation (such as the Quetzal format). Perhaps with a sidenote that any internal representation works as long as it emulates the behaviour of that format. The sidenote could mention chains of heap-allocated objects as an example of such an internal representation.

1 Like

Linus’ comment is especially relevant perhaps given that the specification also says this:

Standard interpreters are not required to support these standards, since they do not affect Z-Machine behaviour, but interpreter-writers are strongly encouraged to consider it.

Here “these standards” refer to Quetzal and others.

Quetzal is described as “a standard format for saved-game files” and for new interpreter writers like me, it wasn’t even clear that I should be considering that at all to get started. I do understand why Quetzal and the stack distinction make sense given what a “saved state” is in the context of the Z-Machine.

But I do think the specification should at least use terminology consistently and unambiguously. For example, in the architecture part of the specification, it says:

The stack is a second bank of memory, quite separate from the main one, which has variable size: initially it is empty.

I’m guessing this is what is normally referred to as a call stack. The spec also says:

As well as being used to keep return details, the stack is also used to store local variables (values needed only by a particular routine) and, for short periods only, the partial results of calculations.

So it sounds like this single stack is a call stack and an evaluation stack.

I do realize from an implementation perspective having two stacks or one stack may not matter at all, as it is just a detail of how the implementation is done. I’m more trying to figure out how at least the specification writers conceived of things.

I’m still finding this confusing as I try to document things. From what I’m reading: in the case of the Z-Machine, each frame in the stack contains the following:

  • program counter
  • up to fifteen local variables
  • a routine stack

So … wait. The call stack contains a routine stack?

Thus is that saying that instead of having some global stack, each frame is responsible for providing a localized routine stack?

I should note the above list comes from “The Z-machine, And How To Emulate It” and that’s a document that I’m determining how much I can trust. The actual quote is:

The call stack is a stack of frames, which is unlimited in size. Each frame contains a program counter (PC), up to 15 local variables and a routine stack, and also some administrative information in V3+

(Emphasis added.) Given that the Z-Machine spec itself is sometimes loose with its specificity (unless I’m mistaken, of course), I’m finding getting the concepts straight is a bit more of a challenge.

It goes on to say:

The routine stack is a stack of words, which is unlimited in size. A word can be pushed on top of the stack, or pulled off again. It is an error to pull a word from an empty routine stack. A routine stack can only be used to store values within one (execution of a) routine; in particular, it cannot be used to pass values from one routine to another.

And the Z-machine standard seems to say

6.3

Writing to the stack pointer (variable number $00 ) pushes a value onto the stack; reading from it pulls a value off. Stack entries are 2-byte words as usual.

6.3.1

The stack is considered as empty at the start of each routine: it is illegal to pull >values from it unless values have first been pushed on.

6.3.2

The stack is left empty at the end of each routine: when a return occurs, any >values pushed during the routine are thrown away.

So yes, it sounds like each routine has a local stack in addition to the local variables. Interesting.

“The Stack” usually means the stack that’s exposed to the code the VM is running. There is only one (except version 6) but the VM is responsible for managing it. When you call a function the stack must be restored to the same state when the function returns. A function also can’t erase values pushed from the function above.

So each call frame contains a routine stack (or pointers to it, the implementation details can vary). When you return from a routine one of the call frames is popped, and then The Stack is restored using its data.

So just to confirm another data point, one of the implementations of the Z-Machine (Bocfel) has this to say:

eval_stack_size
The Z-machine contains an evaluation stack which games use as a sort of scratch space when performing calculations. By default the size of this stack is set at 16384, which should be sufficient for any game. The entire stack is allocated at once, so if you are on a system with a small amount of memory, setting this to a smaller value will cause less memory to be allocated. The maximum value is limited only by system memory. Each evaluation stack entry takes two bytes. If you ever encounter a “stack overflow” message with the default size, please let me know.

And…

call_stack_size
The Z-machine allows games to call routines in order to get work done; these routines can call other routines, and so on. Information about these routines is placed onto the call stack, which by default has a size of 1024. This means that no more than 1024 routines can be active at any single time. This value should be much more than sufficient for any game. As with the evaluation stack, memory for the call stack is allocated at one time, so machines with small amounts of memory can reduce this value to reduce memory consumption. The maximum value allowable by the Z-machine is 65535. Each call stack entry takes between 40 and 50 bytes.

These are thus implementation choices that the creator made but are not necessarily entirely indicative of what has to be. So this is where I may be getting stuck a bit. I find implementations out there that, when they are documented at all, suggest by wording that “this is the way the Z-Machine is” but, in fact, they may just be referring to implementation details they chose about what the specification says.

(That all read as a statement but I guess it’s an implied question. To wit, am I looking at things correctly?)

So in this case the spec is saying that each frame has its own call stack. This is as opposed to having a global call stack (which the architecture diagram in the spec seems to suggest) that all routines reference. (Although you could implement it the latter way in contrast to the spec.)

Okay, I’m starting to form a coherent picture here, I think …

No, each call frame (on the call stack) stores data about the state of the “evaluation” or “routine” stack.

So a “visual” may help here since as I know it (1) a call stack is usually a synonym for a routine stack and (2) a routine stack and an evaluation stack are two different things. (The latter usually being use to hold temporary calculation results that are done from routines.)

So let’s say I have this:

Main --> Routine1 (3 locals) --> Routine2 (2 locals) --> Routine3 (0 locals)                                                     

Let’s say each routine calls the next and performs a few simple calculations (adds or whatever). Then a return chain takes us back to Main.

Okay. so if I have a call stack, then when Routine3 is called the a global call stack would look like this:

Routine3
Routine2
Routine1
Main

As a routine returns, it’s popped off the stack, of course.

But with this idea of each routine frame having it’s own call stack … what does that visual start to look like? This?

Routine1
   Routine2

So if “each call frame contains a routine stack” as was said earlier, since Routine1 calls Routine2, that frame for Routine1 contains its own routine that indicates 2 was called. Now Routine2 calls 3 and so:

Routine1
   Routine2
      Routine3

Here the indenting is meant to indicate that the frame for, say, Routine3 is entirely stored on Routine2. The stack for Routine1 has no knowledge of the Routine2 stack and so on.

It feels like there should be a simple way to explain this and that’s what I’m struggling with.

“Routine stack” is confusing you, and is not a good term for us to use. It’s not a common term, and it’s not used in the spec AFAIK, just “The Z-machine, And How To Emulate It”. So pretend we didn’t hear it.

The Z-Machine has a call stack. It’s a single flat stack. It also has a general or “evaluation” stack that is exposed to the code running in the Z-Machine. Usually this is interleaved with the call stack. It can also be thought of as each routine having its own stack because they can’t access the stacks of other routines. (It’s in this context that TZMAHTEI describes the “routine stack” - it’s the stack of a single routine, not a stack of routines.)

1 Like

That right there is perfect clarification. I particularly like that you reconciled the two bits of documentation. I’ll be sure to include that in my notes for the interpreter.

This is exactly the kind of nuance that I think is important to capture. Because there are various sources floating around talking about how to implement the Z-Machine but as we’ve seen all of them (including the spec itself) can be subject to a bit of possible ambiguity.

Thanks for the assist!

I’ve sometimes heard them called the “hardware stack” and the “software stack”, since the latter is accessible to software running on the VM, and the former is part of the VM’s virtual “hardware” itself. But this is also a bit weird, since there is no actual hardware in a Z-machine emulator.

As another element on this in case anyone is reading for clarifying the spec, section 6.1 says this:

The “state of play” is defined as the following: the contents of dynamic memory; the contents of the stack; the value of the program counter (PC), and the “routine call state” (that is, the chain of routines which have called each other in sequence, and the values of their local variables). Note that the routine call state, the stack and the PC must be stored outside the Z-machine memory map, in the interpreter’s private memory.

But note here how “the contents of the stack” and the “routine call state” are distinct. Yet we said above that there is only one stack in the Z-Machine: a call stack. Yet what is described as the “routine call state” sure sounds like a call stack to me. So why are they distinguished in the spec?

This is further reinforced in the very same paragraph near the end (“Note that the routine call state, the stack…”).

This is probably why the document I referenced above (“The Z-machine, And How To Emulate It”) feel into the trap – if such it be – of calling referring to a “routine stack.” So we have here a case of an ambiguous spec and then a document that attempts to clarify it but is perceived as possibly inaccurate or at least misleading. (You don’t want your “clarifying” document to be misleading. But, then again, you don’t want to require clarifying documents for a specification in the first place.)

As @Dannii said, whether the call stack and evaluation stack are actually separate things is an implementation detail.

The game can only directly access the evaluation stack of the current routine, which may or may not be part of the same interpreter data structure as the evaluation stacks of other routines, and/or the call stack.

One valid implementation would be to push all that stuff onto the same stack in the interpreter’s memory, keep track of how much data was on the stack at the start of the current routine, and enforce separation between routines by throwing an error if the game ever tries to pop more than that. When the game calls a function, the interpreter pushes the current state onto that stack and measures the new height; when it returns, the interpreter pops down to that height (if needed), then pops to restore the saved state.

Another valid implementation would be for the interpreter to have a stack of routine call records, and inside each record, a separate evaluation stack. When the game calls a routine, the interpreter pushes a new call record and allocates a fresh eval stack for it; when it returns, the interpreter frees the eval stack and pops the call record.

The game can’t tell the difference.

1 Like

I am keeping an eye on this thread, and with any luck I’ll be getting back to Z-Machine Standard matters soon. Been a bit distracted by Other Stuff recently.

Good points, Jesse. I think the main thing is just making sure the spec is as unambiguous in wording as possible, which I know David is working on. For example, we talk about implementation details but 6.6 of the spec says:

In Version 6, the Z-machine understands a third kind of stack: a “user stack”, ,

Note it’s the Z-Machine that understands a “third kind” of stack. But a consensus seems to be that there’s actually one stack (a call stack) that matters. So what is the Z-Machine “understanding” here? And if there’s a third stack, that’s the specification saying there are two others. And if those two others are just implementation details, the wording of the spec can reflect that.

In fact, I think taking a mash up of some of the replies here would be a great addition to the spec, calling out what actually must be in place and indicating possibilities around implementation. What I’ve been trying to figure out with my posts is whether that information is in fact there and I just don’t have the skills yet to see it.

This is all more so the point in that there is a document out there that was written to clarify the spec (a warning sign in itself) but as we’ve (maybe?) seen in this thread, that clarifying document may obfuscate matters a bit.

So I hope this thread isn’t me coming off as someone who just wants to nit-pick or be a contrarian or, worse perhaps, armchair criticize people who wrote something I’m sometimes struggling to make sense of. I’m having a ton of fun writing my own Z-Machine interpreter and I’m doing this as a way to incorporate this into some development and testing classes of mine.

I definitely appreciate everyone’s engagement on this.

I would disagree with that. Both the call stack and the evaluation stack matter as concepts.

  • Game authors will probably care more about the evaluation stack, since it’s the one they can directly manipulate: Inform and ZAP both expose it as a special assembly operand.
  • Debugger/IDE authors will probably care more about the call stack, because showing the series of calls that led up to the current state is more useful than showing temporary operands.
  • Interpreter authors are free to make either one the “primary” stack that contains the other, or to do what ZLR does and make them completely separate structures:

image

I’m having a ton of fun writing my own Z-Machine interpreter and I’m doing this as a way to incorporate this into some development and testing classes of mine.

I appreciate it. The Standard could definitely stand to be tightened up, and a fresh set of eyes is usually the best for improving documentation.

Feel free to incorporate the summary of version differences, by the way. I’ve found it pretty helpful for porting games and interpreters to new versions.

1 Like

It’s the fact that the two stacks are distinct concepts that leaves the implementer free to construct one in terms of the other, or not.

If the concepts were strongly tied together, there would be only one data structure that works. (This is the case for Glulx.) (Mostly – there’s a bit of wiggle room, I think, but not much.)