Graph theory in Dialog

I’m really curious why it would crash there, when displaying the solution, because basically no dynamic memory allocation happens at that point. Here’s the relevant code:

	(get input $)
	Knaves:
	(exhaust) {
		*($Knave is one of $KnaveList)
		(name for index $Knave)
	}

So it just iterates over the list and prints the name of each one. Purely iterative code.

Hmm…I wonder if the main heap is used for things like lists, and the auxiliary heap is used for choice points?

This is a quote from Stackoverflow on the NET heaps:

There are 3 heaps + the large object heap. All objects are allocated in heap 0. If they survive one garbage collection they’re promoted to heap 1, then to heap 2, and then stays there until collected. The large object heap (LOH) is for objects which are 85000 bytes or more in size (not in total, continous, like arrays). Where you got the fifth heap from I don’t know.

I suspect the different heaps in dg are used in some smiliar manner (gen0–>gen1–>gen2, or maybe gen0–>gen1 + a LOH).

Unfortunately the source of dg is extremely sparsely commented so it’s hard to know. I can’t find something that looks like a GC, but guess it must be there somewhere. Commenting up the code could be a separate project on its own.

Yeah, there’s very little commenting and no repository with commit comments or history, which makes modifying anything a fraught prospect. Sadly. The auxiliary heap is allowed to be a lot bigger than the main heap, and the long-term heap seems to contain the initial values of variables (if they point into dynamic memory), but beyond that…

2 Likes

I ran some test with different values for the heap sizes (z5 and run on Window Frotz).

dialogc -v -t z5 -H 1000 -A 500 -L 100 knaves2.dg
  6  6 12	Peak dynamic memory usage: 1000 heap words, 327 aux words, and 0 long-term words.
dialogc -v -t z5 -H 2000 -A 500 -L 100 knaves2.dg
  8  8 16	Peak dynamic memory usage: 2000 heap words, 358 aux words, and 0 long-term words.
dialogc -v -t z5 -H 4000 -A 500 -L 100 knaves2.dg
 13 13 26	Peak dynamic memory usage: 4000 heap words, 413 aux words, and 0 long-term words.
dialogc -v -t z5 -H 6000 -A 500 -L 100 knaves2.dg
 15 15 30   Peak dynamic memory usage: 5841 heap words, 500 aux words, and 0 long-term words.
dialogc -v -t z5 -H 8192 -A 600 -L 100 knaves2.dg
 19 19 38   Peak dynamic memory usage: 8192 heap words, 516 aux words, and 0 long-term words.

The heap is the one that grows fastest with the array size and the crash when this heap gets exhausted always occurs when building the array and listing the different persons statements.
The auxheap doesn’t grows as fast but when it gets exhausted it’s always when presenting the result.
The long-term heap isn’t used at all in Knaves.

Finding some ideal values for Knaves would maybe max for the heap (-H 8192), make the auxheap a bit bigger than the default (-A 1000) and minimize the long-term (-L 100).

(Is there any way to see the heap values without finding the edge case and crashing the game? I can’t find anything in dgdebug.)

1 Like

(display memory statistics) is what’s printing those values right before the crash, so you can call that at any point instead of only in the (error $ entry point) to see how much has been used. Note that it only displays the peak usage during the game, not the current usage.

1 Like

Oh, also note that dgdebug specifically doesn’t do tail call optimization; it just dynamically allocates more memory for the call stack if necessary. So it shouldn’t crash due to that, but it could throw the memory usage analysis off.

I worked through that puzzle, trying all combinations, and I came up with two solutions:

  • Claire is the guilty one, and the only one telling the truth.
  • Bob is the guilty one, and the only one lying.

Was the code designed to show all solutions, or just the first one (or have I misunderstood the puzzle)?

You have to assume that guilty people will always lie, and innocent people will always tell the truth.

The guilty people are always lying, so if Claire is the guilty one, Claire would be lying, not telling the truth.

My analysis:

Alice and Claire both state the other person is innocent. This implies they are either both telling the truth or both lying.

Alice and Bob both state the other person is lying. This implies that one of them is telling the truth, and the other is lying.

And we are told 1 of them is actually guilty. Since we know that either Alice or Bob is lying (and hence guilty), them it follows that Claire must be telling the truth. And Claire declared Alice innocent, so Alice is telling the truth as well. And therefore Bob must be lying and is the guilty person.

1 Like

That’s the bit I missed, thanks.

1 Like

If you like logic puzzles… My Dendry game Poetic Justice by obrouwer has a logic puzzle with a bunch of quarreling poets :sunglasses:

1 Like

Exactly the intended solution!

The key to this particular puzzle is:

  • If X says that Y is innocent, then either they’re both innocent, or they’re both guilty
  • If X says that Y is guilty, then one of them is guilty and one of them is innocent

You can use this to divide everyone into two groups: an affirmation means they’re in the same group, an accusation means they’re in opposite groups. Then whichever group has the appropriate number of people is the guilty one.

If there are the same number of guilty and innocent people, there’s no way to tell which group is telling the truth, so in that case the game will give you one additional piece of information (whether the first suspect listed is innocent or guilty).

1 Like

Which, by the way, is why generating these is an interesting problem in Dialog. For the puzzle to be solvable, everyone has to be connected to everyone else by some chain of statements: if Alice and Bob accuse each other, and Claire and Dave accuse each other, there’s no way to know whether Alice is with Claire or with Dave.

So the problem can be reframed as: generate a connected graph on N vertices, with K edges, such that every vertex has at least one edge originating from it. Coming up with a coherent set of statements is the easy part; the hard part is ensuring you have enough information to uniquely determine the solution!

1 Like

Ensuring that the graph (with edge directions erased) is connected by constructing it from a spanning tree is a very neat idea! But I’m wondering if there’s a way to avoid the conceptual detour through an undirected graph (tree), and instead put the “every vertex has an outgoing edge” property first: add one edge → j for each vertex i, such that the resulting graph is connected (with edge directions erased).

We start with N vertices an 0 edges, i.e., N “undirected connected components” (abbrev.: UCC), and want to end up with a single UCC. To ensure this, all but one of the N edges we add must merge two UCCs: for every vertex i except the last one, pick a destination j that’s not yet connected to i. A union-find data structure seems relevant: when adding a directed edge i → j, combine the UCCs of i and j with the union operation. That doesn’t directly enable “pick a random vertex j not in the UCC of i” but it can test “are i and j in the same UCC?” very efficiently. Thus, the brute force way of checking all other vertices only takes approximately O(N²) time overall, while using O(N) space. Not great, but probably not much worse than the repeated updating of the list in the Prüfer sequence → tree conversion.

I don’t know if you can reasonably implement union-find in Dialog. It’s a very imperative data structure, and I’m not aware of Dialog providing any of the tools or “escape hatches” commonly used to implement it in logic or functional programming languages. So I’m pessimistic about doing it with arbitrary integer IDs. Perhaps it can be done in the case where every vertex exists as named object by using per-object variables like ($ has union find parent $)? I don’t have any clue what their memory footprint is, and how efficiently (in runtime and memory) they can be updated repeatedly.

1 Like

Per-object variables use one word per object in the game (there’s no way to restrict them to only a certain category of object), but can be written and read in O(1) time, so they’re a much better solution for anything that requires lots of reading and writing values. The standard library uses that for the route-finder, repurposing an unrelated property that’s not normally used on rooms.

2 Likes

The main heap is used to store most things that are created at runtime. Whenever you assign a new local variable, or construct a new list, it goes here. Unlike the malloc heap from C, there’s no extra metadata stored in this—it’s just a big, unorganized block of memory, with REG_TOP marking the top of it. (Which means it’s not really a heap, per se; it’s more of a stack, but nothing ever gets popped off it.)

The auxiliary heap is used by the runtime to store whatever data it needs. This is where the call stack of choice points and stop points is kept, for example, and where runtime routines can temporarily push values to retrieve later. This one is also not really a heap; it’s a stack, though one that’s in RAM, so the program can poke at lower parts of it when needed. (This is used for local variables, which are stored in an environment frame on the call stack.)

When returning to a choice point or stop point, the previous value of REG_TOP is retrieved from here, which is the only way the heap can shrink. Otherwise, it grows without bound. That’s why the main gameplay loop in the library is implemented with *(repeat forever) (repl) (fail): it ensures that all the heap memory used during a given turn is freed at the end of that turn.

The long-term heap is used to store any values that need to persist past returning to a choice point—that is, values stored in global and per-object variables. Those variables themselves, along with global and per-object flags, are stored in a separate area of RAM outside of the heap, but when you store [1 2 3 4 5] in a global, that list itself is stored on the long-term heap. I haven’t been able to figure out how this one is arranged in memory; it might actually be a heap.

This was added in 0e/01 to avoid putting a maximum size on individual variables. Now instead there’s a maximum on the total size of all variables’ values.

In other words:

This is essentially correct! Exhausting the main heap means you’re using too much dynamic memory, exhausting the auxiliary heap means you’re using too much recursion, and exhausting the long-term heap means you’re storing too much data in persistent variables.

This also means that, if a particular computation uses a horrific amount of memory before returning its result, you can ensure that memory is freed afterward:

(global variable (result of computation $))
(expensive computation into $Result)
    {
        (really expensive computation internal guts into $Tmp)
        (now) (result of computation $Tmp)
        (fail)
    (or)
        (result of computation $Result)
        (now) ~(result of computation $)
    }

This is used by the standard library to free all the memory used by the parser before asking a disambiguation question. It saves the question itself in a global variable (so on the long-term heap), then uses (stop) to free memory, and retrieves it from the variable to ask it.

3 Likes