Bugs with non-linear unification

nore · July 30, 2022, 9:42am

It seems there are bugs with unification when a term has a variable used several times, but that variable isn’t used afterwards. The root cause seems to be in the definition of the would_unify function of the Å-machine specification, but happens with z-code as well. A minimal example is as follows:

(test $A)	($A = [0 1])

(program entry point)
	(test [$X $X])
	Unification is possible. (line)
	(test [$Y $Y])
	Value of Y: $Y.

Here, “Unification is possible.” is printed, but the value of $Y isn’t. Besides, if we do not move the unification to its own rule but instead write this:

(program entry point)
	([$X $X] = [0 1])
	Value of X: $X.

Here we get “Value of X: 0.” which is printed, so unification seems to be incorrectly simplified by the compiler in this case.

Also, thanks a lot Linus for your work, Dialog is really awesome!

Draconis · January 26, 2025, 8:47pm

Here’s the IR output from this code:

(program entry point)
	([$X $X] = [0 1])
	Value of X: $X.

Intermediate code for (program entry point): 0 -1
R0: (group leader) (1 incoming) clause 65535
	JUMP                      R1           -            -           
R1: (part of group R0) (1 incoming) clause 65535
	MAKE_PAIR_VV              X0           X2           []          // X0 = [X2]
	MAKE_PAIR_RV              X0           X2           X0          // X0 = [X2 X2]
	MAKE_PAIR_VV              X1           1            []          // X1 = [1]
	MAKE_PAIR_VV              X1           0            X1          // X1 = [0 1]
	UNIFY                     X0           X1           -           // Unify X0 with X1
	PRINT_WORDS 1             "Value"      "of"         "X:"        
	PRINT_VAL                 X2           -            -           
	PRINT_WORDS 1             "."          -            -           
	PROCEED 0                 -            -            -

Looks fine so far. (The // comments are my annotations.) So the problem is in the implementation of the I_UNIFY IR opcode. Let’s see what happens in the debugger:

dgdebug: eval.c:519: value_of: Assertion `v.value < es->nalloc_temp’ failed.
Aborted (core dumped)

That’s not ideal! This appears with Linus’s official debugger, without any of our modifications since.

Using a version I modified to print each opcode as it’s executed:

*** (program entry point) JUMP
*** (program entry point) ALLOCATE
*** (program entry point) TRACEPOINT
*** (program entry point) MAKE_PAIR_VV
dgdebug: eval.c:519: value_of: Assertion `v.value < es->nalloc_temp' failed.

This happens any time you have a construction like [$X $X] in the source, as far as I can tell. It seems the problem is actually in the I_MAKE_PAIR_VV!

Well, here’s that implementation in the debugger.

case I_MAKE_PAIR_VV:
	v = alloc_heap_pair(es);
	if(v.tag == VAL_ERROR) {
		pred_release(pp.pred);
		return ESTATUS_ERR_HEAP;
	}
	es->heap[v.value + 0] = value_of(ci->oper[1], es);
	es->heap[v.value + 1] = value_of(ci->oper[2], es);
	set_by_ref(ci->oper[0], v, es);
	break;

The crash is happening in the call value_of(ci->oper[1], es);.

Stay tuned for more!

Draconis · January 26, 2025, 9:08pm

With tracing added, here’s the IR.

R0: (group leader) (1 incoming) clause 65535
	JUMP                      R1           -            -           
R1: (part of group R0) (1 incoming) clause 65535
	ALLOCATE                  0            0            -           
	TRACEPOINT 0              unifyfail.dg:1
	MAKE_PAIR_VV              A0           X0           []          
	MAKE_PAIR_RV              A0           X0           A0          
	MAKE_PAIR_VV              A1           1            []          
	MAKE_PAIR_VV              A1           0            A1          
	TRACEPOINT 1              unifyfail.dg:2
	UNIFY                     A0           A1           -           
	TRACEPOINT 3              unifyfail.dg:2
	PRINT_WORDS 1             "Value"      "of"         "X:"        
	PRINT_VAL                 X0           -            -           
	PRINT_WORDS 1             "."          -            -           
	DEALLOCATE 1              -            -            -           
	PROCEED 0                 -            -            -

What if we change that [$X $X] to [$X 1]?

Intermediate code for (program entry point): 0 -1
R0: (group leader) (1 incoming) clause 65535
	JUMP                      R1           -            -           
R1: (part of group R0) (1 incoming) clause 65535
	ALLOCATE                  0            0            -           
	TRACEPOINT 0              unifyfail.dg:1
	MAKE_PAIR_VV              A0           1            []          
	MAKE_PAIR_RV              A0           X0           A0          
	MAKE_PAIR_VV              A1           1            []          
	MAKE_PAIR_VV              A1           0            A1          
	TRACEPOINT 1              unifyfail.dg:2
	UNIFY                     A0           A1           -           
	TRACEPOINT 3              unifyfail.dg:2
	PRINT_WORDS 1             "Value"      "of"         "X:"        
	PRINT_VAL                 X0           -            -           
	PRINT_WORDS 1             "."          -            -           
	DEALLOCATE 1              -            -            -           
	PROCEED 0                 -            -            -

And what if it’s [1 $X]?

Intermediate code for (program entry point): 0 -1
R0: (group leader) (1 incoming) clause 65535
	JUMP                      R1           -            -           
R1: (part of group R0) (1 incoming) clause 65535
	ALLOCATE                  0            0            -           
	TRACEPOINT 0              unifyfail.dg:1
	MAKE_PAIR_RV              A0           X0           []          
	MAKE_PAIR_VV              A0           1            A0          
	MAKE_PAIR_VV              A1           1            []          
	MAKE_PAIR_VV              A1           0            A1          
	TRACEPOINT 1              unifyfail.dg:2
	UNIFY                     A0           A1           -           
	TRACEPOINT 3              unifyfail.dg:2
	PRINT_WORDS 1             "Value"      "of"         "X:"        
	PRINT_VAL                 X0           -            -           
	PRINT_WORDS 1             "."          -            -           
	DEALLOCATE 1              -            -            -           
	PROCEED 0                 -            -            -

Aha! So that first MAKE_PAIR_VV should in fact be MAKE_PAIR_RV—in other words, using a reference instead of a value. This makes sense!

So the real question is, why are lists like [$X $X] getting assembled with MAKE_PAIR_VV instead of MAKE_PAIR_RV?

Draconis · January 26, 2025, 9:13pm

Well, the only place a MAKE_PAIR_VV opcode gets compiled is in comp_value_into. Here’s the relevant bit.

if(an->kind == AN_PAIR) {
	for(i = 0; i < 2; i++) {
		if(an->children[i]->kind == AN_VARIABLE) {
			if(an->children[i]->word->name[0]) {
				vnum = findvar(cl, an->children[i]->word);
				sub[i] = (value_t) {OPER_VAR, vnum};
				is_ref[i] = !seen[vnum];
				seen[vnum] = 1;
			} else {
				sub[i] = (value_t) {OPER_TEMP, ntemp++};
				is_ref[i] = 1;
			}
		} else {
			if(i == 1
			&& an->children[i]->kind == AN_PAIR
			&& (dest.tag == OPER_TEMP || dest.tag == OPER_ARG)) {
				comp_value_into(cl, an->children[i], dest, seen, known_args);
				sub[i] = dest;
			} else {
				sub[i] = comp_value(cl, an->children[i], seen, known_args);
			}
			is_ref[i] = 0;
		}
	}
	ci = add_instr(I_MAKE_PAIR_VV + 2 * is_ref[0] + is_ref[1]);
	ci->oper[0] = dest;
	ci->oper[1] = sub[0];
	ci->oper[2] = sub[1];
}

Aha! So it only sets is_ref[i] if !seen[vnum]—that is, if this is a new variable. I feel like we’re getting close here!

Draconis · January 26, 2025, 9:28pm

Okay okay! I think it’s starting to come together. So the problem is that only the first instance of $X in the list is being treated as a reference, but the last instance is the one that gets compiled first. Which means the last instance is the one that’s supposed to be allocating registers for the whole thing. If you declare $X beforehand, things are totally fine:

(do nothing $)
(program entry point)
	(do nothing $X)
	([$X $X] = [0 1])
	Value of X: $X.

No bug here!

Intermediate code for (program entry point): 0 -1
R0: (group leader) (1 incoming) clause 0
	JUMP                      R1           -            -           
R1: (part of group R0) (1 incoming) clause 0
	ALLOCATE                  1            0            -           
	TRACEPOINT 0              unifyfail.dg:3
	MAKE_VAR                  A0           -            -           
	ASSIGN                    V0/$X        A0           -           
	TRACEPOINT 1              unifyfail.dg:4
	SET_CONT                  R2           -            -           
	INVOKE_ONCE               (do nothing $)
R2: (group leader) (1 incoming) clause 0
	TRACEPOINT 3              unifyfail.dg:4
	MAKE_PAIR_VV              A0           V0/$X        []          
	MAKE_PAIR_VV              A0           V0/$X        A0          
	MAKE_PAIR_VV              A1           1            []          
	MAKE_PAIR_VV              A1           0            A1          
	TRACEPOINT 1              unifyfail.dg:5
	UNIFY                     A0           A1           -           
	TRACEPOINT 3              unifyfail.dg:5
	PRINT_WORDS 1             "Value"      "of"         "X:"        
	PRINT_VAL                 V0/$X        -            -           
	PRINT_WORDS 1             "."          -            -           
	DEALLOCATE 1              -            -            -           
	PROCEED 0                 -            -            -

So the problem is, when a variable name appears multiple times in a list literal, only the first one is marked as unseen, but the last one needs to be instead. Stay tuned!

Draconis · January 26, 2025, 9:37pm

…could this really be as simple as visiting the children of the AN_PAIR AST node in the opposite order? Let’s find out!

Intermediate code for (program entry point): 0 -1
R0: (group leader) (1 incoming) clause 65535
	JUMP                      R1           -            -           
R1: (part of group R0) (1 incoming) clause 65535
	ALLOCATE                  0            0            -           
	TRACEPOINT 0              unifyfail.dg:3
	MAKE_PAIR_RV              A0           X0           []          
	MAKE_PAIR_VV              A0           X0           A0          
	MAKE_PAIR_VV              A1           1            []          
	MAKE_PAIR_VV              A1           0            A1          
	TRACEPOINT 1              unifyfail.dg:5
	UNIFY                     A0           A1           -           
	TRACEPOINT 3              unifyfail.dg:5
	PRINT_WORDS 1             "Value"      "of"         "X:"        
	PRINT_VAL                 X0           -            -           
	PRINT_WORDS 1             "."          -            -           
	DEALLOCATE 1              -            -            -           
	PROCEED 0                 -            -            -

Turns out, yes!

Okay, now it’s time for the hardest part: figuring out the git commands to make a new branch for this fix…

Draconis · January 26, 2025, 9:48pm

Input:

(program entry point)
	([$X $X] = [0 1])
	Value of X: $X.

Output:
[Nothing]

Input:

(program entry point)
	([$X $X] = $Y)
	Value of Y: $Y.

Output:

Value of Y: [$ $].

Any more test cases I should try? Otherwise, time for a pull request!

Draconis · January 26, 2025, 10:37pm

Oh, right.

(program entry point)
	([$X $X] = $Y)
	Value of Y: $Y.
	($X = 24)
	Value of Y: $Y.

Value of Y: [$ $]. Value of Y: [24 24].