nore
July 30, 2022, 9:42am
1
It seems there are bugs with unification when a term has a variable used several times, but that variable isn’t used afterwards. The root cause seems to be in the definition of the would_unify
function of the Å-machine specification, but happens with z-code as well. A minimal example is as follows:
(test $A) ($A = [0 1])
(program entry point)
(test [$X $X])
Unification is possible. (line)
(test [$Y $Y])
Value of Y: $Y.
Here, “Unification is possible.” is printed, but the value of $Y isn’t. Besides, if we do not move the unification to its own rule but instead write this:
(program entry point)
([$X $X] = [0 1])
Value of X: $X.
Here we get “Value of X: 0.” which is printed, so unification seems to be incorrectly simplified by the compiler in this case.
Also, thanks a lot Linus for your work, Dialog is really awesome!
4 Likes
Draconis
(Daniel Stelzer)
January 26, 2025, 8:47pm
2
Here’s the IR output from this code:
(program entry point)
([$X $X] = [0 1])
Value of X: $X.
Intermediate code for (program entry point): 0 -1
R0: (group leader) (1 incoming) clause 65535
JUMP R1 - -
R1: (part of group R0) (1 incoming) clause 65535
MAKE_PAIR_VV X0 X2 [] // X0 = [X2]
MAKE_PAIR_RV X0 X2 X0 // X0 = [X2 X2]
MAKE_PAIR_VV X1 1 [] // X1 = [1]
MAKE_PAIR_VV X1 0 X1 // X1 = [0 1]
UNIFY X0 X1 - // Unify X0 with X1
PRINT_WORDS 1 "Value" "of" "X:"
PRINT_VAL X2 - -
PRINT_WORDS 1 "." - -
PROCEED 0 - - -
Looks fine so far. (The //
comments are my annotations.) So the problem is in the implementation of the I_UNIFY
IR opcode. Let’s see what happens in the debugger:
dgdebug: eval.c:519: value_of: Assertion `v.value < es->nalloc_temp’ failed.
Aborted (core dumped)
That’s not ideal! This appears with Linus’s official debugger, without any of our modifications since.
Using a version I modified to print each opcode as it’s executed:
*** (program entry point) JUMP
*** (program entry point) ALLOCATE
*** (program entry point) TRACEPOINT
*** (program entry point) MAKE_PAIR_VV
dgdebug: eval.c:519: value_of: Assertion `v.value < es->nalloc_temp' failed.
This happens any time you have a construction like [$X $X]
in the source, as far as I can tell. It seems the problem is actually in the I_MAKE_PAIR_VV
!
Well, here’s that implementation in the debugger.
case I_MAKE_PAIR_VV:
v = alloc_heap_pair(es);
if(v.tag == VAL_ERROR) {
pred_release(pp.pred);
return ESTATUS_ERR_HEAP;
}
es->heap[v.value + 0] = value_of(ci->oper[1], es);
es->heap[v.value + 1] = value_of(ci->oper[2], es);
set_by_ref(ci->oper[0], v, es);
break;
The crash is happening in the call value_of(ci->oper[1], es);
.
Stay tuned for more!
3 Likes
Draconis
(Daniel Stelzer)
January 26, 2025, 9:08pm
3
With tracing added, here’s the IR.
R0: (group leader) (1 incoming) clause 65535
JUMP R1 - -
R1: (part of group R0) (1 incoming) clause 65535
ALLOCATE 0 0 -
TRACEPOINT 0 unifyfail.dg:1
MAKE_PAIR_VV A0 X0 []
MAKE_PAIR_RV A0 X0 A0
MAKE_PAIR_VV A1 1 []
MAKE_PAIR_VV A1 0 A1
TRACEPOINT 1 unifyfail.dg:2
UNIFY A0 A1 -
TRACEPOINT 3 unifyfail.dg:2
PRINT_WORDS 1 "Value" "of" "X:"
PRINT_VAL X0 - -
PRINT_WORDS 1 "." - -
DEALLOCATE 1 - - -
PROCEED 0 - - -
What if we change that [$X $X]
to [$X 1]
?
Intermediate code for (program entry point): 0 -1
R0: (group leader) (1 incoming) clause 65535
JUMP R1 - -
R1: (part of group R0) (1 incoming) clause 65535
ALLOCATE 0 0 -
TRACEPOINT 0 unifyfail.dg:1
MAKE_PAIR_VV A0 1 []
MAKE_PAIR_RV A0 X0 A0
MAKE_PAIR_VV A1 1 []
MAKE_PAIR_VV A1 0 A1
TRACEPOINT 1 unifyfail.dg:2
UNIFY A0 A1 -
TRACEPOINT 3 unifyfail.dg:2
PRINT_WORDS 1 "Value" "of" "X:"
PRINT_VAL X0 - -
PRINT_WORDS 1 "." - -
DEALLOCATE 1 - - -
PROCEED 0 - - -
And what if it’s [1 $X]
?
Intermediate code for (program entry point): 0 -1
R0: (group leader) (1 incoming) clause 65535
JUMP R1 - -
R1: (part of group R0) (1 incoming) clause 65535
ALLOCATE 0 0 -
TRACEPOINT 0 unifyfail.dg:1
MAKE_PAIR_RV A0 X0 []
MAKE_PAIR_VV A0 1 A0
MAKE_PAIR_VV A1 1 []
MAKE_PAIR_VV A1 0 A1
TRACEPOINT 1 unifyfail.dg:2
UNIFY A0 A1 -
TRACEPOINT 3 unifyfail.dg:2
PRINT_WORDS 1 "Value" "of" "X:"
PRINT_VAL X0 - -
PRINT_WORDS 1 "." - -
DEALLOCATE 1 - - -
PROCEED 0 - - -
Aha! So that first MAKE_PAIR_VV should in fact be MAKE_PAIR_RV—in other words, using a reference instead of a value . This makes sense!
So the real question is, why are lists like [$X $X]
getting assembled with MAKE_PAIR_VV
instead of MAKE_PAIR_RV
?
1 Like
Draconis
(Daniel Stelzer)
January 26, 2025, 9:13pm
4
Well, the only place a MAKE_PAIR_VV
opcode gets compiled is in comp_value_into
. Here’s the relevant bit.
if(an->kind == AN_PAIR) {
for(i = 0; i < 2; i++) {
if(an->children[i]->kind == AN_VARIABLE) {
if(an->children[i]->word->name[0]) {
vnum = findvar(cl, an->children[i]->word);
sub[i] = (value_t) {OPER_VAR, vnum};
is_ref[i] = !seen[vnum];
seen[vnum] = 1;
} else {
sub[i] = (value_t) {OPER_TEMP, ntemp++};
is_ref[i] = 1;
}
} else {
if(i == 1
&& an->children[i]->kind == AN_PAIR
&& (dest.tag == OPER_TEMP || dest.tag == OPER_ARG)) {
comp_value_into(cl, an->children[i], dest, seen, known_args);
sub[i] = dest;
} else {
sub[i] = comp_value(cl, an->children[i], seen, known_args);
}
is_ref[i] = 0;
}
}
ci = add_instr(I_MAKE_PAIR_VV + 2 * is_ref[0] + is_ref[1]);
ci->oper[0] = dest;
ci->oper[1] = sub[0];
ci->oper[2] = sub[1];
}
Aha! So it only sets is_ref[i]
if !seen[vnum]
—that is, if this is a new variable. I feel like we’re getting close here!
2 Likes
Draconis
(Daniel Stelzer)
January 26, 2025, 9:28pm
5
Okay okay! I think it’s starting to come together. So the problem is that only the first instance of $X
in the list is being treated as a reference, but the last instance is the one that gets compiled first. Which means the last instance is the one that’s supposed to be allocating registers for the whole thing. If you declare $X beforehand, things are totally fine:
(do nothing $)
(program entry point)
(do nothing $X)
([$X $X] = [0 1])
Value of X: $X.
No bug here!
Intermediate code for (program entry point): 0 -1
R0: (group leader) (1 incoming) clause 0
JUMP R1 - -
R1: (part of group R0) (1 incoming) clause 0
ALLOCATE 1 0 -
TRACEPOINT 0 unifyfail.dg:3
MAKE_VAR A0 - -
ASSIGN V0/$X A0 -
TRACEPOINT 1 unifyfail.dg:4
SET_CONT R2 - -
INVOKE_ONCE (do nothing $)
R2: (group leader) (1 incoming) clause 0
TRACEPOINT 3 unifyfail.dg:4
MAKE_PAIR_VV A0 V0/$X []
MAKE_PAIR_VV A0 V0/$X A0
MAKE_PAIR_VV A1 1 []
MAKE_PAIR_VV A1 0 A1
TRACEPOINT 1 unifyfail.dg:5
UNIFY A0 A1 -
TRACEPOINT 3 unifyfail.dg:5
PRINT_WORDS 1 "Value" "of" "X:"
PRINT_VAL V0/$X - -
PRINT_WORDS 1 "." - -
DEALLOCATE 1 - - -
PROCEED 0 - - -
So the problem is, when a variable name appears multiple times in a list literal, only the first one is marked as unseen, but the last one needs to be instead . Stay tuned!
2 Likes
Draconis
(Daniel Stelzer)
January 26, 2025, 9:37pm
6
…could this really be as simple as visiting the children of the AN_PAIR
AST node in the opposite order? Let’s find out!
Intermediate code for (program entry point): 0 -1
R0: (group leader) (1 incoming) clause 65535
JUMP R1 - -
R1: (part of group R0) (1 incoming) clause 65535
ALLOCATE 0 0 -
TRACEPOINT 0 unifyfail.dg:3
MAKE_PAIR_RV A0 X0 []
MAKE_PAIR_VV A0 X0 A0
MAKE_PAIR_VV A1 1 []
MAKE_PAIR_VV A1 0 A1
TRACEPOINT 1 unifyfail.dg:5
UNIFY A0 A1 -
TRACEPOINT 3 unifyfail.dg:5
PRINT_WORDS 1 "Value" "of" "X:"
PRINT_VAL X0 - -
PRINT_WORDS 1 "." - -
DEALLOCATE 1 - - -
PROCEED 0 - - -
Turns out, yes!
Okay, now it’s time for the hardest part: figuring out the git commands to make a new branch for this fix…
3 Likes
Draconis
(Daniel Stelzer)
January 26, 2025, 9:48pm
7
Input:
(program entry point)
([$X $X] = [0 1])
Value of X: $X.
Output:
[Nothing]
Input:
(program entry point)
([$X $X] = $Y)
Value of Y: $Y.
Output:
Value of Y: [$ $].
Any more test cases I should try? Otherwise, time for a pull request!
2 Likes
Draconis
(Daniel Stelzer)
January 26, 2025, 10:37pm
8
Oh, right.
(program entry point)
([$X $X] = $Y)
Value of Y: $Y.
($X = 24)
Value of Y: $Y.
Value of Y: [$ $]. Value of Y: [24 24].
1 Like