Glulxe fatal error: Memory access out of range (2519010C)

Right.

That line guards against that case. But if block itself is 0, that line shouldn’t be executed. (Reading from 0-->BLK_PREV won’t crash the VM but it’s still a mistake.)

If you upload the game file and matching gameinfo.dbg somewhere, and tell me what command causes the error, I can try it.

I’m afraid I can’t do that until Monday, but you can get the branch from github if you want.

github.com/i7/kerkerkruip/tree/table_bug

It’s a bit behind what I’ve been posting, but it has the same bug. Run kerkerkruip.inform, select a new game, enter the command “queue test dreadful-presence-test”, and press space to start the test. The crash happens at line 647 in Inform ATTACK Core by Victor Gijsbers, “blank out the whole row”:

To cautiously blank out (contents - a table name):
	while the number of filled rows in contents > 0:
		choose a random row in contents;
		log "blanking out [option entry]: [action weight entry][line break]";
		blank out the whole row;

Zarf, the files can be downloaded from dropbox.com/s/66mjvt5c06j5r … g.zip?dl=0

OK, when I run this, I get an interpreter memory-access error rather than a “** programming error **”.

The stack trace is below. It doesn’t tell me a whole lot – you’ll have to look at the generated I6 to know what the various functions are.

[spoiler]Seeding random number generator with 26
Now testing Dreadful-Presence-Test.

next step: Ape-cowering
The new main actor is you
done taking a player action
The new main actor is the zombie toad
starting standard AI for the zombie toad
Got this far
select an action and do it for zombie toad - 0 rows
blanked out table of AI Action Options
Now there are 4 action selections
done standard AI for the zombie toad
The new main actor is the blood ape
starting standard AI for the blood ape
Got this far
select an action and do it for blood ape - 4 rows
blanking out the zombie toad waiting: -15
Debug: Glulxe fatal error: Memory access out of range
Debug: RT__ChLDW() (pc=$286D)
Debug: x=117506047 ($700FFFF); y=4
Debug: FlexFree() (pc=$166A7E)
Debug: block=117506047 ($700FFFF); fromtxb=2165404 ($210A9C); ptxb=117506047 ($700FFFF)
Debug: ForceTableEntryBlank() (pc=$D3FD7)
Debug: tab=2008058 ($1EA3FA), T9_ai_action_options[3]; col=1; row=2; i=0; at=0; oldv=2165404 ($210A9C); flags=1658 ($67A)
Debug: TableBlankOutRow() (pc=$D4E26)
Debug: tab=2008058 ($1EA3FA), T9_ai_action_options[3]; row=2; k=1
Debug: KERNEL_428() (pc=$F9624)
Debug: t_0=2008058 ($1EA3FA), T9_ai_action_options[3]; ct_0=2008058 ($1EA3FA), T9_ai_action_options[3]; ct_1=2
Debug: PHR_1024_r154() (pc=$F95A7)
Debug: t_0=2008058 ($1EA3FA), T9_ai_action_options[3]; I7RBLK=0
Debug: KERNEL_73() (pc=$47F5D)
Debug: tmp_0=2370256 ($242AD0), I840_blood_ape; tmp_1=0; tmp_2=0; ct_0=0; ct_1=0
Debug: R_1025() (pc=$47EAA)
Debug: I7RBLK=0
Debug: B430_standard_ai() (pc=$AA398)
Debug: forbid_breaks=1; rv=0; original_deadflag=0; p=2370256 ($242AD0), I840_blood_ape
Debug: FollowRulebook() (pc=$C82D4)
Debug: rulebook=430 ($1AE); parameter=0; no_paragraph_skips=1; rv=696621 ($AA12D), B430_standard_ai(); ss=0; spv=0
Debug: PHR_1008_r118() (pc=$F7D45)
Debug: t_0=2370256 ($242AD0), I840_blood_ape
Debug: R_982() (pc=$45B1C)
Debug: (no locals)
Debug: B423_combat_round() (pc=$A96F8)
Debug: forbid_breaks=0; rv=0; original_deadflag=0
Debug: FollowRulebook() (pc=$C82D4)
Debug: rulebook=423 ($1A7); parameter=0; no_paragraph_skips=0; rv=693061 ($A9345), B423_combat_round(); ss=0; spv=0
Debug: R_976() (pc=$1C6B4)
Debug: (no locals)
Debug: B1_turn_sequence() (pc=$9A37E)
Debug: forbid_breaks=0; rv=0; original_deadflag=0
Debug: FollowRulebook() (pc=$C82D4)
Debug: rulebook=1; parameter=0; no_paragraph_skips=0; rv=631661 ($9A36D), B1_turn_sequence(); ss=0; spv=0
Debug: Main() (pc=$E72B)
Debug: (no locals)
Debug: Main__() (pc=$46)
Debug: (no locals)[/spoiler]

…Remember that the memory corruption may have occurred earlier than this stack trace.

Thanks, it’s a start.

I’ve pushed a commit with even more copious debug information, including periodic sanity checks of the stored actions in the table:

github.com/i7/kerkerkruip/compare/table_bug

Something weird is definitely happening - it looks like simply referencing the stored action is causing FlexFree to be called. How could that be?

[code]Section - Sanity Checking Stored Actions

To decide what number is the/-- address of (block - a stored action): (- {block} -);

To decide what number is the/-- previous block to (block - a stored action): (- {block}-->BLK_PREV -);

To decide what number is the/-- previous-next block of (block - a stored action): (- ({block}-->BLK_PREV)-->BLK_NEXT -);

To say sanity check (plan - a stored action):
	Let A be the address of plan;
	let N be the previous-next block of plan;
	say "Stored action [bracket][the plan][close bracket] at [address of plan] is a child of [previous block to plan], which [if A is N]matches when sanity-checked[otherwise]has a first child of [previous-next block of plan] instead[end if].";

To say sanity check action options:
	say "Main actor's action: [sanity check main actor's action][line break]";
	Repeat through table of AI action options:
		say "AI action option: [sanity check option entry][line break]";

A last Standard AI rule for a person (called P) (this is the select an action and do it rule):
	log "select an action and do it for [P] - [sanity check action options]";
	cautiously blank out Table of AI Action Options;
	log "blanked out table of AI Action Options";
	[…etc]

[also includes debug code posted earlier for FlexFree]
[/code]

I think there might be problems with my debug code again. Looking at auto.inf, it appears to be treating the stored action’s address as a stored action and not a number. Has the old trick for type conversion stopped working?

[code][ KERNEL_487
	t_0 ! Call parameter 'plan': stored action
	tmp_0 ! Let/loop value, e.g., 'A': number
	tmp_1 ! Let/loop value, e.g., 'P': number
	tmp_2 ! Let/loop value, e.g., 'N': number
	;
	! [1: say ~getting address of the action~]
	say__p=1;ParaContent(); print "getting address of the action"; .L_Say2768; .L_SayX2580;
	! [2: let a be the address of plan]

	tmp_0 = BlkValueCopy(I7SFRAME, t_0);

[/code]
It wouldn’t call BlkValueCopy if it were treating it as a number, would it?

I can’t figure out a way of exposing the address of a stored action to I7: the compiler insists on adding a BlkValueCopy however I try to trick it. But shouldn’t there be an invariant that, for any stored action, (block-->BLK_PREV)-->BLK_NEXT is equal to the block itself? This never seems to be the case. It makes me think there’s a problem with BlkValueCopy, at least involving stored actions. I tried to trace through the code in auto.inf, but once the call was routed through KOVSupportFunction, I quickly got lost among the callbacks.
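The invariant asked about above can be sketched outside Inform. The following is a toy Python model, not the Flex.i6t code: memory is a plain list of words, and the offsets `BLK_NEXT = 3` and `BLK_PREV = 4` are assumed purely for illustration. It shows that the check is only meaningful for a block that actually has a predecessor; the head of a chain has `BLK_PREV == NULL`, so there is nothing to compare against.

```python
NULL = 0xFFFFFFFF  # Inform 6 NULL is -1, i.e. all bits set in a 32-bit word

# Illustrative word offsets inside a block header (assumed for this sketch).
BLK_NEXT = 3
BLK_PREV = 4


def make_chain(memory, addresses):
    """Link the blocks at the given word addresses into a doubly linked chain."""
    for i, addr in enumerate(addresses):
        memory[addr + BLK_PREV] = addresses[i - 1] if i > 0 else NULL
        memory[addr + BLK_NEXT] = addresses[i + 1] if i + 1 < len(addresses) else NULL


def links_are_sane(memory, block):
    """Return True if every neighbour of `block` points back at `block`."""
    prev = memory[block + BLK_PREV]
    nxt = memory[block + BLK_NEXT]
    if prev != NULL and memory[prev + BLK_NEXT] != block:
        return False
    if nxt != NULL and memory[nxt + BLK_PREV] != block:
        return False
    return True


memory = [0] * 64
make_chain(memory, [8, 24, 40])
print(links_are_sane(memory, 24))   # middle block of an intact chain

memory[8 + BLK_NEXT] = 40           # corrupt the head's forward link
print(links_are_sane(memory, 24))   # the middle block is no longer reachable from its predecessor
```

If the value being inspected is a copy of the stored action rather than the original block (which the BlkValueCopy call in auto.inf suggests), a check like this would be examining the wrong address, so a failed check does not by itself prove the heap is corrupt.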

Adding more debug code resulted in an earlier crash:

[code]include (-
[ TEXT_TY_Transmute txt;
	TEXT_TY_Temporarily_Transmute(txt);
];

[ TEXT_TY_Temporarily_Transmute txt x;
	if ((txt) && (txt-->0 & BLK_BVBITMAP_LONGBLOCKMASK == 0)) {
		x = txt-->1; ! The old value was a packed string

		txt-->0 = UNPACKED_TEXT_STORAGE;
		txt-->1 = FlexAllocate(32, TEXT_TY, TEXT_TY_Storage_Flags);
		if (x ~= EMPTY_TEXT_PACKED) TEXT_TY_CastPrimitive(txt, false, x);

		return x;
	}
	return 0;
];

[ TEXT_TY_Untransmute txt pk cp x;
	if ((pk) && (txt-->0 == UNPACKED_TEXT_STORAGE)) {
		print "Untransmuting unpacked txt=", txt, " pk=", pk, " cp=", cp, "^";
		x = txt-->1; ! The old value was an unpacked string
		FlexFree(x);
		txt-->0 = cp;
		txt-->1 = pk; ! The value earlier returned by TEXT_TY_Temporarily_Transmute
	}
	return txt;
];
-) instead of "Transmutation" in "Text.i6t"

include (-
[ FlexFree block fromtxb ptxb memsize;
	@getmemsize memsize;
	print "FlexFree ", (BlkValueDebug) block, " memsize=", memsize, "^";
	print "^";
	if (block == 0) return;
	if ((block->BLK_HEADER_FLAGS) & BLK_FLAG_RESIDENT) return;
	if ((block->BLK_HEADER_N) & $80) return; ! not a flexible block at all
	if ((block->BLK_HEADER_FLAGS) & BLK_FLAG_MULTIPLE) {
		print "Block is multiple^";
		if (block-->BLK_PREV ~= NULL) {
			if ((block-->BLK_PREV)-->BLK_NEXT ~= block) {
				print "Block ", block, " with previous block ", block-->BLK_PREV, " does not match previous block's next block: ", (block-->BLK_PREV)-->BLK_NEXT, "^";
				FlexError("contains bad links");
			}
			(block-->BLK_PREV)-->BLK_NEXT = NULL;
		}
		fromtxb = block;
		for (:(block-->BLK_NEXT)~=NULL:block = block-->BLK_NEXT) {
			print "current block is ", block, ", next=", block-->BLK_NEXT, ", previous=", block-->BLK_PREV, "(NULL=", NULL, ")^";
		}
		while (block ~= fromtxb) {
			print "Freeing component block ", block, "^";
			ptxb = block-->BLK_PREV; FlexFreeSingleBlockInternal(block); block = ptxb;
		}
	}
	print "Freeing original block ", block, "^";
	FlexFreeSingleBlockInternal(block);
];

! etc…
-) instead of "Deallocation" in "Flex.i6t".
[/code]

I noticed this was the first heap address to appear in any debug message.

There’s some kind of code that will do by-reference calls, something like {byref:value}. I’m on my phone so I can’t find the exact one. Try looking in the Standard Rules, or in some other technically complicated extension.

“{-by-reference:val}” is what Dannii is thinking of; I see definitions in the standard rules like

To say (val - sayable value of kind K)
        (documented at phs_value):
        (- print ({-printing-routine:K}) {-by-reference:val}; -).

So that’s what that means. What does -my:val mean?

Ensures that a local variable is defined in the current function.

I have a (more) minimal code snippet that reproduces the problem! It requires one include: Dynamic Tables by Jesse McGrew.

[code]Include Dynamic Tables by Jesse McGrew.

Test is a room.

Table of AI Action Options
Option
a stored action
with 20 blank rows

The repeat count is a number that varies. The repeat count is 1.

Every turn:
	Repeat with i running from 1 to the repeat count:
		blank out the whole of Table of AI Action Options;
		choose a blank row in Table of AI Action Options;
		now the option entry is the action of waiting;
	say "[repeat count] repeats completed.";
	now the repeat count is the repeat count * 2;

test me with "z/z/z/z/z/z/z/z/z/z/z/z/z"[/code]

That’s how many z’s it took to crash my IDE. If yours doesn’t crash, try it a few more times…

Thanks. I can reproduce that. (And if the repeat count starts at 6000, it dies on the first “z”.)

I will dig into this.

I looked at it a little bit, but I think vaporware will have to debug it. (Or someone else more familiar with Dynamic Tables than I am.)

The crash stack trace:

[** Programming error: tried to write outside memory using --> **]
Debug: Glulxe fatal error: Memory access out of range
Debug: RT__ChLDW() (pc=$23E0)
Debug: x=898171936 ($35890420); y=3
Debug: FlexFree() (pc=$5443F)
Debug: block=898171936 ($35890420); fromtxb=1114316 ($1100CC); ptxb=0
Debug: ForceTableEntryBlank() (pc=$2BEC4)
Debug: tab=512097 ($7D061), T2_ai_action_options[2]; col=1; row=1; i=0; at=0; oldv=1114316 ($1100CC); flags=1643 ($66B)
Debug: TableBlankOutRow() (pc=$2CD05)
Debug: tab=512097 ($7D061), T2_ai_action_options[2]; row=1; k=1
Debug: TableBlankOutAll() (pc=$2CD83)
Debug: tab=512097 ($7D061), T2_ai_action_options[2]; n=21 ($15); k=1
Debug: KERNEL_0() (pc=$11F11)
Debug: tmp_0=5712 ($1650); ct_0=512097 ($7D061), T2_ai_action_options[2]; ct_1=1
Debug: R_815() (pc=$11ECE)
Debug: I7RBLK=0
Debug: B8_every_turn() (pc=$1DAF6)
Debug: forbid_breaks=0; rv=0
Debug: FollowRulebook() (pc=$29E37)
Debug: rulebook=8; parameter=0; no_paragraph_skips=0; rv=121577 ($1DAE9), B8_every_turn(); ss=0; spv=0
Debug: R_15() (pc=$11D6C)
Debug: (no locals)
Debug: B1_turn_sequence() (pc=$1D7BE)
Debug: forbid_breaks=0; rv=0; original_deadflag=0
Debug: FollowRulebook() (pc=$29E37)
Debug: rulebook=1; parameter=0; no_paragraph_skips=0; rv=120501 ($1D6B5), B1_turn_sequence(); ss=0; spv=0
Debug: Main() (pc=$E27D)
Debug: (no locals)
Debug: Main__() (pc=$46)
Debug: (no locals)
Glulxe fatal error: Memory access out of range (3589042C)

What I observe about this is that on the crashy iteration, the table entry (row 1) has BLK_FLAG_MULTIPLE set. But BLK_PREV and BLK_NEXT are both 0, not NULL. This causes a blowup in FlexFree(), on the line “(block-->BLK_PREV)-->BLK_NEXT = NULL;”
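To make the 0-versus-NULL point concrete, here is a small Python model of the guarded unlink step. It is only a sketch: memory is a word-indexed list rather than Glulx byte-addressed memory, the offsets are assumed for illustration, and `Memory`, `OutOfRange`, and `unlink_from_previous` are hypothetical names. The guard compares against NULL (which is -1 in Inform 6), so a link of 0 passes the test and the write goes somewhere it should not.

```python
NULL = 0xFFFFFFFF   # Inform 6 NULL is -1: all bits set in a 32-bit word
BLK_NEXT = 3        # illustrative word offsets, assumed for this sketch
BLK_PREV = 4


class OutOfRange(Exception):
    """Raised where a real interpreter would report a fatal memory error."""


class Memory:
    """Word-indexed memory with the kind of bounds check an interpreter does."""

    def __init__(self, size_words):
        self.words = [0] * size_words

    def _check(self, addr):
        if not 0 <= addr < len(self.words):
            raise OutOfRange(f"memory access out of range ({addr:X})")

    def read(self, addr):
        self._check(addr)
        return self.words[addr]

    def write(self, addr, value):
        self._check(addr)
        self.words[addr] = value


def unlink_from_previous(mem, block):
    """Model of: if (block-->BLK_PREV ~= NULL) (block-->BLK_PREV)-->BLK_NEXT = NULL;
    Returns the address that was written, or None if the guard skipped the write."""
    prev = mem.read(block + BLK_PREV)
    if prev == NULL:
        return None
    target = prev + BLK_NEXT
    mem.write(target, NULL)
    return target


mem = Memory(32)
block = 8

mem.write(block + BLK_PREV, NULL)          # a proper "no predecessor" marker
print(unlink_from_previous(mem, block))    # guard skips the write

mem.write(block + BLK_PREV, 0)             # zeroed link, as seen in the crash
print(unlink_from_previous(mem, block))    # guard passes; low memory is overwritten

mem.write(block + BLK_PREV, 0x35890420)    # a garbage link
try:
    unlink_from_previous(mem, block)
except OutOfRange as err:
    print(err)
```

The same reasoning applies to the loop that walks BLK_NEXT until it sees NULL: a zero or garbage link never matches NULL, so the walk continues into whatever that address happens to contain.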

Great work Mike!

Why is the extension needed? It doesn’t look like it’s being used.

Without the extension, it doesn’t crash.

What the extension is supposed to do is check when a table becomes full and then allocate more rows for it. But maybe what is happening here is that it’s allocating more rows when it’s not supposed to, and then not deallocating them.

Kerkerkruip needs the extension because Dynamic Objects requires it. The Table of Locale Priorities is supposed to have one row for every thing in the game, so adding more things means adding more rows.

Ahh, so that’s why Dynamic Tables is included. I wonder how often it is ever actually needed… usually the Table of Locale Priorities will be far larger than anything in one place, even including dynamic objects.

I’ll take a look at the extension when I get the chance. Clearly something is breaking even though the extension shouldn’t be doing anything. Hopefully that means it will be easier to track down, because the bug must be before it ever checks if a table is actually full.

I think I’ve solved it! I don’t know why I didn’t check this way back at the beginning.

Here’s a truly minimal source that replicates the bug:

[code]
Include (-
[ ForceTableEntryBlank tab col row i at oldv flags;
	if (col >= 100) col = TableFindCol(tab, col);
	if (col == 0) rtrue;
	flags = (tab-->col)-->1;
	oldv = (tab-->col)-->(row+COL_HSIZE);
	if ((flags & TB_COLUMN_ALLOCATED) && (oldv ~= 0 or TABLE_NOVALUE))
		FlexFree(oldv); !!! SHOULD BE BlkValueFree !!!
	(tab-->col)-->(row+COL_HSIZE) = TABLE_NOVALUE;
	if (flags & TB_COLUMN_NOBLANKBITS) return;
	row--;
	at = TB_Blanks + ((tab-->col)-->2); ! originally: at = ((tab-->col)-->2) + (row/8);
	(at->0) = (at->0) | (CheckTableEntryIsBlank_LU->(row%8)); ! originally: (TB_Blanks->at) = (TB_Blanks->at) | (CheckTableEntryIsBlank_LU->(row%8));
];
-) instead of "Force Entry Blank" in "Tables.i6t".

[This is the original from Tables.i6t:]
[
Include (-
[ ForceTableEntryBlank tab col row i at oldv flags;
	if (col >= 100) col = TableFindCol(tab, col);
	if (col == 0) rtrue;
	flags = (tab-->col)-->1;
	oldv = (tab-->col)-->(row+COL_HSIZE);
	if ((flags & TB_COLUMN_ALLOCATED) && (oldv ~= 0 or TABLE_NOVALUE))
		BlkValueFree(oldv);
	(tab-->col)-->(row+COL_HSIZE) = TABLE_NOVALUE;
	if (flags & TB_COLUMN_NOBLANKBITS) return;
	row--;
	at = ((tab-->col)-->2) + (row/8);
	(TB_Blanks->at) = (TB_Blanks->at) | (CheckTableEntryIsBlank_LU->(row%8));
];
-) instead of "Force Entry Blank" in "Tables.i6t";
]

Test is a room.

Table of AI Action Options
Option
a stored action
with 20 blank rows

The repeat count is a number that varies. The repeat count is 1.

Every turn:
	Repeat with i running from 1 to the repeat count:
		blank out the whole of Table of AI Action Options;
		choose a blank row in Table of AI Action Options;
		now the option entry is the action of waiting;
	say "[repeat count] repeats completed.";
	now the repeat count is the repeat count * 2;

test me with "z/z/z/z/z/z/z/z/z/z/z/z/z"[/code]

You’ll note that I’ve shown the original code from the current version of Tables.i6t. My guess is that this code changed when Graham fixed the two bugs Zarf posted on the very first page of this thread. I’m assuming Dynamic Tables hasn’t been updated since then. Changing FlexFree to BlkValueFree seems to solve the problem. There may be other changes needed in Dynamic Tables - I’ll check the rest of the code against Tables.i6t.
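One way to picture why the choice of deallocator matters is the toy Python model below. It rests on an assumption that has not been checked against the current templates: that a table cell for a stored action holds a higher-level handle to the value rather than the raw address of the heap block, so the low-level routine misreads it. `Heap`, `ValueStore`, `low_level_free`, and `value_free` are hypothetical names standing in for the roles FlexFree and BlkValueFree appear to play; they are not the real routines.

```python
class Heap:
    """Toy heap: maps block addresses to payloads."""

    def __init__(self):
        self.blocks = {}
        self._next = 0x1000

    def allocate(self, payload):
        addr = self._next
        self._next += 0x10
        self.blocks[addr] = payload
        return addr

    def low_level_free(self, addr):
        # Plays the FlexFree role: only valid for a real heap block address.
        if addr not in self.blocks:
            raise ValueError(f"{addr:#x} is not a heap block")
        del self.blocks[addr]


class ValueStore:
    """Toy table column: each cell holds a handle, not a heap address (assumed)."""

    def __init__(self, heap):
        self.heap = heap
        self.handles = {}        # handle -> heap block address
        self._next_handle = 1

    def new_value(self, payload):
        handle = self._next_handle
        self._next_handle += 1
        self.handles[handle] = self.heap.allocate(payload)
        return handle

    def value_free(self, handle):
        # Plays the BlkValueFree role: resolve the handle, then free the block.
        self.heap.low_level_free(self.handles.pop(handle))


heap = Heap()
store = ValueStore(heap)
cell = store.new_value("waiting")

try:
    heap.low_level_free(cell)    # wrong level: the cell is a handle, not a block
except ValueError as err:
    print("low-level free rejected the cell:", err)

store.value_free(cell)           # right level: the handle is resolved first
print(len(heap.blocks))          # the heap block has been released
```

The toy heap rejects the bad address cleanly. A real heap routine has no such lookup table and simply interprets whatever it finds at that address as a block header, which would be consistent with the zeroed links and garbage addresses in the stack traces above.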