Some questions about array access in Inform on Z-Machine

otistdog · December 22, 2021, 11:41pm

Question 1: If a negative signed value is used as the index of an array access, for example:

print 0-->n;

where n is -1, is the result the same as that of the corresponding unsigned access, e.g. 0-->$FFFF in this case?

Question 2: If the answer to #1 is yes, why does the Inform compiler reject values >= 32768 in such a case if the value is presented as a number literal? (For both array access and array definition.)

Question 3: What are the odds of this restriction being relaxed so that the compiler will allow this in source code, either via an unsigned number literal like $FFFF or a constant set to that value?

I’m asking mainly because I discovered that the I6 compiler rejects the declaration:

Array large_byte_array -> 32768;

with a complaint that the size value is too large. (Although, technically, for this type of array, wouldn’t the resulting access range be ->0 to ->32767, which would fit in positive signed values?)

If the status quo functionally allows access in range ->$0000 to ->$FFFF, what’s the rationale behind disallowing creation of arrays in that size range? (Is it just that word-type arrays of that size would take up the entirety of dynamic memory?)

Dannii · December 23, 2021, 12:15am

I can’t find exactly where it specifies this in the standard, but my ZVM interpreter treats the store/load opcodes as having signed offsets.

cas · December 23, 2021, 7:23am

I asked (essentially) question 1 around a decade ago (Index to @loadw: signed or unsigned?) and Zarf suggested that in these operations, the resulting address ought to be clipped to 16 bits. If that’s done, the distinction of signed vs unsigned is irrelevant, meaning the answer to question 1 is yes, the result is equivalent.
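The clipping argument can be checked with a little arithmetic. The sketch below is a Python model (not Inform, and not taken from any interpreter) that assumes the address base + 2*index is reduced modulo $10000. The names word_address and to_signed16 are made up for the illustration.

```python
def to_signed16(value):
    """Reinterpret a 16-bit pattern as a signed value."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def word_address(base, index, clip=True):
    """Address referred to by base-->index, i.e. base + 2*index.

    With clip=True the result is reduced to 16 bits (the assumption
    under discussion); with clip=False it is left as a plain integer.
    """
    addr = base + 2 * index
    return addr & 0xFFFF if clip else addr


signed_index = to_signed16(0xFFFF)   # the pattern $FFFF read as signed
unsigned_index = 0xFFFF              # the same pattern read as unsigned

clipped_signed = word_address(0, signed_index)
clipped_unsigned = word_address(0, unsigned_index)
raw_signed = word_address(0, signed_index, clip=False)
raw_unsigned = word_address(0, unsigned_index, clip=False)

print(hex(clipped_signed), hex(clipped_unsigned))
print(raw_signed, hex(raw_unsigned))
```

With clipping, both readings of the index land on $FFFE. Without it they differ (-2 versus $1FFFE), which is where interpreter behaviour could diverge.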

I have no insight into the Inform questions. Sorry.

otistdog · December 23, 2021, 8:14am

Thank you for bringing this previous discussion to my attention. I was doing some “weird science” over in Inform 7 and happened to notice that it was OK with retrieving a value for 0-->-2 in one case.

I note that what I think is the governing Inform 6 veneer routine for word array access, which is

/*  RT__ChLDW: Check at run-time that it's safe to load a word
    and return the word */

[ RT__ChLDW base offset a val;
    a=base+2*offset;
    if (Unsigned__Compare(a,#readable_memory_offset)>=0)
        return RT__Err(25);
    @loadw base offset -> val;
    return val;
];

doesn’t bother to check that the dereferenced address is positive (if considered as signed), only that it is below the cutoff for readable memory (if considered as unsigned).

If negative indices for arrays are intended to be illegal, why the exception for this kind of access only? Wouldn’t something like

[ RT__ChLDW base offset a val;
    a=base+2*offset;
    if (Unsigned__Compare(a,#readable_memory_offset)>=0 || a < 0) ! MODIFIED: signed test, since an unsigned compare against 0 is never negative
        return RT__Err(25);
    @loadw base offset -> val;
    return val;
];

be expected?

Strangely, the sister function RT__ChLDB (a wrapper for @loadb) is structured in essentially the same way.

/*  RT__ChLDB:  check at run-time that it's safe to load a byte
    and return the byte */

[ RT__ChLDB base offset a val;
    a=base+offset;
    if (Unsigned__Compare(a,#readable_memory_offset)>=0)
        return RT__Err(24);
    @loadb base offset -> val;
    return val;
];

… but evaluation of 0->-4 does cause a run-time error (RTE), though I’m not sure why. (Using the example values 0-->-2 and 0->-4, the @loadw wrapper computes base+2*offset as 0+2*(-2) and the @loadb wrapper computes base+offset as 0+(-4). Inform seems to treat those two expressions as equivalent, so I would expect the Unsigned__Compare call to give the same result in both cases.)
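The two address computations can be modelled outside Inform. This Python sketch assumes 16-bit wraparound arithmetic and an Unsigned__Compare that returns -1, 0 or 1. The cutoff value is a made-up stand-in for #readable_memory_offset, and load_check_fails is a hypothetical helper, not a veneer routine.

```python
def unsigned_compare(x, y):
    """Model of Unsigned__Compare: -1, 0 or 1, both values read as unsigned 16-bit."""
    x &= 0xFFFF
    y &= 0xFFFF
    return (x > y) - (x < y)


# Assumption: an arbitrary cutoff standing in for #readable_memory_offset.
READABLE_MEMORY_OFFSET = 0x4000


def checked_address(base, offset, scale):
    """The value 'a' computed by the wrappers: scale is 2 for words, 1 for bytes."""
    return (base + scale * offset) & 0xFFFF


def load_check_fails(base, offset, scale):
    """True when the modelled check would take the RT__Err branch."""
    a = checked_address(base, offset, scale)
    return unsigned_compare(a, READABLE_MEMORY_OFFSET) >= 0


word_addr = checked_address(0, -2, scale=2)   # models 0-->-2
byte_addr = checked_address(0, -4, scale=1)   # models 0->-4
word_fails = load_check_fails(0, -2, scale=2)
byte_fails = load_check_fails(0, -4, scale=1)

print(hex(word_addr), hex(byte_addr), word_fails, byte_fails)
```

Both calls compute $FFFC and so get the same verdict from the check, whatever the cutoff is. This model therefore does not account for the different behaviour observed; it only confirms that the two expressions coincide.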

zarf · December 23, 2021, 4:38pm

Graham forgot to check?

otistdog · December 23, 2021, 5:05pm

For both word and byte access? And for both read and write routines?

/*  RT__ChSTB:  check at run-time that it's safe to store a byte
                and store it */

[ RT__ChSTB base offset val a f;
    a=base+offset;
    if (Unsigned__Compare(a,#array__start)>=0 && Unsigned__Compare(a,#array__end)<0)
        f=1;
    else if (Unsigned__Compare(a,#cpv__start)>=0 && Unsigned__Compare(a,#cpv__end)<0)
        f=1;
    else if (Unsigned__Compare(a,#ipv__start)>=0 && Unsigned__Compare(a,#ipv__end)<0)
        f=1;
    else if (a==$0011)
        f=1;
    if (f==0)
        return RT__Err(26);
    @storeb base offset val;
];

/*  RT__ChSTW:  check at run-time that it's safe to store a word
                and store it */

[ RT__ChSTW base offset val a f;
    a=base+2*offset;
    if (Unsigned__Compare(a,#array__start)>=0 && Unsigned__Compare(a,#array__end)<0)
        f=1;
    else if (Unsigned__Compare(a,#cpv__start)>=0 && Unsigned__Compare(a,#cpv__end)<0)
        f=1;
    else if (Unsigned__Compare(a,#ipv__start)>=0 && Unsigned__Compare(a,#ipv__end)<0)
        f=1;
    else if (a==$0010)
        f=1;
    if (f==0)
        return RT__Err(27);
    @storew base offset val;
];

If I’m reading this set of veneer routines correctly, unsigned index array access is not really an exception at all.
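To make that reading concrete, here is a simplified Python model of the store check. It is a sketch under assumptions: only the array region is modelled (the cpv and ipv regions and the header-address special case are left out), the region bounds are invented, and store_word_allowed is a hypothetical name.

```python
def unsigned_compare(x, y):
    """Model of Unsigned__Compare: -1, 0 or 1, both values read as unsigned 16-bit."""
    x &= 0xFFFF
    y &= 0xFFFF
    return (x > y) - (x < y)


# Assumption: invented bounds standing in for #array__start and #array__end.
ARRAY_START = 0x0400
ARRAY_END = 0x0600


def store_word_allowed(base, offset):
    """Model of the first test in RT__ChSTW: only the final address is examined."""
    a = (base + 2 * offset) & 0xFFFF
    return (unsigned_compare(a, ARRAY_START) >= 0
            and unsigned_compare(a, ARRAY_END) < 0)


inside_negative = store_word_allowed(0x0500, -1)   # address $04FE, inside the region
below_region = store_word_allowed(0x0400, -1)      # address $03FE, just below it
wrapped_high = store_word_allowed(0, -2)           # address $FFFC after wraparound

print(inside_negative, below_region, wrapped_high)
```

In this model a negative index is accepted whenever the resulting address falls inside the permitted region, and rejected otherwise; the sign of the index itself is never looked at.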

Also ZMS 1.1 says:

loadb
2OP:16 10 loadb array byte-index -> (result)
Stores array->byte-index (i.e., the byte at address array+byte-index, which must lie in
static or dynamic memory).

loadw
2OP:15 F loadw array word-index -> (result)
Stores array-->word-index (i.e., the word at address array+2*word-index, which must lie
in static or dynamic memory).

both of which are silent about the signed/unsigned issue. That which is not forbidden is allowed?

zarf · December 23, 2021, 5:20pm

They’re all pretty much the same logic. So, yes.