Allow creation of a size-32768 byte array in I6?

Would it be possible to allow for creation of a normal byte array of size 32768 in I6, such that:

Array big_array -> 32768;

would not be rejected by the compiler as being too big?

For such an array, array access would be in the range ->0 to ->32767, which fits within the maximum positive integer limit.

The relevant check in array.c (6.35 version) is:

extern void make_global(int array_flag, int name_only)
...
            if (!glulx_mode) {
                if ((AO.value <= 0) || (AO.value >= 32768))
                {   error("An array must have between 1 and 32767 entries");
                    AO.value = 1;
                }
            }

It looks like a local variable array_type is already being initialized with a constant code to indicate the type of array, so perhaps a modification like:

extern void make_global(int array_flag, int name_only)
...
            if (!glulx_mode) {
                if ((AO.value <= 0) || (AO.value > 32768) || (AO.value == 32768 && array_type != BYTE_ARRAY))
                {   error("An array must have between 1 and 32767 entries");
                    AO.value = 1;
                }
            }

is the minimum change.
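The boundary behavior of that modified condition can be checked in isolation. The following is a minimal sketch, not compiler code: size_rejected and its is_byte_array flag are hypothetical stand-ins for the test on AO.value and the array_type != BYTE_ARRAY comparison.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the modified Z-machine check in make_global().
   Returns true when the requested entry count would be rejected. */
bool size_rejected(long value, bool is_byte_array)
{
    return (value <= 0)
        || (value > 32768)
        || (value == 32768 && !is_byte_array);
}
```

Under this condition 32768 is accepted only for byte arrays, 32767 is still accepted for every type, and 32769 is rejected for every type.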

It does not appear that, at present, the compiler differentiates the size limit by the type of array. Array allocation checking by type would have to be a little more complex because the correlation of requested size to allocated size varies:

  • string arrays reserve a ->0 entry for array length, so effectively take one more byte than the specified size
  • buffer arrays similarly reserve a -->0 entry, which adds 2 bytes (for Z-machine) to the specified size
  • word (-->) arrays of size 32768 would take up the entirety of writeable memory on the Z-Machine (though presumably this potential problem is already handled by normal writeable memory size checking)
  • table arrays reserve a -->0 entry for array length, similar to a string array
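The relationship between requested size and allocated size described in the list above can be sketched as follows. This is an illustration for the Z-machine only, assuming a 2-byte word. The enum, the ZCODE_WORD_SIZE macro and the function zcode_array_bytes are hypothetical names, not taken from the compiler source.

```c
/* Hypothetical names for illustration; the compiler's own constants differ. */
enum sketch_array_type {
    SK_BYTE, SK_WORD, SK_STRING, SK_TABLE, SK_BUFFER
};

#define ZCODE_WORD_SIZE 2L  /* assumption: Z-machine words are 2 bytes */

/* Bytes occupied by an array declared with n entries. */
long zcode_array_bytes(enum sketch_array_type type, long n)
{
    switch (type) {
        case SK_BYTE:   return n;                          /* no header */
        case SK_WORD:   return n * ZCODE_WORD_SIZE;        /* no header */
        case SK_STRING: return n + 1;                      /* ->0 length byte */
        case SK_TABLE:  return (n + 1) * ZCODE_WORD_SIZE;  /* -->0 length word */
        case SK_BUFFER: return n + ZCODE_WORD_SIZE;        /* -->0 length word */
    }
    return -1; /* unknown type */
}
```

For example, zcode_array_bytes(SK_WORD, 32768) comes to 65536 bytes, which is the case the third bullet describes, while zcode_array_bytes(SK_BUFFER, 32766) comes to exactly 32768 bytes.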

To refine the limit check by array type, perhaps the check block could look something like this (totally untested and possibly wrong) code:

    if (AO.value <= 0) {
        error("An array may not have 0 or fewer entries");
        AO.value = 1;
    } else {
        if (glulx_mode) {
            if (AO.value & 0x80000000) {
                error("An array may not have more than 2147483647 entries");
                AO.value = 1;
            }
        } else { // Z-machine
            switch (array_type) {
                case BYTE_ARRAY:
                    if (AO.value > 32768) {
                        error("A byte array must have between 1 and 32768 entries");
                        AO.value = 1;
                    }
                    break;
                case WORD_ARRAY:
                    if (AO.value > 32768) {
                        error("A word array must have between 1 and 32768 entries");
                        AO.value = 1; // make_global is void, so recover as the existing check does
                    }
                    break;
                case STRING_ARRAY:
                    if (AO.value >= 256) {
                        error("A string array must have between 1 and 255 entries");
                        AO.value = 1;
                    }
                    break;
                case TABLE_ARRAY:
                    if (AO.value >= 32767) {
                        error("A table array must have between 1 and 32766 entries");
                        AO.value = 1;
                    }
                    break;
                case BUFFER_ARRAY:
                    if (AO.value >= 32767) {
                        error("A buffer array must have between 1 and 32766 entries");
                        AO.value = 1;
                    }
                    break;
            }
        }
    }

Note that the above:

  • allows Array big_array -> 32768;, to fix the motivating issue
  • allows Array big_array --> 32768;, and depends on normal writable memory sizing checks to reject arrays that are too large on a practical basis
  • allows string arrays only up to 255 entries, because I think that’s the maximum sensible value for this type’s ->0 size entry
  • allows table arrays only up to 32766 entries, because this array type requires a -->0 size entry
  • allows buffer arrays only up to 32766 entries, because this array type requires a -->0 size entry that takes up positions ->0 and ->1 (for Z-machine)
  • leaves Glulx constraints logically the same as current code, because I don’t know enough about Glulx

An admittedly cursory search didn’t turn up any constant definitions in the compiler code that set the word length and/or maximum positive integer appropriate for the VM. If such constants exist, it would be better to use them instead of hardcoded values where applicable, e.g.

                case BUFFER_ARRAY:
                    // size word plus data bytes must fit in byte indices 0 .. VM_MAX_POSITIVE_INTEGER
                    if ((AO.value + VM_WORD_SIZE) > (VM_MAX_POSITIVE_INTEGER + 1)) {
                        error("A buffer array must have between 1 and 32766 entries"); // can put computed upper limit number in string here?
                        AO.value = 1;
                    }
                    break;

Also, in that case the detailed switch block could work for both VMs (if the upper limit in the error message can be computed for display).
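As a rough sketch of what a computed limit and message might look like: the constants VM_WORD_SIZE and VM_MAX_POSITIVE_INTEGER are assumed (given Z-machine values here), and max_buffer_entries and format_buffer_limit_error are hypothetical helpers. It assumes the limit is the largest size for which every byte index of the array stays at or below the maximum positive integer, and it formats the text with standard snprintf rather than any routine from the compiler.

```c
#include <stdio.h>

/* Assumed constants with Z-machine values; not found in the compiler source. */
#define VM_WORD_SIZE 2L
#define VM_MAX_POSITIVE_INTEGER 32767L

/* Largest buffer array size: the size word plus the data bytes must occupy
   no more than VM_MAX_POSITIVE_INTEGER + 1 bytes in total. */
long max_buffer_entries(void)
{
    return VM_MAX_POSITIVE_INTEGER + 1 - VM_WORD_SIZE;
}

/* Writes the error text, with the computed limit, into out.
   Returns whatever snprintf returns. */
int format_buffer_limit_error(char *out, size_t out_size)
{
    return snprintf(out, out_size,
        "A buffer array must have between 1 and %ld entries",
        max_buffer_entries());
}
```

With the Z-machine values above the computed limit is 32766, matching the hardcoded message. Glulx values would need to be substituted by someone who knows that VM's constraints.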

Note that in the 6.35 source, the size check for string arrays currently occurs in the function finish_array, separate from the size check in make_global. Also note that 6.35 accepts the declaration of a buffer array of size 32767 and stores this value in the -->0 size word, but array access then has some issues at the high end, e.g.

[** Programming error: tried to write to ->32767 in the buffer array "too_big", which has entries 2 up to -32768 **]
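The negative upper bound in that message is consistent with a 16-bit signed wraparound: a buffer array of 32767 entries has its data in bytes ->2 through ->32768, and 32768 does not fit in a signed 16-bit word. A minimal sketch of the arithmetic follows. It assumes, without having checked the veneer source, that the bound is the last data index read back as a signed 16-bit value. Both function names are hypothetical.

```c
#include <stdint.h>

/* Reinterpret the low 16 bits of v as a signed Z-machine word,
   without relying on implementation-defined conversions. */
int32_t as_signed_word(uint32_t v)
{
    uint32_t low = v & 0xFFFFu;
    return (low >= 0x8000u) ? (int32_t)low - 0x10000 : (int32_t)low;
}

/* Last byte index of a buffer array: 2 bytes of size word, then the data. */
int32_t buffer_last_index(uint32_t entries)
{
    return as_signed_word(entries + 2u - 1u);
}
```

Here buffer_last_index(32767) evaluates to -32768, while buffer_last_index(32766) stays at 32767, the largest index that is still positive.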

It’s a lot of work to make an array one byte bigger.


The short version is a one-line change. The long version (or a proper version of it) fixes an existing bug for buffer arrays.

Is there more that would need to be done other than what I laid out above?

I don’t know. Arrays are a core feature and get used in a lot of different ways. There’s code which interacts with array limits in the compiler and veneer, as well as all the user code which might manipulate arrays. When you let the value 32768 enter the mix, you’re asking for lurking signed-int bugs.

My impulse is to be very conservative about this limit without a really compelling reason. I don’t see slightly larger arrays to be a compelling reason.