I6: How to copy an array from a local to global?

matt_weiner · November 24, 2016, 2:25am

Basically bumping from here in the hopes that someone who knows I6 will see this… I guess I shouldn’t run into intractable problems over an American holiday. Happy Thanksgiving, those who celebrate it! Hope you had/are having relatively easy trips, if you’re taking them.

So I’m working on something where I want to take a string that has been stored in an I6 array and hand it over to an I7 rulebook for processing. The context is a call to LanguageContraction whose argument is StorageForShortName, from this code block within PrefaceByArticle (for Glulx only):

if (findout) { if (pluralise) Glulx_PrintAnyToArray(StorageForShortName, 160, EnglishNumber, pluralise); else Glulx_PrintAnyToArray(StorageForShortName, 160, PSN__, obj); acode = acode + 3*LanguageContraction(StorageForShortName); }

(Apologies; the copy-paste from the web I6 template is munging the tab stops. But it’s I6 code so they shouldn’t be essential.)

LanguageContraction as it stands is very simple:

[ LanguageContraction text; if (text->0 == 'a' or 'e' or 'i' or 'o' or 'u' or 'A' or 'E' or 'I' or 'O' or 'U') return 1; return 0; ]

What I’d like to do is replace that with a call to an I7 rulebook that does something more complex than simply check whether the first letter of text is a vowel. (So we can tell it that it should be “an hourglass,” “a unicorn,” and other things.) In the other thread, Draconis said that when he has to pass a string from I6 to I7, he copies it into a buffer and makes an I7 say phrase to read that buffer out. (This is also how Text Capture exports its captured text buffer to the “[captured text]” substitution.) So I made this routine:

[code]To say the/-- I6 buffer:
(- PrintI6Buffer(); -).

Include (-

Array I6Buffer buffer 250;

[ PrintI6Buffer len i;
    len = I6Buffer-->0;
    for ( i = 0 : i < len : i++ )
    {
        glk_put_char_uni(I6Buffer-->(i + 1));
    }
];

-)
.[/code]

(Changing “buffer” to “table” seems to leave everything the same.)

And then I defined the rulebook I wanted to run:

[code]The initial sound rules are a rulebook. The initial sound rules have outcomes vowel and consonant.

An initial sound rule (this is the basic initial sound test rule):
	let temp be the substituted form of "[I6 buffer]";
	say "(DBG: [temp])";
	if "[temp]" starts with a vowel sound:
		vowel;
	otherwise:
		consonant.[/code]

and tried to edit it into Language Contraction, thus:

[code]Include (-

Constant LanguageAnimateGender = male;
Constant LanguageInanimateGender = neuter;
Constant LanguageContractionForms = 2; ! English has two:
! 0 = starting with a consonant
! 1 = starting with a vowel
[ LanguageContraction text result rv i;

    !!! This is the old routine:
    !if (text->0 == 'a' or 'e' or 'i' or 'o' or 'u'
    !    or 'A' or 'E' or 'I' or 'O' or 'U') return 1;
    !return 0;
    !!! Out w/ old - in w/ new:
    for (i=1:i<20:i++) I6Buffer-->i = text-->i;
    rv = FollowRulebook( (+ initial sound rules +) );
    if ((rv) && RulebookSucceeded()) {
        result = ResultOfRule();
        if (result == (+ vowel outcome +)) return 1;
        return 0;
    }
    return 0;
];
Array LanguageArticles -->
!   Contraction form 0:     Contraction form 1:
!   Cdef   Def    Indef     Cdef   Def    Indef
    "The " "the " "a "      "The " "the " "an "      ! Articles 0
    "The " "the " "some "   "The " "the " "some ";   ! Articles 1
!             a           i
!             s     p     s     p
!             m f n m f n m f n m f n
Array LanguageGNAsToArticles --> 0 0 0 1 1 1 0 0 0 1 1 1;
-) instead of "Articles" in "Language.i6t".[/code]

Running this on a test scenario (which I’ll put in spoilers at the end of the post) yields this:

So obviously the copy isn’t working here, as the I6 Buffer is turning into some Unicode stuff. I’m pretty sure the suspect line is this:

for (i=1:i<20:i++) I6Buffer --> i = text --> i;

But nothing else I’m trying works. If I change i<20 to i<90 or higher, I get a Glulxe fatal error: memory access out of range. (-76FFFF9E or so.) If I try BlkValueCopy:

BlkValueCopy(I6Buffer, text);

I get “You can see” and then a memory access out of range error (686F6E69).

If I try using BlkValueWrite and BlkValueRead:

for (i=1:i<89:i++) BlkValueWrite(I6Buffer, i, BlkValueRead(text, i));

I get the same memory access out of range error (686F6E66).

I’m at my wit’s end here and I’ve been brickwalled for something like a week on this. Obviously I don’t know what I’m doing, but this seems like the simplest freaking operation in I6 (I just want to copy one array into another), and the I6 Designer’s Manual doesn’t have anything about how to do this, while Appendix B doesn’t give me any hints either. Is it that I don’t know the type of the things I’m trying to copy? And this all arises from trying to puzzle out how Inform manages to guess what articles to use, which I’m no further along with; I can see where it’s looking for an initial vowel, but I don’t know how it gets the string that it’s about to print sequestered into something that lets it check the initial vowel before printing.

Anyway, Happy Thanksgiving, and I appreciate any help you can give!

Test bed (largely courtesy of Skinny Mike):

[spoiler][code]The Lab is a room.

An animal can be honest or dishonest. Understand the honest property as describing an animal.

Before printing the name of an animal (called beast): say "[if beast is dishonest]dis[end if]honest ".

The unicorn is an animal. It is honest.
The yak is an animal. It is dishonest.

A block is a kind of thing. It has a number called weight. The description of a block is "It looks to be about [weight of the item described in words] pound[s]."
After examining a block (called B):
	now the printed name of B is "[weight of B in words] - pound block of [B]".

The yttrium is a block. The weight is one.

The uniform is wearable. The description is "It's a United States Army uniform."
After examining the uniform for the first time:
	now the printed name of the uniform is "United States Army uniform".

A unicorn, a uniform, an hourglass, a yak and yttrium are in the lab.

[Matt's code - additions noted]

[create a global variable to be called later in i6:]
[Current vowel sound is a number that varies. Current vowel sound variable translates into i6 as "vowel_sound".

Before printing the name of a thing (called x):
	let T be a text;
	now T is the printed name of x;
	now Current vowel sound is T evaluated.
	[say Current vowel sound.]

To decide what number is (s - a text) evaluated:
	if s starts with a vowel sound:
		decide on 1;
	decide on 0. ]

To decide whether (string - a text) starts with a vowel sound (this is vowel sound checking):
	let the first word be punctuated word number 1 in string;
	if the first word is a word listed in the Table of Words That Start With Vowel Sounds, yes;
	if the first word is a word listed in the Table of Words That Don't Start With Vowel Sounds, no;
	if character number 1 in the first word is a vowel, yes;
	no.

To decide whether (letter - a text) is a vowel:
	if letter exactly matches the regular expression "a|e|i|o|u|A|E|I|O|U", yes;
	no.

The initial sound rules are a rulebook. The initial sound rules have outcomes vowel and consonant.

An initial sound rule (this is the basic initial sound test rule):
	let temp be the substituted form of "[I6 buffer]";
	say "(DBG: [temp])";
	if "[temp]" starts with a vowel sound:
		vowel;
	otherwise:
		consonant.

To say the/-- I6 buffer:
	(- PrintI6Buffer(); -).

Include (-

Array I6Buffer buffer 250;

[ PrintI6Buffer len i;
    len = I6Buffer-->0;
    for ( i = 0 : i < len : i++ )
    {
        glk_put_char_uni(I6Buffer-->(i + 1));
    }
];

-)
.

Table of Words That Start With Vowel Sounds
word
"hour"
"hourglass"
"honest"
"yttrium"

Table of Words That Don't Start With Vowel Sounds
word
"uniform"
"unicorn"
"united"
"United"
"one"

[end]

[Include (-
Global vowel_sound = 0;
-) after "Definitions.i6t".]

Include (-

Constant LanguageAnimateGender = male;
Constant LanguageInanimateGender = neuter;
Constant LanguageContractionForms = 2; ! English has two:
! 0 = starting with a consonant
! 1 = starting with a vowel
[ LanguageContraction text result rv i;

    !!! This is the old routine:
    !if (text->0 == 'a' or 'e' or 'i' or 'o' or 'u'
    !    or 'A' or 'E' or 'I' or 'O' or 'U') return 1;
    !return 0;
    !!! Out w/ old - in w/ new:
    for (i=1:i<10:i++) BlkValueWrite(I6Buffer, i, BlkValueRead(text, i));
    rv = FollowRulebook( (+ initial sound rules +) );
    if ((rv) && RulebookSucceeded()) {
        result = ResultOfRule();
        if (result == (+ vowel outcome +)) return 1;
        return 0;
    }
    return 0;
];
Array LanguageArticles -->
!   Contraction form 0:     Contraction form 1:
!   Cdef   Def    Indef     Cdef   Def    Indef
    "The " "the " "a "      "The " "the " "an "      ! Articles 0
    "The " "the " "some "   "The " "the " "some ";   ! Articles 1
!             a           i
!             s     p     s     p
!             m f n m f n m f n m f n
Array LanguageGNAsToArticles --> 0 0 0 1 1 1 0 0 0 1 1 1;
-) instead of "Articles" in "Language.i6t".[/code][/spoiler]

dfremont · November 25, 2016, 7:38pm

Looking at this quickly, I think the main issue is that you’re treating StorageForShortName as a word array (i.e. an array of 4-byte fields, since you’re on Glulx) instead of as a byte array (see page 42 of the DM4). The syntax “array-->index” is used for word arrays, and the syntax “array->index” for byte arrays. If you use the wrong one you’ll end up indexing into the wrong location within the array, or reading multiple entries as a single entry (hence the random-looking Unicode characters). In the context of PrefaceByArticle, it appears that StorageForShortName is being used as a byte array, written using a Glk memory stream and so with no initial entry storing the length. Since your I6Buffer is a word array, this means you want to use code like

I6Buffer-->0 = 10;
for (i=0:i<10:i++) I6Buffer-->(i+1) = text->i;

This appears to work as you want, except that you get junk on the end when the length of the text in StorageForShortName is actually less than 10. For finding the real length, it looks like Glulx_PrintAnyToArray returns the length of the text printed (although only 160 characters at most will be stored in this case), so you could save that return value.
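
As a rough, untested sketch of that copy with the real length passed in: the helper name CopyBytesToWords and the array NameWords are made up for illustration (neither is part of the I6 template), and the array is sized for 160 characters plus a length entry so that word indexing cannot run off the end.

```inform6
Array NameWords --> 161;   ! hypothetical: entry 0 = length, entries 1..160 = characters

! Copy len characters from the byte array src into NameWords.
! src has no length entry, so its first character is src->0.
[ CopyBytesToWords src len i;
    if (len > 160) len = 160;            ! never read or write past 160 characters
    NameWords-->0 = len;                 ! length goes in word 0, by convention
    for (i = 0 : i < len : i++)
        NameWords-->(i + 1) = src->i;    ! -> reads one byte, --> writes one word
];
```

Reading the source as text-->i instead would fetch four consecutive characters as a single number on Glulx, which is where the odd-looking characters come from.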

Since StorageForShortName is being used as a byte array here and can’t store Unicode characters (if I understand correctly they will be converted to ‘?’ by Glk), you might as well save memory by using I6Buffer as a byte array (with ‘->’ everywhere instead of ‘-->’) and glk_put_char instead of glk_put_char_uni.

matt_weiner · November 25, 2016, 10:05pm

Thank you so much, Daniel! That was incredibly helpful… after a whole bunch of banging around with it I realized that I also had an off-by-one error, starting the copy loop at entry 1 rather than entry 0 of the text; I guess StorageForShortName is one of those arrays (“string”?) that doesn’t have its length in the first entry, while I6Buffer is one of those arrays (“table”) that does. (Although I defined it as “buffer,” which doesn’t seem to be documented in DM4; is that defined somewhere in the I6 template?) And of course looking back at your post you have that correction.

I had thought that I wouldn’t need to worry about the junk at the end, because all I care about in general is going to be the beginning of the string–but it turns out that the junk was affecting the individual words, which wouldn’t do. (Overwriting “dishonest yak” with “yttrium” yielded “yttriumst yak,” and my rules don’t recognize “yttriumst” as a word that starts with a vowel sound.) But your tip about Glulx_PrintAnyToArray solved that.
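
A sketch of how that length might be captured, assuming (as Daniel describes) that Glulx_PrintAnyToArray returns the number of characters printed, which can be larger than the 160 that actually get stored. CaptureShortName is a hypothetical wrapper, not a template routine.

```inform6
! Hypothetical wrapper: print obj's short name into StorageForShortName
! and return how many of its characters are valid.
[ CaptureShortName obj len;
    len = Glulx_PrintAnyToArray(StorageForShortName, 160, PSN__, obj);
    if (len > 160) len = 160;   ! assumption: the count can exceed what was stored
    return len;                 ! bytes 0 .. len-1 of StorageForShortName are valid
];
```

Copying and printing exactly that many characters is what keeps leftovers from a longer, earlier name (the “st yak” tail) out of the result.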

It’s even working with characters like ü and Ø and Æ if I put the relevant word on the list of words that start with vowel sounds. Which is surprising in light of what you said about Unicode. Or are those not Unicode?

Now all I have to do is find the part that is making everything print on a different line, but in my adventures looking through the I6 template I remember running across something about the flag that produces line breaks, so I should be able to work that out.

On further reflection, urgh. The thing I ran across is informing me that there is no such thing as the flag that produces line breaks. The problem seems to be that my initial sound rulebook wants to produce line breaks. I bonked all but one of them by adding a “skip upcoming rulebook break” phrase as the first initial sound rule (it’s in the current code in the spoiler), but I’m still getting a spurious break before the first article. Maybe I have to add that to the “you can also see” rule as well.

Thank you so much! As I said, this has been bugging me for a while.

Current code:

[spoiler][code]The Lab is a room.

An animal can be honest or dishonest. Understand the honest property as describing an animal.

Before printing the name of an animal (called beast): say "[if beast is dishonest]dis[end if]honest ".

The unicorn is an animal. It is honest.
The yak is an animal. It is dishonest.

A block is a kind of thing. It has a number called weight. The description of a block is "It looks to be about [weight of the item described in words] pound[s]."
After examining a block (called B):
	now the printed name of B is "[weight of B in words]-pound block of [B]".

The yttrium is a block. The weight is one.

The uniform is wearable. The description is "It's a United States Army uniform."
After examining the uniform for the first time:
	now the printed name of the uniform is "United States Army uniform".

A unicorn, a uniform, an hourglass, a yak and yttrium are in the lab. An Æsop anthology is in the lab.

[Matt's code - additions noted]

[create a global variable to be called later in i6:]
[Current vowel sound is a number that varies. Current vowel sound variable translates into i6 as "vowel_sound".

Before printing the name of a thing (called x):
	let T be a text;
	now T is the printed name of x;
	now Current vowel sound is T evaluated.
	[say Current vowel sound.]

To decide what number is (s - a text) evaluated:
	if s starts with a vowel sound:
		decide on 1;
	decide on 0. ]

To decide whether (string - a text) starts with a vowel sound (this is vowel sound checking):
	let the first word be punctuated word number 1 in string;
	if the first word is a word listed in the Table of Words That Start With Vowel Sounds, yes;
	if the first word is a word listed in the Table of Words That Don't Start With Vowel Sounds, no;
	if character number 1 in the first word is a vowel, yes;
	no.

To decide whether (letter - a text) is a vowel:
	if letter exactly matches the regular expression "a|e|i|o|u|A|E|I|O|U", yes;
	no.

The initial sound rules are a rulebook. The initial sound rules have outcomes vowel and consonant.

To skip upcoming rulebook break: (- say__pc = say__pc | PARA_NORULEBOOKBREAKS; -).

First initial sound rule: skip upcoming rulebook break.

An initial sound rule (this is the basic initial sound test rule):
	let temp be the substituted form of "[I6 buffer]";
	[say "(DBG: [temp])";]
	if "[temp]" starts with a vowel sound:
		vowel;
	otherwise:
		consonant.

To say the/-- I6 buffer:
	(- PrintI6Buffer(); -).

Include (-

Array I6Buffer buffer 250;

[ PrintI6Buffer len i;
    len = I6Buffer->0;
    for ( i = 0 : i < len : i++ )
    {
        glk_put_char(I6Buffer->(i + 1));
    }
];

-)
.

Table of Words That Start With Vowel Sounds
word
"hour"
"hourglass"
"honest"
"yttrium"
"Æsop"

Table of Words That Don't Start With Vowel Sounds
word
"uniform"
"unicorn"
"united"
"United"
"one"

[end]

[Include (-
Global vowel_sound = 0;
-) after "Definitions.i6t".]

Include (-
Global short_name_case;

[ PrefaceByArticle obj acode pluralise capitalise i artform findout artval buflen;
if (obj provides articles) {
    artval=(obj.&articles)-->(acode+short_name_case*LanguageCases);
    if (capitalise)
        print (Cap) artval, " ";
    else
        print (string) artval, " ";
    if (pluralise) return;
    print (PSN__) obj; return;
}

i = GetGNAOfObject(obj);
if (pluralise) {
    if (i < 3 || (i >= 6 && i < 9)) i = i + 3;
}
i = LanguageGNAsToArticles-->i;

artform = LanguageArticles
    + 3*WORDSIZE*LanguageContractionForms*(short_name_case + i*LanguageCases);

#Iftrue (LanguageContractionForms == 2);
if (artform-->acode ~= artform-->(acode+3)) findout = true;
#Endif; ! LanguageContractionForms
#Iftrue (LanguageContractionForms == 3);
if (artform-->acode ~= artform-->(acode+3)) findout = true;
if (artform-->(acode+3) ~= artform-->(acode+6)) findout = true;
#Endif; ! LanguageContractionForms
#Iftrue (LanguageContractionForms == 4);
if (artform-->acode ~= artform-->(acode+3)) findout = true;
if (artform-->(acode+3) ~= artform-->(acode+6)) findout = true;
if (artform-->(acode+6) ~= artform-->(acode+9)) findout = true;
#Endif; ! LanguageContractionForms
#Iftrue (LanguageContractionForms > 4);
findout = true;
#Endif; ! LanguageContractionForms

#Ifdef TARGET_ZCODE;
if (standard_interpreter ~= 0 && findout) {
    StorageForShortName-->0 = 160;
    @output_stream 3 StorageForShortName;
    if (pluralise) print (number) pluralise; else print (PSN__) obj;
    @output_stream -3;
    acode = acode + 3*LanguageContraction(StorageForShortName + 2);
}
#Ifnot; ! TARGET_GLULX
if (findout) {
    if (pluralise)
        buflen = Glulx_PrintAnyToArray(StorageForShortName, 160, EnglishNumber, pluralise);
    else
        buflen = Glulx_PrintAnyToArray(StorageForShortName, 160, PSN__, obj);
    acode = acode + 3*LanguageContraction(StorageForShortName, buflen);
}
#Endif; ! TARGET_

Cap (artform-->acode, ~~capitalise); ! print article
if (pluralise) return;
print (PSN__) obj;

];
-) instead of "Object Names II" in "Printing.i6t".

Include (-

Constant LanguageAnimateGender = male;
Constant LanguageInanimateGender = neuter;
Constant LanguageContractionForms = 2; ! English has two:
! 0 = starting with a consonant
! 1 = starting with a vowel
[ LanguageContraction text len result rv i;

    !!! This is the old routine:
    !if (text->0 == 'a' or 'e' or 'i' or 'o' or 'u'
    !    or 'A' or 'E' or 'I' or 'O' or 'U') return 1;
    !return 0;
    !!! Out w/ old - in w/ new:
    I6Buffer->0 = len;
    for (i=0:i<len+1:i++) I6Buffer->(i+1) = text->i;
    rv = FollowRulebook( (+ initial sound rules +) );
    if ((rv) && RulebookSucceeded()) {
        result = ResultOfRule();
        if (result == (+ vowel outcome +)) return 1;
        return 0;
    }
    return 0;
];
Array LanguageArticles -->
!   Contraction form 0:     Contraction form 1:
!   Cdef   Def    Indef     Cdef   Def    Indef
    "The " "the " "a "      "The " "the " "an "      ! Articles 0
    "The " "the " "some "   "The " "the " "some ";   ! Articles 1
!             a           i
!             s     p     s     p
!             m f n m f n m f n m f n
Array LanguageGNAsToArticles --> 0 0 0 1 1 1 0 0 0 1 1 1;
-) instead of "Articles" in "Language.i6t".[/code][/spoiler]

dfremont · November 25, 2016, 10:41pm

It’s awkward because sometimes StorageForShortName does have its length in the first entry, e.g. in CPrintOrRun. Note that that function also uses both ‘->’ and ‘-->’ to access the array. A ‘buffer’ is a type of array which allows you to use both indexing methods without getting compiler warnings - as far as I know it is only documented here. The bottom line is that arrays are just contiguous regions in memory which the compiler tries to police a little bit, but can ultimately be accessed however you like - putting a length in the first byte or first word is just a convention.
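
To make the “just memory” point concrete, here is a small untested sketch. DemoBuffer and ShowBothViews are hypothetical names, and it assumes Glulx, where a word is four bytes.

```inform6
Array DemoBuffer buffer 16;   ! hypothetical array, declared as a buffer so both views compile cleanly

[ ShowBothViews;
    DemoBuffer->0 = 'a';  DemoBuffer->1 = 'b';
    DemoBuffer->2 = 'c';  DemoBuffer->3 = 'd';
    print (char) DemoBuffer->0, "^";   ! byte view: the single character a
    print DemoBuffer-->0, "^";         ! word view: bytes 0 to 3 read together as one number
];
```

The same four bytes give one character through ‘->’ and one large number through ‘-->’, so which entry counts as “the length” depends entirely on which view the calling code agrees to use.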

As for “ü”, etc., I think all characters that Inform allows in object names (group A in WI 5.10) are stored consistently with the single-byte Latin-1 character encoding used by Glk and will work correctly.

zarf · November 25, 2016, 11:24pm

Yes, byte arrays can store those characters (Unicode values up to U+00FF). If you’re talking about the naive “a/an” algorithm, you only care about ASCII values anyway, so this is good enough.

matt_weiner · November 26, 2016, 2:45am

Thanks for pointing that out about buffer. Sometimes I6 really needs a tour guide!

So StorageForShortName is basically a holding pen that can be used for, well, storing a short name for various routines? CPrintOrRun or PrefaceByArticle or whatever? I guess what matters is that whatever calls LanguageContraction always sends it an array that doesn’t have its length in the first entry, and I think this is the only call to LanguageContraction, so that should be OK. I think.

I’m OK with this only working for the characters that are allowed in object names. Anyone who wants to print something about œnology will have to shift for themselves.

I also took out the last spurious paragraph break before the “You can see” line with this rule:

Last before listing nondescript items when the number of marked for listing things is not 0: skip upcoming rulebook break.

but there are probably other line breaks that need to be taken care of by hand. Which means that (among other reasons) this probably will never be ready for prime time as an extension, but sometime I can clean up the code and post it to Github for other people to tinker with if they choose.

Thanks again!

zarf · November 26, 2016, 2:58am

One day all of the parser guts will be updated to be 32-bit-clean (word arrays everywhere). This has not yet happened.