How and when does I7 guess which indefinite article to use?

Consider the following code:

[code]Use American dialect and the serial comma.

Playroom is a room. "Here you keep your toys."

A chest is a closed openable container in Playroom. A unicorn, Winnie the Pooh, an hourglass, and a ball are in the chest.
A small table is a portable supporter in the chest.
A bucket is in the chest. Some sand is in the bucket. A chameleon is in the bucket.
A blanket is in the chest. The indefinite article of the blanket is "your favorite".

Before printing the name of a closed container: say "closed ".
After printing the name of a closed container: omit contents in listing.

Before printing the name of an open openable container (called vessel) when the current action is looking or the current action is taking inventory or the current action is opening and the vessel is not the noun: say "open[if nothing is in the vessel],[end if] ".
Before printing the name of an open container (called vessel) when nothing is in the vessel: say "empty ".
Before printing the name of an open container (called vessel) when nothing is in the vessel: omit contents in listing.[/code]

Output:

Note that Inform correctly says “a closed chest,” “an open chest,” “a bucket (containing some stuff),” and “an empty bucket,” even though I didn’t tell it how to handle the articles. (Also, “an unicorn” and “a hourglass” are wrong, but I know how to handle that, I think.)

But now consider the following code from Inform example 338, Trachypachidae Maturin 1803:

Before printing the name of a bottle (called target) while not inserting, taking, searching, or removing: if the target is closed, say "sealed "; otherwise say "now open ".

If we delete “now” from the last line:

Before printing the name of a bottle (called target) while not inserting, taking, searching, or removing: if the target is closed, say "sealed "; otherwise say " open ".

then it doesn’t always work, unless you empty the bottle:

Notice “a open jug containing a beetle” and “an open jug.”

Does anyone know what’s going on here? I tried to look at InDefArt (in Printing.i6t) and LanguageGNAsToArticles (in Language.i6t) and was not enlightened.

“Andreas” noticed this a ways back and remarked on the way that example finesses the problem, but it doesn’t look like that thread unearthed the underlying mechanisms (it wound up devoted to fixes and to a different, buggy behavior).

Hey Matt,

Pardon me if I’m stating the obvious, but it seems as though you’ve discovered a bug in either the i6 list writer or the I7 compiler. I’m guessing from your post that you’ve already figured out the work-around to your first problem:

The indefinite article of unicorn is "a". The indefinite article of hourglass is "an".
This works as expected with both your example and when listing inventory. The thing is, you shouldn’t have to do that since you’ve already assigned the correct indefinite articles when you defined “a unicorn” and “an hourglass.”

Note that in your original example if you change “the chest” to “the echest” (changing the leading consonant to a vowel) the behavior remains unchanged. In other words, the container follows the expected rules, but its contents don’t.

As far as the second example goes, I’ve skimmed over the referenced posts, but I’m still trying to digest it all; it’s been some time since I’ve worked with I7. A while back, I was working on an extension to replace the i6 list writer with a more I7-like version, and I got pretty far with the programming and notes on the list writer’s behavior. Unfortunately, all that work was on my other laptop, whose hard drive died. Since I didn’t have that stuff backed up (rookie mistake, I know), it’s gonna take me a bit to be of any real help. I’m intrigued, but if anyone else wants to jump in the pool, feel free. :slight_smile:

Hi Mike,
Thanks for the response! I should’ve probably left out the unicorn and the hourglass, because I feel as though I have a good grasp on what’s going on there: Inform guesses indefinite articles based on whether the first letter of the word is a vowel. And if that makes it guess wrong, you can explicitly set the indefinite article. I don’t actually think it’s necessarily a bug that the Inform compiler doesn’t infer the article from what you use when you create the object (I think this may even be explicitly stated in the docs), though it might be nice if it did.

What I’m really puzzled about is the behavior with “open” and “empty.” I didn’t expect Inform to get those cases right; I expected Inform to infer that the chest and the bucket have the indefinite article “a”, and then to print “a” before their names no matter what else was printed before their names. So I expected “a open chest” and “a empty bucket” as well as “a open jug containing a beetle.” It’s surprising to me that Inform was able to guess that it should be “an open chest” and “an empty bucket,” and I’m completely confused about how it got “an open jug” right but “a open jug containing a beetle” wrong. Something complicated is happening with indefinite articles, but I don’t know what it is.

…wait, I think I’ve got it, at least to some extent. Looking at Printing.i6t, PrefaceByArticle is doing some kind of lookahead before deciding what to print. But there’s a note on StorageForShortName that you mustn’t call this if it would overflow the buffer, or Bad Things will happen. And checking my attempt to modify Trachypachidae Maturin 1803, I was compiling that project to z8; when I compile to Glulx, I do get “an open jug containing a beetle.” So it must be that when the z-machine was planning to print “open jug containing a beetle” that was too long for StorageForShortName and it punted back to the default article for the jug. Whew.

Still curious about exactly how this is working, though. It must be that black magic stuff where when you print the indefinite article of something it silently runs through the activity of printing its name, with the accompanying mishegoss. This:

[code]Lab is a room.

A rock is in Lab. Before printing the name of the rock: say "[one of]orange[or]blue[or]green[or]red[or]indigo[at random] ".[/code]

yields “a/an” at random without respect to the color name; I think what’s happening is that it silently prints the name, picks a random color, prints the article based on the random color it picked, and then prints the name out loud this time–but now it’s printing a different random color. There’s some kind of new thing where text substitutions (sometimes?) get evaluated and queued up in advance; don’t know if that’s having an effect here.
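The suspected two-pass behavior is easy to model outside Inform. The Python sketch below is only a model of the guess in the paragraph above, not Inform's actual code: `make_rock`, `language_contraction`, and `with_indefinite_article` are hypothetical names, and the color picker is passed in so the mismatch is reproducible instead of random.

```python
import itertools

VOWELS = "aeiouAEIOU"


def language_contraction(text):
    """Return 1 if the text starts with a vowel letter, else 0."""
    return 1 if text and text[0] in VOWELS else 0


def make_rock(pick_color):
    """Build a name printer: a 'before printing the name' prefix plus the name."""
    def print_name():
        return f"{pick_color()} rock"
    return print_name


def with_indefinite_article(print_name):
    # Pass 1: run the name printing silently, only to choose the article.
    silent = print_name()
    article = "an" if language_contraction(silent) else "a"
    # Pass 2: run it again for real; a random prefix may differ this time.
    return f"{article} {print_name()}"


# Deterministic stand-in for "[one of]...[at random]": orange first, then blue.
colors = itertools.cycle(["orange", "blue"])
print(with_indefinite_article(make_rock(lambda: next(colors))))  # an blue rock
```

If the real mechanism works like this, any prefix that can change between the silent pass and the visible pass will produce mismatched articles some of the time.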

When Appendix B says “All of this is a legacy design from a time when the I6 library did not support capitalised indefinite articles” is perhaps when I should back away slowly.

[For background, I’m working on extending this code, which provides a way to use Text Capture to see what you’re about to print, and calculates whether to say “a” or “an” based on that; and also handles “unicorn” and “hourglass” by putting them in a table of exceptions. And then I was surprised to see that “an empty bucket” was working even in contexts where I wasn’t running my code!]

That’s a real shame about your list writer extension! I’d really like to see it; the list writer is a bane of my existence sometimes. For one thing, there’s basically no way to make it play nicely with my code for using Text Capture to handle indefinite articles.

Matt, I haven’t taken a good look at your third example, but I’m sorry to report that the second one is merely a typo. You wrote:
Before printing the name of a bottle (called target) while not inserting, taking, searching, or removing: if the target is closed, say "sealed "; otherwise say " open ".
Removing the extra space before “open” yields the desired output:

Are you compiling to z8 or Glulx though? I think my code actually doesn’t have that leading space in it, and I’m still getting different results depending on whether I choose z8 or Glulx in “settings.”

I cut and pasted your original post and it does have the leading space, but I’ll try compiling to both formats to see if it’s different.

You’re right — I was compiling to Glulx, but when I went to z8 the behavior was exactly what you first posted. Dammit — this is really starting to cook my noodle.

You’re right about the space; my original post contains the leading space, but I hadn’t actually pasted that into (or from) my code. I should make sure I do that to try to avoid this kind of confusion! My bad.

There is a z-machine/Glulx check in PrefaceByArticle that I think explains what’s going on, but the I6 code here is opaque enough that I can only guess what’s going on from the annotations in Appendix B. Not that I understand I6 anyway, but occasionally I can puzzle out the logic, and this isn’t one of those occasions. As someone who is better versed in I6 than me you might have better luck.

So I looked at PrefaceByArticle and it appears we were both under the mistaken impression that Inform would override the article selection based on our definitions — whoops. The real problem here is: why does the same code yield different results when compiling to z vs. G?

Your supposition that it’s the fork in PrefaceByArticle is logical, but I can’t find anything messed up there. I’m by no means an expert, but I think this must be something in the list writer’s output. I’ll keep looking.

Here’s one problem that arises from the way Inform does things:

[code]Enigma is a room.

A cipher is a kind of thing. A cipher can be one-time, Caesar, or public-key (this is its cryptotype property). Understand the cryptotype property as describing a cipher.

Before printing the name of a cipher: say "[cryptotype] ".

One one-time cipher and one Caesar cipher are in Enigma.[/code]

(This took an embarrassingly long time to get compiled and working, BTW.)

Output:

PrefaceByArticle (or whatever is doing it) looks ahead at “one-time cipher,” sees that it starts with an “o,” and guesses wrongly that the indefinite article should be “an.” Unlike with the unicorn and the hourglass, you can’t just set the indefinite article of ciphers directly (well, in this case, every cipher should have “a,” but that won’t generalize). I guess in a case where you do know exactly which element is going to come after the indefinite article you could do something like this:

[code]Enigma is a room.

A cipher is a kind of thing. A cipher can be one-time, Caesar, open-woffle, or public-key (this is its cryptotype property). Understand the cryptotype property as describing a cipher.

Before printing the name of a cipher: say "[cryptotype] ".

One one-time cipher, one open-woffle cipher, and one Caesar cipher are in Enigma.

The indefinite article of a cipher is usually "a[if the item described is open-woffle]n[end if]".

A room has a cryptotype.[/code]

…you have to give a room a cryptotype to avoid this bug. (Also open-woffle isn’t a real thing, I got tired of looking up types of ciphers.)

But if it’s not entirely predictable which property is going to get printed first when you print the name of something, this won’t work. The code I’m working on allows for some more sophisticated and customizable ways of figuring out which article to use but, as I said, it doesn’t play nicely with the places where Inform uses indefinite articles.

So I’d really like to figure out what Inform is doing by default, but that code does seem unusually opaque.

…OK, winkling through it there does ultimately seem to be a call to LanguageContraction in Language.i6t, which checks whether the letter in question is aeiouAEIOU. If I could figure out a way to have that call an I7 phrase instead I might be able to avoid a lot of the machinery I’ve been trying to use.
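For reference, the first-letter test described above amounts to something like the following. This is a Python model of the check as described in this thread, not a transcription of the I6 routine; `guess_article` is a hypothetical helper name.

```python
def language_contraction(text):
    """Model of the check: 1 if the first character is one of aeiouAEIOU."""
    return 1 if text and text[0] in "aeiouAEIOU" else 0


def guess_article(name):
    """Pick "a" or "an" purely from the first letter of what will be printed."""
    return "an" if language_contraction(name) else "a"


names = ["open chest", "empty bucket", "one-time cipher",
         "Caesar cipher", "unicorn", "hourglass"]
for name in names:
    print(guess_article(name), name)
```

A purely letter-based test gets “open chest” and “empty bucket” right, and gets “one-time cipher,” “unicorn,” and “hourglass” wrong, which matches the output seen earlier in the thread.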

[OT: Go GATORS!! and in your face LSU!]
Meanwhile back at the ranch…

Alright, this is really starting to piss me off. I thought the theory you originally posited that taking the beetle somehow changed the article of the jug was merely coincidence. However, I’ve tried different actions (jump, etc.) and none of them change the article for the jug under z-code. It doesn’t even matter if the player is holding the jug. Consider:

Since both the room description and the inventory list (both of which use the list writer) change only after taking the beetle, it seems like there are only two possibilities: the taking action somehow interacts with the list writer in a screwy way (which seems unlikely) or this has to do with containment (which seems a little less unlikely considering that the LW leans heavily on containment relationships).

As far as your cipher example (which doesn’t involve taking or containment) goes, I’m not sure just how related these bugs / misbehaviors are. Ultimately, figuring out how Inform chooses articles is the solution, which, of course, was the point of your OP.

Gah. I was convinced that it was something about overflowing StorageForShortName and so avoiding the call to LanguageContraction. And so I was convinced that making the string that you’re planning to print long enough would reproduce the weird jug behavior (in z-code). So I wrote an example with a super-long word before the name of the item, and then with a super-long phrase before the name of the item, and then with a lot of extraneous words after the name of the item. None of them reproduced the behavior; I’m getting output like

You can see an oblate jug and an oblong jug (with a whole bunch of filler nonsense to overrun StorageForShortName) here.

when I was expecting the last one to be “a oblong jug (with a whole bunch of filler nonsense to overrun StorageForShortName)” because, well, I included a whole bunch of filler nonsense to overrun StorageForShortName. I still am pretty convinced that the deal with the original example has to do with the amount of stuff that’s printed after “jug” tripping the test in the z-code branch that means StorageForShortName doesn’t get called, but I don’t know how.

I feel like I have a pretty good handle on the cipher example–there it’s going through LanguageContraction in order to figure out which article to print, and LanguageContraction just tests whether a text begins with aeiouAEIOU. Since “one-time cipher” does begin with “o,” LanguageContraction thinks its indefinite article should be “an.” If I could figure out how to make LanguageContraction call my I7 code instead, I think I could take care of this. It’s partly a question of exactly what you can put in those (+ +) markers that let you call I7 from within I6.

Coincidentally, the markers for calling I6 and I7 from each other are a pretty good emoticon representation of the mental state this leaves me in. (- -) (+ +)

[OT: Go MAGIC!! and suck it Mavs! Sorry, but yesterday was a good one for sports in our house — for a change. Anyway… ]

So I did some experimenting to verify your conclusion that extra-long names don’t affect the article selection, and I concur:[code]The Lab is a room.

A ball is a kind of thing. A ball can be supercalifragilistickexpialidosious, Supercalifragilistickexpialidoozey, or extrasuperspecial (this is the dumb property). Understand the dumb property as describing a ball.

Before printing the name of a ball: say "[dumb] ".

In the Lab are a supercalifragilistickexpialidosious ball, a supercalifragilistickexpialidoozey ball, and a extrasuperspecial ball.

test me with "get ball / supercalifragilistickexpialidosious / get ball / extrasuperspecial / i / get ball / both / i".[/code]
yields:[spoiler]Lab
You can see a supercalifragilistickexpialidosious ball, a Supercalifragilistickexpialidoozey ball and an extrasuperspecial ball here.

test me
(Testing.)

[1] get ball
Which do you mean, the supercalifragilistickexpialidosious ball, the Supercalifragilistickexpialidoozey ball or the extrasuperspecial ball?

[2] supercalifragilistickexpialidosious
Which do you mean, the supercalifragilistickexpialidosious ball or the Supercalifragilistickexpialidoozey ball?

[3] get ball
Which do you mean, the supercalifragilistickexpialidosious ball, the Supercalifragilistickexpialidoozey ball or the extrasuperspecial ball?

[4] extrasuperspecial
Taken.

[5] i
You are carrying:
an extrasuperspecial ball

[6] get ball
Which do you mean, the supercalifragilistickexpialidosious ball, the Supercalifragilistickexpialidoozey ball or the extrasuperspecial ball?

[7] both
supercalifragilistickexpialidosious ball: Taken.
Supercalifragilistickexpialidoozey ball: Taken.
extrasuperspecial ball: You already have that.

[8] i
You are carrying:
a Supercalifragilistickexpialidoozey ball
a supercalifragilistickexpialidosious ball
an extrasuperspecial ball[/spoiler]Note that the output is the same with both z and G. The article behavior works as expected. The only problem demonstrated in my example is that really long names can cause problems with disambiguation (because they’re shortened internally), but I suspect many of us have run into this before, so this isn’t really new news.

I’m working on getting your I7 code inserted into or replacing LanguageContraction. It would be helpful if I actually had your I7 code to work with. You could pm it to me if you’d rather not post it in unfinished form.

Oh thanks! The I7 code is actually pretty short and in decent shape; here goes:

[code]To decide whether (string - a text) starts with a vowel sound:
	let the first word be punctuated word number 1 in string;
	if the first word is a word listed in the Table of Words That Start With Vowel Sounds, yes;
	if the first word is a word listed in the Table of Words That Don't Start With Vowel Sounds, no;
	if character number 1 in the first word is a vowel, yes;
	no.

To decide whether (letter - a text) is a vowel:
	if letter exactly matches the regular expression "a|e|i|o|u|A|E|I|O|U", yes;
	no.

Table of Words That Start With Vowel Sounds
word
"hour"
"hourglass"
"honest"
"yttrium"

Table of Words That Don't Start With Vowel Sounds
word
"uniform"
"unicorn"
"united"
"United"
"one"
[/code]

(with the latter two tables being extended as you wish with whatever relevant words show up in your project). There are some other things I’d like to do here–insert a use option so you can make the comparisons case-insensitive, and hook in a rulebook in case the author wants to do some more complicated calculations about what starts with a vowel. (For instance, to allow calculation that you say “a 1000-watt bulb” but “an 11-watt bulb.”) But that can all be handled by adding extra stuff to the “To decide whether (string - a text) starts with a vowel sound” phrase; if you have a way of calling that phrase from I6 then I should be good to go.
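To make the “extra rules” idea concrete, here is a Python sketch of the same decision procedure with a hook for additional rules. It is a model only, not Inform code: `number_rule`, `EXTRA_RULES`, and `indefinite` are hypothetical names, the regular expression is a rough stand-in for “punctuated word number 1,” and the number rule assumes the digits are read aloud in English.

```python
import re

STARTS_WITH_VOWEL_SOUND = {"hour", "hourglass", "honest", "yttrium"}
STARTS_WITHOUT_VOWEL_SOUND = {"uniform", "unicorn", "united", "United", "one"}


def number_rule(word):
    """Hypothetical extra rule: decide for digit strings, abstain (None) otherwise."""
    if not word.isdigit():
        return None
    # Assumption: "eight...", "eleven..." and "eighteen..." start with a vowel
    # sound. The leading group is read as 11 or 18 only when the digit count
    # leaves exactly two digits in front (11, 18, 11000, 18000000, ...).
    if word[0] == "8":
        return True
    return word[:2] in ("11", "18") and len(word) % 3 == 2


EXTRA_RULES = [number_rule]


def starts_with_vowel_sound(text, rules=EXTRA_RULES):
    match = re.search(r"[A-Za-z0-9]+", text)  # rough "first word"
    if match is None:
        return False
    word = match.group(0)
    for rule in rules:  # the first rule that gives a verdict wins
        verdict = rule(word)
        if verdict is not None:
            return verdict
    if word in STARTS_WITH_VOWEL_SOUND:
        return True
    if word in STARTS_WITHOUT_VOWEL_SOUND:
        return False
    return word[0] in "aeiouAEIOU"


def indefinite(text):
    return ("an " if starts_with_vowel_sound(text) else "a ") + text


for name in ["1000-watt bulb", "11-watt bulb", "one-time cipher", "hourglass"]:
    print(indefinite(name))
```

Putting the extra rules ahead of the tables lets an author override anything, which is roughly what a rulebook consulted before the default test would give.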

…it occurs to me that I’ll probably need to name the phrase for this to work.

Dude, I’m getting close! Here’s the output of my test-bed without the inclusion:

This is with the inclusion:

As you can see, some of the problems are fixed, but some aren’t. Hopefully, I can figure out what went wrong. Either way, I’ll post the code shortly.

Excellent! Looking forward to it.

EDIT: Is this z-code, Glulx, or both?

Both have the same output in this example. I still can’t figure out why “honest” isn’t working correctly.

Post the code! Post the code! I wanna see it!

Sorry Matt, I fell asleep watching the Magic game. I don’t even know who won, but don’t tell me — I DVR’d it. :wink:

Okay, so here’s the dealeo: I couldn’t figure out how to get i6 to access the phrases in your I7 code (which should be possible, but whatever) so I took a different tack. I created a global variable accessible to both and used that in the replacement of LanguageContraction. It makes more sense if you see it so, without further ado:[spoiler][code]The Lab is a room.

An animal can be honest or dishonest. Understand the honest property as describing an animal.

When play begins (this is the stupid and should be unnecessary rule):
	repeat with A running through animals:
		let T be a text;
		now T is the printed name of A;
		if A is honest:
			now the printed name of A is "honest [T]";
		otherwise:
			now the printed name of A is "dishonest [T]".

The unicorn is an animal. It is honest.
The yak is an animal. It is dishonest.

A block is a kind of thing. It has a number called weight. The description of a block is "It looks to be about [weight of the item described in words] pound[s]."
After examining a block (called B):
	now the printed name of B is "[weight of B in words] - pound block of [B]".

The yttrium is a block. The weight is one.

The uniform is wearable. The description is "It's a United States Army uniform."
After examining the uniform for the first time:
	now the printed name of the uniform is "United States Army uniform".

A unicorn, a uniform, an hourglass, a yak and yttrium are in the lab.

[Matt’s code - additions noted]

[create a global variable to be called later in i6:]
Current vowel sound is a number that varies. Current vowel sound variable translates into i6 as "vowel_sound".

Before printing the name of a thing (called x):
	let T be a text;
	now T is the printed name of x;
	now Current vowel sound is T evaluated.
	[say Current vowel sound.]

To decide what number is (s - a text) evaluated:
	if s starts with a vowel sound:
		decide on 1;
	decide on 0.

To decide whether (string - a text) starts with a vowel sound (this is vowel sound checking):
	let the first word be punctuated word number 1 in string;
	if the first word is a word listed in the Table of Words That Start With Vowel Sounds, yes;
	if the first word is a word listed in the Table of Words That Don't Start With Vowel Sounds, no;
	if character number 1 in the first word is a vowel, yes;
	no.

To decide whether (letter - a text) is a vowel:
	if letter exactly matches the regular expression "a|e|i|o|u|A|E|I|O|U", yes;
	no.

Table of Words That Start With Vowel Sounds
word
"hour"
"hourglass"
"honest"
"yttrium"

Table of Words That Don't Start With Vowel Sounds
word
"uniform"
"unicorn"
"united"
"United"
"one"

[end]

Include (-
Global vowel_sound = 0;
-) after "Definitions.i6t".

Include (-

Constant LanguageAnimateGender = male;
Constant LanguageInanimateGender = neuter;
Constant LanguageContractionForms = 2; ! English has two:
! 0 = starting with a consonant
! 1 = starting with a vowel
[ LanguageContraction text;

!!! This is the old routine:
!if (text->0 == 'a' or 'e' or 'i' or 'o' or 'u'
!or 'A' or 'E' or 'I' or 'O' or 'U') return 1;
!return 0;
!!! Out w/ old – in w/ new:
if (vowel_sound == 1) return 1;
return 0;
];
Array LanguageArticles -->
! Contraction form 0: Contraction form 1:
! Cdef Def Indef Cdef Def Indef
"The " "the " "a " "The " "the " "an " ! Articles 0
"The " "the " "some " "The " "the " "some "; ! Articles 1
! a i
! s p s p
! m f n m f n m f n m f n
Array LanguageGNAsToArticles --> 0 0 0 1 1 1 0 0 0 1 1 1;
-) instead of "Articles" in "Language.i6t".[/code][/spoiler]

Here’s the result:

I had to replace the whole “Articles” section of Language.i6t, but the only thing changed is LanguageContraction (which is actually simpler, since it merely uses the value your code calculates). I think this method is minimally invasive and therefore probably safer and more robust. Also, it means that authors using your extension won’t have to worry about the i6 code, because they can control everything from I7. It still needs work to generalize things with either/or properties, but it’s a start. :slight_smile:

(P.S. I know one wouldn’t refer to yttrium with a lone indefinite article; I just left it that way to demonstrate that code works as expected.)

EDIT: Oh yeah, I forgot to mention that the output is the same regardless of which machine you compile to – yay!

OK, that’s very nice to get started with. I guess the stupid and should be unnecessary rule is, um, necessary because you have to look at the printed name property to figure out which vowel you need; and that means that you need to fold “honest” into the printed name in order to get it looked at. Presumably that’s why “honest” didn’t work before: it wasn’t part of the printed name. (I tried a quick fix and immediately sent the code into an infinite loop.)
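Here is how I picture the flag approach, as a small Python model. It is a sketch of my reading of the code above, not the code itself: `Thing`, `set_flag`, and `describe` are hypothetical names, and `prefix` stands for text that some other “before printing the name” rule would say.

```python
VOWELS = "aeiouAEIOU"
# Stand-ins for the two exception tables in the post above.
VOWEL_SOUND_WORDS = {"hour", "hourglass", "honest", "yttrium"}
CONSONANT_SOUND_WORDS = {"uniform", "unicorn", "united", "United", "one"}

vowel_sound = 0  # stand-in for the global shared between I7 and I6


class Thing:
    def __init__(self, printed_name, prefix=""):
        self.printed_name = printed_name  # the only text the flag looks at
        self.prefix = prefix  # text some other "before printing" rule would say


def set_flag(thing):
    """Model of the rule that sets the global from the printed name property."""
    global vowel_sound
    words = thing.printed_name.split()
    first = words[0] if words else ""
    if first in VOWEL_SOUND_WORDS:
        vowel_sound = 1
    elif first in CONSONANT_SOUND_WORDS:
        vowel_sound = 0
    else:
        vowel_sound = 1 if first and first[0] in VOWELS else 0


def language_contraction(_text):
    """Model of the replacement routine: ignores its argument, trusts the flag."""
    return vowel_sound


def describe(thing):
    set_flag(thing)
    shown = thing.prefix + thing.printed_name
    article = "an" if language_contraction(shown) else "a"
    return f"{article} {shown}"


print(describe(Thing("honest unicorn")))             # an honest unicorn
print(describe(Thing("unicorn", prefix="honest ")))  # a honest unicorn
```

In this model the article is right only when everything that gets printed is already in the printed name, which is what the when-play-begins rule arranges.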

The problem for me is that it’d get awkward if there are lots of things that could show up before the printed name, especially if they weren’t predictable and involved random substitutions. Like, if you had something where the code prefaced the name of the thing with “[one of]orange[or]red[at random]”, I’d worry that it might preface it with “orange” when it was time to calculate current vowel sound, and “red” when it came time to print, and we’d get “an red.” And the coding could get awkward.

Looking through the Standard Rules it’s definitely possible to call an I7 rulebook from I6; that’s how the Does the player mean rules get called. And I need to turn the procedure for checking whether text starts with a vowel sound into a rulebook anyway. The question is how I pass the parameter to the rulebook… or can I just draw it out from StorageForShortName or wherever it is? This may go back to another issue I’ve had, which is how you turn an I6 array (?) into an I7 text.