[I6] Possible to test/modify whether a dictionary word counts as plural?

An object’s name property can include a word tagged as plural, e.g.

with name   'coin' 'coins//p'

In this case, where is the information about ‘coins’ indicating a
plural actually stored? Is it part of the word internally somehow,
or is it stored elsewhere?

Would it be possible to test a given dictionary word to determine
whether or not it is plural? Would it be possible to change whether
a word counts as plural or not mid-execution?

It looks like zarf was long ago looking at this sort of thing (see
I6 patch for setting dict flags), so
perhaps dynamically changing the flags is possible with newer
versions of Inform 6?

Each entry in the dictionary consists of the word itself in encoded form (4 or 6 bytes depending on Z-code version) and 3 data bytes. The flags (verb, noun, plural, maybe something more?) are stored in those data bytes.

Inform 6 never stores the dictionary in dynamic memory, so no, you can’t change these flags during gameplay.

As Fredrik mentions, the Z-machine standard requires the “main” dictionary to be in static (i.e. read-only) memory, so you can’t change it during play.

Theoretically, you can pass a different dictionary to the @tokenise opcode, and that one can be in RAM. Inform doesn’t provide any way to do this automatically, but you could set aside a block of RAM, copy the whole dictionary from static memory into it at runtime (its address is stored in the word at byte 8 of the header), and use the copy for all your parsing. That would let you modify it however you like during play, though it’ll cause some annoying complications in your code. For example, you have to be very careful with 'word' constants, since those compile to addresses in the original table in ROM.
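
A rough sketch of that copy-to-RAM idea, assuming Z-code version 5 or later (needed for @copy_table and @tokenise) and a dictionary small enough to fit the buffer. The names ram_dict, RAM_DICT_SIZE, CopyDictionary, SetRamPlural and TokeniseWithRamDict are made up for illustration; nothing like this is provided by the library.

```inform6
Constant RAM_DICT_SIZE = 16384;   ! assumed upper bound, adjust for your game
Array ram_dict -> RAM_DICT_SIZE;  ! byte array, lives in dynamic memory

! Copy the main dictionary into ram_dict. Returns true on success.
! Assumes the whole dictionary is under 32K bytes.
[ CopyDictionary src n len;
    src = 0-->4;                  ! header word at byte 8: dictionary address
    n = src->0;                   ! number of word-separator characters
    ! Layout: separator count (byte), n separators, entry length (byte),
    ! number of entries (word), then the entries themselves.
    len = n + 4 + (src->(n+1)) * ((src+n+2)-->0);
    if (len <= 0 || len > RAM_DICT_SIZE) rfalse;
    @copy_table src ram_dict len;
    rtrue;
];

! Set or clear the plural bit on the RAM copy of dictionary word w,
! where w is an ordinary 'word' constant (an address in the ROM table).
[ SetRamPlural w on  rw;
    rw = ram_dict + (w - (0-->4));
    if (on) rw->#dict_par1 = (rw->#dict_par1) | 4;
    else    rw->#dict_par1 = (rw->#dict_par1) & $FB;   ! clear bit 2 only
];

! Tokenise a text buffer against the RAM copy instead of the main dictionary.
[ TokeniseWithRamDict text parsebuf;
    @tokenise text parsebuf ram_dict 0;
];
```

Words tokenised this way come back as addresses inside ram_dict, so comparing them with 'word' constants means adjusting by the offset between ram_dict and the original dictionary first. That is the kind of complication mentioned above, and the standard library's parser will not do it for you.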

Reading it, though, is absolutely possible—the standard library uses this during parsing. The value of a dictionary word is its byte address, so you can get to the data bytes with pointer manipulation. Inform provides constants #dict_par1, #dict_par2, and #dict_par3 for this, in case the dictionary format ever changes.

So if you want to check whether a word is plural, that’s word->#dict_par1 & DICT_PLUR (or & 4 if you’re not using the library).
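
Wrapped up as a routine, that test might look like the following minimal sketch. WordIsPlural and MY_DICT_PLUR are made-up names (chosen so as not to clash with a library that does define DICT_PLUR); the value 4 is bit 2 of the flags byte.

```inform6
Constant MY_DICT_PLUR = 4;        ! hypothetical name for the plural bit

! Returns true if dictionary word w carries the plural flag.
[ WordIsPlural w;
    if (w == 0) rfalse;           ! 0 means the word is not in the dictionary
    if ((w->#dict_par1) & MY_DICT_PLUR) rtrue;
    rfalse;
];
```

With the standard library you could call it on a word taken from the player's input, e.g. w = NextWord(); if (WordIsPlural(w)) ...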


Thanks, @draconis. This is handy. It seems that there’s no DICT_PLUR defined in the Standard Library that comes with Inform 6.31, but using 4 works.

Where can one find more information about the various flags held by these three bytes (#dict_par1, #dict_par2, #dict_par3)?

This is in the rather obscure (as in, less-well-known than the DM4) Inform Technical Manual, section 8.5.xiii:

Each word is encoded with an entry giving 4 (in version 3) or 6 (otherwise) bytes of textual representation, fully specified by the Z-machine, and then three bytes of data, which Inform can do with as it pleases.

These are the so-called “dictionary parameters” dict_par1, 2 and 3. Inform’s pleasure is to write into them as follows:

dict_par1: flags
dict_par2: verb number (counting downwards from 255)
dict_par3: preposition number (counting downwards from 255) in
           grammar version 1; not used in grammar version 2

The flags are given as follows:

bit:    7      6   5   4   3     2        1      0
        <noun>             <adj> <plural> <meta> <verb>

The bits <verb>, <noun> and <adj> are set if the word can be used in the context of a verb, noun and/or preposition (all three can be simultaneously set). The <meta> bit indicates that the English verb is “meta”, that is, is a command to the program and not a request in the game.

The <plural> bit is set using the ‘…//p’ notation, like so:

'egg'  'eggs//p'
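
Going by that table, a small debugging routine could spell out which flags a word has. This is only a sketch: PrintDictFlags is a made-up name, and the masks are simply the bit values from the table (bit 0 = 1, bit 1 = 2, bit 2 = 4, bit 3 = 8, bit 7 = 128).

```inform6
! Print dictionary word w followed by the names of the flags set on it.
[ PrintDictFlags w f;
    if (w == 0) { print "(not a dictionary word)^"; return; }
    f = w->#dict_par1;            ! the flags byte
    print (address) w, ":";
    if (f & 1)   print " verb";
    if (f & 2)   print " meta";
    if (f & 4)   print " plural";
    if (f & 8)   print " preposition";
    if (f & 128) print " noun";
    new_line;
];
```

For example, PrintDictFlags('eggs//p'); should list at least the plural and noun flags, since the word is quoted outside a grammar line.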

The Z-machine doesn’t even specify how many data bytes each entry can have—but various Inform internal mechanisms rely on it being three, so the third byte just sits there unused now. Which means you can use it for whatever you like without breaking anything.

For the verb, noun, and “adjective” (preposition) bits, as far as I can tell, they’re determined as follows:

  • If a word is given grammar lines with the Verb declaration, it gets the verb bit.
  • If a word appears in single-quotes within a grammar line, it gets the preposition bit.
  • If a word appears in single-quotes anywhere else in the code, it gets the noun bit.

Basically, the compiler assumes that any word referenced outside a grammar line is used in parsing object names, which is usually a reasonable assumption. The exceptions are things like pronouns and conjunctions, but as long as they’re not marked as verbs this usually doesn’t hurt anything.

(Why is it useful to know if a word was referenced outside a grammar line? Because it helps the parser figure out if a comma is separating verbs or nouns. If the first word after a comma has the verb bit, and doesn’t have the noun bit, it’s parsed as a verb; otherwise it’s parsed as a noun.)
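
That comma rule can be written down as a small test. This is a sketch of the rule as described here, not a copy of the library's own code, and LooksLikeVerbAfterComma is a made-up name.

```inform6
! Returns true if dictionary word w, found right after a comma, would be
! taken as the start of a new command: verb bit set, noun bit clear.
[ LooksLikeVerbAfterComma w f;
    if (w == 0) rfalse;           ! unknown words are not treated as verbs
    f = w->#dict_par1;
    if ((f & 1) && ((f & 128) == 0)) rtrue;
    rfalse;
];
```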