Need clarification on dictionary separators vs punctuation

In the dictionary header, there is a string that holds so-called “word separators” such as .,"

In Klooster’s doc, we are told the phrase "fred,go fishing" should be broken into four tokens: "fred", ",", "go" and "fishing". The "," becomes a token because it is a defined separator.
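
To make that rule concrete, here is a minimal sketch of the splitting step in Python. The function name `tokenize` is hypothetical, and the sketch assumes the input is lowercased first and that spaces divide words without becoming tokens themselves. It is not taken from any interpreter's source.

```python
def tokenize(text, separators='.,"'):
    """Split on spaces; each separator character becomes its own token.

    Assumption: only spaces and the dictionary's separator string
    divide words. Any other punctuation stays attached to its word.
    """
    tokens = []
    word = ""
    for ch in text.lower():
        if ch == " " or ch in separators:
            if word:
                tokens.append(word)  # close the word in progress
                word = ""
            if ch != " ":
                tokens.append(ch)    # separators are tokens; spaces are not
        else:
            word += ch
    if word:
        tokens.append(word)
    return tokens


print(tokenize("fred,go fishing"))  # ['fred', ',', 'go', 'fishing']
```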

In Border Zone’s dictionary header I see the typical .," separator string. However, in its dictionary words I see . , " ? ! How would ? and ! get tokenized into words for dictionary lookup if they aren’t defined as separators?

Is any/every ZSCII punctuation character also supposed to become an individual token, in addition to whatever separators have been defined? I can’t find information that explains this.

There is a footnote in section 13:

Linards Ticmanis reports that some of Infocom’s interpreters convert question marks to spaces before lexical analysis. This is not Standard behaviour. (Thus, typing What is a grue? into Zork I no longer works: the player must type What is a grue instead.)

Possibly, there are more rules like this which some Infocom interpreters adhered to, but which are not part of the standard.

Of course, converting question marks into spaces before lexical analysis means they wouldn’t need to appear in the dictionary at all. That seems to be a simple way to make questions parseable without needing a question mark in the dictionary, and is probably why Infocom made their interpreters do this.
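
As a rough illustration of that pre-processing step (a sketch only: both function names are hypothetical, and how Infocom's interpreters actually did it is not described in this thread), in Python:

```python
def question_marks_to_spaces(text):
    """Non-standard pre-processing: replace every '?' with a space."""
    return text.replace("?", " ")


def split_on_spaces(text):
    """Simplified lexer for this example: lowercase, then split on spaces.

    Assumption: the dictionary's separator characters are ignored here
    for brevity, since the example input contains none.
    """
    return text.lower().split()


print(split_on_spaces(question_marks_to_spaces("What is a grue?")))
# ['what', 'is', 'a', 'grue']
```

With the question mark gone before the words are split, the last token is plain "grue", so no dictionary entry for "?" is needed.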

Punctuation that is not defined as a separator can still be parsed as a word if it is surrounded by spaces or valid separators. That is not very practical, and the presence of such characters in the dictionary is probably just an artifact or oversight of development.

@ChristopherDrum Any particular reason you are working off that older document instead of the latest standard?

@Mike_G I don’t know what you mean. I’m using the annotated standards document from https://zspec.jaredreisinger.com (which covers 1.1) and the Klooster doc has helped a lot (and sometimes confused the issue) when I can’t understand the standards document.

"their presence in the dictionary is probably just an artifact or oversight of development." This seems most likely to me, given the above discussion.

@fredrik "Possibly, there are more rules like this which some Infocom interpreters adhered to, but which are not part of the standard."
I’m unclear, then, what the standard says to do in this matter. If, for example, ? ends a sentence and is not defined as a separator, should it be included as a character in the last word?
i.e. "What is a grue?" becomes the tokens "what", "is", "a", "grue?"

The dictionary can contain deliberately un-input-able words, which is sometimes used in Inform 6. I don’t know if Infocom would have done that though.

Yes, it is part of the last word.
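
A small Python sketch of what that means for lookup. Everything here is illustrative: `DICTIONARY` is a made-up word set (not Border Zone's or Zork's actual dictionary), and matching is done on plain strings, whereas a real interpreter compares encoded, length-limited dictionary words.

```python
import re

SEPARATORS = '.,"'  # the usual separator string from the dictionary header
# Hypothetical dictionary contents, chosen only for this example.
DICTIONARY = {"what", "is", "a", "grue", ".", ",", '"', "?", "!"}

# A token is either one separator character, or a run of characters
# that are neither spaces nor separators.
_sep = re.escape(SEPARATORS)
TOKEN_RE = re.compile("[" + _sep + "]|[^ " + _sep + "]+")


def lookup(text):
    """Return (token, found_in_dictionary) pairs for the input text."""
    return [(tok, tok in DICTIONARY) for tok in TOKEN_RE.findall(text.lower())]


print(lookup("What is a grue?"))   # last pair is ('grue?', False)
print(lookup("What is a grue ?"))  # ends with ('grue', True), ('?', True)
```

Under these assumptions the "?" entry only matches when the player separates it from the preceding word with a space.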

Yes.

Ah, I just noticed you’ve mentioned Klooster in other posts before, and I wasn’t sure you were aware of the newer docs. Yes, there is helpful info in it. :)

Interestingly, Frotz 2.54 (the version in Debian 12) converts question marks to spaces.