Towards an i7 spec?

Laroquod · February 1, 2011, 8:44am

Something that I have felt is lacking in the world of manuals for i7 is a very bare bones, ‘just the specification and nothing but the specification’ approach. It’s not that there can be no examples, but I prefer them to be kept to a bare minimum and to use my own imagination to fill in the gaps. My ideal i7 manual would explain every grammar structure in only one place, but in the right place, and in as abstract a form as possible. I guess the perfect manual to me is Kernighan and Ritchie’s classic C book. I doubt i7 could be described in as parsimonious a manner, but I think there would be enormous value in the attempt.

Is there already something like this out there, or was it decided by the community early on that such an approach would not be useful for i7? Ron Newcomb’s guide comes the closest of the things I’ve skimmed (I have not yet read it thoroughly), but what I am looking for doesn’t really seem to exist. In its absence, I have gone through the official manual several times trying to get at the basic skeleton of the allowed grammars, and I think I’ve made some headway, although it’s nowhere near complete and probably inaccurate in places. Below is a snippet of the sort of notes I’ve been making over time, which should give you an idea of what I was looking for and didn’t find.

[code]==== AN INCOMPLETE GRAMMAR OF I7 ====

UPPER CASE COMPONENT NAMES are to be substituted with the specification of those components, whereas lowercase letters are to be typed literally, except the indefinite articles ‘a’ or ‘an’, which are always optional. Other optional components are specified by [square] brackets. Slashes separate phrases that are interchangeable / alternatives / otherwise synonyms.

– basic building blocks –

OBJECT
	Kitchen / the rusty knife / a container / Susan / etc.

ADJECTIVE
	closed / not openable / edible / male / dark / etc.

ADJECTIVES
	ADJECTIVE [ADJECTIVE] [ADJECTIVE] [...]

QUALIFIED-NOUN
	[ADJECTIVES] OBJECT [[that is/which is/who is] ADJECTIVES]

– is here –

SUBJECT-LIST is here.
Here is a SUBJECT-LIST.

SUBJECT-LIST
	SUBJECT [, SUBJECT] [[,] and SUBJECT] [...]

SUBJECT
	[QUALIFIED-NOUN called] OBJECT

– contains, supports, carries, wears –

SUBJECT PREDICATES a QUALIFIED-NOUN-LIST.

PREDICATES
	contains / supports / carries / wears

QUALIFIED-NOUN-LIST
	QUALIFIED-NOUN [, QUALIFIED-NOUN] [[,] and QUALIFIED-NOUN] [...]

– containing, supporting, carrying, wearing –

SUBJECT is PREDICATING a QUALIFIED-NOUN-LIST.
PREDICATING a QUALIFIED-NOUN-LIST is a SUBJECT.

PREDICATING
	containing / supporting / carrying / wearing

– in, contained in/by, on, supported on/by, carried by, worn by –

SUBJECT-LIST is PREDICATED-BY a QUALIFIED-NOUN.
PREDICATED-BY a QUALIFIED-NOUN is a SUBJECT-LIST.

PREDICATED-BY
	in / contained in / contained by /
	on / supported on / supported by /
	carried [by] /
	worn [by] /
	part of

– is a kind of –

SUBJECT-LIST is a kind of OBJECT.

– is a –

SUBJECT-LIST is a FULLY-QUALIFIED-ADJECTIVE.
SUBJECT-LIST is a FULLY-QUALIFIED-NOUN.

FULLY-QUALIFIED-ADJECTIVE
	[ADJECTIVE-LIST] [VERB-PHRASE / PARTICIPLE-PHRASE-LIST]

FULLY-QUALIFIED-NOUN
	[ADJECTIVE-LIST] OBJECT [[that is/which is/who is] ADJECTIVE-LIST] [VERB-PHRASE / PARTICIPLE-PHRASE-LIST]

ADJECTIVE-LIST
	ADJECTIVE [[,/, and/and/that is/which is/who is] ADJECTIVE] [[,/, and/and/that is/which is/who is] ADJECTIVE] [...]
	
VERB-PHRASE
	that/which/who PREDICATES a QUALIFIED-NOUN

PARTICIPLE-PHRASE-LIST
	[,/, and/and/that is/which is/who is] PARTICIPLE-PHRASE [, PARTICIPLE-PHRASE] [[,] and PARTICIPLE-PHRASE] [...]

PARTICIPLE-PHRASE
	PREDICATED-BY/PREDICATING a QUALIFIED-NOUN

– has –

SUBJECT-LIST has the PROPERTY VALUE.
PROPERTY-LIST is a VALUE.

PROPERTY-LIST
	PROPERTY [, PROPERTY] [[,] and PROPERTY] [...]

PROPERTY
	VARIABLE
	VARIABLE of OBJECT

VARIABLE
	description / location / player / etc.

VALUE
	"It looks ordinary." / 10 / 334 / lamp / sword / etc.

[/code]

And I’ve been building on it as I go. Currently there are a lot of things I’ve learned about the grammar that I have not yet folded into this skunkworks ‘spec’. But is this all duplicative work, anyway? I have just been doing it because it suits my brain.
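
To give a rough idea of how notes in this notation could be checked mechanically, here is a minimal Python sketch of one production, ‘SUBJECT PREDICATES a QUALIFIED-NOUN.’ It is an illustration only: the vocabulary lists, the name `fits_pattern`, and the choice of regular expressions are all assumptions made for the sketch, the ‘called’ and ‘that is’ forms and the noun lists are left out, and none of it reflects how the Inform compiler actually parses text.

```python
import re

# Hypothetical mini-vocabulary borrowed from the OBJECT, ADJECTIVE and
# PREDICATES examples in the notes; a real story defines its own.
OBJECTS = ["kitchen", "rusty knife", "container", "susan"]
ADJECTIVES = ["closed", "not openable", "edible", "male", "dark"]
PREDICATES = ["contains", "supports", "carries", "wears"]


def alternatives(phrases):
    """Return a regex group that matches any one of the given phrases."""
    # Try longer phrases first so a multi-word phrase is never cut short.
    ordered = sorted(phrases, key=len, reverse=True)
    return "(?:" + "|".join(re.escape(p) for p in ordered) + ")"


ARTICLE = r"(?:(?:a|an|the) )?"                           # articles are optional
ADJECTIVE_RUN = "(?:" + alternatives(ADJECTIVES) + " )*"  # [ADJECTIVES]
SUBJECT = ARTICLE + alternatives(OBJECTS)                 # "called" form omitted
QUALIFIED_NOUN = ARTICLE + ADJECTIVE_RUN + alternatives(OBJECTS)

# SUBJECT PREDICATES a QUALIFIED-NOUN.   (a single noun, not a list)
SENTENCE = re.compile(
    SUBJECT + " " + alternatives(PREDICATES) + " " + QUALIFIED_NOUN + r"\."
)


def fits_pattern(sentence):
    """True if the sentence fits the simplified 'SUBJECT PREDICATES ...' shape."""
    return SENTENCE.fullmatch(sentence.strip().lower()) is not None


if __name__ == "__main__":
    for text in ["Susan carries a rusty knife.",
                 "The kitchen contains a closed container.",
                 "Susan is carrying a rusty knife."]:
        print(text, "->", fits_pattern(text))
```

The third sentence is rejected because ‘is carrying’ belongs to the separate PREDICATING production, which this sketch does not cover.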

Paul.

Juhana · February 1, 2011, 10:23am

Have you seen the syntax document (http://inform7.com/learn/documents/I7_syntax.txt)? I think it’s slightly out of date (says it’s updated for 5Z71) but it looks a bit like your example.

Laroquod · February 1, 2011, 4:21pm

Thanks, Juhana, that could come in handy, and no I hadn’t seen it before, but it’s not really the same thing as a spec — more like a pile of categorised examples. Still, it’s a more concentrated rundown than any other I’ve seen.

P.

katz · February 1, 2011, 5:59pm

I also tend to think this way and would prefer a manual with that structure.

Do make sure that everything you list comes with an explanation of what it is and how to use it, though, lest it just become another guessing game.

Laroquod · February 1, 2011, 7:51pm

Good suggestion. I have written some explanations but not enough yet because I’m not sure that my categories and terms won’t change in light of later chapters. I’m still really sketchy on the later chapters of the official manual, and I have the feeling that once I really nail down the common grammatical elements among them, they might end up actually forming the core of the spec causing me to rethink how I’ve categorised things already, so I’ve got a long way to go.

Also, it’s just something I do when I feel the need to shift from ‘try programming more stuff’ to ‘learn a bunch more because I’m not good enough yet to code the things I want’. Adding to this document is basically how I’m learning i7. This would probably be much better if it were done by an expert, I suppose, but what I would like it to shape up into is the most complete, self-sufficient description of the language possible in the fewest words.

P.

Ron_Newcomb · February 1, 2011, 8:10pm

Example #383 in the official manual, titled “Backus-Naur form for rules”, states:

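Backus-Naur form, or BNF, is a standard notation used by computer scientists to specify more or less precisely what the valid programs are for a given programming language. It tends to provide a good description for a language such as C or Pascal, where contextual rules are limited, but the authors of Inform are doubtful that it is such a good tool for a natural-language system. For those who are interested, though, the following gives a formal specification for Inform’s rules.

Sample BNF follows. That’s probably what you’re looking for, and why Inform doesn’t provide it. The Programmer’s Guide touches upon which part of English grammar certain constructs should be named: i.e., rulebooks should be named after a subordinating conjunction (before, instead), an imperative verb (check, report), or any larger fragment that accepts a participial phrase after it (Every turn, Should the game disambiguate). Beyond that, the grammar is mainly the simple subset of English’s, but of course general To-phrases define a one-off idiom rather than a category of grammar, so can break or reinforce just about anything.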

George · February 2, 2011, 12:56am

I would definitely appreciate a simple, clear tutorial that looked something like this,

docs.python.org/tutorial/introduction.html

And a reference manual that looked like this,

docs.python.org/library/index.html

Though that’s easier for me to say than to do ;D.

I wonder if part of the problem is that I7 is both a library and a language, and that learning both at the same time could be confusing?

Dannii · February 2, 2011, 3:31am

I think one issue is that most “library” type things in I7 have their own syntax. It would be hard to separate the language from the library in I7.

Laroquod · February 2, 2011, 9:41am

Ron Newcomb:

Example #383 in the official manual, titled “Backus-Naur form for rules”, states:

Backus-Naur form, or BNF, is a standard notation used by computer scientists to specify more or less precisely what the valid programs are for a given programming language. It tends to provide a good description for a language such as C or Pascal, where contextual rules are limited, but the authors of Inform are doubtful that it is such a good tool for a natural-language system. For those who are interested, though, the following gives a formal specification for Inform’s rules.

Sample BNF follows. That’s probably what you’re looking for, and why Inform doesn’t provide it. The Programmer’s Guide touches upon which part of English grammar certain constructs should be named: i.e., rulebooks should be named after a subordinating conjunction (before, instead), an imperative verb (check, report), or any larger fragment that accepts a participial phrase after it (Every turn, Should the game disambiguate). Beyond that, the grammar is mainly the simple subset of English’s, but of course general To-phrases define a one-off idiom rather than a category of grammar, so can break or reinforce just about anything.

Interesting, thanks Ron. So it seems there was a decision early on that this sort of notation isn’t useful for Inform (BTW I am sure I am wildly deviating from the BNF standard, in fact I never thought of it as a standard although I’ve seen plenty of them). But I can’t say I agree and the key is in the last thing you say, about the grammar being ‘mainly the simple subset of English’. That makes something sound simple which is actually quite complex: in order to understand the subset, currently, you have to try to use English and then learn each case individually where you are using ‘too much English’ and not adhering to the subset. That’s a lot of cases and a lot of trial & error. I prefer to start from the position that it is not English, it is merely a collection of grammars that superficially resemble English, and then to learn those grammars from the ground up. ‘Subset of English’ makes it sound easier, but to me it’s actually harder because there is way too much irrelevant English bouncing around in my head getting in my way of seeing what are the relevant base structures. For example, it took me way too long to understand the difference between what I call the ‘FULLY-QUALIFIED-NOUN’ structure, above, and the ‘QUALIFIED-NOUN’ structure, and the contexts in which they are used. It’s a categorical distinction that is fundamental to Inform and should be one of the first things learned, as this distinction is what renders most of the grammatical restrictions you will run into at first intelligible. It’s not hard to remember it once you know where it is, but it is impossible to divine from any knowledge of the English language. And it’s a distinction I found much obscured (unnecessarily, IMO) by the official manual’s attachment to the idea that it is describing a form of English, i.e. a subset. Which subset? There are a near-infinite number of possible subsets. Knowing that it’s a subset of English doesn’t give me any grammatical knowledge I can put into practise. What it tells me is, ‘The tree you want is somewhere in the forest.’

Anyway I’m just explaining why I find what you quoted a little bit baffling, but definitely thanks for bringing it to my attention and for explaining the Backus-Naur term.

Distinguishing between what’s a ‘built-in’ and what’s in the library is one of those things that I don’t feel qualified to do yet without a fuller understanding of the later chapters, but I agree that it too is an important, if mostly theoretical, distinction. I suspect that in the end, it will be almost all library. I am a believer in the idea that the section headings should be the literal keywords that you need to use and not categorical descriptions. (i.e. as much as is possible, I want to look through a list of literal library functions like ‘1.1: Carries / Carrying / Carried by; 1.2: Contains / Containing / Contained by; 1.3: Supports / Supporting / Supported by’ rather than ‘1.1 Possessions; 1.2 Contents; 1.3: etc.’). Particularly in the case of Inform, where it is hard to distinguish the required tokens from the words used to explain those tokens, there should be one convention in any manual as to how to designate a token, and it should be adhered to pretty rigidly, IMO. An undifferentiated mix of tokens and non-tokens in chapter and section headings is just the worst possible state of confusion for the uninitiated, but that’s the system adopted by the official manual. So I interpret that as the main point of George wanting the Python-like manual organisation: it is very clear on what’s a token and it is designed to be scanned for tokens. (I suppose I’m using ‘token’ rather loosely here to include built-in keywords and library functions.) That’s of huge importance to me as well.

Thanks for your thoughts, guys, it’s been helpful. I’m glad that I haven’t just missed something and that at least some others feel the same need.

Paul.

Ron_Newcomb · February 2, 2011, 6:17pm

That whole One Magic Token thing is what bugs me about programming languages. Heaven forbid we use words, with spaces between them even. Comp Sci really needs to give the mathematicians back their syntax.

Also, I forgot to mention to you the Index, especially the Phrases tab.

zarf · February 2, 2011, 7:41pm

Here’s my take: Python, as an example, has an extremely simple syntax. (That was a design goal, of course.) The Python library reference is wonderfully detailed, but it isn’t about syntax at all – it assumes that you already know how to do “X.Y” and “X.Y(Z)”, and those are 95% of what you do with Python libraries.

I7 doesn’t break down that way. You say:

But in fact those aren’t functions, and the words in them aren’t tokens. “Carried” could refer to a relation (the carrying relation); an adjective (“a thing is carried if the player is carrying it”); or one of the special requirements of an action (“Wearing is an action applying to one carried thing.”). Or it could be part of a longer name (there’s a rule called “the can’t put onto something being carried rule”; there’s a condition “whether the action requires a carried noun”).

In Python, a token “carried” would have exactly one definition in any context. In I7, the compiler looks through all of these possibilities. It distinguishes them based on context, which includes the presence of the longer name, but also on – well, English-like connectives and rules. “If X is carried: …” is different from “now the player carries X”.

(You get a peek into this parsing process when you make a mistake, and the I7 error message lists all the various ways the compiler failed to parse your code.)

So in a traditional language, it makes sense to organize the manual by token, because there’s a strong aesthetic of “one token <=> one feature”. (More so in Python than in Perl, of course.) But in I7, there’s this messy many-many relation between code words and concepts. You really do have to know that the carrying relation is a core concept, which feeds into many syntaxes (“now the player carries X”, “if the player carries X”, “all swords carried by the player”, …) but is distinct from concepts like action requirements. If you lead with the words you have a horrible disorganized mess.
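
The many-many relation described above can be sketched in a few lines of Python. The phrasings are the ones quoted in this post; the concept labels and the function name `concepts_by_word` are informal names invented for the illustration, and the word-splitting is deliberately naive.

```python
from collections import defaultdict

# Example phrasings taken from the post; the concept labels are informal
# names chosen for this sketch, not official Inform terminology.
CONCEPT_EXAMPLES = {
    "carrying relation": [
        "now the player carries X",
        "if the player carries X",
        "all swords carried by the player",
    ],
    "carried adjective": ["if X is carried"],
    "action requirement": ["Wearing is an action applying to one carried thing."],
    "rule name": ["the can't put onto something being carried rule"],
    "condition": ["whether the action requires a carried noun"],
}


def concepts_by_word(examples):
    """Invert concept -> phrasings into word -> set of concepts."""
    index = defaultdict(set)
    for concept, phrases in examples.items():
        for phrase in phrases:
            # Naive tokenising: lowercase, drop full stops, split on spaces.
            for word in phrase.lower().replace(".", "").split():
                index[word].add(concept)
    return index


if __name__ == "__main__":
    index = concepts_by_word(CONCEPT_EXAMPLES)
    for word in ("carried", "carries"):
        print(word, "->", sorted(index[word]))
```

Looking up “carried” in the inverted index returns all five concepts, while each concept in turn has one or more phrasings, which is why a manual organised purely by word ends up scattering each concept across many entries.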

But then you also want a reference organized by code word. I tried to do this with my I7 index page: eblong.com/zarf/i7index/ The point there is to be an index, rather than a manual – entries just refer to manual chapters. It may be useful, but on the other hand, it’s very hard to do well. (The entries on “carrying”, “carried”, etc are rather sparse. I don’t know if they cover everything you want to know.)

And that leads back to the original question, which is syntax. I7 has a high-level syntax, which I think the manual fails to convey very well. (Thus the very frequent question whose answer is “you have to put your ‘if’ statement inside some rule”.) I think Ron’s manual goes in that direction. But it’s more important to have the high-level stuff – the layout of rules and top-level declarations. If you try to notate the syntax of everything down to individual values and variable names, you’re going to drown in the swamp of phrase and adjective names (which, as I showed above, can be very flexible and include other “reserved” words).

Laroquod · February 2, 2011, 9:16pm

I didn’t mean single words with no spaces in them. Let’s just say I used the wrong word with ‘token’. I meant to differentiate between actual words you are supposed to type when programming, and words that are merely in the manual to help explain those keywords. The distinction between those two things gets pretty muddy at times in the official manual, particularly in the section headings, which I don’t find useful at all in that manual. I’m just saying they should be clearly distinct, even more so with i7 than other languages, because without some non-alphabetic or contextual cue, it is impossible to distinguish the pseudo-English from the English meant to explain the pseudo-English. The simplest cue I can think of is to reserve section headings for actual syntax words only. Also, aren’t i7 keywords self-describing? They make better section headings than any other language’s keywords. Forgive me if I’m just not getting it, but this seems like a no-brainer.

I realise this stuff. To me it doesn’t obviate the need. English is way more complex than Inform, and I have seen plenty of grammar specifications for English that attempt to conceptually name and mark out the relations of parts of speech, rather than going with examples alone.

OK let’s just say I screwed up all the terminology. I don’t care if they are functions or tokens or something else. They are words I need to know to write the code, that can be substituted into sentences with certain effects – almost all of them (like the relations, say) belong to a particular category the members of which behave in a similar way. That’s good enough for me. I didn’t think any of my reasoning was really contingent on the semantics of whether they are functions or tokens or whatnot.

zarf:

In Python, a token “carried” would have exactly one definition in any context. In I7, the compiler looks through all of these possibilities. It distinguishes them based on context, which includes the presence of the longer name, but also on – well, English-like connectives and rules. “If X is carried: …” is different from “now the player carries X”.

(You get a peek into this parsing process when you make a mistake, and the I7 error message lists all the various ways the compiler failed to parse your code.)

So in a traditional language, it makes sense to organize the manual by token, because there’s a strong aesthetic of “one token <=> one feature”. (More so in Python than in Perl, of course.) But in I7, there’s this messy many-many relation between code words and concepts. You really do have to know that the carrying relation is a core concept, which feeds into many syntaxes (“now the player carries X”, “if the player carries X”, “all swords carried by the player”, …) but is distinct from concepts like action requirements. If you lead with the words you have a horrible disorganized mess.

You make excellent points here but I still don’t think it obviates the need for a spec. It just makes it more difficult. Where there are many-to-many relationships, I guess I would just approach that by making it as clear as possible which parallel structures each type of relationship has as its siblings (not in the OO sense of ‘sibling’). As I say above, there are grammarians who have written what are basically specs for the English language itself. So although I’m sure you are right about all of these specifics and I was wrong, it’s still hard for me to see how your points speak directly to the need (or lack of need) for a specification.

I have seen this document. It’s a very worthwhile goal, it’s just too bad that it references the official manual, which I find very frustratingly unclear in parts. Very often when I read a section of the manual, I have to do a bunch of trial & error to actually nail down the behaviour. I save everything I’ve learned in a document, some of which is in the format I used above. (The rest is just me rewriting sections of the manual in my own words to be more specific about the resulting behaviour. I’ve already completely rewritten in my own words everything up to chapter 13 in the official manual, though the spec section is lagging behind as it requires more analysis – I’ve been attempting to build the formal grammar on top of my rewritten chapters.) So that’s the main reason I didn’t perceive your index as a solution to my problems.

Well you have the experience to know better here and it seems what you are saying is that I am undertaking a Quixotic quest. I’m not going to argue that. If that is the case, I’ll be easily dissuaded when that realisation dawns; it just hasn’t yet and I’ve made a lot more headway than what I’ve posted above. Thanks for the warning, I know you are just trying to save me work. But if I don’t try and fail to write an accurate spec while I have the desire, I figure I’ll be sitting around years from now still wishing, ‘Gee I wish I had a spec’ and wondering if maybe I shouldn’t have been so easily dissuaded. 8) Some mistakes we just have to make for ourselves. I’ll be happy to admit you were right all along, when the time comes, because it’ll mean I can drop this obsession with a spec.

Thanks guys I hope my stubbornness isn’t too annoying. I’m just not convinced that a spec isn’t doable or useful. I was desperate for one coming in, and being more than halfway in now, I can see all the little places where it would have saved me tons of time. Even half a specification would be better than the status quo, IMO.

Paul.

Laroquod · February 2, 2011, 10:08pm

I forgot to address this. I’ve tried to use the Index and the Phrases tab, I’ve been all over that stuff, and I just can’t use that stuff. I want my info in a text file – I’m strictly a text file man. And it isn’t a spec but you’re right it does come close in some ways. Can I get that stuff in a text file? Or is that basically considered to be equivalent to the syntax document Juhana pointed me to?

Paul.

capmikee · February 2, 2011, 10:22pm

The other day I tried to use the “find” feature to locate some rules in the Actions tab of the Index. No matter what I tried, I couldn’t get it to work without guessing what the action was and clicking on it. I’m not sure if this means the Actions tab is split up into multiple pages, or if the find feature is flawed, but I’ve had plenty of problems with both.

Laroquod · February 2, 2011, 10:57pm

Yeah, I don’t so much reread text files, I search them. Search after search after search. Grep is indispensable. Go full screen – paging around as rapidly as possible to cross-reference different sections. In TextWrangler I can even look at two widely separated sections of the same document simultaneously. Also, I like to add my own notes. I feel crippled inside the IDE, reference-wise. The index is pretty useful though once you get stuff built. And the skein is cool, there are lots of neat things about the IDE but I think it’s the wrong place for reference material [ EDITED TO QUALIFY: For someone with my text file habits. ]

P.

StJohnLimbo · February 3, 2011, 12:18am

AFAIK, it’s not available in an up-to-date text file; here’s a pertinent explanatory post: inform7.com/news/2010/06/20/6e59 … ook-index/

zarf · February 3, 2011, 2:47am

Yeah, but nobody tries to learn from them.

I don’t think the desire for a spec is quixotic or impossible. I just think most of the value is in the first 20% of the work – which is an argument for doing that much, of course. Also: that value is in supplementing the manual with a clear understanding of how the core I7 concepts are fit into English. It won’t be great for looking up how to do specific things.

Laroquod · February 3, 2011, 10:32am

I agree with that last bit. The waters got muddied a bit by the introduction of the strategy of a keyword-organised reference from those Python docs. The purpose of a spec and the purpose of organising a manual around keyword headings are separate purposes — but they’re principles that do cross over in that grammatical terminology used in the spec can be used to help categorise a keyword-based reference (where one looks up specific things) and make it more structurally intelligible. And of course, the keyword reference describes the range of vocabulary available for parts of the spec, hopefully completing it.

I wouldn’t be as interested in having a complete spec if it didn’t have an adjoining, compatibly-categorised keyword reference. It would be less useful, as you say.

P.

Ron_Newcomb · February 3, 2011, 7:29pm

Generally, I think just understanding type gets you a long way. Granted, type like description of people would need to be explained in detail, as that’s a whole mess of adjectives, relations, a kind, “called” local variables, and the like, on top of the sometimes restricted ways of how a relation fits in (i.e., sometimes a “reversed” relation is necessary) and especially when chaining relations together.

OK, to-phrases can take a description or instance for the “type” of each parameter, an oft-overlooked fact. When we say To foo (x - a person) the “person” bit is a description of people, not just a kind as in more restrictive languages.
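
One loose way to picture that difference, as a Python sketch rather than anything Inform itself does: a parameter “type” that is a description behaves like a test on the value (a kind plus any adjectives), not just a class name. The classes `Thing` and `Description` and their fields are hypothetical, and kind inheritance, relations and “called” variables are all ignored here.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    """A hypothetical world object: a name, one kind, and some adjectives."""
    name: str
    kind: str
    adjectives: set = field(default_factory=set)


@dataclass
class Description:
    """A kind plus zero or more adjectives, e.g. 'a person' or 'an open container'."""
    kind: str
    adjectives: frozenset = frozenset()

    def matches(self, thing):
        # Kind must be identical (no inheritance in this sketch) and every
        # adjective in the description must hold for the thing.
        return thing.kind == self.kind and self.adjectives <= thing.adjectives


if __name__ == "__main__":
    chest = Thing("chest", "container", {"open"})
    box = Thing("box", "container", {"closed"})
    open_container = Description("container", frozenset({"open"}))
    print(open_container.matches(chest))  # the chest is an open container
    print(open_container.matches(box))    # the box is a container, but not open
```

A bare kind such as “a person” is then just the special case of a description with no adjectives.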

But other than that, the to-phrases, decide-phrases, and assertions all are each like an idiom. There’s no real grammar there, just a fill-in-the-blank ad-lib. Descriptions are the only part with a real grammar.

Well, maybe rule preambles as well.

I keep undermining my own point, don’t I?

Laroquod · February 4, 2011, 12:59pm

Yeah, truly, and you know what, I don’t think I fully understand type yet in all its myriad detail. I’m sure I will though by the end of this effort, however it turns out. 8)

Heh. I take your point though — there are some very freeform constructions that are sort of like grammatical black boxes. Not everything is going to be well illuminated by this process — I see that now.

Paul.