As an example from a very early draft, here’s a short excerpt:
Understanding strings in Inform 6
Most people with even passing familiarity with another programming language will understand the concept of a string as a piece of text, i.e. a series of characters in a certain order.
In some programming languages, such as C, storage for a string takes the form of a memory address. The first character of the string is at that address, and each successive address contains the next character in the string. A special terminator character marks the end of the sequence of characters that comprise the string. As an example, consider the string `"cat"` in C. It would be stored in memory somewhere, say at address `1000`. The first character `c` is stored at address `1000`, `a` is stored at `1001`, `t` is stored at `1002`, and the terminator character marking the end of the string is stored at `1003`. [diagram here]
Under this storage scheme, if you know the starting address you can access individual characters within the string by adding an offset to the memory address of the string (i.e. the address of the string’s first character). The offset is usually thought of as zero-indexed, so the offset is one less than the ordinal position of the character sought. For example, to find the second character (`a`) in `"cat"`, the memory position to examine is `1000+2-1` or `1001`.
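The offset arithmetic above can be sketched in C as follows. This is only an illustration: the helper name `char_at_ordinal` is hypothetical (not part of any library), and real addresses will of course differ from the `1000` used in the text.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the character at a 1-based ordinal
 * position in a terminated C string, or '\0' if the position is
 * outside the string.
 */
char char_at_ordinal(const char *s, size_t ordinal)
{
    if (ordinal == 0 || ordinal > strlen(s)) {
        return '\0';               /* position is out of range */
    }
    /* start address + (ordinal - 1), i.e. a zero-indexed offset */
    return *(s + (ordinal - 1));
}
```

Calling `char_at_ordinal("cat", 2)` examines the address one past the start of the string and yields `a`. The terminator sits at offset 3, immediately after the `t`.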
It is important to understand that in the majority of cases this is not how strings are stored in Inform 6.
As discussed in the section on the Z-Machine memory model (see [cross-ref]), memory in the Z-machine is segmented into addressable and packed segments. Most strings found in the source code – such as those used to provide descriptions of rooms and objects and to specify responses to actions – will be consigned to packed memory by the compiler. The running game cannot directly access the memory locations in packed memory; it must rely on the interpreter to do so. As a result, there is no way for the running game to even ask for the second character in a packed string. All it can do is ask the interpreter to print the whole string starting at a particular location (the packed string’s starting point) in packed memory. This is what the “printing rule” `(string)` does.
As also discussed (see [cross-ref]), packed memory is static and can’t be changed at run-time. This is why it is not possible to modify most strings (being held in packed memory) while the game is running.
However, not every string literal in the source code is destined to become a packed string. If a string literal is used to initialize an array, then the compiler treats the declaration as equivalent to asking for an array to be built with each storage unit (byte or word) of the array populated by the successive characters of the string literal. Note that, unlike in C, no terminator character is included at the end of the sequence in this case, though for some array variants (see [cross-ref]) the number of characters in the string will be stored where it can be easily accessed.
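To make the contrast with C concrete, here is a small C model of an unterminated character sequence whose length is stored alongside it. It is a sketch under stated assumptions, not Inform 6 code: it assumes the count occupies the first byte of the buffer, and both helper names (`make_counted`, `counted_char_at`) are hypothetical. The actual layouts of the Inform 6 array variants are covered in the cross-referenced section.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: store text as a count byte followed by its
 * characters, with no terminator. Returns the number of bytes
 * written, or 0 if the text does not fit.
 */
size_t make_counted(unsigned char *buf, size_t cap, const char *text)
{
    size_t n = strlen(text);
    if (n > 255 || n + 1 > cap) {
        return 0;                  /* count or buffer too small */
    }
    buf[0] = (unsigned char)n;     /* assumed layout: count first */
    memcpy(buf + 1, text, n);      /* characters only, no '\0' */
    return n + 1;
}

/*
 * Hypothetical helper: return the character at a 1-based position,
 * or -1 if the position lies outside the stored count.
 */
int counted_char_at(const unsigned char *buf, size_t ordinal)
{
    if (ordinal == 0 || ordinal > buf[0]) {
        return -1;
    }
    return buf[ordinal];
}
```

Because nothing marks the end of the sequence, code that walks such a buffer has to rely on the stored count (or on knowing the array's size some other way) rather than scanning for a terminator.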
This text will refer to such arrays containing unterminated sequences of characters as “pseudo-strings.” Built-in support for pseudo-strings is limited. The `(string)` printing rule does not work with these pseudo-strings. The `String` metaclass does not apply to pseudo-strings. The burden is very much on the author to deal with these constructs… [go on to define some routines that might be useful and/or cite useful extensions or StdLib routines that may exist]
This is the kind of thing I meant about a “sliding scale” of technical detail intended to be useful for both programming newbies and those with programming experience whose expectations have been set by other languages.
Would a whole book of this sort of stuff be useful?