Truncation of long words in input and BlkValueWrite error

Ben · May 10, 2021, 1:27am

Hi!

I’m writing my own parsing routines for a menu system and I just got confused by this (non-fatal) run-time error:

*** BlkValueWrite: writing to index out of range: 22 in 697450 ***
*** BlkValueWrite: writing to index out of range: 23 in 697450 ***

The snippet below won’t reproduce the issue (the full code is long), but these lines cut from the code give some idea of what I’m doing.

let C be "[the player's command]"; repeat with CWI running from 1 to the number of words in C: let commandWord be word number CWI in C in lower case;

The run-time error happens not every time the loop operates on the same player command, but is predictable given the same sequence of player commands (i.e. happens eventually).

The error comes (sometimes) when one of the words in the player’s command is 23 characters long. When I print commandWord after the error it gets the last character truncated.

Does anyone know how long a word can be and still be safe for the player to use? I’m on Glulx.

Cheers,

Ben

patrick_mooney · May 10, 2021, 6:14am

I thought I knew more of the answer than I can find documentation for, so I’ll mostly confine my answer to what I can document.

Section 3.1 of Writing with Inform says:

A small limitation here is that probably only the first 9 letters of each word are read from the player’s command. This is plenty for handling the wicker cage and the black rod, but it might be embarrassing at a meeting of the Justice League to find that KISS SUPERHERO and KISS SUPERHEROINE read as if they are the same command.

I believe, but cannot be certain, that “probably only the first 9 letters of each word are read” means “assuming Glulx, and fewer [I want to say 6?] are read on the Z-machine.” But I cannot find that documented, either.

I think I’ve seen a USE option somewhere that allows this to be raised, but a quick Google search didn’t turn up anything useful. Hopefully someone else can be helpful there.

This may not be germane (you say the code snippet was an excerpt, and it might not match your actual code in this regard), and you may already know this. However, when you write

let C be "[the player's command]"; repeat with CWI running from 1 to the number of words in C: let commandWord be word number CWI in C in lower case;

… it may be helpful to unwrap the semicolon-delimited commands into block-indent form:

every turn when parsing: [this is just a made-up header to illustrate a structure; of course your rule will have its own header that does what you want it to do when you want it to do it]
    let C be "[the player's command]";
    repeat with CWI running from 1 to the number of words in C:
        let commandWord be word number CWI in C in lower case;
        [do other things]

Inform 7’s parser can act in ~~wonky~~ counterintuitive ways when a condition ending in a colon is followed by one or more imperatives ending in a semicolon on the same line, and sometimes just unwrapping them into indented form helps to clarify your own logic to the parser.

Again, that may not apply to your actual code, of which this is merely an illustration; but it’s hard to say without seeing that code.

Hopefully someone else can fill in some of the blanks I left!

StJohnLimbo · May 10, 2021, 10:04am

I think the limit is 9 characters on the Z-machine (but can be less if there are special characters which use an encoding that takes up more space than usual), and it’s configurable on Glulx with the setting “Use DICT_WORD_SIZE of 15.” (or another number, of course).

See chapter 2.2 Varying What Is Read in the Recipe Book:

by default Inform truncates words to nine letters before attempting to identify them. This is no problem in most circumstances and is likely to go unnoticed – until we have two very long words whose names are nearly identical, such as “north-northwest exit” and “north-northeast exit”. (To make matters worse, a punctuation mark such as a hyphen counts as two letters on its own.)
When we are compiling for Glulx, the limit is easily changed with a single line, setting the constant called DICT_WORD_SIZE.
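
To make the truncation described in that passage concrete, here is a minimal Python sketch (not Inform code). It assumes the simplest possible model, in which every character counts as one letter, so it ignores the Z-machine encoding quirks mentioned above. The function names are made up for illustration.

```python
def dictionary_form(word, dict_word_size=9):
    """Return the prefix a dictionary with this word size would keep.

    Simplification: one character counts as one letter.
    """
    return word.lower()[:dict_word_size]


def same_to_parser(a, b, dict_word_size=9):
    """True if the two words are indistinguishable after truncation."""
    return dictionary_form(a, dict_word_size) == dictionary_form(b, dict_word_size)


print(same_to_parser("superhero", "superheroine"))      # nine-letter default
print(same_to_parser("superhero", "superheroine", 15))  # larger word size
```

Under this model the two words collide at nine letters and are told apart at fifteen, which is the effect the “Use DICT_WORD_SIZE of 15.” setting is after.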

Here are other threads on input length issues, which mention limits of 20 words and 260 characters in total:

Also, here’s a (very old) thread where a similar error message about “BlkValueWrite: writing to index out of range” crops up, where it probably has to do with needing to choose a row before modifying a table entry. But that might be totally unrelated to your question, since BlkValueWrite is used in all sorts of places (I think).

I don’t know if any of this sheds light on the issue you are facing, unfortunately.

If possible, it would be great if you could isolate the problem into a reproducible minimal example. (But I know that’s often really hard to do.)

Ben · May 10, 2021, 10:56am

Thanks ever so much @patrick_mooney and @StJohnLimbo for helping me try to get to the bottom of this. It’s a bit of a mystery, isn’t it. I’d love to know what the default DICT_WORD_SIZE is, and if this is definitely the relevant factor. Luckily in my case it’s not completely necessary to solicit really long words from the player, so I just won’t do it, and the uncertainty around what counts as a really long word isn’t going to keep me awake at night!

By the way @patrick_mooney I do indent my code like a total pro, but I post forum posts like a total novice. No idea why it got wrapped like that…

drpeterbatesuk · May 10, 2021, 12:14pm

Not sure about the cause of this strange apparent bug, but the length of dictionary words is a red herring, I think. The player’s command should not be affected by whether words in it are dictionary words at all, never mind the length of those words as stored in the dictionary. It seems particularly odd that the error is replicable after a given sequence of player commands, rather than every time for a specific player’s command. Are these commands being entered separately one at a time, or together on one entry, separated by periods or ‘then’? Could you give an example of a sequence of commands that triggers the error? Out of interest, does changing ‘let C be “[the player’s command]”’ to ‘let C be the substituted form of “[the player’s command]”’ make a difference? What does ‘say C’ reveal at the point where the error occurs? Are there any punctuation marks or other unusual characters in the command that triggers the error (such as apostrophes, hyphens, commas, etc.)? Does the error still occur if you remove ‘in lower case’? Is there anything in your code that could cause the player’s command to be changed at some point (‘change the text of the player’s command to…’)?

Ben · May 10, 2021, 2:01pm

@drpeterbatesuk Because of my strong hunch that the error is only with words over a certain length (because when I did nothing but change to a shorter word, the error went away), and because I can live with shorter words, I’m not going to spend more time on trying to debug this, so sorry I can’t answer all your questions. But I can say that the player command that caused the trouble was “backChannelFlag speechRecognitionFailed”, and that the code that dealt with this command (the loop I originally included, with lots of other stuff also) did what it should have done the first three times, but the fourth time the line ‘let commandWord be word number CWI in C in lower case’ processed that input, the error arose. If I included ‘say commandWord’ after the let line, it output “speechRecognitionFailed” for the first three times (no error) but “speechRecognitionFaile” (no terminal “d”) following the error.

If you wonder why my “player” is giving commands to the story in what looks like code - I have wrapped Glulx in something of my own so the player isn’t interacting directly.

drpeterbatesuk · May 10, 2021, 2:56pm

Hmmm. I wonder if that has something to do with it. Certainly a minimal reconstruction of your code and input doesn’t replicate the error (in Inform 7 build 6M62). Good luck with your project; it sounds interesting!

drpeterbatesuk · May 11, 2021, 11:43am

I’ve replicated the error in a minimal example. It’s a non-trivial bug in block value processing. It doesn’t have anything to do with player input or the dictionary.

Lab is a room.

When play begins:
	let C be "1234567890123456789012 123456789012345678901234567890123";
	let commandWord be "xyzzy";
	repeat with CWI running from 1 to the number of words in C:
		now commandWord is word number CWI in C;
		say "Word [CWI] is:  [commandWord].";

The error occurs in TEXT_TY_BlobAccessI(). In making a mutable copy of the text block value it is about to write to, it inadvertently creates a long block that is shorter than the one it was copied from, and too short for the text being written into it.

Ben · May 11, 2021, 12:24pm

Thanks @drpeterbatesuk! Do you have any advice (on the basis of your now better understanding of this bug) for when using this technique, other than “try and avoid really long words in player input”?

drpeterbatesuk · May 11, 2021, 4:00pm

Hi, it hasn’t anything per se to do with player input (as you can see from the minimal example).

tldr: set a text variable to “” before setting it equal to a word extracted from a longer text.

OK, deep breaths, here we go…

The I6 template function the bug arises in - TEXT_TY_BlobAccessI() - is a moderately complex one implementing a finite state machine that provides various functions related to counting/extracting/replacing parts of a text. TEXT_TY is shorthand for I7’s text type, ‘blobs’ is how the template refers to smaller parts of one of I7’s dynamically allocated block values, of which texts are an example. Other examples of block values include lists and stored actions. In the case of texts, blobs can for example be characters, words, punctuated words, unpunctuated words, lines or paragraphs- ideas manifest in the various phrases used in I7 to search and modify texts.

In your original code and my minimal example, TEXT_TY_BlobAccessI() is (indirectly) called in order to extract an enumerated word from a longer text. In your code, the longer text is derived from the player’s input. In mine it’s a simple local text variable.

The problem potentially arises when TEXT_TY_BlobAccessI() has to return a text value. This is supplied by writing to one of the parameters of the function, ctxt, a text block value. Simplifying things slightly, block values comprise two parts. The first is a short header (called the short block), whose address is directly referenced by the (for example) text variable. The short block contains information describing the block value and a reference to the address of the first of a doubly-linked list of data blocks (called long blocks), potentially scattered through memory, each consisting of a header referencing the address of the previous and next long block in the chain followed by (in this case text) data. Long blocks are dynamically allocated and of variable size and number, but their size in bytes (including header) is always a power of 2. The maximum data storage of a block value therefore comprises the sum of the sizes of its chain of long blocks, less the room taken up by their headers. The actual amount of data storage used may be a little less than the maximum, with padding from there to the end of the final long block in the chain. In the case of texts, the end of the data can be found by stepping through the data in each long block in the chain until reaching a zero. Ultimately then, block value texts are a storage format for dynamically-allocated zero-terminated strings.
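
As a rough illustration of that layout, here is a small Python model (not the template code). The header size of 16 bytes, the one-cell-per-character storage, and all the names are assumptions made for the sketch; only the general shape (power-of-2 blocks, capacity as the sum of the data areas, a zero marking the end of the text) comes from the description above.

```python
HEADER_BYTES = 16  # hypothetical header size, chosen only for illustration


class LongBlock:
    """One long block: 2**size_log2 bytes in total, header included."""

    def __init__(self, size_log2):
        self.capacity = 2 ** size_log2 - HEADER_BYTES
        self.data = [0] * self.capacity  # zero-filled, so unused space is padding


def chain_capacity(chain):
    """Maximum data storage: the sum of every block's data area."""
    return sum(block.capacity for block in chain)


def store(chain, text):
    """Write text plus a terminating zero across the chain, block by block."""
    cells = [ord(ch) for ch in text] + [0]
    if len(cells) > chain_capacity(chain):
        raise ValueError("text does not fit in this chain")
    position = 0
    for block in chain:
        for i in range(block.capacity):
            block.data[i] = cells[position] if position < len(cells) else 0
            position += 1


def text_length(chain):
    """Step through the data in each block until the first zero."""
    length = 0
    for block in chain:
        for cell in block.data:
            if cell == 0:
                return length
            length += 1
    return length


chain = [LongBlock(5), LongBlock(6)]   # a 32-byte and a 64-byte block
store(chain, "1234567890123456789012")
print(chain_capacity(chain), text_length(chain))
```

The gap between the two printed numbers is the padding, which is the quantity that matters in what follows.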

For more complete information about block values, you can copy the BlockValues.i6t and Flex.i6t files to be found in the /Internal/I6T/ folder of your Inform 7 install directory. Open these copies in a text editor to reveal partly-annotated versions of the I6 template functions used to work with block values.

The above description illustrates the possibility that the same data (i.e. chain of long blocks) can in theory be referenced by two different short blocks, i.e. two different I7 text variables. e.g. ‘let commandWord be “frotz”; let magicWord be commandWord’ creates two text variables both now pointing to the same long block data, “frotz”. If one of these 2 variables is then changed, e.g. ‘now commandWord is “xyzzy”’, the new text ‘xyzzy’ can’t simply be written to the existing long block; otherwise magicWord, which points to the same long block data, would become “xyzzy” too, rather than remaining as “frotz”. Inform deals with this by keeping a count of how many variables are referencing a given chain of long block data. If Inform needs to change long block data referenced by two or more variables, it first makes a new copy of that data so that in this case there are now two long block copies of “frotz”, one pointed to by commandWord and the other pointed to by magicWord. This is called making commandWord ‘mutable’. It then overwrites the “frotz” pointed to by commandWord with “xyzzy”, so we end up with commandWord as “xyzzy” while magicWord remains as “frotz”.
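
The reference-counting idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how the template is written: LongData, TextVar and their methods are invented names, and the real short block/long block split is collapsed into two small classes.

```python
class LongData:
    """Shared character data plus a count of the variables referencing it."""

    def __init__(self, text):
        self.chars = list(text)
        self.refs = 1


class TextVar:
    """Stand-in for an I7 text variable (the 'short block' side)."""

    def __init__(self, text=""):
        self.data = LongData(text)

    def copy_from(self, other):
        """'let magicWord be commandWord': share the data, bump the count."""
        self.data.refs -= 1
        self.data = other.data
        self.data.refs += 1

    def make_mutable(self):
        """Take a private copy if anyone else references the same data."""
        if self.data.refs > 1:
            self.data.refs -= 1
            self.data = LongData(self.data.chars)

    def set_text(self, text):
        """'now commandWord is ...': never write into shared data."""
        self.make_mutable()
        self.data.chars = list(text)

    def __str__(self):
        return "".join(self.data.chars)


command_word = TextVar("frotz")
magic_word = TextVar()
magic_word.copy_from(command_word)      # both point at one "frotz"
command_word.set_text("xyzzy")          # triggers the private copy
print(command_word, magic_word)
```

Note that nothing in this sketch promises the private copy has the same storage capacity as the data it was copied from, and that is the detail the bug turns on.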

To quote the BlockValues.i6t template, ‘Subtle and beautiful bugs can occur as a result of making a value mutable…’ This is an example of one of those bugs. At the start of the function, TEXT_TY_BlobAccessI() records the current maximum data storage available in the long block chain allocated to its return text variable parameter, ctxt, in a local variable, csize. When writing to ctxt, it keeps an eye on csize and if it realises it is going to end up needing more data storage than will fit, it dynamically reallocates more storage to ctxt’s long block chain to make room. This works fine, except in the case where ctxt’s long block chain is also referenced by another variable. In this case, as soon as TEXT_TY_BlobAccessI() tries to write to ctxt, the template code notices this and, before writing, makes ctxt ‘mutable’ by making for ctxt its own copy of its long block chain data. Unfortunately, although the long block created is sized to fit that data, it is not guaranteed to have exactly the same size and structure as the long block chain it was copied from. For example (to oversimplify), “frotz” would fit equally well within a block of 16 bytes or one of 32 bytes. Consequently, the maximum data storage capacity of ctxt may change when it is made ‘mutable’, but TEXT_TY_BlobAccessI() does not notice this and update csize accordingly. The end result is that when ctxt is made mutable, and in doing so its maximum storage capacity becomes less than it was before, TEXT_TY_BlobAccessI() does not notice if it begins to try to write beyond the end of ctxt’s reduced data storage capacity unless and until it tries to write beyond the limit previously defined by the original csize.

Your original example and my minimal case both create an edge case where this can happen. When commandWord is reused in the 2nd iteration of the repeat loop, the temporary I6 variable ctxt created as a parameter for TEXT_TY_BlobAccessI() is pointing to the same long block data as commandWord. After certain sequences of past-and-present ‘blobs’ ctxt is now pointing to a long block chain with significant ‘padding’ beyond the actual data, such that when ctxt is made ‘mutable’ the copy made of its long block data is ‘economically’ created with a smaller-capacity long block chain- containing the same data but less ‘padding’. If the new data being written to ctxt is sufficiently longer than that in commandWord to overwrite not only the old data but also that shorter padding and beyond, the error will occur.

In my example, on the second iteration of the repeat loop ctxt starts with a maximum data storage of 44 characters in its long block, consisting of 22 text characters, a terminating zero, and 21 characters of ‘padding’. When TEXT_TY_BlobAccessI() tries to write the first character to ctxt, the template makes ctxt mutable and in doing so creates a long block with a maximum data storage of 28 characters, more than enough to hold the existing copy of the data, but nothing like enough to hold the 33 characters plus terminating zero TEXT_TY_BlobAccessI() is about to write. TEXT_TY_BlobAccessI() is oblivious to this, still thinking (because csize is 44) that there is plenty of room. As soon as the 29th character is written (to index 28 in the data block, which is zero-indexed) the errors start, and are repeated through to the terminating zero being written to index 33.
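
The numbers in this walkthrough can be replayed with a toy Python model. It is only a sketch: ToyText and extract_word are invented for the purpose, the 44 and 28 capacities are taken from the paragraph above rather than computed, and the copy-on-write step is reduced to trimming a list.

```python
class ToyText:
    """Toy text value whose storage shrinks when it is first made mutable."""

    def __init__(self, capacity, compact_capacity, shared=True):
        self.buf = [0] * capacity
        self.compact_capacity = compact_capacity
        self.shared = shared
        self.errors = []            # indices of out-of-range writes

    def capacity(self):
        return len(self.buf)

    def write(self, index, ch):
        if self.shared:             # copy-on-write, with less padding
            self.buf = self.buf[:self.compact_capacity]
            self.shared = False
        if index >= len(self.buf):
            self.errors.append(index)
            return
        self.buf[index] = ch


def extract_word(ctxt, word):
    csize = ctxt.capacity()         # recorded once, before any write
    cl = 0
    for ch in list(word) + [0]:     # the characters, then the terminating zero
        if cl + 1 >= csize:         # never true in this run: csize is stale
            ctxt.buf.extend([0] * (2 * cl - len(ctxt.buf)))
            csize = ctxt.capacity()
        ctxt.write(cl, ch)
        cl += 1


ctxt = ToyText(capacity=44, compact_capacity=28)
extract_word(ctxt, "123456789012345678901234567890123")
print(ctxt.errors)
```

The recorded error indices run from 28 to 33, matching the run-time errors described above, because the growth check keeps comparing against the stale 44.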

The error occurs due to a combination of slightly unusual circumstances in specific and possibly not-very-commonly-invoked template code. I’m not sure how widespread elsewhere in the template such edge cases might occur. However, for this circumstance there appears a simple fix: set a text variable to “” before making it equal to a word or other type of ‘blob’. This should ensure that ctxt does not start out with a dangerously extensively-padded long block chain that in being copied might end up being shortened.

e.g.

Lab is a room.

When play begins:
	let C be "1234567890123456789012 123456789012345678901234567890123";
	let commandWord be "xyzzy";
	repeat with CWI running from 1 to the number of words in C:
		now commandWord is "";
		now commandWord is word number CWI in C;
		say "Word [CWI] is:  [commandWord].";

EDIT: looking closer at this, the reason this simple example I7 fix works is not what I originally thought- what it does is make ctxt mutable by removing the reference that commandWord was holding from the 1st iteration of the loop to the same long block data pointed to by ctxt.

Without this fix, after the first iteration of the loop the I6 temporary variable ctxt (represented in the I7 code by ‘word number CWI in C’) and the I7 local variable commandWord are left both pointing to shared long block data, representing a copy of the 1st word of C -“1234567890123456789012”.

When the second iteration starts, ‘now commandWord is “”’ points commandWord elsewhere, to “”, reducing the number of variables referencing ctxt’s long block data from 2 to 1, making ctxt mutable and thus not triggering the bug.

Ben · May 11, 2021, 4:25pm

Wow! Very pleased that you included the TLDR for this one! Code updated.

drpeterbatesuk · May 11, 2021, 4:52pm

For those who prefer I6 hacking, this should provide a general fix by ensuring ctxt is already mutable before TEXT_TY_BlobAccessI() is called:

Include (-
Replace TEXT_TY_BlobAccess;
-) after "Definitions.i6t".

Include(-

[ TEXT_TY_BlobAccess txt blobtype ctxt wanted rtxt
	p1 p2 cp1 cp2 r;
	if (txt==0) return 0;
	if (blobtype == CHR_BLOB) return TEXT_TY_CharacterLength(txt);
	cp1 = txt-->0; p1 = TEXT_TY_Temporarily_Transmute(txt);
	cp2 = rtxt-->0; p2 = TEXT_TY_Temporarily_Transmute(rtxt);
	TEXT_TY_Transmute(ctxt);
	! ########### insertion begins ###########
	if (ctxt) BlkMakeMutable(ctxt);
	! ########### insertion ends ###########
	r = TEXT_TY_BlobAccessI(txt, blobtype, ctxt, wanted, rtxt);
	TEXT_TY_Untransmute(txt, p1, cp1);
	TEXT_TY_Untransmute(rtxt, p2, cp2);
	return r;
];

-) after "Output.i6t".

otistdog · May 12, 2021, 12:46am

Nice job on tracing the cause here, @drpeterbatesuk!

If interested in restricting the scale of possible side effects, a modification to the circumstances surrounding the call to BlkValueWrite() within TEXT_TY_BlobAccessI() may be a suitable alternative. The proximal cause seems to come from the section:

	if (brm == ACCEPTED_BRM or ACCEPTEDP_BRM) {
		if (oldbrm ~= brm) blobcount++;
		if ((ctxt) && (blobcount == wanted)) {
			if (rtxt) {
				BlkValueWrite(ctxt, cl, 0);
				TEXT_TY_Concatenate(ctxt, rtxt, CHR_BLOB);
				csize = BlkValueLBCapacity(ctxt);
				cl = TEXT_TY_CharacterLength(ctxt);
				if (brm == ACCEPTED_BRM) brm = ACCEPTEDN_BRM;
				if (brm == ACCEPTEDP_BRM) brm = ACCEPTEDPN_BRM;
			} else {
				if (cl+1 >= csize) {
					if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;
					csize = BlkValueLBCapacity(ctxt);
				}
				BlkValueWrite(ctxt, cl++, ch);
			}
    ...

where it can be seen that TEXT_TY_BlobAccessI() does try to check the relative storage of the target (csize) against the text to be copied from source (cl+1). As shown by your analysis, the value of csize can become stale, so another way to head off the issue is to update the value of csize prior to the check and let the code expand the target block’s size per (apparent) original design:

			} else {
				csize = BlkValueLBCapacity(ctxt); ! CHECK FOR LENGTH MODIFICATION SINCE INITIAL VALUE
				if (cl+1 >= csize) {
					if (BlkValueSetLBCapacity(ctxt, 2*cl) == false) break;
					csize = BlkValueLBCapacity(ctxt);
				}
				BlkValueWrite(ctxt, cl++, ch);
			}

You know more about the block value layer than I do, so my apologies if you’ve intentionally rejected this approach. (And please do feel free to point out why, if that’s the case.)

drpeterbatesuk · May 12, 2021, 1:39am

Hi,

It’s a valid approach to refresh csize alongside any code writing to ctxt- which occurs in a number of places through TEXT_TY_BlobAccessI()- or more simply, just replace all tests against csize with tests against BlkValueLBCapacity(ctxt)

I did test that approach also, and it certainly works. I suspect that the original (faulty) code as written was trying to avoid doing this for performance reasons because as a consequence of the routine being written as a finite state machine, ctxt is generally written out via multiple writes, one character at a time. This means BlkValueLBCapacity() will be run for every character, which for extensive long block chains is itself a not entirely trivial process, since it needs to step through each block in the chain, calculating the block size and data size for each and adding up the data bytes as it goes. I did also for this reason try the approach where csize is refreshed only after the first write to ctxt- which is the only one to potentially trigger making ctxt mutable (and hence the bug) mid-function- and this works too.

In the end I decided that making ctxt mutable before the start was the simplest and most foolproof method, since this avoids csize becoming unexpectedly stale and allows the moderately-complex code of TEXT_TY_BlobAccessI() to operate exactly as originally written.

I can’t see that this could lead to any unintended consequences- any call to TEXT_TY_BlobAccessI with ctxt as a parameter will inevitably trigger BlkMakeMutable(ctxt) in any case as soon as (and indeed every time) BlkValueWrite(ctxt ...) is called, so getting it out of the way at the start seems both simplest and best. Calling BlkMakeMutable(ctxt) when ctxt is already mutable is safe, as it just returns without changing anything. Making a non-mutable block value mutable unnecessarily shouldn’t have any adverse consequences in any event- Inform implements multiple-pointers-to-one-block only as a means to save time and memory in making simple copies of (potentially) large data blocks, not in order to invoke C-style pointer-wizardry. As in the instance we’re discussing, Inform disentangles multiple-pointers-to-one-block as soon as a function involving anything other than making a straight copy is involved.
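
To compare the approaches discussed in this exchange, here is a toy Python model (again a sketch, not the template code). The class, the strategy names and the capacities of 44 and 28 are assumptions for illustration; capacity_calls simply counts how often the capacity is looked up, standing in for the cost of walking the long block chain.

```python
class ToyText:
    def __init__(self, capacity=44, compact_capacity=28):
        self.buf = [0] * capacity
        self.compact_capacity = compact_capacity
        self.shared = True
        self.capacity_calls = 0
        self.errors = []

    def capacity(self):
        self.capacity_calls += 1    # stands in for walking the block chain
        return len(self.buf)

    def make_mutable(self):
        if self.shared:             # private copy, with less padding
            self.buf = self.buf[:self.compact_capacity]
            self.shared = False

    def grow(self, new_capacity):
        self.buf.extend([0] * (new_capacity - len(self.buf)))

    def write(self, index, ch):
        self.make_mutable()
        if index >= len(self.buf):
            self.errors.append(index)
        else:
            self.buf[index] = ch


def extract_word(word, strategy):
    ctxt = ToyText()
    if strategy == "mutable_first":
        ctxt.make_mutable()         # done once, before csize is recorded
    csize = ctxt.capacity()
    for cl, ch in enumerate(list(word) + [0]):
        if strategy == "refresh_always" or (strategy == "refresh_once" and cl == 1):
            csize = ctxt.capacity()
        if cl + 1 >= csize:
            ctxt.grow(2 * cl)
            csize = ctxt.capacity()
        ctxt.write(cl, ch)
    return ctxt


WORD = "123456789012345678901234567890123"
for strategy in ("stale", "refresh_always", "refresh_once", "mutable_first"):
    result = extract_word(WORD, strategy)
    print(strategy, len(result.errors), result.capacity_calls)
```

In this model all three fixes remove the out-of-range writes; refreshing on every character costs one lookup per character, while refreshing once after the first write or making ctxt mutable up front needs only a handful.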

drpeterbatesuk · May 12, 2021, 2:35am

PS I’ve noticed that the I7 fix mentioned above also works by ensuring ctxt is kept mutable, not for the reason I originally surmised (see EDIT at end of previous post)