Lectrote ZVM : @sread text parse

auraes · January 24, 2021, 1:40pm

There’s something I don’t understand about the ZVM interpreter and the @sread instruction in version 3.
In this example, I should be able to type a maximum of 7 characters, not 9. parse->0 should still hold the maximum of 4 words, but it shows 102. The text buffer should end with 0, not 32. The ‘f’ is missing from the text buffer, but it appears in the parse buffer.
With Frotz all works fine, but not with ZVM under Lectrote.

Lectrote ZVM:

>ab cd e f
text: [8] 97 98 32 99 100 32 101 32
parse: [102][4] ab[2,1] cd[2,4] e[1,7] f[1,9]

Frotz:

>ab cd e
text: [8] 97 98 32 99 100 32 101 0
parse: [4][3] ab[2,1] cd[2,4] e[1,7]

Code

!% -v3
!% -~S
 
Constant MAX_INPUT_CHARS = 7;
Constant MAX_INPUT_WORDS = 4;

Array text->(1+MAX_INPUT_CHARS+1);
Array parse->(2+(MAX_INPUT_WORDS * 4));

Global location = Library;
Global status_field_1 = 0;
Global status_field_2 = 0;

Object Library "The Library"
	with description "You are in a library.";

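[ BeforeParsing   i w at len;
   ! Dump the text buffer: byte 0, then the text->0 bytes that follow it.
   print "text: [", text->0, "] ";
   for (i=1 : i<=text->0 : i++)
      print text->i, " ";
   ! Dump the parse buffer header: byte 0 (max words), byte 1 (words found).
   print "^parse: [", parse->0,"]", "[", parse->1,"] ";
   for (w=1 : w <= parse->1 : w++){
      ! Each word has a 4-byte block; its last two bytes hold the word's
      ! length and its position in the text buffer.
      at = parse->(4*w+1);
      len = parse->(4*w);
      for (i=0 : i< len : i++)
         print (char) text-> (at+i);
      print "[", len, ",", at, "] ";
   }
];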

[ main;
   text->0 = MAX_INPUT_CHARS+1;
   parse->0 = MAX_INPUT_WORDS;
   print "Type Q to quit.^";
   while(1) {
      print "^>";
      @sread text parse;
      if (text->1 == 'q') break;
      BeforeParsing();
      new_line;
   }
];

Dannii · January 24, 2021, 2:04pm

It’s definitely possible there are bugs in ZVM. Good test files would be welcome.

In this case I was trying to follow the spec, which means that you’re passing a length that is too long, as you’re giving the length of the buffer capacity, not subtracting one. The ‘f’ has been written into the parse array - that’s what the 102 is.

In Versions 1 to 4, byte 0 of the text-buffer should initially contain the maximum number of letters which can be typed, minus 1 (the interpreter should not accept more than this).

I’m a bit confused by this section honestly. But if your array is 9 bytes, that’s a capacity of 8. 1 byte is for the 0, leaving 7 characters that can be entered, so you should set byte 0 to 6 before calling @sread?

It would make more sense if the spec had said plus 1… This is probably a spec error, so maybe ZVM will need changing.
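
To make the 102 concrete, here is a minimal Python sketch of the two readings of byte 0. It is not ZVM or Frotz code. The flat `bytearray`, the helper names `store_line` and `fresh_memory`, and the assumption that the parse array sits directly after the 9-byte text array are all illustrative, although the adjacency is consistent with the output shown in the first post.

```python
# Sketch only: a flat bytearray stands in for Z-machine memory.
TEXT_ADDR = 0          # text array: 9 bytes (1 + 7 + 1), as in the example
PARSE_ADDR = 9         # assumption: the parse array follows immediately
MAX_INPUT_WORDS = 4

def store_line(mem, text_addr, typed, limit):
    """Store at most `limit` characters from byte 1 onward, then a 0 terminator."""
    accepted = typed[:limit].lower()
    for i, ch in enumerate(accepted):
        mem[text_addr + 1 + i] = ord(ch)
    mem[text_addr + 1 + len(accepted)] = 0
    return accepted

def fresh_memory():
    mem = bytearray(9 + 2 + 4 * MAX_INPUT_WORDS)
    mem[TEXT_ADDR] = 8                  # text->0 = MAX_INPUT_CHARS + 1
    mem[PARSE_ADDR] = MAX_INPUT_WORDS   # parse->0
    return mem

# Reading byte 0 as "maximum minus 1": accept byte0 + 1 = 9 characters.
literal = fresh_memory()
store_line(literal, TEXT_ADDR, "ab cd e f", literal[TEXT_ADDR] + 1)

# Reading byte 0 as "maximum plus 1": accept byte0 - 1 = 7 characters.
plus_one = fresh_memory()
store_line(plus_one, TEXT_ADDR, "ab cd e f", plus_one[TEXT_ADDR] - 1)

print(list(literal[1:9]), literal[PARSE_ADDR])    # text bytes 1..8, then parse->0
print(list(plus_one[1:9]), plus_one[PARSE_ADDR])
```

Under the first reading the ninth character lands on parse->0 and the terminator on parse->1, which the tokeniser then overwrites with the word count. Under the second reading the text ends in 0 and parse->0 is untouched.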

auraes · January 24, 2021, 2:50pm

It all depends, in fact, on how we interpret the documentation, which is not very clear. The number of characters +1 seems more logical to me since it corresponds to the LF which will be transformed into 0.
I get the same result with Ozmoo as with Frotz.
I have updated and completed my example.

Mike_G · January 24, 2021, 5:32pm

Yes, the standard is wrong, or at least very confusingly worded here.

In Versions 1 to 4, byte 0 of the text-buffer should initially contain the maximum number of letters which can be typed, minus 1 (the interpreter should not accept more than this).

The initial value holds the number of characters that can be typed, plus one for the zero used as a string terminator in the same style as a string in the C programming language. Versions above 4 write the actual length entered directly before the text and so do not need the zero terminator. I don’t think it is useful or accurate to think of the newline as being “transformed” into a zero because the newline (or any other terminating character in Version 5 and up) is specifically never written into the buffer.
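
As a rough illustration of the two layouts described above, the following Python sketch stores the same line both ways. The function names `store_v3` and `store_v5` are made up for this example, and the V1 to 4 version uses the "plus one" reading of byte 0 argued for here, not the literal wording of the standard.

```python
def store_v3(buf, typed):
    """V1-4 layout: byte 0 = capacity, text from byte 1, then a 0 terminator."""
    limit = max(buf[0] - 1, 0)          # leave one byte for the terminator
    accepted = typed[:limit].lower()
    buf[1:1 + len(accepted)] = accepted.encode("ascii")
    buf[1 + len(accepted)] = 0
    return accepted

def store_v5(buf, typed):
    """V5+ layout: byte 0 = max chars, byte 1 = count typed, text from byte 2."""
    accepted = typed[:buf[0]].lower()
    buf[1] = len(accepted)
    buf[2:2 + len(accepted)] = accepted.encode("ascii")
    return accepted                      # no terminator is written

v3 = bytearray(9)
v3[0] = 8                                # 7 characters plus the terminator
store_v3(v3, "ab cd e")

v5 = bytearray(9)
v5[0] = 7                                # 7 characters, length kept in byte 1
store_v5(v5, "ab cd e")

print(list(v3))
print(list(v5))
```

Both 9-byte arrays hold the same seven characters. In neither case is the newline itself stored.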

Dannii · January 24, 2021, 10:36pm

@Marvin and @cas Mind giving your thoughts on this part of the spec?

cas · January 25, 2021, 1:54am

I think it’s entirely unclear, and although I’ve implemented it as @auraes infers (“number of characters +1”) I don’t think that’s what the quoted part of the standard is saying (even if it’s what it intends to say). In fact, I think the standard is effectively saying that byte 0 should be read and have 1 added to it, and that is the number of characters to be read. Which is at odds with how I actually implemented it.

But the standard also says that the buffer is, in Inform terminology, a string array of length n (where n is the value of byte 0). That means there are n bytes in the buffer following byte 0. Since a zero byte has to be written in those bytes, you can only read (n - 1) characters. So… n is the total number of characters you can read, plus one. Which means when you write code, you do something like uint8_t maxchars = byte(text) - 1, which may be how the confusing wording got into the standard.

I think the first part (“minus 1”) is poorly/incorrectly worded, and the “long” explanation following is correct. It makes sense from a programming perspective (e.g. like with snprintf where the buffer size itself is provided).
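
The arithmetic in the "long" explanation can be written down as a small Python sketch. The helper names `text_buffer_plan` and `max_typed_from` are hypothetical, and the numbers follow the "buffer size, like snprintf" reading rather than the literal "minus 1" wording.

```python
def text_buffer_plan(max_typed):
    """Return (array size in bytes, initial value of byte 0) for V1-4."""
    byte0 = max_typed + 1        # typed characters plus the 0 terminator
    array_bytes = 1 + byte0      # byte 0 itself, then byte0 bytes of storage
    return array_bytes, byte0

def max_typed_from(byte0):
    """What an interpreter computes on its side: n - 1 characters fit."""
    return byte0 - 1

print(text_buffer_plan(7))
print(max_typed_from(8))
```

For the original example this gives a 9-byte array with byte 0 set to 8, which is what `Array text->(1+MAX_INPUT_CHARS+1)` and `text->0 = MAX_INPUT_CHARS+1` produce.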

Mike_G · January 25, 2021, 3:46am

Curiouser and curiouser…

For versions less than 5, I have always implemented this as the first byte (byte 0) of the buffer being equal to max+1, where max is the maximum number of input characters allowed. This allows for the length of the input including the terminating zero.

Looking at the source of Frotz reveals that it also behaves this way.

Edit: Disregard the following paragraph. The terminating zero cannot be written past the maximum length. My initial belief and the behavior of Frotz appear to be correct.
However, now having checked several early Infocom interpreters, this is clearly not the case. The first byte is exactly equal to max, and the terminating zero is written past the maximum length if need be. Needless to say, I have not checked all the possible interpreters.

Is this a case where some interpreters do one thing and some another, and the standard drew a compromise? Or was the behavior of those early interpreters a bug which the standard sought to correct? Or is the standard in error? The last seems most likely to me. I don’t know, but something is definitely odd here.

auraes · January 25, 2021, 4:28am

I tried the Sorcerer interpreter under DOS, which does an interesting thing: for a value of 8 in text->0, I can indeed type 8 characters (a-h) that appear on the screen, but I can’t validate my input with Enter. I have to delete a character (h), and then I can press Enter.

Mike_G · January 25, 2021, 4:46am

Ah, that is actually what I see too. The enter key won’t register if the buffer is completely full, so the interpreter will never write past the initial length. When running Zork Release 88 in DOS, the value in byte 0 is 0x64 (100 decimal) during the first read. That is too large to test, because input won’t go past a single line, but editing the initial value to a smaller number before running produces the behavior you describe.

Mike_G · January 25, 2021, 5:01am

So it would seem the number of characters that may be typed, including the enter key, is equal to the value in byte 0, but this leads to the possible case of not being able to press enter because the buffer is full. If we say the interpreter can only accept one less than this, then the issue of having to delete before pressing enter is avoided. This is probably what the spec was trying to accomplish, but got the wording wrong.

russotto · January 26, 2021, 1:46am

A 1984 version of the Infocom spec says this:

Reads and parses a line of input. Table1 is the buffer used
to store the characters read. The first byte (read-only) of
this table contains the length of the rest of the buffer
where the input string is stored.

READ reads text until the buffer is full or until it
encounters a newline character.

The V4 spec adds

READ reads text until it encounters a newline character.
If the buffer is full, the CORRECT action would be to ring the
bell when additional characters are typed. Other actions (like
an assumed newline) are considered inferior implementations and
should be avoided where possible.

The zero terminator does not seem to be mentioned, but clearly it’s there.

If ‘n’ is the value in byte 0, only ‘n’-1 characters are accepted, and the buffer has to be n+1 characters long. Whether the interpreter accepts ‘n’ characters and then refuses all further input, or accepts ‘n’ - 1 characters and then refuses anything but a terminating character, matters only in V4, when the interrupt routine may see the input before termination.
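
The two acceptance policies can be compared with a small keystroke simulation. This is only a sketch: `read_line`, its `strict` flag, and the `refused` counter are invented for illustration, and it does not claim to reproduce what any particular interpreter does on a refused key (bell or otherwise).

```python
BACKSPACE = "\b"
ENTER = "\n"

def read_line(keys, byte0, strict):
    """Return (stored text or None, number of refused keystrokes).

    strict=True : accept byte0 - 1 characters, then only a terminator.
    strict=False: accept byte0 characters, but refuse Enter while full.
    """
    limit = byte0 - 1 if strict else byte0
    line = []
    refused = 0
    for key in keys:
        if key == ENTER:
            if len(line) <= byte0 - 1:      # room left for the 0 terminator
                return "".join(line), refused
            refused += 1
        elif key == BACKSPACE:
            if line:
                line.pop()
        elif len(line) < limit:
            line.append(key)
        else:
            refused += 1
    return None, refused                    # input was never terminated

print(read_line("abcdefgh\n", 8, strict=True))
print(read_line("abcdefgh\n", 8, strict=False))
print(read_line("abcdefgh\n\b\n", 8, strict=False))
```

With byte 0 set to 8, both policies end up storing at most seven characters. The second policy only gets there after the eighth character is deleted, which matches the behaviour reported for the DOS interpreter earlier in the thread.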

Dannii · January 26, 2021, 2:07am

Posted an issue to the spec GitHub repo:

I’ll look at fixing ZVM, shouldn’t be too complex.

Mike_G · January 26, 2021, 4:11am

Nice observation. I hadn’t realized the possibility that this difference could be observed by the game.

Dannii · February 11, 2021, 3:14am

Fixed the bug and updated the pull request for Lectrote.

Dannii · March 21, 2021, 2:49am

This should be fixed in Lectrote 1.4.