_read with no parse buffer passed?

ChristopherDrum · December 8, 2024, 8:20am

I’m sure I’m just missing something in the documentation or misunderstanding something about how instructions are parsed.

I’m updating my interpreter for z5 support and running test programs against it. So far, so good with Praxix and czech (both passing 100% for supported features), but not so with etude.z5. It can’t even get to the initial input cursor without crashing.

I ran txd.exe against the .z5 file and I’m looking at the etude source code.

I’m getting tripped up on the var instruction read.
In the txd output, I see these instructions:

 31c5:  e2 17 05 9a 00 4d       STOREB          #059a,#00,#4d
 31cb:  e2 17 05 9a 01 00       STOREB          #059a,#01,#00
 31d1:  cd 4f 02 05 9a          STORE           L01,#059a
 31d6:  0d 04 00                STORE           L03,#00
 31d9:  e4 bf 02 04             READ            L01 -> L03

In my parsing of the instruction, bf is 10 11 11 11: the first pair, 10, tells me to fetch the next byte as a var operand, and the next pair, 11, tells me to stop fetching operands. Yet it looks to me like I should also be grabbing the next byte, 04.
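The operand-type decoding described above can be sketched as follows. This is a minimal illustration, not any particular interpreter's code: the function and variable names are hypothetical. It assumes the standard VAR-form layout (four 2-bit fields, where 11 means "omitted") and that read is a store instruction in V5, which is consistent with txd printing `-> L03`. Under that reading, the trailing 04 is the store-variable byte rather than a second operand.

```python
# Sketch only: function and variable names here are hypothetical.
OPERAND_SIZES = {"large": 2, "small": 1, "variable": 1}  # bytes per operand

def decode_var_operand_types(type_byte):
    """Return the operand kinds encoded in a VAR-form type byte."""
    kinds = {0b00: "large", 0b01: "small", 0b10: "variable"}
    operands = []
    for shift in (6, 4, 2, 0):           # four 2-bit fields, high bits first
        field = (type_byte >> shift) & 0b11
        if field == 0b11:                # 11 = omitted; no operands follow
            break
        operands.append(kinds[field])
    return operands

# Bytes at 31d9 in the listing: e4 bf 02 04
instruction = bytes([0xE4, 0xBF, 0x02, 0x04])
types = decode_var_operand_types(instruction[1])
operand_length = sum(OPERAND_SIZES[t] for t in types)
operand_bytes = instruction[2:2 + operand_length]
# Assumes a store opcode (as read is in V5): the next byte names the result variable.
store_variable = instruction[2 + operand_length]

print(types, operand_bytes.hex(), hex(store_variable))
```

The same helper applied to the type byte 17 from the STOREB lines yields one large and two small operands, matching `#059a,#00,#4d`.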

In the source, this seems to coincide with these instructions

    print "^> ";
    inbuf->0 = (INBUFSIZE-3);
    inbuf->1 = 0;
    inbufvar = inbuf;
    ix = 0;
    @aread inbufvar ix;

This appears to be passing 0 as the second parameter to @aread, which… explains why the decoded instruction only had me read one var?
Ultimately, this means that read only receives one parameter: the text-buffer (baddr1).

I don’t see anything in the z-machine docs (neither Nelson nor Klooster) that says what to do to handle this case. Klooster writes:

Finally, if baddr2 is not zero in V5+,
the sequence stored in the buffer is tokenised,
just as if a tokenise baddr1 baddr2 instruction was used.

OK, but what if baddr2 IS zero? Or have I misunderstood how var instructions are processed in some subtle way? I’ve played dozens and dozens of games and never encountered this issue, but maybe z5 has a chance of it occurring? At any rate, etude triggers it so I want to tighten things up and claim victory over this bug.

SomeOne2 · December 8, 2024, 8:59am

Ok, I honestly have almost no clue what you are asking, but I think someone else might have more of an idea. If READ gets one parameter, or essentially only one parameter, it only writes the per-letter array to the first variable. If it has the second argument, the word array is written to that second argument. Otherwise, nothing happens to that second array. If that makes sense…

ChristopherDrum · December 8, 2024, 9:02am

I think it finally dawned on me when I took a break just now. I’m thinking too much in pre-z5 ways, where the read and parse always happen together. Presumably in z5 we might be asked to ONLY read and just hold that text there in the text buffer, then later we’ll be explicitly requested to tokenise that text. I need to think of that process as a series of smaller, explicit stages that I will be asked to perform.

SomeOne2 · December 8, 2024, 9:03am

Yeah, that does happen. In fact, my current WIP has that method. I never tokenise in the READ instruction, because I have a bunch of different things I do to complete a prompt before I then use LEX to tokenise.

ChristopherDrum · December 8, 2024, 9:19am

My target platform has really strict limitations on the length of the code, so I was trying to consolidate things in the past to buy room for z5 support expansion later. This particular way of doing things was not smart, in the long run. Time to refactor!

Dannii · December 8, 2024, 9:40am

You’ve missed this part of the standard:

Next, lexical analysis is performed on the text (except that in Versions 5 and later, if parse-buffer is zero then this is omitted).
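A minimal sketch of how a V5 read handler might follow that rule. Everything here is an assumption for illustration: `read_v5`, the `tokenise` callback, and the buffer addresses are hypothetical names and values, memory is a plain `bytearray`, and timed input and preloaded text are ignored. It assumes the V5 text-buffer layout (byte 0 holds the maximum length, byte 1 the typed length, text from byte 2) and returns 13 as the terminating character.

```python
# Sketch only: names, addresses, and the tokenise callback are hypothetical.
def read_v5(memory, operands, line, tokenise):
    """Store a typed line in the text buffer; tokenise only if asked to."""
    text_addr = operands[0]
    # A missing second operand is treated the same as an explicit zero.
    parse_addr = operands[1] if len(operands) > 1 else 0

    max_chars = memory[text_addr]                 # byte 0: capacity
    chars = line.lower().encode("ascii", "replace")[:max_chars]
    memory[text_addr + 1] = len(chars)            # byte 1: characters typed
    memory[text_addr + 2:text_addr + 2 + len(chars)] = chars

    if parse_addr != 0:                           # zero means: skip lexical analysis
        tokenise(memory, text_addr, parse_addr)
    return 13                                     # terminating character (newline)

memory = bytearray(0x100)
memory[0x10] = 20                                 # text buffer at 0x10, room for 20 chars
calls = []
record = lambda mem, text, parse: calls.append((text, parse))

result_without_parse = read_v5(memory, [0x10], "Look", record)
calls_after_first = list(calls)
result_with_parse = read_v5(memory, [0x10, 0x80], "Look", record)
```

Keeping `tokenise` as a separate callable means the same routine can serve a later, explicit tokenise instruction on the same text buffer.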

ChristopherDrum · December 8, 2024, 10:04pm

Yep, you’re right. I didn’t catch that parenthetical (!), yet vitally important, piece. Sometimes I find the mixing of discussion about version differences confusing to follow.