Dialog Wishlist

I still think you should consider adding a memory stream feature; it could prove so useful. Now, I know that you’ve been holding off because implementing it in the 8-bit interpreters is hard, but if it can be tested for, isn’t that okay? It makes sense to me to plan ahead and include it in the 1.0 format even if it can’t be implemented in all the interpreters yet. Adding it after 1.0 would be a harder leap.

The problem is less the 8-bit interpreters and more the lack of any string or character array data type. It’s possible to redirect output into a list of dictionary words, which is close to the same thing but doesn’t preserve case and spacing, and Dialog has no data type that can.

I really like this logic, honestly! I’m now inclined to make a new repository (either personal or part of the organization) that holds the bash script and the Unicode library, and have the manual point to that.

The fun thing about Dialog is that it actually does this already! Even the C64 interpreter will use curly quotes by default. The only platform that might not is the Z-machine, and that one will use @check_unicode to test for support, use curly quotes if supported, and straight quotes if not.

But…it only does this if the author specifies in the source which quotes are opening and which are closing, and a lot of people don’t. Like TeX, it doesn’t want to assume that in case it gets it wrong.
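As a rough illustration of that fallback, here’s a Python sketch (not Dialog’s actual implementation). The names `pick_quotes` and `can_print` are made up; `can_print` is just a stand-in for whatever test the platform offers, such as @check_unicode on the Z-machine.

```python
OPEN_CURLY = "\u201c"   # left double quotation mark
CLOSE_CURLY = "\u201d"  # right double quotation mark
STRAIGHT = '"'

def pick_quotes(can_print):
    """Return (opening, closing) quote characters.

    can_print is a hypothetical callback: it takes one character and
    returns True if the interpreter is able to display it.
    """
    if can_print(OPEN_CURLY) and can_print(CLOSE_CURLY):
        return OPEN_CURLY, CLOSE_CURLY
    # Fall back to straight quotes when either curly quote is unsupported.
    return STRAIGHT, STRAIGHT
```

The point is only that the decision happens at output time, so the source can mark opening and closing quotes without committing to a particular glyph.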

Also (and then I’ll stop spamming posts in this channel for a while), there’s a poll elsewhere about default colors for the Å-machine. Dialog authors may want to weigh in!

Per forum consensus, the Å-machine opcodes will be rearranged in version 1.0:

  • LEAVE_STATUS will be changed to $EF, matching ENTER_STATUS at $6F
  • SET_BODY will be changed to $67, which was previously ENTER_STATUS_0

I’ve updated the interpreters accordingly. They pass the basic tests in the repository, but more testing is always welcome!

Not that I have used Dialog, but my instinct as a programmer is to never ever ever use unicode in source. You never know what’s going to support it and what’s going to choke on it, even today. It’s gotten to the point where straight quotes inherently look better to me than curly quotes, because curly quotes look dangerous.

I feel like curly quotes should always be a last-possible-second switch from straight quotes, just barely before the user sees it. Unless that user is me, in which case I still want the straight quotes :wink:

But why should we be forever hobbled by the limitations of ASCII? UTF-8 is older than I am, and over 99% of web traffic uses it by now! Throw off the chains of one-byte Anglocentric character encodings and join me in the world of ideographic descriptors and undeciphered Phaistos Disk symbols!

(Plus, quote marks have no special meaning in Dialog. There’s no string data type, and the character data type is marked by @ instead. So ASCII straight quotes and Unicode curly quotes are equally meaningless to the compiler.)

In seriousness, earlier versions of Dialog favored ASCII in some ways, but I’ve been working hard to get rid of that. My goal is that, apart from a handful of punctuation marks used by the language’s syntax, all other BMP characters should be treated equally. 1a/01 took some significant steps in that direction (allowing them in dictionary words on Z-machine); 1b/01 goes further (no longer requiring the basic Latin alphabet to be in the character set on Z-machine at all). The Å-machine is going to take more work, but I have faith that one day we’ll get there. We can’t let ourselves be restricted to 96 basic characters forever!

(Why BMP only? Because architecturally the Z-machine can’t handle non-BMP characters. I’d like to change that, but there’s been no real enthusiasm for a new version of the Z-machine spec in the last decade or so. Alas.)

Meanwhile, I am trying so hard not to propose any more changes until 1b/01 is out the door. I have a new piece of syntax I want in my current Dialog projects, and a way to enhance the Å-machine’s text encoding…

But no. One thing at a time. New syntax in 1c/01.

I missed the TALJ deadline, but I’m sticking to my promise to not introduce any new features until 1b/01 is out the door. I’ve pinged some people to look at the last few pull requests, and if those meet with approval, you can expect 1b/01 to launch this upcoming week!

I’ve forked this off into its own repository. The main repo now links it as an example in the chapter on modifying the standard library.

I think everything is now in place for the release of 1b/01 / 1.1.1! If anyone objects to this, speak now (like in the next 24h or so) or forever hold your peace!

(Also Å-machine 1.0.0.)

…which reminds me, the library version should be bumped to 1.2.0, because it now uses a feature introduced in 1b/01. It checks (transcript active) to print an appropriate message when you > SCRIPT ON or > *COMMENT. This is why waiting a day is a good idea!

(The idea, as I understand it, is that you can use a 1.1 library with a 1b compiler, but can’t use a 1.2 library with a 1a compiler. So whenever the library requires a new language feature, the library minor version gets set to the compiler minor version. Otherwise, changes to the library only bump the patch version.)
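Here’s how I’d sketch that rule in Python, purely as an illustration. The function names are invented, and the letter-to-number mapping (1a corresponds to minor version 1, 1b to minor version 2) is inferred from the examples in this thread rather than taken from any official document. Major versions are ignored for simplicity.

```python
def compiler_minor(compiler_version):
    """Map a compiler version like '1b/01' to a minor number (a=1, b=2, ...).

    Assumption: the letter after the major number is the minor version.
    """
    release = compiler_version.split("/")[0]   # '1b'
    letter = release[-1]                       # 'b'
    return ord(letter) - ord("a") + 1

def library_supported(library_version, compiler_version):
    """True if a library like '1.2.0' can be built by the given compiler."""
    library_minor = int(library_version.split(".")[1])
    # A library may be older than the compiler, but not newer.
    return library_minor <= compiler_minor(compiler_version)
```

So `library_supported("1.1.1", "1b/01")` holds, while `library_supported("1.2.0", "1a/01")` does not.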

Kansei ja! (It’s finished!)

So, what’s next on the roadmap? Not a ton; most of the features on the agenda are in 1b/01. But these are the issues I’m currently hoping to tackle in the next month or two.

My top priority is a new feature that I’ve been longing for since Miss Gosling, and I’ll elaborate on it in another post. I’ve needed to come up with workarounds for it on three separate occasions, so at this point I think it should be part of the language proper. (It’s also really easy to implement.)

Beyond that, there are a few separate issues with how divs and status bars are handled on the Z-machine. These are going to be a headache to deal with, but they need to be fixed at some point. In particular, unsupported inline status bars don’t produce a line break (when it seems like spec-wise they should), and the backend only tracks the number of active spans, not the number of active divs. This causes problems with operations that are illegal inside a div.

There’s also an odd bug with very long literal lists of lists. Dialog chokes if these literals appear in rule heads, but not in rule bodies, which suggests that something is messed up in the initial-value-compiling step. This requires tinkering with parts of the compiler I haven’t worked with before, but it’s clearly a bug, not just an enhancement, so it gets a bump up in priority.

Then, there are a couple things that require changing a bunch of data types. Not hard, just tedious. The debugger only supports 3-bit color, not 24-bit color, but that can be fixed by changing a bunch of uint8_ts to int32_ts. And the compiler’s Unicode handling only supports the BMP, even though the Å-machine can handle astral (non-BMP) characters just fine. Just gotta replace uint16_ts with uint32_ts, and add error messages if astral characters are used on Z-machine.
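To show what “astral” means in practice, here’s a small Python sketch. It’s only a model: the real compiler is written in C, and these function names are hypothetical. It shows which characters a Z-machine error message would need to catch, and what silently goes wrong if a code point is squeezed into 16 bits.

```python
BMP_MAX = 0xFFFF  # highest code point in the Basic Multilingual Plane

def astral_characters(text):
    """Return the characters in text that lie outside the BMP."""
    return [ch for ch in text if ord(ch) > BMP_MAX]

def truncate_to_16_bits(code_point):
    """What a 16-bit field would keep of a code point (the low 16 bits)."""
    return code_point & 0xFFFF

# U+101D0 is the first character of the Phaistos Disc block.
sample = "a\U000101D0b"
```

With this, `astral_characters(sample)` returns just the Phaistos Disc sign, and `truncate_to_16_bits(0x101D0)` gives `0x01D0`, an unrelated BMP character, which is why a wider type plus an explicit error is the safer route.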

On the Å-machine side, I’d like to make a few more improvements to the web terp. Right now, specifying both “old-style” and “new-style” colors in a style class will default to the old ones. I’d prefer it default to the new, since they’re more flexible. I’m working on supporting IFComp telemetry recording, but that’s on hold until the IFComp people get back to me (hopefully soon). And of course, I’ve made some accessibility improvements in Å-machine 1.0.0, but only time will tell how well they work. I fully expect those will need to be adjusted once we have real-world data on them.

For the 6502 terp, there are a couple things I’d like to improve, but I’m not confident in my ability to do so. One is supporting display:none divs, just for feature parity with Z-machine. Others have also suggested taking cues from Ozmoo and supporting link navigation. These would be nice to have, but they’re low-priority right now because the effort they’d take is very large compared to the potential benefit.

Finally, there’s a more aspirational change to the Å-machine spec I’d like to make. Currently, the Å-machine uses a game-specific eight-bit character encoding. $00-$1F are control characters, $20-$7E are ASCII, $7F is reserved as an escape code, and $80-$FF are defined by the story file. (We could perhaps consider $20 a control code as well, because it’s needed for the spacing mechanism.)

That’s generally enough for European languages, but not enough for Japanese. I’d like to update the spec to say that if the story file’s character table includes more than 128 entries, ASCII is removed to make room. This will unlock 94 extra codepoints, which should be comfortably enough for Japanese if not for Chinese. This won’t affect existing story files, since none of them have more than 128 entries in the table. But since it’s a significant change to the spec, I want to discuss it with the community first and talk through the benefits and drawbacks. I’ll make a proper post about this in the Technical Development > Specifications category sometime later.
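To make the arithmetic concrete, here’s a small Python model of the byte layout described above. All names are invented for illustration, and it assumes $20 stays reserved for the spacing mechanism, which is where the figure of 94 comes from ($21 through $7E).

```python
# Bytes that would be freed if ASCII were dropped. Assumption: $20 (space)
# stays reserved because the spacing mechanism needs it.
REASSIGNABLE = range(0x21, 0x7F)     # $21..$7E inclusive
STORY_DEFINED = range(0x80, 0x100)   # $80..$FF inclusive

def classify(byte):
    """Classify one byte under the current (non-extended) layout."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("not an eight-bit value")
    if byte >= 0x80:
        return "story-defined"
    if byte == 0x7F:
        return "escape"
    if byte >= 0x20:
        return "ascii"
    return "control"

def table_capacity(extended):
    """Number of story-defined characters available to one story file."""
    extra = len(REASSIGNABLE) if extended else 0
    return len(STORY_DEFINED) + extra
```

Under these assumptions, a story file gets 128 slots today and 222 with the extension.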

(In the long term, I also want to update the Å-machine spec to support a 32-bit word size. This will remove basically all the current limits on it. But that’s also a huge change that’s going to take a lot of work—more than I can do alone. So it’s probably not going to be in 1c/01.)

So yeah! That’s my current vision for the future of Dialog and the Å-machine. Like it? Dislike it? Got a different feature you’re hungering for? Post them here; that’s what this thread’s for!

So, what’s this new feature I want?

Well, imagine some code like this.

(give the tutorial)
	(div @tutorial) {
		Useful actions might include: (line)
		(exhaust) { *(useful action) (line) }
	}

(useful action)
	(current room $Room) *($Obj is #in $Room) (item $Obj)
	TAKE something from the ground

(useful action)
	*($Item is #heldby #player) ~($Item = #compass)
	DROP something from your inventory

(useful action)
	*($Item is #heldby #player) ~($Item = #compass)
	(current room $Room) *($Obj is #in $Room) (supporter $Obj)
	PUT something ON something

Seems straightforward enough, right? Until…

Useful actions might include:
TAKE something from the ground
TAKE something from the ground
DROP something from your inventory
DROP something from your inventory
DROP something from your inventory
PUT something ON something
PUT something ON something
PUT something ON something
PUT something ON something
PUT something ON something
PUT something ON something

Oops! Each of those multi-queries created a choice point, and the (exhaust) ran through all of those choice points multiple times. This is the trap I fell into with Miss Gosling.
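If it helps to see the counting, here’s a loose Python model of the same thing, with generators standing in for multi-queries and a plain loop standing in for (exhaust). This is an analogy, not how Dialog is implemented, and the object names in the example world are made up.

```python
def useful_actions(room_items, held, supporters):
    """Yield one line per solution, the way (exhaust) would visit them."""
    # *($Obj is #in $Room) (item $Obj): one solution per item on the ground.
    for _obj in room_items:
        yield "TAKE something from the ground"
    # *($Item is #heldby #player) ~($Item = #compass)
    for item in held:
        if item != "compass":
            yield "DROP something from your inventory"
    # Two multi-queries in a row: solutions multiply.
    for item in held:
        if item != "compass":
            for _obj in supporters:
                yield "PUT something ON something"

def give_tutorial(room_items, held, supporters):
    lines = ["Useful actions might include:"]
    lines.extend(useful_actions(room_items, held, supporters))
    return lines

lines = give_tutorial(
    room_items=["coin", "key"],
    held=["compass", "map", "pen", "rope"],
    supporters=["table", "shelf"],
)
```

Two items on the ground, three droppable items, and two supporters give 2 + 3 + 3×2 lines, which is exactly the pattern in the transcript above.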

It’s not limited to (exhaust), though. This is a problem that Linus noticed in early Dialog, which is why (if) exists. The structure (if) COND (then) TRUE (else) FALSE (endif) is almost equivalent to { COND TRUE (or) FALSE }. But as described in the manual:

There are subtle differences between the if-statement above and the disjunction shown earlier: An if-condition is evaluated at most once, even if it creates choice points. […] In the disjunction-based version of the rule, there are several lingering choice points, so if a failure is encountered further down in the rule (or even in the calling rule, if this was a multi-query), then execution might resume somewhere in the middle of this code, printing half-sentences as a result.

This means you can use an (if) block to throw away choice points:

(if)
    ARBITRARY CODE
    ARBITRARY CODE
(then) (else) (fail) (endif)

And I used this extensively in the Scott Adams transpiler when I didn’t want to leave choice points around:

	(if)
		%% SECURITY BADGE
		(here #picture-of-me-stamped-security)
		
		(now) (#picture-of-me-stamped-security is at #grey-room)
		(go to #sitting)
		There's a Bright flash & I hear something fall to the floor. (line)
		I can't see what it is from here though. (line)
		 (line)
	(then) (endif)

But it’s clearly a hack, and it’s not obvious at a glance why I’m doing this. Why is the (then) branch empty?

You can also do this:

(style class @default)

(span @default) { ARBITRARY CODE ARBITRARY CODE }

Divs and spans can succeed at most once; after the div or span exits, any choice points left inside are discarded. (This makes sure that all the reset-the-style code doesn’t run multiple times.)

But again, this is clearly a hack. The purpose of divs and spans is to style the document, and it’s not at all obvious that removing this seemingly-pointless span will change the behavior of the code.

The best solution currently, and the one that I went with in Miss Gosling, is to make additional predicates.

(useful action)
	(droppable item in inventory and supporter in room)
	PUT something ON something

(droppable item in inventory and supporter in room)
	*($Item is #heldby #player) ~($Item = #compass)
	(current room $Room) *($Obj is #in $Room) (supporter $Obj)

A simple query (that is, not a multi-query) is the easiest way to get rid of choice points. But this leads to a lot of extra predicates, and it moves the code away from where it logically belongs. In my opinion, this works, but it’s not elegant.

So, what can we do about this?

In 1c/01, I would like to add an additional bit of syntax:

(at most once) { ARBITRARY CODE ARBITRARY CODE }

This would use the same mechanism currently used for (if), (div $), and (span $) to discard all choice points from the block once it exits. (Specifically, it would store the value of CHO (the choice point register) to a temporary variable before starting the block, then restore it afterward.) That mechanism has been around for a long time already and is well-tested; this would just give authors a clearer way of accessing it.
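As a loose analogy in Python (again with generators standing in for multi-queries, and not a description of the real backend), “discard the remaining choice points” amounts to taking only the first solution and never resuming the rest. The helper name mirrors the proposed syntax, and the example world is made up.

```python
from itertools import islice

def at_most_once(solutions):
    """Keep only the first solution; later alternatives are never resumed."""
    return islice(solutions, 1)

def droppable(held):
    return (item for item in held if item != "compass")

def put_conditions(held, supporters):
    for item in droppable(held):
        for obj in supporters:
            yield (item, obj)

def tutorial_lines(room_items, held, supporters):
    lines = []
    for _ in at_most_once(iter(room_items)):
        lines.append("TAKE something from the ground")
    for _ in at_most_once(droppable(held)):
        lines.append("DROP something from your inventory")
    for _ in at_most_once(put_conditions(held, supporters)):
        lines.append("PUT something ON something")
    return lines
```

With two items on the ground, three droppable items, and two supporters, this yields each suggestion exactly once; if there are no supporters, the PUT line simply doesn’t appear. In Dialog the block could wrap the printed text as well as the conditions, which this model doesn’t try to capture.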

I have a significant desire for this feature, so I’m specifically not going to make a poll for it; even if the forum is indifferent, it would be very helpful to me in multiple contexts. But if you have strong objections to this, please do let me know! And if you don’t like the (at most once) name, also let me know on that front; I don’t care too much about the exact syntax. I just want it to be clearer at a glance than (span @default).

if someone is willing to plumb the depths of the 6502 interpreter can i again throw this bug in the ring?

it’s like the text is first being blanked and then redrawn. i kind of feel like i’m playing ‘duck hunt’ again on an old NES.

having said this, though, i realize that, big picture, this is low priority and likely not worth the 6502 assembly dive.

I’ll put it on the issue list for the Å-machine repository, but my ability to debug uncommented 6502 assembly is unfortunately very very low. So while I’d love to see it fixed, it will probably not be done by me!

understandable. and it may not even be a bug but just the way linus chose to set up the text output.

I’ve logged it, along with a pointer to where I think the issue might be. Someone with proper C64 experience might know what would cause the screen to flash, and thus might be able to find the culprit in that file.