Displaying the character ö directly vs. using a unicode number

If I paste the character ö directly into the IDE, it displays fine in the interpreters I’ve tried. (I don’t know what it looks like in an interpreter that doesn’t support Unicode.)

Is there any advantage to writing something like

say unicode #

(whatever the number is, I don’t know) for this character rather than pasting in the character directly? Does it make any difference to an interpreter’s ability to display it?
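
For anyone wondering what the number actually is, here is a small Python sketch (Python is just a convenient calculator here, not part of Inform) that looks up the code point of ö. If I read Writing with Inform correctly, the decimal value is what goes into a text substitution such as "[unicode 246]". The variable names are only illustrative.

```python
import unicodedata

# The precomposed character as it would be pasted into source text.
ch = "ö"

# Decimal code point, which is the form a decimal "unicode N" reference needs.
code_point = ord(ch)

# Official Unicode name, handy for double-checking the lookup.
name = unicodedata.name(ch)

# Some editors store ö as "o" plus a combining diaeresis (two code points).
# NFC normalization folds that back into the single precomposed character.
decomposed = "o\u0308"
recomposed = unicodedata.normalize("NFC", decomposed)

print(code_point, hex(code_point), name)
print(len(decomposed), len(recomposed), recomposed == ch)
```

The decomposed form is worth checking for: it looks identical on screen but is two characters, so a single code point number would not describe it.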

A few times when I’ve cut and pasted code from a git repo, this character has gotten messed up, but I don’t know if using the character directly has any other effects.
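
One common way this kind of damage happens (a guess on my part, since the post doesn't say exactly what the garbled text looked like) is that UTF-8 bytes get read back as Latin-1 somewhere along the way. A minimal Python sketch of that failure and its reversal:

```python
original = "ö"

# UTF-8 stores ö as two bytes.
utf8_bytes = original.encode("utf-8")

# If a tool wrongly treats those bytes as Latin-1, each byte becomes
# its own character, which is the classic two-character garble.
garbled = utf8_bytes.decode("latin-1")

# As long as nothing else touched the text, the mistake is reversible.
repaired = garbled.encode("latin-1").decode("utf-8")

print(utf8_bytes, garbled, repaired)
```

A numeric reference in the source would sidestep this, since the file then contains only ASCII and there is nothing to misdecode.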

As I interpret Writing With Inform, there’s no difference beyond not having to type the character into the source code, which lets the file stay as ASCII.

No difference as far as I know. Depending on I6 details it may compile to a slightly different Z-machine or Glulx instruction (“print this numbered Unicode character” instead of “print this string”), but that shouldn’t have any significant effects.

The one significant difference is that on Z-machine, there’s a limit to the number of unique extended characters you can print via strings, whereas there’s no limit to the number you can print with @print_unicode. But in practice, it’s hard to run into that limit.

True! But Inform 7 never changes the default ZSCII mapping; ö will always be recognized in input, ō never.

…or at least not until current development Inform, which will cheerfully accept even emoji as input, gets here.

On Z-machine, though? I thought the new Unicode parser was going to be Glulx-only.

Yup, sorry, totally Glulx-only.

Thanks, everybody!

Unicode characters in the source aren’t displayed correctly if you create a website with source. (It shows the low byte.)

Ooh, that’s a good catch.

Is that an issue with Inform or just an issue with the template?

It could be just that the created HTML file doesn’t properly declare its encoding…

I thought of that, but it really is only showing the low byte. I checked the hex dump.

The HTML file has <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />, anyhow.
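
To illustrate what "only the low byte" would mean in a page declared as UTF-8, here is a rough Python sketch. It is not Inform's actual export code; the helper names are made up, and it simply assumes the writer keeps the low 8 bits of each code point and emits them as a single raw byte.

```python
def low_byte_char(ch: str) -> str:
    """Keep only the low 8 bits of a character's code point."""
    return chr(ord(ch) & 0xFF)


def as_seen_in_utf8_page(ch: str) -> str:
    """Emit the low byte raw, then decode it the way a UTF-8 page would."""
    raw = bytes([ord(ch) & 0xFF])
    return raw.decode("utf-8", errors="replace")


# What a correct UTF-8 file would contain for ö: two bytes.
print("ö".encode("utf-8"))

# A lone 0xF6 byte is not valid UTF-8, so it decodes to the replacement character.
print(as_seen_in_utf8_page("ö") == "\ufffd")

# For code points above 255 the truncation yields an unrelated ASCII letter.
print(low_byte_char("ō"))
```

Under that assumption, the charset declaration being correct is exactly why the result looks wrong: the declaration promises UTF-8, but a single byte 0xF6 is not a valid UTF-8 sequence.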
