Experimental "Unicode Parser" extension

(I’ve been threatening to do this in a couple of threads recently.)

Here is an extension which experimentally upgrades I7’s internal command buffers to be Unicode-aware.


I say experimental because I’ve barely tested it. I can see some places in the parser that almost certainly don’t work with the new format yet. Oh, look, I broke disambiguation too.

With this extension, defining nouns and verbs that include Unicode characters becomes possible. But it’s not yet easy, because the I7 language doesn’t realize that it’s possible and throws errors at you. (You can’t just say “Understand ‘βράχος’ as the rock.”) You have to do an end-run around I7 and define your synonyms in I6.

See the documentation for the gory details.

Cool, thanks zarf, so does this mean (assuming the bugs are worked out) that I will now be able to type em dashes directly into string variables [EDIT: er… I meant constants] and it will work in every interpreter, without typing “[unicode 8212]” as suggested in this thread?

What? Are you trying to obsolete my Japanese work before I’m even done with it? :stuck_out_tongue:

No, this does nothing to change the way strings are printed, and nothing to change the way I7 parses source code.
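
(For anyone wondering about the number: the 8212 in “[unicode 8212]” is simply the decimal code point of the em dash. A quick way to check that outside Inform is sketched below in Python; it is purely illustrative, is not part of the extension, and the variable names are made up for the example.)

```python
import unicodedata

# 8212 is the decimal value used in the "[unicode 8212]" substitution quoted above.
code_point = 8212

ch = chr(code_point)                 # the character itself
hex_label = f"U+{code_point:04X}"    # the usual hexadecimal notation
name = unicodedata.name(ch)          # official Unicode character name

print(hex_label, name)  # U+2014 EM DASH
```

The same lookup works for any other character you need to print by number.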

After a lot of iterative testing, I am declaring this ready for (cautious) real-world use.


A few things still don’t work, but they’re mostly things that cannot be made to work without changing the I7 compiler. See the “Caveats” section. As far as 6G60 goes, this is usable.

…and now I have fixed the bug that was making it not usable. :confused:

I think we’ve all had days like that.