As I’m working on both internationalizing the Dialog compiler (making it better able to handle non-ASCII characters) and improving the lexer (making it better able to handle weird things in the source code), I’m curious about the $ character it uses for variables.
Is it difficult to type on non-QWERTY keyboards? If so, what are the common alternatives that are easier to type? With the way the lexer is structured, I probably can’t support any alternatives above U+07FF, but I could probably make £ and ¤ work as alternatives, for example (€ is U+20AC, so it would be out of range). I just don’t know what’s actually feasible to type outside the US, because all my keyboards are US-based.
(This would have the side effect of making those characters require escaping when used for other purposes, so it comes at a slight cost. I don’t want to do it unless it actually solves a problem.)
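As a quick sanity check on the U+07FF cutoff, here is a minimal Python sketch that prints the code point and UTF-8 length of each candidate. It assumes the limit corresponds to what fits in two UTF-8 bytes; the names `CANDIDATES`, `LIMIT`, and `describe` are made up for illustration and have nothing to do with the Dialog source.

```python
# Candidate alternatives to "$" mentioned in the thread.
CANDIDATES = ["$", "£", "€", "¤"]

# Assumption: the cutoff is the highest code point that fits in two UTF-8 bytes.
LIMIT = 0x07FF


def describe(ch: str) -> tuple[str, int, bool]:
    """Return (code point label, UTF-8 byte length, whether it is <= LIMIT)."""
    cp = ord(ch)
    return (f"U+{cp:04X}", len(ch.encode("utf-8")), cp <= LIMIT)


for ch in CANDIDATES:
    label, nbytes, ok = describe(ch)
    print(f"{ch} {label} {nbytes} byte(s) fits={ok}")
```

Under that assumption, £ (U+00A3) and ¤ (U+00A4) fit, while € (U+20AC) takes three bytes and falls outside the range.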
I’ve used several European keyboard layouts, and $ is normally fairly accessible. Definitely more than £ or € (I live in Italy, and I still need to double check where to find €…).
Some other characters that you’d expect to be straightforward are actually harder. ~ (tilde) isn’t available at all on the standard ITA layout, for example. (This becomes particularly relevant if you, say, purely hypothetically, choose to teach Ink to Italian middle school students before you notice.)
I’m told that many of the common Twine characters are harder on other layouts: ~ # { [ | ` \ ^ @ ] }. Whenever I try to type SugarCube code for people on my phone, ` and < are the killers.
The brackets don’t tend to be too bad, but yeah… ~ and ` are hard, and even though everybody now knows where it is, @ is actually quite awkward on the Italian layout.
Interesting! I don’t think I can provide alternatives for all of those, but hopefully the new Unicode escape sequences help there?
If there are one or two particular characters that are hard to type across a variety of popular keyboard layouts (like @ and ~), and easier alternatives in the U+0080 to U+07FF range that work well, I can see about supporting those, though. I just want to make sure the syntax doesn’t get too convoluted, because the lexer is already extremely hard to maintain.
There’s an extent to which I wouldn’t worry about it too much, not because it isn’t a problem but because the people who are likely to try Dialog will have been forced to deal with the problem already for things like (pretty fundamentally) sending emails or adding hashtags. Which is not an answer I like, but I suspect the only alternative is a configurable parser, which sounds like a nightmare for you as a maintainer.
It’s a good idea, but I don’t see a straightforward way to implement it. Currently ~ is treated specially by the lexer, because it can appear in the first column, as in ~(item *(animate $)), or inside a rule head, as in (prevent [eat ~(edible $)]). So without major refactoring, an alternative to it will need to be handled at the lexer level, not the AST level.
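To make "handled at the lexer level" a bit more concrete, here is a toy Python sketch of a pre-tokenization pass that rewrites an alias character to ~ before anything else looks at the line. This is not how the Dialog lexer works. The alias ¬ (U+00AC), the `ALIASES` table, and the `normalize` function are all hypothetical, and the sketch assumes a backslash escapes the character that follows it.

```python
# Hypothetical alias table: "¬" (U+00AC) stands in for "~".
# This is an illustration only, not real Dialog syntax.
ALIASES = {"¬": "~"}


def normalize(line: str) -> str:
    """Replace alias characters with their canonical form, skipping escapes."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            # Escaped character (assumed backslash escaping): copy the
            # backslash and the following character through unchanged.
            out.append(line[i:i + 2])
            i += 2
            continue
        out.append(ALIASES.get(ch, ch))
        i += 1
    return "".join(out)


# The alias works both in the first column and inside a rule head.
print(normalize("¬(item *(animate $))"))
print(normalize("(prevent [eat ¬(edible $)])"))
# A literal use of the alias character now has to be escaped.
print(normalize("not sign: \\¬"))
```

The last call shows the cost side of the trade-off: once a character is an alias, any literal use of it in story text needs an escape.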