Yeah idk if Windows players would see the fruits of your effort with this. I think Windows has a very different approach to control characters.
EDIT: Got sniped by jbg above.
They’re not really Unicode special characters; they’re TTY special characters that got imported into ASCII and then ASCII got imported into Unicode.
If I paste a ctrl-H (U+0008) into a text document (or this message!) it’s not going to act as a backspace. Most IF engines use generic document rendering toolkits.
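A quick sketch in Python (just as a neutral illustration, nothing TADS-specific): pasted into a string, U+0008 is stored as an ordinary character, and nothing gets erased unless something downstream, such as a terminal emulator, chooses to interpret it.

```python
# U+0008 inside a string is just data; no backspacing happens here.
s = "ab\u0008c"
print(len(s))    # 4: the control character is stored, not applied
print(repr(s))   # 'ab\x08c'
# A terminal emulator interpreting this byte stream would move the cursor
# back over the 'b'; a document renderer typically shows nothing or a box.
```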
I mean if you want to go there, then I guess the letter A, for example, isn’t “really” a Unicode character.
But yeah, I wouldn’t necessarily expect an IF engine to properly render control characters, although I do think the comparatively slapdash rendering of the character set notionally supported by the underlying spec (in this case TADS3) probably counts as a misfeature. Like FrobTADS does correctly handle 0x0008 (backspace), but doesn’t render 0x25a6 (part of the geometric shapes Unicode block). QTads is the opposite.
FrobTADS doesn’t really render anything, though. It just passes characters on to the system’s ncurses library, and then on to the terminal program. Which Unicode characters can or can’t be drawn depends entirely on the latter.
This doesn’t appear to be true. Which surprised me, because I had assumed exactly what you’re suggesting: that frob is using ncurses(3), and so frob would support whatever character set(s) are supported by the terminal it’s running in.

But vi is perfectly happy displaying the result of [ctrl]-[v]u25a6 (for example), and saved to foo.txt, cat foo.txt, less foo.txt, and so on are all happy to display Unicode characters that show up in frob as ?s.
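For what it’s worth, the comparison is easy to reproduce from a shell (assuming a UTF-8 locale; foo.txt here is just a scratch file):

```shell
# Write U+25A6 (as its UTF-8 byte sequence \342\226\246) to a file...
printf 'ab \342\226\246 cd\n' > foo.txt
# ...and the usual terminal tools render it fine:
cat foo.txt        # prints: ab ▦ cd
wc -c < foo.txt    # 10 bytes: the single glyph occupies three of them
```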
I haven’t made any attempt to hunt down where the breakage is, though.
Edit: As pointed out by @ArdiMaster below, my speculation here is completely incorrect, and so I’m putting it in spoiler tags to prevent some future person from getting led astray by it showing up in search results:
After a little poking around, I think it’s because adv3 pretty much everywhere, e.g. output.t, declares #charset "us-ascii". Per the T3 character set documentation this will override any compile-time declaration of a default character set, or any character set declared in e.g. an individual game’s source.

So even if you specify some other default in the makefile and/or via a #charset declaration in your project, all the output classes are still going to end up handling strings (for output) as ASCII, unless I’m missing something obvious.
Demo:
```
#charset "utf-8"
#include <adv3.h>
#include <en_us.h>
#include <charset.h>

versionInfo: GameID;

gameMain: GameMainDef
        _codes = [ '\u0008', '\u25a6' ]
        newGame() {
                local cs;

                cs = new CharacterSet(getLocalCharSet(CharsetDisplay));
                "Charset is: <<getLocalCharSet(CharsetDisplay)>>\n ";
                if(!cs.isMappable(_codes.join(''))) {
                        "Charset check failed.\n ";
                        return;
                }
                "<<'foo' + makeString(_codes.join('')) + 'bar'>>\n";
        }
;
```
Ah, correcting myself here. I was missing something obvious: frob does not appear to use the locale or anything like that from the terminal it’s invoked in; you have to explicitly ask for UTF-8 on the command line (e.g. frob --character-set utf-8 game.t3).
That’s…irritating.
For the avoidance of confusion: the #charset directive only determines how the compiler reads the current source file. It has no bearing on the run-time representation of strings.
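Python has a close analogue that may make the distinction clearer (this is an analogy on my part, not anything the TADS docs say): the PEP 263 coding declaration tells the compiler how to decode the source file’s bytes, but the resulting string object is the same either way.

```python
# -*- coding: utf-8 -*-
# The line above only governs how this file's bytes are decoded at
# compile time; it says nothing about how strings are represented at
# run time.
s = "\u25a6"
print(ord(s))   # 9638 (0x25A6), regardless of the source encoding
```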
This is because frob does not support Unicode. It’s a limitation in the portable TADS 2 terminal/console output subsystem (also used in TADS 3), which was never updated to support Unicode. It was written with legacy Windows codepages in mind.
Terps that don’t run in a terminal don’t use that subsystem.