Some of the groundwork is already laid for this in the “output areas” system; the bigger problem is that Dialog (and thus the Å-machine) very specifically has no “string” data type. (In I7 terms, it has texts, but no indexed texts.) Which means printing text to a buffer isn’t currently very useful; there’s not much you can do with that buffer afterwards besides “print it” or “measure its length”.
(Both useful operations, certainly, but much less useful than in Inform, where you can then go mess with the individual bytes in the buffer!)
I agree, that’s a cleaner syntax! I’ve now changed it to \x{12...34}, with any number of hex digits allowed inside the braces (though it throws an error as soon as you go past $10FFFF). This means you can even use \x{} to insert a $00 character, or \x{1} for a $01, though neither of these will ever be useful (they’re filtered out by the lexer before the compiler ever sees them). I guess you could use \x{9} for a tab or \x{A} for a newline if you really wanted.
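For anyone who wants to see the escape rule spelled out, here is a rough sketch in Python. It is not the compiler's actual lexer code, and the function name expand_hex_escapes is made up for illustration. It reads any number of hex digits inside the braces, treats empty braces as zero, and rejects values past $10FFFF. It does not model the later filtering of $00 and $01.

```python
import re

MAX_CODE_POINT = 0x10FFFF  # upper bound mentioned in the post

# Hypothetical helper pattern: \x{...} with zero or more hex digits inside.
_ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]*)\}")


def expand_hex_escapes(text):
    """Replace every \\x{...} escape in text with the character it names."""

    def replace(match):
        digits = match.group(1)
        # Empty braces are read as zero, so \x{} yields the $00 character.
        value = int(digits, 16) if digits else 0
        if value > MAX_CODE_POINT:
            raise ValueError(f"\\x{{{digits}}} is past $10FFFF")
        return chr(value)

    return _ESCAPE.sub(replace, text)
```

One difference from the description above: this sketch checks the bound after reading all the digits, rather than stopping as soon as the running value goes past the limit.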
I’m also seeing a general lack of interest in the character length builtin, so I’ll shelve that for now and implement it in standard Dialog code instead. If that ends up being unacceptably slow, I can reopen the issue.
All right, enough Dialog for this week. Back to stats!
Were the dgdebug options --warn-not-topic and --no-warn-not-topic added recently, or have I just not been seeing stderr lately? A metric ton of warnings get generated now when you run unit.dg tests, unless you add --no-warn-not-topic. Maybe --no-warn-not-topic could be the default, at least for the debugger?
The default should be --warn-not-topic if a library file is included (anything that defines a (library version) predicate) and --no-warn-not-topic otherwise—from the poll, the majority of authors wanted these warnings to be on by default, because most of them won’t know about obscure command-line options. If someone is adding unit.dg to their project, though, they’re hopefully familiar enough with the command line to add --no-warn-not-topic as well?
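To make the intended default concrete, here is a small Python sketch of the decision as I read it. The function and parameter names are hypothetical, not taken from the dgdebug source: an explicit flag always wins, and otherwise the warning is on only when some file defines (library version).

```python
from typing import Optional


def warn_not_topic_enabled(cli_choice: Optional[bool],
                           defines_library_version: bool) -> bool:
    """Decide whether not-topic warnings are shown.

    cli_choice is True for --warn-not-topic, False for
    --no-warn-not-topic, and None when neither flag was given.
    """
    if cli_choice is not None:
        # An explicit command-line flag overrides the default.
        return cli_choice
    # Default: warn only when a library file is part of the build.
    return defines_library_version
```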
I could also make it be off by default in the debugger, but that’s where I expect authors to be doing most of their testing, so I’d prefer to give the warnings.
Another option might be to not warn about objects declared in files that define an (extension version) predicate unless --warn-not-topic is explicitly requested. This will be a bit trickier to figure out, but could be doable.
This wouldn’t help if the individual unit tests were defined in the same file as the main story code, though.
In general, unit tests live in their own *-tests.dg file, not in story or library files. Checking for (extension version) doesn’t help. The tests can’t be in the main story file, because unit.dg defines (program entry point).
A mechanism to turn command-line flags on and off from Dialog code could do the trick: when running unit tests in the debugger, you almost always want -q --no-warn-not-topic. You’d want any flags entered on the command line to override what the code asked for, though.
Genuine question, since I haven’t used unit.dg before: is it a problem to have the recommended style be
#test-aardvarks
(test *) (number of aardvarks $N) ($N > 5)
#test-zebras
(set up *)
	(exhaust) {
		*(zebra $Z)
		(now) ($Z is #in #veldt)
	}
And so on?
Otherwise, I could see some other arguments for --no-warn-not-topic being the default in the debugger: you probably aren’t going to compile unfinished code, but you will debug it, and unfinished code is likely to have non-topic objects in it.
While I’m on the subject, enough of a preprocessor to enable conditional compilation, including taking “set this flag/value” from the command line, could be valuable. I’d love to be able to target multiple different memory models, and include more options, colour, help system, and so forth in a .z8 or .gblorb for modern desktop machines, but strip out enough from a version targeted at the Å-machine for the C64 to get it to fit in memory, or onto a single 1541 disk. (And then possibly also fine-tune for C64/REU and/or C128/1571.)
40-column versus 80+ column output would be another use case here; I have some structured output that looks awful if it line wraps, and knowing how much real estate I have to work with is helpful.
At present, the solution is to have multiple file versions, but that gets error-prone and awkward; I’d love to be able to build for all the targets I’m after from a single source tree.
(Okay, at present, the solution is that I don’t have a game close enough to release yet for this to matter, but still.)
I owe another PR with documentation for unit (and other) testing; for the moment, there’s a worked example of an extension with its tests sitting in test/unit in the Dialog source tree.
There would be a (list of tests [#test-aardvarks #test-zebras]) up front, but generally, that’s right. I’ve been using something like (test #test-aardvarks) by itself without bothering with the topic, which helps readability, but declaring the test name as the topic followed by (test *) works around the issue.
I’m strongly interested in that character length builtin! In particular, I need to know the width in characters of the main output div, to tell 40-column from 80-column displays.
I’m hesitant to add a full preprocessor, because that massively increases the complexity of the language, and Linus’s original goal was to keep the language simple and elegant. Instead, the way I’ve been doing it in my project is to put platform-specific code in a special file, and include or exclude that file when compiling—adding zmachine.dg to the command line isn’t any more work than adding -Dzmachine.
From the Makefile for Wise-Woman’s Dog:
FILES = act1.dg act2.dg act3c.dg act3w.dg act3e.dg act4.dg spells.dg footnotes.dg actions.dg interface.dg automap.dg nudge.dg plist.dg stdlib.dg

hasawa.zblorb: $(FILES) zstyling.dg resources/cover.jpg
	$(DIALOGC) -t zblorb zstyling.dg $(FILES) -o hasawa.zblorb $(MEMORY) -c resources/cover.jpg -a "A dog in a field surrounded by hieroglyphs" -vv 2>&1 | tee z.build

hasawa.aastory: $(FILES) aastyling.dg resources/hints.html
	$(DIALOGC) -t aa aastyling.dg $(FILES) -o hasawa.aastory $(MEMORY) -vv | tee aa.build

debug: $(FILES) aastyling.dg
	$(DGDEBUG) aastyling.dg debugger.dg $(FILES)
For more fine-grained changes that can’t be pulled out into separate files, I’ve found it easier to just do my own preprocessing (in Perl, Python, sed, etc.) before compiling. For Wise-Woman’s Dog I did this to create a pure-ASCII version, for use with older interpreters; writing a Python script to remove non-ASCII characters was much easier than doing it with any preprocessor that could be built into the compiler. Some kind of external script to process %%ifdef, %%else, %%endif seems like the better solution here.
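As a rough idea of what such an external script could look like, here is a minimal Python sketch. Only the three directive names come from the post above; the exact syntax is my assumption (each directive alone on its own line, with %%ifdef followed by a single flag name), and the function name preprocess is made up.

```python
from typing import Iterable, List, Set


def preprocess(lines: Iterable[str], defined: Set[str]) -> List[str]:
    """Keep only the lines whose enclosing %%ifdef blocks are active."""
    output: List[str] = []
    # Each stack entry is (parent_active, condition_true, seen_else).
    stack = []
    active = True
    for number, line in enumerate(lines, start=1):
        words = line.split()
        directive = words[0] if words else ""
        if directive == "%%ifdef":
            if len(words) != 2:
                raise SyntaxError(f"line {number}: %%ifdef needs one name")
            condition = words[1] in defined
            stack.append((active, condition, False))
            active = active and condition
        elif directive == "%%else":
            if not stack or stack[-1][2]:
                raise SyntaxError(f"line {number}: unexpected %%else")
            parent_active, condition, _ = stack[-1]
            stack[-1] = (parent_active, condition, True)
            active = parent_active and not condition
        elif directive == "%%endif":
            if not stack:
                raise SyntaxError(f"line {number}: unexpected %%endif")
            # Restore whatever was active before the matching %%ifdef.
            active = stack.pop()[0]
        elif active:
            output.append(line)
    if stack:
        raise SyntaxError("missing %%endif")
    return output
```

Calling preprocess(source_lines, {"zmachine"}) keeps the %%ifdef zmachine branches and drops the matching %%else branches. Because %% already starts a comment in Dialog, a file that has not been run through the script still compiles, just with every branch included.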
That part will definitely be available! I’m leaving this PR open for a week or two so people can take a look if they want, but it’ll make (current div width $) work on Å-machine (both web and 6502); in the main window, it’ll tell you the full screen width.
I’ve implemented a highly-specific feature that’s probably of use to no one but me: the ability to specify word separators. Want words to not be broken by . (so intfiction.org is a single word), but to be broken by = (so na=an is three words)? Now you can do that by passing --word-seps "=" on the command line. This is useful in some non-English languages that break sentences differently than we do.
(This has to be done on the command line rather than within the source because it affects how Dialog processes dictionary words, so it needs to be set before any dictionary words are read. The easiest solution is to make it a command-line switch.)
If you’ve got big objections to this, say them now! Otherwise, I expect it to be an obscure curiosity that languishes in the back of the manual. But this lays the groundwork for future things like changing which punctuation marks suppress space before or after them (e.g. in French, ? and ! should not suppress space before them) if it ends up being useful.
What does this mean? Well, take this bit of source.
(try joining $List)
	(join words $List into $Joined)
	Joined $List into $Joined

(try joining $List)
	Unable to join $List

(length of [] into 0)
(length of [$|$Tail] into $Length)
	(length of $Tail into $Lm1)
	($Lm1 plus 1 into $Length)

(program entry point)
	(try joining [a . b . c]) (line)
	(try joining [a = b = c]) (line)
	> (get input $Words)
	(length of $Words into $Length)
	Got $Length words: $Words
Normally, it will output this:
Unable to join [a . b . c]
Joined [a = b = c] into a=b=c
> w x.y=z
Got 4 words: [w x . y=z]
If compiled with --word-seps "=", it will output this:
Joined [a . b . c] into a.b.c
Unable to join [a = b = c]
> w x.y=z
Got 4 words: [w x.y = z]
One word of caution: this supports all BMP characters on Z-machine and Å-machine, but only ASCII in the debugger. The way the debugger tokenizes dictionary words is kind of weird and I didn’t feel like rewriting it all for such an obscure feature. If someone wants it, that can change.
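If the input side of this is hard to picture, here is a toy Python model of the splitting rule. It is not how the compiler or any of the backends tokenize. The name split_words is made up, and the default separator set is my assumption: I only list "." because that is the one default separator the example demonstrates.

```python
from typing import List

# Assumption: only "." is listed because it is the one default separator
# the example above demonstrates; the real default set may differ.
DEFAULT_SEPARATORS = "."


def split_words(text: str, separators: str = DEFAULT_SEPARATORS) -> List[str]:
    """Split text on whitespace; each separator becomes its own word."""
    words: List[str] = []
    current = ""
    for ch in text:
        if ch.isspace() or ch in separators:
            # Either kind of boundary ends the word being built.
            if current:
                words.append(current)
                current = ""
            if ch in separators:
                words.append(ch)
        else:
            current += ch
    if current:
        words.append(current)
    return words
```

With the default, split_words("w x.y=z") breaks at the full stop and leaves y=z whole; passing "=" as the separator set does the opposite, matching the two transcripts above.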
As an IFID is required to compile Dialog code to a .zblorb file, would you be able to add either an IFID-generating Dialog program or have the compiler generate an IFID automatically?
I think the problem is that the portable C we’re using has no real way to guarantee proper randomness. To ensure IFIDs don’t collide, there needs to be a lot of entropy in how they’re generated, and the only thing the C standard library can guarantee is a 16-bit seed, sometimes even less than that. A proper UUID is 128 bits long. We can get better randomness on Mac and Linux via /dev/random, but that’s no help on Windows.
So my guess is, Linus thought a defective IFID generator that would repeat values was worse than no IFID generator at all. And I’m inclined to agree there! But the current behavior could still be improved. It could point to one of the sites for generating IFIDs (1, 2), for example, so you can just click the link and get one from a properly random source. Or the default output format could be changed to Z8 instead of zblorb, which doesn’t require an IFID. Or both!
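In the meantime, anyone with Python installed can get a UUID-style IFID locally. This is a workaround sketch outside the compiler, not something Dialog provides, and the function name generate_ifid is made up. It relies on Python's uuid.uuid4(), which draws its random bits from the operating system rather than from a small seed.

```python
import uuid


def generate_ifid() -> str:
    """Return a random (version 4) UUID written in upper case."""
    # uuid4() uses the operating system's random source (os.urandom),
    # so it does not depend on a 16-bit seed.
    return str(uuid.uuid4()).upper()


if __name__ == "__main__":
    print(generate_ifid())
```

Run it once and paste the result into the story's IFID declaration.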