Testing extensions

Does unit testing make sense in a rule-based language? If not, what kind of testing scenario does make sense?

I’ve been finding it very inconvenient to test extensions - and that seems unfortunate, because extensions should really be tested more thoroughly than individual stories, not less.

If you write extensions, do you have a regular method for testing them? What do you do? What’s on your wish list for Inform testing capabilities?

My attempt at supporting I7 unit tests is:

ifwiki.org/index.php/Automated_Testing

It’s still very much a work in progress (well, very slow progress) so I’m happy to hear any feedback you might have.

Cheers,
Roger

I’m trying this out. It looks awesome - a way to get around all the weirdness with “test me” and the skein/transcript bugs!

Now if only there were a utility to translate the transcript into your test table format…

Thanks for directing me to your work, Roger. I think I’m going to use it, and I hope you’ll keep developing it.

One question: Why are expected outputs entered as regular expressions? Since you can fix the random number seed during testing, wouldn’t you want every output to be an exact match?

I’ve managed to eliminate the need for the Table of Scripts…

[code]Check allasserting:
	if there are no tests, say "There are no tests!" instead.

Carry out allasserting:
	[Rebuild the execution table with one row per test.]
	blank out the whole of Table of Executions;
	repeat with item running through tests:
		choose a blank row in Table of Executions;
		now the test-index entry is the index of item;
	[Sort in reverse test-index order, save the table to disk, then restart.]
	sort the Table of Executions in reverse test-index order;
	write File of Scripts from the Table of Executions;
	reboot.
[/code]
For some reason, giving the “index” column of the Table of Executions the same name as the “index” property of a test causes a conflict when sorting the table, so I also had to rename the column to “test-index.” But otherwise I think this works as advertised - just number your tests in the order you want them to be executed and this will take care of the rest.

Also, it looks like the “order” column in test step tables is completely unused. Were you planning to do something with that, or were you in the process of removing it?

I’m on a roll with this! I’ve now set it up so that the VM doesn’t reboot if you only run one test. I’ve also given tests a “prerequisite” property, so you can specify that one test must run before another, like a skein branch. And I’ve added the handle test response rule to the after printing a parser error rulebook, so you can test the output for parser errors as well as for successfully parsed lines. Now all that’s left is out-of-world actions like score…

Sounds like you’re making good use of it!

I don’t recall exactly why I decided to go with regular expression matching, actually. I guess the extra functionality doesn’t hurt, per se – you can still test for an exact match if you want, of course. Hmmm.
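
As a rough illustration (this isn’t taken from the extension), here’s how the two kinds of comparison look with Inform’s built-in text-matching phrases. The variable name “reply” and the sample text are made up for the example, and I’m assuming a recent Inform build where ordinary text can be matched directly.

[code]Lab is a room.

When play begins:
	let reply be "Score: 7 of 10";
	[Literal comparison: the whole reply must equal the expected text.]
	if reply exactly matches the text "Score: 7 of 10":
		say "Exact text match.";
	[Pattern comparison: \d+ accepts any run of digits, so the score may vary.]
	if reply exactly matches the regular expression "Score: \d+ of 10":
		say "Regular expression match."
[/code]

Both conditions should hold for this reply. An expected output with no special regex characters behaves the same either way, which is why a plain string still works as an exact match; text containing characters like periods or parentheses would need them escaped with a backslash to be treated strictly literally.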

You can fork the development into your own extension, of course. Or if you think it’d be better to keep it all together, let me know and we can, at some point, incorporate your improvements and release it as a new version (I /think/ extensions allow for multiple authors to be credited, but I’m not entirely sure.)

Cheers,
Roger

No, extensions are supposed to have only one author in the identifier. Authors are, of course, encouraged to credit all contributors in the documentation.

If you’re planning to come back to this yourself, I would definitely prefer to avoid a fork. If you’re not able to weigh in on design decisions while I’m fiddling with the extension, I will probably keep my version semi-private for the time being. My preference is for you to remain as the primary author.

For practical reasons, all the extensions I’m working on can be found here:

eyeballsun.org/i/

These are under version control and are updated pretty frequently. If you’d like to collaborate with me, you can always go there to see my latest check-in, and I’m always open to input on how to keep the extension in sync with your own work.