Inform 7: A Coder's Approach / CLI dev

TL;DR: I wanted to be able to use “vim”, “make”, and “py.test” to develop and test Inform. You can check out what’s in progress with:

git clone --recursive https://github.com/spankminister/i7project

You should be able to just type “make” if you have make/gcc installed, and you may need to add the directory with inform6/ni to your path.

Long version:

I’ve been somewhat dissatisfied with the way Inform 7 is set up for development. I’m not blaming the tools-- I understand the rationale behind wanting to create a means for game dev with non-intimidating (to writers) syntax and a built-in IDE. But as a coder, Inform 7 is a programming language with syntax, scope, compiler errors, and everything that goes along with that. And making a program of any nontrivial complexity requires code organization, the ability to generate robust unit and regression tests, and the means to refactor and isolate functionality and responsibility. Most of the criticisms of my first game weren’t because of the prose, but because it was buggy. And because of the legacy of Inform 7 being a natural language manuscript, we have one big text file and the general attitude that one should send the “draft” to “proofreaders.” But really, no serious application today does QA by simply sending the “beta” to “testers” and telling them to do whatever comes to mind-- to really identify and isolate bugs, coherent test plans ensure that the parts of a program work separately, and then that they work together.

So, since I develop on OS X and Linux primarily, I wanted to make it possible for me to develop in a way that promoted good habits in terms of code reorg, unit tests, and so on. There’s a bunch of changes I made, including:
* Inform can be invoked via “make”, which will perform the I7 compile and the I6 compile, and build glulxe against cheapglk.
* Python-style unit tests can be run, such as specifying a series of commands that complete the game and checking the output for "*** YOU WIN **".
* Source files go in the Projectname.Inform/Source directory with the .txt extension. Upon make, all these .txt files are concatenated together to make the story.ni that actually gets compiled.
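The concatenation step in the last item could be sketched roughly like this in Python. This is only an illustration, not the actual Makefile rule from the repository: the function name `build_story` is made up, and joining the files in alphabetical filename order is an assumption, since the post does not say how the files are ordered.

```python
from pathlib import Path


def build_story(source_dir, output_name="story.ni"):
    """Concatenate every .txt file in source_dir into a single story.ni.

    Assumption: files are joined in alphabetical filename order, so
    prefixes like "01 ", "02 " would control the order. The real build
    may order them differently.
    """
    source_dir = Path(source_dir)
    parts = sorted(source_dir.glob("*.txt"))
    # End every file with exactly one newline and separate files with a
    # blank line, so a paragraph in one file cannot run into the next.
    text = "\n".join(
        part.read_text(encoding="utf-8").rstrip("\n") + "\n" for part in parts
    )
    output_path = source_dir / output_name
    output_path.write_text(text, encoding="utf-8")
    return output_path
```

Calling `build_story("Projectname.Inform/Source")` would then leave a `story.ni` next to the `.txt` files, ready for the I7 compile.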

I’d like to go even further in the testing department, adding unit testing for scenes, threaded conversation, and connecting them all together to make more complex regression tests. Also, I think it would be cool to have an “Infinite Monkey” tester that randomly took verbs and nouns from the game and attempted to brute force complete the game, as an attempt to programmatically find bugs that could make the game unwinnable. Ideally, I’d be able to harvest actions and objects from the Index, but I’m not sure how to grab things programmatically out of it quite yet.

I’m applying this approach to my “rewrite” of my first game. I’m not throwing anything out, I’m just adding things into this project “cleanroom” style, and writing tests as I go to try and prevent the bugs that plagued the original from making their way in. I know a couple of people had been asking about command line development, hopefully this helps someone.

Those interested in a more test driven style of development should also check out Mike Ciul’s test framework for Kerkerkruip: github.com/i7/kerkerkruip/blob/ … esting.i7x It allows for testing things within the game (so giving you access to more data than what is just printed out), while also allowing you to reset the game state, handling random events etc.

It doesn’t really take much to compile an Inform 7 project, you can see how we do it in Kerkerkruip here https://github.com/i7/kerkerkruip/blob/master/tools/build-i7-project - most of the trouble comes with ensuring you have the right extension versions, something that will hopefully be much improved in the next release of I7. :slight_smile:

I like the idea of automated testing, but that approach is not physically feasible. The standard library alone has about 60 in-world actions. About half of them accept a single noun, about 10 are nounless (WAIT, JUMP), another 10 take two nouns, and the rest require topics or numbers (which we’ll ignore for now). Given a trivially small game with, say, 10 objects per room, even that is 10 + 30*10 + 10*10*10 = 1310 different combinations per room.

If the testing script picks commands purely at random and the optimal walkthrough requires only 10 actions to reach the ending, and assuming the script has a 1 in 1310 chance each turn to enter the correct action required, on average it takes about 13,000 moves for the script to reach the end. This is generously assuming that the player can’t undo actions, like drop items they’ve picked up, and that the script is always in the right room when it stumbles upon the correct action.
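For anyone who wants to play with the numbers, the estimate can be reproduced in a few lines of Python. The function names are just for illustration; the action counts (10 nounless, 30 single-noun, 10 two-noun) are the rough figures from this post, not exact counts from the standard library.

```python
def commands_per_room(objects, nounless=10, one_noun=30, two_noun=10):
    """Count the distinct commands available in a room holding `objects` objects."""
    return nounless + one_noun * objects + two_noun * objects ** 2


def expected_random_moves(walkthrough_length, choices_per_turn):
    """Average number of random moves needed to finish the walkthrough.

    Each walkthrough step is hit with probability 1/choices_per_turn per
    turn, so it takes choices_per_turn tries on average, and the steps
    simply add up.
    """
    return walkthrough_length * choices_per_turn


per_room = commands_per_room(10)
print(per_room)                             # 1310
print(expected_random_moves(10, per_room))  # 13100
```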

A couple of real-life examples: in one game I had an optional puzzle where the player had to move a heavy statue 4 rooms and place it in a container. The statue was so heavy that the game was set up to drop it automatically after one action. So you had to pick it up, immediately move to the next room, pick it up again etc. Given the generous assumptions of 10 objects per room and no custom actions, the probability for a random script to do exactly those two actions in that order is about 1 in 1310*1310 = 1,716,100. And that would have to happen 5 times in total, and the script would have to miraculously move exactly that route and never move the statue anywhere else. The total probability of a random script ever solving that puzzle is so incredibly low that it’s practically zero.

Another practical example: I had designed a fairly complex game that lasted for exactly 25 moves, which is quite a good situation for automated testing. I don’t remember how many sensible actions it had, but if we say 25 it’s probably quite close. Note that these were hand-picked, reasonable actions that were known to change the game state, so all nonsense actions were trimmed out. I thought I’d write a quick script to play out all the possible permutations of gameplay using only these sensible actions. It turned out that trying all 25^25 possibilities (~8.88*10^34) would have taken the script about 10^24 times longer to run than the estimated age of the universe. And the script used only about one 50th of all possible actions!

Sorry about the lengthy post, I just like crunching numbers like this. Didn’t mean to completely shoot down the idea. As I said, it’s a good idea, but a completely random set of commands is never going to work. You could, for example, give the script an expected walkthrough and every turn have a random chance of it attempting a random action (again from a preset list of actions that are known to affect the world state) instead of following the walkthrough. This way you wouldn’t get wrecked by the combinatorial explosion and you could still get meaningful data from the random actions.
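A minimal sketch of that walkthrough-with-deviations idea, in Python. Everything here is hypothetical (the function name, the parameters, the 10% default); it only builds the command list and leaves feeding it to an interpreter to whatever harness you already have.

```python
import random


def perturbed_walkthrough(walkthrough, extra_actions, deviation_chance=0.1, rng=None):
    """Return the walkthrough with random actions mixed in.

    Before each walkthrough step there is a `deviation_chance` probability
    of spending a turn on a random action from `extra_actions` (a preset
    list of actions known to affect the world state). Several deviations
    in a row are possible. The walkthrough steps themselves always appear,
    in their original order.
    """
    if not 0 <= deviation_chance < 1:
        raise ValueError("deviation_chance must be in [0, 1)")
    rng = rng or random.Random()
    commands = []
    for step in walkthrough:
        while rng.random() < deviation_chance:
            commands.append(rng.choice(extra_actions))
        commands.append(step)
    return commands
```

Passing a seeded `random.Random(1234)` as `rng` makes a failing run reproducible, which matters once a deviation actually breaks something.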

I’ve thought about this a lot in the past.

My take would be a tool that allowed the author to test based on selection. So you play the game out to turn n and then say “test noun get” or “test object sword” and the tool would go through all of the combinations with that particular word within the context of the current turn. Just throwing out ideas of course.

I agree with Juhana that doing everything, including turns, is currently impractical, although the tools to do this are not too far off. There are projects like Microsoft Orleans that allow you to “spin up” thousands of nodes to act on data. If you had a node that contained the VM and the story file and save file, you could potentially fire off millions of commands at once and get responses back. It would be “big data”, so you’d still need tools to decipher patterns. But you could get statistics that told you certain objects were only useful once in a story (and maybe that’s a good thing, but testing that is something an author may want to do). Or you could get results of object use that you hadn’t considered and want to add a custom response. This is all science-fiction at the moment, but don’t be afraid of the numbers. Big Data is a real thing and the potential with IF is there.

Thanks for the link, I was obviously going to have to access more data eventually to essentially perform asserts on things the player can’t see like variable state or the location of a roaming NPC, so I will likely crib a lot from these examples.

Ah, I was having a lot of trouble locating CLI examples like this earlier. I modified the existing i7 Linux convenience script/console for OS X, but your way is cleaner-- I am inclined to trim Perl from projects where possible =)

Extension-wise, my Inform 7 project source checkout just has a subdirectory with all the extensions it needs, and stomps them into the Inform 7 directory just prior to compiling.

So you’re right that an entirely brute force approach is generally not feasible, and the first thing I wanted to implement was indeed inserting “errors” into the critical path and finding out how far off the path it was reasonable to go as you suggest. I also think most stories will not need to really do every action in every order. Going by the threaded and dynamic object-oriented model diagrams in this paper, game states often collapse down, and at least in the game I was hoping to test out, I think only a few of my actions truly “change the game state”. Another possibility is having the brute forcer insert a certain quantity of garbage commands, and then make sure that from that point, the walkthrough still works. I’m less interested generally in mathematically proving the game is complete than I am finding bugs-- and really, all testing simply provides some measure of assurance rather than complete assurance.

Even more so than making the game unwinnable, I’d like to at least get a feeling for which actions and objects are defined and undefined. Similar ideas have definitely been brought up before about autoparsing room descriptions and ensuring that any noun mentioned has a description or portability status, and I’d like to attempt to address that as well.

Is that correct? If you’re doing a 25-permutation of 25 elements, that’s 25! or 1.551121004 E+25, right? Your point is taken that’s still a large number, but not as large as 25^25.

In any case, the valuable thing I think brute forcing can bring to the table here is not “can this random assortment of commands win the game?” but “is the game still winnable using the walkthrough after this random assortment of commands?”. That’s a much lower bar, and I think one could potentially get useful data from even a low number like a couple of million playthroughs.
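Roughly what I have in mind, as a Python sketch. All of the names are placeholders: `play` stands for whatever function feeds a list of commands to the interpreter and returns the transcript, `WIN_MARKER` is assumed to be the text the game prints on winning, and `toy_play` is a fake three-command game that exists only so the sketch runs on its own.

```python
import random

WIN_MARKER = "*** YOU WIN"  # assumption: the text the game prints on winning


def find_breaking_prefixes(play, walkthrough, garbage_pool,
                           trials=100, prefix_len=5, seed=0):
    """Run random garbage commands, then the walkthrough, and collect
    every garbage prefix after which the walkthrough no longer wins."""
    rng = random.Random(seed)  # seeded so failures can be reproduced
    failures = []
    for _ in range(trials):
        prefix = [rng.choice(garbage_pool) for _ in range(prefix_len)]
        transcript = play(prefix + list(walkthrough))
        if WIN_MARKER not in transcript:
            failures.append(prefix)
    return failures


def toy_play(commands):
    """Stand-in for a real interpreter run: a tiny fake game with a bug
    where eating the key makes the game unwinnable."""
    key_exists, has_key, output = True, False, []
    for command in commands:
        if command == "take key" and key_exists:
            has_key = True
        elif command == "eat key" and key_exists:
            key_exists, has_key = False, False
        elif command == "unlock door" and has_key:
            output.append("*** YOU WIN ***")
            break
    return "\n".join(output)


bad = find_breaking_prefixes(toy_play, ["take key", "unlock door"],
                             ["wait", "jump", "eat key"])
print(len(bad), "garbage prefixes broke the walkthrough")
```

Every prefix that comes back is a ready-made regression test: replay it, fix the bug or decide it is intended, and keep it in the suite.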

That would be true for unique elements, but actions can be repeated (and almost always need to be, e.g. >GO NORTH).
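The two counts being compared here are easy to check side by side in Python: 25! counts orderings of 25 distinct moves, while 25^25 counts 25-move sequences where each move can be any of 25 actions, repeats included.

```python
import math

orderings = math.factorial(25)   # each of 25 moves used exactly once
sequences = 25 ** 25             # any of 25 actions on each of 25 turns

print(f"{orderings:.3e}")  # 1.551e+25
print(f"{sequences:.3e}")  # 8.882e+34
```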

If a little self-promotion is allowed, I’ve already made that tool: emshort.com/pl/payloads/Juha … 0Tests.i7x

That’s a good idea. Some of the debugging/testing logic is going to have to live in I7, and some externally. I was also thinking to group actions into “game phases” in case it’s possible to complete several critical tasks out-of-order.

Yeah, one of the chief complaints I got as feedback was that I hadn’t anticipated a potential use of an object, so I really want to make sure I handle those gracefully. Ideally, whatever I do will make it easier to identify those situations in future projects. I’m not afraid of the numbers in the least, quite the opposite: I want to make sure that my experience doing parallel computing (the hammer) is not leading me to treat an overly large problem like a nail.

Maybe I misunderstood, I thought you meant you wanted to reorder a 25-move solution into all potential orders, so the first move has 25 choices, the second 24, and so on.

Awesome! Self-promotion is always allowed when it makes my life easier :smiley:

No, the same number was just a coincidence (perhaps I should have picked a slightly different number to avoid confusion.)

Hi Everyone, I recently pulled together an integrated vim development environment using the vim plugin architecture for Inform7. The installer, documentation, and links to all source code are here:
