Developer Diary #3 - Testing, Testing, Testing

This third diary topic follows from the conversation that arose from my previous topic - Code vs Configuration. Tragic underwent an enormous amount of testing. A quick unblurred spoiler first: despite all the testing I describe here, it still wasn’t enough. Every game ultimately ships with bugs, and though I wish it were not true, Tragic has been no exception.

In this post I will outline the three levels of testing that I used before publishing the game:

  • Automated Unit Tests
  • Manual Coverage Tests
  • Beta Testing

Automated Unit Tests

Tragic is written using the Unity game engine. The engine ships with a test framework that lets you write code to exercise your game code. To give you a sense of the scale of the game, Tragic has 26,200 lines of code. On top of this, the unit tests for the game make up another 5,441 lines of code. Neither of these numbers includes the configuration files I covered in the previous post or the dialog text files (the stuff you actually read in the game).

Unit tests are generally short pieces of code designed to exercise a method or function in your game code. When a test runs that method, it inspects the side effects and either passes or fails based on what it observes. I have 198 unit tests for Tragic that I run to check my implementation. When they run, I get a screen that looks something like this:

[Screenshot: Unity Test Runner results]

Green check marks are good. They mean the test passed. Anything else and I need to go fix a problem - sometimes that means a problem in my code, sometimes in the test.

Writing testable code is an art form all on its own. These unit tests work best when each test gets a fresh game instance, and creating a new game instance takes time. So part of the art of writing tests involved setting up my functions so that they could work with a mocked-up game instance that didn’t have all the art loaded.

For those who can read code, here is an example of a test:

[Test]
public void DrawCard_PlayCard_DrawsCards()
{
    // Set up a lightweight combat environment with no art assets loaded
    EncounterUtils utils = new EncounterUtils();
    CombatManager combatManager = utils.CreateCombatManager();
    EnemyEncounter encounter = utils.CreateTestEncounter(1);

    // Build an empty card and attach the DrawCard component under test
    int drawAmount = 2;
    PlayerCard card = utils.CreateEmptyCard(CardType.Effect);
    DrawCard component = new DrawCard(drawAmount);
    card.Components.Add(component);

    // Play the card, then verify the hand grew by exactly drawAmount
    int startingHandCount = encounter.Hand.Count;
    combatManager.PlayCard(encounter, card, utils.Data.MainPlayer);
    Assert.AreEqual(startingHandCount + drawAmount, encounter.Hand.Count);
}

The purpose of this test is to check that a card with a ‘DrawCard’ component on it will in fact draw cards when it is played. The first block creates the game environment (in this case an enemy encounter) in an efficient way; these steps complete in about 1 millisecond. The second block creates the card to test. The final block plays the card and then checks the result.
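The post doesn’t show EncounterUtils itself, so the sketch below is only my illustration of the shape such a helper takes; the GameData and Enemy types and every constructor in it are assumptions, not Tragic’s actual code. The important idea is that everything is plain object construction - no scenes, prefabs, or art - which is what keeps per-test setup to about a millisecond.

// A guess at the shape of a test helper like EncounterUtils. GameData,
// Enemy, and all constructors here are illustrative assumptions.
public class EncounterUtils
{
    // Shared test state, including a stand-in player
    public GameData Data { get; } = new GameData();

    public CombatManager CreateCombatManager()
    {
        // Plain object construction - nothing is loaded from disk
        return new CombatManager(Data);
    }

    public EnemyEncounter CreateTestEncounter(int enemyCount)
    {
        EnemyEncounter encounter = new EnemyEncounter();
        for (int i = 0; i < enemyCount; i++)
        {
            encounter.Enemies.Add(new Enemy("TestEnemy" + i));
        }
        return encounter;
    }

    public PlayerCard CreateEmptyCard(CardType type)
    {
        // An empty card has no components, so each test can attach
        // exactly the one component it wants to exercise
        return new PlayerCard(type);
    }
}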

The whole suite of unit tests can be run in just a few seconds. I would run these tests before submitting any of my changes to my source control program, and of course before publishing the game to the competition. I discovered and fixed hundreds of bugs this way. Still, it wasn’t enough.

Manual Coverage Testing

With the proper coverage in automated testing I thought I would be able to avoid having to test everything manually. Things did not work out that way. Despite tests that I thought covered every card, item, and character component, nearly every one of those game objects had some sort of defect when I played with it in the game. It quickly became obvious that I was going to have to manually test every card, every accessory, every enemy, every encounter, every story event - you get the idea.

To do this, I first had to write in some game cheats. These cheats would allow me to add or remove cards from my hand, change the enemies I was fighting, add buffs and debuffs, and so on (a sketch of what such a cheat hook can look like follows below). Those cheats are in the game now, but I’d be impressed if someone found them. The second thing I had to do was create a spreadsheet of all the items I needed to test. The job then was simply to work my way down the spreadsheet.
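Tragic’s real cheats stay hidden, so this sketch is only a generic illustration of that kind of debug hook; the CheatConsole class and its command strings are assumptions. In practice something like this sits behind an obscure key combination or typed command:

// A hypothetical debug cheat parser, not Tragic's actual cheat code.
// It reuses the game types from the test example above.
public static class CheatConsole
{
    // Interprets a one-line cheat such as "addcard Effect" or "clearhand"
    public static void Execute(string command, EnemyEncounter encounter, EncounterUtils utils)
    {
        string[] parts = command.Split(' ');
        switch (parts[0])
        {
            case "addcard":
                // Put a fresh card of the named type into the hand
                CardType type = (CardType)System.Enum.Parse(typeof(CardType), parts[1]);
                encounter.Hand.Add(utils.CreateEmptyCard(type));
                break;
            case "clearhand":
                encounter.Hand.Clear();
                break;
            default:
                // Further commands (swap enemies, apply buffs and debuffs,
                // and so on) slot in as extra cases during testing
                break;
        }
    }
}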

[Screenshot: a sample of the testing spreadsheet]

This is just a small sample of the items whose functionality I had to check. An interesting aspect of working on an IF game is that I found it useful to track not just the functionality of each item, but also any narrative text associated with it. There is a good argument to be made that for each problem I encountered I should have written or updated a unit test to catch any regression. Had I been more disciplined, perhaps I would have done so.

I highlight the Foil Hat above, as it illustrates that even this level of testing is not enough. The Foil Hat helps your character by preventing him or her from gaining debuffs that reduce damage output. I checked this against enemies that applied those debuffs, but I forgot to check the few cards that debuff you when played. The fix for this bug came in an update after the competition had already begun.
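Had I written that regression test, it could have followed the same pattern as the earlier example. The sketch below is hypothetical - FoilHat, ApplySelfDebuff, DebuffType, Equip, and HasDebuff are all assumed names, not Tragic’s real API:

[Test]
public void FoilHat_SelfDebuffCard_DebuffIsBlocked()
{
    EncounterUtils utils = new EncounterUtils();
    CombatManager combatManager = utils.CreateCombatManager();
    EnemyEncounter encounter = utils.CreateTestEncounter(1);

    // Equip the accessory under test (hypothetical API)
    utils.Data.MainPlayer.Equip(new FoilHat());

    // Build a card that debuffs its own player when played
    PlayerCard card = utils.CreateEmptyCard(CardType.Effect);
    card.Components.Add(new ApplySelfDebuff(DebuffType.Weaken));

    combatManager.PlayCard(encounter, card, utils.Data.MainPlayer);

    // The Foil Hat should have blocked the self-inflicted debuff
    Assert.IsFalse(utils.Data.MainPlayer.HasDebuff(DebuffType.Weaken));
}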

Beta Testing

Beta testing involves giving a build of the game to a limited number of players in order to collect feedback. The hope is that by this point you have found and fixed all the bugs, and you are mostly gathering impressions of the game’s mechanics and design.

For Tragic I had three beta testers, and their feedback was invaluable. I worked with them one at a time so that I could make adjustments to the game and have those adjustments seen with a fresh set of eyes. Mathbrush is a trooper. He is asked to test a lot of comp games, and he was the first to test Tragic. He quickly ran into problems that blocked progress through the game, and he was patient with me as I fixed them so he could see more of it.

Spike went next. He found enough typos and grammar problems that I ended up writing an exporter for my game dialog, so that Microsoft Word could offer suggestions on the entire game before he had to dive back in. This is a theme for this post: let robots do robot work so that humans can concentrate on the things humans are good at. I also learned from Spike that I had tuned the game too heavily toward an audience steeped in the genre. It was after his playthroughs that I created the save and restore functionality that lets you progress further in the game without having to restart.
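The exporter itself isn’t shown in the post, but the idea is simple enough to sketch: walk the dialog text files and concatenate them into one document a word processor can proofread. This sketch assumes the dialog lives in plain .txt files under a single folder; the DialogExporter name and layout are mine, not Tragic’s:

using System.IO;
using System.Text;

// Gathers every dialog text file into one document so a word processor
// can proofread the whole game at once. The folder layout and the .txt
// extension are assumptions for illustration.
public static class DialogExporter
{
    public static void Export(string dialogFolder, string outputPath)
    {
        StringBuilder builder = new StringBuilder();
        foreach (string file in Directory.GetFiles(dialogFolder, "*.txt", SearchOption.AllDirectories))
        {
            // Record the source file so any problem Word flags can be
            // traced back to the file it came from
            builder.AppendLine("=== " + Path.GetFileName(file) + " ===");
            builder.AppendLine(File.ReadAllText(file));
            builder.AppendLine();
        }
        File.WriteAllText(outputPath, builder.ToString());
    }
}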

My third beta tester, David, is not involved in the IF community. Like Mathbrush before him, he immediately chose the highest difficulty level the first time he played. This behavior surprised me, though in hindsight it shouldn’t have: I shouldn’t expect players to have any background whatsoever on the impact of this choice. The higher difficulty setting suppresses both the tutorial and the story itself, so I quickly added logic to push the user to play on Story mode their first time through.
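A nudge like that can be as small as a persisted first-run flag checked on the difficulty screen. Here is a sketch using Unity’s PlayerPrefs; the DifficultyGate class and key name are my own illustration, not Tragic’s actual code:

using UnityEngine;

// A hypothetical first-run check for the difficulty screen: until the
// player has completed a run, steer them toward Story mode.
public static class DifficultyGate
{
    private const string FirstRunKey = "HasPlayedBefore";

    // True when the player has never played, so the UI should
    // preselect or recommend Story mode
    public static bool ShouldSuggestStoryMode()
    {
        return PlayerPrefs.GetInt(FirstRunKey, 0) == 0;
    }

    // Called once a run is underway so later sessions skip the nudge
    public static void MarkPlayed()
    {
        PlayerPrefs.SetInt(FirstRunKey, 1);
        PlayerPrefs.Save();
    }
}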

The End Result

In the end I believe I spent as much time testing my game as I did writing it, perhaps more. It is a complex game, but what game isn’t? I discovered and fixed hundreds of bugs, and still I entered the competition with several that I had missed. Testing is an essential part of any game development process, and on behalf of everyone who plays the games in the competition, I want to thank all of you who have contributed to the success of these games with your testing feedback.
