The story of failures in automatic text adventure testing

Testing a text adventure game is very important. There are many types of testing that need to be done, and most of it has to be done by hand. I have tried automating some of it, and I have blogged about automatically testing adventure games before. Now it is time to share some examples where automatic testing goes wrong.

How do I test?

I essentially create a test script that will play the game by itself.

I add alternative commands to try out. This makes sure that synonyms and alternative words are also recognized. The test runner picks one command at random during every run, so in the long run every one of them is tested.

The script checks that items are picked up or left in place, that score is assigned, that things are locked or unlocked, etc. It does a decent job of making sure everything is working (and nothing is broken).
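The random choice between alternative commands can be sketched in a few lines of Python. This is only an illustration: the script format, the command strings, and the names SCRIPT, pick_commands and runs_until_all_covered are made up for the example and are not taken from the actual test runner.

```python
import random

# Hypothetical script format: each step lists interchangeable commands.
SCRIPT = [
    ["take lamp", "get lamp", "pick up lamp"],
    ["north", "n", "go north"],
    ["open door", "open the door"],
]

def pick_commands(script, rng):
    """Pick one alternative per step, at random."""
    return [rng.choice(alternatives) for alternatives in script]

def runs_until_all_covered(script, rng, max_runs=10_000):
    """Count the runs needed until every alternative was chosen at least once."""
    remaining = {command for step in script for command in step}
    for run in range(1, max_runs + 1):
        remaining -= set(pick_commands(script, rng))
        if not remaining:
            return run
    return None  # not everything was covered within max_runs

rng = random.Random()
print(pick_commands(SCRIPT, rng))
print("runs until every alternative was tried:", runs_until_all_covered(SCRIPT, rng))
```

A single run exercises only one wording per step, so coverage of all synonyms only builds up over repeated runs.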

Item not picked up

One thing I test is that the player can pick objects up. Sometimes an object first needs to be found (searched for) before it can be picked up. At other times a puzzle needs to be completed or some event has to happen before the game allows the item to be picked up. When I test for this behavior I am making sure that the puzzles work in the correct order.

My automated scripts can test if the player has an item, or if he does not have it. So I can test things like this:

Make sure item X is not with the player
> take X
Make sure item X is now with the player

If the check in step 1 or step 3 fails, my test fails. I check that the player does not have item X, take the item, and then verify that it succeeded. If X is not visible or accessible, or the player is prevented from taking X, the test will fail.

However, if I now want to test that the player cannot take something, I will write:

Make sure item X is not with the player
> take X
Make sure item X is not with the player

For example, if the item is a fixed item (decoration) or if there is some mechanic preventing the player from getting at X before something else is done, this is the test to use.
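Both variants of the three-step test can be expressed with one helper. The sketch below runs against a toy stand-in for the game; ToyGame, its execute method and check_take are hypothetical names invented for the example, not the real engine or test script.

```python
class ToyGame:
    """Hypothetical stand-in for the real game engine."""

    def __init__(self):
        self.room_items = {"paper", "statue"}
        self.fixed = {"statue"}  # decoration that cannot be taken
        self.inventory = set()

    def execute(self, command):
        verb, _, noun = command.partition(" ")
        if verb == "take" and noun in self.room_items and noun not in self.fixed:
            self.room_items.remove(noun)
            self.inventory.add(noun)
            return "Taken."
        return "You can't do that."

def check_take(game, item, expect_taken):
    """Step 1: item not carried. Step 2: try to take it. Step 3: verify the outcome."""
    assert item not in game.inventory, f"{item} is already carried"
    game.execute(f"take {item}")
    carried = item in game.inventory
    assert carried == expect_taken, f"{item}: carried={carried}, expected={expect_taken}"

game = ToyGame()
check_take(game, "paper", expect_taken=True)    # the player can take the paper
check_take(game, "statue", expect_taken=False)  # the statue stays where it is
print(sorted(game.inventory))
```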

A typo?

Now if I accidentally make a typo in line 2 (I write take papwe instead of take paper), the game will of course not take the item, because there is no item called papwe. The parser tells me so (I don’t understand the word ‘papwe’). But the test is not able to pick up on this, and it will conclude successfully, since I still do not have X.

This is wrong. No real attempt was made to get X (because of the typo). The test should actually fail.

I found one such mistake in my test script by accident. To find mistakes like this, I need to manually read through the transcript of the testing session and verify each command and response.

How to fix this so the tests can pick up on typos?

What I will probably do is have a test to check if the last command was successful. If that is not the case, the test will fail.

My parser already (internally) flags if something was not executed, but my test script is not able to verify this yet. If I could verify that a command executed successfully, that would be another check in the test script.
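As an illustration of what such a check might look like, the sketch below adds one extra assertion to the "cannot take" test. It assumes the game hands back a result object with a flag saying whether the parser understood the command; Result, understood, ToyGame and check_cannot_take are hypothetical names for the example, and the real parser flag may well look different.

```python
from dataclasses import dataclass

@dataclass
class Result:
    text: str
    understood: bool  # hypothetical flag: the parser recognised every word

class ToyGame:
    """Hypothetical stand-in for the real game engine."""

    def __init__(self):
        self.vocabulary = {"paper"}
        self.fixed = {"paper"}  # the paper cannot be taken yet
        self.inventory = set()

    def execute(self, command):
        verb, _, noun = command.partition(" ")
        if verb != "take" or noun not in self.vocabulary:
            return Result(f"I don't understand the word '{noun}'.", False)
        if noun in self.fixed:
            return Result("It is glued to the desk.", True)
        self.inventory.add(noun)
        return Result("Taken.", True)

def check_cannot_take(game, item, command):
    assert item not in game.inventory
    result = game.execute(command)
    # The extra check: a command the parser did not understand is a test bug.
    assert result.understood, f"not understood: {command!r} -> {result.text}"
    assert item not in game.inventory

check_cannot_take(ToyGame(), "paper", "take paper")  # a genuine refusal passes
try:
    check_cannot_take(ToyGame(), "paper", "take papwe")
except AssertionError as err:
    print("typo caught:", err)
```

With the flag in place, a refused but understood command still passes the negative test, while a misspelled command fails it.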

Item there, but not visible

During development, things get changed around. I accidentally flipped the visibility of one item. It should have been visible (listed with the room description), but it was not. It was still there, and if you knew about it you could take it. It just was not listed.

Such “hidden from description” items are normal in text adventure games. They are usually included in the room description itself or other item descriptions.

The test script of course ran without errors – it could pick up the item. But if I played the game, there was no way for me to spot the item.

This is very tricky to detect automatically, and I will need to figure out how I could protect against it.
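One possible direction, shown here only as a rough sketch, is a static check over the game data: every item that is not listed must at least be named in the room description or in the description of another item. The Item and Room classes and undiscoverable_items are hypothetical, and the plain substring match is a simplification that would miss synonyms and rephrasings.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    name: str
    listed: bool  # shown in the automatic item list of the room
    description: str = ""

@dataclass
class Room:
    description: str
    items: list = field(default_factory=list)

def undiscoverable_items(room):
    """Names of unlisted items that no other text in the room mentions."""
    missing = []
    for item in room.items:
        if item.listed:
            continue
        other_texts = [room.description] + [
            other.description for other in room.items if other is not item
        ]
        if not any(item.name.lower() in text.lower() for text in other_texts):
            missing.append(item.name)
    return missing

study = Room(
    description="A dusty study with a large desk.",
    items=[
        Item("desk", listed=True, description="A paper lies on the desk."),
        Item("paper", listed=False),  # fine: the desk description mentions it
        Item("key", listed=False),    # suspicious: nothing mentions it
    ],
)
print(undiscoverable_items(study))
```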

Wandering actors

Sometimes interaction is required with an actor that wanders around. This is a problem, since if I instruct the script to go somewhere, the actor will not be there all the time.

My automated scripts are pretty dumb and static, so this posed a problem. I ended up adding a special debug verb to “bring actor to my location”. This seemingly solved the problem for a while. When I require actor Y to be present, I go to the location where the actor can appear, and bring him or her there with the debug verb.

This created a new problem: if the actor’s wandering action takes place just as I bring the actor there, the actor may leave right away.

So I needed to temporarily “restart” the wandering whenever this special debug verb is used. By restarting the wandering the actor will stay in place for a few turns, so the test script can interact with it.
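The effect of restarting the wandering can be shown with a toy actor that moves on a fixed countdown. The class, the countdown mechanism and the names WanderingActor, tick and summon are assumptions made for this sketch; the real game may schedule wandering quite differently.

```python
import random

class WanderingActor:
    """Hypothetical actor that moves to another room every few turns."""

    def __init__(self, rooms, wander_every=3, rng=None):
        self.rooms = rooms
        self.location = rooms[0]
        self.wander_every = wander_every
        self.turns_until_move = wander_every
        self.rng = rng or random.Random()

    def tick(self):
        """Advance one game turn; move when the countdown reaches zero."""
        self.turns_until_move -= 1
        if self.turns_until_move <= 0:
            elsewhere = [room for room in self.rooms if room != self.location]
            self.location = self.rng.choice(elsewhere)
            self.turns_until_move = self.wander_every

    def summon(self, room):
        """Debug verb: bring the actor here and restart the wander countdown."""
        self.location = room
        self.turns_until_move = self.wander_every

actor = WanderingActor(["hall", "kitchen", "garden"], wander_every=3)
actor.turns_until_move = 1  # the actor is just about to wander off
actor.summon("kitchen")
for _ in range(2):          # two turns in which the script can interact
    actor.tick()
    assert actor.location == "kitchen"
actor.tick()                # on the third turn the actor wanders again
print(actor.location)
```

Without the reset in summon, the first tick after summoning would already move the actor away in this example.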

Conclusions

Automatic testing can verify that certain commands work, that the game is playable (to the end), and that the right score is awarded for the actions. But it cannot check for textual errors (something not being listed), and it does not know your intentions, so it cannot always tell a typo in the test script apart from a genuine refusal by the game.

It is still a very useful tool in my arsenal. But when planning to properly test a text adventure, remember that automatic testing will not get you all the way there. Manual testers will still find lots of bugs that current automatic tests are not able to find.

Seventh Horseman Blog

Writing about developing stuff