Oddlabs Game Development Blog - a peek inside the labs

How to save yourself from Debugging Hell: Deterministic games

Have you ever wished you knew just what lead to the strange exception that your game just threw? Most games are complex simulations so even an otherwise well-meaning exception message and a stack trace is not always adequate for figuring out what went wrong. However, there is a way to recreate even the most obscure situations…

Recreating those obscure crashes

For example, quite a few bugs in Tribal Trouble happened under obscure circumstances; we had a crash once that involved someone throwing an axe at a tower at just the same moment the defender decided to abandon the very same tower. That is why we need a more heavy-handed tool to help debugging - how would you like to have an entire game session logged for easy replaying? We can even do it with very low CPU and disk usage - Tribal Trouble always run with logging enabled so the log can be uploaded to us in the event of a crash.

If you’ve never done replay-capable games before, you’d be surprised how easy it is to incorporate, if you realize an important fact about most games: the output (graphics and sound) only depends on very few low-bandwidth input sources. And an entire run of a game can be replayed if all of the input sources are replayed.

So what are the typical inputs of a game? The short story is: Input devices. The longer story is: Input devices, network, environment settings, exceptions, FPU/JVM, threading and time. We’ll cover all of them shortly. First we’ll define an input source as “a method invocation whose result varies with each run of the game”. Let’s see what that means from an example taken from an input loop of the LWJGL Mouse:

Mouse.poll();
while (Mouse.next()) {
int x = Mouse.getEventX();
int y = Mouse.getEventY();
int dz = Mouse.getEventDWheel();
int button = Mouse.getEventButton();
updateGameState(accum_x, accum_y, accum_dz, button);
}

Here, the input sources are: Mouse.next(), Mouse.getEventX(), Mouse.getEventY(), Mouse.getEventDWheel(), and Mouse.getEventButton(). All of them cause the game to behave differently at each run if they return different values at each run. So we need an easy and lightweight way to make them behave exactly the same as a previous run. This is what the Deterministic utility is for. It’s an abstract class, with the important methods seen below:

public boolean log(boolean b)
public byte log(byte b)
public char log(char c)
public int log(int i)
public long log(long l)
public float log(float f)
public Object log(Object o)

The game instantiates either a logging instance (SaveDeterministic) or a replaying instance (LoadDeterministic) depending on a startup flag or whatever fancies you. Each and every input source is then fed to one of the overloaded log methods before being used, as done for the LWJGL Mouse loop:

...

boolean replay;
Deterministic deterministic;
if (replay)
deterministic = new LoadDeterministic("replay.log");
else
deterministic = new SaveDeterministic("replay.log");

...

Mouse.poll();
while (deterministic.log(Mouse.next())) {
int x = deterministic.log(Mouse.getEventX());
int y = deterministic.log(Mouse.getEventY());
int dz = deterministic.log(Mouse.getEventDWheel());
int button = deterministic.log(Mouse.getEventButton());
updateGameState(accum_x, accum_y, accum_dz, button);
}

As you can see, it’s a pretty straightforward transformation and that’s really all there is to making Mouse input deterministic. Run the transformed LWJGL space invaders example game with ‘ant run’ and then replay it with ‘ant replay’ to see it in action. Even though the game is simple in terms of simulation complexity, it is important to point out that the implementation complexity of replays depends only on the complexity of the input sources, not on the game complexity. So it is just as easy to implement replays in space invaders as it is for Tribal Trouble, since both use just the LWJGL Keyboard and the Mouse for input.

So how does it work?

There are two kinds of game runs: a normal run and the replay run. With a normal run, a SaveDeterministic instance is used and a log method looks like this:

int log(int i) {
ensureSpace(4);
buffer.putInt(i);
return i;
}

which simply logs the given value to a buffer (which is regularly flushed to the log) and return it. So for a normal run, the log methods are a NOP as far as the rest of the game is concerned. When replaying, however, the log methods change:

int log(int i) {
ensureSpace(4);
return buffer.getInt();
}

they ignore the passed argument and return the next value in the buffer (which is regularly read from the log). So for a replay run, all input values are effectively ignored and replaced with the previously saved input values. Simple, but effective.

Checksums

To make sure you haven’t missed any input sources, you’d want to add checksumming of the game state. To implement checksums, you’ll need a computeChecksum() method that computes a checksum from the game state:

int checksum = computeChecksum();
int logged_checksum = deterministic.log(checksum);
if (logged_checksum != checksum)
throw new IllegalStateException("Checksum mismatch: " +
checksum + " != " + logged_checksum);

Reducing log size

If we stopped here and used the Deterministic classes for a game, we’d quickly run low on disk space: The game loop runs at about 60 Hz and the game therefore logs input return values 60 times a second, even if the input devices are not touched.

To slow the log growing proportional to the number game loop iterations, we introduce default input values. Each primitive input type has a default value: 0 for byte, char, short, int, long, float and double and false for booleans. If a log invocation detects that the logged value matches the default, no value is logged but an internal counter is incremented. When a non-default value is later logged, the counter is first written to the log, indicating the number of default values preceding the non-default value. This way the log grows very slowly over time, at the expense of some overhead for each non-default value.

Input sources

A game can receive input from other sources than the Mouse or Keyboard, of course, and every source has to be dealt with to make the game replay correctly. If even one source of input is missed, the replay will differ from the previous normal run. Here are the most common types of input:

Other Input Devices

JInput is a common source of input other than keyboards and mice, and can be replayed almost as easy as in the Mouse example above. Look out for subtle input sources, like the length and contents of the Controller[] array returned from ControllerEnvironment.getControllers(). This is a sample controller setup loop with replay support:

ControllerEnvironment ce = ControllerEnvironment.getDefaultEnvironment();

Controller[] controllers = ce.getControllers();

for (int i = 0; i < deterministic.log(controllers.length); i++) {
Controller controller = replay ? null : controllers[i];
String name = replay ? null : controller.getName();
System.out.println(deterministic.log(name));

if (deterministic.log(controller instanceof Mouse)) {
devices.add(new MouseDevice((Mouse)controller));
}
if (deterministic.log(controller instanceof Keyboard)) {
devices.add(new KeyboardDevice((Keyboard)controller));
}
if (deterministic.log(replay || controller.getType() == Controller.Type.STICK)) {
devices.add(new StickDevice((AbstractController)controller));
}
}

where we use the “replay” property to hack around the problem that the Controller[] array might not have the same length and content as the last run. The StickDevice, MouseDevice and KeyboardDevice classes need to check for null accordingly. Here’s a polling loop for StickDevice:

if (stick != null)
stick.poll();
int x = 0;
int y = 0;
while (deterministic.log(stick == null || stick.getEventQueue().getNextEvent(event))) {
if (deterministic.log(stick == null || event.getComponent().getIdentifier()
== Component.Identifier.Axis.X)) {
x = deterministic.log(event.getValue());
} else if (deterministic.log(stick == null || event.getComponent().getIdentifier()
== Component.Identifier.Axis.Y)) {
y = deterministic.log(event.getValue());
}
}

Exceptions

The input methods can throw exceptions, which can alter the game behavior. The simplest way is to avoid exceptions in input methods altogether, or return them as regular return values and log them through the log(Object o) method.

Environment settings

Common sources include the Preferences and System.getProperty APIs. Use the log(Object o) to “clean” the settings before using them. Note that logged Object return values cannot be compared with the == operator since they are serialized.

Network

Networking generally return large chunks of data in a somewhat higher rate
than human-controlled input devices. We haven’t used the replay tools for
networking yet (Tribal Trouble used an older and much less elegant replay tool), but Deterministic will probably need direct support for arrays without the overhead per primitive value of a naive implementation of loggin each element separately.

Floating point

The JVM spec allows for different results from floating computations on different FPU types and/or JVM version. This can only be completely avoided by using strictfp or avoiding floating point game state altogether. However, if you’re only replaying logs from your own machine with the same JVM, chances are that the floating point results are replayable - but be careful.

hashCode()

An obscure source of input is the default hashCode() method defined on all objects which returns potentially different values at each run. You’ll need to watch out for code that depends on the hash codes. For example, the HashMap iterators can return its elements in arbitrary order and the order can vary for each run, because of the changing hash codes. If you need a deterministic hash table, use LinkedHashMap instead.

Threading

In general, you’d want to avoid multiple threads in the game loop, because the thread scheduling is not deterministic. Threads should only be used for music queueing, loading game resources and similar tasks that don’t change game behaviour.

Time

Time can be an input source, typically if you implement a frame rate independent game loop. A typical loop looks like this:

long current_time = System.currentTimeMillis();
long elapsed_time = current_time - last_frame_time;
last_frame_time = current_time;
while (elapsed_time >= MILLIS_PER_TICK) {
tick();
}
render();

Since you can’t effectively log the elapsed_time values, you need to avoid updating any game state and reading any input in render(). It must be a “read-only” function that draws pretty stuff on the screen.

Another source depending indirectly on time is the default Random constructor or the Math.random() method. Make sure you avoid Math.random and use the manually seeded Random constructor. You can use time to seed if you remember to log it first.

Sample files

Download deterministic.zip

This archive contains the replay tool and the space invaders game adapted to support replaying.

3 Responses to “How to save yourself from Debugging Hell: Deterministic games”

  1. Edde Says:

    Outstanding article! I’ll have to digest all of that later, but from what I read so far it was some good food for thought! Thanks!

  2. Edde Says:

    Is this to say that the AI always behaves the same way when replayed? How many threads are running during game play? Thanks!

  3. Froilan Squirtle Says:

    Hey that article was pretty good, simple, and easy to understand!
    Thanks for that, and please write more of those ;)

Leave a Reply