OrangeBeta Blog - The awesome AI

As hinted in the previous post - the variations in Annexation board types (and other things that make Annexation unique) requires the game-playing AI to be very general. There are several interesting algorithms and strategies I've programmed before settling on the current implementation.

Random AI

This one is the easiest to do. The algorithm choses a random legal move. It's a great algorithm to implement at first to test how your AI is called/used in the game-loop (and a quick test if your genius AI algorithms can actually beat a random player).

Greedy AI

A sensible, if completely naive strategy. The algorithm choses a color that will increase the number of annexed cells the most. This results in very compact play-style which is too slow.

greedy ai

Fractal AI Another interesting idea is to chose a color, that will increase the number of your neighbors. It works surprisingly well as it works to spread quickly (if aimlessly) across the board. It also leads to end-game situations where the algorithm is reluctant to take profit. Any color that would increment the score significantly would also decrement the neighbors count by the same large margin.

$fractal ai$

MinMax AI

The main issue with both previous approaches is that they have a limited look-ahead. Any situation where a "slow" move leads to a subsequent "bigger" move is overlooked. And this is exactly what MinMax algorithm does. It allows you to use the same heuristic to determine the move/variation value as before (either annexed or neighbors cells count) but apply it after you simulate various possible moves by both players.

heuristic is a shortcut (a rule of thumb) that produces a solution that might not be optimal, but is practical in terms of available resources (eg. time).

The principle of MinMax algorithm is simple. Expand the game tree to the desired depth. This means playing all the legal moves. Then playing all the possible replies to all those moves. Then playing all the possible replies to all those replies and so on.

plies

Once you expand the game to the desired depth (ply) you can apply a heuristic to determine the value of the position for the player to move on this ply (This is usually the player that will play the next move in the game and the player that chooses max value).

heuristics

Next we back-propagate this result to the previous ply always choosing the maximum value from all the possible, legal moves. The idea here is that this player will play the highest scoring move out of all the moves he can play.

max

Next we back-propagate this result to the previous ply always choosing the minimum value from all the possible, legal moves. The idea here is that the opposing player will chose the lowest scoring variation out of all the possible variations he can play (Your lower score is his higher score).

min

Continuing in this way to the first ply we will have a heuristic based score for each possible move, based on all the possible variations of moves and replies up to the desired level. The only thing left to do is play the highest scoring move.

This algorithm will play better the deeper we expand the tree and the better heuristic we use. If we fully expand the tree and reach the end on all variations and use actual game scoring as a heuristic, we will play perfectly. This is possible for some games, but rarely for the interesting ones.

In Annexation the median number of possible moves is 6 (there's 8 colors, but you cannot choose own color or opponents color). So the number of possible positions (or nodes in our MinMax graph) for the ply lookahead is listed in this table:

ply	pos
1	6
2	36
3	216
4	1296
5	7776
6	46656
7	279936

To play out all the moves on a small annexation board (around 200 cells) between good players, it takes about 20 moves. That's 3.65 * 10^15 possible positions. This is clearly too many positions to consider.

MinMax with AlphaBeta pruning AI

There's a simple way to improve the basic MinMax algorithm by discarding non-promising variations early. This means we can reduce the number of possible positions to calculate. Let's consider the previous example and pause at a certain point in AI calculation.

ab pruning

At this point the Min player's "best" option so far is to play the move with the score 5. In the move under consideration, his Max opponent has an option to reach a game state with the score of 7. At this point the Min player already knows he will not choose this variation.

No matter what other possible move Max player can choose (the question mark) - 7 is already worse then 5 for the Min player. So the rest of the calculations for this variation does not need to be considered (beta pruning).

There's a corelating line of thought for the Max player as well. He may discard variations that result in at least one sub-variation for the Min player, that result in a lower score than the one he can achieve by playing some other move (alpha pruning).

With this approach it is very important to first try out more promising variations, before considering less promising ones. If we first find the high scoring moves, we can discard subsequent variations early. If we first find the low scoring moves, then subsequent variations (being better) will not be pruned.

Even so - the search tree for Annexation is still too large to search exhaustively.

The AI used in Annexation is able to overcome this difficulty and be 90% confident of a win very early in the game (with five moves being played by each player).

move 5

It is using the Monte Carlo Tree Search algorithm. Which I will describe in the next post.