OrangeBeta Blog - The awesome GPU

We explored some game playing algorithms in the this post and MCTS in particular in the last post. The first implementation of MCTS for Annexation was not that hard. The way the game code is structured, the game-playing engine is very much separated from the UI or the game-loop itself. The search part of the algorithm simply used the existing Game engine to do its playouts.

The results were not very encouraging. The engine managed about 1.500 playouts per second. With various optimizations this number went up to 3.000 playouts per second. This was tested on a small hexagon board which has around 200 cells on average. On the largest board which averages to 1000 cells, this went down to 400 playouts per second.

For MCTS to be useful in a mobile game - these calculations had to be orders of magnitude faster. So the next step was to try to perform these calculations on the GPU. GPU's are great at performing simple calculations, massively parallel. So the challenge is to turn Annexation rules into simple calculations that can be programed in a shader.

Let's recount the basic rules of the game:

Players take turns choosing colors.
All player owned cells change their color to the new color.
All neighboring cells of chosen color become player owned cells.
The player that owns more then half the cells wins the game.

By annexing cells, the players grow and acquire new neighbors, slowly taking over the board.

GPU Calculations

There are three important aspects to doing calculations on the GPU:

To execute calculations on the GPU, we need to encode all the information required for this calculation into a form that GPU understands: a collection of points or a collection of pixels. And the result of these calculations will be encoded in points or pixels as well. In my case I decided to use pixels for both.
Calculations on the GPU are massively parallel. So you need to adjust the way you perceive algorithm efficiency. It's often faster to recalculate some information than to store it and look it up when needed (as you would normally do in CPU based calculations).
It is very beneficial to think about the GPU calculations from the result backwards.

How do we represent the game result? To represent the end result of the game, we must encode information on cells and which player owns which. We can represent this as a line of pixels - each pixel representing a cell. The x coordinate of the pixel is the cell id, the color of the pixel represents ownership. Red pixel means owned by player 1, blue pixel means owned by player 2, white pixel means not owned by any player. To get the result of the game we can simply sum the number of red, blue and white pixels in this row. We'll call this the state texture

state encoding

How do we calculate the result of a move? Looking at a particular cell, with player making a move (choosing a color), how can we calculate if the cell is annexed or not?

The cell has to be not owned by any player.
The cell must be of the color player chose for his move.
The cell must be a neighbor of player (of a player owned cell).

So condition 1. is easy - the pixel representing the cell in the state texture must be white. For condition 2. we need to encode the information of the cell color in another texture. Again the x coordinate of the pixel will be the cell id, the color of the pixel will represent the color of the cell. We'll call this the board texture. For condition 3. we need to somehow represent the board state - the cells and their neighbors.

How do we represent board state? We already discussed how the first row of the board texture encodes the information on cell color. We can use the other rows to encode neighbor cell ids, for each cell. So x coordinate represents the cell id. The color of cells in rows 1 to row N, represents the cell id of cell neighbors.

board encoding

How do we represent numbers as colors? A pixel color is usually encoded using 4 values (representing RGBA - Red, Green, Blue and Alpha channels). A simple way to encode a number would then be to use these channels as positional notation of a N-based (in our case a 256-based) numeral system.

We must send the state texture and board texture containing the information of the current state of the game to the shader. While the board will remain unchanged, the state will be changed each turn. The simplest way to achieve this is having two state textures. We can then read from the "first" texture and write to the "second" texture. Next turn we can read from the "second" texture and write to the "first" texture.

We can now put this all together in a few steps the pixel shader will perform on every pixel:

Calculate which player is on the move.

We're supplying the shader with the current move number.

Calculate a random color move (in essence choose a number between 0 and 7).

We're using a very 'light' playout approach here. We're actually not even checking if the chosen color is a valid move! We must simply make sure the shader calculates the same random color for every pixel in the current turn (frame).

Check if current cell (pixel) is annexed by looking it up in the state texture.

is annexed

Check if current cell is of color (the random color from step 1.) by looking it up in the board texture.

is annexed

Check if current cell is next to a player - ie. next to a cell that is owned by a player. First we lookup cells neighbors in the board texture. For each neighbor we look up his owner in the state texture to check if any of these neighbors is owned by the player making the move.

is next to

If the conditions from step 3., 4. and 5. are met, the cell is annexed by drawing a correctly colored pixel (red for player 1 and blue for player 2) in the "new" state texture.

annex cell

Repeat steps 1 to 6 enough times that the game settles (all cells are taken by players).
Count the number of red and blue pixels to determine the winner.

Important to note

We have constructed this algorithm backwards, starting from the desired result.
We are paying random colors. This would be very inefficient in CPU based playouts as we would ideally play a valid move (even a sensible move).
We're not checking if a player has won until we played enough turns (frames) that we're reasonably sure the game has ended. Playing extra turns is faster then taking the time to read the data from the GPU on each turn.
The pixel shader will be applied to pixels in a massively parallel manor. A modern GPU has 4k+ shader processors! This means we need to make sure each pixel color can be calculated in isolation from other calculations. Pixels will not be calculated in any predictable order.

While it may take about 200 frames to play a game (using random moves) to a statistically stable end-state, we represented a game as a row of pixels in the state texture. There's no reason not to use multiple rows of pixels each representing a different variation of a game! We only need to make sure that our random move (color) algorithm chooses the same color for every pixel in the same row, while choosing a different random color for each row. So instead of rendering 200 frames for a single playout, we can render 200 frames to calculate 1000 playouts at once.

Here's an example of playouts of a game played on a very small board (7 cells in total).

8 playouts

Each frame is a turn. Each row is a game playout. In the first turn, the red player annexed an additional cell in playouts 3,4 and 8. In turn 2, the blue player annexed a cell only in playout 5. By turn 36 the game is almost over in all playouts. After 60 turns we count the cells in each row to determine the winner.

Row	Red	Blue	Winner
1	2	5	Blue
2	4	3	Red
3	5	2	Red
4	2	5	Blue
5	3	4	Blue
6	3	4	Blue
7	5	2	Red
8	3	4	Blue

So blue player wins 5 out of 8 playouts.

Takeaways

Programing in shaders requires reversing the thought process of constructing algorithms.
Each result/pixel is processed in parallel so we cannot rely on other results/pixels in the same calculation.
Repeated recalculation is often faster then sharing data.

At this point in 2021, even mobile phones GPU have 1000 shader processors. This means that offloading some processing on GPU can be very beneficial provided the calculation can be simplified and parallelized.

For Annexation players this means that they can play against a very strong opponent which does not take ages to reply. For me it meant I had a wonderful Adventure in WebGl