Game Theory – MiniMax

By Kevin Packard                        twitter@kevinpackard

Making the Three Straight Game Engine

Three Straight can not succeed without a single-player mode.  I need a Three Straight Game Engine, an artificial intelligence opponent that can play a mean game of Three Straight.

If Design is the skin, then the Game Engine is the brain.  Last month, nearly one year to the day after prototyping Three Straight, I started work on a Three Straight game engine.

 

First, some background

As a freshman in high school, I wrote a Tic-Tac-Toe game on the Apple ][, in BASIC.  The logic was a combination of special-cases and brute-force searching for a win.  It sometimes won, never lost, and definitely wasn’t pretty.

As a sophomore in college, I wrote Othello (aka Reversi) for the Macintosh.  An Othello game is over in 60 moves.  60 moves didn’t sound like much, and I naively thought the computer could brute-force search all possible moves and select the best one.  So I coded up the game engine, and ran it.  Nothing happened.  A bug?  No.  It was thinking.  And thinking.  And thinking…  Turns out that evaluating all possible Othello moves in a reasonable time was not going to happen on a circa 1986 Macintosh.  So with 19-year-old gusto and naiveté, I started hacking.  I tried to get the computer to play Othello like I did – grab the corners, build edges, avoid positions next to empty corners, grab positions two-away from corner spots…  But in truth, I’m not a very good Othello player.  And my program was even worse.  It played like a drunken Frankenstein version of myself, and was easily beaten by nearly everyone.  Disappointing.

 

No hacking this time

I approached the Three Straight game engine with more deliberation.  With science!  Okay, I starting by hitting The Google.

There’s an amazing amount of information available on game theory.  A Yale course on Game Theory looked promising, and it was educational and entertaining, but it failed to provide specifics.  I also found many Othello and tic-tac-toe game engine source code repositories.  But none of these helped.

 

MiniMax

I narrowed the search to Wikipedia, and finally found what I was looking for.  The MiniMax algorithm.

function integer minimax(node, depth)
    if node is a terminal node or depth <= 0:
        return the heuristic value of node
    α = -∞
    for child in node:      # evaluation is identical for both players 
        α = max(α, -minimax(child, depth-1))
    return α

(from Wikipedia)

MimiMax is ingeniously simple.  When it’s the computer’s turn to move, the Game Engine generates a score for each possible move, then selects the move with the highest score.  To score a particular move, the Engine first looks to see if that move is a win or a loss.  A win is scored as ∞ and a loss is scored as -∞.

A move that’s neither a win nor a loss is scored by evaluating the opponent’s possible moves in response to the computer’s move.  To do this, the Engine temporarily makes the evaluation move, then “switches sides” and selects the best move for the opponent.  Evaluating an opponent’s best move is identical to evaluating the computer’s best move: a win is ∞, a loss is -∞, and all other moves are scored by once again switching sides, back to the computer this time, and evaluating the next set of moves.

In other words, the Minimax algorithm works by looking ahead: “if I move here, my opponent will move there, and then I will move here, …”

Finally, there must be a limit to how far ahead the Engine will look.  Minimax limits the number of look-aheads by setting a maximum search depth.  That is, the Engine will only look ahead so many turns.  If it reaches the “look ahead” limit  and does not find a win or a loss, the Engine must generate a score for the board based solely on the state of the board.  This is the tricky part: how to assign a numeric score to a Three Straight board that is neither a win nor a loss?

The Minimax algorithm itself does not solve this problem.  Minimax is a generic algorithm, that can be applied to Tic-Tac-Toe, Othello, chess, or any turn-based game.  The key to making Minimax work for Three Straight is figuring out how to numerically score a move, given only a snapshot of the board and the sequence of moves that brought the game to this point.  No more looking ahead, because the Engine has already looked as far as it’s allowed.

Before getting into the solution, let’s take a look at generic Minimax in action.  It’s essential to truly understand MiniMax, before thinking about how to numerically score a Three Straight board.

 

MiniMax in Action

Here’s an example of an imaginary turn-based game.  In this imaginary game, each turn has only 1 or 2 possible moves.  The computer’s moves are represented by squares, and his opponent’s moves are circles.  We’ve limited Minimax to a 4 move “look-ahead”.

Here, the opponent has just moved (level 0), and it’s the computer’s turn.  The computer has two possible moves (level 1).  MiniMax will evaluate them both, and choose the move with the largest value.

The MiniMax algorithm at work (image from Wikipedia).

Starting at level 1, MiniMax evaluates the leftmost move first.  It’s neither a win nor a loss, so MiniMax must determine a score.  He has not reached the look-ahead limit of 4, so he’ll evaluate this move by switching sides, and score each of his opponent’s two possible moves (the left two nodes of level 2).  When they are both scored, he’ll select the smallest.

To score the two opponent moves at level 2, MiniMax evaluates the leftmost first.  It’s neither a win nor a loss.  The look-ahead limit of 4 has not been reached, so MiniMax evaluates his own possible moves (the left two nodes of level 3), leftmost first.

The leftmost node on level 3 is neither a win nor a loss, so MiniMax evaluates his opponent’s moves (left two nodes of level 4), leftmost first.  The leftmost node at level 4 is neither a win nor a loss, but the “look-ahead” limit has been reached.  The imaginary board must be evaluated in place.  In this example, MiniMax examines the board and assigns a score of 10.  Remember, that’s the hard part, but this is an imaginary example, so just play along.  MiniMax has now scored the leftmost move at level 4 with a value of 10.

MiniMax evaluates at the second level-4 move and detects a win, scoring this move with a value of ∞.  With both level-4 moves scored, MiniMax can choose the score for preceding the level-3 move.  Because he’s evaluating his opponent’s move, MiniMax selects smaller score, a 10.  He now has a score for the first level-3 move: a 10.

The second level-3 move has only one possible level-4 move, which MiniMax scores as 5.   The level-3 move inherits the smallest score of his children, or child in this case, so the level-3 move is scored as 5.

With both level-3 moves scored at 10 and 5, MiniMax scores the level-2 move with the larger of the two level-3 scores: 10.  He chooses the larger this time, not the smaller, because he’s evaluating one of his own future moves, not one of his opponent’s.

This process continues, scoring level-4 moves based on state of the game, and scoring higher level moves with the largest value for a computer turn, or the smallest value move for an opponent turn.  Scores percolate up from the bottom of the search tree, until finally, at the top of the tree, the computer’s move is apparent.

Then the computer makes his move for real, and it’s his opponent’s move.

Ingenious and simple.  Except for the one remaining problem: how to evaluate a Three Straight board that’s neither a win nor a loss?

 

Jumping the Shark

At this point I began to over-think the problem.  What is the numeric value if a Three Straight board?  I sketched out patterns that set up wins – the “wedge”, the “picnic bench”, the “lightning bolt”, the “rose” – names by brother and I came up with, playing as kids.  I iterated through opponent marble positions for these patterns, and began assigning a range of values, depending on blocking positions.  I generated yet another set of values for these patterns when they occurred on a board edge.  In short, I started designing a system that played like me.  And frankly, I’m not that good.  My father and brother regularly beat me.

Egads!  I was repeating my Othello game engine mistake!

But this time I recognized it.

 

Jumping Back

So, I postponed the problem.  I went with a very simplistic evaluation: a win is ∞, a loss is -∞, and everything else is 0.

This decision let me move ahead.  Time to write some code!  After three days, much to my delight, the very simplistic evaluation turned out to be enough for a very smart Three Straight game engine.  Because, you see, a simple evaluation executes quickly.  The quicker the evaluation, the more boards that can be evaluated.  The more boards that can be evaluated, the deeper the search.  The deeper the search, the smarter the Engine.

 

Cruel Engine, Taunting Engine

Actually, I had to make one minor tweak.  MiniMax doesn’t specify which move to choose when there are tying high scores.  What if there are two winning moves?  I decided to randomly choose one.

This led to an interesting effect.  Let’s say the Engine sees two moves that are guaranteed wins.  One win is one move away.  The other win is guaranteed, but it’s 3 moves away.  MiniMax evaluates both moves as ∞, because both are guaranteed wins.  A human would always choose the immediate win, but MiniMax sees them as equal, and chooses one at random.  And if MiniMax happens to choose the 3-move win over the obvious and immediate win, then suddenly the Engine seems cruel, as if it’s taunting its human opponent, because it’s prolonging the agony of defeat.

This was not the desired effect.  For a fix, I made a small tweak.  Instead of assigning ∞ and -∞ for wins and losses, I assigned +1,000,000,000 and -1,000,000,000.  Then I “aged” the scores as they moved up the tree, by multiplying them by 0.9 at each level.  This changed the evaluation so that an immediate win is scored as +1,000,000,000, but a guaranteed win that’s 3 moves in the future is scored as +1,000,000,000 x 0.9 x 0.9 = 810,000,000.  Now, the Engine selects the immediate win over the guaranteed future win.  It now kills rapidly, and with purpose. Which somehow makes it kinder.

 

Satisfaction

I have to say, it’s extremely satisfying to have a smart, functional Three Straight game engine.  It’s come alive, and it has personality.  Level 2 (look ahead 2 moves) is perfect for beginners.  At Level 4, it plays a pretty good game, but can still be tricked.  At Level 7, it only looses when its opponent starts first and makes no mistakes.  At Level 8, it does not loose.  Ever.  These translate directly to the four levels of game play: Beginner, Intermediate, Difficult, and Inhuman.

For those interested the code itself, I’ll cover that in the next post.