Minimax Search [Knowledge]

Let’s say your program is in a competitive situation against others and must predict the possible consequences of its actions in order to select the one with the best outcome. This is the case in many logic games such as Tic-Tac-Toe, Othello, Chess or Go. The minimax algorithm can be applied here as long as:

  • The current state is (at least partly) available to the program. For example, the program knows the position of the pieces on the game board.
  • It’s possible to search through successor states in the future. The program knows how the pieces end up when a move is made.
  • There’s a way to evaluate the utility of certain states. For instance, the program knows the winning/losing conditions.

Since this is a competitive setting, you can assume that the others pick the actions that result in the worst outcome for your program. Minimax then works by taking the predicted utilities of future states and working back to determine the utility of their predecessor states. Once your program has estimated the utility of the next states, it can easily select the best one.
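To make these requirements concrete, here is a minimal sketch of the interface a game state could expose so that minimax can be applied. The class and method names (GameState, get_legal_moves, apply_move, is_terminal, utility) are illustrative assumptions for this post, not part of any particular library.

# Hypothetical interface sketch: any game exposing these three capabilities
# (observable state, successor generation, utility of known outcomes) can be
# searched with minimax. All names here are illustrative only.
class GameState:
    def get_legal_moves(self):
        """Moves available in the current (observable) state."""
        raise NotImplementedError

    def apply_move(self, move):
        """Successor state reached by playing `move`."""
        raise NotImplementedError

    def is_terminal(self):
        """True once the game is over, i.e. win/loss/draw is decided."""
        raise NotImplementedError

    def utility(self):
        """Utility of a terminal state from your program's point of view,
        e.g. +1 for a win, 0 for a draw, -1 for a loss."""
        raise NotImplementedError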

Description

Minimax is a policy for selecting the action with optimal utility while assuming the worst from the other programs. Typically, minimax is combined with a depth-first search algorithm that traverses the game tree, predicting what the opponents are likely to do and what your program should do in response. As such, there are two different kinds of levels in the search tree: 1) for the opponents and 2) for team members.

  1. The levels in the game tree dealing with the opponents are known as MIN levels, since the opponents will typically minimize the utility of the outcome state.
  2. Conversely, the levels of the search tree dealing with team members are known as MAX levels, since they will typically maximize the utility of the outcome state.

The minimax search, in effect, minimizes the maximum possible loss, or looking at it the other way, maximizes the minimum gain.

[Figure: Min/Max search in a game tree]
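To illustrate how utilities are backed up through alternating MIN and MAX levels, the sketch below hand-builds a tiny two-ply tree and propagates the leaf utilities to the root. The nested-list representation of the tree is an assumption made purely for this example.

# Toy illustration of MIN/MAX backup on a hand-built two-ply tree.
# Leaves hold utilities from the MAX player's point of view; inner nodes
# are plain lists of children (a representation chosen only for this demo).
def backup(node, maximizing):
    if not isinstance(node, list):   # leaf: utility is already known
        return node
    values = [backup(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# MAX moves at the root, MIN moves at the middle level:
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(backup(tree, maximizing=True))  # MIN backs up 3, 2, 2; MAX then picks 3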

Application

Minimax has been applied to many games with great success, especially when the branching factor is small (i.e. there are few options each turn), the situation is deterministic, and the state is fully observable.

However, in practice there are significant challenges in applying minimax to a logic game. Notably, the game tree is usually far too large to search all the way to the end, so good estimator functions are required for arbitrary states (and not just terminal ones). This is not a trivial problem, and it significantly affects the strength of the resulting player. Most state-evaluation functions are hand-written for each problem by domain experts.
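As an example of such a hand-written estimator, the sketch below scores a non-terminal Tic-Tac-Toe position by counting the lines still open for each player. The board encoding and the simple open-line weighting are assumptions chosen for illustration, not a recommendation for serious play.

# A hypothetical hand-written heuristic for Tic-Tac-Toe: count the lines
# (rows, columns, diagonals) that each player can still complete.
# The board is a list of 9 cells containing 'X', 'O' or None; 'X' is MAX.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def evaluate(board):
    score = 0
    for a, b, c in LINES:
        cells = (board[a], board[b], board[c])
        if 'O' not in cells:   # line still winnable by 'X'
            score += 1
        if 'X' not in cells:   # line still winnable by 'O'
            score -= 1
    return score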

The pseudo-code for a minimax search looks like this:

def minimax(node, depth):
    if node.is_terminal() or depth == 0:
        return node.calculate_heuristic()
    if node.is_opponent():                    # MIN level
        a = float("inf")
        for child in node.get_children():
            a = min(a, minimax(child, depth - 1))
    else:                                     # MAX level
        a = float("-inf")
        for child in node.get_children():
            a = max(a, minimax(child, depth - 1))
    return a

On the bright side, alpha-beta pruning can be implemented easily to improve the efficiency of the search.
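A possible sketch of how the function above could be extended with alpha-beta pruning is shown below, assuming the same hypothetical node interface. It simply stops expanding children once the current branch can no longer influence the decision at the root.

def alphabeta(node, depth, alpha=float("-inf"), beta=float("inf")):
    # alpha/beta are the values already guaranteed to MAX and MIN
    # elsewhere in the tree; a branch outside [alpha, beta] is irrelevant.
    if node.is_terminal() or depth == 0:
        return node.calculate_heuristic()
    if node.is_opponent():                    # MIN level
        value = float("inf")
        for child in node.get_children():
            value = min(value, alphabeta(child, depth - 1, alpha, beta))
            beta = min(beta, value)
            if beta <= alpha:                 # MAX would never allow this branch
                break
        return value
    else:                                     # MAX level
        value = float("-inf")
        for child in node.get_children():
            value = max(value, alphabeta(child, depth - 1, alpha, beta))
            alpha = max(alpha, value)
            if beta <= alpha:                 # MIN would never allow this branch
                break
        return value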

Theory

In classical statistical decision theory, an estimator \delta is used to estimate a parameter \theta \in \Theta. Assume also a risk function R(\theta,\delta), usually defined as the expected value of a loss function. Given this framework, \tilde{\delta} is called “minimax” if it satisfies:

\sup_\theta R(\theta,\tilde{\delta}) = \inf_\delta \sup_\theta R(\theta,\delta)
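
A classical worked example (stated here without proof) is estimating a binomial proportion under squared-error loss: for X \sim \mathrm{Binomial}(n, \theta), the estimator

\tilde{\delta}(X) = \frac{X + \sqrt{n}/2}{n + \sqrt{n}}, \qquad R(\theta,\tilde{\delta}) = \frac{1}{4(1+\sqrt{n})^2} \text{ for every } \theta

has constant risk and is also the Bayes estimator for a \mathrm{Beta}(\sqrt{n}/2, \sqrt{n}/2) prior, which makes it minimax. The definition mirrors the game setting above: nature adversarially chooses the parameter \theta to maximize the risk (the supremum), while the statistician chooses the estimator \delta that minimizes that worst-case risk (the infimum).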
