Engineering Notes

Engineering Notes

Thoughts and Ideas on AI by Muthukrishnan

Minimax and Adversarial Search for Decision-Making in Competitive Environments

13 Oct 2025

Introduction

The minimax algorithm is a decision-making procedure for two-player zero-sum games, where one player’s gain is the other’s loss. It assumes both players play optimally and works backward from game endings to determine the best current move.

Minimax performs a depth-first search through the game tree, alternating between maximizing and minimizing layers. At terminal nodes, it evaluates game states using a heuristic function. The algorithm propagates these values upward: the maximizing player chooses the highest-value child, the minimizing player chooses the lowest. This creates a minimax value for each node representing the game outcome under optimal play.

Historical & Theoretical Context

The minimax algorithm emerged from game theory, pioneered by John von Neumann and Oskar Morgenstern in their 1944 book “Theory of Games and Economic Behavior.” Von Neumann proved the minimax theorem in 1928, establishing that in zero-sum games, there exists an optimal mixed strategy.

The algorithmic application to computer game-playing began in the 1950s:

Minimax rests on several key assumptions:

  1. Perfect Information: both players see the complete game state
  2. Determinism: no randomness in game mechanics
  3. Zero-Sum: one player’s utility equals the negative of the other’s
  4. Rationality: both players choose optimally to maximize their outcome

The algorithm implements backward induction from game theory, solving the game from end states backward to determine current optimal play.

The Algorithm

Minimax Pseudocode

def minimax(node, depth, maximizing_player):
    """
    node: current game state
    depth: how many moves to look ahead
    maximizing_player: True if current player wants to maximize score
    """
    # Base case: reached terminal state or depth limit
    if depth == 0 or node.is_terminal():
        return node.evaluate()

    if maximizing_player:
        max_eval = -infinity
        for child in node.get_children():
            eval = minimax(child, depth - 1, False)
            max_eval = max(max_eval, eval)
        return max_eval
    else:
        min_eval = +infinity
        for child in node.get_children():
            eval = minimax(child, depth - 1, True)
            min_eval = min(min_eval, eval)
        return min_eval

Alpha-Beta Pruning

Alpha-beta pruning optimizes minimax by eliminating branches that cannot influence the final decision:

def minimax_alpha_beta(node, depth, alpha, beta, maximizing_player):
    """
    alpha: best value maximizer can guarantee
    beta: best value minimizer can guarantee
    """
    if depth == 0 or node.is_terminal():
        return node.evaluate()

    if maximizing_player:
        max_eval = -infinity
        for child in node.get_children():
            eval = minimax_alpha_beta(child, depth - 1, alpha, beta, False)
            max_eval = max(max_eval, eval)
            alpha = max(alpha, eval)
            if beta <= alpha:
                break  # Beta cutoff
        return max_eval
    else:
        min_eval = +infinity
        for child in node.get_children():
            eval = minimax_alpha_beta(child, depth - 1, alpha, beta, True)
            min_eval = min(min_eval, eval)
            beta = min(beta, eval)
            if beta <= alpha:
                break  # Alpha cutoff
        return min_eval

Complexity: Basic minimax has O(b^d) time complexity where b is branching factor and d is depth. Alpha-beta pruning reduces this to O(b^(d/2)) in the best case with optimal move ordering.

Design Patterns & Architecture

Integration with Agent Architectures

Minimax fits into the deliberative agent paradigm:

┌─────────────────────────────────────┐
│         Agent Architecture          │
├─────────────────────────────────────┤
│  Perception                         │
│    └─> State Representation         │
│                                     │
│  Decision-Making (Minimax)          │
│    ├─> Game Tree Generation         │
│    ├─> State Evaluation             │
│    └─> Best Move Selection          │
│                                     │
│  Execution                          │
│    └─> Action Application           │
└─────────────────────────────────────┘

Common Patterns

  1. Iterative Deepening: Start with shallow search, progressively deepen until time limit
  2. Transposition Tables: Cache evaluated positions (memoization for game trees)
  3. Move Ordering: Evaluate promising moves first for better pruning
  4. Quiescence Search: Continue search at volatile positions to avoid horizon effect

Practical Implementation

Tic-Tac-Toe Example

from typing import List, Tuple, Optional
import math

class TicTacToe:
    """Simple Tic-Tac-Toe game with minimax AI"""

    def __init__(self):
        self.board = [[' ' for _ in range(3)] for _ in range(3)]
        self.current_player = 'X'

    def get_available_moves(self) -> List[Tuple[int, int]]:
        """Return list of empty positions"""
        return [(i, j) for i in range(3) for j in range(3)
                if self.board[i][j] == ' ']

    def make_move(self, row: int, col: int, player: str):
        """Place player's mark at position"""
        self.board[row][col] = player

    def undo_move(self, row: int, col: int):
        """Remove mark from position"""
        self.board[row][col] = ' '

    def check_winner(self) -> Optional[str]:
        """Return 'X', 'O', 'TIE', or None"""
        # Check rows, columns, diagonals
        for i in range(3):
            if self.board[i][0] == self.board[i][1] == self.board[i][2] != ' ':
                return self.board[i][0]
            if self.board[0][i] == self.board[1][i] == self.board[2][i] != ' ':
                return self.board[0][i]

        if self.board[0][0] == self.board[1][1] == self.board[2][2] != ' ':
            return self.board[0][0]
        if self.board[0][2] == self.board[1][1] == self.board[2][0] != ' ':
            return self.board[0][2]

        if not self.get_available_moves():
            return 'TIE'
        return None

    def evaluate(self) -> int:
        """Evaluate board state: +1 for X win, -1 for O win, 0 otherwise"""
        winner = self.check_winner()
        if winner == 'X':
            return 1
        elif winner == 'O':
            return -1
        return 0

    def minimax(self, depth: int, is_maximizing: bool,
                alpha: float = -math.inf, beta: float = math.inf) -> int:
        """Minimax with alpha-beta pruning"""
        winner = self.check_winner()
        if winner is not None:
            return self.evaluate()

        if is_maximizing:  # X's turn (maximizing player)
            max_eval = -math.inf
            for row, col in self.get_available_moves():
                self.make_move(row, col, 'X')
                eval_score = self.minimax(depth + 1, False, alpha, beta)
                self.undo_move(row, col)
                max_eval = max(max_eval, eval_score)
                alpha = max(alpha, eval_score)
                if beta <= alpha:
                    break
            return max_eval
        else:  # O's turn (minimizing player)
            min_eval = math.inf
            for row, col in self.get_available_moves():
                self.make_move(row, col, 'O')
                eval_score = self.minimax(depth + 1, True, alpha, beta)
                self.undo_move(row, col)
                min_eval = min(min_eval, eval_score)
                beta = min(beta, eval_score)
                if beta <= alpha:
                    break
            return min_eval

    def get_best_move(self, player: str) -> Tuple[int, int]:
        """Find optimal move using minimax"""
        best_move = None
        is_maximizing = (player == 'X')
        best_value = -math.inf if is_maximizing else math.inf

        for row, col in self.get_available_moves():
            self.make_move(row, col, player)
            move_value = self.minimax(0, not is_maximizing)
            self.undo_move(row, col)

            if is_maximizing and move_value > best_value:
                best_value = move_value
                best_move = (row, col)
            elif not is_maximizing and move_value < best_value:
                best_value = move_value
                best_move = (row, col)

        return best_move

# Usage
game = TicTacToe()
best_move = game.get_best_move('X')
print(f"Best move for X: {best_move}")

Integration with Modern Frameworks

In LangGraph or CrewAI, minimax can be a specialized decision node:

from langgraph.graph import StateGraph

class GameAgentState:
    board: dict
    current_player: str
    game_history: list

def minimax_decision_node(state: GameAgentState):
    """LangGraph node using minimax for game decisions"""
    game = TicTacToe()
    # Load state into game
    best_move = game.get_best_move(state.current_player)
    return {"action": best_move, "reasoning": "minimax_optimal"}

# Add to graph
graph = StateGraph(GameAgentState)
graph.add_node("minimax_decision", minimax_decision_node)

Latest Developments & Research

Modern Advances (2022-2025)

  1. Neural Network Evaluation Functions

    • AlphaZero (2017) and MuZero (2020) combine MCTS with deep learning
    • Stockfish NNUE (2020+) uses neural nets for position evaluation
    • Hybrid approaches outperform pure minimax by 400+ Elo in chess
  2. Partial Information Games

    • ReBeL (Brown & Sandholm, 2020) extends minimax to poker via counterfactual regret
    • DeepNash (2022) masters Stratego, a game with hidden information
  3. Real-Time Adversarial Search

    • StarCraft II agents use approximate minimax for tactical decisions
    • Anytime algorithms provide “best move so far” under time pressure
  4. Multi-Agent Extensions

    • MaxN algorithm generalizes minimax to N players
    • Paranoid search assumes all opponents collaborate against you

Open Problems

Cross-Disciplinary Insights

Connections to Other Fields

Economics & Auctions: Minimax thinking appears in:

Cybersecurity: Defender-attacker scenarios model as minimax games:

Neuroscience: Human strategic thinking shows minimax-like patterns:

Distributed Systems: Byzantine fault tolerance uses minimax reasoning:

Daily Challenge

Coding Exercise (20-30 minutes)

Implement a Connect-4 agent using minimax with the following enhancements:

class Connect4:
    def __init__(self):
        self.board = [[0 for _ in range(7)] for _ in range(6)]
        # 0 = empty, 1 = player, 2 = AI

    def evaluate(self) -> int:
        """
        TODO: Implement evaluation function
        - +1000 for winning position
        - -1000 for losing position
        - Heuristic score for non-terminal states:
          - Count 3-in-a-row with open fourth position: +50
          - Count center column pieces: +3 per piece
        """
        pass

    def get_best_move(self, depth: int = 5) -> int:
        """
        TODO: Implement minimax with alpha-beta pruning
        Return column number (0-6) for best move
        Use iterative deepening if time permits
        """
        pass

Bonus Challenges:

  1. Add move ordering: check center columns first
  2. Implement transposition table caching
  3. Add time-limited search with iterative deepening

Thought Exercise

Question: If you increase minimax search depth from 4 to 5 ply in chess (branching factor ~35), how much longer will the search take? What if you add alpha-beta pruning with random move ordering vs. optimal move ordering?

Analysis: Consider the tradeoff between search depth and evaluation function quality. Is it better to search 6 ply with a simple evaluation or 4 ply with a complex neural network evaluation?

References & Further Reading

Classic Papers

Modern Research

Practical Resources

GitHub Repositories

Books


Next Steps: Explore how Monte Carlo Tree Search (MCTS) handles games where minimax becomes intractable, and how modern agents combine neural networks with tree search in systems like AlphaZero and MuZero.

● Intelligence at Every Action

AI Native
Project Management

Stop using tools that bolt on AI as an afterthought. Jovis is built AI-first — smart routing, proactive monitoring, and intelligent workflows from the ground up.