How computers play games with you
CS161, Spring ‘03
Nathan Sturtevant

Outline

Historic Examples
Classes of Games
Algorithms
  Minimax
  α-β pruning
Other techniques
Multi-Player Games

Successful Game Programs

Checkers
  Chinook
    1992: Tinsley won 40-game match, 4-2-34
    1994: Tinsley withdrew due to health reasons
    444 billion move end-game database
Chess
  Kasparov is currently the best human
  1997: Deep Blue won exhibition match, 2-1-3
  2003: Deep Junior played to a draw

Game Programs (continued)

Othello (Reversi)
  1997: Logistello beat Murakami 6-0 (264/120)
Scrabble
  Maven
    1998: played Adam Logan, won 9-5
    Came back from 98 points down to win with MOUTHPART
Awari (Mancala)
  Solved in 2002: draw
  http://awari.cs.vu.nl/

Overview - Types of Games

Single-Agent Search
  1 player v. a difficult problem
  Defined by:
    Start state
    Successor function
    Heuristic function
    Goal test

Overview - Types of Games

Game Search (Adversary Search)
  Defined by:
    Initial state
    Successor function
    Terminal test
    Utility / payoff function
      Similar to heuristics in single-agent problems

Chinese Checkers

Based on the European game Halma
Americans called it Chinese Checkers
1-player game? 2-player game? Multi-player game?

Classes of Games

Deterministic v. Non-deterministic
  Chess v. Backgammon
Perfect information v. Imperfect information
  Checkers v. Bridge
Zero-sum (strictly competitive) v. Non-zero sum
  Non-zero sum example: Prisoner's dilemma

Classes of Games

                        Deterministic                                     Chance
Perfect information     Chess, checkers, Go, Othello, Chinese checkers    Backgammon, Monopoly, Risk
Imperfect information   Stratego                                          Bridge, poker, Scrabble, (real life)

How do we simulate games?

Build a game tree
  Start state at the root
  All possible moves as children

[Figure: tic-tac-toe game tree; levels labeled Me and Opponent]

How do we choose our move?

Apply utility function at the leaves of the tree
  In tic-tac-toe, count how many rows, columns, and diagonals are occupied by each player and subtract
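The counting rule above can be sketched in a few lines. This is a minimal sketch under stated assumptions: a line counts as "occupied" by a player when it holds at least one of that player's marks, a completed line is scored as ∞ for the player who completed it, and scoring an opponent win as -∞ is an added assumption. The names `occupied`, `wins`, and `utility` and the sample position are hypothetical.

```python
import math

# All eight lines of a 3x3 board, as lists of (row, col) cells.
ROWS = [[(r, c) for c in range(3)] for r in range(3)]
COLS = [[(r, c) for r in range(3)] for c in range(3)]
DIAGS = [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
LINES = ROWS + COLS + DIAGS


def occupied(board, player):
    """Number of lines holding at least one of player's marks."""
    return sum(any(board[r][c] == player for r, c in line) for line in LINES)


def wins(board, player):
    """True if player fills some line completely."""
    return any(all(board[r][c] == player for r, c in line) for line in LINES)


def utility(board, me="x", opponent="o"):
    """Occupied-line difference; a finished game scores +/- infinity."""
    if wins(board, me):
        return math.inf
    if wins(board, opponent):
        return -math.inf  # assumption: a loss mirrors the win value
    return occupied(board, me) - occupied(board, opponent)


# Hypothetical position: x holds a corner and the center,
# o holds two cells in the right-hand column.
board = [
    ["x", " ", "o"],
    [" ", "x", "o"],
    [" ", " ", " "],
]
print(utility(board))  # x: 2r 2c 2d, o: 2r 1c 1d, so 6 - 4 = 2
```

A stronger evaluation function would change only `utility`; the search that uses it stays the same.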

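[Figure: tic-tac-toe game tree with levels labeled Me and Opponent. Leaf evaluations (r = rows, c = columns, d = diagonals):
  x: 2r 2c 2d, o: 2r 1c 1d, Utility = 2
  x: 2r 3c 2d, o: 2r 1c 1d, Utility = 3
  x: 3r 3c 2d, o: 2r 2c 0d, Utility = ∞
  x: 2r 2c 2d, o: 2r 2c 0d, Utility = 2
Backed-up values shown above the leaves: Utility = 3 and Utility = ∞, then Utility = 3]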

What is our algorithm?

Apply utility function at the leaves of the tree
  In tic-tac-toe, count how many rows, columns, and diagonals are occupied by each player and subtract
Back up values in the tree
  This calculates the “minimax” value of a tree
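Backing up values can be written as a short recursion. The sketch below assumes a tree given as nested Python lists whose leaves already carry utility values. The representation and the sample tree are hypothetical, chosen only to keep the example self-contained.

```python
def minimax(node, maximizing=True):
    """Back up leaf utilities: leaves are numbers, interior nodes are lists."""
    if not isinstance(node, list):
        return node  # leaf: the utility function has already been applied
    values = [minimax(child, not maximizing) for child in node]
    # The maximizer takes the best child, the minimizer the worst.
    return max(values) if maximizing else min(values)


# Hypothetical 2-ply tree: the maximizer moves at the root,
# the minimizer at each of the three children.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))  # minimizer values are 3, 2, 2; the maximizer picks 3
```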

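Minimax

[Figure: minimax tree with rows labeled Maximizer and Minimizer, each spanning 1 ply. Values shown: 2, 3, ∞, 2 on the bottom row, then 3 and 3 above. Caption: Minimizer's strategy]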

Minimax - Properties

Complete? Yes - if tree is finite

Optimal? Yes - against an optimal opponent

Time Complexity? $O(b^d)$

Space Complexity? $O(b \cdot d)$

Minimax

Assume our computer can expand $10^5$ nodes/sec
Assume we have 100 seconds to move
  $10^7$ nodes/move
Tic-tac-toe
  9! = 362880 (naïve) ways to play a game (b=4)
  $3^9 = 19683$ possible states (upper bound) on a board
Chess
  b = 35, d = 100, must search $35^{100} \approx 10^{154}$ nodes
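The figures on this slide can be checked with a few lines of arithmetic. This is only a sanity check of the numbers quoted above; the variable names are hypothetical.

```python
import math

# Search budget: nodes per second times seconds per move.
nodes_per_second = 10**5
seconds_per_move = 100
budget = nodes_per_second * seconds_per_move
print(budget)  # 10000000, i.e. 10^7 nodes/move

# Tic-tac-toe: move orderings versus board states.
print(math.factorial(9))  # 362880 naive ways to fill the board
print(3**9)               # 19683 ways to mark 9 cells as x, o, or empty

# Chess: b = 35, d = 100. Report the size as a power of ten.
chess_nodes = 35**100
print(len(str(chess_nodes)) - 1)  # 154, so about 10^154 nodes
```

At $10^7$ nodes per move, the chess figure is out of reach by roughly 147 orders of magnitude, which is why the search has to be cut off and the position evaluated instead.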

Minimax - issues

Evaluation function
  Where does it come from?
    Expert knowledge
      Chess: material value
      Othello (Reversi): positional strength
    Learned information
    Pre-computed tables

Minimax - issues

Quiescence
  We don’t see the consequences of our bad choices
  Quiescence search
Horizon problem
  We avoid dealing with a bad situation

Minimax

In Chess
  b = 35
  $10^7$ nodes/move
  Can search 4-ply into tree (human novice)
  Good humans can search 8-ply
  Kasparov searches about 12-ply
What to do?
  α-β pruning

α-β pruning

α = lower bound on Maximizer’s score
  Start at -∞
β = upper bound on Minimizer’s score
  Start at ∞

[Figure sequence: step-by-step α-β pruning on a tree with alternating Maximizer and Minimizer levels. Every node starts with α = -∞ and β = ∞. Leaves are examined in the order 1, 2, 3, 5, 6, 7. After leaves 1 and 2 the first Minimizer node is bounded by ≤ 2, so the branch that reaches ≥ 3 is cut off and 2 is backed up to the root (≥ 2). After leaves 5 and 6 the second Minimizer node is bounded by ≤ 6, so the branch that reaches ≥ 7 is cut off. Final value at the root: 6]
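The α and β bounds can be threaded through the minimax recursion as follows. This is a minimal sketch, assuming a tree given as nested lists with numeric leaves. The leaf values 4 and 8 are hypothetical stand-ins for leaves that the search never examines, and the `visited` list is added only to show which leaves are touched.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Minimax value of node, skipping branches that cannot affect the result."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record each leaf that is actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)  # raise the Maximizer's lower bound
            if alpha >= beta:
                break  # the Minimizer above will never allow this line
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)  # lower the Minimizer's upper bound
        if alpha >= beta:
            break  # the Maximizer above already has something better
    return value


# Levels: Maximizer (root), Minimizer, Maximizer, leaves.
tree = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
visited = []
print(alphabeta(tree, visited=visited))  # 6
print(visited)                           # [1, 2, 3, 5, 6, 7]
```

The leaves 4 and 8 are never evaluated: once 3 exceeds the bound of 2, and once 7 exceeds the bound of 6, the remaining siblings cannot change the backed-up value.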

α-β pruning

Complete? Yes - if tree is finite

Optimal? Computes same value as minimax

Time Complexity?
  Best case $O(b^{d/2})$
  Average case $O(b^{3d/4})$

α-β pruning

Effectiveness depends on order of moves in tree
In practice, we can usually get best-case performance

Chess
  Before we could search 4-ply into tree
  Now we can search 8-ply into tree
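The jump from 4-ply to 8-ply can be reproduced with a rough count. The sketch below makes two assumptions that are not on the slide: only leaf nodes are counted against the $10^7$ budget, and the α-β best case uses the Knuth and Moore leaf count $b^{\lceil d/2 \rceil} + b^{\lfloor d/2 \rfloor} - 1$. The function names are hypothetical.

```python
def minimax_leaves(b, d):
    """Leaves examined by plain minimax on a uniform tree."""
    return b**d


def alphabeta_best_case_leaves(b, d):
    """Leaves examined by alpha-beta with perfect move ordering."""
    return b ** ((d + 1) // 2) + b ** (d // 2) - 1


def deepest(budget, b, leaf_count):
    """Largest depth whose leaf count still fits in the budget."""
    d = 0
    while leaf_count(b, d + 1) <= budget:
        d += 1
    return d


budget = 10**7  # nodes per move, from the earlier slide
print(deepest(budget, 35, minimax_leaves))              # 4
print(deepest(budget, 35, alphabeta_best_case_leaves))  # 8
```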

Other Techniques

Transposition tables
Opening / Closing book

Transposition Tables

Only using a linear amount of memory
  Search only takes a few KB of memory
Most games aren’t trees but graphs
  A lot of duplicated effort
Transposition tables hash game states into table
  Store saved minimax value in table
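A transposition table can be as simple as a dictionary keyed on the game state. The sketch below uses a hypothetical toy game, not one from the slides: players alternately take 1, 2, or 3 stones, and whoever takes the last stone wins. Many move orders reach the same pile size, so the table saves repeated work. The counter is there only to compare node counts.

```python
def solve(stones, maximizing, table, counter):
    """Minimax value (+1 or -1, from the maximizer's side) of a pile of stones."""
    key = (stones, maximizing)
    if table is not None and key in table:
        return table[key]  # transposition: this state was already searched
    counter[0] += 1        # count states that are actually expanded
    if stones == 0:
        # The player who just moved took the last stone and won.
        value = -1 if maximizing else 1
    else:
        children = [
            solve(stones - take, not maximizing, table, counter)
            for take in (1, 2, 3)
            if take <= stones
        ]
        value = max(children) if maximizing else min(children)
    if table is not None:
        table[key] = value  # store the saved minimax value
    return value


plain, cached = [0], [0]
print(solve(15, True, None, plain), plain[0])  # same value, many expansions
print(solve(15, True, {}, cached), cached[0])  # same value, few expansions
```

A real game program would typically hash a much larger state and bound the table size; the dictionary here only illustrates the lookup-before-search pattern.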

Pre-compute & store values
  Opening book
  Closing book

Multi-Player Games

2-player game trees have a single minimax value
Games with ≥ 2 players use an n-tuple of scores
  e.g. (3, 2, 5)
The sum of values in every tuple should be constant

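Max^n

[Figure: max^n tree for 3 players. Player 1 moves at the root and Player 2 at each of the three nodes below it; the leaves are Player 3 nodes. Leaf values, in pairs under each Player 2 node: (7, 3, 0) and (3, 2, 5); (0, 10, 0) and (4, 2, 4); (1, 4, 5) and (4, 3, 3). Backed-up values at the Player 2 nodes: (7, 3, 0), (0, 10, 0), (1, 4, 5). Value at the root: (7, 3, 0)]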
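The max^n backup rule is that each player picks the child whose tuple is best in that player's own component. A minimal sketch, assuming a tree given as nested lists with score tuples at the leaves, players numbered from 0, and ties broken in favor of the first child (the tie rule is an assumption). The tree is the one in the figure.

```python
def maxn(node, player, n_players):
    """Back up score tuples: the player to move maximizes their own entry."""
    if isinstance(node, tuple):
        return node  # leaf: one score per player
    best = None
    for child in node:
        value = maxn(child, (player + 1) % n_players, n_players)
        if best is None or value[player] > best[player]:
            best = value  # strictly better for the player to move
    return best


# Player 1 (index 0) at the root, Player 2 (index 1) at the next level.
tree = [
    [(7, 3, 0), (3, 2, 5)],
    [(0, 10, 0), (4, 2, 4)],
    [(1, 4, 5), (4, 3, 3)],
]
print(maxn(tree, 0, 3))  # (7, 3, 0)
```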

Can we prune max^n trees?

In minimax we bound the game tree value
In max^n we bound based on sum of values
  All scores sum to 10
  If Player 1 gets 7 points…
  Players 2 and 3 will each get ≤ 3 points

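Shallow Max^n Pruning

[Figure: 3-player max^n tree with shallow pruning. Player 1 moves at the root and Player 2 at the three nodes below it. The first Player 2 node backs up (7, 3, 0) from leaves (7, 3, 0) and (3, 2, 5), which bounds the root by (≥7, ≤3, ≤3). The second Player 2 node shows (0, 10, 0) after its first leaf (0, 10, 0). The third shows (≤6, ≥4, ≤6) after its first leaf (1, 4, 5). Value at the root: (7, 3, 0)]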
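The sum bound can be turned into a pruning rule: once the player above is guaranteed a score, the player to move can stop as soon as their own score leaves less than that for the player above. The sketch below is one common formulation of shallow pruning, written under the slide's assumption that every tuple sums to 10. The function name, the `visited` list, and the tree encoding are hypothetical.

```python
MAX_SUM = 10  # every score tuple sums to this constant


def shallow(node, player, bound, n_players, visited):
    """Max^n value of node; stop early once player's score reaches bound."""
    if isinstance(node, tuple):
        visited.append(node)  # record each leaf that is actually evaluated
        return node
    nxt = (player + 1) % n_players
    best = shallow(node[0], nxt, MAX_SUM, n_players, visited)
    for child in node[1:]:
        if best[player] >= bound:
            return best  # the player above can no longer prefer this node
        # Children may use at most what is left after this player's best.
        current = shallow(child, nxt, MAX_SUM - best[player], n_players, visited)
        if current[player] > best[player]:
            best = current
    return best


tree = [
    [(7, 3, 0), (3, 2, 5)],
    [(0, 10, 0), (4, 2, 4)],
    [(1, 4, 5), (4, 3, 3)],
]
visited = []
print(shallow(tree, 0, MAX_SUM, 3, visited))  # (7, 3, 0)
print(visited)  # [(7, 3, 0), (3, 2, 5), (0, 10, 0), (1, 4, 5)]
```

With Player 1 already holding 7, each later Player 2 node gets a bound of 3. The first leaves there give Player 2 scores of 10 and 4, so the leaves (4, 2, 4) and (4, 3, 3) are never examined.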

Shallow Max^n Pruning

Complete? Yes

Optimal? Yes*

Time Complexity?
  Best-case**: $b^{d/2}$
  Average-case: $b^d$

Space Complexity? $b \cdot d$

Max^n Pruning

Why is max^n weak in practice?
  Only compares 2 scores out of n players
  Relies on game evaluation properties, not ordering
Last-Branch Pruning
Speculative Pruning

Last-Branch/Speculative Pruning

[Figure: last-branch/speculative pruning example for 3 players; node values shown: (3, 3, 4), (1, 4, 5), (2, 4, 4)]

Last-Branch/Speculative Pruning

Best case: $O(b^{d \cdot (n-1)/n})$
  As b gets large
Dependent only on node ordering in tree
http://www.cs.ucla.edu/~nathanst/ for more info

Imperfect Information

Most card games have imperfect information
We can use Monte Carlo simulation
  Create many consistent samples of possible opponent hands
  Solve using perfect-information methods
  Combine results together to make next move
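The three steps above can be sketched on a toy game. Everything in this example is a hypothetical illustration rather than a real card game: each side plays one card from a deck numbered 1 to 10, the higher card wins, and the "perfect-information solver" is a one-line comparison standing in for a full search.

```python
import random


def solve_perfect_information(my_card, opponent_card):
    """Stand-in for a full search once the hidden card is fixed: 1 = win."""
    return 1 if my_card > opponent_card else 0


def monte_carlo_move(my_hand, unseen_cards, samples=200, seed=0):
    """Pick the card that wins most often over sampled opponent holdings."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    wins = {card: 0 for card in my_hand}
    for _ in range(samples):
        # Sample a holding consistent with what we can see.
        opponent_card = rng.choice(unseen_cards)
        for card in my_hand:
            # Solve the now fully visible game for every candidate move.
            wins[card] += solve_perfect_information(card, opponent_card)
    # Combine the results: play the move with the best total.
    return max(wins, key=wins.get), wins


deck = list(range(1, 11))
my_hand = [3, 9]
unseen = [card for card in deck if card not in my_hand]
best, wins = monte_carlo_move(my_hand, unseen)
print(best, wins)
```

In a real card game each sample would be a full deal of the unseen cards, and the solver would be a perfect-information search over the whole hand.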
