
Game Playing Revision Mini-Max search Alpha-Beta pruning General concerns on games


Page 1: Game Playing Revision Mini-Max search Alpha-Beta pruning General concerns on games

Game Playing Revision

- Mini-Max search
- Alpha-Beta pruning
- General concerns on games

Page 2: Why study board games?

- One of the oldest subfields of AI (Shannon and Turing, 1950)
- Abstract and pure form of competition that seems to require intelligence
- Easy to represent the states and actions
- Very little world knowledge required!

Game playing is a special case of a search problem, with some new requirements.

Page 3: Types of games

                        Deterministic                   Chance
Perfect information     Chess, checkers, go, othello    Backgammon, monopoly
Imperfect information   Sea battle                      Bridge, poker, scrabble, nuclear war

Page 4: Why new techniques for games?

- "Contingency" problem: we don't know the opponent's move!
- The size of the search space:
  - Chess: ~15 moves possible per state, 80 ply: 15^80 nodes in tree
  - Go: ~200 moves per state, 300 ply: 200^300 nodes in tree

Game playing algorithms:

- Search the tree only up to some depth bound
- Use an evaluation function at the depth bound
- Propagate the evaluation upwards in the tree
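To get a feeling for these sizes, the following minimal sketch converts them to powers of ten. The branching factors and ply counts are the ones quoted on the slide; the function name is only illustrative.

```python
import math

def tree_size_log10(branching: int, ply: int) -> float:
    """Return log10 of branching**ply, the approximate number of nodes in the tree."""
    return ply * math.log10(branching)

# Figures from the slide: ~15 moves over 80 ply (chess), ~200 moves over 300 ply (Go).
chess_exponent = tree_size_log10(15, 80)
go_exponent = tree_size_log10(200, 300)

print(f"chess: about 10^{chess_exponent:.0f} nodes")
print(f"go:    about 10^{go_exponent:.0f} nodes")
```

Both numbers are far beyond what exhaustive search can cover, which is why the search is cut at a depth bound and an evaluation function is used there.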

Page 5: MINI MAX

Restrictions:

- 2 players: MAX (computer) and MIN (opponent)
- deterministic, perfect information

Select a depth-bound (say: 2) and an evaluation function.

- Construct the tree up to the depth-bound
- Compute the evaluation function for the leaves
- Propagate the evaluation function upwards:
  - taking minima in MIN
  - taking maxima in MAX

[Figure: MAX / MIN / MAX tree with leaf evaluations 2, 5, 3, 1, 4, 4, 3. The MIN nodes get the values 2, 1, 3 and the MAX root gets the value 3, marked "Select this move".]
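The propagation on this slide can be replayed level by level. This is a small sketch; the way the seven leaves are grouped under the three MIN nodes is an assumption, since the slide only shows that the MIN values come out as 2, 1 and 3.

```python
# Leaf evaluations from the slide, grouped by MIN node.
# The grouping is an assumption: only the resulting MIN values 2, 1, 3 are given.
leaf_groups = [[2, 5, 3], [1, 4, 4], [3]]

# MIN level: the opponent picks the smallest evaluation below each node.
min_values = [min(group) for group in leaf_groups]

# MAX level: the computer picks the move that leads to the largest MIN value.
root_value = max(min_values)
best_move = min_values.index(root_value)  # 0-based index of the selected move

print(min_values, root_value, best_move)
```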

Page 6: The MINI-MAX algorithm

Initialise depthbound;

Minimax(board, depth) =
    IF depth = depthbound
    THEN return static_evaluation(board);
    ELSE IF maximizing_level(depth)
         THEN FOR EACH child of board
                  compute Minimax(child, depth+1);
              return maximum over all children;
    ELSE IF minimizing_level(depth)
         THEN FOR EACH child of board
                  compute Minimax(child, depth+1);
              return minimum over all children;

Call: Minimax(current_board, 0)
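The pseudocode translates almost line by line into Python. The sketch below assumes that even depths are maximizing levels and odd depths are minimizing levels, and it takes the move generator and the evaluation function as parameters; `children`, `toy_tree` and the handling of positions without successors are illustrative additions, not part of the slide.

```python
from typing import Callable, List

DEPTH_BOUND = 2  # plays the role of `depthbound` in the pseudocode

def minimax(board, depth: int,
            children: Callable[[object], List[object]],
            static_evaluation: Callable[[object], float]) -> float:
    """Depth-bounded Mini-Max: even depths are MAX levels, odd depths are MIN levels."""
    if depth == DEPTH_BOUND:
        return static_evaluation(board)
    successors = children(board)
    if not successors:
        # Assumption: a position without moves is evaluated statically as well.
        return static_evaluation(board)
    values = [minimax(child, depth + 1, children, static_evaluation)
              for child in successors]
    return max(values) if depth % 2 == 0 else min(values)

# Toy "boards": nested lists are inner positions, numbers are positions at the bound.
toy_tree = [[2, 5, 3], [1, 4, 4], [3]]
toy_children = lambda b: b if isinstance(b, list) else []
toy_evaluation = lambda b: b  # only ever called on numbers in this toy tree

result = minimax(toy_tree, 0, toy_children, toy_evaluation)
print(result)
```

In a real program `board` would be a game position and `children` a move generator; the nested lists are only there to keep the example self-contained.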

Page 7: Alpha-Beta Cut-off

A generally applied optimization on Mini-max.

Instead of:

- first creating the entire tree (up to the depth-bound),
- then doing all propagation,

interleave the generation of the tree and the propagation of values.

Point: some of the obtained values in the tree will provide information that other (non-generated) parts are redundant and do not need to be generated.

Page 8: Alpha-Beta idea

Principles:

- generate the tree depth-first, left-to-right
- propagate final values of nodes as initial estimates for their parent node.

[Figure: MAX / MIN / MAX tree, explored left to right. The first MIN node sees the leaves 2 and 5 and gets the value 2, which becomes the estimate of the MAX root; the second MIN node sees a first leaf 1 and gets the estimate 1.]

- The MIN-value (1) is already smaller than the MAX-value of the parent (2)
- The MIN-value can only decrease further,
- The MAX-value is only allowed to increase,
- No point in computing further below this node

Page 9: Terminology

- The (temporary) values at MAX-nodes are ALPHA-values
- The (temporary) values at MIN-nodes are BETA-values

[Figure: the same MAX / MIN / MAX tree (values 2, 5, 2, 1), with the value 2 at the MAX root labelled "Alpha-value" and the value 1 at the second MIN node labelled "Beta-value".]

Page 10: The Alpha-Beta principles (1)

- If an ALPHA-value is larger than or equal to the Beta-value of a descendant node: stop generation of the children of the descendant.

[Figure: the same MAX / MIN / MAX tree (values 2, 5, 2, 1), with the Alpha-value and the Beta-value marked.]

Page 11: The Alpha-Beta principles (2)

- If a Beta-value is smaller than or equal to the Alpha-value of a descendant node: stop generation of the children of the descendant.

[Figure: tree with node values 2, 5, 2, 3, 1, with the Alpha-value and the Beta-value marked.]

Page 12: Mini-Max with Alpha-Beta at work

[Figure: MAX / MIN / MAX tree with branching factor 3 over 27 leaves, traversed in 39 numbered steps. The intermediate values shown include 8, 4, 5 and 3; the MAX root ends with the value 5.]

Static evaluations of the leaves, left to right:

8 7 3 9 1 6 2 4 1 1 3 5 3 9 2 6 5 2 1 2 3 9 7 2 8 6 4

11 static evaluations saved!!
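The run on this slide can be reproduced with a short Alpha-Beta sketch. It assumes a uniform tree with branching factor 3 built from the 27 leaf values above, a MAX root, and cut-offs on "larger or equal" as in principles (1) and (2); the helper names are illustrative.

```python
import math

LEAVES = [8, 7, 3, 9, 1, 6, 2, 4, 1, 1, 3, 5, 3, 9, 2,
          6, 5, 2, 1, 2, 3, 9, 7, 2, 8, 6, 4]

def build_tree(leaves, branching):
    """Group a flat list of leaf values bottom-up into a uniform nested-list tree."""
    level = list(leaves)
    while len(level) > 1:
        level = [level[i:i + branching] for i in range(0, len(level), branching)]
    return level[0]

def alphabeta(node, maximizing, alpha, beta, evaluated):
    """Return the Mini-Max value of node; record every statically evaluated leaf."""
    if not isinstance(node, list):        # leaf: static evaluation
        evaluated.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, evaluated))
            alpha = max(alpha, value)
            if alpha >= beta:             # principle (2): Beta of an ancestor <= this Alpha
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, evaluated))
        beta = min(beta, value)
        if alpha >= beta:                 # principle (1): Alpha of an ancestor >= this Beta
            break
    return value

evaluated = []
root_value = alphabeta(build_tree(LEAVES, 3), True, -math.inf, math.inf, evaluated)
saved = len(LEAVES) - len(evaluated)
print(root_value, len(evaluated), saved)
```

Under these assumptions the search returns 5 after evaluating 16 of the 27 leaves, so 11 static evaluations are saved, in agreement with the slide.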

Page 13: "DEEP" cut-offs

- For game trees with at least 4 Min/Max layers: the Alpha-Beta rules apply also to deeper levels.

[Figure: example tree with node values 4, 4, 4, 4, 4 and 2, 2, illustrating a deep cut-off.]

Page 14: The Gain: Best case

- If at every layer the best node is the left-most one:

Only the part of the tree drawn THICK is explored.

[Figure: MAX / MIN / MAX tree in which the explored branches are drawn thick.]

Page 15: Example of a perfectly ordered tree

MAX (root):  21
MIN:         21 | 12 | 3
MAX:         21 24 27 | 12 15 18 | 3 6 9
Leaves:      21 20 19  24 23 22  27 26 25 | 12 11 10  15 14 13  18 17 16 | 3 2 1  6 5 4  9 8 7

Page 16: How much gain?

- Alpha / Beta: best case:

\[
\#(\text{static evaluations needed}) =
\begin{cases}
2\,b^{d/2} - 1 & \text{if } d \text{ is even} \\
b^{(d+1)/2} + b^{(d-1)/2} - 1 & \text{if } d \text{ is odd}
\end{cases}
\]

- The proof is by induction.
- For the perfectly ordered example tree (d = 3, b = 3): 9 + 3 - 1 = 11 static evaluations instead of 27!
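Read as the number of leaves that Alpha-Beta still has to evaluate on a perfectly ordered uniform tree (the standard best-case result), the formula is easy to tabulate. The sketch below does so for b = 10 and compares it with the b^d leaves of an unpruned search; the function name is illustrative.

```python
def best_case_evaluations(b: int, d: int) -> int:
    """Static evaluations needed by Alpha-Beta on a perfectly ordered uniform tree."""
    if d % 2 == 0:
        return 2 * b ** (d // 2) - 1
    return b ** ((d + 1) // 2) + b ** ((d - 1) // 2) - 1

# Depth, leaves without pruning, leaves in the Alpha-Beta best case (b = 10).
for depth in range(1, 8):
    print(depth, 10 ** depth, best_case_evaluations(10, depth))
```

For d = 3 and b = 3 the function gives 11 out of 27 leaves. The best case roughly halves the exponent, so growth remains exponential in the depth.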

Page 17: Best case gain pictured

[Figure: number of static evaluations (10 to 100000) against depth (1 to 7) for b = 10, comparing "No pruning" with "Alpha-Beta, best case".]

- Note: logarithmic scale.
- Conclusion: still exponential growth!!
- Worst case?? For some trees alpha-beta does nothing; for some trees it is impossible to reorder to avoid cut-offs.

Page 18: The horizon effect

[Figure: search tree with branches labelled "Queen lost", "Pawn lost" and "Queen lost"; horizon = depth bound of mini-max.]

Because of the depth-bound we prefer to delay disasters, although we don't prevent them!!

Solution: heuristic continuations

Page 19: Heuristic Continuation

In situations that are identified as strategically crucial (e.g. king in danger, imminent piece loss, pawn about to become a queen, ...), extend the search beyond the depth-bound!

[Figure: search tree in which some branches continue below the depth-bound.]
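A minimal sketch of the idea, assuming a hypothetical `Position` class that carries a static score, a `crucial` flag and its successors. The flag stands in for whatever test a real program uses (king in danger, imminent piece loss, ...), and `max_extension` is an assumed cap on how far the search may run past the bound.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Position:
    """Hypothetical toy position: static score, 'crucial' flag, successor positions."""
    score: float
    crucial: bool = False
    children: List["Position"] = field(default_factory=list)

def minimax_with_continuation(pos: Position, depth: int,
                              depth_bound: int, max_extension: int) -> float:
    """Mini-Max that keeps searching past depth_bound while a position is crucial."""
    at_horizon = depth >= depth_bound
    may_extend = pos.crucial and depth < depth_bound + max_extension
    if not pos.children or (at_horizon and not may_extend):
        return pos.score
    values = [minimax_with_continuation(c, depth + 1, depth_bound, max_extension)
              for c in pos.children]
    return max(values) if depth % 2 == 0 else min(values)

# Move A loses the queen at once, move B loses a pawn now and the queen just
# beyond the horizon, move C costs 3 points and nothing more.
move_a = Position(0, children=[Position(-9)])
move_b = Position(0, children=[Position(-1, crucial=True, children=[Position(-9)])])
move_c = Position(0, children=[Position(-3)])
root = Position(0, children=[move_a, move_b, move_c])

plain = minimax_with_continuation(root, 0, depth_bound=2, max_extension=0)
extended = minimax_with_continuation(root, 0, depth_bound=2, max_extension=2)
print(plain, extended)
```

With the fixed bound the search values the root at -1 (move B, the delayed disaster); with the continuation it sees the queen loss behind B and values the root at -3 (move C).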

Page 20: How to organize the continuations?

How to control (and stop) extending the tree?

Tapering search (or: heuristic pruning)

Order the moves in 1 layer by quality.

b(child) = b(parent) - (rank of child among brothers)

[Figure: root with b = 4; its children have b = 3, 2, 1, 0; their children have b = 2, 1, 0 and 1, 0; ...]
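The tapering rule can be tried out directly. The sketch below assumes that ranks start at 1 for the best move and that a node with bound b expands exactly its b best children, which matches the figure for b = 4; both function names are illustrative.

```python
from typing import List

def child_bounds(b: int) -> List[int]:
    """Branching bounds of the children of a node with bound b, best child first."""
    return [b - rank for rank in range(1, b + 1)]

def tapered_tree_size(b: int) -> int:
    """Number of nodes in the tapered tree below (and including) a node with bound b."""
    return 1 + sum(tapered_tree_size(child_b) for child_b in child_bounds(b))

print(child_bounds(4))
print(tapered_tree_size(4))
```

Because every bound shrinks along each path, the tree stops growing by itself: under these assumptions a root with bound b has 2^b nodes in total, 16 for b = 4.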

Page 21: Time bounds

How to play within reasonable time bounds?

Even with a fixed depth-bound, times can vary strongly!

Solution: Iterative Deepening!!!

Remember: overhead of previous searches = 1/b

Good investment to be sure to have a move ready.
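A sketch of the control loop, assuming a hypothetical callable `search_to_depth(d)` that returns the best move found with depth-bound d. In this simplified version a search that has started is allowed to finish; a real program would also have to abort a search that overruns the budget. The second function estimates the overhead of the shallower searches relative to the b^d leaves of the deepest one.

```python
import time

def iterative_deepening(search_to_depth, time_budget_s: float, max_depth: int):
    """Run searches with depth-bound 1, 2, ... and return the last completed result."""
    deadline = time.monotonic() + time_budget_s
    best_move = None
    for depth in range(1, max_depth + 1):
        best_move = search_to_depth(depth)   # a move is always ready after depth 1
        if time.monotonic() >= deadline:
            break
    return best_move

def overhead_ratio(b: int, d: int) -> float:
    """Leaves of the searches to depths 1..d-1, relative to the b**d leaves at depth d."""
    return sum(b ** i for i in range(1, d)) / b ** d

print(iterative_deepening(lambda depth: f"move found at depth {depth}", 0.5, 4))
print(overhead_ratio(10, 5))
```

For b = 10 and d = 5 the ratio is 0.1111, close to the 1/b quoted on the slide.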

Page 22: Games of chance

Ex.: Backgammon

Form of the game tree: [figure not preserved]

Page 23: "Utility" propagation with chances

Utility function for a Maximizing node C:

\[
\mathrm{expectimax}(C) = \sum_i P(d_i) \max_{s \in S(C, d_i)} \mathrm{utility}(s)
\]

- d_i: outcome of the dice
- P(d_i): probability of d_i
- S(C, d_i): positions reachable from C given d_i
- utility(s): evaluation of s

[Figure: MAX node C with dice outcomes d1 ... d5; the positions s1 ... s4 form S(C, d3); the layer below is Min.]
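The expectimax value of a maximizing node can be computed directly from its definition: for each dice outcome take the best reachable position, then weight by the probability of that outcome. The sketch below uses made-up probabilities, positions and utilities, chosen only to exercise the computation.

```python
from typing import Callable, Dict, List

def expectimax(dice_probabilities: Dict[str, float],
               reachable: Dict[str, List[str]],
               utility: Callable[[str], float]) -> float:
    """Expected utility of a maximizing node C.

    dice_probabilities maps each dice outcome d_i to P(d_i);
    reachable maps d_i to S(C, d_i), the positions reachable from C given d_i.
    """
    return sum(p * max(utility(s) for s in reachable[d])
               for d, p in dice_probabilities.items())

# Hypothetical numbers, not taken from the slides.
probs = {"d1": 0.5, "d2": 0.25, "d3": 0.25}
succ = {"d1": ["s1", "s2"], "d2": ["s3"], "d3": ["s2", "s4"]}
scores = {"s1": 2.0, "s2": 6.0, "s3": -4.0, "s4": 8.0}

value = expectimax(probs, succ, scores.__getitem__)
print(value)
```

Here the value is 0.5 * 6 + 0.25 * (-4) + 0.25 * 8 = 4.0. For a minimizing node the inner max would be replaced by min.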