AlphaGo andMonteCarloTreeSearch - University of Washington5/30/17 1 AlphaGo andMonteCarloTreeSearch Machine0Learning0forBig0Data00 CSE547/STAT548, University0of0Washington Sham0Kakade

5/30/17

1

AlphaGo and Monte Carlo Tree Search

Machine Learning for Big Data CSE547/STAT548, University of Washington

Sham Kakade

1©Sham Kakade 2017

A feat previously thought to be (a decade?) away!!!

9/28/16

12

2011

25

http://www.youtube.com/watch?v=WFR3lOm_xhE

2016

26

AlphaGo deep RL defeats Lee Sedol (4-1)

5/30/17

2

Chess vs. Alpha Go

1997, AI named “Deep Blue” beat chess world champion.

Search space: 𝑏": 𝑏 = 35, 𝑑 = 80Search space: 𝑏": 𝑏 = 250, 𝑑 = 150

Key Points

• Given current game state, – how to decide next action “a”?– How to evaluate each legal action?

5/30/17

3

Monte Carlo Tree Search (MCTS)• A popular heuristic search algorithm for game play– By lots of simulations and select the most visited action.

Evaluation

Node value: Wining rate

Key point: how to calculate this valueOne playout to simulate the game

Monte Carlo Tree Search (MCTS)• AlphaGo

– Key point: how to calculate the value for each action (path)

5/30/17

4

The visited number for this edge

The mean value of all simulations passing through it.

Winning rate

Winning rate is predicted by deep reinforcement learning andpredicted by fast rollout playout

Action probability predicted by supervised and reinforcement learning

V(s) is estimated in two different ways.

(Fast) Methods to evaluate V

30 million games

30 million self-‐play games

5/30/17

5

Thanks!

• ML for big data: – lots of different methods/challenges in the wild.– Participate in the ML community.

• Posters: – Looking forward to the poster session.

• Have a great summer!

• Some slides modified from: prlab.tudelft.nl/sites/default/files/deepmind_go_nature.pptx

Acknowledgments

©Sham Kakade 2017

Documents

AlphaGo andMonteCarloTreeSearch - University of Washington5/30/17 1 AlphaGo andMonteCarloTreeSearch Machine0Learning0forBig0Data00 CSE547/STAT548, University0of0Washington Sham0Kakade