Programs that Play better than Us
Melvin Zhang
melvin@melvinzhang.net
@melvinzhangzy
https://en.wikipedia.org/wiki/File:ST_Battle_Chess.png
https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)
Deep Blue (IBM, 1996)
http://afflictor.com/2012/09/11/chess-programs-regularly-play-at-good-amateur-level/
Game tree
Optimal play
Terminal: 1 0 1 1 1
min player: 0 1
max player: 1
Chess has about 10^46 states!
Minimax algorithm
Cut-off: .7 .1 .6 .9
min player: .1 .6
max player: .6
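The values on this slide can be reproduced with a short depth-limited minimax. This is a minimal sketch, not the speaker's code: the state names, the `CHILDREN` table, and the grouping of the four cut-off estimates into two pairs are assumptions, chosen so that the min level comes out as .1 and .6 as on the slide.

```python
def minimax(state, depth, maximizing, children, evaluate):
    """Minimax value of `state`, searching at most `depth` plies.

    `children(state)` lists successor states; `evaluate(state)` is the
    heuristic estimate used at the cut-off (or at a state with no successors).
    """
    kids = children(state)
    if depth == 0 or not kids:
        return evaluate(state)
    values = [minimax(k, depth - 1, not maximizing, children, evaluate)
              for k in kids]
    return max(values) if maximizing else min(values)


# Hypothetical toy tree: the max player moves at the root, the min player below.
CHILDREN = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
# Estimates at the cut-off, taken from the slide.
ESTIMATES = {"a1": 0.7, "a2": 0.1, "b1": 0.6, "b2": 0.9}

value = minimax("root", 2, True,
                lambda s: CHILDREN.get(s, []),
                lambda s: ESTIMATES[s])
print(value)  # 0.6
```

The min player picks .1 under `a` and .6 under `b`, so the max player prefers `b` and the root value is .6.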
https://stockfishchess.org/
Stockfish
https://tests.stockfishchess.org/
Testing AI changes is crucial
Value functions are hard!
http://mathworld.wolfram.com/Go.html
http://www.remi-coulom.fr/CrazyStone/
Remi Coulom
http://www.wired.com/2014/05/the-world-of-computer-go/
Monte Carlo evaluations
Cut-off
min player
max player
.7
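A Monte Carlo evaluation replaces a hand-written value function with the fraction of random playouts won from the cut-off position. The sketch below illustrates this on a toy take-away game (remove 1 to 3 stones, whoever takes the last stone wins). The game and every function name here are assumptions for illustration, not part of the talk.

```python
import random


def legal_moves(stones):
    """Moves in the toy game: take 1, 2 or 3 stones (never more than remain)."""
    return [n for n in (1, 2, 3) if n <= stones]


def playout(stones, max_to_move, rng):
    """Play uniformly random moves to the end of the game.

    Requires stones >= 1. Returns 1 if the max player takes the last
    stone (and so wins), otherwise 0.
    """
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return 1 if max_to_move else 0
        max_to_move = not max_to_move


def monte_carlo_value(stones, max_to_move, n_playouts=1000, seed=0):
    """Estimate the position's value as the max player's playout win rate."""
    rng = random.Random(seed)
    wins = sum(playout(stones, max_to_move, rng) for _ in range(n_playouts))
    return wins / n_playouts
```

An estimate such as .7 would then mean that the max player won about 70% of the random playouts from that position.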
Monte Carlo Tree Search (MCTS)
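MCTS grows a search tree incrementally and uses playouts to decide where to grow it. The following is a minimal sketch of the common UCT variant (selection, expansion, simulation, backpropagation) on a toy take-away game where players remove 1 to 3 stones and whoever takes the last stone wins. The game, the `Node` class, the exploration constant 1.4, and the iteration count are all assumptions for illustration, not taken from the talk.

```python
import math
import random


class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones    # stones left in this position
        self.player = player    # player to move here: 0 or 1
        self.parent = parent
        self.move = move        # move that led from the parent to this node
        self.children = []
        self.untried = [n for n in (1, 2, 3) if n <= stones]
        self.visits = 0
        self.wins = 0.0         # wins for the player who moved into this node


def uct_child(node, c=1.4):
    """Pick the child with the best win rate plus exploration bonus (UCB1)."""
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))


def rollout(stones, player, rng):
    """Random playout; returns the player who takes the last stone."""
    if stones == 0:
        return 1 - player       # the previous player already took it
    while True:
        stones -= rng.choice([n for n in (1, 2, 3) if n <= stones])
        if stones == 0:
            return player
        player = 1 - player


def mcts(stones, player, iterations=2000, seed=0):
    """Return the most-visited move for `player`; requires stones >= 1."""
    rng = random.Random(seed)
    root = Node(stones, player)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one new child if any move is untried.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, 1 - node.player, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        winner = rollout(node.stones, node.player, rng)
        # 4. Backpropagation: credit each node from its mover's viewpoint.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

For example, with 3 stones left the search settles on taking all 3, since that move wins every playout.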
AlphaGo by Google DeepMind
https://deepmind.com/research/alphago/
https://gogameguru.com/alphago-races-ahead-2-0-lee-sedol/
MCTS + Policy and value networks
http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html
Some games have hidden information!
http://magic.wizards.com/en/events/coverage/gpsin15/father-son-2015-06-27
https://magarena.github.io
Determinization: choose a random instance of the hidden information during simulation
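Determinization can be sketched as: sample one concrete opponent hand consistent with what is visible, evaluate moves as if that hand were known, and repeat over many samples. The toy card game below (each side plays one card, the higher card wins, and the opponent always answers with their highest card) is an assumption made for illustration, as are all function names. In a full MCTS player the sampling would happen once per simulation rather than in a separate averaging loop.

```python
import random
from collections import defaultdict


def determinize(unseen_cards, opp_hand_size, rng):
    """Guess one concrete opponent hand from the cards we cannot see."""
    return rng.sample(unseen_cards, opp_hand_size)


def simulate(my_card, opp_hand):
    """Toy outcome model (an assumption): opponent plays their highest card."""
    return 1.0 if my_card > max(opp_hand) else 0.0


def choose_move(my_hand, unseen_cards, opp_hand_size, n_samples=500, seed=0):
    """Pick the card with the best total score across sampled opponent hands."""
    rng = random.Random(seed)
    totals = defaultdict(float)
    for _ in range(n_samples):
        opp_hand = determinize(unseen_cards, opp_hand_size, rng)
        for card in my_hand:
            totals[card] += simulate(card, opp_hand)
    return max(my_hand, key=lambda card: totals[card])
```

Each sampled hand turns the hidden-information game into an ordinary perfect-information one, so the same search machinery applies unchanged.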
Comparison of Minimax and MCTS
                      Minimax  MCTS
At 1s thinking time:  1        0.88
At 4s thinking time:  1        1.71
Open problems
MCTS is bad at tight tactical play.
MCTS plays badly when it is behind in the game.
Further readings