Programs that Play better than Us

Melvin Zhang

melvin@melvinzhang.net

@melvinzhangzy

https://en.wikipedia.org/wiki/File:ST_Battle_Chess.png

https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)

Deep Blue (IBM, 1996)

http://afflictor.com/2012/09/11/chess-programs-regularly-play-at-good-amateur-level/

Game tree

Optimal play

Terminal: 1 0 1 1 1
min player: 0 1
max player: 1
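The backing-up of values shown on the slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the nested-list tree encoding and the name `optimal_value` are assumptions, and so is the grouping of the five terminal values under the two min-player nodes (it was chosen because it reproduces the min values 0 and 1 from the slide).

```python
from typing import Union

# A tree is either a terminal value (a number) or a list of child subtrees.
Tree = Union[float, list]


def optimal_value(node: Tree, maximizing: bool) -> float:
    """Return the value of `node` when both players play optimally."""
    if not isinstance(node, list):
        # Terminal state: the outcome is known.
        return node
    # The players alternate, so the children are evaluated for the other player.
    child_values = [optimal_value(child, not maximizing) for child in node]
    return max(child_values) if maximizing else min(child_values)


# Root: max player. Its two children: min-player nodes.
# Assumed grouping of the slide's terminal values 1 0 1 1 1.
example_tree = [[1, 0], [1, 1, 1]]

root_value = optimal_value(example_tree, maximizing=True)
print(root_value)  # 1
```

The min player backs up 0 and 1 from the two groups of terminals, and the max player picks the larger of the two, which matches the values on the slide.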

Chess has about 10^46 states!
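A quick back-of-the-envelope calculation shows why this number rules out exhaustive search. The rate of one billion states per second is purely an assumption for illustration, not a figure from the talk.

```python
# Rough estimate of how long it would take to visit every chess state once.
states = 10 ** 46                  # approximate state count from the slide
states_per_second = 10 ** 9        # assumed search speed (hypothetical)
seconds_per_year = 365 * 24 * 3600

years = states / states_per_second / seconds_per_year
print(f"{years:.1e} years")        # on the order of 10^29 years
```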

Minimax algorithm

Cut-off: .7 .1 .6 .9
min player: .1 .6
max player: .6
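A depth-limited minimax along these lines might look like the sketch below. It stops at a fixed depth and asks a value function for an estimate instead of searching to the end of the game. The function signature, the node names, and the dictionary-based toy game are all hypothetical; only the four estimates .7, .1, .6, .9 come from the slide.

```python
from typing import Callable, Hashable, Sequence


def minimax(state: Hashable, depth: int, maximizing: bool,
            children: Callable[[Hashable], Sequence[Hashable]],
            evaluate: Callable[[Hashable], float]) -> float:
    """Depth-limited minimax: search `depth` plies, then estimate."""
    successors = children(state)
    if depth == 0 or not successors:
        # Cut-off (or terminal) state: use the value function.
        return evaluate(state)
    values = [minimax(s, depth - 1, not maximizing, children, evaluate)
              for s in successors]
    return max(values) if maximizing else min(values)


# Toy game tree (hypothetical node names). The nodes a1..b2 still have
# successors, so they are cut-off states rather than terminal states.
tree = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
    "a1": ["a1x"], "a2": ["a2x"], "b1": ["b1x"], "b2": ["b2x"],
}
# Value-function estimates at the cut-off, taken from the slide.
estimates = {"a1": 0.7, "a2": 0.1, "b1": 0.6, "b2": 0.9}

value = minimax("root", 2, True,
                children=lambda s: tree.get(s, []),
                evaluate=lambda s: estimates[s])
print(value)  # 0.6
```

With a cut-off depth of 2 the search never looks below `a1` to `b2`; the quality of the result then depends entirely on how good the estimates are.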

https://stockfishchess.org/

Stockfish

https://tests.stockfishchess.org/

Testing AI changes is crucial

Value functions are hard!

http://mathworld.wolfram.com/Go.html

http://www.remi-coulom.fr/CrazyStone/

Rémi Coulom

http://www.wired.com/2014/05/the-world-of-computer-go/

Monte Carlo evaluations

Cut-off: .7
min player
max player
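One way to read a Monte Carlo evaluation: instead of a hand-written value function, play many random games from the cut-off state and use the fraction of wins as its value. The sketch below does this for a toy subtraction game (take 1 to 3 stones, whoever takes the last stone wins). The game and the function names are assumptions chosen to keep the example self-contained; they are not from the talk.

```python
import random


def random_playout(stones: int, max_to_move: bool, rng: random.Random) -> int:
    """Play uniformly random moves until the pile is empty.

    Returns 1 if the max player takes the last stone (a win), else 0.
    Requires stones >= 1.
    """
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return 1 if max_to_move else 0
        max_to_move = not max_to_move


def monte_carlo_value(stones: int, max_to_move: bool,
                      playouts: int, rng: random.Random) -> float:
    """Estimate the value of a state as the win rate over random playouts."""
    wins = sum(random_playout(stones, max_to_move, rng)
               for _ in range(playouts))
    return wins / playouts


rng = random.Random(0)
# An estimate between 0 and 1 for a pile of 10 with the max player to move.
print(monte_carlo_value(10, True, 1000, rng))
```

The estimate needs no game-specific knowledge beyond the rules, which is why the approach suits games like Go where good value functions are hard to write by hand.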

Monte Carlo Tree Search (MCTS)
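MCTS grows a search tree incrementally and uses random playouts to decide where to grow it. The sketch below uses the common UCT selection rule on a toy subtraction game (take 1 to 3 stones, whoever takes the last stone wins). The slides do not say which MCTS variant is meant, so UCT, the exploration constant 1.4, the game, and all names here are assumptions.

```python
import math
import random


class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left in this state
        self.parent = parent
        self.move = move          # stones taken to reach this state
        self.children = []
        self.untried = list(range(1, min(3, stones) + 1))
        self.visits = 0
        self.wins = 0.0           # wins for the player who made `move`


def uct_child(node, c=1.4):
    """Pick the child that balances a high win rate against few visits."""
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))


def rollout(stones, rng):
    """Random playout; True if the player to move at the start wins."""
    starter_to_move = True
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return starter_to_move
        starter_to_move = not starter_to_move
    return False  # empty pile: the previous player already won


def mcts_best_move(stones, iterations, rng):
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one new child for an untried move.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new state.
        mover_won = not rollout(node.stones, rng)
        # 4. Backpropagation: the winner alternates at each level.
        while node is not None:
            node.visits += 1
            if mover_won:
                node.wins += 1
            mover_won = not mover_won
            node = node.parent
    # The most visited move is the usual final choice.
    return max(root.children, key=lambda ch: ch.visits).move


print(mcts_best_move(3, 2000, random.Random(0)))  # 3: take all stones and win
```

Unlike depth-limited minimax, the tree here is uneven: promising moves are searched more deeply, and the search can be stopped after any number of iterations.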

AlphaGo by Google DeepMind

https://deepmind.com/research/alphago/

https://gogameguru.com/alphago-races-ahead-2-0-lee-sedol/

MCTS + Policy and value networks

http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html

Some games have hidden information!

http://magic.wizards.com/en/events/coverage/gpsin15/father-son-2015-06-27

https://magarena.github.io

Determinization: choose a random instance of the hidden information during simulation
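Determinization can be sketched as follows for a card game: deal the opponent a random hand from the cards not yet seen, then simulate as if that hand were known. The sketch assumes the opponent's full deck list is known and uses placeholder card names; the function `determinize` and the toy question it answers are illustrative assumptions, not Magarena's actual code.

```python
import random
from collections import Counter


def determinize(full_deck, seen_cards, opponent_hand_size, rng):
    """Sample one concrete world consistent with what has been observed.

    Returns (opponent_hand, library), a random split of the unseen cards.
    """
    unseen = Counter(full_deck) - Counter(seen_cards)
    pool = list(unseen.elements())
    rng.shuffle(pool)
    return pool[:opponent_hand_size], pool[opponent_hand_size:]


# Hypothetical 10-card deck; two cards have already been revealed.
deck = ["land"] * 4 + ["creature"] * 3 + ["spell"] * 3
seen = ["land", "creature"]

rng = random.Random(0)
samples = 1000
# Each simulation uses its own determinization; results are averaged.
# Toy question: how often does the opponent hold at least one spell?
hits = sum("spell" in determinize(deck, seen, 3, rng)[0]
           for _ in range(samples))
print(hits / samples)
```

In a real engine the averaged quantity would be the outcome of a playout or search in each sampled world rather than a simple membership check.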

Comparison of Minimax and MCTS

Thinking time   Minimax   MCTS
1s              1         0.88
4s              1         1.71

Open problems

MCTS is bad at tight tactical play.

MCTS plays badly when it is behind in the game.

Further readings
