

GoGoGo: Improving Deep Neural Network Based Go Playing AI with Residual Networks

Xingyu Liu

Introduction

• Go-playing AIs using traditional search: GNU Go, Pachi, Fuego, Zen, etc.
• Go-playing AIs powered by deep learning: Zen → Deep Zen Go, darkforest, AlphaGo
• Goal: move from a vanilla CNN to ResNets

Network Architecture

Fig. 2 (a) Policy Network: Input (19x19x5) → stack of CONV3, 64 layers (twelve as drawn) → final CONV3 layer → output P (move probabilities over the board).

(b) Value Network: Input (19x19x5) → stack of CONV3, 64 layers (twelve as drawn) → CONV3, 64 → FC (23104 → 1, i.e. 19x19x64 flattened to a single scalar) → position value.
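As a rough sketch of the plain policy network in Fig. 2(a) (before the planned switch to ResNets), the following PyTorch code stacks 3x3, 64-filter convolutions on the 19x19x5 input and ends with a per-board softmax P. Only the layer sizes come from the figure; the module and variable names, and the single-plane output head, are assumptions rather than the author's implementation.

# Minimal sketch of the Fig. 2(a) policy network (assumed PyTorch version).
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    def __init__(self, in_channels=5, width=64, num_layers=12):
        super().__init__()
        layers = []
        c = in_channels
        for _ in range(num_layers):
            layers += [nn.Conv2d(c, width, kernel_size=3, padding=1), nn.ReLU(inplace=True)]
            c = width
        self.trunk = nn.Sequential(*layers)
        # Final 3x3 convolution collapses the 64 channels to a single plane of move logits.
        self.head = nn.Conv2d(width, 1, kernel_size=3, padding=1)

    def forward(self, x):                       # x: (batch, 5, 19, 19)
        logits = self.head(self.trunk(x))       # (batch, 1, 19, 19)
        logits = logits.flatten(1)              # (batch, 361)
        return torch.softmax(logits, dim=1)     # P: move probabilities over the board

# Example: one forward pass on a random batch of board encodings.
p = PolicyNet()(torch.randn(2, 5, 19, 19))
print(p.shape, p.sum(dim=1))                    # torch.Size([2, 361]); each row sums to ~1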



Training Methodology and Data

• From kifu to input feature maps, 5 channels: 1) empty (space) positions; 2) black stone positions; 3) white stone positions; 4) current player; 5) ko positions (see the encoding sketch below).
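A minimal sketch of how one position could be encoded into the five 19x19 feature planes listed above; the plane order follows the list, while the board representation and function name are illustrative assumptions.

# Hypothetical encoder: board is a 19x19 array with 0 = empty, 1 = black, 2 = white.
import numpy as np

def encode_position(board, to_play, ko_point=None):
    """Return a 19x19x5 float32 feature tensor for one position."""
    planes = np.zeros((19, 19, 5), dtype=np.float32)
    planes[:, :, 0] = (board == 0)                    # 1) empty positions
    planes[:, :, 1] = (board == 1)                    # 2) black stone positions
    planes[:, :, 2] = (board == 2)                    # 3) white stone positions
    planes[:, :, 3] = 1.0 if to_play == 1 else 0.0    # 4) current player (all-ones plane if black to play)
    if ko_point is not None:                          # 5) ko position (illegal recapture point, if any)
        planes[ko_point[0], ko_point[1], 4] = 1.0
    return planes

features = encode_position(np.zeros((19, 19), dtype=np.int8), to_play=1)
print(features.shape)                                 # (19, 19, 5)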

[1] David Silver et al., “Mastering the game of Go with deep neural networks and tree search”, Nature, 529:484–489, 2016.
[2] Kaiming He et al., “Deep residual learning for image recognition”, CoRR, abs/1512.03385, 2015.

• Training pipeline: SL on policy network → RL on policy network → RL on value network (a supervised-learning step is sketched below).
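For the first stage (SL on the policy network), a hedged sketch of one training step using the softmax loss and base learning rate from the hyperparameter table below; it reuses the PolicyNet sketch above, and everything not stated on the poster is assumed.

# Assumed supervised-learning step: predict the expert move from the encoded position.
import torch
import torch.nn.functional as F

def sl_step(policy_net, optimizer, boards, expert_moves):
    """boards: (B, 5, 19, 19) float tensor; expert_moves: (B,) move indices in [0, 361)."""
    probs = policy_net(boards)                                  # (B, 361) from the PolicyNet sketch above
    loss = F.nll_loss(torch.log(probs + 1e-12), expert_moves)   # softmax (cross-entropy) loss on expert moves
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# optimizer = torch.optim.Adam(policy_net.parameters(), lr=2e-4)  # base learning rate from the table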

• Dynamic board state expansion: board states are expanded on the fly (with ko fights handled explicitly, Fig. 1); this saves disk space and keeps memory use small (see the sketch below).
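A sketch of the dynamic expansion idea under the assumption that only the move list of each kifu is stored and board states are replayed on demand; go_replay_move is a hypothetical stand-in for the actual Go rules engine.

# Sketch: regenerate board states on the fly from the stored move list of a kifu.
def expand_states(moves, go_replay_move):
    """moves: list of (color, row, col); go_replay_move applies one move and
    returns (new_board, ko_point or None) after captures are resolved."""
    board = [[0] * 19 for _ in range(19)]   # 0 = empty
    ko = None
    for color, row, col in moves:
        yield board, ko, (color, row, col)  # training example: state before the move, move as label
        board, ko = go_replay_move(board, (color, row, col))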

• Use the Ing Chang-ki rules: board state + ko is the full game state, so there is no need to remember the number of captured stones.

• Two levels of batches (kifus, then moves) with random shuffling: memory usage stays small and locality is preserved (see the sketch below).
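A hedged sketch of the two-level batching scheme: shuffle kifus, load a small chunk of them, then shuffle and batch the moves inside that chunk so memory stays small; chunk and batch sizes here are illustrative, not the values actually used.

# Sketch: two levels of batches (kifus, then moves) with random shuffling.
import random

def move_batches(kifus, load_moves, kifus_per_chunk=32, batch_size=128):
    """kifus: list of kifu file paths; load_moves(path) -> list of (features, label)."""
    random.shuffle(kifus)                                   # level 1: shuffle whole games
    for i in range(0, len(kifus), kifus_per_chunk):
        chunk = []
        for path in kifus[i:i + kifus_per_chunk]:
            chunk.extend(load_moves(path))                  # only this chunk is held in memory
        random.shuffle(chunk)                               # level 2: shuffle moves within the chunk
        for j in range(0, len(chunk), batch_size):
            yield chunk[j:j + batch_size]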


(c) Residual Module: two [CONV3 → Batch Norm → ReLU] stages whose output is combined with the input data by an element-wise add (skip connection).
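A PyTorch sketch of the residual module in Fig. 2(c): two CONV3 + Batch Norm + ReLU stages added element-wise to the input. The placement of the second ReLU follows the figure as drawn and may differ from the eventual implementation.

# Sketch of the residual module in Fig. 2(c) (assumed PyTorch version).
import torch.nn as nn

class ResidualModule(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return x + self.body(x)     # element-wise add (skip connection) with the input data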


Hyperparameter        Value
Base learning rate    2E-4
Decay policy          Exponential
Decay rate            0.95
Decay step (kifus)    200
Loss function         Softmax

(b) Hyperparameters
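From the table, the learning rate decays exponentially by a factor of 0.95 every 200 kifus starting from 2E-4; a small sketch of that schedule (function name illustrative):

# Exponential learning-rate decay from the hyperparameter table.
def learning_rate(kifus_seen, base_lr=2e-4, decay_rate=0.95, decay_step=200):
    return base_lr * decay_rate ** (kifus_seen // decay_step)

print(learning_rate(0), learning_rate(200), learning_rate(2000))   # 2e-4, 1.9e-4, ~1.2e-4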

Fig. 1 Explicit expansion of a ko fight

Experiment Result

• Training Accuracy ~ 32%
• Testing Accuracy ~ 26%

Fig. 3 GoGoGo plays against itself, policy network only

[Plot: Supervised Learning Training Loss, recorded per 100 batches; loss axis 0 to 7, horizontal axis up to ~8,700.]

Future Work

• Monte Carlo Tree Search: select moves by $a_t = \arg\max_a \big( Q(s_t, a) + u(s_t, a) \big)$, with the exploration bonus $u(s, a) \propto \frac{P(s, a)}{1 + N(s, a)}$ (a minimal selection-rule sketch follows this list).

• Reinforcement Learning of Value Network

• Network Architecture Exploration

• Real Match Testing against Human Players
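As a companion to the MCTS selection rule listed under Future Work, a minimal sketch of choosing a_t = argmax_a (Q + u) with u proportional to P / (1 + N); the dictionary-based node representation and the c_puct constant are assumptions, and no expansion, rollout, or backup logic is included.

# Sketch of the planned MCTS selection step: a_t = argmax_a (Q(s,a) + u(s,a)),
# with exploration bonus u(s,a) proportional to P(s,a) / (1 + N(s,a)).
def select_action(Q, N, P, c_puct=1.0):
    """Q, N, P: dicts mapping legal actions to value estimate, visit count, and prior."""
    def score(a):
        u = c_puct * P[a] / (1.0 + N[a])
        return Q[a] + u
    return max(P, key=score)

# Example with three candidate moves; move 1 has the highest Q + u and is selected.
print(select_action(Q={0: 0.1, 1: 0.3, 2: 0.2},
                    N={0: 10, 1: 50, 2: 5},
                    P={0: 0.5, 1: 0.3, 2: 0.2}))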