Layer-wise Asynchronous Training of Neural Network with Synthetic Gradient
Xupeng Tong, Hao Wang, Ning Dong, Griffin Adams
CONTENTS
BACKGROUND
INTRODUCTION
SYNCHRONOUS TRAINING
ASYNCHRONOUS TRAINING
INSIGHTS
Part One: BACKGROUND
1 BACKGROUND
Asynchronous SGD
Downpour SGD
Dean, Jeffrey, et al. "Large scale distributed deep networks." Advances in neural information processing systems. 2012.
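Below is a minimal sketch of the Downpour-style update pattern (all names are illustrative, not from Dean et al.'s system): workers pull the latest parameters, compute a gradient on their own data shard, and push it back; the server applies each update immediately, with no synchronization barrier, so gradients are often computed against stale parameters.

```python
import threading
import numpy as np

# Minimal sketch of Downpour-style asynchronous SGD (hypothetical names;
# Dean et al. shard both the model and a distributed parameter server --
# here one in-process "server" and threaded workers stand in for that).

class ParameterServer:
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.lr = lr
        self.lock = threading.Lock()   # guards a single write, not a training round

    def pull(self):
        return self.w.copy()           # workers may read stale parameters

    def push(self, grad):
        with self.lock:
            self.w -= self.lr * grad   # applied immediately, no barrier across workers

def worker(server, X, y, steps=200):
    for _ in range(steps):
        w = server.pull()
        grad = X.T @ (X @ w - y) / len(y)   # gradient of 0.5*||Xw - y||^2 on this shard
        server.push(grad)                   # slightly stale by the time it lands

rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
X = rng.normal(size=(400, 5))
y = X @ w_true + 0.01 * rng.normal(size=400)

server = ParameterServer(dim=5)
threads = [threading.Thread(target=worker, args=(server, X[i::4], y[i::4]))
           for i in range(4)]               # four workers, four data shards
for t in threads: t.start()
for t in threads: t.join()
print("parameter error:", np.linalg.norm(server.w - w_true))
```

The lock only serializes the write itself; nothing stops a worker from pulling parameters that another worker is about to overwrite, which is exactly the asynchrony Downpour SGD tolerates.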
Part Two: INTRODUCTION
2-1 OVERVIEW
Jaderberg, Max, et al. "Decoupled neural interfaces using synthetic gradients." arXiv preprint arXiv:1608.05343 (2016).
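A minimal PyTorch sketch of one decoupled neural interface, assuming illustrative names (f1, f2, M): the lower layer f1 updates immediately from the synthetic gradient M(h1), and M is later regressed onto the true gradient once the full backward pass delivers it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal single-step sketch of a decoupled neural interface
# (illustrative names f1, f2, M; not the authors' code).
torch.manual_seed(0)
f1 = nn.Sequential(nn.Linear(10, 32), nn.ReLU())  # lower layer
f2 = nn.Linear(32, 1)                             # upper layer
M = nn.Linear(32, 32)                             # synthetic-gradient module
opt1, opt2, optM = (torch.optim.SGD(m.parameters(), lr=0.01)
                    for m in (f1, f2, M))
x, y = torch.randn(8, 10), torch.randn(8, 1)

# f1 updates immediately from the synthetic gradient -- it is decoupled
# from (does not wait for) the true backward pass through f2.
h1 = f1(x)
g_syn = M(h1.detach())            # predicted dL/dh1
opt1.zero_grad()
h1.backward(g_syn.detach())       # inject the synthetic gradient into f1
opt1.step()

# Later, the true gradient delta = dL/dh1 arrives and trains f2 and M.
h1_leaf = h1.detach().requires_grad_(True)
loss = F.mse_loss(f2(h1_leaf), y)
opt2.zero_grad()
loss.backward()                   # fills h1_leaf.grad with the true delta
opt2.step()
optM.zero_grad()
F.mse_loss(g_syn, h1_leaf.grad).backward()  # regress M onto delta
optM.step()
```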
2-2
Part Three: SYNCHRONOUS TRAINING
3-1
$L_{\mathcal{M}} = \sum_i \left\lVert \delta_i - \hat{\delta}_i \right\rVert_2^2$

$L_{\mathcal{I}} = \sum_i \left\lVert h_i - \hat{h}_i \right\rVert_2^2$
Minimizing the synthetic input and synthetic gradient losses simultaneously with the overall task loss:
$L(w_1, b_1, \dots, w_i, b_i, \dots, w_n, b_n) = L(W, B)$
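A hedged sketch of this synchronous variant (module names f1, f2, M, I are assumptions carried over from the sketch above): one optimizer step jointly minimizes the task loss L(W, B) together with L_M and L_I, so the auxiliary models are fitted alongside the ordinary backward pass.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hedged sketch of the synchronous variant: one optimizer step jointly
# minimizes the task loss L(W, B), L_M, and L_I (mean- rather than
# sum-squared error for brevity).
torch.manual_seed(0)
f1 = nn.Sequential(nn.Linear(10, 32), nn.ReLU())
f2 = nn.Linear(32, 1)
M = nn.Linear(32, 32)   # predicts delta_hat from h1
I = nn.Linear(10, 32)   # predicts h1_hat from the network input
opt = torch.optim.SGD(
    [p for m in (f1, f2, M, I) for p in m.parameters()], lr=0.01)
x, y = torch.randn(8, 10), torch.randn(8, 1)

for step in range(100):
    opt.zero_grad()
    h1 = f1(x)
    h1_leaf = h1.detach().requires_grad_(True)
    loss_task = F.mse_loss(f2(h1_leaf), y)     # ordinary loss L(W, B)
    loss_task.backward()                       # true delta lands in h1_leaf.grad
    delta = h1_leaf.grad
    h1.backward(delta)                         # propagate the true gradient into f1
    loss_M = F.mse_loss(M(h1.detach()), delta) # L_M: fit the synthetic gradient
    loss_I = F.mse_loss(I(x), h1.detach())     # L_I: fit the synthetic input
    (loss_M + loss_I).backward()
    opt.step()                                 # one synchronous update of everything
```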
Part Four: ASYNCHRONOUS TRAINING
4-1 ONE POSSIBLE ARCHITECTURE
4-1 INFRASTRUCTURE - BASIC
4-2 INFRASTRUCTURE - LIGHT
4-2 SLAVE
4-3 MASTER
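Since the slides leave the master/slave wiring implicit, the following is only a schematic sketch (the queues, message format, and frozen synthetic-gradient model are all assumptions): the master streams minibatches into a chain of slave threads, and each slave owns one layer plus its synthetic-gradient model, so it can update its weights the moment an activation arrives instead of waiting for a full forward/backward round trip.

```python
import queue
import threading
import numpy as np

# Schematic master/slave sketch (names and message format are assumptions).
# Each slave holds one layer W and a toy linear synthetic-gradient model M,
# letting it update locally from M instead of waiting for the true backward
# pass. M is kept frozen here; fitting it is sketched in Part Three.

def slave(inbox, outbox, seed):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(4, 4)) * 0.1
    M = rng.normal(size=(4, 4)) * 0.01
    while True:
        h_in = inbox.get()
        if h_in is None:          # master signals shutdown; pass it along
            outbox.put(None)
            break
        h_out = np.tanh(h_in @ W)
        g_syn = h_out @ M                              # synthetic dL/dh_out
        W -= 0.01 * h_in.T @ (g_syn * (1 - h_out**2))  # immediate local update
        outbox.put(h_out)                              # forward to the next slave

rng = np.random.default_rng(0)
qs = [queue.Queue() for _ in range(3)]    # master -> slave0 -> slave1 -> master
slaves = [threading.Thread(target=slave, args=(qs[i], qs[i + 1], i))
          for i in range(2)]
for t in slaves: t.start()

for _ in range(10):
    qs[0].put(rng.normal(size=(8, 4)))    # master streams minibatches
qs[0].put(None)
while qs[-1].get() is not None:
    pass                                  # master drains the final activations
for t in slaves: t.join()
```

In a real deployment the master would also route delayed true gradients back to each slave so that M can be fitted with the L_M loss from Part Three; that feedback path is omitted here for brevity.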
Part Five: INSIGHTS
5-1
• Synthetic gradients and synthetic inputs as an alternative to batch normalization / Dropout
• The M and I auxiliary networks introduce noise into each layer's input/gradient
• No exact update is required!
• The model learns to reduce the variance within each batch while keeping the flavor of that specific batch.
Thank You