TensorLayer 2.0
Hao Dong
Peking University
Background
• History of Deep Learning Tools
• History of TensorLayer
• Future of TensorLayer

How to Use
• Static vs. Dynamic Models
• Switching Train/Test Modes
• Reuse Weights
• Model Information
• Customize Layer without Weights
• Customize Layer with Weights

How to Use Better
• Dataflow
• Distributed Training
History of Deep Learning Tools
• Automatic Differentiation
Key reasons for TensorFlow:
• Largest user base
• Widest production adoption
• Well-maintained documentation
• Battle-tested quality
• TPU support
History of Deep Learning Tools
• Beyond Automatic Differentiation
High-level elements of deep learning: neural networks, layers, and tensors (http://www.asimovinstitute.org/neural-network-zoo/)
Low-level interface: dataflow graph, placeholders, sessions, queue runners, devices … (https://www.tensorflow.org/extend/architecture)
The abstraction gap between the two is bridged by wrappers: TensorLayer, Keras, and TFLearn.
History of Deep Learning Tools
[Timeline figure] Engines: Theano (2010) → TensorFlow (2015); Torch → PyTorch (Sep 2016); Caffe (2014) → Caffe2 (2017); CNTK (2016). Wrappers: Keras, TFLearn, TensorLayer (2014–2016).
History of Deep Learning Tools
[Timeline figure] User migration between frameworks: Theano (2010) transferred to TensorFlow (2015); Torch (2002) transferred to PyTorch (2017); Caffe (2014) transferred to Caffe2 (2017), which merged into PyTorch (2018); CNTK (2016); TensorFlow 2.0 (2019).
History of TensorLayer
[Timeline figure] Development from 2015/2016 to 2019.
History of TensorLayer
• TensorLayer 2.0
TensorLayer: A Versatile Library for Efficient Deep Learning Development. H. Dong, A. Supratak et al. ACM MM 2017.
History of TensorLayer
GitHub and documentation (English)
Future of TensorLayer
• Platform-agnostic
• Research-oriented
Hardware: GPU, TPU, Huawei Ascend (ShengTeng), …
Frameworks: TensorFlow, MindSpore, …
Future of TensorLayer
Contributors
Static vs. Dynamic Models
Static model (Lasagne fashion)
Static vs. Dynamic Models
Dynamic model (Chainer fashion)
Static vs. Dynamic Models
Static model (Lasagne fashion) vs. dynamic model (Chainer fashion)
Switching Train/Test Modes
Reuse Weights
Reuse weights in a static model
Reuse Weights
Reuse weights in a dynamic model
Reuse Weights
Reuse weights in a static model vs. a dynamic model
Model Information
Print model architecture:
[{'args': {'dtype': tf.float32, 'layer_type': 'normal', 'name': '_inputlayer_1', 'shape': [None, 784]},
  'class': '_InputLayer', 'prev_layer': None},
 {'args': {'keep': 0.8, 'layer_type': 'normal', 'name': 'dropout_1'},
  'class': 'Dropout', 'prev_layer': ['_inputlayer_1_node_0']},
 {'args': {'act': 'relu', 'layer_type': 'normal', 'n_units': 800, 'name': 'dense_1'},
  'class': 'Dense', 'prev_layer': ['dropout_1_node_0']},
 {'args': {'keep': 0.8, 'layer_type': 'normal', 'name': 'dropout_2'},
  'class': 'Dropout', 'prev_layer': ['dense_1_node_0']},
 {'args': {'act': 'relu', 'layer_type': 'normal', 'n_units': 800, 'name': 'dense_2'},
  'class': 'Dense', 'prev_layer': ['dropout_2_node_0']},
 {'args': {'keep': 0.8, 'layer_type': 'normal', 'name': 'dropout_3'},
  'class': 'Dropout', 'prev_layer': ['dense_2_node_0']},
 {'args': {'act': 'relu', 'layer_type': 'normal', 'n_units': 10, 'name': 'dense_3'},
  'class': 'Dense', 'prev_layer': ['dropout_3_node_0']}]
Model Information
• Print model information
• Get specific weights: .trainable_weights / .nontrainable_weights
• Save weights only
• Save weights + architecture
Customize Layer without Weights
Customize Layer with Weights
Dataflow
• Training without dataflow: in each iteration the CPU first prepares a batch of training data while the GPU sits idle, then the GPU trains on that batch. GPU utilization stays low (50–70% in the figure).
Dataflow
• Training with dataflow: data processing (CPU) runs in parallel with model training (GPU), so the next batch is ready the moment the current iteration finishes. GPU utilization rises (75–100% in the figure).
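This overlap is what `tf.data` provides: `map(..., num_parallel_calls=...)` parallelizes CPU preprocessing, and `prefetch` keeps the next batch ready while the GPU trains on the current one. A self-contained sketch with fake data standing in for MNIST:

```python
import numpy as np
import tensorflow as tf

# Fake dataset standing in for MNIST
X = np.random.rand(256, 784).astype(np.float32)
y = np.random.randint(0, 10, size=256).astype(np.int64)

ds = tf.data.Dataset.from_tensor_slices((X, y))
ds = ds.shuffle(buffer_size=256)
# map() runs preprocessing on CPU threads, in parallel with training
ds = ds.map(lambda a, b: (a * 2.0 - 1.0, b),
            num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds = ds.batch(32)
# prefetch() prepares the next batch while the GPU trains on the current one
ds = ds.prefetch(tf.data.experimental.AUTOTUNE)

n_batches = sum(1 for _ in ds)   # 256 / 32 = 8
```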
Distributed Training
• Model parallelism: the model is split into parts, each placed on a different worker (model part 1 on worker 1, part 2 on worker 2, part 3 on worker 3).
• Data parallelism: every worker (workers 1–4) holds the entire model and trains on a different shard of the data.
Distributed Training
• Data parallelism with a parameter server: workers 1–3 push gradients to a central parameter server, which averages them and returns updated parameters. With multiple parameter servers, each one averages a different part of the gradients.
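One parameter-server round can be illustrated with a toy NumPy sketch (class and method names are made up; real systems handle the networking and sharding):

```python
import numpy as np

class ParameterServer:
    """Toy parameter server: workers push gradients, the server averages
    them and applies one SGD step, workers pull the new parameters."""

    def __init__(self, params, lr=0.1):
        self.params = params.astype(float)
        self.lr = lr

    def push_and_average(self, worker_grads):
        avg_grad = np.mean(worker_grads, axis=0)   # average pushed gradients
        self.params -= self.lr * avg_grad          # apply the update
        return self.params                         # workers pull

ps = ParameterServer(np.zeros(4))
grads = [np.ones(4) * w for w in (1.0, 2.0, 3.0)]  # from workers 1-3
new_params = ps.push_and_average(grads)            # average gradient is 2.0
```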
Distributed Training
• Data parallelism with Horovod (ring-allreduce): workers 1–3 exchange gradient chunks with their ring neighbours over successive steps (steps 1–3 in the figure).
Distributed Training
• Data parallelism with Horovod (ring-allreduce): each worker computes gradients on its own data shard; ring-allreduce then leaves every worker (workers 1–3) holding the same averaged gradients.
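The averaging itself can be simulated in NumPy: split each worker's gradient into one chunk per worker, reduce-scatter so worker i ends up with the sum of chunk i, then all-gather the reduced chunks. A toy sketch (the real algorithm performs these phases in 2(n−1) ring steps over the network):

```python
import numpy as np

def ring_allreduce_sum(worker_grads):
    """Simulate ring-allreduce: return the element-wise sum of all
    workers' gradients, computed chunk by chunk."""
    n = len(worker_grads)
    chunks = [np.array_split(g.astype(float), n) for g in worker_grads]
    # reduce-scatter phase: worker i accumulates the full sum of chunk i
    reduced = [sum(chunks[w][i] for w in range(n)) for i in range(n)]
    # all-gather phase: every worker receives every reduced chunk
    return np.concatenate(reduced)

grads = [np.full(6, i + 1.0) for i in range(3)]   # workers 1-3
avg = ring_allreduce_sum(grads) / len(grads)      # averaged gradients
```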
Please install TensorFlow and TensorLayer, and make sure this example runs:
https://github.com/tensorlayer/tensorlayer/blob/master/examples/basic_tutorials/tutorial_mnist_mlp_static.py

Thanks