Convolutional Neural Network & Backpropagation Algorithm
Xuelin Qian
Content
1. Neural Network
2. Backpropagation Algorithm
3. Convolutional Neural Network
4. Backpropagation on CNN
“neuron”: a single neuron computes h_{W,b}(x) = f(Wᵀx + b),
where f is called the activation function
1. Neural Network
neural network: consists of many simple “neurons” connected together
1. Neural Network
Input Layer
Hidden Layer
Output Layer
Forward Propagation
1. Neural Network
n_l : the number of layers
L_l : layer l
W^(l)_ij : the weight associated with the connection between unit j in layer l and unit i in layer l+1
b^(l)_i : the bias associated with unit i in layer l+1
a^(l)_i : the activation of unit i in layer l
z^(l)_i : the total weighted sum of inputs to unit i in layer l, so that a^(l)_i = f(z^(l)_i)
e.g. z^(2)_1 = Σ_{j=1}^{3} W^(1)_1j x_j + b^(1)_1,  a^(2)_1 = f(z^(2)_1)
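The forward step above can be sketched in plain Python; the sigmoid activation and all numeric values here are assumptions for illustration, not values from the slides:

```python
import math

def sigmoid(z):
    # logistic activation f(z) = 1 / (1 + e^-z)
    return 1.0 / (1.0 + math.exp(-z))

def forward_unit(W_row, b, x, f=sigmoid):
    # z_i = sum_j W_ij * x_j + b_i ;  a_i = f(z_i)
    z = sum(w * xj for w, xj in zip(W_row, x)) + b
    return z, f(z)

# illustrative 3-input example matching z^(2)_1 = sum_{j=1}^{3} W^(1)_1j x_j + b^(1)_1
x = [1.0, 2.0, 3.0]          # input units
W1_row1 = [0.1, -0.2, 0.3]   # W^(1)_1j, assumed values
b1_1 = 0.5                   # b^(1)_1, assumed value
z, a = forward_unit(W1_row1, b1_1, x)
print(z)  # 0.1*1 - 0.2*2 + 0.3*3 + 0.5 = 1.1
```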
2. Backpropagation Algorithm
“Loss function”: for one example (x, y), e.g. the squared-error cost J(W, b; x, y) = ½ ‖h_{W,b}(x) − y‖²
2. Backpropagation Algorithm
“Gradient Descent”: update every parameter by the negative gradient,
W^(l)_ij := W^(l)_ij − α ∂J/∂W^(l)_ij,  b^(l)_i := b^(l)_i − α ∂J/∂b^(l)_i  (for each i, j, l)
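A minimal sketch of the gradient-descent update on a toy quadratic loss (not the network cost; the loss and learning rate are assumptions):

```python
def grad_descent(grad, params, alpha=0.1, steps=100):
    # repeatedly apply: param := param - alpha * dJ/dparam
    for _ in range(steps):
        g = grad(params)
        params = [p - alpha * gi for p, gi in zip(params, g)]
    return params

# toy loss J(w, b) = (w - 3)^2 + (b + 1)^2, gradient = (2(w - 3), 2(b + 1))
grad = lambda p: [2 * (p[0] - 3), 2 * (p[1] + 1)]
w, b = grad_descent(grad, [0.0, 0.0])
print(w, b)  # converges toward the minimum at w = 3, b = -1
```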
2. Backpropagation Algorithm
“Derivative chain rule”
a^(l+1)_i = f(z^(l+1)_i)
∂J(W,b;x_i,y_i)/∂W^(l)_ij = ∂J(W,b;x_i,y_i)/∂z^(l+1)_i × ∂z^(l+1)_i/∂W^(l)_ij
where z^(l+1)_i = Σ_{j=1}^{s_l} W^(l)_ij a^(l)_j + b^(l)_i
2. Backpropagation Algorithm
“error term”
δ^(l+1)_i = ∂J(W,b;x_i,y_i)/∂z^(l+1)_i
which measures how much that node was “responsible” for any errors in the output
2. Backpropagation Algorithm
“error term”
a) For each output unit i in layer n_l (the output layer):
δ^(n_l)_i = ∂J(W,b;x_i,y_i)/∂z^(n_l)_i = −(y_i − a^(n_l)_i) · f′(z^(n_l)_i)
using a^(l+1)_i = f(z^(l+1)_i) and z^(l+1)_i = Σ_{j=1}^{s_l} W^(l)_ij a^(l)_j + b^(l)_i
2. Backpropagation Algorithm
“error term”
b) For l = n_l − 1, n_l − 2, n_l − 3, …, 2 (working backwards through the hidden layers):
δ^(l)_i = (Σ_{j=1}^{s_{l+1}} W^(l)_ji δ^(l+1)_j) · f′(z^(l)_i)
2. Backpropagation Algorithm
“Gradient Descent”
“Derivative chain rule”: ∂J(W,b;x_i,y_i)/∂W^(l)_ij = ∂J(W,b;x_i,y_i)/∂z^(l+1)_i × ∂z^(l+1)_i/∂W^(l)_ij
“error term”: δ^(l+1)_i = ∂J(W,b;x_i,y_i)/∂z^(l+1)_i
z^(l+1)_i = Σ_{j=1}^{s_l} W^(l)_ij a^(l)_j + b^(l)_i, so ∂z^(l+1)_i/∂W^(l)_ij = a^(l)_j
Combining: ∂J/∂W^(l)_ij = δ^(l+1)_i · a^(l)_j
2. Backpropagation Algorithm
“error term”
Conclusion: the error term (residual) of a node (neuron) equals the weighted sum of the error terms of the next-layer units that used this node during the forward pass.
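The two error-term rules and the gradient ∂J/∂W^(l)_ij = δ^(l+1)_i · a^(l)_j can be sketched for a tiny 2-2-1 sigmoid network; all numeric weights and inputs below are assumed for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def forward(x, W1, b1, W2, b2):
    # z^(l+1)_i = sum_j W^(l)_ij a^(l)_j + b^(l)_i ;  a = f(z)
    z2 = [sum(W1[i][j] * x[j] for j in range(len(x))) + b1[i] for i in range(len(b1))]
    a2 = [sigmoid(z) for z in z2]
    z3 = [sum(W2[i][j] * a2[j] for j in range(len(a2))) + b2[i] for i in range(len(b2))]
    a3 = [sigmoid(z) for z in z3]
    return z2, a2, z3, a3

def loss(x, y, W1, b1, W2, b2):
    # squared-error cost J = 1/2 * ||a^(3) - y||^2
    a3 = forward(x, W1, b1, W2, b2)[3]
    return 0.5 * sum((ai - yi) ** 2 for ai, yi in zip(a3, y))

def backprop(x, y, W1, b1, W2, b2):
    z2, a2, z3, a3 = forward(x, W1, b1, W2, b2)
    # a) output layer: delta^(3)_i = -(y_i - a^(3)_i) * f'(z^(3)_i)
    d3 = [-(y[i] - a3[i]) * sigmoid_prime(z3[i]) for i in range(len(y))]
    # b) hidden layer: delta^(2)_i = (sum_j W^(2)_ji delta^(3)_j) * f'(z^(2)_i)
    d2 = [sum(W2[j][i] * d3[j] for j in range(len(d3))) * sigmoid_prime(z2[i])
          for i in range(len(z2))]
    # gradients: dJ/dW^(l)_ij = delta^(l+1)_i * a^(l)_j
    gW1 = [[d2[i] * x[j] for j in range(len(x))] for i in range(len(d2))]
    gW2 = [[d3[i] * a2[j] for j in range(len(a2))] for i in range(len(d3))]
    return gW1, d2, gW2, d3   # bias gradients are the deltas themselves

# illustrative numbers (assumed, not from the slides)
x, y = [1.0, 0.5], [1.0]
W1, b1 = [[0.1, 0.2], [0.3, -0.1]], [0.0, 0.0]
W2, b2 = [[0.4, -0.3]], [0.0]
gW1, gb1, gW2, gb2 = backprop(x, y, W1, b1, W2, b2)

# sanity check one weight against a numerical derivative
eps = 1e-6
W1p = [[0.1 + eps, 0.2], [0.3, -0.1]]
num = (loss(x, y, W1p, b1, W2, b2) - loss(x, y, W1, b1, W2, b2)) / eps
print(abs(num - gW1[0][0]) < 1e-6)  # True: analytic and numerical gradients agree
```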
3. Convolutional Neural Network
convolution
How to calculate the number of hidden units? 1,000,000 hidden units?
3. Convolutional Neural Network
convolution layer
stride, pad, kernel(size), number(output)
H_{l+1} = (H_l + 2 × pad_h − kernel_h) / stride_h + 1
W_{l+1} = (W_l + 2 × pad_w − kernel_w) / stride_w + 1
e.g. kernel_h = kernel_w = 3, H_l = W_l = 5, pad = 0, stride = 1  →  H_{l+1} = W_{l+1} = 3
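The output-size formula above, as a small helper (integer division assumes the sizes divide evenly):

```python
def conv_output_size(size, kernel, pad=0, stride=1):
    # (size + 2*pad - kernel) / stride + 1, per the formula above
    return (size + 2 * pad - kernel) // stride + 1

# the slide's example: 5x5 input, 3x3 kernel, pad = 0, stride = 1 -> 3x3 output
print(conv_output_size(5, 3, pad=0, stride=1))  # 3
```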
3. Convolutional Neural Network
pooling layer
Max Pooling & Avg Pooling
3. Convolutional Neural Network
fully connected layer
activation function
ReLU, Sigmoid, tanh
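The three activation functions listed above, as a plain-Python sketch:

```python
import math

def relu(z):
    # ReLU: max(0, z)
    return max(0.0, z)

def sigmoid(z):
    # Sigmoid: 1 / (1 + e^-z), squashes to (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    # tanh squashes to (-1, 1)
    return math.tanh(z)

print(relu(-2.0), relu(3.0))   # 0.0 3.0
print(sigmoid(0.0))            # 0.5
print(tanh(0.0))               # 0.0
```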
3. Convolutional Neural Network
MNIST
3. Convolutional Neural Network
convolution layer
a. compute the input size
b. set up stride, pad, kernel (size), number (output)
c. compute the size of the filter and the output
d. *initialize weights (on the first iteration)
e. convolution
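Steps a–e above can be sketched as a minimal single-channel convolution (valid mode, stride 1; the toy image and kernel values are assumptions for illustration, and, as in most deep-learning frameworks, no kernel flip is applied in the forward pass):

```python
def conv2d(inp, kernel, stride=1):
    # valid convolution (pad = 0): slide the kernel over the input,
    # output size = (in - kernel) / stride + 1 per dimension
    kh, kw = len(kernel), len(kernel[0])
    oh = (len(inp) - kh) // stride + 1
    ow = (len(inp[0]) - kw) // stride + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(inp[i * stride + r][j * stride + c] * kernel[r][c]
                            for r in range(kh) for c in range(kw))
    return out

img = [[1, 1, 1], [0, 1, 0], [0, 1, 1]]   # toy 3x3 "image" (assumed values)
k = [[1, 0], [0, 1]]                       # toy 2x2 kernel (assumed values)
print(conv2d(img, k))  # [[2, 1], [1, 2]]
```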
4. Backpropagation on CNN
• Backpropagation on pooling layer
• Backpropagation on convolution layer
4. Backpropagation on CNN
“error term”
Conclusion: the error term (residual) of a node (neuron) equals the weighted sum of the error terms of the next-layer units that used this node during the forward pass.
4. Backpropagation on CNN
Backpropagation on pooling layer
Forward, max pooling with kernel_size = 2, stride = 2, pad = 0:
input (4×4):
1 2 3 4
5 6 7 8
9 2 3 5
6 5 1 1
output (2×2):
6 8
9 5
Backward, δ^(l+1) → δ^(l): each gradient is routed back only to the position that held the max
δ^(l+1) (2×2):
1 2
3 4
δ^(l) (4×4):
0 0 0 0
0 1 0 2
3 0 0 4
0 0 0 0
4. Backpropagation on CNN
Backpropagation on pooling layer
Forward, avg pooling with kernel_size = 2, stride = 2, pad = 0:
input (4×4):
1 2 3 4
5 6 7 8
9 2 3 5
6 5 1 1
output (2×2):
3.5 5.5
5.5 2.5
Backward, δ^(l+1) → δ^(l): each gradient is spread evenly (divided by 4) over its 2×2 window
δ^(l+1) (2×2):
1 2
3 4
δ^(l) (4×4):
0.25 0.25 0.5 0.5
0.25 0.25 0.5 0.5
0.75 0.75 1 1
0.75 0.75 1 1
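Both pooling passes can be sketched in plain Python using the slide's 4×4 input and 2×2 δ^(l+1) (the function names and `mode` flag are my own):

```python
def pool_forward(x, k=2, s=2, mode="max"):
    # forward pass: reduce each k x k window (stride s) to its max or average
    oh, ow = (len(x) - k) // s + 1, (len(x[0]) - k) // s + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            win = [x[i * s + r][j * s + c] for r in range(k) for c in range(k)]
            out[i][j] = max(win) if mode == "max" else sum(win) / len(win)
    return out

def pool_backward(x, delta, k=2, s=2, mode="max"):
    # backward pass: route delta to the max position, or spread it evenly (avg)
    grad = [[0.0] * len(x[0]) for _ in range(len(x))]
    for i in range(len(delta)):
        for j in range(len(delta[0])):
            win = [(x[i * s + r][j * s + c], r, c) for r in range(k) for c in range(k)]
            if mode == "max":
                _, r, c = max(win)
                grad[i * s + r][j * s + c] += delta[i][j]
            else:
                for _, r, c in win:
                    grad[i * s + r][j * s + c] += delta[i][j] / (k * k)
    return grad

x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 2, 3, 5], [6, 5, 1, 1]]
d = [[1, 2], [3, 4]]
print(pool_forward(x, mode="max"))   # [[6, 8], [9, 5]]
print(pool_backward(x, d, mode="max"))
# [[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 2.0], [3.0, 0.0, 0.0, 4.0], [0.0, 0.0, 0.0, 0.0]]
```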
4. Backpropagation on CNN
Backpropagation on convolution layer
Forward: a^(l) (5×5) ∗ W^(l) (3×3), stride = 1, pad = 0 → layer l+1 (3×3)
a^(l):
1 1 1 0 0
0 0 0 1 0
0 1 1 1 1
0 1 1 0 0
1 1 1 1 0
W^(l):
w1 w2 w3
w4 w5 w6
w7 w8 w9
4. Backpropagation on CNN
Backpropagation on convolution layer
To recover δ^(l) (5×5) from δ^(l+1) (3×3), pad δ^(l+1) and convolve with the rotated kernel.
Choosing pad from W_l = (W_{l+1} + 2 × pad − kernel) / stride + 1: with stride = 1, kernel = 3, pad = 2 gives (3 + 2×2 − 3)/1 + 1 = 5
δ^(l+1) (3×3):
1 2 3
4 5 6
7 8 9
δ^(l+1) padded with pad = 2 (7×7):
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 1 2 3 0 0
0 0 4 5 6 0 0
0 0 7 8 9 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
W^(l) rotated by 180° (rotation):
w9 w8 w7
w6 w5 w4
w3 w2 w1
δ^(l) = pad(δ^(l+1)) ∗ rot180(W^(l)), stride = 1
Conclusion: the error term of a node equals the weighted sum of the error terms of the next-layer units that used it in the forward pass; sliding the rotated kernel over the padded δ^(l+1) computes exactly this weighted sum at every position.
4. Backpropagation on CNN
Backpropagation on convolution layer
a. pad δ^(l+1)  (pad = kernel − 1 = 2 here)
b. rotate W^(l) by 180°
c. compute δ^(l) (convolve the padded δ^(l+1) with the rotated kernel)
d. compute the gradient ∂J/∂W^(l)
e. update the weights
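Steps a–c can be sketched in plain Python and sanity-checked against a direct accumulation of the error terms (the numeric kernel values are assumptions; the slides use symbolic w1…w9):

```python
def conv2d(inp, ker):
    # valid convolution, stride 1
    kh, kw = len(ker), len(ker[0])
    oh, ow = len(inp) - kh + 1, len(inp[0]) - kw + 1
    return [[sum(inp[i + r][j + c] * ker[r][c] for r in range(kh) for c in range(kw))
             for j in range(ow)] for i in range(oh)]

def pad2d(x, p):
    # zero-pad a matrix by p on every side
    w = len(x[0]) + 2 * p
    top = [[0.0] * w for _ in range(p)]
    mid = [[0.0] * p + list(row) + [0.0] * p for row in x]
    return top + mid + [[0.0] * w for _ in range(p)]

def rot180(k):
    # rotate the kernel by 180 degrees
    return [list(reversed(row)) for row in reversed(k)]

def conv_backward_input(delta, ker):
    # steps a-c: pad delta^(l+1) by kernel-1, rotate W by 180 deg, convolve
    return conv2d(pad2d(delta, len(ker) - 1), rot180(ker))

def conv_backward_direct(delta, ker, in_h, in_w):
    # reference: each delta[i][j] flows back to the input positions that
    # produced output (i, j), weighted by the kernel entries
    grad = [[0.0] * in_w for _ in range(in_h)]
    for i in range(len(delta)):
        for j in range(len(delta[0])):
            for r in range(len(ker)):
                for c in range(len(ker[0])):
                    grad[i + r][j + c] += delta[i][j] * ker[r][c]
    return grad

delta = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]                   # delta^(l+1) from the slides
ker = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]   # assumed numeric w1..w9
a = conv_backward_input(delta, ker)
b = conv_backward_direct(delta, ker, 5, 5)
print(all(abs(a[i][j] - b[i][j]) < 1e-9 for i in range(5) for j in range(5)))  # True
```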
Reference
1. UFLDL: http://ufldl.stanford.edu/wiki/index.php/UFLDL_Tutorial
2. Blogs:
   http://www.cnblogs.com/tornadomeet/tag/Deep%20Learning/
   http://blog.csdn.net/zouxy09/article/category/1387932
THANK YOU