
Convolutional Neural Network & Backpropagation Algorithm

Xuelin Qian

Content

1.Neural Network

2.Backpropagation Algorithm

3.Convolutional Neural Network

4.Backpropagation on CNN


“neuron”

where $f$ is called the activation function; in the UFLDL notation a single neuron computes $h_{W,b}(x) = f\left(\sum_{i} W_i x_i + b\right)$

1. Neural Network

neural network: consists of many simple “neurons”

1. Neural Network

Input Layer

Hidden Layer

Output Layer

Forward Propagation

1. Neural Network

$n_l$ : the number of layers
$L_l$ : layer $l$
$W_{ij}^{(l)}$ : the weight associated with the connection between unit $j$ in layer $l$ and unit $i$ in layer $l+1$
$b_i^{(l)}$ : the bias associated with unit $i$ in layer $l+1$
$a_i^{(l)}$ : the activation of unit $i$ in layer $l$
$z_i^{(l)}$ : the total weighted sum of inputs to unit $i$ in layer $l$

e.g.

$$z_1^{(2)} = \sum_{j=1}^{3} W_{1j}^{(1)} x_j + b_1^{(1)}, \qquad a_1^{(2)} = f\left(z_1^{(2)}\right)$$
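The same computation as a minimal numpy sketch; the sigmoid choice of $f$, the layer sizes, and all array names (x, W1, b1) are illustrative assumptions, not part of the slides:

```python
import numpy as np

def sigmoid(z):
    # the activation function f (an assumed choice)
    return 1.0 / (1.0 + np.exp(-z))

# three inputs x_1..x_3, four units in layer 2
x = np.array([0.5, -1.0, 2.0])
W1 = 0.1 * np.random.randn(4, 3)   # W^{(1)}: weights from layer 1 to layer 2
b1 = np.zeros(4)                   # b^{(1)}: biases of layer 2

z2 = W1 @ x + b1    # z^{(2)}_i = sum_j W^{(1)}_{ij} x_j + b^{(1)}_i
a2 = sigmoid(z2)    # a^{(2)} = f(z^{(2)})
print(a2)
```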

Content

1.Neural Network

2.Backpropagation Algorithm

3.Convolutional Neural Network

4.Backpropagation on CNN

2. Backpropagation Algorithm

“Loss function” (squared error for a single training example, in the UFLDL notation):

$$J(W,b;x,y) = \frac{1}{2}\left\| h_{W,b}(x) - y \right\|^2$$

2. Backpropagation Algorithm

“Gradient Descent”
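The update rule itself (following the UFLDL tutorial cited in the references, with learning rate $\alpha$):

$$W_{ij}^{(l)} := W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W,b), \qquad b_{i}^{(l)} := b_{i}^{(l)} - \alpha \frac{\partial}{\partial b_{i}^{(l)}} J(W,b)$$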

2. Backpropagation Algorithm

“Derivative chain rule”

$$a_i^{(l+1)} = f\left(z_i^{(l+1)}\right) \quad \text{(from Forward Propagation)}$$

$$\frac{\partial J(W,b;x,y)}{\partial W_{ij}^{(l)}} = \frac{\partial J(W,b;x,y)}{\partial z_i^{(l+1)}} \times \frac{\partial z_i^{(l+1)}}{\partial W_{ij}^{(l)}}$$

$$z_i^{(l+1)} = \sum_{j=1}^{s_l} W_{ij}^{(l)} a_j^{(l)} + b_i^{(l)}$$

2. Backpropagation Algorithm

“error term”

$$\delta_i^{(l+1)} = \frac{\partial}{\partial z_i^{(l+1)}} J(W,b;x,y)$$

which measures how much that node was “responsible” for any errors in the output.

2. Backpropagation Algorithm

“error term”

a) For each output unit $i$ in layer $n_l$ (the output layer), with the squared-error loss:

$$\delta_i^{(n_l)} = \frac{\partial}{\partial z_i^{(n_l)}} J(W,b;x,y) = -\left(y_i - a_i^{(n_l)}\right) \cdot f'\left(z_i^{(n_l)}\right)$$

using, from Forward Propagation,

$$z_i^{(l+1)} = \sum_{j=1}^{s_l} W_{ij}^{(l)} a_j^{(l)} + b_i^{(l)}, \qquad a_i^{(l+1)} = f\left(z_i^{(l+1)}\right)$$

2. Backpropagation Algorithm

“error term”

b) For $l = n_l - 1,\ n_l - 2,\ n_l - 3,\ \ldots,\ 2$:

$$\delta_i^{(l)} = \left( \sum_{j=1}^{s_{l+1}} W_{ji}^{(l)} \delta_j^{(l+1)} \right) f'\left(z_i^{(l)}\right)$$

again using $z_i^{(l+1)} = \sum_{j=1}^{s_l} W_{ij}^{(l)} a_j^{(l)} + b_i^{(l)}$ and $a_i^{(l+1)} = f\left(z_i^{(l+1)}\right)$.


2. Backpropagation Algorithm

Putting the pieces together:

“Gradient Descent” needs the partial derivatives of the loss.

“Derivative chain rule”

$$\frac{\partial J(W,b;x,y)}{\partial W_{ij}^{(l)}} = \frac{\partial J(W,b;x,y)}{\partial z_i^{(l+1)}} \times \frac{\partial z_i^{(l+1)}}{\partial W_{ij}^{(l)}}$$

“error term”

$$\delta_i^{(l+1)} = \frac{\partial}{\partial z_i^{(l+1)}} J(W,b;x,y), \qquad z_i^{(l+1)} = \sum_{j=1}^{s_l} W_{ij}^{(l)} a_j^{(l)} + b_i^{(l)}$$

so $\dfrac{\partial J(W,b;x,y)}{\partial W_{ij}^{(l)}} = \delta_i^{(l+1)} a_j^{(l)}$.

2. Backpropagation Algorithm

“error term”


Conclusion: the error term (residual) of a node (neuron) equals the weighted sum, by connection weight, of the error terms of the units in the next layer that used that node during the forward pass.
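Rules a) and b) above as a minimal numpy sketch for one hidden layer; the sigmoid activation and the names (W1, W2, d2, d3) are illustrative assumptions:

```python
import numpy as np

def f(z):
    return 1.0 / (1.0 + np.exp(-z))    # sigmoid activation (an assumed choice)

def f_prime(z):
    return f(z) * (1.0 - f(z))         # its derivative f'

def backprop(x, y, W1, b1, W2, b2):
    # forward pass (Section 1)
    z2 = W1 @ x + b1
    a2 = f(z2)
    z3 = W2 @ a2 + b2
    a3 = f(z3)
    # a) output layer: delta^{(3)} = -(y - a^{(3)}) * f'(z^{(3)})
    d3 = -(y - a3) * f_prime(z3)
    # b) hidden layer: delta^{(2)} = (W^{(2)T} delta^{(3)}) * f'(z^{(2)})
    d2 = (W2.T @ d3) * f_prime(z2)
    # gradients: dJ/dW^{(l)}_{ij} = delta^{(l+1)}_i * a^{(l)}_j, dJ/db^{(l)} = delta^{(l+1)}
    return np.outer(d3, a2), d3, np.outer(d2, x), d2
```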

Content

1.Neural Network

2.Backpropagation Algorithm

3.Convolutional Neural Network

4.Backpropagation on CNN

3. Convolutional Neural Network


convolution

How to calculate the number of hidden units? 1,000,000 hidden units?

3. Convolutional Neural Network

convolution layer

stride, pad, kernel(size), number(output)

$$H^{l+1} = \frac{H^{l} + 2 \times pad\_h - kernel\_h}{stride\_h} + 1$$

$$W^{l+1} = \frac{W^{l} + 2 \times pad\_w - kernel\_w}{stride\_w} + 1$$

e.g. $kernel\_h = kernel\_w = 3$, $H^{l} = W^{l} = 5$, $pad = 0$, $stride = 1$, so $H^{l+1} = W^{l+1} = 3$.
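The size formula as a small helper (a sketch; the function name is illustrative):

```python
def conv_output_size(in_size, kernel, pad=0, stride=1):
    # (H^l + 2*pad - kernel) / stride + 1, per the formula above
    return (in_size + 2 * pad - kernel) // stride + 1

assert conv_output_size(5, kernel=3, pad=0, stride=1) == 3
```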

3. Convolutional Neural Network

pooling layer

Max Pooling & Avg Pooling

3. Convolutional Neural Network

fully connected layer

activation function

ReLU Sigmoid tanh
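The three activations as plain numpy one-liners (a sketch):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)
```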

3. Convolutional Neural Network

MNIST

3. Convolutional Neural Network

convolution layer

a. compute the input size

b. set up stride, pad, kernel(size), number(output)

c. compute the size of filter and output

d. *initialize weights (the first iteration)

e. convolution

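Steps c–e as a minimal direct (loop-based) convolution sketch, assuming a single channel; the helper name conv2d and its defaults are illustrative:

```python
import numpy as np

def conv2d(a, W, b=0.0, stride=1, pad=0):
    """Direct 2-D convolution of input a (layer l) with kernel W, single channel."""
    a = np.pad(a, pad)                          # zero padding on every side
    kh, kw = W.shape
    out_h = (a.shape[0] - kh) // stride + 1     # output size, per the formula above
    out_w = (a.shape[1] - kw) // stride + 1
    z = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = a[i*stride:i*stride+kh, j*stride:j*stride+kw]
            z[i, j] = np.sum(patch * W) + b     # one output unit (step e)
    return z
```

For a 5×5 input and a 3×3 kernel with stride=1, pad=0, conv2d returns the 3×3 map that feeds layer l+1.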

Content

1.Neural Network

2.Backpropagation Algorithm

3.Convolutional Neural Network

4.Backpropagation on CNN

4. Backpropagation on CNN

• Backpropagation on pooling layer

• Backpropagation on convolution layer

4. Backpropagation on CNN

“error term”

Conclusion: the error term (residual) of a node (neuron) equals the weighted sum, by connection weight, of the error terms of the units in the next layer that used that node during the forward pass.

4. Backpropagation on CNN

Backpropagation on pooling layer

max pooling (kernel_size=2, stride=2, pad=0)

Forward:

1 2 3 4
5 6 7 8
9 2 3 5
6 5 1 1

→

6 8
9 5

Backward ($\delta^{(l+1)} \rightarrow \delta^{(l)}$): each error is routed back to the position that produced its window's max:

1 2
3 4

→

0 0 0 0
0 1 0 2
3 0 0 4
0 0 0 0
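A sketch of that routing for this 2×2, stride-2 example (illustrative names; non-overlapping windows assumed):

```python
import numpy as np

def maxpool_backward(a, delta_out, k=2, stride=2):
    # route each delta^{(l+1)} entry to the argmax position of its window in a
    delta_in = np.zeros_like(a, dtype=float)
    for i in range(delta_out.shape[0]):
        for j in range(delta_out.shape[1]):
            win = a[i*stride:i*stride+k, j*stride:j*stride+k]
            r, c = np.unravel_index(np.argmax(win), win.shape)
            delta_in[i*stride + r, j*stride + c] += delta_out[i, j]
    return delta_in

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 2, 3, 5], [6, 5, 1, 1]])
d = np.array([[1, 2], [3, 4]])
print(maxpool_backward(a, d))   # matches the 4x4 grid above
```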

4. Backpropagation on CNN

Backpropagation on pooling layer

avg pooling (kernel_size=2, stride=2, pad=0)

Forward:

1 2 3 4
5 6 7 8
9 2 3 5
6 5 1 1

→

3.5 5.5
5.5 2.5

Backward ($\delta^{(l+1)} \rightarrow \delta^{(l)}$): each error is spread evenly over its 2×2 window:

1 2
3 4

→

0.25 0.25 0.5 0.5
0.25 0.25 0.5 0.5
0.75 0.75 1 1
0.75 0.75 1 1
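The average-pooling counterpart, dividing each error equally among its inputs (same non-overlapping assumption, illustrative names):

```python
import numpy as np

def avgpool_backward(delta_out, k=2, stride=2):
    # each delta^{(l+1)} entry is split equally among the k*k inputs of its window
    H, W = delta_out.shape
    delta_in = np.zeros((H * stride, W * stride))
    for i in range(H):
        for j in range(W):
            delta_in[i*stride:i*stride+k, j*stride:j*stride+k] += delta_out[i, j] / (k * k)
    return delta_in

print(avgpool_backward(np.array([[1., 2.], [3., 4.]])))  # the 0.25 ... 1 grid above
```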

4. Backpropagation on CNN

Backpropagation on convolution layer

1 1 1 0 0

0 0 0 1 0

0 1 1 1 1

0 1 1 0 0

1 1 1 1 0

w1 w2 w3

w4 w5 w6

w7 w8 w9

convolution, stride=1, pad=0: the 5×5 input of layer $l$ is convolved with the 3×3 kernel $W^l$ to produce the 3×3 input to layer $l+1$.

4. Backpropagation on CNN

Backpropagation on convolution layer

w1 w2 w3

w4 w5 w6

w7 w8 w9

*

0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 1 2 3 0 0
0 0 4 5 6 0 0
0 0 7 8 9 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

To recover the 5×5 $\delta^{(l)}$ from the 3×3 $\delta^{(l+1)}$, pad $\delta^{(l+1)}$ with zeros and slide the kernel $W^l$ over it. The size formula

$$W^{l+1} = \frac{W^{l} + 2 \times pad - kernel}{stride} + 1$$

with stride=1, pad=? gives $(3 + 2 \times pad - 3)/1 + 1 = 5$, i.e. pad = 2.

4. Backpropagation on CNN

Backpropagation on convolution layer

With stride=1, pad=2, convolving $W^l$ over the zero-padded $\delta^{(l+1)}$ produces the 5×5 $\delta^{(l)}$.

Conclusion (as in Section 2): the error term of a node (neuron) equals the weighted sum, by connection weight, of the error terms of the next-layer units that used that node during the forward pass.


4. Backpropagation on CNN

Backpropagation on convolution layer

For each $\delta^{(l)}$ entry to pick up exactly the weights that used it in the forward pass, the kernel $W^l$ must first be rotated by 180° (rotation):

w9 w8 w7
w6 w5 w4
w3 w2 w1

*

0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 1 2 3 0 0
0 0 4 5 6 0 0
0 0 7 8 9 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

stride=1, pad=2 → $\delta^{(l)}$ (5×5)

4. Backpropagation on CNN

Backpropagation on convolution layer

a. pad $\delta^{(l+1)}$ with zeros (pad = kernel − 1)

b. rotate $W^l$ by 180°

c. compute $\delta^{(l)}$ (convolution)

d. compute the gradient $\partial J / \partial W$

e. update the weights
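Steps a–d as a numpy sketch, reusing the conv2d helper sketched in Section 3 (single channel, stride 1; all names are illustrative):

```python
import numpy as np

def conv_backward(a, W, delta_out):
    kernel = W.shape[0]
    # a. pad delta^{(l+1)} with kernel-1 zeros on every side (pad = 2 for a 3x3 kernel)
    padded = np.pad(delta_out, kernel - 1)
    # b. rotate W^l by 180 degrees
    W_rot = np.rot90(W, 2)
    # c. compute delta^{(l)} by convolving the rotated kernel over the padded map
    delta_in = conv2d(padded, W_rot)
    # d. compute dJ/dW by sliding delta^{(l+1)} over the layer-l input a
    grad_W = conv2d(a, delta_out)
    return delta_in, grad_W
```

Step e is then the usual gradient descent update from Section 2, e.g. `W -= alpha * grad_W`.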

Reference

1. UFLDL: http://ufldl.stanford.edu/wiki/index.php/UFLDL_Tutorial
2. Blogs:
   http://www.cnblogs.com/tornadomeet/tag/Deep%20Learning/
   http://blog.csdn.net/zouxy09/article/category/1387932

THANK YOU
