49
UDRC-EURASIP Summer School 26th June 2019 - Edinburgh Sen Wang UDRC Co-I – WP3.1 and WP3.2 Assistant Professor in Robotics and Autonomous Systems Institute of Signals, Sensors and Systems Heriot-Watt University Deep Neural Networks II Slides adapted from Andrej Karpathy, Kaiming He

Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

UDRC-EURASIP Summer School 26th June 2019 - Edinburgh

Sen Wang

UDRC Co-I – WP3.1 and WP3.2Assistant Professor in Robotics and Autonomous Systems

Institute of Signals, Sensors and SystemsHeriot-Watt University

Deep Neural Networks II

Slides adaptedfromAndrejKarpathy,Kaiming He

Page 2: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Outline

Learning features for machines to solve problems

• Convolutional Neural Networks (CNNs)

• Deep Learning Architectures (focus on CNNs) - learning features

• Some Deep Learning Applications - problemso Object detection (image, radar, sonar)o Semantic segmentation o Visual odometryo 3D reconstructiono Semantic mappingo Robot navigation o Manipulation and graspingo …………

UDRC-EURASIP Summer School 1

Page 3: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Deep Learning: a learning technique combining layers of neural networks to automatically identify features that are relevant to the problem to solve

UDRC-EURASIP Summer School

test data

forwardprediction

trained DNN

Testing

Training (supervised learning)

big data

forward

backward

prediction

error ⌦ label

data label

Deep Learning

Low-level Features

Middle-level Features

High-level FeaturesRaw Data

2

Page 4: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Deep Learning in Robotics

UDRC-EURASIP Summer School

IJRR2016

ICRA2018 ~2500 submissions:the most popular keyword

IJCV2018

3

Page 5: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Deep Learning in Robotics

UDRC-EURASIP Summer School 4

Page 6: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Convolutional Neural Networks (CNNs)

UDRC-EURASIP Summer School 5

Page 7: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

From MLPs to CNNs

• Feed-forward Neural Networks or Multi-Layer Perceptrons (MLPs)o many multiplications

• CNNs are similar to Feed-forward Neural Networks o convolution instead of general matrix multiplication

UDRC-EURASIP Summer School 6

Page 8: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

UDRC-EURASIP Summer School

CNNs

• 3 Main Types of Layers:o convolutional layero activation layero pooling layer

• repeat many times

input layer

convolutional layer

activation layer

pooling layer

fully-connected layer

……….

7

Page 9: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

32x32x3 image

width

height

depth

CNNs: Convolution Layer

UDRC-EURASIP Summer School

5x5x3 filter

Convolve the filter with the imagei.e. “slide over the image spatially, computing dot products”

8SlidescourtesyofAndrejKarpathy

Page 10: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

5x5x3 filter

32x32x3 image

Convolve the filter with the imagei.e. “slide over the image spatially, computing dot products”

Filters always extend the full depth of the input volume

CNNs: Convolution Layer

UDRC-EURASIP Summer School 9

Page 11: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

32x32x3 image5x5x3 filter

1 number: the result of taking a dot product between the filter and a small 5x5x3 chunk of the image(i.e. 5*5*3 = 75-dimensional dot product + bias)

CNNs: Convolution Layer

UDRC-EURASIP Summer School

2 important ideas:• local connectivity• parameter sharing

10

Page 12: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

32x32x3 image5x5x3 filter

convolve (slide) over all spatial locations

activation map

1

28

28

CNNs: Convolution Layer

UDRC-EURASIP Summer School 11

Page 13: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

32x32x3 image5x5x3 filter

convolve (slide) over all spatial locations

activation maps

1

28

28

consider a second, green filter

CNNs: Convolution Layer

UDRC-EURASIP Summer School 12

Page 14: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

Convolution Layer

activation maps

6

28

28

For example, if we had 6 of 5x5 filters, we’ll get 6 separate activation maps:

We stack these up to get a “new image” of size 28x28x6

CNNs: Convolution Layer

UDRC-EURASIP Summer School 13

Page 15: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

Convolution Layer

activation maps

6

28

28

For example, if we had 6 of 5x5 filters, we’ll get 6 separate activation maps:

We processed [32x32x3] volume into [28x28x6] volume.Q: how many parameters would this be if we used a fully connected layer instead?

CNNs: Convolution Layer

UDRC-EURASIP Summer School 14courtesyofAndrejKarpathy

Page 16: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

Convolution Layer

activation maps

6

28

28

For example, if we had 6 of 5x5 filters, we’ll get 6 separate activation maps:

We processed [32x32x3] volume into [28x28x6] volume.Q: how many parameters would this be if we used a fully connected layer instead?A: (32*32*3)*(28*28*6) = 14.5M parameters, ~14.5M multiplies

CNNs: Convolution Layer

UDRC-EURASIP Summer School 15

Page 17: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

Convolution Layer

activation maps

6

28

28

For example, if we had 6 of 5x5 filters, we’ll get 6 separate activation maps:

We processed [32x32x3] volume into [28x28x6] volume.Q: how many parameters are used instead?

CNNs: Convolution Layer

UDRC-EURASIP Summer School 16

Page 18: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

Convolution Layer

activation maps

6

28

28

For example, if we had 6 of 5x5 filters, we’ll get 6 separate activation maps:

We processed [32x32x3] volume into [28x28x6] volume.Q: how many parameters are used instead? --- And how many multiplies?A: (5*5*3)*6 = 450 parameters

CNNs: Convolution Layer

UDRC-EURASIP Summer School 17

Page 19: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

32

32

3

Convolution Layer

activation maps

6

28

28

For example, if we had 6 of 5x5 filters, we’ll get 6 separate activation maps:

We processed [32x32x3] volume into [28x28x6] volume.Q: how many parameters are used instead?A: (5*5*3)*6 = 450 parameters, (5*5*3)*(28*28*6) = ~350K multiplies

CNNs: Convolution Layer

UDRC-EURASIP Summer School

2 Merits:• vastly reduce the amount of parameters• more efficient

18

Page 20: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

UDRC-EURASIP Summer School

CNNs: Activation Layer

• 3 Main Types of Layers:o convolutional layero activation layero pooling layer

input layer

convolutional layer

activation layer

pooling layer

fully-connected layer

……….

19

Page 21: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

UDRC-EURASIP Summer School

CNNs: Pooling Layer

• 3 Main Types of Layers:o convolutional layero activation layero pooling layer

• repeat many times

input layer

convolutional layer

activation layer

pooling layer

fully-connected layer

……….

makes the representations smaller and more manageable

20

Page 22: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

CNNs: A sequence of Convolutional Layers

UDRC-EURASIP Summer School

32

32

3

28

28

6

CONV,ReLUe.g. 6 5x5x3 filters32

32

3

CONV,ReLUe.g. 6 5x5x3 filters 28

28

6

CONV,ReLUe.g. 10 5x5x6 filters

CONV,ReLU

….

10

24

24

21

Page 23: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Deep Learning Architectures

UDRC-EURASIP Summer School 22

Page 24: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Hand-Crafted Features by Human

UDRC-EURASIP Summer School

Feature Extraction(hand-crafted)

Activities, Context, …

Locations, Scene types,Semantics, …

Objects, Structure, …

InferencePervasive Data

time-series data

vision

point cloud

23

Page 25: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Feature Engineering and Representation

UDRC-EURASIP Summer School 24

vision

2563x800x600

point cloud

2?x?x?

Raw data ≈

Bad Representation

Pervasive Data

time-series data

Page 26: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Deep Learning: Representation Learning

UDRC-EURASIP Summer School

End-to-End Learning

Locations, Scene types, …

Activities, Context, …

Structure, Semantics, …

automatically learn effectivefeature representation to solve the problem

Inference

time-series data

vision

point cloud

Pervasive Data

25

Page 27: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

LeNet - 1998

UDRC-EURASIP Summer School 26

• Convolution:o locally-connectedo spatially weight-sharing

• weight-sharing is a key in DL• Subsampling• Fully-connected outputs

“Gradient-basedlearningappliedtodocumentrecognition”,LeCun etal.1998

Foundation of modern ConvNets!

Page 28: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

AlexNet – 2012

UDRC-EURASIP Summer School 27

8 layers: 5 conv and max-pooling + 3 fully-connected

LeNet-style backbone, plus:• ReLU

o Accelerate trainingo better gradprop (vs. tanh)

• Dropout o Reduce overfitting

• Data augmentationo Image transformationo Reduce overfitting

“ImageNetClassificationwithDeepConvolutionalNeuralNetworks”,Krizhevsky,Sutskever,Hinton.NIPS2012

Page 29: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

VGG16/19 - 2014

UDRC-EURASIP Summer School 28

Very deep ConvNet

Modularized design• 3x3 Conv as the module• Stack the same module• Same computation for each module

Stage-wise training• VGG-11 => VGG-13 => VGG-16

“VeryDeepConvolutionalNetworksforLarge-ScaleImageRecognition”,Simonyan &Zisserman.arXiv 2014(ICLR2015)

Page 30: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

GoogleNet/Inception - 2014

UDRC-EURASIP Summer School 29

22 layers

Multiple branches• e.g., 1x1, 3x3, 5x5, pooling• merged by concatenation• Reduce dimensionality by 1x1 before expensive 3x3/5x5 conv

Szegedy etal.“Goingdeeperwithconvolutions”.arXiv 2014(CVPR2015)

Page 31: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Going Deeper

UDRC-EURASIP Summer School 30

Simply stacking layers?

• Plain nets: stacking 3x3 conv layers• 56-layer net has higher training error and test error than 20-layer net• A deeper model should not have higher training error

Page 32: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Going Deeper

UDRC-EURASIP Summer School 31

Problem:deeper plain nets have higher training error on various datasets

Optimization difficulties: o vanishing gradiento solvers struggle to find the solution when going deeper

Kaiming He,Xiangyu Zhang,Shaoqing Ren,&JianSun.“DeepResidualLearningforImageRecognition”.CVPR2016.

Cannot go deeper for deep neural networks!

Page 33: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

ResNets-2016

UDRC-EURASIP Summer School 32

Kaiming He,Xiangyu Zhang,Shaoqing Ren,&JianSun.“DeepResidualLearningforImageRecognition”.CVPR2016.

Residual netPlain net

gradients can flow directly through the skip connections backwards

Page 34: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

ResNets-2016

UDRC-EURASIP Summer School 33

• Deep ResNets can be trained easier• Deeper ResNets have lower training error, and also lower test error

Kaiming He,Xiangyu Zhang,Shaoqing Ren,&JianSun.“DeepResidualLearningforImageRecognition”.CVPR2016.

Page 35: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

ImageNet experiments

UDRC-EURASIP Summer School 34

top5error%

Page 36: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

DenseNets - 2018

• simply connect every layer directly with each othero each layer has direct access to

the gradients from the loss function and the original input image

o exploit the potential of the network through feature reuse

• DenseNets concatenate the output feature maps of the layer with the incoming feature maps.

UDRC-EURASIP Summer School 35

G.Huang,Z.LiuandL.vanderMaaten,“DenselyConnectedConvolutionalNetworks,”2018.

Page 37: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

MobileNets - 2017

UDRC-EURASIP Summer School 36

Light-weight ConvNets for mobile applications using depth-wise convolutions

Howard.et.al.MobileNets:EfficientConvolutionalNeuralNetworksforMobileVisionApplications 2017

multiply–accumulateoperation

Page 38: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Deep Learning Applications

UDRC-EURASIP Summer School 37

Page 39: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Deep Learning Applications

UDRC-EURASIP Summer School 38

Feature Extractor

DataFeature(feature from

last Conv layer)

Application

Backbone network

Page 40: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Object Detection and Recognition

UDRC-EURASIP Summer School 39

Vision based: RCNN, Fast RCNN, Faster RCNN, YOLO, SSD,…..

Page 41: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Object Detection and Recognition

UDRC-EURASIP Summer School 40

Radar and sonar based object detection and recognition

object detection from sonar images[Valdenegro 2016] vehicle detection using polarised infrared sensors

[Sheeny 2018]

object detection/recognition on side scan object detection/recognition on radar

Page 42: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Semantic Segmentation

UDRC-EURASIP Summer School 41

FCN, SegNet, RefineNet, PSPNet, …..

Page 43: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Visual Odometry

• DeepVO, UnDeepVO, VINet, …..

UDRC-EURASIP Summer School 42

Page 44: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Image based Localisation

UDRC-EURASIP Summer School 43

PoseNet, VidLoc, …. : map images to 6 DoF poses

Page 45: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

3D Reconstruction

• OctNet, Octree Generative Network (OGN), Mesh R-CNN, …

UDRC-EURASIP Summer School 44

Page 46: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Semantic Mapping

UDRC-EURASIP Summer School 45

Page 47: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Robot Navigation

UDRC-EURASIP Summer School 46

Page 48: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Summary

• Deep Learning is a powerful tool• Learning representation is the key for Deep Learning

UDRC-EURASIP Summer School 47

Page 49: Deep Neural Networks II - University of Edinburghudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · Modularized design • 3x3 Conv as the module • Stack the same module •

Thank you for your attention!

Slides adaptedfromAndrejKarpathy,Kaiming He