
Deep Learning & Neural Networks

Machine Learning – CSE 546
Sham Kakade
University of Washington
November 29, 2016

©Sham Kakade

Announcements:

- HW4 posted
- Poster Session Thurs, Dec 8

- Today:
  - Review: EM
  - Neural nets and deep learning

Poster Session

- Thursday, Dec 8, 9-11:30am
  - Please arrive 20 mins early to set up
- Everyone is expected to attend
- Prepare a poster
  - We provide poster board and pins
  - Both one large poster (recommended) and several pinned pages are OK
- Capture:
  - Problem you are solving
  - Data you used
  - ML methodology
  - Results
- Prepare a 1-minute speech about your project
- Two instructors will visit your poster separately
- Project grading: scope, depth, data

Logistic regression

- P(Y|X) represented by:
- Learning rule – MLE:
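The two equations shown on this slide were images and did not survive extraction; for reference, here is the standard logistic regression model and the MLE gradient-ascent update they correspond to (a reconstruction in the usual notation, with learning rate η):

```latex
P(Y=1 \mid x, w) \;=\; \frac{1}{1 + \exp\!\big(-(w_0 + \sum_j w_j x_j)\big)}

% MLE: maximize the conditional log-likelihood by gradient ascent
w_j \;\leftarrow\; w_j + \eta \sum_i x_j^{(i)} \Big( y^{(i)} - P\big(Y=1 \mid x^{(i)}, w\big) \Big)
```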

[Plot: the logistic (sigmoid) curve, x from -6 to 6, y from 0 to 1]

Perceptron as a graph

[Diagram: inputs feeding a weighted sum passed through a sigmoid output unit; sigmoid curve shown for x from -6 to 6]

Linear perceptron classification region

[Plot: the decision region of a linear perceptron – a half-space bounded by the line w·x + w0 = 0]

Perceptron, linear classification, Boolean functions

- Can learn x1 AND x2
- Can learn x1 OR x2
- Can learn any conjunction or disjunction
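For concreteness, one choice of weights (not from the slides; any threshold strictly between the attainable sums works) that realizes AND and OR with a single threshold unit on inputs in {0, 1}:

```latex
x_1 \wedge x_2 \;=\; \mathbf{1}\{x_1 + x_2 - 1.5 > 0\}, \qquad
x_1 \vee x_2 \;=\; \mathbf{1}\{x_1 + x_2 - 0.5 > 0\}
```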

Perceptron, linear classification, Boolean functions

- Can learn majority
- Can perceptrons do everything?
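Majority follows the same pattern; a sketch with an assumed threshold of n/2 over n binary inputs:

```latex
\mathrm{majority}(x_1, \dots, x_n) \;=\; \mathbf{1}\Big\{ \sum_{j=1}^{n} x_j - \tfrac{n}{2} > 0 \Big\}
```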

Going beyond linear classification

- Solving the XOR problem
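One concrete construction (a sketch, not necessarily the one drawn on the slide): XOR = (x1 OR x2) AND NOT (x1 AND x2), which a single hidden layer of threshold units can express:

```latex
h_1 = \mathbf{1}\{x_1 + x_2 - 0.5 > 0\} \;\;(\text{OR}), \qquad
h_2 = \mathbf{1}\{x_1 + x_2 - 1.5 > 0\} \;\;(\text{AND})

x_1 \oplus x_2 \;=\; \mathbf{1}\{h_1 - h_2 - 0.5 > 0\}
```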

Hidden layer

- Perceptron:
- 1-hidden layer:
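The formulas these two bullets point to were images on the slide; the standard forms, with a link function g (e.g. the sigmoid), are:

```latex
\text{Perceptron:}\quad \mathrm{out}(x) = g\Big(w_0 + \sum_j w_j x_j\Big)

\text{1-hidden layer:}\quad \mathrm{out}(x) = g\Big(w_0 + \sum_k w_k \, g\big(w_0^k + \textstyle\sum_j w_j^k x_j\big)\Big)
```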

Example data for NN with hidden layer

Learned weights for hidden layer

NN for images

Weights in NN for images

Forward propagation for 1-hidden layer – Prediction

- 1-hidden layer:
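A minimal NumPy sketch of this prediction step, assuming sigmoid units throughout and hypothetical parameter names (W1, b1, w2, b2); it simply evaluates the 1-hidden-layer formula above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_1hidden(x, W1, b1, w2, b2):
    """Forward propagation for a 1-hidden-layer network.

    x  : (d,)   input vector
    W1 : (k, d) hidden-layer weights;  b1 : (k,) hidden biases
    w2 : (k,)   output weights;        b2 : scalar output bias
    """
    h = sigmoid(W1 @ x + b1)   # hidden-unit activations
    y = sigmoid(w2 @ h + b2)   # predicted P(Y = 1 | x)
    return y, h
```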

Gradient descent for 1-hidden layer – Back-propagation: computing the gradient

- Dropped w0 to make derivation simpler
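The derivation itself was handwritten on the slides; for reference, here is the chain rule worked out under the assumption of squared-error loss ℓ = ½(y − out(x))² and sigmoid units (with w0 dropped, as noted):

```latex
h_k = g\Big(\sum_j w_j^k x_j\Big), \quad \mathrm{out}(x) = g\Big(\sum_k w_k h_k\Big), \quad g(z)=\tfrac{1}{1+e^{-z}},\;\; g'(z)=g(z)\big(1-g(z)\big)

\frac{\partial \ell}{\partial w_k} = -\big(y - \mathrm{out}(x)\big)\,\mathrm{out}(x)\big(1-\mathrm{out}(x)\big)\, h_k

\frac{\partial \ell}{\partial w_j^k} = -\big(y - \mathrm{out}(x)\big)\,\mathrm{out}(x)\big(1-\mathrm{out}(x)\big)\, w_k\, h_k\big(1-h_k\big)\, x_j
```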

Multilayer neural networks

Forward propagation – prediction

- Recursive algorithm
- Start from input layer
- Output of node Vk with parents U1, U2, …:
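The node-update rule referenced in the last bullet, written out in the standard form (with g the node's link function and w_{0k} its bias, assuming the usual notation):

```latex
V_k \;=\; g\Big(w_{0k} + \sum_i w_{ik}\, U_i\Big)
```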

Back-propagation – learning

- Just stochastic gradient descent!!!
- Recursive algorithm for computing gradient
- For each example:
  - Perform forward propagation
  - Start from output layer
  - Compute gradient of node Vk with parents U1, U2, …
  - Update weight wik
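A minimal NumPy sketch of one such stochastic-gradient step for the 1-hidden-layer case, assuming squared-error loss and reusing the hypothetical names (and import) from the forward-pass sketch above:

```python
def sgd_step_1hidden(x, y, W1, b1, w2, b2, lr=0.1):
    """One SGD step for a 1-hidden-layer sigmoid network under squared-error loss."""
    out, h = predict_1hidden(x, W1, b1, w2, b2)   # forward propagation
    # Backward pass: start from the output layer, push errors toward the inputs.
    delta_out = -(y - out) * out * (1 - out)      # dL/d(output pre-activation)
    delta_h = delta_out * w2 * h * (1 - h)        # dL/d(hidden pre-activations)
    # Gradient updates for this single example.
    w2 -= lr * delta_out * h
    b2 -= lr * delta_out
    W1 -= lr * np.outer(delta_h, x)
    b1 -= lr * delta_h
    return W1, b1, w2, b2
```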

Many possible response/link functions

- Sigmoid
- Linear
- Exponential
- Gaussian
- Hinge
- Max
- …
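The slide only names these; typical forms of the corresponding link functions g(z) (a reconstruction; "hinge" is read here as the rectifier, and "Gaussian" as a radial-basis response):

```latex
\text{sigmoid: } g(z)=\frac{1}{1+e^{-z}} \qquad
\text{linear: } g(z)=z \qquad
\text{exponential: } g(z)=e^{z}

\text{Gaussian: } g(z)=e^{-z^2} \qquad
\text{hinge: } g(z)=\max(0,z) \qquad
\text{max: } g(z_1,\dots,z_m)=\max_i z_i
```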

Convolutional Neural Networks & Application to Computer Vision

Machine Learning – CSE 546
Sham Kakade
University of Washington
November 29, 2016

©Sham Kakade

Contains slides from…

- LeCun & Ranzato
- Russ Salakhutdinov
- Honglak Lee

Neural Networks in Computer Vision

- Neural nets have made an amazing comeback
  - Used to engineer high-level features of images
- Image features:

Some hand-created image features

Computer vision features: SIFT, Spin image, HoG, RIFT, Textons, GLOH
(Slide credit: Honglak Lee)

Scanning an image with a detector

- Detector = classifier applied to image patches
- Typically scan the image with the detector:
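A short sketch of the scanning idea, assuming a hypothetical `detector` callable that maps a square patch to a score (not a specific API from the lecture):

```python
import numpy as np

def scan_image(image, detector, patch_size=32, stride=8):
    """Slide a patch classifier ("detector") across an image and record scores."""
    H, W = image.shape
    scores = []
    for r in range(0, H - patch_size + 1, stride):
        for c in range(0, W - patch_size + 1, stride):
            patch = image[r:r + patch_size, c:c + patch_size]
            scores.append(((r, c), detector(patch)))  # score each window location
    return scores
```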

Using neural nets to learn non-linear features


Deep Learning = Learning Hierarchical Representations

It's deep if it has more than one stage of non-linear feature transformation

Low-Level Feature → Mid-Level Feature → High-Level Feature → Trainable Classifier

Feature visualization of convolutional net trained on ImageNet from [Zeiler & Fergus 2013]

But, many tricks needed to work well…

Convolutional Deep Models for Image Recognition

- Learning multiple layers of representation.

(LeCun, 1992)

Convolution Layer

Fully-connected neural net in high dimension

- Example: 200x200 image. Fully-connected with 400,000 hidden units = 16 billion parameters
- Locally-connected, 400,000 hidden units with 10x10 fields = 40 million parameters
- Local connections capture local dependencies
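A quick arithmetic check of the two counts (weights only, biases ignored):

```latex
\underbrace{200 \times 200}_{40{,}000 \text{ inputs}} \times 400{,}000 \text{ hidden units} = 1.6\times 10^{10} \approx 16 \text{ billion weights}

\underbrace{10 \times 10}_{100 \text{ inputs per unit}} \times 400{,}000 \text{ hidden units} = 4\times 10^{7} = 40 \text{ million weights}
```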

Parameter sharing

- Fundamental technique used throughout ML
- Neural net without parameter sharing:
- Sharing parameters:
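To make the sharing concrete, a minimal NumPy sketch of a 1-D convolution layer: every output unit reuses the same small weight vector w instead of owning its own row of weights (a sketch under assumed names, not code from the lecture):

```python
import numpy as np

def conv1d_shared(x, w, b):
    """1-D convolution: the same parameters (w, b) are shared by every output unit."""
    k = len(w)
    n_out = len(x) - k + 1
    out = np.empty(n_out)
    for i in range(n_out):
        out[i] = w @ x[i:i + k] + b   # identical weights applied at every position
    return out

# Without sharing, this layer would need a separate weight vector per output
# position: n_out * k parameters instead of just k.
```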


Pooling/Subsampling

- Convolutions act like detectors:
- But we don't expect true detections in every patch
- Pooling/subsampling nodes:
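A matching sketch of max-pooling over non-overlapping windows of detector responses (window size 2 is an assumption, not from the slides):

```python
def max_pool1d(responses, window=2):
    """Keep only the strongest detector response in each non-overlapping window."""
    return [max(responses[i:i + window])
            for i in range(0, len(responses) - window + 1, window)]
```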

Example neural net architecture

Sample results

Simple ConvNet Applications with State-of-the-Art Performance

- Traffic Sign Recognition (GTSRB, German Traffic Sign Recognition Benchmark): 99.2% accuracy (#1: IDSIA; #2: NYU)
- House Number Recognition (Google Street View House Numbers): 94.3% accuracy

Example from Krizhevsky, Sutskever, Hinton 2012

Object Recognition [Krizhevsky, Sutskever, Hinton 2012]

Layer stack (input to output):
CONV 11x11/ReLU 96fm → LOCAL CONTRAST NORM → MAX POOL 2x2sub →
CONV 11x11/ReLU 256fm → LOCAL CONTRAST NORM → MAX POOLING 2x2sub →
CONV 3x3/ReLU 384fm → CONV 3x3/ReLU 384fm → CONV 3x3/ReLU 256fm → MAX POOLING →
FULL 4096/ReLU → FULL 4096/ReLU → FULL CONNECT

Won the 2012 ImageNet LSVRC. 60 million parameters, 832M MAC ops.
Per-layer parameter counts (output to input): 4M, 16M, 37M, 442K, 1.3M, 884K, 307K, 35K
Per-layer MAC ops (output to input): 4M, 16M, 37M, 74M, 224M, 149M, 223M, 105M

Results by Krizhevsky, Sutskever, Hinton 2012

Object Recognition: ILSVRC 2012 results

ImageNet Large Scale Visual Recognition Challenge: 1000 categories, 1.5 million labeled training samples

Object Recognition [Krizhevsky, Sutskever, Hinton 2012]


Object Recognition [Krizhevsky, Sutskever, Hinton 2012]

[Figure: a test image alongside the retrieved images most similar to it]

Application to scene parsing

Semantic Labeling: labeling every pixel with the object it belongs to

[Farabet et al. ICML 2012, PAMI 2013]

- Would help identify obstacles, targets, landing sites, dangerous areas
- Would help line up depth map with edge maps

Learning challenges for neural nets

- Choosing architecture
- Slow per-iteration cost and slow convergence
- Gradient "diffusion" across layers
- Many local optima

Random dropouts

- Standard backprop:
- Random dropouts: randomly choose edges not to update:
- Functions as a type of "regularization"… helps avoid "diffusion" of the gradient
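A minimal sketch of the idea as stated on the slide (randomly masking which weights, i.e. edges, get updated on a given step); note that the now-standard formulation of dropout instead masks unit activations:

```python
import numpy as np

def masked_update(W, grad_W, lr=0.1, keep_prob=0.5, rng=None):
    """Apply a gradient step only to a random subset of the weights ("edges")."""
    rng = rng or np.random.default_rng()
    mask = rng.random(W.shape) < keep_prob   # True = update this edge, False = skip it
    return W - lr * mask * grad_W
```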


Revival of neural networks

- Neural networks fell into disfavor from the mid-90s to the early 2000s
  - Many methods have now been rediscovered :)
- Exciting new results using modifications to optimization techniques and GPUs
- Challenges still remain:
  - Architecture selection feels like a black art
  - Optimization can be very sensitive to parameters
  - Requires a significant amount of expertise to get good results