TensorFlow: neural networks lab
Gianluca Corrado and Andrea Passerini
[email protected]@disi.unitn.it
Machine Learning
Corrado, Passerini (disi) TensorFlow Machine Learning 1 / 27
Introduction
TensorFlow
TensorFlow is a Python package
Numerical computation using data flow graphs
Developed (by Google) for the purpose of machine learning and deepneural networks research
Installation and Documentation
https://www.tensorflow.org/
Corrado, Passerini (disi) TensorFlow Machine Learning 2 / 27
Introduction
Outline
MNIST datasetI https://www.tensorflow.org/versions/master/tutorials/
mnist/beginners/index.html#the-mnist-data
MNIST for ML BeginnersI https://www.tensorflow.org/versions/master/tutorials/
mnist/beginners/index.html
Deep MNIST for ExpertsI https://www.tensorflow.org/versions/master/tutorials/
mnist/pros/index.html
Corrado, Passerini (disi) TensorFlow Machine Learning 3 / 27
Introduction
On Ubuntu 12.04
LD LIBRARY PATH
Run the script run me on ubuntu1204.sh
Corrado, Passerini (disi) TensorFlow Machine Learning 4 / 27
MNIST dataset
MNIST dataset
Dataset of handwritten digits
Each image 28× 28 = 784 pixels
Train: 60k test: 10k
Corrado, Passerini (disi) TensorFlow Machine Learning 5 / 27
MNIST dataset
Importing MNIST
Use the given input data.py script
Train: 55k, validation: 5k, test: 10k
mnist is a Python Class (has attributes, and methods)I mnist.train.imagesI mnist.train.labelsI mnist.train.next batch(n)I . . .
Corrado, Passerini (disi) TensorFlow Machine Learning 6 / 27
MNIST dataset
Data representation
The 28× 28 = 784 pixels are represented as a vector
The 10 classes are represented with the one-hot encoding
Corrado, Passerini (disi) TensorFlow Machine Learning 7 / 27
Softmax regressions
Softmax regressions
Look at an image and give probabilities for it being each digit
Evidence that an image is a particular class i
evidencei =∑j
Wijxj + bi ∀i
Softmax to shape the evidence as a probability distribution over icases
softmaxi =exp(evidencei )∑j exp(evidencej)
∀i
Corrado, Passerini (disi) TensorFlow Machine Learning 8 / 27
Softmax regressions
Softmax regressions
Schematic view
Vectorized version
Compactlyy = softmax(Wx + b)
Corrado, Passerini (disi) TensorFlow Machine Learning 9 / 27
Softmax regressions
Implementation: model
Import TensorFlow
Define the placeholders
Define the variables
Define the softmax layer
Corrado, Passerini (disi) TensorFlow Machine Learning 10 / 27
Softmax regressions
Implementation: optimization
Define the cost (cross-entropy): Hy (y) = −∑
y log y
Define the training algorithm
Start a new session (for now we have not computed anything)
Initialize the variables
Train the model
Corrado, Passerini (disi) TensorFlow Machine Learning 11 / 27
Softmax regressions
Implementation: evaluationEvaluate accuracy (it should be around 0.91)
Plot the model weights (plotter.py)
Corrado, Passerini (disi) TensorFlow Machine Learning 12 / 27
Deep convolutional net
Deep architechtures
0.91 accuracy on MNIST is NOT good!
State of the art performance is 0.9979
Let’s refine our modelI 2 convolutional layersI alternated with 2 max pool layersI ReLU layer (with dropout)I Softmax regressions
Accuracy target: 0.99
Corrado, Passerini (disi) TensorFlow Machine Learning 13 / 27
Deep convolutional net
Convolutional layer
Broadly used in image classification
Local connettivity (width, height, depth)
Spatial arrangement (stride, padding)
Parameter sharing
convolutional
max pool
convolutional
max pool
ReLU (dropout)
softmax
x1x1 x2x2 x784x784. . .
y0y0 y1y1 y9y9. . .
Corrado, Passerini (disi) TensorFlow Machine Learning 14 / 27
Deep convolutional net
Max pooling layer
Commonly used after convolutional layer(s)
Reduce spatial size
Avoid overfitting
Max pooling 2× 2 is very common
convolutional
max pool
convolutional
max pool
ReLU (dropout)
softmax
x1x1 x2x2 x784x784. . .
y0y0 y1y1 y9y9. . .
Corrado, Passerini (disi) TensorFlow Machine Learning 15 / 27
Deep convolutional net
ReLU (and Dropout)
Fully connected layer
Rectified Linear Unit activation
Dropout randomly excludes neurons to avoidoverfitting convolutional
max pool
convolutional
max pool
ReLU (dropout)
softmax
x1x1 x2x2 x784x784. . .
y0y0 y1y1 y9y9. . .
Corrado, Passerini (disi) TensorFlow Machine Learning 16 / 27
Deep convolutional net
Softmax layer
Look at a feature configuration (coming fromthe layers below) and give probabilities for itbeing each digit
Evidence that a feature configuration is aparticular class i
evidencei =∑j
Wijxj + bi ∀i
Softmax to shape the evidence as a probabilitydistribution over i cases
softmaxi =exp(evidencei )∑j exp(evidencej)
∀i
convolutional
max pool
convolutional
max pool
ReLU (dropout)
softmax
x1x1 x2x2 x784x784. . .
y0y0 y1y1 y9y9. . .
Corrado, Passerini (disi) TensorFlow Machine Learning 17 / 27
Deep convolutional net
Implementation: preparation
Imports
Load data
Placeholders and data reshaping
NOTE
Reshaping is needed for convolution and max pooling
Corrado, Passerini (disi) TensorFlow Machine Learning 18 / 27
Deep convolutional net
Implementation: weights initialization
Define functions to initialize variables of the model
Corrado, Passerini (disi) TensorFlow Machine Learning 19 / 27
Deep convolutional net
Implementation: convolution and pooling
Define functions (to keep the code cleaner)
Corrado, Passerini (disi) TensorFlow Machine Learning 20 / 27
Deep convolutional net
Implementation: model (convolution and max pooling)1st layer: convolutional with max pooling
2nd layer: convolutional with max pooling
Shrinking
Applying 2× 2 max pooling we are shrinking the image
After 2 layers we moved from 28× 28 to 7× 7
For each point we have 64 features
Corrado, Passerini (disi) TensorFlow Machine Learning 21 / 27
Deep convolutional net
Implementation: model (ReLu, dropout and softmax)3rd layer: ReLU
Reshape
We are switching back to fully connected layers, we want to reshape theinput as a flat vector.
3rd layer: add dropout
4th layer: softmax (output)
Corrado, Passerini (disi) TensorFlow Machine Learning 22 / 27
Deep convolutional net
Implementation: optimization and evaluation
Define the cost (cross-entropy): Hy (y) = −∑
y log y
Define the training algorithm
Start a new session (for now we have not computed anything)
Initialize the variables
Define the accuracy before training (for monitoring)
Corrado, Passerini (disi) TensorFlow Machine Learning 23 / 27
Deep convolutional net
Implementation: optimization and evaluation
Train the model (may take a while)
Evaluate the accuracy (it should be around 0.99)
Corrado, Passerini (disi) TensorFlow Machine Learning 24 / 27
Assignment
Assignment
The third ML assignment is to compare the performance of the deepconvolutional network when removing layers.For this assignment you need to adapt the code of the complete deeparchitecture. By removing one layer at the time, and keeping all theothers, you can evaluate the change in performance of the neural networkin classifing the MNIST dataset.
Note
The only coding required is to modify the shape and/or size of theinput vectors of the layers. The output of each layer has to remainthe same.
The report has to contain a short introduction on the methodologiesused in the deep architecture showed during the lab (convolution,max pooling, ReLU, dropout, softmax).
Corrado, Passerini (disi) TensorFlow Machine Learning 25 / 27
Assignment
Assignment
Steps
1 Remove the 1st layer: convolutional and max pooling
2 Train and test the network
3 Remove the 2nd layer: convolutional and max pooling
4 Train and test the network
5 Remove the 3rd layer: ReLU with dropout
6 Train and test the network
Computation
Training a model on a quad-core CPU takes 30-45 mins. You may want touse the computers in the lab.
Corrado, Passerini (disi) TensorFlow Machine Learning 26 / 27
Assignment
Assignment
After completing the assignment submit it via email
Send an email to [email protected] (cc:[email protected])
Subject: tensorflowSubmit2016
Attachment: id name surname.zip containing:I the Python code
F model no1.py (model without the 1st layer)F model no2.py (model without the 2nd layer)F model no3.py (model without the 3rd layer)
I the report (PDF format)
NOTE
No group work
This assignment is mandatory in order to enroll to the oral exam
Corrado, Passerini (disi) TensorFlow Machine Learning 27 / 27
References
References
https://www.tensorflow.org/
http://cs231n.github.io/convolutional-networks/
https:
//www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf
Corrado, Passerini (disi) TensorFlow Machine Learning 28 / 27