Machine Learning 101
Teach your computer the difference between cats and dogs
Cole Howard & Hannes Hapke
Open Source Bridge, June 23rd, 2016

Introduction to Convolutional Neural Networks




Who are we?

Cole Howard @uglyboxer

Senior Developer at Dark Horse Comics
Master of recommendation systems, convolutional neural networks

Hannes Hapke @hanneshapke

Senior Developer at CrowdStreet
Excited about neural network applications

We want to show you how you can train a computer to “recognize” images*

* aka to decide between cats and dogs

What is this all about ...

Convolutional Nets are good at determining ...

• The spatial relationship of data
• And therefore detecting patterns

Are these dogs?

Convolutional neural nets are heavily used for detecting patterns in images, videos, sounds, and text

• Music recommendation at Spotify (http://benanne.github.io/2014/08/05/spotify-cnns.html)

• Google’s PlaNet—Photo Geolocation with CNN (http://arxiv.org/abs/1602.05314)

• Who else is using CNNs? (https://www.quora.com/Apart-from-Google-Facebook-who-is-commercially-using-deep-recurrent-convolutional-neural-networks)

What are conv nets?

• In traditional feed-forward networks, we are learning weights to apply to the data
• In conv nets, we are learning to describe filters
• After each convolutional layer we still have an “image”
• Instead of 3 channels (r-g-b), we have n channels, each described by one of the learned filters

Convolutional Neural Net

Filters (or Kernels)

Example of Edge Detector

Example of Blurring Filter
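The slides above show these two filters as images. As a minimal sketch in Python (the random array stands in for a grayscale image; a conv net learns kernels like these from data instead of hand-coding them):

import numpy as np
from scipy.signal import convolve2d

image = np.random.rand(64, 64)  # stand-in for a grayscale image

edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]])  # Laplacian-style edge detector
blur_kernel = np.ones((3, 3)) / 9.0     # box blur: average of each 3x3 region

edges = convolve2d(image, edge_kernel, mode='same')    # highlights intensity changes
blurred = convolve2d(image, blur_kernel, mode='same')  # smooths local detail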

Pooling

• Can condense information as filters pull details apart
• With MaxPooling we take the local maximum activation as representative of the region, usually a 2x2 subsample
• As we filter, precise location becomes less relevant
• This condenses the amount of information to ¼ per learned channel
• BONUS: the net becomes tolerant to local perturbations in the data
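For instance, 2x2 max pooling can be sketched in plain numpy (the activation values below are made up):

import numpy as np

def max_pool_2x2(channel):
    # keep the strongest activation in each non-overlapping 2x2 region
    h, w = channel.shape
    return channel[:h // 2 * 2, :w // 2 * 2].reshape(
        h // 2, 2, w // 2, 2).max(axis=(1, 3))

activations = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 1, 5, 6],
                        [2, 2, 7, 8]])
print(max_pool_2x2(activations))
# [[4 2]
#  [2 8]]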

Traditional feed-forward: icing on the cake

• Flatten the filtered image into one long 1-dimensional vector
• Pass it into a feed-forward network
• Out to classes -> to determine the error
• Learn like normal: backpropagation works on filter weights, just as it does on neuron weights
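A rough numpy sketch of that flatten-and-classify step (the shapes and the two-class softmax are illustrative, not from the talk):

import numpy as np

feature_maps = np.random.rand(64, 16, 16)  # e.g. 64 pooled 16x16 channels

flat = feature_maps.reshape(-1)            # one long vector, shape (16384,)

W = np.random.rand(2, flat.size)           # dense weights (what backprop adjusts); 2 classes: cat, dog
logits = W.dot(flat)
logits = logits - logits.max()             # subtract max for numerical stability
probs = np.exp(logits) / np.exp(logits).sum()  # softmax over cat vs. dog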

Convolutional Neural Net

What frameworks are available?

Theano

• Created by the University of Montreal
• Framework for symbolic computation
• Provides GPU support
• Great Python libraries based on Theano: Keras, Lasagne, PyLearn2

import theano.tensor as T
from theano import function  # needed to compile the symbolic graph

x = T.dmatrix('x')
y = T.dmatrix('y')
z = x + y
f = function([x, y], z)
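Calling the compiled function then evaluates the graph on real values:

print(f([[1, 2]], [[3, 4]]))  # prints the element-wise sum, [[4. 6.]]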

TensorFlow

• Developed by a small startup in Mountain View
• Used for 50 Google products
• Used as part of AlphaGo (trained on TPUs*)
• Designed for distributed learning problems
• Growing ecosystem: TensorBoard, tflearn, scikit-flow

import tensorflow as tf

a = tf.placeholder("float")
b = tf.placeholder("float")
y = tf.mul(a, b)  # multiply the symbolic variables (renamed tf.multiply in TF 1.0)

with tf.Session() as sess:
    print("%f should equal 2.0" % sess.run(y, feed_dict={a: 1, b: 2}))
    print("%f should equal 9.0" % sess.run(y, feed_dict={a: 3, b: 3}))

How to prepare your images for classification?

Normalize the image size

• Use the pillow package in Python
• For small size differences, squeeze images
• For larger differences, resize images
• Or use Keras’ pre-processing functions

# inside a loop over the image files: pad each image to a square,
# then scale it down to resized_px x resized_px
y, x = image.size
y = x if x > y else y
resized_image = Image.new(color_schema, (y, y), (255, ))
try:
    resized_image.paste(image, image.getbbox())
except ValueError:
    continue
resized_image = resized_image.resize(
    (resized_px, resized_px), Image.ANTIALIAS)
resized_image.save(new_filename, 'jpeg', quality=90)

Convert the images into matrices

• Use the numpy package in Python
• No magic, use numpy’s asarray method
• Create a classification vector at the same time

image = Image.open(directory + f)
image.load()
image_matrix = np.asarray(image, dtype="int32").T
image_classification = 1 if animal == 'Cat/' else 0
data.append(image_matrix)
classification.append(image_classification)

Save the matrices in a reusable format

• Pickle or numpy is your best friend
• You can split the dataset into training/test sets with `train_test_split`
• Store matrices as compressed pickles (use numpy for large arrays)
• Use compression!

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    data, classification, test_size=0.20, random_state=42)

np.savez_compressed('petsTrainingData.npz',
                    X_train=X_train, X_test=X_test,
                    y_train=y_train, y_test=y_test)

How to assemble a simple CNN with Keras

What is Keras? Why?

• Excellent Python wrapper library for Theano
• Supports TensorFlow too, with growing support
• Amazing documentation
• Amazing community

Steps

1. Set up your sequential model
2. Create a network structure
3. Set the “compile” parameters
4. Set the fit parameters

Set up a sequential model

• Sequential models allow you to define the network structure
• Use model.add() to add layers to the neural network

from keras.models import Sequential
from keras.layers import Convolution2D

model = Sequential()
# the first layer also needs an input_shape, e.g. input_shape=(3, 64, 64)
model.add(Convolution2D(64, 2, 2, border_mode='same'))

Create your network structure

• Keras provides various types of layers:
• Convolution2D
• Convolution3D
• Dense
• Dropout
• Activation
• MaxPooling2D
• etc.

model.add(Convolution2D(64, 2, 2))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

Set the “compile” parameters

• Keras provides various options for optimizing your network:
• SGD
• Adagrad
• Adadelta
• etc.
• Set the learning rate, momentum, etc.
• Define your loss definition and metrics

from keras.optimizers import SGD

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(
    loss='categorical_crossentropy',
    optimizer=sgd,
    metrics=['accuracy'])

Set the fit parameters

• This is where the magic starts!
• model.fit() allows you to define (a minimal call is sketched below):
• The batch size
• Number of epochs
• Whether you want to shuffle your training data
• Your validation set
• Your callbacks
• Callbacks are amazing!
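A minimal fit call might look like this (the batch size, epoch count, and validation split are illustrative values, not from the talk):

model.fit(X_train, y_train,
          batch_size=32,
          nb_epoch=10,          # called `epochs` in later Keras versions
          shuffle=True,
          validation_split=0.1,
          callbacks=[])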

Use callbacks

• Keras comes with various callbacks
• ModelCheckpoint allows saving the model parameters after every/best run
• EarlyStopping allows stopping the training if your stopping condition is met
• Other callbacks: LearningRateScheduler, TensorBoard, RemoteMonitor
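For example, saving the best weights and stopping early when validation loss stalls (the filename and patience value are illustrative):

from keras.callbacks import ModelCheckpoint, EarlyStopping

checkpoint = ModelCheckpoint('weights.best.hdf5',
                             monitor='val_loss', save_best_only=True)
early_stop = EarlyStopping(monitor='val_loss', patience=3)

model.fit(X_train, y_train, validation_split=0.1,
          callbacks=[checkpoint, early_stop])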

Faster, faster ...

• GPUs are your friends
• Unlike traditional feed-forward nets, there are large parts of CNNs that are parallelizable!
• Each neuron normally depends on the neurons before it and on the error reported from the neurons after it; filters are different.
• In a layer, each filter, and each filter at each position, is independent of the others.
• So all of those computations can happen simultaneously.
• And as all are simple matrix multiplications, we can make use of the thousands of cores on modern GPUs.

Running on a GPU

• Install the proper dependencies (Linux requires a few extra steps here)
• Install Theano, Keras
• Install CUDA (http://tleyden.github.io/blog/2015/11/22/cuda-7-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/)
• Install cuDNN (requires registration with NVIDIA)
• Configure ~/.theanorc
• Set Theano flags when running the script (or in .theanorc), as sketched below
• Or use a pre-configured AMI on AWS (ami-a6ec17c6 in region us-west-2/Oregon)
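A minimal ~/.theanorc for GPU training might look like this (exact device names vary across Theano versions):

[global]
device = gpu
floatX = float32

Or equivalently, per run (train_network.py is a placeholder for your script):

THEANO_FLAGS='device=gpu,floatX=float32' python train_network.py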

What does training look like in action?

What to do once training is complete?

Learning resources

ConvNets
• http://cs231n.stanford.edu/
• https://www.youtube.com/watch?v=bEUX_56Lojc
• http://blog.keras.io/how-convolutional-neural-networks-see-the-world.html

Keras
• https://www.youtube.com/watch?v=Tp3SaRbql4k

TensorFlow
• http://learningtensorflow.com/examples/

Thank you!

bit.ly/OSB16-machinelearning101