
Page 1: PyData2015

Representation learning

PyData Warsaw 2015

Michael Jamroz, Matthew Opala

24th September 2015

Page 2: PyData2015

● Goals of AI
● Learning representations
● Deep learning
● Examples

Presentation Plan

Page 3: PyData2015

AI

● Goal: build an intelligent machine
● It needs knowledge to make decisions
● Putting all of that knowledge into a computer program by hand is infeasible
● Knowledge is gained by learning from data

Page 4: PyData2015

Data representation

● Representation - features passed to ML algorithms, crucial for good performance on various tasks

● Features can be handcrafted or learned automatically

● Representation learning: discovering meaningful features by the computer

Page 5: PyData2015

ML in industry nowadays

● Most of the time is spent on manual feature extraction

● We would like to have this step automated

Page 6: PyData2015

Why representation learning ?

● Previous slide: manual feature engineering is time-consuming and incomplete
● Unsupervised feature learning:
○ Collected data are mostly unlabeled (bigger datasets)
○ Labels do not provide enough information
○ The learning process is independent of the ML task performed on the data

Page 7: PyData2015

Semi-supervised, transfer learning

● Transfer learning - transferring knowledge from a previously learned task to a new machine learning task

● Semi-supervised learning - learning from a few labeled examples together with many unlabeled examples

Page 8: PyData2015

Need for Deep Architectures

● a deep architecture can represent certain functions more compactly than a shallow one

● any boolean function (e.g. AND, OR, XOR) can be represented by a single hidden layer - however, it may require an exponential number of hidden units

Page 9: PyData2015

Formally

● shown by Yao in 1985 that d-bit parity circuits of depth 2 have exponential size

● generalised to perceptrons with linear threshold units by Håstad in 1991

Page 10: PyData2015

How deep a representation do we need?

Page 11: PyData2015

Informal arguments

Page 12: PyData2015

Shallow program

Page 13: PyData2015

Deep program

Page 14: PyData2015

Biology inspirations

Page 15: PyData2015

Learning multiple levels of representation

Page 16: PyData2015

“I'm sorry, Dave. I'm afraid I can't do that.”

Page 17: PyData2015

对不起,戴夫。恐怕我不能这样做。 (Chinese: "I'm sorry, Dave. I'm afraid I can't do that.")

Page 18: PyData2015

Let’s build a deep representation

Page 19: PyData2015

Multilayer Perceptron

input layer

hidden layers

output layer
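
The diagram above can be summarized as a minimal forward pass in NumPy; the layer sizes and the ReLU/softmax choices below are illustrative assumptions, not something taken from the talk:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Illustrative layer sizes: 784 inputs -> two hidden layers -> 10 outputs.
rng = np.random.RandomState(0)
sizes = [784, 256, 128, 10]
weights = [rng.randn(n_in, n_out) * 0.01 for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

def forward(x):
    """Propagate a batch: input layer -> hidden layers -> output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h.dot(W) + b)                          # hidden layers
    return softmax(h.dot(weights[-1]) + biases[-1])     # output layer

probs = forward(rng.randn(32, 784))   # class probabilities for a batch of 32 inputs
```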

Page 20: PyData2015

Reminder - Gradient Descent
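
As a refresher, a minimal gradient descent loop in NumPy; the linear least-squares objective and the learning rate are assumptions chosen only for illustration:

```python
import numpy as np

def loss(w, X, y):
    """Mean squared error of a linear model."""
    return np.mean((X.dot(w) - y) ** 2)

def grad(w, X, y):
    """Gradient of the mean squared error with respect to w."""
    return 2.0 * X.T.dot(X.dot(w) - y) / len(y)

rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = X.dot(np.array([1.0, -2.0, 0.5])) + 0.1 * rng.randn(100)

w = np.zeros(3)
learning_rate = 0.1
for step in range(200):
    w -= learning_rate * grad(w, X, y)   # step against the gradient

print(loss(w, X, y), w)
```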

Page 21: PyData2015

But MLPs have their problems

● vanishing and exploding gradients
● getting stuck in poor local optima
● lack of good initializations
● lack of labeled data
● hard to attract research interest
● slow hardware

Page 22: PyData2015

Breakthrough 2006

Page 23: PyData2015

Greedy layer-wise pretraining

Page 24: PyData2015

Restricted Boltzmann Machine

Page 25: PyData2015

Stacking RBMs

Page 26: PyData2015

Limitations of fully connected networks

● for natural images we would like to be invariant to translations, rotations and other class-preserving transformations

● fully connected networks do not introduce such invariance

Page 27: PyData2015

Convolutional Neural Nets

Page 28: PyData2015

Convolution = sparse connectivity + parameter sharing

Page 29: PyData2015

Sparse connectivity

Page 30: PyData2015

Parameter sharing

Page 31: PyData2015

Convolution

Page 32: PyData2015

Pooling
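
A small NumPy/SciPy sketch of both operations on a single-channel image; the particular 3x3 filter and the 2x2 pooling window are illustrative assumptions:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.RandomState(0)
image = rng.rand(28, 28)                       # single-channel input

# One 3x3 filter shared across all spatial positions (parameter sharing);
# each output unit only sees a 3x3 patch of the input (sparse connectivity).
kernel = np.array([[1.0, 0.0, -1.0],
                   [2.0, 0.0, -2.0],
                   [1.0, 0.0, -1.0]])          # example edge-detecting filter
feature_map = convolve2d(image, kernel, mode="valid")

def max_pool(x, size=2):
    """Non-overlapping max pooling: keep the strongest response in each window."""
    h, w = x.shape
    h, w = h - h % size, w - w % size          # crop to a multiple of the window
    x = x[:h, :w].reshape(h // size, size, w // size, size)
    return x.max(axis=(1, 3))

pooled = max_pool(feature_map)                 # smaller, translation-tolerant map
print(feature_map.shape, pooled.shape)
```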

Page 33: PyData2015

Architecture

Page 34: PyData2015

Examples

Page 35: PyData2015

Word2Vec / Doc2Vec

● Tomas Mikolov et al., 2013
● Embedding words / documents in a vector space
● Neural network with one hidden layer
● Trained in an unsupervised way
● Representation for a word obtained by computing the hidden layer activation
● Good explanation: http://arxiv.org/pdf/1411.2738v1.pdf
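
A minimal sketch of training Doc2Vec with gensim and reading back document vectors; the toy corpus and hyper-parameters are made up, and parameter names vary slightly between gensim versions:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Toy corpus: in practice each document is a tokenized company report.
raw_docs = [
    "quarterly revenue grew in the software segment",
    "the bank reported higher interest income",
    "drug trials for the new therapy were delayed",
]
corpus = [TaggedDocument(words=doc.split(), tags=[i]) for i, doc in enumerate(raw_docs)]

# Unsupervised training: no industry labels are needed at this stage.
model = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=40)

vec = model.dv[0]                                                      # embedding of document 0
new_vec = model.infer_vector("insurance premiums increased".split())   # unseen document
```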

Page 36: PyData2015
Page 37: PyData2015

Problem

● ~180k documents: activity reports filed by American companies

● companies belong to different industry segments (260 of them)

● ~9k labeled documents (the label is the industry the company operates in)

● an example of semi-supervised learning
● task: classify the remaining documents

Page 38: PyData2015

Doc2Vec - document embedding

Page 39: PyData2015

Doc2Vec - classification

● Labeled set split into training/test data with a 70/30 ratio

● Test set: ~2700 examples, 260 classes
● Classification performed on the representation obtained from Doc2Vec
● Accuracy on the test set:

○ KNN with voting: ~85%
○ SVM one-versus-one: ~83%
○ Random forest: ~80%
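
A sketch of the classification step with scikit-learn, using the KNN variant from the list above; `doc_vectors` and `labels` are placeholder arrays standing in for the real Doc2Vec embeddings and industry labels:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Placeholder data with the same shape as the real problem:
# one Doc2Vec vector per labeled document, one of 260 industry classes.
rng = np.random.RandomState(0)
doc_vectors = rng.randn(9000, 100)
labels = rng.randint(0, 260, size=9000)

X_train, X_test, y_train, y_test = train_test_split(
    doc_vectors, labels, test_size=0.3, random_state=0)   # 70/30 split

knn = KNeighborsClassifier(n_neighbors=5)                  # KNN with majority voting
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```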

Page 40: PyData2015

Neural Art Style Transfer

Page 41: PyData2015
Page 42: PyData2015
Page 43: PyData2015
Page 44: PyData2015

Pretrain CNN

Page 45: PyData2015

Content representation

Page 46: PyData2015

Art style representation

Page 47: PyData2015

Objective function

Page 48: PyData2015

Summing up

● define a loss function for content
● define a loss function for art style
● define the total loss
● perform gradient-based optimization
● compute derivatives with respect to the data (the input image, not the network weights)
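
A compact sketch of that recipe; the `content_features` and `gram` functions below are trivial stand-ins for the pretrained CNN activations used in the actual method, and the image sizes and weights are arbitrary:

```python
import numpy as np
from scipy.optimize import minimize

# Trivial stand-ins for CNN feature extractors: the real method uses
# activations of a pretrained convolutional network, not raw pixels.
def content_features(img):
    return img

def gram(img):
    f = img.reshape(img.shape[0], -1)
    return f.dot(f.T)                    # correlations between "feature maps"

rng = np.random.RandomState(0)
content_img = rng.rand(3, 8, 8)          # toy "photo"
style_img = rng.rand(3, 8, 8)            # toy "painting"
alpha, beta = 1.0, 1e-3                  # content/style trade-off (arbitrary values)

def total_loss(x_flat):
    x = x_flat.reshape(content_img.shape)
    content_loss = np.sum((content_features(x) - content_features(content_img)) ** 2)
    style_loss = np.sum((gram(x) - gram(style_img)) ** 2)
    return alpha * content_loss + beta * style_loss

# Gradient-based optimization of the image itself, not of network weights.
result = minimize(total_loss, content_img.ravel(), method="L-BFGS-B")
stylized = result.x.reshape(content_img.shape)
```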

Page 49: PyData2015
Page 50: PyData2015

● Theano & Lasagne● NViDIA GTX● https://github.com/Craftinity/art_style● http://deeplearning.net

Page 51: PyData2015

Contact

● http://www.craftinity.com
● https://www.facebook.com/craftinitycom
● https://twitter.com/craftinitycom
● [email protected]
● [email protected]
● [email protected]

Page 52: PyData2015

Q&A

Page 53: PyData2015

The End