Transcript
Page 1: Lecture 9: Unsupervised, Generative & Adversarial Networks · PDF fileuva deep learning course –efstratios gavves unsupervised, generative & adversarial

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES UNSUPERVISED, GENERATIVE & ADVERSARIAL NETWORKS - 1

Lecture 9: Unsupervised, Generative & Adversarial Networks
Deep Learning @ UvA

Previous Lecture

o Recurrent Neural Networks (RNN) for sequences
o Backpropagation Through Time
o Vanishing and Exploding Gradients and Remedies
o RNNs using Long Short-Term Memory (LSTM)
o Applications of Recurrent Neural Networks

Lecture Overview

o Latent space data manifolds
o Autoencoders and Variational Autoencoders
o Boltzmann Machines
o Adversarial Networks
o Self-Supervised Networks


The manifold hypothesis

Data in theory

o One image: $256^{256 \times 256 \times 3}$ possible images
  ◦ 256 height, 256 width, 3 RGB channels, 256 pixel intensities
o Sampled at random, each of these images looks like the one in the slide background
o For text the equivalent would be generating random letter sequences

dakndfqblznqrnbecaojdwlzbirnxxbesjntxapkklsndtuwhc
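To get a feel for how large this space is, the count can be evaluated on a log scale. This is a minimal sketch; the variable names are illustrative and not part of the lecture.

```python
import math

# Image shape and number of intensity levels, as stated on the slide.
height, width, channels, levels = 256, 256, 3, 256

# Number of free values in a single image.
num_values = height * width * channels

# 256^(256*256*3) is far too large to print, so report its base-10 logarithm.
log10_num_images = num_values * math.log10(levels)

print(f"values per image: {num_values}")
print(f"number of possible images is about 10^{log10_num_images:.0f}")
```

Natural images are a vanishingly small subset of this space, which is the point the manifold hypothesis makes next.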


Data in reality: The manifold hypothesis

o Data live in manifolds
o A manifold is a latent structure of much lower dimensionality: $\dim_{\text{manifold}} \ll \dim_{\text{data}}$
o Nobody "forces" the data to be on the manifold, they just are
o The manifold occupies only a tiny fraction of the possible space


Data in reality: The manifold hypothesis

o The trajectory along the manifold defines transformations/variances
o Rotation
  ◦ E.g., the face turns
o Global appearance changes
  ◦ E.g., a different face
o Or even local appearance changes
  ◦ E.g., eyes open/closed

Data in reality: The manifold hypothesis

o Each image is either a set of coordinates in the full data space

Image = [0.1, −0.2, 0.8, 0.3, −0.5, −0.3, 0.8, 0.1, −0.4]

Data in reality: The manifold hypothesis

o Or a smaller set of intrinsic manifold coordinates

Image taken from Pascal Vincent, Deep Learning Summer School, 2015
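The contrast between full data-space coordinates and intrinsic manifold coordinates can be illustrated on synthetic data. The sketch below is a toy construction, not taken from the lecture: a circle (one intrinsic coordinate, the angle) is embedded into a 100-dimensional space through a hypothetical random linear map.

```python
import numpy as np

rng = np.random.default_rng(0)

# Intrinsic coordinate: one angle per sample (a 1-D manifold, a circle).
t = rng.uniform(0.0, 2.0 * np.pi, size=500)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)    # shape (500, 2)

# Hypothetical embedding into a 100-D "data space" via a fixed random linear map.
embed = rng.normal(size=(2, 100))
X = circle @ embed                                    # shape (500, 100)

# Each sample has 100 coordinates, yet the cloud spans only a 2-D subspace.
singular_values = np.linalg.svd(X, compute_uv=False)
effective_rank = int(np.sum(singular_values > 1e-8 * singular_values[0]))
print(X.shape, effective_rank)
```

Every sample is described by 100 numbers, but a single angle is enough to recover it. Because the circle is curved, a linear method needs two dimensions to hold it, which is one reason non-linear encoders are of interest.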

So what?

o Manifold → Data distribution
o Learning the manifold → Learning data distribution and data variances
o How to learn these variances automatically?
o Unsupervised and/or generative learning

Unsupervised and Generative Learning

What is unsupervised learning?

o Latent space manifolds
o Autoencoders and Variational Autoencoders
o Boltzmann Machines
o Adversarial Networks
o Self-Supervised Networks

Why unsupervised learning?

o Much more unlabeled than labeled data
  ◦ Large data → better models
  ◦ Ideally not pay for annotations
o What is even the correct annotation for learning data distribution and/or data variances?
o Discovering structure
  ◦ What are the important features in the dataset?

What are generative models?

o Latent space manifolds
o Autoencoders and Variational Autoencoders
o Boltzmann Machines
o Adversarial Networks
o Self-Supervised Networks

Why generative?

o Force models to go beyond surface statistical regularities of existing data
  ◦ Go beyond direct association of observable inputs to observable outputs
  ◦ Avoid silly predictions (adversarial examples)
o Understand the world
o Understand data variances, disentangle and recreate them
o Detect unexpected events in your data
o Pave the way towards reasoning and decision making

Unsupervised & generative learning of the manifold

o Autoencoders and Variational Autoencoders
o Boltzmann Machines
o Adversarial Networks
o Self-Supervised Networks


Autoencoders

Standard Autoencoder

o $z$ is usually after a non-linearity
  ◦ E.g. $z = h(Wx + b)$, with $h \equiv \sigma(\cdot), \tanh(\cdot)$
o Reconstruction error $\mathcal{L}$
o Often error is the Euclidean loss

Diagram: Input $x$ → Encoder $h$ → Latent space $z$ → Decoder $g$ → Output: reconstruction $\hat{x}$, scored by the error $\mathcal{L}$

$\mathcal{L} = \sum_d \ell(x, g(h(x)))$

$\ell = \| x - \hat{x} \|^2$
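The encoder, decoder and Euclidean reconstruction loss can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the layer sizes, the linear decoder and the random (untrained) weights are choices made here, not details from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_latent = 8, 3            # illustrative sizes (undercomplete: 3 < 8)

# Randomly initialised parameters; no training is performed in this sketch.
W_enc = rng.normal(scale=0.1, size=(d_latent, d_in))
b_enc = np.zeros(d_latent)
W_dec = rng.normal(scale=0.1, size=(d_in, d_latent))
b_dec = np.zeros(d_in)

def encode(x):
    """Encoder h: z = tanh(W x + b)."""
    return np.tanh(W_enc @ x + b_enc)

def decode(z):
    """Decoder g: a linear map back to input space (an assumption)."""
    return W_dec @ z + b_dec

def reconstruction_loss(batch):
    """Sum over samples of the squared Euclidean error ||x - g(h(x))||^2."""
    return sum(float(np.sum((x - decode(encode(x))) ** 2)) for x in batch)

batch = rng.normal(size=(5, d_in))
print(reconstruction_loss(batch))
```

Training would minimise `reconstruction_loss` with respect to the four parameter arrays, for example by gradient descent.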

Standard Autoencoder

o The latent space should have fewer dimensions than the input
  ◦ Undercomplete representation
  ◦ Bottleneck architecture
o Otherwise (overcomplete) the autoencoder might learn the identity function: $W \propto I \Rightarrow \hat{x} = x \Rightarrow \mathcal{L} = 0$
  ◦ Assuming no regularization
  ◦ Often in practice still works though
o Also, if $z = Wx + b$ (linear), the autoencoder learns the same subspace as PCA
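The identity shortcut of an overcomplete autoencoder is easy to exhibit by hand. The weights below are constructed, not learned, and the sizes are illustrative; the sketch only shows that a zero-loss solution exists that copies the input without learning any structure.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_latent = 4, 6            # overcomplete: the latent space is wider than the input

# Encoder copies the input into the first d_in latent units (zeros elsewhere).
W_enc = np.zeros((d_latent, d_in))
W_enc[:d_in, :] = np.eye(d_in)

# Decoder reads those units back out.
W_dec = W_enc.T

x = rng.normal(size=d_in)
x_hat = W_dec @ (W_enc @ x)
loss = float(np.sum((x - x_hat) ** 2))
print(loss)
```

A bottleneck rules this solution out because the encoder cannot hold a full copy of the input.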

Stacking Autoencoders

o Stacking layers on top of each other
o Bigger depth → higher level abstractions
o Slower training

Denoising Autoencoder

o Add random noise to the input
  ◦ Dropout, Gaussian
o Loss includes expectation over the noise distribution

Diagram: Input $x$ → Noise $\varepsilon$: $q(\tilde{x}|x, \varepsilon)$ → Corrupted input $\tilde{x}$ → Encoder $h$ → Latent space $z$ → Decoder $g$ → Output: reconstruction $\hat{x}$, scored against the input $x$ by the error $\mathcal{L}$

$\mathcal{L} = \sum_d \mathbb{E}_{q(\tilde{x}|x)} \, \ell(x, g(h(\tilde{x})))$

Denoising Autoencoder

o The network does not overlearn the data
  ◦ Can even use overcomplete latent spaces
o Model forced to learn more intelligent, robust representations
  ◦ Learn to ignore noise or trivial solutions (identity)
  ◦ Focus on "underlying" data generation process
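A rough sketch of the denoising objective follows: the input is corrupted, the corrupted copy is encoded and decoded, and the reconstruction is compared against the clean input. The noise levels, layer sizes, tanh encoder and linear decoder are assumptions made for illustration, and the expectation over the noise is approximated by Monte Carlo sampling.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_latent = 8, 12           # overcomplete on purpose; sizes are illustrative

W_enc = rng.normal(scale=0.1, size=(d_latent, d_in))
W_dec = rng.normal(scale=0.1, size=(d_in, d_latent))

def corrupt(x, sigma=0.3, drop_prob=0.2):
    """Sample x_tilde ~ q(x_tilde | x): additive Gaussian noise, then dropout-style masking."""
    noisy = x + rng.normal(scale=sigma, size=x.shape)
    keep = rng.random(x.shape) >= drop_prob
    return noisy * keep

def reconstruct(x_tilde):
    """g(h(x_tilde)) with a tanh encoder and a linear decoder (assumed forms)."""
    return W_dec @ np.tanh(W_enc @ x_tilde)

def denoising_loss(x, num_samples=100):
    """Monte Carlo estimate of E_q[ ||x - g(h(x_tilde))||^2 ]; the target is the clean x."""
    errors = [np.sum((x - reconstruct(corrupt(x))) ** 2) for _ in range(num_samples)]
    return float(np.mean(errors))

x = rng.normal(size=d_in)
print(denoising_loss(x))
```

Because the target is the clean `x` while the network only sees `x_tilde`, copying the input no longer gives zero loss, even with an overcomplete latent space.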


Stochastic Contractive Autoencoders

o Instead of invariant to input noise, invariant output representations
  ◦ Similar input with different sampled noise → similar outputs
o Regularization term over noisy reconstructions
o Trivial solution?
  ◦ $h(x) = 0$
  ◦ To avoid it, make sure the encoder and decoder weights are shared: $W_d = W_e^T$

$\mathcal{L} = \sum_d \ell(x, g(h(x))) + \lambda \, \mathbb{E}_{q(\tilde{x}|x)} \left[ \| h(x) - h(\tilde{x}) \|^2 \right]$

Reconstruction error: first term. Regularization error: second term.
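The two terms of this loss can be estimated separately. The sketch below assumes a tanh encoder, a tied linear decoder ($W_d = W_e^T$), Gaussian input noise and illustrative values for $\lambda$ and $\sigma$; none of these specifics come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_latent = 8, 4
W_e = rng.normal(scale=0.5, size=(d_latent, d_in))   # encoder weights
W_d = W_e.T                                          # tied decoder weights, W_d = W_e^T

def h(x):
    """Encoder (assumed tanh)."""
    return np.tanh(W_e @ x)

def g(z):
    """Decoder (assumed linear, tied to the encoder)."""
    return W_d @ z

def sca_loss(x, lam=0.1, sigma=0.1, num_samples=200):
    """Return (total, reconstruction, regularizer) for a single input x."""
    recon = np.sum((x - g(h(x))) ** 2)
    # Noisy copies x_tilde = x + eps, one per row.
    noisy = x + rng.normal(scale=sigma, size=(num_samples, x.size))
    codes = np.tanh(noisy @ W_e.T)                   # h applied to each noisy copy
    reg = np.mean(np.sum((h(x) - codes) ** 2, axis=1))
    return float(recon + lam * reg), float(recon), float(reg)

x = rng.normal(size=d_in)
total, recon, reg = sca_loss(x)
print(total, recon, reg)
```

One common reading of the weight tying is that shrinking the encoder weights to make the codes insensitive also shrinks the decoder, so the reconstruction term pushes back against the collapsed solution.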


Stochastic Analytic Regularization

o Taylor series expansion about a point $y = a$:
  $h(y) = h(a) + h'(a)(y - a) + 0.5\, h''(a)(y - a)^2 + \dots$
o For $y = \tilde{x} = x + \varepsilon$, where $\varepsilon \sim N(0, \sigma^2 I)$, the (first order) Taylor expansion is
  $h(\tilde{x}) = h(x + \varepsilon) = h(x) + \frac{\partial h}{\partial x} \varepsilon$
o Higher order expansions also possible, although computationally expensive
  ◦ Alternative: compute higher order derivatives stochastically
  ◦ Analytic and stochastic regularization

$\mathbb{E}_{q(\tilde{x}|x)} \left[ \| h(x) - h(\tilde{x}) \|^2 \right] = \left\| \frac{\partial h}{\partial x} \varepsilon \right\|^2 \propto \sigma^2 \left\| \frac{\partial h}{\partial x} \right\|^2$
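The step from the noisy regularizer to the Jacobian norm can be spelled out. The derivation below is a sketch under the first-order approximation, writing $J$ for the Jacobian $\partial h / \partial x$ and reading the final norm as the Frobenius norm (an assumption, since the slides do not name the norm).

```latex
\begin{aligned}
\mathbb{E}_{\varepsilon}\left[\lVert h(\tilde{x}) - h(x)\rVert^2\right]
&\approx \mathbb{E}_{\varepsilon}\left[\lVert J\varepsilon\rVert^2\right],
\qquad J = \frac{\partial h}{\partial x} \\
&= \mathbb{E}_{\varepsilon}\left[\varepsilon^{\top} J^{\top} J\, \varepsilon\right]
 = \operatorname{tr}\left(J^{\top} J\; \mathbb{E}_{\varepsilon}\left[\varepsilon\varepsilon^{\top}\right]\right) \\
&= \sigma^2 \operatorname{tr}\left(J^{\top} J\right)
 = \sigma^2 \lVert J \rVert_F^2 .
\end{aligned}
```

The third line uses $\mathbb{E}[\varepsilon\varepsilon^{\top}] = \sigma^2 I$ for $\varepsilon \sim N(0, \sigma^2 I)$, so sampling noise and penalizing the Jacobian analytically agree to first order.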

Some geometric intuition

o Gradients show sensitivity of function around a point
  $\mathcal{L} = \sum_d \ell(x, g(h(x))) + \lambda \left\| \frac{\partial h}{\partial x} \right\|^2$
o Regularization term penalizes sensitivity to all directions
o Reconstruction term enforces sensitivity only to direction of manifold


Some geometric intuition: let's get real

o Let's try to check where the gradients are sensitive around our manifold
o Gradients → Jacobian
o Strongest directions of Jacobian → Compute SVD of Jacobian: $\frac{\partial h}{\partial x} = U S V^T$
o Take strongest singular values in $S$ and respective vectors from $U$
  ◦ These are our strongest tangents: the directions of the Jacobian with most sensitivity
o Hopefully the tangents are sensitive to the direction of the manifold, not random
o How to check? Visualize!!
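The SVD recipe can be tried on a small encoder whose Jacobian is known in closed form. The sketch assumes $h(x) = \tanh(Wx + b)$ with random weights and lays the Jacobian out as (latent dimension × input dimension); both choices are made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_latent = 6, 3
W = rng.normal(size=(d_latent, d_in))
b = rng.normal(size=d_latent)

def h(x):
    """Encoder (assumed tanh)."""
    return np.tanh(W @ x + b)

def jacobian(x):
    """dh/dx for h(x) = tanh(Wx + b): diag(1 - h(x)^2) @ W, shape (d_latent, d_in)."""
    return (1.0 - h(x) ** 2)[:, None] * W

x = rng.normal(size=d_in)
J = jacobian(x)

# J = U S V^T, with singular values sorted in decreasing order.
U, S, Vt = np.linalg.svd(J, full_matrices=False)

# With this layout the rows of Vt live in input space:
# moving x along Vt[0] changes h(x) the most, by a factor of about S[0].
top_input_direction = Vt[0]
top_code_direction = U[:, 0]
print(S)
```

Which factor holds the input-space tangents depends on how the Jacobian is laid out: with the transposed layout (input dimension × latent dimension) they would sit in $U$, as on the slide. For image data, each such direction can be reshaped to the image size and displayed.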


Strongest tangents

Variational Autoencoder

o We want to model the data distribution
  $p(x) = \int p_\theta(z)\, p_\theta(x|z)\, dz$
o Posterior $p_\theta(z|x)$ is intractable for complicated likelihood functions $p_\theta(x|z)$, e.g. a neural network → $p(x)$ is also intractable
o Introduce an inference machine $q_\varphi(z|x)$ (e.g. another neural network) that learns to approximate the posterior $p_\theta(z|x)$
  ◦ Since we cannot know $p_\theta(z|x)$, define a variational lower bound to optimize instead

$\mathcal{L}(\theta, \varphi, x) = -D_{KL}\left(q_\varphi(z|x) \,\|\, p_\theta(z)\right) + \mathbb{E}_{q_\varphi(z|x)}\left[\log p_\theta(x|z)\right]$

Regularization term: the KL divergence. Reconstruction term: the expected log-likelihood.
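Why this quantity is a lower bound follows from a standard decomposition of the log-likelihood, sketched here in the notation of the slide.

```latex
\begin{aligned}
\log p_\theta(x)
&= \mathbb{E}_{q_\varphi(z|x)}\left[\log \frac{p_\theta(x|z)\,p_\theta(z)}{q_\varphi(z|x)}\right]
 + D_{KL}\left(q_\varphi(z|x)\,\|\,p_\theta(z|x)\right) \\
&= \mathcal{L}(\theta,\varphi,x)
 + D_{KL}\left(q_\varphi(z|x)\,\|\,p_\theta(z|x)\right)
 \;\ge\; \mathcal{L}(\theta,\varphi,x).
\end{aligned}
```

The inequality holds because a KL divergence is never negative. The gap is exactly the mismatch between the inference machine and the true posterior, so maximizing $\mathcal{L}$ both fits the data and tightens the approximation.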

Variational Autoencoder

o Since we cannot know $p_\theta(z|x)$, define a variational lower bound to optimize instead
  $\mathcal{L}(\theta, \varphi, x) = -D_{KL}\left(q_\varphi(z|x) \,\|\, p_\theta(z)\right) + \mathbb{E}_{q_\varphi(z|x)}\left[\log p_\theta(x|z)\right]$
o Optimize inference machine $q_\varphi(z|x)$ and likelihood machine $p_\theta(x|z)$
  ◦ $q_\varphi(z|x)$ should follow a Gaussian distribution
o Reparameterization trick
  ◦ Instead of $z$ being stochastic, introduce stochastic noise variable $\varepsilon$
  ◦ Then, think of $z$ as deterministic, where $z = \mu_z(x) + \varepsilon \sigma_z(x)$
  ◦ Simultaneous optimization of $q_\varphi(z|x)$ and $p_\theta(x|z)$ now possible

Regularization term: the KL divergence. Reconstruction term: the expected log-likelihood.
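The reparameterization trick and the regularization term can be sketched numerically. The encoder outputs below are hypothetical fixed numbers standing in for $\mu_z(x)$ and $\sigma_z(x)$, and the prior is assumed to be a standard normal $N(0, I)$, a common choice that the slides do not state explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
d_latent = 2

# Hypothetical encoder outputs for one input x: mean and log-variance of q(z|x).
mu = np.array([0.5, -1.0])
log_var = np.array([-0.2, 0.3])
sigma = np.exp(0.5 * log_var)

def sample_z(num_samples):
    """Reparameterization: z = mu + eps * sigma with eps ~ N(0, I)."""
    eps = rng.normal(size=(num_samples, d_latent))
    return mu + eps * sigma

def kl_to_standard_normal(mu, log_var):
    """Closed-form D_KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return 0.5 * float(np.sum(mu ** 2 + np.exp(log_var) - 1.0 - log_var))

z = sample_z(100_000)
print(z.mean(axis=0), z.std(axis=0))   # close to mu and sigma
print(kl_to_standard_normal(mu, log_var))
```

All randomness sits in `eps`, so `z` is a differentiable function of `mu` and `sigma`; this is what lets gradients reach the encoder parameters.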


Examples


Adversarial Networks


Generative Adversarial Networks

o So far we were trying to express the pdf of the data $p(x)$
o Why bother? Is that really necessary? Is that limiting?
o Can we model directly the sampling of data?
o Yes, by modelling sampling as a "Thief vs. Police" game

Generative Adversarial Networks

o Composed of two successive networks
  ◦ Generator network (like upper half of autoencoders)
  ◦ Discriminator network (like a convnet)
o Learning
  ◦ Sample "noise" vectors $z$
  ◦ Per $z$ the generator produces a sample $x$
  ◦ Make a batch where half the samples are real, half are the generated ones
  ◦ The discriminator needs to predict what is real and what is fake

Diagram: Noise $z$ → Generator → Discriminator
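The batch construction in the learning steps can be sketched as follows. The generator is a stand-in (a fixed random linear map) and the "real" data are synthetic; both are placeholders for a trained network and a training set.

```python
import numpy as np

rng = np.random.default_rng(0)
d_noise, d_data, half = 4, 2, 8

# Stand-in generator: a fixed random linear map (a real one would be a trained network).
G_weights = rng.normal(size=(d_noise, d_data))

def generator(z):
    return z @ G_weights

# Stand-in "real" data; in practice these come from the training set.
real = rng.normal(loc=3.0, size=(half, d_data))

# Sample noise vectors z and let the generator produce one sample per z.
z = rng.normal(size=(half, d_noise))
fake = generator(z)

# Half real, half generated; label 1 = real, 0 = fake.
batch = np.concatenate([real, fake], axis=0)
labels = np.concatenate([np.ones(half), np.zeros(half)])

# Shuffle so the discriminator cannot rely on position in the batch.
perm = rng.permutation(2 * half)
batch, labels = batch[perm], labels[perm]
print(batch.shape, labels)
```

The discriminator is then trained as an ordinary binary classifier on `batch` and `labels`.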

"Police vs Thief"

o Generator and discriminator networks optimized together
  ◦ The generator (thief) tries to fool the discriminator
  ◦ The discriminator (police) tries to not get fooled by the generator
o Mathematically
  $\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$
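The value function can be estimated from discriminator outputs alone. In the sketch below the outputs are hand-picked numbers rather than the result of running real networks, and `value` is a hypothetical helper name.

```python
import numpy as np

def value(d_real, d_fake):
    """Monte Carlo estimate of V(G, D) from discriminator outputs in (0, 1).

    d_real: D(x) on real samples; d_fake: D(G(z)) on generated samples.
    """
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

# A confident, correct discriminator: V is close to its upper bound of 0.
print(value(np.full(4, 0.99), np.full(4, 0.01)))

# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
print(value(np.full(4, 0.5), np.full(4, 0.5)))   # equals -log(4)
```

The discriminator pushes this value up towards 0, while the generator pushes it down by making $D(G(z))$ large; when generated and real samples are indistinguishable the value settles at $-\log 4$.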


A graphical interpretation

Adversarial network hardships

o This is (very) fresh research
o As such the theory behind adversarial networks is far from mature
o Moreover, GANs are quite unstable
o Optimal solution is a saddle point
  ◦ Easy to mess up, little stability
o Solutions?
  ◦ Not great ones yet, although several researchers are actively working on that

Diagram: Generator, Discriminator
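Why a saddle point is hard to reach can be seen on a toy min-max game that is not from the lecture: $f(x, y) = x \cdot y$, where one player minimizes over $x$ and the other maximizes over $y$. The only equilibrium is $(0, 0)$, yet simultaneous gradient steps move away from it.

```python
import numpy as np

def simultaneous_gda(x, y, lr=0.1, steps=100):
    """Gradient descent on x and ascent on y for f(x, y) = x * y, updated simultaneously."""
    norms = [float(np.hypot(x, y))]
    for _ in range(steps):
        grad_x, grad_y = y, x              # df/dx = y, df/dy = x
        x, y = x - lr * grad_x, y + lr * grad_y
        norms.append(float(np.hypot(x, y)))
    return norms

norms = simultaneous_gda(1.0, 1.0)
print(norms[0], norms[-1])
```

Each step multiplies the distance to the equilibrium by $\sqrt{1 + lr^2}$, so the iterates spiral outwards instead of converging. GAN training is far more complex than this game, but the sketch suggests why plain simultaneous gradient updates give little stability around a saddle point.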


Examples of generated images

Bedrooms

Image "arithmetics"


Summary

o Latent space data manifolds

o Autoencoders and Variational Autoencoders

o Boltzmann Machines

o Adversarial Networks

o Self-Supervised Networks

Reading material & references

o http://www.deeplearningbook.org/
  ◦ Part II: Chapter 10
o Excellent blog post on Backpropagation Through Time
  ◦ http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/
  ◦ http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-2-implementing-a-language-model-rnn-with-python-numpy-and-theano/
  ◦ http://www.wildml.com/2015/10/recurrent-neural-networks-tutorial-part-3-backpropagation-through-time-and-vanishing-gradients/
o Excellent blog post explaining LSTMs
  ◦ http://colah.github.io/posts/2015-08-Understanding-LSTMs/

[Pascanu2013] Pascanu, Mikolov, Bengio. On the difficulty of training Recurrent Neural Networks, JMLR, 2013


Next lecture

o Memory networks

o Advanced applications of Recurrent and Memory Networks

