Auto-Encoding Variational Bayes
Milan Ilic
3rd April 2019
Paper
Diederik P. Kingma, Max Welling. Universiteit van Amsterdam. Auto-Encoding Variational Bayes. December 2013.
Number of citations: 4364
GitHub
Implementation on GitHub: [Ili]
Table of contents
1 Autoencoders
2 Generative models
3 Variational Autoencoder
4 Variational Inference
5 Reparameterization Trick
6 Results & Applications
7 Conditional VAE
8 References
Autoencoders
Autoencoder Architecture
Conventional Autoencoder
Encoder: $p_{\text{encoder}}(h \mid x)$
Decoder: $p_{\text{decoder}}(x \mid h)$
where $h$ is the code and $x$ is the input.
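As a minimal sketch of such a model (PyTorch, with illustrative layer sizes; this is not the code from [Ili]):

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """A conventional autoencoder: deterministic code h = f(x), reconstruction g(h)."""

    def __init__(self, input_dim=784, code_dim=32):
        super().__init__()
        # Encoder: maps the input x to the code h
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, code_dim),
        )
        # Decoder: reconstructs x from the code h
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)     # the code h
        return self.decoder(h)  # the reconstruction of x
```

Training minimizes a reconstruction error such as `nn.MSELoss()(model(x), x)`.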
Conventional Autoencoder
Dimensionality reduction
Outlier detection
Denoising Autoencoder
Reducing noise in an image
Removing some object from an image (e.g. a watermark)
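A sketch of one denoising training step, reusing the Autoencoder sketch above (the Gaussian noise level 0.3 is an illustrative assumption):

```python
import torch
import torch.nn as nn

model = Autoencoder()  # from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def denoising_step(x_clean):
    # Corrupt the input, but reconstruct the CLEAN target
    x_noisy = (x_clean + 0.3 * torch.randn_like(x_clean)).clamp(0, 1)
    loss = loss_fn(model(x_noisy), x_clean)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```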
Generative models
Discriminative & Generative models
Let $(x, y)$ be the inputs and the corresponding labels, in that order.
Discriminative classifiers model the posterior $p(y \mid x)$ directly, or learn a direct map from inputs $x$ to the class labels.
Generative classifiers learn a model of the joint probability $p(x, y)$ and make their predictions by using Bayes' rule to calculate $p(y \mid x)$, then picking the most likely label $y$:
$$p(y \mid x) = \frac{p(x \mid y)\,p(y)}{p(x)}$$
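A toy numeric check of Bayes' rule for a two-class problem with one binary feature (all probabilities are made-up illustrative values):

```python
p_y = {"spam": 0.4, "ham": 0.6}           # prior p(y)
p_x_given_y = {"spam": 0.7, "ham": 0.1}   # likelihood p(x = 1 | y)

p_x = sum(p_x_given_y[y] * p_y[y] for y in p_y)  # evidence p(x = 1)
posterior = {y: p_x_given_y[y] * p_y[y] / p_x for y in p_y}

print(posterior)  # {'spam': ~0.82, 'ham': ~0.18} -> predict 'spam'
```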
Discriminative & Generative models
Discriminative models
Logistic regression
Linear regression
Support vector machines
Random forests
Traditional neural networks
etc...
Generative models
If we are not interested in a supervised problem, we can use a generative model to learn only the distribution of our data, $p(x)$.
After training, we can generate new data similar to $x$.
“What I cannot create, I do not understand”
— Richard Feynman
Generative models
Boltzmann machine
PixelRNN
GAN
Variational autoencoder
One problem with generative models is that they need very large datasets to work properly.
Variational Autoencoder
VAE
"The variational autoencoder approach is elegant, theoreticallypleasing, and simple to implement"
— Ian Goodfellow
Reminder
VAE Architecture
What can we do now?
After training the VAE, we can simply discard the encoder part and use the decoder to generate new data.
To get the decoder's input, we just sample from the Gaussian prior and pass that sample through the decoder.
Generated data instances should come from high-probability regions of the dataset's distribution.
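A sketch of that generation step (`vae` and `latent_dim` are assumed names for an already-trained model and its latent size):

```python
import torch

latent_dim = 20
with torch.no_grad():
    z = torch.randn(16, latent_dim)  # 16 samples from the prior N(0, I)
    samples = vae.decoder(z)         # decode each z into a new data instance
```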
Visualization of latent space
On the left, the conventional AE latent space; on the right, the VAE's latent space.
Variational Inference
Setup
Assume that $x = x_{1:n}$ are the observations, $z = z_{1:m}$ are the hidden variables, and $\alpha$ are fixed parameters.
First we generate a value $z$ from a prior distribution $p(z)$ and then generate $x$ from the conditional distribution $p(x \mid z)$; we can assume what form these two distributions take.
We want to calculate the posterior distribution:
$$p_\alpha(z \mid x) = \frac{p_\alpha(x \mid z)\,p_\alpha(z)}{\int_z p_\alpha(z, x)\,dz}$$
In most cases the posterior is intractable.
Variational inference treats the inference problem as an approximation problem.
Approaches
Deterministic approximation:
Mean field variational inference
Stochastic variational inference
Stochastic approximation (Markov Chain Monte Carlo):
Metropolis–Hastings algorithm
Gibbs sampling
With the deterministic approximation we will converge, but not necessarily to the optimal solution.
The main problem with the stochastic approximation is that it is very slow due to the sampling step.
Main Idea
The main idea behind variational methods is to first pick a tractable family of distributions over the latent variables, with its own variational parameters: $q(z_{1:m} \mid \nu)$.
Then find the parameters that make it as close as possible to the true posterior.
Use that $q$ instead of the posterior to make predictions about future data.
Kullback-Leibler Divergence
$$D_{KL}(p \,\|\, q) = \mathbb{E}_{x \sim p}\left[\log \frac{p(x)}{q(x)}\right]$$
Discrete and continuous form:
$$D_{KL}(p \,\|\, q) = \sum_x p(x) \log \frac{p(x)}{q(x)} \qquad\qquad D_{KL}(p \,\|\, q) = \int_x p(x) \log \frac{p(x)}{q(x)}\,dx$$
Used to measure similarity between two probability distributions (w.r.t. one of them).
Properties:
$D_{KL}(p \,\|\, q) \ge 0, \ \forall p, q$
$D_{KL}(p \,\|\, q) = 0 \iff p = q$
$D_{KL}(p \,\|\, q) \ne D_{KL}(q \,\|\, p)$ in general
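A quick numeric check of the discrete formula and of the asymmetry (the two distributions are arbitrary illustrative values):

```python
import math

def kl(p, q):
    """Discrete KL divergence D_KL(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.8, 0.15, 0.05]
q = [0.4, 0.4, 0.2]

print(kl(p, q))  # ~0.34, non-negative
print(kl(q, p))  # ~0.39, != kl(p, q): KL is not symmetric
```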
The Variational Lower Bound
$$
\begin{aligned}
D_{KL}(q(z \mid x) \,\|\, p(z \mid x)) &= \int_z q(z \mid x) \log \frac{q(z \mid x)}{p(z \mid x)} \\
&= -\int_z q(z \mid x) \log \frac{p(z \mid x)}{q(z \mid x)} \\
&= -\left[ \int_z q(z \mid x) \log \frac{p(x, z)}{q(z \mid x)} - \int_z q(z \mid x) \log p(x) \right] \\
&= -\int_z q(z \mid x) \log \frac{p(x, z)}{q(z \mid x)} + \log p(x) \int_z q(z \mid x) \\
&= -\mathcal{L} + \log p(x)
\end{aligned}
\tag{1}
$$
$$\log p(x) = \mathcal{L} + D_{KL}(q(z \mid x) \,\|\, p(z \mid x))$$
Minimizing the KL divergence is equivalent to maximizing the variational lower bound!
The Variational Lower Bound
$$
\begin{aligned}
\mathcal{L} &= \int_z q(z \mid x) \log \frac{p(x, z)}{q(z \mid x)} \\
&= \int_z q(z \mid x) \log \frac{p(x \mid z)\,p(z)}{q(z \mid x)} \\
&= \int_z q(z \mid x) \log p(x \mid z) + \int_z q(z \mid x) \log \frac{p(z)}{q(z \mid x)}
\end{aligned}
\tag{2}
$$
$$\mathcal{L} = \mathbb{E}_{q(z \mid x)} \log p(x \mid z) - D_{KL}(q(z \mid x) \,\|\, p(z))$$
The first term is conceptually the negative reconstruction error, and the second makes our $q(z \mid x)$ close to the prior $p(z)$.
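A sketch of the negative lower bound as a training loss, assuming a Bernoulli decoder and a diagonal-Gaussian $q(z \mid x)$; the closed-form KL between $\mathcal{N}(\mu, \text{diag}(\sigma^2))$ and $\mathcal{N}(0, I)$ is the standard result from the paper's appendix [KW14]:

```python
import torch
import torch.nn.functional as F

def neg_elbo(x_hat, x, mu, logvar):
    # Negative reconstruction term: one-sample estimate of -E_q log p(x | z)
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    # D_KL( N(mu, diag(exp(logvar))) || N(0, I) ), in closed form
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl  # minimizing this maximizes the lower bound L
```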
Back to VAE
Let $\theta$ be the generative parameters and $\phi$ the variational parameters, and assume that $x^{(1)}, \ldots, x^{(N)}$ are i.i.d.:
$$\log p_\theta(x^{(1)}, \ldots, x^{(N)}) = \sum_{i=1}^{N} \log p_\theta(x^{(i)})$$
Each term on the right-hand side can be written as:
$$\log p_\theta(x^{(i)}) = D_{KL}(q_\phi(z \mid x^{(i)}) \,\|\, p_\theta(z \mid x^{(i)})) + \mathcal{L}(\theta, \phi; x^{(i)})$$
The variational lower bound is now:
$$\mathcal{L}(\theta, \phi; x^{(i)}) = \mathbb{E}_{q_\phi(z \mid x^{(i)})} \log p_\theta(x^{(i)} \mid z) - D_{KL}(q_\phi(z \mid x^{(i)}) \,\|\, p_\theta(z))$$
Reparameterization Trick
Definition
In the middle of the VAE's architecture there should be sampling from $z \sim q_\phi(z \mid x) = \mathcal{N}(\mu, \Sigma)$.
We cannot do such a thing, because backpropagation can't go through a sampling node.
It is often possible to express the random variable $z$ as a deterministic variable $z = g_\phi(\varepsilon, x)$, where $\varepsilon$ is an auxiliary variable with independent marginal $p(\varepsilon)$, and $g_\phi(\cdot)$ is some vector-valued function parameterized by $\phi$.
In our (Gaussian) case, $z = \mu + \Sigma^{1/2}\varepsilon$, where $\varepsilon \sim \mathcal{N}(0, I)$; with a diagonal covariance this is the elementwise $z = \mu + \sigma \odot \varepsilon$.
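A minimal sketch of the trick for this diagonal-Gaussian case:

```python
import torch

def reparameterize(mu, logvar):
    std = torch.exp(0.5 * logvar)  # sigma, from the encoder's log-variance output
    eps = torch.randn_like(std)    # the only stochastic node; needs no gradient
    return mu + std * eps          # z, differentiable w.r.t. mu and logvar
```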
Other distributions
The normal distribution isn't the only one for which we can do this transformation; there are three groups of distributions:
Tractable inverse CDF:
Let $\varepsilon \sim U(0, I)$ and $g_\phi(\varepsilon, x)$ be the inverse CDF of $q_\phi(z \mid x)$ (see the sketch after this list).
Exponential, Cauchy, Logistic...
Location-scale family:
As in the example from the last slide, $z = \text{location} + \text{scale} \cdot \varepsilon$, where $\varepsilon$ is from the standard distribution.
Laplace, Student's t, Uniform, Normal...
Composition:
It is often possible to express random variables as different transformations of auxiliary variables.
Log-Normal, Gamma, Beta, Chi-Squared, F...
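A sketch of the inverse-CDF case for the Exponential distribution, whose inverse CDF is $g(\varepsilon) = -\log(1 - \varepsilon)/\lambda$; the rate value 2.0 is an illustrative assumption:

```python
import torch

rate = torch.tensor(2.0, requires_grad=True)
eps = torch.rand(1000)          # eps ~ U(0, 1), parameter-free
z = -torch.log(1 - eps) / rate  # z ~ Exponential(rate) via the inverse CDF
z.mean().backward()             # gradients flow to `rate` through g
print(rate.grad)                # ~ -1/rate**2 = -0.25, since E[z] = 1/rate
```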
Putting everything together
Results & Applications
Healing Imagery
Autoencoders Generative models Variational Autoencoder Variational Inference Reparameterization Trick Results & Applications Conditional VAE References
Applications
Data generation (e.g. images, music...)
Caption generation
Anomaly detection
Image segmentation
Super resolution
etc...
Conditional VAE
Motivation
By using the variational autoencoder, we do not have control over the data generation process.
E.g., we cannot generate only one specific digit from a model trained on the MNIST dataset.
We want, for example, to input the character 9 to our model and get a generated image of a handwritten digit 9.
Approach
We will condition the encoder and decoder on other inputs as well as the image; let's call those inputs $c$.
The encoder becomes: $q(z \mid x, c)$
The decoder becomes: $p(x \mid z, c)$
Now our variational lower bound objective becomes:
$$\mathcal{L} = \mathbb{E}[\log p(x \mid z, c)] - D_{KL}(q(z \mid x, c) \,\|\, p(z \mid c))$$
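A sketch of the conditioning, reusing the reparameterize sketch from earlier; `encoder_net` and `decoder_net` are assumed plain networks that accept the concatenated inputs:

```python
import torch

def cvae_forward(x, c, encoder_net, decoder_net):
    mu, logvar = encoder_net(torch.cat([x, c], dim=1))  # q(z | x, c)
    z = reparameterize(mu, logvar)                      # same trick as before
    x_hat = decoder_net(torch.cat([z, c], dim=1))       # p(x | z, c)
    return x_hat, mu, logvar

# At generation time: sample z ~ N(0, I), pick the desired one-hot label c,
# and decode torch.cat([z, c], dim=1) to get, e.g., a chosen digit.
```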
Latent space
On the left, the vanilla VAE latent space; on the right, the CVAE's latent space.
Results
References
[KW14, GDG+15, GBC16, NJ01, SLY15]
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. http://www.deeplearningbook.org
Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, and Daan Wierstra. DRAW: A Recurrent Neural Network for Image Generation. arXiv:1502.04623, 2015.
Milan Ilic. Autoencoders-101. https://github.com/milanilic332/Autoencoders-101
Diederik P. Kingma and Max Welling. Auto-Encoding Variational Bayes. arXiv:1312.6114, 2013.
Andrew Y. Ng and Michael I. Jordan. On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes. NIPS, 2001, pp. 841–848.
Kihyuk Sohn, Honglak Lee, and Xinchen Yan. Learning Structured Output Representation Using Deep Conditional Generative Models. NIPS, 2015, pp. 3483–3491.
Thanks
Thanks for listening!