Novelty generation with deep learning
Presented by: Cherti Mehdi
joint work with Balázs Kégl and Akın Kazakçı
• Research questions
• Why machine learning and deep learning?
• Generating new types of objects (C) using previously acquired knowledge (K)
• Evaluating novelty with out-of-class generation metrics
• Perspectives
Roadmap
Summary
• Recently, generative models have gained momentum. But such models are almost exclusively used in a prediction pipeline.
• Our objective is to study
• a) whether such models can be used to generate novelty
• b) how to evaluate their capacity for generating novelty
Research questions
• What is meant by the generation of novelty?
• How can novelty be generated?
• How can a model generating novelty be evaluated?
Why machine learning and deep learning?
• Knowledge is important: machine learning enables the study of creativity in relation to knowledge
• Generative modeling: we want to generate objects
• Composition of features is important: deep learning models can automatically learn a hierarchy of features of growing abstraction from raw data
My thesis focuses on deep generative models.
Generating new types of objects
In Kazakçı et al. 2016:
• We show that symbols of new types can be generated by carefully tuned autoencoders
• We take a first step toward defining the conceptual and experimental framework of novelty generation
Generating new types of objects: autoencoders
• Autoencoders have existed for a long time (Kramer 1991)
• Deep variants are more recent (Hinton, Salakhutdinov, 2006; Bengio 2009)
• A deep autoencoder learns successive transformations that decompose and then recompose a set of training objects
• The depth allows learning a hierarchy of transformations
• Two ways of learning an autoencoder: with an undercomplete (bottleneck) or an overcomplete representation
Slide adapted from Kazakçı et al. 2016
Generating new types of objects: autoencoders with undercomplete representation
[Figure: a deep autoencoder with a bottleneck; the input (dim 625) is encoded down to a bottleneck code and decoded back into a reconstruction. From Hinton, G. E., & Salakhutdinov, R. R. (2006).]
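As a minimal, illustrative sketch (not the slides' actual model), a linear undercomplete autoencoder can be trained by gradient descent on the reconstruction error. The data, dimensions, and learning rate below are toy stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for image data: 200 samples lying on a 3-dimensional
# linear subspace of R^25 (the slide's real input has dim 625).
X = rng.normal(size=(200, 3)) @ (rng.normal(size=(3, 25)) / 5)

# Undercomplete linear autoencoder: 25 -> 3 (bottleneck) -> 25.
W_enc = rng.normal(scale=0.1, size=(25, 3))
W_dec = rng.normal(scale=0.1, size=(3, 25))

def mse(X):
    # Reconstruction error of decode(encode(x)) against the input.
    return np.mean(np.sum(((X @ W_enc) @ W_dec - X) ** 2, axis=1))

err_before = mse(X)
lr = 0.05
for _ in range(2000):
    H = X @ W_enc                      # encode: project to the bottleneck
    G = 2 * (H @ W_dec - X) / len(X)   # gradient of the loss w.r.t. the reconstruction
    W_dec -= lr * H.T @ G              # update decode weights
    W_enc -= lr * X.T @ (G @ W_dec.T)  # update encode weights
err_after = mse(X)
print(err_before, err_after)  # the error drops as the bottleneck learns the subspace
```

Because the bottleneck (3) is narrower than the input (25), the network cannot copy its input and is forced to learn a compressed representation.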
Generating new types of objects: autoencoders with overcomplete representation
• Autoencoders can also be learned using an overcomplete representation
• Problem: risk of learning the identity function
• One solution: constrain the representation to be "simple"
• Example: enforce sparsity of the representation with sparse autoencoders
Generating new types of objects: autoencoders with overcomplete representation
• What does a sparse autoencoder end up learning?
• Detect features with the encode function
• Superpose the detected features in the reconstructed image with the decode function
• Benefit of an overcomplete representation with sparsity: each image uses only a small fraction of the features, but different images use different subsets of features
k is the sparsity rate in %
Figure taken from Makhzani, A., & Frey, B. (2013)
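The sparsity constraint can be implemented as in k-sparse autoencoders (Makhzani & Frey, 2013): after encoding, keep only the k largest activations per sample and zero out the rest. A minimal numpy sketch of that support-selection step, assuming nonnegative (e.g. ReLU) activations:

```python
import numpy as np

def k_sparse(h, k):
    """Keep the k largest activations per row, zero the rest
    (the support-selection step of a k-sparse autoencoder)."""
    out = np.zeros_like(h)
    idx = np.argsort(h, axis=1)[:, -k:]        # indices of the top-k per sample
    rows = np.arange(h.shape[0])[:, None]
    out[rows, idx] = h[rows, idx]
    return out

h = np.array([[0.1, 0.9, 0.3, 0.7],
              [0.5, 0.2, 0.8, 0.4]])
sparse_h = k_sparse(h, 2)  # each row keeps only its two largest activations
```

The decoder then reconstructs the image from this sparse code, so each image is expressed as a superposition of a few learned features.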
Generating new types of objects: experimental setup
• Training data: MNIST, 70,000 images of handwritten digits of size 28×28
• We use a sparse convolutional autoencoder trained to:
• Encode: take an image and transform it into a sparse code
• Decode: take the sparse code and reconstruct the image
• The training objective is to minimize the reconstruction error
Slide adapted from Kazakçı et al. 2016
Generating new types of objects: generating new symbols
• We use an iterative method to build symbols the net has never seen (inspired by Bengio et al. (2013), but we do not try to avoid spurious samples):
• Start with a random image x
• Repeatedly force the network to construct (i.e., interpret) it: x ← f(x) = decode(encode(x))
• Iterate until convergence
Slide adapted from Kazakçı et al. 2016
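The procedure above can be sketched as a fixed-point iteration. In this sketch `decode_encode` is a toy contraction standing in for the trained autoencoder:

```python
import numpy as np

def generate(decode_encode, x0, n_iter=100, tol=1e-6):
    """Iterate x <- f(x) = decode(encode(x)) from a starting image
    until the image stops changing, i.e. reaches a fixed point of the
    autoencoder."""
    x = x0
    for _ in range(n_iter):
        x_next = decode_encode(x)
        if np.max(np.abs(x_next - x)) < tol:
            break
        x = x_next
    return x

# Toy stand-in: a contraction whose fixed point is `target`.
# In the slides' setup the trained autoencoder plays this role,
# and its fixed points are the generated symbols.
rng = np.random.default_rng(0)
target = rng.normal(size=(28, 28))
x0 = rng.normal(size=(28, 28))                  # start from a random image
sample = generate(lambda x: 0.5 * (x + target), x0)
```

Starting from different random images, the iteration lands on different fixed points, some of which are symbols outside the training classes.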
Generating new types of objects: generating new symbols
• What does the iterative generation procedure do?
• It follows a non-linear path in the input space, defined by the autoencoder function (encode + decode)
• It converges to fixed points of the autoencoder
Figure taken from Alain and Bengio (2013)
Generating new types of objects: visualization of the structure of generated images
• Colored clusters are original digits (classes from 0 to 9)
• The gray dots are newly generated objects
• New objects form new clusters
• Using a clustering algorithm, we recover coherent sets of new symbols
Slide adapted from Kazakçı et al. 2016
Generating new types of objects
In Kazakçı et al. 2016:
• We show that symbols of new types can be generated by carefully tuned autoencoders
• We take a first step toward defining a conceptual and experimental framework for novelty generation
• However, we make no attempt to design evaluation metrics
A set of types (clusters) discovered by the model
Evaluating novelty
In “Out-of-class novelty generation: an experimental foundation” :
• We design an experimental framework based on hold-out classes
• We review and analyze the most common evaluation techniques from the point of view of measuring “out-of-distribution novelty” and propose new ones
• We run a large-scale experiment to study the capacity of a wide range of generative models to generate novelty
Evaluating novelty: experimental framework
• We contrast two main concepts: in-class and out-of-class generation
• In-class generation: can a model re-generate the types already seen in the dataset? (the traditional objective)
• Out-of-class generation: can a model generate an unseen (hold-out) set of types? (a proxy for measuring a model's capacity to generate novelty)
• Setup: we train models on one set of types (in), and we look for models that generate a hold-out set of types (out)
Evaluating novelty: evaluation metrics
• In our experiments:
• We train models on digits (in-class)
• We look for models that generate letters (out-of-class)
• We pre-train three classifiers:
• a digit classifier (0 to 9)
• a letter classifier (a to z)
• a classifier on a mixture of digits and letters
• Each evaluation metric reports a score for the set of objects generated by a model
Evaluating novelty: evaluation metrics
Given a set of images, out-of-class objectness is high if:
• the letter classifier is highly confident that each image is one of the letters (a to z)
• we define in-class objectness similarly, but using the digit classifier
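As an illustrative sketch (a simplified reading of the metric, not necessarily the paper's exact formula), out-of-class objectness can be computed from the letter classifier's predicted probabilities:

```python
import numpy as np

def objectness(probs):
    """Mean, over images, of the classifier's highest predicted probability.
    `probs` has shape (n_images, n_classes) with rows summing to 1; use the
    letter classifier for out-of-class, the digit classifier for in-class."""
    return float(np.mean(np.max(probs, axis=1)))

# A confidently classified batch scores higher than an uncertain one.
confident = np.array([[0.90, 0.05, 0.05],
                      [0.05, 0.90, 0.05]])
uncertain = np.full((2, 3), 1.0 / 3.0)
```

A high score under the letter classifier suggests the generated images look like coherent letter-shaped objects rather than noise.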
Evaluating novelty: evaluation metrics
Given a set of images, out-of-class max and out-of-class count are high if:
• the classifier trained on the mixture of digits and letters is highly confident that each image is one of the letters (a to z)
• we define in-class max and in-class count similarly, but for digits
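A sketch under the same caveat (simplified definitions, and assuming a hypothetical class layout in which the mixed classifier outputs 10 digit classes followed by 26 letter classes):

```python
import numpy as np

LETTERS = slice(10, 36)   # assumed layout: 10 digit classes, then 26 letters

def out_of_class_max(probs):
    """Mean over images of the highest probability among the letter classes."""
    return float(np.mean(np.max(probs[:, LETTERS], axis=1)))

def out_of_class_count(probs, threshold=0.95):
    """Fraction of images the mixed classifier assigns to some letter
    with probability above `threshold`."""
    return float(np.mean(np.max(probs[:, LETTERS], axis=1) > threshold))

# Fake batches: one confidently classified as a letter, one as a digit.
letters_probs = np.zeros((4, 36)); letters_probs[:, 12] = 0.97
letters_probs[:, :10] = 0.03 / 10
digits_probs = np.zeros((4, 36)); digits_probs[:, 3] = 0.97
digits_probs[:, 10:] = 0.03 / 26
```

Using a single classifier over both digits and letters lets these metrics penalize images that still look like the training digits.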
Evaluating novelty: evaluation metrics
• We run a large-scale experiment in which we train ~1000 models, varying their hyperparameters
• From each model we generate 1000 images, then evaluate the model using our proposed metrics
• We collect a total of ~1,000,000 generated images
Experiments
• We evaluate the evaluators with human assessment
• We build an annotation tool to check whether the models selected by our evaluation metrics are effectively good
Experiments: evaluating the evaluators
• We found that selecting models for in-class generation makes them memorize the classes they are trained to sample from
• We did succeed in finding models that lead to out-of-class novelty
• Pangram obtained from the above model:
Experiments: results
• The main focus was setting up the experimental pipeline and analyzing various quality metrics designed to measure out-of-distribution novelty
• The immediate next goal is to analyze the models in a systematic way
Perspectives
Thank you for listening!