
Page 1: Title

Part 1: Graphical Models
Machine Learning Techniques for Computer Vision

Christopher M. Bishop
Microsoft Research Cambridge
ECCV 2004, Prague

Page 2: About this Tutorial

• Learning is the new frontier in computer vision
• Focus on concepts
  – not lists of algorithms
  – not technical details
• Graduate level
• Please ask questions!

Page 3: Overview

• Part 1: Graphical models
  – directed and undirected graphs
  – inference and learning

• Part 2: Unsupervised learning
  – mixture models, EM
  – variational inference, model complexity
  – continuous latent variables

• Part 3: Supervised learning
  – decision theory
  – linear models, neural networks
  – boosting, sparse kernel machines

Page 4: Probability Theory

• Sum rule

    p(x) = \sum_y p(x, y)

• Product rule

    p(x, y) = p(y \mid x) \, p(x)

• From these we have Bayes’ theorem

    p(x \mid y) = \frac{p(y \mid x) \, p(x)}{p(y)}

  – with normalization

    p(y) = \sum_x p(y \mid x) \, p(x)
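Both rules, and Bayes’ theorem, can be sanity-checked numerically. A minimal Python sketch, assuming nothing beyond an invented 2×2 joint table (not from the tutorial):

```python
import numpy as np

# Invented toy joint distribution p(x, y) over two binary variables
# (rows index x, columns index y); any non-negative table summing to 1 works.
joint = np.array([[0.30, 0.10],
                  [0.20, 0.40]])

p_x = joint.sum(axis=1)             # sum rule: p(x) = sum_y p(x, y)
p_y = joint.sum(axis=0)             # sum rule: p(y) = sum_x p(x, y)
p_y_given_x = joint / p_x[:, None]  # product rule rearranged: p(y|x) = p(x, y) / p(x)

# Bayes' theorem: p(x|y) = p(y|x) p(x) / p(y)
p_x_given_y = p_y_given_x * p_x[:, None] / p_y[None, :]

# Agrees with conditioning the joint table directly.
assert np.allclose(p_x_given_y, joint / p_y[None, :])
```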

Page 5: Role of the Graphs

• New insights into existing models
• Motivation for new models
• Graph-based algorithms for calculation and computation
  – c.f. Feynman diagrams in physics

Page 6: Decomposition

• Consider an arbitrary joint distribution

    p(x_1, \dots, x_M)

• By successive application of the product rule

    p(x_1, \dots, x_M) = p(x_1) \, p(x_2 \mid x_1) \cdots p(x_M \mid x_1, \dots, x_{M-1})

Page 7: Directed Acyclic Graphs

• Joint distribution

    p(\mathbf{x}) = \prod_i p(x_i \mid \mathrm{pa}_i)

  where \mathrm{pa}_i denotes the parents of node i

• No directed cycles
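A minimal sketch of the factorization, assuming an invented three-node DAG a → c ← b with made-up conditional probability tables; the product of local conditionals defines a valid joint:

```python
import itertools

# Invented binary DAG: a -> c <- b, so pa(a) = {}, pa(b) = {}, pa(c) = {a, b}.
p_a = {0: 0.6, 1: 0.4}
p_b = {0: 0.7, 1: 0.3}
p_c_given_ab = {(a, b): {1: 0.1 + 0.2 * (a + b), 0: 0.9 - 0.2 * (a + b)}
                for a in (0, 1) for b in (0, 1)}

def joint(a, b, c):
    # p(x) = prod_i p(x_i | pa_i)
    return p_a[a] * p_b[b] * p_c_given_ab[(a, b)][c]

# The factorized joint sums to one over all configurations.
total = sum(joint(a, b, c) for a, b, c in itertools.product((0, 1), repeat=3))
assert abs(total - 1.0) < 1e-12
```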

Page 8: Undirected Graphs

• Provided p(\mathbf{x}) > 0, the joint distribution is a product of non-negative functions over the cliques of the graph

    p(\mathbf{x}) = \frac{1}{Z} \prod_C \psi_C(\mathbf{x}_C)

  where \psi_C(\mathbf{x}_C) are the clique potentials, and Z is a normalization constant
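For a graph small enough to enumerate, Z can be computed by brute force. A sketch with an invented pairwise potential on the three-node chain x1 − x2 − x3 (the cliques are the two edges):

```python
import itertools
import numpy as np

# Invented non-negative edge potential; it favours equal neighbouring states.
psi = np.array([[2.0, 1.0],
                [1.0, 2.0]])

states = (0, 1)
# Z = sum over all configurations of the product of clique potentials.
Z = sum(psi[x1, x2] * psi[x2, x3]
        for x1, x2, x3 in itertools.product(states, repeat=3))

def p(x1, x2, x3):
    return psi[x1, x2] * psi[x2, x3] / Z

assert abs(sum(p(*x) for x in itertools.product(states, repeat=3)) - 1.0) < 1e-12
```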

Page 9: Conditioning on Evidence

• Variables may be hidden (latent) or visible (observed)

• Latent variables may have a specific interpretation, or may be introduced to permit a richer class of distributions

Page 10: Conditional Independences

• x is independent of y given z if, for all values of z,

    p(x, y \mid z) = p(x \mid z) \, p(y \mid z)

• For undirected graphs this is given by graph separation!

Page 11: “Explaining Away”

• Conditional independence for directed graphs is similar, but with one subtlety
• Illustration: pixel colour in an image

  [Figure: “surface colour” and “lighting colour” are parent nodes, each with an arrow into the child node “image colour”]
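The subtlety can be demonstrated numerically. In this sketch the priors and the CPT are invented: surface and lighting colour are independent a priori, and the pixel tends to look red when either cause is red. Conditioning on the observed pixel makes the two causes dependent, so each one “explains away” the other:

```python
import itertools

p_surface = {0: 0.5, 1: 0.5}   # prior over surface colour (1 = red)
p_light   = {0: 0.5, 1: 0.5}   # prior over lighting colour, independent a priori
# Invented CPT: the pixel looks red (1) mainly if surface OR lighting is red.
p_image = {(s, l): {1: 0.9 if (s or l) else 0.1, 0: 0.1 if (s or l) else 0.9}
           for s in (0, 1) for l in (0, 1)}

def joint(s, l, i):
    return p_surface[s] * p_light[l] * p_image[(s, l)][i]

# Condition on observing a red pixel: p(s, l | i = 1).
Z = sum(joint(s, l, 1) for s, l in itertools.product((0, 1), repeat=2))
post = {(s, l): joint(s, l, 1) / Z for s, l in itertools.product((0, 1), repeat=2)}

p_s1 = post[(1, 0)] + post[(1, 1)]   # p(surface red | red pixel)
p_l1 = post[(0, 1)] + post[(1, 1)]   # p(lighting red | red pixel)
# If the parents were still independent given the image, these would match:
print(post[(1, 1)], "vs", p_s1 * p_l1)   # 0.321 vs 0.413 -- now dependent
```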

Page 12: Directed versus Undirected

Page 13: Example: State Space Models

• Hidden Markov model
• Kalman filter

Page 14: Example: Bayesian SSM

Page 15: Example: Factorial SSM

• Multiple hidden sequences
• Avoid exponentially large hidden space

Page 16: Example: Markov Random Field

• Typical application: image region labelling

Page 17: Example: Conditional Random Field

Page 18: Inference

• Simple example: Bayes’ theorem

Page 19: Message Passing

• Example: a chain of N nodes

• Find the marginal for a particular node x_n

    p(x_n) = \sum_{x_1} \cdots \sum_{x_{n-1}} \sum_{x_{n+1}} \cdots \sum_{x_N} p(\mathbf{x})

  – for M-state nodes, the cost is O(M^N)
  – exponential in the length of the chain
  – but we can exploit the graphical structure (conditional independences)

Page 20: Message Passing

• Joint distribution

    p(\mathbf{x}) = \frac{1}{Z} \psi_{1,2}(x_1, x_2) \, \psi_{2,3}(x_2, x_3) \cdots \psi_{N-1,N}(x_{N-1}, x_N)

• Exchange sums and products

    p(x_n) = \frac{1}{Z} \left[ \sum_{x_{n-1}} \psi_{n-1,n}(x_{n-1}, x_n) \cdots \left[ \sum_{x_1} \psi_{1,2}(x_1, x_2) \right] \right] \left[ \sum_{x_{n+1}} \psi_{n,n+1}(x_n, x_{n+1}) \cdots \left[ \sum_{x_N} \psi_{N-1,N}(x_{N-1}, x_N) \right] \right]

Page 21: Message Passing

• Express as product of messages

    p(x_n) = \frac{1}{Z} \, \mu_\alpha(x_n) \, \mu_\beta(x_n)

• Recursive evaluation of messages

    \mu_\alpha(x_n) = \sum_{x_{n-1}} \psi_{n-1,n}(x_{n-1}, x_n) \, \mu_\alpha(x_{n-1})
    \mu_\beta(x_n) = \sum_{x_{n+1}} \psi_{n,n+1}(x_n, x_{n+1}) \, \mu_\beta(x_{n+1})

• Find Z by normalizing
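A minimal sum-product sketch for this chain, with invented random potentials; the forward and backward recursions cost O(N M²) instead of O(M^N):

```python
import numpy as np

N, M = 5, 3                       # chain length, states per node
rng = np.random.default_rng(0)
# One invented non-negative potential psi(x_{n-1}, x_n) per edge.
psis = [rng.uniform(0.5, 2.0, size=(M, M)) for _ in range(N - 1)]

# Forward messages: mu_a[n](x_n) = sum_{x_{n-1}} psi(x_{n-1}, x_n) mu_a[n-1](x_{n-1})
mu_a = [np.ones(M)]
for psi in psis:
    mu_a.append(psi.T @ mu_a[-1])

# Backward messages: mu_b[n](x_n) = sum_{x_{n+1}} psi(x_n, x_{n+1}) mu_b[n+1](x_{n+1})
mu_b = [np.ones(M)]
for psi in reversed(psis):
    mu_b.append(psi @ mu_b[-1])
mu_b = mu_b[::-1]

Z = float(mu_a[-1].sum())         # find Z by normalizing at the end of the chain
marginals = [a * b / Z for a, b in zip(mu_a, mu_b)]
assert all(abs(m.sum() - 1.0) < 1e-9 for m in marginals)
```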

Page 22: Belief Propagation

• Extension to general tree-structured graphs
• At each node:
  – form product of incoming messages and local evidence
  – marginalize to give outgoing message
  – one message in each direction across every link
• Fails if there are loops

Page 23: Junction Tree Algorithm

• An efficient exact algorithm for a general graph
  – applies to both directed and undirected graphs
  – compile original graph into a tree of cliques
  – then perform message passing on this tree

• Problem:
  – cost is exponential in the size of the largest clique
  – many vision models have intractably large cliques

Page 24: Loopy Belief Propagation

• Apply belief propagation directly to a general graph
  – need to keep iterating
  – might not converge

• State-of-the-art performance in error-correcting codes

Page 25: Max-product Algorithm

• Goal: find the most probable configuration

    \mathbf{x}^\star = \arg\max_{\mathbf{x}} p(\mathbf{x})

  – define

    p^{\max} = \max_{\mathbf{x}} p(\mathbf{x})

  – then p(\mathbf{x}^\star) = p^{\max}

• Message passing algorithm with “sum” replaced by “max”
• Example:
  – Viterbi algorithm for HMMs
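A hedged Viterbi-style sketch on the same chain model as before (invented potentials): the sum in each message is replaced by a max, and back-pointers recover the maximizing configuration:

```python
import numpy as np

N, M = 5, 3
rng = np.random.default_rng(1)
psis = [rng.uniform(0.5, 2.0, size=(M, M)) for _ in range(N - 1)]

# Max-product recursion: same shape as sum-product, "sum" replaced by "max".
# (For long chains, work with log-potentials and "max-sum" to avoid underflow.)
mu = np.ones(M)
back = []
for psi in psis:
    scores = psi * mu[:, None]           # scores[x_prev, x_next]
    back.append(scores.argmax(axis=0))   # best predecessor for each x_next
    mu = scores.max(axis=0)

# Backtrack to recover x* = argmax_x prod_n psi(x_{n-1}, x_n).
x = [int(mu.argmax())]
for bp in reversed(back):
    x.append(int(bp[x[-1]]))
x.reverse()
print("most probable configuration:", x, "unnormalized maximum:", float(mu.max()))
```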

Page 26: Inference and Learning

• Data set

    D = \{ \mathbf{x}_1, \dots, \mathbf{x}_N \}

• Likelihood function (independent observations)

    p(D \mid \mathbf{w}) = \prod_{n=1}^{N} p(\mathbf{x}_n \mid \mathbf{w})

• Maximize (log) likelihood

    \mathbf{w}^{\mathrm{ML}} = \arg\max_{\mathbf{w}} \ln p(D \mid \mathbf{w})

• Predictive distribution

    p(\mathbf{x} \mid \mathbf{w}^{\mathrm{ML}})

Page 27: Regularized Maximum Likelihood

• Prior p(\mathbf{w}), posterior

    p(\mathbf{w} \mid D) \propto p(D \mid \mathbf{w}) \, p(\mathbf{w})

• MAP (maximum posterior)

    \mathbf{w}^{\mathrm{MAP}} = \arg\max_{\mathbf{w}} p(\mathbf{w} \mid D)

• Predictive distribution

    p(\mathbf{x} \mid \mathbf{w}^{\mathrm{MAP}})

• Not really Bayesian
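A minimal sketch of the idea, assuming a Gaussian likelihood with known precision and a zero-mean Gaussian prior on the mean (all numbers invented): the MAP estimate is the ML mean shrunk toward the prior:

```python
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(loc=2.0, scale=1.0, size=20)   # invented observations

lam = 1.0    # known likelihood precision
lam0 = 5.0   # precision of the zero-mean Gaussian prior over the mean

mu_ml = data.mean()
# Maximizing log p(D|mu) + log p(mu) has a closed form for this model:
mu_map = lam * data.sum() / (lam0 + lam * data.size)

print(f"ML: {mu_ml:.3f}   MAP (shrunk toward prior mean 0): {mu_map:.3f}")
```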

Page 28: Bayesian Learning

• Key idea is to marginalize over unknown parameters, rather than make point estimates

  – avoids severe over-fitting of ML and MAP
  – allows direct model comparison

• Parameters are now latent variables
• Bayesian learning is an inference problem!

Page 29: Bayesian Learning

Page 30: Bayesian Learning

Page 31: And Finally … the Exponential Family

• Many distributions can be written in the form

    p(\mathbf{x} \mid \boldsymbol{\eta}) = g(\boldsymbol{\eta}) \, f(\mathbf{x}) \exp\{ \boldsymbol{\eta}^{\mathrm{T}} \mathbf{u}(\mathbf{x}) \}

• Includes:
  – Gaussian
  – Dirichlet
  – Gamma
  – Multinomial
  – Wishart
  – Bernoulli
  – …

• Building blocks in graphs to give rich probabilistic models
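As a hedged illustration, the Bernoulli case: p(x | μ) = μ^x (1 − μ)^(1−x) fits the form with u(x) = x, natural parameter η = ln(μ / (1 − μ)) and g(η) = 1 − μ:

```python
import numpy as np

def bernoulli_direct(x, mu):
    return mu**x * (1 - mu)**(1 - x)

def bernoulli_expfam(x, mu):
    # p(x | eta) = g(eta) exp(eta * u(x)) with u(x) = x and f(x) = 1;
    # eta is the log-odds, and g(eta) = 1 / (1 + exp(eta)) = 1 - mu.
    eta = np.log(mu / (1 - mu))
    g = 1.0 / (1.0 + np.exp(eta))
    return g * np.exp(eta * x)

for mu in (0.2, 0.5, 0.9):
    for x in (0, 1):
        assert np.isclose(bernoulli_direct(x, mu), bernoulli_expfam(x, mu))
```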

Page 32: Illustration: the Gaussian

• Use precision (inverse variance) \lambda = 1/\sigma^2

    p(x \mid \mu, \lambda) = \left( \frac{\lambda}{2\pi} \right)^{1/2} \exp\left\{ -\frac{\lambda}{2} (x - \mu)^2 \right\}

• In standard form

    p(x \mid \boldsymbol{\eta}) = g(\boldsymbol{\eta}) \exp\{ \boldsymbol{\eta}^{\mathrm{T}} \mathbf{u}(x) \}, \quad
    \boldsymbol{\eta} = \begin{pmatrix} \lambda \mu \\ -\lambda/2 \end{pmatrix}, \quad
    \mathbf{u}(x) = \begin{pmatrix} x \\ x^2 \end{pmatrix}
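A quick numerical check of this rearrangement (the test points are arbitrary):

```python
import numpy as np

def gauss_precision(x, mu, lam):
    # p(x | mu, lambda) = (lambda / 2 pi)^{1/2} exp(-lambda (x - mu)^2 / 2)
    return np.sqrt(lam / (2 * np.pi)) * np.exp(-0.5 * lam * (x - mu)**2)

def gauss_expfam(x, mu, lam):
    # Same density in exponential-family form with eta = (lambda mu, -lambda / 2),
    # u(x) = (x, x^2), and g absorbing all x-independent factors.
    eta = np.array([lam * mu, -0.5 * lam])
    u = np.array([x, x**2])
    g = np.sqrt(lam / (2 * np.pi)) * np.exp(-0.5 * lam * mu**2)
    return g * np.exp(eta @ u)

for x in (-1.0, 0.0, 2.5):
    assert np.isclose(gauss_precision(x, 1.0, 4.0), gauss_expfam(x, 1.0, 4.0))
```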

Page 33: Maximum Likelihood

• Likelihood function (independent observations)

    p(D \mid \boldsymbol{\eta}) = \left( \prod_{n} f(\mathbf{x}_n) \right) g(\boldsymbol{\eta})^N \exp\left\{ \boldsymbol{\eta}^{\mathrm{T}} \sum_{n} \mathbf{u}(\mathbf{x}_n) \right\}

• Depends on the data only via the sufficient statistics \sum_n \mathbf{u}(\mathbf{x}_n), which have fixed dimension
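A minimal sketch for the Gaussian: the ML mean and precision follow from the two sufficient statistics alone, however large N is (the data is randomly generated for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(loc=1.5, scale=0.5, size=10_000)   # invented sample

# Sufficient statistics: fixed dimension (two numbers) regardless of N.
N = data.size
s1 = data.sum()        # sum_n x_n
s2 = (data**2).sum()   # sum_n x_n^2

mu_ml = s1 / N                 # ML mean
var_ml = s2 / N - mu_ml**2     # ML (biased) variance
lam_ml = 1.0 / var_ml          # ML precision

assert np.isclose(mu_ml, data.mean()) and np.isclose(var_ml, data.var())
```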

Page 34: Conjugate Priors

• Prior has the same functional form as the likelihood

    p(\boldsymbol{\eta} \mid \boldsymbol{\chi}, \nu) \propto g(\boldsymbol{\eta})^{\nu} \exp\{ \nu \, \boldsymbol{\eta}^{\mathrm{T}} \boldsymbol{\chi} \}

• Hence the posterior is of the form

    p(\boldsymbol{\eta} \mid D, \boldsymbol{\chi}, \nu) \propto g(\boldsymbol{\eta})^{\nu + N} \exp\left\{ \boldsymbol{\eta}^{\mathrm{T}} \left( \nu \boldsymbol{\chi} + \sum_{n} \mathbf{u}(\mathbf{x}_n) \right) \right\}

• Can interpret the prior as \nu effective observations of value \boldsymbol{\chi}
• Examples:
  – Gaussian prior for the mean of a Gaussian
  – Gaussian-Wishart prior for the mean and precision of a Gaussian
  – Dirichlet prior for the parameters of a discrete distribution
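A hedged sketch with the Beta-Bernoulli pair, the two-state case of the Dirichlet example above (all counts invented): the hyperparameters behave exactly like effective observations that add to the data counts:

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.binomial(1, p=0.7, size=100)   # invented coin-flip observations

# Beta(a0, b0) prior on mu: as if a0 heads and b0 tails were already seen.
a0, b0 = 2.0, 2.0

# Conjugacy: the posterior is again a Beta, with counts simply added.
a_post = a0 + data.sum()
b_post = b0 + (1 - data).sum()

posterior_mean = a_post / (a_post + b_post)
print(f"posterior: Beta({a_post:.0f}, {b_post:.0f}), mean of mu = {posterior_mean:.3f}")
```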

Page 35: Summary of Part 1

• Directed graphs

    p(\mathbf{x}) = \prod_i p(x_i \mid \mathrm{pa}_i)

• Undirected graphs

    p(\mathbf{x}) = \frac{1}{Z} \prod_C \psi_C(\mathbf{x}_C)

• Inference by message passing: belief propagation