Markov Random Field (MRF)
• Definition: A Markov random field, Markov network, or undirected graphical model is a graphical model in which a set of random variables have a Markov property described by an undirected graph. A Markov random field is similar to a Bayesian network in its representation of dependencies. It can represent certain dependencies that a Bayesian network cannot (such as cyclic dependencies); on the other hand, it cannot represent certain dependencies that a Bayesian network can (such as induced dependencies). The prototypical Markov random field is the Ising model; indeed, the Markov random field was introduced as the general setting for the Ising model.
(From Wikipedia.)
Problems solved by MRFs
• MRFs are very popular and useful in both the computer vision and computer graphics areas.
• They solve discrete labeling problems on graphical models
  – segmentation, stereo, noise removal, etc.
• Reading material
  – Graphical models: Probabilistic inference, Michael I. Jordan and Yair Weiss
• This topic could fill a course of its own; I will cover it briefly as it relates to our area.
• You might not follow all the mathematics, but you will learn how to use it. That is the goal of this class.
Things we want to be able to articulate in a spatial prior
• Favor neighboring pixels having the same state (state meaning, e.g., estimated depth or segment membership).
• Favor neighboring nodes having compatible states (the patch selected at node i should fit well with the patch selected at node j).
• But encourage state changes to occur at certain places (such as regions of high image gradient).
Graphical models: tinker toys to build complex probability distributions
http://mark.michaelis.net/weblog/2002/12/29/Tinker%20Toys%20Car.jpg
• Circles represent random variables.
• Lines represent statistical dependencies.
• There is a corresponding equation that gives P(x1, x2, x3, y, z), but often it's easier to understand things from the picture.
• These tinker toys for probabilities let you build up complicated probability distributions involving many variables from simple, easy-to-understand pieces.
[Figure: a small graph over the nodes x1, x2, x3, y, z]
Steps in building and using graphical models
• First, define the variables and how many discrete labels each can take, e.g. the number of depth labels.
• Second, define the function you want to optimize. Note the two common ways of framing the problem:
  – In terms of probabilities: multiply together component terms, which typically involve exponentials.
  – In terms of energies: the logs of the probabilities. Typically you add together the exponents from above.
• Third, optimize that function. For probabilities, take the mean or the max (or use some other "loss function"); for energies, take the min.
  – Find the label configuration that maximizes/minimizes the probabilities/energies.
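The three steps can be sketched on a toy 1-D labeling problem. The quadratic data term, Potts-style smoothness term, and weight `lam` below are illustrative choices, not from the slides:

```python
from itertools import product

# Toy 1-D labeling problem (illustrative terms, not from the slides):
# step 1: binary labels per pixel; step 2: an energy with a quadratic
# data term and a Potts smoothness term; step 3: optimize (exhaustively,
# since the label space here is tiny).
def energy(labels, obs, lam=1.0):
    data = sum((x - y) ** 2 for x, y in zip(labels, obs))    # data term
    smooth = sum(labels[i] != labels[i + 1]                  # Potts term
                 for i in range(len(labels) - 1))
    return data + lam * smooth

obs = [0.1, 0.9, 0.8]
best = min(product([0, 1], repeat=3), key=lambda x: energy(x, obs))
print(best)  # → (1, 1, 1): with lam=1.0 the smoothness term forces one flat label
```

Lowering `lam` to 0.05 lets the data term win and the minimizer becomes (0, 1, 1): the weight trades off the prior against the observations.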
Standard Form of MRF optimization

argmax_x P(x|y) = argmax_x Π_i Φ(x_i, y_i) · Π_{i,j} Ψ(x_i, x_j)
                = argmin_x Σ_i Φ*(x_i, y_i) + Σ_{i,j} Ψ*(x_i, x_j)

• x_i: hidden variables; y_i: data (local observations)
• Φ: data compatibility function; Ψ: neighborhood compatibility function over neighboring nodes (i, j)
• Note: Φ* = −log Φ, Ψ* = −log Ψ

Our solution is the configuration of x that maximizes the probability / minimizes the energy.
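The equivalence of the two forms (product of compatibilities vs. sum of negative logs) can be checked numerically. The Φ and Ψ values below are made up for illustration, with y held fixed:

```python
import math
from itertools import product

# Made-up compatibilities for two neighboring binary nodes (y held fixed):
phi = {0: 0.6, 1: 0.4}            # data term Phi(x_i, y_i)
psi = [[0.9, 0.1], [0.1, 0.9]]    # neighbor term Psi(x_i, x_j)

def prob(x1, x2):                 # product form (to maximize)
    return phi[x1] * phi[x2] * psi[x1][x2]

def en(x1, x2):                   # energy form: Phi* = -log Phi, Psi* = -log Psi
    return -math.log(phi[x1]) - math.log(phi[x2]) - math.log(psi[x1][x2])

states = list(product([0, 1], repeat=2))
best_max = max(states, key=lambda s: prob(*s))
best_min = min(states, key=lambda s: en(*s))
print(best_max, best_min)  # the same configuration wins both ways
```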
MRF – Graphical Model
• Typical setup of an MRF in image processing: a grid of hidden nodes, each connected to its neighbors and to a local observation. (Note: x and y are swapped in this figure.)
• Good news: a standard solver is available.
  – http://vision.middlebury.edu/MRF/
Methods for solving MRFs
• Iterated conditional modes (ICM)
  – Described in Winkler, 1995; introduced by Besag in 1986.
• Gibbs sampling, simulated annealing
  – Pros: finds the global MAP solution. Cons: takes forever.
• Variational methods
  – Tommi Jaakkola's tutorial on variational methods: http://www.ai.mit.edu/people/tommi/
  – Example: mean field
• State-of-the-art (standard) techniques in computer vision:
  – Belief propagation
  – Graph cuts
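As a concrete example of the simplest of these solvers, here is a minimal ICM sketch for a 1-D chain; the energy terms and helper names are assumptions for illustration, not from the slides:

```python
# A minimal ICM sketch for a 1-D chain (assumed toy setup): each node
# greedily takes the label that minimizes its local energy, holding its
# neighbors fixed, sweeping repeatedly over the chain.
def icm(obs, labels=(0, 1), lam=1.0, sweeps=5):
    # Initialize from the data term alone.
    x = [min(labels, key=lambda l: (l - y) ** 2) for y in obs]
    for _ in range(sweeps):
        for i in range(len(x)):
            def local(l):
                e = (l - obs[i]) ** 2                  # data term
                if i > 0:
                    e += lam * (l != x[i - 1])         # left neighbor (Potts)
                if i < len(x) - 1:
                    e += lam * (l != x[i + 1])         # right neighbor (Potts)
                return e
            x[i] = min(labels, key=local)
    return x

print(icm([0.1, 0.9, 0.8, 0.2, 0.1]))  # → [1, 1, 1, 0, 0]
```

Note how the first pixel is pulled to label 1 by its smooth neighbors even though its data term prefers 0; ICM only reaches a local minimum of the energy.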
Comparison of graph cuts and belief propagation
Comparison of Graph Cuts with Belief Propagation for Stereo, using Identical MRF Parameters, ICCV 2003. Marshall F. Tappen, William T. Freeman.
Graph cuts versus belief propagation
• Graph cuts consistently gave slightly lower-energy solutions for that stereo-problem MRF, although BP ran faster (though there is now a faster graph cuts implementation than the one we used…).
• Conclusion: better results can be obtained with better-defined energies.
• Personally, I prefer belief propagation, for these reasons:
  – It works with any compatibility functions, not a restricted set as with graph cuts.
  – I find it very intuitive.
  – Extensions: the sum-product algorithm computes the MMSE estimate, and generalized belief propagation gives very accurate solutions, at a cost in time.
Belief propagation: the nosey neighbor rule
“Given everything that I know, here’s what I think you should think”
(Given the probabilities of my being in different states, and how my states relate to your states, here’s what I think the probabilities of your states should be)
Reminder: Standard Form of MRF optimization

argmax_x P(x|y) = argmax_x Π_i Φ(x_i, y_i) · Π_{i,j} Ψ(x_i, x_j)
                = argmin_x Σ_i Φ*(x_i, y_i) + Σ_{i,j} Ψ*(x_i, x_j)

• x_i: hidden variables; y_i: data (local observations)
• Φ: data compatibility function; Ψ: neighborhood compatibility function over neighboring nodes (i, j)
• Note: Φ* = −log Φ, Ψ* = −log Ψ

Our solution is the configuration of x that maximizes the probability / minimizes the energy.
Belief propagation messages

To send a message: multiply together all the incoming messages, except from the node you're sending to, then multiply by the compatibility matrix and marginalize over the sender's states:

M_{j→i}(x_i) = Σ_{x_j} Ψ(x_i, x_j) Π_{k ∈ N(j)\i} M_{k→j}(x_j)

A message can be thought of as a set of weights on each of your possible states.
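The message update can be written directly in code. The helper name `send_message`, the dictionary keyed by (sender, receiver), and the incoming-message values below are hypothetical illustration choices:

```python
# One BP message update (hypothetical helper; messages stored in a dict
# keyed by (sender, receiver) node ids).
def send_message(j, i, psi, messages, neighbors, n_states=2):
    """M_{j->i}(x_i) = sum_{x_j} psi(x_i, x_j) * prod_{k in N(j), k != i} M_{k->j}(x_j)."""
    out = []
    for xi in range(n_states):
        total = 0.0
        for xj in range(n_states):
            prod = psi[xi][xj]        # compatibility matrix entry
            for k in neighbors[j]:    # all neighbors of the sender j...
                if k != i:            # ...except the receiver i
                    prod *= messages[(k, j)][xj]
            total += prod             # marginalize over the sender's states
        out.append(total)
    return out

# Made-up incoming message for illustration:
psi = [[0.9, 0.1], [0.1, 0.9]]
neighbors = {2: [1, 3]}
messages = {(3, 2): [0.26, 0.74]}
print(send_message(2, 1, psi, messages, neighbors))  # approximately [0.308, 0.692]
```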
Beliefs

To find a node's beliefs: multiply together all the messages coming in to that node:

b_j(x_j) = Π_{k ∈ N(j)} M_{k→j}(x_j)
Simple BP example

A three-node chain x1 – x2 – x3, with observation y1 attached to x1 through Φ(x1, y1) and observation y3 attached to x3 through Φ(x3, y3). Absorbing the observations into messages:

Ψ(x1, x2) = Ψ(x2, x3) = [ .9  .1
                          .1  .9 ]

M_1^{y1} = ( .6, .4 ),  M_3^{y3} = ( .2, .8 )

The Ψ terms are defined based on your prior; the observation messages are defined based on the goodness of fit to the observed data.
Simple BP example (continued)

The joint probability is

P(x1, x2, x3 | y1, y3) ∝ Ψ(x1, x2) Ψ(x2, x3) M_1^{y1}(x1) M_3^{y3}(x3)

To find the marginal probability for each variable, you can
(a) marginalize out the other variables of the joint above, or
(b) run belief propagation (BP). BP redistributes the various partial sums, leading to a very efficient calculation.
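Routes (a) and (b) can be checked against each other with the numbers from this example:

```python
from itertools import product

psi = [[0.9, 0.1], [0.1, 0.9]]   # psi(x1,x2) = psi(x2,x3), from the example
m_y1 = [0.6, 0.4]                # observation message into x1
m_y3 = [0.2, 0.8]                # observation message into x3

# (a) Brute force: marginalize the unnormalized joint over x2 and x3.
p1 = [0.0, 0.0]
for x1, x2, x3 in product(range(2), repeat=3):
    p1[x1] += psi[x1][x2] * psi[x2][x3] * m_y1[x1] * m_y3[x3]
z = sum(p1)
p1 = [p / z for p in p1]

# (b) Belief propagation: pass messages leftward along the chain,
# then multiply incoming messages at x1 and normalize.
m_32 = [sum(psi[x2][x3] * m_y3[x3] for x3 in range(2)) for x2 in range(2)]
m_21 = [sum(psi[x1][x2] * m_32[x2] for x2 in range(2)) for x1 in range(2)]
b1 = [m_y1[x1] * m_21[x1] for x1 in range(2)]
zb = sum(b1)
b1 = [b / zb for b in b1]

print(p1, b1)  # the two marginals agree
```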
Belief and message updates

M_{j→i}(x_i) = Σ_{x_j} Ψ(x_i, x_j) Π_{k ∈ N(j)\i} M_{k→j}(x_j)

b_j(x_j) = Π_{k ∈ N(j)} M_{k→j}(x_j)
Optimal solution in a chain or tree: Belief Propagation
• "Do the right thing" Bayesian algorithm.
• For Gaussian random variables over time: the Kalman filter.
• For hidden Markov models: the forward/backward algorithm (the MAP variant is Viterbi).
• Caution: for cyclic graphs there is no proof that BP converges, but empirically it usually does.
References on BP and GBP
• J. Pearl, 1985
  – the classic
• Y. Weiss, NIPS 1998
  – inspired the application of BP to vision
• W. Freeman et al., Learning low-level vision, IJCV 1999
  – applications in super-resolution, motion, and shading/paint discrimination
• H. Shum et al., ECCV 2002
  – application to stereo
• M. Wainwright, T. Jaakkola, A. Willsky
  – reparameterization version
• J. Yedidia, AAAI 2000
  – the clearest place to read about BP and GBP
Applications of MRFs
• Image denoising
• Stereo
• Motion estimation
• Labelling shading and reflectance
• Many others…
Denoising
• Each pixel is a node.
• Each pixel intensity is a state of the node.
[Figure: input, Gaussian filter, median filter, MRF, ground truth]
Motion estimation
• Each motion direction is a state of the node.
• Segmentation information can also be included in the prior.
From Zitnick et al., ICCV'05
Intrinsic Image Estimation
[Figure: input image; reflectance image with propagation; reflectance image without propagation]
Tappen et al., NIPS'02
Summary
• MRFs solve discrete labelling / graph partitioning problems.
• MRFs are useful, with many applications in vision and graphics; they are also useful in other areas.
• To use an MRF:
  – define your data term from the observations
  – define your neighbor term from priors
• A standard solver (supporting both BP and graph cuts) is available:
  http://vision.middlebury.edu/MRF/