Markov Random Field (MRF)
• Definition: A Markov random field, Markov network, or undirected graphical model is a graphical model in which a set of random variables have a Markov property described by an undirected graph. A Markov random field is similar to a Bayesian network in its representation of dependencies. It can represent certain dependencies that a Bayesian network cannot (such as cyclic dependencies); on the other hand, it cannot represent certain dependencies that a Bayesian network can (such as induced dependencies). The prototypical Markov random field is the Ising model; indeed, the Markov random field was introduced as the general setting for the Ising model.
(From Wikipedia.)
Problems solved by MRFs
• MRFs are very popular and useful in both the computer vision and computer graphics areas.
• They solve discrete labeling problems on graphical models
  – segmentation, stereo, noise removal, etc.
• Reading material
  – Graphical models: Probabilistic inference, Michael I. Jordan and Yair Weiss
• This topic could fill a course of its own; I will cover it briefly as it relates to our area.
• You might not follow all the mathematics, but you will learn how to use it. That is the goal of this class.
Things we want to be able to articulate in a spatial prior
• Favor neighboring pixels having the same state (state meaning, e.g., estimated depth or segment membership).
• Favor neighboring nodes having compatible states (the patch selected at node i should fit well with the patch selected at node j).
• But encourage state changes to occur at certain places (such as regions of high image gradient).
Graphical models: tinker toys to build complex probability distributions
http://mark.michaelis.net/weblog/2002/12/29/Tinker%20Toys%20Car.jpg
• Circles represent random variables.
• Lines represent statistical dependencies.
• There is a corresponding equation that gives P(x1, x2, x3, y, z), but often it's easier to understand things from the picture.
• These tinker toys for probabilities let you build up complicated probability distributions involving many variables from simple, easy-to-understand pieces.
[Figure: a small graph over the nodes x1, x2, x3, y, z]
Steps in building and using graphical models
• First, define the variables and how many discrete labels each can take, e.g. the number of depth labels.
• Second, define the function you want to optimize. Note the two common ways of framing the problem:
  – In terms of probabilities: multiply together component terms, which typically involve exponentials.
  – In terms of energies: the logs of the probabilities. Typically you add together the exponents from above.
• Third, optimize that function. For probabilities, take the mean or the max (or use some other "loss function"); for energies, take the min.
  – Find the label configuration that maximizes/minimizes the probabilities/energies.
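The three steps can be sketched on a toy 1-D labeling problem. The quadratic data term, Potts-style smoothness term, and weight `lam` below are illustrative choices, not from the slides:

```python
from itertools import product

# Toy 1-D labeling problem (illustrative terms, not from the slides):
# step 1: binary labels per pixel; step 2: an energy with a quadratic
# data term and a Potts smoothness term; step 3: optimize (exhaustively,
# since the label space here is tiny).
def energy(labels, obs, lam=1.0):
    data = sum((x - y) ** 2 for x, y in zip(labels, obs))    # data term
    smooth = sum(labels[i] != labels[i + 1]                  # Potts term
                 for i in range(len(labels) - 1))
    return data + lam * smooth

obs = [0.1, 0.9, 0.8]
best = min(product([0, 1], repeat=3), key=lambda x: energy(x, obs))
print(best)  # → (1, 1, 1): with lam=1.0 the smoothness term forces one flat label
```

Lowering `lam` to 0.05 lets the data term win and the minimizer becomes (0, 1, 1): the weight trades off the prior against the observations.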
Standard Form of MRF optimization

argmax_x P(x|y) = argmax_x Π_i Φ(x_i, y_i) · Π_{i,j} Ψ(x_i, x_j)
                = argmin_x Σ_i Φ*(x_i, y_i) + Σ_{i,j} Ψ*(x_i, x_j)

• x_i: hidden variables; y_i: data (local observations)
• Φ: data compatibility function; Ψ: neighborhood compatibility function over neighboring nodes (i, j)
• Note: Φ* = −log Φ, Ψ* = −log Ψ

Our solution is the configuration of x that maximizes the probability / minimizes the energy.
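The equivalence of the two forms (product of compatibilities vs. sum of negative logs) can be checked numerically. The Φ and Ψ values below are made up for illustration, with y held fixed:

```python
import math
from itertools import product

# Made-up compatibilities for two neighboring binary nodes (y held fixed):
phi = {0: 0.6, 1: 0.4}            # data term Phi(x_i, y_i)
psi = [[0.9, 0.1], [0.1, 0.9]]    # neighbor term Psi(x_i, x_j)

def prob(x1, x2):                 # product form (to maximize)
    return phi[x1] * phi[x2] * psi[x1][x2]

def en(x1, x2):                   # energy form: Phi* = -log Phi, Psi* = -log Psi
    return -math.log(phi[x1]) - math.log(phi[x2]) - math.log(psi[x1][x2])

states = list(product([0, 1], repeat=2))
best_max = max(states, key=lambda s: prob(*s))
best_min = min(states, key=lambda s: en(*s))
print(best_max, best_min)  # the same configuration wins both ways
```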
MRF – Graphical Model
• Typical setup of an MRF in image processing: a grid of hidden nodes, each connected to its neighbors and to a local observation. (Note: x and y are swapped in this figure.)
• Good news: a standard solver is available.
  – http://vision.middlebury.edu/MRF/
Methods for solving MRFs
• Iterated conditional modes (ICM)
  – Described in Winkler, 1995; introduced by Besag in 1986.
• Gibbs sampling, simulated annealing
  – Pros: finds the global MAP solution. Cons: takes forever.
• Variational methods
  – Tommi Jaakkola's tutorial on variational methods: http://www.ai.mit.edu/people/tommi/
  – Example: mean field
• State-of-the-art (standard) techniques in computer vision:
  – Belief propagation
  – Graph cuts
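As a concrete example of the simplest of these solvers, here is a minimal ICM sketch for a 1-D chain; the energy terms and helper names are assumptions for illustration, not from the slides:

```python
# A minimal ICM sketch for a 1-D chain (assumed toy setup): each node
# greedily takes the label that minimizes its local energy, holding its
# neighbors fixed, sweeping repeatedly over the chain.
def icm(obs, labels=(0, 1), lam=1.0, sweeps=5):
    # Initialize from the data term alone.
    x = [min(labels, key=lambda l: (l - y) ** 2) for y in obs]
    for _ in range(sweeps):
        for i in range(len(x)):
            def local(l):
                e = (l - obs[i]) ** 2                  # data term
                if i > 0:
                    e += lam * (l != x[i - 1])         # left neighbor (Potts)
                if i < len(x) - 1:
                    e += lam * (l != x[i + 1])         # right neighbor (Potts)
                return e
            x[i] = min(labels, key=local)
    return x

print(icm([0.1, 0.9, 0.8, 0.2, 0.1]))  # → [1, 1, 1, 0, 0]
```

Note how the first pixel is pulled to label 1 by its smooth neighbors even though its data term prefers 0; ICM only reaches a local minimum of the energy.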
Comparison of graph cuts and belief propagation
Comparison of Graph Cuts with Belief Propagation for Stereo, using Identical MRF Parameters, ICCV 2003. Marshall F. Tappen, William T. Freeman.
Graph cuts versus belief propagation
• Graph cuts consistently gave slightly lower-energy solutions for that stereo-problem MRF, although BP ran faster (though there is now a faster graph cuts implementation than the one we used…).
• Conclusion: better results can be obtained with better-defined energies.
• Personally, I prefer belief propagation, for these reasons:
  – It works with any compatibility functions, not a restricted set as with graph cuts.
  – I find it very intuitive.
  – Extensions: the sum-product algorithm computes the MMSE estimate, and generalized belief propagation gives very accurate solutions, at a cost in time.
Belief propagation: the nosey neighbor rule
“Given everything that I know, here’s what I think you should think”
(Given the probabilities of my being in different states, and how my states relate to your states, here’s what I think the probabilities of your states should be)
Reminder: Standard Form of MRF optimization

argmax_x P(x|y) = argmax_x Π_i Φ(x_i, y_i) · Π_{i,j} Ψ(x_i, x_j)
                = argmin_x Σ_i Φ*(x_i, y_i) + Σ_{i,j} Ψ*(x_i, x_j)

• x_i: hidden variables; y_i: data (local observations)
• Φ: data compatibility function; Ψ: neighborhood compatibility function over neighboring nodes (i, j)
• Note: Φ* = −log Φ, Ψ* = −log Ψ

Our solution is the configuration of x that maximizes the probability / minimizes the energy.
Belief propagation messages

To send a message: multiply together all the incoming messages, except from the node you're sending to, then multiply by the compatibility matrix and marginalize over the sender's states:

M_{j→i}(x_i) = Σ_{x_j} Ψ(x_i, x_j) Π_{k ∈ N(j)\i} M_{k→j}(x_j)

A message can be thought of as a set of weights on each of your possible states.
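The message update can be written directly in code. The helper name `send_message`, the dictionary keyed by (sender, receiver), and the incoming-message values below are hypothetical illustration choices:

```python
# One BP message update (hypothetical helper; messages stored in a dict
# keyed by (sender, receiver) node ids).
def send_message(j, i, psi, messages, neighbors, n_states=2):
    """M_{j->i}(x_i) = sum_{x_j} psi(x_i, x_j) * prod_{k in N(j), k != i} M_{k->j}(x_j)."""
    out = []
    for xi in range(n_states):
        total = 0.0
        for xj in range(n_states):
            prod = psi[xi][xj]        # compatibility matrix entry
            for k in neighbors[j]:    # all neighbors of the sender j...
                if k != i:            # ...except the receiver i
                    prod *= messages[(k, j)][xj]
            total += prod             # marginalize over the sender's states
        out.append(total)
    return out

# Made-up incoming message for illustration:
psi = [[0.9, 0.1], [0.1, 0.9]]
neighbors = {2: [1, 3]}
messages = {(3, 2): [0.26, 0.74]}
print(send_message(2, 1, psi, messages, neighbors))  # approximately [0.308, 0.692]
```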
Beliefs

To find a node's beliefs: multiply together all the messages coming in to that node:

b_j(x_j) = Π_{k ∈ N(j)} M_{k→j}(x_j)
Simple BP example

A three-node chain x1 – x2 – x3, with observation y1 attached to x1 through Φ(x1, y1) and observation y3 attached to x3 through Φ(x3, y3). Absorbing the observations into messages:

Ψ(x1, x2) = Ψ(x2, x3) = [ .9  .1
                          .1  .9 ]

M_1^{y1} = ( .6, .4 ),  M_3^{y3} = ( .2, .8 )

The Ψ terms are defined based on your prior; the observation messages are defined based on the goodness of fit to the observed data.
Simple BP example (continued)

The joint probability is

P(x1, x2, x3 | y1, y3) ∝ Ψ(x1, x2) Ψ(x2, x3) M_1^{y1}(x1) M_3^{y3}(x3)

To find the marginal probability for each variable, you can
(a) marginalize out the other variables of the joint above, or
(b) run belief propagation (BP). BP redistributes the various partial sums, leading to a very efficient calculation.
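Routes (a) and (b) can be checked against each other with the numbers from this example:

```python
from itertools import product

psi = [[0.9, 0.1], [0.1, 0.9]]   # psi(x1,x2) = psi(x2,x3), from the example
m_y1 = [0.6, 0.4]                # observation message into x1
m_y3 = [0.2, 0.8]                # observation message into x3

# (a) Brute force: marginalize the unnormalized joint over x2 and x3.
p1 = [0.0, 0.0]
for x1, x2, x3 in product(range(2), repeat=3):
    p1[x1] += psi[x1][x2] * psi[x2][x3] * m_y1[x1] * m_y3[x3]
z = sum(p1)
p1 = [p / z for p in p1]

# (b) Belief propagation: pass messages leftward along the chain,
# then multiply incoming messages at x1 and normalize.
m_32 = [sum(psi[x2][x3] * m_y3[x3] for x3 in range(2)) for x2 in range(2)]
m_21 = [sum(psi[x1][x2] * m_32[x2] for x2 in range(2)) for x1 in range(2)]
b1 = [m_y1[x1] * m_21[x1] for x1 in range(2)]
zb = sum(b1)
b1 = [b / zb for b in b1]

print(p1, b1)  # the two marginals agree
```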
Belief and message updates

M_{j→i}(x_i) = Σ_{x_j} Ψ(x_i, x_j) Π_{k ∈ N(j)\i} M_{k→j}(x_j)

b_j(x_j) = Π_{k ∈ N(j)} M_{k→j}(x_j)
Optimal solution in a chain or tree: Belief Propagation
• "Do the right thing" Bayesian algorithm.
• For Gaussian random variables over time: the Kalman filter.
• For hidden Markov models: the forward/backward algorithm (the MAP variant is Viterbi).
• Caution: for cyclic graphs there is no proof that BP converges, but empirically it usually does.
References on BP and GBP
• J. Pearl, 1985
  – the classic
• Y. Weiss, NIPS 1998
  – inspired the application of BP to vision
• W. Freeman et al., Learning low-level vision, IJCV 1999
  – applications in super-resolution, motion, and shading/paint discrimination
• H. Shum et al., ECCV 2002
  – application to stereo
• M. Wainwright, T. Jaakkola, A. Willsky
  – reparameterization version
• J. Yedidia, AAAI 2000
  – the clearest place to read about BP and GBP
Applications of MRFs
• Image denoising
• Stereo
• Motion estimation
• Labelling shading and reflectance
• Many others…
Denoising
• Each pixel is a node.
• Each pixel intensity is a state of the node.
[Figure: input, Gaussian filter, median filter, MRF, ground truth]
Motion estimation
• Each motion direction is a state of the node.
• Segmentation information can also be included in the prior.
From Zitnick et al., ICCV'05
Intrinsic Image Estimation
[Figure: input image; reflectance image with propagation; reflectance image without propagation]
Tappen et al., NIPS'02
Summary
• MRFs solve discrete labelling / graph partitioning problems.
• MRFs are useful, with many applications in vision and graphics; they are also useful in other areas.
• To use an MRF:
  – define your data term from the observations
  – define your neighbor term from priors
• A standard solver (supporting both BP and graph cuts) is available:
  http://vision.middlebury.edu/MRF/