CS774. Markov Random Field : Theory and Application Lecture 04 Kyomin Jung KAIST Sep 15 2009


CS774. Markov Random Field : Theory and Application

Lecture 04

Kyomin Jung, KAIST

Sep 15 2009


Basic Idea of Belief Propagation (BP)

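Let $Z_j(x_j)$ be the (unnormalized) marginal probability of $x_j$ in the MRF restricted to the subtree rooted at $j$, and so on for the other nodes.

[Figure: a tree with root $i$ and children $j$ and $k$; the subtrees rooted at $j$ and $k$ are labeled $Z_j(x_j)$ and $Z_k(x_k)$, and the whole tree is labeled $Z_i(x_i)$.]

$$Z_i(0) = \phi_i(0)\,\bigl[\psi_{ij}(0,0)\,Z_j(0) + \psi_{ij}(0,1)\,Z_j(1)\bigr]\,\bigl[\psi_{ik}(0,0)\,Z_k(0) + \psi_{ik}(0,1)\,Z_k(1)\bigr]$$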
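The recursion on this slide can be sketched in a few lines of Python. This is only an illustration: the tree shape, the numeric potentials in `phi` and `psi`, and the function name `subtree_Z` are assumptions made for the example, not values from the lecture.

```python
def subtree_Z(node, value, children, phi, psi):
    """Return Z_node(value): the total weight of the subtree rooted at
    `node` when x_node is fixed to `value` (binary variables assumed)."""
    z = phi[node][value]
    for child in children.get(node, []):
        # Sum over the child's value, weighted by the edge potential.
        z *= sum(
            psi[(node, child)][value][xc] * subtree_Z(child, xc, children, phi, psi)
            for xc in (0, 1)
        )
    return z


# Hypothetical toy model: root i with two leaf children j and k.
children = {"i": ["j", "k"]}
phi = {"i": [1.0, 2.0], "j": [1.0, 3.0], "k": [2.0, 1.0]}
psi = {
    ("i", "j"): [[2.0, 1.0], [1.0, 2.0]],
    ("i", "k"): [[2.0, 1.0], [1.0, 2.0]],
}

z0 = subtree_Z("i", 0, children, phi, psi)
z1 = subtree_Z("i", 1, children, phi, psi)
marginal_i0 = z0 / (z0 + z1)  # normalized marginal P[x_i = 0]
```

Dividing `z0` by `z0 + z1` turns the unnormalized quantities at the root into the marginal of $x_i$.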


Belief Propagation (BP)

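[Figure: nodes $i$, $j$, $k$, annotated with the node potential $\phi_j(x_j)$, the edge potential $\psi_{ij}(x_i, x_j)$, and the subtree quantity $Z_j(x_j)$.]

$$m^{t+1}_{j \to i}(x_i) = \sum_{x_j} \phi_j(x_j)\,\psi_{ij}(x_i, x_j) \prod_{k \in N(j) \setminus i} m^{t}_{k \to j}(x_j)$$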


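Belief Propagation (BP)

$$m^{t+1}_{j \to i}(x_i) = \sum_{x_j} \phi_j(x_j)\,\psi_{ij}(x_i, x_j) \prod_{k \in N(j) \setminus i} m^{t}_{k \to j}(x_j)$$

Belief at node $i$ at time $t$:

$$b^{t}_i(x_i) = \phi_i(x_i) \prod_{j \in N(i)} m^{t}_{j \to i}(x_i)$$

where $N(i)$ is the set of neighbors of $i$.

For $t > n$, $m^{t+1}_{j \to i}(x_i) = m^{t}_{j \to i}(x_i)$ and $b^{t+1}_i(x_i) = b^{t}_i(x_i)$.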
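A minimal Python sketch of the message update and the belief computation, assuming binary variables and synchronous updates from all-ones messages. The names `run_bp` and `exact_marginals`, and the toy potentials, are hypothetical. The brute-force routine is included only to check that BP is exact on a small tree.

```python
from itertools import product


def run_bp(nodes, edges, phi, psi, num_iters):
    """Synchronous sum-product BP for a binary pairwise MRF.
    phi[v][x] is the node potential; psi[(u, v)][xu][xv] is the edge
    potential, stored once per undirected edge. Returns normalized beliefs."""
    nbrs = {v: [] for v in nodes}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)

    def pot(i, j, xi, xj):
        # psi_ij(x_i, x_j), whichever orientation the edge was stored in.
        return psi[(i, j)][xi][xj] if (i, j) in psi else psi[(j, i)][xj][xi]

    # msgs[(j, i)][x_i] is the message from j to i; start from all ones.
    msgs = {(j, i): [1.0, 1.0] for i in nodes for j in nbrs[i]}
    for _ in range(num_iters):
        new_msgs = {}
        for (j, i) in msgs:
            out = []
            for xi in (0, 1):
                total = 0.0
                for xj in (0, 1):
                    term = phi[j][xj] * pot(i, j, xi, xj)
                    for k in nbrs[j]:
                        if k != i:  # product over N(j) \ i
                            term *= msgs[(k, j)][xj]
                    total += term
                out.append(total)
            new_msgs[(j, i)] = out
        msgs = new_msgs

    beliefs = {}
    for i in nodes:
        b = [phi[i][x] for x in (0, 1)]
        for j in nbrs[i]:
            b = [b[x] * msgs[(j, i)][x] for x in (0, 1)]
        total = sum(b)
        beliefs[i] = [bx / total for bx in b]  # normalize to a distribution
    return beliefs


def exact_marginals(nodes, edges, phi, psi):
    """Brute-force marginals by summing over all 2^n assignments."""
    sums = {v: [0.0, 0.0] for v in nodes}
    for values in product((0, 1), repeat=len(nodes)):
        x = dict(zip(nodes, values))
        w = 1.0
        for v in nodes:
            w *= phi[v][x[v]]
        for (u, v) in edges:
            w *= psi[(u, v)][x[u]][x[v]]
        for v in nodes:
            sums[v][x[v]] += w
    z = sum(sums[nodes[0]])
    return {v: [s / z for s in sums[v]] for v in nodes}


# Hypothetical toy tree: i is connected to j and to k.
nodes = ["i", "j", "k"]
edges = [("i", "j"), ("i", "k")]
phi = {"i": [1.0, 2.0], "j": [1.0, 3.0], "k": [2.0, 1.0]}
psi = {("i", "j"): [[2.0, 1.0], [1.0, 2.0]], ("i", "k"): [[2.0, 1.0], [1.0, 2.0]]}
bp = run_bp(nodes, edges, phi, psi, num_iters=3)
exact = exact_marginals(nodes, edges, phi, psi)
```

On this three-node tree the messages stop changing after two rounds, so the beliefs in `bp` should agree with `exact` up to floating-point error.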


Properties of BP (and MP)

Exact for trees
• Each node separates the graph into 2 disjoint components
• On a tree, the BP algorithm converges in time proportional to the diameter of the graph (at most linear in the number of nodes)

For general graphs
• Exact inference is NP-hard
• Constant-factor approximate inference is also hard


Loopy Belief Propagation

Approaches for general graphs

Exact inference
• Computation tree based approach (for graphs with large girth)
• Junction tree algorithm (for graphs with bounded tree width)
• Graph cut algorithm (for submodular MRFs)

Approximate inference
• Loopy BP
• Sampling based algorithms
• Graph decomposition based approximation


Loopy Belief Propagation

If BP is used on graphs with loops, messages may circulate indefinitely.

Empirically, a good approximation is still achievable:
• Stop after a fixed number of iterations
• Stop when there is no significant change in beliefs
• If the solution does not oscillate but converges, it is usually a good approximation

Example: LDPC codes
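The two stopping rules above can be sketched as follows. This is an illustrative Python sketch, not the lecture's implementation: it assumes binary variables and a single symmetric edge potential `psi` shared by all edges. The function names, the 4-cycle, and the numeric potentials are hypothetical.

```python
def beliefs_from(msgs, nbrs, phi):
    """Normalized beliefs b_i(x_i) computed from the current messages."""
    beliefs = {}
    for i in nbrs:
        b = list(phi[i])
        for j in nbrs[i]:
            b = [b[x] * msgs[(j, i)][x] for x in (0, 1)]
        total = sum(b)
        beliefs[i] = [bx / total for bx in b]
    return beliefs


def loopy_bp(nbrs, phi, psi, max_iters=100, tol=1e-9):
    """Loopy sum-product BP. Stops after `max_iters` rounds, or earlier
    when no belief changes by more than `tol` between two rounds.
    Returns (beliefs, iterations_used, converged)."""
    msgs = {(j, i): [0.5, 0.5] for i in nbrs for j in nbrs[i]}
    beliefs = beliefs_from(msgs, nbrs, phi)
    iters, converged = 0, False
    while iters < max_iters and not converged:
        new_msgs = {}
        for (j, i) in msgs:
            out = []
            for xi in (0, 1):
                total = 0.0
                for xj in (0, 1):
                    term = phi[j][xj] * psi[xi][xj]
                    for k in nbrs[j]:
                        if k != i:
                            term *= msgs[(k, j)][xj]
                    total += term
                out.append(total)
            scale = sum(out)
            new_msgs[(j, i)] = [o / scale for o in out]  # keep messages bounded
        msgs = new_msgs
        iters += 1
        new_beliefs = beliefs_from(msgs, nbrs, phi)
        change = max(
            abs(new_beliefs[i][x] - beliefs[i][x]) for i in nbrs for x in (0, 1)
        )
        beliefs = new_beliefs
        converged = change < tol
    return beliefs, iters, converged


# Hypothetical example: a 4-cycle with weak attractive couplings.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
phi = {0: [1.0, 2.0], 1: [1.0, 1.0], 2: [2.0, 1.0], 3: [1.0, 1.0]}
psi = [[1.5, 1.0], [1.0, 1.5]]
beliefs, iters, converged = loopy_bp(cycle, phi, psi)
```

When `converged` is false after `max_iters` rounds, the messages may be oscillating, which is the case the slide warns about.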


Fixed point of BP

The messages of BP at time $t$ form a $2|E|$-dimensional real vector. Let $M(t)$ be this vector.

If we normalize so that $\|M(t)\|_1 = 2|E|$, the output of BP (the marginal probabilities) is the same.

The BP algorithm is a continuous function that maps $M(t)$ to $M(t+1)$:

$$\mathrm{BP} : \{x \in \mathbb{R}^{2|E|} : \|x\|_1 = 2|E|\} \to \{x \in \mathbb{R}^{2|E|} : \|x\|_1 = 2|E|\}$$

Hence, by the Brouwer fixed-point theorem, BP has at least one fixed point, since the domain (restricted to nonnegative vectors, as messages are nonnegative) is a convex, compact set.
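A small Python check of the normalization claim: rescaling the messages by positive constants leaves the normalized beliefs unchanged. The helper names `belief` and `normalize` and the numbers are assumptions for illustration. Normalizing every message to sum to 1 is just one convenient choice of scaling.

```python
def belief(phi_i, incoming):
    """Normalized belief at a node from its potential and incoming messages."""
    b = list(phi_i)
    for msg in incoming:
        b = [bx * mx for bx, mx in zip(b, msg)]
    total = sum(b)
    return [bx / total for bx in b]


def normalize(msg, target=1.0):
    """Rescale a message so that its entries sum to `target`."""
    s = sum(msg)
    return [target * m / s for m in msg]


# Hypothetical node potential and two raw (unnormalized) incoming messages.
phi_i = [1.0, 2.0]
raw = [[5.0, 7.0], [5.0, 4.0]]

b_raw = belief(phi_i, raw)
b_norm = belief(phi_i, [normalize(m) for m in raw])
```

Both calls give the same distribution because each message's scale factor cancels when the belief is normalized.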


Fixed point of BP

Now the important questions are:
• "Is there a unique fixed point?"
• "Does BP converge to a fixed point?"
• "If it does, how fast?"

These questions are current research topics:
• E.g., studying them for restricted classes of MRFs (e.g., graphs with large girth)
• Studying the relation of BP fixed points to other values (e.g., minima of the Bethe free energy)


Girth of a Graph

For a graph G=(V,E), the girth of G is the length of a shortest cycle contained in G.

If $G$ has girth $\Omega(\log n)$ and bounded degree, and the MRF satisfies exponential (spatial) correlation decay, then BP computes a good approximation of the solution.
• Proof: by considering the computation tree of BP
• It can be used to design a system based on an MRF. Ex: LDPC codes
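To make the definition of girth concrete, here is a short Python sketch that computes it by running a breadth-first search from every vertex. It assumes a simple undirected graph given as an adjacency dictionary; the function name `girth` and the example graphs are illustrative.

```python
from collections import deque


def girth(adj):
    """Length of a shortest cycle in a simple undirected graph,
    or None if the graph has no cycle."""
    best = None
    for source in adj:
        dist = {source: 0}
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w:
                    # Non-tree edge (u, w) closes a cycle through the source
                    # of length at most dist[u] + dist[w] + 1.
                    length = dist[u] + dist[w] + 1
                    if best is None or length < best:
                        best = length
    return best


square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
triangle_with_tail = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
path = {0: [1], 1: [0, 2], 2: [1]}
```

Taking the minimum over all sources is what makes the result exact: a search rooted at a vertex on a shortest cycle detects that cycle at its true length.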


Computation Tree of BP

[Figure: a graph $G$ and the computation tree of $G$ at $x_1$.]
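The computation tree can be built by unrolling the graph from the root: each tree node's children are all graph neighbors of that node except the one it was reached from. A minimal Python sketch of this standard construction follows. The triangle graph and the names `computation_tree` and `tree_size` are assumptions for the example.

```python
def computation_tree(adj, root, depth, parent=None):
    """Return the computation tree of depth `depth` at `root` as nested
    (label, children) tuples. Labels repeat when the graph has cycles."""
    if depth == 0:
        return (root, [])
    children = [
        computation_tree(adj, w, depth - 1, root)
        for w in adj[root]
        if w != parent  # do not step straight back to where we came from
    ]
    return (root, children)


def tree_size(tree):
    """Number of nodes in a (label, children) tree."""
    return 1 + sum(tree_size(child) for child in tree[1])


# Hypothetical example graph: a triangle on x1, x2, x3.
triangle = {"x1": ["x2", "x3"], "x2": ["x1", "x3"], "x3": ["x1", "x2"]}
tree = computation_tree(triangle, "x1", depth=2)
```

If the girth of the graph exceeds twice the depth, no label repeats and the tree is just the neighborhood of the root, which is why large girth helps the analysis.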


Correlation Decay

(Temporal) Decay of correlations in Markov chains

A Markov chain with a given transition matrix satisfies decay of correlations (mixes) if and only if it is aperiodic.

(Spatial) Decay of correlations

Same thing, but time is replaced by a “spatial” distance.
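For the temporal (Markov chain) case, the following Python sketch contrasts an aperiodic two-state chain with a periodic one. Both transition matrices are hypothetical and both chains are irreducible, which the example assumes. The helper names are illustrative.

```python
def step(dist, P):
    """One step of the chain: row vector `dist` times transition matrix P."""
    n = len(P)
    return [sum(dist[a] * P[a][b] for a in range(n)) for b in range(n)]


def distribution_after(P, start, t):
    """Distribution of the state after t steps when starting in `start`."""
    dist = [1.0 if s == start else 0.0 for s in range(len(P))]
    for _ in range(t):
        dist = step(dist, P)
    return dist


# Aperiodic chain: forgets its starting state as t grows.
aperiodic = [[0.9, 0.1], [0.2, 0.8]]
# Periodic chain: deterministically alternates, so it never forgets.
periodic = [[0.0, 1.0], [1.0, 0.0]]

from_0 = distribution_after(aperiodic, 0, 200)
from_1 = distribution_after(aperiodic, 1, 200)
even = distribution_after(periodic, 0, 10)
odd = distribution_after(periodic, 0, 11)
```

For the aperiodic chain, `from_0` and `from_1` both approach the stationary distribution (2/3, 1/3). For the periodic chain, the state at time $t$ is fully determined by the starting state and the parity of $t$.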


Correlation Decay

A sequence of spatially (graph) related random variables $(X_v, v \in V)$ exhibits correlation decay (long-range independence) if, when $g$ is large,

$$P[X_v = x_v \mid B_g = y] \approx P[X_v = x_v], \quad \forall y$$

Principal motivation: statistical physics. Uniqueness of Gibbs measures on infinite lattices, Dobrushin [60s].
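A concrete instance of spatial correlation decay, sketched in Python under simplifying assumptions: a path $0 - 1 - \dots - g$ of binary variables, one shared edge potential `psi`, and no node potentials. The code measures how much the conditional distribution of $X_0$ depends on the value $y$ fixed at distance $g$. All names and numbers are illustrative.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
        for r in range(2)
    ]


def conditional_at_origin(psi, g, y):
    """P[X_0 = . | X_g = y] on the path 0 - 1 - ... - g.
    Summing out the interior nodes gives the g-th power of psi."""
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(g):
        power = matmul(power, psi)
    column = [power[0][y], power[1][y]]
    total = sum(column)
    return [c / total for c in column]


def influence(psi, g):
    """How far P[X_0 = 0 | X_g = y] moves when y is flipped."""
    p_given_0 = conditional_at_origin(psi, g, 0)[0]
    p_given_1 = conditional_at_origin(psi, g, 1)[0]
    return abs(p_given_0 - p_given_1)


psi = [[2.0, 1.0], [1.0, 2.0]]  # hypothetical attractive coupling
decay = [influence(psi, g) for g in range(1, 8)]
```

For this potential the influence equals $3^{-g}$, so the dependence on the distant value vanishes exponentially in the distance.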


What is known about correlation decay?

Weitz [05]. Independent sets - graph

Goldberg, Martin & Paterson [05]. Coloring. General graphs

Jonasson [01]. Coloring. Regular trees

• $\Delta$ is the maximum vertex degree of $G$.

• $\lambda$ in the independent set problem is the weight for each vertex (i.e., the weight of an independent set of size $|I|$ is $\lambda^{|I|}$).

• $q$ in the coloring problem is the number of possible colors.
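The vertex weight in the independent set model can be illustrated by brute-force enumeration. In this Python sketch each independent set $I$ gets weight `lam ** len(I)` and the weights are summed into a partition function. The function name, the three-node path, and the weight value are assumptions chosen for the example.

```python
from itertools import combinations


def hardcore_partition_function(nodes, edges, lam):
    """Sum of lam ** |I| over all independent sets I of the graph."""
    edge_set = {frozenset(e) for e in edges}
    total = 0.0
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            # A subset is independent if no two of its nodes share an edge.
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2)):
                total += lam ** size
    return total


# Hypothetical example: the path a - b - c.
path_nodes = ["a", "b", "c"]
path_edges = [("a", "b"), ("b", "c")]
Z = hardcore_partition_function(path_nodes, path_edges, lam=2.0)
```

The path has five independent sets (the empty set, three singletons, and {a, c}), so with weight 2 the sum is 1 + 3·2 + 4 = 11. Setting `lam=1.0` simply counts independent sets.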