Algorithms for MAP estimation in Markov Random Fields Vladimir Kolmogorov University College London Tutorial at GDR (Optimisation Discrète, Graph Cuts et Analyse d'Images) Paris, 29 November 2005 Note: these slides contain animation


Algorithms for MAP estimation in Markov Random Fields

Vladimir Kolmogorov

University College London

Tutorial at GDR (Optimisation Discrète, Graph Cuts et Analyse d'Images) Paris, 29 November 2005

Note: these slides contain animation

Energy function

$$E(\mathbf{x} \mid \theta) = \theta_{\text{const}} + \sum_{p} \theta_p(x_p) + \sum_{(p,q)} \theta_{pq}(x_p, x_q)$$

The sum over nodes p holds the unary terms (data); the sum over edges (p,q) holds the pairwise terms (coherence).

- $x_p$ are discrete variables (for example, $x_p \in \{0,1\}$)

- $\theta_p(\cdot)$ are unary potentials

- $\theta_{pq}(\cdot,\cdot)$ are pairwise potentials
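A minimal sketch of how such an energy can be stored and evaluated. The two-node parameters below are hypothetical and are not taken from the slides; the brute-force search is only there to make the definition concrete and does not scale.

```python
from itertools import product

# Hypothetical two-node example (illustrative values only).
theta_const = 0.0
unary = {                      # unary[p][label]
    "p": [0.0, 4.0],
    "q": [2.0, 5.0],
}
pairwise = {                   # pairwise[(p, q)][label_p][label_q]
    ("p", "q"): [[0.0, 1.0],
                 [3.0, 0.0]],
}

def energy(x, theta_const, unary, pairwise):
    """Evaluate E(x | theta) for a labelling x given as {node: label}."""
    total = theta_const
    total += sum(unary[p][x[p]] for p in unary)                      # unary terms
    total += sum(t[x[p]][x[q]] for (p, q), t in pairwise.items())   # pairwise terms
    return total

def brute_force_minimum(theta_const, unary, pairwise, num_labels=2):
    """Enumerate all labellings; feasible only for tiny problems."""
    nodes = sorted(unary)
    best = None
    for labels in product(range(num_labels), repeat=len(nodes)):
        x = dict(zip(nodes, labels))
        e = energy(x, theta_const, unary, pairwise)
        if best is None or e < best[0]:
            best = (e, x)
    return best
```

The number of labellings grows exponentially with the number of nodes, which is why the algorithms in the rest of the tutorial are needed.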

Minimisation algorithms

• Min Cut / Max Flow [Ford & Fulkerson ’56]
  [Greig, Porteous, Seheult ’89]: non-iterative (binary variables)
  [Boykov, Veksler, Zabih ’99]: iterative; alpha-expansion, alpha-beta swap, … (multi-valued variables)
  + If applicable, gives very accurate results
  – Can be applied only to a restricted class of functions

• BP – Max-product Belief Propagation [Pearl ’86]
  + Can be applied to any energy function
  – In vision, results are usually worse than those of graph cuts
  – Does not always converge

• TRW – Max-product Tree-reweighted Message Passing [Wainwright, Jaakkola, Willsky ’02], [Kolmogorov ’05]
  + Can be applied to any energy function
  + For stereo, finds lower energy than graph cuts
  + Convergence guarantees for the algorithm in [Kolmogorov ’05]

Main idea: LP relaxation

• Goal: Minimize energy E(x) under constraints

$$x_p \in \{0,1\}$$

• In general, NP-hard problem!

• Relax discreteness constraints: allow $x_p \in [0,1]$

• Results in linear program. Can be solved in polynomial time!

[Figure: the energy function with discrete variables compared with its LP relaxation; the relaxation is either tight or not tight.]
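A small sketch of a relaxation that is not tight. It assumes the usual LP relaxation for pairwise energies, written with node and edge marginals (this formulation is not spelled out on the slide), and a hypothetical triangle whose pairwise terms cost 1 when the two labels agree. The code does not solve an LP; it only checks that one fractional point is feasible and cheaper than every integer labelling.

```python
from itertools import combinations, product

nodes = [0, 1, 2]
edges = list(combinations(nodes, 2))      # the three edges of a triangle

def pair_cost(a, b):
    """Hypothetical pairwise term: cost 1 when the two labels agree."""
    return 1.0 if a == b else 0.0

# Discrete problem: enumerate all 2^3 labellings.
integer_min = min(
    sum(pair_cost(x[p], x[q]) for p, q in edges)
    for x in product((0, 1), repeat=3)
)

# Fractional point: every node is half label 0 and half label 1 (x_p = 1/2),
# and every edge puts mass 1/2 on (0,1) and 1/2 on (1,0).
node_marg = {p: [0.5, 0.5] for p in nodes}
edge_marg = {e: [[0.0, 0.5], [0.5, 0.0]] for e in edges}

def is_locally_consistent(node_marg, edge_marg):
    """Check that each edge marginal sums to the node marginals at both ends."""
    for (p, q), m in edge_marg.items():
        for a in (0, 1):
            if abs(sum(m[a]) - node_marg[p][a]) > 1e-12:
                return False
            if abs(m[0][a] + m[1][a] - node_marg[q][a]) > 1e-12:
                return False
    return True

# Value of the linear objective at the fractional point.
relaxed_value = sum(
    edge_marg[(p, q)][a][b] * pair_cost(a, b)
    for p, q in edges for a in (0, 1) for b in (0, 1)
)
```

Here the integer minimum is 1 while the fractional point costs 0, so the relaxation gives a strict lower bound on this instance.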

Solving LP relaxation

• Too large for general purpose LP solvers (e.g. interior point methods)

• Solve dual problem instead of primal:
  – Formulate lower bound on the energy
  – Maximize this bound
  – When done, solves primal problem (LP relaxation)

• Two different ways to formulate lower bound
  – Via posiforms: leads to maxflow algorithm
  – Via convex combination of trees: leads to tree-reweighted message passing

[Figure: a lower bound on the energy function, shown together with the energy function with discrete variables and its LP relaxation.]


Notation and Preliminaries

Energy function - visualisation

$$E(\mathbf{x} \mid \theta) = \theta_{\text{const}} + \sum_{p} \theta_p(x_p) + \sum_{(p,q)} \theta_{pq}(x_p, x_q)$$

[Figure: node p, edge (p,q) and node q, each drawn with a row for label 0 and a row for label 1. The numbers in the drawing are the parameters, for example $\theta_p(0)$ at a node label and $\theta_{pq}(0,1)$ on the line joining two labels; here $\theta_{\text{const}} = 0$. $\theta$ is the vector of all parameters.]

Reparameterisation

[Animation: in the same two-node drawing, 1 is subtracted from two parameters (4 → 3 and 1 → 0) and added to a third (0 → 1).]

• Definition. $\theta'$ is a reparameterisation of $\theta$ if they define the same energy:

$$E(\mathbf{x} \mid \theta') = E(\mathbf{x} \mid \theta) \quad \text{for any } \mathbf{x}$$

• Maxflow, BP and TRW perform reparameterisations
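A minimal sketch of one reparameterisation step: a quantity delta is moved from an edge to one label of a node. The helper name `pass_message` and the parameter values are hypothetical (they are not the values from the slide figure); the point is only that every labelling keeps its energy.

```python
from itertools import product

def energy(x, const, unary, pairwise):
    """E(x | theta) for a labelling x given as a tuple of labels."""
    e = const + sum(unary[p][x[p]] for p in range(len(unary)))
    return e + sum(t[x[p]][x[q]] for (p, q), t in pairwise.items())

def pass_message(unary, pairwise, edge, label, delta):
    """Move `delta` from edge (p, q) to node q for one label of q.

    Subtracts delta from theta_pq(a, label) for every a and adds it to
    theta_q(label). Returns new copies; the inputs are left untouched.
    """
    p, q = edge
    new_unary = [list(u) for u in unary]
    new_pairwise = {e: [list(row) for row in t] for e, t in pairwise.items()}
    for a in range(len(new_pairwise[edge])):
        new_pairwise[edge][a][label] -= delta
    new_unary[q][label] += delta
    return new_unary, new_pairwise

# Hypothetical parameters for two nodes joined by one edge.
const = 0.0
unary = [[0.0, 4.0], [0.0, 2.0]]
pairwise = {(0, 1): [[0.0, 4.0], [5.0, 1.0]]}

unary2, pairwise2 = pass_message(unary, pairwise, edge=(0, 1), label=1, delta=1.0)

# Both parameter vectors assign the same energy to every labelling.
same_energy = all(
    energy(x, const, unary, pairwise) == energy(x, const, unary2, pairwise2)
    for x in product((0, 1), repeat=2)
)
```

Whichever label node 0 takes, exactly one of the two decremented pairwise entries is used when node 1 takes label 1, so the subtraction and the addition cancel.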

Part I: Lower bound via posiforms

(maxflow algorithm)

Lower bound via posiforms [Hammer, Hansen, Simeone ’84]

$$E(\mathbf{x} \mid \theta) = \theta_{\text{const}} + \sum_{p} \theta_p(x_p) + \sum_{(p,q)} \theta_{pq}(x_p, x_q)$$

• The unary and pairwise terms are non-negative

• $\theta_{\text{const}}$ is then a lower bound on the energy:

$$E(\mathbf{x} \mid \theta) \ge \theta_{\text{const}} \quad \forall \mathbf{x}$$

• Maximize $\theta_{\text{const}}$
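A minimal sketch of why a constant in front of non-negative terms is a lower bound. The helper below (a hypothetical name) only moves the minimum of each term into the constant, which is a valid but generally weak bound; it is not the maxflow-based maximisation discussed in this part. The parameter values are made up.

```python
from itertools import product

def energy(x, const, unary, pairwise):
    """E(x | theta) for a labelling x given as a tuple of labels."""
    e = const + sum(unary[p][x[p]] for p in range(len(unary)))
    return e + sum(t[x[p]][x[q]] for (p, q), t in pairwise.items())

def push_minima_into_constant(const, unary, pairwise):
    """Make every term non-negative by moving its minimum into the constant."""
    new_unary = []
    for u in unary:
        m = min(u)
        const += m
        new_unary.append([v - m for v in u])
    new_pairwise = {}
    for e, t in pairwise.items():
        m = min(min(row) for row in t)
        const += m
        new_pairwise[e] = [[v - m for v in row] for row in t]
    return const, new_unary, new_pairwise

# Hypothetical parameters, including a negative entry.
const = 0.0
unary = [[1.0, 3.0], [2.0, -1.0]]
pairwise = {(0, 1): [[2.0, 5.0], [4.0, 1.0]]}

lower, unary_nn, pairwise_nn = push_minima_into_constant(const, unary, pairwise)

# True minimum by enumeration, for comparison with the bound.
true_min = min(energy(x, const, unary, pairwise) for x in product((0, 1), repeat=2))
```

On this instance the bound is 1 and the true minimum is 3; closing such gaps is what maximising the constant over reparameterisations is for.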

Outline of part I

• Maximisation algorithm?
  – Consider functions of binary variables only

• Maximising lower bound for submodular functions
  – Definition of submodular functions
  – Overview of min cut/max flow
  – Reduction to max flow
  – Global minimum of the energy

• Maximising lower bound for non-submodular functions
  – Reduction to max flow
      • More complicated graph
  – Part of optimal solution

Submodular functions of binary variables

• Definition: E is submodular if every pairwise term satisfies

$$\theta_{pq}(0,0) + \theta_{pq}(1,1) \le \theta_{pq}(0,1) + \theta_{pq}(1,0)$$

• Can be converted to “canonical form”:

[Figure: an example energy in canonical form, with the zero-cost terms marked.]
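The submodularity condition is a one-line check per pairwise term. A minimal sketch, with hypothetical example tables:

```python
def is_submodular_term(t):
    """theta_pq(0,0) + theta_pq(1,1) <= theta_pq(0,1) + theta_pq(1,0)."""
    return t[0][0] + t[1][1] <= t[0][1] + t[1][0]

def is_submodular(pairwise):
    """An energy of binary variables is submodular if every pairwise term is."""
    return all(is_submodular_term(t) for t in pairwise.values())

# Hypothetical examples.
smoothing = [[0.0, 1.0], [1.0, 0.0]]     # penalises different labels
repulsive = [[1.0, 0.0], [0.0, 1.0]]     # penalises equal labels
```

A term that rewards neighbouring nodes for taking the same label passes the check; one that rewards disagreement fails it.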


Overview of min cut/max flow

Min Cut problem

[Figure: directed weighted graph with a source, a sink and three intermediate nodes.]

• Cut: S = {source, node 1}, T = {sink, node 2, node 3}

• Cost(S,T) = 1 + 1 = 2 (the total weight of the edges going from S to T)

• Task: Compute cut with minimum cost
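A minimal sketch of the cut cost: only edges that leave the source side S and enter T are counted. The graph below is hypothetical (it is not the graph drawn on the slide), and the exhaustive search over source sides is for illustration only.

```python
from itertools import combinations

def cut_cost(capacity, source_side):
    """Total weight of directed edges going from S to its complement."""
    return sum(w for (u, v), w in capacity.items()
               if u in source_side and v not in source_side)

def min_cut_brute_force(capacity, inner_nodes, source="s"):
    """Try every source side containing the source (tiny graphs only)."""
    best = None
    for k in range(len(inner_nodes) + 1):
        for subset in combinations(inner_nodes, k):
            side = {source, *subset}
            cost = cut_cost(capacity, side)
            if best is None or cost < best[0]:
                best = (cost, side)
    return best

# Hypothetical directed graph with terminals "s" and "t".
capacity = {
    ("s", 1): 3, ("s", 2): 2,
    (1, 2): 1,
    (1, "t"): 2, (2, "t"): 3,
}
```

For example, `cut_cost(capacity, {"s", 1})` counts the edges s→2, 1→2 and 1→t; the edge s→1 stays inside S and is not cut.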

Maxflow algorithm

[Animation: flow is pushed from the source to the sink and the edge capacities are updated after each step; value(flow) grows from 0 to 1 to 2.]
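A minimal sketch of an augmenting-path maxflow. It uses the Edmonds-Karp rule (shortest augmenting path found by breadth-first search), which is one standard variant and not necessarily the one animated in the slides; the graph is hypothetical.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Return (flow value, source side of a minimum cut)."""
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)              # reverse edge
    flow = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break                                 # no augmenting path left
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push flow: reduce forward capacities, increase reverse ones.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
    # Nodes still reachable from the source form the source side of a min cut.
    return flow, set(parent)

# Hypothetical directed graph with terminals "s" and "t".
capacity = {
    ("s", 1): 3, ("s", 2): 2,
    (1, 2): 1,
    (1, "t"): 2, (2, "t"): 3,
}
flow_value, source_side = max_flow(capacity, "s", "t")
```

When the search no longer reaches the sink, the value of the flow equals the cost of the cut between the reachable nodes and the rest.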


Maximising lower bound for submodular functions:

Reduction to maxflow

Maxflow algorithm and reparameterisation

[Animation: the maxflow graph is shown next to the energy in canonical form. Each push of flow is a reparameterisation of the energy: as value(flow) grows from 0 to 1 to 2, $\theta_{\text{const}}$ grows from 0 to 1 to 2.]

• On termination value(flow) = 2, and the labelling $\mathbf{x} = (0,1,1)$ uses only zero-cost terms, so it is a minimum of the energy
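A sketch of the reduction from a submodular energy of binary variables to a minimum cut. The construction below is a standard one and may differ in detail from the graph in the slides; the convention that the source side means label 0 is an assumption of this sketch, the parameters are hypothetical, and the minimum cut is found by enumeration rather than by maxflow to keep the example short.

```python
from itertools import combinations, product

def energy(x, unary, pairwise):
    e = sum(unary[p][x[p]] for p in range(len(unary)))
    return e + sum(t[x[p]][x[q]] for (p, q), t in pairwise.items())

def build_graph(unary, pairwise):
    """Return (const, capacity) such that min E = const + min cut.

    Assumed convention: x_p = 0 if p is on the source side, 1 otherwise.
    """
    const = 0.0
    cost0 = [u[0] for u in unary]          # paid when x_p = 0
    cost1 = [u[1] for u in unary]          # paid when x_p = 1
    capacity = {}
    for (p, q), t in pairwise.items():
        a, b, c, d = t[0][0], t[0][1], t[1][0], t[1][1]
        assert a + d <= b + c, "pairwise term must be submodular"
        # t(xp, xq) = a + (c-a)*xp + (d-c)*xq + (b+c-a-d)*[xp=0 and xq=1]
        const += a
        cost1[p] += c - a
        cost1[q] += d - c
        capacity[(p, q)] = capacity.get((p, q), 0.0) + (b + c - a - d)
    for p in range(len(unary)):
        m = min(cost0[p], cost1[p])        # keep terminal weights non-negative
        const += m
        capacity[("s", p)] = cost1[p] - m  # cut when p is on the sink side
        capacity[(p, "t")] = cost0[p] - m  # cut when p is on the source side
    return const, capacity

def min_cut_value(capacity, n):
    """Brute-force minimum cut over all 2^n source sides (tiny graphs only)."""
    best = float("inf")
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            side = {"s", *subset}
            cost = sum(w for (u, v), w in capacity.items()
                       if u in side and v not in side)
            best = min(best, cost)
    return best

# Hypothetical submodular energy on a chain of three nodes.
unary = [[0.0, 3.0], [2.0, 1.0], [4.0, 0.0]]
pairwise = {(0, 1): [[0.0, 2.0], [1.0, 0.0]],
            (1, 2): [[0.0, 1.0], [3.0, 0.0]]}

const, capacity = build_graph(unary, pairwise)
via_cut = const + min_cut_value(capacity, len(unary))
brute = min(energy(x, unary, pairwise) for x in product((0, 1), repeat=3))
```

Submodularity is what makes the capacity of every edge (p,q) non-negative, which a min cut/max flow solver requires.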


Maximising lower bound for non-submodular functions

Arbitrary functions of binary variables

$$E(\mathbf{x} \mid \theta) = \theta_{\text{const}} + \sum_{p} \theta_p(x_p) + \sum_{(p,q)} \theta_{pq}(x_p, x_q)$$

with non-negative unary and pairwise terms; maximize $\theta_{\text{const}}$.

• Can be solved via maxflow [Boros, Hammer, Sun ’91]
  – Specially constructed graph

• Gives solution to LP relaxation: for each node

$$x_p \in \{0, 1/2, 1\}$$

[Figure: example solution in which some nodes receive label 0 or 1 and the others receive 1/2.]

• The nodes with integer labels form part of an optimal solution [Hammer, Hansen, Simeone ’84]

Part II: Lower bound via convex combination of trees

(tree-reweighted message passing)

Convex combination of trees [Wainwright, Jaakkola, Willsky ’02]

• Goal: compute minimum of the energy for $\theta$:

$$\Phi(\theta) = \min_{\mathbf{x}} E(\mathbf{x} \mid \theta)$$

• In general, intractable!

• Obtaining lower bound:
  – Split $\theta$ into several components: $\theta = \theta^1 + \theta^2 + \ldots$
  – Compute minimum for each component:

$$\Phi(\theta^i) = \min_{\mathbf{x}} E(\mathbf{x} \mid \theta^i)$$

  – Combine $\Phi(\theta^1), \Phi(\theta^2), \ldots$ to get a bound on $\Phi(\theta)$

• Use trees!

Convex combination of trees (cont’d)

[Figure: the graph, tree T and tree T'.]

$$\theta = \tfrac{1}{2}\,\theta^{T} + \tfrac{1}{2}\,\theta^{T'}$$

$$\Phi(\theta) \ge \tfrac{1}{2}\,\Phi(\theta^{T}) + \tfrac{1}{2}\,\Phi(\theta^{T'})$$

• The right-hand side is a lower bound on the energy; maximize it
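A minimal sketch of the bound on a hypothetical triangle split into two spanning trees. The minima are computed by enumeration here, where a real implementation would run BP on each tree; all parameter values are made up.

```python
from itertools import product

def energy(x, unary, pairwise):
    e = sum(unary[p][x[p]] for p in range(len(unary)))
    return e + sum(t[x[p]][x[q]] for (p, q), t in pairwise.items())

def min_energy(unary, pairwise):
    """Phi(theta) by brute force (tiny problems only)."""
    return min(energy(x, unary, pairwise)
               for x in product((0, 1), repeat=len(unary)))

def scale(table, s):
    return [[s * v for v in row] for row in table]

# Hypothetical triangle: each edge costs 1 when its two labels agree.
unary = [[0.0, 2.0], [1.0, 0.0], [0.0, 1.0]]
agree = [[1.0, 0.0], [0.0, 1.0]]
pairwise = {(0, 1): agree, (1, 2): agree, (0, 2): agree}

# theta = 1/2 theta_T + 1/2 theta_T2.
# All nodes and edge (1,2) occur in both trees and keep their values;
# edges (0,1) and (0,2) occur in one tree only, so they are doubled there.
tree_T = {(0, 1): scale(agree, 2), (1, 2): agree}
tree_T2 = {(1, 2): agree, (0, 2): scale(agree, 2)}

phi = min_energy(unary, pairwise)
bound = 0.5 * min_energy(unary, tree_T) + 0.5 * min_energy(unary, tree_T2)

# The two trees really do average to the original energy.
decomposition_ok = all(
    energy(x, unary, pairwise)
    == 0.5 * energy(x, unary, tree_T) + 0.5 * energy(x, unary, tree_T2)
    for x in product((0, 1), repeat=3)
)
```

The inequality holds because the minimum of a sum is never smaller than the sum of the separate minima: each tree is free to pick its own best labelling.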

TRW algorithms

• Goal: find reparameterisation maximizing lower bound

• Apply sequence of different reparameterisation operations:
  – Node averaging
  – Ordinary BP on trees

• Order of operations?
  – Affects performance dramatically

• Algorithms:
  – [Wainwright et al. ’02]: parallel schedule
      • May not converge
  – [Kolmogorov ’05]: specific sequential schedule
      • Lower bound does not decrease, convergence guarantees

Node averaging

[Animation: node p carries the values (0, 1) in one tree and (4, 0) in the other; after averaging, both copies carry (2, 0.5).]
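A minimal sketch of node averaging on the numbers read off the slide, (0, 1) and (4, 0). It treats the node in isolation, which is a simplification: in the algorithm the vectors would first be turned into min-marginals by BP on each tree.

```python
def average_node(copies):
    """Replace every tree's copy of theta_p by the element-wise mean."""
    k = len(copies)
    mean = [sum(c[label] for c in copies) / k for label in range(len(copies[0]))]
    return [list(mean) for _ in copies]

before = [[0.0, 1.0], [4.0, 0.0]]      # theta_p in tree T and in tree T'
after = average_node(before)

# Contribution of node p to the bound: the average of the per-tree minima.
bound_before = sum(min(c) for c in before) / len(before)
bound_after = sum(min(c) for c in after) / len(after)
```

Before averaging the two trees prefer different labels and each reaches a minimum of 0; after averaging they share one vector whose minimum is 0.5. The minimum of an average is never below the average of the minima, which is why this step cannot lower the bound.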

Belief propagation (BP) on trees

• Send messages
  – Equivalent to reparameterising node and edge parameters

• Two passes (forward and backward)

• Key property (Wainwright et al.):

Upon termination $\theta_p$ gives min-marginals for node p:

$$\min_{\mathbf{x}:\, x_p = 0} E(\mathbf{x} \mid \theta) = \theta_p(0) + \text{const}$$

$$\min_{\mathbf{x}:\, x_p = 1} E(\mathbf{x} \mid \theta) = \theta_p(1) + \text{const}$$
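A minimal sketch of min-sum BP on a chain (the simplest tree), with one forward and one backward pass. Messages are kept in separate arrays instead of being folded into the parameters, and they are not normalised, so the constant in the formulas above is zero here. The parameter values are hypothetical.

```python
from itertools import product

def chain_min_marginals(unary, pairwise):
    """Min-sum BP on a chain 0 - 1 - ... - (n-1).

    unary[p][a] is theta_p(a); pairwise[p][a][b] belongs to edge (p, p+1).
    Returns m[p][a] = minimum energy over labellings with x_p = a.
    """
    n, k = len(unary), len(unary[0])
    fwd = [[0.0] * k for _ in range(n)]    # message arriving from the left
    bwd = [[0.0] * k for _ in range(n)]    # message arriving from the right
    for p in range(1, n):                  # forward pass
        for b in range(k):
            fwd[p][b] = min(unary[p - 1][a] + fwd[p - 1][a] + pairwise[p - 1][a][b]
                            for a in range(k))
    for p in range(n - 2, -1, -1):         # backward pass
        for a in range(k):
            bwd[p][a] = min(unary[p + 1][b] + bwd[p + 1][b] + pairwise[p][a][b]
                            for b in range(k))
    return [[unary[p][a] + fwd[p][a] + bwd[p][a] for a in range(k)]
            for p in range(n)]

def brute_force_min_marginals(unary, pairwise):
    """Reference implementation by enumeration (tiny chains only)."""
    n, k = len(unary), len(unary[0])
    result = [[float("inf")] * k for _ in range(n)]
    for x in product(range(k), repeat=n):
        e = sum(unary[p][x[p]] for p in range(n))
        e += sum(pairwise[p][x[p]][x[p + 1]] for p in range(n - 1))
        for p in range(n):
            result[p][x[p]] = min(result[p][x[p]], e)
    return result

# Hypothetical chain of three binary nodes.
unary = [[0.0, 3.0], [2.0, 1.0], [4.0, 0.0]]
pairwise = [[[0.0, 2.0], [1.0, 0.0]],
            [[0.0, 1.0], [3.0, 0.0]]]

marginals = chain_min_marginals(unary, pairwise)
```

The smallest entry of any node's min-marginal vector is the minimum of the whole energy, so each node can read off its optimal label locally when the minimiser is unique.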

TRW algorithm of Wainwright et al. with tree-based updates (TRW-T)

Repeat: Run BP on all trees → “Average” all nodes

• If it converges, gives a (local) maximum of the lower bound
• Not guaranteed to converge.
• Lower bound may go down.

Sequential TRW algorithm (TRW-S) [Kolmogorov ’05]

Repeat: Pick node p → Run BP on all trees containing p → “Average” node p

Main property of TRW-S

• Theorem: lower bound never decreases.

• Proof sketch: the minimum of the averaged vector at node p is at least the average of the minima of the individual vectors. In the example, before averaging

$$\Phi(\theta^{T}) = 0 + \text{const}, \qquad \Phi(\theta^{T'}) = 0 + \text{const}'$$

and after averaging

$$\Phi(\theta^{T}) = 0.5 + \text{const}, \qquad \Phi(\theta^{T'}) = 0.5 + \text{const}'$$

[Animation: node p carries (0, 1) and (4, 0) before averaging and (2, 0.5) in both trees afterwards.]


TRW-S algorithm

• Particular order of averaging and BP operations

• Lower bound guaranteed not to decrease

• There exists limit point that satisfies weak tree agreement condition

• Efficiency?

Efficient implementation

Pick node p → Run BP on all trees containing p → “Average” node p

• Running BP on all trees containing p at every step: inefficient?

• Key observation: Node averaging operation preserves messages oriented towards this node

• Reuse previously passed messages!

• Need a special choice of trees:
  – Pick an ordering of nodes
  – Trees: monotonic chains

[Figure: 3×3 grid with nodes numbered in the chosen order]

    1 2 3
    4 5 6
    7 8 9

• Algorithm:
  – Forward pass:
      • process nodes in the increasing order
      • pass messages from lower neighbours
  – Backward pass:
      • do the same in reverse order

• Linear running time of one iteration
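A small sketch of the bookkeeping behind the schedule on the 3×3 grid, assuming row-major numbering and taking every row and every column as a monotonic chain. That choice of chains is an assumption for illustration (the slides only require monotonic chains), and no message updates are performed here.

```python
def grid_edges(rows, cols):
    """4-connected grid; nodes numbered 1..rows*cols in row-major order."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c + 1
            if c + 1 < cols:
                edges.append((node, node + 1))       # right neighbour
            if r + 1 < rows:
                edges.append((node, node + cols))    # neighbour below
    return edges

def row_and_column_chains(rows, cols):
    """One possible set of monotonic chains: every row and every column."""
    row_chains = [[r * cols + c + 1 for c in range(cols)] for r in range(rows)]
    col_chains = [[r * cols + c + 1 for r in range(rows)] for c in range(cols)]
    return row_chains + col_chains

def is_monotonic(chain):
    """Node numbers strictly increase along the chain."""
    return all(a < b for a, b in zip(chain, chain[1:]))

def forward_schedule(edges, num_nodes):
    """For each node in increasing order, the lower neighbours it hears from."""
    return {p: sorted(q for q, r in edges if r == p)
            for p in range(1, num_nodes + 1)}

edges = grid_edges(3, 3)
chains = row_and_column_chains(3, 3)
schedule = forward_schedule(edges, 9)

# Every grid edge lies on some chain.
covered = {(a, b) for chain in chains for a, b in zip(chain, chain[1:])}
```

In the forward pass node 5, for instance, receives messages from nodes 2 and 4; the backward pass mirrors this with the higher neighbours 6 and 8. Each edge is visited once per pass, which is where the linear running time of one iteration comes from.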

Memory requirements

• Additional advantage of TRW-S:
  – Needs only half as much memory as standard message passing!
  – Similar observation for bipartite graphs and parallel schedule was made in [Felzenszwalb & Huttenlocher ’04]

[Figure: standard message passing compared with TRW-S.]

Experimental results: binary segmentation (“GrabCut”)

[Plot: energy against time, averaged over 50 instances.]

Experimental results: stereo

[Figure: left image, ground truth, BP result and TRW-S result, with a plot.]

Experimental results: stereo

[Figure: two further plots for the stereo experiments.]

Summary

• MAP estimation algorithms are based on LP relaxation
  – Maximize lower bound

• Two ways to formulate lower bound

• Via posiforms: leads to maxflow algorithm
  – Polynomial time solution
  – But: applicable for restricted energies (e.g. binary variables)
      • Submodular functions: global minimum
      • Non-submodular functions: part of optimal solution

• Via convex combination of trees: leads to TRW algorithm
  – Convergence in the limit (for TRW-S)
  – Applicable to arbitrary energy function

• Graph cuts vs. TRW:
  – Accuracy: similar
  – Generality: TRW is more general
  – Speed: for stereo TRW is currently 2-5 times slower. But:
      • 3 vs. 50 years of research!
      • More suitable for parallel implementation (GPU? Hardware?)

Discrete vs. continuous functionals

Discrete formulation (Graph cuts)

$$E(\mathbf{x}) = \sum_{p} E_p(x_p) + \sum_{(p,q)} E_{pq}(x_p, x_q)$$

• Maxflow algorithm
  – Global minimum, polynomial-time

• Metrication artefacts?

Continuous formulation (Geodesic active contours)

$$E(C) = \int_0^{|C|} g(C(s))\, ds$$

• Level sets
  – Numerical stability?

• Geometrically motivated
  – Invariant under rotation

Geo-cuts

• Continuous functional

$$E(C) = \int_0^{|C|} g(\cdot)\, ds + \int_C \langle \mathbf{N}, \mathbf{v} \rangle\, ds + \int_{\text{interior}(C)} f\, dV$$

• Construct graph such that for smooth contours C

$$E(C) \approx \text{cost of the corresponding cut}$$

• Class of continuous functionals?

[Boykov & Kolmogorov ’03], [Kolmogorov & Boykov ’05]:

– Geometric length/area (e.g. Riemannian)

– Flux of a given vector field

– Regional term