
The Exponential Formula for the Wasserstein Metric.

Katy Craig
UCLA

SIAM Annual Meeting, Chicago
July 8, 2014

1 / 22

Plan.

• Gradient flow

• Discrete gradient flow

• Euler-Lagrange equation

• Exponential formula

2 / 22

Plan.

• Gradient flow




3 / 22

Gradient Flow.

\[ \frac{\partial u(t)}{\partial t} = -\nabla E(u(t)), \quad u(0) = u \]

Heat Equation as Gradient Flow on $L^2(\mathbb{R}^d)$

$L^2(\mathbb{R}^d)$ gradient:

\[ (\nabla_{L^2} E(u), v)_{L^2} = \lim_{h \to 0} \frac{E(u + hv) - E(u)}{h} \quad \text{for all } v \in L^2(\mathbb{R}^d) \]

Thus, for $E(u) = \frac{1}{2}\int |\nabla u|^2$,

\[ (\nabla_{L^2} E(u), v)_{L^2} = \lim_{h \to 0} \frac{1}{2} \, \frac{\int |\nabla(u + hv)|^2 - \int |\nabla u|^2}{h} = (\nabla u, \nabla v)_{L^2} = (-\Delta u, v)_{L^2}. \]

Hence, the $L^2$ gradient flow of $E$ is

\[ \frac{\partial u}{\partial t} = -\nabla_{L^2} E(u) = -(-\Delta u) = \Delta u. \]

Note: $\nabla_{L^2} E(u) = \frac{\delta E}{\delta u}$

4 / 22
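The computation above can be sanity-checked numerically. The following is a minimal sketch and not part of the talk: it assumes a periodic grid on [0, 2π), forward differences for the gradient, and two hypothetical test functions `u` and `v`. It compares the difference quotient of the discretized Dirichlet energy with the discrete inner product $(-\Delta u, v)_{L^2}$.

```python
import math

def dirichlet_energy(u, dx):
    """Discrete E(u) = 1/2 * sum of |forward difference|^2 * dx on a periodic grid."""
    n = len(u)
    return 0.5 * sum(((u[(i + 1) % n] - u[i]) / dx) ** 2 for i in range(n)) * dx

def neg_laplacian(u, dx):
    """Discrete -Laplacian (second-order central stencil) with periodic boundary."""
    n = len(u)
    return [-(u[(i + 1) % n] - 2 * u[i] + u[i - 1]) / dx**2 for i in range(n)]

# Grid and test functions (illustrative choices, not from the slides).
n = 200
dx = 2 * math.pi / n
xs = [i * dx for i in range(n)]
u = [math.sin(x) for x in xs]
v = [math.cos(2 * x) + 0.5 * math.sin(x) for x in xs]

# Left-hand side: difference quotient (E(u + h v) - E(u)) / h for small h.
h = 1e-6
u_plus = [ui + h * vi for ui, vi in zip(u, v)]
difference_quotient = (dirichlet_energy(u_plus, dx) - dirichlet_energy(u, dx)) / h

# Right-hand side: discrete L2 inner product (-Laplacian u, v).
inner_product = sum(a * b for a, b in zip(neg_laplacian(u, dx), v)) * dx

print(difference_quotient, inner_product)
```

The two printed numbers should agree up to an error of order `h`, which is the discrete counterpart of the integration by parts $(\nabla u, \nabla v)_{L^2} = (-\Delta u, v)_{L^2}$.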

Examples of Hilbert Space Gradient Flow.

PDE                                                                        Energy Functional                                         Metric
Allen-Cahn: $\frac{d}{dt}u = \Delta u - F'(u)$                             $E(u) = \int \big[\frac{1}{2}|\nabla u|^2 + F(u)\big]$    $L^2$
Cahn-Hilliard: $\frac{d}{dt}u = -\Delta(\Delta u - F'(u))$                 $E(u) = \int \big[\frac{1}{2}|\nabla u|^2 + F(u)\big]$    $H^{-1}$
Porous Media / Fast Diffusion: $\frac{d}{dt}u = \Delta u^m$                $E(u) = \frac{1}{m+1}\int u^{m+1}$                        $H^{-1}$

Why gradient flow?

• Free estimates, e.g. $|u(t) - v(t)| \le e^{-\lambda t}|u(0) - v(0)|$

• Method to construct and approximate solutions (discrete gradient flow)

5 / 22

Wasserstein Gradient Flow.

\[ \frac{\partial \mu(t)}{\partial t} = -\nabla_{W_2} E(\mu(t)), \quad \mu(0) = \mu \]

Simplifying assumptions: $\int |x|^2 \, d\mu < +\infty$, $\mu$

6 / 22

Geodesics and Convexity.

Geodesics:

$\mu(\alpha) = (\alpha t_\mu^\nu + (1 - \alpha)\mathrm{id})\#\mu$ is the geodesic from $\mu$ to $\nu$ at time $\alpha$,

\[ W_2(\mu(\alpha), \mu(\beta)) = |\alpha - \beta| \, W_2(\mu, \nu). \]

Convexity:

$E$ is convex in case

\[ E(\mu(\alpha)) \le (1 - \alpha) E(\mu(0)) + \alpha E(\mu(1)). \]

Assumption: $E$ is lower semicontinuous and convex.

7 / 22
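The constant-speed property of geodesics can be illustrated in one dimension. This is a hedged sketch that is not taken from the talk: it uses equal-weight empirical measures on the line (atoms instead of densities), for which the optimal map pairs the sorted atoms of one measure with the sorted atoms of the other. The atom locations and the times `alpha`, `beta` are arbitrary illustrative choices.

```python
import math

def w2_empirical_1d(xs, ys):
    """W2 between two equal-weight empirical measures on the line.

    Assumes both lists have the same number of atoms; in 1D the optimal
    coupling matches sorted atoms to sorted atoms.
    """
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def geodesic_point(mu, nu, alpha):
    """Atoms of mu(alpha) = (alpha * t + (1 - alpha) * id)#mu, with t the monotone map mu -> nu."""
    return [(1 - alpha) * x + alpha * y for x, y in zip(sorted(mu), sorted(nu))]

mu = [-1.0, 0.0, 0.5, 2.0]
nu = [1.0, 3.0, 3.5, 7.0]
alpha, beta = 0.2, 0.7

lhs = w2_empirical_1d(geodesic_point(mu, nu, alpha), geodesic_point(mu, nu, beta))
rhs = abs(alpha - beta) * w2_empirical_1d(mu, nu)
print(lhs, rhs)
```

Both sides agree up to rounding, matching $W_2(\mu(\alpha), \mu(\beta)) = |\alpha - \beta| \, W_2(\mu, \nu)$ for this example.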

The Wasserstein Metric's "Gradient".

By a similar computation as in the $L^2$ case,

\[ \Big( \nabla_{W_2} E(\mu), \frac{\partial \mu}{\partial t}\Big|_{t=0} \Big)_\mu = \lim_{t \to 0} \frac{E(\mu(t)) - E(\mu)}{t} \quad \text{for all } \frac{\partial \mu}{\partial t}, \]

implies

\[ \nabla_{W_2} E(\mu) = -\nabla \cdot \Big( \mu \nabla \frac{\delta E}{\delta \mu} \Big). \]

Therefore,

\[ \frac{\partial \mu(t)}{\partial t} = -\nabla_{W_2} E(\mu(t)) \iff \frac{\partial \mu(t)}{\partial t} = \nabla \cdot \Big( \mu \nabla \frac{\delta E}{\delta \mu} \Big). \]

8 / 22
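As a worked instance of this formula (an illustration added here, assuming $\mu$ has a smooth, strictly positive density $\rho$), take the entropy $E(\mu) = \int \rho \log \rho \, dx$:

\[
\frac{\delta E}{\delta \mu} = \log \rho + 1,
\qquad
\nabla \cdot \Big( \rho \nabla \frac{\delta E}{\delta \mu} \Big)
= \nabla \cdot \Big( \rho \, \frac{\nabla \rho}{\rho} \Big)
= \Delta \rho .
\]

Under these assumptions, the Wasserstein gradient flow of the entropy is the heat equation $\partial_t \rho = \Delta \rho$.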

Examples of Wasserstein Gradient Flow.

PDE                                                                                     Energy Functional
Porous Media / Fast Diffusion: $\frac{\partial}{\partial t}\mu = \Delta \mu^m$          $E(\mu) = \frac{1}{m-1}\int \rho(x)^m \, dx$
Fokker-Planck: $\frac{\partial}{\partial t}\mu = \Delta \mu + \nabla \cdot (\mu \nabla V)$   $E(\mu) = \int \rho(x) \log \rho(x) + V(x)\rho(x) \, dx$
Aggregation: $\frac{\partial}{\partial t}\mu = \nabla \cdot (\mu \nabla K * \mu)$       $E(\mu) = \frac{1}{2}\iint \rho(x) K(x - y) \rho(y) \, dx \, dy$

Why gradient flow?

• Free estimates, e.g. $W_2(\mu(t), \nu(t)) \le e^{-\lambda t} W_2(\mu(0), \nu(0))$

• Method to construct and approximate solutions (discrete gradient flow)

9 / 22

Plan.

• Discrete gradient flow

10 / 22

Discrete Gradient Flow: Euclidean Space.

Gradient flow:

\[ \frac{du(t)}{dt} = -\nabla E(u(t)), \quad u(0) = u \in \mathbb{R}^d \]

Implicit Euler method:

\[ \frac{u_n - u_{n-1}}{\tau} = -\nabla E(u_n), \quad \text{i.e.} \quad \frac{u_n - u_{n-1}}{\tau} + \nabla E(u_n) = 0, \quad u_0 = u \]

Given $u_{n-1}$, compute $u_n$ using that it is a critical point of, and for convex $E$ the unique minimizer of,

\[ \Phi(v) = \frac{1}{2\tau}|v - u_{n-1}|^2 + E(v). \]

Theorem (Exponential Formula)
Let $\tau = t/n$. Then $\lim_{n \to \infty} u_n = u(t)$.

11 / 22
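The scheme above can be tried on a scalar example. This is a minimal sketch under assumptions that are not in the talk: the energy is chosen as $E(u) = u^4/4$ on $\mathbb{R}$, so the gradient flow $u' = -u^3$ has the closed-form solution $u(t) = u_0 / \sqrt{1 + 2u_0^2 t}$, and each minimization of $\Phi$ is solved by bisection on its first-order condition. The function names are hypothetical.

```python
import math

def implicit_euler_step(u_prev, tau):
    """Return the minimizer of Phi(v) = |v - u_prev|^2 / (2 tau) + v^4 / 4.

    Phi is strictly convex, so the minimizer is the unique root of the
    increasing function g(v) = v - u_prev + tau * v^3, found by bisection.
    """
    lo, hi = -abs(u_prev), abs(u_prev)  # g(lo) <= 0 <= g(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid - u_prev + tau * mid**3 > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def discrete_gradient_flow(u0, t, n):
    """Apply n implicit Euler steps of size tau = t / n starting from u0."""
    tau = t / n
    u = u0
    for _ in range(n):
        u = implicit_euler_step(u, tau)
    return u

def exact_solution(u0, t):
    """Closed-form solution of du/dt = -u^3 with u(0) = u0."""
    return u0 / math.sqrt(1 + 2 * u0**2 * t)

u0, t = 1.0, 1.0
errors = [abs(discrete_gradient_flow(u0, t, n) - exact_solution(u0, t))
          for n in (10, 100, 1000)]
print(errors)
```

The printed errors shrink as `n` grows, which is the behaviour the exponential formula predicts for this example.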

Discrete Gradient Flow: Wasserstein Metric.

Want to say... given $\mu_{n-1}$, let $\mu_n$ be the unique minimizer of

\[ \Phi(\nu) = \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu). \]

Problem: $\nu \mapsto W_2^2(\nu, \mu_{n-1})$ is not convex, so $\Phi$ may not have a unique minimum.

Need additional assumptions on $E$:

• coercive

• convex along generalized geodesics

Proposition (AGS)
For all $\tau > 0$, there exists a unique minimizer of $\Phi(\nu)$, so the discrete gradient flow is well defined.

12 / 22
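A particle example makes the minimization step concrete. This is a hedged sketch, not the general setting of the talk: it assumes a one-dimensional empirical measure with equal-weight atoms and the potential energy $E(\mu) = \int \frac{x^2}{2} \, d\mu$. For a potential energy the minimization decouples atom by atom, so each atom $y$ moves to the minimizer of $\frac{1}{2\tau}|x - y|^2 + \frac{x^2}{2}$, which is $y/(1 + \tau)$. The corresponding gradient flow $\partial_t \mu = \nabla \cdot (\mu \nabla V)$ with $V(x) = x^2/2$ transports atoms along $x' = -x$.

```python
import math

def jko_step_quadratic_potential(atoms, tau):
    """One discrete gradient flow step for E(mu) = integral of x^2 / 2 d mu.

    Each atom y is sent to argmin_x |x - y|^2 / (2 tau) + x^2 / 2 = y / (1 + tau).
    """
    return [y / (1 + tau) for y in atoms]

def w2_empirical_1d(xs, ys):
    """W2 between equal-weight empirical measures on the line (sorted matching)."""
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def discrete_flow(atoms, t, n):
    """Apply n steps of size tau = t / n."""
    tau = t / n
    for _ in range(n):
        atoms = jko_step_quadratic_potential(atoms, tau)
    return atoms

mu0 = [-2.0, -0.5, 1.0, 3.0]  # illustrative initial atoms
t = 1.0
exact = [math.exp(-t) * x for x in mu0]  # atoms following dx/dt = -x
errors = [w2_empirical_1d(discrete_flow(mu0, t, n), exact) for n in (10, 100, 1000)]
print(errors)
```

In this example the Wasserstein distance between the discrete flow and the exact flow decreases as the number of steps grows, since $(1 + t/n)^{-n} \to e^{-t}$.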

Plan.

• Euler-Lagrange equation

13 / 22

Euler-Lagrange Equation.

In the Euclidean case,

\[ u_n = \operatorname*{argmin}_{v} \Big\{ \frac{1}{2\tau}|v - u_{n-1}|^2 + E(v) \Big\} \iff \frac{u_n - u_{n-1}}{\tau} = -\nabla E(u_n). \]

In the Wasserstein case,

Proposition (AGS, C.)

\[ \mu_n = \operatorname*{argmin}_{\nu} \Big\{ \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu) \Big\} \iff \frac{1}{\tau}\big( t_{\mu_n}^{\mu_{n-1}} - \mathrm{id} \big) \in \partial_s E(\mu_n). \]

Key property of subdifferential: for $E$ convex, $0 \in \partial E(\mu) \iff \mu$ minimizes $E$.

14 / 22

Sketch of Proof: Euler-Lagrange Equation.

Proposition (AGS, C.)

\[ \mu_n = \operatorname*{argmin}_{\nu} \Big\{ \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu) \Big\} \iff \frac{1}{\tau}\big( t_{\mu_n}^{\mu_{n-1}} - \mathrm{id} \big) \in \partial_s E(\mu_n). \]

Proof: Let $\Phi(\nu) = \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu)$.

$\Longrightarrow$ [AGS, Otto]: minimality implies $\Phi(t\#\mu_n) \ge \Phi(\mu_n)$; expand both sides.

$\Longleftarrow$ want to say...

• $\frac{1}{\tau}\big( t_{\mu_n}^{\mu_{n-1}} - \mathrm{id} \big) \in \partial_s E(\mu_n)$

• hence $0 \in \partial \Phi(\mu_n)$

• hence by key property, $\mu_n$ minimizes $\Phi$

Problem: $\nu \mapsto W_2^2(\nu, \mu_{n-1})$ is not convex, so $\Phi$ may not be convex.

15 / 22

Sketch of Proof: Euler-Lagrange Equation.

Solution: generalized geodesics and transport metrics

• [AGS] $\nu \mapsto W_2^2(\nu, \mu)$ is not convex (along all geodesics)

• [AGS] $\nu \mapsto W_2^2(\nu, \mu)$ is convex along generalized geodesics with base $\mu$

• [C.] the generalized geodesics with base $\mu$ are not arbitrary curves: they are exactly the geodesics of the transport metric with base $\mu$

Wasserstein Metric: $W_2(\mu, \nu) = \big( \int |t_\mu^\nu - \mathrm{id}|^2 \, d\mu \big)^{1/2}$

Transport Metric: $W_{2,\omega}(\mu, \nu) = \big( \int |t_\omega^\mu - t_\omega^\nu|^2 \, d\omega \big)^{1/2}$

• $\nu \mapsto W_{2,\omega}^2(\nu, \mu)$ is convex

• $W_2(\mu, \nu) \le W_{2,\omega}(\mu, \nu)$

16 / 22
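The inequality $W_2 \le W_{2,\omega}$ can be checked in closed form for Gaussians. This is a hedged sketch that is not part of the talk: it assumes centered Gaussians $\mu = N(0, A)$, $\nu = N(0, B)$ on $\mathbb{R}^2$ and the base $\omega = N(0, I)$. In that case the optimal maps from $\omega$ are the linear maps $x \mapsto A^{1/2}x$ and $x \mapsto B^{1/2}x$, so $W_{2,\omega}^2$ is the squared Frobenius norm of $A^{1/2} - B^{1/2}$, while $W_2^2$ is given by the standard formula for Gaussians. The matrices and helper names are illustrative.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def sqrt_spd_2x2(m):
    """Principal square root of a symmetric positive definite 2x2 matrix.

    Uses the closed form sqrt(M) = (M + s I) / t with s = sqrt(det M)
    and t = sqrt(trace M + 2 s).
    """
    s = math.sqrt(det2(m))
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]

def w2_squared_gaussians(A, B):
    """W2^2 between N(0, A) and N(0, B): tr A + tr B - 2 tr (A^{1/2} B A^{1/2})^{1/2}."""
    trace_ab = sum(A[i][j] * B[j][i] for i in range(2) for j in range(2))
    # For 2x2 matrices, the trace of the square root follows from trace and determinant.
    cross = math.sqrt(trace_ab + 2 * math.sqrt(det2(A) * det2(B)))
    return A[0][0] + A[1][1] + B[0][0] + B[1][1] - 2 * cross

def transport_metric_squared_std_base(A, B):
    """W_{2,omega}^2 between N(0, A) and N(0, B) with base omega = N(0, I)."""
    ra, rb = sqrt_spd_2x2(A), sqrt_spd_2x2(B)
    return sum((ra[i][j] - rb[i][j]) ** 2 for i in range(2) for j in range(2))

A = [[4.0, 0.0], [0.0, 1.0]]
B = [[2.5, 1.5], [1.5, 2.5]]  # A rotated by 45 degrees, so A and B do not commute
w2_sq = w2_squared_gaussians(A, B)
w2_omega_sq = transport_metric_squared_std_base(A, B)
print(w2_sq, w2_omega_sq)
```

For these non-commuting covariances the inequality is strict; choosing `B` diagonal, so that the two covariances commute, gives equality in this Gaussian setting.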

Plan.

• Exponential formula

17 / 22

Exponential Formula.

Theorem (AGS)
Let $\tau = t/n$. Then $\lim_{n \to \infty} \mu_n = \mu(t)$.

• the limit exists

• the limit is a solution to the gradient flow

Sketch of Proof, à la Crandall and Liggett:

Let $J_\tau$ be the function $J_\tau u = \operatorname*{argmin}_{v} \big\{ \frac{1}{2\tau}|v - u|^2 + E(v) \big\}$ $\Longrightarrow$ $J_\tau^n u_0 = u_n$.

1. Contraction inequality

Banach space: $\|J_\tau u - J_\tau v\| \le \|u - v\|$

Theorem (Carlen, C.)
$W_2^2(J_\tau \mu, J_\tau \nu) \le W_2^2(\mu, \nu) + O(\tau^2)$

18 / 22

Exponential Formula.

2. Large vs small time steps, $0 < h \le \tau$

Banach space: $J_\tau u = J_h\big[ \frac{\tau - h}{\tau} J_\tau u + \frac{h}{\tau} u \big]$

Lemma (Jost, Mayer, C.)
$J_\tau \mu = J_h\big[ \big( \frac{\tau - h}{\tau} t_\mu^{J_\tau \mu} + \frac{h}{\tau} \mathrm{id} \big) \# \mu \big]$

19 / 22
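The two Euclidean facts used so far, the contraction inequality and the relation between large and small time steps, can be checked on a scalar example. This is a minimal sketch under an assumption not made in the talk: the energy is $E(v) = |v|$ on $\mathbb{R}$, for which $J_\tau$ is soft thresholding at level $\tau$. The sample points and step sizes are arbitrary.

```python
def resolvent(u, tau):
    """J_tau u = argmin_v |v - u|^2 / (2 tau) + |v|, i.e. soft thresholding at level tau."""
    if u > tau:
        return u - tau
    if u < -tau:
        return u + tau
    return 0.0

tau, h = 1.0, 0.25  # 0 < h <= tau
points = [-3.0, -0.8, 0.0, 0.4, 1.7, 3.0]

# Step 1: contraction inequality |J_tau u - J_tau v| <= |u - v|.
contraction_holds = all(
    abs(resolvent(u, tau) - resolvent(v, tau)) <= abs(u - v) + 1e-12
    for u in points
    for v in points
)

# Step 2: J_tau u = J_h[ (tau - h)/tau * J_tau u + h/tau * u ].
identity_gap = max(
    abs(resolvent(u, tau)
        - resolvent((tau - h) / tau * resolvent(u, tau) + h / tau * u, h))
    for u in points
)

print(contraction_holds, identity_gap)
```

For these sample points the contraction check passes and the gap in the second identity is zero up to rounding.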

Exponential Formula.

3. Recursive inequality:

Let $\mu_n = J_\tau^n \mu$ and $\mu_m = J_h^m \mu$.

\[
\begin{aligned}
W_2^2(\mu_n, \mu_m)
&= W_2^2\Big( J_h \overbrace{\Big[ \Big( \tfrac{\tau - h}{\tau} t_{\mu_{n-1}}^{\mu_n} + \tfrac{h}{\tau} \mathrm{id} \Big) \# \mu_{n-1} \Big]}^{\nu}, \; J_h \mu_{m-1} \Big) \\
&\le W_2^2(\nu, \mu_{m-1}) + O(h^2) \\
&\le W_{2,\mu_{n-1}}^2(\nu, \mu_{m-1}) + O(h^2) \\
&\le \tfrac{\tau - h}{\tau} W_{2,\mu_{n-1}}^2(\mu_n, \mu_{m-1}) + \tfrac{h}{\tau} W_{2,\mu_{n-1}}^2(\mu_{n-1}, \mu_{m-1}) + O(h^2) \\
&\le \tfrac{\tau - h}{\tau} W_2^2(\mu_n, \mu_{m-1}) + \tfrac{h}{\tau} W_2^2(\mu_{n-1}, \mu_{m-1}) + O(h^2)
\end{aligned}
\]

\[ W_2^2(\mu_n, \mu_m) \le \frac{\tau - h}{\tau} W_2^2(\mu_n, \mu_{m-1}) + \frac{h}{\tau} W_2^2(\mu_{n-1}, \mu_{m-1}) + O(h^2) \]

20 / 22

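Exponential Formula.

Iterating

\[ W_2^2(\mu_n, \mu_m) \le \frac{\tau - h}{\tau} W_2^2(\mu_n, \mu_{m-1}) + \frac{h}{\tau} W_2^2(\mu_{n-1}, \mu_{m-1}) + O(h^2) \]

with $\tau = t/n$ and $h = t/m$ for $n \le m$ gives

\[ W_2(\mu_n, \mu_m) \le O\Big( \frac{1}{\sqrt{n}} \Big) \xrightarrow{\; n, m \to \infty \;} 0. \]

Therefore, the limit exists.

21 / 22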

Thank you!

22 / 22

Backup

23 / 22


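\[ \frac{\partial \mu(t)}{\partial t} = -\nabla_{W_2} E(\mu(t)), \quad \mu(0) = \mu \]

Wasserstein Metric as "Riemannian Manifold"*

The Wasserstein metric is induced by this inner product (Benamou-Brenier):

\[ W_2(\mu_0, \mu_1) = \inf \Big\{ \int_0^1 \|\nabla \psi(t)\|_{\mu(t)} \, dt \; : \; \mu(0) = \mu_0, \; \mu(1) = \mu_1, \; \frac{\partial \mu}{\partial t} + \nabla \cdot (\nabla \psi \, \mu) = 0 \Big\}. \]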

24 / 22

The Wasserstein Metric's "Inner Product"* [Otto].

*DISCLAIMER: "given sufficient regularity," "in the limit", ...

Given $\mu(t)$, there exists a velocity field $v(x, t) = \nabla \psi(x, t)$ so that

\[ \frac{\partial \mu}{\partial t} + \nabla \cdot (\nabla \psi \, \mu) = 0. \]

The tangent space at a measure $\mu$ is

\[ \Big\{ \frac{\partial \mu}{\partial t}\Big|_{t=0} : \mu(0) = \mu \Big\} = \big\{ \nabla \psi : \psi \in C_c^\infty(\mathbb{R}^d) \big\}. \]

The inner product is

\[ \Big( \frac{\partial \mu}{\partial t}, \frac{\partial \tilde{\mu}}{\partial t} \Big)_\mu := \int \nabla \psi(x) \cdot \nabla \tilde{\psi}(x) \, d\mu. \]

25 / 22

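Wasserstein Subdifferential.

Wasserstein subdifferential of convex function:

• $\xi \in \partial E(\mu)$ in case $E(\nu) - E(\mu) \ge \int \langle \xi, t_\mu^\nu - \mathrm{id} \rangle \, d\mu$ for all $\nu$

• $\xi \in \partial_s E(\mu)$ in case $E(\nu) - E(\mu) \ge \int \langle \xi, t - \mathrm{id} \rangle \, d\mu$ for all $\nu$ and all $t \# \mu = \nu$.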

26 / 22

Generalized Geodesics.

• $\mu(\alpha) = (\alpha t_\mu^\nu + (1 - \alpha)\mathrm{id})\#\mu$ is the geodesic from $\mu$ to $\nu$

• $\mu(\alpha) = (\alpha t_\omega^\nu + (1 - \alpha) t_\omega^\mu)\#\omega$ is the gen. geodesic from $\mu$ to $\nu$ with base $\omega$

Proposition (AGS)
$\nu \mapsto W_2^2(\nu, \mu)$ is convex along gen. geodesics with base $\mu$.

Thus, $E$ convex along gen. geodesics $\Longrightarrow$ $\Phi(\nu) = \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu)$ convex along gen. geodesics with base $\mu_{n-1}$.

27 / 22

Transport Metrics.

The transport metric with base $\mu$ is $W_{2,\mu}(\omega, \nu) := \big( \int |t_\mu^\omega - t_\mu^\nu|^2 \, d\mu \big)^{1/2}$.

Proposition (C.)

1. $W_{2,\mu}$ is a metric

2. $W_2(\omega, \nu) \le W_{2,\mu}(\omega, \nu)$ with equality if $\mu = \omega$ or $\mu = \nu$

3. the geodesics of $W_{2,\mu}$ are the gen. geodesics with base $\mu$

4. $\omega \mapsto W_{2,\mu}^2(\omega, \nu)$ is convex

5. $\xi \in \partial_s E(\nu) \Longrightarrow \xi \circ t_\mu^\nu \in \partial_{2,\mu} E(\nu)$

Proof of Euler-Lagrange equation:

\[ \frac{1}{\tau}\big( t_{\mu_n}^{\mu_{n-1}} - \mathrm{id} \big) \in \partial_s E(\mu_n) \Longrightarrow \mu_n = \operatorname*{argmin}_{\nu} \Big\{ \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu) \Big\}. \]

• $E$ convex along gen. geodesics $\Longrightarrow$ convex in $W_{2,\mu_{n-1}}$

• $\Phi(\nu) = \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu)$ convex in $W_{2,\mu_{n-1}}$

• Since $\frac{1}{\tau}\big( t_{\mu_n}^{\mu_{n-1}} - \mathrm{id} \big) \in \partial_s E(\mu_n)$, a computation shows $0 \in \partial_{2,\mu_{n-1}} \Phi(\mu_n)$

• Therefore, $\mu_n$ minimizes $\Phi$.

28 / 22
