The Exponential Formula for the Wasserstein Metric. Katy Craig, UCLA. SIAM Annual Meeting, Chicago, July 8, 2014.

  • Plan.

    • Gradient flow

    • Discrete gradient flow

    • Euler-Lagrange equation

    • Exponential formula

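  • Gradient Flow.

    \[ \frac{\partial u(t)}{\partial t} = -\nabla E(u(t)), \qquad u(0) = u \]

    Heat Equation as Gradient Flow on $L^2(\mathbb{R}^d)$

    $L^2(\mathbb{R}^d)$ gradient:

    \[ (\nabla_{L^2} E(u), v)_{L^2} = \lim_{h \to 0} \frac{E(u + hv) - E(u)}{h} \quad \text{for all } v \in L^2(\mathbb{R}^d) \]

    Thus, for $E(u) = \frac{1}{2}\int |\nabla u|^2$,

    \[ (\nabla_{L^2} E(u), v)_{L^2} = \lim_{h \to 0} \frac{1}{2}\, \frac{\int |\nabla(u + hv)|^2 - \int |\nabla u|^2}{h} = (\nabla u, \nabla v)_{L^2} = (-\Delta u, v)_{L^2} . \]

    Hence, the $L^2$ gradient flow of $E$ is

    \[ \frac{\partial u}{\partial t} = -\nabla_{L^2} E(u) = -(-\Delta u) = \Delta u . \]

    Note: $\nabla_{L^2} E(u) = \frac{\delta E}{\delta u}$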
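
The identity between the difference quotient of the Dirichlet energy and the pairing $(-\Delta u, v)_{L^2}$ can be checked numerically. The sketch below is only an illustration under assumptions that are not part of the talk: a one-dimensional periodic grid on $[0, 2\pi)$, forward differences for the gradient, and arbitrarily chosen test functions `u` and `v`.

```python
import math

def dirichlet_energy(u, dx):
    """E(u) = 1/2 * integral of |u'|^2 on a periodic grid (forward differences)."""
    n = len(u)
    return 0.5 * sum(((u[(i + 1) % n] - u[i]) / dx) ** 2 for i in range(n)) * dx

def neg_laplacian(u, dx):
    """-u'' via the periodic three-point stencil (u[-1] wraps around)."""
    n = len(u)
    return [-(u[(i + 1) % n] - 2 * u[i] + u[i - 1]) / dx ** 2 for i in range(n)]

def l2_inner(f, g, dx):
    """Discrete L2 inner product on the grid."""
    return sum(a * b for a, b in zip(f, g)) * dx

n = 64
dx = 2 * math.pi / n
x = [i * dx for i in range(n)]
u = [math.sin(xi) for xi in x]                             # hypothetical state
v = [math.cos(2 * xi) + 0.5 * math.sin(xi) for xi in x]    # hypothetical direction

h = 1e-6
u_plus = [a + h * b for a, b in zip(u, v)]
difference_quotient = (dirichlet_energy(u_plus, dx) - dirichlet_energy(u, dx)) / h
pairing = l2_inner(neg_laplacian(u, dx), v, dx)

print(difference_quotient, pairing)
```

On this grid the two printed numbers should agree up to an error of order `h`, since discrete summation by parts holds exactly for periodic sequences.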

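  • Examples of Hilbert Space Gradient Flow.

    | PDE | Equation | Energy Functional | Metric |
    |---|---|---|---|
    | Allen-Cahn | $\frac{d}{dt}u = \Delta u - F'(u)$ | $E(u) = \int \left[ \frac{1}{2}\lvert\nabla u\rvert^2 + F(u) \right]$ | $L^2$ |
    | Cahn-Hilliard | $\frac{d}{dt}u = -\Delta(\Delta u - F'(u))$ | $E(u) = \int \left[ \frac{1}{2}\lvert\nabla u\rvert^2 + F(u) \right]$ | $H^{-1}$ |
    | Porous Media / Fast Diffusion | $\frac{d}{dt}u = \Delta u^m$ | $E(u) = \frac{1}{m+1}\int u^{m+1}$ | $H^{-1}$ |

    Why gradient flow?
    • Free estimates, e.g. $|u(t) - v(t)| \le e^{-\lambda t}|u(0) - v(0)|$
    • Method to construct and approximate solutions (discrete gradient flow)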
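
The "free estimate" follows from a short computation. The sketch below assumes, as is standard but not stated on the slide, that $\lambda$ is a convexity constant of $E$, meaning $(\nabla E(u) - \nabla E(v), u - v) \ge \lambda |u - v|^2$ for all $u, v$, and that $u(t)$ and $v(t)$ are two solutions of the same gradient flow.

```latex
\begin{aligned}
\frac{d}{dt}\,\frac{1}{2}\,|u(t) - v(t)|^2
  &= \bigl( \partial_t u - \partial_t v,\; u - v \bigr) \\
  &= -\bigl( \nabla E(u) - \nabla E(v),\; u - v \bigr) \\
  &\le -\lambda\, |u(t) - v(t)|^2 .
\end{aligned}
% Gronwall's inequality applied to |u - v|^2 then gives
|u(t) - v(t)|^2 \le e^{-2\lambda t}\, |u(0) - v(0)|^2
\quad\Longrightarrow\quad
|u(t) - v(t)| \le e^{-\lambda t}\, |u(0) - v(0)| .
```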

  • Wasserstein Gradient Flow.

    \[ \frac{\partial \mu(t)}{\partial t} = -\nabla_{W_2} E(\mu(t)), \qquad \mu(0) = \mu \]

    Simplifying assumptions: $\int |x|^2 \, d\mu < +\infty$, µ

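  • Geodesics and Convexity.

    Geodesics:

    $\mu(\alpha) = (\alpha\, t_\mu^\nu + (1 - \alpha)\,\mathrm{id}) \# \mu$ is the geodesic from $\mu$ to $\nu$ at time $\alpha$,

    \[ W_2(\mu(\alpha), \mu(\beta)) = |\alpha - \beta|\, W_2(\mu, \nu) . \]

    Convexity:

    $E$ is convex in case

    \[ E(\mu(\alpha)) \le (1 - \alpha)\, E(\mu(0)) + \alpha\, E(\mu(1)) . \]

    Assumption: $E$ is lower semicontinuous and convex.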
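
The constant-speed property of the geodesic can be illustrated in one dimension. The sketch below is a discrete stand-in, not the setting of the talk: it assumes empirical measures with equally many equal-weight atoms, for which the optimal map in one dimension pairs sorted atoms. The helper names `w2_sorted` and `geodesic` and the sample points are hypothetical.

```python
import math

def w2_sorted(xs, ys):
    """W2 between two 1-D empirical measures with the same number of
    equal-weight atoms; in 1-D the optimal coupling matches sorted atoms."""
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs))

def geodesic(xs, ys, alpha):
    """Atoms of mu(alpha) = (alpha * t + (1 - alpha) * id) # mu,
    where t sends the i-th smallest atom of mu to the i-th smallest atom of nu."""
    xs, ys = sorted(xs), sorted(ys)
    return [alpha * b + (1 - alpha) * a for a, b in zip(xs, ys)]

mu = [-1.0, 0.0, 0.5, 2.0]   # hypothetical atoms of mu
nu = [3.0, 1.0, 4.5, 2.5]    # hypothetical atoms of nu
a, b = 0.2, 0.7

lhs = w2_sorted(geodesic(mu, nu, a), geodesic(mu, nu, b))
rhs = abs(a - b) * w2_sorted(mu, nu)
print(lhs, rhs)
```

Both printed values should coincide up to rounding, because interpolating sorted atom lists keeps them sorted.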

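  • The Wasserstein Metric's ``Gradient''.

    By a similar computation as in the $L^2$ case,

    \[ \left( \nabla_{W_2} E(\mu),\; \left. \frac{\partial \mu}{\partial t} \right|_{t=0} \right) = \lim_{t \to 0} \frac{E(\mu(t)) - E(\mu)}{t} \quad \text{for all } \frac{\partial \mu}{\partial t} , \]

    implies

    \[ \nabla_{W_2} E(\mu) = -\nabla \cdot \left( \mu \nabla \frac{\delta E}{\delta \mu} \right) . \]

    Therefore,

    \[ \frac{\partial \mu(t)}{\partial t} = -\nabla_{W_2} E(\mu(t)) \iff \frac{\partial \mu(t)}{\partial t} = \nabla \cdot \left( \mu \nabla \frac{\delta E}{\delta \mu} \right) . \]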
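
As a worked instance of this formula, take the entropy $E(\mu) = \int \rho \log \rho \, dx$, where $\rho$ denotes the density of $\mu$ (assumed here to be smooth and positive so that every step is justified). The formal computation recovers the heat equation:

```latex
\frac{\delta E}{\delta \mu} = \log \rho + 1 ,
\qquad
\nabla \frac{\delta E}{\delta \mu} = \frac{\nabla \rho}{\rho} ,
\qquad
\frac{\partial \rho}{\partial t}
  = \nabla \cdot \left( \rho \, \frac{\nabla \rho}{\rho} \right)
  = \Delta \rho .
```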

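  • Examples of Wasserstein Gradient Flow.

    | PDE | Equation | Energy Functional |
    |---|---|---|
    | Porous Media / Fast Diffusion | $\frac{\partial}{\partial t}\mu = \Delta \mu^m$ | $E(\mu) = \frac{1}{m-1}\int \rho(x)^m \, dx$ |
    | Fokker-Planck | $\frac{\partial}{\partial t}\mu = \Delta \mu + \nabla \cdot (\mu \nabla V)$ | $E(\mu) = \int \rho(x) \log \rho(x) + V(x)\rho(x) \, dx$ |
    | Aggregation | $\frac{\partial}{\partial t}\mu = \nabla \cdot (\mu \nabla K * \mu)$ | $E(\mu) = \frac{1}{2}\iint \rho(x) K(x - y) \rho(y) \, dx \, dy$ |

    Why gradient flow?
    • Free estimates, e.g. $W_2(\mu(t), \nu(t)) \le e^{-\lambda t} W_2(\mu(0), \nu(0))$
    • Method to construct and approximate solutions (discrete gradient flow)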

  • Discrete Gradient Flow: Euclidean Space.

    Gradient flow:

    \[ \frac{du(t)}{dt} = -\nabla E(u(t)), \qquad u(0) = u \in \mathbb{R}^d \]

    Implicit Euler method:

    \[ \frac{u_n - u_{n-1}}{\tau} + \nabla E(u_n) = 0, \qquad u_0 = u \]

    Given $u_{n-1}$, compute $u_n$ using that it is the unique minimizer of

    \[ \Phi(v) = \frac{1}{2\tau} |v - u_{n-1}|^2 + E(v) . \]

    Theorem (Exponential Formula). Let $\tau = t/n$. Then $\lim_{n \to \infty} u_n = u(t)$.
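
A minimal sketch of the exponential formula in the simplest case, assuming the quadratic energy $E(u) = \frac{\lambda}{2} u^2$ on $\mathbb{R}$ (an illustrative choice, not taken from the talk). For this energy each implicit Euler step has the closed form $u_n = u_{n-1} / (1 + \lambda\tau)$, and the exact flow is $u(t) = e^{-\lambda t} u$, so the theorem reduces to $(1 + \lambda t / n)^{-n} \to e^{-\lambda t}$.

```python
import math

def implicit_euler(u0, lam, t, n):
    """Run n implicit Euler steps of size tau = t / n for E(u) = lam/2 * u^2.
    Each step minimizes |v - u_prev|^2 / (2 tau) + E(v),
    whose unique minimizer is v = u_prev / (1 + lam * tau)."""
    tau = t / n
    u = u0
    for _ in range(n):
        u = u / (1 + lam * tau)
    return u

u0, lam, t = 2.0, 1.5, 1.0          # hypothetical initial datum and parameters
exact = u0 * math.exp(-lam * t)     # solution of du/dt = -lam * u at time t

errors = [abs(implicit_euler(u0, lam, t, n) - exact) for n in (10, 100, 1000)]
print(errors)
```

The printed errors should shrink roughly in proportion to $1/n$ as the number of steps grows.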

  • Discrete Gradient Flow: Wasserstein Metric.

    Want to say... given $\mu_{n-1}$, let $\mu_n$ be the unique minimizer of

    \[ \Phi(\nu) = \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu) . \]

    Problem: $\nu \mapsto W_2^2(\nu, \mu_{n-1})$ is not convex, so $\Phi$ may not have a unique minimum.

    Need additional assumptions on $E$:
    • coercive
    • convex along generalized geodesics

    Proposition (AGS). For all $\tau > 0$, there exists a unique minimizer of $\Phi(\nu)$, so the discrete gradient flow is well defined.
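
The discrete gradient flow can be made concrete for one simple energy. The sketch below assumes a potential energy $E(\nu) = \int V \, d\nu$ with $V(x) = x^2/2$ and a one-dimensional empirical measure, neither of which comes from the talk. For a potential energy, minimizing over couplings shows that the minimizer of $\Phi$ is the pushforward of $\mu_{n-1}$ under the pointwise proximal map $x \mapsto \operatorname{argmin}_y \{ |x - y|^2 / (2\tau) + V(y) \}$, which here equals $x / (1 + \tau)$. The continuum flow moves each atom along $\dot{x} = -V'(x)$, so the exact atoms at time $t$ are $x e^{-t}$.

```python
import math

def jko_step_potential(atoms, tau):
    """One discrete gradient flow step for E(nu) = integral of V d(nu), V(x) = x^2/2.
    The minimizer is the pushforward of the previous measure under the proximal
    map of V, which for this V is x -> x / (1 + tau)."""
    return [x / (1 + tau) for x in atoms]

def w2_sorted(xs, ys):
    """W2 between 1-D empirical measures with equally many equal-weight atoms."""
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs))

def discrete_flow(atoms, t, n):
    """Apply n steps of size tau = t / n."""
    tau = t / n
    for _ in range(n):
        atoms = jko_step_potential(atoms, tau)
    return atoms

mu0 = [-2.0, -0.5, 1.0, 3.0]               # hypothetical initial atoms
t = 1.0
exact = [x * math.exp(-t) for x in mu0]    # atoms transported by dx/dt = -x

gaps = [w2_sorted(discrete_flow(mu0, t, n), exact) for n in (10, 100, 1000)]
print(gaps)
```

The Wasserstein distance between the discrete flow and the exact solution should decrease as the time step shrinks.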

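  • Euler-Lagrange Equation.

    In the Euclidean case,

    \[ u_n = \operatorname*{argmin}_v \left\{ \frac{1}{2\tau} |v - u_{n-1}|^2 + E(v) \right\} \iff \frac{u_n - u_{n-1}}{\tau} = -\nabla E(u_n) . \]

    In the Wasserstein case,

    Proposition (AGS, C.)

    \[ \mu_n = \operatorname*{argmin}_\nu \left\{ \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu) \right\} \iff \frac{1}{\tau} \left( t_{\mu_n}^{\mu_{n-1}} - \mathrm{id} \right) \in \partial_s E(\mu_n) . \]

    Key property of subdifferential: for $E$ convex, $0 \in \partial E(\mu) \iff \mu$ minimizes $E$.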

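  • Sketch of Proof: Euler-Lagrange Equation.

    Proposition (AGS, C.)

    \[ \mu_n = \operatorname*{argmin}_\nu \left\{ \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu) \right\} \iff \frac{1}{\tau} \left( t_{\mu_n}^{\mu_{n-1}} - \mathrm{id} \right) \in \partial_s E(\mu_n) . \]

    Proof: Let $\Phi(\nu) = \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu)$.

    $\Longrightarrow$ [AGS, Otto]: minimality implies $\Phi(t \# \mu_n) \ge \Phi(\mu_n)$; expand both sides.

    $\Longleftarrow$ want to say...
    • $\frac{1}{\tau} ( t_{\mu_n}^{\mu_{n-1}} - \mathrm{id} ) \in \partial_s E(\mu_n)$
    • hence $0 \in \partial \Phi(\mu_n)$
    • hence by key property, $\mu_n$ minimizes $\Phi$

    Problem: $\nu \mapsto W_2^2(\nu, \mu_{n-1})$ is not convex, so $\Phi$ may not be convex.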

  • Sketch of Proof: Euler-Lagrange Equation.

    Solution: generalized geodesics and transport metrics

    • [AGS] $\nu \mapsto W_2^2(\nu, \mu)$ is not convex (along all geodesics)
    • [AGS] $\nu \mapsto W_2^2(\nu, \mu)$ is convex along generalized geodesics with base $\mu$
    • [C.] the generalized geodesics with base $\mu$ are not arbitrary curves: they are exactly the geodesics of the transport metric with base $\mu$

    Wasserstein Metric: $W_2(\mu, \nu) = \left( \int |t_\mu^\nu - \mathrm{id}|^2 \, d\mu \right)^{1/2}$

    Transport Metric: $W_{2,\omega}(\mu, \nu) = \left( \int |t_\omega^\mu - t_\omega^\nu|^2 \, d\omega \right)^{1/2}$

    • $\nu \mapsto W_{2,\omega}^2(\nu, \mu)$ is convex
    • $W_2(\mu, \nu) \le W_{2,\omega}(\mu, \nu)$
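
The comparison between the two metrics can be checked by brute force on tiny examples. The sketch below is a discrete stand-in under assumptions not made in the talk: each measure is a set of four equal-weight atoms in the plane, so that optimal maps are permutations found by exhaustive search. All function names and point sets are hypothetical.

```python
import math
from itertools import permutations

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def optimal_assignment(src, dst):
    """Brute-force optimal matching s (src[i] -> dst[s[i]]) for equal-weight atoms."""
    n = len(src)
    return min(permutations(range(n)),
               key=lambda s: sum(sq_dist(src[i], dst[s[i]]) for i in range(n)))

def w2(src, dst):
    """Wasserstein distance: cost of the optimal matching between src and dst."""
    s = optimal_assignment(src, dst)
    n = len(src)
    return math.sqrt(sum(sq_dist(src[i], dst[s[i]]) for i in range(n)) / n)

def w2_base(omega, mu, nu):
    """Transport metric with base omega: compare the images of each base atom
    under the optimal maps omega -> mu and omega -> nu."""
    s_mu = optimal_assignment(omega, mu)
    s_nu = optimal_assignment(omega, nu)
    n = len(omega)
    return math.sqrt(sum(sq_dist(mu[s_mu[i]], nu[s_nu[i]]) for i in range(n)) / n)

omega = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
mu = [(2.0, 0.1), (2.5, 1.2), (3.0, -0.4), (2.2, 0.8)]
nu = [(-0.3, 2.0), (0.9, 2.6), (0.2, 3.1), (1.4, 2.2)]

print(w2(mu, nu), w2_base(omega, mu, nu))
```

The first printed value should never exceed the second, because pairing atoms through the base measure is one admissible coupling of $\mu$ and $\nu$, while $W_2$ minimizes over all of them. Taking the base equal to `mu` makes the two values coincide.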

  • Exponential Formula.

    Theorem (AGS). Let $\tau = t/n$. Then $\lim_{n \to \infty} \mu_n = \mu(t)$.

    • the limit exists
    • the limit is a solution to the gradient flow

    Sketch of Proof, à la Crandall and Liggett:

    Let $J_\tau$ be the function $J_\tau u = \operatorname*{argmin}_v \left\{ \frac{1}{2\tau} |v - u|^2 + E(v) \right\} \implies J_\tau^n u_0 = u_n$.

    1. Contraction inequality

    Banach space: $\| J_\tau u - J_\tau v \| \le \| u - v \|$

    Theorem (Carlen, C.). $W_2^2(J_\tau \mu, J_\tau \nu) \le W_2^2(\mu, \nu) + O(\tau^2)$
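
The Banach space contraction inequality can be observed numerically for a convex energy on the real line. The sketch below assumes $E(u) = u^4/4$, an illustrative choice not taken from the talk, so that $J_\tau u$ is the unique root $v$ of $v + \tau v^3 = u$. The root is found by bisection, and the function name `resolvent` is hypothetical.

```python
def resolvent(u, tau, iters=200):
    """J_tau u = argmin_v |v - u|^2 / (2 tau) + v^4 / 4.
    The minimizer solves v + tau * v^3 = u; the left side is increasing in v
    and the root lies in [-|u|, |u|], so bisection applies."""
    lo, hi = -abs(u), abs(u)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + tau * mid ** 3 < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

tau = 0.3
pairs = [(-2.0, 1.5), (0.1, 0.4), (3.0, 3.5), (-1.0, -0.2)]   # hypothetical inputs

ratios = [abs(resolvent(u, tau) - resolvent(v, tau)) / abs(u - v) for u, v in pairs]
print(ratios)
```

Every printed ratio should be at most 1, reflecting that the proximal map of a convex function is nonexpansive.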

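  • Exponential Formula.

    2. Large vs small time steps, $0 < h \le \tau$

    Banach space: $J_\tau u = J_h \left[ \frac{\tau - h}{\tau} J_\tau u + \frac{h}{\tau} u \right]$

    Lemma (Jost, Mayer, C.)

    \[ J_\tau \mu = J_h \left[ \left( \frac{\tau - h}{\tau}\, t_\mu^{J_\tau \mu} + \frac{h}{\tau}\, \mathrm{id} \right) \# \mu \right] \]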
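
The Banach space identity follows from the Euler-Lagrange equation: if $w = J_\tau u$ and $z = \frac{\tau - h}{\tau} w + \frac{h}{\tau} u$, then $w - z = \frac{h}{\tau}(w - u) = -h \nabla E(w)$, which is exactly the condition $w = J_h z$. The sketch below checks this numerically, again assuming the illustrative energy $E(u) = u^4/4$ on the real line and a hypothetical bisection-based `resolvent`.

```python
def resolvent(u, step, iters=200):
    """J_step u for E(v) = v^4 / 4: the unique root v of v + step * v^3 = u,
    located by bisection on [-|u|, |u|]."""
    lo, hi = -abs(u), abs(u)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + step * mid ** 3 < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

tau, h = 0.5, 0.2    # hypothetical large and small time steps, 0 < h <= tau
u = 1.7              # hypothetical starting point

big_step = resolvent(u, tau)
# Interpolate between J_tau u and u, then take one small step from there.
interpolated = (tau - h) / tau * big_step + h / tau * u
small_step = resolvent(interpolated, h)

print(big_step, small_step)
```

The two printed values should agree to within the accuracy of the bisection.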

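  • Exponential Formula.

    3. Recursive inequality:

    Let $\mu_n = J_\tau^n \mu$ and $\mu_m = J_h^m \mu$.

    \[
    \begin{aligned}
    W_2^2(\mu_n, \mu_m)
      &= W_2^2 \Big( J_h \overbrace{\Big[ \Big( \tfrac{\tau - h}{\tau}\, t_{\mu_{n-1}}^{\mu_n} + \tfrac{h}{\tau}\, \mathrm{id} \Big) \# \mu_{n-1} \Big]}^{\nu},\; J_h \mu_{m-1} \Big) \\
      &\le W_2^2(\nu, \mu_{m-1}) + O(h^2) \\
      &\le W_{2,\mu_{n-1}}^2(\nu, \mu_{m-1}) + O(h^2) \\
      &\le \tfrac{\tau - h}{\tau}\, W_{2,\mu_{n-1}}^2(\mu_n, \mu_{m-1}) + \tfrac{h}{\tau}\, W_{2,\mu_{n-1}}^2(\mu_{n-1}, \mu_{m-1}) + O(h^2) \\
      &\le \tfrac{\tau - h}{\tau}\, W_2^2(\mu_n, \mu_{m-1}) + \tfrac{h}{\tau}\, W_2^2(\mu_{n-1}, \mu_{m-1}) + O(h^2)
    \end{aligned}
    \]

    \[ W_2^2(\mu_n, \mu_m) \le \frac{\tau - h}{\tau}\, W_2^2(\mu_n, \mu_{m-1}) + \frac{h}{\tau}\, W_2^2(\mu_{n-1}, \mu_{m-1}) + O(h^2) \]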

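  • Exponential Formula.

    Iterating

    \[ W_2^2(\mu_n, \mu_m) \le \frac{\tau - h}{\tau}\, W_2^2(\mu_n, \mu_{m-1}) + \frac{h}{\tau}\, W_2^2(\mu_{n-1}, \mu_{m-1}) + O(h^2) \]

    with $\tau = t/n$ and $h = t/m$ for $n \le m$ gives

    \[ W_2(\mu_n, \mu_m) \le O\!\left( \frac{1}{\sqrt{n}} \right) \xrightarrow{\; n, m \to \infty \;} 0 . \]

    Therefore, the limit exists.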

  • Thank you!

  • Backup

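  • Wasserstein Gradient Flow.

    \[ \frac{\partial \mu(t)}{\partial t} = -\nabla_{W_2} E(\mu(t)), \qquad \mu(0) = \mu \]

    Wasserstein Metric as ``Riemannian Manifold''*

    The Wasserstein metric is induced by this inner product (Benamou-Brenier):

    \[ W_2(\mu_0, \mu_1) = \inf \left\{ \int_0^1 \| \nabla \psi(t) \|_{\mu(t)} \, dt \;:\; \mu(0) = \mu_0,\ \mu(1) = \mu_1,\ \frac{\partial \mu}{\partial t} + \nabla \cdot (\nabla \psi \, \mu) = 0 \right\} . \]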

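  • The Wasserstein Metric's ``Inner Product''* [Otto].

    *DISCLAIMER: "given sufficient regularity," "in the limit", ...

    Given $\mu(t)$, there exists a velocity field $v(x, t) = \nabla \psi(x, t)$ so that

    \[ \frac{\partial \mu}{\partial t} + \nabla \cdot (\nabla \psi \, \mu) = 0 . \]

    The tangent space at a measure $\mu$ is

    \[ \left\{ \left. \frac{\partial \mu}{\partial t} \right|_{t=0} : \mu(0) = \mu \right\} = \left\{ \nabla \psi : \psi \in C_c^\infty(\mathbb{R}^d) \right\} . \]

    The inner product is

    \[ \left( \frac{\partial \mu}{\partial t}, \frac{\partial \tilde{\mu}}{\partial t} \right) := \int \nabla \psi(x) \cdot \nabla \tilde{\psi}(x) \, d\mu . \]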

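  • Wasserstein Subdifferential.

    Wasserstein subdifferential of convex function:

    • $\xi \in \partial E(\mu)$ in case $E(\nu) - E(\mu) \ge \int \langle \xi, t_\mu^\nu - \mathrm{id} \rangle \, d\mu$ for all $\nu$

    • $\xi \in \partial_s E(\mu)$ in case $E(\nu) - E(\mu) \ge \int \langle \xi, t - \mathrm{id} \rangle \, d\mu$ for all $\nu$ and all $t \# \mu = \nu$.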

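  • Generalized Geodesics.

    • $\mu(\alpha) = (\alpha\, t_\mu^\nu + (1 - \alpha)\, \mathrm{id}) \# \mu$ is the geodesic from $\mu$ to $\nu$
    • $\mu(\alpha) = (\alpha\, t_\omega^\nu + (1 - \alpha)\, t_\omega^\mu) \# \omega$ is the gen. geodesic from $\mu$ to $\nu$ with base $\omega$

    Proposition (AGS). $\nu \mapsto W_2^2(\nu, \mu)$ is convex along gen. geodesics with base $\mu$.

    Thus, $E$ convex along gen. geodesics $\implies$ $\Phi(\nu) = \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu)$ convex along gen. geodesics with base $\mu_{n-1}$.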

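  • Transport Metrics.

    The transport metric with base $\mu$ is $W_{2,\mu}(\omega, \nu) := \left( \int |t_\mu^\omega - t_\mu^\nu|^2 \, d\mu \right)^{1/2}$.

    Proposition (C.)
    1. $W_{2,\mu}$ is a metric
    2. $W_2(\omega, \nu) \le W_{2,\mu}(\omega, \nu)$ with equality if $\mu = \omega$ or $\mu = \nu$
    3. the geodesics of $W_{2,\mu}$ are the gen. geodesics with base $\mu$
    4. $\omega \mapsto W_{2,\mu}^2(\omega, \nu)$ is convex
    5. $\xi \in \partial_s E(\nu) \implies \xi \circ t_\mu^\nu \in \partial_{2,\mu} E(\nu)$

    Proof of Euler-Lagrange equation:

    \[ \frac{1}{\tau} \left( t_{\mu_n}^{\mu_{n-1}} - \mathrm{id} \right) \in \partial_s E(\mu_n) \implies \mu_n = \operatorname*{argmin}_\nu \left\{ \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu) \right\} . \]

    • $E$ convex along gen. geodesics $\implies$ convex in $W_{2,\mu_{n-1}}$
    • $\Phi(\nu) = \frac{1}{2\tau} W_2^2(\nu, \mu_{n-1}) + E(\nu)$ convex in $W_{2,\mu_{n-1}}$
    • Since $\frac{1}{\tau} ( t_{\mu_n}^{\mu_{n-1}} - \mathrm{id} ) \in \partial_s E(\mu_n)$, a computation shows $0 \in \partial_{2,\mu_{n-1}} \Phi(\mu_n)$
    • Therefore, $\mu_n$ minimizes $\Phi$.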