MADMM: a generic algorithm for non-smooth optimization on manifolds

Michael Bronstein

Faculty of Informatics, University of Lugano, Switzerland
Perceptual Computing Group, Intel Corporation, Israel

Louvain-la-Neuve, 25 September 2015


[Figure: overview of research areas across 2D, 3D, and nD data: image processing, geometry processing, image analysis, shape analysis, computer vision, computer graphics, pattern recognition, machine learning, graph analysis and processing]


What is manifold optimization?

Manifold (or manifold-constrained) optimization problem

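$\min_{X \in \mathbb{R}^{n \times m}} f(X) \quad \text{s.t.} \quad X \in \mathcal{M}$

$f : \mathbb{R}^{n \times m} \to \mathbb{R}$ is a smooth function

$\mathcal{M}$ is a Riemannian submanifold of $\mathbb{R}^{n \times m}$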

Absil et al. 2009


Applications

Sphere: principal geodesic analysis [1], 1-bit compressed sensing [2]

Stiefel manifold: eigenvalue, assignment, and Procrustes problems [3], orthogonal dictionary learning [4], binary coding [5]

Product of Stiefel manifolds: functional correspondence [6], manifold learning [7], structure-from-motion [8], sensor localization [9]

Fixed-rank PSD: maxcut problems, sparse PCA [10], matrix completion [11], multidimensional scaling [12]

Oblique: ICA [13], blind source separation [14]

[1] Zhang, Fletcher 2013; [2] Boufounos, Baraniuk 2008; [3] Ten Berge 1977; [4] Sun et al. 2015; [5] Xia et al. 2015; [6] Kovnatsky et al. 2013; [7] Eynard et al. 2015; [8] Arie-Nachimson et al. 2012; [9] Cucuringu et al. 2012; [10] Journee et al. 2010; [11] Tan et al. 2014; [12] Cayton, Dasgupta 2006; [13] Absil, Gallivan 2006; [14] Kleinsteuber, Shen 2012.


Toy example: eigenvalue problem

$\min_{x \in \mathbb{R}^n} x^\top A x \quad \text{s.t.} \quad x^\top x = 1$
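A small numerical check of the toy problem may help here. The sketch below assumes a symmetric matrix A (the slide does not fix A; the random test matrix is made up for illustration) and relies on the standard fact that the minimum of the quadratic form over the unit sphere is the smallest eigenvalue, attained at a corresponding unit eigenvector.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
A = (B + B.T) / 2  # symmetric test matrix (an assumption; the slide does not specify A)

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors
eigvals, eigvecs = np.linalg.eigh(A)
x_star = eigvecs[:, 0]         # unit eigenvector of the smallest eigenvalue
f_star = x_star @ A @ x_star   # objective value at the minimizer

# Objective values at random points of the unit sphere, for comparison
X = rng.standard_normal((n, 1000))
X /= np.linalg.norm(X, axis=0)
f_rand = np.einsum("ij,ij->j", X, A @ X)
```

Here `f_star` coincides with `eigvals[0]` up to rounding, and no entry of `f_rand` falls below it.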


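Optimization on the manifold: main idea

$\min_{X \in \mathcal{M}} f(X)$

where $f : \mathcal{M} \to \mathbb{R}$ is a function on the manifold (scalar field)

No global system of coordinates

Manifold $\mathcal{M}$ is locally homeomorphic to the tangent space $T_X\mathcal{M}$

Intrinsic gradient $\nabla_{\mathcal{M}} f : \mathcal{M} \to T\mathcal{M}$ such that

$f(\text{“}X + dV\text{”}) = f(X) + \langle \nabla_{\mathcal{M}} f(X), dV \rangle_{T_X\mathcal{M}} + O(\|dV\|^2)$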


Intrinsic gradient = projection of the extrinsic gradient

$\nabla_{\mathcal{M}} f(X) = P_{T_X\mathcal{M}} \nabla f(X)$

Exponential map $\exp_X : T_X\mathcal{M} \to \mathcal{M}$

Moving vectors on $\mathcal{M}$ requires parallel transport

Absil et al. 2009


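Optimization on the manifold: main idea

[Figure: one iteration on the manifold $\mathcal{M}$. At $X^{(k)}$, the extrinsic gradient $\nabla f(X^{(k)})$ is projected by $P_{X^{(k)}}$ onto the tangent space $T_{X^{(k)}}\mathcal{M}$, giving $\nabla_{\mathcal{M}} f(X^{(k)})$; the scaled vector $\alpha^{(k)} \nabla_{\mathcal{M}} f(X^{(k)})$ is mapped by the retraction $R_{X^{(k)}}$ to the next iterate $X^{(k+1)}$ on $\mathcal{M}$.]

Absil et al. 2009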


Optimization on the manifold

Algorithm 1: Conceptual algorithm for smooth optimization on $\mathcal{M}$

repeat
  Compute extrinsic gradient $\nabla f(X^{(k)})$
  Projection: $\nabla_{\mathcal{M}} f(X^{(k)}) = P_{X^{(k)}}(\nabla f(X^{(k)}))$
  Compute step size $\alpha^{(k)}$ along the descent direction $-\nabla_{\mathcal{M}} f(X^{(k)})$
  Retraction: $X^{(k+1)} = R_{X^{(k)}}(-\alpha^{(k)} \nabla_{\mathcal{M}} f(X^{(k)}))$
  $k \leftarrow k + 1$
until convergence

Projection and retraction operators are manifold-dependent

Typically expressed in closed form

“Black box”: need to provide only $f(X)$ and gradient $\nabla f(X)$

Absil et al. 2009; Boumal et al. 2014
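The loop in Algorithm 1 can be sketched for the simplest case, the unit sphere, applied to the eigenvalue toy problem. This is a minimal illustration rather than the implementation used by the authors: the function names, the fixed step size (instead of a line search), and the diagonal test matrix are all assumptions made here.

```python
import numpy as np

def project(x, g):
    """Project an ambient vector g onto the tangent space of the unit sphere at x."""
    return g - (x @ g) * x

def retract(x, v):
    """Map the tangent step v at x back onto the sphere by normalization."""
    y = x + v
    return y / np.linalg.norm(y)

def sphere_gradient_descent(A, x0, iters=2000):
    """Minimize x^T A x over the unit sphere with a fixed step size."""
    x = x0 / np.linalg.norm(x0)
    alpha = 1.0 / (4.0 * np.linalg.norm(A, 2))  # conservative fixed step (a simplification)
    for _ in range(iters):
        egrad = 2.0 * A @ x             # extrinsic (Euclidean) gradient
        rgrad = project(x, egrad)       # intrinsic gradient
        x = retract(x, -alpha * rgrad)  # retraction of the descent step
    return x

rng = np.random.default_rng(1)
A = np.diag(np.arange(1.0, 7.0))  # hypothetical test matrix with eigenvalues 1, ..., 6
x = sphere_gradient_descent(A, rng.standard_normal(6))
value = x @ A @ x
```

For this test matrix, `value` approaches the smallest eigenvalue, 1. Only the objective gradient is problem-specific; `project` and `retract` depend on the manifold alone, which is the "black box" point made above.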


Prototype problem

Non-smooth manifold optimization problem

$\min_{X \in \mathcal{M}} f(X) + g(AX)$

$f : \mathbb{R}^{n \times m} \to \mathbb{R}$ is a smooth function

$g : \mathbb{R}^{k \times m} \to \mathbb{R}$ is a non-smooth function

$A$ is a $k \times n$ matrix

$\mathcal{M}$ is a Riemannian submanifold of $\mathbb{R}^{n \times m}$

Typical examples: $g(X) = \|X\|_1$, $\|X\|_{2,1}$, or $\|X\|_*$

Smoothing: approximate
Subgradient: problem dependent
Splitting: problem dependent

Kovnatsky, Glashoff, B 2015


Smoothing: Chen 2012

Subgradient: Ferreira, Oliveira 1998; Ledyaev, Zhu 2007; Kleinsteuber, Shen 2012

Splitting: Lai, Osher 2014; Neumann et al. 2014; Rosman et al. 2014


Manifold ADMM

Non-smooth manifold optimization problem equivalently written as

$\min_{X \in \mathcal{M},\, Z} f(X) + g(Z) \quad \text{s.t.} \quad Z = AX$

introducing an artificial variable $Z$ and a linear constraint

Apply the method of multipliers only to the constraint $Z = AX$

$\min_{X \in \mathcal{M},\, Z} f(X) + g(Z) + \frac{\rho}{2} \|AX - Z + U\|_F^2$

Solve by alternating over $X$ and $Z$, then updating $U \leftarrow U + AX - Z$

Problem breaks into

Smooth manifold optimization sub-problem w.r.t. $X$, and

Non-smooth unconstrained sub-problem w.r.t. $Z$

Hestenes 1969; Powell 1969; Kovnatsky, Glashoff, B 2015
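For readers who know the method of multipliers in its unscaled form, the penalty term with $U$ can be unpacked as follows. This is a standard identity, written with a multiplier $\Lambda$ that the slides do not name (a symbol introduced here for illustration); $U = \Lambda / \rho$ plays the role of the scaled dual variable.

```latex
\begin{aligned}
\mathcal{L}_\rho(X, Z; \Lambda)
  &= f(X) + g(Z) + \langle \Lambda, AX - Z \rangle + \frac{\rho}{2}\,\|AX - Z\|_F^2 \\
  &= f(X) + g(Z) + \frac{\rho}{2}\,\|AX - Z + U\|_F^2 - \frac{\rho}{2}\,\|U\|_F^2,
  \qquad U = \Lambda / \rho .
\end{aligned}
```

The last term does not depend on $X$ or $Z$, so it can be dropped from both sub-problems, and the multiplier ascent step $\Lambda \leftarrow \Lambda + \rho\,(AX - Z)$ becomes $U \leftarrow U + AX - Z$ after dividing by $\rho$.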


MADMM

Algorithm 2: MADMM for non-smooth optimization on manifold $\mathcal{M}$

Initialize $k \leftarrow 1$, $Z^{(1)} = AX^{(1)}$, $U^{(1)} = 0$
repeat
  X-step: $X^{(k+1)} = \arg\min_{X \in \mathcal{M}} f(X) + \frac{\rho}{2} \|AX - Z^{(k)} + U^{(k)}\|_F^2$
  Z-step: $Z^{(k+1)} = \arg\min_{Z} g(Z) + \frac{\rho}{2} \|AX^{(k+1)} - Z + U^{(k)}\|_F^2$
  Update $U^{(k+1)} = U^{(k)} + AX^{(k+1)} - Z^{(k+1)}$
  $k \leftarrow k + 1$
until convergence

The solver and the number of optimization iterations in the X- and Z-steps are free choices

X-step and Z-step in some problems have a closed form

Parameter $\rho > 0$ can be chosen fixed or adapted

Kovnatsky, Glashoff, B 2015
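To make the three updates concrete, here is a sketch on a deliberately simple instance in which both steps have a closed form. The instance is an assumption chosen for illustration, not an example from the talk: the manifold is the unit sphere, A = I, f(x) = -b^T x and g = mu * ||.||_1. On the sphere the X-step objective reduces to -x^T c plus a constant, with c = b + rho (z - u), so its minimizer is c normalized; the Z-step is entrywise soft-thresholding. No convergence claim is made for this non-convex problem.

```python
import numpy as np

def soft_threshold(y, t):
    """Closed-form Z-step for g = mu * ||.||_1: entrywise shrinkage by t."""
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

def madmm_sphere(b, mu, rho=1.0, iters=100):
    """MADMM sketch for min over the unit sphere of -b^T x + mu * ||x||_1 (A = I)."""
    x = b / np.linalg.norm(b)   # feasible starting point
    z = x.copy()                # Z^(1) = A X^(1)
    u = np.zeros_like(b)        # U^(1) = 0
    for _ in range(iters):
        # X-step: minimizer of -x^T c over the unit sphere
        c = b + rho * (z - u)
        x = c / np.linalg.norm(c)
        # Z-step: proximal operator of (mu / rho) * ||.||_1
        z = soft_threshold(x + u, mu / rho)
        # Scaled dual update
        u = u + x - z
    return x, z, u

b = np.array([3.0, -2.0, 0.1, 0.05])  # hypothetical data vector
x, z, u = madmm_sphere(b, mu=0.5)
```

The iterate `x` stays on the sphere by construction, while `z` carries the sparsity; the two agree only as the residual `x - z` shrinks.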


Compressed modes


Laplacian eigenfunctions

The first k eigenfunctions of some Laplacian are used in...

Spectral clustering; dimensionality reduction; spectral distances

Ng et al. 2001; Belkin, Niyogi 2001; Coifman, Lafon 2006


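Laplacian eigenfunctions

Find the first $k$ eigenfunctions of an $n \times n$ Laplacian matrix $\Delta$

$\min_{\Phi \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) \quad \text{s.t.} \quad \Phi^\top \Phi = I$

$\operatorname{tr}(\Phi^\top \Delta \Phi) = \sum_{ij} w_{ij} \|\phi_i - \phi_j\|_F^2$, a.k.a. Dirichlet energy in physics

Many efficient solvers with global optimality guarantees

1D Euclidean Laplacian eigenfunctions = Fourier basis

$\Delta e^{-i\omega x} = -\omega^2 e^{-i\omega x}$

Globally supported!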
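The statements on this slide can be checked numerically on a small example. The sketch below assumes a graph Laplacian of the form D - W on an unweighted path graph (a discrete stand-in for the 1D case, chosen here for illustration) and reads the Dirichlet-energy sum as running over unordered pairs of vertices, with phi_i the i-th row of Phi.

```python
import numpy as np

n, k = 100, 6

# Unweighted path graph on n vertices: w_ij = 1 for neighbours, 0 otherwise
W = np.zeros((n, n))
idx = np.arange(n - 1)
W[idx, idx + 1] = 1.0
W[idx + 1, idx] = 1.0
L = np.diag(W.sum(axis=1)) - W   # graph Laplacian, positive semidefinite

# Columns of Phi are eigenvectors of the k smallest eigenvalues
eigvals, eigvecs = np.linalg.eigh(L)
Phi = eigvecs[:, :k]

# Dirichlet energy two ways: trace form vs. sum over pairs of squared row differences
trace_energy = np.trace(Phi.T @ L @ Phi)
i, j = np.triu_indices(n, 1)
edge_energy = np.sum(W[i, j] * np.sum((Phi[i] - Phi[j]) ** 2, axis=1))

# Fraction of entries that are not numerically zero
support = np.mean(np.abs(Phi) > 1e-8)
```

Both energies agree and equal the sum of the k smallest eigenvalues, and `support` is close to 1: nearly every entry of every eigenvector is non-zero, which is the global support that compressed modes aim to remove.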


Laplacian eigenfunctions: 1D example

[Figure: six panels showing $\phi_1, \dots, \phi_6$ on the interval 0 to 100, each with vertical range −0.2 to 0.2]

First eigenfunctions of a 1D Euclidean Laplacian


Laplacian eigenfunctions: non-Euclidean example

[Figure: eigenfunctions color-coded on a triangular mesh; color scale from min through 0 to max]

First eigenfunctions of a Laplacian on a triangular mesh

Neumann et al. 2014


Compressed modes

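$\min_{\Phi \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|\Phi\|_1 \quad \text{s.t.} \quad \Phi^\top \Phi = I$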

Dirichlet energy = smoothness

L1-norm = sparsity

Smoothness + sparsity = localization

Ozolins et al. 2013


Compressed modes: 1D example

[Figure: six panels showing $\phi_1, \dots, \phi_6$ on the interval 0 to 100; vertical range −2 to 6 in the first panel and −5 to 5 in the others]

First compressed modes of a 1D Euclidean Laplacian

Ozolins et al. 2013


Compressed modes: non-Euclidean example

[Figure: functions color-coded on a triangular mesh, shown in two versions: first compressed modes and first Laplacian eigenfunctions; color scale from min through 0 to max]

Neumann et al. 2014


Wannier functions

Maximally-localized Wannier functions in Si and GaAs crystals

Wannier 1937; Mostofi 2008


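Splitting method for orthogonality constraints (SOC)

$\min_{\Phi, P, Q \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|Q\|_1 \quad \text{s.t.} \quad P = \Phi,\ Q = \Phi,\ P^\top P = I$

Algorithm 3: SOC method for computing compressed modes

Initialize $k \leftarrow 1$, $\Phi^{(1)}$, $P^{(1)} = Q^{(1)} = \Phi^{(1)}$, $U^{(1)} = V^{(1)} = 0$
repeat
  $\Phi^{(k+1)} = \arg\min_{\Phi} \operatorname{tr}(\Phi^\top \Delta \Phi) + \frac{\rho}{2} \|\Phi - Q^{(k)} + U^{(k)}\|_F^2 + \frac{\rho'}{2} \|\Phi - P^{(k)} + V^{(k)}\|_F^2$
  $Q^{(k+1)} = \arg\min_{Q} \mu \|Q\|_1 + \frac{\rho}{2} \|\Phi^{(k+1)} - Q + U^{(k)}\|_F^2$
  $P^{(k+1)} = \arg\min_{P} \frac{\rho'}{2} \|\Phi^{(k+1)} - P + V^{(k)}\|_F^2 \quad \text{s.t.} \quad P^\top P = I$
  $U^{(k+1)} = U^{(k)} + \Phi^{(k+1)} - Q^{(k+1)}$
  $V^{(k+1)} = V^{(k)} + \Phi^{(k+1)} - P^{(k+1)}$
  $k \leftarrow k + 1$
until convergence

Lai, Osher 2014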
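The P-step of Algorithm 3 is the only place where the orthogonality constraint is enforced. It asks for the closest matrix with orthonormal columns to $\Phi^{(k+1)} + V^{(k)}$ in Frobenius norm, which is a standard orthogonal Procrustes-type problem solved by the polar factor of an SVD. The sketch below illustrates that fact; the function name and the random test data are assumptions made here and do not come from the slides.

```python
import numpy as np

def project_stiefel(Y):
    """Closest matrix with orthonormal columns to Y in Frobenius norm (polar factor)."""
    W, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return W @ Vt

rng = np.random.default_rng(0)
Y = rng.standard_normal((8, 3))   # stands in for Phi^(k+1) + V^(k)
P = project_stiefel(Y)

# Compare against other feasible points: orthonormalized random matrices
dist_P = np.linalg.norm(Y - P)
others = [np.linalg.qr(rng.standard_normal((8, 3)))[0] for _ in range(200)]
dist_others = min(np.linalg.norm(Y - Q) for Q in others)
```

`P` has orthonormal columns, and `dist_P` is no larger than the distance from `Y` to any of the sampled feasible matrices.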


Compressed modes as manifold optimization

$\min_{\Phi \in S(n,k)} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|\Phi\|_1$

Stiefel manifold S(n, k) = {X ∈ Rn×k ∶X⊺X = I}

Sub-problem w.r.t. Φ: smooth manifold-constrained minimization ofa quadratic function

minΦ∈S(n,k)

tr(Φ⊺∆Φ) + ρ2∥Φ −Z +U∥2F

Sub-problem w.r.t. Z: sparse coding (Lasso) problem

minZ∥Z∥1 + ρ

2∥Φ +U −Z∥2F

Kovnatsky, Glashoff, B 2015

; Chen et al. 1995; Tibshirani 1996

Page 59: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015  · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:

23/54

Compressed modes as manifold optimization

minΦ∈S(n,k),Z

tr(Φ⊺∆Φ) + µ∥Z∥1 + ρ2∥Φ −Z +U∥2F

Stiefel manifold S(n, k) = {X ∈ Rn×k ∶X⊺X = I}

Sub-problem w.r.t. Φ: smooth manifold-constrained minimization ofa quadratic function

minΦ∈S(n,k)

tr(Φ⊺∆Φ) + ρ2∥Φ −Z +U∥2F

Sub-problem w.r.t. Z: sparse coding (Lasso) problem

minZ∥Z∥1 + ρ

2∥Φ +U −Z∥2F

Kovnatsky, Glashoff, B 2015

; Chen et al. 1995; Tibshirani 1996

Page 60: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015  · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:

23/54

Compressed modes as manifold optimization

minΦ∈S(n,k),Z

tr(Φ⊺∆Φ) + µ∥Z∥1 + ρ2∥Φ −Z +U∥2F

Stiefel manifold S(n, k) = {X ∈ Rn×k ∶X⊺X = I}Sub-problem w.r.t. Φ: smooth manifold-constrained minimization ofa quadratic function

minΦ∈S(n,k)

tr(Φ⊺∆Φ) + ρ2∥Φ −Z +U∥2F

Sub-problem w.r.t. Z: sparse coding (Lasso) problem

minZ∥Z∥1 + ρ

2∥Φ +U −Z∥2F

Kovnatsky, Glashoff, B 2015

; Chen et al. 1995; Tibshirani 1996

Page 61: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015  · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:

23/54

Compressed modes as manifold optimization

minΦ∈S(n,k),Z

tr(Φ⊺∆Φ) + µ∥Z∥1 + ρ2∥Φ −Z +U∥2F

Stiefel manifold S(n, k) = {X ∈ Rn×k ∶X⊺X = I}Sub-problem w.r.t. Φ: smooth manifold-constrained minimization ofa quadratic function

minΦ∈S(n,k)

tr(Φ⊺∆Φ) + ρ2∥Φ −Z +U∥2F

Sub-problem w.r.t. Z: sparse coding (Lasso) problem

minZ∥Z∥1 + ρ

2∥Φ +U −Z∥2F

Kovnatsky, Glashoff, B 2015; Chen et al. 1995; Tibshirani 1996

24/54

Compressed modes by MADMM

Algorithm 4 MADMM for computing compressed modes

Input $n \times n$ Laplacian matrix $\Delta$, parameter $\mu > 0$

Output $k$ first compressed modes of $\Delta$

Initialize $k \leftarrow 1$, $\Phi^{(1)} \leftarrow$ some orthonormal matrix, $Z^{(1)} = \Phi^{(1)}$, $U^{(1)} = 0$

repeat

$$\Phi^{(k+1)} = \arg\min_{\Phi \in S(n,k)} \operatorname{tr}(\Phi^\top \Delta \Phi) + \frac{\rho}{2} \|\Phi - Z^{(k)} + U^{(k)}\|_F^2$$

$$Z^{(k+1)} = \operatorname{Shrink}_{\mu/\rho}(\Phi^{(k+1)} + U^{(k)})$$

Update $U^{(k+1)} = U^{(k)} + \Phi^{(k+1)} - Z^{(k+1)}$

$k \leftarrow k + 1$

until convergence

where $\operatorname{Shrink}_{\alpha}(x) = \operatorname{sign}(x) \max\{0, |x| - \alpha\}$ is the shrinkage operator

Kovnatsky, Glashoff, B 2015
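The loop of Algorithm 4 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Φ-step is only approximated by a few Riemannian gradient steps with a QR retraction (the slides report trust-region and conjugate-gradient solvers from Manopt), and the names `phi_step`, `madmm_compressed_modes`, the step size, the iteration counts and the path-graph test Laplacian are all hypothetical choices made for the example.

```python
import numpy as np

def shrink(x, alpha):
    """Element-wise shrinkage: sign(x) * max(0, |x| - alpha)."""
    return np.sign(x) * np.maximum(0.0, np.abs(x) - alpha)

def phi_step(L, Phi, Z, U, rho, n_inner=20):
    """Approximate Phi-step on the Stiefel manifold (assumption: plain
    Riemannian gradient descent with QR retraction, not a Manopt solver)."""
    step = 1.0 / (2.0 * np.linalg.norm(L, 2) + rho)   # conservative step size
    for _ in range(n_inner):
        G = 2.0 * L @ Phi + rho * (Phi - Z + U)        # Euclidean gradient
        R = G - Phi @ (0.5 * (Phi.T @ G + G.T @ Phi))  # projection onto tangent space
        Q, Rq = np.linalg.qr(Phi - step * R)           # retraction back to the manifold
        Phi = Q * np.where(np.diag(Rq) < 0, -1.0, 1.0) # fix the sign ambiguity of QR
    return Phi

def madmm_compressed_modes(L, k, mu, rho=1.0, n_iter=50, seed=0):
    """Alternate the Phi-step, the shrinkage Z-step and the U update."""
    rng = np.random.default_rng(seed)
    Phi, _ = np.linalg.qr(rng.standard_normal((L.shape[0], k)))  # some orthonormal matrix
    Z, U = Phi.copy(), np.zeros_like(Phi)
    for _ in range(n_iter):
        Phi = phi_step(L, Phi, Z, U, rho)
        Z = shrink(Phi + U, mu / rho)
        U = U + Phi - Z
    return Phi, Z

# Hypothetical test problem: Laplacian of a path graph with 30 vertices.
n = 30
L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1.0
Phi, Z = madmm_compressed_modes(L, k=3, mu=0.1)
```

The orthogonality constraint is carried by `Phi`, while `Z` carries the sparsity; the two agree only in the limit, so a practical stopping rule would monitor the norm of `Phi - Z`.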

25/54

Convergence

Convergence of MADMM with different random initializations (compressed modes problem of size n = 500, k = 10)

[Plot: cost vs. time (sec), logarithmic axes]

Kovnatsky, Glashoff, B 2015

26/54

Convergence

Convergence of MADMM with different X-step solvers (compressed modes problem of size n = 500, k = 10)

[Plot: cost vs. time (sec), logarithmic axes; curves: trust regions, conjugate gradients]

Kovnatsky, Glashoff, B 2015; Manopt: Boumal et al. 2014

27/54

Convergence

Example of convergence of different methods (compressed modes problem of size n = 8 × 10³, k = 10)

[Plot: cost (logarithmic axis) vs. time (sec); curves: Lai & Osher, Neumann et al., MADMM]

Kovnatsky, Glashoff, B 2015; Lai, Osher 2014; Neumann et al. 2014

28/54

Scalability

Complexity of different methods (compressed modes problem of size n, k = 10)

[Plot: time per iteration (sec, logarithmic axis) vs. problem size n; curves: Lai & Osher, Neumann, MADMM]

Kovnatsky, Glashoff, B 2015; Lai, Osher 2014; Neumann et al. 2014

29/54

Functional correspondence

30/54

Applications of shape correspondence

Texture mapping Pose transfer

B2, Kimmel 2007; Sumner et al. 2004

31/54

Shape correspondence

Point-wise map $t: S \to Q$, taking a point $s \in S$ to a point $q \in Q$

Minimum-distortion point-wise map $t: S \to Q$ (B2, Kimmel 2006)

Functional map $T: \mathcal{F}(S) \to \mathcal{F}(Q)$, a linear map taking a function $f \in \mathcal{F}(S)$ to a function $g \in \mathcal{F}(Q)$ (Ovsjanikov et al. 2012)

32/54

Functional correspondence

$$f \approx a_1 \phi_1 + a_2 \phi_2 + \cdots + a_k \phi_k, \qquad g \approx b_1 \psi_1 + b_2 \psi_2 + \cdots + b_k \psi_k$$

$T$ maps $f$ to $g$; $C$ maps the coefficients $(a_1, \dots, a_k)$ to $(b_1, \dots, b_k)$

Representation in truncated Laplacian eigenbasis, $T \approx \Psi C \Phi^\top$

If $T$ is area-preserving, $C^\top C = I$

Represent $C = X Y^\top$, then $T \approx \Psi C \Phi^\top = \Psi X (\Phi Y)^\top$ (rotation of bases)

Given known corresponding functions $F = (f_1, \dots, f_q)$ and $G = (g_1, \dots, g_q)$, find $C$ by solving the linear system $C \Phi^\top F = \Psi^\top G$

Equivalently, given the corresponding Fourier coefficients $A = \Phi^\top F$ and $B = \Psi^\top G$, find $C$ by solving the linear system $C A = B$

Ovsjanikov et al. 2012
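The linear system $CA = B$ above can be solved in the least-squares sense with a standard solver. The sketch below is only an illustration: the helper name `fit_functional_map` and the synthetic coefficient matrices are assumptions made for the example, with an orthogonal ground-truth $C$ standing in for an area-preserving map.

```python
import numpy as np

def fit_functional_map(A, B):
    """Least-squares solution C of C A = B, with A, B of size k x q."""
    # C A = B is equivalent to A^T C^T = B^T, the form np.linalg.lstsq expects.
    Ct, *_ = np.linalg.lstsq(A.T, B.T, rcond=None)
    return Ct.T

# Synthetic data (hypothetical): q = 40 corresponding functions, k = 5 basis functions.
rng = np.random.default_rng(0)
k, q = 5, 40
C_true, _ = np.linalg.qr(rng.standard_normal((k, k)))  # orthogonal ground truth
A = rng.standard_normal((k, q))                        # coefficients on the first shape
B = C_true @ A                                         # coefficients on the second shape
C = fit_functional_map(A, B)
```

With at least $k$ linearly independent corresponding functions the system determines $C$ uniquely; with noisy data the same call returns the Frobenius-norm minimizer.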

33/54

Functional correspondence in shape collection

[Figure: collection of shapes $S_1, S_2, \dots, S_L$ with a pair $S_i$, $S_j$ highlighted, and the matrices $X_1, X_2, \dots, X_L$ associated with the shapes]

Pairwise functional maps: $A_{ij} C_{ij} \approx B_{ij}$

Per-shape matrices: $A_{ij} X_i \approx B_{ij} X_j$

Kovnatsky, B2, Glashoff, Kimmel 2013; Kovnatsky, Glashoff, B 2015

34/54

Functional correspondence as manifold optimization

$$\min_{(X_1, \dots, X_L) \in S^L(k,k)} \sum_{i \neq j} \|A_{ij} X_i - B_{ij} X_j\|_{2,1} + \mu \sum_{i=1}^{L} \operatorname{tr}(X_i^\top \Lambda_i X_i)$$

where $A_{ij}, B_{ij}$ are Fourier coefficients of given corresponding functions on shapes $S_i, S_j$, and $\Lambda_i$ is the diagonal matrix of the first $k$ eigenvalues of the Laplacian $\Delta_{S_i}$

Joint diagonalization of Laplacians: find new bases $\hat{\Phi}_i = \Phi_i X_i$ that approximately diagonalize $\Delta_{S_i}$ and are coupled

Optimization on a product of Stiefel manifolds $S^L(k,k)$

The $L_{2,1}$-norm copes with outliers in the correspondence data

X-step: manifold-constrained minimization of a quadratic function

Z-step: one iteration of shrinkage

Eynard, Kovnatsky, B2, Glashoff 2012; Kovnatsky, Glashoff, B 2015
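For an $L_{2,1}$ term, the shrinkage in the Z-step acts on whole groups rather than on single entries. The sketch below assumes the common convention in which the $L_{2,1}$-norm sums the Euclidean norms of the rows; the slides do not state which axis is grouped, and the helper names `l21_norm` and `shrink_rows` are hypothetical.

```python
import numpy as np

def l21_norm(M):
    """Sum of the Euclidean norms of the rows of M (assumed convention)."""
    return np.linalg.norm(M, axis=1).sum()

def shrink_rows(M, alpha):
    """Group shrinkage: scale each row by max(0, 1 - alpha / ||row||_2)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - alpha / np.maximum(norms, 1e-12))  # avoid 0/0
    return scale * M

M = np.array([[3.0, 4.0],    # large residual: shortened but kept
              [0.3, 0.4],    # small residual: set to zero
              [0.0, 0.0]])
S = shrink_rows(M, 1.0)
```

Rows whose norm is below the threshold are zeroed as a block, and the penalty grows only linearly with the size of a large residual, which is the mechanism behind the robustness to outliers noted above.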

35/54

Correspondence data

Example of correspondence data (10% of outliers shown in red)

Kovnatsky, Glashoff, B 2015

36/54

Correspondence quality

Robust (MADMM)

Least squares

Kovnatsky, Glashoff, B 2015

37/54

Correspondence quality

Correspondence quality evaluated using Princeton protocol

[Plot: % of correspondences vs. % geodesic diameter; curves: LS, MADMM]

Kovnatsky, Glashoff, B 2015; data: B2, Kimmel 2008, benchmark: Kim et al. 2011

38/54

Convergence

Convergence of different methods

[Plot: cost (logarithmic axis) vs. time (sec); curves: Smoothing, MADMM; the labels 10⁻⁴, 10⁻⁶, 10⁻⁸ appear on the plot]

Kovnatsky, Glashoff, B 2015

39/54

Multimodal spectral clustering

[Figure: four clustering results, labeled as follows]

Uncoupled, no outliers: 100%

Coupled (L2), no outliers: 53%

Coupled (L2), 10% outliers: 72%

Coupled (L2,1), 10% outliers: 82%

Kovnatsky, Glashoff, B 2015; Eynard, Kovnatsky, B2, Glashoff 2012

40/54

Multidimensional scaling

41/54

Multidimensional scaling

[Figure: example distance matrix D and point configuration X]

MDS problem: given an $n \times n$ (squared) distance matrix $D$, find a $k$-dimensional configuration of points $X \in \mathbb{R}^{n \times k}$ such that

$$\|x_i - x_j\|_2^2 \approx d_{ij}$$

Cayton, Dasgupta 2006

42/54

Similarity vs Distance

Equivalence between distances and similarities

(Squared) distances $D$ (EDM) and similarities $B$ (PSD)

From similarities to distances: $\operatorname{dist}(B)_{ij} = b_{ii} + b_{jj} - 2 b_{ij}$

From distances to similarities: $B = -\tfrac{1}{2} H D H$, where $H = I - \tfrac{1}{n} \mathbf{1}\mathbf{1}^\top$ is the double-centering matrix

$B^* = U \Lambda_+ U^\top$, keeping the non-negative eigenvalues $\Lambda_+$ of $B = U \Lambda U^\top$

Schoenberg 1938; Dattorro 2005; Cayton, Dasgupta 2006
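The two maps between distances and similarities can be checked numerically. The sketch below is a small illustration with hypothetical helper names (`dist_from_gram`, `gram_from_dist`) and random test points; the points are centered so that double centering recovers the Gram matrix exactly.

```python
import numpy as np

def dist_from_gram(B):
    """Squared distances from similarities: d_ij = b_ii + b_jj - 2 b_ij."""
    d = np.diag(B)
    return d[:, None] + d[None, :] - 2.0 * B

def gram_from_dist(D):
    """Similarities from squared distances by double centering: B = -1/2 H D H."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * H @ D @ H

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2))
X -= X.mean(axis=0)        # centered configuration
B = X @ X.T                # similarity (Gram) matrix, PSD of rank 2
D = dist_from_gram(B)      # Euclidean (squared) distance matrix
B_back = gram_from_dist(D)
```

For an uncentered configuration the round trip returns the Gram matrix of the centered points, since distances do not depend on translation.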

43/54

Classical MDS

Algorithm 5 Classical MDS

Input squared distance matrix $D$

Compute similarity by double centering: $B = -\tfrac{1}{2} H D H$

Perform eigendecomposition $B = U \Lambda U^\top$ and take the $k$ largest positive eigenvalues $\Lambda_k$ and corresponding eigenvectors $U_k$

Output $X = U_k \Lambda_k^{1/2}$

Classical MDS as optimization problem: minimize the strain

$$\min_{X \in \mathbb{R}^{n \times k}} \left\| -\tfrac{1}{2} H D H - X X^\top \right\|_F^2$$

Young, Householder 1938
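Algorithm 5 translates almost line by line into NumPy. The sketch below is an illustration with hypothetical function names; clipping slightly negative eigenvalues to zero is an added numerical safeguard, not a step from the slide.

```python
import numpy as np

def classical_mds(D, k):
    """Classical MDS: double centering, eigendecomposition, top-k embedding."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ D @ H
    w, U = np.linalg.eigh(B)              # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]         # indices of the k largest eigenvalues
    w_k = np.clip(w[idx], 0.0, None)      # safeguard against tiny negative values
    return U[:, idx] * np.sqrt(w_k)       # X = U_k Lambda_k^{1/2}

def squared_distances(X):
    """Matrix of squared Euclidean distances between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * X @ X.T

rng = np.random.default_rng(1)
X_true = rng.standard_normal((8, 2))
D = squared_distances(X_true)
X = classical_mds(D, k=2)
```

The recovered `X` matches `X_true` only up to a rotation, reflection and translation, so the comparison is made on the distances rather than on the coordinates.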

44/54

Sensitivity to outliers

Error dispersion by double-centering, $B = -\tfrac{1}{2} H D H$

[Figure: an error $\varepsilon$ in one entry of the (squared) distance matrix spreads over the whole similarity matrix, with entries of order $\varepsilon$, $\varepsilon/n$ and $\varepsilon/n^2$]

Cayton, Dasgupta 2006
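The dispersion effect is easy to reproduce numerically. The sketch below perturbs a single pair of a distance matrix and looks at the induced change in the similarity matrix; because double centering is linear, that change is the double-centered perturbation itself. The sizes and names are assumptions chosen for the example.

```python
import numpy as np

def double_center(D):
    """B = -1/2 H D H with H = I - (1/n) 1 1^T."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * H @ D @ H

n, eps = 100, 1.0
E = np.zeros((n, n))
E[0, 1] = E[1, 0] = eps            # one corrupted (symmetric) distance entry
dB = np.abs(double_center(E))      # magnitude of the change in the similarity matrix

same_pair = dB[0, 1]               # order eps
same_row = dB[0, 5]                # order eps / n
elsewhere = dB[5, 6]               # order eps / n^2
```

Every entry of the similarity matrix changes, which is why a single outlier in the distances can distort the whole classical MDS embedding.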

45/54

Sensitivity to outliers

[Figure: map of 10 US cities: Seattle, SF, LA, Denver, NY, WDC, Atlanta, Miami, Houston, Chicago]

Distances between 10 US cities computed with classical MDS, with the distance between NY and LA doubled

Kruskal, Wish 1978; Cayton, Dasgupta 2006

46/54

Robust Euclidean embedding (REE)

Minimize a robust norm (instead of the Frobenius norm)

$$\min_{D^* \in \mathrm{EDM}} \|D - D^*\|_1$$

and then recover $k$-dimensional $X$ from $D^*$ using classical MDS

Non-smooth

Can be formulated as a semi-definite program (SDP), or

Solved by subgradient minimization

Cayton, Dasgupta 2006

47/54

REE as manifold optimization

$$\min_{B \in S_+(n,k)} \|D - \operatorname{dist}(B)\|_1$$

Manifold of fixed-rank positive semi-definite matrices $S_+(n,k) = \{X \in \mathbb{R}^{n \times n} : X = X^\top \succeq 0,\ \operatorname{rank}(X) = k\}$

Only non-smooth function ($f \equiv 0$)

X-step: manifold-constrained minimization of a quadratic function

Z-step: one iteration of shrinkage

Kovnatsky, Glashoff, B 2015

48/54

REE by MADMM

Algorithm 6 MADMM for solving the REE problem

Input squared distance matrix $D$

Initialize $k \leftarrow 1$, $Z^{(1)} = X^{(1)}$, $U^{(1)} = 0$

repeat

X-step: $$B^{(k+1)} = \arg\min_{B \in S_+(n,k)} \|\operatorname{dist}(B) - Z^{(k)} - D + U^{(k)}\|_F^2$$

Z-step: $$Z^{(k+1)} = \operatorname{Shrink}_{1/\rho}(\operatorname{dist}(B^{(k+1)}) - D + U^{(k)})$$

Update $U^{(k+1)} = U^{(k)} + \operatorname{dist}(B^{(k+1)}) - D - Z^{(k+1)}$

$k \leftarrow k + 1$

until convergence

Kovnatsky, Glashoff, B 2015
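The Z-step and the multiplier update of Algorithm 6 are direct to write down; the X-step is the part that needs a manifold solver. The sketch below makes two labeled assumptions: the X-step is replaced by a crude stand-in (`psd_rank_k`, a classical-MDS fit of the current target, which is not an exact minimizer over $S_+(n,k)$), and $Z$ is initialized as $\operatorname{dist}(B) - D$. All function names are hypothetical.

```python
import numpy as np

def dist(B):
    """dist(B)_ij = b_ii + b_jj - 2 b_ij."""
    d = np.diag(B)
    return d[:, None] + d[None, :] - 2.0 * B

def shrink(x, alpha):
    """Element-wise shrinkage: sign(x) * max(0, |x| - alpha)."""
    return np.sign(x) * np.maximum(0.0, np.abs(x) - alpha)

def psd_rank_k(T, k):
    """Stand-in X-step (assumption): rank-k PSD matrix fitted to the target
    distances T by double centering and eigenvalue truncation."""
    n = T.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    w, V = np.linalg.eigh(-0.5 * H @ T @ H)
    idx = np.argsort(w)[::-1][:k]
    return (V[:, idx] * np.clip(w[idx], 0.0, None)) @ V[:, idx].T

def ree_madmm(D, k, rho=1.0, n_iter=20):
    B = psd_rank_k(D, k)                  # start from the classical MDS solution
    Z = dist(B) - D                       # assumed initialization of the split variable
    U = np.zeros_like(D)
    for _ in range(n_iter):
        B = psd_rank_k(D + Z - U, k)      # heuristic X-step toward dist(B) = D + Z - U
        R = dist(B) - D
        Z = shrink(R + U, 1.0 / rho)      # Z-step
        U = U + R - Z                     # multiplier update
    return B

# Sanity check on clean data (hypothetical): exact distances of 2-D points.
rng = np.random.default_rng(0)
P = rng.standard_normal((10, 2))
D = dist(P @ P.T)
B = ree_madmm(D, k=2)
```

On clean distances the residual stays at zero and the iteration leaves the classical MDS solution unchanged; replacing `psd_rank_k` with a proper fixed-rank PSD solver is what the algorithm on the slide calls for.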

49/54

Robust Euclidean embedding example

[Figure panels: Ground truth, Classical MDS, MADMM]

Embedding of distances between 500 US cities corrupted by sparse noise (doubling the distance between a few pairs of cities)

Kovnatsky, Glashoff, B 2015

50/54

Scalability of REE

Complexity of different methods for REE problem of different size n

[Plot: time per iteration (sec, logarithmic axis) vs. problem size n; curves: SDP, Subgradient, MADMM]

Kovnatsky, Glashoff, B 2015; Cayton, Dasgupta 2006

51/54

Convergence

Convergence of different methods on REE problem of size n = 500

[Plot: stress (logarithmic axis) vs. time (sec); curves: Subgradient, MADMM; the labels 10⁻², 10⁻³, 10⁻⁴, 10⁻⁵ appear on the plot]

Kovnatsky, Glashoff, B 2015; Cayton, Dasgupta 2006

52/54

Conclusions

Non-smooth manifold optimization problems are ubiquitous in machine learning, pattern recognition, signal processing, and computer graphics applications

MADMM is a generic algorithm for such problems

Any manifold, any function

Very simple to implement

No parameters to tune

A. Kovnatsky, K. Glashoff, M. M. Bronstein, ‘MADMM: a generic algorithm for non-smooth optimization on manifolds’, arXiv:1505.07676, May 2015

53/54

A. Kovnatsky

Funded by

54/54

Thank you!