A dual certificate analysis of compressive off-the-grid recoveryA dual certificate analysis of...

Preview:

Citation preview

A dual certificate analysis of compressive off-the-grid recovery

Nicolas KerivenEcole Normale Supérieure (Paris)

CFM-ENS chair in Data Science

Joint work with Clarice Poon (Cambridge Uni.), Gabriel Peyré (ENS)

Journée GdR MIA, May 3rd 2018

Discrete compressive sensing

1/14

Discrete compressive sensing

1/14

Discrete compressive sensing

1/14

• Signal: vector

Discrete compressive sensing

1/14

• Signal: vector• Sparsity: few non-zeros coefficients

Discrete compressive sensing

1/14

• Signal: vector• Sparsity: few non-zeros coefficients• Dimensionality reduction (often random matrix)

Discrete compressive sensing

1/14

• Signal: vector• Sparsity: few non-zeros coefficients• Dimensionality reduction (often random matrix)• Recovery: convex relaxation LASSO

Discrete compressive sensing

1/14

Off-the-grid recovery: « super-resolution »

2/14

Off-the-grid recovery: « super-resolution »

2/14

- Fourier

- Convolution

Off-the-grid recovery: « super-resolution »

2/14

- Fourier

- Convolution

Off-the-grid recovery: « super-resolution »

2/14

- Fourier

- Convolution

Off-the-grid recovery: « super-resolution »

• Signal: Radon measure

2/14

- Fourier

- Convolution

Off-the-grid recovery: « super-resolution »

• Signal: Radon measure• Sparsity:

2/14

- Fourier

- Convolution

Off-the-grid recovery: « super-resolution »

• Signal: Radon measure• Sparsity:• Dimensionality reduction (e.g. first Fourier coefficients)

2/14

- Fourier

- Convolution

Off-the-grid recovery: « super-resolution »

• Signal: Radon measure• Sparsity:• Dimensionality reduction (e.g. first Fourier coefficients)• Recovery: convex relaxation?

See Keriven 2017, Gribonval 2017

2/14

- Fourier

- Convolution

Off-the-grid recovery: « super-resolution »

• Signal: Radon measure• Sparsity:• Dimensionality reduction (e.g. first Fourier coefficients)• Recovery: convex relaxation? BLASSO [De Castro, Gamboa 2012]

See Keriven 2017, Gribonval 2017

2/14

- Fourier

- Convolution

Off-the-grid recovery: « super-resolution »

• Signal: Radon measure• Sparsity:• Dimensionality reduction (e.g. first Fourier coefficients)• Recovery: convex relaxation? BLASSO [De Castro, Gamboa 2012]

Other approaches: « Prony-like » ESPRIT, MUSIC… (but only 1d noiseless Fourier)

See Keriven 2017, Gribonval 2017

2/14

Example of applications

Fluorescence microscopy[Betzig 2006]

3/14

Example of applications

Fluorescence microscopy[Betzig 2006]

Astronomy[Puschmann 2017]

3/14

Example of applications

Fluorescence microscopy[Betzig 2006]

Compressive k-means(GMM…) [Keriven 2017]

Astronomy[Puschmann 2017]

3/14

Example of applications

Fluorescence microscopy[Betzig 2006]

Compressive k-means(GMM…) [Keriven 2017]

Astronomy[Puschmann 2017]

• Neuro-imaging with EEG [Gramfort 2013]

• 1-layer neural network [Bach 2017]

• Radar• Geophysics• …

3/14

A seminal result [Candès, Fernandez-Granda 2012]

4/14

A seminal result [Candès, Fernandez-Granda 2012]

First Fourier coefficients

4/14

A seminal result [Candès, Fernandez-Granda 2012]

First Fourier coefficients

Inverse Fourier

4/14

A seminal result [Candès, Fernandez-Granda 2012]

First Fourier coefficients

Inverse Fourier

BLASSO

4/14

A seminal result [Candès, Fernandez-Granda 2012]

First Fourier coefficients

• Minimal separation

Inverse Fourier

BLASSO

4/14

A seminal result [Candès, Fernandez-Granda 2012]

First Fourier coefficients

• Minimal separation

• Regular Fourier on the Torus

Inverse Fourier

BLASSO

4/14

A seminal result [Candès, Fernandez-Granda 2012]

First Fourier coefficients

• Minimal separation

• Regular Fourier on the Torus

• Reconstruction: formulated as SDP (other: Frank-Wolfe,

greedy, Prony-like…)

Inverse Fourier

BLASSO

4/14

A seminal result [Candès, Fernandez-Granda 2012]

First Fourier coefficients

• Minimal separation

• Regular Fourier on the Torus

• Reconstruction: formulated as SDP (other: Frank-Wolfe,

greedy, Prony-like…)

• Noise: weak convergence (mass of concentrated

around true Diracs)

Inverse Fourier

BLASSO

4/14

A seminal result [Candès, Fernandez-Granda 2012]

First Fourier coefficients

• Minimal separation

• Regular Fourier on the Torus

• Reconstruction: formulated as SDP (other: Frank-Wolfe,

greedy, Prony-like…)

• Noise: weak convergence (mass of concentrated

around true Diracs)

Inverse Fourier

BLASSO

4/14

Randomness ? Numberof measurements ?

Compressed sensing off-the-grid [Tang, Recht 2013]

5/14

Compressed sensing off-the-grid [Tang, Recht 2013]

Random Fourier coefficients

random

5/14

Compressed sensing off-the-grid [Tang, Recht 2013]

Random Fourier coefficients

BLASSO

random

5/14

Compressed sensing off-the-grid [Tang, Recht 2013]

Random Fourier coefficients

• As in compressive sensing, randomFourier sampling is possible

But:• Limited to 1d Regular Fourier (relies

heavily on previous work by Candès)

• Random signs assumption

BLASSO

random

5/14

Compressed sensing off-the-grid [Tang, Recht 2013]

Random Fourier coefficients

• As in compressive sensing, randomFourier sampling is possible

But:• Limited to 1d Regular Fourier (relies

heavily on previous work by Candès)

• Random signs assumption

BLASSO

Questions:

• More general sampling scheme ?

• Multi-dimensional result ?

• Get rid of random signs ?

random

5/14

Outline

Background on dual certificates

Compressive off-the-grid recovery

Conclusion, outlooks

Dual certificates

Measurements

Hilbert space

6/14

Dual certificates

Measurements

Hilbert space

BLASSO

6/14

Dual certificates

Measurements

Hilbert space

BLASSO

First-order conditions

solution ofBLASSO

6/14

solution of

Dual certificates

Measurements

Hilbert space

BLASSO

First-order conditions

Dual certificate (noiseless case)

solution ofBLASSO

6/14

What is a dual certificate ?

What does it look like ?

7/14

What is a dual certificate ?

What does it look like ?

7/14

What is a dual certificate ?

What does it look like ?

7/14

Case :

What is a dual certificate ?

What does it look like ?

7/14

Case :

What is a dual certificate ?

What does it look like ?

7/14

Case :

What is a dual certificate ?

What does it look like ?

7/14

Ensures uniqueness and robustness…

Non-degenerate dual certif.

Recovery results

8/14

Recovery results

8/14

Recovery results

Theorem (refinement of [Azaïs et al. 2015])

8/14

Recovery results

Theorem (refinement of [Azaïs et al. 2015])

Hyp: there exists a ND dual certif.

8/14

Recovery results

Theorem (refinement of [Azaïs et al. 2015])

Hyp: there exists a ND dual certif.Result: (wrt Bregman divergence)

- not necessarily sparse, but-

8/14

Recovery results

Theorem (refinement of [Azaïs et al. 2015])

Hyp: there exists a ND dual certif.Result: (wrt Bregman divergence)

- not necessarily sparse, but- Mass of concentrated around

8/14

Recovery results

Theorem (refinement of [Azaïs et al. 2015])

Hyp: there exists a ND dual certif.Result: (wrt Bregman divergence)

- not necessarily sparse, but- Mass of concentrated around- Concentration increases when:

noise

8/14

Recovery results

Theorem (refinement of [Azaïs et al. 2015])

Hyp: there exists a ND dual certif.Result: (wrt Bregman divergence)

- not necessarily sparse, but- Mass of concentrated around- Concentration increases when:

noise

Theorem ([Duval Peyré 2015])

8/14

Recovery results

Theorem (refinement of [Azaïs et al. 2015])

Hyp: there exists a ND dual certif.Result: (wrt Bregman divergence)

- not necessarily sparse, but- Mass of concentrated around- Concentration increases when:

noise

Theorem ([Duval Peyré 2015])

Hyp: the minimal norm certificate is non-degenerate

8/14

Recovery results

Theorem (refinement of [Azaïs et al. 2015])

Hyp: there exists a ND dual certif.Result: (wrt Bregman divergence)

- not necessarily sparse, but- Mass of concentrated around- Concentration increases when:

noise

Theorem ([Duval Peyré 2015])

Hyp: the minimal norm certificate is non-degenerateResult: (in the small noise regime)

- is sparse, with the right number of components

-

8/14

Outline

Background on dual certificates

Compressive off-the-grid recovery

Conclusion, outlooks

Goal

Goal: random sampling

Low-dim, random

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Step 1: Build ND certificate with full kernel

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Step 1: Build ND certificate with full kernel

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Step 1: Build ND certificate with full kernel

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Step 1: Build ND certificate with full kernel

Step 2: Use Random Features on full kernel to define sampling

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Step 1: Build ND certificate with full kernel

Step 2: Use Random Features on full kernel to define sampling

Assuming

Define

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Step 1: Build ND certificate with full kernel

Step 2: Use Random Features on full kernel to define sampling

Assuming

Define

Then

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Step 1: Build ND certificate with full kernel

Step 2: Use Random Features on full kernel to define sampling

Assuming

Define

Then

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Step 1: Build ND certificate with full kernel

Step 2: Use Random Features on full kernel to define sampling

m=10

Assuming

Define

Then

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Step 1: Build ND certificate with full kernel

Step 2: Use Random Features on full kernel to define sampling

m=10 m=50

Assuming

Define

Then

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Goal

Strategy: Start with « high »-dimensional problemGoal: random sampling

Step 1: Build ND certificate with full kernel

Step 2: Use Random Features on full kernel to define sampling

m=500m=10 m=50

Assuming

Define

Then

Low-dim, random Ex: full Fourier (high-dim), full convolution (infinite-dim)

9/14

Step 1: Acceptable full kernels

10/14

Step 1: Acceptable full kernels

10/14

Step 1: Acceptable full kernels

10/14

Step 1: Acceptable full kernels

10/14

Step 1: Acceptable full kernels

Min. Separation

10/14

Step 1: Acceptable full kernels

Min. Separation

1: kernel at each saturation point

10/14

Step 1: Acceptable full kernels

Min. Separation

2: Small adjustements: minimal separation

1: kernel at each saturation point

10/14

Step 1: Acceptable full kernels

Min. Separation

2: Small adjustements: minimal separation

- Multi-d square Féjer kernel (regular Fourier on Torus)

- Multi-d Gaussian kernel

1: kernel at each saturation point

10/14

Step 2: sampling

Thm: Ideal scaling in sparsity: infinite-dim. golfing scheme

11/14

Step 2: sampling

Thm: Ideal scaling in sparsity: infinite-dim. golfing scheme

Depend on kernel

11/14

Step 2: sampling

Thm: Ideal scaling in sparsity: infinite-dim. golfing scheme

• There exists a ND dual certificate (however not the minimal norm certificate)

Depend on kernel

11/14

Step 2: sampling

Thm: Ideal scaling in sparsity: infinite-dim. golfing scheme

• There exists a ND dual certificate (however not the minimal norm certificate)

• No need for random signs

Depend on kernel

11/14

Step 2: sampling

Thm: Ideal scaling in sparsity: infinite-dim. golfing scheme

• There exists a ND dual certificate (however not the minimal norm certificate)

• No need for random signs

• Proof: golfing scheme [Gross 2009, Candès Plan 2011]

Depend on kernel

11/14

Step 2: sampling

Thm: Ideal scaling in sparsity: infinite-dim. golfing scheme

• There exists a ND dual certificate (however not the minimal norm certificate)

• No need for random signs

• Proof: golfing scheme [Gross 2009, Candès Plan 2011]

Thm: Minimal norm certificate (adaptation of [Tang, Recht 2013])

Depend on kernel

With random signs Without random signs

11/14

Step 2: sampling

Thm: Ideal scaling in sparsity: infinite-dim. golfing scheme

• There exists a ND dual certificate (however not the minimal norm certificate)

• No need for random signs

• Proof: golfing scheme [Gross 2009, Candès Plan 2011]

Thm: Minimal norm certificate (adaptation of [Tang, Recht 2013])

Depend on kernel

With random signs Without random signs

11/14

Fun application: convex approach for automatic estimation of number of components in a GMM

Number of measurements in practice ?

Nicolas Keriven

Compressive k-means [Keriven 2017]

Relative number of measurements m/(sd)

12/14

Outline

Background on dual certificates

Compressive off-the-grid recovery

Conclusion, outlooks

Summary, outlooks

• Summary: generalization of existing results on super-resolutionwith random measurements (and minimal separation)

• Beyond Fourier on the Torus (« acceptable » kernels)• Multi-d• No need for random signs for basic recovery result• Support recovery when random signs (or quadratic number of

measurements)

13/14

Summary, outlooks

• Summary: generalization of existing results on super-resolutionwith random measurements (and minimal separation)

• Beyond Fourier on the Torus (« acceptable » kernels)• Multi-d• No need for random signs for basic recovery result• Support recovery when random signs (or quadratic number of

measurements)

13/14

• Outlooks• Other kernels, very different from translation-invariant• More quantified treatment of dimension• Other practical applications (eg 1-layer neural networks with continuum of

neurons [Bach 2017])

Poon, Keriven, Peyré. A Dual Certificates Analysis of Compressive Off-the-Grid Recovery. Preprint arxiv:1802.08464

Code: sketchml.gforge.inria.fr,github: nkeriven

Thank you !

14/14

data-ens.github.ioEnter the data challenges!Come to the colloquium!Come to the Laplace seminars!

Recommended