
High-dimensional Error Analysis of Regularized M-Estimators

Ehsan Abbasi, Christos Thrampoulidis, Babak Hassibi

Allerton Conference, Wednesday September 30, 2015


Linear Regression Model
Estimate an unknown signal x0 from noisy linear measurements y = A x0 + z:

A: measurement/design matrix

x0: unknown signal

z: noise vector


M-estimators
For some convex loss function L, solve:

x̂ = arg min_x Σᵢ L(yᵢ − aᵢᵀ x)

• Maximum Likelihood (ML) estimators
• least-squares, least-absolute deviations, Huber loss, etc.

Fisher information, consistency, asymptotic normality, Cramér-Rao bound, ML, robust statistics, Huber loss, optimal loss ...
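As a quick illustration (a sketch, not code from the talk), the classical losses named above can be written in a few lines; the Huber threshold `delta` and the grid-search location estimator are illustrative choices:

```python
import numpy as np

def squared_loss(r):
    # least-squares: rho(r) = r^2 / 2
    return 0.5 * r ** 2

def absolute_loss(r):
    # least-absolute deviations: rho(r) = |r|
    return np.abs(r)

def huber_loss(r, delta=1.0):
    # Huber: quadratic near zero, linear in the tails (robust to outliers)
    quad = 0.5 * r ** 2
    lin = delta * (np.abs(r) - 0.5 * delta)
    return np.where(np.abs(r) <= delta, quad, lin)

def m_estimate(y, rho):
    # M-estimate of location: minimize sum_i rho(y_i - x) over scalar x,
    # here by brute-force grid search (illustration only)
    grid = np.linspace(y.min(), y.max(), 2001)
    obj = np.array([rho(y - x).sum() for x in grid])
    return grid[np.argmin(obj)]
```

With the squared loss the minimizer is the sample mean; with the absolute loss it is a median, which is what makes LAD robust.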


Why revisit & what changes?

• Modern: the ambient dimension n is increasingly large (machine learning, image processing, sensor/social networks, DNA microarrays, ...)

• Structured signals: sparse, low-rank, block-sparse, slowly varying, ...

Regularized M-estimators

• Compressive sensing:

• Traditional: the number of measurements grows, but the ambient dimension n is fixed

• The regularizer is structure-inducing, convex, and typically non-smooth: L1, nuclear, L1/L2 norms, total variation, atomic norms, ...
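To make this concrete, here is a minimal sketch (not from the talk) of one regularized M-estimator, the LASSO (squared loss plus L1 regularizer), solved by proximal gradient descent (ISTA); the step size and iteration count are illustrative choices:

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=2000):
    # minimize 0.5 * ||y - A x||_2^2 + lam * ||x||_1 via proximal gradient (ISTA)
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)            # gradient of the smooth (loss) term
        x = soft_threshold(x - step * grad, step * lam)
    return x
```

Swapping the squared loss for another convex loss, or the L1 norm for another structure-inducing regularizer, changes only the gradient and the proximal step.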


Classical question - Modern regime: New results & phenomena

• High-dimensional proportional regime: m, n → ∞ with m/n fixed


• The question goes back to the '50s (Huber, Kolmogorov, ...)
• Only very recent advances: special instances, strict assumptions
• No general theory!

Assumption: A has entries i.i.d. Gaussian

• benchmark in CS/statistics theory• universality


Contribution

• Assume m, n → ∞ at a proportional rate

• A has entries i.i.d. Gaussian

• mild regularity conditions on the loss, the regularizer f, and the densities pz and px0

Then, with probability one, the squared error converges to a deterministic limit, given by the unique solution to a system of four nonlinear equations in four unknowns:


The Equations

Let’s parse them,to get some insight …


The Explicit ones

Some of the problem parameters appear in the equations explicitly (the rest enter through expected Moreau envelopes).


The Loss and the Regularizer

The loss function and the regularizer appear through their Moreau envelope approximations.

In the traditional regime, the functions themselves appear instead of their Moreau envelopes.


The Distributions

The convolution of the noise pdf with a Gaussian is a completely new phenomenon compared to the traditional regime.
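As a numerical illustration (the Laplace noise density and the grid below are illustrative choices, not from the talk), such a convolution can be computed directly on a grid:

```python
import numpy as np

# common grid for both densities
t = np.linspace(-20.0, 20.0, 4001)
dt = t[1] - t[0]

laplace_pdf = 0.5 * np.exp(-np.abs(t))              # noise density p_z (Laplace)
gauss_pdf = np.exp(-t ** 2 / 2) / np.sqrt(2 * np.pi)  # standard Gaussian density

# (p_z convolved with the Gaussian), evaluated on the same grid
conv = np.convolve(laplace_pdf, gauss_pdf, mode="same") * dt
```

The result is again a density: it integrates to one, peaks at zero, and keeps tails heavier than the Gaussian's.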


The Expected Moreau Envelope
• The roles of the loss and the regularizer are summarized in their expected Moreau envelopes

• how they affect the error performance of the M-estimator
• (strictly) convex and continuously differentiable

even if the underlying function is non-differentiable!

• generalizes the "Gaussian width", "Gaussian distance squared", and "statistical dimension"

• the same holds for both the loss and the regularizer


Reminder: Moreau Envelopes
The Moreau-Yosida envelope of f, evaluated at x with parameter τ:

f_τ(x) = min_v { f(v) + (1/(2τ)) ||x − v||² }

• always underestimates f at x; the smaller the τ, the closer to f

• smooth approximation: always continuously differentiable in both x and τ (even if f is non-differentiable)
• jointly convex in x and τ

• the optimal v is unique (the proximal operator)

• everything extends to vector-valued function f
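A small numerical check of these properties for f(x) = |x|, whose Moreau envelope is the Huber function and whose proximal operator is soft-thresholding (standard facts, used here only as an illustration; the grid-based minimization is an assumed sketch):

```python
import numpy as np

def moreau_envelope(f, x, tau, grid):
    # f_tau(x) = min_v { f(v) + (x - v)^2 / (2 * tau) }, minimized over a grid
    vals = f(grid) + (x - grid) ** 2 / (2 * tau)
    i = np.argmin(vals)
    return vals[i], grid[i]  # envelope value, and the proximal point prox_{tau f}(x)

def huber(x, tau):
    # closed-form Moreau envelope of f = |.|
    return x ** 2 / (2 * tau) if abs(x) <= tau else abs(x) - tau / 2

grid = np.linspace(-5.0, 5.0, 100001)
val, prox = moreau_envelope(np.abs, 2.0, 1.0, grid)
```

The numerical envelope matches the Huber formula, the minimizer is the soft-thresholded point, and shrinking τ moves the envelope toward f itself.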


Examples


Set Indicator Function

Gaussian width
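The Gaussian width w(S) = E[ sup_{v in S} ⟨g, v⟩ ], with g standard normal, can be estimated by Monte Carlo; the sets below (unit L1 and L2 balls, where the supremum has a closed form) and the sample size are illustrative choices, not from the talk:

```python
import numpy as np

def gaussian_width_mc(sup_fn, n, n_samples=20000, seed=0):
    # Monte Carlo estimate of w(S) = E[ sup_{v in S} <g, v> ], g ~ N(0, I_n)
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n_samples, n))
    return float(np.mean([sup_fn(g) for g in G]))

n = 100
# unit L1 ball: sup_{||v||_1 <= 1} <g, v> = ||g||_inf, roughly sqrt(2 log n)
w_l1 = gaussian_width_mc(lambda g: np.abs(g).max(), n)
# unit L2 ball: sup_{||v||_2 <= 1} <g, v> = ||g||_2, roughly sqrt(n)
w_l2 = gaussian_width_mc(np.linalg.norm, n)
```

The gap between the two widths is the usual geometric explanation of why L1-structured sets need far fewer measurements than the full ball.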


Summarizing Key Features

• Squared error of general regularized M-estimators
• Minimal and generic regularity assumptions
– non-smooth losses, heavy tails, non-separable, ...
• Key role of expected Moreau envelopes
– strictly convex and smooth
– generalize known geometric summary parameters

• Observation: fast solution by simple iterative scheme!


Simulations

Optimal tuning?


Non-smooth losses


Non-smooth losses

Optimal loss?


Non-smooth losses

Consistent Estimators?


Heavy-tailed noise
• Huber loss function + i.i.d. Cauchy noise. Robustness?


Non-separable loss

Square-root LASSO
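A minimal sketch (illustrative code, not from the talk) contrasting the non-separable square-root-LASSO loss with the separable squared loss of the ordinary LASSO:

```python
import numpy as np

def sqrt_lasso_obj(A, y, x, lam):
    # square-root LASSO: ||y - A x||_2 + lam * ||x||_1
    # the loss ||r||_2 couples all residuals: it is NOT a sum of per-coordinate terms
    return np.linalg.norm(y - A @ x) + lam * np.abs(x).sum()

def lasso_obj(A, y, x, lam):
    # ordinary LASSO: ||y - A x||_2^2 + lam * ||x||_1
    # the squared loss IS separable: sum_i r_i^2
    r = y - A @ x
    return np.sum(r ** 2) + lam * np.abs(x).sum()
```

A known motivation for the square-root LASSO is that a good choice of λ does not require knowing the noise level, unlike the ordinary LASSO.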


Beyond Gaussian Designs

• The analysis framework directly applies to elliptically distributed designs
• For the LASSO, we have extended the ideas to IRO matrices

• Universality over i.i.d. entries (empirical observation): modified equations


Convex Gaussian Min-max Theorem

Apply the CGMT to the M-estimator:

(PO) the Primal Optimization problem

(AO) the Auxiliary Optimization problem

Theorem (CGMT) [TAH'15, TOH'15]


Proof Diagram

M-estimator (PO)
  | CGMT
  v
(AO)
  | duality
  v
(DO) deterministic min-max: optimization in 4 variables
  | first-order optimality conditions
  v
The Equations


Related Literature

• [El Karoui 2013, 2015]
– ridge regularization, smooth loss, no structured x0
– elliptical distributions
– i.i.d. entries beyond Gaussian

• [Donoho, Montanari 2013]
– no regularizer
– smooth + strongly convex loss, bounded noise


Conclusions
• Master Theorem for general M-estimators

– Minimal assumptions
– 4 nonlinear equations, unique solution, fast iterative solution (why?)
– Summary parameters: expected Moreau envelopes

• Opportunities, lots to be asked ...
– Optimal loss function? Optimal regularizer?
– When can we be consistent?
– Optimal tuning of the tuning parameter?

LASSO: Linear = Non-linear[TAH’15 NIPS]

• The CGMT framework is powerful
– non-linear measurements, y = g(A x0)

• Beyond squared-error analysis: apply the CGMT for a different set S ... [TAYH'15 ICASSP]

Thank You!