Center for Evolutionary Medicine and Informatics

Sparse Screening for Exact Data Reduction

Jieping Ye
Computer Science and Engineering
The Biodesign Institute
Arizona State University

Joint work with Jie Wang and Jun Liu


Page 2

Major Depressive Disorder

Drosophila Embryogenesis

Alzheimer’s Disease

Imaging Genetics

Page 3

Lasso/Basis Pursuit (Tibshirani, 1996; Chen, Donoho, and Saunders, 1999)

y = A × x + z, where y is n×1, A is n×p, x is p×1, and z is n×1

Simultaneous feature selection and regression
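As a minimal sketch of how the linear model on this slide turns into an optimization problem, the code below evaluates the standard Lasso objective 0.5·‖y − Ax‖² + λ‖x‖₁. The objective form, the function name `lasso_objective`, and the toy data are assumptions made for illustration; they are not taken from the slides.

```python
def lasso_objective(A, y, x, lam):
    """Return 0.5 * ||y - A x||^2 + lam * ||x||_1 (A is a list of rows)."""
    residual = [yi - sum(a * xj for a, xj in zip(row, x)) for row, yi in zip(A, y)]
    loss = 0.5 * sum(r * r for r in residual)
    penalty = lam * sum(abs(xj) for xj in x)
    return loss + penalty


# Toy data (hypothetical): n = 2 samples, p = 3 features.
A = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 2.0]

value_sparse = lasso_objective(A, y, [1.0, 2.0, 0.0], lam=0.5)  # fits y exactly
value_zero = lasso_objective(A, y, [0.0, 0.0, 0.0], lam=0.5)    # no features used
```

The ℓ1 penalty is what drives some entries of x to exactly zero, which is why the fit doubles as feature selection.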

Page 4

Neuroimage Analysis (Sun et al. 2009)

Elucidate a Magnetic Resonance Imaging-Based Neuroanatomic Biomarker for Psychosis

Page 5

Imaging Genetics (Thompson et al. 2013)

Page 6

Sparse Reduced-Rank Regression

Vounou et al. (2010, 2012)

Page 7

Structured Sparse Models

Group Lasso

Tree Lasso

Fused Lasso

Graph Lasso

Page 8

Sparsity has become an important modeling tool in genomics, genetics, signal and audio processing, image processing, neuroscience (theory of sparse coding), machine learning, statistics …

Page 9

Optimization Algorithms

•  Coordinate descent
•  Subgradient descent
•  Augmented Lagrangian Method
•  Gradient descent
•  Accelerated gradient descent
•  …

min loss(x) + λ×penalty(x)
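To make one of the listed algorithms concrete, here is a minimal coordinate descent sketch for the special case loss(x) = 0.5·‖y − Ax‖² and penalty(x) = ‖x‖₁ (the Lasso). The function names, the fixed number of sweeps, and the toy data are illustrative assumptions; this is not the SLEP implementation.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lasso_cd(A, y, lam, sweeps=200):
    """Cyclic coordinate descent for 0.5 * ||y - A x||^2 + lam * ||x||_1."""
    n, p = len(A), len(A[0])
    x = [0.0] * p
    r = list(y)  # residual y - A x, kept up to date after each update
    col_sq = [sum(A[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column never enters the model
            # Correlation of column j with the residual that excludes x_j.
            rho = sum(A[i][j] * r[i] for i in range(n)) + col_sq[j] * x[j]
            new_xj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_xj - x[j]
            if delta != 0.0:
                for i in range(n):
                    r[i] -= A[i][j] * delta
                x[j] = new_xj
    return x


# With an identity design the Lasso solution is soft-thresholding of y.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, -0.5, 1.5]
x_hat = lasso_cd(A, y, lam=1.0)
```

Each sweep touches every feature, so the cost per sweep grows with p; this is the cost that discarding features ahead of time is meant to cut.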

Page 10

Lasso

Fused Lasso

Group Lasso

Sparse Group Lasso

Tree Structured Group Lasso

Overlapping Group Lasso

Sparse Inverse Covariance Estimation

Trace Norm Minimization

http://www.public.asu.edu/~jye02/Software/SLEP/

Page 11

More Efficiency?

Very high dimensional data

Non-smooth sparsity-inducing norms

Multiple runs in model selection

A large number of runs in permutation test

Page 12

How to make any existing Lasso solver much more efficient?

Page 13

Data Reduction/Compression

[Figure: original data (1M) reduced to reduced data (1K)]

Page 14

Data Reduction

•  Heuristic-based data reduction
   –  Sure screening, random projection/selection
   –  Resulting model is an approximation of the true model
•  Proposed data reduction methods
   –  Exact data reduction via sparse screening
      •  The model based on reduced data is identical to the one constructed from complete data

Page 15

Sparse Screening

[Figure: without screening, the solver runs on the original data (1M); with screening, the data is first reduced (1M to 1K); both paths give the same solution]

Page 16

Large-Scale Sparse Screening

Page 17

Screening Rule: Motivation

Page 18

Large-Scale Sparse Screening (Cont’d)

Page 19

More on the Dual Formulation

•  Solving the dual formulation is difficult

•  Providing a good (not exact) estimate of the optimal dual solution is easier

•  A good estimate of the optimal dual solution is sufficient for effective feature screening
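A minimal sketch of why the dual solution matters for screening, assuming the standard Lasso formulation min 0.5·‖y − Ax‖² + λ‖x‖₁. Under that assumption the optimal dual point is θ* = (y − Ax*)/λ, and the optimality conditions imply x_j* = 0 whenever |a_jᵀθ*| < 1, where a_j is the j-th column of A. All function names and the toy data are hypothetical.

```python
def lam_max(A, y):
    """max_j |a_j^T y|: for lam at or above this value the Lasso solution is zero."""
    p = len(A[0])
    return max(abs(sum(row[j] * yi for row, yi in zip(A, y))) for j in range(p))


def dual_point(A, y, x, lam):
    """Dual point theta = (y - A x) / lam associated with a primal solution x."""
    return [(yi - sum(a * xj for a, xj in zip(row, x))) / lam
            for row, yi in zip(A, y)]


def certified_inactive(A, theta):
    """Feature j is certified inactive (x_j = 0) when |a_j^T theta| < 1."""
    p = len(A[0])
    return [abs(sum(row[j] * t for row, t in zip(A, theta))) < 1.0
            for j in range(p)]


A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, -0.5, 1.5]
lam = 1.0
x_star = [2.0, 0.0, 0.5]  # soft-thresholding of y, exact for an identity design
theta_star = dual_point(A, y, x_star, lam)
inactive = certified_inactive(A, theta_star)
```

In practice θ* is unknown before solving. A screening rule replaces it with a region guaranteed to contain θ* and discards feature j when the largest value of |a_jᵀθ| over that region stays below 1.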


Page 20

Screening Rule

Page 21

Model Selection via Computing a Sequential Solution

λ1 < λ2 < … < λi < λi+1 < … < λ100

θ1 θ2 … θi θi+1 … θ100

min loss(x) + λ×penalty(x)

Model selection:
•  cross validation
•  stability selection

Page 22

How to Estimate the Region Θ?

J. Wang et al. NIPS’13; J. Liu et al. ICML’14

Non-expansiveness:
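One way to read the non-expansiveness argument, sketched under the assumption of the standard Lasso dual, in which θ*(λ) is the projection of y/λ onto the feasible set {θ : |a_jᵀθ| ≤ 1 for all j}. Non-expansiveness of the projection gives ‖θ*(λ) − θ*(λ0)‖ ≤ ‖y‖·|1/λ − 1/λ0|, so θ*(λ) lies in a ball around a known θ*(λ0). Anchoring at λ0 = λmax, where θ*(λmax) = y/λmax, yields the basic test below. The function name and toy data are illustrative.

```python
import math


def dpp_screen(A, y, lam):
    """Basic DPP-style test anchored at lam_max; True marks a discardable feature."""
    n, p = len(A), len(A[0])
    corr = [sum(A[i][j] * y[i] for i in range(n)) for j in range(p)]
    lam_max = max(abs(c) for c in corr)
    if lam >= lam_max:
        return [True] * p  # the solution is identically zero
    y_norm = math.sqrt(sum(v * v for v in y))
    # Radius of the ball around y / lam_max that must contain theta*(lam).
    radius = y_norm * (1.0 / lam - 1.0 / lam_max)
    discard = []
    for j in range(p):
        col_norm = math.sqrt(sum(A[i][j] ** 2 for i in range(n)))
        discard.append(abs(corr[j]) / lam_max < 1.0 - col_norm * radius)
    return discard


A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, -0.5, 1.5]
lam = 2.5
discard = dpp_screen(A, y, lam)
# Exact solution for an identity design (soft-thresholding), used as a check.
exact = [math.copysign(max(abs(v) - lam, 0.0), v) for v in y]
```

The ball grows as λ moves away from λmax, so this single-anchor test eventually discards nothing. Moving the anchor along the λ sequence, using the solution at the previous value, keeps the ball small.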

Page 23

Enhanced DPP

Use projections of rays:

Define:

Enhanced DPP:

Page 24

Firmly Non-expansive Projection

Non-expansiveness:

Firm non-expansiveness:
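For reference, the standard textbook forms of these two properties for the projection P_F onto a nonempty closed convex set F are given below. The notation (P_F, u, v) is assumed here and may differ from the slide.

```latex
% Non-expansiveness of the projection
\| P_F(u) - P_F(v) \|_2 \le \| u - v \|_2

% Firm non-expansiveness of the projection
\| P_F(u) - P_F(v) \|_2^2 \le \langle P_F(u) - P_F(v),\; u - v \rangle
```

The second inequality implies the first by the Cauchy-Schwarz inequality, so it confines the dual solution to a smaller region and can only tighten a screening test built on it.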

Page 25

Results on MNIST for a sequence of 100 parameter values on the λ/λmax scale from 0.05 to 1. The data matrix is of size 784×50,000.

Page 26

Evaluation on MNIST

Method     solver    SAFE     DPP      EDPP    SDPP
time (s)   2245.26   685.12   233.85   45.56   9.34

[Bar chart: speedup of SAFE, DPP, EDPP, and SDPP; axis from 0 to 300]
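The speedups in the chart can be recomputed from the reported times. The sketch below assumes each rule's time is the total running time of the solver combined with that rule, and that speedup means the solver-only time divided by that total; the slide does not state either definition.

```python
# Running times in seconds, as reported for MNIST.
times = {"solver": 2245.26, "SAFE": 685.12, "DPP": 233.85, "EDPP": 45.56, "SDPP": 9.34}

# Assumed definition: speedup = solver-only time / time with the screening rule.
speedup = {name: times["solver"] / t for name, t in times.items() if name != "solver"}

for name, s in speedup.items():
    print(f"{name}: {s:.2f}x")
```

Under these assumptions the ratios fall within the chart's 0 to 300 axis range, with SDPP above 200.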

Page 27

Evaluation of EDPP

•  Problem: GWAS to MRI ROI prediction (ADNI)
   –  The size of the data matrix is 747 by 504095

Method         ROI3       ROI8       ROI30      ROI69      ROI76      ROI83
Lasso Solver   37975.31   37097.25   38258.72   36926.81   38116.29   37251.03
SR             84.06      84.44      84.70      83.09      82.76      85.39
SR+Lasso       217.08     215.90     223.39     214.36     212.04     211.57
EDPP           43.56      45.75      45.70      45.01      44.31      44.16
EDPP+Lasso     183.64     190.43     182.87     170.71     177.41     178.98

Running time (in seconds) of the Lasso solver, the strong rule (SR; Tibshirani et al., 2012), and EDPP. The parameter sequence contains 100 values along the log λ/λmax scale from 100 log 0.95 to log 0.95.
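A short sketch of the parameter grid described in the caption, assuming that "from 100 log 0.95 to log 0.95" means log(λ/λmax) takes the equally spaced values k·log 0.95 for k = 1, …, 100, so that λ/λmax = 0.95^k. The variable names are illustrative.

```python
import math

# Ratios lam / lam_max, ordered from the largest lam to the smallest.
ratios = [0.95 ** k for k in range(1, 101)]

# Spacing between consecutive grid points on the log scale.
log_steps = [math.log(ratios[k + 1]) - math.log(ratios[k]) for k in range(99)]

smallest_ratio = ratios[-1]  # equals 0.95 ** 100
```

Starting from the largest λ suits sequential screening, because the solution at one grid point supplies the dual estimate used to screen at the next, slightly smaller one.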

Page 28

Sparse Screening Extensions

•  Group Lasso
   –  J Wang, J Liu, J Ye. Efficient Mixed-Norm Regularization: Algorithms and Safe Screening Methods. arXiv preprint arXiv:1307.4156.
•  Sparse Logistic Regression
   –  J Wang, J Zhou, P Wonka, J Ye. A Safe Screening Rule for Sparse Logistic Regression. arXiv preprint arXiv:1307.4145.
•  Sparse Inverse Covariance Estimation
   –  S Huang, J Li, L Sun, J Liu, T Wu, K Chen, A Fleisher, E Reiman, J Ye. Learning brain connectivity of Alzheimer’s disease by exploratory graphical models. NeuroImage 50, 935-949.
   –  Witten, Friedman and Simon (2011), Mazumder and Hastie (2012)
•  Multiple Graphical Lasso
   –  S Yang, Z Pan, X Shen, P Wonka, J Ye. Fused Multiple Graphical Lasso. arXiv preprint arXiv:1209.2139.

Page 29

Wide versus Tall Data

wide data

tall data

Page 30

Support Vector Machines

•  SVM is a maximum margin classifier.

[Figure: two classes of points, denoted +1 and -1, separated by a margin]

Page 31

Support Vectors

•  SVM is determined by the so-called support vectors.

Support vectors are those data points that the margin pushes up against.

[Figure: two classes of points, denoted +1 and -1]

The non-support vectors are irrelevant to the classifier.

Can we make use of this observation?
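The observation can be illustrated on a toy problem. The sketch below assumes a linear classifier sign(w·x + b) and a hand-picked data set for which w = (1, 0), b = 0 is the maximum margin separator; the names and data are hypothetical, and this is not the screening rule from the talk.

```python
def margins(points, labels, w, b):
    """Functional margins y_i * (w . x_i + b) for a linear classifier."""
    return [yi * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            for x, yi in zip(points, labels)]


def hinge_loss(points, labels, w, b):
    """Sum of max(0, 1 - margin); samples with margin >= 1 contribute nothing."""
    return sum(max(0.0, 1.0 - m) for m in margins(points, labels, w, b))


points = [(2.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (-3.0, 1.0)]
labels = [1, 1, -1, -1]
w, b = (1.0, 0.0), 0.0

m = margins(points, labels, w, b)
support = [i for i, v in enumerate(m) if v == 1.0]     # on the margin
non_support = [i for i, v in enumerate(m) if v > 1.0]  # strictly outside it

# Dropping the non-support samples leaves the loss at (w, b) unchanged.
kept_points = [points[i] for i in support]
kept_labels = [labels[i] for i in support]
loss_full = hinge_loss(points, labels, w, b)
loss_kept = hinge_loss(kept_points, kept_labels, w, b)
```

Here samples 0 and 3 lie strictly outside the margin, so the separator is pinned down by samples 1 and 2 alone. Sample screening aims to identify such samples before solving the full problem.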

Page 32

The Idea of Sample Screening

Original Problem → Screening → Smaller Problem to Solve

Page 33

Guidelines for Sample Screening

J. Wang, P. Wonka, and J. Ye. ICML’14.

Page 34

Relaxed Guidelines

Page 35

Estimation via Variational Inequality

Page 36

Sketch of SVM Screening

Page 37

The DVI Screening Rule

Page 38

A General Formulation

Page 39

The DVI Screening Rule for LAD

Page 40

Synthetic Studies

•  We use the rejection rate to measure the performance of the screening rules: the ratio of the number of data instances whose membership can be identified by the rule to the total number of data instances.
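The rejection rate defined above is a simple ratio. A minimal sketch, where `identified` is a hypothetical list of booleans marking the instances whose membership the rule could determine:

```python
def rejection_rate(identified):
    """Fraction of data instances whose membership the rule identified."""
    if not identified:
        raise ValueError("need at least one data instance")
    return sum(1 for flag in identified if flag) / len(identified)


# Hypothetical outcome: the rule settles 3 of 4 instances.
rate = rejection_rate([True, True, False, True])
```

A rate near 1 means almost every instance is settled by the rule, leaving only a small problem for the solver.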

Page 41

Performance of DVI for SVM on Real Data Sets

Comparison of SSNSV (Ogawa et al., ICML’13), ESSNSV and DVIs for SVM on three real data sets.

IJCNN
Method            Rule   Init.   Total     Speedup
Solver            -      -       4669.14   -
Solver + SSNSV    2.08   92.45   2018.55   2.31
Solver + ESSNSV   2.09   91.33   1552.72   3.01
Solver + DVI      0.99   42.67   828.02    5.64

Wine
Method            Rule   Init.   Total   Speedup
Solver            -      -       76.52   -
Solver + SSNSV    0.02   1.56    21.85   3.50
Solver + ESSNSV   0.03   1.60    17.17   4.47
Solver + DVI      0.01   0.67    11.62   6.59

Covertype
Method            Rule   Init.   Total     Speedup
Solver            -      -       1675.46   -
Solver + SSNSV    2.73   35.52   220.58    7.60
Solver + ESSNSV   2.89   36.13   156.23    10.72
Solver + DVI      1.27   12.57   21.26     79.18

Page 42

Experiments on Real Data Sets

Comparison of SSNSV (Ogawa et al., ICML’13), ESSNSV and DVIs for LAD on three real data sets.

Telescope
Method         Rule   Init.   Total    Speedup
Solver         -      -       122.34   -
Solver + DVI   0.28   0.12    12.14    9.86

Computer
Method         Rule   Init.   Total   Speedup
Solver         -      -       5.85    -
Solver + DVI   0.08   0.05    0.28    19.21

Telescope
Method         Rule   Init.   Total   Speedup
Solver         -      -       21.43   -
Solver + DVI   0.06   0.1     0.19    114.91

Page 43

Summary

•  Developed exact data reduction approaches
   –  Exact data reduction via feature screening
   –  Exact data reduction via sample screening
•  The model based on reduced data is identical to the one constructed from complete data
•  Results show screening leads to a significant speedup.
•  Extend exact data reduction to other sparse learning formulations
