Page 1

When Dictionary Learning Meets Classification

Bufford, Teresa¹; Chen, Yuxin²; Horning, Mitchell³; Shee, Liberty¹

Mentor: Professor Yohann Tendero

¹UCLA   ²Dalhousie University   ³Harvey Mudd College

August 9, 2013

Page 2

Outline

1. The Method: Dictionary Learning

2. Experiments and Results

2.1 Supervised Dictionary Learning

2.2 Unsupervised Dictionary Learning

2.3 Robustness w.r.t. Noise

3. Conclusion

Page 3

Dictionary Learning

GOAL: Classify discrete image signals x ∈ R^n.

The Dictionary, D ∈ R^{n×K}

$$x \approx D\alpha = \begin{bmatrix} | & & | \\ \text{atom}_1 & \cdots & \text{atom}_K \\ | & & | \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_K \end{bmatrix}$$

• Each dictionary can be represented as a matrix whose columns are atoms in R^n, learned from a set of training data.

• A signal x ∈ R^n can be approximated by a linear combination of atoms in a dictionary, with coefficients α ∈ R^K.

• We seek a sparse signal representation:

$$\arg\min_{\alpha \in \mathbb{R}^K} \; \|x - D\alpha\|_2^2 + \lambda \|\alpha\|_1. \tag{1}$$

[1] Sprechmann et al., "Dictionary learning and sparse coding for unsupervised clustering." Acoustics, Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. IEEE, 2010.
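For concreteness, here is a minimal sketch of solving problem (1) for a single signal with scikit-learn's SparseCoder. The random dictionary, λ value, and dimensions are placeholders rather than our experimental setup, and scikit-learn stores atoms as rows (the transpose of the D above).

```python
import numpy as np
from sklearn.decomposition import SparseCoder

n, K, lam = 28 * 28, 500, 0.1          # illustrative dimensions and penalty
rng = np.random.default_rng(0)

D = rng.standard_normal((K, n))        # atoms as rows (transpose of the slides' D)
D /= np.linalg.norm(D, axis=1, keepdims=True)   # unit-norm atoms

x = rng.standard_normal((1, n))        # stand-in for one image signal

# Lasso-based sparse coding: min_alpha ||x - alpha D||_2^2 + lam ||alpha||_1
# (up to scikit-learn's internal scaling of the l1 penalty).
coder = SparseCoder(dictionary=D, transform_algorithm="lasso_lars",
                    transform_alpha=lam)
alpha = coder.transform(x)[0]          # sparse code, shape (K,)
print(f"{np.count_nonzero(alpha)} nonzero coefficients out of {K}")
```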

Page 4

Supervised Dictionary Learning Algorithm

Illustrating Example

Training image signals: x1, x2, x3, x4;  X = [x1 x2 x3 x4]
Classes: 0, 1

Group training images according to their labels:

class 0 x1 x4

class 1 x2 x3

Use training images with label i to train dictionary i:

class 0 x1 x4 −→ D0

class 1 x2 x3 −→ D1
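A minimal sketch of this grouping step, assuming the training images are rows of a NumPy array `X` with integer labels `y`; `MiniBatchDictionaryLearning` stands in for whichever dictionary learning solver is preferred, and all parameter values are illustrative.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def train_class_dictionaries(X, y, K=500, lam=0.1, seed=0):
    """Train one dictionary D_i per class, using only the images labeled i."""
    dictionaries = {}
    for label in np.unique(y):
        learner = MiniBatchDictionaryLearning(n_components=K, alpha=lam,
                                              random_state=seed)
        learner.fit(X[y == label])                  # images of this class only
        dictionaries[label] = learner.components_   # atoms as rows, shape (K, n)
    return dictionaries
```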

Page 5

Signal Classification

Illustrating Example

Test image signals: y1, y2, y3;  Y = [y1 y2 y3]
Dictionaries: D0, D1

Take one test image. Compute its optimal sparse representation α for each dictionary, and use α to compute an energy:

$$E_i(y_1) = \min_{\alpha} \|y_1 - D_i\alpha\|_2^2 + \lambda\|\alpha\|_1 \quad\longrightarrow\quad \begin{bmatrix} E_0(y_1) \\ E_1(y_1) \end{bmatrix}$$

Do this for each test signal:

$$E(Y) = \begin{bmatrix} E_0(y_1) & E_0(y_2) & E_0(y_3) \\ E_1(y_1) & E_1(y_2) & E_1(y_3) \end{bmatrix}$$

Classify each test signal y as class $i^* = \arg\min_i E_i(y)$. For example:

$$E(Y) = \begin{bmatrix} 5 & 12 & 8 \\ 24 & 6 & 4 \end{bmatrix} \longrightarrow$$

class 0: y1
class 1: y2, y3
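A sketch of this classification rule, reusing the lasso-based sparse coder from before; `dictionaries` is a list of (K, n) atom matrices, and the λ value is illustrative.

```python
import numpy as np
from sklearn.decomposition import SparseCoder

def energy_matrix(Y, dictionaries, lam=0.1):
    """E[i, j] approximates min_a ||y_j - D_i a||_2^2 + lam * ||a||_1."""
    E = np.empty((len(dictionaries), Y.shape[0]))
    for i, D in enumerate(dictionaries):           # D: atoms as rows, shape (K, n)
        coder = SparseCoder(dictionary=D, transform_algorithm="lasso_lars",
                            transform_alpha=lam)
        A = coder.transform(Y)                     # codes, shape (n_test, K)
        residual = Y - A @ D                       # reconstruction errors
        E[i] = np.sum(residual ** 2, axis=1) + lam * np.abs(A).sum(axis=1)
    return E

# predicted = np.argmin(energy_matrix(Y_test, [D0, D1]), axis=0)
```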

Page 6

Supervised Dictionary Learning Results

Table: Supervised Results

Cluster Type   Centered & Normalized   Digits        K     Misclassification rate (avg)
Supervised     False                   {0, ..., 4}   500   1.3431%
Supervised     True                    {0, ..., 4}   500   0.5449%
Supervised     False                   {0, ..., 4}   800   0.7784%
Supervised     True                    {0, ..., 4}   800   0.3892%
Supervised     False                   {0, ..., 9}   800   3.1100%
Supervised     True                    {0, ..., 9}   800   1.6800%

Error rate for digits {0, ..., 9} and K = 800 from [1] is 1.26%.

[1] Sprechmann et al., "Dictionary learning and sparse coding for unsupervised clustering." Acoustics, Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. IEEE, 2010.

Page 7

Supervised Results, ctd.

Here is a confusion matrix for the supervised, centered, normalized case with K = 800 and digits {0, . . . , 4}.

Element c_ij of C is the number of times that an image of digit i was classified as digit j.

         0     1     2     3     4
0      978     0     1     1     0
1        0  1132     2     1     0
2        5     2  1023     0     2
3        0     0     3  1006     1
4        1     0     1     0   980

Page 8

Unsupervised Dictionary Learning Algorithm (Spectral Clustering)

Illustrating Example

Training image signals: x1, x2, x3, x4;  X = [x1 x2 x3 x4]
Classes: 0, 1

Train a dictionary D from all training images:

$$\{x_1, x_2, x_3, x_4\} \longrightarrow D = \begin{bmatrix} | & | & | \\ \text{atom}_1 & \text{atom}_2 & \text{atom}_3 \\ | & | & | \end{bmatrix}$$

Reminder: The number of atoms in the dictionary is a parameter.

For each image, compute its optimal sparse representation α w.r.t. D:

$$A = \begin{bmatrix} | & | & | & | \\ \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ | & | & | & | \end{bmatrix}$$

(columns indexed by the images x1, ..., x4; rows indexed by atom1, ..., atom3)

Reminder: each column α_j holds the coefficients of the linear combination of atoms in D that approximates x_j.

Page 9

Illustrating Example

Training image signals: x1, x2, x3, x4;  X = [x1 x2 x3 x4]
Classes: 0, 1

Construct a similarity matrix S1 = |A|^t |A|. Perform spectral clustering on the graph G1 = {X, S1}:

G1 = {X, S1}  −−(spectral clustering)−→

class 0 x1 x4

class 1 x2 x3

Use signal clusters to train initial dictionaries:

class 0 x1 x4 −→ D0

class 1 x2 x3 −→ D1
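A sketch of the clustering step, under the assumption that `A` holds the sparse codes with one column per training image; scikit-learn's SpectralClustering accepts the precomputed similarity matrix directly.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def spectral_split(A, n_clusters=2, seed=0):
    """Cluster training signals via the similarity S = |A|^t |A|."""
    S = np.abs(A).T @ np.abs(A)        # (N, N) similarity between signals
    sc = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                            random_state=seed)
    return sc.fit_predict(S)           # cluster label per training image

# labels = spectral_split(A); train D0 on the images with label 0, D1 on label 1.
```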

Page 10

Refining Dictionaries

Repeat:

1. Classify the training images using current dictionaries:

x1, x2, x3, x4  −−(classify with D0, D1)−→   class 0: x4;   class 1: x1, x2, x3

2. Use these classifications to train new, refined dictionaries for each cluster:

class 0: x4 −→ D0,new
class 1: x1, x2, x3 −→ D1,new
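A sketch of this refinement loop; `train_class_dictionaries` and `energy_matrix` are the hypothetical helpers sketched earlier, the labels are assumed to be integers 0, ..., C−1, and the early-stopping rule is one plausible choice.

```python
import numpy as np

def refine(X, labels, n_iter=20, K=500, lam=0.1):
    """Alternately retrain per-cluster dictionaries and reclassify the training set."""
    for _ in range(n_iter):
        dicts = train_class_dictionaries(X, labels, K, lam)        # step 2
        D_list = [dicts[c] for c in sorted(dicts)]
        new_labels = np.argmin(energy_matrix(X, D_list, lam), axis=0)  # step 1
        if np.array_equal(new_labels, labels):   # assignments stable: stop early
            break
        labels = new_labels
    return D_list, labels
```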

Page 11

Unsupervised Dictionary Learning: Spectral Clustering Results

Table: Unsupervised Results (spectral clustering)

Cluster   Centered & Normalized   Digits        K     Misclassification rate (avg)
Atoms     True                    {0, ..., 4}   500   24.8881%
Atoms     True                    {0, ..., 4}   800   27.1843%
Signals   True                    {0, ..., 4}   500   27.4372%
Signals   True                    {0, ..., 4}   800   29.5777%

Misclassification rate for digits {0, ..., 4} and K = 500 from [1] is 1.44%.

[1] Sprechmann et al., "Dictionary learning and sparse coding for unsupervised clustering." Acoustics, Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. IEEE, 2010.

Page 12

Why do our results differ from Sprechmann, et al.’s?

Hypothesis

The problem lies in the authors' choice of similarity measure, S = |A|^t |A|.

Possible Problem: Normalization (the columns of A do not have constant l2 norm)

Illustration of Problem

• Assume that A contains only positive entries. Then S = |A|^t |A| = A^t A, and the ij-th entry of S is

$$\langle \alpha_i, \alpha_j \rangle = \|\alpha_i\| \, \|\alpha_j\| \cos(\alpha_i, \alpha_j).$$

• Nearly orthogonal vectors can thus score as "more similar" than collinear ones if the product of their norms is large enough.
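A tiny numeric check of this effect, with made-up vectors (not data from our experiments):

```python
import numpy as np

long_a = 10 * np.array([1.0, 0.1])   # large norm, nearly orthogonal to long_b
long_b = 10 * np.array([0.1, 1.0])
short_c = np.array([0.3, 0.3])       # small norm, exactly collinear with short_d
short_d = np.array([0.1, 0.1])

print(long_a @ long_b)    # 20.0 -> scored as very "similar"
print(short_c @ short_d)  # 0.06 -> scored as barely "similar"
```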

Page 13

Attempted Solution

We normalized each column of A before computing S. However, this caused little change in the results, because the l2 norms of A's columns are nearly constant.

Figure: Normalized histogram of the l2 norms of the columns of A (x-axis: norm interval; y-axis: empirical probability). Note: the l2 norms of most columns of A lie in [3, 4].

Page 14

Possible Problem: Absolute Value (use |A|t |A| instead of |AtA|)

Illustration of Problem

a1 = (1, −1)^t and a2 = (1, 1)^t, which are orthogonal vectors, become collinear when their entries are replaced by their absolute values.

Attempted Solution

In the experiments, changing |A||A|^t to |AA^t| does not significantly improve the results, because among all the entries of A associated with the MNIST data, only ≈ 13.5% are negative.
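The same two-vector example, checked numerically:

```python
import numpy as np

a1 = np.array([1.0, -1.0])
a2 = np.array([1.0, 1.0])

print(a1 @ a2)                   # 0.0: orthogonal before the absolute value
print(np.abs(a1) @ np.abs(a2))   # 2.0: collinear (maximal similarity) after
```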

Page 15

In the last part of Sprechmann et al. [1], the authors state:

“We observed that best results are obtained for all the experiments when the initial dictionaries in the learning stage are constructed by randomly selecting signals from the training set. If the size of the dictionary compared to the dimension of the data is small, [it] is better to first partition the dataset (using for example Euclidean k-means) in order to obtain a more representative sample.”

Algorithm

Use k-means to cluster the training signals, instead of spectral clustering.

Page 16

Unsupervised Dictionary Learning (k-means)

Illustrating Example

Training image signals: x1, x2, x3, x4;  X = [x1 x2 x3 x4]
Classes: 0, 1

Use k-means to cluster the training images:

x1, x2, x3, x4  −−(k-means)−→

class 0 x1 x4

class 1 x2 x3

Note: Must center and normalize after clustering.

Use clusters to train dictionaries:

class 0 x1 x4 −→ D0

class 1 x2 x3 −→ D1

The refinement process is the same as in the spectral clustering case.
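A sketch of the k-means initialization, assuming the images are rows of `X`; centering is done per image here, which is one plausible reading of "center and normalize", and all parameter values are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_init(X, n_classes=2, seed=0):
    """Cluster raw images with Euclidean k-means, then center and normalize."""
    labels = KMeans(n_clusters=n_classes, n_init=10,
                    random_state=seed).fit_predict(X)
    Xc = X - X.mean(axis=1, keepdims=True)            # center each image
    Xc /= np.linalg.norm(Xc, axis=1, keepdims=True)   # unit l2 norm per image
    return Xc, labels   # train D0, D1, ... on the clusters, then refine as before
```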

Page 17

Unsupervised Dictionary Learning: K-means Results

Table: Unsupervised Results (k-means), without refinement

Cluster   Centered & Normalized   Digits        K     iter   Misclassification rate (avg)
Signals   True                    {0, ..., 4}   500   2      7.1025%

Table: Unsupervised Results (k-means), with refinement

Cluster   Centered & Normalized   Digits        K     iter   Misclassification rate (avg)
Signals   True                    {0, ..., 4}   500   2      1.1286%
Signals   True                    {0, ..., 4}   500   20     0.5449%

Misclassification rate for digits {0, ..., 4} and K = 500 from [1] is 1.44%.

[1] Sprechmann et al., "Dictionary learning and sparse coding for unsupervised clustering." Acoustics, Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. IEEE, 2010.

Page 18

Unsupervised Results, ctd.

Here is a confusion matrix for the k-means, unsupervised, centered, normalized case with K = 500, iter = 20, and digits {0, . . . , 4}.

Element c_ij of C is the number of times that an image of digit i was classified as digit j.

         0     1     2     3     4
0      979     0     1     0     0
1        0  1129     4     1     1
2        4     2  1023     2     1
3        0     0     3  1006     1
4        0     1     3     0   978

Page 19

Gaussian Noise Experiments

Classification of Noisy Images

Goal: To analyze the robustness of our method with respect to noise in the images.

Questions to consider:

• How does adding noise affect misclassification rates?

• How does centering and normalizing the data after adding noise affect the results?
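As a sketch of the noise setup (the variance and the after-noise preprocessing order come from the slides; everything else, including the per-image centering, is an illustrative assumption):

```python
import numpy as np

def add_gaussian_noise(X, var=0.5, renormalize=True, seed=0):
    """Add i.i.d. Gaussian noise to each image; optionally center and
    normalize AFTER the noise is added."""
    rng = np.random.default_rng(seed)
    Xn = X + rng.normal(scale=np.sqrt(var), size=X.shape)
    if renormalize:
        Xn = Xn - Xn.mean(axis=1, keepdims=True)
        Xn /= np.linalg.norm(Xn, axis=1, keepdims=True)
    return Xn
```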

Page 20

Results: Classifying Pure Gaussian Noise
Classification rates for each digit (noise variance = 0.5)

[Two figures: left, dictionaries trained on centered and normalized data; right, dictionaries trained on data that was neither centered nor normalized.]

Page 21

MNIST Images with Gaussian Noise Added

Page 22

Results: Adding Gaussian Noise to MNIST
Noise in test images only (test noise variance = 0.5)

Centered and Normalized:           misclassification rate 8.21%
Not Centered and Not Normalized:   misclassification rate 87.05%

Page 23

Results: Adding Gaussian Noise to MNIST
Same noise variance for test & training images

Misclassification Rates

Noise variance                    0.1      0.5      1.0
Centered & Normalized             3.2%     17.33%   38.92%
Not Centered & Not Normalized     11.08%   57.56%   75.87%

Page 24

Conclusions:

• Unable to reproduce the results reported by Sprechmann et al. [1]

• The k-means method produces better results than spectral clustering

• Centering and normalizing the data consistently improves results

• The method is robust to added noise when the data are centered and normalized

Page 25

Thanks for your attention!

Page 26

Unsupervised Dictionary Learning Algorithm 2 (with split initialization)

Step 1 Train a dictionary D0 ∈ R^{n×K} from all training images.

Step 2 Obtain the matrix A0 and the similarity matrix S0 = |A0||A0|^t as before.

Step 3 Apply spectral clustering to the dictionary D0 to split its atoms into D1 and D2.

Step 4 Obtain matrices A1, S1, A2, and S2 for dictionaries D1 and D2.

Step 5 Apply spectral clustering to dictionaries D1 and D2 to split each into two clusters. Keep the split that results in the lower energy; return the other dictionary to its original form. Now there are three dictionaries.

Step 6 Repeat this process so that after each iteration the number of dictionaries increases by one. Stop when the desired number of clusters is reached (see the sketch after this list).
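A rough sketch of this loop under stated assumptions: `train_dictionary`, `sparse_codes`, `spectral_split_atoms`, and `total_energy` are hypothetical helpers (splitting a dictionary's atoms by spectral clustering and evaluating the energy of a set of dictionaries) that are not shown here.

```python
def split_initialization(X, n_classes, K=500, lam=0.1):
    """Grow the set of dictionaries one split at a time (Algorithm 2 sketch)."""
    dicts = [train_dictionary(X, K, lam)]          # Step 1: one global dictionary
    while len(dicts) < n_classes:                  # Step 6: one new dictionary per pass
        best = None
        for i, D in enumerate(dicts):
            A = sparse_codes(X, D, lam)            # Steps 2 and 4
            halves = spectral_split_atoms(D, A)    # Steps 3 and 5: split the atoms
            candidate = dicts[:i] + list(halves) + dicts[i + 1:]
            energy = total_energy(X, candidate, lam)
            if best is None or energy < best[0]:   # keep the lowest-energy split
                best = (energy, candidate)
        dicts = best[1]                            # the other dictionaries stay intact
    return dicts
```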

Page 27

Semisupervised Dictionary Learning Algorithm

Step 1 Use training images, a percentage of which have known labels. The remaining percentage are randomly labeled (see the sketch after this list).

Step 2 Train dictionaries {D0, . . . ,DN−1} independently.

Step 3 Classify the training images using the current dictionaries. Then use these classifications to train new, refined dictionaries for each cluster.
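A minimal sketch of the Step 1 label perturbation, assuming integer labels `y` and a perturbation percentage like the 40/70/90% used in the results below (all names are illustrative):

```python
import numpy as np

def perturb_labels(y, perturb_percent, n_classes, seed=0):
    """Keep (100 - perturb_percent)% of the known labels; relabel the rest at random."""
    rng = np.random.default_rng(seed)
    y = y.copy()
    flip = rng.random(y.shape[0]) < perturb_percent / 100.0
    y[flip] = rng.integers(0, n_classes, size=flip.sum())   # random class labels
    return y
```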

Page 28

Semisupervised Dictionary Learning Results

Dictionaries for digits {0, . . . , 4}.

Cluster Type     Centered & Normalized   K     Perturb. percent   Misclassification rate (avg)
Semisupervised   True                    200   40                 0.9730%
Semisupervised   True                    200   70                 0.8951%
Semisupervised   True                    200   90                 1.4205%

Page 29

Figure: Misclassification rate average per refinement iteration (x-axis: iteration, where iteration 1 = the first refinement). Left to right: 40%, 70%, 90% perturbation.

Figure: Energy per refinement iteration (x-axis: iteration, where iteration 1 = the first refinement). Left to right: 40%, 70%, 90% perturbation.

Page 30

Figure: Misclassification rate average per refinement iteration (x-axis: iteration, where iteration 1 = the first refinement). Left to right: 40%, 70%, 90% perturbation.

Figure: Number of training images whose classification changed per refinement iteration (x-axis: change to refinement #, where change 1 = the change to the first refinement). Left to right: 40%, 70%, 90% perturbation.

Page 31

Unsupervised - Atoms (change over refinement iterations)

Figure: Unsupervised dictionary learning (K = 500), spectral clustering of centered and normalized signals by atoms. Left to right: (1) energy, (2) misclassification rate average, (3) number of training images whose classification changed per refinement iteration.

Page 32

Unsupervised - Signals (change over refinement iterations)

Figure: Unsupervised dictionary learning (K = 500), spectral clustering of centered and normalized signals by signals. Left to right: (1) energy, (2) misclassification rate average, (3) number of training images whose classification changed per refinement iteration.

Page 33

Unsupervised - k-means (change over refinement iterations)

Figure: Unsupervised dictionary learning (K = 500), k-means clustering of centered and normalized signals. Left to right: (1) energy, (2) misclassification rate average, (3) number of training images whose classification changed per refinement iteration.

Page 34

Unsupervised - k-means (change over refinement iterations)

Figure: Unsupervised dictionary learning (K = 500), k-means clustering of centered and normalized signals; energy at each iteration. Left: 2 dictionary learning refinements. Right: 20 dictionary learning refinements.