Cross Validation of SVMs for Acoustic Feature Classification using Entire Regularization Path Tianyu Tom Wang T. Hastie et al. 2004 WS04


Cross Validation of SVMs for Acoustic Feature Classification using Entire Regularization Path

Tianyu Tom Wang

T. Hastie et al. 2004

WS04

Acoustic Feature Detection

Manner Features: e.g. +sonorant vs. –sonorant

Place of Articulation Features: e.g. +lips vs. –lips

Vectorize spectrograms into feature vectors, each labeled +/- for a given manner or place feature

Hasegawa-Johnson, 2004

Binary Linear SVM Classifiers

• n training pairs: $(x_i, y_i)$, where $x_i \in \mathbb{R}^p$; p = number of attributes of vector; $y_i \in \{-1, +1\}$

• Linear SVM: $h(x_i) = \mathrm{sgn}[f(x_i)]$, with $f(x_i) = \beta_0 + \beta^T x_i$, where $\beta^T x_i$ is the dot product

• Finding $f(x_i)$:

$\min_{\beta_0, \beta} \|\beta\|$ subject to $y_i f(x_i) \ge 1$ for each i

Hastie, Zhu, Tibshirani, Rosset, 2004
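
The linear decision rule on this slide can be illustrated with a minimal Python sketch. The function names and the toy data are hypothetical, chosen only for illustration, and the tie-break that maps f(x) = 0 to +1 is an assumption.

```python
def decision_value(beta0, beta, x):
    """f(x) = beta0 + beta^T x, computed as a plain dot product."""
    return beta0 + sum(b * xj for b, xj in zip(beta, x))


def predict(beta0, beta, x):
    """h(x) = sgn[f(x)]; f(x) == 0 is mapped to +1 (an arbitrary choice)."""
    return 1 if decision_value(beta0, beta, x) >= 0 else -1


def is_separating(beta0, beta, points, labels):
    """True if every training pair satisfies y_i * f(x_i) >= 1."""
    return all(
        y * decision_value(beta0, beta, x) >= 1
        for x, y in zip(points, labels)
    )


# Toy two-dimensional data (hypothetical values).
points = [[2.0, 2.0], [3.0, 1.0], [0.0, 0.0], [-1.0, 0.5]]
labels = [1, 1, -1, -1]
beta0, beta = -2.0, [1.0, 1.0]
```

With these values every point satisfies the margin constraint, so `is_separating(beta0, beta, points, labels)` holds; setting `beta0` to 0 breaks the constraint for the point at the origin.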

Binary Linear SVM Classifiers

Inseparable Case, Binary Linear SVM Classifiers

Zhu,Hastie, 2001

Slack Variables

$\xi_i = 1 - y_i f(x_i)$

Outside Boundary (correct):

$\xi_i < 0$

Inside Boundary (correct/incorrect):

$\xi_i > 0$

On Boundary (correct):

$\xi_i = 0$

Binary Linear SVM Classifiers, Slack Variables

• Finding f(x):

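subject to $y_i f(x_i) \ge 1 - \xi_i$ for each i

Rewrite as:

$\min_{\beta_0, \beta} \sum_{i=1}^{n} [1 - y_i f(x_i)]_+ + \frac{\lambda}{2} \|\beta\|^2$

C = cost parameter = $1/\lambda$

Note: $[\mathrm{arg}]_+$ indicates max(0, arg)

$\xi_i = 1 - y_i f(x_i)$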
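
As a rough illustration of the [arg]+ notation, the sketch below evaluates a loss-plus-penalty objective of the standard form, sum of [1 - y_i f(x_i)]+ plus (lambda/2) times the squared norm of beta, with lambda = 1/C. That this is exactly the form shown on the slide is an assumption; the function names and toy numbers are hypothetical.

```python
def positive_part(arg):
    """[arg]+ = max(0, arg)."""
    return max(0.0, arg)


def svm_objective(beta0, beta, points, labels, lam):
    """Hinge loss summed over the training pairs plus (lam / 2) * ||beta||^2."""
    loss = 0.0
    for x, y in zip(points, labels):
        f = beta0 + sum(b * xj for b, xj in zip(beta, x))
        loss += positive_part(1.0 - y * f)  # zero for points outside the margin
    penalty = 0.5 * lam * sum(b * b for b in beta)
    return loss + penalty


# Toy data: only the second point lies inside the margin (y * f = 0.5).
points = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]]
labels = [1, 1, -1]
value = svm_objective(0.0, [1.0, 0.0], points, labels, lam=2.0)
```

Here the hinge loss contributes 0.5 and the penalty contributes 1.0, so `value` is 1.5. Lowering `lam` (raising C) shrinks the penalty term, so the loss term weighs more heavily in the total.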

Generalizing to Non-Linear Classifiers

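• Kernel SVM: $h(x) = \mathrm{sgn}[f(x)]$, with $f(x) = \beta_0 + g(x)$ and $g(x) = \sum_i \alpha_i y_i K(x_i, x) = \sum_i \alpha_i y_i \, \phi(x_i)^T \phi(x)$

• The value chosen for C = $1/\lambda$ strongly affects the classification error.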

Hastie, Zhu, Tibshirani, Rosset, 2004
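
A minimal sketch of the kernel decision function, assuming a radial basis function kernel (the slide does not name a specific kernel). The function names, the `gamma` parameter, and the two-point example are hypothetical.

```python
import math


def rbf_kernel(u, v, gamma=1.0):
    """K(u, v) = exp(-gamma * ||u - v||^2), one common kernel choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def kernel_decision_value(x, train_x, train_y, alphas, beta0, kernel=rbf_kernel):
    """f(x) = beta0 + sum_i alpha_i * y_i * K(x_i, x)."""
    g = sum(a * y * kernel(xi, x) for a, y, xi in zip(alphas, train_y, train_x))
    return beta0 + g


def kernel_predict(x, train_x, train_y, alphas, beta0, kernel=rbf_kernel):
    """h(x) = sgn[f(x)]; f(x) == 0 is mapped to +1 (an arbitrary choice)."""
    f = kernel_decision_value(x, train_x, train_y, alphas, beta0, kernel)
    return 1 if f >= 0 else -1


# Two training points, one per class, with equal weights (hypothetical values).
train_x = [[0.0, 0.0], [2.0, 0.0]]
train_y = [1, -1]
alphas = [1.0, 1.0]
```

With this symmetric setup the decision value is positive near the first point, negative near the second, and zero at the midpoint `[1.0, 0.0]`.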

Example – mixture Gaussians

Hastie, Zhu, Tibshirani, Rosset, 2004

Tracing Path of SVMs w.r.t. C

One condition for SVM solution -

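C = $1/\lambda$ controls the width of the margin ($1/\|\beta\|$)

$0 \le \alpha_i \le 1$

$y_i f(x_i) < 1$: $\xi_i > 0$, $\alpha_i = 1$, inside the margin (includes incorrectly classified points)

$y_i f(x_i) > 1$: $\xi_i = 0$, $\alpha_i = 0$, outside the margin

$y_i f(x_i) = 1$: $\xi_i = 0$, $0 < \alpha_i < 1$, on the margin, support vector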

Hastie, Zhu, Tibshirani, Rosset, 2004
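
The three cases on this slide amount to sorting training points by the value of y_i f(x_i). A minimal sketch follows; the set names (`inside`, `on_margin`, `outside`) and the numerical tolerance are assumptions made for illustration and are not the slide's own notation.

```python
def partition_by_margin(decision_values, labels, tol=1e-8):
    """Group point indices by y_i * f(x_i) relative to 1.

    inside:    y_i * f(x_i) < 1  (alpha_i = 1 in the slide's conditions)
    on_margin: y_i * f(x_i) = 1  (alpha_i strictly between 0 and 1)
    outside:   y_i * f(x_i) > 1  (alpha_i = 0)
    """
    sets = {"inside": [], "on_margin": [], "outside": []}
    for i, (f, y) in enumerate(zip(decision_values, labels)):
        margin = y * f
        if abs(margin - 1.0) <= tol:
            sets["on_margin"].append(i)
        elif margin < 1.0:
            sets["inside"].append(i)
        else:
            sets["outside"].append(i)
    return sets


# Hypothetical decision values f(x_i) and labels y_i for five points.
decision_values = [2.0, 1.0, 0.3, -1.0, 0.5]
labels = [1, 1, 1, -1, -1]
sets = partition_by_margin(decision_values, labels)
```

In this example the last point has y_i f(x_i) = -0.5, so it is misclassified and falls in `inside` together with the correctly classified point at 0.3.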

Tracing Path of SVMs w.r.t. C

• Algorithm: Set $\lambda$ to be large (margin very wide). All $\alpha_i$ consequently = 1

• Decrease $\lambda$ while keeping track of the following sets:

• Theoretical Findings: $\alpha_i$ values are piecewise linear w.r.t. C

• Can find “breakpoints” of $\alpha_i$ and interpolate in between

Hastie, Zhu, Tibshirani, Rosset, 2004

Tracing Path of SVMs w.r.t. C

• Breakpoint: $\alpha_i$ reaches 0 or 1 (boundary), or $y_i f(x_i) = 1$ for a point in set O or I

• Find corresponding $\lambda$ and store $\alpha_i$'s

• Terminate when either:

Set I is empty (classes separable)

$\lambda \approx 0$

Hastie, Zhu, Tibshirani, Rosset, 2004
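
Once breakpoints are stored, the coefficients at any intermediate value of the path parameter can be recovered by linear interpolation. The sketch below assumes the breakpoints are kept as `(t, alphas)` pairs sorted by increasing `t`, where `t` stands for whichever parameter the coefficients are piecewise linear in. The storage layout and the function name are hypothetical, and this covers only the lookup step, not the path-tracing algorithm itself.

```python
from bisect import bisect_right


def interpolate_alphas(breakpoints, t):
    """Linearly interpolate stored alpha vectors at path parameter t.

    breakpoints: list of (t_k, alphas_k) pairs, sorted by increasing t_k.
    Values of t outside the stored range are clamped to the nearest end.
    """
    ts = [bp[0] for bp in breakpoints]
    if t <= ts[0]:
        return list(breakpoints[0][1])
    if t >= ts[-1]:
        return list(breakpoints[-1][1])
    k = bisect_right(ts, t)  # ts[k - 1] <= t < ts[k]
    t0, a0 = breakpoints[k - 1]
    t1, a1 = breakpoints[k]
    w = (t - t0) / (t1 - t0)
    return [(1.0 - w) * p + w * q for p, q in zip(a0, a1)]


# Two stored breakpoints for a two-point problem (hypothetical values).
breakpoints = [(1.0, [0.0, 1.0]), (3.0, [1.0, 0.0])]
midpoint = interpolate_alphas(breakpoints, 2.0)
```

Halfway between the two stored breakpoints, each coefficient is the average of its two stored values, so `midpoint` is `[0.5, 0.5]`.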

Linear Example

Hastie, Zhu, Tibshirani, Rosset, 2004

Finding Optimal C in Cross Validation

• Stored $\alpha_i$ generate all possible SVMs for a given training set.

• Test all possible SVMs on new test points.

• Find SVM with minimum error rate and corresponding C

• Tested method on +/- continuant feature of NTIMIT corpus.

• Resultant C values were close to those found by training individual SVMs.
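
The selection step described above can be sketched as follows: given one candidate classifier per C value, compute each one's error rate on held-out points and keep the minimum. The `(C, predict)` pair representation, the function names, and the one-dimensional threshold classifiers are hypothetical stand-ins for SVMs generated from the stored path.

```python
def error_rate(predict, points, labels):
    """Fraction of held-out points whose predicted label is wrong."""
    wrong = sum(1 for x, y in zip(points, labels) if predict(x) != y)
    return wrong / len(labels)


def select_best_c(models, val_points, val_labels):
    """Return (C, error) for the candidate with the lowest held-out error.

    models: list of (C, predict) pairs; ties go to the earliest candidate.
    """
    scored = [(error_rate(p, val_points, val_labels), c) for c, p in models]
    best_err, best_c = min(scored, key=lambda pair: pair[0])
    return best_c, best_err


# Stand-in classifiers: simple thresholds on one attribute (hypothetical).
models = [
    (0.1, lambda x: 1 if x[0] > 0.0 else -1),
    (1.0, lambda x: 1 if x[0] > 1.0 else -1),
    (10.0, lambda x: 1 if x[0] > 2.0 else -1),
]
val_points = [[0.5], [1.5], [2.5], [3.0]]
val_labels = [-1, 1, 1, 1]
best_c, best_err = select_best_c(models, val_points, val_labels)
```

In this toy setup the middle candidate classifies all four held-out points correctly, so `best_c` is 1.0 with an error rate of 0.0.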

Conclusions

• Can use entire regularization path of SVM to find optimal C value for cross validation

• Faster than training an individual SVM for each C value

• Finer traversal of C values