
Cross Validation of SVMs for Acoustic Feature Classification using Entire Regularization Path

Tianyu Tom Wang

T. Hastie et al. 2004

WS04

Acoustic Feature Detection

Manner Features: e.g. +sonorant vs. –sonorant

Place of Articulation Feature: e.g. +lips vs. –lips

Vectorize spectrograms as feature vectors, each labeled +/− for a manner/place feature (a minimal vectorization sketch follows below)

Hasegawa-Johnson, 2004
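
A minimal sketch of the vectorization step, assuming a toy spectrogram patch; the dimensions, the flattening choice, and the example label are illustrative assumptions, not the Hasegawa-Johnson (2004) pipeline.

```python
import numpy as np

# Hypothetical spectrogram segment: 40 frequency bins x 20 time frames
# (illustrative dimensions, not those used in the WS04 experiments).
spectrogram_segment = np.random.rand(40, 20)

# Flatten the time-frequency patch into a single feature vector xi (p = 800).
xi = spectrogram_segment.reshape(-1)

# Binary label for one acoustic feature, e.g. +sonorant -> +1, -sonorant -> -1.
yi = +1

print(xi.shape, yi)  # (800,) 1
```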

Binary Linear SVM Classifiers

• n training pairs (xi, yi), where xi ∈ ℝ^p, p = number of attributes of the vector, and yi ∈ {−1, +1}

• Linear SVM: h(xi) = sgn[f(xi)], with f(xi) = β0 + βᵀxi (βᵀxi: dot product; a small sketch follows this slide)

• Finding f(xi): minimize ||β||

subject to yi f(xi) ≥ 1 for each i

Hastie, Zhu, Tibshirani, Rosset, 2004
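
A small NumPy sketch of the decision rule and margin constraint above; the β values and training pairs are made-up numbers for illustration only.

```python
import numpy as np

def f(x, beta0, beta):
    """Linear discriminant f(x) = beta0 + beta^T x."""
    return beta0 + beta @ x

def h(x, beta0, beta):
    """SVM classifier h(x) = sgn[f(x)]."""
    return np.sign(f(x, beta0, beta))

# Toy parameters and training pairs (illustrative values only).
beta0, beta = -0.5, np.array([1.0, 2.0])
X = np.array([[1.0, 1.0], [-1.0, -0.5]])
y = np.array([+1, -1])

for xi, yi in zip(X, y):
    print(h(xi, beta0, beta), "constraint yi*f(xi) >= 1:", yi * f(xi, beta0, beta) >= 1)
```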

Binary Linear SVM Classifiers

Inseparable Case, Binary Linear SVM Classifiers

Zhu, Hastie, 2001

Slack Variables

ξi = 1 − yi f(xi)

Outside Boundary (correct): ξi < 0

Inside Boundary (correct/incorrect): ξi > 0

On Boundary (correct): ξi = 0

Binary Linear SVM Classifiers, Slack Variables

• Finding f(x): minimize ||β||

subject to yi f(xi) ≥ 1 − ξi, ξi ≥ 0, for each i

• Rewrite as a loss + penalty criterion: minimize Σi [1 − yi f(xi)]+ + (λ/2) ||β||²

C = cost parameter = 1/λ. Note: [arg]+ indicates max(0, arg)

ξi = [1 − yi f(xi)]+
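
A minimal sketch that evaluates the slack variables and the loss + penalty criterion above for made-up f(xi) values and β; it only computes the criterion, it does not solve the optimization.

```python
import numpy as np

def slack(y, fx):
    """xi_i = [1 - yi f(xi)]+ , i.e. max(0, 1 - yi f(xi))."""
    return np.maximum(0.0, 1.0 - y * fx)

def objective(y, fx, beta, lam):
    """Loss + penalty: sum_i [1 - yi f(xi)]+ + (lam / 2) ||beta||^2, with lam = 1/C."""
    return slack(y, fx).sum() + 0.5 * lam * np.dot(beta, beta)

# Toy values: f(xi) already evaluated at three points.
y = np.array([+1, +1, -1])
fx = np.array([1.5, 0.3, -0.2])        # outside margin, inside margin, inside margin
beta, lam = np.array([1.0, 2.0]), 0.5  # illustrative beta and lambda

print(slack(y, fx))                 # [0.  0.7 0.8]
print(objective(y, fx, beta, lam))  # 1.5 + 1.25 = 2.75
```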

Generalizing to Non-Linear Classifiers

• Kernel SVM: h(x) = sgn[f(x)], with f(x) = β0 + g(x), where g(x) = Σi αi yi K(xi, x) = Σi αi yi Φ(xi)ᵀΦ(x) (a sketch of g(x) follows below)

• The value picked for C = 1/λ is crucial to the error rate.

Hastie, Zhu, Tibshirani, Rosset, 2004
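
A sketch of the kernel decision function g(x) = Σi αi yi K(xi, x); the RBF kernel choice, γ, and the αi, β0 values are illustrative assumptions, not fitted quantities.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """K(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def g(x, X_train, y_train, alpha, gamma=1.0):
    """g(x) = sum_i alpha_i yi K(xi, x)."""
    return sum(a * yi * rbf_kernel(xi, x, gamma)
               for a, yi, xi in zip(alpha, y_train, X_train))

def h(x, X_train, y_train, alpha, beta0, gamma=1.0):
    """Kernel SVM classifier h(x) = sgn[beta0 + g(x)]."""
    return np.sign(beta0 + g(x, X_train, y_train, alpha, gamma))

# Toy training points with made-up alpha and beta0.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
y_train = np.array([+1, -1, +1])
alpha, beta0 = np.array([0.8, 1.0, 0.3]), 0.1

print(h(np.array([0.2, 0.1]), X_train, y_train, alpha, beta0))
```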

Example – mixture Gaussians

Hastie, Zhu, Tibshirani, Rosset, 2004

Tracing Path of SVMs w.r.t. C

One condition for the SVM solution: 0 ≤ αi ≤ 1

C = 1/λ controls the width of the margin (1/||β||)

yi f(xi) < 1: ξi > 0, αi = 1, inside the margin (or incorrectly classified)

yi f(xi) > 1: ξi = 0, αi = 0, outside the margin

yi f(xi) = 1: ξi = 0, 0 < αi < 1, on the margin, support vector

Hastie, Zhu, Tibshirani, Rosset, 2004
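
A small sketch that sorts training points into the three cases above from their margins yi f(xi); the f(xi) values are made up, and a numerical tolerance stands in for exact equality on the margin.

```python
import numpy as np

def kkt_case(y, fx, tol=1e-6):
    """Assign each training point to a case based on its margin yi f(xi)."""
    cases = []
    for m in y * fx:
        if m < 1 - tol:
            cases.append("inside margin:  xi_i > 0, alpha_i = 1")
        elif m > 1 + tol:
            cases.append("outside margin: xi_i = 0, alpha_i = 0")
        else:
            cases.append("on margin (support vector): xi_i = 0, 0 <= alpha_i <= 1")
    return cases

y = np.array([+1, -1, +1])
fx = np.array([0.4, -1.7, 1.0])  # toy f(xi) values
for case in kkt_case(y, fx):
    print(case)
```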

Tracing Path of SVMs w.r.t. C

• Algorithm: set λ to be large (margin very wide); all αi are consequently = 1

• Decrease λ while keeping track of the following sets: points on the margin, inside the margin (set I), and outside the margin (set O)

• Theoretical finding: the αi values are piecewise linear w.r.t. λ (where C = 1/λ)

• Can find the “breakpoints” of the αi and interpolate in between (a small interpolation sketch follows this slide)

Hastie, Zhu, Tibshirani, Rosset, 2004
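
A minimal sketch of the “interpolate between breakpoints” idea: given stored breakpoint values of λ and the αi's at each breakpoint (the numbers below are made up, not output of the actual path algorithm), the αi at any intermediate λ follow by piecewise-linear interpolation instead of retraining.

```python
import numpy as np

# Hypothetical breakpoints: lambda values (decreasing along the path) and the
# corresponding alpha_i for three training points at each breakpoint.
lambdas = np.array([10.0, 4.0, 1.5, 0.2])
alphas = np.array([[1.0, 1.0, 1.0],
                   [1.0, 0.6, 1.0],
                   [0.4, 0.0, 1.0],
                   [0.0, 0.0, 0.7]])

def alpha_at(lam):
    """alpha_i(lambda) between breakpoints via piecewise-linear interpolation."""
    # np.interp expects an increasing grid, so reverse the decreasing lambdas.
    return np.array([np.interp(lam, lambdas[::-1], alphas[::-1, i])
                     for i in range(alphas.shape[1])])

print(alpha_at(2.0))  # alpha_i's at an intermediate lambda, no retraining needed
```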

Tracing Path of SVMs w.r.t. C

• Breakpoint: an αi reaches 0 or 1 (boundary), or yi f(xi) = 1 for a point in set O or I

• Find the corresponding λ and store the αi's

• Terminate when either:

Set I is empty (classes separable)

λ ≈ 0

Hastie, Zhu, Tibshirani, Rosset, 2004

Linear Example

Hastie, Zhu, Tibshirani, Rosset, 2004

Finding Optimal C in Cross Validation

• The stored αi's generate all possible SVMs for a given training set.

• Test all possible SVMs on new test points.

• Find SVM with minimum error rate and corresponding C

• Tested the method on the +/- continuant feature of the NTIMIT corpus.

• The resulting C values were close to those found by training individual SVMs at each C (a sketch of that baseline follows below).
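
For contrast, a sketch of the baseline the path method speeds up: retraining a separate SVM at each C on a grid and cross-validating. scikit-learn and synthetic data are used here for illustration; this is not the NTIMIT +/- continuant setup.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (the real experiments used vectorized NTIMIT spectrograms).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = np.where(X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0, 1, -1)

# Baseline: train one SVM per candidate C; the path method instead recovers the
# intermediate solutions from the stored breakpoint alpha_i's.
C_grid = np.logspace(-2, 2, 9)
scores = [cross_val_score(SVC(kernel="linear", C=C), X, y, cv=5).mean()
          for C in C_grid]

best = int(np.argmax(scores))
print("best C:", C_grid[best], "CV accuracy:", round(scores[best], 3))
```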

Conclusions

• Can use the entire regularization path of an SVM to find the optimal C value in cross validation

• Faster than training an individual SVM for each C value

• Allows a finer traversal of C values