Cross Validation of SVMs for Acoustic Feature Classification using Entire
Regularization Path
Tianyu Tom Wang
T. Hastie et al. 2004
WS04
Acoustic Feature Detection
Manner features: e.g. +sonorant vs. −sonorant
Place of articulation features: e.g. +lips vs. −lips
Vectorize spectrograms as feature vectors labeled +/− for each manner/place feature (toy sketch below)
Hasegawa-Johnson, 2004
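As a toy illustration of this data setup (all frame counts, spectral values, and labels below are invented), a Python sketch:

```python
import numpy as np

# Toy sketch (all values invented): one spectrogram frame -> one feature
# vector x_i, with a +/-1 label for a single binary feature (e.g. sonorant).
rng = np.random.default_rng(0)
n_frames, n_bins = 6, 4                       # 6 frames, 4 spectral bins each
spectrogram = rng.random((n_frames, n_bins))  # stand-in for a real spectrogram
X = spectrogram                               # row i = feature vector x_i
y = np.array([+1, +1, -1, +1, -1, -1])        # +1 = +sonorant, -1 = -sonorant
print(X.shape, y.shape)                       # (6, 4) (6,)
```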
Binary Linear SVM Classifiers
• $n$ training pairs $(x_i, y_i)$, where $x_i \in \mathbb{R}^p$, $p$ = number of attributes of each vector, and $y_i \in \{-1, +1\}$
• Linear SVM: $h(x_i) = \mathrm{sgn}[f(x_i)]$, $f(x_i) = \beta_0 + \beta^T x_i$ ($\beta^T x_i$: dot product)
• Finding $f(x)$: minimize $\frac{1}{2}\|\beta\|^2$
subject to $y_i f(x_i) \ge 1$ for each $i$
Hastie, Zhu, Tibshirani, Rosset, 2004
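A minimal Python sketch of the decision rule above, assuming $\beta_0$ and $\beta$ have already been produced by some SVM solver (the values here are made up):

```python
import numpy as np

# Linear decision rule h(x) = sgn(beta_0 + beta^T x).
# beta_0 and beta are invented stand-ins for solver output.
beta_0 = -0.5
beta = np.array([1.0, -2.0])

def f(x):
    """f(x) = beta_0 + beta^T x."""
    return beta_0 + beta @ x

def h(x):
    """Classifier output: sign of f(x), in {-1, +1}."""
    return 1 if f(x) >= 0 else -1

x_i = np.array([2.0, 0.5])
print(f(x_i), h(x_i))  # 0.5 -> +1
```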
Inseparable Case, Binary Linear SVM Classifiers
Zhu, Hastie, 2001
Slack Variables
$\xi_i = 1 - y_i f(x_i)$
Outside boundary (correct):
$\xi_i < 0$
Inside boundary (correct/incorrect):
$\xi_i > 0$
On boundary (correct):
$\xi_i = 0$
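These regions can be checked numerically. A small Python sketch with invented margins $y_i f(x_i)$:

```python
import numpy as np

# Slack values xi_i = 1 - y_i f(x_i) for invented margins y_i f(x_i),
# and the region each point falls in.
yf = np.array([1.7, 1.0, 0.4, -0.3])   # y_i f(x_i)
xi = 1.0 - yf                          # slack
for m, s in zip(yf, xi):
    if s < 0:
        region = "outside margin (correct)"
    elif s == 0:
        region = "on margin"
    else:
        region = "inside margin" + (" (incorrect)" if m < 0 else " (correct)")
    print(f"y*f = {m:+.1f}  xi = {s:+.1f}  {region}")
```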
Binary Linear SVM Classifiers, Slack Variables
• Finding $f(x)$: minimize $\frac{1}{2}\|\beta\|^2 + C \sum_i \xi_i$
subject to $y_i f(x_i) \ge 1 - \xi_i$, $\xi_i \ge 0$, for each $i$
Rewrite as the loss + penalty form:
$\min_{\beta_0, \beta} \; \sum_i \left[1 - y_i f(x_i)\right]_+ + \frac{\lambda}{2}\|\beta\|^2$
$C$ = cost parameter $= 1/\lambda$. Note: $[\mathrm{arg}]_+$ indicates $\max(0, \mathrm{arg})$
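A short Python sketch evaluating this loss + penalty objective; the data, coefficients, and $\lambda$ below are invented for illustration:

```python
import numpy as np

# Loss + penalty form of the SVM objective:
#   sum_i [1 - y_i f(x_i)]_+  +  (lambda/2) ||beta||^2,   with C = 1/lambda.
# Data and parameters are invented.
X = np.array([[2.0, 0.5], [-1.0, 1.0], [0.2, -0.4]])
y = np.array([+1, -1, +1])
beta_0, beta, lam = -0.5, np.array([1.0, -2.0]), 0.1

f = beta_0 + X @ beta                   # f(x_i) for every point
hinge = np.maximum(0.0, 1.0 - y * f)    # [1 - y_i f(x_i)]_+
objective = hinge.sum() + 0.5 * lam * beta @ beta
print(hinge, objective)                 # [0.5 0.  0.5] 1.25
```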
Generalizing to Non-Linear Classifiers
• Kernel SVM: $h(x) = \mathrm{sgn}[f(x)]$, $f(x) = \beta_0 + g(x)$, where $g(x) = \sum_i \alpha_i y_i K(x_i, x) = \sum_i \alpha_i y_i \phi(x_i)^T \phi(x)$
• The value picked for $C = 1/\lambda$ is crucial to the error rate.
Hastie, Zhu, Tibshirani, Rosset, 2004
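A Python sketch of the kernel decision function, here with an RBF kernel; the $\alpha_i$, $\beta_0$, and $\gamma$ values are invented rather than solved for:

```python
import numpy as np

# Kernel decision function g(x) = sum_i alpha_i y_i K(x_i, x), RBF kernel.
# alpha_i, beta_0, gamma are invented stand-ins for solver output.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
y_train = np.array([+1, -1, +1])
alpha = np.array([0.8, 1.0, 0.2])
beta_0, gamma = 0.1, 0.5

def K(a, b):
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def g(x):
    return sum(a * yy * K(xi, x) for a, yy, xi in zip(alpha, y_train, X_train))

x_new = np.array([0.5, 0.5])
print(np.sign(beta_0 + g(x_new)))  # h(x) = sgn[beta_0 + g(x)]
```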
Example – Mixture of Gaussians
Hastie, Zhu, Tibshirani, Rosset, 2004
Tracing Path of SVMs w.r.t. C
KKT conditions for the SVM solution ($0 \le \alpha_i \le 1$; $C = 1/\lambda$ controls the width of the margin, $\propto 1/\|\beta\|$):
$y_i f(x_i) < 1$: $\xi_i > 0$, $\alpha_i = 1$ (inside margin; incorrectly classified if $y_i f(x_i) < 0$)
$y_i f(x_i) > 1$: $\xi_i = 0$, $\alpha_i = 0$ (outside margin)
$y_i f(x_i) = 1$: $\xi_i = 0$, $0 \le \alpha_i \le 1$ (on margin: support vector)
Hastie, Zhu, Tibshirani, Rosset, 2004
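A Python sketch partitioning points into these regimes from their margins $y_i f(x_i)$ (values invented):

```python
import numpy as np

# Classify training points into the three KKT regimes from their margins.
yf = np.array([1.8, 1.0, 0.6, -0.2, 1.0])
O = np.where(yf > 1)[0]              # outside margin: alpha_i = 0
I = np.where(yf < 1)[0]              # inside margin:  alpha_i = 1
M = np.where(np.isclose(yf, 1))[0]   # on margin: support vectors
print("O:", O, "I:", I, "margin:", M)
```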
Tracing Path of SVMs w.r.t. C
• Algorithm: set $\lambda$ to be large (margin very wide); all $\alpha_i$ consequently $= 1$
• Decrease $\lambda$ while keeping track of the following sets: points inside the margin (I), outside the margin (O), and on the margin (support vectors)
• Theoretical finding: the $\alpha_i$ values are piecewise linear w.r.t. $\lambda = 1/C$, so one can find the "breakpoints" of $\alpha_i$ and interpolate in between (sketch below)
Hastie, Zhu, Tibshirani, Rosset, 2004
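A Python sketch of the interpolation step implied by the piecewise-linear property: with $\alpha_i$ stored at two consecutive breakpoints in $\lambda$, any intermediate $\alpha_i$ follows by linear interpolation (breakpoint values invented):

```python
import numpy as np

# alpha_i at two consecutive breakpoints lambda_k > lambda_{k+1} (invented).
lam_k, lam_k1 = 2.0, 1.0
alpha_k = np.array([1.0, 1.0, 0.7])    # alpha at lambda_k
alpha_k1 = np.array([1.0, 0.4, 0.0])   # alpha at lambda_{k+1}

def alpha_at(lam):
    """Linearly interpolate alpha between the two stored breakpoints."""
    t = (lam_k - lam) / (lam_k - lam_k1)
    return (1 - t) * alpha_k + t * alpha_k1

print(alpha_at(1.5))  # midway: [1.0, 0.7, 0.35]
```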
Tracing Path of SVMs w.r.t. C
• Breakpoint: some $\alpha_i$ reaches 0 or 1 (boundary), or $y_i f(x_i) = 1$ for a point in set O or I
• Find the corresponding $\lambda$ and store the $\alpha_i$'s
• Terminate when either (loop skeleton sketched below):
Set I is empty (classes separable)
$\lambda \approx 0$
Hastie, Zhu, Tibshirani, Rosset, 2004
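A Python skeleton of the outer loop with these termination tests; `next_breakpoint` is a hypothetical stub standing in for the actual breakpoint computation, and the numbers are invented so the loop terminates:

```python
import numpy as np

def next_breakpoint(lam, alpha):
    """Hypothetical stub: return the next (lambda, alpha) breakpoint."""
    return 0.5 * lam, np.clip(alpha - 0.3, 0.0, 1.0)

lam, alpha = 8.0, np.ones(4)   # large lambda: wide margin, all alpha_i = 1
path = [(lam, alpha.copy())]
# Stop when lambda ~ 0; the alpha check is this stub's crude stand-in
# for the "set I is empty" test.
while lam > 1e-3 and np.any(alpha > 0):
    lam, alpha = next_breakpoint(lam, alpha)
    path.append((lam, alpha.copy()))
print(len(path), "breakpoints stored")
```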
Linear Example
Hastie, Zhu, Tibshirani, Rosset, 2004
Finding Optimal C in Cross Validation
• The stored $\alpha_i$ generate all possible SVMs for a given training set.
• Test all possible SVMs on new test points.
• Find the SVM with the minimum error rate and its corresponding $C$ (sketch below).
• Tested the method on the +/− continuant feature of the NTIMIT corpus.
• The resulting $C$ values were close to those found by training individual SVMs.
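A Python sketch of this selection step over a stored path; each entry below is an invented $(C, \beta_0, \beta)$ triple for a linear SVM, scored on held-out points:

```python
import numpy as np

# Held-out test points (invented).
X_test = np.array([[0.5, 1.0], [-1.0, 0.0], [2.0, -1.0]])
y_test = np.array([+1, -1, +1])

# Hypothetical stored path: (C, beta_0, beta) per breakpoint, linear SVM.
path = [
    (0.1, 0.0, np.array([0.2, 0.1])),
    (1.0, -0.2, np.array([1.0, 0.3])),
    (10.0, -0.5, np.array([1.5, -0.2])),
]

# Pick the entry with the minimum held-out error rate.
best = min(
    path,
    key=lambda p: np.mean(np.sign(p[1] + X_test @ p[2]) != y_test),
)
print("best C:", best[0])
```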
Conclusions
• Can use the entire regularization path of the SVM to find the optimal $C$ value via cross validation
• Faster than training an individual SVM for each $C$ value
• Allows finer traversal of $C$ values