Cross Validation of SVMs for Acoustic Feature Classification using Entire
Regularization Path
Tianyu Tom Wang
T. Hastie et al. 2004
WS04
Acoustic Feature Detection
Manner features: e.g. +sonorant vs. −sonorant
Place of articulation features: e.g. +lips vs. −lips
Vectorize spectrograms as feature vectors labeled +/− for each manner/place feature (toy sketch below)
Hasegawa-Johnson, 2004
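As a toy illustration of this data setup (all frame counts, spectral values, and labels below are invented), a Python sketch:

```python
import numpy as np

# Toy sketch (all values invented): one spectrogram frame -> one feature
# vector x_i, with a +/-1 label for a single binary feature (e.g. sonorant).
rng = np.random.default_rng(0)
n_frames, n_bins = 6, 4                       # 6 frames, 4 spectral bins each
spectrogram = rng.random((n_frames, n_bins))  # stand-in for a real spectrogram
X = spectrogram                               # row i = feature vector x_i
y = np.array([+1, +1, -1, +1, -1, -1])        # +1 = +sonorant, -1 = -sonorant
print(X.shape, y.shape)                       # (6, 4) (6,)
```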
Binary Linear SVM Classifiers
• $n$ training pairs $(x_i, y_i)$, where $x_i \in \mathbb{R}^p$, $p$ = number of attributes of each vector, and $y_i \in \{-1, +1\}$
• Linear SVM: $h(x_i) = \mathrm{sgn}[f(x_i)]$, $f(x_i) = \beta_0 + \beta^T x_i$ ($\beta^T x_i$: dot product)
• Finding $f(x)$: minimize $\frac{1}{2}\|\beta\|^2$
subject to $y_i f(x_i) \ge 1$ for each $i$
Hastie, Zhu, Tibshirani, Rosset, 2004
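A minimal Python sketch of the decision rule above, assuming $\beta_0$ and $\beta$ have already been produced by some SVM solver (the values here are made up):

```python
import numpy as np

# Linear decision rule h(x) = sgn(beta_0 + beta^T x).
# beta_0 and beta are invented stand-ins for solver output.
beta_0 = -0.5
beta = np.array([1.0, -2.0])

def f(x):
    """f(x) = beta_0 + beta^T x."""
    return beta_0 + beta @ x

def h(x):
    """Classifier output: sign of f(x), in {-1, +1}."""
    return 1 if f(x) >= 0 else -1

x_i = np.array([2.0, 0.5])
print(f(x_i), h(x_i))  # 0.5 -> +1
```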
Inseparable Case, Binary Linear SVM Classifiers
Zhu, Hastie, 2001
Slack Variables
$\xi_i = 1 - y_i f(x_i)$
Outside boundary (correct):
$\xi_i < 0$
Inside boundary (correct/incorrect):
$\xi_i > 0$
On boundary (correct):
$\xi_i = 0$
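These regions can be checked numerically. A small Python sketch with invented margins $y_i f(x_i)$:

```python
import numpy as np

# Slack values xi_i = 1 - y_i f(x_i) for invented margins y_i f(x_i),
# and the region each point falls in.
yf = np.array([1.7, 1.0, 0.4, -0.3])   # y_i f(x_i)
xi = 1.0 - yf                          # slack
for m, s in zip(yf, xi):
    if s < 0:
        region = "outside margin (correct)"
    elif s == 0:
        region = "on margin"
    else:
        region = "inside margin" + (" (incorrect)" if m < 0 else " (correct)")
    print(f"y*f = {m:+.1f}  xi = {s:+.1f}  {region}")
```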
Binary Linear SVM Classifiers, Slack Variables
• Finding $f(x)$: minimize $\frac{1}{2}\|\beta\|^2 + C \sum_i \xi_i$
subject to $y_i f(x_i) \ge 1 - \xi_i$, $\xi_i \ge 0$, for each $i$
Rewrite as the loss + penalty form:
$\min_{\beta_0, \beta} \; \sum_i \left[1 - y_i f(x_i)\right]_+ + \frac{\lambda}{2}\|\beta\|^2$
$C$ = cost parameter $= 1/\lambda$. Note: $[\mathrm{arg}]_+$ indicates $\max(0, \mathrm{arg})$
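A short Python sketch evaluating this loss + penalty objective; the data, coefficients, and $\lambda$ below are invented for illustration:

```python
import numpy as np

# Loss + penalty form of the SVM objective:
#   sum_i [1 - y_i f(x_i)]_+  +  (lambda/2) ||beta||^2,   with C = 1/lambda.
# Data and parameters are invented.
X = np.array([[2.0, 0.5], [-1.0, 1.0], [0.2, -0.4]])
y = np.array([+1, -1, +1])
beta_0, beta, lam = -0.5, np.array([1.0, -2.0]), 0.1

f = beta_0 + X @ beta                   # f(x_i) for every point
hinge = np.maximum(0.0, 1.0 - y * f)    # [1 - y_i f(x_i)]_+
objective = hinge.sum() + 0.5 * lam * beta @ beta
print(hinge, objective)                 # [0.5 0.  0.5] 1.25
```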
Generalizing to Non-Linear Classifiers
• Kernel SVM: $h(x) = \mathrm{sgn}[f(x)]$, $f(x) = \beta_0 + g(x)$, where $g(x) = \sum_i \alpha_i y_i K(x_i, x) = \sum_i \alpha_i y_i \phi(x_i)^T \phi(x)$
• The value picked for $C = 1/\lambda$ is crucial to the error rate.
Hastie, Zhu, Tibshirani, Rosset, 2004
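A Python sketch of the kernel decision function, here with an RBF kernel; the $\alpha_i$, $\beta_0$, and $\gamma$ values are invented rather than solved for:

```python
import numpy as np

# Kernel decision function g(x) = sum_i alpha_i y_i K(x_i, x), RBF kernel.
# alpha_i, beta_0, gamma are invented stand-ins for solver output.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
y_train = np.array([+1, -1, +1])
alpha = np.array([0.8, 1.0, 0.2])
beta_0, gamma = 0.1, 0.5

def K(a, b):
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def g(x):
    return sum(a * yy * K(xi, x) for a, yy, xi in zip(alpha, y_train, X_train))

x_new = np.array([0.5, 0.5])
print(np.sign(beta_0 + g(x_new)))  # h(x) = sgn[beta_0 + g(x)]
```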
Example – Mixture of Gaussians
Hastie, Zhu, Tibshirani, Rosset, 2004
Tracing Path of SVMs w.r.t. C
KKT conditions for the SVM solution ($0 \le \alpha_i \le 1$; $C = 1/\lambda$ controls the width of the margin, $\propto 1/\|\beta\|$):
$y_i f(x_i) < 1$: $\xi_i > 0$, $\alpha_i = 1$ (inside margin; incorrectly classified if $y_i f(x_i) < 0$)
$y_i f(x_i) > 1$: $\xi_i = 0$, $\alpha_i = 0$ (outside margin)
$y_i f(x_i) = 1$: $\xi_i = 0$, $0 \le \alpha_i \le 1$ (on margin: support vector)
Hastie, Zhu, Tibshirani, Rosset, 2004
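A Python sketch partitioning points into these regimes from their margins $y_i f(x_i)$ (values invented):

```python
import numpy as np

# Classify training points into the three KKT regimes from their margins.
yf = np.array([1.8, 1.0, 0.6, -0.2, 1.0])
O = np.where(yf > 1)[0]              # outside margin: alpha_i = 0
I = np.where(yf < 1)[0]              # inside margin:  alpha_i = 1
M = np.where(np.isclose(yf, 1))[0]   # on margin: support vectors
print("O:", O, "I:", I, "margin:", M)
```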
Tracing Path of SVMs w.r.t. C
• Algorithm: set $\lambda$ to be large (margin very wide); all $\alpha_i$ consequently $= 1$
• Decrease $\lambda$ while keeping track of the following sets: points inside the margin (I), outside the margin (O), and on the margin (support vectors)
• Theoretical finding: the $\alpha_i$ values are piecewise linear w.r.t. $\lambda = 1/C$, so one can find the "breakpoints" of $\alpha_i$ and interpolate in between (sketch below)
Hastie, Zhu, Tibshirani, Rosset, 2004
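A Python sketch of the interpolation step implied by the piecewise-linear property: with $\alpha_i$ stored at two consecutive breakpoints in $\lambda$, any intermediate $\alpha_i$ follows by linear interpolation (breakpoint values invented):

```python
import numpy as np

# alpha_i at two consecutive breakpoints lambda_k > lambda_{k+1} (invented).
lam_k, lam_k1 = 2.0, 1.0
alpha_k = np.array([1.0, 1.0, 0.7])    # alpha at lambda_k
alpha_k1 = np.array([1.0, 0.4, 0.0])   # alpha at lambda_{k+1}

def alpha_at(lam):
    """Linearly interpolate alpha between the two stored breakpoints."""
    t = (lam_k - lam) / (lam_k - lam_k1)
    return (1 - t) * alpha_k + t * alpha_k1

print(alpha_at(1.5))  # midway: [1.0, 0.7, 0.35]
```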
Tracing Path of SVMs w.r.t. C
• Breakpoint: some $\alpha_i$ reaches 0 or 1 (boundary), or $y_i f(x_i) = 1$ for a point in set O or I
• Find the corresponding $\lambda$ and store the $\alpha_i$'s
• Terminate when either (loop skeleton sketched below):
Set I is empty (classes separable)
$\lambda \approx 0$
Hastie, Zhu, Tibshirani, Rosset, 2004
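A Python skeleton of the outer loop with these termination tests; `next_breakpoint` is a hypothetical stub standing in for the actual breakpoint computation, and the numbers are invented so the loop terminates:

```python
import numpy as np

def next_breakpoint(lam, alpha):
    """Hypothetical stub: return the next (lambda, alpha) breakpoint."""
    return 0.5 * lam, np.clip(alpha - 0.3, 0.0, 1.0)

lam, alpha = 8.0, np.ones(4)   # large lambda: wide margin, all alpha_i = 1
path = [(lam, alpha.copy())]
# Stop when lambda ~ 0; the alpha check is this stub's crude stand-in
# for the "set I is empty" test.
while lam > 1e-3 and np.any(alpha > 0):
    lam, alpha = next_breakpoint(lam, alpha)
    path.append((lam, alpha.copy()))
print(len(path), "breakpoints stored")
```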
Linear Example
Hastie, Zhu, Tibshirani, Rosset, 2004
Finding Optimal C in Cross Validation
• The stored $\alpha_i$ generate all possible SVMs for a given training set.
• Test all possible SVMs on new test points.
• Find the SVM with the minimum error rate and its corresponding $C$ (sketch below).
• Tested the method on the +/− continuant feature of the NTIMIT corpus.
• The resulting $C$ values were close to those found by training individual SVMs.
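A Python sketch of this selection step over a stored path; each entry below is an invented $(C, \beta_0, \beta)$ triple for a linear SVM, scored on held-out points:

```python
import numpy as np

# Held-out test points (invented).
X_test = np.array([[0.5, 1.0], [-1.0, 0.0], [2.0, -1.0]])
y_test = np.array([+1, -1, +1])

# Hypothetical stored path: (C, beta_0, beta) per breakpoint, linear SVM.
path = [
    (0.1, 0.0, np.array([0.2, 0.1])),
    (1.0, -0.2, np.array([1.0, 0.3])),
    (10.0, -0.5, np.array([1.5, -0.2])),
]

# Pick the entry with the minimum held-out error rate.
best = min(
    path,
    key=lambda p: np.mean(np.sign(p[1] + X_test @ p[2]) != y_test),
)
print("best C:", best[0])
```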
Conclusions
• Can use the entire regularization path of the SVM to find the optimal $C$ value via cross validation
• Faster than training an individual SVM for each $C$ value
• Allows finer traversal of $C$ values