Support Vector Machines and Kernel Methods
by Lucian Huluta
06/15/2009
Support Vector Machine (SVM)
What is a Support Vector Machine?
A statistical tool, essentially used for NONLINEAR classification/regression.
A SUPERVISED LEARNING mechanism, like neural networks.
A quick and adaptive method for PATTERN ANALYSIS.
A fast and flexible approach for learning COMPLEX SYSTEMS.
Support Vector Machine (SVM)
Strengths
Few parameters required for tuning the learning machine
Learning involves optimisation of a convex function
It scales relatively well to high-dimensional data
Weaknesses
Training on large data sets is still difficult
Need to choose a “good” kernel function
SVM: linear classification
Binary classification problem in the input space $\mathbf{x} \in \mathbb{R}^m$: find a decision function
$D(\mathbf{x}) = \mathbf{w}^\top \mathbf{x} + b$
where $\mathbf{w}$ – weights ($m$-dimensional vector), $b$ – bias.
More than one solution for the decision function!
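A linear decision function of this form can be sketched in a few lines; the weight vector and bias below are illustrative values, not learned from data:

```python
import numpy as np

# Illustrative parameters of D(x) = w^T x + b (not learned).
w = np.array([1.0, -1.0])   # weights (m-dimensional vector), here m = 2
b = 0.5                     # bias

def decide(x):
    """Classify a point by the sign of the decision function: +1 or -1."""
    return 1 if w @ x + b >= 0 else -1
```

Any (w, b) that separates the training data is a valid solution, which is exactly why more than one decision function exists.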
SVM: Generalization capacity
Generalization region
Generalization ability
SVM: Hard margin
Training data must satisfy:
$y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1, \quad i = 1, \dots, M$
Quadratic optimization problem:
minimize $\frac{1}{2}\|\mathbf{w}\|^2$
with constraint: $y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1$
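The margin conditions can be checked numerically. A minimal sketch, assuming a hypothetical 2-point training set for which the hard-margin solution can be written down by hand (w = (0.5, 0.5), b = -1 gives equality in both constraints, so both points are support vectors):

```python
import numpy as np

# Hypothetical toy set: one point per class on the diagonal.
X = np.array([[0.0, 0.0], [2.0, 2.0]])
y = np.array([-1.0, 1.0])
w = np.array([0.5, 0.5])   # hand-derived hard-margin solution (assumption)
b = -1.0

margins = y * (X @ w + b)                  # constraint values y_i(w^T x_i + b)
geometric_margin = 1 / np.linalg.norm(w)   # distance from hyperplane to the support vectors
```

Minimizing ||w||²/2 is equivalent to maximizing this geometric margin 1/||w||.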
SVM: Primal form
Convert the constrained problem into an unconstrained one via the Lagrangian:
$L(\mathbf{w}, b, \boldsymbol{\alpha}) = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{M} \alpha_i \left\{ y_i(\mathbf{w}^\top \mathbf{x}_i + b) - 1 \right\}$
where $\alpha_i \ge 0$ are nonnegative Lagrange multipliers.
Solving $\partial L / \partial \mathbf{w} = 0$ and $\partial L / \partial b = 0$, we obtain:
$\mathbf{w} = \sum_{i=1}^{M} \alpha_i y_i \mathbf{x}_i, \qquad \sum_{i=1}^{M} \alpha_i y_i = 0$
SVM: Dual form
The dual form of the cost function consists only of inner products. Solve the QP problem:
maximize $Q(\boldsymbol{\alpha}) = \sum_{i=1}^{M} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{M} \alpha_i \alpha_j y_i y_j \, \mathbf{x}_i^\top \mathbf{x}_j$
subject to $\sum_{i=1}^{M} \alpha_i y_i = 0, \quad \alpha_i \ge 0$
The resulting SVM is called the hard-margin support vector machine.
SVM: L1-soft margin problem
The modified QP minimizes the cost function:
$\frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{M} \xi_i$
subject to the constraints:
$y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0$
$C$: trade-off between the maximization of the margin and minimization of the classification error.
SVM: L2-soft margin problem
The modified QP minimizes the cost function:
$\frac{1}{2}\|\mathbf{w}\|^2 + \frac{C}{2} \sum_{i=1}^{M} \xi_i^2$
subject to the constraints:
$y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1 - \xi_i$
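The soft-margin objective for a fixed (w, b) can be evaluated directly: the slack variable xi_i = max(0, 1 - y_i(w^T x_i + b)) is the amount by which point i violates its margin constraint. A sketch with illustrative data and parameters (not a learned solution):

```python
import numpy as np

# Third point sits inside the margin, so it needs a positive slack.
X = np.array([[0.0, 0.0], [2.0, 2.0], [1.2, 1.2]])
y = np.array([-1.0, 1.0, -1.0])
w, b, C = np.array([0.5, 0.5]), -1.0, 10.0

xi = np.maximum(0.0, 1.0 - y * (X @ w + b))   # slack variables
cost = 0.5 * w @ w + C * xi.sum()             # L1 soft-margin objective
```

Increasing C penalizes the slacks more heavily, trading margin width against classification error as the slide describes.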
SVM: Regression
Decision function: $f(\mathbf{x}) = \mathbf{w}^\top \mathbf{x} + b$
We assume that all the training data are within a tube with radius $\varepsilon$ around $f(\mathbf{x})$; this gives the $\varepsilon$-insensitive loss function.
Slack variables $\xi_i, \xi_i^*$ measure deviations above and below the tube.
SVM: Regression
Cost function with slack variables:
$\frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{M} \left( \xi_i^p + \xi_i^{*p} \right)$
If $p = 1$: L1 soft-margin; if $p = 2$: L2 soft-margin.
subject to the constraints:
$y_i - \mathbf{w}^\top \mathbf{x}_i - b \le \varepsilon + \xi_i, \qquad \mathbf{w}^\top \mathbf{x}_i + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i, \xi_i^* \ge 0$
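The ε-insensitive loss underlying these constraints is easy to sketch: residuals inside the tube of radius ε cost nothing, and outside it the loss grows linearly (the p = 1 case). The tube radius and residuals below are illustrative:

```python
import numpy as np

def eps_insensitive(residual, eps=0.5):
    """epsilon-insensitive loss: zero inside the tube, linear outside (p = 1)."""
    return np.maximum(0.0, np.abs(residual) - eps)

r = np.array([0.1, -0.4, 0.9, -1.5])   # residuals y_i - f(x_i)
loss = eps_insensitive(r)
```

The first two residuals lie inside the tube and incur zero loss; only the excess beyond ε is penalized for the others.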
SVM: Linear inseparability
1. Data are NOT linearly separable.
2. The feature space is HIGH DIMENSIONAL, hence the QP takes a long time to solve.
3. Nonlinear function approximation problems can NOT be solved.
SVM: Linear inseparability
If the feature space is a Hilbert space, i.e., one in which an inner product applies, we can simplify the optimization problem by a TRICK!!!
The Kernel “trick”
Kernel trick = a method for using a linear classifier algorithm to solve a nonlinear problem by choosing appropriate KERNEL FUNCTIONS.
The kernel trick avoids computing the inner product of two vectors in feature space.
Numerical Example
Consider a two-dimensional input space together with the feature map:
$\boldsymbol{\phi}(\mathbf{x}) = \left( x_1^2, \sqrt{2}\, x_1 x_2, x_2^2 \right)^\top$
Kernel function:
$K(\mathbf{x}, \mathbf{z}) = \boldsymbol{\phi}(\mathbf{x})^\top \boldsymbol{\phi}(\mathbf{z}) = (\mathbf{x}^\top \mathbf{z})^2$
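The identity above can be verified numerically: the inner product of the explicit 3-D feature vectors equals the squared 2-D inner product, so the feature space never has to be formed. A minimal check:

```python
import numpy as np

def phi(x):
    """Feature map phi(x) = (x1^2, sqrt(2) x1 x2, x2^2)."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

def K(x, z):
    """Kernel computed directly in input space: (x^T z)^2."""
    return (x @ z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
lhs = phi(x) @ phi(z)   # inner product in 3-D feature space
rhs = K(x, z)           # same value from a single 2-D inner product
```

Both sides evaluate to the same number, which is the whole point of the trick: the kernel delivers the feature-space inner product at input-space cost.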
SVM with Kernel: Steps
1. Choose a kernel function $K(\mathbf{x}, \mathbf{x}')$.
2. Maximize: $Q(\boldsymbol{\alpha}) = \sum_{i=1}^{M} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{M} \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j)$
3. Compute the bias term $b$.
4. Classify data using the decision function: $D(\mathbf{x}) = \operatorname{sign}\left( \sum_{i=1}^{M} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b \right)$
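The steps above can be sketched end to end for the linear kernel on the 2-point toy set used earlier, whose dual solution (alpha_i = 0.25, b = -1) can be verified by hand; these values are assumptions carried over from that hand calculation, not output of a solver:

```python
import numpy as np

X = np.array([[0.0, 0.0], [2.0, 2.0]])
y = np.array([-1.0, 1.0])
alpha = np.array([0.25, 0.25])   # hand-derived dual solution (assumption)
b = -1.0                         # hand-derived bias (assumption)

def kernel(u, v):
    return u @ v                 # step 1: choose the kernel function (linear)

def decision(x):
    # step 4: D(x) = sum_i alpha_i y_i K(x_i, x) + b
    return sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, X)) + b

label = np.sign(decision(np.array([3.0, 3.0])))
```

Swapping `kernel` for a polynomial or RBF kernel changes nothing else in the classification step, which is why the kernel choice is isolated as step 1.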
Kernels
Linear: $K(\mathbf{x}, \mathbf{x}') = \mathbf{x}^\top \mathbf{x}'$
Polynomial: $K(\mathbf{x}, \mathbf{x}') = \left( \mathbf{x}^\top \mathbf{x}' + 1 \right)^d$
Radial Basis Function: $K(\mathbf{x}, \mathbf{x}') = \exp\left( -\gamma \|\mathbf{x} - \mathbf{x}'\|^2 \right)$
Others: design kernels suitable for target applications
Applications of SVM
Vast number of applications:
Breast cancer diagnosis and prognosis
Handwritten digit recognition
On-line handwriting recognition
Text categorization
3-D object recognition problems
Function approximation and regression
Detection of remote protein homologies
Gene expression
Fault diagnosis in chemical processes
Current developments: SVM
Application aspects of SVM – Belousov et al., 2002, Journal of Chemometrics
About kernel latent variables approaches and SVM – Czekaj et al., 2005, Journal of Chemometrics
Kernel-based orthogonal projections to latent structures – Rantalainen et al., 2007, Journal of Chemometrics
Performance assessment of a novel fault diagnosis system based on SVM – Yelamos et al., 2009, Computers and Chemical Engineering
SVM and its application in chemistry – Li et al., 2009, Chemometrics and Intelligent Laboratory Systems
Current developments: SVM
Identification of MIMO Hammerstein systems with LS-SVM – Goethals et al., 2005, Automatica
An online support vector machine for abnormal event detection – Davy et al., 2006, Signal Processing
Support vector machine for quality monitoring in a plastic injection molding process – Ribeiro, 2005, IEEE Systems, Man and Cybernetics
Fault prediction for nonlinear systems based on Hammerstein model and LS-SVM – Jiang et al., 2009, IFAC Safeprocess
My future work
Finish my diploma project
Study the role of various “tuning” parameters on classification results
Apply SVM to the Tennessee Eastman benchmark, which involves 20 pre-defined faults.
Apply the SVM-based classification algorithm to a small academic example
Study support vector machine based classification for “one-against-one” and “one-against-all” problems
Thank you!
Questions & Answers