Minimal Support Vector Machine
Glenn Fung and Olvi L. Mangasarian
August 2000
Presented by Kuan-Chi-I, 2008/10/21
Outline
- Introduction
- SVM
- MSVM
- Comparisons
- Conclusion
Introduction
A method for selecting a small set of support vectors that determines a separating-plane classifier.
Useful for applications containing millions of data points.
SVM
A support vector machine (SVM) is a method for classification.
SVM (Linearly Separable Case)
SVM
Finding the maximum margin is equivalent to minimizing ½||w||².
The problem above can be recast as a quadratic program with parameter ν > 0:

    min over (w, γ, y):  ν e′y + ½||w||²
    s.t.  D(Aw − eγ) + y ≥ e,  y ≥ 0        (1)

A : a real m×n data matrix.
e : a column vector of ones of arbitrary dimension.
e′ : the transpose of e.
y : nonnegative slack variables.
D : an m×m diagonal matrix with entries +1 or −1 (the class labels).
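As a concrete illustration, here is a minimal sketch that instantiates (1) directly as a quadratic program, assuming the cvxopt package is available; the variable stacking z = (w, γ, y) and the name linear_svm_qp are our own, not from the slides:

import numpy as np
from cvxopt import matrix, solvers

def linear_svm_qp(A, d, nu=1.0):
    """A: m x n data matrix; d: labels in {+1, -1}; nu: tradeoff parameter."""
    A = np.asarray(A, float); d = np.asarray(d, float)
    m, n = A.shape
    DA = A * d[:, None]                    # rows of D*A, with D = diag(d)
    N = n + 1 + m                          # z = (w, gamma, y)
    P = np.zeros((N, N))
    P[:n, :n] = np.eye(n)                  # quadratic term (1/2) w'w
    q = np.concatenate([np.zeros(n + 1), nu * np.ones(m)])   # nu * e'y
    # D(Aw - e*gamma) + y >= e  rewritten as  -DA w + d*gamma - y <= -e
    G1 = np.hstack([-DA, d[:, None], -np.eye(m)])
    G2 = np.hstack([np.zeros((m, n + 1)), -np.eye(m)])       # -y <= 0
    G = np.vstack([G1, G2])
    h = np.concatenate([-np.ones(m), np.zeros(m)])
    sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h))
    z = np.array(sol['x']).ravel()
    return z[:n], z[n]                     # w and gamma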
SVM
Written in individual component notation:

    D_ii (A_i w − γ) + y_i ≥ 1,  y_i ≥ 0,  i = 1, …, m

A_i : the i-th row vector of matrix A.
SVM
x′w = γ + 1 bounds the class A+ points.
x′w = γ − 1 bounds the class A− points.
γ : determines the location of the bounding planes relative to the origin.
w : the normal to the bounding planes.
The linear separating surface is the plane halfway between them:

    x′w = γ
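A new point x is then assigned to a class by which side of this plane it falls on; a tiny helper of our own for illustration:

import numpy as np

def classify(X, w, gamma):
    # sign(x'w - gamma): +1 -> class A+, -1 -> class A-
    return np.sign(X @ w - gamma)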
SVM (Linearly Inseparable Case)
SVM (Inseparable)
If the classes are inseparable, then the two planes bound the two classes with a "soft margin".
MSVM (1-Norm SVM)
A minimal support vector machine (MSVM).
In order to make use of a faster linear-programming-based approach, we reformulate (1) by replacing the 2-norm by a 1-norm as follows:

    min over (w, γ, y):  ν e′y + ||w||₁
    s.t.  D(Aw − eγ) + y ≥ e,  y ≥ 0        (7)
MSVM
The mathematical program (7) is easily converted to a linear program as follows:

    min over (w, γ, y, v):  ν e′y + e′v
    s.t.  D(Aw − eγ) + y ≥ e,  −v ≤ w ≤ v,  y ≥ 0        (8)

v : a surrogate for the absolute value |w| of w; the constraints force v_i ≥ |w_i|, so e′v equals ||w||₁ at a solution.
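A minimal sketch of the linear program (8) using scipy.optimize.linprog, assuming SciPy is available; the variable stacking z = (w, γ, y, v) and the name svm_1norm_lp are ours:

import numpy as np
from scipy.optimize import linprog

def svm_1norm_lp(A, d, nu=1.0):
    """Solve LP (8); A: m x n data, d: labels in {+1, -1}; returns (w, gamma)."""
    m, n = A.shape
    DA = A * d[:, None]                    # rows of D*A, with D = diag(d)
    # objective: nu * e'y + e'v over z = (w, gamma, y, v)
    c = np.concatenate([np.zeros(n + 1), nu * np.ones(m), np.ones(n)])
    # D(Aw - e*gamma) + y >= e  rewritten as  -DA w + d*gamma - y <= -e
    C1 = np.hstack([-DA, d[:, None], -np.eye(m), np.zeros((m, n))])
    # -v <= w <= v  rewritten as  w - v <= 0  and  -w - v <= 0
    C2 = np.hstack([np.eye(n), np.zeros((n, m + 1)), -np.eye(n)])
    C3 = np.hstack([-np.eye(n), np.zeros((n, m + 1)), -np.eye(n)])
    A_ub = np.vstack([C1, C2, C3])
    b_ub = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + n)   # y, v >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    z = res.x
    return z[:n], z[n]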
MSVM
If we define nonnegative multipliers u ∈ R^m associated with the first set of constraints of the linear program (8), and multipliers (r, s) ∈ R^(n+n) for the second set of constraints of (8), then the dual linear program associated with the linear SVM formulation (8) is the following:

    max over (u, r, s):  e′u
    s.t.  A′Du = r − s,  e′Du = 0,  r + s = e,  0 ≤ u ≤ νe,  r ≥ 0,  s ≥ 0

The support vectors are the data points A_i whose multiplier u_i is positive.
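A sketch of this dual with linprog, under the same assumptions as the sketch of (8) above; the stacking (u, r, s) is ours, and the support vectors are read off as the indices with u_i > 0:

import numpy as np
from scipy.optimize import linprog

def svm_dual_lp(A, d, nu=1.0):
    """Solve the dual of (8); returns (u, support-vector indices)."""
    m, n = A.shape
    DA = A * d[:, None]                    # D*A, with D = diag(d)
    c = np.concatenate([-np.ones(m), np.zeros(2 * n)])       # maximize e'u
    A_eq = np.vstack([
        np.hstack([DA.T, -np.eye(n), np.eye(n)]),            # A'Du = r - s
        np.hstack([d[None, :], np.zeros((1, 2 * n))]),       # e'Du = 0
        np.hstack([np.zeros((n, m)), np.eye(n), np.eye(n)]), # r + s = e
    ])
    b_eq = np.concatenate([np.zeros(n + 1), np.ones(n)])
    bounds = [(0, nu)] * m + [(0, None)] * (2 * n)           # 0 <= u <= nu*e
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    u = res.x[:m]
    return u, np.flatnonzero(u > 1e-8)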
MSVM
We modify the linear program to generate an SVM with as few support vectors as possible by adding an error term e′y*.
The term e′y* suppresses misclassified points and results in our minimal support vector machine MSVM:

    min over (w, γ, y, v):  ν e′y + e′v + μ e′y*
    s.t.  D(Aw − eγ) + y ≥ e,  −v ≤ w ≤ v,  y ≥ 0        (9)

y* : the step vector of y, with components (y*)_i = 1 if y_i > 0 and 0 otherwise.
μ : a positive parameter, chosen by a tuning set.
MSVM
We approximate e′y* here by a smooth concave exponential on the nonnegative real line, as was done in the feature-selection approach of Bradley and Mangasarian. For y ≥ 0, the step vector y* of (9) is approximated componentwise by the concave exponential (ε denotes the base of natural logarithms, α > 0):

    (y*)_i ≈ 1 − ε^(−α y_i),  i = 1, …, m
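A quick numeric check of the approximation (our own illustration, with α = 5): the concave exponential rises rapidly from 0 toward 1 as y_i grows, mimicking the step function.

import numpy as np

y = np.array([0.0, 0.01, 0.5, 2.0])
print(1 - np.exp(-5 * y))    # approx [0.0, 0.049, 0.918, 1.0]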
MSVM
Substituting this approximation into (9) gives the smooth MSVM:

    min over (w, γ, y, v):  ν e′y + e′v + μ e′(e − ε^(−αy))
    s.t.  D(Aw − eγ) + y ≥ e,  −v ≤ w ≤ v,  y ≥ 0
MSVM (SLA)
The exponential term makes the smooth MSVM a concave minimization problem, which is solved by a successive linearization algorithm (SLA): linearize the concave term around the current iterate, solve the resulting linear program, and repeat until the iterates stop changing. The algorithm terminates in a finite number of steps.
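A hedged sketch of such a linearization loop, reusing the LP pieces from the sketch of (8) above (assuming SciPy; the parameter values, starting point, and stopping rule are our own choices, not from the slides):

import numpy as np
from scipy.optimize import linprog

def msvm_sla(A, d, nu=1.0, mu=0.5, alpha=5.0, max_iter=20, tol=1e-6):
    """Successive linearization for the smooth MSVM; returns (w, gamma)."""
    m, n = A.shape
    DA = A * d[:, None]
    # constraints are exactly those of LP (8); only the objective changes
    C1 = np.hstack([-DA, d[:, None], -np.eye(m), np.zeros((m, n))])
    C2 = np.hstack([np.eye(n), np.zeros((n, m + 1)), -np.eye(n)])
    C3 = np.hstack([-np.eye(n), np.zeros((n, m + 1)), -np.eye(n)])
    A_ub = np.vstack([C1, C2, C3])
    b_ub = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + n)
    y = np.ones(m)                         # starting slack iterate
    for _ in range(max_iter):
        # gradient of mu * (1 - exp(-alpha*y)) is mu * alpha * exp(-alpha*y);
        # add it to the linear cost on y and re-solve the LP
        grad = mu * alpha * np.exp(-alpha * y)
        c = np.concatenate([np.zeros(n + 1), nu * np.ones(m) + grad, np.ones(n)])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        z = res.x
        y_new = z[n + 1:n + 1 + m]
        if np.linalg.norm(y_new - y) < tol:
            y = y_new
            break
        y = y_new
    return z[:n], z[n]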
Comparison
Observations of Comparisons
1. For all test problems, MSVM had the least number of support vectors.
2. For the Ionosphere problem, the reduction in the number of support vectors of MSVM over SVM|·|₁ is 81%, and the average reduction in the number of support vectors of MSVM over SVM|·| is 65.8%.
3. Tenfold testing-set correctness of MSVM was good.
4. Computing times were higher for MSVM than for the other classifiers.
Conclusion
We proposed a minimal support vector machine (MSVM).
MSVM is useful in classifying very large datasets by using only a fraction of the data.
It improves generalization over other classifiers that use a higher number of data points.
MSVM requires the solution of a few linear programs to determine a separating surface.