38

SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining
Page 2: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM: Algorithms of Choice for Challenging Data

Boriana Milenova, Joseph Yarmus, Marcos CamposData Mining TechnologiesORACLE Corp.

Page 3: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Overview

SVM theoretical framework

ORACLE data mining technology– SVM parameter estimation– SVM optimization strategy

SVM on challenging data

Page 4: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM Model Defines a Hyperplane

Linear models in feature space

Hyperplane defined by a set of coefficients and a bias term

0 bxw

wb

Page 5: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Maximum Margin Models

))(min( ii xfymarginFunctional

support vectors

)max(min marginw

ww1))(min( ii xfymarginGeometric

Page 6: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM Optimization ProblemMinimize ||w|| subject toLagrangian in primal space:

subject to

1)( ii xfy

121)( byL iiip xwwww

0i

0wpL

0bLp

iii y xw

0 ii y

Page 7: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Duality

Lagrangian in dual space:

subject to

Dot products!– dimension-insensitive optimization– generalized dot products via non-linear map

jijijiiD yyL xx21

0i 0 ii y

)()(),( jijiK xxxx

Page 8: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Towards Higher Dimensionality via Kernels1. Transform data via non-linear mapping to an inner

product feature space2. Train a linear machine in the new feature space

),(,.)(,.)( jiji KKK xxxx

Mercer’s kernels:– symmetry

– positive semi-definite kernel matrix

– reproducing property

),(),( ijji KK xxxx

Page 9: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Soft Margin: Non-Separable Data

k

p CL www21)(

iii by 1xwsubject to

Capacity parameter Ctrades off complexity and empirical risk

Page 10: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

1-Norm Dual Problem

Lagrangian in dual space:

subject to

Quadratic problem– linear and inequality constraints

),(21

jijijiiD KyyL xx

Ci 0 0 ii y

Page 11: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM Regression

)ˆ(21)( kk

p CL www

iii yb xwsubject to

iii by ˆ xw

Page 12: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM Fundamental Properties

Convexity– single global minimum

Regularization– trades off structural and empirical risk to

avoid overfittingSparse solution

– usually only a fraction of training data become support vectors

Not probabilistic

Solvable in polynomial time…

Page 13: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM in the Database

ORACLE Data Mining (ODM)– commercial SVM implementation in the

database– product targets application developers and

data mining practitioners– focuses on ease of use and efficiency

Challenges:– effective and inexpensive parameter

tuning– computationally efficient SVM model

optimization

Page 14: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM Out-Of-The-Box

Inexperienced users can get dramatically poor results

LIBSVM examples:

VehicleBioinformaticsAstroparticle Physics

0.880.020.790.570.970.67

After tuningcorrect rate

Out-of-the-boxcorrect rate

Page 15: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM Parameter Tuning

Grid search (+ cross-validation or generalization error estimates)

– naive– guided (Keerthi & Lin, 2002)

Parameter optimization– gradient descent (Chapelle et al., 2000)

Heuristics

Page 16: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

ODM On-the-Fly Estimates

Standard deviation for Gaussian kernel– single kernel parameter– kernel has good numeric properties

bounded, no overflowCapacity

– key to good classification generalizationEpsilon estimate for regression

– key to good regression generalization

Page 17: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

ODM Standard Deviation Estimate

Goal: Estimate distance between classes

3. Pick random pairs from opposite classes

4. Measure distances5. Order descending6. Exclude tail (90th percentile)7. Select minimum distance

Page 18: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

ODM Capacity EstimateGoal: Allocate sufficient capacity

to separate typical examples2. Pick m random examples per class3. Compute yi assuming = C

5. Exclude noise (incorrect sign)6. Scale C, (non bounded sv)

8. Order descending9. Exclude tail (90th percentile)10.Select minimum value

m

j ijji KCyy 2

1),( xx

m

j ijji KyyC 2

1),(/ xx

1iy

Page 19: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Some Comparison Numbers

LIBSVM examples:

0.710.840.97

On-the-fly estimates

VehicleBioinformaticsAstroparticle Physics

0.880.020.850.570.970.67

Grid search + xval

Out-of-the-box

Page 20: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

ODM Epsilon EstimateGoal: estimate target noise

by fitting a preliminary model

3. Pick m random examples 4. Train SVM model with 5. Compute residuals on

remaining data6. Scale 7. Retrain

0

2/1 ntt

Page 21: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Comparison Numbers Regression

0.020.356.57

On-the-fly estimatesRMSE

PumadynComputer activityBoston housing

0.020.336.26

Grid searchRMSE

Page 22: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Optimization Approaches

QP solvers– MINOS, LOQO, quadprog (Matlab)

Gradient descent methods– Sequentially update one coefficient at a

timeChunking and decomposition

– optimize small “working sets” towards global solution

– analytic solution possible (SMO - Platt, 1998)

Page 23: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Chunking strategy

/* WS working set */select initial WS randomly;while (violations){Solve QP on WS;Select new WS;

}

Page 24: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

ODM Working Set Selection

Avoid oscillations– overlap across chunks– retain non-bounded support vectors

Choose among violators– add large violators

Computational efficiency– avoid sorting

Page 25: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Who to Retain?

/* Examine previous working set */if (non-bounded sv < 50%){

retain all non-bounded sv;add other randomly selected up to 50%;

}else{

randomly select non-bounded sv;}

Page 26: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Who to Add?create violator list;/* Scan I - pick largest violators */while (new examples < 50% AND WS Not Full){

if (violation > avg_violation)add to WS;

}

/* Scan II - pick other violators */while (new examples < 50% AND WS Not Full){

add randomly selected violators to WS;}

Page 27: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM in Feed-Forward Framework

j iijji Kyy ),( xx

j

),( iiK xx

Page 28: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

DOF in Neural Nets / RBF

Page 29: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

DOF in SVM

Page 30: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM vs. Neural Net / RBF

Compact model

Global minimum

Regularization

––

NN / RBFSVM

Page 31: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Text Mining

Domain characteristics:– thousands of features– hundreds of topics– sparse data

Science Sport Art

Page 32: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM in Text Mining

Reuters corpus~10K documents, ~10K terms, 115 classesAccuracy: recall / precision breakeven point

0.860.840.820.790.800.72

SVMnon-linear

SVM linear

K-NNC4.5RocchioNaive Bayes

Joachims, 1998

Page 33: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Biomining

Domain characteristics:– thousands of features– very few data points– dense data

microarray data

Page 34: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

SVM on Microarray Data

Multiple tumor types144 samples, 16063 genes, 14 classes Accuracy: correct rate

0.43

Naive Bayes

0.780.680.62

SVM linearK-NNWeighted voting

Ramaswamy et al., 2001

Page 35: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Other domains

High dimensionality problems:– image (color and texture histograms)– satellite remote sensing– speech

Linear kernels sufficient in most cases– data separability– single parameter tuning (capacity)– small model size

Page 36: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

Final Note

SVM classification and regression algorithms available in ORACLE 10G database

Two APIs– JAVA (J2EE)– PL/SQL

Page 37: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining

References

Chapelle, O., Vapnik, V., Bousquet, O., & Mukherjee, S. (2001). Choosing Multiple Parameters for Support Vector Machines.Hsu C., Chang C., & Lin, C. (2003). A Practical Guide to Support Vector Classification.Joachims, T. (1998). Text Categorization with Support Vector Machines: Learning with Many Relevant Features.Keerthi, S. & Lin, C. (2002). Asymptotic Behaviors of Support Vector Machines with Gaussian Kernel.Platt, J. (1998). Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines.Ramaswamy, S., Tamayo, P., Rifkin, R., Mukherjee, S., Yeang, C., Angelo, M., Ladd, C., Reich, M., Latulippe, E., Mesirov, J., Poggio, T., Gerald, W., Loda, M., Lander, E., Golub, T. (2001). Multi-Class Cancer Diagnosis Using Tumor Gene Expression Signatures.

Page 38: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining