
Page 1: ICIAP 2007 - Tutorial Advances of statistical learning and

ICIAP 2007 - Tutorial

Advances of statistical learningand

Applications to Computer Vision

Ernesto De Vito and Francesca Odone

- PART 2 -

http://slipguru.disi.unige.it

Page 2: ICIAP 2007 - Tutorial Advances of statistical learning and

2

Plan of the second part

Brief intro through a set of applications

One problem in detail (face detection): choosing the representation, feature selection, classification

On the choice of the classifier (filter methods)

Spotlights on other interesting issues: image annotation, kernel engineering, global vs local

Page 3: ICIAP 2007 - Tutorial Advances of statistical learning and

3

Plan of the second part

Brief intro through a set of applications

One problem in detail (face detection): choosing the representation, feature selection, classification

On the choice of the classifier (filter methods)

Spotlights on other interesting issues: image annotation, kernel engineering, global vs local

Page 4: ICIAP 2007 - Tutorial Advances of statistical learning and

4

Learning in everyday life

Security and video-surveillance
OCR systems
Robot control
Biometrics
Speech recognition
Early diagnosis from medical data
Knowledge discovery in big datasets of heterogeneous data (including the Internet)
Microarray analysis and classification
Stock market prediction
Regression applications in computer graphics

Page 5: ICIAP 2007 - Tutorial Advances of statistical learning and

5

Statistical Learning in Computer Vision

Page 6: ICIAP 2007 - Tutorial Advances of statistical learning and

6

Statistical Learning in Computer Vision

Detection problems

Page 7: ICIAP 2007 - Tutorial Advances of statistical learning and

7

Statistical Learning in Computer Vision

More in general: Image annotation

car, tree, building, sky, pavement, pedestrian, ...

Page 8: ICIAP 2007 - Tutorial Advances of statistical learning and

8

How difficult is image understanding?

Page 9: ICIAP 2007 - Tutorial Advances of statistical learning and

9

Plan of the second part

Brief intro through a set of applications

One problem in detail (face detection): choosing the representation, feature selection, classification

On the choice of the classifier (filter methods)

Spotlights on other interesting issues: image annotation, kernel engineering, global vs local

Page 10: ICIAP 2007 - Tutorial Advances of statistical learning and

10

Regularized face detection

Main steps towards a complete classifier:

Choosing the representation, feature selection, classification

Joint work with A. Destrero – C. De Mol – A. Verri

Problem setting: find one or more occurrences of a (~frontal) human face, possibly at different resolutions, in a digital image

Page 11: ICIAP 2007 - Tutorial Advances of statistical learning and

11

Application scenario (the data)

2000+2000 training
1000+1000 validation
3400 test

19 x 19 images

Page 12: ICIAP 2007 - Tutorial Advances of statistical learning and

12

Initial representation (the dictionary)

Overcomplete, general purpose sets of features are effective for modeling visual information
Many object classes have a peculiar intrinsic structure that can be better appreciated if one looks for symmetries or local geometry

Examples of features: wavelets, curvelets, ranklets, chirplets, rectangle features, ...
Examples of problems: face detection (Heisele et al., Viola & Jones, ...), pedestrian detection (Oren et al., ...), car detection (Papageorgiou & Poggio)

Page 13: ICIAP 2007 - Tutorial Advances of statistical learning and

13

Initial representation (the dictionary)

The approach is inspired by biological systems. See, for instance, B. A. Olshausen and D. J. Field, "Sparse coding with an overcomplete basis set: a strategy employed by V1?", 1997.

Usually this approach is coupled with learning from examples

The prior knowledge is embedded in the choice of an appropriate training set

Problem: usually these sets are very big

Page 14: ICIAP 2007 - Tutorial Advances of statistical learning and

14

Initial representation (the dictionary)

Rectangle features (Viola & Jones)

... About 64000 features per image patch!

Most of them are correlated:
short range correlation of natural images
long range correlation relative to the object of interest

Page 15: ICIAP 2007 - Tutorial Advances of statistical learning and

15

What’s wrong with this?

Measurements are noisy
Features are correlated
The number of features is higher than the number of examples

=> Ill conditioned

Page 16: ICIAP 2007 - Tutorial Advances of statistical learning and

16

Feature selection

Extracting features relevant for a given problem

What is relevant?

Often related to dimensionality reduction, but the two problems are different

A possible way to address the problem is to resort to regularization methods

Elastic net penalty (PART 1)

Page 17: ICIAP 2007 - Tutorial Advances of statistical learning and

17

Let us revise the basic algorithm

We assume a linear dependence between input and output

Φ = {φ_ij} is the measurement matrix
  i = 1, ..., n examples/data
  j = 1, ..., p dictionary elements

β = (β_1, ..., β_p)^T is the vector of unknown weights to be estimated

f = (f_1, ..., f_n)^T is the vector of output values ({−1, 1} labels in binary classification problems)

Φβ = f

Page 18: ICIAP 2007 - Tutorial Advances of statistical learning and

18

Choosing the appropriate algorithm

What sort of penalty suits our problem best? In other words: how do we choose ε in the elastic-net functional

argmin_{β ∈ R^p} { ||f − Φβ||²_2 + λ ( |β|_1 + ε ||β||²_2 ) } ?

The choice is driven by the application domain
What can we say about image correlation?
Is there any reason to prefer feature A to feature B? Do we want them both?

(figure: two example features, A and B)

Page 19: ICIAP 2007 - Tutorial Advances of statistical learning and

19

Peculiarity of images

Given a group of short range correlated features, each element is a good representative of the group

As for long range correlated features, it would be interesting to keep them all, but it is difficult to distinguish them at this stage

Notice that in other applications (e.g., microarray analysis) each feature is important per se.

Page 20: ICIAP 2007 - Tutorial Advances of statistical learning and

20

L1 penalty

A purely L1 penalty automatically enforces the presence of many zeros in β
The L1 norm is convex, therefore feasible algorithms exist

(PROB L1) is the Lagrangian formulation of the so-called LASSO problem:

(PROB L1)   argmin_{β ∈ R^p} { ||f − Φβ||²_2 + λ |β|_1 }

Page 21: ICIAP 2007 - Tutorial Advances of statistical learning and

21

L1 penalty

The regularization parameter λ regulates the balance between the misfit of the data and the penalty

It also allows us to vary the degree of sparsity

Page 22: ICIAP 2007 - Tutorial Advances of statistical learning and

22

How do we solve it?

The solution is not unique
A number of numerical strategies have been proposed

We adopt the iterated soft-thresholded Landweber scheme (ALG L):

β^(t+1) = S_λ [ β^(t) + Φ^T ( f − Φ β^(t) ) ]

where the soft-thresholder S_λ is defined component-wise as

(S_λ h)_j = sign(h_j) ( |h_j| − λ/2 )   if |h_j| ≥ λ/2,   and 0 otherwise

This algorithm converges to a minimizer of (PROB L1) if ||Φ|| < 1.
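A minimal MATLAB sketch of this iteration (assuming Φ has been rescaled so that ||Φ|| < 1; function and variable names are illustrative, not the authors' code):

function beta = ista(Phi, f, lambda, T)
  % Soft-thresholded Landweber (ALG L) for (PROB L1), assuming norm(Phi) < 1
  [n, p] = size(Phi);
  beta = zeros(p, 1);
  for t = 1:T
      h = beta + Phi' * (f - Phi * beta);            % Landweber (gradient) step
      beta = sign(h) .* max(abs(h) - lambda/2, 0);   % soft thresholding S_lambda
  end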

Page 23: ICIAP 2007 - Tutorial Advances of statistical learning and

23

Thresholded Landweber and our problem

Φβ = f

Φ is the measurement matrix: one row per image, one column per feature

f is the vector of labels: +1 for faces, −1 for negative examples

In our experiments Φ has size 4000 × 64000 (about 1 GB!)

β^(t+1) = S_λ [ β^(t) + Φ^T ( f − Φ β^(t) ) ]

Page 24: ICIAP 2007 - Tutorial Advances of statistical learning and

24

A sampled version of Thresholded Landweber

We build S feature subsets, each time extracting with replacement m features, m << p
We solve the S sub-problems (see the sketch below)

f = Φ^(s) β^(s),   s = 1, ..., S

Then we keep the features that were selected each time they appeared in a subset
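A hedged sketch of this resampling scheme, building on the ista function above (names are illustrative; here each subset is drawn without repetitions, while features can reappear across subsets):

function keep = sampled_selection(Phi, f, lambda, S, m, T)
  % Feature selection on S random subsets of m features (m << p); keep the
  % features that obtained a nonzero weight every time they were extracted.
  p = size(Phi, 2);
  hits  = zeros(p, 1);    % times a feature got a nonzero weight
  draws = zeros(p, 1);    % times a feature appeared in a subset
  for s = 1:S
      idx = randperm(p, m);                        % random subset of features
      beta_s = ista(Phi(:, idx), f, lambda, T);    % solve the sub-problem
      draws(idx) = draws(idx) + 1;
      hits(idx)  = hits(idx) + (abs(beta_s) > 0);
  end
  keep = find(draws > 0 & hits == draws);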

Page 25: ICIAP 2007 - Tutorial Advances of statistical learning and

25

A sampled version of Thresholded Landweber

In our experiments:
each sub-set is 10% of the original size
S = 200 (the probability of extracting each feature at least 10 times is high)

(histogram figure)

Page 26: ICIAP 2007 - Tutorial Advances of statistical learning and

26

Structure of the method (I)

S0 → { sub1, sub2, ..., subS } → Alg L on each subset → combine (+) → S1

Page 27: ICIAP 2007 - Tutorial Advances of statistical learning and

27

Choosing λ

A few words on parameter tuning

A classical choice is cross validation, but in this case it is too heavy (because of the number of sub-problems)

Thus, at this stage, we fix the number of zeros to be reached in a given number of iterations

Page 28: ICIAP 2007 - Tutorial Advances of statistical learning and

28

Cross validation

A standard technique for parameter estimation

Try different parameters and choose the one that performs (generalizes) best.

K-fold cross validation (see the sketch below):
divide the training set into K chunks
keep K−1 for training and 1 for validating
repeat for the K different validation sets
compute an average classification rate
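A minimal MATLAB sketch of K-fold cross validation for one parameter value (train and predict are placeholders for the learning algorithm at hand):

function acc = kfold_cv(X, y, K, param)
  % Average validation accuracy over K folds for a given parameter value
  n = length(y);
  folds = mod(randperm(n), K) + 1;      % assign each example to one of K chunks
  acc = 0;
  for k = 1:K
      val = (folds == k);               % one chunk for validation ...
      trn = ~val;                       % ... the other K-1 for training
      model  = train(X(trn, :), y(trn), param);   % placeholder training routine
      y_pred = predict(model, X(val, :));         % placeholder prediction routine
      acc = acc + mean(y_pred == y(val)) / K;     % average classification rate
  end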

Page 29: ICIAP 2007 - Tutorial Advances of statistical learning and

29

Classification

Two reasons:
obtain an effective face detector
speculate on the quality of the selected features

Face detection is a fairly standard binary classification problem

Regularized Least Squares
Support Vector Machines (Vapnik, 1995)
... with some nice kernel

In the following experiments we start using linear SVMs

Page 30: ICIAP 2007 - Tutorial Advances of statistical learning and

30

Setting 90% of zeros

We get 4636 features... too many

What about increasing the number of zeros in the solution?

(ROC curves: one-stage feature selection; one-stage feature selection with cross validation; one-stage feature selection on the entire set of features)

Page 31: ICIAP 2007 - Tutorial Advances of statistical learning and

31

A refinement of the solution

Setting 99% of zeros:
345 features (good)
generalization performance drops by about 3% (bad)

IDEA: we apply the thresholded Landweber once again (on the S1 = 4636 features)
This time we tune λ with cross validation
We obtain 247 features

Page 32: ICIAP 2007 - Tutorial Advances of statistical learning and

32

Structure of the method (II)

S0 → { sub1, sub2, ..., subS } → Alg L on each subset → combine (+) → S1 → Alg L → S2

Page 33: ICIAP 2007 - Tutorial Advances of statistical learning and

33

Comparative analysis

(ROC curves: 2-stage feature selection and 2-stage feature selection + correlation vs. Viola & Jones feature selection on the same data and the Viola & Jones cascade; 2-stage feature selection vs. PCA)

Comparison with PCA
Comparison with AdaBoost feature selection (Viola & Jones)

Page 34: ICIAP 2007 - Tutorial Advances of statistical learning and

34

How compact is the solution?

The 247 features are still redundant
For real-time processing we may want to try and reduce them further

(ROC curves: 2-stage feature selection, linear vs. polynomial kernel)

Page 35: ICIAP 2007 - Tutorial Advances of statistical learning and

35

A third optimization stage

Starting from S2, we choose one delegate for each group of short range correlated features
Our correlation analysis discards features that are:
of the same type
correlated according to Spearman's test
spatially close

Page 36: ICIAP 2007 - Tutorial Advances of statistical learning and

36

Structure of the method (III)

S0 → { sub1, sub2, ..., subS } → Alg L on each subset → combine (+) → S1 → Alg L → S2 → Corr → S3

Page 37: ICIAP 2007 - Tutorial Advances of statistical learning and

37

What do we get?

(ROC curves: 2-stage feature selection + correlation, linear vs. polynomial kernel; two-stage feature selection with and without the 3rd stage of correlation analysis)

Page 38: ICIAP 2007 - Tutorial Advances of statistical learning and

38

A fully trainable system for detecting faces

Peculiarity of object detectors:
for each image, many tests
very few positive examples
very many negative examples

Page 39: ICIAP 2007 - Tutorial Advances of statistical learning and

39

A fully trainable system for detecting faces

Coarse-to-fine methods deal with this, devising multiple classifiers of increasing difficulty

Many approaches (focus-of-attention, cascades, ...)

Page 40: ICIAP 2007 - Tutorial Advances of statistical learning and

40

Our cascade of classifiers

Starting from a set of features, say S3, we build many small linear SVM classifiers, each based on at least 3 distant features, able to reach a fixed target performance on a validation set
The target performance is chosen so that each classifier is not likely to miss faces:

minimum hit rate 99.5%
maximum false positive rate 50%

Overall, for a cascade of layers, F = ∏_i f_i and H = ∏_i h_i.
For 10 layers, H ≈ 0.995^10 ≈ 95% and F ≈ 0.5^10 ≈ 0.1%.
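A hedged MATLAB sketch of how such a cascade evaluates a candidate patch (the layer structure, weights w and bias b are illustrative, not the authors' implementation):

function is_face = cascade_eval(patch, layers)
  % layers is a cell array of small linear SVMs; a patch is accepted only if
  % it passes every layer, and is rejected as soon as one layer says no.
  is_face = true;
  for i = 1:length(layers)
      score = layers{i}.w' * patch + layers{i}.b;   % linear SVM decision value
      if score < 0
          is_face = false;                          % early rejection
          return
      end
  end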

Page 41: ICIAP 2007 - Tutorial Advances of statistical learning and

41

Our cascade of classifiers

Page 42: ICIAP 2007 - Tutorial Advances of statistical learning and

42

Finding faces in images

Page 43: ICIAP 2007 - Tutorial Advances of statistical learning and

43

Finding faces in images

Page 44: ICIAP 2007 - Tutorial Advances of statistical learning and

44

Finding faces in images

Page 45: ICIAP 2007 - Tutorial Advances of statistical learning and

45

Finding faces in video frames

Page 46: ICIAP 2007 - Tutorial Advances of statistical learning and

46

Finding eye regions...

The beauty of data-driven approaches:
same approach
different dataset: we extracted eye regions from a subset of the FERET dataset

Page 47: ICIAP 2007 - Tutorial Advances of statistical learning and

47

A few results (faces and eyes)

Page 48: ICIAP 2007 - Tutorial Advances of statistical learning and

48

Online examples

video

Page 49: ICIAP 2007 - Tutorial Advances of statistical learning and

49

A few words on the choice of the classifier

SVMs are very popular for their effectiveness and their generalization ability
Other algorithms can perform in a similar way and offer other attractive properties

Filter methods are very simple to implement and achieve very interesting performance
In particular, iterative methods are very useful when parameter tuning is needed

Joint work with L. Lo Gerfo, L. Rosasco, E. De Vito, A. Verri

Page 50: ICIAP 2007 - Tutorial Advances of statistical learning and

50

Experiments on face detection

Size of the training set:   800                           700                           600

ν method                    1.48 ± 0.34 (σ=300, t=59)     1.53 ± 0.33 (σ=341, t=89)     1.63 ± 0.32 (σ=341, t=95)
RBF-SVM                     1.60 ± 0.71 (σ=1000, C=0.9)   1.99 ± 0.82 (σ=1000, C=1)     2.41 ± 1.39 (σ=800, C=1)

Results reported as mean ± std; experiments carried out on a portion of the previously mentioned faces dataset

Page 51: ICIAP 2007 - Tutorial Advances of statistical learning and

51

Plan of the second part

Brief intro through a set of applications

One problem in detail (face detection): choosing the representation, feature selection, classification

On the choice of the classifier (filter methods)

Spotlights on other interesting issues: image annotation, kernel engineering, global vs local

Page 52: ICIAP 2007 - Tutorial Advances of statistical learning and

52

On the classifier choice:Filter methods

Starting from RLS, we have seen (PART 1) how a large class of methods known as spectral regularization gives rise to regularized learning algorithms
These methods were originally proposed to solve inverse problems
The crucial intuition is that the same principle that allows us to numerically stabilize a matrix inversion is also what avoids overfitting

They are worth investigating for their simplicity and effectiveness

Page 53: ICIAP 2007 - Tutorial Advances of statistical learning and

53

Filter methods

All these algorithms are consistent and can be easily implemented

They have a common derivation (and similar implementations) but
different theoretical properties (PART 1)
different computational burden

Page 54: ICIAP 2007 - Tutorial Advances of statistical learning and

54

Filter methods: computational issues

Non-iterative:
Tikhonov (RLS)
Truncated SVD

Iterative:
Landweber
ν method
Iterated Tikhonov

Page 55: ICIAP 2007 - Tutorial Advances of statistical learning and

55

Filter methods: computational issues

RLS training (for a fixed lambda):

function [c] = rls(K, lambda, y)
  n = length(K);
  c = (K + n*lambda*eye(n)) \ y;   % solve the regularized linear system

Test:

function [y_new] = rls_test(x, x_new, c)
  K_new = kernel(x_new, x);        % kernel values between new points (rows) and training points (columns)
  y_new = K_new * c;
  y_new = sign(y_new);             % for classification

Careful to choose the matrix inversion function

Page 56: ICIAP 2007 - Tutorial Advances of statistical learning and

56

Filter methods: computational issues

RLS

The computational cost of RLS is the cost of inverting the matrix K: O(n³)

c(λ) = (K + nλI)^{-1} y

In case parameter tuning is needed, resorting to an eigendecomposition of the matrix K saves time:

K = QΛQ^T   ⇒   c(λ) = Q (Λ + nλI)^{-1} Q^T y
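A minimal MATLAB sketch of this trick, computing c(λ) for a whole grid of values after a single eigendecomposition (the grid lambdas and the validation step are placeholders):

% One eigendecomposition of K, then each lambda costs only O(n^2)
[Q, D] = eig(K);                        % K = Q*D*Q', D diagonal
d = diag(D);
Qty = Q' * y;
n = length(y);
for lambda = lambdas                    % e.g. lambdas = logspace(-6, 0, 25)
    c = Q * (Qty ./ (d + n*lambda));    % c(lambda) = Q (D + n*lambda*I)^{-1} Q' y
    % ... evaluate c on the validation set, keep the best lambda ...
end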

Page 57: ICIAP 2007 - Tutorial Advances of statistical learning and

57

Filter methods: computational issues

ν method

The number of iterations t plays the role of the regularization parameter (roughly λ ∝ 1/t)
Computational cost: O(tn²)

The iterative procedure allows us to compute all solutions from 0 to t (the regularization path)
This is convenient if parameter tuning is needed:
with an appropriate choice of the maximum number of iterations, the computational cost does not change
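As an illustration of how iterative filter methods expose the whole regularization path, here is a hedged sketch of the plain Landweber iteration (the ν method adds an acceleration term, omitted here):

function C = landweber_path(K, y, T)
  % Plain Landweber iteration on K*c = y; column t of C is the solution after t steps
  n = length(y);
  tau = 1 / norm(K);               % step size, ensures convergence
  c = zeros(n, 1);
  C = zeros(n, T);
  for t = 1:T
      c = c + tau * (y - K * c);   % gradient step
      C(:, t) = c;                 % store the whole regularization path
  end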

Page 58: ICIAP 2007 - Tutorial Advances of statistical learning and

58

Plan of the second part

Brief intro through a set of applications

One problem in detail (face detection): choosing the representation, feature selection, classification

On the choice of the classifier (filter methods)

Spotlights on other interesting issues: image annotation, kernel engineering, global vs local

Page 59: ICIAP 2007 - Tutorial Advances of statistical learning and

59

How difficult is image understanding?

Problem setting (general): assign one or more labels (from a finite but possibly big set of known classes) to a digital image according to its content

This general problem is very complex
Many better-defined domains have been studied:
image categorization
object detection
object recognition

Usually the trick is in defining the boundaries of the problem of interest

Joint work(s) with A. Barla – E. Delponte – A. Verri

Page 60: ICIAP 2007 - Tutorial Advances of statistical learning and

60

Object identification/recognition

Nevertheless, the problem is not that simple

Page 61: ICIAP 2007 - Tutorial Advances of statistical learning and

61

Image annotation

Problem setting: assign one or more labels (from a finite set of known classes) to a digital image according to its content

Assumption: we look for global descriptions
indoor/outdoor
drawing/picture
day/night
cityscape/not

It usually leads to supervised problems (binary classifiers)
Low-level descriptions are often applied

Page 62: ICIAP 2007 - Tutorial Advances of statistical learning and

62

Image annotation from low-level global descriptions

The problem: capture a global description of the image using simple features

The procedure:
build a suitable training set of data
find an appropriate representation
choose a classification algorithm and a kernel
tune the parameters

Page 63: ICIAP 2007 - Tutorial Advances of statistical learning and

63

Computer vision ingredients

Color: Color histograms

Shape: orientation and strength edge histograms, histograms of the lengths of edge chains

Texture: wavelets, co-occurrence matrices

We represent whole images with low-level descriptions of color, shape or texture

Page 64: ICIAP 2007 - Tutorial Advances of statistical learning and

64

A few comments

Histograms appear quite often

We need a simple example to discuss kernel engineering: designing ad hoc kernels for the problem/data at hand, with the right properties:
symmetry
positive definiteness

=> Let us go through the histogram intersection example

Page 65: ICIAP 2007 - Tutorial Advances of statistical learning and

65

Histogram Intersection (HI)

• Since (Swain and Ballard, 1991) it is known that histogram intersection is a powerful similarity measure for color indexing

• Given two images, A and B, of N pixels, if we represent them as histograms with M bins A_i and B_i, histogram intersection is defined as

K(A, B) = Σ_{i=1}^{M} min{ A_i, B_i }
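A one-line MATLAB implementation of this measure (assuming A and B are vectors of bin counts of equal length):

function k = hist_intersection(A, B)
  % Histogram intersection: sum over bins of the bin-wise minimum
  k = sum(min(A(:), B(:)));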

Page 66: ICIAP 2007 - Tutorial Advances of statistical learning and

66

Histogram Intersection (HI)

(Figure: two 8-bin histograms and their bin-wise intersection; the HI score is the sum Σ_i over the intersection bins.)

Page 67: ICIAP 2007 - Tutorial Advances of statistical learning and

67

HI is a Kernel

If we build the (M·N)-dimensional binary vector

Ā = ( 1,...,1 (A_1 times), 0,...,0 (N−A_1 times), 1,...,1 (A_2 times), 0,...,0 (N−A_2 times), ..., 1,...,1 (A_M times), 0,...,0 (N−A_M times) )

it can immediately be seen that

K(A, B) = ⟨ Ā, B̄ ⟩   (dot product, i.e. the linear kernel)

NOTICE: the proof is based on finding an explicit mapping
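A quick numerical check of this equivalence on a toy example (integer-valued histograms of an N-pixel image; the embedding below is the explicit mapping described above):

% Two 4-bin histograms of a 10-pixel image
A = [3 0 2 5];  B = [1 4 2 3];  N = 10;  M = length(A);
phiA = zeros(M*N, 1);  phiB = zeros(M*N, 1);
for i = 1:M
    phiA((i-1)*N + (1:A(i))) = 1;   % A_i ones followed by N - A_i zeros
    phiB((i-1)*N + (1:B(i))) = 1;
end
phiA' * phiB       % = 6
sum(min(A, B))     % = 6 as well: <phiA, phiB> equals histogram intersection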

Page 68: ICIAP 2007 - Tutorial Advances of statistical learning and

68

Histogram intersection: applications

HI has been applied with success to a variety of classification problems, both global and local:
indoor/outdoor, day/night, cityscape/landscape classification
object detection from local features (SIFT)

In all those cases it outperformed RBF-kernel classifiers
Also, HI does not depend on any parameter

Page 69: ICIAP 2007 - Tutorial Advances of statistical learning and

69

Local approaches

Global approaches have limits:
often the objects of interest occupy only a (small) portion of the image
in a simplified setting, all the rest of the image can be defined as background (or context)
depending on the application domain, context can help recognition or make it more difficult

Page 70: ICIAP 2007 - Tutorial Advances of statistical learning and

70

Local approaches

We may represent the image content as a set of local features (f1, ..., fn) --- corners, DoG features, ...

We immediately see that this is a variable length description

How to deal with variable length:
vocabulary approach
local kernels (or kernels on sets)

Local features in scale-space

Page 71: ICIAP 2007 - Tutorial Advances of statistical learning and

71

Local approaches: features vocabulary

It is reminiscent of text categorization

We define a vocabulary of local features and represent our images based on how often a given feature appears in the image

One implementation of this paradigm is the bag of keypoints approach

Page 72: ICIAP 2007 - Tutorial Advances of statistical learning and

72

Local approaches: features vocabulary

[Csurka et al, 2004]

Page 73: ICIAP 2007 - Tutorial Advances of statistical learning and

73

Local approaches: kernels on sets

Image descriptions based on local features can be seen as sets:
variable length
no internal ordering

X = { x_1, ..., x_n },   Y = { y_1, ..., y_m }

A common approach to define a global similarity between feature sets is to combine the local similarities between (possibly all) pairs of elements:

K(X, Y) = ℑ( { K_L(x_i, y_j) } ),   i = 1, ..., n,  j = 1, ..., m

Page 74: ICIAP 2007 - Tutorial Advances of statistical learning and

74

Summation kernel [Haussler,1999]

The simplest kernel for sets is the summation kernel

K_S(X, Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} K_L(x_i, y_j)

K_S is a kernel if K_L is a kernel
K_S is not so useful in practice:
computationally heavy
it mixes good and bad correspondences

Page 75: ICIAP 2007 - Tutorial Advances of statistical learning and

75

Matching kernel [Wallraven et al, 2003]

Among the many other kernels for sets proposed, the matching kernel received a lot of attention for image data

K_M(X, Y) = 1/2 [ K̂(X, Y) + K̂(Y, X) ]

where

K̂(X, Y) = (1/n) Σ_{i=1}^{n} max_{j=1,...,m} K_L(x_i, y_j)

Page 76: ICIAP 2007 - Tutorial Advances of statistical learning and

76

Matching kernel [Wallraven et al, 2003]

The matching kernel led to promising results on object recognition problems

Nevertheless, it has been shown that it is not a Mercer kernel (because of the max operation)

Page 77: ICIAP 2007 - Tutorial Advances of statistical learning and

77

Intermediate matching kernel[Boughorbel et al,, 2004]

Let us consider two feature sets

X = { x_1, ..., x_n },   Y = { y_1, ..., y_m }

The two feature sets are compared through an auxiliary set of virtual features

V = { v_1, ..., v_p }

The intermediate matching kernel is defined as (see the sketch below)

K_V(X, Y) = Σ_{v_i ∈ V} K_{v_i}(X, Y),   where   K_{v_i}(X, Y) = K_L(x*, y*)

and x* and y* are the elements of X and Y closest to v_i

x* and y* are the elements of X and Ycloser to vi

Page 78: ICIAP 2007 - Tutorial Advances of statistical learning and

78

Intermediate matching kernel[Boughorbel et al,, 2004]

(Graphical illustration of the intermediate matching kernel: each virtual feature v_i selects the elements x* ∈ X and y* ∈ Y closest to it, and contributes K_L(x*, y*) to the sum.)

Page 79: ICIAP 2007 - Tutorial Advances of statistical learning and

79

Intermediate matching kernel:how to choose the virtual features

The intuition behind the virtual features is to find representatives of the feature points extracted from the training set
Simply, the training set features are grouped into N clusters

The authors show that the choice of N is not crucial (the bigger the better, but beware of the computational complexity)
It is better to cluster features within each class

Page 80: ICIAP 2007 - Tutorial Advances of statistical learning and

80

Conclusions

Understanding the image content is difficult

Statistical learning can help a lot

Don’t forget computer vision! Appropriate descriptions and similarity measures allow us to achieve good results and to obtain effective solutions

Page 81: ICIAP 2007 - Tutorial Advances of statistical learning and

81

That’s all!

How to contact us:
Ernesto: [email protected]
Francesca: [email protected]

http://slipguru.disi.unige.it
where you will find updated versions of the slides

Page 82: ICIAP 2007 - Tutorial Advances of statistical learning and

82

Selected (and very incomplete) biblio

A. Destrero, C. De Mol, F. Odone, A. Verri. A regularized approach to feature selection for face detection. DISI-TR-2007-01.
A. Mohan, C. Papageorgiou, T. Poggio. Example-based object detection in images by components. IEEE Trans. PAMI, 23(4), 2001.
F. Odone, A. Barla, A. Verri. Building kernels from binary strings for image matching. IEEE Transactions on Image Processing, 14(2):169-180, 2005.
P. Viola, M. J. Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2), 2004.
C. Wallraven, B. Caputo, A. Graf. Recognition with local features: the kernel recipe. ICCV 2003.