
Machine learning for Signature Verification

Sargur Srihari (with Harish Srinivasan and Matthew Beal)

Center of Excellence for Document Analysis and Recognition (CEDAR)

Department of Computer Science and Engineering

University at Buffalo

State University of New York, USA


Decision Task in Signature Verification

1:n Verification

[Figure: n known signatures (1, 2, ..., n) and one questioned (unknown) signature enter the verification process, which reports the questioned signature as Genuine, Forgery, or Unknown]

n is small, unlike in other machine learning domains


Overview of Presentation

1. Machine Learning Terminology
2. Decision Task for Signatures
3. Inference Strategies
   1. Person-dependent
   2. Person-independent
4. Methods of Comparing Distributions
5. Similarity Measure for Signatures
6. Experimental Results
7. Summary and Future Work


Machine Learning Terminology

• Two stages in a machine learning system
  – Inference Stage
    • Learning from samples
    • Training
  – Decision Stage
    • Perform the task on given data
    • Execution


General Learning and Special Learning

• Standard machine learning assumes a large n, which is not always the case in questioned document (QD) examination

• For signature analysis we introduce two types of inference:
  1. Special (person-dependent)
  2. General (person-independent)


Person-dependent Inference

Genuine (Known) Signatures: $G = \{g_1, g_2, \ldots, g_n\}$

Questioned Signature: Q

Decision is a one-class problem: whether Q belongs to G


Person-dependent Inference Method

Genuine Pairs (G x G)

Inference: a similarity or distance function $d(g_i, g_j)$ applied to pairs of known signatures yields the Genuine (G x G) distribution

Example with n = 4 knowns: G x G = {.4, .1, .2, .3, .2, .1}, $\binom{n}{2} = 6$ values


Person-dependent Decision Method

Step 1: Questioned vs. Knowns yields the Q x G distribution

[Figure: questioned signature Q compared against the known signatures a, b, c, d]

$d(q, g_i)$ yields n values

Q x G = {0, .4, .1, .2}

Step 2: Compare the G x G and Q x G distributions

G x G = {.4, .1, .2, .3, .2, .1}

Q x G = {0, .4, .1, .2}

A statistical test to compare the distributions yields the confidence of match
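The two distance sets used by the person-dependent method can be sketched in a few lines. This is only an illustration: the Euclidean distance and the toy 2-D feature vectors are assumptions standing in for the actual signature features and distance function, and the function names are hypothetical.

```python
from itertools import combinations
import math


def euclidean(a, b):
    """Placeholder distance between two feature vectors (an assumption)."""
    return math.dist(a, b)


def genuine_pairs(knowns, d=euclidean):
    """G x G: one distance for every unordered pair of known signatures."""
    return [d(gi, gj) for gi, gj in combinations(knowns, 2)]


def questioned_vs_knowns(q, knowns, d=euclidean):
    """Q x G: distance from the questioned signature to each known."""
    return [d(q, g) for g in knowns]


# Toy 2-D feature vectors standing in for four known signatures.
knowns = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (0.1, 0.2)]
questioned = (0.05, 0.1)

gxg = genuine_pairs(knowns)
qxg = questioned_vs_knowns(questioned, knowns)
print(len(gxg), len(qxg))  # 6 4
```

With n = 4 knowns, G x G holds 6 values and Q x G holds 4, matching the sizes in the example.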


Person-independent Inference

Genuine signatures: $G = \{G_1, G_2, \ldots, G_m\}$, $G_i = \{g_{i1}, \ldots, g_{i m_i}\}$

Forgery (impostor) signatures: $F = \{F_1, F_2, \ldots, F_m\}$, $F_i = \{f_{i1}, \ldots, f_{i m_i}\}$

Construct Genuine and Impostor distributions


Genuine and Impostor Distributions

[Figure: Genuine and Impostor distance distributions]

Person-independent Learning

Genuine: $G_i \times G_i$

Impostor: $G_i \times F_i$

Distances between pairs are modeled as gamma distributions (since distances are positive)
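One way to fit a gamma distribution to a set of distances is to match its mean and variance. The slides do not say how the parameters are estimated, so the method of moments below is an assumption, and the distance values are made up for illustration.

```python
import statistics


def fit_gamma_moments(distances):
    """Return (shape, scale) of a gamma distribution whose mean and variance
    match the sample: shape = mean^2 / var, scale = var / mean."""
    mean = statistics.fmean(distances)
    var = statistics.variance(distances)  # sample variance, needs >= 2 values
    return mean * mean / var, var / mean


genuine_d = [0.10, 0.12, 0.15, 0.20, 0.18, 0.11]   # made-up G_i x G_i distances
impostor_d = [0.35, 0.42, 0.50, 0.38, 0.61, 0.47]  # made-up G_i x F_i distances

g_shape, g_scale = fit_gamma_moments(genuine_d)
i_shape, i_scale = fit_gamma_moments(impostor_d)
```

A maximum-likelihood fit would be an equally reasonable choice; the moment estimates are used here only because they need no numerical optimization.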


Person-independent Decision

Compare the questioned signature against each of the n genuines; a decision is made on each element of Q x G

[Figure: Genuine and Impostor distributions]

$\mathrm{LLR} = \log \dfrac{P(d \mid \text{genuine})}{P(d \mid \text{impostor})}$

If LLR > α the decision is Genuine, else Forgery (α is learnt from the ROC)

For n knowns, the n LLR values are added (assuming independent knowns)
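The decision rule can be sketched as below, with the gamma log-density written out from its standard form. The (shape, scale) parameters and the threshold `alpha` are hypothetical values chosen for illustration; in practice they would come from the learnt distributions and the ROC.

```python
import math


def gamma_logpdf(x, shape, scale):
    """Log density of a gamma distribution at x (x must be > 0)."""
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def total_llr(distances, genuine, impostor):
    """Sum of per-known log-likelihood ratios, treating knowns as independent."""
    return sum(gamma_logpdf(d, *genuine) - gamma_logpdf(d, *impostor)
               for d in distances)


def decide(distances, genuine, impostor, alpha=0.0):
    """Genuine if the summed LLR exceeds the threshold alpha, else Forgery."""
    llr = total_llr(distances, genuine, impostor)
    return "Genuine" if llr > alpha else "Forgery"


# Hypothetical (shape, scale) parameters, for illustration only.
GENUINE = (4.0, 0.04)   # mean distance 0.16
IMPOSTOR = (9.0, 0.05)  # mean distance 0.45

small = decide([0.12, 0.15, 0.20], GENUINE, IMPOSTOR)
large = decide([0.50, 0.44, 0.60], GENUINE, IMPOSTOR)
```

Note that a distance of exactly 0 has no finite log-density under this model, so such values would need separate handling.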


Methods for Comparing Distributions

• Kolmogorov Smirnov (KS)
• Kullback Leibler (KL) Divergence
• Reverse KL
• Symmetric KL
• Jensen-Shannon KL


Statistical Tests

Kolmogorov Smirnov (KS) Test: tells whether two data sets are drawn from the same distribution.

Computes the statistic D: the maximum absolute difference between the two cdfs.

D is then mapped to a probability (the confidence with which Q belongs to K)
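The statistic D can be sketched with empirical CDFs evaluated at every observed value. This is the standard two-sample form and only an illustration; the value it returns depends on how the CDFs are built, so it need not coincide with figures obtained from binned or plotted CDFs. The mapping from D to a probability is not reproduced here.

```python
def ecdf(sample, x):
    """Fraction of sample values that are <= x."""
    return sum(v <= x for v in sample) / len(sample)


def ks_statistic(a, b):
    """Largest absolute gap between the two empirical CDFs, checked at every
    observed value (the gap can only change at those points)."""
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in set(a) | set(b))


gxg = [.4, .1, .2, .3, .2, .1]  # distances between pairs of knowns
qxg = [0, .4, .1, .2]           # distances from the questioned to each known
d_ks = ks_statistic(gxg, qxg)
```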


Kolmogorov Smirnov Test

Based on the maximum difference between two C.D.F.s

G x G distribution: .4, .1, .2, .3, .2, .1; sorted: .1, .1, .2, .2, .3, .4

Q x G distribution: 0, .4, .1, .2; sorted: 0, .1, .2, .4

[Figure: two C.D.F. plots, P(X <= x) against the distribution values]

Maximum difference between the two C.D.F.s: DKS = 0.333

DKS = 0.333 is mapped to a probability of 89.98% using the formula on the previous slide; this is the confidence with which Q belongs to K


Kullback Leibler Divergence

Measures the similarity between two distributions:

$D_{KL} = \sum_{b=1}^{B} P_b \log \dfrac{P_b}{Q_b}, \qquad P_{KL} = e^{-D_{KL}}$

B = total number of bins in the distribution. $P_b$ and $Q_b$ are the probabilities of the two distributions in the b-th bin. $P_{KL}$ = probability of similarity of the two distributions.

Combined KL and KS probabilistic measure = average of the resulting KL and KS probabilities.


KL Calculation

G x G distribution: .4, .1, .2, .3, .2, .1

Q x G distribution: 0, .4, .1, .2

Binning, say B = 2:

Bins              0 – 0.2       0.2 – 0.4
P.D.F. (G x G)    4/6 = 0.66    2/6 = 0.33
P.D.F. (Q x G)    3/4 = 0.75    1/4 = 0.25

DKL = (0.66 * log(0.66/0.75)) + (0.33 * log(0.33/0.25)) = 0.0072

PKL = exp(-0.0072) = 0.9928 = 99.28%
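The calculation can be checked with a short sketch. The binning helper and its right-inclusive bin edges are assumptions chosen so that the counts agree with the example (4/6 and 2/6, 3/4 and 1/4); the logarithm is the natural log.

```python
import math


def bin_probabilities(values, edges):
    """Share of values per bin. A value goes to the first bin whose right
    edge it does not exceed. Assumes every value lies within
    [edges[0], edges[-1]]."""
    counts = [0] * (len(edges) - 1)
    for x in values:
        for b in range(len(counts)):
            if x <= edges[b + 1]:
                counts[b] += 1
                break
    return [c / len(values) for c in counts]


def kl_divergence(p, q):
    """Sum of p_b * ln(p_b / q_b); assumes q_b > 0 wherever p_b > 0."""
    return sum(pb * math.log(pb / qb) for pb, qb in zip(p, q) if pb > 0)


edges = [0.0, 0.2, 0.4]                                 # B = 2 bins
p = bin_probabilities([.4, .1, .2, .3, .2, .1], edges)  # G x G
q = bin_probabilities([0, .4, .1, .2], edges)           # Q x G

d_exact = kl_divergence(p, q)                        # unrounded 2/3 and 1/3
d_slide = kl_divergence([0.66, 0.33], [0.75, 0.25])  # two-decimal values
p_kl = math.exp(-d_slide)
```

With the two-decimal probabilities the divergence comes out at about 0.0072, as in the worked example; with the unrounded values 2/3 and 1/3 it is larger, about 0.017, so the result is sensitive to rounding when B is small.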


Other information theoretic measures

PKLKS: combined KL and KS probabilistic measure = average of the resulting KL and KS probabilities.

Empirically proves to be the best measure


Data Set

Genuine signatures (1320): 55 individuals, 24 signatures each

Forgery signatures (1320): 55 individuals, 24 signatures each


Feature Vector

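[Figure: signature image divided into a 4 x 8 grid of cells]

Per cell:

Gradient (12 bits): 111101111111
Structural (12 bits): 000011001100
Concavity (8 bits): 10100000

Total bits = $(12 + 12 + 8) \times 4 \times 8 = 1024$ bits

Similarity score: computed from $C_{11}$ and $C_{00}$ with the constants 0.5 and 1024 [exact expression not legible in the source]

$C_{00}$ is the number of bits where both vectors have "0"s

$C_{11}$ is the number of bits where both vectors have "1"s

*Described in paper at IWFHR, Tokyo, Nov. 2004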
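The two counts that the similarity score is built from can be sketched as follows. Only the counting is shown; the score formula itself is not reproduced. The first bit string is the example cell above, and the second is a made-up cell for comparison.

```python
def match_counts(x, y):
    """Return (c11, c00): positions where both bit strings hold '1',
    and positions where both hold '0'."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    c11 = sum(a == "1" and b == "1" for a, b in zip(x, y))
    c00 = sum(a == "0" and b == "0" for a, b in zip(x, y))
    return c11, c00


# One cell: gradient + structural + concavity = 32 bits.
cell_a = "111101111111" + "000011001100" + "10100000"
# A second, made-up cell differing from cell_a in three positions.
cell_b = "111100111111" + "000011000100" + "10100001"

c11, c00 = match_counts(cell_a, cell_b)
print(c11, c00)  # 15 14
```

For full signatures the same function would be applied to the two 1024-bit vectors.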


Similarity/Distance measure

Similarity measure between two feature vectors

Hence transform from feature space to distance space


Person independent verification accuracy

[Figures: LLR distributions; ROC curves]

Each subplot corresponds to training on a different n

For n = 20: error rate = 21.3%


Person-dependent verification accuracy

[Figures: error vs. reject rate; table of error rates]

For n = 20: error rate = 16.40%

Cannot be used for n < 4


Summary

• Two learning strategies introduced for small-sample learning with signatures

• Method 1: General Learning (Person Independent)
  – Inference
    • Does not require multiple (>1) genuines
    • Requires forgery samples (of other writers)
  – Decision
    • Two-class problem
    • Based on LLR from gamma distributions


Summary and Future Work

• Method 2: Special Learning (Person Dependent)
  – Inference
    • Requires multiple genuine samples
    • No forgery samples required
  – Decision
    • One-class problem
    • Based on combined KS and KL comparison of distributions

• Method 2 has about 5 percentage points better accuracy with n = 20 (16.4% vs. 21.3% error rate)

• Combination of methods needs investigation