
Page 1: Forged Handwriting Detection

Forged Handwriting Detection

Hung-Chun Chen
M.S. Thesis in Computer Science
Advisors: Drs. Cha and Tappert

Page 2: Forged Handwriting Detection

Motivation

Important documents require signatures to verify the identity of the writer

Experts are required to differentiate between authentic and forged signatures

Important to develop an objective system to identify forged handwriting, or at least to identify those handwritings that are likely to be forged

Page 3: Forged Handwriting Detection

Key Idea

It seems reasonable that successful forgers often forge handwriting shape and size by carefully copying or tracing the authentic handwriting

Forensic literature indicates that this is true

Page 4: Forged Handwriting Detection

Hypotheses

• Good forgeries – those that retain the shape and size of authentic writing – tend to be written more slowly (carefully) than authentic writing

• Good forgeries are likely to be wrinklier (less smooth) than authentic handwriting

Page 5: Forged Handwriting Detection

Methodology

Handwriting sample collection

Measurement (feature) extraction
– Speed
– Wrinkliness

Statistical analysis

Page 6: Forged Handwriting Detection

IBM ThinkPad TransNote

Page 7: Forged Handwriting Detection

Database Construction

Record format for the handwriting samples

1. ID of subject

2. online or offline

3. ID of copied subject

4. word written

5. first/second/third try

6. sampling rate (online) or resolution (offline)

7. file extension
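The seven-field record format listed above could be represented along these lines. This is only a sketch: the class name `SampleRecord`, the field names, the types, and the rule in `is_authentic` are assumptions made for illustration, not the schema actually used in the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleRecord:
    """One database record; all names here are illustrative assumptions."""
    subject_id: int          # 1. ID of the subject who wrote the sample
    online: bool             # 2. True = online (digitizer), False = offline (scan)
    copied_subject_id: int   # 3. ID of the subject whose writing was copied
    word: str                # 4. word written
    attempt: int             # 5. 1, 2, or 3 (first/second/third try)
    rate_or_resolution: int  # 6. sampling rate in Hz (online) or dpi (offline)
    extension: str           # 7. file extension

    def is_authentic(self) -> bool:
        # Assumption: a sample counts as authentic when the writer and the
        # copied subject are the same person.
        return self.subject_id == self.copied_subject_id


# Hypothetical values, chosen only to exercise the class.
example = SampleRecord(1, True, 1, "word", 1, 100, "ext")
```

Treating a sample as authentic when both IDs match is a guess suggested by file names such as 0101T1 in the sample results; the slides do not state the rule.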

Page 8: Forged Handwriting Detection

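Diagram: file naming convention, with the parts of the file name labeled as subject ID, online or offline (ON/OFF), ID of copied subject, word written, try (first, second, or third), rate (100 Hz) or resolution (300 dpi, 600 dpi), and file extension. Pattern fragments shown on the slide: xxxx, .TApril-yyyy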

Page 9: Forged Handwriting Detection

Handwriting Samples

Page 10: Forged Handwriting Detection

Feature Extraction

Speed

Wrinkliness

Page 11: Forged Handwriting Detection

Speed

The digitizer records the x-y coordinates of the pen movement at a sampling rate of 100 Hz

This information is used to calculate the average speed of each handwriting sample
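One plausible way to turn the sampled coordinates into an average speed is to divide the path length of a stroke by the time it took. The sketch below assumes exactly that; the function name `average_speed`, the choice of units, and the handling of a single stroke in isolation are assumptions, since the slides do not spell out the exact calculation.

```python
import math

SAMPLING_RATE_HZ = 100  # from the slide: the digitizer samples at 100 Hz


def average_speed(points, rate_hz=SAMPLING_RATE_HZ):
    """Path length divided by elapsed time for one stroke.

    points: list of (x, y) tuples in sampling order.
    Returns distance units per second, or 0.0 for fewer than two points.
    """
    if len(points) < 2:
        return 0.0
    # Sum the straight-line distance between consecutive samples.
    length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    # Consecutive samples are 1 / rate_hz seconds apart.
    elapsed = (len(points) - 1) / rate_hz
    return length / elapsed
```

How pen lifts between strokes are timed is not stated in the source, so a whole-word speed built from this per-stroke value would need a further assumption.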

Page 12: Forged Handwriting Detection

Speed

The original file of the points:

** Page 10 has 4 scribbles:
PageSize is 21.59 cm wide by 27.94 cm high.

Scribble 0: time 2002/12/11 23:37
Stroke has 93 points:
Point ( 4.73 , 5.02 )
Point ( 4.73 , 5 )
Point ( 4.73 , 4.99 )
Point ( 4.73 , 4.97 )
....

Scribble 1: time 2002/12/11 23:37
Stroke has 113 points:
Point ( 5.82 , 5.26 )
Point ( 5.83 , 5.26 )
Point ( 5.85 , 5.25 )
Point ( 5.88 , 5.24 )
...

Scribble 2: time 2002/12/11 23:37
Stroke has 7 points:
Point ( 7.93 , 4.61 )
Point ( 7.94 , 4.61 )
Point ( 7.96 , 4.61 )
Point ( 7.99 , 4.62 )
...

Scribble 3: time 2002/12/11 23:37
Stroke has 47 points:
Point ( 8.26 , 5.75 )
Point ( 8.27 , 5.75 )
....
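A point dump in this layout can be read with a small parser. The sketch below assumes that every scribble begins with "Scribble <n>:" and that every coordinate pair is written as "Point ( x , y )", as in the excerpt; `parse_strokes` and `POINT_RE` are hypothetical names, and the real file may contain details the excerpt does not show.

```python
import re

# Matches "Point ( 4.73 , 5.02 )" and captures the two numbers.
POINT_RE = re.compile(r"Point \(\s*([-\d.]+)\s*,\s*([-\d.]+)\s*\)")


def parse_strokes(text):
    """Return one list of (x, y) floats per scribble found in the dump."""
    strokes = []
    # Everything before the first "Scribble <n>:" is the page header; skip it.
    for chunk in re.split(r"Scribble \d+:", text)[1:]:
        strokes.append([(float(x), float(y)) for x, y in POINT_RE.findall(chunk)])
    return strokes
```

Each returned list holds the samples of one scribble in recording order, which is the form needed to measure path length and speed.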

Page 13: Forged Handwriting Detection

Wrinkliness

Wrinkliness = log(high_resolution / low_resolution) / log(2)

high_resolution – the number of pixels on the boundary of the high resolution handwriting sample

low_resolution – the number of pixels on the boundary of the low resolution handwriting sample

Note that the wrinkliness of a straight line = 1.0
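The formula translates directly into code. In the sketch below the function name `wrinkliness` and the input check are additions; the straight-line case assumes an idealized line whose boundary pixel count exactly doubles when the resolution doubles (as from 300 to 600 dpi), which is why its wrinkliness comes out as 1.0.

```python
import math


def wrinkliness(high_resolution, low_resolution):
    """log(high_resolution / low_resolution) / log(2), per the slide's definition.

    Both arguments are boundary pixel counts of the same handwriting sample.
    """
    if high_resolution <= 0 or low_resolution <= 0:
        raise ValueError("boundary pixel counts must be positive")
    return math.log(high_resolution / low_resolution) / math.log(2)


# Idealized straight line: doubling the resolution doubles the boundary pixels.
straight_line = wrinkliness(high_resolution=200, low_resolution=100)
```

A wrinklier boundary gains more than twice as many pixels at the higher resolution, so its value rises above 1.0.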

Page 14: Forged Handwriting Detection

Original handwriting sample

Page 15: Forged Handwriting Detection

Find the edge of the handwriting

Page 16: Forged Handwriting Detection

Edges of 300 and 600 dpi

Page 17: Forged Handwriting Detection

Number of pixels on the boundary

Convert the scanned images to color images

Count the number of pixels with Red < 50, Green < 50, and Blue < 50 at each of the two resolutions

Get the wrinkliness value
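The counting step can be sketched as below, assuming the edge image has already been loaded as a sequence of (red, green, blue) tuples with 0-255 values. Loading the scan is left out, and `count_dark_pixels` is a hypothetical name, not code from the thesis.

```python
DARK_THRESHOLD = 50  # from the slide: Red < 50, Green < 50, Blue < 50


def count_dark_pixels(pixels, threshold=DARK_THRESHOLD):
    """Count RGB pixels whose three channels are all below the threshold.

    pixels: iterable of (red, green, blue) tuples with 0-255 values.
    """
    return sum(
        1
        for red, green, blue in pixels
        if red < threshold and green < threshold and blue < threshold
    )
```

Running this once on the 300 dpi edge image and once on the 600 dpi edge image gives the two counts that the wrinkliness formula compares.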

Page 18: Forged Handwriting Detection

Sample Results

Filename  300dpi  600dpi  Wrinkliness  Speed
0101T1    14894   30583   1.03799867   0.11396973
0101T2    8786    18638   1.084968652  0.107457204
0101T3    9258    19764   1.094102493  0.118184103
0202T1    6453    13765   1.092962679  0.093275242
0202T2    6212    13319   1.100356033  0.094080635
0202T3    5824    12722   1.127243231  0.087968122
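As a consistency check, the wrinkliness column can be recomputed from the two pixel-count columns with the base-2 logarithm of their ratio. The numbers below are copied from the sample results; the names `ROWS`, `recomputed`, and `max_error` are introduced only for this sketch.

```python
import math

# (filename, boundary pixels at 300 dpi, at 600 dpi, wrinkliness on the slide)
ROWS = [
    ("0101T1", 14894, 30583, 1.03799867),
    ("0101T2", 8786, 18638, 1.084968652),
    ("0101T3", 9258, 19764, 1.094102493),
    ("0202T1", 6453, 13765, 1.092962679),
    ("0202T2", 6212, 13319, 1.100356033),
    ("0202T3", 5824, 12722, 1.127243231),
]

# Wrinkliness = log2(600 dpi count / 300 dpi count) for each file.
recomputed = {name: math.log2(high / low) for name, low, high, _ in ROWS}

# Largest gap between the recomputed and the reported values.
max_error = max(abs(recomputed[name] - reported) for name, _, _, reported in ROWS)
```

The recomputed values match the reported column to well within rounding.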

Page 19: Forged Handwriting Detection

Information on the ten subjects

UserID  Age  Ethnicity  Education    Gender  Schooling  Handedness
1       30   Caucasian  Master       F       English    R
2       30   Asian      Master       F       Foreign    R
3       20   Asian      Bachelor     F       Foreign    R
4       27   Asian      Master       M       Foreign    R
5       28   Asian      Master       F       Foreign    R
6       35   Caucasian  Bachelor     M       English    R
7       60   Caucasian  Master       M       English    R
8       67   Asian      Beyond H.S.  F       Foreign    L
9       35   Caucasian  PhD          F       English    R
10      70   Asian      Beyond H.S.  M       Foreign    L

Page 20: Forged Handwriting Detection

Summary of handwriting samples

10 subjects

Each subject wrote
– 3 authentic handwriting samples
– 3 forgeries of each of the other 9 subjects

Total 300 handwriting samples
– 30 authentic
– 270 forgeries

Total 900 database records
– One online and two resolutions offline for each handwriting sample
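The totals follow from the collection design, as this short calculation shows. The variable names are chosen for this sketch only.

```python
SUBJECTS = 10
TRIES = 3

authentic = SUBJECTS * TRIES                   # each subject writes 3 authentic samples
forgeries = SUBJECTS * (SUBJECTS - 1) * TRIES  # 3 forgeries of each of the other 9 subjects
samples = authentic + forgeries
records = samples * 3                          # 1 online + 2 offline (300 and 600 dpi) per sample
```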

Page 21: Forged Handwriting Detection

Speed Hypothesis Test

H0 (null hypothesis): the mean speeds of the authentic and forged handwriting are equal

Ha (alternative hypothesis): the mean speed of the authentic handwriting is greater than that of the forged handwriting

Page 22: Forged Handwriting Detection

Mean equality test output

Alpha (level of significance) = 5%

  Authentic Forged

Mean 0.083 0.057

Variance 0.00050 0.00053

Observations na=30 nf=270

Pooled Variance 0.00053

Hypothesized Mean Difference 0

df 298

t Stat 5.87

P(T<=t) one-tail 5.90E-09

t Critical one-tail 1.65
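The pooled variance, degrees of freedom, and t statistic in this output can be approximately recomputed from the rounded means, variances, and sample sizes. The sketch below uses the standard pooled-variance (equal-variance) two-sample t formula, which the "Pooled Variance" row suggests was used; `pooled_t` is a hypothetical helper, and because the inputs are rounded the result is only close to the reported value (about 5.88 against the reported 5.87).

```python
import math


def pooled_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Return (pooled variance, degrees of freedom, t) for mean_a - mean_b."""
    df = n_a + n_b - 2
    # Weight each sample variance by its own degrees of freedom.
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return pooled, df, t


# Authentic (a) versus forged (b) speed, using the rounded values from the output.
pooled, df, t = pooled_t(0.083, 0.00050, 30, 0.057, 0.00053, 270)
T_CRITICAL_ONE_TAIL = 1.65  # from the output, alpha = 5%
reject_null = t > T_CRITICAL_ONE_TAIL
```

The one-tail p-value is not recomputed here, since it requires the t distribution, which the Python standard library does not provide.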

Page 23: Forged Handwriting Detection

Reject the null hypothesis

Alpha (level of significance) = 0.05

p (probability) value is 5.90E-09, which is much less than alpha

The data support the hypothesis

Reject the null hypothesis at the 5% significance level (95% confidence level)

Page 24: Forged Handwriting Detection

Wrinkliness Hypothesis Test

H0 (null hypothesis): the mean wrinkliness of the forged and authentic handwriting are equal

mean log2(600dpi_f / 300dpi_f) = mean log2(600dpi_a / 300dpi_a), where f = forged and a = authentic

Ha (alternative hypothesis): the mean wrinkliness of the authentic handwriting is less than the mean wrinkliness of the forged handwriting

Page 25: Forged Handwriting Detection

Mean equality test output

Alpha (level of significance) = 5%

  Forged Authentic

Mean 1.094 1.083

Variance 0.0013 0.0010

Observations 270 30

Pooled Variance 0.0013

Hypothesized Mean Difference 0

df 298

t Stat 1.52

P(T<=t) one-tail 0.065

t Critical one-tail 1.65

Page 26: Forged Handwriting Detection

Fail to reject the null hypothesis

Alpha (level of significance) = 0.05

p (probability) value is 0.065, which is greater than alpha

The data do not support the hypothesis

Cannot reject the null hypothesis at the 5% significance level (95% confidence level)

Page 27: Forged Handwriting Detection

The first possible reason for failure

Different writing styles among the three tries of the authentic handwriting

Figure panels: first try, second try, third try

Page 28: Forged Handwriting Detection

The second possible reason for failure

Some subjects didn’t forge other subjects’ handwritings carefully

Figure panels: authentic, forged

Page 29: Forged Handwriting Detection

Revised hypothesis test

Eliminate the different authentic writing styles and the poorly forged handwriting samples

Run the hypothesis test again

Page 30: Forged Handwriting Detection

Mean equality test output

Alpha (level of significance) = 5%  

Forged Authentic

Mean 1.097 1.079

Variance 0.0016 0.0009

Observations 190 23

Pooled Variance 0.0015

Hypothesized Mean Difference 0

df 211

t Stat 2.06

P(T<=t) one-tail 0.0205

t Critical one-tail 1.65

Page 31: Forged Handwriting Detection

Reject the null hypothesis

Alpha (level of significance) = 0.05

p (probability) value is 0.0205, which is less than alpha

The data support the hypothesis

Reject the null hypothesis at the 5% significance level (95% confidence level)

Page 32: Forged Handwriting Detection

Conclusion

The average writing speed of the forged handwritings tends to be slower than the speed of the authentic handwritings

“Good” (well formed) forged handwritings tend to be wrinklier (less smooth) than authentic ones

Page 33: Forged Handwriting Detection

Future Extensions

Redo the study using signatures rather than arbitrary words since writing signatures is a highly learned automatic process

Investigate using different resolutions to improve the estimate of wrinkliness

Devise pattern recognition algorithms to filter out the “bad” forged samples automatically

Compute features over portions of the writing rather than over the whole word or signature

Page 34: Forged Handwriting Detection

The End