Visual CAPTCHA with Handwritten Image Analysis

1/25

Visual CAPTCHA with Handwritten Image Analysis

Amalia Rusu and Venu Govindaraju

CEDAR, University at Buffalo


2/25

Background on CAPTCHA

CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart

CAPTCHAs should be generated and graded automatically

Tests should be quick and easy for human users to take

Tests should accept virtually all human users and reject software agents

Tests should resist automatic attack for many years, despite advances in technology and prior knowledge of the algorithms

Exploits the difference in abilities between humans and machines (e.g., recognition of text, speech, or facial features)

A new formulation of Alan Turing's test: "Can machines think?"


3/25

Securing Cyberspace Using CAPTCHA

Automatic authentication session for Web services: Initialization, Handwritten CAPTCHA Challenge, User Response, Verification

[Figure: User authentication. The user and the authentication server exchange a challenge and a response over the Internet. The user initiates the dialog and has to be authenticated by the server.]
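The four steps of the session (initialization, challenge, user response, verification) can be sketched as a small in-memory server. This is only an illustration under stated assumptions: the class name `AuthenticationServer`, the word pool, and the placeholder challenge string are hypothetical, and a real system would send a distorted handwritten image rather than a text description.

```python
import hmac
import secrets


class AuthenticationServer:
    """Toy server that issues CAPTCHA challenges and verifies responses."""

    def __init__(self, truth_words):
        # truth_words: hypothetical pool of words whose images would be shown.
        self.truth_words = list(truth_words)
        self.pending = {}  # session id -> expected answer

    def initialize(self):
        """Initialization and challenge: open a session and pick a word."""
        session_id = secrets.token_hex(8)
        word = secrets.choice(self.truth_words)
        self.pending[session_id] = word
        # Placeholder only: a real challenge would be a distorted image.
        challenge = f"<handwritten image of a {len(word)}-letter word>"
        return session_id, challenge

    def verify(self, session_id, response):
        """Verification: each challenge can be answered only once."""
        expected = self.pending.pop(session_id, None)
        if expected is None:
            return False
        given = response.strip().lower().encode()
        return hmac.compare_digest(expected.lower().encode(), given)
```

Removing the session entry on the first attempt means a wrong guess cannot be retried against the same image, which is one simple way to limit automated guessing.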


4/25

Objective

Develop CAPTCHAs based on the ability gap between humans and machines in handwriting recognition, using Gestalt laws of perception

State of the art in handwriting recognition (HR) [Xue, Govindaraju 2002]

Speed and accuracy of a handwriting recognizer. Feature extraction time is excluded. The testing platform is an Ultra-SPARC.

Lexicon    Lexicon Driven                         Grapheme Model
size       time (secs)  accuracy (%)              time (secs)  accuracy (%)
                        Top 1      Top 2                       Top 1      Top 2
10         0.027        96.53      98.73          0.021        96.56      98.77
100        0.044        89.22      94.13          0.031        89.12      94.06
1000       0.144        75.38      86.29          0.089        75.38      86.29
20000      1.827        58.14      66.56          0.994        58.14      66.49


5/25

H-CAPTCHA Motivation

Machine recognition of handwriting is more difficult than recognition of printed text

Handwriting recognition is a task that humans perform easily and reliably

Several CAPTCHAs based on machine-printed text have already been broken

Greg Mori and Jitendra Malik of UC Berkeley have written a program that solves Ez-Gimpy with 83% accuracy

Thayananthan, Stenger, Torr, and Cipolla of the Cambridge vision group have written a program that achieves a 93% correct recognition rate against Ez-Gimpy

Gabriel Moy, Nathan Jones, Curt Harkless, and Randy Potter of Areté Associates have written a program that achieves 78% accuracy against Gimpy-R

CAPTCHAs based on speech or visual features are impractical

H-CAPTCHAs are thus far unexplored by the research community


6/25

H-CAPTCHA Challenges

Generating random and infinitely many distinct handwritten CAPTCHAs

Quantifying and exploiting the weaknesses of state-of-the-art handwriting recognizers and OCR systems

Controlling distortion, so that the images are human readable (conform to Gestalt laws) but not machine readable


7/25

Use handwritten word images that current recognizers cannot read

Handwritten US city name images available from postal applications

Collect new handwritten word samples

Create real (or nonsense) handwritten words and sentences by gluing together isolated upper- and lower-case handwritten characters or word images


handwritten text images
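Gluing isolated character images into a word can be sketched as follows. The representation is an assumption made for illustration: each image is a list of rows with 1 for ink and 0 for background, characters are aligned on their bottom row, and the names `pad_to_height`, `glue`, and `make_word` are hypothetical.

```python
import random


def pad_to_height(image, height):
    """Pad a binary image with blank rows on top so it has `height` rows."""
    width = len(image[0])
    blank_rows = [[0] * width for _ in range(height - len(image))]
    return blank_rows + [list(row) for row in image]


def glue(images, gap=1):
    """Concatenate images left to right, separated by `gap` blank columns."""
    height = max(len(img) for img in images)
    padded = [pad_to_height(img, height) for img in images]
    rows = []
    for y in range(height):
        row = []
        for i, img in enumerate(padded):
            if i > 0:
                row.extend([0] * gap)
            row.extend(img[y])
        rows.append(row)
    return rows


def make_word(word, samples, rng=random):
    """Pick one random sample image per character and glue them together.

    samples: dict mapping a character to a list of sample images.
    """
    return glue([rng.choice(samples[ch]) for ch in word])
```

Because a sample is drawn independently for every character, the same word can be rendered in many different ways, and nonsense words cost nothing extra to produce.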


8/25

Use handwriting distorter for generating human-like samples

Models that change the trajectory/shape of the letter in a controlled fashion (e.g., Hollerbach's oscillation model)

Original handwritten image (a). Synthetic images (b,c,d,e,f).


handwritten text images
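An oscillation model of this kind treats the pen velocity as a horizontal and a vertical oscillation on top of a constant rightward drift, so changing a few parameters changes the letter shapes in a controlled way. The sketch below is a simplified illustration, not the distorter used in the slides: it assumes one shared frequency `omega` for both axes, and the function names and the multiplicative jitter scheme are hypothetical.

```python
import math
import random


def trajectory(a, b, omega, phi_x, phi_y, drift, steps=200, duration=4.0):
    """Pen positions for velocities
        vx(t) = a * sin(omega * t + phi_x) + drift
        vy(t) = b * sin(omega * t + phi_y)
    integrated analytically from the origin."""
    points = []
    for i in range(steps + 1):
        t = duration * i / steps
        x = -a / omega * (math.cos(omega * t + phi_x) - math.cos(phi_x)) + drift * t
        y = -b / omega * (math.cos(omega * t + phi_y) - math.cos(phi_y))
        points.append((x, y))
    return points


def perturbed(params, scale, rng):
    """Return a copy of params with each value scaled by up to +/- `scale`."""
    return {name: value * (1 + rng.uniform(-scale, scale))
            for name, value in params.items()}


base = dict(a=1.0, b=2.0, omega=2 * math.pi, phi_x=0.0, phi_y=math.pi / 2, drift=0.5)
variant = trajectory(**perturbed(base, 0.1, random.Random(0)))
```

Each call to `perturbed` yields a slightly different stroke path from the same base parameters, which is the sense in which synthetic samples can be generated from one original.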


9/25

Exploit the Sources of Errors for State-of-the-Art Handwriting Recognizers

Word Model Recognizer (WMR) [Kim, Govindaraju 1997]

lexicon-driven approach

chain-code-based image processing

pre-processing

segmentation

feature extraction

dynamic matching

Accuscript [Xue, Govindaraju 2002]

grapheme-based recognizer

extracts high-level structural features from characters, such as loops, turns, junctions, and arcs, without previous segmentation

uses a stochastic finite state automata model based on the extracted features

uses static lexicons in the recognition process

[Figure: Grapheme Based Model. A word image labeled with structural features: junction, loops, loop, turns, end, end.]

[Figure: Lexicon Driven Model. A word image divided at positions numbered 1 to 9, with candidate character matches and their distances (e.g., w[5.0], o[6.0], r[6.3], d[4.4]) attached to groups of segments.]

Find the best way of accounting for the characters w, o, r, d by consuming all segments 1 to 8 in the process

Distance between the first character "w" of the lexicon entry "word" and the image:

- between segments 1 and 4 is 5.0

- between segments 1 and 3 is 7.2

- between segments 1 and 2 is 7.6
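The lexicon-driven matching described here, where every character of a lexicon entry must account for a group of consecutive segments and all segments must be consumed, can be sketched as a dynamic program. This is a minimal illustration, not the WMR implementation: the function names, the limit of four segments per character, and the toy distance table are all assumptions.

```python
import math


def match_word(word, num_segments, distance, max_span=4):
    """Minimum total distance of matching `word` to all segments.

    distance(ch, start, end) is the cost of explaining segments
    start..end-1 with character ch. Returns math.inf if no assignment exists.
    """
    # best[k][j]: cheapest way to explain the first j segments
    # with the first k characters of the word.
    best = [[math.inf] * (num_segments + 1) for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for k, ch in enumerate(word, start=1):
        for end in range(1, num_segments + 1):
            for start in range(max(0, end - max_span), end):
                cost = best[k - 1][start] + distance(ch, start, end)
                if cost < best[k][end]:
                    best[k][end] = cost
    return best[len(word)][num_segments]


def rank_lexicon(lexicon, num_segments, distance):
    """Order lexicon entries from best (smallest distance) to worst."""
    return sorted(lexicon, key=lambda w: match_word(w, num_segments, distance))


# Hypothetical distances for a 3-segment image; unlisted pairs cost 9.0.
toy_table = {("d", 0, 2): 1.0, ("o", 2, 3): 2.0, ("d", 0, 1): 4.0, ("o", 1, 3): 5.0}


def toy_distance(ch, start, end):
    return toy_table.get((ch, start, end), 9.0)
```

With the toy table, `match_word("do", 3, toy_distance)` picks the split that gives "d" the first two segments and "o" the last one. Since every lexicon entry is scored this way, both running time and the chance of confusion grow with lexicon size.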


10/25

Sources of Errors for State-of-the-Art Handwriting Recognizers

Image quality

Background noise, printing surface, writing styles

Image features

Variable stroke width, slope, rotations, stretching, compressing

Segmentation errors

Over-segmentation, merging, fragmentation, ligatures, scrawls

Recognition errors

Confusion with similar lexicon entries, large lexicons


11/25

Gestalt Laws

Gestalt psychology is based on the observation that we often experience things that are not a part of our simple sensations

What we are seeing is an effect of the whole event, not contained in the sum of the parts (holistic approach)

Organizing principles: Gestalt laws

By no means restricted to perception only (e.g., memory)


12/25

Gestalt Laws

1. Law of closure

2. Law of similarity

OXXXXXX

XOXXXXX

XXOXXXX

XXXOXXX

XXXXOXX

XXXXXOX

XXXXXXO

3. Law of proximity

**************

**************

**************

4. Law of symmetry

[ ][ ][ ]


13/25

Gestalt Laws

5. Law of continuity

a) Ambiguous segmentation

b) Segmentation based on good continuity, follows the path of minimal curvature change

c) Perceptually implausible segmentation

6. Law of familiarity

a) Ambiguous segmentation

b) Perceptual segmentation

c) Segmentation based on good continuity proves to be erroneous


14/25

Gestalt Laws

7. Figure and ground

8. Memory


15/25


16/25

Control Occlusions

Gestalt laws: closure, proximity, familiarity

Add occlusions by circles, rectangles, and lines at random angles

Keep occlusions small enough that they do not hide letters completely
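Occlusion by small random circles can be sketched as below. The assumptions are ours: the image is a list of rows with 1 for ink, the function name is hypothetical, circles are filled with a caller-chosen pixel value, and the "small enough" rule is reduced to a cap on the radius that the caller picks relative to the letter size.

```python
import random


def add_circle_occlusions(image, count, max_radius, rng, value=0):
    """Return a copy of `image` with `count` filled circles drawn on it.

    Radii are drawn from 1..max_radius; keeping max_radius well below the
    letter height is what prevents a circle from hiding a whole letter.
    """
    height, width = len(image), len(image[0])
    result = [list(row) for row in image]
    for _ in range(count):
        radius = rng.randint(1, max_radius)
        cx = rng.randrange(width)
        cy = rng.randrange(height)
        for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    result[y][x] = value
    return result
```

Humans fill in the missing stroke fragments (closure), while a recognizer sees broken strokes that disturb segmentation and feature extraction.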


17/25

Control Occlusions

Gestalt laws: closure, proximity, familiarity

Add occlusions by waves running from left to right across the entire image, with various amplitudes and wavelengths, or rotate them by an angle

Choose areas with more foreground pixels, in the lower part of the text image (not too low, not too high)
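A wave occlusion placed in a dense lower band of the image can be sketched like this. The details are assumptions for illustration: the candidate band of 50% to 85% of the image height, the single sinusoid, and the names `choose_baseline` and `add_wave` are hypothetical, and rotation of the wave is left out.

```python
import math


def choose_baseline(image, low=0.5, high=0.85):
    """Pick the row with the most ink (1 pixels) in the lower part of the
    image, excluding the very bottom rows."""
    height = len(image)
    first = int(height * low)
    last = max(first + 1, int(height * high))
    return max(range(first, last), key=lambda y: sum(image[y]))


def add_wave(image, amplitude, wavelength, thickness=1, value=0):
    """Return a copy of `image` with a sinusoidal band drawn left to right."""
    height, width = len(image), len(image[0])
    baseline = choose_baseline(image)
    result = [list(row) for row in image]
    for x in range(width):
        center = baseline + amplitude * math.sin(2 * math.pi * x / wavelength)
        top = round(center)
        for y in range(top, top + thickness):
            if 0 <= y < height:  # clip the wave at the image border
                result[y][x] = value
    return result
```

Varying `amplitude`, `wavelength`, and `thickness` per image keeps the occlusion pattern from being a fixed template that an attacker could subtract.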


18/25

Control Extra Strokes

Gestalt laws: continuity, figure and ground, familiarity

Add occlusions drawn with the same pixel value as the foreground (black pixels): arcs or lines of various thicknesses

Curved strokes could be confused with part of a character

Use asymmetric strokes so that the pattern cannot be learned


19/25

Control Letter/Word Orientation

Gestalt laws: memory, internal metrics, familiarity of letters

Change the orientation of the entire word, or of only a few letters

Use variable rotation, stretching, compressing

[Figure: example images labeled flip-flop, vertical mirror, horizontal mirror.]
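Mirroring a whole word or only a few letters can be sketched on a row-list image as follows. The function names are hypothetical, and how they map onto the slide's labels (flip-flop, vertical mirror, horizontal mirror) is not specified in the source, so the names below describe the operation directly. The column range of a letter is assumed to be known.

```python
def mirror_left_right(image):
    """Reverse every row: the word reads right to left."""
    return [list(row[::-1]) for row in image]


def mirror_top_bottom(image):
    """Reverse the row order: the word appears upside down."""
    return [list(row) for row in image[::-1]]


def rotate_half_turn(image):
    """Both mirrors together, equivalent to a 180 degree rotation."""
    return mirror_left_right(mirror_top_bottom(image))


def transform_columns(image, start, stop, transform):
    """Apply `transform` only to columns start..stop-1 (e.g., one letter).

    The transform must keep the patch size unchanged, which holds for the
    three functions above.
    """
    patch = transform([row[start:stop] for row in image])
    return [list(row[:start]) + list(new) + list(row[stop:])
            for row, new in zip(image, patch)]
```

For example, `transform_columns(image, 0, 2, mirror_left_right)` mirrors only the first two columns and leaves the rest of the word untouched.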


20/25

General H-CAPTCHA Generation Algorithm

Input.

Original (randomly selected) handwritten image (existing US city name image, or synthetic word image with length 5 to 8 characters, or meaningful sentence)

Lexicon containing the image's truth word

Output.

H-CAPTCHA image

Method.

Randomly choose a number of transformations

Randomly establish the transformations corresponding to the given number

If more than one transformation is chosen, an a priori order is assigned to each transformation based on experimental results

Sort the list of chosen transformations by their a priori order and apply them in sequence, so that the effect is cumulative
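The method steps can be sketched as a short function. The data layout is an assumption: each transformation is a hypothetical `(priority, name, function)` tuple, where the priority stands in for the a priori order that the slides derive from experimental results. The lexicon input is omitted because it is needed for grading the answer, not for deforming the image.

```python
import random


def generate_h_captcha(image, transformations, rng, max_count=None):
    """Apply a random subset of transformations in a priori order.

    Returns the transformed image and the names of the applied
    transformations, in the order they were applied.
    """
    limit = max_count or len(transformations)
    # Step 1: randomly choose how many transformations to use.
    count = rng.randint(1, limit)
    # Step 2: randomly choose which ones.
    chosen = rng.sample(transformations, count)
    # Steps 3 and 4: sort by a priori order, then apply cumulatively.
    chosen.sort(key=lambda item: item[0])
    applied = []
    for _priority, name, function in chosen:
        image = function(image)
        applied.append(name)
    return image, applied
```

Sorting by a fixed priority matters because the transformations do not commute: for example, adding occlusions before or after a mirror operation gives different images.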


21/25

Testing Results on Machines

The accuracy of HR on images deformed using the Gestalt laws approach. The number of tested images is 4,127 for each type of transformation. HR running time increases from a few seconds per image for a lexicon of 4,000 entries to several minutes per image for a lexicon of 40,000 entries.

HW Recognizer                    WMR                  Accuscript
Lexicon Size                     4,000     40,000     4,000     40,000
Occlusion by circles             35.93%    20.28%     32.34%    17.37%
Vertical Overlap                 27.88%    14.36%     12.64%    3.94%
Horizontal Overlap (Small)       24.35%    10.70%     2.93%     0.60%
Black Waves                      16.36%    5.33%      1.57%     0.38%
Occlusion by waves               15.43%    7.00%      10.56%    4.28%
Horizontal Overlap (Large)       12.93%    3.56%      2.42%     0.36%
Overlap Different Words          3.80%     0.48%      4.43%     0.92%
Flip-Flop                        0.46%     0.14%      0.70%     0.19%
General Image Transformations    9.28%     N/A        4.41%     N/A


22/25


23/25

H-CAPTCHA Evaluation

No risk of image repetition

Image generation is completely automated: words, images, and distortions are chosen at random

The transformed images cannot be easily normalized or rendered noise-free by present computer programs, although the original images must be public knowledge

Deformed images do not pose problems to humans

Human subjects succeeded on our test images

Tested against the state of the art: Word Model Recognizer, Accuscript

CAPTCHAs unbroken by state-of-the-art recognizers


24/25

Future Work

Develop general methods to attack H-CAPTCHAs (e.g., pre- and post-processing techniques)

Research lexicon-free approaches for handwriting recognition

Quantify the gap between humans and machines in reading handwriting by category (of distortions and Gestalt laws)

Parameterize the difficulty levels of Gestalt based H-CAPTCHAs


25/25

Thank You

Questions?
