OCR and OCV
Tom Brennan
Artemis Vision
781 Vallejo St
Denver, CO 80204
(303)832-1111
About Us
• Machine Vision Integrator
– Turnkey Systems
• OEM Vision Software
– Work with camera partners and their clients
www.artemisvision.com
Tom [email protected]/pub/tom-brennan/1b/2b7/984/
OCR and OCV
• Considerations for Deployment
• OCR vs OCV
• Technical Challenges
– Pre-Processing
– Segmentation
– Recognition
Written Language and Machine Vision
• Written Human Language
– Highly varied:
• Character-based and letter-based scripts
• Fonts and Scripts
• Scale, Spacing, Directionality
• Machine Vision
– Doesn’t like variability:
• Difficult to test without stepping through examples
• Greater variability = greater costs
Barcodes vs Human Language
• Barcodes
– Highly regular
– Designed for Vision Readability
– Uniform global specifications
• Written Human Language
– Evolved over time
– Highly variable
– Many Languages, many fonts, many standards
OCR Applications
• Space or process constraints preclude barcode
• Human Readability Requirements
• Aesthetic concerns
• Too many legacy parts / labels in circulation
• Information cannot be readily barcoded (e.g. a labelled drawing or chart)
To OCR or Not to OCR?
• The barcode exists because OCR is difficult.
• OCR is hard enough that distorted text is used as a modern “Turing Test” (e.g. CAPTCHAs)
Hardware Setup
• Geometric Constraints
– Fixture text consistently in front of the camera
– Minimum 20x40 pixels per character
– Diffuse lighting – avoid hotspots – light scene evenly
– Correct for lens distortion, or prefer a longer focal length
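The 20x40 pixel minimum can be checked on paper before choosing a camera and lens. The following is a minimal sketch, assuming pixels are spread evenly over the field of view; the function names and the example sensor, field of view, and character sizes are hypothetical and not tied to any particular camera.

```python
def pixels_per_character(sensor_px, field_of_view_mm, char_size_mm):
    """Pixels spanned by one character along one axis (uniform-scale model)."""
    return sensor_px * char_size_mm / field_of_view_mm


def meets_minimum(sensor_wh_px, fov_wh_mm, char_wh_mm, min_wh_px=(20, 40)):
    """True if a character covers at least min_wh_px pixels (width, height)."""
    width_px = pixels_per_character(sensor_wh_px[0], fov_wh_mm[0], char_wh_mm[0])
    height_px = pixels_per_character(sensor_wh_px[1], fov_wh_mm[1], char_wh_mm[1])
    return width_px >= min_wh_px[0] and height_px >= min_wh_px[1]


# Hypothetical setup: 1280x960 sensor viewing a 100 mm x 75 mm area (12.8 px/mm).
print(meets_minimum((1280, 960), (100, 75), (2, 3)))  # 2 mm x 3 mm characters
print(meets_minimum((1280, 960), (100, 75), (2, 4)))  # 2 mm x 4 mm characters
```

Under these assumptions a 3 mm tall character spans about 38 pixels and falls just short of the minimum, while a 4 mm tall character spans about 51 pixels and passes.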
OCR Fonts
• OCR fonts minimize segmentation and recognition challenges
– OCR-A
• Characters evenly spaced
• Characters slightly modified to all look unique
• Used on Bank Checks
• OCR fonts are engineered for easy OCR
OCR vs OCV
• OCR – Optical Character Recognition
– Attempts to read text
• OCV – Optical Character Verification
– Verifies text conforms to a standard
– Helps diagnose printer problems
• Missing Lines
• Low contrast
OCV
• Typically verifies known text
• Difficult to combine OCV and OCR.
– Ambiguity: is it a “smudged” 6 or a “good” 8?
– Use OCV for lot code verification, expiration date verification, etc.
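Because OCV checks against text that is already known, it can be sketched as scoring an observed character against its expected template. This is an illustrative sketch only: the 5x7 bitmaps, the `match_score` and `verify` names, and the thresholds are assumptions, and a production OCV tool would use grayscale images and a richer quality measure.

```python
# Hypothetical 5x7 binary glyphs: '#' is ink, '.' is background.
EIGHT = [".###.", "#...#", "#...#", ".###.", "#...#", "#...#", ".###."]
SIX = [".###.", "#....", "#....", "####.", "#...#", "#...#", ".###."]


def match_score(observed, template):
    """Fraction of pixels that agree between two equal-sized glyphs."""
    pairs = [(o, t)
             for row_o, row_t in zip(observed, template)
             for o, t in zip(row_o, row_t)]
    return sum(o == t for o, t in pairs) / len(pairs)


def verify(observed, template, min_score=0.95):
    """OCV-style check: does the observed glyph conform to the expected one?"""
    return match_score(observed, template) >= min_score


print(match_score(SIX, EIGHT))            # a 6 differs from an 8 in only 3 pixels
print(verify(SIX, EIGHT))                 # strict threshold rejects it
print(verify(SIX, EIGHT, min_score=0.9))  # loose threshold accepts it
```

The last two lines show the tension noted above: a threshold loose enough to tolerate print defects will also accept a 6 where an 8 was expected.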
OCR Steps
• Pre-process
– Reduce background noise
– Improve characters
• Segment
– Locate and divide into characters
• Recognize
– Identify Specific Characters
Pre-Processing
• Reduce Noise
– Erosion and Dilation
– Adaptive Thresholding
– Blur and sharpen
• Improve Character Consistency
– Compute Skeletons
– Compute Stroke Width
– Prune
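Adaptive thresholding compares each pixel with its local neighborhood instead of one global cutoff, which helps under uneven lighting. The sketch below is a plain-Python local-mean variant written for illustration; the function name, window radius, and offset are assumptions, and real systems would normally use an optimized library routine.

```python
def adaptive_threshold(image, radius=1, offset=10):
    """Mark a pixel as text (1) if it is darker than its local mean by `offset`.

    `image` is a list of rows of grayscale values (0 = black, 255 = white).
    The window is clipped at the image border.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if image[y][x] < local_mean - offset else 0
    return out


# Background brightens from left (100) to right (200); two dark strokes at x=1, x=4.
image = [[100 + 20 * x for x in range(6)] for _ in range(3)]
for row in image:
    row[1] -= 60
    row[4] -= 60

for row in adaptive_threshold(image):
    print(row)
```

In this example the stroke at x=4 (value 120) is brighter than the background at x=0 (value 100), so no single global threshold separates both strokes from the background, while the local comparison does.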
Noise Reduction Techniques
• Dilation
– Expansion of light colored areas
• Erosion
– Shrinking of light colored areas
[Figure: original, dilated, and eroded versions of a sample image]
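Dilation and erosion can be sketched as 3x3 maximum and minimum filters. This is a minimal plain-Python illustration, assuming light pixels have higher values and that the window is clipped at the image border; the helper names are hypothetical.

```python
def _neighborhood_filter(image, pick):
    """Apply `pick` (max or min) over each pixel's 3x3 neighborhood."""
    h, w = len(image), len(image[0])
    return [[pick(image[j][i]
                  for j in range(max(0, y - 1), min(h, y + 2))
                  for i in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)]
            for y in range(h)]


def dilate(image):
    """Expand light areas: each pixel becomes the brightest of its neighbors."""
    return _neighborhood_filter(image, max)


def erode(image):
    """Shrink light areas: each pixel becomes the darkest of its neighbors."""
    return _neighborhood_filter(image, min)


# A single light speck on a dark 5x5 background.
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1

for row in dilate(speck):
    print(row)          # the speck grows into a 3x3 block
for row in dilate(erode(speck)):
    print(row)          # erode-then-dilate removes the speck entirely
```

Eroding and then dilating (often called opening) removes small light specks while roughly preserving larger light regions; the reverse order fills small dark holes.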
Character Consistency
• Skeleton
– All points equidistant from at least 2 edges
– Think “start a fire on the boundary, where fires meet, draw a point”
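One common way to approximate a skeleton on a pixel grid is iterative thinning. The sketch below uses Zhang-Suen thinning, which is a swapped-in technique for illustration and not an exact medial axis; it assumes a binary image (1 = stroke) with at least a one-pixel background border.

```python
def thin(image):
    """Zhang-Suen thinning of a binary image given as a list of 0/1 rows."""
    img = [row[:] for row in image]
    h, w = len(img), len(img[0])

    def neighbors(y, x):
        # Clockwise from north: N, NE, E, SE, S, SW, W, NW
        return [img[y - 1][x], img[y - 1][x + 1], img[y][x + 1], img[y + 1][x + 1],
                img[y + 1][x], img[y + 1][x - 1], img[y][x - 1], img[y - 1][x - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if not img[y][x]:
                        continue
                    p = neighbors(y, x)
                    ink = sum(p)
                    # Number of 0 -> 1 transitions around the pixel.
                    transitions = sum(p[i] == 0 and p[(i + 1) % 8] == 1
                                      for i in range(8))
                    n, e, s, west = p[0], p[2], p[4], p[6]
                    if step == 0:
                        edge_ok = n * e * s == 0 and e * s * west == 0
                    else:
                        edge_ok = n * e * west == 0 and n * s * west == 0
                    if 2 <= ink <= 6 and transitions == 1 and edge_ok:
                        to_clear.append((y, x))
            # Clear after the scan so every decision uses the same image state.
            for y, x in to_clear:
                img[y][x] = 0
            changed = changed or bool(to_clear)
    return img


# A horizontal stroke 3 pixels thick inside a 5x9 image.
bar = [[1 if 1 <= y <= 3 and 1 <= x <= 7 else 0 for x in range(9)]
       for y in range(5)]
for row in thin(bar):
    print(row)
```

For the thick bar, the result is a one-pixel-wide line along the middle row, which gives later steps a stroke representation that does not depend on print thickness.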
Locating “Text”
• Easy for people.
• Can be a challenge for software.
– Logos
– Symbols
– Lines
• OCR applications will work best when text is consistently located.
Segmentation
• Splitting Text into Discrete Characters
• Critical to accurate OCR
• Issues
– Not all characters are the same width
– Not all characters can be split with vertical lines due to skew
– Sometimes characters touch
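The simplest segmentation baseline splits a binarized text line wherever a column contains no ink. The sketch below is an illustration of that baseline, not a method named in this talk; `split_columns` and the tiny bitmaps are hypothetical. It copes with characters of different widths but, as the issues above suggest, fails when characters are skewed or touch.

```python
def split_columns(binary):
    """Return (start, end) column spans, end exclusive, that contain ink.

    `binary` is a list of equal-length strings where '#' is ink.
    """
    width = len(binary[0])
    has_ink = [any(row[x] == "#" for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                     # a character begins
        elif not ink and start is not None:
            spans.append((start, x))      # a blank column ends it
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


# A narrow "I" followed by a wide "M", separated by two blank columns.
separated = ["#..#...#",
             "#..##.##",
             "#..#.#.#"]
# The same glyphs printed so close that they touch.
touching = ["##...#",
            "###.##",
            "##.#.#"]

print(split_columns(separated))
print(split_columns(touching))
```

The first line yields two spans of different widths; the second yields a single span, so touching characters need a smarter split, for example one guided by stroke width or a search for a low-ink path between the glyphs.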
Segmentation
• Adaptive Thresholding
• Detect Corners
• Estimate Stroke Width
• Edge detection
• Path detection
Recognition
• Can be easier than Locating and Segmenting
• However
– Similar Characters:
• l, 1, I, i, 7, /, \ , (, )
• B, D, 8, 6, 9, S, Z, R, P
– Handwriting vs Type
– Scale and Orientation (Document Scan vs. Package on Conveyor)
Recognition Strategies
• Pattern Matching Techniques
– Match the actual image pattern
– Can be problematic on large character sets
• Artificial Intelligence Techniques
– Extract Features from the image
– Learn rules for features
– Neural nets, SVMs, kNN, AdaBoost, etc.
– Tesseract uses a feature distance method
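The feature-based approach can be sketched as: extract a small feature vector from each glyph, then pick the known character whose features are closest. The example below uses ink counts per zone and a nearest-neighbor rule; the 4x4 glyphs, the zoning features, and the function names are illustrative assumptions and do not reproduce how Tesseract or any other engine works.

```python
# Hypothetical 4x4 training glyphs: '#' is ink.
TEMPLATES = {
    "L": ["#...", "#...", "#...", "####"],
    "7": ["####", "...#", "..#.", ".#.."],
    "I": [".##.", ".##.", ".##.", ".##."],
}


def zone_features(glyph, zones=2):
    """Ink count in each cell of a zones x zones grid laid over the glyph."""
    h, w = len(glyph), len(glyph[0])
    counts = [0] * (zones * zones)
    for y, row in enumerate(glyph):
        for x, ch in enumerate(row):
            if ch == "#":
                counts[(y * zones // h) * zones + (x * zones // w)] += 1
    return counts


def classify(glyph, templates=TEMPLATES):
    """Return the label whose features have the smallest squared distance."""
    feats = zone_features(glyph)

    def distance(label):
        ref = zone_features(templates[label])
        return sum((a - b) ** 2 for a, b in zip(feats, ref))

    return min(templates, key=distance)


# An "L" with one stray ink pixel still lands nearest the "L" template.
noisy_l = ["#...", "#...", "#..#", "####"]
print(zone_features(noisy_l), classify(noisy_l))
```

Compared with matching raw pixels, a feature vector is smaller and tolerates some noise, but the choice of features decides which similar characters can still be told apart.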
Conclusions
• General Purpose OCR is challenging
• Consider shortcuts to make OCR easier
– Context?
– Character number known?
– Character size known?
– Font known? Can we train on that font?
– Eliminate hotspots, distortion
– Locate text consistently, control scale, orientation
– Preprocess to improve image / characters
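Context and a known character count can be applied after recognition to resolve look-alike characters. The sketch below assumes a hypothetical code format given as a pattern of "L" (letter) and "D" (digit) positions; the substitution tables and the `apply_format` name are illustrative and would need tuning for a real font and character set.

```python
# Look-alike substitutions, applied only where the format demands that class.
TO_DIGIT = {"O": "0", "I": "1", "l": "1", "S": "5", "B": "8", "Z": "2"}
TO_LETTER = {"0": "O", "1": "I", "5": "S", "8": "B", "2": "Z"}


def apply_format(raw, fmt):
    """Correct an OCR result against a known format, or return None.

    `fmt` has one symbol per character: 'L' for a letter, 'D' for a digit.
    """
    if len(raw) != len(fmt):
        return None                      # character count is known, so reject
    corrected = []
    for ch, kind in zip(raw, fmt):
        if kind == "D":
            ch = TO_DIGIT.get(ch, ch)
            if not ch.isdigit():
                return None
        elif kind == "L":
            ch = TO_LETTER.get(ch, ch)
            if not ch.isalpha():
                return None
        corrected.append(ch)
    return "".join(corrected)


# Hypothetical lot code format: two letters followed by four digits.
print(apply_format("A8O1S7", "LLDDDD"))
print(apply_format("AB12", "LLDDDD"))
```

The first call repairs three confusable characters using position alone; the second is rejected because the character count does not match, which is usually safer than guessing.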
Questions?