Text Extraction From Image
Prepared By: Amit Bhoraniya (7022), Kaushik Godhani (7009), Mayur Halai (7016), Vikram Ghunsar (7039)
Guided By: Mr. Udesang Jaliya, Mr. Kirti Sharma
What Is Text Extraction?
Text extraction is the process of converting a printed document, scanned page, or image containing text into ASCII characters that a computer can recognize.
Goal Of Project
[Image text: GENERAL APTITUDE, Computer Science, Electronics & Communication Engineering]
How Will We Achieve That Goal?
1. Preprocessing
2. Segmentation
3. Recognition
1 Pre-Processing
Pre-Processing
1. Gray Scale
2. Noise Removal
3. Thresholding
Gray Scale
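The slides do not say how the gray-scale image is computed. A minimal sketch, assuming the common luminosity weights (0.299, 0.587, 0.114) and an image stored as rows of (r, g, b) tuples; the function name `to_grayscale` is hypothetical:

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of 0-255 gray levels.

    Assumption: luminosity weights 0.299 / 0.587 / 0.114; the slides do not
    state which conversion was actually used.
    """
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb_image
    ]


# Tiny 2 x 2 colour image: white, black, pure red, pure blue.
sample = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(sample)
```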
Noise Removal
Noise removal is used to enhance the image. For enhancement we used a median filter.
FilteredImage = MedianFilter(OriginalImage, FilterSize)
We used FilterSize [5,5].
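The median filter step can be sketched as follows. This is an illustration in plain Python, not the project's code. The function name `median_filter` and the border handling (edge pixels are repeated) are assumptions, while the 5 x 5 window matches the FilterSize given above.

```python
from statistics import median


def median_filter(image, size=5):
    """Replace each pixel with the median of its size x size neighbourhood.

    Assumption: pixels outside the image are taken from the nearest edge
    pixel; the slides do not say how borders were handled.
    """
    height, width = len(image), len(image[0])
    half = size // 2
    filtered = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-half, half + 1)
                for dx in range(-half, half + 1)
            ]
            row.append(median(window))
        filtered.append(row)
    return filtered


# A flat 7 x 7 image with one bright "salt" pixel in the middle.
noisy = [[100] * 7 for _ in range(7)]
noisy[3][3] = 255
cleaned = median_filter(noisy, size=5)
```

Because a single outlier never reaches the middle of a sorted 25-value window, the isolated bright pixel is removed while flat regions stay unchanged.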
Thresholding
• Edge Detection
• Dilate Image
• Detect Text Area Using Histogram
• Apply Personal Thresholding to Text Area
Edge Detection using Canny
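Canny combines Gaussian smoothing, gradient computation, non-maximum suppression, and hysteresis thresholding. The sketch below is not Canny. It swaps in only the gradient stage, using Sobel kernels and a single threshold, to show how edge pixels are marked. The function name `sobel_edges` and the threshold of 100 are assumptions.

```python
def sobel_edges(gray, threshold=100):
    """Mark pixels whose Sobel gradient magnitude exceeds a threshold.

    Simplification: this is only the gradient step of an edge detector,
    not the full Canny algorithm named on the slide. Border pixels are
    left as 0. The threshold value is an assumption.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[y][x] = 1
    return edges


# Dark left half, bright right half: the edge lies between columns 2 and 3.
step = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edge_map = sobel_edges(step)
```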
Dilate
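Dilation grows every foreground pixel into its neighbourhood, so nearby edge fragments of a text block merge into one region. A minimal sketch, assuming a binary image of 0/1 values and a square 3 x 3 structuring element. The slides do not give the element that was used, and the name `dilate` is hypothetical.

```python
def dilate(binary, size=3):
    """Binary dilation with a size x size square structuring element.

    Assumption: a square element of side 3; the actual element used in the
    project is not stated on the slides.
    """
    height, width = len(binary), len(binary[0])
    half = size // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x]:
                # Switch on every pixel covered by the element centred here.
                for ny in range(max(0, y - half), min(height, y + half + 1)):
                    for nx in range(max(0, x - half), min(width, x + half + 1)):
                        out[ny][nx] = 1
    return out


single = [[0] * 5 for _ in range(5)]
single[2][2] = 1
grown = dilate(single)
```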
Text Area Using Histogram
Algorithm
• Compute the row histogram
• Separate regions by (no. of pixels > 60)
• For each row
– Separate regions by (no. of pixels > (height of row)/4)
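The algorithm above can be sketched as two passes over a binary image: a row histogram to find horizontal bands, then a column histogram inside each band. The sketch assumes that "no. of pixels" counts foreground pixels and that the second threshold means a quarter of the band's height. The names `runs_above` and `find_text_areas` are hypothetical.

```python
def runs_above(counts, threshold):
    """Return (start, end) index pairs, end exclusive, where counts > threshold."""
    runs, start = [], None
    for i, count in enumerate(counts):
        if count > threshold and start is None:
            start = i
        elif count <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(counts)))
    return runs


def find_text_areas(binary, row_threshold=60):
    """Return (top, bottom, left, right) boxes, bottom and right exclusive."""
    row_hist = [sum(row) for row in binary]
    areas = []
    for top, bottom in runs_above(row_hist, row_threshold):
        band = binary[top:bottom]
        col_hist = [sum(col) for col in zip(*band)]
        # Assumption: the column threshold is a quarter of the band height.
        for left, right in runs_above(col_hist, (bottom - top) / 4):
            areas.append((top, bottom, left, right))
    return areas


# 20 x 200 test image with two blocks of foreground pixels on rows 5..12.
page = [[0] * 200 for _ in range(20)]
for y in range(5, 13):
    for x in list(range(10, 90)) + list(range(120, 190)):
        page[y][x] = 1
areas = find_text_areas(page)
```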
2 Segmentation
Segmentation
1. Line Segmentation
2. Word Segmentation
3. Character Segmentation
The image above is segmented into separate lines. Below is an example for a single line.
TEXT SEGMENTATION
Find all the words, then convert the text area into a single image.
Segmentation
Characters are separated from each word.
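The slides do not say how characters are cut out of a word. One possible approach, shown here only as an assumption, is connected-component labelling. Each group of touching foreground pixels becomes one character box. The name `character_boxes` is hypothetical.

```python
def character_boxes(word):
    """Return (top, bottom, left, right) boxes, one per connected component.

    Assumption: characters do not touch each other, and 8-connectivity is
    used. Bottom and right are exclusive; boxes are ordered left to right.
    """
    height, width = len(word), len(word[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if word[y][x] and not seen[y][x]:
                seen[y][x] = True
                stack = [(y, x)]
                top, bottom, left, right = y, y, x, x
                while stack:  # flood fill from the first pixel found
                    cy, cx = stack.pop()
                    top, bottom = min(top, cy), max(bottom, cy)
                    left, right = min(left, cx), max(right, cx)
                    for ny in range(max(0, cy - 1), min(height, cy + 2)):
                        for nx in range(max(0, cx - 1), min(width, cx + 2)):
                            if word[ny][nx] and not seen[ny][nx]:
                                seen[ny][nx] = True
                                stack.append((ny, nx))
                boxes.append((top, bottom + 1, left, right + 1))
    return sorted(boxes, key=lambda box: box[2])


# Two small glyphs separated by an empty column.
word = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
]
boxes = character_boxes(word)
```

This approach splits characters made of several parts (such as the dot of an "i") and merges characters that touch, so a real system would need extra rules for those cases.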
3 Recognition
Recognition
1. Feature Extraction
2. Classifier
3. Text Document
• Feature Extraction
– Binary Code Method
– Chain Code Method
– PCA (Principal Component Analysis)
– LDA (Linear Discriminant Analysis)
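As an illustration of the chain code method, the sketch below encodes an already-traced character boundary as Freeman 8-direction codes. Tracing the boundary itself is assumed to have been done elsewhere. The names `DIRECTIONS` and `chain_code` are hypothetical, and the numbering (0 = right, counted counter-clockwise) is the usual Freeman convention rather than something stated on the slides.

```python
# Freeman 8-direction codes keyed by (row step, column step).
# Rows grow downward, so "up" is a row step of -1.
DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def chain_code(boundary):
    """Encode an ordered list of (row, col) boundary points as Freeman codes.

    Assumption: consecutive points are 8-neighbours; any other step raises
    a KeyError.
    """
    return [
        DIRECTIONS[(r1 - r0, c1 - c0)]
        for (r0, c0), (r1, c1) in zip(boundary, boundary[1:])
    ]


# Closed boundary of a 2 x 2 block: right, down, left, up.
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
codes = chain_code(square)
```

The resulting code sequence, or a histogram of how often each direction occurs, can then be passed to the classifier as a feature vector.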
• Classifier
– Artificial Neural Network
– Support Vector Machine
Recognition
Applications
• Banking (to read credit cards)
• Libraries (to convert scanned pages to text)
• Govt. sector (form processing)
• Car number plate recognition systems
• Undesirable text removal from images
Thank You