Abstract

Of the 314 million visually impaired people worldwide, 45 million are blind. Even in a developed country like the U.S., the 2008 National Health Interview Survey reported that an estimated 25.2 million adult Americans (over 8%) are blind or visually impaired. The ability to read is essential for getting by in today's society. This work presents a camera-based assistive text reading framework that helps blind persons read text labels and product packaging on hand-held objects in their daily lives, and simultaneously assists navigation using the signs found in virtually every public and private place. To isolate the object from cluttered backgrounds or other surrounding objects in the camera view, an efficient and effective motion-based method first defines a region of interest (ROI) in the video by asking the user to shake the object.

The system extracts the moving object region from the background. Within the extracted ROI, text localization and recognition are conducted to acquire text information. To automatically localize text regions within the object ROI, a novel text localization algorithm learns gradient features of stroke orientations and distributions of edge pixels. Text characters in the localized text regions are then binarized and recognized. Experimental results demonstrate that the algorithm achieves state-of-the-art performance. User interface issues are discussed, and the robustness of the algorithm in extracting and reading text from different objects with complex backgrounds is assessed. The output component informs the blind user of the recognized text in the form of speech or audio.
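The shake-based ROI idea can be sketched with simple frame differencing: pixels that change frequently while the user shakes the object belong to the object, while the static background does not. The following is a minimal sketch of that intuition; the threshold and voting rule are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def motion_roi(frames, diff_thresh=25):
    """Estimate a region of interest from a short grayscale clip in which
    the user shakes the hand-held object. Returns a bounding box
    (x_min, y_min, x_max, y_max) or None if no motion is found."""
    # Count, per pixel, how often the frame-to-frame difference is large:
    # the shaken object changes often, the static background does not.
    acc = np.zeros(frames[0].shape, dtype=np.float64)
    for prev, curr in zip(frames, frames[1:]):
        acc += np.abs(curr.astype(np.int16) - prev.astype(np.int16)) > diff_thresh
    # Keep pixels that moved in at least half of the frames (assumed rule).
    ys, xs = np.nonzero(acc >= len(frames) // 2)
    if ys.size == 0:
        return None
    # The bounding box of the frequently-moving pixels is the candidate ROI.
    return (xs.min(), ys.min(), xs.max(), ys.max())
```

In a real deployment the frames would come from the sunglasses-mounted camera; here any stack of same-sized grayscale arrays works.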
2. Literature Review

1. Chucai Yi, Yingli Tian and Aries Arditi, "Portable Camera-Based Assistive Text and Product Label Reading From Hand-Held Objects for Blind Persons" (2014): This is the main paper of this research. It focuses on how recent developments in computer vision, digital cameras, and portable computers make it feasible to assist visually impaired individuals by developing camera-based products that combine computer vision technology with other existing commercial products such as optical character recognition (OCR) systems. The corresponding feature maps estimate the global structural feature of text at every pixel. Adjacent character grouping is performed to calculate candidate text patches prepared for text classification. An Adaboost learning model is employed to localize text in camera-based images. Off-the-shelf OCR is used to perform word recognition on the localized text regions and transform the results into audio output for blind users.

2. Karen Duarte, José Cecílio, Jorge Sá Silva and Pedro Furtado, "Information and Assisted Navigation System for Blind People" (2014): The system presented in this paper aims to complement the user's own devices by integrating with technologies already in use, such as the user's smartphone. The location system is therefore built on Bluetooth technology, which is present in most mobile phones. Once the environment is equipped with sufficient sensors, the system is able to locate the user and send instructions that lead to the desired destination. Another important feature is the accessible information system, which allows the user to receive information about available stores, services, or spaces.

3. Ender Tekin and James M. Coughlan, "An algorithm enabling blind users to find and read barcodes" (2009): This paper argues that the ability of people who are blind or have significant visual impairments to read printed labels and product packages enhances independent living and fosters economic and social self-sufficiency.

4. Sunil Kumar, Rajat Gupta, Nitin Khanna, Santanu Chaudhury and Shiv Dutt Joshi, "Text Extraction and Document Image Segmentation Using Matched Wavelets and MRF Model" (2007): This paper proposes a scheme for extracting textual areas of an image using globally matched wavelet filters. A clustering-based technique is devised for estimating globally matched wavelet filters from a collection of ground-truth images, together with a text extraction scheme for segmenting document images into text, background, and picture components.

5. Kwang In Kim, Keechul Jung and Jin Hyung Kim, "Texture-Based Approach for Text Detection in Images Using Support Vector Machines and Continuously Adaptive Mean Shift Algorithm" (2003): This paper presents a texture-based method for detecting text in images. A support vector machine (SVM) is used to analyze the textural properties of text. No external texture feature extraction module is used; rather, the intensities of the raw pixels that make up the textural pattern are fed directly to the SVM, which works well even in high-dimensional spaces.
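The key idea in the last entry, feeding raw pixel intensities of a window directly to a classifier instead of hand-crafted texture features, can be sketched as follows. A simple perceptron-trained linear classifier stands in for the actual SVM so the example stays self-contained; the window size, learning rate, and training rule are assumptions, not values from the cited paper.

```python
import numpy as np

def train_linear_classifier(windows, labels, epochs=50, lr=0.1):
    """Toy stand-in for the paper's SVM: a linear classifier trained
    directly on raw pixel intensities of fixed-size windows, using
    perceptron updates rather than a real SVM solver."""
    X = np.array([w.ravel() / 255.0 for w in windows])  # raw pixels as features
    y = np.where(np.array(labels) > 0, 1, -1)           # text = +1, non-text = -1
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:  # misclassified -> nudge the hyperplane
                w += lr * yi * xi
                b += lr * yi
    return w, b

def is_text_window(window, w, b):
    """Classify a pixel window as text-like (True) or not (False)."""
    return (window.ravel() / 255.0) @ w + b > 0
```

High-contrast striped windows (text-like strokes) separate easily from flat background windows even with raw pixels as the feature vector, which is the point the paper makes about SVMs in high-dimensional spaces.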
3. Problem Statement

In existing assistive reading systems for blind persons, it is very challenging for users to position the object of interest within the center of the camera's view. As of now, there is still no acceptable solution for locating the exact position of a bar code on a product. This problem is approached in stages. To ensure the hand-held object appears in the camera view, a camera with a sufficiently wide angle is used to accommodate users with only approximate aim. At the same time, the system provides directional guidance to the blind person with the help of sign-based navigation.

The wide view may often result in other text objects appearing in the camera's view (for example, while shopping at a supermarket). To extract the hand-held object from the camera image, this system develops a motion-based method to obtain a region of interest (ROI) of the object with proper text recognition, and assists the blind person with good-quality audio as the system's output.
4. System Architecture
4.1 Description
The system architecture consists of three functional components: scene capture, data processing, and audio output. The scene capture component collects scenes containing objects of interest in the form of images or video; in our prototype, it corresponds to a camera attached to a pair of sunglasses. The data processing component deploys our proposed algorithms, including 1) object-of-interest detection to selectively extract the image of the object held by the blind user from the cluttered background or other neutral objects in the camera view, and 2) text localization to obtain image regions containing text, followed by text recognition to transform image-based text information into readable codes. We use a laptop as the processing device in our current prototype system. The audio output component informs the blind user of the recognized text codes.
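The three-component pipeline above can be sketched as a thin wiring layer. The stage implementations (camera capture, ROI detection, OCR, speech synthesis) are placeholders to be supplied by the caller; this sketch only shows how the components compose, not any actual vision or audio code.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class AssistiveReader:
    """Minimal sketch of the scene-capture -> data-processing -> audio-output
    architecture. Every stage is an injected callable (an assumption of this
    sketch), so real camera/OCR/TTS backends can be dropped in later."""
    capture: Callable[[], object]                    # scene capture (camera frame)
    detect_roi: Callable[[object], object]           # object-of-interest detection
    localize_text: Callable[[object], List[object]]  # text region localization
    recognize: Callable[[object], str]               # OCR on each text region
    speak: Callable[[str], None]                     # audio output for the user

    def run_once(self) -> List[str]:
        # One pass of the pipeline: capture a frame, isolate the hand-held
        # object, find its text regions, recognize each, and speak the result.
        frame = self.capture()
        roi = self.detect_roi(frame)
        words = [self.recognize(region) for region in self.localize_text(roi)]
        for word in words:
            self.speak(word)
        return words
```

In the prototype described above, `capture` would wrap the sunglasses camera, `recognize` an off-the-shelf OCR engine, and `speak` a text-to-speech library running on the laptop.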
5. Possible Contribution
The algorithm used in previous paper can handle complex background and
multiple patterns, and extract text information from hand-held objects. In assistive
reading systems for blind persons, it is very challenging for users to position the object of
interest within the center of the camera’s view. As of now, there are still no acceptable
solutions.
In this system the previous drawback of algorithm can be minimized and divided
the problem in stages. To make sure the hand-held object appears in the camera view, a
camera with sufficiently wide angle to accommodate users with only approximate aim.
This may often result in other text objects appearing in the camera’s view (for example,
while shopping at a supermarket). To extract the hand-held object from the camera
image, a motion-based method to obtain a region of interest (ROI) of the object is used.
It is a challenging problem to automatically localize objects and text ROIs from
captured images with complex backgrounds, because text in captured images is most
likely surrounded by various background outlier “noise,” and text characters usually
appear in multiple scales, fonts, and colors. For the text orientations, algorithm used in
the previous paper assumes that text strings in scene images keep approximately
horizontal alignment but that drawback of algorithm will overcome by algorithm which is
best suitable. Many algorithms have been developed for localization of text regions in
scene images.
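One ingredient of the gradient-based text localization mentioned above, a distribution of gradient orientations over edge pixels, can be illustrated as follows. The edge threshold and bin count are arbitrary assumptions for the sketch, not values from the cited algorithm, and a full localizer would combine many such features over candidate regions.

```python
import numpy as np

def stroke_orientation_histogram(gray, edge_thresh=50, bins=8):
    """Histogram of gradient orientations over edge pixels: a crude
    illustration of learning stroke-orientation statistics for text
    localization. Returns a normalized `bins`-length distribution."""
    g = gray.astype(np.float64)
    gy, gx = np.gradient(g)            # simple finite-difference gradients
    mag = np.hypot(gx, gy)             # gradient magnitude
    edges = mag > edge_thresh          # crude edge map (assumed threshold)
    # Orientation of the gradient at each edge pixel, in (-pi, pi].
    theta = np.arctan2(gy[edges], gx[edges])
    hist, _ = np.histogram(theta, bins=bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)   # normalize; guard the no-edge case
```

Text strokes produce strongly peaked orientation histograms (strokes have two dominant edge directions), whereas natural background clutter spreads its mass across bins, which is what makes such features discriminative.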
6. Time Schedule

This is the tentative time for our plan of action, spanning July 2015 to June 2016:

- Studying and analyzing different Data Stream algorithms and techniques
- Studying literature regarding the project
- Designing the algorithm for the dynamicity of the privacy system
- Start implementing Project Phase I
- Phase II
- Phase III
- Phase IV
- Testing
- Thesis preparation

Phase I: Design of System Architecture.
Phase II: Implementation of Algorithms.
Phase III: Verifications and Designs.
Phase IV: Building Real Time System.
7. Conclusion

This paper has introduced a method for reading printed text on hand-held objects to assist blind persons. In order to solve the common aiming problem for blind users, a motion-based method is proposed to detect the object of interest while the blind user simply shakes the object for a couple of seconds. This method can effectively distinguish the object of interest from the background or other objects in the camera view. An Adaboost learning model is employed to localize text in camera-based images. Off-the-shelf OCR is used to perform word recognition on the localized text regions and transform the results into audio output for blind users.
References

[1] Chucai Yi, Yingli Tian and Aries Arditi, "Portable Camera-Based Assistive Text and Product Label Reading From Hand-Held Objects for Blind Persons", IEEE/ASME Transactions on Mechatronics, vol. 19, no. 3, June 2014.
[2] Karen Duarte, José Cecílio, Jorge Sá Silva and Pedro Furtado, "Information and Assisted Navigation System for Blind People", Proceedings of the 8th International Conference on Sensing Technology, Sep. 2–4, 2014, Liverpool, UK.
[3] Sunil Kumar, Rajat Gupta, Nitin Khanna, Santanu Chaudhury and Shiv Dutt Joshi, "Text Extraction and Document Image Segmentation Using Matched Wavelets and MRF Model", IEEE Transactions on Image Processing, vol. 16, no. 8, August 2007.
[4] Kwang In Kim, Keechul Jung and Jin Hyung Kim, "Texture-Based Approach for Text Detection in Images Using Support Vector Machines and Continuously Adaptive Mean Shift Algorithm", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 12, December 2003.
[5] Advance Data Reports from the National Health Interview Survey (2008). [Online]. Available: http://www.cdc.gov/nchs/nhis/nhis_ad.htm
[6] B. Epshtein, E. Ofek, and Y. Wexler, "Detecting text in natural scenes with stroke width transform," in Proc. Comput. Vision Pattern Recognition, 2010, pp. 2963–2970.
[7] C. Yi and Y. Tian, "Assistive text reading from complex background for blind persons," in Proc. Int. Workshop Camera-Based Document Anal. Recognit., 2011, vol. LNCS-7139, pp. 15–28.
[8] C. Yi and Y. Tian, "Text string detection from natural scenes by structure based partition and grouping," IEEE Trans. Image Process., vol. 20, no. 9, pp. 2594–2605, Sep. 2011.
[9] International Workshop on Camera-Based Document Analysis and Recognition (CBDAR 2005, 2007, 2009, 2011). [Online]. Available: http://www.m.cs.osakafuu.ac.jp/cbdar2011
Submitted By:
Mr. Vivek R. Chamorshikar, M-Tech III Sem, Department of CSE

Under Guidance of:
Asst. Prof. Saiyad Sharik Kaji, Asst. Professor, Department of CSE

Asst. Prof. Garima Singh Makhija, Head of CSE, WCEM
Mr. Fazeel I. Z. Qureshi, Asst. Professor, Project Coordinator, Department of CSE