Concept Paper #5 - Sir Mike

8/11/2019 Concept Paper #5 - Sir Mike


CISSFIL: Conversion of Image to Speech System in Filipino

I. Summary

The purpose of this research is to develop an Android application that converts text in an image to speech in Filipino. The researchers will use Optical Character Recognition (OCR) to recognize the text in the image and speech synthesis to read aloud what is written. The input to the system is an image of text captured with the camera. The output is the spoken form of the words in that image.

II. Name and Contact Details

Name                          Student Number     Course, Year and Section
Bryan C. Condino              2011-07179-MN-0    BSCS 4-2
Justine Arielle A Estinopo    2011-04865-MN-0    BSCS 4-2
Erwin L. Praico               2011-05650-MN-0    BSCS 4-2
Jose Luis G. Salac            2011-01842-MN-0    BSCS 4-2

III. Introduction

1. Background of the Study

Filipino is the national language of the Philippines, sharing official status with the English language. As of 2007, Filipino is the first language of 28 million people, or about one-third of the Philippine population, while 45 million speak Filipino as their second language [1].

OCR is the acronym for Optical Character Recognition. This technology allows a machine to automatically recognize characters through an optical mechanism. OCR is the process of translating scanned images of typewritten text into machine-editable information [2].

Speech synthesis is the generation of human speech without directly using a human voice. Generally speaking, a speech synthesizer is software or hardware capable of rendering artificial speech, i.e., a symbol-to-signal generation system. Speech synthesis systems are often called text-to-speech (TTS) systems [3] in reference to their ability to convert text into speech. However, some systems can only render symbolic linguistic representations, such as phonetic transcriptions, into speech [4].

2. Problem Statement

Most students with visual impairments are constantly challenged by classroom instructional strategies. Although they can easily hear lectures and discussions, it can be difficult for them to access class syllabi, textbooks, overhead projector transparencies, maps, videos, written exams, demonstrations, and films [5].


IV. Literature Summary

The image-to-speech conversion system for the Telugu language developed by M. Nagami, S. Manoj, and S. Uday handles numeric digits only. The system takes an input image with a height of 32 pixels and a width of n pixels (n varies). At present, the system works only with the PNG and PGM formats, and it achieves 100% accuracy with the urmbookman font.
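The input constraints reported for the Telugu system can be expressed as a simple pre-check before recognition. The sketch below is illustrative only and is written in Python rather than as Android code; the function name `is_supported_input` and the constants are hypothetical and do not come from the cited work.

```python
from pathlib import Path

# Constraints as described in the literature summary (assumed exact).
SUPPORTED_EXTENSIONS = {".png", ".pgm"}
REQUIRED_HEIGHT = 32  # pixels; the width n may vary


def is_supported_input(filename: str, width: int, height: int) -> bool:
    """Return True if the image meets the stated format and size limits."""
    extension = Path(filename).suffix.lower()
    return (
        extension in SUPPORTED_EXTENSIONS
        and height == REQUIRED_HEIGHT
        and width > 0
    )
```

A check of this kind would let an application reject unsupported images early instead of passing them to the recognizer.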

According to Strangman & Dalton (2005), reading is arguably the most important academic skill for students to learn. In the Handbook of Special Education Technology Research and Practice, they explain that reading influences not only students' success in school but also their eventual opportunities for employment, civic contribution, and personal enrichment. In fact, the authors explain that for many students, poor reading achievement constitutes a major barrier to learning and opportunity.

V. Project Description

The project is an Android application in which users take a picture of the text they want the system to speak. The system takes as input an image containing numeric data, which is extracted as text using OCR technology. This text is then converted into speech by a speech synthesis tool that speaks the numeric content. The application can be helpful in games, toys, and many other settings, such as railways and aids for physically challenged persons.
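The flow described above (image, then OCR text, then speech) can be sketched as a small pipeline. This is a minimal, language-neutral illustration in Python, not the proposed Android implementation. The names `digits_to_filipino` and `image_to_speech` are hypothetical, the OCR and speech steps are passed in as placeholder functions, and reading a number digit by digit is an assumption made here for simplicity.

```python
from typing import Callable

# Filipino words for the digits 0-9.
FILIPINO_DIGITS = {
    "0": "sero", "1": "isa", "2": "dalawa", "3": "tatlo", "4": "apat",
    "5": "lima", "6": "anim", "7": "pito", "8": "walo", "9": "siyam",
}


def digits_to_filipino(text: str) -> str:
    """Spell each digit in `text` as a Filipino word; skip other characters."""
    return " ".join(FILIPINO_DIGITS[ch] for ch in text if ch in FILIPINO_DIGITS)


def image_to_speech(
    image: bytes,
    recognize: Callable[[bytes], str],  # placeholder for the OCR step
    speak: Callable[[str], None],       # placeholder for the speech step
) -> str:
    """Run OCR on the image, convert the digits to words, and speak them."""
    recognized_text = recognize(image)
    spoken_text = digits_to_filipino(recognized_text)
    speak(spoken_text)
    return spoken_text


# Usage with stand-in functions: the "OCR" always returns "2019",
# and "speaking" simply prints the words.
image_to_speech(b"", lambda image: "2019", print)
# prints: dalawa sero isa siyam
```

Passing the recognizer and synthesizer as parameters keeps the sketch independent of any particular OCR or speech library, so either step could be swapped without changing the pipeline.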

References

[1] Filipino language. (n.d.). Retrieved from Wikipedia: http://en.wikipedia.org/wiki/Filipino_language

[2] Ishitani, Y. (2001). "Model-based information extraction method tolerant of OCR errors for document images". Document Analysis and Recognition, 908-915.

[3] Tatham, M. (1996). "Improving text-to-speech synthesis". Fourth International Conference (pp. 3-6). ICSLP 96.

[4] Teaching Students with Visual Disabilities. (1999). Retrieved from umass.edu: http://www.umass.edu/complit/ogscl/visualdis.htm
