Concept Paper #5 - Sir Mike


  • 8/11/2019 Concept Paper #5 - Sir Mike


    CISSFIL: Conversion of Image to Speech System in Filipino

    I. Summary

    The purpose of this research is to develop an Android application that converts text in an
    image to speech in Filipino. The researchers will use Optical Character Recognition (OCR) to
    recognize the text in the image and speech synthesis to speak what is written in it. The input
    to the system is an image of text captured by the camera. The output is the spoken form of the
    words in the input image.

    II. Name and Contact Details

    Name                         Student Number     Course, Year and Section
    Bryan C. Condino             2011-07179-MN-0    BSCS 4-2
    Justine Arielle A Estinopo   2011-04865-MN-0    BSCS 4-2
    Erwin L. Praico              2011-05650-MN-0    BSCS 4-2
    Jose Luis G. Salac           2011-01842-MN-0    BSCS 4-2

    III. Introduction

    1. Background of the Study

    Filipino is the national language of the Philippines, sharing official status with the
    English language. As of 2007, Filipino is the first language of 28 million people, or about
    one-third of the Philippine population, while 45 million speak Filipino as their second
    language [1].

    OCR is the acronym for Optical Character Recognition. This technology allows a machine to
    automatically recognize a character through an optical mechanism. OCR is the process of
    translating scanned images of typewritten text into machine-editable information [2].

    Speech synthesis is the generation of human speech without directly using a human voice.
    Generally speaking, a speech synthesizer is software or hardware capable of rendering
    artificial speech, that is, a symbol-to-signal generation system. Speech synthesis systems are
    often called text-to-speech (TTS) systems [3] in reference to their ability to convert text
    into speech. However, some systems can only render symbolic linguistic representations, such
    as phonetic transcriptions, into speech [4].

    2. Problem Statement

    Most students with visual impairments are constantly challenged by classroom instructional
    strategies. Although they can easily hear lectures and discussions, it can be difficult for
    them to access class syllabi, textbooks, overhead projector transparencies, maps, videos,
    written exams, demonstrations, and films [5].


    IV. Literature Summary

    The image-to-speech conversion system for the Telugu language by M. Nagami, S. Manoj, and
    S. Uday was developed for numeric digits only. The system takes an input image that is
    32 pixels high and n pixels wide (n varies). At present, the system works only with the png
    and pgm formats, and it gives 100% accuracy with the urmbookman font.
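
    The input constraints described above can be illustrated with a small check. The sketch
    below is not taken from the cited work. It only assumes that an input is acceptable when
    its file extension is png or pgm and its height is 32 pixels, with any positive width. The
    function name and its parameters are hypothetical.

```python
from pathlib import Path

# Constraints as stated in the text: fixed height, free width, two formats.
REQUIRED_HEIGHT = 32
SUPPORTED_EXTENSIONS = {".png", ".pgm"}


def is_supported_input(filename: str, width: int, height: int) -> bool:
    """Return True if an image meets the stated input constraints.

    Hypothetical helper: the caller is assumed to already know the
    image's pixel dimensions (for example, from an image library).
    """
    extension = Path(filename).suffix.lower()
    return (
        extension in SUPPORTED_EXTENSIONS
        and height == REQUIRED_HEIGHT
        and width > 0  # n varies, but must be a positive number of pixels
    )
```

    A check of this kind would let an application reject unsupported images before any
    recognition is attempted.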

    According to Strangman & Dalton (2005), reading is arguably the most important academic
    skill for students to learn. In the Handbook of Special Education Technology Research and
    Practice, they explain that reading influences not only students' success in school but also
    their eventual opportunities for employment, civic contribution, and personal enrichment. In
    fact, the authors explain that for many students, poor reading achievement constitutes a
    major barrier to learning and opportunity.

    V. Project Description

    The project is an Android application in which a user takes a picture of the text they want
    the system to speak. The system takes as input an image containing numeric data, which is
    extracted as text using OCR technology. This text is then converted into speech by a speech
    synthesis tool that speaks the numeric content. The application can be useful in games, toys,
    and other settings such as railways and aids for physically challenged persons.
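
    The flow described above (captured image, then OCR, then speech synthesis) can be sketched
    as follows. This is only an illustration under stated assumptions, not the planned Android
    implementation. The OCR and speech engines are passed in as plain functions, and
    read_image_aloud, digits_to_filipino_words, and the stub engine are hypothetical names.
    Because the description mentions numeric content, the sketch also spells out digits one at
    a time, using the Filipino words for 0 to 9, before handing the text to the synthesizer.

```python
from typing import Callable, List

# Filipino words for the digits 0-9, used to read numbers digit by digit.
FILIPINO_DIGITS = {
    "0": "sero", "1": "isa", "2": "dalawa", "3": "tatlo", "4": "apat",
    "5": "lima", "6": "anim", "7": "pito", "8": "walo", "9": "siyam",
}


def digits_to_filipino_words(text: str) -> str:
    """Spell out all-digit tokens in Filipino; keep other tokens as they are."""
    words: List[str] = []
    for token in text.split():
        if all(ch in FILIPINO_DIGITS for ch in token):
            words.extend(FILIPINO_DIGITS[ch] for ch in token)
        else:
            words.append(token)
    return " ".join(words)


def read_image_aloud(
    image: bytes,
    recognize_text: Callable[[bytes], str],
    speak: Callable[[str], None],
) -> str:
    """Run OCR on the image, normalize digits, and speak the result."""
    recognized = recognize_text(image)  # OCR step (engine is an assumption)
    spoken_text = digits_to_filipino_words(recognized)
    speak(spoken_text)  # speech synthesis step (engine is an assumption)
    return spoken_text


if __name__ == "__main__":
    # Stub engines standing in for a real OCR library and a real synthesizer.
    def fake_ocr(image: bytes) -> str:
        return "Silid 204"

    read_image_aloud(b"", fake_ocr, print)
```

    Passing the engines in as functions keeps the sketch independent of any particular OCR or
    speech library, so either one could be swapped without changing the surrounding flow.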

    References

    [1] Filipino language. (n.d.). Retrieved from Wikipedia:
        http://en.wikipedia.org/wiki/Filipino_language

    [2] Ishitani, Y. (2001). "Model-based information extraction method tolerant of OCR errors
        for document images". Document Analysis and Recognition, 908-915.

    [3] Tatham, M. (1996). "Improving text-to-speech synthesis". Fourth International Conference
        (pp. 3-6). ICSLP 96.

    [4] Teaching Students with Visual Disabilities. (1999). Retrieved from umass.edu:
        http://www.umass.edu/complit/ogscl/visualdis.htm
