AUTOMATIC SUBTITLE GENERATOR By Lohith Kumar Menchu Manikanta Thumu Ravinder Putta

DESCRIPTION

A presentation on the automatic generation of subtitles from a video input.

Page 1: Automatic Subtitle Generator

AUTOMATIC SUBTITLE GENERATOR

By

Lohith Kumar Menchu, Manikanta Thumu, Ravinder Putta

Page 2: Automatic Subtitle Generator

ABSTRACT

Video has become one of the most popular multimedia artefacts used on PCs and the internet.

In the majority of videos, the sound plays an important role.

For people with a limited command of the spoken language or with hearing impairments, subtitles are the most natural way to follow the content.

Page 3: Automatic Subtitle Generator

ABSTRACT

Therefore, solutions are needed to make these media artefacts accessible to as many people as possible.

Here, we confine our research to videos that have a single speaker.

If we try to employ speech recognition (SR) technology in conversations or meetings where people frequently interrupt each other, we are likely to get extremely poor results.

Page 4: Automatic Subtitle Generator

PROJECT DESCRIPTION

This thesis work mainly aims to address our problem by presenting a potential system.

Three distinct modules have been defined: audio extraction, speech recognition, and subtitle generation (with time synchronization).

Page 5: Automatic Subtitle Generator

PROJECT DESCRIPTION

The system should take a video file as input and generate a subtitle file as output.

The generated subtitles must also be synchronized with the video content.

The speaker-dependent model (a single speaker) achieves an accuracy greater than 90%, with peaks reaching 98% under optimal conditions (quiet room, high-quality microphone).

Page 6: Automatic Subtitle Generator

EXISTING SYSTEM

In the existing system, whether the media has a single speaker or multiple speakers, the subtitles are produced manually by a linguist.

However, manual subtitle creation is a slow and tedious activity that requires the user to be present.

Moreover, the user needs to know the language of the video content in order to produce subtitles.

Page 7: Automatic Subtitle Generator

EXISTING SYSTEM

At present, subtitles cannot be generated for all languages.

Software that generates subtitles through speech recognition, without human intervention, has not yet been developed.

Page 8: Automatic Subtitle Generator

PROPOSED SYSTEM

In the proposed system, SR technology allows a computer to handle sound input from either a microphone or a media file, so that it can be transcribed or used to interact with the machine.

The analog signal is converted into digital format and then divided into small segments, which are matched against the known phonemes of the appropriate language.
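
The segmentation step can be illustrated with a minimal sketch, assuming the audio has already been digitized into an array of PCM samples. The class name `FrameSplitter`, the frame length, and the hop size are illustrative assumptions and do not come from the original system.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits digitized audio into short, possibly overlapping segments. */
final class FrameSplitter {

    /**
     * @param samples   digitized audio samples
     * @param frameSize number of samples per segment (chosen by the caller)
     * @param hopSize   number of samples between the starts of consecutive segments
     * @return all complete segments; a trailing partial segment is dropped
     */
    static List<short[]> split(short[] samples, int frameSize, int hopSize) {
        if (frameSize <= 0 || hopSize <= 0) {
            throw new IllegalArgumentException("frameSize and hopSize must be positive");
        }
        List<short[]> frames = new ArrayList<>();
        for (int start = 0; start + frameSize <= samples.length; start += hopSize) {
            short[] frame = new short[frameSize];
            System.arraycopy(samples, start, frame, 0, frameSize);
            frames.add(frame);
        }
        return frames;
    }

    public static void main(String[] args) {
        // One second of silence at an assumed 16 kHz sampling rate.
        short[] audio = new short[16000];
        // Assumed framing: 25 ms segments (400 samples) every 10 ms (160 samples).
        List<short[]> frames = split(audio, 400, 160);
        System.out.println(frames.size() + " segments");
    }
}
```

Each segment would then be passed to the matching stage; overlapping segments are a common choice so that a sound falling on a segment boundary is not lost.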

Page 9: Automatic Subtitle Generator

PROPOSED SYSTEM

Speech recognition can be used to handle either a single speaker or an unlimited number of speakers.

The first case, which is our area of interest, achieves an accuracy greater than 90%, with peaks reaching 98% under optimal conditions.

Various models are under development, but modern SR engines are based on Hidden Markov Models.
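
As a rough illustration of how a Hidden Markov Model is decoded, the sketch below runs the Viterbi algorithm on a toy model. The class name `ToyViterbi` and every probability in the usage example are made up for illustration; a real SR engine works with far larger models and with acoustic features rather than small integer symbols.

```java
/** Toy Viterbi decoder for a hidden Markov model (illustrative only). */
final class ToyViterbi {

    /**
     * @param obs   observation indices, one per time step
     * @param start start[s] = probability of starting in state s
     * @param trans trans[a][b] = probability of moving from state a to state b
     * @param emit  emit[s][o] = probability of observing symbol o in state s
     * @return the most likely sequence of hidden states
     */
    static int[] decode(int[] obs, double[] start, double[][] trans, double[][] emit) {
        int steps = obs.length;
        int states = start.length;
        if (steps == 0) {
            return new int[0];
        }
        // score[i][s]: log-probability of the best path that ends in state s at step i.
        double[][] score = new double[steps][states];
        // back[i][s]: predecessor of state s on that best path.
        int[][] back = new int[steps][states];

        for (int s = 0; s < states; s++) {
            score[0][s] = Math.log(start[s]) + Math.log(emit[s][obs[0]]);
        }
        for (int i = 1; i < steps; i++) {
            for (int s = 0; s < states; s++) {
                int bestPrev = 0;
                double best = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < states; p++) {
                    double candidate = score[i - 1][p] + Math.log(trans[p][s]);
                    if (candidate > best) {
                        best = candidate;
                        bestPrev = p;
                    }
                }
                score[i][s] = best + Math.log(emit[s][obs[i]]);
                back[i][s] = bestPrev;
            }
        }
        // Pick the best final state, then follow the back-pointers.
        int last = 0;
        for (int s = 1; s < states; s++) {
            if (score[steps - 1][s] > score[steps - 1][last]) {
                last = s;
            }
        }
        int[] path = new int[steps];
        path[steps - 1] = last;
        for (int i = steps - 1; i > 0; i--) {
            path[i - 1] = back[i][path[i]];
        }
        return path;
    }

    public static void main(String[] args) {
        // Two hidden states and two observable symbols, with made-up probabilities.
        double[] start = {0.6, 0.4};
        double[][] trans = {{0.7, 0.3}, {0.3, 0.7}};
        double[][] emit = {{0.9, 0.1}, {0.2, 0.8}};
        int[] path = decode(new int[] {0, 0, 1, 1}, start, trans, emit);
        System.out.println(java.util.Arrays.toString(path));
    }
}
```

In a recognizer, the hidden states would correspond to phoneme-level units and the observations to the short audio segments; log-probabilities are used here to avoid numerical underflow on long inputs.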

Page 10: Automatic Subtitle Generator

HOW SPEECH RECOGNITION WORKS?

Rule Based

Early speech recognition systems tried to apply a set of grammatical and syntactical rules to speech.

If the spoken words fit a certain set of rules, the program could determine what they were.

Accents, dialects and mannerisms can vastly change the way certain words or phrases are spoken, so this approach is of limited use.

Page 11: Automatic Subtitle Generator

HOW SPEECH RECOGNITION WORKS?

Statistical-Modelling Approach

The statistical model has three fundamental components, each modelling a different aspect of the speech signal.

Acoustic Model

Lexicon

Language Model

Page 12: Automatic Subtitle Generator

ACOUSTIC MODEL

Acoustic models require engineers to collect all the sounds made by speakers of a particular language.

We distinguish two kinds of acoustic model: speaker dependent and speaker independent.

Page 13: Automatic Subtitle Generator

PHONEMES

Page 14: Automatic Subtitle Generator

LEXICON

The next part of the model is the lexicon, or dictionary. It defines how each word in the language is pronounced.
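
A lexicon can be pictured as a lookup table from words to phoneme sequences. The sketch below is a minimal illustration; the class name `ToyLexicon` is hypothetical, and the phoneme symbols in the usage example are only illustrative labels, not entries taken from the original system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical lexicon: maps each known word to its sequence of phonemes. */
final class ToyLexicon {
    private final Map<String, List<String>> entries = new HashMap<>();

    /** Registers the pronunciation of a word. */
    void add(String word, String... phonemes) {
        entries.put(word.toLowerCase(Locale.ROOT), Arrays.asList(phonemes));
    }

    /** Returns the phonemes of a phrase, word by word; unknown words are rejected. */
    List<String> pronounce(String phrase) {
        List<String> result = new ArrayList<>();
        for (String word : phrase.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            List<String> phonemes = entries.get(word);
            if (phonemes == null) {
                throw new IllegalArgumentException("Word not in lexicon: " + word);
            }
            result.addAll(phonemes);
        }
        return result;
    }

    public static void main(String[] args) {
        ToyLexicon lexicon = new ToyLexicon();
        lexicon.add("the", "DH", "AH");   // illustrative phoneme labels
        lexicon.add("dog", "D", "AO", "G");
        System.out.println(lexicon.pronounce("the dog"));
    }
}
```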

Page 15: Automatic Subtitle Generator

LANGUAGE MODEL

The third piece of the model is the model of how we put words together into phrases and sentences in the language.

For example, if the recognizer has just recognized "the dog" and is trying to work out the next word, the model may have learned that "ran" is more likely than "pan" or "can", simply because of how words are used in English.
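
The "ran" versus "pan" example can be sketched with a bigram model, which estimates how likely a word is given only the single word before it. This is a deliberate simplification; the class name `BigramModel` and the three training sentences are assumptions made for illustration, and a real language model is trained on a very large corpus.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical bigram language model built from word-pair counts. */
final class BigramModel {
    private final Map<String, Map<String, Integer>> pairCounts = new HashMap<>();
    private final Map<String, Integer> totals = new HashMap<>();

    /** Counts every adjacent word pair in the sentence. */
    void train(String sentence) {
        String[] words = sentence.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            pairCounts.computeIfAbsent(words[i], k -> new HashMap<>())
                      .merge(words[i + 1], 1, Integer::sum);
            totals.merge(words[i], 1, Integer::sum);
        }
    }

    /** Estimated probability that `next` follows `prev`; 0 if `prev` was never seen. */
    double probability(String prev, String next) {
        int total = totals.getOrDefault(prev, 0);
        if (total == 0) {
            return 0.0;
        }
        return pairCounts.get(prev).getOrDefault(next, 0) / (double) total;
    }

    /** Picks the candidate with the highest probability after `prev`. */
    String mostLikely(String prev, List<String> candidates) {
        String best = null;
        double bestProbability = -1.0;
        for (String candidate : candidates) {
            double p = probability(prev, candidate);
            if (p > bestProbability) {
                bestProbability = p;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        BigramModel model = new BigramModel();
        model.train("the dog ran away");
        model.train("the dog ran home");
        model.train("the dog can bark");
        System.out.println(model.mostLikely("dog", Arrays.asList("ran", "pan", "can")));
    }
}
```

The recognizer would combine such a score with the acoustic evidence, so that an acoustically similar but unlikely word such as "pan" loses out to "ran".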

Page 16: Automatic Subtitle Generator

HOW SPEECH RECOGNITION WORKS?

Page 17: Automatic Subtitle Generator

MODULES DESCRIPTION

Our Automatic Subtitle Generator consists of three main modules.

Audio Extraction

Speech Recognition

Subtitle Generation
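
One way to picture how the three modules fit together is as a chain of interfaces, each consuming the output of the previous one. The sketch below is only an assumed wiring: the names `SubtitlePipeline`, `AudioExtractor`, `SpeechRecognizer` and `SubtitleGenerator` are hypothetical, and the data passed between modules is reduced to strings for brevity.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical wiring of the three modules (not the original system's classes). */
final class SubtitlePipeline {

    /** Module 1: takes a video file path and returns the path of the extracted audio. */
    interface AudioExtractor { String extract(String videoPath); }

    /** Module 2: takes an audio file path and returns the recognized words in order. */
    interface SpeechRecognizer { List<String> recognize(String audioPath); }

    /** Module 3: takes the recognized words and returns the subtitle file content. */
    interface SubtitleGenerator { String generate(List<String> words); }

    private final AudioExtractor extractor;
    private final SpeechRecognizer recognizer;
    private final SubtitleGenerator generator;

    SubtitlePipeline(AudioExtractor extractor, SpeechRecognizer recognizer,
                     SubtitleGenerator generator) {
        this.extractor = extractor;
        this.recognizer = recognizer;
        this.generator = generator;
    }

    /** Runs the modules in order: video -> audio -> words -> subtitles. */
    String run(String videoPath) {
        String audioPath = extractor.extract(videoPath);
        List<String> words = recognizer.recognize(audioPath);
        return generator.generate(words);
    }

    public static void main(String[] args) {
        // Stub modules stand in for the real implementations.
        SubtitlePipeline pipeline = new SubtitlePipeline(
                video -> video + ".wav",
                audio -> Arrays.asList("hello", "world"),
                words -> String.join(" ", words));
        System.out.println(pipeline.run("talk.mp4"));
    }
}
```

Keeping each module behind its own interface means that any one of them can be replaced or tested on its own.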

Page 18: Automatic Subtitle Generator

AUDIO EXTRACTION

The audio extraction routine is expected to return audio in a suitable format that the speech recognition module can use as input.

To facilitate the extraction of audio, we use the Java Media Framework (JMF) API, which provides many useful features for dealing with media objects.
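
What counts as a "suitable audio format" depends on the recognizer. The sketch below checks an audio format against one assumed target (16 kHz, 16-bit, mono, signed PCM). It uses the standard `javax.sound.sampled` package rather than JMF, and both the class name `AudioFormatCheck` and the target values are assumptions, not requirements stated by the original system.

```java
import javax.sound.sampled.AudioFormat;

/** Hypothetical check that extracted audio matches the format the recognizer expects. */
final class AudioFormatCheck {

    // Assumed target values; the real recognizer may require something else.
    static final float TARGET_SAMPLE_RATE = 16000f;
    static final int TARGET_SAMPLE_SIZE_BITS = 16;
    static final int TARGET_CHANNELS = 1;

    /** Returns true if the format is signed PCM with the assumed rate, depth and channel count. */
    static boolean isSuitable(AudioFormat format) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                && format.getSampleRate() == TARGET_SAMPLE_RATE
                && format.getSampleSizeInBits() == TARGET_SAMPLE_SIZE_BITS
                && format.getChannels() == TARGET_CHANNELS;
    }

    public static void main(String[] args) {
        // CD-style stereo audio: would need conversion before recognition.
        AudioFormat stereo = new AudioFormat(44100f, 16, 2, true, false);
        // Mono audio at the assumed target rate.
        AudioFormat mono = new AudioFormat(16000f, 16, 1, true, false);
        System.out.println(isSuitable(stereo) + " " + isSuitable(mono));
    }
}
```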

Page 19: Automatic Subtitle Generator

SPEECH RECOGNITION

The speech recognition routine is the key part of the system. It directly affects both performance and the evaluation of results.

First, it should determine the type of the input file (film, music, information, home-made, etc.) whenever possible. Then, if the type is available, an appropriate processing method is chosen.

Page 20: Automatic Subtitle Generator

SUBTITLE GENERATION

The subtitle generation routine creates a file and writes multiple chunks of text to it. Each chunk corresponds to an utterance delimited by silences and carries its start and end times.

The module is expected to receive a list of words and their respective speech times from the speech recognition module, and then to produce an SRT subtitle file.
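
The grouping of timed words into SRT entries can be sketched as follows. This is a minimal illustration under stated assumptions: the names `SrtWriter` and `TimedWord` are hypothetical, times are assumed to be in milliseconds, and a new subtitle is started whenever the silence between two words exceeds a caller-chosen gap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical SRT builder: groups timed words into utterances separated by silences. */
final class SrtWriter {

    /** A recognized word with its start and end time in milliseconds. */
    static final class TimedWord {
        final String text;
        final long startMs;
        final long endMs;

        TimedWord(String text, long startMs, long endMs) {
            this.text = text;
            this.startMs = startMs;
            this.endMs = endMs;
        }
    }

    /** Formats milliseconds as the SRT timestamp HH:MM:SS,mmm. */
    static String timestamp(long ms) {
        long hours = ms / 3_600_000;
        long minutes = (ms / 60_000) % 60;
        long seconds = (ms / 1_000) % 60;
        long millis = ms % 1_000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d", hours, minutes, seconds, millis);
    }

    /** Builds the SRT text; a silence longer than maxGapMs starts a new subtitle. */
    static String toSrt(List<TimedWord> words, long maxGapMs) {
        StringBuilder srt = new StringBuilder();
        int index = 1;
        int i = 0;
        while (i < words.size()) {
            TimedWord first = words.get(i);
            StringBuilder line = new StringBuilder(first.text);
            long end = first.endMs;
            i++;
            // Keep adding words while the pause before the next word is short enough.
            while (i < words.size() && words.get(i).startMs - end <= maxGapMs) {
                line.append(' ').append(words.get(i).text);
                end = words.get(i).endMs;
                i++;
            }
            srt.append(index++).append('\n')
               .append(timestamp(first.startMs)).append(" --> ").append(timestamp(end)).append('\n')
               .append(line).append("\n\n");
        }
        return srt.toString();
    }

    public static void main(String[] args) {
        List<TimedWord> words = new ArrayList<>();
        words.add(new TimedWord("hello", 0, 400));
        words.add(new TimedWord("world", 450, 900));
        words.add(new TimedWord("again", 2000, 2500));
        System.out.print(toSrt(words, 500));
    }
}
```

Because each entry carries the start time of its first word and the end time of its last word, the resulting file stays synchronized with the video.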

Page 21: Automatic Subtitle Generator

WORKING

Page 22: Automatic Subtitle Generator

CONCLUSION

In an online world where accessibility remains insufficient, it is essential to give each individual the right to understand any media content.

In recent years, the internet has seen a proliferation of video-based websites. Most of their videos come from amateurs, and transcripts are rarely available.

Page 23: Automatic Subtitle Generator

CONCLUSION

This thesis work focused mainly on video media and suggested a way to produce a transcript of a video's audio, with the ultimate purpose of making content comprehensible to deaf people.

Although the current system is not yet stable enough to be widely used, it proposes an interesting approach that can certainly be improved.

Page 24: Automatic Subtitle Generator

REFERENCES

Tutorial: Getting Started with the Java Media Framework. URL http://www.ee.iitm.ac.in/~tgvenky/JMFBook/Tutorial.pdf

How Stuff Works. URL http://electronics.howstuffworks.com/gadgets/high-tech-gadgets/speech-recognition.htm

Engineered Station. How Speech Recognition Works. 2001. URL http://project.uet.itgo.com/speech.htm

Page 25: Automatic Subtitle Generator

THANK YOU!