Real-Time Speech Recognition Subtitling in Education
Respeaking 2009
Dr Mike Wald, University of Southampton
Traditional supports create a reliance on intermediaries
Finding experienced or qualified interpreters, note-takers or re-speakers at university level is difficult
Speech Recognition: real time access to spoken language
Speech Recognition: also supports note taking
SR in classrooms is VERY DIFFICULT!
• Star Trek expectations
• Special vocabulary
• Spontaneous speech not writing
• Dialogue and interaction
• Andtherearenospacesbetweenwordswhenpeo
pletalksoitisunclearwherewordsbeginandend
How to Wreck a Nice Beach You Sing Calm Incense
This is a demonstration of the problem of the readability of text created by commercial speech recognition software used in lectures they were designed for the speaker to dictate grammatically complete sentences using punctuation by saying comma period new paragraph to provide phrase sentence and paragraph markers when people speak spontaneously they do not speak in what would be regarded as grammatically correct sentences as you can see you just see a continuous stream of text with no obvious beginnings and ends of sentences normal written text would break up this text by the use of punctuation such as commas and periods or new lines by getting the software to insert breaks in the text automatically by measuring the length of the silence between words we can improve the readability greatly
This is a demonstration of the problem of the readability of text created by commercial speech recognition software used in lectures
they were designed for the speaker to dictate grammatically complete sentences using punctuation by saying comma period new paragraph to provide phrase sentence and paragraph markers
when people speak spontaneously they do not speak in what would be regarded as grammatically correct sentences
as you can see you just see a continuous stream of text with no obvious beginnings and ends of sentences
normal written text would break up this text by the use of punctuation such as commas and periods or new lines
by getting the software to insert breaks in the text automatically by measuring the length of the silence between words we can improve the readability greatly
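The silence-based line breaking described above can be sketched as follows. This is a minimal illustration, not the software used in the lectures: the (word, start, end) timing format, the function name insert_breaks and the 0.5 second threshold are all assumptions.

```python
def insert_breaks(timed_words, pause_threshold=0.5):
    """Group words into lines, starting a new line after each long silence.

    timed_words: list of (word, start_seconds, end_seconds) tuples (assumed format).
    pause_threshold: silence in seconds that triggers a break (assumed value).
    """
    lines = []
    current = []
    previous_end = None
    for word, start, end in timed_words:
        # A gap longer than the threshold is treated as a phrase boundary.
        if previous_end is not None and start - previous_end > pause_threshold:
            lines.append(" ".join(current))
            current = []
        current.append(word)
        previous_end = end
    if current:
        lines.append(" ".join(current))
    return lines


# Hypothetical timings in seconds, for illustration only.
sample = [
    ("this", 0.0, 0.2), ("is", 0.25, 0.35), ("a", 0.4, 0.45),
    ("demonstration", 0.5, 1.1),
    ("they", 2.0, 2.2), ("were", 2.25, 2.4), ("designed", 2.45, 2.9),
]
for line in insert_breaks(sample):
    print(line)
```

A single fixed threshold is the simplest choice; speakers pause differently, so a real system would probably need to tune it per speaker.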
1998 LIBERATED LEARNING Pilot Project
Today we will be discussing applied interactions in psychotherapy and how it related to Canadian law I will be covering chapters 7 8 and nine in preparation for next week's midterm exam
But first are they are questions about what we discussed yesterday lets move forward by asking the following question how does
It gives you something to compare your notes to and if you miss a class the notes are still accessible
It’s very helpful when the lecturer moves on while you’re still writing down a point as you can look at the screen
It’s forced me to be more reflective with my own teaching style and approach
It makes me ask myself what is teaching, why am I teaching this way, is there a better way?
Initial Research Summary
• Helps students access lectures
• Students thought it had great potential
• Improved teaching but gave teachers extra work
• Challenges: accuracy, readability, and ease of use
Accuracy & Understanding: Stop / Proceed with Caution / GO
[Chart: accuracy scale from 0% to 100%]
28 KB/s speech signal
320 b/s (240 words/minute)
Real Time Editor: corrects 15 errors/minute
320 b/s (240 words/minute)
Speaker(s)
uncorrected words
Human intermediar(y/ies) can interact with a post processing system through the selection and correction of errors.
SR System using voice & language models
possible feedback?
Post-processing system for automatic error identification and correction
Corrected words
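The figures on the real-time editing diagram above can be related with some simple arithmetic. This is only a rough reading of the slide: it assumes 1 KB = 1000 bytes, and it assumes the editor's 15 corrections per minute is a hard ceiling, neither of which the slide states.

```python
# Figures taken from the slide.
WORDS_PER_MINUTE = 240
TEXT_BITS_PER_SECOND = 320
CORRECTIONS_PER_MINUTE = 15
SPEECH_KILOBYTES_PER_SECOND = 28

# Assumption: 1 KB = 1000 bytes, 1 byte = 8 bits.
speech_bits_per_second = SPEECH_KILOBYTES_PER_SECOND * 1000 * 8

# How many bits of text each spoken word corresponds to.
bits_per_word = TEXT_BITS_PER_SECOND * 60 / WORDS_PER_MINUTE

# How much smaller the text stream is than the speech signal.
reduction_factor = speech_bits_per_second / TEXT_BITS_PER_SECOND

# Highest SR error rate one editor could fully correct in real time.
max_error_rate = CORRECTIONS_PER_MINUTE / WORDS_PER_MINUTE

print(f"bits per word: {bits_per_word:.0f}")
print(f"speech-to-text reduction: {reduction_factor:.0f}x")
print(f"max correctable error rate: {max_error_rate:.2%}")
print(f"minimum SR accuracy needed: {1 - max_error_rate:.2%}")
```

On these assumptions, one editor keeps up only when the recogniser is already about 94% accurate at 240 words/minute.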
SR Post-processing Enhancement
Text with Errors
Text without Errors
Post-processing Enhancement
using statistical (including alternates lists and confidence levels), linguistic, context, phonemic, visual, signal and noise information
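One way the statistical information (confidence levels and alternates lists) might be used is to flag doubtful words for a human editor. The sketch below is purely illustrative: the RecognisedWord structure, the 0 to 1 confidence scale, the 0.6 threshold and the sample values are assumptions, not the output format of any particular SR product.

```python
from dataclasses import dataclass, field


@dataclass
class RecognisedWord:
    text: str
    confidence: float                      # assumed scale: 0.0 (low) to 1.0 (high)
    alternates: list = field(default_factory=list)


def flag_for_review(words, threshold=0.6):
    """Return (position, word) pairs whose confidence falls below the threshold."""
    return [(i, w) for i, w in enumerate(words) if w.confidence < threshold]


# Hypothetical recogniser output, for illustration only.
words = [
    RecognisedWord("how", 0.95),
    RecognisedWord("to", 0.92),
    RecognisedWord("wreck", 0.41, ["recognise", "reckon"]),
    RecognisedWord("a", 0.88),
    RecognisedWord("nice", 0.47, ["an", "ice"]),
    RecognisedWord("beach", 0.52, ["speech", "peach"]),
]

for position, word in flag_for_review(words):
    print(position, word.text, "alternates:", word.alternates)
```

Flagged words could be highlighted for the editor, with the alternates offered as one-click replacements.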
Text with Errors
Text without Errors
Post-processing Enhancement
Topic Detection & Machine Translation
Text can reduce the memory demands of spoken language
Speech can better express subtle emotions
Images can communicate moods, relationships and complex information holistically.
Multimedia
Easy to find WHOLE of recording but NOT a PART
Analogy
Text book with front cover but no contents page, index or page numbers
synchronised images (e.g. slides)
user-created synchronised bookmarks, tags, notes with associated links to other resources
audio / video
synchronised text captions
multimedia start time
multimedia end time
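The diagram above ties every caption, slide, bookmark, tag and note to a start and end time in the recording. A minimal sketch of that idea follows. The Annotation class, its field names and the sample data are hypothetical and are not taken from Synote.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    kind: str      # e.g. "caption", "slide", "bookmark", "tag", "note"
    content: str
    start: float   # multimedia start time, in seconds
    end: float     # multimedia end time, in seconds


def active_at(annotations, t):
    """Return the annotations whose time span covers playback time t."""
    return [a for a in annotations if a.start <= t < a.end]


# Hypothetical recording, for illustration only.
recording = [
    Annotation("slide", "Slide 1: Introduction", 0.0, 60.0),
    Annotation("caption", "today we will be discussing", 0.0, 4.0),
    Annotation("note", "ask about this next week", 2.0, 4.0),
    Annotation("caption", "but first are there any questions", 4.0, 8.0),
]

for a in active_at(recording, 3.0):
    print(a.kind, "->", a.content)
```

Because every item shares the same timeline, selecting any one of them gives a position from which to replay the audio or video.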
Synchronised Multimedia
Collaborate/Reflect
Search
Reason/Summarise
Organise/Index
Notes
Tags
Bookmarks
Text Captions
Images/Slides
Video
Audio
&
Links
Synote supports learners to …
• search text and replay sections
• read transcript rather than listen to speech
• read text of slide images
• insert bookmark to continue later
• tag/highlight sections (e.g. not understood)
• add synchronised notes
• link to other web pages/resources
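"Search text and replay sections" depends on captions that carry timestamps. The sketch below shows the general idea only. The function name, the (start time, text) caption format and the sample captions are assumptions and do not describe Synote's actual implementation.

```python
def search_captions(captions, query):
    """Return start times (seconds) of captions containing the query text.

    captions: list of (start_seconds, text) pairs (assumed format).
    Matching ignores upper and lower case.
    """
    needle = query.lower()
    return [start for start, text in captions if needle in text.lower()]


# Hypothetical captions, for illustration only.
captions = [
    (0.0, "Today we will be discussing speech recognition"),
    (6.5, "Punctuation is missing from spontaneous speech"),
    (12.0, "Pauses can be used to insert line breaks"),
]

for start in search_captions(captions, "speech"):
    print(f"replay from {start:.1f} s")
```

Each returned time is a point the player could jump to, so a learner can replay just the matching part rather than the whole recording.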
Synote supports teachers to …
• index and tag their recordings
• provide synchronised slides and captions
• respond to learner tags
• link to web pages/resources
• link to sections of existing multimedia
Synote Demo
Questions?
www.synote.org