Same presentation slides in the transLectures Slideshare: http://www.slideshare.net/transLectures/trans-lectures-ioe201320131112
Internet of Education 2013
12 November 2013
Universitat Politècnica de València
www.translectures.eu EC FP7 ICT project #287755
Watch this presentation in video: http://videolectures.net/internetofeducation2013_diaz_munio_translectures/
Outline
• transLectures (tL): motivation and goals
• Video demo
• tL technologies
• Progress and results
  • Scientific evaluations
  • User evaluations
  • Quality control (expert evaluations)
• Implementation and integration
• tL open source tools
Motivation
• Video lecture repositories and MOOCs
  • Thousands of hours of video lectures available
  • Hundreds of hours of video lectures recorded every week
• Most video lectures only available in their original language
  • No subtitles
Motivation
• Transcriptions and translations are needed
  • Accessibility for people with disabilities
  • Accessibility for speakers of different languages
  • Search and analysis functions
  • Automated topic finding
  • …
• How do we get there?
The transLectures approach
1. Automatic Speech Recognition (ASR) and Machine Translation (MT)
  • Adaptation: taking advantage of the characteristics of video lecture repositories
  • High-quality automatic transcriptions and translations
2. Interactive postediting: intelligent interaction for reduced effort
Goals
• Massive adaptation
• Intelligent interaction
• Implementation
  • Case studies: Videolectures.NET & Polimedia
  • Real-life evaluation
  • Integration into Opencast Matterhorn (http://opencast.org/matterhorn/)
The transLectures partners
No.  Name                                  Country
1    Universitat Politècnica de València   Spain
2    Xerox SAS                             France
3    Institut Jožef Stefan                 Slovenia
3+   Knowledge for All Foundation          UK
4    RWTH Aachen University                Germany
5    EML – European Media Laboratory       Germany
6    DDS – Deluxe Digital Studios          UK
Languages
• Transcription (ASR)
  • EN
  • SL
  • ES
• Translation (MT)
  • EN>SL, SL>EN
  • EN>ES, ES>EN
  • EN>FR
  • EN>DE
transLectures: video demo
http://www.translectures.eu/new-demo-video/
Massive adaptation
• Characteristics of video lectures
  • Just one person
  • Known speaker
  • Clear talking
  • No interruptions
  • Focused on a topic
  • Slides
Massive adaptation
• Known speaker and topic
• Slides
• Related documents
Scientific evaluations (Y2)
• Transcription results
  • WER: Word Error Rate (%)
  • Goal: WER < 25%
  • EN, SL, ES
[Chart: WER (%) by language; lower is better]
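As a reminder of what the WER figures measure, here is a minimal sketch of the usual definition: the word-level edit distance (substitutions, deletions and insertions) between a reference transcription and the recognizer output, divided by the number of reference words. The function name, the whitespace tokenization and the example sentences are illustrative assumptions, not the project's evaluation tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion over six words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
meets_goal = wer < 25.0  # the goal stated on the slide
```

Under this definition WER can exceed 100% when the hypothesis contains many insertions, which is one reason lower values are better and the goal is stated as an upper bound.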
Scientific evaluations (Y2)
• Translation results
  • BLEU
  • Goal: BLEU > 30
  • EN>SL, SL>EN
  • EN>ES, ES>EN
  • EN>FR
  • EN>DE
[Chart: BLEU by language pair; higher is better]
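For readers unfamiliar with BLEU, the sketch below shows the standard idea on a 0-100 scale: the geometric mean of clipped n-gram precisions (n = 1 to 4) multiplied by a brevity penalty for output that is shorter than the reference. It assumes a single reference per sentence, whitespace tokenization and no smoothing; the function names are hypothetical, and the scores on the slides were presumably obtained with established evaluation tools rather than code like this.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU (0-100), one reference per hypothesis, no smoothing."""
    if len(references) != len(hypotheses):
        raise ValueError("need exactly one reference per hypothesis")

    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # n-grams produced by the system, per n-gram order
    ref_len = hyp_len = 0
    for ref_text, hyp_text in zip(references, hypotheses):
        ref, hyp = ref_text.split(), hyp_text.split()
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            ref_counts = ngram_counts(ref, n)
            hyp_counts = ngram_counts(hyp, n)
            # Clipping: an n-gram is credited at most as often as it
            # occurs in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


# Hypothetical example: every n-gram matches, but the output is one word short,
# so only the brevity penalty lowers the score.
score = corpus_bleu(["the cat sat on the mat today"], ["the cat sat on the mat"])
```

Because the counts are pooled over the whole test set before the precisions are computed, this is a corpus-level score, not an average of sentence scores.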
Y1 results and comparison
[Charts: Y1 results compared with Org. X]
Org. X = undisclosed, state-of-the-art ASR & MT systems
Intelligent interaction
• Postediting automatic transcriptions/translations
  • The user invests the least possible effort
  • The system learns the most from it
• Confidence measures
• Fast constrained search
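One way to picture how confidence measures can reduce user effort: if the recognizer attaches a confidence score to each word, the user can be asked to check only the least reliable words. The sketch below is a simplified illustration under that assumption; the `RecognizedWord` type, the `words_to_supervise` function and the scores are hypothetical and do not describe the actual transLectures implementation.

```python
from dataclasses import dataclass


@dataclass
class RecognizedWord:
    text: str
    confidence: float  # assumed to lie in [0, 1]; higher means more reliable


def words_to_supervise(words, max_words):
    """Return the positions of the lowest-confidence words, in transcript order."""
    if max_words < 0:
        raise ValueError("max_words must be non-negative")
    # Rank word positions from least to most confident, keep the first
    # max_words, then restore reading order for the user.
    ranked = sorted(range(len(words)), key=lambda i: words[i].confidence)
    return sorted(ranked[:max_words])


# Hypothetical recognizer output with made-up confidence scores.
transcript = [
    RecognizedWord("the", 0.95),
    RecognizedWord("speech", 0.40),
    RecognizedWord("recogniser", 0.55),
    RecognizedWord("works", 0.90),
    RecognizedWord("well", 0.30),
]
to_check = words_to_supervise(transcript, max_words=2)
```

In this sketch the user would be shown only the words at the returned positions, instead of revising the whole transcription.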
Intelligent interaction
• User evaluations
  • UPV (Polimedia)
  • JSI (Videolectures.NET)
User evaluations
• User evaluations at UPV
  • Users: lecturers
  • Revising their own lectures
  • 3 different experiments
    1. Complete supervision
    2. Intelligent interaction
    3. Two-round supervision
User evaluations
[Figures: 1. Complete supervision; 2. Intelligent interaction; 3. Two-round supervision]
User evaluations
• User evaluations at UPV: results
Quality control: expert evaluations
• Transcription quality (EN, ES, SL)
  • UPV: Representative set of Spanish transcriptions
  • Avg. WER: 23.2; Avg. RTF: 3.8
• Translation quality (EN<>SL, EN<>ES, EN>FR, EN>DE)
  • UPV: Representative set of Spanish>English translations
  • Avg. BLEU: 46.6; Avg. RTF: 14.8; Avg. Score: 3.6 out of 5
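The slides do not define RTF. Assuming it stands for real-time factor in the usual supervision sense (time spent revising divided by the duration of the lecture), the averages above can be turned into rough effort estimates. The function names and the 10-minute lecture below are illustrative assumptions.

```python
def real_time_factor(supervision_seconds: float, media_seconds: float) -> float:
    """RTF under the assumed definition: supervision time / media duration."""
    if media_seconds <= 0:
        raise ValueError("media duration must be positive")
    return supervision_seconds / media_seconds


def estimated_supervision_minutes(media_minutes: float, rtf: float) -> float:
    """Rough supervision time implied by an average RTF."""
    return media_minutes * rtf


# Hypothetical 10-minute lecture, using the average RTFs reported on the slide.
transcription_minutes = estimated_supervision_minutes(10, 3.8)  # about 38 minutes
translation_minutes = estimated_supervision_minutes(10, 14.8)   # about 148 minutes
```

Read this way, an RTF of 3.8 means that revising a transcription takes a little under four times the length of the lecture.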
Implementation and integration
• Videolectures.NET
• Polimedia
• Opencast Matterhorn
transLectures: Open source tools
• The tL player (& editor)
  • Coming soon (www.translectures.eu)
• The transLectures-UPV Toolkit (TLK) for ASR
  • www.translectures.eu/tlk
• RWTH Aachen: rASR, Jane (MT)
  • http://www-i6.informatik.rwth-aachen.de/web/Software/
Next steps for transLectures
• Keep improving ASR and MT results
• Keep improving tL open source tools (TLK, tL player)
• External user evaluations (VL.NET and Polimedia)
• External trials: implementation in other universities
• More detailed info in the public project deliverables, soon available from:
  • http://www.translectures.eu/progress/
  • http://www.translectures.eu/public-reports/
  (M12 = Year 1; M24 = Year 2, most recent results)
• More tL video demos: http://www.translectures.eu/progress/
• Follow transLectures:
  • http://twitter.com/translectures
  • http://www.facebook.com/translectures
  • http://www.slideshare.net/transLectures
www.translectures.eu
• About this presentation:
Gonçal Garcés Díaz-Munío [email protected]
• Project coordinator:
Alfons Juan-Ciscar [email protected]
EC FP7 ICT Programme – Project Number 287755