transLectures @IOE2013 (12 Nov 2013)


Watch this presentation in video: http://videolectures.net/internetofeducation2013_diaz_munio_translectures/

Same presentation slides in the transLectures Slideshare: http://www.slideshare.net/transLectures/trans-lectures-ioe201320131112

http://www.translectures.eu


Internet of Education 2013

12 November 2013

Universitat Politècnica de València

www.translectures.eu EC FP7 ICT project #287755


Outline

• transLectures (tL): motivation and goals

• Video demo

• tL technologies

• Progress and results

  • Scientific evaluations

  • User evaluations

  • Quality control (expert evaluations)

• Implementation and integration

• tL open source tools

12 Nov 2013 2

Motivation

• Video lecture repositories and MOOCs

  • Thousands of hours of video lectures available

  • Hundreds of hours of video lectures recorded every week

• Most video lectures only available in their original language

  • No subtitles


Motivation

• Transcriptions and translations are needed

  • Accessibility for people with disabilities

  • Accessibility for speakers of different languages

  • Search and analysis functions

  • Automated topic finding

  • …

• How do we get there?


The transLectures approach

1. Automatic Speech Recognition (ASR) and Machine Translation (MT)

  • Adaptation: taking advantage of the characteristics of video lecture repositories

  • High-quality automatic transcriptions and translations

2. Interactive postediting: intelligent interaction for reduced effort


Goals

• Massive adaptation

• Intelligent interaction

• Implementation

  • Case studies: Videolectures.NET & Polimedia

  • Real-life evaluation

  • Integration into Opencast Matterhorn: http://opencast.org/matterhorn/


The transLectures partners

No.  Name                                   Country
1    Universitat Politècnica de València    Spain
2    Xerox SAS                              France
3    Institut Jožef Stefan                  Slovenia
3+   Knowledge for All Foundation           UK
4    RWTH Aachen University                 Germany
5    EML – European Media Laboratory        Germany
6    DDS – Deluxe Digital Studios           UK

Languages

• Transcription (ASR)

  • EN

  • SL

  • ES

• Translation (MT)

  • EN>SL, SL>EN

  • EN>ES, ES>EN

  • EN>FR

  • EN>DE


transLectures: video demo

http://www.translectures.eu/new-demo-video/


Massive adaptation

• Characteristics of video lectures

  • Just one person

  • Known speaker

  • Clear talking

  • No interruptions

  • Focused on a topic

  • Slides

Massive adaptation

• Known speaker and topic

• Slides

• Related documents


Scientific evaluations (Y2)

• Transcription results

  • WER: Word Error Rate (%)

  • Goal: WER < 25%

  • EN, SL, ES

[Chart: transcription results in WER; higher is worse, lower is better]
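WER is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between an automatic transcription and a reference transcription, divided by the number of reference words. The following is a minimal Python sketch of that standard definition, not the project's own scoring tool; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent, using whitespace tokenization (an assumption)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One substituted word out of four reference words gives a WER of 25%.
print(word_error_rate("a b c d", "a x c d"))
```

Under this definition, the goal of WER < 25% corresponds to fewer than one word error for every four reference words.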

Scientific evaluations (Y2)

• Translation results

  • BLEU

  • Goal: BLEU > 30

  • EN>SL, SL>EN

  • EN>ES, ES>EN

  • EN>FR

  • EN>DE

[Chart: translation results in BLEU; higher is better, lower is worse]
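BLEU scores a translation by its n-gram overlap with a reference translation, combined with a brevity penalty for output that is too short. The sketch below is a simplified Python illustration (single reference, n-grams up to length 4, no smoothing, whitespace tokenization); these simplifications and the name `corpus_bleu` are assumptions, and the scoring scripts used in the project may tokenize and score differently.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Simplified corpus-level BLEU on a 0-100 scale, one reference per sentence."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0

    for reference, hypothesis in zip(references, hypotheses):
        ref, hyp = reference.split(), hypothesis.split()
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            ref_counts = ngram_counts(ref, n)
            hyp_counts = ngram_counts(hyp, n)
            # Clipped matches: an n-gram is credited at most as often
            # as it occurs in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    # Without smoothing, any n-gram order with zero matches gives BLEU = 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


print(corpus_bleu(["a b c d e"], ["a b c d f"]))
```

On this 0-100 scale, the goal of BLEU > 30 is a threshold on the geometric mean of the n-gram precisions after the brevity penalty is applied.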

Y1 results and comparison

[Charts: Y1 results compared with Org. X]

Org. X = Undisclosed, state-of-the-art ASR & MT systems

Intelligent interaction

• Postediting automatic transcriptions/translations

  • The user invests the least possible effort

  • The system learns the most from it

• Confidence measures

• Fast constrained search
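One common way confidence measures can reduce postediting effort is to ask the user to check only the words the recognizer is least sure about. The Python sketch below illustrates that idea under stated assumptions; it is not the transLectures implementation. The function name `words_to_supervise`, the `max_words` budget and the confidence values in the example are all hypothetical.

```python
from typing import List, Tuple


def words_to_supervise(words: List[Tuple[str, float]], max_words: int) -> List[int]:
    """Return the positions of the least confident words, in reading order.

    `words` is a list of (word, confidence) pairs with confidence in [0, 1];
    `max_words` is a hypothetical budget of words the user is willing to check.
    """
    by_confidence = sorted(range(len(words)), key=lambda i: words[i][1])
    return sorted(by_confidence[:max_words])


# Hypothetical recognizer output with made-up confidence scores.
transcript = [
    ("the", 0.98), ("whether", 0.41), ("is", 0.95),
    ("nice", 0.90), ("to", 0.35), ("day", 0.88),
]
to_check = words_to_supervise(transcript, max_words=2)
print([transcript[i][0] for i in to_check])
```

In this toy example the user would be shown only "whether" and "to" rather than the full transcript.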


Intelligent interaction

• User evaluations

  • UPV (Polimedia)

  • JSI (Videolectures.NET)


User evaluations

• User evaluations at UPV

  • Users: lecturers

  • Revising their own lectures

  • 3 different experiments

    1. Complete supervision

    2. Intelligent interaction

    3. Two-round supervision

User evaluations

[Figures: 1. Complete supervision; 2. Intelligent interaction; 3. Two-round supervision]

User evaluations

• User evaluations at UPV: results



Quality control: expert evaluations

• Transcription quality (EN, ES, SL)

  • UPV: Representative set of Spanish transcriptions

  • Avg. WER: 23.2; Avg. RTF: 3.8

• Translation quality (EN<>SL, EN<>ES, EN>FR, EN>DE)

  • UPV: Representative set of Spanish>English translations

  • Avg. BLEU: 46.6; Avg. RTF: 14.8; Avg. Score: 3.6 out of 5
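The slides do not spell out RTF. Assuming it denotes a real-time factor, that is, the time an expert spends revising divided by the duration of the video, the figures above can be read with the small Python sketch below. The function name `real_time_factor` and the example durations are hypothetical.

```python
def real_time_factor(review_seconds: float, video_seconds: float) -> float:
    """Time spent revising divided by video duration (assumed meaning of RTF)."""
    if video_seconds <= 0:
        raise ValueError("video duration must be positive")
    return review_seconds / video_seconds


# Hypothetical example: a 10-minute video whose transcription takes
# 38 minutes to revise has an RTF of 3.8.
rtf = real_time_factor(review_seconds=38 * 60, video_seconds=10 * 60)
print(rtf)
```

Under this reading, an average RTF of 3.8 for transcriptions would mean roughly 3.8 minutes of revision per minute of video, and 14.8 for translations roughly 14.8 minutes per minute of video.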


Implementation and integration

• Videolectures.NET

• Polimedia

• Opencast Matterhorn


transLectures: Open source tools

• The tL player (& editor)

  • Coming soon (www.translectures.eu)

• The transLectures-UPV Toolkit (TLK) for ASR

  • www.translectures.eu/tlk

• RWTH Aachen: rASR, Jane (MT)

  • http://www-i6.informatik.rwth-aachen.de/web/Software/


Next steps for transLectures

• Keep improving ASR and MT results

• Keep improving tL open source tools (TLK, tL player)

• External user evaluations (VL.NET and Polimedia)

• External trials: implementation in other universities


• More detailed info in the public project deliverables, soon available from (M12 = Year 1; M24 = Year 2, most recent results):

  • http://www.translectures.eu/progress/

  • http://www.translectures.eu/public-reports/

• More tL video demos: http://www.translectures.eu/progress/

• Follow transLectures:

  • http://twitter.com/translectures

  • http://www.facebook.com/translectures

  • http://www.slideshare.net/transLectures


www.translectures.eu

• About this presentation:

Gonçal Garcés Díaz-Munío ggarces@dsic.upv.es

• Project coordinator:

Alfons Juan-Ciscar ajuan@dsic.upv.es

EC FP7 ICT Programme – Project Number 287755
