Ongoing Developments in Spoken Language Translation

Mark Seligman, Ph.D. Spoken Translation, Inc.

[email protected] SpeechTrans, Inc.

Spoken Translation - TAUS Tokyo Forum 2015

Introduction

• SPEECH TRANSLATION:

  • SpeechTrans, Inc. – InterprePhone

    • Streaming ASR

    • Shows text

    • Can add people to call

    • Can dial into call, any language

    • New languages (Hindi …)

  • Spoken Translation, Inc. – Converser 4.0: now via Web

    – Will scale via SMT

  • Pangeanic – Rapid prototyping

    – Security, privacy

• OTHER:

  • MT evaluation

    – Via back-translation

  • Recognizing semantic patterns

    – Via semantically smoothed regex

  • Language-teaching technology

    • Parallel text, etc.

  • Grounded semantics

    • Sydney Lamb’s work

SpeechTrans, Inc.

• InterprePhone

  – Shows recognized text

  – Managing calls

    – Can add people to call

    – Can dial into call

  – New languages

    – Hindi …

Pangeanic, Inc.

• Rapid prototyping

– Healthcare domain

• Security, privacy

MT Evaluation via Back-translation

• ReversEval

• Usually

  • Language A → Language B

  • Compute distance: target vs. gold standard

• But we do

  • Language A → Language B → Language A

  • Compute distance: source vs. back-translation

• Structured test corpus, e.g. grammatically balanced

• Sort corpus: highlight weakest, strongest

• Compare grammar areas or whole corpora

| Section | num | Grammar area | | English | Spanish |
|---|---|---|---|---|---|
| 2_57 | 203 | Directives | I | Search the room carefully! | ¡Registra el cuarto cuidadosamente! |
| 2_57 | 204 | Directives | I | Make yourself a cup of tea. | Hazte una taza de té. |
| 2_57 | 205 | Directives | I | Don't hurry! | ¡No te apresures! |
| 2_57 | 206 | Directives | I | Don't be frightened! | ¡No tengas miedo! |
| 2_57 | 207 | Directives | I | Don't wait for me! | ¡No me esperes! |
| 2_57 | 208 | Exclamations | X | What beautiful clothes she wears! | ¡Qué ropa tan bella ella lleva puesta! |
| 2_57 | 209 | Exclamations | X | How well Philip plays the piano! | ¡Qué bien toca el piano Philip! |
| 2_59 | 210 | Cleft Sentence | D | It's Julie that buys her vegetables at the market. | Es Julie quien compra sus verduras en el mercado. |
| 2_59 | 211 | Cleft Sentence | D | It's Julie who buys her vegetables at the market. | Es Julie quien compra sus verduras en el mercado. |
| 2_59 | 212 | Cleft Sentence | D | It's her vegetables that Julie buys at the market. | Son sus verduras lo que Julie compra en el mercado. |
| 2_59 | 213 | Cleft Sentence | D | It's at the market that Julie buys her vegetables. | Es en el mercado que Julie compra sus verduras. |
| 2_59 | 214 | Fronting | D | Her vegetables Julie buys at the market. | Sus verduras, Julie las compra en el mercado. |
| 2_59 | 215 | Fronting | D | At the market Julie buys her vegetables. | En el mercado Julie compra sus verduras. |
| 2_59 | 216 | Wh-forms | D | What you say doesn't matter. | Lo que tú dices no tiene importancia. |
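
The round-trip procedure in the bullets above (A → B → A, distance between source and back-translation, then sorting and comparing grammar areas) could be sketched roughly as follows. This is only an illustration, not ReversEval itself: the function names, the `translate(text, src, tgt)` signature, and the use of `difflib.SequenceMatcher` as the distance are all assumptions, and `toy_translate` is a hypothetical lookup-table stand-in for a real MT engine. Its forward entries are taken from the corpus sample above; its lossy reverse entry is invented purely to produce a nonzero distance.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from statistics import mean


def round_trip(sentence, translate, src="en", tgt="es"):
    """Language A -> Language B -> Language A."""
    forward = translate(sentence, src, tgt)
    return translate(forward, tgt, src)


def distance(source, back):
    """Word-level distance in [0, 1]; 0.0 means a lossless round trip.

    Assumption: any string distance could be substituted here.
    """
    a, b = source.lower().split(), back.lower().split()
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def evaluate(corpus, translate):
    """corpus: iterable of (grammar_area, sentence). Returns rows, weakest first."""
    rows = []
    for area, sentence in corpus:
        back = round_trip(sentence, translate)
        rows.append({"area": area, "source": sentence, "back": back,
                     "distance": distance(sentence, back)})
    return sorted(rows, key=lambda r: r["distance"], reverse=True)


def mean_distance_by_area(rows):
    """Average distance per grammar area, for comparing areas."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["area"]].append(row["distance"])
    return {area: mean(values) for area, values in groups.items()}


# Hypothetical stand-in for an MT engine (NOT real engine output).
_TOY_TABLE = {
    ("Don't hurry!", "en", "es"): "¡No te apresures!",
    ("¡No te apresures!", "es", "en"): "Don't hurry!",
    ("How well Philip plays the piano!", "en", "es"):
        "¡Qué bien toca el piano Philip!",
    # Invented lossy reverse entry, only to show a nonzero distance.
    ("¡Qué bien toca el piano Philip!", "es", "en"):
        "How well plays the piano Philip!",
}


def toy_translate(text, src, tgt):
    return _TOY_TABLE[(text, src, tgt)]


corpus = [("Directives", "Don't hurry!"),
          ("Exclamations", "How well Philip plays the piano!")]
rows = evaluate(corpus, toy_translate)
for row in rows:
    print(f'{row["distance"]:.2f}  {row["area"]:<12}  '
          f'{row["source"]}  ->  {row["back"]}')
print(mean_distance_by_area(rows))
```

Because only `translate` touches the engine, swapping in a different engine or language pair would not change the rest of the sketch, which is in the spirit of "hook up MT engine, good to go".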

MT Evaluation via Back-translation (2)

MT Evaluation via Back-translation (3)

• Advantages

  • No gold-standard translations required

    • Hook up MT engine, good to go

  • Probe grammatical strong, weak points

    • Rule-based: feedback for manual fixes

    • SMT: guide content of training corpus

  • Subjective evaluation facilitated

    • View (back-)translations side-by-side, in grammatical groupings

MT Evaluation via Back-translation (4)

| Section | num | Original Sentence | v.4.3 1/23/05 Backtranslations | v 4.10 1/25/05 Backtranslations | Regression-type | 1.25 vs original | 1.25 vs original - Type |
|---|---|---|---|---|---|---|---|
| 2_43 | 117 | John works hard. | John works hard. | John works hard. | equal | 0 | equal |
| 2_43 | 118 | John is working hard. | John works hard. | John works hard. | equal | 0.33 | added: works |
| 2_43 | 119 | John is tall. | John is tall. | John is tall. | equal | 0 | equal |
| 2_43 | 120 | Marion is beautiful. | Marion is beautiful. | Marion is beautiful. | equal | 0 | equal |
| 2_43 | 121 | Marion dances beautifully. | Marion dances beautifully. | Marion dances beautifully. | equal | 0 | equal |
| 2_43 | 122 | He is being a nuisance again. | He is a bother again. | He is a bother again. | equal | 0.2 | added: bother |
| 2_43 | 123 | He is being naughty again. | He is mischievous again. | He is mischievous again. | equal | 0.25 | added: mischievous |
| 2_44 | 124 | John searched the big room and then the small room. | John registered the big room and next the small room. | John registered the big room and next the small room. | equal | 0.2 | added: registered next |
| 2_44 | 125 | John searched the big room and then the small one. | John registered the big room and next the child. | John registered the big room and next what's little. | added: what's little | 0.44 | added: registered next what's little |
| 2_44 | 126 | The man invited the little Swedish girl because he liked her. | The man invited the Swedish little girl because to him liked him she. | The man invited the Swedish small girl because to him liked him she. | added: small | 0.38 | added: small to him him she |
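
The slides do not say how the scores in the table above are computed. One reading that is consistent with the figures shown is: score = (words in the back-translation that do not occur in the original) / (words in the back-translation), with the "added:" columns listing those words. The sketch below implements that inferred reading only; the function names and the tokenization are assumptions, and the actual tool may work differently.

```python
import re


def tokens(text):
    """Lowercased word tokens; apostrophes kept so "what's" stays one token."""
    return re.findall(r"[\w'’]+", text.lower())


def added_words(reference, candidate):
    """Words of `candidate` that do not occur anywhere in `reference`."""
    ref = set(tokens(reference))
    return [w for w in tokens(candidate) if w not in ref]


def added_ratio(original, back):
    """Share of back-translation words absent from the original (inferred metric)."""
    back_tokens = tokens(back)
    if not back_tokens:
        return 0.0
    return round(len(added_words(original, back)) / len(back_tokens), 2)


def diff_type(reference, candidate):
    """'equal' or 'added: ...'. Dropped words are not reported, since the
    table above only ever lists additions (an assumption about the tool)."""
    added = added_words(reference, candidate)
    return "added: " + " ".join(added) if added else "equal"


# Row 125 from the table above.
original = "John searched the big room and then the small one."
back_old = "John registered the big room and next the child."
back_new = "John registered the big room and next what's little."

print(diff_type(back_old, back_new))   # regression type between versions
print(added_ratio(original, back_new)) # newer version vs. original
print(diff_type(original, back_new))   # type of difference vs. original
```

Comparing each version's back-translation against the previous version, as well as against the original, is what lets the table flag regressions between engine releases.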

JisstMatch: Semantically smoothed regex

• Linguistic patterns, based on regular expression pattern matching

• Discover, categorize relevant tweets, etc.

  – “I’m planning to buy a car.”

• Patterns augmented with semantic symbols

  – Synonym sets

    • e.g. love_v# = [love|adore|BE crazy about ...]

  – Classes and subclasses

    • e.g. car_brands& = [Toyota|GM|Chevrolet ...]

  – Elements from interface

    • e.g. subject! or object!
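
One way to picture "semantically smoothed" patterns is as a preprocessing step that expands the semantic symbols into ordinary regex alternations before matching. The sketch below is a guess at the general idea, not JisstMatch's implementation. The symbol suffixes (`#`, `&`, `!`) and the sample entries come from the slide; the list of BE forms, the `bindings` argument, and all function names are assumptions.

```python
import re

# Entries from the slide; the real lexicon is presumably much larger.
SYNONYM_SETS = {"love_v#": ["love", "adore", "BE crazy about"]}
CLASSES = {"car_brands&": ["Toyota", "GM", "Chevrolet"]}
# Assumption: an uppercase lemma such as BE stands for its inflected forms.
LEMMA_FORMS = {"BE": ["am", "is", "are", "was", "were", "be", "been", "being"]}


def _alternation(options, bindings):
    return "(?:" + "|".join(_phrase(o, bindings) for o in options) + ")"


def _token(tok, bindings):
    if tok in bindings:                      # interface element, e.g. object!
        return _phrase(bindings[tok], bindings)
    for table in (SYNONYM_SETS, CLASSES, LEMMA_FORMS):
        if tok in table:                     # semantic symbol: expand it
            return _alternation(table[tok], bindings)
    return re.escape(tok)                    # plain literal word


def _phrase(text, bindings):
    return r"\s+".join(_token(t, bindings) for t in text.split())


def compile_pattern(pattern, bindings=None):
    """Expand semantic symbols, then compile to an ordinary regex."""
    body = _phrase(pattern, bindings or {})
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


# "object!" is filled in from the (hypothetical) interface at query time.
matcher = compile_pattern("I love_v# my object!", {"object!": "car_brands&"})
for text in ["I adore my Toyota.",
             "I am crazy about my Chevrolet",
             "I love my bike"]:
    print(text, "->", bool(matcher.search(text)))
```

Keeping the symbol tables separate from the patterns means one pattern covers every synonym and class member, so vocabulary can grow without rewriting patterns.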

Caught!

WordNet Linkup

Language Teaching Technology

• E.g. via parallel text

– Linked

– Translations

– Transcriptions

– TTS (both sides)

– Dictionaries

– Grammar points

Grounded Semantics

Associative Area

• SpeechTrans.com

• facebook.com/SpeechTransInc

• twitter.com/SpeechTrans

• youtube.com/SpeechTrans

• spokentranslation.com

Contact

John Frei

[email protected]

Yan Auerbach

[email protected]

Mark Seligman

[email protected]

Sendoff