High-quality Speech Translation
for Language Learning
Chao Wang and Stephanie Seneff
June 24, 2004
Spoken Language Systems Group
MIT Computer Science and Artificial Intelligence Lab
Outline
• Motivation and introduction
• Component technologies
– Language understanding
– Language generation
• Translation by generation
• Translation by example
• Evaluation
• Summary and future work
Background
• Language teachers have limited time to interact with students in dialogue exchanges
• Computers can provide a non-threatening environment in which to practice communicating
• Our group has been developing multi-lingual spoken conversational systems since 1990
– Concentrating on domains related to travel
– Can easily be adapted for language learning applications
– A translation capability from the native language (L1) to the target language (L2) can greatly improve their usability for language learning
Introduction
• Goal: provide translation aids for language learning
– Must be high quality
– Must be robust to speech recognition errors
• Strategies for achieving high quality and robustness
– Interlingua-based translation using formal generation rules
– Restricted conversational domains (lesson plans)
* Emphasis on mechanisms to enable rapid porting to new domains and languages
– Use parsability to assess quality of translation outputs
– Back off to example-based method when parse fails
Language Understanding: TINA
Approach: context-free rules + constraints + probabilities
Rules:
– Define permissible linguistic patterns in the language and domain
– Encode both syntactic and semantic information
Constraints:
– Eliminate patterns that violate known syntactic/semantic restrictions (e.g., number agreement)
– Account for movement of constituents in surface realization
Probabilities:
– Support prediction of next word given preceding context
TINA has been used in many systems over the last 10 years:
– Domains: weather, air travel, restaurant guide, hotel reservations, urban navigation, ...
– Languages: English, Mandarin, Japanese, Spanish, French, ...
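The combination of rules, constraints, and probabilities can be illustrated with a deliberately tiny sketch. It is not TINA's rule format or its probability model: the grammar, the lexicon, the rule probabilities, and the function name `parse` are all assumptions made for illustration, and rule probabilities stand in here for TINA's next-word prediction.

```python
# Toy illustration only: NOT TINA's rule format or probability model.
# Context-free rules, each with an assumed probability.
RULES = {
    "NP": [(("DET", "N"), 0.7), (("N",), 0.3)],
    "VP": [(("V",), 1.0)],
}

# Hypothetical lexicon: word -> (category, number feature).
LEXICON = {
    "the": ("DET", None),
    "flight": ("N", "sg"),
    "flights": ("N", "pl"),
    "leaves": ("V", "sg"),
    "leave": ("V", "pl"),
}


def parse(sentence):
    """Return a parse with its probability, or None if rules or constraints reject it."""
    words = sentence.lower().split()
    if any(word not in LEXICON for word in words):
        return None
    tags = [LEXICON[word] for word in words]
    cats = tuple(cat for cat, _ in tags)
    for np_pattern, np_prob in RULES["NP"]:
        for vp_pattern, vp_prob in RULES["VP"]:
            # Rule S -> NP VP: the category sequence must match exactly.
            if cats != np_pattern + vp_pattern:
                continue
            split = len(np_pattern)
            noun_number = tags[split - 1][1]  # the noun ends the NP
            verb_number = tags[split][1]      # the verb starts the VP
            # Constraint: subject and verb must agree in number.
            if noun_number != verb_number:
                return None
            return {"np": words[:split], "vp": words[split:],
                    "prob": np_prob * vp_prob}
    return None
```

In this sketch "the flight leaves" is accepted, while "the flights leaves" matches the rules but is eliminated by the number-agreement constraint.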
Process to Automate Grammar Development
• Merge several grammars into shared rules, predominantly syntax-based
• Once generic grammar is available, creating derivative domain-dependent grammars is straightforward
[Figure: Mercury, Orion, Voyager, Jupiter, and Pegasus grammars → merged “seed” grammar → “scrubbed” sentences (e.g., “Are there any <noun> from <proper_name> to <proper_name>”) → generic grammar; generic grammar + domain-dependent semantics → grammar for new domain]
Example Parse Tree
• Utilizes pre-existing sub-grammars for time and location
• Selected parse categories contribute to a hierarchical semantic frame (interlingua)
[Figure: parse tree for “will it rain in boston this weekend”, with nodes sentence, question, subject, predicate, intr_verb_phrase, intr_verb, intr_verb_args, locative, a_city, city_name, temporal, day_list, weekend]
Semantic Frame for Example
Semantic frame encodes syntactic structure and features in addition to semantic information
{c verify
   :aux "will"
   :subject "it"
   :pred {p rain
      :pred {p locative
         :prep "in"
         :topic {q city
            :name "boston" } }
      :pred {p temporal
         :topic {q weekday
            :quantifier "this"
            :name "weekend" } } } }
Will it rain in boston this weekend?
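As a rough illustration of the hierarchy, the frame for this sentence can be transcribed into nested Python dictionaries. The encoding is an assumption of this sketch, not the system's internal format: `_type` and `_name` hold the frame's type letter and name, repeated `:pred` keys become a list, and `find_frame` is a hypothetical helper.

```python
# The slide's frame, transcribed into plain Python data (encoding is assumed).
frame = {
    "_type": "c", "_name": "verify",
    "aux": "will",
    "subject": "it",
    "pred": [{
        "_type": "p", "_name": "rain",
        "pred": [
            {"_type": "p", "_name": "locative", "prep": "in",
             "topic": {"_type": "q", "_name": "city", "name": "boston"}},
            {"_type": "p", "_name": "temporal",
             "topic": {"_type": "q", "_name": "weekday",
                       "quantifier": "this", "name": "weekend"}},
        ],
    }],
}


def find_frame(node, frame_name):
    """Depth-first search for the first sub-frame with the given frame name."""
    if isinstance(node, dict):
        if node.get("_name") == frame_name:
            return node
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None  # plain strings carry no sub-frames
    for child in children:
        found = find_frame(child, frame_name)
        if found is not None:
            return found
    return None
```

For example, `find_frame(frame, "city")["name"]` gives the city mentioned in the sentence, however deeply the locative predicate is nested.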
Language Generation: GENESIS
• Generates a surface string from the semantic frame
• Accomplishes many tasks in dialogue system development
– In the same language (paraphrasing & response generation)
– In a different language (translation)
– Other formal languages (key-value pairs, SQL queries, etc.)
• Utilizes recursive formal rules along with a lexicon encoding appropriate surface form realizations in context
Challenges in Cross-language Generation for Translation
• Some expressions have very different syntactic structures in different languages
– What is your name? / 你 (you) 叫 (call) 什么 (what) 名字 (name)?
– Where is a bank nearby? / 附近 (vicinity) 哪儿 (where) 有 (have) 银行 (bank)?
– I like her. / Ella me gusta.
• Syntactic features are expressed in many different ways
– Determiners (English but not Chinese)
– Particles (Chinese but not English)
* that hotel / 那 (that) 家 (<particle>) 旅馆 (hotel)
* I lost my key. / 我 (I) 丢 (lose) 了 (<past tense>) 我的 (my) 钥匙 (key).
– Gender (extensive in Spanish)
Generation Procedures
• Constituent order specified in recursive rules
– “Pull” and “Push” mechanisms support major structural reorganization
• Lexical selection controlled by feature propagation
– Inflectional forms based on syntactic features
– Lexical realization (word sense) influenced by surrounding semantic context
• Infers missing features
• Can generate multiple surface strings for the same semantic frame
A Generation Example
{c verify
   :aux "will"
   :subject "it"
   :pred {p rain
      :pred {p locative
         :prep "in"
         :topic {q city
            :name "boston" } }
      :pred {p temporal
         :topic {q weekday
            :quantifier "this"
            :name "weekend" } } } }
Two surface strings generated from this frame:
bo1 shi4 dun4 zhe4 zhou4 mo4 hui4 bu2 hui4 xia4 yu3 ?
( Boston this weekend will-not-will rain ? )
zhe4 zhou4 mo4 bo1 shi4 dun4 hui4 xia4 yu3 ma5 ?
( this weekend Boston will rain <question-particle> ? )
– Location and time expressions are pulled to the front
– “will” conditioned by “verify”
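The two effects noted on this slide (fronting of location and time, and a question form that depends on the clause type) can be mimicked in a few lines. This is a toy sketch, not GENESIS rule syntax: the flattened frame, the lexicon, and the function name `generate` are assumptions; only the pinyin strings come from the slide.

```python
# Hypothetical lexicon; the tone-numbered pinyin is copied from the slide.
LEXICON = {
    "boston": "bo1 shi4 dun4",
    "this weekend": "zhe4 zhou4 mo4",
    "rain": "xia4 yu3",
    "will": "hui4",
}


def generate(frame):
    """Return two Mandarin question forms for a flattened 'verify' frame."""
    if frame["clause"] != "verify":
        raise ValueError("this sketch only handles yes/no questions")
    place = LEXICON[frame["city"]]
    time = LEXICON[frame["time"]]
    verb = LEXICON[frame["pred"]]
    aux = LEXICON[frame["aux"]]
    # Location and time are pulled in front of the auxiliary and verb.
    # Variant 1: "verify" realizes the auxiliary as an A-not-A question.
    a_not_a = f"{place} {time} {aux} bu2 {aux} {verb} ?"
    # Variant 2: plain auxiliary plus the sentence-final question particle.
    with_particle = f"{time} {place} {aux} {verb} ma5 ?"
    return [a_not_a, with_particle]


flat_frame = {"clause": "verify", "aux": "will", "pred": "rain",
              "city": "boston", "time": "this weekend"}
```

Calling `generate(flat_frame)` reproduces the two strings shown on the slide, which also illustrates that one frame can yield multiple surface strings.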
Generation-based Translation
• Semantic frame serves as interlingua
• Translation achieved by parsing and generation
• Use Chinese grammar to detect potential problems
• Rejected sentences routed to example-based translation for a second chance
[Figure: English input → parse (English grammar) → semantic frame → generate (Chinese rules) → Chinese sentence → parse? (Chinese grammar); accepted → Chinese output; rejected → example-based translation]
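The control flow on this slide can be sketched as a single function. All component functions are passed in as parameters and are hypothetical stand-ins for the English parser, the Chinese generator, the Chinese grammar check, and the example-based fallback; the `demo_*` stubs exist only to make the sketch runnable. Returning `None` for unparsable English input is also an assumption of this sketch.

```python
from typing import Callable, Optional

Frame = dict  # stand-in for a semantic frame


def translate(
    english: str,
    parse_english: Callable[[str], Optional[Frame]],
    generate_chinese: Callable[[Frame], Optional[str]],
    chinese_parses: Callable[[str], bool],
    translate_by_example: Callable[[Frame], Optional[str]],
) -> Optional[str]:
    """Generation-based translation with an example-based second chance."""
    frame = parse_english(english)
    if frame is None:
        return None  # nothing to translate from (assumed behavior)
    candidate = generate_chinese(frame)
    # Parsability in the target grammar is the quality check.
    if candidate is not None and chinese_parses(candidate):
        return candidate
    # Rejected sentences are routed to the example-based method.
    return translate_by_example(frame)


# Stub components, for demonstration only.
def demo_parse(english):
    return {"weather": "rain"} if "rain" in english.lower() else None


def demo_generate(frame):
    return "hui4 xia4 yu3 ma5 ?"


def demo_example(frame):
    return "hui4 bu2 hui4 xia4 yu3 ?"
```

With an accepting check such as `lambda s: True` the generated sentence is returned; with a rejecting check the example-based output is returned instead.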
Example-based Translation
• Requires translation pairs and a retrieval mechanism
– Corpus automatically obtained via the generation-based approach
– Retrieval based on lean semantic information
* Encoded as key-value pairs
* Obtained from semantic frame via simple generation rules
* Generalizes words to classes (e.g., city name, weekday, etc.) to overcome data sparseness
WEATHER: rain CITY: San Francisco
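A minimal sketch of how such a key-value string might be derived and generalized follows. The flattened frame, the `CITY_NAMES` word class, and both function names are assumptions for illustration; the real system obtains the pairs from the semantic frame via generation rules.

```python
# Hypothetical word class used for generalization.
CITY_NAMES = {"san francisco", "boston"}


def frame_to_pairs(frame):
    """Keep only the lean semantic content of a (flattened) frame."""
    pairs = []
    for key in ("weather", "city"):
        if key in frame:
            pairs.append((key.upper(), frame[key]))
    return pairs


def generalize(pairs):
    """Mask word-class members; return (key-value string, bindings)."""
    bindings = {}
    parts = []
    for key, value in pairs:
        if key == "CITY" and value.lower() in CITY_NAMES:
            bindings["<CITY>"] = value  # remembered for later recovery
            value = "<CITY>"
        parts.append(f"{key}: {value}")
    return " ".join(parts), bindings


pairs = frame_to_pairs({"weather": "rain", "city": "San Francisco"})
kv_string, bindings = generalize(pairs)
```

Because the city is replaced by its class, a query about any known city maps to the same retrieval key, which is how the class generalization addresses data sparseness.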
Example-based Translation Procedure
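Is there any chance of rain in San Francisco?
City masked during retrieval: { <CITY> : San Francisco }
Retrieved translation: <CITY> hui4 bu2 hui4 xia4 yu3?
City translated: { <CITY> : jiu4 jin1 shan1 }
Final output: jiu4 jin1 shan1 hui4 bu2 hui4 xia4 yu3?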
• Key-value string serves as interlingua
• Translation achieved by parsing and table lookup
• City name masked during retrieval and recovered in final surface string
[Figure: English input → parser (English grammar) → semantic frame → generator (key-value rules) → KV string → KV-Chinese table → Chinese output]
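The masking, lookup, and recovery steps can be sketched as follows. The two tables hold only the single example from this slide; their names, the pair-list input format, and the function name are assumptions of this sketch.

```python
# One-entry tables holding the slide's example (names are hypothetical).
KV_TO_CHINESE = {
    "WEATHER: rain CITY: <CITY>": "<CITY> hui4 bu2 hui4 xia4 yu3?",
}
CITY_TRANSLATIONS = {"San Francisco": "jiu4 jin1 shan1"}


def translate_by_example(kv_pairs):
    """Look up a masked key-value string, then restore the city name."""
    city = None
    parts = []
    for key, value in kv_pairs:
        if key == "CITY":
            city = value          # remember the actual city
            value = "<CITY>"      # mask it for retrieval
        parts.append(f"{key}: {value}")
    template = KV_TO_CHINESE.get(" ".join(parts))
    if template is None:
        return None               # no example with this meaning
    if city is not None:
        if city not in CITY_TRANSLATIONS:
            return None           # cannot recover the city in Chinese
        template = template.replace("<CITY>", CITY_TRANSLATIONS[city])
    return template
```

For the pairs `[("WEATHER", "rain"), ("CITY", "San Francisco")]` this returns the final surface string shown on the slide.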
Complete Translation Procedure
• Only parsed sentences go into key-value database
• Indexed by semantic information encoded as key-value string
• Unparsed translations replaced by key-value option
• Use word classes to overcome data sparseness
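[Figure, creation: English input → parse (English grammar) → semantic frame → generate (Chinese rules) → Chinese sentence → parses? (Chinese grammar); if yes, the sentence is indexed in the key-value index database under the string produced by the key-value rules. Example: “will it rain in Boston tomorrow?” / “bo1 shi4 dun4 ming2 tian1 hui4 xia4 yu3 ma5?” is indexed under “WEATHER: rain CITY: boston”, with the city generalized to <CITY>]
[Figure, retrieval: if the generated Chinese sentence parses, it is the translation; if not, the key-value index database is consulted]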
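The creation side of this procedure can be sketched as a filter over candidate translations. The input format (pairs of masked key-value string and masked Chinese sentence), the stub grammar check, and the policy of keeping the first accepted sentence per key are all assumptions of this sketch.

```python
def build_index(candidates, chinese_parses):
    """Index generated translations by key-value string.

    candidates: iterable of (masked key-value string, masked Chinese sentence).
    Only sentences accepted by the Chinese grammar check are stored.
    """
    index = {}
    for key, chinese in candidates:
        if chinese_parses(chinese):
            # Keep the first accepted translation for each key (assumed policy).
            index.setdefault(key, chinese)
    return index


def demo_chinese_parses(sentence):
    """Stub grammar check for demonstration: accept well-terminated questions."""
    return sentence.endswith("ma5?")


candidates = [
    ("WEATHER: rain CITY: <CITY>", "<CITY> ming2 tian1 hui4 xia4 yu3"),
    ("WEATHER: rain CITY: <CITY>", "<CITY> ming2 tian1 hui4 xia4 yu3 ma5?"),
    ("WEATHER: snow CITY: <CITY>", "<CITY> xia4 xue3"),
]
index = build_index(candidates, demo_chinese_parses)
```

Here only the second candidate survives the check, so every entry in the index is covered by the (stub) Chinese grammar.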
Evaluation: English to Mandarin, Weather Domain
• Evaluation data
– Drawn from the publicly available Jupiter weather system
– Telephone recordings; conversational speech
– Unparsable utterances (English grammar) were excluded
– Total of 695 utterances, with 6.5 words per utterance on average
• System configuration
– Text input or speech input
* Recognizer achieved 6.9% word error rate and 19.0% sentence error rate
– Generation-based method preferred over example-based method
– NULL output if both failed
• Evaluation criteria
– Yield of each translation method
– Human judgment of translation quality
Evaluation Results (I)
• Majority of the utterances are successfully translated using formal generation rules, which are likely to achieve high fidelity and quality
• A greater percentage of the utterances fail in the speech mode, due to recognition errors
– System will apologize for not understanding the utterance and invite the user to try again
Yield            Text            Speech
By generation    606   87.2%     592   85.2%
By example        59    8.5%      48    6.9%
Failed            30    4.3%      55    7.9%
Total            695   100%      695   100%
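The percentages in the yield table follow directly from the counts, as this short check shows. The counts are copied from the table; the variable and function names are only for illustration.

```python
TOTAL = 695  # utterances in the evaluation set

# Counts copied from the yield table.
YIELD = {
    "text": {"generation": 606, "example": 59, "failed": 30},
    "speech": {"generation": 592, "example": 48, "failed": 55},
}


def percentages(counts, total=TOTAL):
    """Convert raw counts to percentages rounded to one decimal place."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


for mode, counts in YIELD.items():
    assert sum(counts.values()) == TOTAL  # every utterance is accounted for
    print(mode, percentages(counts))
```

The rounded values match the table in both input modes, and each column sums to the 695 utterances.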
Evaluation Results (II)
• Human judgment of translation quality based on grammaticality and fidelity
• Three categories: perfect, acceptable, or wrong
• Fewer than 2% of the utterances produce incorrect translation outputs
– A concurrent English paraphrase provides context for the Chinese translation
Quality       Text            Speech
Perfect       613   88.2%     577   83.0%
Acceptable     43    6.2%      50    7.2%
Wrong           9    1.3%      13    1.9%
Failed         30    4.3%      55    7.9%
Total         695   100%      695   100%
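The quality figures can be checked the same way, including the claim that fewer than 2% of utterances yield a wrong translation. The counts are copied from the table; the names and the combined "usable" figure (perfect plus acceptable) are additions of this sketch.

```python
TOTAL = 695  # utterances in the evaluation set

# Counts copied from the quality table.
QUALITY = {
    "text": {"perfect": 613, "acceptable": 43, "wrong": 9, "failed": 30},
    "speech": {"perfect": 577, "acceptable": 50, "wrong": 13, "failed": 55},
}


def rate(count, total=TOTAL):
    """Percentage of all utterances, rounded to one decimal place."""
    return round(100 * count / total, 1)


def summarize(counts):
    """Return the wrong rate and the combined perfect-or-acceptable rate."""
    return {
        "wrong": rate(counts["wrong"]),
        "usable": rate(counts["perfect"] + counts["acceptable"]),
    }


for mode, counts in QUALITY.items():
    assert sum(counts.values()) == TOTAL
    assert summarize(counts)["wrong"] < 2.0  # the slide's "fewer than 2%" claim
    print(mode, summarize(counts))
```

Both input modes stay under the 2% threshold when wrong outputs are measured against all 695 utterances.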
Summary and Future Work
• We have demonstrated a capability to produce high-quality spoken-language translations from English to Mandarin
– Evaluation restricted to weather domain
– Fewer than 2% of the translations were incorrect
Future Plans:
• Integrate into spoken dialogue systems
• Incorporate framework into classroom environment
• Assess effectiveness in second-language acquisition
• Port to other domains and languages
– Develop tools to enable rapid porting
Translation Corpus
• Guaranteed coverage by the Chinese grammar
• Indexed by semantic information encoded as key-value string
• Use word classes to overcome data sparseness
[Figure: English input → parser (English grammar) → semantic frame → generator (Chinese rules) → Chinese sentence → parser (Chinese grammar) → accepted → Chinese output; in parallel, key-value rules produce a KV string that indexes the accepted sentence in the KV-Chinese table. Example: “will it rain in Boston tomorrow?” / “bo1 shi4 dun4 ming2 tian1 hui4 xia4 yu3 ma5?” is indexed under “WEATHER: rain CITY: boston”, with the city generalized to <CITY>]
Interlingua-based Speech Translation
Common meaning representation: semantic frame
[Figure: for both English and Chinese, recognition (SUMMIT) → NLU (TINA) → interlingua → NLG (GENESIS) → synthesis (ENVOICE); supporting resources shown are models, parsing rules, generation rules, and speech corpora]
Understanding and Generation: Procedural Strategy
• Develop end-to-end English system
– Solicit example utterances from SLS members
• Create generation rules for Chinese paraphrase
– Generated sentences become initial Chinese corpus
• Develop understanding component for Chinese input
– Map to identical semantic frame as much as possible
• Adjust English generation for Chinese inputs
– Deal with missing function words, etc.
– Translation loop now possible: English → Chinese → English
• Evaluation based on English-to-translated-English
• Similar strategy for other languages
Strategies for Translation
• Grammar design strategies
– Preserve as much information as necessary for accurate translation
* Semantic frames are much more detailed than those in human-computer interaction applications
– Maintain consistency of semantic frame representation across different languages whenever possible
* Seed grammar rules for each new language on English grammar rules
* Mapping from parse tree to semantic frame preserved
• Remaining language dependent aspects in semantic frame are addressed by generation rules
An Example: English/Chinese
How long does it take to take a taxi there
坐 出租车 去 那里 要 多久
( take taxi go there need how long )
• Function words disappear in Chinese
• Sentence structure is very different
• Verb “go” omitted in English
• Two instances of “take” have different translations
Intermediate forms shown on the slide:
How long take take taxi there
How long need take taxi there
How long need take taxi go there