MT with an Interlingua Lori Levin April 13, 2009


MT with an Interlingua

Lori Levin

April 13, 2009


Interlingua

• “An interlingua is a notation for representing the content of a text that abstracts away from the characteristics of the language itself and focuses on the meaning (semantics) alone.

• Interlinguas are typically used as pivot representations in machine translation, allowing the contents of a source text to be generated in many different target languages.

• Due to the complexities involved, few interlinguas are more than demonstration prototypes, and only one has been used in a commercial MT system.”

– Dorr, Hovy, Levin, Natural Language Processing and Machine Translation, Encyclopedia of Language and Linguistics, 2nd ed. (ELL2). Machine Translation: Interlingual Methods


KANT: If the error persists, service is required (Mitamura and Nyberg)

• (*BE-PREDICATE
    (attribute (*REQUIRED (degree positive)))
    (mood declarative)
    (predicate-role attribute)
    (punctuation period)
    (qualification (*QUALIFYING-EVENT
      (event (*PERSIST
        (argument-class theme)
        (mood declarative)
        (tense present)
        (theme (*ERROR (number (:OR mass singular)) (reference definite)))))
      (extent (*CONJ-if))
      (topic +)))
    (tense present)
    (theme (*SERVICE (number (:OR mass singular)) (reference no-reference))))


NESPOLE! (Levin et al.)

• “I want to know what time the flight leaves Pittsburgh.”

• “What time does the flight leave Pittsburgh?”

• request-information+departure (time = (clock = question), transportation-spec = (flight, id = yes), origin = name=Pittsburgh)


Not Just for MT anymore

[Diagram: an Interlingua, connected to Language Analysis and Language Synthesis, serving several applications]

• Machine Translation
• Cross-language Information Retrieval
• Cross-language Summarization
• Multilingual Question Answering


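Vauquois Triangle

[Diagram: the source side climbs from Source Text to the Interlingua at the apex; the target side descends to Target Text.]

Analysis (source side, bottom to top):
• Source Text → Morphological Analysis → Word Structure
• Word Structure → Syntactic Analysis → Syntactic Structure
• Syntactic Structure → Semantic Analysis → Semantic Structure
• Semantic Structure → Semantic Composition → Interlingua

Generation (target side, top to bottom):
• Interlingua → Semantic Decomposition → Semantic Structure
• Semantic Structure → Semantic Generation → Syntactic Structure
• Syntactic Structure → Syntactic Generation → Word Structure
• Word Structure → Morphological Generation → Target Text

Shortcuts across the triangle, from a source level to the matching target level:
• Direct (Word Structure to Word Structure)
• Syntactic Transfer (Syntactic Structure to Syntactic Structure)
• Semantic Transfer (Semantic Structure to Semantic Structure)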


Reasons for using an interlingua

• N² vs. 2N
– For all-ways translation between N languages, you need an analyzer (L to interlingua) and a synthesizer (interlingua to L) for each language: 2N components, instead of one system for each of the N(N−1) ordered language pairs (roughly N²).

• Monolingual development teams
– Each developer needs to know only his/her language and the interlingua.
– NESPOLE! project: Italian to Korean translation worked as well as Italian to English, even though nobody on the team was bilingual in Korean and Italian.
– Same may be true for SMT?
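
The N² vs. 2N comparison can be checked with a little arithmetic. The sketch below is illustrative only: the function names are made up for this example, and "direct" is assumed to mean one system per ordered language pair.

```python
def direct_systems(n: int) -> int:
    """Pairwise approach: one translation system per ordered language pair."""
    return n * (n - 1)


def interlingua_components(n: int) -> int:
    """Interlingua approach: one analyzer plus one synthesizer per language."""
    return 2 * n


if __name__ == "__main__":
    print(f"{'N':>3} {'direct':>7} {'interlingua':>12}")
    for n in (2, 3, 5, 10, 20):
        print(f"{n:>3} {direct_systems(n):>7} {interlingua_components(n):>12}")
```

Under these assumptions the two approaches break even at N = 3 (six components each); for 10 languages the count is 90 direct systems against 20 interlingua components.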


MT Divergences

• Translating word-by-word, node-by-node, or dependency-by-dependency does not work.
– Mi chiamo Lori – My name is Lori
– to be jealous – tener celos (to have jealousy)
– to kick – dar una patada (give a kick)
– to enter the house – entrar en la casa (enter in the house)
– to run in – entrar corriendo (enter running)
– meet someone/meet with someone
– decide/make a decision

• Which of these are handled well by phrase-based SMT or syntax-based SMT (with or without morphology – dar, doy, etc.)?
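
A toy illustration of why word-by-word translation breaks on these divergences, using the Spanish examples from the slide. The two dictionaries below are hand-written assumptions for this sketch, not the output of any real SMT system, and the greedy longest-match lookup only stands in for the idea of translating multi-word units as a whole.

```python
# Hypothetical single-word dictionary (Spanish -> English).
WORD_DICT = {
    "tener": "have", "celos": "jealousy",
    "dar": "give", "una": "a", "patada": "kick",
    "entrar": "enter", "en": "in", "la": "the", "casa": "house",
    "corriendo": "running",
}

# Hypothetical multi-word entries that capture the divergences.
PHRASES = {
    ("tener", "celos"): "be jealous",
    ("dar", "una", "patada"): "kick",
    ("entrar", "en"): "enter",
    ("entrar", "corriendo"): "run in",
}


def word_by_word(source: str) -> str:
    """Translate each token independently."""
    return " ".join(WORD_DICT[token] for token in source.split())


def with_phrases(source: str) -> str:
    """Greedy longest-match over PHRASES, falling back to single words."""
    tokens = source.split()
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, down to length 2.
        for length in range(len(tokens) - i, 1, -1):
            chunk = tuple(tokens[i:i + length])
            if chunk in PHRASES:
                out.append(PHRASES[chunk])
                i += length
                break
        else:
            out.append(WORD_DICT[tokens[i]])
            i += 1
    return " ".join(out)


if __name__ == "__main__":
    for src in ("tener celos", "dar una patada",
                "entrar en la casa", "entrar corriendo"):
        print(f"{src!r}: {word_by_word(src)!r} vs {with_phrases(src)!r}")
```

The word-by-word output reproduces the literal glosses on the slide ("have jealousy", "give a kick", "enter in the house", "enter running"), while the phrase entries recover the idiomatic English.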


Interlingua Example: KANT

• (*BE-PREDICATE
    (attribute (*REQUIRED (degree positive)))
    (mood declarative)
    (predicate-role attribute)
    (punctuation period)
    (qualification (*QUALIFYING-EVENT
      (event (*PERSIST
        (argument-class theme)
        (mood declarative)
        (tense present)
        (theme (*ERROR (number (:OR mass singular)) (reference definite)))))
      (extent (*CONJ-if))
      (topic +)))
    (tense present)
    (theme (*SERVICE (number (:OR mass singular)) (reference no-reference))))
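
The KANT representation is a nested, parenthesized feature structure, so it can be read with a very small s-expression parser. The sketch below is not KANT's own tooling; `tokenize`, `parse`, and the `features` dictionary are names invented here to show how the head concept and its slots can be pulled apart.

```python
import re

KANT_IR = """
(*BE-PREDICATE
  (attribute (*REQUIRED (degree positive)))
  (mood declarative)
  (predicate-role attribute)
  (punctuation period)
  (qualification (*QUALIFYING-EVENT
    (event (*PERSIST
      (argument-class theme)
      (mood declarative)
      (tense present)
      (theme (*ERROR (number (:OR mass singular)) (reference definite)))))
    (extent (*CONJ-if))
    (topic +)))
  (tense present)
  (theme (*SERVICE (number (:OR mass singular)) (reference no-reference))))
"""


def tokenize(text: str) -> list:
    """Split into parentheses and atoms (any run without spaces or parens)."""
    return re.findall(r"[()]|[^\s()]+", text)


def parse(tokens: list):
    """Recursive descent: '(' ... ')' becomes a list, anything else an atom."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard the closing ')'
        return items
    return token


tree = parse(tokenize(KANT_IR))
head, *slots = tree
# Each top-level slot is a (name value) pair in this example.
features = {name: value for name, value in slots}

if __name__ == "__main__":
    print(head)
    print(features["mood"], features["tense"])
    print(features["theme"][0])
```

Here `head` is the concept `*BE-PREDICATE`, and `features["qualification"]` holds the embedded conditional event ("if the error persists").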


Interlingua Example: NESPOLE!

• “I want to know what time the flight leaves Pittsburgh.”

• “What time does the flight leave Pittsburgh?”

• request-information+departure (time = (clock = question), transportation-spec = (flight, id = yes), origin = name=Pittsburgh)
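
The point of the NESPOLE! example is that two different surface sentences share one representation. A minimal sketch of that idea follows; the `InterchangeFormat` class, its field names, and the `ANALYSES` lookup table are assumptions made for illustration (the table merely stands in for a real analyzer), not the project's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterchangeFormat:
    """Hypothetical container for one interlingua representation."""
    speech_act: str
    concepts: tuple
    arguments: tuple  # (name, value) pairs, values kept as plain strings


DEPARTURE_TIME_QUERY = InterchangeFormat(
    speech_act="request-information",
    concepts=("departure",),
    arguments=(
        ("time", "clock = question"),
        ("transportation-spec", "flight, id = yes"),
        ("origin", "name=Pittsburgh"),
    ),
)

# Stand-in for an analyzer: both English sentences from the slide
# are assigned the same representation.
ANALYSES = {
    "I want to know what time the flight leaves Pittsburgh.": DEPARTURE_TIME_QUERY,
    "What time does the flight leave Pittsburgh?": DEPARTURE_TIME_QUERY,
}


def label(rep: InterchangeFormat) -> str:
    """Join the speech act and concepts, as in 'request-information+departure'."""
    return "+".join((rep.speech_act, *rep.concepts))


if __name__ == "__main__":
    for sentence, rep in ANALYSES.items():
        print(f"{sentence} -> {label(rep)}")
```

Because both sentences resolve to the same representation, any synthesizer that reads it produces the same translation for either input.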


Interlingua Example: Mikrokosmos

request-action-69
  agent human-72
  theme accept-70
  beneficiary organization-71
  source-root-word ask
  time (< (find-anchor-time))

accept-70
  theme war-73
  theme-of request-action-69
  source-root-word authorize

organization-71
  has-name united-nations
  beneficiary-of request-action-69
  source-root-word UN

human-72
  has-name colin powell
  agent-of request-action-69
  source-root-word he ; ref. resolution has been carried out

war-73
  theme-of accept-70
  source-root-word war


Interlingua Example: Lexical Conceptual Structure

• (event cause (thing[agent] reporter+) (go loc (thing[theme] email+) (path to loc (thing email+) (position at loc (thing email+) (thing[goal] aljazeera+))) (manner send+ingly)))

• Figure 10: LCS representation of “The reporter emailed Al-Jazeera”


Issues in Interlingua design

• Grain size of meaning
• Domain specificity of meaning
• Ambiguity
• Lack of agreement among humans
• From EACL workshop 2009:
– Russell-Frege: Meaning can be broken down into pieces that combine logically.
– Wittgenstein-Quine: Meaning = use.
  • Use is represented by a corpus


Interlingua: annotated corpora

• Many annotated corpora can be considered as part of an interlingua:
– Named entities and co-reference
– Semantic roles
– Temporal expressions


IAMTC: Interlingua Annotation of Multi-lingual Text Corpora

• 14 PIs. One year (2003-2004). Still publishing.

• See other set of slides.


Elicitation Corpus

• 3000 feature structures

• English sentence for each one.

• LDC translated the English sentences into 13 languages, and a few other sites translated them into a few more languages.


SCALE 2009: MT and HIVEs

• High Information Value Elements
– Named entities, negation, modality

• Urdu to English

• Modality
– H firmly believes [R is true/false]
– H believes [R may be true/false]
– H requires [R to be true/false]
– H permits [R to be true/false]
– H intends [to make R true/false]
– H does not intend [to make R true/false]
– H is trying [to make R true/false]
– H is able [to make R true/false and succeeds]
– H is able [to make R true/false and fails]
– H is able [to make R true/false]
– H wants [R to be true/false]