
Page 1:

November 17, 2000 TDT-2000 Workshop

Topic Tracking at Maryland: Lessons from the Johns Hopkins

Mandarin-English Information (MEI) Project

Gina-Anne Levow and Douglas W. Oard
Institute for Advanced Computer Studies

University of Maryland, College Park

Page 2:

Roadmap

• MEI Overview (6 weeks in 5 minutes)

• MEI Results

• Adapting MEI to TDT

• TDT Results

• Conclusions

Page 3:

The MEI Team

• Senior Members
  – Helen Meng, Chinese University of Hong Kong
  – Erika Grams, Advanced Analytic Tools
  – Sanjeev Khudanpur, Johns Hopkins University
  – Gina-Anne Levow, University of Maryland
  – Douglas Oard, University of Maryland
  – Patrick Schone, US Department of Defense
  – Hsin-Min Wang, Academia Sinica, Taiwan

• Students
  – Berlin Chen, National Taiwan University
  – Wai-Kit Lo, Chinese University of Hong Kong
  – Karen Tang, Princeton University
  – Jianqiang Wang, University of Maryland

Page 4:

MEI: The Challenges

• Speech Recognition
  – Tokenization
  – Lexicon coverage
  – Selection among alternatives

• Translation
  – Tokenization
  – Lexicon coverage
  – Selection among alternatives

(Different problems)

Page 5:

Term Granularity Options

• Mandarin Words
• Mandarin Syllables
• Mandarin Characters
• English Words
• English Phrases

Page 6:

MEI Evaluation Collections

[Timeline diagram:
• Development collection: TDT-2 (Jan 98 to Jun 98), 2,265 manually segmented stories, 17 topics with a variable number of exemplars
• Evaluation collection: TDT-3 (Oct 98 to Dec 98), 3,371 manually segmented stories, 56 topics with a variable number of exemplars
• English text topic exemplars: Associated Press, New York Times
• Mandarin audio broadcast news: Voice of America]

Page 7:

[MEI system diagram: English exemplars ("President Bill Clinton and…") pass through named entity tagging (BBN), term selection, and term translation against a bilingual term list (LDC, CETA) to query construction for the Mandarin IR system (Cornell, U Mass); Mandarin audio is processed by speech recognition (Dragon, LDC) and document construction using LDC story boundaries; the resulting ranked list is evaluated against LDC relevance judgments using mean uninterpolated average precision.]

Page 8:

Query Translation

• Dictionary inversion for phrase translation (see the sketch below)
  – "Wall Street", "best interests", "human rights"

• Lemmatize remaining words if necessary
  – e.g., "televised" translates as "television"

• χ² filtering for query term selection
  – Compared to an English background model
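A minimal sketch of the dictionary inversion step described above, assuming a Chinese-to-English term list along the lines of the LDC and CETA resources; the toy entries, helper names, and greedy longest-phrase matching are illustrative assumptions, not the project's actual code.

    from collections import defaultdict

    def invert_dictionary(zh_to_en):
        """Invert a Chinese-to-English term list into an English-to-Chinese one.
        Multi-word glosses become phrase entries, so "wall street" or
        "human rights" can translate as a unit."""
        en_to_zh = defaultdict(set)
        for zh_term, glosses in zh_to_en.items():
            for gloss in glosses:
                en_to_zh[gloss.lower().strip()].add(zh_term)
        return en_to_zh

    def translate(query_terms, en_to_zh, max_phrase_len=3):
        """Translate English query terms, preferring the longest matching phrase."""
        translations, i = {}, 0
        while i < len(query_terms):
            for span in range(min(max_phrase_len, len(query_terms) - i), 0, -1):
                phrase = " ".join(query_terms[i:i + span]).lower()
                if phrase in en_to_zh:
                    translations[phrase] = sorted(en_to_zh[phrase])
                    i += span
                    break
            else:
                i += 1  # no entry: a lemmatizer or phonetic match would back off here
        return translations

    # Toy entries, illustrative only (not the actual LDC/CETA content).
    zh_to_en = {"华尔街": ["Wall Street"], "人权": ["human rights"], "电视": ["television"]}
    en_to_zh = invert_dictionary(zh_to_en)
    print(translate("human rights on Wall Street".split(), en_to_zh))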

Page 9:

Evaluation Measure

[Recall-precision plot: interpolated precision vs. recall, 0.0 to 1.0 on both axes]

Able to characterize variation across exemplars!
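The MEI runs above are scored by mean uninterpolated average precision over the ranked lists; a minimal sketch of that measure, with made-up relevance data, is below.

    def average_precision(ranked_ids, relevant_ids):
        """Uninterpolated average precision for one topic: the mean of precision
        at each rank where a relevant story is retrieved."""
        relevant_ids = set(relevant_ids)
        hits, precisions = 0, []
        for rank, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant_ids:
                hits += 1
                precisions.append(hits / rank)
        return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

    def mean_average_precision(runs):
        """Mean over topics of the per-topic average precision."""
        return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

    # Two toy topics with hypothetical ranked lists and judgments.
    runs = [(["d3", "d1", "d7", "d2"], {"d1", "d2"}),
            (["d5", "d9", "d4"], {"d5"})]
    print(mean_average_precision(runs))  # 0.75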

Page 10:

Balanced Translation Works Well

• Pirkola’s structured queries
  – Treat translation alternatives as synonyms
  – Inquery #syn() operator

• Balanced translation
  – Distribute probability mass over translation alternatives
  – Inquery #sum() operator (see the sketch below)

[Bar chart: mean average precision for structured queries vs. balanced translation; TDT-2, phrase-based translation, word-based retrieval]
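A minimal sketch contrasting the two strategies as Inquery-style query strings; the #syn and #sum operator names come from the slide, but wrapping the per-term clauses in an outer #sum is an assumption about how the full query is assembled.

    def structured_query(translations):
        """Pirkola-style structured query: each source term's translation
        alternatives are wrapped in #syn(), i.e. treated as synonyms."""
        return "#sum( %s )" % " ".join(
            "#syn( %s )" % " ".join(alts) for alts in translations.values())

    def balanced_query(translations):
        """Balanced translation: wrap each term's alternatives in #sum(), which
        averages their contributions and so spreads weight evenly across them."""
        return "#sum( %s )" % " ".join(
            "#sum( %s )" % " ".join(alts) for alts in translations.values())

    # Toy data: English term -> ranked Mandarin alternatives.
    translations = {"human rights": ["人权"], "television": ["电视", "电视机"]}
    print(structured_query(translations))
    print(balanced_query(translations))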

Page 11:

Phrase Translation Beats Words

• Phrases beat words

• Three sources
  – Translation lexicon
  – Named entities
  – Numeric expressions

[Bar chart: mean average precision for words, phrases, and phrases + NE/NUMEX; TDT-2, 12 exemplars, word-based retrieval]

Page 12:

Character Bigram Indexing Wins

• Character bigrams are best

• Syllable bigrams do poorly

[Bar chart: mean average precision for word, character bigram, and syllable bigram indexing; TDT-2, single NYT exemplar, manual translation]
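A minimal sketch of the character-bigram indexing compared above: overlapping bigrams over each run of Han characters. The exact tokenization rules (what to do with non-Han characters and isolated characters) are assumptions.

    import re

    HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")

    def char_bigrams(text):
        """Index terms: overlapping character bigrams within each run of Han characters."""
        terms = []
        for run in HAN_RUN.findall(text):
            if len(run) == 1:
                terms.append(run)  # keep an isolated character as a unigram
            terms.extend(run[i:i + 2] for i in range(len(run) - 1))
        return terms

    print(char_bigrams("克林顿总统"))  # ['克林', '林顿', '顿总', '总统']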

Page 13:

Untranslatable Terms

Most frequent untranslatable terms:

  Term        Occurrences
  suharto     97
  netanyahu   88
  starr       62
  arafat      50
  bjp         45
  vajpayee    44
  estrada     44
  …
  hsu         19
  zemin       7

  Terms    # (by token)   # (by type)
  total    87,004         12,402
  OOV      3,028          1,122

Page 14:

Cross-Language Phonetic Matching

• Small improvement
  – Not statistically significant

• Character bigrams are best
  – Form a unified index (see the sketch below)
    • Character and syllable bigrams
  – Translate words if possible
    • Then form character bigrams
  – Otherwise translate syllables
    • Then form syllable bigrams

[Bar chart: mean average precision by indexing term (word, character bigram, syllable bigram), with and without CLPM; TDT-2, phrase-based translation]
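A minimal sketch of the unified index built as described above: translate a query word when the lexicon covers it and emit character bigrams, otherwise back off to a phonetic syllable rendering (CLPM) and emit syllable bigrams. The romanized syllable representation and the data structures are assumptions for illustration.

    def bigrams(units, sep=""):
        """Overlapping bigrams over a list of units (characters or syllables)."""
        if len(units) == 1:
            return [units[0]]
        return [units[i] + sep + units[i + 1] for i in range(len(units) - 1)]

    def index_terms(en_term, lexicon, phonetic_syllables):
        """Unified index terms for one English query term.
        lexicon:            English term -> Mandarin translations (character strings)
        phonetic_syllables: English term -> Mandarin syllables from cross-language
                            phonetic matching (the OOV fallback)"""
        if en_term in lexicon:
            terms = []
            for zh in lexicon[en_term]:
                terms.extend(bigrams(list(zh)))               # character bigrams
            return terms
        if en_term in phonetic_syllables:
            return bigrams(phonetic_syllables[en_term], "_")  # syllable bigrams
        return []  # untranslatable and no phonetic match

    # Toy data, illustrative only.
    lexicon = {"television": ["电视"]}
    phonetic_syllables = {"netanyahu": ["nei", "tan", "ya", "hu"]}
    print(index_terms("television", lexicon, phonetic_syllables))  # ['电视']
    print(index_terms("netanyahu", lexicon, phonetic_syllables))   # ['nei_tan', 'tan_ya', 'ya_hu']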

Page 15:

MEI: Comparing Collections

[Bar chart: mean average precision on TDT-2 vs. TDT-3 for words, character bigrams, and character bigrams + CLPM]

Page 16:

MEI Conclusions

• ASR: Words
• Translation: Phrases, Words, Lemmas, Syllables
• Indexing: Character Bigrams

Page 17:

TDT-2000: What’s New Since ’99?

• Key ideas from MEI:
  – Dictionary inversion for phrase translation
  – Balanced translation
  – Post-translation resegmentation

• Adaptation to TDT:
  – Exploit negative exemplars
  – Improved Mandarin topic normalization
  – Round-robin balanced translation

Page 18:

[TDT-2000 system diagram: English exemplars ("President Bill Clinton and…") pass through term selection and term translation against a bilingual term list (LDC/CETA) to query construction for the PRISE retrieval engine (NIST); Mandarin audio is processed by speech recognition (Dragon, LDC) and document construction using LDC story boundaries; IDF is computed over the training epoch, and ranked-list scores are passed through score normalization to produce the TDT-2000 tracking scores.]

Page 19:

Topic Tracking Improvements

• Improved χ² filtering for query term selection
  – First compare to background model
  – Augment by comparison to negative exemplars

• Mandarin topic normalization (unofficial)
  – Language-specific strategy (see the sketch below)
    • Mandarin: best single training epoch score
    • English: average of exemplar scores
  – Recomputed Mandarin source normalization
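A minimal sketch of the language-specific normalization strategy above, assuming the system already has one raw tracking score per training epoch (Mandarin) or per exemplar (English) for each test story; the function and argument names are illustrative.

    def normalize_topic_score(raw_scores, story_language):
        """Combine the per-epoch or per-exemplar scores for one (story, topic) pair.
        raw_scores:     list of raw tracking scores for the same story
        story_language: "man" for Mandarin stories, "eng" for English stories"""
        if story_language == "man":
            return max(raw_scores)                # best single training epoch score
        return sum(raw_scores) / len(raw_scores)  # average of exemplar scores

    print(normalize_topic_score([0.42, 0.61, 0.55], "man"))  # 0.61
    print(normalize_topic_score([0.42, 0.61, 0.55], "eng"))  # ~0.53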

Page 20:

Effect of Negative Exemplars

[DET plots, text only, first 60 topics (self-scored): Mandarin text with Nn=0 vs. Nn=2 and English text with Nn=0 vs. Nn=2, where Nn is the number of negative exemplars]

Page 21:

Indexing Character Bigrams

[DET plot, Mandarin speech only, first 60 topics (unofficial renormalization): character bigrams vs. words]

Page 22:

Round Robin 8-Best Translation

[DET plot, Mandarin text, first 60 topics (self-scored): TDT-1999 2-best translation vs. TDT-2000 round-robin 8-best]
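One plausible reading of "round-robin 8-best" translation, sketched here purely as an assumption rather than the documented algorithm: keep up to eight ranked dictionary translations per English term and interleave them round-robin, so every term contributes its first alternative before any term contributes its second.

    def round_robin_translations(ranked_alternatives, n_best=8):
        """Interleave up to n_best Mandarin alternatives per English term:
        everyone's first choice, then everyone's second choice, and so on."""
        selected = []
        for rank in range(n_best):
            for term, alts in ranked_alternatives.items():
                if rank < len(alts):
                    selected.append((term, alts[rank]))
        return selected

    # Toy data: ranked Mandarin alternatives per English term, illustrative only.
    ranked_alternatives = {"president": ["总统", "主席"],
                           "television": ["电视", "电视机", "电视台"]}
    print(round_robin_translations(ranked_alternatives))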

Page 23:

Conclusions

• Top-8 round-robin translation to Mandarin wins
  – Slightly outperforms top-2 translation to English

• Query translation is more efficient
  – Better suited to a stream of stories

• Match term extent to purpose
  – ASR, translation, indexing

Page 24:

Closing Thoughts

• Thanks to Jon and LDC!

• Normalization limits our insight
  – Need some way to see past it

• Availability of TDT-3 ground truth?