Description
We describe Dublin City University (DCU)'s participation in the Search sub-task of the Search and Hyperlinking Task at MediaEval 2014. Exploratory experiments were carried out to investigate the utility of prosodic prominence features in the task of retrieving relevant video segments from a collection of BBC videos. Normalised acoustic correlates of loudness, pitch, and duration were incorporated in a standard TF-IDF weighting scheme to increase weights for terms that were prominent in speech. Prosodic models outperformed a text-based TF-IDF baseline on the training set but failed to surpass the baseline on the test set.
DCU Search Runs at MediaEval 2014 Search and Hyperlinking
David N. Racca, Maria Eskevich, Gareth J.F. Jones
CNGL Centre for Global Intelligent Content, School of Computing, Dublin City University
Dublin, Ireland
Overview
Novelty of Approach for MediaEval 2014:
● What: integrate prosodic prominence into the IR weighting scheme
● How: increase the weight of prosodically prominent terms
● Effect: promote the rank of retrieved segments containing prominent terms
Motivation
● Speech is much more than a “bag of words”
● Prosodic information: e.g. intonation, duration, and loudness
● Shown to be useful in many speech processing tasks:
● emotion recognition, discourse structure, speech acts, speaker ID, focus, contrast, topic shifting
Related Work
● Possible correlation between acoustic stress and TF-IDF scores in English (Crestani 2001)
● Use of signal amplitude and duration in a spoken content retrieval task in Chinese (Chen et al. 2001)
● Use of pitch and intensity in a topic tracking task in French (Guinaudeau 2011)
Approach
● We followed Guinaudeau's method
● We implemented a spoken content retrieval (SCR) system that gives greater importance to terms that are prosodically prominent in the spoken content
● Prosodically prominent terms stand out from their surroundings by means of their acoustic characteristics
Approach: Indexing
[Indexing pipeline diagram]
● LVCSR transcribes the video collection into automatic text transcripts (XML)
● Feature extraction computes prosodic features (F0, loudness) every ~10 ms (CSV)
● Transcripts are merged with the prosodic information: normalised max and min of F0 and loudness are attached to terms (XML)
● Non-overlapping fixed-length time segmentation produces segments with prosodic information (XML)
● Terrier indexing (stopword removal, stemming) builds the segments index
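The per-term feature step of this pipeline can be illustrated with a minimal sketch. The data layout (frame-level F0 and loudness values at ~10 ms steps) and the function names are assumptions for illustration, not the actual DCU implementation:

```python
# Hypothetical sketch of the per-term prosodic feature step:
# frame-level F0 and loudness values (~10 ms frames) are min-max
# normalised over a context, and each term keeps the max and min
# of the normalised values across the frames it spans.

def min_max_normalise(frames):
    """Scale a list of frame values into [0, 1] over the given context."""
    lo, hi = min(frames), max(frames)
    if hi == lo:                    # flat contour: avoid division by zero
        return [0.0 for _ in frames]
    return [(v - lo) / (hi - lo) for v in frames]

def term_prosody(f0_frames, loud_frames, start, end):
    """Return (max F0, min F0, max loudness, min loudness), normalised,
    for the frames [start, end) covering one spoken term."""
    f0_n = min_max_normalise(f0_frames)
    loud_n = min_max_normalise(loud_frames)
    f0_span, loud_span = f0_n[start:end], loud_n[start:end]
    return (max(f0_span), min(f0_span), max(loud_span), min(loud_span))
```

The choice of the context over which `min_max_normalise` runs matters; the slides consider both a per-speech-segment and a full-transcript context.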
Approach: Indexing
[Example of the prosody-augmented segments index]
Each entry stores a term's idf and a posting list of (docid, tf, one [F0, L, D] vector per occurrence):

car   350   (1, 3, [.78, .34, .20], [.56, .23, .25], [.66, 1.0, .33]) (4, 1, [.23, .10, .15]) ⋯
race   45   (1, 1, [.81, .98]) (9, 2, [.23, .10], [.54, .27]) ⋯
[Diagram: contexts for prosodic normalisation]
● Speech segment context: features normalised within each speech segment (Speech Seg 0, Speech Seg 1, …, Speech Seg N)
● Full transcript context: features normalised over the full transcript
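The prosody-augmented posting list can be sketched as a simple in-memory structure, here populated with the "car" entry from the index example. The class and field names are hypothetical, not the actual DCU code:

```python
# Hypothetical representation of the prosody-augmented inverted index:
# besides docid and tf, each posting keeps one [F0, loudness, duration]
# triple per occurrence of the term in that document.
from dataclasses import dataclass, field

@dataclass
class Posting:
    docid: int
    tf: int
    prosody: list  # one [F0, L, D] triple per term occurrence

@dataclass
class TermEntry:
    idf: float
    postings: list = field(default_factory=list)

# Values copied from the slide's "car" example.
index = {
    "car": TermEntry(idf=350, postings=[
        Posting(docid=1, tf=3,
                prosody=[[.78, .34, .20], [.56, .23, .25], [.66, 1.0, .33]]),
        Posting(docid=4, tf=1, prosody=[[.23, .10, .15]]),
    ]),
}
```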
Approach: Retrieval
Terrier matching over the segments index scores term $t$ in segment $s$ as:

$w_s(t) = \dfrac{\theta_{ir} \cdot (idf_t \cdot tf_t) + \theta_{ac} \cdot ac_t}{\theta_{ir} + \theta_{ac}}$
Query: “car race” → Result list (rank, programme, start, end, score):
1   Traffic cops    0.00    1.30   3.35
2   Top gear        1.30    3.00   2.82
3   Top gear       48.00   49.30   2.73
...
Acoustic score $ac_t$:
● $ac_t = \max(dur_t)$   (G-dur)
● $ac_t = \max(loud_t) \cdot \max(F0_t)$   (G-pr, G-lp)
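The interpolated matching score can be sketched as follows. Function names are hypothetical; the interpolation weights $\theta_{ir}$, $\theta_{ac}$ take the values shown in the run tables (e.g. TF-IDF = (1, 0), G-lp = (3, 1)):

```python
# Sketch of the interpolated segment score: a weighted mix of the
# TF-IDF score and the acoustic prominence score ac_t, normalised
# by the sum of the interpolation weights.

def w_s(tf: float, idf: float, ac: float,
        theta_ir: float, theta_ac: float) -> float:
    """Score of a term in a segment."""
    return (theta_ir * (idf * tf) + theta_ac * ac) / (theta_ir + theta_ac)

def ac_g_dur(durations):
    # G-dur: maximum normalised duration over the term's occurrences.
    return max(durations)

def ac_g_lp(loudness, f0):
    # Loudness-pitch variant: product of the maximum normalised
    # loudness and the maximum normalised F0.
    return max(loudness) * max(f0)

# With theta_ac = 0 the score reduces to plain TF-IDF (the baseline runs).
baseline = w_s(tf=3, idf=2.0, ac=0.9, theta_ir=1, theta_ac=0)
```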
Results: Development Set
● Known-item task: 50 queries, 1335 hours of video

Transcript      Model    θ_ir  θ_ac   MRR
Subtitles       TF-IDF    1     0    .428
Subtitles       G-pr      2     3    .126
Subtitles       G-lp      3     1    .281
Subtitles       G-dur     1     3    ---
NST/Sheffield   TF-IDF    1     0    .313
NST/Sheffield   G-pr      2     3    .329
NST/Sheffield   G-lp      3     1    .319
NST/Sheffield   G-dur     1     3    ---
LIMSI           TF-IDF    1     0    .259
LIMSI           G-pr      2     3    .264
LIMSI           G-lp      3     1    .285
LIMSI           G-dur     1     3    .279
LIUM            TF-IDF    1     0    .296
LIUM            G-pr      2     3    .253
LIUM            G-lp      3     1    ---
LIUM            G-dur     1     3    ---
Results: Test Set
● Ad-hoc retrieval task: 36 queries, 2686 hours of video
● MAP is Overlap MAP

Transcript      Model    θ_ir  θ_ac   MAP
Subtitles       TF-IDF    1     0    .639
Subtitles       G-pr      2     3    .599
Subtitles       G-lp      3     1    .533
Subtitles       G-dur     1     3    .345
NST/Sheffield   TF-IDF    1     0    .440
NST/Sheffield   G-pr      2     3    .434
NST/Sheffield   G-lp      3     1    .435
NST/Sheffield   G-dur     1     3    .404
LIMSI           TF-IDF    1     0    .525
LIMSI           G-pr      2     3    .508
LIMSI           G-lp      3     1    .428
LIMSI           G-dur     1     3    .505
LIUM            TF-IDF    1     0    .451
LIUM            G-pr      2     3    .444
LIUM            G-lp      3     1    .436
LIUM            G-dur     1     3    .358
Conclusion
● We explored the potential of prosodic information in the Search and Hyperlinking task
● Prosodic models outperformed a text-based baseline on the development set but failed to do so on the test set. Possible reasons:
● known-item vs ad-hoc task
● differences in query length
● other differences in query style
References
● F. Crestani. Towards the use of prosodic information for spoken document retrieval. SIGIR '01.
● B. Chen, H.-M. Wang, and L.-S. Lee. Improved spoken document retrieval by exploring extra acoustic and linguistic cues. INTERSPEECH '01.
● C. Guinaudeau and J. Hirschberg. Accounting for prosodic information to improve ASR-based topic tracking for TV broadcast news. INTERSPEECH '11.