Query by Example Search on Speech Task
(QUESST 2015)
Igor Szoke, Luis Javier Rodriguez-Fuentes, Andi Buzo, Xavier Anguera, Florian Metze
(with help of Jorge Proenca, Martin Lojka, Xiao Xiong as data providers)
14-15.9.2015 MediaEval workshop, Wurzen, Germany
What is QUESST about...
• Spoken Audio Search (or Query-by-Example Spoken-Term Detection)
• Given a spoken query we search for matches (at lexical level) within a set of spoken documents
• It is similar to Spoken Term Detection (NIST STD2006, OpenKWS) but …
• Queries are spoken
• No prior information
• Different acoustic conditions
• Searching for whole documents
Evolution
• SWS2011
• English and Indian lang., exact match, find document, TWV metric
• SWS2012
• 6 South African lang., exact match, queries from data, find exact place, TWV metric
• SWS2013
• 6 European lang., exact match, queries from data, find exact place, TWV metric
• QUESST2014
• 6 European lang., not exact match, queries are dictated, find document, Cnxe metric
• QUESST2015
• ...
Evolution in 2015
• 6 lang. (Albanian, Chinese, Czech, Portuguese, Romanian, Slovak)
• 19 hours of audio (dev = eval), per sentence segmentation
• 450 queries/dev, 450 queries/eval
• Recorded in isolation by different speakers (some non-native of the language)
• Utterance-level matching
• Recorded with context New!
• 3 types of search
  • T1 - Exact match, dictated
  • T2 - Reordering and small variations, dictated
  • T3 - Reordering and small variations, conversational speech New!
• We provided
  • Scoring tool, Features, Baseline search technique (DTW), Calibration and Fusion, Speech Kitchen (VM) New!
• Surprise
  • The data was artificially noised and reverberated New!
• Data examples: Clean, Noisy, Reverb, Noisy+Reverb
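The baseline search technique named above is DTW (dynamic time warping). As an illustration only, here is a minimal sketch of subsequence DTW for utterance-level matching. It is not the organizers' baseline: the function names, the Euclidean frame distance and the normalization by query length are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Distance between two feature frames (assumed choice of local distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def subsequence_dtw_cost(query, document, dist=euclidean):
    """Cost of the best alignment of the whole query to any part of the document.

    query, document: lists of feature frames (each frame is a list of floats).
    Lower cost means the query is more likely to occur in the document.
    """
    n, m = len(query), len(document)
    if n == 0 or m == 0:
        raise ValueError("query and document must be non-empty")

    # First query frame may start at any document frame (free start point).
    prev = [dist(query[0], document[j]) for j in range(m)]

    for i in range(1, n):
        cur = [0.0] * m
        cur[0] = prev[0] + dist(query[i], document[0])
        for j in range(1, m):
            best_predecessor = min(prev[j], prev[j - 1], cur[j - 1])
            cur[j] = dist(query[i], document[j]) + best_predecessor
        prev = cur

    # Free end point: take the cheapest end frame, normalized by query length.
    return min(prev) / n


# Toy usage with 1-dimensional "features".
query = [[1.0], [2.0]]
doc_with_match = [[5.0], [1.0], [2.0], [5.0]]
doc_without_match = [[5.0], [5.0]]
match_cost = subsequence_dtw_cost(query, doc_with_match)
miss_cost = subsequence_dtw_cost(query, doc_without_match)
```

Because the task is utterance-level ("is the query in the document?"), a single cost per query/document pair is enough; the exact position of the match is not needed.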
Teams
Team | Affiliation | Country | Note
BUT | BUT Speech@FIT, Faculty of Information Technology, Brno University of Technology | Czech Republic | late
CUNY | Department of Computer Science at Queens College of The City University of New York | US |
ELiRF | Natural Language Engineering and Pattern Recognition, Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València | Spain |
GTM-UVigo | Multimedia Technology Group, Universidade de Vigo | Spain | late
IIT-B | Department of Electrical Engineering, Indian Institute of Technology Bombay | India | not arrived
NNI | Northwestern Polytechnical University, Xi’an, China; Nanyang Technological University, Singapore; Institute for Infocomm Research, A*STAR, Singapore | China, Singapore |
NTU | National Taiwan University | Taiwan | zero
SpeeD | SpeeD Research Laboratory, University Politehnica of Bucharest | Romania |
SPL-IT-UC | Instituto de Telecomunicações, Coimbra; Electrical and Computer Eng. Department, University of Coimbra | Portugal |
TUKE | Laboratory of Speech Technologies in Telecommunications @ Technical University of Košice | Slovakia | late, zero
Scoring
● Is a query in document?
● Analysis per search type / language / noise
● Metrics:
  Cnxe (lower is better; 0 is the best value, 1 is random)
  TWV (higher is better; 1 is the best value, 0 is random)
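To make the Cnxe scale above concrete, here is a minimal sketch of one common formulation of normalized cross entropy. It assumes system scores are log-likelihood ratios and that a target prior is given; the function name, the default prior of 0.5 and the other details are assumptions for illustration and may differ from the official scoring tool.

```python
import math


def _neg_log2_sigmoid(x):
    """Return -log2(1 / (1 + exp(-x))), computed in a numerically stable way."""
    if x >= 0:
        return math.log1p(math.exp(-x)) / math.log(2)
    return (-x + math.log1p(math.exp(x))) / math.log(2)


def cnxe(target_llrs, nontarget_llrs, p_target=0.5):
    """Normalized cross entropy of log-likelihood-ratio scores.

    target_llrs:    scores of trials where the query IS in the document
    nontarget_llrs: scores of trials where the query is NOT in the document
    p_target:       assumed prior probability of a target trial
    """
    prior_logit = math.log(p_target / (1.0 - p_target))

    # Posterior of "target" is sigmoid(llr + prior_logit).
    target_cost = sum(
        _neg_log2_sigmoid(llr + prior_logit) for llr in target_llrs
    ) / len(target_llrs)
    nontarget_cost = sum(
        _neg_log2_sigmoid(-(llr + prior_logit)) for llr in nontarget_llrs
    ) / len(nontarget_llrs)

    cross_entropy = p_target * target_cost + (1.0 - p_target) * nontarget_cost

    # Entropy of the prior: the cost of a system that ignores the audio.
    prior_entropy = -(
        p_target * math.log2(p_target)
        + (1.0 - p_target) * math.log2(1.0 - p_target)
    )
    return cross_entropy / prior_entropy


# A system that always outputs LLR = 0 carries no information.
uninformative = cnxe([0.0, 0.0], [0.0, 0.0, 0.0])
# A confident and correct system approaches 0.
confident = cnxe([10.0, 10.0], [-10.0, -10.0])
```

Under this formulation a system that outputs the same uninformative score for every trial scores 1, which matches the "1 is random" note, while well-calibrated, confident and correct scores move the value towards 0.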
[Figure: Cnxe, all teams]
[Figure: Cnxe per search type (avg top7)]
[Figure: Cnxe per noise/reverb condition (avg top7)]
[Figure: Cnxe, all teams, T1 & clean]
Conclusion
• We made it really hard. All teams fought bravely!
• No surprise
• Big fusion vs. Simple system
• Addressing the reordering and noisy data.
• Zero-resource and non-DTW systems, but without good results.
• Is DTW really the best approach?
It was really hard this year!
Thank you!
.. do not forget technical retreat today at 15:00 ..
Conclusion - RT
• Who used provided tools?
• Who used query context?
• There was a problem in the data … we need to use on-line signal centering
• Future … the same task on the "same" data?
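The slides do not say how the on-line signal centering mentioned above would be done. One possible reading is removal of a DC offset sample by sample, using a running estimate of the signal mean. The sketch below follows that assumption only; the function name and the smoothing constant `alpha` are hypothetical.

```python
def center_online(samples, alpha=0.999):
    """Subtract an exponentially smoothed running mean from each sample.

    samples: iterable of audio sample values
    alpha:   smoothing constant in (0, 1); closer to 1 means a slower mean estimate
    """
    running_mean = 0.0
    centered = []
    for x in samples:
        # Update the mean estimate first, then remove it from the sample.
        running_mean = alpha * running_mean + (1.0 - alpha) * x
        centered.append(x - running_mean)
    return centered


# A constant offset is gradually removed as the mean estimate converges.
offset_signal = [100.0] * 200
centered_signal = center_online(offset_signal, alpha=0.9)
```

The first samples still carry part of the offset, because the mean estimate starts at zero and needs time to converge.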
Technical retreat
● What provided technologies were used by teams? (make a table)
● Show data problem (raw audio)
● Did participants use the query context?
Technical retreat
• Task description - going deeper
• IIT-B remote presentation
• round: "everyone", if they want to speak
• conclusion + future (next year)
• prepare answers to:
  • What was the easiest thing
  • What was the toughest thing
  • What would you do differently this year
  • One thing that should be the same next year
  • One thing that should change next year
  • How happy are you with your participation in QUESST (0-10, more is better; should not reflect your score, rather your feeling of work done)