Context: the QALL-ME projectClassification of temporal expressions
The temporal annotator in QALL-METhe QALL-ME benchmark
EvaluationConclusions
Identification of temporal expressions in the domain of tourism
Andrea Varga, Georgiana Puscasu, Constantin Orasan
Research Group in Computational Linguistics, University of Wolverhampton
2-4 July 2009 / KEPT 2009
Andrea Varga, Georgiana Puscasu, Constantin Orasan Identification of TE in the domain of tourism
Outline
1 Context: the QALL-ME project
2 Classification of temporal expressions
3 The temporal annotator in QALL-ME
4 The QALL-ME benchmark
5 Evaluation
6 Conclusions
The QALL-ME project
QALL-ME: Question Answering Learning technologies in a multiLingual and Multimodal Environment
EU-funded project (FP6 IST-033860)
shared infrastructure for multilingual and multimodal question answering in the domain of tourism
answers questions about local events: movie showtimes, directions to sites (cinemas, hotels), etc.
a large number of user questions contain temporal constraints
The QALL-ME project: http://qallme.fbk.eu/
Temporal expressions
TEs denote:
1 position in time
2 duration
3 time frequency
to obtain data for training and testing automatic TE identification modules, TEs need to be annotated, so annotation standards are required
annotation standards for TEs:
1 TIMEX2
2 TIMEX3 (part of TimeML)
TIMEX2 has been adopted for temporal expression annotation in QALL-ME
TIMEX2 annotation standard
TE classes and subclasses (1)
1. TEs indicating time position
   1. Precise TEs
      1. Calendar dates
      2. Times of day
      3. Week references
   2. Fuzzy TEs
      1. Generic references to the past, present or future
      2. Seasons, parts of the year (quarters and halves)
      3. Weekends
      4. Fuzzy day parts
   3. Non-specific TEs referring to time position
2. TEs capturing durations
   1. Precise durations
   2. Fuzzy durations
   3. Non-specific durations
3. Set-denoting TEs
   1. Precise frequency TEs
   2. Non-specific frequencies
TE classes and subclasses (2)
1. TEs indicating time position
   1. Precise TEs
      1. Calendar dates:
         <TIMEX2 VAL="2009">2009</TIMEX2>
         <TIMEX2 VAL="2009-07">July</TIMEX2>
         <TIMEX2 VAL="2009-07-02">2nd of July</TIMEX2>
         <TIMEX2 VAL="197">70s</TIMEX2>
         <TIMEX2 VAL="10">11th century</TIMEX2>
         <TIMEX2 VAL="2">this millennium</TIMEX2>
      2. Times of day:
         <TIMEX2 VAL="2009-07-02T21:36:42.85">21:36:42.85</TIMEX2>
         <TIMEX2 VAL="2009-07-02T10:00">10 o’clock</TIMEX2>
         <TIMEX2 VAL="2009-07-02T07:53">7:53 am.</TIMEX2>
         <TIMEX2 VAL="2009-07-02T11:15Z">11:15 GMT</TIMEX2>
      3. Week references:
         <TIMEX2 VAL="2009-W27">next week</TIMEX2>
TE classes and subclasses (3)
1. TEs indicating time position
   2. Fuzzy TEs
      1. Generic references to the past, present or future:
         <TIMEX2 VAL="PAST_REF" ANCHOR_DIR="BEFORE" ANCHOR_VAL="2009-07-02">recently</TIMEX2>
         <TIMEX2 VAL="PRESENT_REF" ANCHOR_DIR="AS_OF" ANCHOR_VAL="2009-07-02">now</TIMEX2>
         <TIMEX2 VAL="FUTURE_REF" ANCHOR_DIR="AFTER" ANCHOR_VAL="2009-07-02">future</TIMEX2>
      2. Seasons, parts of the year (quarters and halves):
         <TIMEX2 VAL="2009-SU">this summer</TIMEX2>
         <TIMEX2 VAL="2009-Q4">the 4th quarter of 2009</TIMEX2>
         <TIMEX2 VAL="2009-H2">2nd half of 2009</TIMEX2>
      3. Weekends:
         <TIMEX2 VAL="2009-W26-WE">this weekend</TIMEX2>
      4. Fuzzy day parts:
         <TIMEX2 VAL="2009-07-02TAF">afternoon</TIMEX2>
   3. Non-specific TEs referring to time position:
      <TIMEX2 VAL="XXXX-SU">summers</TIMEX2>
TE classes and subclasses (4)
2. TEs capturing durations
   1. Precise durations:
      <TIMEX2 VAL="PT24H" ANCHOR_DIR="WITHIN" ANCHOR_VAL="2009-07-02">24 hours</TIMEX2>
   2. Fuzzy durations:
      <TIMEX2 VAL="PXW" ANCHOR_DIR="BEFORE" ANCHOR_VAL="2009-W26">preceding weeks</TIMEX2>
   3. Non-specific durations:
      <TIMEX2 VAL="P1D">all day</TIMEX2>
3. Set-denoting TEs
   1. Precise frequency TEs:
      <TIMEX2 VAL="XXXX-XX-XX" SET="YES">every day</TIMEX2>
   2. Non-specific frequencies:
      <TIMEX2 VAL="TNI" SET="YES">some nights</TIMEX2>
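The annotated examples on these slides can be read back programmatically. The following is a minimal sketch, not part of the QALL-ME tools: the names parse_timex2 and coarse_class are hypothetical, and the rule used to guess the top-level class (SET="YES" means set-denoting, a VAL starting with "P" other than PAST_REF/PRESENT_REF means duration) is an assumption drawn only from the examples shown here.

```python
import re

# One TIMEX2 element: attribute string in group 1, annotated extent in group 2.
TIMEX2_RE = re.compile(r"<TIMEX2\b([^>]*)>(.*?)</TIMEX2>", re.IGNORECASE | re.DOTALL)
# One attribute of the form NAME="value".
ATTR_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_timex2(text):
    """Return a list of (extent, attributes) pairs found in `text`."""
    results = []
    for match in TIMEX2_RE.finditer(text):
        # Attribute names are upper-cased so that val= and VAL= are treated alike.
        attrs = {name.upper(): value for name, value in ATTR_RE.findall(match.group(1))}
        results.append((match.group(2), attrs))
    return results


def coarse_class(attrs):
    """Hypothetical helper: guess the top-level TE class from the attributes."""
    val = attrs.get("VAL", "")
    if attrs.get("SET") == "YES":
        return "set"
    if val.startswith("P") and not val.endswith("_REF"):
        return "duration"
    return "time position"


if __name__ == "__main__":
    sample = (
        'open <TIMEX2 VAL="XXXX-XX-XX" SET="YES">every day</TIMEX2> '
        'for <TIMEX2 VAL="PT24H" ANCHOR_DIR="WITHIN" ANCHOR_VAL="2009-07-02">24 hours</TIMEX2>'
    )
    for extent, attrs in parse_timex2(sample):
        print(extent, attrs, coarse_class(attrs))
```

The heuristic in coarse_class is deliberately crude; a real TIMEX2 consumer would follow the full guidelines rather than inspect the first character of VAL.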
The temporal annotator in QALL-ME
the QALL-ME QA system required a module that adds temporal expression annotations to a user question
the TIMEX2 standard was adopted as the temporal annotation scheme for the following reasons:
  a common annotation schema shared by all QALL-ME partners
  usability of the QALL-ME benchmark outside the QALL-ME project
  re-use of existing annotation tools that annotate according to the TIMEX2 standard
a system that performs TIMEX2 tagging involves two stages:
  TE identification: detecting the textual extent of each TE present in a text
  TE normalisation: determining the final values of the attributes attached to each TE
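The two stages can be illustrated with a toy pipeline. This is only a sketch under stated assumptions, not the QALL-ME annotator: the pattern covers just "today", "tomorrow" and 12-hour clock times, the function names identify, normalise and annotate are hypothetical, and each TE is normalised independently against a reference date supplied by the caller.

```python
import re
from datetime import date, timedelta

# Stage 1 pattern (illustrative only): a few deictic words and am/pm clock times.
TE_PATTERN = re.compile(
    r"\b(?:today|tomorrow|\d{1,2}(?::\d{2})?\s?(?:am|pm))\b", re.IGNORECASE
)
CLOCK_RE = re.compile(r"(\d{1,2})(?::(\d{2}))?\s?(am|pm)")


def identify(question):
    """Stage 1: return (start, end, text) for each candidate TE extent."""
    return [(m.start(), m.end(), m.group(0)) for m in TE_PATTERN.finditer(question)]


def normalise(te_text, reference):
    """Stage 2: compute a TIMEX2-style VAL relative to the reference date."""
    lowered = te_text.lower()
    if lowered == "today":
        return reference.isoformat()
    if lowered == "tomorrow":
        return (reference + timedelta(days=1)).isoformat()
    m = CLOCK_RE.fullmatch(lowered)
    if m is None:
        raise ValueError(f"unsupported temporal expression: {te_text!r}")
    # 12 am maps to hour 0 and 12 pm to hour 12.
    hour = int(m.group(1)) % 12 + (12 if m.group(3) == "pm" else 0)
    minute = int(m.group(2) or 0)
    return f"{reference.isoformat()}T{hour:02d}:{minute:02d}"


def annotate(question, reference):
    """Run both stages and wrap each identified extent in a TIMEX2 tag."""
    out, last = [], 0
    for start, end, text in identify(question):
        out.append(question[last:start])
        out.append(f'<TIMEX2 VAL="{normalise(text, reference)}">{text}</TIMEX2>')
        last = end
    out.append(question[last:])
    return "".join(out)


if __name__ == "__main__":
    print(annotate("Which films are on tomorrow?", date(2009, 7, 2)))
```

Keeping identification and normalisation as separate functions mirrors the two-stage description: the first stage decides only where a TE starts and ends, and the second needs extra context (here, the reference date) to fill in the attribute values.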
TE identification (1)
TE identification (2)
Why is it easy?
in general, calendar dates ("2009-07-02") represent more than 50% of all TEs present in user questions
TE normalisation
Simplifying an existing TE annotator for QALL-ME
a complex TE annotator is available at the University of Wolverhampton (the initial TE annotator):
  it can annotate all types of TE according to TIMEX2
  it has high performance
the simplified TE annotator:
  covers only a few TE classes (the most frequent types of TE in user questions)
  follows the design and methodology of the initial TE annotator
The QALL-ME benchmark
a collection of several thousand spoken questions in 4 languages:
1 Italian
2 English
3 Spanish
4 German
purposes:
1 to allow the development of machine-learning-based applications for QA
2 to enable testing their performance in a controlled laboratory setting
to date the benchmark contains 15,479 questions related to cultural events and tourism, e.g. accommodation, gastro, cinemas, movies, exhibitions, etc.
from the 4,501 questions in the English part of the QALL-ME benchmark, we selected 1,118 questions for our experiments
Distribution of TEs
1,118 randomly selected user questions have been manually annotated according to TIMEX2
the QALL-ME user questions do not cover the full range of possible TEs

Class         Subclass                     Time position   Duration   Frequency
Precise       Calendar dates               86              19         13
              Times of day                 81
              Week                         12
Fuzzy         Past, Present, Future        0               1          N/A
              Seasons and parts of year    1
              Weekends                     16
              Day parts                    20
Non-specific                               43              10         0

Table: Distribution of TEs in the QALL-ME user questions (Duration and Frequency counts are given per class)
Coverage of the simplified TE annotator
Class         Subclass                     Time position   Duration   Frequency
Precise       Calendar dates
              Times of day
              Week
Fuzzy         Past, Present, Future                                   N/A
              Seasons and parts of year
              Weekends
              Day parts
Non-specific

Table: TE classes partially covered by the QALL-ME temporal annotator
Evaluation results
the gold standard consists of 1,118 user questions manually annotated according to TIMEX2 guidelines

                                     complete and partial matches   complete matches
precision (initial annotator)        96.0%                          89.0%
recall (initial annotator)           95.0%                          88.1%
F-measure (initial annotator)        95.5%                          88.5%
precision (simplified annotator)     96.6%                          82.4%
recall (simplified annotator)        76.2%                          64.9%
F-measure (simplified annotator)     85.2%                          72.6%

Table: the initial TE identifier and the simplified version evaluated against the gold standard
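The F-measure rows in the table are consistent with the balanced F-measure, i.e. the harmonic mean of precision and recall. The slides do not state the formula, so this is an assumption; the short check below recomputes each F-measure from the reported precision and recall.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure), all in percent, copied from the table.
reported = {
    "initial, complete and partial": (96.0, 95.0, 95.5),
    "initial, complete only": (89.0, 88.1, 88.5),
    "simplified, complete and partial": (96.6, 76.2, 85.2),
    "simplified, complete only": (82.4, 64.9, 72.6),
}

if __name__ == "__main__":
    for setting, (p, r, f_reported) in reported.items():
        print(f"{setting}: computed {f_measure(p, r):.1f}, reported {f_reported}")
```

For the simplified annotator the gap between precision and recall pulls the F-measure well below the precision figure, which is why the harmonic mean is more informative here than a simple average.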
Error analysis
some expressions are not captured by the simplified annotator:
  a check-out time
  the check-in time
  the minimum age
  weekday daytime
some expressions are only partially captured by the simplified annotator:

simplified annotator            gold standard
<8-years>-old                   8-years-old
weekday <evening>               weekday evening
<10 pm> <tomorrow>              10 pm tomorrow
1 hundred and <23 minutes>      1 hundred and 23 minutes

Table: examples of partial matches
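The difference between complete and partial matches can be made concrete with character spans. The sketch below is an assumption about how such scoring might work, since the slides do not give the exact matching criteria: a match is "complete" when the system span equals the gold span and "partial" when the two spans merely overlap. The names match_type and score are hypothetical.

```python
def match_type(system_span, gold_span):
    """Compare two (start, end) character spans, end exclusive."""
    if system_span == gold_span:
        return "complete"
    overlap = min(system_span[1], gold_span[1]) - max(system_span[0], gold_span[0])
    return "partial" if overlap > 0 else "none"


def score(system_spans, gold_spans, accept=("complete",)):
    """Return (precision, recall), counting only match types listed in `accept`."""
    matched_sys = sum(
        any(match_type(s, g) in accept for g in gold_spans) for s in system_spans
    )
    matched_gold = sum(
        any(match_type(s, g) in accept for s in system_spans) for g in gold_spans
    )
    precision = matched_sys / len(system_spans) if system_spans else 0.0
    recall = matched_gold / len(gold_spans) if gold_spans else 0.0
    return precision, recall


if __name__ == "__main__":
    # "weekday evening": the gold span covers the whole phrase,
    # while the system span covers only "evening".
    gold = [(0, 15)]
    system = [(8, 15)]
    print(score(system, gold))                                  # strict scoring
    print(score(system, gold, accept=("complete", "partial")))  # lenient scoring
```

Under strict scoring the "weekday <evening>" case counts as both a precision and a recall error, whereas under lenient scoring it counts as correct, which matches the direction of the gap between the two columns of the evaluation table.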
Conclusions
the paper presented a simplified temporal processor used by an English QA system:
  implemented for a specific task
  obtained acceptable performance
  proved sufficient for a practical application