
Page 1: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 1

Open Domain Question Answering:

Techniques, Resources and Systems

Bernardo Magnini

Itc-Irst

Trento, Italy

[email protected]

Page 2: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 2

Outline of the Tutorial

I. Introduction to QA
II. QA at TREC
III. System Architecture
    - Question Processing
    - Answer Extraction
IV. Answer Validation on the Web
V. Cross-Language QA

Page 3: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 3

Previous Lectures/Tutorials on QA

Dan Moldovan and Sanda Harabagiu: Question Answering, IJCAI 2001.

Maarten de Rijke and Bonnie Webber: Knowledge-Intensive Question Answering, ESSLLI 2003.

Jimmy Lin and Boris Katz: Web-based Question Answering, EACL 2004.

Page 4: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 4

I. Introduction to Question Answering

What is Question Answering
Applications
Users
Question Types
Answer Types
Evaluation
Presentation
Brief history

Page 5: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 5

Query Driven vs Answer Driven Information Access

What does LASER stand for? When did Hitler attack the Soviet Union?

Using Google we find documents containing the question itself, no matter whether or not the answer is actually provided.

Current information access is query driven. Question Answering proposes an answer-driven approach to information access.

Page 6: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 6

Question Answering

Find the answer to a question in a large collection of documents

questions (in place of keyword-based query) answers (in place of documents)

Page 7: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 7

Why Question Answering?

What is the highest volcano in Europe?  →  Searching for: Etna
Where is Naxos?                         →  Searching for: Naxos
What continent is Taormina in?          →  Searching for: Taormina

Document collection

From the Caledonian Star in the Mediterranean – September 23, 1990 (www.expeditions.com):

On a beautiful early morning the Caledonian Star approaches Naxos, situated on the east coast of Sicily. As we anchored and put the Zodiacs into the sea we enjoyed the great scenery. Under Mount Etna, the highest volcano in Europe, perches the fabulous town of Taormina. This is the goal for our morning. After a short Zodiac ride we embarked our buses with local guides and went up into the hills to reach the town of Taormina. Naxos was the first Greek settlement at Sicily. Soon a harbor was established but the town was later destroyed by invaders. [...]

Page 8: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 8

Alternatives to Information Retrieval

Document Retrieval
- users submit queries corresponding to their information need
- the system returns a (voluminous) list of full-length documents
- it is up to the users to find the information they were looking for within the returned documents

Open-Domain Question Answering (QA)
- users ask fact-based, natural language questions
  What is the highest volcano in Europe?
- the system returns a list of short answers
  … Under Mount Etna, the highest volcano in Europe, perches the fabulous town …
- more appropriate for specific information needs

Page 9: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 9

What is QA?

Find the answer to a question in a large collection of documents

What is the brightest star visible from Earth?

1. Sirius A is the brightest star visible from Earth even if it is…

2. the planet is 12 times brighter than Sirius, the brightest star in the sky…

Page 10: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 10

QA: a Complex Problem (1)

Problem: discovering implicit relations between questions and answers

Who is the author of the “Star Spangled Banner”?

…Francis Scott Key wrote the “Star Spangled Banner” in 1814.

…comedian-actress Roseanne Barr sang her famous rendition of the “Star Spangled Banner” before …

Page 11: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 11

QA: a Complex Problem (2)

Problem: discovering implicit relations between questions and answers

What is Mozart's birth date?

…. Mozart (1756 – 1791) ….

Page 12: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 12

QA: a Complex Problem (3)

Problem: discovering implicit relations between questions and answers

What is the distance between Naples and Ravello?

“From the Naples Airport follow the sign to Autostrade (green road sign). Follow the directions to Salerno (A3). Drive for about 6 Km. Pay toll (Euros 1.20). Drive appx. 25 Km. Leave the Autostrade at Angri (Uscita Angri). Turn left, follow the sign to Ravello through Angri. Drive for about 2 Km. Turn right following the road sign "Costiera Amalfitana". Within 100m you come to traffic lights prior to narrow bridge. Watch not to miss the next Ravello sign, at appx. 1 Km from the traffic lights. Now relax and enjoy the views (follow this road for 22 Km). Once in Ravello ...”.

Page 13: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 13

QA: Applications (1)

Information access over:
- Structured data (databases)
- Semi-structured data (e.g. comment fields in databases, XML)
- Free text

To search over:
- The Web
- A fixed set of text collections (e.g. TREC)
- A single text (reading comprehension evaluation)

Page 14: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 14

QA: Applications (2)

Domain-independent QA
Domain-specific QA (e.g. help systems)

Multi-modal QA:
- Annotated images
- Speech data

Page 15: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 15

QA: Users

Casual users, first-time users:
- Understand the limitations of the system
- Interpretation of the returned answer

Expert users:
- Difference between novel and already provided information
- User model

Page 16: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 16

QA: Questions (1)

Classification according to the answer type:
- Factual questions (What is the largest city …)
- Opinions (What is the author's attitude …)
- Summaries (What are the arguments for and against …)

Classification according to the question speech act:
- Yes/No questions (Is it true that …)
- WH questions (Who was the first president …)
- Indirect requests (I would like you to list …)
- Commands (Name all the presidents …)

Page 17: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 17

QA: Questions (2)

Difficult questions:
- Why and How questions require understanding causality or instrumental relations
- What questions have little constraint on the answer type (e.g. What did they do?)

Page 18: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 18

QA: Answers

- Long answers, with justification
- Short answers (e.g. phrases)
- Exact answers (named entities)

Answer construction:
- Extraction: cut and paste of snippets from the original document(s)
- Generation: from multiple sentences or documents

QA and summarization (e.g. What is this story about?)

Page 19: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 19

QA: Information Presentation

Interfaces for QA:
- Not just isolated questions, but a dialogue
- Usability and user satisfaction

Critical situations:
- Real time, single answer

Dialog-based interaction:
- Speech input
- Conversational access to the Web

Page 20: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 20

QA: Brief History (1)

NLP interfaces to databases: BASEBALL (1961), LUNAR (1973), TEAM (1979), ALFRESCO (1992)
- Limitations: structured knowledge and limited domains

Story comprehension: Schank (1977), Kintsch (1998), Hirschman (1999)

Page 21: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 21

QA: Brief History (2)

Information retrieval (IR):
- Queries are questions
- Lists of documents are answers
- QA is close to passage retrieval
- Well-established methodologies (i.e. the Text Retrieval Conferences, TREC)

Information extraction (IE):
- Pre-defined templates are questions
- Filled templates are answers

Page 22: Open Domain Question Answering: Techniques, Resources and Systems

22RANLP 2005 - Bernardo Magnini

Research Context (1)

Question Answering can be characterized along several dimensions: domain-specific vs. domain-independent; over structured data vs. free text; searching the Web, a fixed set of collections, or a single document.

Growing interest in QA (TREC, CLEF and NTCIR evaluation campaigns).

Recent focus on multilinguality and context-aware QA.

Page 23: Open Domain Question Answering: Techniques, Resources and Systems

23RANLP 2005 - Bernardo Magnini

Research Context (2)

(Diagram: Automatic Summarization, Machine Translation and Automatic Question Answering positioned along two axes, compactness and faithfulness: summaries should be as compact as possible, translations as faithful as possible.)

Answers must be faithful w.r.t. questions (correctness) and compact (exactness).

Page 24: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 24

II. Question Answering at TREC

The problem simplified
Questions and answers
Evaluation metrics
Approaches

Page 25: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 25

The problem simplified: the Text Retrieval Conference

Goal: encourage research in information retrieval based on large-scale collections

Sponsors:
- NIST: National Institute of Standards and Technology
- ARDA: Advanced Research and Development Activity
- DARPA: Defense Advanced Research Projects Agency

The QA track has run since 1999; participants are research institutes, universities and companies.

Page 26: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 26

TREC Questions

Fact-based, short-answer questions:
Q-1391: How many feet in a mile?
Q-1057: Where is the volcano Mauna Loa?
Q-1071: When was the first stamp issued?
Q-1079: Who is the Prime Minister of Canada?
Q-1268: Name a food high in zinc.

Definition questions:
Q-896: Who was Galileo?
Q-897: What is an atom?

Reformulation questions:
Q-711: What tourist attractions are there in Reims?
Q-712: What do most tourists visit in Reims?
Q-713: What attracts tourists in Reims?
Q-714: What are tourist attractions in Reims?

Page 27: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 27

Answer Assessment

Criteria for judging an answer:
- Relevance: it should be responsive to the question
- Correctness: it should be factually correct
- Conciseness: it should not contain extraneous or irrelevant information
- Completeness: it should be complete, i.e. a partial answer should not get full credit
- Simplicity: it should be simple, so that the questioner can read it easily
- Justification: it should be supplied with sufficient context to allow a reader to determine why it was chosen as an answer to the question

Page 28: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 28

Questions at TREC

Question types, by expected answer cardinality:
- Yes/No: Is Berlin the capital of Germany? (single answer)
- Entity: What is the largest city in Germany? (single answer); Name 9 countries that import Cuban sugar (multiple answer)
- Definition: Who was Galileo? (single answer)
- Opinion/Procedure/Explanation: What are the arguments for and against prayer in school? (multiple answer)

Page 29: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 29

Exact Answers

Basic unit of a response: an [answer-string, docid] pair. An answer string must contain a complete, exact answer and nothing else.

What is the longest river in the United States?

The following are correct, exact answers:
- Mississippi
- the Mississippi
- the Mississippi River
- Mississippi River
- mississippi

while none of the following are correct exact answers:
- At 2,348 miles the Mississippi River is the longest river in the US.
- 2,348 miles; Mississippi
- Missipp
- Missouri

Page 30: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 30

Assessments

Four possible judgments for a triple [question, document, answer]:

- Right: the answer is appropriate for the question
- Inexact: used for incomplete answers
- Unsupported: answers without justification
- Wrong: the answer is not appropriate for the question

Page 31: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 31

What is the capital city of New Zealand?                                   R  1530  XIE19990325.0298  Wellington
What is the Boston Strangler's name?                                       R  1490  NYT20000913.0267  Albert DeSalvo
What is the world's second largest island?                                 R  1503  XIE19991018.0249  New Guinea
What year did Wilt Chamberlain score 100 points?                           U  1402  NYT19981017.0283  1962
Who is the governor of Tennessee?                                          R  1426  NYT19981030.0149  Sundquist
What's the name of King Arthur's sword?                                    U  1506  NYT19980618.0245  Excalibur
When did Einstein die?                                                     R  1601  NYT19990315.0374  April 18, 1955
What was the name of the plane that dropped the Atomic Bomb on Hiroshima?  X  1848  NYT19991001.0143  Enola
What was the name of FDR's dog?                                            R  1838  NYT20000412.0164  Fala
What day did Neil Armstrong land on the moon?                              R  1674  APW19990717.0042  July 20, 1969
Who was the first Triple Crown Winner?                                     X  1716  NYT19980605.0423  Barton
When was Lyndon B. Johnson born?                                           R  1473  APW19990826.0055  1908
Who was Woodrow Wilson's First Lady?                                       R  1622  NYT19980903.0086  Ellen
Where is Anne Frank's diary?                                               W  1510  NYT19980909.0338  Young Girl

R=Right, X=ineXact, U=Unsupported, W=Wrong

Page 32: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 32

1402: What year did Wilt Chamberlain score 100 points?

DIOGENE: 1962

ASSESSMENT: UNSUPPORTED

PARAGRAPH: NYT19981017.0283

Petty's 200 victories, 172 of which came during a 13-year span between 1962-75, may be as unapproachable as Joe DiMaggio's 56-game hitting streak or Wilt Chamberlain's 100-point game.

Page 33: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 33

1506: What's the name of King Arthur's sword?

ANSWER: Excalibur
PARAGRAPH: NYT19980618.0245

ASSESSMENT: UNSUPPORTED

`QUEST FOR CAMELOT,' with the voices of Andrea Carr, Gabriel Byrne, Cary Elwes, John Gielgud, Jessalyn Gilsig, Eric Idle, Gary Oldman, Bronson Pinchot, Don Rickles and Bryan White. Directed by Frederik Du Chau (G, 100 minutes). Warner Brothers' shaky entrance into the Disney-dominated sweepstakes of the musicalized animated feature wants to be a juvenile feminist ``Lion King'' with a musical heart that fuses ``Riverdance'' with formulaic Hollywood gush. But its characters are too wishy-washy and visually unfocused to be compelling, and the songs (by David Foster and Carole Bayer Sager) so forgettable as to be extraneous. In this variation on the Arthurian legend, a nondescript Celtic farm girl named Kayley with aspirations to be a knight wrests the magic sword Excalibur from the evil would-be emperor Ruber (a Hulk Hogan look-alike) and saves the kingdom (Holden).

Page 34: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 34

1848: What was the name of the plane that dropped the Atomic Bomb on Hiroshima?

DIOGENE: Enola
PARAGRAPH: NYT19991001.0143

ASSESSMENT: INEXACT

Tibbets piloted the Boeing B-29 Superfortress Enola Gay, which dropped the atomic bomb on Hiroshima on Aug. 6, 1945, causing an estimated 66,000 to 240,000 deaths. He named the plane after his mother, Enola Gay Tibbets.

Page 35: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 35

1716: Who was the first Triple Crown Winner?

DIOGENE: Barton
PARAGRAPH: NYT19980605.0423

ASSESSMENT: INEXACT

Not all of the Triple Crown winners were immortals. The first, Sir Barton, lost six races in 1918 before his first victory, just as Real Quiet lost six in a row last year. Try to find Omaha and Whirlaway on anybody's list of all-time greats.

Page 36: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 36

1510: Where is Anne Frank's diary?

DIOGENE: Young Girl
PARAGRAPH: NYT19980909.0338

ASSESSMENT: WRONG

Otto Frank released a heavily edited version of “B” for its first publication as “Anne Frank: Diary of a Young Girl” in 1947.

Page 37: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 37

TREC Evaluation Metric: Mean Reciprocal Rank (MRR)

Reciprocal Rank = inverse of the rank at which the first correct answer was found: 1, 0.5, 0.33, 0.25, 0.2, or 0 if no correct answer is returned (systems returned up to five ranked answers).

MRR: average over all questions
- Strict score: unsupported answers count as incorrect
- Lenient score: unsupported answers count as correct
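A minimal Python sketch of the metric above (illustrative, not from the tutorial): the reciprocal rank is computed per question from the TREC judgment letters and averaged, with the strict/lenient treatment of Unsupported answers.

def reciprocal_rank(judgments, strict=True):
    # judgments: TREC letters ('R', 'U', 'X', 'W') for one question's ranked answers
    accept = {'R'} if strict else {'R', 'U'}   # lenient: Unsupported counts as correct
    for rank, judgment in enumerate(judgments, start=1):
        if judgment in accept:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(per_question_judgments, strict=True):
    # average the reciprocal ranks over all questions
    ranks = [reciprocal_rank(j, strict) for j in per_question_judgments]
    return sum(ranks) / len(ranks)

# first correct answer at rank 1, at rank 2, and never: MRR = (1 + 0.5 + 0) / 3 = 0.5
print(mean_reciprocal_rank([['R'], ['W', 'R'], ['W', 'W', 'W', 'W', 'W']]))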

Page 38: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 38

TREC Evaluation Metric: Confidence-Weighted Score (CWS)

Questions are sorted by the system's confidence in its answer (most confident first):

CWS = (1/500) * Sum for i = 1 to 500 of (# correct in the first i questions) / i

Example over 5 questions (C = correct, W = wrong):

System A: C W C C W
CWS_A = [ (1/1) + (1+0)/2 + (1+0+1)/3 + (1+0+1+1)/4 + (1+0+1+1+0)/5 ] / 5 ≈ 0.70

System B: W W C C C
CWS_B = [ 0 + (0+0)/2 + (0+0+1)/3 + (0+0+1+1)/4 + (0+0+1+1+1)/5 ] / 5 ≈ 0.29
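A minimal Python sketch of the same computation (illustrative, not from the tutorial); the questions are assumed to be already sorted by decreasing confidence.

def confidence_weighted_score(correct_flags):
    # correct_flags: True/False per question, ordered by decreasing confidence
    total, correct_so_far = 0.0, 0
    for i, flag in enumerate(correct_flags, start=1):
        correct_so_far += int(flag)
        total += correct_so_far / i          # fraction correct up to rank i
    return total / len(correct_flags)

print(round(confidence_weighted_score([True, False, True, True, False]), 2))   # System A: ≈ 0.70
print(round(confidence_weighted_score([False, False, True, True, True]), 2))   # System B: ≈ 0.29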

Page 39: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 39

Evaluation

TREC-8:  best result 66%, average 25%
TREC-9:  best result 58%, average 24%
TREC-10: best result 67%, average 23% (over 67 runs)

Page 40: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 40

Main Approaches at TREC

Knowledge-based
Web-based
Pattern-based

Page 41: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 41

Knowledge-Based Approach

Linguistic-oriented methodology:
- Determine the answer type from the question form
- Retrieve small portions of documents
- Find entities matching the answer type category in the text snippets

The majority of systems use a lexicon (usually WordNet):
- To find the answer type
- To verify that a candidate answer is of the correct type
- To get definitions

Complex architecture...

Page 42: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 42

Web-Based Approach

(Architecture: QUESTION → Question Processing Component → Search Component, which queries the Web and an auxiliary corpus alongside the TREC corpus → Answer Extraction Component → ANSWER.)

Page 43: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 43

Pattern-Based Approach (1/3)

Knowledge-poor strategy:
- Search for predefined patterns of textual expressions that may be interpreted as answers to certain question types.
- The presence of such patterns in candidate answer strings may provide evidence of the right answer.

Page 44: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 44

Pattern-Based Approach (2/3)

Conditions:
- A detailed categorization of question types: up to 9 types of the “Who” question, 35 categories in total
- A significant number of patterns for each question type: up to 23 patterns for the “Who-Author” type, 15 on average
- Find multiple candidate snippets and check for the presence of patterns (emphasis on recall)

Page 45: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 45

Pattern-Based Approach (3/3)

Example: patterns for definition questions. Question: What is A?

1. <A; is/are; [a/an/the]; X> … 23 correct answers
2. <A; comma; [a/an/the]; X; [comma/period]> … 26 correct answers
3. <A; [comma]; or; X; [comma]> … 12 correct answers
4. <A; dash; X; [dash]> … 9 correct answers
5. <A; parenthesis; X; parenthesis> … 8 correct answers
6. <A; comma; [also] called; X [comma]> … 7 correct answers
7. <A; is called; X> … 3 correct answers

Total: 88 correct answers
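Surface patterns like these can be approximated with regular expressions. A minimal Python sketch (illustrative only; the regexes and the sample snippet are invented for the example, not taken from the tutorial):

import re

def definition_patterns(a):
    a = re.escape(a)
    return [
        rf"{a}\s+(?:is|are)\s+(?:(?:a|an|the)\s+)?(?P<X>[^,.]+)",   # <A; is/are; [a/an/the]; X>
        rf"{a},\s*(?:(?:a|an|the)\s+)?(?P<X>[^,.]+)[,.]",           # <A; comma; [a/an/the]; X; [comma/period]>
        rf"{a},?\s+or\s+(?P<X>[^,.]+),",                            # <A; [comma]; or; X; [comma]>
        rf"{a}\s*\(\s*(?P<X>[^)]+)\)",                              # <A; parenthesis; X; parenthesis>
        rf"{a},\s*(?:also\s+)?called\s+(?P<X>[^,.]+)",              # <A; comma; [also] called; X>
    ]

snippet = "An atom, the smallest unit of ordinary matter, consists of a nucleus and electrons."
for pattern in definition_patterns("atom"):
    match = re.search(pattern, snippet, re.IGNORECASE)
    if match:
        print(match.group("X").strip())   # -> smallest unit of ordinary matter
        break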

Page 46: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 46

Use of answer patterns

1. For generating queries to the search engine:

How did Mahatma Gandhi die?
- Mahatma Gandhi die <HOW>
- Mahatma Gandhi die of <HOW>
- Mahatma Gandhi lost his life in <WHAT>

The TEXTMAP system (ISI) uses 550 patterns, grouped in 105 equivalence blocks. On TREC-2003 questions, the system produced, on average, 5 reformulations for each question.

2. For answer extraction:

When was Mozart born?
P=1    <PERSON> (<BIRTHDATE> - DATE)
P=.69  <PERSON> was born on <BIRTHDATE>

Page 47: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 47

Acquisition of Answer Patterns

Relevant approaches:
- Manually developed surface pattern library (Soubbotin and Soubbotin, 2001)
- Automatically extracted surface patterns (Ravichandran and Hovy, 2002)

Pattern learning:
1. Start with a seed, e.g. (Mozart, 1756)
2. Download Web documents using a search engine
3. Retain sentences that contain both question and answer terms
4. Construct a suffix tree for extracting the longest matching substring that spans <Question> and <Answer>
5. Calculate the precision of each pattern:
   Precision = # of matches with the correct answer / # of total matches of the pattern
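A minimal Python sketch of the precision step (illustrative, not from the tutorial: the placeholder pattern syntax, the sentences and the helper names are invented, and a real implementation would obtain the candidate patterns from the suffix-tree extraction described above):

import re

def compile_pattern(pattern, question_term):
    # pattern uses <Q> and <A> placeholders, e.g. "<Q> ( <A> -"
    parts = []
    for token in pattern.split():
        if token == "<Q>":
            parts.append(re.escape(question_term))
        elif token == "<A>":
            parts.append(r"(?P<A>\w+)")
        else:
            parts.append(re.escape(token))
    return re.compile(r"\s*".join(parts))

def pattern_precision(pattern, sentences, question_term, correct_answer):
    # fraction of the pattern's matches that yield the known correct answer
    regex = compile_pattern(pattern, question_term)
    matched = correct = 0
    for sentence in sentences:
        m = regex.search(sentence)
        if m:
            matched += 1
            correct += int(m.group("A") == correct_answer)
    return correct / matched if matched else 0.0

sentences = ["Mozart (1756 - 1791) was a prolific composer.",
             "Mozart (Vienna - Austria) hosts a festival every year."]
print(pattern_precision("<Q> ( <A> -", sentences, "Mozart", "1756"))   # 0.5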

Page 48: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 48

Capturing variability with patterns

Pattern-based QA is more effective when supported by variable typing, obtained using NLP techniques and resources.

When was <A> born?
<A:PERSON> (<ANSWER:DATE> -
<A:PERSON> was born in <ANSWER:DATE>

Surface patterns cannot deal with word reordering and apposition phrases: "Galileo, the famous astronomer, was born in …"

The fact that most QA systems use syntactic parsing shows that a successful solution to the answer extraction problem goes beyond surface-form analysis.

Page 49: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 49

Syntactic answer patterns (1)

Answer patterns that capture the syntactic relations of a sentence.

When was <A> invented?

(Pattern as a parse tree: S → NP [The <A>] VP [was invented PP [in <ANSWER>]])

Page 50: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 50

Syntactic answer patterns (2)

(Matching tree: S → NP [The first phonograph] VP [was invented PP [in 1877]])

The matching phase turns out to be a problem of partial matching between syntactic trees.

Page 51: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 51

III. System Architecture

Knowledge-based approach
Question Processing
Search Component
Answer Extraction

Page 52: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 52

Knowledge based QA

(Architecture:

QUESTION → Question Processing Component (TOKENIZATION & POS TAGGING, MULTIWORDS RECOGNITION, QUESTION PARSING, ANSWER TYPE IDENTIFICATION, WORD SENSE DISAMBIGUATION, KEYWORDS EXPANSION, QUERY COMPOSITION)

→ Search Component (SEARCH ENGINE over the document collection, PARAGRAPH FILTERING)

→ Answer Extraction Component (NAMED ENTITIES RECOGNITION, ANSWER IDENTIFICATION, ANSWER VALIDATION)

→ ANSWER)

Page 53: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 53

Question Analysis (1)

Input: a natural language question

Output:
- a query for the search engine (i.e. a boolean composition of weighted keywords)
- the answer type
- additional constraints: the question focus, and syntactic or semantic relations that should hold between a candidate answer entity and other entities

Page 54: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 54

Question Analysis (2)

Steps:
1. Tokenization
2. POS-tagging
3. Multi-word recognition
4. Parsing
5. Answer type and focus identification
6. Keyword extraction
7. Word Sense Disambiguation
8. Expansions

Page 55: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 55

Tokenization and POS-tagging

NL-QUESTION: Who was the inventor of the electric light?

Who Who CCHI [0,0]

was be VIY [1,1]

the det RS [2,2]

inventor inventor SS [3,3]

of of ES [4,4]

the det RS [5,5]

electric electric AS [6,6]

light light SS [7,7]

? ? XPS [8,8]

Page 56: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 56

Multi-Words recognition

NL-QUESTION: Who was the inventor of the electric light?

Who Who CCHI [0,0]

was be VIY [1,1]

the det RS [2,2]

inventor inventor SS [3,3]

of of ES [4,4]

the det RS [5,5]

electric_light electric_light SS [6,7]

? ? XPS [8,8]

Page 57: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 57

Syntactic Parsing

Identify the syntactic structure of a sentence: noun phrases (NP), verb phrases (VP), prepositional phrases (PP), etc.

Why did David Koresh ask the FBI for a word processor?

(Parse: [SBARQ [WHADVP Why] [SQ did [NP David Koresh] [VP ask [NP the FBI] [PP for [NP a word processor]]]]]; POS tags: WRB VBD NNP NNP VB DT NNP IN DT NN NN)

Page 58: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 58

Answer Type and Focus

The focus is the word that expresses the relevant entity in the question, used to select a set of relevant documents.
E.g.: Where was Mozart born?

The answer type is the category of the entity to be searched for as the answer: PERSON, MEASURE, TIME PERIOD, DATE, ORGANIZATION, DEFINITION, …
E.g.: Where was Mozart born? → LOCATION

Page 59: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 59

Answer Type and Focus

What famous communist leader died in Mexico City?

RULENAME: WHAT-WHO
TEST: [“what” [¬ NOUN]* [NOUN:person-p]J +]
OUTPUT: [“PERSON” J]

Answer type: PERSON
Focus: leader

This rule matches any question starting with “what” whose first noun, if any, is a person (i.e. satisfies the person-p predicate).
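A minimal Python sketch of how such a rule might be applied (illustrative, not the actual rule engine: the small person-noun set stands in for a WordNet-based person-p check, and this toy simply skips words until the first person noun rather than checking the part of speech of the intervening words):

PERSON_NOUNS = {"leader", "president", "author", "inventor", "actress"}

def what_who_rule(question_tokens):
    # toy WHAT-WHO: a "what" question whose first person-denoting noun
    # becomes the focus and fixes the answer type to PERSON
    if question_tokens and question_tokens[0].lower() == "what":
        for token in question_tokens[1:]:
            if token.lower() in PERSON_NOUNS:        # stands in for the person-p test
                return "PERSON", token.lower()
    return "UNKNOWN", None

print(what_who_rule("What famous communist leader died in Mexico City ?".split()))
# -> ('PERSON', 'leader')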

Page 60: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 60

Keywords ExtractionNL-QUESTION: Who was the inventor of the electric light?

Who Who CCHI [0,0]

was be VIY [1,1]

the det RS [2,2]

inventor inventor SS [3,3]

of of ES [4,4]

the det RS [5,5]

electric_light electric_light SS [6,7]

? ? XPS [8,8]

Page 61: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 61

Word Sense Disambiguation

What is the brightest star visible from Earth?

STAR star#1: celestial body ASTRONOMY star#2: an actor who plays … ART

BRIGHT bright #1: bright brilliant shining PHYSICS bright #2: popular glorious GENERIC bright #3: promising auspicious GENERIC

VISIBLE visible#1: conspicuous obvious PHYSICSvisible#2: visible seeable ASTRONOMY

EARTH earth#1: Earth world globe ASTRONOMYearth #2: estate land landed_estate acres ECONOMYearth #3: clay GEOLOGYearth #4: dry_land earth solid_ground GEOGRAPHY earth #5: land ground soil GEOGRAPHYearth #6: earth ground GEOLOGY

Page 62: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 62

Expansions

- NL-QUESTION: Who was the inventor of the electric light?
- BASIC-KEYWORDS: inventor, electric_light

inventor
  synonyms: discoverer, artificer
  derivation: invention (synonyms: innovation)
  derivation: invent (synonyms: excogitate)

electric_light
  synonyms: incandescent_lamp, light_bulb

Page 63: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 63

Keyword Composition

Keywords and expansions are composed in a boolean expression with AND/OR operators.

Several possibilities: AND composition, Cartesian composition:

((inventor AND electric_light) OR (inventor AND incandescent_lamp) OR (discoverer AND electric_light) OR … OR inventor OR electric_light)
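A minimal Python sketch of the Cartesian composition (illustrative, not from the tutorial; the expansion lists reuse the example from the previous slide):

from itertools import product

def compose_query(expansions):
    # expansions: {keyword: [keyword and its synonyms]}
    and_terms = ["(" + " AND ".join(combo) + ")" for combo in product(*expansions.values())]
    fallback = list(expansions.keys())               # single keywords as a last resort
    return "(" + " OR ".join(and_terms + fallback) + ")"

print(compose_query({
    "inventor": ["inventor", "discoverer", "artificer"],
    "electric_light": ["electric_light", "incandescent_lamp", "light_bulb"],
}))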

Page 64: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 64

Document Collection Pre-processing

For real-time QA applications, off-line pre-processing of the text is necessary:
- Term indexing
- POS-tagging
- Named Entity Recognition

Page 65: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 65

Candidate Answer Document Selection

Passage selection: identify relevant, small text portions.

Given a document and a list of keywords:
- fix a passage length (e.g. 200 words)
- consider the percentage of keywords present in the passage
- consider whether some keyword is obligatory (e.g. the focus of the question)
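A minimal Python sketch of this selection strategy (illustrative, not from the tutorial; the window size and coverage threshold are arbitrary choices):

def select_passages(document_tokens, keywords, obligatory, size=200, min_coverage=0.5):
    # score fixed-size windows by the fraction of query keywords they contain,
    # skipping windows that miss the obligatory keyword (e.g. the question focus)
    keywords = {k.lower() for k in keywords}
    scored = []
    for start in range(0, len(document_tokens), size):
        window = {t.lower() for t in document_tokens[start:start + size]}
        if obligatory.lower() not in window:
            continue
        coverage = len(keywords & window) / len(keywords)
        if coverage >= min_coverage:
            scored.append((coverage, start))
    return sorted(scored, reverse=True)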

Page 66: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 66

Candidate Answer Document Analysis

- Passage text tagging
- Named Entity Recognition

Who is the author of the “Star Spangled Banner”?

…<PERSON>Francis Scott Key</PERSON> wrote the “Star Spangled Banner” in <DATE>1814</DATE>

Some systems go further: passage parsing (Harabagiu, 2001), logical forms (Zajac, 2001)

Page 67: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 67

Answer Extraction (1)

Who is the author of the “Star Spangled Banner”?

…<PERSON>Francis Scott Key </PERSON> wrote the “Star Spangled Banner” in <DATE>1814</DATE>

Answer Type = PERSON

Candidate Answer = Francis Scott Key

Ranking candidate answers: keyword density in the passage, apply additional constraints (e.g. syntax, semantics), rank candidates using the Web
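A minimal Python sketch of the keyword-density ranking (illustrative, not from the tutorial; the window size is an arbitrary choice, and the candidates are assumed to have already been filtered by the expected answer type):

def rank_by_keyword_density(passage_tokens, candidates, keywords, window=10):
    # candidates: list of (candidate_string, token_position) pairs of the expected type
    keywords = {k.lower() for k in keywords}
    scored = []
    for candidate, position in candidates:
        nearby = {t.lower() for t in passage_tokens[max(0, position - window):position + window]}
        scored.append((len(keywords & nearby), candidate))
    return [candidate for score, candidate in sorted(scored, reverse=True)]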

Page 68: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 68

Answer Identification

(Example passage with the candidate answer "Thomas E. Edison" identified in the text.)

Page 69: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 69

IV. Answer Validation

Automatic answer validation
Approach:
- web-based
- use of patterns
- combination of statistical and linguistic information
Discussion
Conclusions

Page 70: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 70

QA Architecture

(Architecture:

QUESTION → Question Processing Component (TOKENIZATION & POS TAGGING, QUESTION PARSING, ANSWER TYPE IDENTIFICATION, WORD SENSE DISAMBIGUATION, KEYWORDS EXPANSION, QUERY COMPOSITION)

→ Search Component (SEARCH ENGINE over the document collection, PARAGRAPH FILTERING)

→ Answer Extraction Component (NAMED ENTITIES RECOGNITION, ANSWER IDENTIFICATION, ANSWER RANKING)

→ ANSWER)

Page 71: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 71

The problem: Answer Validation

Given a question q and a candidate answer a, decide if a is a correct answer for q

What is the capital of the USA?

Washington D.C.

San Francisco

Rome

Page 72: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 72

The problem: Answer Validation

Given a question q and a candidate answer a, decide if a is a correct answer for q

What is the capital of the USA?

Washington D.C. correct

San Francisco wrong

Rome wrong

Page 73: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 73

Requirements for Automatic AV

Accuracy: it has to compare well with human judgments

Efficiency: large scale (the Web), real-time scenarios

Simplicity: avoid the complexity of full QA systems

Page 74: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 74

Approach

Web-based:
- take advantage of Web redundancy

Pattern-based:
- the Web is mined using patterns (i.e. validation patterns) extracted from the question and the candidate answer

Quantitative (as opposed to content-based):
- check whether the question and the answer tend to appear together on the Web, considering only the number of documents returned (i.e. documents are not downloaded)

Page 75: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 75

Web Redundancy

What is the capital of the USA?

Capital Region USA: Fly-Drive Holidays in and Around Washington D.C.

the Insider’s Guide to the Capital Area Music Scene (Washington D.C., USA).

The Capital Tangueros (Washington DC Area, USA)

I live in the Nation's Capital, Washington Metropolitan Area (USA)

In 1790 Capital (also USA’s capital): Washington D.C. Area: 179 square km

Washington

Page 76: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 76

Validation Pattern

Capital Region USA: Fly-Drive Holidays in and Around Washington D.C.

the Insider’s Guide to the Capital Area Music Scene (Washington D.C., USA).

The Capital Tangueros (Washington DC Area, USA)

I live in the Nation's Capital, Washington Metropolitan Area (USA)

In 1790 Capital (also USA’s capital): Washington D.C. Area: 179 square km

[Capital NEAR USA NEAR Washington]

Page 77: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 77

Related Work

Pattern-based QA:
- Brill, 2001 – TREC-10
- Soubbotin, 2001 – TREC-10
- Ravichandran and Hovy, ACL-02

Use of the Web for QA:
- Clarke et al., 2001 – TREC-10
- Radev et al., 2001 – CIKM

Statistical approaches on the Web:
- PMI-IR: Turney, 2001 and ACL-02

Page 78: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 78

Architecture

(Flow: the question and the candidate answer are combined into a validation pattern; a filtering step discards candidates whose pattern returns too few documents (#doc < k); otherwise an answer validity score is computed from the document counts and compared with a threshold t: above t the answer is judged correct, below t wrong.)

Page 79: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 79

Architecture

question candidate answer

validation pattern

answer validity score

correct answer wrong answer

> t < t

#doc

filtering

#doc < k

Page 80: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 80

Extracting Validation Patterns

(Flow: the question goes through a stop-word filter and term expansion to yield the question pattern (Qp); the candidate answer goes through named entity recognition, driven by the expected answer type, and a stop-word filter to yield the answer pattern (Ap); Qp and Ap together form the validation pattern.)

Page 81: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 81

Architecture

(Same architecture diagram as slide 78.)

Page 82: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 82

Answer Validity Score

PMI-IR algorithm (Turney, 2001):

PMI(Qp, Ap) = P(Qp, Ap) / (P(Qp) * P(Ap))

A high value is interpreted as evidence that the validation pattern is consistent, which in turn suggests that the answer is correct.

Page 83: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 83

Answer Validity Score

PMI(Qp, Ap) = hits(Qp NEAR Ap) / (hits(Qp) * hits(Ap))

Three searches are submitted to the Web:
- hits(Qp)
- hits(Ap)
- hits(Qp NEAR Ap)
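A minimal Python sketch of the score and of the relative-threshold decision used in the example that follows (illustrative, not from the tutorial; the hit counts would come from a search engine supporting a NEAR operator, and 3 * 10^8 is the corpus-size constant used on the next slides):

def pmi_score(hits_q, hits_a, hits_q_near_a, corpus_size=3e8):
    # PMI(Qp, Ap) = P(Qp, Ap) / (P(Qp) * P(Ap)), probabilities estimated from hit counts
    if not hits_q or not hits_a:
        return 0.0
    return (hits_q_near_a / corpus_size) / ((hits_q / corpus_size) * (hits_a / corpus_size))

def validate(candidates):
    # candidates: {answer: (hits_q, hits_a, hits_q_near_a)} for one question
    scores = {answer: pmi_score(*hits) for answer, hits in candidates.items()}
    threshold = 0.2 * max(scores.values())           # relative threshold t = 0.2 * MAX(AVS)
    return {answer: ("correct" if s >= threshold else "wrong") for answer, s in scores.items()}

# Modesto example from the following slides
print(validate({"Stanislaus": (909, 73641, 552), "San Francisco": (909, 4072519, 11)}))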

Page 84: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 84

Example

What county is Modesto, California in?

Answer type: Location

Qp = [county NEAR Modesto NEAR California]

P(Qp) = P(county, Modesto, California) = 909 / (3 * 10^8)

Candidate answers:
A1 = "The Stanislaus County district attorney's"
A2 = "In Modesto, San Francisco, and"

(Qp is built after a stop-word filter.)

Page 85: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 85

Example (cont.)

"The Stanislaus County district attorney's"  →  A1p = [Stanislaus]
"In Modesto, San Francisco, and"             →  A2p = [San Francisco]
(named entity recognition for the answer type LOCATION)

P(Stanislaus) = 73641 / (3 * 10^8)
P(San Francisco) = 4072519 / (3 * 10^8)

Page 86: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 86

Example (cont.)

"The Stanislaus County district attorney's":
P(Qp, A1p) = 552 / (3 * 10^8)   →   PMI(Qp, A1p) = 2473   →   correct answer (> t)

"In Modesto, San Francisco, and":
P(Qp, A2p) = 11 / (3 * 10^8)    →   PMI(Qp, A2p) = 0.89   →   wrong answer (< t)

with the relative threshold t = 0.2 * MAX(AVS)

Page 87: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 87

Experiments

Data set: 492 TREC-2001 questions, with 2726 answers (3 correct and 3 wrong answers per question), randomly selected from the human-judged pool of TREC-10 participants.

Search engine: AltaVista, which allows the NEAR operator.

Page 88: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 88

Experiment: Answers

Q-916: What river in the US is known as the Big Muddy?

Candidate answer strings from the TREC-10 runs:
- The Mississippi
- Known as Big Muddy, the Mississippi is the longest
- as Big Muddy, the Mississippi is the longest messed with. Known as Big Muddy, the Mississip
- Mississippi is the longest river in the US
- the Mississippi is the longest river
- (Mississippi) has brought the Mississippi to its lowest
- ipes. In Life on the Mississippi, Mark Twain wrote t
- Southeast; Mississippi; Mark Twain; officials began
- Known; Mississippi; US; Minnesota; Gulf Mexico
- Mud Island,; Mississippi; ”The; --history,; Memphis

Page 89: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 89

Baseline

Consider the documents provided by NIST to TREC-10 participants (1000 documents for each question)

If the candidate answer occurs (i.e. string match) at least one time in the top 10 documents it is judged correct, otherwise it is considered wrong

Page 90: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 90

Asymmetrical Measures

Problem: some candidate answers (e.g. numbers) return an enormous number of Web documents.

The scores for good (a_c) and bad (a_w) answers then tend to be similar, making the choice more difficult:

PMI(q, a_c) ≈ PMI(q, a_w)

How many Great Lakes are there?

… to cross all five Great Lakes completed a 19.2 …

Page 91: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 91

Asymmetric Conditional Probability (ACP)

ACP(Qsp, Asp) = P(Asp | Qsp) / P(Asp)^(2/3) = hits(Qsp NEAR Asp) / (hits(Qsp) * hits(Asp)^(2/3))
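A minimal Python sketch, assuming the reading of the formula reconstructed above (the key point being that the candidate-answer hit count is dampened by the 2/3 exponent so that very frequent strings such as numbers do not swamp the score):

def acp_score(hits_q, hits_a, hits_q_near_a):
    # asymmetric variant of the hit-count score: hits(Asp) is raised to 2/3
    if not hits_q or not hits_a:
        return 0.0
    return hits_q_near_a / (hits_q * hits_a ** (2 / 3))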

Page 92: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 92

Comparing PMI and ACP

(PMI and ACP scores for the right answer "five" and the wrong answer "19.2" to "How many Great Lakes are there?")

ACP increases the difference between the right and the wrong answer.

Page 93: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 93

Results

SR on all 492 TREC-2001 questions:

                      Baseline   MLHR   PMI    ACP
Absolute threshold    52.9       77.4   77.7   78.4
Relative threshold    52.9       79.6   79.5   81.2

SR on the 249 factoid questions:

                      Baseline   MLHR   PMI    ACP
Absolute threshold    -          82.1   83.3   83.3
Relative threshold    -          84.4   84.9   86.3

Page 94: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 94

Discussion (1)

- Definition questions are the most problematic; on the subset of 249 named-entity questions the success rate is higher (86.3)
- A relative threshold improves performance (+2%) over a fixed threshold
- Non-symmetric measures of co-occurrence work better for answer validation (+2%)
- Sources of errors: answer type recognition, named entity recognition, the TREC answer set (e.g. tokenization)

Page 95: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 95

Discussion (2)

Automatic answer validation is a key challenge for Web-based question answering systems.

Requirements:
- accuracy with respect to human judgments: an 80% success rate is a good starting point
- efficiency: documents are not downloaded
- simplicity: based on patterns

At present, the approach is suitable for a generate-and-test component integrated in a QA system.

Page 96: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 96

V. Cross-Language QA

Motivations
QA@CLEF
Performances
Approaches

Page 97: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 97

Motivations

Answers may be found in languages different from the language of the question.

Interest in QA systems for languages other than English.

Force the QA community to design real multilingual systems.

Check/improve the portability of the technologies implemented in current English QA systems.

Page 98: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 98

Cross-Language QA

Quanto è alto il Mont Ventoux?
(How tall is Mont Ventoux?)

Searching the Italian, English, Spanish and French corpora, the answer is found in the French corpus:

“Le Mont Ventoux, impérial avec ses 1909 mètres et sa tour blanche telle un étendard, règne de toutes …”

Answer: 1909 metri

Page 99: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 99

CL-QA at CLEF

- Adopt the same rules used at TREC QA: factoid questions (i.e. no definition questions), exact answers + document id
- Use the CLEF corpora (news, 1994-1995)
- Return the answer in the language of the text collection in which it has been found (i.e. no translation of the answer)
- QA-CLEF-2003 was an initial step toward the more complex tasks organized at CLEF-2004 and 2005

Page 100: Open Domain Question Answering: Techniques, Resources and Systems

100RANLP 2005 - Bernardo Magnini

QA @ CLEF 2004 (http://clef-qa.itc.it/2004)

Seven groups coordinated the QA track:

- ITC-irst (IT and EN test set preparation)

- DFKI (DE)

- ELDA/ELRA (FR)

- Linguateca (PT)

- UNED (ES)

- U. Amsterdam (NL)

- U. Limerick (EN assessment)

Two more groups participated in the test set construction:

- Bulgarian Academy of Sciences (BG)

- U. Helsinki (FI)

Page 101: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 101

CLEF QA - Overview

(Workflow: question generation by each group — 100 monolingual Q&A pairs with an EN translation (≈2.5 person-months per group); translation of the EN questions into the 7 languages; 700 Q&A pairs in 1 language + EN; selection of additional 80 + 20 questions; the Multieight-04 XML collection of 700 Q&A pairs in 8 languages; extraction of plain-text test sets; experiments against the document collections in a 1-week window (Exercise, 10-23/5); manual assessment of the systems' answers (≈2 person-days per run).)

Page 102: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 102

CLEF QA – Task Definition

Given 200 questions in a source language, find one exact answer per question in a collection of documents written in a target language, and provide a justification for each retrieved answer (i.e. the docid of the unique document that supports the answer).

(Matrix of source languages BG, DE, EN, ES, FI, FR, IT, NL, PT against target languages DE, EN, ES, FR, IT, NL, PT: 6 monolingual and 50 bilingual tasks. Teams participated in 19 tasks.)

Page 103: Open Domain Question Answering: Techniques, Resources and Systems

103RANLP 2005 - Bernardo Magnini

CLEF QA - Questions

All the test sets were made up of 200 questions:
- ~90% factoid questions
- ~10% definition questions
- ~10% of the questions did not have any answer in the corpora (the right answer-string was “NIL”)

Problems in introducing definition questions:
- What is the right answer? (it depends on the user model)
- What is the easiest and most efficient way to assess their answers?
- Overlap with factoid questions:
  F  Who is the Pope?
  D  Who is John Paul II?
  (possible answers: the Pope / John Paul II / the head of the Roman Catholic Church)

Page 104: Open Domain Question Answering: Techniques, Resources and Systems

104RANLP 2005 - Bernardo Magnini

CLEF QA – Multieight

<q cnt="0675" category="F" answer_type="MANNER">
  <language val="BG" original="FALSE">
    <question group="BTB">Как умира Пазолини?</question>
    <answer n="1" docid="">TRANSLATION[убит]</answer>
  </language>
  <language val="DE" original="FALSE">
    <question group="DFKI">Auf welche Art starb Pasolini?</question>
    <answer n="1" docid="">TRANSLATION[ermordet]</answer>
    <answer n="2" docid="SDA.951005.0154">ermordet</answer>
  </language>
  <language val="EN" original="FALSE">
    <question group="LING">How did Pasolini die?</question>
    <answer n="1" docid="">TRANSLATION[murdered]</answer>
    <answer n="2" docid="LA112794-0003">murdered</answer>
  </language>
  <language val="ES" original="FALSE">
    <question group="UNED">¿Cómo murió Pasolini?</question>
    <answer n="1" docid="">TRANSLATION[Asesinado]</answer>
    <answer n="2" docid="EFE19950724-14869">Brutalmente asesinado en los arrabales de Ostia</answer>
  </language>
  <language val="FR" original="FALSE">
    <question group="ELDA">Comment est mort Pasolini ?</question>
    <answer n="1" docid="">TRANSLATION[assassiné]</answer>
    <answer n="2" docid="ATS.951101.0082">assassiné</answer>
    <answer n="3" docid="ATS.950904.0066">assassiné en novembre 1975 dans des circonstances mystérieuses</answer>
    <answer n="4" docid="ATS.951031.0099">assassiné il y a 20 ans</answer>
  </language>
  <language val="IT" original="FALSE">
    <question group="IRST">Come è morto Pasolini?</question>
    <answer n="1" docid="">TRANSLATION[assassinato]</answer>
    <answer n="2" docid="AGZ.951102.0145">massacrato e abbandonato sulla spiaggia di Ostia</answer>
  </language>
  <language val="NL" original="FALSE">
    <question group="UoA">Hoe stierf Pasolini?</question>
    <answer n="1" docid="">TRANSLATION[vermoord]</answer>
    <answer n="2" docid="NH19951102-0080">vermoord</answer>
  </language>
  <language val="PT" original="TRUE">
    <question group="LING">Como morreu Pasolini?</question>
    <answer n="1" docid="LING-951120-088">assassinado</answer>
  </language>
</q>

Page 105: Open Domain Question Answering: Techniques, Resources and Systems

105RANLP 2005 - Bernardo Magnini

CLEF QA - Assessment

Judgments taken from the TREC QA tracks:
- Right
- Wrong
- ineXact
- Unsupported

Other criteria, such as the length of the answer-strings (instead of X, which is underspecified) or the usefulness of responses for a potential user, have not been considered.

The main evaluation measure was accuracy (the fraction of Right responses).

Whenever possible, a Confidence-Weighted Score was calculated:

CWS = (1/Q) * Sum for i = 1 to Q of (number of correct responses in the first i ranks) / i

Page 106: Open Domain Question Answering: Techniques, Resources and Systems

106RANLP 2005 - Bernardo Magnini

Evaluation Exercise - Participants

Distribution of participating groups in different QA evaluation campaigns:

                   America   Europe   Asia   Australia   TOTAL   submitted runs
TREC-8             13        3        3      1           20      46
TREC-9             14        7        6      -           27      75
TREC-10            19        8        8      -           35      67
TREC-11            16        10       6      -           32      67
TREC-12            13        8        4      -           25      54
NTCIR-3 (QAC-1)    1         -        15     -           16      36
CLEF 2003          3         5        -      -           8       17
CLEF 2004          1         17       -      -           18      48

Page 107: Open Domain Question Answering: Techniques, Resources and Systems

107RANLP 2005 - Bernardo Magnini

Evaluation Exercise - Participants

Number of participating teams - number of submitted runs at CLEF 2004 (rows: source language; columns: target language among DE, EN, ES, FR, IT, NL, PT). Comparability issue.

BG: 1-1, 1-2
DE: 2-2, 2-3, 1-2
EN: 1-2, 1-1
ES: 5-8, 1-2
FI: 1-1
FR: 3-6, 1-2
IT: 1-2, 1-2, 2-3
NL: 1-2, 1-2
PT: 1-2, 2-3

Page 108: Open Domain Question Answering: Techniques, Resources and Systems

108RANLP 2005 - Bernardo Magnini

Evaluation Exercise - Results

Systems' performance at the TREC and CLEF QA tracks (accuracy, %):

                 TREC-8  TREC-9  TREC-10  TREC-11  TREC-12*  CLEF-2003** monol.  CLEF-2003** bil.  CLEF-2004 monol.  CLEF-2004 bil.
best system      70      65      67       83       70        41.5                35                45.5              35
average          25      24      23       22       21.4      29                  17                23.7              14.7

* considering only the 413 factoid questions
** considering only the answers returned at the first rank

Page 109: Open Domain Question Answering: Techniques, Resources and Systems

109RANLP 2005 - Bernardo Magnini

Evaluation Exercise – CL Approaches

(Pipeline: question analysis / keyword extraction on the INPUT (source language) → candidate document selection → candidate document analysis → answer extraction → OUTPUT (target language), over a preprocessed document collection. Cross-linguality is handled either by translating the question into the target language or by translating the retrieved data; the groups involved were U. Amsterdam, U. Edinburgh, U. Neuchatel, the Bulgarian Academy of Sciences, ITC-Irst, U. Limerick, U. Helsinki, DFKI and LIMSI-CNRS.)

Page 110: Open Domain Question Answering: Techniques, Resources and Systems

110RANLP 2005 - Bernardo Magnini

Discussion on Cross-Language QA

CLEF multilingual QA track (like TREC QA) represents a formal evaluation, designed with an eye to replicability. As an exercise, it is an abstraction of the real problems.

Future challenges:

• investigate QA in combination with other applications (for instance summarization)

• access not only free text, but also different sources of data (multimedia, spoken language, imagery)

• introduce automated evaluation along with judgments given by humans

• focus on user’s need: develop real-time interactive systems, which means modeling a potential user and defining suitable answer types.

Page 111: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 111

References

Books
- Pasca, Marius. Open-Domain Question Answering from Large Text Collections. CSLI, 2003.
- Maybury, Mark (Ed.). New Directions in Question Answering. AAAI Press, 2004.

Journals
- Hirschman, L. and Gaizauskas, R. Natural Language Question Answering: The View from Here. JNLE, 7 (4), 2001.

TREC
- E. Voorhees. Overview of the TREC 2001 Question Answering Track.
- M.M. Soubbotin, S.M. Soubbotin. Patterns of Potential Answer Expressions as Clues to the Right Answers.
- S. Harabagiu, D. Moldovan, M. Pasca, M. Surdeanu, R. Mihalcea, R. Girju, V. Rus, F. Lacatusu, P. Morarescu, R. Bunescu. Answering Complex, List and Context Questions with LCC's Question-Answering Server.
- C.L.A. Clarke, G.V. Cormack, T.R. Lynam, C.M. Li, G.L. McLearn. Web Reinforced Question Answering (MultiText Experiments for TREC 2001).
- E. Brill, J. Lin, M. Banko, S. Dumais, A. Ng. Data-Intensive Question Answering.

Page 112: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 112

References

Workshop Proceedings

H. Chen and C.-Y. Lin, editors. 2002. Proceedings of the Workshop on Multilingual Summarization and Question Answering at COLING-02, Taipei, Taiwan.

M. de Rijke and B. Webber, editors. 2003. Proceedings of the Workshop on Natural Language Processing for Question Answering at EACL-03, Budapest, Hungary.

R. Gaizauskas, M. Hepple, and M. Greenwood, editors. 2004. Proceedings of the Workshop on Information Retrieval for Question Answering at SIGIR-04, Sheffield, United Kingdom.

Page 113: Open Domain Question Answering: Techniques, Resources and Systems

RANLP 2005 - Bernardo Magnini 113

References

N. Kando and H. Ishikawa, editors. 2004. Working Notes of the 4th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, Question Answering and Summarization (NTCIR-04), Tokyo, Japan.

M. Maybury, editor. 2003. Proceedings of the AAAI Spring Symposium on New Directions in Question Answering, Stanford, California.

C. Peters and F. Borri, editors. 2004. Working Notes of the 5th Cross-Language Evaluation Forum (CLEF-04), Bath, United Kingdom.

J. Pustejovsky, editor. 2002. Final Report of the Workshop on TERQAS: Time and Event Recognition in Question Answering Systems, Bedford, Massachusetts.

Y. Ravin, J. Prager and S. Harabagiu, editors. 2001. Proceedings of the Workshop on Open-Domain Question Answering at ACL-01, Toulouse, France.