Multi-Perspective Question Answering Using the OpQA Corpus

Veselin Stoyanov

Claire Cardie

Janyce Wiebe

Cornell University, University of Pittsburgh

10/08/05 HLT/EMNLP 2005. 2

Multi-Perspective Question Answering

• Fact-based question answering (QA):
    – When is the first day of spring?
    – Do Lipton employees take coffee breaks?
• vs. multi-perspective question answering (MPQA):
    – How does the US regard the terrorist attacks in Iraq?
    – Is Derek Jeter a bum?

Talk Outline

• Properties of Opinion vs. Fact answers
    – OpQA corpus
    – Traditional fact-based QA systems
    – Different properties of opinion questions
• Using fine-grained opinion information for MPQA
    – Annotation framework and automatic classifiers
    – QA experiments

Opinion Question & Answer (OpQA) Corpus

• 98 documents manually tagged for opinions (from the NRRC MPQA corpus [Wilson and Wiebe 2003])

• 30 questions
    – 15 fact
    – 15 opinion

[Stoyanov, Cardie, Litman, and Wiebe 2004]

OpQA corpus: Answer Annotations

• Two annotators
• Include every text segment contributing to an answer
    – Partial answers:
        • When was the Kyoto protocol ratified?
            – … before May 2003 …
        • Are the Japanese unanimous in their support of Koizumi?
            – … most Japanese support their prime minister …
• Minimum spans

Traditional Fact-based QA systems

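[Figure: pipeline of a traditional fact-based QA system. Questions and Documents (document fragments) feed an IR subsystem, which returns a ranked list of fragments (1. Frag 324, 2. Frag 111, 3. Frag 431, 4. Frag 213). Linguistic filters (syntactic filters and semantic filters) then reduce the list to the final guesses (1. Frag 324, 2. Frag 213, 3. …).]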

Characteristics of Opinion vs. Fact Answers

• Answer length
    – Syntactic and semantic class
    – Additional processing difficulties
• Partial answers
    – Answer generator

           Number of answers   Length (tokens)   Number of partials
Fact       124                 5.12              12 (9.68%)
Opinion    415                 9.24              154 (37.11%)
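The percentages in the last column follow directly from the two counts in each row. A minimal Python sketch of that arithmetic is shown below; the dictionary layout and the function name `partial_share` are illustrative assumptions, and only the counts come from the slide.

```python
# Counts copied from the table above; the data layout is illustrative only.
answer_counts = {
    "Fact": {"answers": 124, "partials": 12},
    "Opinion": {"answers": 415, "partials": 154},
}


def partial_share(answers: int, partials: int) -> float:
    """Return the percentage of answers that are partial, rounded to 2 places."""
    return round(100.0 * partials / answers, 2)


shares = {
    kind: partial_share(c["answers"], c["partials"])
    for kind, c in answer_counts.items()
}

for kind, share in shares.items():
    print(f"{kind}: {share}% of answers are partial")
```

Roughly one in ten fact answers is partial, compared with more than one in three opinion answers, which is the contrast the slide draws on.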

Fine-grained Opinion Information for MPQA

• Recent interest in the area of automatic opinion information extraction.

– E.g. [Bethard, Yu, Thornton, Hatzivassiloglou, and Jurafsky 2004], [Pang and Lee 2004], [Riloff and Wiebe 2003], [Wiebe and Riloff 2005], [Wilson, Wiebe, and Hwa 2004], [Yu and Hatzivassiloglou 2003]

• In our evaluation:
    – Opinion annotation framework
    – Sentence-level automatic opinion classifiers
    – Subjectivity filters
    – Source filter

Opinion Annotation Framework

• Described in [Wiebe, Wilson, and Cardie 2002]
• Accounts for both:
    – Explicitly stated opinions
        • Joe believes that Sue dislikes the Red Sox.
    – Indirectly expressed opinions
        • The aim of the report is to tarnish China’s image.
• Attributes include strength and source.
• Manual sentence-level classification
    – A sentence is subjective if it contains one or more opinions of strength >= medium
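The sentence-level rule can be sketched in a few lines of Python. This is an illustration only: the slides name just the "medium" threshold, so the three-level `Strength` scale and the function name `is_subjective` are assumptions, not part of the annotation framework as described here.

```python
from enum import IntEnum


class Strength(IntEnum):
    # Assumed ordinal scale; the slides only name the "medium" level.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def is_subjective(opinion_strengths, threshold=Strength.MEDIUM):
    """A sentence is subjective if any of its opinions reaches the threshold."""
    return any(strength >= threshold for strength in opinion_strengths)


# One weak and one strong opinion: subjective.
print(is_subjective([Strength.LOW, Strength.HIGH]))
# Only a weak opinion, or no opinion at all: not subjective.
print(is_subjective([Strength.LOW]))
print(is_subjective([]))
```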

Automatic Opinion Classifiers

• Two sentence-level opinion classifiers from Wiebe and Riloff [2005] used for evaluation
• Both classifiers use unannotated data
    – Rulebased: extraction patterns bootstrapped using word lists
    – NaiveBayes: trained on data obtained from Rulebased

             Precision   Recall   F
Rulebased    90.4        34.2     46.6
NaiveBayes   79.4        70.6     74.7

Subjectivity Filters

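[Figure: subjectivity-filter pipeline. Opinion Questions and Document Sentences feed the IR subsystem, which returns a ranked list of sentences (1. Sent 324, 2. Sent 111, 3. Sent 431, 4. Sent 213). The subjectivity filters label each sentence as opinion (o) or fact (f): 1. Sent 324 (o), 2. Sent 111 (f), 3. Sent 431 (f), 4. Sent 213 (o). The opinion sentences become the guesses (1. Sent 324, 2. Sent 213, 3. …). Filter variants compared: Baseline, Manual, Rulebased, NaiveBayes.]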
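The filtering step in the figure can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `subjectivity_filter` and the `labels` lookup are hypothetical stand-ins for whichever classifier (Manual, Rulebased, or NaiveBayes) supplies the opinion/fact decision; only the sentence ids and labels come from the slide.

```python
def subjectivity_filter(ranked_sentences, is_opinion):
    """Keep only sentences judged to be opinions, preserving the IR ranking."""
    return [sent for sent in ranked_sentences if is_opinion(sent)]


# Ranked output of the IR subsystem, using the sentence ids from the slide.
ranked = [324, 111, 431, 213]

# Opinion (o) / fact (f) labels from the slide; a real system would call a classifier.
labels = {324: "o", 111: "f", 431: "f", 213: "o"}

guesses = subjectivity_filter(ranked, lambda sent: labels[sent] == "o")
print(guesses)  # [324, 213]
```

Because the filter only removes items, the relative order produced by the IR subsystem is unchanged, which matches the guesses shown in the figure.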

Subjectivity Filters Cont’d

• Look for the rank of the first guess containing an answer
    – Example ranking: 1. Sent 324, 2. Sent 213, 3. Sent 007 (ans), 4. Sent 212, 5. Sent 211 (ans), 6. …
• Compute:
    – Mean Reciprocal Rank (MRR) across the top 5 answers
        • MRR = mean over all questions of (1 / Rank_of_first_answer)
    – Mean Rank of the First Answer (MRFA)
        • MRFA = mean over all questions of (Rank_of_first_answer)
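Both measures can be sketched in Python as below. The function names are hypothetical, and two details are assumptions the slides do not spell out: a question whose first answer falls outside the top 5 contributes 0 to MRR, and questions with no answer in the ranking are skipped when computing MRFA.

```python
from statistics import mean


def first_answer_rank(ranked_ids, answer_ids):
    """Return the 1-based rank of the first guess containing an answer, or None."""
    for rank, sent_id in enumerate(ranked_ids, start=1):
        if sent_id in answer_ids:
            return rank
    return None


def mrr(first_ranks, cutoff=5):
    """Mean reciprocal rank; ranks beyond the cutoff score 0 (assumed convention)."""
    scores = [1.0 / r if r is not None and r <= cutoff else 0.0 for r in first_ranks]
    return mean(scores)


def mrfa(first_ranks):
    """Mean rank of the first answer; unanswered questions are skipped (assumption)."""
    return mean([r for r in first_ranks if r is not None])


# Question 1 is the ranking from the slide: the first answer appears at rank 3.
q1 = first_answer_rank(
    ["Sent 324", "Sent 213", "Sent 007", "Sent 212", "Sent 211"],
    {"Sent 007", "Sent 211"},
)
# Question 2 is a hypothetical second question answered at rank 1.
q2 = first_answer_rank(["Sent 10", "Sent 11"], {"Sent 10"})

print(q1, q2)
print(mrr([q1, q2]), mrfa([q1, q2]))
```

MRR rewards answers near the top of the list and is bounded by 1, while MRFA is unbounded and is therefore more sensitive to questions whose first answer sits very deep in the ranking.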

Subjectivity Filters Results

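[Figure: two bar charts comparing the Baseline, Manual, NaiveBayes, and Rulebased filters. The MRFA chart (axis 0 to 70) carries the bar labels 61.33, 40.33, 43.73, and 26.2. The MRR chart (axis 0.44 to 0.6) carries the bar labels 0.4911, 0.5189, 0.5078, and 0.5856; a further value, 0.4214, appears next to the Rulebased legend entry.]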

Source Filter

• Manually identify the sources in the opinion questions
    – Does France approve of the war in Iraq?
• Retain only sentences that contain opinions with sources matching sources in the question
    – France has voiced some concerns with the situation.
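A minimal sketch of the source filter is given below. It assumes each candidate sentence already carries the sources of its opinions (from the manual annotations) and that matching is a case-insensitive string comparison; the function name `source_filter`, the `opinion_sources` field, and the second example sentence are hypothetical.

```python
def source_filter(ranked_sentences, question_sources):
    """Keep sentences with at least one opinion whose source matches the question."""
    wanted = {source.lower() for source in question_sources}
    return [
        sent
        for sent in ranked_sentences
        if any(src.lower() in wanted for src in sent["opinion_sources"])
    ]


# Question: "Does France approve of the war in Iraq?" -> manually identified source.
question_sources = ["France"]

candidates = [
    # Hypothetical sentence whose opinion comes from a different source.
    {"text": "Germany criticized the decision.", "opinion_sources": ["Germany"]},
    # Example sentence from the slide.
    {
        "text": "France has voiced some concerns with the situation.",
        "opinion_sources": ["France"],
    },
]

kept = source_filter(candidates, question_sources)
print([sent["text"] for sent in kept])
```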

Source Filter Results

• Performs well on the hardest questions in the corpus

• All questions answered within the first 25 sentences with one exception.

           MRR      MRFA
Baseline   0.4911   61.33
Source     0.4633   11.27

Summary

• Properties of opinion vs. fact answers
    – Traditional architectures unlikely to be effective
• Use of fine-grained opinion information for MPQA
    – MPQA can benefit from fine-grained perspective information

Future Work

• Create summaries of all opinions in a document using fine-grained opinion information

• Methods used will be directly applicable to MPQA

Thank you.

Questions?

• Did something surprising happen when Chavez regained power in Venezuela after he was removed by a coup?

• What did South Africa want Mugabe to do after the 2002 elections?

• What’s Mugabe’s opinion about the West’s attitude and actions towards the 2002 Zimbabwe election?

Characteristics of Fact vs. Opinion Answers Cont’d

• Syntactic constituent of the answers

                                     Fact   Opinion
Answers in best matching category    31%    16%
Syntactic type of best match
    Verb Phrase                      0      2
    Noun Phrase                      9      4
    Clause                           2      6

• All improvements are significant according to the Wilcoxon Matched-Pairs Signed-Ranks Test (p <= 0.05), except for the source filter (p = 0.81)