Finding What Matters in Questions Xiaoqiang Luo, Hema Raghavan, Vittorio Castelli, Sameer Maskey and Radu Florian IBM T.J. Watson Research Center NAACL-HLT 2013 1

Page 1: Finding What Matters  in Questions

NAACL-HLT 2013 1

Finding What Matters in Questions

Xiaoqiang Luo, Hema Raghavan, Vittorio Castelli, Sameer Maskey and Radu Florian

IBM T.J. Watson Research Center

Page 2: Finding What Matters  in Questions

Introduction

• E.g.: “How does one apply for a New York day care license?”
• Top-scoring result under a bag-of-words model:
  - “New licenses for day care centers in York county, PA”
• MMP model:
  - Search with the three phrases “New York,” “day care,” and “license”
• We call these important phrases mandatory matching phrases (MMPs)
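The contrast in this example can be reproduced with a toy script. This is only an illustrative sketch, not the system from the slides: the tokenizer and the helper names `bag_of_words_overlap` and `contains_phrase` are assumptions made for the example.

```python
import re

def tokens(text):
    # Assumed toy tokenizer: lower-case, keep alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())

def bag_of_words_overlap(question, snippet):
    # Number of distinct words shared by the question and the snippet.
    return len(set(tokens(question)) & set(tokens(snippet)))

def contains_phrase(snippet, phrase):
    # True only if the phrase occurs as a contiguous token sequence.
    s, p = tokens(snippet), tokens(phrase)
    return any(s[i:i + len(p)] == p for i in range(len(s) - len(p) + 1))

question = "How does one apply for a New York day care license?"
snippet = "New licenses for day care centers in York county, PA"

print(bag_of_words_overlap(question, snippet))
print([contains_phrase(snippet, m) for m in ("New York", "day care", "license")])
```

The snippet shares several words with the question, yet it does not contain “New York” as a phrase, which is the mismatch the MMP idea is meant to catch.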

Page 3: Finding What Matters  in Questions

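Question Corpus

• Subset of the DARPA BOLT corpus containing forum postings in English.
• Four people selected the questions.
• The following 5 types of questions were not used:
  - Questions that require reasoning or calculation to answer
  - Questions that are unclearly stated or ambiguous
  - Questions that can be split into several questions
  - Multiple choice questions
  - Factoid questions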

Page 4: Finding What Matters  in Questions

Question Corpus

• Two annotators labeled the MMP type (MMP-Must, MMP-Maybe) and the span for each selected question
• E.g.: [Figure: annotated example question]
  - Spans are non-overlapping and contiguous

Page 5: Finding What Matters  in Questions

Generate MMP Training Instances

Page 6: Finding What Matters  in Questions

Generate MMP Training Instances

[Figure: diagram with nodes labeled m and N]

Page 7: Finding What Matters  in Questions

Generate MMP Training Instances

• Output instances:
  - <span, MMP type>
  - E.g.: hedge funds = <(5, 6), +1>
• MMP type:
  - MMP-Must: +1
  - MMP-Skip: -1
  - MMP-Maybe: -1

[Figure: tree over word positions 0 to 9 with depths 0 to 6; nodes marked p or Np. Instances shown: <(4, 4), +1>, <(4, 6), +1>, <(5, 6), +1>, <(7, 9), +1>, <(9, 9), +1>]
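The instance format can be written down as a small data structure. This is a minimal sketch that assumes spans are inclusive word positions, as the examples (4, 4) and (5, 6) suggest; `MMP_LABELS` and `make_instance` are hypothetical names.

```python
# Label mapping as listed on the slide: only MMP-Must is a positive example.
MMP_LABELS = {"MMP-Must": +1, "MMP-Skip": -1, "MMP-Maybe": -1}

def make_instance(start, end, mmp_type):
    # A training instance is <span, label>; the span is an inclusive
    # (start, end) pair of word positions (assumption).
    if start > end:
        raise ValueError("span start must not exceed span end")
    return ((start, end), MMP_LABELS[mmp_type])

print(make_instance(5, 6, "MMP-Must"))  # "hedge funds" in the slide's example
```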

Page 8: Finding What Matters  in Questions

Generate MMP Training Instances

Page 9: Finding What Matters  in Questions

MMP Features

Lexical Features:
• CaseFeatures:
  - Is the first word of an MMP upper-case?
  - Is it all capital letters?
  - Does it contain digits?
  - E.g.: For “(NP American)” in Figure 1, the upper-case feature fires.
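A minimal sketch of the three case tests follows. It assumes that "upper-case first word" means the first word starts with a capital letter; the function and key names are hypothetical.

```python
def case_features(phrase):
    words = phrase.split()
    first = words[0] if words else ""
    return {
        # Does the first word start with an upper-case letter? (assumed reading)
        "first_word_upper": first[:1].isupper(),
        # Are all cased characters in the phrase capital letters?
        "all_caps": phrase.isupper(),
        # Does the phrase contain at least one digit?
        "has_digit": any(ch.isdigit() for ch in phrase),
    }

print(case_features("American"))
```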

Page 10: Finding What Matters  in Questions

MMP Features

Lexical Features:
• CommonQWord:
  - Does the MMP contain question words, including “What,” “When,” “Who,” etc.?
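The CommonQWord test reduces to a set lookup. In this sketch only "what", "when" and "who" come from the slide; the remaining entries of `QUESTION_WORDS` are assumptions standing in for the "etc.".

```python
import string

# "what", "when", "who" are from the slide; the rest are assumed.
QUESTION_WORDS = {"what", "when", "who", "where", "why", "which", "how"}

def common_q_word(phrase):
    # Strip surrounding punctuation and compare case-insensitively.
    words = (w.strip(string.punctuation).lower() for w in phrase.split())
    return any(w in QUESTION_WORDS for w in words)

print(common_q_word("What license"), common_q_word("day care"))
```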

Page 11: Finding What Matters  in Questions

MMP Features

Syntactic Features:
• PhraseLabel:
  - This feature returns the phrasal label of the MMP.
  - E.g.: For “(NP American)” in Figure 1, the feature value is “NP.”

Page 12: Finding What Matters  in Questions

MMP Features

Syntactic Features:
• NPUnique:
  - This Boolean feature fires if a phrase is the only NP in a question
  - E.g.: For “(NP American),” the feature value would be false.
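One way to sketch NPUnique is to count NP nodes in a bracketed parse string. The parses below are hypothetical (they are not Figure 1), and the helper names are assumptions.

```python
import re

def node_labels(parse):
    # In a bracketed parse, every "(" is followed by a node label.
    return re.findall(r"\(([^\s()]+)", parse)

def np_unique(parse, phrase_label):
    # Fires only for an NP that is the single NP node in the question.
    return phrase_label == "NP" and node_labels(parse).count("NP") == 1

one_np = "(SBARQ (WHNP (WP Who)) (SQ (VBZ owns) (NP (NNP IBM))))"
two_nps = "(S (NP (DT the) (NN dog)) (VP (VBD bit) (NP (DT the) (NN man))))"
print(np_unique(one_np, "NP"), np_unique(two_nps, "NP"))
```

Note that labels such as WHNP are treated as distinct from NP here, which is a simplification.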

Page 13: Finding What Matters  in Questions

MMP Features

Syntactic Features:
• PosOfPTN:
  - (1) the position of the left-most word of the node
  - (2) whether the left-most word is the beginning of the question
  - (3) the depth of the anchoring node, defined as the length of the path to the root node.

Page 14: Finding What Matters  in Questions

E.g. of PosOfPTN

• E.g.: For “(NP American)” in Figure 1:
  - 5th word in the sentence
  - not the first word of the sentence
  - Depth of the node is 6

[Figure: parse tree over word positions 1 to 10 with depths 0 to 6]
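The three PosOfPTN values can be computed with one tree walk. The sketch below uses a small hypothetical tree (not Figure 1), encoded as `(label, children)` tuples with plain strings as words, and counts positions from 1 as in the example above.

```python
def collect(node, depth=0, start=1, out=None):
    """Record, for every node, its words, left-most word position and depth."""
    out = [] if out is None else out
    label, children = node
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            # The child's left-most word comes right after the words seen so far.
            child_words, _ = collect(child, depth + 1, start + len(words), out)
            words.extend(child_words)
    out.append({"label": label, "words": words, "position": start,
                "is_question_start": start == 1, "depth": depth})
    return words, out

# Hypothetical parse tree used only for illustration.
tree = ("S", [
    ("NP", [("PRP", ["I"])]),
    ("VP", [("VBP", ["like"]),
            ("NP", [("JJ", ["American"]), ("NNS", ["films"])])]),
])
words, records = collect(tree)
american = next(r for r in records if r["words"] == ["American"])
print(american)
```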

Page 15: Finding What Matters  in Questions

MMP Features

Syntactic Features:
• PhrLenToQLenRatio:
  - This feature computes the number of words in an MMP, and its relative ratio to the sentence length.
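As a quick illustration, the two values can be computed as follows. Whitespace tokenization is an assumption, and the function name is hypothetical.

```python
def phr_len_to_q_len_ratio(phrase, question):
    # Word counts by whitespace splitting (assumed tokenization).
    n_phrase = len(phrase.split())
    n_question = len(question.split())
    ratio = n_phrase / n_question if n_question else 0.0
    return n_phrase, ratio

question = "How does one apply for a New York day care license?"
print(phr_len_to_q_len_ratio("day care", question))
```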

Page 16: Finding What Matters  in Questions

MMP Features

Semantic Features (NETypes):
• The feature tests if a phrase is or contains a named entity, and, if this is the case, the value is the entity type.
  - Uses an information extraction (IE) pipeline consisting of syntactic parsing, mention detection and coreference resolution (Florian et al., 2004; Luo et al., 2004; Luo and Zitouni, 2005)
• E.g.: For “(NP American)” in Figure 1, the feature value would be “GPE.”

Page 17: Finding What Matters  in Questions

MMP Features

Corpus-based Features (AvgCorpusIDF):
• This group of features computes the average of the IDFs of the words in this phrase.
  - Have stop words
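A sketch of the averaging step is shown below. The slides do not give the IDF formula, so the smoothed form log((N + 1) / (df + 1)) is an assumption, as are the toy corpus and the function names.

```python
import math

def build_idf(documents):
    # Document frequency of each lower-cased word in a toy corpus.
    n_docs = len(documents)
    df = {}
    for doc in documents:
        for word in set(doc.lower().split()):
            df[word] = df.get(word, 0) + 1
    # Assumed smoothing, so that unseen words still get a finite IDF.
    return lambda w: math.log((n_docs + 1) / (df.get(w.lower(), 0) + 1))

def avg_corpus_idf(phrase, idf):
    words = phrase.split()
    return sum(idf(w) for w in words) / len(words) if words else 0.0

# Hypothetical corpus used only for illustration.
idf = build_idf(["the day care center", "the license office", "the new york office"])
print(avg_corpus_idf("day care", idf), avg_corpus_idf("the", idf))
```

A word that appears in every document, such as "the" here, gets an IDF of zero, so phrases made of rare words score higher.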

Page 18: Finding What Matters  in Questions

MMP Classification Results

Classifier:
• Logistic regression binary classifier using WEKA.

Data set:

              questions
training set  174
test set      27

Page 19: Finding What Matters  in Questions

Performance of the MMP classifier

Page 20: Finding What Matters  in Questions

Example Questions by MMP Model

Page 21: Finding What Matters  in Questions

Data for Relevance Model

• From the BOLT-IR task (IR, 2012)
• Top snippets returned by the search engine are judged for relevancy by our annotators.

              questions
training set  390 (28915 snippets, 6528 answers)
test set      59

Page 22: Finding What Matters  in Questions

Relevance Prediction

• The relevance model is a conditional distribution P(r | q, s; D)
  - where r is a binary random variable indicating if the candidate snippet s is relevant to the question q.
  - D is the document where the snippet s is found.

Page 23: Finding What Matters  in Questions

Relevance Prediction

Baseline system
• (1) Text Match Features:
  - Cosine scores between the query and the snippet
• (2) Answer Type Features:
  - The top 3 predictions of a statistical classifier trained to predict answer categories were used as features.
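For the text match features, a cosine score between query and snippet can be sketched as below. The slides do not say how terms are weighted, so raw term-frequency vectors and the simple tokenizer are assumptions.

```python
import math
import re
from collections import Counter

def term_counts(text):
    # Assumed representation: raw term frequencies over lower-cased tokens.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_score(query, snippet):
    q, s = term_counts(query), term_counts(snippet)
    dot = sum(q[w] * s[w] for w in q)
    norm = (math.sqrt(sum(c * c for c in q.values()))
            * math.sqrt(sum(c * c for c in s.values())))
    return dot / norm if norm else 0.0

print(cosine_score("New York day care license",
                   "New licenses for day care centers in York county, PA"))
```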

Page 24: Finding What Matters  in Questions

Relevance Prediction

Baseline system
• (3) Mention Match Features:
  - Whether a named entity in the query occurs in the snippet.

Page 25: Finding What Matters  in Questions

Relevance Prediction

Baseline system
• (4) Event Match Features:
  - Use several hand-crafted dictionaries containing terms exclusive to various types of events like “violence”, “legal”, “election”.
  - If both the query and the snippet contain the same event type, the feature takes the value ‘1’.
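The event match test can be illustrated with a few toy dictionaries. The terms in `EVENT_DICTIONARIES` are invented for this sketch (the actual hand-crafted dictionaries are not shown in the slides), and collapsing the result into a single 0/1 feature is an assumption.

```python
import re

# Hypothetical dictionaries; only the event type names come from the slide.
EVENT_DICTIONARIES = {
    "violence": {"attack", "bombing", "riot"},
    "legal": {"trial", "lawsuit", "verdict"},
    "election": {"vote", "ballot", "candidate"},
}

def event_types(text):
    # Event types whose dictionary shares at least one term with the text.
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {etype for etype, terms in EVENT_DICTIONARIES.items() if words & terms}

def event_match(query, snippet):
    # 1 if query and snippet share at least one event type, else 0.
    return 1 if event_types(query) & event_types(snippet) else 0

print(event_match("Who won the vote?", "The candidate conceded on Tuesday"))
```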

Page 26: Finding What Matters  in Questions

Relevance Prediction

Baseline system
• (5) Snippet Statistics:
  - Features such as the snippet length and the position of the snippet in the post were created.

Page 27: Finding What Matters  in Questions

Relevance Prediction

Features Derived from MMP
• HardMatch:
  - Let I(m ∈ s) be a 1 or 0 function indicating if a snippet s contains the MMP m

Page 28: Finding What Matters  in Questions

Relevance Prediction

Features Derived from MMP
• SoftLMMatch:
  - The SoftLMMatch score is a language-model (LM) based score, similar to that used in (Bendersky and Croft, 2008), except that MMPs play the role of concepts.

Page 30: Finding What Matters  in Questions

Relevance Prediction

Features Derived from MMP
• SoftLMMatch:
  - where wi is the ith word in snippet s
  - I(wi = v) is an indicator function, taking value 1 if wi is v and 0 otherwise
  - |V| is the vocabulary size

Page 31: Finding What Matters  in Questions

Relevance Prediction

Features Derived from MMP
• MMPInclScore:
  - where w ∈ m are the words in m
  - I(·) is the indicator function taking value 1 when the argument is true and 0 otherwise
  - [symbol lost in extraction] is a constant threshold
  - l(w, s) is the similarity of word w to the snippet s, defined as:
    l(w, s) = max_{v ∈ s} JW(w, v)
    JW(w, v) = Jaro-Winkler similarity score between words w and v
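The word-to-snippet similarity l(w, s) can be sketched as below. The slides do not say which Jaro-Winkler variant or parameters were used, so the common form (prefix scale 0.1, prefix capped at 4 characters) is an assumption, and lower-casing is added for the example.

```python
def jaro(a, b):
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        # Look for an unused equal character within the match window.
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len(a)):
        if a_hit[i]:
            while not b_hit[k]:
                k += 1
            if a[i] != b[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len(a) + matches / len(b) + (matches - t) / matches) / 3

def jaro_winkler(a, b, scale=0.1):
    score = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):  # common prefix, at most 4 characters
        if x != y:
            break
        prefix += 1
    return score + prefix * scale * (1 - score)

def word_snippet_similarity(word, snippet_words):
    # l(w, s): best Jaro-Winkler score of w against any word v in s.
    return max((jaro_winkler(word.lower(), v.lower()) for v in snippet_words),
               default=0.0)

snippet_words = "New licenses for day care centers in York county PA".split()
print(word_snippet_similarity("license", snippet_words))
```

With a soft match of this kind, "license" is close to "licenses" even though an exact token comparison would miss it.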

Page 32: Finding What Matters  in Questions

Relevance Prediction

Features Derived from MMP
• MMPInclScore:
  - The MMP weighted inclusion score between the question q and snippet s is computed as:

Page 33: Finding What Matters  in Questions

Relevance Prediction

Features Derived from MMP
• MMPRankDep:
  - This feature, RD(q, s), first tests if there exists a matched bilexical dependency between q and s;

Page 34: Finding What Matters  in Questions

Relevance Prediction

Features Derived from MMP
• MMPRankDep:
  - Let m(i) be the ith ranked MMP
  - Let <wh, wd | q> and <uh, ud | s> be bilexical dependencies from q and s, respectively
    wh and uh are the heads
    wd and ud are the dependents

Page 35: Finding What Matters  in Questions

Relevance Prediction

Features Derived from MMP
• MMPRankDep:
  - EQ(w, u)
    EQ(w, u) is true if either w and u are exactly the same, or their morphs are the same, or they head the same entity, or their synsets in WordNet overlap
  - RD(q, s)
    RD(q, s) is true if and only if
    EQ(wh, uh) ∧ EQ(wd, ud) ∧ wh ∈ m(i) ∧ wd ∈ m(j) is true for some <wh, wd | q>, for some <uh, ud | s> and for some i and j.
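The RD(q, s) test can be sketched directly from this definition. In the sketch, EQ is reduced to a case-insensitive exact match (the morph, entity and WordNet checks are omitted), dependencies are given as (head, dependent) pairs, and all example data is hypothetical.

```python
def eq(w, u):
    # Simplified EQ: exact match only; morphs, entities and WordNet are omitted.
    return w.lower() == u.lower()

def rank_dep(q_deps, s_deps, mmps):
    """RD(q, s): some question dependency has head and dependent inside MMPs
    and is matched, via EQ, by some snippet dependency."""
    # "wh in m(i) and wd in m(j) for some i, j" means each word lies in some MMP.
    mmp_words = {w.lower() for mmp in mmps for w in mmp}
    for wh, wd in q_deps:
        if wh.lower() not in mmp_words or wd.lower() not in mmp_words:
            continue
        if any(eq(wh, uh) and eq(wd, ud) for uh, ud in s_deps):
            return True
    return False

# Hypothetical ranked MMPs and dependencies for illustration only.
mmps = [["New", "York"], ["day", "care"], ["license"]]
q_deps = [("license", "care"), ("apply", "license")]
print(rank_dep(q_deps, [("License", "care")], mmps))
print(rank_dep(q_deps, [("licenses", "new")], mmps))
```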

Page 36: Finding What Matters  in Questions

Relevance Prediction

3 snippet classifier models
• noMMP model
  - a system without MMP features;
• IDF-as-MMP model
  - a baseline with each word as an MMP and the word’s IDF as the MMP score.
• MMP model

Page 37: Finding What Matters  in Questions

Relevance Prediction

Performance of the 3 snippet classifier systems

Page 38: Finding What Matters  in Questions

End-to-End System Results

• The question-answering system is used in the 2012 BOLT IR evaluation (IR, 2012)
• There are 499K (Arabic), 449K (Chinese) and 262K (English) threads in each of these languages.
• The Arabic and Chinese posts were first translated into English before being processed.

Page 39: Finding What Matters  in Questions

End-to-End System Results

• Performance

Page 40: Finding What Matters  in Questions

BOLT Evaluation Results

• The BOLT evaluation consists of 146 questions, mostly event- or topic-related

Page 41: Finding What Matters  in Questions

BOLT Evaluation Results

Page 42: Finding What Matters  in Questions

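Conclusions

• The authors present a QA system that uses mandatory matching phrases (MMPs)
• The F-measure for extracting MMPs from questions reaches 88.6%
• Combining the MMP model with the snippet relevance model effectively improves the performance of the snippet relevance model
• The QA system using MMPs was the best-performing system in the 2012 BOLT IR evaluation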