Question Classifier

Jennifer Lee

CSI 5386 Project Presentation

Fall 2008

Question Classification Using Machine Learning Methods

Motivation

• An important step in QA
– Classifies a question by the anticipated semantic type of its answer.
– More challenging than common search tasks.
• Q: What Canadian city has the largest population? Answer type: city

The Ambiguity Problem

• What is bipolar disorder?
• What do bats eat?
• What is the pH scale?
• Hard to categorize those questions into one single class
– Need multiple class labels for a single question.

Why Machine Learning?

• Manually constructing a set of rules to map a question to its type is not efficient.
– Requires the analysis of a large number of questions.
– Mapping questions into fine classes requires the use of lexical items (specific words).
• A learned classifier enables one to define only a small number of “type” features.
• Can be trained on a new taxonomy.

Li and Roth (2002): Learning Question Classifiers

• Uses the SNoW learning architecture.
– Hierarchical classifiers
– 6 coarse classes: ABBREVIATION, ENTITY, DESCRIPTION, HUMAN, LOCATION, NUMERIC VALUE.
– 50 fine classes.
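The two-layer idea can be sketched in a few lines: pick a coarse class first, then choose only among the fine classes under it. This is a minimal illustration, not Li and Roth's SNoW setup. The fine class names, the cue words, and the `keyword_score` function are hypothetical stand-ins for learned classifiers, and only the single best coarse class is passed down.

```python
from typing import Callable, Dict, List, Set, Tuple

# Toy slice of a two-layer taxonomy: coarse class -> a few fine classes.
# The fine class names are illustrative, not the full set of 50.
TAXONOMY: Dict[str, List[str]] = {
    "LOCATION": ["city", "country"],
    "HUMAN": ["individual", "group"],
    "NUMERIC VALUE": ["date", "count"],
}

# Hypothetical cue words standing in for learned weights.
CUE_WORDS: Dict[str, Set[str]] = {
    "LOCATION": {"where", "city", "country"},
    "HUMAN": {"who"},
    "NUMERIC VALUE": {"when", "many", "year"},
    "city": {"city"},
    "country": {"country"},
    "individual": {"who"},
    "group": {"team", "company"},
    "date": {"when", "year"},
    "count": {"many"},
}


def keyword_score(question: str, label: str) -> float:
    """Count how many tokens of the question are cue words for the label."""
    tokens = question.lower().rstrip("?").split()
    return float(sum(tok in CUE_WORDS.get(label, set()) for tok in tokens))


def classify(
    question: str, score: Callable[[str, str], float] = keyword_score
) -> Tuple[str, str]:
    """Choose the best coarse class, then the best fine class under it."""
    coarse = max(TAXONOMY, key=lambda c: score(question, c))
    fine = max(TAXONOMY[coarse], key=lambda f: score(question, f))
    return coarse, fine
```

Restricting the second step to the children of the chosen coarse class keeps the fine decision small, at the cost of propagating any coarse-level mistake.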

Li and Roth (cont)

• UIUC question classification dataset
– 5,500 training questions (from TREC 8 and 9, including 500 rare questions).
– 500 test questions from TREC 10.
• Six primitive feature types:
– Words, POS tags, chunks, named entities, head chunks, and semantically related words
• Semantically related word list for each question
– “away” belongs to the sensor Rel(distance).
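A related-word sensor of the kind mentioned above can be sketched as a lookup: the sensor becomes active when any word from its list occurs in the question. Only the pairing of "away" with Rel(distance) comes from the slide. The other words, the "money" list, and the function name are assumptions made for illustration.

```python
import re
from typing import Dict, List, Set

# Word lists per sensor. "away" -> distance is from the slide;
# every other entry is an assumed example.
RELATED_WORDS: Dict[str, Set[str]] = {
    "distance": {"away", "far", "miles"},
    "money": {"cost", "price", "worth"},
}


def related_word_features(question: str) -> List[str]:
    """Return the names of all Rel(...) sensors that fire on the question."""
    tokens = set(re.findall(r"[a-z]+", question.lower()))
    return sorted(
        f"Rel({name})"
        for name, words in RELATED_WORDS.items()
        if tokens & words
    )
```

For example, `related_word_features("How far away is the moon?")` activates only the distance sensor, while a question with no listed word activates none.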

Zhang and Lee (2003): Question Classification using SVM

• Two kinds of features:
– Bag of words and bag of n-grams.
• SVM with tree kernel
– Uses LIBSVM (Chang and Lin, 2001).
– Takes advantage of the syntactic structures of questions.
– Compared with Nearest Neighbors, Naïve Bayes, Decision Tree, and SNoW.
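The two surface feature sets can be sketched with one function: with `max_n=1` it yields a bag of words, and with a larger `max_n` it adds all contiguous word sequences up to that length. The tokenizer and the function names are assumptions for this sketch, not the preprocessing used by Zhang and Lee.

```python
import re
from collections import Counter
from typing import List


def tokenize(question: str) -> List[str]:
    """Lowercase the question and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", question.lower())


def bag_of_ngrams(question: str, max_n: int = 2) -> Counter:
    """Count every contiguous word sequence of length 1..max_n."""
    tokens = tokenize(question)
    bag: Counter = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            bag[" ".join(tokens[i:i + n])] += 1
    return bag
```

Each key of the returned counter would become one dimension of the sparse feature vector handed to the classifier.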

Zhang and Lee (cont)

• Uses the same dataset as Li and Roth
• Same two-layered question taxonomy
• Same assumption:
– One question resides in only one category.
• Uses automatically constructed features
– No semantically related word list

Huang et al. (2008): QC using Head Words and their Hypernyms

• In contrast to Li and Roth's, a compact feature set was proposed:
– Head word
– Use WordNet to augment the semantic features.
– Adopt Lesk's word sense disambiguation algorithm
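The core of Lesk-style disambiguation is gloss overlap: choose the sense whose definition shares the most words with the question. Below is a sketch of the simplified Lesk variant, not necessarily the adaptation used by Huang et al. The sense identifiers, glosses, and stopword list are made up for illustration and are not taken from WordNet.

```python
import re
from typing import Dict, Optional, Set

STOPWORDS = {"a", "an", "the", "of", "in", "on", "is", "what",
             "that", "to", "or", "and", "for", "by"}

# Hypothetical sense inventory: word -> {sense id: gloss}.
TOY_SENSES: Dict[str, Dict[str, str]] = {
    "capital": {
        "capital.city": "a city that is the seat of government of a country",
        "capital.money": "wealth in the form of money or assets owned by a business",
    }
}


def content_words(text: str) -> Set[str]:
    """Lowercased alphabetic tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def simplified_lesk(
    word: str, context: str, senses: Dict[str, Dict[str, str]] = TOY_SENSES
) -> Optional[str]:
    """Return the sense whose gloss overlaps most with the context."""
    ctx = content_words(context) - {word}
    best, best_overlap = None, -1
    for sense_id, gloss in senses.get(word, {}).items():
        overlap = len(ctx & content_words(gloss))
        if overlap > best_overlap:
            best, best_overlap = sense_id, overlap
    return best
```

In a real pipeline the glosses would come from WordNet, and the chosen sense would determine which hypernyms are added as features.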

Huang et al. (cont)

• Again, uses the same dataset.
• Other features:
– Question wh-word, word n-grams, word shape
• Classifiers:
– Maximum Entropy Model
– Support Vector Machine, also using LIBSVM.
– Obtained higher accuracy (89% and 89.2%).
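The wh-word and word shape features are simple enough to sketch directly. The wh-word inventory, the "rest" fallback, and the five shape names below are assumptions for this illustration, and the wh-word is taken naively from the first token.

```python
WH_WORDS = ("what", "which", "when", "where", "who", "how", "why")


def wh_word(question: str) -> str:
    """Return the leading wh-word, or 'rest' if the question has none."""
    tokens = question.lower().split()
    first = tokens[0] if tokens else ""
    return first if first in WH_WORDS else "rest"


def word_shape(token: str) -> str:
    """Map a token to a coarse shape class based on its characters."""
    if token.isdigit():
        return "digits"
    if token.isalpha():
        if token.islower():
            return "lower"
        if token.isupper():
            return "upper"
        return "mixed"
    return "other"
```

For instance, `word_shape("Canadian")` gives `"mixed"`, which can hint at a proper noun even when the word itself was never seen in training.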

Datasets for the project

• Same dataset as Li and Roth's:
– http://l2r.cs.uiuc.edu/~cogcomp/Data/QA/QC/
• Additional datasets:
– TREC QA: http://trec.nist.gov/data/qa.html
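Reading the labeled questions could look like the sketch below. It assumes each line of the label file starts with a `COARSE:fine` tag followed by a space and the question text. The sample line in the usage note is constructed for illustration and is not quoted from the dataset.

```python
from typing import Tuple


def parse_labeled_question(line: str) -> Tuple[str, str, str]:
    """Split 'COARSE:fine question text' into (coarse, fine, question)."""
    label, _, question = line.strip().partition(" ")
    coarse, _, fine = label.partition(":")
    return coarse, fine, question
```

Calling `parse_labeled_question("LOC:city What Canadian city has the largest population ?")` returns the coarse label, the fine label, and the question as three separate strings.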


Plan for the project

• Experiment with different feature types:
– Head chunks, semantic features for the head chunk, named entities, word n-grams, and word shape features
• Use WordNet to automate the generation of semantic features
– Find hypernyms.
– Apply Lesk's WSD to the head chunk.
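Generating hypernym features amounts to walking up the hypernym chain of the head word and emitting each ancestor as a feature, up to some depth. The sketch below uses a small hand-written table in place of WordNet. The table contents, the depth limit of 6, and the function name are assumptions for illustration.

```python
from typing import Dict, List

# Toy child -> parent hypernym table standing in for WordNet lookups.
TOY_HYPERNYMS: Dict[str, str] = {
    "city": "municipality",
    "municipality": "urban_area",
    "urban_area": "geographical_area",
    "geographical_area": "region",
    "region": "location",
}


def hypernym_chain(
    word: str, max_depth: int = 6, table: Dict[str, str] = TOY_HYPERNYMS
) -> List[str]:
    """Follow parent links from word, returning at most max_depth ancestors."""
    chain: List[str] = []
    seen = {word}
    current = word
    while len(chain) < max_depth and current in table:
        current = table[current]
        if current in seen:  # guard against cycles in a malformed table
            break
        seen.add(current)
        chain.append(current)
    return chain
```

Limiting the depth is a design choice: shallow ancestors stay specific to the head word, while very distant ones are shared by almost every noun and carry little signal.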

Head word sense disambiguation

Resources

• Java interface to WordNet:
– http://wordnet.princeton.edu/links#SQL
• A syntactic parser for extracting the head chunk feature:
– Berkeley parser (Petrov and Klein, 2007).
• Use the Ngram Statistics Package


Resources (cont)

• Named entity recognizer and a relational feature extraction language (FEX):
– http://l2r.cs.uiuc.edu/~cogcomp/software.php
• Mallet Machine Learning Library:
– http://mallet.cs.umass.edu/


References

• Li, X. and D. Roth. 2002. Learning Question Classifiers. The 19th International Conference on Computational Linguistics, vol. 1, pp. 1–7.

• Zhang, D. and W. S. Lee. 2003. Question Classification using Support Vector Machines. The ACM SIGIR conference on information retrieval, pp. 26–32.

• Huang, Z., M. Thint, and Z. Qin. 2008. Question Classification using Head Words and their Hypernyms.

References (cont)

• Roth, D., G. Kao, X. Li, R. Nagarajan, V. Punyakanok, N. Rizzolo, W. Yih, C. O. Alm, and L. G. Moran. 2002. Learning components for a question answering system. In TREC 2001.

• Brown, J. Entity-Tagged Language Models for Question Classification in a QA System. IR Lab.

• Metzler, D. and W. B. Croft. 2003. Analysis of statistical question classification for fact-based questions.
