A Trainable Multi-factored QA System

Radu Ion, Dan Ştefănescu, Alexandru Ceauşu, Dan Tufiş, Elena Irimia, Verginica Barbu-Mititelu

Research Institute for Artificial Intelligence, Romanian Academy

ResPubliQA

• We participated in the Romanian-Romanian ResPubliQA task

• 500 juridical questions to be answered from the Romanian JRC Acquis (10714 docs)

• The questions were translated from other languages, which makes the QA task more difficult: translated terms are not necessarily expressed the same way in the actual Romanian documents

Corpus processing and indexing

• POS tagging, lemmatization, chunking.

• Only the ‘body’ part of a document was indexed (no annexes, no headers)

• We have two Lucene indexes: a document index and a paragraph index

• What’s in the index: lemmas, plus paragraph classes in the case of the paragraph index

QA flow

• Web services based:

– Question preprocessing using TTL (http://ws.racai.ro/ttlws.wsdl)

– Question classification using a ME classifier (http://shadow.racai.ro/JRCACQCWebService/Service.asmx)

– Query generation (2 types: TFIDF and chunk based) (http://shadow.racai.ro/QADWebService/Service.asmx)

– Search engine interrogation (http://www.racai.ro/webservices/search.asmx)

– Paragraph relevance score computation and paragraph reordering

The combined QA system

• In order to account for NOA strings (which, when given, will increase the overall performance measure), we decided to combine 2 results:

– The QA system using the TFIDF query

– The QA system using the chunk query

• When the same paragraph was returned among the top K (=3) paragraphs by the two systems, that paragraph was given as the answer

• Otherwise, we returned the NOA string
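The combination rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `combine`, the `NOA` constant and the paragraph ids are hypothetical, and the slide does not say which paragraph wins when the two top-K lists share more than one. Here the shared paragraph ranked highest by the TFIDF system is chosen.

```python
NOA = "NOA"  # placeholder for the "no answer" string (assumed value)

def combine(tfidf_ranking, chunk_ranking, k=3):
    """Return a paragraph id found in both top-k lists, else NOA.

    Assumption: if several paragraphs are shared, the one ranked
    highest by the TFIDF-query system is returned.
    """
    top_chunk = set(chunk_ranking[:k])
    for pid in tfidf_ranking[:k]:
        if pid in top_chunk:
            return pid
    return NOA

# Hypothetical paragraph ids, for illustration only
print(combine(["a", "b", "c", "d"], ["x", "c", "b"]))
print(combine(["a", "b", "c"], ["d", "e", "f"]))
```

Raising K makes the system answer more often; lowering it makes the system fall back on the NOA string more often.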

Paragraph relevance

• s1 to s5 are paragraph relevance scores

• λi are weights trained by iteratively computing MRR scores on a 200-question test set, using sets of weights whose sum is 1

• Retaining the weights that yield the largest MRR gives a MERT-like training procedure

• Increment step was 0.01
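A minimal sketch of such a training loop, assuming the final relevance of a paragraph is the weighted sum of its scores and that the weights are non-negative (the slide only states that they sum to 1). All function names and the toy data are hypothetical, not the authors' code.

```python
def weight_grid(n, steps):
    """Yield all n-tuples of non-negative multiples of 1/steps summing to 1."""
    def rec(remaining, slots):
        if slots == 1:
            yield (remaining,)
            return
        for k in range(remaining + 1):
            for rest in rec(remaining - k, slots - 1):
                yield (k,) + rest
    for combo in rec(steps, n):
        yield tuple(k / steps for k in combo)

def rank(candidates, weights):
    """Order paragraph ids by the weighted sum of their score vectors."""
    def combined(pid):
        return sum(w * s for w, s in zip(weights, candidates[pid]))
    return sorted(candidates, key=combined, reverse=True)

def mrr(rankings, gold):
    """Mean reciprocal rank of the gold paragraph (0 when it is not ranked)."""
    total = 0.0
    for qid, ranking in rankings.items():
        for position, pid in enumerate(ranking, start=1):
            if pid == gold[qid]:
                total += 1.0 / position
                break
    return total / len(rankings)

def train_weights(dev_set, gold, n_scores, steps):
    """Exhaustive search: keep the weight vector with the largest MRR."""
    best_w, best_mrr = None, -1.0
    for w in weight_grid(n_scores, steps):
        rankings = {qid: rank(cands, w) for qid, cands in dev_set.items()}
        score = mrr(rankings, gold)
        if score > best_mrr:
            best_w, best_mrr = w, score
    return best_w, best_mrr

# Toy data with two scores per paragraph (made up for illustration)
dev_set = {
    "q1": {"p1": (0.9, 0.1), "p2": (0.2, 0.8)},
    "q2": {"p3": (0.7, 0.0), "p4": (0.1, 0.9)},
}
gold = {"q1": "p2", "q2": "p4"}
best_w, best_mrr = train_weights(dev_set, gold, n_scores=2, steps=10)
print(best_w, best_mrr)
```

With five scores and an increment of 0.01 (`steps=100`), this enumeration visits about 4.6 million weight vectors, so a real run would need the per-question scores precomputed.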

Relevance scores

• Lucene scores for the document and paragraph retrieval

• One BLEU-like relevance score, which is high if a candidate paragraph contains the keywords in much the same order as in the question

• One indicator variable that is 1 if the candidate paragraph has the same class as the question (0 otherwise)

• One lexical-chains-based score (a real number quantifying the semantic distance between the question and the candidate paragraph)
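To make the idea of an order-sensitive keyword score concrete, here is a small Python sketch. It is not the BLEU-like score used in the system (the slides do not give its formula). As a stand-in, it uses the length of the longest common subsequence between the question keywords and the paragraph lemmas, normalized by the number of keywords. The function names and sample tokens are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def order_score(question_keywords, paragraph_lemmas):
    """Share of question keywords found in the paragraph in the same order."""
    if not question_keywords:
        return 0.0
    return lcs_length(question_keywords, paragraph_lemmas) / len(question_keywords)

# Same keywords, same order vs. reversed order (made-up tokens)
print(order_score(["a", "b", "c"], ["a", "x", "b", "c"]))
print(order_score(["a", "b", "c"], ["c", "b", "a"]))
```

A score of this kind rewards paragraphs that preserve the keyword order of the question, which a bag-of-words retrieval score does not capture.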

Evaluations

• Official results

• Second run: query contained the question class

Post CLEF2009 Evaluations

• Results with all questions (500) answered (no NOA strings)

• With trained parameters for every question class, we obtain an overall accuracy of 0.5774 (29 additional correctly answered questions)

Post CLEF2009 Evaluations (II)

• Some other informative measures:

– Answering precision: correct / answered

– Rejection precision: (1 – correct) / unanswered

• AP(icia092roro) = 75.58%

• RP(icia092roro) = 86.53%

• So, the system is able to reject giving wrong answers at a high rate, which is a merit in itself (discovered due to the c@1 calculus), even if it cannot offer the same answering precision in the unanswered area
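The measures above can be sketched in Python as follows. The c@1 formula is the standard one, (n_R + n_U · n_R / n) / n, where n_R is the number of correct answers, n_U the number of unanswered questions and n the total. Rejection precision is read here as the share of unanswered questions whose withheld candidate was wrong, which is an interpretation of the slide. The counts in the example are made up and are not the system's results.

```python
def c_at_1(n_correct, n_unanswered, n_total):
    """c@1: accuracy that credits unanswered questions at the observed accuracy rate."""
    return (n_correct + n_unanswered * n_correct / n_total) / n_total

def answering_precision(n_correct, n_answered):
    """Share of answered questions that were answered correctly."""
    return n_correct / n_answered

def rejection_precision(n_wrong_candidates, n_unanswered):
    """Share of unanswered questions whose withheld candidate was wrong (assumed reading)."""
    return n_wrong_candidates / n_unanswered

# Hypothetical counts: 500 questions, 400 answered (300 correctly),
# 100 unanswered, of which 80 had a wrong candidate paragraph.
print(c_at_1(300, 100, 500))
print(answering_precision(300, 400))
print(rejection_precision(80, 100))
```

Because c@1 rewards leaving a question unanswered over answering it wrongly, a high rejection precision translates directly into a better c@1 score.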

Conclusions

• A multi-factored QA system may be easily extended with new paragraph relevance scores

• It’s also easily adaptable to new domains and/or languages

• Update: better correlation between documents and paragraph relevance scores

• Future plans: to develop the English QA system along the same lines and combine the En-Ro outputs

Conclusions (II)

• Competition drives innovation but let’s not forget that these tools are there to help users

• Useful requirement: QA systems to be on the Web

• Ours is at http://www2.racai.ro/sir-resdec/