Beyond “Bag of Words”: Towards a Framework for Conceptual Retrieval

Jimmy Lin
College of Information Studies
University of Maryland

Thursday, October 4, 2007
IPAM Workshop, UCLA

Beyond “Bag of Words”

IR is fundamentally based on counting words
  Different ways of “bookkeeping”: vector space, probabilistic, LM, DFR, etc.

So…
  Words aren’t enough to capture meaning
  Term statistics aren’t enough to capture meaning

Thus…
  IR systems should go beyond term statistics: concepts, relations, etc.

Hypothesis:
  IR based on concepts, relations, etc. >> IR based on words

However…
  A reasonable hypothesis?
  Where’s the empirical support?

Outline

Previous attempts to go beyond BoW

Slightly different approach
  Start with specialized applications
  Generalize

Case study in the medical domain
  A clinical question answering system in support of evidence-based medicine (EBM)

Broader applicability?

Previous Work

Beyond “bags”
  Indexing phrases, e.g., (Fagan, 1987; Smeaton et al., 1994; etc.)
  Modeling term dependencies, e.g., (Gao et al., 2004; Liu et al., 2004; Metzler and Croft, 2005; Cui et al., 2005; etc.)

Beyond “words”
  Query expansion, e.g., (Voorhees, 1993; 1994)
  Word Sense Disambiguation, e.g., (Sanderson, 1994; Mihalcea and Moldovan, 2000)

Results? Mixed

A Different Approach

Previous work focuses on the general domain
  Broad but (relatively) shallow
  Hampered by commonsense problem
  Difficult to acquire large amounts of knowledge

Our approach:
  Develop a general framework
  Instantiate in domain-specific applications
  Leverage lessons learned to refine the framework
  Rinse, repeat

“Conceptual Retrieval”

[Diagram components: Questions, Semantic Matcher, Answers, Conceptual representation, Knowledge Extractor, Collection]

What type of knowledge?

Knowledge about the problem structure
  What representations are useful for capturing the information need?

Knowledge about user tasks
  Why is this information needed?
  How will it be further used?

Knowledge about the domain
  What background knowledge is needed to reason about the information need?

K1: Problem Structure

Knowledge representations are important!
  Help experts reason about problems
  Form the basis for tractable computational structures

GO’FAI
  Frames (Minsky)
  Scripts (Schank)
  Semantic networks (attribution less clear)

Knowledge about problem structure
Knowledge about user tasks
Knowledge about the domain

K2: User Tasks

The user is important!

Users are different
  High school student vs. intelligence analyst

Different types of relevance
  Topical, situational, etc.

K3: Domain

Why is the sky blue?

Users bring a tremendous amount of knowledge to bear when asking questions
  Specialized, technical knowledge
  Commonsense

“To really learn something, you basically have to already know it.”

K4 … Kn?

More types of knowledge needed?

Working hypothesis: {K1, K2, K3} comprise a necessary set

Introductions

Dr. Dr. Dina Demner-Fushman, M.D., Ph.D.
Dr. , Ph.D.

Why the Medical Domain?

Evidence-Based Medicine
  = A paradigm of medical practice that emphasizes decision-support from high-quality clinical research
  Provides a basis for K1, K2, and K3

Need for retrieval systems is well documented, e.g., (Gorman et al., 1994; Chambliss and Conley, 1996; Cogdill and Moore, 1997; Ely et al., 2005; Sutton et al., 2005)

Clinical QA: “Ready-made” domain for exploring conceptual retrieval
  Availability of corpora, resources, etc.
  Important and potentially high-impact application

K1: Problem Structure

EBM identifies four components of a question
  Originally developed as a clinical tool
  Can serve as a knowledge representation

“In children with an acute febrile illness, what is the efficacy of single-medication therapy with acetaminophen or ibuprofen in reducing fever?”

= PICO frame
  Population/Problem: children/acute febrile illness
  Intervention: acetaminophen
  Comparison: ibuprofen
  Outcome: reducing fever
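To make the frame concrete, here is a minimal sketch of how a PICO frame could be held in code. The class name `PICOFrame`, its field names, and the `filled_slots` helper are all hypothetical and do not come from the talk. Splitting Population and Problem into separate fields is also an assumption, made because the later slides treat them as separate extraction targets.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PICOFrame:
    """Hypothetical container for the EBM question components."""
    population: Optional[str] = None
    problem: Optional[str] = None
    intervention: Optional[str] = None
    comparison: Optional[str] = None
    outcome: Optional[str] = None

    def filled_slots(self) -> dict:
        """Return only the slots that carry a value."""
        return {name: value for name, value in vars(self).items() if value is not None}


# The example question from the slide, encoded as a frame.
frame = PICOFrame(
    population="children",
    problem="acute febrile illness",
    intervention="acetaminophen",
    comparison="ibuprofen",
    outcome="reducing fever",
)
```

Leaving unfilled slots as `None` lets a question that mentions only an intervention, for example, still be represented.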

K2: User Tasks

Clinical tasks
  Therapy: Selecting effective treatments, taking into account other factors such as risk and cost
  Diagnosis: Selecting and interpreting diagnostic tests, while considering factors such as precision and safety
  Prognosis: Estimating the patient’s likely course over time and anticipating likely complications
  Etiology: Identifying risk factors and the causes for a patient’s disease

Considerations for strength of evidence
  Strength of Recommendations Taxonomy (SORT): three evidence grades

K3: Domain

The Unified Medical Language System (UMLS)
  2004 version: 1+ million biomedical concepts, > 5 million concept names

Software for leveraging this resource:
  MetaMap, SemRep for identifying concepts, relations

[Figure: example UMLS concept hierarchy with nodes ofloxacin, boric acid, Quinolone, Ciclopirox, Borate product, Antibacterial drugs, Mucous membrane antifungal agent, Disinfectants and cleansers, Anti-infective agent, Antifungal]

Re: Conceptual Retrieval

Question: In children with an acute febrile illness, what is the efficacy of single-medication therapy with acetaminophen or ibuprofen in reducing fever?

Query frame:
  Task: therapy
  P: children/acute febrile illness
  I: acetaminophen
  C: ibuprofen
  O: reducing fever

MEDLINE: NLM’s authoritative repository of 17 million+ abstracts

Answer: Ibuprofen provided greater temperature decrement and longer duration of antipyresis than acetaminophen when the two drugs were administered in approximately equal doses.

System Architecture

[Architecture diagram: Question (query frame) → Query Formulator → search query → PubMed → abstracts → Knowledge Extractors → annotated abstracts → Semantic Matcher (also receives the query frame) → scored citations → Answer Generator → Answers]

Test Collection

Manually gathered 50 clinical questions from FPIN and the Parkhurst Exchange
  Reflects distribution of real-world questions
  Divided into development and test collections

Task        Questions   Example
Therapy     22          Does quinine reduce leg cramps for young athletes?
Diagnosis   12          How often is coughing the presenting complaint in patients with gastroesophageal reflux disease?
Prognosis   6           What’s the prognosis of lupoid sclerosis?
Etiology    10          What are the causes of hypomagnesemia?
Total       50

Gathering Judgments

Manually formulated PubMed queries
  ~40 minutes per question; gathered top 50 hits

Manually evaluated all retrieved citations
  ~2 hours per question

Question: What is the best treatment for analgesic rebound headaches?

PubMed Query: (((“analgesics”[TIAB] NOT Medline[SB]) OR “analgesics”[MeSH Terms] OR “analgesics”[Pharmacological Action] OR analgesic[Text Word]) AND ((“headache”[TIAB] NOT Medline[SB]) OR “headache”[MeSH Terms] OR headaches[Text Word]) AND (“adverse effects”[Subheading] OR side effects[Text Word])) AND hasabstract[text] AND English[Lang] AND “humans”[MeSH Terms]

Antipyretic efficacy of ibuprofen vs acetaminophen.

OBJECTIVE--To compare the antipyretic efficacy of ibuprofen, placebo, and acetaminophen. DESIGN--Double-dummy, double-blind, randomized, placebo-controlled trial. SETTING--Emergency department and inpatient units of a large, metropolitan, university-based, children's hospital in Michigan. PARTICIPANTS--37 otherwise healthy children aged 2 to 12 years with acute, intercurrent, febrile illness. INTERVENTIONS--Each child was randomly assigned to receive a single dose of acetaminophen (10 mg/kg), ibuprofen (7.5 or 10 mg/kg), or placebo. MEASUREMENTS/MAIN RESULTS--Oral temperature was measured before dosing, 30 minutes after dosing, and hourly thereafter for 8 hours after the dose. Patients were monitored for adverse effects during the study and 24 hours after administration of the assigned drug. All three active treatments produced significant antipyresis compared with placebo. Ibuprofen provided greater temperature decrement and longer duration of antipyresis than acetaminophen when the two drugs were administered in approximately equal doses. No adverse effects were observed in any treatment group. CONCLUSION--Ibuprofen is a potent antipyretic agent and is a safe alternative for the selected febrile child who may benefit from antipyretic medication but who either cannot take or does not achieve satisfactory antipyresis with acetaminophen.

Am J Dis Child. 1992 May; 146(5):622-5

Knowledge Extraction Example

Population Problem Interventions Outcome

Knowledge Extractors

Population, Problem, Intervention: IE task
  Exploited coverage of medical concepts in UMLS
  Additional candidate ranking based on a few features

Outcome: sentence-level classification task
  “Kitchen sink approach”, ensemble of classifiers
  Features:
    • Manually-defined cue words
    • N-grams
    • Position in abstract
    • Presence of certain UMLS concepts
    • …
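As an illustration of the sentence-level feature types listed above, here is a minimal sketch of a feature extractor covering cue words, n-grams, and position in the abstract. The cue-word set, the function name `sentence_features`, and the feature naming scheme are hypothetical, since the talk does not list the actual cue words or the classifier inputs. The UMLS-concept feature is left out because it would require a concept tagger such as MetaMap.

```python
from typing import Dict, List

# Hypothetical cue words; the talk does not list the ones actually used.
CUE_WORDS = {"significant", "significantly", "effective", "superior", "reduced"}


def sentence_features(sentences: List[str], index: int) -> Dict[str, float]:
    """Build a sparse feature dictionary for sentences[index]."""
    text = sentences[index].lower().replace(",", " ").replace(".", " ")
    tokens = text.split()
    features: Dict[str, float] = {}
    # Cue-word feature: 1.0 if any cue word occurs in the sentence.
    features["has_cue_word"] = float(any(t in CUE_WORDS for t in tokens))
    # N-gram features: unigrams and bigrams as binary indicators.
    for token in tokens:
        features["uni=" + token] = 1.0
    for first, second in zip(tokens, tokens[1:]):
        features["bi=" + first + "_" + second] = 1.0
    # Position feature: 0.0 for the first sentence, 1.0 for the last.
    features["rel_position"] = index / max(len(sentences) - 1, 1)
    return features
```

A dictionary like this could be fed to any standard classifier. The ensemble mentioned on the slide is not reproduced here.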

Semantics helps!

Knowledge Extractors

?80% 0% 20%

?90% 5% 5%

?80% 13% 7%

?95% 0% 5%

Outcome, Population, Problem, Intervention

Details: Dina Demner-Fushman and Jimmy Lin. Answering Clinical Questions with Knowledge-Based and Statistical Techniques. Computational Linguistics, 33(1):63-103, 2007

Semantic Matching

Three score components:

$S_{EBM} = S_{PICO} + S_{SoE} + S_{MeSH}$

  $S_{PICO}$: Matching PICO frame elements (Problem Structure)
  $S_{SoE}$: Strength of evidence considerations (User Tasks)
  $S_{MeSH}$: MeSH indicators for each clinical task (User Tasks)

Details: Dina Demner-Fushman and Jimmy Lin. Answering Clinical Questions with Knowledge-Based and Statistical Techniques. Computational Linguistics, 33(1):63-103, 2007

Semantic Matching: Evaluation

Research Questions
  Does it work?
  What are the relative contributions of each component?
  What is the interaction between knowledge-based and statistical techniques?

Approach
  Reranking experiments with test collection
  Ablation studies

Evaluation: Abstract Reranking

Question: What is the best treatment for analgesic rebound headaches?

(((“analgesics”[TIAB] NOT Medline[SB]) OR “analgesics”[MeSH Terms] OR “analgesics”[Pharmacological Action] OR analgesic[Text Word]) AND ((“headache”[TIAB] NOT Medline[SB]) OR “headache”[MeSH Terms] OR headaches[Text Word]) AND (“adverse effects”[Subheading] OR side effects[Text Word])) AND hasabstract[text] AND English[Lang] AND “humans”[MeSH Terms]

MEDLINE

Knowledge Extractor → Semantic Matcher (input: clinical task, PICO frame)

Reranked output compared:
  vs. original PubMed ordering
  vs. Indri baseline (state-of-the-art LM)

Results: Complete Model

Performance on held-out blind test set (percentages are relative to the Indri baseline):

Precision at 10 (P10)
          Therapy       Diagnosis     Prognosis     Etiology      All
PubMed    .350 (–39%)   .150 (–70%)   .200 (–46%)   .320 (–20%)   .281 (–44%)
Indri     .575          .500          .367          .400          .500
EBM       .783 (+36%)   .583 (+17%)   .467 (+27%)   .660 (+65%)   .677 (+35%)

Mean Average Precision (MAP)
          Therapy       Diagnosis     Prognosis     Etiology      All
PubMed    .421 (–29%)   .279 (–48%)   .235 (–56%)   .364 (–17%)   .356 (–35%)
Indri     .595          .534          .533          .439          .544
EBM       .765 (+29%)   .637 (+19%)   .722 (+35%)   .701 (+60%)   .718 (+32%)

Results are statistically significant
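For readers unfamiliar with the two metrics reported here, the sketch below shows the standard textbook definitions of precision at k and (mean) average precision over a ranked list of binary relevance judgments. The talk does not say exactly how its scores were computed, so this is an assumption about the usual definitions, not the evaluation code behind the numbers. All function names are hypothetical.

```python
from typing import List, Sequence


def precision_at_k(relevance: Sequence[bool], k: int = 10) -> float:
    """Fraction of the top-k ranked items that are relevant."""
    return sum(relevance[:k]) / k


def average_precision(relevance: Sequence[bool], num_relevant: int) -> float:
    """Mean of the precision values at each rank where a relevant item appears."""
    if num_relevant == 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / num_relevant


def mean_average_precision(runs: List[Sequence[bool]], totals: List[int]) -> float:
    """Average the per-question average precision over all questions."""
    return sum(average_precision(r, n) for r, n in zip(runs, totals)) / len(runs)
```

Here `relevance` is the ranked list for one question, and `num_relevant` is the number of relevant citations known for that question.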

Details: Jimmy Lin and Dina Demner-Fushman. The Role of Knowledge in Conceptual Retrieval: A Study in the Domain of Clinical Medicine. SIGIR 2006.

Results: Parameter Settings

Tuning each component
  $S_{EBM} = \lambda_1 S_{PICO} + \lambda_2 S_{SoE} + (1 - \lambda_1 - \lambda_2) S_{MeSH}$
  No statistically significant difference

Combining EBM + Indri
  $S_{EBM+Indri} = \lambda S_{EBM} + (1 - \lambda) S_{Indri}$
  Better performance, but not statistically significant
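The two interpolation formulas above can be sketched directly in code. The function names, the tuple layout used by `rerank`, and the equal default weights are hypothetical. The sketch also assumes the component scores are already on comparable scales, which the slides do not spell out.

```python
from typing import List, Tuple


def ebm_score(s_pico: float, s_soe: float, s_mesh: float,
              lam1: float = 1 / 3, lam2: float = 1 / 3) -> float:
    """Weighted EBM score; the MeSH weight is whatever remains after lam1 and lam2."""
    if lam1 < 0 or lam2 < 0 or lam1 + lam2 > 1:
        raise ValueError("weights must be non-negative and sum to at most 1")
    return lam1 * s_pico + lam2 * s_soe + (1 - lam1 - lam2) * s_mesh


def combined_score(s_ebm: float, s_indri: float, lam: float) -> float:
    """Linear interpolation between the EBM score and the Indri score."""
    return lam * s_ebm + (1 - lam) * s_indri


def rerank(citations: List[Tuple[str, float, float]], lam: float) -> List[Tuple[str, float, float]]:
    """Sort (citation_id, s_ebm, s_indri) tuples by combined score, best first."""
    return sorted(citations, key=lambda c: combined_score(c[1], c[2], lam), reverse=True)
```

With equal weights, the weighted score is the unweighted three-component sum scaled by a constant, so it yields the same ranking.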

Results: Contributions

What’s the contribution of each EBM facet?

MAP
                                              Score   vs. EBM   vs. Indri
$S_{PICO}$ (Problem Structure)                .646    –10%**    +19%*
$S_{SoE} + S_{MeSH}$ (User Tasks)             .538    –25%**    –1%

P10
                                              Score   vs. EBM   vs. Indri
$S_{PICO}$ (Problem Structure)                .627    –7%       +25%**
$S_{SoE} + S_{MeSH}$ (User Tasks)             .485    –28%**    –3%

** = sig. at 99%, * = sig. at 95%

What types of knowledge are important?
  Problem structure (K1) helps a lot
  User tasks (K2) help, but not as much
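The percentage columns can be checked with simple arithmetic: each is the change of the ablated model's score relative to either the complete EBM model or the Indri baseline. The sketch below reproduces the MAP row for the PICO-only model from values given in the slides (.646, against .718 for the complete model and .544 for Indri). The function name is hypothetical.

```python
def relative_change(score: float, baseline: float) -> int:
    """Percentage change of score relative to baseline, rounded to a whole percent."""
    return round(100 * (score - baseline) / baseline)


# MAP of the PICO-only model, compared with the complete EBM model and with Indri.
s_pico_map = 0.646
print(relative_change(s_pico_map, 0.718))  # -10
print(relative_change(s_pico_map, 0.544))  # 19
```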

Results: Partial Models

Can we use limited knowledge to improve term-based methods?

Model                                                                            λ     MAP             P10
Term statistics: $S_{Indri}$                                                           .544            .500
+ Problem structure: λ $S_{Indri}$ + (1 − λ) $S_{PICO}$                          .46   .668 (+23%)**   .627 (+25%)**
+ User tasks: λ $S_{Indri}$ + (1 − λ)(.5 $S_{SoE}$ + .5 $S_{MeSH}$)              .55   .620 (+14%)**   .565 (+13%)*

** = sig. at 99%, * = sig. at 95%

Any knowledge helps!

Answer Generation

Physicians are most interested in outcomes

Approach: identify outcome sentences
  Generate an answer from each citation: abstract title and three highest scoring outcome sentences

Question: Does combining aspirin and warfarin decrease the risk of stroke for patients with nonvalvular atrial fibrillation?

Answer: Prevention of thromboembolic events in atrial fibrillation: The results from the SPAF III study demonstrated that a combination of mini-intensity warfarin plus aspirin was insufficient for stroke prevention in atrial fibrillation. Other trials now indicate, that oral anticoagulation at INR-values below 2.0 is not effective for stroke prevention in these patients. The present clinical challenge is to ensure effective and safe oral anticoagulation to patients with atrial fibrillation at high risk of stroke.

abstract title outcome1 outcome2 outcome3
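The answer layout above (abstract title followed by up to three outcome sentences) can be sketched as follows. The function name and the input format, a list of (sentence, outcome score) pairs in abstract order, are hypothetical. Presenting the selected sentences in their original abstract order rather than in score order is also an assumption.

```python
from typing import List, Tuple


def generate_answer(title: str, scored_sentences: List[Tuple[str, float]],
                    max_outcomes: int = 3) -> str:
    """Join the title with the highest-scoring outcome sentences, kept in abstract order."""
    # Rank sentence indices by outcome score, best first.
    ranked = sorted(range(len(scored_sentences)),
                    key=lambda i: scored_sentences[i][1], reverse=True)
    # Keep the top few, then restore their original order in the abstract.
    chosen = sorted(ranked[:max_outcomes])
    outcomes = " ".join(scored_sentences[i][0] for i in chosen)
    return f"{title}: {outcomes}"
```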

Evidence Synthesis

Integrate findings from multiple citations

Question: What is the best treatment for chronic prostatitis?

► anti-microbial

[temafloxacin] Treatment of chronic bacterial prostatitis with temafloxacin. Temafloxacin 400 mg b.i.d. administered orally for 28 days represents a safe and effective treatment for chronic bacterial prostatitis.

[ofloxacin] Ofloxacin in the management of complicated urinary tract infections, including prostatitis. In chronic bacterial prostatitis, results to date suggest that ofloxacin may be more effective clinically and as effective microbiologically as carbenicillin....

► Alpha-adrenergic blocking agent

[terazosine] Terazosin therapy for chronic prostatitis/chronic pelvic pain syndrome: a randomized, placebo controlled trial. CONCLUSIONS: Terazosin proved superior to placebo for patients with chronic prostatitis/chronic pelvic pain syndrome who had not received alpha-blockers previously....

Semantic Clustering

[Diagram: relevant citations are grouped into Cluster 1, Cluster 2, and Cluster 3; stages shown: Answer Extraction, Semantic Clustering, Interactive Presentation]

Evaluation: Evidence Synthesis

What is the best treatment of X?

Compare
  Top three answers from PubMed
  First answer in three largest semantic clusters

Evaluation by a physician:

                      “Good”   “Okay”   “Bad”
PubMed                0.600    0.227    0.173
Semantic Clustering   0.827    0.133    0.040
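The “first answer in three largest semantic clusters” selection can be sketched as below. This assumes each ranked citation already carries a category label (for example, a drug class such as “anti-microbial”). How those labels are derived is not shown in the slides, so the input format and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def top_cluster_answers(citations: List[Tuple[str, str]],
                        num_clusters: int = 3) -> List[Tuple[str, str]]:
    """Given ranked (category, answer) pairs, return the first answer of each largest cluster."""
    clusters: Dict[str, List[str]] = defaultdict(list)
    for category, answer in citations:
        clusters[category].append(answer)
    # Largest clusters first; ties keep first-seen order because sorted() is stable.
    largest = sorted(clusters.items(), key=lambda item: len(item[1]), reverse=True)
    return [(category, answers[0]) for category, answers in largest[:num_clusters]]
```

Because the input is assumed to be in rank order, the first answer stored for a category is also its highest-ranked one.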

Details: Dina Demner-Fushman and Jimmy Lin. Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering. ACL 2006.

Findings

K1 + K2 + K3 → “conceptual retrieval”

Knowledge helps a lot!

But here’s the catch:
  Limited domain: “narrow but deep”
  Dependent on availability of existing resources

Beyond “bag of words”:
  Develop a general framework
  Instantiate in domain-specific applications
  Leverage lessons learned to refine the framework
  Rinse, repeat

Re: Re: Conceptual Retrieval

Question: In children with an acute febrile illness, what is the efficacy of single-medication therapy with acetaminophen or ibuprofen in reducing fever?

MEDLINE

PICO frame = faceted query! Each frame element is a facet:
  P: children/acute febrile illness
  I: acetaminophen
  C: ibuprofen
  O: reducing fever

Answer: Ibuprofen provided greater temperature decrement and longer duration of antipyresis than acetaminophen when the two drugs were administered in approximately equal doses.

MEDLINE: NLM’s authoritative repository of 17 million+ abstracts

Conceptual Retrieval

“Building blocks” strategy in library science
  Decompose information need into conceptual facets
  Identify terms that represent those facets
  Instantiate in a structured query

EBM-based retrieval is a specific case of facet analysis and structured querying!

( A1 A2 …) ( B1 B2 …) ( C1 C2 …) ( D1 D2 …) …

P I C O
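A minimal sketch of turning facets into a structured query follows. It assumes the conventional reading of the building-blocks strategy, in which terms within a facet are joined with OR and the facets are joined with AND. The operators are not visible in the slide text, and the function name and quoting rule are hypothetical.

```python
from typing import Dict, List


def build_query(facets: Dict[str, List[str]]) -> str:
    """OR together the terms within each facet, then AND the facets."""
    blocks = []
    for terms in facets.values():
        if not terms:
            continue  # skip facets with no terms
        # Quote multi-word terms so they are treated as phrases.
        quoted = [f'"{term}"' if " " in term else term for term in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


query = build_query({
    "P": ["children", "febrile illness"],
    "I": ["acetaminophen"],
    "C": ["ibuprofen"],
    "O": ["fever reduction", "antipyresis"],
})
```

Each PICO element becomes one parenthesized block, which mirrors the ( A1 A2 … ) ( B1 B2 … ) layout on the slide.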

A General Framework?

For a domain

1. Identify prototypical information needs

2. Develop a frame-based representation

3. Build extractor for frame elements

4. Instantiate semantic matcher

5. Watch performance go up!

The subject of ongoing work…

What comes next?

Retrieval in the biomedical domain
  Information describing the role(s) of a [gene] involved in a [disease].
    gene: Interferon-beta; disease: Multiple Sclerosis
  Information describing the role of a [gene] in a specific [biological process].
    gene: nucleoside diphosphate kinase (NM23); biological process: tumor progression

Complex question answering
  What evidence is there for transport of [art looted by the Nazis in WWII] from [Germany] to [France]?
  What [familial ties] exist between [Neanderthals] and [humans]?
  What [common interests] exist between [Network Solutions] and [the Internet Corporation for Assigned Names and Numbers (ICANN)]?

Acknowledgments

Dina Demner-Fushman (Ph.D., 2006)

This work was funded in part by NLM

References

Dina Demner-Fushman and Jimmy Lin. Answering Clinical Questions with Knowledge-Based and Statistical Techniques. Computational Linguistics, 33(1):63-103, 2007.

Jimmy Lin and Dina Demner-Fushman. The Role of Knowledge in Conceptual Retrieval: A Study in the Domain of Clinical Medicine. Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2006), 2006, pp. 99-106.

Dina Demner-Fushman and Jimmy Lin. Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL 2006), 2006, pp. 841-848.
