A Multiple Ontology, Concept-Based, Context-Sensitive Search and Retrieval

Robert Moskovitch and Prof. Yuval Shahar
Medical Informatics Research Center
Ben Gurion University, Israel


Clinical Guidelines

• Clinical practice guidelines (CPGs) and protocols are a powerful method for standardizing the quality of medical care

• The main challenge is providing easy access to CPGs at the point of care

• Access involves representation of the guidelines and easy, accurate retrieval of relevant guidelines

The DEGEL Framework

Ben Gurion University’s Digital Electronic Guidelines Library (DeGeL) is an architecture and a Web-based set of computational tools for:

• Authoring and markup (semi-structuring and structuring)

• Retrieval and browsing

• Runtime application of clinical guidelines

• Retrospective assessment of the quality of the application

The Goal

• Build a search and retrieval tool to retrieve CPGs, to support the challenge of accurately retrieving CPGs at the point of care

• Enable concept-based search, which supports querying using an existing set of semantic classification indices

• Support context-sensitive search, which supports querying for a term only within a particular knowledge role (e.g., eligibility conditions)

Classification and Concept Based Search

• DeGeL uses seven semantic axes (or aspects) that can categorize CPGs (e.g., diagnosis type, therapy type)

• Each axis is implemented as a tree

• Each guideline can be classified under zero, one, or more indices from each axis
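
The classification scheme above can be sketched in a few lines of Python. This is only an illustration: the axis content, the guideline identifiers, and the decision to let a query concept also match guidelines classified under its descendants are assumptions, not details taken from DeGeL.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A node in one semantic axis, which is a tree of concepts."""
    name: str
    children: list[Concept] = field(default_factory=list)

    def subtree_names(self) -> set[str]:
        """Return this concept's name plus the names of all its descendants."""
        names = {self.name}
        for child in self.children:
            names |= child.subtree_names()
        return names


def find(root: Concept, name: str) -> Concept | None:
    """Depth-first lookup of a concept by name within one axis tree."""
    if root.name == name:
        return root
    for child in root.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None


def concept_search(axis_root, classifications, query_concept):
    """Return the ids of guidelines classified under the query concept.

    Assumption: a guideline classified under a descendant concept
    also matches a query for any of that concept's ancestors.
    """
    node = find(axis_root, query_concept)
    if node is None:
        return set()
    wanted = node.subtree_names()
    return {gid for gid, indices in classifications.items() if indices & wanted}


# Hypothetical "therapy type" axis.
therapy = Concept("therapy", [
    Concept("drug therapy", [Concept("insulin therapy")]),
    Concept("diet therapy"),
])

# Each guideline carries zero, one, or more indices from the axis.
classifications = {
    "cpg-1": {"insulin therapy"},
    "cpg-2": {"diet therapy", "drug therapy"},
    "cpg-3": set(),
}

print(sorted(concept_search(therapy, classifications, "drug therapy")))
```

Storing each guideline's indices as a set keeps the "zero, one, or more" rule explicit: an unclassified guideline simply has an empty set and never matches a concept query.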

Example Markup, Using The Asbru Ontology

Plan

The markup process gradually converts a free-text-based CPG to a semi-structured, then fully structured one, maintaining all formats in parallel (a hybrid architecture)

Conditions

Filter condition

Setup condition

Intentions

Outcome intentions

Process intentions

This guideline is intended only for women who are pregnant and who are at high risk for gestational diabetes and who had a glucose-tolerance test…

The main goal is reduction of potential hypertension…

The guideline uses mainly dietary measures…If a need for insulin develops, use a guideline for using short-acting insulin…

Context-Sensitive Search Within Knowledge Roles of Ontologies

• Several ontologies such as Asbru, GEM, and GLIF were developed to represent CPGs in a structured fashion in order to provide automated support for their use

• Context-Based Search exploits the existence of certain terms within semantically meaningful segments of the text, or knowledge roles

Example: searching within articles summarizing clinical studies [G.Purcell, 1996]. According to Purcell, a context defines a semantically meaningful region of the document for searching, and thus facilitates precise retrieval of information from the medical literature
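
A minimal Python sketch of the idea, assuming each guideline is stored as a mapping from knowledge-role name to the text marked up under that role. The role names, guideline identifiers, and texts are hypothetical, and simple substring matching stands in for whatever matching the real system uses.

```python
def context_search(guidelines, term, roles=None):
    """Return the ids of guidelines whose text contains `term`.

    If `roles` is given, only the text of those knowledge roles is searched;
    otherwise every role is searched, which amounts to a full-text search.
    """
    term = term.lower()
    hits = []
    for gid, segments in guidelines.items():
        if roles is None:
            searched = segments
        else:
            searched = {role: text for role, text in segments.items() if role in roles}
        if any(term in text.lower() for text in searched.values()):
            hits.append(gid)
    return hits


# Hypothetical marked-up guidelines: knowledge role -> text.
guidelines = {
    "cpg-a": {
        "filter_condition": "Pregnant women at high risk for gestational diabetes",
        "process_intention": "Use mainly dietary measures",
    },
    "cpg-b": {
        "filter_condition": "Adults with hypertension",
        "process_intention": "Screen for diabetes once a year",
    },
}

print(context_search(guidelines, "diabetes"))                        # all roles
print(context_search(guidelines, "diabetes", {"filter_condition"}))  # one role
```

Restricting the query to `filter_condition` drops the guideline that mentions the term only in passing, which is the precision gain that context-based search aims for.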

The Information Retrieval Task in the DEGEL Framework

• Document Collection

• Content Indexing

• Document Representation

• Query Formulation

• Matching Process

[Figure: the information retrieval process. Labels: Information Need, Query Formulation, Query Representation, Document Collection, Content Indexing, Document Representation, Matching Process, Retrieved Documents; additional labels: GLS, GLM, N : 1]

• Vaidurya Query Language

- Free Text

- Text Value

- Text Multiple Value

- Int

- Date

Source Ontology Markup Ontology

Representing Ontologies for Search Purposes

To implement the Concept-Based Search and the Context-Sensitive Search, two properties for each element in a guideline representation ontology were defined, Search Type and Search Scope. These properties, or aspects, define how an element will be indexed, queried and retrieved.

Search Type

Each search type has a description, a set of querying options, and a relevance measure:

• Free Text: an element containing free-text content. Querying options: keywords combined with a disjunction or conjunction operator. Relevance measure: metric.

• Text Value: an element that may contain only a single fixed string value. Querying options: requested string values, with disjunction as the only possible relation. Relevance measure: Boolean.

• Text Multiple Value: an element containing one or more fixed string values. Querying options: requested string values combined with conjunction or disjunction relations. Relevance measure: Boolean.

• Date: an element whose content represents a calendar date. Querying options: a date constraint using operators such as ‘>’ or ‘>=’. Relevance measure: metric.

• Integer: an element whose content represents an integer value. Querying options: an integer constraint using operators such as ‘>’ or ‘>=’. Relevance measure: metric.

• Semantic Index: an element that represents the conceptual classification of the guideline. Querying options: requested concepts combined with conjunction or disjunction operators between indices. Relevance measure: Boolean.

• Unsearchable: an element that has no content, or whose content is irrelevant for search. Querying options: none. Relevance measure: not applicable.
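
The search types can be written down as a small Python sketch. The names `SearchType`, `RELEVANCE`, `match_values`, and `satisfies` are hypothetical, and the operators beyond ‘>’ and ‘>=’ are assumed. The helper for Date and Integer elements only checks whether a constraint holds; it does not compute the metric relevance score, which the slides do not specify.

```python
import operator
from enum import Enum


class SearchType(Enum):
    FREE_TEXT = "Free Text"
    TEXT_VALUE = "Text Value"
    TEXT_MULTIPLE_VALUE = "Text Multiple Value"
    DATE = "Date"
    INTEGER = "Integer"
    SEMANTIC_INDEX = "Semantic Index"
    UNSEARCHABLE = "Unsearchable"


# Relevance measure for each search type, as listed on the slide.
RELEVANCE = {
    SearchType.FREE_TEXT: "metric",
    SearchType.TEXT_VALUE: "boolean",
    SearchType.TEXT_MULTIPLE_VALUE: "boolean",
    SearchType.DATE: "metric",
    SearchType.INTEGER: "metric",
    SearchType.SEMANTIC_INDEX: "boolean",
    SearchType.UNSEARCHABLE: None,  # never queried
}

# Assumed operator set; the slide names only '>' and '>='.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
}


def match_values(element_values, requested, conjunction=False):
    """Boolean match for Text Value and Text Multiple Value elements.

    With conjunction=True every requested value must be present;
    otherwise one present value is enough (disjunction).
    """
    found = [value in element_values for value in requested]
    return all(found) if conjunction else any(found)


def satisfies(content, op, bound):
    """Check a Date or Integer constraint such as ('>=', 18)."""
    return COMPARATORS[op](content, bound)


print(match_values({"adult", "female"}, ["adult", "child"]))
print(match_values({"adult", "female"}, ["adult", "child"], conjunction=True))
print(satisfies(20, ">=", 18))
```

Because `satisfies` relies on ordinary comparison, the same helper works for `datetime.date` values as well as integers.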

Search Scope

• None: neither the element nor its descendants are searched. Used for elements that have no content of their own and whose descendants' content is not relevant to them.

• Search-Self: search the element itself, without its descendants.

• Only-Children: do not search the element itself; search only its descendants.

• Children-Included: search both the element and its descendants.
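
The four scopes can be read as rules for which elements of the ontology tree a query on a given element consults. The Python sketch below is an illustration only: the `Element` class and element names are hypothetical, and it assumes that the scope of the queried element alone decides the outcome (the descendants' own scopes are ignored).

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One element of a guideline representation ontology."""
    name: str
    scope: str  # "None", "Search-Self", "Only-Children" or "Children-Included"
    children: list["Element"] = field(default_factory=list)


def descendants(element):
    """Yield every descendant of `element` in depth-first order."""
    for child in element.children:
        yield child
        yield from descendants(child)


def searched_elements(element):
    """Names of the elements consulted when a query targets `element`."""
    if element.scope == "None":
        return []
    if element.scope == "Search-Self":
        return [element.name]
    below = [d.name for d in descendants(element)]
    if element.scope == "Only-Children":
        return below
    if element.scope == "Children-Included":
        return [element.name] + below
    raise ValueError(f"unknown search scope: {element.scope}")


# Hypothetical fragment of an ontology: a container with two leaf elements.
conditions = Element("conditions", "Only-Children", [
    Element("filter_condition", "Search-Self"),
    Element("setup_condition", "Search-Self"),
])

print(searched_elements(conditions))
```

Here the container has no text of its own, so `Only-Children` routes a query on `conditions` to its two leaves.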

Query Interface

Results Interface

Evaluation

The goal of the evaluation was to examine what concept-based search and context-sensitive search contribute beyond traditional full-text search.

Test sets: TREC NGC CPGs collection

Concept-based and Context-sensitive evaluation

NGC CPGs collection

• 1136 CPGs stored in a GEM-based ontology

• Classified along two MeSH taxonomies: Disease/Condition and Treatment/Intervention

• Each taxonomy contains ~2500 concepts; in some regions the concepts are 10 levels deep, but the average depth is 4-6 levels

Queries and Judgments

Evaluating an IR system requires a set of queries and relevance judgments.

We collected a set of 15 daily queries from 5 physicians (E&C and Stanford).

Each physician was asked to label, for each query, the relevant CPGs in the collection.

Each query had three formats:

• Full text

• Concept query at the 2nd and 3rd levels

• Context query in 3 elements

Evaluation Measures

PRECISION = (Number of relevant documents retrieved) / (Total number of documents retrieved)

RECALL = (Number of relevant documents retrieved) / (Number of relevant documents in the document collection)

[Formula residue: a SumPrecision measure over Precision_j and Recall_i; the expression itself is not recoverable]
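
A short Python sketch of the two measures for a single query. The function name and guideline identifiers are hypothetical, and returning 0.0 when a denominator is empty is an assumed convention.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Assumed convention: an empty denominator yields 0.0.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four guidelines retrieved, three judged relevant, two in common.
p, r = precision_recall(
    retrieved=["cpg-1", "cpg-2", "cpg-3", "cpg-4"],
    relevant=["cpg-2", "cpg-4", "cpg-9"],
)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this example two of the four retrieved guidelines are relevant (precision 2/4) and two of the three relevant guidelines were found (recall 2/3).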

Evaluation Hypotheses

Hypothesis 1: Retrieval performance will increase as more context elements are queried, including when they are queried in addition to full-text search.

Hypothesis 2: Retrieval performance will increase when concept-based queries are used in addition to full-text search.

Results – Contextual search

[Figure: precision (y-axis, 0 to 1) plotted against recall (x-axis, 0 to 1); curves labeled 1, 2, 3]

Context Sensitive in addition to Full Text search

[Figure: precision (y-axis, 0 to 1) plotted against recall (x-axis, 0 to 1); curves labeled 1, 2, 3, 4]

Results – Concept based search in addition to three contexts

[Figure: precision (y-axis, 0 to 1) plotted against recall (x-axis, 0 to 1); curves labeled 1, 2, 3, 4, 5]

Results – Concept based search in addition to full-text search

[Figure: precision (y-axis, 0 to 1) plotted against recall (x-axis, 0 to 1); curves labeled 1, 2, 3, 4, 5]

Results – Concept based search in addition to single context

[Figure: precision (y-axis, 0 to 1) plotted against recall (x-axis, 0 to 1); curves labeled 1, 2, 3, 4, 5]

Discussion

Concept-based search increased retrieval performance in all cases. Improvement was observed when deeper queries were used with a conjunctive relation.

Context-sensitive search improves performance as more contexts participate in the query.

Questions ?
