Retrieval of Similar Electronic Health Records using UMLS Concept Graphs

Laura Plaza and Alberto Díaz

Universidad Complutense de Madrid

Motivation

• When facing complex or atypical cases, physicians need to refer to similar previous cases

• The adoption of EHRs by office-based physicians and hospitals is increasing

• Still, the time required to find similar cases can be prohibitive if no effective access is provided

Given a reference record, retrieve the records in the clinical database that are similar to it

Retrieval of Similar Electronic Health Records Using UMLS Concept Graphs (Plaza and Díaz, 2010) 2

A Different IR Task

• A mix of highly structured information and idiosyncratic narrative text

• Unique sublanguage characteristics: verbless sentences, punctuation and spelling errors; synonyms and homonyms; neologisms; acronyms and abbreviations

• When can two health records be considered similar?

Two EHRs are Similar if…

Same symptom or sign (e.g. fever or 5 kg weight loss)

Same diagnosis (e.g. bacterial pneumonia)

Same test or procedure (e.g. cerebral NMR or endoscopic biopsy)

Same medication (e.g. clopidogrel)

But absent criteria are not relevant for the task!

Using UMLS for Concept Annotation

• UMLS consists of three main components: the SPECIALIST Lexicon, the Metathesaurus and the Semantic Network

• We use MetaMap to map free-form text to Metathesaurus concepts

• Advantages: broad coverage; performs word sense disambiguation; numerous entries for acronyms and abbreviations; etc.

Our Proposal

• A four-step graph-based method:

1. Extraction of UMLS concepts
2. Negation detection
3. Semantic graph-based representation
4. Ranking similar EHRs

CLINICAL HISTORY: Eleven years old with ALL, bone marrow transplant on Jan. 2, now with 3 day history of cough.

IMPRESSION: No focal pneumonia. Likely chronic changes at the left lung base. Mild anterior wedging of the thoracic vertebral bodies.

Our Proposal: Extracting UMLS Concepts

• We use MetaMap to extract the UMLS concepts from the Metathesaurus and their semantic types from the Semantic Network

• But, according to the expert, not all concepts are relevant to the task

• Thus, the expert mapped the similarity criteria to semantic types, and only concepts of those types are considered

Our Proposal: Extracting UMLS Concepts

Category            UMLS Semantic Types
Symptoms and Signs  Sign or Symptom; Finding
Diseases            Disease or Syndrome; Pathologic Function
Procedures          Therapeutic or Preventive Procedure; Diagnostic Procedure
Body Parts          Body Location or Region; Body Part, Organ, or Organ Component
Medicaments         Pharmacologic Substance
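The category mapping in this table can be applied as a simple filter over the annotated concepts. The following is a minimal sketch, not the authors' implementation: it assumes each concept arrives as a (name, semantic type) pair, which is a hypothetical stand-in for MetaMap output, and the names RELEVANT_TYPES and group_by_category are illustrative.

```python
# Categories and their UMLS semantic types, as listed on the slide.
# "Diagnostic Procedure" is the standard UMLS name for the diagnostic type.
RELEVANT_TYPES = {
    "Symptoms and Signs": {"Sign or Symptom", "Finding"},
    "Diseases": {"Disease or Syndrome", "Pathologic Function"},
    "Procedures": {"Therapeutic or Preventive Procedure", "Diagnostic Procedure"},
    "Body Parts": {"Body Location or Region", "Body Part, Organ, or Organ Component"},
    "Medicaments": {"Pharmacologic Substance"},
}


def group_by_category(concepts):
    """Group (name, semantic_type) pairs by category; drop irrelevant types."""
    grouped = {category: [] for category in RELEVANT_TYPES}
    for name, semantic_type in concepts:
        for category, types in RELEVANT_TYPES.items():
            if semantic_type in types:
                grouped[category].append(name)
    return grouped


# Illustrative input only: the pairs below are assumed annotations.
example = [
    ("Coughing", "Sign or Symptom"),
    ("Bone Marrow Transplantation", "Therapeutic or Preventive Procedure"),
    ("Eleven", "Quantitative Concept"),  # not a relevant type, so it is dropped
]
grouped = group_by_category(example)
```

Keeping the mapping in one dictionary means the expert's choice of semantic types can be changed without touching the filtering logic.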

Our Proposal: Negation Detection

• According to the expert, absent or negated criteria (e.g. "On admission, the patient had no internal bleeding") are not relevant for the task

• Thus, negated UMLS concepts are ignored

• Negations in medical records usually appear in a small number of forms, which are easy to identify with a simple lexical scanner built from regular expressions

Our Proposal: Negation Detection

Lexical Pattern                                              Examples
no|without|rule out + concept + (or concept)*                No pneumonia; Without fever or cough
no|without|rule out + adj + concept + (or concept)*          No significant hydronephrosis; Rule out cardiac abnormality
no|without + noun + of + concept + (or concept)*             No signs of tuberculosis; Without evidence of hydroureter
evaluate for + (noun|adj)? + concept + (or concept)*         Evaluate for foreign body; Evaluate for abnormalities
lack of|absence of + (noun|adj)? + concept + (or concept)*   Lack of kyphosis; Absence of heart murmur
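The lexical patterns listed here can be approximated with a single regular expression. This is a rough sketch under stated assumptions, not the scanner used by the authors: concept mentions are passed in as plain strings (in the real pipeline they would come from MetaMap), the optional modifier is approximated by one arbitrary word rather than a part-of-speech check, and the five trigger groups are merged into one alternation, which is slightly more permissive than the table. The function name negated_concepts is illustrative.

```python
import re

# Negation triggers taken from the lexical patterns on the slide.
TRIGGERS = r"(?:no|without|rule out|evaluate for|lack of|absence of)"


def negated_concepts(sentence, concepts):
    """Return the concept mentions that fall inside a negation pattern."""
    if not concepts:
        return set()
    # Longest mentions first, so multi-word concepts win over their substrings.
    ordered = sorted(concepts, key=len, reverse=True)
    concept = "(?:" + "|".join(re.escape(c) for c in ordered) + ")"
    pattern = re.compile(
        # trigger, then optionally "<noun> of" or one modifier word,
        # then a concept followed by any number of "or <concept>"
        rf"\b{TRIGGERS}\s+(?:\w+\s+of\s+|\w+\s+)?"
        rf"(?P<scope>{concept}(?:\s+or\s+{concept})*)",
        re.IGNORECASE,
    )
    found = set()
    for match in pattern.finditer(sentence):
        mentions = re.findall(concept, match.group("scope"), re.IGNORECASE)
        found.update(m.lower() for m in mentions)
    return found
```

For instance, `negated_concepts("Without fever or cough", ["fever", "cough"])` flags both concepts, while a sentence with no trigger yields an empty set. Concepts returned by this function would be dropped before the graphs are built.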

Our Proposal: Semantic Graph Representation

• First, the concepts are retrieved from the UMLS Metathesaurus along with their complete hierarchy of hypernyms (is-a relations)

• Second, all concept hierarchies for each category are merged, building a single graph for each category in the EHR

• Finally, each concept is assigned a weight, using the Jaccard similarity coefficient, which attaches greater importance to specific concepts than to general ones
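The slides do not spell out which sets the Jaccard coefficient compares, so the sketch below makes an explicit assumption: a concept's weight is the Jaccard coefficient between its own ancestor set and the ancestor set of the leaf concept of its hierarchy (each set includes the concept itself). On a single root-to-leaf chain this reduces to depth divided by chain length, which agrees with the 1/5 to 5/5 labels in the deck's figures. The merge rule for nodes shared by several chains (keep the smallest weight) is also an assumption, and the function names are illustrative.

```python
from fractions import Fraction


def chain_weights(chain):
    """Weight every concept on a root-to-leaf hypernym chain.

    Assumed weighting: Jaccard coefficient between the concept's ancestors
    and the leaf's ancestors, both sets including the concept itself.
    """
    leaf_ancestors = set(chain)
    weights = {}
    for depth, concept in enumerate(chain, start=1):
        ancestors = set(chain[:depth])
        weights[concept] = Fraction(
            len(ancestors & leaf_ancestors), len(ancestors | leaf_ancestors)
        )
    return weights


def merge_chains(chains):
    """Merge several chains into one weighted node set (one category graph).

    Assumption: a node shared by several chains keeps its smallest weight.
    """
    graph = {}
    for chain in chains:
        for concept, weight in chain_weights(chain).items():
            graph[concept] = min(weight, graph.get(concept, weight))
    return graph


# Hypernym chain for "Coughing", general to specific, as named in the deck.
cough_chain = [
    "Clinical finding",
    "Finding by site",
    "Respiratory finding",
    "Functional finding of respiratory tract",
    "Coughing",
]
cough_graph = merge_chains([cough_chain])
```

Using `Fraction` keeps the weights exact, so they can be compared directly with the fractions shown in the figures.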

Our Proposal: Semantic Graph Representation

[Figure: concept graphs built for the sample record, one per category. Weights from 1/5 to 5/5 label the five levels of a hierarchy. Nodes shown for each category:]

Procedure: Procedure; Procedure by method; Transplantation; Grafting procedure; Transplantation of bone marrow; Bone Marrow Transplantation

Sign or Symptom: Clinical finding; Finding by site; Respiratory finding; Functional finding of respiratory tract; Coughing

Body Part: Physical anatomical entity; Body structure; Thoracic structure; Structure of thoracic viscus; Lung structure; Left lung structure; Thoracic cavity structure; Structure of base of left lung; Thoracic cage structure; Musculoskeletal structure of thorax; Bone structure of thoracic vertebra; Structure of body of thoracic vertebra; Entire body of thoracic vertebra; ...

Disease or Syndrome: Disease, Disorder or finding; Disease or disorder; ...; Hematopoietic Neoplasm; Leukemia; Lymphoid leukemia; Acute Lymphocytic Leukemia

Our Proposal: Ranking Similar EHR

• We compute the similarity between the reference EHR and every record in the database, and rank the records accordingly

• Given two graphs, A and B, the similarity of A to B is measured as follows:

1. Each concept of A that is not in B contributes a score of 0, while each concept of A that is also in B contributes a score equal to its weight in graph A
2. The scores of all concepts in A are summed
3. The sum is normalized by dividing it by the maximum similarity
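The scoring steps for this slide can be sketched in a few lines. This is an illustrative reading rather than the authors' code: graphs are represented as dictionaries from concept name to weight, the maximum similarity is assumed to be the sum of all weights in graph A, and the record identifiers and graph contents in the example are hypothetical.

```python
from fractions import Fraction


def similarity(graph_a, graph_b):
    """Similarity of A to B: weights of A's concepts found in B, normalized."""
    votes = sum((w for c, w in graph_a.items() if c in graph_b), Fraction(0))
    # Assumption: the maximum similarity is the total weight of graph A.
    max_similarity = sum(graph_a.values(), Fraction(0))
    return votes / max_similarity if max_similarity else Fraction(0)


def rank_records(reference, database):
    """Rank database records by decreasing similarity to the reference graph."""
    scored = [(rid, similarity(reference, graph)) for rid, graph in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: a reference graph and two candidate records.
reference = {
    "Clinical finding": Fraction(1, 5),
    "Finding by site": Fraction(2, 5),
    "Respiratory finding": Fraction(3, 5),
    "Functional finding of respiratory tract": Fraction(4, 5),
    "Coughing": Fraction(5, 5),
}
database = {
    "record-1": {c: reference[c] for c in
                 ("Clinical finding", "Finding by site", "Respiratory finding")},
    "record-2": dict(reference),
}
ranking = rank_records(reference, database)
```

Note that the measure is asymmetric: only the weights of graph A are used, so the similarity of A to B generally differs from that of B to A.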

Our Proposal: Ranking Similar EHR

$$\text{Similarity} = \frac{\text{Votes}}{\text{MaxSimilarity}} = \frac{\frac{1}{11} + \frac{2}{11} + \dots + \frac{9}{11}}{\frac{1}{11} + \frac{2}{11} + \dots + \frac{11}{11} + \frac{3}{5} + \frac{4}{5} + \frac{5}{5}} = 0.4869$$

[Figure: Graph A and Graph B with per-concept weights (1/11 to 11/11 and 3/5 to 5/5). Nodes shown include: Clinical finding; Finding by site; Disorder by body site; Disease; Infectious disease; Bacterial pneumonia; Pneumococcal pneumonia; Pneumonia due to Streptococcus; Mycoplasma pneumonia; Pneumonia due to anaerobic bacteria; Pneumonia due to pleuropneumonia; Virus Diseases; Respiratory finding; Functional finding of respiratory tract; Coughing; ...]

Experiments

• Test collection: 50 radiology reports from the CMC-NLP 2007 Challenge corpus

• Query collection: a subset of 20 reports from the test collection

• Two hospital physicians were asked to select, for each report in the query collection, the most similar reports within the test collection

• There is substantial agreement between the judges (Kappa test, k = 0.7980)

• The precision and recall of our method are compared with those obtained by a term-based approach

Results

                 Graph-based method              Term-based method
                 Precision  Recall  F-score      Precision  Recall  F-score
Union-5          0.530      0.765   0.626        0.470      0.708   0.564
Intersection-5   0.420      0.884   0.569        0.340      0.729   0.463
Union-3          0.717      0.676   0.696        0.467      0.476   0.471
Intersection-3   0.600      0.788   0.681        0.333      0.487   0.395

                 Graph-based method      Term-based method
                 Precision  F-score      Precision  F-score
Union            0.707      0.692        0.594      0.632
Intersection     0.745      0.742        0.503      0.598
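The slides do not state the F-score formula. The graph-based figures in the first table are consistent with the balanced F-score, that is, the harmonic mean of precision and recall, F = 2PR / (P + R). The short sketch below recomputes it from the reported precision and recall; the function name f_score is illustrative.

```python
def f_score(precision, recall):
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Graph-based rows of the first table: (precision, recall, reported F-score).
graph_based = {
    "Union-5": (0.530, 0.765, 0.626),
    "Intersection-5": (0.420, 0.884, 0.569),
    "Union-3": (0.717, 0.676, 0.696),
    "Intersection-3": (0.600, 0.788, 0.681),
}
recomputed = {name: round(f_score(p, r), 3) for name, (p, r, _) in graph_based.items()}
```

Small differences in the last digit can still appear in other columns, since the published precision and recall values are themselves rounded.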

Conclusion and Future Work

• The method achieves relatively high precision and recall, which are also well balanced

• UMLS occasionally fails to recover relevant concepts, especially when they are expressed in shortened forms

• Spelling errors in the clinical records are another obstacle to concept identification

• Future work will test the method on a different evaluation collection, with longer medical records structured in sections
