Retrieval of Similar Electronic Health Records
using UMLS Concept Graphs
Laura Plaza and Alberto Díaz
Universidad Complutense de Madrid
• When facing complex and atypical cases, physicians need to refer to similar previous cases
• The adoption of EHRs by office-based physicians and hospitals is increasing
• But the time required to find similar records can still be prohibitive if no effective access is provided
Motivation
Given a reference record, retrieve others from the clinical database that are similar to the reference one
Retrieval of Similar Electronic Health Records Using UMLS Concept Graphs (Plaza and Díaz, 2010) 2
• A mix of highly structured information and idiosyncratic narrative text
• Unique sublanguage characteristics:
  - Verbless sentences, punctuation, spelling errors
  - Synonyms and homonyms
  - Neologisms
  - Acronyms and abbreviations
• When can two EHRs be considered similar?
A Different IR Task
Two EHR are Similar if…
Same symptom or sign (e.g. fever or 5 kg weight loss)
Same diagnosis (e.g. bacterial pneumonia)
Same test or procedure (e.g. cerebral NMR or endoscopy biopsy)
Same medicament (e.g. clopidogrel)
But … absent criteria are not relevant for the task!
• UMLS consists of three main components: the Specialist Lexicon, the Metathesaurus and the Semantic Network
• We use MetaMap to translate free-form text to Metathesaurus concepts
• Advantages:
  - Broad coverage
  - Performs word sense disambiguation
  - Numerous entries for acronyms and abbreviations
  - Etc.
Using UMLS for Concept Annotation
• A four-step graph-based method:
  1. Extraction of UMLS concepts
  2. Negation detection
  3. Semantic graph-based representation
  4. Ranking similar EHR
Our Proposal
CLINICAL HISTORY: Eleven years old with ALL, bone marrow transplant on Jan. 2, now with 3 day history of cough.
IMPRESSION: No focal pneumonia. Likely chronic changes at the left lung base. Mild anterior wedging of the thoracic vertebral bodies.
• We use MetaMap to extract the UMLS concepts from the Metathesaurus and their semantic types from the Semantic Network
• But, according to the expert, not all concepts are relevant to the task
• Thus, the expert mapped the similarity criteria to semantic types, and only concepts of those types are considered
Our Proposal: Extracting UMLS Concepts
Our Proposal: Extracting UMLS Concepts
Category              UMLS Semantic Types
Symptoms and Signs    Sign or Symptom; Finding
Diseases              Disease or Syndrome; Pathologic Function
Procedures            Therapeutic or Preventive Procedure; Diagnostic Procedure
Body Parts            Body Location or Region; Body Part, Organ, or Organ Component
Medicaments           Pharmacologic Substance
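The category-to-type mapping can be applied as a simple filter over the extracted concepts. The following is a minimal sketch, assuming each concept arrives as a (name, semantic type) pair. That pair format, the names `CATEGORY_TYPES` and `filter_concepts`, and the example concepts are illustrative assumptions, not MetaMap's actual output format.

```python
# Category -> UMLS semantic types, following the expert's mapping.
CATEGORY_TYPES = {
    "Symptoms and Signs": {"Sign or Symptom", "Finding"},
    "Diseases": {"Disease or Syndrome", "Pathologic Function"},
    "Procedures": {"Therapeutic or Preventive Procedure", "Diagnostic Procedure"},
    "Body Parts": {"Body Location or Region", "Body Part, Organ, or Organ Component"},
    "Medicaments": {"Pharmacologic Substance"},
}


def filter_concepts(concepts):
    """Group (name, semantic_type) pairs by category, dropping other types."""
    by_category = {category: [] for category in CATEGORY_TYPES}
    for name, semantic_type in concepts:
        for category, types in CATEGORY_TYPES.items():
            if semantic_type in types:
                by_category[category].append(name)
    return by_category


# Hypothetical annotator output for one record (illustrative only).
example = [
    ("Coughing", "Sign or Symptom"),
    ("Pneumonia", "Disease or Syndrome"),
    ("Bone Marrow Transplantation", "Therapeutic or Preventive Procedure"),
    ("Eleven", "Quantitative Concept"),  # not in any category, so it is dropped
]
relevant = filter_concepts(example)
```

Keeping one list per category matches the later step in which a separate graph is built for each category.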
• According to the expert, absent or negated criteria (e.g. On admission, the patient had no internal bleeding) are not relevant for the task
• Thus, negated UMLS concepts are ignored
• Negations in medical records usually appear in a small number of forms that are easy to identify with a simple lexical scanner built from regular expressions
Our Proposal: Negation Detection
Our Proposal: Negation Detection
Lexical patterns and examples

Pattern:  no|without|rule out + concept + (or concept)*
Examples: No pneumonia; Without fever or cough

Pattern:  no|without|rule out + adj + concept + (or concept)*
Examples: No significant hydronephrosis; Rule out cardiac abnormality

Pattern:  no|without + noun + of + concept + (or concept)*
Examples: No signs of tuberculosis; Without evidence of hydroureter

Pattern:  evaluate for + (noun|adj)? + concept + (or concept)*
Examples: Evaluate for foreign body; Evaluate for abnormalities

Pattern:  lack of|absence of + (noun|adj)? + concept + (or concept)*
Examples: Lack of kyphosis; Absence of heart murmur
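The lexical patterns can be approximated with a single regular expression. This is a rough sketch, not the authors' scanner: it assumes the concept phrases have already been identified and are passed in as strings, it lets any single word stand in for the adj/noun slot (no part-of-speech tagging), and it merges the five patterns into one slightly more permissive expression. The function name `negated_concepts` is hypothetical.

```python
import re

# Negation triggers taken from the five lexical patterns.
TRIGGERS = r"(?:no|without|rule out|evaluate for|lack of|absence of)"


def negated_concepts(sentence, concepts):
    """Return the subset of `concepts` (lowercased) that a trigger negates."""
    if not concepts:
        return set()
    # Longest phrases first so multi-word concepts win over their substrings.
    ordered = sorted(concepts, key=len, reverse=True)
    concept = "(?:" + "|".join(re.escape(c) for c in ordered) + r")\b"
    # trigger + optional modifier word + optional "of" + concept (or concept)*
    pattern = re.compile(
        rf"\b{TRIGGERS}\s+(?:\w+\s+(?:of\s+)?)?"
        rf"({concept}(?:\s+or\s+{concept})*)",
        re.IGNORECASE,
    )
    negated = set()
    for match in pattern.finditer(sentence):
        for name in re.findall(concept, match.group(1), re.IGNORECASE):
            negated.add(name.lower())
    return negated


impression = "No focal pneumonia. Likely chronic changes at the left lung base."
ignored = negated_concepts(impression, ["pneumonia", "lung base"])
```

Here only the concept that follows the trigger is flagged, so a concept later in the text that no trigger precedes is kept for the graph-building step.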
• First, the concepts are retrieved from the UMLS Metathesaurus along with their complete hierarchy of hypernyms (is-a relations)
• Second, all concept hierarchies for each category are merged, building a unique graph for each category in the EHR
• Finally, each concept is assigned a weight, using the Jaccard similarity coefficient, attaching greater importance to specific concepts than to general ones
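The slides do not spell out which sets the Jaccard coefficient compares. One reading, consistent with fractional weights such as 1/5 to 5/5 on a five-level hierarchy, is to compare a concept's ancestor set (itself included) with the ancestor set of the most specific concept on the same branch. The sketch below implements that assumed reading for a single root-to-leaf path; the function name `concept_weights` and the example path are illustrative.

```python
from fractions import Fraction


def concept_weights(path):
    """Weight each concept on a root-to-leaf hypernym path.

    Assumed reading: the weight is the Jaccard coefficient between the
    concept's ancestor set and the leaf's ancestor set (each including
    the concept itself), so weights grow with specificity.
    """
    leaf_ancestors = set(path)
    weights = {}
    for depth, concept in enumerate(path, start=1):
        ancestors = set(path[:depth])
        weights[concept] = Fraction(
            len(ancestors & leaf_ancestors), len(ancestors | leaf_ancestors)
        )
    return weights


# Illustrative is-a path, from the most general to the most specific concept.
cough_path = [
    "Clinical finding",
    "Finding by site",
    "Respiratory finding",
    "Functional finding of respiratory tract",
    "Coughing",
]
weights = concept_weights(cough_path)
```

Under this reading the weight of a concept reduces to its depth divided by the depth of the leaf, which is why the most specific concept always receives weight 1.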
Our Proposal: Semantic Graph Representation
Our Proposal: Semantic Graph Representation
[Figure: semantic graphs built for the example record, one per category (Disease or Syndrome, Procedure, Sign or Symptom, Body Part).
Disease or Syndrome nodes: Disease, Disorder or finding; Disease or disorder; ...; Hematopoietic Neoplasm; Leukemia; Lymphoid leukemia; Acute Lymphocytic Leukemia.
Procedure nodes: Procedure; Procedure by method; Transplantation; Grafting procedure; Transplantation of bone marrow; Bone Marrow Transplantation.
Sign or Symptom nodes: Clinical finding; Finding by site; Respiratory finding; Functional finding of respiratory tract; Coughing.
Body Part nodes: Physical anatomical entity; ...; Body structure; Thoracic structure; Thoracic cavity structure; Structure of thoracic viscus; Lung structure; Left lung structure; Structure of base of left lung; Thoracic cage structure; Musculoskeletal structure of thorax; Bone structure of thoracic vertebra; Structure of body of thoracic vertebra; Entire body of thoracic vertebra.
Concept weights shown: 1/5, 2/5, 3/5, 4/5, 5/5.]
• We compute the similarity between the reference EHR and every record in the database, and rank the records accordingly
• Given two graphs, A and B, the similarity of A to B is measured as follows:
  1. Each concept of A that is not in B is assigned a score of 0, while each concept of A that is also in B is assigned a score equal to its weight in graph A
  2. The scores of all concepts in A are summed
  3. The result is normalized in the interval [0, maximum similarity]
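The scoring steps can be sketched as follows. This is a minimal illustration that assumes the normalization divides the summed score by the maximum attainable score, taken here to be the sum of all concept weights in A, and that A is the graph of the reference record. The names `similarity` and `rank`, and the example weights, are hypothetical.

```python
from fractions import Fraction


def similarity(weights_a, concepts_b):
    """Similarity of graph A to graph B.

    weights_a: concept -> weight for every concept in A.
    concepts_b: set of concepts present in B.
    """
    # Concepts of A missing from B score 0; shared ones score their weight in A.
    votes = sum(w for c, w in weights_a.items() if c in concepts_b)
    # Assumption: the maximum similarity is reached when B contains all of A.
    max_similarity = sum(weights_a.values())
    return votes / max_similarity if max_similarity else 0


def rank(reference_weights, database):
    """Return record ids sorted from most to least similar to the reference."""
    scores = {
        record_id: similarity(reference_weights, concepts)
        for record_id, concepts in database.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Illustrative weights for a five-level hierarchy in the reference record.
path = ["Clinical finding", "Finding by site", "Respiratory finding",
        "Functional finding of respiratory tract", "Coughing"]
reference = {concept: Fraction(i, 5) for i, concept in enumerate(path, start=1)}

database = {
    "r1": set(path),      # shares every concept with the reference
    "r2": set(path[:3]),  # shares only the three most general concepts
    "r3": set(),          # shares nothing
}
ranking = rank(reference, database)
```

Because scores come from the weights in A only, the measure is asymmetric: the similarity of A to B generally differs from that of B to A.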
Our Proposal: Ranking Similar EHR
Our Proposal: Ranking Similar EHR
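$$\text{Similarity} = \frac{\text{Votes}}{\text{MaxSimilarity}} = \frac{\frac{9}{11} + \dots + \frac{2}{11} + \frac{1}{11}}{\frac{5}{5} + \frac{4}{5} + \frac{3}{5} + \frac{11}{11} + \dots + \frac{2}{11} + \frac{1}{11}} = 0.4869$$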
[Figure: Graph A and Graph B, the two concept graphs being compared.
Nodes shown: Clinical finding; Finding by site; Disorder by body site; Disease; Infectious disease; Virus Diseases; Bacterial pneumonia; Pneumococcal pneumonia; Pneumonia due to Streptococcus; Mycoplasma pneumonia; Pneumonia due to anaerobic bacteria; Pneumonia due to pleuropneumonia; Respiratory finding; Functional finding of respiratory tract; Coughing; ...
Concept weights shown: 1/11, 2/11, 3/11, 8/11, 9/11, 10/11, 11/11 and 3/5, 4/5, 5/5.]
• Test collection: 50 radiology reports from the CMC-NLP 2007 Challenge corpus
• Query collection: a subset of 20 reports from the test collection
• Two hospital physicians were asked to select, for each report in the query collection, the most similar reports within the test collection
• There is substantial agreement between the judges (Kappa test, k = 0.7980)
• Precision and recall of our method are compared with those obtained by a term-based approach
Experiments
Results
                  Graph-based method               Term-based method
                  Precision  Recall  F-score       Precision  Recall  F-score
Union-5           0.530      0.765   0.626         0.470      0.708   0.564
Intersection-5    0.420      0.884   0.569         0.340      0.729   0.463
Union-3           0.717      0.676   0.696         0.467      0.476   0.471
Intersection-3    0.600      0.788   0.681         0.333      0.487   0.395

                  Graph-based method       Term-based method
                  Precision  F-score       Precision  F-score
Union             0.707      0.692         0.594      0.632
Intersection      0.745      0.742         0.503      0.598
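The slides do not define the F-score. Assuming it is the balanced F-measure (the harmonic mean of precision and recall), the graph-based figures that report both precision and recall can be reproduced with the short sketch below; the function name `f_score` is illustrative.

```python
def f_score(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) reported for the graph-based method.
graph_based = {
    "Union-5": (0.530, 0.765),
    "Intersection-5": (0.420, 0.884),
    "Union-3": (0.717, 0.676),
    "Intersection-3": (0.600, 0.788),
}
recomputed = {name: round(f_score(p, r), 3) for name, (p, r) in graph_based.items()}
```

Recomputing from three-decimal inputs can differ from a reported value in the last digit, since the reported scores were presumably computed from unrounded precision and recall.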
• The method achieves relatively high and well-balanced precision and recall (e.g. 0.717 and 0.676 for Union-3)
• UMLS occasionally fails to recover relevant concepts, especially when they are expressed in shortened forms
• Concept identification is also impaired by spelling errors in the clinical records
• Future work will test the method on a different evaluation collection with longer medical records structured in different sections
Conclusion and Future Work