Extraction of Adverse Drug Effects from Clinical Records E. ARAMAKI* Ph.D., Y. MIURA **, M. TONOIKE ** Ph.D., T. OHKUMA ** Ph.D., H. MASHUICHI ** Ph.D., K.WAKI * Ph.D. M.D., K.OHE * Ph.D. M.D., Our material is Discharge Summary

Extraction of Adverse Drug Effects from Clinical Records


Page 1: Extraction of Adverse Drug Effects from Clinical Records

Extraction of Adverse Drug Effects from Clinical Records

E. ARAMAKI* Ph.D., Y. MIURA**, M. TONOIKE** Ph.D., T. OHKUMA** Ph.D., H. MASHUICHI** Ph.D., K. WAKI* Ph.D. M.D., K. OHE* Ph.D. M.D.

* University of Tokyo, Japan

** Fuji Xerox, Japan

Our material: discharge summaries

Page 2: Extraction of Adverse Drug Effects from Clinical Records

Background

• The use of Electronic Health Records (EHR) in hospitals is increasing rapidly everywhere
• They contain much clinical information about a patient's health

BUT: many natural-language texts!

Extracting clinical information from the reports is difficult because they are written in natural language

Page 3: Extraction of Adverse Drug Effects from Clinical Records

NLP based Adverse Effect Detecting System

• We are developing an NLP system that extracts medical information, especially adverse effects, from the natural-language parts of clinical records
• INPUT
– a medical text (discharge summary)
• OUTPUT
– Date Time
– Medication Event
– Adverse Effect Event

≒ i2b2 Medication Challenge

But our target focuses only on adverse effects

Adverse Effect Relation (AER)

Page 4: Extraction of Adverse Drug Effects from Clinical Records

Why Adverse Effect Relations?

• Clinical trials usually target only a single drug.
• BUT: real patients sometimes take multiple medications, which leaves a gap between clinical trials and the actual use of drugs.
• To ensure patient safety, it is extremely important to capture new/unknown AEs at an early stage.

Page 5: Extraction of Adverse Drug Effects from Clinical Records

DEMO is available on

http://mednlp.jp

Page 6: Extraction of Adverse Drug Effects from Clinical Records

Adverse Effect Relation Estimation: System Demo

Page 7: Extraction of Adverse Drug Effects from Clinical Records

Adverse Effect Relation Estimation: System Demo

has no complications at the time of diagnosis 6/23-25 FOLFOX6 2nd. 6/24, 25: moderate fever (38℃) again. a fever reducer….

Adverse Effect

Medication

Relation

Page 8: Extraction of Adverse Drug Effects from Clinical Records

The Point of This Study

• (1) Preliminary Investigation: How much information actually exists?
– We annotated adverse effect information in discharge summaries
• (2) NLP Challenge: Could current NLP retrieve it?
– We investigated the accuracy with which current techniques can extract adverse effect information

Page 9: Extraction of Adverse Drug Effects from Clinical Records

Outline

• Introduction
• Preliminary Investigation
– How much information actually exists in discharge summaries?

• NLP Challenge

• Conclusions

Page 10: Extraction of Adverse Drug Effects from Clinical Records

Material & Method

• Material: 3,012 Japanese discharge summaries
• Three human annotators marked possible adverse effects in the following 2 steps

Step 1: Event Annotation (XML tag = Event)

<D>Lasix</D> for <S>hypertension</S> is stopped due to <S>his headache</S>.

Step 2: Relation Annotation (XML attribute = Relation)

<D rel="1">Lasix</D> for <S>hypertension</S> is stopped due to <S rel="1">his headache</S>.
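To make the two-step annotation concrete, here is a minimal sketch of how such an annotated sentence could be read back. It assumes that `D` marks a medication, `S` marks a symptom, and that a shared `rel` value links the two; the function name `parse_annotation` and the wrapping `<s>` element are illustrative and not part of the original annotation scheme.

```python
import xml.etree.ElementTree as ET


def parse_annotation(sentence):
    """Return (events, relations) from one annotated sentence.

    events:    list of (tag, text, rel) for every <D>/<S> element
    relations: list of (medication, symptom) pairs sharing a rel value
    """
    # Wrap the sentence in a root element so that it parses as XML.
    root = ET.fromstring(f"<s>{sentence}</s>")
    events = [
        (el.tag, el.text, el.get("rel"))
        for el in root.iter()
        if el.tag in ("D", "S")
    ]
    relations = [
        (d_text, s_text)
        for d_tag, d_text, d_rel in events
        if d_tag == "D" and d_rel is not None
        for s_tag, s_text, s_rel in events
        if s_tag == "S" and s_rel == d_rel
    ]
    return events, relations


annotated = (
    '<D rel="1">Lasix</D> for <S>hypertension</S> '
    'is stopped due to <S rel="1">his headache</S>.'
)
events, relations = parse_annotation(annotated)
print(events)
print(relations)
```

In this reading, "hypertension" is an event without a relation, while "Lasix" and "his headache" form one adverse effect relation.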

Page 11: Extraction of Adverse Drug Effects from Clinical Records

Annotation Policy & Process

• We regard only MedDRA/J terms (an adverse effect terminology) as events.
• We regarded even a suspicion of an adverse effect as positive data.
• Annotating the entire data set is time-consuming → we split the data into 2 sets
– SET-A (event-rich parts): contains keywords such as Stop, Change, Adverse effect, Side effect. Fully annotated.
– SET-B: the rest. Randomly sampled & annotated.
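The keyword-based split can be sketched as follows. This is an illustration only: the English keyword list is taken from the slide, whereas the real corpus is Japanese, and the case-insensitive substring matching and the name `split_summaries` are assumptions.

```python
# Keywords from the slide; the actual system would use Japanese equivalents.
KEYWORDS = ("stop", "change", "adverse effect", "side effect")


def split_summaries(summaries, keywords=KEYWORDS):
    """Split summaries into SET-A (contains a keyword) and SET-B (the rest)."""
    set_a, set_b = [], []
    for text in summaries:
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            set_a.append(text)
        else:
            set_b.append(text)
    return set_a, set_b


docs = [
    "Lasix was stopped due to headache.",
    "Patient discharged in good condition.",
]
set_a, set_b = split_summaries(docs)
print(len(set_a), len(set_b))
```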

Page 12: Extraction of Adverse Drug Effects from Clinical Records

14.5% (SET-A) × 53.5% + 85.5% (SET-B) × 11.3% = 17.4%
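The estimate above is a weighted average over the two sets. The sketch below reproduces the arithmetic, assuming that 14.5% and 85.5% are the shares of SET-A and SET-B in the corpus and that 53.5% and 11.3% are the fractions of summaries in each set that contain adverse effect information; the variable names are illustrative.

```python
# Share of each set in the corpus (assumed reading of the slide).
share_a, share_b = 0.145, 0.855
# Fraction of summaries with adverse effect information in each set.
rate_a, rate_b = 0.535, 0.113

overall = share_a * rate_a + share_b * rate_b
print(f"{overall:.1%}")
```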

Page 13: Extraction of Adverse Drug Effects from Clinical Records

Results of Preliminary Investigation

• About 17% of discharge summaries contain adverse effect information.
– Even allowing for the fact that this figure includes mere suspicions of adverse effects, the summaries are a valuable source of AE information.
• Discharge summaries are therefore a suitable resource for our purpose.

Page 14: Extraction of Adverse Drug Effects from Clinical Records

Outline

• Introduction
• Preliminary Investigation
• NLP Challenge
– Could the current NLP technique retrieve the AEs?

• Conclusions

Page 15: Extraction of Adverse Drug Effects from Clinical Records

Combination of 2 NLP Steps

• The 2 NLP steps correspond directly to the two annotation steps

Example: Lasix for hyperpiesia is stopped due to the pain in the head.
– Lasix: Medication
– hyperpiesia: symptom
– the pain in the head: symptom
– Adverse Effect Relation: links a Medication to a symptom

• Event Annotation ≒ Named Entity Recognition task
• Relation Annotation = Relation Extraction task, one of the hottest NLP research topics

Page 16: Extraction of Adverse Drug Effects from Clinical Records

Step1: Event Identification

• Machine Learning Method
– CRF (Conditional Random Field) based Named Entity Recognition (the state-of-the-art method at the i2b2 de-identification task)
• Features (a standard feature set)
– Lexicon (stemming), POS, dictionary-based feature (MedDRA), window size = 5
• Material
– SET-A corpus with event annotations
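The feature set above can be sketched as a per-token feature dictionary of the kind commonly fed to CRF toolkits. This is not the authors' implementation: it assumes that "window size = 5" means the current token plus two neighbours on each side, it omits stemming, and `MEDDRA_TERMS` is a tiny hypothetical stand-in for the MedDRA dictionary.

```python
# Hypothetical stand-in for a MedDRA term dictionary.
MEDDRA_TERMS = {"headache", "fever"}


def token_features(tokens, pos_tags, i, half_window=2):
    """Build lexical, POS, and dictionary features for token i.

    A half_window of 2 gives a 5-token window centred on token i.
    """
    features = {}
    for offset in range(-half_window, half_window + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            word = tokens[j].lower()
            features[f"word[{offset}]"] = word
            features[f"pos[{offset}]"] = pos_tags[j]
            features[f"in_dict[{offset}]"] = word in MEDDRA_TERMS
        else:
            # Mark positions that fall outside the sentence.
            features[f"pad[{offset}]"] = True
    return features


tokens = ["Lasix", "is", "stopped", "due", "to", "headache"]
pos_tags = ["NN", "VBZ", "VBN", "JJ", "TO", "NN"]
sentence_features = [token_features(tokens, pos_tags, i) for i in range(len(tokens))]
print(sentence_features[5])
```

A sequence labeller would then be trained on one such list of dictionaries per sentence, paired with the event labels from SET-A.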

Page 17: Extraction of Adverse Drug Effects from Clinical Records

Step1: Result of Event Identification

• Result Summary

Cat. of Event      Precision  Recall  F-measure
Medication Event   86.99      81.34   0.84
AE Event           85.56      80.24   0.82

• All accuracies are above 80% (P, R) and F > 0.80, demonstrating the feasibility of our approach
• Considering that the corpus is small (435 summaries), event detection appears to be a comparatively easy task

Page 18: Extraction of Adverse Drug Effects from Clinical Records

Step2: Relation Extraction Method

• Basic Approach ≒ Protein-Protein Interaction (PPI) task [BioNLP2009 Shared Task]
• Example: Lasix for hypertension is stopped due to his headache

For each m (Medications)
  For each a (Adverse Effects)
    judge_it_has_AER(m, a)

(1) judge_it_has_AER(Lasix, hypertension)
(2) judge_it_has_AER(Lasix, headache)
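The nested loop amounts to judging every medication/adverse-effect pair in a text. A minimal sketch follows, with the judgment passed in as a function; `candidate_pairs`, `extract_relations`, and the toy `judge` are illustrative names, not the authors' code.

```python
from itertools import product


def candidate_pairs(medications, adverse_effects):
    """Enumerate every (medication, adverse effect) pair to be judged."""
    return list(product(medications, adverse_effects))


def extract_relations(medications, adverse_effects, judge):
    """Keep only the pairs that the judge function accepts."""
    return [
        (m, a)
        for m, a in candidate_pairs(medications, adverse_effects)
        if judge(m, a)
    ]


# Toy judge for illustration only: accepts "headache" as an adverse effect.
def judge(m, a):
    return a == "headache"


print(candidate_pairs(["Lasix"], ["hypertension", "headache"]))
print(extract_relations(["Lasix"], ["hypertension", "headache"], judge))
```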

Page 19: Extraction of Adverse Drug Effects from Clinical Records

Two judgment methods

• (1) PTN-BASED: heuristic rules using a set of keywords & word distance
– Example: .. is on ACTOS but stopped for relief of the edema.
– ACTOS = <medication>, stopped = keyword, edema = <adverse effect>
– Distance from medication to keyword: n=1; from keyword to adverse effect: n=4
– judge_it_has_AER(m, a, keyword=stopped, windowsize=5)
• (2) SVM-BASED: machine learning approach
– Features: distance & words between the two events (medication & adverse effect)

See the proceedings for details
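One possible reading of the pattern-based rule is sketched below: a pair is accepted when a keyword lies between the medication and the adverse effect, with at most `window` words on either side of the keyword. The exact rule is in the proceedings; this interpretation, the whitespace tokenization, and the lowercase function name are assumptions.

```python
def judge_it_has_aer(tokens, m_idx, a_idx, keywords=frozenset({"stopped"}), window=5):
    """Return True if a keyword sits between the two events within the window.

    m_idx and a_idx are token positions of the medication and adverse effect.
    """
    lo, hi = sorted((m_idx, a_idx))
    for k in range(lo + 1, hi):
        if tokens[k].lower() in keywords:
            gap_before = k - lo - 1  # words between first event and keyword
            gap_after = hi - k - 1   # words between keyword and second event
            if gap_before <= window and gap_after <= window:
                return True
    return False


tokens = "is on ACTOS but stopped for relief of the edema".split()
m_idx = tokens.index("ACTOS")
a_idx = tokens.index("edema")
print(judge_it_has_aer(tokens, m_idx, a_idx))
```

On the slide's example the gaps are 1 and 4 words, matching n=1 and n=4, so the pair is accepted with a window of 5.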

Page 20: Extraction of Adverse Drug Effects from Clinical Records

Step2: Result of Relation Extraction

            Precision  Recall  F-measure
PTN-BASED   41.1%      91.7%   0.650
SVM-BASED   57.6%      62.3%   0.598

• Both PTN & SVM accuracies are low (F ≤ 0.65) → the relation extraction task is difficult!
• SVM accuracy is significantly (p=0.05) lower than PTN accuracy, because
– (1) the corpus is small
– (2) positive data << negative data
• Machine learning suffers from such small, imbalanced data

Page 21: Extraction of Adverse Drug Effects from Clinical Records

Outline

• Introduction
• Preliminary Investigation
• NLP Challenge
• Discussions
– (1) Overall Accuracy
– (2) Controllable Performance
– (3) Event Distribution

• Conclusions

Page 22: Extraction of Adverse Drug Effects from Clinical Records

Discussion (1/3) Overall Accuracy

• The overall accuracy is estimated by combining the accuracies of step 1 & step 2

Overall (= step1 × step2)

Precision 0.289 (= 0.855 × 0.869 × 0.390)

Recall 0.597 (= 0.802 × 0.813 × 0.917)

• Neither NLP step is perfect, so combining their imperfect results leads to low overall accuracy (in particular many false positives, i.e. low precision)
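The overall figures are simple products of the per-step values. The sketch below reproduces them, assuming the three factors are the AE event, medication event, and relation (PTN-BASED) scores in that order; the products come out at roughly 0.290 and 0.598, which agrees with the slide's 0.289 and 0.597 up to rounding.

```python
from math import prod

# Per-step values as given on the slide (assumed order: AE event,
# medication event, relation extraction).
precision_factors = (0.855, 0.869, 0.390)
recall_factors = (0.802, 0.813, 0.917)

overall_precision = prod(precision_factors)
overall_recall = prod(recall_factors)
print(f"precision ~ {overall_precision:.3f}, recall ~ {overall_recall:.3f}")
```

Multiplying the steps treats their errors as independent, which is itself a simplifying assumption.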

Page 23: Extraction of Adverse Drug Effects from Clinical Records

Discussion (2/3) Performance is Controllable

• The balance between recall & precision can be controlled

[Figure: precision & recall curve of the SVM, ranging from a high-precision setting to a high-recall setting]

That is a strong advantage of NLP
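The trade-off can be illustrated by moving the decision threshold applied to a classifier's scores. The scores and labels below are invented toy data, not results from the study, and `precision_recall` is an illustrative helper.

```python
# Toy classifier scores and gold labels (1 = true relation), invented for illustration.
SCORES = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
LABELS = [1, 1, 0, 1, 0, 0]


def precision_recall(scores, labels, threshold):
    """Compute precision and recall when scores >= threshold count as positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


for threshold in (0.7, 0.2):
    p, r = precision_recall(SCORES, LABELS, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

A high threshold gives the high-precision setting, and a low threshold gives the high-recall setting.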

Page 24: Extraction of Adverse Drug Effects from Clinical Records

Discussion (3/3) Event Distribution

• We investigated the overall AE frequency for each medication category.

[Figure: AE frequency distribution of Drug #1, comparing the distribution acquired from annotated real data with the distribution acquired from our system results]

Page 25: Extraction of Adverse Drug Effects from Clinical Records

Discussion (3/3) AER Distribution

• Then we ran a goodness-of-fit test, which measures the similarity between two distributions

         P-value
Med. 1   0.023
Med. 2   0.013
Med. 3   0.010
Med. 4   0.006
Med. 5   0.005
Total    0.011

• The p-value for the total (p=0.011 > 0.01) indicates that the two distributions are similar, i.e. the difference is not significant at the 0.01 level.
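The slides do not say which goodness-of-fit test was used. As one common choice, the sketch below computes a Pearson chi-square statistic that compares AE counts from system output against the proportions seen in annotated data. The counts, the proportions, and the name `chi_square_statistic` are hypothetical; turning the statistic into a p-value would additionally need the chi-square distribution with (number of categories − 1) degrees of freedom.

```python
def chi_square_statistic(observed_counts, expected_probs):
    """Pearson chi-square statistic of observed counts against expected proportions."""
    total = sum(observed_counts)
    statistic = 0.0
    for observed, prob in zip(observed_counts, expected_probs):
        expected = total * prob
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical example: AE category proportions in annotated data,
# and AE category counts produced by the system for the same drug.
annotated_probs = [0.5, 0.3, 0.2]
system_counts = [60, 25, 15]
print(chi_square_statistic(system_counts, annotated_probs))
```

A statistic near zero means the system's aggregate distribution closely follows the annotated one.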

Page 26: Extraction of Adverse Drug Effects from Clinical Records

Outline

• Introduction
• Preliminary Investigation
• NLP Challenge
• Discussions

• Conclusions

Page 27: Extraction of Adverse Drug Effects from Clinical Records

Conclusions (1/2)

• Preliminary Investigation:
– About 17% of discharge summaries contain adverse effect information.
– Discharge summaries are a suitable resource for AERs.
• NLP Challenge:
– Could NLP retrieve the AE information?
– Difficult! Overall accuracy is low.

Page 28: Extraction of Adverse Drug Effects from Clinical Records

Conclusions (2/2)

• BUT: 2 positive findings:
– (1) We can control the performance balance
– (2) Even though the accuracy is low, the aggregated results are similar to the real distribution
• IN THE FUTURE:
– A practical system that exploits the above advantages
– A more accurate method for relation extraction

Page 29: Extraction of Adverse Drug Effects from Clinical Records

Thank you

Contact Info
– Eiji ARAMAKI Ph.D.
– University of Tokyo
– [email protected]
– http://mednlp.jp