Three Stages of Evaluation for Syndromic Surveillance from Chief Complaint Classification. Wendy W Chapman, PhD; John N Dowling, MD, MS; Oleg Ivanov, MD, MPH, MS; Bob Olszewski, PhD; Michael M Wagner, MD, PhD. PowerPoint presentation.
2004 University of Pittsburgh
Three Stages of Evaluation for Syndromic Surveillance from Chief Complaint Classification
Wendy W Chapman, PhD
John N Dowling, MD, MS
Oleg Ivanov, MD, MPH, MS
Bob Olszewski, PhD
Michael M Wagner, MD, PhD
Introduction
• Syndromic surveillance from chief complaints is becoming common
  – Chief complaints are ubiquitous
  – Chief complaints are early
• Can we detect outbreaks by monitoring chief complaints that are classified into syndromic categories?
Outline
• Describe a three-stage approach for answering that question
• Describe a body of research applying the three-stage approach
• Discuss what we have learned by applying the three-stage approach
Stages of Evaluation in Medical Technology Development
• Technical Accuracy: Does the system do what it is trained to do?
  – No: stop
  – Yes: go to Diagnostic Accuracy
• Diagnostic Accuracy: Does the system diagnose patients correctly?
  – No: stop
  – Yes: go to Outcome Efficacy
• Outcome Efficacy: Does the system improve outcomes?
Three Stages of Evaluation in Syndromic Surveillance
• Technical Accuracy: Does the CC classifier accurately assign syndromic categories?
  – No: stop
  – Yes: go to Case Detection
• Case Detection: Does the syndromic category represent the patient’s state?
  – No: stop
  – Yes: go to Outbreak Detection
• Outbreak Detection: Can we detect outbreaks from chief complaints?
Methods
Stage 1: Technical Accuracy
Can we accurately classify a chief complaint string into a syndromic category?
• Determine whether the automated application performs its task
• Reference Standard
  – IS: Expert classification of chief complaint string
  – IS NOT: Patient’s actual syndrome
[Figure: Chief complaint → Chief Complaint Classifier → classifier syndromic category. Chief complaint → gold standard → gold standard syndromic category. The two categories are compared to measure CC classifier performance.]
Stage 2: Case Classification
Does the syndromic classification from the chief complaint accurately represent the patient’s clinical state?
• Results reflect quality of
  – Chief complaint classifier
  – Chief complaint content
• Reference Standard is patient’s actual syndrome
• Bulk of our work has been in Case Classification
  – More informative than Technical Accuracy
  – Easier to evaluate than Outbreak Detection
[Figure: Chief complaints of test cases → Chief Complaint Classifier → classifier syndromic classification. Medical records of test cases → gold standard → gold standard syndromic classification. The two classifications are compared to measure diagnostic accuracy. The gold standard is different than in the evaluation of Technical Accuracy.]
Stage 3: Outbreak Detection
Can we detect outbreaks by monitoring chief complaint classifications?
• Outcome metrics
  – Accuracy
  – Timeliness
• Reference Standard is an outbreak
• Most difficult evaluation to perform
  – Outbreaks are rare
[Figure: Population → chief complaints for population → Chief Complaint Classifier → syndromic categories for chief complaints → syndromic outbreak detection algorithms → detected outbreak. Population → standard outbreak detection methods → defined outbreak. The detected and defined outbreaks are compared to measure accuracy and timeliness of outbreak detection.]
Chief Complaint Classifier
CoCo: naïve Bayesian classifier
“SOB/cough” → CoCo →
Syndrome        Probability
Respiratory     0.97
GI              0.00
Constitutional  0.01
Rash            0.00
Hemorrhagic     0.00
Botulinic       0.00
Neurological    0.00
Other           0.02
• CoCo is open source
  – openrods.sourceforge.net
• CoCo can be trained on any syndromic categories using manually classified chief complaints
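To illustrate how a naïve Bayesian classifier can turn a chief complaint string into a probability for each syndrome, here is a minimal sketch. It is not CoCo's actual implementation: the tokenizer, the add-one smoothing, the function names, and the tiny training set are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

def tokenize(chief_complaint):
    # Lowercase and split on "/" and whitespace: "SOB/cough" -> ["sob", "cough"].
    return chief_complaint.lower().replace("/", " ").split()

def train(examples):
    """examples: list of (chief complaint string, syndrome label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def classify(text, model):
    """Return a dict mapping each syndrome to its posterior probability."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / denom)
        log_scores[label] = score
    # Normalize the log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}

# Hypothetical, manually classified training examples (not from the study).
TRAINING = [
    ("cough", "Respiratory"),
    ("sob", "Respiratory"),
    ("sob/wheezing", "Respiratory"),
    ("vomiting", "GI"),
    ("diarrhea/vomiting", "GI"),
    ("fever/weakness", "Constitutional"),
    ("rash", "Rash"),
    ("ankle injury", "Other"),
]

model = train(TRAINING)
posterior = classify("SOB/cough", model)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 2))
```

With a realistic training set the probabilities would differ from this toy example; the point is only that the highest-probability syndrome becomes the classification.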
Results
• Technical Accuracy
• Case Classification
• Outbreak Detection
Evaluations of Technical Accuracy
How well can we classify chief complaints into syndromes?
• Text Processing Application: CoCo
• Test Set: 28,990 chief complaints from Utah
• Reference standard: Physician classifications of chief complaints
• Outcome measure: Area under the ROC curve (AUC)
* Olszewski RT. Bayesian classification of triage diagnoses for the early detection of epidemics. In: Recent Advances in Artificial Intelligence: Proceedings of the Sixteenth International FLAIRS Conference; 2003:412-416.
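As a reminder of what the AUC outcome measure captures, the sketch below computes it for one syndrome as the probability that a randomly chosen positive chief complaint receives a higher classifier score than a randomly chosen negative one (ties count as half). The scores and reference labels are made-up values, and the function name `auc` is an assumption; this is not the evaluation code used in the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))

# Hypothetical classifier probabilities for "Respiratory" and the
# physician's reference classification (1 = respiratory, 0 = not).
scores = [0.97, 0.80, 0.40, 0.35, 0.10, 0.02]
labels = [1, 1, 0, 1, 0, 0]

respiratory_auc = auc(scores, labels)
print(f"AUC = {respiratory_auc:.3f}")
```

The pairwise form is quadratic in the number of cases; for a test set of tens of thousands of chief complaints a rank-based computation would be the more practical choice.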
Results: General Syndromes
Syndrome          CoCo (AUC)
Gastrointestinal  94.5%
Constitutional    93.1%
Respiratory       95.7%
Rash              91.0%
Hemorrhagic       92.6%
Botulinic         78.1%
Neurological      92.4%
Other             95.7%
How well can we identify specific findings in chief complaints?
• Text Processing Application: Keyword searches
• Reference standard: Physician identification of findings in chief complaints
• Outcome measure: Standard test statistics
Results: Specific Syndromes and Findings
Finding   Sensitivity  Specificity
Fever     100%         100%
Diarrhea  100%         100%
Vomiting  100%         100%
Evaluations of Case Classification
How Well Can We Identify Syndromic Cases from Chief Complaints?
• Text Processing Application: CoCo
• Test Set: 527,228 patients at University of Pittsburgh Medical Center (UPMC)
• Reference standard
  – Primary ICD-9 discharge diagnosis
  – Syndromic lists of ICD-9 codes
• Outcome measure: Standard test statistics
Results: Syndromic Case Classification
Syndrome      # pos. cases  Accuracy (%)  Sensitivity (%)  Specificity (%)  PPV (%)  NPV (%)
Respiratory   34,916        92.3          63.1             94.3             44.1     97.3
Botulinic     1,961         99.1          30.1             99.3             14.2     99.7
GI            20,431        94.6          69.0             95.6             38.8     98.7
Neurological  7,393         92.3          67.6             92.7             11.6     99.5
Rash          2,232         99.1          46.8             99.3             21.7     99.8
Constitut     10,603        95.6          45.8             96.6             21.9     98.8
Hemorrhagic   8,033         98.1          75.2             98.5             43.1     99.6
Mean = 62%
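The "standard test statistics" in the table above all derive from a 2×2 table of classifier output against the reference standard. The sketch below shows those definitions; the counts are hypothetical round numbers chosen for illustration and are not the study's data, and `diagnostic_statistics` is an assumed name.

```python
def diagnostic_statistics(tp, fp, fn, tn):
    """Standard test statistics from a 2x2 table.

    tp: classifier positive, reference positive
    fp: classifier positive, reference negative
    fn: classifier negative, reference positive
    tn: classifier negative, reference negative
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # fraction of true cases detected
        "specificity": tn / (tn + fp),  # fraction of non-cases correctly excluded
        "ppv": tp / (tp + fp),          # fraction of classifier positives that are true cases
        "npv": tn / (tn + fn),          # fraction of classifier negatives that are non-cases
    }

# Hypothetical counts: 100 true cases among 1,100 patients.
stats = diagnostic_statistics(tp=63, fp=57, fn=37, tn=943)
for name, value in stats.items():
    print(f"{name}: {value:.1%}")
```

Even with high specificity, PPV stays modest in this example because true cases are a small share of all patients, which is the same pattern the table shows for the less common syndromes.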
How Well Can We Identify Cases of Specific Syndromes from Chief Complaints?
• Text Processing Application: CoCo + keyword searches
• Reference standard: Physician review of ED report
• Outcome measure: Standard test statistics
Results: Case Classification of Specific Syndromes
Case Definition      Sensitivity  Specificity
Febrile              61%          100%
Febrile Respiratory  22%          99%
Diarrhea             10%          99%
Vomiting             15%          99%
Evaluations of Outbreak Detection
How Well Can We Identify Outbreaks from Chief Complaints?
• Text Processing Application: CoCo*
• Reference standard
  – Pediatric respiratory illness outbreaks (bronchiolitis, RSV)
  – Pediatric gastrointestinal illness outbreaks (Rotavirus)
• Outcome measure
  – Exponentially Weighted Moving Average (EWMA) detection algorithm for timeliness
  – Standard test statistics for accuracy
* Ivanov O, Gesteland PH, Hogan W, Mundorff MB, Wagner MM. Detection of pediatric respiratory and gastrointestinal outbreaks from free-text chief complaints. AMIA Annu Symp Proc. 2003:318-22.
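For readers unfamiliar with EWMA detection, the following is a minimal sketch of the general technique applied to daily syndrome counts. The smoothing weight, the threshold multiplier, the baseline window, and the daily counts are all assumptions for illustration; the slides do not state the parameters used in the study.

```python
import math
from statistics import mean, stdev

def ewma_alarms(counts, baseline_days=7, lam=0.4, k=3.0):
    """Return the indices of days on which the EWMA exceeds its control limit.

    counts: daily number of visits classified into one syndrome
    baseline_days: initial days used to estimate the expected level (assumed)
    lam: smoothing weight given to the newest observation (assumed)
    k: alarm threshold in standard deviations of the EWMA statistic (assumed)
    """
    baseline = counts[:baseline_days]
    mu, sigma = mean(baseline), stdev(baseline)
    # Asymptotic standard deviation of an EWMA is sigma * sqrt(lam / (2 - lam)).
    limit = mu + k * sigma * math.sqrt(lam / (2 - lam))
    z = mu  # start the smoothed series at the baseline mean
    alarms = []
    for day in range(baseline_days, len(counts)):
        z = lam * counts[day] + (1 - lam) * z
        if z > limit:
            alarms.append(day)
    return alarms

# Hypothetical daily respiratory counts with a rise starting on day 10.
daily_counts = [10, 12, 11, 9, 10, 13, 12, 11, 10, 12, 18, 22, 25, 30]
alarms = ewma_alarms(daily_counts)
first_alarm = alarms[0] if alarms else None
print("first alarm on day", first_alarm)
```

Timeliness can then be expressed as the difference between the first alarm date from chief complaints and the date the reference outbreak was identified by the comparison method.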
Respiratory Outbreaks (n = 3)
• Timeliness: 10.3 days earlier
• Sensitivity: 100%
• Specificity: 100%
GI Outbreaks (n = 3)
• Timeliness: 29 days earlier
• Sensitivity: 100%
• Specificity: 100%
Discussion
Are classified chief complaints good enough to use for syndromic surveillance?
Three phases of evaluation are important in answering this question
Technical Accuracy Evaluations
How well do classification methods perform?
• CoCo is quite good and very simple
• Keyword searches for fever, diarrhea, and vomiting have perfect accuracy
Case Classification Evaluations
How well can we identify syndromic cases from chief complaints?
• Sensitivity: 30% (Botulinic) to 75% (Hemorrhagic)
• PPV: 12% to 44%
• Outbreak must be larger to be detected
Which syndromes are best?
• Respiratory Syndrome – sensitivity 63%
• Febrile respiratory syndrome – sensitivity 22%
• GI Syndrome – sensitivity 69%
• Diarrhea – sensitivity 11%
• Vomiting – sensitivity 15%
• Should not make syndromic definitions too narrow
Outbreak Detection Evaluations
Can we detect outbreaks with classified chief complaints?
• Can detect pediatric respiratory and GI outbreaks
• Chief complaints contain signal for outbreaks
• Chief complaint signal is earlier than that of ICD-9 diagnoses
Summary
• Our research over the last few years has aimed at answering the question of how well we can detect outbreaks from chief complaints
• Three stages of evaluation are important in understanding the answer
  – Optimize and focus effort
  – Make evaluation more feasible
  – Evaluate question from different angles
  – Provide insight into related practical questions
    • Which syndromes are best?
Thank You
http://rods.health.pitt.edu/
  – NLP
Future Work
Chief Complaints
  – Improve CoCo
    • Synonym replacement, spell checking
    • Split into multiple chief complaints
  – Different chief complaint classification methods
    • M+ outperforms CoCo in technical accuracy
    • Have not tested M+ in diagnostic accuracy yet
  – Improve reference standard for Diagnostic Accuracy evaluations
    • 1,600 cases of 7 syndromes with physician judgment from ED notes
  – Increase number of outcome efficacy studies
    • Real outbreaks
Other Clinical Data
  – Chest radiograph reports
  – ED Reports
How Well Can We Identify Influenza Cases from Chief Complaints?
• Text Processing Applications
  – CoCo – Respiratory
  – CoCo – Constitutional
• Reference standard
  – Primary ICD-9 discharge diagnoses for Influenza
• Outcome measure
  – Standard test statistics
Results: Case Classification for Influenza
Classifier                     Sensitivity  Specificity
CoCo-Respiratory               0.22         0.92
CoCo-Constitutional            0.55         0.94
Respiratory OR Constitutional  0.77         0.86
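The last row combines the two classifiers with a logical OR: a patient counts as positive if either classifier flags them. The sketch below, using ten made-up patients and an assumed helper name, shows why this kind of combination can only raise sensitivity and can only lower specificity relative to each classifier alone.

```python
def sensitivity_specificity(flags, truth):
    """Sensitivity and specificity of boolean classifier flags against the truth."""
    pairs = list(zip(flags, truth))
    tp = sum(1 for f, t in pairs if f and t)
    fn = sum(1 for f, t in pairs if not f and t)
    fp = sum(1 for f, t in pairs if f and not t)
    tn = sum(1 for f, t in pairs if not f and not t)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical patients: the first four have influenza, the last six do not.
truth          = [True, True, True, True, False, False, False, False, False, False]
respiratory    = [True, False, False, False, True, False, False, False, False, False]
constitutional = [False, True, True, False, False, True, False, False, False, False]

# OR combination: positive if either classifier is positive.
either = [r or c for r, c in zip(respiratory, constitutional)]

resp_sens, resp_spec = sensitivity_specificity(respiratory, truth)
cons_sens, cons_spec = sensitivity_specificity(constitutional, truth)
either_sens, either_spec = sensitivity_specificity(either, truth)

print(f"Respiratory:    {resp_sens:.2f} / {resp_spec:.2f}")
print(f"Constitutional: {cons_sens:.2f} / {cons_spec:.2f}")
print(f"Either:         {either_sens:.2f} / {either_spec:.2f}")
```

Every case caught by one classifier is still caught by the union, and every false positive from one classifier is also a false positive of the union, which matches the direction of the trade-off in the table.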