February 22, 2005: I. Sim EHRs and Research Epi 206 — Medical Informatics Ida Sim, MD, PhD February 22, 2005 Division of General Internal Medicine, and Program in Biological and Medical Informatics UCSF Electronic Health Records for Clinical Research Copyright Ida Sim, 2005. All federal and state rights reserved for all original material presented in this course through any medium, including lecture or print.



Ida Sim, MD, PhD

February 22, 2005

Division of General Internal Medicine, and Program in Biological and Medical Informatics

UCSF

Electronic Health Records for Clinical Research

Copyright Ida Sim, 2005. All federal and state rights reserved for all original material presented in this course through any medium, including lecture or print.


The Promise

• Electronic health records (EHR)
  – clinical benefits
    • reduction in medical errors, prescription errors
    • supports quality improvement programs
  – research benefits
    • “Frankly, one of the biggest attractions to LastWord (now called CareCast) is going to be a boon to clinical research. Information will be accessible in a much more uniform and complete way.” ex-Dean Haile Debas, Daybreak, Feb. 2, 2001
• UCSF spending $50 mil on CareCast
• How real is the promise of EHRs for research?


Learning Objectives

• Understand key properties of useful electronic health records and data warehousing
  – free vs. coded entry
  – importance of a standardized clinical vocabulary
• Understand implications of database technologies on clinical research
• Be familiar with basic concepts in data security and privacy (time permitting)


Outline

• Example Study
  – a single-institution outcomes research question
• Electronic Health Records (EHRs)
  – relational databases
  – vocabulary
• Data Warehousing
• (Security and Privacy)


An Outcomes Research Project

• Retrospective analysis
• Compare 1-year re-admission rate for acute MI for
  – diabetics admitted with acute MI, discharged
    • on β-blockers
    • not on β-blockers
• First acute MI in 2000 to 2002, follow-up to 2004


Study Steps

• Find diabetics admitted with AMI, 2000 to 2002
• Find whether D/C’ed on β-blocker
• For these patients, find all re-admissions in the year following the index MI
  – identify re-admissions that were for acute MI
• Analyze
  – predictor = β-blocker status
  – primary outcome = acute MI readmission rate
  – secondary outcome = length of stay (LOS)
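The study steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the admissions list, the patient labels, and the `on_beta_blocker` lookup are hypothetical stand-ins for data that would really come from the ADT, chart, and pharmacy systems, and every patient here is assumed to be a diabetic whose first acute MI is the index event.

```python
from datetime import date, timedelta

# Hypothetical admissions: (patient_id, admit_date, principal_diagnosis)
admissions = [
    ("A", date(2000, 3, 1), "Acute MI"),   # index MI
    ("A", date(2000, 9, 15), "Acute MI"),  # re-admission within 1 year
    ("B", date(2001, 6, 10), "Acute MI"),  # index MI
    ("B", date(2003, 1, 5), "Acute MI"),   # more than 1 year later
    ("C", date(2002, 2, 2), "Acute MI"),   # index MI
    ("C", date(2002, 5, 1), "Pneumonia"),  # re-admission, but not for MI
]
# Hypothetical predictor: discharged on a beta-blocker after the index MI?
on_beta_blocker = {"A": True, "B": False, "C": True}

def readmitted_for_mi(patient):
    """True if the patient has an acute MI admission within 365 days of the index MI."""
    mi_dates = sorted(d for p, d, dx in admissions if p == patient and dx == "Acute MI")
    index = mi_dates[0]
    return any(index < d <= index + timedelta(days=365) for d in mi_dates[1:])

def readmission_rate(beta_blocker_status):
    """Primary outcome: share of patients in one predictor group re-admitted for MI."""
    group = [p for p, bb in on_beta_blocker.items() if bb == beta_blocker_status]
    return sum(readmitted_for_mi(p) for p in group) / len(group)

print(readmission_rate(True), readmission_rate(False))
```

The hard part in practice is not this arithmetic but obtaining trustworthy inputs for it, which is what the rest of the lecture addresses.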


Data Needed for β-Blocker Study

• Data needed
  – admission: Admission Discharge Transfer system
  – diabetes diagnosis: chart, HgbA1C
  – MI diagnosis: chart, troponins, EKG readings
    • or just trust coding of admission diagnosis?
  – β-blocker usage: orders, pharmacy
• Existing (legacy) systems
  – claims, pharmacy, ADT, lab, xray, med record, etc.

Health System Minnesota: 50 paper, 50 computer; 200,000 lives, 460 physicians


Data Collection Method

                            Pros    Cons
Chart Review
Electronic Health Record


What is an EHR?

• EHR provides individual patient data for
  – real-time clinical care
  – reimbursement (e.g., for E&M coding)
  – see table for major functionality dimensions
• Clinical workstation includes interfaces to
  – practice management systems
  – pharmacy benefit management
  – knowledge resources (e.g., WWW, guidelines)
• “EHRs” range from flat file, text-based systems to full-featured workstations


8 Types of EHR Functionality

Viewing: Electronic viewing of chart notes, problem and medication lists, discharge summaries, laboratory results, and radiology results.
Documentation: Entry of visit note and other information into the EMR, whether through dictation or direct keyboard entry.
Order Entry: Electronic physician order entry of drug prescriptions, laboratory tests, radiology studies, or referrals.
Care Planning and Management: Managing patients in disease management programs, such as for asthma or congestive heart failure.
Patient-Directed: Patient education materials; web-based education modules, self-diagnosis algorithms, patient viewing of EMR data, and e-mail with care providers.
Billing and Other Administrative: Determination of insurance eligibility, assistance with visit-level coding, management and tracking of referrals.
Performance Reporting: Quality and utilization reporting to both internal and external audiences.
Messaging: E-mail or other messaging system among providers and staff within the organization, or to external organizations.


Critical EHR Features

• Physician friendliness
  – if docs won’t use it, it won’t help research
• What data it contains
• How that data is stored (and retrieved)
• Security
• Cost, maintenance, technical support, etc.


Physician Friendliness

• Workflow compatible
  – portable
• Easy data entry
  – voice-recognition
  – pen-based (PDAs)
  – digital ink
• Preserves doctor-patient relationship
• Secure

[Image: Fujitsu 510]


EHR Structure and Contents

• Structure: should store contents in relational form
  – lots of one-to-many relationships in clinical care
• Contents
  – type of content
    • real-time clinical care
      – notes, orders, labs, prescriptions, xray (reports)...
    • administration
      – demographic, billing, provider IDs...
    • research
      – standardized data collection, symptom scales, etc.
  – form of content
    • free text versus coded entries


Relational Admissions Database

MasterTable
ID          Name   Sex  Birthdate  Insurance
000-01-001  Lee    M    09-Jul-00  B/T Healthnet
000-01-002  Smith  F    22-Oct-25  Medicare
000-01-003  Perez  F    13-Jun-57  B/T Pacificare

AdmissionNumberTable
Adm#  ID          Admit Date  Discharge Date
001   000-01-001  31-Dec-94   12-Jan-95
002   000-01-001  27-Mar-96   31-Mar-96
003   000-01-002  03-Feb-95   16-Feb-95
004   000-01-002  27-Feb-95   20-Mar-95
005   000-01-003  19-Nov-97   23-Nov-97

AdmissionTable
Adm#  Admit Service  Admit Diagnosis  Principal Discharge Diagnosis
001   Med            Acute MI         Acute MI
002   Med            COPD             Pneumonia
003   Surg           THR              THR
004   Med            Acute MI         Acute MI
005   Gyn            Menorrhagia      von Willebrand's

Secondary Discharge Diagnosis Table
Admission #  Secondary Discharge Diagnoses
001          COPD
001          Diabetes
002          COPD
003          Acute MI
004          VF Arrest
005          Diabetes
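To make the one-to-many structure concrete, the four tables above can be loaded into an in-memory SQLite database and joined. This is a minimal sketch using Python's standard `sqlite3` module; the SQL table and column names are simplified spellings chosen for this example, not names from any real EHR, and only the columns the query needs are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Master (ID TEXT PRIMARY KEY, Name TEXT, Sex TEXT);
CREATE TABLE AdmissionNumber (AdmNo TEXT PRIMARY KEY, ID TEXT);
CREATE TABLE Admission (AdmNo TEXT PRIMARY KEY, AdmitService TEXT,
                        AdmitDx TEXT, PrincipalDischargeDx TEXT);
CREATE TABLE SecondaryDx (AdmNo TEXT, Dx TEXT);
""")
conn.executemany("INSERT INTO Master VALUES (?, ?, ?)", [
    ("000-01-001", "Lee", "M"), ("000-01-002", "Smith", "F"), ("000-01-003", "Perez", "F")])
conn.executemany("INSERT INTO AdmissionNumber VALUES (?, ?)", [
    ("001", "000-01-001"), ("002", "000-01-001"), ("003", "000-01-002"),
    ("004", "000-01-002"), ("005", "000-01-003")])
conn.executemany("INSERT INTO Admission VALUES (?, ?, ?, ?)", [
    ("001", "Med", "Acute MI", "Acute MI"), ("002", "Med", "COPD", "Pneumonia"),
    ("003", "Surg", "THR", "THR"), ("004", "Med", "Acute MI", "Acute MI"),
    ("005", "Gyn", "Menorrhagia", "von Willebrand's")])
conn.executemany("INSERT INTO SecondaryDx VALUES (?, ?)", [
    ("001", "COPD"), ("001", "Diabetes"), ("002", "COPD"),
    ("003", "Acute MI"), ("004", "VF Arrest"), ("005", "Diabetes")])

# Admissions with a principal discharge diagnosis of acute MI in a patient
# who also carries a secondary discharge diagnosis of diabetes.
rows = conn.execute("""
SELECT m.ID, m.Name, a.AdmNo
FROM Admission a
JOIN AdmissionNumber n ON n.AdmNo = a.AdmNo
JOIN Master m ON m.ID = n.ID
JOIN SecondaryDx s ON s.AdmNo = a.AdmNo
WHERE a.PrincipalDischargeDx = 'Acute MI' AND s.Dx = 'Diabetes'
""").fetchall()
print(rows)
```

Note that the query only works because both diagnoses are spelled exactly the same way in every row, which is the motivation for the coding and vocabulary discussion that follows.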


What Goes Into the Table Cells?

• Column names help narrow the meaning
  – “Pneumonia” entry in AdmitDiagnosis column, vs.
  – “Pneumonia” entry in PastHistory column
• Free text vs. coded entries
  – “Pneumococcal pneumonia”, “Pn PNA”, or “RLL PNA”
  – Pneumonia: Yes, Organism: Pneumococcus, Location: Right Lower

AdmissionTable
Adm#  Admit Service  Admit Diagnosis  Principal Discharge Diagnosis
001   Med            Acute MI         Acute MI
002   Med            PNA-1            Pneumonia
003   Surg           THR              THR
004   Med            Acute MI         Acute MI
005   Gyn            Menorrhagia      von Willebrand's

PneumoniaTable
PNA #  Organism      Location
PNA-1  Pneumococcus  Right Lower
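The difference between free text and coded entries shows up as soon as you try to search. The sketch below is illustrative only: the three coded records are one plausible reading of the three free-text strings, and the field names are assumptions made for this example.

```python
# The three free-text variants from the slide
free_text_entries = ["Pneumococcal pneumonia", "Pn PNA", "RLL PNA"]

# A plain text search finds only the entry that spells the word out
text_hits = [e for e in free_text_entries if "pneumonia" in e.lower()]

# Hypothetical coded versions of the same three entries
coded_entries = [
    {"pneumonia": True, "organism": "Pneumococcus", "location": None},
    {"pneumonia": True, "organism": "Pneumococcus", "location": None},
    {"pneumonia": True, "organism": None, "location": "Right Lower"},
]

# A query on the coded fields finds every case, and can be narrowed by organism
coded_hits = [e for e in coded_entries if e["pneumonia"]]
pneumococcal = [e for e in coded_entries if e["organism"] == "Pneumococcus"]

print(len(text_hits), len(coded_hits), len(pneumococcal))
```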


Standardization of Clinical Terms

• A term is a designation of a concept or an object in a specific vocabulary
  – e.g., English blood = German Blut
• Standardization required for communication
  – acts like a dictionary
  – DGIM tried to use STOR to pull out all CHF patients for quality improvement program but terms used were too varied
    • CHF, LVF, heart failure, etc.
• Vocabularies are collections of terms
  – general standardized: ICD-9, CPT, MeSH
  – research-domain specific: for cancer, diabetes, etc...
  – your own data dictionary
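The CHF example can be illustrated with a small synonym table that maps each local term to one concept. This is a sketch, not a real vocabulary: the `HEART_FAILURE` label, the synonym list, and the patient problem list are all hypothetical.

```python
# Hypothetical mapping from locally used terms to a single concept label
SYNONYMS = {
    "chf": "HEART_FAILURE",
    "lvf": "HEART_FAILURE",
    "heart failure": "HEART_FAILURE",
    "congestive heart failure": "HEART_FAILURE",
}

# Hypothetical problem-list entries as different clinicians might type them
problem_list = {"pt1": "CHF", "pt2": "LVF", "pt3": "Heart failure", "pt4": "COPD"}

def to_concept(term):
    """Return the standard concept for a term, or None if the term is unmapped."""
    return SYNONYMS.get(term.strip().lower())

# Searching for the literal string misses most patients
literal_matches = [p for p, term in problem_list.items() if term == "CHF"]
# Searching by concept finds all of them
concept_matches = [p for p, term in problem_list.items()
                   if to_concept(term) == "HEART_FAILURE"]

print(literal_matches, concept_matches)
```

A standard vocabulary plays the role of `SYNONYMS` here, except that it is maintained by someone else and shared across institutions.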


Costs vs. Benefits of Coding

• The more coded and more structured your data, the more powerful computing you can do
  – because the computer can “understand” more
• But coding and structuring costs time and effort
  – e.g., selecting billing codes for outpatient practice
  – tiresome to pick codes for clinical care, let alone for even more specific codes needed for research
• Tradeoff between
  – costs of more coding and structuring, and
  – benefits to accrue from “smarter” computing
  – for both care and research


Notable Clinical Vocabularies

Vocabulary  Name                                                   Domain               Use
SNOMED-CT   Systematized Nomenclature of Medicine                  Clinical Medicine    EHR Documentation
MeSH        Medical Subject Headings                               Biomedical Indexing  Bibliographic Retrieval
ICD-9       International Classification of Diseases               Diseases             Billing
CPT         Current Procedural Terminology                         Medical Procedures   Billing
DSM-IV      Diagnostic and Statistical Manual of Mental Disorders  Psychiatry           Billing, Nosology
LOINC       Logical Observation Identifiers Names and Codes        Labs                 Lab systems, Billing
READ        Read Clinical Classification                           Clinical Medicine    EHRs in the UK


Dangers of ICD-9 Coding

• VBAC uterine rupture rate
  – 665.0 and 665.1 ICD-9 discharge codes used in study (NEJM 2001;345:3-8)
  – letter to editor: in 9 years of Massachusetts data
    • 716 patients with 665.0 and 665.1 discharged
    • reviewed 709 charts
    • 363 (51.2%) had actual uterine rupture
      – others had incidental extensions of C-section incision, or were incorrectly coded or typed
    • 674.1 (dehiscence of the uterine wound) used to code another 197 ruptures (or 35% of confirmed cases of uterine rupture)
    • i.e., sensitivity 65%, positive predictive value 51.2%
• Administrative codes are not ideal for research
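The two percentages follow directly from the counts on this slide. As a worked check: 363 of the 709 reviewed charts carrying codes 665.0 or 665.1 were true ruptures, which is the fraction of coded cases that are real (a positive predictive value), and those 363 are set against the 363 + 197 confirmed ruptures overall to give the sensitivity of the codes. The variable names below are chosen for this example.

```python
reviewed_coded_charts = 709      # charts with 665.0 / 665.1 that were reviewed
true_ruptures_coded = 363        # of those, confirmed uterine ruptures
ruptures_coded_as_674_1 = 197    # confirmed ruptures coded as 674.1 instead

# Of the charts the codes flagged, how many were real ruptures?
positive_predictive_value = true_ruptures_coded / reviewed_coded_charts

# Of all confirmed ruptures, how many did the 665.x codes capture?
confirmed_ruptures = true_ruptures_coded + ruptures_coded_as_674_1
sensitivity = true_ruptures_coded / confirmed_ruptures

print(f"PPV {positive_predictive_value:.1%}, sensitivity {sensitivity:.1%}")
```

Specificity cannot be computed from these figures, because the slide gives no count of deliveries without rupture that were correctly left uncoded.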


ICD-9 Concept Coverage

• How well would ICD-9 do in capturing a medical chart?
• Inpatient and outpatient charts from 4 medical centers abstracted into 3061 concepts [Chute, 96]
  – diagnoses, modifiers, findings, treatments and procedures, other
• Matching: 0=no match, 1=partial, 2=complete
  – 1.60 for diagnoses
  – 0.77 overall
  – ICD-9 augmented with CPT: overall 0.82


SNOMED-CT

• 364,400 health care concepts; 984,000 descriptions
• Formally constructed terminology
  – 18 high-level hierarchies
    • e.g., finding, organism, substance, body structure, event, social context
  – each concept can be described by many attributes
    • e.g., finding site = lung, associated-morphology = inflammation
  – encodes “knowledge”
    • pneumonia is an infection of the lung by an organism
  – can build up “post-coordinated” concepts to increase expressive power
    • pneumonia: finding-site=lung; finding-site=lower lobe; laterality=right; causative agent=pneumococcus
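One way to picture a post-coordinated concept is as a base concept plus a list of attribute and value refinements. The structure below is a simplified stand-in written for this example; it is not SNOMED-CT's actual expression syntax, and the helper name `has_refinement` is hypothetical.

```python
# Simplified stand-in for the post-coordinated pneumonia example on the slide.
# A list of pairs is used because an attribute (finding-site) can repeat.
pneumonia_case = {
    "concept": "pneumonia",
    "refinements": [
        ("finding-site", "lung"),
        ("finding-site", "lower lobe"),
        ("laterality", "right"),
        ("causative agent", "pneumococcus"),
    ],
}

def has_refinement(expression, attribute, value):
    """True if the expression carries the given attribute=value refinement."""
    return (attribute, value) in expression["refinements"]

# Because each refinement is a separate coded item, queries can target any of them
print(has_refinement(pneumonia_case, "causative agent", "pneumococcus"))
print(has_refinement(pneumonia_case, "laterality", "left"))
```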


Using SNOMED-CT

• Semantic coverage
  – best coverage of all clinical vocabularies
  – test of 1603 concepts in 20 ophthalmology cases
    • SNOMED-CT 1.625 +/- 0.667 (0=no match, 1=partial, 2=complete)
    • ICD9-CM 0.280 +/- 0.619
• Site-licensed (i.e., is free) to entire U.S. as of 2004
  – the de facto standard for EHR clinical vocabulary
• Coding barriers
  – how to get docs to reliably pick the right code out of 364,000?
  – coded data entry biggest barrier to more computable EHRs
• Financial barriers
  – was $50K per site, now free but site license is only till 2009


Research Data Dictionaries

• Research data dictionaries are lists of study variables and their definitions
• Standardization of data dictionaries facilitates data sharing, merging, and meta-analysis
• Terms in data dictionaries should ideally come from a standard clinical vocabulary
  – e.g., SOB? shortness of breath? breathlessness?
    • ICD-9: Dyspnea and other respiratory abnormalities (786.0)
    • CPT: no matching concept or term
    • SNOMED: Dyspnea


Notable Research Data Dictionaries

• Defined by clinical domain
  – e.g., Common Data Elements (CDE, from the NCI)
    • standardized variables for breast, lung, cervical, prostate CA
    • http://ncicb.nci.nih.gov/CDEBrowser/
  – e.g., HCFA’s MedQuest modules
    • a fib, CHF, diabetes, pneumonia, orthopedics, etc.
• Defined by a research community
  – e.g., NCI, UCSF


CDE Example #1


CDE Example #2

• Menopausal Status: “Indication of whether a woman is potentially fertile or not.”
• Allowed values:
  – Post (Prior bilateral ovariectomy, OR >12 mo since LMP with no prior hysterectomy and not currently receiving therapy with LH-RH analogs [e.g., Zoladex])
  – Post (Prior bilateral ovariectomy, OR >12 mo since LMP with no prior hysterectomy)
  – Pre (<6 mo since LMP AND no prior bilateral ovariectomy, AND not on estrogen replacement)
  – Above categories not applicable AND Age < 50
  – Above categories not applicable AND Age >= 50


EHR for Research Summary

• An EHR is not automatically going to help clinical research
  – if it’s all unstructured free text, it won’t help much at all
    • the more structured it is (i.e., more defined fields), the better
  – if it’s just coded sporadically in ICD-9
    • problem with gamed codes
    • poor coverage of many clinical concepts
  – if it’s coded in SNOMED
    • some clinical concepts still not well covered
• EHR better than chart review; can we do even better?


Outline

• Sample Study
  – a single-institution outcomes research question
• Electronic Health Records (EHRs)
  – relational databases
  – vocabulary
• Data Warehousing
• Security and Privacy


Types of Queries

• Clinical care
  – What was Mr. Smith’s last potassium?
  – Does he have an old CXR for comparison?
  – What antihypertensives has he been on before?
  – What did the neurology consult say about his epilepsy?
• Research
  – What % of diabetics with AMI admissions were discharged on β-blockers?
  – What was the average Medicine length of stay in 2004 compared to 2000?
  – What is the trend in use of head CTs in patients with migraine?
  – Is admission creatinine an independent predictor of bacteremia outcomes?


What is a Data Warehouse?

• Integrated historical data common to entire enterprise

[Diagram labels: MICU, Finance, Research, QA; Data Warehouse; Internet; ADT, Chem, EHR, XRay, PMB, Claims]


Types of Data Warehouses

• A data warehouse is just a collection of data from other databases
  – is itself just a database
• Two somewhat distinct types
  – clinical data repository
    • collects data from day-to-day clinical care, admin data, etc.
    • for quality improvement, outcomes research, business decision making…
  – research data repository
    • collects data from multiple research projects
    • may also collect data from day-to-day clinical care, admin data, etc.


Data Warehouses: Hype and Hope

• Touted for
  – business decision making
  – health care quality improvement
  – outcomes research
  – genotype-phenotype correlations for translational research
• UCSF Clinical and Genomic Information Management (CGIM) database (now defunct)
  – was a $4-6 million partnership with IBM
  – goal: a single repository of research data from all UCSF research projects, plus data from STOR, radiology, CareCast etc. to enable
    • correlation of clinical, genomic, imaging, etc. data across data sets
• Stanford’s STRIDE research data repository
  – CareCast and research data


Why are Data Warehouses Useful?

• Need many types of data for research and QI
• E.g., for our outcomes study, need
  – admission: ADT (admission/discharge/transfer) system
  – diabetes diagnosis: e-chart, HgbA1C
  – MI diagnosis: e-chart, troponins, EKG readings
  – β-blocker usage: online ordering, pharmacy system
• Existing (legacy) systems
  – claims, pharmacy, ADT, lab, xray, med record, etc.
  – Health System Minnesota with 50 computer systems, 50 paper systems (200,000 lives, 460 physicians)


Data Warehousing Procedure

• Extract data from legacy systems
• Clean data and feed it to warehouse
• Allow ad hoc use
  – data query, data mining, data analysis
• Service users
  – modify data content based on queries
  – provide standard reports
  – provide alerts to trends
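The extract, clean, and load steps can be sketched as follows. Everything specific here is an assumption made for illustration: the legacy record layout, the `PREFERRED_NAME` mapping, and the warehouse table are hypothetical, and the date parsing assumes an English locale for month abbreviations.

```python
import sqlite3
from datetime import datetime

# Extract: hypothetical rows pulled from legacy systems with inconsistent conventions
legacy_records = [
    {"adm": "001", "dx": "PNA", "admit": "31-Dec-94"},
    {"adm": "002", "dx": " pneumonia", "admit": "27-Mar-96"},
    {"adm": "003", "dx": "Acute MI", "admit": "03-Feb-95"},
]

# Hypothetical preferred names for diagnosis strings seen in the legacy data
PREFERRED_NAME = {"pna": "Pneumonia", "pneumonia": "Pneumonia"}

def clean(record):
    """Clean: normalize the diagnosis name and convert the date to ISO format."""
    raw_dx = record["dx"].strip()
    dx = PREFERRED_NAME.get(raw_dx.lower(), raw_dx)
    admit = datetime.strptime(record["admit"], "%d-%b-%y").date().isoformat()
    return record["adm"], dx, admit

# Load: feed the cleaned rows into the warehouse (an in-memory database here)
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE admissions (adm TEXT, dx TEXT, admit_date TEXT)")
warehouse.executemany("INSERT INTO admissions VALUES (?, ?, ?)",
                      [clean(r) for r in legacy_records])

# Ad hoc use: a query that only works because the names were standardized
pneumonia_count = warehouse.execute(
    "SELECT COUNT(*) FROM admissions WHERE dx = 'Pneumonia'").fetchone()[0]
print(pneumonia_count)
```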


Prerequisites for Data Warehouse Construction

• For uploading data to warehouse
  – a physical connection between the computers
  – common data transmission protocols
    • e.g., HL-7
  – common database communication protocol
    • e.g., SQL over TCP/IP (the Internet's transport protocols)
• For sharing and merging
  – common data schema
    • type (e.g., relational)
    • data modeling (i.e., column names)
  – common naming of data items
    • e.g., “PNA” vs. “pneumonia”


Networking

• Requires physical networking and transmission standards (protocols)

[Diagram labels: MICU, Finance, Research, QA; Warehouse; Internet; ADT, Chem, EHR, XRay, PMB, Claims]


Health-Specific Network Protocols

• Health Level 7 (HL-7)
  – “original HL7”
  – HL7 RIM: full object model of all of health care
• Digital Imaging and Communications in Medicine (DICOM)
  – common data exchange format for medical images

MSH|…message header
PID|…patient identifier
<!-OBX…observation result>
OBX|1|ST|84295^NA||150|mmol/l|136-148|H||A|F|19850301<CR>
OBX|2|ST|84132^K+||4.5|mmol/l|3.5-5|N||N|F|19850301<CR>
OBX|3|ST|82435^CL||102|mmol/l|94-105|N||N|F|19850301<CR>
OBX|4|ST|82374^CO2||27|mmol/l|24-31|N||N|F|19850301<CR>
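The OBX segments above are plain delimited text, so a receiving system can pull out each result by splitting on the field separator. The sketch below handles only the four OBX lines shown and reads field positions straight off this sample; it is not a general HL7 parser, and the dictionary keys are names chosen for this example.

```python
# The four observation segments from the slide, joined by carriage returns
message = "\r".join([
    "OBX|1|ST|84295^NA||150|mmol/l|136-148|H||A|F|19850301",
    "OBX|2|ST|84132^K+||4.5|mmol/l|3.5-5|N||N|F|19850301",
    "OBX|3|ST|82435^CL||102|mmol/l|94-105|N||N|F|19850301",
    "OBX|4|ST|82374^CO2||27|mmol/l|24-31|N||N|F|19850301",
])

def parse_obx(segment):
    """Split one OBX segment into the fields used in this example."""
    fields = segment.split("|")
    code, name = fields[3].split("^")   # e.g. "84295^NA" is code and test name
    return {
        "code": code,
        "name": name,
        "value": float(fields[5]),
        "units": fields[6],
        "reference_range": fields[7],
        "flag": fields[8],              # "H" = high, "N" = normal in this sample
    }

results = [parse_obx(s) for s in message.split("\r") if s.startswith("OBX")]
abnormal = [r["name"] for r in results if r["flag"] != "N"]
print(abnormal)
```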


Prerequisites for Data Warehouse Construction

• For uploading data to warehouse
  – a physical connection between the computers
  – common data transmission protocols
    • e.g., HL-7
  – common database communication protocol
    • e.g., SQL over TCP/IP (the Internet's transport protocols)
• For sharing and merging
  – common data schema
    • type (e.g., relational)
    • data modeling (i.e., column names)
  – common naming of data items
    • e.g., “PNA” vs. “pneumonia”


Data Warehouse Contents

[Diagram labels: MICU, Finance, Research, QA; ???; Internet; ADT, Chem, EHR, XRay, PMB, Claims]


UCSF “CGIM” Example

• Standard coding vocabulary? data representation?
• Are queries mostly within or across projects? ongoing or completed projects or both?
• Need administrative data (e.g., insurance)?

[Diagram: Project 1 to Project 4, each with its own database (DB 1 to DB 4), alongside Radiology and STOR, CGIM, and Data mining/Display Tools]
  Data format labels: (in SNOMED-CT); microarray data (in MAGE-OM); xrays/CT/MRI (in DICOM)
  Project variable labels:
    • Breast CA (not DCIS); Menopause
    • Osteoporosis (Heel US); Menopause
    • Osteoporosis (DXA); Menopause
    • Breast CA (DCIS ok); Alzheimers (path)


What’s the Warehouse for?

• Clinical care
  – What was Mr. Smith’s last potassium?
  – Does he have an old CXR for comparison?
• Research
  – What % of diabetics with AMI admissions were discharged on β-blockers?
  – What was the average Medicine length of stay in 2004 compared to 2000?
• Use same schema for warehouse and EHR?
  – should depend on anticipated queries
• Anticipated use has huge implications for design (and eventual worth) of warehouse
  – if you don’t know what you want, no technology will give it to you


Clinical Data Warehouse Schema

Discharge Diagnoses
Discharge Diagnosis  Admission #  LOS  Service  Team  Attending
Acute MI             001          13   Med      II    Red
Acute MI             004          22   Med      I     Blue
THR                  003          14   Surg     III   Bronze
COPD                 002          5    Med      II    White
Metrorrhagia         005          4    Gyn      A     Buff

Discharge Meds for AMI Admissions Table
Admission #  Aspirin on D/C  Beta-Blocker on D/C    Statin on D/C         ACE Inhibitor on D/C
001          ASA 325 mg QD   Atenolol 50 mg QD      Simvastatin 20 mg QD  Lisinopril 10 mg QD
004          ECA 81 mg QD    Metoprolol 100 mg BID  Atorvastatin 40 mg QD Ramipril 5 mg QD

• diagnoses would be ICD-9 codes
• one warehouse for both clinical improvement and research?
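Because this schema is organized around the questions being asked, the research queries from earlier in the lecture become short. The sketch below loads a subset of the two tables into SQLite; the SQL table and column names are simplified spellings chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DischargeDiagnoses (DischargeDiagnosis TEXT, AdmNo TEXT, LOS INTEGER, Service TEXT);
CREATE TABLE DischargeMedsAMI (AdmNo TEXT, BetaBlockerOnDC TEXT);
""")
conn.executemany("INSERT INTO DischargeDiagnoses VALUES (?, ?, ?, ?)", [
    ("Acute MI", "001", 13, "Med"), ("Acute MI", "004", 22, "Med"),
    ("THR", "003", 14, "Surg"), ("COPD", "002", 5, "Med"),
    ("Metrorrhagia", "005", 4, "Gyn")])
conn.executemany("INSERT INTO DischargeMedsAMI VALUES (?, ?)", [
    ("001", "Atenolol 50 mg QD"), ("004", "Metoprolol 100 mg BID")])

# Mean length of stay for acute MI admissions, and the percentage of those
# admissions with a beta-blocker recorded at discharge. COUNT(column) skips
# NULLs, so admissions with no matching medication row are not counted.
mean_los, pct_beta_blocker = conn.execute("""
SELECT AVG(d.LOS), 100.0 * COUNT(m.BetaBlockerOnDC) / COUNT(*)
FROM DischargeDiagnoses d
LEFT JOIN DischargeMedsAMI m ON m.AdmNo = d.AdmNo
WHERE d.DischargeDiagnosis = 'Acute MI'
""").fetchone()
print(mean_los, pct_beta_blocker)
```

The same convenience is also the limitation: a table built for AMI discharge medications answers AMI questions well and little else, which is why anticipated use should drive the design.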


Choosing a Vocabulary

• For an EHR
  – billing: ICD-9, CPT, Read code (in UK)
  – clinical data capture: SNOMED-CT best
  – research: any is better than none!
• For your own research databases
  – if standard domain-specific data dictionary exists, use it
  – if not, ideally use a standard clinical vocabulary
    • often ICD-9 or CPT, or SNOMED
  – try not to define your own terms and your own definitions
    • upfront work will make it easier to share data later…


Examples from the UK

• UK practices contribute to research databases
  – e.g., General Practice Research Database
• Features that make them work
  – National Health Service covering treatment, prescriptions, etc.
  – 90% of UK practices use EHRs
  – Dept. of Health mandates Read code for all general practice
  – relatively simple “research warehouse” data structure
    • registration, prescription, problems/diagnoses, notes files
• Weaknesses/cautions
  – biases in patients and/or practices included or excluded
  – completeness and accuracy of reporting (e.g., 90% sensitive for DM)
  – lots of information in free text in the notes files (e.g., specialist referrals, drug dosing instructions)
  – needs dedicated staff and resources to maintain


Data Warehouse Summary

• Enterprise viewpoint more appropriate for research than patient viewpoint of EHR
• Integrates data from multiple sources
  – need standardization of codes, definitions, and data formats
• Schema can evolve to optimize for analytic needs
  – can make or modify tables off of legacy systems
• Querying and processing occurs “offline”
  – little impact on real-time clinical care

                Viewpoint   Time        Queries
EHR             Patient     Real-Time   Clinical
Data Warehouse  Enterprise  Historical  Ad Hoc


Study Steps Using EHR

• Compare 1-year re-admission rate for acute MI in diabetics discharged on β-blockers or not
  – data captured in EHR and other databases
  – data aggregated in data warehouse
  – you query the data warehouse: NOT YET…


Outline

• Sample Study
  – a single-institution outcomes research question
• Electronic Health Records (EHRs)
  – relational databases
  – vocabulary
• Data Warehousing
• Security and Privacy


Privacy vs. Security

• Security (a technical feature)
  – confidentiality
    • ensuring that only authorized persons can read or copy information
    • encryption of data during transmission impedes eavesdropping only
  – integrity
    • ensuring that information is modified only in appropriate ways
  – availability
    • ensuring that information is not made inaccessible
• Privacy (a legal concept) -- see HIPAA
  – right to keep personal information from outside world
    • study nurse, data entry clerk, investigator, database administrator, etc. may be authorized to see data but may disclose it inappropriately

Page 49

Network Security

• Physical security
  – firewalls
• Encryption
  – public/private keys
• People security
  – authority
  – authentication
  – access
  – audit

[Figure labels: Internet, Firewall, LAN, ucsf.edu, itsa, jaundice]

Page 50

People Security

• Authentication
  – are you who you say you are?
    • use passwords, biometrics (e.g., retinal scan), smartcards
• Authority
  – do you have a need to know?
    • different levels of data access for different users
• Access
  – how to allow only authenticated users to perform authorized activities on authorized data?
• Audit
  – record of who actually got into what
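The four elements of people security can be sketched together in a few lines. This is a toy illustration under stated assumptions, not a production design: the user name, password, roles, and action names are all hypothetical, and a real system would keep credentials and the audit log in protected storage rather than in memory.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2 (standard library)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def make_user(password: str, role: str) -> dict:
    salt = os.urandom(16)  # per-user random salt
    return {"salt": salt, "pw": hash_password(password, salt), "role": role}


# Hypothetical users, roles, and permitted actions.
USERS = {"nurse1": make_user("s3cret", "study_nurse")}
PERMISSIONS = {
    "study_nurse": {"read_labs"},
    "investigator": {"read_labs", "read_outcomes"},
}
AUDIT_LOG = []  # record of who actually attempted what


def authenticate(user: str, password: str) -> bool:
    """Authentication: are you who you say you are?"""
    rec = USERS.get(user)
    if rec is None:
        return False
    return hmac.compare_digest(rec["pw"], hash_password(password, rec["salt"]))


def authorized(user: str, action: str) -> bool:
    """Authority: does this user's role have a need to know?"""
    return action in PERMISSIONS.get(USERS[user]["role"], set())


def access(user: str, password: str, action: str) -> bool:
    """Access: allow only authenticated users to do authorized things; audit every attempt."""
    ok = authenticate(user, password) and authorized(user, action)
    AUDIT_LOG.append((user, action, "granted" if ok else "denied"))
    return ok


first = access("nurse1", "s3cret", "read_labs")       # role permits this action
second = access("nurse1", "s3cret", "read_outcomes")  # role does not permit this action
```

Denied attempts are logged as well as granted ones, since an audit trail is only useful if it also shows who tried to get into what.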

Page 51

De-identification Isn’t Easy

• 87% of the American populace can be uniquely identified by only [Sweeney, L. ‘97]
  – date of birth
    • in room of 23 people, what is chance that 2 people will share the same birthday (independent of year of birth)?
    • http://www.people.virginia.edu/~rjh9u/birthday.html
  – gender
  – five-digit ZIP code
  – easy to find someone’s info if you’re looking for it; harder to find out whose info it is that you have
• Anonymizing databases does not remove your duty to enforce security and safeguard privacy
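The birthday question on this slide can be worked out directly. The sketch below assumes 365 equally likely birthdays and ignores leap years and seasonal variation; the function name is just for illustration.

```python
def p_shared_birthday(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday.

    Computed as 1 minus the probability that all n birthdays are distinct.
    """
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


print(round(p_shared_birthday(23), 3))  # → 0.507
```

So with only 23 people the chance of a shared birthday is already slightly better than even, which is why date of birth combines so powerfully with other fields.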

Page 52

Summary of Privacy & Security

• Computing/network infrastructure can deal with security
  – but privacy is a policy matter

• Anonymizing of databases helps but it isn’t foolproof

• In general, people are the weakest security and privacy link
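One way to see why anonymizing isn’t foolproof is to count how many records remain unique on a few leftover fields. The sketch below uses entirely made-up toy records and hypothetical field names (`dob`, `gender`, `zip`); it only illustrates the check and says nothing about real population figures.

```python
from collections import Counter


def unique_fraction(records, keys=("dob", "gender", "zip")):
    """Fraction of records whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Made-up records with names already removed.
toy = [
    {"dob": "1950-03-01", "gender": "F", "zip": "94143"},
    {"dob": "1950-03-01", "gender": "F", "zip": "94143"},
    {"dob": "1962-11-17", "gender": "M", "zip": "94143"},
    {"dob": "1971-07-04", "gender": "F", "zip": "94110"},
]

all_three = unique_fraction(toy)                  # date of birth + gender + ZIP
gender_only = unique_fraction(toy, ("gender",))   # a single coarse field
```

In this toy set, half the records are unique on the three fields together but only a quarter are unique on gender alone: each added field narrows the crowd a record can hide in.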

Page 53

Outcomes Research Project

• Compare 1 year re-admission rate for acute MI in diabetics discharged on β-blockers or not
  – data captured in EHR and other databases
  – data aggregated in data warehouse
  – you request IRB approval
  – you are authorized to conduct HIPAA-compliant search (e.g., Limited Data Set) in data warehouse
  – audit trail of queries is maintained
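Once the search is authorized, the study question could be posed as a query along the following lines. This is a sketch under several assumptions: the two tables, their columns, and the toy rows are hypothetical, "re-admission" is taken to mean any admission within 365 days after the index discharge, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE index_discharge (   -- one row per index acute MI discharge
    patient_id INTEGER, discharge_date TEXT, diabetic INTEGER, beta_blocker INTEGER);
CREATE TABLE admission (         -- all later admissions
    patient_id INTEGER, admit_date TEXT);
""")

# Made-up toy rows; no real patient data.
conn.executemany("INSERT INTO index_discharge VALUES (?, ?, ?, ?)", [
    (1, "2003-01-10", 1, 1),
    (2, "2003-02-01", 1, 1),
    (3, "2003-03-15", 1, 0),
    (4, "2003-04-20", 1, 0),
    (5, "2003-05-05", 0, 1),  # not diabetic, so excluded by the query
])
conn.executemany("INSERT INTO admission VALUES (?, ?)", [
    (1, "2003-06-01"),  # within a year of discharge
    (2, "2004-06-01"),  # more than a year after discharge
    (3, "2003-04-01"),
    (4, "2003-12-31"),
    (5, "2003-06-01"),
])

QUERY = """
SELECT beta_blocker, COUNT(*) AS n, AVG(readmitted) AS readmit_rate
FROM (
    SELECT i.beta_blocker AS beta_blocker,
           EXISTS (
               SELECT 1 FROM admission a
               WHERE a.patient_id = i.patient_id
                 AND julianday(a.admit_date) > julianday(i.discharge_date)
                 AND julianday(a.admit_date) <= julianday(i.discharge_date) + 365
           ) AS readmitted
    FROM index_discharge i
    WHERE i.diabetic = 1
)
GROUP BY beta_blocker
ORDER BY beta_blocker
"""

results = conn.execute(QUERY).fetchall()
for beta_blocker, n, rate in results:
    print(f"beta_blocker={beta_blocker}  n={n}  1-year readmission rate={rate:.2f}")
```

The query only works because diabetes status and discharge medication are stored as structured, coded fields; if either lived in free text, the same comparison would need chart review or text processing first.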

Page 54

Take-Home Points

• EHR does not always = easier clinical research
• Structure and coding are critical
  – structure: e.g., relational schema, designed to support intended queries
  – coding: standardized, coded data trumps free text
    • especially important for research
    • but most standardized vocabularies have insufficient clinical coverage
  – standard formats needed for genomic, imaging, etc. data
• Clinical/Research data warehouses could be useful for research but must be designed “correctly” with high-quality, cross-compatible data