February 15, 2005: I. Sim OverviewMedical Informatics
Medical Informatics for Clinical Research
Ida Sim, MD, PhD
February 15, 2005
Division of General Internal Medicine, and Graduate Group in Biological and Medical Informatics
UCSF
Copyright Ida Sim, 2005. All federal and state rights reserved for all original material presented in this course through any medium, including lecture or print.
Outline
• Introduction
• Course Goals and Overview
• Computing Infrastructure for Health Care
  – data storage
  – networking
Introduction: Ida Sim, MD, PhD
• Assistant Professor
  – General Internal Medicine
• PhD in Medical Informatics, Stanford
• Associate Director for Medical Informatics
  – Program in Biological and Medical Informatics
• Interests
  – computerized trial registration and reporting
  – meta-analysis and evidence-based decision making
  – computer-assisted clinical decision-making
  – economics of health information technology
Big Picture
• Cusp of a “new medicine”
  – genomics revolution
  – personalized medicine
  – quality improvement from evidence-based medicine
• But quality problems endemic
  – Institute of Medicine (IOM) 2001 report
    • “Health care today harms too frequently and routinely fails to deliver its potential benefits… Between the health care we have and the care we could have lies not just a gap, but a chasm.”
Informatics “Solution”
• Electronic health records (EHRs) touted as critical for
  – transforming health care
    • e.g., guideline-based, personalized decision support
    • 24/7, anywhere web-based interactive care
  – improving quality
    • “A nationwide effort is needed to build a technology-based information infrastructure that would lead to the elimination of most handwritten clinical data within the next 10 years…” (IOM “Chasm” report, 2001)
    • asks for $1 billion for health informatics
Yet...
• Only ~12% of outpatient clinics have an EHR; only 30% of hospitals have a website
  – 20% of group practices have EHR (MGMA, 2005)
• Much clinical research is still done using chart abstraction and paper forms
• Medicine and medical research are information intensive, but
  – health sector invests only 2.5% of gross revenue in information technologies (Gartner Group, 2003)
  – vs. 6% in comparable information-intensive sectors (e.g., banking)
Major Current Responses
• Top national priority is IT for clinical care
  – IT for clinical research should build upon IT for clinical care
• IT for clinical care
  – newly formed Office of the National Coordinator for Health Information Technology (ONCHIT)
    • the health IT czar for the U.S.
• IT for clinical research
  – National Electronic Clinical Trials and Research Network (NECTAR)
ONCHIT Framework
• 4 goals of Health IT in U.S.
  – inform clinical practice
    • i.e., increase EHR use
  – interconnect clinicians
    • through computers that “talk” to each other
    • via the National Health Information Network (NHIN)
  – personalize care
    • i.e., consumer-centric care, not “personalized” medicine
  – improve population health
    • public health surveillance, “accelerate scientific discovery”
• Government to catalyze private-sector solutions to above
NECTAR
• NIH goal to foster large clinical studies through flexible networks of researchers
• NECTAR: “the informatics infrastructure that will serve as the backbone for interconnected and inter-operable research networks”
  – standardized data reporting
  – informatics and other technologies to enhance
    • participation in large-scale clinical research
    • data sharing and cross-study analyses
    • communication and collaboration
Confluence of Events
• Recognition that move to personal and personalized medicine will require much better health IT
• EHR is coming, finally, we think…
• Focused, national attention on building informatics foundation(s)
  – for health care
  – for clinical research
Course Goals
• Be familiar with core concepts in medical informatics: vocabularies, interchange standards, decision support systems
• Understand how computers are used to manage information in health care and to support clinical research
• Be familiar with key issues in using information technology to improve the way we do clinical research
Course Overview
• 5 Lectures
  – PowerPoint file posted a few days before each lecture
  – class participation expected
• Guest lecture: Palo Alto Medical Foundation’s EHR
  – Tues. Mar. 15, 8:45 to 10:15 am
• Assignments
  – 5 homeworks, no final exam
• Office “hours”: [email protected]
  – http://www.epibiostat.ucsf.edu/courses/schedule/med_informatics.html
Outline
• Introduction
• Course Goals and Overview
• Computing Infrastructure for Health Care
  – data storage
  – networking
Computing Infrastructure
[Figure: Clinic 2005. Diagram labels: Front Desk; Radiology; Claims; Medical Information Bureau; Archive; Walgreens; Prescribing; Pharm Benefit Manager; Benefits Check (RxHub); HealthNet; Formulary Check; B&T; Eligibility; Authorization; Personal Health Record; Logician EMR; Outsourced Electronic Medical Record; Specialist; Referral; Referral Authorization; Lab; UniLab (HL-7). Legend: Internet, Intranet, Phone/Paper/Fax.]
Understanding the Infrastructure
• Clients and servers (the components)
• Data storage (how data is stored)
  – flat file versus relational model
• Networking (how data gets back and forth)
Client/Server Model
• Computers can be servers and/or clients
  – provide or request “services”
    • e.g., web server “serves” web pages to “clients,” who view these pages using a browser
      – MS Internet Explorer or Netscape Communicator
[Figure: Clients connected to a Web Server]
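The request/response exchange on this slide can be sketched with Python's standard library. This is only an illustration: the handler name `HelloHandler`, the helper `fetch_page`, and the page text are invented for the example, and the "browser" is reduced to a few lines that request one page from a server running on the same machine.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with one small web page."""

    def do_GET(self):
        body = b"<html><body>Hello from the server</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch_page():
    """Start a server on a free local port, act as a client once, then stop."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: request the page, as a browser would.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        status, page = response.status, response.read().decode("utf-8")
        connection.close()
        return status, page
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_page())
```

The same machine plays both roles here, which matches the slide's point that a computer can be a server, a client, or both.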
Internet Clients and Servers
[Figure: network diagram. Labels: itsa; medicine; ucsf.edu; LAN; nci.nih.gov; cochrane.uk; myhome.com; amazon.com; at home; pacbell.net; aol.com; Main Trunk Cables.]
Data Storage
• Computers can help us
  – store, retrieve, query, compute, and report data
• For this to happen, we must describe the data in such a way that the computer
  – “understands” the data
  – can manipulate the data
    • e.g., sort them, graph them, add numbers, perform analyses
  – can retrieve the data for later use
“Describing” the Data
• The computer’s ability to manage your data depends on how well you describe your data to it
• JIFE database example: is your data described
  – accurately: did Baby Oscar have jaundice?
  – cleanly: with as little redundancy as possible
    • don’t want Baby Oscar’s birth date in 3 places
  – sufficiently: all that is needed for later analyses
    • what later analyses do you have in mind (e.g., by ethnicity)?
  – understandably: for humans and for computers
“Describing” Data: To Humans
• For understanding and communication
  – via a system for codifying meaning
    • English language, mathematical notation
  – making the coded message concrete
    • skywriting “I LUV U”, a graph on a sandy beach
    • text on paper, an oil painting, lecture on audiotape
• For later retrieval
  – a permanent or semi-permanent physical embodiment of the message
    • papers in a file cabinet, collection of runes
24142 1083.9 96
“Describing” Data: To Computers
• For understanding and communication
  – via a data model for describing data to computers
    • akin to “German prose on paper” or “Olde English epic poetry on audiotape”
  – standard data models to choose from include
    • flat file
    • relational
    • extended-relational (combo of object-oriented and relational)
• For later retrieval
  – storage as 1’s and 0’s in
    • random access memory: short term, until power off
    • permanent memory on a hard disk: longer term
Data Model Choices
• Data model should be the best that allows you to
  – do what you want to do with the data
    • query, manipulate, share, merge
  – handle the amount of data that you have
  – handle the type of data that you have
    • prose, numbers, xray images, audio files, etc.
• Standard data model choices
  – flat file: one long list of text characters
  – relational: tables of columns and rows
  – object: data arranged in conceptual groups
• Usual clinical research choice is flat file vs. relational
Flat File Model
• For understanding and communication
  – data are encoded as strings of characters
    • one character at a time, no concept of a “word” or “sentence”
  – so, computers cannot understand the meaning of data
    • “male” is just a string of 4 characters
• For storage
  – in a single file (e.g., a Word or STATA file)
  – “flat” structure: start with one baby’s data, and keep adding data baby by baby
• Like writing all your data from beginning to end onto one piece of paper and putting that paper into your file drawer
Flat File Examples
Word Text File
Carson Jackson 1 3/2/05 J 5
Hannah Hillary 2 1/2/05 C 2
Jonas Oscar 1 1/1/05 J 3
STATA File
Carson,Jackson,1,3/2/05,J,5
Hannah,Hillary,2,1/2/05,C,2
Jonas,Oscar,1,1/1/05,J,3
Database Schema
• A database’s schema is a compact summary description of your database’s contents
• Database schema = description of database
  – what type of data
  – how that data is conceptually arranged
• E.g., schema for research paper
  – intro, methods, results, discussion (text)
  – tables (table) and figures (graphic)
  – pictures (image)
Flat File Data Schema
Word File
Carson Jackson 1 3/2/05 J 5
Hannah Hillary 2 1/2/05 C 2
Jonas Oscar 1 1/1/05 J 3
• Which fields are
  – first name, DOB, case status, last name, exam score, gender?
• Flat file schemas are implicit
  – they exist only in the mind of whoever is entering the data
  – can change from record to record
    • e.g., Baby 1 is Jackson Carson while Baby 2 is Hannah Hillary
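A minimal sketch of what “implicit schema” means in practice, using the comma-delimited records from the flat file example. The field names in `ASSUMED_FIELDS` are one guess at the column meanings, made up for this illustration; nothing in the file itself confirms them, which is exactly the problem.

```python
import csv
import io

# The three comma-delimited records shown on the flat file slide.
FLAT_FILE = """Carson,Jackson,1,3/2/05,J,5
Hannah,Hillary,2,1/2/05,C,2
Jonas,Oscar,1,1/1/05,J,3
"""

# ASSUMPTION: one possible reading of the columns. The file does not say.
ASSUMED_FIELDS = ["last_name", "first_name", "gender", "dob", "case_status", "exam_score"]


def read_raw(text):
    """What the computer actually gets: rows of unlabeled strings."""
    return list(csv.reader(io.StringIO(text)))


def read_with_schema(text, field_names):
    """Apply a schema supplied by the reader, not by the file."""
    return list(csv.DictReader(io.StringIO(text), fieldnames=field_names))


if __name__ == "__main__":
    print(read_raw(FLAT_FILE)[0])
    print(read_with_schema(FLAT_FILE, ASSUMED_FIELDS)[0])
```

If the person entering data swapped first and last names for one baby, `read_with_schema` would label the values wrongly without any warning, and every value, including the exam score, is still just a string.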
Flat File Advantages
• Easy: just start entering data; doesn’t need any preliminary database work or knowledge
• Can do with any word processor
  – Word, WordPerfect, editor for STATA or SAS, Excel, SimpleText
• Cheap
• Can be exported to analysis programs
• Portable
  – almost all programs can read in a flat file
Flat File Disadvantages
• Description of the data isn’t clear, so data may not be understandable
  – meaning of the data items is not explicit
    • unclear that last column is neuropsych exam score
  – structure is not explicit
    • does last name always precede first name?
  – structure is not stable
    • can change from record to record
• Inefficient and prone to error for representing repeating data fields
  – e.g., if each baby has more than one neuropsych exam score
Repeating Data in Flat File Model (2)
Word Text File
Carson Jackson 1 3/2/05 J 5 x 4
Hannah Hillary 2 1/2/05 C 2
Jonas Oscar 1 1/1/05 J 3 43
• Implicit structure to repeating data
  – is the nth column always the nth neuropsych exam score?
    • can a missed exam be denoted by an X?
Flat File Disadvantages (cont.)
• Inefficient at finding a particular baby
  – must look at all records from beginning to end
  – no guarantee that you have found all the information for that baby unless you look all the way to the end
• Inefficient at manipulating data
  – to see list of male babies, must make a new file
• Difficult to share since the database itself gives no clues about what data is in each field
Summary of Flat File Data Model
Factor                        Flat File        Relational
Human-understandable          Frequently Not
Computer-“understandable”     No
Complexity of data structure  Simple
Querying                      Inefficient
Manipulating                  Inefficient
Amount of data                Small
Type of data                  Text, Numbers
Sharing and merging           Very Difficult
When Are Flat Files Useful?
• For small, simple, “quick and dirty” databases
  – few data items, small number of records
  – one set of predictors and one set of outcomes per participant/subject
    • i.e., no repeating data fields
    • i.e., only one-to-one relations, no one-to-many
  – quick and dirty
    • for very few users (i.e., just you)
    • you’re not planning on reusing this database later
    • you’re not planning on sharing this database now or later
Relational Data Model
• Data are arranged in tables made up of columns and rows
  – the columns are the types of data
    • fixed number of columns
    • each column has a unique name (e.g., FirstName)
    • each column has a “domain” of values that may appear in that column
      – domain=text for FirstName
      – domain=positive integers for Age
  – the rows are the records themselves
    • there can be an arbitrary number of unique unnamed rows (i.e., the table can be arbitrarily long)
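Named columns and domains can be sketched with Python's built-in `sqlite3` module. The table name `Participant` and the helper names are hypothetical. SQLite is loosely typed by default, so the `CHECK` constraints below are one way to approximate the slide's “domain” idea; other database systems enforce column types more directly.

```python
import sqlite3


def make_table():
    """Create an in-memory table whose columns have names and domains."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Participant (
            FirstName TEXT    NOT NULL CHECK (typeof(FirstName) = 'text'),
            Age       INTEGER NOT NULL CHECK (typeof(Age) = 'integer' AND Age > 0)
        )
        """
    )
    return conn


def try_insert(conn, first_name, age):
    """Return True if the row was accepted, False if it violates a domain."""
    try:
        conn.execute(
            "INSERT INTO Participant (FirstName, Age) VALUES (?, ?)",
            (first_name, age),
        )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = make_table()
    print(try_insert(conn, "Oscar", 1))        # a positive integer age
    print(try_insert(conn, "Hannah", -2))      # outside the domain
    print(try_insert(conn, "Jonas", "three"))  # not an integer at all
```

Unlike the flat file, the table itself now carries the description of the data, so bad values are refused at entry time instead of being discovered during analysis.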
Flat File Admissions Database
Robert Lee, 000-01-001, M, 09-Jul-70,B/T Healthnet
31-Dec-94 to 12-Jan-95, admitted to Medicine with Acute MI, discharged with Acute MI, COPD, Diabetes, CHF
27-Mar-96 to 31-Mar-96, admitted to Medicine with COPD, discharged with Pneumonia, COPD, CHF, Diabetes

June Smith, 000-01-002,F,22-Oct-25,Medicare
02-Feb-95 to 16-Feb-95, admitted to Surgery for Total Hip Replacement, discharged with THR, Acute MI, Diabetes
27-Feb-95 to 20-Mar-95, admitted to Medicine with Acute MI, discharged with Acute MI,VF Arrest, Diabetes

Marissa Perez,000-01-003,F,13-Jun-57,B/T Pacificare
19-Nov-97 to 23-Nov-97, admitted to Gyn for metrorrhagia, discharged with uterine fibroids, Diabetes
Review of Problems with Flat Files
• Implicit structure, implicit data schema
• Schema may change from record to record
• Inefficient for finding a particular admission
• Difficult to share or to understand later
• Problems in representing one-to-many relationships
• etc.
Relational Admissions Database

InpatientMasterTable
ID          Name   Sex  Birthdate  Insurance
000-01-001  Lee    M    09-Jul-70  B/T Healthnet
000-01-002  Smith  F    22-Oct-25  Medicare
000-01-003  Perez  F    13-Jun-57  B/T Pacificare

AdmissionsTable
ID          Admit Service  Admit Date  Discharge Date  Admit Diagnosis  Principal Discharge Diagnosis
000-01-001  Med            31-Dec-94   12-Jan-95       Acute MI         Acute MI
000-01-001  Med            27-Mar-96   31-Mar-96       COPD             Pneumonia
000-01-002  Surg           03-Feb-95   16-Feb-95       THR              THR
000-01-002  Med            27-Feb-95   20-Mar-95       Acute MI         Acute MI
000-01-003  Gyn            19-Nov-97   23-Nov-97       Menorrhagia      von Willebrand's

SecondaryDischargeDiagnosisTable
ID          Admit Date  Secondary Discharge Diagnoses
000-01-001  31-Dec-94   COPD
000-01-001  31-Dec-94   Diabetes
000-01-001  31-Dec-94   CHF
000-01-001  27-Mar-96   COPD
000-01-001  27-Mar-96   CHF
000-01-001  27-Mar-96   Diabetes
000-01-002  03-Feb-95   Acute MI
000-01-002  03-Feb-95   Diabetes
000-01-002  27-Feb-95   VF Arrest
000-01-002  27-Feb-95   Diabetes
000-01-003  19-Nov-97   Diabetes

1 patient, multiple admissions
Relational Admissions Database
(same InpatientMasterTable, AdmissionsTable, and SecondaryDischargeDiagnosisTable as on the previous slide)
1 admission, multiple secondary diagnoses
Relational Database Schema
• The schema is the names of the tables and their column names
  – InpatientMasterTable(ID,Name,Sex,Birthdate, etc)
  – AdmissionsTable(ID,AdmitService,AdmitDate,DischargeDate,AdmitDiagnosis, etc)
  – SecondaryDiagnosisTable(ID,AdmitDate,SecondaryDischargeDiagnosis)
• The schema is explicitly stated
  – in a language called Structured Query Language (SQL)
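As a rough illustration of a schema stated in SQL, the three admissions tables can be declared and queried through Python's built-in `sqlite3` module. The column types, key declarations, and function names are assumptions added for the sketch (the slides give only table and column names), and only a few of the slide's rows are loaded.

```python
import sqlite3

# ASSUMPTION: column types and keys; the slide lists only names.
SCHEMA = """
CREATE TABLE InpatientMasterTable (
    ID TEXT PRIMARY KEY,
    Name TEXT, Sex TEXT, Birthdate TEXT, Insurance TEXT
);
CREATE TABLE AdmissionsTable (
    ID TEXT REFERENCES InpatientMasterTable(ID),
    AdmitService TEXT, AdmitDate TEXT, DischargeDate TEXT,
    AdmitDiagnosis TEXT, PrincipalDischargeDiagnosis TEXT,
    PRIMARY KEY (ID, AdmitDate)
);
CREATE TABLE SecondaryDischargeDiagnosisTable (
    ID TEXT, AdmitDate TEXT, SecondaryDischargeDiagnosis TEXT,
    FOREIGN KEY (ID, AdmitDate) REFERENCES AdmissionsTable(ID, AdmitDate)
);
"""


def build_database():
    """Create the three tables in memory and load a few rows from the slides."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO InpatientMasterTable VALUES (?, ?, ?, ?, ?)",
        [("000-01-001", "Lee", "M", "09-Jul-70", "B/T Healthnet"),
         ("000-01-002", "Smith", "F", "22-Oct-25", "Medicare")],
    )
    conn.executemany(
        "INSERT INTO AdmissionsTable VALUES (?, ?, ?, ?, ?, ?)",
        [("000-01-001", "Med", "31-Dec-94", "12-Jan-95", "Acute MI", "Acute MI"),
         ("000-01-001", "Med", "27-Mar-96", "31-Mar-96", "COPD", "Pneumonia"),
         ("000-01-002", "Surg", "03-Feb-95", "16-Feb-95", "THR", "THR")],
    )
    conn.executemany(
        "INSERT INTO SecondaryDischargeDiagnosisTable VALUES (?, ?, ?)",
        [("000-01-001", "31-Dec-94", "COPD"),
         ("000-01-001", "31-Dec-94", "Diabetes"),
         ("000-01-001", "31-Dec-94", "CHF"),
         ("000-01-002", "03-Feb-95", "Acute MI"),
         ("000-01-002", "03-Feb-95", "Diabetes")],
    )
    return conn


def admissions_for(conn, name):
    """One patient, many admissions: join the master table to admissions."""
    rows = conn.execute(
        """
        SELECT a.AdmitDate, a.PrincipalDischargeDiagnosis
        FROM InpatientMasterTable AS p
        JOIN AdmissionsTable AS a ON a.ID = p.ID
        WHERE p.Name = ?
        """,
        (name,),
    ).fetchall()
    return sorted(rows)  # dates are stored as text, so this is alphabetical order


def secondary_diagnoses(conn, patient_id, admit_date):
    """One admission, many secondary diagnoses."""
    rows = conn.execute(
        "SELECT SecondaryDischargeDiagnosis FROM SecondaryDischargeDiagnosisTable "
        "WHERE ID = ? AND AdmitDate = ?",
        (patient_id, admit_date),
    ).fetchall()
    return sorted(diagnosis for (diagnosis,) in rows)
```

Each one-to-many relation gets its own table, and a query reassembles the pieces by matching on ID (and on AdmitDate for the secondary diagnoses).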
Pros of Relational Model
• Database is always consistent
  – built-in prevention against insert, delete, update errors
    • data stored non-redundantly and properly cross-indexed
• Is more efficient, due to basis in formal set theory
  – normalization saves storage space
  – normalization results in faster searching through the data
• Standard schema definition and query language
  – SQL=Structured Query Language
• Available as reliable commercial software systems...
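One kind of built-in protection against insert and delete errors can be sketched with `sqlite3` and a foreign key. The tables are cut down to two columns each, and the helper names are made up for this example. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other systems differ in their defaults.

```python
import sqlite3


def open_database():
    """Two linked tables: every admission must belong to a known patient."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
    conn.executescript(
        """
        CREATE TABLE InpatientMasterTable (
            ID TEXT PRIMARY KEY,
            Name TEXT NOT NULL
        );
        CREATE TABLE AdmissionsTable (
            ID TEXT NOT NULL REFERENCES InpatientMasterTable(ID),
            AdmitDate TEXT NOT NULL,
            PRIMARY KEY (ID, AdmitDate)
        );
        INSERT INTO InpatientMasterTable VALUES ('000-01-001', 'Lee');
        """
    )
    return conn


def attempt(conn, sql, params=()):
    """Run one statement; return False if the database rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = open_database()
    insert = "INSERT INTO AdmissionsTable VALUES (?, ?)"
    print(attempt(conn, insert, ("000-01-001", "31-Dec-94")))  # known patient
    print(attempt(conn, insert, ("999-99-999", "01-Jan-95")))  # unknown patient
    print(attempt(conn, insert, ("000-01-001", "31-Dec-94")))  # duplicate admission
    print(attempt(conn, "DELETE FROM InpatientMasterTable WHERE ID = ?", ("000-01-001",)))
```

The database refuses an admission for a patient who does not exist, a duplicate admission, and the deletion of a patient who still has admissions, none of which a flat file would catch.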
Cons of (Traditional) Relational Model
• Profusion of tables and keys can be confusing
  – higher organizing principles are implicit
    • e.g., a patient has only one primary diagnosis but may have several secondary diagnoses
• Inefficient at representing complex semantic relationships
  – e.g., ICU admission is a type of admission
• Unable to capture certain types of data
  – nested data
    • e.g., admit diagnosis = MITable(location,Qwave,CHFStatus)
  – images and other multimedia
  – metadata (e.g., “Exam score corrected May 2nd, 2004”)
Summary of Relational Data Model
Factor                        Flat File        Relational
Human-understandable          Frequently Not   Yes
Computer-“understandable”     No               Yes
Complexity of data structure  Simple           Complex
Querying                      Inefficient      Efficient
Manipulating                  Inefficient      Efficient
Amount of data                Small            Very Large
Type of data                  Text, Numbers    Text, Numbers*
Sharing and merging           Very Difficult   Less Difficult
*Extended-relational data model can handle images and video too
Summary of Data Model Choices
• Clinical data (e.g., EHR)
  – should use the relational model
• Clinical research data
  – default should be to use relational model
  – exceptions
    • you have only one-to-one relations in your database, and you do not intend to share or reuse it
      – then use the flat file model
The Model vs. The System
• Data model
  – the generic abstract structure of the information
• Database management systems
  – are real-world software packages that you can buy
  – store information using a particular data model
  – provide additional functionality

Example Database Management Systems
Data Model  Small Scale (PCs)       Large Scale (Mainframes)
Flat file   original Filemaker Pro  VA system (enhanced)
Relational  Access, MySQL           Oracle, Sybase, MySQL, SQL Server
Object      Informix                Objectivity
DBMS Features for System Selection
• Multi-user support
  – e.g., prevents simultaneous updating of the same data field
• Data entry forms
• Triggers and rules
• Security
  – e.g., separate logins with different levels of access
    • only database administrator can change data schema
    • data entry person can only enter data into certain fields
• Automatic backup and archiving
  – standard for health care data is at least 7 years of archiving
Summary on Data Storage
• How a computer stores information can have serious implications for
  – data integrity
  – speed
  – ability to share data
• Relational model is generally the best choice for storing clinical and research data
  – but making sense of multiple databases is still difficult
Understanding the Infrastructure
• Clients and servers (the components)
• Data storage (how data is stored)
  – flat file versus relational model
• Networking (how data gets back and forth)
Internet = Network of Networks
[Figure: network diagram. Labels: itsa; medicine; ucsf.edu; LAN; nci.nih.gov; cochrane.uk; myhome.com; amazon.com; pacbell.net; aol.com; Main Trunk Cables; local trunk cable through Berkeley. At home: dial-in to itsa.ucsf.edu via modem, or use a commercial Internet Service Provider (ISP) via dial-up or DSL.]
Networking Bandwidth
Sim: Computer Infrastructure 1/26/00
Connection Type     Speed (kilobits per second, Kbps)                    CXR (12 Mbits)  CT Scan (5.2 Mbits)
Phone modem         14.4, 28.8, or 56
ISDN                64 to 128                                            3 min           1.4 min
T1                  1,000
Spread-spectrum RF  2,000
ADSL                6,000 to 7,000
Cable modem         to 10,000
Infrared            16,000
Ethernet            10,000 (100,000 on some systems)
T3                  45,000
ATM                 155,000 over copper wires; 622,000 over fiberoptic  8 sec           3.3 sec
SONET               52,000 to 9,953,000
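A back-of-the-envelope check of the transfer times, assuming the simplest possible model: time = file size / line speed, with 1 Mbit = 1,000 kilobits and no allowance for protocol overhead or shared lines. The function name is made up for this sketch. Under those assumptions, the 64 Kbps ISDN figure works out close to the times listed in the ISDN row.

```python
def transfer_seconds(size_megabits, speed_kbps):
    """Ideal transfer time in seconds: size divided by line speed.

    Assumes 1 Mbit = 1,000 kilobits and ignores protocol overhead,
    so real transfers will be somewhat slower.
    """
    return size_megabits * 1000 / speed_kbps


if __name__ == "__main__":
    cxr_minutes = transfer_seconds(12, 64) / 60   # chest x-ray over 64 Kbps ISDN
    ct_minutes = transfer_seconds(5.2, 64) / 60   # CT scan over 64 Kbps ISDN
    print(f"CXR: {cxr_minutes:.1f} min, CT scan: {ct_minutes:.1f} min")
```

The same function can be used to compare any other row of the table by substituting its speed.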
What Happens over Network Cables?
[Figure: network diagram. Labels: itsa; medicine; ucsf.edu; LAN; nci.nih.gov; cochrane.uk; myhome.com; amazon.com; at home; pacbell.net; aol.com; Main Trunk Cables.]
Networking
• Protocol = grammar for machines talking to each other
  – protocol for the web is http
• Health-specific networking “grammars” add to complexity of infrastructure
  – e.g., HL-7 needed for computers to “talk” clinical data
• Many interactive services (e.g., realtime teleconsultation) would need more bandwidth than is commonly available
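To make “grammar” concrete, here is a small sketch that reads a made-up message written in the general style of HL7 version 2: one segment per line, fields separated by “|”, components separated by “^”. The message content (including the lab value) is invented for illustration, and the field positions reflect common HL7 v2 conventions as best recalled, so they should be checked against the standard before any real use.

```python
# A made-up message in the general style of HL7 version 2. Illustrative only.
MESSAGE = "\r".join([
    "MSH|^~\\&|LAB|UNILAB|EMR|CLINIC|200502150830||ORU^R01|0001|P|2.3",
    "PID|1||000-01-001||Lee^Robert||19700709|M",
    "OBX|1|NM|GLU^Glucose||105|mg/dL",
])


def parse_segments(message):
    """Split a message into {segment name: list of fields}.

    Simplification: assumes each segment type appears only once.
    """
    segments = {}
    for line in message.split("\r"):
        if line:
            fields = line.split("|")
            segments[fields[0]] = fields
    return segments


def patient_summary(message):
    """Pull identifying fields out of the PID (patient identification) segment."""
    pid = parse_segments(message)["PID"]
    family_name, given_name = pid[5].split("^")[:2]
    return {
        "id": pid[3],
        "family_name": family_name,
        "given_name": given_name,
        "birth_date": pid[7],
        "sex": pid[8],
    }


if __name__ == "__main__":
    print(patient_summary(MESSAGE))
```

Both machines must agree in advance that, for example, the fifth field of a PID segment is the patient's name; that agreement is the protocol.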
Clients/Servers/Data/Networking
[Figure: Modern U. Diagram labels: Front Desk; Radiology; Claims; Medical Information Bureau; Archive; Walgreens; Prescribing; Pharm Benefit Manager; Benefits Check; HealthNet; Formulary Check; Lab; UniLab; B&T; Eligibility; Authorization; Personal Health Record; Logician EMR; Outsourced Electronic Medical Record; Specialist; Referral; Referral Authorization. Legend: Internet, Intranet, Phone/Paper.]
HealthSystem Minnesota
• 1.6 million patient visits per year, 270,000 capitated lives, 460 physicians, 4700 employees, 31 clinics, and over $400 million in revenues (1998)
– over 50 computer and 50 paper systems
• “Maintaining the consistency of these tables in various systems is impossible and creates enormous problems for understanding let alone improving our performance.”
UCSF
• Spent ~$100 million on networking in the late 1990’s
• EHR implementation delayed multiple times
• Several Dean’s committees on IT for clinical research in last 5 years
Conclusions
• Clinical and research databases are generally more reliable and efficient if they are relational rather than flat file
• Computing infrastructure for health care is very complex, very fragmented, has lots of gaps, and is saddled with lots of old technology
  – but finally, the national will appears to exist to tackle this
• Networking involves both hardware (cable) and software (protocols); bandwidth limits wide deployment of interactive technologies
Teaching Points
• If you want computers to do “smart” things with your data (e.g., retrieve, sort, graph), you must describe that data very explicitly
  – what you don’t say the computer does not know
• Data models are standard abstract ways of describing data
• To send data back and forth, you also need very explicit “grammars” for communication
• Today = how of infrastructure; next class = what
References
• Crossing the Quality Chasm: A New Health System for the 21st Century (Washington: National Academy Press, 2001)