Back-End Structures and Front End Visualizations
DAMA Minnesota
Matthew Israelson
19 November 2014
About Us
IHME is an independent global health research center at the University of Washington.
Vision: Provide high-quality information on population health, its determinants, and the performance of health systems.
Mission: Improve the health of the world’s populations by providing the best information on population health.
Method: Produce rigorous and comparable measurements.
For general information
Phone: +1-206-897-2800
Fax: [email protected]
A Short History
Started in 2007 and continuing to grow into 2014
• July 2007: Founding of IHME with support from Bill & Melinda Gates Foundation and the state of Washington
• July 2009: Published the Financing Global Health (FGH) report
• June 2010: Graduated the first Masters of Public Health
• March 2011: Launched the Global Health Data Exchange (GHDx) at the Global Health Metrics and Evaluation conference
• December 2012: The Lancet published The Global Burden of Diseases, Injuries, and Risk Factors Study 2010 (GBD 2010)
• December 2014: First annual update with GBD 2013
Agenda
Back End Structures
• Data Collection
• Infrastructure
• Modeling and Analysis
Front End Visualizations
• Audience
• Outreach
• Visualizations
Back-End Structures
Data Collection
• Process
• Collection
• Cataloging
Infrastructure
• People
• Networking
• Technology
Modeling & Analysis
• GBD
• Data model
• Deliverables
Overview of the data cycle
• Locate data
• Acquire data
• Catalog data
• Extract data
• Identify gaps
Search for new health sources from:
• Government and NGO websites
• Databases
• Expert advice
• Literature
Negotiate with providers for access
• Formal requests
• Collaboration
• DUA / MOU / IRB
• Payment
Add to the GHDx
• Assign NID
• Create citation
• Add keyword
• Attach files
Provide data to our researchers
• Notify teams
• Extract data
• Import to research databases
• Provide sourcing
Analyze results to identify data gaps
• Years
• Causes
• Countries
• Etc.
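The gap-identification step can be sketched as a comparison between the combinations we want covered and the combinations that already have a source. This is a minimal illustration, not IHME's actual process: the function name `find_gaps`, the tuple layout, and the sample records are all assumptions made for the example.

```python
from itertools import product


def find_gaps(available, years, causes, countries):
    """Return every (year, cause, country) combination that has no data source.

    `available` is an iterable of (year, cause, country) tuples describing
    what has already been collected (a hypothetical layout for this sketch).
    """
    covered = set(available)
    return [
        combo
        for combo in product(years, causes, countries)
        if combo not in covered
    ]


# Hypothetical coverage records, not real GHDx content.
available = [
    (2010, "malaria", "KEN"),
    (2011, "malaria", "KEN"),
    (2010, "malaria", "UGA"),
]

gaps = find_gaps(
    available,
    years=[2010, 2011],
    causes=["malaria"],
    countries=["KEN", "UGA"],
)
print(gaps)
```

The resulting list of missing combinations is what would feed back into the "Locate data" step of the cycle.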
What we collect
• Health Surveys
• Census Records
• Surveillance Systems
• Disease Registries
• Vital Registration
• Hospital Records
• Financial Records
• Literature
• Estimates

# of Records in the GHDx
January 2011: 5,500
January 2012: 7,000
January 2013: 14,000
January 2014: 28,000
October 2014: 43,000
How we collect it
Added 15,000 new sources of data since January
• Not everything is on the Internet
• 900+ “high-touch” requests
• Applications
• Data Use Agreements
• IRB Approval
• Restricted Data
A project management tool is essential
• Adopted JIRA in 2013
Sourcing data
Global Health Data Exchange (GHDx)
• Centralized citation database for IHME
• Ensures same citation for the same data
• Allows us to source all data points
• http://ghdx.healthdata.org/
GHDx
• NIDs
• All metadata
• Citations
Federated
• Citations
• Accessed date
• Publication status
• NIDs
Research databases
• NIDs
Not publicly accessible
Publicly accessible
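One way to picture how an NID ties the pieces together: the catalog owns the citation, and every data point in a research database carries only the NID. The sketch below is an assumption-laden illustration, not the GHDx schema; the `Source` and `Catalog` classes, their fields, and the sample citation are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """A cataloged source (hypothetical shape, not the GHDx schema)."""
    nid: int
    citation: str
    keywords: list = field(default_factory=list)


class Catalog:
    """Hands out NIDs and is the single place a citation is stored."""

    def __init__(self):
        self._by_nid = {}
        self._next_nid = 1

    def add(self, citation, keywords=()):
        source = Source(self._next_nid, citation, list(keywords))
        self._by_nid[source.nid] = source
        self._next_nid += 1
        return source.nid

    def cite(self, nid):
        return self._by_nid[nid].citation


catalog = Catalog()
nid = catalog.add(
    "Example Ministry of Health. Example Health Survey 2012.",  # made-up citation
    keywords=["survey"],
)

# Research databases store only the NID next to each data point.
data_points = [
    {"nid": nid, "country": "KEN", "year": 2012, "value": 0.12},
    {"nid": nid, "country": "KEN", "year": 2012, "value": 0.15},
]

# Every data point from the same source resolves to the same citation.
citations = {catalog.cite(point["nid"]) for point in data_points}
print(citations)
```

Because the citation text lives in one place, correcting it in the catalog corrects it for every data point that references the NID.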
Organizing Data LIVE DEMO
http://ghdx.healthdata.org/
The people
Board of Directors & Scientific Oversight Group
210 Employees
• Professors: 20+
• Researchers: 90+
• IT: 20+
• Staff: 80+
16 Affiliate Professors
GBD Expert Collaborators
The GBD network
GBD enrolled 1,095 collaborators from 107 countries
Networking as an enabler
The collaborative network enhances the GBD
• Assess the validity of country results
• Identify missing datasets or incorrect interpretations of data
• Interpret findings and facilitate country policy translation
• Assist with acquiring new sources of data
• Publish papers using GBD results
The size of the network demands new ways to manage contacts
• CRM is an immediate priority
The technical infrastructure
Capacity of 250 terabytes
Access limited to IHME
Limited Use Access
• Restricted to named researchers
• Controlled or sensitive datasets
• Cluster for running Stata and R jobs (Sun Grid Engine)
• Largest capacity at the University of Washington
• Capacity to increase 10x for projections
IT requirements for GBD
• 12+ major databases
• 8 servers
• Cluster (Stata; R)
Primary databases for GBD
• Cod
• Covariates
• Epi
• Idie2
• mortality
• GBDviz
• Shared
• GHDx
• Risk
• GBD results
• Codmod
• GBDx2.0
– every day, all day
The Global Burden of Disease (GBD)
A systematic, scientific effort to quantify the comparative magnitude of health loss due to diseases, injuries & risk factors.
• GBD 2010 published by The Lancet
• GBD 2013 to be published in 2014
• Annual updates to follow
                         GBD 2010     GBD 2013
Diseases and injuries    291          322
Sequelae                 1,160        2,435
Risk factors             67           68
Countries                187          188
Years                    1990-2010    1990-2013
Measuring burden of diseases and injuries
Data Inputs for GBD
Population-based
• Vital registration
• Censuses
• Surveys
• Verbal autopsy
• Disease registries
• Surveillance systems
Encounter-level
• Hospital records
• Ambulatory records
• Primary care records
• Claims data
Other
• Literature reviews
• Sensor data
• Mortuaries/burial sites
• Police records
Defining analysis
Task of the analysts
• Research
• Prep data
• Write code
• Review estimates
• Interpret results
• Publish
Main components of the data model
1. Covariates
2. Mortality
3. Causes of death
4. Nonfatal health outcomes
5. Risk factors
6. YLLs / YLDs / DALYs
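The last component combines the fatal and nonfatal sides of the model: DALYs are the sum of YLLs and YLDs. The sketch below shows that arithmetic in its simplest textbook form. The function names and input numbers are invented for illustration, and the single-number inputs are a simplification; the actual GBD computation works by age, sex, cause, location, and year.

```python
def years_of_life_lost(deaths, standard_life_expectancy):
    """YLLs: deaths times the standard remaining life expectancy at age of death."""
    return deaths * standard_life_expectancy


def years_lived_with_disability(prevalent_cases, disability_weight):
    """YLDs: prevalent cases times a disability weight between 0 and 1."""
    return prevalent_cases * disability_weight


def disability_adjusted_life_years(ylls, ylds):
    """DALYs: total health loss, fatal plus nonfatal."""
    return ylls + ylds


# Invented inputs for a single cause and demographic group.
ylls = years_of_life_lost(deaths=100, standard_life_expectancy=30.0)
ylds = years_lived_with_disability(prevalent_cases=2000, disability_weight=0.25)
dalys = disability_adjusted_life_years(ylls, ylds)
print(ylls, ylds, dalys)
```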
Processes within the data model
Deliverables
All-cause mortality rates
Deaths by cause (1980-2013)
Years of life lost (YLLs)
Years lived with disability (YLDs)
Disability adjusted life years (DALYs)

188 Countries
322 Diseases and Injuries
68 Risk Factors
Men and Women
20 Age Groups
1990-2013
At least 1,000 draw calculations per estimate based on known data points and uncertainty
1.03 billion estimates
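A common way to turn a set of draws into a reportable estimate is to take the mean as the point value and the 2.5th and 97.5th percentiles as a 95% uncertainty interval. The sketch below assumes that convention and uses simulated draws from a normal distribution; the function name `summarize_draws`, the percentile rule, and the simulated inputs are illustrative assumptions, not a description of IHME's code.

```python
import random
import statistics


def summarize_draws(draws):
    """Return (mean, lower, upper) using simple rank-based 2.5/97.5 percentiles."""
    ordered = sorted(draws)
    n = len(ordered)
    # Integer arithmetic avoids floating-point surprises when picking ranks.
    lower = ordered[n * 25 // 1000]
    upper = ordered[n * 975 // 1000 - 1]
    return statistics.mean(ordered), lower, upper


# Simulated draws standing in for 1,000 model runs of one estimate.
rng = random.Random(0)
draws = [rng.gauss(100.0, 10.0) for _ in range(1000)]

mean, lower, upper = summarize_draws(draws)
print(f"{mean:.1f} ({lower:.1f} to {upper:.1f})")
```

Repeating this summary for every combination of country, cause, risk, sex, age group, and year is what makes the draw count per estimate a significant storage and compute concern.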
COOPER LIVE DEMO
http://ghdx.healthdata.org/cooper
Front End Visualizations
Purpose & Audience
• Audience
• Underlying principles
Traditional Outreach
• Publications
• Media
• Other approaches
Interactive Visualizations
• Key Uses
• Development
• Demonstrations
Audience
Communicating data for impact
Audiences and characteristics
• Casual user
• Data actor
• Data analyst
• Researcher
Granularity of data
Type of tool or visual
http://bit.ly/1mogRom
Designing for the right audience
Casual User
• Infographics
• Illustrative diagrams
• Narrative visualizations
• Press releases
Data Actor
• Reports
• Briefs
• Search tools
• Limited interactive visualizations
Analyst
• Query tools
• Exploratory visualizations
• API
Researcher
• Query tools
• Exploratory visualizations
• Data catalog – repository
• Methods
IHME outreach
• Research Articles
• Policy Reports
• Brochures
• Country Profiles
• Infographics
• Newsletters
• Presentations
• Videos
• Visualizations

[Chart: IHME Research Articles by year, 2007-2014. Data labels in extracted order: 4, 8, 10, 14, 25, 17, 27, 30]
Policy reports, articles & profiles
Infographics
News Articles
Blogging and newsletters
@IHME_UW
Video
Open Source Tools
Key uses for visualizations
1. Review input data & launch models
2. Review results
3. Obtain feedback from collaborators/experts
4. Communicate results
5. Use as presentation / teaching aid
6. Convince data owners to share data
Researchers
Different Audiences
The development process
1. Contact product owner
2. Identification of relevant audience(s)
3. Business and technical requirements
4. Creation of appropriate design
5. Development (using Agile/Scrum)
6. Testing & initial user feedback
7. Launch under embargo (journalists)
8. Public launch
9. Feedback collection
Visualizations LIVE DEMO
GBD Compare
http://vizhub.healthdata.org/gbd-compare/
GBD Cause Patterns
https://www.healthdata.org/data-visualization/gbd-cause-patterns
US Health Map
http://vizhub.healthdata.org/us-health-map/
Tobacco Burden Visualization
http://vizhub.healthdata.org/tobacco/
Millennium Development Goals
http://vizhub.healthdata.org/mdg//
Summary
• Gather and organize the data
• Utilize that information
• Inform and empower your audience
Contact me: Matthew Israelson