39
Spa$otemporal Sensor Integra$on, Analysis, Classifica$on or Can Exascale Cure Cancer?

Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Embed Size (px)

DESCRIPTION

Presentation at Clusters, Clouds and Data for Scientific Computing 2014 Integrative analyses of large scale spatio-temporal datasets play increasingly important roles in many areas of science and engineering. Our recent work in this area is motivated by application scenarios involving complementary digital microscopy, radiology and “omic” analyses in cancer research. In these scenarios, the objective is to use a coordinated set of image analysis, feature extraction and machine learning methods to predict disease progression and to aid in targeting new therapies. I will describe tools and methods our group has developed for extraction, management, and analysis of features along with the systems software methods for optimizing execution on high end CPU/GPU platforms. Once having provided our current work as an introduction, I will then describe 1) related but much more ambitious exascale biomedical and non-biomedical use cases that also involve the complex interplay between multi-scale structure and molecular mechanism and 2) concepts and requirements for methods and tools that address these challenges.

Citation preview

Page 1: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Spa$o-­‐temporal  Sensor  Integra$on,  Analysis,  Classifica$on  

or    

Can  Exascale  Cure  Cancer?  

Page 2: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

EXASCALE  CHALLENGES  IN  INTEGRATIVE  MULTI-­‐SCALE  SPATIO-­‐TEMPORAL  ANALYSIS  

Joel  Saltz  Chair  Biomedical  InformaIcs  Stony  Brook  University  

Page 3: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

“Domain”:  Spa$o-­‐temporal  Sensor  Integra$on,  Analysis,  Classifica$on    Big  Data  Extreme  Compu$ng  2014  

•  Mul$-­‐scale    material/$ssue  structural,  molecular,  func$onal  characteriza$on.    Design  of  materials  with  specific  structural,  energy  storage    proper$es,  brain,  regenera$ve  medicine,  cancer  

•  Integra$ve  mul$-­‐scale  analyses  of  the  earth,  oceans,  atmosphere,  ci$es,  vegeta$on  etc    –  cameras  and  sensors  on  satellites,  aircraO,  drones,  land  vehicles,  sta$onary  cameras  

•  Digital  astronomy    •  Hydrocarbon  explora$on,  exploita$on,  pollu$on  remedia$on  

Page 4: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

•  Aerospace  –  wind  tunnels,  acquisi$on  of  data  during  flight  

•  Solid  prin$ng  integra$ve  data  analyses  •  Autonomous  vehicles,  e.g.  self  driving  cars  •  Data  generated  by  numerical  simula$on  codes  –  PDEs,  par$cle  methods  

 

Page 5: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Typical  Computa$onal/Analysis  Tasks    Spa$o-­‐temporal  Sensor  Integra$on,  Analysis,  Classifica$on  

•  Data  Cleaning  and  Low  Level  Transforma$ons  •  Data  SubseWng,  Filtering,  Subsampling  •  Spa$o-­‐temporal  Mapping  and  Registra$on  •  Object  Segmenta$on    •  Feature  Extrac$on  •  Object/Region/Feature  Classifica$on  •  Spa$o-­‐temporal  Aggrega$on  •  Diffeomorphism  type  mapping  methods  (e.g.  op$mal  mass  transport)  

•  Par$cle  filtering/predic$on  •  Change  Detec$on,  Comparison,  and  Quan$fica$on  

Page 6: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?
Page 7: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Integra$ve  Analysis:  OSU  BISTI  NBIB  Center  

Big  Data  (2005)    Associate  genotype  with  phenotype  Big  science  experiments  on  cancer,  

heart  disease,  pathogen  host  response  Tissue  specimen  -­‐-­‐    1  cm3  

0.1  μ  resolu$on  –      roughly  1015  bytes  

Molecular  data  (spa$al  loca$on)  can  add  addi$onal  significant    factor;  e.g.  102  Mul$spectral  imaging,  laser  

captured  microdissec$on,  Imaging  Mass  Spec,  Mul$plex  QD  

Mul$ple  $ssue  specimens;  another  factor  of  103  

 

Total:  1020  bytes  -­‐-­‐  100    exabytes    per  big  science  experiment      

Page 8: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

The Tyranny of Scale (Oil Reservoir Management

Tinsley Oden - U Texas) process scale

field scale

km

cm

simulation scale

mm

pore scale

Page 9: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Why  Applica$ons  Get  Big  •  Physical  world  or  simula$on  results  •  Detailed  descrip$on  of  two,  three  (or  more)  dimensional  space  

•  High  resolu$on  in  each  dimension,  lots  of  $mesteps  

•  e.g.  oil  reservoir  code    -­‐-­‐  simulate  100  km  by  100  km  region  to  1  km  depth  at  resolu$on  of  100  cm:      

– 10^6*10^6*10^4  mesh  points,  10^2  bytes  per  mesh  point,  10^6  $mesteps  -­‐-­‐-­‐  10^24  bytes  (Yo1abyte)  of  data!!!  

Page 10: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Center  for  Mul$  Scale  Cancer  Informa$cs  (Sept  2014)  

•  Stony  Brook  •  Oak  Ridge  Na$onal  Labs  •  Emory  •  Yale  

•  Cancer  Research  meets  HPC,  Material  Science,  “omics”  

•  Vector  Valued  “omics”  

Page 11: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Cen

ter

for

Com

preh

ensi

ve I

nfor

mat

ics

Page 12: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Integrative Cancer Research with Digital Pathology

histology neuroimaging

clincal\pathology

Integrated Analysis

molecular

High-resolution whole-slide microscopy

Multiplex IHC

Page 13: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Direct  Study  of  Rela$onship  Between    vs    

Lee Cooper, Carlos Moreno

Page 14: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Clustering identifies three morphological groups •  Analyzed 200 million nuclei from 162 TCGA GBMs (462 slides)

•  Named for functions of associated genes: Cell Cycle (CC), Chromatin Modification (CM), Protein Biosynthesis (PB)

•  Prognostically-significant (logrank p=4.5e-4)

Feat

ure

Indi

ces

CC CM PB

10

20

30

40

500 500 1000 1500 2000 2500 3000

0

0.2

0.4

0.6

0.8

1

Days

Surv

ival

CCCMPB

Page 15: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

VASARI Feature Set

Page 16: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Complex  image  analysis,  feature  extrac$on,  machine  learning  pipelines  SpaIo-­‐temporal  Sensor  IntegraIon,  Analysis,  

ClassificaIon  

Page 17: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Programming  Tools    •  Mul$  level  computa$onal  pipeline  management  

•  Region  Templates  –  abstrac$on  for  mul$-­‐scale  spa$o-­‐temporal  computa$ons  

•  “Domain”  specific  language  

Page 18: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Database  Tools  (Fusheng  Wang)  Spa$al  Queries  and  Analy$cs  

Introduction

•  Feature based descriptive queries –  Feature based filtering or feature aggregation –  Spatial relationship based queries –  Spatial join (two- or multi-), window, point-in-

polygon –  Polygon overlay or spatial cross-matching –  Distance based queries –  Nearest neighbors

•  Spatial analytics –  Density based spatial patterns: find clusters,

hotspots, and anomalies –  Spatial relationship modeling, e.g., geographically

weighted regression model(GWR)

Page 19: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Vector  Valued  “omics”  Data  Scale  

 

Page 20: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Tumor Heterogeneity

Marusyk 2012

Page 21: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Whole  Slide  Imaging:  Scale  

Page 22: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Nature News and Comment - $1000 Genome

Page 23: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Brian Owens

Page 24: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Epigene$cs  

Genetic Science Learning Center - Utah

Page 25: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

University of Utah Epigenetics Training Site

Page 26: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Spanish National Cancer Research Center

Page 27: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?
Page 28: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

EMR Data Analytics: Tools for Clinical Phenotyping and Population Health

Patient History

Physical Exams

Chemistries

Hematology

CT Scans

MRI Scans

PET Scans

Path specimens

Genomics

Proteomics

Metabalomics

Lipidomics

Integrative Predictive Analytics

Page 29: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Cen

ter

for

Com

preh

ensi

ve I

nfor

mat

ics

•  Example Project: Find hot spots in readmissions within 30 days –  What fraction of patients with a given principal diagnosis will

be readmitted within 30 days? –  What fraction of patients with a given set of diseases will be

readmitted within 30 days? –  How does severity and time course of co-morbidities affect

readmissions? –  Geographic analyses

•  Compare and contrast with UHC Clinical Data Base –  Repeat analyses across all UHC hospitals –  Are we performing the same? –  How are UHC-curated groupings of patients (e.g., product

lines) useful?

Clinical Phenotype Characterization and the Emory Analytic Information Warehouse

Andrew Post, Sharath Cholleti, Doris Gao, Michel Monsour, Himanshu Rathod

Page 30: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Cen

ter

for

Com

preh

ensi

ve I

nfor

mat

ics

30-Day Readmission Rates for Derived Variables Emory Health Care

Page 31: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Cen

ter

for

Com

preh

ensi

ve I

nfor

mat

ics

Geographic Analyses UHC Medicine General Product Line (#15)

Analytic Information Warehouse

Page 32: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Cen

ter

for

Com

preh

ensi

ve I

nfor

mat

ics

Predictive Modeling for Readmission

•  Random forests (ensemble of decision trees) –  Create a decision tree using a random subset of the

variables in the dataset –  Generate a large number of such trees –  All trees vote to classify each test example in a

training dataset –  Generate a patient-specific readmission risk for each

encounter •  Rank the encounters by risk for a subsequent 30-

day readmission

Sharath Cholleti

Page 33: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Cen

ter

for

Com

preh

ensi

ve I

nfor

mat

ics

Predictive Modeling for 180 UHC Hospitals, 35 Million Patients Identify High Risk Patients! Readmission fraction of top 10% high risk patients

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1 9 17

25

33

41

49

57

65

73

81

89

97

105

113

121

129

137

145

153

161

169

177

185

All Hospital Model

Individual Hospital Model

Page 34: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

DSRIP  

Delivery  System  Reform  Incen$ve  Payment  (DSRIP)  Program  

Page 35: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

What  is  DSRIP?  

•  8  billion  dollar  grant  from  CMS  to  NY  State  – 25%  reduc$on  over  five  years  in  avoidable  hospitaliza$ons  and  ER  visits  in  the  Medicaid  and  uninsured  popula$on  

– Collabora$ve  effort  to  implement  innova$ve  projects  focused  on  

•  System  transforma$on  •  Clinical  improvement  •  Popula$on  health  improvement    

Page 36: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

5  YEAR  GOALS  

•  Create  integrated  care  delivery  system  anchored  by  safety  net  providers    

•  Engage  partners  across  the  care  delivery  spectrum  to    create  a  county  wide  network  of  care  

•  AOer  five  years  transi$on  this  network  to  an  ACO  which  will  contract  with  insurance  providers  on  an  at  risk  basis  

Page 37: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

The  projects  

•  The  chosen  projects  must  address  the  most  significant  healthcare  issues  in  the  Suffolk  County  Medicaid  and  uninsured  popula$on  and  address  healthcare  dispari$es—some  examples  –  Behavioral  health:  BH  and  primary  care  integra$on    – Adults:  COPD,  diabetes,  HTN,  renal  failure  –  Children:  Asthma  – Hi  risk  OB/neonates—Esp.  Hispanic  and  African  American  communi$es  

Page 38: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

DATA  ANALYTICS  

•  County  wide  healthcare  data  will  be  collected  •  Near  real-­‐$me  data  analy$cs  will  be  used  to  drive  healthcare  improvements    – Analyzing  success  and  failures  to  create  fast  turn-­‐around  improvement  opportuni$es  

– Analyze  trends  in  disease  and  wellness  popula$on  wide  

– Con$nuous  analysis  of  outcomes  •  Testbed  for  Machine  Learning  

Page 39: Spatio-‐temporal Sensor Integration, Analysis, Classification or Can Exascale Cure Cancer?

Conclusions    

•  Major  applica$on  areas    –  Exascale++  –  Impact  –  “cure  cancer”  

•  “Domains”  –  Spa$o-­‐temporal  Sensor  Integra$on,  Analysis,  Classifica$on    

–  Integra$ve  Predic$ve  Analy$cs  

•  Agile  extreme  scale  compu$ng