Data Explosion in Medical Imaging

A talk at The Fifth Elephant, a conference on Big Data in Bangalore, July 2012

Shourya Sarcar
Sr. Engineering Manager
GE Healthcare

@shouryasarcar

1820, Paris: Rene Laennec, "immediate" auscultation with the unaided ear

2010: Handheld digital ultrasound stethoscope

Imaging 101

Digital X-Ray

Computed Tomography

X-Ray technology, but in 3D !

Magnetic Resonance Imaging

• Principle of magnetic spin
• Great for soft tissue imaging

Positron Emission Tomography

PET-CT

• Functional imaging
• Radioactive bio-marker binds to cancerous cell
• Capture positron decay with a scintillation detector

Basic Imaging Chain

Digital Acquisition System (DAS) -> Image Reconstruction -> Add Patient Information

The fancier the DAS, the larger the data

How BIG is that Image Data ?

Pixel: 1024 x 1024, 16 bits per pixel, 2 MB (tchah ! That’s not much)

Header: primarily “text”, a few KB
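The 2 MB figure follows directly from the image dimensions and bit depth. A minimal sketch of the arithmetic, assuming uncompressed pixel data and ignoring the few KB of header (the function name is illustrative, not part of any toolkit):

```python
# Values taken from the slide: a 1024 x 1024 image at 16 bits per pixel.
ROWS = 1024
COLS = 1024
BITS_PER_PIXEL = 16


def pixel_data_bytes(rows: int, cols: int, bits_per_pixel: int) -> int:
    """Uncompressed pixel payload in bytes (header not included)."""
    return rows * cols * bits_per_pixel // 8


size = pixel_data_bytes(ROWS, COLS, BITS_PER_PIXEL)
print(size, "bytes =", size / 2**20, "MiB")
```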

They don’t come alone !

Images come in “stacks”, a.k.a. volumes, series, slices

[Figure: a stack of images, each with its own header and pixel data]

How BIG is that Image Data ?

                        Full body PET   CT Cardiac   fMRI
Images / set            600             3000         20000
Size of 1 set           1.2 GB          6 GB         40 GB
No. of sets (typical)   4               6            8
Exam Size               9 GB            36 GB        300 GB

Sizes are approximations
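The per-set sizes in the table can be reproduced from the image counts, assuming roughly 2 MB per image as on the earlier slide and decimal units (1 GB = 1000 MB). This is only a sketch of the arithmetic. A straight multiplication gives 4.8 GB and 320 GB per exam for PET and fMRI, which differs from the slide's rounded exam figures, and the slide itself flags its sizes as approximations.

```python
MB_PER_IMAGE = 2  # assumption: about 2 MB per image, as on the earlier slide

# (images per set, typical number of sets), as listed in the table
modalities = {
    "Full body PET": (600, 4),
    "CT Cardiac": (3000, 6),
    "fMRI": (20000, 8),
}


def set_size_gb(images_per_set: int) -> float:
    """Size of one image set in decimal GB."""
    return images_per_set * MB_PER_IMAGE / 1000


def exam_size_gb(images_per_set: int, sets: int) -> float:
    """Size of a whole exam: one set size times the number of sets."""
    return set_size_gb(images_per_set) * sets


for name, (images, sets) in modalities.items():
    print(f"{name}: {set_size_gb(images):g} GB per set, "
          f"{exam_size_gb(images, sets):g} GB per exam")
```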

How do we take that data home ?

…and who does it belong to ?

To store and share digital data, we need a format and a protocol

And there was none until 1993 !

“And the whole earth was of one language and of one speech” ~ Genesis 11

DICOM: Digital Imaging and Communications in Medicine

1985: ACR-NEMA 1.0
1988: ACR-NEMA 2.0
1993: DICOM 3.0

DICOM Scope

[Figure: scope of DICOM within medical informatics, shown alongside diagnostic imaging, patient bedside monitoring, administrative HIS/RIS, lab data, ...]

DICOM Standard

PS 3.1: Introduction and Overview

PS 3.2: Conformance

PS 3.3: Information Object Definitions

PS 3.4: Service Class Specifications

PS 3.5: Data Structure and Encoding

PS 3.6: Data Dictionary

PS 3.7: Message Exchange

PS 3.8: Network Communication Support for Message Exchange

PS 3.9: Point‑to‑Point Communication Support for Message Exchange (Retired)

PS 3.10: Media Storage and File Format for Data Interchange

PS 3.11: Media Storage Application Profiles

PS 3.12: Storage Functions and Media Formats for Data Interchange

PS 3.13: Print Management Point-to-Point Communication Support (Retired)

PS 3.14: Grayscale Standard Display Function

PS 3.15: Security Profiles

PS 3.16: Content Mapping Resource

PS 3.17: Explanatory Information

PS 3.18: Web Access to DICOM Persistent Objects

PS 3.19: Application Hosting

PS 3.20: Transformation of DICOM to and from HL7 standards

20 parts + 161 supplements

DICOM Network: where is it in the network stack?

[Figure: protocol stack, top to bottom]

Medical Imaging Application

DICOM Application Entity

OSI upper layer service boundary

TCP/IP stack: DICOM Upper Level Protocol for TCP/IP, TCP, IP

OSI stack: OSI Association Control Service Element (ACSE), OSI Presentation Kernel, OSI Session Kernel, OSI Transport, OSI Network, LLC

Standard Network Physical Layer (e.g. Ethernet, FDDI, ISDN)

The Stack has Evolved !

Once Upon a Time

How much disk space is needed for cardiac screening of Bangalore ?

Back of the envelope calculations

Population of Bangalore : 7,000,000

Cardiac Screening : 50%

HDD / CT Cardiac : 20 GB

I need some space : 70 PB
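The 70 PB figure is the product of the three numbers above. A quick sketch of the calculation, assuming one 20 GB exam per screened person and decimal units (1 PB = 1,000,000 GB):

```python
POPULATION = 7_000_000       # population of Bangalore, from the slide
SCREENED_FRACTION = 0.5      # 50% undergo cardiac screening
GB_PER_EXAM = 20             # disk space per CT cardiac exam
GB_PER_PB = 1_000_000        # decimal petabyte

exams = int(POPULATION * SCREENED_FRACTION)
total_gb = exams * GB_PER_EXAM
total_pb = total_gb / GB_PER_PB

print(f"{exams:,} exams x {GB_PER_EXAM} GB = {total_gb:,} GB = {total_pb:g} PB")
```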

[Chart: Data Stores in Tera Bytes, axis from 0 to 900; plotted values 20, 220, 312, 468, 850]

Some more size estimates

Christian Medical College Vellore: 0.5 million exams/yr, 60 TB

Clalit Healthcare Services, 14-hospital network in Israel: 4.5 million exams/yr, 250 TB (annually)

Est. imaging data size in US, 2014: 100 PB

Est. imaging data size globally, 2020: 35 ZB

Challenges of Large Imaging Data

Archival, Search, Transfer

Lawmakers demand storage guarantee

Moms: “25 years after the birth of the last child”

Mentally disabled: “20 years after the last contact or 8 years after the patient's death”

Children: “Until the patient is 25”

Storage Commitment built into DICOM

Huge Capacity requirements

• Huge CapEx
• Unpredictable TCO
• Hardware technology obsolescence
• Space
• Data-center grade infra

Simple search is a soft problem

Index only the meta-data

DICOM loves SQL

Finding the needle in the haystack

[Figure: a stack of image files, each with a header and pixel data]

SQL Tables (relevant header info)
• Power of SQL queries
• Replicate tables
• Better disaster recovery

Flat Files (complete file)
• Insertion is fast, Read is fast
• Memory-mapped IO
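The split between SQL tables and flat files can be sketched with Python's built-in sqlite3 module: only a few header fields go into a table, while the complete files stay on disk and are referenced by path. The table name, column names, and sample records below are hypothetical. In practice the header values would be parsed from the files with a DICOM toolkit.

```python
import sqlite3

# Hypothetical header records; real values would come from the image headers.
headers = [
    {"path": "exam1/img001.dcm", "patient_id": "P001", "modality": "CT", "study_date": "2012-07-01"},
    {"path": "exam1/img002.dcm", "patient_id": "P001", "modality": "CT", "study_date": "2012-07-01"},
    {"path": "exam2/img001.dcm", "patient_id": "P002", "modality": "MR", "study_date": "2012-07-02"},
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image_index ("
    "path TEXT PRIMARY KEY, patient_id TEXT, modality TEXT, study_date TEXT)"
)
conn.executemany(
    "INSERT INTO image_index VALUES (:path, :patient_id, :modality, :study_date)",
    headers,
)
# Index the column we expect to search on most often.
conn.execute("CREATE INDEX idx_patient ON image_index (patient_id)")


def find_paths(conn: sqlite3.Connection, patient_id: str, modality: str) -> list:
    """Return the flat-file paths matching a metadata query."""
    rows = conn.execute(
        "SELECT path FROM image_index "
        "WHERE patient_id = ? AND modality = ? ORDER BY path",
        (patient_id, modality),
    )
    return [row[0] for row in rows]


print(find_paths(conn, "P001", "CT"))
```

The query touches only the small metadata table. The pixel data is read from the flat file afterwards, once the path is known.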

Why move the data ?

• Offline storage [Store/Fetch]
• Reporting
• Teleradiology / Remote reporting

[Figure: Outpatient Imaging Center and Onsite Radiologist connected over the Internet to a Remote Radiologist]

Challenges of DICOM

• DICOM is based on TCP/IP
  • Slow over large number of hops
  • FileCatalyst, Cisco WAAS

• DICOM compression is not adequate
  • Lossy, Lossless

• DICOM is not efficient on fault-tolerance
  • Dated retry mechanism, transmit in sets/series, not files
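To illustrate the last point, here is a minimal sketch of what file-level retry could look like, in contrast to resending a whole set or series. It is not DICOM's mechanism or any toolkit's API: `send_series`, `send_file`, and the flaky sender are hypothetical stand-ins for a real transfer call.

```python
from typing import Callable, Dict, Iterable, List


def send_series(
    files: Iterable[str],
    send_file: Callable[[str], bool],
    max_attempts: int = 3,
) -> List[str]:
    """Send each file, retrying only the ones that failed.

    Returns the files that still failed after max_attempts passes.
    """
    pending = list(files)
    for _ in range(max_attempts):
        if not pending:
            break
        # Keep only the files whose transfer did not succeed this pass.
        pending = [f for f in pending if not send_file(f)]
    return pending


# Hypothetical sender: the second file fails on its first attempt only.
attempts: Dict[str, int] = {}


def flaky_send(path: str) -> bool:
    attempts[path] = attempts.get(path, 0) + 1
    return not (path == "img002.dcm" and attempts[path] == 1)


failed = send_series(["img001.dcm", "img002.dcm", "img003.dcm"], flaky_send)
print(failed, attempts)
```

Only the one failed file is sent a second time. The other two are transferred once.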

FOSS DICOM Tools and Images

Public Datasets

ftp://medical.nema.org/medical/dicom/DataSets/

http://www.barre.nom.fr/medical/samples/

Viewers

OsiriX for Mac
Santesoft for Win
Kradview for Linux

API

Language   Toolkit
C/C++      GDCM, DCMTK
Java       PixelMed, dcm4che
Perl       DICOM.pm
Ruby       Ruby DICOM
Python     pydicom
PHP        Nanodicom
C#         DICOM#