The MammoGrid Project Grids Architecture

Richard McClatchey
CHEP’03, San Diego, March 24th 2003

On behalf of the MammoGrid Consortium: CERN, Mirada Solutions, Univ of Oxford, Univ of Sassari & Pisa, Univ West of England, Univ Hospitals of Cambridge (Addenbrooke's) & Udine


Contents

1. The MammoGrid project objectives
2. Project challenges and philosophy
3. HEP vs distributed medical image analysis
4. The MammoGrid infrastructure
5. Implementation and current status
6. Future plans
7. Conclusions & questions


What is the Mammogrid?

• EU FP5 project to build a pan-European distributed Database of mammography images using GRID Technologies.

• Aim: To provide a demonstrator for use in epidemiological studies, quality control and validation of computer aided detection algorithms.


Mammogrid Objectives

1. To evaluate current Grids technologies and determine the requirements for Grid-compliance in a pan-European mammography database.

2. To implement the Mammogrid database, using novel Grid-compliant and Federated-Database technologies that will provide improved access to distributed data and will allow rapid deployment of software packages to operate on locally stored information.

3. To deploy enhanced versions of a standardization system that enables comparison of mammograms in terms of intrinsic tissue properties independently of scanner settings, and to explore its place in the context of medical image formats (DICOM).

4. To develop software tools to automatically extract image information that can be used to perform quality controls on the acquisition process of participating centers (e.g. average brightness, contrast).

5. To develop software tools to automatically extract tissue information that can be used to perform clinical studies (e.g. breast density, presence, number and location of micro-calcifications) in order to increase the performance of breast cancer screening programs.

6. To use the annotated information and the images in the database to benchmark the performance of the software described in points 3, 4 and 5.

7. To exploit the Mammogrid database and the algorithms to propose initial pan-European quality controls on mammographic acquisition and ultimately to provide a benchmarking system to third party algorithms.
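Objective 4 names average brightness and contrast as examples of image information that could feed acquisition quality control. The sketch below shows one way such values might be extracted from a grid of grey levels. The function name `image_quality_stats` and the choice of RMS contrast (standard deviation of the grey levels) are assumptions made for illustration; the slides do not say how MammoGrid defines contrast.

```python
from statistics import mean, pstdev


def image_quality_stats(pixels):
    """Return (average brightness, RMS contrast) for a 2-D grid of grey levels.

    RMS contrast is taken here as the population standard deviation of the
    pixel values; this definition is an assumption, not the project's.
    """
    flat = [value for row in pixels for value in row]
    return mean(flat), pstdev(flat)


# Tiny hypothetical 2x2 image; real mammograms are of the order of 4K x 4K pixels.
brightness, contrast = image_quality_stats([[100, 200], [300, 400]])
```

Values like these, computed per image at each centre, could then be compared across sites to spot acquisition drift.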


Mammogrid Philosophy

• Project concentrates on applying emerging GRID technology rather than on developing it.

• It plans to implement a ‘lightweight’ (but fully functional) GRID and study its usage in hospitals

• It will draw heavily on other Grids projects e.g. DataGrid

• It will deliver a prototype federated database of mammograms in hospitals in the UK and Italy

• It will provide rapid feedback from the Hospital community

• And will inform the next generation of HealthGrids developments


Why a Mammography Database?

• Breast cancer is a huge problem:
  – 10% of women develop breast cancer,
  – 19% of cancer deaths are due to breast cancer,
  – 24% of all cancer cases are breast cancers,
  – there are 348,000 cases in EU & USA, 50,000 die every year,
  – fortunately there is a solution.

• Early diagnosis through mammography screening improves prognosis


...but

• Quality control in acquisition, diagnosis and efficient data management is vital.

• Improving the reliability of screening and early diagnosis requires:
  – better epidemiological understanding,
  – improved diagnostic tools,
  – enhanced quality control,
  – continuous training and
  – efficient management of data and records.

• A way to achieve the above is through repositories of mammography data for research and training that contain sufficiently large statistical samples, e.g.
  – Mammogrid-EU,
  – NDMA-US,
  – eDIAMonD-UK (Mirada, IBM, Oxford, Edin., KCL, UCL),
  – GPCalma-Italy


The Mammogrid Challenge

• Building this repository is not trivial because:
  – Large numbers of exemplars are required.
  – Cases must be obtained from many geographically remote locations.
  – Data itself is large: 2 breasts × 2 views × 4K × 4K pix × 2 bytes = 128 Mbyte per patient per visit, 3M women per year UK, ~ 400 Terabytes in UK alone.
  – Acquisition is highly variable: the same image may look different depending on machine and parameters. How do you compare?
  – Patient privacy and data security are key.
  – Many relevant items of metadata.
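The data-volume estimate on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes that "4K" means 4096 pixels per side and that each of the 3M women has one visit per year; neither assumption is stated on the slide.

```python
BYTES_PER_PIXEL = 2
PIXELS_PER_SIDE = 4 * 1024          # "4K" read as 4096 (assumption)
IMAGES_PER_VISIT = 2 * 2            # 2 breasts x 2 views

# Storage for one patient visit.
bytes_per_visit = IMAGES_PER_VISIT * PIXELS_PER_SIDE ** 2 * BYTES_PER_PIXEL
mbytes_per_visit = bytes_per_visit / 2 ** 20      # binary megabytes

# Storage for one year of UK screening, one visit per woman (assumption).
WOMEN_PER_YEAR_UK = 3_000_000
bytes_per_year = WOMEN_PER_YEAR_UK * bytes_per_visit
tbytes_per_year = bytes_per_year / 1e12           # decimal terabytes

print(f"{mbytes_per_visit:.0f} Mbyte per visit, {tbytes_per_year:.0f} Tbyte per year")
```

Under these assumptions the per-visit figure comes out at exactly 128 Mbyte and the yearly figure at roughly 400 Terabytes, matching the order of magnitude quoted above.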


A GRID Infrastructure is ideal

• The databases used to statistically validate image-based clinical hypotheses are:
  – Populated by large number of cases
  – Contain large files (1 mammogram 10Mb+)
  – Geographically distributed repositories
  – Heterogeneous database formats
  – Need to be accessible to co-workers

• Development and validation of medical image analysis solutions demands:
  – Computationally expensive simulations.
  – Repeated runs for optimal parameter tuning.
  – Statistical test rigs.
  – Remote execution and maintenance

• Services (e.g. security) must be system-resident, invisible, generic


High Energy Physics vs. Mammogrid

• Mammogrid relies heavily on technologies developed primarily in the field of high energy physics.
  – Similarities
    • Large number of big files
    • Files can be sensibly organized in directory tree
    • Need to replicate and move file copies between sites
    • Need to execute commands on the node which hosts data locally
  – Difficulties
    • Complexity of co-working in medical environment
    • Lack of trained IT personnel
  – Confidentiality


Federated System Solution

[Figure: federated system. Nodes: Hospital Italy, Hospital UK, Healthcare Institute, University Database, each with its own Local Query and Local Analysis, connected through the GRID. Clinician’s Workstations send a Query and receive a Result. Data labels: Shared meta-data, Analysis-specific data.]

• Knowledge is stored alongside data
• Active (meta-)objects manage various versions of data and algorithms
• Small network bandwidth required

Massively distributed data AND distributed analyses
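The federated idea on this slide, where queries and analyses run locally at each site and only small results cross the network, can be sketched as follows. The `Site` class, the `federated_query` function and the record fields are hypothetical names invented for this illustration; they are not part of MammoGrid or of any Grid middleware.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One node of the federation; its records never leave the site."""
    name: str
    records: list = field(default_factory=list)

    def local_query(self, predicate):
        # The predicate runs next to the data; only matching IDs are returned.
        return [record["id"] for record in self.records if predicate(record)]


def federated_query(sites, predicate):
    """Fan the same query out to every site and merge the small result sets."""
    return {site.name: site.local_query(predicate) for site in sites}


# Hypothetical records with an invented "density" attribute.
sites = [
    Site("Hospital UK", [{"id": "uk-1", "density": 0.7},
                         {"id": "uk-2", "density": 0.2}]),
    Site("Hospital Italy", [{"id": "it-1", "density": 0.9}]),
]
result = federated_query(sites, lambda record: record["density"] > 0.5)
```

Because only identifiers travel back to the clinician's workstation, the bandwidth needed stays small even when the images themselves are large.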


Mammogrid Implementation

[Figure: work-package diagram. Activity boxes: Use case/validation; User Req’s & Specs; GRID/DB infrastructure; H/W local node implem.; Standardisation S/W; Application S/W; Integration test bed; Dissemination & Exploitation; Project Management. Work-package labels: WP 1 - CERN (Vitamib); WP 2 - CERN/UWE, Hospitals; WP 3 - CERN/UWE; WP 4 - Mirada; WP 5 - CERN; WP 6 - Mirada; WP 7&8 - Oxford, Pisa/Sassari; WP 9&10 - Cambridge, Udine; WP 11 - All. Flow labels: specifications; information infrastructure.]


MammoGram Analysis Use-Case

[Figure: UML use-case diagram. Actor: Mammogram Analyst. Use cases: Perform Radiological Analysis, View Patient Details (from Maintain Patient Basic Details), View Mammogram Image, Annotate Mammogram Images, Define Queries, Execute Radiological Queries, Run CAD Software, Obtain User Authorization. Use cases are linked by <<include>> and <<extend>> relationships.]

Example Use-Case: Mammogram Analysis

• View and Annotate Images
• Run CAD
• Execute Queries


MammoGrid Data Structures

[Figure: data-structure diagram, summarised below.]

• Patient: Name, Date of Birth, Age at Menopause, Age at Menarche, Place of Birth, Ethnic Group, Nationality
  – Medical History Entry (several per patient)
  – Patient Study (several per patient)
• Patient Study: Date, Time, Description, Weight, Symptoms
  – MR Series, with Equipment
  – Sonography Series, with Equipment
  – Mammography Series, with Equipment
• Mammography Series
  – Equipment: X-ray machine, Film Processor, Digitiser
  – Mammography Image (several per series)
• Mammography Image: Laterality (Right/Left), Implant present?, Modality (CC/MLO), Exposure kVp, Exposure mAs, Breast Thickness, AEC Position, Exposure Comments

Database Entities:
• Hospitals
• Users (Radiologists)
• Equipment
• Patients
• Studies
• Series
• Images
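The Patient, Study, Series, Image hierarchy shown on this slide maps naturally onto nested record types. The sketch below uses Python dataclasses with a subset of the attributes listed in the diagram. The class and field names, the string-typed dates and the helper `count_images` are assumptions made for illustration, not the project's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MammographyImage:
    laterality: str                        # "Right" or "Left"
    view: str                              # "CC" or "MLO" (the slide labels this "Modality")
    implant_present: bool = False
    exposure_kvp: Optional[float] = None
    exposure_mas: Optional[float] = None
    breast_thickness: Optional[float] = None


@dataclass
class MammographySeries:
    equipment: str                         # e.g. X-ray machine, film processor, digitiser
    images: List[MammographyImage] = field(default_factory=list)


@dataclass
class PatientStudy:
    date: str
    description: str = ""
    series: List[MammographySeries] = field(default_factory=list)


@dataclass
class Patient:
    name: str
    date_of_birth: str
    studies: List[PatientStudy] = field(default_factory=list)


def count_images(patient: Patient) -> int:
    """Walk the hierarchy and count every image stored for a patient."""
    return sum(len(s.images) for study in patient.studies for s in study.series)


# One hypothetical visit: 2 breasts x 2 views.
images = [MammographyImage(side, view)
          for side in ("Left", "Right") for view in ("CC", "MLO")]
patient = Patient("Jane Doe", "1950-01-01",
                  [PatientStudy("2003-03-24", series=[MammographySeries("digitiser", images)])])
total = count_images(patient)
```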


MammoGrams & Annotation

[Figure: the Patient / Patient Study / Series / Mammography Image data-structure diagram, extended with annotations attached to each Mammography Image.]

• Mammography Image: Laterality (Right/Left), Implant present?, Modality (CC/MLO), Exposure kVp, Exposure mAs, Breast Thickness, AEC Position, Exposure Comments
• Mammography Image Annotation: Features, Size of Features, Feature properties, Malignancy, Biopsy Proven?, Comments


Main Deliverables/milestones

• User Requirements Specification and Technical System Specification (months 3, 6)

• Prototype GRID-compliant database and information infrastructure (first release m. 18, final rel. m. 36)

• Packaged medical imaging workstation with interface to GRID, secure GRID box (month 12)

• Grid compliant SMF software (month 12)

• Application software (months 12, 24, 36)

• Clinical Trial results (months 24, 36)


Local Site Architecture

[Figure: local site architecture. Components: Digitizer; MAS (Mirada Acquisition System); Mirada Workstation and other Workstations; Local Cache; DICOM Server; Information Service; File Transfer Daemon; Web Services; Mammogrid Database; GRID: Mammogrid – AliEn, with the AliEn Database (PFNs) and the AliEn File Catalogue (LFNs). Labelled flows: the workstation sends DICOM files; SOAP messages; Read / Write operations.]

• DICOM File:
  – Description Inf.
  – Image
• Object: Patient
  – Patient Personal Information,
  – Additional Information,
  – …


Clinician to Data

[Figure: path from clinician to data. Components: Clinician; Mirada Workstation; Client Frontend; SOAP; Mammogrid Server; DICOM Server; Grid Server; . . .]


MammoGrid AliEn Prototype

[Figure: AliEn prototype. Components: Mirada-AliEn Interface; Perl SOAP Server; AliEn Catalogue with branches cern, cambridge and udine.]

• The catalogue is divided into several databases, which can be distributed.

• The catalogue keeps the LFN-PFN mapping and the metadata.
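The role of a catalogue that keeps the LFN-PFN mapping and the metadata can be illustrated with a small in-memory sketch: a logical file name (LFN) identifies a file independently of location, while each physical file name (PFN) points to one stored replica. The `FileCatalogue` class, its methods and all paths and host names below are hypothetical and written for this illustration only; this is not the AliEn API.

```python
class FileCatalogue:
    """Toy catalogue: one entry per LFN, holding replica PFNs and metadata."""

    def __init__(self):
        self._entries = {}  # LFN -> {"pfns": [...], "metadata": {...}}

    def register(self, lfn, pfn, **metadata):
        """Record a replica of a logical file and merge in any metadata."""
        entry = self._entries.setdefault(lfn, {"pfns": [], "metadata": {}})
        if pfn not in entry["pfns"]:
            entry["pfns"].append(pfn)
        entry["metadata"].update(metadata)

    def replicas(self, lfn):
        """Return every physical location known for a logical file."""
        return list(self._entries[lfn]["pfns"])

    def find(self, **criteria):
        """Return the LFNs whose metadata match all given key/value pairs."""
        return sorted(
            lfn for lfn, entry in self._entries.items()
            if all(entry["metadata"].get(k) == v for k, v in criteria.items())
        )


catalogue = FileCatalogue()
lfn = "/mammogrid/cambridge/patient-0001/image-1.dcm"        # invented path
catalogue.register(lfn, "file://gridbox.cambridge.example/data/0001.dcm",
                   laterality="Left", view="CC")
catalogue.register(lfn, "file://gridbox.udine.example/data/0001.dcm")  # a replica
locations = catalogue.replicas(lfn)
matches = catalogue.find(laterality="Left")
```

Keeping metadata next to the mapping means a query can be resolved against the catalogue first, and only the files that match need to be located or transferred.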


Interaction Diagram

[Figure: interaction diagram between the Mirada workstation and the GRID environment, summarised below.]

• Case: READ
  – Participants: Mirada WST, IS (Information Service), FTD (File Transfer Daemon), Mammogrid - AliEn
  – Labels: SOAP Messages, Query, Result Set, Negotiation, File Catalogue Reads, File Handle
• Case: WRITE
  – Participants: Mirada WST, DICOM Server, FTD (File Transfer Daemon), Mammogrid - AliEn
  – Labels: Push(DICOM File), Negotiation, File Catalogue Updates
• GRID Environment: AliEn Service, Mammogrid Service, File Catalogue


Current Hardware Setup

Gridbox specifications:

• 2x Intel Xeon processors
• 2 GB DDR 200/266 MHz
• Redundant Power Supply
• 2x 20 GB IDE HDD (7200 rpm) UDMA
• RAID-1 IDE adapter
• 360 GB usable, RAID-1
• Ethernet network adapter 10/100 Mb/s
• Gigabit network adapter


Conclusions

• Distributed Health informatics is an important application area for Grids technologies – HealthGrid

• Many similarities with High Energy Physics

• Need rapid feedback from the user community – MammoGrid user requirements specified BUT

• Effective Grid deployment needed now and

• Many open questions e.g.:
  – How to resolve distributed queries?
  – What role for meta-data?
  – How to maintain secure, reliable data?

• MammoGrid: First results expected late 2003