The MammoGrid Project Grids Architecture
Richard McClatchey
CHEP’03, San Diego, March 24th 2003
On behalf of the MammoGrid Consortium: CERN, Mirada Solutions, Univ of Oxford, Univ of Sassari & Pisa, Univ West of England, Univ Hospitals of Cambridge (Addenbrooke’s) & Udine
R. McClatchey, CHEP’03 San Diego March 2003
Contents
1. The MammoGrid project objectives
2. Project challenges and philosophy
3. HEP vs distributed medical image analysis
4. The MammoGrid infrastructure
5. Implementation and current status
6. Future plans
7. Conclusions & questions
What is the Mammogrid?
• EU FP5 project to build a pan-European distributed Database of mammography images using GRID Technologies.
• Aim: To provide a demonstrator for use in epidemiological studies, quality control and validation of computer aided detection algorithms.
Mammogrid Objectives
1. To evaluate current Grids technologies and determine the requirements for Grid-compliance in a pan-European mammography database.
2. To implement the Mammogrid database, using novel Grid-compliant and Federated-Database technologies that will provide improved access to distributed data and will allow rapid deployment of software packages to operate on locally stored information.
3. To deploy enhanced versions of a standardization system that enables comparison of mammograms in terms of intrinsic tissue properties independently of scanner settings, and to explore its place in the context of medical image formats (DICOM).
4. To develop software tools to automatically extract image information that can be used to perform quality controls on the acquisition process of participating centers (e.g. average brightness, contrast).
5. To develop software tools to automatically extract tissue information that can be used to perform clinical studies (e.g. breast density, presence, number and location of micro-calcifications) in order to increase the performance of breast cancer screening programs.
6. To use the annotated information and the images in the database to benchmark the performance of the software described in points 3, 4 and 5.
7. To exploit the Mammogrid database and the algorithms to propose initial pan-European quality controls on mammographic acquisition and ultimately to provide a benchmarking system to third party algorithms.
Mammogrid Philosophy
• Project concentrates on applying emerging GRID technology rather than on developing it.
• It plans to implement a ‘lightweight’ (but fully functional) GRID and study its usage in hospitals
• It will draw heavily on other Grids projects e.g. DataGrid
• It will deliver a prototype federated database of mammograms in hospitals in the UK and Italy
• It will provide rapid feedback from the Hospital community
• And will inform the next generation of HealthGrids developments
Why a Mammography Database?
• Breast cancer is a huge problem:
– 10% of women develop breast cancer,
– 19% of cancer deaths are due to breast cancer,
– 24% of all cancer cases are breast cancers,
– there are 348,000 cases in EU & USA, 50,000 die every year,
– fortunately there is a solution.
• Early diagnosis through mammography screening improves prognosis
...but
• Quality control in acquisition, diagnosis and efficient data management is vital.
• Improving the reliability of screening and early diagnosis requires:
– better epidemiological understanding,
– improved diagnostic tools,
– enhanced quality control,
– continuous training and
– efficient management of data and records.
• A way to achieve the above is through repositories of mammography data for research and training that contain sufficiently large statistical samples e.g.
– Mammogrid-EU,
– NDMA-US,
– eDIAMonD-UK (Mirada, IBM, Oxford, Edin. KCL, UCL)
– GPCalma-Italy
The Mammogrid Challenge
• Building this repository is not trivial because:
– Large numbers of exemplars are required.
– Cases must be obtained from many geographically remote locations.
– Data itself is large: 2 breasts × 2 views × 4K × 4K pix × 2 bytes = 128 Mbyte per patient per visit; 3M women per year in the UK gives ~400 Terabytes in the UK alone.
– Acquisition is highly variable: the same image may look different depending on machine and parameters. How do you compare?
– Patient privacy and data security are key.
– Many relevant items of metadata.
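The data-volume figure on this slide can be checked with a few lines of arithmetic. The sketch below assumes that "4K" means 4096 pixels per side and that the yearly total is quoted in decimal terabytes; neither is stated on the slide.

```python
# Back-of-envelope check of the data volume quoted on the slide.
BREASTS = 2
VIEWS = 2
PIXELS_PER_SIDE = 4 * 1024       # assumption: "4K" = 4096 pixels
BYTES_PER_PIXEL = 2
WOMEN_PER_YEAR_UK = 3_000_000    # "3M women per year UK"


def bytes_per_visit() -> int:
    """Raw image bytes for one patient visit (4 images)."""
    return BREASTS * VIEWS * PIXELS_PER_SIDE ** 2 * BYTES_PER_PIXEL


def uk_terabytes_per_year() -> float:
    """Yearly UK volume in decimal terabytes (assumption: 1 TB = 1e12 bytes)."""
    return bytes_per_visit() * WOMEN_PER_YEAR_UK / 1e12


if __name__ == "__main__":
    print(bytes_per_visit() // 2**20, "MiB per patient per visit")
    print(round(uk_terabytes_per_year()), "TB per year in the UK")
```

Under these assumptions one visit is 128 MiB and the yearly UK volume comes out at roughly 400 TB, in line with the slide.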
A GRID Infrastructure is ideal
• The databases used to statistically validate image-based clinical hypotheses:
– Are populated by a large number of cases
– Contain large files (1 mammogram 10Mb+)
– Are geographically distributed repositories
– Have heterogeneous database formats
– Need to be accessible to co-workers
• Development and validation of medical image analysis solutions demands:
– Computationally expensive simulations.
– Repeated runs for optimal parameter tuning.
– Statistical test rigs.
– Remote execution and maintenance
• Services (e.g. security) must be system-resident, invisible, generic
High Energy Physics vs. Mammogrid
• Mammogrid heavily relies on technologies developed primarily in the field of high energy physics.
– Similarities
• Large number of big files
• Files can be sensibly organized in directory tree
• Need to replicate and move file copies between sites
• Need to execute commands on the node which hosts data locally
– Difficulties
• Complexity of co-working in medical environment
• Lack of trained IT personnel
– Confidentiality
Federated System Solution
[Figure: federated system. Labels: Hospital Italy, Healthcare Institute, University Database, Hospital UK, each with Local Query and Local Analysis; Shared meta-data; Analysis-specific data; GRID; Clinician’s Workstations exchanging Query and Result.]
• Knowledge is stored alongside data
• Active (meta-)objects manage various versions of data and algorithms
• Small network bandwidth required
• Massively distributed data AND distributed analyses
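The idea of local queries and local analyses can be illustrated with a toy sketch: each site filters and analyses its own records, and only a small summary crosses the network. Everything here (the `Site` class, the record fields, the `federated_count` helper) is hypothetical and only illustrates the pattern; it is not the MammoGrid implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Site:
    """A hypothetical participating site that holds its records locally."""
    name: str
    records: List[dict]  # stays at the site; never shipped as a whole

    def local_query(self, predicate: Callable[[dict], bool]) -> List[dict]:
        # Select matching records without moving them off-site.
        return [r for r in self.records if predicate(r)]

    def local_analysis(self, predicate, analyse):
        # Run the analysis next to the data and return only its result.
        return analyse(self.local_query(predicate))


def federated_count(sites: List[Site], predicate) -> Dict[str, int]:
    """Fan the query out to every site and collect one number per site."""
    return {s.name: s.local_analysis(predicate, len) for s in sites}


if __name__ == "__main__":
    sites = [
        Site("Hospital UK", [{"laterality": "Left"}, {"laterality": "Right"}]),
        Site("Hospital Italy", [{"laterality": "Left"}]),
    ]
    print(federated_count(sites, lambda r: r["laterality"] == "Left"))
```

Because only the per-site counts travel back to the clinician's workstation, the bandwidth needed is small compared with moving the images themselves.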
Mammogrid Implementation
[Figure: work-package diagram. Activities: Use case/validation; User Req’s & Specs; GRID/DB infrastructure; H/W local node implem.; Standardisation S/W; Application S/W; Integration test bed; Dissemination & Exploitation; Project Management. Work packages: WP 1 - CERN (Vitamib); WP 2 - CERN/UWE, Hospitals; WP 3 - CERN/UWE; WP 4 - Mirada; WP 5 - CERN; WP 6 - Mirada; WP 7&8 - Oxford, Pisa/Sassari; WP 9&10 - Cambridge, Udine; WP 11 - All. Flows: specifications; information infrastructure.]
MammoGram Analysis Use-Case
[Figure: UML use-case diagram. Actor: Mammogram Analyst. Use cases: Perform Radiological Analysis; View Patient Details (from Maintain Patient Basic Details); View Mammogram Image; Annotate Mammogram Images; Execute Radiological Queries; Define Queries; Run CAD Software; Obtain User Authorization. The use cases are linked by <<include>> and <<extend>> relationships.]
Example Use-Case: Mammogram Analysis
• View and Annotate Images
• Run CAD
• Execute Queries
MammoGrid Data Structures
[Figure: data-structure diagram, summarised below.]
• Patient: Name, Date of Birth, Age at Menopause, Age at Menarche, Place of Birth, Ethnic Group, Nationality
– Medical History Entry (several per patient)
– Patient Study (several per patient): Date, Time, Description, Weight, Symptoms
• Patient Study: MR Series, Sonography Series, Mammography Series, each with its Equipment
• Mammography Series: Equipment (X-ray machine, Film Processor, Digitiser) and Mammography Images
• Mammography Image: Laterality (Right/Left), Implant present?, Modality (CC/MLO), Exposure KvP, Exposure MAS, Breast Thickness, AEC Position, Exposure Comments
Database Entities:
• Hospitals
• Users (Radiologists)
• Equipment
• Patients
• Studies
• Series
• Images
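The Patient / Study / Series / Image hierarchy on this slide can be sketched as nested record types. The class and field names below follow the slide labels, but the types, defaults and the `count_images` helper are assumptions made for illustration; this is not the project's schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MammographyImage:
    laterality: str                       # "Right" or "Left"
    modality: str                         # "CC" or "MLO", as labelled on the slide
    implant_present: bool = False
    exposure_kvp: Optional[float] = None
    exposure_mas: Optional[float] = None
    breast_thickness: Optional[float] = None  # units not given on the slide


@dataclass
class MammographySeries:
    equipment: List[str] = field(default_factory=list)  # e.g. X-ray machine, digitiser
    images: List[MammographyImage] = field(default_factory=list)


@dataclass
class PatientStudy:
    date: str
    description: str = ""
    series: List[MammographySeries] = field(default_factory=list)


@dataclass
class Patient:
    name: str
    date_of_birth: str
    studies: List[PatientStudy] = field(default_factory=list)


def count_images(patient: Patient) -> int:
    """Walk Patient -> Study -> Series -> Image and count the images."""
    return sum(len(s.images) for st in patient.studies for s in st.series)
```

A single screening visit with two views of each breast would then be one `PatientStudy` holding one `MammographySeries` with four `MammographyImage` entries.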
MammoGrams & Annotation
[Figure: the same Patient / Patient Study / Series / Equipment / Mammography Image diagram as on the Data Structures slide, extended with an Annotation attached to the Mammography Image (Mammogram).]
• Mammography Image: Laterality (Right/Left), Implant present?, Modality (CC/MLO), Exposure KvP, Exposure MAS, Breast Thickness, AEC Position, Exposure Comments
• Mammography Image Annotation: Features, Size of Features, Feature properties, Malignancy, Biopsy Proven?, Comments
Main Deliverables/milestones
• User Requirements Specification and Technical System Specification (months 3, 6)
• Prototype GRID-compliant database and information infrastructure (first release m. 18, final rel. m. 36)
• Packaged medical imaging workstation with interface to GRID, secure GRID box (month 12)
• Grid compliant SMF software (month 12)
• Application software (months 12, 24, 36)
• Clinical Trial results (months 24, 36)
Overall Grids Architecture
[Figure: overall architecture. Labels: GRID VPN Network; Central File Catalogue; AliEn Backup; Cambridge Site; GridBox (four boxes); AliEn Data; File Cat. Replica; Mammogrid Data; Mammogrid Data Backup; High Security Level; Mirada WST (MAS); Workstations; Data replication.]
Local Site Architecture
[Figure: local site architecture. Labels: GRID: Mammogrid – AliEn; Mammogrid Database; AliEn Database (PFNs); AliEn File Catalogue (LFNs); Mirada Workstation; Workstations; Digitizer; Local Cache; DICOM Server; Information Service; File Transfer Daemon; Web Services; SOAP Messages; Sends DICOM Files; Read / Write operations.]
• DICOM File: Description Inf., Image
• MAS: Mirada Acquisition System
• Object: Patient
– Patient Personal Information,
– Additional Information,
– …
Clinician to Data
[Figure: path from clinician to data. Labels: Clinician; Mirada Workstation; Client Frontend; SOAP; Mammogrid Server; DICOM Server; Grid Server; . . .]
MammoGrid AliEn Prototype
[Figure: AliEn prototype. Labels: Mirada-AliEn Interface; AliEn prototype Interface; Perl SOAP Server; AliEn Catalogue with branches cern, …, cambridge, udine.]
• The Catalogue is divided into several databases, which can be distributed.
• The catalogue keeps the LFN-PFN mapping and the metadata.
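The role of a catalogue that keeps an LFN-PFN mapping plus metadata can be shown with a toy in-memory version: a logical file name (LFN) resolves to one or more physical file names (PFNs, the replicas), and metadata is attached to the logical name. This is a minimal sketch and not the AliEn API; the `FileCatalogue` class, its methods, and the example paths are all hypothetical.

```python
from typing import Dict, List


class FileCatalogue:
    """Toy catalogue: logical names -> physical replicas, plus metadata."""

    def __init__(self) -> None:
        self._replicas: Dict[str, List[str]] = {}
        self._metadata: Dict[str, dict] = {}

    def register(self, lfn: str, pfn: str, **metadata) -> None:
        # Add a replica for the logical name and merge any metadata given.
        replicas = self._replicas.setdefault(lfn, [])
        if pfn not in replicas:
            replicas.append(pfn)
        self._metadata.setdefault(lfn, {}).update(metadata)

    def resolve(self, lfn: str) -> List[str]:
        # Return every known physical location of the logical file.
        return list(self._replicas.get(lfn, []))

    def find(self, **criteria) -> List[str]:
        # Return the logical names whose metadata match all criteria.
        return sorted(
            lfn for lfn, md in self._metadata.items()
            if all(md.get(k) == v for k, v in criteria.items())
        )


if __name__ == "__main__":
    cat = FileCatalogue()
    # Hypothetical names, for illustration only.
    cat.register("/cambridge/p001/left_cc.dcm", "gridbox-a:/data/0001.dcm", laterality="Left")
    cat.register("/cambridge/p001/left_cc.dcm", "gridbox-b:/data/0001.dcm")
    print(cat.resolve("/cambridge/p001/left_cc.dcm"))
    print(cat.find(laterality="Left"))
```

Keeping queries on logical names means a client never needs to know which site holds a given replica.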
Interaction Diagram
[Figure: interaction diagram within the GRID Environment. Components: Mirada WST; IS (Information Service); FTD (File Transfer Daemon); DICOM Server; Mammogrid Service; AliEn Service; File Catalogue. Mammogrid - AliEn communication uses SOAP Messages; a File Handle is also shown.]
• Case: READ – Query, Result Set, Negotiation, File Catalogue Reads
• Case: WRITE – Push (DICOM File), Negotiation, File Catalogue Updates
Current Hardware Setup
Gridbox specifications:
• 2x Intel Xeon processors
• 2 GB DDR 200/266 MHz
• Redundant Power Supply
• 2x 20 GB IDE HDD (7200 rpm) UDMA, RAID-1 IDE adapter
• 360 GB usable, RAID-1
• Ethernet network adapter 10/100Mb/s
• Gigabit network adapter
Conclusions
• Distributed Health informatics is an important application area for Grids technologies – HealthGrid
• Many similarities with High Energy Physics
• Need rapid feedback from the user community – MammoGrid user requirements specified BUT
• Effective Grid deployment needed now and
• Many open questions e.g.:
– How to resolve distributed queries?
– What role for meta-data?
– How to maintain secure, reliable data?
• MammoGrid: First results expected late 2003