49
Enabling of Grid based Diffusion Tensor Imaging using a Workflow Implementation of FSL Healthgrid 2009, Berlin 30.06.2009 Ralf Lützkendorf Johannes Bernarding Frank Hertel Fred Viezens Andreas Thiel Dagmar Krefting

Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

Enabling of Grid based Diffusion TensorImaging using a WorkflowImplementation of FSL

Healthgrid 2009, Berlin 30.06.2009

Ralf LützkendorfJohannes BernardingFrank HertelFred ViezensAndreas ThielDagmar Krefting

Page 2: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 2

Diffusion Tensor Imaging

Diffusion Tensor Imaging• is based on diffusion weighted MRI• allows for tracking of nerve fibers

Main applications• Fundamental research

• Anatomy and formation of axonal tracts• Connectivity between functional areas

• Clinical research• Recognition of desease patterns• Planning support in brain tumor therapy

Page 3: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 3

Image Processing

Different Software Tools available for DTI

Page 4: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 4

FDT

FMRIB‘s Diffusion Toolbox:• Graphical user interfaces• Command line tool

Image processing pipeline:• Preprocessing

• Distortion correction• Segmentation

• Local modeling of diffusion parameters (bedpostX)• Estimation of diffusion tensor for each voxel by Markov

Chain Monte Carlo sampling• Computationally expensive• Parallelizable on slice level

• Fiber tractography

Page 5: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 5

bedpostX

Runtime• Website information: 24 h• Personal computer (512 Mb RAM, 3 GHz, 56 slices): 10 days!• Runtime depends on slicenumber and content!

First approach:• Local windows desktop grid of 10 computers:

• using cygwin• Scales nearly linearly• Runtime ~ 24 hrs• Only temporarily available

Grid approach:• Usage of powerful machines• Parallel processing of MRI slices

Page 6: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 6

Data Management

Requirements:• Fixed number of dedicated files provided within a directory• Generated by preprocessing• Image Format: NIFTI

SDSC‘s Storage Resource broker:• Linux-like file system structure• Shared access using group rights

Stored data:• Input data• Intermediate results• Final results

Page 7: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 7

Process Management

Processing steps:• Splitting the data into slices• Run bedpostX• Recombine data

Grid Workflow Execution Service:• Orchestration of the processing steps• Scheduling and resource brokering• Error handling• Reusable workflow templates

Page 8: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 8

Workflow Layout

Monolithic implementation

Parallelized implementation

Page 9: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 9

Clinic: User

Workflow Execution Service

Univ. of Leipzig

Data Storage and MetadataBerlin, Dresden, Göttingen

Web-Portal

Uni Leipzig

www.medigrid.de

Grid-NodeZuse Inst. Berlin

Grid-NodeUniv. of Dresden

Grid-NodeUniv. of Leipzig

Grid-NodeGWDG

Göttingen

System architecture

Main components used for application

• All middleware components are free software for academic use

Page 10: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 10

Results

Runtime measurement:

# Slices Monolithic hrs Parallelized hrs Speedup

11 6.5±0.7 (0.7) 3.5±2.5 (0.3) 1.9

14 14.3±0.8 (0.2) 3.5±1.7 (0.25) 0.9

20 8.9±3.8 (0.4) 3.6±1.2 (0.18) 1.6

56 22.0±8 (0.4) 3.8±2.2 (0.06) 5.6

LocalCluster

56 240.0 (4.2) 23 (0.4) 10.4

Grid

Page 11: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 11

Results

Reliability:• Complete monolithic grid implementation• Single slice monolithic grid implemetation• Parallelized grid implementation• Five subsequent monolithic grid implementation, tested on

different grid sites• Single slice monolithic local implementation on 64bit machine• Single slice monolithic local implementation on 64bit machine

Page 12: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 12

Results

Differences between 64bit and 32bit machine

Page 13: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 13

Results

Accessibility• Webbased up- and download to SRB• Webbased upload, start and control of the workflow• Data and application has been shared between Magdeburg

and Berlin

Usability:• BIRN‘s SRB portlet• GWES workflow portlets• Workflow template (xml-file)• No application specific gui

Page 14: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 14

Discussion and Outlook

• Speedup by more powerful computing resources• Speedup by parallelization• Fault tolerance essential (poor SRB scalability, network)• Speedup drastically reduced by queuing times• Reliability given for current grid infrastructure• Accessibility and usability sufficient for researcher

Further developments• Completion of the processing on the grid• Provision of a portlet• Usage of more scalable GDMS (iRods ?)• Process optimization (dynamic bundling of slices)

Page 15: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 15

MedInfoGRID

Page 16: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 16

Clinic: User

Workflow Execution Service

Univ. of Leipzig

Data Storage and MetadataBerlin, Dresden, Göttingen

Web-Portal

Uni Leipzig

www.medigrid.de

Grid-NodeZuse Inst. Berlin

Grid-NodeUniv. of Dresden

Grid-NodeUniv. of Leipzig

Grid-NodeGWDG

Göttingen

System architecture

User: medical doctor or scientist in medical sciences

• Clinical environment: strict data and network protection

• Not necessarily a computer expert

Page 17: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 17

Clinic: User

Workflow Execution Service

Univ. of Leipzig

Data Storage and MetadataBerlin, Dresden, Göttingen

Web-Portal

Uni Leipzig

www.medigrid.de

Grid-NodeZuse Inst. Berlin

Grid-NodeUniv. of Dresden

Grid-NodeUniv. of Leipzig

Grid-NodeGWDG

Göttingen

System architecture

Grid portal: gridsphere based webbased access

• Accounts for strict firewall settings (plain http/httpss access)

• Grid access from “everywhere” (with internet connection…)

Page 18: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 18

Clinic: User

Workflow Execution Service

Univ. of Leipzig

Data Storage and MetadataBerlin, Dresden, Göttingen

Web-Portal

Uni Leipzig

www.medigrid.de

Grid-NodeZuse Inst. Berlin

Grid-NodeUniv. of Dresden

Grid-NodeUniv. of Leipzig

Grid-NodeGWDG

Göttingen

System architecture

Data storage: Storage Resource Broker

• Unix-like access rights (basic data protection)

• Metadata Management

Page 19: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 19

Clinic: User

Workflow Execution Service

Univ. of Leipzig

Data Storage and MetadataBerlin, Dresden, Göttingen

Web-Portal

Uni Leipzig

www.medigrid.de

Grid-NodeZuse Inst. Berlin

Grid-NodeUniv. of Dresden

Grid-NodeUniv. of Leipzig

Grid-NodeGWDG

Göttingen

System architecture

Grid nodes: Globus Toolkit 4

• Credential based security

• WS-GRAM generic webservice for program execution

Page 20: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 20

Clinic: User

Workflow Execution Service

Univ. of Leipzig

Data Storage and MetadataBerlin, Dresden, Göttingen

Web-Portal

Uni Leipzig

www.medigrid.de

Grid-NodeZuse Inst. Berlin

Grid-NodeUniv. of Dresden

Grid-NodeUniv. of Leipzig

Grid-NodeGWDG

Göttingen

System architecture

Workflow Manager: Grid Workflow Execution Service (GWES)

• Data driven workflow modeling as high level Petri net

• Resource matching, basic brokering, error handling

Page 21: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 21

Main tasks for the developer:• Register the software components to the WF-DB• Provide a workflow description (upload or template)

Workflow Manager GWES

Web Services

Grid Workflow User Interface (GWUI)

Grid Portlet

Application Portlet (Genetic Tools, Imaging)

Grid Portlet

GWorkflowDL

Usern

Globus Toolkit 4, Web Services

Run SimpleGlobus Job

WS-GRAM, RFT, SOAP

Web Service

Exist XML DB

<D-GRDL>

D-GRDL

D-GRDL

Run Workflow

Assemble/ MonitorWorkflow

GWorkflowDL

Run Application

MediGRID Workflow Management

<GWorkflowDL>

GWorkflowDL

Grid Workflow Execution Service (GWES)

D-GRDL

Scheduler +Resource Matcher

GRDBDaemon

D-GRDL

MDS, Ganglia + Custom Metrics

Page 22: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 22

Grid Integration SIESTA DB

Storage of the SIESTA data in the SRB• Preservation of the dedicated folder structure• Group account for shared access• Relevant metadata is stored as

user defined Metadata (Sufmeta)

Integration of the processing tools• Plain command line tools• Execution by invocation of

WS-GRAM automaticallycalled by the GWES

Data transfer• Srb-transfer protocol between SRB and gridnodes• reliable file transfer (RFT) between nodes

ECG ECG presence

ODI Oxygen Desaturation Index

TST Total Sleep Time [min]

AI Apnea Index [/min]

AHI Apnea Hypopnea Index [/min]

Age Age [years]

Sex Sex [male=1,female=2]

Status Health Status [tag]

Height Height [cm]

Weight Weight [kg]

Pulse Pulse rate [/min]

BPsys Blood press. syst. [mmHg]

BPdia Blood press. Diast. [mmHg]

Page 23: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 23

Processing chain

Sleep-phase resolved heart frequency for study samples, determined by different indicators (age, health status,..)

1. Record selection: Metadata query2. Record analysis

• Heart frequency (HF) analysis on the selected samples• Matching of HF and consensed sleep states• Statistical analysis of individual HF patterns

3. Collection analysis• Statistical analysis on sample set

SRB-Zone

3

GWES

12

Page 24: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 24

Petri net modeling

Petri net, modelling the application:

Query 2

Query 4

Query 3

RecordSelection

Record Analysis

Record 1Query 1

Query 5

Record 2

Record n

Result 2

Result 1

Result nCollectionAnalysis

Final Result

T1

T2

T2

T2

T3

Transition PlacePlaces & Tokens

Page 25: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 25

Petri net modeling

Petri net, alternative description

Query 2

Query 4

Query 3

RecordSelection

Query 1

Query 5

Record Result

CollectionAnalysis

Final Result

Record Analysis

n-m m

Control

T3T2T1

Page 26: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 26

Petri net modeling

Petri net, actual implementation

Problem:Special characters or blanksin token values-> don‘t pass complete

srb queries

Page 27: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 27

Portal integration

Integration of a workflow template into an applicationspecificportlet:

Page 28: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 28

Portlet development

MediGRID Portal

Page 29: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 29

Portlet development

Gridsphere Framework• Gridportlets• Predefined Components• JSR168 compliance

MediGRID Portal• SRB Portlet (from BIRN)• Workflow GUI Portlets

Application specific Portlet:• Workflow list and status• Parameter setting• Workflow initialization• Result visualization

Page 30: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 30

Application of the tools on heterogeneous longterm measurementsreveals significant flaws in the heart-beat detection

Developer usecase

sqrs sqrs + correction shortterm-FFT

Healthysubjects

Apneapatients

Current usage of the grid: Algorithm testing and development-> parameter scans and evaluation on the datasets

Page 31: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 31

Discussion and outlook

Clinical ResearcherDeveloper

What are the main findings for this

patient?

Attending Physician

Does my algorithm work for all data?

What are the characteristic patterns

and features?

The application runs on a production grid infrastructure and isusable for all MediGRID participants

A developer can:- test deployed algorithms for different groups/setups (portlet)- perform Parameterscans on the deployed algorithms (workflow)- test algorithms or modifications (deployment to gridnodes)

Main flaws: - Metadata handling -> switching to iRODS- no manual ECG annotation available for SIESTA data

-> creation of ECG annotation-> integration of physiobank databases with annotation

Page 32: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 32

Discussion and outlook

Clinical ResearcherDeveloper

What are the main findings for this

patient?

Attending Physician

Does my algorithm work for all data?

What are the characteristic patterns

and features?

The application runs on a production grid infrastructure and isusable for all MediGRID participants

A clinical researcher can:- perform statistical analysis on SIESTA data with deployed tools(portlet)- upload and analyse further (pseudonymized) polysomnographies(portlet)

Main flaws: - Portal stability -> switching to LiveRay- rebustness of the algorithms -> developer‘s task- Userfriendly pseudonymization and upload tool

Page 33: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 33

Discussion and outlook

Clinical ResearcherDeveloper

What are the main findings for this

patient?

Attending Physician

Does my algorithm work for all data?

What are the characteristic patterns

and features?

The application runs on a production grid infrastructure and isusable for all MediGRID participants

A physician can:- upload and analyse (pseudonymized) polysomnographies(portlet)

Main flaws: - Approvement and Validation of the tools -> clinical researcher- Grid access - > simplified user registration- Validation of the infrastructure -> implementation of GCP-rules- Pricing models and accounting -> grid business models

Page 34: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 34

Thanks to all collaborators

Sebastian Canisius

Univ. Marburg Sleep Center

Andreas Hoheisel

Fraunhofer FIRST

Thomas Tolxdorff

CharitéMedical Inform.

Thomas Penzel

CharitéSleep Center

Page 35: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 35

Additional slides

Petri Nets

Page 36: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 36

Mathematical modeling language for distributed systems, consisting of

• Transitions (squares)

• Places (circles), that may hold np tokens (black dots)

• Flow relations (arrows between places and transitions)

• Input place: arrow is pointing from place to transition

• Output place: arrow is pointing from transition to place

• Marking: Distribution of tokens on places

Petri nets

Page 37: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 37

• Enabling of a transition: • All input places are occupied • All output places may receive further tokens

• Firing of a transition:• One token of each input place is consumed• One token is added to each output place

• Modeling of image processing workflows• Data -> token, executables -> transitions• Program execution -> firing

Petri nets

Page 38: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 38

Additional slides

Implementation of command line tools

Page 39: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 39

Implementation steps

Implementation of command-line tools to the grid

1. Deployment of the software to the gridnodes2. Generation of a wrapper script3. Registration of the software4. Creation of a workflow description5. Optional: Integration of the workflow into the user portal

Page 40: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 40

Software has to be installed on the front-end of the sites• Each application group has it‘s own remote directory

• Copy application from a local directory to the remoteinstallation directory with gsiscp (script)

• Access to the gridnodes via gsissh and svn update

Deployment of software

Page 41: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 41

A shell-script• Sets environment (pathes, environment variables)• Calls the program(s) • Requirement: all parameters have to be passed as

name/value pair• Program call:

segmentation 51123_1100.png 51123_roi.mat• Script call:

gwes-segmentation-simple.sh–input_image 51123_1100.png –roi 51123_roi.mat

Wrapper script

Page 42: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 42

Database-entry (exIST-database, dgrdl):• new software (path of the script)• gridnodes where the software is available

D-GRDL Registration

Page 43: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 43

Xml-based GWorkflowDLgwes-segmentation-simple.sh–input_image 51123_1100.png –roi 51123_roi.mat

Workflow Description

Page 44: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 44

Additional slides

Further Projects

Page 45: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 45

Results

Currently implemented:• 5 image- and signalprocessing applications• With application specific portlets:

• Functional MRI: simple workflow (needs matlab)• Virtual vascular surgery: basic interactive visualization• Ultrasound imaging: 4 different workflows• MRE brain segmentation (FSL sienax)

• Without portlets:• Analysis of polysomnographic signals from a clinical study• DTI analysis with FSL

• Prospected: dynamical lung imaging (PneumGRID)

Page 46: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 46

Additional slides

MediGRID Portal

Page 47: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 47

Gensequenzanalyse

Grid Certificate

MediGRID User

MediGRID Portal

fMRI

Ontologiezugriff

TRUS

Workflow Management

Hemodynamics

File Browser

Credential Management

SNP Selection

D-GRDL Metadaten-Erstellung D-GRDL Metadaten-

Management

MediGRID Admin

MediGRID Developer

Resource Management

Resource Monitoring

Bioinformatics

Medical Imaging

OntologyAccess

StandardGrid Portlets

Developer-support

Administration

Web Portal

Page 48: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 48

Additional slides

Requirements of MedicalGrids

Page 49: Enabling of Grid based Diffusion Tensor Imaging using a ... · • Estimation of diffusion tensor for each voxel by Markov Chain Monte Carlo sampling • Computationally expensive

page 49

Medical Grids demand special requirements with respect to mere computing Grids

High security and safety

• Patient data, traceability of processing steps

User friendliness

• User accustomed to graphical user interfaces

Virtualization of grid resources

• Heterogeneous data and applications

Current research on modern Grids is working to overcome these barriers

Medical Grids