
MRI Brain Tissue segmentation

Submitted by Sergi Valverde Valverde

Supervisors: Dr. Xavier Llado, Dr. Arnau Oliver, Mariano Cabezas

Department of Computer Architecture and Technology, University of Girona

A Thesis Submitted for the Degree of MSc Erasmus Mundus in Vision and Robotics (VIBOT)

· 2012 ·

Abstract

Manual segmentation of brain tissue is both challenging and time-consuming, owing to the large number of MRI slices that compose the three-dimensional information for each patient and to the intra/inter-observer variability of manually segmented scans. The development of robust automated MS segmentation methods, which can segment large amounts of MRI data and do not suffer from intra/inter-observer variability, is nowadays an active research field. However, automated segmentation of brain tissue is still a challenging problem due to the complexity of the images, differences in tissue intensities, noise, intensity non-uniformities, partial volume effects and the absence of anatomical models that fully capture the possible deformations in each structure. The main motivation of this master thesis is two-fold: first, to perform an exhaustive comparative evaluation of existing state-of-the-art brain tissue segmentation methods using T1w data, which is the modality most used for tissue classification; and second, to extend the evaluation with a quantitative analysis of how MS lesions affect the tissue classification. We have selected 4 publicly available segmentation approaches from the state of the art, some of which, such as FAST, SPM5, SPM8 or GAMIXTURE, are currently used by the neuroimaging community for tissue segmentation and volumetric analysis. Moreover, we extend the list with the implementation of 4 more works selected from the state of the art: two Fuzzy Clustering techniques, one Neural Network method based on Self Organized Maps and one KNN classifier auto-trained with the subject itself. Quantitative analysis is carried out on synthetic and real T1w data from publicly available datasets such as Brainweb and IBSR20. Furthermore, scans from the SALEM project dataset with different loads of MS lesions are employed to evaluate how well the methods segment brain tissue in the presence of MS lesions. Results on synthetic data show good accuracy for all the analyzed approaches and agree with previous studies using one or more of these methods. Results on IBSR20 show a slightly better performance for the KNN classifier and GMM approaches. Finally, results on the SALEM dataset with MS lesions indicate that, in general, methods tend to misclassify WM as GM at a rate of at least 17%. This rate ranges from 17% for SPM8 to 37% for KNN.

A good runner leaves no footprints...

Lao-Tzu


Contents

Acknowledgments
1 Introduction
  1.1 Overview
  1.2 Research framework
  1.3 Objectives
  1.4 Planning
  1.5 Document structure
2 Problem definition
  2.1 Magnetic Resonance Imaging
    2.1.1 MRI concepts
  2.2 Computer vision aspects
    2.2.1 Skull stripping
    2.2.2 Partial volume effects
    2.2.3 Intensity inhomogeneity
3 State of the art
  3.1 Classification of segmentation approaches
    3.1.1 Supervised methods
    3.1.2 Unsupervised methods
  3.2 Reported results
    3.2.1 Brainweb
    3.2.2 IBSR
    3.2.3 Evaluation measures
    3.2.4 Results analysis
4 Proposal
  4.1 Preprocessing
    4.1.1 Skull stripping
    4.1.2 Intensity correction
  4.2 Tissue segmentation methods
    4.2.1 Tissue classification with FAST
    4.2.2 Tissue classification with SPM
    4.2.3 Tissue classification with GAGMM
    4.2.4 Tissue classification with SOM
    4.2.5 Tissue classification with FCM
    4.2.6 Tissue classification with RFCM
    4.2.7 Tissue classification with KNN
  4.3 Tissue classification in the presence of lesions
  4.4 Evaluation metrics
5 Results
  5.1 Datasets
  5.2 Synthetic data results
    5.2.1 Discussion
  5.3 Real data results
    5.3.1 Discussion
  5.4 MS lesion results
    5.4.1 Discussion
6 Conclusions
  6.1 Future Work
A Results tables
Bibliography

List of Figures

2.1 Different MRI acquisition sequences
2.2 Brain skull stripping
2.3 Partial volume effects representation
2.4 Different acquired MRI scans with image artifacts
3.1 Brainweb generated dataset for different noise levels (n) and biases (b)
4.1 Proposed pipeline for brain tissue segmentation
4.2 Scan preprocessing output example with selected tools
4.3 FAST segmentation output example with PVE
4.4 SPM prior tissue atlas
4.5 Scan preprocessing output example with selected tools
4.6 Membership classification output in FCM
4.7 Proposed modified pipeline for brain tissue segmentation with MS lesions
5.1 Dice metrics boxplots computed from all Brainweb scans
5.2 Segmentation results for various methods on Brainweb scans
5.3 Dice metrics boxplots computed from all IBSR20 scans
5.4 Dice metrics plots evaluated for each scan and method
5.5 Segmentation results for various methods on the 5_8 IBSR scan
5.6 WM FBP for masked and not masked scans, SALEM
5.7 Segmentation results for all methods on the 201 SALEM scan
5.8 BPF for masked and not masked scans, SALEM

List of Tables

3.1 Selected state-of-the-art automatic brain tissue segmentation methods
3.2 Available IBSR datasets for segmentation analysis
3.3 Surveyed works based on Brainweb database and Dice or Jaccard indexes
3.4 Surveyed works based on IBSR database and Dice or Jaccard indexes
5.1 FBT evaluation on healthy subjects, SALEM
5.2 FBT evaluation on subjects with MS disease masking lesions, SALEM
5.3 FBT evaluation on subjects with MS disease without masking lesions, SALEM
5.4 Lesion tissue classification, SALEM
A.1 Dice metrics computed from segmented Brainweb scans with different intensity inhomogeneity
A.2 BrainWeb synthetic database statistical evaluation
A.3 Average Dice metrics computed from segmented Brainweb scans with different noise levels
A.4 IBSR20 database statistical evaluation

Acknowledgments

Invisible steps... I guess we have lost the path...

To the eyes: Yago Diez, Mariano Cabezas, Arnau Oliver and Xavier Llado. Thank you, it was a real pleasure.

To the professors who made me start this path with the Bachelor computer vision course and who, two years later, put their signatures to make this Master thesis possible: Joan Martí, Xavier Munoz and again Xavier Llado. Thank you.

To the staff: David, Yohan, Fabrice, Herma, David, Aina, Robert, Yvan. Thank you.

To the present: Darko, Reinhard, Adriyana, Sergio, Xin, Igor, Joven, Saeed, Hashim, Lukas, Mari, Taro, Mark, Cedric, Taleb, Bernat and Dani. Thank you.

To the past: Moises, Alfredo, Antonio, Julio, Francesc, Moi, Edu. Thank you for making me grow.

To the roots: Pol, mother and father: Thank you, it would have been impossible without you.

To the dreams: Alicia. I guess we have lost the path...

We are new, and we are the same...

Chapter 1

Introduction

1.1 Overview

The central nervous system (CNS) is the part of the human nervous system that integrates, coordinates and processes incoming neural information. The CNS comprises the brain and the spinal cord and is made up of two tissue components: gray matter (GM), which is the main CNS element and consists of neuronal cell bodies; and white matter (WM), which is the second CNS component and is mainly composed of myelinated axon tracts. WM and GM occupy most of the brain. Cerebrospinal fluid (CSF) is a bodily fluid present throughout the brain that surrounds both CNS tissues. The central nervous system may be damaged by different conditions, including infections such as encephalitis, neurodegenerative diseases such as Alzheimer's disease, and autoimmune and inflammatory diseases such as multiple sclerosis.

Multiple sclerosis (MS) is the most common immune-mediated disabling neurological disease of the CNS. It is an inflammatory disease in which the structure of the neurons is progressively injured [41]. MS is the non-traumatic neurological disease that most frequently causes disability in young adults. It is relatively common in Europe, the United States, Canada, New Zealand, and parts of Australia, but rare in Asia, and in the tropics and subtropics of all continents [16]. As in other putative autoimmune diseases, it affects twice as many women as men. MS has an incidence of about seven per 100,000 every year, a prevalence of around 120 per 100,000, and a lifetime risk of one in 400 [24]. Incidence is low in childhood, increases rapidly after the age of 18, reaches a peak between 25 and 35, and then slowly declines, becoming rare at 50 and older. The world estimate is 1.3 to 2.5 million cases of MS, with Western Europe having 350,000 [31].

Histopathology implicates a reduction in myelin and axonal degeneration as the major contributors to the accumulation of disability [54]. Myelin is a dielectric material that forms a layer, called the myelin sheath, around the axon of the neuron. The disease damages the fatty myelin sheaths of the brain cells and spinal cord, leading to demyelination. Demyelinated axons conduct impulses at reduced or spontaneous velocity, impairing sensation, movement and cognition [24]. MS has four internationally recognized forms [55]: 1) Relapsing/Remitting (RRMS) is characterized by periods of exacerbation in which symptoms are present, followed by periods of remission in which the patient recovers partially or totally from the disease symptoms. 2) The Secondary Progressive (SPMS) form is characterized by a gradual intensification of symptoms between relapses. After 10 years of RRMS, 50% of MS patients develop the SPMS stage; this percentage rises to 90% after 25 years [59]. 3) The Progressive Remitting (PRMS) form is typified by an increase in the relapsing times with significant recovery but with worsening symptoms in new relapsing intervals. 4) Lastly, Primary Progressive (PPMS) is characterized by a severe decrease of remitting times with special localization in the brain.

Magnetic Resonance Imaging (MRI) is a medical imaging technique commonly used to visualize the internal structures of the body. The composition of each brain tissue permits different image acquisition types such as T1, T2 and PD. T1 scans clearly separate gray matter from white matter and are often used in inter-tissue classification. T2 scans are often used in the intra-tissue classification of abnormal fluid against normal tissue, which makes them suitable for lesion detection. The ability of MRI to detect MS lesions has led to its general acknowledgement as a tool to diagnose the disease [35]. Conventional MRI is highly sensitive for detecting MS plaques and can provide quantitative assessment of inflammatory activity and lesion load [16]. More advanced acquisition methodologies are being used today, such as fluid attenuated inversion recovery (FLAIR) [9], diffusion techniques such as diffusion weighted imaging (DWI) [58] and diffusion tensor imaging (DTI) [25], and magnetization transfer (MT) [86]. MS detection techniques are based on the assumption that a classification of the brain tissue is known. This classification is essential to analyze brain tissue atrophy, to perform volumetric analysis and to follow its evolution.

Manual segmentation of brain tissue is both challenging and time-consuming because of the large number of MRI slices that compose the three-dimensional information for each patient [16]. The intra/inter-observer variability of manually segmented scans can also be significant in some cases. Consequently, the development of robust automated MS segmentation methods, which can segment large amounts of MRI data and do not suffer from intra/inter-observer variability, is nowadays an active research field [78] [85]. However, automated segmentation of brain tissue is still a challenging problem due to the complexity of the images, differences in tissue intensities, noise, intensity non-uniformities, partial volume effects [3] and the absence of anatomical models that fully capture the possible deformations in each structure [52].


1.2 Research framework

This master thesis is framed within several research projects in which the Computer Vision and Robotics (VICOROB) group of the Universitat de Girona is involved:

Project/contract title: AVALEM: avaluacio de l’atrofia en pacients amb lesions d’esclerosi multiple (AVALEM: evaluation of atrophy in patients with multiple sclerosis lesions)

Funding body: Fundacio Esclerosi Multiple

Project/contract number: CEM-CAT2011. Amount: 40,000.00. Duration: from 2011 to 2013

Principal investigator: Xavier Llado Bardera

Project/contract title: SALEM: segmentacion automatica de lesiones de esclerosis multiple en imagenes de resonancia magnetica (SALEM: automatic segmentation of multiple sclerosis lesions in magnetic resonance images)

Funding body: Instituto de Salud Carlos III

Project/contract number: PI09/91018. Amount: 79,860.00. Duration:

Project/contract title: SALEM: Toolkit para la segmentacion automatica de lesiones de EM en RM (SALEM: toolkit for the automatic segmentation of MS lesions in MRI)

Funding body: Centre d’Innovacio i Desenvolupament Empresarial (CIDEM)

Project/contract number: VALTEC09-1-0025. Amount: 100,000.00. Duration:

The project is based on collaboration with relevant hospitals and medical expert teams in the field of multiple sclerosis, such as the Hospital Vall d’Hebron, the Clínica Girona and the Hospital Dr. Josep Trueta. These 3 hospitals provide data from real patients, which are used to evaluate the implemented segmentation methods.

1.3 Objectives

Brain tissue segmentation is essential to analyze brain atrophy and its evolution in MS patients. In this document, several segmentation methods are analyzed for automatic brain tissue classification and volumetric quantification. Various existing key segmentation methods are studied, while new methods are also implemented, tested and compared. The effect of multiple sclerosis lesions on the tissue segmentation and quantification process is also investigated using baseline and 12-month scans from healthy subjects and MS patients.

Four brain tissue MRI scan sets will be used to evaluate the performance of the tissue segmentation tools:

• 9 T1w synthetic volumes from the intracranial Brainweb 3D simulated MR image generator [22].

• 20 T1w scans of normal subjects with manual segmentations from the Internet Brain Segmentation Repository (IBSR) (see footnote 1).

• 9 T1w healthy subjects from SALEM, each with a second scan after 12 months for temporal comparison.

• 10 T1w subjects with different lesion loads from SALEM, each with a second scan after 12 months for temporal comparison.

The main goals of this master thesis are subdivided into the following objectives:

• Analyze the state of the art of brain tissue segmentation techniques. This objective aims to review the whole brain tissue segmentation state of the art to better understand the advantages and drawbacks of each technique.

• Select a representative set of the best state-of-the-art techniques for brain tissue segmentation. This objective aims to implement some automatic segmentation methods to be added to the best available public methods.

• Perform a quantitative evaluation of the selected methods on simulated and real brain 3D MRI scans. This objective aims to assess the accuracy of the methods against the ground truth provided with the public datasets.

• Evaluate the effect of MS on the quantification of brain tissue atrophy. This objective aims to evaluate tissue volume differences and to extend the evaluation to MS patients with different lesion loads. Segmented volumes are evaluated in consecutive scans taken 12 months apart, for both healthy subjects and MS patients with different levels of disease.
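The accuracy assessment against a ground truth can be illustrated with the Dice coefficient, the overlap measure named in the result figures and tables of this thesis. The following is a minimal sketch, assuming NumPy and integer label maps (1 = CSF, 2 = GM, 3 = WM); the function name `dice_per_tissue`, the label encoding and the toy arrays are illustrative assumptions, not the evaluation code used in this work.

```python
import numpy as np

# Hypothetical label encoding, assumed for illustration only.
LABELS = {"CSF": 1, "GM": 2, "WM": 3}

def dice_per_tissue(segmentation, ground_truth, labels=LABELS):
    """Return the Dice coefficient 2|A∩B| / (|A| + |B|) for each tissue label."""
    scores = {}
    for name, value in labels.items():
        a = segmentation == value
        b = ground_truth == value
        denom = a.sum() + b.sum()
        # If a tissue is absent from both maps, treat the overlap as perfect.
        scores[name] = 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom
    return scores

# Toy 2D label maps standing in for a segmented slice and its manual annotation.
ground_truth = np.array([[1, 1, 2, 2], [2, 2, 3, 3], [3, 3, 3, 3]])
segmentation = np.array([[1, 2, 2, 2], [2, 2, 2, 3], [3, 3, 3, 3]])
scores = dice_per_tissue(segmentation, ground_truth)
```

A score of 1 means perfect overlap with the ground truth for that tissue and 0 means no overlap; computing it per tissue keeps errors in a small class such as CSF from being hidden by the larger classes.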

1.4 Planning

According to the objectives described in section 1.3, the master thesis will be developed in several steps. These steps are summarized as follows, together with their planned time frames:

Footnote 1: The IBSR database is publicly accessible at http://www.cma.mgh.harvard.edu/ibsr


• Analyze the existing state-of-the-art methods proposed by the scientific community in brain tissue segmentation. During the first stage of the master thesis, the state of the art in MRI brain tissue segmentation will be studied. The analysis is extended with various image preprocessing tools commonly used by the community. Expected date: 1st February - 15th February 2012.

• Implementation of brain tissue segmentation methods. Some of the best state-of-the-art brain tissue segmentation methods will be implemented. These methods will be added to the best public methods available to complete a representative list of segmentation methods to be used in the quantitative evaluation. Expected date: 16th February - 31st March 2012.

• Quantitative accuracy evaluation on simulated data. The segmentation methods will be evaluated using 3D simulated brain scans from the BrainWeb dataset. The methods will be assessed for different levels of noise and intensity inhomogeneity. Expected date: 1st April - 15th April 2012.

• Quantitative accuracy evaluation on real data. The segmentation methods will be evaluated using 3D real brain scans with different levels of difficulty and intensity inhomogeneity from the IBSR dataset. The accuracy of the methods is evaluated against the provided ground truth and compared with previous studies found in the brain tissue segmentation literature. Expected date: 1st April - 30th April 2012.

• Quantitative evaluation of volumetric analysis. The volumetric evaluation of the segmentation methods in the presence of MS will be carried out using healthy subjects and patients with different lesion loads. The experiments will be carried out using baseline and 12-month scans from the SALEM and AVALEM projects. Expected date: 16th April - 15th May 2012.

• Documentation. All details regarding these steps will be documented in this master thesis document. The documentation step will be concluded with a scientific paper and a poster of the proposed work. Expected date: 16th May - 30th May 2012.

1.5 Document structure

This document is structured in 6 chapters. Chapter 1 gives a brief summary of the background, objectives and planning of the master thesis. Chapter 2 defines in more detail the research background needed for automatic brain tissue segmentation. MRI is introduced with emphasis on the acquisition sequences used for brain tissue classification. Moreover, specific issues and concepts related to computer vision are defined. Chapter 3 reviews the state of the art in the tissue segmentation research field. It pays attention to the most recent techniques dealing with automatic brain tissue segmentation and focuses on the advantages and drawbacks of each technique. A classification of those techniques into supervised and unsupervised methodologies is introduced. Results from surveyed works are organized by dataset and compared with each other. Chapter 4 describes in more detail the set of techniques selected for the study. In particular, publicly available segmentation tools such as SPM5 and SPM8 [5], FSL FAST [89] and GAMIXTURE [79] are reviewed. The review is completed with an analysis of the four implemented methods: a K-nearest neighbor classifier auto-trained with the input scan itself and initialized with a probabilistic tissue atlas [27], a self-organized map (SOM) approach [77], and two fuzzy clustering approaches, one with the classic membership function and one with spatial information weighting [61]. Chapter 5 performs a complete quantitative evaluation of the results obtained with the different datasets. Firstly, synthetic and real data are used to evaluate the quantitative accuracy of the selected methods. Secondly, consecutive scans taken 12 months apart are used to evaluate the volumetric analysis in the presence of the disease, with both healthy subjects and MS patients with different lesion loads. Finally, chapter 6 presents the conclusions that summarize the developed work. Based on these conclusions, possible solutions are also proposed as future work.

Chapter 2

Problem definition

2.1 Magnetic Resonance Imaging

Imaging is usually preferred over biopsy in clinical practice when the collateral risks of surgical procedures are significant. Unlike X-rays and computed tomography (CT) scans, MRI does not use ionizing radiation. Instead, radio-frequency waves in a magnetic field stimulate the hydrogen atoms in molecules, exploiting the property of nuclear magnetic resonance (NMR). NMR is the phenomenon in which magnetic nuclei in a magnetic field absorb and re-emit electromagnetic radiation at a specific resonance frequency. The radiation energy depends on the magnetic field and on the properties of the atoms, which permits different imaging configurations. For several decades, MRI has been widely used in scientific research and medical care. 3D processing of medical brain images is an active research topic in computer vision, where MRI has provided meaningful information about brain tissue at very high resolutions for use in fields such as reparative surgery, radiotherapy, stereotactic neurosurgery and others [50]. In particular, automatic 3D brain segmentation of white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF) is especially important for quantitative analysis and tissue volume measurements. Those quantitative measurements are a key factor in assessing the progress or remission of several diseases of the central nervous system such as MS. As a consequence of the high-resolution data provided by MRI, its application to the study of MS lesions has greatly improved the ability to diagnose the disease and monitor its evolution [48]. Indeed, MRI is the most sensitive technique for detecting demyelinating lesions of the CNS in MS patients [64]. New diagnostic criteria based on MRI make it possible to demonstrate the dissemination of the pathology in space and time, allowing early diagnosis [68] and permitting the exclusion of other possible pathologies.


2.1.1 MRI concepts

MRI takes advantage of the high amount of water in body tissue. Briefly, the underlying principle is the magnetization of protons, which align in the presence of a strong magnetic field. Water molecules contain two hydrogen nuclei (protons), and the average magnetic moment of the protons in a body region exposed to the field becomes aligned with the field direction. A radio-frequency (RF) pulse emitted by a coil at the resonance frequency then flips the proton spins away from this alignment. After the RF pulse is discontinued, the protons progressively realign with the initial static field, a state also known as thermodynamic equilibrium. The time needed to return to equilibrium is defined as the relaxation time. During this relaxation, a radio-frequency signal is generated, which can be measured with receiver coils. The excitation is repeated by applying a new pulse sequence to the water molecules. The repetition time (TR) is defined as the time between successive pulse sequences applied to the same body region.
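The resonance frequency and the relaxation described above can be made concrete with two standard relations: the resonance (Larmor) frequency is proportional to the static field strength, and the longitudinal magnetization recovers exponentially with time constant T1. The sketch below assumes an approximate gyromagnetic ratio for hydrogen of 42.58 MHz/T; the function names, the field strengths and the T1 value are illustrative choices, not values taken from this thesis.

```python
import math

# Gyromagnetic ratio of hydrogen divided by 2*pi, in MHz per tesla (approximate).
GAMMA_BAR_H_MHZ_PER_T = 42.58

def larmor_frequency_mhz(b0_tesla):
    """Resonance frequency of hydrogen nuclei in a static field of b0_tesla."""
    return GAMMA_BAR_H_MHZ_PER_T * b0_tesla

def longitudinal_recovery(t_ms, t1_ms, m0=1.0):
    """Longitudinal magnetization t_ms after a 90-degree pulse: M0 * (1 - exp(-t/T1))."""
    return m0 * (1.0 - math.exp(-t_ms / t1_ms))

f_15 = larmor_frequency_mhz(1.5)   # typical 1.5 T clinical scanner
f_30 = larmor_frequency_mhz(3.0)   # typical 3 T clinical scanner
recovered = longitudinal_recovery(t_ms=600.0, t1_ms=600.0)  # one T1 after excitation
```

After one T1 interval about 63% of the equilibrium magnetization has recovered, which is why the choice of TR relative to the tissue T1 values controls how strongly T1 differences show up in the image.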

The composition of each brain tissue defines its relaxation times and permits different image acquisition types. The T1-weighted acquisition sequence depicts differences in the spin-lattice relaxation time of the various tissues within the body. The spin-lattice relaxation time (T1 relaxation time) is defined as the time taken by each tissue to return to its thermodynamic equilibrium. The T2-weighted acquisition sequence depicts differences in the spin-spin relaxation time. The spin-spin relaxation time (T2 relaxation time) characterizes the decay of the magnetic resonance signal. The echo time (TE) represents the time between the application of the RF pulse and the peak of the echo signal. The voxel intensity of the acquired image is determined by the timing parameters of the sequence (TR and TE) and by the tissue relaxation times (T1 and T2), and each acquisition sequence is defined by its timing parameters. T1w sequences are defined by a short TR and a short TE (TR < 1000 ms, TE < 30 ms). T2w sequences are defined by a long TR and a long TE (TR > 2000 ms, TE > 80 ms). T1 scans clearly separate gray matter from white matter and are often used in brain tissue segmentation when only one modality is used (monospectral). This is the case for all the methods analyzed in this study, which are run only on T1 scans. Moreover, T1 is also combined with other sequences in multispectral tissue segmentation methods.

T2 scans are often used in the intra-tissue classification of abnormal fluid against normal tissue, which makes T2 suitable for lesion detection. However, the correlation between the lesion burden observed on conventional MRI scans and the clinical evidence remains weak. In particular, discrepancies between clinical and conventional MRI findings in MS are explained, at least partially, by the limited ability of conventional MRI to characterize and quantify the heterogeneous features of MS pathology [35]. Consequently, more advanced acquisition methods are being used today, such as fluid attenuated inversion recovery (FLAIR) [9], diffusion techniques such as diffusion weighted imaging (DWI) [58] and diffusion tensor imaging (DTI) [25], and magnetization transfer (MT) [86]. FLAIR sequences suppress fluids from the image, for example attenuating the effect of CSF on the acquired image. These acquisition sequences have been widely used to classify periventricular hyperintense lesions, such as MS plaques [43]. In turn, diffusion (DWI, DTI) and magnetization transfer (MT) techniques have been used in histopathology studies of MS [65] [36]. DWI and DTI techniques measure the diffusion of water molecules, known as Brownian motion, and use their apparent diffusivity to produce neural tract data. In magnetization transfer, the image contrast is improved based on the observed changes caused by the transfer of magnetization from hydrogen nuclei of water with restricted motion (hydration or bound water) to hydrogen nuclei of water that move with many degrees of freedom (free or bulk water).

Figure 2.1: Different acquisition sequences. (a) Brain tissue segmentation with CSF (red), GM (green) and WM (blue). (b) T1w image. (c) T2w image. (d) FLAIR image.

2.2 Computer vision aspects

Figure 2.1 shows a set of typical MRI images acquired with the T1w, T2w and FLAIR modalities. In T1 (figure 2.1(b)), CSF has the darkest intensities while WM has the brightest. On the contrary, in T2 (figure 2.1(c)) CSF has the brightest intensities while WM has the darkest. In both sequences, GM has an intermediate gray level. In FLAIR acquisition sequences (figure 2.1(d)), however, WM and GM have an intermediate gray level and lesions appear brighter.
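The intensity ordering of the tissues in T1w and T2w images can be reproduced with the usual spin-echo signal approximation S = PD · (1 - exp(-TR/T1)) · exp(-TE/T2). The sketch below is illustrative only: the proton densities and relaxation times are rough values assumed for a field of about 1.5 T, and the TR/TE settings are merely chosen to satisfy the short/long ranges quoted in section 2.1.1; none of these numbers come from this thesis.

```python
import math

# Rough illustrative tissue parameters (assumed): proton density, T1 (ms), T2 (ms).
TISSUES = {
    "WM": (0.7, 600.0, 80.0),
    "GM": (0.8, 950.0, 100.0),
    "CSF": (1.0, 4000.0, 2000.0),
}

def spin_echo_signal(pd, t1, t2, tr, te):
    """Spin-echo signal approximation: PD * (1 - exp(-TR/T1)) * exp(-TE/T2)."""
    return pd * (1.0 - math.exp(-tr / t1)) * math.exp(-te / t2)

def tissue_signals(tr, te):
    """Relative signal of each tissue for the given repetition and echo times (ms)."""
    return {name: spin_echo_signal(pd, t1, t2, tr, te)
            for name, (pd, t1, t2) in TISSUES.items()}

t1w = tissue_signals(tr=500.0, te=15.0)     # short TR, short TE
t2w = tissue_signals(tr=4000.0, te=100.0)   # long TR, long TE
```

With these assumed values the T1w setting ranks WM above GM above CSF, while the T2w setting reverses the order, in line with the appearance of figure 2.1(b) and (c).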

3D MRI volumes are a stack of individual 2D magnetic resonance images captured at different slices with constant distance intervals. Every voxel, also known as a volumetric pixel, represents a value on the regular grid defined by the MR images and slices. Normally, voxels encode the gray-level intensity with 8 or 16 bits.

Figure 2.2: Skull stripping of the brain. (a) Non-brain parts such as eyes, skull and fat are also present in MRI images. (b) Same scan after the skull-stripping process.
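The description of a volume as a stack of 2D slices on a regular grid maps directly onto a 3D array. The following minimal sketch assumes NumPy; the slice count, image size and voxel spacing are arbitrary illustrative values, and the random intensities only stand in for real MR images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten synthetic 64 x 64 slices with 16-bit intensities (stand-ins for real MR images).
slices = [rng.integers(0, 2**16, size=(64, 64), dtype=np.uint16) for _ in range(10)]

# Stack the 2D images along a new first axis: (slice, row, column).
volume = np.stack(slices, axis=0)

# Assumed voxel spacing in millimetres: slice distance, row spacing, column spacing.
spacing_mm = (3.0, 1.0, 1.0)
voxel_volume_mm3 = float(np.prod(spacing_mm))

# A voxel is addressed by its slice, row and column indices on the regular grid.
voxel_value = volume[4, 32, 16]
```

Keeping the voxel spacing next to the array matters for the volumetric analysis discussed later, since a tissue volume is the number of voxels with a given label multiplied by the volume of one voxel.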

In what follows, the main difficulties related to MRI images are explained. Basically, processing acquired MRI volumes has to deal with the extraction of the brain from the skull, partial volume effects and intensity inhomogeneities.

2.2.1 Skull stripping

Acquired brain MRI volumes incorporate non brain tissue parts of the head such as eyes, fat,

spinal cord or skull. The segmentation of brain tissue from non-brain tissue in MRI is

commonly referred to as skull stripping, and it is an important image processing step in many

neuroimaging studies. Studies have reported that differences in skull stripping can lead to

unexpected results in the tissue classification if skull or eyes are included as brain tissue [1].

Figure 2.2 depicts the skull stripping process. Figure 2.2(a) shows an MRI scan, where eyes,

fat and skull are present. Figure 2.2(b) shows the preprocessed MRI scan with skull and non

brain parts extracted.

2.2.2 Partial volume effects

Automatic brain tissue segmentation algorithms classify the voxels into their possible classes

(CSF, GM and WM). However, one of the most important problems is the classification of

voxels where more than one tissue is present. This phenomenon is referred to as partial volume

effects (PVE). PVE blur the intensity distinction between tissue classes at their border. For

example, a T1 image voxel containing a boundary between CSF and WM can be misclassified


(a) Synthetic representation (b) Real volume

Figure 2.3: Partial volume effects representation. (a) Synthetic representation of intensity blurring on boundaries and (b) real volume.

as GM because of the increase in the blur (figure 2.3 (a)). A real example is shown in figure

2.3 (b), where the highlighted region is blurred due to PVE at the tissue boundaries.
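The CSF/WM example above can be illustrated with a minimal sketch. It assumes a simple linear mixing model, in which the intensity of a boundary voxel is the volume-weighted average of the pure tissue intensities; this model and the mean intensities below are illustrative assumptions, not values taken from any dataset.

```python
import numpy as np

# Hypothetical mean T1 intensities (illustrative values only).
TISSUE_MEANS = {"CSF": 30.0, "GM": 100.0, "WM": 160.0}


def mixed_intensity(fractions):
    """Intensity of a voxel under an assumed linear partial volume model.

    `fractions` maps tissue name -> volume fraction; fractions must sum to 1.
    """
    total = sum(fractions.values())
    if not np.isclose(total, 1.0):
        raise ValueError("tissue fractions must sum to 1")
    return sum(TISSUE_MEANS[t] * f for t, f in fractions.items())


def nearest_tissue(intensity):
    """Hard label: tissue whose mean intensity is closest to `intensity`."""
    return min(TISSUE_MEANS, key=lambda t: abs(TISSUE_MEANS[t] - intensity))


# A voxel lying on a CSF/WM boundary, containing half of each tissue.
boundary = mixed_intensity({"CSF": 0.5, "WM": 0.5})
label = nearest_tissue(boundary)  # the mixed voxel falls in the GM range
```

Under these assumed values the mixed voxel has an intensity closer to the GM mean than to either of the tissues it actually contains, which is the misclassification described above.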

2.2.3 Intensity inhomogeneity

The MRI acquisition process can be corrupted by several image artifacts. These artifacts have a

direct impact on segmentation results. Automated brain segmentation pipelines usually in-

corporate a preprocessing step by which these image inhomogeneities are removed. Sources

of inhomogeneities have been studied extensively [72]. The causes of these artifacts have been divided

into two main groups [82]: those inherent to the MRI device and those caused

by the scanned object. The main causes in the first group derive from radio frequency

(RF) transmission and reception, but also from differences in the magnetic field, bandwidth

filtering of the data or eddy currents driven by field gradients. Causes in the second

group are related to the imaged object itself (position, shape, and orientation of the object

inside the magnet) or dielectric properties of the object. Figure 2.4 depicts four volumes with

different inhomogeneity intensity levels from the IBSR dataset 1: figure 2.4(a) shows a volume

with low bias while figure 2.4 (b) depicts a volume with high bias localized in one stripe in the

image center. Figure 2.4(c) shows a very high bias with different stripes along the volume while

figure 2.4(d) characterizes a typical corrupted image acquisition. Intensity inhomogeneities

inherent to the MRI scanner can be corrected by shimming techniques [82], special imaging

sequences and different sets of coils, or by calibrating the MRI device with a phantom or

a mathematical model. However, correcting inhomogeneities related to the scanned object is

1 The IBSR dataset is sorted by level of difficulty. Some volumes come with a very high intensity inhomogeneity load.


(a) Low bias (b) High bias (c) Very high bias (d) Corrupted volume

Figure 2.4: Different acquisition images with intensity inhomogeneity biases from the IBSR dataset: (a) low bias, (b) high bias localized in one stripe in the image center, (c) very high bias with different stripes, (d) corrupted image.

still a hard problem. The frequency used to excite the imaged object increases linearly with field strength, so in high

magnetic field MR scanners the effects of RF standing waves and penetration

caused by the object become more important.

Chapter 3

State of the art

This chapter reviews the state-of-the-art of methods in brain tissue segmentation, pointing out

their advantages and disadvantages. Different strategies have been defined in previous literature

reviews to classify those methods. Bezdek et al. [12] studied 90 papers on MRI segmentation

using pattern recognition techniques and divided them into supervised and unsupervised methods.

Clarke et al. [21] expanded this classification with pre-processing and registration techniques

as well as validation methods. Pham et al. [60] analyzed the most common algorithms used in

segmentation of anatomical structures in medical imaging. The authors analyzed previous work

classifying them directly by approach, i.e., clustering, classifiers, thresholding, Markov models,

etc. In another review, Zhang et al. [90] categorized the methods by classification, assessment

criteria and performance metrics proposed by each method. More recently, Balafar et al. [10]

added several imaging modalities and methods for noise reduction and inhomogeneity correction to

classify the different approaches.

3.1 Classification of segmentation approaches

In machine learning, classification can be based on supervised or unsupervised learning.

Supervised learning uses a training set of correctly identified observations. Those observations

are used as prior information to perform the brain tissue segmentation. On the contrary,

unsupervised learning, also known as cluster analysis, does not use any prior information in the

classification task and involves grouping data into categories based on some measure of

inherent similarity, e.g., distance-based measures. Following Bezdek et al. [12], we also

categorize the methods by learning type as supervised (S) and unsupervised (U). Moreover,

we extend the categorization adding algorithm and evaluation characteristics such as intensity

uniformity correction (IUC), image type (T1,T2,PD), dataset and statistical measure used.

Although a large number of works have been published, the study is focused only on those which

use common databases such as IBSR 1 and Brainweb (BW) [22]. This characteristic

permits us to compare quantitatively the accuracy of each method. 30 published works from

2001 to the present have been reviewed. Table 3.1 summarizes the characteristics of the selected

works.

3.1.1 Supervised methods

Supervised methods (S) described in table 3.1 can be organized in two main groups. The first

group of methods makes use of the Bayesian formulation, which implies an iterative

simultaneous estimation of tissue classes and intensities. They model the intensity distribution of

brain tissues by a Gaussian mixture model (GMM) classifying voxels according to the intensity

distribution of the data. Prior information such as probabilistic atlases is introduced in the

methodology to estimate initial parameters of the model. Given the weighted mixture

distribution, segmentation is commonly estimated by an expectation maximization (EM) optimization

algorithm with maximum a posteriori (MAP) or maximum likelihood (ML) estimation [5] [33] [57].

Moreover, spatial coherence assumptions can be added using a Bayesian approach, in the form

of Markov Random Fields (MRF) models [8] [89] [14] or multi-scale information [3]. The second

group is composed of statistical supervised classification approaches. Those methods typically

develop different strategies to train non-parametric [23] [27] or high-dimension feature space

classifiers [49] [87] [85], where training data is extracted from prior information given by statis-

tical or expert labelled atlases.
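As an illustration of the first group, the following is a minimal sketch of plain EM fitting a one-dimensional GMM to voxel intensities. It is not the implementation of any of the cited methods: there is no atlas, no bias correction and no MRF term, and the function name `em_gmm_1d`, the initial means and the synthetic intensities are all assumptions made for the example. In a real pipeline the initial means would typically come from registered atlas priors.

```python
import numpy as np


def em_gmm_1d(x, means, n_iter=200, eps=1e-12):
    """Fit a 1D Gaussian mixture to intensities `x` with plain EM.

    `means` holds the initial class means; variances and mixing weights
    start from simple global guesses.
    Returns (weights, means, variances, responsibilities).
    """
    x = np.asarray(x, dtype=float)
    means = np.asarray(means, dtype=float).copy()
    k = means.size
    variances = np.full(k, x.var() + eps)
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: posterior probability of each class for each voxel.
        diff = x[:, None] - means[None, :]
        log_pdf = -0.5 * (diff ** 2 / variances + np.log(2 * np.pi * variances))
        log_post = np.log(weights) + log_pdf
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate parameters from the soft assignments.
        nk = resp.sum(axis=0) + eps
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        diff = x[:, None] - means[None, :]
        variances = (resp * diff ** 2).sum(axis=0) / nk + eps

    return weights, means, variances, resp


# Synthetic "voxel intensities" for three classes (illustrative values only).
rng = np.random.default_rng(0)
x = np.concatenate([
    rng.normal(30, 5, 500),    # CSF-like
    rng.normal(100, 8, 1500),  # GM-like
    rng.normal(160, 6, 1000),  # WM-like
])
weights, means, variances, resp = em_gmm_1d(x, means=[20.0, 90.0, 170.0])
labels = resp.argmax(axis=1)  # hard label: class with highest posterior
```

The responsibilities `resp` are the soft class memberships; taking their argmax gives a hard segmentation, which discards the partial volume information that some of the methods below try to retain.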

Probabilistic methods

Marroquin et al (2002) [57] proposed a variant of the Bayesian estimation framework with

parameter estimation via Expectation Maximization algorithm (EM). Initial tissue probabilities

were computed using a robust registration of a standard-brain and intensity inhomogeneities

for each class were eliminated by separate parametric smooth models. However, the method

produced a hard segmentation result which does not deal with PVE. Similarly, Dugas-Phocion

et al. (2004) [33] introduced an improved EM algorithm for brain tissue segmentation with

partial volume effect quantification. It was based on a Gaussian Mixture Model (GMM) with

prior information from a probabilistic brain atlas. According to the authors, vessels were

wrongly classified as CSF, so they introduced a new class for vessels. The results reported an

improvement in the global accuracy of the method. Ashburner et al. (2005) [5] also presented a

probabilistic brain segmentation framework based on GMM with prior atlas information. Here,

1 http://www.cma.mgh.harvard.edu/ibsr/data.html


Article reference                Algorithm characteristics   Experiment
Author                     Year  Algorithm  S-U  IUC         Image type  Database   Measure

Ashburner et al. [5]       2005  GMM        S    yes         all         BW         Dice
Askelrod et al. [3]        2007  SWA        S    no          T1          IBSR       Dice
Askelrod et al. [4]        2007  SVM        S    no          T1          IBSR       Dice
Awate et al. [8]           2006  AMRF       S    no          T1          BW,IBSR    Dice
Bazin et al. [11]          2008  TBMS       S    no          T1          IBSR       Dice
Boer et al. [27]           2010  KNN        S    no          T1,PD       other      Dice
Bricq et al. [14]          2008  HMRF       S    yes         T1,T2       BW,IBSR    Dice
Cocosco et al. [23]        2003  KNN        S    no          all         other      Dice
Jimenez et al. [49]        2006  MS         S    no          T1          IBSR       Overlap
Marroquin et al. [57]      2002  EM         S    yes         T1,T2,PD    BW         Dice
Dugas-Phocion et al. [33]  2004  EM         S    no          T1,T2       other      Dice
Scherrer et al. [69]       2010  MRF        S    yes         T1          BW         Dice
Wels et al. [85]           2011  DMC-EM     S    yes         T1,T2,PD    BW,IBSR    Dice
Yi et al. [87]             2009  RF         S    yes         T1          IBSR       Overlap
Zhang et al. [89]          2001  HMRF+EM    S    yes         all         IBSR       -
Demirhan et al. [28]       2011  WSOM       U    no          T1          IBSR       Overlap
Hasanzadeh et al. [42]     2007  GFCM       U    no          all         BW         Dice
He et al. [44]             2008  AFCM       U    no          T1          BW         Dice
Kalaiselvi et al. [51]     2011  HFCM       U    no          T1          IBSR       -
Krinidis et al. [53]       2010  FLICM      U    no          T1          other      -
Pham et al. [61]           2001  RFCM       U    no          T1          BW         -
Tohka et al. [79]          2007  GA-GMM     U    no          T1,T2       BW         -
Tian et al. [78]           2011  GA-VEM     U    no          T1          IBSR       Overlap
Yuanfeng et al. [88]       2011  PSOM       U    no          T1          BW         Dice
Caldairou et al. [18]      2009  RFCM       U    no          T1          BW,IBSR    Dice
Shen et al. [71]           2005  SOM+FCM    U    no          T1          BW,IBSR    -

Table 3.1: Selected state-of-the-art automatic brain tissue segmentation methods. The acronyms for the algorithms stand for: Adaptive Fuzzy C-means (AFCM), Adaptive Markov Random Fields (AMRF), Discriminative Model Constrained Expectation Maximization (DMC-EM), Expectation Maximization (EM), Gaussian Mixture Model (GMM), Genetic Algorithm (GA), Genetic Fuzzy C-means (GFCM), Hidden Markov Random Fields (HMRF), Histogram-based Fuzzy C-means (HFCM), Hybrid Fuzzy C-means (HbFCM), K-nearest Neighbor (KNN), Markov Random Fields (MRF), Mean-shift (MS), Random Forests (RF), Probabilistic Self Organized Map (PSOM), Robust Fuzzy C-Means (RFCM), Segmentation by Weighting Aggregates (SWA), Self Organized Map (SOM), Support Vector Machines (SVM), Topology-Preserving Fast Marching Segmentation (TBMS), Variational Expectation Maximization (VEM), Weighted Self Organizing Map (WSOM).


prior tissue probabilities were obtained by registering an ICBM Tissue Probabilistic Atlas 2,

which provided the initial tissue probabilities 3. The method combines image registration,

tissue classification and bias correction at the same time by a derivation of the log-likelihood

function 4.

Zhang et al. (2001) [89] proposed a parametric EM-based approach with Hidden Markov

Random Fields (HMRF) and thresholding-based initialization 5 as a substitute for the widely used

GMM, which is more sensitive to noise. The method was based on an HMRF-EM framework, a

combination of the HMRF model, which encoded spatial information through the mutual influences

of neighboring voxels, and the associated Markov Random Field Maximum a Posteriori

(MRF-MAP) estimation by EM fitting procedures. Bias correction was incorporated by introducing

the algorithm proposed by Guillemaud et al [39] into the EM parameter estimation. Awate et

al. (2006) [8] presented a segmentation method built on an adaptive statistical model of image

neighborhoods and atlas-based initialization. Although the method did not use training data,

it adapted to the data given by an initial configuration that was generated from an atlas of

labels and then performed tissue classification by adaptively learning the image-neighborhood

statistics via data-driven nonparametric density estimation (nonparametric MRF).

Additionally, Bricq et al. (2008) [14] introduced another Bayesian approach where prior information

was obtained both from a probabilistic brain atlas containing prior expectations about the

spatial localization of different tissue classes and from neighborhood information using a Hidden Markov

Chain (HMC) model. The bias field was estimated and corrected online as a linear combination of

smooth bias functions as proposed by Van Leemput et al. [81]. Scherrer et al. (2010) [69] provided

a joint model framework for carrying out tissue and structure segmentations by distributing a

set of local agents that estimate cooperatively local MRF models. The approach was based on

a fully Bayesian joint model that integrates within a multi-agent framework local tissue and

structure segmentations and local intensity distribution modeling. The joint model was built

on the specification of three conditional Markov Random Field (MRF) models: two encoding

cooperations between tissue and structure segmentations with a priori anatomical knowledge,

and another model specifying a Markovian spatial prior over the model parameters that

enables local estimations and consistently handles intensity non-uniformity correction without

bias field modeling.

Askelrod et al. (2007) [3] proposed a method based on Segmentation by Weighted aggregates

(SWA). This method incorporated prior knowledge information into a multi-scale framework

2 Atlas freely available from http://www.loni.ucla.edu/ICBM/ICBM_Probabilistic.html
3 This method is developed in more detail in the next chapter as part of our selected segmentation methods list
4 The proposed method is publicly available as Statistical Parametric Mapping (SPM). http://www.fil.ion.ucl.ac.uk/spm/
5 Publicly available as part of FSL http://www.fmrib.ox.ac.uk/fsl/index.html


through a Bayesian formulation. Atlas priors provided probabilistic information and were added

to a likelihood function estimated from a manual training. The method constructed a pyramid

of different resolution image graph representations. This configuration permitted the authors

to adaptively represent progressively larger additions of voxels with similar properties.

Statistical classification

Cocosco et al. (2003) [23] proposed a non parametric K-Nearest Neighbor (KNN) classifier with

adaptive training. A set of training samples was generated from prior tissue probability maps

registered on the subject itself. The training set was customized by using a pruning strategy,

such that the classification was robust against anatomical variability and pathology. Similarly,

Boer et al. (2009) [27] introduced another KNN classifier also automatically trained on the

subject itself by using T1 and PD volume intensities. The training dataset was built in the

same manner as in Cocosco et al. (2003) [23], but here training samples for the classifier

were obtained from the subject itself by atlas-based registration of single or multiple label atlases

to the subject. The transformations were obtained by registration of the grayscale images and

applied to the labeled images. Jimenez et al. (2006) [49] proposed another non parametric

estimation strategy, based on the Mean-Shift (MS) algorithm, which defines the cluster centers

using the local modes of the underlying joint space-range density function. Tissue boundaries

information was improved by including an edge confidence map representing the confidence

of truly being in the presence of a border between adjacent regions. The confidence measure

was used to fuse iteratively a region adjacency map, merging and pruning regions with weak

edges and very small regions respectively. Class labeling was performed by maximum a posteriori

(MAP) decision, based on prior label atlases. For each homogeneous region found in the graph,

a probability map associated to each of the tissue classes was computed.
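The idea of training a KNN classifier on the subject itself, as in the approaches of Cocosco et al. and Boer et al. described earlier in this section, can be sketched as follows. This is a simplified illustration under stated assumptions: only a single intensity feature is used, the atlas priors are synthetic, the pruning step of Cocosco et al. is not implemented, and the function names `select_training` and `knn_classify` are hypothetical.

```python
import numpy as np


def select_training(intensity, priors, threshold=0.9):
    """Pick training voxels whose atlas prior for one class exceeds `threshold`.

    `intensity` has shape (n_voxels,), `priors` has shape (n_voxels, n_classes).
    Returns (training intensities, training labels).
    """
    best = priors.argmax(axis=1)
    confident = priors.max(axis=1) >= threshold
    return intensity[confident], best[confident]


def knn_classify(train_x, train_y, query_x, k=5, n_classes=3):
    """Label each query intensity by majority vote among its k nearest samples."""
    dist = np.abs(query_x[:, None] - train_x[None, :])
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = train_y[nearest]
    counts = np.stack([(votes == c).sum(axis=1) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)


# Synthetic subject: 600 voxels, 3 classes with illustrative mean intensities.
rng = np.random.default_rng(1)
true = rng.integers(0, 3, size=600)
intensity = np.array([30.0, 100.0, 160.0])[true] + rng.normal(0, 5, size=600)

# Hypothetical registered priors: confident for the first 200 voxels, flat elsewhere.
priors = np.full((600, 3), 1.0 / 3.0)
priors[:200] = 0.025
priors[np.arange(200), true[:200]] = 0.95

train_x, train_y = select_training(intensity, priors)
predicted = knn_classify(train_x, train_y, intensity, k=5)
```

Because the training samples come from the same scan that is being segmented, the classifier adapts to the intensity range of that particular acquisition, which is the motivation given for subject-specific training.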

Askelrod et al. (2006) [4] proposed the combination of a fast pyramidal multichannel 3D

segmentation algorithm with a high-dimension feature space support vector machine (SVM).

The pyramid was constructed by adding different scale aggregates based on sets of weighted

voxels with intensity similarity. The pyramid added rich attributes about texture and shape

to the voxel gray level intensity. The feature set for the SVM classifier was composed of the expanded

attributes and registered prior knowledge of anatomic structures from probabilistic atlases. Yi

et al. (2009) [87] proposed a learning-based method based on discriminative Random Decision

Forest (RF) classification which took into account partial volume effects and non-uniformity

intensity correction by a smoothing multiplying factor. The authors built an RF feature space of

context-rich visual information, based on raw intensities of voxels, image gradient, atlas based

probabilities and the output of the Maximum a posteriori (MAP) classifier. The partial volume

effects were estimated as new classes and re-assigned with the mixing fraction of them. Wels

et al. (2010) [85] proposed another brain segmentation pipeline, the Discriminative Model

Constrained Expectation Maximization (DMC-EM) method, which also included probabilistic

boosting trees (PBT). It integrated unsupervised statistical EM segmentation and MRI

modality-specific discriminative modeling into a single Bayesian framework. The algorithm

estimated intensity non-uniformities via EM and regularized segmentation and parameter es-

timation using a Markov random field (MRF) prior model which provides knowledge about

spatial and appearance-related homogeneity of segments in terms of pairwise clique potentials

of adjacent voxels. Moreover, the method incorporated into the segmentation process unique

clique potentials composed by patient-specific knowledge about the global spatial distribution

of brain tissue by MRI-specific Haar-like features and rigidly aligned probabilistic atlas-based

features. Those clique potentials were used to classify image voxels via a probabilistic boosting

tree (PBT).

3.1.2 Unsupervised methods

Unsupervised methods (U) also described in table 3.1 can be organized in three main groups.

Fuzzy c-means (FCM) is the most common clustering technique applied for MRI brain tissue

segmentation. Fuzzy partitioning is carried out through an iterative optimization of an

objective function, very similar to the one used in hard k-means clustering but weighted by a fuzziness

degree. Because of its fuzzy nature, voxels are allowed to belong to several classes, making FCM

techniques intrinsically able to deal efficiently with partial volume artifacts. However, fuzzy

clustering is commonly very sensitive to noise and intensity non-uniformities, since spatial

information is not taken into account in the partitioning process. Some works have been proposed to deal

with this aspect weighting membership functions by the addition of local [2] [20] [76] [17] [53]

or non-local [18] neighboring information. Furthermore, other strategies have been presented

to overcome intensity non-uniformities based on FCM, modifying cluster spaces [44] or by complex

neighboring approaches [71]. A second group of strategies is based on evolutionary

optimization. Genetic algorithms (GA) are one of the different unsupervised evolutionary optimization

techniques. They implement adaptive heuristic global optimization inspired by the evolutionary

ideas of natural selection and genetics. GA have been used in MRI segmentation for prior

parameter estimation of finite mixture models such as GMM [79] [42] or, more recently, to initialize

the parameters of proposed variations of the EM algorithm [78]. The third group is composed

of unsupervised artificial neural network (ANN) classifiers. Those models are computational

models inspired by functional aspects of biological neural networks. In particular, self organized

maps (SOM) have been widely used in brain tissue segmentation as clustering methods [28]

or as optimization processes [88] [71] for Probabilistic Neural Networks (PNN).
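The standard FCM iteration mentioned at the start of this section can be sketched as below. This is the basic algorithm only, applied to one-dimensional intensities with no spatial regularization; the function name `fuzzy_c_means`, the initial centers and the synthetic data are assumptions for illustration. The fuzziness degree is the exponent `m`, and the memberships follow the usual closed-form update derived from the FCM objective function.

```python
import numpy as np


def fuzzy_c_means(x, centers, m=2.0, n_iter=100, tol=1e-5, eps=1e-12):
    """Standard fuzzy c-means on 1D intensities (no spatial term).

    `m` > 1 is the fuzziness degree. Returns (centers, memberships), where
    memberships has shape (n_voxels, n_clusters) and each row sums to 1.
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        # Squared distance of every voxel to every cluster center.
        d2 = (x[:, None] - centers[None, :]) ** 2 + eps
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # Center update: mean of intensities weighted by u^m.
        um = u ** m
        new_centers = (um * x[:, None]).sum(axis=0) / um.sum(axis=0)
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, u


# Synthetic intensities for three tissue-like classes (illustrative values only).
rng = np.random.default_rng(2)
x = np.concatenate([
    rng.normal(30, 5, 500),
    rng.normal(100, 8, 1500),
    rng.normal(160, 6, 1000),
])
centers, u = fuzzy_c_means(x, centers=[20.0, 90.0, 170.0])
hard_labels = u.argmax(axis=1)
```

Each row of `u` is a soft assignment, which is why FCM can represent a voxel that is partly one tissue and partly another. Since every voxel is processed independently of its neighbors, a noisy voxel changes its own membership directly, which is the weakness the spatially regularized variants below address.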


Fuzzy Clustering

Several modifications of the FCM algorithm have been proposed to incorporate intensity

inhomogeneity correction. Pham et al. (2001) [61] generalized the FCM objective function to include

a spatial penalty on the membership functions (RFCM). The authors introduced a smoothing

function to control the fuzzy membership of voxels to a label by a penalty based on the

membership of a neighborhood to the other classes. The trade-off between the classic fuzzy objective

function and smoothing was controlled by a β parameter, optionally tuned either experimentally

or by regression. Similarly, Ahmed et al. (2002) ?? presented the FCM_S algorithm,

which also proposed a smoothing factor modifying the objective function by the addition of

a regularization term based on neighborhood membership to the same class. The amount of

regularization had to be tuned experimentally. Chen et al. (2004) [20] proposed a variation

of FCM_S (FCM_S1), simplifying the regularization term. In this approach, spatial

information was obtained by a pre-filtered image containing the average of neighboring voxels. Since

pre-filtering was done before the clustering process, the method improved the execution time.

Szilagyi et al (2003) [76] introduced a new approach for regularization by computing in advance

a linearly weighted sum image derived from the original image and its local neighbor

average image (EnFCM). This process sped up the clustering, since regularization was done in

advance and the sum image was used in the minimization function instead of the input image.

However, those methods were handicapped by the manual tuning of regularization parameters.

Cai et al (2007) [17] extended EnFCM with a fast generalized fuzzy c-means (FGFCM) algo-

rithm to improve the clustering results, as well as to facilitate the choice of the neighboring

control parameter. The precomputed image proposed in EnFCM was modified by a new image

based on a different local similarity measure to combine both spatial and gray level image

information into its objective function. Krinidis et al. (2010) [53] presented a fuzzy local information

approach (FLICM). The method made use of a new fuzzy local spatial and gray level

similarity factor to guarantee noise insensitiveness, image detail preservation and neighbor influence

depending on their distance to the voxel. Moreover, the algorithm was non-parametric in terms

of regularization weights. Finally, Caldairou et al. (2009) [18] integrated into the FCM seg-

mentation methodology a regularization from a non-local (NL) de-noising framework [15]. NL

exploited the similarity of small neighborhoods around a voxel within the same scene in order

to handle large neighborhoods without prior knowledge. The method combined non-local data

and regularization terms to handle intensity inhomogeneities and image noise respectively.

Shen et al. (2005) [71] proposed a technique based on an extension of FCM, where segmen-

tation performance was improved by neighborhood attraction with artificial neural network

(ANN) optimization. During clustering, each pixel attempted to attract its neighboring

pixels toward its own cluster. Neighbor attraction was based on two properties: voxel intensities


(feature attraction) and the spatial position of the neighbors (distance attraction). Classical dis-

tance in FCM from voxels to clusters was modified to incorporate both attractions. The degree

of attraction was controlled by two parameters λ and ξ, optimized by a simple Artificial Neural

Network (ANN). He et al. (2008) [44] extended the Adaptive Fuzzy C-Means (AFCM) approach

to multi-spectral segmentation with efficient intensity non-uniformity correction. The Mahalanobis

distance replaced the classical Euclidean measure between voxels and clusters in FCM, which

has the tendency to generate equal cluster volumes with spherical occupancy in the feature

space. Conversely, the Mahalanobis distance accounts for the volume and shape of clusters with

non-spherical occupancy in the feature space. The proposed method introduced the size and density of the

clusters into the AFCM approach using the algorithm of Gath and Geva [37] to overcome the equal

cluster volume limitation. Kalaiselvi et al. (2011) [51] proposed to modify cluster initialization

in the FCM algorithm. MRI intensity characteristics of brain regions were used to initialize the

centroids. The lowest value in the image histogram was used as the cluster center for background, while

the highest intensity (255) was fixed as the centroid for WM. In between, equal intervals were assumed

based on the peaks as centroids for CSF and GM.
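The difference between the Euclidean and Mahalanobis distances discussed above for the approach of He et al. can be made concrete with a small sketch. The two-dimensional feature space, the cluster center and the covariance matrix below are hypothetical values chosen for illustration; the algorithm of Gath and Geva itself is not implemented here.

```python
import numpy as np


def euclidean_sq(x, center):
    """Squared Euclidean distance from feature vector `x` to `center`."""
    diff = x - center
    return float(diff @ diff)


def mahalanobis_sq(x, center, cov):
    """Squared Mahalanobis distance, scaled by the cluster covariance `cov`."""
    diff = x - center
    return float(diff @ np.linalg.solve(cov, diff))


# Hypothetical cluster elongated along the first feature axis.
center = np.array([0.0, 0.0])
cov = np.array([[9.0, 0.0],
                [0.0, 1.0]])

a = np.array([3.0, 0.0])  # lies along the elongated axis
b = np.array([0.0, 3.0])  # lies along the narrow axis

euclid_a, euclid_b = euclidean_sq(a, center), euclidean_sq(b, center)
maha_a = mahalanobis_sq(a, center, cov)
maha_b = mahalanobis_sq(b, center, cov)
```

Both points are equally far from the center in the Euclidean sense, but the Mahalanobis distance treats `a` as much closer because it lies along the direction in which the cluster is spread out. This is what allows clusters of different shape and volume to be modeled.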

Evolutionary optimization

Hasanzadeh et al. (2007) [42] introduced a method based on a genetic fuzzy system for modeling

different tissues in brain MRI. A fuzzy classifier was trained for each tissue where an evolution

process was defined for training classifiers. The output probability distribution functions of

these classifiers were modeled by a GMM. Gaussian parameters were used for voxel classification

via Maximum likelihood (ML) estimators and Bayesian classifiers. Similarly, Tohka et al.

(2007) [79] proposed a Gaussian Mixture Model (GMM) approach based on real coded genetic

algorithm initialization (GA-GMM) 6. The authors proposed to use a blended crossover to

reduce the premature convergence problem of evolutionary algorithms and a new permutation

operator specifically meant for the GMM parameter estimation. The permutation operator

allowed biologically meaningful constraints to be imposed on the GMM parametrization 7. More

recently, Tian et al. (2010) [78] introduced a hybrid genetic and variational EM algorithm

for GMM (GA-VEM). The algorithm aimed to overcome the intrinsic overfitting found in EM

by combining the global optimization provided by GA with the ability of VEM to avoid overfitting.

In the variational EM approach, GMM parameters are assumed to be stochastic

variables governed by “hyperparameters” which in conjunction with voxel labels can be estimated

in a variational extension of the EM technique. Here GA was employed to initialize prior

distributions of hyper-parameters involved in the VEM algorithm used to estimate the GMM.

6 The implementation of the algorithm is publicly available from www.cs.tut.fi/~jupeto/gamixture.html
7 This method is developed in more detail in the next chapter as part of our selected segmentation methods

Neural networks

Yuanfeng et al. (2011) [88] proposed a classification method based on a probabilistic 3D Neural

Network (PNN). Spatial information was incorporated into the denoising process by using a

neighboring system in 3D to build a robust training set. In order to make the PNN

classifier more robust to noise, the training set was used to train a SOM and reference output

vectors from each class were used to train the PNN. Demirhan et al. (2011) [28] presented

a method which combines unsupervised learning with self-organizing maps (SOM)

and supervised learning vector quantization (LVQ). The authors proposed to distinguish

different tissues using stationary wavelet transform (SWT) applied to the input volumes. An

application of spatial filtering into the wavelet coefficients permitted the extraction of statistical

information of brain tissues. SOM was used to segment input images, with feature vectors

composed of that statistical information and the SWT coefficients. LVQ optimized the weight

vectors obtained from the trained SOM to produce optimal decision boundaries.

3.2 Reported results

Automatic brain tissue segmentation methods are usually evaluated using different quantitative

measures on both synthetic and real MRI volumes. Among these measures, common coefficients

found in the literature include Jaccard [47] and Dice [30] similarity indexes or statistical func-

tions found in pattern recognition such as sensitivity and specificity [32]. Although several

published works reported an evaluation using real patient scans from hospitals, public MRI

databases are commonly utilized to compare reported results between studies. In particular,

simulated scans from the Brainweb database [22] and real T1w scans from the Internet Brain

Segmentation Repository (IBSR) 8 are widely used.

3.2.1 Brainweb

Standard simulations from the Brainweb database 9 include parameter settings fixed to 3 modalities

(T1, T2 and PD), 5 slice thicknesses (1, 3, 5, 7 or 9mm), 6 levels of noise (0, 1, 3, 5, 7 or

9%), and 3 levels of intensity non-uniformity (0, 20 or 40%), defining a volume of 181x217x181

voxels (x, y, z). An anatomical brain model, consisting of a set of 3-dimensional "fuzzy" tissue

membership volumes, one for each tissue class, is employed to generate the simulated brain MRI data.

Tissue classes include GM, WM and CSF but also muscle, fat or skin. The voxel values in

these volumes reflect the proportion of tissue present in that voxel 10. The brain model used to

8 http://www.cma.mgh.harvard.edu/ibsr/data.html
9 http://mouldy.bic.mni.mcgill.ca/brainweb/selection_normal.html
10 http://mouldy.bic.mni.mcgill.ca/brainweb/anatomic_normal.html


(a) n=0%, b=0% (b) n=0%, b=20% (c) n=0%,b=40% (d) n=3%, b=0%

(e) n=3%, b=20% (f) n=3%, b=40% (g) n=7%, b=20% (h) n=7%, b=40%

Figure 3.1: Brainweb generated dataset for different noise levels (n) and biases (b).

generate the simulations can be also employed as ground truth. 1mm simulations provide a discrete

ground-truth model while in other slice thicknesses the anatomical model is fuzzy. Figure

3.1 depicts the effect of noise and inhomogeneity in simulated data. From scans without noise

(figs. (a), (b) and (c)), it can be observed how bias increases the apparent intensity of voxels

due to abrupt changes in the distribution of voxel intensities. Furthermore, noise corruption

can be recognized in volumes with 0% bias (figs (a), (d) and (g)) as a decrease of intensity due

to noise addition.

3.2.2 IBSR

IBSR is a public dataset from the Center for Morphometric Analysis at Massachusetts General

Hospital 11. The IBSR provides two datasets with 20 (IBSR20) and 18 (IBSR18) raw image

scans from healthy subjects. Table 3.2 shows the specifications for both datasets. IBSR20 data

are available in 8 or 16 bits while IBSR18 are only available in 8 bits resolution format. Labelled

volumes for evaluation in IBSR20 are provided by trained investigators using a semi-automated

intensity contour mapping algorithm [34] and also using signal intensity histograms. IBSR18

11 Available at http://www.cma.mgh.harvard.edu/ibsr


Table 3.2: Summary of publicly available databases on IBSR repository [85]

                 IBSR18                                IBSR20
Source           http://www.cma.mgh.harvard.edu/ibsr   http://www.cma.mgh.harvard.edu/ibsr
Volume size      256 × 256 × 128                       256 × 65 × 256
Voxel spacing    0.84 × 0.84 × 1.5 mm3                 1.0 × 3.1 × 1.0 mm3
                 0.94 × 0.94 × 1.5 mm3
                 1.00 × 1.00 × 1.5 mm3
Modality         T1                                    T1
Number of scans  18                                    20

data are provided with segmentation results of 43 individually labeled principal gray and white

matter structures of the brain.

3.2.3 Evaluation measures

Quantitative evaluations are commonly based on the comparison between the segmentation results and a volume manually labelled by an expert, or ground truth. Usually, intra- and inter-observer variability is mitigated by using labelled volumes from more than one expert. Still, this is not a sufficient condition, and it is difficult to find a consensus among experts. Warfield et al. (2004) [84] proposed an algorithm for simultaneous truth and performance level estimation (STAPLE). The method takes a collection of segmentations of an image and simultaneously computes a probabilistic estimate of the true segmentation and a measure of the performance level represented by each segmentation. However, most of the reviewed studies report evaluations based on statistical measures derived from classification rates with respect to the ground truth, such as true positive (TPR), true negative (TNR), false positive (FPR) and false negative (FNR) rates. In a single tissue classification, these rates are defined as:

• TPR is the percentage of voxels classified as tissue by the method that are labeled as

tissue by the expert.

• TNR is the percentage of voxels classified as non-tissue by the method that are labeled

as non-tissue by the expert.

• FPR is the percentage of voxels classified as tissue by the method that are labeled as

non-tissue by the expert.

• FNR is the percentage of voxels classified as non-tissue by the method that are labeled

as tissue by the expert.




From these rates, sensitivity or true positive fraction (TPF) is the classifier's ability to correctly identify tissue voxels [32]. It can be defined as:

\[ \text{sensitivity} = \frac{|TPR|}{|TPR| + |FNR|} \tag{3.1} \]

Similarly, specificity is defined as the classifier ability to identify non-tissue:

\[ \text{specificity} = \frac{|TNR|}{|TNR| + |FPR|} \tag{3.2} \]

The accuracy of the classifier is usually computed as the rate of correctly predicted voxels over all predicted voxels. Hence,

\[ \text{accuracy} = \frac{|TPR| + |TNR|}{|TPR| + |TNR| + |FNR| + |FPR|} \tag{3.3} \]

and conversely, the error rate of the classifier is given by the misclassified voxels over all predicted voxels as:

\[ \text{error} = \frac{|FPR| + |FNR|}{|TPR| + |TNR| + |FNR| + |FPR|} \tag{3.4} \]

Furthermore, similarity indexes can be used to compute the accuracy of the method. The Dice coefficient [30] is defined as the set agreement between classification and ground truth:

\[ dsc = \frac{2 \cdot |TPR|}{2 \cdot |TPR| + |FPR| + |FNR|} \tag{3.5} \]

Analogously, the Jaccard similarity index [47], also known as the Tanimoto coefficient, measures

the overlap between the segmentation results and the ground truth as:

\[ j = \frac{|TPR|}{|TPR| + |FPR| + |FNR|} \tag{3.6} \]
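As an illustration, the measures in Eqs. 3.1 to 3.6 can be computed directly from two binary labelings of the same voxels. The following is a minimal Python sketch and not part of the evaluated pipeline: the function name `overlap_measures`, the flattened-list representation of a volume and the toy labels are assumptions made for this example.

```python
def overlap_measures(pred, truth):
    """Compare two binary labelings (1 = tissue, 0 = non-tissue) voxel by voxel."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p and t)          # tissue in both
    tn = sum(1 for p, t in pairs if not p and not t)  # non-tissue in both
    fp = sum(1 for p, t in pairs if p and not t)      # tissue only for the method
    fn = sum(1 for p, t in pairs if not p and t)      # tissue only for the expert
    total = tp + tn + fp + fn
    # Assumes tissue and non-tissue voxels are both present (no zero denominators).
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "error": (fp + fn) / total,
        "dice": 2 * tp / (2 * tp + fp + fn),
        "jaccard": tp / (tp + fp + fn),
    }


# Toy example: 8 voxels of a flattened volume (hypothetical values).
method = [1, 1, 1, 0, 0, 0, 0, 1]
expert = [1, 1, 0, 0, 0, 0, 1, 1]
for name, value in overlap_measures(method, expert).items():
    print(f"{name}: {value:.3f}")
```

In a multi-tissue evaluation, the same function would be applied once per tissue class, treating that class as "tissue" and everything else as "non-tissue".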

Other measures based on intensity, distance or connectivity can be used, as reported by Cardenas et al. [19]. However, the Dice coefficient is the most widely used measure to quantitatively evaluate the accuracy of brain tissue segmentation. This can also be seen in table 3.1, where Dice is employed in 16 out of 26 reviewed methods. We extend the list of measures with two metrics commonly used in volumetric quantification of brain tissue and tissue atrophy in MS. The Fractional Brain Tissue (FBT) of a given class is the normalized fraction of that tissue in the brain. It is defined as the number of voxels classified as the given class divided by the number of all brain voxels. Hence:

\[ FBT_{CSF} = \frac{CSF}{CSF + GM + WM} \tag{3.7} \]

\[ FBT_{GM} = \frac{GM}{CSF + GM + WM} \tag{3.8} \]

\[ FBT_{WM} = \frac{WM}{CSF + GM + WM} \tag{3.9} \]

Moreover, experts usually measure atrophy in MS using the brain parenchymal fraction (BPF), which is defined as the number of GM, WM and lesion voxels L divided by the number of all brain voxels. A decrease in BPF over time might give early diagnostic clues about the onset of MS:

\[ BPF = \frac{GM + WM + L}{CSF + GM + WM} \tag{3.10} \]
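The volumetric measures of Eqs. 3.7 to 3.10 reduce to counting voxels per class. A minimal Python sketch follows; the label strings, the `lesion_count` argument (standing for L) and the toy counts are assumptions for illustration, and the BPF denominator follows Eq. 3.10 exactly as written above.

```python
from collections import Counter


def volumetric_fractions(labels, lesion_count=0):
    """FBT per tissue (Eqs. 3.7-3.9) and BPF (Eq. 3.10) from voxel labels.

    `labels` is a flattened sequence of "CSF", "GM" or "WM"; any other
    label (e.g. background) is ignored. `lesion_count` is the number of
    lesion voxels L, assumed to be counted separately.
    """
    counts = Counter(labels)
    csf, gm, wm = counts["CSF"], counts["GM"], counts["WM"]
    brain = csf + gm + wm
    return {
        "FBT_CSF": csf / brain,
        "FBT_GM": gm / brain,
        "FBT_WM": wm / brain,
        "BPF": (gm + wm + lesion_count) / brain,
    }


# Hypothetical volume with 100 brain voxels and 10 background voxels.
labels = ["CSF"] * 20 + ["GM"] * 45 + ["WM"] * 35 + ["BG"] * 10
print(volumetric_fractions(labels))
```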

3.2.4 Results analysis

Published results are grouped by database. Although some works introduce other accuracy measurements, only the Jaccard and Dice similarity indexes are considered. Moreover, T1 results are preferred to multispectral results when possible, in order to obtain an unbiased comparison between methods. Since some of the surveyed works compare their segmentation accuracy with other approaches present in our study, results for those approaches are included here when their original work does not report results for the same database and similarity index.

Brainweb database

Table 3.3 summarizes the results obtained by the surveyed works on the Brainweb database. In all works, WM and GM segmentation accuracy is evaluated with the Dice metric. A single study additionally reports overlap measures based on the Jaccard index. CSF is not evaluated in all methods because most studies are only interested in GM and WM volumetric measures. Since Brainweb simulated scans can be configured with different noise and intensity inhomogeneity levels, information about the experimental setup is also included in the table.

In general, methods achieved very high similarity indexes (> 0.83), even with high levels of inhomogeneity and noise. With 0% inhomogeneity bias, Caldairou et al. (2009) [18] implemented the RFCM approach defined in Pham et al. (2001) [61]. The RFCM method obtained Dice overlap coefficients of 0.91 and 0.93 for GM and WM respectively. The AMRF strategy from Awate et al. (2006) [8] reported a similar index for GM but a higher Dice overlap for WM (0.92 and 0.96). Those results were outperformed by Marroquin et al.


BrainWeb

Algorithm (author and year) | Jaccard index (csf / gm / wm) | Dice index (csf / gm / wm) | Observations
GMM (Ashburner et al. (2005) [5]) | - / - / - | - / 0.93 / 0.96 | 40% bias, 0% noise
AMRF (Awate et al. (2006) [8]) | - / - / - | - / 0.92 / 0.96 | 0% bias, 0% noise
HMRF (Bricq et al. (2008) [14]) | - / - / - | - / 0.975 / 0.98 | 20% bias, 0% noise
EM (Marroquin et al. (2002) [57]) | - / - / - | - / 0.96 / 0.97 | 0% bias, 1% noise
MRF (Scherrer et al. (2008) [69]) | - / - / - | 0.79 / 0.91 / 0.93 | Avg all vols
DMC-EM (Wels et al. (2011) [85]) | - / - / - | 0.77 / 0.92 / 0.94 | 20% bias, 0% noise
SOM (Hasanzadeh et al. (2007) [42]) | 0.91 / 0.90 / 0.91 | 0.95 / 0.94 / 0.95 | 20% bias, 3% noise
RFCM (Pham et al. (2001) [61]) | - / - / - | 0.92 / 0.91 / 0.93 | From [18]
AFCM (He et al. (2008) [44]) | - / - / - | 0.98 / 0.90 / 0.91 | 40% bias, 3% noise
PSOM (Yuanfeng et al. (2011) [88]) | - / - / - | - / 0.91 / 0.93 |
RFCM (Caldairou et al. (2009) [18]) | - / - / - | 0.87 / 0.83 / 0.86 | 20% bias, 9% noise

Table 3.3: Surveyed works based on Brainweb database and Dice or Jaccard indexes. The acronyms for the algorithms stand for: Adaptive Fuzzy C-means (AFCM), Adaptive Markov Random Fields (AMRF), Discriminative Model Constrained Expectation Maximization (DMC-EM), Expectation Maximization (EM), Gaussian Mixture Model (GMM), Hidden Markov Random Fields (HMRF), Markov Random Fields (MRF), Probabilistic Self Organized Map (PSOM), Robust Fuzzy C-Means (RFCM), Self Organized Map (SOM), Support Vector Machines (SVM).

(2002) [57] with a classical EM implementation (0.96 and 0.97). Bricq et al. (2008) [14] and Wels et al. (2011) [85] employed simulations with 20% inhomogeneity bias and 0% Rician noise, with dissimilar results. Bricq et al. reported higher indexes for GM and WM (0.975 and 0.98) than Wels et al. (0.92 and 0.94). Lower results were also obtained by the SOM approach of Hasanzadeh et al. (2007) [42] when increasing the Rician noise to 3% (0.94 and 0.95). The same behavior was seen in Caldairou et al. (2009) [18] with 9% Rician noise, where results were significantly lower (0.83 and 0.86). Two studies used simulations with 40% intensity inhomogeneity. The GMM method of Ashburner et al. (2005) [5] reported Dice values of 0.93 and 0.96 for GM and WM respectively, without Rician noise. The AFCM of He et al. (2008) [44] increased the Rician noise to 3%, again with lower results (0.90 and 0.91).

Since not all the works report CSF results, Dice indexes for CSF are discussed together. The best result was obtained by the AFCM approach of He et al. (2008) [44], with a Dice metric of 0.98 even for a simulation with 40% bias. Similar results were reported by Hasanzadeh et al. (2007) [42] (0.95), and significantly lower results by Caldairou et al. (2009) [18]. The worst results were found for the DMC-EM approach of Wels et al. (2011) [85], with a Dice similarity index for CSF of 0.77.


IBSR database

Table 3.4 summarizes the results obtained by the surveyed works using real T1 scans from the IBSR database. Again, CSF is not evaluated in all methods. Of the 14 surveyed works, 10 studies evaluate segmentation results using the Dice index, 8 using the Jaccard index and 4 using both measures. The IBSR database of normal subjects is composed of 20 scans with different levels of difficulty. Some of the volumes are corrupted by acquisition artifacts, as shown in section 2.2.3. Table 3.4 shows the number of scans employed in each experiment.

IBSR

Algorithm (author and year) | Jaccard index (csf / gm / wm) | Dice index (csf / gm / wm) | Observations
GMM (Ashburner et al. (2005) [5]) | - / - / - | - / 0.78 / 0.85 | n=18. From [18]
SWA (Askelrod et al. (2007) [3]) | - / - / - | 0.83 / 0.86 / 0.87 | n=18
SVM (Askelrod et al. (2006) [4]) | 0.34 / 0.68 / 0.66 | 0.51 / 0.81 / 0.80 | n=20
AMRF (Awate et al. (2006) [8]) | - / - / - | - / 0.80 / 0.88 | n=18
HMRF (Bricq et al. (2008) [14]) | - / - / - | - / 0.8 / 0.86 | n=18
MS (Jimenez et al. (2006) [49]) | 0.21 / 0.59 / 0.62 | - / - / - | n=20
DMC-EM (Wels et al. (2011) [85]) | 0.62 / 0.73 / 0.77 | 0.76 / 0.83 / 0.87 | n=18
RFCM (Pham et al. (2001) [61]) | - / - / - | - / 0.84 / 0.86 | n=18. From [18]
RF (Yi et al. (2009) [87]) | 0.61 / 0.83 / 0.73 | 0.69 / 0.90 / 0.83 | n=20
HMRF+EM (Zhang et al. (2001) [89]) | 0.03 / 0.52 / 0.49 | - / - / - | n=20. From [78]
WSOM (Demirhan et al. (2011) [28]) | - / 0.65 / 0.54 | - / 0.70 / 0.78 | n=10
GA-GMM (Tohka et al. (2007) [79]) | 0.07 / 0.63 / 0.60 | - / - / - | n=20. From [78]
GA-VEM (Tian et al. (2011) [78]) | 0.20 / 0.70 / 0.57 | - / - / - | n=20
RFCM (Caldairou et al. (2009) [18]) | - / - / - | - / 0.83 / 0.84 | n=18

Table 3.4: Surveyed works based on IBSR database and Dice or Jaccard indexes. The acronyms for the algorithms stand for: Adaptive Markov Random Fields (AMRF), Discriminative Model Constrained Expectation Maximization (DMC-EM), Gaussian Mixture Model (GMM), Genetic Algorithm (GA), Hidden Markov Random Fields (HMRF), Markov Random Fields (MRF), Mean-shift (MS), Random Forests (RF), Robust Fuzzy C-Means (RFCM), Segmentation by Weighting Aggregates (SWA), Self Organized Map (SOM), Support Vector Machines (SVM), Variational Expectation Maximization (VEM), Weighted Self Organizing Map (WSOM).

There is a certain correlation between the Jaccard and Dice indexes. Typically, Jaccard similarity values are lower than Dice values. This behavior can also be seen in the surveyed methods reporting both similarity measures [4] [85] [28] [87]. Among those reporting the Dice coefficient, Askelrod et al. (2006) [4] and Yi et al. (2009) [87] employed the IBSR20 dataset. The RF method proposed by Yi et al. (2009) [87] outperformed the SVM classifier proposed by Askelrod et al. (2006) [4], with Dice values of 0.90 and 0.83 for GM and WM respectively, compared with those obtained by SVM (0.81 and 0.80). Awate et al. (2006) [8], Ashburner et al. (2005) [5], Askelrod et al. (2007) [3], Bricq et al. (2008) [14], Pham et al. (2001) [61] and Caldairou et al. [18] employed the IBSR18 dataset. The SWA method from Askelrod et al. (2007) reported the best overall results for GM and WM (0.86 and 0.87). The MRF approaches from Awate et al. and Bricq et al. both reported close values, around (0.8 and 0.88) and (0.84 and 0.86). Similarly, the Dice coefficients of the RFCM approaches of Pham et al. and Caldairou et al. were close to each other (0.84 and 0.86). However, the Gaussian Mixture Model (GMM) approach seemed to perform worse on GM, as reported by Caldairou [18], with Dice values of 0.78 and 0.85.
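The relation between both indexes noted at the start of this discussion follows directly from Eqs. 3.5 and 3.6: for a single segmentation, j = dsc / (2 − dsc), so the Jaccard index can never exceed the Dice coefficient. The short Python sketch below converts between the two; the function names are assumptions for this example. Because the surveyed values are averages over several scans, converting an averaged Dice value only approximates the averaged Jaccard value.

```python
def dice_to_jaccard(dice):
    """Jaccard index implied by a Dice coefficient: J = D / (2 - D)."""
    return dice / (2.0 - dice)


def jaccard_to_dice(jaccard):
    """Dice coefficient implied by a Jaccard index: D = 2J / (1 + J)."""
    return 2.0 * jaccard / (1.0 + jaccard)


# Dice values listed for the SVM method in table 3.4 (csf, gm, wm).
svm_dice = {"csf": 0.51, "gm": 0.81, "wm": 0.80}
svm_jaccard = {tissue: dice_to_jaccard(d) for tissue, d in svm_dice.items()}
for tissue, j in svm_jaccard.items():
    print(f"{tissue}: Dice {svm_dice[tissue]:.2f} -> Jaccard {j:.3f}")
```

For the GM value, a Dice of 0.81 maps to a Jaccard index of about 0.68, which agrees with the Jaccard value listed for the same method.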

All the methods evaluated only with the Jaccard index employed IBSR20 scans. Tian et al. (2011) reported results for the GA-VEM approach [78] and also implemented the GA-GMM approach developed by Tohka et al. [79] and the HMRF+EM approach from Zhang et al. (2001) [89], comparing their work with both. In their comparison, GA-VEM outperformed GA-GMM and HMRF+EM when segmenting GM, with Jaccard indexes for GM and WM of (0.70 and 0.57) compared with GA-GMM (0.63 and 0.60) and HMRF+EM (0.52 and 0.49). Conversely, GA-GMM seemed to slightly outperform the other methods when segmenting WM. Furthermore, the MS method from Jimenez et al. (2006) [49] reported the best overlap results when segmenting WM (0.59 and 0.62).

Some methods returned a very low Jaccard index for CSF, such as GA-VEM [78], GA-GMM [79], HMRF+EM [89] and MS [49]. These low indexes are caused by differences between the ground truth and the expected probabilities of CSF tissue in those methods. However, the SWA method from Askelrod et al. (2007) again reported the best overall results for CSF. In general, methods that returned very high Dice indexes for GM and WM also reported a high index for CSF, as seen in Wels et al. [85] and Yi et al. [87].

Chapter 4

Proposal

The main motivation of this master thesis is two-fold: first, to perform an exhaustive comparative evaluation of existing state-of-the-art brain tissue segmentation methods using T1w data, which is the most used in tissue classification; and second, to extend the evaluation with a quantitative analysis of how MS lesions affect tissue classification. In order to generalize our findings as much as possible, methods are selected to balance state-of-the-art methods with different segmentation strategies against the most commonly used methods in neuroimaging.

Our analysis of the most recent techniques presented in chapter 3 has shown accurate results using different strategies. Methods which used a priori information to introduce spatial information into the segmentation process returned higher Dice metrics. Similarly, clustering methods with spatial regularization also reported high accuracy on both synthetic and real data.

In this chapter, we propose an evaluation pipeline based on 8 state-of-the-art tissue segmentation methods, which includes skull extraction, intensity inhomogeneity correction and tissue segmentation. Four of these methods are publicly available and include two segmentation packages widely used by the neuroimaging community: FAST¹ and SPM, in both its SPM5 and SPM8 versions². We also run an implementation of a GMM approach with heuristic optimization (GAGMM)³. This method performs tissue classification by optimizing the tissue probabilities of the GMM with a modified real-coded GA instead of the classic EM algorithm. Moreover, we complete the set of methods by implementing two fuzzy clustering approaches developed in Pham et al. [60], a KNN classifier derived from De Boer et al. [27] and an unsupervised Neural Network method based on the work of [77].

Our pipeline comprises 3 different stages, as illustrated in figure 4.1. From a T1w scan, a

¹ http://www.fmrib.ox.ac.uk/fsl/fast4/index.html
² http://www.fil.ion.ucl.ac.uk/spm/software/
³ http://www.loni.ucla.edu/Software/GAMixture


Figure 4.1: Scheme for our pipeline approach. From a T1w scan, non-brain parts of the MRI scan are stripped and brain voxels are corrected for intensity inhomogeneities. Corrected voxels are classified into one of the three tissue classes (CSF, GM and WM). Accuracy is evaluated by comparing the returned tissue classification with the provided ground truth.

preprocessing stage is carried out before tissue classification. The first step is to strip the non-brain parts of the MRI scan. After skull stripping, intensity inhomogeneity correction is applied to all scans in order to reduce image artifacts. In the second stage, intensity-corrected voxels are classified as CSF, GM or WM tissue. Finally, in the evaluation stage, the accuracy of the segmentation is measured by comparing the returned tissue classification with the provided ground truth. Every stage is described in more detail in the next sections.

4.1 Preprocessing

The preprocessing step is composed of two main processes: skull stripping and intensity inhomogeneity correction. These preprocessing steps are required to reduce misclassification errors in the later tissue estimation. Normally, the skull is stripped before intensity correction to avoid deviations in tissue intensity distributions caused by the skull. Both concepts have been introduced in chapter 2, as part of the problems inherent to MRI. Here, we focus more on the implementation aspects of the pipeline.

4.1.1 Skull stripping

Skull extraction from brain MRI images has been studied in several works comparing state-of-the-art techniques. Publicly available methods such as statistical parametric mapping (SPM2) [6], the brain extraction tool (BET) [74] or the brain surface extractor (BSE) [66] have been evaluated against new proposals. Boesen et al. [13] compared those methods with their own McStrip proposal [63] on T1w data. Hartley et al. [40] compared only the accuracy of BET and BSE on 296 PDw images of Asian subjects.

Like the FAST segmentation approach, BET [74] is part of the FSL package⁴, but both tools are independent. Furthermore, SPM5 extracts brain tissue internally, with the option to save the skull-stripped volume. However, using an internal feature of SPM5 as the skull-stripping step for all methods could benefit this algorithm to the detriment of the others. In order to be as independent as possible, BET is used as the skull-stripping preprocessing tool. In all the experiments, we run the BET2 version of the program⁵ with the default configuration, only setting the output option to obtain a mask of the brain tissue. Brain masks make it possible to discard non-tissue voxels such as background, which also speeds up the segmentation process.
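To make the skull-stripping step concrete, the sketch below builds a BET command line that keeps the default settings and only requests the binary brain mask (the `-m` option of FSL's `bet`), and then shows how such a mask is used to discard non-brain voxels. This is an illustrative Python sketch under stated assumptions, not the wrapper used in our experiments: the file names are hypothetical, the command is only constructed and never executed, and the volume is represented as a flattened list.

```python
def bet_command(input_path, output_basename):
    """Argument list for FSL's `bet` with default settings plus a brain mask.

    `-m` asks BET to also write a binary brain mask next to the
    skull-stripped volume. Paths are hypothetical placeholders.
    """
    return ["bet", input_path, output_basename, "-m"]


def apply_mask(intensities, mask):
    """Keep intensities inside the brain mask and set all other voxels to 0."""
    return [value if inside else 0 for value, inside in zip(intensities, mask)]


# The command could be passed to subprocess.run on a system with FSL installed.
print(" ".join(bet_command("subject_t1.nii.gz", "subject_t1_brain")))

# Toy flattened volume: the first and last voxels lie outside the brain.
print(apply_mask([12, 80, 95, 110, 30], [0, 1, 1, 1, 0]))
```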

4.1.2 Intensity correction

Three out of eight methods (FAST, SPM5 and SPM8) implement an internal intensity inhomogeneity correction. Omitting intensity correction as a step prior to segmentation would favor these methods over the others on noisy images. Here, we propose to correct image artifacts as a prior step and to disable the internal feature of those 3 methods. This strategy has also been used previously in a quantitative study of those 3 methods together with the KNN approach proposed by De Boer et al. [26].

Two different studies have recently reviewed different approaches to intensity correction [83] [45]. Both studies classify these methods into various groups: segmentation-based, filtering-based, surface fitting-based, histogram-based and other specific techniques. However, as pointed out by Hou [45], none of the methods has been shown to be superior to the others or exclusively applicable. Again, FAST, SPM5 and SPM8 return a corrected image as part of their segmentation output. Following the same reasoning as in the selection of the skull stripping tool, we decided to use the external bias correction tool N3 proposed by Sled et al. [73], which is part of the BIC package

⁴ http://www.fmrib.ox.ac.uk/fsl/
⁵ BET2 is the default version of the program in the FSL 4.1 package.


(a) Input scan (b) BET output (c) N3 output (d) Bias field

Figure 4.2: Proposed preprocessing stage: (a) Input scan from SALEM. (b) Skull-stripped scan returned by BET. (c) Corrected scan returned by N3. (d) Bias field returned by N3.

developed by the McConnell Brain Imaging Center of the Montreal Neurological Institute⁶.

Figure 4.2 shows a real example of the preprocessing stage using the selected tools. For an input scan (Figure 4.2(a)), non-tissue parts are removed using BET (Figure 4.2(b)) and the skull-free scan is corrected with N3 (Figure 4.2(c)). As it is very hard to appreciate the differences between the original and corrected versions, Figure 4.2(d) shows the bias field returned by N3.
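The role of the bias field in Figure 4.2(d) can be illustrated with the multiplicative model commonly assumed for intensity inhomogeneity, in which the observed intensity is the true intensity scaled by a smooth field. The Python sketch below only shows the final correction step (dividing by an already estimated field); it does not reproduce how N3 estimates that field, and the function name and toy values are assumptions.

```python
def remove_bias(observed, bias_field, eps=1e-6):
    """Divide each observed intensity by the estimated multiplicative bias.

    `eps` guards against division by a vanishing field value.
    """
    return [value / max(bias, eps) for value, bias in zip(observed, bias_field)]


# Toy example: a homogeneous tissue (true intensity 100) seen through a
# slowly varying field, so the same tissue appears with different intensities.
true_intensity = [100.0, 100.0, 100.0, 100.0]
bias_field = [0.90, 0.95, 1.05, 1.20]
observed = [t * b for t, b in zip(true_intensity, bias_field)]

print(observed)                           # intensities distorted by the field
print(remove_bias(observed, bias_field))  # approximately constant again
```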

4.2 Tissue segmentation methods

In the following section, we analyze the 8 methods which are introduced into the pipeline for brain tissue classification. FAST, SPM5, SPM8 and GAGMM are publicly available for download. In these cases, we introduce the software and briefly explain their implementation approaches. Furthermore, we have implemented two fuzzy clustering approaches, a SOM-derived neural network and a KNN classifier, which are introduced in more detail. The set of methods constitutes a representative sample of the different strategies found in the state of the art. All of them will be evaluated on synthetic and real T1w scans of healthy subjects and also on scans with different MS lesion loads.

4.2.1 Tissue classification with FAST

The FMRIB's Automated Segmentation Tool (FAST) is a segmentation program included in the FSL package and developed by the FMRIB group in Oxford. FAST is a standalone program which can be used either from a GUI or from the command line. The whole process is fully automated and can also produce a bias field-corrected input image and a probabilistic and/or partial volume tissue segmentation. Because the method is robust and reliable, it is

⁶ N3 is available as part of the MINC-TOOLS package: www.bic.mni.mcgill.ca/ServicesSoftware/HomePage

commonly used by the neuroimaging community as a quantitative tissue evaluation tool, but

also to compare results with other proposed approaches.

FAST is based on the work of Zhang et al. [89]. The authors proposed a parametric EM-based approach with Hidden Markov Random Fields (HMRF) and thresholding-based initialization as a substitute for the widely used GMM, which is more sensitive to noise. MRF theory was used to include spatial information from neighboring voxels. In an MRF, spatial information in an image is modeled through contextual constraints that characterize the mutual influences among neighboring voxels using conditional MRF distributions. Moreover, the HMRF parameters are estimated using the iterative Expectation Maximization (EM) algorithm. Bias correction was incorporated by introducing the algorithm proposed by Guillemaud et al. [39] into the EM parameter estimation.

Basically, the HMRF-EM framework for brain MRI tissue classification follows the EM approach. A first segmentation is done by image thresholding, which also provides an initial parameter estimation for the HMRF. With this parameter initialization, the iterative EM algorithm is started. The algorithm seeks a solution for three dependent unknowns: the bias field, the image classification and the model parameters. At each iteration, the bias field and class labels are estimated by MRF-MAP, and posterior probabilities are then computed taking into account the obtained bias field correction and estimated labels. Finally, the parameters of the model are updated for the next iteration. The algorithm stops when a maximum number of iterations has been performed.
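The overall loop structure described above (threshold-based initialization, alternating estimation of labels and parameters, stop after a fixed number of iterations) can be sketched with a plain one-dimensional Gaussian mixture fitted by EM. This is a deliberately simplified Python illustration and not FAST's algorithm: the MRF spatial prior and the bias field estimation are omitted, and the function names, thresholds and toy intensities are assumptions.

```python
import math


def gaussian(x, mu, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(x, thresholds, n_iter=20):
    """EM for a 1-D Gaussian mixture, initialised by intensity thresholds.

    Assumes the thresholds split the data into non-empty classes.
    """
    k = len(thresholds) + 1
    # Initial hard segmentation: class index = number of thresholds reached.
    labels = [sum(v >= t for t in thresholds) for v in x]
    mu, var, pi = [], [], []
    for c in range(k):
        members = [v for v, l in zip(x, labels) if l == c]
        m = sum(members) / len(members)
        mu.append(m)
        var.append(max(sum((v - m) ** 2 for v in members) / len(members), 1e-6))
        pi.append(len(members) / len(x))
    for _ in range(n_iter):  # stop after a maximum number of iterations
        # E-step: posterior probability of every class for every voxel.
        post = []
        for v in x:
            w = [pi[c] * gaussian(v, mu[c], var[c]) for c in range(k)]
            s = sum(w) or 1e-300
            post.append([wc / s for wc in w])
        # M-step: update weights, means and variances from the posteriors.
        for c in range(k):
            nc = sum(p[c] for p in post)
            mu[c] = sum(p[c] * v for p, v in zip(post, x)) / nc
            var[c] = max(sum(p[c] * (v - mu[c]) ** 2 for p, v in zip(post, x)) / nc, 1e-6)
            pi[c] = nc / len(x)
    labels = [max(range(k), key=lambda c: p[c]) for p in post]
    return mu, var, pi, labels


# Toy intensities for three well separated classes (e.g. CSF, GM, WM).
x = [10, 11, 12, 9, 10, 50, 51, 49, 52, 50, 90, 91, 89, 92, 90]
mu, var, pi, labels = em_gmm_1d(x, thresholds=[30, 70])
print([round(m, 2) for m in mu], labels)
```

In the HMRF-EM setting, the E-step would additionally weight each voxel's class probabilities by the labels of its neighbors, which is what makes the approach less sensitive to noise than this intensity-only mixture.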

Along with the discrete segmentation of brain tissue and the intensity-corrected volume, FAST can be set to provide outputs for each class with tissue probabilities and PVE. Figure 4.3 depicts the segmentation output for the input image (Figure 4.3(a)). The program returns three volumes with tissue classification probabilities (Figure 4.3(b)), a discrete segmentation (Figure 4.3(c)) and a PVE segmentation with added labels for CSF/GM and GM/WM tissues (Figure 4.3(d)). Although we have implemented a wrapper to call all the functionalities of FAST from the MATLAB environment, we call the program with default options in all the experiments, except that intensity uniformity correction is disabled.
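As a hedged illustration of such a call, the Python sketch below assembles a FAST command line for a T1 image with three tissue classes and with bias field removal switched off through FAST's `-N` (`--nobias`) option. It is not the MATLAB wrapper mentioned above: the function name and file names are hypothetical, `-t 1` and `-n 3` simply spell out FAST's documented defaults, and the command is only built, not executed.

```python
def fast_command(input_path, output_basename, n_classes=3, bias_correction=False):
    """Argument list for FSL's `fast` on a skull-stripped T1 image."""
    cmd = ["fast", "-t", "1", "-n", str(n_classes), "-o", output_basename]
    if not bias_correction:
        cmd.append("-N")  # --nobias: do not remove the bias field
    cmd.append(input_path)
    return cmd


# Hypothetical file names; the list could be passed to subprocess.run.
print(" ".join(fast_command("subject_t1_brain_n3.nii.gz", "subject_t1_fast")))
```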

4.2.2 Tissue classification with SPM

Statistical Parametric Mapping (SPM) is an open source software package developed in the MATLAB environment by the Wellcome Department of Imaging Neuroscience at University College London. The SPM software package has been designed for the analysis of brain imaging data sequences. The sequences can be a series of images from different cohorts, or time series from the same subject. The current version of the software is SPM8, which implements two segmentation approaches: a default segmentation method, also present in the previous version SPM5 and based


(a) Input scan (b) GM output (c) segmentation (d) PVE out

Figure 4.3: FAST segmentation output. (a) Input image. (b) Probability tissue output. (c) Discrete segmentation. (d) Segmentation with PVE.

(a) Axial view (b) Sagittal view (c) Coronal view

Figure 4.4: MNI prior tissue atlas used in SPM5 and SPM8. (a) Axial view. (b) Sagittal view. (c) Coronal view.

on the work of Ashburner and Friston [5], and an improvement of this method called New Segment, which is only available since this version. In the following, we denote the first approach as SPM5, while the New Segment version is called SPM8. SPM8 is essentially the same algorithm as that described in [5], with several improvements in registration and extended tissue maps.

The algorithm present in SPM5 is basically a probabilistic brain segmentation framework based on a GMM with prior atlas information. Brain tissue distributions are estimated by a Gaussian Mixture Model in which the initial probabilities for each tissue are obtained by registering a probabilistic atlas to the input scan. Figure 4.4 shows the prior tissue probability atlases used in SPM. The method combines image registration, tissue classification and bias correction at the same time through a derivation of the log-likelihood function.

The program permits several output configuration options for the segmented tissue: by default, segmented images are aligned with the input image, but it is also possible to align the output images with the prior atlas to produce a normalized segmentation to be used in other processes. In our experiments, the program has been run with default options, except that intensity uniformity correction was disabled.

4.2.3 Tissue classification with GAGMM

GAGMM is part of the GAMIXTURE package, publicly available from the Laboratory of Neuro Imaging at UCLA. The implementation runs in MATLAB and is based on the work of Tohka et al. (2007) [79]. The method combines the estimation of tissue distributions by finite mixture models (FMM) with heuristic optimization of the tissue distribution parameters. In genetic algorithms (GA), a population of chromosomes encodes different solutions to an optimization problem, which are improved by iterative recombination. GA chromosomes are usually represented as vectors of binary genes. However, real-coded GA use vectors of floating point numbers, which are able to adapt better to data distributions. A set of chromosomes forms a population, and at each generation chromosomes are stochastically selected and recombined by crossover and mutation to form new generations, which iteratively minimize the objective of the optimization problem.

Basically, in GAGMM tissue segmentation is carried out by estimating Gaussian distributions for the tissue classes. However, instead of using approaches like the EM algorithm to optimize the distribution parameters, a GA optimizes the distributions. Population chromosomes are here defined as vectors containing the parameters of the mixture models to be fitted to the data. The authors proposed two modifications to the GA: a blended crossover, to avoid the premature convergence associated with flat crossovers, and a reduced permutation operator with prior information, designed to reduce the parameter search space. The fact that the tissue distributions are modeled by Gaussian distributions gives the algorithm the possibility to estimate the PVE as new classes that are a posteriori reclassified as GM, WM or CSF.
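To illustrate how a blended crossover differs from a flat one on real-coded chromosomes, the Python sketch below implements the generic blend crossover (often written BLX-α), in which each child gene is drawn uniformly from the parents' interval widened by a fraction α on both sides. This is a textbook operator shown under stated assumptions: the exact operator and settings of Tohka et al. [79] may differ, and the chromosome layout (one mean and one standard deviation per gene pair) and the value of α are hypothetical.

```python
import random


def blend_crossover(parent_a, parent_b, alpha=0.5, rng=random):
    """Generic blend crossover (BLX-alpha) for real-coded chromosomes.

    With alpha = 0 this reduces to a flat crossover, where children can
    never leave the interval spanned by their parents.
    """
    child = []
    for a, b in zip(parent_a, parent_b):
        low, high = min(a, b), max(a, b)
        spread = alpha * (high - low)
        child.append(rng.uniform(low - spread, high + spread))
    return child


# Hypothetical chromosomes: [mean, standard deviation] of one tissue class.
rng = random.Random(0)  # fixed seed so the example is repeatable
print(blend_crossover([45.0, 6.0], [55.0, 8.0], alpha=0.5, rng=rng))
```

Allowing children to fall slightly outside the parents' range keeps the population from shrinking too quickly, which is the premature convergence issue mentioned above.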

GAGMM parameters can be set extensively. For the GA optimization, population size, crossover rates or chromosome selection can be tuned. Furthermore, the usual FMM parameters, such as the number of Gaussians to estimate or the number of PVE classes, are also available. These input parameters make the software highly configurable for a range of different problems, but also sensitive to changes. Figure 4.5 depicts the segmentation output for different PVE settings. In our experiments, we use the default configuration for the GA, the number of Gaussians is set to 3 and PVE to 2.


(a) Input scan (b) no PVE (c) PVE=1 (d) PVE =2

Figure 4.5: GAGMM output segmentation: (a) Input scan. (b) Output set not to deal with PVE. (c) PVE=1. (d) PVE=2.

4.2.4 Tissue classification with SOM

Self organized maps (SOM) or Kohonen networks cluster data based on an iterative process of

comparison of these data with related changes within them. The SOM organizes unknown data

into groups of similar patterns, according to a similarity criterion (e.g. Euclidean distance).

Such networks can learn to detect regularities and correlations in their input and adapt their

future responses to that input accordingly, even in the presence of certain noisy data [77]. The

map preserves topological relationships between inputs in a way that neighboring inputs in the

input space are mapped to neighboring neurons in the map space.

With this approach, a competitive learning algorithm iteratively re-adjusts a matrix of weights W. W is defined as an f × k matrix, where f is the number of features of the input vector and k is the number of expected output classes. The elements of the weighting matrix form a network representation, where each element of W works as a weight and each column as an output neuron. In practice, network training is done by computing the Euclidean distance between each input vector and the weights of each column of W and selecting the closest column. The weights of the winning column are updated for every input vector, scaled by a learning rate α, a parameter that decreases geometrically. Once all the input vectors have updated W, the learning rate is decreased and the process is repeated for all the input vectors again (an epoch). The decreasing learning rate reduces the capability of each epoch to modify the weighting matrix. The algorithm stops when W has not changed significantly between epochs or a maximum number of epochs has been reached.

Brain tissue classification with SOM can be achieved by training a weighting matrix W and then multiplying the input vectors of the input image by W. Let us say that three classes have to be found (k = 3), and that the feature space of the voxels is represented by input vectors v_i = [v_1, . . . , v_f] with f feature elements. According to the above, W has a size of f × k. Hence, the classification of a voxel i is given by the vector C_i = W^T · v_i, where the classification result is the element of C_i with the highest value.

SOM have been used in several works on brain tissue classification, such as Yuanfeng et al. [88] and Tian et al. [77] [78] [62]. Tian et al. [?] used three MRI modalities (T1, T2 and PD) as input vector for normal brain tissue classification (CSF, GM and WM). However, we restrict our study to T1w images. Hence, we base our SOM approach on a one-dimensional T1 feature space, self-trained with the subject itself. Given that our feature space is reduced to f = 1 and three main tissue classes have to be classified, the learned weighting matrix in our experiments reduces to the cluster centers of each brain tissue. After learning the final clusters, labels are assigned to voxels by computing the minimum absolute distance between cluster centers and image voxels.
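The one-dimensional case described above can be sketched as a winner-take-all competitive learning loop over voxel intensities. The Python code below is a minimal illustration and not our implementation: it assumes the standard Kohonen-style update, in which the winning center moves toward the input by a fraction α of their difference, uses no neighborhood function, and relies on hypothetical initial centers, learning rate, decay factor and toy intensities.

```python
def train_som_1d(intensities, init_centers, alpha=0.5, decay=0.8,
                 max_epochs=50, tol=1e-4):
    """Winner-take-all competitive learning on scalar intensities."""
    centers = list(init_centers)
    for _ in range(max_epochs):
        previous = list(centers)
        for x in intensities:
            # Winner: the center closest to the current intensity.
            winner = min(range(len(centers)), key=lambda c: abs(x - centers[c]))
            # Move the winner toward the input by a fraction alpha.
            centers[winner] += alpha * (x - centers[winner])
        alpha *= decay  # geometric decrease of the learning rate per epoch
        if max(abs(c - p) for c, p in zip(centers, previous)) < tol:
            break  # centers no longer change significantly
    return centers


def classify(intensities, centers):
    """Label each voxel with the index of the nearest cluster center."""
    return [min(range(len(centers)), key=lambda c: abs(x - centers[c]))
            for x in intensities]


# Toy T1 intensities for three tissues and hypothetical initial centers.
voxels = [10, 12, 11, 50, 52, 51, 90, 92, 91]
centers = train_som_1d(voxels, init_centers=[0.0, 50.0, 100.0])
print(centers, classify(voxels, centers))
```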

4.2.5 Tissue classification with FCM

Fuzzy C-Means (FCM) clustering techniques integrate the fuzzy sets approach into the classical k-means clustering algorithm. In their application to brain tissue classification, a set of voxels with given intensity values has to be segmented into three different classes (CSF, GM and WM). In contrast to k-means, voxels do not necessarily belong to a single class, but can “partially” belong to several of them. In the basic k-means clustering approach, the segmentation process for tissue classification is defined as the minimisation of the energy function J [56]:

\[ J_{KM} = \sum_{k=1}^{3} \sum_{x_j \in S_k} \lVert x_j - \mu_k \rVert^2 \tag{4.1} \]

where xj denotes the intensity of voxel j, Sk is the set of voxels assigned to tissue class k, and µk is the centroid of tissue class k, with k ∈ {1, 2, 3}. Classification is achieved by a refinement algorithm [70] which produces an optimal partition into 3 classes by iteratively minimizing the squared error objective function within cluster elements. The algorithm returns the center µk of each cluster.

Fuzzy C-means extends classical k-means with a membership function Ujk of a voxel j for a given tissue class k, controlled by a fuzzy parameter q which sets the fuzziness of the system. Hence, voxel classification with fuzzy clustering is defined by modifying the minimization function JKM (Equation 4.1) as:

\[ J_{FCM} = \sum_{k=1}^{3} \sum_{x_j \in S_k} U_{jk}^{q} \lVert x_j - \mu_k \rVert^2 \tag{4.2} \]

In contrast to k-means, the algorithm also returns Ujk, which provides the probability that each voxel belongs to each tissue class. This makes the FCM approach interesting for brain tissue classification because it is intrinsically able to deal efficiently with PVE artifacts, since voxels

Figure 4.6: Membership classification output in FCM. The algorithm returns the probability of each voxel belonging to one of the three tissues. (a) Synthetic scan. (b) Output for CSF. (c) Output for GM. (d) Output for WM.

presenting PVE are simply allowed by FCM to belong to several classes. Figure 4.6 shows the typical output of the algorithm, where the probability of each voxel belonging to each of the three tissues is returned. Given the probability for each class, defuzzification is done by labeling each voxel with the class of maximum probability among the output segmentations of each tissue.
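The alternating membership and centroid updates, followed by defuzzification, can be sketched as follows. This is an illustrative sketch using the standard FCM update rules on a one-dimensional intensity list, not the code used in the experiments. The function names, toy data and initial centers are assumptions, and the sums run over all voxels rather than over a fixed partition Sk.

```python
def memberships(x, centers, q):
    """Standard FCM membership update: U_jk is proportional to d_jk^(-2/(q-1))."""
    u = []
    for xj in x:
        # Small constant avoids division by zero when a voxel sits on a center.
        inv = [((xj - c) ** 2 + 1e-12) ** (-1.0 / (q - 1.0)) for c in centers]
        total = sum(inv)
        u.append([v / total for v in inv])  # each row sums to 1
    return u


def fcm_1d(x, centers, q=2.0, max_iter=100, tol=1e-6):
    """Alternate membership and centroid updates until the centers stabilise."""
    centers = list(centers)
    for _ in range(max_iter):
        u = memberships(x, centers, q)
        new_centers = []
        for k in range(len(centers)):
            w = [row[k] ** q for row in u]
            new_centers.append(sum(wj * xj for wj, xj in zip(w, x)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return memberships(x, centers, q), centers


def defuzzify(u):
    """Label each voxel with the class of maximum membership."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in u]


# Nine clear voxels plus one partial-volume-like voxel (31) between two tissues.
fcm_voxels = [10, 12, 11, 50, 52, 51, 90, 92, 91, 31]
fcm_u, fcm_centers = fcm_1d(fcm_voxels, centers=[0.0, 45.0, 100.0])
fcm_labels = defuzzify(fcm_u)
```

In this toy example the last voxel receives comparable memberships for the first two classes, which is the behavior that makes FCM attractive for voxels affected by PVE.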

Our implementation is based on the work of Pham et al. [61]. The authors describe the basic steps of the FCM algorithm as a preliminary step to their own proposal, which incorporates spatial regularization to deal with artifacts. We describe this modification in more detail in subsection 4.2.6.

4.2.6 Tissue classification with RFCM

Some authors have proposed modifications to FCM in order to add spatial information [61] [18] [44] [53]. The FCM objective function (Equation 4.2) does not take into consideration any spatial dependence between observations. Hence, membership functions can be distorted by noisy voxels in the observed image. Robust Fuzzy C-Means (RFCM) modifies the FCM objective function by including a penalty term based on the membership of neighboring voxels in other classes. This penalty term makes the algorithm more robust to noise. The RFCM equations are only slightly different from the objective function in Equation 4.2. The new objective function is defined as:

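\[ J_{RFCM} = \sum_{k=1}^{3} \sum_{x_j \in S_k} U_{jk}^{q} \lVert x_j - \mu_k \rVert^2 + \frac{\beta}{2} \sum_{k=1}^{3} \sum_{x_j \in S_k} U_{jk}^{q} \sum_{l \in N_j} \sum_{m \in M_k} U_{lm}^{q} \tag{4.3} \]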

where Nj is the set of neighbors of a given voxel j, and Mk is the set of classes different from the currently evaluated class k. Parameter β weights the amount of regularization included by the algorithm. In essence, the penalty term reduces the membership of a voxel in a class when that class disagrees with the classes of its neighbors.

Our RFCM implementation is based on the equations for the membership Ujk and cluster center µk functions proposed by Pham et al. [61]. We introduced these modifications into our previous FCM approach to implement this new version. The authors propose different choices for the number of neighbors Nj and the regularization parameter β. Moreover, they developed a method based on cross-correlation to set the parameter β optimally. However, this method is excessively time consuming, so in our experiments β and Nj were set experimentally to 5 and 7 respectively for all tests, after several runs on different datasets.
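To make the role of the penalty term concrete, the sketch below evaluates only the second term of the RFCM objective function for a given set of memberships. It is an illustration under stated assumptions: the function name, the one-dimensional chain of voxels and its two-neighbor system are hypothetical, and the outer sum runs over all voxels. The value β = 5 is the one reported above.

```python
def rfcm_penalty(u, neighbors, beta, q=2.0):
    """Penalty term of the RFCM objective: (beta / 2) * sum_j sum_k U_jk^q * sum_{l in N_j} sum_{m != k} U_lm^q."""
    n_classes = len(u[0])
    total = 0.0
    for j, row in enumerate(u):
        for k in range(n_classes):
            # Membership of the neighbors of voxel j in every class other than k.
            other = sum(u[l][m] ** q
                        for l in neighbors[j]
                        for m in range(n_classes) if m != k)
            total += (row[k] ** q) * other
    return 0.5 * beta * total


# Hypothetical chain of 5 voxels; each voxel neighbors the previous and next one.
chain_neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]

# All voxels fully in class 0: neighbors agree, so there is no penalty.
smooth = [[1.0, 0.0, 0.0] for _ in range(5)]

# Same labeling with the middle voxel flipped to class 1, as noise would do.
noisy = [row[:] for row in smooth]
noisy[2] = [0.0, 1.0, 0.0]

penalty_smooth = rfcm_penalty(smooth, chain_neighbors, beta=5.0)
penalty_noisy = rfcm_penalty(noisy, chain_neighbors, beta=5.0)
```

The isolated discordant voxel raises the penalty, so minimizing the full objective pushes its membership towards the class of its neighbors.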

4.2.7 Tissue classification with KNN

Based on the approach of De Boer et al. [27], CSF, GM and WM are segmented using an

automatically trained kNN classifier which is an extension of the work by Cocosco et al. [23].

Training samples for the kNN classifier are obtained from the subject itself by probabilistic

atlas-based registration as proposed in [27].

Initially, all the voxels are considered for training the classifier. By registering the tissue prior probability atlases to the input image, a label can be assigned to each voxel from the maximum probability obtained from the tissue atlases. However, a threshold parameter is introduced so that only voxels with a probability greater than the threshold are added to the training set. The value of this parameter determines how many training voxels of a specific brain region are considered as a certain tissue in the training stage. In De Boer et al. [27], PDw and T1w intensities are included in the feature vector, and sampling and pruning steps are applied to the dataset to remove incorrect samples from both modalities. Here, we are limited to T1w images, and hence the feature vector is based only on the intensities of a single modality. Given this limitation, we decided to include in the training dataset all the voxel intensities that passed the threshold.

Although our implementation is one-dimensional, including all the voxels with high probability considerably increases the classification time, given that for each voxel the distance to the rest of the voxels has to be computed. Similar to the work of De Boer, we based our implementation on an Approximate Nearest Neighbor (ANN) searching approach, which preprocesses d-dimensional data points into a data structure that efficiently reports the k closest elements of the training dataset (see footnote 7).

7 Our implementation uses the ANN library, which can be downloaded from http://www.cs.umd.edu/mount/ANN/

The kNN classifier based on ANN performs the final classification using thresholded training samples from the same image. It was shown in [75] [29] that if the number of neighbors k satisfies both Equations 4.4 and 4.5, the kNN classifier achieves the minimum possible error probability of a generic classifier (the one achieved when the classifier knows the true data distributions).

\[ \lim_{n \to \infty} k = \infty \tag{4.4} \]

\[ \lim_{n \to \infty} \frac{k}{n} = 0 \tag{4.5} \]

Although different values of k could have been selected, the works of De Boer [27] and Cocosco [23] both fixed k to 45 neighbors. In our experiments, we decided to use the same number of neighbors.
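The self-training scheme can be sketched in a few lines. This is a simplified illustration, not the implementation used in the experiments: it uses an exact brute-force neighbor search instead of the ANN library, the function names and toy data are hypothetical, and k = 3 is used only because the toy training set is tiny (the experiments use k = 45). The 0.7 threshold matches the minimum atlas probability reported in the discussion of the synthetic results.

```python
from collections import Counter


def select_training(intensities, atlas_probs, threshold=0.7):
    """Keep voxels whose most probable atlas tissue exceeds the threshold."""
    samples = []
    for x, probs in zip(intensities, atlas_probs):
        label = max(range(len(probs)), key=lambda c: probs[c])
        if probs[label] > threshold:
            samples.append((x, label))
    return samples


def knn_classify(x, samples, k):
    """Majority vote among the k training samples closest in intensity."""
    nearest = sorted(samples, key=lambda s: abs(s[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy intensities with registered atlas probabilities for (CSF, GM, WM).
knn_intensities = [10, 12, 50, 52, 90, 92, 30]
knn_atlas = [
    [0.90, 0.05, 0.05], [0.80, 0.10, 0.10],
    [0.10, 0.80, 0.10], [0.05, 0.90, 0.05],
    [0.00, 0.10, 0.90], [0.10, 0.10, 0.80],
    [0.50, 0.40, 0.10],  # ambiguous voxel: rejected by the threshold
]
training = select_training(knn_intensities, knn_atlas)
knn_labels = [knn_classify(x, training, k=3) for x in (11, 49, 88)]
```

Raising the threshold yields a cleaner but smaller training set, which is the trade-off controlled by this parameter.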

4.3 Tissue classification in the presence of lesions

The proposed pipeline was designed to evaluate the methods by comparing the obtained results with a ground truth. However, we are also interested in analyzing how well brain tissue is classified in the presence of MS lesions. WM atrophy is usually measured by experts by computing the volume of WM tissue or the BPF. WM lesions that are not known a priori can introduce errors in the volume measurement if they are misclassified as GM or CSF by the segmentation algorithms. We propose a modification of the basic pipeline to include lesion data, as shown in Figure 4.7. The segmentation framework is modified as follows:

• The MS lesion masks provided with the SALEM subjects are used to disable the lesion regions in the scans before segmenting them. The segmentation output is then modified by adding the lesion regions as WM. This masking process follows the same methodology used by radiologists and neurologists in hospitals.

• At the same time, the scans with lesions are also segmented as they are, without masking the lesion regions.

The comparison between both sets is used to assess the efficiency, as explained in the following section within the experiments carried out on real patients from the SALEM project.
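The two branches of the modified pipeline can be sketched as follows. This is a schematic illustration only: `threshold_segmenter` is a hypothetical stand-in for any of the evaluated methods, and its thresholds and the toy intensities are arbitrary assumptions.

```python
CSF, GM, WM = 0, 1, 2


def segment_with_lesion_mask(intensities, lesion_mask, segment):
    """Segment only non-lesion voxels, then add the lesion voxels back as WM."""
    healthy = [x for x, is_lesion in zip(intensities, lesion_mask) if not is_lesion]
    healthy_labels = iter(segment(healthy))
    return [WM if is_lesion else next(healthy_labels) for is_lesion in lesion_mask]


def threshold_segmenter(intensities):
    """Hypothetical stand-in classifier with arbitrary intensity thresholds."""
    return [CSF if x < 40 else GM if x < 80 else WM for x in intensities]


# Toy scan: the fourth voxel is a WM lesion whose intensity resembles GM.
scan = [20, 60, 100, 65, 100]
mask = [False, False, False, True, False]

masked_labels = segment_with_lesion_mask(scan, mask, threshold_segmenter)
unmasked_labels = threshold_segmenter(scan)
```

Comparing `masked_labels` with `unmasked_labels` shows the lesion voxel counted as WM in the first branch and as GM in the second, which is the kind of discrepancy the experiment is designed to quantify.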

4.4 Evaluation metrics

The third stage of the pipeline is the evaluation of the segmentation results obtained by the

methods. Given the proposed basic pipeline and the modification introduced in the previous


Figure 4.7: Scheme of our modified pipeline. From a T1w scan, non-brain parts of the MRI scan are stripped and brain voxels are corrected for intensity inhomogeneities. MS lesions are masked before classifying voxels into one of the three tissue classes (CSF, GM and WM). Lesion voxels are then added as WM. At the same time, scans with lesions are segmented without masking the lesion regions. Evaluation is carried out by comparing the FBT of both segmentations.


section to evaluate the efficiency of the methods in the presence of MS lesions, we assess the 8

analyzed approaches using the following metrics:

• Dice, sensitivity and specificity metrics between the segmentation results and the database ground truths on real and synthetic healthy T1w scans from the Brainweb and IBSR20 datasets.

• Differences in Fractional Brain Tissue (FBT) on real T1w scans of healthy subjects from the SALEM database, with a baseline and a 12-month follow-up scan for each subject.

• Differences in FBT and Brain Parenchymal Fraction (BPF) on real T1w scans of subjects with different levels of MS disease from the SALEM database, with a baseline and a 12-month follow-up scan. The efficiency of the methods in the presence of lesions is assessed by comparing the obtained results with the FBT and BPF values of the same subjects when the WM lesions are masked before segmentation and added afterwards as WM.
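The metrics listed above can be sketched for flat label lists as follows. The definitions used here are the usual ones and are stated as assumptions: Dice = 2TP / (2TP + FP + FN), sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), FBT as the fraction of brain voxels assigned to each tissue, and BPF as the GM plus WM fraction. The function names and toy labels are hypothetical, and the label lists are assumed to contain brain voxels only.

```python
CSF, GM, WM = 0, 1, 2


def overlap_metrics(pred, truth, tissue):
    """Dice, sensitivity and specificity of one tissue class."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == tissue and t == tissue)
    fp = sum(1 for p, t in pairs if p == tissue and t != tissue)
    fn = sum(1 for p, t in pairs if p != tissue and t == tissue)
    tn = len(pairs) - tp - fp - fn
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return dice, sensitivity, specificity


def fractional_brain_tissue(labels):
    """Fraction of brain voxels assigned to each tissue (assumed FBT definition)."""
    n = len(labels)
    return {t: sum(1 for l in labels if l == t) / n for t in (CSF, GM, WM)}


def brain_parenchymal_fraction(labels):
    """GM plus WM fraction of the brain voxels (assumed BPF definition)."""
    fbt = fractional_brain_tissue(labels)
    return fbt[GM] + fbt[WM]


# Toy segmentation and ground truth over six brain voxels.
pred = [CSF, GM, GM, WM, WM, WM]
truth = [CSF, GM, WM, WM, WM, GM]
wm_dice, wm_sens, wm_spec = overlap_metrics(pred, truth, WM)
fbt = fractional_brain_tissue(pred)
bpf = brain_parenchymal_fraction(pred)
```

Differences in FBT between a baseline and a follow-up scan are then obtained by subtracting the two dictionaries tissue by tissue.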

Chapter 5

Results

This chapter presents the segmentation results obtained by the 8 evaluated automatic brain tissue segmentation methods. First, a complete description of the employed datasets is given. Afterwards, a quantitative evaluation of all methods is carried out by computing the following metrics: sensitivity, specificity and Dice coefficient on synthetic and real datasets of healthy subjects. Finally, the effects of MS lesions on segmentation are presented as differences in fractional brain tissue (FBT) and Brain Parenchymal Fraction (BPF) between scans of the same MS patient with WM lesions.

5.1 Datasets

The 8 analyzed segmentation methods are evaluated using synthetic and real data. As seen in chapter 3, Brainweb [22] and the Internet Brain Segmentation Repository (IBSR) have become standard datasets for inter-study comparison. In this study, the Brainweb and IBSR datasets are also employed to evaluate the methods on synthetic and real data respectively. Moreover, one of the objectives of the study is to evaluate the effect of MS lesions on the segmentation methods. Lesion data are provided by relevant hospitals and medical expert teams in the field of multiple sclerosis from the SALEM project. MS patients selected within the project, with different lesion loads, are used to evaluate the effect of lesions on brain tissue segmentation. Each patient has an initial scan and a follow-up (after either 6 months or one year) acquired with the 4 main conventional MRI techniques (PD-w, T1-w, T2-w and FLAIR images). The number of subjects and the characteristics of each dataset are the following:

• 9 simulated scans from Brainweb with ground truth. The modality is fixed to T1w, since some of the evaluated methods are mono-spectral. Slice thickness is also fixed to 1 mm in order to benefit from the discrete ground truth. Then, 9 different configurations are generated by combining 3 out of 5 noise options (0, 3% and 7%) with all intensity inhomogeneity levels (0, 20% and 40%).

• 20 real T1w scans from IBSR20 with ground truth. The IBSR20 dataset is chosen over IBSR18 because its MRI scans have different levels of difficulty and inhomogeneity. Furthermore, the ground truths provided in IBSR20, based only on WM, GM and CSF tissues, avoid manual grouping of the 43 brain structures provided by the IBSR18 segmentations.

• 19 T1w scans from different hospitals of the SALEM database, each with a basal scan and its follow-up (after 12 months). This set comprises both healthy subjects (9 subjects) and MS patients with lesions (10 subjects) with different lesion loads from two hospitals.
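As a small illustration of how the nine Brainweb configurations in the first item arise, the sketch below enumerates every combination of the selected noise and inhomogeneity levels. The variable names are hypothetical; the levels are those listed above.

```python
from itertools import product

noise_levels = [0, 3, 7]            # noise, in percent
inhomogeneity_levels = [0, 20, 40]  # intensity inhomogeneity, in percent

# Every (noise, inhomogeneity) pair corresponds to one simulated T1w scan.
configurations = list(product(noise_levels, inhomogeneity_levels))
```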

5.2 Synthetic data results

The eight segmentation methods have been run on synthetic data and the results are evaluated against the ground truth provided by BrainWeb. Since different levels of noise and intensity inhomogeneity have been used, this database allows us to measure the sensitivity and specificity of each method in the presence of different levels of noise and intensity inhomogeneity. The noise percentage represents the percent ratio of the standard deviation of the white Gaussian noise to the signal. Higher sensitivity and specificity indicate that the segmentation algorithm correctly identified more voxels of the tissue of interest and was also better at rejecting voxels that do not belong to that tissue class. Figure 5.1 depicts the average Dice coefficient for each method computed from all BrainWeb scans. The complete table of Dice, sensitivity and specificity measures for all 9 simulated scans can be found in table A.2 of the appendix.

GAGMM, RFCM, FCM and SOM seem to outperform the other methods, with very similar Dice values on CSF tissue (0.77 ± 0.05). Both SPM approaches report similar Dice values (0.73 ± 0.02), while FAST reports lower similarity (0.71 ± 0.03). KNN underperforms (0.22 ± 0.09) compared to the other methods. The best sensitivity (0.73 ± 0.03) and specificity (0.88 ± 0.03) are found for FAST and GAGMM respectively.

On GM tissue, 6 methods reported Dice values close to 0.93 ± 0.04. SPM5 and FAST returned lower similarity measures (0.89 and 0.87 ± 0.02 respectively), while KNN was again the worst method (0.83 ± 0.03). The best sensitivity (0.94 ± 0.05) was obtained by GAGMM and the best specificity by SOM (0.95 ± 0.03). On WM tissue, GAGMM and SOM also outperformed the other methods with a Dice value of 0.95 ± 0.04. KNN, RFCM and FCM returned similar values (0.94 ± 0.04), while FAST and SPM5 both reported the lowest values, with a Dice coefficient of 0.91 ± 0.04. The best sensitivity and specificity were found for RFCM (0.97 ± 0.03) and SPM5

Figure 5.1: Dice metric boxplots computed for all methods on all Brainweb scans. (a) CSF tissue, (b) GM tissue, (c) WM tissue. The red line depicts the median value; the green cross depicts the mean value.


(0.96 ± 0.05) respectively. All methods reported very low standard deviations, with values below 0.09 in the worst case.

Furthermore, the effect of noise and bias artifacts was evaluated on all 9 scans. The results show that the segmentation methods reported similar values on this dataset independently of the amount of intensity inhomogeneity. The bias correction introduced in the pipeline seems to eliminate the simulated intensity inhomogeneities (see footnote 1). However, Rician noise clearly seems to affect the accuracy of the methods. For CSF tissue, all methods seem to follow the same trend, with accuracy decreasing as noise increases. On GM and WM tissue, SPM8 seems to be affected only by higher amounts of noise. In contrast, SPM5 and FAST increase their accuracy with 3% noise but then report results similar to the other methods for 7%. The other strategies seem to follow the same trend of decreasing accuracy with increasing noise. The complete table of results can be found in table A.3 of the appendix.

5.2.1 Discussion

Reviewed works usually reported results based on a single level of noise and intensity inhomogeneity. Here, statistical measures are averaged over the whole generated subset, which comprises different levels of noise and inhomogeneity. Additionally, average Dice metrics for different levels of noise and bias have been included in our study. The results obtained for FAST and SPM5 are similar to those reported in previous studies [80] [46]. Tsang et al. (2008) [80] also showed a similar trend in the results for different levels of noise, where scans with 3% Rician noise performed better than those without noise. According to the authors, the reason is that imaging noise is an inevitable part of image acquisition, and the methods deal with it intrinsically. Ashburner et al. (2005) [5] reported slightly higher results for SPM5 in GM and WM tissues. This difference is explained by the intensity inhomogeneity correction applied in our experiments. In order to compare the different methods as fairly as possible, N3 was employed to correct the intensity bias, and the internal corrections of the methods were disabled. Results very close to those obtained by Ashburner were obtained by repeating the segmentation process in SPM5 with default options on an original simulated image. The external intensity correction can also explain the lower values obtained in our experiment for CSF tissue on all methods. Although Ashburner did not provide results for CSF, the results obtained for CSF with default options follow those reported by Huang [46] and Tsang [80]. Repeating the FAST segmentations with default options confirmed our hypothesis, again giving results close to other published works for all tissues. In general, GMM approaches such as SPM8, SPM5 and GAGMM performed better than the hidden MRF method implemented in FAST. From the results obtained,

1 Table A.1 shows Dice metrics for each method at different levels of intensity inhomogeneity. This table has not been included here for simplicity.

Figure 5.2: Segmentation results for various methods on Brainweb scans. (a) Brainweb ground truth, (b) SPM5 output, (c) KNN output, (d) MNI probability atlas for CSF tissue. Tissue labels are CSF (red), GM (green) and WM (red). The KNN classifier, self-trained on the same subject with atlas registration, misclassifies CSF tissue as GM at the brain tissue borders because the MNI atlas returns low probabilities for CSF at the borders.

and comparing them with other GMM approaches, GA optimization of the GMM model was more accurate than atlas-based methods such as SPM5 and SPM8. Clustering methods such as RFCM, FCM and SOM reported similar values to each other for all tissues, even with higher levels of noise. This can again be explained by the segmentation pipeline used. Since all the scans are preprocessed for intensity bias correction, the RFCM regularization has very little effect and the model is practically equal to FCM. The results obtained for GM and WM with the SOM method are in accordance with the values obtained by Hasanzadeh et al. (2007) [42] using a similar approach. Finally, KNN clearly underperforms on CSF tissue. Figure 5.2 depicts segmentation outputs for the KNN classifier (Figure 5.2(c)) and SPM5 (Figure 5.2(b)) for comparison. The KNN training dataset is based on the registration of the MNI atlas prior (Figure 5.2(d)) to the input scan, where a MAP criterion with a minimum probability > 0.7 is applied to add new elements to the training set. The ground truth provided by Brainweb (Figure 5.2(a)) considers the borders of the brain as CSF, while the MNI atlas reports probabilities < 0.6 there. Therefore, KNN classified border tissue as GM, which explains why GM sensitivity and specificity values are lower as well. Given the low percentage of CSF tissue in the brain, relatively small changes in the CSF segmentation considerably decrease the reported similarity. Previous tests that lowered the MAP minimum threshold to increase CSF sensitivity resulted in increased misclassification between GM and WM.


5.3 Real data results

As a second experiment, the segmentation methods are run on real T1w data from the IBSR20 dataset introduced in section 3.2.2. Labelled volumes, produced by trained experts using a semi-automated intensity contour mapping algorithm, are also provided for evaluation. The results obtained with this dataset allow us to evaluate the segmentation methods on real data with different levels of intensity inhomogeneity and real acquisition artifacts. The IBSR scans have been sorted in decreasing order of difficulty. Average Dice coefficients by method on all 20 scans are depicted in figure 5.3. The complete average measures for Dice, sensitivity and specificity can be found in table A.4 of the appendix.

All methods reported very low values for CSF tissue. As discussed in the previous section, the amount of CSF tissue in the brain is small compared to GM and WM, so small differences in the segmentation produce considerable changes in the reported results. However, KNN performed slightly better (0.41 ± 0.01) than the other methods, which reported Dice values lower than 0.25 for CSF tissue. Again, KNN clearly outperformed the other strategies on GM tissue, with a Dice value of 0.86 ± 0.05. GMM methods such as GAGMM, SPM5 and SPM8 reported similar results (0.76 ± 0.06). RFCM returned lower similarity than the GMM approaches (0.76 ± 0.06), while the other methods performed with values lower than 0.70. On WM, both SPM methods, FAST and KNN reported Dice values close to 0.80. FCM and SOM returned similar, lower Dice values (0.77), while RFCM underperformed with respect to the other approaches (0.72 ± 0.14).

The boxplots in figure 5.3 show larger distances between the first and third quartiles for all methods than those found with synthetic data. For CSF tissue, the largest differences were found for the KNN classifier and the smallest for FCM and FAST. In general, all the methods showed high variability in CSF classification compared with their response to the other tissues. On GM, the largest variability was found for the GAGMM approach and the lowest for SPM5 and SPM8. On WM, the highest variability was found for RFCM, followed by the other methods, which returned similar, lower variability, and again by the SPM approaches, with very low deviation values.

Variability between scans for the same segmentation method was also analyzed. Figure 5.4 shows the Dice metrics returned by all methods, evaluated for each scan. Scans are sorted by decreasing difficulty. For GM and WM tissue, all methods showed an ascending accuracy trend, which corresponds to the ordering of the scans from most to least difficult. In GM, the differences in accuracy between methods on the same scan were higher than in the WM plot. Despite the ascending trend in accuracy, these differences when classifying GM on the same subject remained stable along the 20 scans for all methods. Conversely, the WM tissue plot showed a larger discrepancy between methods on difficult scans and closer Dice metrics between methods as the scans became easier to classify. Finally, CSF tissue followed a similar trend for all methods and scans, with

Figure 5.3: Dice metric boxplots computed for all methods on all IBSR20 scans. (a) CSF tissue, (b) GM tissue, (c) WM tissue. The red line depicts the median value; the green cross depicts the mean value.


larger differences for KNN on the scans of intermediate difficulty. From the plots, unexpected values can be observed for the SPM5 method on volume 1-24 in all tissues, and for RFCM on volume 16-3 in GM.

5.3.1 Discussion

Other works in which one or more of these methods are evaluated on IBSR20 report similar results for all tissues. Tian et al. (2011) [78] analyzed FSL, SPM8 and GAGMM on the same database but with different bias preprocessing. Their reported Dice values for all tissues agree with our findings, with small differences probably due to changes in the segmentation pipeline. Caldairou et al. [18] analyzed different fuzzy methods, including both FCM and RFCM. Their reported Dice metrics were higher than the results obtained in our experiments. However, those experiments were done on IBSR18, and the results cannot be directly compared. The results provided for SPM5 and FAST by Tsang et al. (2008) [80] for GM and WM were also similar to those obtained in our experiments.

The low values obtained for CSF tissue by all methods are directly related to the ground truth of the database. The labelled scans provided by IBSR classify the border tissue of the brain as GM, while 7 out of 8 segmentation methods tend to classify these voxels as CSF, which considerably decreases the Dice coefficient for CSF tissue. The higher results obtained for KNN are explained by the fact that the MNI probabilistic atlases assign a low probability to CSF at the brain borders, forcing the algorithm to classify those voxels as GM. However, the reported sensitivity provides a good estimator of the capability of each method to classify tissue as CSF. Indeed, 7 out of 8 methods (KNN underperformed for sensitivity) reported high sensitivity values (> 0.79), which indicates that the methods are actually classifying correctly the CSF tissue located in the lateral ventricles. In contrast, the KNN sensitivity for detecting CSF tissue in the same brain region is lower than that of the other methods. However, the classifier tends to detect non-tissue classes more accurately, as shown by the higher specificity obtained. Brain tissue at the borders of the brain was usually misclassified as GM. Hence, the Dice coefficients obtained for CSF affect the accuracy of each method for GM tissue classification. This phenomenon explains the high accuracy obtained by KNN for GM tissue classification. Along the same lines, GMM approaches obtain better classification rates for GM than clustering strategies or FAST. These results are related to those obtained for CSF, where GMM approaches reported better classification rates as well. Similarly, FAST, SOM and FCM, which obtained lower values for CSF, report the lowest Dice coefficients for GM classification. Moreover, on difficult

scans, GM misclassification is also related to volume artifacts. Figure 5.5(a) shows the most difficult volume provided in the IBSR20 database and Figure 5.5(b) the expected ground truth. Figures 5.5(c) and (d) show the FAST and KNN tissue classification results obtained in our experiment. The input scan is heavily affected by acquisition artifacts. 7 out of 8 methods misclassify those artifacts mostly as WM, which affects the GM accuracy. In contrast, KNN and SPM5 were not altered by those artifacts, because they used prior information from probabilistic atlases, which modeled the bias as brain tissues. However, SPM8 performed significantly worse than SPM5 on scan 5_8: the algorithm did not consider those voxels and classified them as background. Clustering methods performed very similarly on GM classification. The spatial regularization introduced in RFCM outperformed basic clustering approaches like SOM and FCM in scans with a moderate amount of artifacts.

Figure 5.4: Dice metric plots evaluated for each scan and method. (a) CSF tissue, (b) GM tissue, (c) WM tissue. Methods are labeled by color: FAST (red), SPM5 (green), SPM8 (blue), GAGMM (pink), SOM (yellow), RFCM (cyan), FCM (black), KNN (orange).

Figure 5.5: Segmentation results for various methods on IBSR scan 5_8. (a) Scan as provided in IBSR, (b) ground truth, (c) FAST output, (d) KNN output. Tissue labels are CSF (red), GM (green) and WM (red). Scan 5_8 contains severe real artifacts.

The WM tissue plot in Figure 5.4(c) revealed larger discrepancies in Dice metrics between methods on difficult scans. For scans of medium and low difficulty, the methods agreed more closely in predicting WM tissue as the scans became easier to classify. As already mentioned, those discrepancies were directly related to image artifacts, because methods tended to misclassify them as WM. Again, the best results were obtained by methods with spatial probabilistic priors (KNN, SPM5 and SPM8). Although GAGMM returned accurate results for images of low and medium difficulty, its reported average Dice metrics were lower than those of the other GMM approaches because the GA initialization and optimization were penalized in the presence of large artifacts. Regularization in fuzzy clustering did not improve the results of simple clustering methods, especially in images with a high amount of bias. This could be caused by a wrong penalization introduced in the fuzzy membership computations. Because the artifacts are long stripes along the posterior part of the brain, the neighboring windows collect information from voxels that are also corrupted. This could be avoided by considerably increasing the window size, at the cost of excessive computation time.

Analyzing the results more globally, KNN with a self-trained dataset based on prior atlases provided the best results on all tissues. This technique, although simple, was able to considerably reduce the effect of image artifacts in the scans. Furthermore, GMM approaches modeled more

Table 5.1: FBT evaluation on 9 healthy subjects of the SALEM dataset for all methods. Reported values are mean ± standard deviation. Diff refers to the absolute difference between basal and 12 months for each method.

         CSF                                GM                                 WM
Method   basal        12 months    diff     basal        12 months    diff     basal        12 months    diff
FAST     0.19 ± 0.02  0.19 ± 0.01  0.01     0.39 ± 0.02  0.39 ± 0.01  0.00     0.42 ± 0.02  0.43 ± 0.01  0.01
SPM5     0.19 ± 0.02  0.18 ± 0.01  0.01     0.44 ± 0.02  0.44 ± 0.02  0.00     0.37 ± 0.03  0.38 ± 0.02  0.01
SPM8     0.17 ± 0.01  0.16 ± 0.01  0.01     0.45 ± 0.01  0.45 ± 0.01  0.00     0.38 ± 0.02  0.38 ± 0.01  0.00
GAGMM    0.11 ± 0.02  0.10 ± 0.02  0.00     0.64 ± 0.14  0.59 ± 0.14  0.05     0.26 ± 0.15  0.31 ± 0.15  0.05
SOM      0.16 ± 0.02  0.16 ± 0.01  0.00     0.41 ± 0.04  0.41 ± 0.02  0.00     0.43 ± 0.05  0.44 ± 0.02  0.00
RFCM     0.15 ± 0.04  0.12 ± 0.01  0.03     0.51 ± 0.21  0.37 ± 0.02  0.14     0.34 ± 0.25  0.51 ± 0.02  0.17
FCM      0.11 ± 0.01  0.11 ± 0.01  0.00     0.36 ± 0.04  0.34 ± 0.03  0.02     0.52 ± 0.04  0.55 ± 0.03  0.03
KNN      0.01 ± 0.01  0.01 ± 0.01  0.00     0.64 ± 0.03  0.62 ± 0.03  0.03     0.34 ± 0.03  0.37 ± 0.03  0.03

accurately brain tissue probabilities. Atlas prior initialization proved more robust than the heuristic GA initialization and optimization, which is penalized in images with large amounts of image artifacts. FAST returned high results for WM tissue, which could indicate that its lower GM results were caused by differences with the ground truth in the interpretation of CSF, in addition to the presence of large artifacts that weaken the MRF priors. Clustering methods seem to perform well on images with few or no artifacts, as seen in the WM plots. The spatial regularization introduced in RFCM seems to outperform basic clustering approaches like SOM and FCM in scans with a moderate amount of artifacts.

5.4 MS lesion results

Finally, the third experiment evaluates the effect of different MS lesion loads on brain tissue
segmentation. Our SALEM dataset consists of 18 subjects: 9 healthy subjects and 9 patients
with different lesion loads mainly located in WM tissue. For each subject, both the initial and
the 12-month scans are available with the corresponding lesion masks. Lesion masks are used to
localize the regions of the brain where disease is present. With these data, three tests are
computed: 1) fractional brain tissue (FBT) is evaluated on healthy scans, and for each subject
the difference between the FBT of both scans is computed. The same test is then done twice on
subjects with lesions: 2) masking lesions so that they are not considered by the methods and
adding them after segmentation as WM to evaluate FBT, and 3) not masking lesions and computing
FBT as if they were normal tissue. Table 5.1 shows FBT for each healthy subject on both
consecutive scans, together with the difference between them. Tables 5.3 and 5.2 show the same
results for unmasked and masked lesion scans, respectively.
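As a minimal sketch of how tests 2 and 3 differ, the FBT computation can be written as below. The label coding (1 = CSF, 2 = GM, 3 = WM, 0 = background) and the function name `fractional_brain_tissue` are assumptions made for illustration; this is not the implementation used in the experiments.

```python
import numpy as np

CSF, GM, WM = 1, 2, 3  # assumed label coding; 0 is background


def fractional_brain_tissue(labels, lesion_mask=None):
    """Return the (CSF, GM, WM) volume fractions of a segmented scan.

    If lesion_mask is given, lesion voxels are counted as WM whatever
    label they carry (test 2). Without a mask, the labels produced by
    the segmentation method are used as they are (test 3).
    """
    labels = np.asarray(labels).copy()
    if lesion_mask is not None:
        labels[np.asarray(lesion_mask, dtype=bool)] = WM
    counts = np.array([(labels == t).sum() for t in (CSF, GM, WM)], dtype=float)
    total = counts.sum()
    if total == 0:
        raise ValueError("no brain voxels in label volume")
    return counts / total
```

Calling the function with and without the lesion mask on the same segmentation gives the two FBT estimates that are compared in the masked and unmasked tests.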

From Table 5.1, it can be observed that no method shows significant differences in FBT
between the basal and 12-month scans on healthy data. The maximum difference reported for
CSF is 0.03, for the RFCM method. The same behavior is seen for GM and WM tissues, although
RFCM returns larger differences between scans, with 0.14 and 0.17 in GM and WM respectively.
Ge et al. (2002) [38] reported common FBT values for brain tissue in adults (GM = 0.50,
WM = 0.35, and CSF = 0.15). With healthy subjects, most of the methods report values in
agreement with those normal FBT values. In contrast, since KNN misclassifies CSF tissue as GM,
its reported GM values are not in accordance with normal FBT values.

Consecutive scans from the same subject in which lesions have not been masked showed small
differences between them for all methods, independently of the tissue. However, larger
differences were obtained between methods. In CSF, the minimum FBT estimation over all scans
was given by KNN (0.01 ± 0.02) and the maximum by FAST (0.21 ± 0.04). The minimum FBT
estimation for GM was given by the RFCM method (0.32 ± 0.06) and the maximum by KNN
(0.63 ± 0.03). In WM, the minimum estimation was given by KNN (0.37 ± 0.03) and the maximum
by RFCM (0.57 ± 0.07). Results from Table 5.2 again showed small differences in FBT between
consecutive scans when they are segmented masking the lesions and adding them afterwards as
WM. 6 out of 8 methods reported average differences lower than FBT = 0.05 in all tissues
between masked and unmasked segmented scans of the same subject. The overlap between masked
and unmasked scans is represented in Figure 5.6.

In Figure 5.6, FBT values of scans where lesions have been masked are depicted in red, and
values of the same scans without masking the lesions in blue; × marks the initial (basal) scan
and ○ the 12-month (12m) scan of each subject. The graph reveals differences in WM tissue
estimation between subjects which are hidden in the averaged table measures. In the plot,
differences for GAGMM stand out. The GA initialization does not seem to provide a good model
for the GMM when it is run with default options. Although the method allows different GA
parameters to be set, the default options have been kept here. Regarding SOM, masked lesion
voxels that are expected to be WM seem to modify the trained weight matrix and force the
algorithm to misclassify WM.

Furthermore, we study how the methods classify lesions. Figure 5.7(a) depicts
a SALEM scan with a high lesion load (420 mm³). Figures 5.7(c) to (j) show the segmentations
provided by each method. From the plots we observe how methods tend to classify lesion
voxels as WM (green) or GM (red). Focusing on the lesion regions, smoother results seem
to be obtained for the plotted slice by GMM models such as SPM5, SPM8 and GAGMM than by
clustering methods or FAST, which seem to classify the two big spots as GM. Table 5.4 shows
quantitatively the average fraction of lesion tissue assigned to each class by each method for all scans in the


Figure 5.6: WM tissue FBT for lesion-masked and unmasked scans. On masked scans, lesions have been added as WM to the FBT. On unmasked scans, lesions are classified into normal tissues. Red values refer to masked scans. Blue values refer to unmasked scans. × refers to the initial scan, ○ to the 12-month scan.

basal study². We observe that, although every method classifies most lesion voxels as WM,
at least 16% of them (SOM) are classified as GM, and a smaller proportion as CSF.
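The fractions reported in Table 5.4 (number of lesion voxels assigned to a tissue divided by the total number of lesion voxels) can be sketched as follows. The label coding (1 = CSF, 2 = GM, 3 = WM) and the function name `lesion_tissue_fractions` are illustrative assumptions, not names taken from the evaluated tools.

```python
import numpy as np


def lesion_tissue_fractions(labels, lesion_mask, tissue_labels=(1, 2, 3)):
    """Fraction of lesion voxels that a method labeled as each tissue.

    labels: segmentation of the unmasked scan (assumed 1=CSF, 2=GM, 3=WM).
    lesion_mask: binary mask of the expert-annotated lesions.
    Returns one fraction per entry of tissue_labels.
    """
    labels = np.asarray(labels)
    lesion = np.asarray(lesion_mask, dtype=bool)
    n_lesion = lesion.sum()
    if n_lesion == 0:
        raise ValueError("lesion mask is empty")
    return np.array(
        [np.logical_and(lesion, labels == t).sum() / n_lesion for t in tissue_labels]
    )
```

Averaging these per-scan fractions over all basal scans would give one row of the table for a given method.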

Finally, the brain parenchymal fraction (BPF) is computed on all scans with MS disease. BPF
is the fraction of GM, WM and lesion volume with respect to the whole brain volume. This
measure is used by experts to evaluate tissue atrophy in MS diagnosis. Figure 5.8 depicts
the BPF values obtained for each segmentation method. The plots seem to indicate small changes
between masked and unmasked scans in 6 out of 8 methods. GAGMM and SOM fluctuate between
subjects, which is caused by misclassified scans, as explained before. KNN obtains
considerably higher values than the other methods, which is explained by the low FBT it
reports for CSF tissue. The BPF values reported in our experiment are in agreement with
studies evaluating WM tissue atrophy: Atkins et al. [7] and Rudick et al. [67] reported BPF
values of 0.82 to 0.83 in MS patients and 0.87 in healthy subjects.
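A minimal sketch of the BPF definition given above, assuming that the whole brain volume is the sum of the GM, WM, lesion and CSF volumes. The function name and argument names are hypothetical.

```python
def brain_parenchymal_fraction(v_gm, v_wm, v_lesion, v_csf):
    """BPF = (GM + WM + lesion) / (GM + WM + lesion + CSF).

    Volumes can be voxel counts, mm^3 or fractions, as long as all four
    use the same unit. The denominator is an assumption: the whole brain
    volume is taken as parenchyma plus CSF.
    """
    parenchyma = v_gm + v_wm + v_lesion
    total = parenchyma + v_csf
    if total <= 0:
        raise ValueError("total brain volume must be positive")
    return parenchyma / total
```

For instance, with the FAST fractions of Table 5.2 (CSF 0.21, GM 0.34, WM 0.46 with lesions already added to WM), the call `brain_parenchymal_fraction(0.34, 0.46, 0.0, 0.21)` gives a BPF of about 0.79.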

5.4.1 Discussion

Evaluating tissue classification methods in the presence of MS lesions is not an easy task. GAGMM

and SOM reported problems segmenting SALEM scans. GAGMM performed with high accu-

² Since there are no significant differences between consecutive scans, we base the classification on the basal study.


Table 5.2: FBT evaluation on 9 subjects with lesion loads from the SALEM dataset for all methods. Segmentation is not carried out on lesions, which are added as WM when computing FBT. Reported values are mean ± standard deviation. Diff refers to the absolute difference between basal and 12 months for each method.


Method  CSF basal   CSF 12m     Diff  GM basal    GM 12m      Diff  WM basal    WM 12m      Diff
FAST    0.21 ±0.04  0.21 ±0.04  0.00  0.34 ±0.06  0.34 ±0.06  0.00  0.46 ±0.05  0.46 ±0.04  0.00
SPM5    0.20 ±0.03  0.20 ±0.03  0.00  0.44 ±0.04  0.46 ±0.05  0.02  0.37 ±0.05  0.35 ±0.06  0.02
SPM8    0.16 ±0.02  0.17 ±0.01  0.00  0.46 ±0.01  0.46 ±0.01  0.00  0.38 ±0.01  0.38 ±0.01  0.00


Table 5.3: FBT evaluation on 9 subjects with lesion loads from the SALEM dataset for all methods. Lesions have not been masked and segmentation methods try to classify lesions as normal tissue. Reported values are mean ± standard deviation. Diff refers to the absolute difference between basal and 12 months for each method.


Method  CSF basal   CSF 12m     Diff  GM basal    GM 12m      Diff  WM basal    WM 12m      Diff
FAST    0.21 ±0.04  0.21 ±0.04  0.00  0.34 ±0.06  0.34 ±0.06  0.00  0.46 ±0.05  0.46 ±0.04  0.00
SPM5    0.20 ±0.03  0.20 ±0.03  0.00  0.44 ±0.04  0.46 ±0.05  0.02  0.37 ±0.05  0.35 ±0.06  0.02
SPM8    0.16 ±0.02  0.17 ±0.01  0.00  0.46 ±0.01  0.46 ±0.01  0.00  0.38 ±0.01  0.38 ±0.01  0.00


Table 5.4: Lesion tissue classification in the basal study. Fraction of lesion (number of lesion voxels classified as a tissue / total lesion voxels) which is segmented by each method as CSF, GM and WM when lesions are not masked. Reported values are mean ± standard deviation of the tissue fraction for hospital 1 (H1) and 2 (H2).

Method  CSF         GM          WM
        mean ±std   mean ±std   mean ±std
FAST    0.06 ±0.05  0.22 ±0.11  0.72 ±0.11
SPM5    0.04 ±0.03  0.20 ±0.14  0.76 ±0.14
SPM8    0.03 ±0.02  0.17 ±0.09  0.80 ±0.09
GAGMM   0.01 ±0.01  0.22 ±0.30  0.76 ±0.30
SOM     0.01 ±0.02  0.16 ±0.09  0.83 ±0.09
RFCM    0.02 ±0.02  0.22 ±0.13  0.76 ±0.13
FCM     0.02 ±0.02  0.22 ±0.29  0.76 ±0.29
KNN     0.00 ±0.00  0.37 ±0.16  0.63 ±0.16


Figure 5.7: Segmentation results for all methods on SALEM scan 201 without masking the lesions. (a) Scan 201 as provided by SALEM, (b) scan with lesions highlighted, (c) FAST output, (d) SPM5 output, (e) SPM8 output, (f) GAGMM output, (g) SOM output, (h) RFCM output, (i) FCM output, (j) KNN output.

Figure 5.8: Brain Parenchymal Fraction for lesion masked and non masked scans


racy in both past databases using default settings ( 3 classes with 2 PVE). However, here it

fails in some scans, not necessarily those with lesions. We suspect that the GA algorithm
settings have to be tuned in order to adapt the method to a different MRI scanner, which makes
the algorithm unsuitable for use by doctors. This method is not considered in the following discussion.

The tables introduced in the previous section have shown that there are only small differences
between consecutive scans of the same subject, both for healthy subjects and for MS patients.
Very close FBT results in a second, 12-month scan, which is known to belong to the same patient
without pathology, can be used to assess the reproducibility of the methods. All analyzed
methods obtained very close results for all follow-ups in healthy scans, which indicates their
capability to reproduce results. The provided WM lesion masks permit the evaluation of methods
by excluding the affected regions and adding them back as WM after segmentation. One important
point to consider is that methods tend to report small differences in FBT on GM and WM between
scans without masking and scans with lesions added to WM. All the methods misclassify at least
16% of lesion voxels as GM, and 20% or more in 6 methods (Table 5.4). This effect would lead to
wrong estimations of WM and GM where tissue volume is needed, because lesion voxels which are
supposed to be WM are misclassified as GM. Analyzing each method, SOM and SPM8 reported the
smallest misclassification rates (0.16 ± 0.09 and 0.17 ± 0.09, respectively), while KNN returned
the highest average misclassification rate from WM to GM (0.37 ± 0.16).

Moreover, BPF does not help to identify changes in atrophy in this case, since BPF cannot
reflect differences in WM tissue if WM is misclassified as GM: the GM+WM total remains
similar and no change is detected. However, our analysis is based only on quantitative
measurements. Further work has to be done in order to incorporate a qualitative analysis made
by radiologists and neurologists to determine which methods better classify brain tissue.
This qualitative evaluation had not been done at the time of closing this report, due to time
constraints in the doctors' agendas.
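The argument about BPF can be written out as a short worked equation. The symbols below are introduced here for illustration only: $V_{GM}$, $V_{WM}$, $V_{L}$ and $V_{CSF}$ are the segmented volumes of GM, WM, lesion and CSF, and $\delta$ is the volume of WM voxels that a method labels as GM. The denominator assumes that the whole brain volume is the sum of these four volumes.

```latex
\mathrm{BPF} = \frac{V_{GM} + V_{WM} + V_{L}}{V_{GM} + V_{WM} + V_{L} + V_{CSF}}
\qquad
\mathrm{BPF}' = \frac{(V_{GM} + \delta) + (V_{WM} - \delta) + V_{L}}
                     {(V_{GM} + \delta) + (V_{WM} - \delta) + V_{L} + V_{CSF}}
              = \mathrm{BPF}
```

Because $\delta$ cancels in both the numerator and the denominator, BPF is unchanged by WM-to-GM misclassification, whereas the individual FBT values of GM and WM shift by $\delta$ divided by the total volume.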

Chapter 6

Conclusions

This master thesis has been carried out in 4 main parts. Firstly, we have reviewed the
state-of-the-art on automatic brain tissue segmentation. After an extensive analysis of
recent papers we have presented a classification based on supervised and unsupervised methods.
Moreover, we have focused on published works whose evaluation has been done with
public databases such as Brainweb and IBSR. In order to evaluate the accuracy of the reviewed
methods, we have divided both supervised and unsupervised strategies by the dataset used,
and discussed the best obtained results.

Secondly, we have proposed a framework to compare brain tissue segmentation
techniques which includes preprocessing, segmentation and evaluation stages. For
the preprocessing step, we have introduced different techniques used to deal with inherent
problems in MRI, such as skull-stripping and intensity inhomogeneities, and we have proposed
some of them to be incorporated into our pipeline, reasoning our choice in each case. For the
segmentation stage, we have selected 4 publicly available segmentation approaches from
the state-of-the-art, where some of them such as FAST, SPM5 and SPM8 are widely used
by the neuroimaging community for tissue segmentation and volumetric analysis. We have also
introduced the GAGMM approach, which implements tissue classification by optimizing the
parameters of the tissue distributions in a GMM with genetic algorithms. Moreover, we have
selected 4 more works from the state-of-the-art and we have implemented them.
First, we have presented two FCM approaches based on the work of Pham et al. [61]. The first
one is based on basic FCM theory, while the second one modifies the FCM energy function with
a penalization term to include spatial information in the membership function. The
modification is designed to improve the performance of the basic FCM approach by penalizing
voxels with high variance with respect to their neighborhood. Furthermore, we have proposed a SOM clustering


approach based on the work of Tian et al. [77]. The SOM matrix is trained on the subject itself,
restricting the feature vector dimensions to T1w intensities. It has been seen that, given the
limitation in the feature space, the learned weight matrix in practice returns the cluster
centers of the tissue distributions, and classification is achieved by computing the minimum
absolute distance from voxels to the cluster centers. Finally, we have developed a modified
version of the KNN classifier proposed by De Boer et al. [27]. The classifier is also
auto-trained on the subject itself by registering prior probability tissue atlases to the
subject scan. After selecting the voxels with a probability higher than a given threshold, the
training dataset is built from the voxel intensities and the labels of the tissue with the
highest probability.
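The auto-training step of the KNN classifier can be sketched as below, under several assumptions that are not taken from the thesis: the atlas priors are already registered and flattened to shape (tissues, voxels), the threshold value 0.9 and k = 3 are placeholders, and the names `build_training_set` and `knn_classify` are hypothetical. The brute-force neighbor search is only meant to illustrate the idea on small inputs.

```python
import numpy as np


def build_training_set(intensities, atlas_probs, threshold=0.9):
    """Select training voxels from registered atlas priors.

    intensities: (N,) T1w intensities of the subject.
    atlas_probs: (K, N) prior probability of each of K tissues per voxel.
    A voxel is kept when its highest prior exceeds `threshold`; its label
    is the tissue with that highest prior.
    """
    intensities = np.asarray(intensities, dtype=float)
    atlas_probs = np.asarray(atlas_probs, dtype=float)
    best_tissue = atlas_probs.argmax(axis=0)
    keep = atlas_probs.max(axis=0) > threshold
    return intensities[keep], best_tissue[keep]


def knn_classify(train_x, train_y, query_x, k=3):
    """Label each query intensity by majority vote of its k nearest samples."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    query_x = np.asarray(query_x, dtype=float)
    # Absolute intensity distance between every query and training voxel.
    dist = np.abs(query_x[:, None] - train_x[None, :])
    nearest = np.argsort(dist, axis=1)[:, :k]
    n_classes = int(train_y.max()) + 1
    return np.array(
        [np.bincount(train_y[idx], minlength=n_classes).argmax() for idx in nearest]
    )
```

Voxels whose priors are ambiguous (no tissue above the threshold) are left out of the training set and are only labeled afterwards by the classifier.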

Thirdly, we have proposed a modification of the framework in order to measure

the efficiency of the methods in the presence of MS lesions. This modification is based

on the addition of a lesion masking step before segmentation. Hence, methods have been run

twice: first with lesion masked scans and afterward with the same scans without modification.

The capability of the methods to deal with WM lesions is evaluated by comparison between

the obtained FBT coefficients by both sets.

Finally, we have evaluated the 8 methods on synthetic and real T1w scans of healthy
subjects from Brainweb and IBSR20 respectively, and on scans with different loads of MS lesions
from the SALEM project database. Results on Brainweb and IBSR20 have been presented using
quantitative measures (the Dice similarity index, sensitivity and specificity), while results on
SALEM scans have been evaluated by the Fraction of Brain Tissue and the Brain Parenchymal
Fraction coefficients. Results on synthetic data have shown that, in general, all the methods
performed with very high accuracy, especially Gaussian Mixture Models such as SPM8 and GAGMM.
The results obtained with synthetic data were in accordance with previous studies using one or
more of these methods. The preprocessing bias correction step completely removed the different
levels of intensity inhomogeneity present in the data, and methods reported very similar results
independently of the amount of bias. In contrast, the accuracy of all methods decreased when
noise was increased considerably. Results on the IBSR data have shown
that KNN provided the best results on all tissues. In general, GMM approaches have again
been able to model brain tissue probabilities more accurately than clustering methods or FAST.
Atlas prior initialization has proved more robust than heuristic GA optimization, which
is penalized in images with high amounts of image artifacts. However, FAST has returned high
results for WM tissue, and its lower GM results could have been caused by differences with
the ground truth in how CSF is segmented, in addition to the presence of big artifacts which
weaken the MRF priors. Clustering methods have performed well on images with few or no artifacts.
The spatial regularization introduced in RFCM has outperformed basic clustering approaches like


SOM and FCM in scans with a moderate amount of artifacts, but it has been found to fail in the
presence of more bias.
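The overlap measures named above (Dice similarity index, sensitivity and specificity) can be sketched for one tissue class as follows. The code uses the standard confusion-matrix definitions, which is an assumption: the exact definitions applied in the experiments may differ. The function name `overlap_measures` is hypothetical.

```python
import numpy as np


def _ratio(num, den):
    """Safe division that returns NaN for an empty denominator."""
    return float(num) / float(den) if den > 0 else float("nan")


def overlap_measures(segmentation, ground_truth):
    """Return (dice, sensitivity, specificity) for two binary masks.

    Both inputs are masks of a single tissue class, for example
    `labels == 2` for GM under an assumed label coding.
    """
    seg = np.asarray(segmentation, dtype=bool)
    gt = np.asarray(ground_truth, dtype=bool)
    tp = np.logical_and(seg, gt).sum()    # tissue in both masks
    fp = np.logical_and(seg, ~gt).sum()   # tissue only in the segmentation
    fn = np.logical_and(~seg, gt).sum()   # tissue only in the ground truth
    tn = np.logical_and(~seg, ~gt).sum()  # tissue in neither mask
    dice = _ratio(2 * tp, 2 * tp + fp + fn)
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return dice, sensitivity, specificity
```

Running the function once per tissue and per scan, and then averaging, gives values of the kind reported per method in the appendix tables.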

Finally, experiments with initial scans and 12-month follow-ups of healthy subjects of the
SALEM project have shown the capability of the methods to repeat FBT estimations. On
the other hand, when methods have been run without masking the lesions, the results have
indicated that in general methods tend to misclassify at least 17% of WM lesion voxels as GM.
These results vary from SPM8 (17%) to KNN, which misclassifies 37%. However, these evaluations
are based on segmentation results and provided labeled scans, which can differ between
experts. Therefore, evaluations have to be weighed by experts to decide the real accuracy of
each approach.

6.1 Future Work

We present here some improvements to be made in future work. Some of them have not been
implemented in this work due to time constraints. Others are part of new projects in the
research framework of SALEM and AVALEM. First, with respect to the work presented here:

1. Qualitative evaluation by radiologists and neurologists of the obtained results. Add a
qualitative evaluation to our quantitative evaluation to balance the obtained results.

2. Modify the implemented algorithms with improvements: three out of four implemented
methods do not deal with spatial information. Some modifications could be made,
especially in building the training datasets, to include spatial information and improve the
robustness of the methods.

3. Optimize the implemented methods: although time is not a hard constraint in brain
tissue segmentation, all the methods have been implemented in MATLAB, and their running
time could be improved significantly by implementing them in other common languages and
toolkits such as C++ with ITK.

4. MS lesion evaluation in tissue classification: Lesion effects on tissue segmentation methods

have been evaluated masking lesions as WM.


Appendix A

Results tables

Table A.1: Dice metrics computed from segmented Brainweb scans with different noise levels (0%, 3% and 7%). Columns give the bias (intensity inhomogeneity) level of each scan (0%, 20% and 40%).

Method  CSF                  GM                   WM
        0%     20%    40%    0%     20%    40%    0%     20%    40%

0% noise
FAST    0.743  0.743  0.751  0.853  0.854  0.855  0.856  0.858  0.856
SPM5    0.751  0.748  0.755  0.876  0.874  0.875  0.883  0.883  0.883
SPM8    0.793  0.793  0.801  0.932  0.932  0.931  0.951  0.951  0.948
GAGMM   0.820  0.819  0.826  0.967  0.967  0.967  0.985  0.984  0.983
SOM     0.826  0.826  0.832  0.962  0.962  0.961  0.976  0.976  0.975
RFCM    0.823  0.823  0.831  0.960  0.961  0.961  0.974  0.975  0.975
FCM     0.827  0.826  0.832  0.965  0.965  0.964  0.980  0.979  0.978
KNN     0.267  0.254  0.265  0.842  0.841  0.839  0.964  0.963  0.960

3% noise
FAST    0.703  0.701  0.699  0.901  0.900  0.900  0.958  0.957  0.955
SPM5    0.731  0.730  0.727  0.917  0.917  0.917  0.957  0.956  0.956
SPM8    0.764  0.763  0.759  0.933  0.933  0.933  0.954  0.955  0.955
GAGMM   0.778  0.776  0.771  0.948  0.948  0.948  0.964  0.964  0.964
SOM     0.783  0.781  0.777  0.944  0.944  0.943  0.959  0.960  0.960
RFCM    0.781  0.780  0.777  0.945  0.945  0.945  0.960  0.961  0.961
FCM     0.784  0.782  0.778  0.946  0.946  0.945  0.961  0.961  0.961
KNN     0.303  0.307  0.304  0.854  0.854  0.853  0.955  0.955  0.953

7% noise
FAST    0.670  0.677  0.678  0.857  0.865  0.867  0.911  0.917  0.920
SPM5    0.698  0.700  0.704  0.878  0.877  0.881  0.901  0.902  0.904
SPM8    0.701  0.708  0.709  0.883  0.884  0.886  0.898  0.899  0.901
GAGMM   0.707  0.715  0.704  0.872  0.882  0.884  0.885  0.894  0.898
SOM     0.709  0.715  0.716  0.860  0.869  0.873  0.881  0.890  0.894
RFCM    0.707  0.715  0.716  0.876  0.882  0.883  0.894  0.901  0.902
FCM     0.709  0.716  0.716  0.864  0.873  0.876  0.883  0.892  0.895
KNN     0.091  0.108  0.117  0.790  0.796  0.799  0.883  0.892  0.896

Table A.2: Statistical evaluation on BrainWeb synthetic database. Reported values are mean ± standard deviation. dsc, Dice similarity; sens, sensitivity; spec, specificity.

Method  CSF                                 GM                                  WM
        dsc         sens        spec        dsc         sens        spec        dsc         sens        spec
FAST    0.71 ±0.03  0.73 ±0.06  0.68 ±0.01  0.87 ±0.02  0.86 ±0.04  0.89 ±0.06  0.91 ±0.04  0.87 ±0.09  0.96 ±0.05
SPM5    0.73 ±0.02  0.72 ±0.06  0.74 ±0.02  0.89 ±0.02  0.88 ±0.02  0.90 ±0.04  0.91 ±0.03  0.88 ±0.07  0.96 ±0.05
SPM8    0.75 ±0.04  0.70 ±0.08  0.83 ±0.02  0.92 ±0.02  0.91 ±0.03  0.92 ±0.02  0.93 ±0.03  0.93 ±0.03  0.94 ±0.04
GAGMM   0.77 ±0.05  0.68 ±0.07  0.88 ±0.03  0.93 ±0.04  0.94 ±0.05  0.93 ±0.03  0.95 ±0.04  0.95 ±0.03  0.94 ±0.05
SOM     0.77 ±0.05  0.71 ±0.07  0.85 ±0.03  0.92 ±0.04  0.90 ±0.06  0.95 ±0.03  0.94 ±0.04  0.97 ±0.03  0.92 ±0.05
RFCM    0.77 ±0.05  0.70 ±0.07  0.87 ±0.02  0.93 ±0.04  0.92 ±0.05  0.94 ±0.03  0.94 ±0.03  0.97 ±0.02  0.92 ±0.05
FCM     0.77 ±0.05  0.71 ±0.07  0.86 ±0.03  0.93 ±0.04  0.91 ±0.05  0.95 ±0.03  0.94 ±0.04  0.97 ±0.03  0.92 ±0.05
KNN     0.22 ±0.09  0.13 ±0.06  0.86 ±0.06  0.83 ±0.03  0.95 ±0.04  0.74 ±0.02  0.94 ±0.03  0.93 ±0.02  0.94 ±0.04
ASOM    0.77 ±0.05  0.71 ±0.06  0.77 ±0.05  0.92 ±0.04  0.90 ±0.05  0.92 ±0.04  0.95 ±0.04  0.96 ±0.03  0.93 ±0.05

Table A.3: Dice metrics computed from segmented Brainweb scans with different noise levels (0%, 3% and 7%). Reported values are mean ± standard deviation.

Method  CSF                                          GM                                           WM
        0%            3%            7%            0%            3%            7%            0%            3%            7%
FAST    0.746 ±0.004  0.701 ±0.002  0.675 ±0.004  0.854 ±0.001  0.900 ±0.001  0.863 ±0.005  0.857 ±0.001  0.957 ±0.001  0.916 ±0.005
SPM5    0.751 ±0.004  0.729 ±0.002  0.701 ±0.003  0.875 ±0.001  0.917 ±0.000  0.879 ±0.002  0.883 ±0.000  0.956 ±0.000  0.902 ±0.002
SPM8    0.796 ±0.005  0.762 ±0.003  0.706 ±0.004  0.931 ±0.001  0.933 ±0.000  0.884 ±0.002  0.950 ±0.001  0.955 ±0.000  0.899 ±0.002
GAGMM   0.822 ±0.004  0.775 ±0.004  0.709 ±0.006  0.967 ±0.000  0.948 ±0.000  0.879 ±0.007  0.984 ±0.001  0.964 ±0.000  0.892 ±0.007
SOM     0.828 ±0.003  0.780 ±0.003  0.713 ±0.004  0.961 ±0.001  0.943 ±0.000  0.867 ±0.007  0.976 ±0.001  0.960 ±0.000  0.888 ±0.007
RFCM    0.826 ±0.004  0.779 ±0.002  0.713 ±0.004  0.961 ±0.001  0.945 ±0.000  0.880 ±0.004  0.975 ±0.001  0.960 ±0.000  0.899 ±0.004
FCM     0.829 ±0.003  0.781 ±0.003  0.713 ±0.004  0.965 ±0.001  0.945 ±0.000  0.871 ±0.006  0.979 ±0.001  0.961 ±0.000  0.890 ±0.006
KNN     0.262 ±0.007  0.305 ±0.002  0.106 ±0.013  0.840 ±0.002  0.854 ±0.000  0.795 ±0.005  0.963 ±0.002  0.954 ±0.001  0.890 ±0.007

Table A.4: Statistical evaluation on Real T1 IBSR20 database. Reported values are mean ± standard deviation. dsc, Dice similarity; sens, sensitivity; spec, specificity.

IBSR dataset
Method  CSF                                 GM                                  WM
        dsc         sens        spec        dsc         sens        spec        dsc         sens        spec
FAST    0.13 ±0.04  0.91 ±0.04  0.07 ±0.03  0.68 ±0.06  0.56 ±0.05  0.85 ±0.08  0.79 ±0.10  0.82 ±0.12  0.76 ±0.08
SPM5    0.17 ±0.07  0.86 ±0.20  0.09 ±0.04  0.76 ±0.06  0.69 ±0.08  0.85 ±0.02  0.80 ±0.04  0.78 ±0.04  0.84 ±0.09
SPM8    0.21 ±0.07  0.89 ±0.04  0.12 ±0.05  0.78 ±0.06  0.70 ±0.07  0.88 ±0.05  0.81 ±0.08  0.82 ±0.08  0.79 ±0.08
GAGMM   0.25 ±0.12  0.79 ±0.10  0.16 ±0.09  0.77 ±0.09  0.71 ±0.08  0.85 ±0.12  0.74 ±0.16  0.77 ±0.22  0.75 ±0.07
ANN     0.15 ±0.06  0.89 ±0.05  0.08 ±0.03  0.69 ±0.09  0.59 ±0.08  0.85 ±0.12  0.77 ±0.14  0.81 ±0.18  0.74 ±0.07
RFCM    0.25 ±0.05  0.81 ±0.05  0.08 ±0.03  0.73 ±0.09  0.64 ±0.07  0.84 ±0.12  0.72 ±0.14  0.82 ±0.18  0.75 ±0.08
FCM     0.14 ±0.10  0.90 ±0.08  0.15 ±0.07  0.69 ±0.12  0.58 ±0.13  0.88 ±0.11  0.77 ±0.16  0.80 ±0.22  0.67 ±0.10
KNN     0.41 ±0.12  0.34 ±0.10  0.56 ±0.20  0.86 ±0.05  0.84 ±0.07  0.88 ±0.02  0.80 ±0.05  0.83 ±0.03  0.77 ±0.08

Bibliography

[1] Julio Acosta-Cabronero, Guy B. Williams, João M.S. Pereira, George Pengas, and Peter J.

Nestor. The impact of skull-stripping and radio-frequency bias correction on grey-matter

segmentation for voxel-based morphometry. NeuroImage, 39(4):1654 – 1665, 2008.

[2] M.N. Ahmed, S.M. Yamany, N. Mohamed, A.A. Farag, and T. Moriarty. A modified fuzzy

c-means algorithm for bias field estimation and segmentation of mri data. Medical Imaging,

IEEE Transactions on, 21(3):193 –199, march 2002.

[3] Ayelet Akselrod-Ballin, Meirav Galun, John Moshe Gomori, Achi Brandt, and Ronen

Basri. Prior knowledge driven multiscale segmentation of brain mri. In Proceedings of the

10th international conference on Medical image computing and computer-assisted interven-

tion, MICCAI’07, pages 118–126, Berlin, Heidelberg, 2007. Springer-Verlag.

[4] Ayelet Akselrod-Ballin, Meirav Galun, Moshe Gomori, Ronen Basri, and Achi Brandt.

Atlas guided identification of brain structures by combining 3d segmentation and svm

classification. In Rasmus Larsen, Mads Nielsen, and Jon Sporring, editors, Medical Image

Computing and Computer-Assisted Intervention – MICCAI 2006, volume 4191 of Lecture

Notes in Computer Science, pages 209–216. Springer Berlin / Heidelberg, 2006.

[5] J. Ashburner and K.J. Friston. Unified segmentation. NeuroImage, 26:839–851, 2005.

[6] John Ashburner and Karl J. Friston. Voxel-based morphometry - the methods. NeuroImage,
11(6):805 – 821, 2000.

[7] M. Stella Atkins, Jeff J. Orchard, Ben Law, and Melanie K. Tory. t robustness of the brain

parenchymal fraction for measuring brain atrophy.

[8] Suyash P. Awate, Tolga Tasdizen, Norman Foster, and Ross T. Whitaker. Adaptive markov

modeling for mutual-information-based, unsupervised mri brain-tissue classification. Medical
Image Analysis, 10(5):726 – 739, 2006. The Eighth International Conference on Medical
Imaging and Computer Assisted Intervention, MICCAI 2005.

[9] Rohit Bakshi, Suzie Ariyaratana, Ralph H. B. Benedict, and Lawrence Jacobs. Fluid-

attenuated inversion recovery magnetic resonance imaging detects cortical and juxtacortical

multiple sclerosis lesions. Arch Neurol, 58(5):742–748, 2001.

[10] M. Balafar, A. Ramli, M. Saripan, and S. Mashohor. Review of brain mri image segmenta-

tion methods. Artificial Intelligence Review, 33:261–274, 2010. 10.1007/s10462-010-9155-0.

[11] Pierre-Louis Bazin and Dzung L. Pham. Homeomorphic brain image segmentation

with topological and statistical atlases. Medical Image Analysis, 12(5):616 – 625, 2008.

Special issue on the 10th international conference on medical imaging and computer
assisted intervention - MICCAI 2007.

[12] J. C. Bezdek, L. O. Hall, and L. P. Clarke. Review of MR image segmentation techniques

using pattern recognition. Medical Physics, 20:1033–1048, July 1993.

[13] Kristi Boesen, Kelly Rehm, Kirt Schaper, Sarah Stoltzner, Roger Woods, Eileen Liders,

and David Rottenberg. Quantitative comparison of four brain extraction algorithms. Neu-

roImage, 22(3):1255 – 1261, 2004.

[14] S. Bricq, Ch. Collet, and J.P. Armspach. Unifying framework for multimodal brain mri

segmentation based on hidden markov chains. Medical Image Analysis, 12(6):639 – 652,

2008. Special issue on information processing in medical imaging 2007.

[15] Antoni Buades, Bartomeu Coll, and Jean-Michel Morel. Nonlocal image and movie de-

noising. International Journal of Computer Vision, 76:123–139, 2008. 10.1007/s11263-007-

0052-1.

[16] Mariano Cabezas, Arnau Oliver, Xavier Llado, Jordi Freixenet, and Meritxell Bach Cuadra.

A review of atlas-based segmentation for magnetic resonance brain images. Computer

Methods and Programs in Biomedicine, 104(3):e158 – e177, 2011.

[17] Weiling Cai, Songcan Chen, and Daoqiang Zhang. Fast and robust fuzzy c-means clustering

algorithms incorporating local information for image segmentation. Pattern Recognition,

40(3):825 – 838, 2007.

[18] Benoît Caldairou, François Rousseau, Nicolas Passat, Piotr Habas, Colin Studholme, and

Christian Heinrich. A non-local fuzzy segmentation method: Application to brain mri.

In Xiaoyi Jiang and Nicolai Petkov, editors, Computer Analysis of Images and Patterns,


volume 5702 of Lecture Notes in Computer Science, pages 606–613. Springer Berlin /

Heidelberg, 2009.

[19] Ruben Cardenes, Rodrigo de Luis-Garcia, and Meritxell Bach-Cuadra. A multidimensional

segmentation evaluation for medical image data. Computer Methods and Programs in

Biomedicine, 96(2):108 – 124, 2009.

[20] Songcan Chen and Daoqiang Zhang. Robust image segmentation using fcm with spatial

constraints based on new kernel-induced distance measure. Systems, Man, and Cybernetics,

Part B: Cybernetics, IEEE Transactions on, 34(4):1907 –1916, aug. 2004.

[21] L. P. Clarke, R. P. Velthuizen, M. A. Camacho, J. J. Heine, M. Vaidyanathan, L. O. Hall,

R. W. Thatcher, and M. L. Silbiger. MRI segmentation: Methods and applications. Magn

Reson Imaging, 13:343–368, 1995.

[22] Chris A. Cocosco, Vasken Kollokian, Remi K.-S. Kwan, G. Bruce Pike, and Alan C. Evans.

Brainweb: Online interface to a 3d mri simulated brain database. NeuroImage, 5:425, 1997.

[23] Chris A. Cocosco, Alex P. Zijdenbos, and Alan C. Evans. A fully automatic and robust

brain mri tissue classification method. Medical Image Analysis, 7(4):513 – 527, 2003.

Medical Image Computing and Computer Assisted Intervention.

[24] Alastair Compston and Alasdair Coles. Multiple sclerosis. Lancet, 372(9648):1502–17,

2008.

[25] Thomas E. Conturo, Robert C. McKinstry, Joseph A. Aronovitz, and Jeffrey J. Neil.

Diffusion mri: Precision, accuracy and flow effects. NMR in Biomedicine, 8(7):307–332,

1995.

[26] Renske de Boer, Henri A. Vrooman, M. Arfan Ikram, Meike W. Vernooij, Monique M.B.

Breteler, Aad van der Lugt, and Wiro J. Niessen. Accuracy and reproducibility study of

automatic mri brain tissue segmentation methods. NeuroImage, 51(3):1047 – 1056, 2010.

[27] Renske De Boer, Henri A Vrooman, Fedde Van Der Lijn, Meike W Vernooij, M Arfan

Ikram, Aad Van Der Lugt, Monique M B Breteler, and Wiro Niessen. White matter lesion

extension to automatic brain tissue segmentation on mri. NeuroImage, 45(4):1151–1161,

2009.

[28] Ayse Demirhan and Inan Gulan. Combining stationary wavelet transform and self-

organizing maps for brain mr image segmentation. Engineering Applications of Artificial

Intelligence, 24(2):358 – 367, 2011.


[29] Luc Devroye. On the almost everywhere convergence of nonparametric regression function

estimates, 1981.

[30] L. R. Dice. Measures of the amount of ecologic association between species. Ecology,

26(3):297–302, July 1945.

[31] Tarun Dua, Paul Rompani, World Health Organization, and Multiple Sclerosis International
Federation. Atlas: multiple sclerosis resources in the world, 2008. World Health
Organization, Geneva, Switzerland, 2008.

[32] Richard O. Duda, Peter E. Hart, and David G. Stork. Pattern Classification (2nd Edition).

Wiley-Interscience, 2 edition, November 2001.

[33] Guillaume Dugas-Phocion, Miguel Ballester, Gregoire Malandain, Christine Lebrun, and

Nicholas Ayache. Improved em-based tissue segmentation and partial volume effect quan-

tification in multi-sequence brain mri. In Christian Barillot, David Haynor, and Pierre

Hellier, editors, Medical Image Computing and Computer-Assisted Intervention – MICCAI

2004, volume 3216 of Lecture Notes in Computer Science, pages 26–33. Springer Berlin /

Heidelberg, 2004.

[34] P. A. Filipek, C. Richelme, D. N. Kennedy, and V. S. Caviness. The young adult human

brain: an MRI-based morphometric analysis. Cereb Cortex, 4(4):344–360, 1994.

[35] M Filippi and F Agosta. Imaging biomarkers in multiple sclerosis. J Magn Reson Imaging,

31(4):770–88, 2010.

[36] M. Filippi, G. Iannucci, C. Tortorella, L. Minicucci, M.A. Horsfield, B. Colombo, M.P.

Sormani, and G. Comi. Comparison of ms clinical phenotypes using conventional and

magnetization transfer mri. Neurology, 52(3):588, 1999.

[37] Gath and Geva. Unsupervised optimal fuzzy clustering. Pattern Analysis and Machine

Intelligence, IEEE Transactions on, 11(7):773 –780, jul 1989.

[38] Yulin Ge, Robert I. Grossman, James S. Babb, Marcie L. Rabin, Lois J. Mannon, and

Dennis L. Kolson. Age-related total gray matter and white matter changes in normal

adult brain. part i: Volumetric mr imaging analysis. American Journal of Neuroradiology,

23(8):1327–1333, 2002.

[39] R. Guillemaud and M. Brady. Estimating the bias field of mr images. Medical Imaging,

IEEE Transactions on, 16(3):238 –251, june 1997.


[40] S.W. Hartley, A.I. Scher, E.S.C. Korf, L.R. White, and L.J. Launer. Analysis and validation

of automated skull stripping tools: A validation study based on 296 mr images from the

honolulu asia aging study. NeuroImage, 30(4):1179 – 1186, 2006.

[41] Khader M. Hasan, Indika S. Walimuni, Humaira Abid, Sushmita Datta, Jerry S. Wolin-

sky, and Ponnada A. Narayana. Human brain atlas-based multimodal mri analysis of

volumetry, diffusimetry, relaxometry and lesion distribution in multiple sclerosis patients

and healthy adult controls: Implications for understanding the pathogenesis of multiple

sclerosis and consolidation of quantitative mri results in ms. Journal of the Neurological

Sciences, 313(1-2):99 – 109, 2012.

[42] M. Hasanzadeh and S. Kasaei. Multispectral brain mri segmentation using genetic fuzzy

systems. In Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International

Symposium on, pages 1 –4, feb. 2007.

[43] R H Hashemi, W G Bradley, D Y Chen, J E Jordan, J A Queralt, A E Cheng, and

J N Henrie. Suspected multiple sclerosis: Mr imaging with a thin-section fast flair pulse

sequence. Radiology, 196(2):505–510, 1995.

[44] Renjie He, Balasrinivasa Sajja, Sushmita Datta, and Ponnada Narayana. Volume and shape

in feature space on adaptive fcm in mri segmentation. Annals of Biomedical Engineering,

36:1580–1593, 2008. 10.1007/s10439-008-9520-1.

[45] Zujun Hou. A review on MR image intensity inhomogeneity correction. International

Journal of Biomedical Imaging, 2006.

[46] A. Huang, R. Abugharbieh, R. Tam, and A. Traboulsee. Automatic mri brain tissue

segmentation using a hybrid statistical and geometric model. In Biomedical Imaging:

Nano to Macro, 2006. 3rd IEEE International Symposium on, pages 394–397, april 2006.

[47] Paul Jaccard. The distribution of the flora in the alpine zone.1. New Phytologist, 11(2):37–

50, 1912.

[48] K. A. Jellinger. New frontiers of mr-based technique in multiple sclerosis. European Journal

of Neurology, 10(4):467–467, 2003.

[49] J.R. Jimenez-Alaniz, V. Medina-Banuelos, and O. Yanez-Suarez. Data-driven brain mri

segmentation supported on edge confidence and a priori tissue information. Medical Imaging,

IEEE Transactions on, 25(1):74–83, jan. 2006.

[50] M. Joliot and B.M. Mazoyer. Three-dimensional segmentation and interpolation of magnetic

resonance brain images. Medical Imaging, IEEE Transactions on, 12(2):269–277,

jun 1993.

[51] T. Kalaiselvi and K. Somasundaram. Fuzzy c-means technique with histogram based

centroid initialization for brain tissue segmentation in mri of head scans. In Humanities,

Science Engineering Research (SHUSER), 2011 International Symposium on, pages 149–154,

june 2011.

[52] Tina Kapur, W.Eric L. Grimson, William M. Wells III, and Ron Kikinis. Segmentation

of brain tissue from magnetic resonance images. Medical Image Analysis, 1(2):109–127,

1996.

[53] Stelios Krinidis and Vassilios Chatzis. A robust fuzzy local information c-means clustering

algorithm. Image Processing, IEEE Transactions on, 19(5):1328–1337, May 2010.

[54] Cornelia Laule, Irene M Vavasour, Esther Leung, David KB Li, Piotr Kozlowski, Anthony L

Traboulsee, Joel Oger, Alex L MacKay, and GR Wayne Moore. Pathological basis of

diffusely abnormal white matter: insights from magnetic resonance imaging and histology.

Multiple Sclerosis Journal, 17(2):144–150, 2011.

[55] F. D. Lublin and S. C. Reingold. Defining the clinical course of multiple sclerosis: results

of an international survey. National Multiple Sclerosis Society (USA) Advisory Committee

on Clinical Trials of New Agents in Multiple Sclerosis. Neurology, 46(4):907–911, April

1996.

[56] J. B. MacQueen. Some methods for classification and analysis of multivariate observa-

tions. In L. M. Le Cam and J. Neyman, editors, Proc. of the fifth Berkeley Symposium on

Mathematical Statistics and Probability, volume 1, pages 281–297. University of California

Press, 1967.

[57] J.L. Marroquin, B.C. Vemuri, S. Botello, E. Calderon, and A. Fernandez-Bouzas. An

accurate and efficient bayesian method for automatic segmentation of brain mri. Medical

Imaging, IEEE Transactions on, 21(8):934–945, aug. 2002.

[58] Karla L. Miller and John M. Pauly. Nonlinear phase correction for navigated diffusion

imaging. Magnetic resonance in medicine : official journal of the Society of Magnetic Res-

onance in Medicine / Society of Magnetic Resonance in Medicine, 50(2):343–353, August

2003.

[59] J M Minderhoud, J H van der Hoeven, and A J Prange. Course and prognosis of chronic

progressive multiple sclerosis. results of an epidemiological study. Acta Neurol Scand,

78(1):10–5, 1988.

[60] D. L. Pham, C. Xu, and J. L. Prince. Current methods in medical image segmentation.

Annual review of biomedical engineering, 2(1):315–337, 2000.

[61] Dzung L. Pham. Spatial models for fuzzy clustering. Computer Vision and Image Understanding,

84(2):285–297, 2001.

[62] W.E. Reddick, J.O. Glass, E.N. Cook, T.D. Elkin, and R.J. Deaton. Automated segmenta-

tion and classification of multispectral magnetic resonance images of brain using artificial

neural networks. Medical Imaging, IEEE Transactions on, 16(6):911–918, dec. 1997.

[63] Kelly Rehm, Kirt Schaper, Jon Anderson, Roger Woods, Sarah Stoltzner, and David Rottenberg.

Putting our heads together: a consensus approach to brain/non-brain segmentation

in t1-weighted mr volumes. NeuroImage, 22(3):1262–1270, 2004.

[64] M. A. Rocca, N. Anzalone, A. Falini, and M. Filippi. Contribution of magnetic resonance

imaging to the diagnosis and monitoring of multiple sclerosis. La Radiologia Medica, pages

1–14, March 2012.

[65] M. Rovaris, A. Gass, R. Bammer, S. J. Hickman, O. Ciccarelli, D. H. Miller, and M. Filippi.

Diffusion mri in multiple sclerosis. Neurology, 65(10):1526–1532, 2005.

[66] S. Ruan, C. Jaggi, J. Xue, J. Fadili, and D. Bloyet. Brain tissue classification of magnetic

resonance images using partial volume modeling. Medical Imaging, IEEE Transactions on,

19(12):1179–1187, dec. 2000.

[67] R.A. Rudick, E. Fisher, J.-C. Lee, J. Simon, L. Jacobs, and the Multiple Sclerosis Collab-

orative Research Group. Use of the brain parenchymal fraction to measure whole brain

atrophy in relapsing-remitting ms. Neurology, 53(8):1698, 1999.

[68] Ali Sahraian and E.-W. Radue. Mri atlas of ms lesions. 2008.

[69] Benoit Scherrer, Florence Forbes, Catherine Garbay, and Michel Dojat. Fully bayesian joint

model for mr brain scan tissue and structure segmentation. In Dimitris Metaxas, Leon Axel,

Gabor Fichtinger, and Gabor Szekely, editors, Medical Image Computing and Computer-

Assisted Intervention – MICCAI 2008, volume 5242 of Lecture Notes in Computer Science,

pages 1066–1074. Springer Berlin / Heidelberg, 2008.

[70] G. A. F. Seber. Multivariate Distributions, pages 17–58. John Wiley & Sons, Inc., 2008.

[71] Shan Shen, W. Sandham, M. Granat, and A. Sterr. Mri fuzzy segmentation of brain tissue

using neighborhood attraction with neural-network optimization. Information Technology

in Biomedicine, IEEE Transactions on, 9(3):459–467, sept. 2005.

[72] Andrew Simmons, Paul S. Tofts, Gareth J. Barker, and Simon R. Arridge. Sources of

intensity nonuniformity in spin echo images at 1.5 t. Magnetic Resonance in Medicine,

32(1):121–128, 1994.

[73] J.G. Sled, A.P. Zijdenbos, and A.C. Evans. A nonparametric method for automatic cor-

rection of intensity nonuniformity in mri data. Medical Imaging, IEEE Transactions on,

17(1):87–97, feb. 1998.

[74] S.M Smith. Fast robust automated brain extraction. Hum. Brain Mapp., 17(3):143–155,

November 2002.

[75] M Styner, J Lee, B Chin, M Chin, O Commowick, H Tran, S Markovic-Plese, V Jewells,

and S Warfield. 3d segmentation in the clinic: A grand challenge ii: Ms lesion segmentation.

MIDAS, pages 1–6, 2008.

[76] L. Szilagyi, Z. Benyo, S.M. Szilagyi, and H.S. Adam. Mr brain image segmentation using

an enhanced fuzzy c-means algorithm. In Engineering in Medicine and Biology Society,

2003. Proceedings of the 25th Annual International Conference of the IEEE, volume 1,

pages 724–726 Vol.1, sept. 2003.

[77] D. Tian and L. Fan. A brain mr images segmentation method based on som neural network.

In Bioinformatics and Biomedical Engineering, 2007. ICBBE 2007. The 1st International

Conference on, volume 2, pages 686–689, july 2007.

[78] GuangJian Tian, Yong Xia, Yanning Zhang, and Dagan Feng. Hybrid genetic and vari-

ational expectation-maximization algorithm for gaussian-mixture-model-based brain mr

image segmentation. Information Technology in Biomedicine, IEEE Transactions on,

15(3):373–380, may 2011.

[79] J. Tohka, E. Krestyannikov, I.D. Dinov, A.M. Graham, D.W. Shattuck, U. Ruotsalainen,

and A.W. Toga. Genetic algorithms for finite mixture model based voxel classification in

neuroimaging. Medical Imaging, IEEE Transactions on, 26(5):696–711, may 2007.

[80] On Tsang, Ali Gholipour, Nasser Kehtarnavaz, Kaundinya Gopinath, Richard Briggs, and

Issa Panahi. Comparison of tissue segmentation algorithms in neuroimage analysis software

tools. In Engineering in Medicine and Biology Society, 2008. EMBS 2008. 30th Annual

International Conference of the IEEE, pages 3924–3928, aug. 2008.

[81] K. Van Leemput, F. Maes, D. Vandermeulen, and P. Suetens. Automated model-based

bias field correction of mr images of the brain. Medical Imaging, IEEE Transactions on,

18(10):885–896, oct. 1999.

[82] U. Vovk, F. Pernus, and B. Likar. A review of methods for correction of intensity inhomogeneity

in mri. Medical Imaging, IEEE Transactions on, 26(3):405–421, march 2007.

[83] U. Vovk, F. Pernus, and B. Likar. A review of methods for correction of intensity inhomogeneity

in mri. Medical Imaging, IEEE Transactions on, 26(3):405–421, march 2007.

[84] S.K. Warfield, K.H. Zou, and W.M. Wells. Simultaneous truth and performance level esti-

mation (staple): an algorithm for the validation of image segmentation. Medical Imaging,

IEEE Transactions on, 23(7):903–921, july 2004.

[85] Michael Wels, Yefeng Zheng, Martin Huber, Joachim Hornegger, and Dorin Comaniciu.

A discriminative model-constrained em approach to 3d mri brain tissue classification and

intensity non-uniformity correction. Physics in Medicine and Biology, 56(11):3269, 2011.

[86] Steven D. Wolff and Robert S. Balaban. Magnetization transfer contrast (mtc) and tissue

water proton relaxation in vivo. Magnetic Resonance in Medicine, 10(1):135–144, 1989.

[87] Zhao Yi, Antonio Criminisi, Jamie Shotton, and Andrew Blake. Discriminative, seman-

tic segmentation of brain tissue in mr images. In Guang-Zhong Yang, David Hawkes,

Daniel Rueckert, Alison Noble, and Chris Taylor, editors, Medical Image Computing and

Computer-Assisted Intervention – MICCAI 2009, volume 5762 of Lecture Notes in Com-

puter Science, pages 558–565. Springer Berlin / Heidelberg, 2009.

[88] Lian Yuanfeng and Wu Falin. Three-dimensional probabilistic neural network using for mr

image segmentation. In Electronic Measurement Instruments (ICEMI), 2011 10th International

Conference on, volume 3, pages 127–131, aug. 2011.

[89] Y. Zhang, M. Brady, and S. Smith. Segmentation of brain mr images through a hidden

markov random field model and the expectation-maximization algorithm. Medical Imaging,

IEEE Transactions on, 20(1):45–57, jan 2001.

[90] Yu Jin Zhang. A review of recent evaluation methods for image segmentation. In Signal

Processing and its Applications, Sixth International, Symposium on. 2001, volume 1, pages

148–151 vol.1, 2001.
