
MEDICAL IMAGE SEGMENTATION OF PET SCAN DATASETS USING CLUSTERING APPROACH

A. Meena (a), K. Raja (b)

(a) Research Scholar, Dept. of CSE, Sathyabama University, Chennai, Tamilnadu, India
(b) Dean (Academics), Alpha College of Engineering, Chennai, Tamilnadu, India


Abstract

Computer-based imaging techniques are now widely used in biomedical imaging. Positron emission tomography (PET) is one such technique. PET scan images are similar to MRI scan images, but they are particularly helpful in detecting the development of tumors. Segmenting PET scan images manually requires expertise, and clustering plays an important role in automating the process. Clustering is commonly described as an unsupervised learning process in which a dataset of n objects is partitioned into k groups (k < n) so as to maximize the intra-cluster similarity and minimize the inter-cluster similarity. The growing number of clustering algorithms available in various domains has made selecting one more complex. The major factors to consider are the size (volume) of the dataset, the nature of the dataset, and the number of clusters. A given algorithm is tested against each factor to identify the most suitable one, and the choice is based on performance, quality, and the number of clusters extracted. This paper studies the commonly used K-means clustering algorithm and briefly discusses toolboxes for reproducing and extending work in medical image analysis. The work was developed using the AForge.NET framework in a Windows environment and tested with a sample PET scan image. The computational results are compared with the K-means result image produced in MATLAB 7.0.1.

Keywords: Clustering, K-means, PET scan images, AForge.NET framework, MATLAB, MIPAV

1. Introduction

The most commonly used radiographic techniques are Computed Tomography (CT), Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET). These technologies are major components of diagnosis, clinical studies and treatment planning, and are widely used in medical research. The aim of automatic medical image segmentation is to describe the image content based on its features. In recent years many approaches have been proposed to segment medical images, each with its own merits and limitations. In the symmetry-based approach, symmetry is defined mathematically as a distance-preserving transformation of the plane or space that leaves a given set of points, and their respective features, unchanged.

Positron Emission Tomography (PET), also known as PET imaging or a PET scan, is a type of nuclear medicine imaging. It detects chemical and physiological changes related to metabolism. A radioactive tracer is injected into the body, and tumors or cancers are identified based on the movement of the tracer [1]. PET scan images are more sensitive than other imaging techniques such as CT and MRI, because those techniques mainly show the structure of the body parts, whereas PET scan images show the internal formation of tumors and cancer cells through the metabolism of the body parts [2].

This paper presents a study of the application of the well-known K-means clustering algorithm. The algorithm is used to automate the segmentation of the tumor-affected area, with datasets classified by type, size, and number of clusters [7]. The rest of the paper is organized as follows. Section 2 reviews related work in this area, Section 3 describes the basic concept of clustering, and Section 4 covers various clustering algorithms. Sections 5 and 6 present the system architecture and the implementation method of the toolboxes, respectively. The AForge.NET framework and MATLAB are compared in the concluding sections.

International Conference on Mathematical Computer Engineering - ICMCE - 2013 550

ISBN 978-93-82338-91-8 © 2013 Bonfring


2. Related Work

Digital image processing allows an algorithm to avoid problems, such as the build-up of noise and signal distortion, that occur in analog image processing.

Michael J. Fulham et al. (2002) stated that quantitative positron emission tomography provides measurements of dynamic physiological and biochemical processes in humans. In 2003, Ciccarelli et al. studied sclerosis, which disrupts the normal organization or integrity of cerebral white matter, and Meder et al. (2006) examined the underlying changes in cartilage structure during osteoarthritis. Functional imaging methods are also being used to evaluate the appropriateness and efficacy of therapies for conditions such as Parkinson's disease, depression, schizophrenia, and Alzheimer's disease. Quantum dots (qdots) are fluorescent nanoparticles of semiconductor material that are specially designed to detect the biochemical markers of cancer, as described by Carts-Powell (2006). Osama Abu Abbas (2008) explained various clustering algorithms and their application according to the type of dataset used. In 2009, Stefan Kramer et al. described the use of structured patient data in the analysis and implementation of a clustering algorithm. The authors noted that one aim of medical research in dementia is to correlate images of the brain with other variables, for instance demographic information or the outcomes of clinical tests. In this paper, clustering is applied to whole PET scans.

3. Clustering

Clustering is used in data mining to classify items into groups of similar members. It also underlies segmentation, which gives a quick bird's-eye view of a problem. Unlike classification, clustering and unsupervised learning do not rely on predefined classes and class-labeled training examples. For this reason, clustering is a form of learning by observation rather than learning by examples.

Many clustering techniques have been proposed over the years in different research disciplines. These techniques operate on a given dataset and are applied in a wide variety of interdisciplinary applications. For example, clustering can be used to derive plant and animal taxonomies in biology, to categorize genes with similar functionality in genetic engineering, and to group a population into segments by demographic information for direct marketing and sales in business.

4. Clustering algorithms

The best-known clustering algorithms are chosen for study. The basic K-means algorithm is then implemented in two different toolboxes, MATLAB and the AForge.NET framework, in a Windows environment.

4.1 K-Means clustering algorithm

K-means is a well-known partitioning method. Objects are classified as belonging to one of k groups, with k chosen a priori [3]. Cluster membership is determined by calculating the centroid of each group and assigning each object to the group with the closest centroid [13]. This approach minimizes the overall within-cluster dispersion by iteratively reallocating cluster members [4].

Pseudo code for centroid calculation

Step 1: Initialize / calculate the new centroids
Step 2: Calculate the distance between each object and every centroid
Step 3: Cluster the objects (assign each object to its closest centroid)
Step 4: If any object moved from one cluster to another, go to Step 1; otherwise stop
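The four steps above can be sketched in Python as follows. This is a minimal illustration for one-dimensional intensity values, not the AForge.NET or MATLAB code used in this work; the function name `kmeans_1d`, the evenly spaced initial centroids and the sample values are assumptions made for the example.

```python
def kmeans_1d(values, k, max_iter=100):
    """Cluster scalar values into k groups following Steps 1-4."""
    lo, hi = min(values), max(values)
    # Step 1 (initialize): spread k centroids evenly over the value range
    # (an assumed initialization, chosen only for this sketch).
    centroids = [lo + (j + 1) * (hi - lo) / (k + 1) for j in range(k)]
    labels = [-1] * len(values)
    for _ in range(max_iter):
        moved = False
        for i, v in enumerate(values):
            # Step 2: distance between the object and every centroid.
            distances = [abs(v - c) for c in centroids]
            # Step 3: assign the object to the closest centroid.
            nearest = distances.index(min(distances))
            if nearest != labels[i]:
                labels[i] = nearest
                moved = True
        # Step 4: stop when no object changed cluster.
        if not moved:
            break
        # Step 1 (repeat): recalculate each centroid as the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = sum(members) / len(members)
    return centroids, labels


centroids, labels = kmeans_1d([10, 12, 11, 200, 205, 198, 90, 95], k=3)
```

With these sample values the loop settles after two passes, with one centroid per group of nearby intensities.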



Pseudo code for image segmentation (the steps are listed in Section 5, after Fig. 1)

4.2 Clustering Using REpresentatives - CURE

Partitioning algorithms specify an initial number of groups and iteratively reallocate objects among the groups until convergence. In contrast, hierarchical algorithms combine or divide existing groups, creating a hierarchical structure that reflects the order in which groups are merged or divided [5]. CURE uses representative points to mark each cluster, and all the data are inserted into a heap. Each data item is moved to its closest cluster. This process continues until the heap size equals the number of clusters initially specified.

4.3 Self-Organization Map (SOM)

Self-Organizing Maps (SOM) are a data visualization technique inspired by neural networks in the brain. A SOM uses competitive and cooperative mechanisms to achieve unsupervised learning. It is used in complex applications and for statistical computation. It preserves the topological properties of the input space, and its cells are arranged on a hexagonal or rectangular grid. The problem in image segmentation is that the number of neural units in the competitive layer needs to equal the selected number of regions [9], but in real applications it is not always possible to predict the correct number of regions N in the segmented image.
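The competition and cooperation mechanisms can be illustrated with a small one-dimensional SOM. The sketch below is only an assumption-laden example: the units sit on a chain rather than a hexagonal or rectangular grid, and the function name `train_som_1d`, the decay schedule and the Gaussian neighbourhood are choices made here, not details taken from [9]. The number of units `n_units` plays the role of the number of regions N, which has to be chosen in advance.

```python
import math
import random


def train_som_1d(values, n_units, epochs=50, seed=0):
    """Train a 1-D chain of SOM units on scalar inputs."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    # Initialize unit weights at random positions inside the input range.
    weights = [rng.uniform(lo, hi) for _ in range(n_units)]
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius both decay over time.
        frac = 1.0 - epoch / epochs
        rate = 0.5 * frac
        radius = max(n_units / 2.0 * frac, 0.5)
        for x in rng.sample(values, len(values)):
            # Competition: the best-matching unit has the closest weight.
            bmu = min(range(n_units), key=lambda u: abs(weights[u] - x))
            # Cooperation: neighbours on the chain also move toward x,
            # scaled by a Gaussian of their distance to the winner.
            for u in range(n_units):
                influence = math.exp(-((u - bmu) ** 2) / (2.0 * radius ** 2))
                weights[u] += rate * influence * (x - weights[u])
    return weights


som_weights = train_som_1d([10, 12, 11, 200, 205, 198, 90, 95], n_units=3)
```

Because every update moves a weight part of the way toward an input, the trained weights stay inside the range of the data.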

5. System architecture

The given PET scan images are converted to text datasets whose values are less than one. The datasets are then transformed to the comma-separated values (CSV) file format, a delimited text format in which commas separate the columns of tabular data and which is common to all computer platforms.
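A possible form of this conversion is sketched below. The text does not state how the values are brought below one, so dividing 8-bit intensities by 256 is an assumption, as are the function name `image_to_csv` and the sample pixel rows.

```python
import csv
import io


def image_to_csv(pixels, max_value=256):
    """Scale integer pixel rows to values below one and emit CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in pixels:
        # Dividing by max_value (256 for 8-bit data) keeps every value < 1.
        writer.writerow([p / max_value for p in row])
    return buffer.getvalue()


csv_text = image_to_csv([[0, 64, 128], [192, 255, 32]])
```

Each image row becomes one CSV line, so the tabular layout of the image is preserved in the text file.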

Here, clustering is applied using the K-means algorithm. The seed values serve as the initial centroid values. From these, new centroids are calculated, and the process is repeated until an optimized clustering is formed. The algorithm's initial learning process is guided by compatibility factors, defined as the size, the type and the number of clusters. The clustered, segmented image is then visualized and saved in an appropriate file format.

Fig.1 System architecture

Step 1: Initialize centroids corresponding to the required number of clusters
Step 2: Calculate the original centroids (call K-means)
Step 3: Calculate the mask
Step 4: Do the segmentation process
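Steps 3 and 4 can be sketched as follows, assuming that Steps 1 and 2 have already produced the centroids. The mask is taken here to be the index of the nearest centroid for each pixel, and the segmentation keeps only the pixels of one cluster per output image. Both interpretations, the function name `segment_with_mask` and the sample data are assumptions for illustration.

```python
def segment_with_mask(pixels, centroids):
    """Label each pixel with its nearest centroid and split the image by label."""
    k = len(centroids)
    # Step 3: the mask holds, per pixel, the index of the closest centroid.
    mask = [
        [min(range(k), key=lambda j: abs(p - centroids[j])) for p in row]
        for row in pixels
    ]
    # Step 4: one output image per cluster; pixels outside the cluster become 0.
    segments = [
        [[p if m == j else 0 for p, m in zip(row, mask_row)]
         for row, mask_row in zip(pixels, mask)]
        for j in range(k)
    ]
    return mask, segments


image = [[10, 200, 12], [95, 205, 90]]
mask, segments = segment_with_mask(image, centroids=[11.0, 92.5, 201.0])
```

Here `segments[2]` contains only the brightest pixels, which is the kind of region a high-uptake area would fall into under this scheme.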



6. Implementation Method

K-means is implemented in two different environments. The basic algorithm is first tested in the AForge.NET framework using PET scan image datasets, and the result is then verified in MATLAB.

6.1 AForge .NET framework

AForge.NET is a C# framework that can be applied to a wide range of image processing and computer vision tasks. The .NET framework contains the Common Language Runtime (CLR) and the .NET framework class libraries, also known as the Base Class Libraries. The CLR provides a universal execution engine for developers. AForge.Imaging is an additional library in the AForge.NET framework that helps with image enhancement and processing in domains such as pattern recognition, image processing and knowledge-based systems. This work was implemented in Visual Studio .NET, which provides a visual environment for designing .NET applications.

The proposed work has four modules. The image given as input is converted to datasets, the converted datasets are clustered, and the resulting segmented PET scan images show the tumor-affected areas.

6.2 Image to dataset conversion

The sample data were collected from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Alzheimer's disease (AD) involves a serious loss of thinking ability. In 2006 there were 26.6 million sufferers worldwide, and Alzheimer's is predicted to affect 1 in 85 people globally by 2050 [11]. Women have a higher risk of developing AD [12]. ADNI collects and validates neuroimaging data such as MRI and PET images. The input image consists of pixels, each of which is described by a ratio of RGB color components. The image is converted to a byte stream, and the byte stream is stored in a jagged array and written to a text file as the dataset.
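The conversion from RGB pixels to a byte stream and a jagged array might look like the sketch below. The paper's implementation is in C# and its exact pixel mapping is not given, so averaging the three channels into one 8-bit value is an assumption, as are the name `rgb_to_jagged_bytes` and the sample pixels.

```python
def rgb_to_jagged_bytes(rgb_rows):
    """Convert rows of (R, G, B) pixels to a byte stream and a jagged array."""
    jagged = []
    for row in rgb_rows:
        # Average the three channels into one 8-bit intensity (assumed mapping).
        jagged.append([(r + g + b) // 3 for r, g, b in row])
    # Flatten the rows into a single byte stream, row after row.
    stream = bytes(value for row in jagged for value in row)
    return stream, jagged


rgb_image = [
    [(255, 0, 0), (0, 0, 0)],
    [(30, 60, 90), (255, 255, 255), (10, 20, 30)],  # rows may differ in length
]
stream, jagged = rgb_to_jagged_bytes(rgb_image)
```

A Python list of lists behaves like a jagged array: each inner list has its own length, and the flat `stream` is what would be written out as the dataset.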

Fig. 2 Image conversion

Fig. 3 CSV conversion

Fig. 4 Image segmentation using K-means

6.3 Data preprocessing and image retrieval

The dataset, in the form of a text file, is imported into the system, which converts it to the CSV format. The converted CSV file contains the byte stream, which is used to initialize the jagged pixel array; the array is then transformed into a bitmap image. In a jagged array the length of each inner array can differ, so it can use less memory and be faster than a two-dimensional array when the data have an uneven shape. The image retrieved from the byte stream is shown as a thumbnail in this part of the system, the system's mode changes to "file imported", and the number of rows and columns is displayed. Fig. 3 shows the converted CSV file format. The jagged index is used to reallocate the pixel values to their corresponding centroid values.
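Reading the CSV text back into a jagged array, and reporting the row and column counts that the system displays, can be sketched as follows. The function name `csv_to_jagged` and the sample text are hypothetical; the bitmap reconstruction itself is left out.

```python
import csv
import io


def csv_to_jagged(csv_text):
    """Parse CSV text into a jagged list of floats, one inner list per row."""
    reader = csv.reader(io.StringIO(csv_text))
    # Rows are kept as separate lists, so their lengths are free to differ.
    return [[float(cell) for cell in row] for row in reader if row]


jagged_pixels = csv_to_jagged("0.0,0.25,0.5\n0.75,0.125\n")
row_count = len(jagged_pixels)
column_counts = [len(row) for row in jagged_pixels]
```

Indexing `jagged_pixels[r][c]` then gives direct access to a pixel value, which is the kind of lookup needed when values are reassigned to their centroids.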



6.4 Knowledge based cluster analysis

This part of the system selects the K-means clustering algorithm for the given dataset and segments the image. The number of clusters is chosen arbitrarily as 5. The clustering algorithm automates the segmentation process, with clustering performed on the pixel values. The byte stream can be converted back to pixel values, which makes image segmentation on the byte stream possible. Thumbnail views of the image before and after K-means clustering are shown in Fig. 4. The system allows the segmented PET scan datasets to be viewed.

6.5 MATLAB Applications

MATrix LABoratory (MATLAB) is widely used for implementing algorithms in a numerical computing environment. Its Image Processing Toolbox provides a stable, well-supported set of software tools for a wide range of digital image processing and segmentation tasks. The major applications are intensity transformation, image restoration, registration, image data compression, morphological image processing, and region and boundary representation and description. However, MATLAB has some reported limitations, such as low processing speed and wasteful use of memory [10].

6.6 MATLAB cluster analysis

The given image is loaded into MATLAB. First the number of clusters is assigned. Then the initial centroids (c) are calculated as follows

c = (1:k)*m / (k+1)    (1)

where m is the value obtained from the double-precision image pixels arranged in a single column, and k is the number of centroids.
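Equation (1) places the k initial centroids at equal intervals between 0 and m. A small Python rendering is given below; the value m = 256 is only an assumed intensity range for 8-bit data, and `initial_centroids` is a name introduced for the example.

```python
def initial_centroids(m, k):
    """Equation (1): c_j = j * m / (k + 1) for j = 1..k."""
    return [j * m / (k + 1) for j in range(1, k + 1)]


# m = 256 is an assumed intensity range for 8-bit data; k = 3 clusters.
c = initial_centroids(m=256, k=3)
```

For m = 256 and k = 3 the centroids fall at one quarter, one half and three quarters of the range.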

The distance (d) between each centroid and an object is calculated as

d = abs(o(i) - c)    (2)

where o(i) is the i-th element of the one-dimensional array of objects. Using these distances, the new centroid is calculated with Equation (3):

nc(i) = sum(a.*h(a)) / sum(h(a))    (3)

where a is the result of the object clustering function, h(a) holds the corresponding non-zero elements obtained from object clustering, and nc is the new centroid value. The resulting new centroid values are used to create the mask and then to segment the image. Fig. 5 shows the calculation of the threshold value for segmentation.
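One pass of Equations (2) and (3) can be sketched in Python under a specific reading of the symbols: h is taken to be the intensity histogram (h[a] is the number of pixels with intensity a), and a ranges over the intensities assigned to cluster i. This reading, the function name `update_centroids` and the sample data are assumptions; the MATLAB code itself is not reproduced in the paper.

```python
from collections import Counter


def update_centroids(pixels, centroids):
    """One pass of Equations (2) and (3) over the intensity histogram."""
    h = Counter(pixels)  # h[a] = number of pixels with intensity a
    members = {j: [] for j in range(len(centroids))}
    for a in h:
        # Equation (2): distance from intensity a to every centroid.
        d = [abs(a - c) for c in centroids]
        members[d.index(min(d))].append(a)
    new_centroids = []
    for j, c in enumerate(centroids):
        total = sum(h[a] for a in members[j])
        # Equation (3): histogram-weighted mean; keep c if the cluster is empty.
        nc = sum(a * h[a] for a in members[j]) / total if total else c
        new_centroids.append(nc)
    return new_centroids


nc = update_centroids([10, 10, 20, 100, 110, 110, 250], [64.0, 128.0, 192.0])
```

Working on the histogram rather than on every pixel means the cost of each pass depends on the number of distinct intensities, not on the image size. Repeating the pass until the centroids stop changing gives the final values used for the mask.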

Fig. 5 Threshold calculation (diagram labels: Original input image; Clustering; New centroid value n(c); Segmentation; Segmented output image)



7. Result

The images obtained from MATLAB and the .NET framework are analyzed using the MIPAV tool, which is designed specifically for medical image processing and analysis. A volume of interest is selected, and its statistical parameters are listed in Table 1. Table 1 shows that the coefficient of variance for the .NET framework is lower than for the MATLAB environment, which indicates a less dispersed intensity distribution in the .NET result. Fig. 7 compares the results of the MATLAB and .NET framework implementations.

Table 1: Statistical analysis using MIPAV

Fig.7 Comparison Chart

8. Conclusion and Future work

Biomedical imaging techniques are prominently used for clinical purposes, so that the anatomy and physiology of internal body parts can be monitored [8]. The PET scan is a nuclear biomedical imaging technique that helps to reveal abnormal cells, and the segmented image gives a clear picture of the affected portions. This paper examined the basic K-means algorithm on two working platforms. K-means was first tested with the AForge.NET framework in a Windows environment, where the input image is converted into a byte stream and stored as a jagged array. It is then imported in the CSV file format, which can be read on any file system. In this application the number of clusters is selected automatically. The obtained image was compared with the one produced in the MATLAB environment. The comparison indicates that the clustering result is independent of the platform environment and yields the optimal segmented image.

9. References

[1] Koon-Pong Wong, Dagan Feng, Steven R. Meikle, Michael J. Fulham, "Segmentation of Dynamic PET Images Using cluster analysis," IEEE Transactions on Nuclear Science, Vol. 49, pp. 200-207, 2002.

[2] Andreas Hapfelmeier, Jana Schmidt, Marianne Muller, Stefan Kramer, "Interpreting PET scans by structured Patient Data: A Data mining case study in dementia Research," IEEE Knowledge and Information Systems, pp. 213-222, 2009.

[3] Osama Abu Abbas, "Comparison between Data clustering algorithms," The International Arab Journal of Information Technology, Vol. 5, pp. 320-325.

[4] O. J. Oyelade, O. O. Oladipupo, I. C. Obagbuwa, "Application of k-Means Clustering algorithm for prediction of Students' Academic Performance," International Journal of Computer Science and Information Security, Vol. 7, pp. 292-295, 2010.

[5] George Karypis, Eui-Hong Han, Vipin Kumar, "CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling," IEEE Computer, Vol. 32, pp. 68-75.

[6] H. H. Chang, D. J. Valentino, G. R. Duckwiler, A. W. Toga, "Segmentation of Brain MR Images Using a Charged Fluid Model," IEEE Transactions on Biomedical Engineering, Vol. 54, pp. 1798-1813, 2007.

[7] D. L. Pham, C. Xu, J. L. Prince, "Current methods in medical image segmentation," Annual Review of Biomedical Engineering, Vol. 2, pp. 315-337, 2000.

[8] H. N. Wagner, Z. Szabo, J. W. Buchanan, Principles of nuclear medicine. Pennsylvania, pp. 564-575, 1995.

[9] Anamika Ahirwar, R. S. Jadon, "Characterization of tumor region using SOM and Neuro Fuzzy techniques in Digital Mammography," International Journal of Computer Science & Information Technology (IJCSIT), Vol. 3, No. 1, Feb. 2011.

[10] M. Bister et al., "Increasing the speed of medical image processing in MatLab."

[11] R. Brookmeyer, E. Johnson, K. Ziegler-Graham, H. M. Arrighi, "Forecasting the global burden of Alzheimer's disease," July 2007, pp. 186-91.

[12] F. Bermejo-Pareja, J. Benito-León, S. Vega, M. J. Medrano, G. C. Román, "Incidence and subtypes of dementia in three elderly populations of central Spain," J. Neurol. Sci., 264 (1-2), pp. 63-72, Jan. 2008.

[13] M. C. Su and C. H. Chou, "A Modified Version of the K-Means Algorithm with a Distance Based on Cluster Symmetry," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 23, No. 6, pp. 674-680, June 2001.

S. No.   Parameter                    Result obtained from
                                      MATLAB       .NET
1.       No. of voxels                16472        15544
2.       Average voxel intensity      86.0916      79.2168
3.       Standard deviation           92.0758      65.3007
4.       Skewness                     0.7643       0.5899
5.       Kurtosis                     1.7756       1.7946
6.       Largest slice distance       142 mm       134 mm
7.       Median intensity             23           71
8.       Mode intensity               16           13
9.       Mode count                   3490         4618
10.      Coefficient of variance      106.951      82.433
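The coefficient of variance in row 10 can be related to rows 2 and 3. Assuming it is defined as the standard deviation expressed as a percentage of the average voxel intensity (the usual definition, though the formula used by MIPAV is not stated here), the tabulated values can be checked with a few lines of Python:

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return 100.0 * std_dev / mean


# Standard deviation and average voxel intensity taken from rows 3 and 2.
cv_matlab = coefficient_of_variation(92.0758, 86.0916)
cv_dotnet = coefficient_of_variation(65.3007, 79.2168)
```

Under this assumption the computed values agree closely with the figures reported in row 10, and the .NET value is the smaller of the two.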

