MEDICAL IMAGE SEGMENTATION OF PET SCAN
DATASETS USING CLUSTERING APPROACH
A. Meena (a), K. Raja (b)
(a) Research Scholar, Dept. of CSE, Sathyabama University, Chennai, Tamilnadu, India
(b) Dean (Academics), Alpha College of Engineering, Chennai, Tamilnadu, India
Abstract
Computer-based imaging techniques are now widely used in biomedical imaging. Positron Emission Tomography (PET) is
one such technique; like MRI it produces scan images, but PET images are particularly helpful in detecting the
development of tumors. Segmenting PET scan images is normally done manually and requires expertise, so clustering is
used to automate the process. Clustering is an unsupervised learning process in which a dataset of n objects is
partitioned into k groups (k < n) so as to maximize the intra-cluster similarity and minimize the inter-cluster
similarity. The growing number of clustering algorithms available in various domains has made selecting one more
complex. The major factors to consider are the size (volume) of the dataset, the nature of the dataset, and the number
of clusters. A given algorithm is tested against each factor in order to identify the optimized one, and the choice is
concluded from the performance, quality and number of clusters extracted. This paper studies the commonly used K-Means
clustering algorithm and briefly lists toolboxes for reproducing and extending work in medical image analysis. The
work is developed using the AForge.NET framework in a Windows environment and tested with a sample PET scan image. The
computational results are compared with the K-Means result image produced in MATLAB 7.0.1.
Keywords: Clustering, K-Means, PET scan images, AForge.NET framework, MATLAB, MIPAV
1. Introduction
The most commonly used radiographic techniques are Computed Tomography (CT), Magnetic Resonance Imaging (MRI) and
Positron Emission Tomography (PET). These technologies are major components of diagnosis, clinical studies and
treatment planning, and are widely used in medical research. The aim of automatic medical image segmentation is to
describe the image content based on its features. In recent years many approaches to segmenting medical images have
been proposed, each with its own merits and limitations. In the symmetry-based approach, a symmetry is defined
mathematically as a distance-preserving transformation of the plane or space that leaves a given set of points, and
their respective features, unchanged.
Positron Emission Tomography (PET), also known as PET imaging or a PET scan, is a type of nuclear medicine imaging.
A PET scan detects chemical and physiological changes related to metabolism. It uses a radioactive tracer that is
injected into the body, and tumors or cancers are identified from the movement of the tracer [1]. PET images are more
sensitive than other imaging techniques such as CT and MRI, because those techniques mainly show the anatomy of the
body parts, whereas PET images show the internal formation of tumors and cancer cells by means of the metabolism of
the body parts [2].
This paper presents a study on the application of the well-known K-Means clustering algorithm. The algorithm is used
to automate the segmentation of the tumor-affected area, with datasets classified by type, size, and number of
clusters [7]. The rest of the paper is organized as follows. Section 2 reviews related work, Section 3 describes the
basic concept of clustering, and Section 4 covers various clustering algorithms. Sections 5 and 6 present the system
architecture and the implementation method in the two toolboxes, respectively. Section 7 reports the results, and the
AForge.NET framework and MATLAB are compared in the concluding section.
International Conference on Mathematical Computer Engineering - ICMCE - 2013 550
ISBN 978-93-82338-91-8 © 2013 Bonfring
2. Related Work
Digital image processing allows an algorithm to avoid problems such as the build-up of noise and signal distortion
that occur in analog image processing.
Michael J. Fulham et al. (2002) stated that quantitative positron emission tomography provides measurements of
dynamic physiological and biochemical processes in humans. Ciccarelli et al. (2003) addressed sclerosis, which
disrupts the normal organization or integrity of cerebral white matter, and Meder et al. (2006) addressed the
underlying changes in cartilage structure during osteoarthritis. Functional imaging methods are also being used to
evaluate the appropriateness and efficacy of therapies for conditions such as Parkinson's disease, depression,
schizophrenia, and Alzheimer's disease. Quantum dots (qdots) are fluorescent nanoparticles of semiconductor material
specially designed to detect the biochemical markers of cancer, as described by Carts-Powell (2006). Osama Abu Abbas
(2008) compared various clustering algorithms and their applications according to the type of dataset used. Stefan
Kramer et al. (2009) used structured patient data to analyze the implementation of a clustering algorithm; the authors
noted that one aim of medical research in dementia is to correlate images of the brain with other variables, for
instance demographic information or outcomes of clinical tests. In this paper, clustering is applied to whole PET
scans.
3. Clustering
Clustering is used in data mining to group similar items together. It also supports segmentation, which gives a quick
bird's-eye view of a problem. Unlike classification, clustering and other unsupervised learning methods do not rely on
predefined classes and class-labeled training examples. For this reason, clustering is a form of learning by
observation rather than learning by examples.
Many clustering techniques have been proposed over the years in different research disciplines. These techniques
operate on a given dataset and are applied in a wide variety of interdisciplinary applications. For example,
clustering can be used to derive plant and animal taxonomies in biology, to categorize genes with similar
functionality in genetic engineering, and to group a population by demographic information into segments for direct
marketing and sales in business.
4. Clustering algorithms
The most widely known clustering algorithms are chosen for study. The basic K-Means algorithm is then implemented in
two different toolboxes, MATLAB and the AForge.NET framework, in a Windows environment.
4.1 K-Means clustering algorithm
K-Means is a well-known partitioning method. Objects are classified as belonging to one of k groups, with k chosen a
priori [3]. Cluster membership is determined by calculating the centroid of each group and assigning each object to
the group with the closest centroid [13]. This approach minimizes the overall within-cluster dispersion by iterative
reallocation of cluster members [4].
Pseudo code for centroid calculation
Step 1: Initialize the centroids (first pass) or calculate the new centroids
Step 2: Calculate the distance between each object and every centroid
Step 3: Assign each object to the cluster of its closest centroid
Step 4: If any object moved from one cluster to another, go to Step 1; otherwise stop
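The four steps of the centroid calculation can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used in this work: it assumes scalar objects (for example pixel intensities), takes the initial centroids as an argument, and the name kmeans_1d is hypothetical.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Cluster scalar values around the given initial centroids."""
    centroids = list(centroids)
    labels = [None] * len(values)
    for _ in range(max_iter):
        moved = False
        # Steps 2-3: assign every object to its closest centroid.
        for i, v in enumerate(values):
            nearest = min(range(len(centroids)),
                          key=lambda j: abs(v - centroids[j]))
            if nearest != labels[i]:
                labels[i] = nearest
                moved = True
        # Step 4: stop when no object changed cluster.
        if not moved:
            break
        # Step 1 (repeated): recompute each centroid as the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = sum(members) / len(members)
    return centroids, labels


centroids, labels = kmeans_1d([0.10, 0.12, 0.15, 0.80, 0.85, 0.90], [0.0, 1.0])
print(centroids, labels)
```

In this small example the first three values end up in one cluster and the last three in the other, and the loop stops as soon as a full pass leaves every assignment unchanged.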
Pseudo code for image segmentation
4.2 Clustering Using REpresentatives - CURE
Partitioning algorithms specify an initial number of groups and iteratively reallocate objects among the groups until
convergence. In contrast, hierarchical algorithms combine or divide existing groups, creating a hierarchical structure
that reflects the order in which groups are merged or divided [5]. In CURE, representative points are used to mark
each cluster, and all the data are inserted into a heap. Each data item is moved to its closest cluster. This process
continues until the heap size equals the number of clusters given initially.
4.3 Self-Organizing Map (SOM)
A Self-Organizing Map (SOM) is a data visualization technique inspired by neural networks in the brain. SOM uses
competitive and cooperative mechanisms to achieve unsupervised learning. It is used in complex applications and for
statistical analysis. It preserves the topological properties of the input space, and its cells are arranged on a
hexagonal or rectangular grid. The problem in image segmentation is that the number of neural units in the competitive
layer needs to equal the selected number of regions [9]. However, in real applications it is not always possible to
predict the correct number of regions N in the segmented image.
5. System architecture
The given PET scan images are converted to text datasets whose values are less than one. The datasets are then
transformed to the comma-separated values (CSV) file format, a delimited text format in which commas separate the
columns of tabular data and which is common to all computer platforms.
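As an illustration of this conversion step, a minimal Python sketch is given below. It is not the AForge.NET code used in this work; the function name pixels_to_csv and the division of 8-bit values by 256 to obtain values below one are assumptions made for the example.

```python
import csv
import io


def pixels_to_csv(rows, max_value=255):
    """Scale 8-bit pixel rows to [0, 1) and return them as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        # Dividing by max_value + 1 keeps every value strictly below one.
        writer.writerow([f"{p / (max_value + 1):.4f}" for p in row])
    return buffer.getvalue()


csv_text = pixels_to_csv([[0, 64, 128], [255, 32, 16]])
print(csv_text)
```

Each image row becomes one CSV line, so the row and column counts of the image can be read back directly from the file.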
Clustering is then applied using the K-Means algorithm. The early seed values serve as the initial centroid values.
From these values the new centroids are calculated, and the process is repeated until an optimized clustering is
formed. The initial learning process of the algorithm is guided by compatibility factors, defined as the size, type
and number of clusters. The clustered, segmented image is then visualized and saved in an appropriate file format.
Fig.1 System architecture
Pseudo code for image segmentation:
Step 1: Initialize the centroids corresponding to the required number of clusters
Step 2: Calculate the centroids (call K-Means)
Step 3: Calculate the mask
Step 4: Perform the segmentation
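Steps 3 and 4 of the segmentation procedure can be illustrated with the short Python sketch below. It assumes that the centroids have already been computed by K-Means (Steps 1 and 2) and that the image is a two-dimensional list of intensities; the function name segment and the sample values are hypothetical.

```python
def segment(image, centroids):
    """Return (mask, segmented) for a 2-D list of pixel intensities."""
    mask, segmented = [], []
    for row in image:
        # Step 3: the mask stores, per pixel, the index of the closest centroid.
        mask_row = [min(range(len(centroids)),
                        key=lambda j: abs(p - centroids[j]))
                    for p in row]
        mask.append(mask_row)
        # Step 4: paint every pixel with the value of its cluster centroid.
        segmented.append([centroids[j] for j in mask_row])
    return mask, segmented


image = [[10, 12, 200], [11, 190, 210]]
mask, segmented = segment(image, centroids=[11.0, 200.0])
print(mask)
print(segmented)
```

The mask keeps the cluster index of every pixel, so a single cluster (for example the brightest one) can later be isolated without re-running the clustering.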
6. Implementation Method
An implementation of K-Means is run in two different environments. The basic algorithm is first tested in the
AForge.NET framework using PET scan image datasets, and is then verified in MATLAB.
6.1 AForge .NET framework
AForge.NET is a C# framework that can be applied to a wide range of image processing and computer vision tasks. The
underlying .NET framework contains the Common Language Runtime (CLR) and the .NET framework class libraries, also
known as the Base Class Libraries. The CLR provides a universal execution engine for developers. AForge.Imaging is a
library of the AForge.NET framework that supports image enhancement and processing in various domains such as pattern
recognition, image processing and knowledge-based systems. This work is implemented in Visual Studio .NET, which
provides a visual environment for designing .NET applications.
The proposed work has four modules. The image is given as input and converted to datasets. The converted datasets are
then clustered, and the resulting segmented PET scan images show the tumor-affected areas.
6.2 Image to dataset conversion
The sample data was collected from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Alzheimer's disease (AD)
causes a serious loss of thinking ability. In 2006 there were 26.6 million sufferers worldwide, and Alzheimer's is
predicted to affect 1 in 85 people globally by 2050 [11]. Women have a higher risk of developing AD [12]. ADNI collects
and validates neuroimaging data such as MRI and PET images. The given input image consists of pixels, each of which is
a combination of RGB color values. The image is converted to a byte stream, and the byte stream is stored in a jagged
array and written to a text file as the dataset.
Fig.2 Image conversion Fig.3 CSV conversion Fig.4 Image segmentation using K- Means
6.3 Data preprocessing and image retrieval
The dataset, in the form of a text file, is imported into the system, which converts it to a dataset in CSV format.
The converted CSV contains the byte stream. It is used to initialize the jagged pixel array, and the array is
transformed to a bitmap image. In a jagged array the length of each row can differ; because of this uneven shape it
can use less memory and be faster than a two-dimensional array. The image retrieved from the byte stream is shown as a
thumbnail in this part of the system, and the system's mode changes to "file imported", with the number of rows and
columns displayed. Fig. 3 shows the converted CSV file. The jagged index is used to reallocate the pixel values to
their corresponding centroid values.
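The idea of reading the CSV data back into a jagged array can be sketched in Python, where a list of lists plays the role of the C# jagged array. This is an illustrative sketch only; the function name csv_to_jagged and the sample values are assumptions, and the memory and speed remarks above refer to the .NET arrays, not to this sketch.

```python
import csv
import io


def csv_to_jagged(text):
    """Parse CSV text into a jagged array (rows may differ in length)."""
    reader = csv.reader(io.StringIO(text))
    return [[float(cell) for cell in row] for row in reader if row]


jagged = csv_to_jagged("0.10,0.20,0.30\n0.40,0.50\n0.60\n")
print([len(row) for row in jagged])
```

Because every row is an independent list, rows of different lengths need no padding, which is the property the jagged pixel array relies on.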
6.4 Knowledge based cluster analysis
This part of the system selects the K-Means clustering algorithm for the given dataset and segments the image. The
number of clusters is arbitrarily set to 5. The clustering algorithm automates the segmentation by clustering on the
pixel values. The pixel values are then updated accordingly, so that image segmentation is possible directly in the
byte stream. Thumbnail views of the image before and after K-Means clustering are shown in Fig. 4. The system thus
enables viewing of the segmented PET scan datasets.
6.5 MATLAB Applications
MATrix LABoratory (MATLAB) is widely used for implementing algorithms in a numerical environment. Its Image Processing
Toolbox offers a stable, well-supported set of software tools for a wide range of digital image processing and
segmentation tasks. The major applications are intensity transformation, image restoration, registration, image data
compression, morphological image processing, and region and boundary representation and description. However, MATLAB
has some reported limitations, such as low processing speed and wasteful use of memory [10].
6.6 MATLAB cluster analysis
The given image is loaded into MATLAB. First the number of clusters is assigned. Then the centroids (c) are
initialized as follows:
c = (1:k)*m / (k+1) (1)
where m is a value obtained from the double-precision image pixels arranged in a single column, and k is the number of
centroids.
The distance (d) between a centroid and an object is calculated as
d = abs(o(i) - c) (2)
In equation (2), o(i) is the i-th object of the one-dimensional array. Using that distance, the new centroid is
calculated with equation (3):
nc(i) = sum(a.*h(a)) / sum(h(a)) (3)
where a denotes the objects assigned to cluster i by the object clustering function and h(a) denotes the corresponding
non-zero elements obtained from object clustering; together they give the new centroid (nc) value. The resulting new
centroid values are used to create the mask and then to segment the image. Fig. 5 shows the calculation of the
threshold value for segmentation.
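Equations (1) to (3) can be traced with the following Python sketch. It is a hedged illustration rather than the MATLAB script used in this work: it assumes integer pixel intensities, takes m to be the largest intensity present, and treats h(a) as the number of pixels with intensity a; the function name histogram_kmeans is hypothetical.

```python
from collections import Counter


def histogram_kmeans(pixels, k, max_iter=100):
    """1-D K-Means on integer intensities following equations (1)-(3)."""
    h = Counter(pixels)      # h(a): non-zero count of intensity a (assumption)
    objects = sorted(h)      # o: the distinct intensities that occur
    m = max(objects)         # assumption: m is the largest intensity
    c = [(i + 1) * m / (k + 1) for i in range(k)]      # equation (1)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for o in objects:
            d = [abs(o - cj) for cj in c]              # equation (2)
            clusters[d.index(min(d))].append(o)
        # Equation (3): count-weighted mean of each cluster; an empty
        # cluster keeps its previous centroid.
        nc = [sum(a * h[a] for a in members) / sum(h[a] for a in members)
              if members else c[j]
              for j, members in enumerate(clusters)]
        if nc == c:
            break
        c = nc
    return c


result = histogram_kmeans([10, 10, 12, 14, 200, 202, 202, 204], k=2)
print(result)
```

Working on the distinct intensities and their counts instead of on every pixel keeps the number of distance evaluations per iteration bounded by the number of grey levels rather than by the image size.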
Fig. 5 Threshold calculation (diagram blocks: original input image, clustering, new centroid value nc, segmentation, segmented output image)
7. Result
The images obtained from MATLAB and the .NET framework are analyzed using the MIPAV tool, which is designed
specifically for medical image processing and analysis. A volume of interest is selected, and its statistical
parameters are listed in Table 1. Table 1 shows that the coefficient of variance is lower for the .NET framework than
for the MATLAB environment, which indicates a less dispersed intensity distribution in the .NET result. Fig. 7
compares the results of the MATLAB and .NET framework implementations.
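The coefficient of variance reported in Table 1 appears to be the standard deviation expressed as a percentage of the average voxel intensity. The short Python check below, using the Table 1 values, is a sketch under that assumption; the function name is hypothetical.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return 100.0 * std_dev / mean


# Standard deviation and average voxel intensity taken from Table 1.
matlab_cv = coefficient_of_variation(92.0758, 86.0916)
dotnet_cv = coefficient_of_variation(65.3007, 79.2168)
print(round(matlab_cv, 2), round(dotnet_cv, 2))
```

Both computed values agree with the tabulated 106.951 and 82.433 to within rounding, which supports reading the .NET result as the less dispersed of the two.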
Table 1: Statistical analysis using MIPAV
Fig.7 Comparison Chart
8. Conclusion and Future work
Biomedical imaging techniques are prominently used for clinical purposes, so that the anatomy and physiology of
internal body parts can be monitored [8]. The PET scan is a biomedical nuclear imaging technique that helps to reveal
abnormal cells. The segmented image provides a clear picture of the affected portions. This paper examined the basic
K-Means algorithm on different working platforms. K-Means was first tested on the AForge.NET framework in a Windows
environment. Here the input image is converted into a byte stream and stored in a jagged array. It is then imported in
the CSV file format, which can be used on any platform. In this application the number of clusters is selected
automatically. The resulting image is compared with that from the MATLAB environment. The comparison indicates that
the clustering result is independent of the platform environment and that both produce the optimal segmented image.
9. References
[1] Koon-Pong Wong, Dagan Feng, Steven R. Meikle, Michael J. Fulham, "Segmentation of Dynamic PET Images Using Cluster Analysis", IEEE Transactions on Nuclear Science, Vol. 49, pp. 200-207, 2002.
[2] Andreas Hapfelmeier, Jana Schmidt, Marianne Muller, Stefan Kramer, "Interpreting PET Scans by Structured Patient Data: A Data Mining Case Study in Dementia Research", IEEE Knowledge and Information Systems, pp. 213-222, 2009.
[3] Osama Abu Abbas, "Comparison between Data Clustering Algorithms", The International Arab Journal of Information Technology, Vol. 5, pp. 320-325.
[4] O. J. Oyelade, O. O. Oladipupo, I. C. Obagbuwa, "Application of k-Means Clustering Algorithm for Prediction of Students' Academic Performance", International Journal of Computer Science and Information Security, Vol. 7, pp. 292-295, 2010.
[5] George Karypis, Eui-Hong Han, Vipin Kumar, "CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling", IEEE Computer, Vol. 32, pp. 68-75.
[6] H. H. Chang, D. J. Valentino, G. R. Duckwiler, A. W. Toga, "Segmentation of Brain MR Images Using a Charged Fluid Model", IEEE Transactions on Biomedical Engineering, Vol. 54, pp. 1798-1813, 2007.
[7] D. L. Pham, C. Xu, J. L. Prince, "Current Methods in Medical Image Segmentation", Annual Review of Biomedical Engineering, Vol. 2, pp. 315-337, 2000.
[8] H. N. Wagner, Z. Szabo, J. W. Buchanan, Principles of Nuclear Medicine, Pennsylvania, pp. 564-575, 1995.
[9] Anamika Ahirwar, R. S. Jadon, "Characterization of Tumor Region Using SOM and Neuro Fuzzy Techniques in Digital Mammography", International Journal of Computer Science & Information Technology (IJCSIT), Vol. 3, No. 1, Feb. 2011.
[10] M. Bister et al., "Increasing the Speed of Medical Image Processing in MatLab".
[11] R. Brookmeyer, E. Johnson, K. Ziegler-Graham, H. M. Arrighi, "Forecasting the Global Burden of Alzheimer's Disease", July 2007, pp. 186-91.
[12] F. Bermejo-Pareja, J. Benito-León, S. Vega, M. J. Medrano, G. C. Román, "Incidence and Subtypes of Dementia in Three Elderly Populations of Central Spain", J. Neurol. Sci., Vol. 264 (1-2), pp. 63-72, Jan. 2008.
[13] M. C. Su, C. H. Chou, "A Modified Version of the K-Means Algorithm with a Distance Based on Cluster Symmetry", IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 23, No. 6, pp. 674-680, June 2001.
Table 1: Statistical analysis using MIPAV (results obtained from MATLAB and .NET)
S. No   Parameter                  MATLAB     .NET
1.      No. of voxels              16472      15544
2.      Average voxel intensity    86.0916    79.2168
3.      Standard deviation         92.0758    65.3007
4.      Skewness                   0.7643     0.5899
5.      Kurtosis                   1.7756     1.7946
6.      Largest slice distance     142 mm     134 mm
7.      Median intensity           23         71
8.      Mode intensity             16         13
9.      Mode count                 3490       4618
10.     Coefficient of variance    106.951    82.433