SIMULATION
http://sim.sagepub.com/
The online version of this article can be found at: http://sim.sagepub.com/content/88/10/1202
DOI: 10.1177/0037549712445233
SIMULATION 2012 88: 1202, originally published online 22 May 2012
Khamron Sunat, Panida Padungweang and Sirapat Chiewchanwattana
Generalized Transport Mean Shift algorithm for ubiquitous intelligence
Published by:
http://www.sagepublications.com
On behalf of:
Society for Modeling and Simulation International (SCS)
OnlineFirst Version of Record: May 22, 2012
Version of Record: Oct 8, 2012
Downloaded from sim.sagepub.com at Bibliotheques de l'Universite Lumiere Lyon 2 on November 4, 2012
Simulation
Simulation: Transactions of the Society for Modeling and Simulation International
88(10) 1202–1215
© 2012 The Society for Modeling and Simulation International
DOI: 10.1177/0037549712445233
sim.sagepub.com
Generalized Transport Mean Shift algorithm for ubiquitous intelligence
Khamron Sunat1, Panida Padungweang2 and Sirapat Chiewchanwattana1
Abstract
Much research has recently been conducted on ubiquitous intelligent computing. Ubiquitous intelligence-enabled techniques, such as clustering and image segmentation, have focused on the development of intelligence methodologies. In this paper, a simultaneous mode-seeking and clustering algorithm called the Generalized Transport Mean Shift (GTMS) is introduced. Data points are characterized as transporters or trailers, and this concept of transportation is used to eliminate the redundant computations of mode-seeking algorithms. The time complexity of the GTMS algorithm is much lower than that of the Mean Shift (MS) algorithm. It can therefore be applied to problems with a very large number of data points, in particular the segmentation of images containing green vegetation. The proposed algorithm was tested on clustering and image-segmentation problems. The experimental results showed that the GTMS algorithm improves upon the existing algorithms in terms of both accuracy and time consumption. At its fastest, the GTMS algorithm is 333.98 times faster than the standard MS algorithm. Redundant computation is reduced by omitting more than 90% of the data points by the third iteration of the mode-seeking process, because the GTMS algorithm mainly reduces the data during mode seeking. Thus, the GTMS algorithm would allow an intelligent portable device to be built for surveying green vegetation in a ubiquitous environment.
Keywords
Mean Shift algorithm, agglomerative mean shift clustering, Generalized Transport Mean Shift algorithm, image segmentation, clustering, mode seeking, ubiquitous intelligence
1. Introduction
Ubiquitous intelligence computing is largely concerned with technologies that improve the intelligence capability of multimedia devices. In this setting, the capacity to process the environment, extract information from it, and improve the quality of the extracted information is crucial. Developments in ubiquitous intelligence therefore include data analysis, image analysis, pattern analysis, and computer vision. Clustering is an important process in data analysis. Computer vision problems, such as video and motion estimation, require an appropriate area of support for correspondence operations, and this area of support can be identified using segmentation techniques. Pattern recognition problems can also make use of segmentation results in matching. Consequently, image segmentation and clustering are important processes in ubiquitous intelligence computing and are used for analyzing and investigating the nature of the given data. A powerful technique for solving these problems is to automatically find the modes of the density of the given data. Such an algorithm is normally guided by the data density: the local maxima of the density surface are assumed to be the modes, or the centers of clusters. The mode of every data point is computed, and data points that have the same mode are assigned to the same cluster. The approach is useful for clustering,1,2 image segmentation,3,4 and tracking.5 However, finding the modes of all of the data points requires a repeated process, which is very time consuming. The Generalized Transport Mean Shift (GTMS) algorithm and its variation are proposed to overcome this difficulty.
The Mean Shift (MS) algorithm is a powerful technique for seeking the modes of any given data. The standard MS algorithm is an iterative procedure that automatically finds the mode of the density for a data point. It begins by
1 Department of Computer Science, Khon Kaen University, Thailand
2 Department of Mathematics, Statistic and Computer, Ubon Ratchathani University, Thailand
Corresponding author:
Khamron Sunat, Department of Computer Science, Khon Kaen University, Khon Kaen, 40002, Thailand.
Email: [email protected]
computing the weight of every point as a function of its distance from the point under consideration. The next step is to compute the weighted mean of the points, which gives the shift position. The position is shifted iteratively toward higher density until it no longer changes or changes by less than an acceptable distance. This position is assumed to be the mode of the point under consideration, and all data points that converge to the same mode are assumed to belong to the same cluster. However, the standard MS algorithm is very slow because its time complexity is O(kn^2 m).6 This makes it unsuitable for use, especially in image-segmentation applications, such as the segmentation of images containing green vegetation, which involve a very large number of data points. Improving the speed of the MS algorithm by reducing its computational complexity is, therefore, very important.
Techniques to speed up the MS algorithm have frequently been proposed. Padungweang et al.7 applied a speed-up technique from neural network learning to the MS algorithm; this allowed it to run faster than the previous version while retaining the accuracy of the original. Following this, a speed-up of the Gaussian Blurring Mean Shift (GBMS) algorithm was proposed6 that clusters the data in each iteration and removes any cluster in which some data have converged to its mode. Thus, the next iteration has less data, allowing for faster computing. However, the GBMS algorithm can produce results that differ from those of the standard MS algorithm. This is because its density estimate is computed from the current positions in each iteration, whereas the density estimate of the standard algorithm is usually computed from the initial positions of the given data set. Several methods that improve the speed of the MS algorithm for image segmentation have also been proposed.8 In the first, neighboring pixels are grouped into the same cell and the MS algorithm is then applied. A cell that is shifted into a cell that was already shifted in a previous iteration is no longer computed and is assumed to belong to the same cluster. In the second, neighboring pixels in the spatial domain, as determined by a specified distance, are grouped into the same cluster and the process continues as in the first method. Both methods produced excellent speed-ups. However, a wrong clustering result can occur, even in the first step of clustering the group of points. The remaining two methods approximate the E (Expectation) and M (Maximization) steps in each iteration using a subset of the data and a quadratic convergence technique, which helps to decrease the number of iterations. However, this requires heavy computation, so the speed-up is smaller than hoped. The Improved Fast Gaussian Transform Mean Shift (IFGT-MS) algorithm9 adopts the improved Gaussian transform for numerical approximation. It is very efficient for large-scale and high-dimensional data sets. However, the IFGT-MS algorithm is not only limited to the Gaussian kernel function but also fails on moderate-scale data.10,11 To the best of the authors' knowledge, the most recently proposed algorithm is Agglomerative Mean Shift (Agglo-MS) clustering.10,11 Covering hyperellipsoids are used to cluster the data iteratively, which leads to hierarchical clustering via the MS process. Constructing the covering hyperellipsoids requires the inverse of the covariance matrix, which is an extra cost and is affected by the data dimension. The algorithm can also produce a poor result if its parameter is not properly selected. However, it inspired us by demonstrating the benefit of a hill-climbing algorithm in which many data points are shifted in the same direction.
In this paper, the GTMS algorithm is proposed for intelligence modeling. The basic idea of the MS algorithm, the shift process, is retained in this algorithm. However, instead of finding the mode of every point, the GTMS algorithm requires only a few points, called transporters, which represent their trailers. Moreover, finding the transporters incurs no extra cost, because the GTMS algorithm uses the distance values that must be computed in the shifting process anyway. The trailers are the data points that are shifted into the same mode as a transporter. The transporter–trailer relationship is established by considering the direction of the transporter's trajectory and the trailer's shift direction. The trailers are excluded from the next iteration and only the transporters are computed. In addition, a transporter can be assigned as a trailer in a later iteration; this not only reduces the number of transporters to be computed, but also performs a simultaneous hierarchical clustering.
In Section 2, we briefly explain the MS algorithm and the Agglo-MS algorithm. The GTMS algorithm is introduced in Section 3. The experimental results on real-world clustering and image-segmentation problems, together with a discussion, are described in Section 4. Section 5 is the conclusion.
2. Standard Mean Shift algorithm and
Agglomerative Mean Shift algorithm
Let $X \subset \mathbb{R}^{m}$ be a data set of $n$ data points in an $m$-dimensional Euclidean space, with $X = (x_1, x_2, \ldots, x_n)$ and $x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]^{T}$. A probability density estimate at a given point $x$ is defined by

$$p(x) = \frac{1}{n}\sum_{i=1}^{n} K\left(\left\| \frac{x - x_i}{\sigma} \right\|^{2}\right), \tag{1}$$

where $K(t)$ is a kernel function and $\sigma$ is a constant bandwidth such that $\sigma > 0$. A mode of the density is a position $x$ having zero gradient, $\nabla p(x) = 0$. The MS algorithm is an iterative procedure for seeking the modes of the density estimate by repeatedly shifting the position $x$ towards higher density, and is written as
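$$x^{(\tau+1)} = f\left(x^{(\tau)}\right), \tag{2}$$

with

$$f(x) = \frac{\sum_{i=1}^{n} K'\left(\left\| \frac{x - x_i}{\sigma} \right\|^{2}\right) x_i}{\sum_{j=1}^{n} K'\left(\left\| \frac{x - x_j}{\sigma} \right\|^{2}\right)}, \tag{3}$$

where $K'(t) = dK/dt$ and $\tau$ is the iteration index. Using the Gaussian function $K(t) = e^{-t/2}$, (2) and (3) can be reduced12 to

$$x^{(\tau+1)} = \sum_{i=1}^{n} p\left(i \mid x^{(\tau)}\right) x_i \tag{4}$$

and

$$p\left(i \mid x^{(\tau)}\right) = \frac{\exp\left(-\frac{1}{2}\left\| \frac{x^{(\tau)} - x_i}{\sigma} \right\|^{2}\right)}{\sum_{j=1}^{n} \exp\left(-\frac{1}{2}\left\| \frac{x^{(\tau)} - x_j}{\sigma} \right\|^{2}\right)}. \tag{5}$$

The algorithm is terminated if the shift distance is equal to zero or is less than a tolerance threshold, as follows:

$$\left\| x^{(\tau)} - x^{(\tau-1)} \right\| < \text{threshold}. \tag{6}$$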
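As an illustration of equations (4) to (6), the following is a minimal NumPy sketch of the Gaussian mean shift iteration for a single query point. The function name `gaussian_mean_shift`, the default `threshold` and `max_iter` values, and the synthetic two-cluster data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def gaussian_mean_shift(x0, data, sigma, threshold=1e-6, max_iter=500):
    """Shift one query point x0 toward a mode of the Gaussian kernel density of `data`.

    `threshold` and `max_iter` are illustrative defaults, not values from the paper.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Eq. (5): weights p(i | x) from the bandwidth-scaled squared distances.
        sq_dist = np.sum(((x - data) / sigma) ** 2, axis=1)
        # Subtracting the minimum avoids underflow; it cancels in the normalisation.
        w = np.exp(-0.5 * (sq_dist - sq_dist.min()))
        w /= w.sum()
        # Eq. (4): the new position is the weighted mean of the data points.
        x_new = w @ data
        # Eq. (6): stop once the shift distance falls below the threshold.
        if np.linalg.norm(x_new - x) < threshold:
            return x_new
        x = x_new
    return x

# Synthetic data (assumed for illustration): two Gaussian blobs near (0, 0) and (5, 5).
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.3, (100, 2)), rng.normal(5.0, 0.3, (100, 2))])
mode_a = gaussian_mean_shift(data[0], data, sigma=1.0)
mode_b = gaussian_mean_shift(data[-1], data, sigma=1.0)
```

Each call costs one pass over all n points per iteration, so running it for every point gives the quadratic dependence on n noted in the Introduction.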
Clustering is performed by treating each mode of the kernel density estimate as a cluster; the data points converge to their corresponding modes. Two clusters are plotted in Figure 1(a). Figure 1(b) shows the data points being shifted uphill toward their modes. The solid black lines represent the trajectory of each data point.
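The clustering step described above can be sketched as follows: every point is shifted to its mode using the original data for the density, and points whose modes coincide are given the same label. This is a plain, unaccelerated mean shift, not the GTMS algorithm. The helper names, the `merge_radius` tolerance used to decide that two modes coincide, and the synthetic data are assumptions for this example.

```python
import numpy as np

def mean_shift_modes(data, sigma, threshold=1e-6, max_iter=500):
    """Shift every point to its mode; the density always uses the original data."""
    pos = data.astype(float).copy()
    for _ in range(max_iter):
        # Bandwidth-scaled squared distances between current positions and the data.
        diff = (pos[:, None, :] - data[None, :, :]) / sigma
        sq = np.sum(diff ** 2, axis=2)
        w = np.exp(-0.5 * (sq - sq.min(axis=1, keepdims=True)))
        w /= w.sum(axis=1, keepdims=True)
        new_pos = w @ data  # weighted mean for every point at once
        done = np.linalg.norm(new_pos - pos, axis=1).max() < threshold
        pos = new_pos
        if done:
            break
    return pos

def group_by_mode(modes, merge_radius):
    """Label points by mode; modes closer than merge_radius (assumed tolerance) are merged."""
    labels = -np.ones(len(modes), dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for c, center in enumerate(centers):
            if np.linalg.norm(m - center) < merge_radius:
                labels[i] = c
                break
        else:
            # No existing center is close enough, so this mode starts a new cluster.
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)

# Synthetic data (assumed for illustration): two blobs of 60 points each.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.3, (60, 2)), rng.normal(5.0, 0.3, (60, 2))])
modes = mean_shift_modes(data, sigma=1.0)
labels, centers = group_by_mode(modes, merge_radius=0.5)
```

The pairwise distance array makes the quadratic cost per iteration explicit, which is the redundancy that the transporter and trailer idea is meant to reduce.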
The Agglo-MS10,11 is an agglomerative MS clustering algorithm. It is built upon an iterative query set compression mechanism motivated by the quadratic bounding optimization characteristic of the MS algorithm. It performs well on image segmentation and on clustering of moderate-scale data sets. Because space is limited, the interested reader is directed to Yuan et al.10,11
3. Generalized Transport Mean Shift algorithm
In general, many positions are shifted along the same trajectory as they move toward their mode, as the example in Figure 1 shows. In Figure 2, the $i$th data point is shifted at iteration $\tau$ to a position that is close to the original position of the $k$th data point. In addition, the shift vector of the $i$th data point is parallel to the trajectory vector of the $k$th data point. Therefore, the $i$th data point should be considered a trailer of the $k$th data point, which is taken to be the transporter of the $i$th data point. Hence, the shift of the $i$th data point need not be computed in the next iteration.
Even though the $j$th data point is also shifted at iteration $\tau$ to a position near the original position of the $k$th data point, its mode is different from the mode of the $k$th data point. One of the main ideas of this work is that the nearest point assigned as the transporter should have a trajectory vector in the same direction as the shift vector of the trailer.
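The transporter test described in the two paragraphs above can be sketched as a nearest-neighbour lookup followed by a direction check. This is only an illustration of the idea, not the authors' exact rule: the function name `find_transporter` and the cosine tolerance `cos_min` used to decide that two directions are "the same" are hypothetical, and the three points mimic the roles of the $i$th, $k$th and $j$th data points.

```python
import numpy as np

def find_transporter(i, shifted_i, prev_i, originals, unit_traj, cos_min=0.95):
    """Return the index k of a transporter for point i, or None if there is none.

    shifted_i / prev_i: position of point i after / before the current shift.
    originals: (n, m) original positions; unit_traj: (n, m) unit trajectory vectors.
    cos_min is a hypothetical tolerance for "same direction", not a value from the paper.
    """
    shift = shifted_i - prev_i
    norm = np.linalg.norm(shift)
    if norm == 0.0:
        return None  # point i did not move, so it has no shift direction
    # Nearest original position to the shifted point, excluding point i itself.
    dist = np.linalg.norm(originals - shifted_i, axis=1)
    dist[i] = np.inf
    k = int(np.argmin(dist))
    # Accept k only if i's shift direction agrees with k's trajectory direction.
    cos = float(shift @ unit_traj[k]) / norm
    return k if cos >= cos_min else None

# Index 0 plays the role of point i, index 1 of point k, index 2 of point j.
originals = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
unit_traj = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, -1.0]])
t_i = find_transporter(0, np.array([0.9, 0.0]), originals[0], originals, unit_traj)
t_j = find_transporter(2, np.array([1.0, 0.1]), originals[2], originals, unit_traj)
```

Here point i lands next to k while moving along k's trajectory, so k is accepted as its transporter; point j also lands next to k but moves perpendicular to k's trajectory, so it is rejected.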
In order to obtain the solution, four matrices are introduced. The first is a matrix of the trajectory vectors of all the data points. The second stores the indexes of the transporters. The last two are logical matrices indicating the convergence status and the present status of the data points. The details of each matrix are as follows.
Let $U \in \mathbb{R}^{m \times n}$. The $i$th column of $U$, denoted by $u_i$, is the unit trajectory vector of the $i$th data point at the first iteration and can be computed as
Figure 1. (a) The plotting of two data clusters. (b) The trajectory of data point by applying the Mean Shift algorithm to a two-
dimensional data set. The third axis denotes density of data.
$$u_i = \frac{x_i^{1} - x_i^{0}}{\left\| x_i^{1} - x_i^{0} \right\|}. \tag{7}$$
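Equation (7) can be computed for all points at once. The sketch below assumes the original positions and the positions after the first shift are stored column-wise, matching the $m \times n$ layout of $U$. The function name `unit_trajectories` and the choice to return a zero vector for a point that did not move are assumptions made for this example.

```python
import numpy as np

def unit_trajectories(x0, x1, eps=1e-12):
    """Return the (m, n) matrix whose i-th column is (x1_i - x0_i) / ||x1_i - x0_i||.

    x0, x1: (m, n) arrays of original and first-iteration positions, one point per column.
    """
    d = x1 - x0                        # first-iteration shift of every point
    norms = np.linalg.norm(d, axis=0)  # one norm per column
    # Assumed convention: a point that did not move gets a zero vector,
    # which avoids a division by zero.
    safe = np.where(norms > eps, norms, 1.0)
    return np.where(norms > eps, d / safe, 0.0)

# Three 2-D points as columns: (0, 0), (0, 0) and (1, 1).
x0 = np.array([[0.0, 0.0, 1.0],
               [0.0, 0.0, 1.0]])
# Their positions after the first shift: (3, 4), (0, 2) and (1, 1).
x1 = np.array([[3.0, 0.0, 1.0],
               [4.0, 2.0, 1.0]])
U = unit_trajectories(x0, x1)
```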
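Let $T \in \mathbb{R}^{1 \times n}$ be a transporter matrix, where the $i$th column of $T$, denoted by $t_i$, is the index of the transporter of the $i$th data point, such that

$$t_i = \begin{cases} \arg\min_{j} \left\| x_i - x_j \right\|^{2} & \text{if } i \neq j \\ i & \text{otherwise,} \end{cases} \tag{8}$$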