
SIMULATION
http://sim.sagepub.com/

The online version of this article can be found at:
http://sim.sagepub.com/content/88/10/1202

DOI: 10.1177/0037549712445233

Khamron Sunat, Panida Padungweang and Sirapat Chiewchanwattana
SIMULATION 2012 88: 1202, originally published online 22 May 2012

Generalized Transport Mean Shift algorithm for ubiquitous intelligence

Published by:

http://www.sagepublications.com

On behalf of:

Society for Modeling and Simulation International (SCS)


OnlineFirst Version of Record: May 22, 2012

Version of Record: Oct 8, 2012


Simulation: Transactions of the Society for Modeling and Simulation International
88(10) 1202–1215
© 2012 The Society for Modeling and Simulation International
DOI: 10.1177/0037549712445233
sim.sagepub.com

Generalized Transport Mean Shift algorithm for ubiquitous intelligence

Khamron Sunat¹, Panida Padungweang² and Sirapat Chiewchanwattana¹

Abstract

Much research has recently been conducted on ubiquitous intelligent computing. Ubiquitous intelligence-enabled techniques, such as clustering and image segmentation, have focused on the development of intelligence methodologies. In this paper, a simultaneous mode-seeking and clustering algorithm called the Generalized Transport Mean Shift (GTMS) is introduced. The data points are given a transporter–trailer characteristic, and this concept of transportation is used to remove the redundant computations of mode-seeking algorithms. The time complexity of the GTMS algorithm is much lower than that of the Mean Shift (MS) algorithm, so it can be used on problems with a very large number of data points, in particular the segmentation of images containing green vegetation. The proposed algorithm was tested on clustering and image-segmentation problems. The experimental results showed that the GTMS algorithm improves upon the existing algorithms in terms of both accuracy and time consumption. At its fastest, the GTMS algorithm is 333.98 times faster than the standard MS algorithm. Redundant computation can be reduced by omitting more than 90% of the data points at the third iteration of the mode-seeking process, because the GTMS algorithm mainly reduces the data during mode seeking. Thus, use of the GTMS algorithm would allow for the building of an intelligent portable device for surveying green vegetables in a ubiquitous environment.

Keywords

Mean Shift algorithm, agglomerative mean shift clustering, Generalized Transport Mean Shift algorithm, image segmentation, clustering, mode seeking, ubiquitous intelligence

1. Introduction

Research on ubiquitous intelligence computing is largely dedicated to technologies that improve the intelligence capability of multimedia devices. In this setting, the capacity to elaborate and extract information from the environment, and to improve the quality of the extracted information, is crucial. The development of ubiquitous intelligence therefore also covers data analysis, image analysis, pattern analysis, and computer vision. Clustering is an important process in data analysis. Computer vision problems, such as video and motion estimation, require an appropriate area of support for correspondence operations, and this area of support can be identified using segmentation techniques. Pattern recognition problems can also make use of segmentation results in matching. Consequently, image segmentation and clustering are important processes in ubiquitous intelligence computing and are used for analyzing and investigating the nature of the given data. A powerful technique for solving this problem is to automatically find the modes of the density of the given data. Such an algorithm is normally guided by the data density. The local maxima of the density surface are assumed to be the modes, or the centers of clusters. A mode is computed for every data point, and data points that have the same mode are assigned to the same cluster. This approach is useful for clustering [1,2], image segmentation [3,4], and tracking [5]. However, finding the mode of every data point requires a repeated process, which is very time consuming. The Generalized Transport Mean Shift (GTMS) algorithm and its variation are proposed to overcome this difficulty.

The Mean Shift (MS) algorithm is a powerful technique for seeking the modes of any given data. The standard MS algorithm is an iterative procedure that can automatically find the mode of density of a data point. It begins by computing the weight of all points as a function of their distances from the considered point. The next step is to compute the weighted mean point to get the shift position. The position is shifted iteratively toward increasing density until it no longer changes or changes by less than an acceptable distance. This final position is assumed to be the mode of the considered point, and all data points that converge to the same mode are assumed to belong to the same cluster. However, the standard MS algorithm is very slow because its time complexity is $O(kn^2 m)$ [6]. This makes it unsuitable for use, especially in image-segmentation applications such as the segmentation of images containing green vegetation, which have a very large number of data points. Improving the speed of the MS algorithm by reducing the computational complexity is, therefore, very important.

¹Department of Computer Science, Khon Kaen University, Thailand
²Department of Mathematics, Statistic and Computer, Ubon Ratchathani University, Thailand

Corresponding author:
Khamron Sunat, Department of Computer Science, Khon Kaen University, Khon Kaen, 40002, Thailand.
Email: [email protected]

Techniques to speed up the MS algorithm have frequently been proposed. Padungweang et al. [7] applied a speed-up technique from neural network learning to the MS algorithm; this allowed it to run faster than the previous version whilst retaining the accuracy of the original. Following this, a speed-up of the Gaussian Blurring Mean Shift (GBMS) algorithm was proposed [6], which clusters the data in each iteration and removes any cluster in which some data have converged to its mode. Thus, the next iteration has less data, allowing for faster computing. However, the GBMS algorithm can produce results that differ from those of the standard MS algorithm. This is because its density estimate is computed using the current positions in each iteration, whereas the standard algorithm usually computes the density estimate using the initial positions of the given data set.

Several methods that can improve the speed of the MS algorithm for image segmentation were also proposed [8]. In the first method, neighboring pixels are grouped into cells and the MS algorithm is then applied; a cell that is shifted into a cell that was shifted in a previous iteration stops being computed and is assumed to belong to the same cluster. In the second method, neighboring pixels in the spatial domain, determined by a specified distance, are grouped into the same cluster and the process continues as in the first method. Both methods produced excellent speed-ups. However, a wrong clustering result can occur, even in the first step of grouping the points. The remaining two methods approximate the E (Expectation) and M (Maximization) steps in each iteration using a subset of the data and a quadratic convergence technique, which helped to decrease the number of iterations. However, this requires heavy computation, so the speed-up is smaller than hoped.

The Improved Fast Gaussian Transform Mean Shift (IFGT-MS) algorithm [9] adopts the improved Gaussian transform for numerical approximation. It is very efficient for large-scale and high-dimensional data sets. However, the IFGT-MS algorithm is limited to the Gaussian kernel function and also fails on moderate-scale data [10,11]. To the best of the authors' knowledge, the most recently proposed algorithm is Agglomerative Mean Shift (Agglo-MS) clustering [10,11]. Covering hyperellipsoids are used to cluster the data iteratively, which leads to hierarchical clustering via the MS process. The covering hyperellipsoids require the inverse of the covariance matrix, which is an extra cost and is biased by the data dimension. The algorithm can also produce a poor result if its parameter is not properly selected. However, it inspired us by demonstrating a useful property of the hill-climbing process: many data points are shifted in the same direction.

In this paper, the GTMS algorithm is proposed for intelligence modeling. The basic idea of the MS algorithm, the shift process, is retained in this algorithm. However, instead of finding the mode of every point, the GTMS algorithm requires only a few points, called transporters, which represent their trailers. Moreover, finding the transporters does not incur any extra cost, because the GTMS algorithm uses the distance values that must be computed in the shifting process anyway. The trailers are the data points that are shifted into the same mode as a transporter. The transporter–trailer relationship is determined by considering the direction of the transporter's trajectory and the trailer's shift direction. The trailers are excluded from the next iteration and only the transporters are computed. In addition, a transporter can be assigned as a trailer in a later iteration; this not only reduces the number of transporters to be computed, but also performs a simultaneous hierarchical clustering.

In Section 2, we briefly explain the MS algorithm and the Agglo-MS algorithm. The GTMS algorithm is introduced in Section 3. The experimental results on real-world clustering and image-segmentation problems, together with a discussion, are described in Section 4. Section 5 concludes the paper.

2. Standard Mean Shift algorithm and Agglomerative Mean Shift algorithm

Let $X \subset \mathbb{R}^m$ be a data set of $n$ data points in an $m$-dimensional Euclidean space, with $X = (x_1, x_2, \ldots, x_n)$ and $x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]^T$. A probability density estimate at a given point $x$ is defined by

$$p(x) = \frac{1}{n} \sum_{i=1}^{n} K\left( \left\| \frac{x - x_i}{\sigma} \right\|^2 \right), \qquad (1)$$

where $K(t)$ is a kernel function and $\sigma$ is a constant bandwidth such that $\sigma > 0$. A mode of the density is a position $x$ having zero gradient, $\nabla p(x) = 0$. The MS algorithm is an iterative procedure for seeking a mode of the density estimate by repeatedly shifting the position $x$ towards higher density, and is written as


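$$x^{(\tau+1)} = f(x^{(\tau)}), \qquad (2)$$

with

$$f(x) = \frac{\sum_{i=1}^{n} K'\left( \left\| \frac{x - x_i}{\sigma} \right\|^2 \right) x_i}{\sum_{j=1}^{n} K'\left( \left\| \frac{x - x_j}{\sigma} \right\|^2 \right)}, \qquad (3)$$

where $K'(t) = dK/dt$ and $\tau$ is the iteration index. Using the Gaussian function $K(t) = e^{-t/2}$, (2) and (3) can be reduced [12] to

$$x^{(\tau+1)} = \sum_{i=1}^{n} p(i \mid x^{(\tau)})\, x_i \qquad (4)$$

and

$$p(i \mid x^{(\tau)}) = \frac{\exp\left( -\frac{1}{2} \left\| \frac{x^{(\tau)} - x_i}{\sigma} \right\|^2 \right)}{\sum_{j=1}^{n} \exp\left( -\frac{1}{2} \left\| \frac{x^{(\tau)} - x_j}{\sigma} \right\|^2 \right)}. \qquad (5)$$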

The algorithm terminates when the shift distance is equal to zero or is less than a tolerance threshold, as follows:

$$\left\| x^{(\tau)} - x^{(\tau-1)} \right\| \le \text{threshold}. \qquad (6)$$
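As an illustration of Eqs. (4) to (6), the following is a minimal Python sketch of the Gaussian MS iteration, assuming NumPy is available. The function names `mean_shift_point` and `mean_shift`, the default tolerance `tol`, and the iteration cap `max_iter` are illustrative assumptions and are not taken from the paper; the data are stored row-wise as an array of shape (n, m).

```python
import numpy as np


def mean_shift_point(x, data, sigma, tol=1e-6, max_iter=500):
    """Shift one position x toward a mode of the Gaussian density estimate.

    data has shape (n, m); sigma is the bandwidth. tol and max_iter are
    assumed stopping parameters (the text only names a "threshold").
    """
    x = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        # Squared scaled distances ||(x - x_i) / sigma||^2 for all i.
        sq_dist = np.sum(((x - data) / sigma) ** 2, axis=1)
        # Gaussian weights of Eq. (5); subtracting the minimum leaves the
        # normalised weights unchanged and avoids underflow.
        w = np.exp(-0.5 * (sq_dist - sq_dist.min()))
        # Weighted mean of the data, Eq. (4).
        x_new = w @ data / w.sum()
        # Stopping rule in the spirit of Eq. (6).
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x = x_new
    return x


def mean_shift(data, sigma, tol=1e-6, max_iter=500):
    """Run the iteration from every data point and return all end positions."""
    data = np.asarray(data, dtype=float)
    return np.array(
        [mean_shift_point(x, data, sigma, tol, max_iter) for x in data]
    )
```

In this sketch every one of the n starting points revisits all n data points at each of its iterations, which is where the $O(kn^2 m)$ cost quoted in Section 1 comes from.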

Clustering is performed by treating each mode of the kernel density estimate as a cluster, with the data points converging to their corresponding modes. Two example clusters are plotted in Figure 1(a). Figure 1(b) shows the data points being shifted upward toward their modes. The solid black lines represent the trajectory of each data point.
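The assignment of points to clusters by their converged modes can be sketched as follows. This is only an illustration under stated assumptions: the function name `label_by_mode` and the merge tolerance `merge_tol` (used to decide when two converged positions count as the same mode) are hypothetical and do not appear in the paper.

```python
import numpy as np


def label_by_mode(modes, merge_tol=1e-3):
    """Give the same cluster label to points whose modes coincide.

    modes has shape (n, m): the converged position of each data point.
    merge_tol is an assumed tolerance for treating two modes as equal.
    """
    modes = np.asarray(modes, dtype=float)
    centers = []                      # one representative per distinct mode
    labels = np.empty(len(modes), dtype=int)
    for i, mode in enumerate(modes):
        for c, center in enumerate(centers):
            if np.linalg.norm(mode - center) <= merge_tol:
                labels[i] = c         # mode already seen: reuse its label
                break
        else:
            centers.append(mode)      # new mode: open a new cluster
            labels[i] = len(centers) - 1
    return labels, np.array(centers)
```

The first converged position met for each mode is kept as the cluster centre; averaging the members of a cluster instead would be an equally reasonable choice.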

The Agglo-MS [10,11] is an agglomerative MS clustering algorithm. It is built upon an iterative query set compression mechanism motivated by the quadratic bounding optimization characteristic of the MS algorithm. It performs well on segmentation of images and clustering of moderate-scale data sets. Since space is limited, the interested reader is directed to Yuan et al. [10,11]

3. Generalized Transport Mean Shift algorithm

In general, many positions shift along the same trajectory while moving toward their mode, as the example in Figure 1 shows. In Figure 2, the $i$th data point is shifted at iteration $\tau$ to a position that is close to the original position of the $k$th data point. Also, the shift vector of the $i$th data point is parallel to the trajectory vector of the $k$th data point. Therefore, the $i$th data point should be considered a trailer of the $k$th data point, which is taken to be the transporter of the $i$th data point. Hence, the shifting of the $i$th data point need not be computed in the next iteration.

Even though the $j$th data point at iteration $\tau$ is also shifted to a position near the original position of the $k$th data point, its mode is different from the mode of the $k$th data point. One of the main ideas of this work is that the nearest point assigned as the transporter should have a trajectory vector in the same direction as the shift vector of the trailer.
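The two conditions described above, closeness to the transporter's original position and agreement between the trailer's shift direction and the transporter's trajectory direction, can be illustrated with a small Python sketch. It does not reproduce the paper's exact criterion: the function names and the thresholds `max_dist` and `min_cos` are hypothetical, and direction agreement is measured here by the cosine between unit vectors as an assumption.

```python
import numpy as np


def unit(v):
    """Return v scaled to unit length (a zero vector stays zero)."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else np.zeros_like(v)


def is_trailer(x_prev, x_shifted, x_k_origin, u_k, max_dist, min_cos=0.99):
    """Check whether point i may be treated as a trailer of point k.

    x_prev, x_shifted : position of point i before and after its shift
    x_k_origin        : original position of the candidate transporter k
    u_k               : unit trajectory vector of k
    max_dist, min_cos : assumed thresholds, not taken from the paper
    """
    x_prev = np.asarray(x_prev, dtype=float)
    x_shifted = np.asarray(x_shifted, dtype=float)
    x_k_origin = np.asarray(x_k_origin, dtype=float)
    u_k = np.asarray(u_k, dtype=float)

    # Condition 1: i has landed close to where k started.
    near = np.linalg.norm(x_shifted - x_k_origin) <= max_dist
    # Condition 2: i moved in (almost) the same direction as k's trajectory.
    aligned = float(unit(x_shifted - x_prev) @ u_k) >= min_cos
    return bool(near and aligned)
```

In this sketch a point that lands near the transporter but arrives from a different direction, like the $j$th data point in the discussion above, fails the second condition and keeps being shifted on its own.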

To obtain the solution, four matrices are introduced. The first is a matrix of the trajectory vectors of all the data points. The second stores the indexes of the transporters. The last two are logical matrices indicating the convergence status and the present status of the data points. The details of each matrix are as follows.

Figure 1. (a) The plotting of two data clusters. (b) The trajectory of data points obtained by applying the Mean Shift algorithm to a two-dimensional data set. The third axis denotes the density of the data.

Let $U \in \mathbb{R}^{m \times n}$. The $i$th column of $U$, denoted by $u_i$, is the unit trajectory vector of the $i$th data point at the first iteration and can be computed as

$$u_i = \frac{x_i^{(1)} - x_i^{(0)}}{\left\| x_i^{(1)} - x_i^{(0)} \right\|}. \qquad (7)$$
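A minimal Python sketch of Eq. (7) is given below, assuming NumPy and the column convention of the text (one data point per column, so the inputs have shape (m, n)). The function name `unit_trajectories` is illustrative, and the handling of points that do not move in the first iteration (their column is left as zeros) is an assumption, since the equation is undefined for a zero shift.

```python
import numpy as np


def unit_trajectories(X0, X1):
    """Build U (m x n): column i is the unit vector from X0[:, i] to X1[:, i].

    X0 holds the original positions and X1 the positions after the first
    shift, one data point per column.
    """
    X0 = np.asarray(X0, dtype=float)
    X1 = np.asarray(X1, dtype=float)
    D = X1 - X0                           # shift vectors, one per column
    norms = np.linalg.norm(D, axis=0)     # length of each shift
    U = np.zeros_like(D)
    moved = norms > 0                     # assumed: unmoved points keep u_i = 0
    U[:, moved] = D[:, moved] / norms[moved]
    return U
```

For example, a point shifted from (0, 0) to (3, 4) gets the unit trajectory vector (0.6, 0.8).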

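Let $T \in \mathbb{R}^{1 \times n}$ be a transporter matrix, where the $i$th column of $T$, denoted by $t_i$, is the index of the transporter of the $i$th data point, such that

$$t_i = \begin{cases} \arg\min_{j} \left\| x_i - x_j \right\|^2 & \text{if } i \neq j, \\ i & \text{otherwise}. \end{cases} \qquad (8)$$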
