
Research Article
Optimized K-Means Algorithm

Samir Brahim Belhaouari,1 Shahnawaz Ahmed,1 and Samer Mansour2

1 Department of Mathematics & Computer Science, College of Science and General Studies, Alfaisal University, P.O. Box 50927, Riyadh, Saudi Arabia

2 Department of Software Engineering, College of Engineering, Alfaisal University, P.O. Box 50927, Riyadh, Saudi Arabia

Correspondence should be addressed to Shahnawaz Ahmed; mathshan@gmail.com

Received 18 May 2014; Revised 9 August 2014; Accepted 13 August 2014; Published 7 September 2014

Academic Editor: Gradimir Milovanović

Copyright © 2014 Samir Brahim Belhaouari et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The localization of the region of interest (ROI), which contains the face, is the first step in any automatic recognition system, and it is a special case of face detection. However, face localization from an input image is a challenging task due to possible variations in location, scale, pose, occlusion, illumination, facial expressions, and clutter background. In this paper we introduce a new optimized k-means algorithm that finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. This method was tested to locate the faces in the input image based on image segmentation; it separates the input image into two classes, faces and nonfaces. To evaluate the proposed algorithm, the MIT-CBCL, BioID, and Caltech datasets are used. The results show significant localization accuracy.

1. Introduction

The face detection problem is to determine whether a face exists in the input image and, if so, to return its location. In the face localization problem, by contrast, it is given that the input image contains a face, and the goal is to determine the location of this face. The automatic recognition system, which is a special case of face detection, locates, in its earliest phase, the region of interest (ROI) that contains the face. Face localization in an input image is a challenging task due to possible variations in location, scale, pose, occlusion, illumination, facial expressions, and clutter background. Various methods were proposed to detect and/or localize faces in an input image; however, there is still a need to improve the performance of localization and detection methods. A survey on some face recognition and detection techniques can be found in [1]. A more recent survey, mainly on face detection, was written by Yang et al. [2].

One of the ROI detection trends is the idea of segmenting the image pixels into a number of groups or regions based on similarity properties. It gained more attention in recognition technologies, which rely on grouping the feature vectors to distinguish between the image regions and then concentrate on a particular region, which contains the face. One of the earliest surveys on image segmentation techniques was done by Fu and Mui [3]. They classify the techniques into three classes: feature thresholding (clustering), edge detection, and region extraction. A later survey by N. R. Pal and S. K. Pal [4] concentrated only on fuzzy, nonfuzzy, and colour image techniques. Many efforts have been made in ROI detection, which may be divided into two types: region growing and region clustering.

The difference between the two types is that region clustering searches for the clusters without prior information, while region growing needs initial points, called seeds, to detect the clusters. The main problem in the region growing approach is to find suitable seeds, since the clusters will grow from the neighbouring pixels of these seeds based on a specified deviation. For seed selection, Wan and Higgins [5] defined a number of criteria to select insensitive initial points for the subclass of region growing. To reduce the region growing time, Chang and Li [6] proposed a fast region growing method using parallel segmentation.

As mentioned before, the region clustering approach searches for clusters without prior information. Pappas and Jayant [7] generalized the K-means algorithm to group the image pixels based on spatial constraints and their intensity variations. These constraints include the following: first, the region intensity should be close to the data, and second, it should have spatial continuity. Their generalized algorithm is adaptive and includes the previous spatial constraints. Like the K-means clustering algorithm, their algorithm is iterative. The spatial constraints are included using a Gibbs random field model, and they concluded that each region is characterized by a slowly varying intensity function. Because of unimportant features, the algorithm is not very accurate. Later, to improve their method, they proposed using a caricature of the original image to keep only the most significant features and remove the unimportant ones [8].

Besides the problem of time-costly segmentation, there are three basic problems that occur during the clustering process: center redundancy, dead centers, and centers trapped at local minima. Moving K-means (MKM), proposed in [9], can overcome these three basic problems by minimizing the center redundancy and the dead centers, as well as indirectly reducing the effect of centers trapped at local minima. In spite of that, MKM is sensitive to noise, centers are not located in the middle or centroid of a group of data, and some members of the centers with the largest fitness are assigned as members of a center with the smallest fitness. To reduce the effects of these problems, Isa et al. [10] proposed three modified versions of the MKM clustering algorithm for image segmentation: fuzzy moving k-means, adaptive moving k-means, and adaptive fuzzy moving k-means. After that, Sulaiman and Isa [11] introduced a new segmentation method based on a new clustering algorithm, the Adaptive Fuzzy k-means Clustering Algorithm (AFKM). On the other hand, the Fuzzy C-Means (FCM) algorithm has been used in image segmentation, but it still has some drawbacks: the lack of enough robustness to noise and outliers, the crucial parameter α being selected generally based on experience, and the time needed to segment an image being dependent on its size. To overcome the drawbacks of the FCM algorithm, Cai et al. [12] integrated both local spatial and gray information to propose a fast and robust Fuzzy C-means algorithm, called the Fast Generalized Fuzzy C-Means Algorithm (FGFCM), for image segmentation.

Moreover, many researchers have provided definitions for the data characteristics that have a significant impact on K-means clustering analysis, including the scattering of the data, noise, high dimensionality, the size of the data, outliers in the data, dataset types, and the types and scales of attributes [13]. However, there is a need for more investigation to understand how data distributions affect K-means clustering performance. In [14], Xiong et al. provided a formal and organized study of the effect of skewed data distributions on K-means clustering. Previous research found an impact of high dimensionality on K-means clustering performance: the traditional Euclidean notion of proximity is not very useful on real-world high-dimensional datasets, such as document datasets and gene-expression datasets. To overcome this challenge, researchers turn to dimensionality reduction techniques, such as multidimensional scaling [15]. Second, K-means has difficulty in detecting "natural" clusters with nonspherical shapes [13]. Modified K-means clustering algorithms are one direction to solve this problem. Guha et al. [16] proposed the CURE method, which makes use of multiple representative points to obtain the "natural" cluster shape information. The problem of outliers and noise in the data can also reduce the performance of clustering algorithms [17], especially for prototype-based algorithms such as K-means. One direction to solve this kind of problem is to combine some outlier removal technique with K-means clustering; for example, a simple method [9] of detecting outliers is based on a distance measure. On the other hand, many modified K-means clustering algorithms that work well for small or medium-size datasets are unable to deal with large datasets. Bradley et al. [18] introduced a discussion of scaling K-means clustering to large datasets.

Arthur and Vassilvitskii implemented a preliminary version, k-means++, in C++ to evaluate the performance of k-means. K-means++ [19] uses a careful seeding method to optimize the speed and the accuracy of k-means. Experiments were performed on four datasets, and the results showed that this algorithm is O(log k)-competitive with the optimal clustering. Experiments also proved its better speed (70% faster) and accuracy (the potential value obtained is better by factors of 10 to 1000) compared with k-means. Kanungo et al. [20] also proposed an O(n³ε^(−d)) algorithm for the k-means problem; it is (9 + ε)-competitive but quite slow in practice. Xiong et al. investigated the measures that best reflect the performance of K-means clustering [14]. An organized study was performed to understand the impact of data distributions on the performance of K-means clustering. Research work has also improved the traditional K-means clustering so that it can handle datasets with large variation of cluster sizes. This formal study illustrates that the entropy sometimes provides misleading information on the clustering performance; based on this, a coefficient of variation (CV) is proposed to validate the clustering outcome. Experimental results proved that, for datasets with great variation in "true" cluster sizes, K-means lessens (to less than 1.0) the variation in resultant cluster sizes, while for datasets with small variation in "true" cluster sizes, K-means increases (to greater than 0.3) the variation in resultant cluster sizes.
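For reference, a minimal sketch of the k-means++ seeding idea on one-dimensional data (our illustration, not the authors' implementation): each new center is drawn with probability proportional to the squared distance from the nearest already-chosen center.

```python
import random

def kmeans_pp_seeds(points, k, rng=random.Random(0)):
    """k-means++ seeding: pick the first center uniformly, then draw
    each new center with probability proportional to the squared
    distance to the nearest chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # squared distance of every point to its nearest center
        d2 = [min((p - c) ** 2 for c in centers) for p in points]
        r = rng.uniform(0, sum(d2))
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers
```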

Scalability of k-means to large datasets is one of its major limitations. Chiang et al. [21] proposed a simple and easy to implement algorithm for reducing the computation time of k-means. This pattern reduction algorithm compresses and/or removes iteration patterns that are not likely to change their membership. The pattern is compressed by selecting one of the patterns to be removed and setting its value to the average of all patterns removed. If D[r_i^l] is the pattern to be removed, then we can write it as follows:

$$D[r_i^l] = \frac{1}{|R_i^l|} \sum_{j=1}^{|R_i^l|} D[r_{ij}^l], \quad r_i^l \in R_i^l. \tag{1}$$

This algorithm can also be applied to other clustering algorithms, like population-based clustering algorithms. The computation time of many clustering algorithms can be reduced using this technique, as it is especially effective for large and high-dimensional datasets. In [22], Bradley et al. proposed Scalable k-means, which uses buffering and a two-stage compression scheme to compress or remove patterns in order to improve the performance of k-means. The algorithm is slightly faster than k-means but does not always give the same result. The factors that degrade the performance of Scalable k-means are the compression processes and the compression ratio. Farnstrom et al. also proposed a simple single-pass k-means algorithm [23], which reduces the computation time of k-means. Ordonez and Omiecinski proposed relational k-means [24], which uses the block and incremental concept to improve the stability of Scalable k-means. The computation time of k-means can also be reduced using parallel bisecting k-means [25] and triangle inequality [26] methods.
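As a concrete illustration of the compression step in (1), the following sketch (ours, not code from [21]) replaces a set of to-be-removed patterns with a single representative equal to their mean:

```python
import numpy as np

def compress_patterns(D, removable_idx):
    """Replace the patterns indexed by removable_idx with their
    average, keeping a single representative (cf. equation (1))."""
    group = D[removable_idx]             # patterns unlikely to change membership
    representative = group.mean(axis=0)  # average of all removed patterns
    keep = np.ones(len(D), dtype=bool)
    keep[removable_idx] = False
    return np.vstack([D[keep], representative])
```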

Although K-means is the most popular and simple clustering algorithm, the major difficulty with this process is that it cannot ensure globally optimal results, because the initial cluster centers are selected randomly. Reference [27] presents a novel technique that selects the initial cluster centers by using a Voronoi diagram; this method automates the selection of the initial cluster centers. To validate the performance, experiments were performed on a range of artificial and real-world datasets. The quality of the algorithm is assessed using the following error rate (ER) equation:

$$\text{ER} = \frac{\text{Number of misclassified objects}}{\text{Total number of objects}} \times 100. \tag{2}$$

The lower the error rate, the better the clustering. The proposed algorithm is able to produce better clustering results than the traditional K-means. Experimental results proved a 60% reduction in iterations for defining the centroids.
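Equation (2) translates directly into code; a small sketch (ours):

```python
def error_rate(predicted, truth):
    """Error rate (ER) of equation (2): percentage of misclassified
    objects among all objects."""
    misclassified = sum(p != t for p, t in zip(predicted, truth))
    return 100.0 * misclassified / len(truth)

# Example: error_rate([0, 1, 1, 0], [0, 1, 0, 0]) -> 25.0
```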

All previous solutions and efforts to increase the performance of the K-means algorithm still need more investigation, since they all search for a local minimum; searching for the global minimum will certainly improve the performance of the algorithm. The rest of the paper is organized as follows. In Section 2, the proposed method is introduced, which finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. In Section 3, the results are discussed and analyzed using two sets from the MIT-CBCL dataset as well as the BioID and Caltech datasets. Finally, in Section 4, conclusions are drawn.

2. K-Means Clustering

K-means, an unsupervised learning algorithm first proposed by MacQueen in 1967 [28], is a popular method of cluster analysis. It is a process of organizing the specified objects into uniform classes, called clusters, based on similarities among the objects with respect to certain criteria. It solves the well-known clustering problem by considering certain attributes and performing an iterative alternating fitting process. This algorithm partitions a dataset X into k disjoint clusters, such that each observation belongs to the cluster with the nearest mean. Let x_i be the centroid of cluster c_i, and let d(x_j, x_i) be the dissimilarity between x_i and an object x_j ∈ c_i. Then the function minimized by k-means is given by the following equation:

$$\min_{x_1,\ldots,x_k} E = \sum_{i=1}^{k} \sum_{x_j \in c_i} d(x_j, x_i). \tag{3}$$

Figure 1: Comparison of k-means variants (clustering error versus number of clusters k, for global k-means, fast global k-means, k-means, and min k-means).
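For reference, a minimal sketch (ours) of the standard Lloyd-style iteration that locally minimizes (3) with squared Euclidean dissimilarity on one-dimensional data, using randomly chosen initial centers:

```python
import random

def kmeans(points, k, iters=100, rng=random.Random(0)):
    """Plain k-means: alternate between assigning each point to its
    nearest center and recomputing each center as the mean of its
    cluster (a local minimizer of equation (3))."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: (p - centers[i]) ** 2)
            clusters[nearest].append(p)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:   # converged
            break
        centers = new_centers
    return centers
```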

K-means clustering is utilized in a vast number of applications, including machine learning, fault detection, pattern recognition, image processing, statistics, and artificial intelligence [11, 29, 30]. The k-means algorithm is considered one of the fastest clustering algorithms, with a number of variants that are sensitive to the selection of initial points and are intended to solve many issues of k-means, like the evaluation of the number of clusters [31], the method of initialization of the cluster centroids [32], and the speed of the algorithm [33].

2.1. Modifications in K-Means Clustering. The global k-means algorithm [34] is a proposal to improve the global search properties of the k-means algorithm and its performance on very large datasets by computing clusters successively. The main weakness of k-means clustering, namely its sensitivity to the initial positions of the cluster centers, is overcome by the global k-means clustering algorithm, which consists of a deterministic global optimization technique. It does not select initial values randomly; instead, an incremental approach is applied to optimally add one new cluster center at each stage. Global k-means also proposes a method to reduce the computational load while maintaining the solution quality. Experiments were performed to compare the performance of k-means, global k-means, min k-means, and fast global k-means, as shown in Figure 1. Numerical results show that the global k-means algorithm considerably outperforms the other k-means algorithms.


Bagirov [35] proposed a new version of the global k-means algorithm for minimum sum-of-squares clustering problems. He also compared three different versions of the k-means algorithm to propose the modified version of the global k-means algorithm. The proposed algorithm computes clusters incrementally, and the k − 1 cluster centers from the previous iteration are used to compute a k-partition of the dataset. The starting point of the kth cluster center is computed by minimizing an auxiliary cluster function. Given a finite set of points A in n-dimensional space, global k-means computes the centroid X¹ of the set A as in the following equation:

$$X^1 = \frac{1}{m} \sum_{i=1}^{m} a_i, \quad a_i \in A,\; i = 1, \ldots, m. \tag{4}$$

Numerical experiments performed on 14 datasets demonstrate the efficiency of the modified global k-means algorithm in comparison with the multistart k-means (MS k-means) and GKM algorithms when the number of clusters k > 5. Modified global k-means, however, requires more computational time than the global k-means algorithm. Xie and Jiang [36] also proposed a novel version of the global k-means algorithm, in which the method of creating the next cluster center is modified; this showed a better execution time without affecting the performance of the global k-means algorithm. Another extension of standard k-means clustering is Global Kernel k-means [37], which optimizes the clustering error in feature space by employing kernel k-means as a local search procedure. A kernel function is used in order to solve the M-clustering problem, and near-optimal solutions are used to optimize the clustering error in the feature space. This incremental algorithm has a high ability to identify nonlinearly separable clusters in input space and no dependence on cluster initialization. It can handle weighted data points, making it suitable for graph partitioning applications. Two major modifications were performed in order to reduce the computational cost with no major effect on the solution quality.

Video imaging and image segmentation are important applications of k-means clustering. Hung et al. [38] modified the k-means algorithm for color image segmentation, proposing a weight selection procedure in the W-k-means algorithm. The evaluation function F(I) is used for comparison:

$$F(I) = \sqrt{R}\, \sum_{i=1}^{R} \frac{(e_i/256)^2}{\sqrt{A_i}}, \tag{5}$$

where I is the segmented image, R is the number of regions in the segmented image, A_i is the area (the number of pixels) of the ith region, and e_i is the color error of region i, that is, the sum of the Euclidean distances between the color vectors of the original image and the segmented image. Smaller values of F(I) give better segmentation results. Results from color image segmentation illustrate that the proposed procedure produces better segmentation than random initialization. Maliatski and Yadid-Pecht [39] also propose an adaptive k-means clustering hardware-driven approach.
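A direct transcription of (5), under our reading of the formula (a sketch; images and region labels are assumed to be NumPy arrays of matching shape):

```python
import numpy as np

def evaluation_F(original, segmented, labels):
    """Evaluation function of equation (5): sqrt(R) times the sum over
    regions of (e_i/256)^2 / sqrt(A_i), where e_i is the color error of
    region i and A_i its area in pixels."""
    regions = np.unique(labels)
    total = 0.0
    for r in regions:
        mask = labels == r
        A_i = mask.sum()
        # color error: sum of Euclidean distances between color vectors
        diff = original[mask].astype(float) - segmented[mask].astype(float)
        e_i = np.linalg.norm(diff, axis=-1).sum()
        total += (e_i / 256.0) ** 2 / np.sqrt(A_i)
    return np.sqrt(len(regions)) * total
```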

Figure 2: The separation points in two dimensions (clusters C₁ and C₂ with separators M_x, M_y and per-cluster means M_{x1}, M_{x2}, M_{y1}, M_{y2} along the X and Y axes).

The algorithm uses both pixel intensity and pixel distance in the clustering process. In [40], an FPGA implementation of real-time k-means clustering is presented for colored images, where a filtering algorithm is used for the hardware implementation. Sulaiman and Isa [11] presented a unique clustering algorithm called adaptive fuzzy-K-means clustering for image segmentation. It can be used for different types of images, including medical images, microscopic images, and CCD camera images. The concepts of fuzziness and belongingness are applied to produce more adaptive clustering as compared with other conventional clustering algorithms. The proposed adaptive fuzzy-K-means clustering algorithm provides better segmentation and visual quality for a range of images.

2.2. The Proposed Methodology. In this method, the face location is determined automatically by using an optimized K-means algorithm. First, the input image is reshaped into a vector of pixel values, and the pixels are then clustered into two classes by applying a certain threshold determined by the algorithm. Pixels with values below the threshold are put in the nonface class; the rest of the pixels are put in the face class. However, some unwanted parts may get clustered into this face class. Therefore, the algorithm is applied again to the face class to obtain only the face part.
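A compact sketch of this two-pass procedure (our illustration; `optimal_threshold` is a placeholder for the separator M computed by the optimized k-means of Section 2.2.1):

```python
import numpy as np

def locate_face(image, optimal_threshold):
    """Two-pass segmentation: the first threshold separates nonface
    pixels from face-plus-illumination pixels; the second, applied to
    the remaining pixels, removes the illuminated background."""
    pixels = image.reshape(-1).astype(float)   # reshape image into a vector

    x1 = optimal_threshold(pixels)             # first separator
    face_mask = pixels >= x1                   # class [x1, 255]: face + shade

    x2 = optimal_threshold(pixels[face_mask])  # second separator
    face_mask &= pixels >= x2                  # class [x2, 255]: face only

    return face_mask.reshape(image.shape)      # binary mask of the face pixels
```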

2.2.1. Optimized K-Means Algorithm. The proposed modified K-means algorithm is intended to overcome the limitations of the standard version by using differential equations to determine the optimum separation point. The algorithm finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. As an example, say we would like to cluster an image into two subclasses, face and nonface; we then look for a separator that will split the image into two different clusters in two dimensions, after which the means of the clusters can be found easily. Figure 2 shows the separation points and the mean of each class. To find these points, we propose the following modification in the continuous and discrete cases.

Let x_1, x_2, x_3, …, x_N be a dataset, and let x_i be an observation. Let f(x) be the probability mass function (PMF) of x:

$$f(x) = \sum_{i=1}^{N} \frac{1}{N}\,\delta(x - x_i), \quad x_i \in \mathbb{R}, \tag{6}$$

where δ(x) is the Dirac function.

When the size of the data is very large, the question of the cost of minimizing $\sum_{i=1}^{k}\sum_{x_i \in s_i} |x_i - \mu_i|^2$ arises, where k is the number of clusters and μ_i is the mean of cluster s_i. Let

$$\operatorname{Var} k = \sum_{i=1}^{k}\sum_{x_i \in s_i} |x_i - \mu_i|^2. \tag{7}$$

Then it is enough to minimize Var 1 (the one-dimensional case) without losing generality, since

$$\min_{s_i} \sum_{i=1}^{k}\sum_{x_j \in s_i} |x_j - \mu_i|^2 \;\ge\; \sum_{l=1}^{d} \min_{s_i}\left(\sum_{i=1}^{k}\sum_{x_j \in s_i} |x_j^l - \mu_i^l|^2\right), \tag{8}$$

where d is the dimension of x_j = (x_j^1, …, x_j^d) and μ_j = (μ_j^1, …, μ_j^d).

2.2.2. Continuous Case. For the continuous case, f(x) plays the role of the probability density function (PDF). Then, for 2 classes, we need to minimize the following equation:

$$\operatorname{Var} 2 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx. \tag{9}$$

If d ≥ 2, we can use (8):

$$\min \operatorname{Var} 2 \;\ge\; \sum_{k=1}^{d} \min\left(\int_{-\infty}^{M} (x^k - \mu_1^k)^2 f(x^k)\,dx^k + \int_{M}^{\infty} (x^k - \mu_2^k)^2 f(x^k)\,dx^k\right). \tag{10}$$

Let x ∈ ℝ be any random variable with probability density function f(x); we want to find μ_1 and μ_2 such that (9) is minimized, as follows:

$$(\mu_1^*, \mu_2^*) = \arg_{(\mu_1, \mu_2)}\left[\min_{M, \mu_1, \mu_2}\left[\int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx\right]\right], \tag{11}$$

$$\min_{M, \mu_1, \mu_2} \operatorname{Var} 2 = \min_{M}\left[\min_{\mu_1} \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \min_{\mu_2} \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx\right] \tag{12}$$

$$= \min_{M}\left[\min_{\mu_1} \int_{-\infty}^{M} (x - \mu_1(M))^2 f(x)\,dx + \min_{\mu_2} \int_{M}^{\infty} (x - \mu_2(M))^2 f(x)\,dx\right]. \tag{13}$$

We know that

$$\min_{x} E[(X - x)^2] = E[(X - E(X))^2] = \operatorname{Var}(X). \tag{14}$$

So we can find μ_1 and μ_2 in terms of M. Let us define Var_1 and Var_2 as follows:

$$\operatorname{Var}_1 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx, \qquad \operatorname{Var}_2 = \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx. \tag{15}$$

After simplification, we get

$$\operatorname{Var}_1 = \int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx,$$
$$\operatorname{Var}_2 = \int_{M}^{\infty} x^2 f(x)\,dx + \mu_2^2 \int_{M}^{\infty} f(x)\,dx - 2\mu_2 \int_{M}^{\infty} x f(x)\,dx. \tag{16}$$

To find the minimum, Var_1 and Var_2 are partially differentiated with respect to μ_1 and μ_2, respectively. After simplification, we conclude that the minimization occurs when

$$\mu_1 = \frac{E(X_-)}{P(X_-)}, \qquad \mu_2 = \frac{E(X_+)}{P(X_+)}, \tag{17}$$

where X_- = X · 1_{x<M} and X_+ = X · 1_{x≥M}.


To find the minimum of Var 2, we need ∂μ/∂M:

$$\mu_1 = \frac{\int_{-\infty}^{M} x f(x)\,dx}{\int_{-\infty}^{M} f(x)\,dx} \;\Longrightarrow\; \frac{\partial \mu_1}{\partial M} = \frac{M f(M) P(X_-) - f(M) E(X_-)}{P^2(X_-)},$$
$$\mu_2 = \frac{\int_{M}^{\infty} x f(x)\,dx}{\int_{M}^{\infty} f(x)\,dx} \;\Longrightarrow\; \frac{\partial \mu_2}{\partial M} = \frac{f(M) E(X_+) - M f(M) P(X_+)}{P^2(X_+)}. \tag{18}$$

After simplification, we get

$$\frac{\partial \mu_1}{\partial M} = \frac{f(M)}{P^2(X_-)}\left[M P(X_-) - E(X_-)\right], \qquad \frac{\partial \mu_2}{\partial M} = \frac{f(M)}{P^2(X_+)}\left[E(X_+) - M P(X_+)\right]. \tag{19}$$

In order to find the minimum of Var 2, we find ∂Var_1/∂M and ∂Var_2/∂M separately and add them:

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = \frac{\partial}{\partial M}(\operatorname{Var}_1) + \frac{\partial}{\partial M}(\operatorname{Var}_2),$$

$$\frac{\partial \operatorname{Var}_1}{\partial M} = \frac{\partial}{\partial M}\left(\int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx\right)$$
$$= M^2 f(M) + 2\mu_1 \mu_1' P(X_-) + \mu_1^2 f(M) - 2\mu_1' E(X_-) - 2\mu_1 M f(M). \tag{20}$$

After simplification (the terms in μ_1' cancel, since μ_1 P(X_-) = E(X_-)), we get

$$\frac{\partial \operatorname{Var}_1}{\partial M} = f(M)\left[M^2 + \frac{E^2(X_-)}{P(X_-)^2} - 2M\,\frac{E(X_-)}{P(X_-)}\right]. \tag{21}$$

Similarly,

$$\frac{\partial \operatorname{Var}_2}{\partial M} = -M^2 f(M) + 2\mu_2 \mu_2' P(X_+) - \mu_2^2 f(M) - 2\mu_2' E(X_+) + 2\mu_2 M f(M), \tag{22}$$

and after simplification we get

$$\frac{\partial \operatorname{Var}_2}{\partial M} = f(M)\left[-M^2 - \frac{E^2(X_+)}{P(X_+)^2} + 2M\,\frac{E(X_+)}{P(X_+)}\right]. \tag{23}$$

Adding both these differentials and simplifying, it follows that

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = f(M)\left[\left(\frac{E^2(X_-)}{P(X_-)^2} - \frac{E^2(X_+)}{P(X_+)^2}\right) - 2M\left(\frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)}\right)\right]. \tag{24}$$

Equating this with 0 gives

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = 0 \;\Longrightarrow\; \left(\frac{E^2(X_-)}{P(X_-)^2} - \frac{E^2(X_+)}{P(X_+)^2}\right) - 2M\left(\frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)}\right) = 0$$
$$\Longrightarrow\; \left[\frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)}\right]\left[\frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} - 2M\right] = 0. \tag{25}$$

But [E(X_-)/P(X_-) − E(X_+)/P(X_+)] ≠ 0, as in that case μ_1 = μ_2, which contradicts the initial assumption. Therefore

$$\frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} - 2M = 0 \;\Longrightarrow\; M = \frac{1}{2}\left[\frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)}\right] = \frac{\mu_1 + \mu_2}{2}. \tag{26}$$

Therefore, we have to find all values of M that satisfy (26), which is easier than minimizing Var 2 directly.

To find the minimum in terms of M, it is enough to examine the derivative

$$\frac{\partial \operatorname{Var} 2(\mu_1(M), \mu_2(M))}{\partial M} = f(M)\left[\frac{E^2(X_-)}{P(X < M)^2} - \frac{E^2(X_+)}{P(X \ge M)^2} - 2M\left(\frac{E(X_-)}{P(X < M)} - \frac{E(X_+)}{P(X \ge M)}\right)\right]. \tag{27}$$

Then we find

$$M = \frac{1}{2}\left[\frac{E(X_+)}{P(X_+)} + \frac{E(X_-)}{P(X_-)}\right]. \tag{28}$$
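Equation (28) characterizes the optimal separator as a fixed point, M = (μ_1(M) + μ_2(M))/2, which can be solved by simple iteration. A sketch (ours) for the standard normal case, where the derivation below predicts M = 0 and μ_{1,2} = ∓√(2/π):

```python
import math

def phi(x):   # standard normal PDF
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):   # standard normal CDF
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def optimal_separator(M=0.5, iters=50):
    """Iterate M <- (mu1(M) + mu2(M)) / 2 for X ~ N(0, 1), using
    E(X-) = -phi(M), P(X-) = Phi(M), E(X+) = phi(M), P(X+) = 1 - Phi(M)."""
    for _ in range(iters):
        mu1 = -phi(M) / Phi(M)          # E(X-) / P(X-)
        mu2 = phi(M) / (1 - Phi(M))     # E(X+) / P(X+)
        M = 0.5 * (mu1 + mu2)
    return M, mu1, mu2

print(optimal_separator())  # approx (0.0, -0.7979, 0.7979); sqrt(2/pi) = 0.7979
```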

2.2.3. Finding the Centers for Some Common Probability Density Functions

(1) Uniform Distribution. The probability density function of the continuous uniform distribution is

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b, \\[4pt] 0, & \text{otherwise}, \end{cases} \tag{29}$$


where a and b are the two parameters of the distribution. The expected values for X · 1_{x≥M} and X · 1_{x<M} are

$$E(X_+) = \frac{1}{2}\left(\frac{b^2 - M^2}{b-a}\right), \qquad E(X_-) = \frac{1}{2}\left(\frac{M^2 - a^2}{b-a}\right). \tag{30}$$

Putting all these in (28) and solving, we get

$$M = \frac{1}{2}(a+b), \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = \frac{1}{4}(3a+b), \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = \frac{1}{4}(a+3b). \tag{31}$$

(2) Log-Normal Distribution. The probability density function of a log-normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-(\ln x - \mu)^2 / 2\sigma^2}, \quad x > 0. \tag{32}$$

The probabilities to the right and left of M are, respectively,

$$P(X_+) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\left(\frac{\sqrt{2}\,(\ln M - \mu)}{2\sigma}\right), \qquad P(X_-) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\left(\frac{\sqrt{2}\,(\ln M - \mu)}{2\sigma}\right). \tag{33}$$

The expected values for X · 1_{x≥M} and X · 1_{x<M} are

$$E(X_+) = \frac{1}{2}\, e^{\mu + \sigma^2/2}\left[1 + \operatorname{erf}\left(\frac{\sqrt{2}\,(\mu + \sigma^2 - \ln M)}{2\sigma}\right)\right],$$
$$E(X_-) = \frac{1}{2}\, e^{\mu + \sigma^2/2}\left[1 + \operatorname{erf}\left(\frac{\sqrt{2}\,(\ln M - \mu - \sigma^2)}{2\sigma}\right)\right]. \tag{34}$$

Putting all these in (28), assuming μ = 0 and σ = 1, and solving, we get

$$M = 4.244942583, \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = 1.196827816, \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = 7.293057348. \tag{35}$$
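These values can be checked numerically; a sketch (ours) iterating (28) with the closed forms of (33) and (34):

```python
import math

def lognormal_centers(M=2.0, iters=200):
    """Fixed-point iteration of M = (mu1 + mu2)/2 for the standard
    log-normal (mu = 0, sigma = 1), using equations (33) and (34)."""
    Phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    for _ in range(iters):
        lnM = math.log(M)
        P_minus, P_plus = Phi(lnM), 1 - Phi(lnM)
        E_minus = math.exp(0.5) * Phi(lnM - 1)   # E(X-) for mu=0, sigma=1
        E_plus = math.exp(0.5) * Phi(1 - lnM)    # E(X+)
        mu1, mu2 = E_minus / P_minus, E_plus / P_plus
        M = 0.5 * (mu1 + mu2)
    return M, mu1, mu2

print(lognormal_centers())  # approx (4.2449, 1.1968, 7.2931)
```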

(3) Normal Distribution. The probability density function of a normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2}. \tag{36}$$

The probabilities to the right and left of M are, respectively,

$$P(X_+) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\left(\frac{\sqrt{2}\,(M - \mu)}{2\sigma}\right), \qquad P(X_-) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\left(\frac{\sqrt{2}\,(M - \mu)}{2\sigma}\right). \tag{37}$$

The expected values for X · 1_{x≥M} and X · 1_{x<M} are

$$E(X_+) = -\frac{1}{2}\,\mu\left[-1 + \operatorname{erf}\left(\frac{\sqrt{2}\,(M - \mu)}{2\sigma}\right)\right] + \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2/2\sigma^2},$$
$$E(X_-) = \frac{1}{2}\,\mu\left[1 + \operatorname{erf}\left(\frac{\sqrt{2}\,(M - \mu)}{2\sigma}\right)\right] - \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2/2\sigma^2}. \tag{38}$$

Putting all these in (28) and solving, we get M = μ. Assuming μ = 0 and σ = 1 and solving, we get

$$M = 0, \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = -\sqrt{\frac{2}{\pi}}, \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = \sqrt{\frac{2}{\pi}}. \tag{39}$$

2.2.4. Discrete Case. Let X be a discrete random variable, and assume we have a set of N observations from X. Let p_j be the mass density function for an observation x_j.

Let x_{i−1} < M_{i−1} < x_i and x_i < M_i < x_{i+1}. We define μ_{x≤x_i} as the mean of all x ≤ x_i, and μ_{x≥x_i} as the mean of all x ≥ x_i. Define Var 2(M_{i−1}) as the variance of the two centers forced by M_{i−1} as a separator:

$$\operatorname{Var} 2(M_{i-1}) = \sum_{x_j \le x_{i-1}} \left(x_j - \mu_{x_j \le x_{i-1}}\right)^2 p_j + \sum_{x_j \ge x_i} \left(x_j - \mu_{x_j \ge x_i}\right)^2 p_j, \tag{40}$$

$$\operatorname{Var} 2(M_i) = \sum_{x_j \le x_i} \left(x_j - \mu_{x_j \le x_i}\right)^2 p_j + \sum_{x_j \ge x_{i+1}} \left(x_j - \mu_{x_j \ge x_{i+1}}\right)^2 p_j. \tag{41}$$


The first part of Var 2(M_{i−1}) simplifies as follows:

$$\sum_{x_j \le x_{i-1}} \left(x_j - \mu_{x_j \le x_{i-1}}\right)^2 p_j = \sum_{x_j \le x_{i-1}} x_j^2 p_j - 2\mu_{x_j \le x_{i-1}} \sum_{x_j \le x_{i-1}} x_j p_j + \mu_{x_j \le x_{i-1}}^2 \sum_{x_j \le x_{i-1}} p_j = E\!\left(X^2 \cdot 1_{x_j \le x_{i-1}}\right) - \frac{E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})}. \tag{42}$$

Similarly,

$$\sum_{x_j \ge x_i} \left(x_j - \mu_{x_j \ge x_i}\right)^2 p_j = E\!\left(X^2 \cdot 1_{x_j \ge x_i}\right) - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_i}\right)}{P(X \ge x_i)},$$

so that

$$\operatorname{Var} 2(M_{i-1}) = E\!\left(X^2 \cdot 1_{x_j \le x_{i-1}}\right) - \frac{E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + E\!\left(X^2 \cdot 1_{x_j \ge x_i}\right) - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_i}\right)}{P(X \ge x_i)},$$

$$\operatorname{Var} 2(M_i) = E\!\left(X^2 \cdot 1_{x_j \le x_i}\right) - \frac{E^2\!\left(X \cdot 1_{x_j \le x_i}\right)}{P(X \le x_i)} + E\!\left(X^2 \cdot 1_{x_j \ge x_{i+1}}\right) - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}. \tag{43}$$

Define ΔVar 2 = Var 2(M_i) − Var 2(M_{i−1}); the E(X²·…) terms cancel, and

$$\Delta\operatorname{Var} 2 = \left[-\frac{E^2\!\left(X \cdot 1_{x_j \le x_i}\right)}{P(X \le x_i)} - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] - \left[-\frac{E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_i}\right)}{P(X \ge x_i)}\right]. \tag{44}$$

Now, in order to simplify the calculation, we rearrange ΔVar 2 as follows:

$$\Delta\operatorname{Var} 2 = \left[\frac{E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E^2\!\left(X \cdot 1_{x_j \le x_i}\right)}{P(X \le x_i)}\right] + \left[\frac{E^2\!\left(X \cdot 1_{x_j \ge x_i}\right)}{P(X \ge x_i)} - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right]. \tag{45}$$

The first part is simplified as follows:

$$\frac{E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E^2\!\left(X \cdot 1_{x_j \le x_i}\right)}{P(X \le x_i)} = \frac{E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{\left[E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) + x_i p_i\right]^2}{P(X \le x_{i-1}) + p_i}$$
$$= \frac{1}{P(X \le x_{i-1})\left[P(X \le x_{i-1}) + p_i\right]} \left\{p_i E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) - 2 x_i p_i E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) P(X \le x_{i-1}) - x_i^2 p_i^2 P(X \le x_{i-1})\right\}. \tag{46}$$

Similarly, simplifying the second part yields

$$\frac{E^2\!\left(X \cdot 1_{x_j \ge x_i}\right)}{P(X \ge x_i)} - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})} = \frac{\left[E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) + x_i p_i\right]^2}{P(X \ge x_{i+1}) + p_i} - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}$$
$$= \frac{1}{P(X \ge x_{i+1})\left[P(X \ge x_{i+1}) + p_i\right]} \left\{-p_i E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) + 2 x_i p_i E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) P(X \ge x_{i+1}) + x_i^2 p_i^2 P(X \ge x_{i+1})\right\}. \tag{47}$$

In general, these types of optimization arise when the dataset is very big.

If we consider N to be very large, then p_k ≪ max(Σ_{i≥k+1} p_i, Σ_{i≤k−1} p_i), P(X ≤ x_{i−1}) + p_i ≈ P(X ≤ x_i), and P(X ≥ x_{i+1}) + p_i ≈ P(X ≥ x_i), so

$$\Delta\operatorname{Var} 2 \approx \left[\frac{E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) p_i}{P^2(X \le x_{i-1})} - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) p_i}{P^2(X \ge x_{i+1})}\right] - 2 p_i x_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] - p_i^2 x_i^2 \left[\frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})}\right], \tag{48}$$

and since $\lim_{N\to\infty} p_i^2 x_i^2 \left[1/P(X \le x_{i-1}) - 1/P(X \ge x_{i+1})\right] = 0$, we can further simplify ΔVar 2 as follows:

$$\Delta\operatorname{Var} 2 \approx \left[\frac{E^2\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) p_i}{P^2(X \le x_{i-1})} - \frac{E^2\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) p_i}{P^2(X \ge x_{i+1})}\right] - 2 p_i x_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right]$$
$$= p_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] \left[\left(\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right) - 2 x_i\right]. \tag{49}$$

To find the minimum of ΔVar 2, let us locate where ΔVar 2 = 0:

$$p_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] \left[\left(\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right) - 2 x_i\right] = 0. \tag{50}$$

Figure 3: Discrete random variables (the variance as a function of the separator M over the observations x_1, x_2, …, x_n).

But p_i ≠ 0, and the first factor, $\left[E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)/P(X \le x_{i-1}) - E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)/P(X \ge x_{i+1})\right]$, is strictly negative, since x_i is an increasing sequence, which implies that μ_{i−1} < μ_i, where

$$\mu_{i-1} = \frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})}, \qquad \mu_i = \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}. \tag{51}$$

Therefore,

$$\left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] - 2 x_i = 0. \tag{52}$$

Let us define

$$F(x_i) = 2 x_i - \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right],$$

so that

$$F(x_i) = 0 \;\Longrightarrow\; x_i = \frac{1}{2}\left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] = \frac{\mu_{i-1} + \mu_i}{2}, \tag{53}$$

where μ_{i−1} and μ_i are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator M between x_{i−1} and x_i such that F(x_i) · F(x_{i+1}) < 0 and F(x_i) < 0 (see Figure 3). After finding a few local minima, we can obtain the global minimum by taking the smallest among them.
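A sketch of this discrete search (ours): with the observations sorted, the conditional means in F can be maintained with prefix sums, and sign changes of F mark the candidate separators of (53).

```python
import numpy as np

def candidate_separators(x, p):
    """Evaluate F(x_i) = 2*x_i - (mu_{i-1} + mu_i) for sorted data x
    with probabilities p, and return the indices i where F changes sign
    from negative to nonnegative (candidate separators of eq. (53))."""
    x, p = np.asarray(x, float), np.asarray(p, float)
    cum_p, cum_xp = np.cumsum(p), np.cumsum(x * p)
    F = np.empty(len(x) - 2)
    for i in range(1, len(x) - 1):
        mu_left = cum_xp[i - 1] / cum_p[i - 1]            # mean of x <= x_{i-1}
        mu_right = ((cum_xp[-1] - cum_xp[i]) /
                    (cum_p[-1] - cum_p[i]))               # mean of x >= x_{i+1}
        F[i - 1] = 2 * x[i] - (mu_left + mu_right)
    return [i + 1 for i in range(len(F) - 1) if F[i] < 0 <= F[i + 1]]
```

Each returned index is then a local minimum of the two-center variance; evaluating the variance at each candidate and keeping the smallest gives the global minimum.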

3. Face Localization Using the Proposed Method

It now becomes clear that the modified K-means algorithm is superior to the standard algorithm in terms of finding a global minimum, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm; the image contains a face part, a normal background, and a background with illumination. We then split the image pixels into two classes, one from 0 to x_1 and another from x_1 to 255, by applying a certain threshold determined by the algorithm. Let [0, x_1] be class 1, which represents the nonface part, and let [x_1, 255] be class 2, which represents the face with the shade.

Figure 4: The original image.

All pixels with values less than the threshold belong to the first class and are assigned 0 values. This keeps the pixels with values above the threshold; these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.

Figure 5: The face with shade only.

In order to remove the illuminated background part, another threshold is applied to the remaining pixels, using the algorithm, to separate the face part from the illuminated background part. New classes are obtained, from x_1 to x_2 and from x_2 to 255. The pixels with values less than the threshold are then assigned a value of 0, and the nonzero pixels left represent the face part. Let [x_1, x_2] be class 1, which represents the illuminated background part, and let [x_2, 255] be class 2, which represents the face part. The result of the removal of the illuminated background part is shown in Figure 6.

Figure 6: The face without illumination.

Figure 7: The filtered image.

One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.

Figure 8 shows the final result, which is a window containing the face extracted from the original image.

Figure 8: The corresponding window in the original image.

4. Results

In this section, the experimental results of the proposed algorithm are analyzed and discussed. We start with a description of the datasets used in this work, followed by the results of the previous part of the work. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has an image size of 115 × 115, and 300 images were selected from this dataset based on different conditions, such as light variation, pose, and background variations. The second dataset is the BioID dataset, with an image size of 384 × 286; 300 images from this database were selected based on different conditions, such as indoor/outdoor settings. The last dataset is the Caltech dataset, with an image size of 896 × 592; the 300 images from this database were likewise selected based on different conditions, such as indoor/outdoor settings.

4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL) at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person are taken under different light conditions, with clutter backgrounds, at different scales, and in different poses. A few examples of these images are shown in Figure 9.

Figure 9: Samples from the MIT-CBCL dataset.

4.2. BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of the face of one of 23 different test persons. The images were taken under different lighting conditions, with complex backgrounds, and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).

4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. The database contains 450 face images of 27 subjects under different lighting conditions, with different face expressions and against complex backgrounds (see Figure 11).

4.4. Proposed Method Results. The proposed method was evaluated on these datasets, where the result was 100% localization accuracy with a localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we focused on the use of images of faces from different angles, from 30 degrees left to 30 degrees right, and on variation in the background, as well as on the cases of indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset, and two other sets from the other databases, to evaluate the proposed method. The first set from MIT-CBCL contains the images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results showed the efficiency of the proposed modified K-means algorithm in locating faces in images with large background and pose changes. In addition, another parameter was considered in this test, namely the localization time. From the results, the proposed method can locate the face in a very short time, because it has low complexity due to the use of differential equations.

Figure 10: Sample of faces from the BioID dataset for localization.

Figure 11: Sample of faces from the Caltech dataset for localization.

Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and faces with clutter backgrounds, as well as indoor/outdoor settings.

Dataset   | Images | Accuracy (%)
MIT-Set 1 | 150    | 100
MIT-Set 2 | 150    | 100
BioID     | 300    | 100
Caltech   | 300    | 100


Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.

Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.

Figure 12 shows some examples of face localization on the MIT-CBCL dataset.

In Figure 13(a), an original image from the BioID dataset is shown, and the localization position is shown in Figure 13(b). However, a filtering operation is needed in order to remove the unwanted parts; this is shown in Figure 13(c), after which the area of the ROI is determined.

Figure 14 shows the face position in an image from the Caltech dataset.

5. Conclusion

In this paper, we focused on developing ROI detection by suggesting a new method for face localization. In this localization method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images, taken from the MIT-CBCL, BioID, and Caltech datasets and differing in illumination and backgrounds, as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is just a generalization of our approach to higher dimensions (8) and to more subclusters.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.

Acknowledgments

This study is a part of Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support of Alfaisal University and its Office of Research.

References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.

[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.

[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.

[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.

[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.

[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.

[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.

[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.

[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.

[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.

[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.

[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.

[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.

[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.

[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling—Theory and Applications, Springer Science + Business Media, 2005.

[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.

[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.

[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.

[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.

[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.

[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.

[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.

[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.

[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.

[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.

[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.

[27] D Reddy and P K Jana ldquoInitialization for K-means clusteringusing voronoi diagramrdquo Procedia Technology vol 4 pp 395ndash400 2012

[28] J B MacQueen ldquoSome methods for classification and analysisof multivariate observationsrdquo in Proceedings of 5th BerkeleySymposium on Mathematical Statistics and Probability pp 281ndash297 University of California Press 1967

[29] R Ebrahimpour R Rasoolinezhad Z Hajiabolhasani andM Ebrahimi ldquoVanishing point detection in corridors usingHough transform and K-means clusteringrdquo IET ComputerVision vol 6 no 1 pp 40ndash51 2012




image pixels based on spatial constraints and their intensity variations. These constraints include the following: first, the region intensity should be close to the data, and, second, the region should have spatial continuity. Their generalization algorithm is adaptive and includes the previous spatial constraints. Like the K-means clustering algorithm, their algorithm is iterative. The spatial constraints are included using a Gibbs random field model, and they concluded that each region is characterized by a slowly varying intensity function. Because of unimportant features, the algorithm is not very accurate. Later, to improve their method, they proposed using a caricature of the original image to keep only the most significant features and remove the unimportant ones [8]. Besides the problem of time-costly segmentation, there are three basic problems that occur during the clustering process: center redundancy, dead centers, and centers trapped at local minima. Moving K-means (MKM), proposed in [9], can overcome these three basic problems by minimizing the center redundancy and the dead centers, as well as indirectly reducing the effect of centers trapped at local minima. In spite of that, MKM is sensitive to noise, its centers are not located in the middle or centroid of a group of data, and some members of the centers with the largest fitness are assigned as members of a center with the smallest fitness. To reduce the effects of these problems, Isa et al. [10] proposed three modified versions of the MKM clustering algorithm for image segmentation: fuzzy moving k-means, adaptive moving k-means, and adaptive fuzzy moving k-means. After that, Sulaiman and Isa [11] introduced a new segmentation method based on a new clustering algorithm, the Adaptive Fuzzy K-means Clustering Algorithm (AFKM). On the other hand, the Fuzzy C-Means (FCM) algorithm has been used in image segmentation, but it still has some drawbacks: a lack of robustness to noise and outliers, the crucial parameter $\alpha$ being selected generally based on experience, and the time needed to segment an image depending on its size. To overcome the drawbacks of the FCM algorithm, Cai et al. [12] integrated both local spatial and gray information to propose a fast and robust fuzzy C-means algorithm, called the Fast Generalized Fuzzy C-Means Algorithm (FGFCM), for image segmentation. Moreover, many researchers have characterized the data properties that have a significant impact on K-means clustering analysis, including the scattering of the data, noise, high dimensionality, the size of the data, outliers in the data, dataset types, and the types and scales of attributes [13]. However, more investigation is needed to understand how data distributions affect K-means clustering performance. In [14], Xiong et al. provided a formal and organized study of the effect of skewed data distributions on K-means clustering. Previous research found an impact of high dimensionality on K-means clustering performance: the traditional Euclidean notion of proximity is not very useful on real-world high-dimensional datasets, such as document datasets and gene-expression datasets. To overcome this challenge, researchers turn to dimensionality reduction techniques, such as multidimensional scaling [15]. Second, K-means has difficulty in detecting "natural" clusters with nonspherical shapes [13]. Modified K-means clustering algorithms are a direction toward solving this problem. Guha et al. [16] proposed the CURE method, which makes use of multiple representative points to obtain shape information about the "natural" clusters. The problem of outliers and noise in the data can also reduce the performance of clustering algorithms [17], especially for prototype-based algorithms such as K-means. A direction for solving this kind of problem is to combine outlier removal techniques before conducting K-means clustering; for example, a simple method [9] of detecting outliers is based on the distance measure. On the other hand, many modified K-means clustering algorithms that work well for small or medium-size datasets are unable to deal with large datasets. Bradley et al. [18] introduced a discussion of scaling K-means clustering to large datasets.

Arthur and Vassilvitskii implemented a preliminary version of k-means++ in C++ to evaluate the performance of k-means. K-means++ [19] uses a careful seeding method to optimize the speed and the accuracy of k-means. Experiments were performed on four datasets, and the results showed that this algorithm is $O(\log k)$-competitive with the optimal clustering. Experiments also showed better speed (70% faster) and accuracy (the potential value obtained is better by factors of 10 to 1000) than k-means. Kanungo et al. [20] also proposed an $O(n^3 \varepsilon^{-d})$ algorithm for the k-means problem; it is $(9 + \varepsilon)$-competitive but quite slow in practice. Xiong et al. investigated the measures that best reflect the performance of K-means clustering [14]. An organized study was performed to understand the impact of data distributions on the performance of K-means clustering; this work also improved traditional K-means clustering so that it can handle datasets with large variation in cluster sizes. The formal study illustrates that entropy sometimes provides misleading information on clustering performance; based on this, a coefficient of variation (CV) is proposed to validate the clustering outcome. Experimental results showed that, for datasets with great variation in "true" cluster sizes, K-means reduces the variation in resulting cluster sizes (to less than 1.0), while for datasets with small variation in "true" cluster sizes, K-means increases it (to greater than 0.3).
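To make the seeding idea concrete, the following is a minimal sketch of the $D^2$-weighted k-means++ initializer in Python/NumPy. It is our own illustration of the rule described in [19], not the authors' code; the function name and data layout are our choices, and it assumes at least $k$ distinct points.

```python
import numpy as np

def kmeanspp_init(X, k, seed=0):
    """k-means++ seeding: each new center is drawn with probability
    proportional to the squared distance to the nearest chosen center."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]          # first center: uniform at random
    for _ in range(k - 1):
        C = np.array(centers)
        # squared distance from every point to its nearest chosen center
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).min(axis=1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)
```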

Scalability of k-means to large datasets is one of its major limitations. Chiang et al. [21] proposed a simple and easy-to-implement algorithm for reducing the computation time of k-means. This pattern reduction algorithm compresses and/or removes iteration patterns that are not likely to change their membership. A pattern is compressed by selecting one of the patterns to be removed and setting its value to the average of all patterns removed. If $D[r_i^l]$ is the pattern to be removed, then we can write it as follows:

$$D[r_i^l] = \frac{1}{|R_i^l|} \sum_{j=1}^{|R_i^l|} D[r_{ij}^l], \quad r_i^l \in R_i^l. \tag{1}$$

This algorithm can also be applied to other clustering algorithms, like population-based clustering algorithms, and the computation time of many clustering algorithms can be reduced using this technique, as it is especially effective for large and high-dimensional datasets. In [22], Bradley et al. proposed scalable k-means, which uses buffering and a two-stage compression scheme to compress or remove patterns in order to improve the performance of k-means. The algorithm is slightly faster than k-means but does not always give the same result. The factors that degrade the performance of scalable k-means are the compression processes and the compression ratio. Farnstrom et al. also proposed a simple single-pass k-means algorithm [23], which reduces the computation time of k-means. Ordonez and Omiecinski proposed relational k-means [24], which uses block and incremental concepts to improve the stability of scalable k-means. The computation time of k-means can also be reduced using parallel bisecting k-means [25] and triangle inequality [26] methods.
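As an illustration of the compression step in (1), the sketch below (our own, not the code of [21] or [22]) replaces a set of patterns flagged as unlikely to change membership by their average, so that subsequent iterations process fewer patterns:

```python
import numpy as np

def compress_patterns(D, removable_idx):
    """Replace the flagged patterns by one representative, their average,
    as in (1); later iterations then handle far fewer patterns."""
    rep = D[removable_idx].mean(axis=0)      # average of all removed patterns
    keep = np.ones(len(D), dtype=bool)
    keep[removable_idx] = False
    return np.vstack([D[keep], rep])
```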

Although K-means is the most popular and simple clustering algorithm, the major difficulty with this process is that it cannot ensure globally optimal results, because the initial cluster centers are selected randomly. Reference [27] presents a novel technique that selects the initial cluster centers by using a Voronoi diagram; this method automates the selection of the initial cluster centers. To validate the performance, experiments were performed on a range of artificial and real-world datasets. The quality of the algorithm is assessed using the following error rate (ER) equation:

$$\mathrm{ER} = \frac{\text{Number of misclassified objects}}{\text{Total number of objects}} \times 100. \tag{2}$$

The lower the error rate, the better the clustering. The proposed algorithm is able to produce better clustering results than traditional K-means; experimental results showed a 60% reduction in the iterations needed for defining the centroids.

All previous solutions and efforts to increase the performance of the K-means algorithm still need more investigation, since they all search for a local minimum; searching for the global minimum would certainly improve the performance of the algorithm. The rest of the paper is organized as follows. In Section 2, the proposed method is introduced; it finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. Section 3 describes face localization using the proposed method. In Section 4, the results are discussed and analyzed using sets from the MIT-CBCL, BioID, and Caltech datasets. Finally, conclusions are drawn in Section 5.

2. K-Means Clustering

K-means, an unsupervised learning algorithm first proposed by MacQueen in 1967 [28], is a popular method of cluster analysis. It is a process of organizing the specified objects into uniform classes, called clusters, based on similarities among the objects with respect to certain criteria. It solves the well-known clustering problem by considering certain attributes and performing an iterative alternating fitting process. The algorithm partitions a dataset $X$ into $k$ disjoint clusters, such that each observation belongs to the cluster with the nearest mean. Let $x_i$ be the centroid of cluster $c_i$ and let $d(x_j, x_i)$ be the dissimilarity between $x_i$ and an object $x_j \in c_i$. Then the function minimized by k-means is given by

$$\min_{x_1, \ldots, x_k} E = \sum_{i=1}^{k} \sum_{x_j \in c_i} d(x_j, x_i). \tag{3}$$

Figure 1: Comparison of k-means variants (clustering error versus the number of clusters $k$): global k-means, fast global k-means, k-means, and min k-means.
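For contrast with the optimized algorithm proposed below, a plain Lloyd-style iteration that descends the objective in (3), with squared Euclidean dissimilarity, can be sketched as follows (our own illustration, not the proposed method):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Standard k-means: alternate nearest-center assignment and mean
    updates to reduce E = sum_i sum_{x_j in c_i} ||x_j - x_i||^2."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(axis=1)
        new = np.array([X[labels == i].mean(axis=0) if np.any(labels == i)
                        else centers[i] for i in range(k)])
        if np.allclose(new, centers):
            break                            # converged to a (possibly local) minimum
        centers = new
    return centers, labels
```

Because the initial centers are random, this iteration can stop at a local minimum of (3), which is precisely the weakness the rest of the paper addresses.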

K-means clustering is utilized in a vast number of applications, including machine learning, fault detection, pattern recognition, image processing, statistics, and artificial intelligence [11, 29, 30]. The k-means algorithm is considered one of the fastest clustering algorithms, with a number of variants that are sensitive to the selection of initial points and are intended to solve many issues of k-means, like the evaluation of the number of clusters [31], the method of initializing the cluster centroids [32], and the speed of the algorithm [33].

2.1. Modifications in K-Means Clustering. The global k-means algorithm [34] is a proposal to improve the global search properties of the k-means algorithm, and its performance on very large datasets, by computing clusters successively. The main weakness of k-means clustering, namely its sensitivity to the initial positions of the cluster centers, is overcome by the global k-means clustering algorithm, which consists of a deterministic global optimization technique. It does not select initial values randomly; instead, an incremental approach is applied to optimally add one new cluster center at each stage. Global k-means also proposes a method to reduce the computational load while maintaining the solution quality. Experiments were performed to compare the performance of k-means, global k-means, min k-means, and fast global k-means, as shown in Figure 1. Numerical results show that the global k-means algorithm considerably outperforms the other k-means algorithms.
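The incremental construction can be rendered roughly as below, a simplified sketch of the idea in [34] (our own code, with a helper `lloyd_from` for the local search; the fast variant replaces the expensive loop over all candidate points with a cheaper bound):

```python
import numpy as np

def lloyd_from(X, centers, iters=50):
    """Lloyd iterations started from the given initial centers."""
    for _ in range(iters):
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(axis=1)
        centers = np.array([X[labels == i].mean(axis=0) if np.any(labels == i)
                            else centers[i] for i in range(len(centers))])
    return centers, labels

def global_kmeans(X, k_max):
    """Solve k = 1..k_max incrementally: keep the best (k-1)-center
    solution and try every data point as the candidate k-th center."""
    best = {1: X.mean(axis=0, keepdims=True)}
    for k in range(2, k_max + 1):
        err_best, centers_best = np.inf, None
        for cand in X:
            centers, labels = lloyd_from(X, np.vstack([best[k - 1], cand]))
            err = ((X - centers[labels]) ** 2).sum()
            if err < err_best:
                err_best, centers_best = err, centers
        best[k] = centers_best
    return best
```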


Bagirov [35] proposed a new version of the global k-means algorithm for minimum sum-of-squares clustering problems. He also compared three different versions of the k-means algorithm in order to propose the modified version of the global k-means algorithm. The proposed algorithm computes clusters incrementally, and the $k-1$ cluster centers from the previous iteration are used to compute a $k$-partition of the dataset. The starting point of the $k$th cluster center is computed by minimizing an auxiliary cluster function. Given a finite set of points $A$ in $n$-dimensional space, the global k-means computes the centroid $X^1$ of the set $A$ as

$$X^1 = \frac{1}{m} \sum_{i=1}^{m} a_i, \quad a_i \in A, \; i = 1, \ldots, m. \tag{4}$$

Numerical experiments performed on 14 datasets demonstrate the efficiency of the modified global k-means algorithm in comparison with the multistart k-means (MS k-means) and GKM algorithms when the number of clusters $k > 5$. The modified global k-means, however, requires more computational time than the global k-means algorithm. Xie and Jiang [36] also proposed a novel version of the global k-means algorithm, in which the method of creating the next cluster center is modified; this showed a better execution time without affecting the performance of the global k-means algorithm. Another extension of standard k-means clustering is global kernel k-means [37], which optimizes the clustering error in feature space by employing kernel k-means as a local search procedure. A kernel function is used in order to solve the M-clustering problem, and near-optimal solutions are used to optimize the clustering error in the feature space. This incremental algorithm has a high ability to identify nonlinearly separable clusters in input space and no dependence on cluster initialization. It can handle weighted data points, making it suitable for graph partitioning applications. Two major modifications were performed in order to reduce the computational cost with no major effect on the solution quality.

Video imaging and image segmentation are important applications of k-means clustering. Hung et al. [38] modified the k-means algorithm for color image segmentation, proposing a weight selection procedure in the W-k-means algorithm. The evaluation function $F(I)$ is used for comparison:

$$F(I) = \sqrt{R} \sum_{i=1}^{R} \frac{(e_i/256)^2}{\sqrt{A_i}}, \tag{5}$$

where $I$ is the segmented image, $R$ is the number of regions in the segmented image, $A_i$ is the area (the number of pixels) of the $i$th region, and $e_i$ is the color error of region $i$; $e_i$ is the sum of the Euclidean distances between the color vectors of the original image and those of the segmented image. Smaller values of $F(I)$ give better segmentation results. Results from color image segmentation illustrate that the proposed procedure produces better segmentation than random initialization. Maliatski and Yadid-Pecht [39] also propose an adaptive k-means clustering hardware-driven approach.
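To illustrate how (5) scores a segmentation, a small helper (our own sketch with a hypothetical label-map interface, not code from [38]) might read:

```python
import numpy as np

def evaluation_F(image, labels):
    """F(I) = sqrt(R) * sum_i (e_i/256)^2 / sqrt(A_i); e_i is the summed
    Euclidean color error of region i against its mean color and A_i its
    pixel count. Smaller F(I) indicates a better segmentation."""
    total = 0.0
    regions = np.unique(labels)
    for r in regions:
        pix = image[labels == r].astype(float)            # A_i x 3 color vectors
        e = np.linalg.norm(pix - pix.mean(axis=0), axis=1).sum()
        total += (e / 256.0) ** 2 / np.sqrt(len(pix))
    return np.sqrt(len(regions)) * total
```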

Figure 2: The separation points in two dimensions (separators $M_x$, $M_y$ and per-class means $M_{x1}$, $M_{x2}$, $M_{y1}$, $M_{y2}$ for clusters $C_1$ and $C_2$).

The algorithm uses both pixel intensity and pixel distance in the clustering process. In [40], an FPGA implementation of real-time k-means clustering for colored images is presented, with a filtering algorithm used for the hardware implementation. Sulaiman and Isa [11] presented a clustering algorithm called adaptive fuzzy-K-means clustering for image segmentation. It can be used for different types of images, including medical images, microscopic images, and CCD camera images. The concepts of fuzziness and belongingness are applied to produce more adaptive clustering as compared to other conventional clustering algorithms. The proposed adaptive fuzzy-K-means clustering algorithm provides better segmentation and visual quality for a range of images.

2.2. The Proposed Methodology. In this method, the face location is determined automatically by using an optimized K-means algorithm. First, the input image is reshaped into a vector of pixel values, and the pixels are clustered into two classes by applying a certain threshold determined by the algorithm. Pixels with values below the threshold are put in the nonface class; the rest of the pixels are put in the face class. However, some unwanted parts may get clustered into the face class; therefore, the algorithm is applied again to the face class to obtain only the face part.

2.2.1. Optimized K-Means Algorithm. The proposed modified K-means algorithm is intended to overcome the limitations of the standard version by using differential equations to determine the optimum separation point. The algorithm finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. As an example, say we would like to cluster an image into two subclasses, face and nonface, so we look for a separator that will split the image into two different clusters in two dimensions; then the means of the clusters can be found easily. Figure 2 shows the separation points and the mean of each class. To find these points, we propose the following modification, in the continuous and discrete cases.

Let $x_1, x_2, x_3, \ldots, x_N$ be a dataset and let $x_i$ be an observation. Let $f(x)$ be the probability mass function (PMF) of $x$:

$$f(x) = \sum_{-\infty}^{\infty} \frac{1}{N}\,\delta(x - x_i), \quad x_i \in \mathbb{R}, \tag{6}$$

where $\delta(x)$ is the Dirac function.

When the size of the data is very large, the question of the cost of minimizing $\sum_{i=1}^{k} \sum_{x_i \in s_i} |x_i - \mu_i|^2$ arises, where $k$ is the number of clusters and $\mu_i$ is the mean of cluster $s_i$. Let

$$\operatorname{Var} k = \sum_{i=1}^{k} \sum_{x_i \in s_i} |x_i - \mu_i|^2. \tag{7}$$

Then it is enough to minimize $\operatorname{Var} 1$ without losing generality, since

$$\min_{s_i} \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j - \mu_i|^2 \ge \sum_{l=1}^{d} \min_{s_i} \left( \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j^l - \mu_i^l|^2 \right), \tag{8}$$

where $d$ is the dimension of $x_j = (x_j^1, \ldots, x_j^d)$ and $\mu_j = (\mu_j^1, \ldots, \mu_j^d)$.

2.2.2. Continuous Case. For the continuous case, $f(x)$ plays the role of the probability density function (PDF). Then for 2 classes we need to minimize the following equation:

$$\operatorname{Var} 2 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx. \tag{9}$$

If $d \ge 2$, we can use (8):

$$\min \operatorname{Var} 2 \ge \sum_{k=1}^{d} \min \left( \int_{-\infty}^{M} (x^k - \mu_1^k)^2 f(x^k)\,dx + \int_{M}^{\infty} (x^k - \mu_2^k)^2 f(x^k)\,dx \right). \tag{10}$$

Let $x \in \mathbb{R}$ be any random variable with probability density function $f(x)$; we want to find $\mu_1$ and $\mu_2$ minimizing (7), as follows:

$$(\mu_1^*, \mu_2^*) = \arg_{(\mu_1, \mu_2)} \left[ \min_{M, \mu_1, \mu_2} \left[ \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx \right] \right]. \tag{11}$$

$$\min_{M, \mu_1, \mu_2} \operatorname{Var} 2 = \min_{M} \left[ \min_{\mu_1} \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx + \min_{\mu_2} \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx \right], \tag{12}$$

$$\min_{M, \mu_1, \mu_2} \operatorname{Var} 2 = \min_{M} \left[ \min_{\mu_1} \int_{-\infty}^{M} (x - \mu_1(M))^2 f(x)\,dx + \min_{\mu_2} \int_{M}^{\infty} (x - \mu_2(M))^2 f(x)\,dx \right]. \tag{13}$$

We know that

$$\min_{x} E\left[ (X - x)^2 \right] = E\left[ (X - E(X))^2 \right] = \operatorname{Var}(X). \tag{14}$$

So we can find $\mu_1$ and $\mu_2$ in terms of $M$. Let us define $\operatorname{Var}_1$ and $\operatorname{Var}_2$ as follows:

$$\operatorname{Var}_1 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\,dx, \qquad \operatorname{Var}_2 = \int_{M}^{\infty} (x - \mu_2)^2 f(x)\,dx. \tag{15}$$

After simplification we get

$$\operatorname{Var}_1 = \int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx,$$

$$\operatorname{Var}_2 = \int_{M}^{\infty} x^2 f(x)\,dx + \mu_2^2 \int_{M}^{\infty} f(x)\,dx - 2\mu_2 \int_{M}^{\infty} x f(x)\,dx. \tag{16}$$

To find the minimum, $\operatorname{Var}_1$ and $\operatorname{Var}_2$ are both partially differentiated with respect to $\mu_1$ and $\mu_2$, respectively. After simplification, we conclude that the minimization occurs when

$$\mu_1 = \frac{E(X_-)}{P(X_-)}, \qquad \mu_2 = \frac{E(X_+)}{P(X_+)}, \tag{17}$$

where $X_- = X \cdot 1_{x < M}$ and $X_+ = X \cdot 1_{x \ge M}$.

To find the minimum of $\operatorname{Var} 2$, we need $\partial\mu/\partial M$:

$$\mu_1 = \frac{\int_{-\infty}^{M} x f(x)\,dx}{\int_{-\infty}^{M} f(x)\,dx} \;\Longrightarrow\; \frac{\partial \mu_1}{\partial M} = \frac{M f(M)\, P(X_-) - f(M)\, E(X_-)}{P^2(X_-)},$$

$$\mu_2 = \frac{\int_{M}^{\infty} x f(x)\,dx}{\int_{M}^{\infty} f(x)\,dx} \;\Longrightarrow\; \frac{\partial \mu_2}{\partial M} = \frac{f(M)\, E(X_+) - M f(M)\, P(X_+)}{P^2(X_+)}. \tag{18}$$

After simplification we get

$$\frac{\partial \mu_1}{\partial M} = \frac{f(M)}{P^2(X_-)} \left[ M P(X_-) - E(X_-) \right], \qquad \frac{\partial \mu_2}{\partial M} = \frac{f(M)}{P^2(X_+)} \left[ E(X_+) - M P(X_+) \right]. \tag{19}$$

In order to find the minimum of $\operatorname{Var} 2$, we find $\partial \operatorname{Var}_1 / \partial M$ and $\partial \operatorname{Var}_2 / \partial M$ separately and add them, as follows:

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = \frac{\partial}{\partial M}(\operatorname{Var}_1) + \frac{\partial}{\partial M}(\operatorname{Var}_2),$$

$$\begin{aligned}
\frac{\partial \operatorname{Var}_1}{\partial M} &= \frac{\partial}{\partial M} \left( \int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2 \int_{-\infty}^{M} f(x)\,dx - 2\mu_1 \int_{-\infty}^{M} x f(x)\,dx \right) \\
&= M^2 f(M) + 2 \mu_1 \mu_1' P(X_-) + \mu_1^2 f(M) - 2 \mu_1' E(X_-) - 2 \mu_1 M f(M).
\end{aligned} \tag{20}$$

After simplification we get

$$\frac{\partial \operatorname{Var}_1}{\partial M} = f(M) \left[ M^2 + \frac{E^2(X_-)}{P(X_-)^2} - 2M \frac{E(X_-)}{P(X_-)} \right]. \tag{21}$$

Similarly,

$$\frac{\partial \operatorname{Var}_2}{\partial M} = -M^2 f(M) + 2 \mu_2 \mu_2' P(X_+) - \mu_2^2 f(M) - 2 \mu_2' E(X_+) + 2 \mu_2 M f(M). \tag{22}$$

After simplification we get

$$\frac{\partial \operatorname{Var}_2}{\partial M} = f(M) \left[ -M^2 - \frac{E^2(X_+)}{P(X_+)^2} + 2M \frac{E(X_+)}{P(X_+)} \right]. \tag{23}$$

Adding both these differentials and simplifying, it follows that

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = f(M) \left[ \left( \frac{E^2(X_-)}{P(X_-)^2} - \frac{E^2(X_+)}{P(X_+)^2} \right) - 2M \left( \frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)} \right) \right]. \tag{24}$$

Equating this with 0 gives

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = 0,$$

$$\left[ \left( \frac{E^2(X_-)}{P(X_-)^2} - \frac{E^2(X_+)}{P(X_+)^2} \right) - 2M \left( \frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)} \right) \right] = 0,$$

$$\left[ \frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)} \right] \left[ \frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} - 2M \right] = 0. \tag{25}$$

But $\left[ (E(X_-)/P(X_-)) - (E(X_+)/P(X_+)) \right] \neq 0$, as that would imply $\mu_1 = \mu_2$, which contradicts the initial assumption. Therefore

$$\left[ \frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} - 2M \right] = 0 \;\Longrightarrow\; M = \frac{1}{2} \left[ \frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} \right], \quad\text{that is,}\quad M = \frac{\mu_1 + \mu_2}{2}. \tag{26}$$

Therefore we have to find all values of $M$ that satisfy (26), which is easier than minimizing $\operatorname{Var} 2$ directly.

To find the minimum in terms of $M$, it is enough to derive

$$\frac{\partial \operatorname{Var} 2(\mu_1(M), \mu_2(M))}{\partial M} = f(M) \left[ \frac{E^2(X_-)}{P(X < M)^2} - \frac{E^2(X_+)}{P(X \ge M)^2} - 2M \left( \frac{E(X_-)}{P(X \le M)} - \frac{E(X_+)}{P(X > M)} \right) \right]. \tag{27}$$

Then we find

$$M = \frac{1}{2} \left[ \frac{E(X_+)}{P(X_+)} + \frac{E(X_-)}{P(X_-)} \right]. \tag{28}$$

2.2.3. Finding the Centers of Some Common Probability Density Functions

(1) Uniform Distribution. The probability density function of the continuous uniform distribution is

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b, \\[4pt] 0, & \text{otherwise}, \end{cases} \tag{29}$$

where $a$ and $b$ are the two parameters of the distribution. The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X_+) = \frac{1}{2} \left( \frac{b^2 - M^2}{b - a} \right), \qquad E(X_-) = \frac{1}{2} \left( \frac{M^2 - a^2}{b - a} \right). \tag{30}$$

Putting all these in (28) and solving, we get

$$M = \frac{1}{2}(a + b), \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = \frac{1}{4}(3a + b), \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = \frac{1}{4}(a + 3b). \tag{31}$$
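For image intensities this has a direct reading: if pixel values were uniform on $[0, 255]$, then (31) gives $M = 127.5$, $\mu_1 = 63.75$, and $\mu_2 = 191.25$. The generic solver sketched earlier reproduces this (assuming the `optimal_separator` snippet above):

```python
# Uniform density on [a, b] = [0, 255]; (31) predicts M = 127.5,
# mu1 = (3a + b)/4 = 63.75, and mu2 = (a + 3b)/4 = 191.25.
a, b = 0.0, 255.0
M, m1, m2 = optimal_separator(lambda t: np.ones_like(t) / (b - a), a, b)
print(M, m1, m2)   # approximately 127.5, 63.75, 191.25
```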

(2) Log-Normal Distribution. The probability density function of a log-normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}}\, e^{-(\ln x - \mu)^2 / 2\sigma^2}, \quad x > 0. \tag{32}$$

The probabilities to the left and right of $M$ are, respectively,

$$P(X_+) = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\!\left( \frac{\sqrt{2}\,(\ln M - \mu)}{2\sigma} \right), \qquad P(X_-) = \frac{1}{2} + \frac{1}{2} \operatorname{erf}\!\left( \frac{\sqrt{2}\,(\ln M - \mu)}{2\sigma} \right). \tag{33}$$

The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X_+) = \frac{1}{2}\, e^{\mu + \sigma^2/2} \left[ 1 + \operatorname{erf}\!\left( \frac{\sqrt{2}\,(\mu + \sigma^2 - \ln M)}{2\sigma} \right) \right],$$

$$E(X_-) = \frac{1}{2}\, e^{\mu + \sigma^2/2} \left[ 1 + \operatorname{erf}\!\left( \frac{\sqrt{2}\,(\ln M - \mu - \sigma^2)}{2\sigma} \right) \right]. \tag{34}$$

Putting all these in (28), assuming $\mu = 0$ and $\sigma = 1$, and solving, we get

$$M = 4.244942583, \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = 1.196827816, \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = 7.293057348. \tag{35}$$
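These constants can be verified numerically with the standard library's `erf`; the sketch below (ours) evaluates $(\mu_1 + \mu_2)/2$ from (33)-(34) at $M = 4.244942583$ and recovers $M$ itself, confirming the fixed point:

```python
import math

def halfsum_means_lognormal(M, mu=0.0, sigma=1.0):
    """(mu1 + mu2)/2 for a log-normal density, using (33)-(34)."""
    z = (math.log(M) - mu) / (sigma * math.sqrt(2))
    P_plus, P_minus = 0.5 - 0.5 * math.erf(z), 0.5 + 0.5 * math.erf(z)
    c = 0.5 * math.exp(mu + 0.5 * sigma**2)
    E_plus = c * (1 + math.erf((mu + sigma**2 - math.log(M)) / (sigma * math.sqrt(2))))
    E_minus = c * (1 + math.erf((math.log(M) - mu - sigma**2) / (sigma * math.sqrt(2))))
    return 0.5 * (E_minus / P_minus + E_plus / P_plus)

print(halfsum_means_lognormal(4.244942583))   # ~4.24494..., the fixed point of (28)
```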

(3) Normal Distribution. The probability density function of a normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2}. \tag{36}$$

The probabilities to the left and right of $M$ are, respectively,

$$P(X_+) = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\!\left( \frac{\sqrt{2}\,(M - \mu)}{2\sigma} \right), \qquad P(X_-) = \frac{1}{2} + \frac{1}{2} \operatorname{erf}\!\left( \frac{\sqrt{2}\,(M - \mu)}{2\sigma} \right). \tag{37}$$

The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X_+) = \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2 / 2\sigma^2} - \frac{1}{2}\mu \left[ -1 + \operatorname{erf}\!\left( \frac{\sqrt{2}\,(M - \mu)}{2\sigma} \right) \right],$$

$$E(X_-) = -\frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2 / 2\sigma^2} + \frac{1}{2}\mu \left[ 1 + \operatorname{erf}\!\left( \frac{\sqrt{2}\,(M - \mu)}{2\sigma} \right) \right]. \tag{38}$$

Putting all these in (28) and solving, we get $M = \mu$. Assuming $\mu = 0$ and $\sigma = 1$ and solving, we get

$$M = 0, \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = -\sqrt{\frac{2}{\pi}}, \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = \sqrt{\frac{2}{\pi}}. \tag{39}$$

2.2.4. Discrete Case. Let $X$ be a discrete random variable, and assume we have a set of $N$ observations from $X$. Let $p_j$ be the mass density function for an observation $x_j$.

Let $x_{i-1} < M_{i-1} < x_i$ and $x_i < M_i < x_{i+1}$. We define $\mu_{x \le x_i}$ as the mean of all $x \le x_i$ and $\mu_{x \ge x_i}$ as the mean of all $x \ge x_i$. Define $\operatorname{Var} 2(M_{i-1})$ as the variance of the two centers forced by $M_{i-1}$ as a separator:

$$\operatorname{Var}(M_{i-1}) = \sum_{x_j \le x_{i-1}} (x_j - \mu_{x_j \le x_{i-1}})^2\, p_j + \sum_{x_j \ge x_i} (x_j - \mu_{x_j \ge x_i})^2\, p_j, \tag{40}$$

$$\operatorname{Var}(M_i) = \sum_{x_j \le x_i} (x_j - \mu_{x_j \le x_i})^2\, p_j + \sum_{x_j \ge x_{i+1}} (x_j - \mu_{x_j \ge x_{i+1}})^2\, p_j. \tag{41}$$

The first part of $\operatorname{Var}(M_{i-1})$ simplifies as follows:

$$\begin{aligned}
\sum_{x_j \le x_{i-1}} (x_j - \mu_{x_j \le x_{i-1}})^2\, p_j &= \sum_{x_j \le x_{i-1}} x_j^2\, p_j - 2 \mu_{x_j \le x_{i-1}} \sum_{x_j \le x_{i-1}} x_j\, p_j + \mu_{x_j \le x_{i-1}}^2 \sum_{x_j \le x_{i-1}} p_j \\
&= E(X^2 \cdot 1_{x_j \le x_{i-1}}) - \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})}.
\end{aligned} \tag{42}$$

Similarly,

$$\sum_{x_j \ge x_i} (x_j - \mu_{x_j \ge x_i})^2\, p_j = E(X^2 \cdot 1_{x_j \ge x_i}) - \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)},$$

$$\operatorname{Var} 2(M_{i-1}) = E(X^2 \cdot 1_{x_j \le x_{i-1}}) - \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + E(X^2 \cdot 1_{x_j \ge x_i}) - \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)},$$

$$\operatorname{Var} 2(M_i) = E(X^2 \cdot 1_{x_j \le x_i}) - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} + E(X^2 \cdot 1_{x_j \ge x_{i+1}}) - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}. \tag{43}$$

Define $\Delta \operatorname{Var} 2 = \operatorname{Var} 2(M_i) - \operatorname{Var} 2(M_{i-1})$:

$$\Delta \operatorname{Var} 2 = \left[ -\frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right] - \left[ -\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)} \right]. \tag{44}$$

Now, in order to simplify the calculation, we rearrange $\Delta \operatorname{Var} 2$ as follows:

$$\Delta \operatorname{Var} 2 = \left[ \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} \right] + \left[ \frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right]. \tag{45}$$

The first part is simplified as follows:

$$\begin{aligned}
&\frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \le x_i})}{P(X \le x_i)} = \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{\left[ E(X \cdot 1_{x_j \le x_{i-1}}) + x_i p_i \right]^2}{P(X \le x_{i-1}) + p_i} \\
&\quad = \frac{p_i E^2(X \cdot 1_{x_j \le x_{i-1}}) - 2 x_i p_i E(X \cdot 1_{x_j \le x_{i-1}})\, P(X \le x_{i-1}) - x_i^2 p_i^2\, P(X \le x_{i-1})}{P(X \le x_{i-1}) \left[ P(X \le x_{i-1}) + p_i \right]}.
\end{aligned} \tag{46}$$

Similarly, simplifying the second part yields

$$\begin{aligned}
&\frac{E^2(X \cdot 1_{x_j \ge x_i})}{P(X \ge x_i)} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} = \frac{\left[ E(X \cdot 1_{x_j \ge x_{i+1}}) + x_i p_i \right]^2}{P(X \ge x_{i+1}) + p_i} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \\
&\quad = \frac{-p_i E^2(X \cdot 1_{x_j \ge x_{i+1}}) + 2 x_i p_i E(X \cdot 1_{x_j \ge x_{i+1}})\, P(X \ge x_{i+1}) + x_i^2 p_i^2\, P(X \ge x_{i+1})}{P(X \ge x_{i+1}) \left[ P(X \ge x_{i+1}) + p_i \right]}.
\end{aligned} \tag{47}$$

In general, these types of optimization arise when the dataset is very big. If we consider $N$ to be very large, then $p_k \ll \max\left( \sum_{i \ge k+1} p_i, \sum_{i \le k-1} p_i \right)$, $P(X \le x_{i-1}) + p_i \approx P(X \le x_i)$, and $P(X \ge x_{i+1}) + p_i \approx P(X \ge x_{i+1})$, so

$$\begin{aligned}
\Delta \operatorname{Var} 2 \approx{}& \left[ \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})\, p_i}{P^2(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})\, p_i}{P^2(X \ge x_{i+1})} \right] - 2 p_i x_i \left[ \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right] \\
&- p_i^2 x_i^2 \left[ \frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})} \right], \tag{48}
\end{aligned}$$

and, as $\lim_{N \to \infty} p_i^2 x_i^2 \left[ (1/P(X \le x_{i-1})) - (1/P(X \ge x_{i+1})) \right] = 0$, we can further simplify $\Delta \operatorname{Var} 2$ as follows:

$$\begin{aligned}
\Delta \operatorname{Var} 2 \approx{}& \left[ \frac{E^2(X \cdot 1_{x_j \le x_{i-1}})\, p_i}{P^2(X \le x_{i-1})} - \frac{E^2(X \cdot 1_{x_j \ge x_{i+1}})\, p_i}{P^2(X \ge x_{i+1})} \right] - 2 p_i x_i \left[ \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right] \\
={}& p_i \left[ \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right] \left[ \left( \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right) - 2 x_i \right].
\end{aligned} \tag{49}$$

To find the minimum of $\Delta \operatorname{Var} 2$, let us locate where $\Delta \operatorname{Var} 2 = 0$:

$$p_i \left[ \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right] \left[ \left( \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right) - 2 x_i \right] = 0. \tag{50}$$

Figure 3: Discrete random variables $x_1, x_2, \ldots, x_n$ and the separator $M$ on the variance curve.

But $p_i \neq 0$, and the first bracket satisfies $\left[ \left( E(X \cdot 1_{x_j \le x_{i-1}}) / P(X \le x_{i-1}) \right) - \left( E(X \cdot 1_{x_j \ge x_{i+1}}) / P(X \ge x_{i+1}) \right) \right] < 0$, since $x_i$ is an increasing sequence, which implies that $\mu_{i-1} < \mu_i$, where

$$\mu_{i-1} = \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})}, \qquad \mu_i = \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}. \tag{51}$$

Therefore,

$$\left[ \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right] - 2 x_i = 0. \tag{52}$$

Let us define

$$F(x_i) = 2 x_i - \left[ \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right];$$

$$F(x_i) = 0 \;\Longrightarrow\; x_i = \frac{1}{2} \left[ \frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} + \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})} \right] \;\Longrightarrow\; x_i = \frac{\mu_{i-1} + \mu_i}{2}, \tag{53}$$

where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator $M$ between $x_i$ and $x_{i+1}$ such that $F(x_i) \cdot F(x_{i+1}) < 0$ and $F(x_i) < 0$ (see Figure 3). After finding the few local minima, we can get the global minimum by choosing the smallest among them.
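On sorted data with equal weights $p_j = 1/N$, each $F(x_i)$ can be evaluated in $O(1)$ from prefix sums, so the whole scan is linear after sorting. A sketch of this search (our own implementation of the criterion above, not the authors' code):

```python
import numpy as np

def separator_candidates(x):
    """Return the x_i where F(x_i) < 0 <= F(x_{i+1}), cf. (53):
    F(x_i) = 2*x_i - mean(x <= x_{i-1}) - mean(x >= x_{i+1})."""
    x = np.sort(np.asarray(x, dtype=float))
    n, csum = len(x), np.cumsum(x)
    F = np.empty(n)
    F[0] = F[-1] = np.nan                    # F is undefined at the endpoints
    for i in range(1, n - 1):
        mu_left = csum[i - 1] / i                        # mean of x[0..i-1]
        mu_right = (csum[-1] - csum[i]) / (n - i - 1)    # mean of x[i+1..n-1]
        F[i] = 2 * x[i] - mu_left - mu_right
    return [x[i] for i in range(1, n - 2) if F[i] < 0 <= F[i + 1]]
```

Each candidate marks a local optimum of the two-class split; evaluating $\operatorname{Var} 2$ at every candidate (also $O(1)$ per candidate with prefix sums of $x$ and $x^2$) and keeping the smallest yields the global separator.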

3. Face Localization Using the Proposed Method

Now the superiority of the modified K-means algorithm over the standard algorithm, in terms of finding a global minimum, becomes clear, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm; the image contains a face part, a normal background, and a background with illumination. Then we split the image pixels into two classes, one from 0 to $x_1$ and another from $x_1$ to 255, by applying a certain threshold determined by the algorithm. Let $[0, x_1]$ be class 1, which represents the nonface part, and let $[x_1, 255]$ be class 2, which represents the face with the shade.

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.

Figure 4: The original image.

Figure 5: The face with shade only.

In order to remove the illuminated background part, another threshold is applied to the remaining pixels using the algorithm, meant to separate the face part from the illuminated background part. New classes are obtained, from $x_1$ to $x_2$ and from $x_2$ to 255. The pixels with values less than this threshold are assigned a value of 0, and the nonzero pixels left represent the face part. Let $[x_1, x_2]$ be class 1, which represents the illuminated background part, and let $[x_2, 255]$ be class 2, which represents the face part. The result of the removal of the illuminated background part is shown in Figure 6.

Figure 6: The face without illumination.

Figure 7: The filtered image.

One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.
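Putting the steps together, here is a compact sketch of the two-pass localization (our own rendering of the procedure, assuming a grayscale image and the `separator_candidates` helper from Section 2.2.4; among the candidates, the one with the smallest two-class variance is taken as the threshold):

```python
import numpy as np

def best_threshold(values):
    """Among the sign-change candidates, pick the separator with the
    smallest two-class variance Var 2 (the global minimum)."""
    best_M, best_var = None, np.inf
    for M in separator_candidates(values):
        lo, hi = values[values < M], values[values >= M]
        var2 = ((lo - lo.mean()) ** 2).sum() + ((hi - hi.mean()) ** 2).sum()
        if var2 < best_var:
            best_M, best_var = M, var2
    return best_M

def localize_face(gray):
    """Pass 1 separates the dark background from the bright pixels;
    pass 2 separates the illuminated background from the face class."""
    t1 = best_threshold(gray.astype(float).ravel())
    stage1 = np.where(gray >= t1, gray, 0)          # nonface class set to 0
    t2 = best_threshold(stage1[stage1 > 0].astype(float).ravel())
    return stage1 >= t2                             # mask of face-class pixels
```

A filtering step over the returned mask, as described above, would then clean up the remaining isolated nonface pixels before the face window is cut from the original image.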

Figure 8 shows the final result, which is a window containing the face extracted from the original image.

Figure 8: The corresponding window in the original image.

4. Results

In this section, the experimental results of the proposed algorithm are analyzed and discussed, starting with a description of the datasets used in this work, followed by the results. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 images were selected from this dataset based on different conditions such as light variation, pose, and background variations. The second dataset is the BioID dataset, with image size 384 × 286; 300 images from this database were selected based on different conditions such as indoor/outdoor. The last dataset is the Caltech dataset, with image size 896 × 592; 300 images from this database were likewise selected based on different conditions such as indoor/outdoor.

4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL) at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person were taken under different light conditions, with cluttered backgrounds, and at different scales and poses. A few examples of these images are shown in Figure 9.

Figure 9: Samples from the MIT-CBCL dataset.

4.2. BioID Dataset. This dataset [42] consists of 1521 gray level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of a face of one out of 23 different test persons. The images were taken under different lighting conditions, with complex backgrounds, and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).

4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different face expressions and complex backgrounds (see Figure 11).

Figure 10: Sample faces from the BioID dataset for localization.

Figure 11: Sample faces from the Caltech dataset for localization.

Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and cluttered backgrounds, as well as indoor/outdoor settings.

Dataset      Images   Accuracy (%)
MIT-Set 1    150      100
MIT-Set 2    150      100
BioID        300      100
Caltech      300      100

4.4. Proposed Method Results. The proposed method was evaluated on these datasets, where the result was 100% with a

localization time of 4 s. Table 1 shows the proposed method's results with the MIT-CBCL, Caltech, and BioID databases. In this test we focused on the use of images of faces from different angles, from 30 degrees left to 30 degrees right, on variation in the background, and on both indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset and two other sets from the other databases to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results showed the efficiency of the proposed modified K-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter was considered in this test, namely the localization time. From the results, the proposed method can locate the face in a very short time, because it has less complexity due to the use of differential equations.


Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.


Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.

Figure 12 shows some examples of face localization on MIT-CBCL.

In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). A filtering operation is then needed to remove the unwanted parts; this is shown in Figure 13(c), after which the area of the ROI is determined.

Figure 14 shows the face position on an image from the Caltech dataset.

5. Conclusion

In this paper we focus on ROI detection by suggesting a new method for face localization. In this method of localization, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT-CBCL, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher dimensional data, which we believe is just a generalization of our approach to higher dimensions, via (8), and to more subclusters.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.



Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.

Acknowledgments

This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.

References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL Face Recognition Database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID Support: Downloads: BioID Face Database," 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.

[30] P S Bishnu and V Bhattacherjee ldquoSoftware fault predictionusing quad tree-based K-means clustering algorithmrdquo IEEETransactions on Knowledge and Data Engineering vol 24 no6 pp 1146ndash1150 2012

[31] T Elomaa and H Koivistoinen ldquoOn autonomous K-meansclusteringrdquo in Proceedings of the International Symposium onMethodologies for Intelligent Systems pp 228ndash236 May 2005

[32] J M Pena J A Lozano and P Larranaga ldquoAn empiricalcomparison of four initialization methods for the K-Meansalgorithmrdquo Pattern Recognition Letters vol 20 no 10 pp 1027ndash1040 1999

[33] T Kanungo D M Mount N S Netanyahu C D PiatkoR Silverman and A Y Wu ldquoAn efficient k-means clusteringalgorithms analysis and implementationrdquo IEEETransactions onPattern Analysis andMachine Intelligence vol 24 no 7 pp 881ndash892 2002

[34] A Likas N Vlassis and J J Verbeek ldquoThe global k-meansclustering algorithmrdquo Pattern Recognition vol 36 no 2 pp451ndash461 2003

[35] A M Bagirov ldquoModified global k-means algorithm for mini-mumsum-of-squares clustering problemsrdquoPatternRecognitionvol 41 no 10 pp 3192ndash3199 2008

[36] J Xie and S Jiang ldquoA simple and fast algorithm for global K-means clusteringrdquo in Proceedings of the 2nd InternationalWork-shop on Education Technology and Computer Science (ETCS 10)vol 2 pp 36ndash40 Wuhan China March 2010

[37] G F Tzortzis and A C Likas ldquoThe global kernel 120581-meansalgorithm for clustering in feature spacerdquo IEEE Transactions onNeural Networks vol 20 no 7 pp 1181ndash1194 2009

[38] W-L Hung Y-C Chang and E Stanley Lee ldquoWeight selectionin W-K-means algorithm with an application in color imagesegmentationrdquo Computers and Mathematics with Applicationsvol 62 no 2 pp 668ndash676 2011

[39] B Maliatski and O Yadid-Pecht ldquoHardware-driven adaptive 120581-means clustering for real-time video imagingrdquo IEEE Transac-tions on Circuits and Systems for Video Technology vol 15 no 1pp 164ndash166 2005

[40] T Saegusa and T Maruyama ldquoAn FPGA implementation ofreal-time K-means clustering for color imagesrdquo Journal of Real-Time Image Processing vol 2 no 4 pp 309ndash318 2007

[41] ldquoCBCL face recognition Databaserdquo Vol 2010httpcbclmitedusoftware-datasetsheiselefacerecognition-databasehtml

[42] BioID BioID Support Downloads BioID Face Database 2011[43] httpwwwvisioncaltecheduhtml-filesarchivehtml


Bradley et al. [22] proposed scalable k-means, which uses buffering and a two-stage compression scheme to compress or remove patterns in order to improve the performance of k-means. The algorithm is slightly faster than k-means but does not always produce the same result. The factors that degrade the performance of scalable k-means are the compression processes and the compression ratio. Farnstrom et al. also proposed a simple single-pass k-means algorithm [23], which reduces the computation time of k-means. Ordonez and Omiecinski proposed relational k-means [24], which uses a block and incremental concept to improve the stability of scalable k-means. The computation time of k-means can also be reduced using parallel bisecting k-means [25] and the triangle inequality [26].

Although K-means is the most popular and simplest clustering algorithm, its major difficulty is that it cannot ensure globally optimal results, because the initial cluster centers are selected randomly. Reference [27] presents a novel technique that selects the initial cluster centers by using a Voronoi diagram, which automates the selection of the initial centers. To validate the performance, experiments were performed on a range of artificial and real-world datasets. The quality of the algorithm is assessed using the following error rate (ER) equation:

$$\text{ER} = \frac{\text{number of misclassified objects}}{\text{total number of objects}} \times 100. \quad (2)$$

The lower the error rate, the better the clustering. The proposed algorithm produces better clustering results than the traditional K-means, and the experimental results showed a 60% reduction in the iterations needed to define the centroids.

All previous solutions and efforts to increase the performance of the K-means algorithm still need more investigation, since they all search for a local minimum; searching for the global minimum will certainly improve the performance of the algorithm. The rest of the paper is organized as follows. Section 2 introduces the proposed method, which finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. Section 3 explains face localization using the proposed method, and Section 4 discusses and analyzes the results on sets drawn from the MIT-CBCL, BioID, and Caltech datasets. Finally, conclusions are drawn in Section 5.

2. K-Means Clustering

K-means, an unsupervised learning algorithm first proposed by MacQueen in 1967 [28], is a popular method of cluster analysis. It organizes the given objects into uniform classes, called clusters, based on similarities among the objects with respect to certain criteria, and it solves the well-known clustering problem by performing an iterative alternating fitting process. The algorithm partitions a dataset $X$ into $k$ disjoint clusters such that each observation belongs to the cluster with the nearest mean. Let $x_i$ be the centroid of cluster $c_i$ and let $d(x_j, x_i)$ be the dissimilarity between $x_i$ and an object $x_j \in c_i$. Then the function minimized by k-means is given by

$$\min_{x_1, \ldots, x_k} E = \sum_{i=1}^{k} \sum_{x_j \in c_i} d(x_j, x_i). \quad (3)$$

[Figure 1: Clustering error versus the number of clusters $k$ for global k-means, fast global k-means, k-means, and min k-means.]
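For concreteness, the following is a minimal sketch of the standard Lloyd-style iteration that locally minimizes (3) with the squared Euclidean distance; the NumPy implementation and its names are illustrative, not code from the paper. Its random initialization is exactly the weakness that the modifications surveyed below, and the method proposed in this paper, aim to remove.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal Lloyd iteration for eq. (3) with d(x_j, x_i) = ||x_j - x_i||^2."""
    rng = np.random.default_rng(seed)
    # Random initialization -- the sensitivity addressed by the variants below.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: each observation joins the cluster with the nearest center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Update step: each center moves to the mean of its cluster (empty ones stay).
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break  # converged to a (possibly local) minimum of eq. (3)
        centers = new_centers
    return centers, labels
```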

K-means clustering is utilized in a vast number of applications, including machine learning, fault detection, pattern recognition, image processing, statistics, and artificial intelligence [11, 29, 30]. The k-means algorithm is considered one of the fastest clustering algorithms, and it has a number of variants that are sensitive to the selection of the initial points and are intended to solve many issues of k-means, such as evaluating the number of clusters [31], initializing the cluster centroids [32], and improving the speed of the algorithm [33].

2.1. Modifications in K-Means Clustering

The global k-means algorithm [34] is a proposal to improve the global search properties of the k-means algorithm and its performance on very large datasets by computing clusters successively. The main weakness of k-means clustering, namely its sensitivity to the initial positions of the cluster centers, is overcome by the global k-means clustering algorithm, which consists of a deterministic global optimization technique. It does not select initial values randomly; instead, an incremental approach is applied to optimally add one new cluster center at each stage. Global k-means also proposes a method to reduce the computational load while maintaining the solution quality. Experiments were performed to compare the performance of k-means, global k-means, min k-means, and fast global k-means, as shown in Figure 1. Numerical results show that the global k-means algorithm considerably outperforms the other k-means variants.


Bagirov [35] proposed a new version of the global k-means algorithm for minimum sum-of-squares clustering problems. He also compared three different versions of the k-means algorithm to propose the modified version of the global k-means algorithm. The proposed algorithm computes clusters incrementally, and the $k - 1$ cluster centers from the previous iteration are used to compute a $k$-partition of the dataset. The starting point of the $k$th cluster center is computed by minimizing an auxiliary cluster function. Given a finite set of points $A$ in $n$-dimensional space, the global k-means computes the centroid $X^1$ of the set $A$ as

$$X^1 = \frac{1}{m} \sum_{i=1}^{m} a_i, \qquad a_i \in A, \; i = 1, \ldots, m. \quad (4)$$

Numerical experiments performed on 14 datasets demonstrate the efficiency of the modified global k-means algorithm in comparison with the multistart k-means (MS k-means) and the GKM algorithms when the number of clusters $k > 5$. The modified global k-means, however, requires more computational time than the global k-means algorithm. Xie and Jiang [36] also proposed a novel version of the global k-means algorithm, in which the method of creating the next cluster center is modified; this showed a better execution time without affecting the performance of the global k-means algorithm. Another extension of standard k-means clustering is the global kernel k-means [37], which optimizes the clustering error in feature space by employing kernel k-means as a local search procedure. A kernel function is used to solve the M-clustering problem, and near-optimal solutions are used to optimize the clustering error in the feature space. This incremental algorithm has a high ability to identify nonlinearly separable clusters in the input space and has no dependence on cluster initialization. It can handle weighted data points, which makes it suitable for graph partitioning applications. Two major modifications were performed in order to reduce the computational cost with no major effect on the solution quality.

Video imaging and image segmentation are important applications of k-means clustering. Hung et al. [38] modified the k-means algorithm for color image segmentation, proposing a weight selection procedure in the W-k-means algorithm. The evaluation function $F(I)$ is used for comparison:

$$F(I) = \sqrt{R} \sum_{i=1}^{R} \frac{(e_i / 256)^2}{\sqrt{A_i}}, \quad (5)$$

where $I$ is the segmented image, $R$ is the number of regions in the segmented image, $A_i$ is the area (the number of pixels) of the $i$th region, and $e_i$ is the color error of region $i$, that is, the sum of the Euclidean distances between the color vectors of the original image and those of the segmented image. Smaller values of $F(I)$ give better segmentation results, and results from color image segmentation illustrate that the proposed procedure produces better segmentation than random initialization.
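For clarity, a direct transcription of (5) as a helper function (the name `evaluation_F` is ours; regions are assumed to be summarized by their areas and color errors):

```python
import numpy as np

def evaluation_F(region_areas, region_color_errors):
    """Eq. (5): F(I) = sqrt(R) * sum_i (e_i / 256)^2 / sqrt(A_i).
    Smaller F means a better segmentation of image I into R regions."""
    A = np.asarray(region_areas, dtype=float)         # A_i: pixels per region
    e = np.asarray(region_color_errors, dtype=float)  # e_i: color error per region
    R = A.size                                        # number of regions
    return np.sqrt(R) * np.sum((e / 256.0) ** 2 / np.sqrt(A))
```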

[Figure 2: The separation points in two dimensions.]

Maliatski and Yadid-Pecht [39] also propose an adaptive k-means clustering hardware-driven approach; the algorithm uses both pixel intensity and pixel distance in the clustering process. In [40], an FPGA implementation of real-time k-means clustering for color images is presented, with a filtering algorithm used for the hardware implementation. Sulaiman and Isa [11] presented a clustering algorithm called adaptive fuzzy-K-means clustering for image segmentation. It can be used for different types of images, including medical images, microscopic images, and CCD camera images. The concepts of fuzziness and belongingness are applied to produce clustering that is more adaptive than conventional clustering algorithms, and the proposed adaptive fuzzy-K-means clustering algorithm provides better segmentation and visual quality for a range of images.

2.2. The Proposed Methodology

In this method, the face location is determined automatically by using an optimized K-means algorithm. First, the input image is reshaped into a vector of pixel values, and the pixels are clustered into two classes by applying a threshold determined by the algorithm. Pixels with values below the threshold are put in the nonface class; the rest of the pixels are put in the face class. However, some unwanted parts may get clustered into the face class; therefore, the algorithm is applied again to the face class to obtain only the face part.

2.2.1. Optimized K-Means Algorithm

The proposed modified K-means algorithm is intended to overcome the limitations of the standard version by using differential equations to determine the optimum separation point. The algorithm finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. As an example, say we would like to cluster an image into two subclasses, face and nonface; we then look for a separator that splits the image into two different clusters in two dimensions, after which the means of the clusters can be found easily. Figure 2 shows the separation points and the mean of each class. To find these points, we propose the following modification, in the continuous and the discrete cases.

Let $x_1, x_2, x_3, \ldots, x_N$ be a dataset and let $x_i$ be an observation. Let $f(x)$ be the probability mass function (PMF) of $x$,

$$f(x) = \frac{1}{N} \sum_{i=1}^{N} \delta(x - x_i), \qquad x_i \in \mathbb{R}, \quad (6)$$

where $\delta(x)$ is the Dirac function.

When the size of the data is very large, the question of the cost of minimizing $\sum_{i=1}^{k} \sum_{x_i \in s_i} |x_i - \mu_i|^2$ arises, where $k$ is the number of clusters and $\mu_i$ is the mean of cluster $s_i$. Let

$$\operatorname{Var} k = \sum_{i=1}^{k} \sum_{x_i \in s_i} |x_i - \mu_i|^2. \quad (7)$$

Then it is enough to minimize $\operatorname{Var}$ in one dimension at a time without losing generality, since

$$\min_{s_i} \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j - \mu_i|^2 \ge \sum_{l=1}^{d} \min_{s_i} \left( \sum_{i=1}^{k} \sum_{x_j \in s_i} |x_j^l - \mu_i^l|^2 \right), \quad (8)$$

where $d$ is the dimension of $x_j = (x_j^1, \ldots, x_j^d)$ and $\mu_j = (\mu_j^1, \ldots, \mu_j^d)$.

2.2.2. Continuous Case

For the continuous case, $f(x)$ plays the role of the probability density function (PDF). Then, for two classes, we need to minimize

$$\operatorname{Var} 2 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x) \, dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x) \, dx. \quad (9)$$

If $d \ge 2$, we can use (8):

$$\min \operatorname{Var} 2 \ge \sum_{k=1}^{d} \min \left( \int_{-\infty}^{M} (x^k - \mu_1^k)^2 f(x^k) \, dx + \int_{M}^{\infty} (x^k - \mu_2^k)^2 f(x^k) \, dx \right). \quad (10)$$

Let $x \in \mathbb{R}$ be any random variable with probability density function $f(x)$; we want to find the $\mu_1$ and $\mu_2$ that minimize (9):

$$(\mu_1^*, \mu_2^*) = \arg\min_{(\mu_1, \mu_2)} \left[ \min_{M, \mu_1, \mu_2} \left[ \int_{-\infty}^{M} (x - \mu_1)^2 f(x) \, dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x) \, dx \right] \right]. \quad (11)$$

Hence

$$\min_{M, \mu_1, \mu_2} \operatorname{Var} 2 = \min_M \left[ \min_{\mu_1} \int_{-\infty}^{M} (x - \mu_1)^2 f(x) \, dx + \min_{\mu_2} \int_{M}^{\infty} (x - \mu_2)^2 f(x) \, dx \right] \quad (12)$$

$$= \min_M \left[ \min_{\mu_1} \int_{-\infty}^{M} (x - \mu_1(M))^2 f(x) \, dx + \min_{\mu_2} \int_{M}^{\infty} (x - \mu_2(M))^2 f(x) \, dx \right]. \quad (13)$$

We know that

$$\min_x E\left[ (X - x)^2 \right] = E\left[ (X - E(X))^2 \right] = \operatorname{Var}(X), \quad (14)$$

since $E[(X - x)^2] = \operatorname{Var}(X) + (x - E(X))^2$ is minimized exactly at $x = E(X)$.

So we can find $\mu_1$ and $\mu_2$ in terms of $M$. Let us define $\operatorname{Var}_1$ and $\operatorname{Var}_2$ as follows:

$$\operatorname{Var}_1 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x) \, dx, \qquad \operatorname{Var}_2 = \int_{M}^{\infty} (x - \mu_2)^2 f(x) \, dx. \quad (15)$$

After simplification we get

$$\operatorname{Var}_1 = \int_{-\infty}^{M} x^2 f(x) \, dx + \mu_1^2 \int_{-\infty}^{M} f(x) \, dx - 2 \mu_1 \int_{-\infty}^{M} x f(x) \, dx,$$
$$\operatorname{Var}_2 = \int_{M}^{\infty} x^2 f(x) \, dx + \mu_2^2 \int_{M}^{\infty} f(x) \, dx - 2 \mu_2 \int_{M}^{\infty} x f(x) \, dx. \quad (16)$$

To find the minimum, $\operatorname{Var}_1$ and $\operatorname{Var}_2$ are partially differentiated with respect to $\mu_1$ and $\mu_2$, respectively. After simplification, we conclude that the minimization occurs when

$$\mu_1 = \frac{E(X_-)}{P(X_-)}, \qquad \mu_2 = \frac{E(X_+)}{P(X_+)}, \quad (17)$$

where $X_- = X \cdot 1_{x < M}$ and $X_+ = X \cdot 1_{x \ge M}$.


To find the minimum of $\operatorname{Var} 2$ we also need $\partial \mu / \partial M$:

$$\mu_1 = \frac{\int_{-\infty}^{M} x f(x) \, dx}{\int_{-\infty}^{M} f(x) \, dx} \;\Longrightarrow\; \frac{\partial \mu_1}{\partial M} = \frac{M f(M) P(X_-) - f(M) E(X_-)}{P^2(X_-)},$$
$$\mu_2 = \frac{\int_{M}^{\infty} x f(x) \, dx}{\int_{M}^{\infty} f(x) \, dx} \;\Longrightarrow\; \frac{\partial \mu_2}{\partial M} = -\frac{M f(M) P(X_+) - f(M) E(X_+)}{P^2(X_+)}. \quad (18)$$

After simplification we get

$$\frac{\partial \mu_1}{\partial M} = \frac{f(M)}{P^2(X_-)} \left[ M P(X_-) - E(X_-) \right], \qquad \frac{\partial \mu_2}{\partial M} = -\frac{f(M)}{P^2(X_+)} \left[ M P(X_+) - E(X_+) \right]. \quad (19)$$

In order to find the minimum of $\operatorname{Var} 2$, we compute $\partial \operatorname{Var}_1 / \partial M$ and $\partial \operatorname{Var}_2 / \partial M$ separately and add them:

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = \frac{\partial}{\partial M}(\operatorname{Var}_1) + \frac{\partial}{\partial M}(\operatorname{Var}_2),$$

$$\frac{\partial \operatorname{Var}_1}{\partial M} = \frac{\partial}{\partial M} \left( \int_{-\infty}^{M} x^2 f(x) \, dx + \mu_1^2 \int_{-\infty}^{M} f(x) \, dx - 2 \mu_1 \int_{-\infty}^{M} x f(x) \, dx \right)$$
$$= M^2 f(M) + 2 \mu_1 \mu_1' P(X_-) + \mu_1^2 f(M) - 2 \mu_1' E(X_-) - 2 \mu_1 M f(M). \quad (20)$$

After simplification we get

$$\frac{\partial \operatorname{Var}_1}{\partial M} = f(M) \left[ M^2 + \frac{E^2(X_-)}{P(X_-)^2} - 2M \, \frac{E(X_-)}{P(X_-)} \right]. \quad (21)$$

Similarly,

$$\frac{\partial \operatorname{Var}_2}{\partial M} = -M^2 f(M) + 2 \mu_2 \mu_2' P(X_+) - \mu_2^2 f(M) - 2 \mu_2' E(X_+) + 2 \mu_2 M f(M). \quad (22)$$

After simplification we get

$$\frac{\partial \operatorname{Var}_2}{\partial M} = f(M) \left[ -M^2 - \frac{E^2(X_+)}{P(X_+)^2} + 2M \, \frac{E(X_+)}{P(X_+)} \right]. \quad (23)$$

Adding these two differentials and simplifying, it follows that

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = f(M) \left[ \left( \frac{E^2(X_-)}{P(X_-)^2} - \frac{E^2(X_+)}{P(X_+)^2} \right) - 2M \left( \frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)} \right) \right]. \quad (24)$$

Equating this to zero gives

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = 0$$
$$\Longrightarrow \left( \frac{E^2(X_-)}{P(X_-)^2} - \frac{E^2(X_+)}{P(X_+)^2} \right) - 2M \left( \frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)} \right) = 0$$
$$\Longrightarrow \left[ \frac{E(X_-)}{P(X_-)} - \frac{E(X_+)}{P(X_+)} \right] \left[ \frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} - 2M \right] = 0. \quad (25)$$

But $E(X_-)/P(X_-) - E(X_+)/P(X_+) \ne 0$, since in that case $\mu_1 = \mu_2$, which contradicts the initial assumption. Therefore

$$\frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} - 2M = 0 \;\Longrightarrow\; M = \frac{1}{2} \left[ \frac{E(X_-)}{P(X_-)} + \frac{E(X_+)}{P(X_+)} \right] = \frac{\mu_1 + \mu_2}{2}. \quad (26)$$

Therefore, we have to find all values of $M$ that satisfy (26), which is easier than minimizing $\operatorname{Var} 2$ directly.

To find the minimum in terms of $M$, it is enough to evaluate

$$\frac{\partial \operatorname{Var} 2(\mu_1(M), \mu_2(M))}{\partial M} = f(M) \left[ \frac{E^2(X_-)}{P(X < M)^2} - \frac{E^2(X_+)}{P(X \ge M)^2} - 2M \left[ \frac{E(X_-)}{P(X < M)} - \frac{E(X_+)}{P(X \ge M)} \right] \right] \quad (27)$$

and then find

$$M = \frac{1}{2} \left[ \frac{E(X_+)}{P(X_+)} + \frac{E(X_-)}{P(X_-)} \right]. \quad (28)$$
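In practice, (28) can be solved by a simple fixed-point iteration: starting from any separator, compute the conditional means on each side and move the separator to their midpoint. The following sketch (the function name `optimal_separator` is ours, and SciPy quadrature is assumed acceptable for the integrals) illustrates this for a density $f$ supported on $[\mathrm{lo}, \mathrm{hi}]$; for unbounded supports, truncate at suitably large bounds:

```python
from scipy.integrate import quad

def optimal_separator(f, lo, hi, m0=None, tol=1e-10, max_iter=200):
    """Fixed-point iteration for eq. (28): M = (mu1(M) + mu2(M)) / 2,
    where mu1 and mu2 are the conditional means below and above M."""
    m = 0.5 * (lo + hi) if m0 is None else m0
    for _ in range(max_iter):
        e_minus, _ = quad(lambda x: x * f(x), lo, m)   # E(X-)
        p_minus, _ = quad(f, lo, m)                    # P(X-)
        e_plus, _ = quad(lambda x: x * f(x), m, hi)    # E(X+)
        p_plus, _ = quad(f, m, hi)                     # P(X+)
        mu1, mu2 = e_minus / p_minus, e_plus / p_plus
        m_new = 0.5 * (mu1 + mu2)                      # eq. (28)
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m, mu1, mu2
```

For the uniform density on $(a, b)$, for example, this iteration converges to $M = (a+b)/2$ with $\mu_1 = (3a+b)/4$ and $\mu_2 = (a+3b)/4$, matching (31) below.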

2.2.3. Finding the Centers of Some Common Probability Density Functions

(1) Uniform Distribution. The probability density function of the continuous uniform distribution is

$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a < x < b, \\[4pt] 0, & \text{otherwise}, \end{cases} \quad (29)$$

where $a$ and $b$ are the two parameters of the distribution. The expected values of $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X_+) = \frac{1}{2} \left( \frac{b^2 - M^2}{b - a} \right), \qquad E(X_-) = \frac{1}{2} \left( \frac{M^2 - a^2}{b - a} \right). \quad (30)$$

Substituting these into (28) and solving, we get

$$M = \frac{1}{2}(a + b), \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = \frac{1}{4}(3a + b), \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = \frac{1}{4}(a + 3b). \quad (31)$$
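These closed forms can also be verified symbolically. A small sketch (assuming SymPy is available; the variable names are ours) that re-derives (31) from the fixed-point condition (28):

```python
import sympy as sp

x, a, b, M = sp.symbols('x a b M', positive=True)
f = 1 / (b - a)                              # uniform density on (a, b)
E_m = sp.integrate(x * f, (x, a, M))         # E(X-)
P_m = sp.integrate(f, (x, a, M))             # P(X-)
E_p = sp.integrate(x * f, (x, M, b))         # E(X+)
P_p = sp.integrate(f, (x, M, b))             # P(X+)
mu1 = sp.simplify(E_m / P_m)                 # (M + a) / 2
mu2 = sp.simplify(E_p / P_p)                 # (M + b) / 2
# Fixed-point condition (28): M = (mu1 + mu2) / 2.
print(sp.solve(sp.Eq(M, (mu1 + mu2) / 2), M))   # [a/2 + b/2]
print(sp.simplify(mu1.subs(M, (a + b) / 2)))    # 3*a/4 + b/4
print(sp.simplify(mu2.subs(M, (a + b) / 2)))    # a/4 + 3*b/4
```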

(2) Log-Normal Distribution. The probability density function of a log-normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \, e^{-(\ln x - \mu)^2 / 2\sigma^2}, \qquad x > 0. \quad (32)$$

The probabilities to the right and to the left of $M$ are, respectively,

$$P(X_+) = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\left( \frac{\sqrt{2} (\ln M - \mu)}{2\sigma} \right), \qquad P(X_-) = \frac{1}{2} + \frac{1}{2} \operatorname{erf}\left( \frac{\sqrt{2} (\ln M - \mu)}{2\sigma} \right). \quad (33)$$

The expected values of $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X_+) = \frac{1}{2} \, e^{\mu + \sigma^2/2} \left[ 1 + \operatorname{erf}\left( \frac{\sqrt{2} (\mu + \sigma^2 - \ln M)}{2\sigma} \right) \right],$$
$$E(X_-) = \frac{1}{2} \, e^{\mu + \sigma^2/2} \left[ 1 + \operatorname{erf}\left( \frac{\sqrt{2} (-\mu - \sigma^2 + \ln M)}{2\sigma} \right) \right]. \quad (34)$$

Substituting these into (28), assuming $\mu = 0$ and $\sigma = 1$, and solving, we get

$$M = 4.244942583, \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = 1.196827816, \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = 7.293057348. \quad (35)$$

(3) Normal Distribution. The probability density function of a normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^2 / 2\sigma^2}. \quad (36)$$

The probabilities to the right and to the left of $M$ are, respectively,

$$P(X_+) = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\left( \frac{\sqrt{2} (M - \mu)}{2\sigma} \right), \qquad P(X_-) = \frac{1}{2} + \frac{1}{2} \operatorname{erf}\left( \frac{\sqrt{2} (M - \mu)}{2\sigma} \right). \quad (37)$$

The expected values of $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X_+) = \frac{\mu}{2} \left[ 1 - \operatorname{erf}\left( \frac{\sqrt{2} (M - \mu)}{2\sigma} \right) \right] + \frac{\sigma}{\sqrt{2\pi}} \, e^{-(M - \mu)^2 / 2\sigma^2},$$
$$E(X_-) = \frac{\mu}{2} \left[ 1 + \operatorname{erf}\left( \frac{\sqrt{2} (M - \mu)}{2\sigma} \right) \right] - \frac{\sigma}{\sqrt{2\pi}} \, e^{-(M - \mu)^2 / 2\sigma^2}. \quad (38)$$

Substituting these into (28) and solving, we get $M = \mu$. Assuming $\mu = 0$ and $\sigma = 1$, we get

$$M = 0, \qquad \mu_1(M) = \frac{E(X_-)}{P(X_-)} = -\sqrt{\frac{2}{\pi}}, \qquad \mu_2(M) = \frac{E(X_+)}{P(X_+)} = \sqrt{\frac{2}{\pi}}. \quad (39)$$
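The constants in (35) and (39) are easy to verify numerically from the closed-form moments above; the short sketch below (function names are ours) checks that each reported $M$ is indeed the midpoint of the two conditional means:

```python
from math import erf, exp, log, pi, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def mu_pair_lognormal(M, mu=0.0, s=1.0):
    # Conditional means below/above M for the log-normal, from (33)-(34).
    p_minus = Phi((log(M) - mu) / s)
    e_total = exp(mu + 0.5 * s * s)                     # E(X)
    e_minus = e_total * Phi((log(M) - mu - s * s) / s)  # E(X-)
    return e_minus / p_minus, (e_total - e_minus) / (1.0 - p_minus)

def mu_pair_normal(M, mu=0.0, s=1.0):
    # Conditional means below/above M for N(mu, s^2), from (37)-(38).
    p_minus = Phi((M - mu) / s)
    dens = exp(-(M - mu) ** 2 / (2 * s * s)) / (s * sqrt(2 * pi))
    e_minus = mu * p_minus - s * s * dens               # E(X-)
    return e_minus / p_minus, (mu - e_minus) / (1.0 - p_minus)

m1, m2 = mu_pair_lognormal(4.244942583)
print(m1, m2, 0.5 * (m1 + m2))   # ~1.1968, ~7.2931, ~4.2449, matching (35)

n1, n2 = mu_pair_normal(0.0)
print(n1, n2)                    # ~-0.7979, ~0.7979, i.e. -/+sqrt(2/pi), matching (39)
```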

2.2.4. Discrete Case

Let $X$ be a discrete random variable, and assume we have a set of $N$ observations from $X$. Let $p_j$ be the mass density function for an observation $x_j$. Let $x_{i-1} < M_{i-1} < x_i$ and $x_i < M_i < x_{i+1}$. We define $\mu_{x \le x_i}$ as the mean of all $x \le x_i$ and $\mu_{x \ge x_i}$ as the mean of all $x \ge x_i$, and we define $\operatorname{Var} 2(M_{i-1})$ as the variance of the two centers induced by $M_{i-1}$ as a separator:

$$\operatorname{Var} 2(M_{i-1}) = \sum_{x_j \le x_{i-1}} \left( x_j - \mu_{x_j \le x_{i-1}} \right)^2 p_j + \sum_{x_j \ge x_i} \left( x_j - \mu_{x_j \ge x_i} \right)^2 p_j, \quad (40)$$

$$\operatorname{Var} 2(M_i) = \sum_{x_j \le x_i} \left( x_j - \mu_{x_j \le x_i} \right)^2 p_j + \sum_{x_j \ge x_{i+1}} \left( x_j - \mu_{x_j \ge x_{i+1}} \right)^2 p_j. \quad (41)$$

The first part of $\operatorname{Var} 2(M_{i-1})$ simplifies as follows:

$$\sum_{x_j \le x_{i-1}} \left( x_j - \mu_{x_j \le x_{i-1}} \right)^2 p_j = \sum_{x_j \le x_{i-1}} x_j^2 p_j - 2 \mu_{x_j \le x_{i-1}} \sum_{x_j \le x_{i-1}} x_j p_j + \mu_{x_j \le x_{i-1}}^2 \sum_{x_j \le x_{i-1}} p_j = E\left( X^2 \cdot 1_{x_j \le x_{i-1}} \right) - \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})}. \quad (42)$$

Similarly,

$$\sum_{x_j \ge x_i} \left( x_j - \mu_{x_j \ge x_i} \right)^2 p_j = E\left( X^2 \cdot 1_{x_j \ge x_i} \right) - \frac{E^2\left( X \cdot 1_{x_j \ge x_i} \right)}{P(X \ge x_i)},$$

so that

$$\operatorname{Var} 2(M_{i-1}) = E\left( X^2 \cdot 1_{x_j \le x_{i-1}} \right) - \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + E\left( X^2 \cdot 1_{x_j \ge x_i} \right) - \frac{E^2\left( X \cdot 1_{x_j \ge x_i} \right)}{P(X \ge x_i)},$$

$$\operatorname{Var} 2(M_i) = E\left( X^2 \cdot 1_{x_j \le x_i} \right) - \frac{E^2\left( X \cdot 1_{x_j \le x_i} \right)}{P(X \le x_i)} + E\left( X^2 \cdot 1_{x_j \ge x_{i+1}} \right) - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})}. \quad (43)$$

Define $\Delta \operatorname{Var} 2 = \operatorname{Var} 2(M_i) - \operatorname{Var} 2(M_{i-1})$. The $E(X^2 \cdot 1)$ terms cancel, because $E(X^2 \cdot 1_{x_j \le x_i}) - E(X^2 \cdot 1_{x_j \le x_{i-1}}) = x_i^2 p_i = E(X^2 \cdot 1_{x_j \ge x_i}) - E(X^2 \cdot 1_{x_j \ge x_{i+1}})$, so

$$\Delta \operatorname{Var} 2 = \left[ -\frac{E^2\left( X \cdot 1_{x_j \le x_i} \right)}{P(X \le x_i)} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right] - \left[ -\frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E^2\left( X \cdot 1_{x_j \ge x_i} \right)}{P(X \ge x_i)} \right]. \quad (44)$$

Now, in order to simplify the calculation, we rearrange $\Delta \operatorname{Var} 2$ as follows:

$$\Delta \operatorname{Var} 2 = \left[ \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E^2\left( X \cdot 1_{x_j \le x_i} \right)}{P(X \le x_i)} \right] + \left[ \frac{E^2\left( X \cdot 1_{x_j \ge x_i} \right)}{P(X \ge x_i)} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right]. \quad (45)$$

The first part is simplified as follows:

$$\frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E^2\left( X \cdot 1_{x_j \le x_i} \right)}{P(X \le x_i)} = \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{\left[ E\left( X \cdot 1_{x_j \le x_{i-1}} \right) + x_i p_i \right]^2}{P(X \le x_{i-1}) + p_i}$$
$$= \frac{p_i E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right) - 2 x_i p_i E\left( X \cdot 1_{x_j \le x_{i-1}} \right) P(X \le x_{i-1}) - x_i^2 p_i^2 P(X \le x_{i-1})}{P(X \le x_{i-1}) \left[ P(X \le x_{i-1}) + p_i \right]}. \quad (46)$$

Similarly, simplifying the second part yields

$$\frac{E^2\left( X \cdot 1_{x_j \ge x_i} \right)}{P(X \ge x_i)} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} = \frac{\left[ E\left( X \cdot 1_{x_j \ge x_{i+1}} \right) + x_i p_i \right]^2}{P(X \ge x_{i+1}) + p_i} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})}$$
$$= \frac{-p_i E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right) + 2 x_i p_i E\left( X \cdot 1_{x_j \ge x_{i+1}} \right) P(X \ge x_{i+1}) + x_i^2 p_i^2 P(X \ge x_{i+1})}{P(X \ge x_{i+1}) \left[ P(X \ge x_{i+1}) + p_i \right]}. \quad (47)$$

In general, these types of optimization arise when the dataset is very big. If we consider $N$ to be very large, then $p_i \ll \max\left( \sum_{j \ge i+1} p_j, \sum_{j \le i-1} p_j \right)$, so $P(X \le x_{i-1}) + p_i \approx P(X \le x_{i-1})$ and $P(X \ge x_{i+1}) + p_i \approx P(X \ge x_{i+1})$, and

$$\Delta \operatorname{Var} 2 \approx \left[ \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right) p_i}{P^2(X \le x_{i-1})} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right) p_i}{P^2(X \ge x_{i+1})} \right] - 2 p_i x_i \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right] - p_i^2 x_i^2 \left[ \frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})} \right]. \quad (48)$$

As $\lim_{N \to \infty} p_i^2 x_i^2 \left[ \frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})} \right] = 0$, we can further simplify $\Delta \operatorname{Var} 2$:

$$\Delta \operatorname{Var} 2 \approx \left[ \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right) p_i}{P^2(X \le x_{i-1})} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right) p_i}{P^2(X \ge x_{i+1})} \right] - 2 p_i x_i \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right]$$
$$= p_i \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right] \left[ \left( \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right) - 2 x_i \right]. \quad (49)$$

To find the minimum of $\Delta \operatorname{Var} 2$, let us locate where $\Delta \operatorname{Var} 2 = 0$:

$$p_i \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right] \left[ \left( \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right) - 2 x_i \right] = 0. \quad (50)$$

[Figure 3: Discrete random variables.]

But $p_i \ne 0$, and the first bracket satisfies

$$\frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} < 0,$$

since $(x_i)$ is an increasing sequence, which implies that $\mu_{i-1} < \mu_i$, where

$$\mu_{i-1} = \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})}, \qquad \mu_i = \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})}. \quad (51)$$

Therefore,

$$\left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right] - 2 x_i = 0. \quad (52)$$

Let us define

$$F(x_i) = 2 x_i - \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right].$$

Then

$$F(x_i) = 0 \;\Longrightarrow\; x_i = \frac{1}{2} \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right] = \frac{\mu_{i-1} + \mu_i}{2}, \quad (53)$$

where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively. To find the minimum of the total variation, it is enough to find a good separator $M$ between consecutive points $x_i$ and $x_{i+1}$ such that $F(x_i) \cdot F(x_{i+1}) < 0$ and $F(x_i) < 0$ (see Figure 3). After finding the few local minima identified in this way, we obtain the global minimum by taking the smallest among them.
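A compact sketch of this discrete search (ours, not the authors' code): after sorting, prefix sums make the two conditional means in (53), and the two-class variance itself, O(1) per candidate separator, so the global minimum over all separators is found in a single O(n) scan.

```python
import numpy as np

def optimized_two_means(x):
    """Exact two-class 1-D k-means: scan every separator and keep the one
    with the smallest two-class variance, cf. eqs. (40)-(53)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    s1 = np.cumsum(x)          # prefix sums of x
    s2 = np.cumsum(x * x)      # prefix sums of x^2
    best_var, best_i = np.inf, 1
    for i in range(1, n):      # left class = x[:i], right class = x[i:]
        nl, nr = i, n - i
        sl, sr = s1[i - 1], s1[-1] - s1[i - 1]
        ql, qr = s2[i - 1], s2[-1] - s2[i - 1]
        # Within-class sums of squared deviations: sum(x^2) - (sum x)^2 / count.
        var2 = (ql - sl * sl / nl) + (qr - sr * sr / nr)
        if var2 < best_var:
            best_var, best_i = var2, i
    mu1 = s1[best_i - 1] / best_i                    # mean of the left class
    mu2 = (s1[-1] - s1[best_i - 1]) / (n - best_i)   # mean of the right class
    M = 0.5 * (mu1 + mu2)      # the optimal separator satisfies (26)/(53)
    return M, mu1, mu2, best_var
```

The same scan can instead track sign changes of $F(x_i)$, as described above; both views reach the global optimum because every candidate separator is examined.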

3. Face Localization Using the Proposed Method

The superiority of the modified K-means algorithm over the standard algorithm in terms of finding a global minimum is now clear, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm; the image contains a face part, a normal background, and a background with illumination. We then split the image pixels into two classes, one from 0 to $x_1$ and another from $x_1$ to 255, by applying a threshold determined by the algorithm. Let $[0, x_1]$ be class 1, which represents the nonface part, and let $[x_1, 255]$ be class 2, which represents the face with the shade.

[Figure 4: The original image.]

[Figure 5: The face with shade only.]

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong both to the face part and to the illuminated part of the background, as shown in Figure 5.

In order to remove the illuminated background part, another threshold is applied to the remaining pixels using the algorithm, meant to separate the face part from the illuminated background part. New classes are obtained, from $x_1$ to $x_2$ and from $x_2$ to 255; the pixels with values less than the threshold are assigned the value 0, and the nonzero pixels left represent the face part. Let $[x_1, x_2]$ be class 1, which represents the illuminated background part, and let $[x_2, 255]$ be class 2, which represents the face part. The result of removing the illuminated background part is shown in Figure 6.

[Figure 6: The face without illumination.]

[Figure 7: The filtered image.]

One can see that some face pixels were assigned the value 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7. Figure 8 shows the final result, which is a window containing the face extracted from the original image.
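The whole localization pipeline can be sketched as follows (illustrative only: `optimized_two_means` is the 1-D routine sketched in Section 2.2.4, the input is assumed to be a 2-D grayscale array, and the median filter stands in for the unspecified filtering step):

```python
import numpy as np
from scipy.ndimage import median_filter

def localize_face(gray):
    """Two-pass thresholding with the optimized two-means separator.
    Pass 1 removes the dark nonface background; pass 2 removes the
    illuminated background; a filter then suppresses residual noise."""
    pixels = gray.reshape(-1).astype(float)      # reshape the image into a vector

    # Pass 1: split [0, 255] at x1; [0, x1] = nonface, [x1, 255] = face + shade.
    x1, _, _, _ = optimized_two_means(pixels)

    # Pass 2: re-cluster only the kept pixels; split [x1, 255] at x2,
    # with [x1, x2] = illuminated background and [x2, 255] = face part.
    x2, _, _, _ = optimized_two_means(pixels[pixels >= x1])

    # Filtering step (cf. Figure 7), then the ROI bounding window (Figure 8).
    mask = median_filter((gray >= x2).astype(np.uint8), size=5).astype(bool)
    rows, cols = np.nonzero(mask)
    return gray[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
```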

4. Results

In this section, the experimental results of the proposed algorithm are analyzed and discussed, starting with a description of the datasets used in this work, followed by the results. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has an image size of 115 × 115, and 300 images were selected from this dataset covering different conditions such as light variation, pose, and background variation. The second dataset is the BioID dataset, with an image size of 384 × 286; 300 images were selected from it covering different conditions such as indoor/outdoor settings. The last dataset is the Caltech dataset, with an image size of 896 × 592; 300 images were selected from it, likewise covering indoor/outdoor conditions.

[Figure 8: The corresponding window in the original image.]

4.1. MIT-CBCL Dataset

This dataset was established by the Center for Biological and Computational Learning (CBCL) at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person are taken under different light conditions, with cluttered backgrounds, at different scales, and in different poses. A few examples of these images are shown in Figure 9.

[Figure 9: Samples from the MIT-CBCL dataset.]

4.2. BioID Dataset

This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each image shows the frontal view of the face of one of 23 different test persons. The images were taken under different lighting conditions, with complex backgrounds, and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).

4.3. Caltech Dataset

This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions, and against complex backgrounds (see Figure 11).

[Figure 10: Sample faces from the BioID dataset for localization.]

[Figure 11: Sample faces from the Caltech dataset for localization.]

4.4. Proposed Method Results

The proposed method was evaluated on these datasets, where the result was 100% localization accuracy with a localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we focused on images of faces viewed from different angles, from 30 degrees left to 30 degrees right, on variation in the background, and on both indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset, and two other sets from the other databases, to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results show the efficiency of the proposed modified K-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter was considered in this test: the localization time. The results show that the proposed method can locate the face in a very short time, because it has lower complexity owing to the use of differential equations.

Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and cluttered backgrounds, as well as indoor/outdoor settings.

| Dataset   | Images | Accuracy (%) |
|-----------|--------|--------------|
| MIT-Set 1 | 150    | 100          |
| MIT-Set 2 | 150    | 100          |
| BioID     | 300    | 100          |
| Caltech   | 300    | 100          |

[Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.]

[Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) the face position, and (c) the filtered image.]

Figure 12 shows some examples of face localization on MIT-CBCL. In Figure 13(a), an original image from the BioID dataset is shown; the localized position is shown in Figure 13(b). A filtering operation is then needed to remove the unwanted parts, as shown in Figure 13(c), after which the area of the ROI is determined. Figure 14 shows the face position in an image from the Caltech dataset.

5. Conclusion

In this paper we focused on developing ROI detection by suggesting a new method for face localization. In this method, the input image is reshaped into a vector, and the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT-CBCL, BioID, and Caltech datasets, differing in illumination and backgrounds as well as in indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieves significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is simply a generalization of our approach to higher dimensions, via (8), and to more subclusters.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


[Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) the face position, and (c) the filtered image.]

Acknowledgments

This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.

References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling—Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL Face Recognition Database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID: BioID Support, Downloads, BioID Face Database, 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html



Bagirov [35] proposed a new version of the global k-means algorithm for minimum sum-of-squares clustering problems. He also compared three different versions of the k-means algorithm to propose the modified version of the global k-means algorithm. The proposed algorithm computes clusters incrementally, and the k − 1 cluster centers from the previous iteration are used to compute the k-partition of a dataset. The starting point of the kth cluster center is computed by minimizing an auxiliary cluster function. Given a finite set of points A in the n-dimensional space, the global k-means computes the centroid X1 of the set A by the following equation:

$$X_1 = \frac{1}{m}\sum_{i=1}^{m} a_i, \qquad a_i \in A,\; i = 1,\ldots,m. \quad (4)$$

Numerical experiments performed on 14 datasets demonstrate the efficiency of the modified global k-means algorithm in comparison to the multistart k-means (MS k-means) and the GKM algorithms when the number of clusters k > 5. The modified global k-means, however, requires more computational time than the global k-means algorithm. Xie and Jiang [36] also proposed a novel version of the global k-means algorithm, in which the method of creating the next cluster center is modified; this showed a better execution time without affecting the performance of the global k-means algorithm. Another extension of the standard k-means clustering is the global kernel k-means [37], which optimizes the clustering error in feature space by employing kernel k-means as a local search procedure. A kernel function is used in order to solve the M-clustering problem, and near-optimal solutions are used to optimize the clustering error in the feature space. This incremental algorithm has a high ability to identify nonlinearly separable clusters in input space and no dependence on cluster initialization, and it can handle weighted data points, making it suitable for graph partitioning applications. Two major modifications were performed in order to reduce the computational cost with no major effect on the solution quality.
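To make the incremental strategy concrete, here is a minimal Python sketch of the global k-means idea described above (our simplified illustration, not Bagirov's or the authors' implementation; the function names are ours, and the O(m²) candidate scan is kept naive for clarity):

```python
import numpy as np

def lloyd(A, centers, n_iter=20):
    """Standard k-means (Lloyd) iterations from the given initial centers."""
    for _ in range(n_iter):
        labels = ((A[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        centers = np.array([A[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(len(centers))])
    return centers

def global_kmeans_sketch(A, k):
    """Grow from 1 to k centers; the first center is the centroid X1 of (4)."""
    centers = A.mean(0, keepdims=True)              # X1 = (1/m) sum_i a_i
    for _ in range(1, k):
        # Squared distance of each point to its nearest existing center.
        d2 = ((A[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(1)
        # Auxiliary function: error reduction if a new center sits at a_j.
        pair = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)
        gain = np.maximum(d2[None, :] - pair, 0).sum(1)
        start = A[gain.argmax()]                    # best starting point
        centers = lloyd(A, np.vstack([centers, start]))
    return centers
```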

Video imaging and image segmentation are important applications for k-means clustering. Hung et al. [38] modified the k-means algorithm for color image segmentation, where a weight selection procedure in the W-k-means algorithm is proposed. The evaluation function F(I) is used for comparison:

$$F(I) = \sqrt{R}\,\sum_{i=1}^{R} \frac{(e_i/256)^2}{\sqrt{A_i}}, \quad (5)$$

where I is the segmented image, R is the number of regions in the segmented image, A_i is the area (the number of pixels) of the ith region, and e_i is the color error of region i; e_i is the sum of the Euclidean distances of the color vectors between the original image and the segmented image. Smaller values of F(I) give better segmentation results. Results from color image segmentation illustrate that the proposed procedure produces better segmentation than the random initialization. Maliatski and Yadid-Pecht [39] also propose an adaptive k-means clustering hardware-driven approach.
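Read this way, (5) can be evaluated directly; the sketch below is our illustrative reading of the reconstructed formula (the per-pixel region labels are assumed to come from whatever segmentation is being scored):

```python
import numpy as np

def evaluation_F(original, segmented, labels):
    """Evaluation function F(I) of (5); smaller values mean better segmentation.

    original, segmented: (H, W, 3) float color images.
    labels: (H, W) int array assigning each pixel to one of R regions.
    """
    total = 0.0
    regions = np.unique(labels)
    for r in regions:
        mask = labels == r
        A_i = mask.sum()                          # area of region i, in pixels
        # e_i: sum of Euclidean color distances between the two images.
        diff = original[mask] - segmented[mask]
        e_i = np.sqrt((diff ** 2).sum(1)).sum()
        total += (e_i / 256.0) ** 2 / np.sqrt(A_i)
    return np.sqrt(len(regions)) * total
```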

Figure 2: The separation points in two dimensions (axes X and Y; separation points Mx1, Mx, Mx2 and My1, My, My2; class centers C1, C2).

The algorithm uses both pixel intensity and pixel distance in the clustering process. In [40], an FPGA implementation of real-time k-means clustering is done for colored images, and a filtering algorithm is used for the hardware implementation. Sulaiman and Isa [11] presented a unique clustering algorithm called adaptive fuzzy-K-means clustering for image segmentation. It can be used for different types of images, including medical images, microscopic images, and CCD camera images. The concepts of fuzziness and belongingness are applied to produce more adaptive clustering as compared to other conventional clustering algorithms. The proposed adaptive fuzzy-K-means clustering algorithm provides better segmentation and visual quality for a range of images.

2.2. The Proposed Methodology. In this method the face location is determined automatically by using an optimized K-means algorithm. First, the input image is reshaped into a vector of pixel values, and the pixels are then clustered into two classes by applying a certain threshold determined by the algorithm. Pixels with values below the threshold are put in the nonface class; the rest of the pixels are put in the face class. However, some unwanted parts may get clustered into this face class; therefore, the algorithm is applied again to the face class to obtain only the face part.

2.2.1. Optimized K-Means Algorithm. The proposed modified K-means algorithm is intended to overcome the limitations of the standard version by using differential equations to determine the optimum separation point. The algorithm finds the optimal centers for each cluster, corresponding to the global minimum of the k-means objective. As an example, say we would like to cluster an image into two subclasses: face and nonface. So we look for a separator that will split the image into two different clusters in two dimensions; then the means of the clusters can be found easily. Figure 2 shows the separation points and the mean of each class. To find these


points, we propose the following modification in the continuous and discrete cases.

Let x1, x2, x3, ..., xN be a dataset and let xi be an observation. Let f(x) be the probability mass function (PMF) of x:

$$f(x) = \frac{1}{N}\sum_{i=1}^{N} \delta(x - x_i), \qquad x_i \in \mathbb{R}, \quad (6)$$

where δ(x) is the Dirac function.

When the size of the data is very large, the question of the cost of the minimization of $\sum_{i=1}^{k}\sum_{x_i \in s_i} |x_i - \mu_i|^2$ arises, where k is the number of clusters and μi is the mean of cluster si. Let

$$\operatorname{Var} k = \sum_{i=1}^{k}\sum_{x_i \in s_i} |x_i - \mu_i|^2. \quad (7)$$

Then it is enough to minimize Var 1 (the one-dimensional case) without losing generality, since

$$\min_{s_i} \sum_{i=1}^{k}\sum_{x_j \in s_i} \left|x_j - \mu_i\right|^2 \;\ge\; \sum_{l=1}^{d} \min_{s_i}\left(\sum_{i=1}^{k}\sum_{x_j \in s_i} \left|x_j^l - \mu_i^l\right|^2\right), \quad (8)$$

where d is the dimension of $x_j = (x_j^1, \ldots, x_j^d)$ and $\mu_j = (\mu_j^1, \ldots, \mu_j^d)$.

2.2.2. Continuous Case. For the continuous case, f(x) is like the probability density function (PDF). Then for 2 classes we need to minimize the following equation:

$$\operatorname{Var} 2 = \int_{-\infty}^{M} (x-\mu_1)^2 f(x)\,dx + \int_{M}^{\infty} (x-\mu_2)^2 f(x)\,dx. \quad (9)$$

If d ≥ 2, we can use (8):

$$\min \operatorname{Var} 2 \;\ge\; \sum_{k=1}^{d} \min\left(\int_{-\infty}^{M} (x^k - \mu_1^k)^2 f(x^k)\,dx + \int_{M}^{\infty} (x^k - \mu_2^k)^2 f(x^k)\,dx\right). \quad (10)$$

Let x ∈ ℝ be any random variable with probability density function f(x); we want to find μ1 and μ2 that minimize (7):

$$(\mu_1^{*}, \mu_2^{*}) = \arg_{(\mu_1,\mu_2)}\left[\min_{M,\mu_1,\mu_2}\left[\int_{-\infty}^{M}(x-\mu_1)^2 f(x)\,dx + \int_{M}^{\infty}(x-\mu_2)^2 f(x)\,dx\right]\right], \quad (11)$$

$$\min_{M,\mu_1,\mu_2} \operatorname{Var} 2 = \min_{M}\left[\min_{\mu_1}\int_{-\infty}^{M}(x-\mu_1)^2 f(x)\,dx + \min_{\mu_2}\int_{M}^{\infty}(x-\mu_2)^2 f(x)\,dx\right] \quad (12)$$

$$= \min_{M}\left[\min_{\mu_1}\int_{-\infty}^{M}\bigl(x-\mu_1(M)\bigr)^2 f(x)\,dx + \min_{\mu_2}\int_{M}^{\infty}\bigl(x-\mu_2(M)\bigr)^2 f(x)\,dx\right]. \quad (13)$$

We know that

$$\min_{x} E\left[(X - x)^2\right] = E\left[(X - E(X))^2\right] = \operatorname{Var}(X), \quad (14)$$

so we can find μ1 and μ2 in terms of M. Let us define Var1 and Var2 as follows:

$$\operatorname{Var}_1 = \int_{-\infty}^{M}(x-\mu_1)^2 f(x)\,dx, \qquad \operatorname{Var}_2 = \int_{M}^{\infty}(x-\mu_2)^2 f(x)\,dx. \quad (15)$$

After simplification we get

$$\operatorname{Var}_1 = \int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2\int_{-\infty}^{M} f(x)\,dx - 2\mu_1\int_{-\infty}^{M} x f(x)\,dx,$$
$$\operatorname{Var}_2 = \int_{M}^{\infty} x^2 f(x)\,dx + \mu_2^2\int_{M}^{\infty} f(x)\,dx - 2\mu_2\int_{M}^{\infty} x f(x)\,dx. \quad (16)$$

To find the minimum, Var1 and Var2 are partially differentiated with respect to μ1 and μ2, respectively. After simplification we conclude that the minimization occurs when

$$\mu_1 = \frac{E(X^-)}{P(X^-)}, \qquad \mu_2 = \frac{E(X^+)}{P(X^+)}, \quad (17)$$

where $X^- = X \cdot 1_{x<M}$ and $X^+ = X \cdot 1_{x\ge M}$.
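In other words, (17) states that, for a fixed separator M, the optimal centers are simply the conditional means of the two sides:

$$\mu_1 = \frac{E(X \cdot 1_{x<M})}{P(X<M)} = E(X \mid X < M), \qquad \mu_2 = \frac{E(X \cdot 1_{x\ge M})}{P(X \ge M)} = E(X \mid X \ge M).$$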


To find the minimum of Var 2 we need the derivatives ∂μ/∂M:

$$\mu_1 = \frac{\int_{-\infty}^{M} x f(x)\,dx}{\int_{-\infty}^{M} f(x)\,dx} \;\Longrightarrow\; \frac{\partial\mu_1}{\partial M} = \frac{M f(M)\,P(X^-) - f(M)\,E(X^-)}{P^2(X^-)},$$

$$\mu_2 = \frac{\int_{M}^{\infty} x f(x)\,dx}{\int_{M}^{\infty} f(x)\,dx} \;\Longrightarrow\; \frac{\partial\mu_2}{\partial M} = \frac{f(M)\,E(X^+) - M f(M)\,P(X^+)}{P^2(X^+)}. \quad (18)$$

After simplification we get

$$\frac{\partial\mu_1}{\partial M} = \frac{f(M)}{P^2(X^-)}\left[M P(X^-) - E(X^-)\right], \qquad \frac{\partial\mu_2}{\partial M} = \frac{f(M)}{P^2(X^+)}\left[E(X^+) - M P(X^+)\right] \quad (19)$$

(the signs for μ2 follow from differentiating the tail integrals, since $\frac{d}{dM}\int_M^{\infty} x f(x)\,dx = -M f(M)$ and $\frac{d}{dM}\int_M^{\infty} f(x)\,dx = -f(M)$).

In order to find the minimum of Var 2 we will find ∂Var1/∂M and ∂Var2/∂M separately and add them:

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = \frac{\partial}{\partial M}(\operatorname{Var}_1) + \frac{\partial}{\partial M}(\operatorname{Var}_2),$$

$$\frac{\partial\operatorname{Var}_1}{\partial M} = \frac{\partial}{\partial M}\left(\int_{-\infty}^{M} x^2 f(x)\,dx + \mu_1^2\int_{-\infty}^{M} f(x)\,dx - 2\mu_1\int_{-\infty}^{M} x f(x)\,dx\right)$$
$$= M^2 f(M) + 2\mu_1\mu_1' P(X^-) + \mu_1^2 f(M) - 2\mu_1' E(X^-) - 2\mu_1 M f(M), \quad (20)$$

where the μ1′ terms cancel, since $2\mu_1'\bigl(\mu_1 P(X^-) - E(X^-)\bigr) = 0$ by (17). After simplification we get

$$\frac{\partial\operatorname{Var}_1}{\partial M} = f(M)\left[M^2 + \frac{E^2(X^-)}{P(X^-)^2} - 2M\,\frac{E(X^-)}{P(X^-)}\right]. \quad (21)$$

Similarly,

$$\frac{\partial\operatorname{Var}_2}{\partial M} = -M^2 f(M) + 2\mu_2\mu_2' P(X^+) - \mu_2^2 f(M) - 2\mu_2' E(X^+) + 2\mu_2 M f(M), \quad (22)$$

and after simplification we get

$$\frac{\partial\operatorname{Var}_2}{\partial M} = f(M)\left[-M^2 - \frac{E^2(X^+)}{P(X^+)^2} + 2M\,\frac{E(X^+)}{P(X^+)}\right]. \quad (23)$$

Adding these two derivatives and simplifying, it follows that

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = f(M)\left[\left(\frac{E^2(X^-)}{P(X^-)^2} - \frac{E^2(X^+)}{P(X^+)^2}\right) - 2M\left(\frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)}\right)\right]. \quad (24)$$

Equating this with 0 gives

$$\frac{\partial}{\partial M}(\operatorname{Var} 2) = 0 \;\Longrightarrow\; \left[\frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)}\right]\left[\frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)} - 2M\right] = 0. \quad (25)$$

But $E(X^-)/P(X^-) - E(X^+)/P(X^+) \ne 0$, since otherwise μ1 = μ2, which contradicts the initial assumption. Therefore

$$\frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)} - 2M = 0 \;\Longrightarrow\; M = \frac{1}{2}\left[\frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)}\right] = \frac{\mu_1 + \mu_2}{2}. \quad (26)$$

Therefore we have to find all values of M that satisfy (26), which is easier than minimizing Var 2 directly.

To find the minimum in terms of M it is enough to derive

$$\frac{\partial \operatorname{Var} 2\bigl(\mu_1(M), \mu_2(M)\bigr)}{\partial M} = f(M)\left[\frac{E^2(X^-)}{P(X<M)^2} - \frac{E^2(X^+)}{P(X\ge M)^2} - 2M\left(\frac{E(X^-)}{P(X<M)} - \frac{E(X^+)}{P(X\ge M)}\right)\right]. \quad (27)$$

Then we find

$$M = \frac{1}{2}\left[\frac{E(X^+)}{P(X^+)} + \frac{E(X^-)}{P(X^-)}\right]. \quad (28)$$
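Equation (28) characterizes the optimal separator as a fixed point: M must equal the average of the two conditional means it induces. A minimal sketch of this fixed-point iteration on sampled data follows (our reading of the optimality condition, not the authors' code; it returns a fixed point of (26), which may be one of several local optima, so in general Var 2 should be compared across the candidates found):

```python
import numpy as np

def optimal_separator(x, M0=None, tol=1e-9, max_iter=100):
    """Iterate M <- (mu1(M) + mu2(M)) / 2, the fixed-point form of (26)/(28)."""
    x = np.asarray(x, dtype=float)
    M = x.mean() if M0 is None else M0
    mu1 = mu2 = M
    for _ in range(max_iter):
        left, right = x[x < M], x[x >= M]
        if len(left) == 0 or len(right) == 0:
            break                                   # degenerate split
        mu1, mu2 = left.mean(), right.mean()
        M_new = 0.5 * (mu1 + mu2)                   # (mu1 + mu2) / 2
        if abs(M_new - M) < tol:
            break
        M = M_new
    return M, mu1, mu2
```

On pixel intensities this amounts to an intensity-threshold search; for samples from a standard normal it settles near M = 0 with centers close to ±√(2/π), matching (39) below.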

2.2.3. Finding the Centers of Some Common Probability Density Functions

(1) Uniform Distribution. The probability density function of the continuous uniform distribution is

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b, \\[4pt] 0, & \text{otherwise}, \end{cases} \quad (29)$$


where a and b are the two parameters of the distribution. The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X^+) = \frac{1}{2}\left(\frac{b^2 - M^2}{b-a}\right), \qquad E(X^-) = \frac{1}{2}\left(\frac{M^2 - a^2}{b-a}\right). \quad (30)$$

Putting all these in (28) and solving, we get

$$M = \frac{1}{2}(a+b), \qquad \mu_1(M) = \frac{E(X^-)}{P(X^-)} = \frac{1}{4}(3a+b), \qquad \mu_2(M) = \frac{E(X^+)}{P(X^+)} = \frac{1}{4}(a+3b). \quad (31)$$
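A quick sanity check of (31): for the uniform density the conditional means are the midpoints of the two subintervals, μ1(M) = (a + M)/2 and μ2(M) = (M + b)/2, so the fixed-point condition (26) gives

$$M = \frac{1}{2}\left(\frac{a+M}{2} + \frac{M+b}{2}\right) \;\Longrightarrow\; 4M = a + b + 2M \;\Longrightarrow\; M = \frac{a+b}{2},$$

and substituting back gives μ1 = (3a + b)/4 and μ2 = (a + 3b)/4, exactly as in (31).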

(2) Log-Normal Distribution. The probability density function of a log-normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-(\ln x - \mu)^2 / 2\sigma^2}, \qquad x > 0. \quad (32)$$

The probabilities to the right and left of M are, respectively,

$$P(X^+) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\!\left(\frac{\ln M - \mu}{\sigma\sqrt{2}}\right), \qquad P(X^-) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\!\left(\frac{\ln M - \mu}{\sigma\sqrt{2}}\right). \quad (33)$$

The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X^+) = \frac{1}{2}\, e^{\mu + \sigma^2/2}\left[1 + \operatorname{erf}\!\left(\frac{\mu + \sigma^2 - \ln M}{\sigma\sqrt{2}}\right)\right],$$
$$E(X^-) = \frac{1}{2}\, e^{\mu + \sigma^2/2}\left[1 + \operatorname{erf}\!\left(\frac{-\mu - \sigma^2 + \ln M}{\sigma\sqrt{2}}\right)\right]. \quad (34)$$

Putting all these in (28), assuming μ = 0 and σ = 1, and solving, we get

$$M = 4.244942583, \qquad \mu_1(M) = \frac{E(X^-)}{P(X^-)} = 1.196827816, \qquad \mu_2(M) = \frac{E(X^+)}{P(X^+)} = 7.293057348. \quad (35)$$

(3) Normal Distribution. The probability density function of a normal distribution is

$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}. \quad (36)$$

The probabilities to the right and left of M are, respectively,

$$P(X^+) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\!\left(\frac{M - \mu}{\sigma\sqrt{2}}\right), \qquad P(X^-) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\!\left(\frac{M - \mu}{\sigma\sqrt{2}}\right). \quad (37)$$

The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are

$$E(X^+) = \frac{\mu}{2}\left[1 - \operatorname{erf}\!\left(\frac{M-\mu}{\sigma\sqrt{2}}\right)\right] + \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2/2\sigma^2},$$
$$E(X^-) = \frac{\mu}{2}\left[1 + \operatorname{erf}\!\left(\frac{M-\mu}{\sigma\sqrt{2}}\right)\right] - \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M-\mu)^2/2\sigma^2}. \quad (38)$$

Putting all these in (28) and solving, we get M = μ. Assuming μ = 0 and σ = 1 and solving, we get

$$M = 0, \qquad \mu_1(M) = \frac{E(X^-)}{P(X^-)} = -\sqrt{\frac{2}{\pi}}, \qquad \mu_2(M) = \frac{E(X^+)}{P(X^+)} = \sqrt{\frac{2}{\pi}}. \quad (39)$$
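The numbers in (35) and (39) can be checked directly from (33)–(34) and (37)–(38) with the standard-library error function. The sketch below is our own verification under the stated assumption μ = 0, σ = 1; it iterates the fixed-point form of (28) using the closed-form conditional means:

```python
import math

def fixed_point(mu1, mu2, M, n_iter=200):
    """Iterate M <- (mu1(M) + mu2(M)) / 2, the fixed-point form of (28)."""
    for _ in range(n_iter):
        M = 0.5 * (mu1(M) + mu2(M))
    return M

# Standard normal (mu = 0, sigma = 1): conditional means from (37)-(38).
phi = lambda t: math.exp(-t * t / 2) / math.sqrt(2 * math.pi)
Phi = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))
mu1_n = lambda M: -phi(M) / Phi(M)          # E(X^-) / P(X^-)
mu2_n = lambda M: phi(M) / (1 - Phi(M))     # E(X^+) / P(X^+)
print(fixed_point(mu1_n, mu2_n, 0.5))       # ~0; centers +-sqrt(2/pi) ~ +-0.798

# Standard log-normal: E(X^+), P(X^+) from (33)-(34), with E(X) = e^{1/2}.
E_plus = lambda M: 0.5 * math.exp(0.5) * (1 + math.erf((1 - math.log(M)) / math.sqrt(2)))
P_plus = lambda M: 0.5 - 0.5 * math.erf(math.log(M) / math.sqrt(2))
mu2_l = lambda M: E_plus(M) / P_plus(M)
mu1_l = lambda M: (math.exp(0.5) - E_plus(M)) / (1 - P_plus(M))
print(fixed_point(mu1_l, mu2_l, 2.0))       # ~4.2449, as in (35)
```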

2.2.4. Discrete Case. Let X be a discrete random variable, and assume we have a set of N observations from X. Let p_j be the mass density function for an observation x_j.

Let $x_{i-1} < M_{i-1} < x_i$ and $x_i < M_i < x_{i+1}$. We define $\mu_{x \le x_i}$ as the mean for all x ≤ x_i, and $\mu_{x \ge x_i}$ as the mean for all x ≥ x_i. Define Var 2(M_{i-1}) as the variance of the two centers forced by M_{i-1} as a separator:

$$\operatorname{Var} 2(M_{i-1}) = \sum_{x_j \le x_{i-1}} \bigl(x_j - \mu_{x_j \le x_{i-1}}\bigr)^2 p_j + \sum_{x_j \ge x_i} \bigl(x_j - \mu_{x_j \ge x_i}\bigr)^2 p_j, \quad (40)$$

$$\operatorname{Var} 2(M_i) = \sum_{x_j \le x_i} \bigl(x_j - \mu_{x_j \le x_i}\bigr)^2 p_j + \sum_{x_j \ge x_{i+1}} \bigl(x_j - \mu_{x_j \ge x_{i+1}}\bigr)^2 p_j. \quad (41)$$
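Equations (40) and (41) differ only in where the single point x_i is assigned, which is what makes a linear scan practical: with prefix sums of p_j, p_j x_j, and p_j x_j², every candidate Var 2(M_i) follows in O(n) total, per the identities in (42)–(43) below. A minimal sketch (our own rendering, with hypothetical helper names) is:

```python
import numpy as np

def all_var2(xs, ps):
    """Var 2(M_i) of (40)-(41) for every separator between x_i and x_{i+1},
    computed in O(n) via prefix sums; xs are sorted values, ps their masses."""
    xs, ps = np.asarray(xs, float), np.asarray(ps, float)
    P = np.cumsum(ps)                      # P(X <= x_i)
    S1 = np.cumsum(ps * xs)                # E(X * 1_{x <= x_i})
    S2 = np.cumsum(ps * xs ** 2)           # E(X^2 * 1_{x <= x_i})
    left = S2 - S1 ** 2 / P                # left-cluster variance, cf. (42)
    Pr, S1r, S2r = P[-1] - P, S1[-1] - S1, S2[-1] - S2
    with np.errstate(divide="ignore", invalid="ignore"):
        right = S2r - S1r ** 2 / Pr        # right-cluster variance
    return (left + right)[:-1]             # separators i = 0, ..., n-2
```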


The first part of Var 2(M_{i-1}) simplifies as follows:

$$\sum_{x_j \le x_{i-1}} \bigl(x_j - \mu_{x_j \le x_{i-1}}\bigr)^2 p_j = \sum_{x_j \le x_{i-1}} x_j^2 p_j - 2\mu_{x_j \le x_{i-1}} \sum_{x_j \le x_{i-1}} x_j p_j + \mu^2_{x_j \le x_{i-1}} \sum_{x_j \le x_{i-1}} p_j$$
$$= E\bigl(X^2 \cdot 1_{x_j \le x_{i-1}}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})}. \quad (42)$$

Similarly,

$$\sum_{x_j \ge x_i} \bigl(x_j - \mu_{x_j \ge x_i}\bigr)^2 p_j = E\bigl(X^2 \cdot 1_{x_j \ge x_i}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_i}\bigr)}{P(X \ge x_i)},$$

so that

$$\operatorname{Var} 2(M_{i-1}) = E\bigl(X^2 \cdot 1_{x_j \le x_{i-1}}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} + E\bigl(X^2 \cdot 1_{x_j \ge x_i}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_i}\bigr)}{P(X \ge x_i)},$$

$$\operatorname{Var} 2(M_i) = E\bigl(X^2 \cdot 1_{x_j \le x_i}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \le x_i}\bigr)}{P(X \le x_i)} + E\bigl(X^2 \cdot 1_{x_j \ge x_{i+1}}\bigr) - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}. \quad (43)$$

Define ΔVar 2 = Var 2(M_i) − Var 2(M_{i-1}). Since the $E(X^2 \cdot 1)$ terms cancel,

$$\Delta\operatorname{Var} 2 = \left[-\frac{E^2\bigl(X \cdot 1_{x_j \le x_i}\bigr)}{P(X \le x_i)} - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right] - \left[-\frac{E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_i}\bigr)}{P(X \ge x_i)}\right]. \quad (44)$$

Now, in order to simplify the calculation, we rearrange ΔVar 2 as follows:

$$\Delta\operatorname{Var} 2 = \left[\frac{E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} - \frac{E^2\bigl(X \cdot 1_{x_j \le x_i}\bigr)}{P(X \le x_i)}\right] + \left[\frac{E^2\bigl(X \cdot 1_{x_j \ge x_i}\bigr)}{P(X \ge x_i)} - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right]. \quad (45)$$

The first part is simplified as follows:

$$\frac{E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} - \frac{E^2\bigl(X \cdot 1_{x_j \le x_i}\bigr)}{P(X \le x_i)} = \frac{E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} - \frac{\left[E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr) + x_i p_i\right]^2}{P(X \le x_{i-1}) + p_i}$$
$$= \frac{p_i E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr) - 2 x_i p_i E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr) P(X \le x_{i-1}) - x_i^2 p_i^2 P(X \le x_{i-1})}{P(X \le x_{i-1})\left[P(X \le x_{i-1}) + p_i\right]}. \quad (46)$$

Similarly, simplifying the second part yields

$$\frac{E^2\bigl(X \cdot 1_{x_j \ge x_i}\bigr)}{P(X \ge x_i)} - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})} = \frac{\left[E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr) + x_i p_i\right]^2}{P(X \ge x_{i+1}) + p_i} - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}$$
$$= \frac{-p_i E^2\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr) + 2 x_i p_i E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr) P(X \ge x_{i+1}) + x_i^2 p_i^2 P(X \ge x_{i+1})}{P(X \ge x_{i+1})\left[P(X \ge x_{i+1}) + p_i\right]}. \quad (47)$$

In general these types of optimization arise when the dataset is very big. If we consider N to be very large, then $p_i \ll \max\bigl(\sum_{j \ge i+1} p_j, \sum_{j \le i-1} p_j\bigr)$, $P(X \le x_{i-1}) + p_i \approx P(X \le x_i)$, and $P(X \ge x_{i+1}) + p_i \approx P(X \ge x_i)$, so

$$\Delta\operatorname{Var} 2 \approx \left[\frac{E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)\, p_i}{P^2(X \le x_{i-1})} - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)\, p_i}{P^2(X \ge x_{i+1})}\right] - 2 p_i x_i\left[\frac{E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} - \frac{E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right] - p_i^2 x_i^2\left[\frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})}\right]. \quad (48)$$

Since $\lim_{N\to\infty} p_i^2 x_i^2\left[\frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})}\right] = 0$, we can further simplify ΔVar 2 as follows:

$$\Delta\operatorname{Var} 2 \approx \left[\frac{E^2\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)\, p_i}{P^2(X \le x_{i-1})} - \frac{E^2\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)\, p_i}{P^2(X \ge x_{i+1})}\right] - 2 p_i x_i\left[\frac{E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} - \frac{E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right]$$
$$= p_i\left[\frac{E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} - \frac{E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right]\left[\left(\frac{E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} + \frac{E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right) - 2 x_i\right]. \quad (49)$$

To find the minimum of ΔVar 2, let us locate where ΔVar 2 = 0:

$$p_i\left[\frac{E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} - \frac{E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right]\left[\left(\frac{E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} + \frac{E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right) - 2 x_i\right] = 0. \quad (50)$$

Figure 3: Discrete random variables (values x1, x2, ..., xn with a candidate separator M on the Var 2 curve).

But $p_i \ne 0$, and the first bracket satisfies $\left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] < 0$, since $x_i$ is an increasing sequence, which implies that $\mu_{i-1} < \mu_i$, where

$$\mu_{i-1} = \frac{E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})}, \qquad \mu_i = \frac{E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}. \quad (51)$$

Therefore

$$\left[\frac{E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} + \frac{E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right] - 2 x_i = 0. \quad (52)$$

Let us define

$$F(x_i) = 2 x_i - \left[\frac{E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} + \frac{E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right],$$

so that

$$F(x_i) = 0 \;\Longrightarrow\; x_i = \frac{1}{2}\left[\frac{E\bigl(X \cdot 1_{x_j \le x_{i-1}}\bigr)}{P(X \le x_{i-1})} + \frac{E\bigl(X \cdot 1_{x_j \ge x_{i+1}}\bigr)}{P(X \ge x_{i+1})}\right] = \frac{\mu_{i-1} + \mu_i}{2}, \quad (53)$$

where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator M between $x_i$ and $x_{i+1}$ such that $F(x_i) \cdot F(x_{i+1}) < 0$ and $F(x_i) < 0$ (see Figure 3). After finding the few local minima, we can get the global minimum by choosing the smallest among them.
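A compact implementation of this discrete search is sketched below (our reading of the procedure, with hypothetical helper names: scan the sorted distinct values, evaluate F, locate its negative-to-positive sign changes, and keep the candidate separator with the smallest Var 2):

```python
import numpy as np

def two_means_var(data, M):
    """Var 2 of (7) for two clusters split at separator M."""
    lo, hi = data[data < M], data[data >= M]
    return ((lo - lo.mean()) ** 2).sum() + ((hi - hi.mean()) ** 2).sum()

def best_separator(x):
    """Global-minimum separator for 1-D two-means clustering, via F of (53)."""
    data = np.sort(np.asarray(x, dtype=float))
    xs = np.unique(data)
    # F(x_i) = 2 x_i - (mean of x <= x_{i-1}) - (mean of x >= x_{i+1})
    F = np.array([2 * xs[i]
                  - data[data <= xs[i - 1]].mean()
                  - data[data >= xs[i + 1]].mean()
                  for i in range(1, len(xs) - 1)])
    # Local minima of Var 2 sit where F changes sign from - to +.
    cand = [0.5 * (xs[j + 1] + xs[j + 2])
            for j in range(len(F) - 1) if F[j] < 0 <= F[j + 1]]
    if not cand:                           # tiny inputs: try all gap midpoints
        cand = [0.5 * (a + b) for a, b in zip(xs[:-1], xs[1:])]
    return min(cand, key=lambda M: two_means_var(data, M))
```

Because only the sign changes of F are inspected, each local minimum of Var 2 is visited once, and the global minimum is then the smallest among them, as stated above.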

3. Face Localization Using the Proposed Method

Figure 4: The original image.

Figure 5: The face with shade only.

Now the superiority of the modified K-means algorithm over the standard algorithm in terms of finding a global minimum becomes clear, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm; the image contains a face part, a normal background, and a background with illumination. Then we split the image pixels into two classes, one from 0 to x1 and another from x1 to 255, by applying a certain threshold determined by the algorithm. Let [0, x1] be class 1, which represents the nonface part, and let [x1, 255] be class 2, which represents the face with the shade.

All pixels with values less than the threshold belong to the first class and are assigned 0 values. This keeps the pixels with values above the threshold, and these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.

In order to remove the illuminated background part, another threshold is applied to the pixels by the algorithm, meant to separate the face part from the illuminated background part. New classes are obtained, from x1 to x2 and from x2 to 255. Then the pixels with values less than the threshold are assigned a value of 0, and the nonzero pixels left represent the face part. Let [x1, x2] be class 1, which represents the illuminated background part, and let [x2, 255] be class 2, which represents the face part. The result of the removal of the illuminated background part is shown in Figure 6.

Figure 6: The face without illumination.

Figure 7: The filtered image.

One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.

Figure 8 shows the final result, which is a window containing the face extracted from the original image.
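Putting the steps of this section together, the two-pass localization can be sketched as follows (a schematic reconstruction of the procedure behind Figures 4–8, not the authors' code; `best_separator` is the 1-D search sketched in Section 2.2.4, and the final filtering stage is only indicated):

```python
import numpy as np

def localize_face(gray):
    """Two-pass optimized-k-means localization on a grayscale image (0-255).

    Pass 1 separates nonface [0, x1] from face-plus-shade [x1, 255];
    pass 2 separates illuminated background [x1, x2] from face [x2, 255].
    """
    pixels = gray.reshape(-1).astype(float)    # reshape the image into a vector
    x1 = best_separator(pixels)                # first threshold
    x2 = best_separator(pixels[pixels >= x1])  # second threshold, face class only
    mask = gray >= x2                          # nonzero pixels = face part
    face = np.where(mask, gray, 0)
    # A denoising filter and the bounding window of the remaining nonzero
    # region would follow here, as in Figures 7 and 8.
    rows, cols = np.nonzero(mask)
    window = (gray[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
              if rows.size else gray)
    return face, window
```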

4. Results

In this section the experimental results of the proposed algorithm are analyzed and discussed, starting with a description of the datasets used in this work, followed by the results obtained on them. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 images were selected from this dataset based on different conditions such as light variation, pose, and background variations. The second dataset is the BioID dataset, with image size 384 × 286; 300 images from the database were selected based on different conditions such as indoor/outdoor. The last dataset is the Caltech dataset, with image size 896 × 592; 300 images from the database were selected based on different conditions such as indoor/outdoor.

Figure 8: The corresponding window in the original image.

Figure 9: Samples from the MIT-CBCL dataset.

4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL) at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person are taken with different light conditions, clutter background, different scales, and different poses. A few examples of these images are shown in Figure 9.

4.2. BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of the face of one out of 23 different test persons. The images were taken under different lighting conditions, with a complex background, and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).

4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different face expressions and complex backgrounds (see Figure 11).

Figure 10: Sample of faces from the BioID dataset for localization.

Figure 11: Sample of faces from the Caltech dataset for localization.

Table 1: Face localization using the K-means modified algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and faces with clutter background, as well as indoors/outdoors.

Dataset     Images   Accuracy (%)
MIT-Set 1   150      100
MIT-Set 2   150      100
BioID       300      100
Caltech     300      100

4.4. Proposed Method Result. The proposed method is evaluated on these datasets, where the result was 100% with a localization time of 4 s. Table 1 shows the proposed method results with the MIT-CBCL, Caltech, and BioID databases. In this test we have focused on the use of images of faces from different angles, from 30 degrees left to 30 degrees right, and on the variation in the background, as well as the cases of indoor and outdoor images. Two sets of images are selected from the MIT-CBCL dataset and two other sets from the other databases to evaluate the proposed method. The first set from MIT-CBCL contains the images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results showed the efficiency of the proposed modified K-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter was considered in this test, namely the localization time. From the results, the proposed method can locate the face in a very short time, because it has lower complexity due to the use of differential equations.

Figure 12: Examples of face localization using the k-means modified algorithm on the MIT-CBCL dataset.

Figure 13: Example of face localization using the k-means modified algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.

Figure 12 shows some examples of face localization on MIT-CBCL.

In Figure 13(a) an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). A filtering operation is then needed to remove the unwanted parts; this is shown in Figure 13(c), after which the area of the ROI is determined.

Figure 14 shows the face position on an image from the Caltech dataset.

5. Conclusion

In this paper we focus on improving ROI detection by suggesting a new method for face localization. In this method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is just a generalization of our approach to higher dimensions (8) and to further subclusters.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Figure 14: Example of face localization using the k-means modified algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.

Acknowledgments

This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.

References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.

[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.

[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.

[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.

[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.

[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.

[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.

[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.

[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.

[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.

[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.

[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.

[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.

[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.

[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling—Theory and Applications, Springer Science + Business Media, 2005.

[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.

[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.

[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.

[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.

[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.

[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.

[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.

[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.

[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.

[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.

[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.

[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.

[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.

[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.

[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.

[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.

[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.

[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.

[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.

[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.

[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.

[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.

[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.

[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.

[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.

[41] "CBCL face recognition database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.

[42] BioID, "BioID Support, Downloads: BioID Face Database," 2011.

[43] http://www.vision.caltech.edu/html-files/archive.html.


Mathematical Problems in Engineering 5

points we proposed the followingmodification in continuousand discrete cases

Let 1199091 1199092 1199093 119909

119873 be a dataset and let 119909

119894be an

observation Let119891(119909) be the probabilitymass function (PMF)of 119909

119891 (119909) =

infin

sum

minusinfin

1

119873

120575 (119909 minus 119909119894) 119909

119894isin R (6)

where 120575(119909) is the Dirac functionWhen the size of the data is very large the question of the

cost of the minimization ofsum119896119894=1sum119909119894isin119904119894

|119909119894minus 120583119894|2 arises where

119896 is the number of clusters and 120583119894is the mean of cluster 119904

119894

Let

Var 119896 =119896

sum

119894=1

sum

119909119894isin119904119894

1003816100381610038161003816119909119894minus 120583119894

1003816100381610038161003816

2 (7)

Then it is enough to minimize Var 1 without losing thegenerality since

min119904119894

119889

sum

119894=1

sum

119909119895isin119904119894

10038161003816100381610038161003816119909119895minus 120583119894

10038161003816100381610038161003816

2

ge

119889

sum

119897=1

min119904119894

(

119896

sum

119894=1

sum

119909119895isin119904119894

10038161003816100381610038161003816119909119897

119895minus 120583119897

119894

10038161003816100381610038161003816

2

) (8)

where 119889 is the dimension of 119909119895= (1199091

119895 119909

119889

119895) and 120583

119895=

(1205831

119895 120583

119889

119895)

222 Continuous Case For the continuous case 119891(119909) is likethe probability density function (PDF)Then for 2 classes weneed to minimize the following equation

Var 2 = int

119872

minusinfin

(119909 minus 1205831)2119891 (119909) 119889119909

+ int

infin

119872

(119909 minus 1205832)2119891 (119909) 119889119909

(9)

If 119889 ge 2 we can use (8)

minVar 2 ge119889

sum

119896=1

min(int119872

minusinfin

(119909119896minus 120583119896

1)

2

119891 (119909119896) 119889119909

+int

119872

minusinfin

(119909119896minus 120583119896

2)

2

119891 (119909119896) 119889119909)

(10)

Let 119909 isin R be any random variable with probabilitydensity function 119891(119909) we want to find 120583

1and 120583

2such that

it minimizes (7) as follows

(120583lowast

1 120583lowast

2) = arg(1205831 1205832)

[ min11987212058311205832

[int

119872

minusinfin

(119909 minus 1205831)2119891 (119909) 119889119909

+int

infin

119872

(119909 minus 1205832)2119891 (119909) 119889119909]]

(11)

min11987212058311205832

Var 2 = min119872

[min1205831

int

119872

minusinfin

(119909 minus 1205831)2119891 (119909) 119889119909

+min1205832

int

infin

119872

(119909 minus 1205832)2119891 (119909) 119889119909]

(12)

min11987212058311205832

Var 2 = min119872

[min1205831

int

119872

minusinfin

(119909 minus 1205831(119872))2119891 (119909) 119889119909

+min1205832

int

infin

119872

(119909 minus 1205832(119872))2119891 (119909) 119889119909]

(13)

We know that

min119909119864 [(119883 minus 119909)

2] = 119864 [(119883 minus 119864 (119909))

2] = Var (119909) (14)

So we can find 1205831and 1205832in terms of119872 Let us define Var

1

and Var2as follows

Var1= int

119872

minusinfin

(119909 minus 1205831)2119891 (119909) 119889119909

Var2= int

infin

119872

(119909 minus 1205832)2119891 (119909) 119889119909

(15)

After simplification we get

Var1= int

119872

minusinfin

1199092119891 (119909) 119889119909

+ 1205832

1int

119872

minusinfin

119891 (119909) 119889119909 minus 21205831int

119872

minusinfin

119909119891 (119909) 119889119909

Var2= int

infin

119872

1199092119891 (119909) 119889119909

+ 1205832

2int

infin

119872

119891 (119909) 119889119909 minus 21205832int

infin

119872

119909119891 (119909) 119889119909

(16)

To find the minimum Var1and Var

2are both partially

differentiated with respect to 1205831and 120583

2 respectively After

simplification we conclude that the minimization occurswhen

1205831=

119864 (119883minus)

119875 (119883minus)

1205832=

119864 (119883+)

119875 (119883+)

(17)

where119883minus= 119883 sdot 1

119909lt119872and119883

+= 119883 sdot 1

119909ge119872

6 Mathematical Problems in Engineering

To find the minimum of Var 2 we need to find 120597120583120597119872

1205831= (

int

119872

minusinfin119909119891 (119909) 119889119909

int

119872

minusinfin119891 (119909) 119889119909

)

997904rArr

1205971205831

120597119872

=

119872119891 (119872)119875 (119883minus) minus 119891 (119872)119864 (119883

minus)

1198752(119883minus)

1205832= (

int

infin

119872119909119891 (119909) 119889119909

int

infin

119872119891 (119909) 119889119909

)

997904rArr

1205971205832

120597119872

=

119872119891 (119872)119875 (119883+) minus 119891 (119872)119864 (119883+

)

1198752(119883+)

(18)

After simplification we get

1205971205831

120597119872

=

119891 (119872)

1198752(119883minus)

[119872119875 (119883minus) minus 119864 (119883

minus)]

1205971205832

120597119872

=

119891 (119872)

1198752(119883+)

[119872119875 (119883+) minus 119864 (119883

+)]

(19)

In order to find the minimum of Var 2 we will find120597Var1120597119872 and 120597119881119886119903

2120597119872 separately and add themas follows

120597

120597119872

(Var 2) = 120597

120597119872

(Var1) +

120597

120597119872

(Var2)

120597Var1

120597119872

=

120597

120597119872

(int

119872

minusinfin

1199092119891 (119909) 119889119909

+1205832

1int

119872

minusinfin

119891 (119909) 119889119909 minus 21205831int

119872

minusinfin

119909119891 (119909) 119889119909)

= 1198722119891 (119872) + 2120583

11205831015840

1119875 (119883minus)

+ 1205832

1119891 (119872) minus 2120583

1015840

1119864 (119883minus) minus 2120583

1119872119891(119872)

(20)

After simplification we get

120597Var1

120597119872

= 119891 (119872)[1198722+

1198642(119883minus)

119875(119883minus)2minus 2119872

119864 (119883minus)

119875 (119883minus)

] (21)

Similarly

120597Var2

120597119872

= minus1198722119891 (119872) + 2120583

21205831015840

2119875 (119883+)

minus 1205832

2119891 (119872) minus 2120583

1015840

2119864 (119883+) + 2120583

2119872119891(119872)

(22)

After simplification we get

120597Var2

120597119872

= 119891 (119872)[minus1198722+

1198642(119883+)

119875(119883+)2minus 2119872

119864 (119883+)

119875 (119883+)

] (23)

Adding both these differentials and after simplifying itfollows that

120597

120597119872

(Var 2) = 119891 (119872)[(

1198642(119883minus)

119875(119883minus)2minus

1198642(119883+)

119875(119883+)2)

minus2119872(

119864 (119883minus)

119875 (119883minus)

minus

119864 (119883+)

119875 (119883+)

)]

(24)

Equating this with 0 gives

120597

120597119872

(Var 2) = 0

[(

1198642(119883minus)

119875(119883minus)2minus

1198642(119883+)

119875(119883+)2) minus 2119872(

119864 (119883minus)

119875 (119883minus)

minus

119864 (119883+)

119875 (119883+)

)] = 0

[

119864 (119883minus)

119875 (119883minus)

minus

119864 (119883+)

119875 (119883+)

] [

119864 (119883minus)

119875 (119883minus)

+

119864 (119883+)

119875 (119883+)

minus 2119872] = 0

(25)

But [(119864(119883minus)119875(119883

minus))minus(119864(119883

+)119875(119883

+))] = 0 as in that case

1205831= 1205832 which contradicts the initial assumption Therefore

[

119864 (119883minus)

119875 (119883minus)

+

119864 (119883+)

119875 (119883+)

minus 2119872] = 0

997904rArr 119872 = [

119864 (119883minus)

119875 (119883minus)

+

119864 (119883+)

119875 (119883+)

] sdot

1

2

119872 =

1205831+ 1205832

2

(26)

Therefore we have to find all values of119872 that satisfy (26)which is easier than minimizing Var 2 directly

To find theminimum in terms of119872 it is enough to derive

120597Var 2 (1205831(119872) 120583

2(119872))

120597119872

= 119891 (119872) [

1198642(119883minus)

119875(119883 lt 119872)2minus

1198642(119883+)

119875(119883 ge 119872)2

minus2119872[

119864 (119883minus)

119875 (119883 le 119872)

minus

119864 (119883+)

119875 (119883 gt 119872)

]]

(27)

Then we find

119872 = [

119864 (119883+)

119875 (119883+)

+

119864 (119883minus)

119875 (119883minus)

] sdot

1

2

(28)

223 Finding the Centers of Some Common ProbabilityDensity Functions

(1) Uniform Distribution The probability density function ofthe continuous uniform distribution is

119891 (119909) =

1

119887 minus 119886

119886 lt 119909 lt 119887

0 otherwise(29)

Mathematical Problems in Engineering 7

where 119886 and 119887 are the two parameters of the distributionTheexpected values for119883 sdot 1

119909ge 119872 and119883 sdot 1

119909lt 119872 are

119864 (119883+) =

1

2

(

1198872minus1198722

119887 minus 119886

)

119864 (119883minus) =

1

2

(

1198722minus 1198862

119887 minus 119886

)

(30)

Putting all these in (28) and solving we get

119872 =

1

2

(119886 + 119887)

1205831(119872) =

119864 (119883minus)

119875 (119883minus)

=

1

4

(3119886 + 119887)

1205832 (119872) =

119864 (119883+)

119875 (119883+)

=

1

4

(119886 + 3119887)

(31)

(2) Log-NormalDistributionTheprobability density functionof a log-normal distribution is

119891 (119909 120583 120590) =

1

119909120590radic2120587

119890minus(ln 119909minus120583)221205902

119909 gt 0 (32)

The probabilities in left and right of119872 are respectively

119875 (119883+) =

1

2

minus

1

2

erf (12

radic2 (ln119872minus 120583)

120590

)

119875 (119883minus) =

1

2

+

1

2

erf (12

radic2 (ln119872minus 120583)

120590

)

(33)

The expected values for119883 sdot 1119909ge 119872 and119883 sdot 1

119909lt 119872 are

119864 (119883+)

= [

1

2

119890120583+(12)120590

2

1 + erf (12

(120583 + 1205902minus ln119872)radic2

120590

)]

119864 (119883minus)

= [

1

2

119890120583+(12)120590

2

1 + erf (12

(minus120583 minus 1205902+ ln119872)radic2

120590

)]

(34)

Putting all these in (28) and assuming 120583 = 0 and 120590 = 1

and solving we get

119872 = 4244942583

1205831(119872) =

119864 (119883minus)

119875 (119883minus)

= 1196827816

1205832(119872) =

119864 (119883+)

119875 (119883+)

= 7293057348

(35)

(3) Normal Distribution The probability density function ofa normal distribution is

119891 (119909 120583 120590) =

1

120590radic2120587

119890minus(119909minus120583)

221205902

(36)

The probabilities in left and right of119872 are respectively

119875 (119883+) =

1

2

minus

1

2

erf (12

radic2 (119872 minus 120583)

120590

)

119875 (119883minus) =

1

2

+

1

2

erf (12

radic2 (119872 minus 120583)

120590

)

(37)

The expected values for119883 sdot 1119909ge 119872 and119883 sdot 1

119909lt 119872 are

119864 (119883+) = minus

1

2

120583minus1 + erf (12

radic2 (119872 minus 120583)

120590

)

119864 (119883minus) =

1

2

1205831 + erf (12

radic2 (119872 minus 120583)

120590

)

(38)

Putting all these in (28) and solving we get 119872 = 120583Assuming 120583 = 0 and 120590 = 1 and solving we get

119872 = 0

1205831(119872) =

119864 (119883minus)

119875 (119883minus)

= minusradic2

120587

1205832(119872) =

119864 (119883+)

119875 (119883+)

= radic2

120587

(39)

2.2.4. Discrete Case. Let $X$ be a discrete random variable, and assume we have a set of $N$ observations from $X$. Let $p_j$ be the mass density function for an observation $x_j$.

Let $x_{i-1}<M_{i-1}<x_i$ and $x_i<M_i<x_{i+1}$. We define $\mu_{x\le x_i}$ as the mean of all $x\le x_i$, and $\mu_{x\ge x_i}$ as the mean of all $x\ge x_i$. Define $\operatorname{Var}_2(M_{i-1})$ as the variance of the two centers forced by taking $M_{i-1}$ as a separator:

\[
\operatorname{Var}_2(M_{i-1})=\sum_{x_j\le x_{i-1}}\left(x_j-\mu_{x_j\le x_{i-1}}\right)^{2}p_j+\sum_{x_j\ge x_i}\left(x_j-\mu_{x_j\ge x_i}\right)^{2}p_j,
\tag{40}
\]

\[
\operatorname{Var}_2(M_i)=\sum_{x_j\le x_i}\left(x_j-\mu_{x_j\le x_i}\right)^{2}p_j+\sum_{x_j\ge x_{i+1}}\left(x_j-\mu_{x_j\ge x_{i+1}}\right)^{2}p_j.
\tag{41}
\]


The first part of $\operatorname{Var}_2(M_{i-1})$ simplifies as follows:

\[
\begin{aligned}
\sum_{x_j\le x_{i-1}}\left(x_j-\mu_{x_j\le x_{i-1}}\right)^{2}p_j
&=\sum_{x_j\le x_{i-1}}x_j^{2}p_j-2\mu_{x_j\le x_{i-1}}\sum_{x_j\le x_{i-1}}x_jp_j+\mu_{x_j\le x_{i-1}}^{2}\sum_{x_j\le x_{i-1}}p_j\\
&=E\left(X^{2}\cdot 1_{x_j\le x_{i-1}}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}.
\end{aligned}
\tag{42}
\]

Similarly,

\[
\sum_{x_j\ge x_i}\left(x_j-\mu_{x_j\ge x_i}\right)^{2}p_j
=E\left(X^{2}\cdot 1_{x_j\ge x_i}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)},
\]

so that

\[
\begin{aligned}
\operatorname{Var}_2(M_{i-1})&=E\left(X^{2}\cdot 1_{x_j\le x_{i-1}}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}
+E\left(X^{2}\cdot 1_{x_j\ge x_i}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)},\\
\operatorname{Var}_2(M_i)&=E\left(X^{2}\cdot 1_{x_j\le x_i}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)}
+E\left(X^{2}\cdot 1_{x_j\ge x_{i+1}}\right)-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}.
\end{aligned}
\tag{43}
\]

Define $\Delta\operatorname{Var}_2=\operatorname{Var}_2(M_i)-\operatorname{Var}_2(M_{i-1})$. Since $E(X^{2}\cdot 1_{x_j\le x_i})+E(X^{2}\cdot 1_{x_j\ge x_{i+1}})=E(X^{2})=E(X^{2}\cdot 1_{x_j\le x_{i-1}})+E(X^{2}\cdot 1_{x_j\ge x_i})$, the second-moment terms cancel in the difference, and

\[
\Delta\operatorname{Var}_2=\left[-\frac{E^{2}\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
-\left[-\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)}\right].
\tag{44}
\]

Now, in order to simplify the calculation, we rearrange $\Delta\operatorname{Var}_2$ as follows:

\[
\Delta\operatorname{Var}_2=\left[\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)}\right]
+\left[\frac{E^{2}\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right].
\tag{45}
\]

The first bracket is simplified as follows:

\[
\begin{aligned}
&\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\le x_i}\right)}{P\left(X\le x_i\right)}\\
&\quad=\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{\left[E\left(X\cdot 1_{x_j\le x_{i-1}}\right)+x_ip_i\right]^{2}}{P\left(X\le x_{i-1}\right)+p_i}\\
&\quad=\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)\left[P\left(X\le x_{i-1}\right)+p_i\right]-\left[E\left(X\cdot 1_{x_j\le x_{i-1}}\right)+x_ip_i\right]^{2}P\left(X\le x_{i-1}\right)}{P\left(X\le x_{i-1}\right)\left[P\left(X\le x_{i-1}\right)+p_i\right]}\\
&\quad=\frac{p_iE^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)-2x_ip_iE\left(X\cdot 1_{x_j\le x_{i-1}}\right)P\left(X\le x_{i-1}\right)-x_i^{2}p_i^{2}P\left(X\le x_{i-1}\right)}{P\left(X\le x_{i-1}\right)\left[P\left(X\le x_{i-1}\right)+p_i\right]}.
\end{aligned}
\tag{46}
\]

Similarly, simplifying the second bracket yields

\[
\begin{aligned}
&\frac{E^{2}\left(X\cdot 1_{x_j\ge x_i}\right)}{P\left(X\ge x_i\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\\
&\quad=\frac{\left[E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)+x_ip_i\right]^{2}}{P\left(X\ge x_{i+1}\right)+p_i}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\\
&\quad=\frac{-p_iE^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)+2x_ip_iE\left(X\cdot 1_{x_j\ge x_{i+1}}\right)P\left(X\ge x_{i+1}\right)+x_i^{2}p_i^{2}P\left(X\ge x_{i+1}\right)}{P\left(X\ge x_{i+1}\right)\left[P\left(X\ge x_{i+1}\right)+p_i\right]}.
\end{aligned}
\tag{47}
\]

119894+1)

In general, these types of optimization arise when the dataset is very big. If we consider $N$ to be very large, then $p_k\ll\max\left(\sum_{i\ge k+1}p_i,\ \sum_{i\le k-1}p_i\right)$, so that $P(X\le x_{i-1})+p_i\approx P(X\le x_{i-1})$ and $P(X\ge x_{i+1})+p_i\approx P(X\ge x_{i+1})$. Hence

\[
\begin{aligned}
\Delta\operatorname{Var}_2\approx{}&\left[\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)p_i}{P^{2}\left(X\le x_{i-1}\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)p_i}{P^{2}\left(X\ge x_{i+1}\right)}\right]\\
&-2p_ix_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]\\
&-p_i^{2}x_i^{2}\left[\frac{1}{P\left(X\le x_{i-1}\right)}-\frac{1}{P\left(X\ge x_{i+1}\right)}\right],
\end{aligned}
\tag{48}
\]

and, as $\lim_{N\to\infty}p_i^{2}x_i^{2}\left[\dfrac{1}{P(X\le x_{i-1})}-\dfrac{1}{P(X\ge x_{i+1})}\right]=0$, we can further simplify $\Delta\operatorname{Var}_2$ as follows:

\[
\begin{aligned}
\Delta\operatorname{Var}_2&\approx\left[\frac{E^{2}\left(X\cdot 1_{x_j\le x_{i-1}}\right)p_i}{P^{2}\left(X\le x_{i-1}\right)}-\frac{E^{2}\left(X\cdot 1_{x_j\ge x_{i+1}}\right)p_i}{P^{2}\left(X\ge x_{i+1}\right)}\right]
-2p_ix_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]\\
&=p_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
\left[\left(\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right)-2x_i\right].
\end{aligned}
\tag{49}
\]

To find the minimum of $\Delta\operatorname{Var}_2$, let us locate where $\Delta\operatorname{Var}_2=0$:

\[
p_i\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
\left[\left(\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right)-2x_i\right]=0.
\tag{50}
\]

Figure 3: Discrete random variables.

But $p_i\ne 0$, and the factor
\[
\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}-\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]
\]
is strictly negative (in particular, nonzero), since $(x_i)$ is an increasing sequence, which implies that $\mu_{i-1}<\mu_i$, where

\[
\mu_{i-1}=\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)},\qquad
\mu_i=\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}.
\tag{51}
\]

Therefore,

\[
\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]-2x_i=0.
\tag{52}
\]

Let us define

\[
F(x_i)=2x_i-\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right].
\]

Then

\[
F(x_i)=0
\;\Longrightarrow\;
x_i=\left[\frac{E\left(X\cdot 1_{x_j\le x_{i-1}}\right)}{P\left(X\le x_{i-1}\right)}+\frac{E\left(X\cdot 1_{x_j\ge x_{i+1}}\right)}{P\left(X\ge x_{i+1}\right)}\right]\cdot\frac{1}{2}
\;\Longrightarrow\;
x_i=\frac{\mu_{i-1}+\mu_i}{2},
\tag{53}
\]

where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively.

To find the minimum of the total variation, it is enough to find a good separator $M$ between $x_i$ and $x_{i+1}$ such that $F(x_i)\cdot F(x_{i+1})<0$ and $F(x_i)<0$ (see Figure 3). After finding the few local minima in this way, we get the global minimum by taking the smallest among them.
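In practice, this scan can be implemented with prefix sums over the sorted observations: evaluate $F$ at each interior point, look for the sign change $F(x_i)<0\le F(x_{i+1})$, and keep the candidate split whose two-center variance (40)-(41) is smallest. A minimal NumPy sketch with illustrative names (`best_separator`, `var2`); the paper does not fix an implementation:

```python
import numpy as np

def best_separator(x, p):
    """Two-cluster split of a discrete sample: scan F (eq. (53)) for sign
    changes with F(x_i) < 0, then return the separator M of the candidate
    split with the smallest two-center variance (eqs. (40)-(41))."""
    order = np.argsort(x)
    x, p = np.asarray(x, float)[order], np.asarray(p, float)[order]
    p = p / p.sum()
    sp, sxp, sx2p = np.cumsum(p), np.cumsum(x * p), np.cumsum(x * x * p)

    def var2(i):  # total variance of the split {x_0..x_i} | {x_{i+1}..}
        left = sx2p[i] - sxp[i] ** 2 / sp[i]
        right = (sx2p[-1] - sx2p[i]) - (sxp[-1] - sxp[i]) ** 2 / (1.0 - sp[i])
        return left + right

    def F(i):     # F(x_i) = 2 x_i - (mu_{i-1} + mu_i), eq. (53)
        mu_lo = sxp[i - 1] / sp[i - 1]                  # mean of x_j <= x_{i-1}
        mu_hi = (sxp[-1] - sxp[i]) / (1.0 - sp[i])      # mean of x_j >= x_{i+1}
        return 2.0 * x[i] - (mu_lo + mu_hi)

    cand = [i for i in range(1, len(x) - 2) if F(i) < 0.0 <= F(i + 1)]
    i = min(cand, key=var2)            # global minimum among the local minima
    mu1 = sxp[i] / sp[i]                                # left-cluster center
    mu2 = (sxp[-1] - sxp[i]) / (1.0 - sp[i])            # right-cluster center
    return 0.5 * (mu1 + mu2), mu1, mu2
```

On a clearly bimodal sample this returns the valley threshold directly, without the iterative center reassignment of standard k-means.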

3. Face Localization Using the Proposed Method

It is now clear that the modified K-means algorithm is superior to the standard algorithm in terms of finding a global minimum, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm; the image contains a face part, a normal background, and a background with illumination. We then split the image pixels into two classes, one from 0 to $x_1$ and another from $x_1$ to 255, by applying a threshold determined by the algorithm. Let $[0, x_1]$ be class 1, which represents the nonface part, and let $[x_1, 255]$ be class 2, which represents the face with the shade.

Figure 4: The original image.

Figure 5: The face with shade only.

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.

In order to remove the illuminated background part, another threshold is applied to the pixels using the algorithm, this time meant to separate the face part from the illuminated background part. New classes are obtained, from $x_1$ to $x_2$ and from $x_2$ to 255. The pixels with values less than this second threshold are then assigned the value 0, and the nonzero pixels left represent the face part. Let $[x_1, x_2]$ be class 1, which represents the illuminated background part, and let $[x_2, 255]$ be class 2, which represents the face part. The result of the removal of the illuminated background part is shown in Figure 6.

Figure 6: The face without illumination.

Figure 7: The filtered image.

One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.

Figure 8 shows the final result, which is a window containing the face extracted from the original image.
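Putting the whole pipeline together: reshape, threshold once to drop the dark background, threshold the remaining pixels again to drop the illuminated background, filter, and crop. A minimal sketch under the assumptions above, reusing the `best_separator` sketch from Section 2 as the threshold routine; the 3x3 median filter and all names are illustrative, not prescribed by the paper:

```python
import numpy as np
from scipy.ndimage import median_filter

def optimal_threshold(v):
    # two-class split of the pixel histogram via the best_separator sketch
    vals, counts = np.unique(v, return_counts=True)
    m, _, _ = best_separator(vals, counts.astype(float))
    return m

def localize_face(img):
    pixels = img.reshape(-1)                        # image as a vector of pixels
    x1 = optimal_threshold(pixels)                  # pass 1: nonface vs. face+shade
    x2 = optimal_threshold(pixels[pixels > x1])     # pass 2: illumination vs. face
    face = np.where(img > x2, img, 0)               # zero out both backgrounds
    face = median_filter(face, size=3)              # noise filtering (Figure 7)
    ys, xs = np.nonzero(face)
    return img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]  # face window
```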

4. Results

In this section, the experimental results of the proposed algorithm are analyzed and discussed. We start with a description of the datasets used in this work, followed by the results obtained with the proposed method. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 images were selected from this dataset based on different conditions such as light variation, pose, and background variations. The second dataset is the BioID dataset, with image size 384 × 286; 300 images from this database were selected based on different conditions such as indoor/outdoor settings. The last dataset is the Caltech dataset, with image size 896 × 592; 300 images from this database were likewise selected based on different conditions such as indoor/outdoor settings.

Figure 8: The corresponding window in the original image.

4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL) at the Massachusetts Institute of Technology (MIT) [41]. It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person were taken under different light conditions, with clutter background, and at different scales and poses. A few examples of these images are shown in Figure 9.

Figure 9: Samples from the MIT-CBCL dataset.

4.2. BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of the face of one out of 23 different test persons. The images were taken under different lighting conditions and with complex backgrounds, and they contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).

4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions, and against complex backgrounds (see Figure 11).

Figure 10: Sample of faces from the BioID dataset for localization.

Figure 11: Sample of faces from the Caltech dataset for localization.

4.4. Proposed Method Results. The proposed method was evaluated on these datasets, where the result was 100% localization accuracy with a localization time of 4 s. Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we focused on images of faces seen from different angles, from 30 degrees left to 30 degrees right, on variation in the background, and on indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset, and two other sets were taken from the other databases, to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses, and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings.

Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and with clutter background, as well as indoor/outdoor settings.

Dataset     Images   Accuracy (%)
MIT-Set 1   150      100
MIT-Set 2   150      100
BioID       300      100
Caltech     300      100

The results showed the efficiency of the proposed modified K-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter was considered in this test, namely, the localization time. The results show that the proposed method can locate the face in a very short time, because it has low complexity due to the use of differential equations.

Figure 12: Examples of face localization using the modified K-means algorithm on the MIT-CBCL dataset.

Figure 13: Example of face localization using the modified K-means algorithm on the BioID dataset: (a) the original image, (b) the face position, and (c) the filtered image.

Figure 12 shows some examples of face localization on MIT-CBCL.

In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). A filtering operation is still needed in order to remove the unwanted parts; this is shown in Figure 13(c), after which the area of the ROI is determined.

Figure 14 shows the face position on an image from the Caltech dataset.

5. Conclusion

In this paper we focused on improving ROI detection by suggesting a new method for face localization. In this localization method, the input image is reshaped into a vector, and the optimized k-means algorithm is then applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images taken from the MIT, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is just a generalization of our approach to higher dimensions (8) and to more subclusters.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Figure 14: Example of face localization using the modified K-means algorithm on the Caltech dataset: (a) the original image, (b) the face position, and (c) the filtered image.

Acknowledgments

This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.

References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705-740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34-58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3-16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277-1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007-1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559-571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667-1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901-914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50-65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145-2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661-2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825-838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 2, pp. 318-331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73-84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237-253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9-15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027-1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89-112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716-731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9-15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51-57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909-921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19-37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147-153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395-400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281-297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40-51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146-1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228-236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027-1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881-892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451-461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192-3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36-40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181-1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668-676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164-166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309-318, 2007.
[41] "CBCL Face Recognition Database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID Face Database," BioID Support Downloads, 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.




[36] J Xie and S Jiang ldquoA simple and fast algorithm for global K-means clusteringrdquo in Proceedings of the 2nd InternationalWork-shop on Education Technology and Computer Science (ETCS 10)vol 2 pp 36ndash40 Wuhan China March 2010

[37] G F Tzortzis and A C Likas ldquoThe global kernel 120581-meansalgorithm for clustering in feature spacerdquo IEEE Transactions onNeural Networks vol 20 no 7 pp 1181ndash1194 2009

[38] W-L Hung Y-C Chang and E Stanley Lee ldquoWeight selectionin W-K-means algorithm with an application in color imagesegmentationrdquo Computers and Mathematics with Applicationsvol 62 no 2 pp 668ndash676 2011

[39] B Maliatski and O Yadid-Pecht ldquoHardware-driven adaptive 120581-means clustering for real-time video imagingrdquo IEEE Transac-tions on Circuits and Systems for Video Technology vol 15 no 1pp 164ndash166 2005

[40] T Saegusa and T Maruyama ldquoAn FPGA implementation ofreal-time K-means clustering for color imagesrdquo Journal of Real-Time Image Processing vol 2 no 4 pp 309ndash318 2007

[41] ldquoCBCL face recognition Databaserdquo Vol 2010httpcbclmitedusoftware-datasetsheiselefacerecognition-databasehtml

[42] BioID BioID Support Downloads BioID Face Database 2011[43] httpwwwvisioncaltecheduhtml-filesarchivehtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

where $a$ and $b$ are the two parameters of the distribution. The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are
\[
E(X^{+}) = \frac{1}{2}\left(\frac{b^{2}-M^{2}}{b-a}\right), \qquad
E(X^{-}) = \frac{1}{2}\left(\frac{M^{2}-a^{2}}{b-a}\right).
\tag{30}
\]

Putting all these in (28) and solving, we get
\[
M = \frac{1}{2}(a+b), \qquad
\mu_{1}(M) = \frac{E(X^{-})}{P(X^{-})} = \frac{1}{4}(3a+b), \qquad
\mu_{2}(M) = \frac{E(X^{+})}{P(X^{+})} = \frac{1}{4}(a+3b).
\tag{31}
\]
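Since the uniform case in (31) is fully explicit, it offers a quick sanity check of the fixed-point property $M = (\mu_1(M) + \mu_2(M))/2$. The sketch below is illustrative only; the bounds $a = 0$ and $b = 255$ are hypothetical values chosen to match the pixel range used later.

```python
import numpy as np

a, b = 0.0, 255.0                          # hypothetical bounds of Uniform(a, b)
x = np.linspace(a, b, 1_000_001)           # dense grid standing in for the distribution

M = 0.5 * (a + b)                          # separator from (31)
mu1 = x[x < M].mean()                      # approximates E(X-)/P(X-) = (3a + b)/4
mu2 = x[x >= M].mean()                     # approximates E(X+)/P(X+) = (a + 3b)/4

print(M, mu1, mu2)                         # 127.5  ~63.75  ~191.25
print(np.isclose(M, 0.5 * (mu1 + mu2)))    # True: M equals (mu1 + mu2) / 2
```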

(2) Log-Normal Distribution. The probability density function of a log-normal distribution is
\[
f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}}\, e^{-(\ln x - \mu)^{2}/2\sigma^{2}}, \qquad x > 0.
\tag{32}
\]

The probabilities to the right and left of $M$ are, respectively,
\[
P(X^{+}) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\!\left(\frac{\ln M - \mu}{\sqrt{2}\,\sigma}\right), \qquad
P(X^{-}) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\!\left(\frac{\ln M - \mu}{\sqrt{2}\,\sigma}\right).
\tag{33}
\]

The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are
\[
E(X^{+}) = \frac{1}{2}\, e^{\mu + \sigma^{2}/2}\left[1 + \operatorname{erf}\!\left(\frac{\mu + \sigma^{2} - \ln M}{\sqrt{2}\,\sigma}\right)\right], \qquad
E(X^{-}) = \frac{1}{2}\, e^{\mu + \sigma^{2}/2}\left[1 + \operatorname{erf}\!\left(\frac{\ln M - \mu - \sigma^{2}}{\sqrt{2}\,\sigma}\right)\right].
\tag{34}
\]

Putting all these in (28), assuming $\mu = 0$ and $\sigma = 1$, and solving, we get
\[
M = 4.244942583, \qquad
\mu_{1}(M) = \frac{E(X^{-})}{P(X^{-})} = 1.196827816, \qquad
\mu_{2}(M) = \frac{E(X^{+})}{P(X^{+})} = 7.293057348.
\tag{35}
\]
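For the log-normal case there is no closed form, but the value of $M$ in (35) can be reproduced by iterating $M \leftarrow (\mu_1(M) + \mu_2(M))/2$ with the quantities in (33) and (34). The following is a minimal sketch, not the authors' implementation; the starting point `m0` and the iteration count are arbitrary choices.

```python
import math

def lognormal_separator(mu=0.0, sigma=1.0, m0=1.0, iters=60):
    """Iterate M <- (mu1(M) + mu2(M)) / 2 using (33) and (34)."""
    M = m0
    for _ in range(iters):
        z = (math.log(M) - mu) / (sigma * math.sqrt(2.0))
        p_plus = 0.5 - 0.5 * math.erf(z)               # P(X+), eq. (33)
        p_minus = 0.5 + 0.5 * math.erf(z)              # P(X-), eq. (33)
        c = 0.5 * math.exp(mu + 0.5 * sigma ** 2)
        w = (mu + sigma ** 2 - math.log(M)) / (sigma * math.sqrt(2.0))
        mu2 = c * (1.0 + math.erf(w)) / p_plus         # E(X+)/P(X+), eq. (34)
        mu1 = c * (1.0 + math.erf(-w)) / p_minus       # E(X-)/P(X-), eq. (34)
        M = 0.5 * (mu1 + mu2)
    return M, mu1, mu2

print(lognormal_separator())  # ~(4.24494, 1.19683, 7.29306), matching (35)
```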

(3) Normal Distribution. The probability density function of a normal distribution is
\[
f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-(x - \mu)^{2}/2\sigma^{2}}.
\tag{36}
\]

The probabilities to the right and left of $M$ are, respectively,
\[
P(X^{+}) = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\!\left(\frac{M - \mu}{\sqrt{2}\,\sigma}\right), \qquad
P(X^{-}) = \frac{1}{2} + \frac{1}{2}\operatorname{erf}\!\left(\frac{M - \mu}{\sqrt{2}\,\sigma}\right).
\tag{37}
\]

The expected values for $X \cdot 1_{x \ge M}$ and $X \cdot 1_{x < M}$ are
\[
E(X^{+}) = -\frac{\mu}{2}\left[-1 + \operatorname{erf}\!\left(\frac{M - \mu}{\sqrt{2}\,\sigma}\right)\right] + \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M - \mu)^{2}/2\sigma^{2}}, \qquad
E(X^{-}) = \frac{\mu}{2}\left[1 + \operatorname{erf}\!\left(\frac{M - \mu}{\sqrt{2}\,\sigma}\right)\right] - \frac{\sigma}{\sqrt{2\pi}}\, e^{-(M - \mu)^{2}/2\sigma^{2}}.
\tag{38}
\]

Putting all these in (28) and solving, we get $M = \mu$. Assuming $\mu = 0$ and $\sigma = 1$, we get
\[
M = 0, \qquad
\mu_{1}(M) = \frac{E(X^{-})}{P(X^{-})} = -\sqrt{\frac{2}{\pi}}, \qquad
\mu_{2}(M) = \frac{E(X^{+})}{P(X^{+})} = \sqrt{\frac{2}{\pi}}.
\tag{39}
\]
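The symmetric result in (39) can be checked the same way; the snippet below evaluates $\mu_1$ and $\mu_2$ at the fixed point $M = \mu$, assuming the form of (38) reconstructed above, with the Gaussian density terms included.

```python
import math

def normal_split(mu=0.0, sigma=1.0):
    """Evaluate mu1, mu2 of (39) at the fixed point M = mu, using (37)-(38)."""
    M = mu
    phi = math.exp(-((M - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi)
    t = math.erf((M - mu) / (sigma * math.sqrt(2)))
    e_plus = -0.5 * mu * (-1.0 + t) + sigma * phi    # E(X+), eq. (38)
    e_minus = 0.5 * mu * (1.0 + t) - sigma * phi     # E(X-), eq. (38)
    p_plus = 0.5 - 0.5 * t                           # P(X+), eq. (37)
    p_minus = 0.5 + 0.5 * t                          # P(X-), eq. (37)
    return e_minus / p_minus, e_plus / p_plus

print(normal_split())  # (-0.7979..., 0.7979...) = (-sqrt(2/pi), sqrt(2/pi))
```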

2.2.4. Discrete Case. Let $X$ be a discrete random variable, and assume we have a set of $N$ observations from $X$. Let $p_j$ be the mass density function for an observation $x_j$. Let $x_{i-1} < M_{i-1} < x_i$ and $x_i < M_i < x_{i+1}$. We define $\mu_{x \le x_i}$ as the mean of all $x \le x_i$ and $\mu_{x \ge x_i}$ as the mean of all $x \ge x_i$, and we define $\operatorname{Var}_2(M_{i-1})$ as the two-cluster variance forced by $M_{i-1}$ as a separator:
\[
\operatorname{Var}_2(M_{i-1}) = \sum_{x_j \le x_{i-1}} \left(x_j - \mu_{x_j \le x_{i-1}}\right)^{2} p_j + \sum_{x_j \ge x_i} \left(x_j - \mu_{x_j \ge x_i}\right)^{2} p_j,
\tag{40}
\]
\[
\operatorname{Var}_2(M_i) = \sum_{x_j \le x_i} \left(x_j - \mu_{x_j \le x_i}\right)^{2} p_j + \sum_{x_j \ge x_{i+1}} \left(x_j - \mu_{x_j \ge x_{i+1}}\right)^{2} p_j.
\tag{41}
\]

The first part of $\operatorname{Var}_2(M_{i-1})$ simplifies as follows:
\[
\sum_{x_j \le x_{i-1}} \left(x_j - \mu_{x_j \le x_{i-1}}\right)^{2} p_j
= \sum_{x_j \le x_{i-1}} x_j^{2} p_j - 2\mu_{x_j \le x_{i-1}} \sum_{x_j \le x_{i-1}} x_j p_j + \mu_{x_j \le x_{i-1}}^{2} \sum_{x_j \le x_{i-1}} p_j
= E\!\left(X^{2} \cdot 1_{x_j \le x_{i-1}}\right) - \frac{E^{2}\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})}.
\tag{42}
\]
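The simplification in (42) is the familiar identity $\operatorname{Var} = E(X^2 \cdot 1) - E^{2}(X \cdot 1)/P$ restricted to one class, and it is what lets the separator search work from running sums instead of recomputing class variances. A small numerical confirmation, with arbitrary made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.integers(0, 256, size=12)).astype(float)  # mock pixel values
p = np.full(x.size, 1.0 / x.size)                         # uniform mass function p_j
mask = x <= x[5]                                          # the class x_j <= x_{i-1}

mu = np.sum(x[mask] * p[mask]) / np.sum(p[mask])          # class mean
lhs = np.sum((x[mask] - mu) ** 2 * p[mask])               # left side of (42)
rhs = np.sum(x[mask] ** 2 * p[mask]) \
      - np.sum(x[mask] * p[mask]) ** 2 / np.sum(p[mask])  # right side of (42)
print(np.isclose(lhs, rhs))                               # True
```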

Similarly,
\[
\sum_{x_j \ge x_i} \left(x_j - \mu_{x_j \ge x_i}\right)^{2} p_j
= E\!\left(X^{2} \cdot 1_{x_j \ge x_i}\right) - \frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_i}\right)}{P(X \ge x_i)},
\]
so that
\[
\operatorname{Var}_2(M_{i-1}) = E\!\left(X^{2} \cdot 1_{x_j \le x_{i-1}}\right) - \frac{E^{2}\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + E\!\left(X^{2} \cdot 1_{x_j \ge x_i}\right) - \frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_i}\right)}{P(X \ge x_i)},
\]
\[
\operatorname{Var}_2(M_i) = E\!\left(X^{2} \cdot 1_{x_j \le x_i}\right) - \frac{E^{2}\!\left(X \cdot 1_{x_j \le x_i}\right)}{P(X \le x_i)} + E\!\left(X^{2} \cdot 1_{x_j \ge x_{i+1}}\right) - \frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}.
\tag{43}
\]

Define $\Delta\operatorname{Var}_2 = \operatorname{Var}_2(M_i) - \operatorname{Var}_2(M_{i-1})$. Since each pair of $E(X^2 \cdot 1)$ terms in (43) sums to $E(X^2)$, those terms cancel, leaving
\[
\Delta\operatorname{Var}_2 = \left[-\frac{E^{2}\!\left(X \cdot 1_{x_j \le x_i}\right)}{P(X \le x_i)} - \frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] - \left[-\frac{E^{2}\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_i}\right)}{P(X \ge x_i)}\right].
\tag{44}
\]

Now, in order to simplify the calculation, we rearrange $\Delta\operatorname{Var}_2$ as follows:
\[
\Delta\operatorname{Var}_2 = \left[\frac{E^{2}\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E^{2}\!\left(X \cdot 1_{x_j \le x_i}\right)}{P(X \le x_i)}\right] + \left[\frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_i}\right)}{P(X \ge x_i)} - \frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right].
\tag{45}
\]

The first part is simplified as follows:
\[
\frac{E^{2}\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E^{2}\!\left(X \cdot 1_{x_j \le x_i}\right)}{P(X \le x_i)}
= \frac{E^{2}\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{\left[E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) + x_i p_i\right]^{2}}{P(X \le x_{i-1}) + p_i}
\]
\[
= \frac{p_i E^{2}\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) - 2 x_i p_i E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) P(X \le x_{i-1}) - x_i^{2} p_i^{2}\, P(X \le x_{i-1})}{P(X \le x_{i-1}) \left[P(X \le x_{i-1}) + p_i\right]}.
\tag{46}
\]

Similarly, simplifying the second part yields
\[
\frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_i}\right)}{P(X \ge x_i)} - \frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}
= \frac{\left[E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) + x_i p_i\right]^{2}}{P(X \ge x_{i+1}) + p_i} - \frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}
\]
\[
= \frac{-p_i E^{2}\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) + 2 x_i p_i E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) P(X \ge x_{i+1}) + x_i^{2} p_i^{2}\, P(X \ge x_{i+1})}{P(X \ge x_{i+1}) \left[P(X \ge x_{i+1}) + p_i\right]}.
\tag{47}
\]

In general, these types of optimization arise when the dataset is very big. If we consider $N$ to be very large, then $p_k \ll \max\!\left(\sum_{i \ge k+1} p_i,\ \sum_{i \le k-1} p_i\right)$, $P(X \le x_{i-1}) + p_i \approx P(X \le x_i)$, and $P(X \ge x_{i+1}) + p_i \approx P(X \ge x_{i+1})$, so that
\[
\Delta\operatorname{Var}_2 \approx \left[\frac{E^{2}\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) p_i}{P^{2}(X \le x_{i-1})} - \frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) p_i}{P^{2}(X \ge x_{i+1})}\right]
- 2 p_i x_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right]
- p_i^{2} x_i^{2} \left[\frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})}\right].
\tag{48}
\]

Moreover, since $\lim_{N \to \infty} p_i^{2} x_i^{2} \left[\frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})}\right] = 0$, we can further simplify $\Delta\operatorname{Var}_2$:
\[
\Delta\operatorname{Var}_2 \approx \left[\frac{E^{2}\!\left(X \cdot 1_{x_j \le x_{i-1}}\right) p_i}{P^{2}(X \le x_{i-1})} - \frac{E^{2}\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right) p_i}{P^{2}(X \ge x_{i+1})}\right]
- 2 p_i x_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right]
\]
\[
= p_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] \left[\left(\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right) - 2 x_i\right].
\tag{49}
\]

To find the minimum of $\Delta\operatorname{Var}_2$, let us locate where

$\Delta\operatorname{Var}_2 = 0$:
\[
\Longrightarrow\; p_i \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} - \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] \left[\left(\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right) - 2 x_i\right] = 0.
\tag{50}
\]

Figure 3: Discrete random variables ($\operatorname{Var}_D$ plotted over the observations $X_1, X_2, \dots, X_n$ with separator $M$).

But $p_i \neq 0$, while $\left[\frac{E(X \cdot 1_{x_j \le x_{i-1}})}{P(X \le x_{i-1})} - \frac{E(X \cdot 1_{x_j \ge x_{i+1}})}{P(X \ge x_{i+1})}\right] < 0$, since $x_i$ is an increasing sequence, which implies that $\mu_{i-1} < \mu_i$, where
\[
\mu_{i-1} = \frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})}, \qquad
\mu_i = \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}.
\tag{51}
\]

Therefore,
\[
\left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] - 2 x_i = 0.
\tag{52}
\]

Let us define
\[
F(x_i) = 2 x_i - \left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right].
\]
Then
\[
F(x_i) = 0 \;\Longrightarrow\; x_i = \frac{1}{2}\left[\frac{E\!\left(X \cdot 1_{x_j \le x_{i-1}}\right)}{P(X \le x_{i-1})} + \frac{E\!\left(X \cdot 1_{x_j \ge x_{i+1}}\right)}{P(X \ge x_{i+1})}\right] = \frac{\mu_{i-1} + \mu_i}{2},
\tag{53}
\]

where $\mu_{i-1}$ and $\mu_i$ are the means of the first and second parts, respectively. To find the minimum of the total variation, it is enough to find a good separator $M$ between $x_i$ and $x_{i+1}$ such that $F(x_i) \cdot F(x_{i+1}) < 0$ and $F(x_i) < 0$ (see Figure 3). After finding the few local minima in this way, we obtain the global minimum by taking the smallest among them.
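This criterion translates directly into a search procedure. The sketch below is our illustrative rendering, not the authors' code; it assumes equal masses $p_j = 1/N$, so the partial means reduce to arithmetic means over the sorted sample.

```python
import numpy as np

def two_means_separator(x):
    """Search for the global-minimum two-cluster separator of 1-D data.

    Candidates are the points x_i where F(x_i) = 2*x_i - (mu_{i-1} + mu_i)
    changes sign with F(x_i) < 0, cf. (53); among them, the split with the
    smallest total within-class variance is the global minimum.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    csum = np.cumsum(x)

    def F(i):
        mu_left = csum[i - 1] / i                      # mean of x_j, j < i  (mu_{i-1})
        mu_right = (csum[-1] - csum[i]) / (n - 1 - i)  # mean of x_j, j > i  (mu_i)
        return 2.0 * x[i] - (mu_left + mu_right), mu_left, mu_right

    best_M, best_var = None, np.inf
    for i in range(1, n - 2):
        f_i, mu_l, mu_r = F(i)
        f_next, _, _ = F(i + 1)
        if f_i < 0.0 and f_i * f_next < 0.0:           # a local minimum is bracketed
            M = 0.5 * (mu_l + mu_r)
            left, right = x[x < M], x[x >= M]
            if left.size == 0 or right.size == 0:
                continue
            var2 = ((left - left.mean()) ** 2).sum() \
                 + ((right - right.mean()) ** 2).sum()
            if var2 < best_var:
                best_M, best_var = M, var2
    return best_M  # None if no sign change was found
```

Each candidate where $F$ changes sign from negative to positive brackets a local minimum of the total variation; keeping the candidate with the smallest two-class variance yields the global minimum described above.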

3. Face Localization Using the Proposed Method

It is now clear that the modified K-means algorithm is superior to the standard algorithm in terms of finding a global minimum, and we can explain our solution based on the modified algorithm.


Figure 4: The original image.

Figure 5: The face with shade only.

Suppose we have an image with pixel values from 0 to 255. First, we reshape the input image into a vector of pixels. Figure 4 shows an input image before the operation of the modified K-means algorithm. The image contains a face, a normal background, and a background with illumination. We then split the image pixels into two classes, one from 0 to $x_1$ and another from $x_1$ to 255, by applying a threshold determined by the algorithm. Let $[0, x_1]$ be class 1, which represents the nonface part, and let $[x_1, 255]$ be class 2, which represents the face with the shade.

All pixels with values less than the threshold belong to the first class and are assigned the value 0. This keeps the pixels with values above the threshold; these pixels belong to both the face and the illuminated part of the background, as shown in Figure 5.

In order to remove the illuminated background part, another threshold is applied to the remaining pixels using the algorithm, this time to separate the face from the illuminated background. New classes are obtained, from $x_1$ to $x_2$ and from $x_2$ to 255. The pixels with values less than this threshold are assigned the value 0, and the nonzero pixels left represent the face. Let $[x_1, x_2]$ be class 1, which represents the illuminated background part, and let $[x_2, 255]$ be class 2, which represents the face.

The result of the removal of the illuminated background part is shown in Figure 6.

Figure 6: The face without illumination.

Figure 7: The filtered image.

One can see that some face pixels were assigned 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.

Figure 8 shows the final result, which is a window containing the face extracted from the original image.
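The two-pass procedure just described can be sketched end to end, reusing `two_means_separator` from above. This is a minimal illustration under stated assumptions: the noise filter is approximated with a 3 × 3 median filter and the final window is taken as the bounding box of the nonzero pixels, neither of which is specified in detail in the text.

```python
import numpy as np
from scipy.ndimage import median_filter  # stand-in for the unspecified noise filter

def localize_face(gray):
    """Two-pass split of a grayscale image with values in 0..255, as above."""
    pixels = gray.reshape(-1)                 # reshape the image into a vector
    x1 = two_means_separator(pixels)          # pass 1: nonface vs. face + shade
    kept = np.where(gray >= x1, gray, 0)      # class [0, x1] is zeroed out

    x2 = two_means_separator(kept[kept > 0])  # pass 2: illuminated background vs. face
    face = np.where(kept >= x2, kept, 0)      # class [x1, x2] is zeroed out

    face = median_filter(face, size=3)        # noise filtering (assumed 3x3 median)
    ys, xs = np.nonzero(face)                 # bounding box of the remaining pixels
    return gray[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
```

The intermediate arrays `kept` and `face` roughly correspond to Figures 5 and 6/7, and the returned window to Figure 8.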

4. Results

In this section the experimental results of the proposed algorithm are analyzed and discussed, starting with a description of the datasets used in this work and followed by the results. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has an image size of 115 × 115, and 300 images were selected from it covering different conditions such as light variation, pose, and background variation. The second dataset is the BioID dataset, with an image size of 384 × 286; 300 images were selected from it covering different conditions such as indoor/outdoor settings. The last dataset is the Caltech dataset, with an image size of 896 × 592; 300 images were selected from it, also covering indoor/outdoor conditions.

4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL) at the Massachusetts Institute of Technology (MIT) [41].


Figure 8: The corresponding window in the original image.

Figure 9: Samples from the MIT-CBCL dataset.

It has 10 persons, with 200 images per person and an image size of 115 × 115 pixels. The images of each person were taken under different lighting conditions, with cluttered backgrounds, and at different scales and poses. A few examples of these images are shown in Figure 9.

4.2. BioID Dataset. This dataset [42] consists of 1521 gray-level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of the face of one of 23 different test persons. The images were taken under different lighting conditions with complex backgrounds and contain tilted and rotated faces. This dataset is considered the most difficult one for eye detection (see Figure 10).

4.3. Caltech Dataset. This database [43] was collected by Markus Weber at the California Institute of Technology. It contains 450 face images of 27 subjects under different lighting conditions, with different facial expressions and complex backgrounds (see Figure 11).

4.4. Proposed Method Results. The proposed method was evaluated on these datasets, where it achieved 100% localization accuracy with a localization time of 4 s.

Figure 10: Sample faces from the BioID dataset for localization.

Figure 11: Sample faces from the Caltech dataset for localization.

Table 1: Face localization using the modified K-means algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and cluttered backgrounds, as well as indoor/outdoor settings.

Dataset      Images    Accuracy (%)
MIT-Set 1    150       100
MIT-Set 2    150       100
BioID        300       100
Caltech      300       100

Table 1 shows the proposed method's results on the MIT-CBCL, Caltech, and BioID databases. In this test we focused on images of faces viewed from different angles, from 30 degrees left to 30 degrees right, on variation in the background, and on indoor and outdoor images. Two sets of images were selected from the MIT-CBCL dataset and two other sets from the other databases to evaluate the proposed method. The first set from MIT-CBCL contains images with different poses and the second set contains images with different backgrounds; each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results show the efficiency of the proposed modified K-means algorithm in locating faces in images with strong background and pose changes. In addition, another parameter was considered in this test, namely the localization time. From the results, the proposed method can locate the face in a very short time, owing to the low complexity of the derived equations it solves.


Figure 12: Examples of face localization using the modified K-means algorithm on the MIT-CBCL dataset.

Figure 13: Example of face localization using the modified K-means algorithm on the BioID dataset: (a) the original image, (b) face position, and (c) filtered image.

Figure 12 shows some examples of face localization on MIT-CBCL.

In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). A filtering operation is needed to remove the unwanted parts; its result is shown in Figure 13(c), after which the area of the ROI is determined.

Figure 14 shows the face position in an image from the Caltech dataset.

5. Conclusion

In this paper we focused on improving ROI detection by proposing a new method for face localization. In this method, the input image is reshaped into a vector, and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Sets of images taken from the MIT-CBCL, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is just a generalization of our approach to higher dimensions (8) and to more subclusters.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Figure 14: Example of face localization using the modified K-means algorithm on the Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.

Acknowledgments

This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support of Alfaisal University and its Office of Research.

References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling—Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL face recognition database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID Support: Downloads—BioID Face Database," 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.


8 Mathematical Problems in Engineering

The first part of Var(119872119894minus1) simplifies as follows

sum

119909119895le119909119894minus1

(119909119895minus 120583119909119895le119909119894minus1

)

2

119901119895

= sum

119909119895le119909119894minus1

1199092

119895119901119895minus 2120583119909119895le119909119894minus1

times sum

119909119895le119909119894minus1

119909119895119901119895+ 1205832

119909119895le119909119894minus1sum

119909119895le119909119894minus1

119901119895

= 119864 (1198832sdot 1119909119895le119909119894minus1

) minus

1198642(119883 sdot 1

119909119895le119909119894minus1)

119875 (119883 le 119909119894minus1)

(42)

Similarly

sum

119909119895ge119909119894

(119909119895minus 120583119909119895ge119909119894

)

2

119901119895= 119864 (119883

2sdot 1119909119895ge119909119894

) minus

1198642(119883 sdot 1

119909119895ge119909119894)

119875 (119883 ge 119909119894)

Var 2 (119872119894minus1) = 119864 (119883

2sdot 1119909119895le119909119894minus1

)

minus

1198642(119883 sdot 1

119909119895le119909119894minus1)

119875 (119883 le 119909119894minus1)

+ 119864 (1198832sdot 1119909119895ge119909119894

)

minus

1198642(119883 sdot 1

119909119895ge119909119894)

119875 (119883 ge 119909119894)

Var 2 (119872119894) = 119864 (119883

2sdot 1119909119895le119909119894

)

minus

1198642(119883 sdot 1

119909119895le119909119894)

119875 (119883 le 119909119894)

+ 119864 (1198832sdot 1119909119895ge119909119894+1

)

minus

1198642(119883 sdot 1

119909119895ge119909119894+1)

119875 (119883 ge 119909119894+1)

(43)

Define ΔVar 2 = Var 2(119872119894) minus Var 2(119872

119894minus1)

ΔVar 2 = [

[

minus

1198642(119883 sdot 1

119909119895le119909119894)

119875 (119883 le 119909119894)

minus

1198642(119883 sdot 1

119909119895ge119909119894+1)

119875 (119883 ge 119909119894+1)

]

]

minus[

[

minus

1198642(119883 sdot 1

119909119895le119909119894minus1)

119875 (119883 le 119909119894minus1)

minus

1198642(119883 sdot 1

119909119895ge119909119894)

119875 (119883 ge 119909119894)

]

]

(44)

Now in order to simplify the calculation we rearrangeΔVar 2 as follows

ΔVar 2 = [

[

1198642(119883 sdot 1

119909119895le119909119894minus1)

119875 (119883 le 119909119894minus1)

minus

1198642(119883 sdot 1

119909119895le119909119894)

119875 (119883 le 119909119894)

]

]

+[

[

1198642(119883 sdot 1

119909119895ge119909119894)

119875 (119883 ge 119909119894)

minus

1198642(119883 sdot 1

119909119895ge119909119894+1)

119875 (119883 ge 119909119894+1)

]

]

(45)

The first part is simplified as follows

1198642(119883 sdot 1

119909119895le119909119894minus1)

119875 (119883 le 119909119894minus1)

minus

1198642(119883 sdot 1

119909119895le119909119894)

119875 (119883 le 119909119894)

=

1198642(119883 sdot 1

119909119895le119909119894minus1)

119875 (119883 le 119909119894minus1)

minus

[119864 (119883 sdot 1119909119895le119909119894minus1

) + 119909119894119901119894]

2

119875 (119883 le 119909119894minus1) + 119901119894

=

1

119875 (119883 le 119909119894minus1) [119875 (119883 le 119909

119894minus1) + 119901119894]

times 1198642(119883 sdot 1

119909119895le119909119894minus1)119875 (119883 le 119909

119894)

+ 1199011198941198642(119883 sdot 1

119909119895le119909119894minus1)

minus 1198642(119883 sdot 1

119909119895le119909119894minus1)119875 (119883 le 119909

119894minus1)

minus 2119909119894119901119894119864(119883 sdot 1

119909119895le119909119894minus1)119875 (119883 le 119909

119894minus1)

minus1199092

1198941199012

119894119875 (119883 le 119909

119894minus1)

=

1

119875 (119883 le 119909119894minus1) [119875 (119883 le 119909

119894minus1) + 119901119894]

times 1199011198941198642(119883 sdot 1

119909119895le119909119894minus1)

minus 2119909119894119901119894119864(119883 sdot 1

119909119895le119909119894minus1)119875 (119883 le 119909

119894minus1)

minus1199092

1198941199012

119894119875 (119883 le 119909

119894minus1)

(46)

Similarly simplifying the second part yields

1198642(119883 sdot 1

119909119895ge119909119894)

119875 (119883 ge 119909119894)

minus

1198642(119883 sdot 1

119909119895ge119909119894+1)

119875 (119883 ge 119909119894+1)

=

[119864 (119883 sdot 1119909119895ge119909119894+1

) + 119909119894119901119894]

2

119875 (119883 ge 119909119894+1) + 119901119894

minus

1198642(119883 sdot 1

119909119895ge119909119894+1)

119875 (119883 ge 119909119894+1)

=

1

119875 (119883 ge 119909119894+1) [119875 (119883 ge 119909

119894+1) + 119901119894]

times 1198642(119883 sdot 1

119909119895ge119909119894+1)119875 (119883 ge 119909

119894+1)

minus 1199011198941198642(119883 sdot 1

119909119895ge119909119894+1)

minus 1198642(119883 sdot 1

119909119895ge119909119894+1)119875 (119883 ge 119909

119894+1)

+ 2119909119894119901119894119864(119883 sdot 1

119909119895ge119909119894+1)119875 (119883 ge 119909

119894+1)

+1199092

1198941199012

119894119875 (119883 ge 119909

119894+1)

Mathematical Problems in Engineering 9

=

1

119875 (119883 ge 119909119894+1) [119875 (119883 ge 119909

119894+1) + 119901119894]

times minus119901119894+11198642(119883 sdot 1

119909119895ge119909119894+1)

+ 2119909119894119901119894119864(119883 sdot 1

119909119895ge119909119894+1)119875 (119883 ge 119909

119894+1)

+1199092

1198941199012

119894+1119875 (119883 ge 119909

119894+1)

(47)In general these types of optimization arise when the

dataset is very bigIf we consider119873 to be very large then 119901

119896≪ max(sum

119894ge119896+1

119875119894 sum119894le119896minus1

119875119894) 119875(119883 le 119909

119894minus1) + 119901119894asymp 119875(119883 le 119909

119894) and 119875(119883 ge

119909119894+1) + 119901119894asymp 119875(119883 ge 119909

119894+1)

ΔVar 2 asymp [

[

1198642(119883 sdot 1

119909119895le119909119894minus1) 119901119894

1198752(119883 le 119909

119894minus1)

minus

1198642(119883 sdot 1

119909119895ge119909119894+1) 119901119894

1198752(119883 ge 119909

119894+1)

]

]

minus 2119901119894119909119894[

[

119864(119883 sdot 1119909119895le119909119894minus1

)

119875 (119883 le 119909119894minus1)

minus

119864 (119883 sdot 1119909119895ge119909119894+1

)

119875 (119883 ge 119909119894+1)

]

]

minus 1199012

1198941199092

119894[

1

119875 (119883 le 119909119894minus1)

minus

1

119875 (119883 ge 119909119894+1)

]

(48)

and as lim119873rarrinfin

1199012

1198941199092

119894[(1119875(119883 le 119909

119894minus1))minus(1119875(119883 ge 119909

119894+1))] =

0 we can further simplify ΔVar 2 as followsΔVar 2

asymp[

[

1198642(119883 sdot 1

119909119895le119909119894minus1) 119901119894

1198752(119883 le 119909

119894minus1)

minus

1198642(119883 sdot 1

119909119895ge119909119894+1) 119901119894

1198752(119883 ge 119909

119894+1)

]

]

minus 2119901119894119909119894[

[

119864(119883 sdot 1119909119895le119909119894minus1

)

119875 (119883 le 119909119894minus1)

minus

119864 (119883 sdot 1119909119895ge119909119894+1

)

119875 (119883 ge 119909119894+1)

]

]

= 119901119894[

[

119864(119883 sdot 1119909119895le119909119894minus1

)

119875 (119883 le 119909119894minus1)

minus

119864 (119883 sdot 1119909119895ge119909119894+1

)

119875 (119883 ge 119909119894+1)

]

]

times[

[

(

119864(119883 sdot 1119909119895le119909119894minus1

)

119875 (119883 le 119909119894minus1)

+

119864 (119883 sdot 1119909119895ge119909119894+1

)

119875 (119883 ge 119909119894+1)

) minus 2119909119894]

]

(49)To find the minimum of ΔVar 2 let us locate when

ΔVar 2 = 0

997904rArr 119901119894[

[

119864(119883 sdot 1119909119895le119909119894minus1

)

119875 (119883 le 119909119894minus1)

minus

119864 (119883 sdot 1119909119895ge119909119894+1

)

119875 (119883 ge 119909119894+1)

]

]

times[

[

(

119864(119883 sdot 1119909119895le119909119894minus1

)

119875 (119883 le 119909119894minus1)

+

119864 (119883 sdot 1119909119895ge119909119894+1

)

119875 (119883 ge 119909119894+1)

) minus 2119909119894]

]

= 0

(50)

VarD

X1 X2 X3 X4 X5 X6 Xn

M

Figure 3 Discrete random variables

But 119901119894

= 0 and so is [(119864(119883 sdot 1119909119895le119909119894minus1

)119875(119883 le 119909119894minus1)) minus

(119864(119883 sdot 1119909119895ge119909119894+1

)119875(119883 ge 119909119894+1))] lt 0 since 119909

119894is an increasing

sequence which implies that 120583119894minus1

lt 120583119894where

120583119894minus1

=

119864 (119883 sdot 1119909119895le119909119894minus1

)

119875 (119883 le 119909119894minus1)

120583119894=

119864 (119883 sdot 1119909119895ge119909119894+1

)

119875 (119883 ge 119909119894+1)

(51)

Therefore

[

[

119864(119883 sdot 1119909119895le119909119894minus1

)

119875 (119883 le 119909119894minus1)

+

119864 (119883 sdot 1119909119895ge119909119894+1

)

119875 (119883 ge 119909119894+1)

]

]

minus 2119909119894= 0 (52)

Let us define

119865 (119909119894) = 2119909

119894minus[

[

119864(119883 sdot 1119909119895le119909119894minus1

)

119875 (119883 le 119909119894minus1)

+

119864 (119883 sdot 1119909119895ge119909119894+1

)

119875 (119883 ge 119909119894+1)

]

]

119865 (119909119894) = 0

997904rArr 119909119894=[

[

119864(119883 sdot 1119909119895le119909119894minus1

)

119875 (119883 le 119909119894minus1)

+

119864 (119883 sdot 1119909119895ge119909119894+1

)

119875 (119883 ge 119909119894+1)

]

]

sdot

1

2

997904rArr 119909119894=

120583119894minus1+ 120583119894

2

(53)

where 120583119894minus1

and 120583119894are the means of the first and second parts

respectivelyTo find the minimum for the total variation it is enough

to find the good separator119872 between 119909119894and 119909

119894minus1such that

119865(119909119894) sdot 119865(119909

119894+1) lt 0 and 119865(119909

119894) lt 0 (see Figure 3)

After finding few local minima we can get the globalminimum by finding the smallest among them

3 Face Localization Using Proposed Method

Now it becomes clear the superiority of the K-means modi-fied algorithmover the standard algorithm in terms of finding

10 Mathematical Problems in Engineering

Figure 4 The original image

Figure 5 The face with shade only

a global minimum and we can explain our solution basedon the modified algorithm Suppose we have an image withpixels values from 0 to 255 First we reshaped the input imageinto a vector of pixels Figure 4 shows an input image beforethe operation of the K-means modified algorithmThe imagecontains a face part a normal background and a backgroundwith illumination Then we split the image pixels into twoclasses one from 0 to 119909

1and another from 119909

1to 255 by

applying a certain threshold determined by the algorithmLet [0 119909

1] be the class 1-which represents the nonface

part and let [1199091 255] be the class 2-which represents the

face with the shadeAll pixels with values less than the threshold belong to

the first class and will be assigned to 0 values This will keepthe pixels with values above the threshold and these pixelsbelong to both the face part and the illuminated part of thebackground as shown in Figure 5

In order to remove the illuminated background partanother threshold will be applied to the pixels using thealgorithm meant to separate between the face part and theilluminated background part New classes will be obtainedfrom 119909

1to 1199092and from 119909

2to 255Then the pixels with values

less than the threshold will be assigned a value of 0 and thenonzero pixels left represent the face part Let [119909

1 119909

2] be

the class 1 which represents the illuminated background partand let [119909

2 255] be the class 2 which represents the face

partThe result of the removal of the illuminated back-ground

part is shown in Figure 6

Figure 6 The face without illumination

Figure 7 The filtered image

One can see that some face pixels were assigned to 0 dueto the effects of the illumination Therefore there is a needto return to the original image in Figure 4 to fully extract theimage window corresponding to the face A filtering processis then performed to reduce the noise and the result is shownin Figure 7

Figure 8 shows the final result which is a window contain-ing the face extracted from the original image

4 Results

In this Section the experimental results of the proposedalgorithm will be analyzed and discussed First it starts withpresenting a description of the datasets used in this workfollowed by the results of the previous part of the workThree popular databases were used to test the proposed facelocalizationmethodTheMIT-CBCL database has image size115 times 115 and 300 images were selected from this datasetbased on different conditions such as light variation poseand background variations The second dataset is the BioIDdataset with image size 384 times 286 and 300 images from thedatabase were selected based on different conditions suchas indooroutdoor The last dataset is the Caltech datasetwith image size 896 times 592 and the 300 images from thedatabase were selected based on different conditions such asindooroutdoor

41 MIT-CBCL Dataset This dataset was established by theCenter for Biological and Computational Learning (CBCL)

Mathematical Problems in Engineering 11

Figure 8 The corresponding window in the original image

Figure 9 Samples fromMIT-CBCL dataset

at Massachusetts Institute of Technology (MIT) [41] It has10 persons with 200 images per person and an image size of115times115 pixelsThe images of each person are with differentlight conditions clutter background and scale and differentposes Few examples of these images are shown in Figure 9

42 BioID Dataset This dataset [42] consists of 1521 graylevel images with a resolution of 384 times 286 pixels Each oneshows the frontal view of a face of one out of 23 differenttest persons The images were taken under different lightingconditions with a complex background and contain tiltedand rotated faces This dataset is considered as the mostdifficult one for eye detection (see Figure 10)

43 Caltech Dataset This database [43] has been collectedby Markus Weber at California Institute of Technology Thedatabase contains 450 face images of 27 subjects underdifferent lighting conditions different face expressions andcomplex backgrounds (see Figure 11)

44 Proposed Method Result The proposed method is eval-uated on these datasets where the result was 100 with a

Figure 10 Sample of faces from BioID dataset for localization

Figure 11 Sample of faces from Caltech dataset for localization

Table 1 Face localization using K-mean modified algorithm onMIT-CBCL BioID and Caltech the sets contain faces with differentposes and faces with clutter background as well as indoorsoutdoors

Dataset Image Accuracy ()MIT-Set 1 150 100MIT-Set 2 150 100BioID 300 100Caltech 300 100

localization time of 4 s Table 1 shows the proposed methodresults with the MIT-CBCL Caltech and BioID databasesIn this test we have focused on the use of images of facesfrom different angles from 30 degrees left to 30 degreesright and the variation in the background as well as thecases of indoor and outdoor images Two sets of images areselected from theMIT-CBCL dataset and two other sets fromthe other databases to evaluate the proposed method Thefirst set from MIT-CBCL contains the images with differentposes and the second set contains images with differentbackgrounds Each set contains 150 images The BioID andCaltech sets contain imageswith indoor and outdoor settingsThe results showed the efficiency of the proposed K-meanmodified algorithm to locate the faces from the image withhigh background and poses changes In addition anotherparameter was considered in this test which is the localizationtime From the results the proposed method can locate theface in a very short time because it has less complexity due tothe use of differential equations

12 Mathematical Problems in Engineering

Figure 12 Examples of face localization using k-mean modified algorithm on MIT-CBCL dataset

(a) (b) (c)

Figure 13 Example of face localization using k-mean modified algorithm on BioID dataset (a) the original image (b) face position and (c)filtered image

Figure 12 shows some examples of face localization onMIT-CBCL

In Figure 13(a) an original image from BioID dataset isshownThe localization position is shown in Figure 13(b) Butthere is need to do a filtering operation in order to remove theunwanted parts and this is shown in Figure 13(c) and then thearea of ROI will be determined

Figure 14 shows the face position on an image from theCaltech dataset

5 Conclusion

In this paper we focus on developing the RIO detectionby suggesting a new method for face localization In thismethod of localization the input image is reshaped intoa vector and then the optimized k-means algorithm isapplied twice to cluster the image pixels into classes At the

end the block window corresponding to the class of pixelscontaining the face only is extracted from the input imageThree sets of images taken from MIT BioID and Caltechdatasets and differing in illumination and backgroundsas well as in-dooroutdoor settings were used to evaluatethe proposed method The results show that the proposedmethod achieved significant localization accuracyOur futureresearch direction includes finding the optimal centers forhigher dimension data whichwe believe is just generalizationof our approach to higher dimension (8) and to highersubclusters

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Mathematical Problems in Engineering 13

(a) (b) (c)

Figure 14 Example of face localization using k-mean modified algorithm on Caltech dataset (a) the original image (b) face position and(c) filtered image

Acknowledgments

This study is a part of the Internal Grant Project no419011811122 The authors gratefully acknowledge the con-tinued support from Alfaisal University and its Office ofResearch

References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL face recognition database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID, "BioID Support Downloads: BioID Face Database," 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.




OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of


Figure 12: Examples of face localization using the modified k-means algorithm on the MIT-CBCL dataset.

Figure 13: Example of face localization using the modified k-means algorithm on the BioID dataset: (a) the original image, (b) the face position, and (c) the filtered image.

Figure 12 shows some examples of face localization on the MIT-CBCL dataset.

In Figure 13(a) an original image from the BioID dataset is shown, and the localization position is shown in Figure 13(b). A filtering operation is then needed to remove the unwanted parts, as shown in Figure 13(c), after which the area of the ROI is determined.
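The exact filtering operation is not detailed here; the sketch below shows one plausible choice, assuming the face class is available as a binary mask and using morphological opening followed by a largest-connected-component rule. Both the technique and the function name filter_face_mask are our assumptions, not the paper's stated filter.

```python
# Hypothetical post-localization filter: the paper does not specify the
# operation, so this sketch uses morphological opening plus a largest-
# connected-component rule to discard unwanted regions.
import numpy as np
from scipy import ndimage

def filter_face_mask(mask):
    """mask: 2-D boolean array marking pixels assigned to the face class."""
    # Opening removes isolated noise pixels around the face region.
    cleaned = ndimage.binary_opening(mask, structure=np.ones((3, 3)))
    # Label connected components and keep only the largest one as the ROI.
    labels, n = ndimage.label(cleaned)
    if n == 0:
        return cleaned
    sizes = ndimage.sum(cleaned, labels, index=range(1, n + 1))
    return labels == (int(np.argmax(sizes)) + 1)
```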

Figure 14 shows the face position on an image from the Caltech dataset.

5. Conclusion

In this paper we focus on developing ROI detection by suggesting a new method for face localization. In this method the input image is reshaped into a vector and the optimized k-means algorithm is then applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing only the face is extracted from the input image. Three sets of images, taken from the MIT-CBCL, BioID, and Caltech datasets and differing in illumination, background, and indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher-dimensional data, which we believe is simply a generalization of our approach (8) to higher dimensions and to more subclusters.
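To make the pipeline concrete, the sketch below illustrates the two-class localization flow under stated assumptions: a grayscale input, scikit-learn's standard KMeans substituted for the optimized k-means proposed here, and a simple brightness heuristic to pick the face class. The function name localize_face is hypothetical.

```python
# Minimal illustration of the localization flow described above; scikit-learn's
# standard KMeans stands in for this paper's optimized k-means.
import numpy as np
from sklearn.cluster import KMeans

def localize_face(image):
    """image: 2-D grayscale array; returns (rmin, rmax, cmin, cmax)."""
    h, w = image.shape
    pixels = image.reshape(-1, 1).astype(np.float64)  # image reshaped into a vector
    # Cluster the pixels into two classes: faces and nonfaces.
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(pixels)
    # Heuristic (our assumption): treat the brighter class as the face class.
    face = int(pixels[labels == 1].mean() > pixels[labels == 0].mean())
    mask = (labels == face).reshape(h, w)
    # The paper applies the optimized k-means twice; a second pass within the
    # face class is omitted here for brevity.
    rows, cols = np.where(mask)
    # Bounding block window of the face-class pixels.
    return rows.min(), rows.max(), cols.min(), cols.max()
```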

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Figure 14: Example of face localization using the modified k-means algorithm on the Caltech dataset: (a) the original image, (b) the face position, and (c) the filtered image.

Acknowledgments

This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.

References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.
[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.
[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.
[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.
[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.
[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.
[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.
[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.
[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.
[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.
[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.
[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.
[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.
[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.
[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.
[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.
[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.
[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.
[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.
[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.
[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.
[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.
[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.
[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.
[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.
[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.
[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.
[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.
[32] J. M. Pena, J. A. Lozano, and P. Larranaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.
[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.
[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.
[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.
[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.
[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.
[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.
[41] "CBCL Face Recognition Database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.
[42] BioID Support Downloads: BioID Face Database, 2011.
[43] http://www.vision.caltech.edu/html-files/archive.html.
