Data Clustering Using Swarm Intelligence Algorithms
An Overview
Faculty of Computers and Information, Cairo University and SRGE member
Mona M. Soliman
http://www.egyptscience.net
Bio-inspiring and evolutionary computation: Trends, applications and open issues workshop, 7 Nov. 2015 Faculty of Computers and Information, Cairo University
Agenda
• Introduction
• Types of data clustering
• Classical clustering algorithms
• Swarm Intelligence algorithms
• Clustering with SI algorithms
Introduction: Problem Definition
Clustering is the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a "cluster", consists of objects that are similar to one another and dissimilar to objects of other groups.
From a machine learning perspective, clusters correspond to the hidden patterns in data, the search for clusters is a kind of unsupervised learning, and the resulting system represents a data concept.
Introduction: Motivation
In the past few decades, cluster analysis has played a central role in a variety of fields:
• Engineering (machine learning, artificial intelligence, pattern recognition, mechanical engineering, electrical engineering)
• Computer sciences (web mining, spatial database analysis, textual document collection, image segmentation)
• Life and medical sciences (genetics, biology, microbiology, paleontology, psychiatry, pathology)
• Earth sciences (geography, geology, remote sensing)
• Social sciences (sociology, psychology, archeology, education)
• Economics (marketing, business)
What is a good cluster?
• Inter-cluster distances are maximized
• Intra-cluster distances are minimized
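The two criteria above can be checked numerically. The following is a minimal sketch, not taken from the slides: it assumes Euclidean distance, measures intra-cluster spread as the mean pairwise distance inside each cluster, and measures inter-cluster separation as the mean distance between cluster centroids. The function names and the toy data are illustrative assumptions.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def mean_intra_cluster_distance(clusters):
    """Average distance between pairs of points that share a cluster."""
    pairs = [dist(p, q) for c in clusters for p, q in combinations(c, 2)]
    return sum(pairs) / len(pairs)


def mean_inter_cluster_distance(clusters):
    """Average distance between the centroids of different clusters."""
    cents = [centroid(c) for c in clusters]
    pairs = [dist(a, b) for a, b in combinations(cents, 2)]
    return sum(pairs) / len(pairs)


# Two groupings of the same four points (toy data, assumed for illustration).
good = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
bad = [[(0.0, 0.0), (10.0, 1.0)], [(10.0, 0.0), (0.0, 1.0)]]

for name, grouping in (("good", good), ("bad", bad)):
    print(name,
          mean_intra_cluster_distance(grouping),
          mean_inter_cluster_distance(grouping))
```

The grouping labeled `good` keeps nearby points together, so it scores a smaller intra-cluster distance and a larger inter-cluster distance than the grouping labeled `bad`.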
Types of Data Clustering
Data Clustering
• Hierarchical
  • Agglomerative (bottom-up)
  • Divisive (top-down)
• Partitional
  • Error minimization (K-means)
  • Graph theoretic (minimal spanning tree)
• Density based (expectation maximization)
• Model based (decision tree, neural network)
Hierarchical clustering
Partitional clustering
[Figure: original points and a partitional clustering]
The Classical Clustering Algorithms: k-means Algorithm
The K-means algorithm groups D-dimensional data vectors into a predefined number of clusters, using Euclidean distance as the similarity criterion.
Data vectors within a cluster are closer to one another, in Euclidean distance, than to data vectors in other clusters.
Vectors of the same cluster are associated with one centroid vector, which represents the center of that cluster and is the mean of the data vectors that belong together.
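The description above can be sketched in a few lines. This is a minimal illustration rather than the presenter's implementation: it assumes the usual alternation between assigning each vector to its nearest centroid and recomputing each centroid as the mean of its members, with initial centroids drawn at random from the data. The names `kmeans`, `iterations`, `seed`, and `points` are illustrative assumptions.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(data, k, iterations=100, seed=0):
    """Return (centroids, labels) for data, a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(data, k)  # k distinct data points as initial centers
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: dist(x, centroids[j]))
                      for x in data]
        if new_labels == labels:  # no vector changed cluster: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, labels


# Toy data: two well-separated groups (assumed for illustration).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, assignment = kmeans(points, k=2)
print(centers, assignment)
```

Because the initial centroids are random, a different `seed` can lead to a different result on harder data, which is the sensitivity to initialization that motivates the SI-based approaches later in the talk.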
Swarm Intelligence Algorithms: Biological Foundation
The collective and social behavior of living creatures motivated researchers to study what is known today as Swarm Intelligence (SI).
Efforts to mimic such behaviors through computer simulation eventually gave rise to the field of SI.
SI systems are typically made up of a population of simple agents interacting locally with one another and with their environment.
Swarm Intelligence Algorithms: An Overview
The behavior of a single ant, bee, termite, or wasp is often very simple, but their collective and social behavior is of paramount significance.
• Ant Colony Optimization (1992)
• Particle Swarm Optimization (1995)
• Fish Swarm Optimization (2002)
• Bee Swarm Optimization (2005)
• Cat Swarm Optimization (2006)
• Firefly Optimization (2008)
• Cuckoo Search Optimization (2009)
• Bat Swarm Optimization (2010)
• Grey Wolf Optimization (2014)
Clustering with the SI Algorithms: Relevance of SI Algorithms in Clustering
Data clustering can be formulated as a difficult global optimization problem, which makes SI tools a natural fit.
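One common way to write clustering as an optimization problem, given here as an assumption for illustration since the slides do not state an objective, is to choose K centroids that minimize the sum of squared Euclidean distances between each data vector and the centroid of its cluster. The symbols K, m_k, C_k, and J are notation introduced for this sketch.

```latex
\min_{m_1, \dots, m_K} \; J(m_1, \dots, m_K)
  = \sum_{k=1}^{K} \sum_{x \in C_k} \lVert x - m_k \rVert^{2},
\qquad
C_k = \{\, x : \lVert x - m_k \rVert \le \lVert x - m_j \rVert \ \text{for all } j \,\}
```

J generally has many local minima, so a search that starts from a single initial guess can stall in one of them, while a population-based search explores several candidate solutions at once.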
Improving the performance of existing classical clustering methods (e.g. k-means, k-medoid, fuzzy clustering)
• K-means clustering has several drawbacks, such as getting trapped in local minima and being sensitive to the initial cluster centers
• Improve the cluster quality with a refinement algorithm (ACO, PSO, Bee Swarm, Firefly Swarm)
• Determine the optimal number of clusters
• Determine the initial cluster centers
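As one illustration of how an SI algorithm can search for cluster centers, the sketch below uses a basic global-best particle swarm. It is not the method of any specific paper or of the presenter: each particle is assumed to encode a full set of k centroids, fitness is assumed to be the sum of squared distances from each point to its nearest centroid, and the inertia and acceleration values `w`, `c1`, and `c2` are conventional choices picked for illustration. All names (`sse`, `pso_cluster`, `points`) are hypothetical.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def sse(centroids, data):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(dist(x, c) for c in centroids) ** 2 for x in data)


def pso_cluster(data, k, particles=10, iterations=50,
                w=0.72, c1=1.49, c2=1.49, seed=0):
    """Return (best_centroids, best_fitness) found by a global-best PSO."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Each particle holds k candidate centroids, started on random data points.
    pos = [[list(rng.choice(data)) for _ in range(k)] for _ in range(particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_fit = [sse(p, data) for p in pos]
    g = min(range(particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    for _ in range(iterations):
        for i in range(particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Pull toward the particle's own best and the swarm's best.
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            fit = sse(pos[i], data)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in pos[i]], fit
    return gbest, gbest_fit


# Toy data: two well-separated groups (assumed for illustration).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
best_centroids, best_fitness = pso_cluster(points, k=2)
print(best_centroids, best_fitness)
```

The centroids returned by such a search could also be handed to k-means as its initial cluster centers, which is one way to address the initialization sensitivity listed above.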
Creation of clustering algorithms based on SI algorithms
• Fish Swarm Clustering
• Cat Swarm Clustering
Thank You