Introduction to Clustering Algorithms

Clustering Algorithm
COMPLEX NETWORK ALGORITHM
AMIR HADIFAR

2. Objectives

At the end of this presentation you will:
- Understand data science and its applications
- Get an overview of machine learning
- Learn some types of clustering algorithms
- Implement clustering with R

3. Data Science and Its Applications

- Data science extracts knowledge or insight from data.
- Its applications range from speech recognition and search engines to health care and the humanities.
- These scenarios involve:
  - Storing, organizing, and integrating huge amounts of unstructured data
  - Processing and analyzing data
  - Extracting knowledge and insight from data, and predicting the future from it
- Processing, analyzing, and extracting knowledge and insight are done through machine learning.

4. Data Science and Its Applications

5. Machine Learning

A field of study that gives computers the ability to learn without being explicitly programmed.

Classified into three broad categories:
- Supervised learning
- Unsupervised learning
- *Reinforcement learning

6. Machine Learning Categories

Supervised learning:
- Decision tree learning
- Classification
- …

Unsupervised learning:
- Clustering
- Association rule learning
- …

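7. Cluster Definition

Cluster analysis, or clustering, groups similar objects together (each group is called a cluster).

Two kinds of similarity describe a clustering:
- Intra-class similarity (between objects in the same cluster), which should be high
- Inter-class similarity (between objects in different clusters), which should be low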

8. Clustering Scenarios

The following scenarios use clustering:

- Market segmentation
- News summarization (cluster, then find the centroid)
- City planning
- Image segmentation

9. Methods of Clustering

- Partitioning methods (centroid models)
- Hierarchical methods (connectivity models)
- Density-based methods
- Grid-based methods
- Model-based methods
- Constraint-based methods

10. Partitioning Method

Given a database of n objects, a partitioning method constructs k partitions of the data that satisfy the following:
- Each group contains at least one object
- Each object belongs to exactly one group

Points to remember:
- The method creates an initial partitioning
- It then uses an iterative relocation technique to improve the partitioning

11. K-Means or Lloyd's Algorithm
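
As a rough illustration of the initial partitioning and iterative relocation idea, here is a minimal sketch of Lloyd's iteration in base R. The function name `lloyd_kmeans`, its arguments, and the toy data are assumptions made for this example and are not taken from the slides.

```r
# Minimal sketch of Lloyd's algorithm (illustrative, not production code).
lloyd_kmeans <- function(x, k, max_iter = 100) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Initial partitioning: use k distinct data points as starting centroids.
  centroids <- x[sample(n, k), , drop = FALSE]
  assignment <- rep(0L, n)
  for (iter in seq_len(max_iter)) {
    # Squared Euclidean distance from every point to every centroid (n x k).
    d <- sapply(seq_len(k), function(j) {
      rowSums((x - matrix(centroids[j, ], n, ncol(x), byrow = TRUE))^2)
    })
    d <- matrix(d, nrow = n)
    # Assignment step: each point joins its nearest centroid.
    new_assignment <- max.col(-d, ties.method = "first")
    if (all(new_assignment == assignment)) break  # nothing moved: converged
    assignment <- new_assignment
    # Relocation step: move each centroid to the mean of its members.
    for (j in seq_len(k)) {
      members <- x[assignment == j, , drop = FALSE]
      # An empty cluster keeps its old centroid in this simple sketch.
      if (nrow(members) > 0) centroids[j, ] <- colMeans(members)
    }
  }
  list(cluster = assignment, centers = centroids, iterations = iter)
}

# Toy usage on two well-separated synthetic blobs (assumed data).
set.seed(42)
pts <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2))
fit <- lloyd_kmeans(pts, k = 2)
table(fit$cluster)
```

The result depends on the random starting centroids, so in practice the algorithm is usually run several times and the best run is kept.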

12. Other K-Means Variants

- K-means++
- Streaming k-means
- Mini-batch k-means
- K-medoids
- Fuzzy k-means
- Many others
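
To give a feel for how one of these variants differs from plain k-means, the sketch below shows the seeding step of k-means++: the first centre is picked uniformly at random, and each later centre is drawn with probability proportional to its squared distance from the nearest centre chosen so far. The function name `kmeanspp_seeds` and the toy data are assumptions for illustration.

```r
# Sketch of k-means++ seeding in base R (assumes at least k distinct points).
kmeanspp_seeds <- function(x, k) {
  x <- as.matrix(x)
  n <- nrow(x)
  chosen <- integer(k)
  chosen[1] <- sample.int(n, 1)  # first centre: uniform at random
  # Squared distance from each point to its nearest chosen centre so far.
  d2 <- rowSums(sweep(x, 2, x[chosen[1], ])^2)
  for (j in seq_len(k)[-1]) {
    # Far-away points are more likely to become the next centre.
    chosen[j] <- sample.int(n, 1, prob = d2)
    d2 <- pmin(d2, rowSums(sweep(x, 2, x[chosen[j], ])^2))
  }
  x[chosen, , drop = FALSE]
}

# Toy usage: seed base R's kmeans() with the chosen centres.
set.seed(3)
pts <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 10, sd = 0.3), ncol = 2))
seeds <- kmeanspp_seeds(pts, k = 3)
fit <- kmeans(pts, centers = seeds)
table(fit$cluster)
```

Only the initialisation changes here; the iterations that follow are the ordinary k-means updates.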

13. K-Means Clustering with R
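
A minimal sketch of k-means in R, using the `kmeans()` function from base R's stats package. The choice of the built-in `iris` data set, of three clusters, and of `nstart = 25` are assumptions made for this example.

```r
# k-means on the four numeric iris measurements (species labels are not used).
set.seed(1)
features <- scale(iris[, 1:4])          # standardise so no column dominates
fit <- kmeans(features, centers = 3, nstart = 25)

# Compare the discovered clusters with the known species, for inspection only.
table(cluster = fit$cluster, species = iris$Species)

fit$centers        # one row per cluster centroid (in scaled units)
fit$tot.withinss   # total within-cluster sum of squares
```

`nstart` reruns the algorithm from several random starts and keeps the run with the lowest within-cluster sum of squares, which reduces sensitivity to the initial partitioning.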

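14. Hierarchical Clustering

- Agglomerative (bottom-up): start with each object in its own cluster and repeatedly merge the closest clusters
- Divisive (top-down): start with all objects in one cluster and repeatedly split it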

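15. Calculating the Distance Between Clusters

- Single linkage: the distance between the closest pair of points, one from each cluster
- Complete linkage: the distance between the farthest pair of points, one from each cluster
- Average linkage: the average distance over all pairs of points, one from each cluster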
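
The three linkage criteria can be compared on a small worked example. This is a sketch in base R; the helper name `linkage_distance` and the two tiny clusters are assumptions chosen so the numbers are easy to check by hand.

```r
# Distance between two clusters under a given linkage (Euclidean distances).
linkage_distance <- function(a, b, method = c("single", "complete", "average")) {
  method <- match.arg(method)
  a <- as.matrix(a)
  b <- as.matrix(b)
  # All pairwise distances, then keep only the a-to-b block.
  all_d <- as.matrix(dist(rbind(a, b)))
  cross <- all_d[seq_len(nrow(a)), nrow(a) + seq_len(nrow(b)), drop = FALSE]
  switch(method,
         single   = min(cross),   # closest pair
         complete = max(cross),   # farthest pair
         average  = mean(cross))  # mean over all pairs
}

a <- rbind(c(0, 0), c(1, 0))
b <- rbind(c(3, 0), c(5, 0))
# The four cross-cluster distances are 3, 5, 2 and 4.
linkage_distance(a, b, "single")    # 2
linkage_distance(a, b, "complete")  # 5
linkage_distance(a, b, "average")   # 3.5
```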

16. Hierarchical Clustering with R
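
A minimal sketch of agglomerative clustering in R with the base functions `dist()`, `hclust()`, and `cutree()`. The `iris` data set, the average linkage, and the cut into three groups are assumptions made for this example.

```r
# Agglomerative clustering of the four numeric iris measurements.
features <- scale(iris[, 1:4])
d <- dist(features, method = "euclidean")   # pairwise distance matrix
hc <- hclust(d, method = "average")         # also: "single", "complete"

# Dendrogram of the merge sequence (labels hidden to keep it readable).
plot(hc, labels = FALSE, hang = -1, main = "Average linkage on iris")

# Cut the tree into three clusters and compare with the known species.
groups <- cutree(hc, k = 3)
table(cluster = groups, species = iris$Species)
```

Changing the `method` argument switches the linkage criterion without changing the rest of the code.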

17. Density-Based Methods

- Areas of higher density are considered clusters
- Sparse areas are usually considered noise
- These methods use two basic ideas:
  - Density reachability
  - Density connectivity

18. DBSCAN (Density-Based Spatial Clustering of Applications with Noise)

Advantages:
- Does not require a priori specification of the number of clusters
- Able to identify noise data while clustering
- Able to find clusters of arbitrary size and arbitrary shape

Disadvantages:
- Fails on neck-type datasets
- Does not work well on high-dimensional data
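
To make density reachability concrete, here is a simplified DBSCAN sketch in base R. It is an illustration under assumptions, not the reference implementation: the function name `dbscan_sketch`, the parameter names `eps` and `min_pts`, and the toy data are all chosen for this example, and it builds a full distance matrix, so it only suits small data sets.

```r
# Simplified DBSCAN: label 0 means noise, positive integers are clusters.
dbscan_sketch <- function(x, eps, min_pts) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- as.matrix(dist(x))
  # eps-neighbourhood of every point (each point counts itself).
  neighbours <- lapply(seq_len(n), function(i) which(d[i, ] <= eps))
  is_core <- vapply(neighbours, length, integer(1)) >= min_pts
  cluster <- rep(0L, n)
  current <- 0L
  for (i in seq_len(n)) {
    if (cluster[i] != 0L || !is_core[i]) next
    current <- current + 1L        # start a new cluster from core point i
    cluster[i] <- current
    queue <- neighbours[[i]]
    while (length(queue) > 0) {
      j <- queue[1]
      queue <- queue[-1]
      if (cluster[j] != 0L) next
      cluster[j] <- current        # j is density-reachable from i
      # Only core points extend the cluster; border points stop the growth.
      if (is_core[j]) queue <- c(queue, neighbours[[j]])
    }
  }
  cluster
}

# Toy data: two 5 x 5 grids with spacing 0.1, plus one isolated point.
blob_a <- as.matrix(expand.grid(x = 0:4, y = 0:4)) * 0.1
blob_b <- blob_a + 5
pts <- rbind(blob_a, blob_b, c(2.5, -3))
labels <- dbscan_sketch(pts, eps = 0.15, min_pts = 3)
table(labels)
```

On this toy data the two grids come out as clusters 1 and 2 and the isolated point keeps label 0 (noise), without the number of clusters being specified in advance.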

21. Grid-Based Algorithms

- Use a multi-resolution grid data structure
- Clustering complexity depends on the number of grid cells, not on the number of objects
- The object space is divided into a finite number of cells that form a grid structure, on which all clustering operations are performed
- Examples: CLIQUE, STING, WaveCluster
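
The general grid idea can be sketched in a few lines of base R: assign points to cells, keep the cells that hold enough points, and join adjacent dense cells into clusters. This is a simplified single-resolution, two-dimensional illustration and is not CLIQUE, STING, or WaveCluster; the function name `grid_cluster_sketch` and the parameters `cell_size` and `min_count` are assumptions for this example.

```r
# Grid-based clustering sketch for 2-D data: label 0 marks sparse cells.
grid_cluster_sketch <- function(x, cell_size, min_count) {
  x <- as.matrix(x)
  # Map every point to an integer cell coordinate and a text key.
  cells <- floor(x / cell_size)
  key <- paste(cells[, 1], cells[, 2], sep = ":")
  counts <- table(key)
  dense <- names(counts)[counts >= min_count]
  cell_label <- setNames(rep(0L, length(dense)), dense)
  current <- 0L
  # From here on the work is done per cell, not per point.
  for (start in dense) {
    if (cell_label[[start]] != 0L) next
    current <- current + 1L
    stack <- start
    while (length(stack) > 0) {
      cell <- stack[length(stack)]
      stack <- stack[-length(stack)]
      if (cell_label[[cell]] != 0L) next
      cell_label[[cell]] <- current
      ij <- as.integer(strsplit(cell, ":", fixed = TRUE)[[1]])
      # Dense cells sharing an edge belong to the same cluster.
      near <- paste(ij[1] + c(-1, 1, 0, 0), ij[2] + c(0, 0, -1, 1), sep = ":")
      stack <- c(stack, near[near %in% dense])
    }
  }
  labels <- cell_label[key]
  labels[is.na(labels)] <- 0L   # points in sparse cells
  unname(labels)
}

# Toy data: two 5 x 5 grids of points plus one isolated point.
blob_a <- as.matrix(expand.grid(x = 0:4, y = 0:4)) * 0.1
blob_b <- blob_a + 5
pts <- rbind(blob_a, blob_b, c(2.5, -3))
labels <- grid_cluster_sketch(pts, cell_size = 0.5, min_count = 5)
table(labels)
```

Once the points are binned, the clustering loop only touches cells, which is why the cost of this stage depends on the number of occupied cells and not on the number of objects.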

22. CLIQUE (CLustering In QUEst)

- CLIQUE is used for clustering high-dimensional data
- High-dimensional data means data with many attributes
- CLIQUE identifies the dense units in subspaces

23. Stack Overflow Analysis Using R
