
Introduction to Machine Learning

Clustering

Varun Chandola
Computer Science & Engineering
State University of New York at Buffalo, Buffalo, NY, USA
chandola@buffalo.edu

Outline

- Clustering
  - Clustering Definition
  - K-Means Clustering
  - Instantiations and Variants of K-Means
  - Choosing Parameters
  - Initialization Issues
  - K-Means Limitations
- Spectral Clustering
  - Graph Laplacian
  - Spectral Clustering Algorithm

Publishing a Magazine

- Imagine you are a magazine editor
- You need to produce the next issue
- What do you do? Call your four assistant editors:
  1. Politics
  2. Health
  3. Technology
  4. Sports
- Ask each to send in k articles
- Join all the articles to create an issue

Treating a Magazine Issue as a Data Set

- Each article is a data point consisting of words, etc.
- Each article has a (hidden) type: sports, health, politics, or technology
- Now imagine you are the reader. Can you assign the type to each article?
- A simpler problem: can you group the articles by type?
- That grouping problem is clustering.

What is Clustering?

- Grouping similar things together
- Requires a notion of a similarity or distance metric
- A type of unsupervised learning: learning without any labels or targets

Expected Outcome of Clustering

[Figure: two scatter plots over features x1 and x2; the raw points, and the same points grouped into four clusters labeled Technology, Health, Politics, and Sports.]

K-Means Clustering

Objective: group a set of N points ($\in \mathbb{R}^D$) into K clusters.

1. Start with K randomly initialized points in D-dimensional space, denoted $\{c_k\}_{k=1}^{K}$ and also called the cluster centers.
2. Assign each input point $x_n$ ($\forall n \in [1, N]$) to the cluster $k$ that attains $\min_k \mathrm{dist}(x_n, c_k)$.
3. Revise each cluster center $c_k$ using all points assigned to cluster $k$.
4. Repeat from step 2 until the assignments stop changing.
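A minimal NumPy sketch of these four steps follows; the function name, its arguments, and the convergence test are illustrative choices rather than prescribed by the slides, and Euclidean distance is assumed.

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """K-means (Lloyd's algorithm): X is an (N, D) data matrix."""
    rng = np.random.default_rng(seed)
    # Step 1: initialize the K cluster centers with randomly chosen data points.
    centers = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Step 2: assign each point to the nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Step 3: move each center to the mean of the points assigned to it.
        new_centers = np.array([X[assign == k].mean(axis=0) if np.any(assign == k)
                                else centers[k] for k in range(K)])
        # Step 4: repeat until the centers stop moving.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign
```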

Variants of K-Means

- Finding the distance: Euclidean distance is popular
- Finding the cluster centers:
  - Mean for K-means
  - Median for K-medoids
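In code, these variants differ only in the center-update step; a sketch (the helper name is illustrative):

```python
import numpy as np

def update_center(points_in_k, variant="mean"):
    """Center update for one cluster: the only step that changes across variants."""
    if variant == "mean":
        return points_in_k.mean(axis=0)       # K-means
    return np.median(points_in_k, axis=0)     # median-based update, per the slide
```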

Choosing Parameters

1. Similarity/distance metric
   - Can use non-linear transformations
   - K-means with Euclidean distance produces "circular" clusters
2. How to set K?
   - Trial and error
   - How to evaluate a clustering? Use the K-means objective function:

$$J(\mathbf{c}, \mathbf{R}) = \sum_{n=1}^{N} \sum_{k=1}^{K} R_{nk} \, \| x_n - c_k \|^2$$

where $\mathbf{R}$ is the cluster assignment matrix:

$$R_{nk} = \begin{cases} 1 & \text{if } x_n \in \text{cluster } k \\ 0 & \text{otherwise} \end{cases}$$
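The objective translates directly into code; a sketch for scoring a clustering, e.g. to compare different values of K during trial and error (`kmeans_objective` is an illustrative name):

```python
import numpy as np

def kmeans_objective(X, centers, assign):
    """J(c, R): total squared distance from each point to its assigned center."""
    diffs = X - centers[assign]    # x_n - c_k for the center each point is assigned to
    return float(np.sum(diffs ** 2))
```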

Initialization Issues

- A bad initialization can lead to a wrong clustering
- Better strategies:
  1. Choose the first centroid randomly, choose the second farthest away from the first, the third farthest away from the first and second, and so on.
  2. Make multiple runs and choose the best
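Strategy 1 has a direct implementation; a sketch assuming Euclidean distance (the function name is illustrative; the well-known k-means++ method is a randomized relative of the same idea):

```python
import numpy as np

def farthest_point_init(X, K, seed=0):
    """Pick the first center at random, then each next center as the point
    farthest from all centers chosen so far."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(K - 1):
        # For every point, the distance to its nearest already-chosen center.
        d = np.linalg.norm(X[:, None, :] - np.array(centers)[None, :, :], axis=2).min(axis=1)
        centers.append(X[d.argmax()])    # the farthest point becomes the next center
    return np.array(centers)
```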

Strengths and Limitations of K-Means

Strengths
- Simple
- Can be extended to other types of data
- Easy to parallelize

Weaknesses
- Produces "circular" clusters (though not in kernelized versions)
- Choosing K is always an issue
- Not guaranteed to be optimal
- Works well only if the natural clusters are round and of equal density
- Hard clustering

Issues with K-Means

- "Hard clustering": every data point is assigned to exactly one cluster
- Probabilistic clustering: each data point can belong to multiple clusters with varying probabilities; in general,

$$P(x_i \in C_j) > 0 \quad \forall j = 1, \ldots, K$$

- For hard clustering, the probability is 1 for one cluster and 0 for all others
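The contrast is easy to see on a single point; an illustrative example (the softmax weighting is one possible soft assignment, not one the slides prescribe):

```python
import numpy as np

# Distances from one point x_i to K = 3 cluster centers.
d = np.array([0.5, 1.0, 3.0])

# Hard clustering: probability 1 for the nearest cluster, 0 for the others.
hard = np.eye(3)[d.argmin()]             # -> [1., 0., 0.]

# Soft (probabilistic) clustering: every probability is > 0 and they sum to 1.
soft = np.exp(-d) / np.exp(-d).sum()     # closer centers get higher probability
```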

Spectral Clustering

- An alternative approach to clustering
- Let the data be a set of N points:

$$X = \{x_1, x_2, \ldots, x_N\}$$

- Let $S$ be an $N \times N$ similarity matrix:

$$S_{ij} = \mathrm{sim}(x_i, x_j)$$

where $\mathrm{sim}(\cdot, \cdot)$ is a similarity function.

- Construct a weighted undirected graph from $S$ with adjacency matrix $W$:

$$W_{ij} = \begin{cases} \mathrm{sim}(x_i, x_j) & \text{if } x_i \text{ is a nearest neighbor of } x_j \\ 0 & \text{otherwise} \end{cases}$$

- Can use more than one nearest neighbor to construct the graph
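A sketch of the graph construction; the slides leave $\mathrm{sim}(\cdot, \cdot)$ abstract, so the Gaussian kernel below is an assumed choice, as are the function and parameter names:

```python
import numpy as np

def knn_similarity_graph(X, n_neighbors=5, sigma=1.0):
    """Adjacency matrix W of a weighted, undirected k-nearest-neighbor graph."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    S = np.exp(-sq_dists / (2 * sigma ** 2))       # similarity matrix S
    W = np.zeros_like(S)
    for i in range(len(X)):
        # Keep edges only to the n_neighbors most similar points (skip i itself).
        nbrs = np.argsort(S[i])[::-1][1:n_neighbors + 1]
        W[i, nbrs] = S[i, nbrs]
    return np.maximum(W, W.T)                      # symmetrize: undirected graph
```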

Spectral Clustering as a Graph Min-cut Problem

- Clustering X into K clusters is equivalent to finding K cuts in the graph W: $A_1, A_2, \ldots, A_K$
- A possible objective function:

$$\mathrm{cut}(A_1, A_2, \ldots, A_K) \triangleq \frac{1}{2} \sum_{k=1}^{K} W(A_k, \bar{A}_k)$$

where $\bar{A}_k$ denotes the nodes in the graph that are not in $A_k$, and

$$W(A, B) \triangleq \sum_{i \in A,\, j \in B} W_{ij}$$
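Evaluating the cut objective for a given partition is straightforward; a sketch (function and argument names are illustrative):

```python
import numpy as np

def cut_value(W, labels, K):
    """cut(A_1, ..., A_K) = 1/2 * sum_k W(A_k, complement of A_k)."""
    total = 0.0
    for k in range(K):
        in_k = labels == k
        total += W[np.ix_(in_k, ~in_k)].sum()   # weight of edges leaving A_k
    return total / 2.0
```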

Straight min-cut results in a trivial solution

- For K = 2, an optimal solution would place only one node in $A_1$ and the rest in $A_2$ (or vice versa)

Normalized Min-cut Problem

$$\mathrm{normcut}(A_1, A_2, \ldots, A_K) \triangleq \frac{1}{2} \sum_{k=1}^{K} \frac{W(A_k, \bar{A}_k)}{\mathrm{vol}(A_k)}$$

where $\mathrm{vol}(A) \triangleq \sum_{i \in A} d_i$ and $d_i$ is the weighted degree of node $i$.
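Normalization only changes the denominator; a sketch that mirrors `cut_value` (again with illustrative names):

```python
import numpy as np

def normcut_value(W, labels, K):
    """normcut = 1/2 * sum_k W(A_k, complement of A_k) / vol(A_k)."""
    d = W.sum(axis=1)                 # weighted degree d_i of each node
    total = 0.0
    for k in range(K):
        in_k = labels == k
        total += W[np.ix_(in_k, ~in_k)].sum() / d[in_k].sum()   # vol(A_k) = sum of d_i in A_k
    return total / 2.0
```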

An NP-Hard Problem

- Equivalent to solving a 0-1 knapsack problem
- Find N binary vectors $c_i$ of length K such that $c_{ik} = 1$ only if point $i$ belongs to cluster $k$
- If we relax the constraints to allow $c_{ik}$ to be real-valued, the problem becomes an eigenvector problem; hence the name: spectral clustering

The Graph Laplacian

$$L \triangleq D - W$$

- $D$ is a diagonal matrix whose diagonal values are the degrees of the corresponding nodes

Properties of the Laplacian Matrix

1. Each row sums to 0
2. $\mathbf{1}$ is an eigenvector with eigenvalue equal to 0
3. Symmetric and positive semi-definite
4. Has N non-negative real-valued eigenvalues
5. If the graph (W) has K connected components, then the eigenspace of L for eigenvalue 0 is spanned by the indicator vectors $\mathbf{1}_{A_1}, \ldots, \mathbf{1}_{A_K}$
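These properties can be checked numerically; a small self-contained example on a toy graph with two connected components:

```python
import numpy as np

def graph_laplacian(W):
    """L = D - W, where D is the diagonal matrix of weighted node degrees."""
    return np.diag(W.sum(axis=1)) - W

# Toy graph: two connected components, {0, 1} and {2, 3}.
W = np.array([[0., 1., 0., 0.],
              [1., 0., 0., 0.],
              [0., 0., 0., 1.],
              [0., 0., 1., 0.]])
L = graph_laplacian(W)

assert np.allclose(L.sum(axis=1), 0)        # property 1: rows sum to 0
vals = np.linalg.eigvalsh(L)                # symmetric, so eigenvalues are real
assert np.all(vals >= -1e-12)               # properties 3-4: PSD, non-negative spectrum
assert np.sum(np.isclose(vals, 0)) == 2     # property 5: K = 2 components -> two 0 eigenvalues
```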

Spectral Clustering Algorithm

Observation
- In practice, W might not have exactly K isolated connected components
- By perturbation theory, the eigenvectors of L with the smallest eigenvalues will be close to the ideal indicator vectors

Algorithm
- Compute the first (smallest) K eigenvectors of L
- Let U be the N × K matrix with these eigenvectors as its columns
- Perform K-means clustering on the rows of U
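Putting the pieces together; a sketch that reuses the `graph_laplacian` and `kmeans` sketches above:

```python
import numpy as np

def spectral_clustering(W, K):
    """Unnormalized spectral clustering on adjacency matrix W, per the slide."""
    L = graph_laplacian(W)
    vals, vecs = np.linalg.eigh(L)    # eigenpairs in ascending order of eigenvalue
    U = vecs[:, :K]                   # N x K matrix: the K smallest eigenvectors as columns
    _, labels = kmeans(U, K)          # cluster the rows of U
    return labels
```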
