A Tutorial on Spectral Clustering
Ulrike von Luxburg
Max Planck Institute for Biological Cybernetics
Statistics and Computing, Dec. 2007, Vol. 17, No. 4
2011-07-22
Presented by Yongjin Kwon
Outline
Introduction
Spectral Clustering Algorithms
Two Explanations of Spectral Clustering
Graph Partitioning Point of View
Random Walks Point of View
Conclusion
Introduction
Clustering Algorithms
k-means / k-means++
Mixture of Gaussians (MoG)
Hierarchical Clustering (Centroid-based, MST, Average Distance)
DBSCAN
ROCK
BIRCH
CURE
…
Introduction (Cont’d)
Spectral Clustering
A simple but powerful method of clustering
Requires fewer assumptions about the form of the clusters
Often outperforms traditional approaches such as k-means clustering
(Figure source: http://www.squobble.com/academic/publications/FFF_MIMO/node4.html)
Introduction (Cont’d)
Spectral Clustering?
Spectrum
Spectral analysis
– Scientific or mathematical methods of analyzing something, such as light or waves, and finding their basis
Introduction (Cont’d)
Spectral Clustering?
Spectral analysis in linear algebra
– Basic features of matrices: eigenpairs (eigenvalue, eigenvector)
– Methods of using the eigenpairs to solve given problems
Spectral Clustering!
Methods of using the eigenvectors of some matrices to find a partition of the data such that points in the same group are similar
Spectral Clustering Algorithms
Similarity Graphs
ε-neighborhood graph
– Connect all points whose pairwise distances are smaller than ε.
k-nearest neighbor graph
– Connect two points if one is among the k-nearest neighbors of the other (and vice versa for mutual k-nearest neighbor graph).
– Each edge is weighted by the similarity of their endpoints.
fully connected graph
– Connect all points and weight all edges by similarity of their endpoints.
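The three graph constructions above can be sketched with numpy. This is a minimal sketch on made-up 2-D points; the Euclidean metric and the Gaussian bandwidth σ are assumptions, not fixed by the slides:

```python
import numpy as np

def pairwise_dist(X):
    """Euclidean distance matrix between the rows of X."""
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

def eps_graph(X, eps):
    """ε-neighborhood graph: connect points closer than eps (unweighted)."""
    D = pairwise_dist(X)
    W = (D < eps).astype(float)
    np.fill_diagonal(W, 0.0)                  # no self-loops
    return W

def knn_graph(X, k, sigma=1.0, mutual=False):
    """(Mutual) k-nearest-neighbor graph, edges weighted by Gaussian similarity."""
    D = pairwise_dist(X)
    n = len(X)
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]  # k nearest neighbors, skipping self
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = True
    conn = A & A.T if mutual else A | A.T     # mutual kNN vs. ordinary kNN
    S = np.exp(-D**2 / (2 * sigma**2))        # Gaussian similarity of endpoints
    return np.where(conn, S, 0.0)

def full_graph(X, sigma=1.0):
    """Fully connected graph, all edges weighted by Gaussian similarity."""
    S = np.exp(-pairwise_dist(X)**2 / (2 * sigma**2))
    np.fill_diagonal(S, 0.0)
    return S
```

All three constructions return a symmetric matrix with zero diagonal, which is what the spectral clustering algorithms below expect as input.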
Spectral Clustering Algorithms (Cont’d)
In spectral clustering, the Gaussian similarity s(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²)) is used to represent local neighborhood relationships.
W = (w_ij): adjacency matrix of the similarity graph
D = diag(d_1, …, d_n) with d_i = Σ_j w_ij: degree matrix
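A minimal numpy illustration of W and D, with made-up similarity values just to fix the notation:

```python
import numpy as np

# toy adjacency matrix W of a similarity graph (symmetric, zero diagonal)
W = np.array([[0. , 0.8, 0.6, 0. ],
              [0.8, 0. , 0.9, 0. ],
              [0.6, 0.9, 0. , 0.2],
              [0. , 0. , 0.2, 0. ]])
d = W.sum(axis=1)          # degrees d_i = sum_j w_ij
D = np.diag(d)             # degree matrix
print(d)                   # [1.4 1.7 1.7 0.2]
```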
Spectral Clustering Algorithms (Cont’d)
Graph Laplacian
unnormalized graph Laplacian: L = D − W
normalized graph Laplacians:
– L_sym = D^(−1/2) L D^(−1/2) = I − D^(−1/2) W D^(−1/2)
– L_rw = D^(−1) L = I − D^(−1) W (related to random walks)
[Example: Laplacian matrices of a small graph, assuming all edge weights are 1.]
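These definitions translate line-for-line into numpy. A sketch on a path graph with unit edge weights (the toy graph itself is an assumption):

```python
import numpy as np

# path graph on 4 nodes, all edge weights equal to 1
W = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
d = W.sum(axis=1)
D = np.diag(d)
L = D - W                                   # unnormalized Laplacian
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = D_inv_sqrt @ L @ D_inv_sqrt         # symmetric normalized Laplacian
L_rw = np.diag(1.0 / d) @ L                 # random-walk Laplacian

# the constant vector is always an eigenvector of L with eigenvalue 0
print(np.allclose(L @ np.ones(4), 0.0))     # True
```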
Spectral Clustering Algorithms (Cont’d)
Unnormalized Spectral Clustering
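The slide's algorithm box did not survive extraction; the standard steps (build the Laplacian, take the first k eigenvectors, run k-means on the rows) can be sketched as follows. The tiny Lloyd's k-means and its evenly spaced initialization are illustrative assumptions, not part of the algorithm itself:

```python
import numpy as np

def unnormalized_spectral_clustering(W, k, iters=50):
    """Unnormalized spectral clustering: Laplacian -> eigenvectors -> k-means."""
    D = np.diag(W.sum(axis=1))
    L = D - W                                  # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)                # eigenvalues in ascending order
    U = vecs[:, :k]                            # first k eigenvectors as columns
    # tiny Lloyd's k-means on the rows of U (evenly spaced init, for the sketch)
    centers = U[np.linspace(0, len(U) - 1, k).astype(int)]
    for _ in range(iters):
        labels = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):            # guard against empty clusters
                centers[j] = U[labels == j].mean(axis=0)
    return labels
```

On a graph with two disconnected cliques, the first two eigenvectors are constant on each component, so k-means on the rows recovers the components exactly.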
Spectral Clustering Algorithms (Cont’d)
Normalized Spectral Clustering [Shi2000]
Spectral Clustering Algorithms (Cont’d)
Normalized Spectral Clustering [Ng2002]
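The two normalized variants differ only in which Laplacian supplies the embedding: [Shi2000] uses the generalized eigenvectors of Lu = λDu (equivalently, eigenvectors of L_rw), while [Ng2002] uses L_sym and additionally normalizes the rows to unit length before k-means. A sketch of the [Ng2002] embedding step (the final k-means on the rows of T is omitted; the graph is assumed to have no isolated vertices):

```python
import numpy as np

def ngjordanweiss_embedding(W, k):
    """Rows of T are the [Ng2002] spectral embedding of the graph's vertices."""
    d = W.sum(axis=1)
    D_is = np.diag(1.0 / np.sqrt(d))            # D^(-1/2); assumes all d_i > 0
    L_sym = np.eye(len(W)) - D_is @ W @ D_is    # symmetric normalized Laplacian
    _, vecs = np.linalg.eigh(L_sym)
    U = vecs[:, :k]                             # first k eigenvectors as columns
    T = U / np.linalg.norm(U, axis=1, keepdims=True)  # row-normalize (the extra step)
    return T                                    # then run k-means on the rows of T
```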
Two Explanations of Spectral Clustering
Graph partitioning point of view
Based on mincut problem in a similarity graph
Find a partition such that the edges between clusters have a very low weight and the edges within a cluster have high weight.
Random walks point of view
Based on random walks on the similarity graph
Find a partition such that the random walk stays long within the same cluster and seldom jumps to other clusters.
Graph Partitioning Point of View
Mincut problem
Given a number k, find a partition A_1, …, A_k which minimizes
cut(A_1, …, A_k) = ½ Σ_i W(A_i, Ā_i), where W(A, B) = Σ_{i ∈ A, j ∈ B} w_ij.
In practice, the mincut problem often results in a size imbalance in the partition (e.g., it simply separates an individual vertex).
Graph Partitioning Point of View (Cont’d)
Two common objective functions
Normalized mincut problem
Given a number k, find a partition A_1, …, A_k which minimizes
RatioCut(A_1, …, A_k) = Σ_i cut(A_i, Ā_i) / |A_i|
or Ncut(A_1, …, A_k) = Σ_i cut(A_i, Ā_i) / vol(A_i), with cut(A, Ā) = W(A, Ā).
Graph Partitioning Point of View (Cont’d)
Represent a partition A_1, …, A_k by k indicator vectors h_j = (h_1j, …, h_nj)', where h_ij = 1 if v_i ∈ A_j and 0 otherwise.
Relationship between the mincut problem and the graph Laplacian: since x' L x = ½ Σ_ij w_ij (x_i − x_j)², we get h_j' L h_j = cut(A_j, Ā_j).
The mincut problem is converted into: minimize ½ Tr(H' L H) over binary matrices H = (h_1, …, h_k) whose rows each contain exactly one 1.
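The key identity behind this conversion, h' L h = cut(A, Ā) for a binary indicator vector h (a consequence of x' L x = ½ Σ_ij w_ij (x_i − x_j)²), can be checked numerically on a made-up toy graph:

```python
import numpy as np

# check h' L h = cut(A, A^c) for a binary indicator vector h
W = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 0.],
              [0., 1., 0., 0.]])          # toy graph with unit edge weights
L = np.diag(W.sum(axis=1)) - W
A = [0, 1, 2]                             # candidate cluster
h = np.zeros(4)
h[A] = 1.0                                # binary indicator vector of A
cut = W[np.ix_(A, [3])].sum()             # weight of edges leaving A (here: 1.0)
print(np.isclose(h @ L @ h, cut))         # True
```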
Graph Partitioning Point of View (Cont’d)
Rescaling the indicator vectors (h_ij = 1/√|A_j| for RatioCut, or 1/√vol(A_j) for Ncut, whenever v_i ∈ A_j), the normalized mincut problems can also be expressed with the graph Laplacian: RatioCut(A_1, …, A_k) = Tr(H' L H) with H' H = I, and Ncut(A_1, …, A_k) = Tr(H' L H) with H' D H = I.
Graph Partitioning Point of View (Cont’d)
The optimization problems are NP-hard.
Relax the discreteness condition on the indicator vectors, allowing them to take arbitrary real values.
By the Rayleigh–Ritz theorem, the solutions of the relaxed problems are the matrices whose columns are the eigenvectors belonging to the k smallest eigenvalues of the graph Laplacian L and of the normalized graph Laplacian (via the generalized problem Lu = λDu, i.e. L_rw), respectively.
Graph Partitioning Point of View (Cont’d)
Note that the solutions of the relaxed optimization problems do NOT directly indicate which nodes belong to which groups!
However, we hope that if the data are well-separated, the eigenvectors of the graph Laplacians are close to piecewise constant (and close to indicator vectors).
Thinking of the rows of the solution matrices as another representation of the data points, k-means clustering is a way of finding an appropriate group for each point.
Random Walks Point of View
Transition Probability Matrix: P = D^(−1) W, i.e. p_ij = w_ij / d_i.
If the graph is connected and non-bipartite, then the random walk always possesses a unique stationary distribution π = (π_1, …, π_n)', where π_i = d_i / vol(V).
[Example: transition matrix of a small graph, assuming all edge weights are 1.]
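A quick numpy check of P = D^(−1) W and the stationary distribution π_i = d_i / vol(V), on a made-up connected, non-bipartite graph with unit weights:

```python
import numpy as np

# small connected, non-bipartite graph (contains a triangle), unit weights
W = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 0.],
              [1., 1., 0., 1.],
              [0., 0., 1., 0.]])
d = W.sum(axis=1)
P = W / d[:, None]                 # row-stochastic: each row sums to 1
pi = d / d.sum()                   # stationary distribution pi_i = d_i / vol(V)
print(np.allclose(pi @ P, pi))     # True: pi is invariant under the walk
```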
Random Walks Point of View (Cont’d)
Relationship between the Ncut problem and random walks
For the random walk in the stationary distribution, Ncut(A, Ā) = P(Ā | A) + P(A | Ā), where P(B | A) = P(X_1 ∈ B | X_0 ∈ A).
Hence the problem of finding a partition such that the random walk does not have many opportunities to jump between clusters is equivalent to the Ncut problem.
Relationship between L_rw and P: L_rw = I − P, so (λ, u) is an eigenpair of L_rw if and only if (1 − λ, u) is an eigenpair of P.
Conclusion
Spectral clustering has been made popular by several researchers and extended to many non-standard settings.
Spectral clustering has many advantages.
Requires fewer assumptions about the form of the clusters
Is simple and efficient to implement
Has no issues of getting stuck in local minima
…
However, there are some issues when applying it.
Not trivial to choose a good similarity graph
Unstable under different parameter settings