Upload
others
View
23
Download
0
Embed Size (px)
Citation preview
Impact of Regularization on Spectral Clustering
Antony Joseph* and Bin Yu#
* Walmart Research Lab in San Francisco(formerly UCB and LBNL)
# Departments of Statistics and EECSUC Berkeley
Workshop on Spectral Algorithms, Simons Inst, Oct. , 2014
Monday, November 3, 2014
/46
spectral clusteringin
graphs
Berkeley Drosophila Genome Project (BDGP)(The fruit fly project)
Overview
2
collaborators :
• Siqi Wu, UC Berkeley
• Erwin Frise, Lawrence Berkeley Lab
• Ann Hammonds, Lawrence Berkeley Lab
• Sue Celniker, Lawrence Berkeley Lab
Monday, November 3, 2014
/46
spectral clusteringin
graphs
Berkeley Drosophila Genome Project (BDGP)(The fruit fly project)
Overview
2
collaborators :
• Siqi Wu, UC Berkeley
• Erwin Frise, Lawrence Berkeley Lab
• Ann Hammonds, Lawrence Berkeley Lab
• Sue Celniker, Lawrence Berkeley Lab
Monday, November 3, 2014
/46
A Graph
Social network people
The fruit fly project pixels/points in the embryo template
...
Context Nodes
3Monday, November 3, 2014
/46
The fruit fly project (Berkeley Drosophila Genome Project)
4Monday, November 3, 2014
/46
Drosophila(fruit fly)
Widely studied :• genetic mechanism similar to humans • easy to maintain in the lab• short life cycle • ...
5Monday, November 3, 2014
Image dataset from the fruit fly project
Monday, November 3, 2014
Image dataset from the fruit fly project
tailless
Monday, November 3, 2014
Image dataset from the fruit fly project
• Over 100,000 stained embryo images (over 7000 genes)
Monday, November 3, 2014
Image dataset from the fruit fly project
• Over 100,000 stained embryo images (over 7000 genes)
• the interaction between different genes
• the genes required for development of various organs.
Goals: Contribute to the understanding of ...
Monday, November 3, 2014
/46
‘Fate’ map in early embryos
Lohs-Schardin et. al (’70), Hartenstein et. al. (‘85)
Laser ablation experiments in embryos in early stages of development
7Monday, November 3, 2014
/46
‘Fate’ map in early embryos
hind gut
anterior mid-gut
dorsal epidermis
ventral neurogenic region
procephalic neurogenic region
pharynx
mesoderm
esophagus
Lohs-Schardin et. al (’70), Hartenstein et. al. (‘85)
Laser ablation experiments in embryos in early stages of development
7Monday, November 3, 2014
/46
Do genes explain the `fate’ map?
.... early stage gene expression images
8Monday, November 3, 2014
/46
Discovery of fate map &
communities on graphs
9Monday, November 3, 2014
/46
Discovery of fate map &
communities on graphs
Nodes : pixels/points in the embryo
9Monday, November 3, 2014
/46
Discovery of fate map &
communities on graphs
9Monday, November 3, 2014
/46
Discovery of fate map &
communities on graphs
Edge if lot of genes are co-expressed at the two nodes
9Monday, November 3, 2014
/46
Discovery of fate map &
communities on graphs
9Monday, November 3, 2014
/46
Discovery of fate map &
communities on graphs
fate map
9
??
Monday, November 3, 2014
/46
Edge between node i and node j
10Monday, November 3, 2014
/46
= . . . . . .
Edge between node i and node j
10Monday, November 3, 2014
/46
= . . . . . .
= . . . . . .
Edge between node i and node j
10Monday, November 3, 2014
/46
= . . . . . .
= . . . . . .
> >
Edge between node i and node j
10Monday, November 3, 2014
/46
= . . . . . .
= . . . . . .
> >
Edge between node i and node j
90-th percentile
10Monday, November 3, 2014
/46
τ = τ =
Take K = 8
11
Comparing unregularized vs. regularized SC
Monday, November 3, 2014
/46
τ = τ =
Take K = 8dorsal epidermis
mesoderm
hind gut
ventral neurogenic region
procephalic neurogenic region
anterior mid-gut
pharynx
esophagus
11
Comparing unregularized vs. regularized SC
Monday, November 3, 2014
/46
Communities
people like minded people
pixels/points in embryo area of future organs
...
Nodes Communities
12Monday, November 3, 2014
/46
Finding communities
13Monday, November 3, 2014
/46
Finding communities
Notion of (two) communities
13Monday, November 3, 2014
/46
Finding communities
Notion of (two) communities
Methods
Spectral clustering (Fiedler (’73), Donath & Hoffman (’73), ...)
Modularity (Newman & Girvan (‘03)), Latent space methods (Hoff et. al. (’02))Profile-likelihood (Bickel & Chen (’09)), Pseudo-Likelihood (Amini et. al. (’13)),
13Monday, November 3, 2014
/4614
Spectral Clustering
Monday, November 3, 2014
/46
Notation
15
Adjacency matrix:(symmetric binary)
Number of nodes: n
A ∈ Rn×n
Aij = Aji =
�1, (i, j)
0,
Monday, November 3, 2014
/46
Notation
15
Adjacency matrix:(symmetric binary)
Number of nodes: n
A ∈ Rn×n
Aij = Aji =
�1, (i, j)
0,
Each row/column of A associated with a node
Monday, November 3, 2014
/46
Notation
15
Adjacency matrix:(symmetric binary)
Number of nodes: n
A ∈ Rn×n
Aij = Aji =
�1, (i, j)
0,
Degree matrix:(diagonal)
D ∈ Rn×n
Dii =�
j
Aij
Monday, November 3, 2014
/46
Spectral Clustering
16
(Normalizedsymmetric Laplacian matrix)
= − / − /
Spectral clustering deals with the eigenvectors of the matrix :
Monday, November 3, 2014
/46
Spectral Clustering
16
(Normalizedsymmetric Laplacian matrix)
= − / − /
Spectral clustering deals with the eigenvectors of the matrix :
Other matrices used ...
D −A
A
D−1 A ( Normalized random walk Laplacian)
(Unnormalized Laplacian)
(Adjacency matrix)
Monday, November 3, 2014
/4617
Illustration of SC
Monday, November 3, 2014
/4617
Illustration of SC
A =
Monday, November 3, 2014
/4617
A =
Illustration of SC
Monday, November 3, 2014
/4617
L =
Illustration of SC
Monday, November 3, 2014
/4617
First eigenvector
Seco
nd e
igen
vect
or
0.052 0.05 0.048 0.046 0.044 0.042 0.04 0.038 0.0360.08
0.06
0.04
0.02
0
0.02
0.04
0.06
0.08
L =
Illustration of SC
Monday, November 3, 2014
/4617
First eigenvector
Seco
nd e
igen
vect
or
0.052 0.05 0.048 0.046 0.044 0.042 0.04 0.038 0.0360.08
0.06
0.04
0.02
0
0.02
0.04
0.06
0.08
L =
Illustration of SC
cluster
Seco
nd e
igen
vect
or
First eigenvector
Monday, November 3, 2014
/46
SC for finding K clusters (Shi and Malik (00), Ng et. al (’02))
18
n×K V K L
V K
Monday, November 3, 2014
/46
SC for finding K clusters (Shi and Malik (00), Ng et. al (’02))
18
n×K V K L
V K
V
Monday, November 3, 2014
/46
Popularity of spectral clustering
• Computational advantage :
-requires eigenvector decomposition which is very fast
Theoretical backing :
- relaxation of various cut-based measures
(Hagen & Kahng (’92), Shi & Malik (’00), Ng et al, (’02))
- Stochastic Block Model and its extensions
(McSherry (‘01), Rohe. et. al (‘11), Chaudhari et. al. (’12), Sussman (’12),
Fishkind (’11))
19Monday, November 3, 2014
/46
Performance of spectral clustering improves greatly through regularization
Regularization proposed by Amini, Chen, Bickel and Levina (AoS, 2013)
20Monday, November 3, 2014
/46
Performance of spectral clustering improves greatly through regularization
Regularization proposed by Amini, Chen, Bickel and Levina (AoS, 2013)
20
A
Aτ = A+τ
n11�, τ > 0.
Lτ Aτ
Vτ K
Vτ = K Lτ
Monday, November 3, 2014
/46
Performance of spectral clustering improves greatly through regularization
Regularization proposed by Amini, Chen, Bickel and Levina (AoS, 2013)
Alternative forms of regularization proposed and analyzed in Chaudhuri et. al (2012), Qin & Rohe (’13)
20
A
Aτ = A+τ
n11�, τ > 0.
Lτ Aτ
Vτ K
Vτ = K Lτ
Monday, November 3, 2014
/46
Stochastic Block Model
21Monday, November 3, 2014
/46
Stochastic Block Model (SBM) (Holland et. al (’83))
22
n
(i, j) Pij
Monday, November 3, 2014
/46
Stochastic Block Model (SBM) (Holland et. al (’83))
22
SBM with two blocks
=� �
=
n× n
n
(i, j) Pij
Monday, November 3, 2014
/4623
sample.4
.3
.2
.2
Monday, November 3, 2014
/4623
sample.4
.3
.2
.2
Monday, November 3, 2014
/46
Analysis of regularization for the SBM(Focus on K =2)
24Monday, November 3, 2014
/4625
Comparing unregularized vs. regularized SC
==
τ =
first sample eigenvector first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
seco
nd s
ampl
e ei
genv
ecto
r
τ =
.003
.04
.0025
.0025
Monday, November 3, 2014
/4625
Comparing unregularized vs. regularized SC
==
τ =
first sample eigenvector first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
seco
nd s
ampl
e ei
genv
ecto
r
τ =
.003
.04
.0025
.0025
k-means success : 100%k-means success : 87%
Monday, November 3, 2014
/46
Recap : Regularized spectral clustering
26
Aτ = A+τ
n11�, τ > 0.
Lτ = D−1/2τ AτD
−1/2τ
Vτ
Vτ = Lτ
Monday, November 3, 2014
/46
Population level quantities
27
A = P=
Monday, November 3, 2014
/46
Population level quantities
27
τ
τ
Pτ = P + τn11
�
Lpopτ
Monday, November 3, 2014
/46
Population level quantities
27
τ
τ
Pτ = P + τn11
�
Lpopτ
Recall:
Vτ n× 2
Vτ
Monday, November 3, 2014
/46
Population level quantities
27
τ
τ
Pτ = P + τn11
�
Lpopτ
Vτ V popτ
Monday, November 3, 2014
/46
Population level quantities
27
1,τ 2,τ
τ
τ
Pτ = P + τn11
�
Lpopτ
Vτ V popτ
Monday, November 3, 2014
/46
first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
0.052 0.05 0.048 0.046 0.044 0.042 0.04 0.0380.1
0.08
0.06
0.04
0.02
0
0.02
0.04
0.06
0.08
τ
28Monday, November 3, 2014
/46
first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
0.052 0.05 0.048 0.046 0.044 0.042 0.04 0.0380.1
0.08
0.06
0.04
0.02
0
0.02
0.04
0.06
0.08
τ
,τ
28Monday, November 3, 2014
/46
first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
0.052 0.05 0.048 0.046 0.044 0.042 0.04 0.0380.1
0.08
0.06
0.04
0.02
0
0.02
0.04
0.06
0.08
τ
,τ
,τ
,τ
28Monday, November 3, 2014
/46
first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
0.052 0.05 0.048 0.046 0.044 0.042 0.04 0.0380.1
0.08
0.06
0.04
0.02
0
0.02
0.04
0.06
0.08
τ
,τ
,τ
,τ
28Monday, November 3, 2014
/46
first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
0.052 0.05 0.048 0.046 0.044 0.042 0.04 0.0380.1
0.08
0.06
0.04
0.02
0
0.02
0.04
0.06
0.08
τ
,τ
,τ
,τ
28Monday, November 3, 2014
/46
first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
0.052 0.05 0.048 0.046 0.044 0.042 0.04 0.0380.1
0.08
0.06
0.04
0.02
0
0.02
0.04
0.06
0.08
τ
,τ
,τ
,τ
28Monday, November 3, 2014
/46
first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
0.052 0.05 0.048 0.046 0.044 0.042 0.04 0.0380.1
0.08
0.06
0.04
0.02
0
0.02
0.04
0.06
0.08
τ
,τ
,τ
,τ
28Monday, November 3, 2014
/46
first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
τ =max = , max ∈ � ,τ − ,τ�
� ,τ − ,τ�
0.052 0.05 0.048 0.046 0.044 0.042 0.04 0.0380.1
0.08
0.06
0.04
0.02
0
0.02
0.04
0.06
0.08
τ
,τ
,τ
,τ
28Monday, November 3, 2014
/4629
τ
τ =max = , max ∈ � ,τ − ,τ�
� ,τ − ,τ�
Monday, November 3, 2014
/4629
τ
τ =Lτ Lpop
τ
� 1,τ − 2,τ�
Monday, November 3, 2014
/46
τ
29
τ
τ =Lτ Lpop
τ
� 1,τ − 2,τ�
Monday, November 3, 2014
/4629
τ
τ =Lτ Lpop
τ
� 1,τ − 2,τ�
Monday, November 3, 2014
/4629
τ
τ �√ � τ − τ �
µ ,τ
Implication of matrix perturbation theory (Davis - Kahan) :
τ =Lτ Lpop
τ
� 1,τ − 2,τ�
Monday, November 3, 2014
/4629
τ
τ
τ �√ � τ − τ �
µ ,τ
Implication of matrix perturbation theory (Davis - Kahan) :
(µ2,τ τ)
τ =Lτ Lpop
τ
� 1,τ − 2,τ�
Monday, November 3, 2014
/4629
τ
Implication of concentration of Laplacian (Oliveira (’10)):
with high probability� τ − τ � � min
�√
, + τ, ,
( , + τ)
��log
τ � log ,
τ �√ � τ − τ �
µ ,τ
Implication of matrix perturbation theory (Davis - Kahan) :
τ =Lτ Lpop
τ
� 1,τ − 2,τ�
Monday, November 3, 2014
/46
Improvements using extension of techniques in Balakrishnan et. al. (’11).
29
τ
τ �√ � τ − τ �
µ ,τ
Implication of matrix perturbation theory (Davis - Kahan) :
τ =Lτ Lpop
τ
� 1,τ − 2,τ�
Monday, November 3, 2014
/4630
Let,
Set,
dn :=
τ = dn
Monday, November 3, 2014
/4630
Let,
Set,
dn :=
τ = dn
Result (SBM with two blocks):
dn �√n log n
µ2,0
Monday, November 3, 2014
/4630
Let,
Set,
Summary:
dn :=
τ = dn
Result (SBM with two blocks):
dn �√n log n
µ2,0
Monday, November 3, 2014
/46
Choice of regularization parameter
31Monday, November 3, 2014
/46
� τ − τ �µ ,τ
Recall: trade-offs dictated by
32Monday, November 3, 2014
/46
� τ − τ �µ ,τ
Recall: trade-offs dictated by
� τ − ˆτ �µ̂ ,τ
τ
32Monday, November 3, 2014
/46
Estimates based on estimated SBM (or degree corrected SBM)
� τ − τ �µ ,τ
Recall: trade-offs dictated by
� τ − ˆτ �µ̂ ,τ
τ
32Monday, November 3, 2014
/46
� τ − τ �µ ,τ
Recall: trade-offs dictated by
� τ − ˆτ �µ̂ ,τ
τ
32Monday, November 3, 2014
/46
ˆτ , µ̂ , τ
33
P=
τ
C1, C2
Monday, November 3, 2014
/46
ˆτ , µ̂ , τ
33
P=
τ
C1, C2
p1, p2 q C1 C2
e.g. p̂1 = C1
Monday, November 3, 2014
/46
ˆτ , µ̂ , τ
33
P̂ =
p̂1
p̂2q̂
q̂
q̂
τ
C1, C2
p1, p2 q C1 C2
e.g. p̂1 = C1
Monday, November 3, 2014
/46
ˆτ , µ̂ , τ
33
P̂ =
p̂1
p̂2q̂
q̂
q̂
τ
C1, C2
p1, p2 q C1 C2
e.g. p̂1 = C1
P̂ L̂popτ µ̂2,τ L̂pop
τ
Monday, November 3, 2014
/4634
Example
==
.003
.01
.0025
.0025
first sample eigenvector first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
seco
nd s
ampl
e ei
genv
ecto
r
τ = 0 τ = 18
Monday, November 3, 2014
/4634
Example
==
.003
.01
.0025
.0025
first sample eigenvector first sample eigenvector
seco
nd s
ampl
e ei
genv
ecto
r
seco
nd s
ampl
e ei
genv
ecto
r
k-means success : 94%k-means success : 75%
τ = 0 τ = 18
Monday, November 3, 2014
/46
Political blog data
35Monday, November 3, 2014
/46
source : Adamic & Glance (’05)
=
36Monday, November 3, 2014
/46
0 50 100 150 200 250 300 350 4000
100
200
300
400
500
600
700
800
Histogram of degrees
source : Adamic & Glance (’05)
=
36Monday, November 3, 2014
/46
first
eig
enve
ctor
nodes
Political blogs data set
Unregularized Spectral Clustering
37
seco
nd e
igen
vect
or
nodes
Monday, November 3, 2014
/4638Monday, November 3, 2014
/46
second eigenvector discriminates these from the remaining
38Monday, November 3, 2014
/4638Monday, November 3, 2014
/4638
third eigenvector
Monday, November 3, 2014
/46
Regularized SC for political blogs datasetfir
st e
igen
vect
or
seco
nd e
igen
vect
or
39
τ = .
nodesnodes
Monday, November 3, 2014
/46
Regularized SC for political blogs dataset
13% of misclassified nodes for regularizedcompared to 48% for unregularized
first
eig
enve
ctor
seco
nd e
igen
vect
or
39
τ = .
nodesnodes
Monday, November 3, 2014
/46
τ = τ =
Take K = 8
40
Comparing unregularized vs. regularized Spectral Clustering (SC)
Monday, November 3, 2014
/46
τ = τ =
Take K = 8dorsal epidermis
mesoderm
hind gut
ventral neurogenic region
procephalic neurogenic region
anterior mid-gut
pharynx
esophagus
40
Comparing unregularized vs. regularized Spectral Clustering (SC)
Monday, November 3, 2014
/46
Summary
• Theoretical upper bound under SBM shows “bias-variance”-like trade-off while the amount of regularization increases in SC
• Theoretical analysis motivates practically useful scheme (using SBM or degree-corrected SBM) to select regularization parameter in RSC.
Promising results in fruitfly image segmentation
Paper at (2014 rev): http://arxiv.org/pdf/1312.1733.pdf
41Monday, November 3, 2014
/46
Ongoing/future directions
The BDGP project (with Antony Joseph, Siqi Wu, Ann Hammonds, Sue Celniker, Erwin Frise)
• Fast algorithm for computing the data-driven choice of regularization parameter• Role of regularization in other scenarios, such as hierarchical clusters• Regularization parameter choice for continuous data
Spectral Clustering (with Antony Joseph)
• Analysis of gene interactions in different regions of early stage embryos• Extension of analysis to later stage embryos
42Monday, November 3, 2014