36
Topographic graph clustering with kernel and dissimilarity methods Nathalie Villa-Vialaneix http://www.nathalievilla.org & Fabrice Rossi University of Toulouse Seminar on Similarity-based learning on structures, Dagstuhl 16-20 February, 2009 Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 1 / 16

Topographic graph clustering with kernel and dissimilarity methods

  • Upload
    tuxette

  • View
    106

  • Download
    1

Embed Size (px)

DESCRIPTION

February 19th, 2009 Dagstuhl seminar on "Similarity-based learning on structures", Shloss Dagstuhl, Germany

Citation preview

Page 1: Topographic graph clustering with kernel and dissimilarity methods

Topographic graph clustering with kernel anddissimilarity methods

Nathalie Villa-Vialaneixhttp://www.nathalievilla.org

& Fabrice Rossi

University of Toulouse

Seminar on Similarity-based learning on structures, Dagstuhl16-20 February, 2009

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 1 / 16

Page 2: Topographic graph clustering with kernel and dissimilarity methods

A short introduction

A weighted undirected graph G

A (rectangular) grid

with vertices V = {x1, . . . , xn}

with neurons {1, . . . ,M},

and weights W

a distance d on the gridand prototypes (pj)

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 2 / 16

Page 3: Topographic graph clustering with kernel and dissimilarity methods

A short introduction

A weighted undirected graph G A (rectangular) gridwith vertices V = {x1, . . . , xn} with neurons {1, . . . ,M},

and weights W a distance d on the gridand prototypes (pj)

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 2 / 16

Page 4: Topographic graph clustering with kernel and dissimilarity methods

Table of content

1 Dissimilarity and kernel based organizing maps

2 Applications

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 3 / 16

Page 5: Topographic graph clustering with kernel and dissimilarity methods

Dissimilarity and kernel based organizing maps

Table of content

1 Dissimilarity and kernel based organizing maps

2 Applications

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 4 / 16

Page 6: Topographic graph clustering with kernel and dissimilarity methods

Dissimilarity and kernel based organizing maps

SOM, dissimilarity SOM and kernel SOM

Original SOM algorithm (batch): x1, . . . , xn ∈ Rd

1 Initalization: Initialize randomly p01 , ..., p0

M in Rd

2 For l = 1, . . . , L do

3 Assignment: for all i = 1, . . . , n do

f l(xi) = arg minj=1,...,M

‖xi − p l−1j ‖Rd

4 Representation: for all j = 1, . . . ,M,

p lj = arg min

p∈Rd

n∑i=1

h l(f l(xi), j)‖xi − p‖2Rd

Online versions by [Lau et al., 2006]

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 5 / 16

Page 7: Topographic graph clustering with kernel and dissimilarity methods

Dissimilarity and kernel based organizing maps

SOM, dissimilarity SOM and kernel SOMDissimilarity SOM (batch): xi ∈ G defined by a dissimilarity relation:δ(xi , xj)

1 Initalization: Initialize randomly p01 , ..., p0

M in (xi)i

2 For l = 1, . . . , L do

3 Assignment: for all i = 1, . . . , n do

f l(xi) = arg minj=1,...,M

δ(xi , p l−1j )

4 Representation: for all j = 1, . . . ,M,

p lj = arg min

p∈(xi)i

n∑i=1

h l(f l(xi), j)δ(xi , p)

[Kohohen and Somervuo, 1998, Kohonen and Somervuo, 2002]

Online versions by [Lau et al., 2006]

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 5 / 16

Page 8: Topographic graph clustering with kernel and dissimilarity methods

Dissimilarity and kernel based organizing maps

SOM, dissimilarity SOM and kernel SOMRelational Topographic Mapping or Kernel SOM (batch): xi ∈ G

defined by a kernel relation: K(xi , xj)⇒ ∃ φ : G → (H , 〈., .〉H):K(x, x′) = 〈φ(x), φ(x′)〉H

1 Initalization: Initialize randomly p0j =

∑ni=1 γ

0jiφ(xi)

2 For l = 1, . . . , L do

3 Assignment: for all i = 1, . . . , n do

f l(xi) = arg minj=1,...,M

‖φ(xi) − p l−1j ‖H

4 Representation: for all j = 1, . . . ,M,

γlj = arg min

γ∈Rn

n∑i=1

h l(f l(xi), j)‖φ(xi) −n∑

k=1

γkφ(xk )‖2H

[Villa and Rossi, 2007, Hammer and Hasenfuss, 2007]

Online versionsby [Lau et al., 2006]

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 5 / 16

Page 9: Topographic graph clustering with kernel and dissimilarity methods

Dissimilarity and kernel based organizing maps

SOM, dissimilarity SOM and kernel SOMRelational Topographic Mapping or Kernel SOM (batch): xi ∈ G

defined by a kernel relation: K(xi , xj)⇒ ∃ φ : G → (H , 〈., .〉H):K(x, x′) = 〈φ(x), φ(x′)〉H

1 Initalization: Initialize randomly p0j =

∑ni=1 γ

0jiφ(xi)

2 For l = 1, . . . , L do

3 Assignment: for all i = 1, . . . , n do

f l(xi) = arg minj=1,...,M

n∑k ,k ′=1

γl−1jk γl−1

jk ′ K(xk , xk ′) − 2n∑

k=1

γl−1jk K(xi , xk )

4 Representation: for all j = 1, . . . ,M,

γljk =

h(f l(xk ), j)∑nk ′=1 h(f l(xk ′), j)

[Villa and Rossi, 2007, Hammer and Hasenfuss, 2007]

Online versionsby [Lau et al., 2006]

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 5 / 16

Page 10: Topographic graph clustering with kernel and dissimilarity methods

Dissimilarity and kernel based organizing maps

SOM, dissimilarity SOM and kernel SOMRelational Topographic Mapping or Kernel SOM (batch): xi ∈ G

defined by a kernel relation: K(xi , xj)⇒ ∃ φ : G → (H , 〈., .〉H):K(x, x′) = 〈φ(x), φ(x′)〉H

1 Initalization: Initialize randomly p0j =

∑ni=1 γ

0jiφ(xi)

2 For l = 1, . . . , L do

3 Assignment: for all i = 1, . . . , n do

f l(xi) = arg minj=1,...,M

n∑k ,k ′=1

γl−1jk γl−1

jk ′ K(xk , xk ′) − 2n∑

k=1

γl−1jk K(xi , xk )

4 Representation: for all j = 1, . . . ,M,

γljk =

h(f l(xk ), j)∑nk ′=1 h(f l(xk ′), j)

Online versions by [Lau et al., 2006]Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 5 / 16

Page 11: Topographic graph clustering with kernel and dissimilarity methods

Dissimilarity and kernel based organizing maps

Which kernels?Laplacian [Kondor and Lafferty, 2002]

For a graph with vertices V = {x1, . . . , xn} and weights (wi,j)i,j=1,...,n

(positive, symmetric), the Laplacian is: L = (Li,j)i,j=1,...,n where

Li,j =

{−wi,j if i , jdi =

∑j,i wi,j if i = j

;

1 Diffusion matrix [Kondor and Lafferty, 2002]: for β > 0,Kβ = e−βL =

∑+∞k=1

(−βL)k

k ! .⇒

Kβ : V × V → R

(xi , xj) → Kβi,j

heat kernel (or diffusion kernel)

2 Generalized inverse of the Laplacian [Fouss et al., 2007] :K = L+.

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 6 / 16

Page 12: Topographic graph clustering with kernel and dissimilarity methods

Dissimilarity and kernel based organizing maps

Which kernels?Laplacian [Kondor and Lafferty, 2002]

For a graph with vertices V = {x1, . . . , xn} and weights (wi,j)i,j=1,...,n

(positive, symmetric), the Laplacian is: L = (Li,j)i,j=1,...,n where

Li,j =

{−wi,j if i , jdi =

∑j,i wi,j if i = j

;

1 Diffusion matrix [Kondor and Lafferty, 2002]: for β > 0,Kβ = e−βL =

∑+∞k=1

(−βL)k

k ! .⇒

Kβ : V × V → R

(xi , xj) → Kβi,j

heat kernel (or diffusion kernel)

2 Generalized inverse of the Laplacian [Fouss et al., 2007] :K = L+.

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 6 / 16

Page 13: Topographic graph clustering with kernel and dissimilarity methods

Dissimilarity and kernel based organizing maps

Which kernels?Laplacian [Kondor and Lafferty, 2002]

For a graph with vertices V = {x1, . . . , xn} and weights (wi,j)i,j=1,...,n

(positive, symmetric), the Laplacian is: L = (Li,j)i,j=1,...,n where

Li,j =

{−wi,j if i , jdi =

∑j,i wi,j if i = j

;

1 Diffusion matrix [Kondor and Lafferty, 2002]: for β > 0,Kβ = e−βL =

∑+∞k=1

(−βL)k

k ! .⇒

Kβ : V × V → R

(xi , xj) → Kβi,j

heat kernel (or diffusion kernel)2 Generalized inverse of the Laplacian [Fouss et al., 2007] :

K = L+.Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 6 / 16

Page 14: Topographic graph clustering with kernel and dissimilarity methods

Applications

Table of content

1 Dissimilarity and kernel based organizing maps

2 Applications

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 7 / 16

Page 15: Topographic graph clustering with kernel and dissimilarity methods

Applications

Tuning the parameters

On a practical point of view, how tuning• Parameter of the kernel (if there is one);

• Size of the grid;

• Other parameters: kind of neighborhood on the grid, annealingscheme of SOM, ...

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 8 / 16

Page 16: Topographic graph clustering with kernel and dissimilarity methods

Applications

Quality criterion that can be used• Kaski-Lagus quality criterion:

KL =1n

n∑i=1

‖φ(xi) − pLfL (xi)‖+ min

(j0,...,jq)∈Ci

q−1∑k=0

‖pLjk − pL

jk+1‖

where j0 is the best matching unit for xi , jq is the second bestmatching unit for xi and Ci are paths made with direct neighbors onthe prior structure.

, : Organization quality criterion/ : Can not be used to tune the type of kernel or either its parameter.

• Modularity:

Q =1

2m

n∑i,j=1

(wij −

didj

2m

)I{fL (xi)=fL (xj)}

, : True quality measure for graphs (can be used to tune everyparameters)/ : Clustering measure: no idea about the organization quality

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 9 / 16

Page 17: Topographic graph clustering with kernel and dissimilarity methods

Applications

Quality criterion that can be used• Kaski-Lagus quality criterion:

KL =1n

n∑i=1

‖φ(xi) − pLfL (xi)‖+ min

(j0,...,jq)∈Ci

q−1∑k=0

‖pLjk − pL

jk+1‖

where j0 is the best matching unit for xi , jq is the second bestmatching unit for xi and Ci are paths made with direct neighbors onthe prior structure., : Organization quality criterion/ : Can not be used to tune the type of kernel or either its parameter.

• Modularity:

Q =1

2m

n∑i,j=1

(wij −

didj

2m

)I{fL (xi)=fL (xj)}

, : True quality measure for graphs (can be used to tune everyparameters)/ : Clustering measure: no idea about the organization quality

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 9 / 16

Page 18: Topographic graph clustering with kernel and dissimilarity methods

Applications

Quality criterion that can be used• Kaski-Lagus quality criterion:

KL =1n

n∑i=1

‖φ(xi) − pLfL (xi)‖+ min

(j0,...,jq)∈Ci

q−1∑k=0

‖pLjk − pL

jk+1‖

where j0 is the best matching unit for xi , jq is the second bestmatching unit for xi and Ci are paths made with direct neighbors onthe prior structure., : Organization quality criterion/ : Can not be used to tune the type of kernel or either its parameter.

• Modularity:

Q =1

2m

n∑i,j=1

(wij −

didj

2m

)I{fL (xi)=fL (xj)}

, : True quality measure for graphs (can be used to tune everyparameters)/ : Clustering measure: no idea about the organization quality

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 9 / 16

Page 19: Topographic graph clustering with kernel and dissimilarity methods

Applications

Quality criterion that can be used• Kaski-Lagus quality criterion:

KL =1n

n∑i=1

‖φ(xi) − pLfL (xi)‖+ min

(j0,...,jq)∈Ci

q−1∑k=0

‖pLjk − pL

jk+1‖

where j0 is the best matching unit for xi , jq is the second bestmatching unit for xi and Ci are paths made with direct neighbors onthe prior structure., : Organization quality criterion/ : Can not be used to tune the type of kernel or either its parameter.

• Modularity:

Q =1

2m

n∑i,j=1

(wij −

didj

2m

)I{fL (xi)=fL (xj)}

, : True quality measure for graphs (can be used to tune everyparameters)/ : Clustering measure: no idea about the organization quality

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 9 / 16

Page 20: Topographic graph clustering with kernel and dissimilarity methods

Applications

A first example: a medieval social networkExample from [Boulet et al., 2008]In Cahors (Lot, France), stands a big corpus of 5000 agrarian contracts

• coming from 4 seignories (about 25 little villages) of south west ofFrance;

• being established between 1240 and 1520 (just before and after thehundred years’ war).

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 10 / 16

Page 21: Topographic graph clustering with kernel and dissimilarity methods

Applications

Simplification of this network by kernel SOM

Kernel SOM with heatkernel (β chosen with modularity); optimization ofthe size of the grid and of the kind of initialization (random vs PCA) ismade with Kaski-Lagus criterion.

Modularity : 0.597 ; KL = 0.531

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 11 / 16

Page 22: Topographic graph clustering with kernel and dissimilarity methods

Applications

Simplification of this network by kernel SOM

Modularity : 0.597 ; KL = 0.531

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 11 / 16

Page 23: Topographic graph clustering with kernel and dissimilarity methods

Applications

A brief comparison with spectral clustering

Number of clusters: 35 50Maximum size of the clusters: 255 268

Modularity: 0.597 0.420

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 12 / 16

Page 24: Topographic graph clustering with kernel and dissimilarity methods

Applications

A brief comparison with spectral clustering

Number of clusters: 35 29Maximum size of the clusters: 255 325

Modularity: 0.597 0.433

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 12 / 16

Page 25: Topographic graph clustering with kernel and dissimilarity methods

Applications

Organizing map as a basis for a full representation of thegraph [Truong et al., 2007]

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 13 / 16

Page 26: Topographic graph clustering with kernel and dissimilarity methods

Applications

Classification and drawing criteria

The previous tuning method is unsatisfactory: need of a method toevaluate the organization of the graph on the map as well as itsclustering.We propose to combine a graph drawing quality measure and a graphclustering quality measure:

1 the percentage of pairs of edges that cross on the map;

2 the modularity.

Combining these two measures gives Pareto points that can help theuser to make the tradeoff between the quality of the representation of thegraph on the map and the quality of the clustering.

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 14 / 16

Page 27: Topographic graph clustering with kernel and dissimilarity methods

Applications

Classification and drawing criteria

The previous tuning method is unsatisfactory: need of a method toevaluate the organization of the graph on the map as well as itsclustering.We propose to combine a graph drawing quality measure and a graphclustering quality measure:

1 the percentage of pairs of edges that cross on the map;

2 the modularity.

Combining these two measures gives Pareto points that can help theuser to make the tradeoff between the quality of the representation of thegraph on the map and the quality of the clustering.

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 14 / 16

Page 28: Topographic graph clustering with kernel and dissimilarity methods

Applications

A second example: a collaboration network

Example coming from [Newman, 2006].

Weighted connected graphwith 379 vertices.

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 15 / 16

Page 29: Topographic graph clustering with kernel and dissimilarity methods

Applications

A second example: a collaboration network

Example coming from [Newman, 2006].

Weighted connected graphwith 379 vertices.

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 15 / 16

Page 30: Topographic graph clustering with kernel and dissimilarity methods

Applications

A second example: a collaboration network

Example coming from [Newman, 2006].

Weighted connected graph Modularity: 0.827with 379 vertices. % cut edges: 0.094

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 15 / 16

Page 31: Topographic graph clustering with kernel and dissimilarity methods

Applications

A second example: a collaboration network

Example coming from [Newman, 2006].

Weighted connected graph Modularity: 0.816with 379 vertices. % cut edges: 0.005

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 15 / 16

Page 32: Topographic graph clustering with kernel and dissimilarity methods

Applications

A second example: a collaboration network

Example coming from [Newman, 2006].

Weighted connected graph Modularity: 0.771with 379 vertices. % cut edges: 0.044

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 15 / 16

Page 33: Topographic graph clustering with kernel and dissimilarity methods

Applications

Conclusion

• Organizing maps for graphs having several hundred of vertices.

• Based on kernel methods.

• Simplified representation of the graph.

• Tuning the parameters of the algorithm is very important to obtainmeaningfull maps.

• Which kernel? Sometimes the heat kernel seems to work well andsometimes it provides poor solutions.

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 16 / 16

Page 34: Topographic graph clustering with kernel and dissimilarity methods

Applications

Boulet, R., Jouve, B., Rossi, F., and Villa, N. (2008).Batch kernel SOM and related laplacian methods for social network analysis.Neurocomputing, 71(7-9):1257–1273.

Fouss, F., Pirotte, A., Renders, J., and Saerens, M. (2007).Random-walk computation of similarities between nodes of a graph, withapplication to collaborative recommendation.IEEE Transactions on Knowledge and Data Engineering, 19(3):355–369.

Hammer, B. and Hasenfuss, A. (2007).Relational topographic maps.Technical Report IfI-07-01, Clausthal University of Technology.

Kohohen, T. and Somervuo, P. (1998).Self-Organizing maps of symbol strings.Neurocomputing, 21:19–30.

Kohonen, T. and Somervuo, P. (2002).How to make large self-organizing maps for nonvectorial data.Neural Networks, 15(8):945–952.

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 16 / 16

Page 35: Topographic graph clustering with kernel and dissimilarity methods

Applications

Kondor, R. and Lafferty, J. (2002).Diffusion kernels on graphs and other discrete structures.In Proceedings of the 19th International Conference on Machine Learning,pages 315–322.

Lau, K., Yin, H., and Hubbard, S. (2006).Kernel self-organising maps for classification.Neurocomputing, 69:2033–2040.

Newman, M. (2006).Finding community structure in networks using the eigenvectors of matrices.Physical Review, E, 74(036104).

Truong, Q., Dkaki, T., and Charrel, P. (2007).An energy model for the drawing of clustered graphs.In Proceedings of Vème colloque international VSST, Marrakech, Maroc.

Villa, N. and Rossi, F. (2007).A comparison between dissimilarity SOM and kernel SOM for clustering thevertices of a graph.

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 16 / 16

Page 36: Topographic graph clustering with kernel and dissimilarity methods

Applications

In Proceedings of the 6th Workshop on Self-Organizing Maps (WSOM 07),Bielefield, Germany.

Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 16 / 16