
Self-Organized Swarms for cluster preserving Projections of high-dimensional Data

Alfred Ultsch, Lutz Herrmann

Databionics Research Group


Herrmann/Ultsch, University of Marburg

Outline of the talk

How to visualize structures of R^n

What do we know about the structure preservation of projections?

Problems with naive usage

U-Matrix on ESOM

Swarm Organized Projection (SOP)


How to visualize structures of R^n

Answer: Projection into R^m with m < 4

Possibilities:

1. Linear Projections

2. Nonlinear Projections preserving distances

3. Discontinuous Projections preserving what?


1. Linear Projections PCA

Principal Component Analysis PCA

PCA is „all wrong“ here


1. Linear Projections ICA

Independent Component Analysis ICA

ICA is „all wrong“ here


2. Nonlinear Projections MDS

Multidimensional Scaling MDS

MDS is „all wrong“ here

Note the nonlinear distortion of space!


2. Nonlinear Projections Sammon

Sammon's Mapping (SM)

SM is „all wrong“ here

Note the nonlinear distortion of space!


Curvilinear Component Analysis

advanced offspring of MDS

Michel Verleysen's research group

“unfolds non-linear manifolds”

minimization of topographic error


3. Discontinuous Projection: CCA

Curvilinear Component Analysis CCA [Verleysen et al. / Leuven]

advanced offspring of MDS

“unfolds non-linear manifolds”

minimization of “topographic error”

Problem: cooling scheme for neighborhood radius

[Figure: the Chainlink data set (axes x, y, z) and its CCA projection]


3. Discontinuous Projection: ESOM

Emergent SOM = ESOM

Visualize structures using the U-matrix

Approved recipes for parameters


ESOM (Emergent SOM)

Self-Organizing Feature Maps come in 2 flavours:

1. k-means-SOM

Very few neurons

Neuron = Cluster

Clustering properties: identical with k-means

2. Emergent SOM (ESOM)

„cluster“


ESOM (Emergent SOM)

Self-Organizing Feature Maps come in 2 flavours:

1. k-means-SOM

2. Emergent SOM (ESOM)

4000+ Neurons on Rectangular Grid

Borderless (toroid)

Properties:

nonlinear topology (cluster) preserving projection

R^n (data) -> grid of neurons

Nonlinear interpolation: rather projection than clustering!

[Figure: ESOM grid of 82 × 50 neurons]


3. Discontinuous Projection: ESOM

ESOM

ESOM is „all wrong“ here


3. Discontinuous Projection: ESOM

ESOM

However: Cluster structure is preserved!

Disentangling of rings!


What do we know about the structure preservation of projections?

Topology preservation of R^n -> R^m is in general NOT POSSIBLE

Only possible if „true data dimension = m“, i.e. if the data resides on an m-dimensional submanifold

Measuring this „intrinsic“ dimension is difficult!

„swiss roll“


What do we know about the structure preservation of projections?

Perfect preservation of topology for R^n -> R^m, m < n, is per se impossible

So, how do the projection algorithms perform?

Ad 1. Linear projections: d(p(x),p(y)) = f_lin(D(x,y))

PCA aims at variance preservation

ICA aims at „non-Gaussianity“

Both aims may not be cluster preserving!
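A small illustration of why variance preservation need not preserve clusters. This is only a sketch on invented toy data (two elongated 2-D clusters, wide along x and separated only along y); the function names are hypothetical and the closed-form 2-D eigenvector is used instead of a PCA library.

```python
import math
import random

def first_principal_axis(points):
    """Return (unit vector of largest variance, mean) for 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Angle of the leading eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta)), (mx, my)

def project(points, axis, mean):
    """Coordinates of the centred points along the given axis."""
    return [(p[0] - mean[0]) * axis[0] + (p[1] - mean[1]) * axis[1] for p in points]

rng = random.Random(0)
# Toy data (assumption): large variance along x, cluster separation along y.
cluster_a = [(rng.gauss(0.0, 10.0), rng.gauss(+1.0, 0.1)) for _ in range(200)]
cluster_b = [(rng.gauss(0.0, 10.0), rng.gauss(-1.0, 0.1)) for _ in range(200)]

axis, mean = first_principal_axis(cluster_a + cluster_b)
proj_a = project(cluster_a, axis, mean)
proj_b = project(cluster_b, axis, mean)

# Distance between the cluster means on the first principal component,
# compared with the RMS extent of one cluster along that component.
gap = abs(sum(proj_a) / len(proj_a) - sum(proj_b) / len(proj_b))
spread = math.sqrt(sum(v * v for v in proj_a) / len(proj_a))
print(f"first axis = ({axis[0]:.3f}, {axis[1]:.3f}), gap = {gap:.2f}, spread = {spread:.2f}")
```

The first principal axis follows the direction of largest variance (x), so on this axis the two clusters overlap almost completely although they are cleanly separated in the original plane.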


What do we know about the structure preservation of projections?

Ad 2. Nonlinear projections

Often a stress measure E

E(D(x,y), d(p(x),p(y))) = error(D(x,y) -> d(p(x),p(y)))

is directly minimized (see MDS, Sammon, CCA)

However: may not be cluster preserving!

Distance structures may be nonlinearly distorted!

Visualization may be misleading! (see above)
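Two common concrete choices for such a stress measure, sketched here for illustration: a raw sum of squared distance errors and Sammon's normalized stress. The function names and the four toy points are assumptions made for this example; the slides do not fix a particular formula.

```python
import math

def pairwise(points):
    """Full matrix of Euclidean distances between the given points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def raw_stress(D, d):
    """Sum over all pairs of the squared difference between input and output distances."""
    n = len(D)
    return sum((D[i][j] - d[i][j]) ** 2 for i in range(n) for j in range(i + 1, n))

def sammon_stress(D, d):
    """Sammon's stress: squared errors weighted by 1/D, normalized by the sum of D.

    Assumes all input distances D[i][j] (i != j) are strictly positive.
    """
    n = len(D)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    total = sum(D[i][j] for i, j in pairs)
    return sum((D[i][j] - d[i][j]) ** 2 / D[i][j] for i, j in pairs) / total

# Toy example: four points in R^3, "projected" by dropping the z coordinate.
data = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
projected = [(x, y) for x, y, _ in data]

D = pairwise(data)
d = pairwise(projected)
print("raw stress:", raw_stress(D, d))
print("Sammon stress:", sammon_stress(D, d))
```

A stress of zero means all pairwise distances are reproduced; a small but nonzero stress says nothing about which distances were distorted, which is why a low stress value alone does not guarantee that clusters survive.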


Ad 3. Discontinuous: (E)SOM

Bottom lines of theory on SOM:

SOM is nonlinear, discontinuous and “usually” cluster preserving

For SOM, in principle no energy function is possible

Variants of SOM show topology preservation under certain preconditions


Let’s face the Problem:

For projections R^n -> R^m, n > 3, m < 4

Conservation of structures (clusters) is in general impossible

So what?

Proposal: visualize the “problem areas”

Easy for ESOM: the U-matrix


U-Matrix

ESOM: topology preservation, nonlinear disentangling

[Figure: neurons on the grid with their neighbours N, NE, E, SE, S, SW, W, NW; grid of neurons and weights of neurons]

U-heights are average local distances!


U-matrix on SOM

Shows distance structures of R^n

High ridges = large distances

Low valleys = points in R^n close to each other

[Figure: U-Matrix of Chainlink; neurons on the grid with neighbours N, NE, E, SE, S, SW, W, NW]

U-heights are local distances!
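A minimal sketch of how U-heights can be computed, assuming each neuron's height is the mean input-space distance between its weight vector and those of its 8 grid neighbours, on a borderless (toroid) grid. The function name and the toy grid are assumptions for illustration, not the authors' implementation.

```python
import math

def u_matrix(weights):
    """U-height of each neuron: mean distance of its weight vector to its 8 grid neighbours.

    weights[r][c] is the weight vector of the neuron at grid position (r, c).
    Indices wrap around, so the grid has no border (toroid).
    """
    rows, cols = len(weights), len(weights[0])
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    heights = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = [
                math.dist(weights[r][c], weights[(r + dr) % rows][(c + dc) % cols])
                for dr, dc in offsets
            ]
            heights[r][c] = sum(dists) / len(dists)
    return heights

# Toy 4 x 6 grid: columns 0-2 carry weights of one cluster, columns 3-5 of another.
grid = [[(0.0, 0.0) if c < 3 else (5.0, 5.0) for c in range(6)] for _ in range(4)]
heights = u_matrix(grid)

for row in heights:
    print(" ".join(f"{h:5.2f}" for h in row))
```

Neurons inside a cluster region get height 0 (valleys), while neurons next to the other cluster get a positive height (ridges). Because the grid is a toroid, a ridge also appears between the last and the first column.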


Ant-Based Projections/Clustering

stochastic ants perform random walks on a grid

fixed perceptive neighbourhood of size σ² ∈ {9, 25}

one or many ants:

pick up an input sample when the neighbourhood contains dissimilar samples

drop an input sample when the neighbourhood contains similar samples

[Figure: perceptive neighbourhood of side length σ on the grid]
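The pick/drop behaviour can be sketched with the probability rules commonly cited for the Lumer/Faieta model. The formulas and the constants `k1`, `k2`, `alpha` below are taken from that general literature as an assumption; the slides themselves do not state them, and the sample values are invented.

```python
def local_similarity(sample, neighbours, dissimilarity, sigma_sq=9, alpha=0.5):
    """Average similarity of `sample` to the samples in its perceptive neighbourhood.

    sigma_sq is the neighbourhood size (number of grid cells), alpha scales dissimilarities.
    Negative sums are clipped to 0.
    """
    total = sum(1.0 - dissimilarity(sample, other) / alpha for other in neighbours)
    return max(0.0, total / sigma_sq)

def pick_probability(f, k1=0.1):
    """High when the local similarity f is low (sample does not fit here)."""
    return (k1 / (k1 + f)) ** 2

def drop_probability(f, k2=0.15):
    """High when the local similarity f is high (sample fits here)."""
    return 2.0 * f if f < k2 else 1.0

def abs_diff(a, b):
    """Toy dissimilarity for 1-D samples."""
    return abs(a - b)

similar_neighbours = [0.01, 0.02, 0.0, 0.03]
dissimilar_neighbours = [0.9, 1.0, 0.8]

f_similar = local_similarity(0.0, similar_neighbours, abs_diff)
f_dissimilar = local_similarity(0.0, dissimilar_neighbours, abs_diff)

print("similar:    pick", pick_probability(f_similar), "drop", drop_probability(f_similar))
print("dissimilar: pick", pick_probability(f_dissimilar), "drop", drop_probability(f_dissimilar))
```

Note that the similarity is divided by the fixed neighbourhood size rather than by the number of neighbours actually present, so crowded neighbourhoods score higher than sparse ones.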


Important Ant Models

Deneubourg (1990): modelling emergent phenomena of ants clustering corpses

Lumer, Faieta (1994): clustering pairwise dissimilarity data, e.g. four gaussians

Ramos (2003): ACLUSTER method includes pheromones, analyzes web traffic data

Handl, Knowles (2005): ATTA method, first solid empirical evaluations

Tan et al. (2006): empirical evaluation: number of ants is irrelevant


Problems with Ant-Based Methods

too many, too small clusters appear

topologically distorted clusters

results are highly sensitive to parameters, e.g. grid size

no proof of topological ordering

[Figure: data set with 4 Gaussian clusters, and the same data processed with the Lumer/Faieta algorithm]


Strengths & Weaknesses

                                  Ant-Based Clustering    (Emergent) SOM
Quality of topographic mapping    poor                    good
continuous learning               yes                     no
data                              metric spaces           normed vector spaces

Improve the quality of topographic mappings to obtain a method that is superior to Self-Organizing Maps!


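Bestmatch Objectives

Batch-SOM:

$$ m(x) = \arg\min_i \left\| x - \frac{\sum_{y \in X} h(i, m(y))\, y}{\sum_{y \in X} h(i, m(y))} \right\| $$

Dissimilarity-SOM:

$$ m(x) = \arg\min_i \left\| x - \arg\min_y \frac{\sum_z h(m(z), i)\, \| y - z \|}{\sum_z h(m(z), i)} \right\| $$

Ants:

$$ m(x) = \arg\max_i \frac{N(i)}{\sigma^2} \left( 1 - \frac{\sum_{y \in X} h(i, m(y))\, \| x - y \|}{\sum_{y \in X} h(i, m(y))} \right) $$

Distortion (the fraction in the ants' objective): upper limit of the SOM objective, resembles Dissimilarity-SOM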
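The "upper limit" remark can be checked numerically. The sketch below assumes (as an interpretation, not something the slide spells out) that the SOM objective for a grid position is the distance from x to the neighbourhood-weighted mean of the data, and that the distortion is the neighbourhood-weighted mean of the distances to x. By convexity of the norm the second quantity can never be smaller than the first. All names and the random data are hypothetical.

```python
import math
import random

def som_objective(x, ys, hs):
    """Distance from x to the weighted mean of ys (weights hs)."""
    total = sum(hs)
    mean = [sum(h * y[k] for h, y in zip(hs, ys)) / total for k in range(len(x))]
    return math.dist(x, mean)

def distortion(x, ys, hs):
    """Weighted mean of the distances from x to each y (weights hs)."""
    return sum(h * math.dist(x, y) for h, y in zip(hs, ys)) / sum(hs)

rng = random.Random(1)
gaps = []
for _ in range(1000):
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    ys = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(10)]
    hs = [rng.uniform(0.01, 1.0) for _ in range(10)]  # positive neighbourhood weights
    gaps.append(distortion(x, ys, hs) - som_objective(x, ys, hs))

# The smallest gap over all random trials; non-negative up to rounding error.
print("smallest gap:", min(gaps))
```

Only pairwise distances enter the distortion, which is why it also applies to dissimilarity data where no mean vector can be formed.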


Ants versus SOM

Ants maximize the product: “output density × topography preservation”

This distorts the formation of correct topographic maps.


From Ants to Swarms

Ants

use small, but fixed neighbourhoods

account for output space density

Swarms

use large, shrinking neighbourhoods

do not account for output space density


Lumer/Faieta Projection of Chainlink

too many & too small clusters emerge

[Figure: the Chainlink data set in 3D and its ant-based projection onto the grid (panel: Ants)]


Summary up to here (ESANN 08)

Ants and swarms construct much worse mappings than ESOM.

Reason: the objective function includes optimization of output density


Swarm Model improvement

Make ant systems more ESOM-like:

1. use perceptive neighbourhoods like in ESOM (starting large, then shrinking…)

2. omit output space densities from the objective function; their preservation is of doubtful value

The density term distorts the topographic term: output space densities are easy to optimize, so they will dominate the objective function
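The two changes can be sketched as follows. This is an illustrative reading, not the authors' implementation: the linear cooling schedule, the function names, and the toy neighbourhoods are assumptions, and dissimilarities are assumed to be scaled to [0, 1].

```python
def shrinking_radius(step, total_steps, start=20.0, end=1.0):
    """Neighbourhood radius cooled linearly from `start` to `end` (assumed schedule)."""
    frac = step / max(1, total_steps - 1)
    return start + (end - start) * frac

def mean_dissimilarity(sample, neighbours, dissimilarity):
    """Average dissimilarity between `sample` and its neighbours."""
    return sum(dissimilarity(sample, o) for o in neighbours) / len(neighbours)

def ant_fitness(sample, neighbours, dissimilarity, sigma_sq=9):
    """Ant-style score: output density (count / sigma^2) times the topographic term."""
    if not neighbours:
        return 0.0
    density = len(neighbours) / sigma_sq
    return density * (1.0 - mean_dissimilarity(sample, neighbours, dissimilarity))

def swarm_fitness(sample, neighbours, dissimilarity):
    """Swarm-style score: the topographic term only, density factor omitted."""
    if not neighbours:
        return 0.0
    return 1.0 - mean_dissimilarity(sample, neighbours, dissimilarity)

def abs_diff(a, b):
    """Toy dissimilarity for 1-D samples."""
    return abs(a - b)

crowded = [0.5] * 8   # many neighbours, only moderately similar to sample 0.0
sparse = [0.1] * 2    # few neighbours, very similar to sample 0.0

print("ant:  ", ant_fitness(0.0, crowded, abs_diff), ant_fitness(0.0, sparse, abs_diff))
print("swarm:", swarm_fitness(0.0, crowded, abs_diff), swarm_fitness(0.0, sparse, abs_diff))
```

With the density factor, the crowded but poorly matching position scores higher; without it, the well matching position wins, which is the behaviour wanted for a topographic mapping.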


Swarm Clustering of Chainlink

Cluster structure is preserved.

[Figure: the Chainlink data set in 3D and its projections onto the grid (panels: Ants, Swarm)]


Trials on the Fundamental Clustering Problems Suite (FCPS)

FCPS: a collection of simple data sets

Every decent clustering / topographic mapping algorithm should be able to handle these little problems

The ESOM does

But many don’t


FCPS download

google: FCPS, Ultsch


Experimental Results: Ants versus Swarm Model on FCPS

Measuring topography preservation with “Minimal Pathlength” [Goodhill 95]

[Figure: bar chart of percent improvements (axis 0 to 25) for the data sets atom, chainlink, hepta, iris, target, 2diamonds, wingnut]

Percent improvements: typically > +10%

On 100 runs: significantly reduced error values according to the Kolmogorov-Smirnov test at the α = 1% level
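A two-sample Kolmogorov-Smirnov comparison of this kind can be sketched as below. The error values are synthetic numbers generated for the example (they are not results from the talk), the function names are hypothetical, and the large-sample critical value is used instead of an exact p-value.

```python
import math
import random

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Advance both samples past all values <= v, then compare the CDFs at v.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_critical_value(n, m, alpha=0.01):
    """Large-sample critical value of the two-sample KS statistic at level alpha."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))

rng = random.Random(42)
# Synthetic error values for 100 runs of each method (illustration only).
errors_ants = [rng.gauss(1.00, 0.10) for _ in range(100)]
errors_swarm = [rng.gauss(0.85, 0.10) for _ in range(100)]

d_stat = ks_statistic(errors_ants, errors_swarm)
d_crit = ks_critical_value(len(errors_ants), len(errors_swarm), alpha=0.01)
print(f"D = {d_stat:.3f}, critical value = {d_crit:.3f}, significant: {d_stat > d_crit}")
```

The test compares whole distributions of error values rather than only their means, so it makes no normality assumption about the runs.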


Real World Application of Swarm Model

Bioinformatics data containing protein data: GPD194 [Popescu 06]

194 proteins / 3 classes

pairwise dissimilarities

well defined cluster structure, see Silhouettes:


Other Projections of GPD194

MDS


Other Projections of GPD194

CCA


SOP Projection


Conclusions

“Naive” usage of your favorite projection:

May show clusters that are not in the data

May miss clusters that are in the data

Cluster preserving projection R^n -> R^2 for nontrivial data is in principle impossible

ESOM/U-Matrix is superior to most projections

Swarm Organized Projection (SOP) is still better