Self-Organized Swarms for cluster preserving Projections of high-dimensional Data
Alfred Ultsch, Lutz Herrmann
Databionics Research Group
Herrmann / Ultsch, University of Marburg
Outline of the talk
How to visualize structures of the Rn
What do we know about the structure
preserving of projections?
Problems with naive usage
U-Matrix on ESOM
Swarm Organized Projection (SOP)
How to visualize structures of the Rn
Answer: Projection into Rm with m < 4
Possibilities:
1. Linear Projections
2. Nonlinear Projections preserving
distances
3. Discontinuous Projections preserving
what ??
1. Linear Projections PCA
Principal Component Analysis PCA
PCA is „all wrong“ here
1. Linear Projections ICA
Independent Component Analysis ICA
ICA is „all wrong“ here
2. Nonlinear Projections MDS
Multidimensional Scaling MDS
MDS is „all wrong“ here
See non linear distortion of space!
2. Nonlinear Projections Sammon
Sammon's Mapping SM
SM is „all wrong“ here
See non linear distortion of space!
Curvilinear Component Analysis
advanced offspring of MDS
Michel Verleysen's study group
“unfolds non-linear manifolds“
minimization of topographic error
3. Discontinuous Projection : CCA
Curvilinear Component Analysis CCA [Verleysen et al /Leuven]
advanced offspring of MDS
“unfolds non-linear manifolds“
minimization of “topographic error”
Problem: cooling scheme for neighborhood radius
[Figure: Chainlink data set (axes x, y, z) and its CCA projection]
3. Discontinuous Projection : ESOM
Emergent SOM = ESOM
Visualize Structures using U-matrix
Approved recipes for parameters
ESOM (Emergent SOM)
Self-Organized Feature Maps come in 2
flavours:
1. k-means-SOM
Very few neurons
Neuron = Cluster
Clustering Properties:
identical with k-means
2. Emergent SOM (ESOM)
„cluster“
ESOM (Emergent SOM)
Self-Organized Feature Maps come in 2 flavours:
1. k-means-SOM
2. Emergent SOM (ESOM)
4000+ Neurons on Rectangular Grid
Borderless (toroid)
Properties:
nonlinear topology (cluster) preserving projection
Rn (Data) -> grid of neurons
Nonlinear Interpolation,
rather Projection than Clustering!
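A borderless (toroid) grid means that neurons on opposite edges are neighbours. The following is a minimal sketch of a grid distance with wrap-around; the function name and the 50 x 82 grid size are assumptions chosen only to match the "4000+ neurons" order of magnitude, not values taken from the talk.

```python
def toroid_distance(a, b, rows, cols):
    """Euclidean distance between grid positions a and b on a borderless grid.

    a and b are (row, column) tuples; rows and cols give the grid size.
    """
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # On a toroid the shorter way may lead across the edge of the grid.
    dr = min(dr, rows - dr)
    dc = min(dc, cols - dc)
    return (dr * dr + dc * dc) ** 0.5


ROWS, COLS = 50, 82  # hypothetical grid size
corner_to_corner = toroid_distance((0, 0), (49, 81), ROWS, COLS)
along_a_row = toroid_distance((0, 0), (0, 80), ROWS, COLS)
print(corner_to_corner, along_a_row)
```

Opposite corners end up diagonal neighbours, so no neuron sits at a border where its neighbourhood would be cut off.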
[Figure: neuron grid, edges labelled "82 Neurons" and "50 N."]
3. Discontinuous Projection : ESOM
ESOM
ESOM is „all wrong“ here
3. Discontinuous Projection : ESOM
ESOM
However: Cluster structure is preserved!
Disentangling of rings !
What do we know about the structure
preserving of projections?
Topology Preservation of Rn -> Rm
is NOT POSSIBLE
Only possible if „true data dimension = m“ i.e.
If Data resides on m-dimensional submanifold
Measuring this „intrinsic“ dimension is difficult!
„swiss roll“
What do we know about the structure
preserving of projections?
perfect preservation of topology for
Rn -> Rm, m < n, is per se impossible
So, how do the projection algorithms perform?
Ad 1. Linear Projections: d(p(x),p(y)) = f_lin(D(x,y))
PCA aims at variance preservation
ICA aims at „Non-Gaussianity“
Both aims may not be Cluster preserving!
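A small illustration of why variance preservation need not preserve clusters. The data below are synthetic and purely hypothetical: two elongated clusters that are separated along the direction of small variance. The sketch finds the first principal axis of the 2-D data by power iteration on the covariance matrix and compares the gap between the cluster means along the first and the second axis.

```python
import random


def principal_axis_2d(points, iterations=200):
    """Unit vector of largest variance for 2-D points (power iteration)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix.
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    vx, vy = 1.0, 1.0
    for _ in range(iterations):
        wx, wy = sxx * vx + sxy * vy, sxy * vx + syy * vy
        norm = (wx * wx + wy * wy) ** 0.5
        vx, vy = wx / norm, wy / norm
    return vx, vy


def mean_projection(points, ax, ay):
    """Mean of the points after projection onto the axis (ax, ay)."""
    return sum(x * ax + y * ay for x, y in points) / len(points)


rng = random.Random(0)
xs = [-10 + 20 * k / 199 for k in range(200)]
# Both clusters are long in x and thin in y; they differ only in y.
cluster_a = [(x, -2 + rng.uniform(-0.5, 0.5)) for x in xs]
cluster_b = [(x, 2 + rng.uniform(-0.5, 0.5)) for x in xs]

vx, vy = principal_axis_2d(cluster_a + cluster_b)
gap_pc1 = abs(mean_projection(cluster_a, vx, vy) - mean_projection(cluster_b, vx, vy))
gap_pc2 = abs(mean_projection(cluster_a, -vy, vx) - mean_projection(cluster_b, -vy, vx))
print(f"first principal axis: ({vx:.3f}, {vy:.3f})")
print(f"cluster gap along axis 1: {gap_pc1:.3f}, along axis 2: {gap_pc2:.3f}")
```

The axis of largest variance runs along the clusters, so a projection onto it alone puts both clusters on top of each other, while the discarded second axis carries the cluster structure.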
What do we know about the structure
preserving of projections?
Ad 2. Non Linear Projections
Often Stress-Measure E
E(D(x,y), d(p(x),p(y))) = error(D(x,y) -> d(p(x),p(y)))
Directly minimized (see MDS, Sammon, CCA)
However: may not be Cluster preserving!
Distance structures may be nonlinearly distorted!
Visualization may be misleading! (see above)
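Two common instances of such a stress measure can be sketched as follows: the raw stress (sum of squared differences between input distances D and output distances d) and Sammon's stress, which weights each pair by 1/D. The function names and the four example points are assumptions for illustration; the data points must be pairwise distinct so that no input distance is zero.

```python
from itertools import combinations
from math import dist


def raw_stress(data, proj):
    """Sum over all pairs of (D - d)^2, D in input space, d in output space."""
    pairs = combinations(range(len(data)), 2)
    return sum((dist(data[i], data[j]) - dist(proj[i], proj[j])) ** 2
               for i, j in pairs)


def sammon_stress(data, proj):
    """Sammon's stress: (1 / sum D) * sum (D - d)^2 / D over all pairs."""
    pairs = list(combinations(range(len(data)), 2))
    total = sum(dist(data[i], data[j]) for i, j in pairs)
    weighted = sum((dist(data[i], data[j]) - dist(proj[i], proj[j])) ** 2
                   / dist(data[i], data[j]) for i, j in pairs)
    return weighted / total


# Hypothetical example: four points in R3, projected by dropping z.
data = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
proj = [(x, y) for x, y, _ in data]

stress_raw = raw_stress(data, proj)
stress_sammon = sammon_stress(data, proj)
print(f"raw stress: {stress_raw:.4f}, Sammon stress: {stress_sammon:.4f}")
```

A low stress value only says that distances are reproduced well on average; it does not say whether the pairs that are reproduced badly are the ones that separate clusters.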
Ad 3. discontinuous (E) SOM
Bottom lines of theory on SOM:
SOM is nonlinear, discontinuous and “usually” cluster preserving
For SOM in principle no Energy function possible
Variants of SOM show topology preservation under certain preconditions
Let’s face the Problem:
For projections Rn -> Rm, n > 3, m < 4:
conservation of structures (clusters) is in
general impossible
So what?
Proposal: visualize the “problem areas”
Easy for ESOM : The U-matrix
U-Matrix
ESOM: topology preservation, nonlinear disentangling
[Figure: grid of neurons and weights of neurons; each neuron on the grid with its neighbours N, NE, E, SE, S, SW, W, NW]
U-heights are average local distances!
U-matrix on SOM Shows distance structures of the Rn
High ridges = large distances
Low valleys = points in Rn close to each other
[Figure: neurons on the grid with their neighbours N, NE, E, SE, S, SW, W, NW; U-Matrix of Chainlink]
U-heights are local distances!
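The U-height computation can be sketched as follows, assuming that the U-height of a neuron is the average distance between its weight vector and the weight vectors of its eight grid neighbours (N, NE, E, SE, S, SW, W, NW) on a borderless grid. The function name and the toy weights are hypothetical.

```python
from math import dist


def u_matrix(weights):
    """U-heights for a toroid grid; weights[r][c] is a neuron's weight vector."""
    rows, cols = len(weights), len(weights[0])
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    heights = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr, dc in offsets:
                # The modulo wraps around the edges of the borderless grid.
                neighbour = weights[(r + dr) % rows][(c + dc) % cols]
                total += dist(weights[r][c], neighbour)
            heights[r][c] = total / len(offsets)
    return heights


# Toy grid, 4 rows x 6 columns: the left half of the neurons sits at (0, 0)
# in data space, the right half at (10, 0).
weights = [[(0.0, 0.0)] * 3 + [(10.0, 0.0)] * 3 for _ in range(4)]
heights = u_matrix(weights)
for row in heights:
    print(" ".join(f"{h:5.2f}" for h in row))
```

Columns inside a group get height 0 (valley), columns next to the other group get 3.75 (ridge). Because the grid is borderless, the ridge also appears between the first and the last column.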
Ant-Based Projections/Clustering
stochastic ants perform random walks on a grid
fixed perceptive neighbourhood of size σ² ∈ {9, 25}
one or many ants:
pick input sample when neighbourhood contains dissimilar samples
drop input sample when neighbourhood contains similar samples
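The pick and drop rules can be sketched with probabilities in the style of Lumer and Faieta. This is a sketch under assumptions, not the exact rule set of the talk: the constants k1, k2 and alpha as well as the toy items are hypothetical values chosen for illustration.

```python
def local_similarity(item, neighbours, dissimilarity, sigma_sq, alpha):
    """Similarity of item to the samples in its perceptive neighbourhood.

    sigma_sq is the number of grid cells in the neighbourhood (e.g. 9 or 25),
    alpha scales the dissimilarities (assumed parameter).
    """
    total = sum(1.0 - dissimilarity(item, other) / alpha for other in neighbours)
    return max(0.0, total / sigma_sq)


def pick_probability(f, k1=0.1):
    """High when the neighbourhood is dissimilar (f close to 0)."""
    return (k1 / (k1 + f)) ** 2


def drop_probability(f, k2=0.15):
    """High when the neighbourhood is similar (f large)."""
    return 2.0 * f if f < k2 else 1.0


def absolute_difference(a, b):
    return abs(a - b)


# Hypothetical 1-D samples: one similar and one dissimilar neighbourhood.
f_similar = local_similarity(0.0, [0.1, 0.05, 0.0, 0.1],
                             absolute_difference, sigma_sq=9, alpha=1.0)
f_dissimilar = local_similarity(0.0, [5.0, 4.0],
                                absolute_difference, sigma_sq=9, alpha=1.0)

print(pick_probability(f_similar), drop_probability(f_similar))
print(pick_probability(f_dissimilar), drop_probability(f_dissimilar))
```

Note that the similarity is divided by the fixed neighbourhood size, so an empty or sparsely filled neighbourhood lowers it in the same way as dissimilar samples do.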
Important Ant Models
Deneubourg (1990): modelling emergent phenomena of ants clustering corpses
Lumer, Faieta (1994): clustering pairwise dissimilarity data, e.g. four gaussians
Ramos (2003): ACLUSTER method includes pheromones, analyzes web traffic data
Handl, Knowles (2005): ATTA method, first solid empirical evaluations
Tan et al. (2006): empirical evaluation: number of ants is irrelevant
Problems with Ant-Based Methods
too many, too small clusters appear
topologically distorted clusters
results are highly sensitive to parameters, e.g. grid size
no proof of topological ordering
[Figure: data set with 4 Gaussian clusters, and the same data processed with the Lumer/Faieta algorithm]
Strengths & Weaknesses

                                  Ant-Based Clustering   (Emergent) SOM
Quality of topographic mapping    poor                   good
continuous learning               yes                    no
data                              metric spaces          normed vector spaces

Improve the quality of topographic mappings to obtain a method that is superior to Self-Organizing Maps!
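Bestmatch Objectives

Batch-SOM:
\[ m(x) = \arg\min_i \left\| x - \frac{\sum_{y \in X} h(i, m(y))\, y}{\sum_{y \in X} h(i, m(y))} \right\| \]

Dissimilarity-SOM:
\[ m(x) = \arg\min_i d\!\left( x,\; \arg\min_y \sum_z h(i, m(z))\, d(z, y) \right) \]

Ants:
\[ m(x) = \arg\max_i \frac{N(i)}{\sigma^2} \left( 1 - \frac{\sum_{y \in X} h(i, m(y))\, d(x, y)}{\sum_{y \in X} h(i, m(y))} \right) \]

Distortion: upper limit of the SOM objective, resembles Dissimilarity-SOM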
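The statement that the distortion is an upper limit of the SOM objective can be checked numerically under one reading of the slide: the SOM objective is taken as the distance from x to the neighbourhood-weighted mean of the data, and the distortion as the neighbourhood-weighted mean of the distances from x to the data. By the triangle inequality the first can never exceed the second. The names below are hypothetical; the weights stand in for the neighbourhood values h.

```python
from math import dist


def som_objective(x, points, weights):
    """Distance from x to the weighted mean of the points."""
    total = sum(weights)
    dims = range(len(x))
    centre = [sum(w * p[k] for w, p in zip(weights, points)) / total for k in dims]
    return dist(x, centre)


def distortion(x, points, weights):
    """Weighted mean of the distances from x to the points."""
    total = sum(weights)
    return sum(w * dist(x, p) for w, p in zip(weights, points)) / total


x = (0.0, 0.0)
points = [(1.0, 0.0), (-1.0, 0.0), (0.0, 2.0)]
weights = [1.0, 1.0, 2.0]  # stand-ins for neighbourhood values h

objective_value = som_objective(x, points, weights)
distortion_value = distortion(x, points, weights)
print(objective_value, distortion_value)
```

The distortion only needs pairwise distances, which is why it is usable in metric spaces where no weighted mean of the data exists.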
Ants versus SOM
Ants maximize the product :
“output density × topography
preservation”
This distorts the formation of correct
topographic maps.
From Ants to Swarms
Ants
use small, but fixed neighbourhoods
account for output space density
Swarms
use large, shrinking neighbourhoods
do not account for output space density
Lumer/Faieta Projection of Chainlink
too many & too small clusters emerge
[Figure: Chainlink data set (3-D plot) and its Ants projection (grid axes 0 to 60)]
Summary up to here (ESANN 08)
Ants and swarms construct much worse
mappings than ESOM.
Reason: objective function includes
optimization of output density
Swarm Model improvement
Make ant systems more ESOM-like:
1. use perceptive neighbourhoods like in ESOM
(starting large then shrinking…)
2. omit output space densities from the objective function; their preservation is of doubtful value
it distorts the topographic term:
output space densities are easy to optimize and
this will dominate the objective function
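Both changes can be sketched in a few lines. The linear shrinking schedule, the function names and the toy neighbourhoods are assumptions for illustration; the talk does not specify the schedule. The ant score multiplies an output density term by a topographic term, the swarm score keeps only the topographic term.

```python
from statistics import mean


def neighbourhood_radius(step, total_steps, start_radius, end_radius=1.0):
    """Radius that starts large and shrinks linearly (assumed schedule)."""
    if total_steps <= 1:
        return end_radius
    fraction = step / (total_steps - 1)
    return start_radius + (end_radius - start_radius) * fraction


def ant_score(dissimilarities, sigma_sq, alpha=1.0):
    """Output density x topographic term for one candidate grid position."""
    if not dissimilarities:
        return 0.0
    density = len(dissimilarities) / sigma_sq
    topographic = 1.0 - mean(dissimilarities) / alpha
    return density * topographic


def swarm_score(dissimilarities, alpha=1.0):
    """Topographic term only; output density is left out."""
    if not dissimilarities:
        return 0.0
    return 1.0 - mean(dissimilarities) / alpha


# Dissimilarities between one sample and the samples already placed near two
# candidate positions (hypothetical values).
crowded = [0.5] * 8   # many, moderately dissimilar neighbours
sparse = [0.1] * 2    # few, very similar neighbours

print(ant_score(crowded, sigma_sq=9), ant_score(sparse, sigma_sq=9))
print(swarm_score(crowded), swarm_score(sparse))
print(neighbourhood_radius(0, 100, 30.0), neighbourhood_radius(99, 100, 30.0))
```

With the density factor the crowded position wins although its neighbours fit worse; without it the position with the most similar neighbours wins.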
Swarm Clustering of Chainlink
Cluster structure is preserved.
[Figure: Chainlink data set (3-D plot) and its projections, panels Ants and Swarm (grid axes 0 to 60)]
Trials on Fundamental Clustering
Problems Suite (FCPS)
FCPS: a collection of simple data sets
Every decent clustering /
topo-mapping algorithm
should be able to handle
these little problems
the ESOM does
But many don’t
FCPS download
google: FCPS , Ultsch
Experimental Results:
Ants versus Swarm Model on FCPS
[Figure: bar chart (axis 0 to 25) for the data sets atom, chainlink, hepta, iris, target, 2diamonds, wingnut]
Measuring Topography Preservation with “Minimal Pathlength” [Goodhill 95]
percent improvements typically > +10%
On 100 runs: significantly reduced error values
according to the Kolmogorov-Smirnov test at the α = 1% level
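A two-sample Kolmogorov-Smirnov comparison of error values from repeated runs can be sketched as follows. The error values are made up for illustration and are not the results of the talk; the critical value uses the usual large-sample approximation.

```python
from bisect import bisect_right
from math import log, sqrt


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest = 0.0
    for value in a + b:
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        largest = max(largest, abs(cdf_a - cdf_b))
    return largest


def ks_critical_value(n, m, alpha=0.01):
    """Large-sample approximation of the rejection threshold at level alpha."""
    c = sqrt(-log(alpha / 2) / 2)
    return c * sqrt((n + m) / (n * m))


# Hypothetical error values of 100 runs per method (not measured data).
ants_errors = [100 + k for k in range(100)]
swarm_errors = [50 + k for k in range(100)]

statistic = ks_statistic(ants_errors, swarm_errors)
threshold = ks_critical_value(len(ants_errors), len(swarm_errors), alpha=0.01)
print(f"D = {statistic:.3f}, threshold = {threshold:.3f}, "
      f"significant: {statistic > threshold}")
```

The test compares whole distributions of error values, so it does not assume that the errors of the runs are normally distributed.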
Real World Application of Swarm
Model
Bioinformatics Data
containing protein data
GPD194 [Popescu 06].
194 proteins / 3 Classes
pairwise dissimilarities.
well defined cluster
structure
see Silhouettes:
Other Projections of GPD194
MDS
Other Projections of GPD194
CCA
SOP Projection
Conclusions
“Naive” usage of your favorite projection:
May show clusters that are not in the data
May miss clusters that are in the data
Cluster preserving projection Rn -> R2 is in principle impossible for
nontrivial data
ESOM/U-Matrix superior to most projections
Swarm Organized Projection (SOP) still better