Adaptive Cluster Ensemble Selection
Javad Azimi, Xiaoli Fern
{azimi, xfern}@eecs.oregonstate.edu
Oregon State University
Presenter: Javad Azimi.
Cluster Ensembles
• Setting up different clustering methods.
• Generating different results.
• Combining them to obtain the final result.

[Diagram: Data Set → Clustering 1 … Clustering n → Result 1 … Result n → Consensus Function → Final Clusters]
Cluster Ensembles: Challenge

• One can easily generate hundreds or thousands of clustering results.
• Is it good to always include all clustering results in the ensemble?
• We may want to be selective.
  – Which subset is the best?
What makes a good ensemble?

• Diversity
  – Members should be different from each other.
  – Measured by Normalized Mutual Information (NMI).
• Select a subset of ensemble members based on diversity:
  – Hadjitodorov et al. 2005: an ensemble with median diversity usually works better.
  – Fern and Lin 2008: cluster the ensemble members into distinct groups, then choose one member from each group.
Diversity in Cluster Ensembles: Drawback

• Existing methods design selection heuristics without considering the characteristics of the data sets and ensembles.
• Our goal: select adaptively, based on the behavior of the data set and the ensemble itself.
Our Approach
• We empirically examined the behavior of the ensembles and the clustering performance on 4 different data sets.
  – Use the four training sets to learn an adaptive strategy.
• We evaluated the learned strategy on separate test data sets.
• 4 training data sets: Iris, Soybean, Wine, Thyroid.
An Empirical Investigation
1. Generate a large ensemble.
   – 100 independent runs of two different algorithms (K-means and MSF).
2. Analyze the diversity of the generated ensemble.
   – Generate a final result P* based on all ensemble members.
   – Compute the NMI between each ensemble member and P*.
   – Examine the distribution of the diversity.
3. Consider different potential subsets selected based on diversity and evaluate their clustering performance.
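Step 2 hinges on the NMI between each ensemble member and the consensus partition P*. A minimal pure-Python sketch of that computation (the function name is illustrative, and the square-root normalization is an assumption — the slides do not specify which NMI normalizer is used):

```python
import math
from collections import Counter

def nmi(a, b):
    """Normalized mutual information between two flat cluster labelings."""
    n = len(a)
    pa, pb = Counter(a), Counter(b)
    pab = Counter(zip(a, b))
    mi = 0.0
    for (x, y), c in pab.items():
        # p(x,y) * log( p(x,y) / (p(x) * p(y)) )
        mi += (c / n) * math.log(c * n / (pa[x] * pb[y]))
    ha = -sum((c / n) * math.log(c / n) for c in pa.values())
    hb = -sum((c / n) * math.log(c / n) for c in pb.values())
    if ha == 0.0 or hb == 0.0:
        return 1.0 if ha == hb else 0.0
    return mi / math.sqrt(ha * hb)
```

NMI is invariant to relabeling, so two partitions that group the points identically score 1 even when their cluster IDs differ.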
Observation #1
• There are two distinct types of ensembles:
  – Stable: most ensemble members are similar to P*.
  – Unstable: most ensemble members are different from P*.

Name      Average NMI   # of members with NMI > 0.5   Class
Iris      0.693         197                           S
Soybean   0.676         179                           S
Wine      0.471         85                            NS
Thyroid   0.437         61                            NS

[Histogram: distribution of NMI with P* (x-axis: NMI with P*, y-axis: # of ensemble members); stable vs. unstable examples]
Consider Different Subsets
• Compute the NMI between each member and P*
• Sort NMI values
• Consider 4 different subsets
[Diagram: members sorted by NMI with P*, from lowest NMI to highest; segments labeled High diversity (H), Medium diversity (M), and Low diversity (L); the Full ensemble (F) spans all members]
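The four subsets above can be sketched as index selections over the members sorted by their NMI with P*. A small illustration (the function name is hypothetical, and the subset size — `frac` of the ensemble — is an assumption the slides do not fix):

```python
def diversity_subsets(scores, frac=0.5):
    """Index members into the F/L/H/M subsets by their NMI with P*.

    scores[i] is member i's NMI with the consensus P*. The subset size
    (frac of the ensemble) is an assumed parameter, not from the slides.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # ascending NMI
    m = max(1, int(n * frac))
    start = (n - m) // 2
    return {
        "F": list(range(n)),           # full ensemble
        "H": order[:m],                # high diversity: lowest NMI with P*
        "M": order[start:start + m],   # medium diversity: middle band
        "L": order[-m:],               # low diversity: highest NMI with P*
    }
```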
Observation #2
• Different subsets work the best for stable and unstable data:
  – Stable: subsets F and L worked well.
  – Unstable: subset H worked well.

Name      F       L       H       M       Category
Iris      0.744   0.744   0.640   0.725   S
Soybean   1       1       0.557   0.709   S
Thyroid   0.257   0.223   0.656   0.325   NS
Wine      0.474   0.376   0.680   0.494   NS
Our final strategy
• Generate a large ensemble Π (200 solutions).
• Obtain the consensus partition P*.
• Compute the NMI between each ensemble member and P*, and sort the members in decreasing order.
• If the average NMI > 0.5, classify the ensemble as stable and output P* as the final partition.
• Otherwise, classify the ensemble as unstable, select the H (high-diversity) subset, and output its consensus clustering.
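The strategy above can be sketched as a short function. `nmi` and `consensus` are assumed helpers supplied by the caller, and the H-subset size (`frac`) is an assumption not fixed by the slides; the 0.5 stability threshold is the one stated above:

```python
def select_subset(members, nmi, consensus, threshold=0.5, frac=0.5):
    """Adaptive selection: output P* for a stable ensemble, otherwise
    the consensus of the high-diversity (H) subset.

    nmi(a, b) and consensus(members) are caller-supplied helpers;
    frac (H-subset size) is an assumed parameter.
    """
    p_star = consensus(members)                  # consensus of all members
    scores = [nmi(m, p_star) for m in members]   # similarity to P*
    if sum(scores) / len(scores) > threshold:    # stable ensemble
        return "stable", p_star
    order = sorted(range(len(members)), key=lambda i: scores[i])
    h = [members[i] for i in order[: max(1, int(len(members) * frac))]]
    return "unstable", consensus(h)              # consensus of H subset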
Experimental Setup

• 100 independent runs of k-means and MSF are used to generate the ensemble members.
• Consensus function: average-link HAC on the co-association matrix.
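The consensus function — average-link hierarchical agglomerative clustering on the co-association matrix — can be sketched with numpy and scipy (the function name is illustrative; this assumes scipy is available):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def coassociation_consensus(labelings, k):
    """Consensus by average-link HAC on the co-association matrix.

    labelings: list of integer label sequences, one per ensemble member.
    k: number of final clusters. (Function name is illustrative.)
    """
    labelings = [np.asarray(l) for l in labelings]
    n = labelings[0].size
    co = np.zeros((n, n))
    for l in labelings:                   # co[i, j] = fraction of members
        co += (l[:, None] == l[None, :])  # that put points i and j together
    co /= len(labelings)
    dist = 1.0 - co                       # co-association -> dissimilarity
    np.fill_diagonal(dist, 0.0)
    link = linkage(squareform(dist, checks=False), method="average")
    return fcluster(link, t=k, criterion="maxclust")
```

Pairs that frequently co-cluster across members get a small dissimilarity, so average-link HAC merges them first.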
Experimental Results: Data Set Classification

Name           Mean NMI   # of members with NMI > 0.5   Class
Segmentation   0.602      169                           S
Glass          0.589      131                           S
Vehicle        0.670      199                           S
Heart          0.241      11                            NS
Pima           0.299      26                            NS
O8X            0.488      91                            NS
Experimental Results: Results on Different Subsets

Name      1st (F)   2nd (L)   3rd (H)   4th (M)   Best member   Class
O8X       0.491     0.444     0.655*    0.582     0.637         NS
Glass     0.269*    0.272     0.263     0.269     0.397         S
Vehicle   0.146*    0.141     0.119     0.136     0.227         S
Heart     0.095     0.079     0.340*    0.104     0.169         NS
Pima      0.071     0.071     0.127*    0.060     0.076         NS
Seg.      0.406*    0.379     0.390     0.438     0.577         S

(* marks the subset selected by the proposed strategy.)
Experimental Results: Proposed Method versus Fern-Lin

Name           Proposed method   Fern-Lin
Iris (S)       0.74              0.613
Soybean (S)    1                 0.866
Thyroid (NS)   0.656             0.652
Wine (NS)      0.680             0.612
O8X (NS)       0.655             0.637
Glass (S)      0.269             0.301
Vehicle (S)    0.146             0.122
Heart (NS)     0.340             0.207
Pima (NS)      0.127             0.092
Seg. (S)       0.406             0.550
Experimental Results: Selecting a Method vs. Selecting the Best Ensemble Members

• Which members are selected for the final clustering?

[Figure: NMI with P* for each member on Wine (NS) and Thyroid (NS); K-means and MSF members marked. Wine: only MSF members are selected. Thyroid: both MSF and K-means members are selected.]
Experimental Results: How accurate are the selected ensemble members?

• x-axis: members in decreasing order of NMI with P* (from most similar to most dissimilar to P*).
• y-axis: their corresponding NMI with the ground-truth labels.

[Figure: Soybean (S) and Thyroid (NS) panels; the selected ensemble members are among the more accurate ones.]
Conclusion
• We empirically learned a simple ensemble selection strategy:
  – First classify a given ensemble as stable or unstable.
  – Then select a subset according to the classification result.
• On separate test data sets, we achieve excellent results:
  – Sometimes significantly better than the best ensemble member.
  – Outperforms an existing selection method.