Multiple Classifier System
Farzad Vasheghani Farahani – Machine Learning
Outline
Introduction
Decision Making
General Idea
Brief History
Reasons & Rationale
Statistical
Large volumes of data
Too little data
Divide and Conquer
Data Fusion
Multiple Classifier System
Designing
Diversity
Create an Ensemble
Combining Classifiers
Example
Conclusions
References
Ensemble-based Systems in Decision Making
For many tasks, we often seek a second opinion before making a decision, sometimes many more:
Consulting different doctors before a major surgery
Reading reviews before buying a product
Requesting references before hiring someone
We consider decisions of multiple experts in our daily lives
Why not follow the same strategy in automated decision making?
Multiple classifier systems, committee of classifiers, mixture of experts,
ensemble based systems
Ensemble-based Classifiers
How to (i) generate the individual components of the ensemble system
(base classifiers), and (ii) combine the outputs of the individual
classifiers?
Brief History of Ensemble Systems
Dasarathy and Sheela (1979) partitioned the feature space using two
or more classifiers
Schapire (1990) proved that a strong classifier can be generated by
combining weak classifiers through boosting; this was the predecessor
of the AdaBoost algorithm
Two types of combination:
classifier selection
classifier fusion
Why Ensemble Based Systems?
1. Statistical reasons
A set of classifiers with similar training performances may have different generalization performances
Combining outputs of several classifiers reduces the risk of selecting a poorly performing classifier
Example:
Suppose there are 25 base classifiers
Each classifier has an error rate ε = 0.35
Probability that the ensemble classifier (majority vote) makes a wrong prediction:

P(error) = Σ_{i=13}^{25} C(25, i) ε^i (1 − ε)^{25−i} ≈ 0.06
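The 0.06 figure follows from a binomial tail: under the assumption that the 25 classifiers err independently, a majority vote is wrong only when 13 or more individual classifiers are wrong. A quick stdlib check:

```python
from math import comb

eps = 0.35  # error rate of each base classifier
n = 25      # number of base classifiers

# Majority vote fails when at least 13 of the 25 classifiers err;
# with independent errors this is a binomial tail probability.
p = sum(comb(n, i) * eps**i * (1 - eps)**(n - i) for i in range(13, n + 1))
# p ≈ 0.06, far below the individual error rate of 0.35
```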
Why Ensemble Based Systems?
2. Large volumes of data
If the amount of data to be analyzed is too large, a single classifier
may not be able to handle it; train different classifiers on
different partitions of data
Why Ensemble Based Systems?
3. Too little data
Ensemble systems can also be used when there is too little data;
resampling techniques
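Bootstrapping is the standard such resampling technique: each base classifier trains on a replicate of the small data set drawn with replacement. A minimal sketch (the function name is illustrative):

```python
import random

def bootstrap_replicates(data, n_replicates, seed=0):
    """Draw n_replicates bootstrap samples, each the size of the
    original data set, sampling with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in data] for _ in range(n_replicates)]

data = list(range(10))  # a small data set
replicates = bootstrap_replicates(data, n_replicates=5)
# each replicate has 10 items, some repeated, some left out
```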
Why Ensemble Based Systems?
4. Divide and Conquer
Divide data space into smaller & easier-to-learn partitions; each classifier learns only one of the simpler partitions
Why Ensemble Based Systems?
5. Data Fusion
Given several sets of data from various sources, where the nature
of the features differs (heterogeneous features), training a single
classifier may not be appropriate (e.g., MRI data, EEG recordings,
blood tests, ..)
Multiple Classifier System Design
Major Steps
All ensemble systems must have two key components:
Generate component classifiers of the ensemble
Method for combining the classifier outputs
“Diversity” of Ensemble
Objective: create many classifiers, and combine their outputs
to improve the performance of a single classifier
Intuition: if each classifier makes different errors, then their
strategic combination can reduce the total error!
Need base classifiers whose decision boundaries are adequately
different from those of others
Such a set of classifiers is said to be “diverse”
How to achieve classifier diversity?
A. Use different training sets to train individual classifiers
B. Use different training parameters for a classifier
C. Different types of classifiers (MLPs, decision trees, nearest-neighbor
classifiers, SVMs) can be combined for added diversity
D. Use random feature subsets, called the random subspace
method
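Option D can be sketched in a few lines: assign each base classifier its own random subset of the features (all names below are illustrative).

```python
import random

def random_subspaces(features, n_classifiers, subspace_size, seed=0):
    """Pick a random feature subset for each base classifier
    (the random subspace method)."""
    rng = random.Random(seed)
    return [sorted(rng.sample(features, subspace_size))
            for _ in range(n_classifiers)]

features = ["hair", "feathers", "eggs", "milk", "legs", "tail"]
subspaces = random_subspaces(features, n_classifiers=4, subspace_size=3)
# each classifier would then be trained on only its 3 assigned features
```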
Create an Ensemble (Coverage Optimization)
Creating An Ensemble
Two questions:
1. How will the individual classifiers be generated?
2. How will they differ from each other?
Ensemble Creation Methods
1. Subsample Approach (Data sample)
Bagging
Random forest
Boosting
AdaBoost
Wagging
Rotation forest
RotBoost
Mixture of Experts
2. Subspace Approach (Feature Level)
Random based
Feature reduction
Performance based
3. Classifier Level Approach
Bagging
Boosting
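As a concrete instance of the subsample approach, here is a minimal bagging sketch (stdlib only; the decision-stump base learner and all names are illustrative, not from the slides): each stump trains on a bootstrap replicate, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Fit a 1-D decision stump: choose the threshold (among the
    sample's x-values) that misclassifies the fewest points,
    predicting 1 for x >= threshold."""
    best_errs, best_pred = None, None
    for t, _ in sample:
        pred = lambda x, t=t: 1 if x >= t else 0
        errs = sum(pred(x) != y for x, y in sample)
        if best_errs is None or errs < best_errs:
            best_errs, best_pred = errs, pred
    return best_pred

def bagging(train, n_estimators=11, seed=0):
    """Train one stump per bootstrap replicate; combine by majority vote."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_estimators):
        replicate = [rng.choice(train) for _ in train]
        stumps.append(fit_stump(replicate))
    return lambda x: Counter(s(x) for s in stumps).most_common(1)[0][0]

train = [(0.1, 0), (0.2, 0), (0.3, 0), (0.6, 1), (0.8, 1), (0.9, 1)]
model = bagging(train)
# model(0.05) -> 0, model(0.95) -> 1
```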
Combining Classifiers (Decision Optimization)
Two Important Concepts (i)
(i) trainable vs. non-trainable
Trainable rules: parameters of the combiner, called “weights”, are determined through a separate training algorithm
Non-trainable rules: combination parameters are available as soon as the classifiers are generated; weighted majority voting is an example
Two Important Concepts (ii)
(ii) Type of the output of the classifiers:
Absolute (label) output: Majority Voting, Naïve Bayes, Behavior Knowledge Space
Ranked output: Borda Count, Maximum Ranking
Continuous output: Algebraic Methods, Fuzzy Integral, Decision Templates
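Two of these combiners are simple enough to sketch directly (stdlib only; function names are illustrative): plurality voting over label outputs, and the Borda count over ranked outputs.

```python
from collections import Counter

def majority_vote(labels):
    """Combine absolute (label) outputs: return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]

def borda_count(rankings):
    """Combine ranked outputs: position i in a ranking of k classes
    earns k - 1 - i points; the highest-scoring class wins."""
    k = len(rankings[0])
    scores = Counter()
    for ranking in rankings:
        for i, label in enumerate(ranking):
            scores[label] += k - 1 - i
    return max(scores, key=scores.get)

winner = majority_vote(["cat", "dog", "cat"])   # -> "cat"
ranked = borda_count([["cat", "dog", "fish"],
                      ["dog", "cat", "fish"],
                      ["cat", "fish", "dog"]])  # -> "cat"
```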
Example (“Zoo” UCI Data Set)
1. animal name: unique for each instance
2. hair: Boolean
3. feathers: Boolean
4. eggs: Boolean
5. milk: Boolean
6. airborne: Boolean
7. aquatic: Boolean
8. predator: Boolean
9. toothed: Boolean
10. backbone: Boolean
11. breathes: Boolean
12. venomous: Boolean
13. fins: Boolean
14. legs: numeric (set of values: {0, 2, 4, 5, 6, 8})
15. tail: Boolean
16. domestic: Boolean
17. catsize: Boolean
18. type: numeric (integer values in range [1, 7])
Conclusions
Ensemble systems are useful in practice
Diversity of the base classifiers is important
Ensemble generation techniques: bagging, AdaBoost, mixture of
experts
Classifier combination strategies: algebraic combiners, voting
methods, and decision templates
No single ensemble generation algorithm or combination rule is
universally better than others
Effectiveness on real world data depends on the classifier diversity
and characteristics of the data
References
[1] R. Polikar, “Ensemble Based Systems in Decision Making,” IEEE Circuits and Systems Magazine, vol. 6, no. 3, pp. 21-45, 2006.
[2] R. Polikar, “Bootstrap Inspired Techniques in Computational Intelligence,” IEEE Signal Processing Magazine, vol. 24, no. 4, pp. 56-72, 2007.
[3] R. Polikar, “Ensemble Learning,” Scholarpedia, 2008.
[4] L. I. Kuncheva, Combining Pattern Classifiers: Methods and Algorithms. New York, NY: Wiley, 2004.