22

PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data MiningLecture 9 - Random Forests

Asst.Prof.Dr. Burkay Genç

Hacettepe University

November 20, 2016

Page 2: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Introduction

A single decision tree is often too simple

Many models working together is better than a single model

Consider a panel of experts to make a decision

Usually some variables are indistinguishable

Example: 75% Yes, 25% No ([75, 25])I Splitting by A : [80, 20] - [40, 60]I Splitting by B : [80, 20] - [40, 60]I Idea: build both then merge into an ensemble

Page 3: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Overview

Stability

The decision tree model is instableI Remove a few items and the model can change marginally

Random forest models are much more stable

They can tolerate noise and change

Page 4: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Overview

Underrepresentation

Random forests handle underrepresented classi�cation tasks quitewell

This is where, in the binary classi�cation task, one class has very few(e.g., 5% or fewer) observations compared with the other class

Page 5: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Overview

Randomness

Variables are selected randomly

This provide robustness

Also each tree is built much fasterI Much less needs to be done at each split

Page 6: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Overview

Variable vs Observation

Random forests are very useful when there are many variables andand not that many observations

Page 7: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Knowledge Representation

Decision Tree

The main building block is the decision tree (usually)

However, any other model may also be used

Random Forest model is a meta-algorithm

Page 8: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Knowledge Representation

How does it work?

Build many random DTs

Treat them equally

If majority says yes then the decision is a yes, otherwise it is a no

For regression, compute the average

Page 9: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Algorithm

Bagging

RF algorithm uses the concept of bagging

BAG: bootstrap aggregation

Randomly create a sample of all observationsI Sample size is about two thirds of the original datasetI Replacements are allowedI An observation may appear more than once in a sample

Use each sample to train a di�erent DT

Page 10: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Algorithm

Sampling the variables

At each split, we randomize the variable selection:I Choose a small set of variables arbitrarilyI Select the best variable among these

At each split (node), do this with a di�erent subset of variables

How does this a�ect speed?

Page 11: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Algorithm

Randomness

Randomness allows to purposefully create di�erent trees

Di�erent trees represent di�erent experts

Allow each tree to over�t, they only contribute a small amount tothe overall decision

Page 12: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Algorithm

Ensemble Scoring

Do not favor any DTs over others

They are all equally valuable perspectives on the same data

If the majority of di�erent perspectives agrees on a decision, then itis likely to be the correct decision

Page 13: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Example

Building a model with Rattle

User interface

Page 14: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Example

Building a model with Rattle

Examining the output

Missing value imputation

If imputation is not checked then sample size may be lower, as thedefault behavior is to drop the observations with missing values.

Page 15: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Example

Building a model with Rattle

Performance evaluation

OOB

OOB is out-of-bag. It means the error is computed using theobservations left out of the bag for each tree.

Page 16: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Example

Building a model with Rattle

Underrepresented Classes

Notice the high amount of false-negatives in the previous table

That means you won't be prepared for the rain tomorrow ->Problem!

RF can �x this by balancing the underrepresented andoverrepresented classesI RainTomorrow=='Yes' is underrep. with 66/366 observationsI RainTomorrow=='No' is overrep. with 300/366 observations

Page 17: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Example

Building a model with Rattle

Sample Size

We sacri�ced false-positives in favor of false-negatives

We will carry an umbrella when there is no need, but we will becaught unprepared less frequently

Page 18: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Example

Building a model with Rattle

Variable Importance

Page 19: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Example

Building a model with Rattle

Importance

Page 20: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Example

Building a model with Rattle

Error rates

Page 21: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Example

Building a model with Rattle

Conversion to rules

You can convert each tree into the corresponding rule set

Provide the index of the tree for a speci�c tree

If you provide 0, then you get the rules of all treesI Be careful though, this can easily get out of control

Page 22: PSS718 - Data Mining - Hacettepe Üniversitesiyunus.hacettepe.edu.tr/.../slides/lecture9.pdf · PSS718 - Data Mining Example Building a model with attleR Conversion to rules ouY can

PSS718 - Data Mining

Example

Building a forest with R

The R command