Using Rule Ensembles to Predict Credit Risk. Diane Chang, Intuit. October 16, 2015. #GHC15


Page 1: Using Rule Ensembles to Predict Credit Risk #GHC15

Using Rule Ensembles to Predict Credit Risk

Diane Chang, Intuit

October 16, 2015

#GHC15

Page 2: Using Rule Ensembles to Predict Credit Risk #GHC15

Trade-off: speed vs. rates

Page 3: Using Rule Ensembles to Predict Credit Risk #GHC15

Can accounting data help?

Page 4: Using Rule Ensembles to Predict Credit Risk #GHC15

Low credit risk?

Page 5: Using Rule Ensembles to Predict Credit Risk #GHC15

Traditionally, one big model

Annual Revenue
  ≤ $100K → State: In TX | Not in TX → …
  > $100K → Industry: Retail | Non-retail → …

Page 6: Using Rule Ensembles to Predict Credit Risk #GHC15

Ensemble methods

Page 7: Using Rule Ensembles to Predict Credit Risk #GHC15

Lots of little trees

Page 8: Using Rule Ensembles to Predict Credit Risk #GHC15

Trees → Rules

Time in Business
  ≤ 3 yrs → # Invoices: ≤ 8 | > 8
  > 3 yrs → Sales Growth: ≤ 10% | > 10%

Page 9: Using Rule Ensembles to Predict Credit Risk #GHC15

Trees → Rules

Time in Business
  ≤ 3 yrs (1) → # Invoices: ≤ 8 (3) | > 8 (4)
  > 3 yrs (2) → Sales Growth: ≤ 10% (5) | > 10% (6)

Rule 1: Time in Business ≤ 3 yrs

Rule 2: Time in Business > 3 yrs

Rule 3: Time in Business ≤ 3 yrs and # Invoices ≤ 8

Rule 4: Time in Business ≤ 3 yrs and # Invoices > 8

Rule 5: Time in Business > 3 yrs and Sales Growth ≤ 10%

Rule 6: Time in Business > 3 yrs and Sales Growth > 10%
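
The tree-to-rules step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the RuleFit implementation: the `Node` class and `extract_rules` function are hypothetical names, and thresholds are kept as display text. Every node below the root yields one rule, formed by joining the conditions on the path from the root; a breadth-first walk reproduces the slide's numbering.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A binary split; a leaf is a Node with feature=None (hypothetical helper)."""
    feature: Optional[str] = None
    threshold: Optional[str] = None   # display text, e.g. "3 yrs"
    left: Optional["Node"] = None     # branch where feature ≤ threshold
    right: Optional["Node"] = None    # branch where feature > threshold


def extract_rules(root):
    """Return one rule per non-root node, in breadth-first order."""
    rules = []
    queue = deque([(root, ())])
    while queue:
        node, conditions = queue.popleft()
        if node is None or node.feature is None:
            continue  # leaves have no further splits
        for op, child in (("≤", node.left), (">", node.right)):
            path = conditions + (f"{node.feature} {op} {node.threshold}",)
            rules.append(" and ".join(path))
            queue.append((child, path))
    return rules


# The tree from the slide.
tree = Node(
    "Time in Business", "3 yrs",
    left=Node("# Invoices", "8", Node(), Node()),
    right=Node("Sales Growth", "10%", Node(), Node()),
)
rules = extract_rules(tree)
for number, rule in enumerate(rules, start=1):
    print(f"Rule {number}: {rule}")
```

Run on the slide's tree, this lists the same six rules in the same order.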

Page 10: Using Rule Ensembles to Predict Credit Risk #GHC15

A lot of rules!

Rule 1 Rule 2 Rule 3 Rule 4 Rule 5 Rule 6

Rule 1 Rule 2 Rule 3 Rule 4 … Rule n

Page 11: Using Rule Ensembles to Predict Credit Risk #GHC15

Which rule is most predictive?

$w_1 R_1 + w_2 R_2 + w_3 R_3 + \dots + w_n R_n$

The weights $w_i$ are computed via logistic regression.
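
To make "computed via logistic regression" concrete, here is a small sketch that fits an intercept and one weight per rule by plain gradient descent on the log-loss. It assumes each rule is encoded as 1 when it fires and 0 otherwise. The data, learning rate, and function names are made up for illustration, and this unregularized fit is not necessarily the procedure RuleFit itself uses.

```python
import math


def sigmoid(z):
    """Logistic function: maps a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit [w0, w1, ..., wn] by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept w0
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, label in zip(X, y):
            p = sigmoid(w[0] + sum(wi * ri for wi, ri in zip(w[1:], row)))
            err = p - label
            grad[0] += err
            for j, r in enumerate(row, start=1):
                grad[j] += err * r
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w


# Made-up toy data: each row is [R1, R2]; y is a made-up 0/1 outcome.
X = [[1, 0]] * 4 + [[0, 1]] * 4
y = [1, 1, 1, 0, 0, 0, 0, 1]

w = fit_logistic(X, y)
print([round(wi, 2) for wi in w])
```

In this toy data the outcome occurs in 3 of 4 rows where R1 fires and 1 of 4 rows where R2 fires, so the fit gives R1 a positive weight and R2 a negative one.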

Page 12: Using Rule Ensembles to Predict Credit Risk #GHC15

Scoring

https://en.wikipedia.org/wiki/Logistic_regression

$\text{score} = \log_e \frac{p}{1 - p} = w_0 + w_1 R_1 + w_2 R_2 + \dots + w_n R_n$

$p = \frac{1}{1 + e^{-\text{score}}}$
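
The two formulas on this slide say the same thing: the score is the log-odds of $p$, and solving that equation for $p$ gives the logistic function. A short derivation, writing $s$ for the score:

```latex
\begin{aligned}
s &= \ln\frac{p}{1-p} \\
e^{s} &= \frac{p}{1-p} \\
p &= (1-p)\,e^{s} \\
p\,(1+e^{s}) &= e^{s} \\
p &= \frac{e^{s}}{1+e^{s}} = \frac{1}{1+e^{-s}}
\end{aligned}
```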

Page 13: Using Rule Ensembles to Predict Credit Risk #GHC15

Example

R1: Time in Business ≤ 3 yrs ✓
R2: Time in Business > 3 yrs ✗
R3: Time in Business ≤ 3 yrs & Invoices ≤ 8 ✗
R4: Time in Business ≤ 3 yrs & Invoices > 8 ✓
R5: Time in Business > 3 yrs & Sales Growth ≤ 10%
R6: Time in Business > 3 yrs & Sales Growth > 10%

$\text{score} = -2 + 0.1 R_1 - 0.25 R_2 + 0.14 R_3 - 0.6 R_4 + 0.01 R_5 - 0.07 R_6$

• 2 years in business
• 10 invoices per month
• Sales growth: 12%

$\text{score} = -2 + 0.1 - 0.6 = -2.5$

$p = \frac{1}{1 + e^{2.5}} \approx 0.08$

Make a loan offer!
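
The arithmetic in this example can be checked with a short script. The weights and the applicant's figures come from the slide; the function names are hypothetical, and each rule is assumed to count as 1 when satisfied and 0 otherwise.

```python
import math

# Weights from the slide: intercept w0, then w1..w6 for rules R1..R6.
W0 = -2.0
WEIGHTS = [0.1, -0.25, 0.14, -0.6, 0.01, -0.07]


def rule_indicators(years, invoices, growth_pct):
    """Return [R1, ..., R6] as 0/1 flags for one applicant."""
    young = years <= 3
    return [
        int(young),                              # R1
        int(not young),                          # R2
        int(young and invoices <= 8),            # R3
        int(young and invoices > 8),             # R4
        int(not young and growth_pct <= 10),     # R5
        int(not young and growth_pct > 10),      # R6
    ]


def score(indicators):
    """Intercept plus the weights of the rules that fire."""
    return W0 + sum(w * r for w, r in zip(WEIGHTS, indicators))


def probability(s):
    """Convert a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-s))


flags = rule_indicators(years=2, invoices=10, growth_pct=12)
s = score(flags)
p = probability(s)
print(flags, round(s, 2), round(p, 2))  # [1, 0, 0, 1, 0, 0] -2.5 0.08
```

Only R1 and R4 fire for this applicant, so the score reduces to the intercept plus those two weights, matching the slide.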

Page 14: Using Rule Ensembles to Predict Credit Risk #GHC15

Rule ensembles RULE

Rule Ensemble (✔) vs. Logistic Regression (✖):

• Missing data
• High predictive power
• Large number of variables

Page 15: Using Rule Ensembles to Predict Credit Risk #GHC15

Try it yourself!

Professor Friedman’s website:

− http://statweb.stanford.edu/~jhf/R_RuleFit.html

Open source “wrapper” to RuleFit

− https://github.com/intuit/rego

Page 16: Using Rule Ensembles to Predict Credit Risk #GHC15

Got Feedback?

Rate and Review the session using the GHC Mobile App

To download visit www.gracehopper.org