Credit Scoring and Credit Control XIV · 2017-10-04 · Customer behavior summarized continuously...

Credit Scoring and Credit Control XIV

26 – 28 August 2015

#creditconf15 @uoebusiness

© 2015 Fair Isaac Corporation. Confidential. This presentation is provided for the recipient only and cannot be reproduced or shared without Fair Isaac Corporation’s express consent.

Predictive profiling from other customers’ behavior

Collaborative Profiling


Demographic variables? Weak predictive power.

Past customer behavior!

• Generation 1
─ Global risk summaries
─ The credit report, or cycle snapshot: a static summary

• Generation 2
─ Continuously updated transaction profile for each account and customer
─ Typical in fraud analytics

• Generation 3
─ Predict using other customers’ behaviors and profiles

What works in customer analytics?


Non-numerical event streams are everywhere

► Netflix: films watched
► Amazon: merchandise purchased
► Google: text in a web page
► Apple: apps purchased
► Banking: card transactions
► Websites: links clicked
► Networks: DNS queries

What event streams look like to a computer (no semantic knowledge):

» Matt:
» 132.74.91.11:
» cookie_id 133x71:
» Scott:


Discerning “Behavioral Archetypes” From Data

FICO Analytics

Archetypes Of Customer Behavior

► A “customer” might be:
  ► an actual human customer
  ► a web client
  ► a network client
  ► an employee
  ► anything generating sequential data streams tied to any “id”
► How much structure goes unexploited?


Well-Known Large-Scale Analytics Successes

• Netflix, Amazon, Apple: recommend film/product/music based on individually revealed preferences and general desirability

• Google: retrieve and rank search results based on individual search history and global relevance

• Marketing analytics: customized offers based on collective analysis of customers

Individualized predictions, sensitive to collective behaviors


Profiling in a nutshell

• What is event profiling?

─ Given sequential data for one customer $x_1, x_2, \dots, x_t$, predict $x_{t+1}$ or external tag

─ ideal: $\hat{p}(x_{t+1} \mid x_t, x_{t-1}, \dots, x_1)$ but impractical to train or represent

─ introduce a finite-dimensional profile vector: $\boldsymbol{\theta}_t$

• model: $\hat{p}(x_{t+1} \mid x_t, \boldsymbol{\theta}_t)$

• update: $\boldsymbol{\theta}_{t+1} = F(x_t, \boldsymbol{\theta}_t)$

• $\boldsymbol{\theta}_t$ & $F$ are crafted with expert knowledge & empirical experimentation
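The model/update pair above can be sketched in a few lines. This is only an illustration under stated assumptions, not the profile design used in the talk: it assumes the profile is a set of exponentially decayed event-category weights, that F is a decay-and-add rule, and that the prediction uses the profile alone (it ignores the most recent event). The names `update_profile`, `predict_next`, `decay` and `smoothing` are hypothetical.

```python
from typing import Dict

def update_profile(theta: Dict[str, float], x: str, decay: float = 0.9) -> Dict[str, float]:
    """theta_{t+1} = F(x_t, theta_t): decay old weights, add mass for the new event."""
    new_theta = {k: decay * v for k, v in theta.items()}
    new_theta[x] = new_theta.get(x, 0.0) + (1.0 - decay)
    return new_theta

def predict_next(theta: Dict[str, float], smoothing: float = 1e-3) -> Dict[str, float]:
    """Approximate p(x_{t+1} | theta_t) by the smoothed, normalized profile weights."""
    total = sum(v + smoothing for v in theta.values())
    return {k: (v + smoothing) / total for k, v in theta.items()}

# Only the fixed-size profile is stored; the raw event history is never revisited.
theta: Dict[str, float] = {}
for event in ["grocery", "fuel", "grocery", "grocery", "airline"]:
    theta = update_profile(theta, event)
probs = predict_next(theta)
```

The point of the design is that each new event costs one constant-size update, which is what makes per-account profiles practical on a transaction stream.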


Collaborative Filtering in a nutshell

• Collaborative filtering

─ Given sequential data for many customers $\{x_1, x_2, \dots, x_t\}, \{\}, \{\}, \{\}$, predict $x_{t+1}$ or external tag

• Item x Item CF prediction: “People who purchased i also bought j”

─ Estimate $L(i) = \hat{p}(i \mid j) / \hat{p}(i)$ for all items i & j in the dictionary from big historical data

─ Smarter: $L(i) = \sum_k \hat{p}(i \mid k)\,\hat{p}(k \mid j) / \hat{p}(i)$ for a smaller “latent factor” space indexed by k

─ One example: “non-negative matrix factorization”
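The item x item estimate above can be illustrated with plain co-occurrence counts. This is a minimal sketch assuming that each customer history is reduced to an unordered basket of items and that the probabilities are estimated by raw frequencies; the function name `item_lift` and the toy baskets are hypothetical.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, List, Set, Tuple

def item_lift(baskets: List[Set[str]]) -> Dict[Tuple[str, str], float]:
    """Return lift[(i, j)] = p(i | j) / p(i), estimated from basket counts."""
    n = len(baskets)
    item_count: Counter = Counter()
    pair_count: Counter = Counter()
    for basket in baskets:
        item_count.update(basket)
        # Every ordered pair (i, j) of distinct items seen in the same basket.
        pair_count.update(permutations(sorted(basket), 2))
    lift: Dict[Tuple[str, str], float] = {}
    for (i, j), n_ij in pair_count.items():
        p_i_given_j = n_ij / item_count[j]
        p_i = item_count[i] / n
        lift[(i, j)] = p_i_given_j / p_i
    return lift

baskets = [
    {"mango", "papaya", "coconut"},
    {"mango", "papaya"},
    {"lettuce", "tomatoes"},
    {"lettuce", "tomatoes", "mango"},
]
lift = item_lift(baskets)
```

A lift above 1 means item i is more likely once j has been seen than it is overall; a lift below 1 means less likely. The full pair table grows with the square of the dictionary size, which is the motivation for the smaller latent factor space.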


Collaborative Filtering in a nutshell

• User x Item CF prediction: “People like you also bought item j”

─ neighborhood: “people with similar histories close to you in some space”

─ clustering/segmentation: “people in the same category as you” and learn categories

─ user profile:

• Represent user in latent factor space: $\boldsymbol{\theta}_t(\text{user}) = F(x_1, x_2, \dots, x_t, \boldsymbol{G})$

• Represent mapping $\boldsymbol{G}$ from latent factors to observable elements

• predict observable: $\sum_k G_{ik}\,\theta_k(\text{user}) / \hat{p}(i)$

• learn $F$ and $\boldsymbol{G}$ simultaneously from large data


Collaborative Filtering
Machine Learning/Analytic Techniques

• Item or user similarity-based near neighbor weightings

• Singular value decomposition

• Non-negative matrix factorization

• Bayesian networks

• Latent semantic indexing

• Latent Dirichlet allocation

Chuang et al. 2012


Learning “Behavioral Archetypes” from Data
High performance FICO implementation

Large-scale Bayesian analysis

(Millions+ of “words” over thousands/millions of customers)

Decomposition of Customer Behavior

Model


Mapping Real Customers onto Archetypes

Desirable properties of our technology:
» Optimized to predict observed customer behavior
» Most customers are “stereotyped”: dominated by a few archetypes
» Calibrated probabilistic interpretation

[Figure: Actual Customer = mixture of Archetypes of Customer Behavior, with weights 35.2%, 2.4%, 50.5%, 0.1%, 11.8%]


What is, and is not, a behavioral “archetype”?

► A behavioral archetype is:
  ► A probabilistic (soft) clustering of correlated behaviors
  ► A basis for decomposing any actual customer’s observed transactions

► A behavioral archetype is not:
  ► A segmentation or split of actual customers
  ► A cluster defined by heuristic or intuitive human judgment
  ► A selected or constructed “representative customer” or set thereof


Words and Archetypes

Where the art comes in
• The scientist chooses the “word” representation from raw data streams

movies A,B,C, … VISA Transaction Element AA.BB & CC.DD Network services

► Typical: word space (index i) dimension 100-10000+; archetype space (index k) dimension 10-200.
► Training:
  ► archetype / word distribution matrix $\phi_{ki}$ (rows sum to 1)
  ► k-dim archetype distribution vector per training customer: $\boldsymbol{\theta}_j$ (sum to 1)
  ► both are pushed to statistical sparsity: highly interpretable empirically
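One way to read how the two trained objects fit together, assuming a topic-model-style mixture (the deck lists latent Dirichlet allocation among related techniques but does not state the exact model): the probability that training customer j emits word i is the archetype/word matrix weighted by that customer's archetype vector. The double-index notation $\theta_{jk}$ for component k of customer j's vector is introduced here for illustration.

```latex
\hat{p}(i \mid j) = \sum_{k} \theta_{jk}\,\phi_{ki},
\qquad \sum_{i} \phi_{ki} = 1 \ \text{for each } k,
\qquad \sum_{k} \theta_{jk} = 1 \ \text{for each } j.
```

Because both factors are normalized, the result is itself a probability distribution over words, which is consistent with the calibrated probabilistic interpretation claimed for the method.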


Statistical sparsity: automatic soft clustering

[Figure: two bar charts over categories 1 to 4. Panel “non-sparse archetype” (vertical axis 0 to 0.35); panel “sparse archetype” (vertical axis 0 to 0.7). Annotations: “results: insightful & interpretable”; “low information content”.]


Collaborative Profiles: Real-Time Archetype Tracking
FICO Patent Pending

Real-time updates from transaction stream (one column per archetype):

T=0     20.0%   20.0%   20.0%   20.0%   20.0%
T=1     20.7%   19.1%   21.5%   19.0%   19.6%
T=2     22.7%   16.9%   25.3%   16.5%   18.6%
T=10    32.2%    5.9%   44.4%    4.1%   13.4%
T=20    35.2%    2.4%   50.5%    0.1%   11.8%

note sparsity


Collaborative Profiles: Real-Time Archetype Tracking

► Collaborative profiling

► Given sequential data for many customers

► for each customer’s $x_1, x_2, \dots, x_t$ predict $x_{t+1}$ in light of all

► Train offline archetypes $\phi_{ki}$ with large historical data set

► Per customer: store profile vector: $\boldsymbol{\theta}_t$

► predict: $\hat{p}(x_{t+1} \mid x_t, \boldsymbol{\theta}_t) \propto \phi_{ki}\,\boldsymbol{\theta}_t$

► profile update: $\boldsymbol{\theta}_{t+1} = F(\boldsymbol{\theta}_t, \phi_{ki}, x_{t+1})$

► Novel FICO profile algorithm, store only k-dim vector, efficient update

► Literature considers batch case only: $\{x_1, x_2, \dots, x_T\} \to \boldsymbol{\theta}_T$

U.S. Patent Pending 13/725,561
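The deck does not disclose the patented update rule, so the following is only a generic stand-in that has the same shape: a stored k-dimensional vector, a fixed archetype/word matrix, and a constant-cost update per event. It assumes the update moves the profile a fixed fraction toward the Bayesian posterior over archetypes for the observed word. The names `predict`, `update_theta` and `rate`, and the toy matrix, are hypothetical.

```python
from typing import List

def predict(theta: List[float], phi: List[List[float]], word: int) -> float:
    """p(x_{t+1} = word | theta_t) = sum_k theta_k * phi[k][word]."""
    return sum(t_k * phi_k[word] for t_k, phi_k in zip(theta, phi))

def update_theta(theta: List[float], phi: List[List[float]], word: int,
                 rate: float = 0.2) -> List[float]:
    """theta_{t+1} = F(theta_t, phi, x_{t+1}): step toward the posterior
    archetype responsibilities of the observed word (assumed rule)."""
    joint = [t_k * phi_k[word] for t_k, phi_k in zip(theta, phi)]
    total = sum(joint)
    if total == 0.0:  # word has zero probability under every archetype: keep profile
        return list(theta)
    resp = [j_k / total for j_k in joint]
    return [(1.0 - rate) * t_k + rate * r_k for t_k, r_k in zip(theta, resp)]

# Two archetypes over three words; each row of phi sums to 1.
phi = [[0.7, 0.2, 0.1],
       [0.1, 0.2, 0.7]]
theta = [0.5, 0.5]  # uninformative starting profile
for word in [0, 0, 1, 0]:
    theta = update_theta(theta, phi, word)
```

Word 1 is equally likely under both archetypes here, so observing it leaves the profile unchanged, while each observation of word 0 shifts weight toward the first archetype.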


FICO’s Falcon protects 65% of world’s payment cards

[Figure: Fraud Losses in basis points (axis 0 to 20), annotated “Falcon Introduced”.]

• Supervised neural network models
• Individual cardholder profiles for streaming transactions
• Rapidly assimilates data such as: purchase frequency, amounts, time, volume, merchant category, location


Application: Fraud Detection
Falcon Fraud Manager: with Collaborative Profiles

Archetype distribution over time (one column per archetype):

T=20    35.2%    2.4%   50.5%    0.1%   11.8%
T=21    20.1%    6.2%   25.3%   32.6%   15.9%
T=22    14.1%    7.7%   15.2%   45.5%   17.5%
T=23     8.0%    9.2%    5.1%   58.5%   19.2%

Figure labels: “Customer”, “Fraud Criminal”, “Card is Compromised!”

Change in previously stable behavior = elevated fraud risk.
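The size of the shift in the archetype distributions above can be quantified in many ways; the deck does not say which measure is used. As a sketch, the code below assumes total variation distance from the stable T=20 profile and a hypothetical alert threshold.

```python
from typing import Dict, List

def total_variation(p: List[float], q: List[float]) -> float:
    """Half the L1 distance between two distributions (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Archetype distributions from the slide, in percent.
profiles: Dict[int, List[float]] = {
    20: [35.2, 2.4, 50.5, 0.1, 11.8],
    21: [20.1, 6.2, 25.3, 32.6, 15.9],
    22: [14.1, 7.7, 15.2, 45.5, 17.5],
    23: [8.0, 9.2, 5.1, 58.5, 19.2],
}
baseline = [v / 100.0 for v in profiles[20]]
shift = {t: total_variation(baseline, [v / 100.0 for v in row])
         for t, row in profiles.items()}

ALERT_THRESHOLD = 0.3  # hypothetical cut-off, not from the slides
alerts = [t for t, d in sorted(shift.items()) if d > ALERT_THRESHOLD]
```

With these numbers the distance from the T=20 profile grows at every step from T=21 to T=23, which matches the slide's reading of a previously stable profile being pulled toward a different archetype.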


Collaborative Profiles
Predict Your Customer

• A behavior may be considered “high-probability”, even if it has never been observed directly on this customer’s account before

• Reduces false positives

• Inference through archetypes trained on collective data

Atypical behavior for this customer?

Typical behavior for this customer?

Risk adjustment

Individualized predictions, sensitive to collective behaviors


Collaborative Profiles
Conceptual Example

The relationship between air/hotel/drug store is discovered by a data-driven, theoretically sound algorithm, not a hand-built feature.

www.abcnews.go.com

• New “out of town drug store” transaction
1. No prior drug store purchase on account: superficially higher risk
2. Out of home area: superficially higher risk
3. Card was historically used heavily for hotel and air travel purchases
4. Customer shows strong “traveler” archetype
5. Collaborative Profile indicates lower risk than for the average cardholder
6. False positive fraud alert avoided
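The reasoning in this example can be made concrete with a toy calculation. Everything numeric below is invented for illustration: the two archetypes, the category list, the matrix `PHI` and both profiles are hypothetical, and the only thing taken from the deck is the idea that a transaction's probability is the profile-weighted mix of archetype distributions.

```python
from typing import Dict, List

# Hypothetical archetype/word matrix: each row is p(merchant category | archetype).
CATEGORIES = ["grocery", "fuel", "airline", "hotel", "out_of_town_drug_store"]
PHI: Dict[str, List[float]] = {
    "homebody": [0.60, 0.35, 0.02, 0.02, 0.01],
    "traveler": [0.10, 0.10, 0.35, 0.30, 0.15],
}

def category_probability(theta: Dict[str, float], category: str) -> float:
    """p(category | theta) = sum_k theta_k * phi[k][category]."""
    idx = CATEGORIES.index(category)
    return sum(weight * PHI[k][idx] for k, weight in theta.items())

traveler_profile = {"homebody": 0.1, "traveler": 0.9}  # heavy air/hotel history
average_profile = {"homebody": 0.8, "traveler": 0.2}   # typical cardholder

p_traveler = category_probability(traveler_profile, "out_of_town_drug_store")
p_average = category_probability(average_profile, "out_of_town_drug_store")
```

Neither profile was built from any drug store purchase, yet the traveler-heavy profile assigns the new transaction a clearly higher probability than the average profile does, which is the mechanism by which a false positive can be avoided.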


Preliminary Results on Research Model
US Credit Portfolio

[Figure: Account Detection Rate (%) versus Account False-Positive Ratio. Annotations: “REL: -23.7%”, “REL: -18.6%”.]

Lift without any fraud data used to train archetypes!


• Consider a marketing analytics application:
─ Predict individual customer behavior
─ Understand collective soft clusters of customers and items
─ Optimized coupons & discounts
─ Suggest new items: cross-sell
─ Reduce attrition with targeted incentives

• Don’t segment your customers into a rigid box X or Y!

Marketing Analytics, The Next Generation

http://www.cmo.com/articles/2012/6/27/the-death-of-demographics.html

“In our research we don’t intentionally set out to disprove segmentation, but the data is telling us that theories based on old ways of thinking are often off-base.”


Archetypal clusters of grocery items

• “easy”
─ Frozen dinners
─ Cup-o-soups
─ Breakfast bars
─ …

• “salad”
─ Lettuce
─ Tomatoes
─ Croutons
─ Carrots
─ …

• “make your own pizza”
─ Mozzarella
─ Pizza dough
─ Parmesan cheese
─ Italian Salami
─ …

• “tropical fruit”
─ Mango
─ Papaya
─ Coconut
─ …

If a customer likes Mango & Coconut, they probably like Papaya


Collaborative Profiling is not just linear algebra!
A profile is not the simple average of its transactions

[Figure: bar charts over Archetypes 1 to 4. Labels: “profile @ t”, “archetype weighting of datum”, “profile @ t+1”, “Old Bayesian hypothesis”, “New Bayesian hypothesis”.]

Item alone is weighted equally across archetypes 1 and 2

Incremental update is not even across archetypes!


Items are interpreted in light of customer’s history
A profile is not the average of its transactions

If a customer buys Tofu, do we increase weight in both archetypes in proportion?

No, not necessarily!
• If prior history had high Asian allocation, then “Tofu” reinforces it in profile
• If prior history had high Vegetarian allocation, then “Tofu” reinforces different profile component

• “Asian”
─ Tofu
─ Bok Choy
─ Garlic
─ …

• “Vegetarian”
─ Tofu
─ Veggie Burger
─ Vegan Cheese
─ …
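The Tofu example follows directly from Bayes' rule. In this sketch the probability of Tofu is assumed to be the same under both archetypes (the value 0.25 is invented), so any difference in how the purchase is attributed comes only from the customer's prior profile. The name `tofu_responsibility` is hypothetical.

```python
from typing import Dict

# Hypothetical p("tofu" | archetype), assumed equal in both archetypes.
P_TOFU = {"asian": 0.25, "vegetarian": 0.25}

def tofu_responsibility(prior: Dict[str, float]) -> Dict[str, float]:
    """Posterior share of one tofu purchase per archetype:
    prior_k * p(tofu | k), normalized over archetypes."""
    joint = {k: prior[k] * P_TOFU[k] for k in prior}
    total = sum(joint.values())
    return {k: v / total for k, v in joint.items()}

asian_history = tofu_responsibility({"asian": 0.9, "vegetarian": 0.1})
vegetarian_history = tofu_responsibility({"asian": 0.1, "vegetarian": 0.9})
```

The same item is credited mostly to “Asian” for the first customer and mostly to “Vegetarian” for the second, so the resulting profile cannot be a simple average of per-item vectors.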


Conclusion

• Customer behavior summarized continuously in real-time into customer archetypes
─ Changes in archetype distribution are out of pattern

• Predicts previously unseen behaviors in collective view
─ Archetype distribution is based on the global view of transaction set
─ A new transaction may be “expected” through archetype profile, not personal history

• Strong Performance on Mature US Credit model

• Applications:
─ Transaction Fraud
─ Marketing analytics
─ Insider (conduct) threat
─ Cyber security


Thank You
Matthew Kennel
+1 858 369 8455
matthewkennel@fico.com
