Plastic Card Fraud Detection using Peer Group analysis
David Weston, Niall Adams, David Hand, Christopher Whitrow, Piotr Juszczak
29 August, 2007
EPSRC Think Crime Initiative
• EPSRC Think Crime Initiative
• Crime Prevention & Detection
• Funding 12 projects
• Also feasibility studies and more
Think Crime Project
• Develop Fraud Detection Tools
• Real Data
ThinkCrime Team
• Members of the team are
◦ David Hand
◦ Niall Adams
◦ Christopher Whitrow
◦ Piotr Juszczak
◦ David Weston
◦ Gordon Blunt
• Collaborating banks
◦ Abbey National, Alliance and Leicester, Capital One, Lloyds TSB
Overview
• Peer Group Analysis
◦ Introduction
◦ Applied to Time-Aligned Multivariate Continuous Data
◦ Applied to Credit Card Transaction Data
• Performance Evaluation
• Experiments & Results
• Conclusions & Current Work
Peer Group Analysis - Introduction
Approaches to Fraud Detection
• Broadly 2 approaches to statistical fraud detection
• Supervised or Anomaly Detection
◦ Supervised
  • Uses historical instances of fraud
  • Less likely to falsely flag a transaction as fraudulent
  • Approach Chris is taking
Anomaly Detection
• Does not use historical instances of fraud
• Build a profile of ‘usual’ behaviour
• Significant deviations considered frauds
• More likely to falsely flag a transaction as fraudulent
• Potential to adapt to changing fraud patterns
• Approach Piotr is taking
Peer Group Analysis
• Similar to anomaly detection methods
• Do not need to build a model of usual behaviour for account holder
• Determine a peer group
• Find other accounts that you expect will behave similarly to the account holder
• Find accounts that have behaved similarly in the past
• Monitor account holder’s behaviour with respect to peer group
• Anomalous behaviour, should account holder deviate strongly from peer group
Anomaly Detection to Peer Groups I
• The weekly amount spent on a credit card for a particular account
• Week 1 to Week $n$: $y_1, \ldots, y_{n-1}, y_n$
• Target Account
• Wish to determine if the amount spent in week $n$ is anomalous
Anomaly Detection based on account profile
\[ y_1 \quad y_2 \quad \cdots \quad y_{n-1} \quad y_n \]
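As a rough illustration of profile-based detection, the sketch below scores the latest week against the account's own history. The slides do not say which statistic is used; the z-score and the name `profile_score` are assumptions made only for this example.

```python
import statistics

def profile_score(history, latest):
    """Score week n against the account's own past weeks y_1..y_{n-1}.

    Assumption: 'usual' behaviour is summarised by the sample mean and
    sample standard deviation of the history (a plain z-score).
    """
    mean = statistics.mean(history)
    sd = statistics.stdev(history)
    return abs(latest - mean) / sd

history = [10.0, 12.0, 11.0, 13.0, 9.0]   # hypothetical weekly amounts
score = profile_score(history, latest=30.0)
```

A large score means week n is far from the account's own profile; how large is "too large" is a separate threshold choice.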
Anomaly Detection to Peer Groups II
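Population Normalised Anomaly Detection
\[
\begin{array}{ccccc}
x_{m,1} & x_{m,2} & \cdots & x_{m,n-1} & x_{m,n} \\
\vdots & \vdots & & \vdots & \vdots \\
x_{2,1} & x_{2,2} & \cdots & x_{2,n-1} & x_{2,n} \\
x_{1,1} & x_{1,2} & \cdots & x_{1,n-1} & x_{1,n} \\
y_1 & y_2 & \cdots & y_{n-1} & y_n
\end{array}
\]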
Anomaly Detection to Peer Groups III
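Sort accounts in order of decreasing similarity, $\pi(i)$
\[
\begin{array}{ccccc}
x_{\pi(m),1} & x_{\pi(m),2} & \cdots & x_{\pi(m),n-1} & x_{\pi(m),n} \\
\vdots & \vdots & & \vdots & \vdots \\
x_{\pi(k),1} & x_{\pi(k),2} & \cdots & x_{\pi(k),n-1} & x_{\pi(k),n} \\
\vdots & \vdots & & \vdots & \vdots \\
x_{\pi(2),1} & x_{\pi(2),2} & \cdots & x_{\pi(2),n-1} & x_{\pi(2),n} \\
x_{\pi(1),1} & x_{\pi(1),2} & \cdots & x_{\pi(1),n-1} & x_{\pi(1),n} \\
y_1 & y_2 & \cdots & y_{n-1} & y_n
\end{array}
\]
• Peer Group size $k$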
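The ordering step can be sketched as follows. This is a minimal example that assumes similarity is measured by Euclidean distance between histories; the slides do not fix the measure at this point, and `peer_group_indices` is a hypothetical name.

```python
import numpy as np

def peer_group_indices(y_hist, x_hist, k):
    """Return the indices pi(1)..pi(k) of the k most similar accounts.

    y_hist: shape (n-1,), the target's history y_1..y_{n-1}.
    x_hist: shape (m, n-1), one row per other account.
    Assumption: smaller Euclidean distance means greater similarity.
    """
    dists = np.linalg.norm(x_hist - y_hist, axis=1)
    order = np.argsort(dists, kind="stable")  # pi(1), pi(2), ..., pi(m)
    return order[:k]

y_hist = np.array([10.0, 12.0, 11.0])
x_hist = np.array([
    [50.0, 55.0, 52.0],
    [10.5, 12.5, 11.5],
    [9.0, 12.0, 11.0],
    [30.0, 31.0, 29.0],
])
peers = peer_group_indices(y_hist, x_hist, k=2)
```

Only the history up to week n-1 is used to choose the peers, so week n stays available for the anomaly test.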
Peer Groups Example
[Figure sequence: five plots illustrating a peer group example; only axis tick values survive in the source.]
Peer Group Analysis
Detecting Anomalies
• Assuming we already have a peer group set of accounts for our target account.
• $y_n$ is multivariate (column vector) and continuous
• Mahalanobis distance of the target from the mean of its peer group
• $\mu$ is the mean of $x_{\pi(1),n}, \ldots, x_{\pi(k),n}$
• $C$ is the covariance matrix of $x_{\pi(1),n}, \ldots, x_{\pi(k),n}$
• Mahalanobis distance of a target from its peer group
◦ \[ \sqrt{(y_n - \mu)^T C^{-1} (y_n - \mu)} \]
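A minimal sketch of this distance, assuming the sample covariance (divisor k-1) and a non-singular $C$; neither is stated on the slide, and the function name is hypothetical.

```python
import numpy as np

def mahalanobis_to_peer_group(y_n, peers_n):
    """Mahalanobis distance of target y_n (shape (d,)) from its peer group.

    peers_n: shape (k, d), rows are x_{pi(1),n} .. x_{pi(k),n}.
    Assumption: C is the sample covariance and is invertible.
    """
    mu = peers_n.mean(axis=0)
    C = np.atleast_2d(np.cov(peers_n, rowvar=False))
    diff = y_n - mu
    # Solve C z = diff rather than forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(C, diff)))

peers_n = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
d_centre = mahalanobis_to_peer_group(np.array([1.0, 1.0]), peers_n)
d_far = mahalanobis_to_peer_group(np.array([3.0, 1.0]), peers_n)
```

A target at the peer group mean gets distance zero; the distance grows as the target moves away, scaled by the spread of the peer group in that direction.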
• If the distance is above an externally selected threshold, then we flag the target as fraudulent.
[Figure: plot of the peer group and the target, both axes from −10 to 10; legend: Peer Group, Target.]
Robustifying Peer Groups
• Peer Group contaminated by fraudulent transactions
• Outlier Masking
• Outlier Swamping
[Figure: plot of the peer group and the target, both axes from −10 to 10; legend: Peer Group, Target.]
• Robustify the covariance matrix for the Mahalanobis Distance evaluation
• Use Heuristic
• An account that has deviated strongly from its peer group at time t should not contribute to any peer group at time t
• For each peer group select the 75% of members closest to their own peer groups
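The heuristic above might be sketched like this, assuming each account's distance from its own peer group at time t has already been computed. The names `robust_members` and `own_distance` are hypothetical, and rounding the 75% down is a choice made for this example.

```python
import numpy as np

def robust_members(peer_ids, own_distance, keep_fraction=0.75):
    """Keep the peer group members that sit closest to their own peer groups.

    peer_ids: indices of the target's peer group members.
    own_distance: own_distance[j] is account j's distance from its own
        peer group at time t (for example a Mahalanobis distance).
    """
    peer_ids = np.asarray(peer_ids)
    n_keep = max(1, int(np.floor(keep_fraction * len(peer_ids))))
    order = np.argsort(own_distance[peer_ids], kind="stable")
    return peer_ids[order[:n_keep]]

own_distance = np.array([0.5, 9.0, 1.2, 0.8, 0.3])
kept = robust_members([0, 1, 2, 3], own_distance)
```

Account 1, which deviates strongly from its own peers, is dropped before the mean and covariance of the target's peer group are estimated.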
Peer Group Quality
It is not necessarily the case that peer group analysis can be successfully deployed on all accounts.
\[ q_t = \frac{1}{k} \sum_{i=1}^{k} (y_t - x_{\pi(i),t})^T (y_t - x_{\pi(i),t}) \tag{1} \]
where $T$ is the transpose. This is a simple measure of how close the members of the peer group are to the target.
• A good quality peer group is one that closely follows the target over time.
\[ Q_{s,e} = \frac{1}{t_e - t_s} \sum_{t=t_s}^{t_e} q_t. \tag{2} \]
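Equations (1) and (2) translate fairly directly into code. The sketch below assumes time is indexed from 0 and that the data are held in arrays with the shapes given in the comments; the divisor in `Q` follows the slide as written.

```python
import numpy as np

def q_t(y_t, peers_t):
    """Equation (1): mean squared Euclidean distance from target to peers.

    y_t: shape (d,); peers_t: shape (k, d).
    """
    diff = peers_t - y_t
    return float(np.mean(np.sum(diff * diff, axis=1)))

def Q(y, peers, ts, te):
    """Equation (2): sum of q_t for t = ts..te, divided by (te - ts).

    y: shape (T, d); peers: shape (k, T, d).
    """
    total = sum(q_t(y[t], peers[:, t, :]) for t in range(ts, te + 1))
    return total / (te - ts)

y = np.zeros((3, 2))          # hypothetical target, 3 time steps, 2 features
peers = np.ones((2, 3, 2))    # two peers, each at squared distance 2
quality = Q(y, peers, ts=0, te=2)
```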
Whitening the Population
• Whitening the population to make the scatter of a peer group (of size 2) commensurate across time
• The smaller the value of $Q_{s,e}$, the better the peer group tracks the target over time.
[Figure: population, peer group members and target shown at t=1, t=2 and t=3.]
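One common way to whiten a population at a single time step is shown below. The slides do not say which whitening transform was used; this sketch assumes the symmetric inverse square root of the sample covariance and a full-rank covariance matrix.

```python
import numpy as np

def whiten(population_t):
    """Map rows (accounts) so the population has zero mean, identity covariance."""
    centred = population_t - population_t.mean(axis=0)
    cov = np.atleast_2d(np.cov(centred, rowvar=False))
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Symmetric inverse square root of the covariance matrix.
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return centred @ w

rng = np.random.default_rng(0)
mix = np.array([[2.0, 0.0, 0.0], [1.0, 3.0, 0.0], [0.0, 0.0, 0.5]])
population = rng.normal(size=(200, 3)) @ mix   # hypothetical correlated features
white = whiten(population)
```

After whitening each time step separately, squared distances such as $q_t$ are measured in the same units at every t, which is what makes them comparable across time.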
Building Peer Groups
• Possible to know a priori the peer group membership
• Employee fraud detection: people with the same job description can be naturally grouped together.
• IBM FAMS. Health care fraud. Geography, speciality
• Infer peer group membership from the time series itself
• Measuring similarity of time series
Applying Peer Group Analysis
Time Alignment & Feature Extraction
• Accounts’ transactions are asynchronous data streams
• Synchronise account time series by extracting features from the data streams at regular time intervals
• $M(s, e, A)$ summarises the transactions of account $A$ occurring from day $s$ to day $e$ inclusive
◦ Mean amount spent
◦ Number of transactions
◦ Entropy of Merchant Category Groups
  • 16 Groups + 1 for ATMs
• Returns 1 point in 3-dimensional space
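A minimal sketch of the summary $M(s, e, A)$ follows. The transaction tuple layout, the natural-log entropy and the handling of an empty window are assumptions for illustration; the slides specify only the three features.

```python
import math
from collections import Counter

def summarise(transactions, s, e):
    """M(s, e, A): summarise one account's transactions from day s to day e inclusive.

    transactions: list of (day, amount, merchant_category_group) tuples.
    Returns (mean amount, number of transactions, entropy of category groups).
    """
    window = [(amt, grp) for day, amt, grp in transactions if s <= day <= e]
    n = len(window)
    if n == 0:
        return (0.0, 0, 0.0)  # assumption: an empty window maps to the origin
    mean_amount = sum(amt for amt, _ in window) / n
    counts = Counter(grp for _, grp in window)
    # Shannon entropy (natural log) of the category-group frequencies.
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return (mean_amount, n, entropy)

# Hypothetical account with four transactions; the last falls outside the window.
account_a = [(7, 20.0, "grocery"), (8, 40.0, "ATM"), (9, 30.0, "grocery"), (12, 99.0, "travel")]
features = summarise(account_a, 7, 10)
```

Because every account is summarised over the same day range, the resulting points are time-aligned even though the underlying transactions are not.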
[Figure: Amount Withdrawn against Day (0 to 10) for Account A and Account B, with the summary windows $M(7,10,A)$ and $M(7,10,B)$ marked.]
Outlier Detection from Peer Groups
• Once a day at midnight
• Summary statistic for day $t$, behaviour of the past $d$ days: $M(t - d + 1, t, A)$
• The smaller $d$, the more sensitive to new transactions
• Mahalanobis distance in 3-dimensional space
Active and Inactive Accounts
• Account inactive on day t if it has not performed any transactions on that day
• Do not test for outlierness for inactive accounts
• Unusually long periods of inactivity will not be considered fraudulent
• Account not active over entire summary statistic window
• Active peer group members. Closest k accounts that are active on at least one day of the summary statistic window
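Selecting the active peer group members can be sketched as a filter over the similarity ordering. The data layout (`ranked_accounts`, `active_days`) is hypothetical and chosen only to keep the example short.

```python
def active_peer_group(ranked_accounts, active_days, window_start, window_end, k):
    """Pick the first k accounts, in similarity order, that are active in the window.

    ranked_accounts: account ids sorted by decreasing similarity to the target.
    active_days: dict mapping account id -> set of days with a transaction.
    """
    window = set(range(window_start, window_end + 1))
    chosen = []
    for acc in ranked_accounts:
        if active_days.get(acc, set()) & window:
            chosen.append(acc)
            if len(chosen) == k:
                break
    return chosen

ranked = ["a", "b", "c", "d"]
days = {"a": {1, 2}, "b": {9}, "c": {15}, "d": {8, 20}}
group = active_peer_group(ranked, days, window_start=8, window_end=14, k=2)
```

Here accounts "a" and "c" are skipped because they have no transaction between days 8 and 14, so the active peer group is drawn from further down the ordering.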
Building Peer Groups
• Subdivide training data into $n$ non-overlapping windows
◦ $M(1, \frac{L}{n}, A), \ldots, M((n-1)\frac{L}{n} + 1, L, A)$
• Point in $3n$-dimensional space
• Complication, potential for bias
• Standardise each window by whitening
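The window boundaries in the formula above can be generated as follows. This sketch assumes $L$ is the length of the training period in days and that $n$ divides $L$ exactly; both are assumptions, since the slide does not define $L$.

```python
def training_windows(L, n):
    """Return the n non-overlapping (start, end) day ranges covering days 1..L."""
    if L % n != 0:
        raise ValueError("this sketch assumes n divides L exactly")
    width = L // n
    return [(i * width + 1, (i + 1) * width) for i in range(n)]

windows = training_windows(L=12, n=3)
```

Summarising each window with three features and concatenating the results gives the point in 3n-dimensional space used to compare accounts.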
Building Peer Groups
[Figure: Amount Withdrawn over days 0 to 10 for Account A and Account B, split into three windows: $M(1, 3\tfrac{1}{3}, \cdot)$, $M(3\tfrac{1}{3}, 6\tfrac{2}{3}, \cdot)$ and $M(6\tfrac{2}{3}, 10, \cdot)$.]
• Find k nearest neighbours
• Large number of accounts
• Accounts that have high volume of transactions unlikely to be tracked by accounts with low volume
• First sort by number of transactions in training data
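One possible reading of the last bullet is that the nearest-neighbour search is restricted to accounts with a similar transaction count. The sketch below illustrates that reading only; the slides do not spell out the procedure, and `candidate_window` and `half_width` are hypothetical.

```python
import numpy as np

def candidate_window(n_transactions, target, half_width):
    """Accounts adjacent to the target after sorting by transaction count.

    Assumption: only these candidates are then searched for the k nearest
    neighbours, which keeps the search small for large populations.
    """
    order = np.argsort(n_transactions, kind="stable")
    pos = int(np.where(order == target)[0][0])
    lo = max(0, pos - half_width)
    hi = min(len(order), pos + half_width + 1)
    return [int(j) for j in order[lo:hi] if j != target]

counts = np.array([100, 5, 90, 110, 7, 95])   # hypothetical transaction counts
candidates = candidate_window(counts, target=0, half_width=2)
```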
Performance Evaluation
Performance Criteria
• Reduce total amount lost to fraud
• Reduce number of fraudulent transactions
• Reduce the time between fraud starting and fraud detection
• Reduce the number of account holders affected by flagging legitimate transactions as fraud
• Number of possible performance metrics
Performance Metric
• If an account has been flagged as containing fraudulent transactions, the card issuer would need to investigate this account.
• Minimise the amount of fraud given the number of investigations the card company can make
Performance Curve
• x-axis: number of fraudulent accounts missed as a proportion of the number of fraudulent accounts
• y-axis: number of fraud flags raised as a proportion of the number of accounts
• Different from a ROC curve. The smaller the area under the curve, the better the performance.
• Random classification is represented by a diagonal line from the top left to the bottom right.
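The curve described above can be traced by flagging accounts in order of decreasing anomaly score. This is a sketch under the assumptions that higher scores are more suspicious, that at least one account is fraudulent, and that ties are broken arbitrarily.

```python
def performance_curve(scores, is_fraud):
    """Return (missed, flagged) points as the number of flags goes from 0 to all.

    scores: anomaly score per account (higher = more suspicious).
    is_fraud: matching list of booleans.
    missed: frauds not found / number of frauds (x-axis).
    flagged: flags raised / number of accounts (y-axis).
    """
    n = len(scores)
    n_fraud = sum(is_fraud)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    missed, flagged = [1.0], [0.0]   # no flags raised: every fraud is missed
    found = 0
    for j, i in enumerate(ranked, start=1):
        found += int(is_fraud[i])
        missed.append((n_fraud - found) / n_fraud)
        flagged.append(j / n)
    return missed, flagged

scores = [0.9, 0.1, 0.8, 0.3]
is_fraud = [True, False, False, True]
missed, flagged = performance_curve(scores, is_fraud)
```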
[Figure: performance curve. x-axis: Proportion of Frauds not found; y-axis: Number of Fraud Flags Raised per Day as a Proportion of the Population.]
• The lower the curve, the better the performance.
• Twice the area under the curve, in [0, 1]: the smaller the area, the better the performance
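The area summary can be computed with the trapezoidal rule. The sketch assumes the curve is given as points ordered from (missed = 1, flagged = 0) to (missed = 0, flagged = 1); the function name is hypothetical.

```python
def twice_area_under_curve(missed, flagged):
    """Twice the trapezoidal area under flagged (y) as a function of missed (x).

    Points must be ordered along the curve, so missed never increases.
    """
    area = 0.0
    for i in range(len(missed) - 1):
        width = missed[i] - missed[i + 1]   # x decreases along the curve
        area += width * (flagged[i] + flagged[i + 1]) / 2.0
    return 2.0 * area

# The diagonal from top left to bottom right (random classification).
random_line = twice_area_under_curve([1.0, 0.0], [0.0, 1.0])
# A hypothetical detector that does somewhat better than random.
example = twice_area_under_curve([1.0, 0.5, 0.5, 0.0, 0.0],
                                 [0.0, 0.25, 0.5, 0.75, 1.0])
```

Doubling the area puts random classification at 1, so values below 1 indicate a detector that beats random flagging.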
Average Performance Curve
• Produce one curve for each day
• Take the average of the curves.
• For a given proportion of fraud flags raised
[Figure: average performance curve. x-axis: Proportion of Frauds not found; y-axis: Number of Fraud Flags Raised per Day as a Proportion of the Population.]
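Averaging the daily curves at a given proportion of flags raised might look like the sketch below. Interpolating each day's curve onto a shared grid is an assumption made here so that days with different numbers of tested accounts can be combined; the slides do not describe this detail.

```python
import numpy as np

def average_curve(daily_curves, grid):
    """Average the proportion of frauds missed at each flag proportion in grid.

    daily_curves: list of (missed, flagged) pairs with flagged increasing.
    grid: flag proportions at which to evaluate and average the curves.
    """
    rows = [np.interp(grid, flagged, missed) for missed, flagged in daily_curves]
    return np.mean(rows, axis=0)

grid = np.linspace(0.0, 1.0, 5)
day1 = ([1.0, 0.5, 0.5, 0.0, 0.0], [0.0, 0.25, 0.5, 0.75, 1.0])
day2 = ([1.0, 1.0, 0.5, 0.5, 0.0], [0.0, 0.25, 0.5, 0.75, 1.0])
avg_missed = average_curve([day1, day2], grid)
```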
Experiments & Results
Experiments
Data
• 4 months of data
• Accounts with > 80 transactions and fraud free for first 3 months.
• About 4000 accounts, 6% defrauded in final month
• Performed Peer Group Analysis once a day for the remaining month
Parameters
• Peer Group building 8 segments
• Summary Statistic window size 7 days
• Active Peer Group Size 100
• Robustifying Peer Groups not used
Varying Length of Summary Statistic Window
[Figure sequence: performance curves for summary statistic windows of 1, 3, 5, 7 and 14 days. x-axis: Proportion of Frauds not Found; y-axis: Number of Fraud Flags Raised per Day as a Proportion of the Population.]
Global Outlier Detector
• Is peer group analysis doing nothing more than finding outliers to the population?
• Special case, use largest possible peer group
• All accounts apart from target account
• Subtract Performance Curve for Peer Group from Global.
• Values less than zero imply Peer Group method is performing better.
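The global detector described above can be sketched as a leave-one-out Mahalanobis score, where each account's peer group is every other account. The sample covariance and the function name are assumptions for this example.

```python
import numpy as np

def global_outlier_scores(features):
    """Score each account against all other accounts (peer group = everyone else).

    features: shape (m, d), one row of summary statistics per account.
    """
    n = features.shape[0]
    scores = np.empty(n)
    for i in range(n):
        others = np.delete(features, i, axis=0)
        mu = others.mean(axis=0)
        C = np.atleast_2d(np.cov(others, rowvar=False))
        diff = features[i] - mu
        scores[i] = np.sqrt(diff @ np.linalg.solve(C, diff))
    return scores

features = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [10.0, 10.0]])
scores = global_outlier_scores(features)
```

Comparing this detector's performance curve with the peer group curve shows whether peer groups find frauds that are not simply outliers to the whole population.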
Peer Groups Performance
[Figure sequence: performance curves for Non Robust, Non Robust without Fraud Contamination, Robust and Global. x-axis: Proportion of Frauds Not Found; y-axis: Number of Fraud Flags Raised per Day as a Proportion of the Population.]
Peer Groups Versus Global Outlier Detector
Performance of the peer group analysis compared with global population outlier detector.
[Figure: x-axis: Number of Fraud Flags Raised per Day as a Proportion of the Population; y-axis: Performance Difference (−0.1 to 0.1); series: Robustified Peer Group, Peer Group.]
Peer Groups Versus Global Outlier Detector
Performance of the robustified peer group analysis compared with global population outlier detector on screened data.
[Figure: x-axis: Number of Fraud Flags Raised per Day as a Proportion of the Population; y-axis: Performance Difference (−0.1 to 0.1).]
Conclusions & Current Work
Conclusions
• We have demonstrated there exist credit card transaction accounts that evolve sufficiently closely to enable fraudulent behaviour to be detected.
• Finding frauds that are not global outliers to the population.
Current work
• Combining Methods
1 Day Symposium, 23rd November 2007
Statistical and machine learning approaches to detecting fraud and predicting consumer behaviour
• Competing Risks in Retail Finance, Crowder MJ
• Event History Analysis for Debt Collection Portfolios, Zhou F, Hand DJ, Heard NA
• A dynamic scorecard for monitoring baseline performance with application to tracking a mortgage portfolio, Whittaker J, Whitehead C, Somers M
• Estimating the iceberg: how much fraud is there in the UK? Blunt G, Hand DJ
• Evaluating Fraud Detection Systems, Hand DJ
• Transaction Aggregation: A Winning Strategy vs. Fraud? Whitrow C, Weston D, Juszczak P, Hand DJ, Adams N
• Detecting Plastic Card Fraud using Peer Group Analysis, Weston D, Whitrow C, Juszczak P, Hand DJ, Adams N
• Behavioural finance as a multi-instance learning problem, Juszczak P, Hand DJ