Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Activity Monitoring: Noticing Interesting Changes in Behavior
ICML-2001Williams College
Foster ProvostFoster ProvostNew York UniversityNew York University
2
Fraud detection
Date and Time Day Duration From To Fraud1/01/95 10:05:01 Mon 13 mins Brooklyn, NY Stamford, CT1/05/95 14:53:27 Fri 5 mins Brooklyn, NY Greenwich, CT1/08/95 09:42:01 Mon 3 mins Bronx, NY White Plains, NY1/08/95 15:01:24 Mon 9 mins Brooklyn, NY Brooklyn, NY1/09/95 15:06:09 Tue 5 mins Manhattan, NY Stamford, CT1/09/95 16:28:50 Tue 53 sec Brooklyn, NY Brooklyn, NY1/10/95 01:45:36 Wed 35 sec Boston, MA Chelsea, MA YES1/10/95 01:46:29 Wed 34 sec Boston, MA Yonkers, NY YES1/10/95 01:50:54 Wed 39 sec Boston, MA Chelsea, MA YES1/10/95 11:23:28 Wed 24 sec White Plains,NY Congers, NY1/11/95 22:00:28 Thu 37 sec Boston, MA EastBoston, MA YES1/11/95 22:04:01 Thu 37 sec Boston, MA EastBoston, MA YES
Can we identify which accounts have been compromised so as to take corrective action?
A typical defrauded account
3
Profiling customer behavior
Data Mining
CallData
Profilerconstruction
Neural network
DC-1 detector
Fraud indicators
ProfilingTemplates
PnP2P1BehaviorProfilers}
Example profiles:• Customer makes an average of 2 ± 2 calls at night• Customer makes 12 ± 5 calls during business hours• Customer makes 0 ± 0 calls from the South Bronx at night
4
5
Activity Monitoring
6
Activity Monitoring tasks
� Network performance monitoring� Fault monitoring� Crisis monitoring� Financial market alerting� Intelligent transaction monitoring
� (e.g., for insider trading)� Automated trading systems (e.g., for hedge funds)� News alerting� Intrusion detection� Fraud detection
� telecom, credit, insurance, etc.� note: fraud adapts to detection methods, so learning
systems are essential
7
Predicting customer problems from performance monitoring data
01000020000300004000050000600007000080000
Day
-7
Day
-6
Day
-5
Day
-4
Day
-3
Day
-2
Yes
ter
day
Tod
ay
ESLIaz
SESLIaz
UASLIaz
DGRMLIaz
ESPIaz
SESPIaz
UASPIaz
DGRMPIaz
01000020000300004000050000600007000080000
Day
-7
Day
-6
Day
-5
Day
-4
Day
-3
Day
-2
Yes
ter
day
Tod
ay
ESLIaz
SESLIaz
UASLIaz
DGRMLIaz
ESPIaz
SESPIaz
UASPIaz
DGRMPIaz
CAC: SWA8AW7
CAC: SWA2GK3
For which CAC’s is the likelihood of an impending trouble report highest?
PerformanceMonitoring Data
PerformanceMonitoring Data
troublereport?
troublereport?
8
Issues
u Multi-channel, multiple-type, time-sequence data– numerical, categorical, textual, feature-vectors, etc.
u Various possible granularities of data– different methods apply– coarser-grained summaries– comparative evaluation difficult
u Multiple alarm (don’t have equal value)– approximate formulations ignore this
u Timeliness of alarms is critical– delaying can be costly
9
Benefit of a hit?
s(τ,α,H,D)
Cost of a false alarm?
f(α,H,D)
Note: should normalize false alarms (per unit time)
Activity Monitoring
10
News Alerting Intrusion Detection
11
Different methods for fraud detection
0
2000
4000
6000
8000
10000
12000
14000
16000
18000
20000
Alarm onNone
Collisions &Velocities
High Usage BestIndividual
SOTA DC-1Detector
Detectors
Cos
t ($)
74
76
78
80
82
84
86
88
90
92
94
Acc
urac
y (%
)
Cost
Accuracy
Activity Monitoring: Noticing Interesting Changes in Behavior
ICML-2001Williams College
Foster ProvostFoster ProvostNew York UniversityNew York University
The EndThe End