Shebuti & Leman
Stony Brook University
Shebuti Rayana, Leman Akoglu
Network intrusion
Healthcare fraud
Credit card fraud
Tax evasion
Event Detection & Characterization in Dynamic Graphs
& Many More…
Problem: Given a sequence of graphs,
Q1. Event detection: find the time points at which the graph changes significantly
Q2. Characterization: find the (top k) nodes / edges / regions that change the most
Main framework
Compute graph similarity/distance scores
Find unusual occurrences in time series
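The two steps of the framework can be sketched as follows. This is only an illustration under stated assumptions: the slides do not name a specific distance or outlier rule at this point, so Jaccard distance between edge sets and a simple z-score threshold are stand-ins, and the function names `jaccard_distance`, `distance_series` and `flag_events` are hypothetical.

```python
from statistics import mean, pstdev

def jaccard_distance(edges_a, edges_b):
    """1 - Jaccard similarity of two edge sets (illustrative distance choice)."""
    union = edges_a | edges_b
    if not union:
        return 0.0
    return 1.0 - len(edges_a & edges_b) / len(union)

def distance_series(snapshots):
    """Step 1: one distance score per pair of consecutive graph snapshots."""
    return [jaccard_distance(a, b) for a, b in zip(snapshots, snapshots[1:])]

def flag_events(series, k=2.0):
    """Step 2: flag time ticks whose score is more than k std devs above the mean.

    series[i] compares snapshot i with snapshot i + 1, so the flagged tick is i + 1.
    """
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return []
    return [i + 1 for i, d in enumerate(series) if (d - mu) / sigma > k]

# Toy sequence: six identical snapshots, then a different edge set from tick 6 on.
stable = {(1, 2), (2, 3), (3, 4)}
changed = {(5, 6), (6, 7)}
snapshots = [stable] * 6 + [changed] * 6
events = flag_events(distance_series(snapshots))
```

In this toy sequence only the transition into tick 6 has a non-zero distance, so it is the only tick flagged.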
[Figure: sequence of graph snapshots along a time axis]
Flow of Ensemble Approach Event Detection in Dynamic Graphs
Ensemble Algorithms
• Eigen Behavior based Event Detection (EBED)
• Probabilistic Approach (PTSAD)
• SPIRIT
Consensus Method
• Rank based
• Score based
Results
• Dataset 1: Challenge Network flow Data
• Dataset 2: New York Times News Corpus
Event Detection
Consensus Rank Merging
• Rank based: Inverse Rank, Kemeny Young
• Score based: Unification (avg, max), Mixture Model (avg, max)
• Final Ensemble (Inverse Rank)
Characterization
Numerous algorithms for event detection
Hard to decide which one will work well for a specific data set
Our goal: design an ensemble approach that may not give the best result on every data set but is “better” than most base algorithms
Challenges:
Different scores/scales
Different merging approaches
Extract “typical behavior” (eigen-behavior) of nodes/edges
eigen-behavior ≡ principal eigen-vector
Compare eigen-behavior over time
Score each time tick by the amount of change in behavior from the previous time tick.
Mark the ones with high score as anomalous.
[Figure: T × N matrix of node time series; feature: degree]
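[Figure: data arranged as nodes × egonet features × time. For one feature (degree), the T × N matrix is scanned with windows of size W. Labels: past pattern, eigen-behavior at t, eigen-behaviors, right singular vector.]
Change-score metric: $Z = 1 - u^{T} r$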
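A minimal sketch of the change score Z = 1 - uᵀr, under assumptions the slides do not spell out. Here u is taken to be the principal eigenvector of the node-by-node correlation matrix over the current window, and r the top right singular vector of the stacked eigen-behaviors from the preceding ticks. The names `eigen_behavior` and `change_scores`, the `history` parameter, and the absolute value (used because eigenvectors are only defined up to sign) are illustrative choices, not the authors' implementation.

```python
import numpy as np

def eigen_behavior(window):
    """Principal eigenvector of the N x N correlation matrix of a (W x N) window."""
    centered = window - window.mean(axis=0)
    norms = np.linalg.norm(centered, axis=0)
    norms[norms == 0] = 1.0            # constant series: avoid division by zero
    unit = centered / norms
    corr = unit.T @ unit               # symmetric correlation matrix
    _, vecs = np.linalg.eigh(corr)     # eigenvalues returned in ascending order
    return vecs[:, -1]

def change_scores(X, W=8, history=3):
    """Z(t) = 1 - |u(t) . r(t-1)| for a (T x N) feature matrix X; NaN where undefined."""
    X = np.asarray(X, dtype=float)
    T = X.shape[0]
    scores = np.full(T, np.nan)
    behaviors = {}
    for t in range(W - 1, T):
        u = eigen_behavior(X[t - W + 1 : t + 1])
        behaviors[t] = u
        past = [behaviors[s] for s in range(t - history, t) if s in behaviors]
        if len(past) == history:
            # Past pattern r: top right singular vector of the stacked eigen-behaviors.
            _, _, vt = np.linalg.svd(np.vstack(past), full_matrices=False)
            scores[t] = 1.0 - abs(float(u @ vt[0]))
    return scores

# Toy data: six nodes move together until tick 30, then three of them flip sign.
signal = np.sin(0.7 * np.arange(60))
X = np.tile(signal[:, None], (1, 6))
X[30:, 3:] *= -1
scores = change_scores(X)
peak = int(np.nanargmax(scores))
```

On this toy matrix the score stays near zero while all nodes are correlated and peaks within a few ticks of the change, once the window contains enough post-change samples for the principal eigenvector to switch.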
Model individual node/edge time series with count distributions:
• Poisson
• Zero-inflated Poisson
• Hurdle Process
▪ Hurdle Component: Bernoulli & Markov Chain
▪ Count Component: Zero-truncated Poisson
Model Selection: AIC, log likelihood, Vuong’s test and log gain
Find the single-sided p-value as the probability of observing a count at least as extreme as v, P(X ≥ v)
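For the plain Poisson case, the single-sided p-value P(X ≥ v) can be computed as in the sketch below. It assumes the rate is fitted by the sample mean of a node's past counts (the Poisson maximum-likelihood estimate). The zero-inflated and hurdle models and the model-selection step are not shown, and the function names `poisson_upper_tail` and `p_value` are hypothetical.

```python
import math

def poisson_upper_tail(v, lam):
    """P(X >= v) for X ~ Poisson(lam), via 1 - sum_{k < v} P(X = k)."""
    if v <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for k in range(1, v):
        term *= lam / k        # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def p_value(history, v):
    """Fit a Poisson rate to past counts and score a new count v."""
    lam = sum(history) / len(history)   # maximum-likelihood estimate of the rate
    return poisson_upper_tail(v, lam)

# A node that usually has about one edge per tick suddenly has six.
p = p_value([1, 0, 2, 1, 1, 0, 2, 1], 6)
```

A small p-value means the observed count is unlikely under the node's fitted model, so the node is a candidate anomaly at that tick.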
Streaming Pattern dIscoveRy in multIple Time-series (SPIRIT) [Papadimitriou et al. 2005]
Discovers trends: whenever a trend changes, it introduces a new hidden variable, and it removes hidden variables that are no longer needed
Detects anomalous points in trends
Node weights change at each step
At a change point, the node with the highest weight is the most anomalous
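The weight update can be illustrated with a deliberately simplified sketch that tracks a single hidden variable. It follows the incremental update published for SPIRIT (project, accumulate energy, correct the weights by the reconstruction error), but leaves out the energy-based rule that adds or removes hidden variables. The names `track_weights` and `top_node`, and the values of the forgetting factor `lam` and initial energy `d0`, are assumptions.

```python
import math

def track_weights(stream, lam=0.96, d0=1e-3):
    """Track participation weights of one hidden variable over a stream of vectors."""
    n = len(stream[0])
    w = [1.0] + [0.0] * (n - 1)        # start from a unit vector
    d = d0                             # small positive initial energy
    projections = []
    for x in stream:
        y = sum(wi * xi for wi, xi in zip(w, x))        # projection (hidden variable)
        d = lam * d + y * y                             # exponentially weighted energy
        err = [xi - y * wi for xi, wi in zip(x, w)]     # reconstruction error
        w = [wi + (y / d) * ei for wi, ei in zip(w, err)]
        projections.append(y)
    return w, projections

def top_node(w):
    """Index of the node with the largest absolute weight."""
    return max(range(len(w)), key=lambda j: abs(w[j]))

# Toy stream: every tick is a scaled copy of one pattern dominated by node 2.
pattern = [0.2, 0.1, 0.9, 0.3]
stream = [[(1 + 0.5 * math.sin(t)) * p for p in pattern] for t in range(50)]
w, projections = track_weights(stream)
most_anomalous = top_node(w)
```

The learned weights align with the dominant pattern in the stream, so reading off the largest weight at a change point identifies the node that contributes most to the trend.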
Rank based
• Inverse Rank
• Kemeny Young [J. Kemeny 1959]
Score based
• Unification [Zimek et al. 2011]: avg & max
• Mixture Model [Jing et al. 2006]: avg & max
Final Ensemble: inverse rank
[Figure: RankList1/ScoreList1, RankList2/ScoreList2 and RankList3/ScoreList3 feed into Consensus, which outputs FinalRankList]
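A rough sketch of two of the merging styles named above. Inverse-rank merging averages 1/rank across the base lists. For the score-based route, the sketch assumes the Gaussian-scaling variant of score unification (scores mapped to [0, 1] through the error function of their z-score) followed by avg or max. The slides do not state which unification variant was used, so treat that choice, and the function names, as assumptions. Kemeny Young and the mixture model are not shown.

```python
import math
from statistics import mean, pstdev

def inverse_rank_merge(rank_lists):
    """rank_lists: dicts {item: rank}, rank 1 = most anomalous. Returns merged order."""
    items = set().union(*rank_lists)
    fused = {
        it: sum(1.0 / r[it] for r in rank_lists if it in r) / len(rank_lists)
        for it in items
    }
    return sorted(items, key=lambda it: (-fused[it], it))

def unify(scores):
    """Map raw scores to [0, 1] via Gaussian scaling (assumed unification variant)."""
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {k: 0.0 for k in scores}
    return {
        k: max(0.0, math.erf((s - mu) / (sigma * math.sqrt(2))))
        for k, s in scores.items()
    }

def score_merge(score_lists, how="avg"):
    """Unify each score list, then combine per item with avg or max."""
    unified = [unify(s) for s in score_lists]
    items = set().union(*unified)
    combine = max if how == "max" else mean
    fused = {it: combine([u.get(it, 0.0) for u in unified]) for it in items}
    return sorted(items, key=lambda it: (-fused[it], it))

# Hypothetical outputs of three detectors over time ticks "a", "b", "c".
ranks = [{"a": 1, "b": 2, "c": 3}, {"a": 1, "c": 2, "b": 3}, {"b": 1, "a": 2, "c": 3}]
scores = [{"a": 9.0, "b": 4.0, "c": 1.0}, {"a": 0.9, "b": 0.8, "c": 0.1}]
final_by_rank = inverse_rank_merge(ranks)
final_by_avg = score_merge(scores, "avg")
final_by_max = score_merge(scores, "max")
```

Rank-based merging ignores score magnitudes, so it sidesteps the different scales of the base detectors. Score-based merging keeps the magnitudes but needs the unification step first to make them comparable.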
We were given a “Cyber Challenge Network” from NGAS R&T Space Park
Simulated cyber network traffic
10 days of activity
125 hosts
To-from information with timestamps
Find “suspicious” events and the entities associated with the corresponding events in Challenge Network.
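To turn to-from records with timestamps into a per-tick degree feature, one option is to bucket records into fixed-width ticks and count distinct neighbours per host, as sketched below. The record layout `(timestamp_seconds, src, dst)`, the function name `degree_matrix` and the treatment of edges as undirected are assumptions. The 600-second tick matches the 10-minute sample rate reported in the results.

```python
from collections import defaultdict

def degree_matrix(records, tick_seconds=600):
    """Build a T x N matrix of per-tick degrees from (timestamp, src, dst) records.

    Degree here is the number of distinct hosts a host talks to within a tick,
    with edges treated as undirected (an illustrative choice).
    """
    records = list(records)
    start = min(ts for ts, _, _ in records)
    neighbours = defaultdict(set)      # (tick, host) -> set of neighbour hosts
    hosts = set()
    last_tick = 0
    for ts, src, dst in records:
        tick = int((ts - start) // tick_seconds)
        neighbours[(tick, src)].add(dst)
        neighbours[(tick, dst)].add(src)
        hosts.update((src, dst))
        last_tick = max(last_tick, tick)
    hosts = sorted(hosts)
    matrix = [
        [len(neighbours.get((t, h), ())) for h in hosts]
        for t in range(last_tick + 1)
    ]
    return hosts, matrix

# Hypothetical traffic records: three in the first 10 minutes, one in the next.
records = [(0, "a", "b"), (30, "a", "c"), (59, "a", "b"), (700, "b", "c")]
hosts, matrix = degree_matrix(records)
```

Each row of the resulting matrix is one time tick and each column one host, which is the T × N layout the base detectors consume.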
[Figure: anomaly score per time tick on the Challenge Network (feature: degree). Panels: Eigen-behaviors (Z-score), Probabilistic Approach (1 – norm. (sum p-value)), SPIRIT (projection). X-axis: time tick.]
[Figure: node-level scores at time tick 376. Panels: Eigen-behaviors (relative activity change), Probabilistic Approach (normal. |log(p)|), SPIRIT (projection weight). X-axis: nodes.]
Algorithm                            Sample rate (10 min)
Base Algorithms
  EBED                               0.8333
  PTSAD                              0.5722
  SPIRIT                             0.7292
Consensus Rank Merging Algorithms
  Inverse Rank (1/R)                 1.0000
  Kemeny Young                       0.8095
  Unification (avg)                  0.8056
  Unification (max)                  0.7255
  Mixture model (avg)                0.1684
  Mixture model (max)                0.1684
Final Ensemble (1/R)                 0.8667
Average Precision Table (Feature: Degree)
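For reference, average precision over a ranked list of time ticks can be computed as below: precision is evaluated at the position of each true event and averaged over the true events. The function name `average_precision` and the example ticks are hypothetical, and the sketch does not reproduce the evaluation script behind the table.

```python
def average_precision(ranked, relevant):
    """Mean of precision@k taken at the rank k of each relevant item."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant)

# Hypothetical ranking in which the three true events sit at ranks 1, 2 and 5.
ranked_ticks = ["e1", "e2", "x1", "x2", "e3", "x3"]
true_events = {"e1", "e2", "e3"}
ap = average_precision(ranked_ticks, true_events)   # (1/1 + 2/2 + 3/5) / 3
```

A value of 1.0 means every true event is ranked above every non-event.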
Algorithm                            Event at 376    Event at 1126
Base Algorithms
  EBED                               1.0000          1.0000
  PTSAD                              1.0000          0.2500
  SPIRIT                             0.3026          0.0213
Consensus Rank Merging Algorithms
  Inverse Rank (1/R)                 1.0000          0.5000
  Kemeny Young                       1.0000          0.2000
  Unification (avg)                  1.0000          1.0000
  Unification (max)                  0.8333          1.0000
  Mixture model (avg)                1.0000          1.0000
  Mixture model (max)                1.0000          1.0000
Final Ensemble (1/R)                 1.0000          1.0000
Average Precision Table for Node Anomalies (Feature: Degree, sample rate 10 min)
~8 years (Jan 2000 - July 2007) of articles published in the New York Times
Graph links: co-mention of named entities (people, places, organizations)
Sample rate: 1 week
No ground truth
Big Events detected:
January 2001 – George W. Bush inaugurated as US president
September 11, 2001 – Terrorist attack on the WTC
February 1, 2003 – Space Shuttle Columbia Disaster
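[Figure: anomaly scores over time on the New York Times corpus (feature: weighted degree). Panels: Eigen-behaviors (Z score), Probabilistic Approach (1 – norm. (sum p-value)), SPIRIT (projection). Annotated events: 2001 election, 9/11 WTC attack, Columbia disaster.]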
Heterogeneous detectors
different scores
different effectiveness (depending on dataset)
Ensemble for event detection on dynamic graphs
Multiple consensus (merging) approaches
two-phase consensus finding
Near-future: robust consensus by automatically selecting effective base algorithms
Challenge: no ground truth
Near-future: real-time detection
Event detection under diverse data sources (e.g., news media, social media, the Web, …)
Challenges: different entity types, different time granularity, entity resolution
Judge a man by his questions rather than his answers. -Voltaire
http://www.cs.stonybrook.edu/~datalab/