HawkEye: A Real-Time Anomaly Detection System Satnam Singh, PhD

Page 1: HawkEye: A Real-Time Anomaly Detection System

HawkEye: A Real-Time Anomaly Detection System

Satnam Singh, PhD

Page 2: HawkEye: A Real-Time Anomaly Detection System

Anomaly Types: Point Anomalies

• Data points that deviate significantly from the baseline are considered outliers (point anomalies)

• Detection Strategy: Use a classification model or a parametric model to learn the baseline, then flag deviations from the baseline as anomalous

• Anomaly Detectors: Parametric models (LogNormal, Poisson), MPCA, One-Class SVM
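As an illustration of the parametric approach, here is a minimal sketch that fits a Poisson baseline to count data and flags counts that are improbably high under it. The function names, the one-sided tail test, and the `alpha` cutoff are assumptions made for this example, not details taken from HawkEye.

```python
import math

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Accumulate P(X = i) for i = 0..k-1 iteratively (avoids large factorials).
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def fit_poisson_baseline(counts):
    """Maximum-likelihood Poisson rate: the sample mean of the baseline counts."""
    return sum(counts) / len(counts)

def point_anomalies(counts, lam, alpha=1e-3):
    """Indices whose count has upper-tail probability below alpha (hypothetical cutoff)."""
    return [i for i, k in enumerate(counts) if poisson_upper_tail(k, lam) < alpha]

# Toy data, invented for the example.
baseline = [4, 6, 5, 3, 5, 4, 6, 5, 4, 5]
lam = fit_poisson_baseline(baseline)
stream = [5, 4, 6, 25, 5, 3]
flagged = point_anomalies(stream, lam)
print(lam, flagged)
```

This sketch only looks at unusually high counts; a two-sided variant would also test the lower tail.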


Page 3: HawkEye: A Real-Time Anomaly Detection System

Collective Anomalies

• Anomalies are sequences of data points, typically measured at successive times spaced at (often uniform) intervals

• Detection Strategy: Compute an anomaly score sequentially (e.g. the likelihood ratio of the anomalous to the baseline probability distribution) and declare an anomaly whenever the score crosses a threshold

• Anomaly Detectors: Change detection statistical techniques such as CUSUM, Page’s Test, GLRT

Page 4: HawkEye: A Real-Time Anomaly Detection System

Contextual Anomalies

• Data items are considered anomalous in a specific context but not in other situations

• Use the context to either raise or suppress anomalies

• Anomaly Detectors: Seasonality detection using multiple models, Time series modeling

[Figure: Number of requests made on a retail website across three Tuesdays; legend: Normal, Anomaly]
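One way to make the idea of context concrete is to keep a separate baseline per context and score each value only against its own context. The sketch below is an assumption-laden illustration: the (weekday, hour) context key, the z-score test, and the threshold of 3 are all hypothetical choices, not HawkEye's documented method.

```python
from collections import defaultdict
from statistics import mean, pstdev

def fit_context_baselines(history):
    """history: iterable of (context, value). Returns {context: (mean, std)}."""
    groups = defaultdict(list)
    for context, value in history:
        groups[context].append(value)
    return {c: (mean(v), pstdev(v)) for c, v in groups.items()}

def is_contextual_anomaly(context, value, baselines, z_threshold=3.0):
    """True if value is far from the baseline learned for this context."""
    if context not in baselines:
        return False  # assumption: stay silent when no baseline exists
    mu, sigma = baselines[context]
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold

# Toy request counts keyed by (weekday, hour); values invented for the example.
history = [
    (("Tue", 9), 1000), (("Tue", 9), 1040), (("Tue", 9), 980),
    (("Tue", 3), 50), (("Tue", 3), 55), (("Tue", 3), 45),
]
baselines = fit_context_baselines(history)

# The same value is normal at 09:00 but anomalous at 03:00.
print(is_contextual_anomaly(("Tue", 9), 1000, baselines))
print(is_contextual_anomaly(("Tue", 3), 1000, baselines))
```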

Page 5: HawkEye: A Real-Time Anomaly Detection System

Data Stream Complexity

• Data stream complexity varies from simple to complex

• Don’t need complex algorithms for simple data streams

• Use algorithms to define data stream complexity

Page 6: HawkEye: A Real-Time Anomaly Detection System

Data Stream Complexity Estimator

• (a) Compute data stream summary statistics (e.g. percentage of zero values, max value, non-zero values, entropy, etc.) of each job over the entire training data

• (b) Perform anomaly detection using the most complex anomaly detection pipeline on all the job streams of the training data

• (c) Use an anomaly-count-based rule to estimate data complexity. The following rule was used:
   - Constant-valued data stream: anomaly count 0%
   - Simple: anomaly count less than 0.5%
   - Medium Complex: anomaly count between 0.5% and 2%
   - Highly Complex: anomaly count more than 2%

• (d) Use the summary statistics computed in step (a) as features and the complexity computed in step (c) as class labels. Feed these features and class labels to a decision tree.

• (e) Using the decision tree (information gain heuristic), identify the features that are informative for classification. We found that the decision tree achieves nearly 84% accuracy. From the decision tree we derive the following rule to automatically classify any job stream:

if entropy == 0: Level-1 “Constant-valued”
elif entropy <= 0.42: Level-2 “Simple”
elif entropy > 0.42 and entropy < 0.75:
    if zero percentage <= 97: Level-3 “Medium Complex”
    elif zero percentage > 97: Level-2 “Simple”
elif entropy > 0.75:
    if zero percentage <= 97: Level-4 “Highly Complex”
    elif zero percentage > 97: Level-2 “Simple”
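The labeling rule and the derived classification rule above can be sketched as runnable code. Several points are assumptions: the slides do not say how entropy is computed (binning, log base, normalization), so this sketch uses Shannon entropy in bits over distinct values; the boundary handling at exactly 0.5%, 2%, and entropy 0.75 is also a guess, and all function names are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (bits) of the empirical value distribution (assumed definition)."""
    n = len(values)
    h = -sum((c / n) * math.log2(c / n) for c in Counter(values).values())
    return h if h > 0 else 0.0

def zero_percentage(values):
    """Percentage of samples equal to zero."""
    return 100.0 * sum(1 for v in values if v == 0) / len(values)

def label_from_anomaly_rate(anomaly_pct):
    """Training label from the anomaly-count rule (boundaries are assumptions)."""
    if anomaly_pct == 0:
        return "Constant-valued"
    if anomaly_pct < 0.5:
        return "Simple"
    if anomaly_pct <= 2:
        return "Medium Complex"
    return "Highly Complex"

def classify_stream(entropy, zero_pct):
    """Entropy / zero-percentage rule from the slide."""
    if entropy == 0:
        return 1, "Constant-valued"
    if entropy <= 0.42 or zero_pct > 97:
        return 2, "Simple"
    if entropy < 0.75:
        return 3, "Medium Complex"
    # Assumption: entropy == 0.75, which the slide leaves uncovered, falls here.
    return 4, "Highly Complex"

stream = [0, 1] * 50  # toy data
level, name = classify_stream(shannon_entropy(stream), zero_percentage(stream))
print(level, name)
```

Collapsing the two "zero percentage > 97" branches into one check is equivalent to the rule as written, since both branches map to Level-2.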

Page 7: HawkEye: A Real-Time Anomaly Detection System

HawkEye: Anomaly Detection Framework

1. Data Stream Complexity Estimator
   • Summary Statistics
   • Entropy

2. Automated Baselining & Anomaly Detector Selection
   - Parametric Models
   - Page’s Test

3. Seasonality Detection and Prediction

4. Anomaly Suppression and Fusion

Other diagram elements: Metrics data, Alerts DB, User Dashboard

Page 8: HawkEye: A Real-Time Anomaly Detection System

Sliding Window Size Selection

[Figure: sliding window size selection, with panels for Level 2, Level 3, and Level 4 data streams]

Page 9: HawkEye: A Real-Time Anomaly Detection System

Anomaly Score: Statistics-based Detector

Page 10: HawkEye: A Real-Time Anomaly Detection System

Page’s Test: Detect Collective Anomalies

An efficient change detection scheme

• Use Page’s test to detect a switch from ordinary noise-only observations to observations that resemble the modeled (anomalous) behavior

• In a change detection problem, the distribution of the observations is different before and after an unknown time $n_0$; we want to detect the change, if it exists, as soon as possible

Find the stopping time:

$$N = \arg\min_{n} \{ S_n \ge h \}$$

Test statistic:

$$S_n = \max\{ 0,\; S_{n-1} + g(x_n) \}$$

Log-likelihood ratio:

$$g(x_n) = \ln \frac{f_K(x_n)}{f_H(x_n)}$$

The test statistic $S_n$ is “clamped” at zero.

[Figure: example run with threshold h = 30; process begins at t = 75, detection declared at t = 80]
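To make the recursion concrete, here is a minimal sketch of Page’s test: the statistic is updated as S_n = max(0, S_{n-1} + g(x_n)) and a detection is declared the first time S_n reaches the threshold h. The Gaussian mean-shift log-likelihood ratio, the parameter values, and the function names are assumptions chosen for the example; the slides do not state which densities HawkEye uses.

```python
def gaussian_llr(x, mu0, mu1, sigma):
    """ln f_K(x) / f_H(x) for K: N(mu1, sigma^2) versus H: N(mu0, sigma^2)."""
    return (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma ** 2

def pages_test(xs, g, h):
    """Run Page's test over xs.

    Returns (stop, scores): stop is the 0-based index of the first sample
    where S_n >= h (None if never reached); scores is the full S_n sequence.
    """
    s = 0.0
    scores = []
    stop = None
    for n, x in enumerate(xs):
        s = max(0.0, s + g(x))  # clamp at zero so pre-change drift cannot accumulate
        scores.append(s)
        if stop is None and s >= h:
            stop = n
    return stop, scores

# Toy data: the mean shifts from 0 to 2 at index 10 (invented for the example).
xs = [0.0] * 10 + [2.0] * 10
stop, scores = pages_test(xs, lambda x: gaussian_llr(x, 0.0, 2.0, 1.0), h=5.0)
print(stop, scores[stop])
```

A larger h lowers the false-alarm rate at the cost of a longer detection delay, which is the trade-off the threshold in the figure controls.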

Page 11: HawkEye: A Real-Time Anomaly Detection System

Anomaly Score: Page’s Test Detector

Page 12: HawkEye: A Real-Time Anomaly Detection System

Seasonality Detection and Prediction

Page 13: HawkEye: A Real-Time Anomaly Detection System

Anomaly Detection Results: Historical Statistics-Detector

[Figure: detected anomalies shown in red (anomaly count: 170), with the anomaly score distribution]

Page 14: HawkEye: A Real-Time Anomaly Detection System

Anomaly Detection Results: Page’s Test

[Figure: detected anomalies shown in red (anomaly count: 6), with the anomaly score distribution]

Page 15: HawkEye: A Real-Time Anomaly Detection System

Anomaly Detection Results: Historical Statistics-Detector

[Figure: detected anomalies shown in red (anomaly count: 159), with the anomaly score distribution]

Page 16: HawkEye: A Real-Time Anomaly Detection System

Anomaly Detection Results: Page’s Test

[Figure: detected anomalies shown in red (anomaly count: 16), with the anomaly score distribution]

Page 17: HawkEye: A Real-Time Anomaly Detection System

Anomaly Detection Results

Data volume per system:

                                                     System 1                 System 2
Total no. of datums (no. of jobs × datums per job)   155 × 14050 = 2177750    151 × 14050 = 2121550
No. of missing datums (no. of jobs × missing datums) 155 × 133 = 20615        151 × 18 = 2718
No. of valid datums                                  2157135                  2118832

Results per detector:

                         System 1                             System 2
                         Historical          Page’s Test      Historical          Page’s Test
                         statistics-based                     statistics-based
                         detector                             detector
No. of anomalies         28832               9793             32054               13197
% anomalies              1.33%               0.454%           1.512%              0.623%
Computation time taken   2 mins              4 mins           2 mins              5 mins
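The arithmetic behind the table can be re-derived from the reported counts. The short sketch below recomputes the totals, valid datums, and anomaly percentages; the dictionary layout and names are hypothetical, and it reads "missing datums" as a per-job figure, which is an interpretation of the slide.

```python
systems = {
    "System 1": {
        "jobs": 155, "datums_per_job": 14050, "missing_per_job": 133,
        "anomalies": {"Historical statistics": 28832, "Page's Test": 9793},
    },
    "System 2": {
        "jobs": 151, "datums_per_job": 14050, "missing_per_job": 18,
        "anomalies": {"Historical statistics": 32054, "Page's Test": 13197},
    },
}

def summarize(s):
    """Return (total, missing, valid, {detector: anomaly percentage})."""
    total = s["jobs"] * s["datums_per_job"]
    missing = s["jobs"] * s["missing_per_job"]
    valid = total - missing
    # Anomaly percentage is taken relative to valid (non-missing) datums.
    rates = {name: 100.0 * count / valid for name, count in s["anomalies"].items()}
    return total, missing, valid, rates

summary = {name: summarize(s) for name, s in systems.items()}
for name, (total, missing, valid, rates) in summary.items():
    print(name, total, missing, valid, {k: round(v, 3) for k, v in rates.items()})
```

On both systems Page’s test flags roughly a third to two fifths as many datums as the historical statistics-based detector.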