Anomaly Detection by Mean and Standard Deviation (LT at AQ)

Anomaly Detection

iwanaga

Who am I

@quake_alert @quake_alert_en @quake_alert_fr @quake_alert_kr

Yoshihiro Iwanaga

Motivation for detecting anomaly

Traditional system monitoring

• process existence
• ping, http, tcp response
• disk usage

→ “fixed” rule / threshold

Motivation for detecting anomaly

Notice something out of the ordinary

• network traffic is heavier than usual
• the number of login attempts is obviously larger
• a colleague is strangely gracious today

→ Unusual behaviors; Indications of fault.

Such information helps prevent service degradation in advance!!

But rules and thresholds vary with service, host, client, time…

key to detect anomaly

Watch the differences between usual and unusual

e.g. Network Traffic

[Figure: traffic over time, Mon to Fri]

Superimpose 24-hour plots

Traffic at 15:00 on a workday is about 1.2 Gbps

[Figure: superimposed daily plots, traffic over time]

Periodicity!!

Acceptable “range”: from mean - 3σ to mean + 3σ

σ: amount of dispersion from the mean

→ e.g. the acceptable range of traffic at 15:00 on a workday is 1.01 to 1.38 Gbps
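The range calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's implementation: the function names `acceptable_ranges` and `is_anomalous`, the sample values, and the choice of the sample standard deviation (`statistics.stdev`) are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


def acceptable_ranges(samples, k=3.0):
    """Build {hour: (low, high)} from past (hour, value) samples.

    low/high are mean -/+ k * sigma of the values seen at that hour.
    Using the sample standard deviation is an assumption of this sketch.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)

    ranges = {}
    for hour, values in by_hour.items():
        if len(values) < 2:
            continue  # stdev needs at least two data points
        m = mean(values)
        s = stdev(values)
        ranges[hour] = (m - k * s, m + k * s)
    return ranges


def is_anomalous(ranges, hour, value):
    """True if value falls outside the acceptable range for that hour."""
    low, high = ranges[hour]
    return not (low <= value <= high)


# Hypothetical traffic (Gbps) observed at 15:00 on five past workdays.
history = [(15, 1.15), (15, 1.25), (15, 1.20), (15, 1.18), (15, 1.22)]
ranges = acceptable_ranges(history)
```

With these made-up samples, `is_anomalous(ranges, 15, 1.2)` is false and `is_anomalous(ranges, 15, 2.0)` is true. Keeping one range per hour (and, in practice, per service or host) is what lets the threshold follow the periodicity instead of being fixed.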

Case examples

Traffic: DDoS, partial hardware failure

E-mail (number of mails that passed the spam filter, spam rate): applied a wrong spam rule

However

Reality is not that simple…

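“In life, where there is joy there is hardship too; after the tears, a rainbow will appear.

Keep walking steadily, treading firmly on your own path.”

Michio Yamagami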

downloading large files

mass e-mail sending

“Traffic spikes” happen so frequently

Frequent false-positive alerts turn monitoring into a “cry-wolf” system…

heuristic filtering

Usually, traffic cools down within 15 minutes

Notify engineers if an anomaly continues for more than 15 minutes
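The 15-minute rule can be sketched as a small filter over a series of per-sample anomaly flags. This is only an illustration under stated assumptions: the function name `notification_points`, the 5-minute sampling interval, and the convention that each anomalous sample counts for one full interval are hypothetical and not taken from the talk.

```python
def notification_points(flags, interval_min=5, hold_min=15):
    """Return the sample indices at which engineers would be notified.

    flags: chronological booleans, True when the sample is anomalous.
    A notification fires once per run of anomalies, at the first sample
    where the run has lasted more than hold_min minutes (each sample is
    assumed to cover interval_min minutes).
    """
    points = []
    run = 0           # consecutive anomalous samples so far
    notified = False  # whether the current run was already reported
    for i, anomalous in enumerate(flags):
        if anomalous:
            run += 1
            if run * interval_min > hold_min and not notified:
                points.append(i)
                notified = True
        else:
            # Traffic cooled down: reset and stay silent.
            run = 0
            notified = False
    return points


# A short spike (2 samples) followed by a sustained anomaly (5 samples).
flags = [False, True, True, False, True, True, True, True, True, False]
points = notification_points(flags)
```

Here the short spike is ignored and only the sustained anomaly produces a single notification, which is the behavior the heuristic is meant to achieve.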

Engineers’ knowledge is a gold mine for better algorithms

→ one practical example: the 15-minute rule above
