Learning Rules for Anomaly Detection of Hostile Network Traffic
Matthew V. Mahoney and Philip K. Chan
Florida Institute of Technology

Problem: How to detect novel intrusions in network traffic given only a model of normal traffic

Normal web server request: GET /index.html HTTP/1.0
Code Red II worm: GET /default.ida?NNNNNNNNN…
What has been done
- Firewalls: can't block attacks on open ports (web, mail, DNS)
- Signature detection (SNORT, BRO): hand-coded rules (search for "default.ida?NNN"); can't detect new attacks
- Anomaly detection (eBayes, ADAM, SPADE): learns rules from normal traffic for low-level protocols (IP, TCP, ICMP), but application protocols (HTTP, mail) are too hard to model
Learning Rules for Anomaly Detection (LERAD)
- Associative mining (APRIORI, etc.) learns rules with high support and confidence for one value
- LERAD learns rules with high support (n) and a small set of allowed values (r)
- Any value seen at least once in training is allowed

Example: if port = 80 and word1 = "GET" then word3 ∈ {"HTTP/1.0", "HTTP/1.1"} (r = 2)
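A rule like the one above can be sketched as a small data structure: antecedent conditions plus a set of allowed values for one target attribute. This is a minimal illustration under assumed names and representation, not the authors' implementation:

```python
# Minimal sketch of a LERAD-style rule (illustrative, not the authors' code).
# A rule has antecedent conditions (attribute -> required value) and a
# consequent: a target attribute with its set of allowed values.

class Rule:
    def __init__(self, conditions, target, allowed):
        self.conditions = conditions      # e.g. {"port": "80", "word1": "GET"}
        self.target = target              # e.g. "word3"
        self.allowed = set(allowed)       # e.g. {"HTTP/1.0", "HTTP/1.1"}

    def applies(self, tuple_):
        """True if every antecedent condition matches this tuple."""
        return all(tuple_.get(a) == v for a, v in self.conditions.items())

    def violated(self, tuple_):
        """Rule applies, but the target value was never seen in training."""
        return self.applies(tuple_) and tuple_[self.target] not in self.allowed

rule = Rule({"port": "80", "word1": "GET"}, "word3", {"HTTP/1.0", "HTTP/1.1"})
normal = {"port": "80", "word1": "GET", "word2": "/index.html", "word3": "HTTP/1.0"}
attack = {"port": "80", "word1": "GET", "word2": "/default.ida?NNN", "word3": "XXXX"}
print(rule.violated(normal))  # False
print(rule.violated(attack))  # True
```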
LERAD Steps
1. Generate candidate rules
2. Remove redundant rules
3. Remove poorly trained rules

LERAD is fast because steps 1-2 can be done on a small random sample (~100 tuples)
Step 1. Generate Candidate Rules
Suggested by matching attribute values

Sample  Port  Word1  Word2        Word3
S1      80    GET    /index.html  HTTP/1.0
S2      80    GET    /banner.gif  HTTP/1.0
S3      25    HELO   pascal       MAIL

S1 and S2 suggest:
- port = 80
- if port = 80 then word1 = "GET"
- if word3 = "HTTP/1.0" and word1 = "GET" then port = 80
S2 and S3 suggest no rules
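Candidate generation from one pair of tuples can be sketched as follows: shuffle the attributes on which the pair agrees, then let each attribute in turn be the consequent, conditioned on the attributes before it. This is an illustration under assumed representations, not the exact published algorithm:

```python
import random

# Sketch of step 1 (illustrative): suggest candidate rules from the
# attributes on which two training tuples agree. Attributes earlier in a
# random ordering become antecedent conditions for later consequents.
def candidate_rules(t1, t2, rng=random):
    matching = [a for a in t1 if t1[a] == t2[a]]
    rng.shuffle(matching)
    rules = []
    for i, target in enumerate(matching):
        conditions = {a: t1[a] for a in matching[:i]}
        rules.append((conditions, target, t1[target]))
    return rules

s1 = {"port": "80", "word1": "GET", "word2": "/index.html", "word3": "HTTP/1.0"}
s2 = {"port": "80", "word1": "GET", "word2": "/banner.gif", "word3": "HTTP/1.0"}
s3 = {"port": "25", "word1": "HELO", "word2": "pascal", "word3": "MAIL"}
print(len(candidate_rules(s1, s2)))  # 3 (port, word1, word3 match)
print(len(candidate_rules(s2, s3)))  # 0 (no attributes match)
```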
Step 2. Remove Redundant Rules
Favor rules with higher score = n/r

Sample  Port  Word1  Word2        Word3
S1      80    GET    /index.html  HTTP/1.0
S2      80    GET    /banner.gif  HTTP/1.0
S3      25    HELO   pascal       MAIL

Rule 1: if port = 80 then word1 = "GET" (n/r = 2/1)
Rule 2: if word2 = "/index.html" then word1 = "GET" (n/r = 1/1)
Rule 2 has the lower score and covers no new values, so it is redundant
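This redundancy test can be sketched by visiting rules in decreasing order of n/r and dropping any rule whose allowed values cover no (tuple, attribute) pair that a higher-scoring rule has not already covered. An illustration, not the authors' code:

```python
# Sketch of step 2 (illustrative): keep rules in decreasing order of
# score n/r; drop any rule that covers no new (tuple, attribute) pairs.

def matches(conditions, t):
    return all(t.get(a) == v for a, v in conditions.items())

def remove_redundant(rules, training):
    """rules: list of (conditions, target, allowed_set) triples."""
    def score(rule):
        conditions, _, allowed = rule
        n = sum(matches(conditions, t) for t in training)   # support n
        return n / max(len(allowed), 1)                     # n/r
    kept, covered = [], set()
    for rule in sorted(rules, key=score, reverse=True):
        conditions, target, allowed = rule
        coverage = {(i, target) for i, t in enumerate(training)
                    if matches(conditions, t) and t.get(target) in allowed}
        if coverage - covered:          # covers something new -> keep
            kept.append(rule)
            covered |= coverage
    return kept

s1 = {"port": "80", "word1": "GET", "word2": "/index.html", "word3": "HTTP/1.0"}
s2 = {"port": "80", "word1": "GET", "word2": "/banner.gif", "word3": "HTTP/1.0"}
s3 = {"port": "25", "word1": "HELO", "word2": "pascal", "word3": "MAIL"}
rule1 = ({"port": "80"}, "word1", {"GET"})            # n/r = 2/1
rule2 = ({"word2": "/index.html"}, "word1", {"GET"})  # n/r = 1/1
kept = remove_redundant([rule1, rule2], [s1, s2, s3])
print(len(kept))  # 1 -- rule2 is redundant
```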
Step 3. Remove Poorly Trained Rules
Rules with violations in a validation set will probably generate false alarms

[Diagram: data divided into Train, Validate, and Test periods; curves show r (number of allowed values) over time. A fully trained rule, whose r is stable by the end of training, is kept; an incompletely trained rule, still acquiring values during validation, is removed.]
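The validation filter can be sketched simply: any rule violated even once on a clean validation set is treated as incompletely trained and dropped. An illustration under the same assumed rule representation:

```python
# Sketch of step 3 (illustrative): remove any rule that is violated on a
# clean validation set, since it would likely generate false alarms later.

def remove_poorly_trained(rules, validation):
    """rules: list of (conditions, target, allowed_set) triples."""
    def violated(rule, t):
        conditions, target, allowed = rule
        return (all(t.get(a) == v for a, v in conditions.items())
                and t.get(target) not in allowed)
    return [r for r in rules
            if not any(violated(r, t) for t in validation)]

rule = ({"port": "80"}, "word1", {"GET"})
validation = [{"port": "80", "word1": "POST", "word2": "/form", "word3": "HTTP/1.1"}]
print(remove_poorly_trained([rule], validation))  # [] -- rule removed
```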
Attribute Sets
- Inbound client packets (PKT): IP packet cut into 24 16-bit fields
- Inbound client TCP streams (TCP): date, time; source and destination IP addresses and ports; length, duration; TCP flags; first 8 application words
Anomaly score = tn/r summed over violated rules, where t = time since the rule was last violated
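The scoring formula above can be sketched directly; n and r come from training, while t rewards rules that have not been violated recently. An illustration, not the authors' code:

```python
# Sketch of scoring (illustrative): each violated rule contributes t*n/r,
# where t is the time since that rule was last violated, n its training
# support, and r its number of allowed values.

def anomaly_score(tuple_, rules, last_violation, now):
    """rules: list of (conditions, target, allowed_set, n) quadruples."""
    score = 0.0
    for i, (conditions, target, allowed, n) in enumerate(rules):
        applies = all(tuple_.get(a) == v for a, v in conditions.items())
        if applies and tuple_.get(target) not in allowed:
            t = now - last_violation.get(i, 0.0)   # time since last violation
            score += t * n / len(allowed)          # t * n / r
            last_violation[i] = now
    return score

rules = [({}, "word1", {"GET"}, 100)]   # unconditional rule, n = 100, r = 1
last = {}
print(anomaly_score({"word1": "POST"}, rules, last, now=10.0))  # 1000.0
print(anomaly_score({"word1": "POST"}, rules, last, now=11.0))  # 100.0
```

A rule violated repeatedly in quick succession thus contributes less and less, damping alarm floods from a single noisy rule.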
Experimental Evaluation
- 1999 DARPA/Lincoln Laboratory Intrusion Detection Evaluation (IDEVAL)
  - Train on week 3 (no attacks)
  - Test on inside sniffer weeks 4-5 (148 simulated probes, DOS, and R2L attacks)
  - Top participants in 1999 detected 40-55% of attacks at 10 false alarms per day
- 2002 university departmental server traffic (UNIV)
  - 623 hours over 10 weeks
  - Train and test on adjacent weeks (some unlabeled attacks in training data)
  - 6 known real attacks (some multiple instances)
Experimental Results
[Bar chart: percent of attacks detected at 10 false alarms per day (y-axis 0-70%) for IDEVAL PKT, IDEVAL TCP, UNIV PKT, and UNIV TCP.]
UNIV Detection/False Alarm Tradeoff
[Plot: percent of attacks detected (0-100%) versus false alarms per day per detector (0-40), for three detectors: Comb, TCP, and PKT.]
Run Time Performance (750 MHz PC, Windows Me)
- Preprocess 9 GB IDEVAL traffic: 7 min.
- Train + test: < 2 min. (all systems)
Anomalies are due to bugs and idiosyncrasies in hostile code
No obvious way to distinguish them from benign events

UNIV attack         How detected
Inside port scan    HEAD / HTTP\1.0 (backslash)
Code Red II worm    TCP segmentation after GET
Nimda worm          host: www
Scalper worm        host: unknown
Proxy scan          host: www.yahoo.com
DNS version probe   (not detected)
Contributions
- LERAD differs from association mining in that the goal is to find rules for anomaly detection: a small set of allowed values
- LERAD is fast because rules are generated from a small sample
- Testing is fast (50-75 rules)
- LERAD improves intrusion detection: it models application protocols and detects more attacks

Thank you