Fast Portscan Detection Using Sequential Hypothesis Testing Authors: Jaeyeon Jung, Vern Paxson, Arthur W. Berger, and Hari Balakrishnan Publication: IEEE Symposium on Security and Privacy 2004 Presenter: Ryan Cunningham



A quick note

All images and equations taken directly from the publication

Port scanning

Network reconnaissance technique
Usually a prelude to an attack
Difficult to detect
  Scan traffic is difficult to distinguish from regular traffic
  Stealth scans can occur very slowly
  Some scans are legitimate
    Search engine spiders
    SSH, peer-to-peer applications, etc.

Previous detection techniques

Limit distinct connection attempts from one IP
  Network Security Monitor
  Snort
    Also detects malformed packets
Limit failed connection attempts from one IP
  Bro
    Sensitive to service on specific port
  Robertson et al. showed threshold very important
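The threshold-style detectors listed above can be illustrated with a minimal Python sketch that flags a source once it has contacted more than a fixed number of distinct destinations. The function name, the `(source, destination)` tuple layout, and the threshold value are hypothetical; this is not the actual logic of Network Security Monitor, Snort, or Bro, and it ignores the time windows that real detectors apply.

```python
from collections import defaultdict

def flag_by_distinct_destinations(attempts, threshold):
    """Return the sources that contacted more than `threshold` distinct destinations.

    `attempts` is an iterable of (source_ip, destination_ip) pairs.
    Hypothetical helper for illustration only; no time window is applied.
    """
    seen = defaultdict(set)  # source -> distinct destinations contacted so far
    flagged = set()
    for src, dst in attempts:
        seen[src].add(dst)
        if len(seen[src]) > threshold:
            flagged.add(src)
    return flagged

# One source touches seven distinct hosts; another repeats a single host.
attempts = [("10.0.0.9", f"192.168.1.{i}") for i in range(1, 8)]
attempts += [("10.0.0.5", "192.168.1.1"), ("10.0.0.5", "192.168.1.1")]
print(flag_by_distinct_destinations(attempts, threshold=5))  # {'10.0.0.9'}
```

Everything here hinges on the single `threshold` value, which is why its choice matters so much for this family of detectors.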

Previous detection techniques

Probabilistic model
  Developed by Leckie et al.
  Assesses typical traffic a machine receives
  Also assesses the traffic a remote machine is likely to send
  Combines these probabilities
    If the combined result crosses a threshold, an alert is raised
  Generates too many false positives

Previous detection techniques

SPICE
  Similar to probabilistic model
  Used to detect low traffic “stealth” scans
  Too computationally intensive for real-world use

Data set

Traffic from two sites
  LBL
    6,000 hosts
    Sparse address space (4.4%)
  ICSI
    200 hosts
    Dense address space (42%)

Data set

Anonymized TCP logs from Bro
Recorded for one 24-hour period
Bro NIDS flags used for comparison and validation

Data set

Analysis of unsuccessful connection attempts

Data set

Analysis of the ratio of successful to unsuccessful connection attempts

Observations

Scans usually come from one host
Scans make lots of failed connection attempts and few successful connection attempts
Scans should ideally be detected quickly
False positive rate should be configurable

Sequential Hypothesis Testing

Proposed by Wald in the 1940s
Method of doing repeated hypothesis testing as sequential data is gathered
Deciding between two hypotheses
Each time a data point arrives, decide
  Accept H0 (in our case, benign traffic)
  Accept H1 (in our case, port scan traffic)
  Wait for more data (next connection attempt)

Sequential Hypothesis Testing

We specify parameters $\alpha$ and $\beta$
  $\alpha$ > false positive rate
  $\beta$ < detection accuracy
We must estimate parameters $\theta_0$ and $\theta_1$
  $\theta_0$ = probability a benign connection attempt is successful
  $\theta_1$ = probability a scanner connection attempt is successful

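Sequential Hypothesis Testing

For each test, we compute the likelihood ratio:

$$\Lambda(Y) = \frac{\Pr[Y \mid H_1]}{\Pr[Y \mid H_0]} = \prod_{i=1}^{n} \frac{\Pr[Y_i \mid H_1]}{\Pr[Y_i \mid H_0]}$$

Where $Y_i = 0$ if connection attempt $i$ succeeds and $Y_i = 1$ if it fails, so that $\Pr[Y_i = 0 \mid H_j] = \theta_j$ and $\Pr[Y_i = 1 \mid H_j] = 1 - \theta_j$. Here $\theta_0$ and $\theta_1$ are the probabilities that a benign and a scanner connection attempt succeed.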

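Sequential Hypothesis Testing

Compare the likelihood ratio $\Lambda(Y)$ to thresholds $\eta_0$ and $\eta_1$, set from the chosen false positive bound $\alpha$ and detection bound $\beta$:

$$\eta_1 = \frac{\beta}{\alpha}, \qquad \eta_0 = \frac{1 - \beta}{1 - \alpha}$$

If $\Lambda(Y) < \eta_0$ then this is benign traffic
If $\Lambda(Y) > \eta_1$ then this is scan traffic
Otherwise, wait for another connection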
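The decision rule on this slide can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the function name `sequential_test` is hypothetical, the thresholds are assumed to be β/α and (1-β)/(1-α) (Wald's usual approximation), and the default parameter values are illustrative assumptions.

```python
def sequential_test(outcomes, theta0=0.8, theta1=0.2, alpha=0.01, beta=0.99):
    """Run a sequential likelihood-ratio test over connection outcomes.

    `outcomes` is an iterable of booleans: True for a successful connection
    attempt, False for a failed one. Returns (decision, attempts_used), where
    decision is "benign", "scanner", or "pending".
    Defaults are illustrative assumptions, not values taken from the slides.
    """
    eta1 = beta / alpha              # upper threshold: accept H1 (scanner)
    eta0 = (1 - beta) / (1 - alpha)  # lower threshold: accept H0 (benign)
    ratio = 1.0                      # running likelihood ratio
    used = 0
    for success in outcomes:
        used += 1
        if success:
            ratio *= theta1 / theta0              # success favours H0
        else:
            ratio *= (1 - theta1) / (1 - theta0)  # failure favours H1
        if ratio > eta1:
            return "scanner", used
        if ratio < eta0:
            return "benign", used
    return "pending", used  # ran out of data before crossing a threshold

print(sequential_test([False] * 10))               # ('scanner', 4)
print(sequential_test([True] * 10))                # ('benign', 4)
print(sequential_test([True, False, True, False])) # ('pending', 4)
```

With these assumed parameters each failure multiplies the ratio by 4 and each success divides it by 4, so four consecutive outcomes of the same kind are enough to cross a threshold.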

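Sequential Hypothesis Testing

We can estimate the expected number of connections required to decide with:

$$E[N \mid H_0] = \frac{\alpha \ln\frac{\beta}{\alpha} + (1 - \alpha) \ln\frac{1 - \beta}{1 - \alpha}}{\theta_0 \ln\frac{\theta_1}{\theta_0} + (1 - \theta_0) \ln\frac{1 - \theta_1}{1 - \theta_0}}$$

$$E[N \mid H_1] = \frac{\beta \ln\frac{\beta}{\alpha} + (1 - \beta) \ln\frac{1 - \beta}{1 - \alpha}}{\theta_1 \ln\frac{\theta_1}{\theta_0} + (1 - \theta_1) \ln\frac{1 - \theta_1}{1 - \theta_0}}$$

Derivation is long and messy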
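As a rough numerical illustration of the expected number of connections, the sketch below evaluates Wald's standard approximation: the expected log-likelihood at the stopping point divided by the expected log-likelihood step per connection. The function name `expected_attempts` and the parameter values are assumptions chosen for illustration.

```python
import math

def expected_attempts(theta0, theta1, alpha, beta):
    """Approximate E[N | H0] and E[N | H1] using Wald's approximation.

    theta0/theta1: success probability of a benign/scanner connection attempt.
    alpha/beta: false positive bound and detection bound.
    Hypothetical helper; thresholds assumed to be beta/alpha and (1-beta)/(1-alpha).
    """
    log_eta1 = math.log(beta / alpha)              # log of upper threshold
    log_eta0 = math.log((1 - beta) / (1 - alpha))  # log of lower threshold
    step_success = math.log(theta1 / theta0)              # log step on success
    step_failure = math.log((1 - theta1) / (1 - theta0))  # log step on failure
    n_h0 = (alpha * log_eta1 + (1 - alpha) * log_eta0) / (
        theta0 * step_success + (1 - theta0) * step_failure)
    n_h1 = (beta * log_eta1 + (1 - beta) * log_eta0) / (
        theta1 * step_success + (1 - theta1) * step_failure)
    return n_h0, n_h1

n_h0, n_h1 = expected_attempts(theta0=0.8, theta1=0.2, alpha=0.01, beta=0.99)
print(f"E[N|H0] = {n_h0:.2f}, E[N|H1] = {n_h1:.2f}")  # both about 5.41
```

Under these assumed values a decision is expected after roughly five to six connection attempts for either kind of host.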

Sequential Hypothesis Testing

Algorithm

Results

Efficiency = true positives / total reported positives

Effectiveness = true positives / total actual positives
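The two metrics can be computed from a set of flagged hosts and a set of known scanners, as in the short sketch below. The function name and the sample host labels are hypothetical, and returning 0.0 for an empty denominator is a convention assumed here, not something stated in the slides.

```python
def efficiency_and_effectiveness(flagged, actual_scanners):
    """Return (efficiency, effectiveness) for a set of flagged hosts.

    efficiency    = true positives / total reported positives
    effectiveness = true positives / total actual positives
    Empty denominators yield 0.0 (an assumed convention).
    """
    flagged = set(flagged)
    actual_scanners = set(actual_scanners)
    true_positives = len(flagged & actual_scanners)
    efficiency = true_positives / len(flagged) if flagged else 0.0
    effectiveness = (
        true_positives / len(actual_scanners) if actual_scanners else 0.0
    )
    return efficiency, effectiveness

# Four hosts flagged, five real scanners, three in common.
print(efficiency_and_effectiveness({"a", "b", "c", "d"},
                                   {"a", "b", "c", "e", "f"}))  # (0.75, 0.6)
```

Efficiency corresponds to what is commonly called precision and effectiveness to recall.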

Results

Comparison with Snort and Bro
$\bar{N}$ = average number of local hosts scanned before a decision is made

Contributions

Extremely fast port scan detection algorithm
High accuracy
Low false positive rate
Sound statistical foundation
Sound evaluation of the weaknesses of their approach
Good use of appendixes
Cure for insomnia

Weaknesses

Buffer of activity
  Attacker can spoof multiple IP addresses
  How is a filled buffer dealt with?
    Flush buffer
      Attacker can use this to hide scan activity
    Maintain larger buffer
      Attacker can keep going until system crashes
Distributed port scans undetectable
  Botnets are increasing in popularity

Weaknesses

Test assumes independent connection attempts
  As suggested in paper, an attacker could exploit knowledge of the system to connect to some systems while doing surveillance on others
No real-time testing conducted, only simulation
Reasoning is a little circular
Poor use of language

Improvements

Implement and test in real time
Perform suggested improvements in paper
  Differentiate between different services
  Differentiate between rejected and unanswered connection attempts
  Use a honeypot to see if the three-way handshake is completed (to detect spoofed IPs)
Should have held out some of the data as a test data set