Identifying Malicious Web Requests through Changes in Locality and Temporal Sequence
DIMACS Workshop on Security of Web Services and E-Commerce
Li-Chiou Chen
School of Computer Science and Information Systems, Pace University
May 4th, 2005
© Li-Chiou Chen, 5/6/2005 2
Need for anomaly detection in distributed network traces
- Fast-spreading Internet worms and other malicious programs interrupt web services
  - Early detection and response are vital
- These attacks are usually launched from distributed locations
  - Network traces left at distributed locations are invaluable when searching for clues to potential future attacks
  - E.g. DShield, the Honeynet Project
Types of IDS
- Based on data
  - Network-based IDS: monitors and inspects network traffic
  - Host-based IDS: runs on a single host
- Based on detection technique
  - Signature-based IDS: uses pattern matching to identify known attacks
  - Anomaly-based IDS: uses statistical, data mining or other techniques to distinguish normal from abnormal activities
Outline
- Toolkits for inferring anomalous patterns from distributed network traces
- Previous work
- Changes of locality over time
- Markov chain analysis
- Preliminary results
- Summary
- Future work
TIAP: Toolkits for inferring anomalous patterns in distributed network traces
[Architecture diagram; labels as extracted]
- Inputs: network traces (web log, tcpdump, etc.); alerts from other IDS or TIAP peers (using IDMEF)
- Modules: data conversion; locality pattern analysis; sequence pattern analysis; response module
- Outputs: alerts to other IDS or TIAP peers (using IDMEF); alerts to administrators
Web level IDS
- Anomaly detection
  - Structure of an HTTP request (Kruegel and Vigna 03)
  - Normality on streams of data access patterns (Sion et al. 03)
- Misuse detection
  - State transition analysis of HTTP requests (Vigna et al. 03)
  - Look for attack signatures (Almgren et al. 01)
Changes in locality patterns and temporal sequence patterns
- Locality
  - where the web request is sent from, such as the source IP address
  - which web server is requested, such as the destination IP address
- Temporal sequence
  - the order of requested objects during a given period of time
Locality pattern analysis in distributed network traces
[Figure: example traces ABCDABAA and ABPOKIKL, with patterns listed per time step t1 to t4 (t1: AB; t2 to t4 elided in the original)]
An example: web traces in common log format from 6 web servers
tstamp, ip, server, doc_tpe, user_agent
62978, 38.0.69.1, 1, 2, 3
62979, 38.0.69.1, 1, 2, 3
62979, 38.0.69.1, 2, 2, 3
63001, 38.0.69.1, 1, 2, 3
...
[Figure: servers S1 S2 S3 S4 S5 S6, with the requests that form a session marked]
Data profiles
- 6 web servers (2 of them link to each other, 4 are independent)
- One day of web traces
- One session: a distinct IP, 10-minute interval
- 193,070 HTTP requests, 11,177 sessions
- HTTP requests from outside the organization
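The session definition above (a distinct IP within a 10-minute interval) can be sketched in Python. This is only an illustration, not the toolkit's code. It assumes that tstamp is in seconds and that a session closes once a request from the same IP arrives more than 600 seconds after the session's first request; a gap-based timeout would be an equally plausible reading. The names `Request` and `sessionize` and the field name `doc_type` are hypothetical.

```python
from collections import namedtuple

# Field names are illustrative; they mirror the columns of the converted trace.
Request = namedtuple("Request", "tstamp ip server doc_type user_agent")

SESSION_SECONDS = 600  # assumed: tstamp is in seconds, so 10 minutes = 600


def sessionize(requests, max_span=SESSION_SECONDS):
    """Group requests into sessions: one IP, at most max_span seconds long."""
    open_sessions = {}   # ip -> list of requests in that IP's current session
    finished = []
    for req in sorted(requests, key=lambda r: r.tstamp):
        current = open_sessions.get(req.ip)
        if current is not None and req.tstamp - current[0].tstamp > max_span:
            finished.append(current)   # interval expired: close the session
            current = None
        if current is None:
            current = []
            open_sessions[req.ip] = current
        current.append(req)
    finished.extend(open_sessions.values())
    return finished


rows = [
    Request(62978, "38.0.69.1", 1, 2, 3),
    Request(62979, "38.0.69.1", 1, 2, 3),
    Request(62979, "38.0.69.1", 2, 2, 3),
    Request(63001, "38.0.69.1", 1, 2, 3),
    Request(63700, "38.0.69.1", 1, 2, 3),  # made-up row, > 600 s after the first
]
sessions = sessionize(rows)
print([len(s) for s in sessions])
```

The first four rows are the ones shown in the example trace; the fifth is invented so that a second session appears.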
Locality pattern analysis
Web sites accessed   Document types accessed   % browser   % web bot
1                    1                         99%         1%
1                    2                         22%         78%
1                    3                         94%         6%
1                    4                         93%         7%
1                    5                         0%          100%
1                    6                         100%        0%
1                    7                         100%        0%
2                    1                         0%          100%
2                    2                         12%         88%
2                    3                         0%          100%
2                    4                         0%          100%
3                    2                         0%          100%
3                    3                         0%          100%
3                    4                         0%          100%
4                    2                         0%          100%
4                    3                         0%          100%
4                    4                         0%          100%
86 sessions by only two web bots
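A table of this shape could be produced roughly as follows. This is a sketch under stated assumptions: each session is a list of (server, doc_type, agent_class) tuples, the agent class ("browser" or "bot") has already been assigned by some means the slides do not describe, and one agent class applies to a whole session. The function names and the toy data are made up.

```python
from collections import Counter, defaultdict


def locality_key(session):
    """Return (number of distinct servers, number of distinct doc types)."""
    servers = {server for server, _doc_type, _agent in session}
    doc_types = {doc_type for _server, doc_type, _agent in session}
    return len(servers), len(doc_types)


def locality_table(sessions):
    """Map each locality key to the share of browser and bot sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        agent = session[0][2]  # assumed: one agent class per session
        counts[locality_key(session)][agent] += 1
    table = {}
    for key, by_agent in sorted(counts.items()):
        total = sum(by_agent.values())
        table[key] = {
            "browser": by_agent["browser"] / total,
            "bot": by_agent["bot"] / total,
        }
    return table


# Toy sessions (made up): each request is (server, doc_type, agent_class).
toy_sessions = [
    [(1, 1, "browser"), (1, 1, "browser")],
    [(1, 1, "browser")],
    [(1, 2, "bot"), (2, 1, "bot"), (3, 2, "bot")],
    [(1, 1, "bot")],
]
table = locality_table(toy_sessions)
for key, shares in table.items():
    print(key, shares)
```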
Markov chain analysis
[Figure: a request stream observed at times t1, t2, ..., t16, ...]
Observed sequence:  X X Y Y Y X X Z Z X X Z Z Z W W W ...
State sequence:     N S N S S O S N S O S N S S N S S ...
[The state sequence is divided into sampling window 1 and sampling window 2; each window yields a Markov chain over the states N, S and O]
Data profiles
- 1 web server
- One week of web traces
- Window size: 30
- Reference list: 30
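The slides do not spell out how each request is mapped to the states N, S and O. One plausible reading, assumed here and not confirmed by the source, is: S when the source is the same as that of the immediately preceding request, O when it differs from the preceding one but appears in a bounded reference list of recently seen sources, and N otherwise. The sketch below implements that assumed rule; `label_states` and `ref_size` are hypothetical names.

```python
from collections import deque


def label_states(sources, ref_size=30):
    """Label each request N (new), S (same) or O (old) under the assumed rule."""
    reference = deque(maxlen=ref_size)  # most recently seen distinct sources
    previous = None
    labels = []
    for src in sources:
        if src == previous:
            labels.append("S")
        elif src in reference:
            labels.append("O")
        else:
            labels.append("N")
        # Move src to the most-recent end of the reference list.
        if src in reference:
            reference.remove(src)
        reference.append(src)
        previous = src
    return labels


# Toy stream (made up): letters stand for source IP addresses.
print("".join(label_states("AABBACCA")))         # NSNSONSO
# With a reference list of size 1, A has been forgotten by the time it returns.
print("".join(label_states("ABA", ref_size=1)))  # NNN
```

A smaller reference list forgets sources sooner, so returning sources are labeled N rather than O.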
Change of distinct IP over time: browsers
[Figure: number of unique IPs per five minutes (y-axis, 0 to 100) against hours since 09/30/2004 0:00 AM (x-axis, 0 to 192)]
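The quantity plotted here, the number of unique IPs per five minutes, can be computed along these lines. This is a minimal sketch, assuming timestamps in seconds and requests given as (timestamp, IP) pairs; `unique_ips_per_bucket` is a hypothetical name and the data is made up. To get separate browser and web bot series, it would be run once per agent class.

```python
from collections import defaultdict

BUCKET_SECONDS = 300  # five minutes, assuming timestamps are in seconds


def unique_ips_per_bucket(requests, bucket=BUCKET_SECONDS):
    """Return {bucket start time: number of distinct IPs seen in that bucket}."""
    seen = defaultdict(set)
    for tstamp, ip in requests:
        seen[tstamp - tstamp % bucket].add(ip)
    return {start: len(ips) for start, ips in sorted(seen.items())}


# Toy requests (made up): (seconds since the start of the trace, source IP).
toy = [(10, "10.0.0.1"), (20, "10.0.0.2"), (250, "10.0.0.1"),
       (310, "10.0.0.1"), (900, "10.0.0.3")]
print(unique_ips_per_bucket(toy))  # {0: 2, 300: 1, 900: 1}
```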
Change of distinct IP over time: web bots
[Figure: number of unique IPs per five minutes (y-axis, 0 to 25) against hours since 09/30/2004 0:00 AM (x-axis, 0 to 192)]
Markov chain results
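[Figure: state transition diagram over the states Old (O), Same (S) and New (N). Edge labels in extraction order: 0.13 (0.10), 0.13 (0.08), 0.06 (0.04), 0.83 (0.10), 0.43 (0.14), 0.43 (0.17), 0.40 (0.22), 0.42 (0.21), 0.18 (0.16). The edge each label belongs to is not recoverable from the extracted text.]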
Illustration of the state transition probability
[Figure: probability (y-axis, 0 to 1) of the transitions S->S, S->O and S->N against hours since 09/30/04 0:00 AM (x-axis, 0 to 168)]
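Transition probabilities such as S->S, S->O and S->N can be estimated from a window of state labels by counting consecutive pairs and normalizing per current state. The sketch below shows that maximum-likelihood estimate. It assumes non-overlapping windows of 30 labels; the slides do not say whether the sampling windows overlap. All names and the toy sequence are illustrative.

```python
from collections import Counter

STATES = "NSO"


def transition_probabilities(labels):
    """Estimate P(next | current) from one window of state labels."""
    pair_counts = Counter(zip(labels, labels[1:]))
    probs = {}
    for cur in STATES:
        total = sum(pair_counts[(cur, nxt)] for nxt in STATES)
        for nxt in STATES:
            # Rows with no observations are left at 0.0 rather than guessed.
            probs[(cur, nxt)] = pair_counts[(cur, nxt)] / total if total else 0.0
    return probs


def per_window(labels, window=30):
    """Split the label sequence into consecutive windows and estimate each."""
    return [transition_probabilities(labels[i:i + window])
            for i in range(0, len(labels) - window + 1, window)]


toy_labels = "NSSSONSSSS"  # made-up state sequence
probs = transition_probabilities(toy_labels)
print(probs[("S", "S")], probs[("S", "O")], probs[("S", "N")])
```

Plotting one estimate per window against the window's start time would give a curve of the kind shown in the figure.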
Summary
- The preliminary locality pattern analysis works well for identifying distinct web bot access patterns
- The Markov chain analysis provides a way to infer attacks that use random IP addresses
- A combination of the two approaches is needed
Ongoing work
- Incorporate the analytical results into malware or intrusion detection
- A distributed framework for data collection and information sharing, to infer malware or intrusion attempts across servers, platforms and geographical locations
- Collection of attack logs for analytical purposes
- Use of the Intrusion Detection Message Exchange Format (IDMEF) for message exchange among servers
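To give a sense of what an IDMEF alert exchanged among servers might look like, here is a simplified sketch. It uses element names and the namespace from the IDMEF specification as later published in RFC 4765. It is not validated against the IDMEF DTD and omits attributes such as `ntpstamp` on `CreateTime`. The function `build_alert`, the analyzer ID and the classification text are made up for illustration.

```python
import xml.etree.ElementTree as ET

IDMEF_NS = "http://iana.org/idmef"  # namespace used by RFC 4765


def build_alert(analyzer_id, created, source_ip, classification):
    """Build a simplified IDMEF-style Alert message as an XML string."""
    ET.register_namespace("idmef", IDMEF_NS)

    def tag(name):
        return "{%s}%s" % (IDMEF_NS, name)

    message = ET.Element(tag("IDMEF-Message"), {"version": "1.0"})
    alert = ET.SubElement(message, tag("Alert"))
    ET.SubElement(alert, tag("Analyzer"), {"analyzerid": analyzer_id})
    ET.SubElement(alert, tag("CreateTime")).text = created
    source = ET.SubElement(alert, tag("Source"))
    node = ET.SubElement(source, tag("Node"))
    address = ET.SubElement(node, tag("Address"), {"category": "ipv4-addr"})
    ET.SubElement(address, tag("address")).text = source_ip
    ET.SubElement(alert, tag("Classification"), {"text": classification})
    return ET.tostring(message, encoding="unicode")


# Hypothetical values; 192.0.2.10 is a documentation-only address.
alert_xml = build_alert("tiap-01", "2005-05-04T12:00:00Z",
                        "192.0.2.10", "Anomalous locality pattern")
print(alert_xml)
```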