

Autonomic Intrusion Detection System

Wei Wang1, Thomas Guyet2,3, and Svein J. Knapskog1

1 Centre for Quantifiable Quality of Service in Communication Systems, Norwegian University of Science and Technology (NTNU)

{wei.wang, knapskog}@q2s.ntnu.no

2 Project DREAM, INRIA Rennes/IRISA, France

3 AGROCAMPUS OUEST, Rennes, France

Abstract. We propose a novel framework for autonomic intrusion detection that performs online and adaptive intrusion detection in unlabeled audit data streams. The framework is self-managing: it is self-labeling, self-updating and self-adapting. Within the framework, Affinity Propagation (AP) is used to learn a subject’s behavior through dynamic clustering of the streaming data. Test results on a large real HTTP log stream demonstrate the effectiveness and efficiency of the method.

1 Problem statement, motivation and solution

Anomaly Intrusion Detection Systems (IDS) are an important part of current network security frameworks. Because the data in current network environments evolves continuously and the normal behavior of a subject may change over time, a static anomaly IDS is often ineffective. Detection models should be updated frequently by incorporating new incoming normal examples, and adapted to behavioral changes. There are at least two main difficulties in achieving this goal: (1) the lack of precisely labeled data, which is very difficult to obtain in practice; (2) the streaming nature of the data, with its behavioral changes.

To tackle these difficulties, we propose a framework for autonomic intrusion detection that detects anomalies in an online and adaptive fashion through dynamic clustering of audit data streams. The autonomic IDS is self-managing: it adapts to unpredictable changes while hiding its intrinsic complexity from operators. It is self-labeling, self-updating and self-adapting, which lets it detect attacks in unlabeled audit data streams. Self-updating consists in updating the detection model to take into account the normal variability of the data items. Self-adapting, by contrast, consists in rebuilding the model when the behavior changes.

The framework assumes that abnormal data is rare. We thus “capture” anomalies by finding outliers in the data streams. Given an initial batch of the data stream, our method identifies outliers through the initial clustering. In the framework, the detection model is a set of clusters of normal data items. The outliers generated during the clustering, as well as any incoming item that is too far from the current model, are suspected to be attacks. To refine our diagnosis, we define three states of a data item: normal, suspicious and anomalous. If an outlier is identified, it is marked as suspicious and put into a reservoir. Otherwise, the detection model is updated with the normal incoming data until a change is found, which triggers model rebuilding to adapt to the current behavior. A suspicious item is considered truly anomalous if it is again marked as suspicious after the adaptation.
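The three-state labeling described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names ClusterModel, observe and resolve_reservoir, the fixed outlier radius, and the count-based update are all hypothetical stand-ins, and the rule that a reservoir item falling inside the rebuilt model is relabeled normal is an assumption.

```python
from dataclasses import dataclass, field
from math import dist
from typing import List, Tuple

Point = Tuple[float, ...]


@dataclass
class ClusterModel:
    """Normal-behavior model: cluster centers plus a hypothetical outlier radius."""
    centers: List[Point]
    radius: float
    counts: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.counts:
            self.counts = [0] * len(self.centers)

    def nearest(self, x: Point) -> Tuple[int, float]:
        """Return (index, distance) of the cluster center closest to x."""
        distances = [dist(x, c) for c in self.centers]
        i = min(range(len(distances)), key=distances.__getitem__)
        return i, distances[i]

    def is_outlier(self, x: Point) -> bool:
        return self.nearest(x)[1] > self.radius

    def update(self, x: Point) -> None:
        """Placeholder self-update (assumption): count x in its nearest cluster."""
        self.counts[self.nearest(x)[0]] += 1


def observe(x: Point, model: ClusterModel, reservoir: List[Point]) -> str:
    """Self-labeling: outliers go to the reservoir, other items update the model."""
    if model.is_outlier(x):
        reservoir.append(x)
        return "suspicious"
    model.update(x)
    return "normal"


def resolve_reservoir(reservoir: List[Point], rebuilt: ClusterModel) -> List[Tuple[Point, str]]:
    """After a rebuild, items that are still outliers become anomalous."""
    return [(x, "anomalous" if rebuilt.is_outlier(x) else "normal") for x in reservoir]
```

In this sketch a rebuild is simulated by passing a second ClusterModel to resolve_reservoir; how the model is actually rebuilt (for example by re-running the clustering) is left out.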

2 Implementation and discussion

The autonomic IDS is effective for detecting rare attacks [1]. Detecting bursty attacks is a challenge because this attack scenario does not match the rareness assumption well. We therefore add two mechanisms to the autonomic detection. First, if a data item is very far from the model, it is flagged as anomalous immediately (rather than being considered suspicious). Second, a change is triggered if the percentage of outliers during a time period is high (e.g., larger than 60%). Bursty attacks can thus be easily identified by their large dissimilarity and by the prompt model rebuilding.
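The two mechanisms for bursty attacks might look like the following Python sketch. The names label_by_distance and OutlierRateTrigger, the far_radius threshold, and the use of a fixed-size count window to stand in for "a time period" are assumptions made for illustration. Only the 60% figure comes from the text.

```python
from collections import deque
from typing import Deque


def label_by_distance(distance: float, radius: float, far_radius: float) -> str:
    """Mechanism 1: items very far from the model skip the suspicious state."""
    if distance > far_radius:
        return "anomalous"
    if distance > radius:
        return "suspicious"
    return "normal"


class OutlierRateTrigger:
    """Mechanism 2: signal a model rebuild when outliers dominate a window."""

    def __init__(self, window: int = 100, max_ratio: float = 0.6) -> None:
        # Sliding window over the most recent items (window size is hypothetical).
        self.flags: Deque[bool] = deque(maxlen=window)
        self.max_ratio = max_ratio

    def push(self, is_outlier: bool) -> bool:
        """Record one item; return True when a rebuild should be triggered."""
        self.flags.append(is_outlier)
        full = len(self.flags) == self.flags.maxlen
        return full and sum(self.flags) / len(self.flags) > self.max_ratio
```

The trigger stays silent until the window is full, so a handful of early outliers cannot force a rebuild on their own. That is a design choice of this sketch and is not stated in the text.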

We use Affinity Propagation (AP) and StrAP [2] to detect bursty attacks within the framework, and test the method on a real HTTP log stream. The character distribution of each HTTP request is used as the feature, and the IDS must identify whether a request is normal or not. After filtering out static requests, the data contains 40,095 requests, including 239 attacks that occur in a very short interval (requests 7923 to 9743; see Fig. 1(a), which plots the k-NN distance between a data item and the training items). To facilitate comparison, we also use three static methods for the detection: k-NN, PCA and one-class SVM. The first 800 attack-free requests are used for training the static models, while the first 800 requests are used for the AP initial clustering. Testing results are shown in Fig. 1(b).
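As a rough illustration of the feature and of the k-NN distance plotted in Fig. 1(a), the Python sketch below maps a request to a 256-bin byte-frequency vector and scores it by its distance to the nearest training vectors. The exact feature definition and distance used in the experiments are not given in the text, so the binning, the Euclidean metric, the averaging over k neighbors, and the function names char_distribution and knn_distance are all assumptions.

```python
from collections import Counter
from math import dist
from typing import List, Sequence


def char_distribution(request: str) -> List[float]:
    """Relative frequency of each byte value (0-255) in the request string."""
    data = request.encode("utf-8", errors="replace")
    counts = Counter(data)
    total = len(data) or 1  # avoid division by zero on an empty request
    return [counts.get(b, 0) / total for b in range(256)]


def knn_distance(x: Sequence[float], training: Sequence[Sequence[float]], k: int = 1) -> float:
    """Mean Euclidean distance from x to its k nearest training vectors."""
    if not training or k < 1:
        raise ValueError("need at least one training vector and k >= 1")
    nearest = sorted(dist(x, t) for t in training)[:k]
    return sum(nearest) / len(nearest)
```

For example, with hypothetical training requests "GET /index.html" and "GET /about.html", a request carrying an unusual query string such as "GET /a?id=1' OR '1'='1" has a different character mix and so gets a larger k-NN distance than a request already seen in training.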

(a) Distance distribution of the log stream (anomaly index vs. HTTP requests in order; series: normal, attacks)
(b) Testing results with comparison (detection rate (%) vs. false positive rate (%); series: AP, kNN, PCA, SVM)

Fig. 1. Dynamic normal behaviors and testing results with comparison

Fig. 1(a) shows that the normal behavior changes over time, and Fig. 1(b) indicates that the autonomic detection method achieves better results than the three static methods when the detection rates are higher than 50%. Note that the autonomic IDS does not need a priori knowledge, whereas the static methods need labeled data for training. Our future work is to combine the autonomic IDS with effective static methods to prevent mimicry attacks (e.g., large-scale attacks launched to evade the autonomic IDS).

References

1. Wang, W., Masseglia, F., Guyet, T., Quiniou, R., Cordier, M.O.: A general framework for adaptive and online detection of web attacks. In: WWW. (2009) 1141–1142

2. Zhang, X., Furtlehner, C., Sebag, M.: Data streaming with affinity propagation. In: ECML/PKDD (2). (2008) 628–643