Adaptive Hybrid Model for Network Intrusion Detection and Comparison Among Machine Learning Algorithms

Md. Enamul Haque

Department of Computer Engineering
King Fahd University of Petroleum and Minerals
Saudi Arabia

[email protected]

Supervised by Dr. Talal Alkharobi

COE 551

May 21, 2014


Coming Up

Today's agenda.

Network Intrusion Detection.

Objective.

Proposed Model.

Algorithms Used.

Classifier Overview.

Dataset Description.

Results.

Conclusion


Network Intrusion Detection

Let's review

Host-based intrusion detection: monitors and analyzes the internal interfaces.

Network-based intrusion detection:

Misuse based: searches for known intrusive patterns.

Anomaly based: supervised, unsupervised, and hybrid anomaly detection.


Attack Types

Broad category

DOS: Denial of service.

R2L: Unauthorized access to the local system from a remote host.

U2R: Unauthorized access to the root of a local system.

Probe: Sensing network from outside to detect vulnerabilities.


Anomaly Types

Broad classification

Table: Anomaly Types

Attack Type   Exploits
DOS           back, land, neptune, pod, smurf, teardrop
U2R           buffer overflow, load module, perl, rootkit
R2L           ftp write, guess pass, imap, multi hop, phf, spy, warezclient, warezmaster
Probe         ip sweep, saint, satan, Nmap


Exploits Category

Sample Information

Feature name    Description                                                  Type
duration        length (number of seconds) of the connection                 Continuous
protocol type   type of the protocol, e.g. tcp, udp, icmp                    Discrete
land            1 if connection is from/to the same host/port; 0 otherwise   Discrete
urgent          number of urgent packets                                     Continuous
hot             number of "hot" indicators                                   Continuous


Objective

Let's be clear about what we wanted to do.

We have intrusion-labeled data and incoming traffic.

Decide whether the incoming traffic contains any abnormality.

If an abnormality is present, classify it into a specific category.


Motivation

Reinventing the wheel or what?

Building a more accurate prediction model.

Adaptive learning for the model.

Detecting novel intrusions.

Performance comparison among existing learning models.

Artificial neural networks and support vector machines have already been used.


Proposed Model

Overview

Figure: Network Intrusion Detection Model


Algorithm

In brief

Figure: Network Intrusion Detection Model


Classifiers Used

Three major classifiers used

Classifiers: Naive Bayes, Random Forests, k-Nearest Neighbor (IBK).


Naive Bayes Classifier

How does it work, in simple terms?

The value of a particular feature is assumed to be unrelated to the presence or absence of any other feature, given the class variable.

Example: A fruit may be considered to be an apple if it is red, round, and about 3 inches in diameter.

Each of these features is considered to contribute independently to the probability that this fruit is an apple, regardless of the presence or absence of the other features.

It can be trained very efficiently in a supervised learning setting.

It requires only a small amount of training data to estimate the parameters (means and variances of the variables) necessary for classification.


Naive Bayes Classifier

How does it work, in mathematical terms?

Bayes theorem:

$$p(C \mid F_1, \ldots, F_n) = \frac{p(C)\, p(F_1, \ldots, F_n \mid C)}{p(F_1, \ldots, F_n)}$$

$$\text{posterior} = \frac{\text{prior} \times \text{likelihood}}{\text{evidence}}$$

In our problem:

C = Anomaly / Normal

$F_1, \ldots, F_n$ = the features

n = number of features

Figure: Prediction based on recent events
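The posterior above can be sketched in a few lines of Python. This is only an illustration, not the configuration used in the experiment: it assumes each feature follows a Gaussian distribution within a class (in line with the "means and variances" remark on the previous slide), and the function names and toy connection records are hypothetical. The evidence term is dropped because it is the same for every class and does not change which class scores highest.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate the prior, feature means and feature variances of each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def log_posterior(model, row):
    """Return log p(C) + sum_i log p(F_i | C) for every class C."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for value, mean, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        scores[label] = score
    return scores


def predict(model, row):
    """Pick the class with the highest (unnormalized) log posterior."""
    scores = log_posterior(model, row)
    return max(scores, key=scores.get)


# Hypothetical toy records: [duration, urgent] per connection.
train_rows = [[1.0, 0.0], [2.0, 0.0], [1.5, 1.0],
              [30.0, 4.0], [28.0, 5.0], [35.0, 3.0]]
train_labels = ["normal"] * 3 + ["anomaly"] * 3

model = fit_gaussian_nb(train_rows, train_labels)
print(predict(model, [1.2, 0.0]))
print(predict(model, [31.0, 4.0]))
```

Working in log space avoids numerical underflow when many per-feature probabilities are multiplied together.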


Random Forests

How does it work?

Training set $X = x_1, \ldots, x_n$ with class labels/responses $Y = y_1, \ldots, y_n$.

For $b = 1, \ldots, B$: sample, with replacement, n training examples from X, Y; call these $X_b$, $Y_b$.

Train a decision or regression tree $f_b$ on $X_b$, $Y_b$.

Predictions for an unseen sample $x'$ can be made by averaging the predictions from all the individual regression trees on $x'$ (or, for classification trees, by taking the majority vote):

$$\hat{f} = \frac{1}{B} \sum_{b=1}^{B} \hat{f}_b(x')$$

Figure: Random forests
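The bootstrap-and-aggregate loop described on this slide can be sketched as follows. This is a simplified illustration under stated assumptions, not the random forest implementation used in the experiment: each "tree" is a one-split decision stump, there is no random feature subsampling, and the function names and toy data are hypothetical. Because the task is classification, the ensemble takes a majority vote instead of an average.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-split tree: the (feature, threshold) with the fewest errors."""
    majority = Counter(labels).most_common(1)[0][0]
    best_errors, best_stump = len(rows) + 1, None
    for feature in range(len(rows[0])):
        values = sorted({row[feature] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(lab != left_label for lab in left)
            errors += sum(lab != right_label for lab in right)
            if errors < best_errors:
                best_errors = errors
                best_stump = (feature, threshold, left_label, right_label)
    if best_stump is None:
        # No split is possible (all rows identical): always predict the majority.
        return (0, float("inf"), majority, majority)
    return best_stump


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def train_bagged(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample (X_b, Y_b)."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # n draws with replacement
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_bagged(forest, row):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy records: [duration, urgent] per connection.
rows = [[float(d), 0.0] for d in range(1, 11)] + [[float(d), 3.0] for d in range(30, 40)]
labels = ["normal"] * 10 + ["anomaly"] * 10

forest = train_bagged(rows, labels)
print(predict_bagged(forest, [2.0, 0.0]))
print(predict_bagged(forest, [35.0, 4.0]))
```

A full random forest would grow deeper trees and consider only a random subset of features at each split; the resampling and voting steps stay the same.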


k-Nearest Neighbor

Instance-based KNN (IBK)

Classify an unknown example with the most common class among the k closest examples.

Tell me who your neighbors are, and I will tell you who you are!

Example: k = 3, with 2 sea bass and 1 salmon among the neighbors.

Classified as sea bass.

Figure: Simple example for the idea.
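The voting rule can be sketched in Python as below. The fish measurements and the name `knn_classify` are hypothetical, chosen so that the three nearest neighbors of the query are two sea bass and one salmon, as in the example; Euclidean distance is assumed.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical measurements: (length, lightness) -> species.
train = [
    ((10.0, 2.0), "sea bass"),
    ((11.0, 2.5), "sea bass"),
    ((9.0, 6.0), "salmon"),
    ((4.0, 7.0), "salmon"),
    ((5.0, 8.0), "salmon"),
]

print(knn_classify(train, (10.0, 3.0), k=3))
```

With k = 3 the neighbors are the two sea bass and the salmon at (9.0, 6.0), so the vote is 2 to 1 for sea bass.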


k-NN Distance Selection

Worst case scenario

Feature 1 gives the correct class: 1 or 2.

Feature 2 is an irrelevant number between 100 and 200.

Training dataset: [1 150] [2 110]

Classify [1 100]

$$D([1\ 100], [1\ 150]) = \sqrt{(1 - 1)^2 + (100 - 150)^2} = 50 \tag{1}$$

$$D([1\ 100], [2\ 110]) = \sqrt{(1 - 2)^2 + (100 - 110)^2} = \sqrt{101} \approx 10.05 \tag{2}$$

The nearest neighbor is [2 110], so [1 100] is misclassified as class 2!

The denser the samples, the less severe this problem.
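The two distances can be checked with a short script. It uses the training points and query from this slide, with the first feature doubling as the class label; the variable names are illustrative only.

```python
import math

# (point, class) pairs from the slide; the class equals the first feature.
train = [((1, 150), 1), ((2, 110), 2)]
query = (1, 100)

distances = [(math.dist(point, query), label) for point, label in train]
nearest_distance, predicted = min(distances)

for distance, label in distances:
    print(f"class {label}: distance {distance:.2f}")
print("1-NN prediction:", predicted)
```

The irrelevant second feature dominates the distance, so the nearest neighbor is the class 2 point even though the query's first feature says class 1.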


k-NN: Feature Normalization

Equalizing the scale of the features.

Notice that the 2 features are on different scales:

The first feature takes the value 1 or 2.

The second feature takes values between 100 and 200.

Idea: normalize the features to be on the same scale.

There are different normalization approaches.

One is to linearly scale the range of each feature to be, say, in the range [0, 1]:

$$f_{new} = \frac{f_{old} - f_{old}^{\min}}{f_{old}^{\max} - f_{old}^{\min}} \tag{3}$$
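Applying this scaling to the worst-case example from the previous slide shows the effect. The sketch below assumes the feature ranges stated in the text (feature 1 in [1, 2], feature 2 in [100, 200]) rather than ranges estimated from the two training points; the helper names are hypothetical.

```python
import math

# Feature ranges as stated in the text: (min, max) per feature.
RANGES = [(1, 2), (100, 200)]


def min_max(value, low, high):
    """Linearly map value from [low, high] to [0, 1]."""
    return (value - low) / (high - low)


def normalize(point):
    return tuple(min_max(v, low, high) for v, (low, high) in zip(point, RANGES))


train = [((1, 150), 1), ((2, 110), 2)]
query = normalize((1, 100))

distances = [(math.dist(normalize(point), query), label) for point, label in train]
nearest_distance, predicted = min(distances)
print("1-NN prediction after scaling:", predicted)
```

After scaling, the class 1 point lies at distance 0.5 and the class 2 point at about 1.005, so [1 100] is now assigned to class 1.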


k-NN: How to Choose k?

Is there any standard?

Figure: Sometimes, due to noise, 1-NN provides an erroneous outcome.

Figure: 3-NN provides better classification accuracy than 1-NN in this case.

Rule of thumb: $k < \sqrt{n}$, where n is the number of examples.

In practice, k = 1 is often used for efficiency, but it can be sensitive to noise.

Larger k may improve performance.


Dataset Distributions

Anomaly and normal quantity

Category   No. of Instances
Normal     67343
Anomaly    58630
Total      125973

Table: Dataset Used in the Experiment

Category   No. of Instances   Contribution
DOS        9234               Continuous
U2R        11                 Continuous
R2L        209                Continuous
Probe      2289               Continuous

Table: Distribution of Reduced Dataset for Anomaly Class


Feature Reduction

Too much for computation.

Attribute Evaluator   Search Method         No. of Selected Attributes   Selected Attributes
CFS                   Genetic Search        15                           4,5,6,8,10,12,17,23,26,29,30,32,37,38,39
CFS                   PSO Search            9                            4,5,6,12,26,29,30,37,39
CFS                   Best First            6                            4,5,6,12,26,30
CFS                   Evolutionary Search   18                           3,4,5,6,8,17,19,23,25,26,29,30,33,34,37,38,39,41
Consistency Subset    Greedy Stepwise       10                           1,3,4,5,14,23,32,34,35,37

Table: Feature Reduction

Reduce the number of features without affecting the accuracy, in order to lower the computational cost.
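Once a subset has been chosen, applying it to a record is a simple projection. The sketch below assumes the attribute numbers in the table are 1-based positions in a 41-attribute record; both points are assumptions, as is the helper name, and the placeholder record only serves to make the result easy to read.

```python
# Attribute numbers taken from the "CFS / Best First" row of the table.
BEST_FIRST = [4, 5, 6, 12, 26, 30]


def select_attributes(record, indices):
    """Keep only the attributes at the given 1-based positions."""
    return [record[i - 1] for i in indices]


# Placeholder record: the i-th attribute simply holds the value i.
record = list(range(1, 42))

reduced = select_attributes(record, BEST_FIRST)
print(len(record), "->", len(reduced), "attributes:", reduced)
```

Every later distance computation or tree split then touches 6 values per record instead of 41.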


Detailed Accuracy By Class

10-fold Cross Validation for Random Forest

Table: Detailed Accuracy By Class: 10-fold Cross Validation for Random Forest

TP Rate   FP Rate   Precision   Recall   F-Measure   MCC     ROC Area   PRC Area   Type
0.999     0.002     0.998       0.999    0.999       0.998   1.000      1.000      normal
0.998     0.001     0.999       0.998    0.999       0.998   1.000      1.000      anomaly

Table: Confusion Matrix for Random Forest (rows are the actual classes, columns the predicted classes)

a       b       Classified As
67308   35      a = normal
117     58513   b = anomaly
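The per-class figures can be recomputed from the confusion matrix. The sketch below reads each row as the actual class and each column as the predicted class; the function name is illustrative.

```python
def class_metrics(tp, fn, fp, tn):
    """Per-class rates derived from confusion-matrix counts."""
    tp_rate = tp / (tp + fn)      # also the recall
    fp_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    return {
        "tp_rate": tp_rate,
        "fp_rate": fp_rate,
        "precision": precision,
        "f_measure": f_measure,
    }


# Rows: actual normal, actual anomaly. Columns: predicted normal, predicted anomaly.
matrix = [[67308, 35], [117, 58513]]

normal = class_metrics(tp=67308, fn=35, fp=117, tn=58513)
anomaly = class_metrics(tp=58513, fn=117, fp=35, tn=67308)
accuracy = (matrix[0][0] + matrix[1][1]) / sum(map(sum, matrix))

for name, metrics in (("normal", normal), ("anomaly", anomaly)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
print("accuracy:", round(accuracy, 4))
```

Rounded to three decimals these values match the table, and the overall accuracy comes out at about 99.88% (125821 of 125973 instances classified correctly).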


Classification Accuracy

Based on confusion matrix

Figure: Classification accuracy for different learning/classification algorithms (x-axis: machine learning classifier, covering NaiveBayes, PART, RandomForest, Grading, Adaboost, and IBK; y-axis: accuracy in %, from 0 to 100). The major parameters were tuned for each execution.


Tools and Equipment

Those that came in handy

KDD Cup 1999: Dataset.

MySQL: Data preprocessing.

MATLAB: Algorithm testing and graph generation.

WEKA 3.7.9: Actual classification.




Future Directions

Let's think about the next level!

Classify the anomaly class into further specific divisions.

Use unsupervised learning methods.

Develop a knowledge base.


Questions? Suggestions?
