International Journal of Computer Science & Information Security © IJCSIS PUBLICATION 2010 IJCSIS Vol. 8 No. 1, April 2010 ISSN 1947-5500

IJCSIS Volume 8 No. 1 April 2010


    International Journal of

    Computer Science

    & Information Security

    IJCSIS PUBLICATION 2010

    IJCSIS Vol. 8 No. 1, April 2010

    ISSN 1947-5500


    Editorial

    Message from Managing Editor

    International Journal of Computer Science and Information Security (IJCSIS)

    provides a major venue for rapid publication of high quality computer science research,

    including multimedia, information science, security, mobile & wireless network, data

    mining, software engineering and emerging technologies etc. IJCSIS has continued to

    make progress and has attracted the attention of researchers worldwide, as indicated by

    the increasing number of both submissions and published papers, and also from the

    web statistics. It is included in major indexing and abstracting services.

    We thank all those authors who contributed papers to the April 2010 issue and the

    reviewers, all of whom responded to a short and challenging timetable. We are

    committed to placing this journal at the forefront for the dissemination of novel and

    exciting research. We should like to remind all prospective authors that IJCSIS does

    not have a page restriction. We look forward to receiving your submissions and to

    receiving feedback.

    IJCSIS April 2010 Issue (Vol. 8, No. 1) has an acceptance rate of 35%.

    Special thanks to our technical sponsors for their valuable service.

    Available at http://sites.google.com/site/ijcsis/

    IJCSIS Vol. 8, No. 1, April 2010 Edition

    ISSN 1947-5500 IJCSIS 2010, USA.

    Indexed by (among others):


    IJCSIS EDITORIAL BOARD

    Dr. Gregorio Martinez Perez, Associate Professor - Professor Titular de Universidad, University of Murcia (UMU), Spain

    Dr. M. Emre Celebi, Assistant Professor, Department of Computer Science, Louisiana State University in Shreveport, USA

    Dr. Yong Li, School of Electronic and Information Engineering, Beijing Jiaotong University, P. R. China

    Prof. Hamid Reza Naji, Department of Computer Engineering, Shahid Beheshti University, Tehran, Iran

    Dr. Sanjay Jasola, Professor and Dean, School of Information and Communication Technology, Gautam Buddha University

    Dr. Riktesh Srivastava, Assistant Professor, Information Systems, Skyline University College, University City of Sharjah, Sharjah, PO 1797, UAE

    Dr. Siddhivinayak Kulkarni, University of Ballarat, Ballarat, Victoria, Australia

    Professor (Dr) Mokhtar Beldjehem, Sainte-Anne University, Halifax, NS, Canada

    Dr. Alex Pappachen James (Research Fellow), Queensland Micro-nanotechnology Center, Griffith University, Australia


    TABLE OF CONTENTS

    1. Paper 29031048: Buffer Management Algorithm Design and Implementation Based on Network

    Processors (pp. 1-8)

    Yechang Fang, Kang Yen, Dept. of Electrical and Computer Engineering, Florida International University, Miami, USA

    Deng Pan, Zhuo Sun, School of Computing and Information Sciences, Florida International University, Miami, USA

    2. Paper 08031001: Multistage Hybrid Arabic/Indian Numeral OCR System (pp. 9-18)

    Yasser M. Alginaih, Ph.D., P.Eng. IEEE Member, Dept. of Computer Science, Taibah University, Madinah, Kingdom of Saudi Arabia

    Abdul Ahad Siddiqi, Ph.D., Member IEEE & PEC, Dept. of Computer Science, Taibah University,

    Madinah, Kingdom of Saudi Arabia

    3. Paper 30031056: Attribute Weighting with Adaptive NBTree for Reducing False Positives in

    Intrusion Detection (pp. 19-26)

    Dewan Md. Farid and Jerome Darmont, ERIC Laboratory, University Lumière Lyon 2, Bat L - 5 av.

    Pierre Mendes, France, 69676 BRON Cedex, France

    Mohammad Zahidur Rahman, Department of Computer Science and Engineering, Jahangirnagar

    University, Dhaka 1342, Bangladesh

    4. Paper 30031053: Improving Overhead Computation and pre-processing Time for Grid Scheduling

    System (pp. 27-34)

    Asgarali Bouyer, Mohammad javad hoseyni, Department of Computer Science, Islamic Azad University-

    Miyandoab branch, Miyandoab, Iran

    Abdul Hanan Abdullah, Faculty Of Computer Science And Information Systems, Universiti Teknologi

    Malaysia, Johor, Malaysia

    5. Paper 20031026: The New Embedded System Design Methodology For Improving Design Process

    Performance (pp. 35-43)

    Maman Abdurohman, Informatics Faculty, Telecom Institute of Technology, Bandung, Indonesia

    Kuspriyanto, STEI Faculty, Bandung Institute of Technology, Bandung, Indonesia

    Sarwono Sutikno, STEI Faculty, Bandung Institute of Technology, Bandung, Indonesia

    Arif Sasongko, STEI Faculty, Bandung Institute of Technology, Bandung, Indonesia

    6. Paper 30031060: Semi-Trusted Mixer Based Privacy Preserving Distributed Data Mining for

    Resource Constrained Devices (pp. 44-51)

    Md. Golam Kaosar, School of Engineering and Science, Victoria University, Melbourne, Australia

    Xun Yi, Associate Professor, School of Engineering and Science, Victoria University, Melbourne,

    Australia

    7. Paper 12031005: Adaptive Slot Allocation And Bandwidth Sharing For Prioritized Handoff Calls

    In Mobile Netwoks (pp. 52-57)

    S. Malathy, Research Scholar, Anna University, Coimbatore

    G. Sudha Sadhasivam, Professor, CSE Department, PSG College of Technology, Coimbatore.

    K. Murugan, Lecturer, IT Department, Hindusthan Institute of Technology, Coimbatore


    S. Lokesh, Lecturer, CSE Department, Hindusthan Institute of Technology, Coimbatore

    8. Paper 12031009: An Efficient Vein Pattern-based Recognition System (pp. 58-63)

    Mohit Soni, DFS, New Delhi- 110003, INDIA.

    Sandesh Gupta, UIET, CSJMU, Kanpur-208014, INDIA.

    M.S. Rao, DFS, New Delhi-110003, INDIA

    Phalguni Gupta, Professor, IIT Kanpur, Kanpur-208016, INDIA.

    9. Paper 15031013: Extending Logical Networking Concepts in Overlay Network-on-Chip

    Architectures (pp. 64-67)

    Omar Tayan, College of Computer Science and Engineering, Department of Computer Science, Taibah University, Saudi Arabia, P.O. Box 30002

    10. Paper 15031015: Effective Bandwidth Utilization in IEEE802.11 for VOIP (pp. 68-75)

    S. Vijay Bhanu, Research Scholar, Anna University, Coimbatore, Tamilnadu, India, Pincode-641013.

    Dr. RM. Chandrasekaran, Registrar, Anna University, Trichy, Tamilnadu, India, Pincode: 620024.

    Dr. V. Balakrishnan, Research Co-Supervisor, Anna University, Coimbatore.

    11. Paper 16021024: ECG Feature Extraction Techniques - A Survey Approach (pp. 76-80)

    S. Karpagachelvi, Mother Teresa Women's University, Kodaikanal, Tamilnadu, India.

    Dr. M. Arthanari, Tejaa Shakthi Institute of Technology for Women, Coimbatore- 641 659, Tamilnadu, India.

    M. Sivakumar, Anna University Coimbatore, Tamilnadu, India

    12. Paper 18031017: Implementation of the Six Channel Redundancy to achieve fault tolerance in

    testing of satellites (pp. 81-85)

    H S Aravinda *, Dr H D Maheshappa**, Dr Ranjan Moodithaya ***

    * Department of Electronics and Communication, REVA ITM, Bangalore-64, Karnataka, India.

    ** Director & Principal, East Point College of Engg, Bidarahalli, Bangalore-40, Karnataka, India.

    *** Head, KTMD Division, National Aerospace Laboratories, Bangalore-17, Karnataka, India.

    13. Paper 18031018: Performance Oriented Query Processing In GEO Based Location Search

    Engines (pp. 86-94)

    Dr. M. Umamaheswari, Bharath University, Chennai-73, Tamil Nadu, India

    S. Sivasubramanian, Bharath University, Chennai-73, Tamil Nadu, India

    14. Paper 20031027: Tunable Multifunction Filter Using Current Conveyor (pp. 95-98)

    Manish Kumar, Electronics and Communication Engineering Department, Jaypee Institute of Information Technology, Noida, India

    M.C. Srivastava, Electronics and Communication Engineering Department, Jaypee Institute of Information Technology, Noida, India

    Umesh Kumar, Electrical Engineering Department, Indian Institute of Technology, Delhi, India

    15. Paper 17031042: Artificial Neural Network based Diagnostic Model For Causes of Success and

    Failures (pp. 95-105)

    Bikrampal Kaur, Chandigarh Engineering College, Mohali, India

    Dr. Himanshu Aggarwal, Punjabi University, Patiala-147002, India


    16. Paper 28031045: Detecting Security threats in the Router using Computational Intelligence (pp.

    106-111)

    J. Visumathi, Research Scholar, Sathyabama University, Chennai-600 119

    Dr. K. L. Shunmuganathan, Professor & Head, Department of CSE, R.M.K. Engineering College, Chennai-

    601 206

    17. Paper 31031091: A Novel Algorithm for Informative Meta Similarity Clusters Using Minimum

    Spanning Tree (pp. 112-120)

    S. John Peter, Department of Computer Science and Research Center, St. Xaviers College, Palayamkottai,

    Tamil Nadu, India

    S. P. Victor, Department of Computer Science and Research Center, St. Xaviers College, Palayamkottai, Tamil Nadu, India

    18. Paper 23031032: Adaptive Tuning Algorithm for Performance tuning of Database Management

    System (pp. 121-124)

    S. F. Rodd, Department of Information Science and Engineering, KLS's Gogte Institute of Technology, Belgaum, INDIA

    Dr. U. P. Kulkarni, Department of Computer Science and Engineering, SDM College of Engineering and Technology, Dharwad, INDIA

    19. Paper 26031038: A Survey of Mobile WiMAX IEEE 802.16m Standard (pp. 125-131)

    Mr. Jha Rakesh, Deptt. Of E & T.C., SVNIT, Surat, India

    Mr. Wankhede Vishal A., Deptt. Of E & T.C., SVNIT, Surat, India

    Prof. Dr. Upena Dalal, Deptt. Of E & T.C., SVNIT, Surat, India

    20. Paper 27031040: An Analysis for Mining Imbalanced Datasets (pp. 132-137)

    T. Deepa, Faculty of Computer Science Department, Sri Ramakrishna College of Arts and Science for Women, Coimbatore, Tamilnadu, India.

    Dr. M. Punithavalli, Director & Head, Sri Ramakrishna College of Arts & Science for Women, Coimbatore, Tamil Nadu, India

    21. Paper 27031039: QoS Routing For Mobile Adhoc Networks And Performance Analysis Using

    OLSR Protocol (pp. 138-150)

    K.Oudidi, Si2M Laboratory, National School of Computer Science and Systems Analysis, Rabat, Morocco

    A. Hajami, Si2M Laboratory, National School of Computer Science and Systems Analysis, Rabat, Morocco

    M. Elkoutbi, Si2M Laboratory, National School of Computer Science and Systems Analysis, Rabat, Morocco

    22. Paper 28031047: Design of Simple and Efficient Revocation List Distribution in Urban Areas for

    VANETs (pp. 151-155)

    Ghassan Samara, National Advanced IPv6 Center, Universiti Sains Malaysia, Penang, Malaysia

    Sureswaran Ramadas, National Advanced IPv6 Center, Universiti Sains Malaysia, Penang, Malaysia

    Wafaa A.H. Al-Salihy, School of Computer Science, Universiti Sains Malaysia, Penang, Malaysia

    23. Paper 28031044: Software Process Improvization Framework For Indian Small Scale Software

    Organizations Using Fuzzy Logic (pp. 156-162)

    A. M. Kalpana, Research Scholar, Anna University Coimbatore, Tamilnadu, India


    Dr. A. Ebenezer Jeyakumar, Director/Academics, SREC, Coimbatore, Tamilnadu, India

    24. Paper 30031052: Urbanizing the Rural Agriculture - Knowledge Dissemination using Natural

    Language Processing (pp. 163-169)

    Priyanka Vij (Author), Student, Computer Science Engg., Lingaya's Institute of Mgt. & Tech, Faridabad, Haryana, India

    Harsh Chaudhary (Author), Student, Computer Science Engg., Lingaya's Institute of Mgt. & Tech, Faridabad, Haryana, India

    Priyatosh Kashyap (Author), Student, Computer Science Engg., Lingaya's Institute of Mgt. & Tech, Faridabad, Haryana, India

    25. Paper 31031073: A New Joint Lossless Compression And Encryption Scheme Combining A

    Binary Arithmetic Coding With A Pseudo Random Bit Generator (pp. 170-175)

    A. Masmoudi * , W. Puech **, And M. S. Bouhlel *

    * Research Unit: Sciences and Technologies of Image and Telecommunications, Higher Institute of

    Biotechnology, Sfax TUNISIA

    ** Laboratory LIRMM, UMR 5506 CNRS University of Montpellier II, 161, rue Ada, 34392 MONTPELLIER CEDEX 05, FRANCE

    26. Paper 15031012: A Collaborative Model for Data Privacy and its Legal Enforcement (pp. 176-182)

    Manasdeep, MSCLIS, IIIT Allahabad

    Damneet Singh Jolly, MSCLIS, IIIT Allahabad

    Amit Kumar Singh, MSCLIS, IIIT Allahabad

    Kamleshwar Singh, MSCLIS, IIIT Allahabad

    Mr Ashish Srivastava, Faculty, MSCLIS, IIIT Allahabad

    27. Paper 12031010: A New Exam Management System Based on Semi-Automated Answer Checking

    System (pp. 183-189)

    Arash Habibi Lashkari, Faculty of ICT, LIMKOKWING University of Creative Technology, CYBERJAYA, Selangor

    Dr. Edmund Ng Giap Weng, Faculty of Cognitive Sciences and Human Development, University Malaysia Sarawak (UNIMAS)

    Behrang Parhizkar, Faculty of Information, Communication And Technology, LIMKOKWING University of Creative Technology, CYBERJAYA, Selangor, Malaysia

    Siti Fazilah Shamsudin, Faculty of ICT, LIMKOKWING University of Creative Technology, CYBERJAYA,

    Selangor, Malaysia

    Jawad Tayyub, Software Engineering With Multimedia, LIMKOKWING University of Creative Technology,

    CYBERJAYA, Selangor, Malaysia

    28. Paper 30031064: Development of Multi-Agent System for Fire Accident Detection Using Gaia

    Methodology (pp. 190-194)

    Gowri. R, Kailas. A, Jeyaprakash.R, Carani Anirudh

    Department of Information Technology, Sri Manakula Vinayagar Engineering College, Puducherry 605107.

    29. Paper 19031022: Computational Fault Diagnosis Technique for Analog Electronic Circuits using

    Markov Parameters (pp. 195-202)

    V. Prasannamoorthy and N.Devarajan

    Department of Electrical Engineering, Government College of Technology, Coimbatore, India


    30. Paper 24031037: Applicability of Data Mining Techniques for Climate Prediction - A Survey

    Approach (pp. 203-206)

    Dr. S. Santhosh Baboo, Reader, PG and Research department of Computer Science, Dwaraka Doss

    Goverdhan Doss Vaishnav College, Chennai

    I. Kadar Shereef, Head, Department of Computer Applications, Sree Saraswathi Thyagaraja College,

    Pollachi

    31. Paper 17021025: Appliance Mobile Positioning System (AMPS) (An Advanced mobile

    Application) (pp. 207-215)

    Arash Habibi Lashkari, Faculty of ICT, LIMKOKWING University of Creative Technology,

    CYBERJAYA, Selangor, Malaysia

    Edmund Ng Giap Weng, Faculty of Cognitive Sciences and Human Development, University Malaysia

    Sarawak (UNIMAS)

    Behrang Parhizkar, Faculty of ICT, LIMKOKWING University of Creative Technology, CYBERJAYA,

    Selangor, Malaysia

    Hameedur Rahman, Software Engineering with Multimedia, LIMKOKWING University of Creative

    Technology, CYBERJAYA, Selangor, Malaysia

    32. Paper 24031036: A Survey on Data Mining Techniques for Gene Selection and Cancer Classification (pp. 216-221)

    Dr. S. Santhosh Baboo, Reader, PG and Research department of Computer Science, Dwaraka Doss Goverdhan Doss Vaishnav College, Chennai

    S. Sasikala, Head, Department of Computer Science, Sree Saraswathi Thyagaraja College, Pollachi

    33. Paper 23031033: Non-Blind Image Watermarking Scheme using DWT-SVD Domain (pp. 222-228)

    M. Devapriya, Asst. Professor, Dept of Computer Science, Government Arts College, Udumalpet.

    Dr. K. Ramar, Professor & HOD, Dept of CSE, National Engineering College, Kovilpatti -628 502.

    34. Paper 31031074: Speech Segmentation Algorithm Based On Fuzzy Memberships (pp. 229-233)

    Luis D. Huerta, Jose Antonio Huesca and Julio C. Contreras

    Departamento de Informática, Universidad del Istmo Campus Ixtepec, Ixtepec, Oaxaca, México

    35. Paper 30031058: How not to share a set of secrets (pp. 234-237)

    K. R. Sahasranand, Nithin Nagaraj, Department of Electronics and Communication Engineering, Amrita Vishwa Vidyapeetham, Amritapuri Campus, Kollam-690525, Kerala, India.

    Rajan S., Department of Mathematics, Amrita Vishwa Vidyapeetham, Amritapuri Campus, Kollam-690525, Kerala, India.

    36. Paper 30031057: Secure Framework for Mobile Devices to Access Grid Infrastructure (pp. 238-

    243)

    Kashif Munir, Computer Science and Engineering Technology Unit King Fahd University of Petroleum and Minerals HBCC Campus, King Faisal Street, Hafr Al Batin 31991

    Lawan Ahmad Mohammad, Computer Science and Engineering Technology Unit King Fahd University of

    Petroleum and Minerals HBCC Campus, King Faisal Street, Hafr Al Batin 31991

    37. Paper 31031076: DSP Specific Optimized Implementation of Viterbi Decoder (pp. 244-249)

    Yame Asfia and Dr Muhamamd Younis Javed, Department of Computer Engg, College of Electrical and

    Mechanical Engg, NUST, Rawalpindi, Pakistan


    Dr Muid-ur-Rahman Mufti, Department of Computer Engg, UET Taxila, Taxila, Pakistan

    38. Paper 31031089: Approach towards analyzing motion of mobile nodes- A survey and graphical

    representation (pp. 250-253)

    A. Kumar, Sir Padampat Singhania University, Udaipur, Rajasthan, India

    P. Chakrabarti, Sir Padampat Singhania University, Udaipur, Rajasthan, India

    P. Saini, Sir Padampat Singhania University, Udaipur, Rajasthan, India

    39. Paper 31031092: Recognition of Printed Bangla Document from Textual Image Using Multi-

    Layer Perceptron (MLP) Neural Network (pp. 254-259)

    Md. Musfique Anwar, Nasrin Sultana Shume, P. K. M. Moniruzzaman and Md. Al-Amin Bhuiyan

    Dept. of Computer Science & Engineering, Jahangirnagar University, Bangladesh

    40. Paper 31031081: Application Of Fuzzy System In Segmentation Of MRI Brain Tumor (pp. 261-

    270)

    Mrigank Rajya, Sonal Rewri, Swati Sheoran

    CSE, Lingaya's University, Limat, Faridabad India, New Delhi, India

    41. Paper 30031059: E-Speed Governors For Public Transport Vehicles (pp. 270-274)

    C. S. Sridhar, Dr. R. ShashiKumar, Dr. S. Madhava Kumar, Manjula Sridhar, Varun. D

    ECE dept, SJCIT, Chikkaballapur.

    42. Paper 31031087: Inaccuracy Minimization by Partioning Fuzzy Data Sets - Validation of

    Analystical Methodology (pp. 275-280)

    Arutchelvan. G, Department of Computer Science and Applications, Adhiparasakthi College of Arts and Science, G. B. Nagar, Kalavai, India

    Dr. Srivatsa S. K., Dept. of Electronics Engineering, Madras Institute of Technology, Anna University,

    Chennai, India

    Dr. Jagannathan. R, Vinayaka Mission University, Chennai, India

    43. Paper 30031065: Selection of Architecture Styles using Analytic Network Process for the

    Optimization of Software Architecture (pp. 281-288)

    K. Delhi Babu, S.V. University, Tirupati

    Dr. P. Govinda Rajulu, S.V. University, Tirupati

    Dr. A. Ramamohana Reddy, S.V. University, Tirupati

    Ms. A.N. Aruna Kumari, Sree Vidyanikethan Engg. College, Tirupati

    44. Paper 27031041: Clustering Time Series Data Stream - A Literature Survey (pp. 289-294)

    V. Kavitha, Computer Science Department, Sri Ramakrishna College of Arts and Science for Women, Coimbatore, Tamilnadu, India.

    M. Punithavalli, Sri Ramakrishna College of Arts & Science for Women, Coimbatore, Tamil Nadu, India.

    45. Paper 31031086: An Adaptive Power Efficient Packet Scheduling Algorithm for Wimax

    Networks (pp. 295-300)

    R Murali Prasad, Department of Electronics and Communications, MLR Institute of Technology, Hyderabad

    P. Satish Kumar, professor, Department of Electronics and Communications, CVR college of engineering,

    Hyderabad


    46. Paper 30041037: Content Base Image Retrieval Using Phong Shading (pp. 301-306)

    Uday Pratap Singh, LNCT, Bhopal (M.P) INDIA

    Sanjeev Jain, LNCT, Bhopal (M.P) INDIA

    Gulfishan Firdose Ahmed, LNCT, Bhopal (M.P) INDIA

    47. Paper 31031090: The Algorithm Analysis of E-Commerce Security Issues for Online Payment

    Transaction System in Banking Technology (pp. 307-312)

    Raju Barskar, MANIT Bhopal (M.P)

    Anjana Jayant Deen, CSE Department, UIT_RGPV, Bhopal (M.P)

    Jyoti Bharti, IT Department, MANIT, Bhopal (M.P)

    Gulfishan Firdose Ahmed, LNCT, Bhopal (M.P)

    48. Paper 28031046: Reduction in iron losses In Indirect Vector-Controlled IM Drive Using FLC (pp.

    313-317)

    Mr. C. Srisailam, Electrical Engineering Department, Jabalpur Engineering College, Jabalpur, Madhya Pradesh

    Mr. Mukesh Tiwari, Electrical Engineering Department, Jabalpur Engineering College, Jabalpur, Madhya Pradesh

    Dr. Anurag Trivedi, Electrical Engineering Department, Jabalpur Engineering College, Jabalpur, Madhya Pradesh

    49. Paper 31031071: Bio-Authentication based Secure Transmission System using Steganography (pp.

    318-324)

    Najme Zehra, Assistant Professor, Computer Science Department, Indira Gandhi Institute of Technology,

    GGSIPU, Delhi.

    Mansi Sharma, Scholar, Indira Gandhi Institute of Technology, GGSIPU, Delhi.

    Somya Ahuja, Scholar, Indira Gandhi Institute of Technology, GGSIPU, Delhi.

    Shubha Bansal, Scholar, Indira Gandhi Institute of Technology, GGSIPU, Delhi.

    50. Paper 31031068: Facial Recognition Technology: An analysis with scope in India (pp. 325-330)

    Dr.S.B.Thorat, Director, Institute of Technology and Mgmt, Nanded, Dist. - Nanded. (MS), India

    S. K. Nayak, Head, Dept. of Computer Science, Bahirji Smarak Mahavidyalaya, Basmathnagar, Dist. - Hingoli. (MS), India

    Miss. Jyoti P Dandale, Lecturer, Institute of Technology and Mgmt, Nanded, Dist. - Nanded. (MS), India

    51. Paper 31031069: Classification and Performance of AQM-Based Schemes for Congestion Avoidance (pp. 331-340)

    K. Chitra, Lecturer, Dept. of Computer Science, D.J. Academy for Managerial Excellence, Coimbatore, Tamil Nadu, India 641 032

    Dr. G. Padamavathi, Professor & Head, Dept. of Computer Science, Avinashilingam University for Women, Coimbatore, Tamil Nadu, India 641 043


    (IJCSIS) International Journal of Computer Science and Information Security,

    Vol. 8, No. 1, 2010

    Buffer Management Algorithm Design and

    Implementation Based on Network Processors

    Yechang Fang, Kang Yen
    Dept. of Electrical and Computer Engineering
    Florida International University
    Miami, USA
    {yfang003, yenk}@fiu.edu

    Deng Pan, Zhuo Sun
    School of Computing and Information Sciences
    Florida International University
    Miami, USA
    {pand, zsun003}@fiu.edu

    Abstract: To address the parameter sensitivity of the traditional RED (random early detection) algorithm, an adaptive buffer management algorithm called PAFD (packet adaptive fair dropping) is proposed. The algorithm supports the DiffServ (differentiated services) model of QoS (quality of service) and considers both fairness and throughput. A smooth buffer occupancy rate function is adopted to adjust the parameters. By implementing buffer management and packet scheduling on the Intel IXP2400, the viability of QoS mechanisms on NPs (network processors) is verified. The simulation shows that PAFD smooths the flow curve and achieves a better balance between fairness and network throughput. It also demonstrates that the algorithm meets the requirements of fast data packet processing and achieves higher utilization of NP hardware resources.

    Keywords: buffer management; packet dropping; queue management; network processor

    I. INTRODUCTION

    Network information is transmitted in the form of data flows, which consist of data packets. Different QoS therefore means different treatment of data flows, and this treatment involves assigning different priorities to data packets. A queue is a storage area inside a router or switch that holds IP packets together with their priority levels. A queue management algorithm is the method that determines the order in which the packets stored in the queue are sent. The fundamental requirement is to provide better and more timely service for high-priority packets [1]. The NP is a dedicated processing chip designed to run on high-speed networks and to achieve rapid packet processing.

    Queue management plays a significant role in the control of network transmission. It is the core mechanism for controlling network QoS and the key method for solving the network congestion problem. Queue management consists of buffer management and packet scheduling. Generally, buffer management is applied at the front of a queue and cooperates with packet scheduling to complete the queue operation [2, 3]. When a packet arrives at the front of a queue, buffer management decides whether to admit the packet into the buffer queue. Viewed another way, buffer management determines whether to drop the packet, so it is also known as dropping control.

    The control schemes of buffer management can be analyzed at two levels: data flow and data packet. At the data flow level, viewed from the aspect of system resource management, buffer management needs to adopt resource management schemes that allocate queue buffer resources fairly and effectively among the flows passing through the network nodes. At the data packet level, viewed from the aspect of packet dropping control, buffer management needs to adopt drop control schemes that decide under what circumstances a packet should be dropped, and which packet will be dropped. Considering the congestion control response of an end-to-end system, the transient effects of dropping different packets may vary greatly. However, statistics from long-term operation indicate that the gap between these transient effects is minimal and can be neglected in the majority of cases. In some specific circumstances, a completely shared resource management scheme can cooperate with drop schemes such as tail-drop and head-drop to reach effective control. In most cases, however, the interaction between the two kinds of scheme is strong, so the design of buffer management algorithms should consider both to obtain better control [4, 5].
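The tail-drop and head-drop schemes mentioned above can be illustrated with a minimal sketch. The function names, the packet labels, and the capacity counted in packets (assumed to be at least 1) are illustrative assumptions, not part of the paper.

```python
from collections import deque


def enqueue_tail_drop(queue, packet, capacity):
    """Tail-drop: when the queue is full, reject the arriving packet."""
    if len(queue) >= capacity:
        return packet              # the arriving packet is the one dropped
    queue.append(packet)
    return None                    # nothing was dropped


def enqueue_head_drop(queue, packet, capacity):
    """Head-drop: when the queue is full, evict the oldest packet instead."""
    dropped = None
    if len(queue) >= capacity:
        dropped = queue.popleft()  # oldest packet leaves to make room
    queue.append(packet)
    return dropped


# Hypothetical two-packet queues that are already full.
tail_queue = deque(["p1", "p2"])
head_queue = deque(["p1", "p2"])
tail_dropped = enqueue_tail_drop(tail_queue, "p3", capacity=2)
head_dropped = enqueue_head_drop(head_queue, "p3", capacity=2)
```

Both functions return the dropped packet (or `None`), so the only difference is which end of the queue pays for the overflow.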


    II. EXISTING BUFFER MANAGEMENT ALGORITHMS

    Reference [6] proposed the RED algorithm as an active queue management (AQM) mechanism [7], which was then standardized as a recommendation from the IETF [8]. It introduces congestion control into the router's queue operations. RED uses an early random drop scheme to spread packet dropping over time. The algorithm can effectively reduce and even avoid network congestion, and it also solves the TCP global synchronization problem.

    However, one concern with the RED algorithm is stability: its performance is very sensitive to the control parameters and to changes in network traffic load. Under heavy traffic, the performance of RED drops drastically. Because RED is based on the best-effort service model, which does not consider different service levels or different user flows, it cannot provide fairness. To improve fairness and stability, several improved algorithms have been developed, including WRED, SRED, Adaptive-RED, FRED, and RED with In/Out (RIO) [9, 10]. These algorithms still have many problems. For example, a large number of studies have shown that it is difficult to find a RIO parameter setting suitable for varied and changing network conditions.
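For orientation, the parameter sensitivity discussed above can be seen in a simplified sketch of the classic RED drop decision. This follows the commonly cited formulation (an exponentially weighted average queue length and a linear drop ramp between two thresholds), not anything specified in this paper. The parameter names `min_th`, `max_th`, `max_p`, and `weight` are assumptions, and the count-based correction of the original algorithm is omitted.

```python
def update_average(avg, queue_len, weight):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def red_drop_probability(avg, min_th, max_th, max_p):
    """Drop probability as a function of the average queue length."""
    if avg < min_th:
        return 0.0                 # no congestion: never drop
    if avg >= max_th:
        return 1.0                 # heavy congestion: always drop
    # Linear ramp from 0 to max_p between the two thresholds.
    return max_p * (avg - min_th) / (max_th - min_th)


# Hypothetical settings: thresholds of 5 and 15 packets, max_p of 0.1.
p_low = red_drop_probability(4.0, 5.0, 15.0, 0.1)
p_mid = red_drop_probability(10.0, 5.0, 15.0, 0.1)
p_high = red_drop_probability(15.0, 5.0, 15.0, 0.1)
```

The behavior depends on four hand-tuned values, which is the sensitivity that motivates adaptive schemes.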

    III. THE PAFD ALGORITHM

    In this paper, we propose a new buffer management algorithm called PAFD (Packet Adaptive Fair Dropping). The algorithm adaptively balances congestion and fairness according to the congestion state of the cache. Under minor congestion, the algorithm tends to drop packets fairly, so that all users access the system resources according to their scale. Under moderate congestion, the algorithm inclines to drop packets of low-quality service flows, reducing their sending rate through the scheduling algorithm to alleviate congestion. Under severe congestion, the algorithm again tends to drop packets fairly, relies on the upper-layer flow control mechanism to meet the QoS requirements, and reduces the sending rate of most service flows in order to ease the congestion faster.

    In buffer management or packet scheduling algorithms, reserving resources in advance for service flows with better transmission conditions improves system performance. However, this makes system resources such as buffer space and bandwidth unfairly distributed, so that the QoS of service flows with poor transmission conditions cannot be guaranteed. Packet scheduling algorithms usually use generalized processor sharing (GPS) as the reference model of fairness. In packet scheduling algorithms based on GPS, each service flow is assigned a static weight that expresses its QoS. The weight $\phi_i$ expresses the share of service flow $i$ in the entire bandwidth $B$. $\phi_i$ does not change with the packet scheduling algorithm, and the weights satisfy

    $$\sum_{i=1}^{N} \phi_i = 1 \qquad (1)$$

    where $N$ is the number of service flows in the link. The service volume is described by

    $$g_i^{inc} = \frac{\phi_i}{\sum_{j} \phi_j} B \qquad (2)$$

    where $i$ and $j$ denote two different service flows. In GPS-based algorithms, the bandwidth allocation of different service flows satisfies $B_i/\phi_i = B_j/\phi_j$, where $B_i$ is the bandwidth allocated to service flow $i$. By assigning a smaller weight to an unimportant background service flow, the weight $\phi_{high}$ of a high-priority service flow becomes much larger than $\phi_{low}$, so that the majority of the bandwidth is used by high-priority service flows.
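A small sketch of the weighted bandwidth split described above, under the assumption that every flow is backlogged so that each flow's share is proportional to its weight. The flow names, the weights, and the 100-unit link are hypothetical.

```python
def gps_allocation(weights, total_bandwidth):
    """Split total_bandwidth among flows in proportion to their GPS weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {flow: total_bandwidth * w / total for flow, w in weights.items()}


# Hypothetical weights summing to 1; the background flow gets the smallest one.
weights = {"voice": 0.6, "video": 0.3, "background": 0.1}
alloc = gps_allocation(weights, total_bandwidth=100.0)

# The ratio of allocated bandwidth to weight is the same for every flow.
ratios = [alloc[flow] / weights[flow] for flow in weights]
```

Dividing by the sum of the weights keeps the split proportional even when floating-point weights do not add up to exactly 1.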

    A. Algorithm Description

    In buffer management algorithms, controlling the buffer space occupation is the key issue [11]. Here we define

    $$\frac{C_i}{W_i} = \frac{C_j}{W_j} \qquad (3)$$

    where Ci is the buffer space occupation, and Wi expresses

    the synthetic weight of the service flow i. When the cache is

    full, the service flow with the largest value ofCi/Wi will bedropped in order to guarantee fairness. Here the fairness is

    reflected in packets with different queue length [12, 13].

    Assume that ui is the weight, and vi is the current queue

    length of the service flow i. The synthetic weight Wican be

    calculated as described by

    (1 )i i iW u v = + (4)

    where is the adjust parameter of the two weighting

    coefficients ui and vi . can be pre-assigned, or determinedin accordance with usage of the cache. ui is related to the


    service flow itself, and different service flows are assigned different weight values. As long as the service flow is active, this factor remains unchanged. $v_i$ is time varying and reflects the dropping situation of the current service flow.

    Suppose a new packet T arrives. The PAFD algorithm then proceeds as follows:

    Step 1: Check whether the remaining cache space can accommodate the packet T. If the remaining space is greater than or equal to the length of T, add T to the cache queue. Otherwise, drop some packets from the cache to free enough storage space. The decision on which packet to drop is given in the following steps.

    Step 2: Calculate the weighting coefficients u and v for each service flow, and the parameter α. Then compute the new synthetic weight W for each flow according to (4).

    Step 3: Choose the service flow with the largest weighted buffer space occupation ($C_i/W_i$). If the service flow associated with the packet T has the same value, drop T with probability P and return. Otherwise, drop the head packet of the service flow with the largest weighted buffer space occupation with probability 1−P, and add T to the cache queue. Here the probability P is a random number generated by the system to ensure the smoothness and stability of the process.

    Step 4: Check whether the remaining space can accommodate another new packet. If so, the packet is transmitted into the cache. Otherwise, return to Step 3 to continue choosing and dropping packets until there is sufficient space.
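The steps above can be sketched as follows. This is a simplified, deterministic illustration rather than the authors' implementation: the random probability P of Step 3 is omitted, the synthetic weight is assumed to be W = α·u + (1−α)·v with α in (0, 1] and positive static weights, and the class name `PafdBuffer` and its methods are hypothetical.

```python
from collections import deque


class PafdBuffer:
    """Shared buffer holding one FIFO queue of packet lengths per service flow."""

    def __init__(self, capacity, static_weights, alpha=0.5):
        if not 0 < alpha <= 1:
            raise ValueError("alpha must lie in (0, 1]")
        self.capacity = capacity
        self.u = dict(static_weights)          # static weight u_i per flow (assumed > 0)
        self.alpha = alpha
        self.queues = {flow: deque() for flow in self.u}
        self.used = 0                          # total buffer space in use

    def weighted_occupation(self, flow):
        """Return C_i / W_i for one flow."""
        queue = self.queues[flow]
        c = sum(queue)                         # C_i: buffer space occupied
        v = len(queue)                         # v_i: current queue length
        w = self.alpha * self.u[flow] + (1 - self.alpha) * v
        return c / w

    def enqueue(self, flow, length):
        """Try to store a packet; return True if stored, False if dropped."""
        if length > self.capacity:
            return False                       # can never fit
        # Steps 3 and 4: drop packets until the new packet fits.
        while self.capacity - self.used < length:
            backlogged = [f for f in self.queues if self.queues[f]]
            victim = max(backlogged, key=self.weighted_occupation)
            if self.weighted_occupation(flow) >= self.weighted_occupation(victim):
                return False                   # arriving flow is the worst offender: drop T
            self.used -= self.queues[victim].popleft()   # drop the victim's head packet
        # Step 1: enough space, so enqueue.
        self.queues[flow].append(length)
        self.used += length
        return True


# Hypothetical usage: a 10-unit buffer shared by a high- and a low-priority flow.
buf = PafdBuffer(capacity=10, static_weights={"hi": 0.8, "lo": 0.2})
results = [buf.enqueue("lo", 4), buf.enqueue("lo", 4),
           buf.enqueue("hi", 4), buf.enqueue("lo", 4)]
```

In this run the third call evicts the head packet of the low-priority flow to admit the high-priority packet, and the fourth call is rejected because the low-priority flow already has the largest weighted occupation.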

    If all packet lengths are the same, the algorithm needs only one cycle to compare and select the service flow with the largest weighted buffer space occupation. Therefore, the time complexity of the algorithm is O(N). In this case, we also need an additional 4N storage space to store the weights. Taking into account the limited capacity of wireless networks, N is usually less than 100, so in general the algorithm's time and space overhead is not large. On the other hand, if packet lengths differ, it is necessary to cycle through Step 3 and Step 4 until the cache has enough space to accommodate the new packet. The maximum number of cycles is related to the ratio between the longest and the shortest packets. In this case, the time overhead is still small in practice.

    In Step 2, α, a function of the shared buffer, is a parameter for adjusting the proportion of the two weighting coefficients u and v. For a large value of α, the PAFD algorithm tends to select and drop packets fairly according to the synthetic weight W. Otherwise, the algorithm tends to select and drop packets from the service flow with a large queue length. A reasonable value of α can be used to balance fairness and performance. Here we introduce an adaptive method to determine the value of α. This method determines α based on the congestion situation of the cache, and the process does not require manual intervention.

    When there is minor congestion, it can be relieved by reducing the sending rate of a small number of service flows. The number of service flows in wireless network nodes is smaller than in a wired network, so minor congestion can be relieved by reducing the sending rate of any one of the service flows. We want this choice to be fair, to ensure that all users access the system resources according to their weights.

    When there is moderate congestion, it cannot be relieved by reducing the sending rate of an arbitrary service flow. Reducing the rate of different service flows produces different results. We want to reduce the rate of the service flows that are most effective in relieving congestion, that is, the service flow whose current queue length is the longest (the time this service flow has occupied the cache is also the longest). This not only improves system throughput, but also speeds up congestion relief.

    When there is severe congestion, reducing the sending rate of a small portion of the service flows clearly cannot relieve it. We may need to reduce the rate of many service flows. Since TCP has the characteristic of additive increase multiplicative decrease (AIMD), continuously dropping packets from one service flow to reduce its sending rate would adversely affect the performance of that TCP flow. As the effect on relieving system congestion becomes smaller, we gradually increase the value of the parameter α, so that the algorithm chooses the service flows to drop packets from fairly. On one hand, at this point the "fairness" can bring the same benefits as in the


    minor congestion case; on the other hand, this avoids continuously dropping packets from the service flow with the longer queue.

    Congestion is measured by the system buffer space occupation rate. α is a parameter relevant to the system congestion status and its value is between 0 and 1. Assume that the current buffer space occupation rate is denoted by $Buffer_{cur}$, and that $Buffer_{medium}$, $Buffer_{min}$, and $Buffer_{max}$ represent the threshold values of the buffer space occupation rate for moderate, minor, and severe congestion, respectively.

    When $Buffer_{cur}$ is close to $Buffer_{min}$, the system enters a state of minor congestion. When $Buffer_{cur}$ reaches $Buffer_{max}$, the system is in a state of severe congestion. $Buffer_{medium}$ means moderate congestion. If we set the value of α using a linear approach, the system will oscillate dramatically. Instead we use a high-order nonlinear or exponential reduction to obtain a smooth curve of α, as shown in Figure 1.

    Fig. 1. An adaptive curve of α

    The value of α can also be calculated as below

    α = 0, if …

    [3] United States Postal Services, "Postal Addressing Standards".

    Updated, Viewed on, May 28, 2009,

    [4] John Buck, "International mail-sorting automation in the low-volume environment," The Journal of Communication Distribution, May/June 2009.

    [5] United States Postal Services, http://www.usps.com, Viewed 23rd of June, 2009.

    [6] A.C. Downton, R.W.S. Tregidgo, C.G. Leedham, and Hendrawan, "Recognition of handwritten British postal addresses," From Pixels to Features III: Frontiers in Handwriting Recognition, pp. 129-143, 1992.

    [7] Y. Tokunaga, "History and current state of postal mechanization in Japan," Pattern Recognition Letters, vol. 14, no. 4, pp. 277-280, April 1993.

    [8] Canada Post, http://www.canadapost.ca, Viewed 13th of July, 2009.

    [9] Canada Post, In Quest of More Advanced Recognition Technology, 28th of October 2004. Viewed March 2009,

    [10] Udo Miletzki, Product Manager Reading Coding, Siemens AG, and Mohammed H. Al Darwish, "Significant technological advances in


    Saudi Post, the first Arabic address reader for delivery point sorting," Saudi Post Corporation World Mail Review, Nov. 2008.

    [11] Hiromichi Fujisawa, "A view on the past and future of character and document recognition," International Conference on Document Analysis and Recognition, Brazil, Sept. 23-26, 2009.

    [12] P.W. Palumbo and S. N. Srihari, "Postal address reading in real time," Inter. J. of Imaging Science and Technology, 1996.

    [13] K. Roy, S. Vajda, U. Pal, B. B. Chaudhuri, and A. Belaid, "A system for Indian postal automation," Proc. 2005 Eighth Intl. Conf. on Document Analysis and Recognition (ICDAR'05).

    [14] Yih-Ming Su and Jhing-Fa Wang, "Recognition of handwritten Chinese postal address using neural networks," Proc. International Conference on Image Processing and Character Recognition, Kaohsiung, Taiwan, Rep. of China, 1996, pp. 213-219.

    [15] El-Emami and M. Usher, "On-line recognition of handwritten Arabic characters," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 12, pp. 704-710, 1990.

    [16] U. Pal, R. K. Roy, and F. Kimura, "Indian multi-script full pin-code string recognition for postal automation," in the 10th International Conference on Document Analysis and Recognition, Barcelona, Spain, 2009, pp. 460-465.

    [17] Sameh M. Awaidah and Sabri A. Mahmoud, "A multiple feature/resolution scheme to Arabic (Indian) numerals recognition using hidden Markov models," Signal Processing, vol. 89, no. 6, June 2009.

    [18] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 3rd Edition, Prentice Hall, Upper Saddle River, New Jersey, 2008.

    [19] Y. M. Alginahi, "Thresholding and character recognition for security documents with watermarked background," Conference on Document Image Computing, Techniques and Applications, Canberra, Australia, December 2008.

    [20] H.K. Kwan and Y. Cai, "A fuzzy neural network and its applications to pattern recognition," IEEE Trans. on Fuzzy Systems, vol. 2, no. 3, August 1994, pp. 185-193.

    [21] P.M. Patil, U.V. Kulkarni, and T.R. Sontakke, "Performance evaluation of fuzzy neural network with various aggregation operators," Proceedings of the 9th International Conference on Neural Information Processing, vol. 4, Nov. 2002, pp. 1744-1748.

    [22] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, NY, 1973.

    [23] S-T. Bow, Pattern Recognition and Image Processing, 2nd Edition, Marcel Dekker, Inc., New York, Basel, 2002.

    [24] Artificial Neural Networks Technology, Viewed Jan 2009,

    [25] Yasser M. Alginahi and Abdul Ahad Siddiqi, "A proposed hybrid OCR system for Arabic and Indian numerical postal codes," The 2009 International Conference on Computer Technology & Development (ICCTD), Kota Kinabalu, Malaysia, November 2009, pp. 400-405.

    Yasser M. Alginahi became a member of IEEE in 2000. He earned a Ph.D. in electrical engineering from the University of Windsor, Ontario, Canada, and a Master of Science in electrical engineering and a Bachelor of Science in biomedical engineering from Wright State University, Ohio, U.S.A. Currently, he is an Assistant Professor, Dept. of Computer Science, College of Computer Science and Engineering, Taibah University, Madinah, KSA. His current research interests are Document Analysis, Pattern Recognition (OCR), crowd management, ergonomics and wireless sensor networks. He is a licensed Professional Engineer and a member of Professional Engineers Ontario, Canada (PEO). He has over a dozen research publications and technical reports to his credit.

    Dr. Abdul Ahad Siddiqi received a PhD and an MSc in Artificial Intelligence in 1997 and 1992, respectively, from the University of Essex, U.K. He also holds a bachelor's degree in Computer Systems Engineering from NED University of Engineering and Technology, Pakistan. He is a Member of IEEE and of the Pakistan Engineering Council. Presently he is an Associate Professor at the College of Computer Science and Engineering at Taibah University, Madinah, KSA. He worked as Dean of Karachi Institute of Information Technology, Pakistan (affiliated with University of Huddersfield, U.K.) between 2003 and 2005. He has over 18 research publications to his credit. He has received research grants from various funding agencies, notably from Pakistan Telecom and the Deanship of Research at Taibah University, for research in the areas of Intelligent Information Systems, Information Technology, and applications of Genetic Algorithms.


    Attribute Weighting with Adaptive NBTree for Reducing False Positives in Intrusion Detection

    Dewan Md. Farid, and Jerome Darmont
    ERIC Laboratory, University Lumière Lyon 2
    Bat L - 5 av. Pierre Mendes, France
    69676 BRON Cedex, France
    [email protected], [email protected]

    Mohammad Zahidur Rahman
    Department of Computer Science and Engineering
    Jahangirnagar University
    Dhaka 1342, Bangladesh
    [email protected]

    Abstract - In this paper, we introduce new learning algorithms for reducing false positives in intrusion detection. They are based on decision tree-based attribute weighting with an adaptive naïve Bayesian tree, which not only reduces the false positives (FP) to an acceptable level, but also scales up the detection rates (DR) for different types of network intrusions. Due to the tremendous growth of network-based services, intrusion detection has emerged as an important technique for network security. Recently, data mining algorithms have been applied to network-based traffic data and host-based program behaviors to detect intrusions or misuse patterns, but current intrusion detection algorithms still have issues, such as unbalanced detection rates, large numbers of false positives, and redundant attributes, that increase the complexity of the detection model and degrade detection accuracy. The purpose of this study is to identify important input attributes for building an intrusion detection system (IDS) that is computationally efficient and effective. Experimental results obtained using the KDD99 benchmark network intrusion detection dataset indicate that the proposed approach can significantly reduce the number and percentage of false positives and scale up the balanced detection rates for different types of network intrusions.

    Keywords-attribute weighting; detection rates; false positives; intrusion detection system; naïve Bayesian tree;

    I. INTRODUCTION

    With the popularization of network-based services, intrusion detection systems (IDS) have become important tools for ensuring network security against violations of information security policy. IDS collect information from a variety of network sources using intrusion detection sensors, and analyze the information for signs of intrusions that attempt to compromise the confidentiality and integrity of networks [1]-[3]. Network-based intrusion detection systems (NIDS) monitor and analyze network traffic to detect intrusions from internal and external intruders [4]-[9]. Internal intruders are inside users of the network who have some authority, but try to gain extra ability to take action without legitimate authorization. External intruders are outside users without any authorized access to the network that they attack. IDS notify the network security administrator or automated intrusion prevention systems (IPS) about network attacks when an intruder tries to break into the network. Since the amount of audit data that an IDS needs to examine is very large even for a small network, several data mining algorithms, such as decision tree, naïve Bayesian classifier, neural network, Support Vector Machines, and fuzzy classification [10]-[20], have been widely used by the IDS community for detecting known and unknown intrusions. Data mining based intrusion detection algorithms aim to solve the problems of analyzing the huge volumes of audit data and realizing performance optimization of detection rules [21]. But there are still some drawbacks in currently available commercial IDS, such as low detection accuracy, a large number of false positives, unbalanced detection rates for different types of intrusions, long response time, and redundant input attributes.

    A conventional intrusion detection database is complex, dynamic, and composed of many different attributes. The problem is that not all attributes in an intrusion detection database may be needed to build an efficient and effective IDS. In fact, the use of redundant attributes may interfere with the correct completion of the mining task, because the information they add is already contained in other attributes. Using all attributes may simply increase the overall complexity of the detection model, increase computational time, and decrease the detection accuracy of the intrusion detection algorithms. It has been shown that effective attribute selection improves the detection rates for different types of network intrusions. In this paper, we present new learning algorithms for network intrusion detection using decision tree-based attribute weighting with an adaptive naïve Bayesian tree. In a naïve Bayesian tree (NBTree), nodes contain and split as in regular decision trees, but the leaves contain naïve Bayesian classifiers. The proposed approach estimates the degree of attribute dependency by constructing a decision tree, and considers the depth at which attributes are tested in the tree. The experimental results show that the proposed approach not only improves the balanced detection rates for different types of network intrusions, but also significantly reduces the number and percentage of false positives in intrusion detection.

    The rest of this paper is organized as follows. In Section II, we outline the intrusion detection models, the architecture of data mining based IDS, and related works. In Section III, the basic concepts of feature selection and the naïve Bayesian tree are introduced. In Section IV, we introduce the proposed algorithms. In Section V, we apply the proposed algorithms to the area of intrusion detection using the KDD99 benchmark


    network intrusion detection dataset, and compare the results to other related algorithms. Finally, Section VI contains the conclusions with future works.

    II. INTRUSION DETECTION SYSTEM: IDS

    A. Misuse Vs. Anomaly Vs. Hybrid Detection Model

    Intrusion detection techniques are broadly classified into three categories: misuse, anomaly, and hybrid detection models. Misuse or signature-based IDS detect intrusions based on known intrusions or attacks stored in a database. They perform pattern matching of incoming packets and/or command sequences against the signatures of known attacks. Known attacks can be detected reliably with a low false positive rate using misuse detection techniques. Such a system also begins protecting the computer/network immediately upon installation. But the major drawback of misuse-based detection is that it requires frequent signature updates to keep the signature database up-to-date, and it cannot detect previously unknown attacks. Misuse detection systems use various techniques, including rule-based expert systems, model-based reasoning systems, state transition analysis, genetic algorithms, fuzzy logic, and keystroke monitoring [22]-[25].

    Anomaly-based IDS detect deviations from normal behavior. They first create a normal profile of system, network, or program activity, and then any activity that deviates from the normal profile is treated as a possible intrusion. Various data mining algorithms have been used for anomaly detection, including statistical analysis, sequence analysis, neural networks, artificial intelligence, machine learning, and artificial immune systems [26]-[33]. Anomaly-based IDS have the ability to detect new or previously unknown attacks, and insider attacks. But the major drawback of this approach is the large number of false positives. A false positive occurs when an IDS reports as an intrusion an event that is in fact legitimate network/system activity.

    A hybrid or compound detection system detects intrusions by combining both misuse and anomaly detection techniques. A hybrid IDS makes decisions using a hybrid model that is based on both the normal behavior of the system and the intrusive behavior of the intruders. Table I compares the characteristics of the misuse, anomaly, and hybrid detection models.

    TABLE I. COMPARISONS OF INTRUSION DETECTION MODELS

    Characteristics          Misuse                     Anomaly        Hybrid
    Detection Accuracy       High (for known attacks)   Low            High
    Detecting New Attacks    No                         Yes            Yes
    False Positives          Low                        Very high      High
    False Negatives          High                       Low            Low
    Timely Notifications     Fast                       Slow           Rather Fast
    Update Usage Patterns    Frequent                   Not Frequent   Not Frequent

    B. Architecture of Data Mining Based IDS

    An IDS monitors network traffic in a computer network like a network sniffer and collects network logs. The collected network logs are then analyzed for rule violations using data mining algorithms. When any rule violation is detected, the IDS alerts the network security administrator or an automated intrusion prevention system (IPS). The generic architectural model of a data mining based IDS is shown in Fig. 1.

    Figure 1. Organization of a generalized data mining based IDS

    • Audit data collection: The IDS collects audit data, which is analyzed by the data mining algorithms to detect suspicious activities or intrusions. The source of the data can be host/network activity logs, command-based logs, and application-based logs.

    • Audit data storage: The IDS stores the audit data for future reference. The volume of audit data is extremely large. Currently, adaptive intrusion detection aims to solve the problems of analyzing the huge volumes of audit data and realizing performance optimization of detection rules.

    • Processing component: The processing block is the heart of the IDS. It consists of the data mining algorithms that are applied to detect suspicious activities. Algorithms for the analysis and detection of intrusions have traditionally been classified into two categories: misuse (or signature) detection, and anomaly detection.

    • Reference data: The reference data stores information about known attacks or profiles of normal behaviors.

    • Processing data: The processing element must frequently store intermediate results such as


    information about partially fulfilled intrusion signatures.

    • Alert: This is the output of the IDS, which notifies the network security officer or an automated intrusion prevention system (IPS).

    • The system security officer or intrusion prevention system (IPS) carries out the prescriptions controlled by the IDS.

    C. Related Work

    The concept of intrusion detection began with Anderson's seminal paper in 1980 [34], which introduced a threat classification model for developing a security monitoring surveillance system based on detecting anomalies in user behavior. In 1986, Dr. Denning proposed several models for commercial IDS development based on statistics, Markov chains, time-series, etc. [35], [36]. In 2001, Lindqvist et al. proposed a rule-based expert system called eXpert-BSM for detecting misuse of a host machine by analyzing activities inside the host in the form of audit trails [37]; it generates detailed reports and recommendations for the system administrators, and produces low false positives. Rules are conditional statements that are derived by employing domain expert knowledge. In 2005, Fan et al. proposed a method to generate artificial anomalies in the training dataset of an IDS to handle both misuse and anomaly detection [38]. This method injects artificial anomaly data into the training data to help a baseline classifier distinguish between normal and anomalous data. In 2006, Bouzida et al. [39] introduced a supplementary condition to the baseline decision tree (DT) for anomaly intrusion detection. The idea is that instead of assigning a default class (normally based on probability distribution) to a test instance that is not covered by the tree, the instance is assigned to a new class. Instances with the new class are then examined for unknown attack analysis. In 2009, Wu and Yen [21] applied DT and support vector machine (SVM) algorithms to build two classifiers for comparison, employing a sampling method with several different normal data ratios. More specifically, the KDD99 dataset is split into several different proportions based on the normal class label for both the training set and the testing set. The overall evaluation of a classifier is based on the average value of the results. It is reported that in general DT is superior to the SVM classifier. In the same way, Peddabachigari et al. [40] applied DT and SVM to intrusion detection, and showed that the decision tree is better than SVM in terms of overall accuracy. In particular, DT is much better at detecting user to root (U2R) and remote to local (R2L) network attacks than SVM.

    The naïve Bayesian (NB) classifier produces surprisingly good classification accuracy in comparison with other classifiers on the KDD99 benchmark intrusion detection dataset. In 2001, Barbara et al. [41] proposed a method based on a technique called Pseudo-Bayes estimators to enhance the ability of the ADAM intrusion detection system [42] to detect new attacks and reduce false positives; it estimates the prior and posterior probabilities for new attacks by using information derived from normal instances and known attacks, without requiring prior knowledge about new attacks. This study constructs a naïve Bayes classifier to classify a given instance as a normal instance, known attack, or new attack. In 2004, Amor et al. [43] conducted an experimental comparison of the performance of the NB classifier and DT on the KDD99 dataset. This experimental analysis reported that DT performs better in classifying normal, denial of service (DoS), and R2L attacks, whereas the NB classifier is superior in classifying Probe and U2R attacks. With respect to running time, the authors pointed out that the NB classifier is 7 times faster than DT.

    Another naïve Bayes method for detecting signatures of specific attacks was proposed by Panda and Patra in 2007 [44]. From experimental results on the KDD99 dataset, the authors conclude that the NB classifier outperforms a back propagation neural network classifier in terms of detection rates and false positives. It is also reported that the NB classifier produces a relatively high false positive rate. In a later work in 2009, the same authors [45] compare the NB classifier with 5 other similar classifiers, i.e., JRip, Ridor, NNge, Decision Table, and Hybrid Decision Table, and the experimental results show that the NB classifier is better than the other classifiers.

    III. FEATURE SELECTION AND ADAPTIVE NBTREE

    A. Feature Selection

    Feature selection becomes indispensable for high performance intrusion detection using data mining algorithms, because irrelevant and redundant features may lead to a complex intrusion detection model as well as poor detection accuracy. Feature selection is the process of finding a subset of features from the total set of original features. Its purpose is to remove the irrelevant input features from the dataset in order to improve classification accuracy. Feature selection is particularly useful in application domains that introduce a large number of input dimensions, like intrusion detection. Many data mining methods have been used for selecting important features from a training dataset, such as information gain based, gain ratio based, principal component analysis (PCA), genetic search, and classifier ensemble methods [46]-[53]. In 2009, Yang et al. [54] introduced a wrapper-based feature selection algorithm that finds the most important features in the training dataset using the random mutation hill climbing method, and then employs a linear support vector machine (SVM) to evaluate the selected feature subsets. Chen et al. [55] proposed a neural-tree based algorithm to identify important input features for classification, based on an evolutionary algorithm in which a feature that contributes more to the objective function is considered an important feature.

    In this paper, to select the important input attributes from the training dataset, we construct a decision tree by applying the ID3 algorithm to the training dataset. The ID3 algorithm constructs a decision tree using information theory [56], choosing splitting attributes from the training dataset with maximum information gain. Information gain is the amount of information associated with an attribute value that is related to the probability of occurrence. Entropy quantifies information and is used to measure the amount of randomness in a dataset. When all data in a set belong to a single class, there is no uncertainty and the entropy is zero. The objective of the ID3 algorithm is to iteratively partition the given dataset into


    sub-datasets, where all the instances in each final subset belong to the same class. The value of entropy lies between 0 and 1 (when the logarithm is taken to base s) and reaches its maximum when the probabilities are all the same. Given probabilities $p_1, p_2, \ldots, p_s$, where $\sum_{i=1}^{s} p_i = 1$:

    Entropy: $$H(p_1, p_2, \ldots, p_s) = \sum_{i=1}^{s} p_i \log(1/p_i) \qquad (1)$$

    Given a dataset $D$, $H(D)$ measures the amount of disorder (class uncertainty) in $D$. When $D$ is split into $s$ new sub-datasets $S = \{D_1, D_2, \ldots, D_s\}$, we can again look at the entropy of those sub-datasets. A subset is completely ordered if all instances in it belong to the same class. The ID3 algorithm calculates the gain by equation (2).

    $$Gain(D, S) = H(D) - \sum_{i=1}^{s} p(D_i)\, H(D_i) \qquad (2)$$
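Equations (1) and (2) can be illustrated with a minimal sketch. The function names, the base-2 logarithm, and the toy connection records below are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of class labels, H = sum_i p_i * log2(1 / p_i)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Gain(D, S) = H(D) - sum_i p(D_i) * H(D_i) for a split on column `attr`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    n = len(labels)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Hypothetical toy connections: (protocol, service) with a class label each.
rows = [("tcp", "http"), ("tcp", "ftp"), ("udp", "http"), ("udp", "ftp")]
labels = ["normal", "normal", "attack", "attack"]

gain_protocol = information_gain(rows, labels, 0)  # protocol separates the classes
gain_service = information_gain(rows, labels, 1)   # service carries no class information
```

In this toy data, splitting on the protocol column yields pure subsets, so ID3 would pick it as the splitting attribute over the service column.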

    After constructing the decision tree from the training dataset, we weight the attributes of the training dataset by the minimum depth at which each attribute is tested in the decision tree. The depth of the root node of the decision tree is 1. The weight for an attribute is set to $1/\sqrt{d}$, where $d$ is the minimum depth at which the attribute is tested in the tree. Attributes that do not appear in the decision tree are assigned a weight of zero.
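The depth-based weighting can be sketched as follows. This is a minimal illustration that assumes the weight takes the form $1/\sqrt{d}$ and that the tree is stored as nested dictionaries; the tree layout, the function names, and the attribute names are hypothetical.

```python
import math


def min_depths(tree, depth=1, found=None):
    """Record the minimum depth at which each attribute is tested (root depth = 1)."""
    if found is None:
        found = {}
    if isinstance(tree, dict):  # internal node; leaves are plain class labels
        attr = tree["attr"]
        found[attr] = min(depth, found.get(attr, depth))
        for child in tree["children"].values():
            min_depths(child, depth + 1, found)
    return found


def attribute_weights(tree, attributes):
    """Weight 1/sqrt(d) for tested attributes, 0 for attributes absent from the tree."""
    depths = min_depths(tree)
    return {a: (1.0 / math.sqrt(depths[a]) if a in depths else 0.0) for a in attributes}


# Hypothetical tree: the root tests "service", one branch then tests "flag".
tree = {
    "attr": "service",
    "children": {
        "http": {"attr": "flag", "children": {"SF": "normal", "REJ": "attack"}},
        "ftp": "attack",
    },
}
weights = attribute_weights(tree, ["service", "flag", "duration"])
selected = [a for a, w in weights.items() if w > 0.0]  # attributes kept after filtering
```

Here "duration" is never tested, so it receives weight zero and is dropped from the selected attribute list.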

    B. Naïve Bayesian Tree

    Naïve Bayesian tree (NBTree) is a hybrid learning approach that combines a decision tree with a naïve Bayesian classifier. In an NBTree, internal nodes contain tests and split the data as in a regular decision tree, but the leaves are replaced by naïve Bayesian classifiers, so the advantages of both the decision tree and naïve Bayes can be utilized simultaneously [57]. Depending on the precise nature of the probability model, an NB classifier can be trained very efficiently in a supervised learning setting. In many practical applications, parameter estimation for naïve Bayesian models uses the method of maximum likelihood. Suppose the training dataset $D$ consists of predictive attributes $\{A_1, A_2, \ldots, A_n\}$, where each attribute $A_i = \{A_{i1}, A_{i2}, \ldots, A_{ik}\}$ contains attribute values, and a set of classes $C = \{C_1, C_2, \ldots, C_n\}$. The objective is to classify an unseen example whose class value is unknown but whose values for attributes $A_1$ through $A_n$ are known. The aim of decision tree learning is to construct a tree model: $\{A_1, A_2, \ldots, A_n\} \rightarrow C$. Correspondingly, by Bayes' theorem, whether attribute $A_i$ is discrete or continuous, we have:

    $$P(C_j \mid A_{ij}) = \frac{P(A_{ij} \mid C_j)\, P(C_j)}{P(A_{ij})} \qquad (3)$$

    where $P(C_j \mid A_{ij})$ denotes the posterior probability of class $C_j$ given the attribute value $A_{ij}$. The aim of Bayesian classification is to choose the class that maximizes the posterior probability. Since $P(A_{ij})$ is a constant independent of $C$:

    $$C^* = \arg\max_{C_j \in C} P(C_j \mid A_{ij}) = \arg\max_{C_j \in C} P(A_{ij} \mid C_j)\, P(C_j) \qquad (4)$$
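As a small worked illustration of equations (3) and (4), the sketch below estimates the posterior of each class for a single attribute value from raw counts and picks the maximizing class. The function name and the toy examples are hypothetical, and the queried value is assumed to occur at least once in the data.

```python
from collections import Counter


def posterior(examples, value):
    """Return P(C_j | A = value) for every class, using Bayes' theorem (eq. 3)."""
    n = len(examples)
    class_counts = Counter(c for _, c in examples)
    evidence = sum(1 for a, _ in examples if a == value) / n  # P(A_ij), assumed > 0
    post = {}
    for c, n_c in class_counts.items():
        prior = n_c / n                                         # P(C_j)
        likelihood = sum(1 for a, cl in examples if a == value and cl == c) / n_c
        post[c] = likelihood * prior / evidence
    return post


# Hypothetical (service, class) pairs.
examples = [("http", "normal"), ("http", "normal"), ("http", "attack"), ("ftp", "attack")]
post = posterior(examples, "http")
best = max(post, key=post.get)  # eq. (4): class with the maximum posterior
```

Because the evidence term is the same for every class, dropping it would not change which class `max` selects, which is the point of equation (4).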

    The adaptive naïve Bayesian tree splits the dataset by applying an entropy-based algorithm and then uses standard naïve Bayesian classifiers at the leaf nodes to handle attributes. It applies a decision tree construction strategy and replaces each leaf node with a naïve Bayesian classifier.

    IV. PROPOSED LEARNING ALGORITHM

    A. Proposed Attribute Weighting Algorithm

    Consider training data $D$ with attributes $\{A_1, A_2, \ldots, A_n\}$, where each attribute $A_i = \{A_{i1}, A_{i2}, \ldots, A_{ik}\}$ contains attribute values, and a set of classes $C = \{C_1, C_2, \ldots, C_n\}$, where each class $C_j = \{C_{j1}, C_{j2}, \ldots, C_{jk}\}$ has some values. Each example in the training data carries a weight, $w = \{w_1, w_2, \ldots, w_n\}$. Initially, all examples have the same weight, $w_i = 1/n$, where $n$ is the total number of training examples. The algorithm estimates the prior probability $P(C_j)$ for each class by summing the weights of the examples of that class. For each attribute $A_i$, the occurrences of each attribute value $A_{ij}$ are counted by summing the weights, which gives $P(A_{ij})$. Similarly, the conditional probability $P(A_{ij} \mid C_j)$ is estimated by summing the weights of the examples in class $C_j$ that have that attribute value. The conditional probabilities are estimated for all values of all attributes. The algorithm then uses the prior and conditional probabilities to update the weights. This is done by multiplying the probabilities of the different attribute values of each example. Suppose the training example $e_i$ has independent attribute values $\{A_{i1}, A_{i2}, \ldots, A_{ip}\}$. We already know the prior probabilities $P(C_j)$ and the conditional probabilities $P(A_{ik} \mid C_j)$ for each class $C_j$ and attribute value $A_{ik}$. We then estimate $P(e_i \mid C_j)$ by

    $$P(e_i \mid C_j) = P(C_j) \prod_{k=1}^{p} P(A_{ik} \mid C_j) \qquad (5)$$

    To update the weight of training example $e_i$, we estimate the likelihood of $e_i$ for each class. The probability that $e_i$ is in a class is the product of the conditional probabilities for each attribute value. The posterior probability $P(C_j \mid e_i)$ is then found for each class. The weight of the example is updated to the highest posterior probability for that example, and its class value is updated to the class with the highest posterior probability. Next, the algorithm calculates the information gain using the updated weights and builds a tree. After the tree is constructed, the algorithm initializes a weight for each attribute in the training data $D$. If an attribute is not tested in the tree, its weight is initialized to 0; otherwise, the algorithm finds the minimum depth $d$ at which the attribute is tested and initializes the weight of the attribute to $1/\sqrt{d}$. Finally, the algorithm removes all attributes with zero weight from the training data $D$. The main procedure of the proposed algorithm is described as follows.

    Algorithm 1: Attribute Weighting

    Input: Training dataset, $D$

    Output: Decision tree, $T$

    Procedure:

    1. Initialize the weights of all examples in $D$: $w_i = 1/n$, where $n$ is the total number of examples.


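    2. Calculate the prior probability $P(C_j)$ for each class $C_j$ in $D$:

    $$P(C_j) = \frac{\sum_{i:\, e_i \in C_j} w_i}{\sum_{i=1}^{n} w_i}$$

    3. Calculate the conditional probability $P(A_{ij} \mid C_j)$ for each attribute value in $D$:

    $$P(A_{ij} \mid C_j) = \frac{P(A_{ij})}{\sum_{i:\, e_i \in C_j} w_i}$$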

    4. Calculate the posterior probabilities for each example in $D$:

    $$P(e_i \mid C_j) = P(C_j) \prod_{k=1}^{p} P(A_{ik} \mid C_j)$$

    5. Update the weights of the examples in $D$ with the maximum likelihood (ML) of the posterior probability $P(C_j \mid e_i)$: $w_i = P_{ML}(C_j \mid e_i)$.

    6. Change the class value of each example to the class associated with the maximum posterior probability: $C_j = C_i \rightarrow P_{ML}(C_j \mid e_i)$.

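    7. Find the splitting attribute with the highest information gain using the updated weights $w_i$ in $D$:

    $$\text{Information Gain} = H(D) - \sum_{i=1}^{s} p(D_i)\, H(D_i)$$

    where, following equations (1) and (2), every probability is estimated from sums of the example weights $w_i$ (as in step 2) rather than from raw counts.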

    8. $T$ = Create the root node and label it with the splitting attribute.

    9. For each branch of $T$, let $D$ be the dataset created by applying the splitting predicate to $D$, and repeat steps 1 to 8 until each final subset belongs to the same class or a leaf node is created.

    10. When the decision tree construction is complete, for each attribute in the training data $D$: if the attribute is not tested in the tree, initialize its weight to 0; otherwise, let $d$ be the minimum depth at which the attribute is tested in the tree and initialize its weight to $1/\sqrt{d}$.

    11. Remove all attributes with zero weight from the training data $D$.
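Steps 1 to 6 of Algorithm 1 can be sketched as below. This is an illustrative reading rather than the authors' implementation: it assumes that the numerator in step 3 is the summed weight of the class examples that have the given attribute value, and that the updated weight in step 5 is the posterior normalized over classes. The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def update_weights(rows, labels):
    """One pass of steps 1-6: returns (new example weights, new class labels)."""
    n = len(rows)
    w = [1.0 / n] * n                                    # step 1: equal weights
    total = sum(w)
    class_w = defaultdict(float)
    for wi, c in zip(w, labels):
        class_w[c] += wi
    prior = {c: cw / total for c, cw in class_w.items()}  # step 2: weighted priors

    # Summed weight per (attribute index, value, class); assumed numerator of step 3.
    cond_w = defaultdict(float)
    for row, c, wi in zip(rows, labels, w):
        for k, v in enumerate(row):
            cond_w[(k, v, c)] += wi

    new_w, new_labels = [], []
    for row in rows:
        score = {}
        for c in prior:                                   # step 4: P(C_j) * prod_k P(A_ik | C_j)
            p = prior[c]
            for k, v in enumerate(row):
                p *= cond_w.get((k, v, c), 0.0) / class_w[c]
            score[c] = p
        z = sum(score.values())       # > 0: each row supports its own original class
        best = max(score, key=score.get)
        new_w.append(score[best] / z)                     # step 5: highest posterior
        new_labels.append(best)                           # step 6: relabel example
    return new_w, new_labels


# Hypothetical toy connections: (protocol, service).
rows = [("tcp", "http"), ("tcp", "ftp"), ("tcp", "ftp"), ("tcp", "ftp"), ("udp", "ftp")]
labels = ["normal", "normal", "normal", "attack", "attack"]
new_w, new_labels = update_weights(rows, labels)
```

In this toy run the fourth example, a tcp/ftp connection labeled attack, looks more like the normal class and is relabeled with weight 2/3, while unambiguous examples receive weight 1.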

    B. Proposed Adaptive NBTree Algorithm

    Given training data $D$ in which each attribute $A_i$ and each example $e_i$ has a weight value, the algorithm estimates the prior probability $P(C_j)$ and the conditional probability $P(A_{ij} \mid C_j)$ from the training dataset using the weights of the examples. It then classifies all the examples in the training dataset using these prior and conditional probabilities, incorporating the attribute weights into the naïve Bayesian formula:

    $$P(e_i \mid C_j) = P(C_j) \prod_{i=1}^{m} P(A_{ij} \mid C_j)^{W_i} \qquad (6)$$

    where $W_i$ is the weight of attribute $A_i$. If any example of the training dataset is misclassified, then for each attribute $A_i$ the algorithm evaluates the utility, $u(A_i)$, of a split on attribute $A_i$. Let $j = \arg\max_i(u_i)$, i.e., the attribute with the highest utility. If $u_j$ is not significantly better than the utility of the current node, the algorithm creates an NB classifier for the current node. Otherwise, it partitions the training data $D$ according to the test on attribute $A_j$. If $A_j$ is continuous, a threshold split is used; if $A_j$ is discrete, a multi-way split is made for all possible values. For each child, the algorithm is called recursively on the portion of $D$ that matches the test leading to the child. The main procedure of the algorithm is described as follows.

    Algorithm 2: Adaptive NBTree

    Input: Training dataset $D$ of labeled examples.

    Output: A hybrid decision tree with naïve Bayesian classifiers at the leaves.

    Procedure:

    1. Calculate the prior probability $P(C_j)$ for each class $C_j$ in $D$:

    $$P(C_j) = \frac{\sum_{i:\, e_i \in C_j} w_i}{\sum_{i=1}^{n} w_i}$$

    2. Calculate the conditional probability $P(A_{ij} \mid C_j)$ for each attribute value in $D$:

    $$P(A_{ij} \mid C_j) = \frac{P(A_{ij})}{\sum_{i:\, e_i \in C_j} w_i}$$

    3. Classify each example in $D$ with maximum posterior probability:

    $$P(e_i \mid C_j) = P(C_j) \prod_{i=1}^{m} P(A_{ij} \mid C_j)^{W_i}$$

    4. If any example in $D$ is misclassified, then for each attribute $A_i$, evaluate the utility, $u(A_i)$, of a split on attribute $A_i$.

    5. Let $j = \arg\max_i(u_i)$, i.e., the attribute with the highest utility.

    6. If $u_j$ is not significantly better than the utility of the current node, create a naïve Bayesian classifier for the current node and return.

    7. Partition the training data $D$ according to the test on attribute $A_j$. If $A_j$ is continuous, a threshold split is used; if $A_j$ is discrete, a multi-way split is made for all possible values.

    8. For each child, call the algorithm recursively on the portion of $D$ that matches the test leading to the child.
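The attribute-weighted scoring of equation (6), used in step 3, can be sketched as follows. The function names and all probabilities are made up for illustration, the computation is done in log space for numerical stability (a design choice of this sketch, not something stated in the paper), and no smoothing of zero probabilities is applied.

```python
import math


def weighted_nb_score(prior, cond_probs, attr_weights):
    """Return P(C_j) * prod_i P(A_ij | C_j) ** W_i, accumulated in log space."""
    log_score = math.log(prior)
    for p, w in zip(cond_probs, attr_weights):
        if w == 0.0:
            continue      # zero-weight attributes do not affect the score
        if p == 0.0:
            return 0.0    # an unseen value rules the class out (no smoothing here)
        log_score += w * math.log(p)
    return math.exp(log_score)


def classify(priors, cond_probs_by_class, attr_weights):
    """Pick the class with the highest weighted score; also return all scores."""
    scores = {
        c: weighted_nb_score(priors[c], cond_probs_by_class[c], attr_weights)
        for c in priors
    }
    return max(scores, key=scores.get), scores


# Hypothetical probabilities for one example described by two attributes.
priors = {"normal": 0.6, "attack": 0.4}
cond = {"normal": [0.9, 0.1], "attack": [0.5, 0.8]}

label, scores = classify(priors, cond, [1.0, 1.0 / math.sqrt(2)])
label_first_only, _ = classify(priors, cond, [1.0, 0.0])
```

With both attributes weighted, the second attribute pulls the decision toward the attack class; with its weight set to zero, only the first attribute counts and the normal class wins.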

    V. EXPERIMENTAL RESULTS AND ANALYSIS

    A. Dataset

    Experiments were carried out on the KDD99 cup benchmark network intrusion detection dataset, whose task is to build a predictive model capable of distinguishing between intrusions and normal connections [58]. In the 1998 DARPA intrusion detection evaluation program, a simulated environment was set up by the MIT Lincoln Lab to acquire raw TCP/IP dump data for a local-area network (LAN) in order to compare the performance of various intrusion detection methods. It was operated like a real environment but blasted with multiple intrusion attacks, and it received much attention in the adaptive intrusion detection research community. The KDD99 contest uses a version of the DARPA98 dataset. In the KDD99 dataset, each example represents attribute values of a class in the network data flow,


    and each class is labeled either normal or attack. Examples in the KDD99 dataset are represented with 41 attributes and are also labeled as belonging to one of five classes: (1) Normal traffic; (2) DoS (denial of service); (3) Probe, surveillance and probing; (4) R2L, unauthorized access from a remote machine; (5) U2R, unauthorized access to local superuser privileges by a local unprivileged user. In the KDD99 dataset, these four attack classes are divided into 22 different attack classes, which are tabulated in Table II.

    TABLE II. ATTACKS IN KDD99 DATASET

    4 Main Attack Classes | 22 Attack Classes

    Denial of Service (DoS) | back, land, neptune, pod, smurf, teardrop

    Remote to User (R2L) | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster

    User to Root (U2R) | buffer_overflow, perl, loadmodule, rootkit

    Probing | ipsweep, nmap, portsweep, satan

    The input attributes in the KDD99 dataset take either discrete or continuous values and are divided into three groups. The first group consists of the basic features of a network connection, which include the duration, protocol type, service, number of bytes from source or destination IP addresses, and some flags in TCP connections. The second group is composed of the content features of network connections, and the third group is composed of statistical features that are computed either over a time window or over a window of a certain kind of connections. Table III shows the number of examples in the 10% training data and 10% testing data of the KDD99 dataset. The testing data contains some new attack examples that are not present in the training data.

    TABLE III. NUMBER OF EXAMPLES IN TRAINING AND TESTING KDD99 DATA

    Attack Types Training Examples Testing Examples

    Normal 97277 60592

    Denial of Service 391458 237594

    Remote to User 1126 8606

    User to Root 52 70

    Probing 4107 4166

    Total Examples 494020 311028

    B. Performance Measures

    In order to evaluate the performance of the proposed learning algorithm, we performed 5-class classification using the KDD99 network intrusion detection benchmark dataset and considered two major indicators of performance: detection rate (DR) and false positives (FP). DR is defined as the number of intrusion instances detected by the system divided by the total number of intrusion instances present in the dataset.

    $$DR = \frac{\text{Total\_detected\_attacks}}{\text{Total\_attacks}} \times 100 \qquad (7)$$

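    FP is defined as the number of normal instances misclassified as intrusions divided by the total number of normal instances.

    $$FP = \frac{\text{Total\_misclassified\_process}}{\text{Total\_normal\_process}} \times 100 \qquad (8)$$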
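Equations (7) and (8) can be computed from true and predicted labels as in the sketch below. It treats any non-normal prediction on an attack instance as a detection, which is an assumption of this illustration (the tables report per-class figures); the function names and the label lists are hypothetical.

```python
def detection_rate(y_true, y_pred, normal="normal"):
    """DR (%) = detected attack instances / total attack instances * 100."""
    attacks = [(t, p) for t, p in zip(y_true, y_pred) if t != normal]
    detected = sum(1 for _, p in attacks if p != normal)
    return 100.0 * detected / len(attacks)


def false_positive_rate(y_true, y_pred, normal="normal"):
    """FP (%) = normal instances flagged as attacks / total normal instances * 100."""
    normals = [(t, p) for t, p in zip(y_true, y_pred) if t == normal]
    misclassified = sum(1 for _, p in normals if p != normal)
    return 100.0 * misclassified / len(normals)


# Hypothetical ground truth and predictions for eight connections.
y_true = ["normal", "normal", "normal", "normal", "dos", "dos", "probe", "u2r"]
y_pred = ["normal", "normal", "normal", "dos", "dos", "dos", "normal", "u2r"]

dr = detection_rate(y_true, y_pred)       # 3 of 4 attacks are flagged
fp = false_positive_rate(y_true, y_pred)  # 1 of 4 normal connections is flagged
```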

    All experiments were performed on an Intel Core 2 Duo 2.0 GHz processor (2 MB cache, 800 MHz FSB) with 1 GB of RAM.

    C. Experiment and Analysis on Proposed Algorithm

    First, we use the proposed Algorithm 1 to perform attribute selection on the training data of the KDD99 dataset, and then we use the proposed Algorithm 2 for classifier construction. The performance of the proposed algorithm on 12 attributes of the KDD99 dataset is listed in Table IV.

    TABLE IV. PERFORMANCE OF PROPOSED ALGORITHM ON KDD99 DATASET

    Classes Detection Rates (%) False Positives (%)

    Normal 100 0.04

    Probe 99.93 0.37

    DoS 100 0.03

    U2R 99.38 0.11

    R2L 99.53 6.75

    Table V and Table VI show the performance of the naïve Bayesian (NB) classifier and the C4.5 algorithm using the original 41 attributes of the KDD99 dataset.

    TABLE V. PERFORMANCE OF NB CLASSIFIER ON KDD99 DATASET

    Classes Detection Rates (%) False Positives (%)

    Normal 99.27 0.08

    Probe 99.11 0.45

    DoS 99.68 0.05

    U2R 64.00 0.14

    R2L 99.11 8.12

    TABLE VI. PERFORMANCE OF C4.5 ALGORITHM USING KDD99 DATASET

    Classes Detection Rates (%) False Positives (%)

    Normal 98.73 0.10

    Probe 97.85 0.55

    DoS 97.51 0.07

    U2R 49.21 0.14

    R2L 91.65 11.03

    Table VII and Table VIII show the performance of the NB classifier and C4.5 using the reduced set of 12 attributes.

    TABLE VII. PERFORMANCE OF NB CLASSIFIER USING KDD99 DATASET

    Classes Detection Rates (%) False Positives (%)

    Normal 99.65 0.06

    Probe 99.35 0.49

    DoS 99.71 0.04

    U2R 64.84 0.12

    R2L 99.15 7.85

    TABLE VIII. PERFORMANCE OF C4.5 ALGORITHM USING KDD99 DATASET

    Classes Detection Rates (%) False Positives (%)

    Normal 98.81 0.08

    Probe 98.22 0.51

    DoS 97.63 0.05

    U2R 56.11 0.12

    R2L 91.79 8.34

    We also compare the intrusion detection performance of Support Vector Machines (SVM), Neural Network (NN), Genetic Algorithm (GA), and the proposed algorithm on the KDD99 dataset, as tabulated in Table IX [59], [60].

    TABLE IX. COMPARISON OF SEVERAL ALGORITHMS

    Classes SVM NN GA Proposed Algorithm

    Normal 99.4 99.6 99.3 99.93

    Probe 89.2 92.7 98.46 99.84

    DoS 94.7 97.5 99.57 99.91

    U2R 71.4 48 99.22 99.47

    R2L 87.2 98 98.54 99.63


    VI. CONCLUSIONS AND FUTURE WORKS

    This paper presents a hybrid approach to intrusion detection based on decision tree-based attribute weighting with a naïve Bayesian tree, which is suitable for analyzing large numbers of network logs. The main purpose of this paper is to improve the performance of the naïve Bayesian classifier for network intrusion detection systems (NIDS). The experimental results show that the proposed approach achieves high detection rates and low false positives, as well as balanced detection performance on all four types of network intrusions in the KDD99 dataset. Future work will focus on applying security domain knowledge to improve the detection rates for current attacks in real-time computer networks, and on ensembles with other mining algorithms to improve detection rates in intrusion detection.

    ACKNOWLEDGMENT

    Support for this research was received from the ERIC Laboratory, University Lumière Lyon 2, France, and the Department of Computer Science and Engineering, Jahangirnagar University, Bangladesh.

    REFERENCES

    [1] Xuan Dau Hoang, Jiankun Hu, and Peter Bertok, "A program-based anomaly intrusion detection scheme using multiple detection engines and fuzzy inference," Journal of Network and Computer Applications, Vol. 32, Issue 6, November 2009, pp. 1219-1228.

    [2] P. Garcia-Teodoro, J. Diaz-Verdejo, G. Macia-Fernandez, and E. Vazquez, "Anomaly-based network intrusion detection: Techniques, systems and challenges," Computers & Security, Vol. 28, 2009, pp. 18-28.

    [3] Animesh Patcha and Jung-Min Park, "An overview of anomaly detection techniques: Existing solutions and latest technological trends," Computer Networks, Vol. 51, Issue 12, 22 August 2007, pp. 3448-3470.

    [4] Lih-Chyau Wuu, Chi-Hsiang Hung, and Sout-Fong Chen, "Building intrusion pattern miner for Snort network intrusion detection system," Journal of Systems and Software, Vol. 80, Issue 10, October 2007, pp. 1699-1715.

    [5] Chia-Mei Chen, Ya-Lin Chen, and Hsiao-Chung Lin, "An efficient network intrusion detection," Computer Communications, Vol. 33, Issue 4, 1 March 2010, pp. 477-484.

    [6] M. Ali Aydin, A. Halim Zaim, and K. Gokhan Ceylan, "A hybrid intrusion detection system for computer network security," Computers & Electrical Engineering, Vol. 35, Issue 3, May 2009, pp. 517-526.

    [7] Franciszek Seredynski and Pascal Bouvry, "Anomaly detection in TCP/IP networks using immune systems paradigm," Computer Communications, Vol. 30, Issue 4, 26 February 2007, pp. 740-749.

    [8] Jr, James C. Foster, Matt Jonkman, Raffael Marty, and Eric Seagren, "Intrusion detection systems," Snort Intrusion Detection and Prevention Toolkit, 2006, pp. 1-30.

    [9] Ben Rexworthy, "Intrusion detections systems - an outmoded network protection model," Network Security, Vol. 2009, Issue 6, June 2009, pp. 17-19.

    [10] Wei Wang, Xiaohong Guan, and Xiangliang Zhang, "Processing of massive audit data streams for real-time anomaly intrusion detection," Computer Communications, Vol. 31, Issue 1, 15 January 2008, pp. 58-72.

    [11] Han-Ching Wu and Shou-Hsuan Stephen Huang, "Neural network-based detection of stepping-stone intrusion," Expert Systems with Applications, Vol. 37, Issue 2, March 2010, pp. 1431-1437.

    [12] Xiaojun Tong, Zhu Wang, and Haining Yu, "A research using hybrid RBF/Elman neural networks for intrusion detection system secure model," Computer Physics Communications, Vol. 180, Issue 10, October 2009, pp. 1795-1801.

    [13] Chih-Forn, and Chia-Ying Lin, "A triangle area based nearest neighbors approach to intrusion detection," Pattern Recognition, Vol. 43, Issue 1, January 2010, pp. 222-229.

    [14] Kamran Shafi and Hussein A. Abbass, "An adaptive genetic-based signature learning system for intrusion detection," Expert Systems with Applications, Vol. 36, Issue 10, December 2009, pp. 12036-12043.

    [15] Zorana Bankovic, Dusan Stepanovic, Slobodan Bojanic, and Octavio Nieto-Taladriz, "Improving network security using genetic algorithm approach," Computers & Electrical Engineering, Vol. 33, Issues 5-6, 2007, pp. 438-541.

    [16] Yang Li and Li Guo, "An active learning based TCM-KNN algorithm for supervised network intrusion detection," Computers & Security, Vol. 26, Issues 7-8, December 2007, pp. 459-467.

    [17] Wun-Hwa Chen, Sheng-Hsun Hsu, and Hwang-Pin Shen, "Application of SVM and ANN for intrusion detection," Computers & Operations Research, Vol. 32, Issue 10, October 2005, pp. 2617-2634.

    [18] Ming-Yang Su, Gwo-Jong Yu, and Chun-Yuen Lin, "A real-time network intrusion detection system for large-scale attacks based on an incremental mining approach," Computers & Security, Vol. 28, Issue 5, July 2009, pp. 301-309.

    [19] Zeng Jinquan, Liu Xiaojie, Li Tao, Liu Caiming, Peng Lingxi, and Sun Feixian, "A self-adaptive negative selection algorithm used for anomaly detection," Progress in Natural Science, Vol. 19, Issue 2, 10 February 2009, pp. 261-266.

    [20] Zonghua Zhang and Hong Shen, "Application of online-training SVMs for real-time intrusion detection with different considerations," Computer Communications, Vol. 28, Issue 12, 18 July 2005, pp. 1428-1442.

    [21] Su-Yun Wu and Ester Yen, "Data mining-based intrusion detectors," Expert Systems with Applications, Vol. 36, Issue 3, Part 1, April 2009, pp. 5605-5612.

    [22] S. R. Snapp and S. E. Smaha, "Signature analysis model definition and formalism," In Proc. of the 4th Workshop on Computer Security Incident Handling, Denver, CO, 1992.

    [23] P. A. Porras and A. Valdes, "Live traffic analysis of TCP/IP gateways," In Proc. of the Network and Distributed System Security Symposium, San Diego, CA: Internet Society, 11-13 March, 1998.

    [24] T. D. Garvey and T. F. Lunt, "Model based intrusion detection," In Proc. of the 14th National Computer Security Conference, 1991, pp. 372-385.

    [25] F. Carrettoni, S. Castano, G. Martella, and P. Samarati, "RETISS: A real time security system for threat detection using fuzzy logic," In Proc. of the 25th IEEE International Carnahan Conference on Security Technology, Taipei, Taiwan ROC, 1991.

    [26] T. F. Lunt, A. Tamaru, F. Gilham, R. Jagannathan, P. G. Neumann, H. S. Javitz, A. Valdes, and T. D. Garvey, "A real-time intrusion detection expert system (IDES)," Technical Report, Computer Science Laboratory, Menlo Park, CA: SRI International.

    [27] S. A. Hofmeyr, S. Forrest, and A. Somayaji, "Intrusion detection using sequences of system calls," Journal of Computer Security, Vol. 6, 1998, pp. 151-180.

    [28] S. A. Hofmeyr and S. Forrest, "Immunity by design: An artificial immune system," In Proc. of the Genetic and Evolutionary Computation Conference (GECCO 99), Vol. 2, San Mateo, CA: Morgan Kaufmann, 1999, pp. 1289-1296.

    [29] J. M. Jr. Bonifacio, A. M. Cansian, A. C. P. L. F. Carvalho, and E. S. Moreira, "Neural networks applied in intrusion detection systems," In Proc. of the International Conference on Computational Intelligence and Multimedia Application, Gold Coast, Australia, 1997, pp. 276-280.

    [30] H. Debar, M. Becker, and D. Siboni, "A neural network component for an intrusion detection system," In Proc. of the IEEE Symposium on Research in Security and Privacy, Oakland, CA: IEEE Computer Society Press, 1992, pp. 240-250.

    [31] W. Lee, S. J. Stolfo, and P. K. Chan, "Learning patterns from Unix process execution traces for intrusion detection," AAAI Workshop: AI


    Approaches to Fraud Detection and Risk Management, Menlo Park, CA: AAAI Press, 1999, pp. 50-56.

    [32] W. Lee, S. J. Stolfo, and K. W. Mok, "Mining audit data to build intrusion detection models," In Proc. of the 4th International Conference on Knowledge Discovery and Data Mining (KDD-98), Menlo Park, CA: AAAI Press, 2000, pp. 66-72.

    [33] S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff, "A sense of self for Unix processes," In Proc. of the 1996 IEEE Symposium on Security and Privacy, Oakland, CA: IEEE Computer Society P