Intelligent Network Intrusion Detection System



  • TRIBHUVAN UNIVERSITY

    INSTITUTE OF ENGINEERING

    PULCHOWK CAMPUS

    DEPARTMENT OF ELECTRONICS AND COMPUTER ENGINEERING

    A

    FINAL YEAR PROJECT REPORT

    ON

    INTELLIGENT NETWORK INTRUSION DETECTION SYSTEM

    By:

    PUNEET KHANAL (062BCT527)

    RAJIV SHRESTHA (062BCT529)

    RAJU KC (062BCT530)

    LALITPUR, NEPAL

    MARCH, 2010

  • TRIBHUVAN UNIVERSITY

    INSTITUTE OF ENGINEERING

    PULCHOWK CAMPUS

    INTELLIGENT NETWORK INTRUSION DETECTION SYSTEM

    By:

    Puneet Khanal

    Rajiv Shrestha

    Raju KC

    A PROJECT SUBMITTED TO THE DEPARTMENT OF ELECTRONICS AND COMPUTER

    ENGINEERING IN PARTIAL FULFILLMENT OF THE REQUIREMENT FOR THE

    BACHELORS DEGREE IN ELECTRONICS & COMMUNICATION / COMPUTER

    ENGINEERING

    DEPARTMENT OF ELECTRONICS AND COMPUTER ENGINEERING

    LALITPUR, NEPAL

    March, 2010

  • ii

    LETTER OF APPROVAL

    The undersigned certify that they have read, and recommended to the Institute of
    Engineering for acceptance, a project report entitled "Intelligent Network Intrusion
    Detection System" submitted by Puneet Khanal, Rajiv Shrestha and Raju KC in partial
    fulfillment of the requirements for the degree of Bachelor of Computer Engineering.

    ______________________________
    Project Supervisor
    Babu Ram Dawadi
    Assistant Professor
    Department of Electronics and Computer Engineering

    ______________________________
    Project Supervisor
    Manoj Ghimire
    Lecturer
    Department of Electronics and Computer Engineering

    ______________________________
    Internal Examiner
    Purushottam Sigdel
    Director
    Center for Information Technology

    ______________________________
    External Examiner
    Krishna Prasad Bhandari
    Senior Engineer
    Nepal Telecom

    ________________________________

    Project Coordinator and Deputy Head

    Surendra Shrestha, Ph.D.

    Department of Electronics and Computer Engineering

    Institute of Engineering

    DATE OF APPROVAL: 17th March, 2010

  • iii

    COPYRIGHT

    The author has agreed that the Library, Department of Electronics and Computer

    Engineering, Pulchowk Campus, Institute of Engineering may make this report freely

    available for inspection. Moreover, the author has agreed that permission for extensive

    copying of this project report for scholarly purpose may be granted by the supervisors who

    supervised the project work recorded herein or, in their absence, by the Head of the

    Department wherein the project report was done. It is understood that the recognition will

    be given to the author of this report and to the Department of Electronics and Computer

    Engineering, Pulchowk Campus, Institute of Engineering in any use of the material of this

    project report. Copying, publication, or other use of this report for financial gain

    without the approval of the Department of Electronics and Computer Engineering,

    Pulchowk Campus, Institute of Engineering and the authors' written permission is prohibited.

    Request for permission to copy or to make any other use of the material in this report in

    whole or in part should be addressed to:

    Head

    Department of Electronics and Computer Engineering

    Pulchowk Campus, Institute of Engineering

    Lalitpur, Kathmandu

    Nepal

  • iv

    ACKNOWLEDGEMENT

    We are sincerely thankful to the Department of Electronics and Computer Engineering for

    providing the opportunity to do this project.

    We are indebted to our supervisor Mr. Babu Ram Dawadi and Mr. Manoj Ghimire for their

    valuable suggestions and constant guidance for the accomplishment of the project. Besides,

    we are also thankful to the Project Coordinator Mr. Surendra Shrestha for assisting and

    guiding us in the project.

    Last but not least, we are thankful to our friends and teachers, who supported us
    throughout the course of the project.

    Puneet Khanal (062BCT527)

    Rajiv Shrestha (062BCT529)

    Raju KC (062BCT530)

  • v

    ABSTRACT

    Network Intrusion Detection Systems (NIDS) aim to prevent network attacks and
    unauthorized remote use of computers. Depending on the kind of attack it targets, an NIDS
    can be oriented to detect misuse (by defining all possible attacks) or anomalies (by
    modeling legitimate behavior and flagging traffic that does not fit the model). Because the
    knowledge of a misuse detector is restricted to possible attacks, misuse detection fails to
    notice anomalies, and vice versa. To address this, we present the Intelligent Network
    Intrusion Detection System (INIDS), a misuse and anomaly detection system based on a
    Naive Bayes classifier trained on KDDCup99 dataset traffic. INIDS analyzes complete
    network packets and follows a strategy for building a consistent knowledge model that
    integrates misuse-based and anomaly-based knowledge.

    Finally, we evaluate INIDS against well-known and new attacks, showing how it outperforms
    a well-established industrial NIDS.

    Keywords: Network Attacks, Misuse Detection, Anomaly Detection, Network Packets,

    Naive Bayes Classifier

  • vi

    TABLE OF CONTENTS

    LETTER OF APPROVAL ..... II
    COPYRIGHT ..... III
    ACKNOWLEDGEMENT ..... IV
    ABSTRACT ..... V
    TABLE OF CONTENTS ..... VI
    LIST OF FIGURES ..... VIII
    LIST OF TABLES ..... IX
    LIST OF SYMBOLS AND ABBREVIATIONS ..... X
    1 INTRODUCTION ..... 1
    1.1 What is an IDS? ..... 1
    1.2 What is not an IDS? ..... 3
    1.3 Attack Types ..... 3
    1.4 Existing System ..... 4
    1.5 Problem Statement ..... 4
    1.6 Objectives ..... 4
    1.7 Scope of the Project ..... 5
    2 LITERATURE REVIEW ..... 6
    2.1 The TCP/IP Reference Model ..... 6
    2.1.1 Internet Protocol (IP) ..... 7
    2.1.2 Internet Control Message Protocol (ICMP) ..... 10
    2.1.3 User Datagram Protocol (UDP) ..... 12
    2.1.4 Transmission Control Protocol (TCP) ..... 13
    2.2 Naive Bayes Classifier ..... 16
    2.3 Some Well-Known Attacks ..... 18
    2.3.1 DoS ..... 18
    2.3.2 Probe ..... 22
    2.4 jNetPcap ..... 25

  • vii

    2.5 jSMILE ..... 25
    3 SYSTEM DESIGN ..... 26
    3.1 System Block Diagram ..... 27
    3.2 Data Flow Diagrams (DFDs) ..... 27
    3.3 Unified Modeling Language (UML) ..... 30
    4 METHODOLOGY ..... 31
    5 IMPLEMENTATION ..... 33
    5.1 Object-Oriented Design ..... 33
    6 TESTING ..... 34
    6.1 Level of Testing ..... 34
    6.2 Software Testing Strategies ..... 35
    7 RESULT ..... 36
    7.1 Screenshots ..... 36
    7.2 Comparison with Other Existing System ..... 41
    8 CONCLUSIONS AND FURTHER WORK ..... 42
    8.1 Conclusions ..... 42
    8.2 Further Work ..... 42
    REFERENCES ..... 43
    APPENDIX A: RFCs ..... 45
    APPENDIX B: UDP and TCP Ports ..... 47
    APPENDIX C: ICMP Messages ..... 48
    APPENDIX D: CD Contents ..... 50

  • viii

    LIST OF FIGURES

    Figure 2.1 TCP/IP Internet Model ..... 7
    Figure 2.2 IP Header Format ..... 8
    Figure 2.3 ICMP Header Format ..... 11
    Figure 2.4 UDP Header Format ..... 12
    Figure 2.5 TCP Header Format ..... 13
    Figure 2.6 Smurf attack ..... 20
    Figure 3.1 System Block Diagram ..... 27
    Figure 3.2 Level-0 DFD ..... 28
    Figure 3.3 Level-1 DFD ..... 28
    Figure 3.4 Level-2 DFD ..... 29
    Figure 3.5 Use Case Diagram ..... 30
    Figure 7.1 Naive Bayes Classifier ..... 36
    Figure 7.2 GUI Layout ..... 37
    Figure 7.3 Detection of normal packets only ..... 38
    Figure 7.4 Detection of anomalous packets only ..... 39
    Figure 7.5 Detection of both normal and anomalous packets ..... 40
    Figure 7.6 Accuracy of known attack ..... 41
    Figure 7.7 Accuracy of unknown attack ..... 41
    Figure 7.8 Ease of Use ..... 41

  • ix

    LIST OF TABLES

    Table 2.1 Types of Service ..... 9
    Table 2.2 Description of flags in the control field ..... 15
    Table A.1 RFCs for each protocol ..... 45
    Table B.1 List of UDP and TCP ports ..... 47
    Table C.1 List of permitted ICMP messages ..... 48

  • x

    LIST OF SYMBOLS AND ABBREVIATIONS

    ∏ Product

    ACK Acknowledgment

    API Application Programming Interface

    DFDs Data Flow Diagrams

    DNS Domain Name System

    DoS Denial-of-Service

    DS Dataset

    DSCP Differentiated Services Code Point

    GUI Graphical User Interface

    HIDS Host-based Intrusion Detection System

    ICMP Internet Control Message Protocol

    IDS Intrusion Detection System

    INIDS Intelligent Network Intrusion Detection System

    IP Internet Protocol

    NIDS Network Intrusion Detection System

    OS Operating System

    TCP Transmission Control Protocol

    TCP/IP Transmission Control Protocol / Internet Protocol

    TOS Type of Service

    TTL Time to Live

    UDP User Datagram Protocol

  • 1

    1. INTRODUCTION

    Nowadays, as more people use the Internet, their computers and the valuable data stored on
    them become more attractive targets for intruders. Attackers scan the Internet constantly,
    searching for potential vulnerabilities in the machines connected to the network. Intruders
    aim to gain control of a machine and insert malicious code into it. Later, using these
    enslaved machines (also called zombies), an intruder may launch attacks such as worm
    attacks, Denial-of-Service (DoS) attacks and probing attacks.

    1.1. What is an IDS?

    An intrusion is any set of actions that threatens the integrity, availability, or
    confidentiality of a network resource. An intrusion detection system (IDS) monitors network
    traffic for suspicious activity and alerts the system or network administrator. In some
    cases the IDS may also respond to anomalous or malicious traffic by taking action such as
    blocking the user or source IP address from accessing the network.

    IDS come in a variety of flavors and approach the goal of detecting suspicious traffic in

    different ways. There are network based (NIDS) and host based (HIDS) intrusion detection

    systems.

    a) NIDS: Network Intrusion Detection Systems (NIDS) are a subset of security

    management systems that are used to discover inappropriate, incorrect, or anomalous

    activities within networks.

    b) HIDS: A host-based intrusion detection system (HIDS) monitors and analyzes the
    internals of a computing system rather than the network packets on its external interfaces.

    There are IDS that detect based on looking for specific signatures of known threats, similar
    to the way antivirus software typically detects and protects against malware, and there are

  • 2

    IDS that detect based on comparing traffic patterns against a baseline and looking for

    anomalies.

    a) Signature Based: A signature-based IDS monitors packets on the network and compares
    them against a database of signatures or attributes of known malicious threats. This is
    similar to the way most antivirus software detects malware. The issue is that there is a
    lag between a new threat being discovered in the wild and the signature for detecting that
    threat being applied to the IDS. During that lag, the IDS is unable to detect the new
    threat. The approach is therefore limited by its dependence on frequent updates of the
    signature database and by its inability to generalize and detect novel or unknown
    intrusions.

    b) Anomaly Based: An anomaly-based IDS monitors network traffic and compares it against
    an established baseline. The baseline identifies what is normal for that network (what sort
    of bandwidth is generally used, what protocols are used, what ports and devices generally
    connect to each other), and the IDS alerts the administrator or user when it detects traffic
    that is anomalous, that is, significantly different from the baseline. However, statistical
    anomaly detection is not based on an adaptive intelligent model and cannot learn from
    normal and malicious traffic patterns.
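
    The two detection styles described above can be illustrated with a minimal sketch. It is
    not taken from INIDS or from any real IDS: the Signature record, the byte pattern, the
    packets-per-second samples and the three-standard-deviation threshold are all hypothetical
    choices made only to show the idea.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Signature:
    """Hypothetical signature record: a byte pattern tied to a protocol and port."""
    name: str
    protocol: str
    dst_port: Optional[int]  # None means "any destination port"
    pattern: bytes


def match_signatures(protocol: str, dst_port: int, payload: bytes,
                     signatures: Sequence[Signature]) -> List[str]:
    """Signature-based check: return the names of all signatures the packet matches."""
    return [
        s.name for s in signatures
        if s.protocol == protocol
        and (s.dst_port is None or s.dst_port == dst_port)
        and s.pattern in payload
    ]


def build_baseline(samples: Sequence[float]) -> Tuple[float, float]:
    """Anomaly-based check, step 1: summarize normal traffic as (mean, std deviation).

    At least two samples are required, otherwise statistics.stdev raises an error.
    """
    return mean(samples), stdev(samples)


def is_anomalous(value: float, baseline: Tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Anomaly-based check, step 2: flag values far from the baseline.

    The threshold (in standard deviations) is an assumed tuning parameter.
    """
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


if __name__ == "__main__":
    signatures = [Signature("example-path-traversal", "tcp", 80, b"../..")]
    print(match_signatures("tcp", 80, b"GET /../../etc/passwd HTTP/1.0", signatures))

    # Hypothetical packets-per-second measurements taken during normal operation.
    baseline = build_baseline([100, 110, 90, 105, 95])
    print(is_anomalous(104, baseline), is_anomalous(500, baseline))
```

    The sketch also shows the weakness of each style: a payload that matches no stored pattern
    passes the signature check unnoticed, and the fixed baseline never adapts as traffic
    changes.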

    There are IDS that simply monitor and alert and there are IDS that perform an action or

    actions in response to a detected threat.

    a) Passive IDS: A passive IDS simply detects and alerts. When suspicious or malicious

    traffic is detected an alert is generated and sent to the administrator or user and it is up to

    them to take action to block the activity or respond in some way.

    b) Reactive IDS: A reactive IDS not only detects suspicious or malicious traffic and
    alerts the administrator, but also takes predefined actions to respond to the threat.
    Typically this means blocking any further network traffic from the source IP address or
    user.

  • 3

    Intrusion detection systems help network administrators prepare for and deal with network

    security attacks. These systems collect information from a variety of systems and network

    sources, and analyze them for signs of intrusion and misuse. A variety of techniques have

    been employed for analysis ranging from traditional statistical methods to new machine

    learning approaches.

    1.2. What is not an IDS?

    Contrary to popular marketing belief and terminology employed in the literature on

    intrusion detection systems, not everything falls into this category. In particular, the

    following security devices are not IDS:

    a) Network logging systems, for example network traffic monitoring systems.
    b) Anti-virus products designed to detect malicious software such as viruses, Trojan
    horses, worms, and logic bombs.
    c) Firewalls.
    d) Security/cryptographic systems, for example VPN, SSL, S/MIME, Kerberos, and RADIUS.

    1.3. Attack Types

    Attacks can be classified into three types:

    a) Reconnaissance: These attacks gather information about a system in order to find its
    weaknesses. Examples include port sweeps, ping sweeps, port scans, and Domain Name
    System (DNS) zone transfers.

    b) Exploits: These attacks take advantage of a known bug or design flaw in the system.

    c) Denial-of-Service (DoS): These attacks disrupt or deny access to a service or resource.

  • 4

    1.4. Existing System

    One of the best-known and most widely used intrusion detection systems is the open-source,
    freely available Snort. It is available for a number of platforms and operating systems,
    including Linux and Windows. Snort has a large and loyal following, and many resources on
    the Internet offer signatures that can be installed to detect the latest threats.

    1.5. Problem Statement

    The classical signature-based approach:
    a) Cannot detect unknown or new intrusions.
    b) Requires patches and regular updates.

    The statistical anomaly-based approach:
    a) Is not based on an adaptive intelligent model.
    b) Cannot learn from normal and malicious traffic patterns.

    An alternative approach based on machine learning must therefore be developed.

    1.6. Objectives

    a) To implement an intrusion detection system using a Naive Bayes classifier.
    b) To protect an organization's confidential information from outside and inside intruders.
    c) To detect novel or unknown intrusions in real time.

  • 5

    1.7. Scope of the Project

    Increased network complexity, greater access, and a growing emphasis on the Internet have

    made network security a major concern for organizations. The number of computer

    security breaches has risen significantly in the last three years. In February 2000, several

    major web sites including Yahoo, Amazon, eBay, Datek, and E-Trade were shut down

    due to denial-of-service attacks on their web servers.

    Today, a large amount of sensitive information is processed through computer networks, so
    it is increasingly important to make information systems, especially those used for
    critical functions in the military and commercial sectors, resistant and tolerant to
    network intrusions. Hence, intrusion detection has become an integral part of the
    information security process.

  • 6

    2. LITERATURE REVIEW

    2.1. The TCP/IP Reference Model

    TCP/IP is a layered architecture: each layer provides one set of functions and relies on
    the layer below it. New functionality can therefore be added at the application layer, for
    example, without re-implementing the whole TCP/IP stack or embedding a complete TCP/IP
    stack in the application itself.

    The following four layers comprise the TCP/IP Internet model:

    a) Application layer

    Handles implementation of user applications.

    b) Transport layer

    Manages end-to-end communications between hosts.

    The two transport layer protocols are TCP and UDP.

    c) Network layer

    Gets data from source to destination.

    d) Link layer

    Manages data transfer to and from physical medium.

  • 7

    Figure 2.1 TCP/IP Internet Model

    2.1.1. Internet Protocol (IP)

    IP resides in the network (Internet) layer. It is an unreliable, connectionless datagram
    protocol that offers a best-effort delivery service. The term best-effort means that IPv4
    provides no error control or flow control (except for error detection on the header). IPv4
    assumes the unreliability of the underlying layers and does its best to get a transmission
    through to its destination, but with no guarantees. If reliability is important, IPv4 must
    be paired with a reliable protocol such as TCP.

    IP Header

    A datagram is a variable-length packet consisting of two parts: header and data.
    The header is 20 to 60 bytes in length and contains information essential to routing and
    delivery. It has a 20-byte fixed part and a variable-length optional part of at most
    40 bytes. The header format is shown in Figure 2.2.

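    [Figure 2.1 diagram: Web browser, TCP, IP and Ethernet driver on one host; Ethernet driver,
    IP, TCP and Web server on the other. Data units exchanged at each layer: stream, TCP
    segment, IP datagram, Ethernet frame.]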

  • 8

    |<--------------------------------- 32 bits --------------------------------->|
    | VER (4 bits) | HLEN (4 bits) | Service (8 bits) | Total Length (16 bits)      |
    | Identification (16 bits) | Flags (3 bits) | Fragmentation Offset (13 bits)   |
    | TTL (8 bits) | Protocol (8 bits) | Header Checksum (16 bits)                 |
    | Source Address (32 bits)                                                     |
    | Destination Address (32 bits)                                                |
    | Options | Padding                                                            |

    Figure 2.2 IP Header Format

    IP Header Field Description

    Version (VER): This four-bit field gives the IP version. For IPv4 its value is 4
    (binary 0100).

    Header Length (HLEN): This four-bit field defines the total length of the datagram
    header in four-byte words. This field is needed because the length of the header is
    variable (between 20 and 60 bytes). When there are no options, the header length is
    20 bytes, and the value of this field is five (5 x 4 = 20). When the option field is at
    its maximum size, the value of this field is 15 (15 x 4 = 60).

    Service: This has two interpretations. They are:

    a) Service Type

    In this interpretation, the first three bits are called precedence bits. The next four bits are

    called type of service (TOS) bits, and the last bit is not used.

  • 9

    Table 2.1 Types of Service

    TOS Bits    Description
    0000        Normal (default)
    0001        Minimize cost
    0010        Maximize reliability
    0100        Maximize throughput
    1000        Minimize delay

    b) Differentiated Services

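    In this interpretation, the first six bits [0-5] form the Differentiated Services Code
    Point (DSCP). The remaining two bits [6-7] were left unused by the original Differentiated
    Services definition; RFC 3168 later assigned them to Explicit Congestion Notification (ECN).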

    Total Length: This field defines the total length (header plus data) of the IPv4 datagram in

    bytes. The maximum size is 65535 octets, or bytes, for a single packet.

    Identification: This field is used in reassembly of fragmented packets.

    Flags: This field is used in fragmentation. The first bit is reserved and must be set to
    zero. The second bit is zero if the packet may be fragmented and one if it may not be
    fragmented. The third and last bit is zero if this is the last fragment and one if more
    fragments of the same packet follow.

    Fragmentation Offset: This field tells where in the original datagram the data of this
    fragment belongs. The offset is measured in units of 8 bytes (64 bits), and the first
    fragment has offset zero.

    Time to Live: The TTL field defines how long the packet may live, or rather how many

    "hops" it may take over the Internet. After processing the datagram, each router

    decrements this number by one. If this value, after being decremented, is zero, the router

    discards the datagram.

  • 10

    Protocol: This field identifies the higher-level protocol carried in the datagram, for
    example TCP, UDP or ICMP.

    Checksum: This field is used for error detection.

    Source Address: This field contains the source address.

    Destination Address: This field contains the destination address.

    Option: If the Header Length is greater than five, the Options field is present and must
    be considered. The Options field contains optional settings such as Internet timestamp,
    record route or source route.

    Padding: This field is used to make the header end on a 32-bit boundary. It must always
    be set to zeros.
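
    The field layout described above can be decoded with a few lines of code. The following
    is a minimal sketch, not the INIDS implementation (which this report builds on jNetPcap):
    the function name, the dictionary keys and the sample header bytes are illustrative
    assumptions. The TOS names come from Table 2.1, and the offsets follow Figure 2.2.

```python
import socket
import struct

# Table 2.1: TOS bit patterns under the "service type" interpretation.
TOS_NAMES = {
    0b0000: "Normal (default)",
    0b0001: "Minimize cost",
    0b0010: "Maximize reliability",
    0b0100: "Maximize throughput",
    0b1000: "Minimize delay",
}


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header plus any options."""
    if len(packet) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    (ver_hlen, service, total_length, identification, flags_offset,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])

    version = ver_hlen >> 4                  # upper four bits
    header_bytes = (ver_hlen & 0x0F) * 4     # HLEN counts four-byte words
    if version != 4 or header_bytes < 20 or len(packet) < header_bytes:
        raise ValueError("not a well-formed IPv4 header")

    return {
        "version": version,
        "header_bytes": header_bytes,
        # Service byte, first interpretation: precedence + TOS bits.
        "precedence": service >> 5,
        "tos": TOS_NAMES.get((service >> 1) & 0x0F, "unlisted combination"),
        # Service byte, second interpretation: six-bit DSCP.
        "dscp": service >> 2,
        "total_length": total_length,
        "identification": identification,
        "dont_fragment": bool(flags_offset & 0x4000),
        "more_fragments": bool(flags_offset & 0x2000),
        "fragment_offset_bytes": (flags_offset & 0x1FFF) * 8,  # units of 8 bytes
        "ttl": ttl,
        "protocol": protocol,                # e.g. 1 = ICMP, 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
        "options": packet[20:header_bytes],
    }


# Hand-written sample header (an assumption for illustration, not captured traffic).
SAMPLE_HEADER = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")

if __name__ == "__main__":
    for field, value in parse_ipv4_header(SAMPLE_HEADER).items():
        print(f"{field}: {value}")
```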

    2.1.2. Internet Control Message Protocol (ICMP)

    The Internet Control Message Protocol (ICMP) gives important information about the
    health of the network.

    Types of Messages

    ICMP messages are divided into two broad categories:

    a) error-reporting messages, and

    b) query messages.

    The error-reporting messages report problems that a router or a host (destination) may

    encounter when it processes an IP packet. Five types of errors are handled: destination

    unreachable, source quench, time exceeded, parameter problems, and redirection. The

    query messages, which occur in pairs, help a host or a network manager get specific

    information from a router or another host. For example, nodes can discover their

  • 11

    neighbors. Also, hosts can discover and learn about routers on their network, and routers

    can help a node redirect its messages. The four types of query messages are echo request and

    reply, timestamp request and reply, address-mask request and reply, and router solicitation

    and advertisement.

    ICMP Header

    | Type (8 bits) | Code (8 bits) | Checksum (16 bits) |
    | Rest of the header                                 |
    | Data section                                       |

    Figure 2.3 ICMP Header Format

    ICMP Header Field Description

    Type: The type field identifies the kind of ICMP message. Each kind of message has its
    own type value.

    Code: Each ICMP type can be further qualified by a code. Some types have only a single
    code, while others have several codes that they can use.

    Checksum: This field is used for error detection.
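
    As an illustration of the Type, Code and Checksum fields, the sketch below builds an echo
    request (type 8, code 0) and verifies it again. The report does not describe the checksum
    algorithm; the code assumes the standard 16-bit one's-complement Internet checksum
    (RFC 1071), and the function names and payload are hypothetical.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of 16-bit words, complemented (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request: Type=8, Code=0, then identifier and sequence."""
    unchecked = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence) + payload
    checksum = internet_checksum(unchecked)  # computed with the checksum field zeroed
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


def parse_icmp_header(message: bytes) -> dict:
    """Decode Type, Code and Checksum and check the message for corruption."""
    if len(message) < 4:
        raise ValueError("an ICMP message has at least a 4-byte header")
    icmp_type, code, checksum = struct.unpack("!BBH", message[:4])
    return {
        "type": icmp_type,
        "code": code,
        "checksum": checksum,
        # Summing a message that includes a correct checksum yields zero.
        "checksum_ok": internet_checksum(message) == 0,
    }


if __name__ == "__main__":
    message = build_echo_request(identifier=1, sequence=1, payload=b"ping")
    print(parse_icmp_header(message))
```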

  • 12

    2.1.3. User Datagram Protocol (UDP)

    The User Datagram Protocol (UDP) is called a connectionless, unreliable transport

    protocol. It does not add anything to the services of IP except to provide process-to-

    process communication instead of host-to-host communication. Also, it performs very

    limited error checking.

    If UDP is so powerless, why would a process want to use it? With the disadvantages come

    some advantages. UDP is a very simple protocol using a minimum of overhead. If a

    process wants to send a small message and does not care much about reliability, it can use

    UDP.

    UDP Header

    The UDP header can be viewed as a very basic, simplified version of the TCP header. It
    contains the source port, destination port, total length and checksum, as shown in
    Figure 2.4.

    | Source Port (16 bits)  | Destination Port (16 bits) |
    | Total Length (16 bits) | Checksum (16 bits)         |

    Figure 2.4 UDP Header Format

    UDP Header Field Description

    Source Port: This field indicates the port number used by the process running on the

    source host. It is 16-bits long. The port number can range from 0 to 65,535.

    Destination Port: This field indicates the port number used by the process running on the

    destination host. It is also 16-bits long.

  • 13

    Total Length: This field specifies the length in bytes of the whole user datagram
    (header and data portions).

    Checksum: This field is used to detect errors over the entire user datagram (header plus

    data).
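
    Because the UDP header is only four 16-bit fields, decoding it is short. The sketch below
    is illustrative only; the function name and the sample datagram are assumptions.

```python
import struct


def parse_udp_header(datagram: bytes) -> dict:
    """Decode the 8-byte UDP header laid out in Figure 2.4."""
    if len(datagram) < 8:
        raise ValueError("a UDP header is exactly 8 bytes long")
    source_port, destination_port, total_length, checksum = struct.unpack(
        "!HHHH", datagram[:8]
    )
    return {
        "source_port": source_port,
        "destination_port": destination_port,
        "total_length": total_length,        # header plus data, in bytes
        "checksum": checksum,
        "payload": datagram[8:total_length],
    }


if __name__ == "__main__":
    # Hypothetical datagram: port 53000 -> port 53, 4 bytes of data, checksum left at 0.
    sample = struct.pack("!HHHH", 53000, 53, 8 + 4, 0) + b"test"
    print(parse_udp_header(sample))
```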

    2.1.4. Transmission Control Protocol (TCP)

    TCP, like UDP, is a process-to-process (program-to-program) protocol and therefore uses
    port numbers. Unlike UDP, TCP is a connection-oriented protocol; it creates a virtual
    connection between two TCPs to send data. In addition, TCP uses flow and error control
    mechanisms at the transport level. In brief, TCP is a connection-oriented, reliable
    transport protocol. It adds connection-oriented and reliability features to the services
    of IP.

    TCP Header

    |<--------------------------------- 32 bits --------------------------------->|
    | Source Port Address (16 bits) | Destination Port Address (16 bits)           |
    | Sequence Number (32 bits)                                                    |
    | Acknowledgment Number (32 bits)                                              |
    | HLEN (4 bits) | Reserved (6 bits) | URG | ACK | PSH | RST | SYN | FIN | Window Size (16 bits) |
    | Checksum (16 bits) | Urgent Pointer (16 bits)                                |
    | Options and Padding                                                          |

    Figure 2.5 TCP Header Format

  • 14

    TCP Header Field Description

    Source Port: This field indicates the source port of the packet. The source port is directly

    bound to the process on the sending system.

    Destination Port: This field indicates the destination port of the TCP packet. Just as with

    the source port, this port is directly bound to the process on the receiving system.

    Sequence Number: This field gives the position of the segment's first data byte in the
    byte stream, so that the TCP stream can be properly sequenced. The receiver uses it to
    fill in the Acknowledgment Number that confirms the data was properly received.

    Acknowledgement Number: This field is used to acknowledge data a host has received. It
    holds the sequence number of the next byte the receiver expects. For example, if we
    receive a segment with Sequence number x carrying n bytes of data, and everything is okay
    with the segment, we reply with an ACK whose Acknowledgment number is set to x + n.

    Header Length: This four-bit field indicates the number of four-byte words in the TCP
    header. The length of the header can be between 20 and 60 bytes. Therefore, the value of
    this field can be between five (5 x 4 = 20) and 15 (15 x 4 = 60).

    Reserved: This is a six-bit field reserved for future use.

    Control: This field defines six different control flags:

  • 15

    Table 2.2 Description of flags in the control field

    Flag    Description
    URG     The value of the urgent pointer field is valid.
    ACK     The value of the acknowledgment field is valid.
    PSH     Push the data.
    RST     Reset the connection.
    SYN     Synchronize sequence numbers during connection.
    FIN     Terminate the connection.

    Window: This field is used by the receiving host to tell the sender how much data the
    receiver is willing to accept at the moment. The receiver sends back an ACK containing the
    acknowledgment number, and the Window field then gives the number of bytes beyond that
    number which the sending host may send before it receives the next ACK. The next ACK
    updates the window that the sender may use.

    Checksum: This field contains a checksum computed over the whole TCP segment (header and
    data). The checksum also covers a 96-bit pseudo header containing the source address,
    destination address, protocol, and TCP length, which helps detect segments delivered to
    the wrong host.

    Urgent Pointer: This field contains a pointer that points to the end of the data which is

    considered urgent. If the connection has important data that should be processed as soon as

    possible by the receiving end, the sender can set the URG flag and set the Urgent pointer to

    indicate where the urgent data ends.

    Option: The Option field is a variable length field and contains optional headers that we

    may want to use.

    Padding: This field pads the TCP header so that the whole header ends at a 32-bit
    boundary. This ensures that the data part of the segment begins on a 32-bit boundary. The
    padding always consists of zeros.
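
    The TCP fields above, including the six control flags of Table 2.2, can be decoded as in
    the following sketch. It is illustrative rather than the INIDS code; the function name,
    dictionary keys and sample segment are assumptions.

```python
import struct

# Control flags in header order (Figure 2.5), most significant of the six bits first.
FLAG_NAMES = ("URG", "ACK", "PSH", "RST", "SYN", "FIN")


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header plus any options."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (source_port, destination_port, sequence, acknowledgment,
     hlen_flags, window, checksum, urgent_pointer) = struct.unpack(
        "!HHIIHHHH", segment[:20]
    )

    header_bytes = (hlen_flags >> 12) * 4    # HLEN counts four-byte words
    if header_bytes < 20 or len(segment) < header_bytes:
        raise ValueError("not a well-formed TCP header")

    flag_bits = hlen_flags & 0x3F            # low six bits hold the control flags
    flags = [name for position, name in enumerate(FLAG_NAMES)
             if flag_bits & (0x20 >> position)]

    return {
        "source_port": source_port,
        "destination_port": destination_port,
        "sequence": sequence,
        "acknowledgment": acknowledgment,
        "header_bytes": header_bytes,
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent_pointer,
        "options": segment[20:header_bytes],
        "payload": segment[header_bytes:],
    }


if __name__ == "__main__":
    # Hypothetical SYN segment: 20-byte header (HLEN = 5), only the SYN flag set.
    syn = struct.pack("!HHIIHHHH", 40000, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
    print(parse_tcp_header(syn))
```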

  • 16

    2.2. Naive Bayes Classifier

    A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes'
    theorem with strong (naive) independence assumptions. A more descriptive term for the
    underlying probability model would be "independent feature model".

    In simple terms, a naive Bayes classifier assumes that, given the class, the presence (or
    absence) of a particular feature is unrelated to the presence (or absence) of any other
    feature.

    Depending on the precise nature of the probability model, naive Bayes classifiers can be

    trained very efficiently in a supervised learning setting. In spite of their naive design and

    apparently over-simplified assumptions, naive Bayes classifiers have worked quite well in

    many complex real-world situations.

    An advantage of the naive Bayes classifier is that it requires only a small amount of
    training data to estimate the parameters (means and variances of the variables) necessary
    for classification. Because the variables are assumed to be independent, only the
    variances of the variables for each class need to be determined and not the entire
    covariance matrix. The Naive Bayes algorithm affords fast, highly scalable model building
    and scoring: it scales linearly with the number of predictors and rows, and the build
    process can be parallelized. Naive Bayes can be used for both binary and multiclass
    classification problems.

    The Naive Bayes algorithm is based on conditional probabilities. It uses Bayes' Theorem, a

    formula that calculates a probability by counting the frequency of values and combinations

    of values in the historical data.

    Bayes' Theorem

    Bayes' Theorem finds the probability of an event occurring given the probability of another

    event that has already occurred. If B represents the dependent event and A represents the

    prior event, Bayes' theorem can be stated as follows.

  • 17

    Prob(B given A) = Prob(A and B)/Prob(A)

    To calculate the probability of B given A, the algorithm counts the number of cases where

    A and B occur together and divides it by the number of cases where A occurs (with or without B).
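
    The counting rule above can be worked through on a handful of records. The records below
    are hypothetical and exist only to make the arithmetic visible; they are not taken from
    the KDDCup99 dataset.

```python
from typing import Callable, Dict, Sequence

Record = Dict[str, str]


def conditional_probability(records: Sequence[Record],
                            a: Callable[[Record], bool],
                            b: Callable[[Record], bool]) -> float:
    """Estimate Prob(B given A) = count(A and B) / count(A) by counting records."""
    count_a = sum(1 for r in records if a(r))
    count_a_and_b = sum(1 for r in records if a(r) and b(r))
    if count_a == 0:
        raise ValueError("event A never occurs, so Prob(B given A) is undefined")
    return count_a_and_b / count_a


# Hypothetical connection records.
RECORDS = [
    {"protocol": "icmp", "label": "attack"},
    {"protocol": "icmp", "label": "attack"},
    {"protocol": "icmp", "label": "normal"},
    {"protocol": "tcp", "label": "normal"},
    {"protocol": "tcp", "label": "attack"},
]

if __name__ == "__main__":
    # A = "the protocol is ICMP", B = "the record is an attack":
    # A occurs 3 times, A and B occur together 2 times, so the estimate is 2/3.
    p = conditional_probability(
        RECORDS,
        a=lambda r: r["protocol"] == "icmp",
        b=lambda r: r["label"] == "attack",
    )
    print(round(p, 3))
```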

    Naive Bayes Algorithm

    Let X be a set of instances xi = (a1, a2, ..., an) and let V be a set of classifications vj.

    Naive Bayes assumption:

    $$P(a_1, a_2, \ldots, a_n \mid v_j) = \prod_{i} P(a_i \mid v_j) \qquad (2.1)$$

    This leads to the following algorithm:

    Naive_Bayes_Learn ( examples )
        for each target value vj
            estimate P ( vj )
            for each attribute value ai of each attribute a
                estimate P ( ai | vj )

    Classify_New_Instance ( x )
        $$v_{NB} = \arg\max_{v_j \in V} P(v_j) \prod_{a_i \in x} P(a_i \mid v_j)$$

    We generally estimate P ( ai | vj ) using m-estimates:

    $$P(a_i \mid v_j) = \frac{n_c + m\,p}{n + m} \qquad (2.2)$$

    where:
    n = the number of training examples for which v = vj
    nc = the number of examples for which v = vj and a = ai
    p = a priori estimate for P ( ai | vj )
    m = the equivalent sample size
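
    The learning and classification steps, together with the m-estimate, can be sketched as
    follows. This is an illustration and not the INIDS implementation (which this report
    builds on jSMILE). The class name, the uniform prior p = 1/k over the k values seen for an
    attribute, the default m = 1, and the toy training rows are all assumptions. Log
    probabilities are summed instead of multiplying probabilities, which gives the same
    argmax while avoiding numerical underflow.

```python
import math
from collections import Counter, defaultdict
from typing import Hashable, Sequence, Tuple

Instance = Tuple[Hashable, ...]


class NaiveBayes:
    """Naive Bayes over discrete attributes, using m-estimates for P(ai | vj)."""

    def __init__(self, m: float = 1.0) -> None:
        if m <= 0:
            raise ValueError("m must be positive so that no estimate is zero")
        self.m = m

    def fit(self, examples: Sequence[Instance], labels: Sequence[Hashable]) -> "NaiveBayes":
        """Naive_Bayes_Learn: count classes and attribute values per class."""
        self.class_counts = Counter(labels)              # n for each vj
        self.total = len(labels)
        self.value_counts = defaultdict(Counter)         # (attribute index, vj) -> nc
        self.attribute_values = defaultdict(set)         # attribute index -> seen values
        for x, v in zip(examples, labels):
            for i, a in enumerate(x):
                self.value_counts[(i, v)][a] += 1
                self.attribute_values[i].add(a)
        return self

    def likelihood(self, i: int, a: Hashable, v: Hashable) -> float:
        """m-estimate of P(ai | vj): (nc + m*p) / (n + m)."""
        n = self.class_counts[v]
        n_c = self.value_counts[(i, v)][a]
        p = 1.0 / max(1, len(self.attribute_values[i]))  # assumed uniform prior
        return (n_c + self.m * p) / (n + self.m)

    def classify(self, x: Instance) -> Hashable:
        """Classify_New_Instance: argmax over vj of P(vj) * product of P(ai | vj)."""
        def log_score(v: Hashable) -> float:
            score = math.log(self.class_counts[v] / self.total)
            for i, a in enumerate(x):
                score += math.log(self.likelihood(i, a, v))
            return score
        return max(self.class_counts, key=log_score)


# Toy rows of (protocol, connection flag); hypothetical, not real KDDCup99 records.
EXAMPLES = [("icmp", "SF"), ("icmp", "SF"), ("tcp", "S0"), ("tcp", "S0"),
            ("tcp", "SF"), ("udp", "SF"), ("tcp", "SF"), ("udp", "SF")]
LABELS = ["smurf", "smurf", "neptune", "neptune",
          "normal", "normal", "normal", "normal"]

if __name__ == "__main__":
    model = NaiveBayes(m=1.0).fit(EXAMPLES, LABELS)
    for instance in [("icmp", "SF"), ("tcp", "S0"), ("udp", "SF")]:
        print(instance, "->", model.classify(instance))
```

    Because m is positive, an attribute value never seen with a class still receives a small
    non-zero probability, so a single unseen value does not force the whole product to zero.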

  • 18

    2.3. Some Well-Known Attacks

    2.3.1. DoS

    A denial-of-service (DoS) attack, or distributed denial-of-service (DDoS) attack, is an
    attempt to make a computer resource unavailable to its intended users. Perpetrators of DoS
    attacks typically target sites or services hosted on high-profile web servers such as
    those of banks and credit card payment gateways. The term is generally used with regard
    to computer networks but is not limited to this field; for example, it is also used in
    reference to CPU resource management.

    One common method of attack involves saturating the target (victim) machine with

    external communications requests, such that it cannot respond to legitimate traffic, or

    responds so slowly as to be rendered effectively unavailable. In general terms, DoS attacks
    are implemented by forcing the targeted computer(s) to reset, by consuming their
    resources so that they can no longer provide their intended service, or by obstructing the
    communication media between the intended users and the victim so that they can no longer
    communicate adequately.

    Denial-of-service attacks are considered violations of the IAB's Internet proper use policy,

    and also violate the acceptable use policies of virtually all Internet Service Providers. They

    also commonly constitute violations of the laws of individual nations.

    There are many varieties of denial of service (or DoS) attacks. Some DoS attacks (like a

    mailbomb, neptune, or smurf attack) abuse a perfectly legitimate feature. Others (teardrop,

    Ping of Death) create malformed packets that confuse the TCP/IP stack of the machine that

    is trying to reconstruct the packet. Still others (apache2, back, syslogd) take advantage of

    bugs in a particular network daemon.

    Some captured DoS attacks are as follows:

    a) Smurf

    b) Neptune

    c) Teardrop

    d) Pod

    e) Land

    f) Nuke

    Smurf

    The smurf attack is a way of generating significant computer network traffic on a victim

    network. This is a type of denial-of-service attack that floods a target system via spoofed

    broadcast ping messages.

    In the "smurf" attack, attackers use ICMP echo request packets directed to IP broadcast

    addresses from remote locations to create a denial-of-service attack. There are three parties

    in these attacks: the attacker, the intermediary, and the victim (note that the intermediary

    can also be a victim). The attacker sends ICMP echo request packets to the broadcast

    address (xxx.xxx.xxx.255) of many subnets with the source address spoofed to be that of

    the intended victim. Any machines that are listening on these subnets will respond by

    sending ICMP echo reply packets to the victim. The smurf attack is effective because the

    attacker is able to use broadcast addresses to amplify what would otherwise be a rather

    innocuous ping flood. In the best case (from an attacker's point of view), the attacker can

    flood a victim with a volume of packets 255 times as great in magnitude as the attacker

    would be able to achieve without such amplification. This amplification effect is illustrated

    by Figure 2.6. The attacking machine sends a single spoofed packet to the broadcast

    address of some network, and every machine that is located on that network responds by

    sending a packet to the victim machine. Because there can be as many as 255 machines on

    an Ethernet segment, the attacker can use this amplification to generate a flood of ping

    packets 255 times as great in size as would otherwise be possible. This figure is a

    simplification of the smurf attack. In an actual attack, the attacker sends a stream of ICMP

    ECHO requests to the broadcast address of many subnets, resulting in a large,

    continuous stream of ECHO replies that flood the victim.

    Figure 2.6 Smurf attack (one echo request sent to the broadcast address; hundreds of echo
    replies flood the victim)
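One simple way to look for the pattern described above is to count how many distinct hosts send ICMP echo replies to the same destination. The sketch below is only an illustrative heuristic, not the detector built in this project; the tuple layout, the `min_sources` cut-off and the sample addresses are assumptions.

```python
from collections import defaultdict

ICMP_ECHO_REPLY = 0  # ICMP type 0 is an echo reply

def smurf_suspects(packets, min_sources=3):
    """packets: iterable of (src_ip, dst_ip, icmp_type) tuples.

    Returns the destinations that received echo replies from at least
    min_sources distinct senders.
    """
    senders = defaultdict(set)
    for src, dst, icmp_type in packets:
        if icmp_type == ICMP_ECHO_REPLY:
            senders[dst].add(src)
    return {dst for dst, srcs in senders.items() if len(srcs) >= min_sources}

# Hypothetical capture: three hosts on one subnet reply to the same victim.
sample = [
    ("192.168.0.20", "10.0.0.5", 0),
    ("192.168.0.21", "10.0.0.5", 0),
    ("192.168.0.22", "10.0.0.5", 0),
    ("10.0.0.8", "10.0.0.9", 0),   # a single ordinary reply
    ("10.0.0.9", "10.0.0.8", 8),   # an echo request, ignored
]
print(smurf_suspects(sample))  # {'10.0.0.5'}
```

A real detector would also restrict the count to a time window and check whether the victim had actually sent matching echo requests.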

    Teardrop

    A teardrop attack is a denial of service attack that exploits IP fragment reassembly to
    crash the target computer. The attacker sends fragments whose header information is
    erroneous and indicates overlapping fragments, so that some data in some fragments must
    overwrite data in other fragments when the packet is re-assembled. Attempts to
    re-assemble these packets with overlapping data can cause the computer to crash if the
    software is not prepared to handle erroneous packet header information.
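The overlap condition can be checked directly from the fragment offsets and lengths. The following sketch assumes the fragments of one datagram have already been grouped and that offsets are expressed in bytes (the IP header stores them in units of 8 bytes); the function name and sample values are illustrative.

```python
def has_overlapping_fragments(fragments):
    """fragments: list of (offset_bytes, length_bytes) for one IP datagram."""
    end_of_previous = 0
    for offset, length in sorted(fragments):
        if offset < end_of_previous:
            return True  # this fragment starts inside data already covered
        end_of_previous = max(end_of_previous, offset + length)
    return False

# Well-formed fragmentation: each fragment starts where the previous one ended.
print(has_overlapping_fragments([(0, 1480), (1480, 1480), (2960, 520)]))  # False
# Teardrop-style pair: the second fragment lies inside the first.
print(has_overlapping_fragments([(0, 36), (24, 4)]))  # True
```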

    Neptune

    Neptune (SYN Flood) is a denial of service attack to which every TCP/IP implementation
    is vulnerable (to some degree). To detect a Neptune attack, network traffic is
    monitored for a number of simultaneous SYN packets destined for a particular machine.
    The host sending these packets is usually unreachable.


    Each half-open TCP connection made to a machine causes the tcpd server to add a

    record to the data structure that stores information describing all pending connections. This

    data structure is of finite size, and it can be made to overflow by intentionally creating too

    many partially-open connections. The half-open connections data structure on the victim

    server system will eventually fill and the system will be unable to accept any new

    incoming connections until the table is emptied out. Normally there is a timeout associated

    with a pending connection, so the half-open connections will eventually expire and the

    victim server system will recover. However, the attacking system can simply continue

    sending IP-spoofed packets requesting new connections faster than the victim system can

    expire the pending connections. In some cases, the system may exhaust memory, crash, or

    be rendered otherwise inoperative.
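The pending-connection table described above can be imitated with a small bookkeeping sketch: a SYN adds an entry, and the ACK that completes the handshake removes it. This is a simplified illustration that ignores timeouts and retransmissions; the event format, the threshold of 100 and the addresses are assumptions.

```python
from collections import defaultdict

def half_open_counts(events):
    """events: iterable of (src, sport, dst, dport, flag), flag 'SYN' or 'ACK'.

    Returns {(dst, dport): number of connections still half-open}.
    """
    pending = defaultdict(set)
    for src, sport, dst, dport, flag in events:
        key = (dst, dport)
        if flag == "SYN":
            pending[key].add((src, sport))
        elif flag == "ACK":
            pending[key].discard((src, sport))
    return {key: len(conns) for key, conns in pending.items()}

def neptune_suspects(events, threshold=100):
    """Services whose number of half-open connections exceeds the threshold."""
    return {key for key, n in half_open_counts(events).items() if n > threshold}

# Hypothetical traffic: 150 spoofed SYNs that are never completed,
# plus one ordinary handshake to another service.
events = [(f"203.0.113.{i % 250}", 1024 + i, "10.0.0.5", 80, "SYN")
          for i in range(150)]
events += [("10.0.0.9", 5000, "10.0.0.6", 22, "SYN"),
           ("10.0.0.9", 5000, "10.0.0.6", 22, "ACK")]
print(neptune_suspects(events))  # {('10.0.0.5', 80)}
```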

    POD

    A ping of death (abbreviated "POD") is a type of attack on a computer that involves
    sending a malformed or otherwise malicious ping to a computer. A ping is normally 64
    bytes in size (or 84 bytes when the IP header is considered); many computer systems cannot
    handle a ping larger than the maximum IP packet size, which is 65,535 bytes. Sending a
    ping larger than this can crash the target computer.

    Traditionally, this bug has been relatively easy to exploit. Generally, sending a 65,536 byte

    ping packet is illegal according to networking protocol, but a packet of such a size can be

    sent if it is fragmented; when the target computer reassembles the packet, a buffer overflow

    can occur, which often causes a system crash.
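The size check implied above can be written down as simple arithmetic on the fragments. The sketch assumes a 20-byte IP header without options and fragment offsets given in bytes; the function names and sample fragments are illustrative.

```python
MAX_IP_PACKET = 65535  # maximum IPv4 packet size in bytes
IP_HEADER = 20         # assumption: IP header without options

def reassembled_size(fragments):
    """fragments: list of (offset_bytes, payload_length_bytes) of one packet.

    Returns the total size the packet would have after reassembly.
    """
    return IP_HEADER + max(offset + length for offset, length in fragments)

def is_ping_of_death(fragments):
    return reassembled_size(fragments) > MAX_IP_PACKET

# An ordinary 64-byte ping occupies 84 bytes with its IP header.
print(reassembled_size([(0, 64)]))                    # 84
# A final fragment at the largest possible offset pushes the total past the limit.
print(is_ping_of_death([(0, 1480), (65528, 1000)]))   # True
```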

    This exploit has affected a wide variety of systems, including Unix, Linux, Mac, Windows,

    printers, and routers. However, most systems since 1997-1998 have been fixed, so this bug

    is mostly historical.

    In recent years, a different kind of ping attack has become widespread: ping flooding
    simply floods the victim with so much ping traffic that normal traffic fails to reach the
    system (a basic denial-of-service attack).


    Land

    The Land attack occurs when an attacker sends a spoofed SYN packet in which the source
    address is the same as the destination address. A Land attack works because
    it causes the machine to reply to itself continuously. Directed against vulnerable systems,
    this attack causes them to lock up or become unstable.
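Since the signature of this attack is a single header comparison, it can be expressed as a one-line test. The function name and arguments below are illustrative assumptions.

```python
def is_land_packet(src_ip, dst_ip, syn_flag):
    """True for a SYN packet whose source address equals its destination."""
    return bool(syn_flag) and src_ip == dst_ip

print(is_land_packet("10.0.0.5", "10.0.0.5", 1))  # True
print(is_land_packet("10.0.0.9", "10.0.0.5", 1))  # False
```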

    Nuke

    Nuke is an old DoS attack against computer networks consisting of fragmented or otherwise
    invalid ICMP packets sent to the target. It is carried out by using a modified ping utility to
    repeatedly send the corrupt data, thus slowing down the affected computer until it comes to
    a complete stop.

    2.3.2. Probe

    Probing is a class of attacks in which an attacker scans a network of computers to collect

    information or find known vulnerabilities. An intruder with a map of machines and

    services that are available on a network can use this information to look for exploits. There

    are different types of probing: some abuse the computer's legitimate features;
    others use social engineering techniques. This class of attacks is the most commonly

    heard and requires very little technical expertise. Examples are Ipsweep, Mscan, Nmap,

    Saint, Satan, Pingsweep and Portsweep attacks.

    Following are the captured attacks.

    a) Satan

    b) Ipsweep

    c) Portsweep

    d) Nmap


    Nmap

    Nmap is a "Network Mapper", used to discover computers and services on a computer

    network, thus creating a "map" of the network. Just like many simple port scanners, Nmap

    is capable of discovering passive services on a network despite the fact that such services

    aren't advertising themselves with a service discovery protocol. In addition Nmap may be

    able to determine various details about the remote computers. These include operating

    system, device type, uptime, software product used to run a service, exact version number

    of that product, presence of some firewall techniques and, on a local area network, even

    vendor of the remote network card.

    Nmap can be used for black hat hacking, or attempting to gain unauthorized access to

    computer systems. It would typically be used to discover open ports which are likely to be

    running vulnerable services, in preparation for attacking those services with another

    program.

    System administrators often use Nmap to search for unauthorized servers on their network,

    or for computers which don't meet the organization's minimum level of security.

    Satan

    Satan is a probing intrusion which automatically scans a network of computers to gather

    information or find known vulnerabilities.

    SATAN is an early predecessor of the SAINT scanning program described in the
    last section. While SAINT and SATAN are quite similar in purpose and design, the
    particular vulnerabilities that each tool checks for are slightly different. Like SAINT,
    SATAN is distributed as a collection of Perl and C programs that can be run either from
    within a web browser or from the UNIX command prompt. SATAN supports three levels
    of scanning: light, normal, and heavy. The vulnerabilities that SATAN checks for in heavy
    mode are:


    NFS export to unprivileged programs

    NFS export via portmapper

    NIS password file access

    REXD access

    tftp file access

    remote shell access

    unrestricted NFS export

    unrestricted X Server access

    write-able ftp home directory

    several Sendmail vulnerabilities

    several ftp vulnerabilities

    Scans in light and normal mode simply check for smaller subsets of these vulnerabilities.

    Ipsweep

    An Ipsweep attack is a surveillance sweep to determine which hosts are listening on a

    network. This information is useful to an attacker in staging attacks and searching for

    vulnerable machines. There are many methods an attacker can use to perform an Ipsweep

    attack. The most common method and the method used within the simulation is to send

    ICMP Ping packets to every possible address within a subnet and wait to see which

    machines respond.
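The sweep described above leaves a recognizable trace: one source sends echo requests to many different addresses. A minimal counting sketch follows; the tuple layout, the `min_targets` cut-off and the addresses are assumptions made for illustration.

```python
from collections import defaultdict

ICMP_ECHO_REQUEST = 8  # ICMP type 8 is an echo request

def ipsweep_suspects(packets, min_targets=10):
    """packets: iterable of (src_ip, dst_ip, icmp_type) tuples.

    Returns the sources that pinged at least min_targets distinct addresses.
    """
    targets = defaultdict(set)
    for src, dst, icmp_type in packets:
        if icmp_type == ICMP_ECHO_REQUEST:
            targets[src].add(dst)
    return {src for src, dsts in targets.items() if len(dsts) >= min_targets}

# Hypothetical capture: one host pings 50 addresses, another pings one.
sweep = [("203.0.113.7", f"172.16.112.{i}", 8) for i in range(1, 51)]
ordinary = [("172.16.112.20", "172.16.112.1", 8)]
print(ipsweep_suspects(sweep + ordinary))  # {'203.0.113.7'}
```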

    Portsweep

    Port Sweep is a network testing tool that lets an attacker learn a lot about the Internet
    and its functionality. It combines several applications to get more efficient results in an
    easier way. An attacker can gather information about a computer and about other computers
    that are connected to the Internet. This professionally designed application can be handy in
    finding information (location, network type) about a certain computer (IP, server,
    e-mail). An attacker can sweep a network to see whether there are any open ports waiting
    to be hacked, to see what data is sent, etc.


    2.4. jNetPcap

    jNetPcap is a Java wrapper around the libpcap and WinPcap native libraries found on various
    Unix and Windows platforms. jNetPcap exposes their functionality as a Java programming
    interface (API) which helps in capturing packets on the network.

    The main classes which implement libpcap and WinPcap functionality are:

    org.jnetpcap.Pcap class - core libpcap methods available on all platforms

    org.jnetpcap.winpcap.WinPcap class - extensions based on the WinPcap library,
    typically only available on Windows-based systems

    The core libpcap implementation of jNetPcap provides methods to do the following:

    Find a complete list of network interfaces the system has

    Open either a network interface or a PCAP capture file for reading packets

    Apply a packet filter

    Dump packets into a PCAP capture file

    Transmit raw link layer packets over a network interface

    Gather statistics on network interface and report counters

    2.5. jSMILE

    jSMILE is a platform-independent library of Java classes for reasoning in graphical
    probabilistic models, such as Bayesian networks and influence diagrams. It can be
    embedded in programs that use graphical probabilistic models as their reasoning engines.
    jSMILE requires only an installed JRE, so it can be used to create stand-alone
    applications, applets, and servlets. Model building and inference are under full control of
    the application program, as the jSMILE library serves merely as a set of tools and
    structures that facilitate them.


    3. SYSTEM DESIGN

    Our aim is to design and develop an Intelligent Network Intrusion Detection System
    (INIDS) that is accurate, low in false alarms, not easily cheated by small variations
    in patterns, adaptive, and capable of real-time detection.

    Attributes Used

    For our INIDS, we have extracted 18 features from tcpdump files which can identify

    packet characteristics. The features are:

    protocol type,

    ip length,

    don't fragment flag (df),

    more fragments flag (mf),

    fragmentation offset,

    syn flood,

    urgent pointer,

    tcp flags (urg, ack, psh, rst, syn, fin),

    tcp window size,

    udp checksum,

    icmp flood,

    icmp checksum, and

    type (packet is normal or attack)
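Several of the features listed above come straight from the fixed 20-byte IPv4 header. The sketch below shows how they could be read from raw bytes with Python's standard `struct` module; it is a stand-in for illustration only, since the project itself reads packets through jNetPcap, and the function and key names are assumptions.

```python
import struct

def ipv4_header_features(header):
    """header: at least the first 20 bytes of an IPv4 packet."""
    (_ver_ihl, _tos, total_length, _ident, flags_frag,
     _ttl, protocol, _checksum, _src, _dst) = struct.unpack(
        "!BBHHHBBH4s4s", header[:20])
    return {
        "protocol_type": protocol,                      # 1 ICMP, 6 TCP, 17 UDP
        "ip_length": total_length,                      # total length in bytes
        "df": (flags_frag >> 14) & 1,                   # don't fragment flag
        "mf": (flags_frag >> 13) & 1,                   # more fragments flag
        "fragment_offset": (flags_frag & 0x1FFF) * 8,   # offset in bytes
    }

# Hypothetical header of an 84-byte ICMP packet with the DF flag set.
header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 84, 1, 0x4000, 64, 1, 0,
                     bytes([192, 168, 0, 20]), bytes([192, 168, 0, 1]))
print(ipv4_header_features(header))
```

The TCP, UDP and ICMP features would be read the same way from the transport header that follows the IP header.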


    3.1. System Block Diagram

    Figure 3.1 System Block Diagram

    3.2. Data Flow Diagrams (DFDs)

    DFD is a structured, diagrammatic technique for showing the functions performed by a

    system and the data flowing into, out of, and within it.

    The 'Context Diagram' or level-0 DFD is an overall, simplified view of the target
    system, which contains only one process box and the primary inputs and outputs.

    Diagram labels: Network, Sniffer, Detector, File System, Knowledge Based Engine,
    Training DataSet, Captured, Normal, Attack, Trained

    Figure 3.2 Level-0 DFD

    The level-1 DFD shows all processes at the first level of numbering, data stores, external

    entities and the data flows between them. The purpose of this level is to show the major

    high-level processes of the system and their interrelation.

    Figure 3.3 Level-1 DFD


    The level-2 DFD is a decomposition of a process shown in a level-1 diagram. Here we
    have decomposed the inference engine process.

    Figure 3.4 Level-2 DFD


    3.3. Unified Modeling Language (UML)

    UML is now the most widely used graphical representation scheme for modeling object-

    oriented systems. An attractive feature of the UML is its flexibility. The UML is extensible

    and is independent of any particular OOAD process. We have created a use case diagram
    to model the interactions of network administrators and crackers with their use cases.

    Figure 3.5 Use Case Diagram (actors: Network Admin, Cracker; system: INIDS; use cases:
    Train Dataset, Test Dataset, Attack System, Add to Dataset, Run System)

    4. METHODOLOGY

    To develop our system, we have adopted the traditional waterfall model. The waterfall

    model is a sequential software development process, in which progress is seen as flowing

    steadily downwards like a waterfall through the phases of conception, analysis, design,

    construction, testing and maintenance. To follow the waterfall model, one proceeds from

    one phase to the next in a sequential manner. For example, when the requirements are fully

    completed, one proceeds to design. When the design is fully completed, an implementation

    of that design is made by coders. Towards the later stages of this implementation phase,

    separate software components produced are combined to introduce new functionality and

    reduced risk through the removal of errors. Thus the waterfall model maintains that one

    should move to a phase only when its preceding phase is completed and perfected.

    As this project is knowledge-based, a sizeable proportion of time was spent
    researching strategies for implementation. To achieve our goal, we consulted several
    books and websites and received valuable suggestions from friends and seniors. We studied
    different existing systems that are applicable in several fields, and identified their
    characteristics, applicability and limitations. In this regard, the existing intrusion
    detection system Snort inspired our work: it is signature-based, fails to detect unknown
    intrusions and relies on signatures extracted by human experts.

    A learning algorithm is good if it produces accurate predictions for the classifications of
    unseen examples. First we train our model with a training dataset and then we test it with
    a test dataset. We therefore adopt the following methodology:

    Collect a large set of examples.

    Divide it into two disjoint sets: the training set and the test set.

    Apply the learning algorithm to the training set.

    Measure the percentage of examples in the test set that are correctly classified.
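The four steps above can be sketched as follows. The 30% test fraction, the fixed random seed and the stand-in majority-class learner are assumptions chosen only to keep the example self-contained; they are not the project's settings or its classifier.

```python
import random
from collections import Counter

def split_examples(examples, test_fraction=0.3, seed=0):
    """Shuffle the examples and return two disjoint lists: (training, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def learn_majority(training_set):
    """Stand-in learner (assumption): always predicts the most common label."""
    majority = Counter(label for _, label in training_set).most_common(1)[0][0]
    return lambda attrs: majority

def accuracy(classify, test_set):
    """Percentage of test examples whose predicted label is correct."""
    correct = sum(1 for attrs, label in test_set if classify(attrs) == label)
    return 100.0 * correct / len(test_set)

# Hypothetical labeled examples: every fourth one is an attack.
examples = [((i,), "attack" if i % 4 == 0 else "normal") for i in range(20)]
training_set, test_set = split_examples(examples)
classifier = learn_majority(training_set)
print(len(training_set), len(test_set))
print(accuracy(classifier, test_set))
```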

    For the training and testing of our INIDS, we have used the 1998 DARPA dataset
    provided by MIT Lincoln Laboratory. It is a widely used dataset for training and testing
    intrusion detection systems. It provides around 4 gigabytes of compressed tcpdump data
    covering 7 weeks of network traffic. Each week has five days, and each day has its own
    tcpdump data. It also provides a tcpdump list file, which labels every flow as attack
    or not. Every entry consists of the flow identifier number, the date, the time when the
    first packet of the flow arrived, duration, service name, source port number, destination
    port number, source IP address, destination IP address, attack score, and the name of the
    attack. With this file, we are able to recognize which flow is an attack and to extract
    the corresponding data from the tcpdump data.

    The first and second weeks of training data consist of normal traffic, and the other weeks
    consist of a mixed dataset, i.e. normal traffic and attack traffic. To train
    our intrusion detection system, we have extracted normal traffic from the outside tcpdump
    of Wednesday and Thursday of the second week. Similarly, we have extracted attack
    traffic from the other weeks' traffic. We have used the editcap tool to split the huge
    tcpdump file and Wireshark to filter the desired packets.

    For our INIDS, we have extracted 18 features from tcpdump files which can identify
    packet characteristics. The features have to be preprocessed to be suitable for the naive
    Bayes algorithm, because the naive Bayes classifier used here cannot handle continuous
    values. So, while making the dataset, the continuous features are discretized. This dataset
    is then used to train the naive Bayes classifier. During inference, we extract all the
    features for each packet and feed them to the naive Bayes classifier, which calculates the
    probability that the packet is normal; based on a threshold, the packet is classified as
    normal or attack.
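The two preprocessing and decision steps described above can be illustrated as follows. The report does not state the bin boundaries or the threshold that were used, so the cut points for ip length and the threshold of 0.5 below are hypothetical values chosen for the example.

```python
from bisect import bisect_right

# Hypothetical cut points for the ip length feature, in bytes.
IP_LENGTH_BOUNDARIES = [64, 576, 1500]

def discretize(value, boundaries):
    """Map a continuous value to a bin index using sorted bin boundaries."""
    return bisect_right(boundaries, value)

def classify_packet(p_normal, threshold=0.5):
    """Label a packet from the classifier's probability that it is normal."""
    return "normal" if p_normal >= threshold else "attack"

print(discretize(40, IP_LENGTH_BOUNDARIES))     # 0
print(discretize(84, IP_LENGTH_BOUNDARIES))     # 1
print(discretize(60000, IP_LENGTH_BOUNDARIES))  # 3
print(classify_packet(0.9))                     # normal
print(classify_packet(0.2))                     # attack
```

Raising the threshold makes the system flag more packets as attacks, which trades more false alarms for fewer missed intrusions.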


    5. IMPLEMENTATION

    5.1. Object-Oriented Design

    In this technique, various objects that occur in the problem domain and the solution

    domain are first identified and different kinds of relationships that exist among these

    objects are identified. This object structure is further refined to obtain the detailed design.

    This approach has several advantages, such as reduced development effort and time, and
    better maintainability.

    During this implementation phase, each component of the design is implemented as a
    program module, and each of these program modules is unit tested, debugged and
    documented.

    Tools Used:

    NetBeans 6.5 IDE

    API Used:

    jSMILE API

    jNetPcap

    Language Used:

    Java

    System Installation Requirement:

    Operating System - Windows XP, Windows Vista, Windows 7

    CPU - 500 MHz (or above)

    Memory - 128 MB (or above)

    6. TESTING

    Testing is necessary to determine whether the modules and the system are working properly.

    6.1. Level of Testing

    While implementing our system, we go through various levels of testing which are as

    follows:

    a) Unit Testing: The purpose of unit testing is to determine the correct working of the
    individual modules.

    b) Integration Testing: During this phase the different modules are integrated in a

    planned manner. The different modules making up a system are never integrated in a single

    shot. Integration is normally carried out through a number of steps. During each integration

    step, the partially integrated system is tested.

    c) System Testing: Finally when all the modules have been successfully integrated and

    tested, system testing is carried out.


    6.2. Software Testing Strategies

    Two of the most prevalent strategies that we performed are black-box testing and white-

    box testing.

    a) Black-Box testing: Demonstrates that software functions are operational, that input
    is properly accepted and that output is correctly produced.

    b) White-Box testing: Examines the fundamental aspect of the system with complete

    information and access to the internal logical structure, code and algorithms.

    A lot of features are still to be added in our project. There are many limitations which are

    still to be corrected. Before releasing the final version of software, alpha testing, beta

    testing and acceptance testing can be done additionally.


    7. RESULT

    7.1. Screenshots

    Figure 7.1 Naive Bayes Classifier

    Figure 7.2 GUI Layout

    Figure 7.3 Detection of normal packets only

    Figure 7.4 Detection of anomalous packets only

    Figure 7.5 Detection of both normal and anomalous packets

    7.2. Comparison with Other Existing System

    Our INIDS can be compared with an existing IDS such as Snort, which is regarded
    as an ideal intrusion detection system. Snort is signature-based, whereas our system is
    machine learning-based. For known attacks, Snort is better, whereas for
    unknown attacks, our system is better. Snort is configured from the command line,
    whereas our system provides a GUI for configuration. As a result, our
    system is easier to use.

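    Figure 7.6 Accuracy of known attack (bar chart comparing SNORT and INIDS on a Low to High scale)

    Figure 7.7 Accuracy of unknown attack (bar chart comparing SNORT and INIDS on a Low to High scale)

    Figure 7.8 Ease of Use (bar chart comparing SNORT and INIDS on a Low to High scale)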

    8. CONCLUSIONS AND FURTHER WORK

    8.1. Conclusions

    We completed a project for detecting network intrusions based on the Naive
    Bayes algorithm. Using learning techniques, the completed system can detect novel attacks
    that are not detected by the existing system, Snort. Although Snort provides high
    accuracy, it is more time-consuming to maintain because it requires regular updates.
    Our system can detect intrusions more efficiently and in a less time-consuming way.

    Through this project we learned to work as a team, to divide tasks and to cooperate on
    them. The successful work not only made us proud but also made us good companions.
    In this way we completed our project successfully.

    8.2. Further Work

    Our system works only for IPv4 networks. In future, it can be extended to IPv6 networks.

    We have analyzed only the packet header, so our system cannot detect exploit
    intrusions. Payload analysis features could be added to our system in future.

    A naive Bayesian network is a restricted network that has only two layers and assumes
    complete independence between the information nodes. This poses a limitation to this
    research work. To alleviate this problem and reduce the false positives, active
    platform or event-based classification using a Bayesian network may be considered. We
    continue our work in this direction in order to build an efficient intrusion detection model.


    APPENDIX A: RFCs

    Table A.1: RFCs for each protocol

    Protocol                RFC
    ARP and RARP            826, 903, 925, 1027, 1293, 1329, 1433, 1868, 1931, 2390
    BGP                     1092, 1105, 1163, 1265, 1266, 1267, 1364, 1392, 1403, 1565,
                            1654, 1655, 1665, 1771, 1772, 1745, 1774, 2283
    BOOTP and DHCP          951, 1048, 1084, 1395, 1497, 1531, 1532, 1533, 1534, 1541,
                            1542, 2131, 2132
    CIDR                    1322, 1478, 1479, 1517, 1817
    DHCP                    See BOOTP and DHCP
    DNS                     799, 811, 819, 830, 881, 882, 883, 897, 920, 921, 1034, 1035,
                            1386, 1480, 1535, 1536, 1537, 1591, 1637, 1664, 1706, 1712,
                            1713, 1982, 2065, 2137, 2317, 2535, 2671
    FTP                     114, 133, 141, 163, 171, 172, 238, 242, 250, 256, 264, 269, 281,
                            291, 354, 385, 412, 414, 418, 430, 438, 448, 463, 468, 478, 486,
                            505, 506, 542, 553, 624, 630, 640, 691, 765, 913, 959, 1635, 1785,
                            2228, 2577
    HTML                    1866
    HTTP                    2068, 2109
    ICMP                    777, 792, 1016, 1018, 1256, 1788, 2521
    IGMP                    988, 1054, 1112, 1301, 1458, 1469, 1768, 2236, 2357, 2365, 2502,
                            2588
    IMAP                    See SMTP, MIME, POP, IMAP
    IP                      760, 781, 791, 815, 1025, 1063, 1071, 1141, 1190, 1191, 1624,
                            2113
    IPv6                    1365, 1550, 1678, 1680, 1682, 1683, 1686, 1688, 1726, 1752,
                            1826, 1883, 1884, 1886, 1887, 1955, 2080, 2373, 2452, 2463
    MIB                     See SNMP, MIB, SMI
    MIME                    See SMTP, MIME, POP, IMAP
    Multicast Routing       1584, 1585, 2117, 2362
    NAT                     1361, 2663, 2694
    OSPF                    1131, 1245, 1246, 1247, 1370, 1583, 1584, 1585, 1586, 1587,
                            2178, 2328, 2329, 2370
    POP                     See SMTP, MIME, POP, IMAP
    RARP                    See ARP and RARP
    RIP                     1131, 1245, 1246, 1247, 1370, 1583, 1584, 1585, 1586, 1587,
                            1722, 1723, 2082, 2453
    SCTP                    2960, 3257, 3284, 3285, 3286, 3309, 3436, 3554, 3708, 3758
    SMI                     See SNMP, MIB, SMI
    SMTP, MIME, POP, IMAP   196, 221, 224, 278, 524, 539, 753, 772, 780, 806, 821, 934, 974,
                            1047, 1081, 1082, 1225, 1460, 1496, 1426, 1427, 1652, 1653,
                            1711, 1725, 1734, 1740, 1741, 1767, 1869, 1870, 2045, 2046,
                            2047, 2048, 2177, 2180, 2192, 2193, 2221, 2342, 2359, 2449
    TCP                     675, 700, 721, 761, 793, 879, 896, 1078, 1106, 1110, 1144, 1145,
                            1146, 1263, 1323, 1337, 1379, 1644, 1693, 1901, 1905, 2001
    TELNET                  137, 340, 393, 426, 435, 452, 466, 495, 513, 529, 562, 595, 596,
                            599, 669, 679, 701, 702, 703, 728, 764, 782, 818, 854, 855, 1184,
                            1205, 2355
    TFTP                    1350, 1782, 1783, 1784
    UDP                     768
    VPN                     2547, 2637, 2685
    WWW                     1614, 1630, 1737, 1738


    APPENDIX B: UDP and TCP Ports

    Table B.1: List of UDP and TCP ports

    Port Number UDP/TCP Protocol

    7 TCP ECHO

    13 UDP/TCP DAYTIME

    19 UDP/TCP CHARACTER GENERATOR

    20 TCP FTP-DATA

    21 TCP FTP-CONTROL

    23 TCP TELNET

    25 TCP SMTP

    37 UDP/TCP TIME

    67 UDP BOOTP-SERVER

    68 UDP BOOTP-CLIENT

    69 UDP TFTP

    70 TCP GOPHER

    79 TCP FINGER

    80 TCP HTTP

    109 TCP POP-2

    110 TCP POP-3

    111 UDP/TCP RPC

    161 UDP SNMP

    162 UDP SNMP-TRAP

    179 TCP BGP

    520 UDP RIP


    APPENDIX C: ICMP Messages

    Table C.1: List of permitted ICMP messages

    Type                                    Code  Description
    0 - Echo Reply                          0     Echo reply (used to ping)
    1 and 2                                       Reserved
    3 - Destination Unreachable             0     Destination network unreachable
                                            1     Destination host unreachable
                                            2     Destination protocol unreachable
                                            3     Destination port unreachable
                                            4     Fragmentation required, and DF flag set
                                            5     Source route failed
                                            6     Destination network unknown
                                            7     Destination host unknown
                                            8     Source host isolated
                                            9     Network administratively prohibited
                                            10    Host administratively prohibited
                                            11    Network unreachable for TOS
                                            12    Host unreachable for TOS
                                            13    Communication administratively prohibited
    4 - Source Quench                       0     Source quench (congestion control)
    5 - Redirect Message                    0     Redirect Datagram for the Network
                                            1     Redirect Datagram for the Host
                                            2     Redirect Datagram for the TOS & network
                                            3     Redirect Datagram for the TOS & host
    6                                             Alternate Host Address
    7                                             Reserved
    8 - Echo Request                        0     Echo request
    9 - Router Advertisement                0     Router Advertisement
    10 - Router Solicitation                0     Router discovery/selection/solicitation
    11 - Time Exceeded                      0     TTL expired in transit
                                            1     Fragment reassembly time exceeded
    12 - Parameter Problem: Bad IP header   0     Pointer indicates the error
                                            1     Missing a required option
                                            2     Bad length
    13 - Timestamp                          0     Timestamp
    14 - Timestamp Reply                    0     Timestamp reply
    15 - Information Request                0     Information Request
    16 - Information Reply                  0     Information Reply
    17 - Address Mask Request               0     Address Mask Request
    18 - Address Mask Reply                 0     Address Mask Reply
    19                                            Reserved for security
    20 through 29                                 Reserved for robustness experiment
    30 - Traceroute                         0     Information Request
    31                                            Datagram Conversion Error
    32                                            Mobile Host Redirect
    33                                            Where-Are-You (originally meant for IPv6)
    34                                            Here-I-Am (originally meant for IPv6)
    35                                            Mobile Registration Request
    36                                            Mobile Registration Reply
    37                                            Domain Name Request
    38                                            Domain Name Reply
    39                                            SKIP Algorithm Discovery Protocol, Simple
                                                  Key-Management for Internet Protocol
    40                                            Photuris, Security failures
    41                                            ICMP for experimental mobility protocols such as
                                                  Seamoby [RFC4065]
    42 through 255                                Reserved


    APPENDIX D: CD Contents

    a) Source Codes

    b) Readme