

Implementation of a Prototype to Secure Web Applications from SQL Injection and Cross Site Scripting Attacks Using an Intelligent Pattern Matching Approach

GRADUATE PROJECT REPORT

Submitted to the Faculty of

the Department of Computing Sciences

Texas A&M University-Corpus Christi

Corpus Christi, Texas

In Partial Fulfillment of the Requirements for the Degree of

Master of Science in Computer Science

By

Ismail Aamir Mohammed

Fall 2016

Committee Members

Dr. Mario A. Garcia ____________________________

Committee Chairperson

Dr. David Thomas ____________________________

Committee Member


Abstract

With the growth of the Internet, use of the web has increased substantially, and web applications have become a significant part of everyday life. As cyber-attacks increase, web application security has become one of the most important ongoing issues. The risk of web attacks is rising because web developers are often unaware of such attacks and because of loopholes in the prevailing technologies. Web applications have become targets of many attacks, such as SQL injection, session management attacks, cross site scripting and broken authentication. A lot of research is under way to safeguard web applications from such malicious attacks. A few tools have also been developed to protect web applications, but each of them has specific flaws. This paper discusses an approach that addresses the flaws in the previous tools. The approach analyzes the validity of requests to the web application and then generates cases for different attacks. These cases then help differentiate malicious from non-malicious traffic in the web application.


TABLE OF CONTENTS

Abstract
Table of Contents
List of Figures
List of Tables
1. Background and Rationale
1.1 Introduction
1.2 Literature Review
1.3 Problems in Existing System
2. Narrative
2.1 Problem Statement
2.2 Motivation
2.3 Objective
2.4 Scope
3. Proposed System Design
3.1 Target Web Application
3.2 Intelligent Pattern Matching Module
3.3 Unified Modelling Language
4. Implementation of the Modules
5. Testing and Evaluation
5.1 User Interface
5.2 Test Case 1: Entering Malicious Script
5.3 Test Case 2: Malicious Script to Retrieve Confidential Information
5.4 Test Case 3: SQLI Attack to Retrieve Information
5.5 Test Case 4: SQLI Attack to Delete Table
5.6 Test Case 5: XSS Attack to Steal the Cookies
5.7 Test Case 6: URL Redirection using XSS Attack
5.8 Summary of Test Cases
5.9 Unit Test Case List
6. Conclusion and Future Enhancements
6.1 Conclusion
6.2 Future Work
Bibliography


LIST OF FIGURES

Fig 1: System Architecture
Fig 2: Architecture Workflow
Fig 3: Intelligent Pattern Matching Module
Fig 4: Invalid Request Showing SQL Injection Attack
Fig 5: Invalid Request Showing XSS Attack
Fig 6: Examples of Regular Expressions
Fig 7: Rules Stored in the Database
Fig 8: Use Case Diagram for the Pattern Matching Module
Fig 9: Use Case Diagram for Filtering on Target Application
Fig 10: Class Diagram for the Pattern Matching Module
Fig 11: Class Diagram for Filtering on Target Application
Fig 12: Code Snippet for My Filter Class
Fig 13: Code Snippet for Typesafe Request
Fig 14: Log Table with its Different Columns
Fig 15: Table for List Analyzer
Fig 16: Code Snippet for Process Log Info
Fig 17: Regular Expressions for SQLI Attacks
Fig 18: Scenario for Invalid Username or Password
Fig 19: Scenario after Logging in to Application
Fig 20: Scenario where User Enters Malicious Script
Fig 21: Scenario where Script Starts Executing
Fig 22: Scenario where Log ID is Set to Vulnerable
Fig 23: Scenario where IP Address is Set to Block in Database
Fig 24: Error Page
Fig 25: Scenario of XSS Attack to Retrieve Information
Fig 26: Scenario where Attacker Retrieves the Information
Fig 27: Scenario where the Log ID is Set to Vulnerable
Fig 28: Scenario where the IP Address is Set to Blocked
Fig 29: Scenario of SQLI Attack
Fig 30: Scenario where the Log ID is Set to Vulnerable
Fig 31: Scenario where the IP Address is Set to Blocked
Fig 32: Another Scenario of SQLI Attack
Fig 33: Scenario where the Log ID is Set to Vulnerable
Fig 34: Scenario where the IP Address is Set to Blocked
Fig 35: XSS Attack to Steal the Cookies
Fig 36: XSS Attack Showing Current Session Value
Fig 37: Scenario where the Log ID is Set to Vulnerable
Fig 38: Scenario IP Address of Attacker is Blocked
Fig 39: URL Redirection using XSS Attack
Fig 40: Scenario after URL Redirection Attack
Fig 41: Scenario where the Log ID is Set to Vulnerable
Fig 42: Scenario IP Address of Attacker is Blocked


LIST OF TABLES

Table 1: Access Log Header Information
Table 2: Summary of the Test Cases
Table 3: Scenario to Test Pattern Matching Module


1. Background and Rationale

1.1 Introduction:

The increasing use of web-based applications has made it mandatory to secure them from a plethora of attacks. The web application layer is one of the main layers targeted by these attacks. The most important reasons for these kinds of attacks are flaws in the existing technologies and a lack of security awareness among web application developers. Recent studies indicate that more than 70% of attacks occur in the payment card industry, and that more than 50% of authentic data is exposed to attackers by organizations that use shared credentials [5]. Based on these figures, web security should be the most important consideration when developing a web application.

The two main techniques in wide use are the default deny model and the default allow model. The default allow model maintains a list of exploits. Whenever traffic passes through the application gateway, it is compared with this list, and any input that matches an entry in the list is considered malicious and is blocked [1][6].

The default deny model, in contrast, maintains a white list that contains all the legitimate values considered non-malicious for the web application. These legitimate values are obtained from the user input of several web applications. Whenever input reaches the web application, it is compared with the values in the list. If the input matches a value in the white list, it is considered legitimate and is allowed to enter. If it does not match any value in the white list, it is considered malicious [1][6]. Many issues arose from the manual maintenance of the lists of cases used in the default allow and default deny models. The proposed technique introduces an intelligent pattern matching approach that addresses the issues faced in maintaining such a list.
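The difference between the two models can be illustrated with a minimal Java sketch. The list contents and the names ListModels, EXPLOIT_LIST and WHITE_LIST are assumptions made purely for illustration; they are not taken from the tools cited above.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only: list contents and names are assumptions.
class ListModels {
    // Default allow: a list of known exploit fragments; matching input is blocked.
    static final List<String> EXPLOIT_LIST = List.of("<script", "' or '", "drop table");

    // Default deny: the only values considered legitimate for one input field.
    static final Set<String> WHITE_LIST = Set.of("breakfast", "lunch", "dinner");

    static boolean allowedByDefaultAllow(String input) {
        String lower = input.toLowerCase(Locale.ROOT);
        for (String exploit : EXPLOIT_LIST) {
            if (lower.contains(exploit)) {
                return false; // matches a known exploit, so the input is blocked
            }
        }
        return true; // nothing matched, so the input is allowed
    }

    static boolean allowedByDefaultDeny(String input) {
        // Only values present in the white list are allowed.
        return WHITE_LIST.contains(input.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String unseen = "snack"; // harmless, but not on either list
        System.out.println("default allow: " + allowedByDefaultAllow(unseen));
        System.out.println("default deny:  " + allowedByDefaultDeny(unseen));
    }
}
```

A harmless but previously unseen value passes the default allow check and fails the default deny check, which shows why both lists need continual maintenance when they are kept by hand.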

1.2 Literature Review

Various techniques have been proposed to address attacks on web applications, but each existing technique has its own flaws. C.M. Frenz et al. proposed a model to detect cross site scripting attacks in web applications [9]. In their paper, the authors use regular expressions to detect malicious attacks in web applications. This approach was not found to be robust, as it only works for simple web applications [9].

Lwin Khin Shar et al. proposed a technique that uses two phases to detect XSS attacks in web applications [10]. In the first phase, a taint-based analysis approach produces an HTML document showing the flow of user data. In the second phase, data dependency and pattern matching are analyzed to prevent code injection through cross site scripting. Again, this technique focuses on preventing only one type of attack.

A.M. Chandrasekhar et al. combined K-means, fuzzy neural networks and support vector machine classifier techniques to implement a web attack detection mechanism [11]. In this mechanism, the K-means algorithm is used to cluster the input dataset into K clusters, which are then trained using fuzzy logic. Each data item is passed through the fuzzy classifier, which generates vectors. A support vector machine then performs the classification that detects web attacks.

In [14], Zhang et al. proposed a technique that uses a taint analysis approach to detect SQL injection and cross site scripting attacks in web applications. This approach uses a flow graph to detect sink and source points in the data flow. It then taints the data in the source code if the data comes from an insecure source and targets a sensitive sink. The main drawback of this approach is its high false positive rate [14].

Sharma et al. [15] presented an integrated approach that uses two modes, called production mode and safe mode, to detect cross site scripting and SQL injection attacks in web applications. The main drawback of this technique is that it cannot detect vulnerabilities in more dynamic and complex applications [15].

1.3 Problems in Existing System

The most important issue is updating and maintaining the list of attacks. Because the list of attacks is specific to each web application, it must be modified for different types of applications. It also requires continuous development because the number of attacks increases every day [7].

Maintaining the list of attacks is also a time-consuming task. Large applications, such as business applications, have many user inputs, so it takes even more time to differentiate between legitimate and illegitimate user inputs and then save these inputs into the list [1][7].

Another issue is incorporating the list of attacks into existing web applications. Because the source code of existing web applications is copyrighted, it is difficult to add the list of attacks to them [4].

Another significant issue is that a web application has multiple input sources. These include input from third-party applications, input from the front end of the application, input through databases and input through another network. Manually maintaining a list of attacks across multiple sources of input is an enormous task.

Hence, to overcome some of the problems discussed above, an intelligent pattern matching approach will be developed that automates the blocking of attacks.


2. Narrative

2.1 Problem statement:

Mitigating advanced malicious activity against networks has become a large and continuous challenge. Existing techniques have failed to secure web applications against a cornucopia of unwanted, potentially malicious attacks. Another technique, the default deny approach, was then introduced. The default deny model encountered trouble in incorporating new attacks into the list of existing attacks. Hence, there was a need for a new technique that can automatically incorporate new attacks into the existing list of attacks.

2.2 Motivation:

The two mechanisms that have been used to protect web applications from malicious attacks are the default allow model and the default deny model. The default allow model allows all traffic and prevents only the traffic noticed by the web application firewall; its disadvantage is that it failed to keep web applications safe from various attacks such as SQLI, input validation attacks and XSS. The issues encountered with the default deny model were updating and maintenance, time-intensive tasks, and multiple sources of input. Hence, it can be discerned that, with small changes in the system, networks can be secured from illegitimate websites. Therefore, an intelligent pattern matching approach is introduced that secures the web application against SQLI and XSS attacks efficiently.


2.3 Objective:

The main objective of this research is to implement a prototype that secures web applications from malicious attacks using an intelligent pattern matching technique. The technique detects different attacks and stores them in a database so that future requests can be validated against them.

2.4 Scope:

This research is an adaptation of a novel technique [16] that automates the process of detecting attacks and appending them to the existing list of attacks. The mechanism secures web applications against SQLI attacks and cross site scripting attacks.


3. Proposed System Design

An architecture has been designed to resolve the problems discussed for the existing techniques. The primary features of the proposed system are:

An initial examination of the log is performed, and a profile of the web application is then generated in a semi-structured XML format.

The list of attacks is created in an offline process, which saves run-time processing.

The system architecture is shown in Figure 1 and the architecture workflow is shown in Figure 2.

Figure 1: System Architecture


Figure 2: Architecture Workflow

The proposed architecture has two main components: the target web application and the intelligent pattern matching module. Each component is explained below.

3.1 Target Web Application

The target web application is the web application that will be secured from attacks. For testing, a hotel management web application is used as the target web application. All HTTP requests and responses of the target web application are sent to the access log of the intelligent pattern matching module. After comparing the HTTP requests and responses with the different cases, the intelligent pattern matching module validates user details on the target web application.

3.2 Intelligent Pattern Matching Module:

The pattern matching module automatically generates cases for the attacks in a semi-structured XML format and saves them in the database. The generated cases help secure the web application from attackers in the future. The detailed architecture of the pattern matching module is depicted in Figure 3.

Figure 3: Intelligent Pattern Matching Module

Each component illustrated in figure 3 is explained as follows:

A. Access log: The access log is the central part of the module structure. It is a log table that stores all the information about HTTP requests and responses. Based on this data, a clear distinction is made between legitimate and illegitimate requests. The log also lets users view each HTTP request together with the response generated by the web application for that request. Moreover, log scanning can be used to collect all the data from the access log. Table 1 lists the access log header fields and their descriptions. Date time stamp represents the time at which the web application was accessed. Hostname with port number represents the hostname of the user accessing the target web application. Client address with port number represents the IP address of the client. Accept language represents the language requested by the client browser. Full resource URL represents all the input parameters that are requested. Method name is the type of HTTP method requested by the client.

Table 1: Access log header information

HTTP Header Field                   Description
----------------------------------  ----------------------------------------------------------
Date time stamp                     The time at which the web application was accessed.
Hostname with port number           Hostname of the user accessing the target web application.
Client address with port number     The IP address of the client.
Accept Language                     Language requested by the client browser.
Full resource URL with parameter    Parameters requested by the client along with the URL.
Method Name                         Type of HTTP method (GET or POST).

B. Pattern Matching: The responses to the requests are re-examined, and legitimate requests are separated from invalid requests. This module also carries out the following operations:

Based on the response, it analyzes illegitimate requests and then blocks them. For example, an IP address that made attacks in the past is identified and all requests from that IP address are terminated.

It uses intelligent pattern matching to examine the similarities and differences between the various parts of requests, applying the regular expressions stored in the regular expression repository to determine the matching patterns.

It passes the information to the next module, which is the case generator module.

Figure 4 shows an example of an invalid request in which an attacker carried out a SQL injection attack. The user entered the username “anything’ OR ’x’=’x” and the password “anything’ OR ’x’=’x” to carry out the attack.

Figure 4: Invalid Request showing SQL injection attack [13]
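To see why the input in Figure 4 is dangerous, consider a minimal Java sketch of a login query built by string concatenation. The table name users and the column names are hypothetical; the report does not show how the target application builds its queries, so this only illustrates the general vulnerable pattern.

```java
// Illustrative sketch only: table and column names are hypothetical.
class InjectionDemo {
    // Builds a login query by naive string concatenation (the vulnerable pattern).
    static String buildLoginQuery(String username, String password) {
        return "SELECT * FROM users WHERE username = '" + username
                + "' AND password = '" + password + "'";
    }

    public static void main(String[] args) {
        String payload = "anything' OR 'x'='x";
        // The quotes inside the payload close the string literal early and
        // append a condition that is always true.
        System.out.println(buildLoginQuery(payload, payload));
    }
}
```

The resulting text is SELECT * FROM users WHERE username = 'anything' OR 'x'='x' AND password = 'anything' OR 'x'='x'. Because AND binds more tightly than OR in SQL, the trailing OR 'x'='x' makes the WHERE clause true for every row, so the check no longer depends on the credentials.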


Figure 5 shows another example of an invalid request, in which an attacker carries out an XSS attack by inserting a script into the POST method of the HTTP request.

Figure 5: Invalid Request showing XSS attack [13]
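A request parameter of this kind can be flagged with a simple pattern check. The following Java sketch is an assumption for illustration, not a rule taken from the report: the class XssCheck and its single pattern are hypothetical and deliberately coarse.

```java
import java.util.regex.Pattern;

// Illustrative sketch only: the pattern is an assumption, not a rule from the report.
class XssCheck {
    // Matches an opening <script tag, an inline "on...=" event handler,
    // or a "javascript:" URI, ignoring case.
    static final Pattern XSS = Pattern.compile(
            "<\\s*script\\b|\\bon\\w+\\s*=|javascript\\s*:",
            Pattern.CASE_INSENSITIVE);

    static boolean looksLikeXss(String value) {
        return XSS.matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeXss("<script>alert();</script>"));
        System.out.println(looksLikeXss("Two burgers, no onions"));
    }
}
```

A check this simple can produce false positives on ordinary text that happens to contain a word starting with "on" followed by an equals sign, so it should be read as a starting point rather than a complete filter.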

C. Regular Expressions Repository: The regular expressions repository contains various kinds of regular expressions, which help in setting up the configuration file. These regular expressions are used for the detection of SQLI attacks. The input received by the web application is compared with these regular expressions to determine whether any of them matches. If the input matches a regular expression, the corresponding IP address is set to blocked by the case generator module. An updated repository can be used to distribute regular updates to the target system. Some examples of regular expressions [18] are depicted in Figure 6.

Figure 6: Examples of Regular Expressions [18]
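The matching step can be sketched in Java as a small named collection of compiled patterns. The rule names and expressions below are illustrative assumptions and are not the expressions from [18]; the class RegexRepository is likewise hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Illustrative sketch only: rule names and expressions are assumptions.
class RegexRepository {
    // LinkedHashMap keeps the rules in insertion order, so matching is deterministic.
    private final Map<String, Pattern> rules = new LinkedHashMap<>();

    RegexRepository() {
        add("tautology", "'\\s*or\\s*'?\\w+'?\\s*=\\s*'?\\w+");
        add("union-select", "\\bunion\\b\\s+(all\\s+)?select\\b");
        add("stacked-query", ";\\s*(drop|delete|insert|update)\\b");
        add("comment", "--|/\\*");
    }

    private void add(String name, String regex) {
        rules.put(name, Pattern.compile(regex, Pattern.CASE_INSENSITIVE));
    }

    // Returns the name of the first rule found anywhere in the input, if any.
    Optional<String> firstMatch(String input) {
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            if (rule.getValue().matcher(input).find()) {
                return Optional.of(rule.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        RegexRepository repository = new RegexRepository();
        System.out.println(repository.firstMatch("anything' OR 'x'='x"));
        System.out.println(repository.firstMatch("chicken soup"));
    }
}
```

Keeping the expressions in one place, separate from the matching loop, mirrors the idea that the repository can be updated without changing the rest of the module.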

D. Case Generator Module: Here, all the information extracted by the intelligent pattern matching module is used to create cases in a semi-structured XML format. Its essential function is to generate cases for new attacks. A few of its advantages are listed below:

It is a good safeguard against cross site scripting and SQL injection attacks.

All the log information about a new URL is directed to the pattern matching module, where it is compared with the different regular expressions. Based on the results given by the pattern matching module, semi-structured XML cases are generated for the new attacks, as illustrated in Figure 7, and stored in the database.

Figure 7: Rules Stored in the Database
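As a rough illustration of what generating a semi-structured XML case might involve, the Java sketch below serializes one detected attack. The element and attribute names (case, parameter, value, clientIp, status) are hypothetical and are not the schema shown in Figure 7.

```java
// Illustrative sketch only: element and attribute names are hypothetical.
class CaseGenerator {
    // Escapes the five characters that are special in XML text and attributes.
    // The ampersand is replaced first so that later entities are not re-escaped.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Serializes one detected attack as a single XML case.
    static String toXmlCase(String attackType, String parameter, String value, String clientIp) {
        return "<case type=\"" + escapeXml(attackType) + "\">"
                + "<parameter>" + escapeXml(parameter) + "</parameter>"
                + "<value>" + escapeXml(value) + "</value>"
                + "<clientIp status=\"blocked\">" + escapeXml(clientIp) + "</clientIp>"
                + "</case>";
    }

    public static void main(String[] args) {
        System.out.println(toXmlCase("XSS", "comment", "<script>alert();</script>", "10.0.0.5"));
    }
}
```

Escaping the captured value matters here because the stored payload is itself markup; without it, the saved case would no longer be well-formed XML.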


Salient features of the Intelligent Pattern Matching Module:

It scans the log for the cases that use pattern matching. Log scanning separates legitimate and illegitimate requests based on the response generated by the web server.

It learns the cardinality and size of the parameters. The cardinality function is used to identify free and fixed parameters.

It checks whether the input matches a regular expression.

Finally, it generates semi-structured XML cases and stores them in the database.

3.3 Unified Modelling Language (UML)

The Unified Modelling Language is used to represent the design of the system software. The main aim of UML diagrams is to help users understand the interaction between different components of the application.

A. Use Case diagram:

A use case diagram consists of actors and use cases. An actor represents the role played by a user or an external system, whereas the use cases describe the functionality of the application.

Figure 8 shows the use case diagram for the pattern matching module. It consists of one actor, the pattern matching module, and six use cases: analyze log messages, intercept input request and response, start learning cycle, generate cases, generate rules, and send rules to target application.


Figure 8: Use Case Diagram for the Pattern Matching Module

a. Analyze Log Messages: This use case represents the action performed by the

pattern matching module to analyze the log messages from target application.

b. Intercept Input Request and Response: This use case represents the action

performed by the pattern matching module to separate the legitimate and

illegitimate requests.

c. Start Pattern Matching: This use case represents the action performed by the

pattern matching module to start the pattern matching process based on the request

type.


d. Generate Cases: This use case represents the different cases created by the pattern

matching module.

e. Send Rules to Target Application: It represents the action performed by the

pattern matching module to send the generated cases to the target web application.

Figure 9 shows the use case diagram for filtering on the target application. It consists of two actors and six use cases. Of the two actors, one is the legitimate user and the other is the target application. The six use cases are login, web authentication, check filter category, authenticate based on rules, log the rules, and send the rules to the pattern matching module.

Figure 9: Use Case Diagram for Filtering on Target Application


a. Login: Existing users can log in to the food ordering system by entering their username and password.

b. Web Authentication: Web authentication is performed on the input values given by the user.

c. Check Filter Category: It represents the action performed by the target application to check the filter category of the user input.

d. Authenticate Based on Rules: It represents the authentication that is performed based on the rules in the database.

e. Log it: It represents the logging action performed based on the input values given by the user.

f. Send to Pattern Matching Module: It represents the action performed to send the rules to the pattern matching module.

B. Class Diagram: The relationship between different classes and objects can be shown using a class diagram. Each class in the diagram consists of three sections: the first section shows the name of the class, the second section shows the attributes, and the third section shows the functions required to implement the logic. Arrows illustrate the relationships between different classes.

Figure 10 shows the class diagram for the intelligent pattern matching module, which is used to secure the target application. The classes used in the intelligent pattern matching module are listCategory, ProcessLoginfo, learnvalidateengine and shareDB. The listCategory class consists of functions like get error message, get log id and set status message. It is associated with the Learn Validate Engine class, which consists of functions like request method, request parameters and response body. The Learn Validate Engine class is associated with shareDB, which consists of functions like insert query, update query and select query. The ProcessLoginfo class consists of functions like ProcessLogin and validip, and it is associated with the ShareDB class.

Figure 10: Class Diagram for the Pattern Matching Module


Figure 11 shows the class diagram for the filter used on the target application to authenticate the user input. It consists of the classes MyFilter, BufferedRequestWrapper, ProcessLogInfo, BufferedServletInputStream, and ShareDB. The MyFilter class is responsible for functions like doFilter, destroy and getClientIpAddr. It is associated with BufferedRequestWrapper and ProcessLogInfo.

Figure 11: Class Diagram for filtering on Target Application


4. Implementation of the Module

The web application is developed using the Eclipse IDE, with MySQL for the databases. The main modules that have been implemented are as follows:

1) My Filter.

2) Log Table.

3) List Analyzer.

4) Process Log info.

5) Learn Validate Engine.

1) My Filter Class: The My Filter class is responsible for taking in the HTTP requests and comparing them with the existing rules. If the user input contains any malicious code, the IP address of that user is blocked and logged as vulnerable in the Log Table. Figure 12 shows the code snippet for the My Filter class. It has a method called getclientipaddress, which takes the HTTP request from the user and validates it against different combinations to extract the original IP address, independent of any proxy settings.

Figure 12: Code snippet for My Filter class
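The idea of recovering the original client address behind a proxy can be sketched as follows. This is not the code from Figure 12: the class ClientIpResolver, the header list and its order are assumptions, and a plain map stands in for the servlet request so that the sketch runs without a servlet container. In a real filter the values would come from HttpServletRequest.getHeader and getRemoteAddr.

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch only: the header list and its order are assumptions.
class ClientIpResolver {
    // Headers that proxies commonly use to pass on the original client address.
    static final List<String> CANDIDATE_HEADERS =
            List.of("X-Forwarded-For", "Proxy-Client-IP", "WL-Proxy-Client-IP");

    // headers: header name to value, as received; remoteAddr: the socket address.
    static String resolve(Map<String, String> headers, String remoteAddr) {
        for (String name : CANDIDATE_HEADERS) {
            String value = headers.get(name);
            if (value != null && !value.trim().isEmpty()
                    && !"unknown".equalsIgnoreCase(value.trim())) {
                // X-Forwarded-For may hold "client, proxy1, proxy2";
                // the first entry is the original client.
                return value.split(",")[0].trim();
            }
        }
        return remoteAddr; // no usable proxy header: fall back to the socket address
    }

    public static void main(String[] args) {
        Map<String, String> headers = Map.of("X-Forwarded-For", "203.0.113.7, 10.0.0.1");
        System.out.println(resolve(headers, "10.0.0.1"));
    }
}
```

Since these headers are supplied by the client or by intermediaries, a value obtained this way can be forged; blocking decisions based on it are only as trustworthy as the proxies in front of the application.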


The My Filter class also has a method called getTypesafeRequestMap, which is responsible for traversing the request parameters and converting them into a map of keys and values. Figure 13 shows the code snippet for getTypesafeRequestMap.

Figure 13: Code snippet for Typesafe Request
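One plausible shape for such a conversion is sketched below in Java. It is not the code from Figure 13: the class RequestMaps and the method toTypesafeMap are hypothetical, and the input type Map<String, String[]> is assumed because that is what HttpServletRequest.getParameterMap returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: class and method names are hypothetical.
class RequestMaps {
    // Copies raw request parameters into a map from parameter name to its values.
    static Map<String, List<String>> toTypesafeMap(Map<String, String[]> raw) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, String[]> entry : raw.entrySet()) {
            // A missing value array is treated as an empty list of values.
            String[] values = entry.getValue() == null ? new String[0] : entry.getValue();
            result.put(entry.getKey(), new ArrayList<>(Arrays.asList(values)));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String[]> raw = new LinkedHashMap<>();
        raw.put("item", new String[] {"soup", "tea"});
        raw.put("table", new String[] {"4"});
        System.out.println(toTypesafeMap(raw));
    }
}
```

Copying the values into lists gives the later pattern matching step a uniform structure to iterate over, regardless of how many values a parameter carries.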

2) Log Table: The Log Table is responsible for storing all the log records of HTTP requests. It stores the id of the request, the request body, the requested parameters, the IP address of the client, and the requested method. Figure 14 shows the Log Table with its different columns.

Figure 14: Log Table with its Different Columns


3) List Analyzer: The List Analyzer is responsible for analyzing requests and separating legitimate from illegitimate ones. Its table consists of an id, the number of errors, a description of the error, the id of the request from the Log Table, and a status indicating whether the request is vulnerable or not. Figure 15 shows the table for the List Analyzer.

Figure 15: Table for List Analyzer

4) Process Log info: Process Log info is responsible for initializing all log information by capturing log messages. Each new request parameter that is not already stored is captured and stored in the database. The validate client IP method validates each client IP address by comparing it with the rules in the list generated by the learning module. The code snippet for Process Log info is shown in Figure 16.


Figure 16: Code snippet for process log info

5) Learn Validate Engine: The Learn Validate Engine is responsible for separating the input request parameters. It also contains the regular expressions for the detection of SQLI attacks. Figure 17 shows the regular expressions that have been used for the detection of SQLI attacks [18].

Figure 17: Regular Expressions for SQLI attack [18]


5. Testing and Evaluation

Testing is done to find any errors in the final web application and to make sure that the application functions correctly. During testing, the behavior of the application is analyzed by giving it different inputs.

A hotel management based food ordering application was implemented as the target web application for evaluating the automated rule list generation technique. For testing, the target application is executed from a virtual operating system. Different regular expressions that describe common fields such as passwords and credit cards, and widely used data types such as hex and digits, are utilized for testing purposes.

5.1 User Interface:

To place an order, the user must log in to the system with the username and password
provided to them. If the user omits either the username or the password, an error message
is displayed. Figure 18 depicts the scenario where the user forgot to enter the password:
the error message “Invalid username or password” is displayed.

Figure 18: Scenario for Invalid username or password

After logging in successfully, the user is taken to the main page, where they can perform
functions such as ordering food and viewing the bill. Figure 19 shows the application
after login.

Figure 19: Scenario after logging in to application

5.2 Test Case 1: Entering Malicious Script

If the user enters malicious code in any text box on the home page, the input is logged
in the database table and the user is blocked from accessing the website in the future.
Figure 20 shows a scenario where the user enters the malicious script
‘<script>alert();</script>’ into the text box.

Figure 20: Scenario where User Enters Malicious Script

With a malicious script, the user tries to make the web application unresponsive. Figure
21 shows a scenario where the script entered by the user starts displaying unwanted
dialog boxes.

Figure 21: Scenario where Script Starts Executing

The pattern matching module then runs its batch process internally, comparing the input
parameters with the XML file that contains the information about attacks. If the input
parameters match the list of attacks, the module blocks the IP address of the user and
stores all the inputs given by that user in the database for future analysis. Figures 22
and 23 depict the result of the batch process. Figure 22 shows the log id related to the
IP address of that user set to vulnerable in the list of attacks, and Figure 23 shows the
status of that IP address changed to blocked. The attacker's IP address is then blocked
from accessing the web application in the future.

Figure 22: Scenario where Log ID is Set to Vulnerable

Figure 23: Scenario where IP address is set to Block in database
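The batch process described above can be sketched as a loop over the logged requests that marks matching log ids as vulnerable and collects the corresponding IP addresses for blocking. This is a simplified illustration under stated assumptions: the class PatternMatchingBatch, the LogRecord type, and the in-memory lists are hypothetical, and the attack patterns are passed in directly rather than read from the XML file or the database that the prototype uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical in-memory sketch of the batch process; not the prototype's code.
final class PatternMatchingBatch {

    // One logged request: log id, client IP address, and the raw parameter text.
    static final class LogRecord {
        final int id;
        final String clientIp;
        final String parameters;

        LogRecord(int id, String clientIp, String parameters) {
            this.id = id;
            this.clientIp = clientIp;
            this.parameters = parameters;
        }
    }

    final List<Integer> vulnerableLogIds = new ArrayList<>();
    final Set<String> blockedIps = new LinkedHashSet<>();

    // Compares every logged request with every attack pattern. On the first
    // match, the log id is marked vulnerable and the client IP is blocked.
    void run(List<LogRecord> log, List<Pattern> attackPatterns) {
        for (LogRecord record : log) {
            for (Pattern pattern : attackPatterns) {
                if (pattern.matcher(record.parameters).find()) {
                    vulnerableLogIds.add(record.id);
                    blockedIps.add(record.clientIp);
                    break;
                }
            }
        }
    }
}
```

A request filter could then consult blockedIps before serving a page, which corresponds to the blocked status that leads to the error page for a blocked user.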

An error page, shown in Figure 24, is displayed if the blocked user tries to access the
web application again.

Figure 24: Error Page

5.3 Test Case 2: Malicious Script to Retrieve Confidential Information

Figure 25 shows another XSS attack scenario, where the user tries to retrieve information
about the web application by entering a malicious script in the text box for the category
name. Here the user entered the script <script>alert(document.title);</script>, which
returns the title of the document. An attacker may enter different scripts to retrieve
confidential information.

Figure 25: Scenario of XSS Attack to Retrieve Information

Figure 26 shows how the previously entered script starts executing and shows the

confidential information about the web application. In this case, the script tag is trying to

retrieve the title of the web page.

Figure 26: Scenario where Attacker Retrieves the Information

After the pattern matching module executes its batch process internally, the IP address
of the user who entered the malicious script is set to blocked in the database, and the
log id of that user is set to vulnerable in the list of attacks. Figure 27 shows the log
id set to vulnerable, and Figure 28 shows the status of the IP address changed to blocked.
The attacker's IP address is then blocked from accessing the web application in the future.

Figure 27: Scenario where the Log ID is Set to Vulnerable

Figure 28: Scenario where the IP Address is Set to Blocked

5.4 Test Case 3: SQLI attack to Retrieve Information

Figure 29 shows an SQLI attack scenario where an attacker tries to retrieve the details
of other users stored in the database. Here, the user entered the SQL fragment “is null or
1=1” to retrieve those details. Because the condition 1=1 is always true, such an input
can make a query return every row, giving the user access to confidential information.

Figure 29: Scenario of SQLI attack

After the batch process runs in the background, the IP address of the user who entered
the query is set to blocked in the database, and the log id related to that IP address is
set to vulnerable in the list of attacks. Figure 30 shows the log id set to vulnerable,
and Figure 31 shows the status of the IP address changed to blocked. The attacker's IP
address is then blocked from accessing the web application in the future.

Figure 30: Scenario where the Log ID is Set to Vulnerable

Figure 31: Scenario where the IP Address is Set to Blocked

5.5 Test Case 4: SQLI attack to Delete Table

Figure 32 shows an SQLI attack scenario where the user enters a malicious query to
perform a piggyback attack. Here, the user entered the SQL query “drop table test” to
perform a malicious operation. Such a query lets the user erase confidential information
stored in the database.

Figure 32: Another scenario of SQLI Attack

After the batch process runs in the background, the IP address of the user who entered
the query to drop the table is set to blocked and logged in the list of attacks. Figure 33
shows the log id related to that IP address set to vulnerable, and Figure 34 shows the IP
address of the attacking user set to blocked.

Figure 33: Scenario where the Log ID is Set to Vulnerable

Figure 34: Scenario where the IP Address is Set to Blocked

5.6 Test Case 5: XSS Attack to Steal the Cookies

Here the user attempts a session hijack by stealing the cookies from the victim’s
browser. The attacker enters the malicious script <script>alert(document.cookie);</script>
to obtain the current session information from the victim’s browser. Figure 35 shows the
scenario where the attacker enters the script to be executed.

Figure 35: XSS Attack to Steal the Cookies

Figure 36 shows a scenario where the script entered by the attacker starts executing.

It returns the current session value of the victim’s computer. Using this session value, the

attacker can perform session hijacking.

Figure 36: XSS Attack Showing Current Session Value

As the batch process executes internally, the IP address of the attacker who attempted
the session hijacking attack is blocked from accessing the web application in the future.
The request parameters given by the user are compared with the regular expressions stored
in the pattern repository, and the case generator module generates the cases based on the
similarity between the regular expressions and the input parameters. Figure 37 shows the
log id related to the attacker's IP address set to vulnerable, and Figure 38 shows the IP
address of the attacking user set to blocked.

Figure 37: Scenario where the Log ID is set to Vulnerable

Figure 38: Scenario where the IP Address of the Attacker is Blocked

5.7 Test Case 6: URL Redirection using XSS Attack

In this scenario, the attacker injects a script at the client side of the web
application. The script contains the URL of a specific web page, so whenever a user clicks
on the link, they are redirected to the page named in the injected script instead of the
actual page. Figure 39 depicts a scenario where the attacker entered a script containing
the URL of the web page.

Figure 39: URL Redirection using XSS Attack

When a user then clicks on the add item link, they are redirected to the view bill
details page instead of the add item page. Figure 40 depicts the result of clicking the
add item link.

Figure 40: Scenario after URL Redirection Attack

As the batch process executes internally, the IP address of the attacker who attempted
the URL redirection attack is blocked from accessing the web application in the future.
The request parameters given by the user are compared with the regular expressions stored
in the pattern repository, and the case generator module generates the cases based on the
similarity between the regular expressions and the input parameters. Figure 41 shows the
log id related to the attacker's IP address set to vulnerable, and Figure 42 shows the IP
address of the attacking user set to blocked.

Figure 41: Scenario where the Log ID is set to Vulnerable

Figure 42: Scenario where the IP Address of the Attacker is Blocked

5.8 Summary of Test Cases:

Table 2 summarizes all the test cases discussed above.

Table 2: Summary of the Test Cases

| S. No. | Type of Attack | Input | Expected Result | Actual Result | Status |
|--------|----------------|-------|-----------------|---------------|--------|
| 1 | XSS Attack | <script>alert();</script> | Attack detected | Same as expected | Pass |
| 2 | XSS Attack | <script>alert(document.title);</script> | Attack detected | Same as expected | Pass |
| 3 | SQLI Attack | is null or 1=1; | Attack detected | Same as expected | Pass |
| 4 | SQLI Attack | drop table test | Attack detected | Same as expected | Pass |
| 5 | XSS Attack | <script>alert(document.cookie);</script> | Attack detected | Same as expected | Pass |
| 6 | XSS Attack | <script>window.location="http://ViewBillToCook.jsp"</script> | Attack detected | Same as expected | Pass |
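The six inputs of Table 2 can also be exercised programmatically. The sketch below classifies an input as XSS, SQLI, or neither, using two deliberately small rules that are assumptions made for this example (a script-tag pattern and a pattern for the two SQL inputs of Table 2). The class AttackClassifier and its rules are hypothetical and much narrower than the pattern repository of the prototype.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical classifier with example rules; not the prototype's pattern repository.
final class AttackClassifier {

    // Rules are checked in insertion order; the first match decides the label.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();

    static {
        // An opening script tag, allowing whitespace after "<".
        RULES.put("XSS", Pattern.compile("<\\s*script\\b", Pattern.CASE_INSENSITIVE));
        // A numeric tautology ("or 1=1") or a "drop table" statement.
        RULES.put("SQLI", Pattern.compile(
                "\\bor\\s+\\d+\\s*=\\s*\\d+|\\bdrop\\s+table\\b", Pattern.CASE_INSENSITIVE));
    }

    private AttackClassifier() {}

    // Returns "XSS", "SQLI", or "NONE" for the given input.
    static String classify(String input) {
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            if (rule.getValue().matcher(input).find()) {
                return rule.getKey();
            }
        }
        return "NONE";
    }
}
```

Calling AttackClassifier.classify on each Input cell of Table 2 gives the attack type listed in the same row, while an ordinary value such as a category name is classified as NONE.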

5.9 Unit Test Case List:

Table 3 lists the unit test cases for the pattern matching module’s filter, which
captures the requests and responses exchanged with the target application running on the
virtual OS.

Table 3: Scenario to Test Pattern Matching Module

| ID | Test Case Description | Expected Result | Actual Result | Status |
|----|-----------------------|-----------------|---------------|--------|
| 1 | Capture request and response of each interaction with Target Application | Verify Log Table with each updated Request | Same as expected | Pass |
| 2 | Capture Request Method | Verify POST or GET method | Same as expected | Pass |
| 3 | Capture IP addresses | Capture client IP addresses | Same as expected | Pass |

6. Conclusion and Future Enhancements

6.1 Conclusion

The main idea of this research is to implement a prototype that secures web applications
from malicious attacks using an intelligent pattern matching technique. In the tests
reported here, this technique addressed the conventional problems of securing web
applications from malicious attacks.

The approach comprises four modules: the access log, the pattern matching module, the
pattern repository, and the case generator module. Each module performs a specific task in
securing a web application from XSS and SQL injection attacks. The access log stores the
log header information, such as the request parameters, the IP address of the client, the
page requested by the client, and the HTTP method.

The pattern matching module analyzes all the request parameters received by the target
application and compares them with the regular expressions stored in the pattern
repository. The pattern repository consists of regular expressions that represent known
attack patterns. After the comparison, the pattern matching module forwards the requests
to the case generator module, which creates the cases based on the results generated by
the pattern matching module.

Different test cases were used to test the intelligent pattern matching technique, by
passing inputs containing malicious queries and scripts. The technique detected the cross
site scripting and SQL injection attacks in all six test cases.

Because of its matching capability, the intelligent pattern matching module is designed to
read changes in the application, which helps in crafting the list of attacks based on
those changes.

6.2 Future work

In the current technique, regular expressions are generated manually. The system could be
made more practical by generating them automatically: a machine learning algorithm could
analyze the attack patterns in the existing list of attacks and form regular expressions,
and thus new attack rules, for new attacks.

Bibliography

[1] Bruce Schneier Blog, “black listing vs. white listing”, retrieved from
http://www.schneier.com/blog/archives/2011/01/whitelistingvs.html.

[2] OWASP, “Web application attacks statistic”, retrieved from http://www.owasp.org.

[3] Yao-Wen Huang, Shih-Kun Huang, Chung-Hung Tsai, “Web Application Security
Assessment by Fault Injection and Behavior Monitoring”, ACM, May 20-24, 2003.

[4] Federico Maggi, William Robertson, Christopher Kruegel, and Giovanni Vigna,
“Protecting a Moving Target: Addressing Web Application Concept Drift”, RAID 2009.

[5] Jeff Orloff, “Applicure Webhacking Facts and Figure”, retrieved from
http://www.applicure.com/blog/web-application-hacking-facts-figures

[6] ModSecurity, “Mod security”, retrieved from http://www.modsecurity.org

[7] Bockermann, C.; Mierswa, I.; Morik, K., “On the Automated Creation of
Understandable Positive Security Models for Web Applications,” in Pervasive
Computing and Communications, 2008. PerCom 2008. Sixth Annual IEEE International
Conference, pp. 554-559, 17-21 March 2008.

[8] William Robertson, Giovanni Vigna, Christopher Kruegel, and Richard A.
Kemmerer, “Using Generalization and Characterization Techniques in the Anomaly-based
Detection of Web Attacks”, Proceedings of the Network and Distributed System
Security (NDSS) Symposium, San Diego, CA, February 2006.

[9] C. M. Frenz, J. P. Yoon, “XSSmon: A Perl Based IDS for the Detection of Potential
XSS Attacks”, Systems, Applications and Technology Conference (LISAT), IEEE Long
Island, 2012.

[10] Lwin Khin Shar, Hee Beng Kuan Tan, “Automated removal of cross site scripting
vulnerabilities in web applications”, Information and Software Technology 54, 467–478,
2012.

[11] A. M. Chandrasekhar, K. Raghuveer, “Intrusion Detection Technique by using K-means,
Fuzzy Neural Network and SVM classifiers”, 2013 International Conference on
Computer and Informatics (ICCCI), Coimbatore, India, Jan 04-06, 2013.

[12] J. Abirami, R. Devakunchari, C. Valliyammai, “A top web security vulnerability
SQL injection attack - Survey”, Seventh International Conference on Advanced
Computing (ICoAC), 2015.

[13] Niklas Särökaari, “How to identify malicious HTTP Requests”, retrieved from
https://www.sans.org/reading-room/whitepapers/detection/identify-malicious-http-requests-34067, 2012.

[14] Z. Xin-Hua, Z. Wang, “A Static Analysis Tool for Detecting Web Application
Injection Vulnerabilities for ASP Program,” 2nd International Conference on e-Business
and Information System Security (EBISS), 22-23 May 2010.

[15] P. Sharma, R. Johari, S.S. Sarma, “Integrated approach to prevent SQL injection
attack and reflected cross site scripting attack”, Int. J. Syst. Assur. Eng. Manag. 3 (4)
343–351, 2012.

[16] Syed Mishal Murtaza, Asif Sohail Abid, “Automated White-List Learning Technique
for Detection of Malicious Attack on Web Application”, Centres of Excellence in
Sciences & Applied Technology (CESAT).

[17] https://www.owasp.org/index.php/Category:OWASP_AntiSamy_Project

[18] https://www.symantec.com/connect/articles/detection-sql-injection-and-cross-site-scripting-attacks
