7
An Autoencoder-Based Network Intrusion Detection System for the SCADA System Mustafa Altaha 1 , Jae-Myeong Lee 1 , Muhammad Aslam 2 , and Sugwon Hong 1 1 Dept. of Computer Engineering, Myongji University, Yongin, R. of Korea 2 Dept. of Electrical Engineering, Myongji University, Yongin, R. of Korea Email: [email protected]; [email protected]; [email protected]; [email protected] AbstractThe intrusion detection system (IDS) is the main tool to do security monitoring that is one of the security strategies for the supervisory control and data acquisition (SCADA) system. In this paper, we develop an IDS based on the autoencoder deep learning model (AE-IDS) for the SCADA system. The target SCADA communication protocol of the detection model is the Distributed Network Protocol 3 (DNP3), which is currently the most commonly utilized communication protocol in the power substation. Cyberattacks that we consider are data injection or modification attacks, which are the most critical attacks in the SCADA systems. In this paper, we extracted 17 data features from DNP3 communication, and use them to train the autoencoder network. We measure accuracy and loss of detection and compare them with different supervised deep learning algorithms. The unsupervised AE-IDS model shows better performance than the other deep learning IDS models. Index TermsNetwork intrusion detection system, DNP3, SCADA, autoencoder, deep learning, cybersecurity I. INTRODUCTION Security monitoring is one of the integral strategies for the supervisory control and data acquisition (SCADA) systems, more generally the industrial control systems (ICS), and the intrusion detection system (IDS) is a main tool of doing security monitoring. The anomaly IDS detection is the process to determine which observed events are to be identified as abnormal because it has significant deviation from normal behaviors that is called ‘profile.’ The difficult part is how to decide or derive profiles which reflect all semantics of the system. Thus, the main task to do security monitoring is to design domain-specific IDS which is aware of the target domain semantics. Depending on how to drive the profiles, all IDS models for the SCADA/ICS systems come under three categories: traffic-aware, protocol-aware, and process-aware [1]-[4]. Encouraged by the success in some fields, many researchers have begun to focus on constructing IDSs using machine learning and deep learning methods [5]- [11]. One of the challenges to develop anomaly IDS systems is to have capabilities to detect hitherto unknown anomalies or attacks. For this purpose, recently unsupervised learning and semi-supervised learning are gaining attention. At the same time, autoencoder is Manuscript received November 13, 2020; revised May 15, 2021. doi:10.12720/jcm.16.6.210-216 highlighted as a useful tool to realize the unsupervised/semi-supervised learning [12]-[14]. In this paper, we develop an IDS for the SCADA security based on the autoencoder model. The target of the detection model is the DNP3 protocol, which is currently the most commonly utilized communication protocol in the power substation and other ICS systems [15]. One of the major obstacles to develop the SCADA/ICS-specific IDS is to generate datasets that can reflect real traffic behaviors of the target systems. In this study, we generate our own dataset based on OpenDNP3 library through GNS3 [16], [17]. Another hinderance to construct and evaluate IDS for the SCADA/ICS system is how we simulate attacks. Even though we can find some information from the past SCADA/ICS cyberattack incidents, we have to consider attack scenarios including possible attacks in the future as well as the erstwhile known attacks. Thus, we should build and evaluate the proposed IDS models on top of this adversarial/attack model. In light of these characteristics of the dataset, the approach using unsupervised learning models has an advantage over supervised learning models, since they address imbalanced classification problems between mostly normal traffic data and very few attack data, which have to be assumed to be basically unknown. In this paper, we develop an IDS based on the autoencoder because the autoencoder model is able to derive a certain profile, i.e., a normal pattern from a vast amount of normal traffic. The paper [18] analyzes all possible attacks against all layers of the DNP3 protocol stack. In this paper, we focus only on data injection and modification attacks as well as DoS attack, since those attacks are considered to be the most critical attacks that can cause to wreak more damages to the SCADA/ICS operations. We assume that data injection and modification attacks would be materialized via the man-in-the-middle (MITM) attack. In the DNP3 operation mode, a master, which is a Remote Terminal Unit (RTU) or a Human Machine Interface (HMI), does regular polling to remote outstations, which are programming logic controllers (PLCs) or Intelligent Electronic Devices (IEDs), in order to monitor their exact status or any exceptions. For the static or event polls, a master establishes TCP connections with outstations. The papers [19], [20] show Journal of Communications Vol. 16, No. 6, June 2021 ©2021 Journal of Communications 210

An Autoencoder-Based Network Intrusion Detection System

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: An Autoencoder-Based Network Intrusion Detection System

An Autoencoder-Based Network Intrusion

Detection System for the SCADA System

Mustafa Altaha1, Jae-Myeong Lee1, Muhammad Aslam2, and Sugwon Hong1

1 Dept. of Computer Engineering, Myongji University, Yongin, R. of Korea 2 Dept. of Electrical Engineering, Myongji University, Yongin, R. of Korea

Email: [email protected]; [email protected]; [email protected]; [email protected]

Abstract—

The intrusion detection system (IDS) is the main

tool to do security monitoring that is one of the security

strategies for the supervisory control and data acquisition

(SCADA) system. In this paper, we develop an IDS based on

the autoencoder deep learning model (AE-IDS) for the SCADA

system. The target SCADA communication protocol of the

detection model is the Distributed Network Protocol 3 (DNP3),

which is currently the most commonly utilized communication

protocol in the power substation. Cyberattacks that we consider

are data injection or modification attacks, which are the most

critical attacks in the SCADA systems. In this paper, we

extracted 17 data features from DNP3 communication, and use

them to train the autoencoder network. We measure accuracy

and loss of detection and compare them with different

supervised deep learning algorithms. The unsupervised AE-IDS

model shows better performance than the other deep learning

IDS models.

Index Terms—Network intrusion detection system, DNP3,

SCADA, autoencoder, deep learning, cybersecurity

I. INTRODUCTION

Security monitoring is one of the integral strategies for

the supervisory control and data acquisition (SCADA)

systems, more generally the industrial control systems

(ICS), and the intrusion detection system (IDS) is a main

tool of doing security monitoring. The anomaly IDS

detection is the process to determine which observed

events are to be identified as abnormal because it has

significant deviation from normal behaviors that is called

‘profile.’ The difficult part is how to decide or derive

profiles which reflect all semantics of the system. Thus,

the main task to do security monitoring is to design

domain-specific IDS which is aware of the target domain

semantics. Depending on how to drive the profiles, all

IDS models for the SCADA/ICS systems come under

three categories: traffic-aware, protocol-aware, and

process-aware [1]-[4].

Encouraged by the success in some fields, many

researchers have begun to focus on constructing IDSs

using machine learning and deep learning methods [5]-

[11]. One of the challenges to develop anomaly IDS

systems is to have capabilities to detect hitherto unknown

anomalies or attacks. For this purpose, recently

unsupervised learning and semi-supervised learning are

gaining attention. At the same time, autoencoder is

Manuscript received November 13, 2020; revised May 15, 2021.

doi:10.12720/jcm.16.6.210-216

highlighted as a useful tool to realize the

unsupervised/semi-supervised learning [12]-[14].

In this paper, we develop an IDS for the SCADA

security based on the autoencoder model. The target of

the detection model is the DNP3 protocol, which is

currently the most commonly utilized communication

protocol in the power substation and other ICS systems

[15].

One of the major obstacles to develop the

SCADA/ICS-specific IDS is to generate datasets that can

reflect real traffic behaviors of the target systems. In this

study, we generate our own dataset based on OpenDNP3

library through GNS3 [16], [17]. Another hinderance to

construct and evaluate IDS for the SCADA/ICS system is

how we simulate attacks. Even though we can find some

information from the past SCADA/ICS cyberattack

incidents, we have to consider attack scenarios including

possible attacks in the future as well as the erstwhile

known attacks. Thus, we should build and evaluate the

proposed IDS models on top of this adversarial/attack

model. In light of these characteristics of the dataset, the

approach using unsupervised learning models has an

advantage over supervised learning models, since they

address imbalanced classification problems between

mostly normal traffic data and very few attack data,

which have to be assumed to be basically unknown. In

this paper, we develop an IDS based on the autoencoder

because the autoencoder model is able to derive a certain

profile, i.e., a normal pattern from a vast amount of

normal traffic.

The paper [18] analyzes all possible attacks against all

layers of the DNP3 protocol stack. In this paper, we focus

only on data injection and modification attacks as well as

DoS attack, since those attacks are considered to be the

most critical attacks that can cause to wreak more

damages to the SCADA/ICS operations. We assume that

data injection and modification attacks would be

materialized via the man-in-the-middle (MITM) attack.

In the DNP3 operation mode, a master, which is a

Remote Terminal Unit (RTU) or a Human Machine

Interface (HMI), does regular polling to remote

outstations, which are programming logic controllers

(PLCs) or Intelligent Electronic Devices (IEDs), in order

to monitor their exact status or any exceptions. For the

static or event polls, a master establishes TCP

connections with outstations. The papers [19], [20] show

Journal of Communications Vol. 16, No. 6, June 2021

©2021 Journal of Communications 210

Page 2: An Autoencoder-Based Network Intrusion Detection System

the dynamics of TCP connections, such as connection

frequency and duration, etc., by collecting traffic from

real power substations. In this paper, we focus on TCP

connection-related behaviors as features, which can

represent normal operation patterns. We extract 17 input

features based on TCP connections to train a model.

Main focus of this paper is to show the performance of

an autoencoder-based IDS (AE-IDS) model with feature

representation based on TCP connections to detect

normal and abnormal behaviors in DNP3 operation

modes. Even though we use restricted dataset derived

from streamlined DNP3 operation modes with specific

attack types, we try to find out how well AE-IDS

performs, comparing with other well-known supervised

learning models.

The rest of this paper is organized as follows. In

section II, we present the proposed IDS model based on

an autoencoder. In section III, we present an experiment

setup and evaluation of the proposed model. And in

section IV, we show comparison results between AE-IDS

and other supervised leaning-based IDS models. Finally,

we suggest conclusion and future work in section V.

II. IDS MODEL BASED ON AUTOENCODER

The IDS design is composed of two main phases: the

extracting phase and the detection phase. In the extracting

phase, the DNP3 packet is processed to extract data

features that represent the behavior of the network. Each

processed DNP3 packet is categorized as normal or

abnormal, and normal packets are used in the training

phase. The detection phase uses the same features, which

are extracted from the extracting phase, as input. The

constructed IDS model predicts which packets belong to

either normal or abnormal.

A. Extracting Phase

The generated dataset has 470 normal instances (TCP

connections), 10 disable unsolicited messages attack

instances, 11 cold restart command attack instances, and

370 DoS attack instances. This gives a total of 861

instances. The total instances or TCP connections are

relatively small, but these instances are enough to capture

normal patterns, which later are used to discern

abnormality of traffic at the experiment.

This generated dataset consists of 17 features, which

are also used similarly by [21]. Among them, the 12

features are related to behaviors of TCP connections. The

reason we choose these features is that TCP connection or

flow-based behavior is more likely to reflect normal

behavior of DNP3 operation, rather than packet-based or

window-based behavior, which are also commonly

considered as an alternative to extract input features to

train. “Fig. 1,” shows the correlation between the input

features and the label to denote attack types. The 12

features are:

1) Duration: How long the connection lasts before it

finishes.

2) Src_bytes: Number of data bytes transferred from

source to destination in a single connection.

3) Dst_bytes: Number of data bytes transferred from

destination to source in a single connection.

4) Flag: Status of the connection, i.e., Normal or Error.

5) Count: Number of connections to the same

destination host as the current connection in the past

two seconds.

6) Srv_count: Number of connections to the same

service (port number) as the current connection in the

past two seconds.

7) Same_srv_rate: The percentage of connections that

were to the same service, among the connections

aggregated in Count.

8) Dst_host_count: Number of host connections to the

destination

9) Dst_host_srv_count: Number of services connecting

to the destination.

10) Srv_rate: The percentage of connections that were to

the same service.

11) Port_rate: The percentage of connections that were to

the same source port, among the connections having

the same port number.

12) Rttd: Round trip time delay is the length of time it

takes for a signal to be sent plus the length of time it

takes for an acknowledgement of that signal to be

received.

Fig. 1. Correlation between label and features

The remaining 5 features are related to DNP3 specific

features as follows:

13) Contains dnp3 packets: A feature indicating if the

connection contains a DNP3 packet.

14) DNP3 payload length: The total length of the DNP3

payload contained within the connection.

15) Min dnp3 payload length: The minimum DNP3

payload length in the connection.

16) Cold restart in dnp3 pkt: A Boolean indicating if

there exists a cold restart or disable unsolicited

message command in the connection.

Journal of Communications Vol. 16, No. 6, June 2021

©2021 Journal of Communications 211

Page 3: An Autoencoder-Based Network Intrusion Detection System

17) Func code_not_Support_count: A Boolean indicating

changes in function code.

B. Detecting Phase

As a deep learning structure is defined as a sequence of

layers, we create a sequential model and add layers one at

a time until the model network topology reaches at an

optimal level. As we mentioned above, we train an

autoencoder model with normal traffic, and detect

irregularities from an attack dataset at the test stage. The

input layer represents the number of data features that are

extracted from the extracting phase as shown in “Fig. 1.”

Then, after multiple hidden layers, an output layer

produces restored original input features.

The autoencoder model consists of two parts: encoder

and decoder. An encoder aims to compress the input data

into a low-dimensional latent representation, and a

decoder reconstructs the input data from the low-

dimension representation generated by the encoder, based

on the nature of data.

Equation (1) shows the encoding process where the

input is changed into a compressed representation, while

Equation (2) shows the decoding process where the input

is reconstructed from the compressed representation:

h(t) = f(wX(t) + b) (1)

X’(t) = f(w’h(t) + b’) (2)

where h represents the encoded representation or the code

of input X(t); X’(t) is the decoded or reconstructed input;

f is the activation function used by the model; w and w’

are the weights of the encoder and decoder respectively;

and b and b’ are corresponding biases of the encoder and

decoder.

The AE-IDS model consists of an input layer, hidden

encoder/decoder layers, and an output layer. The input

layer consists of 17 features; the final layer is the output

layer that reconstructs the input. As for the activation

function, we use the tanh function, since it is known to

speed up the training [22]. For the hidden layers, the

network should be large enough to capture the structure

of the problem. The optimum number of layers is decided

by a hyperparameter analysis.

III. EVALUATION

A. Exprement Setup

We conduct an experiment on the network, which

consists of one DNP3 master and one outstation, using a

network software emulator, GNS3. The experiment

utilizes two Linux hosts, which are connected to a switch

with negotiated speed of 1000 Mbps. One is working as a

master and the other working as an outstation, both of

which are running OpenDNP3. We also add an attack

host used for penetration testing, which is running Kali

Linux [23]. We assume that an attacker already gained an

access to the DNP3 network, and the Kali Linux node is

used to execute malicious activities. The experiment

focuses on the following attacks: DoS and packet

injection and modification using disable unsolicited

messages, cold restart function codes. For packet

injection/modification attacks, we execute the man-in-

the-middle (MITM) attack by ARP spoofing.

“Fig. 2,” shows DNP3 network configuration in the

experiment setup. To generate the attack traffic, we

assumed that an attacker (Kali Linux) has already

compromised the network. For the DoS attack, hping3 is

used to generate and send DoS traffic to port 20000 (the

DNP3 port) of the outstation node [24]. In order to

perform disable unsolicited messages attack and cold

restart attack, we set up the man-in-the-middle (MITM)

attack by arpspoof [25]. After the success of the MITM

attack, we write a Python TCP hijacking script, which

uses the scapy library to manipulate frames to simulate

attacks of packet injection and modification [26].

In this experiment, the application fields of intercepted

DNP3 packets coming from the master node to the

outstation was altered from a read class to a cold restart

or disable unsolicited in order to compromise normal

operation of the outstation node. The details of the fields

and function codes of the DNP application packet are

explained in the standard [15].

Fig. 2. DNP3 experiment configuration

B. Evaluations

Keras programming environment with TensorFlow at

the backend is used to design the model. Designing an

efficient deep learning model involves a challenging task

called the hyperparameter optimization. First, we carried

out the hyperparameter optimization of the AE-IDS

model. After that, the proposed AE-IDS model was

trained on the prepared dataset to evaluate the

performance of the model: accuracy and loss.

1) Hyperparameter optimization

Optimizing the proposed AE-IDS model was

performed by varying the number of hidden layers and

hidden neurons in each hidden layer. In order to optimize

hyperparameters, we trained the corresponding

autoencoder model for each combination of

hyperparameters. The reconstruction errors, i.e., mean

square error (MSE), on the training data for

hyperparameter optimization are given in “Fig. 3.”

AE-IDS with three hidden layers and 6 Neurons at

each layer is superior to others in terms of loss. Therefore,

we selected 3 hidden layers with 6 hidden neurons as an

optimal AE-IDS model. The optimizer used in the AE

Journal of Communications Vol. 16, No. 6, June 2021

©2021 Journal of Communications 212

EXPERIMENT AND

Page 4: An Autoencoder-Based Network Intrusion Detection System

model is the Adam optimizer, and tanh is used as the

activation function. The batch size is 8, and epoch is 20.

Fig. 3. MSE for different number of hidden layers and hidden neurons.

2) Model evaluation

“Fig. 4,” shows MSE for normal data, i.e., 470 normal

TCP connections. As shown in this figure, the loss as a

measurement of MSE is less than 0.18 for all connections.

Fig. 4. MSE for normal data.

Fig. 5. MSE for attack data.

“Fig. 5,” shows the loss for attack instances, i.e., the

TCP connections in which attack packets exist. When

there are any attacks, the model shows that the loss is

above 0.01 as shown in this figure.

In order to detect an injection attacks, DNP3 payload

inspection is required. In section II of the detection phase,

we extract from the TCP connection some DNP3 specific

features like “Contains dnp3 pkt”, indicating if the

connection contains a DNP3 packet, or “Cold restart in

dnp3 pkt,” indicating if there exist a cold restart or

disable unsolicited command in the connection, or “Fcns

in conn,” indicating if a function code is changed in a

connection. Thus, if any abnormal activity happens, the

model clearly distinguishes between normal connections

from injected DNP3 attack connections, depending on the

patterns of the connections.

IV. COMPARISON WITH SUPERVISED LEARNING MODELS

In order to investigate the effectiveness of the AE-IDS

model for SCADA security, we compared our approach

with the typical deep neural network methods. The aim of

the evaluation is to find out the performance of AE-IDS

by examining the following measurements, comparing

with other models.

Accuracy = (TP+TN)/ (P + N) (3)

F1_socre = (2 * TP)/ (2 * TP + FP + FN) (4)

Recall = TP/P (5)

Precision = TP/(TP + FP) (6)

where P and N are the actual number of attack and no

attack instances respectively; True Positive (TP) is the

number of attacks classified rightly as attack; True

Negative (TN) is the number of normal instances rightly

classified normal; False Positive (FP) is the number of

normal events misclassified as attacks; and False

Negative (FN) is the number of attacks misclassified as

normal.

In our previous work [21], an intrusion detection

system using the supervised deep learning algorithms,

was proposed to secure the DNP3 network. The proposed

deep learning algorithms were: Feedforward neural

network (FNN), Recurrent neural networks (RNN), Long

short-term memory (LSTM), and convolutional neural

network (CNN). In this paper we compare the AE-IDS

with other supervisory deep learning model-based IDSs

in this specific case of dataset and DNP3 operation

environment.

TABLE I: COMPARISON OF RESULTS

Algorithm Accuracy% Recall% Precision% F1 Score%

FNN 98.75 98.01 98.59 98.12

RNN 98.68 97.68 98.95 98.96

LSTM 98.68 97.68 97.68 97.69

CNN 98.68 97.68 97.68 97.69

AE-IDS 99.70 100.00 99.74 99.49

Table I shows that the performance of AE-IDS is better

than other models in terms of accuracy, recall, and f1-

score. The accuracy of AE-IDS is approximately 99.70%,

which means that there is a chance of around 99% of

detecting any anomalous traffic inside the network. In

this experiment, the loss (MSE) threshold to decide

Journal of Communications Vol. 16, No. 6, June 2021

©2021 Journal of Communications 213

Page 5: An Autoencoder-Based Network Intrusion Detection System

whether a connection is an attack or not is set to 0.10. So,

as shown in “Fig. 5,” the recall of AE-IDS is 100%.

In addition, another metric to be considered is the loss.

The lower loss means the better model. Loss is not in

percentage as opposed to accuracy and it is a summation

of errors made for each example in training or test sets.

Fig. 6 shows the loss upon training the network. Similar

to accuracy, loss decreases as the number of epochs

increases until it reaches a value of 0.026 for FNN,

0.0264 for RNN, 0.0534 for LSTM, and 0.1363 for CNN.

As for the AE-IDS, this value is 0.0131, which is almost

a negligible loss at the end of the training.

Fig. 6. Model loss during the training of the network

V. CONCLUSION AND FUTURE WORK

The autoencoder is one of the most interesting models

to extract features from the high-dimensional data in the

context of deep learning. In this paper we propose an

autoencoder-based intrusion detection system (AE-IDS)

approach to build an effective and flexible IDS. The main

purpose of this work is to show comparison of the

performance of AE-IDS with other intrusion detection

models based on different supervised deep learning

algorithms. The proposed AE-IDS outperforms other

models in terms of accuracy, recall, f1-score and loss,

proving its efficiency in SCADA security.

However, we cannot generalize this conclusion, since

our experiment is restricted to small dataset and specific

DNP3 operation environments. And superiority of input

features based on TCP connections cannot be concluded

compared to other alternatives, such as packet-based or

window-based. The comparison still needs more

experiment in various operation environments.

As a further work, we need to expand the

attack/adversary model, which can reflect various unseen

and unexpected attacks. We also need to generate more

comprehensive DNP3 dataset to train a model and

validate its efficiency as an IDS for SCADA security,

considering different DNP3 operations reflected in real

substations. In addition, we need to validate and evaluate

autoencoder-based intrusion detection system (AE-IDS)

by comparing with the traditional rule-based or protocol-

based IDSs. We will also further explore how sparsity

constraints are imposed on autoencoder and how sparse

AE-IDS can be designed to further improve intrusion

detection effectiveness.

CONFLICT OF INTEREST

The authors declare no conflict of interest.

AUTHOR CONTRIBUTIONS

M. Al. designed, implemented, evaluated the model,

and wrote the paper. J.-M.L. also worked in the

implementation and evaluation. S.H. supervised and

evaluate the whole process and gave ideas to improve the

article. M.As. supported discussions on improvements

and carried out proof reading of the article. All authors

have read and agreed to the published version of the

manuscript.

ACKNOWLEDGMENT

This research was supported by Korea Electric Power

Corporation. (Grant number: R18XA01)

REFERENCES

[1] S. Hong, J. H. Lee, M. Altaha, and M. Aslam, “Security

monitoring and network management for the power control

network,” I. J. of Electrical and Electronic Engineering &

Telecommunications, vol. 9, no. 5, pp. 356-363, Sep. 2020.

[2] D. Bhamare, M. Zolanvari, A. Erbad, R. Jain, K. Khan, and

N. Meskin, “Cybersecurity for industrial control systems:

A survey,” Computers & Security, vol. 89, February 2020.

[3] Y. Hu, A. Yang, H. Li, Y. Sun, and L. Sun, “A survey of

intrusion detection on industrial control systems,” I. J. of

Distributed Sensor Networks, vol. 14, no. 8, 2018.

[4] A. Volkova, M. Niedermeiser, R. Basmadjian, and H. de

Meer, “Security challenge in control network protocols: A

survey,” IEEE Communications Surveys & Tutorials, vol.

21, no. 1, 2019.

[5] H. Liu and B. Lang, “Machine learning and deep learning

methods for intrusion detection systems: A survey,”

Applied Sciences vol. 9, p. 4396, Oct. 2019.

[6] A. M. Aleesa, B. B. Zaidan, A. A. Zaidan, and N. M. Sahar,

“Review of intrusion detection systems based on deep

learning techniques: Coherent taxonomy, challenges,

motivations, recommendations, substantial analysis and

future direction,” Neural Computing and Application, vol.

32, pp. 8827-9858, Oct. 2019.

[7] B. Chalapathy and S. Chawla. (Jan. 2019). Deep Learning

for Anomaly Detection: A Survey. [Online]. Available:

https://arxiv.org/abs/1901.03407

[8] Y. Luo, Y. Xiao, L. Cheng, G. Peng, and D. Yao. (Mar.

2020). Deep learning-based anomaly detection in cyber-

physical systems: Progress and opportunities. [Online].

Available: https://arxiv.org/abs/2003.13213

[9] R. L. Perez, F. Adamsky, R. Soua, and T. Engel, “Forget

the myth of the air gap: Machine learning for reliable

intrusion detection in SCADA systems,” EAI Endorsed

Trans. on Security and Safety, vol. 6, no. 19, 2019.

Journal of Communications Vol. 16, No. 6, June 2021

©2021 Journal of Communications 214

Page 6: An Autoencoder-Based Network Intrusion Detection System

[10] H. Yang, L. Cheng, and M. C. Chuah, “Deep-Learning-

Based network intrusion detection for SCADA systems,”

in Proc. IEEE Conference on Communications and

Network Security (CNS), June 2019.

[11] A. Hijazi, E. A. Safadi, and J. M. Flaus, “A deep learning

approach for intrusion detection system in industry

network,” in Proc. Int’l Conference on Big Data and

Cybersecurity Intelligence, Beirut, Lebanon, 2019.

[12] C. Wang, B. Wang, H. Liu, and H. Qu, “Anomaly

detection for industrial control system based on

autoencoder neural network,” Hindawi Wireless

Communications and Mobile Computing, vol. 2020, Aug.

2020.

[13] M. Charib, S. H. Dastgerdi, and M. Sabokron. (Nov. 2019).

AutoIDS: Auto-encoder Based Method for Intrusion

Detection System. [Online]. Available:

https://arxiv.org/pdf/1911.03306.pdf

[14] F. Farahnakian and J. Heikkonen, “A deep auto-encoder

based approach for intrusion detection system,” in Proc.

Int’l Conf. on Advanced Communication Technology

(ICACT), Feb. 2018.

[15] IEEE Standard for Electric Power Systems

Communications Distributed Network Protocol (DNP3),

IEEE Standard Association, IEEE Std 1815-2012.

[16] OpenDNP3. [Online]. Available:

https://www.automatak.com/opendnp3

[17] W. Chris, GNS3 Network Simulation Guide, 1st ed. Packt

Publ., 2013.

[18] S. East, J. Butts, M. Papa, and S. Shenoi, “A taxonomy of

attacks on the DNP3 protocol,” in Critical Infrastructure

Protection III, Springer, Berlin, Heidelberg, March, 2009,

pp. 67-81.

[19] D. Formby, A. Walid, and R. Beyah, “A case study in

power substation network dynamics,” Proc. the ACM on

Measurement and Analysis of Computing Systems, vol. 1,

no. 1, June 2017.

[20] S. S. Jung, D. Formby, C. Day, and R. Beyah, “A first look

at machine-to-machine power grid network traffic,” in

Proc. IEEE Int’l Conf. on Smart Grid Communications,

Nov. 2014.

[21] M. Altaha, J. H. Lee., M. Aslam, and S. Hong, “Network

intrusion detection based on deep neural networks for the

SCADA system,” Journal of Physics: Conference Series,

vol. 1585, July 2020.

[22] H. Zhang, T. W. Weng, P. Y. Chen, C. J. Hsieh, and L.

Daniel, “Efficient neural network robustness certification

with general activation functions,” in Advances in neural

Information Processing Systems, 2018, pp. 4939-4948.

[23] Linux, Kali. [Online]. Available: https://www.kali.org

[24] hping3, [Online]. Available: http://www.hping.org/

[25] Arpspoof, [Online]. https://linux.die.net/man/8/arpspoof

[26] Scapy, [Online]. Available:

http://www.secdev.org/projects/scapy

Copyright © 2021 by the authors. This is an open access article

distributed under the Creative Commons Attribution License

(CC BY-NC-ND 4.0), which permits use, distribution and

reproduction in any medium, provided that the article is

properly cited, the use is non-commercial and no modifications

or adaptations are made.

Mustafa Altaha born on February 7,

1992. He completed his bachelor degree

in computer communication at Al

Mansour University Colleague, Iraq, in

2014. Then he starts his Master program

at Myongji University, Korea, from

2015-2017 at department of information

and communication engineering. He is

currently in his second year of his PhD in computer engineering

department at Myongji University.

He worked as network operated engineer in Earthlink-Co

(biggest Internet service provider in Iraq) from 2014-2015.

During his master degree program, he published on OFDM

systems, and during his first year as Ph.D. he published an

articles on fault tolerance in PTP systems. His current research,

is about implementing intrusion detection in SCADA system by

using deep learning algorithms.

MSc. Altaha was awarded the first in Iraq in cisco national net-

riders competition in network skills in 2014 and, awarded the

fifth in Middle East in cisco international net-riders competition

in 2014.

Jae-Myeong Lee was born in Seoul, S.

Korea on September 28, 1992. He

received a Bachelor’s degree in computer

engineering from Myongji University,

Korea, in 2018. Then he started his

Master’s degree program in computer

engineering at Myongji University,

Korea, from 2018. He is now researching

about Machine Learning, Deep Learning, Computer Security,

Smart Grid, and SCADA Security, under supervision of Prof.

Sugwon Hong.

He was enthusiastic about computer programming from a very

young age. In 2009, He was awarded Microsoft Most Valuable

Professional (MVP) for Visual Basic by Microsoft, which was

the youngest record in Korea.

Muhammad Aslam born on December

12, 1991. He completed his bachelor

degree in electrical engineering at

University of Engineering and

Technology, Peshawar, Pakistan, in 2013.

He completed his Master program at

Myongji University, Korea, from 2015-

2017 at department of electrical

engineering. He is currently in his 3rd year of PhD in electrical

engineering department at Myongji University.

During his Masters, he worked with Next-Generation Power

Technology (NPTC) research center in Myongji University,

where he worked on power system protection, power system

load flow calculation and Artificial Intelligence (AI). Currently,

Journal of Communications Vol. 16, No. 6, June 2021

©2021 Journal of Communications 215

Page 7: An Autoencoder-Based Network Intrusion Detection System

he is working to apply AI into power system and network

security

Sugwon Hong was born in Incheon, S.

Korea. He received the B.S. degree in

physics from Seoul National University,

Seoul, Korea, in 1979, and the M.S. and

Ph.D. degrees in computer science from

North Carolina State University, Raleigh,

USA, in 1988 and 1992, respectively.

His employment experience includes

Korea Institute of Science and Technology (KIST), Software

Development Center; Korea Energy Economics Institute (KEEI);

SK Innovation Co., Ltd. (formerly Korea Oil Company);

Electronic and Telecommunication Research Institute (ETRI).

Currently he is a professor in the department of computer

engineering, Myongji University, Yongin, S. Korea, where he

has been since 1995. His current research interest are cyber

security and smart grid.

Journal of Communications Vol. 16, No. 6, June 2021

©2021 Journal of Communications 216