37
IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 Towards Massive Machine Type Communications in Ultra-Dense Cellular IoT Networks: Current Issues and Machine Learning-Assisted Solutions Shree Krishna Sharma, Senior Member, IEEE, and Xianbin Wang, Fellow, IEEE Abstract— The ever-increasing number of resource-constrained Machine-Type Communication (MTC) devices is leading to the critical challenge of fulfilling diverse communication require- ments in dynamic and ultra-dense wireless environments. Among different application scenarios that the upcoming 5G and beyond cellular networks are expected to support, such as enhanced Mo- bile Broadband (eMBB), massive Machine Type Communications (mMTC) and Ultra-Reliable and Low Latency Communications (URLLC), the mMTC brings the unique technical challenge of supporting a huge number of MTC devices in cellular networks, which is the main focus of this paper. The related challenges include Quality of Service (QoS) provisioning, handling highly dynamic and sporadic MTC traffic, huge signalling overhead and Radio Access Network (RAN) congestion. In this regard, this paper aims to identify and analyze the involved technical issues, to review recent advances, to highlight potential solutions and to propose new research directions. First, starting with an overview of mMTC features and QoS provisioning issues, we present the key enablers for mMTC in cellular networks. Along with the highlights on the inefficiency of the legacy Random Access (RA) procedure in the mMTC scenario, we then present the key features and channel access mechanisms in the emerging cellular IoT standards, namely, LTE-M and Narrowband IoT (NB-IoT). Subsequently, we present a framework for the performance analysis of transmission scheduling with the QoS support along with the issues involved in short data packet transmission. Next, we provide a detailed overview of the existing and emerging solutions towards addressing RAN congestion problem, and then identify potential advantages, challenges and use cases for the applications of emerging Machine Learning (ML) techniques in ultra-dense cellular networks. Out of several ML techniques, we focus on the application of low-complexity Q-learning approach in the mMTC scenario along with the recent advances towards enhancing its learning performance and convergence. Finally, we discuss some open research challenges and promising future research directions. Index Terms—Cellular IoT, mMTC, 5G and beyond wireless, RAN congestion, Machine learning, Q-learning, LTE-M, NB-IoT. I. I NTRODUCTION The convergence of emerging wireless communication tech- nologies, ubiquitous wireless infrastructure and vertical In- ternet of Things (IoT) applications such as industrial au- tomation, connected cars and smart-grid is leading to an integrated enabling platform for future smart and connected societies. This platform envisions to synergistically integrate This work was supported in part by NSERC Discovery and CREATE programs under project number RGPIN-2018-06254 and 432280-2013. (Cor- responding author: Xianbin Wang). The authors are with the Department of Electrical and Computer En- gineering, Western University, London, ON, N6A 3K7, Canada, Email: [email protected], [email protected]. the ever-increasing number of smart devices (forecasted by IHS Markit to be around 125 billion by 2030), intelligent industry processes, people and societies together to enhance the overall quality of our daily life [1]. Towards supporting connected IoT devices, there are several recent developments in the area of licensed cellular technologies such as Long Term Evolution (LTE) for Machine-Type Communications (LTE-M) and Narrow-Band IoT (NB-IoT), and unlicensed technologies such as WiFi, ZigBee and LoRa [2]. Out of these, cellular technologies are considered to be promising due to their several advantages including Quality of Service (QoS) provisioning, wide coverage area and tight coordination, and therefore, cellular IoT is of the main focus in this paper. A. Recent Developments in Cellular IoT In recent years, cellular IoT has gained significant impor- tance from academia, industries, regulators and standardization bodies to enable the incorporation of IoT devices in the existing cellular infrastructures. The ITU-R has categorized the emerging diversified telecommunication services in the upcoming 5G and beyond cellular networks into the following three classes [3]: (i) enhanced Mobile Broadband (eMBB),(ii) massive Machine Type Communications (mMTC), and (iii) Ultra-Reliable and Low Latency Communications (URLLC). Out of the above-mentioned categories, the eMBB com- prises of high data rate services of 5G systems while the mMTC deals with the scalable connectivity to a massive number of devices in the order of 10 6 devices per square kilometers with diverse QoS requirements [4]. On the other hand, URLLC aims to provide robust connectivity with very low latency. The main challenge in the mMTC case is to support a huge number of devices with the limited radio resources whereas the key challenge for the URLLC scenario is to provide extremely high reliability in the order of 99.999% within a very short duration in the order of 1 ms [5]. Among these usage scenarios, the mMTC has to deal with various non-conventional challenges including QoS provisioning, Ran- dom Access Network (RAN) congestion, highly dynamic and sporadic traffic, and large signalling overhead. To this end, this paper focuses on the involved issues and the potential enablers of the mMTC scenario with a particular emphasis on the RAN congestion problem and emerging Machine Learning (ML)-based solutions. In terms of ongoing standardization efforts, the main cellular IoT standards introduced by the 3GPP are LTE-M and NB- IoT. Out of these, LTE-M is intended for mid-range IoT arXiv:1808.02924v1 [eess.SP] 8 Aug 2018

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1

Towards Massive Machine Type Communications inUltra-Dense Cellular IoT Networks: Current Issues

and Machine Learning-Assisted SolutionsShree Krishna Sharma, Senior Member, IEEE, and Xianbin Wang, Fellow, IEEE

Abstract— The ever-increasing number of resource-constrainedMachine-Type Communication (MTC) devices is leading to thecritical challenge of fulfilling diverse communication require-ments in dynamic and ultra-dense wireless environments. Amongdifferent application scenarios that the upcoming 5G and beyondcellular networks are expected to support, such as enhanced Mo-bile Broadband (eMBB), massive Machine Type Communications(mMTC) and Ultra-Reliable and Low Latency Communications(URLLC), the mMTC brings the unique technical challenge ofsupporting a huge number of MTC devices in cellular networks,which is the main focus of this paper. The related challengesinclude Quality of Service (QoS) provisioning, handling highlydynamic and sporadic MTC traffic, huge signalling overhead andRadio Access Network (RAN) congestion. In this regard, thispaper aims to identify and analyze the involved technical issues,to review recent advances, to highlight potential solutions and topropose new research directions. First, starting with an overviewof mMTC features and QoS provisioning issues, we presentthe key enablers for mMTC in cellular networks. Along withthe highlights on the inefficiency of the legacy Random Access(RA) procedure in the mMTC scenario, we then present the keyfeatures and channel access mechanisms in the emerging cellularIoT standards, namely, LTE-M and Narrowband IoT (NB-IoT).Subsequently, we present a framework for the performanceanalysis of transmission scheduling with the QoS support alongwith the issues involved in short data packet transmission. Next,we provide a detailed overview of the existing and emergingsolutions towards addressing RAN congestion problem, and thenidentify potential advantages, challenges and use cases for theapplications of emerging Machine Learning (ML) techniques inultra-dense cellular networks. Out of several ML techniques, wefocus on the application of low-complexity Q-learning approachin the mMTC scenario along with the recent advances towardsenhancing its learning performance and convergence. Finally,we discuss some open research challenges and promising futureresearch directions.

Index Terms— Cellular IoT, mMTC, 5G and beyond wireless,RAN congestion, Machine learning, Q-learning, LTE-M, NB-IoT.

I. INTRODUCTION

The convergence of emerging wireless communication tech-nologies, ubiquitous wireless infrastructure and vertical In-ternet of Things (IoT) applications such as industrial au-tomation, connected cars and smart-grid is leading to anintegrated enabling platform for future smart and connectedsocieties. This platform envisions to synergistically integrate

This work was supported in part by NSERC Discovery and CREATEprograms under project number RGPIN-2018-06254 and 432280-2013. (Cor-responding author: Xianbin Wang).

The authors are with the Department of Electrical and Computer En-gineering, Western University, London, ON, N6A 3K7, Canada, Email:[email protected], [email protected].

the ever-increasing number of smart devices (forecasted byIHS Markit to be around 125 billion by 2030), intelligentindustry processes, people and societies together to enhancethe overall quality of our daily life [1]. Towards supportingconnected IoT devices, there are several recent developmentsin the area of licensed cellular technologies such as LongTerm Evolution (LTE) for Machine-Type Communications(LTE-M) and Narrow-Band IoT (NB-IoT), and unlicensedtechnologies such as WiFi, ZigBee and LoRa [2]. Out of these,cellular technologies are considered to be promising due totheir several advantages including Quality of Service (QoS)provisioning, wide coverage area and tight coordination, andtherefore, cellular IoT is of the main focus in this paper.

A. Recent Developments in Cellular IoTIn recent years, cellular IoT has gained significant impor-

tance from academia, industries, regulators and standardizationbodies to enable the incorporation of IoT devices in theexisting cellular infrastructures. The ITU-R has categorizedthe emerging diversified telecommunication services in theupcoming 5G and beyond cellular networks into the followingthree classes [3]: (i) enhanced Mobile Broadband (eMBB),(ii)massive Machine Type Communications (mMTC), and (iii)Ultra-Reliable and Low Latency Communications (URLLC).

Out of the above-mentioned categories, the eMBB com-prises of high data rate services of 5G systems while themMTC deals with the scalable connectivity to a massivenumber of devices in the order of 106 devices per squarekilometers with diverse QoS requirements [4]. On the otherhand, URLLC aims to provide robust connectivity with verylow latency. The main challenge in the mMTC case is tosupport a huge number of devices with the limited radioresources whereas the key challenge for the URLLC scenariois to provide extremely high reliability in the order of 99.999%within a very short duration in the order of 1 ms [5]. Amongthese usage scenarios, the mMTC has to deal with variousnon-conventional challenges including QoS provisioning, Ran-dom Access Network (RAN) congestion, highly dynamic andsporadic traffic, and large signalling overhead. To this end,this paper focuses on the involved issues and the potentialenablers of the mMTC scenario with a particular emphasis onthe RAN congestion problem and emerging Machine Learning(ML)-based solutions.

In terms of ongoing standardization efforts, the main cellularIoT standards introduced by the 3GPP are LTE-M and NB-IoT. Out of these, LTE-M is intended for mid-range IoT

arX

iv:1

808.

0292

4v1

[ee

ss.S

P] 8

Aug

201

8

Page 2: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 2

applications which can support voice and video services, whileNB-IoT systems target to provide very large coverage andsupport for ultra-low cost devices [6]. Since MTC devicesusually do not require high channel throughput, the existingLTE-M and NB-IoT standards allocate a small bandwidth forIoT devices, i.e., LTE-M assigns 1.4 MHz bandwidth while theNB-IoT allocates a significantly lower bandwidth of 180 kHz[7]. Despite these recent developments, there are still severalchallenges to be addressed while supporting MTC devices incellular systems.

B. Challenges in Cellular IoTAlthough centralized cellular systems provide several ad-

vantages in terms of providing large coverage, tight timesynchronization and handover operations for mobile users,they are sluggish in terms of handling low-end devices andface several challenges in supporting a large number of MTCdevices with diverse QoS requirements. While incorporatingMTC devices in the existing LTE/LTE-A based cellular sys-tems, cellular operators have to face a lot of challenges bothat the operational and planning levels. More specifically, therearise various issues related to the MTC device deployment,mMTC traffic, energy efficiency of low-cost MTC devicesand the network protocol aspects such as signalling overhead[8]. Furthermore, network congestion may occur in differentsegments of LTE/LTE-A based cellular network includingRAN, core network and signalling network [9]. Out of these,RAN congestion problem is crucial in ultra-dense cellularIoT networks due to the limited available radio resources atthe access-side and the massive number of sporadic accessattempts from heterogeneous MTC devices.

Existing contention-based protocols are effective to sup-port the conventional Human-Type Communications (HTC),however, their performance significantly degrades in mMTCscenarios due to infrequent and massive number of accessrequests [10]. Also, due to limited available preambles in theexisting LTE-based systems, several MTC devices may needto select the same preambles at the same time, resulting ina significantly high probability of collision in the access net-work. Furthermore, the number of transmission attempts fromthe massive number of heterogeneous IoT devices could besignificantly large [11], and their activation periods and framesizes could be very different [12]. This sporadic and dynamicnature of mMTC access attempts and data transmissions mayresult in the peak traffic in both the access and traffic channelswell beyond the capacity of the IoT access network, thusleading to the inevitable congestion in an IoT access network[13]. Moreover, although data packets transmitted by IoTdevices are relatively short, very high signalling overhead perdata packet becomes another critical issue [8, 14]. To this end,it is significantly important to investigate suitable transmissionscheduling and efficient signalling reduction techniques inultra-dense scenarios by utilizing emerging tools such as ML.

C. Need of Machine Learning and Associated Challenges inIoT/mMTC Networks

Optimizing the operation of cellular networks in dynamicwireless environments has been challenging over the gen-

erations since the number of configurable parameters of acellular network has been rapidly increasing from one cel-lular generation to the next one [15, 16]. The widely-usedlink adaptation techniques in the existing wireless systems,which adapt different physical layer parameters includingtransmission power and modulation and coding scheme basedon the reliability/link of a communication link, may not beefficient in ultra-dense cellular IoT networks. This adaptationis based on the prediction of reliability of a wireless link inthe form of some metrics such as Packet Error Rate (PER)and this prediction process becomes extremely complex dueto the increasing trend of using multiple antennas, widebandsignals and a number of advanced signal processing algorithms[17]. Furthermore, the prediction of PER with good accuracybecomes difficult in practice by using the conventional signalprocessing tools. Moreover, due to a significantly large numberof environmental parameters such as channel state information,signal power, noise variance, non-Gaussian noise effect andtransceiver hardware impairments, it becomes complicated toprovide the near-optimal/optimal tuning of the transmissionparameters to achieve the efficient link adaptation [18]. Theseverity of this problem greatly increases in ultra-dense net-works due to the involvement of massive number of devicesand system parameters.

Understanding the context of the surrounding wirelessenvironment significantly facilitates in developing context-aware adaptive communication protocols and in taking op-timized decisions. Nevertheless, handling self-configuration,self-optimization and self-healing operations in the ultra-densecellular networks becomes challenging since the networksneed to observe dynamic environmental variations, learn un-certainties, plan response actions and configure the associ-ated network parameters effectively. To this end, emergingML-assisted techniques seem promising since they can playsignificant roles in learning the system variations/parameteruncertainties, classifying the involved cases/issues, predictingthe future results/challenges and investigating potential solu-tions/actions [16]. Moreover, the conventional link adaptationtechniques are more localized to a particular network anda geographical region, and do not usually consider theirimpacts on the other systems. However, future ultra-densecellular networks will need to handle mutual impact among theinvolved entities to maximize the overall system performance.To this end, by utilizing the emerging collaborative edge-cloudprocessing platform [19], ML-assisted solutions can enablethe utilization of global network knowledge at the edge-side,and also facilitate the coordination among different distributedsystems. In this direction, the application of ML techniquesto address various issues in dynamic wireless environmentshas recently received an important attention [16, 20] and inthe context of MTC environments, some existing works havealready studied the applications of different ML techniques inlearning various system parameters [10, 21–25].

However, the direct application of conventional ML tech-niques to complex and dynamic wireless IoT environments isnot straight-forward due to several underlying constraints suchas low computational capability of MTC devices, distributednature and heterogeneous QoS requirements of IoT devices,

Page 3: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 3

TABLE ICLASSIFICATION OF SURVEY/OVERVIEW WORKS IN THE AREAS OF IOT/MMTC, ML AND UDNS.

Main domain Sub-topics ReferencesEnabling technologies/protocols, challenges and applications [1, 2, 6, 11, 14, 28–32]Random access schemes [22, 33–35]

IoT/mMTC Traffic characterization and issues [36]Transmission scheduling [37, 38]IoT big data analytics [19, 39, 40]QoS provisioning [41]Intelligence in 5G networks [16, 20]

Machine Learning (ML) Learning in IoT/sensor networks [26, 42, 43]Reinforcement learning [44, 45]

Ultra-Dense Networks (UDNs) [46–48]

and the distinct features of mMTC traffic as compared tothe conventional HTC traffic [26]. Furthermore, due to thelimited computed power and low memory size of IoT devices,implementing sophisticated learning techniques in IoT devicesbecomes challenging [27]. In this regard, this paper identifiesthe implementation issues of the ML techniques in ultra-denseIoT scenarios and provides an emphasis on computationallysimpler Q-learning based solutions.

D. Review of Related Overview/Survey Articles

In this subsection, we provide a brief overview of theexisting survey works in the main domains related to thispaper, namely, IoT/mMTC, ML and Ultra-Dense Networks(UDNs). Also, we present the classification of the existingreferences related to these domains into different sub-topicswhich are listed in Table I.

Several existing papers have provided the review of en-abling technologies, protocols, challenges and applicationsof IoT/mMTC in different contexts [1, 2, 6, 11, 14, 28–32].Authors in [1] provided a comprehensive overview of ex-isting IoT protocols including application protocols, servicediscovery protocols, infrastructure protocols, and discussedsome enabling technologies including cloud computing, edgecomputing and big data analytics for IoT systems. Further-more, the authors in [6] presented a detailed survey of MTCsystems including its features, requirements and the requiredarchitectural enhancements in LTE/LTE-A based networks.In the context of short packet transmissions in mMTC/IoTenvironment, the contribution in [11] provided a review of therecent advances in information theoretic principles governingthe transmissions of short data packets and discussed theapplications of these principles to different scenarios includinga two-way channel, a downlink broadcast channel and anuplink Random Access Channel (RACH). Besides, the article[14] highlighted the requirements and design challenges formMTC systems and discussed various physical and MediumAccess Control (MAC) layer solutions for energy-efficientand massive access. Another overview paper [28] discussedthe physical limitations of MTC devices while operating incellular networks, and then analyzed the impact of these devicelimitations on the link performance and the link budget design.

In addition, the article [30] presented a review on var-ious features defined by the 3GPP to support Machine-to-Machine (M2M) communications in LTE-based cellular sys-tems and discussed recent advances in different layers includ-ing the physical layer improvements under the enhanced MTC

(eMTC), and MAC and higher layer enhancements brought bythe extended Discontinuous Reception (eDRX). Furthermore,the survey article [2] provided a comprehensive survey ofthree main low power and long range M2M solutions, namely,Low Power Wide Area Network (LPWAN), IEEE 802.11ah-based network and cellular M2M including LTE-M and NB-IoT. Besides, the overview article [29] presented the newrequirements and challenges in large-scale MTC applications,and discussed some enabling techniques including efficientoverhead signalling protocols, data aggregation and in-deviceintelligent processing. Also, the survey article [31] provided acomprehensive tutorial on the development of MTC designover different releases of LTE and recent user equipmentsbelonging to the MTC and the NB-IoT categories, calledCAT-M and CAT-N, respectively. Moreover, another overviewarticle [32] provided a review of various features of NB-IoTintroduced in LTE Release 14 including the increased posi-tioning accuracy, multi-casting, enhanced non-anchor carrieroperation and lower device power class, and the applicabilityof these features for NB-IoT systems.

The design of effective Random Access (RA) schemes inthe mMTC environment is an important challenge due tomassive access requests and sporadic device transmissionsfrom a huge number of resource-constrained MTC devices. Inthis regard, authors in [22] provided an overview of differentRA overload control mechanisms to avoid the RAN congestioncaused by the random channel access from the MTC devices.Furthermore, the article [33] presented a comprehensive surveyof various RA solutions attempting to enhance the RACHoperation of LTE/LTE-A based cellular networks, and carriedout the performance evaluation of LTE RACH from theenergy efficiency perspective. Moreover, the authors in [34]provided a review of emerging LPWAN technologies both inthe unlicensed band (LoRa and SIGFOX) and in the licensedband (LTE-M and NB-IoT) while considering three commonfundamental objectives of these access mechanisms, namely,high system capacity, wide coverage and long battery life. Inaddition, the article [35] provided an overview of the existingRA solutions towards supporting MTC devices in LTE/LTE-A based networks and these solutions are compared in termsof five key metrics, namely, access success rate, access delay,QoS guarantee, energy efficiency and the impact on the HTC.

The nature of MTC traffic is significantly different from theHTC traffic as detailed later in Section II-D, however, existingcellular networks are mainly optimized to support the HTCtraffic. Therefore, it is important to understand and characterizeMTC traffic to facilitate the incorporation of MTC devices in

Page 4: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 4

cellular networks. To this end, the authors in [36] provideda discussion on the traffic issues of MTC and the associatedcongestion problems on the access channels, traffic channelsand core network, and presented a comprehensive review ofthe existing solutions towards addressing these problems alongwith their advantages and disadvantages.

One of the promising solutions to handle massive accessrequests in the mMTC scenarios is to employ suitable trans-mission scheduling techniques at the distributed MTC devices.In this context, the authors in [37] identified limitations forsignalling and scheduling of M2M devices over the existingLTE-based cellular infrastructures and discussed some of theexisting proposals. Furthermore, the article [38] provideda detailed survey on the uplink scheduling techniques forM2M devices over LTE/LTE-A based cellular networks byconsidering various aspects of M2M communications suchas scalability, energy efficiency, QoS support and multi-hopconnectivity.

Another important aspect in ultra-dense IoT networks ishow to handle the massive amount of data generated from theresource-constrained sensors and MTC devices. In this regard,authors in [39, 40] discussed the connection between IoT andbig data analytics, and provided a survey on the existingresearch attempts in the domain of big IoT data analytics. Also,authors in [40] provided an overview of the existing networkmethodologies suitable for real-time IoT data analytics alongwith the fundamentals of real-time IoT analytics, software plat-forms and use cases, and highlighted real-time IoT analyticsissues related to network scalability, network fault tolerance,spectral efficiency and network delay. Moreover, the article[19] presented basic features, challenges and enablers for bigdata analytics in wireless IoT networks, and discussed theimportance of collaborative cloud-edge processing for live dataanalytics along with the associated challenges and potentialenablers.

Besides, providing QoS support in ultra-dense IoT networksis challenging due to massive connectivity, heterogeneity andthe resource constraints of the MTC devices as detailed laterin Section II. While analyzing from the energy efficiencyperspective, maximizing QoS usually becomes energy costlyand higher energy efficiency can be achieved by consideringsatisfactory QoS levels. Motivated by this, authors in [41]provided a discussion on the need for QoS satisfaction andthe methods to achieve QoS satisfaction efficiently. Also, gametheory-based fully distributed algorithms were presented to en-hance the energy efficiency of IoT systems while maintaininga desired QoS threshold.

In the direction of incorporating intelligence in 5G andbeyond networks, there have been some recent attempts inapplying Artificial Intelligence (AI)-based techniques to ad-dress various issues in wireless communications. The AItechniques can provide significant benefits in achieving ef-ficient management, organization and optimization of varioussystem resources in emerging ultra-dense 5G HeterogeneousNetworks (HetNets). In this regard, the article [20] discussedthe state-of-the-art AI-based techniques for intelligent Het-Net systems by considering the objective of achieving self-configuration, self-optimization and self-healing. Furthermore,

authors in [16] recently introduced the fundamental conceptsof AI and its relation with 5G candidate technologies. Also,the challenges and opportunities for the application of AI inmanaging network resources in intelligent 5G networks werediscussed.

In the IoT/mMTC environment, learning techniques needto consider the unique features such as heterogeneity, re-source constraints and QoS requirements. In this direction,the article [26] discussed the applicability of different typesof learning techniques in the IoT scenarios by taking theirlearning performance, computational complexity and requiredinput information into account. Furthermore, authors in [42]discussed various aspects of deep Reinforcement Learning(RL) and its application in building cognitive smart citieswhile considering the use cases in the areas of water consump-tion, energy and agriculture. Moreover, the recent article [43]provided an overview of existing context-aware computingstudies along with the learning and big data related worksin the direction of intelligent IoT systems. In the context ofRL techniques, a comprehensive survey of multi-agent RL isprovided in [44] by considering the aspects of stability of thelearning dynamics of the agents and adaptation to the varyingbehavior of other learning agents. Also, the recent article [45]provided a brief survey of the existing deep RL algorithmsalong with the highlights on current research areas and theassociated challenges.

Additionally, there exist a few survey and overview papersin the area of UDNs [46–48]. The authors in [46] providedan overview of the operation of UDNs in the millimeter-waveband and presented wireless self-backhauling across multiplehops to improve the deployment flexibility. Furthermore, thearticle [47] highlighted the key issues in incorporating M2Mcommunications in the emerging UDNs and also identifieddifferent ways to support M2M communications the UDNsfrom the perspective of different protocol layers includingphysical, MAC, network and application. Moreover, authors in[48] provided a comprehensive review on the recent advancesand enabling technologies for UDNs along with a discussionon the widely-used performance metrics and modeling tech-niques.

E. Contributions

Although several existing survey/overview articles reviewedin Section I-D have considered different aspects of mMTCsystems, ML techniques and UDNs, a comprehensive analysisof the research issues involved in supporting the massivenumber of MTC devices in ultra-dense cellular IoT networksand a detailed review of the recent advances including ML-assisted solutions attempting to address these challenges aremissing in the literature. As highlighted earlier in Section I-B, there arise several challenges while incorporating MTCdevices in the existing LTE/LTE-A based cellular systems. Themain issues include QoS provisioning to heterogeneous MTCdevices, addressing random and dynamic MTC traffic, trans-mission scheduling with QoS support and RAN congestion.To this end, the overall aim of this paper is to analyze thesedifferent issues, to review the existing works attempting to

Page 5: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 5

Organization

I. Introduction

II. QoS Provisioning in

Ultra-Dense Cellular IoT

Networks

IV. Transmission

Scheduling for mMTC

with QoS Support

III. Random Access

Procedure in Cellular IoT

Networks

V. Solutions for RAN

Congestion Problem in

Cellular IoT Networks

VI. Learning-Assisted

Solutions for RAN

Congestion Problem

VII. Research Challenges

and Future Directions

VIII. Conclusions

A. Machine-Type

Communications

B. Challenges for QoS

Provisioning

A. Existing Techniques

A. Advantages of Learning

in Wireless

Communications

B. Emerging Solutions

A. RA Procedure in Legacy

LTE Systems

B. Failure of RA Procedure

and Inefficiency in mMTC

D. Traffic Characterization

and Modeling for MTC

B. Learning Techniques

for IoT/mMTC

C. Overview of Existing

ML Techniques1. Exploration

Strategies for Q-

Learning

2. Performance

Enhancement of Q-

Learning

D. Q-Learning for RACH

Congestion Problem

C. Potential Enablers for

mMTC in Cellular

Networks

C. LTE-M: Key Features

and Channel Access

Mechanisms

A. Framework for

Performance Analysis

with QoS support

B. Short Data Packet

Transmission and

Associated Issues

D. NB-IoT: Key Features

and Channel Access

Mechanisms

Fig. 1. Structure of the Paper

overcome these challenges, to highlight potential enablers, andto propose ML-assisted solutions to address various challengesin ultra-dense cellular IoT networks. In the following, wehighlight the main contributions of this survey paper.

1) The major challenges faced by the existing cellularIoT networks in supporting the massive number ofMTC devices are identified and the potential enablingtechnologies are highlighted along with the key features,traffic characterization and the application scenarios ofthe mMTC.

2) The inefficiency of the legacy LTE RA procedure insupporting MTC devices is pointed out and its adaptationfor mMTC systems is presented along with the mainfeatures and channel access mechanisms of emergingcellular IoT standards (LTE-M and NB-IoT).

3) A mathematical framework for the performance analysisof transmission scheduling with the QoS support in anmMTC system is presented, and several limitations andthe design aspects of short data packet transmission areidentified.

4) Existing solutions towards addressing the RAN conges-tion problem in cellular IoT networks are reviewed alongwith the highlights on three emerging techniques.

5) The potential benefits, challenges and promising usecase scenarios for the applications of emerging MLtechniques in ultra-dense cellular networks are identifiedand the existing ML techniques are reviewed by broadlycategorizing them into supervised, unsupervised and RLtechniques.

6) A framework for the application of low-complexity Q-learning in addressing the RACH congestion problemis presented along with different exploration strategies,and some performance enhancement techniques are sug-gested in multi-agent and dynamic wireless environ-ments.

7) Various research issues are identified and some interest-ing future directions are presented to stimulate futureresearch activities in the related domains.

Page 6: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 6

TABLE IIDEFINITIONS OF ACRONYMS

Acronyms Definitions Acronyms DefinitionsACB Access Class Barring NPSS Narrowband Primary Synchronization SignalACK Acknowledgement NSSS Narrowband Secondary Synchronization SignalAI Artificial Intelligence NPBCH Narrowband Physical Broadcast ChannelBS Base Station NRS Narrowband Reference SignalCE Coverage Enhancement NPDCCH Narrowband Physical Downlink Control ChannelCRA Coded Random Access NPDSCH Narrowband Physical Downlink Shared ChannelCS Compressive Sensing NOMA Non-Orthogonal Multiple AccessCIoT Cellular IoT OFDM Orthogonal Frequency-Division MultiplexingDCI Downlink Control Information OFDMA Orthogonal Frequency-Division Multiple AccessDNC Deterministic Network Calculus PSK Phase-Shift KeyingDQCA Distributed Queuing Collision Avoidance PDU Periodic UpdateeDRX Extended Discontinuous Reception PDCCH Physical Downlink Control ChannelEPDCCH Enhanced PDCCH PRACH Physical RACHeMBB Enhanced Mobile Broadband PUSCH Physical Uplink Shared ChannelED Event Driven RB Resource BlockFDD Frequency Division Duplex RRC Radio Resource ControlHARQ Hybrid Automatic Repeat Request RAN Random Access NetworkHetNet Heterogeneous Network RA Random AccessHTC Human-Type Communications RACH Random Access ChannelH2H Human to Human RAR Random Access ResponseIoT Internet of Things RAW Random Access WindowLSA Licensed Shared Access RL Reinforcement LearningLTE Long Term Evolution SAS Spectrum Access SystemLTE-A LTE-Advanced SCMA Sparse Code Multiple AccessMAC Medium Access Control SC-FDMA Single Carrier Frequency Division Multiple AccessMCL Maximum Coupling Loss SDN Software Defined NetworkingMDP Markov Decision Process SL Sequential LearningMTC Machine-Type Communications TA Timing AlignmentmMTC Massive MTC TDD Time Division DuplexM2M Machine-to-Machine TTI Transmission Time IntervalML Machine Learning UE User EquipmentMPRACH MTC PRACH UDN Ultra-Dense NetworkMPDCCH MTC PDCCH URLLC Ultra-Reliable and Low-latency CommunicationsMUD Multi-User Detection UFMC Universal Filtered Multi-CarrierNB-IoT Narrowband IoT QoS Quality of Service

F. Paper Organization

The remainder of this paper is organized as follows: SectionII identifies the main features, application areas and thepotential enablers for mMTC in cellular networks along witha discussion on various issues associated with QoS provi-sioning in ultra-dense IoT networks. Section III highlightsthe inefficiency of the conventional LTE RA procedure inmMTC systems along with the basics of RA procedure inlegacy LTE systems, and then present key features and channelaccess mechanisms in two emerging cellular IoT standards,namely, LTE-M and NB-IoT. Also, the characterization ofMTC traffic is presented along with the 3GPP-based MTCtraffic models and related works. Subsequently, Section IVprovides a mathematical framework for the performance anal-ysis of transmission scheduling with QoS support in themMTC systems, and also present various design aspects andlimitations of short data packet transmission in the mMTCenvironment. Section V presents a review of the existing so-lutions for RAN congestion problem in cellular IoT networksalong with some emerging solutions while Section VI presentsthe advantages and challenges of ML techniques in wirelessIoT systems and also provides an overview of the existingML techniques. Also, it provides a detailed explanation onthe Q-learning mechanism from the perspective of addressingRAN congestion minimization along with some Q-learningperformance enhancement techniques. Finally, Section VIIprovides some research challenges and future directions, andSection VIII concludes this paper. To improve the flow of thispaper, we provide the structure of the paper in Fig. 1 and thedefinitions of acronyms in Table II.

II. QOS PROVISIONING IN ULTRA-DENSE IOT NETWORKS

In this section, we first discuss various aspects related toQoS provisioning in ultra-dense IoT networks in the generalcontext, and then provide specific details related to cellular IoTscenarios. The International Mobile Telecommunication (IMT)vision for 2020 and beyond envisions to provide the connec-tion density target of about 106 devices per square kilometers[3]. However, the available radio resources and communicationinfrastructures are limited and there is a significant amountof cost involved while acquiring new radio resources such asspectrum and building new infrastructures. Because of thisissue, existing communication networks need to be made asefficient as possible to support the massive number of devices,thus leading to the concept of ultra-dense IoT networks. Dueto diverse types of emerging IoT services and heterogeneouscapabilities of MTC devices, QoS provisioning in ultra-denseIoT networks is a crucial challenge.

The overall network performance and QoS of emergingInternet protocol-based networks significantly depend on theeffective management of instantaneous traffic flowing in thenetwork. In wireless IoT networks, the instantaneous aggre-gated traffic at the IoT gateway can be bursty and may greatlyexceed the average aggregated traffic since it aggregates pe-riodic transmissions from a large number of sensor nodeswith different periods and frame sizes [12]. Due to this,there may occur congestion during possible bursty intervalsand much higher backhaul link (from aggregators to thecloud) bandwidth is needed than that required for the non-bursty traffic. In this direction, one of the important researchchallenges is how to make the aggregated traffic as close to

Page 7: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 7

TABLE IIIADVANTAGES OF ACHIEVING DESIRED QOS LEVELS INSTEAD OF MAXIMIZING QOS

Advantages Related Causes1. Reduction of energy consumption 1. Extraneous energy will be wasted while maximizing QoS.2. Better compliance with the fixed data rate services 2. No need to maximize data rates for fixed data rate services

such as video surveillance and online gaming3. A good perceived performance at the users’ end 3. End-users are usually insensitive to small changes.

in QoS levels, allowing the room for energy saving4. Better support for emerging application-oriented networks 4. Desired QoS level is required only within a specific coverage area.5. The set of feasible regions of optimization solutions are enlarged. 5. Relaxation of global optimum for the QoS maximization makes

the mathematical problems less restrictive.6. Adaptive resource allocation problem leads to the cost-effective solutions. 6. No need to waste additional radio resources by considering

the conventional optimization assumptions such as full-buffer traffic.

the average traffic as possible.

In contrast to the conventional HTC traffic, there are sev-eral unique features of MTC traffic [49] which need to beconsidered while devising transmission scheduling and trafficmanagement strategies for wireless IoT networks. The amountof small packets in IoT-type networks could become signifi-cantly large due to the resource-constrained sensor devices andthe transmission of short packets from mMTC devices [11]. Ascompared to the dominant downlink traffic in the conventionalcellular systems, uplink to downlink ratio for the MTC trafficis much higher. Furthermore, MTC devices usually havelimited power budget and the MTC traffic consists of packetswith the short payload length. Also, MTC traffic may arrive inthe batch-mode due to high density of devices and correlatedtransmission [50]. Moreover, the standard Poisson process maynot be suitable for modeling the MTC traffic since the MTCtransmissions usually exhibit spatial and temporal synchro-nism. In addition, in contrast to the conventional voice trafficwhich has a constant sampling rate for a given codec, MTCtraffic usually comprises of different packet sizes and inter-arrival patterns [51]. Different MTC applications have distinctcharacteristics and service requirements such as priority anddelay constraints, thus leading to the need of separate trafficmodelling and scheduling schemes to incorporate mMTCdevices in the current LTE-based cellular networks.

In addition, MTC devices have completely different QoSrequirements than that of the conventional HTC devices. MTCnodes are usually constrained in terms of battery power and theemployed protocols need to be as energy-efficient as possible.Although the previous research in the area of wireless sensornetwork protocols mainly focused on monitoring applicationsbased on low-rate delay-tolerant data collection, the currentIoT-based research has moved to several new applicationssuch as eHealthCare, industrial automation, military and smarthome. These applications have different QoS requirements interms of delay, throughput, priority, reliability and differenttraffic patterns such as event-driven, periodic and streaming[52]. To this end, it is crucial to consider these distinct QoSfeatures while designing transmission and access techniquesfor the MTC devices.

In the above context, several existing works deal withthe maximization of QoS while attempting to minimize theenergy consumption. Several techniques such as sleep modeoptimization, power control mechanisms, adaptation of thedata rates and learning-assisted algorithms have been em-ployed in various settings [41]. However, the objective of QoS

maximization may lead to unnecessary energy consumptionand achieving satisfactory levels of QoS may be sufficient tobalance other performance metrics of systems such as energyefficiency. In contrast to the existing works related to themaximization of QoS, the objective of achieving satisfactoryQoS levels can provide several benefits as highlighted in TableIII [41].

Moreover, in time-critical MTC applications, several re-quirements in terms of low end-to-end delay, deterministicdelay, bounds on systematic delay variations and linear delay-payload (packet size) dependence need to be considered [53].The packet arrival period in the HTC systems such as multi-media ranges from 10 ms to 40 ms whereas this may rangefrom about 10 ms to several minutes in MTC systems [54].In addition, some applications such as data reporting in thesmart grid have deterministic (hard) timing constraints andserious consequences may occur in the case of violence ofthese constraints. In this regard, multiplexing massive accesseswith these diverse QoS characteristics effectively is a crucialchallenge in ultra-dense IoT networks.

In the following subsections, we describe several aspectsof MTC systems, highlight existing challenges for QoS provi-sioning in ultra-dense cellular IoT networks, present potentialenablers for the incorporation of MTC devices in cellular IoTsystems, and then present the characterization and modelingof the MTC traffic.

A. Machine-Type Communications

MTC has got a wide variety of application areas rangingfrom industrial automation and control to environmental mon-itoring towards building an information ambient society. Themain applications of the MTC are listed below [6].

1) Industrial automation and control: This class includesseveral scenarios such as production on demand, qualitycontrol, automatic interactions among machines, opti-mization of packaging, logistics and supply chain, andinventory tracking.

2) Intelligent transportation: Under this category, MTCfinds applications in different scenarios such as logisticservices, M2M assisted driving, fleet management, e-ticketing and passenger services, smart parking andsmart car counting.

3) Smart-grid: This category include various applicationscenarios including automatic meter reading, power de-mand management, smart electricity distribution and

Page 8: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 8

patrolling, online monitoring of transmission lines andtransmission tower protection [55].

4) Smart environment: Several application scenarios in-cluding smart homes/offices/shops, smart lighting, smartindustrial plants, smart water supply, environmentalmonitoring and green environment can be enabled withMTC.

5) Security and public safety: Several applications suchas remote surveillance, personal tracking and publicinfrastructure protection can be considered under thiscategory.

6) e-Health: In this application area, various scenarios existsuch as tracking or monitoring a patient or a segment ofan organ in a patient, identification and authenticationof patients, diagnosing patient conditions and providingreal-time information on patients health related data tothe remote monitoring center.

Since MTC applications are significantly different fromtheir HTC counterparts, they have distinct QoS requirementswith different service features such as time-controlled, time-tolerant, small data transmission, low or no mobility, group-based connection, priority-based transmissions and low powerconsumption [56]. In this regard, the 3GPP has identifiedthe following 14 features for M2M communications1: (i) lowmobility, (ii) time-controlled, (iii) time-tolerant, (iv) packetswitched only, (v) mobile originated only, (vi) small datatransmission, (vii) infrequent mobile terminated, (viii) M2Mmonitoring, (ix) priority alarm message, (x) secure connection,(xi) location specific trigger, (xii) network-provided destina-tion for uplink, (xiii) infrequent transmission, and (xiv) group-based policing and addressing.

Furthermore, the 3GPP has specified various general re-quirements for the MTC systems [6, 9] in order to effectivelyoperate MTC devices and also to establish successful linkagebetween an MTC subscriber and the network operator. Someof the main technical requirements include: (i) providing acontrol mechanism to the network operators for the addi-tion/removal/restriction of individual MTC device features, (ii)exploring a peak reduction mechanism for data and signalingtraffic when a number of MTC devices concurrently attemptfor their data transmissions, (iii) yielding a mechanism torestrict downlink data traffic and also limiting access towardsa specific access point name in case of network overload, and(iv) investigating techniques to maintain efficient connectivityfor a large number of MTC devices and to lower the corre-sponding energy consumption.

In comparison to the conventional HTC, the emerging MTChas the features of infrequent transmissions and low datarates. Also, the size of signalling data packets can be muchlarger than the size of user data packets in M2M applications[37]. Furthermore, although M2M devices need to transmitsmall amounts of data, communication infrastructure may getcongested if a huge number of M2M devices attempt to accessthe network near-simultaneously [57]. Hence, the performanceof the existing cellular standards has to be evaluated for thisemerging type of traffic.

1For the description of these features, interested readers may refer to [9].

In addition, the 3GPP has identified the following per-formance objectives to support mMTC in the emerging airinterface 5G New Radio (NR) [5, 58].

1) Very high connection density of about 106 devices perkm2 in an urban environment

2) Ultra-low complexity and low-cost IoT devices/networks3) Battery life in extreme coverage beyond 10 years with

the battery life evaluated at 164 dB MCL, and a batterycapacity of 5 Wh.

4) Maximum Coupling Loss (MCL) of about 164dB for adata rate of 160 bps at the application layer

5) Latency of about 10 seconds or less on the uplink todeliver a 20-byte application layer packet (measured at164dB MCL)

B. Challenges for QoS Provisioning in Ultra-Dense IoT Net-works

There arise several challenges in incorporating MTC devicesin LTE/LTE-A based cellular networks. First, the massivenumber of devices try to access the scarce network resourcesin a short period of time and there may arise the need of eitherutilizing the available resources efficiently or allocating addi-tional bandwidth to incorporate these devices [59]. Secondly,there are significant differences in the transceiver propertiesand the applications of MTC devices from the existing LTE-based user terminals [28]. In most of the applications, MTCdevices consume low power and have intermittent low ratetransmissions. Furthermore, due to the need of cost-effectivedeployment of massive devices, MTC devices have degradedtransceiver performance and reduced coverage as comparedto the LTE user terminals. Besides, their effects in the com-munication performance of the existing LTE-A users need tobe monitored and mitigated carefully. In this regard, one ofthe important research questions is how to provide concurrentaccess to a large number of MTC devices without degradingthe QoS of the existing cellular users.

Since a network interface is fully utilized during the peaktime, the devices may not be able to send or receive data and itis crucial to optimize the peak traffic in the emerging content-centric wireless networks [60]. Furthermore, as highlightedearlier in Section II-A, the emerging MTC applications arequite different from the traditional HTC applications due tounique features such as group-based communications, time-controlled, small data transmissions, and low or no mobility[56, 61]. These distinct features of MTC applications resultin diverse QoS requirements and it is important to takethese QoS requirements into account while devising multipleaccess techniques for future cellular IoT networks. The mainperformance indicators of an mMTC system are the numberof concurrent connections to be supported, energy efficiencyand network coverage [62].

Existing cellular networks face the following major prob-lems in supporting MTC devices [9, 63, 64]. In Figure 2, wepresent the pictorial representation of these issues in the formof device-level and network-level challenges.

1) Highly dynamic traffic and random access time:The data traffic arisen from the MTC devices is highly

Page 9: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 9

Challenges of

existing cellular

networks for

mMTC

Congestion in

radio

access,core

and signalling

networks

Distributed

radio,

computing and

caching

resources

Need for

highly

scalable

network

Ultra-low

device

complexity

Dynamic

traffic and

random

access time

Diverse QoS

requirements

Need for

improving

network

coverage

Congestion in C ti i

radio

access,core

and signalling

networks

Distributed

radio,

computing and

caching

resources

Need for

highly

scalable

network

Need for

improving

network

coverage

Ultra-low

device

complexity

Dynamic

traffic and

random

access time

Diverse QoS

requirements

Device-level

Challenges

Network-level

Challenges

Small data-

packets

Low

battery

lifetime

Fig. 2. Challenges of existing cellular networks to support emerging massive machine-type communications.

dynamic in nature as compared to more predictable HTCtraffic. Furthermore, there arises a need to handle themixed traffic models with the event-driven and periodictraffics. In addition, the existing contention-based radioaccess schemes will need to coordinate random trans-missions from the massive number of devices [13].

2) Ultra-low device complexity: Due to the requirementof cheap MTC devices for mass deployment, the devicesare constrained in terms of computational and memoryresources, thus providing the limited performance.

3) Low battery lifetime: Because of the cost and spaceconstraints, MTC devices are limited in their batterycapacity. Furthermore, due to distributed nature of IoTdevices and the involved cost-issues in replacing the bat-teries, the battery lifetime of MTC devices is expectedto be more than 10 years with the battery capacity of5 Wh, thus leading to the need of investigating powersaving methods for ultra-dense cellular IoT networks.

4) Small data packet transmissions: In addition to thehuge signalling burden associated with a large numberof small packet transmissions from MTC devices, therearise other challenges such as the requirement of higherresource granularity and efficient channel coding forshort block lengths in contrast to the channel codingschemes designed for long packets in the conventionalcellular systems [14].

5) Diverse QoS requirements: MTC devices have diverseQoS requirements in terms of data rate and latencyrequirements and existing cellular technologies need toadapted to handle these features.

6) Network congestion: As highlighted earlier in Section

I-B, the incorporation of massive MTC devices in theexisting LTE/LTE-based cellular network may result incongestion in different segments of the network includ-ing RAN, the core network and the signalling network.

7) Highly scalable network: Because of the need to sup-port a significantly large number of connected devicesranging from a factor of 10× to 100× as compared tothe cellular devices, it is crucial to maintain the systemperformance with the increase in the connection density.

8) Need for improving network coverage: There arisessignificant shrinkage in the link budget due to thereduced capability of MTC devices. In order to increasethe coverage to the areas where MTC devices aredeployed (such as deep inside a building), LTE release13 targeted the coverage extension of at least 15 dB forthe MTC devices. This coverage improvement enablesthe support of the devices in the locations where theconventional cellular networks face difficulty.

9) Distributed radio, computing and caching resources:With the recent trend of migrating communicationsnetworks from the connection-oriented to the content-oriented nature, it is important to investigate syner-gies among communications, computing and cachingresources which are distributed across different devicesin ultra-dense IoT networks [19]. However, the con-ventional cellular networks based on the centralizedmanagement are sluggish in terms of network resourcemanagement and they need to evolve to deal with themanagement of distributed resources.

Towards modeling and analysis of the QoS of wirelessnetworks, one of the important mathematical tools is Deter-

Page 10: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 10

ministic Network Calculus (DNC), which is useful to calculatedelay parameters such as delay bound, backlog bound andother service quality parameters by utilizing the traffic/packetarrival and service curves [65]. This DNC tool enables thedetermined boundary analysis for the system performance andoffers a strict service guarantee by considering the worst-case scenarios. The main QoS metrics that can be evaluatedinclude delay bound and backlog bound. The metric delaybound represents the maximum between arrival and servicecurves while the backlog bound denotes the maximum verticaldeviation between these two curves.

In addition to providing high capacity to the fairly limitednumber of traditional user equipments to support high datarate services such as video streaming, the air interface of 5Gcellular network has to provide connectivity to the massivenumber of concurrent transmissions coming from the MTCdevices [66]. Also, the exchange of signalling informationneeds to be minimized both in the uplink and downlink dueto a large number of MTC devices to be supported withthe limited available radio resources. Furthermore, the RAprocedure at the device-side should be simplified as much aspossible by shifting the burden to the network side/eNodeBdue to the resource constrained and low-cost nature of MTCdevices.

Moreover, the conventional centralized approaches for con-gestion management in cellular networks are not scalableas desired by the MTC systems and also the distributedscheduling approaches can not easily acquire the knowledgeabout the network load and requirements of other applications[67]. Furthermore, the conventional congestion managementprocess is mostly a reactive process instead of the proactiveone needed for MTC devices. By deferring and shapingtransmissions at the source itself in a network and being awareof the underlying application properties, better congestionmanagement can be obtained for MTC devices [67].

Authors in [29] highlighted the difference between HTCover cellular and MTC over cellular in terms of various param-eters such as uplink, downlink, subscriber load, device typesand requirements in terms of delay, energy, signalling andcellular architecture. Furthermore, the signalling overhead inthe uplink and downlink MTC links are analyzed consideringthe SMS-type raw data size (< 248 bytes) and email-type rawdata size for smart metering and vehicular sensing applicationsvia experimental measurements. It has been shown that thesignalling overhead for the downlink control messages isconsiderably higher than for the uplink case, and highersignalling overhead occurs in vehicular applications than inthe smart metering applications.

C. Potential Enablers for mMTC in Cellular Networks

Due to some distinct transitions while going from theconventional HTC platform to the emerging mMTC such asfrom the larger packet sizes to the smaller packet sizes, fromthe downlink-focused communication scenario to the uplinkdominant, and from high data rate to low data rate transmis-sions from MTC devices, mMTC systems have new designrequirements than those of the conventional HTC systems.

Enablers for mMTC

in cellular IoT

networks

Dynamic

resource

allocation

schemes

Advanced

spectrum

sharing

techniques

Clustering

and data

aggregation

SDN and

virtualization

techniques

Advanced RA

schemes such

as NoMA, CRA

SCMA and

DQCA

Constant

envelope

based

modulation

schemes like

CPM

Flexible

waverforms

Collaborative

edge-cloud

processing

Low signalling

overhead MAC

protocols

Compressive

sensing based

MUD

Advanced

transmission

scheduling

techniques

Fig. 3. Potential enabling techniques for mMTC in cellular IoT networks.

To this end, it is important to look into the mMTC networkdesign problem from a different perspective than the traditionapproach followed for the HTC systems.

In Figure 3, we present the main enabling techniques beingconsidered to facilitate the incorporation of mMTC in theupcoming cellular IoT networks. In the following, we brieflydescribe these enabling techniques.

1) Flexible waveform design: The design of flexible wave-forms can enable the in-band mMTC channels withinthe LTE carrier [14]. In this regard, the traditionalwaveforms designed for HTC communications need tobe adapted to support mMTC while considering variousaspects such as end-to-end latency, robustness againsttime and synchronization errors, out-of-band radiations,spectral efficiency and transceiver complexity [68].

2) Dynamic resource allocation techniques: The 3GPPsuggested to allocate RACH resources dynamically toaddress the RAN congestion problem [69]. For exam-ple, by deciding the number of preambles adaptivelywithout knowing the number of devices and accessprobability, the RACH throughput can be maximized[70]. Furthermore, since this approach can dynamicallychange the size of the RACH resource pool and other re-sources, the total data collection time from the resource-constrained MTC devices can be minimized in delay-sensitive/emergency applications [71]. The two mainissues for employing this process are the requirementof estimating the number of contending devices in anRA slot and determining the preamble pool size.

3) Advanced spectrum sharing methods: Although boththe licensed and unlicensed bands can be exploitedfor mMTC applications, lack of QoS guarantees in theunlicensed band becomes highly problematic [4]. In this

Page 11: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 11

regard, emerging advanced spectrum sharing techniquessuch as Licensed Shared Access (LSA) and SpectrumAccess System (SAS) [72] could be potential solutionsfor mMTC applications since they can provide betterinterference characterization.

4) Clustering and data aggregation schemes: By group-ing MTC devices into smaller clusters based on somesuitable criteria such as geographical locations or QoSrequirements and then aggregating the individual devicedata at the MTC gateway/aggregator, the RAN conges-tion can be significantly minimized [62]. Furthermore,the investigation of energy-efficient clustering schemesfacilitates the deployment of low-power MTC devices[62].

5) Software Defined Networking (SDN) and virtualiza-tion techniques: Based on the functionalities of MTCdevices and their QoS requirements, a physical cellu-lar network can be virtualized into different networkssuch as industrial, vehicular, smart grids and emergencynetworks, with all these networks sharing the same setof radio, computing and networking resources [73]. Thedynamic sharing of resources and the reconfigurationof network elements among thus virtualized networkscan be carried out by utlizing an SDN paradigm whichdecouples the control plane from the data plane andincorporates the capability of programming in the IoTnetwork.

6) Advanced RA schemes: Several emerging RA schemessuch as Non-Orthogonal Multiple Access (NOMA),Sparse Code Multiple Access (SCMA), Coded RandomAccess (CRA) [14] and distributed queueing based ac-cess protocol [74] can be considered as the promisingenablers for the mMTC in cellular networks.

7) Constant envelope coded-modulation schemes: Dueto space/cost constraints, MTC devices need to use low-cost amplifiers which are prone to non-linearities andhardware imperfections. In this scenario, constant en-velope signals can enable the non-linear power-efficientand cost-effective operation at the MTC devices. There-fore, constant envelope coded modulation schemes suchas Continuous Phase Modulation (CPM) can be consid-ered as enablers for the mMTC [14].

8) Compressed Sensing (CS)-based Multi-User Detec-tion (MUD): The amount of collisions in the IoTaccess network can be further minimized by employingadvanced interference cancellation receivers. As an ex-ample, the CS-MUD can enhance the resource efficiencyand serve higher number of users by using the combi-nation of non-orthogonal RA and joint detection of userdata and activity [14]. In this regard, the combination ofadvanced MAC protocols with the CS-based MUD canbe utilized by exploiting the sparse joint activity in themMTC environment [75].

9) Low signalling overhead MAC protocols: One ofthe main technical challenges in an mMTC system isto reduce the amount of signalling overhead generatedby the MTC devices and the design of low-signallingoverhead protocols will facilitate the deployment of

MTC devices in cellular networks [29].10) Advanced transmission scheduling techniques: The

transmission scheduling techniques designed for cellu-lar IoT systems should be able to accommodate theMTC devices with heterogeneous QoS requirements inaddition to the legacy cellular users. In this regard,advanced scheduling techniques such as latency-awarescheduling [76], fast uplink grant [77] and learning-assisted scheduling [10] seem promising to schedule thesporadic transmissions from a huge number of MTCdevices over limited RACH resources.

11) Collaborative cloud-edge processing: Cloud comput-ing platform has very high computational and storagecapacity, and has a global view of the network butis not suitable for delay sensitive applications. On theother hand, edge-computing is suitable for applicationsdemanding low delay and high QoS but has lowercomputational resources and storage capacity. In thisregard, collaborative processing between these two plat-forms will be a promising approach to address variousissues including latency minimization [78], dynamicspectrum sharing [79], peak traffic management and dataoffloading in ultra-dense IoT networks [19].

The ML-assisted techniques, detailed later in Section VI,can address various issues related to self-configuration, self-optimization and self-healing in emerging wireless networksand seem promising in facilitating the implementation of themost of the technology enablers listed in Fig. 3 towardsenhancing the performance of mMTC systems. However, theML techniques should be as simple as possible to be appliedin the MTC devices and the investigation of low-complexityadaptive ML techniques is one of the emerging future researchdirections as highlighted later in Section VII.

D. Traffic Characterization and Modeling for mMTC Systems

The characterization and modeling of mMTC traffic iscrucial to support MTC devices in the existing cellular net-works due to various reasons specified in the following. Theincorporation of MTC devices in cellular networks may causeharmful interference to the existing cellular users and maysignificantly degrade the system performance of LTE/LTE-A based cellular systems. To this end, it is important toanalyze the impact of MTC traffic on the existing cellularusers by utilizing suitable interference modeling in realisticwireless environments [80]. Furthermore, suitable interferencemitigation, resource allocation and resource sharing schemesneed to be investigated to ensure the sufficient protection of thecellular users against harmful interference caused by the mas-sive number of MTC devices and by utilizing the given MTCtraffic models, these schemes can be designed in an efficientmanner. Moreover, since MTC traffic is uplink dominant andthe rigid QoS support framework of LTE designed for voiceand data services may not be capable of addressing specificQoS requirements of MTC traffic in terms of latency, jitter andpacket loss, suitable transmission scheduling techniques needto be investigated to support a large number of MTC deviceswhile fulfilling their specific QoS requirements. Besides, to

Page 12: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 12

investigate suitable traffic management schemes such as peaktraffic reduction in wireless IoT networks, it is essentialto understand and characterize the traffic models applicablefor a particular IoT application [13]. In addition, the trafficcharacteristics depend on the application scenarios and theMTC devices usually have heterogeneous traffic patterns interms of their amplitudes, starting times and activation periods[52].

Existing traffic models in telecommunication systems can becategorized into: (i) source traffic models, mainly applicablefor video, data and voice transmissions, and (ii) aggregatedtraffic model applicable for Internet, high-speed links andbackbone networks [81]. Since an IoT network consists of alarge number of sensors or MTC devices which are usuallycontrolled by a gateway/server, IoT traffic at the gatewayusually fits into the aggregated traffic model. In this regard,the 3GPP has defined the following two types of aggregatedMTC traffic models [69].

1) Model 1: Uniform distribution over a duration T inwhich MTC devices access the network uniformly over aperiod of time, i.e., in a non-synchronized manner. Thismodel does not take account of the correlation betweenthe transmissions of the devices.

2) Model 2: Beta distribution over T in which a largeamount of MTC devices access the network in a highlysynchronized manner. This model generates correlatedtraffic in a specific time interval.

For the first model, the following function can be consid-ered.

fi(t) =

{Pi, 0 ≤ t ≤ τi,0, τi ≤ t ≤ Ti,

(1)

where Pi denotes the amplitude of the traffic profile forthe ith device and τi/Ti represents the corresponding dutycycle. Similarly, for the second model, the probability densityfunction of the Beta distribution, is given by [69]

p(t, α, β) =1

B(α, β)t(α−1)(1− t)(β−1), (2)

where t denotes a single realization over the time axis, α >0, β > 0 are the scale parameters, and B(α, β) denotes theBeta function, which is a normalization constant to ensurethat the total probability equals to 1 and can be defined asB(α, β) =

∫ 1

0xα−1(1− x)β−1dx [13].

Considering M number of MTC devices and their activationperiods between t = 0 and t = T , the RA intensity for themth device is given by the probability distributions either in(1) or (2) depending on the employed model. The number ofarrivals in the ith access slot is given by [69]

Iaccess(i) =M

∫ ti+1

ti

p(t)dt, (3)

where ti denotes the time of the ith access opportunity andp(t) given by (1) or (2). The distribution of access attemptsshould be limited over a certain observation period T in suchway that

∫ T0p(t)dt = 1. Figure 4 illustrates the RA intensity

of two 3GPP-based MTC traffic models with T = 1.Although aggregated traffic modeling is suitable for scenar-

ios involving a large number of devices and is less complex to

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

0.5

1

1.5

2

2.5

Fig. 4. Access intensity for two 3GPP based MTC traffic models.

realize, it is less precise than the source traffic modeling sinceit is not able to capture the real traffic features at the sourcelevel [49, 81]. On the other hand, source traffic modeling treatsthe traffic for every devices separately, and hence is moreprecise. However, modeling source traffic becomes complexfor a large number of source devices. Therefore, it is crucialto investigate suitable traffic models which can combine thebenefits of both the source and aggregate traffic models. Inthis regard, coupled Markov modulated Poisson processes [81]seems a promising approach, which has higher accuracy thanthe aggregated modeling and has lower complexity than theconventional source traffic modeling.

Considering a variety of applications, the MTC traffic canbe categorized into the following three traffic patterns: (i)Periodic Update (PU), (ii) Event-Driven (ED) and (iii) payloadexchange [82]. The PU traffic has a regular pattern, constantdata size and is non-real time type (example: smart meterreading) while the ED traffic has a variable pattern, varyingdata size and is a real time traffic (example: health emergencyalarming). On the other hand, payload exchange traffic followseither of the above traffic types and it may be of constant sizeor variable size, real time or non-real time depending on theapplication scenario. In addition, there exist three main typesof traffic shaping policies [83]: (i) traffic shaping for bulkapplications where each flow is assigned a fixed bandwidth,(ii) traffic shaping for the aggregate traffic, and (iii) time-basedtraffic shaping which is applied only at the peak-time to reducecongestion and cost.

The uplink traffic generated from the sensors in most ofMTC applications is heterogeneous and can be classified into[84]: (i) non real-time with no task completion deadline, (ii)soft real-time with the decreased utility if the deadline notmet and (iii) firm real-time having zero utility if the deadlineis not met. As an example, industrial M2M traffic has very lowlatency requirements in the order of a few milliseconds [85].In general, the PU traffic is periodic with tight service deadlinewhile the ED traffic is random with all three traffic categories,i.e., non real-time, firm or soft real-time. From the schedulerdesigner perspective, it is crucial to maximize a system utilitymetric in order to maximally satisfy the delay requirements ofall the classes [84].

Page 13: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 13

Moreover, possible network applications in wireless IoTnetworks can be classified into the following [56].

1) Elastic applications: This category corresponds to moretraditional HTC applications such as electronic email,file transfer as well as the downloading of remote datafrom the MTC servers. These applications are mostlydelay tolerant in nature and the user utility usually hasdiminishing marginal improvements with the incremen-tal increase in the achievable data rate.

2) Hard real-time applications: These applications havea desired delay constraint with hard real-time require-ments. Beyond the desired time frame, there is noadditional utility gain while increasing the data rateand the user utility becomes the step function of theachievable data rate.

3) Delay adaptive applications: Some delay sensitiveapplications can occasionally tolerate a small delay witha certain delay-bound violation and the packet droppingprobability. The user utility in these applications (suchas remote monitoring of e-Health services) deterioratesrapidly when the achievable data rate becomes less thanthe required intrinsic data rate.

4) Rate-adaptive applications: These applications try toadjust their transmission rates based on the availableradio resources with the moderate delays. A highlyefficient scheduler is needed to enhance the performanceof these applications in time-varying channel conditions.

Traffic shaping, also called packet shaping, delays certaintypes of data packets in order to optimize the overall per-formance of a network. To achieve the optimized networkperformance, Internet traffic thesedays is intentionally shapedinto ON/OFF pattern [86]. Also, ON/OFF pattern is generateddue to some inherent characteristics of applications such asHTTP web browsing and MapReduce operation at the datacenter/server. The main benefits in performing ON/OFF trafficshaping include: (i) reduction of computing overhead at theserver-side, (ii) energy saving at the wireless terminals and (iii)minimizing the bandwidth waste while delivering streamingservices. However, this On/OFF traffic shaping faces severalchallenges such as the impact on packet drop probability,harmful effect on other real-time applications and weakeningthe congestion control function of the transmission controlprotocol [86]. To address these issues, it is crucial to de-sign suitable models to characterize the relation among theassociated parameters of ON/OFF traffic such as the ratioof ON/OFF duration, burst size, and burst transmission rate,and also the models for packet loss probability and temporarycongestion caused by the bursty transmissions.

Another approach to manage the peak-traffic is to employdemand-side management, which adopts suitable measures atthe customer-side/sensor-side to optimize the overall networkperformance [83]. On one hand, the demand profile can beflattened to limit the amplitude fluctuations while simultane-ously accommodating the same amount of traffic volume. Onthe other hand, the utilization profile of network resourcescan be flattened by rescheduling or delaying the services [60].Some approaches for traffic scheduling include limiting the

queue size and setting a bandwidth limit so that aggregatetraffic size in the active queue does not exceed the limit.However, scheduling techniques cause delay for processing thenetwork demand and it is crucial to investigate suitable tech-niques to minimize the delay introduced by traffic schedulingmechanisms. Furthermore, a suitable bandwidth limit shouldbe applied to balance the trade-off between latency and energysaving [83].

III. RANDOM ACCESS PROCEDURE IN CELLULAR IOTNETWORKS

An MTC device must go through the access procedure toestablish a connection to the Base Station (BS)/eNodeB/accesspoint mainly in the following situations [33]: (i) while es-tablishing an initial access to the network, (ii) while receiv-ing/transmitting new data and MTC device is not synchronizedto the network, (iii) during the transmission of new data whenno scheduling resources are allocated on the uplink controlchannel, (iv) to perform a seamless handover, (v) in order tore-connect to the network in the case of radio link failure.

The RA methods in the LTE-based cellular systems canbe categorized into contention-based (for delay-tolerant accessrequests) and contention-free (for delay-sensitive requests)schemes [87]. Out of these, the contention-based scheme is ofthe main interest here due to the limitation in the number ofavailable Resource Blocks (RBs) as compared to the massivenumber of access requests to be supported. In the contention-based RA approach, a huge number of MTC devices haveto select the same preambles because of the limitation in theavailable preambles in the existing LTE-based cellular sys-tems, and this results in significantly high number of collisionsin the access network and subsequently leads to the RANoverload or radio access congestion problem in ultra-denseIoT networks. In this direction, one important question to beanswered is how to concurrently support the massive numberof MTC devices in ultra-dense cellular IoT networks withoutaffecting the performance of the existing cellular devices byusing the current communication technologies/standards.

In the following, we briefly describe the RA procedure inthe legacy LTE systems and its inefficiency in handling themassive number of devices in the mMTC environment, thenpresent some adaptations made to support MTC devices incellular networks, and subsequently highlight the main featuresand access mechanisms in emerging cellular IoT standards,namely, LTE-M and NB-IoT.

A. RA Procedure in Legacy LTE Systems

After an eNodeB broadcasts the system information to thedevices, the contention-based RA procedure follows a four-stage message handshake procedure as depicted in Fig. 5[33, 88], which mainly involves the following four stages: (i)RA preamble transmission from the device to the eNodeB(Message 1), (ii) RA Response (RAR) from the eNodeB tothe device (Message 2), (iii) connection request message fromthe device to the eNodeB (Message 3), and (iv) connectionresolution message from the eNodeB to the device (Message4).

Page 14: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 14

In the first stage, each device randomly selects an RApreamble from the set of available preambles broadcastedby the eNodeB during the initial network synchronizationphase and sends the RA request (Message 1) by transmittingthus selected preamble in an RACH. At this stage, the UserEquipments (UEs) just transmit the selected preambles and notthe device IDs. In the second stage, the eNodeB acknowledgesthe received distinct preambles with an RA response (Message2) which includes the preamble index being acknowledged,instructions for the timing alignment and the command forthe RB allocation. Subsequently, in the third stage, the UErecognizes the RA response addressed to it by noting thepreamble index it has used for the RA request in the firststep and utilizes the dedicated RB on the Physical UplinkShared Channel (PUSCH). The devices which made currenttransmissions of the RA request with the same preamble inthe first stage will be instructed to use the same RB in the step3 and such transmissions will go through collisions. On theother hand, for the packets (which contain the correspondingdevice IDs) which are successfully decoded in step 3, theeNodeB sends a contention resolution message (Message 4)to the corresponding devices.

In this RA procedure, after sending the preamble in theRA request (Message 1), the device sets an RAR window andwaits for the eNodeB’s response with an uplink grant (Message2) in the RAR message. If the UE successfully receives itsMessage 2 within the defined RAR window, the UE sends theRadio Resource Control (RRC) connection request (Message3) to the eNodeB. At this stage, the device starts the Message4 timer and waits to receive its own RRC connection setupmessage (Message 4) from the eNodeB [89].

In the above-mentioned RA procedure, the physical-layermapping of RACHs is called Physical RACHs (PRACHs)which are time-frequency blocks specified by the eNodeB.The periodicity of RA slots is broadcasted by the eNodeBin terms of the PRACH configuration index, which variesbetween every 1 ms (i.e., a maximum of 1 RA slot per 1subframe) to 20 ms (i.e., the minimum of 1 RA slot every2 frames) [33]. The transmission scheduling in terms of timeand frequency depends on the configuration of the PRACH.For example, for the PRACH configuration index of 6, therewill be RACHs in every 5 ms within a bandwidth of 180 kHz,with a duration ranging from 1 ms to 3 ms.

B. Failure of RA Procedure and Its Inefficiency in mMTCsystems

The aforementioned RACH procedure in LTE/LTE-A basednetworks may fail due to the following reasons [89].

1) Failure of preamble transmission: This occurs eitherdue to the collision of RACH preambles (caused due toconcurrent transmission of the same set of preamblesby more than one devices) or insufficient preambletransmission power.

2) Failure of Message 2 reception: This occurs mainly dueto the lack of downlink radio resources (i.e., PDCCH)to send the RA response (Message 2) to all the receivedpreambles within the devices’ RAR windows.

Device eNodeB

System Information

Broadcast

Random Access Preamble (Message 1)

On the PRACH

Random Access Response (Message 2)

On the PDCCH

Connection Request (Message 3)

On the PUSCH

Connection Resolution (Message 4)

On the PDSCH

Connection set-up complete, service

request

4-s

tag

e m

ess

ag

e h

an

dsh

ake

RA

P

roce

du

reFig. 5. Illustration of four-stage message handshake-based RA procedurein LTE-based cellular systems (PRACH: Physical Random Access Channel,PDCCH: Physical Downlink Control Channel, PUSCH: Physical UplinkShared Channel, PDSCH: Physical Downlink Shared Channel).

3) Failure of Message 3 transmission: This occurs dueto the failure in transmitting Message 3 to the eNodeBby employing the Hybrid Automatic Repeat Request(HARQ) process at the device.

4) Failure of Message 4 reception: This occurs dueto the failure in receiving Message 4 by the deviceswithin the Message 4 expiration time while using HARQtransmission from the eNodeB either due to insufficientPDCCH resources or an imperfect channel condition.

Out of 64 preambles used in LTE-A networks, 54 preamblesare used for the contention-based access, while the remaining10 preambles are reserved for the contention-free access whichis needed for high-priority services such as handover. In every5 ms, there arises an access opportunity and 200 access oppor-tunities per second. This corresponds to the absolute maximumcapacity of 10, 800 preambles per seconds in the absence ofaccess collisions [33]. However, due to ALOHA type accessprotocol and random backoffs, performance becomes muchlower than this maximum limit in practical cellular systems.Furthermore, the situation becomes worse while supportingthe MTC devices. Besides the problem of supporting highernumber of nodes in ultra-dense IoT networks, other perfor-mance metrics such as access delay and energy consumptionare important to be considered.

To study the performance of the aforementioned 4-stagemessage handshake RA procedure in LTE systems, authorsin [90] studied the stability limit of this legacy RA procedure,which indicates the probability of failure of the RA proce-dure and the maximum achievable throughput. It has beenshown that the performance of the RA procedure deterioratesrapidly while sharing the Physical Downlink Control Channel

Page 15: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 15

(PDCCH) resources between Messages 2 and 4 with differentpriorities and the overall RA performance can be enhanced byincreasing the size of the PDCCH resource.

Furthermore, in mMTC systems, the system performancemay severely degrade in the presence of concurrent massiveaccess requests due to high probability of collision causedby the signaling and traffic load spikes since the contention-based operation of the RACH in LTE-A networks is based onALOHA-type access [91]. One of the possibilities to reducethe load of physical RACH is to increase the number of accessopportunities scheduled in a frame, however, this approach willreduce the amount of resources needed for data transmission.In this regard, it is crucial to balance the tradeoff betweenthe amount of resources available for data transmission andthe amount of access opportunities to be scheduled per framewhile designing an uplink scheduler for MTC applications bytaking into account of limited available bandwidth. Besides,the main performance metrics to be improved include accesssuccess probability, preamble collision rate, access delay anddevice energy consumption [33].

The LTE RA procedure employed in legacy LTE systemsis not efficient to support MTC devices mainly due to thefollowing main reasons [66].

1) Because of the limited number of available preamblesfor the contention-based RA procedure, the massivenumber of concurrent transmissions of the same pream-bles would cause the overload of the RA procedure bothin the uplink and the downlink and this will result in highcollision probability, access failure rate and the accessdelay.

2) To support a huge number of access requests, additionaldownlink resources need to be allocated since each RARmessage for one MTC device consists of 56 bits.

3) Even after an MTC device becomes successful in the RAprocedure, the signalling overhead degrades the overallsystem efficiency since the size of the upload datapayload from the MTC device is significantly smallerthan the traditional cellular terminals.

C. Adaptation of RA Procedure for MTC devices

It should be noted that the RACH in the RA procedure isrelated to two different channels, namely, PRACH and PDCCHas illustrated in [87]. A single PRACH consists of six physicalRBs and has a bandwidth of 1.08 MHz. Over the wholesystem bandwidth, a maximum of 6 RACHs can be deployedfor time-division multiplexing with one RACH for frequency-division multiplexing. However, while sending RAR in thedownlink channel, only a single PDCCH is responsible forhandling multiple PRACHs. Although this is not a problem inthe conventional HTC devices, this becomes a serious problemin MTC devices due to their hardware limitations in terms oftheir capacity to listen to the wideband PDCCH. In general,a low-cost MTC device consists of a single RF interfaceoperating with 1.4 MHz bandwidth. To address this issue, the3GPP has proposed Enhanced-PDCCH (EPDCCH) with thenarrow bandwidth of 1.4 MHz for low-cost MTC devices andeach PRACH has a dedicated NB EPDCCH. In this modified

RACH structure adapted for low-cost MTC devices, the RArequests from the devices are distributed across multiple NBchannels, thus reducing the congestion caused due to widebandnature of PDCCH in the conventional RA structure. Despitethis enhanced RACH structure designed for low-cost MTCdevices, the capacity of this RACH structure is not sufficientto handle the massive number of RA requests coming fromthe ever-increasing number of devices.

Towards addressing the problem of RACH overload in thecellular IoT systems, several methods have been proposedin the literature [87]. From the perspective that whether thedevice or the eNodeB employs the solution, the existingschemes can be broadly categorized into push-based and pull-based. In the first approach, the RA requests are controlledfrom the device-side while in the pull-based approach, thecontention in the RA procedure is controlled from the eNodeB.Besides, there are some strict separation schemes and softseparation schemes to concurrently support both the HTC andMTC traffic in LTE-A networks [33, 91]. The strict separa-tion schemes mainly comprise of the following: (i) resourceseparation: orthogonal allocation of resources between HTCand MTC traffic and dynamic shifting of resources amongtwo classes, (ii) slotted access methods which define accesscycles including the RA slots dedicated to the MTC deviceaccess, and (iii) pull-based scheme in which the MTC devicesare allowed to access the PRACH only upon being paged bythe corresponding eNodeB. On the other hand, soft-separationschemes include the following: (i) backoff tuning which as-signs longer back-off intervals to the MTC devices whichdo not succeed during the preamble transmission of the RAprocedure and (ii) Access Class Barring (ACB) scheme. Abrief description of various existing and emerging solutionsfor the RAN congestion problem is provided in Section V.

Most of the existing MTC related works focus on the BSload balancing, radio resource management and the groupingof MTC devices, and only a few studies have been conductedin optimizing the access control of massive requests from theMTC devices [65]. The incoming requests from the MTCdevices can be categorized into delay-sensitive and delay-tolerant based on the delay tolerance level of the underlyingapplications and the aggregator/BS can be equipped with twoqueues with one having higher priority over the other in orderto deal with the two traffic classes. The criteria used fordefining delay tolerant and delay sensitive may differ fromone scenario to another [65].

The RA delay is one of the important aspects to be consid-ered while designing RA techniques for mMTC in the existingcellular networks. In this regard, authors in [94] derived lowerbounds for the LTE-A RA delay by considering uniformlydistributed and Beta-distributed traffic arrivals and analyzedthe effect of frequency of RA opportunities and the numberof preambles. It has been shown that the RA delay can bereduced by several orders of magnitude by effectively tuningthese system parameters.

As briefly highlighted in Section I, cellular IoT standardsmainly comprise of two categories, namely, LTE-M and NB-IoT, which are described in the following subsections. Also,in Table IV, we highlight the key differences between these

Page 16: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 16

TABLE IVCOMPARISON OF TWO MAIN CELLULAR IOT (LTE-M AND NB-IOT) TECHNOLOGIES

Feature/parameter LTE-M NB-IoTChannel bandwidth 1.4 MHz 180 KHzTransmission mode HD-FDD/FDD/TDD HD-FDDPeak data rate 375 kbps (HD-FDD), 1 Mbps (TDD) 50 kbps (HD-FDD)Latency 50-100 ms 1.5-10 secondsNoise figure 9 dB (uplink), 5 dB (downlink) 5 dB (uplink), 3 dB (downlink)Maximum coupling loss (MCL) 155.7 dB 164 dBModes of operation In-band Inband, guard-band and standalonePower consumption Best at medium data rates Best at very low data ratesMobility support Full mobility No connected mobilityVoice over LTE support Yes No

two technologies optimized to provide cellular connectivity toIoT devices [95, 96]. For the detailed differences among LTE-M, NB-IoT and legacy LTE in terms of supported featuresand functionalities for different uplink and downlink physicalchannels, interested readers may refer to [31].

D. LTE-M: Key Features and Channel Access Mechanisms

While looking at the history of MTC, the first generationof a full featured MTC device emerged in 3GPP ReleaseR12. In this release R12, the 3GPP has defined the category0, i.e., CAT-0 for the low-cost MTC operation [30]. In thesubsequent releases, the efforts to incorporate mMTC devicescontinued and LTE release 13 (R13) in 2016 introduced twospecial categories, namely, CAT-M (also called LTE-M) forMTC and CAT-N for the NB-IoT to support various featuresof MTC/IoT applications. LTE Rel-14 enhancements werecompleted in June 2017, and the improvements under the Rel-15 are ongoing and are expected to be released by June 2018.

With respect to Cat-1 category which was the lowest UEcategory in LTE Release 11 from the perspective of transmis-sion capability (peak rate of 10 Mbps in the downlink and 5Mbps in the uplink), Cat-0 devices have a reduced complexityof about 50 % and have a reduced transmission rate of 1 Mbpsfor both the downlink and the uplink [31]. Also, Cat-0 categoryenables the use of only one receiver antenna with a maximumreceiver bandwidth of 20 MHz and supports FDD half-duplexoperation with relaxed switching time, eliminating the needof dual receiver chains and duplex filters for low cost MTCdevices, respectively. In the subsequent LTE releases after theintroduction of LTE-M in release 13, several new features havebeen added. The key features of LTE-M in different releasesare included in the following [58].

1) Release-13: The main features included in this re-lease include Coverage Enhancement (CE) mode A/B,bandwidth limited operations (1.4 MHz), half-duplexsupport, in-band operation mode, RRC connection, datatransmission via a control plane, mobility support andeDRX.

2) Release-14: The key features incorporated in this ver-sion include multi-cast support with single-cell point-to-multipoint, positioning enhancements such as enhanced-cell ID requirements and observed time difference of ar-rival support, larger channel PDSCH/PUSCH bandwidth(up to 5 & 20 MHz), Voice over LTE enhancements, andsupport for HARQ-ACK bundling and inter-frequencymeasurements.

3) Release-15: The main features of this release are re-duced latency and power consumption, lower UE powerclass, improved spectral efficiency, improved load con-trol of idle UEs, eDRX enhancements and support forhigher UE velocity.

The four-stage message handshake procedure followed inthe current LTE standard results in very high overhead formost of the IoT devices since the packets transmitted by theresource-constrained IoT devices are quite short as comparedto the conventional cellular packets [33]. In this regard, severalapproaches are being investigated to design efficient channelaccess mechanisms to support MTC in the existing cellularsystems. One approach investigated in the literature is tofollow ALOHA-like immediate access without any reservation[97]. Although this scheme completely eliminates the channelreservation phase and provides very low latency, the systemthroughput is limited by the slotted ALOHA capacity of 1/e.Besides, another approach is to utilize a preamble-initiatedcontention-based mechanism in which the nodes transmita randomly selected preamble to reserve a time/frequencyresource [7]. In contrast to the conventional RACH procedurefollowed in LTE, this method eliminates Message 3 andMessage 4 of the four-stage message handshake procedureand the data is transmitted on the RB specified in the RARmessage, thus significantly lowering the delay. However, if twoor more nodes choose the same set of preambles for the RArequest, the collisions occur which are detected by the lack ofAcknowledgement (ACK) message.

From the performance analysis carried out in [7], it is shownthat the preamble-initiated access achieves 86 % more capacityin comparison to both the conventional LTE access mechanismand ALOHA-like immediate transmission scheme for smalldata packet transmissions in IoT application scenarios. How-ever, in terms of delay, the ALOHA-like scheme reduces thedelay by about 62 % for low traffic loads in comparison tothe preamble initiated access and by about 77 % as comparedto the conventional LTE access mechanism.

To capture the signature of the LTE signal, an MTC devicewill need to receive the synchronization signals which occupy6 RBs of the eNodeB’s bandwidth [31]. Although decodingPDCCH becomes impossible due to the bandwidth limita-tion (only 1.4 MHz) of MTC devices, the enhanced versionEPDCCH, which uses only one RB, is a good candidate,but is not sufficient for the required coverage enhancement[98]. However, increasing its bandwidth to 6 RBS in the 1.4MHz bandwidth on the MTC device along with the repetition

Page 17: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 17

will provide the good coverage of about −14 dB and theEPDCCH also supports beamforming to enhance the coverage[28]. Therefore, 6 RBs, i.e., one narrowband is usually usedas the basic unit for the MTC bandwidth [31].

The main changes incorporated in the physical layer oper-ation of LTE-M as compared to the legacy LTE are brieflydescribed below [31].

1) Frequency hopping: Due to narrow-bandwidth and asingle receiver chain at the MTC device, the benefits dueto spatial diversity and frequency diversity are not avail-able. To compensate for the performance loss causeddue to frequency diversity, the concept of frequencyhopping, which allows MTC transmissions to hop fromone NB channel to another, is employed. The challengesassociated with frequency hopping in MTC devicesinclude the need of retuning the RF chain, and the priorknowledge of the hopping pattern at the eNodeB and thedevice.

2) Repetitions: To achieve sufficient link budget in thedownlink for the coverage enhancement, repeated copiesof the same signal are transmitted over time to boostthe link performance via time diversity. The main issueinvolved with this repeated transmission strategy is therequirement of increased decoding time, i.e., latency,demanding for longer wake-up time for the MTC device.

3) MTC Physical RACH (MPRACH) and PhysicalDownlink Control Channel (MPDCCH): To com-pensate for the additional path-loss caused due to theextended coverage for the MTC device, the PRACH oflegacy LTE needs to be modified. For this, frequencydiversity and repetitions need to applied to the MPRACHto achieve the required diversity. Furthermore, additionalfeatures such as defining downlink control formats andenhancing control channel assignment procedure need tobe added to support frequency hopping and repetitionsin MPDCCH.

4) MTC search spaces: In order to reduce the numberof decoding trials by the devices in the LTE systems,each MTC device can be assigned only a defined searchspace area of the whole control region to be monitored.In contrast to the legacy Enhanced PDCCH, there aremainly two classes of search spaces in MPDCCH,namely, device-specific search space and common searchspace.

5) MTC Downlink Control Information (DCI) formats:To reduce blind decoding iterations, i.e., device com-plexity as well as to facilitate the use of frequencyhopping, repetition and enhanced coverage, three dif-ferent DCI formats have been defined for uplink grant,downlink scheduling and paging in MTC devices.

E. Narrowband IoT: Key Features and Channel Access Mech-anisms

To address various challenges of supporting MTC devices incellular IoT networks specified in Section II-B, the 3GPP hasproposed the concept of NB-IoT in its Release 13 [64]. Themain objectives behind the NB-IoT concept include providing

better indoor coverage and support to a massive number oflow-throughput devices, with low power consumption and re-laxed delay requirements [63]. To accomplish these objectives,the NB-IoT follows the procedures of optimizing control planeand user plane of Cellular-IoT (CIoT) evolved packet systemtowards reducing the signalling overhead for small data packettransmissions [99].

For both the uplink and downlink operations, NB-IoT canoperate with an effective narrowband operation of 180 kHzbandwidth corresponding to one RB in the LTE network. Inthe downlink of an NB-IoT system, Orthogonal Frequency-Division Multiple Access (OFDMA) is employed with thesubcarrier spacing of 15 kHz over 12 sub-carriers while in theuplink, both single tone and multi-tones are supported (single-tone with the subcarrier spacing of either 3.75 kHz or 15 kHz)[100]. The NB-IoT usually can be operated in the followingthree operation modes [64, 101].

1) In-band operation: This mode of operation utilizes theRBs within an LTE carrier by reserving one RB for theNB-IoT system.

2) Guard band operation: This mode uses the unusedresources within the guard band of the LTE carrierswhile ensuring that this does not affect the normalcapacity of the LTE carrier.

3) Stand-alone operation: This mode utilizes the re-farmed GSM low band already existing in several coun-tries (700 MHz, 800 MHz, and 900 MHz) [101].

The NB-IoT technology provides greater flexibility for thedeployment of IoT devices in different applications such assmart city, smart home, smart metering and smart agricultureby reusing the existing network architectures. The main re-quirements for the NB-IoT system include the following [101].

1) Low power consumption: NB-IoT systems utilize thepower saving mode and eDRX to maximize the batterylife.

2) Low channel bandwidth: Due to low channel band-width of 200 kHz (180 kHz plus guard bands),GSM channel re-farming is applicable for NB-IoT sys-tems since a single NB-IoT channel can utilize oneGSM/GPRS channel.

3) Low cost for the end-device: Due to low channelbandwidth of 200 kHz, the front-end and digitizer ofNB-IoT receivers are much simpler than that of theexisting LTE-based systems operating on the bandwidthof 1.4 MHz, thus leading to low-complexity (cheaper)devices.

4) Low deployment cost: Besides the device cost, due tothe capability of reusing existing GSM bands, the de-ployment cost for the network operators is significantlyreduced.

5) Extended coverage: NB-IoT can provide about tentimes better coverage area compared to the legacy GPRSsystems as it can be achieve the additional 20 dB linkbudget gain.

6) Support for massive number of connections: Due toimproved coverage and low channel bandwidth, it cansupport significantly higher number of MTC devices.

Page 18: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 18

The main signals and channels involved in the downlinkof an NB-IoT system are Narrowband Primary Synchro-nization Signal (NPSS), Narrowband Secondary Synchroniza-tion Signal (NSSS), Narrowband Physical Broadcast Channel(NPBCH), Narrowband Reference Signal (NRS), NarrowbandPhysical Downlink Control Channel (NPDCCH) and Narrow-band Physical Downlink Shared Channel (NPDSCH) [102].Out of these, NPSS and NSSS are used by an NB-IoT deviceto carry out cell search procedure including cell identity de-tection, and frequency and time synchronization. The NPBCHincludes the master information block while the NRS is usedto provide phase reference required for the demodulation ofdownlink signals. Similarly, NPDCCH includes the schedulinginformation for both the uplink and downlink data channelswhile the NPDSCH carries various information such as systeminformation, paging message, RAR message and also datafrom the higher layers.

Besides, the uplink transmission scheduling of devices in theNB-IoT mainly comprises of Narrowband PRACH (NPRACH)and Narrowband PUSCH (NPUSCH) [100]. Out of these,NPRACH corresponds to the time-frequency resource used totransmit RA preambles and the NPUSCH is used for carryingthe uplink data. The differences of the above-mentioned uplinkand downlink channels from the legacy LTE systems are high-lighted in [102]. The key technique employed by an NB-IoTsystem to obtain enhanced coverage with low complexity isrepetition, which utilizes the repeated transmission of both datatransmission and the involved control signalling transmission[100].

The RA procedure in the NB-IoT system is responsiblefor establishing a radio link during the initial access, forscheduling the transmission requests and to achieve uplinksynchronization among the NB-IoT devices [102]. Threedifferent types of NPRACH resource can be configured byassigning separate repetition values for a basic RA preamble toserve the devices belonging to different coverage classes withdifferent ranges of path loss. The device estimates its coveragelevel by measuring the downlink received signal power, andthen the device transmits an RA preamble in the NPRACHresources configured for the estimated coverage level. Theconfiguration of NPRACH resources is made flexible in a time-frequency resource grid to enable the deployment of NB-IoTsystems in different scenarios.

IV. TRANSMISSION SCHEDULING FOR MTC SYSTEMSWITH QOS SUPPORT

Most of the models used to analyze the capacity of wirelesssystems are based on physical layer models and they can notcapture the link-layer QoS requirements such as bounds onthe delay [103]. Therefore, physical-layer only models are notsuitable for QoS support mechanisms such as resource reser-vation and admission control. Furthermore, in contrast to thewired links, it is challenging to guarantee QoS requirementsin wireless systems due to low reliability, multi-path fading,co-channel interference and time-varying capacities. In orderto incorporate complex QoS requirements into account, it isimportant to understand the queuing behavior of the connec-tions and to capture the QoS requirements while characterizing

the performance of packet-switching based wireless networks.For analyzing the queuing behavior, the characterization ofsource traffic as well as the services is an important aspect tobe considered [103].

Due to distinct QoS requirements of MTC devices, it is cru-cial to provide QoS support for MTC devices in future wirelessnetworks. For example, non-real time MTC applications suchas data transmissions aim to enhance the reliability of trans-mission and do not have strict delay constraint. Whereas, real-time MTC applications such as video surveillance/demand,the important QoS metrics are strict latency and data raterequirements rather than high spectral efficiency [104]. Tomeet the QoS requirements of different network applicationshighlighted in Section II-D, it is crucial to design efficientradio resource allocation algorithms for the MTC devices inthe uplink while considering the constraints on the availableradio spectrum.

One of the potential candidate platforms to support MTCdevices is the LTE-A standard and the 3GPP has been workingon several enhancements of LTE-A standard towards thisdirection. The 3GPP uses Single Carrier Frequency DivisionMultiple Access (SC-FDMA) [105] as the multiple accessscheme in the uplink of LTE cellular networks due to its mainadvantage of low Peak-to-Average Power Ratio (PAPR) ascompared to that of the OFDMA. Due to this feature, thereduced requirements on the processing power and batteryare suitable for the resource-constrained MTC devices [104].However, the allocation of RBs in the SC-FDMA becomescomplex as compared to that in the OFDMA scheme due tothe sequential transmission of the RBs in the SC-FDMA incontrast to the transmission of orthogonal RBs in the OFDMA-based systems.

The minimum resource unit used for scheduling downlinkand uplink transmissions in the LTE-A based cellular systemsis referred to as an RB. Each RB comprises of 12 sub-carrierswith each sub-carrier having the bandwidth of 180 kHz inthe frequency domain and one sub-frame of 1 ms duration inthe time domain [22]. The RB can be considered as a time-frequency resource in which an UE performs RA and eachRA slot comprises of the bandwidth equivalent to the 6 RBs,i.e., 1.08 MHz and its duration in the time domain is 1 ms.In LTE-based cellular networks, the eNodeB broadcasts theperiodicity of the RA slots by means of a variable referredto as the Physical RACH (PRACH) configuration Index [33],and subsequently, the MTC devices and the legacy cellularusers can perform RA by using the PRACH channel. Eventhough the data size from the MTC devices is significantlysmall, the massive number of devices attempt to concurrentlycommunicate over the same radio channel, thus leading to thenetwork overload problem [22]. In contrast to the conventionalHTC services such as multimedia for which the packet arrivalperiods range from 10 ms to 40 ms, the packet arrival periodsin MTC applications may range from 10 ms to several minutes[54].

A. Framework for Performance Analysis with QoS SupportIn this subsection, we present a mathematical framework

to carry out the performance analysis of mMTC systems

Page 19: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 19

with QoS support in terms of the effective capacity, effectiveSNR and the estimated number of MTC devices. For thisanalysis, we consider the uplink in a single-cell of 3GPP LTE-A networks serving multiple MTC devices with the SC-FDMAscheme. This scenario can also be studied in conjunction withthe legacy cellular/HTC users as in [106], however, herein,we deal only the case of MTC devices since we are interestedin providing QoS support for MTC devices while maximizingsome network performance metric subject to the constraintson the available radio spectrum. In practice, these devicescan be grouped based on the employed transmission protocolsand QoS requirements, and can be deployed on the clusterbasis by using different wireless technologies such as WiFi,Bluetooth and Zigbee [104]. Furthermore, MTC devices cancommunicate to the eNodeB via an MTC gateway and thetotal available RBs can be divided between the access link(MTC devices to the MTC gateway) and the backhaul link(from the MTC gateway to the eNodeB) in the time domainas considered in [106].

Let us assume that there are M number of total MTCdevices in the coverage area of the eNodeB, indexed by theset M = {1, . . . ,m, . . . ,M} and there are L number ofavailable RBs, indexed by the set L = {1, . . . , l, . . . , L}. Weassume Poisson distribution for the traffic arrival rate of theMTC devices and block fading wireless channel between MTCdevices and the eNodeB/gateway as in [104]. Also, we assumethat channel coherence time is greater than the TransmissionTime Interval (TTI) and the channel gain remains constantduring a TTI.

To incorporate QoS requirements of MTC devices into theproblem formulation, one way is to define a QoS exponent foreach MTC device and to introduce this exponent in the defi-nition of system capacity. Let θm denote the QoS exponent ofthe mth MTC device indicating a steady-state delay violationprobability of the mth M2M device. Considering a queue ofinfinite buffer size required due to a constant arrival rate λ,the delay violation probability is given by [104]

δ = Pr(dm > dmax) ≈ φm(λ)e−θmdmax, (4)

where Pr(. ) denotes the probability operation, dm representsthe delay experienced by a source packet of the mth MTCdevice, dmax is a delay bound, and φm(λ) = Pr(dm > 0)indicates the probability of non-empty buffer. In this formula-tion, the pair of (φm(λ), θm(λ)) can be used to characterizethe link from the mth device to the gateway/eNodeB.

In contrast to the conventional physical layer-based capacity,we define the effective capacity to take the link-layer QoSrequirements into account. The effective capacity [103] isdefined as the maximum constant arrival rate that a givenservice process can support to guarantee a QoS requirementspecified by θ and can be defined for the mth MTC device as

Rme (θm) = − 1

θmlnE[e−θmRm ], (5)

where Rme denotes the effective capacity for the mth MTCdevice, θm represents the statistical QoS exponent of the mthMTC device, E(. ) denotes the expectation, and Rm is thedata rate of the mth MTC device. In order to guarantee a QoS

requirement of θm for the mth MTC device, the followingcondition should be satisfied [104]

Rme (θm) ≥ λm, (6)

where λm is the traffic arrival rate for the mth MTC device.By solving (6), one can obtain θm. Subsequently, by usingthe Shannon’s capacity formula, the maximum achievabletransmission rate for the mth MTC device can be expressedas

Rm = Blog2(1 + γm) = Blog2

(1 +

Pm|hm|2

σ2n

), (7)

where B is the bandwidth of each RB, Pm is transmissionpower of the mth MTC device, |hm|2 is the channel gain, σ2

n

is the Additive White Gaussian Noise (AWGN) power, andγm = Pm|hm|2

σ2n

is the SNR for the mth MTC device.Although the value of θm can be derived by finding the

probability density function of λm corresponding to (7) andby subsequently solving (6), the evaluation process is quitecomplex. Another approach is to use an intuition from [103]and to obtain the QoS exponent θ(λ) in the following way

θ(λ) =φ(λ)λ

λτs(λ) + E[Q(t)], (8)

where Q(t) is the length of a queue at time t and τs denotes thethe average remaining service time of a packet being served.The detailed methodology to estimate the value of θ(λ) in (8)for the considered MTC scenario has been illustrated in [104].

Despite the significant benefits of SC-FDMA in terms ofpower and battery requirements, there arise some restrictionsfor uplink resource allocation (RB and power allocations)while employing SC-FDMA in the uplink [105]. The mainaspects to be considered include: (i) a single RB can only beallocated to at most one user, (ii) multiple RBs allocated to asingle user should be adjacent, and (iii) the transmit power onall the RBs allocated to a user should be equal. Let us assumethat the set of RBs Lm is allocated to the mth MTC device inthe current TTI, then the achievable rate (upper bound) from(7) in terms of effective SNR can be written as

Rm = B.Lmlog2(1 + γeff,m), (9)

where Lm = |Lm| denotes the cardinality of the set Lm andγeff,m denotes the effective SNR for the mth MTC device.Since each data symbol is spread over the whole bandwidth inSC-FDMA transmission, the effective SNR can be computedas an average of SNRs over the allocated set of RBs to aparticular MTC device as follows

γeff,m =1

Lm

∑l∈Lm

γm,l, (10)

where γm,l is the SNR of the mth device for the lth RB.Another aspect to be considered is how to effectively design

the medium access scheme to support the massive numberof devices. One approach is to determine the optimal size ofthe Random Access Window (RAW) based on the estimatednumber of MTC devices in the following way [107]. If thereare idle slots available at the RAW, the eNodeB/access pointcan estimate the number of devices for the uplink access

Page 20: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 20

by using suitable estimation techniques such as maximumlikelihood estimation method. Let I be the measured numberof idle slots in the uplink RAW, LUL be the number of slotsof the uplink RAW and NUL be the number of devices forthe uplink access. When NUL devices contend in LUL, theprobability of selecting a slot by a device for the uplink accessbecomes 1

LULand the corresponding complementary probabil-

ity is (1− 1LUL

). Thus, the idle probability that no devices forthe uplink access transmit the power save poll message, bywhich the device requests for the downlink data or the ACKframe from the eNodeB, is pidle = (1 − 1/LUL)

NUL and theprobability pidle is estimated as: p̂idle = I

LUL. Subsequently,

the estimated number of devices for the uplink access byutilizing the aforementioned idle probability can be calculatedas

NUL =log(p̂idle)

log(1− 1LUL

). (11)

On the other hand, the existing packet schedulers are mainlydesigned for a specific wireless system such as LTE and do notfully capture the heterogeneous characteristics of ultra-denseIoT networks. In this regard, authors in [84] proposed delay-efficient joint packet scheduling and subcarrier assignmentby considering the classification of the uplink MTC trafficaggregated at the MTC aggregator into multiple classes basedon traffic features such as packet size, arrival rate and delay re-quirements. By employing an MTC specific traffic model, theincoming data from the sensors at the aggregator is categorizedeither as ED or PU types and the delay requirements of thesePU and ED traffic types are mapped onto sigmoidal and steputility functions, respectively. In addition, in order to ensurethat the packets transmitted by an MTC device are withinthe delay budget, authors in [108] introduced a new MACelement, called Packet Age, with which the device informsthe scheduler about the waiting time of the oldest packet inthe device buffer along with the buffer size specified in thebuffer status report.

B. Short Data Packet Transmission and Associated Issues

In this subsection, we briefly discuss various issues relatedto short data packet transmission in IoT systems along withits information theoretic perspective. One of the emergingareas in the MTC systems is ultra-reliable and low-latencycommunications, also known as mission-critical MTC. Someof the applications of mission-critical MTC are industrialcontrol, intelligent transportation systems and smart grids forpower distribution automation [109]. As an example, industrialcontrol applications may need to transmit about 100 bits within100 µseconds with 10−9 PER [110].

Existing wireless systems are designed to support the con-ventional HTC traffic having long packet sizes and each packetconsists of information payload and the control information(metadata) which usually contains various information aboutlogical addresses, packet initiation and termination, synchro-nization and security. As compared to the transmission of longpackets, the transmission of short packets in the wireless IoTsystems differs mainly in the following two ways [11]. First,existing transmission techniques are based on the assumption

that the metadata is negligible as compared to the size ofthe information payload. However, this assumption does notapply to the transmission of short packets since the metadatasize becomes no longer negligible, resulting in the need ofhighly efficient encoding schemes. Secondly, for the case oflong packets, there exist channel codes which enable thereconstruction of information payload with high probability.The thermal noise and channel distortions average out for thecase of long packets due to the law of large numbers, however,this averaging does not occur for the case of short packetsand the classical law of large numbers is not applicable formMTC applications, resulting in the need of new informationtheoretic principles. In this regard, authors in [11] discussedvarious information theoretic approaches to characterize thetransmission of short packets in wireless communication sys-tems and applied these principles on the transmission of shortpackets in various channels such as a two-way channel, adownlink broadcast channel and the uplink RACH. In addition,authors in [111] investigated the tradeoffs among reliability,throughput, and latency for the transmission of informationover multiple-antenna Rayleigh block-fading channels.

In the above context, several recent works have investigateddifferent physical layer approaches to support small packettransmissions in mMTC/IoT environment by considering theirspecific characteristics, which are briefly reviewed in the fol-lowing paragraphs. Also, in Table V, we list the recent researchworks towards supporting small data packet transmission inwireless networks with their main themes and applicablesystems.

In short data packet transmissions, one effective way ofenhancing the packet transmission efficiency is to optimize thepilot overhead [112]. However, most of the existing pilot over-head optimization works considering the objective of ergodicchannel capacity maximization are based on the assumption ofsufficiently large packet length resulting in small packet errorprobability, which is not suitable for short-packet transmission.In this regard, authors in [112] formulated the optimization ofapproximate achievable rate as a function of block length, pilotlength and error probability, and illustrated the importance ofconsidering packet size and error probability while optimizingpilot overhead via numerical results.

Another potential enabling approach to support short packettransmissions in IoT/mMTC environments is to design suitabletransmit waveforms. In the IoT environment, there are someapplications with very small packet sizes such as the datatransmitted from sensors like temperature sensors while someother applications such as car to car and car to infrastructurecommunications demand very fast response time. In order tosupport these diverse set of applications, the 5G and beyondair interface should be able to support transmissions with verysmall air interface latency enabled by very short transmissionframes [113]. In this regard, it is important to investigatesuitable waveforms for supporting diverse applications in anIoT environment. Among potential multi-carrier waveformcontenders such as filtered Cyclic Prefix-OFDM, Filter bankmulti-carrier and Universal Filtered Multi-Carrier (UFMC),authors in [113] concluded UFMC as the best choice forIoT systems with short burst transmissions due to its several

Page 21: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 21

TABLE VRECENT RESEARCH WORKS TOWARDS SUPPORTING SMALL DATA PACKET TRANSMISSIONS IN WIRELESS NETWORKS

Main Theme Applicable systems ReferencesOptimization of pilot overhead IoT Sensor networks [112]Design of air interface and waveforms Multicarrier 5G systems [113]Coding and modulation schemes IoT Sensor networks [114]Non-orthogonal multiple access MTC scenarios [115]Minimization of the core network signalling 5G cellular network [116]Joint encoding of grouped messages Wireless broadcast channel [117]Autonomous transmission mode Delay tolerant IoT/MTC scenarios [118]Receiver algorithms to enhance the reception quality 5G wireless networks [119]Energy and information outage performance analysis Wireless powered network [120]Exploitation of frequency diversity to enhance reliability Tactile Internet [121]

benefits in terms of supporting fast Time Division Duplex(TDD) switching, low latency modes, low energy consumptionand small packet transmission.

In addition, investigating suitable coding and modulationschemes is crucial to achieve high energy efficiency for short-packet transmissions having low-duty cycles. Due to lowerduty cycle, time synchronization and phase coherency forshort-packet transmissions become non-trivial. Furthermore,because of short packet length, a large coding can not beachieved as in the conventional voice or data networks. More-over, the overhead required to maintain time synchronizationand phase coherency becomes significantly large while usingthe conventional coherent modulation schemes [114]. In thisregard, the time synchronization overhead can be reducedby employing either non-coherent modulation/demodulationschemes such as Phase-Shift Keying (PSK) with differentialencoding or orthogonal modulations. To this end, authors in[114] analyzed the tradeoff between energy efficiency andbandwidth in non-coherent short packet transmission systems.

Towards addressing the problem of scalability and efficientconnectivity to the massive number of MTC devices with shortpackets, the NOMA scheme is considered as one candidatemultiple access solution [115]. Due to its benefit of improvingfairness and spectral efficiency for low-latency transmissionwith respect to the orthogonal multiple access technique, itis considered promising for IoT applications. In this regard,authors in [115] analyzed a trade-off among the transmissionrate, transmission delay (in terms of block-length) and de-coding error probability by considering a two user downlinkNoMA system with finite block-length constraints.

Furthermore, towards minimizing the signalling overheadfor small data packet transmissions, authors in [116] proposeda framework based on 5G RAN controlled user-centric mobil-ity, in which an anchor node is allocated and updated for eachend-device and it maintains the connection of the device to thecore network within its coverage area. In this approach, an usercentric area is dynamically allocated so that an user/device canmove freely and communicate with the network without anystate transitions signaling required in the existing connectionmanagement schemes with RRC protocols [116].

Moreover, while implementing multiple-antenna based in-terference suppression in IoT systems with small data packetstructures, the insufficient training period may result in severedegradation in the estimation of the desired channel andinterference covariance matrix. The main challenge here isto obtain the reliable channel estimation without significantly

affecting the data transmission duration, i.e., to balance thetrade-off between the pilot training period and the data trans-mission period. In this regard, authors in [119] investigatedan efficient receiver structure which can exploit informationreceived during the data transmission period to enhance thereception quality for the short packet transmissions. Moreover,in the context of energy harvesting networks, authors in [120]provided a comprehensive analysis of the backscatter wirelesspowered communication with sporadic short data packets byusing a stochastic geometry framework.

V. SOLUTIONS FOR RAN CONGESTION PROBLEM INCELLULAR IOT NETWORKS

In this section, we first review several existing techniquestowards addressing RAN congestion problem in cellular IoTnetworks, and then discuss some emerging solutions.

A. Existing Techniques

Towards addressing the RAN congestion problem in LTE-based cellular networks, 3GPP has specified the following sixdifferent solutions of LTE RA congestion [69]:: (i) ACB, (ii)MTC-Specific backoff, (iii) dynamic resource allocation, (iv)Slotted random access, (v) separate RA resources and (vi) pull-based RA. In the following, we briefly describe the principlesof these techniques along with other related solutions in theliterature [35, 69, 122]. Also, in Table VI, we provide the listof these schemes along with their main principles and thecorresponding references.

1) Back-off based scheme: In this scheme, the devicesretransmit after a backoff time if they encounter acollision. This scheme can enhance the network per-formance under a low congestion level, however, be-comes problematic in high-level congestions [123] Thisis the conventional approach followed in contention-based wireless networks and 3GPP has suggested severalimprovements to solve the RAN overload problem. Tosupport MTC devices in the existing cellular networks,3GPP has suggested the use of MTC-specific backoffscheme in which MTC devices are subject to a largerbackoff interval than the HTC devices [69].

2) Access Class Barring (ACB) scheme: This schemeclassifies the contending devices into multiple accessclasses with different access probabilities and each classis assigned to an ACB parameter and an access barringtimer [69]. The working principle of the ACB schemecan be summarized in the following way. First, the BS

Page 22: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 22

TABLE VISUMMARY OF EXISTING SOLUTIONS FOR RAN CONGESTION PROBLEM IN CELLULAR IOT NETWORKS

RA Scheme Main principle ReferencesBack-off based scheme In the occurrence of collisions, devices retransmit after an MTC-specific backoff period. [69, 123]Access Class Barring (ACB) Multiple access classes of devices are assigned different access probabilities. [91]Extended Access Barrier (EAB) A certain access class of devices is barred from the channel access. [124]Cooperative ACB scheme ACB parameter is designed by many BSs in a collaborative way. [92]Dynamic ACB ACB parameter is updated dynamically based on previous collisions. [93]Prioritized RA with dynamic ACB Utilizes class-dependent back-offs and dynamic ACB. [125, 126]Dynamic resource allocation Congestion level is predicted at the BS and additional RACH resources are allocated dynamically. [35]Slotted random access Each device is assigned a dedicated RA slot and is allowed to perform RA only in that slot. [69]Separation of RA Resources Available preambles or RA slots are divided between MTC and HTC devices. [35, 69]Pull-based/Paging-based Devices perform RA attempts only after receiving paging messages from the BS. [69]Group-based RA RACH resources are allocated on the basis of groups formed based on some defined criterion. [127, 128]Code-Expanded RA RA codewords are generated and each device utilizes a set of preambles in each RA slot. [129]

broadcasts the ACB parameter, i.e., 0 ≤ p ≤ 1 to theMTC devices and each MTC device trying to connectto the BS generates a random number 0 ≤ r ≤ 1uniformly. Then, the MTC device is allowed to start theRA procedure if r < p and otherwise, the access to thatparticular device is barred and the device has to wait fora random backoff time determined based on the barringduration of that class. Therefore, by controlling the ACBparameter p, the BS can control the stabilization of RAto optimize some network performance metrics such asthroughput [91].However, in the presence of severe congestion causedby the presence of massive number of IoT devices,the value of p may be set to be extremely low, thusleading to the intolerable delay. Also, the ACB schemeis not suitable for event-driven applications in which thecontention may arise within a short time duration [33].Furthermore, the operating parameters such as transmis-sion probability should be adjusted based on the networkstatus and estimating the number of devices/networkstatus becomes challenging due to highly bursty trafficin event-driven MTC communications [123].To address the above drawbacks of the ACB scheme,there have been some attempts in the literature. Someof the important ones include the following.

a) Extended Access Barrier (EAB) scheme [124]: Inthis scheme, devices belonging to a certain accessclass are barred from the channel access to providesome form of service differentiation [124]. Theoperation of this scheme depends mainly on thefollowing two factors: (i) the sets of barred accessclasses and (ii) the time of turning EAB on oroff. The larger the set of barred access classes,the higher will be the access success probabilitywhich comes at the cost of increased mean accessdelay. Furthermore, the timing of turning EAB onor off relies on the input network load which isproportional to the number of devices concurrentlyaccessing the network.

b) Cooperative ACB scheme [92]: In this scheme,ACB parameters are determined across the networkjointly by many BSs interconnected via the X2

interface [92] rather than individually calculated ateach BS. This scheme aims to balance the traffic

load among the BSs in a heterogeneous multi-tier cellular network with the objective of reducingthe congestion level and also improving the accessdelay.

c) Dynamic ACB scheme [93]: In this approach, theACB parameters are updated dynamically based onthe information about the number of collisions inthe previous time slots.

d) Prioritized RA with dynamic ACB [125, 126]:This scheme pre-allocates the RACH resourcesfor different classes of MTC devices with class-dependent backoff procedures and reduces thenumber of concurrent requests for the RACH byemploying the dynamic ACB method.

3) Dynamic Resource Allocation: In this scheme, the BSpredicts the congestion level of the access network over-load caused due to MTC devices and allocates additionalRACH resources dynamically in the time domain orfrequency domain or both for the MTC devices [35,69]. However, the allocation of more radio resources forRACH will reduce the radio resources available for thetraffic channels and this trade-off needs to consideredwhile implementing this solution.

4) Slotted Random Access: In this method, a dedicatedRA opportunity is provided to each MTC device and isallowed to perform RA only in the access slot allocatedto it [69]. However, in ultra-dense IoT scenarios, thismethod will result in very high access delay since theduration for each RA cycle will be significantly large.

5) Separation of RA Resources: In this approach, differentRACHs are allocated to MTC devices and HTC devicesto avoid the impact of RA congestion on HTC devices.The separation of RA resources can be done either bysplitting the available preambles into MTC and HTCsubsets or by allocating different RA slots for MTC andHTC devices [35, 69].

6) Pull-based/Paging-based scheme: All the schemes de-scribed above fall under the category of push-basedapproach in which RA attempts are done randomly bythe devices. However, in the pull-based method, thedevices perform RA attempts only after receiving pagingmessages from the BS. To reduce the number of pagingload in this approach, a number of MTC devices canbe paged together by following a group paging method

Page 23: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 23

[69].7) Group-based RA Scheme [127, 128]: The MTC devices

can be grouped based on some criterion such as havingsimilar QoS/delay requirements and being deployed in aspecific geographical region, and RACH resources canbe allocated on the group-basis to reduce the accessnetwork congestion. In a group-based RA scheme pro-posed in [127], the devices within one paging group arepartitioned into different access groups based on somecriterion and only one device within each access group,called group delegate/header, is made responsible forcommunicating with the BS. The group delegate can bedecided by the BS based on some suitable metrics suchas transmission power and channel conditions.Another grouping approach is to divide the cell coveragearea into a different spatial groups and to enable theuse of same preambles at the same RA slot by theMTC devices located in different groups if the minimumdistance of these MTC devices is larger than the multi-path delay spread [87, 130]. While sending the RARmessage, the BS sends distinct RARs to all the detecteddevices having different Timing Alignment (TA) valueseven if they use the same preamble during the RApreamble transmission phase.

8) Code-Expanded RA Scheme [129]: In this approach,the contention space is expanded to the code domainby creating the RA codewords. While initiating an RAattempt, each device sends a set of preambles over thegiven RA slots instead of transmitting only a singlepreamble at any random RA slot, thus creating a setof preambles in each RA slot.

9) Tree-based RA scheme [131, 132]: This category ofRA schemes utilizes the tree-based algorithms such asq-ary tree splitting technique [133], which rely on theutilization of feedback obtained after each contentionattempt [132]. This RA scheme is mostly used to ad-dress the contention problem caused due to synchro-nized arrivals of the traffic from a large number ofMTC devices. Furthermore, the combination of collisionavoidance techniques such as access barring can be usedin combination with tree-based collision resolution inorder to form a hybrid RA scheme [131].

B. Emerging Solutions

In the following, we provide some of the emerging researchdirections to address RAN congestion problem in wireless IoTnetworks.

1) Learning-based Techniques: Recently, learning-basedtechniques have received important attention in addressingthe RAN congestion problem in cellular IoT networks. Inthis direction, an RL scheme has been applied in [22] forthe selection of an appropriate BS for the MTC deviceswith the objective of avoiding access network congestion andminimizing the packet delay. In addition, a Q-learning basedaccess scheme has been studied in [10] to support MTCtraffic in the existing cellular networks. In this Q-learningbased approach, MTC devices learn to avoid collisions among

each other without involving a central entity and after thelearning convergence, each MTC device gets a unique RACHslot. Furthermore, authors in [23] applied a Q-learning basedunsupervised algorithm in order to select an appropriate BSfor MTC devices on the basis of QoS parameters in dynamicnetwork traffic conditions. Moreover, a hierarchical stochasticlearning algorithm has been applied in [134] to enable eachdevice to make the access decision with the assistance ofcommon control information broadcasted from the BS. Inaddition, in [21], a Q-learning algorithm has been applied todynamically adjust the value of a barring factor to be allocatedto the MTC device in the ACB scheme.

2) Distributed Queueing: The existing approaches to en-hance the RACH performance are mainly based on theALOHA-type mechanisms which suffer from some level ofinefficiency, instability and uncertainty in the outcome of theaccess opportunities to be assigned to the devices [33]. Inthis regard, one promising approach is Distributed QueuingCollision Avoidance (DQCA) [135] which is a distributed andalways-stable high performance protocol. This MAC protocolbehaves as an RA mechanism for low traffic load and switchesautomatically and smoothly to a reservation scheme whenthe traffic volume increases [135, 136]. More specifically, theDCQA protocol utilizes two distributed queues which operatein parallel [136]. The first queue, called collision resolutionqueue, deals with the resolution of access-request signal col-lisions, while the other queue, called data transmission queue,helps to manage the data transmission. The main features ofthis protocol are the following [135].

1) It can eliminate back-off periods and avoid collisions indata packet transmissions.

2) Its performance is independent of the number of trans-mitting nodes.

3) It is stable independently of the traffic conditions.4) As compared to other centralized or distributed MAC, it

utilizes very few bits for signaling operation purposes.Furthermore, authors in [74] proposed a distributed queuing-

based access protocol for LTE with the objective of improvingthe RA performance for MTC systems without altering theexisting frame structure of LTE systems. The original versionof distributed queueing protocol envisions orthogonal mini-slots as access opportunities and its implementation requiresa change in the LTE frame structure since the preambles inLTE are not orthogonal in time domain. To address this issue,the authors in [74] considered the distribution of allocatedpreambles for MTC devices among Ng virtual groups, witheach virtual group having Np number of preambles and eachpreamble being equivalent to one mini-slot considered in theoriginal distributed queueing protocol.

3) SDN and Virtualization for RAN Management: To sup-port differentiated MTC services with diverse QoS require-ments, a physical wireless network can be abstracted and slicedinto multiple virtual networks by employing suitable networkfunction virtualization techniques [73]. On the other hand,Software Defined Networking (SDN) enables the separationof a data plane and a control plane, and provides the capa-bility of programming a network via a centralized controller.Due to the global view of the underlying network, an SDN

Page 24: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 24

controller enables the efficient management of radio resourcesin dynamic network traffic and channel conditions. In cellularIoT networks, a hypervisor can divide the physical networkinto different IoT networks based on device classes and func-tionalities, and the SDN controller can dynamically allocatethe available radio resources among these virtual networks tomeet the QoS requirements of different IoT networks. Amongthese virtual networks, each MTC device can select one ofthe virtual networks to access to the physical network whilemeeting its connection requirements [137]. In addition to radioresources, it is also possible to enhance the utilization of othernetwork resources such as computing, caching and networkingresources [73].

VI. LEARNING-ASSISTED SOLUTIONS FOR RANCONGESTION PROBLEM IN CELLULAR IOT NETWORKS

A. Advantages of Learning Techniques in Wireless Communi-cations

The main questions this section attempts to answer are whylearning techniques are important in wireless communicationsystems, which parameters to learn and for what purposes.First, we present the main advantages of learning techniquesin wireless communications systems in general, and thendiscuss why learning techniques are needed on the top ofthe conventional link adaptation techniques. Subsequently, wediscuss various parameters which can be learnt by usinglearning techniques in different application scenarios.

As highlighted earlier in Section I-C, the number of config-urable system parameters has increased significantly from onecellular generation to the next one. For example, the numberof configurable parameters has increased to about 1500 in a4G node from about 500 in a 2G node and from about 1000in a 3G node, and it is predicted to be around 2000 in a 5Gnode [15, 16]. In this regard, the process of optimizing thesereconfigurable parameters in 5G and beyond systems becomesextremely complex and performing self-configuration, self-optimization and self-healing operations will be challenging.Also, emerging ultra-dense networks will need to observeenvironmental variations, learn uncertainties, plan responseactions and configure the network parameters effectively tohandle these operations. To this end, emerging ML techniquescould bring potential benefits in efficient handling of these op-erations. The main role of learning techniques include learningthe system variations/parameter uncertainties, classifying theinvolved cases/issues, predicting the future results/challengesand investigating potential solutions/actions [16].

Wireless systems utilize link adaptation techniques to adaptthe physical layer parameters such as modulation and codingscheme based on the reliability/quality of the communicationlink. In practice, different applications such as wireless videobroadcasting and VoIP demand for different reliability con-straints. Before performing this link adaptation, the reliabilityof a wireless link in the form of some metrics such as PERis predicted for each set of physical layer parameters, needsto be predicted [17]. During the link adaptation process, therearises a tradeoff between data rate and reliability since thePER calculated for a set of physical layer parameters is in

general inversely related to the data rate. In order to predict thereliability, existing systems form explicit input/output modelsof a wireless channel and then analyze the performance ofphysical layer for each set of the parameters.

However, due to the increasing trend of using multiple an-tennas, wideband signals and a number of advanced signal pro-cessing algorithms, the above-mentioned reliability predictionprocess becomes extremely complex [17] and the prediction ofPER with good accuracy becomes difficult in practice [145].Furthermore, due to a significantly large number of environ-mental parameters such as channel state information, signalpower, noise variance, non-Gaussian noise effect, transceiverhardware impairments such as power amplifier non-linearityand quantization error, it becomes challenging to provide thenear-optimal/optimal tuning of the transmission parameters toachieve the efficient link adaptation [18]. The severity of thisproblem greatly increases in ultra-dense networks due to theinvolvement of various agents and system parameters suchas Signal to Interference plus Noise Ratio (SINR) mismatchin ultra-dense small cell networks [146], and therefore, thelink adaptation in emerging ultra-dense networks becomes ex-tremely challenging. Also, the existing link adaptation systemsare localized to individual links and small coverage areas, anddo not take into account of the consequences on other systemsfrom the system-level perspective.

In order to make the link/system adaptation more flexibleand efficient, existing works have applied ML techniques indifferent settings [17, 18, 141–143]. The contribution in [17]investigated an online learning framework for the link adapta-tion by using a modified k nearest neighbor (kNN) algorithmto learn the mappings between the channel conditions andPER values for all possible Modulation and Coding Schemes(MCS) supported by the system. In this online learning frame-work, when a new packet is delivered, the predicted PERassociated with each MCS is calculated by using the kNNalgorithm and the best MCS is selected. After a packet istransmitted, the packet is stored as a prior data of the selectedMCS for future prediction purpose. Although the accuracy ofPER prediction becomes more accurate with the increase inthe number of packet transmissions, this kNN-based approachrequires to store all the previous samples and has higher timecomplexity, not suitable for a real time operation [17]. Inthis regard, the authors in [18] proposed an online kernelizedsupport vector regression method which can work with theminimal memory size and has low computational complexitywhile providing a comparable performance to those of theexisting algorithms.

Learning techniques are expected to provide significantbenefits by adaptively learning numerous parameters in variousapplication scenarios as listed below. Also, we provide a briefsummary of these use cases along with the correspondingreferences in Table VII.

1) Learning to exploit an unique RA slot for each MTCdevice within the considered transmission frame in away that concurrent transmissions in the same RACHopportunity can be avoided [10].

2) Learning to adapt an access control parameter, i.e.,access barring factor for the RACH congestion [21].

Page 25: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 25

TABLE VIIUSE CASES FOR THE APPLICATIONS OF LEARNING TECHNIQUES IN ULTRA-DENSE CELLULAR SYSTEMS

Use Case Sub-cases/related description ReferenceRACH congestion minimization Learning to find the unique access slot for each MTC device [10]

Learning to adapt an ACB parameter [21]Learning to associate MTC devices with the best eNodeBs [22, 23]

Autonomous adaptive resource allocation Learning to find the existence of the critical delay-sensitive messages [24]Dynamic spectrum sharing Learning to predict the occupancy status of radio channels [138]Learning-assisted edge-side processing Extracting the useful information from the raw sensor data [25]

at the edge devices to reduce the communication burden [25]Selection of a suitable RAT Learning to select a suitable RAT among different RAT technologies [139]

under network conditions and user preferences constraintsNetwork traffic control Learning to control network traffic to enhance computational efficiency and scalability [140]Adaptation of transmission parameters Learning link quality/reliability to adapt parameters such as MCS and transmission slot [17, 18, 141–143]Data analytics To extract user activity/mobility patterns, temporal, spatial and social correlations [42]Provisioning of personalized services To learn network contexts for a global view of communications, computing [144]

and caching resources

3) Learning to associate MTC devices with suitableBSs/eNodeBs with the objective minimizing overall ac-cess network congestion [22, 23].

4) Learning the existence of delay sensitive/critical mes-sages by IoT devices in heterogeneous ultra-dense IoTnetworks so that enough resources can be dynamicallyallocated and critical information can be successfullytransmitted to the BS/eNodeB/aggregator as soon as theyare generated [24]. Based on the learned informationabout critical messages, IoT devices can collectivelyadjust their uplink transmission parameters such asorthogonal codes, transmission slot period, periodicityof transmission and the received power for performingautonomous resource allocation and the coordination forthe usage of available codes.

5) Learning the radio spectrum by dynamic spectrum shar-ing among the systems/nodes in a collaborative mannerto predict the occupancy status of radio channels [138].

6) Learning the relationship of the contextual information(related to the surrounding radio environment) collectedfrom IoT sensors to extract knowledge and to pre-dict the future context at the edge devices [25]. In-stead of transferring all the raw contextual data to thenetwork/cloud-centre, only the inferred knowledge canbe transferred, thus reducing the communication burden.This approach deals with pushing learning intelligencefrom the network to the distributed edge devices havingheterogeneous computing abilities.

7) Learning for the selection of Radio Access Technology(RAT) while performing vertical handovers among het-erogeneous networks having different RAT technologiesunder the constraints of network conditions and userpreferences, which can be designed on the device-side,network side or in a hybrid manner [139].

8) Learning for network traffic control (such as routing)in ultra-dense heterogeneous networks to alleviate theissues of computational efficiency and scalability of theexisting approaches [140].

9) Learning link quality/reliability to adapt the transmissionparameters (such as MCS, transmission slot, receivedpower etc.) of a wireless link. [17, 18, 141–143]

10) Learning to extract user activity/mobility patterns,temporal, spatial and social correlations from raw

unstructured/semi-structured data coming from massivenumber of sensors.

B. Learning Techniques for IoT/MTC Systems

The main challenges of applying learning techniques in anIoT environment include the following [26].

1) MTC devices have low computational capability, how-ever, the widely-used ML techniques such as RL anddecision trees can be computationally complex to im-plement.

2) Because of the distributed nature of IoT devices and highenergy required to maintain constant communicationwith the BS/centralized aggregator, distributed learningneeds to be investigated for an IoT environment.

3) Due to limited radio resources and energy constraints,only the limited amount of information is available atthe IoT devices, and therefore, it is necessary to adaptthe learning mechanisms based on the limited amountof information.

4) In some critical applications such as eHealthCare andindustrial control, IoT devices need to learn quicklyin order to satisfy the ultra-reliable and low-latencyrequirements. For this purpose, learning time should beas small as possible to quickly adjust the performanceparameters.

5) To enable the harmonious coexistence of MTC and HTCsystems, learning techniques should consider both theexisting traffic as well as the new traffic from the MTCdevices.

Existing works have applied learning techniques in thecontext of MTC/IoT in the following ways: (i) adaptationof an access control parameter, i.e., access barring factor tominimize the RACH overload [21], (ii) learning a dedicatedslot within the MTC transmission frame by using an intelligentslot assignment strategy to avoid the collisions of access re-quests [10], (iii) BS/eNodeB selection by using Q-learning/RLtechniques to minimize the access network overload [22, 23],and (iv) sequential learning with finite memory in order tolearn transmission parameters under stringent memory andcomputational constraints [24, 147].

Page 26: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 26

Machine

Learning

Techniques

Unsupervised

Learning

Reinforcement

Learning

Supervised

Learning

Classification

Regression

Policy-iteration

basedValue-iteration

based

Clustering

Dimensionality

reduction

Density

estimation

· Neural

Network (NN)

· Trees

· Deep NNs

· Support vector

machine

· Binary

decision tree

· Naive Baysian

classifier

· NN

· Deep NNs

· Monte-Carlo

based

· Temporal

difference

methods (e.g.,

SARSA)

· Q-learning

· Variations of

Q-learning

(collaborative

adaptive,

modular,

fuzzy-Q)

· K-means

· PCA

· Spectral

clustering

· Dirchlet

process

· Auto-

associative NN

· Isometric

Feature

Mapping

(ISOMAP)

· Local linear

embedding

· Boltzmann

Machine

· Kernel

density

· Gaussian

mixtures

Fig. 6. Classification of existing machine learning techniques.

C. Overview of Existing Machine Learning TechniquesML techniques learn necessary information either from the

available data-sets or by interactions with the surroundingenvironments and make suitable decisions on future actions tobe followed by the agents either based on some learned modelsor in a model-free manner. The classification of existingML techniques in terms of different bases such as learningprinciples, objectives and the employed algorithm is providedin Fig. 6 [140]. On the basis of the employed learning mecha-nisms, the existing ML techniques can be broadly categorizedinto the following [26, 140, 148]: (i) supervised learning, (ii)unsupervised learning, and (iii) reinforcement learning, whichare briefly described below. Figures 7 and 8 provide theillustrations of the principles of these three categories of MLtechniques [149].

The first category of ML techniques, i.e., supervised learn-ing requires the need of training data to be labelled and theoutput of the algorithm needs to be already fed to the machine.Being aware of the output, the learning agent builds a modelto move from the input to the output guided by the inputtraining set. Based on the employed learning algorithms, thesupervised learning techniques can be classified into [43, 140]:Artificial Neural Networks (ANNs), Deep NNs, Bayesian Net-works (BNs), Support Vector Machine (SVM), Deep Learning(DL), Case-based Reasoning (CBR), Decision Trees (DTs),K-Nearest Neighbor (KNN), Instance-based Reasoning (IBR)and Naive Bayesian Classifier. The main difficulty of applyingsupervised ML techniques in IoT scenarios is that they requirethe processing of extensive data-set to learn from the dynamicenvironment but IoT devices are limited in terms of computing

and caching/memory resources [26].On the other hand, the second category of ML techniques,

i.e., unsupervised learning does not require the need of labelsof data-sets and is more complex than supervised learningtechniques in terms of the computational cost [140]. Althoughthis learning class has not been widely used as comparedto the supervised learning in the current context, it can beconsidered as a promising future ML paradigm since themain objective of ML is to make the learning agent capableof learning without any supervision or human intervention.Since the learning agent does not have any training data-set and the knowledge of the output, the learning processis quite complex as compared to the supervised counterpart.This learning approach divides the unlabelled heterogeneousdata into smaller homogeneous sub-sets which can be eas-ily understood and managed [43]. Therefore, unsupervisedlearning techniques are mainly based on following threeobjectives [140]: (i) clustering, (ii) dimensionality reduction,and (iii) density estimation. For clustering-based unsupervisedlearning, different ML algorithms such as K-means, spectralclustering, Principal Component Analysis (PCA) and Dirchletprocesses can be utilized. Similarly, unsupervised learningfor dimensionality reduction can be employed by using MLalgorithms like auto-associative NN, local linear embeddingand Isometric Feature Mapping (ISOMAP). Furthermore, thedensity estimation-based unsupervised learning can be realizedby using ML algorithms such as Kernel density, Gaussianmixtures, Boltzmann machine and Deep Boltzmann machine.

In order to take the advantages of supervised and unsu-pervised learning, there is another intermediate type of ML

Page 27: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 27

techniques, called semi-supervised learning technique [150]which uses both the labelled and unlabelled data for learning.The main objective of semi-supervised learning scheme toimprove the learning accuracy by utilizing a small amountof the labelled data along with the large amount of theunlabelled data. The above-mentioned types of ML techniqueshave been applied mainly in the centralized framework [148].For example, cloud-based processing can enable the operationof big data analytics to handle the massive amount of datagathered from the heterogeneous IoT sensors/devices [19].

The third category of learning techniques, i.e., RL enables anumber of agents to interact with the environment and involvesan environment of states, actions to be taken by the agents,state transition functions, an immediate reward and an initialobservation function [26]. In this approach, an agent learnsfrom the previous experience in the absence of the trainingdata set. This learning mechanism deals with finding a properbalance between exploration for the random actions and ex-ploitation of current knowledge, i.e., exploration-exploitationtrade-off. The role of exploration phase in RL learning isto attempt some random actions towards searching betterrewarding actions, while the exploitation phase tries to utilizethe previously learned utility to maximize the reward for theagent [151]. Although an RL technique requires the knowledgeof state transition function and it has slow convergence, itsunique feature of action-reward feedback to the agents makesthis suitable for the applications in IoT systems.

Besides the above-discussed three categories of ML tech-niques, some researchers have also considered another cat-egory of learning techniques, called as Sequential Learning(SL) [26], which helps the autonomous agents to learn the trueunderlying state of the environment having binary states. Inthis approach, the agents learn the system state in the sequenceby following a given order while observing the environmentand the actions or observations of previous agents, and theneventually converge to a true underlying state with repeatedhypothesis testing [26]. The main advantage of employingSL techniques in the IoT systems is its flexibility in termsof memory requirements since it enables the convergence offinite memory SL in the resource-constrained IoT devices[147]. However, the main drawback of the SL approach isthat it relies on direct communication links among MTCdevices since the information required for SL comes fromother agents, thus leading to the requirement of additionalnetwork resources.

Among several RL techniques, Q-learning requires lowcomputational resources for its implementation and does notrequire the knowledge of the model of the environment,thus being a suitable learning technique for the resource-constrained IoT devices [151]. Furthermore, it is possible toimplement this technique in a distributed way. Therefore, inthe following subsection, we utilize the Q-learning techniqueto address the problem of RACH congestion in cellular IoTnetworks.

D. Q-learning for RACH Congestion ProblemThe main problem with the existing contention-based RA

schemes is that the occurrence of collisions is unavoidable and

Learning and

Processing

Raw data

Classification/

regression

algorithms

Supervisor

Output

Desired

output

Training

data set

(a)

Raw data

Algorithms like

clustering and

density

estimation

Output

· No training data set

· No desired output

InterpretationLearning and

processing

(b)

Fig. 7. Illustrations of the principles of supervised and unsupervised learning.

Algorithms

such as Q-

learning

Learning device

Environment

Interpreter

Actio

n

Fig. 8. Illustration of the principles of a reinforcement learning technique.

the achievable RACH throughput in the presence of massiveaccess requests/loads gets significantly reduced [10]. For ex-ample, the maximum throughput of the widely-used slottedALOHA technique is e−1 ( 37%). Furthermore, because oflow RACH throughput and the employed backoff strategies,the aggregated traffic including both the newly generated andthe retransmitted exceeds the RACH channel capacity at a cer-tain point, thus making the system unstable. Although SlottedALOHA can work well with the conventional cellular/HTCtraffic despite the instability issue, the support of M2M trafficbecomes problematic due to infrequent and massive number ofaccess requests, thus causing the problem of RACH overload.

Learning techniques can be employed at the MTC devices inorder to enable them to learn to avoid concurrent transmissionsduring the RACH contention period without any assistancefrom the central entity. After a learning technique achievesits convergence, each MTC device can get a unique dedicatedslot, thus avoiding collisions among their transmissions. Inthe following, first, we present a framework for the RL witha single MTC device and then develop the formulation of Q-learning in the considered context.

The environment perceived by an MTC device can be

Page 28: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 28

usually described by a Markov Decision Process (MDP) anda finite MDP can be denoted by a tuple < X,U, f, ρ >, whereX represents the finite set of environment states, U is the finiteset of device actions, f denotes the state transition probabilityfunction and captures the environmental dynamics, and ρ is thereward function [44]. In this MDP modeling, a state parameterxt ∈ X indicates the characteristics of the environment at thetth time instance. At each time-step, the device can changeits state by taking actions ut ∈ U and due to this action,the environmental state alters from the current state xt tothe other some state xt+1 based on the employed transitionprobability function f(xt, ut, xt+1). During this transition, thedevice receives an instantaneous reward rt+1 ∈ R with thedefined function ρ, i.e., rt+1 = ρ(xt, ut, xt+1).

Given a state, the device chooses its action based on itspolicy π and the policy can be either stochastic or determin-istic. Each time the device applies a policy, it accumulatesthe rewards from the environment, resulting in the returnof∑Ll=0 γ

lrl+1, where γ ∈ [0 1] is the discount factorwhich provides more weights on the immediate rewards andL denotes the length of one episode, which denotes the timeperiod after which the state is reset for an episodic MDP [45].For the case of non-episodic MDP, L =∞. At each time-step,the learning device aims to maximize the expected discountedreturn in the long-term, i.e., long-term reward, given by [44,45]

Rt = E

(L∑l=0

γlrt+l+1

), (12)

where E denotes the expectation operator and this is takenover probability state transitions, i.e., dynamics of the consid-ered environment.

In this RL process, the learning device attempts to maximizeits long-term performance, while only receiving feedbackabout its immediate action, i.e., one-step performance. Toachieve this, the device needs to compute an optimal action-value function, known as a Q-function. Given a certain policyπ, the expected return of a state-action pair, Qπ(x, u) is givenby

Qπ(x, u) = E

(L∑l=0

γlrt+l+1|xt = x, ut = u, π

). (13)

Subsequently, the optimal Q-function can be written as:Q∗(x, u) = maxπ Q

π(x, u) and it satisfies the well-knownBellman optimality equation in the following way.

Q∗(x, u) =∑x′∈X

f(x, u, x′)[ρ(x, u, x′)

+γmaxu′

Q∗(x′, u′)], x ∈ X,u ∈ U. (14)

One simplest way of choosing a future action by a de-vice is to employ a greedy policy which selects the actionwith the highest Q-value at every state as follows: π(x) =argmaxuQ(x, u).

Among different ML techniques listed in Fig. 6, Q-learningis simple to implement and uses a look-up table with theQ-values representing the utilities for state-action pairs. Theutility of taking an action ‘u’ in a state ‘x’ is denoted by

Q(x, u) and can be calculated as the expected value of the sumof immediate reward and discounted utility of the resultingstate after executing the action ‘u’. In the Q-learning process,the current estimate of Q∗ value, i.e., Qt(xt, ut) is updatedby using the estimated samples given by the right-hand sideof (14), which are computed by relating with the actualexperience from the execution of the action, in the form ofthe pairs of subsequent states (xt, xt+1) and the rewards rt+1.In this way, this Q-learning process transforms (14) to thefollowing iterative procedure [44].

Qt+1(xt, ut) = Qt(xt, ut) + αt[rt+1

+γmaxu′

Qt(xt+1, u′)−Q(xt, ut)]. (15)

where αt denotes the learning rate applies at the tth time-step and the expression inside the square brackets indicatesthe difference between the estimates of Q∗(xt, ut) at twosuccessive time steps.

To provide an example framework for the application ofQ-learning in an MTC scenario with N number of devices,we consider a frame-based slotted ALOHA scheme as in [10,152], in which a frame is divided into K number of accessslots. Each MTC node has individual Q values correspondingto every slots in the frame and these values are updated basedon the outcomes of transmission, i.e., success or failure. Atthe start, all the MTC nodes can start with zero or random Qvalues, learn gradually via their transmissions, and then finallyreach to the optimal transmission strategy after finding uniqueRA slots for their transmissions.

Let Q(i, k) indicate the preference of the ith node to trans-mit a packet in the kth RA slot. After every data transmission,the new Q value, i.e., Qt+1(i, k) is updated based on theprevious Q value and the current reward based on the followingrelation

Qt+1(i, k) = Qt(i, k) + α(R−Qt(i, k)), (16)

where R is the current reward and α is the learning rate.The reward value of R = +1 is assigned if the transmissionbecomes successful, and otherwise, R = −1. At each instanceof transmission, the node selects a slot with the highest Qvalue and in case of two or more maximum values, a randomselection approach can be applied. Regarding the selection ofα, the higher the value, the faster will be the convergence of Qvalue towards the reward value R. However, a small value ofα is usually preferred to enhance the robustness with respectto infrequent collisions caused by channel variations [152].Furthermore, although learning rate α is usually consideredfixed in the existing works, it is typically time varying innature, decreasing with time and each state-action pair maybe associated with a different learning rate [44].

E. Exploration Strategies for Q-Learning

Q-learning aims at finding an optimal policy to select anaction in the current state and one of the design aspects for theQ-learning algorithm is to balance the exploration-exploitationtradeoff. The exploitation is performed by executing one ofthe actions which maximizes Q(x, u) whereas the explorationis carried out by randomly selecting an action to build a

Page 29: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 29

better estimation of the optimal Q-function.There are severalstrategies to create this balance and the widely-used threestrategies are described below [153, 154].

1) ε-greedy strategy: This is the most commonly usedexploration strategy in which the Q-learning algorithmutilizes a parameter 0 ≤ ε ≤ 1 to decide on theaction to follow. The algorithm chooses the action withthe highest Q-value in the current state with (1 − ε)probability and a random action with the probabilityε. The value of the parameter ε can be varied overtime as the learning progresses. The main drawback ofthis approach is that it treats all the possible actionsequivalently during its exploration by choosing an actionuniformly from the set of possible actions.

2) Soft-max strategy: To address the drawback of ε-greedystage during the exploration phase, this soft-max strategyuses either a Gibbs or Boltzmann distribution, in whichthe learning device at a state x selects an action u withthe following probability

π(x, u) =eQ(x,u)τ∑

u eQ(x,u)τ

, (17)

where τ > 0 denotes the temperature parameter of theBoltzmann distribution and depicts how randomly thevalues are chosen. For example, τ = 0 case representsno exploration at all while T → ∞ case reflects thescenario in which the learning device chooses the actionvalues almost with equal probability.

3) Optimism in the face of uncertainty: This approachencourages exploration by assigning higher initial valuesto the Q-function. However, the convergence time willbe increased since the estimated Q-values can be quitebad estimates of the actual Q-value and these bad esti-mates may last longer during the learning process. Also,in the presence of dynamic uncertainty, this techniqueis not useful since the exploration mostly occurs at thebeginning of the learning process.

F. Performance Enhancement of Q-Learning

In the following, we present several variants of Q-learningtechniques which aim towards improving the performance ofthe ordinary Q-learning.

1) Collaborative Q-learning: In this subsection, we de-scribe the need of collaborative Q-learning in ultra-denseIoT networks and review the related works. The traditionalQ-learning algorithm which is based on single state-actionmay not be suitable for the multi-agent environment withmultiple policies. In this regard, collaborative learning canbe utilized by exploiting the overall/global objective/rewardof the collective environment instead of the reward for asingle learning device. Furthermore, the global reward can beutilized in addition to the individual reward to improve theperformance of the existing learning schemes.

In a multi-agent environment like the case of ultra-denseIoT networks, if all agents keep mappings of their joint statesand actions, this will require each learning device to maintainvery large Q-value tables, whose sizes are exponential in the

number of agents [155]. This becomes difficult even in thecase of a single state case. For instance, when M number oflearning devices play the repeated game with only two actions,the size of the table becomes 2M . Therefore, it requires morestate space, information space and action space. Furthermore,in the multi-agent case, state transitions, the instantaneousrewards and future expected return are based on the jointaction of the multiple agents. In addition, instead of theindividual policies, a joint policy formed by individual policieswill impact the Q-function. Therefore, the design of state-action pairs or the optimal policy to maximize the overallreward becomes the joint/collaborative problem in a multi-agent environment, thus leading to the need of collaborativelearning.

Some existing works have applied collaborative Q-learningin different settings [25, 155, 157, 158]. One way of realiz-ing collaborative Q-learning is to estimate the belief of theopponent and the environment knowledge instead of the Q-value function in a way that the learning agent does notneed to observe the opponents’ reward and their Q learningparameters [155]. In this regard, the authors in [156] appliedthe collaborative Q-learning framework to optimize the waitingtime in intelligent traffic control applications. Furthermore,the authors in [25] employed a collaborative ML techniqueat the edge computing devices with the objective of extractingthe statistical relationships among the contextual informationcollected by the edge devices and constructing predictivemodels to maximize the communication efficiency. With thisedge-centric learning approach, only the inferred knowledgecan be transferred to the network instead of transmitting allthe raw contextual information. Furthermore, the authors in[157] used collaborative Q-learning in finding an optimal pathbetween any starting point and a target in a grid environmentfor a mobile robot. In contrast to the conventional approachof using a single Q-table, the work in [157] exploited the useof two Q-tables, i.e., one local Q table and another Master Qtable.

Moreover, in a multi-agent environment, it is necessary fora learning agent to keep the track of its environment, as wellas other agents’ actions. In such a multi-agent environment,rather than considering individual actions of the agents as inthe ordinary Q-learning, the joint actions need to be consideredwhile devising a learning strategy [159]. Instead of the Q-value used in the ordinary Q-learning, a Nash Q-value functionshould be defined. For the mth learning device in a multi-agent environment, the Nash Q-value function can be definedas [159, 160]

Qt+1m (x, u1, ...uM ) = (1−α)Qtm(x, u1, ...uM )+α[rtm+γQ̃tm(x′)],

(18)where (u1, ...uM ) is a joint action, rtm is an one-period rewardfor the mth learning device and Q̃tm(x′) denotes the payoff ofthe mth device in the state x′ for the chosen Nash equilibrium.The convergence of the Q-learning algorithm using the Q-value function defined in (18) becomes slower as the numberof agents increases due to the resulted increase in the jointaction set [159].

Page 30: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 30

2) Situation-Aware Adaptive Q-learning: The collisions ofthe RA requests caused due to concurrent transmissions ofmultiple RA requests in one RACH sub-frame results in higheraccess delay since the devices have to retransmit their RArequests. Although the average access delay can be minimizedby using a higher number of RACH sub-frames, the sub-frames available for data transmission will be reduced sincethe sub-frames allocated for RACH procedure can not be usedfor the data transmission purpose [161]. In this regard, it iscrucial to balance the tradeoff between the radio resourcesallocated for RACH procedure and data transmission process.Furthermore, in many cases, the network usage pattern (thenumber of devices/users trying to connect to the network) istime varying in nature. In this context, a network should becapable of detecting the variation in the arrival rate of theaccess request and should adapt the number of sub-framesto be allocated for RACH accordingly [161]. In addition, itis important to balance a tradeoff between exploration andexploitation while adapting the Q-learning parameters to thedynamic uncertain environment [162].

Although Q-learning has been shown to converge and hasbeen used in many fields including mechatronics controland robotics, it has some issues such as how to improvethe convergence rate and to avoid the convergence in thelocal optimum [163]. To address these issues, three differentparameters of the Q-learning technique, namely, learning rateα, discount rate γ and temperature parameter τ in Boltzmanndistribution, should be dynamically adapted based on thedynamicity of the underlying learning environment. One ofthe widely used methods to adapt the learning parameters invarious applications is fuzzy-logic based learning, which isbriefly described in the following subsection along with therelated literature.

3) Fuzzy-logic based Adaptive Q-learning: In the Q-learning process, the Q-values are usually stored in a look-uptable but this storage process becomes infeasible in practicein the presence of a large number of state-action spaces andwith the continuous state space [164]. Although Q-valuescan be stored by using feed-forward neural networks or self-organizing maps, the learning process becomes slower. Incontrast, incorporation of Q-learning into fuzzy environmentsseems promising since fuzzy interference systems being uni-versal approximations can be considered as good candidatesto store Q-values and the prior knowledge can be providedto the fuzzy rules in order to significantly reduce the trainingpart.

The implementation of Q-learning becomes impracticaland even impossible in continuous state spaces [164]. Insuch cases, fuzzy-logic based approach helps to discretizethe continuous state or action spaces into finite states byemploying suitable fuzzy rules and also the speed of fuzzy-logic based Q-learning can be increased by incorporating theprior knowledge via fuzzy rules [165]. In other words, FuzzyQ-learning discretizes continuous variables by using fuzzylabels and a fuzzy rule-based inference system is employedto find an action for these discretized states [166].

Other drawbacks with the ordinary RL are that it is difficultwith the continuous states and behaviors in the real world

environment due to discrete set of actions and spaces, and itbecomes complex to learn the problem with multiple objec-tives [167]. To solve these issues, fuzzy-logic based rules canbe employed to tune the learning parameters of the Q-learningtechnique towards making it more adaptive.

In the context of cooperative fuzzy Q-learning, authors in[166] utilized this learning approach to optimize the coverageand capacity of cellular networks by adapting the tilting ofvertical antennas. The employed cooperative Fuzzy Q-learningmechanisms enable cooperation among the learning agentsduring the exploration phase and is fully distributed in theexploitation phase. The cooperation in the exploration phaseis employed by utilizing the global reward of all the consideredcells instead of the local rewards belonging to individual cells.This cooperation with the help of global reward helps to speedup the exploration phase of the Q-learning process while alsoallowing the learning agents to exploit the learned knowledgeindependently while selecting their actions. Furthermore, aself-learning cooperative strategy is developed in [163] bycombining adaptive Q-learning with the fuzzy method for itsapplication in robot soccer systems.

Moreover, the contribution in [168] recently proposed afuzzy Q learning-based user centric backhaul-aware user as-sociation scheme in which the BSs broadcast their constraintsand capabilities in terms of meeting the requirements ofheterogeneous UEs in terms of the optimized bias factors.The employed fuzzy-logic based Q-learning scheme helps todynamically adjust these bias factors based on network condi-tions and users’ requirements in an automated and distributedmanner.

Similarly, authors in [169] proposed a fuzzy Q-Learningbased energy controller for a small cell powered by localrenewable energy, local storage, and the smart grid to elongatethe lifespan of the storage devices and to minimize theelectricity expenditures of the mobile operators. The employedfuzzy Q-learning based controlled can be utilized without theprior knowledge of the mobile traffic demand, energy pricingand weather.

In order to avoid the inefficient and expensive manual tuningof cellular network parameters in 5G small cell networks, it iscrucial to perform automatic configuration and optimization ofthe network parameters including the handover parameters. Inthis regard, authors in [170] employed a fuzzy logic controllerbased dynamic fuzzy Q-Learning algorithm for mobility ro-bustness optimization in a heterogeneous network. With theproposed dynamic fuzzy Q-learning algorithm, the systemlearns necessary parameter values towards optimizing the calldropping ratio and handover ratio, and it has been shown thatthe Q-Learning algorithm can lower the handover ratio whilekeeping the call-dropping ratio at the lower level. Besides,by considering various parameters such as link quality, theavailable bandwidth, link quality, and relative vehicle move-ment, authors in [171] proposed a fuzzy constraint Q-learningalgorithm for vehicular ad-hoc networks in order to evaluatethe quality of a wireless link towards finding the optimal route.

4) Model-based Q-Learning: The main drawback ofmodel-free learning is the convergence time. By predicting amodel about the transition state probabilities, the performance

Page 31: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 31

of the learning techniques could be improved. In time-varyingdynamic scenarios, it becomes advantageous to predict theenvironmental dynamics in the centralized entities such ascloud-center and to utilize the corresponding model to enhancethe performance of distributed Q-learning at the resource-constrained IoT devices. In such a collaborative cloud-edgeprocessing framework [19], the predicted model at the cloud-center can be communicated to the edge-side to improve theperformance of distributed learning at the edge-side of thenetwork.

Existing works have used model-based Q-learning in dif-ferent applications such as robotic applications [172, 173] andwireless channel allocation [174].

VII. RESEARCH CHALLENGES AND FUTURE DIRECTIONS

A. Design Issues for Low-Power MTC Device TransmissionSchemes

Current protocols designed for cellular IoT such as NB-IoT and LTE-M are based on the assumption of the low-latency requirement. This requirement results in significantcost in terms of device price and system capacity [34]. Besides,the main issues involved in providing cellular connectivityto the low-power MTC devices include the device batterylife, system capacity, coverage and cost. Due to low costand low capability of MTC devices, a significant compromisein the link performance needs to be made resulting in theshrinkage of the coverage area. One way of compensatingthis coverage loss is to utilize an extended transmission timeinterval, which, however, will lead to the increase in the batteryconsumption. Therefore, it is crucial to balance the tradeoffbetween energy consumption and coverage expansion whiledesigning transmission technologies for the MTC devices.

Furthermore, transmission schemes should be able to sup-port a significantly large number of devices within a givenbandwidth while ensuring higher battery efficiency. In this re-gard, the concept of effective bandwidth [34] could be utilizedto achieve a good balance between the spectral efficiency fora given coverage and the corresponding transmission time.On one hand, the bandwidth allocated for a device shouldnot be far less than the effective bandwidth to avoid theexcessive transmission time. On the other hand, for a givensensitivity level of the device, it is preferred to have allocatedbandwidth not more than the effective bandwidth in order toaccommodate more number of devices in the saved bandwidthwithout significantly increasing the transmission time. More-over, advanced transmission scheduling techniques and lowsignalling overhead MAC protocols need to be investigated toeffectively support the massive number of MTC devices in theupcoming 5G and beyond cellular networks.

B. Spectrum Issues for mMTC Systems

Since the available usable spectrum below 6 GHz is limited,it is crucial to investigate suitable spectrum sharing solu-tions for emerging 5G and beyond systems including eMBB,URLLC and mMTC. The most commonly discussed dynamic

spectrum sharing solutions for 5G and beyond systems in-clude interweave cognitive communications, underlay cogni-tive communications, carrier aggregation, Licensed AssistedAccess (LAA), Licensed Shared Access (LSA) and SpectrumAccess System (SAS) [72]. Another option is to explore higherfrequency bands such as millimeter wave (mmWave) bands.However, due to propagation related and hardware imper-fections issues in mmWave bands, it becomes challengingto operate mMTC devices in mmWave bands as they areconstrained in terms of resources and are mostly deployed inindoor or underground environments where propagation issuescould become problematic. Therefore, for mMTC applications,the frequency bands below 6 GHz is of more importance. Inthis spectrum region, there arises the issue of whether licensedor unlicensed band becomes suitable for mMTC applicationsas pointed out in the following.

In the spectrum region below 6 GHz, the operation inthe unlicensed band by using carrier aggregation and LAAcould be a cost-effective solution for mMTC applicationssince these frequencies are freely available to any device.However, uncontrolled interference conditions resulted fromthe free access from the massive number of devices and thelack of QoS guarantees may severely limit the ability ofutilizing unlicensed bands for mMTC applications [4]. Onthe other hand, emerging techniques such as LSA and SASprovide better interference characterization due to the central-ized control and seem more suitable for mMTC applications.Besides, some of the mMTC applications based on periodicdata reporting such as smart metering can effectively workwith the shared spectrum bands in a demand-based mannerinstead of exclusively assigning the portion of a licensedspectrum. Due to low bit-rate requirements of mMTC appli-cations, the bandwidth requirement may not be significant,however, the exclusive licensed spectrum can be allocated tomore demanding applications such as eMBB and URLLC.To this end, it is an important future research direction toinvestigate the feasibility of utilizing shared spectrum formMTC applications.

C. Traffic Characterization Issues for mMTC Systems

Since traffic characteristics in IoT sensory networks usuallydepend on the application scenarios, traffic characterizationis considered to be an important issue for the design andoptimization of the network infrastructure. The mMTC net-works may generate various types of traffic patterns such asPU, ED and streaming, and these traffic patterns may havedifferent amplitudes, activation periods and starting times [52].Besides, the data packets generated by MTC devices can beof varying sizes and bandwidth requirements. For example,the data packets generated by the temperature and humiditysensors are usually of small size in the order of bytes, whereasvideo monitoring devices can generate data sizes in the orderof megabytes.

As highlighted in Section II-D, the 3GPP has defined twodifferent models for the aggregated traffic models for the MTCtraffic. The first model based on uniform distribution is for thenon-synchronized type of traffic whereas the second model

Page 32: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 32

based on Beta distribution is for the highly synchronizedtraffic. This aggregated traffic modeling is simpler but is lessprecise than the source traffic modeling which models thetraffic for each MTC device separately and is more complex. Inthis regard, investigating suitable low-complexity and precisetraffic modeling is an important future research direction.

Furthermore, the Transmission Control Protocol (TCP) em-ployed at the transport layer of current LTE-A networks isnot efficient for MTC traffic due to several issues associatedwith connection set up, congestion control, data buffering andreal-time applications [52]. Therefore, it is crucial to developan enhanced version of TCP over LTE/LTE-A in order toaccommodate the MTC traffic.

D. Issues with Machine Learning in the mMTC environment

While employing ML techniques in an mMTC environment,an important aspect to be considered is how to make ML tech-niques more practically realizable for the resource-constrainedMTC devices. To achieve this objective, the following issuesneed to be considered. First, the convergence rate/learning timeof the employed learning algorithm should be as small as pos-sible and there may arise the issue of a local minimum. Sincethe required learning time may reduce the time for data trans-mission purpose, their trade-off should be designed properly.Second, in mMTC environments, the distributed implementa-tion of the algorithm needs to be considered across multiplelearning devices. Third, different parameters associated withthe learning algorithms such as learning rate α, discount rateγ and exploration-exploitation tradeoff parameter ε should beadapted dynamically to enhance the performance of Q-learningalgorithm in dynamic environments. Furthermore, there mayarise the fairness issue while applying learning algorithmsin a multi-agent environment since different devices mayreach to convergence at different time intervals, thus creatingdifferent learning times for different devices. Moreover, theheterogeneity of MTC devices need to be considered in termsof different aspects such as learning capability, cache size, datarate and delay tolerance limit.

In the above context, future works should focus on address-ing the aforementioned issues to employ the ML-assisted solu-tions towards enabling the incorporation of MTC devices in theupcoming 5G and beyond cellular networks. Various emergingtechniques described in Section VI-F such as collaborativelearning, situation-aware adaptive learning, fuzzy-logic basedQ-learning and model-based learning could be exploited toenhance the performance of the ML techniques in ultra-denseIoT networks involving MTC devices.

E. Distributed Resource Management in Ultra-Dense IoT Net-works

In contrast to the connection-oriented design approach ofthe existing wireless networks while considering only thecommunication resources, future content-oriented networksare expected to utilize other resources such as computingand caching resources. In ultra-dense IoT networks, thesecommunications, computing and caching resources are usually

distributed across the different entities of the network includ-ing the devices, aggregator/eNodeBs and core-network/cloud-center. The coordination of these distributed resources toenhance the performance of ultra-dense IoT networks involv-ing low-end MTC devices is an important research issue.Furthermore, in the emerging cloud-assisted IoT networks, itis advantageous to handle computationally intensive task atthe cloud-center due to the availability of huge computingand storage capabilities while it becomes beneficial to handledelay-sensitive applications at the edge-side of the network.In this regard, it is an important future research directionto investigate suitable techniques for edge-cloud collaborativeprocessing in various applications including dynamic spectrumsharing, event detection, context-aware resource allocation,live data analytics and security enhancement [19].

Moreover, various features of MTC device transmissionssuch as time-controlled, time tolerant, priority alarm message,infrequent transmission, group-based policing and addressingcan be useful to utilize the distributed cache embedded in MTCdevices. The caching capabilities distributed across heteroge-neous IoT devices can enable the scheduling of sporadic trans-missions from the MTC devices in order to significantly reducethe peak traffic in an IoT access network as demonstrated in[13]. This will subsequently reduce the demand for the radioresources at the peak time, thus significantly saving the radioresource cost for the telecommunication operators. In addition,distributed caching can be exploited to enable aggregate trans-mission for various other applications such as saving energyfor low-power IoT devices and reducing signalling/protocoloverhead for MTC device transmissions. Also, the physical-level cache embedded in low-end distributed MTC devicescan be exploited to facilitate the implementation of a cross-layer based transmission scheduling in pushing data from thephysical layer to the MAC layer at the suitable intervals.

F. Device Heterogeneity and Grouping-based TransmissionSchemes

The heterogeneity of MTC devices in terms of different as-pects such as computing capability, cache size, battery power,data rate and latency requirements becomes problematic forthe efficient QoS provisioning in ultra-dense IoT networks.Furthermore, RAN congestion, high signalling overhead andpower consumption are critical issues to be addressed in ultra-dense IoT networks. In this regard, grouping-based features ofMTC transmissions such as group-based policing and group-based addressing [9] can be utilized to enable the group-based operation in the mMTC environment towards alleviatingvarious problems such as signalling congestion and power con-sumption. Furthermore, by implementing a group-based accessauthentication technique, severe signalling congestion causedby the conventional independent access authentication schemein the existing cellular systems can be avoided [175] and thesecurity of emerging MTC applications can be significantlyenhanced.

By grouping MTC devices on the basis of either servicerequirements or the physical locations of MTC devices, group-based data gathering, aggregation and reporting can be uti-lized in ultra-dense IoT networks including IEEE 802.11ah

Page 33: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 33

based systems [176]. In such group-based schemes, a groupheader/cluster head collects the access requests, uplink datapackets, and device status information from the resource-constrained MTC devices belonging to that group, and thenforwards the aggregated traffic to an eNodeB/aggregator [6].Also, the downlink data packets and signalling messagescan be relayed to the MTC devices by the group header,thus significantly reducing the radio resources required fordirect communications between the devices and the eNodeB.Moreover, as the jittering constraints become challengingwhile transmitting small data packets from a large number ofdevices, this issue can be addressed by employing grouping-based resource scheduling which allocate radio resources tothe specific groups having similar QoS characteristics [61].However, the existing device grouping mechanisms are formedmostly in a traditional way by selecting the devices in arandom fashion or in an uniform manner, and it is an importantresearch direction to investigate efficient grouping mechanismsby exploiting the QoS characteristics and locations of hetero-geneous MTC devices.

G. Deep Learning for Emerging IoT Applications and Asso-ciated Issues

In real-world IoT environments, one of the major issuesis how to reliably extract the meaningful information out ofthe massive unstructured/semi-structured IoT data obtainedfrom a complex and noisy environment. Conventional MLlearning techniques may fail in such complex and dynamicenvironments and deep learning can be considered as apromising solution [40]. Due to multi-layer structure of deeplearning, it is considered as an effective approach for theedge-computing environment in order to accurately extractnecessary information from the raw IoT sensor data [177].Since it is possible to offload the parts of learning layers in theedge-side of the network and transfer the reduced intermediatedata to the remote cloud-center, the deep learning model seemssuitable for the emerging edge computing paradigms in IoTsystems. Another interesting advantage of deep learning inedge-computing environment is that it can provide privacypreservation in transferring the intermediate data. In contrast tothe traditional big data systems such as Spark or MapReduce inwhich intermediate data contains the user privacy information,the intermediate data generated in deep learning networksusually have different semantics than that in the original sourcedata [40].

In practical IoT systems, the data is usually dynamic andunlabelled, and the conventional statistically trained modelsare not efficient to handle the large unlabelled and dynamicdata-set. Furthermore, it is highly impractical to manuallylabel all the IoT raw data. Due to this, the conventionalsupervised training-based learning techniques are not suitablefor large-scale IoT/mMTC environments [178]. Moreover, theconventional cloud-based architecture requires the transfer ofa huge amount of IoT data from the edge-devices to the cloud-center. To address these issues, the application of deep learningwith collaborative cloud-edge/fog processing seems to be apromising future research direction [19, 178].

However, the application of deep learning in IoT systemsfaces the crucial challenge of meeting the computationalrequirements. The main associated issues include the high-speed training of large-scale IoT networks with the massivedata-sets and embedding deep learning capability in low-powerIoT devices [179]. This computational challenge caused dueto the expected growth in the size of the data-sets and thealgorithmic complexity of ML algorithms is demanding theneed of improving the computational efficiency of existingcomputing platforms. Also, it is not feasible to offload all thedata to the cloud-center for processing due to constraints onthe bandwidth, privacy and battery life, and the computationalefficiency at the device-side needs to be accelerated for deeplearning applications. In this regard, some of the potentialfuture solutions to improve computational efficiency of MLalgorithms include deep-learning accelerators, approximatecomputing and post-CMOS device technologies [179].

Many IoT products have already used the ML techniquesto acquire and analyze the environmental data. For example,Google’s Nest Learning Thermostat uses ML algorithms tounderstand the patterns of its users’ temperature schedulesand preferences by utilizing the temperature data recorded ina structure way. However, unstructured multimedia data suchas visual images and audio signals are difficult to learn byusing the conventional ML techniques. In this regard, someemerging IoT devices are already using sophisticated deeplearning techniques to capture and understand the complexenvironments [27]. As an example, the face-recognition se-curity system from the Microsoft’s Windows IoT team usesdeep-learning technology to perform tasks such as unlockinga door by recognizing its users’ faces.

Due to demanding real-time requirements of IoT applica-tions in terms of latency and high cost of radio resourcesrequired in delivering information to the cloud-center, it isadvantageous to implement deep learning techniques at thedevice-side. However, due to the limited computing powerand low memory size of IoT devices, it is challenging toimplement deep learning at the device-side. Therefore, mostof the time, existing deep-learning applications require third-party libraries and it may be difficult to migrate them to the IoTdevices [27]. In this regard, it is an important future researchdirection to investigate suitable paradigms such as convolutionneural networks based inference engine [27] to facilitate theimplementation of deep learning at the IoT devices.

VIII. CONCLUSIONS

Future cellular IoT networks are expected to support themassive number of resource-constrained MTC devices whilesatisfying their diverse QoS requirements and will need todeal with several challenges for enhancing the access la-tency, scalability, connection reliability, energy efficiency andnetwork throughput. To this end, this paper has discussedvarious challenges of mMTC systems such as QoS provision-ing, mMTC traffic characterization, transmission schedulingwith QoS support, small data packet transmission and RANcongestion, and has provided a detailed review on the existingstudies attempting to address these issues. By consideringmachine learning as an important enabler to address some of

Page 34: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 34

these issues in ultra-dense cellular IoT networks, the paperhas identified the potential advantages, research challengesand the application scenarios of ML-assisted solutions. Amongpotential ML techniques, the application of Q-learning inminimizing the RAN congestion has been presented along withsome performance enhancement techniques. Finally, someimportant research issues and interesting directions for futureresearch have been provided.

REFERENCES

[1] A. Al-Fuqaha, et al., “Internet of things: A survey on enablingtechnologies, protocols, and applications,” IEEE Commun. SurveysTuts., vol. 17, no. 4, pp. 2347–2376, Fourthquarter 2015.

[2] H. Wang and A. O. Fapojuwo, “A survey of enabling technologies oflow power and long range machine-to-machine communications,” IEEECommun. Surveys Tuts., vol. 19, no. 4, pp. 2621–2639, Fourthquarter2017.

[3] ITU-R, “IMT vision - framework and overall objectives of the futuredevelopment of IMT for 2020 and beyond,” ITU-R Rec. M.2083-0,Sept. 2015.

[4] B. A. Jayawickrama, Y. He, E. Dutkiewicz, and M. D. Mueck,“Scalable spectrum access system for massive machine type commu-nication,” IEEE Network, vol. PP, no. 99, pp. 1–7, 2018.

[5] 3GPP, “Study on scenarios and requirements for next generation accesstechnologies (Release 14),” Technical Report 38.913, 2017.

[6] F. Ghavimi and H. H. Chen, “M2M communications in 3GPPLTE/LTE-A networks: Architectures, service requirements, challenges,and applications,” IEEE Commun. Surveys Tuts., vol. 17, no. 2, pp.525–549, Secondquarter 2015.

[7] M. Koseoglu, “Performance analysis of small data transmissionschemes for cellular M2M communications,” in Proc. 16th AnnualMediterranean Ad Hoc Networking Workshop, June 2017, pp. 1–6.

[8] Z. Dawy, et al., “Toward massive machine type cellular communica-tions,” IEEE Wireless Communications, vol. 24, no. 1, pp. 120–128,Feb. 2017.

[9] 3GPP, “Service requirements for machine-type communications(MTC), TS 22.368, V11.5.0, June 2012.

[10] L. M. Bello, P. Mitchell, and D. Grace, “Application of Q-learningfor RACH access to support M2M traffic over a cellular network,”in European Wireless 2014; 20th European Wireless Conference, May2014, pp. 1–6.

[11] G. Durisi, T. Koch, and P. Popovski, “Toward massive, ultrareliable,and low-latency wireless communication with short packets,” Proc. ofthe IEEE, vol. 104, no. 9, pp. 1711–1726, Sept. 2016.

[12] S. Oh and J. W. Jang, “A scheme to smooth aggregated traffic fromsensors with periodic reports,” MDPI Sensors, vol. 17, no. 503, pp.1–18, March 2017.

[13] S. K. Sharma and X. Wang, “Distributed caching enabled peak trafficreduction in ultra-dense IoT networks,” IEEE Commun. Letters, vol.22, no. 6, pp. 1252-1255, June 2018.

[14] C. Bockelmann, et al., “Massive machine-type communications in 5G:physical and MAC-layer solutions,” IEEE Commun. Mag., vol. 54,no. 9, pp. 59–65, Sept. 2016.

[15] A. Imran, A. Zoha, and A. Abu-Dayya, “Challenges in 5G: how toempower SoN with big data for enabling 5G,” IEEE Network, vol. 28,no. 6, pp. 27–33, Nov 2014.

[16] R. Li, , “Intelligentet al., 5G: When cellular networks meet artificialintelligence,” IEEE Wireless Commun., vol. 24, no. 5, pp. 175–183,Oct. 2017.

[17] R. C. Daniels and R. W. Heath, “An online learning frameworkfor link adaptation in wireless networks,” in Proc. Info. Theory andApplications Workshop, Feb. 2009, pp. 138–140.

[18] S. Yun and C. Caramanis, “Reinforcement learning for link adaptationin MIMO-OFDM wireless systems,” in Proc. IEEE GLOBECOM , Dec.2010, pp. 1–5.

[19] S. K. Sharma and X. Wang, “Live data analytics with collaborativeedge and cloud processing in wireless IoT networks,” IEEE Access,vol. 5, pp. 4621–4635, March 2017.

[20] X. Wang, X. Li, and V. C. M. Leung, “Artificial intelligence-basedtechniques for emerging heterogeneous network: State of the arts,opportunities, and challenges,” IEEE Access, vol. 3, pp. 1379–1391,2015.

[21] J. Moon and Y. Lim, “Access control of MTC devices using re-inforcement learning approach,” in Proc. Int. Conf. on InformationNetworking, Jan. 2017, pp. 641–643.

[22] M. Hasan, E. Hossain, and D. Niyato, “Random access for machine-to-machine communication in LTE-advanced networks: issues andapproaches,” IEEE Commun. Mag., vol. 51, no. 6, pp. 86–93, June2013.

[23] A. H. Mohammed, A. S. Khwaja, A. Anpalagan, and I. Woungang,“Base station selection in M2M communication using Q-learningalgorithm in LTE-A networks,” in Proc. IEEE 29th Int. Conf. AdvancedInfo. Networking and Applications, March 2015, pp. 17–22.

[24] T. Park and W. Saad, “Resource allocation and coordination for criticalmessages using finite memory learning,” in Prof. IEEE GlobecomWorkshops (GC Wkshps), Dec. 2016, pp. 1–6.

[25] K. Portelli and C. Anagnostopoulos, “Leveraging edge computingthrough collaborative machine learning,” in Proc. 5th Int. Conf. onFuture Internet of Things and Cloud Workshops (FiCloudW), Aug.2017, pp. 164–169.

[26] T. Park, N. Abuzainab, and W. Saad, “Learning how to communicatein the internet of things: Finite resources and heterogeneity,” IEEEAccess, vol. 4, pp. 7063–7073, 2016.

[27] J. Tang, D. Sun, S. Liu, and J. L. Gaudiot, “Enabling deep learning onIoT devices,” Computer, vol. 50, no. 10, pp. 92–96, 2017.

[28] M. Wang, et al., “Cellular machine-type communications: physicalchallenges and solutions,” IEEE Wireless Commun., vol. 23, no. 2,pp. 126–135, April 2016.

[29] Z. Dawy, et al., “Toward massive machine type cellular communica-tions,” IEEE Wireless Commun., vol. 24, no. 1, pp. 120–128, 2017.

[30] A. Rico-Alvarino, et al., “An overview of 3GPP enhancements onmachine to machine communications,” IEEE Commun. Magazine,vol. 54, no. 6, pp. 14–21, June 2016.

[31] M. Elsaadany, A. Ali, and W. Hamouda, “Cellular LTE-A technologiesfor the future internet-of-things: Physical layer features and chal-lenges,” IEEE Commun. Surveys Tuts., vol. PP, no. 99, pp. 1–1, 2017.

[32] A. Hoglund, et al., “Overview of 3GPP release 14 enhanced NB-IoT,”IEEE Network, vol. 31, no. 6, pp. 16–22, Nov. 2017.

[33] A. Laya, L. Alonso, and J. Alonso-Zarate, “Is the random accesschannel of LTE and LTE-A suitable for M2M communications? asurvey of alternatives,” IEEE Commun. Surveys Tuts., vol. 16, no. 1,pp. 4–16, Firstquarter 2014.

[34] W. Yang, et al., “Narrowband wireless access for low-power massiveinternet of things: A bandwidth perspective,” IEEE Wireless Commun.,vol. 24, no. 3, pp. 138–145, 2017.

[35] M. S. Ali, E. Hossain, and D. I. Kim, “LTE/LTE-A random access formassive machine-type communications in smart cities,” IEEE Commun.Mag., vol. 55, no. 1, pp. 76–83, Jan. 2017.

[36] E. Soltanmohammadi, K. Ghavami, and M. Naraghi-Pour, “A survey oftraffic issues in machine-to-machine communications over LTE,” IEEEInternet of Things J., vol. 3, no. 6, pp. 865–884, Dec 2016.

[37] A. G. Gotsis, A. S. Lioumpas, and A. Alexiou, “M2M schedulingover LTE: Challenges and new perspectives,” IEEE Veh. Technol. Mag.,vol. 7, no. 3, pp. 34–39, Sept. 2012.

[38] M. A. Mehaseb, Y. Gadallah, A. Elhamy, and H. Elhennawy, “Classi-fication of LTE uplink scheduling techniques: An M2M perspective,”IEEE Commun. Surveys Tuts., vol. 18, no. 2, pp. 1310–1335, Sec-ondquarter 2016.

[39] M. Marjani, et al., “Big IoT data analytics: Architecture, opportunities,and open research challenges,” IEEE Access, vol. 5, pp. 5247–5261,2017.

[40] S. Verma, et al., “A survey on network methodologies for real-time analytics of massive IoT data and open research issues,” IEEECommun.. Surveys Tuts., vol. 19, no. 3, pp. 1457–1477, thirdquarter2017.

[41] H. Elhammouti, et al., “Self-organized connected objects: RethinkingQoS provisioning for IoT services,” IEEE Commun. Mag., vol. 55,no. 9, pp. 41–47, 2017.

[42] M. Mohammadi and A. Al-Fuqaha, “Enabling cognitive smart citiesusing big data and machine learning: Approaches and challenges,”IEEE Commun. Mag., vol. 56, no. 2, pp. 94–101, Feb. 2018.

[43] O. B. Sezer, E. Dogdu, and A. M. Ozbayoglu, “Context-aware com-puting, learning, and big data in internet of things: A survey,” IEEEInternet of Things J., vol. 5, no. 1, pp. 1–27, Feb. 2018.

[44] L. Busoniu, et al., “A comprehensive survey of multiagent reinforce-ment learning,” IEEE Trans. Systems, Man, and Cybernetics, Part C(Applications and Reviews), vol. 38, no. 2, pp. 156–172, March 2008.

Page 35: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 35

[45] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath,“Deep reinforcement learning: A brief survey,” IEEE Signal Process.Mag., vol. 34, no. 6, pp. 26–38, Nov 2017.

[46] R. Baldemair, et al., “Ultra-dense networks in millimeter-wave frequen-cies,” IEEE Commun. Mag., vol. 53, no. 1, pp. 202–208, Jan. 2015.

[47] S. Chen, et al., “Machine-to-machine communications in ultra-densenetworks: A survey,” IEEE Commun. Surveys Tuts., vol. 19, no. 3, pp.1478–1503, thirdquarter 2017.

[48] M. Kamel, W. Hamouda, and A. Youssef, “Ultra-dense networks: Asurvey,” IEEE Commun. Surveys Tuts., vol. 18, no. 4, pp. 2522–2545,Fourthquarter 2016.

[49] K. Smiljkovic, V. Atanasovski, and L. Gavrilovska, “Machine-to-machine traffic characterization: Models and case study on integrationin LTE,” in Proc. 4th Int. Conf. Wireless Commun., Veh. Technol., Info.Th. and Aerospace Electronic Systems (VITAE), May 2014, pp. 1–5.

[50] X. Li, J. B. Rao, and H. Zhang, “Engineering machine-to-machinetraffic in 5G,” IEEE Internet of Things J., vol. 3, no. 4, pp. 609–618,Aug 2016.

[51] N. Afrin, J. Brown, and J. Y. Khan, “An adaptive buffer based semi-persistent scheduling scheme for machine-to-machine communicationsover LTE,” in Proc. Eighth Int. Conf. Next Generation Mobile Apps,Services and Technol., Sept 2014, pp. 260–265.

[52] J. Kim, J. Lee, J. Kim, and J. Yun, “M2M service platforms: Survey,issues, and enabling technologies,” IEEE Commun. Surveys Tuts.,vol. 16, no. 1, pp. 61–76, Firstquarter 2014.

[53] J. Fabini and T. Zseby, “M2M communication delay challenges:Application and measurement perspectives,” in Proc. IEEE Int. Instru-mentation and Measurement Technol. Conf., May 2015, pp. 1859–1864.

[54] S. Y. Lien and K. C. Chen, “Massive access management for QoS guar-antees in 3GPP machine-to-machine communications,” IEEE Commun.Letters, vol. 15, no. 3, pp. 311–313, March 2011.

[55] Y. Saleem, N. Crespi, M. H. Rehmani and R. Copeland, “In-ternet of Things-aided Smart Grid: Technologies, Architectures,Applications, Prototypes, and Future Research Directions”, CoRR,abs/1704.08977,2017, online: http://arxiv.org/abs/1704.08977.

[56] K. Zheng, et al., “Radio resource allocation in LTE-advanced cellularnetworks with M2M communications,” IEEE Commun. Mag., vol. 50,no. 7, pp. 184–192, July 2012.

[57] M. Y. Cheng, et al., “Overload control for machine-type-communications in LTE-advanced system,” IEEE Commun. Mag.,vol. 50, no. 6, pp. 38–45, June 2012.

[58] R. Ratasuk, N. Mangalvedhe, D. Bhatoolaul, and A. Ghosh, “LTE-M evolution towards 5G massive MTC,” in Proc. IEEE GlobecomWorkshops (GC Wkshps), Dec. 2017, pp. 1–6.

[59] P. K. Wali and D. Das, “A novel access scheme for IoT communica-tions in LTE-advanced network,” in Proc. IEEE Int. Conf. AdvancedNetworks and Telecommun. Systems (ANTS), Dec 2014, pp. 1–6.

[60] A. M. AlAdwani, A. Gawanmeh, and S. Nicolas, “A demand sidemanagement traffic shaping and scheduling algorithm,” in Proc. SixthAsia Modelling Symp., May 2012, pp. 205–210.

[61] S. Y. Lien, K. C. Chen, and Y. Lin, “Toward ubiquitous massive ac-cesses in 3GPP machine-to-machine communications,” IEEE Commun.Mag., vol. 49, no. 4, pp. 66–74, April 2011.

[62] Z. Li, et al., “Enabling heterogeneous MMTC by energy-efficient andconnectivity-aware clustering and routing,” in Proc. IEEE GlobecomWorkshops (GC Wkshps), Dec. 2017, pp. 1–6.

[63] 3GPP, “Cellular system support for ultralow complexity and lowthroughput Internet of Things (CIoT),”, TR 45.820, V13.1.0, Nov.2015.

[64] P. Andres-Maldonado, et al., “Narrowband IoT data transmissionprocedures for massive machine-type communications,” IEEE Network,vol. 31, no. 6, pp. 8–15, Nov. 2017.

[65] J. Huang, et al.,, “Optimizing M2M communications and quality ofservices in the IoT for sustainable smart cities,” IEEE Trans. onSustainable Computing, vol. PP, no. 99, pp. 1–1, 2017.

[66] M. Centenaro, et al.,, “Comparison of collision-free and contention-based radio access protocols for the internet of things,” IEEE Trans.Commun., vol. 65, no. 9, pp. 3832–3846, Sept. 2017.

[67] U. Devi, et al., “SERA: A scheduling framework forM2M transmission in cellular networks,” online, web:http://researcher.ibm.com/researcher/files/in-umamadev/M2M-long.pdf.

[68] Y. Medjahdi, et al., “On the road to 5G: Comparative study of physicallayer in MTC context,” IEEE Access, vol. 5, pp. 26556–26581, 2017.

[69] 3GPP, “Study on RAN improvements for machine-type communica-tions,” Tech. Rep., TR 37.868, 2012.

[70] J. Choi, “On the adaptive determination of the number of preamblesin RACH for MTC,” IEEE Commun. Letters, vol. 20, no. 7, pp. 1385–1388, July 2016.

[71] S. J. SungHyung Lee and J. Kim, “Dynamic resource allocation ofrandom access for MTC devices,” ETRI Journal, vol. 39, no. 4, pp.546–557, Aug. 2017.

[72] S. K. Sharma, et al., “Dynamic spectrum sharing in 5G wirelessnetworks with full-duplex technology: Recent advances and researchchallenges,” IEEE Commun. Surveys Tuts., vol. 20, no. 1, pp. 674–707,Firstquarter 2018.

[73] M. Li, et al., “Random access and virtual resource allocation insoftware-defined cellular networks with machine-to-machine commu-nications,” IEEE Trans. Veh. Technol., vol. 66, no. 7, pp. 6399–6414,July 2017.

[74] A. Samir, et al., “Partial contention free random access protocol forM2M communications in LTE networks,” J. of Wireless Networkingand Commun., vol. 6, no. 3, pp. 66–72, 2016.

[75] G. Wunder, C. Stefanovic, P. Popovski, and L. Thiele, “Compressivecoded random access for massive MTC traffic in 5G systems,” in Proc.49th Asilomar Conf. on Signals, Systems and Computers, Nov. 2015,pp. 13–17.

[76] I. M. Delgado-Luque, et al., “Evaluation of latency-aware schedulingtechniques for M2M traffic over LTE,” in Proc. 20th European SignalProcessing Conference (EUSIPCO), Bucharest, 2012, pp. 989-993.

[77] S. Ali, N. Rajatheva and W. Saad, “Fast Uplink Grant for Ma-chine Type Communications: Challenges and Opportunities”, CoRR,abs/1801.04953, 2018, online: http://arxiv.org/abs/1801.04953.

[78] S. Bhandari, S. K. Sharma and X. Wang, “Latency Minimization inWireless IoT Using Prioritized Channel Access and Data Aggregation,”in Proc. IEEE Global Commun. Conf., Singapore, Dec. 2017, pp. 1-6.

[79] S. K. Sharma and X. Wang, “Cooperative sensing delay minimizationin cloud-assisted DSA networks” in Proc. IEEE 28th Annual Int. Symp.on Personal, Indoor, and Mobile Radio Commun. (PIMRC), Montreal,QC, Oct. 2017, pp. 1-6.

[80] T. P. C. de Andrade, C. A. Astudillo and N. L. S. da Fonseca,“Impact of M2M traffic on human-type communication users on theLTE uplink channel,” in Proc. 7th IEEE Latin-American Conferenceon Communications (LATINCOM), Arequipa, 2015, pp. 1-6.

[81] M. Laner, P. Svoboda, N. Nikaein, and M. Rupp, “Traffic models formachine type communications,” in Proc. The Tenth Int. Symp. WirelessCommun. Systems, Aug. 2013, pp. 1–5.

[82] N. Nikaein, et al., “Simple traffic modeling framework for machinetype communication,” in Proc. The Tenth Int. Symp. Wireless Commun.Systems, Aug 2013, pp. 1–5.

[83] A. M. AlAdwani, A. Gawanmeh, and S. Nicolas, “Traffic shaping anddelay optimization in demand side management,” in Proc. 15th Int.Conf. on Computer Modelling and Simulation, April 2013, pp. 722–727.

[84] A. Kumar, A. Abdelhadi, and C. Clancy, “A delay efficient multiclasspacket scheduler for heterogeneous M2M uplink,” in Proc. IEEEMILCOM Conf., Nov 2016, pp. 313–318.

[85] ——, “An online delay efficient packet scheduler for M2M traffic inindustrial automation,” in Proc. 2016 Annual IEEE Systems Conf., April2016, pp. 1–6.

[86] Y. Zhao, B. Zhang, C. Li, and C. Chen, “On/off traffic shaping in theinternet: Motivation, challenges, and solutions,” IEEE Network, vol. 31,no. 2, pp. 48–57, March 2017.

[87] J. S. Kim, S. Lee, and M. Y. Chung, “Efficient random-access schemefor massive connectivity in 3GPP low-cost machine-type communica-tions,” IEEE Trans. Veh. Technol., vol. 66, no. 7, pp. 6280–6290, July2017.

[88] S. Duan, V. Shah-Mansouri, Z. Wang, and V. W. S. Wong, “D-ACB:Adaptive congestion control algorithm for bursty M2M traffic in LTEnetworks,” IEEE Trans. Veh. Technol., vol. 65, no. 12, pp. 9847–9861,Dec. 2016.

[89] G. Y. Lin, S. R. Chang, and H. Y. Wei, “Estimation and adaptation forbursty LTE random access,” IEEE Trans. Veh. Technol., vol. 65, no. 4,pp. 2560–2577, April 2016.

[90] P. Osti, et al., “Analysis of PDCCH performance for M2M traffic inLTE,” IEEE Trans. Veh. Technol., vol. 63, no. 9, pp. 4357–4371, Nov.2014.

[91] A. Biral, et al., “The challenges of M2M massive access in wirelesscellular networks,” Digital Commun. and Networks, vol. 1, no. 1, pp.1–19, 2015.

[92] S. Y. Lien, et al., “Cooperative access class barring for machine-to-machine communications,” IEEE Trans. Wireless Commun., vol. 11,no. 1, pp. 27–32, Jan. 2012.

Page 36: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 36

[93] S. Duan, V. Shah-Mansouri, and V. W. S. Wong, “Dynamic accessclass barring for M2M communications in LTE networks,” in Proc.IEEE Global Commun. Conf., Dec. 2013, pp. 4747–4752.

[94] M. Koseoglu, “Lower bounds on the LTE-A average random accessdelay under massive M2M arrivals,” IEEE Trans. Commun., vol. 64,no. 5, pp. 2104–2115, May 2016.

[95] DIGI, “What are the difference between LTE-M and NB-IoT CellularProtocols,” available online: https://www.digi.com/, last accessed 15thMarch 2018.

[96] I. Lusky, “CAT-M1 vs NB-IoT-examining the real differences,” avail-able online: https://www.iot-now.com/, last accessed 16th March 2018.

[97] S. Andreev, et al., “Efficient small data access for machine-typecommunications in LTE,” in Proc. IEEE Int. Conf. Commun. (ICC),June 2013, pp. 3569–3574.

[98] J. Hu, et al., “Enhanced LTE physical downlink control channeldesign for machine-type communications,” in Proc. 7th Int. Conf. NewTechnologies, Mobility and Security (NTMS), July 2015, pp. 1–5.

[99] 3GPP, “General packet radio service (GPRS) enhancements for evolveduniversal terrestrial radio access network (E-UTRAN) access,” TS23.401 V14.3.0, March 2017.

[100] C. Yu, et al., “Uplink scheduling and link adaptation for narrowbandinternet of things systems,” IEEE Access, vol. 5, pp. 1724–1734, 2017.

[101] R. Boisguene, S. C. Tseng, C. W. Huang, and P. Lin, “A survey onNB-IoT downlink scheduling: Issues and potential solutions,” in Proc.13th Int. Wireless Commun. and Mobile Computing Conf. (IWCMC),June 2017, pp. 547–551.

[102] Y. P. E. Wang, et al., “A primer on 3GPP narrowband internet ofthings,” IEEE Commun. Mag., vol. 55, no. 3, pp. 117–123, March2017.

[103] D. Wu and R. Negi, “Effective capacity: a wireless link model forsupport of quality of service,” IEEE Trans. Wireless Commun., vol. 2,no. 4, pp. 630–643, July 2003.

[104] F. Ghavimi, Y. W. Lu, and H. H. Chen, “Uplink scheduling andpower allocation for M2M communications in SC-FDMA-based LTE-A networks with QoS guarantees,” IEEE Trans. Veh. Technol., vol. 66,no. 7, pp. 6160–6170, July 2017.

[105] H. G. Myung, J. Lim, and D. J. Goodman, “Single carrier FDMA foruplink wireless transmission,” IEEE Veh. Technol. Mag., vol. 1, no. 3,pp. 30–38, Sept. 2006.

[106] A. Aijaz, et al., “Energy-efficient uplink resource allocation in lte net-works with M2M/H2H co-existence under statistical QoS guarantees,”IEEE Trans. Commun., vol. 62, no. 7, pp. 2353–2365, July 2014.

[107] C. W. Park, D. Hwang, and T. J. Lee, “Enhancement of IEEE 802.11ahMAC for M2M communications,” IEEE Commun. Letters, vol. 18,no. 7, pp. 1151–1154, July 2014.

[108] N. Afrin, J. Brown, and J. Y. Khan, “Performance analysis of anenhanced delay sensitive lte uplink scheduler for M2M traffic,” in Proc.2013 Australasian Telecommun. Networks and Applications Conf., Nov.2013, pp. 154–159.

[109] E. Dahlman, et al., “5G wireless access: requirements and realization,”IEEE Commun. Mag., vol. 52, no. 12, pp. 42–47, Dec. 2014.

[110] N. A. Johansson, Y. P. E. Wang, E. Eriksson, and M. Hessler, “Radioaccess for ultra-reliable and low-latency 5G communications,” in Proc.IEEE Int. Conf. Commun. Workshop (ICCW), June 2015, pp. 1184–1189.

[111] G. Durisi, et al., “Short-packet communications over multiple-antennarayleigh-fading channels,” IEEE Trans. Commun., vol. 64, no. 2, pp.618–629, Feb. 2016.

[112] M. Mousaei and B. Smida, “Optimizing pilot overhead for ultra-reliableshort-packet transmission,” in Proc. IEEE Int. Conf. Commun. (ICC),May 2017, pp. 1–5.

[113] F. Schaich, T. Wild, and Y. Chen, “Waveform contenders for 5G-suitability for short packet and low latency transmissions,” in Proc.IEEE 79th Veh. Technol. Conf. (VTC Spring), May 2014, pp. 1–5.

[114] D. S. Yoo, et al., “Coding and modulation for short packet transmis-sion,” IEEE Trans. Veh. Technol., vol. 59, no. 4, pp. 2104–2109, May2010.

[115] X. Sun, et al., “Short-packet downlink transmission with non-orthogonal multiple access,” IEEE Trans. Wireless Commun., pp. 1–1,2018.

[116] D. Aziz, H. Bakker, A. Ambrosy, and Q. Liao, “Signalling minimiza-tion framework for short data packet transmission in 5G,” in Proc.IEEE 84th Veh. Technol. Conf. (VTC-Fall), Sept. 2016, pp. 1–6.

[117] K. F. Trillingsgaard and P. Popovski, “Downlink transmission of shortpackets: Framing and control information revisited,” IEEE Trans.Commun., vol. 65, no. 5, pp. 2048–2061, May 2017.

[118] K. Balachandran, J. H. Kang, K. Karakayali, and K. M. Rege, “Delay-tolerant autonomous transmissions for short packet communications,”in Proc. IEEE Globecom Workshops (GC Wkshps), Dec. 2016, pp. 1–6.

[119] B. Lee, et al., “Packet structure and receiver design for low latencywireless communications with ultra-short packets,” IEEE Trans. Com-mun., vol. 66, no. 2, pp. 796–807, Feb. 2018.

[120] Q. Yang, et al., “Wireless powered asynchronous backscatter networkswith sporadic short packets: Performance analysis and optimization,”IEEE Internet of Things J., vol. 5, no. 2, pp. 984–997, April 2018.

[121] C. She, C. Yang, and T. Q. S. Quek, “Uplink transmission designwith massive machine type devices in tactile internet,” in Proc. IEEEGlobecom Workshops (GC Wkshps), Dec. 2016, pp. 1–6.

[122] A. H. E. Fawal, et al., “RACH overload congestion mechanism forM2M communication in LTE-A: Issues and approaches,” in Proc. Int.Symp. on Networks, Computers and Commun. (ISNCC), May 2017, pp.1–6.

[123] H. Wu, et al., “FASA: Accelerated S-ALOHA using access historyfor event-driven M2M communications,” IEEE/ACM Trans. Network.,vol. 21, no. 6, pp. 1904–1917, Dec. 2013.

[124] R. G. Cheng, et al., “Modeling and analysis of an extended access bar-ring algorithm for machine-type communications in LTE-A networks,”IEEE Trans. on Wireless Commun., vol. 14, no. 6, pp. 2956–2968, June2015.

[125] J. P. Cheng, C. h. Lee, and T. M. Lin, “Prioritized random access withdynamic access barring for RAN overload in 3GPP LTE-A networks,”in Proc. IEEE GLOBECOM Workshops (GC Wkshps), Dec. 2011, pp.368–372.

[126] T. M. Lin, et al., “PRADA: Prioritized random access with dynamicaccess barring for MTC in 3GPP LTE-A networks,” IEEE Trans. Veh.Technol., vol. 63, no. 5, pp. 2467–2472, June 2014.

[127] G. Farhadi and A. Ito, “Group-based signaling and access control forcellular machine-to-machine communication,” in Proc. IEEE 78th Veh.Technol. Conf., Sept. 2013, pp. 1–6.

[128] T. H. Chuang, M. H. Tsai, and C. Y. Chuang, “Group-based up-link scheduling for machine-type communications in LTE-advancednetworks,” in Proc. IEEE Int. Conf. Advanced Info. Networking andApplications Workshops, March 2015, pp. 652–657.

[129] N. K. Pratas, H. Thomsen, C. Stefanovi, and P. Popovski, “Code-expanded random access for machine-type communications,” in Proc.2012 IEEE Globecom Workshops, Dec 2012, pp. 1681–1686.

[130] H. S. Jang, et al., “Spatial group based random access for M2Mcommunications,” IEEE Communications Letters, vol. 18, no. 6, pp.961–964, June 2014.

[131] H. M. Grsu, M. Vilgelm, W. Kellerer, and M. Reisslein, “Hybridcollision avoidance-tree resolution for M2M random access,” IEEETrans. Aerospace and Electronic Systems, vol. 53, no. 4, pp. 1974–1987, Aug. 2017.

[132] G. C. Madueno, C. Stefanovic, and P. Popovski, “Efficient LTE accesswith collision resolution for massive M2M communications,” in Proc.2014 IEEE Globecom Workshops (GC Wkshps), Dec. 2014, pp. 1433–1438.

[133] A. J. E. M. Janssen and M. J. de Jong, “Analysis of contention treealgorithms,” IEEE Trans. Info. Th., vol. 46, no. 6, pp. 2163–2172, Sept.2000.

[134] Y. Ruan, W. Wang, Z. Zhang, and V. K. N. Lau, “Delay-aware mas-sive random access for machine-type communications via hierarchicalstochastic learning,” in Proc. IEEE Int. Conf. Commun. (ICC), May2017, pp. 1–6.

[135] L. Alonso, R. Ferrus, and R. Agusti, “WLAN throughput improvementvia distributed queuing MAC,” IEEE Commun. Letters, vol. 9, no. 4,pp. 310–312, April 2005.

[136] E. Kartsakli, et al., “Cross-layer enhancement for WLAN systems withheterogeneous traffic based on DQCA,” IEEE Commun. Mag., vol. 46,no. 6, pp. 60–66, June 2008.

[137] M. Li, et al., “Energy-efficient M2M communications with mobile edgecomputing in virtualized cellular networks,” in Proc. IEEE Int. Conf.Commun. (ICC), May 2017, pp. 1–6.

[138] L. Li, X. He, and H. Li, “Learning the spectrum using collaborativefiltering in directional millimeter wave networks,” in Proc. IEEEGLOBECOM, Dec. 2017, pp. 1–7.

[139] V. Rakovic and L. Gavrilovska, “Novel RAT selection mechanismbased on hopfield neural networks,” in Proc. Int. Congress on UltraModern Telecommun. and Control Systems, Oct. 2010, pp. 210–217.

[140] N. Kato, et al., “The deep learning vision for heterogeneous networktraffic control: Proposal, challenges, and future perspective,” IEEEWireless Commun., vol. 24, no. 3, pp. 146–153, June 2017.

Page 37: IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 1 … · Machine-Type Communication (MTC) devices is leading to the ... Term Evolution (LTE) for Machine-Type Communications (LTE-M)

IEEE COMMUNICATIONS SURVEYS & TUTORIALS (DRAFT) 37

[141] R. C. Daniels, C. Caramanis, and R. W. Heath, “A supervised learningapproach to adaptation in practical MIMO-OFDM wireless systems,”in Proc. IEEE GLOBECOM, Nov. 2008, pp. 1–5.

[142] T. Desai and H. Shah, “Energy efficient link adaptation using machinelearning techniques for wireless OFDM,” in Proc. Int. Conf. InventiveComputation Technologies (ICICT), vol. 3, Aug. 2016, pp. 1–4.

[143] S. K. Pulliyakode and S. Kalyani, “Reinforcement learning techniquesfor outer loop link adaptation in 4G/5G systems,” ArXiv e-prints, 2017.

[144] M. Chen, et al., “Data-driven computing and caching in 5G networks:Architecture and delay analysis,” IEEE Wireless Communications,vol. 25, no. 1, pp. 70–75, Feb. 2018.

[145] Y. W. Blankenship, et al., “Link error prediction methods for multicar-rier systems,” in IEEE 60h Veh. Technol. Conf., vol. 6, Sept. 2004, pp.4175–4179.

[146] Y. Park, et al., “A new link adaptation method to mitigate sinr mis-match in ultra-dense small cell LTE networks,” IEEE Trans. WirelessCommun., vol. PP, no. 99, pp. 1–1, 2017.

[147] T. Park and W. Saad, “Learning with finite memory for machine typecommunication,” in Proc. Annual Conf. on Information Science andSystems (CISS), March 2016, pp. 608–613.

[148] M. A. Alsheikh, S. Lin, D. Niyato, and H. P. Tan, “Machine learn-ing in wireless sensor networks: Algorithms, strategies, and applica-tions,” IEEE Commun. Surveys Tuts., vol. 16, no. 4, pp. 1996–2018,Fourthquarter 2014.

[149] R. van Loon, “Machine Learning Explained: UnderstandingSupervised, Unsupervised, and Reinforcement Learning,”https://www.datasciencecentral.com/profiles/blogs/machine-learning-explained-understanding-supervised-unsupervised, online; accessed5th March 2018.

[150] J. Yoo and K. H. Johansson, “Semi-supervised learning for mobilerobot localization using wireless signal strengths,” in Proc. Int. Conf.on Indoor Positioning and Indoor Navigation (IPIN), Sept 2017, pp.1–8.

[151] K. Shah and M. Kumar, “Distributed independent reinforcement learn-ing (DIRL) approach to resource management in wireless sensornetworks,” in IEEE Int. Conf. Mobile Adhoc and Sensor Systems, Oct.2007, pp. 1–9.

[152] Y. Chu, et al., “Application of reinforcement learning to medium accesscontrol for wireless sensor networks,” Engineering Applications ofArtificial Intelligence, pp. 23–32, Nov. 2015.

[153] A. D. Tijsma, M. M. Drugan, and M. A. Wiering, “Comparingexploration strategies for Q-learning in random stochastic mazes,” inIEEE Symp. Series on Computational Intelligence (SSCI), Dec. 2016,pp. 1–8.

[154] D. L. Pooole and Alan K. Mackworth, “Artificial Intelligence: Foun-dations of Computational Agents,” in Cambridge University Press,Second Edition, 2017.

[155] C. g. Li, et al., “Multi-intersections traffic signal intelligent controlusing collaborative Q-learning algorithm,” in Proc. Seventh Int. Conf.on Natural Computation, vol. 1, July 2011, pp. 185–188.

[156] A. R. Rosyadi, T. A. B. Wirayuda, and S. Al-Faraby, “Intelligent trafficlight control using collaborative Q-learning algorithms,” in Proc. 4thInt. Conf. Info. and Commun. Technol., May 2016, pp. 1–6.

[157] C. Lamini, Y. Fathi, and S. Benhlima, “Collaborative Q-learningpath planning for autonomous robots based on holonic multi-agentsystem,” in Proc. 10th Int. Conf. on Intelligent Systems: Theories andApplications, Oct. 2015, pp. 1–6.

[158] S. Kar, J. M. F. Moura, and H. V. Poor, “QD-learning: A collaborativedistributed strategy for multi-agent reinforcement learning throughconsensus+innovations,” IEEE Trans. Signal Process., vol. 61, no. 7,pp. 1848–1862, April 2013.

[159] J. Ni, M. Liu, L. Ren, and S. X. Yang, “A multiagent Q-learning-basedoptimal allocation approach for urban water resource managementsystem,” IEEE Trans. Automation Science and Engg., vol. 11, no. 1,pp. 204–214, Jan. 2014.

[160] J. Hu and M. P. Wellman, “NASH Q-learning for general-sum stochas-tic games,” J. Mach. Learning Res., vol. 4, no. 6, pp. 1039-1069, 2003.

[161] S. Choi, et al., “Automatic configuration of random access channelparameters in LTE systems,” in Proc. IFIP Wireless Days (WD), Oct.2011, pp. 1–6.

[162] M. Rahimiyan and H. R. Mashhadi, “An adaptive Q-learning algorithmdeveloped for agent-based computational modeling of electricity mar-ket,” IEEE Trans. Systems, Man, and Cybernetics, Part C (Applicationsand Reviews), vol. 40, no. 5, pp. 547–556, Sept. 2010.

[163] K.-S. Hwang, S.-W. Tan, and C.-C. Chen, “Cooperative strategy basedon adaptive Q-learning for robot soccer systems,” IEEE Trans. FuzzySystems, vol. 12, no. 4, pp. 569–576, Aug. 2004.

[164] P. Y. Glorennec and L. Jouffe, “Fuzzy Q-learning,” in Proc. Int. FuzzySystems Conf., vol. 2, Jul 1997, pp. 659–662.

[165] P. Jamshidi, et al., “Self-learning cloud controllers: Fuzzy Q-learningfor knowledge evolution,” in Proc. Int. Conf. Cloud and AutonomicComputing, Sept 2015, pp. 208–211.

[166] M. N. ul Islam and A. Mitschele-Thiel, “Cooperative fuzzy Q-learningfor self-organized coverage and capacity optimization,” in Proc. IEEE23rd Int. Symp. (PIMRC), Sept. 2012, pp. 1406–1411.

[167] Y. Maeda, “Modified Q-learning method with fuzzy state division andadaptive rewards,” in Proc. IEEE Int. Conf. Fuzzy Systems, vol. 2, 2002,pp. 1556–1561.

[168] F. Pervez, et al., “Fuzzy Q-learning-based user-centric backhaul-awareuser cell association scheme,” in Proc. 13th Int. Wireless Commun. andMobile Computing Conf., June 2017, pp. 1840–1845.

[169] M. Mendil, et al., “Fuzzy Q-learning based energy management ofsmall cells powered by the smart grid,” in Proc. IEEE 27th Annual Int.Symp. PIMRC, Sept. 2016, pp. 1–6.

[170] J. Wu, J. Liu, Z. Huang, and S. Zheng, “Dynamic fuzzy Q-learning forhandover parameters optimization in 5G multi-tier networks,” in Proc.Int. Conf. Wireless Commun. Signal Process. (WCSP), Oct 2015, pp.1–5.

[171] C. Wu, S. Ohzahata, and T. Kato, “Flexible, portable, and practicablesolution for routing in vanets: A fuzzy constraint Q-learning approach,”IEEE Trans. Veh. Technol., vol. 62, no. 9, pp. 4251–4263, Nov. 2013.

[172] Z. Deng, et al., “Combining model-based Q-learning with structuralknowledge transfer for robot skill learning,” IEEE Trans. Cognitiveand Developmental Systems, pp. 1–1, 2017.

[173] A. Sharma, et al., “Model based path planning using Q-learning,” inProc. IEEE Int. Conf. Industrial Technology (ICIT), March 2017, pp.837–842.

[174] E. S. El-Alfy, Y.-D. Yao, and H. Heffes, “A model-based Q-learningscheme for wireless channel allocation with prioritized handoff,” inProc. IEEE GLOBECOM, vol. 6, 2001, pp. 3668–3672.

[175] J. Cao, M. Ma and H. Li, “A group-based authentication and keyagreement for MTC in LTE networks,” in Proc. IEEE Global Commu-nications Conference (GLOBECOM), Anaheim, CA, 2012, pp. 1017-1022.

[176] S. Bhandari, S. K. Sharma and X. Wang, “Device Grouping for Fastand Efficient Channel Access in IEEE 802.11ah based IoT Networks,”in Proc. IEEE Int. Conf. Commun. (ICC) Workshops, June 2018.

[177] H. Li, K. Ota, and M. Dong, “Learning IoT in edge: Deep learning forthe internet of things with edge computing,” IEEE Network, vol. 32,no. 1, pp. 96–101, Jan. 2018.

[178] M. Song, et al., “In-Situ AI: Towards Autonomous and IncrementalDeep Learning for IoT Systems,” in 2018 Proc. IEEE Int. Symp. HighPerformance Computer Architecture (HPCA), Vienna, Feb. 2018, pp.92-103.

[179] S. Venkataramani, K. Roy, and A. Raghunathan, “Efficient embeddedlearning for IoT devices,” in Proc. 21st Asia and South Pacific DesignAutomation Conference (ASP-DAC), Jan. 2016, pp. 308–311.