Machine Learning for Resource Management in Cellular and IoT Networks: Potentials, Current Solutions, and Open Challenges

Fatima Hussain, Syed Ali Hassan, Rasheed Hussain, Ekram Hossain

Abstract—Internet-of-Things (IoT) refers to a massively heterogeneous network formed through smart devices connected to the Internet. In the wake of disruptive IoT, with its huge amount and variety of data, Machine Learning (ML) and Deep Learning (DL) mechanisms will play a pivotal role in bringing intelligence to IoT networks. Among other aspects, ML and DL can play an essential role in addressing the challenges of resource management in large-scale IoT networks. In this article, we conduct a systematic and in-depth survey of ML- and DL-based resource management mechanisms in cellular wireless and IoT networks. We start with the challenges of resource management in cellular IoT and low-power IoT networks, review the traditional resource management mechanisms for IoT networks, and motivate the use of ML and DL techniques for resource management in these networks. Then, we provide a comprehensive survey of the existing ML- and DL-based resource allocation techniques in wireless IoT networks, including techniques specifically designed for HetNets, MIMO and D2D communications, and NOMA networks. Finally, we identify future research directions in using ML and DL for resource allocation and management in IoT networks.

Index Terms—Internet-of-Things (IoT), Wireless IoT, Machine Learning, Deep Learning, Resource Allocation, Resource Management, D2D, MIMO, HetNets, NOMA.

I. INTRODUCTION

By 2020, it is expected that there will be 4 billion connected people and 25 billion embedded smart systems generating 50 trillion GB of data [1]. In the wake of such statistics, IoT, specifically wireless IoT, has great potential for the future smart world; however, the deployment of IoT on a massive scale also brings along numerous challenges. Some of these challenges include the design of networking and storage architectures for smart devices, efficient data communication protocols, proactive identification and protection of IoT devices and applications from malicious attacks, and standardization of technologies, devices, and application interfaces. Additionally, there are challenges such as device management, data management, diversity, and inter-operability [2], [3].

F. Hussain is with Royal Bank of Canada, Toronto, Canada (email: [email protected]).

S. A. Hassan is with the School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad, Pakistan (email: [email protected]).

R. Hussain is with the Networks and Blockchain Laboratory, Institute of Information Security and Cyber-Physical System, Innopolis University, Innopolis, Russia (email: [email protected]).

E. Hossain is with the Department of Electrical and Computer Engineering at the University of Manitoba, Winnipeg, Canada (email: [email protected]). This work was supported in part by a Discovery Grant from the NSERC, Canada.

For the wireless IoT, since radio resources are inherently scarce, resource management becomes very challenging in practical networks, especially when the number of contending users or smart devices is very large. The overall performance of any wireless network depends on how efficiently and dynamically hyper-dimensional resources such as time slots, frequency bands, and orthogonal codes are managed, and also on how well the variable traffic loads and wireless channel fluctuations are incorporated into the network design in order to provide connectivity among users with diverse Quality-of-Service (QoS) requirements. The development of new IoT applications calls for improved spectral efficiency, high data rates and throughput, as well as consideration of context and personalization of IoT services; consequently, the problem of resource management becomes crucial [4], [5].

A. Motivation of the Work

1) Resource management issues in IoT networks:

• Massive channel access: When a massive number of devices access the wireless channel simultaneously, they overload the channel (e.g., the Physical Random Access CHannel (PRACH) in LTE networks). Also, a massive deployment of IoT devices brings challenges related to network congestion, networking and storage architecture for smart devices, and efficient data communication protocols [2], [3]. Solutions for combating congestion in the access channel for IoT devices are mostly based on using different access probabilities and prioritization among different classes. Resource management for cellular IoT networks needs to consider load balancing and access management techniques to support massive capacity and connectivity while utilizing the network resources efficiently.

• Power allocation and interference management: In a dense (and large-scale) IoT network, intra- and inter-cell interference problems become crucial, and therefore, sophisticated power allocation and interference management techniques are required. Many IoT applications can be of a heterogeneous nature with potentially varying network topology, traffic, as well as channel characteristics. Therefore, it is very challenging to choose the transmit power dynamically in response to changing physical channel and network conditions.

• User association (or cell selection) and hand-off management: In cellular IoT applications, the devices have to

arXiv:1907.08965v1 [cs.NI] 21 Jul 2019


TABLE I
EXISTING SURVEYS ON IOT

Year | Paper | Topic(s) of the survey | Related content in our paper | Enhancements in this article
2015 | [6]  | Enabling technologies, protocols and applications of IoT | Sec. II | Enhanced list of papers and coverage of more state-of-the-art solutions with focus on ML and DL
2017 | [7]  | Communication standards in IoT | N/A | ML- and DL-based generic solutions in IoT
2016 | [8]  | Network types, technologies and their characteristics | Sec. II | Focus on state-of-the-art solutions in IoT and coverage of resource management, security, and privacy
2017 | [9]  | Health-care communications standards in IoT | N/A | Focus on generic IoT and in-depth coverage of the solutions
2018 | [10] | Architecture, scheduling, network technologies, and power management of IoT operating systems | N/A | Focus on the current solutions for generic IoT applications and future research directions
2018 | [11] | Blockchain and SDN solutions for IoT | N/A | Generic IoT and in-depth review of the ML- and DL-based solutions in multiple domains
2018 | [12] | Resource scheduling techniques in IoT | Sec. II | Detailed investigation of resource management in IoT and state-of-the-art techniques
2016 | [13] | Secure routing protocols in IoT | N/A | Focus on the applications and resource management with security and privacy
2017 | [14] | Trust models for service management in IoT | N/A | Coverage of overall aspects of security with solutions based on ML and DL
2019 | [15] | Data fusion in IoT applications through ML | N/A | In-depth coverage of ML and DL in different aspects of IoT
2018 | [16] | IoT data analytics through DL | N/A | In-depth review of ML- and DL-based solutions in IoT

associate with a Base Station (BS) or a gateway node to connect to the Internet. Again, the association may not necessarily be symmetric in the uplink and downlink scenarios (i.e., an IoT device may associate with different BSs on the uplink and the downlink). This association or cell selection affects the resource allocation. Also, for mobile IoT devices, resource management methods need to consider the possibility of hand-offs between different cells (e.g., macrocells, microcells).

• Harmonious coexistence of Human-to-Human (H2H) and IoT traffic: For harmonious coexistence between the existing and new IoT traffic, intelligent resource management (e.g., for channel and power allocation/interference management and user/device association) is crucial so that both types of communications can co-exist with their requirements satisfied.

• Coverage extension: IoT devices are essentially low-power devices with limited computational and storage capabilities, in contrast to traditional devices such as smart phones and tablets. This reduces their coverage range. Therefore, to extend the coverage for IoT services, innovative resource management solutions are required. IoT devices can exploit Device-to-Device (D2D) or relay-based communication techniques for coverage extension. Dynamic resource allocation methods will be required for such communication scenarios.

• Energy management: Energy-efficient communication is very important since most of the sensors and actuators in IoT are battery-powered with limited battery capacity and charging facilities. The design of energy-efficient resource allocation and communication protocols is essential to provide satisfactory QoS and support in dynamic network environments.

• Real-time processing and Ultra-Reliable and Low Latency Communication (URLLC): Mission-critical IoT applications such as remote surgery, public safety, and Intelligent Transport Systems (ITS) require low-latency communication and ultra-reliable connectivity. This calls for reliable and delay-aware resource allocation techniques.

• Heterogeneous QoS: Since IoT networks are comprised of heterogeneous devices with different capabilities, characteristics, and communication requirements, they will require specialized and customized resources specific to their requirements.

2) Limitations of traditional resource management techniques: Conventionally, resource allocation problems are solved using optimization methods that consider the instantaneous CSI and the QoS requirements of the users. Generally, the resource allocation problems are non-convex and, therefore, the solutions obtained using traditional techniques are not globally optimal. Also, the solutions may not be obtainable in real time. For practical implementation, more efficient solutions (from a computational as well as a performance point of view) are needed. In some scenarios, where the resource allocation problem may not be well-defined mathematically (e.g., due to the nature of the propagation environment or the mobility patterns of the users), we may not be able to formulate an optimization problem at all. This motivates the exploration of new resource allocation schemes. In this context, data-driven ML-based resource allocation solutions would be viable in such scenarios and should be adaptable to the dynamic nature of IoT networks.
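To make the non-convexity point concrete, the following toy sketch (our own illustration with invented channel gains, not an example from the paper) evaluates the sum-rate of two mutually interfering links over a grid of transmit powers. With strong cross-interference, the maximizers sit on the boundary of the feasible set rather than at an interior stationary point, which is why generic convex solvers do not apply directly:

```python
import numpy as np

def sum_rate(p1, p2, g11=1.0, g22=1.0, g12=0.5, g21=0.5, noise=0.1):
    """Sum of Shannon rates for two interfering links (gains are invented)."""
    sinr1 = g11 * p1 / (noise + g21 * p2)   # link 1 sees link 2 as interference
    sinr2 = g22 * p2 / (noise + g12 * p1)   # and vice versa
    return np.log2(1 + sinr1) + np.log2(1 + sinr2)

# Evaluate the objective over a grid of power pairs in [0, 1].
grid = np.linspace(0.0, 1.0, 101)
rates = np.array([[sum_rate(p1, p2) for p2 in grid] for p1 in grid])

# With these gains the maximizer is a boundary point (one link silenced),
# not an interior stationary point; the surface is non-convex, so a local
# search started from full power on both links can get stuck.
best = np.unravel_index(rates.argmax(), rates.shape)
print("best power pair:", grid[best[0]], grid[best[1]])
```

Exhaustive grid search works for two links but scales exponentially with the number of users, which mirrors the complexity argument made above.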

3) Why Machine Learning (ML) for resource management?: ML can be defined as the ability to infer knowledge from data and subsequently use that knowledge to adapt the behavior of an ML agent. ML techniques have been used in tasks such as classification,


regression, and density estimation. IoT devices generate a vast amount of data, and data-driven ML techniques can exploit these data to develop automated solutions for IoT services. When the available data is massive and multi-dimensional, ML, or more specifically, Deep Learning (DL), can be used for feature extraction and useful classification.

Also, we can use ML techniques when prior knowledge of the system, network, users, and parameters is not available and needs to be predicted, while control decisions still need to be made. One such technique is Reinforcement Learning (RL), in which the unknown parameters and system behavior are monitored over time through trial and error, and optimal control actions are learned. For resource management problems, ML is recommended when there is a model deficit or an algorithm deficit [17]. A model deficit means that there is insufficient domain knowledge or no existing mathematical model, while an algorithm deficit means that a well-established mathematical model exists but optimizing the existing algorithms using these models is very complex. In such situations, lower-complexity solutions using ML are preferred. Also, when contextual information is very important to incorporate into the decision-making process, ML techniques are most suitable. Most IoT applications fall under the above characterization because of unknown system or network states and parameter values, and the large number of devices generating enormous amounts of data.
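The trial-and-error idea behind RL can be sketched minimally as follows. This is a toy, self-contained example of our own: an IoT device repeatedly picks one of three channels whose success probabilities (invented here) are unknown to it, and a stateless Q-learning rule learns which channel to prefer:

```python
import random

# Hypothetical setting: an IoT device picks one of 3 channels per slot.
# The per-channel success probabilities below are invented and are
# unknown to the learner; it must discover them by trial and error.
SUCCESS_PROB = [0.2, 0.8, 0.5]
N_CHANNELS = len(SUCCESS_PROB)

Q = [0.0] * N_CHANNELS           # Q[a]: estimated reward of channel a
alpha, epsilon = 0.1, 0.1        # learning rate, exploration rate

random.seed(0)
for _ in range(5000):
    # Epsilon-greedy: explore a random channel occasionally, else exploit.
    if random.random() < epsilon:
        a = random.randrange(N_CHANNELS)
    else:
        a = max(range(N_CHANNELS), key=lambda i: Q[i])
    # Reward 1 for a successful transmission, 0 otherwise.
    reward = 1.0 if random.random() < SUCCESS_PROB[a] else 0.0
    # Stateless Q-learning update (an exponential running average).
    Q[a] += alpha * (reward - Q[a])

print("learned channel values:", [round(q, 2) for q in Q])
```

Note that no model of the channel was required; this is exactly the "model deficit" case described above, where the agent learns the control policy directly from observed rewards.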

To discuss the importance and requirements of ML techniques in various IoT applications, we focus on resource management in cellular IoT networks and the smart home environment:

• Resource allocation in cellular IoT networks: Cellular IoT networks (e.g., 5G cellular and beyond) should support extremely high data rates and require continuous connectivity to serve heterogeneous devices and users with varying QoS requirements. The transmitting devices are expected to autonomously connect to an Access Point (AP) or Base Station (BS) and access the frequency channels to achieve an acceptable Signal-to-Interference Ratio (SIR). Furthermore, careful channel and power allocation is required to support various IoT applications simultaneously. For this purpose, obtaining network parameters such as CSI, traffic characteristics, and the demands of the users/devices is challenging, especially when serving highly mobile users. Also, estimation of communication parameters such as the Angle of Arrival (AoA) and Angle of Departure (AoD) is important for millimeter wave communications.

To address the above challenges, ML can be a suitable solution from both the network and user perspectives. IoT networks (network entities such as the eNodeB) are expected to learn the context and diverse characteristics of both human and machine users in order to achieve optimal system configurations. Smart devices are expected to be capable of autonomously accessing the available spectrum with the aid of learning, and of adaptively adjusting the transmission power to avoid interference and conserve energy. For this purpose, Deep Reinforcement Learning (DRL) and linear regression can be used. In millimeter Wave (mmWave) communications, the values of CSI and AoA, which are not known a priori, can be estimated based on an RL technique, e.g., Q-learning. Location predictions for high-mobility users can be obtained using Support Vector Machines (SVMs) and Recurrent Neural Networks (RNNs) [18], [19], and these location estimates can be used for resource allocation.

Various correlated system parameters such as the users' position and velocity, the propagation environment, and the interference in the system can be taken into account to develop ML-based resource allocation methods. Furthermore, techniques such as Principal Component Analysis (PCA) and K-means clustering can be used for clustering various types of devices so that the available resources can be allocated according to the QoS requirements of these groups or clusters [20]–[23]. It is worth noting that, traditionally, optimization techniques cannot incorporate context and are unable to react to changing environments [24], [25]. Therefore, using these techniques may result in poor resource utilization. Also, game-theoretic approaches often do not incorporate the heterogeneity of players or dynamic changes in the network topology and number of players for resource allocation [25].
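As an illustration of the clustering idea above, the following sketch (entirely synthetic: the device feature values and group sizes are invented for this example) groups devices by their QoS features with a plain K-means implementation. Note that, like the optimization methods discussed, Lloyd's algorithm only finds a local optimum, so practical deployments typically use k-means++ initialization and multiple restarts:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic device features (values invented): each row is a device
# described by [required data rate (Mbps), latency budget (ms)].
sensors  = rng.normal([0.1, 500.0], [0.05, 50.0], size=(30, 2))  # delay-tolerant
cameras  = rng.normal([5.0, 100.0], [1.0, 20.0], size=(30, 2))   # high rate
critical = rng.normal([1.0, 5.0], [0.3, 1.0], size=(30, 2))      # low latency
X = np.vstack([sensors, cameras, critical])
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the features

def kmeans(X, k, iters=100):
    """Plain Lloyd's algorithm: assign to nearest centroid, recompute."""
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster empties.
        centroids = np.array([X[labels == c].mean(axis=0)
                              if np.any(labels == c) else centroids[c]
                              for c in range(k)])
    return labels, centroids

labels, centroids = kmeans(X, k=3)
# Channel/priority classes can now be provisioned per cluster
# instead of per device.
print("cluster sizes:", np.bincount(labels, minlength=3))
```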

• ML for resource management in the smart home environment: Smart home applications are among the most popular IoT applications and involve a combination of heterogeneous and ubiquitous devices such as security cameras, hand-held scanners, tablets, smart appliances, and wireless sensors. All of these devices randomly access the network resources and have varying access and QoS requirements. ML techniques such as Q-learning and multi-armed bandits can work well for scheduling and random access of the resources in the smart home environment. This is because these reinforcement learning methods can learn and adapt dynamically to variations in the network environment [26], [27]. Also, K-means clustering and PCA can be used to group small sensors and aggregate the small payload data, respectively. Note that traditional optimization- and heuristics-based techniques can be computationally very expensive to run on small, inexpensive, and energy-constrained sensors. Furthermore, traditional game-theoretic approaches may not be suitable due to the heterogeneity of the devices as well as the overhead due to information updates and exchange.
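The multi-armed-bandit idea mentioned above can be sketched as follows. This is a toy, self-contained example of our own: the four devices and their success probabilities are invented, and we use the standard UCB1 rule rather than any specific scheme from the literature:

```python
import math
import random

random.seed(1)

# Hypothetical smart-home setup: 4 devices contend for one uplink slot.
# Each device's transmission succeeds with an unknown probability
# (the values below are invented and hidden from the scheduler).
SUCCESS = [0.3, 0.9, 0.6, 0.45]
counts = [0] * 4     # how often each device was scheduled
values = [0.0] * 4   # empirical success rate per device

def ucb_pick(t):
    """UCB1: favor devices with high observed reward or high uncertainty."""
    for a in range(4):          # schedule every device once first
        if counts[a] == 0:
            return a
    return max(range(4),
               key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))

for t in range(1, 3001):
    a = ucb_pick(t)
    reward = 1.0 if random.random() < SUCCESS[a] else 0.0
    counts[a] += 1
    values[a] += (reward - values[a]) / counts[a]   # incremental mean

print("times scheduled:", counts)
```

The scheduler concentrates slots on the most reliable device while still probing the others, and it adapts automatically if the success probabilities drift, which is the property that makes bandit methods attractive for dynamic home networks.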

B. Scope of the Survey

In the following, we succinctly summarize the published surveys and compare our survey with the existing surveys in terms of the novelty of the covered topics and the depth of the contents.

1) Existing surveys: Table I lists the existing surveys related to IoT networks. Most of the published surveys do not focus on ML and DL techniques in IoT networks. The currently published surveys either focus on particular aspects of IoT or do not cover resource management in a holistic way. In


TABLE II
ACRONYMS AND THEIR DESCRIPTIONS

Acronym | Explanation | Acronym | Explanation
ML | Machine Learning | NOMA | Non-Orthogonal Multiple Access
WSN | Wireless Sensor Networks | DL | Deep Learning
RL | Reinforcement Learning | SVM | Support Vector Machine
PRACH | Physical Random Access Channel | KNN | K-Nearest Neighbour
NB | Naive Bayes | NN | Neural Network
DNN | Deep Neural Network | CNN | Convolutional Neural Network
PCA | Principal Component Analysis | RNN | Recurrent Neural Network
MLP | Multi-Layer Perceptron | ELM | Extreme Learning Machine
ITS | Intelligent Transportation System | ANN | Artificial Neural Network
LSTM | Long Short-Term Memory | GAN | Generative Adversarial Network
V2V | Vehicle-to-Vehicle communication | DRL | Deep Reinforcement Learning
QoS | Quality of Service | CSI | Channel State Information
QoE | Quality of Experience | MIMO | Multiple-Input Multiple-Output
HetNet | Heterogeneous Network | D2D | Device-to-Device Communication
SDN | Software-Defined Network | CDSA | Control-Data plane Separation Architecture
RSSI | Received Signal Strength Indicator | URLLC | Ultra-Reliable and Low Latency Communication
DDoS | Distributed Denial of Service | LPWAN | Low-Power Wide Area Network
NB-IoT | Narrow Band Internet of Things | SINR | Signal-to-Interference-plus-Noise Ratio
BAN | Body Area Network | AML | Adversarial Machine Learning
NE | Nash Equilibrium | PSO | Particle Swarm Optimization
GPU | Graphics Processing Unit | MCS | Modulation and Coding Scheme
DOA | Direction-of-Arrival | D-SCMA | Deep learning aided Sparse Code Multiple Access
BER | Bit Error Rate | OFDM | Orthogonal Frequency Division Multiplexing
DQN | Deep Q Network | SAX | Symbolic Aggregate approXimation
MDP | Markov Decision Process | RBM | Restricted Boltzmann Machine
CSS | Cooperative Spectrum Sensing | CRN | Cognitive Radio Network
TACL | Transfer Actor-Critic Learning | DPM | Dynamic Power Management
SMDP | Semi-Markov Decision Process | MWIS-DW | Maximum Weighted Independent Set problem with Dependent Weights
FIFO | First In First Out | RRM | Radio Resource Management
SIC | Successive Interference Cancellation | SDR | Software-Defined Radio

Table I, we list the topics covered by the existing surveys and reflect on the same topics (if applicable) in this article. Furthermore, we categorically discuss the enhancements in this article. The surveys in Table I focus on enabling technologies, standards, architectures, scheduling, solutions, routing, data fusion, and analytics. Although Artificial Intelligence (AI)-based and ML techniques have been used in IoT to address different challenges that were not addressed by traditional techniques, the role of ML and DL has not been covered in a holistic way in the existing surveys.

To fill this gap, we investigate, in depth, the role of ML and DL in IoT networks and discuss the ML- and DL-based resource management solutions for wireless IoT. We have covered the latest surveys (up to 2019) in this article. Note that the role of ML and DL in IoT is multi-faceted; therefore, covering the entire spectrum of applications of ML and DL techniques is not possible in one article. We focus only on the resource management aspect of wireless IoT networks.

2) Summary of the contributions: We cover different aspects of resource allocation and management in IoT networks that leverage ML and DL techniques. The objective of this survey is to bridge the gap between the resource allocation requirements of IoT networks and the services provided by ML and DL techniques.

Resource allocation and management are essential for QoS guarantees in IoT applications and therefore need close attention. After covering the resource management challenges

in different IoT networks (e.g., cellular IoT, low-power IoT, cognitive IoT, and mobile IoT networks), we discuss the existing solutions, including those for Device-to-Device (D2D), HetNet, (massive) Multiple-Input Multiple-Output (MIMO), and Non-Orthogonal Multiple Access (NOMA) systems. Then we motivate the use of ML and DL in addressing the resource allocation and management challenges in IoT that were not addressed by the traditional solutions. We then dive deeper into the ML and DL techniques that address the challenges of traditional resource management and are leveraged for resource allocation in heterogeneous IoT networks. We also outline the future research opportunities and directions. A pictorial illustration of the scope of this paper is shown in Fig. 1. The contributions of this article are summarized as follows.

1) We discuss in detail the traditional techniques used for resource allocation in IoT networks and their limitations, which motivate the use of ML and DL techniques.

2) We conduct an in-depth and comprehensive survey of the role of ML and DL techniques in IoT networks.

3) We discuss the existing solutions for IoT networks that leverage ML and DL algorithms, with an emphasis on the major aspects of resource management in IoT networks.

4) Based on the conducted survey of ML- and DL-based resource management in IoT networks supporting D2D, MIMO and massive MIMO, HetNets, and NOMA


Fig. 1. Taxonomy of this survey. [ML: Machine Learning, DL: Deep Learning, RM: Resource Management, DRL: Deep Reinforcement Learning, NOMA: Non-Orthogonal Multiple Access]

technologies, we identify the existing challenges and future research directions. This will stimulate further research in ML and DL for IoT networks.

The rest of the article is organized as follows: In Table II, we list all the acronyms used in the paper. The challenges related to resource management in IoT networks are discussed in Section II. In Section III, we discuss the existing resource management techniques and their limitations, followed by the role of ML and DL in resource management in Section IV. In Section V, we discuss the existing ML-based resource management solutions in IoT networks. Then, in Section VI, we review the ML- and DL-based solutions for resource management for the emerging cellular wireless technologies for IoT networks. In Section VII, we discuss the future research challenges in developing ML- and DL-based solutions for resource management in IoT networks before we conclude the article in Section VIII.

II. RESOURCE MANAGEMENT IN IOT NETWORKS

In the following, we discuss the resource management challenges in four different breeds of IoT networks, i.e., cellular IoT networks, low-power IoT networks, cognitive IoT networks, and mobile IoT networks.

A. Cellular IoT Networks

Cellular IoT networks can generally be characterized by the underlying technology they use. For instance, the resource management techniques in different communication modes of a cellular IoT network (such as the Device-to-Device (D2D) mode, HetNet mode, NOMA mode, or dynamic spectrum access-based mode) are different from those in a homogeneous cellular mode of operation. Generally, the issues related to resource management in D2D systems include session setup and management [28], and channel allocation and interference management in the underlay networks [29], [30]. HetNets also suffer from the problems of co-tier and cross-tier interference [31]. To date, several studies have been conducted on HetNet models and their resource optimization [32], [33]. Recently, NOMA has gained significant attention due to its potential to provide high spectral efficiency. Some papers have already been published that focus on the resource management aspect and the optimization of the network parameters [34], [35]. Traditional methods for resource allocation in cellular IoT networks rely mainly on optimization techniques. These methods face challenges when the number of users becomes large and/or when multi-cell scenarios are considered. This is because the optimization space becomes overwhelmingly large to cater for the entire network, and therefore, the computational complexity of finding solutions increases significantly.

B. Low-Power IoT Networks

Although IoT networks offer both long-range and short-range communications through different technologies, there are many applications that require the energy profile of short-range technologies while demanding long-range communications. Therefore, low-power IoT networks, especially those supporting long-range communications such as Low-Power Wide Area Networks (LPWANs), become inevitable. LPWANs offer long-range communications by limiting the data rates and energy consumption, and have been rigorously studied for various smart applications. Many technologies in the licensed bands (e.g., LTE-M, NB-IoT, and EC-GSM) as well as in the unlicensed bands (e.g., LoRa, SigFox, Weightless) are under consideration for low-power networks, and they have to deal with different types of resource allocation problems [36]. In most cases, these networks are characterized by a large number of nodes, and collisions among packets are avoided through scheduling and duty cycling [37]. Similarly, the interference problems in these networks arising from the underlying modulation schemes result in performance degradation of the system [38].
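The duty-cycling point can be illustrated with a back-of-the-envelope collision estimate. This is our own simplification, not from the paper: a slotted model with independent nodes, each active in a given slot with probability equal to its duty cycle (the 1% figure reflects typical EU868 sub-band limits):

```python
# Probability that a given node's transmission overlaps with at least
# one of the other n-1 nodes, each independently active in a slot with
# probability equal to its duty cycle d: 1 - (1 - d)^(n - 1).
def collision_prob(n_nodes, duty_cycle):
    return 1.0 - (1.0 - duty_cycle) ** (n_nodes - 1)

# Even at a 1% duty cycle (a typical EU868 sub-band cap), the chance
# of overlap grows quickly with network size.
for n in (100, 1000, 10000):
    print(n, "nodes:", round(collision_prob(n, 0.01), 4))
```

Under this crude model, uncoordinated access already collides more often than not at a few hundred nodes, which is why the scheduling-based collision avoidance cited above becomes necessary at scale.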

C. Cognitive IoT Networks

Cognitive networks have been an area of interest since the last decade, and many solutions have been proposed to solve different resource management problems in them. One of the salient features of a cognitive IoT network is that the nodes constantly and opportunistically search for resources that are better suited for them in order to boost the overall performance of the network. Cognitive networks allocate the resources of a 'primary user' to a so-called 'secondary user' when the primary user is absent or idle. However, as soon as the primary device becomes active (and its activity pattern could be random), the secondary user has to vacate that channel. Therefore, resource allocation in these networks, which needs to consider the primary users' activity and QoS requirements as well as the secondary users' QoS requirements, wireless propagation, and other network parameters, is very challenging [39]. Channel sensing, detection, and acquisition are mainly done by static techniques that suffer from various imperfections such as collisions, increased outages, and decreased system throughput [40].

D. Mobile IoT Networks

Mobile IoT (MIoT) is the extension of traditional IoT with mobility [41]. More precisely, in MIoT, the IoT services and applications are mobile and can physically be transferred from one location to another. An orthogonal variant of such an IoT network is the Internet of Mobile Things (IoMT), where mobile things are employed to form an IoT network [42], [43]. Traditional IoT enables communication among static things to realize different services and increase the connectivity among objects and the physical world, whereas in MIoT and IoMT, the communicating entities, i.e. smart things, move while maintaining their inter-connection and accessibility. Traditional IoT scenarios such as the smart home assume static connectivity among objects, whereas mobile objects such as vehicles, mobile robots, and wearable devices add the mobility dimension to the IoT. Typical applications and services realized through IoMT and MIoT include, but are not limited to, smart industry with mobile robots and smart transportation with vehicles.

As mentioned above, the defining feature of IoMT and MIoT is node mobility. Since mobility requires extra control information to be exchanged among the network management entities, resource management in IoMT and MIoT is more challenging than in traditional static IoT networks. In addition to these challenges, the application context is also an important parameter to consider for resource management.

III. EXISTING RESOURCE MANAGEMENT TECHNIQUES AND LIMITATIONS

In this section, we discuss the existing resource management tools and techniques in IoT networks and the enabling technologies.

A. Optimization and Heuristic Techniques

To date, many solutions using optimization and heuristics-based techniques have been proposed for efficient resource allocation in IoT networks. For instance, the authors in [44] used a genetic algorithm for load balancing in fog IoT networks. However, the complexity of the algorithm increases with the number of user requests and the network size. In the same spirit, the authors in [45] proposed a resource allocation approach using a genetic algorithm. They transformed the resource allocation problem into a degree-constrained minimum spanning tree problem and applied a genetic algorithm to solve the problem in a "near-optimal" fashion. However, the approach does not scale well in a dynamic IoT environment and may not be implementable in a practical scenario. Similarly, the authors in [46] used a Particle Swarm Optimization (PSO)-based meta-heuristic for the distribution of blocks of code among IoT devices. They developed multi-component applications capable of being deployed on various IoT nodes, the edge, and the cloud. However, the approach works well only in static environments and lacks applicability in multi-objective and dynamic environments.
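To illustrate how such evolutionary heuristics operate, and their sensitivity to parameters such as population size and mutation rate, the following is a minimal genetic algorithm for a toy load-balancing problem. The task loads and all GA parameters are hypothetical and are not taken from [44] or [45].

```python
import random

random.seed(0)

def fitness(assignment, loads, num_nodes):
    """Negative of the maximum node load: higher means better balanced."""
    totals = [0] * num_nodes
    for task, node in enumerate(assignment):
        totals[node] += loads[task]
    return -max(totals)

def genetic_load_balance(loads, num_nodes, pop_size=30, generations=60):
    n = len(loads)
    # Random initial population of task-to-node assignments.
    pop = [[random.randrange(num_nodes) for _ in range(n)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: fitness(a, loads, num_nodes), reverse=True)
        survivors = pop[: pop_size // 2]            # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = random.sample(survivors, 2)
            cut = random.randrange(1, n)            # one-point crossover
            child = p1[:cut] + p2[cut:]
            if random.random() < 0.2:               # mutation
                child[random.randrange(n)] = random.randrange(num_nodes)
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda a: fitness(a, loads, num_nodes))

loads = [4, 3, 3, 2, 2, 2]        # hypothetical task loads
best = genetic_load_balance(loads, num_nodes=2)
```

With a total load of 16 over two nodes, a perfectly balanced assignment has a maximum node load of 8; the elitist selection guarantees the best solution never degrades across generations, but nothing guarantees the global optimum, which is exactly the limitation discussed above.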

In general, the traditional optimization and heuristics-based techniques have the following characteristics.

• Inaccuracy of the model-based resource allocation methods: Numerous model-oriented methods have been developed in the literature for resource management (e.g. for power allocation and interference management). However, the mathematical models used for these methods, although analytically tractable, are not always accurate due to practical channel environments and hardware imperfections. In this context, model-free and data-driven methods are potentially more promising.

• Cost and complexity: Heuristic techniques (e.g. branch and bound) are also computationally expensive and incur significant timing overhead. Therefore, they are not well-suited for delay-sensitive and real-time IoT applications. Moreover, the computational complexity of these heuristics increases proportionally with the network size. Also, with evolutionary algorithms such as genetic algorithms, designing the objective functions and describing the parameters and the operators can be difficult.

• Convergence to local optima: For sufficiently complex problems (e.g. non-convex problems), the task of obtaining a globally optimal solution in the presence of many local optimum points is very challenging.

• Sub-optimal solutions in high-dimensional problems: With traditional heuristics-based algorithms, the quality of the solutions can degrade with increasing dimensionality of the problem. The heterogeneity and dynamically changing characteristics of IoT networks call for more intelligent, self-adaptive, and auto-configurable techniques.

• Parameter sensitivity and lack of flexibility: The solutions obtained through heuristics-based approaches are sensitive to the corresponding chosen system parameters. If the key decision variables and the constraints change (e.g. due to new devices added to the network), a hard- or pre-coded heuristic technique may no longer remain feasible and may need to be reconfigured to produce viable solutions. Also, traditional heuristic algorithms do not utilize the contexts of the users/devices and the network, and therefore are unable to react to a dynamically changing environment.

B. Game Theoretical Approaches

Game theory is generally used for distributed resource allocation in wireless and IoT networks in the presence of resource competition or cooperation among nodes (using non-cooperative and cooperative game theory techniques, respectively). To date, many game-theoretic mechanisms have been employed to address the resource management issues in IoT networks [47]–[51]. In [50], the authors proposed a non-cooperative game-based resource allocation mechanism for LTE-based IoT networks. The authors focused on reducing the inter-cell and intra-cell interference and proposed an efficient power allocation algorithm. In another work, Zhang et al. [49], [51] proposed an optimized resource allocation framework for fog-based IoT. Specifically, the authors employed a Stackelberg game among data service operators, subscribers, and the fog nodes. The problem of computing resource allocation was formulated as a game between the service operators and subscribers. Similarly, Semasinghe et al. employed non-conventional game-theoretic approaches for resource management in large-scale IoT networks [47]. The authors used evolutionary, mean-field, minority, and other similar games to solve the resource management problems in IoT networks. Kharrufa et al. [48] used a game theory-based approach to optimize resource allocation, where the mobile nodes compete for network resources and aim at reaching a Nash Equilibrium (NE) solution for resource management. In fact, most game-theoretic models aim at obtaining the NE solution for the resource allocation problem.

TABLE III
EXISTING SURVEYS ON RESOURCE MANAGEMENT USING CONVENTIONAL METHODS

Topic(s) of the survey                                   Related content
Device-to-Device (D2D) Communications                    [28], [29], [30], [52]
Heterogeneous Networks                                   [31], [32], [33]
Vehicular Networks                                       [53], [54]
Multiple Input Multiple Output (MIMO) and Massive MIMO   [55], [56], [57]
Non-Orthogonal Multiple Access (NOMA)                    [34], [35]
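The search for an NE can be illustrated with iterated best response in a toy two-player power-control game. This is a generic textbook-style construction, not the model of any surveyed paper; the power levels, cost coefficient, and utility (rate minus a linear power price, with unit noise power) are all assumptions.

```python
import math

POWERS = [0.0, 0.5, 1.0, 1.5, 2.0]    # discrete power levels (illustrative)
COST = 0.4                             # hypothetical price per unit power

def utility(p_own, p_other):
    """SINR-based rate minus a linear power cost (unit noise power)."""
    return math.log2(1 + p_own / (1 + p_other)) - COST * p_own

def best_response(p_other):
    """Power level maximizing this player's utility against p_other."""
    return max(POWERS, key=lambda p: utility(p, p_other))

# Iterated (sequential) best response: each player in turn optimizes
# against the other's current power until neither wants to deviate,
# i.e. until a Nash equilibrium is reached.
p = [2.0, 2.0]
for _ in range(50):
    prev = list(p)
    p[0] = best_response(p[1])    # player 0 responds first
    p[1] = best_response(p[0])    # then player 1 responds
    if p == prev:
        break
```

The sequential update order matters: simultaneous updates can oscillate between two profiles in this game, whereas the Gauss-Seidel-style iteration above settles on a fixed point where each power is a best response to the other, which is the defining property of an NE.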

Despite their advantages, game-theoretic approaches for resource management have some limitations, which we discuss below.

• Homogeneity of players and availability of network information: Generally, NE solutions are based on the assumption that all the players are homogeneous, have equal/similar capabilities, and have complete network information. However, heterogeneous IoT devices have different computational capabilities, use different communication technologies, and exhibit different behaviors. Therefore, the conventional NE solutions may not be appropriate for a practical IoT environment. Also, developing game models and obtaining the corresponding solutions for a system with incomplete and/or imperfect information can be very challenging.

• Lack of scalability and slow convergence: The number of players plays an important role in the applicability of game-theoretic approaches. When the number of players increases, obtaining the solution of a traditional game model becomes more complicated. Although specific game models such as evolutionary games or mean-field games are suitable for modeling competition among a large number of nodes (i.e. players) [47], due to their poor convergence properties, the distributed solutions obtained using these approaches may not be feasible for practical systems. Therefore, in an IoT network with a large number of devices, the use of game-theoretic approaches for resource allocation can be limited.

• Information exchange overhead: Implementation of game-theoretic approaches may require a significant amount of information exchange (which leads to delay), which is not practical due to the massive number of devices in IoT networks. These devices have limited memory and energy, and cannot maintain records of all the actions of the other devices in the network.

IV. MACHINE LEARNING AND DEEP LEARNING FOR RESOURCE MANAGEMENT

In this section, we discuss the basics of ML, DL, and DRL1 and their suitability for resource management in IoT networks.

1A more detailed description of the ML and DL techniques can be found in [58].


Fig. 2. Machine learning vs. Deep learning

A. Machine Learning (ML) and Deep Learning (DL) Basics

Broadly speaking, ML is divided into three categories: supervised, unsupervised, and reinforcement learning techniques.

Supervised learning: These techniques use models and labels known a priori, and can estimate and predict unknown parameters. Support Vector Machine (SVM), naive Bayes classifier, Random Forest, and Decision Tree (DT) are the most commonly used supervised learning algorithms for classification and modeling of existing data sets. This family of algorithms is mainly used for channel estimation, localization, and spectrum sensing problems.

Unsupervised learning: These techniques are used with unlabeled data and utilize the input data in a heuristic manner. Potential applications of these algorithms in IoT networks include cell clustering, user association, and load balancing.

Reinforcement Learning (RL): RL techniques do not require training data sets; instead, an agent learns by interacting with the environment [103]. The objective is to learn the optimal policy (i.e. the optimal mapping between states and actions) so that the long-term reward of the agent is maximized. Q-learning is one of the most popular RL techniques. RL techniques can be utilized by IoT devices and sensors for decision making and inference under unknown and dynamically changing network conditions. For instance, RL can be utilized for spectrum sharing and channel access in cognitive radio networks (in the presence of unknown channel availability) and in small cell networks (with unknown resource conditions). It can also be utilized for BS association, especially when the energy and load status of the BSs are unknown. However, RL techniques may suffer from a slow convergence rate (in terms of learning the optimal policy at a system state) and the curse of dimensionality, which can create challenges in dynamic IoT networks.
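A minimal tabular Q-learning sketch of the channel-access setting mentioned above may help; the idle probabilities and hyperparameters are illustrative assumptions. The agent learns, from reward feedback alone, which of two channels is more often idle.

```python
import random

random.seed(1)

# Toy channel-access problem: two channels, channel 1 is idle more often.
# The agent never sees IDLE_PROB; it learns only from observed rewards.
IDLE_PROB = [0.2, 0.8]               # hypothetical idle probabilities
ALPHA, GAMMA, EPS = 0.1, 0.0, 0.1    # one-shot task, so discount GAMMA = 0

Q = [0.0, 0.0]                       # single state, two actions (channels)

for _ in range(5000):
    # epsilon-greedy action selection: mostly exploit, sometimes explore
    a = random.randrange(2) if random.random() < EPS else Q.index(max(Q))
    reward = 1.0 if random.random() < IDLE_PROB[a] else 0.0
    # Q-learning update; with GAMMA = 0 the target is just the reward
    Q[a] += ALPHA * (reward + GAMMA * max(Q) - Q[a])
```

After training, Q[1] approaches the idle probability of the better channel while Q[0] stays low, so the greedy policy picks channel 1; the slow-convergence concern in the text corresponds to how many such interactions are needed before the estimates separate reliably.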

Deep Learning (DL): DL is a branch of ML based on Artificial Neural Networks (ANNs) with multiple hidden layers between the input and output layers. A Deep Neural Network (DNN), when trained, approximates the functional relationship present in a data set, and it is expected that this relationship generalizes to data points not present in the training data set. DL is recommended for solving various problems in prediction, classification, resource allocation, spectral state estimation, object detection, speech recognition, and language translation. DL learns features at multiple abstraction levels, allows a system to learn complex functions, and maps the input to the output directly from data [104]. For instance, as shown in Fig. 2, DL performs feature extraction and classification within the same process, whereas in traditional ML these are two separate processes. In DL, the first layer learns primitive features, such as color and shape in an image, by finding frequently occurring combinations of digitized pixels. These identified features are fed to the next layer, which trains itself to recognize more complex features (combinations of the already identified ones) such as edges, corners, noses, and ears in an image. Those features are in turn fed to the next layer, which identifies even more complex features (e.g. human, animal). This recognition process continues through successive layers until the object is identified. Available open-source DL frameworks/platforms include Torch/PyTorch, Theano, TensorFlow, and MXNet. DL is considered better than traditional ML techniques for the following reasons:

• DL is capable of handling large amounts of data, and DL algorithms scale with the amount of data, which benefits model training and may improve prediction accuracy.

• DL can automatically and hierarchically extract high-level features and the corresponding correlations from the input data. Therefore, the features do not have to be hand-designed by humans, which simplifies and automates the entire process. This automatic feature extraction, which is not very common in traditional ML techniques, can be beneficial for IoT networks where data generated by heterogeneous sources can exhibit non-trivial spatial/temporal patterns.

• IoT applications mostly generate unlabeled or semi-labeled data. DL has the capability to exploit unlabeled data to learn useful patterns in an unsupervised manner (e.g., using the Restricted Boltzmann Machine (RBM) or the Generative Adversarial Network (GAN)). On the other hand, conventional ML algorithms work effectively only when sufficient labeled data is available.

• DL also reduces the computational and time complexity, as a single trained model can be used for various tasks and can achieve multiple objectives without model re-training. Graphics Processing Unit (GPU)-based parallel computing enables DL to make predictions (such as resource allocations) very fast. Also, when used along with technologies such as SDN and cloud computing, DL makes the storage, analysis, and computation of data easier and faster.

However, the major drawback of DL is that a large amount of data is required for training and testing, which may not be available or could be hard to generate.
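As a minimal, self-contained illustration of a neural network learning a non-linear input-output mapping, the following trains a tiny two-layer network on the toy XOR task with plain backpropagation; the architecture, data, and learning rate are illustrative choices, not an IoT model from the surveyed works.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: not linearly separable, so a hidden layer is required.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# A 2-8-1 fully connected network with sigmoid activations.
W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

losses, lr = [], 1.0
for _ in range(2000):
    h = sigmoid(X @ W1 + b1)            # hidden-layer features
    out = sigmoid(h @ W2 + b2)          # network output
    losses.append(float(np.mean((out - y) ** 2)))
    # Backpropagation of the mean-squared-error loss.
    d_out = (out - y) * out * (1 - out) * (2 / len(X))
    d_h = d_out @ W2.T * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)
```

The hidden layer learns intermediate features of the inputs and the output layer combines them, a two-level version of the hierarchical feature extraction described above; the loss decreases steadily over training.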

Deep Reinforcement Learning (DRL): DRL is a combination of DNN and RL in which a DNN serves as the agent's function approximator. In other words, a DNN approximates the optimal policy function, and it is trained using the data obtained through interaction with the environment. Therefore, no prior training data is required. For problems in which the number of system states and the data dimensionality are very large and the environment is non-stationary, using traditional RL to learn the optimal policy may not be efficient (e.g. in terms of convergence rate). By combining DNN and RL, agents can learn by themselves and come up with a policy that maximizes the long-term reward. In this approach, the experience obtained through RL helps the DNN to find the best action in a given state [105]–[109].

TABLE IV
RESEARCH PROBLEMS IN IOT NETWORKS AND APPLIED MACHINE LEARNING TECHNIQUES

Clustering and Data Aggregation:
• K-Means Clustering [23], [22], [59], [60]
• Fuzzy C-Means Clustering [61]
• Supervised Learning [62], [63]
• Deep Learning [64], [65]
• Principal Component Analysis (PCA) [20], [21]

Resource Allocation (Scheduling, Random Access):
• Q-Learning [23], [66], [67]
• Multi-Armed Bandit [68]
• Reinforcement Learning [69], [70]
• DNN [71]
• Simulated Annealing [72]
• Bayes Classification [73]

Traffic Classification and Mobility Prediction:
• Clustering [74]
• Deep Learning [75], [76]
• Recurrent Neural Network
• Supervised Inductive Learning [77]
• SVM [18], [19]

Power Allocation and Interference Management:
• Deep Reinforcement Learning [78]–[80]
• CART Decision Tree [81]
• Linear Regression [82]
• K-Means Clustering [83]
• Supervised Learning [84]
• KNN [85]
• Q-Learning [86]–[88]
• Deep Learning [89]

Resource Discovery, Cell and Channel Selection:
• Deep Learning [90], [91]
• Supervised Learning [92], [93], [94]
• PCA [95]
• Bayesian Learning [96], [97]
• KNN [95], [98]
• Reinforcement Learning [23], [99], [100]
• Support Vector Machine [95], [101], [102]

DRL is broadly divided into two categories: Deep Q-Network (DQN) and policy gradient methods. DQN is a value-based method, while policy gradient is a policy-based method. In a value-based method, a Q-value is calculated for every action in the action space and the action is selected based on the Q-value. In a policy-based method, the optimal policy is learned directly from a policy function (which gives the probability distribution over actions) without computing a value function (i.e. without calculating the Q-functions). In DQN, Q-learning is combined with a DNN, and the state-action value function (Q) is approximated by the DNN. In a policy gradient method, the policy is learned directly by optimizing the deep policy network with respect to the expected future reward using gradient descent2.

DRL is considered a promising technique for future intelligent IoT networks and can be applied to tasks with discrete or continuous state and action spaces, as well as to multi-variable optimization problems. DRL has been proposed for IoT networks for predictive analysis and automation [23]. A detailed description of deep learning methodologies and artificial neural networks can be found in [110], where the authors provide a comprehensive literature survey on deep learning in wireless communication networks, followed by various case studies in which DL has been useful. In the same spirit, the authors in [111] present and discuss various advanced deep Q-learning (DQL) models. DQL inherits the advantages of both DL and Q-learning and finds its usefulness in applications having large state and action spaces. To find the optimal policy and the correlations between the Q-values and the target values (under varying data distributions), two DQN mechanisms, experience replay and the target Q-network, are presented.
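The contrast between the two families can be seen in a minimal policy-gradient (REINFORCE) sketch for a toy two-armed bandit; the payoff probabilities, learning rate, and baseline are illustrative assumptions. A value-based DQN would instead regress Q-values and act greedily on them, whereas here the action distribution itself is the learned object.

```python
import math
import random

random.seed(0)

# Two-armed bandit: arm 1 pays off more often. A softmax policy over
# two action preferences is updated with the REINFORCE estimator.
PAYOFF = [0.3, 0.7]      # hypothetical success probabilities
theta = [0.0, 0.0]       # action preferences (policy parameters)
LR = 0.1

def policy():
    """Softmax action probabilities from the preferences."""
    z = [math.exp(t) for t in theta]
    s = sum(z)
    return [v / s for v in z]

baseline = 0.0
for _ in range(3000):
    probs = policy()
    a = 0 if random.random() < probs[0] else 1        # sample an action
    reward = 1.0 if random.random() < PAYOFF[a] else 0.0
    baseline += 0.05 * (reward - baseline)            # variance-reducing baseline
    # grad of log pi(a) for a softmax policy: 1{i == a} - probs[i]
    for i in range(2):
        grad = (1.0 if i == a else 0.0) - probs[i]
        theta[i] += LR * (reward - baseline) * grad
```

The policy shifts its probability mass toward the better-paying arm without ever estimating a Q-value, which is the "directly from a policy function" route described above.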

2https://medium.com/deep-math-machine-learning-ai/ch-13-deep-reinforcement-learning-deep-q-learning-and-policy-gradients-towards-agi-a2a0b611617e


B. Deep Learning and Resource Management in IoT Networks

A flurry of recent research works has used DL techniques for solving resource allocation problems in IoT networks. For instance, a Q-learning-based algorithm for cognitive radio-based IoT has been proposed in [66]. The proposed algorithm uses DL to optimize the transmission of packets of variable sizes through multiple channels so that the overall network efficiency is maximized. DL has also been used in cognitive IoT networks [91], [112], where a cooperative strategy among secondary users is used to detect the occupancy of multiple primary-user bands. In the same spirit, the authors in [113] proposed DRL-based spectrum sensing approaches using which multiple pairs of secondary users can coexist with the primary users.

DL has also been used for advanced wireless transmission technologies such as MIMO and NOMA. For instance, [114] used DL for high-resolution channel estimation in MIMO systems. [115] proposed a DL technique, referred to as DL-aided Sparse Code Multiple Access (D-SCMA), for code-based NOMA. An adaptive codebook is constructed automatically, which lowers the Bit Error Rate (BER) of the NOMA system. Using a DNN-based encoder and decoder, a decoding strategy is learned and formulated. Simulation results show that the proposed scheme achieves a smaller BER and lower computational time compared with conventional schemes.

DL has also been used for load balancing and interference mitigation in IoT networks. For instance, the authors in [76] developed DL-based load prediction and load-balancing algorithms for IoT networks, and the authors in [79] presented a DL-based interference minimization method. Similarly, [116] used unsupervised DL in an autoencoder that optimizes the transmit parameters to provide diversity gains. In [117], the authors provided a robust DL mechanism to estimate the CSI in an Orthogonal Frequency Division Multiplexing (OFDM) system, where the estimator performs well under limited pilot training and channel distortions.

C. DRL and Resource Management in IoT Networks

DRL has been used for resource allocation problems to achieve high spectral efficiency. For example, the authors in [4] developed a deep Q-learning-based spectrum sharing approach for primary and secondary users, in which the primary users have fixed power control and the secondary users adjust their power after autonomously learning about the shared common spectrum. Similarly, the authors in [118] proposed a novel centralized DRL-based power allocation scheme for a multi-cell system. Specifically, the authors used a Deep Q-learning (DQL) approach to achieve near-optimal power allocation and to maximize the overall network throughput.

Furthermore, in [119], the authors proposed a DRL-based throughput maximization scheme for small cell networks. The DRL algorithm is developed using a specific type of deep Recurrent Neural Network (RNN), the Long Short-Term Memory (LSTM) network, for channel selection, carrier aggregation, and fractional spectrum access.

In [120], a multi-agent DRL method was used for resource allocation in vehicle-to-vehicle (V2V) communications. DRL is used to find the mapping from the local observations of each vehicle, such as the local CSI and interference levels, to the resource allocation decision. Each vehicle (agent) makes optimal transmission power decisions by interacting with the environment. Ye et al. [121] also used a DRL-based approach for resource allocation in vehicular networks. To reduce the transmission overhead, this approach uses local information on the available resources rather than waiting for the global information. The authors in [122] proposed a DL-based approach to predict the resource allocation based on a 24-hour data count in V2V communication. He et al. [123] proposed a DRL-based approach for the optimized orchestration of computing resources in vehicular networks. Tan et al. [124] used DRL in a multi-timescale framework for cache replacement and computing resource management. A detailed discussion of various DRL strategies can be found in [125].

V. ML-BASED RESOURCE MANAGEMENT IN IOT NETWORKS

A summary of the ML-based resource management methods in IoT networks is provided in Table IV. A review of these methods is provided below.

A. Scheduling and Duty Cycling

For energy-efficient operation, the nodes in an IoT network generally send data intermittently and in short packets. In addition to traditional ways of controlling this duty cycling, ML has recently been used to optimize these network- and system-level parameters. In [73], the authors propose an ML-based uplink duty-cycle optimization algorithm for a broad class of IoT networks such as Long Range Wide Area Networks (LoRaWANs). More specifically, the authors optimize the uplink transmission period of LoRa devices using a Bayes classifier, which improves the network energy efficiency. A scheduling method for an IoT network in a smart home application was studied in [67], where the authors compared various Q-learning algorithms for scheduling home appliances.
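To illustrate the flavor of a Bayes-classifier-based decision, here is a Gaussian naive Bayes sketch over a single hypothetical feature (the observed channel-busy ratio before an uplink); the feature, the training data, and the collision labels are made up for illustration and are not those used in [73].

```python
import math

# Hypothetical training data: channel-busy ratio observed before an
# uplink, labeled by whether the transmission later collided.
clear   = [0.05, 0.10, 0.12, 0.08, 0.15]
collide = [0.55, 0.60, 0.48, 0.70, 0.65]

def gaussian_fit(xs):
    """Fit a 1-D Gaussian (mean, variance) to the samples."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var

def log_likelihood(x, mu, var):
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

params = {"clear": gaussian_fit(clear), "collide": gaussian_fit(collide)}

def classify(busy_ratio):
    # Equal class priors assumed; pick the class with larger likelihood.
    return max(params, key=lambda c: log_likelihood(busy_ratio, *params[c]))
```

A scheduler could then lengthen the duty-cycle period whenever the classifier predicts "collide", deferring transmissions to save energy; the actual features and decision rule of [73] may differ.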

Traffic scheduling in IoT systems needs to consider various kinds of traffic, such as delay-tolerant traffic and video and/or voice traffic. In [69], an RL-based traffic scheduling method was proposed that dynamically adapts to traffic variations in a real-life scenario. More specifically, the authors applied RL algorithms to four weeks of real traffic data from a particular city and showed that the network performance almost doubles compared to traditional scheduling. For a cloud-based IoT network, an ML-based task scheduling algorithm was proposed in [71], which uses multiple criteria to optimize the network performance.

B. Resource Allocation

ML can be used to solve resource allocation problems in IoT networks, where a huge data set can be collected and used to train algorithms that produce very robust results for various resource allocation problems. In this context, the authors in [126] use cloud computing and ML together to perform resource allocation in a general wireless network and apply their proposed strategy to beam allocation in a wireless system. A comprehensive analysis of cloud computing-based resource allocation algorithms was also provided in [126]. Similarly, [72] used different ML algorithms in a cloudlet-based network to efficiently optimize the network in terms of distributing the computing and processing functions across the network entities.

Note that, for centralized wireless IoT networks such as cellular networks, resource allocation is generally performed by the BS based on the CSI of the users/devices and their Quality-of-Service (QoS) requirements. However, for distributed IoT networks such as Machine-to-Machine (M2M) and ad-hoc networks, there is no centralized control and the users may not have the channel statistics available for the entire network. Channel allocation tasks in such scenarios can harness the benefits of state-of-the-art ML techniques.

ML-based resource management algorithms have also been studied for video applications in IoT networks. In [127], the authors use a mixture of supervised and unsupervised learning algorithms to optimize the Quality-of-Experience (QoE) of the end users when video streams are transmitted to them over a wireless channel. The learning algorithms extract the quality-rate characteristics of unknown video sequences and manage the simultaneous transmission of the videos over the channels to guarantee a minimum QoE for the users. Heterogeneous cloud radio access networks have also gained attention in the past few years, for which resource allocation and interference management are the key problems. In this context, the authors in [128] propose a centralized ML scheme that limits the interference below a certain threshold while increasing the energy efficiency of the network. A similar study on resource allocation in wireless systems is presented in [70], where the authors use DRL algorithms to manage resources efficiently.

C. Power Allocation and Interference Management

In [81], the authors used distributed Q-learning and CART Decision Tree algorithms for interference and power management in D2D communication networks, reducing the time complexity and improving the system capacity and energy efficiency. In [82], a system was proposed that detects the interference level in the radio channels and adapts the transmit power accordingly. The authors utilized a linear regression algorithm that uses the CSI to predict the transmit power levels in order to minimize interference and power wastage. Similarly, the authors in [80] proposed a hierarchical framework for resource allocation and power management in a cloud computing environment. The authors used a DRL method to develop a Dynamic Power Management (DPM) policy for minimizing power consumption.
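A minimal sketch of the linear-regression idea (closed-form least squares on hypothetical channel-gain/power samples, not the actual model or features of [82]):

```python
# Hypothetical samples: channel gain (dB) vs. the transmit power (dBm)
# that met the SINR target; a least-squares line then predicts the
# required power for newly observed CSI.
gain_db   = [-60.0, -65.0, -70.0, -75.0, -80.0]
power_dbm = [4.0, 9.0, 14.0, 19.0, 24.0]

n = len(gain_db)
mean_x = sum(gain_db) / n
mean_y = sum(power_dbm) / n
# Closed-form simple linear regression: slope = cov(x, y) / var(x).
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(gain_db, power_dbm))
         / sum((x - mean_x) ** 2 for x in gain_db))
intercept = mean_y - slope * mean_x

def predict_power(gain):
    """Predicted transmit power (dBm) for an observed channel gain (dB)."""
    return slope * gain + intercept
```

On this made-up data the fit recovers the exact dB-linear relation (each 1 dB of extra path loss costs 1 dBm of transmit power), so `predict_power(-72.0)` returns 16 dBm; real CSI data would of course be noisy and the fit only approximate.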

Collaborative distributed Q-learning was proposed in [86] to reduce congestion on the access channel for resource-constrained MTC devices in IoT networks. The proposed system finds unique Random Access (RA) slots and reduces the possible collisions. Similarly, in [85], the authors presented a model incorporating ML for low-power transmission among IoT devices using LoRa3. Moreover, the authors in [84] developed a real-time burst-based interference detection and identification model for IoT networks. They developed an analytical model for interference classification by first extracting Spectral Features (SFs) that exploit the physical properties of the target interference signals. Afterwards, they used supervised learning classifiers such as multi-class SVM and Random Forest for the isolation and identification of interference.

D. Clustering and Data Aggregation

Clustering is one of the techniques used to avoid overload and congestion in an IoT network. Clustering can be performed on the basis of parameters such as the Euclidean distance, Signal-to-Interference-plus-Noise Ratio (SINR), device types, and QoS parameters. The most commonly used technique is the K-means clustering algorithm, an unsupervised ML algorithm. K-means clustering partitions the available data set into K clusters, and every data point belongs to the cluster whose centroid is at the least Euclidean distance. In [60], the authors propose a normalization algorithm to compute the distances iteratively, which increases the accuracy of the clustering and also decreases the clustering time.
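A minimal K-means sketch over hypothetical device coordinates (plain Lloyd iterations, without the normalization refinement of [60]) illustrates the assign-then-recompute loop:

```python
import math

def kmeans(points, k, iters=20):
    """Plain K-means: assign each point to its nearest centroid, then
    recompute the centroids, repeating until assignments stabilize."""
    centroids = points[:k]                    # naive initialization
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # New centroid = mean of the assigned points (kept if cluster empty).
        new = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
               for i, cl in enumerate(clusters)]
        if new == centroids:
            break
        centroids = new
    return centroids, clusters

# Two well-separated groups of hypothetical device coordinates.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centroids, clusters = kmeans(points, 2)
```

On this data the two groups of three devices are recovered exactly; the naive "first k points" initialization is the kind of choice that refinements such as the iterative distance normalization in [60] aim to make less critical.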

The data streams generated in an IoT network can be huge, and the data are generally unlabeled. Processing the raw data is computationally expensive, and manually labeling the data is not feasible. In this context, [63] uses the Symbolic Aggregate approXimation (SAX) algorithm to reduce the dimensionality of the data stream, and the data are labeled using density-based clustering, which extracts the number of classes in the data and labels the samples accordingly.
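The dimensionality-reduction step of SAX, Piecewise Aggregate Approximation (PAA), can be sketched as follows; the sensor stream is made up, and full SAX would additionally quantize the segment means into discrete symbols before clustering.

```python
def paa(series, segments):
    """Piecewise Aggregate Approximation: the dimensionality-reduction
    step used by SAX. Each segment is replaced by its mean value."""
    n = len(series)
    size = n // segments        # assumes segments divides n evenly
    return [sum(series[i * size:(i + 1) * size]) / size
            for i in range(segments)]

# A 12-sample hypothetical sensor stream reduced to 4 values before
# symbolization and density-based labeling.
stream = [1, 1, 2, 8, 9, 7, 3, 3, 3, 0, 1, 2]
reduced = paa(stream, 4)
```

The 12 raw samples collapse to 4 segment means, a 3x reduction, while the burst in the middle of the stream is still clearly visible in the reduced representation.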

E. Resource Discovery and Cell Selection

A multi-band spectrum sensing policy was devised in [90], where the authors used an RL-based algorithm to sense the free channels of the primary users. In [92], the authors proposed a cooperative sensing scheme where multiple primary users with complex channel states cooperate using the Extreme Learning Machine (ELM) algorithm to obtain accurate channel state information and classification. This enables the secondary users to successfully access the idle channels that are not in use by the primary users. Spectrum sensing can be very challenging in large-scale heterogeneous IoT networks. The authors in [97] proposed a new spectrum sensing scheme using Bayesian machine learning, in which the heterogeneous spectrum states are inferred through Bayesian inference carried out during the learning process.

In [101], the authors used an SVM-based cooperative spectrum sensing technique. Similarly, in [100], the authors presented an intelligent spectrum mobility management model for cognitive radio networks which can be used by the secondary users to switch among the spectrum bands. An ML scheme known as Transfer Actor-Critic Learning (TACL) was used for spectrum mobility management.

³LoRa is a long-range, low-power radio transmission technology which uses the license-free sub-GHz frequency bands.



For spectrum mobility, two modes are investigated: spectrum hand-off and stay-and-wait. In the spectrum hand-off mode, the user switches to another channel, whereas in the stay-and-wait mode, the user stops transmission until the channel quality improves. Also, the long-term impact of the proposed scheme on network performance was analyzed in terms of throughput, delay, packet dropping rate, and packet error rate. Similarly, in [94], the authors proposed an online learning-based cell load approximation scheme in a wireless network.
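The hand-off versus stay-and-wait decision can be sketched with plain tabular Q-learning on a toy two-state channel model (the states, rewards, and transition probabilities below are illustrative assumptions, not the TACL scheme of [100]):

```python
import random

# Toy spectrum-mobility decision: in each channel-quality state, choose
# HANDOFF (switch channel) or WAIT (stay put / pause until quality improves).
STATES = ["good", "bad"]
ACTIONS = ["handoff", "wait"]

def step(state, action, rng):
    """Illustrative rewards: waiting on a good channel keeps full
    throughput; handing off pays a switching cost; waiting on a bad
    channel earns nothing until the channel recovers."""
    if state == "good":
        return (5.0 if action == "wait" else 3.0), rng.choice(STATES)
    if action == "handoff":   # bad channel: switching usually finds a good one
        return 2.0, "good" if rng.random() < 0.8 else "bad"
    return 0.0, "good" if rng.random() < 0.3 else "bad"

def train(steps=5000, alpha=0.1, gamma=0.9, eps=0.1, seed=1):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    s = "good"
    for _ in range(steps):
        a = rng.choice(ACTIONS) if rng.random() < eps else \
            max(ACTIONS, key=lambda x: q[(s, x)])
        r, s2 = step(s, a, rng)
        # Standard one-step Q-learning update
        q[(s, a)] += alpha * (r + gamma * max(q[(s2, x)] for x in ACTIONS)
                              - q[(s, a)])
        s = s2
    return q

q = train()
# With these toy rewards, the learned policy stays on a good channel
# and hands off on a bad one.
```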

F. Traffic Classification and Mobility Prediction

Traffic engineering with traffic classification as well as traffic (and hence mobility) prediction would be required to accommodate the time-varying IoT traffic [74]. The authors in [77] proposed a methodology for constructing a model of normal behaviour for ongoing network traffic. They proposed a traffic prediction procedure that separates short-term (non-periodic or temporal) and longer-term (hour or day) variations. A longitudinal model of network connections was developed which can be used to identify deviations in network traffic patterns. In [19], the authors surveyed the flow-based statistical features of P2P traffic. They employed an SVM model to develop a system for distinguishing encrypted BitTorrent traffic from usual Internet traffic.

The authors in [129] proposed a learning-based network path planning model. They considered the sequence of nodes in a network path and determined implicit forwarding paths based on empirical data of the network traffic. The authors employed neural networks to capture the data characteristics of the traffic forwarding and routing paths. In [18], the authors proposed a time window method to acquire concise features from the packet headers of network traffic for various applications. They used SVM, a Back Propagation (BP) neural network, and Particle Swarm Optimization (PSO) to train, classify, and afterwards recognize the network traffic. In [130], the authors compared the performance of six widely-used supervised ML algorithms (including Naïve Bayes, Bayes Net, Random Forest, Decision Tree, and Multilayer Perceptron) for classifying network traffic. The authors concluded that the Random Forest and Decision Tree algorithms are the best classifiers for network traffic classification when both classification accuracy and computational efficiency are required.
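As a concrete taste of supervised traffic classification, a minimal Gaussian Naïve Bayes classifier on two hypothetical flow features (mean packet size, mean inter-arrival time); the features, labels, and values are invented for illustration and are not the setup of [130]:

```python
import math
from collections import defaultdict

class GaussianNB:
    """Minimal Gaussian Naive Bayes for small feature vectors."""
    def fit(self, X, y):
        groups = defaultdict(list)
        for xi, yi in zip(X, y):
            groups[yi].append(xi)
        self.stats, self.prior = {}, {}
        for label, rows in groups.items():
            n = len(rows)
            self.prior[label] = n / len(X)
            # Per-feature mean and variance (floored to avoid zero variance)
            self.stats[label] = []
            for col in zip(*rows):
                mu = sum(col) / n
                var = max(sum((v - mu) ** 2 for v in col) / n, 1e-9)
                self.stats[label].append((mu, var))
        return self

    def predict(self, x):
        def log_post(label):
            lp = math.log(self.prior[label])
            for v, (mu, var) in zip(x, self.stats[label]):
                lp += -0.5 * math.log(2 * math.pi * var) \
                      - (v - mu) ** 2 / (2 * var)
            return lp
        return max(self.stats, key=log_post)

# Hypothetical flows: (mean packet size in bytes, mean inter-arrival in ms)
X = [(1400, 2), (1350, 3), (1420, 2), (80, 40), (95, 55), (70, 48)]
y = ["bulk", "bulk", "bulk", "interactive", "interactive", "interactive"]
clf = GaussianNB().fit(X, y)
print(clf.predict((1380, 4)))   # → "bulk"
print(clf.predict((90, 50)))    # → "interactive"
```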

G. Resource Allocation in Mobile IoT Networks

Cellular networks, vehicular networks, and mobile ad hoc networks are examples of mobile IoT networks where the communicating devices are mobile. Various ML techniques have been proposed for modern cellular networks including D2D communication, massive MIMO and mmWave communication technologies, as well as heterogeneous cellular networks (HetNets). We will provide a survey of these techniques in the next section. For Vehicular Ad-hoc NETworks (VANETs), the traditional approaches for resource allocation include dynamic programming [131], [132], Semi-Markov Decision Process (SMDP) [133], the Maximum Weighted Independent Set problem with Dependent Weights (MWIS-DW) [134], and users' aggregate utility maximization [135]. Recently, ML algorithms

have been used to address the resource optimization problem in vehicular networks. For instance, an ML algorithm was applied in [136] to reduce the network overhead in a delay-tolerant routing system where repeated copies of a message signal are discarded.

In the same context, over the last decade, vehicular networks have evolved into a rich service and application space, for instance, vehicular clouds. In vehicular clouds, vehicles can form a cloud, use the services of a cloud, or a combination of the two [137]. There is a plethora of services realized through vehicular clouds, such as traffic information as a service, cooperation as a service, storage as a service, and vehicle witness as a service [138]–[140]. In such scenarios, resource management is of paramount importance for efficient service provisioning. For this purpose, both ML and DL techniques have been proposed to allocate and manage resources in vehicular clouds. In [141], the authors explored the applications of ML and DL in managing the resources of vehicular networks.

In [142], Arkian et al. proposed a resource allocation mechanism for vehicular networks using a cluster-based Q-learning approach. The authors aim at an efficient cluster-head selection mechanism by employing fuzzy logic. The essence of the resource management scheme is to select the vehicle with enough resources to offer, in an efficient and QoS-aware way. In another work, Salahuddin et al. [143] proposed an RL-based technique for resource provisioning in vehicular clouds. The specific feature of the proposed RL-based resource management is that it provisions resources based on long-term benefit, which on the one hand increases utility and on the other hand decreases computational overhead.

VI. MACHINE LEARNING TECHNIQUES FOR EMERGING CELLULAR IOT NETWORKS

In this section, we survey the existing ML- and DL-based resource management solutions for specific cellular IoT technologies including HetNets, D2D, MIMO (massive MIMO), and NOMA.

A. ML Techniques in Heterogeneous Networks (HetNets)

In a HetNet, cellular users coexist with small cell users using different Radio Access Technologies (RATs). The major issues related to resource management include mitigation of cross- and co-tier interference, mobility management, user association, RAT selection, and self-organization. In addition to the conventional methods (e.g., based on optimization or heuristics), ML-based methods have been used for resource management in HetNets [144]. For instance, [145] presented a mobility management method for a HetNet using RL, where inter-cell coordination and handovers are learnt by the devices to efficiently manage network resources. Similarly, [146] reduced the energy consumption of the network using a fuzzy game-theory approach where decision rules are learnt over time for better power consumption while maintaining a minimum QoS level for the users. The authors in [147] proposed a novel framework for cognitive RAT selection by the terminal devices in a HetNet using ML techniques.



Recently, ML techniques have also been used to provide self-organization (which includes self-configuration, self-optimization, and self-healing) in networks. In [148], the authors studied an ANN approach to deal with self-optimization in HetNets. In [149], the authors proposed an efficient resource allocation algorithm for QoS provisioning in terms of high data rates that uses an online learning algorithm and Q-value theory to optimize the resources. A similar work on resource allocation in heterogeneous cognitive radio networks can be found in [128].

B. ML Techniques in Device-to-Device Communications

In D2D networks, two devices in proximity connect to each other bypassing the central base station. By performing proximity communications, a D2D network offloads traffic from the main BS and also improves the spectral efficiency as well as the Energy Efficiency (EE) of the network [150]. A low transmit power between the radios ensures EE, whereas a low path-loss ensures high spectral efficiency and sum rate. There is a considerable amount of literature addressing various issues of D2D networks, including resource and power allocation, mode selection, proximity detection, and interference avoidance [151], [152]. Recently, ML has been used to address different problems related to D2D communications such as caching [153], security and privacy [154], and so on.

A major challenge in D2D networks is to efficiently utilize the limited resources to meet the QoS requirements of all network entities, including cellular as well as D2D users. In [155], the authors devised a channel access strategy based on bandit learning in a distributed D2D system, where each pair selects an optimal channel for its communications. This not only reduces the interference among users on the same channels, but also increases the rates of individual D2D pairs. Similarly, another work on channel selection using autonomous learning was presented in [156]. A method for resource allocation in D2D networks was presented in [157] where Q-learning is used to find the optimal resource selection strategy. Similarly, in [158], the authors used cooperative RL to improve the individual throughputs of the devices and the sum rate of the system through optimal resource allocation via cooperative strategy planning. In [159], power allocations for various D2D pairs were optimally designed using ML to provide an energy-efficient network solution.
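For intuition on bandit-based channel access, a minimal UCB1 sketch in which a single D2D pair learns the best of three channels (the per-channel success probabilities are toy values; this is the generic UCB1 algorithm, not the specific scheme of [155]):

```python
import math
import random

def ucb1_channel_selection(success_prob, rounds=3000, seed=7):
    """Pick one channel per round; learn the best via the UCB1 rule.

    success_prob holds the true (unknown to the learner) probability
    that a transmission on each channel succeeds -- toy values below.
    """
    rng = random.Random(seed)
    k = len(success_prob)
    counts = [0] * k      # times each channel was tried
    rewards = [0.0] * k   # accumulated successes per channel
    for t in range(1, rounds + 1):
        if t <= k:        # play each channel once first
            ch = t - 1
        else:
            # Exploit the empirical mean plus an exploration bonus
            ch = max(range(k), key=lambda i:
                     rewards[i] / counts[i]
                     + math.sqrt(2 * math.log(t) / counts[i]))
        r = 1.0 if rng.random() < success_prob[ch] else 0.0
        counts[ch] += 1
        rewards[ch] += r
    return counts

# Channel 2 is the least interfered (highest success probability)
counts = ucb1_channel_selection([0.3, 0.5, 0.9])
# UCB1 ends up playing channel 2 far more often than the others.
```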

C. ML in MIMO and Massive MIMO Communications

Mobile nodes connected in a cellular system enjoy very high data rates if the number of antenna elements at the BS is large. Recently, the relatively new concept of massive MIMO has become an active area of research. Many algorithms have been studied for channel estimation, beamforming, mitigating pilot contamination, and improving energy efficiency in massive MIMO systems. In this context, ML techniques have also been used to address different problems in massive MIMO systems. For instance, in [160], the authors addressed the channel estimation problem using a sparse Bayesian learning algorithm. This algorithm requires good knowledge of the channel. Furthermore, the authors modeled the channel components

in the beam domain as a Gaussian Mixture (GM) distribution, while Expectation-Maximization (EM) is used to learn the system parameters. The proposed channel estimator shows a good improvement compared to the conventional methods in the presence of pilot contamination.

The positioning problem in MIMO systems was addressed in [161] using least-squares SVM. The SVM finds the mapping between the sample space and the coordinate space based on the MIMO channel established by multipath propagation. Simulations show improved localization performance of the SVM compared to ANN and K-Nearest Neighbor (KNN) methods. Furthermore, in [162], the authors recommended a supervised ML technique, i.e., an SVM-based technique, for MIMO channel learning. Similarly, in [163], the authors proposed supervised learning for pilot assignment to address pilot contamination in MIMO systems, considering the pilot assignment as the output features and the user locations in the cells as the input features. This scheme is implemented using a DNN.
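As a reference point, the KNN baseline mentioned above can be sketched as fingerprint-based positioning: a query channel-feature vector is matched against stored fingerprints with known coordinates (all values below are hypothetical, and this is not the least-squares SVM of [161]):

```python
import math

def knn_position(fingerprints, query, k=3):
    """Estimate a position as the mean location of the k training
    fingerprints whose channel-feature vectors are closest to the query."""
    ranked = sorted(fingerprints, key=lambda fp: math.dist(fp[0], query))
    nearest = ranked[:k]
    xs = [pos[0] for _, pos in nearest]
    ys = [pos[1] for _, pos in nearest]
    return (sum(xs) / k, sum(ys) / k)

# Hypothetical training set: (channel feature vector, known (x, y) position)
db = [
    ((-60, -70, -80), (0.0, 0.0)),
    ((-62, -71, -79), (1.0, 0.0)),
    ((-61, -69, -81), (0.0, 1.0)),
    ((-85, -60, -65), (10.0, 10.0)),
    ((-84, -62, -64), (11.0, 10.0)),
    ((-86, -61, -66), (10.0, 11.0)),
]
print(knn_position(db, (-61, -70, -80)))  # near (0.33, 0.33)
```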

In [164], the authors presented ML algorithms for link adaptation in massive MIMO. For the selection of transmission parameters, autoencoder-SVM and autoencoder-softmax models were proposed. The autoencoder model is used to extract features from the CSI, while logistic regression models are used to choose the modulation and channel coding algorithms. Simulation results show improved spectral efficiency compared to conventional non-learning algorithms. Adaptive Modulation and Coding (AMC), which is also a part of link adaptation, is a challenging task in spatially and frequency-selective channels. For the selection of a suitable modulation scheme and coding rate in MIMO-OFDM systems, past observations of the CSI and errors can be used. Along these lines, in [165], the authors proposed a supervised learning method that uses past CSI and error information to predict a suitable modulation scheme and coding rate without explicitly modeling the relation between the input and output of the wireless transceiver. To increase the accuracy of link adaptation using ML, a low-dimensional feature set using ordered SNRs was proposed. Simulation results showed that the proposed scheme improves the frame error rate and spectral efficiency.

In another work, Mauricio et al. [166] presented an ML approach to solve the user grouping problem in multi-user massive MIMO systems. The mobile stations are classified into clusters using K-means clustering, and then mobile stations are selected from the clusters to form a Space-Division Multiple Access (SDMA) group for capacity maximization. Similarly, in [167], the authors presented a Radio Resource Management (RRM) and hybrid beamforming method for the downlink of a massive MIMO system using ML techniques. The computationally complex resource management in multi-user massive MIMO is replaced with a neural network trained using supervised learning. The results showed that the proposed ML-based method achieves the same sum rate but decreases the execution time compared to a CVX-based method. Furthermore, Wang et al. [168] proposed an ML-based beam allocation method for MIMO systems that outperforms the conventional methods.



D. ML Techniques for NOMA

For the next generation wireless networks, NOMA is a prime candidate to provide spectrally efficient performance and massive connectivity. The key idea behind NOMA is to serve multiple users in the same system resource unit [169]. These resource units can be spreading codes, time slots, sub-carriers, or spatial dimensions [170]. Given the complexity of resource allocation problems in NOMA (i.e., user clustering, and power allocation in the presence of imperfect Successive Interference Cancellation (SIC) and jamming attacks), it is of interest to exploit the benefits of learning-based techniques for NOMA systems [171].
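For concreteness, consider the standard two-user downlink power-domain NOMA formulation (a textbook example, not specific to [169]–[171]): the BS superposes the users' signals with power coefficients $\alpha_1 + \alpha_2 = 1$, $\alpha_1 > \alpha_2$, where user 1 has the weaker channel ($|h_1|^2 < |h_2|^2$). User 1 decodes its signal treating user 2's signal as noise, while user 2 first performs SIC on user 1's signal, giving the achievable rates

```latex
R_1 = \log_2\!\left(1 + \frac{\alpha_1 P\,|h_1|^2}{\alpha_2 P\,|h_1|^2 + \sigma^2}\right),
\qquad
R_2 = \log_2\!\left(1 + \frac{\alpha_2 P\,|h_2|^2}{\sigma^2}\right),
```

where $P$ is the total transmit power and $\sigma^2$ the noise power. Choosing the $\alpha_i$ (and the user pairing itself) across many users and resource units is exactly the clustering and power allocation problem that the learning-based techniques below target.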

Notably, various works have recently applied ML and DL techniques to different NOMA problems. For instance, in [172], the authors proposed a novel approach using DL in a NOMA system where a single BS serves multiple users deployed in a single cell. The authors integrated a DL-based LSTM network with the traditional NOMA system, which enables the system to observe and characterize the channel characteristics automatically. In [173], the authors proposed a novel online learning-based detection method for an uplink NOMA scheme in a single cell. An online adaptive filter is designed for large user cluster sizes, which provides better results than the SIC-based detection method. In [174], the authors proposed a MIMO-based anti-jamming game for NOMA transmission. They used Q-learning-based allocation of power coefficients for transmission in dynamic downlink NOMA to counter jamming attacks. This mechanism improves the transmission efficiency of the NOMA system in the presence of smart jammers.

In [175], the authors investigated the key problems of mmWave-based NOMA systems, i.e., power allocation to the users and clustering of users. They derived an efficient user clustering for the NOMA system. Considering continuous arrivals of users in the system, an online user clustering algorithm based on K-means clustering was proposed, which greatly reduces the computational complexity of the NOMA system. Similarly, in [176], the authors proposed a NOMA-based resource management approach in a multi-cell network. Furthermore, in [177], the authors investigated the resource sharing problem for downlink transmission by considering a maximum power constraint in NOMA networks. Moreover, in [178], the authors dealt with the resource allocation problem in a NOMA-based heterogeneous IoT network and proposed a deep RNN-based algorithm for an optimal solution. The results showed that the proposed DL-based scheme outperforms the traditional schemes in terms of connectivity scale and spectral efficiency.
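The online flavour of such clustering can be sketched with the classic sequential K-means update, where each arriving user's feature vector nudges its nearest centroid instead of re-running K-means from scratch (toy 2-D features; this illustrates the generic sequential update, not the specific algorithm of [175]):

```python
import math

def online_kmeans_update(centroids, counts, x):
    """Assign an arriving point x to its nearest centroid and move that
    centroid toward x (classic sequential K-means update)."""
    i = min(range(len(centroids)), key=lambda j: math.dist(centroids[j], x))
    counts[i] += 1
    eta = 1.0 / counts[i]                      # per-cluster learning rate
    centroids[i] = tuple(c + eta * (v - c) for c, v in zip(centroids[i], x))
    return i

# Two clusters of arriving users (hypothetical 2-D channel/location features)
centroids = [(0.0, 0.0), (8.0, 8.0)]
counts = [1, 1]
arrivals = [(0.5, 0.4), (7.9, 8.2), (0.2, 0.1), (8.3, 7.8), (0.4, 0.5)]
for x in arrivals:
    online_kmeans_update(centroids, counts, x)
# Each centroid has drifted toward its own arrival group.
```

Each arrival costs only one distance computation per cluster, which is why such updates suit continuous user arrivals.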

VII. LESSONS LEARNED AND FUTURE RESEARCH CHALLENGES

A. Lessons Learned

Resource allocation in IoT networks is a fundamental problem that becomes increasingly complex with the consideration of advanced wireless technologies as well as system uncertainties (e.g., due to dynamic channel and traffic variations, device mobility, and multi-dimensional QoS requirements)

and the large-scale nature of the system (e.g., with thousands of devices and network nodes). Resource allocation problems are generally posed as optimization problems with one or more objective functions and some constraints. However, in many cases the conventional optimization problems are not solvable with reasonable complexity unless the problems are simplified and/or the constraints are relaxed. Also, when the problems are non-convex, they often admit only sub-optimal solutions. To reduce the complexity of obtaining resource allocation solutions, new methods will be required. Also, in many cases the resource allocation problems cannot be modeled analytically due to the lack of appropriate modeling tools.

Another common approach for resource allocation in distributed IoT networks is the use of game-theoretic approaches. However, modeling and analysis of game models for large-scale heterogeneous IoT networks under practical system constraints (e.g., uncertainty in the number of devices and the network characteristics, and lack of availability of information about other devices) is extremely challenging. Also, even if the equilibrium solutions can be characterized, obtaining those solutions instantly would be a highly ambitious goal due to slow convergence rates.

In the above context, increasing interest in interdisciplinary research has led to the use of ML, DL, RL, and related tools to make IoT devices and networks smarter, more intelligent, and adaptive to their dynamic environments. Decreasing computational costs, improved computing power, the availability of virtually unlimited solid-state storage, and the integration of various technological breakthroughs have made this possible. Applications of ML and DL techniques enable IoT devices to perform complex sensing and recognition tasks and enable the realization of new applications and services based on the real-time interactions among humans, smart devices, and the surrounding environment. DL, RL, and DRL algorithms can be leveraged for automated extraction of complex features from the large amounts of high-dimensional unsupervised data produced by IoT devices.

ML techniques can support self-organizing operations by learning and processing statistical information from the environment. These learning techniques are inherently distributed and are scalable for dense and large-scale IoT networks. Data-driven ML techniques are used to detect possible patterns and similarities in large data sets and can make predictions. ML is leveraged to develop algorithms that are capable of applying statistical methods to input data, performing analysis, and predicting an output value without being explicitly programmed. Therefore, ML techniques require datasets to learn from, and then the learned model is applied to the real data. However, the learned model may not generalize well to the entire range of features and properties of the data. In this regard, DL techniques have been employed to address some of the limitations of ML techniques. Nonetheless, ML and DL algorithms have inherent limitations, and the development of these techniques brings along new challenges that must be addressed. In particular, data-driven ML and DL techniques require "good quality" data, which may not be available and may also be difficult to generate. For example, for a DL-based resource management method for the channel and power allocation problem in a large-scale cellular IoT network, for a given network state (e.g., number of users, their association to the BSs, channel gains to the respective BSs), the optimal channel and power allocation data will be required to train the DNN. Generation of these data will be computationally too expensive even using techniques such as genetic algorithms. Therefore, the challenges related to data generation will need to be addressed to develop DL-based resource management schemes.

Here, we focus on the challenges related to the development, (re)training, interoperability, cost, and generalization of ML models, as well as the related data issues.

B. Challenges Associated With ML Models for IoT Networks

1) Development of ML models for IoT networks: It is challenging to develop a suitable model to process data coming from a range of diverse IoT devices. Similarly, effective input data labeling is also a cumbersome task. For instance, in the case of health-care applications, the training model must be pin-point accurate to derive the right features from the data and learn effectively, because false positives could have dire consequences in such models. Reliability of the ML models will be essential for mission-critical IoT applications, which will be vulnerable to the anomalies (e.g., misclassifications) created by ML or DL algorithms. Additionally, RL suffers from instability and divergence when all possible action-reward pairs are not explored in a given scenario [105]. In a nutshell, depending on the use-case, the underlying IoT technology, and the characteristics unique to that use-case, ML models must be carefully crafted to meet the requirements of the use-cases. For the deployment of ML and DL models in resource-constrained IoT devices, it is essential to reduce the processing and storage overhead [179].

Again, the engineering of ML models (e.g., optimizing the hyperparameters in a DNN, such as the number of hidden layers and the number of neurons per layer, or optimizing the learning rate in a DRL model) is challenging.

2) Lack of multitasking capabilities: DL models are highly specialized and trained for a specific application domain. These models are only effective for solving the specific type of problem for which they are trained. Whenever there is a change in the definition, state, or nature of the problem (even if the new problem is very similar to the old one), re-assessment and retraining of the model is essential. Restructuring the architecture of the entire model is required when multitasking is to be done by the DL algorithms. For real-time IoT applications, it is difficult for IoT devices to change and retrain the DL models with frequently changing input information in a timely manner. Therefore, DL models capable of multitasking should be further investigated.

3) Retraining of ML and DL models: When ML algorithms are applied to IoT applications and resource management in IoT networks, the models need to be updated according to the freshly collected data. This gives rise to many important questions, such as when and how ML models should be retrained in a changing environment and how to ensure that an appropriate amount of data is collected.

4) Interpretability of ML and DL models: When applying ML models to make informed decisions in IoT networks, in addition to accurate predictions, it is also helpful for domain experts to gain insights into the decision-making process of the models. In other words, interpretability is very important for generating deep insights into an ML model. Powerful models such as DL and NNs are difficult to understand and interpret. Further research is required to build smart models which are not only highly accurate but also interpretable. The need for explainability of ML models drives the need for simple models such as decision trees. It will be of fundamental importance to strengthen the existing theoretical foundations of DL with repeated experimental data. In this way, the performance of DL models can be quantified based on parameters such as computational complexity, learning efficiency, and accuracy.

5) Cost of training of ML and DL models: A sufficient amount of data is required to train ML models, and a considerable cost is associated with training them. This training cost becomes more pressing, specifically in real-time IoT applications. Furthermore, the ML models need to be kept up-to-date in order to produce accurate predictions over longer durations of time. Although model retraining can be done manually in the same way as the model was originally trained, it is a time-consuming process. Another method is continuous learning, in which an automated system is developed that continuously evaluates and retrains the existing models. Both of these methods require a well-architected system model that takes into account not only the initial training procedure of a model but also a plan for keeping the models up-to-date.

6) Generalization of ML models: Generalization is defined as the ability of an ML model to adapt to new, unseen data drawn from the same distribution as the one used to train the model. The accuracy of a model is determined by how well it generalizes to data points outside the training data sets. The generalization error is a measure of the accuracy and validity of an ML model. The propagation channels in a wireless IoT network can have varying and diverse physical characteristics. Therefore, although trained offline, the ML/DL models for resource management in IoT networks should generalize to various channel conditions and must have dynamic adaptive capabilities. More research and implementation effort is required to build practical models for real systems with real user workloads.
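A minimal holdout sketch of estimating generalization error: a toy one-feature threshold classifier (a decision stump) is fit on a training split of synthetic data, and the gap between training and held-out accuracy estimates how well it generalizes. All data and thresholds below are invented for illustration:

```python
import random

def fit_threshold(train):
    """Pick the threshold on a single feature that best separates the
    two classes on the training split (simple decision stump)."""
    best_t, best_acc = 0.0, -1.0
    for t in [x for x, _ in train]:
        acc = sum((x > t) == y for x, y in train) / len(train)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

rng = random.Random(42)
# Synthetic data: label is True when the noisy feature exceeds 5
data = [(x, x + rng.gauss(0, 1) > 5)
        for x in [rng.uniform(0, 10) for _ in range(400)]]
train, test = data[:300], data[300:]
t = fit_threshold(train)
train_acc = sum((x > t) == y for x, y in train) / len(train)
test_acc = sum((x > t) == y for x, y in test) / len(test)
# The gap train_acc - test_acc estimates how much the stump overfits;
# a small gap indicates good generalization on this distribution.
```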

C. Data Challenges in ML and DL-Enabled IoT Networks

1) Scarcity of datasets: Learning algorithms entirely depend on the amount and quality of the data used to build models. Such datasets for IoT networks are often scarce. Also, the availability of such datasets is subject to the laws under which the data may be collected, the environment, and the consent of the data subjects. Generating synthetic datasets that could train models is also challenging. Another important issue is the uniformity of the datasets so that they can be used in cross-platform environments.

2) Heterogeneity of IoT data: Due to the massive and heterogeneous nature of IoT networks, the data generated



by such networks has multiple dimensions, and ML models must be developed to extract useful information and features from the data. In this regard, pre-treatment of the data (pre-processing, cleansing, and ordering) and fusion of multi-source data will play an important role in making it ready for the respective ML model.

3) Collection of data: The data generated in IoT networks can be used for a variety of different purposes depending on the application domain. For instance, some data may be privacy-critical, in which case it must be anonymized before being used for training. Examples of such data are the data generated by smart-home environments and by sensors on the human body. These data must be sent to central server(s) for further processing. Therefore, the context is also essential in the collection of data from different domains. Also, the ML and DL algorithms need to cope with anomalies in the data (e.g., data imbalance due to class skewness and class overlapping) in an intelligent manner.

4) Avalanche effect in ML and DL: The avalanche effect is a desired characteristic of cryptographic algorithms, where a small change in the input causes a huge change in the output. A similar input sensitivity is, however, an inherent property of ML and DL algorithms, which also makes them vulnerable to adversarial attacks. In the context of IoT, this property of ML and DL mechanisms can cause dire consequences for the applications. Therefore, the integrity of the data must be checked before the data are given as input to the ML and DL algorithms. Recent perturbation attacks on NNs [180] indicate that concrete steps are required to guarantee the integrity of the input data to ML and DL algorithms. One possible solution would be to take the context of the data into account and incorporate an effective access control mechanism for the input data.

VIII. CONCLUSION

The fundamental notion of connectivity among different smart devices with processing, storage, and communication capabilities collectively constitutes an IoT network. IoT has the potential to revolutionize many aspects of our lives, ranging from our life-style and businesses to the environment and infrastructure. The heterogeneity introduced by the IoT is humongous in many aspects, such as device types, network types, and data types. In the wake of such a massively heterogeneous environment, IoT networks need scalable, adaptive, efficient, and fair network management mechanisms. Among other aspects of network management, resource management has a pivotal role in the successful realization of massively heterogeneous IoT networks. In essence, IoT networks not only face resource allocation challenges but also require contextual and situational custom resource management decisions. Unlike the traditional resource management methods, including optimization- and heuristics-based methods and game-theoretic and cooperative approaches, ML and DL models can derive actions from run-time context information and can re-tune and re-train themselves with changes in the environment. ML and DL techniques are promising for automatic resource management and decision making, specifically for large-scale, complex, distributed, and dynamically changing IoT application environments.

In this article, we have discussed the limitations of the traditional resource management techniques for wireless IoT networks. We have carried out a comprehensive survey of the ML- and DL-based resource management techniques in IoT networks with enabling technologies such as HetNets, D2D, MIMO/massive MIMO, and NOMA. Furthermore, we have also covered different aspects of resource management such as scheduling and duty cycling, resource allocation (in static and mobile IoT networks), clustering and data aggregation, resource discovery through spectrum sensing and cell selection, traffic classification and prediction, power allocation, and interference management. To this end, we have identified the future research challenges in applying ML and DL to IoT resource management. The application of machine learning techniques, specifically deep reinforcement learning techniques, to solve complex radio resource management problems in emerging IoT networks is a fertile area of research. For these networks, we will need to carefully craft the solutions according to challenges such as model uncertainty, interpretability of the models, cost of model training, and generalization from test workloads to real application user workloads.

REFERENCES

[1] F. Hussain, Internet of Things; Building Blocks and Business Models.Springer, 2017.

[2] D. B. J. Sen, “Internet of Things - Applications and Challengesin Technology and Standardization,” IEEE Transactions in WirelessPersonal Communication, May 2011.

[3] U. S. Shanthamallu, A. Spanias, C. Tepedelenlioglu, and M. Stanley, “ABrief Survey of Machine Learning Methods and their Sensor and IoTApplications ,” IEEE Conference on Information, Intelligence, Systemsand Applications, March 2018.

[4] Y. Li, Z. Gao, L. Huang, X. Du, and M. Guizani, “Resource man-agement for future mobile networks: Architecture and technologies,”Elsevier, pp. 392–398, 2018.

[5] X. Li, J. Fang, W. Cheng, H. Duan, Z. Chen, and H. Li, “Intelligentpower control for spectrum sharing in cognitive radios: A deep rein-forcement learning approach,” IEEE Access, vol. 6, pp. 25 463–25 473,2018.

[6] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash, "Internet of things: A survey on enabling technologies, protocols, and applications," IEEE Communications Surveys & Tutorials, vol. 17, no. 4, pp. 2347–2376, Fourthquarter 2015.

[7] V. Gazis, "A survey of standards for machine-to-machine and the internet of things," IEEE Communications Surveys & Tutorials, vol. 19, no. 1, pp. 482–511, Firstquarter 2017.

[8] E. Ahmed, I. Yaqoob, A. Gani, M. Imran, and M. Guizani, "Internet-of-things-based smart environments: State of the art, taxonomy, and open research challenges," IEEE Wireless Communications, vol. 23, no. 5, pp. 10–16, October 2016.

[9] S. B. Baker, W. Xiang, and I. Atkinson, "Internet of things for smart healthcare: Technologies, challenges, and opportunities," IEEE Access, vol. 5, pp. 26521–26544, 2017.

[10] F. Javed, M. K. Afzal, M. Sharif, and B. Kim, "Internet of things (IoT) operating systems support, networking technologies, applications, and challenges: A comparative review," IEEE Communications Surveys & Tutorials, vol. 20, no. 3, pp. 2062–2100, Thirdquarter 2018.

[11] D. E. Kouicem, A. Bouabdallah, and H. Lakhlef, "Internet of things security: A top-down survey," Computer Networks, vol. 141, pp. 199–221, 2018. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1389128618301208

[12] A. Chowdhury and S. A. Raut, "A survey study on internet of things resource management," Journal of Network and Computer Applications, vol. 120, pp. 42–60, 2018. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1084804518302315

[13] D. Airehrour, J. Gutierrez, and S. K. Ray, "Secure routing for internet of things: A survey," Journal of Network and Computer Applications, vol. 66, pp. 198–213, 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1084804516300133

Page 17: Machine Learning for Resource Management in Cellular and IoT … · 2019-07-23 · IoT services, innovative resource management solutions are required. IoT devices can exploit Device-to-Device


[14] J. Guo, I.-R. Chen, and J. J. Tsai, "A survey of trust computation models for service management in internet of things systems," Computer Communications, vol. 97, pp. 1–14, 2017. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0140366416304959

[15] W. Ding, X. Jing, Z. Yan, and L. T. Yang, "A survey on data fusion in internet of things: Towards secure and privacy-preserving fusion," Information Fusion, vol. 51, pp. 129–144, 2019. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1566253518304731

[16] M. Mohammadi, A. Al-Fuqaha, S. Sorour, and M. Guizani, "Deep learning for IoT big data and streaming analytics: A survey," IEEE Communications Surveys & Tutorials, vol. 20, no. 4, pp. 2923–2960, Fourthquarter 2018.

[17] O. Simeone, "A very brief introduction to machine learning with applications to communication systems," IEEE Transactions on Cognitive Communications and Networking, vol. 4, no. 4, pp. 648–664, December 2018.

[18] Q. et al., "Real-time multi-application network traffic identification based on machine learning," Springer, vol. 92, 2015, pp. 473–480.

[19] T. et al., "Machine learning-based classification of encrypted internet traffic," Springer, 2017, pp. 578–592.

[20] R. Masiero, G. Quer, D. Munaretto, M. Rossi, J. Widmer, and M. Zorzi, "Data acquisition through joint compressive sensing and principal component analysis," in GLOBECOM 2009 - 2009 IEEE Global Telecommunications Conference, Nov 2009, pp. 1–6.

[21] M. S. Jamal, V. K. S, H. Ochiai, H. Esaki, and K. Kataoka, "Instruct: A clustering based identification of valid communications in IoT networks," in 2018 Fifth International Conference on Internet of Things: Systems, Management and Security, Oct 2018, pp. 228–233.

[22] R. Han, F. Zhang, and Z. Wang, "AccurateML: Information-aggregation-based approximate processing for fast and accurate machine learning on MapReduce," in IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, May 2017, pp. 1–9.

[23] F. Hussain, A. Anpalagan, A. S. Khwaja, and M. Naeem, "Resource Allocation and Congestion Control in Clustered M2M Communication using Q-Learning," Wiley Transactions on Emerging Telecommunications Technologies, 2015.

[24] B. Frankovic and I. Budinska, "Advantages and disadvantages of heuristic and multi agents approaches to the solution of scheduling problem," in IFAC, vol. 33, no. 13, May 2000, pp. 367–372.

[25] J. Moura and D. Hutchison, "Game theory for multi-access edge computing: Survey, use cases, and future trends," IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 260–288, Firstquarter 2019.

[26] S. et al., "Sleeping multi-armed bandit learning for fast uplink grant allocation in machine type communications," arXiv preprint arXiv:1810.12983, October 2018.

[27] K. et al., "Big-data streaming applications scheduling based on staged multi-armed bandits," IEEE Transactions on Computers, vol. 65, no. 12, November 2016, p. 115.

[28] K. Doppler, M. Rinne, C. Wijting, C. Ribeiro, and K. Hugl, "Device-to-device communication as an underlay to LTE-advanced networks," IEEE Communications Magazine, vol. 47, no. 12, pp. 42–49, Dec. 2009. [Online]. Available: http://dx.doi.org/10.1109/MCOM.2009.5350367

[29] W. Wu, W. Xiang, Y. Zhang, K. Zheng, and W. Wang, "Performance analysis of device-to-device communications underlaying cellular networks," Telecommun. Syst., vol. 60, no. 1, pp. 29–41, Sep. 2015. [Online]. Available: http://dx.doi.org/10.1007/s11235-014-9919-y

[30] C. Yu, K. Doppler, C. B. Ribeiro, and O. Tirkkonen, "Resource sharing optimization for device-to-device communication underlaying cellular networks," IEEE Transactions on Wireless Communications, vol. 10, no. 8, pp. 2752–2763, August 2011.

[31] Y. L. Lee, T. C. Chuah, J. Loo, and A. Vinel, "Recent advances in radio resource management for heterogeneous LTE/LTE-A networks," IEEE Communications Surveys & Tutorials, vol. 16, no. 4, pp. 2142–2180, Fourthquarter 2014.

[32] S. T. A. N. National, "Radio resource management in heterogeneous networks, functional models and implementation requirements," 2015.

[33] S. Tarapiah, K. Aziz, and S. Atalla, "Common radio resource management algorithms in heterogeneous wireless networks with KPI analysis," International Journal of Advanced Computer Science and Applications, vol. 6, no. 10, 2015. [Online]. Available: http://dx.doi.org/10.14569/IJACSA.2015.061007

[34] M. Al-Imari, P. Xiao, and M. A. Imran, "Receiver and resource allocation optimization for uplink NOMA in 5G wireless networks," in 2015 International Symposium on Wireless Communication Systems (ISWCS), Aug 2015, pp. 151–155.

[35] L. Song, Y. Li, Z. Ding, and H. V. Poor, "Resource management in non-orthogonal multiple access networks for 5G and beyond," IEEE Network, vol. 31, no. 4, pp. 8–14, July 2017.

[36] U. Raza, P. Kulkarni, and M. Sooriyabandara, "Low power wide area networks: An overview," IEEE Communications Surveys & Tutorials, vol. 19, no. 2, pp. 855–873, Secondquarter 2017.

[37] K. Mekki, E. Bajic, F. Chaxel, and F. Meyer, "A comparative study of LPWAN technologies for large-scale IoT deployment," ICT Express, vol. 5, no. 1, pp. 1–7, 2019. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S2405959517302953

[38] A. Mahmood, E. Sisinni, L. Guntupalli, R. Rondón, S. A. Hassan, and M. Gidlund, "Scalability analysis of a LoRa network under imperfect orthogonality," IEEE Transactions on Industrial Informatics, vol. 15, no. 3, pp. 1425–1436, March 2019.

[39] R. Zhang, Y. Liang, and S. Cui, "Dynamic resource allocation in cognitive radio networks," IEEE Signal Processing Magazine, vol. 27, no. 3, pp. 102–114, May 2010.

[40] N. Nie and C. Comaniciu, "Adaptive channel allocation spectrum etiquette for cognitive radio networks," Mobile Networks and Applications, vol. 11, no. 6, pp. 779–797, Dec 2006. [Online]. Available: https://doi.org/10.1007/s11036-006-0049-y

[41] A. Alnahdi and S. Liu, "Mobile internet of things (MIoT) and its applications for smart environments: A positional overview," in 2017 IEEE International Congress on Internet of Things (ICIOT), June 2017, pp. 151–154.

[42] K. Nahrstedt, H. Li, P. Nguyen, S. Chang, and L. Vu, "Internet of mobile things: Mobility-driven challenges, designs and implementations," in 2016 IEEE First International Conference on Internet-of-Things Design and Implementation (IoTDI), April 2016, pp. 25–36.

[43] I. Tcarenko, Y. Huan, D. Juhasz, A. M. Rahmani, Z. Zou, T. Westerlund, P. Liljeberg, L. Zheng, and H. Tenhunen, "Smart energy efficient gateway for internet of mobile things," in 2017 14th IEEE Annual Consumer Communications & Networking Conference (CCNC), Jan 2017, pp. 1016–1017.

[44] O. et al., "The fog balancing: Load distribution for small cell cloud computing," IEEE VTC, pp. 1–6, May 2015.

[45] M. Kim and I.-Y. Ko, "An efficient resource allocation approach based on a genetic algorithm for composite services in IoT environments," IEEE International Conference on Web Services, pp. 543–550, 2015.

[46] M. et al., "An efficient resource allocation approach based on a genetic algorithm for composite services in IoT environments," International Conference on Big Data and Internet of Things, pp. 136–142, October 2018.

[47] P. Semasinghe, S. Maghsudi, and E. Hossain, "Game theoretic mechanisms for resource management in massive wireless IoT systems," IEEE Communications Magazine, vol. 55, no. 2, pp. 121–127, February 2017.

[48] H. Kharrufa, H. Al-Kashoash, and A. H. Kemp, "A game theoretic optimization of RPL for mobile internet of things applications," IEEE Sensors Journal, vol. 18, no. 6, pp. 2520–2530, March 2018.

[49] H. Zhang, Y. Zhang, Y. Gu, D. Niyato, and Z. Han, "A hierarchical game framework for resource management in fog computing," IEEE Communications Magazine, vol. 55, no. 8, pp. 52–57, Aug 2017.

[50] Z. Zhou, M. Dong, K. Ota, G. Wang, and L. T. Yang, "Energy-efficient resource allocation for D2D communications underlaying cloud-RAN-based LTE-A networks," IEEE Internet of Things Journal, vol. 3, no. 3, pp. 428–438, June 2016.

[51] H. Zhang, Y. Xiao, S. Bu, D. Niyato, F. R. Yu, and Z. Han, "Computing resource allocation in three-tier IoT fog networks: A joint optimization approach combining Stackelberg game and matching," IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1204–1215, Oct 2017.

[52] M. Haenggi, J. G. Andrews, F. Baccelli, O. Dousse, and M. Franceschetti, "Stochastic geometry and random graphs for the analysis and design of wireless networks," IEEE Journal on Selected Areas in Communications, vol. 27, no. 7, pp. 1029–1046, Sep. 2009. [Online]. Available: http://dx.doi.org/10.1109/JSAC.2009.090902

[53] Q. Zheng, K. Zheng, P. Chatzimisios, and F. Liu, "Joint optimization of link scheduling and resource allocation in cooperative vehicular networks," EURASIP Journal on Wireless Communications and Networking, vol. 2015, no. 1, p. 170, Jun 2015. [Online]. Available: https://doi.org/10.1186/s13638-015-0392-4

[54] L. Liang, S. Xie, G. Y. Li, Z. J. Ding, and X. Yu, "Graph-based radio resource management for vehicular networks," 2018 IEEE International Conference on Communications (ICC), pp. 1–6, 2018.

[55] E. Castañeda, A. Silva, A. Gameiro, and M. Kountouris, "An overview on resource allocation techniques for multi-user MIMO systems," IEEE Communications Surveys & Tutorials, vol. 19, no. 1, pp. 239–284, Firstquarter 2017.


[56] K. Zheng, L. Zhao, J. Mei, B. Shao, W. Xiang, and L. Hanzo, "Survey of large-scale MIMO systems," IEEE Communications Surveys & Tutorials, vol. 17, no. 3, pp. 1738–1760, Thirdquarter 2015.

[57] T. O. Olwal, K. Djouani, and A. M. Kurien, "A survey of resource management toward 5G radio access networks," IEEE Communications Surveys & Tutorials, vol. 18, no. 3, pp. 1656–1686, Thirdquarter 2016.

[58] F. Hussain, R. Hussain, S. A. Hassan, and E. Hossain, "Machine learning in IoT security: Current solutions and future challenges," arXiv preprint, 2019.

[59] R. H. Javed, A. Siddique, R. Hafiz, O. Hasan, and M. Shafique, "ApproxCT: Approximate clustering techniques for energy efficient computer vision in cyber-physical systems," in 2018 12th International Conference on Open Source Systems and Technologies (ICOSST), Dec 2018, pp. 64–70.

[60] S. Gupta, A. Thakral, and S. Sharma, "Novel technique for prediction analysis using normalization for an improvement in k-means clustering," in 2016 International Conference on Information Technology (InCITe) - The Next Generation IT Summit on the Theme - Internet of Things: Connect your Worlds. IEEE, 2016, pp. 32–36.

[61] V. V. Kumari and P. R. K. Varma, "A semi-supervised intrusion detection system using active learning SVM and fuzzy c-means clustering," in 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC). IEEE, 2017, pp. 481–485.

[62] C.-M. Chung, C.-C. Chen, W.-P. Shih, T.-E. Lin, R.-J. Yeh, and I. Wang, "Automated machine learning for internet of things," in 2017 IEEE International Conference on Consumer Electronics - Taiwan (ICCE-TW). IEEE, 2017, pp. 295–296.

[63] M. A. Khan, A. Khan, M. N. Khan, and S. Anwar, "A novel learning method to classify data streams in the internet of things," in 2014 National Software Engineering Conference. IEEE, 2014, pp. 61–66.

[64] C. et al., "Deep Learning in Mobile and Wireless Networking: A Survey," IEEE Communications Surveys & Tutorials, vol. 1, pp. 1–43, March 2018.

[65] S. Park, M. Sohn, H. Jin, and H. Lee, "Situation reasoning framework for the internet of things environments using deep learning results," in 2016 IEEE International Conference on Knowledge Engineering and Applications (ICKEA). IEEE, 2016, pp. 133–138.

[66] J. Zhu, Y. Song, D. Jiang, and H. Song, "A new deep-Q-learning-based transmission scheduling mechanism for the cognitive internet of things," IEEE Internet of Things Journal, pp. 1–1, 2017.

[67] N. Chauhan, N. Choudhary, and K. George, "A comparison of reinforcement learning based approaches to appliance scheduling," in 2016 2nd International Conference on Contemporary Computing and Informatics (IC3I), Dec 2016, pp. 253–258.

[68] S. Maghsudi and S. Stanczak, "Channel Selection for Network-Assisted D2D Communication via No-Regret Bandit Learning with Calibrated Forecasting," IEEE Transactions on Wireless Communications, vol. 14, no. 3, pp. 1309–1322, March 2015.

[69] S. Chinchali, P. Hu, T. Chu, M. Sharma, M. Bansal, R. Misra, M. Pavone, and S. Katti, "Cellular network traffic scheduling with deep reinforcement learning," in AAAI, 2018.

[70] H. Mao, M. Alizadeh, I. Menache, and S. Kandula, "Resource management with deep reinforcement learning," in Proceedings of the 15th ACM Workshop on Hot Topics in Networks, ser. HotNets '16. New York, NY, USA: ACM, 2016, pp. 50–56. [Online]. Available: http://doi.acm.org/10.1145/3005745.3005750

[71] G. Rjoub and J. Bentahar, "Cloud task scheduling based on swarm intelligence and machine learning," in 2017 IEEE 5th International Conference on Future Internet of Things and Cloud (FiCloud), Aug 2017, pp. 272–279.

[72] S. Bohez, T. Verbelen, P. Simoens, and B. Dhoedt, "Discrete-event simulation for efficient and stable resource allocation in collaborative mobile cloudlets," Simulation Modelling Practice and Theory, vol. 50, pp. 109–129, 2015, special issue on Resource Management in Mobile Clouds. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1569190X14000768

[73] J. S. Jang, Y. L. Kim, and J. H. Park, "A study on the optimization of the uplink period using machine learning in the future IoT network," in 2016 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops), March 2016, pp. 1–3.

[74] T. et al., "Traffic prediction for dynamic traffic engineering," Elsevier, vol. 85, July 2015, pp. 34–60.

[75] ——, "A learning-based multimodel integrated framework for dynamic traffic flow forecasting," Springer, vol. 85, March 2018, pp. 407–430.

[76] H.-Y. Kim and J.-M. Kim, "A load balancing scheme based on deep learning in IoT," Springer, October 2016, pp. 873–878.

[77] C. Caruso, D. Malerba, and D. Papagni, "Learning the daily model of network traffic," Springer, 2005, pp. 131–141.

[78] F. Meng, P. Chen, L. Wu, and J. Cheng, "Power allocation in multi-user cellular networks: Deep reinforcement learning approaches," arXiv preprint, January 2019, pp. 1–29.

[79] W. Kang and D. Kim, "Poster abstract: DeepRT: A predictable deep learning inference framework for IoT devices," in 2018 IEEE/ACM Third International Conference on Internet-of-Things Design and Implementation (IoTDI), April 2018, pp. 279–280.

[80] N. Liu, Z. Li, J. Xu, Z. Xu, S. Lin, Q. Qiu, J. Tang, and Y. Wang, "A hierarchical framework of cloud resource allocation and power management using deep reinforcement learning," in 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), June 2017, pp. 372–382.

[81] Z. Fan, X. Gu, S. Nie, and M. Chen, "D2D power control based on supervised and unsupervised learning," in 2017 3rd IEEE International Conference on Computer and Communications (ICCC), Dec 2017, pp. 558–563.

[82] P. Lynggaard, "Using machine learning for adaptive interference suppression in wireless sensor networks," IEEE Sensors Journal, vol. 18, no. 21, pp. 8820–8826, Nov 2018.

[83] Z. Z. J. Y. W. C. L. Li, Y. Xu, and Z. Han, "A prediction-based charging policy and interference mitigation approach in the wireless powered internet of things," IEEE Journal on Selected Areas in Communications, vol. 37, no. 2, Feb 2019, pp. 439–451.

[84] S. Grimaldi, A. Mahmood, and M. Gidlund, "Real-time interference identification via supervised learning: Embedding coexistence awareness in IoT devices," IEEE Access, vol. 7, pp. 835–850, 2019.

[85] V. M. Suresh, R. Sidhu, P. Karkare, A. Patil, Z. Lei, and A. Basu, "Powering the IoT through embedded machine learning and LoRa," in 2018 IEEE 4th World Forum on Internet of Things (WF-IoT), Feb 2018, pp. 349–354.

[86] S. K. Sharma and X. Wang, "Collaborative distributed Q-learning for RACH congestion minimization in cellular IoT networks," IEEE Communications Letters, pp. 1–1, 2019.

[87] A. Iranfar, S. N. Shahsavani, M. Kamal, and A. Afzali-Kusha, "A heuristic machine learning-based algorithm for power and thermal management of heterogeneous MPSoCs," in 2015 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), July 2015, pp. 291–296.

[88] S. Toumi, M. Hamdi, and M. Zaied, "An adaptive Q-learning approach to power control for D2D communications," in 2018 International Conference on Advanced Systems and Electric Technologies (IC ASET), March 2018, pp. 206–209.

[89] M. Kulin, T. Kazaz, I. Moerman, and E. De Poorter, "End-to-end learning from spectrum data: A deep learning approach for wireless signal identification in spectrum monitoring applications," IEEE Access, vol. 6, pp. 18484–18501, 2018.

[90] J. Oksanen, J. Lundén, and V. Koivunen, "Reinforcement learning method for energy efficient cooperative multiband spectrum sensing," in 2010 IEEE International Workshop on Machine Learning for Signal Processing, Aug 2010, pp. 59–64.

[91] W. Lee, M. Kim, and D. Cho, "Deep cooperative sensing: Cooperative spectrum sensing based on convolutional neural networks," IEEE Transactions on Vehicular Technology, pp. 1–1, 2019.

[92] X. Ma, S. Ning, X. Liu, H. Kuang, and Y. Hong, "Cooperative spectrum sensing using extreme learning machine for cognitive radio networks with multiple primary users," in 2018 IEEE 3rd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC). IEEE, 2018, pp. 536–540.

[93] S. Zafaruddin, I. Bistritz, A. Leshem, and D. Niyato, "Distributed learning for channel allocation over a shared spectrum," arXiv preprint arXiv:1902.06353, 2019.

[94] D. A. Awan, R. L. Cavalcante, and S. Stanczak, "A robust machine learning method for cell-load approximation in wireless networks," in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018, pp. 2601–2605.

[95] C. et al., "Machine Learning Paradigms for Next-Generation Wireless Networks," IEEE Wireless Communications, vol. 24, no. 2, pp. 1–8, December 2016.

[96] W. et al., "Channel Estimation for Massive MIMO Using Gaussian-Mixture Bayesian Learning," IEEE Transactions on Wireless Communications, vol. 14, no. 3, pp. 1356–1368, March 2015.

[97] Y. Xu, P. Cheng, Z. Chen, Y. Li, and B. Vucetic, "Mobile collaborative spectrum sensing for heterogeneous networks: A Bayesian machine learning approach," IEEE Transactions on Signal Processing, vol. 66, no. 21, pp. 5634–5647, 2018.

[98] X. Zhou, M. Sun, G. Y. Li, and B.-H. F. Juang, "Intelligent wireless communications enabled by cognitive radio and machine learning," China Communications, vol. 15, no. 12, pp. 16–48, 2018.

[99] Y. et al., "Dynamic channel selection with reinforcement learning for cognitive WLAN over BER," International Journal of Communication Systems, vol. 25, no. 8, pp. 1077–1090, August 2012.

[100] A. Koushik, F. Hu, and S. Kumar, "Intelligent spectrum management based on transfer actor-critic learning for rateless transmissions in cognitive radio networks," IEEE Transactions on Mobile Computing, vol. 17, no. 5, pp. 1204–1215, 2018.

[101] Z. Li, W. Wu, X. Liu, and P. Qi, "Improved cooperative spectrum sensing model based on machine learning for cognitive radio networks," IET Communications, vol. 12, no. 19, pp. 2485–2492, 2018.

[102] X.-L. Huang, Y. Gao, X.-W. Tang, and S.-B. Wang, "Spectrum mapping in large-scale cognitive radio networks with historical spectrum decision results learning," IEEE Access, vol. 6, pp. 21350–21358, 2018.

[103] T. Park, N. Abuzainab, and W. Saad, "Learning How to Communicate in the Internet of Things: Finite Resources and Heterogeneity," IEEE Access, vol. 4, pp. 7063–7073, November 2016.

[104] T. Wang, C.-K. Wen, H. Wang, F. Gao, T. Jiang, and S. Jin, "Deep Learning for Wireless Physical Layer: Opportunities and Challenges," China Communications, vol. 14, no. 11, pp. 92–111, October 2017.

[105] M. Mahmud, M. S. Kaiser, A. Hussain, and S. Vassanelli, "Applications of deep learning and reinforcement learning to biological data," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 6, pp. 2063–2079, June 2018.

[106] M. Mohammadi, A. Al-Fuqaha, M. Guizani, and J. Oh, "Semisupervised deep reinforcement learning in support of IoT and smart city services," IEEE Internet of Things Journal, vol. 5, no. 2, pp. 624–635, April 2018.

[107] N. D. Nguyen, T. Nguyen, and S. Nahavandi, "System design perspective for human-level agents using deep reinforcement learning: A survey," IEEE Access, vol. 5, pp. 27091–27102, 2017.

[108] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, "A Brief Survey of Deep Reinforcement Learning," IEEE Signal Processing Magazine, May 2017.

[109] B. Lake, T. Ullman, J. Tenenbaum, and S. Gershman, "Building Machines That Learn and Think Like People," The Behavioral and Brain Sciences, 2016.

[110] A. Zappone, M. D. Renzo, and M. Debbah, "Wireless networks design in the era of deep learning: Model-based, AI-based, or both?," arXiv preprint arXiv:1902.02647, February 2019, pp. 1–43.

[111] N. C. Luong, D. T. Hoang, S. Gong, D. Niyato, P. Wang, Y.-C. Liang, and D. I. Kim, "Applications of deep reinforcement learning in communications and networking: A survey," arXiv preprint arXiv:1810.07862, October 2018, pp. 1–36.

[112] W. Lee, M. Kim, and D.-H. Cho, "Deep cooperative sensing: Cooperative spectrum sensing based on convolutional neural networks," IEEE Transactions on Vehicular Technology, 2019.

[113] L. Zhang, J. Tan, Y.-C. Liang, G. Feng, and D. Niyato, "Deep reinforcement learning based modulation and coding scheme selection in cognitive heterogeneous networks," arXiv preprint arXiv:1811.02868, 2018.

[114] H. Huang, J. Yang, H. Huang, Y. Song, and G. Gui, "Deep learning for super-resolution channel estimation and DOA estimation based massive MIMO system," IEEE Transactions on Vehicular Technology, vol. 67, no. 9, pp. 8549–8560, 2018.

[115] M. Kim, N.-I. Kim, W. Lee, and D.-H. Cho, "Deep learning-aided SCMA," IEEE Communications Letters, vol. 22, no. 4, pp. 720–723, 2018.

[116] T. J. O'Shea, T. Erpek, and T. C. Clancy, "Deep learning based MIMO communications," CoRR, vol. abs/1707.07980, 2017. [Online]. Available: http://arxiv.org/abs/1707.07980

[117] H. Ye, G. Y. Li, and B. H. Juang, "Power of deep learning for channel estimation and signal detection in OFDM systems," IEEE Wireless Communications Letters, vol. 7, no. 1, pp. 114–117, Feb 2018.

[118] K. I. Ahmed, H. Tabassum, and E. Hossain, "A deep Q-learning method for downlink power allocation in multi-cell networks," arXiv preprint arXiv:1904.13032, April 2019.

[119] U. Challita, L. Dong, and W. Saad, "Proactive resource management in LTE-U systems: A deep learning perspective," CoRR, vol. abs/1702.07031, 2017. [Online]. Available: http://arxiv.org/abs/1702.07031

[120] H. Ye and G. Y. Li, "Deep reinforcement learning for resource allocation in V2V communications," in 2018 IEEE International Conference on Communications (ICC), May 2018, pp. 1–6.

[121] ——, "Deep reinforcement learning for resource allocation in V2V communications," in 2018 IEEE International Conference on Communications (ICC), May 2018, pp. 1–6.

[122] X. Du, H. Zhang, H. V. Nguyen, and Z. Han, "Stacked LSTM deep learning model for traffic prediction in vehicle-to-vehicle communication," in 2017 IEEE 86th Vehicular Technology Conference (VTC-Fall), Sep. 2017, pp. 1–5.

[123] Y. He, N. Zhao, and H. Yin, "Integrated networking, caching, and computing for connected vehicles: A deep reinforcement learning approach," IEEE Transactions on Vehicular Technology, vol. 67, no. 1, pp. 44–55, Jan 2018.

[124] L. T. Tan and R. Q. Hu, "Mobility-aware edge caching and computing in vehicle networks: A deep reinforcement learning," IEEE Transactions on Vehicular Technology, vol. 67, no. 11, pp. 10190–10203, Nov 2018.

[125] K. I. Ahmed, H. Tabassum, and E. Hossain, "Deep learning for radio resource allocation in multi-cell networks," CoRR, vol. abs/1808.00667, 2018. [Online]. Available: http://arxiv.org/abs/1808.00667

[126] J. B. Wang, J. Wang, Y. Wu, J. Y. Wang, H. Zhu, M. Lin, and J. Wang, "A machine learning framework for resource allocation assisted by cloud computing," IEEE Network, vol. 32, no. 2, pp. 144–151, March 2018.

[127] A. Testolin, M. Zanforlin, M. D. F. D. Grazia, D. Munaretto, A. Zanella, M. Zorzi, and M. Zorzi, "A machine learning approach to QoE-based video admission control and resource allocation in wireless systems," in 2014 13th Annual Mediterranean Ad Hoc Networking Workshop (MED-HOC-NET), June 2014, pp. 31–38.

[128] I. AlQerm and B. Shihada, "Enhanced machine learning scheme for energy efficient resource allocation in 5G heterogeneous cloud radio access networks," in 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Oct 2017, pp. 1–7.

[129] Y. et al., "Learning-based network path planning for traffic engineering," Elsevier, vol. 92, September 2018, pp. 59–67.

[130] P. Perera, Y.-C. Tian, C. Fidge, and W. Kelly, "A comparison of supervised machine learning algorithms for classification of communications network traffic," Springer, 2017, pp. 445–454.

[131] W. Xing, N. Wang, C. Wang, F. Liu, and Y. Ji, "Resource allocation schemes for D2D communication used in VANETs," in 2014 IEEE 80th Vehicular Technology Conference (VTC2014-Fall), Sep. 2014, pp. 1–6.

[132] Y. Xu, H. Zhou, X. Wang, and B. Zhao, "Resource allocation for scalable video streaming in highway VANET," in 2015 International Conference on Wireless Communications Signal Processing (WCSP), Oct 2015, pp. 1–5.

[133] M. Li, L. Zhao, and H. Liang, "An SMDP-based prioritized channel allocation scheme in cognitive enabled vehicular ad hoc networks," IEEE Transactions on Vehicular Technology, vol. 66, no. 9, pp. 7925–7933, Sep. 2017.

[134] X. Cao, L. Liu, Y. Cheng, L. X. Cai, and C. Sun, "On optimal device-to-device resource allocation for minimizing end-to-end delay in VANETs," IEEE Transactions on Vehicular Technology, vol. 65, no. 10, pp. 7905–7916, Oct 2016.

[135] Y. Qi, H. Wang, L. Zhang, and B. Wang, "Optimal access mode selection and resource allocation for cellular-VANET heterogeneous networks," IET Communications, vol. 11, no. 13, pp. 2012–2019, 2017.

[136] L. P. Portugal-Poma, C. A. C. Marcondes, H. Senger, and L. Arantes, "Applying machine learning to reduce overhead in DTN vehicular networks," in 2014 Brazilian Symposium on Computer Networks and Distributed Systems, May 2014, pp. 94–102.

[137] R. Hussain, Z. Rezaeifar, and H. Oh, "A paradigm shift from vehicular ad hoc networks to VANET-based clouds," Wireless Personal Communications, vol. 83, no. 2, pp. 1131–1158, Jul 2015. [Online]. Available: https://doi.org/10.1007/s11277-015-2442-y

[138] R. Hussain, Z. Rezaeifar, Y.-H. Lee, and H. Oh, "Secure and privacy-aware traffic information as a service in VANET-based clouds," Pervasive and Mobile Computing, vol. 24, pp. 194–209, 2015, special issue on Secure Ubiquitous Computing. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1574119215001455

[139] R. Hussain, D. Kim, J. Son, J. Lee, C. A. Kerrache, A. Benslimane, and H. Oh, "Secure and privacy-aware incentives-based witness service in social internet of vehicles clouds," IEEE Internet of Things Journal, vol. 5, no. 4, pp. 2441–2448, Aug 2018.


[140] R. Hussain and H. Oh, “Cooperation-aware vanet clouds: Providingsecure cloud services to vehicular ad hoc networks,” JIPS, vol. 10, pp.103–118, 2014.

[141] L. Liang, H. Ye, and G. Y. Li, “Towards intelligent vehicular networks:A machine learning framework,” IEEE Internet of Things Journal, pp.1–1, 2019.

[142] H. R. Arkian, R. E. Atani, A. Diyanat, and A. Pourkhalili,“A cluster-based vehicular cloud architecture with learning-basedresource management,” The Journal of Supercomputing, vol. 71,no. 4, pp. 1401–1426, Apr 2015. [Online]. Available: https://doi.org/10.1007/s11227-014-1370-z

[143] M. A. Salahuddin, A. Al-Fuqaha, and M. Guizani, “Reinforcementlearning for resource provisioning in the vehicular cloud,” IEEEWireless Communications, vol. 23, no. 4, pp. 128–135, August 2016.

[144] M. S. Omar, S. A. Hassan, H. Pervaiz, Q. Ni, L. Musavian, S. Mumtaz,and O. A. Dobre, “Multiobjective optimization in 5g hybrid networks,”IEEE Internet of Things Journal, vol. 5, no. 3, pp. 1588–1597, June2018.

[145] M. Simsek, M. Bennis, and . Gven, “Context-aware mobility manage-ment in hetnets: A reinforcement learning approach,” in 2015 IEEEWireless Communications and Networking Conference (WCNC), March2015, pp. 1536–1541.

[146] K. Vasudeva, S. Dikmese, . Gven, A. Mehbodniya, W. Saad, andF. Adachi, “Fuzzy logic game-theoretic approach for energy efficientoperation in hetnets,” in 2017 IEEE International Conference onCommunications Workshops (ICC Workshops), May 2017, pp. 552–557.

[147] J. S. Perez, S. K. Jayaweera, and S. Lane, “Machine learning aidedcognitive rat selection for 5g heterogeneous networks,” in 2017 IEEEInternational Black Sea Conference on Communications and Network-ing (BlackSeaCom). IEEE, 2017, pp. 1–5.

[148] S. Fan, H. Tian, and C. Sengul, “Self-optimization ofcoverage and capacity based on a fuzzy neural network withcooperative reinforcement learning,” EURASIP Journal on WirelessCommunications and Networking, vol. 2014, no. 1, p. 57, Apr 2014.[Online]. Available: https://doi.org/10.1186/1687-1499-2014-57

[149] I. Alqerm and B. Shihada, “Sophisticated online learning scheme for green resource allocation in 5g heterogeneous cloud radio access networks,” IEEE Transactions on Mobile Computing, vol. 17, no. 10, pp. 2423–2437, Oct 2018.

[150] R. I. Ansari, C. Chrysostomou, S. A. Hassan, M. Guizani, S. Mumtaz, J. Rodriguez, and J. J. P. C. Rodrigues, “5g d2d networks: Techniques, challenges, and future prospects,” IEEE Systems Journal, vol. 12, no. 4, pp. 3970–3984, Dec 2018.

[151] S. Liu, Y. Wu, L. Li, X. Liu, and W. Xu, “A two-stage energy-efficient approach for joint power control and channel allocation in d2d communication,” IEEE Access, vol. 7, pp. 16940–16951, 2019.

[152] M. Ahmed, Y. Li, M. Waqas, M. Sheraz, D. Jin, and Z. Han, “A survey on socially aware device-to-device communications,” IEEE Communications Surveys Tutorials, vol. 20, no. 3, pp. 2169–2197, Thirdquarter 2018.

[153] P. Cheng, C. Ma, M. Ding, Y. Hu, Z. Lin, Y. Li, and B. Vucetic, “Localized small cell caching: A machine learning approach based on rating data,” IEEE Transactions on Communications, vol. 67, no. 2, pp. 1663–1676, Feb 2019.

[154] M. Haus, M. Waqas, A. Y. Ding, Y. Li, S. Tarkoma, and J. Ott, “Security and privacy in device-to-device (d2d) communication: A review,” IEEE Communications Surveys Tutorials, vol. 19, no. 2, pp. 1054–1079, Secondquarter 2017.

[155] S. Maghsudi and S. Stańczak, “Channel selection for network-assisted d2d communication via no-regret bandit learning with calibrated forecasting,” IEEE Transactions on Wireless Communications, vol. 14, no. 3, pp. 1309–1322, March 2015.

[156] A. Asheralieva and Y. Miyanaga, “An autonomous learning-based algorithm for joint channel and power level selection by d2d pairs in heterogeneous cellular networks,” IEEE Transactions on Communications, vol. 64, no. 9, pp. 3996–4012, Sept 2016.

[157] Y. Luo, Z. Shi, X. Zhou, Q. Liu, and Q. Yi, “Dynamic resource allocations based on q-learning for d2d communication in cellular networks,” in 2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), Dec 2014, pp. 385–388.

[158] M. I. Khan, M. M. Alam, Y. L. Moullec, and E. Yaacoub, “Throughput-aware cooperative reinforcement learning for adaptive resource allocation in device-to-device communication,” Future Internet, vol. 9, no. 4, 2017. [Online]. Available: http://www.mdpi.com/1999-5903/9/4/72

[159] I. AlQerm and B. Shihada, “Energy-efficient power allocation in multitier 5g networks using enhanced online learning,” IEEE Transactions on Vehicular Technology, vol. 66, no. 12, pp. 11086–11097, Dec 2017.

[160] C.-K. Wen, S. Jin, K.-K. Wong, J.-C. Chen, and P. Ting, “Channel estimation for massive mimo using gaussian-mixture bayesian learning,” IEEE Transactions on Wireless Communications, vol. 14, no. 3, pp. 1356–1368, 2015.

[161] J. Zhang, Y. Zhuo, and Y. Zhao, “Mobile location based on svm in mimo communication systems,” in 2010 International Conference on Information, Networking and Automation (ICINA), vol. 2. IEEE, 2010, pp. V2–360.

[162] C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, and L. Hanzo, “Machine learning paradigms for next-generation wireless networks,” IEEE Wireless Communications, vol. 24, no. 2, pp. 98–105, 2017.

[163] K. Kim, J. Lee, and J. Choi, “Deep learning based pilot allocation scheme (dl-pas) for 5g massive mimo system,” IEEE Communications Letters, vol. 22, no. 4, pp. 828–831, 2018.

[164] Z. Dong, J. Shi, W. Wang, and X. Gao, “Machine learning based link adaptation method for mimo system,” in 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). IEEE, 2018, pp. 1226–1231.

[165] R. C. Daniels, C. M. Caramanis, and R. W. Heath, “Adaptation in convolutionally coded mimo-ofdm wireless systems through supervised learning and snr ordering,” IEEE Transactions on Vehicular Technology, vol. 59, no. 1, pp. 114–126, 2010.

[166] W. V. Mauricio, D. C. Araujo, F. H. C. Neto, F. R. M. Lima, and T. F. Maciel, “A low complexity solution for resource allocation and sdma grouping in massive mimo systems,” in 2018 15th International Symposium on Wireless Communication Systems (ISWCS). IEEE, 2018, pp. 1–6.

[167] I. Ahmed and H. Khammari, “Joint machine learning based resource allocation and hybrid beamforming design for massive mimo systems,” in 2018 IEEE Globecom Workshops (GC Wkshps), Dec 2018, pp. 1–6.

[168] J.-B. Wang, J. Wang, Y. Wu, J.-Y. Wang, H. Zhu, M. Lin, and J. Wang, “A machine learning framework for resource allocation assisted by cloud computing,” IEEE Network, vol. 32, no. 2, pp. 144–151, 2018.

[169] Y. Saito, Y. Kishiyama, A. Benjebbour, T. Nakamura, A. Li, and K. Higuchi, “Non-orthogonal multiple access (NOMA) for cellular future radio access,” in 2013 IEEE 77th Vehicular Technology Conference (VTC Spring). IEEE, 2013, pp. 1–5.

[170] M. N. Jamal, S. A. Hassan, D. N. K. Jayakody, and J. J. P. C. Rodrigues, “Efficient nonorthogonal multiple access: Cooperative use of distributed space-time block coding,” IEEE Vehicular Technology Magazine, vol. 13, no. 4, pp. 70–77, Dec 2018.

[171] M. Vaezi, R. Schober, Z. Ding, and H. V. Poor, “Non-orthogonal multiple access: Common myths and critical questions,” arXiv preprint arXiv:1809.07224, 2018.

[172] G. Gui, H. Huang, Y. Song, and H. Sari, “Deep learning for an effective nonorthogonal multiple access scheme,” IEEE Transactions on Vehicular Technology, vol. 67, no. 9, pp. 8440–8450, 2018.

[173] D. A. Awan, R. L. Cavalcante, M. Yukawa, and S. Stanczak, “Detection for 5G-NOMA: An online adaptive machine learning approach,” in 2018 IEEE International Conference on Communications (ICC). IEEE, 2018, pp. 1–6.

[174] L. Xiao, Y. Li, C. Dai, H. Dai, and H. V. Poor, “Reinforcement learning-based NOMA power allocation in the presence of smart jamming,” IEEE Transactions on Vehicular Technology, vol. 67, no. 4, pp. 3377–3389, 2018.

[175] J. Cui, Z. Ding, P. Fan, and N. Al-Dhahir, “Unsupervised machine learning-based user clustering in millimeter-wave-NOMA systems,” IEEE Transactions on Wireless Communications, vol. 17, no. 11, pp. 7425–7440, 2018.

[176] F. Nikjoo, A. Mirzaei, and A. Mohajer, “A novel approach to efficient resource allocation in NOMA heterogeneous networks: Multi-criteria green resource management,” Applied Artificial Intelligence, vol. 32, no. 7-8, pp. 583–612, 2018.

[177] B. Wang, H. Zhang, K. Long, G. Liu, and X. Li, “Green resource allocation in intelligent software defined NOMA networks,” in International Conference on Machine Learning and Intelligent Communications. Springer, 2017, pp. 418–427.

[178] M. Liu, T. Song, and G. Gui, “Deep cognitive perspective: Resource allocation for NOMA based heterogeneous IoT with imperfect SIC,” IEEE Internet of Things Journal, 2018.

[179] S. et al., “Deep learning for the Internet of Things,” IEEE Computer, vol. 51, no. 5, pp. 32–41, May 2018.


[180] J. Su, D. V. Vargas, and K. Sakurai, “One pixel attack for fooling deep neural networks,” CoRR, vol. abs/1710.08864, 2017. [Online]. Available: http://arxiv.org/abs/1710.08864