Distributed Learning for Energy-Efﬁcient Resource ... · heterogeneous networks (HetNets), in which different macro cell BSs (MBSs) and small cell BSs (SBSs) must coexist, have

0018-9545 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TVT.2017.2696974, IEEETransactions on Vehicular Technology

Distributed Learning for Energy-Efficient ResourceManagement in Self-Organizing Heterogeneous

NetworksAtefeh Hajijamali Arani, Abolfazl Mehbodniya, Senior Member, IEEE, Mohammad Javad Omidi, Fumiyuki

Adachi, Life Fellow Member, IEEE, Walid Saad, Senior Member, IEEE, and Ismail Guvenc, Senior Member, IEEE

Abstract—In heterogeneous networks, a dense deployment ofbase stations (BSs) leads to increased total energy consumption,and consequently increased co-channel interference (CCI). Inthis paper, to deal with this problem, self-organizing mechanismsare proposed for joint channel and power allocation procedureswhich are performed in a fully distributed manner. A dynamicchannel allocation mechanism is proposed, in which the problemis modeled as a noncooperative game, and a no-regret learningalgorithm is applied for solving the game. In order to improve theaccuracy and reduce the effect of shadowing, we propose anotherchannel allocation algorithm executed at each user equipment(UE). In this algorithm, each UE reports the channel withminimum CCI to its associated BS. Then, the BS selects itschannel based on these received reports. To combat the energyconsumption problem, BSs choose their transmission power byemploying an ON-OFF switching scheme. Simulation results showthat the proposed mechanism, which is based on the second pro-posed channel allocation algorithm and combined with the ON-OFF switching scheme, balances load among BSs. Furthermore,it yields significant performance gains up to about 40.3%, 44.8%,and 70.6% in terms of average energy consumption, UE’s rate,and BS’s load, respectively, compared to a benchmark based onan interference-aware dynamic channel allocation algorithm.

Index Terms—Heterogeneous Networks; Energy Efficiency;Co-Channel Interference; Learning Algorithm.

I. INTRODUCTION

With the anticipated growth in smartphone penetration anddemand for ubiquitous information access, next-generationwireless networks face enormous challenges to meet the ever-increasing network capacity [1]. Among them, the growingenergy consumption of the networks is a main challengewhich will directly result in the increase of carbon footprintand particularly environmental problems [2]. In the networks,

Copyright (c) 2015 IEEE. Personal use of this material is permitted.However, permission to use this material for any other purposes must beobtained from the IEEE by sending a request to [email protected].

This research is supported by “Towards Energy-Efficient Hyper-DenseWireless Networks with Trillions of Devices”, the Commissioned Research ofNICT, JAPAN and KDDI foundation research grant, “Energy-Efficient RadioResource Management for Next Generation Wireless Network”, in part by theU.S. National Science Foundation under CNS-1406968 and CNS-1460333.

A. Hajijamali Arani and MJ.Omidi are with the Department of Electricaland Computer Engineering, Isfahan University of Technology, Isfahan, 84156-83111, Iran (e-mail: [email protected]; [email protected]).

A. Mehbodniya and F. Adachi are with the Research Organiza-tion of Electrical Communication Tohoku University, 2-1-1 Katahira,Aoba-ku, Sendai, 980-8577 Japan (e-mail: [email protected];[email protected]).

W. Saad is with Wireless@VT, Bradley Department of Electrical andComputer Engineering, Virginia Tech, Blacksburg, VA, 24061 USA, (e-mail:[email protected]).

I. Guvenc is with the Department of Electrical and Computer Engineering,North Carolina State University, Raleigh, NC (e-mail: [email protected]).Preliminary results from this work were presented at the IEEE ICC 2016.

base stations (BSs) consume a considerable amount of energy(amounting to about 60%–80% of the whole energy con-sumption) [3]. Therefore, reducing the energy consumptionof BSs can lead to significant energy savings. In this respect,heterogeneous networks (HetNets), in which different macrocell BSs (MBSs) and small cell BSs (SBSs) must coexist,have emerged as a promising approach to increase capacity andimprove energy efficiency (EE) of the next-generation wirelessnetworks [2], [4]–[7].

However, the deployment of HetNets introduces signif-icant technical issues which need to be addressed. Someof these issues include radio resource allocation and self-organization [8]. Unlike MBSs, SBSs are likely to be user-deployed in an ad hoc manner in unplanned locations withoutany operator supervision. Accordingly, self-organizing net-work (SON) has been recognized as a key approach whichimproves the network’s performance, decreases the networkingcost, and enhances the intelligent management [9]. SON allowsSBSs to adapt to the changes in the network’s conditions,and lead their strategies to provide proper performance in adistributed and flexible manner with minimal human interven-tion [10].

On the other hand, the traffic load of SBSs and MBSsis dynamic across time and space domain [11]. Accordingto [12], for a large time portion during a day, the traffic loadis much below the peak traffic load, and a large number ofBSs are often under-utilized. In current cellular networks, BSs’deployment and operation are usually performed based on thepeak traffic load. Therefore, in low traffic load situations, theEE of BSs will decrease. This observation requires networkoperators to design and utilize effective methods for managingthe networks in a much more energy efficient way than theexisting cellular networks. For instance, techniques such asBS ON-OFF switching (alternatively termed as sleep mode)and cell breathing are suitable solutions to reduce the energyconsumption of BSs. In BS ON-OFF switching method, theBS can switch to low power consumption modes in light trafficload conditions. Using cell breathing technique allows the BSto adaptively adjust its cell size according to the traffic loadconditions [13], [14]. Therefore, by using these methods, thenetwork can be well adapted to spatial and temporal trafficfluctuations. Additionally, in dense deployment of HetNets,tackling the co-channel interference (CCI) is still a majorchallenge and a bottleneck towards improving network per-formance. Thus, the problem of channel sharing among BSsis of utmost importance.



MBS 1

SBS 1

SBS 2

MUE 1

MUE 2MUE 3

SUE 2

Beacon signalData siganlUser’s report signal

Beacon DataSensing

Time frame

1 10State of BSs

MBS 1 SBS 1 SBS 2

Figure 1. A two-tier HetNet model for the channel assignment scheme based on Gibbs sampling method with BSs which are able to switch between ON andOFF states. In beacon transmission phase, the BSs transmit their estimated loads. In sensing phase, the BSs receive the reports from their associated UEs. Indata phase, the BSs send data packets to their associated UEs.

A. Related Work

Several new techniques have been proposed to enhance theEE of HetNets via various approaches such as sleep modeand cell breathing [12], [15]–[22]. In [12], a user equipment(UE) association mechanism for sleeping cell UEs based onmaximum mean channel access probability is proposed. Theproposed mechanism adapts to traffic load at BSs, receivedsignal power, and cumulative interference power. In [15], acooperative optimization problem for maximizing EE, subjectto minimum received signal strength with a set of clustersis considered. For solving the problem, each cluster employssimulated annealing search algorithm in a distributed manner.However, this work does not consider each BS’s load. Theauthors in [16] proposed a set of small cell driven, corenetwork driven, and UE driven sleep mode algorithms forSBSs in HetNets.

In [17], a centralized sleep mode technique for HetNetsby exploiting the cooperation of cells is proposed. In [18],a multi-objective optimization problem based on sleep modetechnique for an orthogonal frequency-division multiple access(OFDMA) based system is developed. In order to find thesolution, a genetic algorithm is applied. However, the proposedmechanisms in [17] and [18] rely on centralized methods, andcome at the expense of knowing global information. In [19], arandom SBS ON-OFF switching scheme is introduced, wherethe UEs can delay their transmissions for a closer SBS tobecome available, thereby significantly improving EE.

In [20], network energy consumption is minimized byoptimizing the cell sizes. Moreover, it is shown that theoptimal cell size from an energy perspective depends on BStechnology, data rates, and traffic demands. In [21], the authorsconcentrated on separating the control plane from the dataplane, and propose a probabilistic sleeping approach basedon the traffic load for data BSs. In [22], an opportunisticsleep mode scheme based on a noncooperative game for thedownlink transmission of a two–tier HetNet is proposed. Forsolving the proposed game, a learning approach in a distributedmanner is applied. Nonetheless, despite the insights glancedfrom these interesting studies [12], [15]–[18], [20]–[22] inimproving the EE of the networks, they focus on only aparticular challenge of the HetNets, i.e. energy consumption,and they do not consider channel allocation issues.

Since only a limited number of channels are allocated tothe network, efficient assignment of channels among BSs is

an important issue. In this regard, several studies have ad-dressed channel allocation problem, and suggest some schemesto share channels among BSs. In [23], a dynamic channelassignment algorithm based on interference information insurrounding cells for a multihop cellular network is proposed.In [24], a centralized dynamic channel assignment based oninterference conditions and traffic demands is considered.Several distributed dynamic channel assignment approachesfor cellular networks are investigated in [25], [26]. Moreover,a number of schemes based on neural network [27]–[29], tabusearch [30]–[33], genetic algorithm [34], and simulated an-nealing algorithm [35] have been proposed. Although efficientchannel assignment is a significant part of resource allocationproblem, it is necessary to take into consideration techniques inorder to improve the EE of the networks that these works [23]–[35] do not consider EE issue in the networks, especially fromthe network operator’s perspective.

The aforementioned works [12], [15]–[18], [20]–[35] do notjointly address power and channel allocation problems in adistributed manner. Moreover, they do not consider scenarioswith UEs’ mobility, handoff problems, and the impact ofresource allocation mechanisms on UE’s quality-of-service(QoS).

B. Contribution

The main contribution of this paper is to introduce anovel framework for joint channel allocation and BS ON-OFFswitching problem in densely deployed HetNets. We proposetwo novel channel allocation schemes. The first scheme isimplemented at the level of BSs, allowing BSs to dynami-cally choose their channels, and adapt them to the network’sconditions. Since the channels themselves do not have theirown individual payoff functions or preferences, investigatingnoncooperative game is more suitable compared to a matchinggame with two-sided preferences. Moreover, in a matchinggame, the matching can require additional signaling betweentwo player sets which can lead to overhead in the design,and increase complexity [36]. On the other hand, due to thedistributed nature of networks and the competition among BSs,we cast the problem as a noncooperative game. To solve thisgame, a novel distributed learning-based approach is used, inwhich the knowledge of received interference on each channelis needed for choosing the appropriate channel. The needfor this distributed learning solution is desirable as it has



several advantages. First, since BSs can autonomously selecttheir channels based on the environmental information aboutthe channels without information exchange, it can reduce thesignaling overhead in the network. Moreover, this informationis useful for BSs to select their strategies with better long-termperformance. Second, a distributed decision taken by a BS isindependent on the number of BSs in the network. Therefore,this approach is a suitable solution for densely deployedHetNets, and when the number of BSs varies over time, andthere is no centralized controller. Furthermore, centralizedapproaches rely on a single controller entity. If the controllerentity is compromised, then it can lead to failures at theentire network. Thus, this distributed approach can improvethe network’s robustness to failures and attacks.

The second proposed scheme is implemented at the levelof UEs. In this algorithm, all UEs in the coverage area ofa BS choose their priority channels, and report them to theBS. After collecting all the reports, the BS decides to chooseits channel based on the reports received from its associatedUEs, using Gibbs sampling method [37], [38]. Fig. 1 illustratesour two-tier network model based on BS ON-OFF switchingand channel assignment algorithm based on Gibbs samplingmethod.

The advantage of the first proposed algorithm is its simplic-ity and low complexity as opposed to the second algorithmwhich requires feedback from UEs to BSs. However, wecan show intuitively that the second algorithm manages thechannels among BSs more effectively because of employingadditional information from UEs. Furthermore, in order toimprove the EE of the network, an ON-OFF switching al-gorithm similar to [39] is implemented at BSs, in which BSsdynamically choose their transmit power. The transmit poweris decided using a machine-learning based game-theoreticalgorithm, and by evaluating a payoff function. Moreover, weconsider mobility for UEs, and use a Manhattan grid modelwhich is a more realistic mobility model than the randomwaypoint model. The handoff decision is made according toBS’s estimated load which is periodically advertised by BSsthrough beacon signals. Simulation results show that the pro-posed approaches improve the network’s performance in termsof energy consumption and throughput as compared to thebaseline approaches. As a result the proposed methods achieveboth EE and spectral efficiency (SE) in a fully distributedmanner without any need for information exchanges amongBSs.

The rest of this paper is organized as follows. In SectionII, we introduce our system model. Section III describesthe problem formulation including UE association and BSoperation problem. In Section IV, we propose our joint channeland power allocation schemes. Section V presents a no-regret learning approach for solving the proposed games. Thesimulation results are presented in Section VI, and conclusionsare drawn in Section VII.

Notations: In this paper, the following notations are used.The regular and boldface symbols refer to scalars and matrices,respectively. For any finite set A, the cardinality of set A andthe set of all probability distributions over it are denoted by|A| and D(A), respectively. {f(x) |g(x)} represents the subset

of all values f(x) for which the assertion g(x) about x is true.X = [xi,j ]M×N and xi,j represent matrix X with dimensionM -by-N and the element in row i and column j of matrixX , respectively, which i = 1, . . . , M and j = 1, . . . , N . Thefunction 1φ denotes the indicator function which equals 1 ifevent φ is true and 0, otherwise. Moreover, [x]+ = max{0, x}denotes the positive part of x. The set of real numbers isdenoted by R.

II. SYSTEM MODEL

A. Deployment ScenarioWe consider the downlink of a two-tier HetNet with a set of

BSs B = {1, . . . , |B|} including a set of MBSs BM overlaidwith a set of SBSs BS, i.e. B = BM ∪ BS. For each coveragearea, a MBS is located at the center of area, and SBSs areuniformly distributed within the coverage of the MBS. Wedenote by K the set of active uniformly distributed mobileUEs with different data rate demands. Since in this paper,we have been focusing more on issues regarding learningschemes for energy-efficient resource management, we assumea single antenna for each UE and BS. However, using multipleinput multiple output-orthogonal frequency division multiplex-ing (MIMO-OFDM) technique can improve the transmissionquality (throughput, bit error rate, etc.). Moreover, the powerconsumption model of BSs depends on the number of trans-mitting antennas.

We assume that the total bandwidth W is equally dividedamong several orthogonal channels with bandwidth W/|Q|where Q = {1, . . . , |Q|} is the set of available channels, with|Q| < |B|. Moreover, MBSs and SBSs can operate over thesame channel. OFDM symbols are grouped into a collection ofphysical resource blocks (RBs). We consider RM and RS RBsfor MBSs and SBSs, respectively, that are distributed amongtheir associated UEs. We assume an open access scheme forBSs in the system, i.e. the UEs are allowed to associate withany tier’s BSs. In order to save power consumption under lowtraffic load, some BSs can be switched to an OFF state (orsleep mode). The vector of BSs’ states at time t in RB r isdenoted by It,r = [It,rb ]|B|×1

, with each element It,rb beingequal to 1 if BS b is in ON state (or active mode) at time t,and equal to 0, otherwise. Therefore, the set of active BSs attime t in RB r is given by:

BrON(t) = {b|It,rb = 1,∀b ∈ B }. (1)

Let PMm,r(t) (P S

s,r(t)) be the transmit power of MBS m ∈ BM

(SBS s ∈ BS) over RB r ∈ RM (RS) at time t. We denote byqb,r(t) the channel over which BS b is transmitting at timet. We treat interference as noise. Therefore, the signal-to-interference-and-noise-ratio (SINR) at the receiver of macrocell UE (MUE) k ∈ K associated with MBS m ∈ BM

transmitting over channel qm,r(t) ∈ Q, and allocated inRB r ∈ RM at time t is defined by (2). In (2), gM

m,k(t)

(gSs,k(t)) denotes the total channel gain including path loss and

lognormal shadow fading between MBS m (SBS s) and UE kat time t. Since the time scale for measuring the total channelgain is much larger than the time scale of fast fading, we donot consider fast fading. Let σ2 be the additive white Gaussiannoise (AWGN) power level per RB at the receiver of UEs, and



γm,k,r (t) =PMm,r (t) gM

m,k(t)∑m∈BM, m6=m

PMm,r(t)I

t,rm gM

m,k(t)1(qm,r(t)=qm,r(t))︸︷︷︸Im,rMBS→MBS

+∑s∈BS

P Ss,r (t) It,rs gS

s,k (t)1(qm,r(t)=qs,r(t))︸︷︷︸Im,rSBS→MBS

+σ2. (2)

γs,k,r (t) =P Ss,r (t) gS

s,k(t)∑m∈BM

PMm,r(t)I

t,rm gM

m,k(t)1(qs,r(t)=qm,r(t))︸︷︷︸Is,rMBS→SBS

+∑

s∈BS, s6=s

P Ss,r(t)I

t,rs gS

s,k(t)1(qs,r(t)=qs,r(t))︸︷︷︸Is,rSBS→SBS

+σ2. (3)

assumed to be constant for all UEs. Im,rMBS→MBS ( Is,rMBS→SBS)and Im,rSBS→MBS (Is,rSBS→SBS) indicate the interference causedby MBSs and SBSs over MBS m (SBS s), respectively. TheSINR at the receiver of small cell UE (SUE) k ∈ K associatedwith SBS s ∈ BS transmitting over channel qs,r(t) ∈ Q, andallocated in RB r ∈ RS at time t is defined by (3).

From Shannon’s capacity formula, the achievable transmis-sion rate of UE k from BS b in RB r at time t in bit/sec/Hzis given by:

Ck,r (t) =W

|Q|log2 (1 + γb,k,r (t)) . (4)

We assume that new flows arrive into the system with meanarrival rate λk(t) and mean packet size 1/µk(t) for UE k attime t. This assumption can capture different spatial trafficvariations by setting different arrival rates or file sizes fordifferent UEs, i.e. heterogeneous UEs. Therefore, the load den-sity of BS b at time t is defined as lb(t) =

{lkb (t)

∣∣k ∈ At,rb },which lkb (t) = λk(t)

µk(t)Ck,r(t) represents the time fraction requiredto deliver the traffic load density from BS b to UE k. Let At,rbbe the set of UE associated with BS b in RB r at time t, whichis defined in Section III. Hence, the load of BS b at time t isexpressed by:

Lb (t) =∑k∈At,rb

lkb (t). (5)

Moreover, for a queue system M/M/1, the average number offlows at BS b is equal to Lb

1−Lb which is proportional to theexpected delay at BS b [40]. The load of BS is an importantmetric for evaluating the UE’s QoS. Mathematically speaking,all the UEs which are associated with a BS can be satisfied ifthe BS’s load does not exceed 1. Nevertheless, when the BS’sload exceeds 1, some UEs which are associated with it mayexperience a sudden drop in their received throughput. Thisphenomenon is referred as dropped UEs.

B. Power Consumption ModelIn HetNets, the BSs of different tiers have different power

consumption. The total input power of a BS consists of thetransmission power and power consumed by the componentsof the BS, including power amplifier, radio frequency module,cooling system, baseband unit, DC-DC power supply, andmain supply. There are several models available for powerconsumption for the networks. In a simple model, it is assumed

that the BSs do not consume any power in a sleep mode.However, such a model is not realistic, and even in the sleepmode, the BSs still consume a certain amount of powerfor sensing purposes. To address this issue, we consider alinear power model for the BSs in the network [41]. Sucha linear model has two parts including static and dynamicpart. The static part involves the power consumed in the sleepmode, while the dynamic part depends on the transmit power.Therefore, the total power consumed by the BSs in the networkat time t can be expressed as follows [39]:

PNetwork (t) =∑b∈B

∑r∈Rj ,j∈{M,S}

PTotalb,r (t), (6)

where

PTotalb,r (t) = POFF

b +P jb,r(t)1(b∈BrON(t))

ηPAb Λ(1− λFeed

b ), j ∈ {M,S} , (7)

with

POFFb =

PRFb + PBB

b

Λ, (8)

andΛ=

(1− λDC

b

) (1− λMS

b

) (1− λCool

b

), (9)

where PTotalb,r (t) and POFF

b are the total power consumptionand the power consumption in sleep mode by BS b in RBr at time t, respectively. PRF

b and PBBb denote the power of

the radio frequency module and the total power of basebandengine consumed by BS b, respectively. ηPA

b indicates thepower amplifier efficiency of BS b. λFeed

b ,λDCb , λMS

b , and λCoolb

represent losses which are incurred by feeder, DC-DC powersupply, main supply, and cooling system, respectively [15]. Weassume that all parameters except P jb,r(t) for BSs in any tierare constant over time. However, this model does not capturethe embodied energy which comprises the energy consumedin the initial BS manufacturing and the energy associated withmaintaining during the effective lifetime of BSs [42].

III. PROBLEM FORMULATION

Given this model, our goal is to improve the EE of thenetwork and tackle the CCI problem by focusing on jointpower and channel allocation in a distributed manner. Fur-thermore, the association of the UEs to the BSs is determinedby the received signal power at the location of UE and BS’s



estimated load. At each time t, the HetNet can be configureddynamically based on a transmission power vector, P r (t), achannel vector, Qr (t), an association matrix between the UEsand BSs, At,r, and a state vector, It,r, as follows:

P r (t) = [P jb,r (t)]|B|×1

, r ∈ Rj , b ∈ B, j ∈ {M,S} ,

Qr (t) = [qb,r (t)]|B|×1, At,r = [at,rb,k]|B|×|K|

,

It,r = [It,rb ]|B|×1.

(10)

Each single binary element at,rb,k in matrix At,r represents theassociation relation between UE k and BS b such that at,rb,k = 1indicates UE k is associated with BS b at time t, otherwiseat,rb,k = 0. We define the set of UE associated with BS b, At,rb ,at time t as follow:

At,rb ={k∣∣∣ at,rb,k = 1, ∀k ∈ K

}. (11)

We assume that each BS b ∈ B broadcasts its estimated loadthrough a beacon signal in the downlink transmission [39].Then, the UEs periodically assess their actual performance.Thus, the set of dropped UEs, U , and the set of new UEs joinedto the network, N , should perform new association processesin order to assign to new BSs. Moreover, we assume that eachUE k ∈ K is associated with at most one BS at each time t,i.e., max(

∑b∈B a

t,rb,k) = 1. Given the set of BSs, we define an

optimization problem which involves finding the elements ofmatrix At,r (i.e., at,rb,k ∀b ∈ B, and ∀k ∈ K) as follows:

maxAt,r

=[at,rb,k

]

∑∀k∈K,

k∈(U⋃

N )

∑∀b∈B,

j∈{M,S}

{at,rb,k P

jb,r(t)It,rb (βb1(b∈BS)

+ 1(b∈BM))gjb,k(t)(1− Lb(t))

},

subject to at,rb,k ∈ {0, 1}, ∀b ∈ B, ∀k ∈ K,∑∀b∈B

at,rb,k ≤ 1, ∀k ∈ K,

(12)where βb denotes the cell range expansion (CRE) bias used bySBS b ∈ BS in order to effectively expand its coverage area.By convention, MBSs have a bias 1 (0 dB) [43]. Moreover,SBSs can adaptively optimize this parameter based on thenetwork’s conditions [44]. However, here we assume that theSBSs have a fixed CRE bias. Let Lb(t) be the estimated loadof BS b ∈ B at time t, which is obtained according to:

Lb (t) = Lb (t− 1)+

(1

t

)α (Lb(t− 1)− Lb(t− 1)

), (13)

where α > 0 is learning rate exponent for the load estimation.For solving the above problem, the UEs choose their as-

sociated BSs using exhaustive search within the set of BSs.For each BS b ∈ B, we define a utility function which isthe difference between its benefit and cost, and the BS aimsat maximizing it. Since each BS has incentive to increasethe number of UEs served by it, we consider the fractionof UEs served by the BS (serving ratio) as the benefit [45].The cost function for each BS is including its total energyconsumption and load. The weighted benefit function nrb (t)

and cost function crb (t) for BS b in RB r at time t can beexpressed as [46]:

nrb (t) = ωnb

∣∣At,rb ∣∣|K|

, (14)

crb (t) = ωlbLb(t) + ωp

b

PTotalb,r (t)

POFFb + PTXMax

b

, (15)

where

PTXMaxb =

PMaxb

ηPAΛ(1− λFeedb )

, (16)

where PMaxb denotes the maximum transmit power of BS b,

while ωnb , ωl

b, and ωpb denote the weight parameters which

indicate the impact of subscription benefit, load, and energyon the utility function for each BS b ∈ B, respectively. Hence,we define the utility function of BS b in RB r at time t by,

ψrb (t) =nrb (t)− crb (t) =

ωnb

∣∣At,rb ∣∣|K|

− ωlbLb (t)− ωp

b

PTotalb,r (t)

POFFb + PTXMax

b

.(17)

In order to deal with the CCI problem, each BS aims atselecting the channel with minimum CCI power. Thus, themaximization of network utility and minimization of the totalinterference to find the optimal resource allocation profile aregiven by the following problems:

(P1) maxP r(t),Qr(t), It,r

∑∀r∈Rj ,j∈{M,S}

∑∀b∈B

ψrb (t),

subject to 0 ≤ Lb (t) ≤ 1, ∀b ∈ B,∑r∈Rj

P jb,r(t) ≤ PMaxb , ∀b ∈ B, j ∈ {M,S},

qb,r (t) ∈ Q, ∀b ∈ B,It,rb ∈ {0, 1}, ∀b ∈ B,

(18)and

(P2) minP r(t),Qr(t), It,r

∑∀r∈Rj ,j∈{M,S}

∑∀b∈B

(Ib,rMBS→MBS + Ib,rSBS→MBS)

1(b∈BM) + (Ib,rMBS→SBS + Ib,rSBS→SBS)1(b∈BS),

subject to 0 ≤ Lb (t) ≤ 1, ∀b ∈ B∑r∈Rj

P jb,r(t) ≤ PMaxb , ∀b ∈ B, j ∈ {M,S},

qb,r (t) ∈ Q, ∀b ∈ B,It,rb ∈ {0, 1}, ∀b ∈ B.

(19)In this paper, we deal with above problems with a two-stepprocedure, including power and channel assignment.

IV. OPPORTUNISTIC RESOURCE ALLOCATION FORSELF-ORGANIZED HETNETS

In this section, we propose our distributed power andchannel assignment schemes. We assume that the BSs indepen-dently select their transmission strategies. In this regard, for



power allocation, each BS individually selects its transmissionpower according to the following ON-OFF switching ap-proach. For channel allocation, each BS selects its channel byutilizing one of our proposed channel assignment mechanisms.Note that, the traffic pattern dynamically varies in time andspace domain, but could be assumed approximately constantfor a particular period of time. Therefore, we can make areasonable assumption of time-scale, in which the variationof network traffic pattern is determined at a slower time-scalecompared to the dynamic BS operation. Furthermore, basedon the number of iterations required for convergence and theprocessor’s strength, the convergence time for the proposedalgorithms would be in the order of minutes, where trafficpattern would not experience a significant change.

A. Power Allocation Sub-Problem: Game-Theoretic Frame-work (ON-OFF Switching)

The described resource allocation problem can be formu-lated as noncooperative games. Here, we only consider powerallocation sub-problem, and name it ON-OFF switching ap-proach. The normal form of the game is expressed as follows:

G(P ) =

(B, {S(P )

b }b∈B,{u

(P )b

}b∈B

), (20)

where B represents the set of players, S(P )b =

{s(P )b,1 , . . . , s

(P )

b,|S(P )b |} is the strategy set of player b where s(P )

b,i

denotes the ith pure strategy of player b. u(P )b represents

the payoff function of player b in game G(P ). The players,strategy set and payoff function are defined as follows:• Players: The set of players B corresponds to the set of

BSs B. For a valid game we require |B|> 1.• Strategy set: A pure strategy of player b is its trans-

mission power. The available pure strategies for BS b

are S(P )b = {0, 1

|S(P )b |−1

PMaxb , . . . ,

|S(P )b |−1

|S(P )b |−1

PMaxb } where

|S(P )b |≥ 2. Since the strategy space for player b is

including of s(P )b,i = 0 and s

(P )b,i 6= 0, the strategy of

BS b not only is composed of transmission power, butalso it is composed of the state of BS b.

• Payoff: The payoff function u(P )b of player b is defined

as (17).In Section V, we discuss the solution of this game.

B. Channel Assignment Sub-ProblemIn this subsection, we focus on channel assignment sub-

problem, and propose two novel channel assignment mech-anisms. In particular, our proposed mechanisms function ina fully distributed manner. Thus, no central controller isneeded. In our first proposed mechanism, we investigate thecompetition among BSs using a game-theoretic approach. Inour proposed game, BSs are the players, in which each playeraims at choosing a channel with minimum CCI power. Then,a no-regret learning approach, presented in Section V, is usedto solve the game. The algorithm is a probabilistic learningprocedure, in which each strategy is played with a probabilitybased on the regret measurement. The proposed algorithm

is particularly useful because it does not need to exchangeinformation in the network.

Our second approach leverages reports from UEs to BSs,in which each UE computes the average received CCI powerover each channel, and selects the channel having the lowestaveraged CCI power. Then, BSs choose their channels basedon the reports received from their associated UEs. For thiscase, BSs apply a probabilistic approach for choosing theirchannels inspired from softmax selection approach [47], usingGibbs sampling method.

1) Channel Assignment Mechanism 1: Game-TheoreticFramework: The channel assignment sub-problem can alsobe formulated as a noncooperative game. The normal form ofgame is expressed as follows:

G(Q) =

(B, {S(Q)

b }b∈B,{u

(Q)b

}b∈B

), (21)

where S(Q)b is the strategy set of player b, and u(Q)

b representsthe payoff function of player b in game G(Q). The differencesbetween G(Q) and the game described in previous subsection,G(P ), are in the strategy set of players and payoff function.Therefore, the strategy set and payoff function are defined asfollows:• Strategy set: A pure strategy of player is its channel.

Therefore, the available pure strategies for each player bare S(Q)

b = {s(Q)b,1 , . . . , s

(Q)

b,|S(Q)b |} where S(Q)

b = Q.• Payoff: The payoff function of player b is defined as

follows:

u(Q)b = −(Ib,rMBS→MBS + Ib,rSBS→MBS)1(b∈BM)

− (Ib,rMBS→SBS + Ib,rSBS→SBS)1(b∈BS).(22)

To solve the game, we apply the learning algorithm describedin Section V, and name it dynamic channel assignment basedon learning algorithm (DCA-LA). In this algorithm, eachplayer is interested in minimizing its average regret over time.As a result, it assigns higher probability to the channel withmore regret.

2) Channel Assignment Mechanism 2: Gibbs Sampling-Based UE Interference-Aware: The performance of networkshinges on CCI experienced at the UE’s location which variesby the network’s conditions. In light of this, we propose achannel assignment scheme, referred as Gibbs sampling-basedUE interference-aware (GUIA) technique. This scheme isbased on the interference that the UEs of each BS experience.In the proposed approach, each UE k at the coverage areaof BS b computes the average CCI power received over eachchannel from other BSs in the network. Using the first orderfiltering with a constant forgetting factor λ, the average CCIpower at the receiver of UE k over channel q at time t in RBr is given by,

Iqk,b,r(t) = (1− λ)Iqk,b,r(t) + λ Iqk,b,r(t− 1), (23)

where Iqk,b,r(t) denotes instantaneous CCI power received byUE k over channel q at time t in RB r. Each UE k ∈ K has aCCI table which updates for all q ∈ Q. Then, UE k selects the



channel with minimum CCI power from this table, and sendit (i.e. the channel itself) to its associated BS b according to:

qk,b,r(t) = argminq∈Q

Iqk,b,r(t),

subject to k ∈ At,rb .(24)

Each BS b ∈ B selects its channel using the received reports.In this regard, BS b calculates the repetition of each channel(i.e. channel absolute frequency) q ∈ Q in all received reportsas follows:

fb,q,r(t) =∑∀k∈At,rb

1(qk,b,r(t)=q). (25)

Then, each BS assigns a probability distribution to all availablechannels according to the reports received from the UEs. Thechannels which are mostly reported by the UEs have a higherselection probability. A common method for enabling thisselection is to use a Boltzmann-Gibbs distribution [37], [38].A Boltzmann-Gibbs is a probability distribution of particlesover states, which can be expressed as: Λ(x) ∝ exp (−Eθ ).Here, E is the energy of state x, and θ is a constant whichis proportional to Boltzmann’s constant and thermodynamictemperature. This implies the probability which a systemwill be in state x, and it is proportional with the energy ofstates and system’s temperature. In this regard, the Boltzmann-Gibbs distribution for BS b, Λb = [Λb,1, . . . ,Λb,|Q|], can becalculated as follows:

Λb,q(fb,q,r(t)) =exp( 1

θbfb,q,r(t))∑

∀q′∈Q exp( 1θbfb,q′,r(t))

, (26)

where Λb,q(fb,q,r(t)) is the element of Λb which is related tochannel q [48]. Let 1

θb> 0 denotes the temperature parameter

for BS b that balances between exploration and exploitation.Note that, by allowing θb → 0, it leads to selecting thechannel which is mostly reported by the UEs associated toBS b. Therefore, this can lead to the strategies with zeroprobabilities. On the contrary, by allowing θ → ∞, it resultsin a uniform distribution over the strategy set of player b.Then, each BS b ∈ B updates the probability assigned to eachchannel q ∈ Q at time t according to:

πb,q,r(t) = πb,q,r(t− 1) +(

1

t

)ν (Λb,q(fb,q,r(t))− πb,q,r(t− 1)

),

(27)where πb,q,r(t − 1) and ν are the probability assigned tochannel q at time t − 1 and the learning rate exponent,respectively. Then, each BS b chooses its channel using amapping function which maps the probability distribution{πb,1,r(t), . . . , πb,|Q|,r(t)} to an element in the set of channels.

V. NO REGRET-BASED LEARNING ALGORITHM FORMEETING EQUILIBRIUM

Next, the learning procedure for solving the proposed gamesis described. Hereinafter, for notational convenience, we dropthe indexes (P ) and (Q) used in the formulations of game G(P )

and G(Q) when interchangeably referring to one of the games.Let πb(t) = {πb,1(t), . . . , πb,|Sb|(t)} ∈ D(Sb) be the mixedstrategy of player b. Here, πb,i(t) = Pr(sb(t) = sb,i) is the

probability that player b plays strategy sb,i at time t, wheresb(t) is the strategy of player b at time t.

We use a no-regret learning approach [49] to solve the BSoperation problem based on the game theoretic frameworks inorder to allocate power and channel, and consequently obtainε-coarse correlated equilibrium. This algorithm is a proba-bilistic learning procedure. This type of learning algorithmhas a good potential to learn mixed strategy equilibria. In ourproposed algorithm, BSs learn their environment, and optimizetheir performance by modifying their transmission powerlevels and channels. Moreover, they impact the performanceof other neighboring BSs, by minimizing their regrets for nothaving selected other strategies.

Definition 1: (ε-coarse correlated equilibrium): A mixedstrategy profile Γ = (Γs,∀s ∈ S = S1 × . . . × S|B|) is anε-coarse correlated equilibrium if no player can gain moreε by deviating to a pure strategy. More formally, a mixedstrategy profile Γ on the set of strategy profiles S is an ε-coarse correlated equilibrium for game G, if and only if, forall b ∈ B, s′b ∈ Sb we have,∑

s∈SΓsub(s

′b, s−b) 6

∑s∈S

Γsub(s) + ε, (28)

where s−b is the strategy of players other than player b. Inthe following, we use a no-regret procedure in a distributedmanner, in order to obtain ε-coarse correlated equilibrium. Weassume that each BS b ∈ B has only local information, andaims at maximizing its payoff function ub, and minimizingits regret for not having played other strategies. The regret ofplayer b is defined as the difference between average payoffobtained up to time t when playing strategy sb,i, and averagepayoff obtained up to time t when the player changes itsstrategies [50]. Specially, for any strategy sb,i ∈ Sb, the regretof sb,i at time t is:

Rsb,i(t) := [Dsb,i(t)]+, (29)

where

Dsb,i(t) =1

t

∑τ6t

ub(sb,i, s−b(τ))− ub(τ), (30)

where ub(τ) denotes the time average of player b’s payoff. Weconsider the behavioral rule proposed in [49], in which BSscan balance the tradeoff between minimizing their regrets andestimating average payoffs. Moreover, they play each strategywith non-zero probability and based on their regrets. The so-lution which captures this behavioral rule is Λb,sb,i(Rsb,i(t)).Based on the sole knowledge of the value of the obtainedpayoff, each BS b learns and estimates its expected payofffor all its strategies. Therefore, for each BS b ∈ B and eachsb,i ∈ Sb, payoff estimation is updated as follows:

ub,sb,i(t+ 1) =ub,sb,i(t)+(1

t+ 1

)γ1(sb(t+1)=sb,i)(ub(t+ 1)− ub,sb,i(t)),

(31)

where ub,sb,i(t) and γ denote the estimated payoff at time t andthe learning rate exponent for updating the payoff estimation,respectively. The point here is that the strategy played at the



last iteration sees the corresponding estimated payoff updated,while the estimated payoff for the other strategies were notchanged. For calculating the regret, each player needs theknowledge of other players’ strategies. Therefore, we use alearning tool for updating the estimated regret. In order toobtain distributed approach, each BS b ∈ B estimates its regretfor each sb,i ∈ Sb as follows:

Rsb,i(t+ 1) = Rsb,i(t)+(1

t+ 1

)ζ(ub,sb,i(t+ 1)− ub(t+ 1)− Rsb,i(t)),

(32)and then, the probability assigned to each strategy sb,i ∈ Sbis updates as follows:

πb,sb,i(t+ 1) = πb,sb,i(t)+(1

t+ 1

)ν(Λb,sb,i(Rsb,i(t+ 1))− πb,sb,i(t)),

(33)where ζ and ν are the learning rate exponent for updating theestimation regret and probability distribution, respectively. Weassume that for all BSs B, there is a mapping function whichmaps the mixed strategy to a pure strategy.

We combine DCA-LA and GUIA with ON-OFF switchingapproach, and refer them to hereinafter as DCA-LA/ON-OFF switching and GUIA/ON-OFF switching, respectively.The pseudo code for the proposed SON mechanisms, i.e.DCA-LA/ON-OFF switching and GUIA/ON-OFF switchingare summarized in Algorithm 1 and Algorithm 2, respectively.The procedures continue until reaching a maximum numberof iterations, TMax, which is required for the mechanisms toconverge. In Appendix C, a discussion about the computationalrequirements of the proposed mechanisms is presented.

A. Convergence AnalysisIn this subsection, we investigate the convergence of the

proposed SON mechanisms. Since both mechanisms use theBoltzmann-Gibbs distribution in order to allocate the re-sources, we use xb(t) = (x

(P )b (t), x

(Q)b (t)) as the transmission

strategy of BS b composed of its transmission power and chan-nel at time t. Let x(P )

b (t) and x(Q)b (t) denote the transmission

power level and channel selected by BS b at time t, respec-tively. Therefore, the vector x(t) = [x1(t), . . . ,x|B|(t)] ∈ Xdenotes the transmission strategies selected by the BSs, whereX =

∏|B|b=1 S

(P )b × S(Q)

b . For DCA-LA/ON-OFF switchingmechanism, let Ψb be a vector comprising average regret esti-mation for power and channel. For GUIA/ON-OFF switchingmechanism, Ψb comprises average regret estimation for power,and channel absolute frequency. Let 1/tı for all ı = {γ, ζ, ν}be the learning rates. For converging the mechanisms, usingstochastic approximation theory and the ordinary differentialequation, all learning rate exponents ı = {γ, ζ, ν} are chosenaccording to the following conditions [47], [51]:

limt→∞

t∑n=1

1

nı= +∞, (34)

and

limt→∞

t∑n=1

(1

nı)2

< +∞. (35)

Algorithm 1 : Proposed DCA-LA/ON-OFF switching algo-rithm.

1: Input: U , N , ub,sb,i(t), Rsb,i(t), πb,sb,i(t) ∀b ∈ B, andsb,i ∈ Sb

2: Output: At, u(Q)b (t), u(P )

b (t), ub,sb,i(t+ 1), Rsb,i(t+ 1),πb,sb,i(t+ 1) ∀b ∈ B, and sb,i ∈ Sb

3: Initialization: t = 1, B = {1, ..., |B|}, K = {1, ..., |K|},S(P )b , S(Q)

b ∀b ∈ B4: while t ≤ TMax do5: for ∀r ∈ Rj , j ∈ {M,S} do6: for ∀b ∈ B, and j ∈ {M,S} do7: Advertise estimated load Lb (t)8: Select P jb,r(t) using πb,sb,i(t) for game G(P )

9: Select qb,r(t) using πb,sb,i(t) for game G(Q)

10: end for11: for ∀k ∈ K do12: if (k ∈ U) ∨ (k ∈ N ) then13: Find at,rb,k (12)14: end if15: end for16: Updating: At,r

17: for ∀b ∈ B do18: Calculations: Lb(t), u(Q)

b (t), u(P )b (t)

19: end for20: for ∀b ∈ B do21: Updating: ub,sb,i(t+1), Rsb,i(t+ 1), πb,sb,i(t+ 1)

for both game G(P ), and G(Q) (31)- (33)22: end for23: end for24: t ← t+ 1,25: end while

The condition expressed in (34) ensures that the learning ratesare large enough to overcome any random fluctuations orinitial conditions. The condition expressed in (35) guaranteesthat the learning rates diminish with iterations, and eventuallybecome small enough to assure convergence. For satisfyingthe above conditions, we choose all ı = {γ, ζ, ν} ∈ (0.5, 1).Furthermore, we consider the utility estimation procedure asa fast process relative to the regret estimation, and the regretestimation procedure as a fast process relative to the strategydistribution process. Therefore, the learning rate exponentsmeet the following criteria:

limt→∞

(1/tζ)

(1/tγ)= 0, (36)

and

limt→∞

(1/tν)

(1/tζ)= 0. (37)

Therefore, the learning rate exponents follow ζ > γ, ν > ζ.Moreover, we assume that θ > 0, unless when we mentionθ → 0. In the following theorem, we show the proposed SONmechanisms converge to stationary distributions using similarargument as [38].

Theorem 1. Starting from an arbitrary initial configura-tion, the SON mechanisms correspond to Markov chains,



Algorithm 2 : Proposed GUIA/ON-OFF switching algorithm.

1: Input: U , N , ub,sb,i(t), Rsb,i(t), πb,sb,i(t) ∀b ∈ B, andsb,i ∈ Sb

2: Output: At, u(Q)b (t), u(P )

b (t), ub,sb,i(t+ 1), Rsb,i(t+ 1),πb,sb,i(t+ 1) ∀b ∈ B, and sb,i ∈ Sb

3: Initialization: t = 1, B = {1, ..., |B|}, K = {1, ..., |K|},Q = {1, . . . , |Q|}, S(P )

b , ∀b ∈ B4: while t ≤ TMax do5: for ∀r ∈ Rj , j ∈ {M,S} do6: for ∀b ∈ B, and j ∈ {M,S} do7: Advertise estimated load Lb (t)8: Select P jb,r(t) using πb,sb,i(t) for game G(P )

9: Select qb,r(t) according to fb,q,r(t) (28)10: end for11: for ∀k ∈ K do12: if (k ∈ U) ∨ (k ∈ N ) then13: Find at,rb,k (12)14: end if15: end for16: Updating: At,r

17: for ∀b ∈ B do18: for ∀k ∈ At,rb do19: for ∀q ∈ Q do20: Calculations: Iqk,b,r(t)21: Calculations: Iqk,b,r(t)22: end for23: Select qk,b,r (24)24: end for25: end for26: for ∀b ∈ B do27: Calculations: Lb(t), u(P )

b (t), fb,q,r(t)28: end for29: for ∀b ∈ B do30: Updating: πb,q,r(t) (27),ub,sb,i(t+1), Rsb,i(t+ 1),

πb,sb,i(t+ 1) for game G(P ) (31)- (33)31: end for32: end for33: t ← t+ 1,34: end while

in which the system converges to a stationary distributionΓ = (Γx,∀x ∈ X ), i.e., [38]

Γx =exp( 1

θ Ψx)∑∀x′∈X exp( 1

θ Ψx′), (38)

andlimθ→0

sup(Γx − Γ0) = 0, (39)

where Γ0 is the uniform distribution over the set of globaloptimal solutions.

The proof of Theorem 1 is presented in Appendix A.The following theorem characterizes the convergence of the

proposed games under the no-regret learning algorithm.

Theorem 2. In the behavioral rule defined in (31)-(33), thestationary distribution Γ comprises an ε-coarse correlatedequilibrium for G(P ) and G(Q).

Table ISYSTEM-LEVEL SIMULATION PARAMETERS

System ParametersParameter ValuePhysical link type DownlinkCarrier frequency/ Channel bandwidth 2 GHz/ 10 MHzNoise power spectral density -174 dBm/HzMean packet arrival rate 1800 Kbpsθb 0.1Weights ωn

b , ωlb, ωp

b 1, 0.5, 0.5Learning rate exponent γ, ζ, ν, α 0.6, 0.7, 0.8, 0.9PThreshold -60 dBm

BSs ParametersParameter MBS SBSMaximum power 46 dBm 30 dBmFeeder loss 3 dB 0 dBDC-DC loss 7.5% 9%Mains supply loss 9% 11%Cooling loss 10% 0%ηPAb 31.1% 6.7%PRFb 12.9 W 0.8 WPBBb 29.6 W 3 W

|S(P )b | 2 4

Shadowing standarddeviation

8 dB 10 dB

Radius cell 250 m 40 mDistance-dependentpath loss model (din Km)

128.1+37.6 log10(d) 140.7+36.7 log10(d)

Minimum distance MBS-SBS: 75 mMBS-UE: 35 m

SBS-SBS: 40 mSBS-UE: 10 m

Load threshold(LThreshold

b )0.9 0.7

The proof of Theorem 2 is presented in Appendix B. More-over, a proof of the convergence of the set of coupled learningalgorithms relies on the stochastic approximation algorithmsby using ordinary differential equations [49].

VI. SIMULATION RESULTS

For our simulations, the proposed mechanisms are validatedin a single cell served by one MBS and a set of SBSswith the maximum transmit power of 46 dBm and 30 dBmas presented in [9], respectively. We assume the number ofchannels to be 4, i.e. |Q| = 4. Our proposed self-organizingmechanisms, i.e. proposed DCA-LA/ON-OFF switching andproposed GUIA/ON-OFF switching, are compared with twofollowing benchmark references:• Interference-Aware Dynamic Channel Selection (IADCS):

Each BS transmits with its maximum power, and eval-uates averages CCI power over each channel, and fi-nally selects the channel with minimum average CCIpower [52].

• IADCS/ON-OFF Switching: Each BS transmits accordingto the ON-OFF switching mechanism, and evaluatesaverage CCI power over each channel, and finally selectsthe channel with minimum average CCI power [53].

Here, the solid curves belong to the benchmark solutions,and the dashed curves refer to the proposed mechanisms.We present simulation results for two scenarios, includingstationary environment, i.e. without considering mobility for



-0.55

-0.5

-0.45

-0.4

-0.35

-0.3

-0.25

-0.2

-0.15

-0.1

0 1000 2000 3000 4000 5000

Aver

age

uti

lity

per

BS

Number of iterations

Proposed DCA-LA/ON-OFF switching

Proposed GUIA/ON-OFF switching

Figure 2. Convergence of average utility per BS of the proposed algorithmsfor a network with 20 SBSs and 40 UEs.

0

0.4

0.8

1.2

1.6

2

0 1000 2000 3000 4000 5000

Aver

age

rate

per

UE

[M

bp

s]




Figure 3. Convergence of average rate per UE of the proposed algorithmsfor a network with 20 SBSs and 40 UEs.

UEs, and a non-stationary environment, i.e. with consideringUEs’ mobility. The communications are carried out in fullbuffer [54], [55] in accordance to the system parameters shownin Table I.

A. Convergence of the Proposed AlgorithmsHere, we investigate the convergence behavior of our pro-

posed SON mechanisms. First, we consider a network with40 UEs and 20 SBSs. Fig. 2 and Fig. 3 show the convergencebehavior of BS’s utility, defined in (17), and UE’s rate, re-spectively. We can observe that DCA-LA/ON-OFF switchingalgorithm needs more iterations to converge compared toGUIA/ON-OFF switching, i.e. the tradeoff between conver-gence speed and computational complexity. Furthermore, theproposed GUIA/ON-OFF switching yields better performancecompared to DCA-LA/ON-OFF switching algorithm.

Fig. 4 and Fig. 5 show convergence behavior of our pro-posed mechanisms, respectively, in terms of BS’s utility andUE’s rate, for a network with 5 SBSs, and the number of UEs= 20 and 50. From Fig. 4 and Fig. 5, it can be observed thattwo proposed algorithms converge within a small number ofiterations. Furthermore, by increasing the number of UEs, theiterations for convergence slightly increases.

B. Stationary EnvironmentFig. 6 shows average energy consumption per BS as the

number of SBSs varies for 60 active UEs. As the numberof SBSs in the network increases, the path loss betweenthe BS and the UE degrades. Thus, it induces a decreas-ing in the required transmit power to provide the certain

-0.7

-0.6

-0.5

-0.4

-0.3

-0.2

-0.1

0 50 100 150 200 250 300

Aver

age

uti

lity

per

BS


Proposed DCA-LA/ON-OFF switchingProposed GUIA/ON-OFF switchingProposed DCA-LA/ON-OFF switchingProposed GUIA/ON-OFF switching

20 UEs

50 UEs

Figure 4. Convergence of average utility per BS of the proposed algorithmsfor a network with 5 SBSs, and the number of UEs = 20 and 50.

0

0.4

0.8

1.2

1.6

2

0 50 100 150 200 250 300

Aver

age

rate

per

UE

[M

bps]


Proposed DCA-LA/ON-OFF switchingProposed GUIA/ON-OFF switchingProposed DCA-LA/ON-OFF switchingProposed GUIA/ON-OFF switching

50 UEs

20 UEs

Figure 5. Convergence of average rate per UE of the proposed algorithmsfor a network with 5 SBSs, and the number of UEs = 20 and 50.

received power at the UE’s receiver. From Fig. 6, we canobserve that with increasing the number of SBSs, the averageenergy consumption per BS decreases. At the number ofSBSs = 50, the reduction of average energy consumptionper BS in IADCS/ON-OFF switching, DCA-LA/ON-OFFswitching, and proposed GUIA/ON-OFF switching comparedto IADCS approach are 33.8%, 35.6%, and 40.3%, respec-tively. Moreover, for a given number of SBSs, proposedGUIA/ON-OFF switching mechanism consumes less energycompared to the other approaches. However, it has almostthe same performance compared to the approaches basedon ON-OFF switching. For instance, it reduces the averageenergy consumption per BS by 7.2%, and 9.8% compared toDCA-LA/ON-OFF switching, and IADCS/ON-OFF switching,respectively. The main reason is that the DCA-LA/ON-OFFswitching and IADCS/ON-OFF switching utilize the ON-OFFswitching mechanism for unnecessary BSs, whereas in IADCSalgorithm, each BS transmits with its maximum power. Onthe other hand, in GUIA/ON-OFF switching, the BS selectsthe better downlink channel, in terms of CCI, compared tothe other mechanisms based on the ON-OFF switching. Thiscomes because of receiving the reports from its associatedUEs. Therefore, the interference over the selected channelusing GUIA/ON-OFF switching mechanism is less than theother mechanisms. This leads to a decrease in the requiredtransmit power to guarantee the same received power at theUE’s receiver. Since the proposed mechanisms are simulatedfor the number of SBSs = 50, it can imply a dense SBSsdeployment. Furthermore, in the proposed mechanisms, the



15

20

25

30

35

40

45

50

55

10 15 20 25 30 35 40 45 50

Ave

rage

ene

rgy

cons

umpt

ion

per

BS

[W]

Number of SBSs per MBS

IADCS

IADCS/ON-OFF switching



Figure 6. Average energy consumption per BS versus the number of SBSs,given 60 UEs.

0

0.2

0.4

0.6

0.8

1

10 15 20 25 30 35 40 45 50

Ave

rage

load

per

BS


IADCS

IADCS/ON-OFF switching

Proposed DCA-LA/ON-OFFswitching


Figure 7. Average load per BS versus the number of SBSs, given 60 UEs.

average CCI power over each channel is evaluated without thedistinguishing the source of interference, which may consistof interference from MBSs and SBSs. As a result, withoutloss of generality, increasing the number of SBSs can increaseinterference in the network, and thus, it can imply a densedeployment. Fig. 7 shows the average load per BS for 60active UEs. As the number of SBSs increases, average load perBS decreases through offloading UEs associated with highlyloaded BSs to lightly loaded BSs. Since GUIA/ON-OFFswitching mechanism assigns the resources in the efficientmanner, it can reduce interference on downlink channel atthe UE’s receiver. As a result, it improves the achievabletransmission rate of the UE, and thus, it reduces the BS’s load.From Fig. 7, it can be seen that the proposed GUIA/ON-OFFswitching method balances the load among BSs, and yields78.3%, 70.6%, and 59.2% average load reduction per BScompared to IADCS/ON-OFF switching, IADCS, and DCA-LA/ON-OFF switching approaches, respectively.

Fig. 8 shows the average utility per BS, defined in (17),versus the number of UEs for 5 SBSs. As the number ofUEs increases, more BSs are working in active mode, and lessBSs choose OFF mode. Therefore, the BS’s load and energyconsumption increase, and thus, the utility of the BS decreases.Since GUIA/ON-OFF switching balances load, and saves moreenergy compared to the other mechanisms, it improves theBS’s utility. However, for high number of UEs, the behavior ofmechanisms based on the ON-OFF switching method becomescloser to each other. This is mainly due to the fact that, the load

-1

-0.9

-0.8

-0.7

-0.6

-0.5

-0.4

-0.3

-0.2

-0.1

0

10 30 50 70 90 110

Ave

rage

util

ity p

er B

S

Number of UEs

IADCS

IADCS-ON/OFF switching



Figure 8. Average utility per BS versus the number of UEs, given 5 SBSs.

Table IIPROBABILITY DISTRIBUTION FOR UE’S MOVEMENT

Movement direction probabilityCurrent direction pck = 0.5Turning right prk = 0.25Turning left plk = 0.25

of BSs increases, and approaches to the maximum BS’s load.On the other hand, the average energy consumed by BSs in themechanisms based on the ON-OFF switching method is almostsame. Fig. 8 shows that the average utility of the proposedGUIA/ON-OFF switching yields, respectively, 17.8%, 34%,and 63.9% improvement over DCA-LA/ON-OFF switching,IADCS/ON-OFF switching, and IADCS approaches for 20UEs.

C. Non-Stationary Environment

We now turn to the scenario, in which the UEs can freelymove, and perform handoff according to their received signalstrength (RSS), distance, and BS’s estimated load. Since in therandom waypoint model, UEs’ movements occur in an openfield, and they can be located at every point of the region, itmay lead to inaccurate results. The models based on the streetsof urban areas are more realistic mobility models, and providemore accurate simulation results. In the streets of urban areas,the UEs’ movements are limited to streets often separated bybuildings, trees, and different obstructions.

Hereby, we use a Manhattan grid model, in which the UEsmove in a straight line, and they can change their directionsat each intersection with a given probability. One way is byconsidering a reduced size for the grid squares, which allowsus to make the model closer to the irregular situation. At eachintersection, each UE selects its movement direction accordingto the probability distributions in Table II. Here pc

k, prk, and pl

k

denote the probabilities for moving on the current direction,turning right, and turning left for UE k, respectively. Weassume that the UEs move with constant speed, and when a UEgoes out of a boundary, another UE enters on the other side.We consider a two-fold handoff algorithm: handoff betweenMBS and SBS, and handoff between SBS and SBS. Let dk,b(t)and Rb be estimated distance between UE k and serving BS b,and the cell radius of BS b, respectively. The handoff algorithmemployed by the UEs is given in Algorithm 3. Here, Pb,k,r (t)denotes the received power at UE k associated with serving



Algorithm 3 : Handoff algorithm based on BS’s estimatedload.

1: if (dk,b(t+ 1) > dk,b(t)) ∧ (dk,b(t+ 1) > 0.8×Rb)2: if 10 logPb,k,r(t)(1 − Lb) < PThreshold + 10 log(1 −

LThresholdb ) then

3: Search for another BS better than serving BS accord-ing to (12)

4: else5: Continue with serving BS6: end if7: end if

0

2

4

6

8

10

12

10 15 20 25 30 35

Ave

rage

thro

ughp

ut p

er B

S [M

bps]


IADCSIADCS/ON-OFF switchingProposed DCA-LA/ON-OFF switchingProposed GUIA/ON-OFF switchingIADCSIADCS/ON-OFF switchingProposed DCA-LA/ON-OFF switchingProposed GUIA/ON-OFF switching

60 UEs

30 UEs

Figure 9. Average throughput per BS versus the number of SBSs, given 30and 60 UEs.

BS b in RB r at time t. For this subsection, we consider theUEs move with velocity 5 m/sec on the layout.

In Fig. 9, we depict average throughput per BS whenincreasing the number of SBSs for 30 and 60 UEs. Weobserve that the proposed GUIA/ON-OFF switching mech-anism achieves the highest throughput compared to the otherapproaches. For instance, for a network with 60 UEs, ityields up to 35.9%, 42.4%, and 44.4% improvement in termsof throughput, relative to the proposed DCA-LA/ON-OFFswitching, IADCS/ON-OFF switching, and IADCS, respec-tively. We can observe that as the number of SBSs increases,average throughput per BS decreases. The main reason is thatwith increasing the number of SBSs, the BS’s load decreases.On the other hand, for a given number of UEs in the network,the number of associated UEs to the BS decreases. Moreover,in our scenario, the decreasing rate of associated UEs to thedecreasing rate of load is less than one. Furthermore, at a givennumber of SBSs, with an increase in the number of UEs, theaverage throughput per BS increases.

In Fig. 10, we show the average rate per UE versus thenumber of SBSs for 30 and 60 UEs. As the number of SBSsincreases, the BS’s load decreases, and thus, the UE’s rateincreases. Moreover, with increasing the number of UEs in thearea, UE’s rate decreases. This is due to the fact that the BS’sload increases, and it may lead to overload some BSs, andthus, decreasing UE’s rate. Since GUIA/ON-OFF switchingmechanism exhibits a load balancing behavior, it improvesthe UE’s rate. In this respect, in Fig. 10, we can see thatthe proposed GUIA/ON-OFF switching improves the averagerate, respectively, 35.7%, 43.2%, and 44.8% when comparedto proposed DCA-LA/ON-OFF switching, IADCS/ON-OFF

0.6

0.8

1

1.2

1.4

1.6

1.8

2

2.2

10 15 20 25 30 35

Ave

rage

rate

per

UE

[Mbp

s]



30 UEs

60 UEs

1780000

1785000

1790000

1795000

1800000

10 12 14

30 UEs

60 UEs

Figure 10. Average rate per UE versus the number of SBSs, given 30 and60 UEs.

0

0.005

0.01

0.015

0.02

0.025

0.03

10 15 20 25 30 35

Ave

rage

num

ber

of d

ropp

ed U

Es



60 UEs

30 UEs 60 UEs

Figure 11. Average number of dropped UEs versus the number of SBSs,given 30 and 60 UEs.

switching, and IADCS for the network with 60 UEs and 10SBSs.

Fig. 11 shows the average number of dropped UEs, whichindicates the UE’s QoS, versus the number of SBSs for 30and 60 UEs. As the number of SBSs increases, the number ofdropped UEs decreases because of load balancing behavior.Furthermore, it can be observed that increasing the numberof UEs results in increasing the average number of droppedUEs. Since GUIA/ON-OFF switching mechanism has a goodperformance in terms of average BS’s load, it yields insignif-icant dropped UEs. For a network with 30 UEs and 10 SBSs,the reductions of average number of dropped UEs in theproposed GUIA/ON-OFF switching compared to the proposedDCA-LA/ON-OFF switching, IADCS/ON-OFF switching, andIADCS are 96.8%, 97.1%, and 97.2%, respectively.

VII. CONCLUSION

In this paper, we have proposed two dynamic channelassignment mechanisms, i.e. DCA-LA and GUIA. Later, wehave combined them with a BS ON-OFF switching algorithmin order to reduce the energy consumption of the network.The proposed DCA-LA/ON-OFF switching algorithm usesa game-theoretic approach, in which each BS selects itschannel and power based on a no-regret learning algorithm.In GUIA/ON-OFF switching algorithm, BSs utilize someinformation from their associated UEs to improve the per-formance of the network. Then, they select their channelsbased on a Gibbs-sampler. GUIA/ON-OFF switching algo-rithm balances the load among BSs. Therefore, it improves



Table IIICOMPUTATIONAL REQUIREMENTS FOR THE PROPOSED CHANNEL ASSIGNMENT ALGORITHMS

OperationsRequired Instructions for

Proposed DCA-LA Proposed GUIAat BSs at UEs at BSs at UEs

Sum (6× |Q|+1)× |B| 0 (3× |Q|−1)× |B|+|K| |Q|×|K|Multiplication (3× |Q|+1)× |B| 0 2× |Q|×|B| 2× |Q|×|K|Division |Q|×|B| 0 |Q|×|B| 0Exponential |Q|×|B| 0 |Q|×|B| 0Comparison (|Q|−1)× |B| 0 |Q|×|K| (|Q|−1)× |K|Total number of instructions (63× |Q|+1)× |B| 5× |Q|×|K|+(58× |Q|−1)× |B|

system throughput, and consequently yields a better SE. Asa result, our proposed algorithm achieves both high EE andSE. The proposed algorithms are all executed in a distributedmanner. Simulation results have shown that GUIA/ON-OFFswitching algorithm provides a better performance over theother algorithms, and significantly outperforms them in termsof average energy consumption, average load, average BS’sutility, average throughput, average number of dropped UEs,and average UE’s rate.

APPENDIX APROOF OF THEOREM 1

Transition from x(t) to x(t+1) takes place when some BSsupdate their resource values according to their observations onx(t). Therefore, the transition depends only on current con-figuration vector. As a result, each proposed SON mechanismcan be modeled as a Markov chain with the set of cliques B,global energy Ψ : X → R2 derived from the potential functionΨb where Ψx =

∑∀b∈B Ψb(xb) and finite configuration space

X . Since for each x ∈ X , Γx is non-zero and positive, theMarkov chain is ergodic (aperiodic and positive recurrent).Therefore, a stationary distribution exists. Now, we prove thesecond part of the theorem. Let X ∗ be the set of global optimalsolutions. For each x∗ ∈ X ∗,

(Γx − Γ0) =exp( 1

θ Ψx)∑∀x′∈X exp( 1

θ Ψx′)− Γ0 =

exp( 1θ (Ψx − Ψx∗))∑

∀x′∈X exp( 1θ (Ψx′ − Ψx∗))

− Γ0 ≤

1

|X ∗|+∑∀x′ /∈X∗ exp( 1

θ (Ψx′ − Ψx∗))− Γ0, (40)

When θ → 0, we have,

limθ→0

(1

|X ∗|+∑∀x′ /∈X∗ exp( 1

θ (Ψx′ − Ψx∗))− Γ0) =

1

|X ∗|− Γ0 = 0. (41)

This concludes the proof.

APPENDIX BPROOF OF THEOREM 2

First, we use the following lemma to show the existence ofthe equilibrium.Lemma 1: In every finite game, the set of correlated equilibriais nonempty, closed and convex. Since the proposed games arefinite, they have at least one mixed strategy equilibrium.

For x′b ∈ S(P )b × S(Q)

b , we define,

ϑ = maxx∈X\X∗

(ub(x′b,x−b)− ub(x)), (42)

and

δ = minx∈X\X∗

((Ψx∗ − Ψx)). (43)

Let 0 ≤ θ ≤ δ

ln(ϑε (|X||X∗|−1)

) . Now, we prove (28), thus, we

have,∑x∈X

Γx(ub(x′b,x−b)− ub(x)) =

∑x∈X

exp( 1θ (Ψx − Ψx∗))(ub(x

′b,x−b)− ub(x))

|X ∗|+∑

x′′∈X\X∗ exp ( 1θ (Ψx′′ − Ψx∗))

≤

|X ∗|(ub(x′b,x∗−b)− ub(x∗))|X ∗|

+∑x∈X\X∗ exp ( 1

θ (Ψx − Ψx∗))(ub(x′b,x−b)− ub(x))

|X ∗|,

(44)Since (ub(x

′,x∗−b)− ub(x∗)) ≤ 0, we have,

|X ∗|(ub(x′b,x∗−b)− ub(x∗))|X ∗|

+∑x∈X\X∗ exp ( 1

θ (Ψx − Ψx∗))(ub(x′,x−b)− ub(x))

|X ∗|≤

ϑ∑

x∈X\X∗ exp ( 1θ (Ψx − Ψx∗))

|X ∗|≤

ϑ(|X ||X ∗|

− 1) exp(−1

θδ) ≤ ϑ(

|X ||X ∗|

− 1)ε

ϑ( |X ||X∗| − 1)≤ ε,

(45)Therefore, this concludes the proof of (28). Moreover, usingsimilar argument as [56], we can assume that (28) does nothold for ε = 0 when θ → 0. Therefore,

limθ→0

sup∑x∈X

Γx(ub(x′b,x−b)− ub(x)) =

1

|X ∗|∑x∈X

(ub(x′b,x−b)− ub(x)) > 0. (46)

Thus, for some x ∈ X ∗, (ub(x′b,x−b) > ub(x)), and it

implies the regret of player b for x′b. Therefore, player binterests to deviate from it, and this yields x /∈ X ∗. Thus, theassumption (46) is not true, and Γx converges to an ε-coarse



0

1000

2000

3000

4000

5000

6000

1 2 3 4 5 6 7 8 9 10

Tot

al n

umbe

r of

ins

truc

tion

s

Total number of channels

Proposed DCA-LA algorithm

Proposed GUIA algorithm

Figure 12. Total required instructions of proposed channel assignmentalgorithms versus the number of channels for a network with |B|= 5 and|K|= 50.

correlated equilibrium. Since Γx and the expected value of Ψx

monotonically decrease with θ [38], we assume that:

( limθ→0

∑x∈X

Γxub(x))−∑x∈X

Γxub(x) ≤ ε. (47)

Thus, for θ > 0,

limθ→0

∑x∈X

Γxub(x′b,x−b)− lim

θ→0

∑x∈X

Γxub(x) ≥∑x∈X

Γxub(x′b,x−b)−

∑x∈X

Γxub(x)− ε. (48)

Since

limθ→0

∑x∈X

Γxub(x′b,x−b)− lim

θ→0

∑x∈X

Γxub(x) ≤ 0. (49)

we conclude that Γ is an ε-coarse correlated equilibrium.

APPENDIX CCOMPUTATIONAL REQUIREMENTS

In the proposed DCA-LA/ON-OFF switching andGUIA/ON-OFF switching mechanisms, the BSs select theirtransmit power by using the same ON-OFF switchingapproach. Therefore, in order to compare the computationalrequirements of the proposed mechanisms, we only considerthe computational requirements of the proposed channelassignment approaches. Here, with considering digital signalprocessors (DSPs), we present the computational requirementsof the proposed DCA-LA and GUIA approaches. We assumethat addition, multiplication, and comparison operationsrequire one DSP cycle [57]. For division operation andexponential computation, 42 and 11 operations are considered,respectively. Since the computational analysis does not takeinto account the compiler optimizations and the ability ofDSPs to execute various instructions, the analysis providesan upper bound on computations. Table III summarizes thecomputational instructions for the proposed DCA-LA andGUIA approaches. From UE’s perspective, the computationalinstructions of channel assignment algorithms are given bythe number of available channels. Accordingly, the proposedGUIA algorithm requires more computational instructionscompared to DCA-LA algorithm.

Fig. 12 shows the total required instructions over differentnumber of channels for the proposed channel assignmentalgorithms, assuming |B|= 5 and |K|= 50. We can observethat the proposed GUIA algorithm requires more instructions

1000

1500

2000

5 10 15 20 25 30

Tot

al n

umbe

r of

inst

ruct

ions

Number of UEs



Figure 13. Total required instructions of proposed channel assignmentalgorithms versus the number of UEs for a network with |B|= 5 and |Q|= 4.

100

1100

2100

3100

4100

5100

6100

7100

5 10 15 20 25T

otal

num

ber

of in

stru

ctio

ns

Number of BSs



Figure 14. Total required instructions of proposed channel assignmentalgorithms versus the number of BSs for a network with |K|= 50 and |Q|= 4.

compared to DCA-LA algorithm. Fig. 13 presents the totalrequired instructions over different number of UEs for anetwork with |B|= 5 and |Q|= 4. It can be seen that withincreasing the number of UEs, the total required instructionsof the proposed GUIA algorithm increases, while the requiredinstructions of DCA-LA algorithm does not change. Fig. 14shows the total required instructions over different number ofBSs for a network with |K|= 50 and |Q|= 4. We see thatwith increasing the number of BSs, the required instructionsincreases for both proposed algorithms.

REFERENCES

[1] S. Navaratnarajah, A. Saeed, M. Dianati, and M. A. Imran, “Energyefficiency in heterogeneous wireless access networks,” IEEE WirelessCommun., vol. 20, no. 5, pp. 37–43, 2013.

[2] R. Hu and Y. Qian, “An energy efficient and spectrum efficient wirelessheterogeneous network framework for 5G systems,” IEEE Commun.Mag., vol. 52, no. 5, pp. 94–101, 2014.

[3] B. Wang, Q. Kong, W. Liu, and L. T. Yang, “On efficient utilization ofgreen energy in heterogeneous cellular networks,” IEEE Syst. J., no. 99,pp. 1–12, 2015.

[4] M. Peng, Y. Li, J. Jiang, J. Li, and C. Wang, “Heterogeneous cloud radioaccess networks: a new perspective for enhancing spectral and energyefficiencies,” IEEE Wireless Commun., vol. 21, no. 6, pp. 126–135, 2014.

[5] H. Zhang, S. Chen, X. Li, H. Ji, and X. Du, “Interference managementfor heterogeneous networks with spectral efficiency improvement,” IEEEWireless Commun., vol. 22, no. 2, pp. 101–107, 2015.

[6] J. Tang, D. So, E. Alsusa, K. Hamdi, and A. Shojaeifard, “Resource al-location for energy efficiency optimization in heterogeneous networks,”IEEE J. Sel. Areas Commun., vol. 33, no. 10, pp. 2104–2117, 2015.

[7] C. Yang, Z. Chen, B. Xia, and J. Wang, “When ICN meets C-RAN forHetNets: an SDN approach,” IEEE Commun. Mag., vol. 53, no. 11, pp.118–125, 2015.

[8] M. Peng, C. Wang, J. Li, H. Xiang, and V. Lau, “Recent advances inunderlay heterogeneous networks: Interference control, resource alloca-tion, and self-organization,” IEEE Commun. Surveys Tuts., vol. 17, no. 2,pp. 700–729, 2015.



[9] D. Lopez-Perez, I. Guvenc, G. De la Roche, M. Kountouris, T. Q. Quek,and J. Zhang, “Enhanced intercell interference coordination challengesin heterogeneous networks,” IEEE Wireless Commun., vol. 18, no. 3,pp. 22–30, 2011.

[10] R. Hernandez-Aquino, S. A. R. Zaidi, D. McLernon, M. Ghogho,and A. Imran, “Tilt angle optimization in two-tier cellular networks-a stochastic geometry approach,” IEEE Trans.Commun., vol. 63, no. 12,pp. 5162–5177, 2015.

[11] J. G. Andrews, “Seven ways that HetNets are a cellular paradigm shift,”IEEE Commun. Mag., vol. 51, no. 3, pp. 136–144, 2013.

[12] H. Tabassum, U. Siddique, E. Hossain, and M. J. Hossain, “Downlinkperformance of cellular systems with base station sleeping, user associ-ation, and scheduling,” IEEE Trans.Wireless Commun., vol. 13, no. 10,pp. 5752–5767, 2014.

[13] D. Feng, C. Jiang, G. Lim, L. J. Cimini, G. Feng, and G. Y. Li, “A surveyof energy-efficient wireless communications,” IEEE Commun. SurveysTuts., vol. 15, no. 1, pp. 167–178, 2013.

[14] A. Merwaday and I. Guvenc, “Optimisation of FeICIC for energyefficiency and spectrum efficiency in LTE-advanced HetNets,” IETElectronics Letters, vol. 52, no. 11, pp. 982–984, 2016.

[15] G. P. Koudouridis and H. Li, “Distributed power on-off optimisation forheterogeneous networks-a comparison of autonomous and cooperativeoptimisation,” in Proc. IEEE 17th Int. Workshop CAMAD, 2012, pp.312–317.

[16] I. Ashraf, F. Boccardi, and L. Ho, “Sleep mode techniques for small celldeployments,” IEEE Commun. Mag., vol. 49, no. 8, pp. 72–79, 2011.

[17] G. Wu, G. Feng, and S. Qin, “Cooperative sleep-mode and performancemodeling for heterogeneous mobile network,” in Proc. IEEE WCNCW,2013, pp. 6–11.

[18] P. Chandhar and S. S. Das, “Energy saving in OFDMA cellular networkswith multi-objective optimization,” in Proc. IEEE ICC, 2014, pp. 3951–3956.

[19] H. Celebi, N. Maxemchuk, Y. Li, and I. Guvenc, “Energy reductionin small cell networks by a random on/off strategy,” in Proc. IEEEGlobecom Workshops (GC Wkshps), 2013, pp. 176–181.

[20] S. Bhaumik, G. Narlikar, S. Chattopadhyay, and S. Kanugovi, “Breatheto stay cool: adjusting cell sizes to reduce energy consumption,” in Pro-ceedings of the first ACM SIGCOMM workshop on Green networking,2010, pp. 41–46.

[21] S. Zhang, J. Wu, J. Gong, S. Zhou, and Z. Niu, “Energy-optimal prob-abilistic base station sleeping under a separation network architecture,”in Proc. IEEE GLOBECOM, 2014, pp. 4239–4244.

[22] S. Samarakoon, M. Bennis, W. Saad, and M. Latva-aho, “Opportunisticsleep mode strategies in wireless small cell networks,” in Proc. IEEEICC, 2014, pp. 2707–2712.

[23] X. J. Li and P. H. J. Chong, “A dynamic channel assignment schemefor TDMA-based multihop cellular networks,” IEEE Trans. WirelessCommun., vol. 7, no. 6, pp. 1999–2003, 2008.

[24] G. F. Marias, D. Skyrianoglou, and L. Merakos, “A centralized approachto dynamic channel assignment in wireless ATM LANs,” in Proc. IEEEINFOCOM’99., vol. 2, 1999, pp. 601–608.

[25] G. Cao and M. Singhal, “Distributed fault-tolerant channel allocationfor cellular networks,” IEEE J. Sel. Areas Commun., vol. 18, no. 7, pp.1326–1337, 2000.

[26] Y. Furuya and Y. Akaiwa, “Channel segregation, a distributed adaptivechannel allocation scheme for mobile communication systems,” IEICETransactions on Communications, vol. 74, no. 6, pp. 1531–1537, 1991.

[27] Z. He, Y. Zhang, C. Wei, and J. Wang, “A multistage self-organizingalgorithm combined transiently chaotic neural network for cellularchannel assignment,” IEEE Trans. Veh. Technol., vol. 51, no. 6, pp.1386–1396, 2002.

[28] C. Zhao and L. Gan, “Dynamic channel assignment for large-scalecellular networks using noisy chaotic neural network,” IEEE Trans.Neural Netw., vol. 22, no. 2, pp. 222–232, 2011.

[29] J. Wang, Z. Tang, X. Xu, and Y. Li, “A discrete competitive Hopfieldneural network for cellular channel assignment problems,” Neurocom-puting, vol. 67, pp. 436–442, 2005.

[30] R. Montemanni, J. N. Moon, and D. H. Smith, “An improved tabu searchalgorithm for the fixed-spectrum frequency-assignment problem,” IEEETrans. Veh. Technol., vol. 52, no. 4, pp. 891–901, 2003.

[31] D. Gozupek, G. Genc, and C. Ersoy, “Channel assignment problem incellular networks: A reactive tabu search approach,” in Computer andInformation Sciences, 2009. ISCIS 2009. 24th International Symposiumon, 2009, pp. 298–303.

[32] Y. Peng, L. Wang, and B. H. Soong, “Optimal channel assignment incellular systems using tabu search,” in Proc. IEEE PIMRC., vol. 1, 2003,pp. 31–35.

[33] C. Y. Lee and H. G. Kang, “Cell planning with capacity expansion inmobile communications: A tabu search approach,” IEEE Trans. Veh.Technol., vol. 49, no. 5, pp. 1678–1691, 2000.

[34] S. C. Ghosh, B. P. Sinha, and N. Das, “Channel assignment using geneticalgorithm based on geometric symmetry,” IEEE Trans. Veh. Technol.,vol. 52, no. 4, pp. 860–875, 2003.

[35] J. Chen, S. Olafsson, and X. Gu, “Observations on using simulatedannealing for dynamic channel allocation in 802.11 WLANs,” in Proc.IEEE VTC Spring, 2008, pp. 1801–1805.

[36] Y. Gu, W. Saad, M. Bennis, M. Debbah, and Z. Han, “Matching theoryfor future wireless networks: fundamentals and applications,” IEEECommun. Mag., vol. 53, no. 5, pp. 52–59, 2015.

[37] P. Bremaud, Markov Chains: Gibbs Fields, Monte Carlo Simulation,and Queues, ser. Texts in Applied Mathematics (Book 31). Springer,1999.

[38] L. P. Qian, Y. J. Zhang, and M. Chiang, “Distributed nonconvex powercontrol using Gibbs sampling,” IEEE Trans. Commun., vol. 60, no. 12,pp. 3886–3898, 2012.

[39] A. H. Arani, A. Mehbodniya, M. J. Omidi, and F. Adachi, “Learning-based joint power and channel assignment for hyper dense 5G networks,”in Proc. IEEE ICC, 2016.

[40] H. Kim, G. de Veciana, X. Yang, and M. Venkatachalam, “Distributed α-optimal user association and cell load balancing in wireless networks,”IEEE/ACM Trans. Netw., vol. 20, no. 1, pp. 177–190, 2012.

[41] M. Ismail, W. Zhuang, E. Serpedin, and K. Qaraqe, “A survey on greenmobile networking: From the perspectives of network operators andmobile users,” IEEE Commun. Surveys Tuts., vol. 17, no. 3, pp. 1535–1556, 2015.

[42] I. Humar, X. Ge, L. Xiang, M. Jo, M. Chen, and J. Zhang, “Rethinkingenergy efficiency models of cellular networks with embodied energy,”IEEE Netw., vol. 25, no. 2, pp. 40–49, 2011.

[43] J. Andrews, S. Singh, Q. Ye, X. Lin, and H. Dhillon, “An overviewof load balancing in HetNets: Old myths and open problems,” IEEEWireless Commun., vol. 21, no. 2, pp. 18–25, 2014.

[44] M. Simsek, M. Bennis, and I. Guvenc, “Learning based frequency-and time-domain inter-cell interference coordination in HetNets,” IEEETrans. Veh. Technol., vol. 64, no. 10, pp. 4589–4602, 2015.

[45] M. H. Manshaei, P. Marbach, and J. P. Hubaux, “Evolution and marketshare of wireless community networks,” in Game Theory for Networks,2009. GameNets ’09. International Conference on, 2009, pp. 508–514.

[46] A. H. Arani, M. J. Omidi, A. Mehbodniya, and F. Adachi, “A handoffalgorithm based on estimated load for dense green 5G networks,” inProc. IEEE GLOBECOM, 2015, pp. 1–7.

[47] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction,ser. Adaptive Computation and Machine Learning series. MIT Press,1998.

[48] S. Samarakoon, M. Bennis, W. Saad, and M. Latva-aho, “Backhaul-aware interference management in the uplink of wireless small cellnetworks,” IEEE Trans. Wireless Commun., vol. 12, no. 11, pp. 5813–5825, 2013.

[49] M. Bennis, S. M. Perlaza, and M. Debbah, “Learning coarse correlatedequilibria in two-tier wireless networks,” in Proc. IEEE ICC, 2012, pp.1592–1596.

[50] S. Hart and A. Mas-Colell, “A simple adaptive procedure leading tocorrelated equilibrium,” Econometrica, vol. 68, no. 5, pp. 1127–1150,2000.

[51] D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming.Athena Scientific, 1996.

[52] Y. Matsumura, S. Kumagai, T. Obara, T. Yamamoto, and F. Adachi,“Channel segregation based dynamic channel assignment for WLAN,”in Proc. IEEE ICCS, 2012, pp. 463–467.

[53] A. Mehbodniya, K. Temma, R. Sugai, W. Saad, I. Guvenc, andF. Adachi, “Energy-efficient dynamic spectrum access in wireless het-erogeneous networks,” in Proc. IEEE ICCW, 2015, pp. 2775–2780.

[54] W. Guo and T. O’Farrell, “Dynamic cell expansion with self-organizingcooperation,” IEEE J. Sel. Areas Commun., vol. 31, no. 5, pp. 851–860,2013.

[55] I. Aydin, H. Yanikomeroglu, and . Aygl, “User-aware cell switch-offalgorithms,” in Proc. 2015 IWCMC, 2015, pp. 1236–1241.

[56] S. Samarakoon, M. Bennis, W. Saad, and M. Latva-aho, “Dynamicclustering and on/off strategies for wireless small cell networks,” IEEETrans. Wireless Commun., vol. 15, no. 3, pp. 2164–2178, 2016.

[57] A. Galindo-Serrano, “Self-organized femtocells: A time difference learn-ing approach,” Ph.D. dissertation, Universitat Politecnica de Catalunya(UPC), Barcelona, Spain, 2013.



Atefeh Hajijamali Arani received her M.Sc. de-gree in Telecommunication engineering from TarbiatModares University, Tehran, Iran, in 2012. Cur-rently, she is a Ph.D. candidate at Isfahan Universityof Technology, Isfahan, Iran. She was a visitingresearcher at Tohoku University, Japan, in 2015.Her research interests include heterogeneous net-works, resource management, game theory, and self-organizing networks.

Abolfazl Mehbodniya (SM’16) he has been an As-sistant Professor at department of communicationsengineering, Tohoku University since Jan. 2013. Hehas 10+ years of experience in electrical engineering,wireless communications, and project management.He has over 40 published conference and journalpapers in the areas of radio resource management,sparse channel estimation, interference mitigation,short-range communications, 4G/5G design, OFDM,heterogeneous networks, artificial neural networks(ANNs) and fuzzy logic techniques with applications

to algorithm and protocol design in wireless communications. He is therecipient of JSPS fellowship for foreign researchers, JSPS young facultystartup grant and KDDI foundation grant.

M. Javad Omidi received his Ph.D. from Universityof Toronto in 1998. He worked in industry by joininga research and development group designing broad-band communication systems for 5 years in US andCanada. In 2003 he joined the Department of Electri-cal and Computer Engineering, at Isfahan Universityof Technology (IUT), Iran; and then served as thechair of Information Technology Center and chair ofthe ECE department at this university. Currently heis the Director of IRIS a UNESCO organization inIran, and the VP for Research and Development at

Isfahan Science and Technology Town .He has numerous publications andmore than 15 US and international patents in the area of telecommunications.He is supervising a software radio lab in IUT and his scientific researchinterests are in the areas of mobile computing, wireless communications,digital communication systems, software radio, cognitive radio, and VLSIarchitectures for communication algorithms.

Fumiyuki Adachi (Fellow’02) is currently aspecially-appointed Professor at Research Organiza-tion of Electrical Communication (ROEC), TohokuUniversity. His research interest includes wirelesssignal processing including wireless access, equal-ization, transmit/receive antenna diversity, adaptivetransmission, and channel coding. He is an IEEEFellow and an IEICE Fellow. He was a recipientof the IEEE Vehicular Technology Society AvantGarde Award 2000, IEICE Achievement Award2002, Thomson Scientific Research Front Award

2004, Ericsson Telecommunications Award 2008, Telecom System Tech-nology Award 2010, the Prime Minister Invention Award 2010, the KDDIFoundation Excellent Research Award 2012, and C&C Prize 2014.

Walid Saad (S’07, M’10, SM’15) received his Ph.Ddegree from the University of Oslo in 2010. Cur-rently, he is an Assistant Professor and the StevenO. Lane Junior Faculty Fellow at the Departmentof Electrical and Computer Engineering at VirginiaTech, where he leads the Network Science, Wire-less, and Security (NetSciWiS) laboratory, within theWireless@VT research group. His research interestsinclude wireless networks, game theory, cybersecu-rity, unmanned aerial vehicles, and cyber-physicalsystems. Dr. Saad is the recipient of the NSF CA-

REER award in 2013, the AFOSR summer faculty fellowship in 2014, and theYoung Investigator Award from the Office of Naval Research (ONR) in 2015.He was the author/co-author of five conference best paper awards at WiOpt in2009, ICIMP in 2010, IEEE WCNC in 2012, IEEE PIMRC in 2015, and IEEESmartGridComm in 2015. He is the recipient of the 2015 Fred W. EllersickPrize from the IEEE Communications Society. Dr. Saad serves as an editorfor the IEEE Transactions on Wireless Communications, IEEE Transactionson Communications, and IEEE Transactions on Information Forensics andSecurity.

Ismail Guvenc (SM’10) has been an AssociateProfessor at North Carolina State University sinceAugust 2016. His recent research interests include5G and mmWave wireless networks, heterogeneousnetworks, and UAV communications. He has pub-lished more than 170 conference/journal papers andbook chapters, several standardization contributions,three books, and over 30 U.S. patents. He is a recip-ient of the 2016 FIU College of Engineering FacultyResearch Award, 2015 NSF CAREER Award, 2014Ralph E. Powe Junior Faculty Enhancement Award,

and 2006 USF Outstanding Dissertation Award.

Documents

Distributed Learning for Energy-Efﬁcient Resource ... · heterogeneous networks (HetNets), in which different macro cell BSs (MBSs) and small cell BSs (SBSs) must coexist, have