

Dynamic Self-Optimization of the Antenna Tilt for Best Trade-off Between Coverage and Capacity in Mobile Networks

Nikolay Dandanov¹ • Hussein Al-Shatri² • Anja Klein² • Vladimir Poulkov¹

© Springer Science+Business Media New York 2016

Abstract One major factor influencing the coverage and capacity in mobile networks is the configuration of the antennas, especially the antenna tilt angle. By utilizing antenna tilt, signal reception within a cell can be improved and interference radiation towards other cells can be effectively reduced, which leads to a higher signal-to-interference-plus-noise ratio (SINR) received by the users and an increased sum data rate in the network. In this work, a method for capacity and coverage optimization using base station antenna electrical tilt in mobile networks is proposed. It has the potential to improve network performance while reducing operational costs and complexity, and to offer better quality of experience for the mobile users. Our solution is based on reinforcement learning, and the simulation results show that the algorithm significantly improves the overall data rate of the network compared to no antenna tilt optimization. The analysis in this paper focuses on the downlink of the cellular system. For the simulation experiments, a multicellular and sectorized mobile network in an urban environment with randomly distributed user terminals is considered. The main contribution of this work is the development of a learning algorithm for automated antenna tilting.

Keywords Mobile networks · Self-optimization · User satisfaction · Antenna tilt · Coverage · Capacity · Machine learning · Reinforcement learning

Corresponding author: Vladimir Poulkov, vkp@tu-sofia.bg

Nikolay Dandanov, n_dandanov@tu-sofia.bg

Hussein Al-Shatri

Anja Klein

1 Faculty of Telecommunications, Technical University of Sofia, Sofia, Bulgaria

2 Institute for Telecommunications, Technical University Darmstadt, Darmstadt, Germany

Wireless Pers Commun

DOI 10.1007/s11277-016-3849-9

1 Introduction

This section introduces and motivates the problem addressed in the paper, reviews the state of the art and outlines the structure of the paper.

1.1 Motivation and Background

Mobile networks nowadays provide increasingly sophisticated services and are becoming more and more complex. The high service quality demanded requires optimal utilization of the network resources. Modern mobile networks typically implement a multi-radio access technology (multi-RAT) scenario, i.e. the simultaneous operation of RATs and infrastructure concepts from different generations, such as global system for mobile communications (GSM), universal mobile telecommunications system (UMTS) and long term evolution (LTE) [1]. Deploying state-of-the-art technologies and upgrading legacy technologies and solutions for mobile networks should be straightforward and possible with minimal need for decommissioning old services or changing the physical topology of network elements, such as base stations (BS) [1].

The environment in a mobile network changes constantly and dynamically, which leads to variations in the coverage area of the cells. Reasons for these changes include the exponential growth of traffic over the years [2], variable traffic load, dynamic radio propagation conditions, malfunction of BSs, insertion or removal of one or more BSs, and errors made in the network planning phase. In addition, users are mobile and their spatial and temporal distributions are uneven, so hotspots can occasionally occur in densely populated urban areas. Therefore, as technologies and user demands evolve, the network performance should be scalable and much higher throughput should be provided. However, manual optimization of the network performance is a highly complex, time-consuming and costly task for mobile operators [1].

In order to make the network more autonomous, the 3rd generation partnership project (3GPP) has introduced the concept of self-organizing networks (SON) as a step towards achieving fully cognitive networks and autonomic networking in the future [1]. Some of the SON use cases define coverage and capacity optimization (CCO) for 4th generation mobile networks. One of the CCO use cases has been identified for automatic optimization of network resources, and its main objective is to provide sufficient coverage and capacity in the whole network area with minimum radio resources. This CCO use case is further divided into three sub-objectives [1]: maximizing the relative coverage in the area so that continuous coverage is achieved, where the relative coverage is defined as the probability that the received signal-to-interference-plus-noise ratio (SINR) is better than the minimum required; providing a sufficient quality of service (QoS) in terms of achievable bit rate over the entire area; and maximizing the system capacity in terms of bit rate.
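As an illustration of the relative coverage defined above, the following minimal Python sketch estimates it empirically as the fraction of sampled locations whose SINR exceeds a minimum required value. The function name `relative_coverage`, the sample values and the threshold are hypothetical and are not taken from [1].

```python
def relative_coverage(sinr_db_samples, sinr_min_db):
    """Empirical probability that the SINR exceeds the minimum required value.

    sinr_db_samples: SINR values in dB measured at sampled locations (assumed input).
    sinr_min_db: minimum required SINR in dB (assumed threshold).
    """
    if not sinr_db_samples:
        raise ValueError("at least one SINR sample is required")
    covered = sum(1 for s in sinr_db_samples if s > sinr_min_db)
    return covered / len(sinr_db_samples)


# Hypothetical SINR samples in dB; 5 of the 6 samples exceed -5 dB.
samples = [-8.0, -3.5, 0.2, 4.1, 9.7, 15.3]
coverage = relative_coverage(samples, sinr_min_db=-5.0)
print(f"relative coverage: {coverage:.3f}")
```

In a real network the samples would come from UE measurement reports rather than a fixed list.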

One means of optimizing the capacity and coverage provided by a BS is steering the tilt angle of its antenna. Tilting is possible both mechanically and electrically, but usually the latter is utilized for coverage and capacity optimization [3].

Depending on the current network environment, the number of users, their distribution and the high-level goals of the mobile operator, the best trade-off between coverage and capacity for the network can be achieved. High-level goals or policies of the operator are determined by the prioritization of capacity over coverage or vice versa. Such optimization can be utilized, for example, when a hotspot of users is formed in between different cells. If one BS is serving the users in the hotspot and they are close to it, it can adjust its antenna tilt towards them. This would lead to a stronger received signal for the users in the hotspot, and if the neighboring BSs also adjust their tilts, the interference from other cells would be reduced, thus improving the SINR.

For example, in a dense urban area with many business buildings, the offices should be well covered during business hours. The restaurants and canteens nearby should be covered during lunch and dinner time, resulting in a change in the tilt of the nearby BS antennas. Outside of working hours, the homes of the users should get better coverage, while during the weekends, places for entertainment, such as shopping malls, cinemas, parks and restaurants, should be well covered. Such distributions of users are strongly correlated in time, as they repeat on a weekly, daily or even hourly basis. Additionally, each user has different throughput requirements for the particular services they want to use and a specific behavior, which also varies in time.

These patterns in location and behavior can be learned from the network and utilized in the future to predict user location and demands. Thus, the antenna tilts could be adjusted in advance, so that the network responds proactively to the environment. Therefore, self-optimization of the antenna tilt is considered very important for the performance enhancement of current and future networks.

In this paper, a method that provides predictive and adaptive means of optimizing the trade-off between the coverage and capacity of a mobile network in a timely manner is proposed. Based on this, an algorithm acting in a self-organized way, contributing also to the self-optimization and self-healing functions of the network, is developed. The algorithm relies on real-time measurements, key performance indicators (KPIs) and statistically collected data from the network elements and user terminals. For solving the problem in an autonomous and predictive way, the proposed algorithm utilizes methods from machine learning, more specifically reinforcement learning (RL).

The proposed solution can be applied to counteract issues like insufficient coverage at the cell borders (coverage holes), excessive interference (pilot pollution), or weak signals received by the users. The optimization procedures are designed in such a way that they can be combined with other self-organizing functions and features to facilitate, for example, load balancing, interference coordination and power allocation. This way, a holistic approach to the coordination between the different SON features can be utilized. The solution can be integrated with novel concepts in mobile networks such as additional vertical or horizontal sectorization.

Current practical implementations of CCO techniques in mobile networks are limited. Benefits of applying the proposed solution are enhanced network performance and service quality, as well as a reduction of the qualified manual effort required for optimization of the antenna tilt. The effectiveness of the solution is illustrated through simulation experiments. Based on the results, the algorithm provides a significant gain in the overall throughput of the network compared to no antenna tilt optimization. The applied machine learning algorithm is model-independent. Therefore, the solution can be applied to mobile networks in general, independent of their generation, technologies, standards or the way the network is realized.

The described idea and method for CCO of a mobile network utilizing antenna electrical tilt have the potential to offload regular optimization tasks from the mobile network operators. Such tasks are delegated to qualified engineers and require manual effort; hence, they are expensive, time-consuming and repetitive [1]. Moreover, as human resources are limited, the reduction of costs for operation, administration and maintenance (OAM) of the network is a major goal for the mobile operators.

From a technical perspective, the quality of network operation could be enhanced if such an optimization solution is deployed. Coverage and capacity can be dynamically and predictively optimized, depending on the geographical distribution of the users and their momentary throughput demands. If shorter optimization time periods are implemented, in the order of minutes, the network can dynamically and adaptively (or even predictively and proactively) provide capacity to hotspots or effectively cover areas where poor signal reception occurs.

The presented solution improves throughput, guarantees the minimum data rate required by the user and enhances the mobile services in general. As a result, the quality of experience (QoE) for the users is improved and usage of more and diverse services is stimulated, which contributes to the revenue from network operation.

1.2 State of the Art

In [4], the authors focus on the self-optimization of remote electrical tilt (RET) from the perspective of CCO. Simulation studies are performed for best-effort traffic and resource-fair scheduling using an LTE snapshot simulator, in which propagation and uneven user distribution are modeled in three dimensions for a realistic macro-cellular network scenario. CCO by means of adaptive RET adjustment is the main topic of discussion, and the results are given in terms of different performance indicators. It is shown that RET optimization may provide significant performance improvement in the case of suboptimal network planning or reuse of 3G network planning.

In [5], the adaptation of the antenna tilt angle is formulated as an optimization task aiming for utility fairness. Namely, the objective is to jointly adjust the antenna tilt angles within the cellular network so as to maximize user utility, subject to network constraints. Adjustments at the BSs are carried out jointly in a coordinated manner in order to manage interference. This optimization problem is nonconvex, but the authors show that under certain conditions it can be reformulated as convex. Specifically, they show that: (1) in the high-SINR operating regime and with an appropriate choice of variables, the optimization is convex for any concave utility function; and (2) in any SINR regime, the optimization can be formulated in a convex manner when the objective is a proportional fair rate allocation. The approach in [5] is lightweight, making use of measurements that are already available at base stations, and is therefore suited to distributed implementation. However, this optimization model is valid only under certain conditions, whereas the algorithm proposed in the current paper functions with no such constraints.

In [4], a centralized optimization algorithm based on case-based learning (CBL) is proposed. This method relies on storing cases or instances in a memory and applying them directly in new situations. CBL is an appropriate technique because of its low implementation complexity and high accuracy in the case of a small number of training examples and irrelevant features. In CBL, all numeric feature values are linearly normalized and the k-nearest neighbor algorithm is used as a prediction function [4]. An overview of the CBL algorithm proposed in [4] is given in [6]. According to this overview, a limited number of training examples are collected from the network by a central server and stored in memory. These training examples are measurements (e.g., received power, SINR, call drops) and they are applied to new states of the network. The optimum antenna tilt is then determined using the k-nearest neighbor algorithm, where the case that most closely matches the current state is chosen. Although [4] does not state what the minimum number of training examples should be, different sets of examples would have to be collected for different network environments.
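The CBL idea summarized above (linear normalization of the features followed by a k-nearest neighbor lookup) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of [4]: the feature set (received power, SINR, call drop rate), the stored cases, the min-max form of the normalization and the averaging of the tilts of the k nearest cases are all hypothetical choices.

```python
import math


def make_scaler(feature_vectors):
    """Return a function that linearly maps each feature to [0, 1] (min-max scaling)."""
    n = len(feature_vectors[0])
    lo = [min(v[i] for v in feature_vectors) for i in range(n)]
    hi = [max(v[i] for v in feature_vectors) for i in range(n)]

    def scale(v):
        # A feature that is constant over all cases carries no information: map it to 0.
        return [(v[i] - lo[i]) / (hi[i] - lo[i]) if hi[i] > lo[i] else 0.0
                for i in range(n)]

    return scale


def predict_tilt(memory, query, k=1):
    """Average the tilt of the k stored cases closest to the query state."""
    scale = make_scaler([features for features, _ in memory])
    q = scale(query)
    ranked = sorted(memory, key=lambda case: math.dist(scale(case[0]), q))
    nearest = ranked[:k]
    return sum(tilt for _, tilt in nearest) / len(nearest)


# Hypothetical case memory: ([received power in dBm, SINR in dB, call drop rate], tilt in degrees)
memory = [
    ([-95.0, 2.0, 0.05], 8.0),
    ([-80.0, 12.0, 0.01], 4.0),
    ([-105.0, -3.0, 0.12], 10.0),
]
query = [-82.0, 10.0, 0.02]
print(predict_tilt(memory, query, k=1))
```

With k = 1 the sketch returns the tilt of the single closest stored case, which corresponds to choosing the case that most closely matches the current state.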

In [6], the authors apply an RL-based sparse sampling algorithm for coverage self-optimization through antenna tilting. According to this research, this algorithm is better than supervised learning and Q-learning based algorithms, as it has the ability to adapt to network environments without prior knowledge, handle large state spaces, perform self-healing and potentially focus on multiple coverage problems. The paper focuses on the problem of coverage self-optimization in the context of LTE networks by adjusting the antenna tilt.

In [7], the proposed method for self-optimization of the tilt angle is based on fuzzy RL techniques and operates in a fully distributed, asynchronous and autonomous fashion without any need for a priori information about the network conditions or any human intervention. The solution is shown to be capable of handling extremely noisy feedback information from mobile users, as well as being responsive to changes in the environment, including self-healing properties. The simulation results confirm the convergence of the solution to the globally optimal settings and show that it provides up to 20 % performance improvement when compared to an existing fuzzy logic based RL approach. The CCO in [7] is done using a fuzzy RL approach, namely Q-learning, which is a technique where the entire state space is searched to find the optimum solution. Since the cellular network environment can be in a potentially infinite number of states or configurations, this number is drastically reduced in [7] through fuzzification, followed by optimization done in a distributed manner. Although the fuzzification step has the advantage of reducing the state space and the noise, it reduces the flexibility of the algorithm and its ability to focus on multiple coverage problems [6]. Moreover, the algorithm does not introduce a significant throughput gain.
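For reference, the standard tabular Q-learning update on which such approaches build can be sketched as follows. This is the generic textbook rule, not the fuzzy variant of [7]; the encoding of states as tilt angles in degrees, the actions as tilt changes, the reward value, and the learning rate and discount factor are assumptions made purely for illustration.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    q: dict mapping (state, action) to the current value estimate (missing entries count as 0).
    alpha: learning rate, gamma: discount factor (both assumed values).
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Hypothetical example: the state is the current tilt in degrees,
# the action is a tilt change of -1, 0 or +1 degree.
actions = (-1, 0, 1)
q_table = {}
first = q_update(q_table, state=6, action=1, reward=2.0, next_state=7, actions=actions)
second = q_update(q_table, state=6, action=1, reward=2.0, next_state=7, actions=actions)
print(first, second)
```

Because one table entry is kept per state-action pair, the table grows with the number of distinguishable network states, which is the motivation for reducing the state space.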

The RL sparse sampling algorithm in [6] also overcomes the problems of the algorithms

used in [4] and [7], namely, the inability to adapt to network environments without prior

knowledge and the inability to handle a large set of network configurations. A centralized

architecture is adopted and the algorithm is shown to have self-optimizing and self-healing

capabilities.

From an architecture perspective, the authors of [6] focus their work on LTE networks; however, they have implemented a centralized approach. This means that there is one agent per cell site that implements the algorithm, and a master agent that decides which agents get to act simultaneously. A design choice has been made wherein the master agent allows only non-conflicting or non-overlapping cells to act simultaneously. Only the first-tier neighbors of a cell are assumed to be conflicting. Such an assumption might not be appropriate, however, if an urban environment is considered, as in the current work. In contrast to [6], a distributed architecture is selected in our scenario.

A centralized approach is also implemented in [4]. It is stated that this strengthens the analysis of the root cause of the problem (e.g., capacity vs. coverage, excessive uptilting or downtilting) and allows selecting the cells to be optimized with dedicated weighting parameters for prioritization, as well as avoiding changes in the parameters of cells which may interact with each other.

Nevertheless, because of the inconsistent radio propagation characteristics of the different cells in the network and the variation of the geographical distribution of traffic load across the cells, it is recommended that the downtilt adjustment is performed in a distributed fashion on a cell-by-cell basis and that the solution is responsive to the environment changes. Compared to the centralized tilt angle optimization, it is shown that the cell-based scheme can provide significantly more gain [7].

The method proposed in this work contributes dynamic and adaptive tilt adjustment, based on RL methodology with low computational complexity for distributed real-time operation. In comparison with other research works, which consider no learning, only a static network environment, and/or a static user distribution, we consider the mobile network to be a dynamic environment that adapts to current user distributions. This means that, based on a learning algorithm, the network can predict the optimal antenna tilts for a particular user distribution. The learning approach does not depend on the model, reduces the signaling and does not need information about future user distributions and their probabilities.

1.3 Structure of the Paper

The rest of the paper is structured as follows. Section 2 describes the system model, formulates the problem and elaborates on how we model the antenna radiation pattern and the mobile radio channel. The proposed RL solution is explained in Sect. 3, where the basic principles of RL and a flowchart of the algorithm are also presented. Simulation results are provided, described and evaluated in Sect. 4. Finally, in Sect. 5, we conclude the paper and recommend future research directions on the topic.

2 Problem Formulation

In this section, we describe the system model, formulate the problem and show how the

antenna radiation pattern and mobile radio channel are modeled.

2.1 System Model

We consider a mobile network in an urban environment as shown in Fig. 1 and assume that each base station is equipped with three antennas, each covering one sector. From a practical perspective, this assumption corresponds to typical mobile network deployments, as each BS antenna serves the users inside its sector. In the following, the term cell is used with the same meaning as sector. We assume that each cell can be geometrically approximated as a hexagon; thus, the network is multicellular and sectorized. In practice, due to inconsistent radio propagation characteristics and uneven environment landscape, the shape of the cell is not a perfect hexagon.

Fig. 1 Model of the mobile network

The user equipment (UE) or user terminals are randomly distributed in the network. It

could happen that some UEs are not covered by any cell and they can get no service. The

analysis in this paper focuses on the radio signals transmitted by the BS and received by the

UE, i.e. the downlink of the cellular system.

Let $B$ denote the number of cells and $K$ denote the number of UEs in the network. We assume that the transmit power of each BS antenna is constant and is the same for all the base stations in the network. We suppose that the $b$-th base station antenna, $b \in \{1, \ldots, B\}$, transmits the signal $s(t)$ with power $P_T$, as shown in Fig. 2.

The signal is amplified with the gain of the BS antenna, denoted as $G_{T(k,b)}(\theta_b)$. It propagates through the wireless channel between the $b$-th BS antenna and the $k$-th UE, $k \in \{1, \ldots, K\}$. The path loss and shadowing are expressed with the channel gain $|h_{k,b}|^2$ between the UE and the BS antenna. We consider additive channel noise $n(t)$ and interference $i(t)$. The channel noise and the internal UE noise are approximated as additive white Gaussian noise (AWGN), i.e. with constant power spectral density (PSD) and power $P_N$. The period of optimizing the antenna tilt angle is in the order of hours; hence, short-term channel variations, such as fast fading, are omitted in the analytical model.

The $k$-th UE receives the signal $r(t)$ from the $b$-th BS antenna with power $P_{R(k,b)}$, which depends on the gain of the BS antenna in the downlink $G_{T(k,b)}$, the gain of the receiving UE antenna $G_{R(k,b)}$, the transmit power $P_T$ and the channel gain $|h_{k,b}|^2$:

$$P_{R(k,b)} = P_T \cdot G_{T(k,b)} \cdot G_{R(k,b)} \cdot |h_{k,b}|^2. \tag{1}$$

A more detailed definition of the channel gain $|h_{k,b}|^2$ is given in Sect. 2.4.

We assume that each UE is equipped with an omnidirectional antenna, which has a constant gain of one in all directions, i.e. $G_{R(k,b)} = 1$. Therefore, $G_{R(k,b)}$ is omitted in the following equations.

To keep the paper concise, the system design described here implements a simplified path loss and shadowing model, which captures the essence of signal propagation in an urban environment. These effects are expressed with the channel gain $|h_{k,b}|^2$, which is also a dimensionless quantity.

Fig. 2 Model of the mobile radio channel

We assume a frequency reuse factor of one, meaning that the system is limited by the interference from neighboring cells and not by the received noise. Therefore, the SINR is decisive for determining the data rate offered to UE $k$ by cell $b$. It is expressed as

$$\gamma_{k,b} = \frac{P_{R(k,b)}}{P_N + P_{I(k)}}, \tag{2}$$

where the term $P_{I(k)}$ is the power of the interfering signals $i(t)$ received at UE $k$ and transmitted by all cells other than $b$:

$$P_{I(k)} = \sum_{l=1,\, l \neq b}^{B} P_{R(k,l)}. \tag{3}$$
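Equations (1) to (3) can be checked numerically with a short Python sketch. All quantities are in linear units (not dB), cells and UEs are indexed from zero rather than from one, and the numerical values (transmit power, antenna gains, channel gains, noise power) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def received_power(p_t, g_t, channel_gain, g_r=1.0):
    """Eq. (1) in linear units: P_R = P_T * G_T * G_R * |h|^2 (G_R defaults to 1)."""
    return p_t * g_t * g_r * channel_gain


def sinr(k, b, p_rx, p_n):
    """Eqs. (2)-(3): SINR of UE k served by cell b.

    p_rx[k][l] is the power received by UE k from cell l, p_n is the noise power.
    """
    interference = sum(p for l, p in enumerate(p_rx[k]) if l != b)
    return p_rx[k][b] / (p_n + interference)


# Hypothetical snapshot: one UE and three cells.
P_T = 20.0                             # transmit power of every BS antenna
antenna_gains = [30.0, 5.0, 2.0]       # G_T towards the UE from cells 0, 1, 2
channel_gains = [1e-9, 4e-10, 1e-10]   # |h|^2 between the UE and each cell
P_N = 1e-9                             # noise power

p_rx = [[received_power(P_T, g, h) for g, h in zip(antenna_gains, channel_gains)]]
gamma = sinr(0, 0, p_rx, P_N)
print(f"SINR of UE 0 served by cell 0: {gamma:.2f}")
```

In this example the interference from the two neighboring cells is much larger than the noise power, which matches the interference-limited regime assumed for a frequency reuse factor of one.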

The data rate with which UE $k$ is served by cell $b$ can be expressed using the Shannon–Hartley theorem [8] as

$$R_k\,[\text{bit/s}] = \Delta f \cdot \log_2\left(1 + \gamma_{k,b}\right), \tag{4}$$

where $\Delta f$ is the frequency bandwidth of the transmitted signal $s(t)$ and $R_k$ is in bits per second. We assume that the available bandwidth is divided into $N_{sc}$ subcarriers by using multicarrier transmission, such as orthogonal frequency-division multiplexing (OFDM) applied in LTE and WiMAX [1]. Then Eq. (4) can be generalized as

$$R_k = \sum_{n=1}^{N_{sc(k)}} \frac{\Delta f}{N_{sc}} \cdot \log_2\left(1 + \gamma_{k,b,n}\right), \tag{5}$$

where $N_{sc(k)}$ is the number of subcarrier frequencies allocated to user $k$ and $\gamma_{k,b,n}$ is the SINR for the $n$-th subcarrier. For simplicity, we assume that each user is assigned a single subcarrier.
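The rate expressions (4) and (5) translate directly into code. The sketch below is a minimal illustration; the bandwidth of 10 MHz, the 50 subcarriers and the SINR values are hypothetical, and the SINR is given in linear units.

```python
import math


def data_rate(delta_f_hz, n_sc_total, sinr_per_subcarrier):
    """Eq. (5): sum of the Shannon rates over the subcarriers allocated to one user.

    delta_f_hz: total bandwidth, n_sc_total: total number of subcarriers N_sc,
    sinr_per_subcarrier: linear SINR on each subcarrier allocated to the user.
    """
    sub_bandwidth = delta_f_hz / n_sc_total
    return sum(sub_bandwidth * math.log2(1.0 + g) for g in sinr_per_subcarrier)


# Eq. (4) is the special case of one subcarrier spanning the whole bandwidth.
r_full_band = data_rate(10e6, 1, [3.0])          # 10 MHz * log2(1 + 3)
# One of 50 subcarriers assigned to the user, as assumed in the text.
r_single_subcarrier = data_rate(10e6, 50, [15.0])  # 200 kHz * log2(1 + 15)
print(r_full_band, r_single_subcarrier)
```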

In order to achieve a higher data rate for user $k$, we have to maximize the SINR $\gamma_{k,b}$, i.e. to maximize the received power $P_{R(k,b)}$ and to minimize the interference $P_{I(k)}$, according to (2), (3) and (5). For a snapshot of the network in the time domain, we assume that for a particular static BS and UE, the channel gain $|h_{k,b}|^2$ and the BS transmit power $P_T$ are fixed. Thus, we can alter the BS antenna gain $G_T$ in order to maximize the received SINR.

2.2 Modeling the Radiation Pattern of BS Antenna

In this subsection, the problem of how to accurately model the radiation pattern of the

antennas at the BS will be introduced. This is important for CCO, because the proposed

solution adjusts the antenna tilt in order to optimize capacity and coverage.

Antenna tilt is defined as the angle between the direction of the main beam of the antenna pattern and the horizon. In Fig. 3, a BS with its antenna is depicted, as seen from the side (cross section with the vertical plane). The main, side and back lobes of the antenna radiation pattern, as well as the vertical beamwidth and tilt angle parameters, are indicated. The total tilt angle is shown as $\theta_{\mathrm{tilt}}$ and consists of two angles, mechanical and electrical. Fundamentally, downtilting is possible both mechanically and electrically [3]. In mechanical downtilt, the antenna main lobe is lowered on one side and the antenna back lobe is raised on the other side, because the antenna elements are physically directed towards the ground (see Fig. 4a). In Fig. 4a, the mechanical downtilt angle is denoted by $\theta_{\mathrm{mechanical}}$. For an antenna at the BS in a mobile network, an antenna array of dipole elements is extensively used [9]. In electrical downtilt, the main, side and back lobes are tilted uniformly (see Fig. 4b). This is achieved by adjusting the phases for each of the antenna array elements [3]. In Fig. 4b, the electrical downtilt angle is denoted by $\theta_{\mathrm{electrical}}$. The total tilt angle is

$$\theta_{\mathrm{tilt}} = \theta_{\mathrm{mechanical}} + \theta_{\mathrm{electrical}}. \tag{6}$$

When the antenna is downtilted towards the ground, the change in angle is positive. If the antenna is uptilted, the change in angle is negative [6]. In Fig. 3, the antenna is downtilted, which means that the tilt is positive, $\theta_{\mathrm{tilt}} > 0^\circ$. Figures 3 and 4 show cross sections of the antenna radiation pattern with the vertical (elevation) plane. The azimuthal plane is the one which is perpendicular to the elevation plane and parallel to the direction of the main beam of the antenna.

Adjusting the mechanical downtilt requires site visits, which are time and cost inefficient and inflexible. The advances in electrical downtilting have enabled remote adjustment of the radiation pattern for all azimuth angles by means of signal phasing alteration. Novel techniques enable intelligent algorithms to be employed for autonomous configuration and optimization of the downtilt angles and remove or minimize the need for expensive drive testing and statistical analysis of network performance and fault reports [7]. Therefore, the antenna is usually tilted mechanically during the installation and initial setup of the network. Afterwards, the fine-tuning adjustments of the tilt are carried out utilizing the electrical tilt.

In the following, we show how the BS antenna gain $G_T$ can be adjusted by changing the antenna tilt angle. For the $b$-th BS antenna, in our notation, the combined antenna tilt angle $\theta_{\mathrm{tilt}}$ is denoted as $\theta_b$. When $\theta_b > 0^\circ$, the antenna is downtilted towards the ground. If $\theta_b < 0^\circ$, the antenna is uptilted away from the ground.

Fig. 3 A base station and one of its antennas. The main, side and back lobes of the antenna radiation pattern, as well as the vertical beamwidth and tilt angle parameters, are denoted

Fig. 4 Mechanical downtilt (a) and electrical downtilt (b)

Theoretically, the radiation pattern of an ideal BS antenna is not a simple function of the tilt angle $\theta_b$. In order to facilitate the modelling and simulation of next generation mobile networks, 3GPP has proposed a simplified antenna radiation pattern in [10]. Most of the published works that aim to solve the problem of autonomous capacity and coverage optimization use this approximation model. According to it, the antenna attenuation in the horizontal (azimuthal) plane for a 3-sector cell site is expressed as

$$A_H(\varphi) = -\min\left[12\left(\frac{\varphi}{\varphi_{3dB}}\right)^2, A_m\right], \; [\text{dBi}], \tag{7}$$

where $\varphi$ [°] is the angle between the normal to the antenna and the direction of measurement, $\varphi_{3dB}$ [°] is the horizontal half power beamwidth (HPBW) of the main lobe and $A_m$ [dBi] is the attenuation with which the side and back lobes of the pattern are approximated. It must be noted that in practice, the back lobe of the antenna pattern usually has less attenuation than the side lobes. These parameters apply to the azimuthal (horizontal) plane. The antenna attenuation in the vertical plane is

$$A_V(\theta, \theta_b) = -\min\left[12\left(\frac{\theta - \theta_b}{\theta_{3dB}}\right)^2, SLA_v\right], \; [\text{dBi}], \tag{8}$$

where $\theta$ [°] is the angle between the normal to the antenna and the direction of measurement, $\theta_{3dB}$ [°] is the vertical HPBW of the main lobe and $SLA_v$ [dBi] is the attenuation of the side lobes of the antenna pattern. These parameters apply to the vertical (elevation) plane. $\theta_b$ [°] represents the tilt angle of the antenna with respect to the horizon. Typically, $\varphi_{3dB} = 70^\circ$, $A_m = 25$ dBi, $\theta_{3dB} = 10^\circ$ and $SLA_v = 20$ dBi [10].

The antenna attenuation at any point in space can be expressed analytically by combining (7) and (8):

$$A(\varphi, \theta, \theta_b) = -\min\left\{-\left[A_H(\varphi) + A_V(\theta, \theta_b)\right], A_m\right\}, \; [\text{dBi}]. \tag{9}$$

In order to obtain the antenna gain at a certain point in space, the maximum antenna gain $G_{T,max}$ [dBi] should be added to (9):

$$G_T(\varphi, \theta, \theta_b)\,[\text{dBi}] = A(\varphi, \theta, \theta_b) + G_{T,max}. \tag{10}$$

The maximum antenna gain is technology and vendor specific. The operating frequency also influences the antenna gain [11]. Typically, $G_{T,max}$ is in the order of 15 dBi [10, 11].

Equation (10) shows that the gain of the BS antenna depends on the angles $\varphi$, $\theta$ and $\theta_b$. $\varphi$ and $\theta$ are determined by the relative position of the BS and the UE. Hence, for a static snapshot of the mobile network, these angles can be considered fixed. However, by tuning the tilt angle $\theta_b$, the antenna gain can be adjusted, i.e. $G_T(\theta_b)$.

In [3], [5] and [12] it is confirmed that the pattern defined by [10] is suitable for practical and theoretical modelling of the vertical pattern of antennas deployed in today's mobile networks.
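The pattern model of (7) to (10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the default values are the typical ones quoted above (70°, 25 dBi, 10°, 20 dBi, 15 dBi), and the code is not the authors' MATLAB implementation.

```python
def horizontal_attenuation(phi_deg, phi_3db=70.0, a_m=25.0):
    """Eq. (7): attenuation in the azimuthal plane (always <= 0)."""
    return -min(12.0 * (phi_deg / phi_3db) ** 2, a_m)


def vertical_attenuation(theta_deg, tilt_deg, theta_3db=10.0, sla_v=20.0):
    """Eq. (8): attenuation in the elevation plane for tilt angle tilt_deg."""
    return -min(12.0 * ((theta_deg - tilt_deg) / theta_3db) ** 2, sla_v)


def antenna_gain_dbi(phi_deg, theta_deg, tilt_deg, g_max=15.0, a_m=25.0):
    """Eqs. (9) and (10): combined attenuation plus the maximum gain."""
    total = (horizontal_attenuation(phi_deg, a_m=a_m)
             + vertical_attenuation(theta_deg, tilt_deg))
    combined = -min(-total, a_m)  # the total attenuation is capped at a_m
    return combined + g_max


if __name__ == "__main__":
    # Hypothetical UE on the antenna boresight, seen 6 degrees below the horizon.
    for tilt in range(0, 11, 2):
        print(tilt, antenna_gain_dbi(0.0, 6.0, tilt))
```

For a fixed UE direction, sweeping the tilt angle shows the dependence $G_T(\theta_b)$ discussed above: the gain peaks when the tilt matches the elevation angle of the UE.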

2.3 Problem Statement

The objective of this work is to maximize the user satisfaction by adjusting the tilt angle of the BS antennas in a mobile network. Let us assume that the k-th user demands a certain data rate $\tilde{R}_k$. One of the means for achieving maximum user satisfaction is to ensure that the throughput of each UE reaches at least its corresponding demanded rate $\tilde{R}_k$. Therefore, we can postulate the objective function as


$$\sum_{k=1}^{K} w_k \cdot \left(\tilde{R}_k - R_k\right), \tag{11}$$

where $R_k \equiv R_{k,b}$ is the data rate received by the k-th user. $w_k$ is a weighting factor which determines the user priority, with $0 \le w_k \le 1$. The closer $w_k$ is to 1, the higher the priority of the user and the higher the data rate it will achieve. Higher values can be assigned to users which are farther from the BS or are located in poor radio conditions and therefore can have a lower data rate. If $w_k = 0$, then the satisfaction of the k-th user is neglected. Such a low value can be used when a UE is close to the BS antenna and receives a higher data rate than demanded.

As the electrical tilt can be altered remotely and can therefore be used to optimize the network performance predictively, the optimization problem aims to minimize (11) by adjusting the tilt angle and can be postulated as

$$\begin{aligned} &\underset{\Theta}{\arg\min} \sum_{k=1}^{K} w_k \cdot \left(\tilde{R}_k - R_k(\Theta)\right) \\ &\text{subject to } \theta_{b,min} \le \theta_b \le \theta_{b,max}, \end{aligned} \tag{12}$$

where $\Theta = (\theta_1, \ldots, \theta_b, \ldots, \theta_B)$ is a vector containing the tilt angle of each BS in the network. The weighting factor can be set to $w_k = 0$ for users for which $R_k(\Theta) > \tilde{R}_k$, and to $0 < w_k \le 1$ otherwise. This way, satisfied users will not be taken into consideration in the optimization. The limits of the electrical tilt angle are chosen to be $\theta_{b,min} = 0^\circ$ and $\theta_{b,max} = 10^\circ$, as in practical antennas [1].

It must be noted that the optimization problem (12) is non-convex and therefore cannot

be easily solved with standard optimization techniques.
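To make the objective concrete, the following Python sketch evaluates (11) for a given set of achieved rates and, purely for illustration, solves (12) by exhaustive search over a discrete tilt grid. Exhaustive search is not the method proposed in this paper; it is shown only because it is easy to verify, and its cost grows exponentially with the number of antennas. The names `weighted_unmet_demand`, `exhaustive_tilt_search` and `rate_fn` are hypothetical, and the weight of 1.0 for unsatisfied users is an assumed value from the allowed interval.

```python
from itertools import product


def weighted_unmet_demand(demanded, achieved, weights=None):
    """Objective (11): sum of w_k * (demanded_k - achieved_k)."""
    if weights is None:
        # Rule from the text: w_k = 0 for satisfied users. The value 1.0 for
        # all other users is an assumption taken from the interval (0, 1].
        weights = [0.0 if r > d else 1.0 for d, r in zip(demanded, achieved)]
    return sum(w * (d - r) for w, d, r in zip(weights, demanded, achieved))


def exhaustive_tilt_search(rate_fn, demanded, n_antennas,
                           grid=(0, 2, 4, 6, 8, 10)):
    """Brute-force solution of (12) on a discrete tilt grid (illustration).

    rate_fn maps a tuple of tilt angles to the list of achieved user rates.
    """
    return min(
        product(grid, repeat=n_antennas),
        key=lambda tilts: weighted_unmet_demand(demanded, rate_fn(tilts)),
    )


if __name__ == "__main__":
    # Made-up rate model: each user is best served at one particular tilt.
    toy_rates = lambda t: [10 - abs(t[0] - 4), 10 - abs(t[1] - 8)]
    print(exhaustive_tilt_search(toy_rates, demanded=[10, 10], n_antennas=2))
```

With six tilt settings per antenna the grid has $6^B$ points, which motivates a learning-based approach for larger networks.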

2.4 Modeling the Mobile Radio Channel

Considering only the path loss, the mobile radio channel gain $|h_{k,b}|^2$ in an urban environment corresponds to

$$|h_{k,b}|^2 = \left(\frac{\lambda}{4 \cdot \pi \cdot d_0}\right)^2 \cdot \left(\frac{d_0}{d_{k,b}}\right)^{\alpha}, \tag{13}$$

where $d_{k,b}$ is the distance between BS antenna b and UE k, as shown in Fig. 5. $d_0$ is a reference distance for the antenna far-field and $\alpha$ is the path loss exponent. According to the definitions in [8], the scenario modelled in this work is an urban macrocell environment, hence $\alpha$ lies between 3.7 and 6.5. Due to the scattering phenomena in the antenna near-field, model (13) is generally only valid at transmission distances $d > d_0$, where $d_0$ is typically assumed to be (1–10) m indoors and (10–100) m outdoors. $\lambda$ is the radio wavelength used for signal transmission and depends on the signal frequency

$$\lambda = \frac{c}{f}, \quad [\mathrm{m}], \tag{14}$$

where $c \approx 3 \cdot 10^8$ [m/s] is the velocity of radio wave propagation, which in free space is equal to the speed of light; $f$, [Hz] is the frequency of the transmitted signal.
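As a numerical illustration of (13) and (14), the sketch below computes the wavelength and the path gain in Python. The reference distance of 10 m is an assumption taken from the outdoor range quoted above, and the function names are hypothetical.

```python
import math

C = 3.0e8  # propagation speed in m/s, as used in (14)


def wavelength_m(f_hz):
    """Eq. (14): wavelength in metres for a carrier frequency in Hz."""
    return C / f_hz


def path_gain(d_m, f_hz, alpha, d0_m=10.0):
    """Eq. (13): linear channel gain; d0_m = 10 m is an assumed value."""
    if d_m <= d0_m:
        raise ValueError("model (13) is only valid for d > d0")
    lam = wavelength_m(f_hz)
    return (lam / (4.0 * math.pi * d0_m)) ** 2 * (d0_m / d_m) ** alpha


def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)


if __name__ == "__main__":
    for alpha in (2.0, 3.7, 4.5):
        print(alpha, round(to_db(path_gain(500.0, 2.6e9, alpha)), 1))
```

For $\alpha = 2$ the expression reduces to the free-space gain $(\lambda / (4 \pi d))^2$, which is a convenient sanity check of the implementation.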

In (1), $P_{R(k,b)}$ is a mean value, which is averaged over any random variations due to

shadowing in the mobile channel. In addition to path loss, the signal typically experiences


random variation due to objects in the signal path, giving rise to a random variation of the

path loss at a given distance. Changes in reflecting surfaces and scattering objects can also

cause random variation of the path loss, therefore a model for the random attenuation due

to these effects is implemented. Since the location, size, and dielectric properties of

blocking objects, as well as the changes in reflecting surfaces and scattering objects that

cause the random attenuation, are generally unknown, statistical models are widely used to

characterize this attenuation. The most common model for this additional attenuation is

log-normal shadowing. This model has been confirmed empirically to accurately model the

variation in path loss (channel gain) or received power in both outdoor and indoor radio

propagation environments. In the log-normal shadowing model, the channel gain $|h|^2$ is assumed random with a log-normal distribution, given by [8]

$$p\left(|h|^2\right) = \frac{\xi}{\sqrt{2\pi}\,\sigma_{\psi_{dB}}\,|h|^2} \exp\left[-\frac{\left(10\log_{10}|h|^2 - \mu_{\psi_{dB}}\right)^2}{2\sigma_{\psi_{dB}}^2}\right], \quad |h|^2 > 0, \tag{15}$$

where $\xi = 10/\ln 10$, $\mu_{\psi_{dB}}$ is the mean of $\psi_{dB} = 10\log_{10}|h|^2$ in [dB] and $\sigma_{\psi_{dB}}$ is the standard deviation of $\psi_{dB}$, also in [dB]. When the channel gain is log-normally distributed, the received power $P_{R(k,b)}$ is also log-normal. The mean of $|h|^2$ (the linear average path loss) is [8]

$$\mu_{|h|^2} = E\left[|h|^2\right] = \exp\left[\frac{\mu_{\psi_{dB}}}{\xi} + \frac{\sigma_{\psi_{dB}}^2}{2\xi^2}\right]. \tag{16}$$

The conversion from the linear mean (in [dB]) to the log mean (in [dB]) is [8]

$$10\log_{10}\mu_{|h|^2} = \mu_{\psi_{dB}} + \frac{\sigma_{\psi_{dB}}^2}{2\xi}. \tag{17}$$

The distribution of the $\psi_{dB}$ value is Gaussian with mean $\mu_{\psi_{dB}}$ and standard deviation $\sigma_{\psi_{dB}}$ [8]. The standard deviation $\sigma_{\psi_{dB}}$ is used to generate channel gain variations in the current work. Incorporating shadow (slow) fading, the channel gain can be expressed as

$$|h_{k,b}|^2 = \left(\frac{\lambda}{4 \cdot \pi \cdot d_0}\right)^2 \cdot \left(\frac{d_0}{d_{k,b}}\right)^{\alpha} \cdot \psi_{k,b}, \tag{18}$$

Fig. 5 A base station with the b-th antenna and the k-th user terminal. Antenna heights are also denoted: $H_b$ for the BS and $H_k$ for the UE. The tilt angle $\theta_b$ and the distance $d_{k,b}$ are marked in the figure.


where the dimensionless coefficient $\psi$ is a log-normally distributed random variable. It can be derived from

$$\psi = 10^{\psi_{dB}/10}, \tag{19}$$

where $\psi_{dB}$ is a normally distributed random variable with zero mean and variance $\sigma_{\psi_{dB}}^2$.

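Substituting (1), (2), (3) and (18) in (5), we get

$$R_k = \sum_{n=1}^{N_{sc}(k)} \frac{\Delta f}{N_{sc}} \cdot \log_2\left(1 + \frac{P_T \cdot G_{T(k,b)} \cdot \left(\frac{\lambda}{4 \cdot \pi \cdot d_0}\right)^2 \cdot \left[\frac{d_0}{d_{k,b}}\right]^{\alpha} \cdot \psi_{k,b}}{P_N + \sum_{l=1,\, l \ne b}^{B} P_T \cdot G_{T(k,l)} \cdot \left(\frac{\lambda}{4 \cdot \pi \cdot d_0}\right)^2 \cdot \left[\frac{d_0}{d_{l,b}}\right]^{\alpha} \cdot \psi_{l,b}}\right), \quad \left[\frac{\mathrm{bit}}{\mathrm{s}}\right]. \tag{20}$$

Here $d_{k,b}$ and $d_{l,b}$ express the transmission distances between the UE and the respective BS antennas. Similarly, $\psi_{k,b}$ and $\psi_{l,b}$ are log-normally distributed random variables for modelling the shadow fading.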
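The structure of (19) and (20) can be sketched in Python as follows. The sketch assumes that all gains are passed as linear values (not in dB) and that the channel is identical on every subcarrier allocated to the user, so that the sum over subcarriers becomes a multiplication. All function and parameter names, including the noise power `p_n_w`, are hypothetical.

```python
import math
import random


def shadowing_gain(sigma_db, rng):
    """Eq. (19): psi = 10^(psi_dB / 10) with psi_dB ~ N(0, sigma_db^2)."""
    return 10.0 ** (rng.gauss(0.0, sigma_db) / 10.0)


def channel_gain(d_m, lam_m, alpha, psi=1.0, d0_m=10.0):
    """Eq. (18): path loss times shadowing; d0_m = 10 m is an assumption."""
    return (lam_m / (4.0 * math.pi * d0_m)) ** 2 * (d0_m / d_m) ** alpha * psi


def user_rate_bps(p_t_w, g_serving, h_serving, interferers, p_n_w,
                  delta_f_hz, n_sc, n_sc_user=1):
    """Eq. (20) for one user.

    interferers is an iterable of (linear antenna gain, channel gain) pairs,
    one per interfering antenna.
    """
    interference = sum(p_t_w * g * h for g, h in interferers)
    sinr = p_t_w * g_serving * h_serving / (p_n_w + interference)
    return n_sc_user * (delta_f_hz / n_sc) * math.log2(1.0 + sinr)


if __name__ == "__main__":
    rng = random.Random(0)
    lam = 3.0e8 / 2.6e9
    h = channel_gain(300.0, lam, 4.5, psi=shadowing_gain(8.0, rng))
    print(user_rate_bps(40.0, 10 ** 1.4, h, [], 1e-15, 10e6, 5000))
```

The rate decreases whenever an interfering term is added to the denominator, which is the mechanism through which downtilting neighboring antennas improves the SINR.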

3 Reinforcement Learning Solution

In this section, the proposed RL solution is explained. First, the key idea, basic principles

and application of RL to our scenario are explained. Then, the architecture of the solution

and representation as a Markov decision process are described. The section concludes with

a definition of the reward function and flowchart of the algorithm.

The problem defined in the previous section is non-convex and is therefore hard to solve with standard optimization techniques in real time. However, an ML approach based on reinforcement learning (RL) can be applied for the autonomous optimization of the BS antenna tilt angles. RL is suitable for the considered problem because it has relatively low computational complexity and avoids the ‘‘trial and error’’ nature of some other ML techniques. This is important for mobile networks, because failure in the network operation is highly undesirable.

3.1 Idea and Basic Principles of RL

According to [13], in RL the training and testing phases are intermixed. To collect

information, the learner actively interacts with the environment and in some cases affects

the environment, and receives an immediate reward for each action. The object of the

learner is to maximize its reward over a course of actions and iterations with the envi-

ronment. No long-term reward feedback is provided by the environment, hence the learner

is faced with the exploration versus exploitation dilemma, since it must choose between

exploring unknown actions to gain more information versus exploiting the information

already collected [13].

In the scenario of the stated problem, the agent is the cell site and it observes the

environment (or network) through the measurements it collects. In RL, an agent acts on the

environment and learns based on what it observes. If the agent performs an action at time t, it causes the environment to go into state $s_{t+1}$ (see Fig. 6). RL requires mapping that state to a reward $r_{t+1}$ and by observing that reward, the agent knows if it has performed a good

or a bad action. The agent’s behavior is a sequence of observations, actions and rewards for

those actions. Over time, the agent learns to perform the right action, given a particular


state [6]. The agent’s goal is to maximize the total amount of reward it receives over the

long run [14]. The agent-environment interaction, described above, is shown in Fig. 6 [13].

3.2 Application of RL for Solving the Current Problem

For the case of self-optimization of antenna tilt, the learner is the instance of the algorithm, whether it is distributed in the BSs or runs in a centralized unit in the network. The environment consists of the UEs, the tilt angle of the BS antenna, other BSs and the conditions of the radio link between the BS and the UE.

The context of the developed RL solution is based on what data is available at each BS: the relative distance d to each of the served UEs (the delay time for the signal propagation to reach the UE is known); the power of the received UE signal $P_{R(k,b)}$; the interference from other cells $P_{I(k)}$; the weighting factors $w_k$ for each UE.

3.3 Architecture of the Solution

The first generations of mobile communications systems, including GSM (second gener-

ation, 2G) and UMTS (third generation, 3G), did not implement communication between

BSs. However, in LTE such communication was implemented with the X2 interface

between BSs (evolved NodeBs). It is used to facilitate the signaling between BSs and to

enhance system performance. Hence, it is foreseen that such communication will also be

implemented in future mobile systems.

Therefore, for the presented solution, a distributed architecture with coordination and

communication between the BSs is chosen. The algorithm is executed in each BS. The BSs

communicate with each other and coordinate their actions. Nevertheless, because the

learning algorithm is independent of the model, the idea can be applied to older generations

of mobile networks as well, but with a centralized architecture.

In most practical mobile network deployments, when a centralized Operations,

Administration and Management (OAM) system is implemented, it could be used to

provide high-level goals for the trade-off between capacity and coverage to each BS and to

coordinate actions between different BSs. In addition, such a centralized element as the

OAM system enhances the scalability of the solution, as the complexity of distributed

decision-making grows with the number of elements in the system.

The distributed approach is inherently redundant. The mutual coordination between the BSs and between the OAM system and each BS avoids concurrency issues in the distributed system, such as oscillations, race conditions and deadlocks. In practice, the solution is considered to be responsive to changes in the environment [1].

Fig. 6 The reinforcement learning framework: agent-environment interaction [14]


3.4 Representation as a Markov Decision Process: States and Actions

The idea and working principles of the developed method can be represented by a Markov decision process (MDP). States are mapped to the current antenna tilt angle setting and the transition between different states means that the tilt angle has changed. The state-action diagram can be generalized, as shown in Fig. 7. The states are $s_l \in S$. If the action to keep state $s_l$ active is denoted with $m_l$, then the possible actions are $m_l \in A(s_l)$. Transitions between any two states are possible, therefore the state-action diagram is fully interconnected. The mapping of the states to the antenna tilt angles can for example be the following: $s_0$ corresponding to $\theta_b = \theta_{b,min}$, $s_1$ to $\theta_b = 1^\circ$, $s_{(S-1)}$ to $\theta_b = \theta_{b,max}$. In such a case, the step for changing the tilt is $\Delta\theta_b = 1^\circ$. From a practical point of view, most of the antennas currently deployed in mobile networks enable tilt adjustment with such a step. A model with six states, a step of $\Delta\theta_b = 2^\circ$ and possible states $S = \{0^\circ, 2^\circ, 4^\circ, 6^\circ, 8^\circ, 10^\circ\}$ is implemented in the simulation tests, results from which are presented in the sections below.
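The state mapping described above can be written down directly. The following Python sketch is an illustration with hypothetical function names; it assumes integer tilt limits and an integer step in degrees.

```python
def tilt_states(theta_min=0, theta_max=10, step=2):
    """Return the tilt angle (in degrees) mapped to each state index."""
    n_states = (theta_max - theta_min) // step + 1
    return [theta_min + i * step for i in range(n_states)]


def possible_actions(state_index, n_states):
    """Fully interconnected diagram: any state can be selected from any state.

    Selecting the current state index corresponds to keeping the tilt angle.
    """
    return list(range(n_states))


if __name__ == "__main__":
    states = tilt_states()
    print(states)
    print(possible_actions(3, len(states)))
```

With the default arguments the six simulated states are obtained; `tilt_states(step=1)` gives the eleven states of the 1° example.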

3.5 Reward Function and Matrix

For each action $m_l$ it takes, the agent receives a reward $R_l$. The reward is larger if more users have their demand satisfied, which means that the overall capacity and coverage have been enhanced. The reward function can be expressed as

$$R_l = \mu \cdot S + \eta \cdot U, \tag{21}$$

where S is the sum data rate for all users of the cell, normalized to the cell capacity that can be achieved with the allocated frequency resources in an interference-free (noise-limited) environment. U is the number of satisfied users in the cell, normalized to the potential total number of served users inside the cell. For this purpose, the algorithm needs information about the user distribution in the network, which is obtained from the relative distance d to each of the users, based on the delay time for the signal propagation to reach each of the users.

The coefficients $\mu$ and $\eta$ set the priority in the trade-off between capacity and coverage. For example, if $\mu > \eta$, the optimization will aim to achieve higher throughput for fewer users in the network. Similarly, for $\mu < \eta$, more users will be covered, but the average throughput will be lower.
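A minimal Python sketch of the reward (21) is given below. It assumes that a user counts as satisfied when its rate reaches at least the demanded rate, as stated in the problem statement; the default coefficient values and all names are hypothetical.

```python
def reward(rates, demands, cell_capacity, n_potential_users, mu=0.5, eta=0.5):
    """Eq. (21): R_l = mu * S + eta * U.

    S is the sum rate normalized to the interference-free cell capacity,
    U is the share of satisfied users among the potential users of the cell.
    """
    s = sum(rates) / cell_capacity
    u = sum(1 for r, d in zip(rates, demands) if r >= d) / n_potential_users
    return mu * s + eta * u


if __name__ == "__main__":
    # Two made-up outcomes for the same cell: high sum rate versus full coverage.
    high_rate = dict(rates=[3.0, 0.0], demands=[1.0, 1.0])
    full_cover = dict(rates=[1.0, 1.0], demands=[1.0, 1.0])
    for mu, eta in ((0.8, 0.2), (0.2, 0.8)):
        print(mu, eta,
              reward(cell_capacity=4.0, n_potential_users=2, mu=mu, eta=eta, **high_rate),
              reward(cell_capacity=4.0, n_potential_users=2, mu=mu, eta=eta, **full_cover))
```

In this toy comparison the outcome with the higher sum rate wins for $\mu > \eta$ and the outcome with more satisfied users wins for $\mu < \eta$, in line with the trade-off described above.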

Fig. 7 Generalized state-action diagram. States are $s_l \in S$, possible actions are $m_l \in A(s_l)$. The diagram shows the states $s_0$, $s_1$, ..., $s_{(S-2)}$, $s_{(S-1)}$, each reachable from every state through the actions $m_0$, $m_1$, ..., $m_{(S-2)}$, $m_{(S-1)}$.


Based on the rewards for each action, a Reward Matrix (RM) is constructed. It maps the reward $R_l$ received for being in state $s_l$ and taking action $m_l$. An exemplary generalized RM is shown in Table 1 and in (22).

$$\mathbf{R} = \begin{bmatrix} R_0 & R_1 & \cdots & R_{S-2} & R_{S-1} \\ R_0 & R_1 & \cdots & R_{S-2} & R_{S-1} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ R_0 & R_1 & \cdots & R_{S-2} & R_{S-1} \\ R_0 & R_1 & \cdots & R_{S-2} & R_{S-1} \end{bmatrix}. \tag{22}$$

3.6 Flowchart of the Algorithm

The basic idea and principles of the developed method for CCO, utilizing the BS electrical antenna tilt, are illustrated with a flowchart of the algorithm, a simplified version of which is shown in Fig. 8.

The described algorithm is for the b-th BS antenna or b-th cell. In the initial phase the optimization starts with exploring all of the possible tilt angles $s_l$ from the state space S. The next tilt angle to explore can be selected either randomly or with a prediction of the optimal state. During the simulation tests, both methods were implemented. For simplicity, the final solution presented here uses random state selection. Although the convergence time of the two methods is almost the same, state selection based on prediction is generally better suited for practical implementation in real mobile networks.

The time interval for exploration of each tilt angle setting is in the order of hours. Throughout this time, statistical data is collected related to the network performance and the data rate received by each user in the cell. Based on this data, the respective reward $R_l$ is calculated using (21). A reward matrix R is constructed, in which the values of the received rewards are filled in. Next, the largest reward in R is found. The corresponding tilt angle is the optimal one for the current user distribution and radio propagation environment.

After all of the states have been explored, the exploitation phase of the algorithm follows, in which the optimal tilt angle is kept until the next iteration of the algorithm. During this time, for example 24 h, statistics are collected and the reward matrix is periodically updated.

In the next time step, the current network performance is compared with the stored performance data and the tilt angle may be predictively tuned to optimize the capacity and coverage for the current user distribution and radio propagation environment.

A detailed flowchart of the algorithm is illustrated in Fig. 9a–c. It represents the work of the network as a whole and adds more steps to the logic described above. After the initialization of the algorithm, a check for available statistical data is made. If the collected performance data is sufficient, the algorithm enters directly into the exploitation phase. If the data is not enough, the exploration phase is started.

Table 1 Generalized reward matrix

| State \ Action | $m_0$ | $m_1$ | ... | $m_{S-2}$ | $m_{S-1}$ |
|---|---|---|---|---|---|
| $s_0$ | $R_0$ | $R_1$ | ... | $R_{S-2}$ | $R_{S-1}$ |
| $s_1$ | $R_0$ | $R_1$ | ... | $R_{S-2}$ | $R_{S-1}$ |
| ... | ... | ... | ... | ... | ... |
| $s_{S-2}$ | $R_0$ | $R_1$ | ... | $R_{S-2}$ | $R_{S-1}$ |
| $s_{S-1}$ | $R_0$ | $R_1$ | ... | $R_{S-2}$ | $R_{S-1}$ |

In this phase all the B BS antennas are optimized, but not simultaneously. If the number of BSs in the network is large, a suitable approach is to do the optimization on a per-cluster basis. This means that cells are optimized in clusters, one cell at a time. A cluster is a group of cells formed by a central cell and all of its first-tier neighbors. This way, the interference from other cells and, respectively, the received SINR are kept constant. The different clusters are optimized in parallel, so that the convergence time is relatively low. It is assumed that the cells other than those in the current cluster are located too far away to cause significant interference and therefore they can be optimized separately. This assumption is valid in general for urban macrocells.

After the exploration phase, the algorithm enters the exploitation phase. If no optimization is needed or there is not enough statistical data collected, the algorithm is paused until the next time step. If optimization is needed and a sufficient amount of statistical data has been obtained, the tilt angle of each of the BS antennas is optimized on a per-cluster basis using a prediction of the optimal angle.

The reward $R_l$ is computed and saved in the reward matrix R. Statistical data is collected until the next time step, after which the algorithm enters the next iteration and continues to operate in a similar manner.

If a problem is detected at any step of the algorithm, it is reported to the central OAM system and the execution of the algorithm is interrupted.
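The exploration logic described in this subsection can be sketched as follows, assuming random state selection and one cell optimized at a time within a cluster. The callback `measure_reward` stands in for the hours-long measurement of (21) in a live network; it and all other names are hypothetical, and the OAM reporting and data-sufficiency checks of the detailed flowchart are omitted.

```python
import random


def explore(states, measure_reward, rng):
    """Visit every tilt state once, in random order; return state -> reward."""
    order = list(states)
    rng.shuffle(order)
    return {state: measure_reward(state) for state in order}


def best_state(rewards):
    """Return the state with the largest recorded reward."""
    return max(rewards, key=rewards.get)


def optimise_cluster(cells, states, measure_reward, rng=None):
    """Optimize the cells of one cluster sequentially.

    measure_reward(cell, tilt, chosen) returns the reward observed for `cell`
    at `tilt`, given the tilts already chosen for previously optimized cells.
    """
    rng = rng or random.Random()
    chosen = {}
    for cell in cells:
        rewards = explore(states,
                          lambda tilt: measure_reward(cell, tilt, chosen), rng)
        chosen[cell] = best_state(rewards)  # exploit the learned optimum
    return chosen


if __name__ == "__main__":
    # Made-up reward: each cell has one preferred tilt angle.
    preferred = {"cell 1": 2, "cell 2": 10, "cell 3": 2}
    toy = lambda cell, tilt, chosen: -abs(tilt - preferred[cell])
    print(optimise_cluster(list(preferred), [0, 2, 4, 6, 8, 10], toy,
                           random.Random(1)))
```

Because only one cell changes its tilt at a time, the rewards measured for that cell are not disturbed by simultaneous changes in its neighbors, which mirrors the per-cluster argument above.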

Fig. 8 Flowchart of the simplified algorithm for capacity and coverage optimization, utilizing the base station electrical antenna tilt. Steps shown: (1) Start; (2) Explore all states of the environment (possible antenna tilt angles for each base station); (3) Find the optimum tilt angle for each base station for the current environment; (4) Exploit the learned optimum tilt angles depending on the environment. Predict and adapt to the changes. Continue learning.


4 Performance Evaluation and Results

In this section, we provide, describe and evaluate the results of simulation tests of the

proposed solution. Based on the algorithm presented in Sects. 2 and 3, a simulation model

was developed in the MATLAB computing environment, version 2013b.

4.1 Simulation Parameters and Environment

The antenna parameters, which are used for implementing the solution, are shown in

Table 2. The main simulation parameters are listed in Table 3.

Based on the parameters in Table 2, the plots of the vertical pattern $A_V(\theta, \theta_b)$ for $\theta_b = 0^\circ$, $\theta \in [-180^\circ, 180^\circ]$ and of the horizontal pattern $A_H(\varphi)$ for $\varphi \in [-180^\circ, 180^\circ]$ in polar coordinates are shown in Fig. 10.

Fig. 9 Flowchart of the algorithm for CCO, utilizing the BS electrical antenna tilt, panel (a). Box labels as extracted (layout and arrows are not preserved): Start; Set initial antenna tilt angle for each base station antenna; Enough statistical data collected? (yes/no); Initialize iteration index (current BS antenna number), b := 1; Explore all the electrical tilt angles for the current base station (explore the states $s_l$ of the environment); Compute the reward $R_l$; Save the reward $R_l$ in the reward matrix and save the information for the current user distribution and network environment; Find the optimum tilt angle for the base station antenna b and the current network environment; Exploit the learned optimum tilt angle; b = B? (yes/no); Optimize the next BS antenna, b := b + 1; Problem detected? (yes/no); Send report to OAM system; End. Connectors 1, 2 and 3 link the panels.


4.2 Simulation Results

In this subsection, the results from simulations of the proposed solutions are described and

evaluated.

The first stage of implementing and testing the model consists of modelling a simple cell site with three sectors (three cells) and a simplified channel. The goal is to show the basic operation of the algorithm. Each of the sectors is covered by one antenna. The three antennas are directed at 0°, 120° and 240°, respectively. At 0°, the main beam of the antenna pattern points towards the X axis. The angle increases with a counterclockwise rotation. The initial electrical tilt angle for each of the BS antennas is set to $\theta_b = 6^\circ$ for $b \in \{1, 2, 3\}$. In practice, this is a common value for urban BS deployments.

Fig. 9 continued, panel (b). Box labels as extracted (layout and arrows are not preserved): On the next time step, compute the network performance using the reward function; Optimization needed? (yes/no); Enough statistical data collected? (yes/no); Wait for one time step; Problem detected? (yes/no, two instances); Send report to OAM system; End. Connectors 2 and 3 link the panels.

In the beginning, for simplicity the simulation tests are conducted with three users distributed within the network, assuming neither shadowing nor fast fading. The radio propagation characteristics are modelled for free-space transmission, meaning that the path loss exponent in (13) is set to $\alpha = 2$. No inter-cell interference is considered. The user locations and the network are assumed constant and static in time. The dimensions of the simulated network are 10 km by 10 km. The BS antennas are placed at the coordinate origin.

The power of the signal received by a user is computed for the whole network with a geometrical step of 10 m in both the X and Y directions. The samples are plotted on a map of the received power $P_{R(k,b)}$ (1). The received power map before initializing the optimization algorithm is shown in Fig. 11. The user locations are represented with asterisks (*) and are numbered. The cells are denoted by their indexes and their approximate borders are illustrated with white dashed lines.

Fig. 9 continued, panel (c). Box labels as extracted (layout and arrows are not preserved): Initialize iteration index (current BS antenna number), b := 1; Predict the optimal tilt angle for the current network environment and user distribution; Compute the reward $R_l$; Update the reward matrix; b = B? (yes/no); Optimize the next BS antenna, b := b + 1. Connectors 2 and 3 link the panels.

Table 2 Antenna simulation parameters

| Parameter | Symbol | Modeling | Value | Dimension |
|---|---|---|---|---|
| Horizontal antenna radiation pattern | $A_H(\varphi)$ | According to (7), [10] | – | dBi |
| Horizontal HPBW | $\varphi_{3dB}$ | According to [10] | 70 | ° |
| Maximum horizontal attenuation | $A_m$ | | 25 | dBi |
| Vertical antenna radiation pattern | $A_V(\theta, \theta_b)$ | According to (8), [10] | – | dBi |
| Vertical HPBW | $\theta_{3dB}$ | According to [10] | 10 | ° |
| Vertical side lobe attenuation | $SLA_v$ | | 20 | dBi |
| Minimum electrical tilt angle | $\theta_{b,min}$ | According to [4] | 0 | ° |
| Maximum electrical tilt angle | $\theta_{b,max}$ | | 10 | ° |
| Step of electrical tilt adjustment | $\Delta\theta_b$ | – | 2 | ° |
| Combined 3D antenna pattern | $A(\varphi, \theta, \theta_b)$ | According to (9), [10] | – | dBi |
| Maximum antenna gain | $G_{T,max}$ | – | 14 | dBi |
| Transceiver power for each BS antenna | $P_{T,dBm}$ | According to [10] | 46 | dBm |
| | $P_T$ | | ≈ 40 | W |
| BS antenna height | $H_b$ | | 32 | m |
| UE antenna gain | $G_{R(k,b)}$ | According to [4] | 0 | dBi |
| UE antenna height | $H_k$ | According to [10] | 1.5 | m |

Table 3 Main simulation parameters

| Parameter | Symbol | Modeling | Value | Dimension |
|---|---|---|---|---|
| Carrier frequency | $f$ | – | 2600 | MHz |
| Frequency bandwidth | $\Delta f$ | According to [10] | 10 | MHz |
| Number of subcarriers | $N_{sc}$ | – | 5000 | – |
| Number of subcarriers allocated to user k | $N_{sc}(k)$ | – | 1 | – |
| Frequency reuse factor | FR | According to [4] | 1 | – |
| Path loss exponent | $\alpha$ | According to [8] | 4.5 | – |
| Shadow fading standard deviation | $\sigma_{\psi_{dB}}$ | According to [4] | 8 | dB |
| SINR threshold for service availability | $\gamma_{min}$ | – | 15 | dB |
| Maximal working temperature | $T_{max}$ | – | 25 | °C |
| Call duration | $\Delta t_{call}$ | – | 60 | s |

The received power map results after completing the exploration phase of the algorithm are presented in Fig. 12. From these results it can be concluded that cells 1 and 3 have tilted their antennas upwards. The reason for this is the current user distribution. As we can observe from Figs. 11 and 12, user 1 is located at the origin, user 2 is near the border between cells 1 and 3 and user 3 is near the edge of cell 3. Because user 1 is very close to the antennas, it receives a signal with relatively high power. User 2, being near the cell border, does not have such good reception before the optimization. Hence, cell 1 is uptilted to provide higher received power to user 2. The case of user 3 is similar and therefore cell 3 is also uptilted from 6° to 2°. Cell 2 is serving no users. Hence, it is downtilted to $\theta_{b,max} = 10^\circ$. This downtilting policy can be modified, but is chosen in order to reduce the interference to adjacent cells. The states through which the antenna tilt angle changes for cells 1 and 3 are illustrated in Fig. 13. The mapping between states and tilt angles is

$$s_1 = 0^\circ;\ s_2 = 2^\circ;\ s_3 = 4^\circ;\ s_4 = 6^\circ;\ s_5 = 8^\circ;\ s_6 = 10^\circ. \tag{23}$$

This number of states and the relatively coarse tilt adjustment step $\Delta\theta_b$ are chosen in order to reduce the complexity of the simulations and to show the functionality and effectiveness of the method. In this first stage of development, the state to be explored is selected based on a prediction of the possible rewards.

Fig. 10 (a) Antenna vertical pattern $A_V(\theta, \theta_b)$ for $\theta_b = 0^\circ$, $\theta \in [-180^\circ, 180^\circ]$ in polar coordinates; (b) antenna horizontal pattern $A_H(\varphi)$ for $\varphi \in [-180^\circ, 180^\circ]$ in polar coordinates

Fig. 11 Simple network scenario: received power map before initializing the optimization algorithm. Panel title: Received power map; scale label: Received power, [dBm]; cells 1 to 3 are labelled. The initial downtilt angles for the cells are: 6°, 6°, 6°.

Fig. 12 Simple network scenario: received power map after completing the optimization algorithm. Panel title: Received power map; scale label: Received power, [dBm]; cells 1 to 3 are labelled. The optimal downtilt angles for the cells are: 2°, 10°, 2°.

Fig. 13 States during the exploration phase for cells 1 and 3

The algorithm consecutively explores all possible states for each of the cells. The initial state of each antenna is $s_4$. Starting with cell 1, an empty reward matrix is created. The first state selected is $s_1$, therefore the antenna electrical tilt angle is set to 0°. The received reward is computed and the RM is updated. Then the algorithm explores the state most distant from the already explored one, which means $s_6$. Hence, it downtilts the antenna to 10°. Because the reward received in this state is smaller than the one received for the previous action, the previous state, $s_1$, is marked as near-optimal. The closest unexplored state to the near-optimal state is selected next, i.e. $s_2$. The received reward is so far the highest, so $s_2$ is now marked as near-optimal.

In a similar manner, the unexplored states are selected in the next iterations. After all tilt angles have been explored and the reward matrix has been constructed, the action with the largest reward is reproduced. Hence, the optimal tilt angle is set. With this action, the exploration phase completes. It is executed again for the next cells in the cluster, in this case three times in total. Because cell 2 serves no users, it is directly downtilted to 10°. A similar simulation test was performed for the same network scenario with more randomly distributed users in the network and with similar results.
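The prediction-based state selection walked through above (first state, then the most distant state, then the unexplored state closest to the current near-optimal one) can be sketched in Python as follows. State indices are zero-based here, so index 0 corresponds to $s_1$. The reward values in the example are made up for illustration, and the tie-breaking rule (lowest index wins) is an assumption that the text does not specify.

```python
def exploration_order(rewards, first=0):
    """Return (visit order, index of the best state).

    rewards[i] stands in for the reward that would be measured in state i.
    """
    n = len(rewards)
    explored = {first: rewards[first]}
    order = [first]
    # Second visit: the state most distant from the first explored one.
    farthest = max(range(n), key=lambda s: abs(s - first))
    if farthest not in explored:
        explored[farthest] = rewards[farthest]
        order.append(farthest)
    # Afterwards: the unexplored state closest to the near-optimal state.
    while len(order) < n:
        near_optimal = max(explored, key=explored.get)
        nxt = min((s for s in range(n) if s not in explored),
                  key=lambda s: abs(s - near_optimal))
        explored[nxt] = rewards[nxt]
        order.append(nxt)
    return order, max(explored, key=explored.get)


if __name__ == "__main__":
    # Made-up rewards for the tilt angles 0, 2, 4, 6, 8 and 10 degrees.
    print(exploration_order([5, 9, 7, 4, 2, 1]))
```

For these illustrative rewards the visit order starts with indices 0, 5 and 1, i.e. $s_1$, $s_6$ and $s_2$ in the notation of (23), and the state with the largest reward is index 1, i.e. $s_2$ (2°).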

As a next step, shadow fading and radio propagation characteristics for an urban environment were implemented in the model and inter-cell interference was considered. The simulation results after completing the exploration phase of the algorithm for a network with three cells and three users, whose locations are assumed constant and static in time, are shown in Fig. 14. The dimensions of the simulated network are 1 km by 1 km and the computational step is 1 m. This reduction of the size of the cells is due to the higher influence of path loss and fading, as compared with the previous examples. User locations are represented with asterisks (*). The cells are denoted by numbers and their borders are illustrated with dashed lines. The data rate for each of the three users is shown in Table 4. As can be seen from the results, before the optimization user 2 (located near the border between cells 1 and 2) receives low SINR and therefore has no service. After tilt optimization, user 2 receives sufficient SINR and is covered. The sum rate for the network is around 32.2% higher after the optimization. With a scenario consisting of more cells and users, a similar improvement is achieved.

Fig. 14 Received power map after optimization of 3 cells and 3 randomly distributed users. Panel title: Received power map: optimal antenna angles; scale label: Received power, [dBm]. Optimal tilt angles: cell 1: 8°; cell 2: 10°; cell 3: 6°.

Table 4 Simulation results for a network with 3 cells and 3 randomly distributed users

| Data rate [bit/s] | User 1 | User 2 | User 3 | Sum rate for the network [bit/s] |
|---|---|---|---|---|
| Before tilt optimization | 12,436.59 | 0.00 | 6965.57 | 19,402.16 |
| After optimization | 13,446.30 | 5049.23 | 7159.97 | 25,655.50 |
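The sum rates and the relative improvement quoted for the three-user scenario can be checked directly from the per-user values in Table 4. The short Python snippet below only redoes this arithmetic; the variable names are arbitrary.

```python
# Per-user data rates in bit/s, taken from Table 4.
before = [12436.59, 0.00, 6965.57]
after = [13446.30, 5049.23, 7159.97]

sum_before = sum(before)
sum_after = sum(after)
gain_percent = 100.0 * (sum_after - sum_before) / sum_before

print(round(sum_before, 2), round(sum_after, 2), round(gain_percent, 1))
```

Most of the gain comes from user 2, which moves from no service to about 5 kbit/s, while the rates of users 1 and 3 increase only moderately.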

Next, simulations for a network consisting of 15 cells and 30 randomly distributed users were conducted. The dimensions of the network are 5 km by 5 km and the computational step is 5 m. The received power map after optimization and the approximate cell borders are illustrated in Fig. 15. In order to minimize the interference between adjacent cells and to maximize the received SINR, most of the BS antennas are additionally tilted down to 10° from the initial 6°.

It could be shown that a practical deployment of the solution would not be too complex

and will require rather insignificant changes in the network. A preliminary calculation

shows, that for a deployment in a small exemplary mobile network consisting of 5000 base

stations, the invested capital expenditures (CAPEX) will be returned as additional revenue

in less than three years of network operation, maintaining optimal network operation.

5 Conclusions and Future Work

In this paper, we propose a novel approach for self-optimization in mobile networks utilizing the BS antenna tilt. To illustrate the approach, we developed an RL-based algorithm and simulated its application in an urban scenario. Based on the simulation parameters listed in Tables 2 and 3, the simulation results show an improvement of around 30% in the sum data rate of the network after applying antenna tilt angle optimization. The proposed machine learning algorithm is model-independent and can be applied to current and next generations of mobile networks, contributing to the self-organizing functions of the network for energy saving, inter-cell interference coordination, load balancing, etc. As a result of the application of the algorithm, a synergistic effect for the network OAM and an improvement of network performance are expected, together with reduced operational costs, energy consumption and needed resources.

Fig. 15 Received power map after optimization of 15 cells and 30 users, distributed randomly in the network. Approximate cell borders. Scale label: Received power, [dBm]. Optimal tilt angles: cell 1: 10°; cell 2: 10°; cell 3: 10°; cell 4: 10°; cell 5: 10°; cell 6: 10°; cell 7: 10°; cell 8: 10°; cell 9: 2°; cell 10: 10°; cell 11: 10°; cell 12: 6°; cell 13: 10°; cell 14: 6°; cell 15: 10°.

It must be mentioned that in this work only the exploration phase of the developed

algorithm is described. As the system operates and collects statistics, it learns based on

experience. This knowledge consists of previous user distributions and demands. Hence, it

can be utilized to improve the prediction of future geographical distributions of user

terminals. When such a prediction is implemented in the exploitation phase, the tilt angle

could be adjusted a priori to avoid formation of coverage holes or weak signal reception

with insufficient data rate.
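The distinction between exploration and exploitation can be illustrated with a textbook epsilon-greedy rule. The sketch below is a generic reinforcement learning pattern and not the algorithm developed in this work: the function names, the candidate tilt angles and the reward values are all hypothetical.

```python
import random


def select_tilt(q_values, epsilon, rng):
    """Pick a tilt angle: explore with probability epsilon, otherwise exploit.

    q_values maps a candidate tilt angle (degrees) to its estimated reward,
    e.g. the observed sum data rate (hypothetical reward definition).
    """
    if rng.random() < epsilon:
        return rng.choice(sorted(q_values))      # exploration: random candidate
    return max(q_values, key=q_values.get)       # exploitation: best estimate so far


def update_estimate(q_values, counts, tilt, reward):
    """Update the running mean of the reward observed for one tilt angle."""
    counts[tilt] += 1
    q_values[tilt] += (reward - q_values[tilt]) / counts[tilt]
```

With `epsilon` close to 1 the agent mostly explores; lowering it over time shifts the behaviour toward exploiting the collected statistics.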

The modelled antenna radiation pattern is simplified and follows a technical specification by 3GPP for the LTE technology. For the purpose of evaluating the results against a practically deployed mobile network, the pattern of a real antenna can be used. An example of an antenna widely installed in mobile networks today is given in [11].
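For orientation, the simplified parabolic pattern commonly quoted from 3GPP TR 36.814 [10] can be sketched as below. The default parameter values (70° and 10° half-power beamwidths, 25 dB and 20 dB attenuation limits) are assumptions taken from that report as commonly cited and should be checked against Tables 2 and 3. The function names are hypothetical, and the vertical angle is assumed to be measured downward from the horizontal plane.

```python
def horizontal_gain_db(phi_deg, phi_3db=70.0, a_m=25.0):
    """Horizontal attenuation relative to boresight, in dB (parabolic model)."""
    return -min(12.0 * (phi_deg / phi_3db) ** 2, a_m)


def vertical_gain_db(theta_deg, tilt_deg, theta_3db=10.0, sla_v=20.0):
    """Vertical attenuation for a given electrical tilt, in dB (parabolic model)."""
    return -min(12.0 * ((theta_deg - tilt_deg) / theta_3db) ** 2, sla_v)


def combined_gain_db(phi_deg, theta_deg, tilt_deg, a_m=25.0):
    """Combine both cuts and cap the total attenuation at a_m."""
    total = horizontal_gain_db(phi_deg, a_m=a_m) + vertical_gain_db(theta_deg, tilt_deg)
    return -min(-total, a_m)
```

A measured pattern of a real antenna would replace these closed-form functions with a lookup over the tabulated gain values.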

For simplification, the number of subcarriers allocated to user k, $N_{sc}(k)$, is fixed, meaning that all user terminals use the same frequency bandwidth. The simulation model can be further extended by treating the number of allocated subcarriers as a variable that depends on the user data rate demand. In addition, the problem of capacity optimization with different frequency reuse schemes can be studied by evaluating the cell-edge throughput performance. This will tailor the solution to mobile communication systems that use frequency-division multiple access (FDMA), such as GSM.
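One way to make the number of subcarriers depend on the demand is sketched below. It assumes a Shannon-type rate per subcarrier and the LTE subcarrier spacing of 15 kHz; both the rate expression and the function names are assumptions for illustration and may differ from the model used in the simulations.

```python
import math

SUBCARRIER_SPACING_HZ = 15_000.0  # LTE subcarrier spacing (assumed here)


def user_rate_bps(n_sc, sinr_linear, spacing_hz=SUBCARRIER_SPACING_HZ):
    """Shannon-type rate of a user holding n_sc subcarriers at a given linear SINR."""
    return n_sc * spacing_hz * math.log2(1.0 + sinr_linear)


def subcarriers_for_demand(demand_bps, sinr_linear, spacing_hz=SUBCARRIER_SPACING_HZ):
    """Smallest number of subcarriers whose total rate meets the demand."""
    per_subcarrier = spacing_hz * math.log2(1.0 + sinr_linear)
    if per_subcarrier <= 0.0:
        raise ValueError("SINR too low: no finite allocation meets the demand")
    return math.ceil(demand_bps / per_subcarrier)
```

For example, at a linear SINR of 3 each subcarrier carries 30 kbit/s under this assumption, so a demand of 100 kbit/s would need four subcarriers.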

Simulation tests in a network, especially in a realistic urban environment incorporating user mobility and prediction, are also expected to confirm the effectiveness of the method. The application of the method to the simulation of specific network technologies, including detailed models of effects such as fast fading, will be a subject of future work.

References

1. Hamalainen, S., Sanneck, H., & Sartori, C. (2011). LTE self-organising networks (SON): Network management automation for operational efficiency. Wiley.

2. Cisco (2015). Visual networking index (VNI). http://www.cisco.com/c/en/us/solutions/service-provider/visual-networking-index-vni/index.html. Accessed September 2015.

3. Yilmaz, O., Hamalainen, S., & Hamalainen, J. (2009). Comparison of remote electrical and mechanical antenna downtilt performance for 3GPP LTE. Vehicular Technology Conference Fall (VTC 2009-Fall), 2009 IEEE 70th, pp. 1–5.

4. Yilmaz, O., Hamalainen, J., & Hamalainen, S. (2010). Self-optimization of remote electrical tilt. Personal Indoor and Mobile Radio Communications (PIMRC), 2010 IEEE 21st International Symposium on, pp. 1128–1132.

5. Partov, B., Leith, D. J., & Razavi, R. (2015). Utility fair optimization of antenna tilt angles in LTE networks. IEEE/ACM Transactions on Networking, 23(1), 175–185.

6. Thampi, A., Kaleshi, D., Randall, P., Featherstone, W., & Armour, S. (2012). A sparse sampling algorithm for self-optimisation of coverage in LTE networks. Wireless Communication Systems (ISWCS), 2012 International Symposium on, pp. 909–913.

7. Razavi, R., Klein, S., & Claussen, H. (2010). Self-optimization of capacity and coverage in LTE networks using a fuzzy reinforcement learning approach. Personal Indoor and Mobile Radio Communications (PIMRC), 2010 IEEE 21st International Symposium on, pp. 1865–1870.

8. Goldsmith, A. (2005). Wireless communications. Cambridge: Cambridge University Press.

9. Balanis, C. A. (2005). Antenna theory: Analysis and design, 3rd edn. Wiley.


10. 3GPP. (2014). TR 36.814, technical specification group radio access network (E-UTRA). Evolved Universal Terrestrial Radio Access (E-UTRA); Further advancements for E-UTRA physical layer aspects, ver. 1.7.0, Release 9. http://www.3gpp.org/ftp/Specs/archive/36_series/36.814/36814-170.zip. Accessed July 2015.

11. Kathrein (2015). Antenna type 742215 technical specification. https://www.kathrein.de/svg/download/9364238a.pdf. Accessed August 2015.

12. Gunnarsson, F., Johansson, M. N., Furuskar, A., Lundevall, M., Simonsson, A., Tidestav, C., & Blomgren, M. (2008). Downtilted base station antennas: a simulation model proposal and impact on HSPA and LTE performance. Vehicular Technology Conference, 2008. VTC 2008-Fall. IEEE 68th, pp. 1–5.

13. Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2012). Foundations of machine learning. The MIT Press.

14. Sutton, R., & Barto, A. (1998). Introduction to reinforcement learning, 1st edn. MIT Press.

Nikolay Dandanov graduated with distinction from the B.Sc. programme in Telecommunications at the Technical University of Sofia, Bulgaria, in 2016. He is currently pursuing his M.Sc. degree in Telecommunication Networks at the same university. He prepared his Bachelor's Thesis in the field of self-organization in mobile networks as an exchange student at the Technische Universität Darmstadt, Germany, in 2015. Nikolay has more than 4 years of research and industry experience in the field of telecommunications. From 2012 to 2015, he was a Network Operations Center engineer in a data center at Telepoint LTD. In 2014, he did an internship in the Technology Strategy Architecture Team at one of Bulgaria's mobile network operators, Mobiltel EAD. In November 2015, he joined the Teleinfrastructure R&D laboratory at the Technical University of Sofia as a Telecommunications Expert. His main fields of scientific interest and expertise are in resource and interference management in mobile networks and Internet of Things, as well as in access, core and multimedia networks.

Hussein Al-Shatri received the B.Sc. degree in electronic and communications engineering from Hadhramout University, Yemen, the M.Sc. degree in communications engineering from Munich University of Technology, Germany, and the Ph.D. degree in electrical engineering from the University of Rostock, Germany, in 2003, 2008, and 2014, respectively. Between 2009 and 2014, he was an assistant researcher with the Institute of Communications Engineering, University of Rostock. During that time, he was active in the topics of power allocation and interference alignment. Since August 2014, he has been a postdoctoral researcher with the Communications Engineering Laboratory, Technische Universität Darmstadt, Germany. His research interests include hierarchical signal processing, cloud radio access networks, distributed algorithms design, and user preferences analysis and integration in underlay wireless networks.


Anja Klein received the diploma and Dr.-Ing. (Ph.D.) degrees in electrical engineering from the University of Kaiserslautern, Germany, in 1991 and 1996, respectively. From 1991 to 1996, she was a member of the staff of the research group for RF communications at the University of Kaiserslautern. In 1996, she joined Siemens AG, Mobile Networks Division, Munich and Berlin. She was active in the standardization of third generation mobile radio in ETSI and in 3GPP, for instance leading the TDD group in RAN1 of 3GPP. She was vice president, heading a development department and a systems engineering department. In May 2004, she joined the Technische Universität Darmstadt, Germany, as full professor, heading the Communications Engineering Lab. Her main research interests are in mobile radio, including multi-antenna systems, radio resource management, interference management, relaying and multi-hop, cooperative communication, network planning, and cross-layer design. Dr. Klein has published over 290 refereed papers and has contributed to twelve books. She is inventor and co-inventor of more than 45 patents in the field of mobile radio. In 1999, she was inventor of the year of Siemens AG. Dr. Klein is a member of IEEE and of Verband Deutscher Elektrotechniker-Informationstechnische Gesellschaft (VDE-ITG).

Professor Vladimir Poulkov, Ph.D., received his M.Sc. and Ph.D. degrees at the Technical University of Sofia. He has more than 30 years of teaching, research and industrial experience in the field of telecommunications, starting in 1981 as an R&D engineer working for the telecommunication industry and developing his career to full professor at the Faculty of Telecommunications, Technical University of Sofia, Bulgaria. He has successfully managed and realized numerous industrial and engineering projects related to the development of the telecommunication transmission and access network infrastructure in Bulgaria, as well as many R&D and educational projects. His fields of scientific interest and expertise are related to interference suppression, resource management in next generation networks and IoT. He is the author of more than 100 scientific publications and leads B.Sc., M.Sc. and Ph.D. courses in the field of Information Transmission Theory and Access Networks. In the period 2007–2015 he was Dean of the Faculty of Telecommunications at the Technical University of Sofia. Currently he is head of the "Teleinfrastructure R&D" laboratory at the Technical University of Sofia. He is Chairman of the Bulgarian Cluster of Telecommunications, Senior IEEE Member and co-founder of the CONASENSE (Communication, Navigation, Sensing and Services) society.
