
PRECODING FOR MULTIUSER MIMO

SYSTEMS WITH MULTIPLE BASE STATIONS

by

Imad Azzam

A thesis submitted in conformity with the requirements for the degree of Master of Applied Science,

Graduate Department of Electrical and Computer Engineering in the University of Toronto.

Copyright © 2008 by Imad Azzam. All Rights Reserved.

Precoding for Multiuser MIMO Systems with

Multiple Base Stations

Master of Applied Science Thesis

Edward S. Rogers Sr. Department of Electrical and Computer Engineering

University of Toronto

by Imad Azzam

June 2008

Abstract

Future cellular networks are expected to support extremely high data rates and user capacities. This thesis investigates the downlink of a wireless cellular system that takes advantage of multiple antennas at base stations and mobile stations, frequency reuse across all cells, and cooperation among base stations. We identify asynchronous interference resulting from multi-cell communication as a key challenge, prove the existence of a downlink/uplink duality in that case, and present a linear precoding scheme that exploits this duality. Since this result is not directly extendable to orthogonal frequency division multiplexing (OFDM), we propose a ‘hybrid’ algorithm for two cooperating base stations, which combines linear and nonlinear precoding. This algorithm minimizes the sum mean squared error of the system and is extendable to OFDM. Finally, we consider the problem of user selection for multiuser precoding in OFDM-based systems. We extend an available single-cell user selection scheme to multiple cooperating cells.


To my parents, Halim and Amal,

and brothers, Raed and Abdullah


Acknowledgements

I would like to express sincere and deep gratitude to my advisor, Professor Raviraj Adve. His continuous support, advice, discussions, and insight are invaluable and highly appreciated. This work would not have been possible without him.

I would like to thank all the friends that I met in the Communications Group for their help and support throughout my two years in the group.

I would like to express special thanks to all the Lebanese friends I made at the

University of Toronto. In particular, an infinite amount of gratefulness goes to Rani

Daher, Nahi Abdul Ghani, Sari Onaissi, and Khaled Heloue. Their friendship made a

big difference in my life in Toronto. Rani’s daily support and encouragement through

advice and ‘cheering songs’ are unforgettable. Nahi’s novel sense of humor and social

networking guidelines made life smooth and enjoyable. Sari’s and Khaled’s brotherly

guidance helped me make better decisions and is greatly appreciated.

Many thanks go to my friends from back home. I could not have made it without

them. I hope that one day we will be reunited to celebrate our accomplishments.

No words can describe how grateful and indebted I am to my parents and brothers.

Their love and care are the motivation for all my achievements, and they will forever be.

I am deeply thankful to my aunt Ghada, who is more of an elder sister to me. Her

support and advice on many occasions were priceless.

I would like to acknowledge Bell Canada’s support through its Bell University Laboratories R&D program, which made my research possible.

Imad H. Azzam

June 2008


Contents

1 Introduction and Background
1.1 Motivation and Objective
1.2 Literature Review
1.2.1 Linear Precoding and Duality
1.2.2 Multiple Base Stations and Asynchronous Interference
1.2.3 Nonlinear Precoding
1.2.4 User Selection for Multiuser MIMO-OFDM Systems
1.3 Thesis Overview and Structure

2 Multiple Base Stations and Asynchronous Interference
2.1 System Models
2.1.1 Downlink System Model
2.1.2 Virtual Uplink System Model
2.2 Downlink/Uplink Duality
2.2.1 First Step: SINR Targets
2.2.2 Second Step: Equating MSEs
2.3 Single Base Station: Synchronous Interference
2.3.1 Review of the MIMO-SMSE Algorithm
2.3.2 Multiple Base Stations: Assuming Synchronous Interference
2.4 Minimizing the SMSE with Asynchronous Interference
2.5 Simulation Results
2.6 MIMO-OFDM and Multiple Base Stations
2.7 Summary


3 The Hybrid Algorithm
3.1 System Model
3.2 The Hybrid Algorithm
3.2.1 Interference Pre-subtraction
3.2.2 Whitening Interference from Edge Users at Intra-cell Users
3.2.3 The Hybrid Algorithm
3.2.4 Data Vector Estimation for THP Users
3.3 A Variation on the Hybrid Algorithm

3.4.1 Zero Power Cross Channels
3.4.2 Non-zero Power Cross Channels
3.5 Implementation Issues
3.6 Summary

4 User Selection in Multiuser MIMO-OFDM Systems
4.1 System Model and Problem Statement
4.1.1 System Model
4.1.2 Problem Statement
4.2 Original Algorithm and Proposed Modifications

4.4 MIMO-OFDM and the Hybrid Algorithm
4.5 User Selection for the Hybrid Algorithm
4.6 Summary

5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work

A Equating MSEs

B Derivations for Asynchronous Interference

Bibliography


List of Figures

2.1 Asynchronous interference in the downlink
2.2 Asynchronous interference in the uplink
2.3 Indices of interfering symbols of one user from two BSs
2.4 Linear precoding with and without asynchronous interference, with path loss
2.5 Linear precoding with and without asynchronous interference, no path loss
2.6 SMSE versus power of three users
3.1 System model for the hybrid algorithm
3.2 Performance of the hybrid algorithm without path loss (K1 = K2 = 1, Ke = 2)
3.3 Performance of the hybrid algorithm compared to ZPCC cases without path loss (K1 = K2 = 1, Ke = 2)
3.4 BER plot of the separate streams for ZPCC case 2
3.5 BER plot of the separate streams for hybrid algorithm and ZPCC case 3
3.6 Performance of the hybrid algorithm with path loss
3.7 Performance of the hybrid algorithm with inter-cell interference
3.8 Performance of the hybrid algorithm variation
3.9 Performance of the hybrid algorithm with complex transmission
4.1 Four different scenarios for user selection
4.2 Performance of hybrid algorithm for K1 = K2 = 4 and Ke = 8
4.3 User selection method for the hybrid algorithm


4.4 Two different user selection methods for the hybrid algorithm
4.5 Varying number of edge users for 1 intra-cell user per cell (1f1 refers to f edge users)
4.6 Varying number of edge users for 2 intra-cell users per cell (2f2 refers to f edge users)
5.1 Extension of hybrid algorithm to 3 BSs


Chapter 1

Introduction and Background

1.1 Motivation and Objective

Meeting the demands that are expected from future generation networks poses intriguing

challenges for today’s wireless system designers. After the widespread deployment and

commercialization of third generation wireless cellular networks, research is now mainly

focused on taking data rates and user capacity to even higher levels, paving the way for

the fourth generation (4G) of wireless communications. Two emerging technologies that

are potential candidates for 4G wireless networks are multiple-input, multiple-output

(MIMO) systems and transmission based on orthogonal frequency division multiplexing

(OFDM). The use of antenna arrays (multiple antennas) at the transmitters and/or

receivers in MIMO systems enables multiuser communication with multiple users over

the same bandwidth. The use of orthogonal sub-carriers in OFDM provides protection

against inter-symbol interference (ISI). Many techniques have been proposed for reaping the potential benefits of these two separate technologies [1–4].

Combining them potentially creates an even more capable system, which raises questions

about how to jointly optimize their functionality to further improve performance.

Another theme in communication networks that has been gaining more recognition

is cooperation. Cooperation can be generically defined as the sharing of resources to

achieve a common objective. In the context of wireless cellular networks, cooperation

can take place between the base stations (BSs), the users, and/or dedicated relays. The


interaction of two or more BSs necessitates the study of how neighboring cells affect each

other, especially in terms of inter-cell interference. Existing cellular systems avoid this

problem by deploying different frequencies in different cells. However, such an approach

has the drawback of reducing the system efficiency, since the available bandwidth is

divided into disjoint ranges and distributed among several neighboring cells. Accordingly,

the ultimate goal of system design becomes developing a scheme that mitigates the inter-

cell interference instead of avoiding it, which in turn enables the use of the entire available

bandwidth in every cell.

In reality, an effective wireless cellular system can bring together MIMO, OFDM, and cooperation, which provides a large number of degrees of freedom. While this provides great flexibility in design and the ability to tune the system to meet differing requirements, it also complicates the optimization of the system towards a required objective.

Consequently, our sponsors, Bell Mobility, through the Bell University Laboratories R&D

program, have asked us to explore various techniques targeted at the design of such a

system. Accordingly, our goal is to provide insight into the development of a practical

cooperative multiuser wireless cellular system that meets the demanding requirements of

4G communication networks.

We focus on a multiuser MIMO-OFDM system where cooperation is present in terms

of BS coordination and joint processing. More specifically, MIMO techniques are used to

multiplex data streams on the same bandwidth, OFDM is used to increase user capacity

and guard against ISI, and BS cooperation is used to provide users with better service

through joint processing and protect them further against multiuser interference (MUI)

and inter-cell interference. To enable communication over this system, BSs process the

data of users before transmission (precoding) and each user processes its received signal

to retrieve its own data (decoding). In this thesis, we address the precoding and decoding

problem for multiuser communications with coordinated transmissions from multiple BSs.

1.2 Literature Review

In what follows, we provide a brief survey of important works in four areas related to

this thesis. The survey starts with a review of prior work in linear precoding for MIMO


systems and the related concept of downlink/uplink duality. Next, we move to the area of

multiple base stations and discuss precoding in that scenario. The notion of asynchronous

interference is introduced and related work is presented. Then, nonlinear precoding is

briefly reviewed, with special focus on Tomlinson-Harashima Precoding (THP) that will

be used in this work. Finally, some works dealing with user selection for MIMO-OFDM

systems are presented. Note that basic mathematical models are not presented here but are deferred to the relevant sections for the convenience of the reader.

1.2.1 Linear Precoding and Duality

Our work focuses on the downlink of multiuser MIMO communication systems where

channel state information (CSI) is available at the transmitter. Using this CSI, system

performance can be maximized by pre-distorting the transmission to best match the

available CSI. In this thesis, we make extensive use of linear precoding [5–9] where the

signals to be transmitted are multiplied with a precoding matrix before transmission.

Similarly, the receiver multiplies the received signal with a decoding matrix to minimize

MUI. While the early works focused on minimizing the sum of the mean squared error

(SMSE) across all users’ signals [5–8], linear precoding to maximize sum data rate is also

possible [7, 9, 10].
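As a minimal sketch of the structure just described, the snippet below precodes two data symbols with a matrix U, passes them through a noiseless 2 × 2 real channel H, and decodes them with a matrix V. The channel values, the helper names, and the choice of a zero-forcing precoder (U equal to the inverse of H, V equal to the identity) are illustrative assumptions only; the precoders considered in this thesis are designed to minimize the SMSE, not by zero forcing.

```python
# Illustrative sketch only: zero forcing stands in for the SMSE-based design.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Invert a 2x2 matrix (assumed non-singular)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical channel: row i holds the gains seen by receive antenna i.
H = [[1.0, 0.5],
     [0.3, 1.0]]
U = inverse_2x2(H)            # precoding matrix applied at the transmitter
V = [[1.0, 0.0], [0.0, 1.0]]  # decoding matrix applied at the receiver

x = [[1.0], [-1.0]]           # one data symbol per stream
s = matmul(U, x)              # transmitted signal after precoding
y = matmul(H, s)              # received signal (noise omitted)
x_hat = matmul(V, y)          # decoded estimate of the data symbols
```

With noise present, zero forcing can amplify the transmit power needed on poorly conditioned channels, which is one reason MSE-based designs are of interest.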

Most of the work in precoding is based on a duality between the multiuser downlink

and a virtual multiuser uplink [6–8]. The duality states that, under the same sum power

constraint, the downlink and uplink both have the same achievable signal-to-interference-

plus-noise ratio (SINR) region [6]. In other words, if a certain set of SINR targets can be

achieved in the downlink for a given sum power constraint, then those targets can also be

achieved in the uplink with the same power constraint. This duality can be used to state that the downlink and uplink have the same achievable mean squared error (MSE) region under the same sum power constraint [7]. This was shown for the single receive

antenna case in [6, 7] and for the MIMO scenario in [8]. Therefore, this duality can be

thought of as a tool that provides us with two different perspectives of the same system.

If the solution of a certain problem can be found using one perspective, then this solution

can be transformed to suit the other perspective. In our case, as in [8], minimizing the


SMSE proves to be easier in the uplink. Consequently, the uplink solution is determined

and then transformed to the downlink solution.
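The SINR form of the duality can be checked numerically in the simplest setting mentioned above: a single BS, single-antenna users, and synchronous interference. The sketch below fixes a set of unit-norm beams, finds the downlink and uplink powers that meet the same SINR targets, and compares the two power totals. The matched-filter beams, the dimensions, the target values, and all function names are assumptions made for illustration; the asynchronous multi-BS case treated later is not modeled here.

```python
import random

def gain(h, u):
    """Return |h^H u|^2 for complex vectors h and u."""
    return abs(sum(hi.conjugate() * ui for hi, ui in zip(h, u))) ** 2

def solve_powers(G, targets, noise, uplink, iters=500):
    """Fixed-point iteration for the powers that meet the SINR targets.

    G[k][j] is the gain from beam j to user k. In the downlink, user k
    is interfered through G[k][j]; in the uplink the roles are swapped
    and the interference couples through G[j][k].
    """
    K = len(G)
    p = [0.0] * K
    for _ in range(iters):
        new_p = []
        for k in range(K):
            interference = sum(
                p[j] * (G[j][k] if uplink else G[k][j])
                for j in range(K) if j != k)
            new_p.append(targets[k] * (interference + noise) / G[k][k])
        p = new_p
    return p

random.seed(0)
M, K, noise = 4, 3, 1.0            # assumed antennas, users, noise power
H = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(M)]
     for _ in range(K)]            # H[k] is the channel vector of user k
# Matched-filter beams, normalized to unit norm (an illustrative choice).
U = []
for h in H:
    norm = sum(abs(v) ** 2 for v in h) ** 0.5
    U.append([v / norm for v in h])
G = [[gain(H[k], U[j]) for j in range(K)] for k in range(K)]
targets = [0.1] * K                # modest SINR targets, so they are feasible

p_down = solve_powers(G, targets, noise, uplink=False)
q_up = solve_powers(G, targets, noise, uplink=True)
print(sum(p_down), sum(q_up))      # the two totals agree up to rounding
```

The individual powers generally differ between the two links; only their sum is shared, which is why the duality is stated under a sum power constraint.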

Furthermore, this downlink/uplink duality suggests that the precoding and decoding

matrices can be obtained via receive processing only, i.e., using the Wiener filter, and are

the same whether considering the downlink or uplink [8]. The outstanding issue is then

power allocation across users, which is a convex optimization problem when minimizing

SMSE [8]. In this work, we generalize this duality for the MIMO scenario with multiple

transmitters and multiple receivers that are geographically distant, which gives rise to

asynchronous MUI, a notion to be investigated later in more detail. We also study

the implications of asynchronous interference for the convexity of the power allocation

problem.

It is worth elaborating further on the work presented in [8], as it forms the basis of the linear precoding found in our work. Based on the outcome of the downlink/uplink duality, two iterative algorithms that jointly optimize the precoding/decoding matrices and the power allocation to minimize the SMSE of a MIMO system are presented. One algorithm cycles between the downlink and uplink to derive the precoding/decoding matrices and power allocation. The other performs the optimization completely in the uplink to obtain the optimal decoding matrices and uplink power allocation and then transforms the solution to the downlink. This is made possible by a scheme proposed in [11] for deriving the uplink precoding matrices given per-user power constraints without temporarily considering the downlink, and generalized in [8] for a sum power constraint. Both algorithms have better performance than block diagonalization (BD), another MIMO linear precoding technique [8]. Moreover, they provide performance very similar to sequential quadratic programming (SQP), a computationally intensive technique that can be used to derive the precoding and decoding matrices directly in the downlink [5, 8, 12]. We make use of both algorithms at various points in this thesis.

1.2.2 Multiple Base Stations and Asynchronous Interference

More recently, researchers have begun investigating the notion of multiple BSs cooperating

to achieve more efficient use of bandwidth. The ultimate goal of such systems is a


frequency reuse factor of unity. In multi-BS, multiuser precoding, multiple BSs coordinate

their transmissions to a group of users (that may straddle a traditional cell boundary).

This cooperation can provide better system performance, especially when servicing cell-

edge users. The work in [13] is probably the first to discuss multiuser communications

with multiple BSs. A system with multiple transmitter-receiver pairs that interfere with

each other is considered. An iterative algorithm that attempts to minimize the transmit

power while meeting quality of service requirements at the receivers is presented and

shown to converge to a local minimum. In a recent study, Tolli et al. [14] investigate

linear precoding for multiple cooperating BSs. They propose an algorithm to jointly

design precoding and decoding matrices for a multiuser MIMO system in an attempt to

maximize the sum rate under per-BS power constraints. The authors treat all MUI as

synchronous and the resulting algorithm is similar to that of [8] with a single ‘super BS’

using as many transmit antennas as all the cooperating BSs collectively have. In [15],

Tamakai et al. study the achievable sum rates for a MIMO downlink system with multiple

BSs cooperating at three different levels. However, they assume that a variant of Time

Division Multiple Access (TDMA) is used in the best results they achieve. In [16],

Dahrouj et al. consider the downlink of a multiuser, multi-BS system and propose an

algorithm that jointly optimizes the beamformers used by the cooperating BSs in order

to minimize the total transmit power while satisfying the SINR constraints of the users.

When dealing with multi-BS environments, a crucial assumption often made is that both the desired and interfering signals arrive synchronously at each user [14–19]. However, because the propagation delays from geographically separated BSs differ from user to user, synchronous interference at every user is physically impossible [20]. In fact, it is this asynchronous interference that is the key challenge in developing transmission schemes for multi-BS scenarios. The work in [20] provides amendments to some existing algorithms accounting for this asynchronous interference. In this thesis, we show that the virtual uplink should be modeled carefully for duality to remain intact. Furthermore, the convexity of the power allocation problem is unclear, and it cannot be extended directly

from that of the synchronous interference case. Accordingly, we provide some amendments to extend the linear precoding algorithm presented in [8] and make it suitable for a multi-BS system with asynchronous interference¹.

¹It is worthwhile to note that the problem of asynchronous interference is also encountered when designing Code Division Multiple Access (CDMA) systems. In those systems, the asynchronous interference is resolved by assuming the presence of additional virtual users that also interfere with the desired signal and the system is designed to guard the desired user against those extra interferers [21]. A similar approach may be attempted with linear precoding; however, it may lead to a complicated system with very high dimensions. In this thesis, we focus on formulating relatively simple linear precoding, and hence we avoid adopting this method.

Several other works consider the problem of asynchronous interference and propose different methods for mitigating its effect, mainly by using the cyclic redundancy provided by OFDM. Thomas and Vook [22] design space-time filters such that the MSE between the head and tail of the OFDM symbol (originally equal without interference due to cyclic redundancy) is minimized. In another approach, again based on the cyclic redundancy, Yano and Taromaru [23] propose a special receiver structure that detects the arrival of asynchronous interference and adapts the processing weights accordingly. Jung and Zoltowski [24] propose using space-time filters twice, once to estimate the interferer by examining a ‘window’ synchronized with the interferer’s signal and subtract it from the original signal, and the second time to estimate the desired signal. Note that [22, 24] assume that the interference is misaligned by an integer multiple of the symbol period. Moreover, [22–24] all assume single-BS systems and do not address the multi-BS scenario, neither in terms of the difficulties that may arise nor in terms of any advantages. In this work, we briefly discuss some of the difficulties experienced when attempting to communicate to multiple users from multiple BSs using OFDM. Consequently, we assume that cooperation happens only for users in regions where asynchronous interference is limited and propose a method to help such users further.

1.2.3 Nonlinear Precoding

When both CSI and interference are known at the transmitter, another form of precoding for multiple users is Dirty Paper Coding (DPC) [25]. Briefly stated, since the interference is known beforehand at the transmitter, it can be subtracted from a user’s desired data. Examples of such coding techniques include the pioneering work by Costa [25], Tomlinson-Harashima Precoding (THP) first introduced in [26–28], and vector perturbation techniques [29]. In this work we make use of THP and present some related works below.


The main idea behind THP is moving the decision feedback loop of the common decision feedback equalizer (DFE) from the receiver side to the transmitter side. This helps reduce the possibility of error propagation; however, it can increase the average transmitted power. Accordingly, a modulo operation at the transmitter alleviates the problem of transmit power exceeding the usual transmit power constraint. The modulo operation is a non-linear operation, and hence makes THP a non-linear precoding technique [30].
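The role of the modulo operation can be illustrated with a scalar, noise-free toy example. The sketch below assumes a real 4-PAM alphabet, a modulo interval of [-4, 4), and interference values known exactly at the transmitter; these choices and the function name are hypothetical, and the feedback filtering of a full THP transmitter is omitted.

```python
import math

def modulo(value, m):
    """Fold a real value into the interval [-m, m)."""
    return value - 2 * m * math.floor((value + m) / (2 * m))

M_PAM = 4                                  # assumed 4-PAM alphabet {-3, -1, 1, 3}
symbols = [-3, -1, 1, 3]
interference = [5.75, -9.5, 0.25, 12.0]    # hypothetical known interference

transmitted, recovered = [], []
for s, i in zip(symbols, interference):
    tx = modulo(s - i, M_PAM)   # presubtract, then fold to bound the power
    rx = tx + i                 # the channel adds the interference back
    transmitted.append(tx)
    recovered.append(modulo(rx, M_PAM))    # receiver applies the same modulo
```

Without the modulo step, the transmitted value s - i would grow with the interference; with it, every transmitted value stays inside the interval while the receiver still recovers the original symbol.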

Another important issue which arises when working with THP is user ordering. Assume that a total of K users numbered 1, ..., K are present in the system. For user k, the interference from users k + 1, ..., K is presubtracted. Accordingly, user 1 sees no interference, user 2 sees interference only from user 1, and so on. Choosing the optimal order for the K users is a complicated problem and different works employing THP

use different ordering methods that meet their optimization criterion. Windpassinger et

al. [31] present a MIMO system that uses THP, and argue that it performs better than

linear precoding with DFE at the receiver. As for user ordering, they propose using two

methods presented in [32, 33]. Doostnejad et al. [34] propose a combination of linear

and nonlinear precoding for the MIMO downlink to minimize individual MSEs given individual power constraints or minimize the total transmit power given individual SINR constraints. Moreover, the downlink/uplink duality is generalized from MISO systems to MIMO systems for nonlinear precoding. Users are ordered according to the Frobenius norm of their channels. The user that sees no interference (user 1 according to the previous numbering) is chosen to be the user that has the lowest channel Frobenius norm, and the user that sees interference from all other users (user K) is chosen to be the user that has the highest channel Frobenius norm. In [35], Fung et al. propose a

practical THP implementation for the downlink of a multiuser, MIMO system and show

that it outperforms existing THP implementations based on zero-forcing techniques in

terms of power efficiency. The total power is minimized while satisfying the minimum

data rate and maximum bit error rate (BER) requirements of the users. User ordering

is also addressed and an algorithm that finds the optimal user ordering in polynomial

time is proposed. A less complex algorithm that finds a near-optimal user ordering is

also presented.
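The Frobenius-norm ordering attributed to [34] above amounts to a sort. The following sketch assumes each user's channel is available as a small matrix; the channel values, the user labels, and the function names are made up for illustration.

```python
def frobenius_norm(matrix):
    """Frobenius norm of a matrix given as a list of rows."""
    return sum(abs(entry) ** 2 for row in matrix for entry in row) ** 0.5

def thp_order(channels):
    """Return user labels sorted from lowest to highest channel norm.

    Position 0 plays the role of 'user 1' (sees no interference) and
    the last position plays the role of 'user K'.
    """
    return sorted(channels, key=lambda user: frobenius_norm(channels[user]))

# Hypothetical 2x2 channels for three users.
channels = {
    "a": [[1.0, 0.0], [0.0, 1.0]],
    "b": [[0.1, 0.2], [0.0, 0.1]],
    "c": [[2.0, 1.0j], [1.0, 0.5]],
}
order = thp_order(channels)
```

Sorting by the norm or by its square gives the same order, since squaring preserves the ranking of non-negative values.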

In our work, THP is used by two cooperating BSs to presubtract interference between


user groups, as opposed to individual users. As for the ordering problem, since the number of user groups is restricted to 2, only two orderings are possible and both are considered.

1.2.4 User Selection for Multiuser MIMO-OFDM Systems

Most works on multiuser communications assume communication to a small, fixed group

of users. This is because the number of transmit antennas essentially limits the number

of users that can be simultaneously serviced. In a realistic system, however, the number

of users would far exceed the number of antennas at the BS. One possible way to deal

with the high number of users is combining OFDM with MIMO. In such a system, it is

possible to transmit multiple data streams on each of the many subcarriers made available

by OFDM. This is a multiuser extension of orthogonal frequency division multiple access

(OFDMA). Of course, OFDM and OFDMA have their own implementation challenges,

such as phase noise and carrier offsets [4], but those are out of the scope of this work.

When linear precoding is used on each subcarrier, the maximum number of data

streams that can be transmitted on a given subcarrier is equal to the number of transmit

antennas. With many data streams required for transmission, the choice of which users

should be assigned to which subcarriers is not clear. Many works exist that address this

problem for single-carrier MIMO systems or single-input, single-output (SISO) OFDM

systems. However, since we are dealing with a MIMO-OFDM system, we focus our

attention on works that address user selection for such a system.

User selection methods vary according to the chosen optimization criterion. Zhang and

Ben Letaief [36] propose a theoretical joint user, power, and bit-load allocation scheme

in which the BER or data rate requirements of users are met. However, due to its complexity, they propose a more practical algorithm where users are separated into different

groups such that the inter-group user correlation is low. Only users from different groups

are allowed to transmit on the same subcarrier, and traditional methods, such as FDMA,

are used to allocate users from the same group to different subcarriers. With the users

decoupled in such a manner, power and bit-loads are jointly allocated to each user independently using the initial algorithm, but applied to that one user, which reduces its


complexity. In another approach, Pan et al. [37] propose an algorithm that allocates

subcarriers to users and minimizes the power required to meet certain data rates. The

main idea is to use DPC to allow more than one user to be allocated on each subcarrier.

In a more recent work, the authors of [38] propose an approach that attempts to meet

the data rates required by users in two steps, briefly summarized below. First, some

simplifying assumptions are made based on which an average channel gain for each user

is determined. Using these average channel gains, the number of subcarriers to meet each user’s data rate is approximated. Second, subcarriers are allocated to users such that the

initial approximations are satisfied. Then, an extra user may be allocated to a subcarrier

if it has a high channel gain and a low correlation with users already on that subcarrier.

Karaa and Adve [39] propose a MIMO-OFDM user allocation scheme tailored for

linear precoding as presented in [8]. The optimization criterion is the minimization of

the SMSE. For every subcarrier, the proposed approach runs an iteration that alternates

between power allocation and optimal beamforming for all system users, until the relative

change in SMSE is below a certain threshold. Afterwards, the data streams that are

allocated the most power are assigned to the subcarrier. It is shown by simulations that

the performance of this scheme closely approaches that of the optimal user selection

(simulated by brute force), when both are combined with linear precoding. This work is

reviewed in more detail later, as it is modified to enhance its performance further and

then extended to user selection with multiple BSs.
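The final step of the scheme in [39], keeping the data streams that received the most power on a subcarrier, can be sketched as follows. The converged power allocation is taken as a given input here, and the number of streams kept is assumed to equal the number of transmit antennas; both are assumptions for illustration, as are the labels and values.

```python
def select_streams(power_per_stream, num_antennas):
    """Keep the streams with the largest allocated power on one subcarrier.

    power_per_stream maps a (user, stream) label to the power it received
    from the converged power-allocation step, which is assumed given here.
    """
    ranked = sorted(power_per_stream, key=power_per_stream.get, reverse=True)
    return ranked[:num_antennas]

# Hypothetical converged allocation for one subcarrier.
powers = {("u1", 0): 0.40, ("u2", 0): 0.05, ("u3", 0): 0.30,
          ("u3", 1): 0.02, ("u4", 0): 0.23}
chosen = select_streams(powers, num_antennas=2)
```

The same function would be called once per subcarrier, each time with that subcarrier's own allocation.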

1.3 Thesis Overview and Structure

As mentioned previously, our goal is to provide insight into the design of a multiuser,

multi-BS, MIMO-OFDM system with a frequency reuse factor of unity. More specifically,

we will initially consider the downlink of a single-carrier, multi-BS, MIMO system and

explain the nature of the asynchronous interference that is experienced in such systems.

The contributions of this thesis are:

• Developing the useful downlink/uplink duality for a single-carrier, multi-BS, MIMO

system with asynchronous interference.


• Modifying an existing multiuser linear precoding algorithm that minimizes the SMSE of the system to account for the multiple BSs and asynchronous interference.

• Presenting a brief discussion on the difficulties of using OFDM for BS cooperation and proposing a cooperative algorithm that combines linear precoding and non-linear THP (hybrid algorithm) and minimizes the SMSE of the system. This algorithm avoids the mentioned difficulties by having two BSs cooperate only for edge users that are approximately equally distant from them.

• Modifying an existing user selection algorithm for MIMO-OFDM to enhance its performance and suggesting how it can be used to select users when the hybrid algorithm is used along with OFDM.

Consequently, the thesis is organized as follows. In Chapter 2, we develop the system

model and identify asynchronous interference as the key challenge. We prove the existence

of a downlink/uplink duality in a multi-BS MIMO system with asynchronous interference

and present a linear precoding scheme suitable for that scenario. In Chapter 3, we present

the hybrid algorithm which minimizes the SMSE of a system where two BSs cooperate to

communicate with edge users, i.e., cooperation is restricted to edge users exclusively. In

Chapter 4, we present a MIMO-OFDM user selection algorithm, as well as a simulation

exercise on how user selection for the hybrid algorithm might be done. The thesis wraps

up with some conclusions and suggestions for future work in Chapter 5.

Chapter 2

Multiple Base Stations and

Asynchronous Interference

In this chapter, we investigate a single-carrier MIMO system with multiple cooperating

BSs. We provide system models for the downlink and virtual uplink that take into account

the asynchronous nature of the interference, inevitable due to the use of multiple BSs.

We proceed to show that a downlink/uplink duality exists, and then exploit this duality

to extend an existing single-BS linear precoding algorithm to accommodate multiple BSs

and asynchronous interference.

2.1 System Models

This section describes the system models of the multiuser downlink and the multiuser

virtual uplink. In both cases it is assumed that there are B BSs and K users randomly

distributed in the cells of the B BSs. Each BS has M antennas, while user k has $N_k$ antennas with $N = \sum_{k=1}^{K} N_k$. Moreover, each user transmits or receives $L_k$ data streams simultaneously, where $L_k \le \min\{M, N_k\}$. Also, $L = \sum_{k=1}^{K} L_k$. Note that all data streams

share the same frequency and time channels. The BSs communicate to the users over

frequency flat channels. All BSs are assumed to know the CSI to all users perfectly. We

emphasize that this CSI also includes the propagation delays between all the BSs and all

the users.


2.1.1 Downlink System Model

Let xk (Lk × 1) be the column data vector containing the data streams of user k to be

transmitted from all BSs. The data symbols in xk are independent with unit average

power. Moreover, the data vectors of a user k are independent over time. As for the data

vectors of different users, they are also independent. Therefore,

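$$E\left[\mathbf{x}_k(m)\,\mathbf{x}_j^H(n)\right] = \begin{cases} \mathbf{I}_{L_k}, & k = j \text{ and } m = n \\ \mathbf{0}, & \text{otherwise,} \end{cases} \tag{2.1}$$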

where (·)H is the Hermitian operator, E [·] is the expectation operator, and m and n are

discrete time indices. In general, unless required, we will drop the time index.

Before transmission, each data stream in xk is allocated a certain power. This is done

by multiplying $\mathbf{x}_k$ by $\sqrt{\mathbf{P}_k}$ ($L_k \times L_k$), where $\mathbf{P}_k$ is a diagonal matrix whose components are the powers allocated to the different data streams of $\mathbf{x}_k$. Furthermore, before transmission from BS b, the data vector meant for user k is linearly precoded with a matrix $\mathbf{U}_k^{(b)}$ ($M \times L_k$). Hence, the $M \times 1$ signal transmitted from BS b to user k is

$$\mathbf{t}_k^{(b)} = \mathbf{U}_k^{(b)} \sqrt{\mathbf{P}_k}\,\mathbf{x}_k. \tag{2.2}$$

The channel between BS b and user k is represented by the matrix $\mathbf{H}_k^{(b)}$ ($M \times N_k$), whose elements are circularly symmetric complex Gaussian random variables with unit variance (Rayleigh fading). Since perfect CSI is assumed, these channels are known at the BSs. The $N_k \times 1$ signal received by user k is

$$\mathbf{y}_k = \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{U}_k^{(b)} \sqrt{\mathbf{P}_k}\,\mathbf{x}_k + \text{interference} + \mathbf{n}_k, \tag{2.3}$$

where nk denotes the additive white Gaussian noise (AWGN). The interference term will

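be explained in more detail below. Finally, user k processes its received signal linearly by multiplying it by a decoding matrix $\mathbf{V}_k^H$ ($L_k \times N_k$) to produce an estimate of its own data vector

$$\hat{\mathbf{x}}_k = \mathbf{V}_k^H \mathbf{y}_k. \tag{2.4}$$

Let $\mathbf{E}_k$ denote the error covariance matrix of user k. $\mathbf{E}_k$ is expressed as follows.

$$\mathbf{E}_k = E\left[(\hat{\mathbf{x}}_k - \mathbf{x}_k)(\hat{\mathbf{x}}_k - \mathbf{x}_k)^H\right] \tag{2.5}$$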

The diagonal elements of Ek are the mean squared errors (MSEs) of the data streams of

user k and, therefore, the SMSE of user k can be found as follows,

SMSEk = tr (Ek) , (2.6)

where tr(·) is the trace operator. The SMSE of the whole system can be expressed as

the sum of the individual SMSEs,

$$\mathrm{SMSE} = \sum_{k=1}^{K} \mathrm{SMSE}_k = \sum_{k=1}^{K} \mathrm{tr}(\mathbf{E}_k). \tag{2.7}$$

Accordingly, the optimization problem of finding precoding and decoding matrices and a

power allocation that minimize the SMSE in the downlink under a sum power constraint

can be described as follows,

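$$\begin{aligned} \min_{\substack{\mathbf{P}_k,\,\mathbf{U}_k^{(b)},\,\mathbf{V}_k \\ k=1:K,\; b=1:B}} \quad & \sum_{k=1}^{K} \mathrm{tr}(\mathbf{E}_k) \\ \text{subject to} \quad & \sum_{k=1}^{K} \mathrm{tr}(\mathbf{P}_k) \le B \times P_{\max}, \end{aligned} \tag{2.8}$$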

where B × Pmax is the maximum sum power available over all the BSs, and Pmax is the

average transmit power of one BS over time.

Timing Issues: In the first summation of Eqn. (2.3), it is assumed that the transmissions from all BSs meant for a certain user k ($\mathbf{t}_k^{(b)}$, for $b = 1, \ldots, B$) are received by user k synchronously. Therefore, as in [20], BS b advances the time at which it transmits $\mathbf{t}_k^{(b)}$ by $\tau_k^{(b)} - \tau_k^{(b_k)}$, where $\tau_k^{(b)}$ is the propagation delay from BS b to user k, and $\tau_k^{(b_k)}$ is the propagation delay from user k to the nearest BS. Due to the random distribution of

users, each user will be at a different distance from the different BSs. Since each user

receives its own data from the different BSs synchronously, it is impossible to ensure that

the interference at user k from other users in the system be synchronous with user k’s

desired signals as well; the interference is therefore asynchronous. The distances between

BSs and users in practical systems imply that the delays related to the asynchronous in-

terference are very small; however, relative to the symbol period in such systems, which

is on the order of microseconds, these delays are considerable and cannot be neglected.

Most works dealing with multiple BSs ignore this issue, which significantly simplifies any

system design. The issue of asynchronous interference was first identified in [20] and we

show here that it is the key challenge in designing multi-BS cooperation schemes.
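To give a feel for the magnitudes involved, the following sketch converts a difference in path length into a timing offset and compares it with the symbol period. The 150 m path difference and the 1 microsecond symbol period are illustrative assumptions, not values taken from this thesis.

```python
# Illustrative arithmetic only; the distance and symbol period are assumed values.
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def propagation_delay(distance_m):
    """One-way line-of-sight propagation delay in seconds."""
    return distance_m / SPEED_OF_LIGHT

symbol_period = 1e-6          # assumed T_S of one microsecond
path_difference_m = 150.0     # hypothetical difference between two BS-to-user paths

delay_offset = propagation_delay(path_difference_m)
misalignment_fraction = delay_offset / symbol_period

print(f"offset: {delay_offset * 1e6:.2f} microseconds "
      f"({misalignment_fraction:.2f} of a symbol period)")
```

Under these assumed numbers the interfering symbol is shifted by roughly half a symbol period, which is why the offset cannot be treated as negligible.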

The asynchronous interference causes the pulse shape used to transmit the interfering

data streams to be misaligned with the matched filter at user k. Adapting from [20], this

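misalignment can be expressed as follows.

$$\text{interference} = \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \sum_{\substack{j=1 \\ j \ne k}}^{K} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{i}_{jk}^{(b)}, \tag{2.9}$$

where

$$\mathbf{i}_{jk}^{(b)}(m) = \rho(\delta_{jk}^{(b)} - T_S)\,\mathbf{x}_j(m_{jk}^{(b)}) + \rho(\delta_{jk}^{(b)})\,\mathbf{x}_j(m_{jk}^{(b)} + 1), \tag{2.10}$$

$$\rho(\tau) = \int_0^{T_S} g(t)\,g(t - \tau)\,dt, \tag{2.11}$$

$$\tau_{jk}^{(b)} = \tau_k^{(b)} - \tau_k^{(b_k)} - (\tau_j^{(b)} - \tau_j^{(b_j)}), \qquad \delta_{jk}^{(b)} = \tau_{jk}^{(b)} \bmod T_S. \tag{2.12}$$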

Figure 2.1: Asynchronous interference in the downlink

Here, $T_S$ is the symbol period. $g(t)$ is the pulse shaping filter. It is non-zero only for $t \in [0, T_S]$, real, and has unit power. The term $\mathbf{i}_{jk}^{(b)}$ is the misaligned interference caused on user k when BS b transmits to user j; it is a linear combination of two consecutive data vectors being transmitted to user j, with time indices $m_{jk}^{(b)}$ and $m_{jk}^{(b)} + 1$. Note that by convention, in the downlink, $m_{jk}^{(b)}$ is the index given to the first interfering symbol and $m_{jk}^{(b)} + 1$ is that given to the second. Figure 2.1 helps in understanding the expression for $\mathbf{i}_{jk}^{(b)}$. The term $\tau_{jk}^{(b)}$ represents the difference between the time when interference from BS b while transmitting to user j arrives at user k and the time when user k receives its desired signal. The expression can be understood as follows. Assume all BSs transmit to their closest users at the same time, say t = 0. User k is closest to BS $b_k$ and, therefore, user k receives its desired signal at $t = \tau_k^{(b_k)}$. BS b advances its transmission to user j by $\tau_j^{(b)} - \tau_j^{(b_j)}$ so that user j receives all its intended signals from all BSs simultaneously. The propagation delay from BS b to user k is $\tau_k^{(b)}$ and hence the interference from user j arrives at user k at $t = \tau_k^{(b)} - (\tau_j^{(b)} - \tau_j^{(b_j)})$. This makes the interference from BS b when transmitting to user j arrive at user k after $\tau_{jk}^{(b)} = \tau_k^{(b)} - \tau_k^{(b_k)} - (\tau_j^{(b)} - \tau_j^{(b_j)})$. Therefore,

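$$\hat{\mathbf{x}}_k = \underbrace{\mathbf{V}_k^H \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{U}_k^{(b)} \sqrt{\mathbf{P}_k}\,\mathbf{x}_k}_{\text{desired signal}} + \underbrace{\mathbf{V}_k^H \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \sum_{\substack{j=1 \\ j \ne k}}^{K} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{i}_{jk}^{(b)}}_{\text{asynchronous multiuser interference}} + \mathbf{V}_k^H \mathbf{n}_k \tag{2.13}$$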
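The timing quantities of Eqns. (2.10) to (2.12) can be sketched numerically. The snippet below is a minimal illustration that assumes a rectangular pulse $g(t) = 1/\sqrt{T_S}$ on $[0, T_S]$, for which $\rho(\tau) = 1 - |\tau|/T_S$ when $|\tau| \le T_S$; the thesis does not fix a particular pulse shape. The function names and the delay table `tau` (delays in units of the symbol period) are hypothetical.

```python
def nearest_bs(tau, user):
    """Index of the BS with the smallest propagation delay to `user`."""
    return min(range(len(tau)), key=lambda b: tau[b][user])

def downlink_offset(tau, j, k, b):
    """tau_jk^(b) of Eqn. (2.12); tau[b][u] is the delay from BS b to user u."""
    advance_k = tau[b][k] - tau[nearest_bs(tau, k)][k]
    advance_j = tau[b][j] - tau[nearest_bs(tau, j)][j]
    return advance_k - advance_j

def rho_rect(t, Ts):
    """rho(t) of Eqn. (2.11), assuming a rectangular pulse 1/sqrt(Ts) on [0, Ts]."""
    return max(0.0, 1.0 - abs(t) / Ts)

def interference_weights(tau, j, k, b, Ts):
    """Weights applied to x_j(m) and x_j(m + 1) in Eqn. (2.10)."""
    delta = downlink_offset(tau, j, k, b) % Ts  # result lies in [0, Ts)
    return rho_rect(delta - Ts, Ts), rho_rect(delta, Ts)

Ts = 1.0
# Hypothetical delays: rows are BSs, columns are users.
tau = [[0.2, 0.9],
       [0.7, 0.3]]

# Interference seen by user 0 when BS 0 transmits to user 1.
w_first, w_second = interference_weights(tau, j=1, k=0, b=0, Ts=Ts)
print(w_first, w_second)
```

With this assumed pulse the two weights always sum to one, so the interfering energy is simply split between the two consecutive symbols according to the offset $\delta_{jk}^{(b)}$.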

As noted previously, the expression of $\mathbf{i}_{jk}^{(b)}(m)$ (Eqn. (2.10)) contains the two consecutive data vectors $\mathbf{x}_j(m_{jk}^{(b)})$ and $\mathbf{x}_j(m_{jk}^{(b)} + 1)$. It is assumed that the channels experience quasi-static fading, meaning that the fading is the same over a considerable number of symbol periods. This implies that the same power and precoding matrices can be used for several consecutive symbols. Accordingly, $\mathbf{P}_j$ and $\mathbf{U}_j^{(b)}$ can be used for $\mathbf{i}_{jk}^{(b)}$ in Eqn. (2.13).

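Before proceeding to the virtual uplink model, it is worth investigating the value of $E[\mathbf{i}_{j_1 k}^{(b_1)} \mathbf{i}_{j_2 k}^{(b_2)H}]$, as it will be frequently used later. Using Eqn. (2.1) and Eqn. (2.10), it can be easily shown that

$$E\left[\mathbf{i}_{j_1 k}^{(b_1)} \mathbf{i}_{j_2 k}^{(b_2)H}\right] = \begin{cases} \mathbf{0}, & \text{for } j_1, j_2, \text{ and } k \text{ all distinct} \\ \beta_{jk}^{(b_1,b_2)}\,\mathbf{I}_{L_j}, & \text{for } j_1 = j_2 = j \ne k \\ \mathbf{I}_{L_j}, & \text{for } j_1 = j_2 = j = k \end{cases}, \tag{2.14}$$

where

$$\beta_{jk}^{(b_1,b_2)} = \begin{cases} 0, & \text{if } |m_{jk}^{(b_2)} - m_{jk}^{(b_1)}| > 1 \\ \rho(\delta_{jk}^{(b_1)})\,\rho(\delta_{jk}^{(b_2)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1 \\ \rho(\delta_{jk}^{(b_1)})\,\rho(\delta_{jk}^{(b_2)}) + \rho(\delta_{jk}^{(b_1)} - T_S)\,\rho(\delta_{jk}^{(b_2)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} \\ \rho(\delta_{jk}^{(b_2)})\,\rho(\delta_{jk}^{(b_1)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} - 1 \end{cases}. \tag{2.15}$$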

The details of the derivation are provided in Appendix B.
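As a rough illustration of the correlation factor $\beta_{jk}^{(b_1,b_2)}$, the sketch below evaluates the four cases that follow from Eqns. (2.1) and (2.10): the only surviving products are those in which both interference terms contain the same data vector of user j. The rectangular-pulse autocorrelation `rho_rect` is an assumption made for this example, and the function and argument names are hypothetical.

```python
def rho_rect(t, Ts):
    """Assumed pulse autocorrelation: rectangular pulse of duration Ts."""
    return max(0.0, 1.0 - abs(t) / Ts)

def beta(delta1, delta2, index_gap, Ts, rho=rho_rect):
    """Correlation factor between the interference terms of BS b1 and BS b2.

    delta1, delta2 : offsets delta_jk^(b1) and delta_jk^(b2), each in [0, Ts)
    index_gap      : m_jk^(b2) - m_jk^(b1)
    """
    if index_gap == 1:
        # x_j(m^(b1) + 1) and x_j(m^(b2)) are the same symbol.
        return rho(delta1, Ts) * rho(delta2 - Ts, Ts)
    if index_gap == 0:
        # Both symbols coincide pairwise.
        return (rho(delta1, Ts) * rho(delta2, Ts)
                + rho(delta1 - Ts, Ts) * rho(delta2 - Ts, Ts))
    if index_gap == -1:
        # x_j(m^(b1)) and x_j(m^(b2) + 1) are the same symbol.
        return rho(delta2, Ts) * rho(delta1 - Ts, Ts)
    return 0.0  # no symbol in common

synchronous = beta(0.0, 0.0, 0, Ts=1.0)   # perfectly aligned interference
offset_case = beta(0.4, 0.4, 0, Ts=1.0)   # same BS, 0.4 Ts misalignment
print(synchronous, offset_case)
```

In the perfectly aligned case the factor equals one, matching the synchronous single-BS setting; any misalignment reduces it under the assumed pulse.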

2.1.2 Virtual Uplink System Model

In precoding for a single BS, many works make use of a downlink/uplink duality because of the significant simplification it provides in solving the optimization problem in

Eqn. (2.8). Since we hope to exploit this duality for our multi-BS case, we present in

this section the virtual uplink system model to be used. It is worth emphasizing that

the uplink is purely virtual, i.e., it is a purely mathematical construct to simplify the

downlink optimization problem. In this regard, we investigate two variants of the virtual

uplink model.

In the uplink, the system model is very similar to that in the downlink, except for the

following changes. The K users are now communicating to B BSs. User k transmits Lk

data streams, which make up the components of the Lk × 1 data vector xk. The power

allocation matrix used by user k is Qk. Vk is now the precoding matrix used by user k.


Therefore, the signal transmitted by user k to all BSs is

$$\mathbf{t}_k = \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k. \tag{2.16}$$

The channel matrices are the same as those in the downlink with the dimensions reversed

and components conjugated (Hermitian operation). Therefore, the M ×1 signal received

by BS b from user k is

$$\mathbf{y}_k^{(b)} = \mathbf{H}_k^{(b)} \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k + \text{interference} + \mathbf{n}^{(b)}, \tag{2.17}$$

where $\mathbf{n}^{(b)}$ denotes the AWGN at BS b.¹ The interference term will be explained in more detail below after the timing models for the virtual uplink are presented. The BSs group the received vectors $\mathbf{y}_k^{(b)}$ into one global received vector $\mathbf{y}_k = [\mathbf{y}_k^{(1)T} \ldots \mathbf{y}_k^{(B)T}]^T$ ($BM \times 1$), where $(\cdot)^T$ is the transpose operator. $\mathbf{y}_k$ is processed linearly by multiplying it by the Hermitian of the decoding matrix $\mathbf{U}_k$ ($BM \times L_k$) to produce an estimate of the data vector of user k

$$\hat{\mathbf{x}}_k = \mathbf{U}_k^H \mathbf{y}_k. \tag{2.18}$$

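Note that $\mathbf{U}_k$ can be expressed as $\mathbf{U}_k = [\mathbf{U}_k^{(1)T}, \ldots, \mathbf{U}_k^{(B)T}]^T$, where $\mathbf{U}_k^{(b)}$ is the decoding matrix used by BS b for user k. Therefore, $\hat{\mathbf{x}}_k$ can be also expressed as

$$\hat{\mathbf{x}}_k = \mathbf{U}_k^H \mathbf{y}_k = \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{y}_k^{(b)}. \tag{2.19}$$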

One crucial difference between the work in [8] and the multi-BS case is the issue of

timing. In what follows, we will consider two different methods for modeling the timing

of signals in the uplink. Each method produces a different expression for the term $\tau_{jk}^{(b)}$,

which represents the difference between the time when interference from user j arrives

at BS b and the time when the desired signal of user k arrives at BS b. Otherwise, all

other asynchronous interference terms are common.

¹ We assume that each BS can process its received signal several times with different delays in order to obtain a $\mathbf{y}_k^{(b)}$ vector for each user k that is synchronized with the data of user k.


Simultaneous Transmissions: In the first method, we model the virtual uplink

in a simple, straightforward manner. We assume that all the MSs transmit at the same

time, say t = 0. The idea behind the first method comes from the downlink assumption

that all BSs transmit to their closest users at the same time. In this case, $\tau_{jk}^{(b)}$ is expressed as follows.

$$\tau_{jk}^{(b)} = \tau_j^{(b)} - \tau_k^{(b)} \tag{2.20}$$

Eqn. (2.20) can be understood as follows. As explained previously, all the users transmit at the same time, t = 0. The interfering signal from user j arrives at BS b at $t = \tau_j^{(b)}$, while the desired signal from user k arrives at $t = \tau_k^{(b)}$. Therefore, the difference is $\tau_{jk}^{(b)} = \tau_j^{(b)} - \tau_k^{(b)}$.

Time Reversal: In the second method, we will look first at how the virtual uplink

is modeled in the case with one BS. As in [8], the signal received by that one BS in the

uplink is given as follows.

$$\mathbf{y} = \sum_{k=1}^{K} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k + \mathbf{n} \tag{2.21}$$

Note that Eqn. (2.21) has no timing terms, meaning that the different signals from the

different users arrive synchronously at the BS. With users being at different distances

from the BS, this is only possible if the users transmit at different times such that their

signals arrive at the same time at the BS. This cannot be directly extended to the

multiple BS case since, similar to what was mentioned previously in the timing issues in

Section 2.1.1, it is impossible for all the signals from all the users to arrive at all the BSs

synchronously.

In the downlink, we assumed that a BS transmits to all users for which it is the closest

BS at the same time, say t = 0. Accordingly, we propose that in the virtual uplink each

BS should necessarily receive synchronously only the signals transmitted by the users

that have it as the closest BS, say at t = 0 as well. In this case, $\tau_{jk}^{(b)}$ is expressed as follows.

$$\tau_{jk}^{(b)} = \tau_j^{(b)} - \tau_j^{(b_j)} - (\tau_k^{(b)} - \tau_k^{(b_k)}) \tag{2.22}$$

Eqn. (2.22) can be understood as follows. Since each BS should receive the signals transmitted by the users that have it as the closest BS at the same time, t = 0, user k transmits its signal at $t = -\tau_k^{(b_k)}$ and this signal arrives at BS b at $t = -\tau_k^{(b_k)} + \tau_k^{(b)}$. Similarly, the signal from user j arrives at BS b at $t = -\tau_j^{(b_j)} + \tau_j^{(b)}$. Therefore, the interference of user j on user k arrives after $\tau_{jk}^{(b)} = \tau_j^{(b)} - \tau_j^{(b_j)} - (\tau_k^{(b)} - \tau_k^{(b_k)})$.

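With the expressions of $\tau_{jk}^{(b)}$ at hand, we present the expressions of the asynchronous interference caused by user j on user k at BS b and the estimated data vector for user k, $\hat{\mathbf{x}}_k$.

$$\mathbf{e}_{jk}^{(b)}(m) = \rho(\delta_{jk}^{(b)})\,\mathbf{x}_j(m_{jk}^{(b)}) + \rho(\delta_{jk}^{(b)} - T_S)\,\mathbf{x}_j(m_{jk}^{(b)} + 1), \tag{2.23}$$

$$\delta_{jk}^{(b)} = \tau_{jk}^{(b)} \bmod T_S, \tag{2.24}$$

$$\hat{\mathbf{x}}_k = \underbrace{\sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{H}_k^{(b)} \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k}_{\text{desired signal}} + \underbrace{\sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \sum_{\substack{j=1 \\ j \ne k}}^{K} \mathbf{H}_j^{(b)} \mathbf{V}_j \sqrt{\mathbf{Q}_j}\,\mathbf{e}_{jk}^{(b)}}_{\text{asynchronous multiuser interference}} + \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{n}^{(b)} \tag{2.25}$$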

The expression for $\mathbf{e}_{jk}^{(b)}$ in Eqn. (2.23) can be explained in a manner similar to that of $\mathbf{i}_{jk}^{(b)}$ in Eqn. (2.10). Figure 2.2 helps in understanding Eqn. (2.23). Note that the indices $m_{jk}^{(b)}$ and $m_{jk}^{(b)} + 1$ remain unchanged since they are simply a way of identifying the symbols, and these symbols are the same whether they are traveling in the downlink or uplink. In a similar derivation to that of $E[\mathbf{i}_{j_1 k}^{(b_1)} \mathbf{i}_{j_2 k}^{(b_2)H}]$, it can be shown that

$$E\left[\mathbf{e}_{j_1 k}^{(b_1)} \mathbf{e}_{j_2 k}^{(b_2)H}\right] = \begin{cases} \mathbf{0}, & \text{for } j_1, j_2, \text{ and } k \text{ all distinct} \\ \gamma_{jk}^{(b_1,b_2)}\,\mathbf{I}_{L_j}, & \text{for } j_1 = j_2 = j \ne k \\ \mathbf{I}_{L_j}, & \text{for } j_1 = j_2 = j = k \end{cases}, \tag{2.26}$$


Figure 2.2: Asynchronous interference in the uplink

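where

$$\gamma_{jk}^{(b_1,b_2)} = \begin{cases} 0, & \text{if } |m_{jk}^{(b_2)} - m_{jk}^{(b_1)}| > 1 \\ \rho(\delta_{jk}^{(b_2)})\,\rho(\delta_{jk}^{(b_1)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1 \\ \rho(\delta_{jk}^{(b_1)})\,\rho(\delta_{jk}^{(b_2)}) + \rho(\delta_{jk}^{(b_1)} - T_S)\,\rho(\delta_{jk}^{(b_2)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} \\ \rho(\delta_{jk}^{(b_1)})\,\rho(\delta_{jk}^{(b_2)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} - 1 \end{cases}. \tag{2.27}$$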

More details are provided in Appendix B.

2.2 Downlink/Uplink Duality

In general, proving the existence of a downlink/uplink duality requires two steps. In the

first step, SINR targets are set on each data stream and the proof requires showing that

the same total power is needed to meet these targets in the downlink and uplink. In the

second step, the proof requires showing that the MSE for a data stream is the same in

the downlink or uplink, using the fact that the uplink and downlink SINRs are the same.

2.2.1 First Step: SINR Targets

Let the target SINR for data stream j of user k be Γkj. Let p be a column downlink power

allocation vector whose elements are the powers allocated to all data streams across all

users, i.e., the diagonal elements of the matrices Pk for k = 1, . . . , K. We would like

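to find $\mathbf{p}$ such that the minimum of the ratios $\mathrm{SINR}_{kj}^{\mathrm{DL}}/\Gamma_{kj}$ over all values of k and j is maximized and $\|\mathbf{p}\|_1 = \mathbf{1}^T\mathbf{p} \le B \times P_{\max}$, where $\mathbf{1}$ is an all-ones $L \times 1$ vector. It was shown in [6] that the solution of this problem makes all the above ratios equal to the same level, denoted as $C^{\mathrm{DL}}$.

$$C^{\mathrm{DL}} = \frac{\mathrm{SINR}_{kj}^{\mathrm{DL}}}{\Gamma_{kj}}, \quad 1 \le k \le K,\; 1 \le j \le L_k,\; \|\mathbf{p}\|_1 \le B \times P_{\max}. \tag{2.28}$$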

The above is true since the SINR for a given data stream is monotonically increasing in

the power of that data stream, and monotonically decreasing in the power of any other

data stream. This still holds when the interference is asynchronous, and can be easily

seen from the explicit expression of the SINR to be presented later.

From Eqn. (2.28), we can write the following equation for any data stream j of user

k.

$$p_{kj}\,\frac{1}{C^{\mathrm{DL}}} = p_{kj}\,\frac{\Gamma_{kj}}{\mathrm{SINR}_{kj}^{\mathrm{DL}}}. \tag{2.29}$$

From Eqn. (2.13), the SINR of a data stream j of user k in the downlink can be found by

dividing its power by the power of the remaining data streams and noise in the system.

This SINR can be expressed as

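$$\mathrm{SINR}_{kj}^{\mathrm{DL}} = p_{kj}\,\frac{\mathbf{v}_{kj}^H \mathbf{S}_{kj}^{\mathrm{DL}} \mathbf{v}_{kj}}{\mathbf{v}_{kj}^H \mathbf{T}_{kj}^{\mathrm{DL}} \mathbf{v}_{kj}}, \tag{2.30}$$

$$\mathbf{S}_{kj}^{\mathrm{DL}} = \left(\sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{u}_{kj}^{(b)}\right)\left(\sum_{b=1}^{B} \mathbf{u}_{kj}^{(b)H} \mathbf{H}_k^{(b)}\right), \tag{2.31}$$

where $\mathbf{u}_{kj}^{(b)}$ denotes the column of $\mathbf{U}_k^{(b)}$ corresponding to stream j of user k at BS b.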


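Also,

$$\begin{aligned} \mathbf{T}_{kj}^{\mathrm{DL}} &= \sum_{\substack{l=1 \\ l \ne j}}^{L_k} p_{kl} \left(\sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{u}_{kl}^{(b)}\right)\left(\sum_{b=1}^{B} \mathbf{u}_{kl}^{(b)H} \mathbf{H}_k^{(b)}\right) \\ &\quad + E\left[\sum_{\substack{c=1 \\ c \ne k}}^{K} \sum_{l=1}^{L_c} p_{cl} \left(\sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{u}_{cl}^{(b)}\, i_{ckj}^{(b)}\right)\left(\sum_{b=1}^{B} i_{ckj}^{(b)H} \mathbf{u}_{cl}^{(b)H} \mathbf{H}_k^{(b)}\right)\right] + \sigma^2 \mathbf{I}_{N_k} \\ &= \sum_{\substack{l=1 \\ l \ne j}}^{L_k} \sum_{b_1=1}^{B} \sum_{b_2=1}^{B} \mathbf{H}_k^{(b_1)H} \mathbf{u}_{kl}^{(b_1)}\, p_{kl}\, \mathbf{u}_{kl}^{(b_2)H} \mathbf{H}_k^{(b_2)} \\ &\quad + \sum_{\substack{c=1 \\ c \ne k}}^{K} \sum_{l=1}^{L_c} \sum_{b_1=1}^{B} \sum_{b_2=1}^{B} \beta_{ck}^{(b_1,b_2)}\, \mathbf{H}_k^{(b_1)H} \mathbf{u}_{cl}^{(b_1)}\, p_{cl}\, \mathbf{u}_{cl}^{(b_2)H} \mathbf{H}_k^{(b_2)} + \sigma^2 \mathbf{I}_{N_k}. \end{aligned} \tag{2.32}$$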

$i_{ckj}^{(b)}$ is the jth element of $\mathbf{i}_{ck}^{(b)}$. Note that it is assumed, as in [9], that the covariance matrix of any external interference can be estimated, for example by training, and whitened. The whitening filter can be considered as part of the global channel matrix of user k, $\mathbf{H}_k = [\mathbf{H}_k^{(1)T}\ \mathbf{H}_k^{(2)T} \ldots \mathbf{H}_k^{(B)T}]^T$.² Hence, we only include a scaled identity matrix in Eqn. (2.32) to represent noise and external interference. Next, the L equalities given by Eqn. (2.29) can be grouped together in one equation as follows

$$\mathbf{p}\,\frac{1}{C^{\mathrm{DL}}} = \mathbf{D}\boldsymbol{\Psi}^{\mathrm{DL}}\mathbf{p} + \mathbf{D}\boldsymbol{\sigma}^{\mathrm{DL}} \tag{2.33}$$

where

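$$\mathbf{D} = \mathrm{diag}\left(\frac{\Gamma_{11}}{\mathbf{v}_{11}^H \mathbf{S}_{11}^{\mathrm{DL}} \mathbf{v}_{11}}, \ldots, \frac{\Gamma_{K L_K}}{\mathbf{v}_{K L_K}^H \mathbf{S}_{K L_K}^{\mathrm{DL}} \mathbf{v}_{K L_K}}\right), \tag{2.34}$$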

² This can be understood in general as follows. Denote the transmission from all BSs as vector $\mathbf{s}$. Assume a user receives $\mathbf{s}$ through channel $\mathbf{H}$. Noise, denoted by $\mathbf{n}$, is also added to the received signal. Thus, the received signal is expressed as $\mathbf{H}^H\mathbf{s} + \mathbf{n}$. We assume the noise is colored with a covariance matrix $\mathbf{R}$. The user whitens the noise using a whitening filter, found as $\mathbf{R}^{-1/2}$. Therefore, the processed received signal becomes $\mathbf{R}^{-1/2}\mathbf{H}^H\mathbf{s} + \mathbf{R}^{-1/2}\mathbf{n}$. The vector $\mathbf{R}^{-1/2}\mathbf{n}$ is now AWGN, and $\mathbf{R}^{-1/2}\mathbf{H}^H$ can be considered the equivalent channel.


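$$\left[\boldsymbol{\Psi}^{\mathrm{DL}}\right]_{L_r L_c} = \mathbf{v}_{k_r l_r}^H\, E\left[\left(\sum_{b=1}^{B} \mathbf{H}_{k_r}^{(b)H} \mathbf{u}_{k_c l_c}^{(b)}\, i_{k_c k_r l_r}^{(b)}\right)\left(\sum_{b=1}^{B} \mathbf{H}_{k_r}^{(b)H} \mathbf{u}_{k_c l_c}^{(b)}\, i_{k_c k_r l_r}^{(b)}\right)^H\right] \mathbf{v}_{k_r l_r} \tag{2.35}$$

$$= \mathbf{v}_{k_r l_r}^H \left(\sum_{b_1=1}^{B} \sum_{b_2=1}^{B} \beta_{k_c k_r}^{(b_1,b_2)}\, \mathbf{H}_{k_r}^{(b_1)H} \mathbf{u}_{k_c l_c}^{(b_1)} \mathbf{u}_{k_c l_c}^{(b_2)H} \mathbf{H}_{k_r}^{(b_2)}\right) \mathbf{v}_{k_r l_r}, \tag{2.36}$$

for $1 \le L_r, L_c \le L$ and $L_r \ne L_c$, and

$$\boldsymbol{\sigma}^{\mathrm{DL}} = \sigma^2 \left[\mathbf{v}_{11}^H \mathbf{v}_{11}, \ldots, \mathbf{v}_{K L_K}^H \mathbf{v}_{K L_K}\right]^T = \sigma^2 \mathbf{1}.$$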

The vector $\mathbf{1}$ is an $L \times 1$ vector whose elements are all 1. The subscripts r and c are short for row and column, respectively; $k_r$ is the user to which data stream $L_r$ belongs when the data streams are labeled from 1 to L; $l_r$ is the index of that data stream relative to user $k_r$. The same applies to $k_c$ and $l_c$. Note that the diagonal entries of $\boldsymbol{\Psi}^{\mathrm{DL}}$ are zero. Also, it is assumed, for ease of representation, that when $k_r = k_c = k$, $\mathbf{i}_{kk}^{(b)} = \mathbf{x}_k$, and in that case the specific element $[\boldsymbol{\Psi}^{\mathrm{DL}}]_{L_r L_c}$ can be found as follows

$$\left[\boldsymbol{\Psi}^{\mathrm{DL}}\right]_{L_r L_c} = \left|\mathbf{v}_{k_r l_r}^H \sum_{b=1}^{B} \mathbf{H}_{k_r}^{(b)H} \mathbf{u}_{k_c l_c}^{(b)}\right|^2.$$

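To minimize the total power while meeting the target SINRs at the same time, $C^{\mathrm{DL}}$ is set to 1. Accordingly, and after simple mathematical manipulations to Eqn. (2.33), we get

$$\mathbf{p} = \sigma^2 \left(\mathbf{D}^{-1} - \boldsymbol{\Psi}^{\mathrm{DL}}\right)^{-1} \mathbf{1}. \tag{2.37}$$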
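The closed-form power allocation can be checked numerically. The sketch below solves $(\mathbf{D}^{-1} - \boldsymbol{\Psi}^{\mathrm{DL}})\,\mathbf{p} = \sigma^2 \mathbf{1}$ with a small Gaussian-elimination routine and confirms that the result satisfies the grouped equalities with the balancing level set to 1. The diagonal entries `d`, the coupling matrix `psi`, and the noise variance are made-up values chosen only so that a positive solution exists; `solve` and `downlink_powers` are hypothetical helper names.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def downlink_powers(d, psi, sigma2):
    """p = sigma^2 (D^{-1} - Psi)^{-1} 1, with D = diag(d)."""
    n = len(d)
    A = [[(1.0 / d[i] if i == j else 0.0) - psi[i][j] for j in range(n)]
         for i in range(n)]
    return solve(A, [sigma2] * n)

# Made-up example with L = 3 data streams (zero diagonal in psi).
d = [0.5, 0.8, 0.4]
psi = [[0.0, 0.3, 0.1],
       [0.2, 0.0, 0.4],
       [0.1, 0.5, 0.0]]
sigma2 = 0.1

p = downlink_powers(d, psi, sigma2)

# Residual of p = D Psi p + sigma^2 D 1, i.e. the grouped equalities with C = 1.
residual = max(
    abs(p[i] - d[i] * (sum(psi[i][j] * p[j] for j in range(3)) + sigma2))
    for i in range(3))
print(p, residual)
```

A positive solution exists only when the SINR targets are jointly feasible; the example values were picked so that this holds.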

A similar discussion applies for the uplink. In that case, the power vector is denoted

as q. Its elements are the powers allocated to all data streams across all users, i.e., the

diagonal elements of the matrices Qk for k = 1, . . . , K. The SINR of data stream j of

user k in the uplink can be expressed as follows,

$$\mathrm{SINR}_{kj}^{\mathrm{UL}} = q_{kj}\,\frac{\mathrm{num}_{kj}}{\mathrm{den}_{kj}} \tag{2.38}$$

where

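$$\mathrm{num}_{kj} = \left(\sum_{b=1}^{B} \mathbf{u}_{kj}^{(b)H} \mathbf{H}_k^{(b)}\right) \mathbf{v}_{kj} \mathbf{v}_{kj}^H \left(\sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{u}_{kj}^{(b)}\right) \tag{2.39}$$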

and

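$$\begin{aligned} \mathrm{den}_{kj} &= \sum_{\substack{l=1 \\ l \ne j}}^{L_k} q_{kl} \left|\left(\sum_{b=1}^{B} \mathbf{u}_{kj}^{(b)H} \mathbf{H}_k^{(b)}\right) \mathbf{v}_{kl}\right|^2 + E\left[\sum_{\substack{c=1 \\ c \ne k}}^{K} \sum_{l=1}^{L_c} q_{cl} \left|\sum_{b=1}^{B} \mathbf{u}_{kj}^{(b)H} \mathbf{H}_c^{(b)} \mathbf{v}_{cl}\, e_{ckj}^{(b)}\right|^2\right] + \sigma^2 \\ &= \sum_{\substack{l=1 \\ l \ne j}}^{L_k} \sum_{b_1=1}^{B} \sum_{b_2=1}^{B} \mathbf{u}_{kj}^{(b_1)H} \mathbf{H}_k^{(b_1)} \mathbf{v}_{kl}\, q_{kl}\, \mathbf{v}_{kl}^H \mathbf{H}_k^{(b_2)H} \mathbf{u}_{kj}^{(b_2)} \\ &\quad + \sum_{\substack{c=1 \\ c \ne k}}^{K} \sum_{l=1}^{L_c} \sum_{b_1=1}^{B} \sum_{b_2=1}^{B} \gamma_{ck}^{(b_1,b_2)}\, \mathbf{u}_{kj}^{(b_1)H} \mathbf{H}_c^{(b_1)} \mathbf{v}_{cl}\, q_{cl}\, \mathbf{v}_{cl}^H \mathbf{H}_c^{(b_2)H} \mathbf{u}_{kj}^{(b_2)} + \sigma^2 \end{aligned} \tag{2.40}$$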

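Note that $|x|^2 = xx^H = x^Hx$ for scalars and it is used as such in Eqn. (2.40) because of its more compact form. Setting the same SINR targets as the downlink and grouping equalities of similar form to Eqn. (2.29) for the uplink, we get

$$\mathbf{q}\,\frac{1}{C^{\mathrm{UL}}} = \mathbf{D}\boldsymbol{\Psi}^{\mathrm{UL}}\mathbf{q} + \mathbf{D}\boldsymbol{\sigma}^{\mathrm{UL}}, \tag{2.41}$$

where

$$\mathbf{D} = \mathrm{diag}\left(\frac{\Gamma_{11}}{\mathrm{num}_{11}}, \frac{\Gamma_{12}}{\mathrm{num}_{12}}, \ldots, \frac{\Gamma_{K L_K}}{\mathrm{num}_{K L_K}}\right), \tag{2.42}$$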

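$$\begin{aligned} \left[\boldsymbol{\Psi}^{\mathrm{UL}}\right]_{L_r L_c} &= E\left[\left|\sum_{b=1}^{B} \mathbf{u}_{k_r l_r}^{(b)H} \mathbf{H}_{k_c}^{(b)} \mathbf{v}_{k_c l_c}\, e_{k_c k_r l_r}^{(b)}\right|^2\right] \\ &= \sum_{b_1=1}^{B} \sum_{b_2=1}^{B} \gamma_{k_c k_r}^{(b_1,b_2)}\, \mathbf{u}_{k_r l_r}^{(b_1)H} \mathbf{H}_{k_c}^{(b_1)} \mathbf{v}_{k_c l_c} \mathbf{v}_{k_c l_c}^H \mathbf{H}_{k_c}^{(b_2)H} \mathbf{u}_{k_r l_r}^{(b_2)} \end{aligned} \tag{2.43}$$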


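for $1 \le L_r, L_c \le L$ and $L_r \ne L_c$, and

$$\boldsymbol{\sigma}^{\mathrm{UL}} = \sigma^2 \left[\sum_{b=1}^{B} \mathbf{u}_{11}^{(b)H} \mathbf{u}_{11}^{(b)}, \ldots, \sum_{b=1}^{B} \mathbf{u}_{K L_K}^{(b)H} \mathbf{u}_{K L_K}^{(b)}\right]^T = \sigma^2 \mathbf{1}.$$

Again, we have $[\boldsymbol{\Psi}^{\mathrm{UL}}]_{L_r L_c} = 0$ when $L_r = L_c$, and $\mathbf{e}_{kk}^{(b)} = \mathbf{x}_k$. Note that the matrix $\mathbf{D}$ is equal in the downlink and uplink and hence it has no superscript. Similarly, by setting $C^{\mathrm{UL}} = 1$, we get

$$\mathbf{q} = \sigma^2 \left(\mathbf{D}^{-1} - \boldsymbol{\Psi}^{\mathrm{UL}}\right)^{-1} \mathbf{1}. \tag{2.44}$$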

By establishing the above, we know that both the downlink and uplink have the same

achievable SINR region, if free of power constraints. To complete the first step, it is

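required that $\|\mathbf{p}\|_1 = \|\mathbf{q}\|_1$. This can be achieved if $\boldsymbol{\Psi}^{\mathrm{DL}}$ equals $\boldsymbol{\Psi}^{\mathrm{UL}\,T}$. Note that the transpose operation is sufficient since the $\boldsymbol{\Psi}$ matrices have real components. By inspecting Eqns. (2.36) and (2.43), and using the fact that all terms of the form $\mathbf{u}^H\mathbf{H}\mathbf{v}$ or $\mathbf{v}^H\mathbf{H}^H\mathbf{u}$ are scalars, a sufficient condition for this equality to hold is having $E[i_{jkl_1}^{(b_1)} i_{jkl_1}^{(b_2)H}]$ equal to $E[e_{kjl_2}^{(b_1)} e_{kjl_2}^{(b_2)H}]$, i.e.,

$$\beta_{jk}^{(b_1,b_2)} = \gamma_{kj}^{(b_1,b_2)}. \tag{2.45}$$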

Notice that the subscripts k and j in Eqn. (2.45) are switched when going from one side

of the equation to the other, since we are equating ΨDL to a transposed version of ΨUL.
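For completeness, the step from $\boldsymbol{\Psi}^{\mathrm{DL}} = \boldsymbol{\Psi}^{\mathrm{UL}\,T}$ to equal total powers can be written out as a short derivation. It is a sketch that assumes both balancing levels are set to 1, that both power vectors are non-negative so that the 1-norm equals the sum of the entries, and that $\mathbf{D}$ is diagonal (hence symmetric):

$$\begin{aligned} \|\mathbf{p}\|_1 = \mathbf{1}^T \mathbf{p} &= \sigma^2\, \mathbf{1}^T \left(\mathbf{D}^{-1} - \boldsymbol{\Psi}^{\mathrm{UL}\,T}\right)^{-1} \mathbf{1} \\ &= \sigma^2\, \mathbf{1}^T \left[\left(\mathbf{D}^{-1} - \boldsymbol{\Psi}^{\mathrm{UL}}\right)^{-1}\right]^T \mathbf{1} \\ &= \sigma^2 \left[\mathbf{1}^T \left(\mathbf{D}^{-1} - \boldsymbol{\Psi}^{\mathrm{UL}}\right)^{-1} \mathbf{1}\right]^T \\ &= \mathbf{1}^T \mathbf{q} = \|\mathbf{q}\|_1 . \end{aligned}$$

The third line uses the fact that a scalar equals its own transpose.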

In attempting to show Eqn. (2.45), we require the following proposition. Note that

subscripts k and j are also switched in the proposition statement below as we go from

one side of the ‘implies’ statements to the other.

Proposition 1: For both downlink and uplink, $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1 \implies m_{kj}^{(b_2)} = m_{kj}^{(b_1)} - 1$, $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} - 1 \implies m_{kj}^{(b_2)} = m_{kj}^{(b_1)} + 1$, and $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} \implies m_{kj}^{(b_2)} = m_{kj}^{(b_1)}$.

Proof: Let $s_j(m)$ denote the symbol transmitted to user j with time index m. Consider the case when $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1$ in the downlink. Accordingly, the symbol $s_j(m_{jk}^{(b_2)})$ is the same as symbol $s_j(m_{jk}^{(b_1)} + 1)$, but $s_j(m_{jk}^{(b_2)})$ arrives at user k as interference from BS $b_2$ and $s_j(m_{jk}^{(b_1)} + 1)$ arrives at user k as interference from BS $b_1$. Hence, we can write the exact times at which these symbols arrive at user k as follows (from the analysis of the interference timing seen before).

$$t(s_j(m_{jk}^{(b_2)})) = \tau_j^{(b_j)} - \tau_j^{(b_2)} + \tau_k^{(b_2)}$$

$$t(s_j(m_{jk}^{(b_1)} + 1)) = \tau_j^{(b_j)} - \tau_j^{(b_1)} + \tau_k^{(b_1)}$$

Note that $s_j(m_{jk}^{(b_1)} + 1)$ arrives after $s_j(m_{jk}^{(b_2)})$ (otherwise $s_j(m_{jk}^{(b_1)} + 1)$ would have been $s_j(m_{jk}^{(b_1)})$; see Figure 2.3). Therefore, for some constant c, we can write

(1) $t(s_j(m_{jk}^{(b_2)})) - t(s_j(m_{jk}^{(b_1)} + 1)) = c < 0$

(2) $\implies [\tau_j^{(b_j)} - \tau_j^{(b_2)} + \tau_k^{(b_2)}] - [\tau_j^{(b_j)} - \tau_j^{(b_1)} + \tau_k^{(b_1)}] = c < 0$

(3) $\implies [-\tau_j^{(b_2)} + \tau_k^{(b_2)}] - [-\tau_j^{(b_1)} + \tau_k^{(b_1)}] = c < 0$

(4) $\implies [\tau_j^{(b_2)} - \tau_k^{(b_2)}] - [\tau_j^{(b_1)} - \tau_k^{(b_1)}] = -c > 0$

(5) $\implies [\tau_k^{(b_k)} + \tau_j^{(b_2)} - \tau_k^{(b_2)}] - [\tau_k^{(b_k)} + \tau_j^{(b_1)} - \tau_k^{(b_1)}] = -c > 0$

(6) $\implies t(\text{user } k \text{ on } j \text{ by BS } b_2) - t(\text{user } k \text{ on } j \text{ by BS } b_1) = -c > 0$

(7) $\implies m_{kj}^{(b_2)} + 1 = m_{kj}^{(b_1)}$

where

(1) Direct result of $s_j(m_{jk}^{(b_1)} + 1)$ arriving after $s_j(m_{jk}^{(b_2)})$.

(2) Express the times explicitly.

(3) Remove the common term $\tau_j^{(b_j)}$.

(4) Multiply the equation by −1.

(5) Add and subtract the term $\tau_k^{(b_k)}$.

(6) The first three terms can be interpreted as the time that interference from user k arrives at user j due to transmission from BS $b_2$. The last three terms can be interpreted as the time that interference from user k arrives at user j due to transmission from BS $b_1$.

(7) The difference in timing in step (6) is greater than zero. This means that the interference from user k on user j due to transmission from BS $b_2$ arrives after the interference from user k on user j due to transmission from BS $b_1$. Hence, the relation between the indices of the symbols is $m_{kj}^{(b_2)} + 1 = m_{kj}^{(b_1)}$.

Figure 2.3: Indices of interfering symbols of one user from two BSs

The other cases have similar proofs.

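Now, we attempt to derive Eqn. (2.45). Assume that in the downlink $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1$ holds. Therefore, from Eqn. (2.15),

$$\beta_{jk}^{(b_1,b_2)} = \rho(\delta_{jk}^{(b_1)\mathrm{DL}})\,\rho(\delta_{jk}^{(b_2)\mathrm{DL}} - T_S).$$

Now consider the uplink. From Proposition 1, $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1 \implies m_{kj}^{(b_2)} = m_{kj}^{(b_1)} - 1$. Therefore, from Eqn. (2.27),

$$\gamma_{kj}^{(b_1,b_2)} = \rho(\delta_{kj}^{(b_1)\mathrm{UL}})\,\rho(\delta_{kj}^{(b_2)\mathrm{UL}} - T_S).$$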

Again, notice that the subscripts k and j have been switched. Accordingly, it is enough to show that $\delta_{jk}^{(b)\mathrm{DL}} = \delta_{kj}^{(b)\mathrm{UL}}$ to prove that $\beta_{jk}^{(b_1,b_2)} = \gamma_{kj}^{(b_1,b_2)}$. Before proceeding, recall that $\delta$ and $\tau$ are related by a simple modulo operation, as in Eqn. (2.12) and Eqn. (2.24).

From Eqn. (2.12), we have $\tau_{jk}^{(b)\mathrm{DL}} = \tau_k^{(b)} - \tau_k^{(b_k)} - (\tau_j^{(b)} - \tau_j^{(b_j)})$. Using the first method to model the virtual uplink, where all the users transmit simultaneously, we have $\tau_{kj}^{(b)\mathrm{UL}} = \tau_k^{(b)} - \tau_j^{(b)}$ from Eqn. (2.20). In this case, it is not clear that $\delta_{jk}^{(b)\mathrm{DL}} = \delta_{kj}^{(b)\mathrm{UL}}$. One may try setting $\tau_{kj}^{(b)\mathrm{UL}} = \tau_{jk}^{(b)\mathrm{DL}}$ and solving for appropriate virtual uplink delays such that $\delta_{jk}^{(b)\mathrm{DL}} = \delta_{kj}^{(b)\mathrm{UL}}$ holds, but such an attempt yields an over-determined system and no solution is guaranteed. Furthermore, trying to solve for the virtual uplink delays using $\|\mathbf{p}\|_1 = \|\mathbf{q}\|_1$, the initial condition required to establish the first step of a downlink/uplink duality, yields an under-determined system.

On the other hand, using the second method to model the virtual uplink, where all users transmit using time reversal, we have $\tau_{kj}^{(b)\mathrm{UL}} = \tau_k^{(b)} - \tau_k^{(b_k)} - (\tau_j^{(b)} - \tau_j^{(b_j)})$ from Eqn. (2.22), which is equal to $\tau_{jk}^{(b)\mathrm{DL}}$. This result directly gives $\delta_{jk}^{(b)\mathrm{DL}} = \delta_{kj}^{(b)\mathrm{UL}}$, and hence $\beta_{jk}^{(b_1,b_2)} = \gamma_{kj}^{(b_1,b_2)}$.

Finally, we have $E[i_{jkl_1}^{(b_1)} i_{jkl_1}^{(b_2)H}] = E[e_{kjl_2}^{(b_1)} e_{kjl_2}^{(b_2)H}]$, $\boldsymbol{\Psi}^{\mathrm{DL}} = \boldsymbol{\Psi}^{\mathrm{UL}\,T}$, and $\|\mathbf{p}\|_1 = \|\mathbf{q}\|_1$. Therefore, the first step towards duality can be completed provided we model the virtual uplink using time reversal. In other words, we conclude that the time reversal model is the uplink dual of the downlink timing scheme used in Section 2.1.1.
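The timing argument can be checked on a small example. The sketch below compares the downlink offset of Eqn. (2.12) with the two virtual uplink models, Eqn. (2.20) and Eqn. (2.22), with the user subscripts swapped on the uplink side. The delay table (two BSs, two users, delays in units of the symbol period) is hypothetical, and exact fractions are used so that the modulo comparison is not affected by rounding.

```python
from fractions import Fraction

def nearest_bs(tau, user):
    """Index of the BS with the smallest propagation delay to `user`."""
    return min(range(len(tau)), key=lambda b: tau[b][user])

def advance(tau, b, user):
    """Delay of the BS b link to `user` in excess of that user's nearest-BS delay."""
    return tau[b][user] - tau[nearest_bs(tau, user)][user]

def tau_downlink(tau, j, k, b):
    """Eqn. (2.12): offset at user k of BS b's transmission meant for user j."""
    return advance(tau, b, k) - advance(tau, b, j)

def tau_uplink_simultaneous(tau, j, k, b):
    """Eqn. (2.20): offset at BS b of user j's signal relative to user k's."""
    return tau[b][j] - tau[b][k]

def tau_uplink_time_reversal(tau, j, k, b):
    """Eqn. (2.22): same offset when users transmit with time reversal."""
    return advance(tau, b, j) - advance(tau, b, k)

Ts = Fraction(1)
# Hypothetical delays: rows are BSs, columns are users.
tau = [[Fraction(2, 10), Fraction(9, 10)],
       [Fraction(7, 10), Fraction(3, 10)]]

pairs = [(j, k, b) for j in range(2) for k in range(2) if j != k for b in range(2)]

# Downlink delta_jk versus uplink delta_kj (note the swapped user indices).
time_reversal_matches = all(
    tau_downlink(tau, j, k, b) % Ts == tau_uplink_time_reversal(tau, k, j, b) % Ts
    for j, k, b in pairs)
simultaneous_matches = all(
    tau_downlink(tau, j, k, b) % Ts == tau_uplink_simultaneous(tau, k, j, b) % Ts
    for j, k, b in pairs)

print(time_reversal_matches, simultaneous_matches)
```

For this delay table the time-reversal offsets agree with the downlink offsets for every pair, while the simultaneous-transmission offsets do not, which is consistent with the conclusion reached in the text.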

2.2.2 Second Step: Equating MSEs

After establishing the first step to duality, showing that the downlink and uplink MSEs

of a data stream are equal when its downlink and uplink SINRs are equal is a simple

matter and can be done along the lines of the proofs presented in [7, 8]. The details are

provided in Appendix A.

2.3 Single Base Station: Synchronous Interference

Before proceeding, we will examine a special case with only one BS and, hence, syn-

chronous interference. We will review the multiuser linear precoding algorithm presented

in [8], which can be used in this case. It is worth presenting and understanding this al-

gorithm as it will be adjusted to account for asynchronous interference, and it will form

the basis for the hybrid algorithm to be presented in Chapter 3.

Consider the case when only one BS is present. Modeling this can be simply accom-

plished by setting B = 1 in the system model presented earlier. Accordingly, all the

(b) superscripts will be dropped. With only one BS at hand, the MUI experienced by

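the users is no longer asynchronous with their desired data, and hence all the interference time delays $\tau_{jk}^{(b)}$ are set to zero, and $\mathbf{i}_{jk}^{(b)}$ and $\mathbf{e}_{jk}^{(b)}$ both simplify to $\mathbf{x}_j$. Moreover, $\beta_{jk}^{(b_1,b_2)} = \gamma_{kj}^{(b_1,b_2)} = 1$, and duality exists. Consequently, the estimated data vectors in the downlink and virtual uplink can be expressed as follows.

$$\hat{\mathbf{x}}_k^{\mathrm{DL}} = \underbrace{\mathbf{V}_k^H \mathbf{H}_k^H \mathbf{U}_k \sqrt{\mathbf{P}_k}\,\mathbf{x}_k}_{\text{desired signal}} + \underbrace{\mathbf{V}_k^H \mathbf{H}_k^H \sum_{\substack{j=1 \\ j \ne k}}^{K} \mathbf{U}_j \sqrt{\mathbf{P}_j}\,\mathbf{x}_j}_{\text{synchronous multiuser interference}} + \mathbf{V}_k^H \mathbf{n}_k \tag{2.46}$$

$$\hat{\mathbf{x}}_k^{\mathrm{UL}} = \underbrace{\mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k}_{\text{desired signal}} + \underbrace{\sum_{\substack{j=1 \\ j \ne k}}^{K} \mathbf{U}_k^H \mathbf{H}_j \mathbf{V}_j \sqrt{\mathbf{Q}_j}\,\mathbf{x}_j}_{\text{synchronous multiuser interference}} + \mathbf{U}_k^H \mathbf{n} \tag{2.47}$$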

We will consider the SMSE in the virtual uplink and show how it can be minimized.

Then, the derived virtual uplink solution will be transformed into the downlink, using

the downlink/uplink duality [8].

By substituting Eqn. (2.47) into Eqn. (2.5) and expanding it, the error covariance

matrix of user k in the virtual uplink is found to be

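$$\mathbf{E}_k^{\mathrm{UL}} = \mathbf{U}_k^H \mathbf{H}\mathbf{V}\mathbf{Q}\mathbf{V}^H\mathbf{H}^H \mathbf{U}_k + \sigma^2 \mathbf{U}_k^H \mathbf{U}_k + \mathbf{I}_{L_k} - \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k} - \sqrt{\mathbf{Q}_k}\,\mathbf{V}_k^H \mathbf{H}_k^H \mathbf{U}_k, \tag{2.48}$$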

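where $\mathbf{H} = [\mathbf{H}_1, \ldots, \mathbf{H}_K]$, $\mathbf{V} = \mathrm{diag}(\mathbf{V}_1, \ldots, \mathbf{V}_K)$, and $\mathbf{Q} = \mathrm{diag}(\mathbf{Q}_1, \ldots, \mathbf{Q}_K)$ are the uplink global channel, precoding, and power allocation matrices, respectively. Differentiating the trace of Eqn. (2.48) with respect to $\mathbf{U}_k^H$ and setting the result to zero, the optimum uplink MMSE decoding matrix is found to be

$$\mathbf{U}_k^{\mathrm{MMSE}} = \mathbf{J}^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}, \tag{2.49}$$

where

$$\mathbf{J} = \mathbf{H}\mathbf{V}\mathbf{Q}\mathbf{V}^H\mathbf{H}^H + \sigma^2 \mathbf{I}_M. \tag{2.50}$$

The columns of $\mathbf{U}_k^{\mathrm{MMSE}}$ can be normalized to ensure that the power constraint is met.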


Substituting Eqn. (2.49) into Eqn. (2.48), the virtual uplink MMSE error covariance

matrix of user k is

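$$\mathbf{E}_k^{\mathrm{UL,MMSE}} = \mathbf{I}_{L_k} - \sqrt{\mathbf{Q}_k}\,\mathbf{V}_k^H \mathbf{H}_k^H \mathbf{J}^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}. \tag{2.51}$$

Substituting Eqn. (2.51) into Eqn. (2.7), we get

$$\begin{aligned} \mathrm{SMSE} = \sum_{k=1}^{K} \mathrm{tr}\left(\mathbf{E}_k^{\mathrm{UL,MMSE}}\right) &= \sum_{k=1}^{K} L_k - \sum_{k=1}^{K} \mathrm{tr}\left(\sqrt{\mathbf{Q}_k}\,\mathbf{V}_k^H \mathbf{H}_k^H \mathbf{J}^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}\right) \\ &= L - \sum_{k=1}^{K} \mathrm{tr}\left(\mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H \mathbf{J}^{-1}\right) \\ &= L - \mathrm{tr}\left(\mathbf{H}\mathbf{V}\mathbf{Q}\mathbf{V}^H\mathbf{H}^H \mathbf{J}^{-1}\right) \\ &= L - M + \sigma^2\,\mathrm{tr}\left(\mathbf{J}^{-1}\right). \end{aligned} \tag{2.52}$$

The last step uses $\mathbf{H}\mathbf{V}\mathbf{Q}\mathbf{V}^H\mathbf{H}^H = \mathbf{J} - \sigma^2\mathbf{I}_M$ from Eqn. (2.50); the noise variance is a constant factor and does not affect the minimization.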

Note that the SMSE expression in Eqn. (2.52) deals with the uplink exclusively.
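The trace identity can be verified on a toy uplink. The sketch below assumes two single-antenna users with one stream each (so $\mathbf{V}_k = 1$ and $L = 2$) and a BS with $M = 2$ antennas. It compares the SMSE obtained by summing the per-user MMSE errors $1 - q_k \mathbf{h}_k^H \mathbf{J}^{-1} \mathbf{h}_k$ with $L - M + \sigma^2\,\mathrm{tr}(\mathbf{J}^{-1})$, which follows from $\mathbf{J} = \mathbf{H}\mathbf{V}\mathbf{Q}\mathbf{V}^H\mathbf{H}^H + \sigma^2\mathbf{I}_M$. The channel values, powers, and helper names are made up for this example.

```python
def inv2(A):
    """Inverse of a 2x2 (complex) matrix given as nested lists."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad(v, A):
    """Quadratic form v^H A v for a length-2 vector v."""
    return sum(v[r].conjugate() * A[r][c] * v[c]
               for r in range(2) for c in range(2))

# Made-up channels: h[k] is the 2x1 channel vector of user k.
h = [[1 + 0.5j, 0.3 - 0.2j],
     [-0.4 + 0.1j, 0.8 + 0.6j]]
q = [0.7, 1.3]      # uplink powers
sigma2 = 0.5        # noise variance
M, L = 2, 2

# J = sum_k q_k h_k h_k^H + sigma^2 I
J = [[sigma2 * (r == c)
      + sum(q[k] * h[k][r] * h[k][c].conjugate() for k in range(2))
      for c in range(2)] for r in range(2)]
Jinv = inv2(J)

# Per-user MMSE error with the MMSE receiver, then the sum over users.
mse = [1 - q[k] * quad(h[k], Jinv).real for k in range(2)]
smse_direct = sum(mse)

# Closed form in terms of the trace of J^{-1}.
smse_trace = L - M + sigma2 * (Jinv[0][0] + Jinv[1][1]).real

print(smse_direct, smse_trace)
```

Because the noise variance enters only as a constant factor, minimizing the SMSE over the powers reduces to minimizing $\mathrm{tr}(\mathbf{J}^{-1})$.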

2.3.1 Review of the MIMO-SMSE Algorithm

In this section, we briefly review the MIMO-SMSE algorithm proposed in [8] to minimize

the SMSE given in Eqn. (2.52) under a total power constraint for the single-BS case.

The algorithm is detailed in Table 2.1.

The main idea behind the algorithm is to solve the SMSE minimization problem in

the virtual uplink (which proves to be easier than solving it directly in the downlink),

and then using downlink/uplink duality, convert the obtained solution into an equivalent

one in the downlink. The algorithm begins by initializing the uplink precoding matrices

V by the right singular vectors of the channel matrices H and by dividing the available

power equally among all data streams. Next, it iterates between finding uplink precoding

matrices that minimize the SMSE assuming the uplink power allocation is constant [11],

and finding the uplink power that minimizes the SMSE assuming the uplink precoding

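matrices are constant. The latter can be expressed as follows. Note that minimizing the SMSE in Eqn. (2.52) is equivalent to minimizing the term $\mathrm{tr}(\mathbf{J}^{-1})$.

$$\mathbf{q} = \arg\min_{\mathbf{q}}\ \mathrm{tr}\left(\mathbf{J}^{-1}\right), \quad \text{subject to } q_{kj} > 0,\ \|\mathbf{q}\| \le P_{\max} \tag{2.53}$$

Table 2.1: Multiuser MIMO-SMSE Algorithm

Initialization: $\mathbf{V}_k = \mathrm{SVD}(\mathbf{H}_k)$, $\mathbf{Q} = (P_{\max}/L)\,\mathbf{I}$

Iteration:

1. Find virtual uplink precoding vectors, for k = 1 : K, j = 1 : $L_k$

   $\mathbf{v}_{kj} = e_{\max}\left(\mathbf{H}_k^H \mathbf{J}_{kj}^{-2} \mathbf{H}_k,\ \mathbf{I}/q_{kj} + \mathbf{H}_k^H \mathbf{J}_{kj}^{-1} \mathbf{H}_k\right)$

   N.B.: $e_{\max}$ returns the normalized eigenvector with the highest eigenvalue.

2. Find virtual uplink power allocation to minimize SMSE.

   $\mathbf{q} = \arg\min_{\mathbf{q}} \mathrm{tr}(\mathbf{J}^{-1})$, subject to $q_{kj} > 0$, $\|\mathbf{q}\| \le P_{\max}$

3. Repeat steps 1 and 2 until old SMSE − new SMSE < ε

Update:

4. Find downlink precoding matrices and normalize their columns, for k = 1 : K, j = 1 : $L_k$

   $\mathbf{U}_k = \mathbf{J}^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}$, $\mathbf{u}_{kj} = \mathbf{u}_{kj}/\|\mathbf{u}_{kj}\|$

5. Set the target SINRs to the actual SINRs, for k = 1 : K, j = 1 : $L_k$

   $\Gamma_{kj} = \mathrm{SINR}_{kj}^{\mathrm{UL}}$

6. Find the downlink power allocation.

   $\mathbf{p} = \sigma^2 \left(\mathbf{D}^{-1} - \boldsymbol{\Psi}^{\mathrm{UL}\,T}\right)^{-1} \mathbf{1}$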

The power allocation problem in Eqn. (2.53) is convex in q, which makes it relatively easy

to solve using numerical methods, since no closed form solution is known [8]. Finally,

when the relative change in SMSE is below a certain threshold, the obtained uplink

solution is used to derive the downlink precoding matrices and downlink power allocation


that achieve the same SMSE, under the same power constraint. Note that, in Table 2.1, matrices $\mathbf{D}$ and $\boldsymbol{\Psi}^{UL}$ are given by Eqns. (2.42) and (2.43) with $\gamma_{ck}^{(b_1,b_2)}$ set to 1, and the matrix $\mathbf{J}$ is given by Eqn. (2.50). The matrix $\mathbf{J}_{kj}$ has the same expression as in [8]. It is repeated below for convenience.

\[
\mathbf{J}_{kj} = \mathbf{J} - q_{kj}\,\mathbf{H}_k \mathbf{v}_{kj} \mathbf{v}_{kj}^H \mathbf{H}_k^H.
\tag{2.54}
\]
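Since no closed-form solution to Eqn. (2.53) is known, the power allocation step has to be carried out numerically. The sketch below is one possible way to do so and is not the solver used in [8]: it runs a projected gradient descent with backtracking, it assumes the total power constraint is the sum of the stream powers, and it assumes $\mathbf{J} = \mathbf{A}\,\mathrm{diag}(\mathbf{q})\,\mathbf{A}^H + \sigma^2\mathbf{I}$, where the columns of the hypothetical matrix `A` are the effective channels $\mathbf{H}_k \mathbf{v}_{kj}$. The function names are illustrative, and the projection may return powers equal to zero, so the strict constraint $q_{kj} > 0$ is not enforced.

```python
import numpy as np

def project_simplex(v, total):
    """Euclidean projection of v onto {q >= 0, sum(q) = total}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - total
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def trace_J_inv(q, A, sigma2):
    """tr(J^{-1}) with J = A diag(q) A^H + sigma^2 I."""
    J = (A * q) @ A.conj().T + sigma2 * np.eye(A.shape[0])
    return np.trace(np.linalg.inv(J)).real

def allocate_power(A, p_max, sigma2=1.0, iters=200):
    n_streams = A.shape[1]
    q = np.full(n_streams, p_max / n_streams)      # equal-power start, as in Table 2.1
    f = trace_J_inv(q, A, sigma2)
    for _ in range(iters):
        J = (A * q) @ A.conj().T + sigma2 * np.eye(A.shape[0])
        J_inv_A = np.linalg.solve(J, A)
        # d tr(J^{-1}) / d q_i = -a_i^H J^{-2} a_i = -||J^{-1} a_i||^2
        grad = -np.sum(np.abs(J_inv_A) ** 2, axis=0)
        step = 1.0
        while step > 1e-12:                        # backtrack until no increase
            q_new = project_simplex(q - step * grad, p_max)
            f_new = trace_J_inv(q_new, A, sigma2)
            if f_new <= f:
                break
            step *= 0.5
        else:
            break                                  # no acceptable step found
        q, f = q_new, f_new
    return q, f

# Usage with a random effective channel (4 receive dimensions, 4 streams)
rng = np.random.default_rng(1)
A = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
q_opt, f_opt = allocate_power(A, p_max=10.0)
f_equal = trace_J_inv(np.full(4, 2.5), A, 1.0)
```

Because the objective decreases as any stream power grows, the sketch keeps the total power at $P_{max}$ and only redistributes it among streams.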

2.3.2 Multiple Base Stations: Assuming Synchronous Interference

In this section, we show that if all interference were synchronous, under our model, the

multiple BSs act as a single BS with B×M antennas. Assume that it is somehow possible

to ensure that all the transmitted signals from all BSs arrive synchronously at all users.

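All the interference time delays $\tau_{jk}^{(b)}$ become zero, and $\mathbf{i}_{jk}^{(b)}$ and $\mathbf{e}_{jk}^{(b)}$ both simplify to $\mathbf{x}_j$. Moreover, $\beta_{jk}^{(b_1,b_2)} = \gamma_{kj}^{(b_1,b_2)} = 1$, and duality exists. Accordingly, $\hat{\mathbf{x}}_k^{DL}$ in Eqn. (2.13) and $\hat{\mathbf{x}}_k^{UL}$ in Eqn. (2.25) become

\[
\begin{aligned}
\hat{\mathbf{x}}_k^{DL} &= \mathbf{V}_k^H \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{U}_k^{(b)} \sqrt{\mathbf{P}_k}\,\mathbf{x}_k + \mathbf{V}_k^H \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \sum_{\substack{j=1 \\ j \neq k}}^{K} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{V}_k^H \mathbf{n}_k \\
&= \mathbf{V}_k^H \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{U}_k^{(b)} \sqrt{\mathbf{P}_k}\,\mathbf{x}_k + \mathbf{V}_k^H \sum_{\substack{j=1 \\ j \neq k}}^{K} \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{V}_k^H \mathbf{n}_k \\
&= \mathbf{V}_k^H \mathbf{H}_k^H \mathbf{U}_k \sqrt{\mathbf{P}_k}\,\mathbf{x}_k + \mathbf{V}_k^H \sum_{\substack{j=1 \\ j \neq k}}^{K} \mathbf{H}_k^H \mathbf{U}_j \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{V}_k^H \mathbf{n}_k
\end{aligned}
\tag{2.55}
\]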


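\[
\begin{aligned}
\hat{\mathbf{x}}_k^{UL} &= \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{H}_k^{(b)} \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k + \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \sum_{\substack{j=1 \\ j \neq k}}^{K} \mathbf{H}_j^{(b)} \mathbf{V}_j \sqrt{\mathbf{Q}_j}\,\mathbf{x}_j + \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{n}^{(b)} \\
&= \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{H}_k^{(b)} \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k + \sum_{\substack{j=1 \\ j \neq k}}^{K} \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{H}_j^{(b)} \mathbf{V}_j \sqrt{\mathbf{Q}_j}\,\mathbf{x}_j + \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{n}^{(b)} \\
&= \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k + \sum_{\substack{j=1 \\ j \neq k}}^{K} \mathbf{U}_k^H \mathbf{H}_j \mathbf{V}_j \sqrt{\mathbf{Q}_j}\,\mathbf{x}_j + \mathbf{U}_k^H \mathbf{n}
\end{aligned}
\tag{2.56}
\]

where

\[
\begin{aligned}
\mathbf{U}_k &= \left[\mathbf{U}_k^{(1)T} \ \ldots \ \mathbf{U}_k^{(B)T}\right]^T \ (BM \times L_k), \qquad
\mathbf{H}_k = \left[\mathbf{H}_k^{(1)T} \ \mathbf{H}_k^{(2)T} \ \ldots \ \mathbf{H}_k^{(B)T}\right]^T \ (BM \times N_k), \\
\text{and } \mathbf{n} &= \left[\mathbf{n}^{(1)T} \ \ldots \ \mathbf{n}^{(B)T}\right]^T \ (BM \times 1).
\end{aligned}
\tag{2.57}
\]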

Comparing Eqns. (2.55) and (2.56) to Eqns. (2.46) and (2.47), we notice that the case

with multiple BSs and synchronous interference is equivalent to a case with one BS that

is a ‘super-BS’ with BM antennas. In that case, the algorithm of Table 2.1 can be used to

find the precoding and decoding matrices and power allocation that minimize the SMSE.

However, as mentioned previously in Section 2.1.1, in a system with multiple cooperating

BSs, asynchronous interference is inevitable, and the above simplifications are not valid.

Therefore, it is important to consider asynchronous interference explicitly when dealing

with multi-BS systems.
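The 'super-BS' equivalence rests on the stacking in Eqn. (2.57): a sum over BSs of per-BS products equals a single product of the stacked matrices. A minimal numerical sketch of this step follows; the sizes, random matrices, and the helper `crandn` are hypothetical and only illustrate the algebra.

```python
import numpy as np

rng = np.random.default_rng(2)
B, M, N_k, L_k = 2, 4, 2, 1     # hypothetical sizes

def crandn(*shape):
    """Complex Gaussian matrix (illustrative helper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_b = [crandn(M, N_k) for _ in range(B)]   # per-BS channels H_k^(b) (M x N_k)
U_b = [crandn(M, L_k) for _ in range(B)]   # per-BS precoders U_k^(b) (M x L_k)

# Per-BS form used in the first lines of Eqn. (2.55): sum_b H_k^(b)H U_k^(b)
per_bs = sum(Hb.conj().T @ Ub for Hb, Ub in zip(H_b, U_b))

# Stacked form of Eqn. (2.57): H_k is BM x N_k and U_k is BM x L_k
H_k = np.vstack(H_b)
U_k = np.vstack(U_b)
stacked = H_k.conj().T @ U_k
```

The two results coincide, so with synchronous reception the B cooperating BSs behave like one transmitter with BM antennas.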

2.4 Minimizing the SMSE with Asynchronous Interference

With duality shown to exist even when asynchronous interference is present, we will

proceed to extend the MIMO-SMSE algorithm of Table 2.1 to account for asynchronism.

First, we consider how the SMSE expression changes. Recall that in the uplink, the


estimated data vector is given by Eqn. (2.25). Accordingly, the MSE error matrix for

user k in the uplink can be expressed as follows.

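\[
\begin{aligned}
\mathbf{E}_k^{UL} &= E\left[(\hat{\mathbf{x}}_k - \mathbf{x}_k)(\hat{\mathbf{x}}_k - \mathbf{x}_k)^H\right] \\
&= \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H \mathbf{U}_k + \mathbf{U}_k^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{A}_{ck} \mathbf{U}_k + \sigma^2 \mathbf{U}_k^H \mathbf{U}_k \\
&\quad - \sqrt{\mathbf{Q}_k}\,\mathbf{V}_k^H \mathbf{H}_k^H \mathbf{U}_k - \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k} + \mathbf{I}_{L_k},
\end{aligned}
\tag{2.58}
\]

where

\[
\mathbf{U}_k = \left[\mathbf{U}_k^{(1)T} \ \ldots \ \mathbf{U}_k^{(B)T}\right]^T, \qquad
\mathbf{A}_{ck} = \left[\mathbf{A}_{ck}^{(1)T} \ \ldots \ \mathbf{A}_{ck}^{(B)T}\right]^T,
\tag{2.59}
\]

\[
\mathbf{A}_{ck}^{(b)} = \mathbf{H}_c^{(b)} \mathbf{V}_c \mathbf{Q}_c \mathbf{V}_c^H \mathbf{H}_c^H \boldsymbol{\Gamma}_{ck}^{(b)}, \quad \text{and} \quad
\boldsymbol{\Gamma}_{ck}^{(b)} = \mathrm{diag}\left[\gamma_{ck}^{(b,1)} \mathbf{I}_M, \ldots, \gamma_{ck}^{(b,B)} \mathbf{I}_M\right].
\tag{2.60}
\]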

Keeping all other matrices constant, the optimal solution for matrix Uk is the Wiener

solution:

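\[
\mathbf{U}_k^{MMSE} = \left(\mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H + \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{A}_{ck} + \sigma^2 \mathbf{I}_{N_k}\right)^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}.
\tag{2.61}
\]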

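The columns of $\mathbf{U}_k^{MMSE}$ can be normalized to ensure that the power constraint is met. Substituting this optimal $\mathbf{U}_k$ back into Eqn. (2.58), we get the following expression for $\mathbf{E}_k^{UL,MMSE}$:

\[
\mathbf{E}_k^{UL,MMSE} = \mathbf{I}_{L_k} - \sqrt{\mathbf{Q}_k}\,\mathbf{V}_k^H \mathbf{H}_k^H \mathbf{J}_k^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k},
\tag{2.62}
\]

where

\[
\mathbf{J}_k = \mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H + \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{A}_{ck} + \sigma^2 \mathbf{I}.
\tag{2.63}
\]

Note that, unlike in Eqn. (2.50), the matrix $\mathbf{J}_k$ cannot be made independent of user k since $\mathbf{A}_{ck}$ is not common to all users. With the expression of the MMSE error covariance matrix of user k available, the SMSE can be expressed as: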

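\[
\begin{aligned}
\mathrm{SMSE} &= \sum_{k=1}^{K} \mathrm{tr}\left(\mathbf{E}_k^{UL,MMSE}\right) = \sum_{k=1}^{K} L_k - \sum_{k=1}^{K} \mathrm{tr}\left(\sqrt{\mathbf{Q}_k}\,\mathbf{V}_k^H \mathbf{H}_k^H \mathbf{J}_k^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}\right) \\
&= L - \sum_{k=1}^{K} \mathrm{tr}\left(\mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H \mathbf{J}_k^{-1}\right).
\end{aligned}
\tag{2.64}
\]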
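The Wiener step can be illustrated numerically. In the sketch below, which is only an illustration under stated assumptions, the sum of the asynchronous interference matrices $\mathbf{A}_{ck}$ is replaced by a generic Hermitian positive semidefinite stand-in `A_sum`, the identity is taken over the BM receive dimensions, and all sizes and the helper `crandn` are hypothetical. It checks that the filter of Eqn. (2.61) reproduces the error matrix of Eqn. (2.62) and that perturbing the filter increases the MSE.

```python
import numpy as np

rng = np.random.default_rng(3)
BM, N_k, L_k = 8, 2, 2          # hypothetical sizes (B*M receive dimensions)
sigma2 = 1.0

def crandn(*shape):
    """Complex Gaussian matrix (illustrative helper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_k, V_k = crandn(BM, N_k), crandn(N_k, L_k)
Q_k = np.diag([2.0, 3.0])
G = crandn(BM, BM)
A_sum = G @ G.conj().T           # stand-in for sum_{c != k} A_ck (assumed Hermitian PSD)

B_k = H_k @ V_k @ np.sqrt(Q_k)   # H_k V_k sqrt(Q_k)
J_k = B_k @ B_k.conj().T + A_sum + sigma2 * np.eye(BM)   # Eqn. (2.63)

def mse_matrix(U):
    """Error matrix of Eqn. (2.58) for a given receive filter U."""
    return U.conj().T @ J_k @ U - B_k.conj().T @ U - U.conj().T @ B_k + np.eye(L_k)

U_mmse = np.linalg.solve(J_k, B_k)                   # Eqn. (2.61)
E_closed = np.eye(L_k) - B_k.conj().T @ U_mmse       # Eqn. (2.62)
mse_opt = np.trace(mse_matrix(U_mmse)).real
mse_perturbed = np.trace(mse_matrix(U_mmse + 0.1 * crandn(BM, L_k))).real
```

Because $\mathbf{J}_k$ differs from user to user, this computation has to be repeated per user, which is the reason the sum in Eqn. (2.64) does not collapse into a single trace.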

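Initialization: $\mathbf{V}_k = \mathrm{SVD}(\mathbf{H}_k)$, $\mathbf{Q} = (P_{max}/L)\,\mathbf{I}$

Iteration:

1. Find virtual uplink power allocation to minimize SMSE.

   $\mathbf{q} = \arg\min_{\mathbf{q}} \mathrm{SMSE}$, subject to $q_{kj} > 0$, $\|\mathbf{q}\| \le P_{max}$

2. Find downlink precoding matrices and normalize their columns, for $k = 1 : K$, $j = 1 : L_k$

   $\mathbf{U}_k = \mathbf{J}_k^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}$, $\mathbf{u}_{kj} = \mathbf{u}_{kj}/\|\mathbf{u}_{kj}\|$

3. Set the target SINRs to the actual SINRs, for $k = 1 : K$, $j = 1 : L_k$

   $\Gamma_{kj} = \mathrm{SINR}_{kj}^{UL}$

4. Find the downlink power allocation.

   $\mathbf{p} = \sigma^2 \left(\mathbf{D}^{-1} - \boldsymbol{\Psi}^{UL\,T}\right)^{-1} \mathbf{1}$

5. Find uplink precoding matrices and normalize their columns, for $k = 1 : K$, $j = 1 : L_k$

   $\mathbf{V}_k = \mathbf{G}_k^{-1} \mathbf{H}_k^H \mathbf{U}_k \sqrt{\mathbf{P}_k}$, $\mathbf{v}_{kj} = \mathbf{v}_{kj}/\|\mathbf{v}_{kj}\|$

6. Repeat steps 1 to 5 until old SMSE − new SMSE < $\epsilon$

Table 2.2: Multiuser multi-BS algorithm for asynchronous interference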

The above result has two major implications. First, the presence of Jk in Eqn. (2.64)

prevents the simplification of the SMSE into a form similar to Eqn. (2.52). This implies


that the uplink precoding vectors, $\mathbf{v}_{kj}$, cannot be found as shown in Table 2.1 [11]. Consequently, minimizing the SMSE cannot be completely performed in the uplink. That is, the iteration needs to constantly alternate between the uplink and downlink. Explicitly, the uplink power allocation that minimizes the SMSE is found first. Next, the downlink precoding matrices and the downlink power allocation are derived. Afterwards, the uplink precoding matrices that minimize the downlink SMSE are found using the optimal Wiener solution. Again, this iteration repeats until the relative change in SMSE is below a certain threshold. In a similar derivation to that of $\mathbf{U}_k^{MMSE}$, one can easily show that the matrix $\mathbf{V}_k$ is the Wiener solution in the downlink:

\[
\mathbf{V}_k^{MMSE} = \mathbf{G}_k^{-1} \mathbf{H}_k^H \mathbf{U}_k \sqrt{\mathbf{P}_k},
\tag{2.65}
\]

where

\[
\mathbf{G}_k = \mathbf{H}_k^H \mathbf{U}_k \mathbf{P}_k \mathbf{U}_k^H \mathbf{H}_k + \mathbf{H}_k^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{B}_{ck} \mathbf{H}_k + \sigma^2 \mathbf{I}_{N_k},
\tag{2.66}
\]

\[
\mathbf{B}_{ck} = \left[\mathbf{B}_{ck}^{(1)T} \ \ldots \ \mathbf{B}_{ck}^{(B)T}\right]^T, \qquad
\mathbf{B}_{ck}^{(b)} = \mathbf{U}_c^{(b)} \mathbf{P}_c \mathbf{U}_c^H \boldsymbol{\beta}_{ck}^{(b)},
\tag{2.67}
\]

and

\[
\boldsymbol{\beta}_{ck}^{(b)} = \mathrm{diag}\left[\beta_{ck}^{(b,1)} \mathbf{I}_M, \ldots, \beta_{ck}^{(b,B)} \mathbf{I}_M\right].
\tag{2.68}
\]

Again, the columns of $\mathbf{V}_k^{MMSE}$ can be normalized to ensure that the power constraint is met.

The second implication of the new SMSE expression is on the convexity of the uplink

power allocation problem. While it is relatively easy to show that the power allocation

problem in Eqn. (2.53) is convex in q, the vector of uplink powers [8], the same approach

cannot be applied to Eqn. (2.64). It is only simulations that suggest that convexity in

the uplink powers still holds. These simulations will be presented in Section 2.5.

Finally, we present the multiuser multi-BS SMSE minimization algorithm in Table 2.2. Note that matrices $\mathbf{D}$ and $\boldsymbol{\Psi}^{UL}$ are given by Eqn. (2.42) and Eqn. (2.43), wherein $\gamma_{ck}^{(b_1,b_2)}$ takes its value according to Eqn. (2.27).

2.5 Simulation Results

This section presents the results of simulations to illustrate the efficacy of the algorithm

proposed in Table 2.2. The values used for all parameters are found in Table 2.3.

Parameter                               Symbol    Value
Number of BSs                           B         2
Number of transmit antennas per BS      M         4
Number of users                         K         4
Number of receive antennas per user     N_k       1
Number of data streams per user         L_k       1
AWGN average power                      σ²        1
Signal-to-noise ratio                   SNR       Pmax/σ²
Symbol period                           T_S       1 µs
Pulse shaping filter                    g(t)      Rectangular

Table 2.3: Parameters for simulations

When path loss is taken into consideration, it is assumed that the power of the signal

is proportional to the inverse of the distance raised to a constant path loss exponent.

Here, the path loss exponent is set to 3.5. The 2 BSs are placed 500 meters apart, which

makes the cell radius 250 meters. The minimum distance of any user to a BS was set to

150 meters. The K users are uniformly distributed in the rectangular area of width 200 meters and height $500/\sqrt{3}$ meters centered between the 2 BSs. The path loss of each

user was normalized to the path loss experienced 150 meters away from a BS.
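The user drop and path loss normalization described above can be sketched as follows. This is an illustration under stated assumptions rather than the simulation code of the thesis: the BS coordinates, the orientation of the rectangle (its 200-meter width is taken along the line joining the two BSs, which is what yields the 150-meter minimum distance), and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
K, d_bs, d_ref, alpha = 4, 500.0, 150.0, 3.5   # users, BS spacing, reference distance, exponent
bs_xy = np.array([[-d_bs / 2, 0.0], [d_bs / 2, 0.0]])   # assumed BS coordinates (meters)

width, height = 200.0, d_bs / np.sqrt(3)
users_xy = np.column_stack([
    rng.uniform(-width / 2, width / 2, K),     # along the BS-to-BS axis (assumption)
    rng.uniform(-height / 2, height / 2, K),   # perpendicular to that axis
])

# dist[k, b] = distance from user k to BS b
dist = np.linalg.norm(users_xy[:, None, :] - bs_xy[None, :, :], axis=2)

# Received power is proportional to d^(-alpha), normalized to its value at d_ref
gain = (dist / d_ref) ** (-alpha)
```

With this normalization, every user sees a power gain of at most 1 towards either BS, and a user exactly 150 meters from a BS would see a gain of 1.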

Figure 2.4 shows the results obtained when the proposed algorithm is used. Note that

path loss is present in these simulations. We have shown in section 2.1.2 that when the

first method is used to model the uplink, duality does not exist. In the first simulation,

we ignore the fact that duality does not exist and derive the needed coding matrices

and power allocation using the uplink delays as given by Eqn. (2.20). As a result, the

downlink power allocation was found not to satisfy the power constraint all the time and

in such cases it was linearly scaled to meet the constraint. In that case we can see that


the error rate hits a floor before worsening. When the second method is used to model

the virtual uplink, duality exists, and the performance improves. The error rate behaves

as expected and achieves very low values at high SNRs.

[Plot: BER versus 10 log(Pmax/σ²) for B = 2, K = 4, M = 4, Nk = 1, Lk = 1, cell radius = 250. Curves: virtual uplink using first method; virtual uplink using second method.]

Figure 2.4: Linear precoding with and without asynchronous interference, with path loss

In order to evaluate the proposed algorithm further, we attempt to compare it to

the case where a super-BS with B ×M antennas is used and synchronous interference is

experienced. However, when one BS is present, it is often assumed that the channels to

all users have the same power on average, an assumption that the path loss model does

not conform with. Accordingly, we present the results of another simulation where path

loss is absent. Figure 2.5 shows the obtained results. Again, when duality is assumed to

exist, but does not, the error rate hits a floor and worsens. When the second method is

used, the performance is close to the super-BS case with synchronous interference. The

gap between the curves demonstrates the loss due to asynchronous interference.

[Plot: BER versus 10 log(Pmax/σ²) for B = 2, M = 4, K = 4, Nk = 1, Lk = 1. Curves: virtual uplink using first method; assuming synchronous interference; virtual uplink using second method.]

Figure 2.5: Linear precoding with and without asynchronous interference, no path loss

The results of the simulations which suggest that the SMSE given by Eqn. (2.64) is convex in the uplink powers are shown in Figure 2.6. The simulated scenario has B = 2, K = 3, M = 3, Nk = 2, and Lk = 1. The power of the third user q3 is set to 4 different values, and for each value the SMSE is plotted versus the powers of the first two users, q1 and q2. The 3D plots suggest that the SMSE function is indeed convex.

2.6 MIMO-OFDM and Multiple Base Stations

With an understanding of asynchronous interference experienced in multi-BS, single-

carrier systems, we will briefly discuss in this section how OFDM might be introduced

into a multi-BS system. Moreover, we will state some of the difficulties that can arise,

which will motivate the work presented in Chapter 3.

[Plot: four 3D panels of SMSE versus q1 and q2, for q3 = 3, q3 = 6, q3 = 9, and q3 = 12.]

Figure 2.6: SMSE versus power of three users

We have seen that in a single-BS, single-carrier system, the MIMO-SMSE algorithm of [8] provides a practical solution to the multiuser MIMO downlink problem. When the

system is changed into a single-BS, OFDM (multi-carrier) system with a higher number

of users, it is possible to extend the previous solution, by using a user selection algorithm

to allocate users to each subcarrier (an example is presented in Section 4.2 in detail), and

executing this algorithm for each subcarrier. When the system is changed into a multi-

BS, single-carrier system, asynchronous interference arises. However, with the proper

choice of the virtual uplink model, duality exists, and a modified algorithm can be used.

In what follows, we will discuss the remaining case: a multi-BS, OFDM system.

As we saw previously in Section 2.1.1, in a multi-BS, single-carrier system, a certain

BS would advance or delay its transmission to a certain user to ensure that its signal

arrives at the user synchronously with all other signals from all other BSs. As a reminder,

a MIMO-OFDM transmitter has as many IFFT modules as there are antennas, since all

data for all users is applied at the input of the IFFT modules and processed in one

go. In a multi-BS case, a certain BS might be required to transmit to different users

separately at different times that differ by non-integer multiples of the symbol period

(by applying the desired user’s data at the appropriate inputs of the IFFT modules,

and zeros elsewhere). Accordingly, a BS might require a set of IFFT modules for every


user in the system, which is obviously impractical given the high number of users. One

might attempt performing the IFFT operation on all the users’ data all at once and then

perform time advances and delays on parts of the result, but it is not possible to extract

each user’s signal component (as if it were processed alone as described above) from the

single result obtained. Moreover, transmitting all the users’ data to each user, and having

each user extract only its data, would require an impractical amount of power.

Despite the infeasibility of the above cases, assume it is possible to transmit to each

user its own data using OFDM from multiple BSs and have the transmissions arrive

at the user synchronously. The interference will be inevitably asynchronous, and will be

experienced by the output symbols of the IFFT module and not the original data symbols.

Accordingly, the algorithm presented in this chapter may not be directly applied. One

might attempt to study the effect of such asynchronism on the original data symbols

and precode against that, but this serves to complicate matters further. Consequently,

in the next chapter we propose that cooperative transmission be used only for users that

are located in a region that is almost equally distant from the cooperating BSs, i.e., at

the edge, which limits the degree of asynchronism. This facilitates extending single-BS,

single-carrier algorithms to multi-BS, OFDM systems. Moreover, in order to help those

users further, we explore a new approach based on a combination of linear and nonlinear

precoding to battle MUI, and study how it performs compared to linear precoding.

2.7 Summary

In this chapter, we provided a downlink system model for a MIMO cellular system with

multiple BSs. We also proposed a virtual uplink model that would preserve the useful

downlink/uplink duality, which otherwise does not exist if the virtual uplink is modeled

in a simpler manner. We saw how this comprehensive model simplifies into the single-

BS model in [8], and hence reviewed the iterative multiuser MIMO linear precoding

algorithm presented there. Consequently, we proposed a revised model that accounts for

multiple BSs and asynchronous interference and simulated the results of the new scheme.

This chapter ended with motivating the search for an alternative approach for multi-user,

multi-BS, MIMO-OFDM systems.

Chapter 3

The Hybrid Algorithm

In the previous chapter, we briefly discussed the difficulties associated with using OFDM

in a multiuser, multi-BS scenario due to the presence of asynchronous interference. In this

chapter, we propose a scheme in which two BSs cooperate only for a group of users located

near the common boundary of their two cells. With limited asynchronous reception in

that region, we propose to use joint linear precoding as a method of BS cooperation for

those users. To help those users further, we propose to use nonlinear precoding to guard

them against the interference that they receive from users that are deeper inside the cell

and communicate with only one BS using linear precoding. With this use of linear and

nonlinear precoding, we formulate a ‘hybrid algorithm’, which minimizes the SMSE of

the whole system, and provide the results of simulations that evaluate its performance.

3.1 System Model

The system models of the multiuser downlink and the multiuser virtual uplink in this

chapter are similar to those in Chapter 2. Here, the K uniformly distributed users are

divided into three sets according to their locations, namely edge users (Ωe, |Ωe| = Ke),

intra-cell users of BS 1 (Ω1, |Ω1| = K1), and intra-cell users of BS 2 (Ω2, |Ω2| = K2). We

have K1 + K2 + Ke = K. As shown in Figure 3.1, edge users are users located in a band along the border between the regions of the two BSs¹. The definitions of M, N, Nk, L, and Lk remain unchanged.

Figure 3.1: System model for the hybrid algorithm

The channels between a BS and its users (which include its intra-cell users and the

edge users) are also modeled as flat Rayleigh fading with unit average power. The path

loss exponent is chosen to suit the simulated environment. Moreover, it is assumed that

the channels between a BS and the intra-cell users of the other BS (referred to as the

cross channels from here on) have zero power, a reasonable assumption since path loss

over a large distance can attenuate a signal severely to the extent that it can be perceived

as noise.

In this chapter, $\mathbf{H}_k$ represents the channel matrix between user k and the BSs it can communicate with. Accordingly, for intra-cell users, this matrix is of size $M \times N_k$, and for edge users, it is of size $2M \times N_k$. Note that $\mathbf{H}_k$ for edge users is the vertical concatenation of $\mathbf{H}_k^{(1)}$ and $\mathbf{H}_k^{(2)}$, respectively the $M \times N_k$ channel matrices from BS 1 and 2. We will use $\mathbf{H}_{in}^{(1)} = [\mathbf{H}_{1_1}, \mathbf{H}_{1_2}, \ldots, \mathbf{H}_{1_{K_1}}]$, $\mathbf{H}_e = [\mathbf{H}_{e_1}, \mathbf{H}_{e_2}, \ldots, \mathbf{H}_{e_{K_e}}]$, and $\mathbf{H}_{in}^{(2)} = [\mathbf{H}_{2_1}, \mathbf{H}_{2_2}, \ldots, \mathbf{H}_{2_{K_2}}]$ to denote the global channel matrices of all intra-cell users of BS 1, edge users, and intra-cell users of BS 2, respectively, where $1_1, \ldots, 1_{K_1}$ are the indices of the intra-cell users of BS 1, $e_1, \ldots, e_{K_e}$ are those of the edge users, and $2_1, \ldots, 2_{K_2}$ are those of the intra-cell users of BS 2. The virtual uplink uses the same channel matrices with the dimensions reversed and components conjugated (Hermitian operation). Finally, even though the signals of the edge users cannot be completely synchronized (whether in the downlink or uplink), it can be shown that the asynchronism between the pulse shapes of two symbols arriving from the two BSs is limited by the 40-meter wide edge band to a maximum misalignment of 11.5% assuming the system parameters mentioned above. The AWGN vectors $\mathbf{n}_k$ and $\mathbf{n}^{(b)}$ remain unchanged.

¹ In this chapter, the edge users are assumed to lie in a band that is 40 meters wide, while the BSs are located 500 meters apart. The numbers used here are for illustrative purposes only and may be changed depending on the acceptable tolerance on asynchronous reception.

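Let $\mathbf{d}_k$ ($L_k \times 1$) be the column data vector containing user k's data streams. Similarly, $E[\mathbf{d}_k \mathbf{d}_j^H] = \mathbf{0}, \ \forall j \neq k$, and $E[\mathbf{d}_k \mathbf{d}_k^H] = \mathbf{I}_{L_k}$. Let $\mathbf{x}_k$ ($L_k \times 1$) be a modified version of $\mathbf{d}_k$ that is transmitted. The need for $\mathbf{x}_k$ and how it differs from $\mathbf{d}_k$ will become clear later. Note that we will use $\mathbf{d}_{in}^{(1)}$, $\mathbf{d}_e$, and $\mathbf{d}_{in}^{(2)}$ to denote the global column data vectors of all intra-cell users of BS 1, all edge users, and all intra-cell users of BS 2, respectively. Moreover, we will use $\mathbf{x}_{in}^{(1)}$, $\mathbf{x}_e$, and $\mathbf{x}_{in}^{(2)}$ to denote the global modified versions of $\mathbf{d}_{in}^{(1)}$, $\mathbf{d}_e$, and $\mathbf{d}_{in}^{(2)}$, respectively. The estimate of $\mathbf{x}_k$ at a receiver is similarly denoted by $\hat{\mathbf{x}}_k$. When needed, further processing can be done on $\hat{\mathbf{x}}_k$ to produce $\hat{\mathbf{d}}_k$, an estimate of $\mathbf{d}_k$.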

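The definitions of matrices $\mathbf{P}_k$, $\mathbf{Q}_k$, $\mathbf{U}_k$, and $\mathbf{V}_k$ remain unchanged. Note that matrix $\mathbf{U}_k$ is of size $M \times L_k$ for intra-cell users and of size $2M \times L_k$ for edge users. That is, for an edge user, $\mathbf{U}_k = [\mathbf{U}_k^{(1)T}, \mathbf{U}_k^{(2)T}]^T$, where $\mathbf{U}_k^{(1)}$ and $\mathbf{U}_k^{(2)}$ are the $M \times L_k$ precoding matrices that BS 1 and BS 2 use for that edge user, respectively. We will use $\mathbf{P}_{in}^{(1)} = \mathrm{diag}(\mathbf{P}_{in,1_1}, \mathbf{P}_{in,1_2}, \ldots, \mathbf{P}_{in,1_{K_1}})$, $\mathbf{P}_e = \mathrm{diag}(\mathbf{P}_{e_1}, \mathbf{P}_{e_2}, \ldots, \mathbf{P}_{e_{K_e}})$, and $\mathbf{P}_{in}^{(2)} = \mathrm{diag}(\mathbf{P}_{in,2_1}, \mathbf{P}_{in,2_2}, \ldots, \mathbf{P}_{in,2_{K_2}})$ to denote the global downlink power matrices of all intra-cell users of BS 1, edge users, and intra-cell users of BS 2, respectively. The same applies to $\mathbf{Q}_{in}^{(1)}$, $\mathbf{Q}_e$, $\mathbf{Q}_{in}^{(2)}$, $\mathbf{V}_{in}^{(1)}$, $\mathbf{V}_e$, and $\mathbf{V}_{in}^{(2)}$. We will use $\mathbf{U}_{in}^{(1)} = [\mathbf{U}_{1_1}, \mathbf{U}_{1_2}, \ldots, \mathbf{U}_{1_{K_1}}]$, $\mathbf{U}_e = [\mathbf{U}_{e_1}, \mathbf{U}_{e_2}, \ldots, \mathbf{U}_{e_{K_e}}]$, and $\mathbf{U}_{in}^{(2)} = [\mathbf{U}_{2_1}, \mathbf{U}_{2_2}, \ldots, \mathbf{U}_{2_{K_2}}]$ to denote the global coding matrices of all intra-cell users of BS 1, edge users, and intra-cell users of BS 2, respectively.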

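According to the above system model, the expressions of the estimated $\mathbf{x}_k$ in the downlink and the uplink for intra-cell users are as follows, for b = 1, 2.

\[
\hat{\mathbf{x}}_k^{DL} = \mathbf{V}_k^H \mathbf{H}_k^{(b)H} \sum_{j \in \Omega_b, \Omega_e} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{V}_k^H \mathbf{n}_k,
\tag{3.1}
\]

\[
\hat{\mathbf{x}}_k^{UL} = \mathbf{U}_k^H \sum_{j \in \Omega_b, \Omega_e} \mathbf{H}_j^{(b)} \mathbf{V}_j \sqrt{\mathbf{Q}_j}\,\mathbf{x}_j + \mathbf{U}_k^H \mathbf{n}^{(b)}.
\tag{3.2}
\]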

The summations in Eqns. (3.1) and (3.2) contain terms that correspond to a user’s

intended signal, as well as MUI from other intra-cell and edge users. The expressions for

edge users will be similar, except that, due to interference pre-subtraction, there will be

no MUI from intra-cell users (only from other edge users). This will be explained in the

next section, after which the relevant expressions will be presented.

3.2 The Hybrid Algorithm

The main goal of the algorithm is to help the edge users since, without pre-subtraction,

they would receive interference from both BSs. Furthermore, this interference would

be asynchronous, reopening some of the issues previously discussed. In addition, this

algorithm can be easily extended to OFDM, in order to support the large number of

users that are present in a realistic scenario.

The edge users are helped in two ways: first, the two base stations cooperate and

jointly precode the transmissions to the edge users; second the interference from intra-

cell users is pre-subtracted and hence the overall interference level is significantly reduced.

Since the edge users are approximately equidistant from the BSs, the received signals are

essentially synchronous and the two BSs appear to the edge users as one super-BS with

2M antennas. The interference pre-subtraction happens at the BSs before they transmit

to the edge users. This is possible because it is assumed that complete CSI is known at

the BSs. Accordingly, edge users can be treated as a separate group for which precoding

and decoding matrices to battle the remaining MUI can be derived using any single-BS

multiuser MIMO algorithm, such as the one proposed in [8].


As for the intra-cell users of a certain BS, the MUI they see comes partly from the

transmission to other intra-cell users of the same BS and partly from the transmission

to the edge users. It is assumed that intra-cell users whiten the MUI from edge users.

Accordingly, each group of intra-cell users can also be treated as a separate group for

which precoding and decoding matrices to battle the remaining MUI can be derived using

any single-BS multiuser MIMO algorithm.

3.2.1 Interference Pre-subtraction

The work in this section is based on nonlinear THP and its matrix form [26, 30, 34]. The main idea behind THP is to subtract a linearly modified (by matrix $\mathbf{G}_e$) version of $\mathbf{x}_{in} = [\mathbf{x}_{in}^{(1)T}, \mathbf{x}_{in}^{(2)T}]^T$ from $\mathbf{d}_e$ to form $\mathbf{x}_e$, such that once $\mathbf{x}_e$ is transmitted and received by the edge user, the combined effect of the precoding, channel, decoding, and interference restores $\mathbf{d}_e$.

The one drawback with this approach is that this pre-subtraction can generate a mod-

ified data vector xe whose components are situated further away from the origin than the

components of de, and hence xe can have more power than de. To alleviate this problem,

the real binary phase shift keying (BPSK) constellation assumed in this work can be

extended on the real axis to form the constellation with points ..., −5, −3, −1, 1, 3, 5, .... Other constellations have similar extensions.

Note that the new constellation points can be mapped evenly to the original two

points in the BPSK constellation by a simple modulo-2Mconst operation, where Mconst

is the distance between the original two constellation points (Mconst = 2 in our case).

Consequently, one can map the components of de to points in the extended constellation

such that, after the pre-subtraction step, the components of xe lie within the interval

[−Mconst,Mconst), which helps reduce the power of xe. It can be easily seen that a similar

result can be achieved by pre-subtracting the interference from the original de and then

performing the modulo-2Mconst operation. Accordingly, xe can be expressed as follows.

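\[
\mathbf{x}_e = (\mathbf{d}_e - \mathbf{G}_e \mathbf{x}_{in}) \bmod 2M_{const} = \mathbf{d}_e + 2M_{const}\,\mathbf{q}_e - \mathbf{G}_e \mathbf{x}_{in} = \mathbf{v}_e - \mathbf{G}_e \mathbf{x}_{in}.
\tag{3.3}
\]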

The vector qe is a vector of integers that indicate how many times 2Mconst had to be added


or subtracted from the components of de to reach appropriate points in the extended

constellation, which are now represented by ve.
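The modulo step of Eqn. (3.3) can be sketched for real-valued BPSK as follows. This is a minimal illustration, not the thesis implementation: the function name `mod_2m`, the random stand-in for the real part of $\mathbf{G}_e \mathbf{x}_{in}$, and the noiseless, unit-gain receiver are all assumptions made for the example.

```python
import numpy as np

M_CONST = 2.0    # distance between the two BPSK points -1 and +1

def mod_2m(x, m=M_CONST):
    """Map real values into [-m, m) by adding integer multiples of 2m."""
    return np.mod(x + m, 2 * m) - m

rng = np.random.default_rng(5)
d_e = rng.choice([-1.0, 1.0], size=8)          # BPSK data of the edge users
interference = 3.0 * rng.standard_normal(8)    # stand-in for the real part of G_e x_in

x_e = mod_2m(d_e - interference)               # transmitted symbols, Eqn. (3.3)

# Integer shifts q_e such that x_e = d_e + 2 M_const q_e - interference
q_e = np.round((x_e - (d_e - interference)) / (2 * M_CONST))

# Idealized receiver: the interference adds back, then the same modulo is applied
d_hat = mod_2m(x_e + interference)
```

The transmitted symbols stay inside $[-M_{const}, M_{const})$ regardless of how large the pre-subtracted interference is, and the receiver recovers the data without knowing the integer shifts.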

As will be seen next, the matrix Ge can contain complex components and hence xe can

be complex as well. The modulo-2Mconst operation will be performed on the real part only,

since it is the part that is actually transmitted; transmitting the imaginary part would

represent wasted energy. Simulations presented later show that better performance is

achieved when the available power is used to transmit only the real part. Consequently, to

determine the average power of the transmitted real parts of xe, we refer to the theoretical

result that states that the real and imaginary parts of the components of xe are almost

independent and identically distributed (i.i.d.) with a uniform distribution within the

region $[-M_{const}, M_{const}) = [-2, 2)$ (since $M_{const} = 2$ in our case) [30]. Therefore, the average power of a symbol is now given by $(M_{const} - (-M_{const}))^2/12 = 4/3$. This is slightly higher than the unit average power of a symbol in the original $\mathbf{d}_e$, but will be accounted for in a power allocation step that incorporates a factor of $3/4$ into the expression of $\mathbf{P}_e$, the power allocation matrix for edge users.
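For completeness, the 4/3 figure is the second moment of a zero-mean uniform random variable, worked out below under the stated assumption that each transmitted real component is uniformly distributed on $[-M_{const}, M_{const})$ with $M_{const} = 2$.

```latex
\[
E[x^2] = \int_{-M_{const}}^{M_{const}} \frac{x^2}{2M_{const}}\,dx
       = \frac{M_{const}^2}{3}
       = \frac{(2M_{const})^2}{12}
       = \frac{4}{3},
\qquad
\frac{3}{4}\,E[x^2] = 1 .
\]
```

Scaling the power by 3/4 therefore restores the unit average symbol power assumed for the original data.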

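With the expression of $\mathbf{x}_e$ at hand, the expression of $\hat{\mathbf{d}}_e$ can be presented, and the value of $\mathbf{G}_e$ needed to cancel MUI from intra-cell users can be derived.

\[
\begin{aligned}
\hat{\mathbf{d}}_e &= \mathbf{V}_e^H \Big[ \mathbf{H}_e^{(1)H} \mathbf{U}_e^{(1)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e + \mathbf{H}_e^{(2)H} \mathbf{U}_e^{(2)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e \\
&\qquad\quad + \mathbf{H}_e^{(1)H} \mathbf{U}_{in}^{(1)} \sqrt{\mathbf{P}_{in}^{(1)}}\,\mathbf{x}_{in}^{(1)} + \mathbf{H}_e^{(2)H} \mathbf{U}_{in}^{(2)} \sqrt{\mathbf{P}_{in}^{(2)}}\,\mathbf{x}_{in}^{(2)} + \mathbf{n}_e \Big] \bmod 2M_{const} \\
&= \mathbf{V}_e^H \Big[ \mathbf{H}_e^H \mathbf{U}_e \sqrt{\mathbf{P}_e}\,\mathbf{x}_e + \Big[ \mathbf{H}_e^{(1)H} \mathbf{U}_{in}^{(1)} \sqrt{\mathbf{P}_{in}^{(1)}} \ \Big| \ \mathbf{H}_e^{(2)H} \mathbf{U}_{in}^{(2)} \sqrt{\mathbf{P}_{in}^{(2)}} \Big] \mathbf{x}_{in} + \mathbf{n}_e \Big] \bmod 2M_{const} \\
&= \mathbf{V}_e^H \Big[ \mathbf{H}_e^H \mathbf{U}_e \sqrt{\mathbf{P}_e}\,\mathbf{v}_e - \mathbf{H}_e^H \mathbf{U}_e \sqrt{\mathbf{P}_e}\,\mathbf{G}_e \mathbf{x}_{in} + \Big[ \mathbf{H}_e^{(1)H} \mathbf{U}_{in}^{(1)} \sqrt{\mathbf{P}_{in}^{(1)}} \ \Big| \ \mathbf{H}_e^{(2)H} \mathbf{U}_{in}^{(2)} \sqrt{\mathbf{P}_{in}^{(2)}} \Big] \mathbf{x}_{in} \\
&\qquad\quad + \mathbf{n}_e \Big] \bmod 2M_{const}.
\end{aligned}
\tag{3.4}
\]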

An important point to note is that both transmissions from BS 1 and BS 2 have the same

vector xe (meaning that both choose the same extended constellation points to battle

the global interference, use the same matrix Ge, and know the data of the intra-cell


users of both BSs included in xin). This is exactly where the coordination of the two

BSs happens. If the BSs choose different points from the extended constellation or do

not know the data of the intra-cell users of the other BS, the receiver will not be able

to decode its message; essentially the receiver will not be able to resolve two different

integer shifts.

Now in order to cancel the interference, Ge has to be chosen such that the second

and third terms in the last line of Eqn. (3.4) cancel each other. Accordingly, Ge takes

the following form.

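\[
\mathbf{G}_e = \left(\mathbf{V}_e^H \mathbf{H}_e^H \mathbf{U}_e \sqrt{\mathbf{P}_e}\right)^{-1} \mathbf{V}_e^H \Big[ \mathbf{H}_e^{(1)H} \mathbf{U}_{in}^{(1)} \sqrt{\mathbf{P}_{in}^{(1)}} \ \Big| \ \mathbf{H}_e^{(2)H} \mathbf{U}_{in}^{(2)} \sqrt{\mathbf{P}_{in}^{(2)}} \Big].
\tag{3.5}
\]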

Note that [ · | · ] denotes the horizontal concatenation of two matrices. With the interfer-

ence from the intra-cell users of both BSs subtracted from the transmission for the edge

users, the edge users can be considered as an independent problem and the precoding

and decoding matrices can be designed for them as in [8]. This part of the algorithm will

be described in more detail in Section 3.2.3.
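The cancellation achieved by Eqn. (3.5) can be verified with a small numerical sketch. All sizes, the random matrices, and the helper `crandn` are hypothetical, and the sketch makes simplifying assumptions: it omits noise and the modulo operation, sets the integer shift to zero, and keeps the complex-valued $\mathbf{x}_e$ for the algebra, whereas the text transmits only its real part.

```python
import numpy as np

rng = np.random.default_rng(6)
M, N_e, L_e, L_1, L_2 = 4, 2, 2, 2, 2     # hypothetical sizes

def crandn(*shape):
    """Complex Gaussian matrix (illustrative helper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_e1, H_e2 = crandn(M, N_e), crandn(M, N_e)       # H_e^(1), H_e^(2)
U_e1, U_e2 = crandn(M, L_e), crandn(M, L_e)       # U_e^(1), U_e^(2)
U_in1, U_in2 = crandn(M, L_1), crandn(M, L_2)     # intra-cell precoders of BS 1 and BS 2
V_e = crandn(N_e, L_e)
P_e, P_in1, P_in2 = np.eye(L_e), 2 * np.eye(L_1), 3 * np.eye(L_2)

H_e, U_e = np.vstack([H_e1, H_e2]), np.vstack([U_e1, U_e2])

# Effective edge-user channel V_e^H H_e^H U_e sqrt(P_e)
C = V_e.conj().T @ H_e.conj().T @ U_e @ np.sqrt(P_e)
# Effective interference channel V_e^H [ . | . ] from the intra-cell transmissions
F = V_e.conj().T @ np.hstack([H_e1.conj().T @ U_in1 @ np.sqrt(P_in1),
                              H_e2.conj().T @ U_in2 @ np.sqrt(P_in2)])
G_e = np.linalg.solve(C, F)                       # Eqn. (3.5)

v_e = rng.choice([-1.0, 1.0], size=L_e)           # constellation points (shift = 0 here)
x_in = rng.choice([-1.0, 1.0], size=L_1 + L_2)    # intra-cell symbols known to both BSs
x_e = v_e - G_e @ x_in                            # pre-subtraction
received = C @ x_e + F @ x_in                     # noiseless filter output
```

The filter output depends only on the edge-user symbols, which is why the edge users can then be treated as an independent precoding problem.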

3.2.2 Whitening Interference from Edge Users at Intra-cell Users

The signal received by intra-cell user k of BS b can be expressed as follows.

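\[
\begin{aligned}
\mathbf{y}_{in,k} &= \mathbf{H}_k^{(b)H} \sum_{j \in \Omega_b, \Omega_e} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{n}_k \\
&= \mathbf{H}_k^{(b)H} \sum_{j \in \Omega_b} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{H}_k^{(b)H} \mathbf{U}_e^{(b)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e + \mathbf{n}_k \\
&= \mathbf{H}_k^{(b)H} \sum_{j \in \Omega_b} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{z}_k.
\end{aligned}
\tag{3.6}
\]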

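The second and third terms in the second line of Eqn. (3.6) can be considered as colored noise and are denoted by $\mathbf{z}_k$. The covariance matrix of this colored noise, denoted by $\mathbf{R}_{z,k}$, can be approximated as:

\[
\begin{aligned}
\mathbf{R}_{z,k} = E\left[\mathbf{z}_k \mathbf{z}_k^H\right] &= E\left[\left(\mathbf{H}_k^{(b)H} \mathbf{U}_e^{(b)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e + \mathbf{n}_k\right)\left(\mathbf{H}_k^{(b)H} \mathbf{U}_e^{(b)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e + \mathbf{n}_k\right)^H\right] \\
&\approx \frac{4}{3}\,\mathbf{H}_k^{(b)H} \mathbf{U}_e^{(b)} \mathbf{P}_e \mathbf{U}_e^{(b)H} \mathbf{H}_k^{(b)} + \sigma^2 \mathbf{I}_{N_k}.
\end{aligned}
\tag{3.7}
\]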

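In the more general case where the cross channels are not forced to have zero power but path loss is used to model their low power, the received signal $\mathbf{y}_{in,k}$ has additional terms representing the asynchronous inter-cell interference, and consequently its covariance $\mathbf{R}_{z,k}$ also changes. The new expressions of $\mathbf{y}_{in,k}$ and $\mathbf{R}_{z,k}$ are given below, assuming user k belongs to BS $b_1$ and inter-cell interference arrives from BS $b_2$.

\[
\begin{aligned}
\mathbf{y}_{in,k} &= \mathbf{H}_k^{(b_1)H} \sum_{j \in \Omega_{b_1}} \mathbf{U}_j^{(b_1)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{H}_k^{(b_1)H} \mathbf{U}_e^{(b_1)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e \\
&\quad + \underbrace{\mathbf{H}_k^{(b_2)H} \mathbf{U}_e^{(b_2)} \sqrt{\mathbf{P}_e}\,\mathbf{i}_{ek}^{(b_2)} + \mathbf{H}_k^{(b_2)H} \mathbf{U}_{in}^{(b_2)} \sqrt{\mathbf{P}_{in}^{(b_2)}}\,\mathbf{i}_{\Omega_2 k}^{(b_2)}}_{\text{inter-cell asynchronous interference}} + \mathbf{n}_k \\
&= \mathbf{H}_k^{(b_1)H} \sum_{j \in \Omega_{b_1}} \mathbf{U}_j^{(b_1)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{z}_k,
\end{aligned}
\tag{3.8}
\]

\[
\begin{aligned}
\mathbf{R}_{z,k} = E\left[\mathbf{z}_k \mathbf{z}_k^H\right] &\approx \frac{4}{3}\,\mathbf{H}_k^{(b_1)H} \mathbf{U}_e^{(b_1)} \mathbf{P}_e \mathbf{U}_e^{(b_1)H} \mathbf{H}_k^{(b_1)} + \frac{4}{3}\,\rho\big(\delta_{ek}^{(b_2)}\big)\,\mathbf{H}_k^{(b_1)H} \mathbf{U}_e^{(b_1)} \mathbf{P}_e \mathbf{U}_e^{(b_2)H} \mathbf{H}_k^{(b_2)} \\
&\quad + \frac{4}{3}\,\rho\big(\delta_{ek}^{(b_2)}\big)\,\mathbf{H}_k^{(b_2)H} \mathbf{U}_e^{(b_2)} \mathbf{P}_e \mathbf{U}_e^{(b_1)H} \mathbf{H}_k^{(b_1)} \\
&\quad + \frac{4}{3}\,\beta_{ek}^{(b_2,b_2)}\,\mathbf{H}_k^{(b_2)H} \mathbf{U}_e^{(b_2)} \mathbf{P}_e \mathbf{U}_e^{(b_2)H} \mathbf{H}_k^{(b_2)} \\
&\quad + \beta_{\Omega_2 k}^{(b_2,b_2)}\,\mathbf{H}_k^{(b_2)H} \mathbf{U}_{in}^{(b_2)} \mathbf{P}_{in}^{(b_2)} \mathbf{U}_{in}^{(b_2)H} \mathbf{H}_k^{(b_2)} + \sigma^2 \mathbf{I}_{N_k}.
\end{aligned}
\tag{3.9}
\]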

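In order to whiten this colored noise, intra-cell user k multiplies its received signal by a whitening filter matrix $\mathbf{R}_{z,k}^{-1/2}$. Accordingly, the receiver then processes the modified signal $\tilde{\mathbf{y}}_{in,k}$, that can be expressed in terms of $\tilde{\mathbf{H}}_k$, the modified channel matrix, and $\tilde{\mathbf{z}}_k$, the whitened noise.

\[
\begin{aligned}
\tilde{\mathbf{y}}_{in,k} = \mathbf{R}_{z,k}^{-1/2}\,\mathbf{y}_{in,k} &= \mathbf{R}_{z,k}^{-1/2}\,\mathbf{H}_k^{(b_1)H} \sum_{j \in \Omega_{b_1}} \mathbf{U}_j^{(b_1)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{R}_{z,k}^{-1/2}\,\mathbf{z}_k \\
&= \tilde{\mathbf{H}}_k^{(b_1)H} \sum_{j \in \Omega_{b_1}} \mathbf{U}_j^{(b_1)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \tilde{\mathbf{z}}_k
\end{aligned}
\tag{3.10}
\]

Note that $\tilde{\mathbf{H}}_k^{(b_1)} = \mathbf{H}_k^{(b_1)} \mathbf{R}_{z,k}^{-1/2}$. Consequently, each of the two groups of intra-cell users can be treated, as well, as an independent problem, and the precoding and decoding matrices can be derived as in [8], but using the modified channel matrices.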
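The whitening step can be sketched numerically as follows. This is an illustration under stated assumptions: it uses the zero-power cross-channel covariance of Eqn. (3.7), computes the Hermitian inverse square root through an eigendecomposition (one of several valid choices of whitening filter), and all sizes, random matrices, and names such as `H_mod` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
M, N_k, L_e = 4, 2, 2     # hypothetical sizes
sigma2 = 1.0

def crandn(*shape):
    """Complex Gaussian matrix (illustrative helper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_k = crandn(M, N_k)             # channel H_k^(b) of an intra-cell user
U_e = crandn(M, L_e)             # edge-user precoder U_e^(b) at the same BS
P_e = np.diag([1.5, 0.5])        # edge-user powers

# Eqn. (3.7): covariance of edge-user interference plus noise
T = H_k.conj().T @ U_e
R_z = (4.0 / 3.0) * T @ P_e @ T.conj().T + sigma2 * np.eye(N_k)

# Hermitian inverse square root via R_z = W diag(w) W^H
w, W = np.linalg.eigh(R_z)
R_inv_sqrt = (W / np.sqrt(w)) @ W.conj().T

# Modified channel, chosen so that H_mod^H = R_z^{-1/2} H_k^H
H_mod = H_k @ R_inv_sqrt

# Covariance of the whitened noise R_z^{-1/2} z_k
whitened_cov = R_inv_sqrt @ R_z @ R_inv_sqrt.conj().T
```

After whitening, the noise covariance is the identity, so a single-BS algorithm that assumes white noise can be run on the modified channel.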

3.2.3 The Hybrid Algorithm

In short, the algorithm performs the following steps. It starts by initializing the uplink

precoding matrices (the V matrices) to the right singular vectors of the channel matrices.

The uplink power matrices (the Q matrices) are initialized by dividing the available

power equally among all data streams. The modified channel matrices are initialized

by adjusting the true channels matrices using the initial power and precoding matrices.

Next, a global power allocation step is performed that minimizes the SMSE of the whole

system. As explained before, the hybrid algorithm treats each user group as a separate

group; therefore, the SMSE of the whole system is simply the sum of the SMSEs of the

three user groups. Since the reception in each group is synchronous, the SMSE of each

group can be expressed as in Eqn. (2.52).

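\[
\mathrm{SMSE}_{\Omega_1} = L_1 - M + \mathrm{tr}\big(J^{(1)-1}\big), \quad J^{(1)} = H_{in}^{(1)} V_{in}^{(1)} Q_{in}^{(1)} V_{in}^{(1)H} H_{in}^{(1)H} + \sigma^2 I_M, \tag{3.11}
\]
\[
\mathrm{SMSE}_{\Omega_e} = L_e - 2M + \mathrm{tr}\big(J_e^{-1}\big), \quad J_e = H_e V_e Q_e V_e^{H} H_e^{H} + \sigma^2 I_{2M}, \tag{3.12}
\]
\[
\mathrm{SMSE}_{\Omega_2} = L_2 - M + \mathrm{tr}\big(J^{(2)-1}\big), \quad J^{(2)} = H_{in}^{(2)} V_{in}^{(2)} Q_{in}^{(2)} V_{in}^{(2)H} H_{in}^{(2)H} + \sigma^2 I_M, \tag{3.13}
\]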


and

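\[
\begin{aligned}
\mathrm{SMSE} &= \mathrm{SMSE}_{\Omega_1} + \mathrm{SMSE}_{\Omega_e} + \mathrm{SMSE}_{\Omega_2} \\
&= L - 4M + \mathrm{tr}\big(J^{(1)-1}\big) + \mathrm{tr}\big(J_e^{-1}\big) + \mathrm{tr}\big(J^{(2)-1}\big)
\end{aligned}
\tag{3.14}
\]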

The total number of data streams for the intra-cell users of BS 1, the edge users, and the

intra-cell users of BS 2 are L1, Le, and L2, respectively. Note that L1+Le+L2 = L. Since

each of the SMSEs in Eqns. (3.11), (3.12), and (3.13) is convex in its power term [8], the

SMSE in Eqn. (3.14) is convex in these power terms. It can be easily seen that minimizing
the SMSE in Eqn. (3.14) is equivalent to minimizing $\mathrm{tr}\big(J^{(1)-1}\big) + \mathrm{tr}\big(J_e^{-1}\big) + \mathrm{tr}\big(J^{(2)-1}\big)$.

Therefore, the power allocation problem can be expressed as:

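\[
\big[Q^{(1)}, Q_e, Q^{(2)}\big] = \arg\min_{Q^{(1)},\,Q_e,\,Q^{(2)}} \; \mathrm{tr}\big[J^{(1)-1}\big] + \mathrm{tr}\big[J_e^{-1}\big] + \mathrm{tr}\big[J^{(2)-1}\big] \tag{3.15}
\]
\[
\text{subject to: } \mathrm{tr}\big[Q^{(1)}\big] + \mathrm{tr}\big[Q_e\big] + \mathrm{tr}\big[Q^{(2)}\big] \le P_{tot}.
\]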
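To make the objective and constraint of Eqn. (3.15) concrete, the sketch below evaluates them for an equal-power allocation. It is only an illustration under stated assumptions: the channels are random stand-ins, every user has one antenna and one stream (so the V matrices reduce to identities), and the function names `trace_inv_J` and `total_objective` are hypothetical. It does not solve the convex problem; a convex solver would be needed for that.

```python
import numpy as np

def trace_inv_J(H, V, q, sigma2):
    """tr(J^{-1}) with J = H V diag(q) V^H H^H + sigma2 * I."""
    A = H @ V
    J = (A * q) @ A.conj().T + sigma2 * np.eye(H.shape[0])
    return np.trace(np.linalg.inv(J)).real

def total_objective(groups, sigma2):
    """Sum of tr(J^{-1}) over the user groups, i.e. the objective of Eqn. (3.15)."""
    return sum(trace_inv_J(H, V, q, sigma2) for H, V, q in groups)

rng = np.random.default_rng(0)
M, sigma2, P_tot = 4, 1.0, 8.0   # illustrative values

def random_channel(rows, cols):
    # Hypothetical i.i.d. complex Gaussian channel, used only as test data.
    return (rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

# One intra-cell user per BS (M antennas each) and two edge users (served by 2M antennas).
H1, He, H2 = random_channel(M, 1), random_channel(2 * M, 2), random_channel(M, 1)
V1, Ve, V2 = np.eye(1), np.eye(2), np.eye(1)

L = 4                    # total number of data streams
q_equal = P_tot / L      # equal power per stream, as in the initialization
groups = [(H1, V1, np.full(1, q_equal)),
          (He, Ve, np.full(2, q_equal)),
          (H2, V2, np.full(1, q_equal))]

used_power = sum(q.sum() for _, _, q in groups)   # left-hand side of the power constraint
objective_equal = total_objective(groups, sigma2)
```

The equal-power point is feasible (`used_power` equals `P_tot`) and gives a baseline value that an optimized allocation should not exceed.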

Note that the power allocation step assumes that the covariance matrices of the colored

noise (implicit in the modified channel matrices) are constant, since this assures that

the SMSE is convex with respect to power. The edge users are now treated as an
independent group, and new uplink precoding matrices (V) are derived for them. Using the

downlink/uplink duality, the downlink precoding and power allocation matrices (the U

and P matrices respectively) of the edge users are derived [8].

The precoding and decoding matrices and the power allocation are used, in turn, to

determine the covariance matrices of the colored noise caused by edge users at intra-cell

users as in Eqn. (3.7), which are used to modify the channel matrices of the
intra-cell users as described previously in Section 3.2.2. When inter-cell interference is to

be explicitly taken into account, the modified channel matrices are used to derive the

downlink precoding and power allocation matrices of the intra-cell users. Then, the

covariance matrices of the colored noise are found using Eqn. (3.9), and used to modify

the channel matrices of the intra-cell users. Next, new uplink precoding matrices (V) are

obtained for intra-cell users. The above steps are repeated until the relative change in

SMSE is below a certain threshold, or the SMSE experiences an increase. The increase


in SMSE can occur since the power allocation step assumes the covariance matrix of the

colored noise is constant, while in reality it is a function of the power allocation. Finally,

the downlink precoding and power allocation matrices (U and P matrices respectively)

of the intra-cell users and the Ge matrix needed by THP are derived.
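The stopping rule described above (stop when the relative change in SMSE is below a threshold, or when the SMSE increases) can be sketched as follows. This is an illustrative sketch only: `step` is a hypothetical placeholder for one full pass of the iteration, the tolerance `rel_tol` and the cap `max_iter` are assumed values, and the toy update used in the example is not related to any real channel.

```python
def should_stop(prev_smse, new_smse, rel_tol=1e-3):
    """Stop if the SMSE increased or its relative decrease is below rel_tol."""
    if new_smse > prev_smse:
        return True
    return (prev_smse - new_smse) / prev_smse < rel_tol

def run_until_converged(step, initial_smse, max_iter=100, rel_tol=1e-3):
    """Apply `step` repeatedly and return the list of SMSE values visited."""
    smse = initial_smse
    history = [smse]
    for _ in range(max_iter):
        new_smse = step(smse)
        history.append(new_smse)
        if should_stop(smse, new_smse, rel_tol):
            break
        smse = new_smse
    return history

# Toy stand-in for one iteration: the SMSE contracts toward a floor of 2.0.
history = run_until_converged(lambda s: 0.5 * s + 1.0, initial_smse=10.0)
```

Because an increase is possible when the noise covariance is treated as constant during power allocation, a caller may prefer to keep the matrices from the iteration before the increase; the source does not specify this, so the sketch only records the SMSE values.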

Note that it is assumed that both BSs can provide a maximum power of Ptot =

2 × Pmax, where Pmax is the power that one BS is allowed to provide, knowing that the

uniform distribution of the users is symmetric along the perpendicular bisector of the

straight line joining the 2 BSs. In other words, at a particular instant, a BS might be

transmitting with a power greater than or less than Pmax, but over an extended time

period the average transmitted power is Pmax.

The following list describes the steps mathematically and in more detail.

Iteration:

1. Solve the following convex power allocation problem.
\[
\big[Q^{(1)}, Q_e, Q^{(2)}\big] = \arg\min_{Q^{(1)},\,Q_e,\,Q^{(2)}} \; \mathrm{tr}\big[J^{(1)-1}\big] + \mathrm{tr}\big[J_e^{-1}\big] + \mathrm{tr}\big[J^{(2)-1}\big]
\quad \text{s.t. } \mathrm{tr}\big[Q^{(1)}\big] + \mathrm{tr}\big[Q_e\big] + \mathrm{tr}\big[Q^{(2)}\big] \le P_{tot}
\]

2. Find new V matrices for edge users.
\[
v_{kj} = e_{max}\Big(H_{e,k}^{H} J_{e,kj}^{-2} H_{e,k},\; I/Q_{e,kj} + H_{e,k}^{H} J_{e,kj}^{-1} H_{e,k}\Big)
\]
N.B.: $e_{max}$ returns the eigenvector with the highest eigenvalue.

3. Find the U matrices with normalized columns for all edge users and the global $P_e$ matrix.
\[
U_{e,k} = J_e^{-1} H_{e,k} V_{e,k} \sqrt{Q_{e,k}}, \quad u_{e,kj} = u_{e,kj}/\|u_{e,kj}\| \;\; \text{(normalizing columns)}
\]
\[
P_e = \tfrac{3}{4}\,\sigma^2\, \mathrm{diag}\Big[\big(D_e^{-1} - \Psi_e\big)^{-1} \mathbf{1}\Big]
\]
N.B.: The factor $\tfrac{3}{4}$ is included to account for the increased average power of $x_e$ due
to THP as mentioned previously in Section 3.2.1.

4. Find $R_{z,k}$ as in Eqn. (3.7) or Eqn. (3.9). Update $\tilde{H}_k$ for all users in BSs 1 and 2.
\[
\tilde{H}_k^{(b)} = H_k^{(b)} R_{z,k}^{-1/2}
\]

5. Find new V matrices for intra-cell users, b = 1, 2.
\[
v_{kj} = e_{max}\Big(\tilde{H}_k^{(b)H} J_{kj}^{(b)-2} \tilde{H}_k^{(b)},\; I/Q_{kj}^{(b)} + \tilde{H}_k^{(b)H} J_{kj}^{(b)-1} \tilde{H}_k^{(b)}\Big)
\]

6. Repeat steps 1 to 5 above, until the relative change in SMSE is within a certain
threshold, or until SMSE shows an increase at any step.

Update:

1. Find the U matrices with normalized columns for all intra-cell users and their global
power allocation matrices, b = 1, 2.
\[
U_k^{(b)} = J^{(b)-1} \tilde{H}_k^{(b)} V_k^{(b)} \sqrt{Q_k^{(b)}}, \quad u_{kj}^{(b)} = u_{kj}^{(b)}/\|u_{kj}^{(b)}\| \;\; \text{(normalizing columns)}
\]
\[
P^{(b)} = \sigma^2\, \mathrm{diag}\Big[\big(D^{(b)-1} - \Psi^{UL\,(b)T}\big)^{-1} \mathbf{1}\Big]
\]

2. Find $G_e$ as given in Eqn. (3.5).

The matrices $D$ and $\Psi^{UL}$ are given by Eqns. (2.42) and (2.43) with $\gamma_{ck}^{(b_1,b_2)}$ set to 1,

and the matrix Jkj is given by Eqn. (2.54), but applied to the set of users indicated by

its superscripts or subscripts. To use the above steps with OFDM, a slight modification

is needed, which will be presented later in Chapter 4.
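The $e_{max}$ operation used in steps 2 and 5 of the iteration is a generalized eigenvector computation. The sketch below shows one way to compute it with NumPy, assuming the second matrix is invertible. The matrices `H` and `J` are random stand-ins (the actual $J_{kj}$ of Eqn. (2.54) is not reproduced here), and the names `e_max`, `q_kj`, and `lam` are chosen for illustration.

```python
import numpy as np

def e_max(A, B):
    """Unit-norm generalized eigenvector v of (A, B), A v = lambda B v,
    for the largest eigenvalue lambda (B assumed invertible)."""
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(B, A))
    v = eigvecs[:, np.argmax(eigvals.real)]
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
M, N_k, q_kj = 4, 2, 0.5   # illustrative dimensions and stream power

H = rng.standard_normal((M, N_k)) + 1j * rng.standard_normal((M, N_k))
G = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
J = G @ G.conj().T + np.eye(M)      # Hermitian positive-definite stand-in for J_kj
J_inv = np.linalg.inv(J)

A = H.conj().T @ J_inv @ J_inv @ H                # H^H J^{-2} H
B = np.eye(N_k) / q_kj + H.conj().T @ J_inv @ H   # I/q + H^H J^{-1} H

v = e_max(A, B)
lam = (v.conj() @ A @ v).real / (v.conj() @ B @ v).real   # associated eigenvalue
```

Since both matrices are Hermitian here and the second is positive definite, the eigenvalues are real; only the real part is used to pick the largest one, which guards against small imaginary rounding errors.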

3.2.4 Data Vector Estimation for THP Users

As we have seen previously in Section 3.2.1, modifying the data vector for the edge users

that use THP is equivalent to choosing the data symbols of the data vector from an

extended constellation such that the pre-subtraction of interference decreases its power.

It is crucial for the data vector from the extended constellation, ve, to be decoded properly

at the edge users so that the modulo operation restores the correct original data vector,

de. In the regular operation of the SMSE algorithm of [8], on which the hybrid algorithm

was based, the symbols of the estimated data vector, xe, are not brought back as close as

possible to −1 or 1, and they are simply hard-decoded according to their signs. Clearly,

this is not applicable to THP, since we are dealing with an extended constellation that

has more than only two symbols. The receiver process scales the transmitted vector and

we derive below how the estimated data vector should be scaled.

A global estimated data vector for all users can be written for the first step, i.e., the

algorithm of [8], by grouping the separate estimated data vectors as given by Eqn. (2.46).

This global vector is expressed in Eqn. (3.16). Note that V is block diagonal since the

users cannot cooperate.

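\[
\hat{x} = V^{H} H^{H} U \sqrt{P}\, x + V^{H} n. \tag{3.16}
\]

The matrix U has normalized columns and hence we may express it as $U = U_{orig} W$,
where $U_{orig} = J^{-1} H V \sqrt{Q}$, $W = \mathrm{diag}\big(1/\|u_1\|, \ldots, 1/\|u_L\|\big)$, and $u_l$ is the $l$th column of
$U_{orig}$. Expressing U and J explicitly in Eqn. (3.16), we get

\[
\hat{x} = V^{H} H^{H} \big(H V Q V^{H} H^{H} + \sigma^2 I_M\big)^{-1} H V \sqrt{Q}\, W \sqrt{P}\, x + V^{H} n \tag{3.17}
\]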

From Eqn. (3.17), we can see that when J is inverted, the magnitudes of the matrices
that make up J essentially normalize the magnitudes of all the matrices outside, except
for W. Therefore, $W^{-1} = \mathrm{diag}\big(\|u_1\|, \ldots, \|u_L\|\big)$ should be multiplied into $\hat{x}$ to restore
the correct magnitude to its symbols.
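The relation between the unnormalized precoder, the normalization matrix W, and the rescaling matrix $W^{-1}$ can be checked with a short numerical sketch. The matrix `U_orig` below is a random stand-in rather than the actual $J^{-1} H V \sqrt{Q}$, and the norms are taken over the columns of the unnormalized matrix, which is the reading assumed here.

```python
import numpy as np

rng = np.random.default_rng(1)
M, L = 4, 3   # illustrative: transmit antennas, data streams

# Random stand-in for the unnormalized precoder U_orig.
U_orig = rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L))

col_norms = np.linalg.norm(U_orig, axis=0)   # ||u_1||, ..., ||u_L||
W = np.diag(1.0 / col_norms)                 # normalization applied at the transmitter
U = U_orig @ W                               # precoder with unit-norm columns
W_inv = np.diag(col_norms)                   # scaling applied to the estimated data vector
```

Multiplying by `W_inv` undoes exactly the per-stream scaling introduced by the normalization, which is the magnitude correction the receiver needs before the modulo operation.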

3.3 A Variation on the Hybrid Algorithm

The main idea behind the hybrid algorithm presented in Section 3.2 was to pre-subtract

the transmission to the intra-cell users from that of the edge users and to treat the

transmission to the edge users at the intra-cell users as colored noise that is subsequently

whitened at each user. An intuitive variation to the above scheme would be to
pre-subtract the transmission to the edge users from that of the intra-cell users and treat the

transmission to the intra-cell users as colored noise that is whitened at the edge users.

Note that this variation is closely related to reversing the user ordering for THP when

there are only two users. In this case, $x_{in}^{(b)}$ is found as follows,

\[
x_{in}^{(b)} = \big(d_{in}^{(b)} - G_{in}^{(b)} x_e\big) \bmod 2M_{const}, \tag{3.18}
\]

where

\[
G_{in}^{(b)} = \Big(V_{in}^{(b)H} H_{in}^{(b)H} U_{in}^{(b)} \sqrt{P_{in}^{(b)}}\Big)^{-1} V_{in}^{(b)H} H_{in}^{(b)H} U_{e}^{(b)} \sqrt{P_e}. \tag{3.19}
\]
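The modulo operation in Eqn. (3.18) can be sketched for real-valued symbols as below. This is an assumption-laden illustration: the symmetric interval $[-M_{const}, M_{const})$, the value $M_{const} = 2$ for unit-power binary symbols, the function name `symmetric_mod`, and the interference values are all chosen for the example and are not taken from the thesis.

```python
import numpy as np

def symmetric_mod(x, m_const):
    """Fold real values into [-m_const, m_const) by adding multiples of 2*m_const."""
    period = 2.0 * m_const
    return x - period * np.floor((x + m_const) / period)

m_const = 2.0                          # assumed modulo parameter for symbols in {-1, +1}
d = np.array([1.0, -1.0])              # intended data symbols
interference = np.array([0.7, 2.6])    # hypothetical pre-subtracted interference (G x_e)

x_tx = symmetric_mod(d - interference, m_const)     # transmitter side, as in Eqn. (3.18)
d_rx = symmetric_mod(x_tx + interference, m_const)  # noiseless receiver applies the same modulo
```

In the noiseless case the receiver's modulo recovers `d` exactly (up to rounding), while the transmitted values stay inside the modulo interval regardless of how large the interference is.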


As for the signal received by edge user k, in this case it can be expressed as below. Note

that in this case, the cross channels are forced to have zero power.

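\[
\begin{aligned}
y_{e,k} &= H_{e,k}^{H} U_e \sqrt{P_e}\, x_e + H_{e,k}^{(1)H} U_{in}^{(1)} \sqrt{P_{in}^{(1)}}\, x_{in}^{(1)} + H_{e,k}^{(2)H} U_{in}^{(2)} \sqrt{P_{in}^{(2)}}\, x_{in}^{(2)} + n_{e,k}, \\
&= H_{e,k}^{H} U_e \sqrt{P_e}\, x_e + z_{e,k}.
\end{aligned}
\tag{3.20}
\]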

Following from Eqn. (3.20), Rz,e,k can be approximated as follows.

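\[
\begin{aligned}
R_{z,e,k} &= E\big[z_{e,k} z_{e,k}^{H}\big], \\
&\approx \tfrac{4}{3} H_{e,k}^{(1)H} U_{in}^{(1)} P_{in}^{(1)} U_{in}^{(1)H} H_{e,k}^{(1)} + \tfrac{4}{3} H_{e,k}^{(2)H} U_{in}^{(2)} P_{in}^{(2)} U_{in}^{(2)H} H_{e,k}^{(2)} + \sigma^2 I_{N_k}.
\end{aligned}
\tag{3.21}
\]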

Finally, the modified channel matrix for edge user k is found as

\[
\tilde{H}_{e,k} = H_{e,k} R_{z,e,k}^{-1/2}. \tag{3.22}
\]


This section presents the results of illustrative simulations. The parameters used for all

the simulations are the same as in Table 2.3.

3.4.1 Zero Power Cross Channels

The results of the first simulation are captured in Figure 3.3. Note that in this simulation,

the path loss exponent is assumed to be zero, i.e., the path loss is not taken into account.

This example serves to illustrate the workings of the proposed algorithm independent

of path loss effects. Note that since the path loss exponent is set to zero, there is no

path loss to attenuate the cross channels. However, as explained at the beginning of

Section 3.1, cross channels attenuate a signal severely. Therefore, the cross channels here

are forced to have zero power. An upper bound on performance is obtained by assuming

that all signals (including interference) from both BSs arrive at all users synchronously.

This system is equivalent to a super-BS with 2M transmit antennas communicating

jointly with K users. On the other hand, a lower bound on performance is obtained
by assuming that the 4 users are uniformly distributed between two independent cells
without inter-cell interference and without joint cooperative processing for edge users.
The performance upper and lower bounds correspond to the bottom and top curves in
Figure 3.2, respectively.

[Figure 3.2: Performance of the hybrid algorithm without path loss (K1 = K2 = 1, Ke = 2). BER versus 10 log(Pmax/σ²) for B = 2, M = 4, K = 4, Nk = 1, Lk = 1. Curves: Lower Bound, Upper Bound, Hybrid Algorithm, Asynchronous Interference.]

To illustrate the efficacy of the hybrid algorithm, the case of Ke = 2, K1 = K2 = 1

was simulated, since having a substantial number of edge users is a rare event when a
uniform distribution of users is combined with a 40-meter wide cooperation band. The
result is shown by the solid curve with diamond markers. As expected, the performance

lies within the upper and lower bounds of performance. We can also see a clear increase in

diversity order from the lower bound, which is a direct result of the cooperation between

the two BSs. The diversity order is slightly less than that of the upper bound since

cooperation only happens for the edge users in the hybrid algorithm, while it happens

for all users in the upper bound case.

[Figure 3.3: Performance of the hybrid algorithm compared to ZPCC cases without path loss (K1 = K2 = 1, Ke = 2). BER versus 10 log(Pmax/σ²) for B = 2, M = 4, K = 4, Nk = 1, Lk = 1. Curves: ZPCC Case 3, Hybrid Algorithm, ZPCC Case 2, ZPCC Case 1.]

To evaluate the performance of the hybrid algorithm further, three other schemes
that assume zero power cross channels (ZPCC) were also simulated for Ke = 2, K1 = K2 = 1.
From here on, these schemes will be referred to as “ZPCC cases 1 to 3”. In

the three ZPCC cases, all interference is assumed to be synchronous and edge (intra-cell)

users battle MUI from other edge (intra-cell) users and intra-cell (edge) users by linear

precoding. For ZPCC case 1, the whole system is treated as one cell with one super-

BS equipped with 2M antennas. The channels between the users initially considered

as intra-cell users in cell 2 (cell 1) and the first (last) M antennas of the super-BS are

considered to have zero power. In this case, it is found that the super-BS transmits to all

users using all the 2M antennas, even though the transmission intended for some users on

some antennas never reaches those users. Despite that fact, this case of ZPCC performs

slightly better than the hybrid algorithm. Its result is shown by the solid curve with

cross markers in Figure 3.3. This can be explained by noticing that the transmission to

intra-cell users by the other BS can help the edge users better suppress the interference

they see from those intra-cell users.

This explanation is further supported by the performance of ZPCC case 2. In ZPCC

case 2, the same precoding vectors as the first variation are used by the BSs, except that


the components of those vectors corresponding to the antennas with zero power channels

are set to zero and the vectors are re-normalized to their initial unit magnitude. In other

words, the extra interference information that edge users received in the first variation

is no longer available. The result of this case is shown by the dashed curve with cross

markers in Figure 3.3. This result shows the worst performance among all the schemes.

Furthermore, plotting the results for the separate streams (Figure 3.4) shows that it is the

edge users that cause the apparent error floor, which supports our previous explanation.

In ZPCC case 3, one BS transmits to intra-cell users, and two BSs transmit to edge

users. That is, linear precoding for one group of intra-cell users is performed assuming

that only they, their BS, and the edge users exist. As for the edge users, linear precoding

is performed assuming that both BSs and all the users exist. The result of this case is

shown by the dotted curve with cross markers. With the ZPCC cases explained, it is only

fair to evaluate the performance of the hybrid algorithm by comparing it to a scheme

that uses the same amount of information, that is, ZPCC 3. In other words, ZPCC case

1 uses the values of the cross channels (zero in this case) directly in its solution, while

the hybrid algorithm and ZPCC case 3 do not require them. Clearly, at relatively high

SNRs, the hybrid algorithm performs better than ZPCC case 3, with a higher diversity

order. This higher diversity order comes mainly from an increase in that of the intra-cell

users as can be seen by plotting the BERs of the separate data streams in Figure 3.5. This

is a possible consequence of precoding for intra-cell users while treating the MUI caused

by edge users as colored noise (that is whitened) and not as interferers to be suppressed.

A more realistic scenario that better represents a physical environment would simulate

the effects of path loss. Accordingly, in the second simulation, the path loss exponent

is set to 3.5, a value that represents an urban environment. The cell radius was set to

250 meters and the minimum distance of any user to a BS was set to 150 meters. The

path loss of each user was normalized to the path loss experienced 150 meters away from

a BS. The results of this simulation are captured in Figure 3.6. The results are similar

to those obtained without path loss, except that the higher diversity order of the hybrid

algorithm makes it outperform the ZPCC case 3 at higher SNRs.
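The normalization of the path loss to a 150-meter reference can be illustrated with a simple power-law model. The model itself is an assumption made for this sketch (the text specifies only the exponent, the cell radius, and the reference distance), and `relative_path_gain` is a hypothetical helper name.

```python
import math

def relative_path_gain(distance_m, exponent=3.5, ref_distance_m=150.0):
    """Power gain relative to a user at the reference distance (power-law model)."""
    return (distance_m / ref_distance_m) ** (-exponent)

gain_at_reference = relative_path_gain(150.0)   # user at the minimum distance
gain_at_cell_edge = relative_path_gain(250.0)   # user at the cell radius
edge_loss_db = -10.0 * math.log10(gain_at_cell_edge)
```

Under this model a user at the cell radius receives roughly one sixth of the power of a user at 150 meters; a channel matrix would be scaled by the square root of the power gain.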


[Figure 3.4: BER plot of the separate streams for ZPCC case 2. BER versus 10 log(Pmax/σ²) for B = 2, M = 4, K = 4, Nk = 1, Lk = 1. Curves: Average BER, Edge Stream 1, Edge Stream 2, Intra-cell Stream 1, Intra-cell Stream 2.]

[Figure 3.5: BER plot of the separate streams for hybrid algorithm and ZPCC case 3. BER versus 10 log(Pmax/σ²) for B = 2, M = 4, K1 = 1, Ke = 2, K2 = 1, Nk = 1, Lk = 1. Curves: Edge Streams 1 and 2 and Intra-cell Streams 1 and 2, each for the hybrid algorithm (HA) and for ZPCC.]


[Figure 3.6: Performance of the hybrid algorithm with path loss. BER versus 10 log(Pmax/σ²) for B = 2, K = 4, M = 4, Nk = 1, Lk = 1, cell radius = 250. Curves: ZPCC Case 3, HA, ZPCC Case 1, ZPCC Case 2.]

3.4.2 Non-zero Power Cross Channels

While in all the previous simulations the cross channels were set to zero, the results shown

in Figure 3.7 are those of simulations where the weak cross channel powers are modeled by

path loss. As expected, the performance slightly deteriorates in the presence of inter-cell

interference. Furthermore, when the users find their decoding matrices (the V matrices

with the whitening filters incorporated) through training, the performance is very close to

that when the theoretical ones are used; in fact, it is slightly better. This can be explained

by the fact that a relatively long training sequence of length 100 symbols has been used

and the decoding matrices are derived from simulated training transmissions, and not

based on the theoretical approximation of the covariance of the inter-cell interference.

This result might, as well, be used to show that the approximation used is a relatively

good one. Figure 3.7 also shows similar simulations performed for users that have Nk = 2

receive antennas, with a training sequence length of 50. The curves almost overlap. This

further supports the feasibility of users finding their decoding matrices through training,
since with Nk = 1, training can be thought of as the receivers estimating phase offsets
(since the decoding vectors are scalars normalized to unit magnitude).

[Figure 3.7: Performance of the hybrid algorithm with inter-cell interference. BER versus 10 log(Pmax/σ²) for B = 2, M = 4, K1 = 1, Ke = 2, K2 = 1, Lk = 1. Curves: HA without inter-cell interference (Nk = 1); HA with whitening inter-cell interference (Nk = 1); HA with inter-cell interference and rx est. (Nk = 1, training length = 100); HA with whitening inter-cell interference (Nk = 2); HA with inter-cell interference and rx est. (Nk = 2, training length = 50).]

Figure 3.8 shows the performance of the variation of the hybrid algorithm. It performs

worse than the hybrid algorithm. Figure 3.9 shows the result obtained from simulating

the hybrid algorithm when transmitting the whole complex components of vector xe. As

mentioned previously in Section 3.2.1, energy is wasted and the performance deteriorates.

Finally, we state some statistical results that were obtained through the above
simulations as well. First, the power of the real parts of the modified transmitted data
vectors, averaged over $10^5$ runs for each SNR from −3 to 15, was 1.268, which is close to
the theoretical value of 4/3. At higher SNRs, the average power is closer to 4/3. Second,

the modulo operations at the transmitter and the receiver matched 99.6% of the time,

averaged over the same number of runs and the same SNR range.
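The theoretical value of 4/3 is consistent with a standard idealization of THP, which is assumed here rather than quoted from Section 3.2.1: if the real part of a precoded symbol is uniformly distributed over the modulo interval $[-2, 2)$, its power is $(2 \cdot 2)^2 / 12 = 4/3$. The sketch below checks this both in closed form and by sampling; the interval and the sample size are assumptions of the example.

```python
import numpy as np

m_const = 2.0   # assumed modulo parameter: symbols fold into [-2, 2)

# Closed form: the second moment of a zero-mean uniform variable on an
# interval of width w is w**2 / 12.
theoretical_power = (2.0 * m_const) ** 2 / 12.0

# Monte Carlo check with a fixed seed.
rng = np.random.default_rng(0)
samples = rng.uniform(-m_const, m_const, size=200_000)
empirical_power = np.mean(samples ** 2)
```

The uniform model is an approximation, so a measured average somewhat below 4/3, such as the 1.268 reported above, is not in conflict with it.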


[Figure 3.8: Performance of the hybrid algorithm variation. BER versus 10 log(Pmax/σ²) for B = 2, M = 4, K1 = 1, Ke = 2, K2 = 1, Nk = 1, Lk = 1. Curves: HA, Variation of HA.]

[Figure 3.9: Performance of the hybrid algorithm with complex transmission. BER versus 10 log(Pmax/σ²) for B = 2, M = 4, K1 = 1, Ke = 2, K2 = 1, Nk = 1, Lk = 1. Curves: HA, HA with complex transmission.]

3.5 Implementation Issues

In this section, we briefly review some of the implementation issues involved in the

physical deployment of the hybrid algorithm. The first issue that arises is the provision

of the needed CSI to the BSs, a common issue to all schemes that assume CSI at the

transmitter side. Moreover, in our case, the propagation delays between the BSs and

the users need to be determined and provided to the BSs. CSI usually either consists

of the exact values of the response of the channel or some statistics about it. In the

downlink, the users are responsible for performing channel estimation and feeding back

the information to the BSs. This is usually done through a feedback channel from the

users to the BS, that is often assumed to be error free. However, in reality, this feedback

channel is not necessarily error free. Moreover, even if that is achievable, the channel

response estimated by the users cannot be exactly fed back to the BSs due to the errors

that occur when the users discretize and quantize the channel values. The effect of

quantization is often studied by assuming that there is only a limited number of bits to

relay the needed information and simulating how that affects performance. Such a study

is required to determine the number of bits that allow the hybrid algorithm to perform

at an acceptable level. As for the propagation delays, they can be directly found at the

BSs since they should be the same in the downlink and uplink. Hence, determining them

should be easier and more accurate than the CSI.

With the hybrid algorithm being a cooperative scheme between 2 BSs, another
implementation issue is the amount of information that should be shared between the BSs,

how to share it, and how much delay is involved in the process. In what follows we

assume a simple yet logical view of the infrastructure of a cellular system, and provide

a back-of-the-envelope analysis of the resulting delays. We consider only two BSs and a

central controller (CC) that, in a non-cooperative scenario, would only provide the BSs

with the data they need to transmit to the users. Initially, the CSI and propagation

delays between all the users and a given BS are assumed to be known by that BS, and the

data to be transmitted to all the users is known by the CC.

To execute the hybrid algorithm, the CSI from both BSs is required. This can be seen

from steps 1 to 4 of the iteration of the algorithm presented in Section 3.2.3. In step 1,
a global power allocation step, i.e., one covering the users of both BSs, is performed, which requires
the CSI of all users. In steps 2 and 3, the precoding and decoding matrices of the edge
users are jointly derived, which requires the CSI between both BSs and the edge users.
In step 4, estimating the covariance matrix of the colored noise seen by an intra-cell user
requires the precoding matrices and power allocation derived for the edge users and/or
the intra-cell users of the other cell. Moreover, deriving the matrix Ge required for THP
as in Eqn. (3.5), also requires knowledge of the precoding and decoding matrices, as well
as the power allocation of all the users. Finally, the same modified data vector for the
edge users should be known by both BSs.

| Time | BS 1 → CC | BS 2 → CC | CC → BS 1 | CC → BS 2 | Algorithm running |
| T1 | $H^{(1)}, H_e^{(1)}, \tau^{(1)}$ | $H^{(2)}, H_e^{(2)}, \tau^{(2)}$ | | | |
| T2 | | | | | X |
| T3 | | | $x_e, x_{in}^{(1)}, P^{(1)}, U^{(1)}, P_e, U_e^{(1)}$ | $x_e, x_{in}^{(2)}, P^{(2)}, U^{(2)}, P_e, U_e^{(2)}$ | |

Table 3.1: Algorithm executed at the CC

Consequently, all the CSI and the propagation delays should be gathered in one

location for the algorithm to execute. Logically, that will either happen at one of the

BSs and the result will be sent to the other BS, or at the CC and the result is sent to

both BSs. Clearly, it is more costly to have both BSs acquire all the CSI and execute

the algorithm. Table 3.1 and Table 3.2 show the communication steps required when

the algorithm is executed at the CC and at one of the BSs (BS 1 in this example),

respectively.

As we can see, less time is required when the algorithm is executed at the CC.

However, a CC would be responsible for more than 2 BSs; therefore, having the CC

execute the algorithm for all its BSs increases its complexity drastically. Executing the

algorithm at one of the BSs keeps the complexity of the CC low, but requires more time,

creating a tradeoff. Note that at the cost of extra equipment (a direct link between each

pair of cooperating BSs), the required time for executing the algorithm at one of the BSs

becomes similar to that when it is executed at the CC.

| Time | BS 1 → CC | BS 2 → CC | CC → BS 1 | CC → BS 2 | Algorithm running |
| T1 | | $H_e^{(2)}, H_{in}^{(2)}, \tau^{(2)}$ | $d_e, x_{in}^{(1)}, x_{in}^{(2)}$ | $x_{in}^{(2)}$ | |
| T2 | | | $H_e^{(2)}, H_{in}^{(2)}, \tau^{(2)}$ | | |
| T3 | | | | | X |
| T4 | $U_{in}^{(2)}, P_{in}^{(2)}, U_e^{(2)}, P_e, x_e$ | | | | |
| T5 | | | | $U_{in}^{(2)}, P_{in}^{(2)}, U_e^{(2)}, P_e, x_e$ | |

Table 3.2: Algorithm executed at BS 1

Introducing OFDM into the solution brings in the implementation challenges of
OFDM itself, including time, frequency, and phase offsets and the associated inter-carrier

and inter-block interference. Moreover, the possibility of having to run the algorithm for

each subcarrier (or group of subcarriers [39]) also increases the running time of the whole

process. Even though these are important issues, they are not discussed further as they

are out of the scope of this work.

3.6 Summary

In this chapter, we have proposed the hybrid algorithm, which brings together linear and

nonlinear precoding to communicate with the users of two BSs. Edge users, which are

almost equally distant from the two BSs, experience limited asynchronism and joint linear

precoding based on the original algorithm of [8] is used to communicate to them. Before

transmission, nonlinear THP is used to subtract the data of intra-cell users from that of

the edge users, and accordingly the edge users do not experience any interference from

the intra-cell users. As for the intra-cell users themselves, they treat the transmission

to the edge users as colored noise, which they whiten. Simulations demonstrating the

performance of the hybrid algorithm were presented and compared to the performance

of simpler linear precoding schemes. Finally, a discussion of implementation issues was

presented, which provided some insight into the challenges that must be overcome for a

cooperative scheme to be deployed.

Chapter 4

User Selection in Multiuser

MIMO-OFDM Systems

As we have mentioned previously in Section 1.2.4, a single-carrier, MIMO, linear
precoding algorithm can only support a few data streams, far fewer than what would be required

in practice. A simple approach to increasing user capacity is using OFDMA and running

the same algorithm on each of the subcarriers. With many users in the system, the first

step becomes choosing which users should be communicated with on each subcarrier.

In this chapter, we review a user selection algorithm presented in [39] for a multiuser

MIMO-OFDM system. We propose a few changes to the algorithm and demonstrate the

improvement in performance. We then discuss how users might be selected when the

hybrid algorithm is the single-carrier precoding algorithm of choice for the subcarriers in

a multi-BS, MIMO-OFDM system. We present a problem formulation and a simulation

exercise that helps in determining the number of edge and intra-cell users to select for a

given subcarrier.

4.1 System Model and Problem Statement

In this section, we will describe the system model of a single-BS multiuser MIMO-OFDM

system. The model is based on that presented in Chapter 2 for a single BS. It can be

easily extended to suit the hybrid algorithm presented in Chapter 3, as will be seen later.

Once the system model is established, the user selection problem will be stated in detail.

4.1.1 System Model

The basic idea behind OFDM is to divide a broadband frequency selective channel

into a set of overlapping, yet orthogonal, narrowband flat fading channels. This can

be achieved by the use of an IFFT block at the transmitter and an FFT block at the

receiver. Accordingly, a multiuser MIMO-OFDM system can treat each subcarrier as a

single-carrier multiuser MIMO system that is independent of the systems on the other

subcarriers. This of course comes at the cost of increased computations, because the

single-carrier algorithm has to be repeated for all subcarriers, and the number of
subcarriers in an OFDM system, denoted here as Nc, can be very high. Readers are referred

to [39] for more details on computation reduction methods that are suitable when the

algorithm of [8] is the single-carrier algorithm of choice.

The system model for the MIMO-OFDM system can be easily derived from the one

in Chapter 2 for a single BS by incorporating a subcarrier index n to all the variables1.

The expressions of the estimated data vectors in the downlink and uplink, as well as the

SMSE in the uplink are given below. Note that the channel matrices below represent the

flat fading MIMO channel response on each subcarrier in the frequency domain.

\[
\hat{x}_k^{DL}(n) = V_k^{H}(n) H_k^{H}(n) \sum_{j=1}^{K} U_j(n) \sqrt{P_j(n)}\, x_j(n) + V_k^{H}(n)\, n_k(n), \tag{4.1}
\]
\[
\hat{x}_k^{UL}(n) = \sum_{j=1}^{K} U_k^{H}(n) H_j(n) V_j(n) \sqrt{Q_j(n)}\, x_j(n) + U_k^{H}(n)\, n(n), \tag{4.2}
\]
\[
\mathrm{SMSE} = N_c (L - M) + \sigma^2 \sum_{n=1}^{N_c} \mathrm{tr}\big(J(n)^{-1}\big), \tag{4.3}
\]
where
\[
J(n) = H(n) V(n) Q(n) V^{H}(n) H^{H}(n) + \sigma^2 I_M, \tag{4.4}
\]
and the index n = 1, . . . , Nc.

1 Note that the system model and the analysis in this chapter ignore important issues such as phase noise and frequency offsets, which are beyond the scope of this work.
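As a numerical check of Eqns. (4.3) and (4.4), the sketch below evaluates the closed-form SMSE over a few subcarriers and compares it with the trace of the error covariance of an MMSE receiver computed directly. The channels are random stand-ins, every user is assumed to have one antenna and one stream (so V(n) is an identity), and the function names are hypothetical.

```python
import numpy as np

def smse_closed_form(H_list, V_list, q_list, sigma2):
    """Sum over subcarriers of L - M + sigma2 * tr(J(n)^{-1}), as in Eqn. (4.3)."""
    total = 0.0
    for H, V, q in zip(H_list, V_list, q_list):
        M, L = H.shape[0], q.size
        A = H @ V
        J = (A * q) @ A.conj().T + sigma2 * np.eye(M)   # Eqn. (4.4)
        total += L - M + sigma2 * np.trace(np.linalg.inv(J)).real
    return total

def smse_direct(H_list, V_list, q_list, sigma2):
    """Same quantity from the MMSE error covariance I - B^H J^{-1} B, with B = H V sqrt(Q)."""
    total = 0.0
    for H, V, q in zip(H_list, V_list, q_list):
        B = (H @ V) * np.sqrt(q)
        J = B @ B.conj().T + sigma2 * np.eye(H.shape[0])
        E = np.eye(q.size) - B.conj().T @ np.linalg.solve(J, B)
        total += np.trace(E).real
    return total

rng = np.random.default_rng(0)
N_c, M, K, sigma2 = 3, 4, 3, 0.5   # illustrative sizes

H_list = [rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K)) for _ in range(N_c)]
V_list = [np.eye(K) for _ in range(N_c)]
q_list = [np.full(K, 1.0) for _ in range(N_c)]

smse_a = smse_closed_form(H_list, V_list, q_list, sigma2)
smse_b = smse_direct(H_list, V_list, q_list, sigma2)
```

The two values agree up to rounding, which follows from $\mathrm{tr}(J^{-1} B B^{H}) = M - \sigma^2\, \mathrm{tr}(J^{-1})$.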

4.1.2 Problem Statement

Consider a system with one BS that needs to communicate with a total of Kt ≫ M users.

Clearly, since linear precoding is used, no more than M data streams, corresponding to

a maximum of M users (assuming Lk = 1) can be transmitted on each subcarrier. With

Kt ≫ M users in the system, it is not clear which users should be allocated to each

subcarrier. Note that when the user allocation on the subcarriers is known, minimizing

the SMSE can be done using the scheme in Table 2.1 [8]. The precoding and decoding

matrices on a given subcarrier are derived in exactly the same way using that subcarrier’s

frequency domain channel matrix. The power allocation is a convex problem since it is the

sum of the Nc separate convex power allocation problems of the Nc subcarriers. Therefore,

it can be optimally and easily performed in one step over all subcarriers. On the other

hand, when the user allocation on the subcarriers is not known, finding the optimal user

group for each subcarrier that minimizes the SMSE of the system is a very complicated

problem and the brute force approach is computationally prohibitive [39]. Thus, the

problem becomes finding a practical method, which determines a user allocation that

gives an acceptable performance. Note that user fairness is not addressed as it further

complicates an already complex problem.

4.2 Original Algorithm and Proposed Modifications

The following approach was proposed in [39] to address the problem at hand. Basically,

the iteration step of the multiuser algorithm in Table 2.1 is run on all Kt users on a

certain subcarrier. The top K streams that were allocated the most power are then

assigned to that subcarrier. The same procedure is repeated over all subcarriers.

In the following, we propose two modifications to the original approach. First, we


propose to choose the K users with the lowest individual MSEs instead of those with the

highest power allocation. This choice is intuitive as our ultimate goal is minimizing the

SMSE of the system. Moreover, some ‘good’ users can achieve a low MSE, and hence

contribute well to decreasing the SMSE, with low power. Such users are more likely to

be discarded by the original approach. Second, we propose to choose the K users by

successively discarding the worst Kd ≤ Kt − K users and repeating the iteration until

the number of remaining users, Kr, becomes K. Clearly, as Kd increases, the running

time decreases. Note that Kd can be changed for each iteration. The intuition behind

this modification is that the presence of a high number of ‘bad’ users (Kt − K ≫ K) can

affect which are the ‘best’ users. By gradually discarding ‘bad’ users, the choice of the

‘best’ users should logically improve. The modified approach is detailed in Table 4.1.
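The successive-discard selection with the lowest-MSE criterion can be sketched as below. This is a structural illustration only: `per_user_mse` is a hypothetical callback standing in for a full run of the SMSE minimization on the remaining users, and the toy MSE values are invented for the example.

```python
def select_users(user_ids, k_target, discard_schedule, per_user_mse):
    """Discard the worst users in stages until k_target users remain.

    discard_schedule lists K_d for each stage; per_user_mse(users) must return
    a dict mapping each remaining user to its individual MSE.
    """
    remaining = list(user_ids)
    for k_d in discard_schedule:
        if len(remaining) <= k_target:
            break
        k_d = min(k_d, len(remaining) - k_target)   # never drop below the target
        mse = per_user_mse(remaining)               # re-evaluated after every discard
        remaining.sort(key=lambda u: mse[u])        # lowest MSE first
        remaining = remaining[:len(remaining) - k_d]
    if len(remaining) != k_target:
        raise ValueError("discard schedule does not reach the target number of users")
    return remaining

# Toy stand-in: each user's MSE grows mildly with the number of users present.
base_mse = {0: 0.9, 1: 0.2, 2: 0.5, 3: 0.1, 4: 0.7, 5: 0.3, 6: 0.8}

def toy_per_user_mse(users):
    return {u: base_mse[u] * (1.0 + 0.1 * len(users)) for u in users}

selected = sorted(select_users(range(7), 4, [2, 1], toy_per_user_mse))
```

With a schedule of `[3]` the minimization runs once, and with `[2, 1]` it runs twice, matching the two kinds of scenarios considered in the simulations. In this toy model the ranking does not change between stages, so both schedules select the same users; with real channels the ranking can change, which is the motivation for discarding gradually.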


The following simulation was performed to evaluate the new approach. Consider a system

with one BS with M = 4 antennas, and Kt = 7 total users, each with Nk = 1 antenna

and Lk = 1 data stream. Let the number of subcarriers be Nc = 1. Having one subcarrier

logically should not affect the results, since the subcarriers are assumed independent, and

the user allocation approach simply performs the same procedure over all subcarriers.

This is further supported by the fact that when our simulation was set to select users by

the approach presented in [39] (performed using Nc = 64), it yielded very close results.

The following four scenarios were simulated. In the first scenario, Kd = 3, i.e., the

SMSE minimization iteration ran only once, and the K = 4 top users were selected

according to the highest power criterion. In the second scenario, Kd = 3 as well, but

the K = 4 top users were selected according to the lowest MSE criterion. In the third

scenario, Kd = [2, 1], i.e., the SMSE minimization iteration ran twice, and the highest

power criterion was used. In the fourth scenario, Kd = [2, 1] as well, and the lowest MSE

criterion was used. The total number of users was chosen to be relatively low (Kt = 7)

in order to be able to perform a brute force search for the best user allocation. For each

scenario, the percentage of times that the K = 4 selected users were identical to those


Repeat for n = 1 : Nc

    Set Kr = Kt.

    Repeat until Kr is K

        1. Choose an appropriate value for Kd.

        2. Minimize SMSE for all Kt users.

            Initialization:

                $\mathbf{V}_k = \mathrm{SVD}(\mathbf{H}_k)$, $\mathbf{Q} = (P_{\max}/LK_t)\,\mathbf{I}_{LK_t}$

            Iteration:

                i. Find virtual uplink precoding vectors, for k = 1 : Kr, j = 1 : Lk

                   $$\mathbf{v}_{kj}(n) = e_{\max}\!\left(\mathbf{H}_k^H(n)\,\mathbf{J}_{kj}^{-2}\,\mathbf{H}_k(n),\; \mathbf{I}/q_{kj}(n) + \mathbf{H}_k^H(n)\,\mathbf{J}_{kj}^{-1}\,\mathbf{H}_k(n)\right)$$

                   $e_{\max}$ returns the normalized eigenvector with highest eigenvalue.

                ii. Find virtual uplink power allocation to minimize SMSE.

                   $$\mathbf{q}(n) = \arg\min_{\mathbf{q}(n)} \mathrm{tr}\!\left(\mathbf{J}^{-1}(n)\right), \quad \text{subject to } q_{kj}(n) > 0,\; \|\mathbf{q}(n)\| \le P_{\max}$$

                iii. Repeat iteration until old SMSE - new SMSE < ε

        3. Discard the Kd users with the highest individual MSEs.

Table 4.1: Modified user selection algorithm


[Plot: percentage match to optimal allocation (50 to 95) versus 10 log(Pmax/σ2) (0 to 15). Curves: Scenario 1, Kd = 3, highest power criterion; Scenario 2, Kd = 3, lowest MSE criterion; Scenario 3, Kd = [2, 1], highest power criterion; Scenario 4, Kd = [2, 1], lowest MSE criterion.]

Figure 4.1: Four different scenarios for user selection

selected by the brute force search over 7C4 combinations was found at different SNRs.

Figure 4.1 shows the results.
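For reference, the brute force benchmark enumerates every group of K users out of Kt and keeps the group with the smallest SMSE. A minimal Python sketch follows; `brute_force_best` and `toy_smse` are hypothetical names, and the additive toy cost is only a placeholder for the SMSE that the precoding algorithm would return for each candidate group.

```python
import itertools
import math
from typing import Callable, Sequence, Tuple

def brute_force_best(users: Sequence[int], k: int,
                     smse: Callable[[Tuple[int, ...]], float]) -> Tuple[int, ...]:
    """Exhaustively search all k-subsets and return the one with the lowest SMSE."""
    return min(itertools.combinations(users, k), key=smse)

# Number of candidate groups for Kt = 7 and K = 4.
num_groups = math.comb(7, 4)

# Hypothetical per-user costs; NOT data from the thesis.
COST = [0.9, 0.1, 0.5, 0.2, 0.8, 0.3, 0.7]

def toy_smse(group: Tuple[int, ...]) -> float:
    # Placeholder: a real SMSE is not additive across users.
    return sum(COST[u] for u in group)

best_group = brute_force_best(range(7), 4, toy_smse)
```

The number of groups grows combinatorially with Kt, which is why the search is only feasible for a small user pool such as the one simulated here.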

The first observation to make is the vast improvement that is achieved, especially at

high SNRs; the lowest MSE criterion clearly outperforms the highest power criterion.

The second observation is the improvement achieved by discarding the ‘bad’

users gradually rather than all at once. This improvement is achieved, independent of

which criterion is used. Finally, the third observation to make is the lower degree of

dependence of the lowest MSE criterion on the SNR.

4.4 MIMO-OFDM and the Hybrid Algorithm

Before discussing how users can be chosen for the hybrid algorithm, we will present how

it can be used in conjunction with OFDM. As described in Chapter 3, the hybrid algo-

rithm cooperates only for edge users to avoid the problems associated with asynchronous

interference. Therefore, OFDM can be used and the hybrid algorithm can simply be run


on each subcarrier. Thus, the system model for the MIMO-OFDM system can be easily

derived from the one in Chapter 3 for the hybrid algorithm by incorporating a subcarrier

index n to all the variables. The expressions of the estimated data vectors of intra-cell

users in the downlink and uplink for b = 1, 2, the estimated data vectors of edge users

in the downlink after interference has been canceled, as well as the SMSE in the uplink

are given below. Note that the channel matrices below represent the flat fading MIMO

channel response on each subcarrier in the frequency domain.

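$$\mathbf{x}_k^{DL}(n) = \mathbf{V}_k^H(n)\,\mathbf{H}_k^{(b)H}(n) \sum_{j \in \Omega_b, \Omega_e} \mathbf{U}_j^{(b)}(n)\sqrt{\mathbf{P}_j(n)}\,\mathbf{x}_j(n) + \mathbf{V}_k^H(n)\,\mathbf{n}_k(n), \tag{4.5}$$

$$\mathbf{x}_k^{UL}(n) = \mathbf{U}_k^H(n) \sum_{j \in \Omega_b, \Omega_e} \mathbf{H}_j^{(b)}(n)\,\mathbf{V}_j(n)\sqrt{\mathbf{Q}_j(n)}\,\mathbf{x}_j(n) + \mathbf{U}_k^H(n)\,\mathbf{n}^{(b)}(n), \tag{4.6}$$

$$\mathbf{d}_e(n) = \left(\mathbf{V}_e^H(n)\,\mathbf{H}_e^H(n)\,\mathbf{U}_e(n)\sqrt{\mathbf{P}_e(n)}\,\mathbf{v}_e(n) + \mathbf{V}_e^H(n)\,\mathbf{n}_e(n)\right) \bmod 2M_{const}, \tag{4.7}$$

$$\mathrm{SMSE} = \sum_{n=1}^{N_c} \left[ L(n) - 4M + \mathrm{tr}\!\left(\mathbf{J}^{(1)-1}(n)\right) + \mathrm{tr}\!\left(\mathbf{J}_e^{-1}(n)\right) + \mathrm{tr}\!\left(\mathbf{J}^{(2)-1}(n)\right) \right], \tag{4.8}$$

where

$$\mathbf{J}^{(1)}(n) = \mathbf{H}_{in}^{(1)}(n)\,\mathbf{V}_{in}^{(1)}(n)\,\mathbf{Q}_{in}^{(1)}(n)\,\mathbf{V}_{in}^{(1)H}(n)\,\mathbf{H}_{in}^{(1)H}(n) + \sigma^2\mathbf{I}_M, \tag{4.9}$$

$$\mathbf{J}_e(n) = \mathbf{H}_e(n)\,\mathbf{V}_e(n)\,\mathbf{Q}_e(n)\,\mathbf{V}_e^H(n)\,\mathbf{H}_e^H(n) + \sigma^2\mathbf{I}_{2M}, \tag{4.10}$$

$$\mathbf{J}^{(2)}(n) = \mathbf{H}_{in}^{(2)}(n)\,\mathbf{V}_{in}^{(2)}(n)\,\mathbf{Q}_{in}^{(2)}(n)\,\mathbf{V}_{in}^{(2)H}(n)\,\mathbf{H}_{in}^{(2)H}(n) + \sigma^2\mathbf{I}_M. \tag{4.11}$$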

To run the hybrid algorithm for each subcarrier, the steps presented in Section 3.2.3 are

repeated over all subcarrier indices; however, the power allocation step is run once per


iteration over all subcarriers. This procedure is presented below. As seen for the hybrid

algorithm, the power allocation problem for minimizing the SMSE on one subcarrier is

a convex problem; therefore, the power allocation problem to minimize the sum of the

SMSEs over all the subcarriers is also a convex problem. As mentioned previously in

Section 3.2.3, the uplink precoding matrices (the V matrices) are initialized to the right

singular vectors of the channel matrices. The uplink power matrices (the Q matrices) are

initialized by dividing the available power equally among all data streams. The modified

channel matrices are initialized by adjusting the true channel matrices using the initial

power and precoding matrices.

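Iteration:

1. Solve the following convex power allocation problem.

$$\left[\mathbf{Q}^{(1)}(n), \mathbf{Q}_e(n), \mathbf{Q}^{(2)}(n)\right]_{n=1,\ldots,N_c} = \arg\min_{\substack{\mathbf{Q}^{(1)}(n),\,\mathbf{Q}_e(n),\,\mathbf{Q}^{(2)}(n) \\ n=1,\ldots,N_c}} \sum_{n=1}^{N_c} \left( \mathrm{tr}\!\left[\mathbf{J}^{(1)-1}(n)\right] + \mathrm{tr}\!\left[\mathbf{J}_e^{-1}(n)\right] + \mathrm{tr}\!\left[\mathbf{J}^{(2)-1}(n)\right] \right)$$

$$\text{s.t. } \sum_{n=1}^{N_c} \left( \mathrm{tr}\!\left[\mathbf{Q}^{(1)}(n)\right] + \mathrm{tr}\!\left[\mathbf{Q}_e(n)\right] + \mathrm{tr}\!\left[\mathbf{Q}^{(2)}(n)\right] \right) \le P_{tot}$$

Repeat steps 2 to 5 for n = 1, . . . , Nc:

2. Find new V matrices for edge users.

$$\mathbf{v}_{kj}(n) = e_{\max}\!\left(\mathbf{H}_{e,k}^H(n)\,\mathbf{J}_{e,kj}^{-2}(n)\,\mathbf{H}_{e,k}(n),\; \mathbf{I}/Q_{e,kj}(n) + \mathbf{H}_{e,k}^H(n)\,\mathbf{J}_{e,kj}^{-1}(n)\,\mathbf{H}_{e,k}(n)\right)$$

N.B.: $e_{\max}$ returns the eigenvector with the highest eigenvalue.

3. Find the U matrices with normalized columns for all edge users and the global Pe matrix.

$$\mathbf{U}_{e,k}(n) = \mathbf{J}_e^{-1}(n)\,\mathbf{H}_{e,k}(n)\,\mathbf{V}_{e,k}(n)\sqrt{\mathbf{Q}_{e,k}(n)}, \qquad \mathbf{u}_{e,kj}(n) = \mathbf{u}_{e,kj}(n)/\|\mathbf{u}_{e,kj}(n)\|$$

$$\mathbf{P}_e(n) = \tfrac{3}{4}\,\sigma^2\,\mathrm{diag}\!\left[\left(\mathbf{D}_e^{-1}(n) - \boldsymbol{\Psi}_e(n)\right)^{-1}\mathbf{1}\right]$$

N.B.: The factor 3/4 is included to account for the increased average power of xe(n) due to THP as mentioned previously in Section 3.2.1.

4. Find Rz,k(n) as in Eqn. (3.7) or Eqn. (3.9) and update Hk(n) for all users in BSs 1 and 2.

$$\mathbf{H}_k^{(b)}(n) = \mathbf{H}_k^{(b)}(n)\,\mathbf{R}_{z,k}^{-\frac{1}{2}}(n)$$

5. Find new V matrices for intra-cell users, b = 1, 2.

$$\mathbf{v}_{kj}(n) = e_{\max}\!\left(\mathbf{H}_k^{(b)H}(n)\,\mathbf{J}_{kj}^{(b)-2}(n)\,\mathbf{H}_k^{(b)}(n),\; \mathbf{I}/Q_{kj}^{(b)}(n) + \mathbf{H}_k^{(b)H}(n)\,\mathbf{J}_{kj}^{(b)-1}(n)\,\mathbf{H}_k^{(b)}(n)\right)$$

6. Repeat steps 1 to 5 above, until the relative change in SMSE is within a certain threshold, or until SMSE shows an increase at any step.

Update:

Repeat steps 1 and 2 for n = 1, . . . , Nc:

1. Find the U matrices with normalized columns for all intra-cell users and their global power allocation matrices, b = 1, 2.

$$\mathbf{U}_k^{(b)}(n) = \mathbf{J}^{(b)-1}(n)\,\mathbf{H}_k^{(b)}(n)\,\mathbf{V}_k^{(b)}(n)\sqrt{\mathbf{Q}_k^{(b)}(n)}, \qquad \mathbf{u}_{kj}^{(b)}(n) = \mathbf{u}_{kj}^{(b)}(n)/\|\mathbf{u}_{kj}^{(b)}(n)\|$$

$$\mathbf{P}^{(b)}(n) = \sigma^2\,\mathrm{diag}\!\left[\left(\mathbf{D}^{(b)-1}(n) - \boldsymbol{\Psi}^{UL\,(b)T}(n)\right)^{-1}\mathbf{1}\right]$$

2. Find Ge(n) as given in Eqn. (3.5).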

4.5 User Selection for the Hybrid Algorithm

In Section 4.1.2, we mentioned that in a single-BS system it is clear that the maximum

number of data streams that can be transmitted on a single subcarrier in the downlink is

equal to the number of BS transmit antennas, M , since linear precoding is used. When

considering the hybrid algorithm, the theoretical maximum is not as clear; however,

logically, the maximum number of edge data streams (i.e. data streams of edge users)

should be equal to the sum of the number of antennas of the two BSs cooperating to

communicate with the edge users. This follows from the fact that the hybrid algorithm

treats the edge users as a separate group that employs linear precoding. As for the

intra-cell users, since the hybrid algorithm also treats them as separate groups that use

linear precoding, the maximum number of intra-cell data streams is equal to the number

of transmit antennas of the corresponding BS. Assuming that both BSs have the same

number of antennas M , the maximum number of edge data streams would be 2M , and

that of the intra-cell data streams would be M . Figure 4.2 shows the performance of the

hybrid algorithm for B = 2,M = 4, K1 = K2 = 4, Ke = 8, Nk = 1, and Lk = 1.


[Plot: BER (10^−4 to 10^0, logarithmic) versus 10 log(Pmax/σ2) (−2 to 14). Curves: K1 = K2 = 1, Ke = 4; K1 = K2 = 4, Ke = 8.]

Figure 4.2: Performance of hybrid algorithm for K1 = K2 = 4 and Ke = 8

As we can see in the figure, it is possible to communicate with 4 intra-cell users in

each cell and 8 edge users, and many users are serviced simultaneously. On the other

hand, the average performance per stream is poor, even at high SNRs. Since even

simple applications require a better BER than what was achieved, such a data stream

distribution is not desirable. Consequently, we propose to simulate the performance over

several data stream distributions, and empirically determine an appropriate choice. Note

that, as mentioned previously, in a realistic scenario, there would be a very high number of

users, of which few are allocated on a single subcarrier. Therefore, before proceeding with

the data stream simulation exercise, a method to select users for the hybrid algorithm

is necessary. One intuitive option would be running the modified selection algorithm

described in Section 4.2 on all the users, and then selecting a certain number of best

users for each user group. An example is shown in Figure 4.3 for a total of ten users:

three intra-cell users in each cell and four edge users. Another option would be running

the modified algorithm after replacing the SMSE minimizing iteration by that of the

hybrid algorithm. To make the proper choice, both options were simulated to select 1


intra-cell user for each cell and 3 edge users from a total of 4 intra-cell users per cell

and 8 edge users. Note that Kd was set such that the iteration takes place only once.

The results are shown in Figure 4.4. The selection algorithm as presented in Section 4.2

performs slightly better than using the iteration from the hybrid algorithm.

Figure 4.3: User selection method for the hybrid algorithm

Based on the above result, we present some sample curves in Figures 4.5 and 4.6

obtained by selecting different numbers of intra-cell and edge users (same as selecting

data streams, since Lk = 1) from a total of 4 intra-cell users per cell and 8 edge users.

In general, such simulations should be randomized over the numbers of total users,

selected users, BS antennas, user antennas, user positions, and channels. Obviously, this

requires a lot of time, but it can be performed once, and the statistics can be stored in

look-up tables for fast access during real-time operation. A drawback of such a scheme

is the lack of individualization, since the BER used is an average value of all the data

streams.

[Plot: BER (10^−5 to 10^0, logarithmic) versus 10 log(Pmax/σ2) (−2 to 12). Curves: selection using the modified selection algorithm directly; selection using the modified selection algorithm with the iteration of the hybrid algorithm.]

Figure 4.4: Two different user selection methods for the hybrid algorithm

4.6 Summary

In this chapter, we reviewed the MIMO-OFDM user selection algorithm proposed in [39]

and made two modifications to it. In the first, we proposed to rank the users according to

their individual MSEs instead of their allocated powers. In the second, we proposed to

knock out ‘bad’ users gradually, instead of choosing the ‘good’ users all at once. Simula-

tions showed the significant improvement achieved after applying these two modifications.

We then presented several simulations related to user selection for the hybrid algorithm.

In the first, we demonstrated the effect of increasing the number of users on the perfor-

mance of the hybrid algorithm. In the second, we explored which of two possible ways

is better for user selection for the hybrid algorithm. In the third, we presented sample

curves that can be used to determine how many users should be allocated on each

subcarrier when the hybrid algorithm is used with OFDM, based on the available power

and the average required error rate per data stream.


[Plot: BER (10^−6 to 10^0, logarithmic) versus 10 log(Pmax/σ2) (−2 to 14). Curves: 121, 131, 141, and 151 arrangements.]

Figure 4.5: Varying number of edge users for 1 intra-cell user per cell (1f1 refers to f edge users)

[Plot: BER (10^−5 to 10^0, logarithmic) versus 10 log(Pmax/σ2) (−2 to 14). Curves: 222, 232, 242, and 252 arrangements.]

Figure 4.6: Varying number of edge users for 2 intra-cell users per cell (2f2 refers to f edge users)

Chapter 5

Conclusions and Future Work

5.1 Conclusions

In this thesis, we considered the downlink of a wireless cellular system with multiple

antennas at the transmitters (BSs) and receivers (users). Our goal was to provide insight

on how communication in such a system can be performed in order to take advantage of

the multiple antennas and the fact that the entire bandwidth is being used in all the cells.

Accordingly, we have taken into consideration both positive and negative implications,

namely the possibility of cooperation between the multiple BSs and the resulting inter-

cell interference, respectively. We studied how the BSs can communicate with the users

assuming the availability of complete CSI at the BSs. Despite the advantages that can be

gained, many implementation issues should be addressed before a practical and efficient

deployment can be made. Overall, the contributions of this thesis were:

• Developing a downlink/uplink duality in a single-carrier, multi-BS, MIMO system

with asynchronous interference.

• Proposing a multiuser, multi-BS, linear precoding algorithm that makes use of

downlink/uplink duality to minimize the SMSE of the system while accounting for

asynchronous interference.


• Proposing a cooperative hybrid algorithm that combines linear precoding and non-

linear THP and minimizes the SMSE of the system. This algorithm helps ad-

dress the difficulties associated with employing BS cooperation in conjunction with

OFDM by having two BSs cooperate only for edge users that are almost equally

distant from them.

• Modifying an existing user selection algorithm for MIMO-OFDM to enhance its

performance and suggesting how it can be used to select users when the hybrid

algorithm is used along with OFDM.

In Chapter 2, we presented a detailed model for the system described above that takes

into account the required timing advances for synchronous reception at the users, based

on the model in [20]. We provided two possible virtual uplink models, again with timing

details, and showed that, with the appropriate choice of an uplink model, a duality exists

between the downlink and uplink, despite the presence of asynchronous interference. This

proof generalizes the downlink/uplink duality to the multiuser, multi-BS, MIMO case.

With duality in hand, we extended an existing single-BS linear precoding algorithm [8]

based on the downlink/uplink duality, to accommodate the presence of multiple BSs and

asynchronous interference. In our case, the power allocation step was not provably con-

vex; however, we provided simulations that suggested that the SMSE was still convex in

the powers of the data streams. Simulations showed the performance of the extension and

compared it to the performance assuming all reception was synchronous. This quantified

the loss incurred in the presence of asynchronous interference.

We briefly discussed the complications that arise when multiple BSs attempt to com-

municate with the users through OFDM, and noted that one may study the effect of

asynchronous interference on the OFDM symbols and code against that. However, we

chose to make the BSs cooperate only when communicating with edge users, first, be-

cause those users are in need of more help than intra-cell users given they are on the

edge and suffer from inter-cell interference more, and second because the effects of asyn-

chronism are limited at edge users and the designed algorithms can be easily extended

to OFDM. In Chapter 3, we proposed an algorithm for two cooperating BSs that is a

hybrid between linear and non-linear precoding. Non-linear THP is used to pre-subtract


interference that edge users receive from intra-cell users. Intra-cell users, on the other

hand, treat the interference they see from transmissions to edge users as colored noise

and whiten it. Accordingly, each user group is treated separately when deriving the

precoding and decoding matrices used internally to protect against MUI. However, the

convex power allocation is done globally, and the algorithm attempts to minimize SMSE

of the entire system. We demonstrated that this approach performs better than sim-

ple linear precoding algorithms with the same amount of cooperation. When nonlinear

THP is used, the complicated user ordering problem often arises. In our case, this is not

an issue, since the pre-subtraction happens among only two groups and hence only two

possible orders exist and both were considered. When the transmission to the intra-cell

users is pre-subtracted from that to the edge users, the hybrid algorithm is found to per-

form better. In other simulations, we showed that the users can estimate the decoding

matrices and the whitening filters, which accordingly need not be communicated from

the BSs to the users in any manner. Finally, we provided a brief discussion regarding

the implementation issues for the hybrid algorithm. While the physical complications

might outweigh the achieved performance and hinder the practical deployment of such a

scheme, this work was an exploration of one of many possible scenarios for BS cooper-

ation, and helped demonstrate the challenge involved in designing cooperative systems

and underline the issue of asynchronous interference.

In Chapter 4, we considered the problem of user selection for the subcarriers in a

MIMO-OFDM system. We started out with the user selection algorithm presented in [39]

for the MIMO-SMSE algorithm of [8] and modified it in two ways, namely selecting

users with the lowest individual MSEs and dropping ‘bad’ users instead of picking ‘good’

users. We demonstrated the improvements achieved by the two modifications. Finally, we

provided results of a simulation exercise that suggested how many users should be placed

on one subcarrier for the hybrid algorithm and how to choose these users. Obviously, this

is not an optimal solution. The high number of users and subcarriers, varying optimality

criteria (maximizing data rates, minimizing error rates, meeting quality-of-service (QoS)

requirements, etc.), and fairness issues complicate matters extremely. Therefore, the

methodology for a solution is neither simple nor clear, and optimal user selection for

MIMO-OFDM remains an open problem.


5.2 Future Work

Several interesting problems encountered while working on this thesis can be explored

further. First, as mentioned previously in Section 2.6, one can study the effect of asyn-

chronous interference as presented in this work on OFDM. Exploring how the misalign-

ment of the pulse shapes of the arriving OFDM symbols with the matched filters at the

receivers affects the actual data symbols can help in the design of linear precoding to

guard against asynchronous interference in OFDM.

Second, it would be useful and more practical to solve similar problems to those

tackled in Chapter 2 and Chapter 3 with per-BS power constraints. It is interesting

to mention that the discussion provided by Schubert and Boche in [6] that shows that

maximizing the minimum SINR with a total power constraint leads to the same SINR-

to-target ratio across all users does not necessarily extend to a multi-BS case with per-BS

power constraints. Investigating such a scenario can give more insight into the design of

precoding schemes for multi-BS systems with per-BS power constraints.

Third, addressing the implementation issues described in Section 3.5 is important,

especially determining how partial CSI affects the performance of an algorithm that

minimizes SMSE. This includes determining how to run the algorithm when other forms

of CSI are available, for example, the channel covariance matrices instead of the exact

channel values. This also includes studying the effect of discrete, quantized CSI values

on the performance. Both of the above points were briefly discussed in [8] for the MIMO-

SMSE algorithm it proposed. The implementation issues also include the phase noise

and frequency offset associated with OFDM. Treating these problems is crucial for the

effective application of OFDM, which is an essential component of the system we are

dealing with.

Fourth, since no convincing algorithm for MIMO-OFDM user selection was presented,

this area needs further exploration. The focus should be devising an algorithm that per-

forms joint optimal user selection according to individual user or data stream QoS re-

quirements (possibly through MSE, BER, or data rate constraints) and power constraints.

For example, the user selection process can be integrated into the SMSE minimization

algorithm by using different weights for the contribution of different users and restricting


Figure 5.1: Extension of hybrid algorithm to 3 BSs

those weights to the integers 0 and 1 (integer programming).

Fifth, alternative ways to battle asynchronous interference may be explored. For

example, the RAKE receiver, common in CDMA systems, can receive delayed versions

of the same signal and combine them. It assumes that the different versions arrive at

integer multiples of the chip period. However, the chip period in CDMA systems is much

shorter than the symbol period assumed in this work, and integer multiples of it may be

used to approximate the continuous delay values. Note that the RAKE receiver combines

multiple, hopefully independent, versions of the same signal to achieve diversity, while

in our case it can combine the different signals from different BSs intended for one user

to achieve diversity.

Finally, one can study the possibility of extending the hybrid algorithm to three BSs.

Directional antennas can be used to provide cell sectorization, with 120° sectors, in order

to use the hybrid algorithm throughout the system. Figure 5.1 shows this setup, with

1 BS, 2 BSs, or 3 BSs communicating with users in the light area, slightly darker area,

or darkest area, respectively. Note that if 60° sectors were used (at the cost of more

hardware), then the hybrid algorithm can be used as is for each two adjacent sectors in

different cells.

Appendix A

Equating MSEs

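We generalize the MSE duality for single-carrier, MIMO systems with multiple cooperating BSs and asynchronous interference. Let $\tilde{\mathbf{v}}_{kl}$ and $\tilde{\mathbf{u}}_{kl}$ be the MMSE decoding vectors for data stream l of user k in the downlink and uplink, respectively. From Eqn. (2.65) and Eqn. (2.61),

$$\tilde{\mathbf{v}}_{kl} = \left( \mathbf{H}_k^H \mathbf{U}_k \mathbf{P}_k \mathbf{U}_k^H \mathbf{H}_k + \mathbf{H}_k^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{B}_{ck} \mathbf{H}_k + \sigma^2 \mathbf{I}_{N_k} \right)^{-1} \mathbf{H}_k^H \mathbf{u}_{kl} \sqrt{p_{kl}}, \tag{A.1}$$

$$\tilde{\mathbf{u}}_{kl} = \left( \mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H + \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{A}_{ck} + \sigma^2 \mathbf{I}_{N_k} \right)^{-1} \mathbf{H}_k \mathbf{v}_{kl} \sqrt{q_{kl}}. \tag{A.2}$$

Let $\mathbf{v}_{kl} = \tilde{\mathbf{v}}_{kl}/\|\tilde{\mathbf{v}}_{kl}\|$ and $\mathbf{u}_{kl} = \tilde{\mathbf{u}}_{kl}/\|\tilde{\mathbf{u}}_{kl}\|$. From Eqn. (2.58), the MSE of data stream l of user k in the uplink is

$$\begin{aligned} \varepsilon_{kl}^{UL} ={}& \tilde{\mathbf{u}}_{kl}^H \mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H \tilde{\mathbf{u}}_{kl} + \tilde{\mathbf{u}}_{kl}^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{A}_{ck} \tilde{\mathbf{u}}_{kl} + \sigma^2 \tilde{\mathbf{u}}_{kl}^H \tilde{\mathbf{u}}_{kl} \\ & - \sqrt{q_{kl}}\, \mathbf{v}_{kl}^H \mathbf{H}_k^H \tilde{\mathbf{u}}_{kl} - \tilde{\mathbf{u}}_{kl}^H \mathbf{H}_k \mathbf{v}_{kl} \sqrt{q_{kl}} + 1. \end{aligned} \tag{A.3}$$

In a similar derivation to Eqn. (2.58), but using $\mathbf{x}_k^{DL}$ in Eqn. (2.13), the MSE of data stream l of user k in the downlink is

$$\begin{aligned} \varepsilon_{kl}^{DL} ={}& \tilde{\mathbf{v}}_{kl}^H \mathbf{H}_k^H \mathbf{U}_k \mathbf{P}_k \mathbf{U}_k^H \mathbf{H}_k \tilde{\mathbf{v}}_{kl} + \tilde{\mathbf{v}}_{kl}^H \mathbf{H}_k^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{B}_{ck} \mathbf{H}_k \tilde{\mathbf{v}}_{kl} + \sigma^2 \tilde{\mathbf{v}}_{kl}^H \tilde{\mathbf{v}}_{kl} \\ & - \sqrt{p_{kl}}\, \mathbf{u}_{kl}^H \mathbf{H}_k \tilde{\mathbf{v}}_{kl} - \tilde{\mathbf{v}}_{kl}^H \mathbf{H}_k^H \mathbf{u}_{kl} \sqrt{p_{kl}} + 1. \end{aligned} \tag{A.4}$$

Setting $\mathrm{SINR}_{kl}^{DL} = \mathrm{SINR}_{kl}^{UL}$, and with simple mathematical manipulations, we get

$$\begin{aligned} & \frac{\|\tilde{\mathbf{u}}_{kl}\|^2}{p_{kl}} \left( \tilde{\mathbf{v}}_{kl}^H \mathbf{H}_k^H \mathbf{U}_k \mathbf{P}_k \mathbf{U}_k^H \mathbf{H}_k \tilde{\mathbf{v}}_{kl} + \tilde{\mathbf{v}}_{kl}^H \mathbf{H}_k^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{B}_{ck} \mathbf{H}_k \tilde{\mathbf{v}}_{kl} + \sigma^2 \tilde{\mathbf{v}}_{kl}^H \tilde{\mathbf{v}}_{kl} \right) \\ &\quad = \frac{\|\tilde{\mathbf{v}}_{kl}\|^2}{q_{kl}} \left( \tilde{\mathbf{u}}_{kl}^H \mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H \tilde{\mathbf{u}}_{kl} + \tilde{\mathbf{u}}_{kl}^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{A}_{ck} \tilde{\mathbf{u}}_{kl} + \sigma^2 \tilde{\mathbf{u}}_{kl}^H \tilde{\mathbf{u}}_{kl} \right). \end{aligned} \tag{A.5}$$

Assuming $\|\tilde{\mathbf{v}}_{kl}\| = \alpha/\sqrt{p_{kl}}$ and $\|\tilde{\mathbf{u}}_{kl}\| = \alpha/\sqrt{q_{kl}}$, where α is a constant, Eqn. (A.5) simplifies to

$$\begin{aligned} & \tilde{\mathbf{v}}_{kl}^H \mathbf{H}_k^H \mathbf{U}_k \mathbf{P}_k \mathbf{U}_k^H \mathbf{H}_k \tilde{\mathbf{v}}_{kl} + \tilde{\mathbf{v}}_{kl}^H \mathbf{H}_k^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{B}_{ck} \mathbf{H}_k \tilde{\mathbf{v}}_{kl} + \sigma^2 \tilde{\mathbf{v}}_{kl}^H \tilde{\mathbf{v}}_{kl} \\ &\quad = \tilde{\mathbf{u}}_{kl}^H \mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H \tilde{\mathbf{u}}_{kl} + \tilde{\mathbf{u}}_{kl}^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{A}_{ck} \tilde{\mathbf{u}}_{kl} + \sigma^2 \tilde{\mathbf{u}}_{kl}^H \tilde{\mathbf{u}}_{kl}, \end{aligned} \tag{A.6}$$

and we get

$$\varepsilon_{kl}^{DL} = \varepsilon_{kl}^{UL}. \tag{A.7}$$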
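One step is worth spelling out: Eqn. (A.6) equates only the quadratic terms of the two MSE expressions, so the cross terms of Eqns. (A.3) and (A.4) must also agree for Eqn. (A.7) to follow. The sketch below rests on an assumed reading of the notation, namely that the decoding vector in each MSE expression is the unnormalized MMSE vector (written here with a tilde, a symbol introduced only for this sketch) and that the precoding vector is the normalized one.

```latex
% Assumed notation: \tilde{u}_{kl}, \tilde{v}_{kl} are the unnormalized MMSE decoders,
% with \|\tilde{u}_{kl}\| = \alpha/\sqrt{q_{kl}} and \|\tilde{v}_{kl}\| = \alpha/\sqrt{p_{kl}}.
\begin{align*}
\sqrt{q_{kl}}\,\mathbf{v}_{kl}^H \mathbf{H}_k^H \tilde{\mathbf{u}}_{kl}
  &= \sqrt{q_{kl}}\,\frac{\alpha}{\sqrt{q_{kl}}}\,\mathbf{v}_{kl}^H \mathbf{H}_k^H \mathbf{u}_{kl}
   = \alpha\,\mathbf{v}_{kl}^H \mathbf{H}_k^H \mathbf{u}_{kl}, \\
\sqrt{p_{kl}}\,\tilde{\mathbf{v}}_{kl}^H \mathbf{H}_k^H \mathbf{u}_{kl}
  &= \sqrt{p_{kl}}\,\frac{\alpha}{\sqrt{p_{kl}}}\,\mathbf{v}_{kl}^H \mathbf{H}_k^H \mathbf{u}_{kl}
   = \alpha\,\mathbf{v}_{kl}^H \mathbf{H}_k^H \mathbf{u}_{kl}.
\end{align*}
```

Under this reading, the uplink and downlink cross terms (each of these products plus its conjugate) are both equal to $-\alpha(\mathbf{v}_{kl}^H \mathbf{H}_k^H \mathbf{u}_{kl} + \mathbf{u}_{kl}^H \mathbf{H}_k \mathbf{v}_{kl})$, so equality of the quadratic terms is enough to equate the two MSEs.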

Appendix B

Derivations for Asynchronous

Interference

We explain the expressions of the asynchronous interference in the downlink and uplink,

and derive the expected values in Eqn. (2.15) and Eqn. (2.27).

Recall that $\mathbf{i}_{jk}^{(b)}$ is the misaligned interference caused on user k when BS b transmits to user j. Its expression is repeated in Eqn. (B.1).

$$\mathbf{i}_{jk}^{(b)}(m) = \rho(\delta_{jk}^{(b)} - T_S)\,\mathbf{x}_j(m_{jk}^{(b)}) + \rho(\delta_{jk}^{(b)})\,\mathbf{x}_j(m_{jk}^{(b)} + 1), \tag{B.1}$$

where

$$\rho(\tau) = \int_0^{T_S} g(t)\,g(t - \tau)\,dt. \tag{B.2}$$

The vector $\mathbf{i}_{jk}^{(b)}$ is a linear combination of two consecutive data vectors being transmitted to user j. The value of ρ(τ) quantifies the contribution of each of those two data vectors to the interference caused on user k, depending on the value of $\delta_{jk}^{(b)} = \tau_{jk}^{(b)} \bmod T_S$. This can be understood as follows. When a signal arrives at user k, it is convolved with a matched filter and sampled every $T_S$ seconds. The matched filter that maximizes the SNR is given by $g^*(-t) = g(-t)$ (because g(t) is real). The convolution operation is defined as

$$g(\tau) * h(\tau) = \int_{-\infty}^{\infty} g(t)\,h(t - \tau)\,dt. \tag{B.3}$$

Assume, without loss of generality, that $x_j(m_{jk}^{(b)})$, $x_j(m_{jk}^{(b)} + 1)$, and $x_k$ are all scalars, i.e. $L_k = 1$, and hence they will no longer be in bold. Also assume, for brevity and without loss of generality, that $m_{jk}^{(b)} = 1$ and $m_{jk}^{(b)} + 1 = 2$. The desired symbol $x_k$ arrives synchronously at user k, and when it is matched filtered and sampled (at τ = 0 in our example), the following result is obtained.

$$x_k\, g(\tau) * g(-\tau)\Big|_{\tau=0} = \int_{-\infty}^{\infty} x_k\, g(t)\,g(t - \tau)\,dt\,\bigg|_{\tau=0} = \int_{-\infty}^{\infty} x_k\, g^2(t)\,dt = x_k, \tag{B.4}$$

since g(t) has unit power.

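Symbols $x_j(1)$ and $x_j(2)$ arrive asynchronously at user k, and before matched filtering and sampling, the continuous-time signal of the asynchronous interference can be expressed as

$$i_{jk}^{(b)}(\tau) = \left[ x_j(2)\,g(\tau + \delta_{jk}^{(b)}) + x_j(1)\,g(\tau - T_S + \delta_{jk}^{(b)}) \right] \left[ u(\tau) - u(\tau - T_S) \right], \tag{B.5}$$

where u(τ) is the unit step function defined as

$$u(\tau) = \begin{cases} 0, & \tau < 0 \\ 1, & \tau \ge 0. \end{cases} \tag{B.6}$$

Therefore, when $i_{jk}^{(b)}(\tau)$ is matched filtered and sampled at τ = 0, the following result is obtained.

$$\begin{aligned} i_{jk}^{(b)}(\tau) * g(-\tau)\Big|_{\tau=0} &= \left[ \int_0^{T_S - \delta_{jk}^{(b)}} x_j(2)\,g(t + \delta_{jk}^{(b)})\,g(t - \tau)\,dt + \int_{T_S - \delta_{jk}^{(b)}}^{T_S} x_j(1)\,g(t - T_S + \delta_{jk}^{(b)})\,g(t - \tau)\,dt \right]_{\tau=0} \\ &= \int_0^{T_S - \delta_{jk}^{(b)}} x_j(2)\,g(t + \delta_{jk}^{(b)})\,g(t)\,dt + \int_{T_S - \delta_{jk}^{(b)}}^{T_S} x_j(1)\,g(t - T_S + \delta_{jk}^{(b)})\,g(t)\,dt \\ &= \rho(-\delta_{jk}^{(b)})\,x_j(2) + \rho(-(\delta_{jk}^{(b)} - T_S))\,x_j(1). \end{aligned} \tag{B.7}$$

Recalling that g(t) is non-zero only for $t \in [0, T_S]$, we show that ρ(−τ) = ρ(τ).

$$\rho(\tau) = \int_0^{T_S} g(t)\,g(t - \tau)\,dt = \int_{\tau}^{T_S + \tau} g(t)\,g(t - \tau)\,dt = \int_0^{T_S} g(z + \tau)\,g(z)\,dz = \rho(-\tau) \tag{B.8}$$

Accordingly, Eqn. (B.7) simplifies to

$$\rho(\delta_{jk}^{(b)} - T_S)\,x_j(1) + \rho(\delta_{jk}^{(b)})\,x_j(2), \tag{B.9}$$

which is analogous to Eqn. (B.1).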
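The properties of ρ(τ) used here can be checked numerically. The Python sketch below assumes a unit-energy rectangular pulse on [0, T_S] purely for illustration (the pulse shape, the function names `rect_pulse` and `rho`, and the midpoint-rule integration are all assumptions, not part of the derivation above) and evaluates the integral of Eqn. (B.2) directly.

```python
import math

def rect_pulse(t: float, ts: float = 1.0) -> float:
    """Assumed pulse shape: constant on [0, ts], scaled so its energy is 1."""
    return 1.0 / math.sqrt(ts) if 0.0 <= t <= ts else 0.0

def rho(tau: float, ts: float = 1.0, g=rect_pulse, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of g(t) g(t - tau) over [0, ts]."""
    dt = ts / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += g(t, ts) * g(t - tau, ts)
    return total * dt

rho_zero = rho(0.0)          # full overlap of the pulse with itself
rho_minus_ts = rho(-1.0)     # shift by a full symbol period: no overlap
rho_pos = rho(0.3)
rho_neg = rho(-0.3)          # should agree with rho_pos by symmetry
```

For this particular pulse the overlap shrinks linearly with |τ|, so `rho(0.3)` comes out close to 0.7; other pulse shapes would give different values but the same symmetry.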

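Now, consider $E\left[\mathbf{i}_{j_1 k}^{(b_1)} \mathbf{i}_{j_2 k}^{(b_2)H}\right]$. When $j_1 = j_2 = j = k$, $\delta_{jk}^{(b)} = 0$. Substituting in Eqn. (B.1), and noting that ρ(0) = 1 and that $\rho(-T_S) = 0$ since g(t) has unit power and is non-zero only for $t \in [0, T_S]$, we get

$$\mathbf{i}_{jk}^{(b)}(m) = \rho(-T_S)\,\mathbf{x}_j(m_{jk}^{(b)}) + \rho(0)\,\mathbf{x}_j(m_{jk}^{(b)} + 1) = \mathbf{x}_j(m_{jk}^{(b)} + 1). \tag{B.10}$$

Therefore,

$$E\left[\mathbf{i}_{j_1 k}^{(b_1)} \mathbf{i}_{j_2 k}^{(b_2)H}\right] = E\left[\mathbf{i}_{jk}^{(b_1)} \mathbf{i}_{jk}^{(b_2)H}\right] = E\left[\mathbf{x}_j(m_{jk}^{(b_1)} + 1)\,\mathbf{x}_j^H(m_{jk}^{(b_2)} + 1)\right] = \mathbf{I}_{L_j}, \tag{B.11}$$

since the data vectors of user j arrive synchronously from all BSs, meaning that $m_{jk}^{(b_1)} + 1 = m_{jk}^{(b_2)} + 1$, and hence $\mathbf{x}_j(m_{jk}^{(b_1)} + 1) = \mathbf{x}_j(m_{jk}^{(b_2)} + 1)$. This establishes the third line of Eqn. (2.14).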

For other values of j1, j2, and k, we expand E[i(b1)j1k i

(b2)H

j2k

].

E[i(b1)j1k i

(b2)H

j2k

]= E

[(ρ(δ

(b1)j1k − TS)xj1(m

(b1)j1k ) + ρ(δ

(b1)j1k )xj1(m

(b1)j1k + 1)

)

(ρ(δ


(b2)j2k ) + ρ(δ

(b2)j2k )xj2(m

(b2)j2k + 1)

)H]

= E[ρ(δ

(b1)j1k − TS)ρ(δ


(b1)j1k )xH

j2(m

(b2)j2k )

+ ρ(δ(b1)j1k − TS)ρ(δ

(b2)j2k )xj1(m

(b1)j1k )xH

j2(m

(b2)H

j2k + 1)

+ ρ(δ(b1)j1k )ρ(δ


(b1)j1k + 1)xH

j2(m

(b2)j2k )


(b2)j2k )xj1(m

(b1)j1k + 1) xH

j2(m

(b2)j2k + 1)

]

= ρ(δ(b1)j1k − TS)ρ(δ

(b2)j2k − TS)E

[xj1(m

(b1)j1k )xH

j2(m

(b2)j2k )

](B.12)

+ ρ(δ(b1)j1k − TS)ρ(δ

(b2)j2k )E

[xj1(m

(b1)j1k )xH

j2(m

(b2)j2k + 1)

](B.13)


(b2)j2k − TS)E

[xj1(m

(b1)j1k + 1)xH

j2(m

(b2)j2k )

](B.14)


(b2)j2k )E

[xj1(m

(b1)j1k + 1)xH

j2(m

(b2)j2k + 1)

](B.15)

When $j_1$, $j_2$, and k are all distinct, each data vector pair inside the expectation operators in Eqns. (B.12), (B.13), (B.14), and (B.15) has two independent data vectors that belong to user $j_1$ and user $j_2$. Therefore, all expected values evaluate to 0. This establishes the first line of Eqn. (2.14).

Finally, when $j_1 = j_2 = j \neq k$, all data vector pairs inside the expectation operators in Eqns. (B.12), (B.13), (B.14), and (B.15) have two data vectors that belong to the same user j. The data vectors of user j are independent over time. Therefore, the expected values evaluate to $\mathbf{I}_{L_j}$ only when the time indices match; otherwise, they evaluate to 0.

When the time indices differ by more than 1, no time indices match in any of Eqns. (B.12), (B.13), (B.14), and (B.15), and all evaluate to 0. This establishes the first line of Eqn. (2.15). When $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1$, only the expected value of Eqn. (B.14) evaluates to $\mathbf{I}_{L_j}$, since its two time indices, $m_{jk}^{(b_1)} + 1$ and $m_{jk}^{(b_2)}$, then coincide. This establishes the second line of Eqn. (2.15). When $m_{jk}^{(b_1)} = m_{jk}^{(b_2)}$, and thus $m_{jk}^{(b_1)} + 1 = m_{jk}^{(b_2)} + 1$, only the expected values of Eqn. (B.12) and Eqn. (B.15) evaluate to $\mathbf{I}_{L_j}$. This establishes the third line of Eqn. (2.15). Finally, when $m_{jk}^{(b_1)} = m_{jk}^{(b_2)} + 1$, only the expected value of Eqn. (B.13) evaluates to $\mathbf{I}_{L_j}$, since its two time indices, $m_{jk}^{(b_1)}$ and $m_{jk}^{(b_2)} + 1$, then coincide. This establishes the fourth line of Eqn. (2.15).

The derivation for the uplink asynchronous interference has a parallel structure to the above derivation, but it is performed using the expression of $\mathbf{e}_{jk}^{(b)}$ in Eqn. (2.23).
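The index-matching argument can be illustrated with a short Python sketch. It computes the scalar weight that multiplies the identity matrix in the expectation for the case j1 = j2 = j ≠ k, by pairing the two symbols in each interference term and keeping only the products whose time indices coincide. The triangular `rho_rect` corresponds to an assumed unit-energy rectangular pulse, and both function names are hypothetical.

```python
def rho_rect(tau: float, ts: float = 1.0) -> float:
    """Pulse correlation for an assumed unit-energy rectangular pulse on [0, ts]."""
    return max(0.0, 1.0 - abs(tau) / ts)

def interference_weight(m1: int, d1: float, m2: int, d2: float,
                        ts: float = 1.0, rho=rho_rect) -> float:
    """Weight on the identity in E[i1 i2^H] when both terms carry user j's data.

    m1, m2 are the symbol indices and d1, d2 the delay offsets seen from BS b1 and b2.
    """
    # Each interference term mixes two consecutive symbols: (coefficient, time index).
    terms1 = [(rho(d1 - ts), m1), (rho(d1), m1 + 1)]
    terms2 = [(rho(d2 - ts), m2), (rho(d2), m2 + 1)]
    # Symbols are independent over time, so only equal time indices contribute.
    return sum(c1 * c2 for c1, t1 in terms1 for c2, t2 in terms2 if t1 == t2)

same_index = interference_weight(3, 0.25, 3, 0.5)    # terms (B.12) and (B.15) survive
b2_ahead = interference_weight(3, 0.25, 4, 0.5)      # m2 = m1 + 1
b1_ahead = interference_weight(4, 0.25, 3, 0.5)      # m1 = m2 + 1
far_apart = interference_weight(3, 0.25, 5, 0.5)     # indices differ by more than 1
```

With these illustrative delays, at most one or two of the four products survive in each case, which mirrors the four cases listed above.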

Bibliography

[1] A. Bourdoux and N. Khaled, “Joint TX-RX optimisation for MIMO-SDMA based

on a null-space constraint,” in Proc. IEEE Vehicular Technology Conference, Van-

couver, Canada, September 2002, pp. 171–174.

[2] Q. H. Spencer, A. L. Swindlehurst, and M. Haardt, “Zero-forcing methods for down-

link spatial multiplexing in multiuser MIMO channels,” IEEE Trans. on Signal Pro-

cessing, vol. 52, no. 2, pp. 461–471, February 2004.

[3] J. H. Chang, L. Tassiulas, and F. Rashid-Farrokhi, “Joint transmitter receiver diversity for efficient space division multiple access,” IEEE Trans. on Wireless Communications, vol. 1, no. 1, pp. 16–26, January 2002.

[4] A. R. S. Bahai, Multi-carrier digital communications: theory and applications of

OFDM. Springer, 2004.

[5] A. J. Tenenbaum and R. S. Adve, “Joint multiuser transmit-receive optimization

using linear processing,” in Proc. IEEE ICC 04, vol. 1, Paris, France, June 2004,

pp. 588–592.

[6] M. Schubert and H. Boche, “Solution of the multiuser downlink beamforming

problem with individual SINR constraints,” IEEE Trans. on Vehicular Technology,

vol. 53, no. 1, pp. 18–28, January 2004.

[7] S. Shi and M. Schubert, “MMSE transmit optimization for multi-user multi-antenna

systems,” in Proc. IEEE ICASSP 05, March 2005.


[8] A. M. Khachan, A. J. Tenenbaum, and R. S. Adve, “Linear processing for the

downlink in multiuser MIMO systems with multiple data streams,” in Proc. IEEE

ICC 06, June 2006.

[9] M. Codreanu, A. Tolli, M. Juntti, and M. Latva-aho, “Joint design of Tx-Rx beam-

formers in MIMO downlink channel,” IEEE Trans. on Signal Proc., vol. 55, no. 9,

pp. 4639–4655, September 2007.

[10] A. J. Tenenbaum and R. S. Adve, “Improved sum-rate optimization in the multiuser

MIMO downlink,” in Proc. CISS, Princeton, NJ, March 2008.

[11] S. Serbetli and A. Yener, “Transceiver optimization for multiuser MIMO systems,”

IEEE Trans. on Signal Proc., vol. 52, no. 1, pp. 214–226, January 2004.

[12] P. T. Boggs and J. W. Tolle, “Sequential quadratic programming,” in Acta Numer-

ica. Cambridge University Press, 1995, pp. 1–51.

[13] E. Visotsky and U. Madhow, “Optimum beamforming using transmit antenna ar-

rays,” in Proc. IEEE VTC Spring, May 1999, pp. 851–856.

[14] A. Tolli, M. Codreanu, and M. Juntti, “Linear cooperative multiuser MIMO tran-

sciever design with per BS power constraints,” in Proc. IEEE ICC 07, Glasgow,

Scotland, June 2007.

[15] T. Tamaki, K. Seong, and J. M. Cioffi, “Downlink MIMO systems using cooperation

among base stations in a slow fading channel,” in Proc. IEEE ICC 07, Glasgow,

Scotland, June 2007.

[16] H. Dahrouj and W. Yu, “Coordinated beamforming for the multi-cell multi-antenna

wireless system,” in Proc. CISS, Princeton, NJ, March 2008.

[17] H. Dai, A. F. Molisch, and H. V. Poor, “Downlink capacity of interference-limited

MIMO systems with joint detection,” IEEE Trans. on Wireless Commun., vol. 3,

pp. 442–453, March 2004.

BIBLIOGRAPHY 93

[18] S. Jafar, G. Foschini, and A. Goldsmith, “Phantomnet: Exploring optimal multicel-

lular multiple antenna systems,” EURASIP J. App. Sig. Proc., pp. 591–604, October

2004.

[19] B. L. Ng, J. S. Evans, S. V. Hanly, and D. Aktas, “Transmit beamforming with

cooperative base stations,” in Proc. IEEE ISIT, 2005, pp. 1431–1435.

[20] H. Zhang, N. B. Mehta, A. F. Molisch, J. Zhang, and H. Dai, “On the fundamen-

tally asynchronous nature of interference in cooperative base station systems,” in

Proc. IEEE ICC 07, Glasgow, Scotland, June 2007.

[21] S. Verdu, Multiuser Detection. Cambridge University Press, 1998.

[22] T. A. Thomas and F. W. Vook, “Asynchronous interference suppression in broad-

band cyclic-prefix communications,” in Proc. IEEE WCNC, New Orleans, LA,

March 2003.

[23] K. Yano and M. Taromaru, “Pre-FFT type MMSE adaptive array antenna to sup-

press asynchronous interference for OFDM packet transmission,” in Proc. IEEE

WCNC, Hong Kong, March 2007.

[24] J. Hyejung and M. D. Zoltowski, “On the equalization of asynchronous multiuser

OFDM signals in fading channels,” in Proc. IEEE ICASSP, May 2004.

[25] M. Costa, “Writing on dirty paper,” IEEE Trans. on Information Theory, vol. 29,

no. 3, pp. 439–441, May 1983.

[26] M. Tomlinson, “New automatic equalizer employing modulo arithmetic,” Electronic

Letters, vol. 7, pp. 138–139, March 1971.

[27] H. Harashima and H. Miyakawa, “A method of code conversion for digital communi-

cation channels with intersymbol interference,” Trans. of the Institute of Electronics

and Communications Engineers of Japan, vol. 52, pp. 272–273, June 1969.

[28] ——, “Matched-transmission technique for channels with intersymbol interference,”

IEEE Trans. on Communications, vol. 20, pp. 774–780, August 1972.

BIBLIOGRAPHY 94

[29] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, “A vector perturbation tech-

nique for near capacity multiantenna multiuser communication part II perturba-

tion,” IEEE Trans. on Communications, vol. 53, no. 3, pp. 537–544, March 2005.

[30] R. F. H. Fischer, Precoding and Signal Shaping for Digital Transmission. Wiley-

Interscience, 2002.

[31] C. Windpassinger, R. F. H. Fischer, T. Vencel, and J. B. Huber, “Precoding in

multiantenna and multiuser communications,” IEEE Trans. on Communications,

vol. 3, no. 4, pp. 1305–1316, July 2004.

[32] G. J. Foschini, G. D. Golden, R. A. Valenzuela, and P. W. Wolniansky, “Simplified

processing for high spectral efficiency wireless communication employing multi ele-

ment arrays,” IEEE Jour. on Selected Areas in Communications, vol. 17, no. 1, pp.

1841–1852, November 1999.

[33] T. Vencel, C. Windpassinger, and R. Fischer, “Sorting in the V-BLAST algorithm

and loading,” in Proc. of the 2002 Communications Systems and Networks, Septem-

ber 2002.

[34] R. Doostnejad, T. J. Lim, and E. Sousa, “Joint precoding and beamforming design

for the downlink in a multiuser MIMO system,” in Proc. IEEE WiMOB, Montreal,

Canada, August 2005.

[35] C.-H. F. Fung, W. Yu, and T. J. Lim, “Precoding for the multiantenna downlink:

multiuser snr gap and optimal user ordering,” IEEE Trans. on Communications,

vol. 55, no. 1, pp. 188–197, January 2007.

[36] Y. J. Zhang and K. B. Letaief, “An efficient resource-allocation scheme for spa-

tial multiuser access in MIMO/OFDM systems,” IEEE Trans. on Communications,

vol. 53, no. 1, pp. 107–116, January 2005.

[37] C. Pan, Y. Cai, and Y. Xu, “Adaptive subcarrier and power allocation for multiuser

MIMO-OFDM systems,” in Proc. IEEE ICC 05, May 2005.

BIBLIOGRAPHY 95

[38] Y. Shin, T. Kang, and H. Kim, “An efficient resource allocation for multiuser MIMO-

OFDM systems with zero-forcing beamformer,” in Proc. IEEE PIMRC, September

2007.

[39] H. Karaa and R. S. Adve, “User assignment for MIMO-OFDM systems with mul-

tiuser linear precoding,” in Proc. IEEE WCNC, March-April 2008.