PRECODING FOR MULTIUSER MIMO
SYSTEMS WITH MULTIPLE BASE STATIONS
by
Imad Azzam
A thesis submitted in conformity with the requirements for the degree of Master of Applied Science,
Graduate Department of Electrical and Computer Engineering in the University of Toronto.
Copyright © 2008 by Imad Azzam. All Rights Reserved.
Precoding for Multiuser MIMO Systems with
Multiple Base Stations
Master of Applied Science Thesis
Edward S. Rogers Sr. Department of Electrical and Computer Engineering
University of Toronto
by Imad Azzam
June 2008
Abstract
Future cellular networks are expected to support extremely high data rates and user
capacities. This thesis investigates the downlink of a wireless cellular system that takes
advantage of multiple antennas at base stations and mobile stations, frequency reuse
across all cells, and cooperation among base stations. We identify asynchronous inter-
ference resulting from multi-cell communication as a key challenge, prove the existence
of a downlink/uplink duality in that case, and present a linear precoding scheme that
exploits this duality. Since this result is not directly extendable to orthogonal frequency
division multiplexing (OFDM), we propose a ‘hybrid’ algorithm for two cooperating base
stations, which combines linear and nonlinear precoding. This algorithm minimizes the
sum mean squared error of the system and is extendable to OFDM. Finally, we con-
sider the problem of user selection for multiuser precoding in OFDM-based systems. We
extend an available single-cell user selection scheme to multiple cooperating cells.
To my parents, Halim and Amal,
and brothers, Raed and Abdullah
Acknowledgements
I would like to express sincere and deep gratitude to my advisor, Professor Raviraj Adve.
His continuous support, advice, discussions, and insight are invaluable and highly appre-
ciated. This work would not have been possible without him.
I would like to thank all the friends that I met in the Communications Group for their
help and support throughout my two years in the group.
I would like to express special thanks to all the Lebanese friends I made at the
University of Toronto. In particular, an infinite amount of gratefulness goes to Rani
Daher, Nahi Abdul Ghani, Sari Onaissi, and Khaled Heloue. Their friendship made a
big difference in my life in Toronto. Rani’s daily support and encouragement through
advice and ‘cheering songs’ are unforgettable. Nahi’s novel sense of humor and social
networking guidelines made life smooth and enjoyable. Sari’s and Khaled’s brotherly
guidance helped me make better decisions and is greatly appreciated.
Many thanks go to my friends from back home. I could not have made it without
them. I hope that one day we will be reunited to celebrate our accomplishments.
No words can describe how grateful and indebted I am to my parents and brothers.
Their love and care are the motivation for all my achievements, and they will forever be.
I am deeply thankful to my aunt Ghada, who is more of an elder sister to me. Her
support and advice on many occasions were priceless.
I would like to acknowledge Bell Canada’s support through its Bell University Labo-
ratories R&D program, which made my research possible.
Imad H. Azzam
June 2008
Contents
1 Introduction and Background
1.1 Motivation and Objective
1.2 Literature Review
1.2.1 Linear Precoding and Duality
1.2.2 Multiple Base Stations and Asynchronous Interference
1.2.3 Nonlinear Precoding
1.2.4 User Selection for Multiuser MIMO-OFDM Systems
1.3 Thesis Overview and Structure
2 Multiple Base Stations and Asynchronous Interference
2.1 System Models
2.1.1 Downlink System Model
2.1.2 Virtual Uplink System Model
2.2 Downlink/Uplink Duality
2.2.1 First Step: SINR Targets
2.2.2 Second Step: Equating MSEs
2.3 Single Base Station: Synchronous Interference
2.3.1 Review of the MIMO-SMSE Algorithm
2.3.2 Multiple Base Stations: Assuming Synchronous Interference
2.4 Minimizing the SMSE with Asynchronous Interference
2.5 Simulation Results
2.6 MIMO-OFDM and Multiple Base Stations
2.7 Summary
3 The Hybrid Algorithm
3.1 System Model
3.2 The Hybrid Algorithm
3.2.1 Interference Pre-subtraction
3.2.2 Whitening Interference from Edge Users at Intra-cell Users
3.2.3 The Hybrid Algorithm
3.2.4 Data Vector Estimation for THP Users
3.3 A Variation on the Hybrid Algorithm
3.4 Simulation Results
3.4.1 Zero Power Cross Channels
3.4.2 Non-zero Power Cross Channels
3.5 Implementation Issues
3.6 Summary
4 User Selection in Multiuser MIMO-OFDM Systems
4.1 System Model and Problem Statement
4.1.1 System Model
4.1.2 Problem Statement
4.2 Original Algorithm and Proposed Modifications
4.3 Simulation Results
4.4 MIMO-OFDM and the Hybrid Algorithm
4.5 User Selection for the Hybrid Algorithm
4.6 Summary
5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
A Equating MSEs
B Derivations for Asynchronous Interference
Bibliography
List of Figures
2.1 Asynchronous interference in the downlink
2.2 Asynchronous interference in the uplink
2.3 Indices of interfering symbols of one user from two BSs
2.4 Linear precoding with and without asynchronous interference, with path loss
2.5 Linear precoding with and without asynchronous interference, no path loss
2.6 SMSE versus power of three users
3.1 System model for the hybrid algorithm
3.2 Performance of the hybrid algorithm without path loss (K1 = K2 = 1, Ke = 2)
3.3 Performance of the hybrid algorithm compared to ZPCC cases without path loss (K1 = K2 = 1, Ke = 2)
3.4 BER plot of the separate streams for ZPCC case 2
3.5 BER plot of the separate streams for hybrid algorithm and ZPCC case 3
3.6 Performance of the hybrid algorithm with path loss
3.7 Performance of the hybrid algorithm with inter-cell interference
3.8 Performance of the hybrid algorithm variation
3.9 Performance of the hybrid algorithm with complex transmission
4.1 Four different scenarios for user selection
4.2 Performance of hybrid algorithm for K1 = K2 = 4 and Ke = 8
4.3 User selection method for the hybrid algorithm
4.4 Two different user selection methods for the hybrid algorithm
4.5 Varying number of edge users for 1 intra-cell user per cell (1f1 refers to f edge users)
4.6 Varying number of edge users for 2 intra-cell users per cell (2f2 refers to f edge users)
5.1 Extension of hybrid algorithm to 3 BSs
Chapter 1
Introduction and Background
1.1 Motivation and Objective
Meeting the demands that are expected from future generation networks poses intriguing
challenges for today’s wireless system designers. After the widespread deployment and
commercialization of third generation wireless cellular networks, research is now mainly
focused on taking data rates and user capacity to even higher levels, paving the way for
the fourth generation (4G) of wireless communications. Two emerging technologies that
are potential candidates for 4G wireless networks are multiple-input, multiple-output
(MIMO) systems and transmission based on orthogonal frequency division multiplexing
(OFDM). The use of antenna arrays (multiple antennas) at the transmitters and/or
receivers in MIMO systems enables simultaneous communication with multiple users over
the same bandwidth. The use of orthogonal sub-carriers in OFDM provides protection
against inter-symbol interference (ISI). A large body of research has proposed techniques
for reaping the potential benefits of these two separate technologies [1–4].
Combining them potentially creates an even more capable system, which raises questions
about how to jointly optimize their functionality to further improve performance.
Another theme in communication networks that has been gaining more recognition
is cooperation. Cooperation can be generically defined as the sharing of resources to
achieve a common objective. In the context of wireless cellular networks, cooperation
can take place between the base stations (BSs), the users, and/or dedicated relays. The
interaction of multiple BSs necessitates the study of how neighboring cells affect each
other, especially in terms of inter-cell interference. Existing cellular systems avoid this
problem by deploying different frequencies in different cells. However, such an approach
has the drawback of reducing the system efficiency, since the available bandwidth is
divided into disjoint ranges and distributed among several neighboring cells. Accordingly,
the ultimate goal of system design becomes developing a scheme that mitigates the inter-
cell interference instead of avoiding it, which in turn enables the use of the entire available
bandwidth in every cell.
In reality, an effective wireless cellular system can bring together MIMO, OFDM, and
cooperation, which provides a high number of degrees of freedom. While this provides
great flexibility in design and the ability to tune the system to meet differing require-
ments, it also complicates the optimization of the system towards a required objective.
Consequently, our sponsors, Bell Mobility, through the Bell University Laboratories R&D
program, have asked us to explore various techniques targeted at the design of such a
system. Accordingly, our goal is to provide insight into the development of a practical
cooperative multiuser wireless cellular system that meets the demanding requirements of
4G communication networks.
We focus on a multiuser MIMO-OFDM system where cooperation is present in terms
of BS coordination and joint processing. More specifically, MIMO techniques are used to
multiplex data streams on the same bandwidth, OFDM is used to increase user capacity
and guard against ISI, and BS cooperation is used to provide users with better service
through joint processing and protect them further against multiuser interference (MUI)
and inter-cell interference. To enable communication over this system, BSs process the
data of users before transmission (precoding) and each user processes its received signal
to retrieve its own data (decoding). In this thesis, we address the precoding and decoding
problem for multiuser communications with coordinated transmissions from multiple BSs.
1.2 Literature Review
In what follows, we provide a brief survey of important works in four areas related to
this thesis. The survey starts with a review of prior work in linear precoding for MIMO
systems and the related concept of downlink/uplink duality. Next, we move to the area of
multiple base stations and discuss precoding in that scenario. The notion of asynchronous
interference is introduced and related work is presented. Then, nonlinear precoding is
briefly reviewed, with special focus on Tomlinson-Harashima Precoding (THP) that will
be used in this work. Finally, some works dealing with user selection for MIMO-OFDM
systems are presented. Note that basic mathematical models are not presented here;
they are deferred to the relevant sections for the convenience of the reader.
1.2.1 Linear Precoding and Duality
Our work focuses on the downlink of multiuser MIMO communication systems where
channel state information (CSI) is available at the transmitter. Using this CSI, system
performance can be maximized by pre-distorting the transmission to best match the
available CSI. In this thesis, we make extensive use of linear precoding [5–9] where the
signals to be transmitted are multiplied with a precoding matrix before transmission.
Similarly, the receiver multiplies the received signal with a decoding matrix to minimize
MUI. While the early works focused on minimizing the sum of the mean squared error
(SMSE) across all users’ signals [5–8], linear precoding to maximize sum data rate is also
possible [7, 9, 10].
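
To make the transmit-side and receive-side multiplications concrete, the following minimal Python sketch evaluates the SMSE of a given precoder/decoder pair for a toy real-valued 2 × 2 channel. The channel values, the noise variance, the helper names (matmul, sum_mse, inverse_2x2), and the channel-inverting (zero-forcing) precoder are all illustrative assumptions. This is not the SMSE-minimizing design of [5–8], and transmit power constraints are ignored.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sum_mse(h, u, d, noise_var):
    """SMSE of x_hat = D (H U x + n).

    Assumes real, independent, unit-power symbols x and white noise n
    with variance noise_var per receive antenna (illustrative model).
    """
    g = matmul(d, matmul(h, u))  # effective end-to-end matrix D H U
    size = len(g)
    # Residual interference and signal distortion: ||D H U - I||_F^2
    signal_err = sum((g[i][j] - (1.0 if i == j else 0.0)) ** 2
                     for i in range(size) for j in range(size))
    # Noise passed through the decoder: noise_var * ||D||_F^2
    noise_err = noise_var * sum(v ** 2 for row in d for v in row)
    return signal_err + noise_err


def inverse_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


H = [[1.0, 0.5], [0.3, 0.8]]          # hypothetical channel matrix
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

smse_no_precoding = sum_mse(H, IDENTITY, IDENTITY, noise_var=0.1)
smse_zero_forcing = sum_mse(H, inverse_2x2(H), IDENTITY, noise_var=0.1)
print(smse_no_precoding, smse_zero_forcing)
```

In this toy case the channel-inverting precoder removes the interference term, so only the noise contribution remains in the SMSE.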
Most of the work in precoding is based on a duality between the multiuser downlink
and a virtual multiuser uplink [6–8]. The duality states that, under the same sum power
constraint, the downlink and uplink both have the same achievable signal-to-interference-
plus-noise ratio (SINR) region [6]. In other words, if a certain set of SINR targets can be
achieved in the downlink for a given sum power constraint, then those targets can also be
achieved in the uplink with the same power constraint. This duality can be used to state
that the downlink and uplink have the same achievable mean squared error (MSE)
region under the same sum power constraint [7]. This was shown for the single receive
antenna case in [6, 7] and for the MIMO scenario in [8]. Therefore, this duality can be
thought of as a tool that provides us with two different perspectives of the same system.
If the solution of a certain problem can be found using one perspective, then this solution
can be transformed to suit the other perspective. In our case, as in [8], minimizing the
SMSE proves to be easier in the uplink. Consequently, the uplink solution is determined
and then transformed to the downlink solution.
Furthermore, this downlink/uplink duality suggests that the precoding and decoding
matrices can be obtained via receive processing only, i.e., using the Wiener filter, and are
the same whether considering the downlink or uplink [8]. The outstanding issue is then
power allocation across users, which is a convex optimization problem when minimizing
SMSE [8]. In this work, we generalize this duality for the MIMO scenario with multiple
transmitters and multiple receivers that are geographically distant, which gives rise to
asynchronous MUI, a notion to be investigated later in more detail. We also study
the implications of asynchronous interference for the convexity of the power allocation
problem.
It is worth elaborating further on the work presented in [8], as it forms the basis of the
linear precoding found in our work. Based on the outcome of the downlink/uplink dual-
ity, two iterative algorithms that jointly optimize the precoding/decoding matrices and
the power allocation to minimize the SMSE of a MIMO system are presented. One algo-
rithm cycles between the downlink and uplink to derive the precoding/decoding matrices
and power allocation. The other performs the optimization completely in the uplink to
obtain the optimal decoding matrices and uplink power allocation and then transforms
the solution to the downlink. This is made possible by a scheme proposed in [11] for
deriving the uplink precoding matrices given per-user power constraints without consid-
ering the downlink temporarily, and generalized in [8] for a sum power constraint. Both
algorithms have better performance than block diagonalization (BD), another MIMO
linear precoding technique [8]. Moreover, they provide performance very similar to se-
quential quadratic programming (SQP), a computationally intensive technique that can
be used to derive the precoding and decoding matrices directly in the downlink [5,8,12].
We make use of both algorithms at various points in this thesis.
1.2.2 Multiple Base Stations and Asynchronous Interference
More recently, researchers have begun investigating the notion of multiple BSs cooperating
to achieve more efficient use of bandwidth. The ultimate goal of such systems is a
frequency reuse factor of unity. In multi-BS, multiuser precoding, multiple BSs coordinate
their transmissions to a group of users (that may straddle a traditional cell boundary).
This cooperation can provide better system performance, especially when servicing cell-
edge users. The work in [13] is probably the first to discuss multiuser communications
with multiple BSs. A system with multiple transmitter-receiver pairs that interfere with
each other is considered. An iterative algorithm that attempts to minimize the transmit
power while meeting quality of service requirements at the receivers is presented and
shown to converge to a local minimum. In a recent study, Tolli et al. [14] investigate
linear precoding for multiple cooperating BSs. They propose an algorithm to jointly
design precoding and decoding matrices for a multiuser MIMO system in an attempt to
maximize the sum rate under per-BS power constraints. The authors treat all MUI as
synchronous and the resulting algorithm is similar to that of [8] with a single ‘super BS’
using as many transmit antennas as all the cooperating BSs collectively have. In [15],
Tamakai et al. study the achievable sum rates for a MIMO downlink system with multiple
BSs cooperating at three different levels. However, they assume that a variant of Time
Division Multiple Access (TDMA) is used in the best results they achieve. In [16],
Dahrouj et al. consider the downlink of a multiuser, multi-BS system and propose an
algorithm that jointly optimizes the beamformers used by the cooperating BSs in order
to minimize the total transmit power while satisfying the SINR constraints of the users.
When dealing with multi-BS environments, a crucial assumption often made is that
both the desired and interfering signals arrive synchronously at each user [14–19]. However,
synchronous interference is physically impossible [20]. In fact, it is the resulting
asynchronous interference that is the key challenge in developing transmission schemes
for multi-BS scenarios. The work in [20] provides amendments to some existing algorithms
to account for this asynchronous interference. In this thesis, we show that the
virtual uplink should be modeled carefully for duality to remain intact. Furthermore, the
convexity of the power allocation problem is unclear, and it cannot be extended directly
from that of the synchronous interference case. Accordingly, we provide some amend-
ments to extend the linear precoding algorithm presented in [8] and make it suitable for
a multi-BS system with asynchronous interference¹.
¹ It is worthwhile to note that the problem of asynchronous interference is also encountered when
designing Code Division Multiple Access (CDMA) systems. In those systems, the asynchronous
interference is resolved by assuming the presence of additional virtual users that also interfere with
the desired signal, and the system is designed to guard the desired user against those extra
interferers [21]. A similar approach may be attempted with linear precoding; however, it may lead to
a complicated system with very high dimensions. In this thesis, we focus on formulating relatively
simple linear precoding, and hence we avoid adopting this method.
Several other works consider the problem of asynchronous interference and propose
different methods for mitigating its effect, mainly by using the cyclic redundancy
provided by OFDM. Thomas and Vook [22] design space-time filters such that the MSE
between the head and tail of the OFDM symbol (originally equal without interference
due to cyclic redundancy) is minimized. In another approach, again based on the cyclic
redundancy, Yano and Taromaru [23] propose a special receiver structure that detects
the arrival of asynchronous interference and adapts the processing weights accordingly.
Jung and Zoltowski [24] propose using space-time filters twice, once to estimate the
interferer by examining a ‘window’ synchronized with the interferer’s signal and subtract
it from the original signal, and the second time to estimate the desired signal. Note
that [22,24] assume that the interference is misaligned by an integer factor of the symbol
period. Moreover, [22–24] all assume single-BS systems and do not address the multi-BS
scenario, neither in terms of the difficulties that may arise nor in terms of any advantages.
In this work, we briefly discuss some of the difficulties experienced when attempting to
communicate to multiple users from multiple BSs using OFDM. Consequently, we assume
that cooperation happens only for users in regions where asynchronous interference is
limited and propose a method to help such users further.
1.2.3 Nonlinear Precoding
When both CSI and interference are known at the transmitter, another form of precoding
for multiple users is Dirty Paper Coding (DPC) [25]. Briefly stated, since the interference
is known beforehand at the transmitter, it can be subtracted from a user’s desired
data. Examples of such coding techniques include the pioneering work by Costa [25],
Tomlinson-Harashima Precoding (THP) first introduced in [26–28], and vector perturbation
techniques [29]. In this work we make use of THP and present some related works
below.
The main idea behind THP is moving the decision feedback loop of the common deci-
sion feedback equalizer (DFE) from the receiver side to the transmitter side. This helps
reduce the possibility of error propagation; however, it can increase the average transmit-
ted power. Accordingly, a modulo operation at the transmitter alleviates the problem of
transmit power exceeding the usual transmit power constraint. The modulo operation is
a non-linear operation, and hence makes THP a non-linear precoding technique [30].
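
The role of the modulo operation can be illustrated with a minimal scalar Python sketch. It assumes a noiseless unit-gain channel, real 4-PAM symbols from {-3, -1, 1, 3}, and a modulo interval of [-4, 4). The function names and numerical values are hypothetical and chosen only for illustration; this is not the precoder developed later in the thesis.

```python
def symmetric_modulo(value, base):
    """Fold value into the interval [-base/2, base/2)."""
    return (value + base / 2) % base - base / 2


def thp_transmit(symbol, known_interference, base):
    """Presubtract the known interference, then fold to bound the amplitude."""
    return symmetric_modulo(symbol - known_interference, base)


def thp_receive(received, base):
    """Apply the same modulo at the receiver to undo the folding."""
    return symmetric_modulo(received, base)


BASE = 8.0                       # 4-PAM symbols, modulo interval [-4, 4)
symbol, interference = 3.0, 7.5  # hypothetical values

sent = thp_transmit(symbol, interference, BASE)
# Noiseless unit-gain channel: the interference adds back on the way.
recovered = thp_receive(sent + interference, BASE)
print(sent, recovered)
```

Without the modulo, the transmitted value would be 3.0 - 7.5 = -4.5, outside the constellation range. With it, the transmitted amplitude stays within the modulo interval and the receiver still recovers the intended symbol.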
Another important issue which arises when working with THP is user ordering. Assume
that a total of K users numbered 1, ..., K are present in the system. For user k,
the interference from users k + 1, ..., K is presubtracted. Accordingly, user 1 sees no
interference, user 2 sees interference only from user 1, and so on. Choosing the opti-
mal order for the K users is a complicated problem and different works employing THP
use different ordering methods that meet their optimization criterion. Windpassinger et
al. [31] present a MIMO system that uses THP, and argue that it performs better than
linear precoding with DFE at the receiver. As for user ordering, they propose using two
methods presented in [32, 33]. Doostnejad et al. [34] propose a combination of linear
and nonlinear precoding for the MIMO downlink to minimize individual MSEs given in-
dividual power constraints or minimize the total transmit power given individual SINR
constraints. Moreover, the downlink/uplink duality is generalized from MISO systems
to MIMO systems for nonlinear precoding. Users are ordered according to the Frobenius
norm of their channels. The user that sees no interference (user 1 according to the pre-
vious numbering) is chosen to be the user that has the lowest channel Frobenius norm,
and the user that sees interference from all other users (user K) is chosen to be the
user that has the highest channel square Frobenius norm. In [35], Fung et al. propose a
practical THP implementation for the downlink of a multiuser, MIMO system and show
that it outperforms existing THP implementations based on zero-forcing techniques in
terms of power efficiency. The total power is minimized while satisfying the minimum
data rate and maximum bit error rate (BER) requirements of the users. User ordering
is also addressed and an algorithm that finds the optimal user ordering in polynomial
time is proposed. A less complex algorithm that finds a near-optimal user ordering is
also presented.
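
The Frobenius-norm ordering rule described above can be sketched in a few lines of Python. The user labels and channel matrices are hypothetical, and the sketch covers only the ordering step, not the precoder design of [34].

```python
import math


def frobenius_norm(matrix):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(abs(entry) ** 2 for row in matrix for entry in row))


def order_users_by_norm(channels):
    """Return user labels sorted by ascending channel Frobenius norm.

    The first label plays the role of user 1 (sees no interference) and
    the last label the role of user K (sees interference from all others).
    """
    return sorted(channels, key=lambda user: frobenius_norm(channels[user]))


# Hypothetical 2 x 2 channels for three users.
channels = {
    "a": [[1 + 1j, 0.5], [0.2, 0.1j]],
    "b": [[0.1, 0.2], [0.1, 0.1]],
    "c": [[2, 1j], [1, 1]],
}
ordering = order_users_by_norm(channels)
print(ordering)
```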
In our work, THP is used by two cooperating BSs to presubtract interference between
user groups, as opposed to individual users. As for the ordering problem, since the number
of user groups is restricted to two, only two orderings are possible, and both are
considered.
1.2.4 User Selection for Multiuser MIMO-OFDM Systems
Most works on multiuser communications assume communication to a small, fixed group
of users. This is because the number of transmit antennas essentially limits the number
of users that can be simultaneously serviced. In a realistic system, however, the number
of users would far exceed the number of antennas at the BS. One possible way to deal
with the high number of users is combining OFDM with MIMO. In such a system, it is
possible to transmit multiple data streams on each of the many subcarriers made available
by OFDM. This is a multiuser extension of orthogonal frequency division multiple access
(OFDMA). Of course, OFDM and OFDMA have their own implementation challenges,
such as phase noise and carrier offsets [4], but those are out of the scope of this work.
When linear precoding is used on each subcarrier, the maximum number of data
streams that can be transmitted on a given subcarrier is equal to the number of transmit
antennas. With many data streams required for transmission, the choice of which users
should be assigned to which subcarriers is not clear. Many works exist that address this
problem for single-carrier MIMO systems or single-input, single-output (SISO) OFDM
systems. However, since we are dealing with a MIMO-OFDM system, we focus our
attention on works that address user selection for such a system.
User selection methods vary according to the chosen optimization criterion. Zhang and
Ben Letaief [36] propose a theoretical joint user, power, and bit-load allocation scheme
in which the BER or data rate requirements of users are met. However, due to its com-
plexity, they propose a more practical algorithm where users are separated into different
groups such that the inter-group user correlation is low. Only users from different groups
are allowed to transmit on the same subcarrier, and traditional methods, such as FDMA,
are used to allocate users from the same group to different subcarriers. With the users
decoupled in such a manner, power and bit-loads are jointly allocated to each user in-
dependently using the initial algorithm, but applied to that one user, which reduces its
complexity. In another approach, Pan et al. [37] propose an algorithm that allocates
subcarriers to users and minimizes the power required to meet certain data rates. The
main idea is to use DPC to allow more than one user to be allocated on each subcarrier.
In a more recent work, the authors of [38] propose an approach that attempts to meet
the data rates required by users in two steps, briefly summarized below. First, some
simplifying assumptions are made based on which an average channel gain for each user
is determined. Using these average channel gains, the number of subcarriers needed to meet each
user’s data rate is approximated. Second, subcarriers are allocated to users such that the
initial approximations are satisfied. Then, an extra user may be allocated to a subcarrier
if it has a high channel gain and a low correlation with users already on that subcarrier.
Karaa and Adve [39] propose a MIMO-OFDM user allocation scheme tailored for
linear precoding as presented in [8]. The optimization criterion is the minimization of
the SMSE. For every subcarrier, the proposed approach runs an iteration that alternates
between power allocation and optimal beamforming for all system users, until the relative
change in SMSE is below a certain threshold. Afterwards, the data streams that are
allocated the most power are assigned to the subcarrier. It is shown by simulations that
the performance of this scheme closely approaches that of the optimal user selection
(simulated by brute force), when both are combined with linear precoding. This work is
reviewed in more detail later, as it is modified to enhance its performance further and
then extended to user selection with multiple BSs.
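
The final selection step of such a scheme can be sketched as follows. This is only a minimal illustration, assuming that the per-stream powers on a subcarrier have already been produced by an iterative power allocation and beamforming routine (not shown). The stream labels, power values, and function name are hypothetical.

```python
def select_streams(power_by_stream, num_antennas):
    """Keep the streams that were allocated the most power.

    power_by_stream maps a stream label to its allocated power on one
    subcarrier; at most num_antennas streams are retained.
    """
    ranked = sorted(power_by_stream, key=power_by_stream.get, reverse=True)
    return ranked[:num_antennas]


# Hypothetical converged power allocation on one subcarrier.
powers = {"u1s1": 0.9, "u2s1": 0.05, "u3s1": 0.6, "u4s1": 0.45}
selected = select_streams(powers, num_antennas=2)
print(selected)
```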
1.3 Thesis Overview and Structure
As mentioned previously, our goal is to provide insight into the design of a multiuser,
multi-BS, MIMO-OFDM system with a frequency reuse factor of unity. More specifically,
we will initially consider the downlink of a single-carrier, multi-BS, MIMO system and
explain the nature of the asynchronous interference that is experienced in such systems.
The contributions of this thesis are:
• Developing the useful downlink/uplink duality for a single-carrier, multi-BS, MIMO
system with asynchronous interference.
• Modifying an existing multiuser linear precoding algorithm that minimizes the
SMSE of the system to account for the multiple BSs and asynchronous interfer-
ence.
• Presenting a brief discussion on the difficulties of using OFDM for BS coopera-
tion and proposing a cooperative algorithm that combines linear precoding and
non-linear THP (hybrid algorithm) and minimizes the SMSE of the system. This
algorithm avoids the mentioned difficulties by having two BSs cooperate only for
edge users that are approximately equally distant from them.
• Modifying an existing user selection algorithm for MIMO-OFDM to enhance its
performance and suggesting how it can be used to select users when the hybrid
algorithm is used along with OFDM.
Consequently, the thesis is organized as follows. In Chapter 2, we develop the system
model and identify asynchronous interference as the key challenge. We prove the existence
of a downlink/uplink duality in a multi-BS MIMO system with asynchronous interference
and present a linear precoding scheme suitable for that scenario. In Chapter 3, we present
the hybrid algorithm which minimizes the SMSE of a system where two BSs cooperate to
communicate with edge users, i.e., cooperation is restricted to edge users exclusively. In
Chapter 4, we present a MIMO-OFDM user selection algorithm, as well as a simulation
exercise on how user selection for the hybrid algorithm might be done. The thesis wraps
up with some conclusions and suggestions for future work in Chapter 5.
Chapter 2
Multiple Base Stations and
Asynchronous Interference
In this chapter, we investigate a single-carrier MIMO system with multiple cooperating
BSs. We provide system models for the downlink and virtual uplink that take into account
the asynchronous nature of the interference, inevitable due to the use of multiple BSs.
We proceed to show that a downlink/uplink duality exists, and then exploit this duality
to extend an existing single-BS linear precoding algorithm to accommodate multiple BSs
and asynchronous interference.
2.1 System Models
This section describes the system models of the multiuser downlink and the multiuser
virtual uplink. In both cases it is assumed that there are B BSs and K users randomly
distributed in the cells of the B BSs. Each BS has M antennas, while user k has $N_k$
antennas, with $N = \sum_{k=1}^{K} N_k$. Moreover, each user transmits or receives $L_k$ data streams
simultaneously, where $L_k \le \min(M, N_k)$. Also, $L = \sum_{k=1}^{K} L_k$. Note that all data streams
share the same frequency and time channels. The BSs communicate to the users over
frequency flat channels. All BSs are assumed to know the CSI to all users perfectly. We
emphasize that this CSI also includes the propagation delays between all the BSs and all
the users.
2.1.1 Downlink System Model
Let xk (Lk × 1) be the column data vector containing the data streams of user k to be
transmitted from all BSs. The data symbols in xk are independent with unit average
power. Moreover, the data vectors of a user k are independent over time. As for the data
vectors of different users, they are also independent. Therefore,
\[
E\left[x_k(m)\, x_j^H(n)\right] =
\begin{cases}
I_{L_k}, & k = j \text{ and } m = n \\
0, & \text{otherwise,}
\end{cases}
\tag{2.1}
\]
where (·)H is the Hermitian operator, E [·] is the expectation operator, and m and n are
discrete time indices. In general, unless required, we will drop the time index.
Before transmission, each data stream in xk is allocated a certain power. This is done
by multiplying $x_k$ by $\sqrt{P_k}$ ($L_k \times L_k$), where $P_k$ is a diagonal matrix whose components are
the powers allocated to the different data streams of $x_k$. Furthermore, before transmission
from BS b, the data vector meant for user k is linearly precoded with a matrix $U_k^{(b)}$
($M \times L_k$). Hence, the $M \times 1$ signal transmitted from BS b to user k is
\[ t_k^{(b)} = U_k^{(b)} \sqrt{P_k}\, x_k. \tag{2.2} \]
The channel between BS b and user k is represented by the matrix H(b)k (M × Nk),
whose elements are circularly symmetric complex Gaussian random variables with unit
variance (Rayleigh fading). Since perfect CSI is assumed, these channels are known at
the BSs. The Nk × 1 signal received by user k is
\[ y_k = \sum_{b=1}^{B} H_k^{(b)H} U_k^{(b)} \sqrt{P_k}\, x_k + \text{interference} + n_k, \tag{2.3} \]
where nk denotes the additive white Gaussian noise (AWGN). The interference term will
be explained in more detail below. Finally, user k processes its received signal linearly
by multiplying it by a decoding matrix VHk (Lk ×Nk) to produce an estimate of its own
data vector
\[ \hat{x}_k = V_k^H y_k. \tag{2.4} \]
Let Ek denote the error covariance matrix of user k. Ek is expressed as follows.
\[ E_k = E\left[(\hat{x}_k - x_k)(\hat{x}_k - x_k)^H\right] \tag{2.5} \]
The diagonal elements of Ek are the mean squared errors (MSEs) of the data streams of
user k and, therefore, the SMSE of user k can be found as follows,
SMSEk = tr (Ek) , (2.6)
where tr(·) is the trace operator. The SMSE of the whole system can be expressed as
the sum of the individual SMSEs,
\[ \mathrm{SMSE} = \sum_{k=1}^{K} \mathrm{SMSE}_k = \sum_{k=1}^{K} \mathrm{tr}(E_k). \tag{2.7} \]
Accordingly, the optimization problem of finding precoding and decoding matrices and a
power allocation that minimize the SMSE in the downlink under a sum power constraint
can be described as follows,
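\[
\min_{\substack{P_k,\, U_k^{(b)},\, V_k \\ k = 1:K,\ b = 1:B}} \ \sum_{k=1}^{K} \mathrm{tr}(E_k)
\tag{2.8}
\]
\[ \text{subject to} \quad \sum_{k=1}^{K} \mathrm{tr}(P_k) \le B \times P_{\max}, \]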
where B × Pmax is the maximum sum power available over all the BSs, and Pmax is the
average transmit power of one BS over time.
Timing Issues: In the first summation of Eqn. (2.3), it is assumed that the trans-
missions from all BSs meant for a certain user k (t(b)k , for b = 1, . . . , B) are received by
user k synchronously. Therefore, as in [20], BS b advances the time at which it transmits
$t_k^{(b)}$ by $\tau_k^{(b)} - \tau_k^{(b_k)}$, where $\tau_k^{(b)}$ is the propagation delay from BS b to user k, and $\tau_k^{(b_k)}$ is
the propagation delay from user k to its nearest BS, denoted $b_k$. Due to the random distribution of
users, each user will be at a different distance from the different BSs. Since each user
receives its own data from the different BSs synchronously, it is impossible to ensure that
the interference at user k from other users in the system be synchronous with user k’s
desired signals as well; the interference is therefore asynchronous. The distances between
BSs and users in practical systems imply that the delays related to the asynchronous in-
terference are very small; however, relative to the symbol period in such systems, which
is on the order of microseconds, these delays are considerable and cannot be neglected.
Most works dealing with multiple BSs ignore this issue, which significantly simplifies any
system design. The issue of asynchronous interference was first identified in [20] and we
show here that it is the key challenge in designing multi-BS cooperation schemes.
The asynchronous interference causes the pulse shape used to transmit the interfering
data streams to be misaligned with the matched filter at user k. Adapting from [20], this
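misalignment can be expressed as follows.
\[ \text{interference} = \sum_{b=1}^{B} H_k^{(b)H} \sum_{\substack{j=1 \\ j \ne k}}^{K} U_j^{(b)} \sqrt{P_j}\, i_{jk}^{(b)}, \tag{2.9} \]
where
\[ i_{jk}^{(b)}(m) = \rho(\delta_{jk}^{(b)} - T_S)\, x_j(m_{jk}^{(b)}) + \rho(\delta_{jk}^{(b)})\, x_j(m_{jk}^{(b)} + 1), \tag{2.10} \]
\[ \rho(\tau) = \int_{0}^{T_S} g(t)\, g(t - \tau)\, dt, \tag{2.11} \]
\[ \tau_{jk}^{(b)} = \tau_k^{(b)} - \tau_k^{(b_k)} - (\tau_j^{(b)} - \tau_j^{(b_j)}), \qquad \delta_{jk}^{(b)} = \tau_{jk}^{(b)} \bmod T_S. \tag{2.12} \]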
Here, TS is the symbol period. g(t) is the pulse shaping filter. It is non-zero only for
t ∈ [0, TS], real, and has unit power. The term i(b)jk is the misaligned interference caused
Figure 2.1: Asynchronous interference in the downlink
on user k when BS b transmits to user j; it is a linear combination of two consecutive
data vectors being transmitted to user j, with time indices $m_{jk}^{(b)}$ and $m_{jk}^{(b)} + 1$. Note that
by convention, in the downlink, $m_{jk}^{(b)}$ is the index given to the first interfering symbol and
$m_{jk}^{(b)} + 1$ is that given to the second. Figure 2.1 helps in understanding the expression
for $i_{jk}^{(b)}$. The term $\tau_{jk}^{(b)}$ represents the difference between the time when interference from
BS b while transmitting to user j arrives at user k and the time when user k receives its
desired signal. The expression can be understood as follows. Assume all BSs transmit to
their closest users at the same time, say t = 0. User k is closest to BS $b_k$ and, therefore,
user k receives its desired signal at $t = \tau_k^{(b_k)}$. BS b advances its transmission to user j
by $\tau_j^{(b)} - \tau_j^{(b_j)}$ so that user j receives all its intended signals from all BSs simultaneously.
The propagation delay from BS b to user k is $\tau_k^{(b)}$ and hence the interference from user j
arrives at user k at $t = \tau_k^{(b)} - (\tau_j^{(b)} - \tau_j^{(b_j)})$. This makes the interference from BS b when
transmitting to user j arrive at user k after $\tau_{jk}^{(b)} = \tau_k^{(b)} - \tau_k^{(b_k)} - (\tau_j^{(b)} - \tau_j^{(b_j)})$. Therefore,
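\[
\hat{x}_k = \underbrace{V_k^H \sum_{b=1}^{B} H_k^{(b)H} U_k^{(b)} \sqrt{P_k}\, x_k}_{\text{desired signal}}
+ \underbrace{V_k^H \sum_{b=1}^{B} H_k^{(b)H} \sum_{\substack{j=1 \\ j \ne k}}^{K} U_j^{(b)} \sqrt{P_j}\, i_{jk}^{(b)}}_{\text{asynchronous multiuser interference}}
+ V_k^H n_k \tag{2.13}
\]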
As noted previously, the expression of $i_{jk}^{(b)}(m)$ (Eqn. (2.10)) contains the two consecutive
data vectors $x_j(m_{jk}^{(b)})$ and $x_j(m_{jk}^{(b)} + 1)$. It is assumed that the channels experience
quasi-static fading, meaning that the fading is the same over a considerable number of
symbol periods. This implies that the same power and precoding matrices can be used for
several consecutive symbols. Accordingly, $P_j$ and $U_j^{(b)}$ can be used for $i_{jk}^{(b)}$ in Eqn. (2.13).
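As a rough numerical illustration of Eqns. (2.10) to (2.12), the sketch below computes the relative delay, its residue modulo the symbol period, and the two weights that multiply the consecutive interfering data vectors. It assumes a rectangular, unit-energy pulse, for which $\rho(\tau) = \max(0, 1 - |\tau|/T_S)$; the pulse shape, the function names, and the delay values are illustrative assumptions and are not taken from the thesis.

```python
def rho(tau, ts):
    """Pulse correlation of Eqn. (2.11) for an ASSUMED rectangular,
    unit-energy pulse on [0, ts]: rho(tau) = max(0, 1 - |tau| / ts)."""
    return max(0.0, 1.0 - abs(tau) / ts)


def downlink_delay(tau_b_k, tau_bk_k, tau_b_j, tau_bj_j):
    """Relative delay tau_jk^(b) of Eqn. (2.12).

    tau_b_k  : propagation delay from BS b to user k
    tau_bk_k : propagation delay from user k's closest BS to user k
    tau_b_j  : propagation delay from BS b to user j
    tau_bj_j : propagation delay from user j's closest BS to user j
    """
    return tau_b_k - tau_bk_k - (tau_b_j - tau_bj_j)


def interference_weights(tau_jk, ts):
    """Return (delta, weight on x_j(m), weight on x_j(m + 1)) as in Eqn. (2.10)."""
    delta = tau_jk % ts  # Python's % yields a value in [0, ts) for ts > 0
    return delta, rho(delta - ts, ts), rho(delta, ts)


# Hypothetical delays in microseconds with a 4 us symbol period.
TS = 4.0
tau = downlink_delay(tau_b_k=5.0, tau_bk_k=2.0, tau_b_j=1.0, tau_bj_j=1.0)
delta, w_first, w_second = interference_weights(tau, TS)
print(tau, delta, w_first, w_second)
```

For the assumed rectangular pulse the two weights always sum to one, so the interference energy is simply split between the two consecutive symbols in proportion to their overlap with the matched filter window.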
Before proceeding to the virtual uplink model, it is worth investigating the value of
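$E[i_{j_1 k}^{(b_1)}\, i_{j_2 k}^{(b_2)H}]$ as it will be frequently used later. Using Eqn. (2.1) and Eqn. (2.10), it can
be easily shown that
\[
E\left[i_{j_1 k}^{(b_1)}\, i_{j_2 k}^{(b_2)H}\right] =
\begin{cases}
0, & \text{for } j_1, j_2, \text{ and } k \text{ all distinct} \\
\beta_{jk}^{(b_1,b_2)} I_{L_j}, & \text{for } j_1 = j_2 = j \ne k \\
I_{L_j}, & \text{for } j_1 = j_2 = j = k
\end{cases},
\tag{2.14}
\]
where
\[
\beta_{jk}^{(b_1,b_2)} =
\begin{cases}
0, & \text{if } |m_{jk}^{(b_2)} - m_{jk}^{(b_1)}| > 1 \\
\rho(\delta_{jk}^{(b_1)})\, \rho(\delta_{jk}^{(b_2)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1 \\
\rho(\delta_{jk}^{(b_1)})\, \rho(\delta_{jk}^{(b_2)}) + \rho(\delta_{jk}^{(b_1)} - T_S)\, \rho(\delta_{jk}^{(b_2)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} \\
\rho(\delta_{jk}^{(b_2)})\, \rho(\delta_{jk}^{(b_1)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} - 1
\end{cases}.
\tag{2.15}
\]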
The details of the derivation are provided in Appendix B.
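The case analysis of Eqn. (2.15) can be written as a small function. The sketch below again assumes a rectangular, unit-energy pulse for $\rho$; the function name `beta` and the argument `index_diff` (standing for $m_{jk}^{(b_2)} - m_{jk}^{(b_1)}$) are hypothetical names introduced only for this illustration.

```python
def rho(tau, ts):
    """ASSUMED rectangular unit-energy pulse: rho(tau) = max(0, 1 - |tau| / ts)."""
    return max(0.0, 1.0 - abs(tau) / ts)


def beta(delta1, delta2, index_diff, ts):
    """Correlation factor of Eqn. (2.15).

    delta1, delta2 : residual delays delta_jk^(b1) and delta_jk^(b2)
    index_diff     : m_jk^(b2) - m_jk^(b1), the offset between symbol indices
    """
    if abs(index_diff) > 1:
        return 0.0  # the two interference terms share no data vector
    if index_diff == 1:
        return rho(delta1, ts) * rho(delta2 - ts, ts)
    if index_diff == 0:
        return (rho(delta1, ts) * rho(delta2, ts)
                + rho(delta1 - ts, ts) * rho(delta2 - ts, ts))
    return rho(delta2, ts) * rho(delta1 - ts, ts)  # index_diff == -1


TS = 4.0
print(beta(0.0, 0.0, 0, TS))   # synchronous case
print(beta(1.0, 3.0, 0, TS))   # same indices, different residual delays
print(beta(1.0, 3.0, 1, TS))   # indices offset by one symbol
```

With zero residual delays and equal indices the factor reduces to one, which is the synchronous special case used later in Section 2.3.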
2.1.2 Virtual Uplink System Model
In precoding for a single BS, many works make use of a downlink/uplink duality because
of the significant simplification it provides in solving the optimization problem in
Eqn. (2.8). Since we hope to exploit this duality for our multi-BS case, we present in
this section the virtual uplink system model to be used. It is worth emphasizing that
the uplink is purely virtual, i.e., it is a purely mathematical construct to simplify the
downlink optimization problem. In this regard, we investigate two variants of the virtual
uplink model.
In the uplink, the system model is very similar to that in the downlink, except for the
following changes. The K users are now communicating to B BSs. User k transmits Lk
data streams, which make up the components of the Lk × 1 data vector xk. The power
allocation matrix used by user k is Qk. Vk is now the precoding matrix used by user k.
Therefore, the signal transmitted by user k to all BSs is
\[ t_k = V_k \sqrt{Q_k}\, x_k. \tag{2.16} \]
The channel matrices are the same as those in the downlink with the dimensions reversed
and components conjugated (Hermitian operation). Therefore, the M ×1 signal received
by BS b from user k is
\[ y_k^{(b)} = H_k^{(b)} V_k \sqrt{Q_k}\, x_k + \text{interference} + n^{(b)}, \tag{2.17} \]
where $n^{(b)}$ denotes the AWGN at BS b (see footnote 1). The interference term will be explained in more
detail below after the timing models for the virtual uplink are presented. The BSs group
the received vectors $y_k^{(b)}$ into one global received vector $y_k = [y_k^{(1)T} \ldots y_k^{(B)T}]^T$ ($BM \times 1$),
where $(\cdot)^T$ is the transpose operator. $y_k$ is processed linearly by multiplying it by the
Hermitian of the decoding matrix $U_k$ ($BM \times L_k$) to produce an estimate of the data
vector of user k
\[ \hat{x}_k = U_k^H y_k. \tag{2.18} \]
Note that $U_k$ can be expressed as $U_k = [U_k^{(1)T}, \ldots, U_k^{(B)T}]^T$, where $U_k^{(b)}$ is the decoding
matrix used by BS b for user k. Therefore, $\hat{x}_k$ can be also expressed as
\[ \hat{x}_k = U_k^H y_k = \sum_{b=1}^{B} U_k^{(b)H} y_k^{(b)}. \tag{2.19} \]
One crucial difference between the work in [8] and the multi-BS case is the issue of
timing. In what follows, we will consider two different methods for modeling the timing
of signals in the uplink. Each method produces a different expression for the term τ(b)jk ,
which represents the difference between the time when interference from user j arrives
at BS b and the time when the desired signal of user k arrives at BS b. Otherwise, all
other asynchronous interference terms are common.
Footnote 1: We assume that each BS can process its received signal several times with different delays in order to obtain a $y_k^{(b)}$ vector for each user k that is synchronized with the data of user k.
Simultaneous Transmissions: In the first method, we model the virtual uplink
in a simple, straightforward manner. We assume that all the MSs transmit at the same
time, say t = 0. The idea behind the first method comes from the downlink assumption
that all BSs transmit to their closest users at the same time. In this case, $\tau_{jk}^{(b)}$ is expressed
as follows.
\[ \tau_{jk}^{(b)} = \tau_j^{(b)} - \tau_k^{(b)} \tag{2.20} \]
Eqn. (2.20) can be understood as follows. As explained previously, all the users transmit
at the same time, t = 0. The interfering signal from user j arrives at BS b at $t = \tau_j^{(b)}$,
while the desired signal from user k arrives at $t = \tau_k^{(b)}$. Therefore, the difference is
$\tau_{jk}^{(b)} = \tau_j^{(b)} - \tau_k^{(b)}$.
Time Reversal: In the second method, we will look first at how the virtual uplink
is modeled in the case with one BS. As in [8], the signal received by that one BS in the
uplink is given as follows.
\[ y = \sum_{k=1}^{K} H_k V_k \sqrt{Q_k}\, x_k + n \tag{2.21} \]
Note that Eqn. (2.21) has no timing terms, meaning that the different signals from the
different users arrive synchronously at the BS. With users being at different distances
from the BS, this is only possible if the users transmit at different times such that their
signals arrive at the same time at the BS. This cannot be directly extended to the
multiple BS case since, similar to what was mentioned previously in the timing issues in
Section 2.1.1, it is impossible for all the signals from all the users to arrive at all the BSs
synchronously.
In the downlink, we assumed that a BS transmits to all users for which it is the closest
BS at the same time, say t = 0. Accordingly, we propose that in the virtual uplink each
BS should necessarily receive synchronously only the signals transmitted by the users
that have it as the closest BS, say at t = 0 as well. In this case, $\tau_{jk}^{(b)}$ is expressed as follows.
\[ \tau_{jk}^{(b)} = \tau_j^{(b)} - \tau_j^{(b_j)} - (\tau_k^{(b)} - \tau_k^{(b_k)}) \tag{2.22} \]
Eqn. (2.22) can be understood as follows. Since each BS should receive the signals
transmitted by the users that have it as the closest BS at the same time, t = 0, user
k transmits its signal at $t = -\tau_k^{(b_k)}$ and this signal arrives at BS b at $t = -\tau_k^{(b_k)} + \tau_k^{(b)}$.
Similarly, the signal from user j arrives at BS b at $t = -\tau_j^{(b_j)} + \tau_j^{(b)}$. Therefore, the
interference of user j on user k arrives after $\tau_{jk}^{(b)} = \tau_j^{(b)} - \tau_j^{(b_j)} - (\tau_k^{(b)} - \tau_k^{(b_k)})$.
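To make the difference between the two timing models concrete, the following sketch evaluates Eqn. (2.20) and Eqn. (2.22) for a toy layout with two BSs and two users. The delay table, the helper `closest_bs`, and the function names are hypothetical, and the propagation delay between a BS and a user is assumed to be the same in both directions.

```python
def closest_bs(delay, user):
    """Index of the BS with the smallest propagation delay to `user`."""
    return min(delay, key=lambda b: delay[b][user])


def uplink_delay_simultaneous(delay, b, j, k):
    """Eqn. (2.20): all users transmit at t = 0."""
    return delay[b][j] - delay[b][k]


def uplink_delay_time_reversal(delay, b, j, k):
    """Eqn. (2.22): each user's signal reaches its closest BS at t = 0."""
    bj, bk = closest_bs(delay, j), closest_bs(delay, k)
    return delay[b][j] - delay[bj][j] - (delay[b][k] - delay[bk][k])


# Hypothetical propagation delays delay[b][user] in microseconds.
delay = {0: {"j": 1.0, "k": 5.0},
         1: {"j": 4.0, "k": 2.0}}

for b in delay:
    print(b,
          uplink_delay_simultaneous(delay, b, "j", "k"),
          uplink_delay_time_reversal(delay, b, "j", "k"))
```

In this toy layout the two models give different relative delays at both BSs, which is why the choice of timing model matters for the duality argument that follows.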
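With the expressions of $\tau_{jk}^{(b)}$ at hand, we present the expressions of the asynchronous
interference caused by user j on user k at BS b and the estimated data vector for user k,
$\hat{x}_k$.
\[ e_{jk}^{(b)}(m) = \rho(\delta_{jk}^{(b)})\, x_j(m_{jk}^{(b)}) + \rho(\delta_{jk}^{(b)} - T_S)\, x_j(m_{jk}^{(b)} + 1), \tag{2.23} \]
\[ \delta_{jk}^{(b)} = \tau_{jk}^{(b)} \bmod T_S, \tag{2.24} \]
\[
\hat{x}_k = \underbrace{\sum_{b=1}^{B} U_k^{(b)H} H_k^{(b)} V_k \sqrt{Q_k}\, x_k}_{\text{desired signal}}
+ \underbrace{\sum_{b=1}^{B} U_k^{(b)H} \sum_{\substack{j=1 \\ j \ne k}}^{K} H_j^{(b)} V_j \sqrt{Q_j}\, e_{jk}^{(b)}}_{\text{asynchronous multiuser interference}}
+ \sum_{b=1}^{B} U_k^{(b)H} n^{(b)} \tag{2.25}
\]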
The expression for $e_{jk}^{(b)}$ in Eqn. (2.23) can be explained in a manner similar to that of $i_{jk}^{(b)}$
in Eqn. (2.10). Figure 2.2 helps in understanding Eqn. (2.23). Note that the indices $m_{jk}^{(b)}$
and $m_{jk}^{(b)} + 1$ remain unchanged since they are simply a way of identifying the symbols,
and these symbols are the same whether they are traveling in the downlink or uplink.
In a similar derivation to that of $E[i_{j_1 k}^{(b_1)}\, i_{j_2 k}^{(b_2)H}]$, it can be shown that
\[
E\left[e_{j_1 k}^{(b_1)}\, e_{j_2 k}^{(b_2)H}\right] =
\begin{cases}
0, & \text{for } j_1, j_2, \text{ and } k \text{ all distinct} \\
\gamma_{jk}^{(b_1,b_2)} I_{L_j}, & \text{for } j_1 = j_2 = j \ne k \\
I_{L_j}, & \text{for } j_1 = j_2 = j = k
\end{cases},
\tag{2.26}
\]
Figure 2.2: Asynchronous interference in the uplink
where
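\[
\gamma_{jk}^{(b_1,b_2)} =
\begin{cases}
0, & \text{if } |m_{jk}^{(b_2)} - m_{jk}^{(b_1)}| > 1 \\
\rho(\delta_{jk}^{(b_2)})\, \rho(\delta_{jk}^{(b_1)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1 \\
\rho(\delta_{jk}^{(b_1)})\, \rho(\delta_{jk}^{(b_2)}) + \rho(\delta_{jk}^{(b_1)} - T_S)\, \rho(\delta_{jk}^{(b_2)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} \\
\rho(\delta_{jk}^{(b_1)})\, \rho(\delta_{jk}^{(b_2)} - T_S), & \text{if } m_{jk}^{(b_2)} = m_{jk}^{(b_1)} - 1
\end{cases}.
\tag{2.27}
\]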
More details are provided in Appendix B.
2.2 Downlink/Uplink Duality
In general, proving the existence of a downlink/uplink duality requires two steps. In the
first step, SINR targets are set on each data stream and the proof requires showing that
the same total power is needed to meet these targets in the downlink and uplink. In the
second step, the proof requires showing that the MSE for a data stream is the same in
the downlink or uplink, using the fact that the uplink and downlink SINRs are the same.
2.2.1 First Step: SINR Targets
Let the target SINR for data stream j of user k be Γkj. Let p be a column downlink power
allocation vector whose elements are the powers allocated to all data streams across all
users, i.e., the diagonal elements of the matrices Pk for k = 1, . . . , K. We would like
to find p such that the minimum of the ratios SINRDLkj /Γkj over all values of k and j is
maximized and ||p||1 = 1Tp ≤ B × Pmax, where 1 is an all-ones L × 1 vector. It was
shown in [6] that the solution of this problem makes all the above ratios equal to the
same level, denoted as CDL.
\[ C^{DL} = \frac{\mathrm{SINR}_{kj}^{DL}}{\Gamma_{kj}}, \quad 1 \le k \le K, \quad 1 \le j \le L_k, \quad \|p\|_1 \le B \times P_{\max}. \tag{2.28} \]
The above is true since the SINR for a given data stream is monotonically increasing in
the power of that data stream, and monotonically decreasing in the power of any other
data stream. This still holds when the interference is asynchronous, and can be easily
seen from the explicit expression of the SINR to be presented later.
From Eqn. (2.28), we can write the following equation for any data stream j of user
k.
\[ p_{kj} \frac{1}{C^{DL}} = p_{kj} \frac{\Gamma_{kj}}{\mathrm{SINR}_{kj}^{DL}}. \tag{2.29} \]
From Eqn. (2.13), the SINR of a data stream j of user k in the downlink can be found by
dividing its power by the power of the remaining data streams and noise in the system.
This SINR can be expressed as
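\[ \mathrm{SINR}_{kj}^{DL} = p_{kj} \frac{v_{kj}^H S_{kj}^{DL} v_{kj}}{v_{kj}^H T_{kj}^{DL} v_{kj}}, \tag{2.30} \]
\[ S_{kj}^{DL} = \left(\sum_{b=1}^{B} H_k^{(b)H} u_{kj}^{(b)}\right)\left(\sum_{b=1}^{B} u_{kj}^{(b)H} H_k^{(b)}\right), \tag{2.31} \]
where $u_{kj}^{(b)}$ denotes the column of $U_k^{(b)}$ corresponding to stream j of user k at BS b.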
Also,
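\[
\begin{aligned}
T_{kj}^{DL} &= \sum_{\substack{l=1 \\ l \ne j}}^{L_k} p_{kl} \left(\sum_{b=1}^{B} H_k^{(b)H} u_{kl}^{(b)}\right)\left(\sum_{b=1}^{B} u_{kl}^{(b)H} H_k^{(b)}\right) \\
&\quad + E\left[\sum_{\substack{c=1 \\ c \ne k}}^{K} \sum_{l=1}^{L_c} p_{cl} \left(\sum_{b=1}^{B} H_k^{(b)H} u_{cl}^{(b)} i_{ckj}^{(b)}\right)\left(\sum_{b=1}^{B} i_{ckj}^{(b)H} u_{cl}^{(b)H} H_k^{(b)}\right)\right] + \sigma^2 I_{N_k} \\
&= \sum_{\substack{l=1 \\ l \ne j}}^{L_k} \sum_{b_1=1}^{B} \sum_{b_2=1}^{B} H_k^{(b_1)H} u_{kl}^{(b_1)}\, p_{kl}\, u_{kl}^{(b_2)H} H_k^{(b_2)} \\
&\quad + \sum_{\substack{c=1 \\ c \ne k}}^{K} \sum_{l=1}^{L_c} \sum_{b_1=1}^{B} \sum_{b_2=1}^{B} \beta_{ck}^{(b_1,b_2)} H_k^{(b_1)H} u_{cl}^{(b_1)}\, p_{cl}\, u_{cl}^{(b_2)H} H_k^{(b_2)} + \sigma^2 I_{N_k}.
\end{aligned}
\tag{2.32}
\]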
$i_{ckj}^{(b)}$ is the jth element of $i_{ck}^{(b)}$. Note that it is assumed, as in [9], that the covariance matrix
of any external interference can be estimated, for example by training, and whitened.
The whitening filter can be considered as part of the global channel matrix of user k,
$H_k = [H_k^{(1)T}\ H_k^{(2)T} \ldots H_k^{(B)T}]^T$ (see footnote 2). Hence, we only include a scaled identity matrix in Eqn. (2.32)
to represent noise and external interference. Next, the L equalities given by Eqn. (2.29)
can be grouped together in one equation as follows
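\[ p\, \frac{1}{C^{DL}} = D \Psi^{DL} p + D \sigma^{DL}, \tag{2.33} \]
where
\[ D = \mathrm{diag}\left(\frac{\Gamma_{11}}{v_{11}^H S_{11}^{DL} v_{11}}, \ldots, \frac{\Gamma_{K L_K}}{v_{K L_K}^H S_{K L_K}^{DL} v_{K L_K}}\right), \tag{2.34} \]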
Footnote 2: This can be understood in general as follows. Denote the transmission from all BSs as vector s. Assume a user receives s through channel H. Noise, denoted by n, is also added to the received signal. Thus, the received signal is expressed as $H^H s + n$. We assume the noise is colored with a covariance matrix R. The user whitens the noise using a whitening filter, found as $R^{-1/2}$. Therefore, the processed received signal becomes $R^{-1/2} H^H s + R^{-1/2} n$. The vector $R^{-1/2} n$ is now AWGN, and $R^{-1/2} H^H$ can be considered the equivalent channel.
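\[
\begin{aligned}
[\Psi^{DL}]_{L_r L_c} &= v_{k_r l_r}^H\, E\left[\left(\sum_{b=1}^{B} H_{k_r}^{(b)H} u_{k_c l_c}^{(b)} i_{k_c k_r l_r}^{(b)}\right)\left(\sum_{b=1}^{B} H_{k_r}^{(b)H} u_{k_c l_c}^{(b)} i_{k_c k_r l_r}^{(b)}\right)^H\right] v_{k_r l_r} && (2.35) \\
&= v_{k_r l_r}^H \left(\sum_{b_1=1}^{B} \sum_{b_2=1}^{B} \beta_{k_c k_r}^{(b_1,b_2)} H_{k_r}^{(b_1)H} u_{k_c l_c}^{(b_1)} u_{k_c l_c}^{(b_2)H} H_{k_r}^{(b_2)}\right) v_{k_r l_r}, && (2.36)
\end{aligned}
\]
for $1 \le L_r, L_c \le L$ and $L_r \ne L_c$, and
\[ \sigma^{DL} = \sigma^2 \left[v_{11}^H v_{11}, \ldots, v_{K L_K}^H v_{K L_K}\right]^T = \sigma^2 \mathbf{1}. \]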
The vector 1 is an L×1 vector whose elements are all 1. The subscripts r and c are short
for row and column, respectively; kr is the user to which data stream Lr belongs when
the data streams are labeled from 1 to L; lr is the index of that data stream relative to
user kr. The same applies to kc and lc. Note that the diagonal entries of [ΨDL] are zero.
Also, it is assumed, for ease of representation, that when $k_r = k_c = k$, $i_{kk}^{(b)} = x_k$, and in
that case the specific element $[\Psi^{DL}]_{L_r L_c}$ can be found as follows
\[ [\Psi^{DL}]_{L_r L_c} = \left| v_{k_r l_r}^H \sum_{b=1}^{B} H_{k_r}^{(b)H} u_{k_c l_c}^{(b)} \right|^2. \]
To minimize the total power while meeting the target SINRs at the same time, CDL is
set to 1. Accordingly, and after simple mathematical manipulations to Eqn. (2.33), we
get
\[ p = \sigma^2 (D^{-1} - \Psi^{DL})^{-1} \mathbf{1}. \tag{2.37} \]
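The manipulations leading from Eqn. (2.33) to Eqn. (2.37) can be spelled out as follows. This is only a sketch of the intermediate steps; it assumes that D is invertible (its diagonal entries are positive) and that $D^{-1} - \Psi^{DL}$ is invertible, which is the case when the SINR targets are feasible.

```latex
\begin{aligned}
p &= D \Psi^{DL} p + D \sigma^{DL}
  && \text{(Eqn. (2.33) with } C^{DL} = 1\text{)} \\
(I - D \Psi^{DL})\, p &= \sigma^2 D \mathbf{1}
  && \text{(collect } p \text{, use } \sigma^{DL} = \sigma^2 \mathbf{1}\text{)} \\
(D^{-1} - \Psi^{DL})\, p &= \sigma^2 \mathbf{1}
  && \text{(multiply on the left by } D^{-1}\text{)} \\
p &= \sigma^2 (D^{-1} - \Psi^{DL})^{-1} \mathbf{1}.
\end{aligned}
```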
A similar discussion applies for the uplink. In that case, the power vector is denoted
as q. Its elements are the powers allocated to all data streams across all users, i.e., the
diagonal elements of the matrices Qk for k = 1, . . . , K. The SINR of data stream j of
user k in the uplink can be expressed as follows,
\[ \mathrm{SINR}_{kj}^{UL} = q_{kj} \frac{\mathrm{num}_{kj}}{\mathrm{den}_{kj}} \tag{2.38} \]
where
\[ \mathrm{num}_{kj} = \left(\sum_{b=1}^{B} u_{kj}^{(b)H} H_k^{(b)}\right) v_{kj} v_{kj}^H \left(\sum_{b=1}^{B} H_k^{(b)H} u_{kj}^{(b)}\right) \tag{2.39} \]
and
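\[
\begin{aligned}
\mathrm{den}_{kj} &= \sum_{\substack{l=1 \\ l \ne j}}^{L_k} q_{kl} \left| \left(\sum_{b=1}^{B} u_{kj}^{(b)H} H_k^{(b)}\right) v_{kl} \right|^2
+ E\left[\sum_{\substack{c=1 \\ c \ne k}}^{K} \sum_{l=1}^{L_c} q_{cl} \left| \sum_{b=1}^{B} u_{kj}^{(b)H} H_c^{(b)} v_{cl}\, e_{ckj}^{(b)} \right|^2\right] + \sigma^2 \\
&= \sum_{\substack{l=1 \\ l \ne j}}^{L_k} \sum_{b_1=1}^{B} \sum_{b_2=1}^{B} u_{kj}^{(b_1)H} H_k^{(b_1)} v_{kl}\, q_{kl}\, v_{kl}^H H_k^{(b_2)H} u_{kj}^{(b_2)} \\
&\quad + \sum_{\substack{c=1 \\ c \ne k}}^{K} \sum_{l=1}^{L_c} \sum_{b_1=1}^{B} \sum_{b_2=1}^{B} \gamma_{ck}^{(b_1,b_2)} u_{kj}^{(b_1)H} H_c^{(b_1)} v_{cl}\, q_{cl}\, v_{cl}^H H_c^{(b_2)H} u_{kj}^{(b_2)} + \sigma^2
\end{aligned}
\tag{2.40}
\]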
Note that |x|2 = xxH = xHx for scalars and it is used as such in Eqn. (2.40) because
of its more compact form. Setting the same SINR targets as the downlink and grouping
equalities of similar form to Eqn. (2.29) for the uplink, we get
\[ q\, \frac{1}{C^{UL}} = D \Psi^{UL} q + D \sigma^{UL}, \tag{2.41} \]
where
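\[ D = \mathrm{diag}\left(\frac{\Gamma_{11}}{\mathrm{num}_{11}}, \frac{\Gamma_{12}}{\mathrm{num}_{12}}, \ldots, \frac{\Gamma_{K L_K}}{\mathrm{num}_{K L_K}}\right), \tag{2.42} \]
\[
\begin{aligned}
[\Psi^{UL}]_{L_r L_c} &= E\left[\left| \sum_{b=1}^{B} u_{k_r l_r}^{(b)H} H_{k_c}^{(b)} v_{k_c l_c}\, e_{k_c k_r l_r}^{(b)} \right|^2\right] \\
&= \sum_{b_1=1}^{B} \sum_{b_2=1}^{B} \gamma_{k_c k_r}^{(b_1,b_2)} u_{k_r l_r}^{(b_1)H} H_{k_c}^{(b_1)} v_{k_c l_c} v_{k_c l_c}^H H_{k_c}^{(b_2)H} u_{k_r l_r}^{(b_2)}
\end{aligned}
\tag{2.43}
\]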
for 1 ≤ Lr, Lc ≤ L and Lr 6= Lc, and
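\[ \sigma^{UL} = \sigma^2 \left[\sum_{b=1}^{B} u_{11}^{(b)H} u_{11}^{(b)}, \ldots, \sum_{b=1}^{B} u_{K L_K}^{(b)H} u_{K L_K}^{(b)}\right]^T = \sigma^2 \mathbf{1}. \]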
Again, we have [ΨUL]LrLc = 0 when Lr = Lc, and e(b)kk = xk. Note that the matrix D is
equal in the downlink and uplink and hence it has no superscript. Similarly, by setting
CUL = 1, we get
\[ q = \sigma^2 (D^{-1} - \Psi^{UL})^{-1} \mathbf{1}. \tag{2.44} \]
By establishing the above, we know that both the downlink and uplink have the same
achievable SINR region, if free of power constraints. To complete the first step, it is
required that $\|p\|_1 = \|q\|_1$. This can be achieved if $\Psi^{DL}$ equals $\Psi^{UL\,T}$. Note that the
transpose operation is sufficient since the $\Psi$ matrices have real components. By inspecting
Eqns. (2.36) and (2.43), and using the fact that all terms of the form $u^H H v$ or $v^H H^H u$
are scalars, a sufficient condition for this equality to hold is having $E[i_{jkl_1}^{(b_1)}\, i_{jkl_1}^{(b_2)H}]$ equal
to $E[e_{kjl_2}^{(b_1)}\, e_{kjl_2}^{(b_2)H}]$, i.e.,
\[ \beta_{jk}^{(b_1,b_2)} = \gamma_{kj}^{(b_1,b_2)}. \tag{2.45} \]
Notice that the subscripts k and j in Eqn. (2.45) are switched when going from one side
of the equation to the other, since we are equating ΨDL to a transposed version of ΨUL.
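The claim that a transposed coupling matrix yields the same total power can be checked numerically. The sketch below uses NumPy with randomly drawn placeholder values for D and $\Psi^{DL}$ (they are not derived from any channel), sets $\Psi^{UL} = \Psi^{DL\,T}$, and compares the sums of the power vectors given by Eqns. (2.37) and (2.44). The underlying reason is that $\mathbf{1}^T A^{-1} \mathbf{1}$ is a scalar and therefore equals $\mathbf{1}^T (A^T)^{-1} \mathbf{1}$.

```python
import numpy as np

rng = np.random.default_rng(0)
L, sigma2 = 4, 0.1                      # number of streams, noise variance (assumed)

# Placeholder quantities: positive diagonal D, small nonnegative coupling matrix
d = rng.uniform(0.5, 1.5, size=L)
psi_dl = rng.uniform(0.0, 0.1, size=(L, L))
np.fill_diagonal(psi_dl, 0.0)           # diagonal entries of Psi are zero
psi_ul = psi_dl.T                       # the condition Psi^DL = (Psi^UL)^T

ones = np.ones(L)
d_inv = np.diag(1.0 / d)

# Eqn. (2.37) and Eqn. (2.44), solved without forming the inverse explicitly
p = sigma2 * np.linalg.solve(d_inv - psi_dl, ones)
q = sigma2 * np.linalg.solve(d_inv - psi_ul, ones)

print(p.sum(), q.sum())                 # total downlink and uplink powers
```

The individual entries of p and q generally differ; only their sums coincide, which is all that the first step of the duality requires.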
In attempting to show Eqn. (2.45), we require the following proposition. Note that
subscripts k and j are also switched in the proposition statement below as we go from
one side of the ‘implies’ statements to the other.
Proposition 1: For both downlink and uplink, $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1 \implies m_{kj}^{(b_2)} = m_{kj}^{(b_1)} - 1$,
$m_{jk}^{(b_2)} = m_{jk}^{(b_1)} - 1 \implies m_{kj}^{(b_2)} = m_{kj}^{(b_1)} + 1$, and $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} \implies m_{kj}^{(b_2)} = m_{kj}^{(b_1)}$.
Proof: Let $s_j(m)$ denote the symbol transmitted to user j with time index m. Consider
the case when $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1$ in the downlink. Accordingly, the symbol $s_j(m_{jk}^{(b_2)})$ is
the same as symbol $s_j(m_{jk}^{(b_1)} + 1)$, but $s_j(m_{jk}^{(b_2)})$ arrives at user k as interference from BS
$b_2$ and $s_j(m_{jk}^{(b_1)} + 1)$ arrives at user k as interference from BS $b_1$. Hence, we can write the
exact times at which these symbols arrive at user k as follows (from the analysis of the
interference timing seen before).
\[ t(s_j(m_{jk}^{(b_2)})) = \tau_j^{(b_j)} - \tau_j^{(b_2)} + \tau_k^{(b_2)} \]
\[ t(s_j(m_{jk}^{(b_1)} + 1)) = \tau_j^{(b_j)} - \tau_j^{(b_1)} + \tau_k^{(b_1)} \]
Note that $s_j(m_{jk}^{(b_1)} + 1)$ arrives after $s_j(m_{jk}^{(b_2)})$ (otherwise $s_j(m_{jk}^{(b_1)} + 1)$ would have been
$s_j(m_{jk}^{(b_1)})$; see Figure 2.3). Therefore, for some constant c, we can write
\[
\begin{aligned}
&(1) && t(s_j(m_{jk}^{(b_2)})) - t(s_j(m_{jk}^{(b_1)} + 1)) = c < 0 \\
&(2) \implies && [\tau_j^{(b_j)} - \tau_j^{(b_2)} + \tau_k^{(b_2)}] - [\tau_j^{(b_j)} - \tau_j^{(b_1)} + \tau_k^{(b_1)}] = c < 0 \\
&(3) \implies && [-\tau_j^{(b_2)} + \tau_k^{(b_2)}] - [-\tau_j^{(b_1)} + \tau_k^{(b_1)}] = c < 0 \\
&(4) \implies && [\tau_j^{(b_2)} - \tau_k^{(b_2)}] - [\tau_j^{(b_1)} - \tau_k^{(b_1)}] = -c > 0 \\
&(5) \implies && [\tau_k^{(b_k)} + \tau_j^{(b_2)} - \tau_k^{(b_2)}] - [\tau_k^{(b_k)} + \tau_j^{(b_1)} - \tau_k^{(b_1)}] = -c > 0 \\
&(6) \implies && t(\text{user } k \text{ on } j \text{ by BS } b_2) - t(\text{user } k \text{ on } j \text{ by BS } b_1) = -c > 0 \\
&(7) \implies && m_{kj}^{(b_2)} + 1 = m_{kj}^{(b_1)}
\end{aligned}
\]
where
(1) Direct result of $s_j(m_{jk}^{(b_1)} + 1)$ arriving after $s_j(m_{jk}^{(b_2)})$.
(2) Express the times explicitly.
(3) Remove the common term $\tau_j^{(b_j)}$.
(4) Multiply the equation by −1.
(5) Add and subtract the term $\tau_k^{(b_k)}$.
(6) The first three terms can be interpreted as the time that interference from user
k arrives at user j due to transmission from BS b2. The last three terms can
be interpreted as the time that interference from user k arrives at user j due to
transmission from BS b1.
Figure 2.3: Indices of interfering symbols of one user from two BSs
(7) The difference in timing in step (6) is greater than zero. This means that the
interference from user k on user j due to transmission from BS $b_2$ arrives after the
interference from user k on user j due to transmission from BS $b_1$. Hence, the
relation between the indices of the symbols is $m_{kj}^{(b_2)} + 1 = m_{kj}^{(b_1)}$.
The other cases have similar proofs.
Now, we attempt to derive Eqn. (2.45). Assume that in the downlink $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1$
holds. Therefore, from Eqn. (2.15),
\[ \beta_{jk}^{(b_1,b_2)} = \rho(\delta_{jk}^{(b_1)DL})\, \rho(\delta_{jk}^{(b_2)DL} - T_S). \]
Now consider the uplink. From Proposition 1, $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1 \implies m_{kj}^{(b_2)} = m_{kj}^{(b_1)} - 1$.
Therefore, from Eqn. (2.27),
\[ \gamma_{kj}^{(b_1,b_2)} = \rho(\delta_{kj}^{(b_1)UL})\, \rho(\delta_{kj}^{(b_2)UL} - T_S). \]
Again, notice that the subscripts k and j have been switched. Accordingly, it is enough
to show that $\delta_{jk}^{(b)DL} = \delta_{kj}^{(b)UL}$ to prove that $\beta_{jk}^{(b_1,b_2)} = \gamma_{kj}^{(b_1,b_2)}$. Before proceeding, recall that
$\delta$ and $\tau$ are related by a simple modulo operation, as in Eqn. (2.12) and Eqn. (2.24).
From Eqn. (2.12), we have $\tau_{jk}^{(b)DL} = \tau_k^{(b)} - \tau_k^{(b_k)} - (\tau_j^{(b)} - \tau_j^{(b_j)})$. Using the first method
to model the virtual uplink, where all the users transmit simultaneously, we have
$\tau_{kj}^{(b)UL} = \tau_k^{(b)} - \tau_j^{(b)}$ from Eqn. (2.20). In this case, it is not clear that $\delta_{jk}^{(b)DL} = \delta_{kj}^{(b)UL}$. One may
try setting $\tau_{kj}^{(b)UL} = \tau_{jk}^{(b)DL}$ and solving for appropriate virtual uplink delays such that
$\delta_{jk}^{(b)DL} = \delta_{kj}^{(b)UL}$ holds, but such an attempt yields an over-determined system and no
solution is guaranteed. Furthermore, trying to solve for the virtual uplink delays using
||p||1 = ||q||1, the initial condition required to establish the first step of a downlink/uplink
duality, yields an under-determined system.
On the other hand, using the second method to model the virtual uplink, where all
users transmit using time reversal, we have $\tau_{kj}^{(b)UL} = \tau_k^{(b)} - \tau_k^{(b_k)} - (\tau_j^{(b)} - \tau_j^{(b_j)})$ from
Eqn. (2.22) with the roles of j and k exchanged, which
is equal to $\tau_{jk}^{(b)DL}$. This result directly gives $\delta_{jk}^{(b)DL} = \delta_{kj}^{(b)UL}$, and hence $\beta_{jk}^{(b_1,b_2)} = \gamma_{kj}^{(b_1,b_2)}$.
Finally, we have $E[i_{jkl_1}^{(b_1)}\, i_{jkl_1}^{(b_2)H}] = E[e_{kjl_2}^{(b_1)}\, e_{kjl_2}^{(b_2)H}]$, $\Psi^{DL} = \Psi^{UL\,T}$, and $\|p\|_1 = \|q\|_1$. Therefore,
the first step towards duality can be completed provided we model the virtual
uplink using time reversal. In other words, we conclude that the time reversal model is
the uplink dual of the downlink timing scheme used in Section 2.1.1.
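The comparison between the two uplink timing models can be summarized in one short worked calculation, obtained by exchanging the roles of j and k in Eqns. (2.20) and (2.22) and subtracting the downlink delay of Eqn. (2.12). The labels "sim" and "TR" are shorthand introduced here for the simultaneous-transmission and time-reversal models.

```latex
\begin{aligned}
\tau_{jk}^{(b)DL} &= \tau_k^{(b)} - \tau_k^{(b_k)} - \bigl(\tau_j^{(b)} - \tau_j^{(b_j)}\bigr), \\
\tau_{kj}^{(b)UL,\mathrm{TR}} &= \tau_k^{(b)} - \tau_k^{(b_k)} - \bigl(\tau_j^{(b)} - \tau_j^{(b_j)}\bigr)
  = \tau_{jk}^{(b)DL}, \\
\tau_{kj}^{(b)UL,\mathrm{sim}} &= \tau_k^{(b)} - \tau_j^{(b)}
  = \tau_{jk}^{(b)DL} + \bigl(\tau_k^{(b_k)} - \tau_j^{(b_j)}\bigr).
\end{aligned}
```

Under simultaneous transmission the residual delays therefore agree only if $\tau_k^{(b_k)} - \tau_j^{(b_j)}$ happens to be an integer multiple of $T_S$ for every pair of users, which cannot be expected for randomly placed users.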
2.2.2 Second Step: Equating MSEs
After establishing the first step to duality, showing that the downlink and uplink MSEs
of a data stream are equal when its downlink and uplink SINRs are equal is a simple
matter and can be done along the lines of the proofs presented in [7, 8]. The details are
provided in Appendix A.
2.3 Single Base Station: Synchronous Interference
Before proceeding, we will examine a special case with only one BS and, hence, syn-
chronous interference. We will review the multiuser linear precoding algorithm presented
in [8], which can be used in this case. It is worth presenting and understanding this al-
gorithm as it will be adjusted to account for asynchronous interference, and it will form
the basis for the hybrid algorithm to be presented in Chapter 3.
Consider the case when only one BS is present. Modeling this can be simply accom-
plished by setting B = 1 in the system model presented earlier. Accordingly, all the
(b) superscripts will be dropped. With only one BS at hand, the MUI experienced by
the users is no longer asynchronous with their desired data, and hence all the interference
time delays $\tau_{jk}^{(b)}$ are set to zero, and $i_{jk}^{(b)}$ and $e_{jk}^{(b)}$ both simplify to $x_j$. Moreover,
$\beta_{jk}^{(b_1,b_2)} = \gamma_{kj}^{(b_1,b_2)} = 1$, and duality exists. Consequently, the estimated data vectors in the
downlink and virtual uplink can be expressed as follows.
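\[
\hat{x}_k^{DL} = \underbrace{V_k^H H_k^H U_k \sqrt{P_k}\, x_k}_{\text{desired signal}}
+ \underbrace{V_k^H H_k^H \sum_{\substack{j=1 \\ j \ne k}}^{K} U_j \sqrt{P_j}\, x_j}_{\text{synchronous multiuser interference}}
+ V_k^H n_k \tag{2.46}
\]
\[
\hat{x}_k^{UL} = \underbrace{U_k^H H_k V_k \sqrt{Q_k}\, x_k}_{\text{desired signal}}
+ \underbrace{\sum_{\substack{j=1 \\ j \ne k}}^{K} U_k^H H_j V_j \sqrt{Q_j}\, x_j}_{\text{synchronous multiuser interference}}
+ U_k^H n \tag{2.47}
\]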
We will consider the SMSE in the virtual uplink and show how it can be minimized.
Then, the derived virtual uplink solution will be transformed into the downlink, using
the downlink/uplink duality [8].
By substituting Eqn. (2.47) into Eqn. (2.5) and expanding it, the error covariance
matrix of user k in the virtual uplink is found to be
\[
E_k^{UL} = U_k^H H V Q V^H H^H U_k + \sigma^2 U_k^H U_k + I_{L_k}
- U_k^H H_k V_k \sqrt{Q_k} - \sqrt{Q_k}\, V_k^H H_k^H U_k,
\tag{2.48}
\]
where $H = [H_1, \ldots, H_K]$, $V = \mathrm{diag}(V_1, \ldots, V_K)$, and $Q = \mathrm{diag}(Q_1, \ldots, Q_K)$ are the
uplink global channel, precoding, and power allocation matrices, respectively. Differentiating
the trace of Eqn. (2.48) with respect to $U_k^H$ and setting the result to zero, the
optimum uplink MMSE decoding matrix is found to be
\[ U_k^{MMSE} = J^{-1} H_k V_k \sqrt{Q_k}, \tag{2.49} \]
where
\[ J = H V Q V^H H^H + \sigma^2 I_M. \tag{2.50} \]
The columns of UMMSEk can be normalized to ensure that the power constraint is met.
Substituting Eqn. (2.49) into Eqn. (2.48), the virtual uplink MMSE error covariance
matrix of user k is
\[ E_k^{UL,MMSE} = I_{L_k} - \sqrt{Q_k}\, V_k^H H_k^H J^{-1} H_k V_k \sqrt{Q_k}. \tag{2.51} \]
Substituting Eqn. (2.51) into Eqn. (2.7), we get
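\[
\begin{aligned}
\mathrm{SMSE} &= \sum_{k=1}^{K} \mathrm{tr}(E_k^{UL,MMSE}) = \sum_{k=1}^{K} L_k - \sum_{k=1}^{K} \mathrm{tr}\left(\sqrt{Q_k}\, V_k^H H_k^H J^{-1} H_k V_k \sqrt{Q_k}\right) \\
&= L - \sum_{k=1}^{K} \mathrm{tr}\left(H_k V_k Q_k V_k^H H_k^H J^{-1}\right) \\
&= L - \mathrm{tr}\left(H V Q V^H H^H J^{-1}\right) \\
&= L - M + \sigma^2\, \mathrm{tr}\left(J^{-1}\right).
\end{aligned}
\tag{2.52}
\]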
Note that the SMSE expression in Eqn. (2.52) deals with the uplink exclusively.
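The chain from Eqn. (2.48) to Eqn. (2.52) can be sanity-checked numerically. The sketch below draws random channels, uplink precoders, and powers (all placeholder values, with dimensions chosen arbitrarily), forms the MMSE decoders of Eqn. (2.49), evaluates the error covariance of Eqn. (2.48) directly, and compares the resulting SMSE with the closed form. The noise variance is kept explicit, so the last line of the chain reads $L - M + \sigma^2\,\mathrm{tr}(J^{-1})$, which follows from $H V Q V^H H^H = J - \sigma^2 I_M$.

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, Nk, Lk, sigma2 = 4, 2, 2, 2, 0.5   # assumed dimensions and noise variance


def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


H = [crandn(M, Nk) for _ in range(K)]                     # uplink channels H_k (M x Nk)
V = [np.linalg.qr(crandn(Nk, Lk))[0] for _ in range(K)]   # precoders with unit-norm columns
Q = [np.diag(rng.uniform(0.5, 1.5, Lk)) for _ in range(K)]

# Effective channels G_k = H_k V_k sqrt(Q_k), and J of Eqn. (2.50)
G = [H[k] @ V[k] @ np.sqrt(Q[k]) for k in range(K)]
J = sum(g @ g.conj().T for g in G) + sigma2 * np.eye(M)
J_inv = np.linalg.inv(J)

smse_direct = 0.0
for g in G:
    U = J_inv @ g                                         # Eqn. (2.49)
    # Eqn. (2.48); its first two terms combine into U^H J U
    E = U.conj().T @ J @ U + np.eye(Lk) - U.conj().T @ g - g.conj().T @ U
    smse_direct += np.trace(E).real

L = K * Lk
smse_closed = L - M + sigma2 * np.trace(J_inv).real       # closed form with sigma^2 explicit
print(smse_direct, smse_closed)
```

Since L, M, and $\sigma^2$ are constants, minimizing the SMSE over the powers and precoders amounts to minimizing $\mathrm{tr}(J^{-1})$.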
2.3.1 Review of the MIMO-SMSE Algorithm
In this section, we briefly review the MIMO-SMSE algorithm proposed in [8] to minimize
the SMSE given in Eqn. (2.52) under a total power constraint for the single-BS case.
The algorithm is detailed in Table 2.1.
The main idea behind the algorithm is to solve the SMSE minimization problem in
the virtual uplink (which proves to be easier than solving it directly in the downlink),
and then using downlink/uplink duality, convert the obtained solution into an equivalent
one in the downlink. The algorithm begins by initializing the uplink precoding matrices
V by the right singular vectors of the channel matrices H and by dividing the available
power equally among all data streams. Next, it iterates between finding uplink precoding
matrices that minimize the SMSE assuming the uplink power allocation is constant [11],
and finding the uplink power that minimizes the SMSE assuming the uplink precoding
matrices are constant. The latter can be expressed as follows. Note that minimizing the
Initialization: $V_k = \mathrm{SVD}(H_k)$, $Q = (P_{\max}/L)\, I$
Iteration:
1. Find virtual uplink precoding vectors, for k = 1 : K, j = 1 : $L_k$
   $v_{kj} = e_{\max}\left(H_k^H J_{kj}^{-2} H_k,\ I/q_{kj} + H_k^H J_{kj}^{-1} H_k\right)$
   N.B.: $e_{\max}$ returns the normalized eigenvector with the highest eigenvalue.
2. Find virtual uplink power allocation to minimize SMSE.
   $q = \arg\min_q \mathrm{tr}(J^{-1})$, subject to $q_{kj} > 0$, $\|q\|_1 \le P_{\max}$
3. Repeat steps 1 and 2 until old SMSE − new SMSE < $\epsilon$
Update:
4. Find downlink precoding matrices and normalize their columns,
   for k = 1 : K, j = 1 : $L_k$
   $U_k = J^{-1} H_k V_k \sqrt{Q_k}$, $u_{kj} = u_{kj}/\|u_{kj}\|$
5. Set the target SINRs to the actual SINRs, for k = 1 : K, j = 1 : $L_k$
   $\Gamma_{kj} = \mathrm{SINR}_{kj}^{UL}$
6. Find the downlink power allocation.
   $p = \sigma^2 (D^{-1} - \Psi^{UL\,T})^{-1} \mathbf{1}$
Table 2.1: Multiuser MIMO-SMSE Algorithm
SMSE in Eqn. (2.52) is equivalent to minimizing the term tr(J−1).
\[ q = \arg\min_{q}\ \mathrm{tr}\left(J^{-1}\right), \quad \text{subject to } q_{kj} > 0,\ \|q\|_1 \le P_{\max}. \tag{2.53} \]
The power allocation problem in Eqn. (2.53) is convex in q, which makes it relatively easy
to solve using numerical methods, since no closed form solution is known [8]. Finally,
when the relative change in SMSE is below a certain threshold, the obtained uplink
solution is used to derive the downlink precoding matrices and downlink power allocation
that achieve the same SMSE, under the same power constraint. Note that, in Table 2.1,
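matrices $\mathbf{D}$ and $\mathbf{\Psi}^{UL}$ are given by Eqns. (2.42) and (2.43) with $\gamma_{ck}^{(b_1,b_2)}$ set to 1, and the
matrix $\mathbf{J}$ is given by Eqn. (2.50). The matrix $\mathbf{J}_{kj}$ has the same expression as in [8]. It is
repeated below for convenience.
\[
\mathbf{J}_{kj} = \mathbf{J} - q_{kj}\,\mathbf{H}_k \mathbf{v}_{kj} \mathbf{v}_{kj}^H \mathbf{H}_k^H. \tag{2.54}
\]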
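The numerical power allocation of Eqn. (2.53) can be illustrated on a toy case. The sketch below is only an illustration under stated assumptions: two single-antenna users, one BS with M = 2 antennas, σ² = 1, the constraint read as a sum-power constraint met with equality, and made-up channel vectors `H1` and `H2`. A coarse grid search stands in for a proper convex solver, and all function names are hypothetical.

```python
# Toy virtual-uplink power allocation for Eqn. (2.53): two single-antenna
# users, one BS with M = 2 antennas, sigma^2 = 1 (all values illustrative).

def inv2(a):
    """Inverse of a 2x2 (possibly complex) matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def trace_inv_j(h1, h2, q1, q2, sigma2=1.0):
    """tr(J^-1) with J = q1*h1*h1^H + q2*h2*h2^H + sigma2*I."""
    j = [[sigma2 if r == c else 0.0 for c in range(2)] for r in range(2)]
    for h, q in ((h1, q1), (h2, q2)):
        for r in range(2):
            for c in range(2):
                j[r][c] += q * h[r] * h[c].conjugate()
    j_inv = inv2(j)
    return (j_inv[0][0] + j_inv[1][1]).real

def best_split(h1, h2, p_max, steps=100):
    """Grid search over q1 + q2 = p_max with q1, q2 > 0."""
    grid = [p_max * i / steps for i in range(1, steps)]
    costs = [trace_inv_j(h1, h2, q1, p_max - q1) for q1 in grid]
    i_best = min(range(len(costs)), key=costs.__getitem__)
    return grid[i_best], costs

H1 = [1.0 + 0.5j, 0.2 - 0.1j]   # hypothetical channel of user 1
H2 = [0.3 + 0.4j, 0.9 + 0.2j]   # hypothetical channel of user 2
q1_opt, cost_curve = best_split(H1, H2, p_max=10.0)
```

Because J is affine in the powers, the sampled cost curve has non-negative second differences along the line q1 + q2 = Pmax, which is the convexity the text relies on.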
2.3.2 Multiple Base Stations: Assuming Synchronous Interference
In this section, we show that if all interference were synchronous, under our model, the
multiple BSs act as a single BS with B×M antennas. Assume that it is somehow possible
to ensure that all the transmitted signals from all BSs arrive synchronously at all users.
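All the interference time delays $\tau_{jk}^{(b)}$ become zero, and $\mathbf{i}_{jk}^{(b)}$ and $\mathbf{e}_{jk}^{(b)}$ both simplify to $\mathbf{x}_j$.
Moreover, $\beta_{jk}^{(b_1,b_2)} = \gamma_{kj}^{(b_1,b_2)} = 1$, and duality exists. Accordingly, $\hat{\mathbf{x}}_k^{DL}$ in Eqn. (2.13) and
$\hat{\mathbf{x}}_k^{UL}$ in Eqn. (2.25) become
\[
\begin{aligned}
\hat{\mathbf{x}}_k^{DL} &= \mathbf{V}_k^H \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{U}_k^{(b)} \sqrt{\mathbf{P}_k}\,\mathbf{x}_k + \mathbf{V}_k^H \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \sum_{\substack{j=1 \\ j \neq k}}^{K} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{V}_k^H \mathbf{n}_k \\
&= \mathbf{V}_k^H \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{U}_k^{(b)} \sqrt{\mathbf{P}_k}\,\mathbf{x}_k + \mathbf{V}_k^H \sum_{\substack{j=1 \\ j \neq k}}^{K} \sum_{b=1}^{B} \mathbf{H}_k^{(b)H} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{V}_k^H \mathbf{n}_k \\
&= \mathbf{V}_k^H \mathbf{H}_k^H \mathbf{U}_k \sqrt{\mathbf{P}_k}\,\mathbf{x}_k + \mathbf{V}_k^H \sum_{\substack{j=1 \\ j \neq k}}^{K} \mathbf{H}_k^H \mathbf{U}_j \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{V}_k^H \mathbf{n}_k
\end{aligned}
\tag{2.55}
\]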
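\[
\begin{aligned}
\hat{\mathbf{x}}_k^{UL} &= \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{H}_k^{(b)} \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k + \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \sum_{\substack{j=1 \\ j \neq k}}^{K} \mathbf{H}_j^{(b)} \mathbf{V}_j \sqrt{\mathbf{Q}_j}\,\mathbf{x}_j + \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{n}^{(b)} \\
&= \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{H}_k^{(b)} \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k + \sum_{\substack{j=1 \\ j \neq k}}^{K} \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{H}_j^{(b)} \mathbf{V}_j \sqrt{\mathbf{Q}_j}\,\mathbf{x}_j + \sum_{b=1}^{B} \mathbf{U}_k^{(b)H} \mathbf{n}^{(b)} \\
&= \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}\,\mathbf{x}_k + \sum_{\substack{j=1 \\ j \neq k}}^{K} \mathbf{U}_k^H \mathbf{H}_j \mathbf{V}_j \sqrt{\mathbf{Q}_j}\,\mathbf{x}_j + \mathbf{U}_k^H \mathbf{n}
\end{aligned}
\tag{2.56}
\]
where
\[
\mathbf{U}_k = \left[\mathbf{U}_k^{(1)T} \ldots \mathbf{U}_k^{(B)T}\right]^T \ (BM \times L_k), \quad
\mathbf{H}_k = \left[\mathbf{H}_k^{(1)T}\ \mathbf{H}_k^{(2)T} \ldots \mathbf{H}_k^{(B)T}\right]^T \ (BM \times N_k), \quad
\text{and } \mathbf{n} = \left[\mathbf{n}^{(1)T} \ldots \mathbf{n}^{(B)T}\right]^T \ (BM \times 1). \tag{2.57}
\]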
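The collapse of the per-BS sums into the stacked matrices of Eqn. (2.57) can be checked numerically. The following is a minimal sketch with made-up values for B = 2, M = 2, and a single-antenna, single-stream user; the helper names (`herm`, `matmul`, `matadd`) are introduced here for illustration only.

```python
# Check that sum_b H^(b)H U^(b) equals H^H U after the vertical stacking
# of Eqn. (2.57). Matrices are nested lists of complex numbers.

def herm(a):
    """Conjugate transpose."""
    return [[a[r][c].conjugate() for r in range(len(a))]
            for c in range(len(a[0]))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# B = 2 BSs, M = 2 antennas each, N_k = 1 receive antenna, L_k = 1 stream.
H_b = [[[1 + 1j], [0.5 - 0.2j]],        # H_k^(1), size M x N_k
       [[-0.3 + 0.7j], [0.1 + 0.9j]]]   # H_k^(2)
U_b = [[[0.6 + 0j], [0.2j]],            # U_k^(1), size M x L_k
       [[0.4 - 0.1j], [-0.5 + 0j]]]     # U_k^(2)

# Left-hand side: sum over BSs of the per-BS products.
per_bs = matmul(herm(H_b[0]), U_b[0])
for h, u in zip(H_b[1:], U_b[1:]):
    per_bs = matadd(per_bs, matmul(herm(h), u))

# Right-hand side: one product of the stacked 'super-BS' matrices.
H_stacked = [row for h in H_b for row in h]   # BM x N_k
U_stacked = [row for u in U_b for row in u]   # BM x L_k
stacked = matmul(herm(H_stacked), U_stacked)
```

Both routes give the same effective channel, which is why the synchronous multi-BS case reduces to a single BS with BM antennas.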
Comparing Eqns. (2.55) and (2.56) to Eqns. (2.46) and (2.47), we notice that the case
with multiple BSs and synchronous interference is equivalent to a case with one BS that
is a ‘super-BS’ with BM antennas. In that case, the algorithm of Table 2.1 can be used to
find the precoding and decoding matrices and power allocation that minimize the SMSE.
However, as mentioned previously in Section 2.1.1, in a system with multiple cooperating
BSs, asynchronous interference is inevitable, and the above simplifications are not valid.
Therefore, it is important to consider asynchronous interference explicitly when dealing
with multi-BS systems.
2.4 Minimizing the SMSE with Asynchronous Interference
With duality shown to exist even when asynchronous interference is present, we will
proceed to extend the MIMO-SMSE algorithm of Table 2.1 to account for asynchronism.
First, we consider how the SMSE expression changes. Recall that in the uplink, the
estimated data vector is given by Eqn. (2.25). Accordingly, the MSE error matrix for
user k in the uplink can be expressed as follows.
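\[
\begin{aligned}
\mathbf{E}_k^{UL} &= E\left[(\hat{\mathbf{x}}_k - \mathbf{x}_k)(\hat{\mathbf{x}}_k - \mathbf{x}_k)^H\right], \\
&= \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H \mathbf{U}_k + \mathbf{U}_k^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{A}_{ck} \mathbf{U}_k + \sigma^2 \mathbf{U}_k^H \mathbf{U}_k \\
&\quad - \sqrt{\mathbf{Q}_k} \mathbf{V}_k^H \mathbf{H}_k^H \mathbf{U}_k - \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k} + \mathbf{I}_{L_k},
\end{aligned}
\tag{2.58}
\]
where
\[
\mathbf{U}_k = \left[\mathbf{U}_k^{(1)T} \ldots \mathbf{U}_k^{(B)T}\right]^T, \quad \mathbf{A}_{ck} = \left[\mathbf{A}_{ck}^{(1)T} \ldots \mathbf{A}_{ck}^{(B)T}\right]^T, \tag{2.59}
\]
\[
\mathbf{A}_{ck}^{(b)} = \mathbf{H}_c^{(b)} \mathbf{V}_c \mathbf{Q}_c \mathbf{V}_c^H \mathbf{H}_c^H \mathbf{\Gamma}_{ck}^{(b)}, \quad \text{and} \quad \mathbf{\Gamma}_{ck}^{(b)} = \mathrm{diag}\left[\gamma_{ck}^{(b,1)} \mathbf{I}_M, \ldots, \gamma_{ck}^{(b,B)} \mathbf{I}_M\right]. \tag{2.60}
\]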
Keeping all other matrices constant, the optimal solution for matrix Uk is the Wiener
solution:
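\[
\mathbf{U}_k^{MMSE} = \left(\mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H + \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{A}_{ck} + \sigma^2 \mathbf{I}_{N_k}\right)^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}. \tag{2.61}
\]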
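The columns of $\mathbf{U}_k^{MMSE}$ can be normalized to ensure that the power constraint is met.
Substituting this optimal $\mathbf{U}_k$ back into Eqn. (2.58), we get the following expression for
$\mathbf{E}_k^{UL,MMSE}$,
\[
\mathbf{E}_k^{UL,MMSE} = \mathbf{I}_{L_k} - \sqrt{\mathbf{Q}_k} \mathbf{V}_k^H \mathbf{H}_k^H \mathbf{J}_k^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}, \tag{2.62}
\]
where
\[
\mathbf{J}_k = \mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H + \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{A}_{ck} + \sigma^2 \mathbf{I}. \tag{2.63}
\]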
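The substitution of Eqn. (2.61) into Eqn. (2.58) that yields Eqn. (2.62) can be spelled out as follows. This is a sketch that assumes $\mathbf{J}_k$ is Hermitian and $\mathbf{Q}_k$ is real and diagonal; the shorthand $\mathbf{T}_k = \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}$ is introduced here only for brevity and is not part of the original notation. With it, the first three terms of Eqn. (2.58) combine into $\mathbf{U}_k^H \mathbf{J}_k \mathbf{U}_k$ and the Wiener solution reads $\mathbf{U}_k = \mathbf{J}_k^{-1} \mathbf{T}_k$, so that
\[
\begin{aligned}
\mathbf{E}_k^{UL} &= \mathbf{U}_k^H \mathbf{J}_k \mathbf{U}_k - \mathbf{T}_k^H \mathbf{U}_k - \mathbf{U}_k^H \mathbf{T}_k + \mathbf{I}_{L_k} \\
&= \mathbf{T}_k^H \mathbf{J}_k^{-1} \mathbf{J}_k \mathbf{J}_k^{-1} \mathbf{T}_k - \mathbf{T}_k^H \mathbf{J}_k^{-1} \mathbf{T}_k - \mathbf{T}_k^H \mathbf{J}_k^{-1} \mathbf{T}_k + \mathbf{I}_{L_k} \\
&= \mathbf{I}_{L_k} - \mathbf{T}_k^H \mathbf{J}_k^{-1} \mathbf{T}_k .
\end{aligned}
\]
Expanding $\mathbf{T}_k$ in the last line recovers the form of Eqn. (2.62).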
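Note that, unlike in Eqn. (2.50), the matrix $\mathbf{J}_k$ cannot be made independent of user $k$
since $\mathbf{A}_{ck}$ is not common to all users. With the expression of the MMSE error covariance
matrix of user $k$ available, the SMSE can be expressed as:
\[
\begin{aligned}
\mathrm{SMSE} &= \sum_{k=1}^{K} \operatorname{tr}\left(\mathbf{E}_k^{UL,MMSE}\right) = \sum_{k=1}^{K} L_k - \sum_{k=1}^{K} \operatorname{tr}\left(\sqrt{\mathbf{Q}_k} \mathbf{V}_k^H \mathbf{H}_k^H \mathbf{J}_k^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}\right), \\
&= L - \sum_{k=1}^{K} \operatorname{tr}\left(\mathbf{H}_k \mathbf{V}_k \mathbf{Q}_k \mathbf{V}_k^H \mathbf{H}_k^H \mathbf{J}_k^{-1}\right).
\end{aligned}
\tag{2.64}
\]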
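Initialization: $\mathbf{V}_k = \mathrm{SVD}(\mathbf{H}_k)$, $\mathbf{Q} = (P_{max}/L)\,\mathbf{I}$
Iteration:
1. Find virtual uplink power allocation to minimize SMSE.
   $\mathbf{q} = \arg\min_{\mathbf{q}} \mathrm{SMSE}$, subject to $q_{kj} > 0$, $\|\mathbf{q}\| \le P_{max}$
2. Find downlink precoding matrices and normalize their columns, for $k = 1 : K$, $j = 1 : L_k$
   $\mathbf{U}_k = \mathbf{J}_k^{-1} \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{Q}_k}$, $\mathbf{u}_{kj} = \mathbf{u}_{kj}/\|\mathbf{u}_{kj}\|$
3. Set the target SINRs to the actual SINRs, for $k = 1 : K$, $j = 1 : L_k$
   $\Gamma_{kj} = \mathrm{SINR}^{UL}_{kj}$
4. Find the downlink power allocation.
   $\mathbf{p} = \sigma^2 \left(\mathbf{D}^{-1} - \mathbf{\Psi}^{UL\,T}\right)^{-1} \mathbf{1}$
5. Find uplink precoding matrices and normalize their columns, for $k = 1 : K$, $j = 1 : L_k$
   $\mathbf{V}_k = \mathbf{G}_k^{-1} \mathbf{H}_k^H \mathbf{U}_k \sqrt{\mathbf{P}_k}$, $\mathbf{v}_{kj} = \mathbf{v}_{kj}/\|\mathbf{v}_{kj}\|$
6. Repeat steps 1 to 5 until old SMSE - new SMSE < $\epsilon$
Table 2.2: Multiuser multi-BS algorithm for asynchronous interference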
The above result has two major implications. First, the presence of Jk in Eqn. (2.64)
prevents the simplification of the SMSE into a form similar to Eqn. (2.52). This implies
that the uplink precoding vectors, vkj, cannot be found as shown in Table 2.1 [11]. Con-
sequently, minimizing the SMSE cannot be completely performed in the uplink. That is,
the iteration needs to constantly alternate between the uplink and downlink. Explicitly,
the uplink power allocation that minimizes the SMSE is found first. Next, the downlink
precoding matrices and the downlink power allocation are derived. Afterwards, the up-
link precoding matrices that minimize the downlink SMSE are found using the optimal
Wiener solution. Again, this iteration repeats until the relative change in SMSE is below
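a certain threshold. In a similar derivation to that of $\mathbf{U}_k^{MMSE}$, one can easily show that
the matrix $\mathbf{V}_k$ is the Wiener solution in the downlink:
\[
\mathbf{V}_k^{MMSE} = \mathbf{G}_k^{-1} \mathbf{H}_k^H \mathbf{U}_k \sqrt{\mathbf{P}_k}, \tag{2.65}
\]
where
\[
\mathbf{G}_k = \mathbf{H}_k^H \mathbf{U}_k \mathbf{P}_k \mathbf{U}_k^H \mathbf{H}_k + \mathbf{H}_k^H \sum_{\substack{c=1 \\ c \neq k}}^{K} \mathbf{B}_{ck} \mathbf{H}_k + \sigma^2 \mathbf{I}_{N_k}, \tag{2.66}
\]
\[
\mathbf{B}_{ck} = \left[\mathbf{B}_{ck}^{(1)T} \ldots \mathbf{B}_{ck}^{(B)T}\right]^T, \quad \mathbf{B}_{ck}^{(b)} = \mathbf{U}_c^{(b)} \mathbf{P}_c \mathbf{U}_c^H \boldsymbol{\beta}_{ck}^{(b)}, \tag{2.67}
\]
and
\[
\boldsymbol{\beta}_{ck}^{(b)} = \mathrm{diag}\left[\beta_{ck}^{(b,1)} \mathbf{I}_M, \ldots, \beta_{ck}^{(b,B)} \mathbf{I}_M\right]. \tag{2.68}
\]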
Again, the columns of $\mathbf{V}_k^{MMSE}$ can be normalized to ensure that the power constraint is
met.
The second implication of the new SMSE expression is on the convexity of the uplink
power allocation problem. While it is relatively easy to show that the power allocation
problem in Eqn. (2.53) is convex in q, the vector of uplink powers [8], the same approach
cannot be applied to Eqn. (2.64). It is only simulations that suggest that convexity in
the uplink powers still holds. These simulations will be presented in Section 2.5.
Finally, we present the multiuser multi-BS SMSE minimization algorithm in Table 2.2.
Note that matrices $\mathbf{D}$ and $\mathbf{\Psi}^{UL}$ are given by Eqn. (2.42) and Eqn. (2.43), wherein $\gamma_{ck}^{(b_1,b_2)}$
takes its value according to Eqn. (2.27).
2.5 Simulation Results
This section presents the results of simulations to illustrate the efficacy of the algorithm
proposed in Table 2.2. The values used for all parameters are found in Table 2.3.
Parameter                                  Symbol   Value
Number of BSs                              B        2
Number of transmit antennas per BS         M        4
Number of users                            K        4
Number of receive antennas per user        Nk       1
Number of data streams per user            Lk       1
AWGN average power                         σ²       1
Signal-to-noise ratio                      SNR      Pmax/σ²
Symbol period                              TS       1 µs
Pulse shaping filter                       g(t)     Rectangular
Table 2.3: Parameters for simulations
When path loss is taken into consideration, it is assumed that the power of the signal
is proportional to the inverse of the distance raised to a constant path loss exponent.
Here, the path loss exponent is set to 3.5. The 2 BSs are placed 500 meters apart, which
makes the cell radius 250 meters. The minimum distance of any user to a BS was set to
150 meters. The K users are uniformly distributed in the rectangular area of width 200
meters and height $500/\sqrt{3}$ meters centered between the 2 BSs. The path loss of each
user was normalized to the path loss experienced 150 meters away from a BS.
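The normalized path loss described above can be written as a one-line function. The sketch below assumes the simple power law stated in the text (received power proportional to distance raised to minus the path loss exponent); the function name and the sample distances are illustrative only.

```python
def normalized_path_gain(distance_m, ref_distance_m=150.0, exponent=3.5):
    """Received-power gain relative to a user at the reference distance."""
    if distance_m < ref_distance_m:
        raise ValueError("users are assumed to be at least ref_distance_m from a BS")
    return (ref_distance_m / distance_m) ** exponent

# Users at 150 m (reference), 250 m (cell edge), and 350 m from a BS.
gains = [normalized_path_gain(d) for d in (150.0, 250.0, 350.0)]
```

A user at the reference distance has unit gain, and the gain falls off quickly with distance. A channel amplitude would be scaled by the square root of this power gain.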
Figure 2.4 shows the results obtained when the proposed algorithm is used. Note that
path loss is present in these simulations. We showed in Section 2.1.2 that when the
first method is used to model the uplink, duality does not exist. In the first simulation,
we ignore the fact that duality does not exist and derive the needed coding matrices
and power allocation using the uplink delays as given by Eqn. (2.20). As a result, the
downlink power allocation did not always satisfy the power constraint; in such cases,
it was linearly scaled to meet the constraint. In that case, we can see that
the error rate hits a floor before worsening. When the second method is used to model
the virtual uplink, duality exists, and the performance improves. The error rate behaves
as expected and achieves very low values at high SNRs.
Figure 2.4: Linear precoding with and without asynchronous interference, with path loss
[Plot: BER versus 10 log(Pmax/σ²) for B = 2, K = 4, M = 4, Nk = 1, Lk = 1, cell radius = 250; curves: virtual uplink using first method, virtual uplink using second method.]
In order to evaluate the proposed algorithm further, we attempt to compare it to
the case where a super-BS with B ×M antennas is used and synchronous interference is
experienced. However, when one BS is present, it is often assumed that the channels to
all users have the same power on average, an assumption that the path loss model
violates. Accordingly, we present the results of another simulation where path
loss is absent. Figure 2.5 shows the obtained results. Again, when duality is assumed to
exist, but does not, the error rate hits a floor and worsens. When the second method is
used, the performance is close to the super-BS case with synchronous interference. The
gap between the curves demonstrates the loss due to asynchronous interference.
Figure 2.5: Linear precoding with and without asynchronous interference, no path loss
[Plot: BER versus 10 log(Pmax/σ²) for B = 2, M = 4, K = 4, Nk = 1, Lk = 1; curves: virtual uplink using first method, assuming synchronous interference, virtual uplink using second method.]
The results of the simulations which suggest that the SMSE given by Eqn. (2.64)
is convex in the uplink powers are shown in Figure 2.6. The simulated scenario has
B = 2, K = 3, M = 3, Nk = 2, and Lk = 1. The power of the third user, q3, is set to
4 different values, and for each value the SMSE is plotted versus the powers of the first
two users, q1 and q2. The 3D plots suggest that the SMSE function is indeed convex.
2.6 MIMO-OFDM and Multiple Base Stations
With an understanding of asynchronous interference experienced in multi-BS, single-
carrier systems, we will briefly discuss in this section how OFDM might be introduced
into a multi-BS system. Moreover, we will state some of the difficulties that can arise,
which will motivate the work presented in Chapter 3.
Figure 2.6: SMSE versus power of three users
[Plot: four 3D panels of SMSE versus q1 and q2, for q3 = 3, 6, 9, and 12.]
We have seen that in a single-BS, single-carrier system, the MIMO-SMSE algorithm
of [8] provides a practical solution to the multiuser MIMO downlink problem. When the
system is changed into a single-BS, OFDM (multi-carrier) system with a higher number
of users, it is possible to extend the previous solution, by using a user selection algorithm
to allocate users to each subcarrier (an example is presented in Section 4.2 in detail), and
executing this algorithm for each subcarrier. When the system is changed into a multi-
BS, single-carrier system, asynchronous interference arises. However, with the proper
choice of the virtual uplink model, duality exists, and a modified algorithm can be used.
In what follows, we will discuss the remaining case: a multi-BS, OFDM system.
As we saw previously in Section 2.1.1, in a multi-BS, single-carrier system, a certain
BS would advance or delay its transmission to a certain user to ensure that its signal
arrives at the user synchronously with all other signals from all other BSs. As a reminder,
a MIMO-OFDM transmitter has as many IFFT modules as there are antennas, since all
data for all users is applied at the input of the IFFT modules and processed in one
go. In a multi-BS case, a certain BS might be required to transmit to different users
separately at different times that differ by non-integer multiples of the symbol period
(by applying the desired user’s data at the appropriate inputs of the IFFT modules,
and zeros elsewhere). Accordingly, a BS might require a set of IFFT modules for every
user in the system, which is obviously impractical given the high number of users. One
might attempt performing the IFFT operation on all the users’ data all at once and then
perform time advances and delays on parts of the result, but it is not possible to extract
each user’s signal component (as if it were processed alone as described above) from the
single result obtained. Moreover, transmitting all the users’ data to each user, and having
each user extract only its data, would require an impractical amount of power.
Despite the infeasibility of the above cases, assume it is possible to transmit to each
user its own data using OFDM from multiple BSs and have the transmissions arrive
at the user synchronously. The interference will inevitably be asynchronous, and will be
experienced by the output symbols of the IFFT module and not the original data symbols.
Accordingly, the algorithm presented in this chapter cannot be directly applied. One
might attempt to study the effect of such asynchronism on the original data symbols
and precode against that, but this serves to complicate matters further. Consequently,
in the next chapter we propose that cooperative transmission be used only for users that
are located in a region that is almost equally distant from the cooperating BSs, i.e., at
the edge, which limits the degree of asynchronism. This facilitates extending single-BS,
single-carrier algorithms to multi-BS, OFDM systems. Moreover, in order to help those
users further, we explore a new approach based on a combination of linear and nonlinear
precoding to battle MUI, and study how it performs compared to linear precoding.
2.7 Summary
In this chapter, we provided a downlink system model for a MIMO cellular system with
multiple BSs. We also proposed a virtual uplink model that would preserve the useful
downlink/uplink duality, which otherwise does not exist if the virtual uplink is modeled
in a simpler manner. We saw how this comprehensive model simplifies into the single-
BS model in [8], and hence reviewed the iterative multiuser MIMO linear precoding
algorithm presented there. Subsequently, we proposed a revised algorithm that accounts for
multiple BSs and asynchronous interference, and presented simulation results for the new scheme.
This chapter ended with motivating the search for an alternative approach for multi-user,
multi-BS, MIMO-OFDM systems.
Chapter 3
The Hybrid Algorithm
In the previous chapter, we briefly discussed the difficulties associated with using OFDM
in a multiuser, multi-BS scenario due to the presence of asynchronous interference. In this
chapter, we propose a scheme in which two BSs cooperate only for a group of users located
near the common boundary of their two cells. With limited asynchronous reception in
that region, we propose to use joint linear precoding as a method of BS cooperation for
those users. To help those users further, we propose to use nonlinear precoding to guard
them against the interference that they receive from users that are deeper inside the cell
and communicate with only one BS using linear precoding. With this use of linear and
nonlinear precoding, we formulate a ‘hybrid algorithm’, which minimizes the SMSE of
the whole system, and provide the results of simulations that evaluate its performance.
3.1 System Model
The system models of the multiuser downlink and the multiuser virtual uplink in this
chapter are similar to those in Chapter 2. Here, the K uniformly distributed users are
divided into three sets according to their locations, namely edge users (Ωe, |Ωe| = Ke),
intra-cell users of BS 1 (Ω1, |Ω1| = K1), and intra-cell users of BS 2 (Ω2, |Ω2| = K2). We
have K1 + K2 + Ke = K. As shown in Figure 3.1, edge users are users located in a band
along the border between the regions of the two BSs.¹ The definitions of M, N, Nk, L,
and Lk remain unchanged.
Figure 3.1: System model for the hybrid algorithm
The channels between a BS and its users (which include its intra-cell users and the
edge users) are also modeled as flat Rayleigh fading with unit average power. The path
loss exponent is chosen to suit the simulated environment. Moreover, it is assumed that
the channels between a BS and the intra-cell users of the other BS (referred to as the
cross channels from here on) have zero power, a reasonable assumption since path loss
over a large distance can attenuate a signal severely to the extent that it can be perceived
as noise.
In this chapter, $\mathbf{H}_k$ represents the channel matrix between user $k$ and the BSs it
can communicate with. Accordingly, for intra-cell users, this matrix is of size $M \times N_k$,
and for edge users, it is of size $2M \times N_k$. Note that $\mathbf{H}_k$ for edge users is the vertical
concatenation of $\mathbf{H}_k^{(1)}$ and $\mathbf{H}_k^{(2)}$, the $M \times N_k$ channel matrices from BS 1 and BS 2,
respectively. We will use $\mathbf{H}_{in}^{(1)} = [\mathbf{H}_{1_1}, \mathbf{H}_{1_2}, \ldots, \mathbf{H}_{1_{K_1}}]$, $\mathbf{H}_e = [\mathbf{H}_{e_1}, \mathbf{H}_{e_2}, \ldots, \mathbf{H}_{e_{K_e}}]$, and
$\mathbf{H}_{in}^{(2)} = [\mathbf{H}_{2_1}, \mathbf{H}_{2_2}, \ldots, \mathbf{H}_{2_{K_2}}]$ to denote the global channel matrices of all intra-cell users
of BS 1, edge users, and intra-cell users of BS 2, respectively, where $1_1, \ldots, 1_{K_1}$ are
the indices of the intra-cell users of BS 1, $e_1, \ldots, e_{K_e}$ are those of the edge users, and
$2_1, \ldots, 2_{K_2}$ are those of the intra-cell users of BS 2. The virtual uplink uses the same
channel matrices with the dimensions reversed and components conjugated (Hermitian
operation). Finally, even though the signals of the edge users cannot be completely
synchronized (whether in the downlink or uplink), it can be shown that the asynchronism
between the pulse shapes of two symbols arriving from the two BSs is limited by the
40-meter wide edge band to a maximum misalignment of 11.5% assuming the system
parameters mentioned above. The AWGN vectors $\mathbf{n}_k$ and $\mathbf{n}^{(b)}$ remain unchanged.

¹ In this chapter, the edge users are assumed to lie in a band that is 40 meters wide, while the BSs are located 500 meters apart. The numbers used here are for illustrative purposes only and may be changed depending on the acceptable tolerance on asynchronous reception.
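Let $\mathbf{d}_k$ ($L_k \times 1$) be the column data vector containing user $k$'s data streams. Similarly,
$E[\mathbf{d}_k \mathbf{d}_j^H] = \mathbf{0}$, $\forall\, j \neq k$, and $E[\mathbf{d}_k \mathbf{d}_k^H] = \mathbf{I}_{L_k}$. Let $\mathbf{x}_k$ ($L_k \times 1$) be a modified version of
$\mathbf{d}_k$ that is transmitted. The need for $\mathbf{x}_k$ and how it differs from $\mathbf{d}_k$ will become clear
later. Note that we will use $\mathbf{d}_{in}^{(1)}$, $\mathbf{d}_e$, and $\mathbf{d}_{in}^{(2)}$ to denote the global column data vectors
of all intra-cell users of BS 1, all edge users, and all intra-cell users of BS 2, respectively.
Moreover, we will use $\mathbf{x}_{in}^{(1)}$, $\mathbf{x}_e$, and $\mathbf{x}_{in}^{(2)}$ to denote the global modified versions of $\mathbf{d}_{in}^{(1)}$, $\mathbf{d}_e$,
and $\mathbf{d}_{in}^{(2)}$, respectively. The estimate of $\mathbf{x}_k$ at a receiver is similarly denoted by $\hat{\mathbf{x}}_k$. When
needed, further processing can be done on $\hat{\mathbf{x}}_k$ to produce $\hat{\mathbf{d}}_k$, an estimate of $\mathbf{d}_k$.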
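The definitions of matrices $\mathbf{P}_k$, $\mathbf{Q}_k$, $\mathbf{U}_k$, and $\mathbf{V}_k$ remain unchanged. Note that matrix
$\mathbf{U}_k$ is of size $M \times L_k$ for intra-cell users and of size $2M \times L_k$ for edge users. That is,
for an edge user, $\mathbf{U}_k = [\mathbf{U}_k^{(1)T}, \mathbf{U}_k^{(2)T}]^T$, where $\mathbf{U}_k^{(1)}$ and $\mathbf{U}_k^{(2)}$ are the $M \times L_k$ precoding
matrices that BS 1 and BS 2 use for that edge user, respectively. We will
use $\mathbf{P}_{in}^{(1)} = \mathrm{diag}(\mathbf{P}_{in_{1_1}}, \mathbf{P}_{in_{1_2}}, \ldots, \mathbf{P}_{in_{1_{K_1}}})$, $\mathbf{P}_e = \mathrm{diag}(\mathbf{P}_{e_1}, \mathbf{P}_{e_2}, \ldots, \mathbf{P}_{e_{K_e}})$, and
$\mathbf{P}_{in}^{(2)} = \mathrm{diag}(\mathbf{P}_{in_{2_1}}, \mathbf{P}_{in_{2_2}}, \ldots, \mathbf{P}_{in_{2_{K_2}}})$ to denote the global downlink power matrices of all
intra-cell users of BS 1, edge users, and intra-cell users of BS 2, respectively. The same
applies to $\mathbf{Q}_{in}^{(1)}$, $\mathbf{Q}_e$, $\mathbf{Q}_{in}^{(2)}$, $\mathbf{V}_{in}^{(1)}$, $\mathbf{V}_e$, and $\mathbf{V}_{in}^{(2)}$. We will use $\mathbf{U}_{in}^{(1)} = [\mathbf{U}_{1_1}, \mathbf{U}_{1_2}, \ldots, \mathbf{U}_{1_{K_1}}]$,
$\mathbf{U}_e = [\mathbf{U}_{e_1}, \mathbf{U}_{e_2}, \ldots, \mathbf{U}_{e_{K_e}}]$, and $\mathbf{U}_{in}^{(2)} = [\mathbf{U}_{2_1}, \mathbf{U}_{2_2}, \ldots, \mathbf{U}_{2_{K_2}}]$ to denote the global coding
matrices of all intra-cell users of BS 1, edge users, and intra-cell users of BS 2, respectively.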
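According to the above system model, the expressions of the estimated $\mathbf{x}_k$ in the
downlink and the uplink for intra-cell users are as follows, for $b = 1, 2$.
\[
\hat{\mathbf{x}}_k^{DL} = \mathbf{V}_k^H \mathbf{H}_k^{(b)H} \sum_{j \in \Omega_b, \Omega_e} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{V}_k^H \mathbf{n}_k, \tag{3.1}
\]
\[
\hat{\mathbf{x}}_k^{UL} = \mathbf{U}_k^H \sum_{j \in \Omega_b, \Omega_e} \mathbf{H}_j^{(b)} \mathbf{V}_j \sqrt{\mathbf{Q}_j}\,\mathbf{x}_j + \mathbf{U}_k^H \mathbf{n}^{(b)}. \tag{3.2}
\]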
The summations in Eqns. (3.1) and (3.2) contain terms that correspond to a user’s
intended signal, as well as MUI from other intra-cell and edge users. The expressions for
edge users will be similar, except that, due to interference pre-subtraction, there will be
no MUI from intra-cell users (only from other edge users). This will be explained in the
next section, after which the relevant expressions will be presented.
3.2 The Hybrid Algorithm
The main goal of the algorithm is to help the edge users since, without pre-subtraction,
they would receive interference from both BSs. Furthermore, this interference would
be asynchronous, reopening some of the issues previously discussed. In addition, this
algorithm can be easily extended to OFDM, in order to support the large number of
users that are present in a realistic scenario.
The edge users are helped in two ways: first, the two base stations cooperate and
jointly precode the transmissions to the edge users; second, the interference from
intra-cell users is pre-subtracted, and hence the overall interference level is significantly reduced.
Since the edge users are approximately equidistant from the BSs, the received signals are
essentially synchronous and the two BSs appear to the edge users as one super-BS with
2M antennas. The interference pre-subtraction happens at the BSs before they transmit
to the edge users. This is possible because it is assumed that complete CSI is known at
the BSs. Accordingly, edge users can be treated as a separate group for which precoding
and decoding matrices to battle the remaining MUI can be derived using any single-BS
multiuser MIMO algorithm, such as the one proposed in [8].
As for the intra-cell users of a certain BS, the MUI they see comes partly from the
transmission to other intra-cell users of the same BS and partly from the transmission
to the edge users. It is assumed that intra-cell users whiten the MUI from edge users.
Accordingly, each group of intra-cell users can also be treated as a separate group for
which precoding and decoding matrices to battle the remaining MUI can be derived using
any single-BS multiuser MIMO algorithm.
3.2.1 Interference Pre-subtraction
The work in this section is based on nonlinear THP and its matrix form [26, 30, 34].
The main idea behind THP is to subtract a linearly modified (by matrix $\mathbf{G}_e$) version of
$\mathbf{x}_{in} = [\mathbf{x}_{in}^{(1)T}, \mathbf{x}_{in}^{(2)T}]^T$ from $\mathbf{d}_e$ to form $\mathbf{x}_e$, such that once $\mathbf{x}_e$ is transmitted and received by
the edge user, the combined effect of the precoding, channel, decoding, and interference
restores $\mathbf{d}_e$.
The one drawback with this approach is that this pre-subtraction can generate a mod-
ified data vector xe whose components are situated further away from the origin than the
components of de, and hence xe can have more power than de. To alleviate this problem,
the real binary phase shift keying (BPSK) constellation assumed in this work can be
extended on the real axis to form the constellation with points ..., −5, −3, −1, 1, 3, 5, ....
Other constellations have similar extensions.
Note that the new constellation points can be mapped evenly to the original two
points in the BPSK constellation by a simple modulo-2Mconst operation, where Mconst
is the distance between the original two constellation points (Mconst = 2 in our case).
Consequently, one can map the components of de to points in the extended constellation
such that, after the pre-subtraction step, the components of xe lie within the interval
[−Mconst,Mconst), which helps reduce the power of xe. It can be easily seen that a similar
result can be achieved by pre-subtracting the interference from the original de and then
performing the modulo-2Mconst operation. Accordingly, xe can be expressed as follows.
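\[
\mathbf{x}_e = (\mathbf{d}_e - \mathbf{G}_e \mathbf{x}_{in}) \bmod 2M_{const} = \mathbf{d}_e + 2M_{const}\,\mathbf{q}_e - \mathbf{G}_e \mathbf{x}_{in} = \mathbf{v}_e - \mathbf{G}_e \mathbf{x}_{in}. \tag{3.3}
\]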
The vector qe is a vector of integers that indicate how many times 2Mconst had to be added
or subtracted from the components of de to reach appropriate points in the extended
constellation, which are now represented by ve.
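The modulo step of Eqn. (3.3) can be illustrated for a single real symbol. The sketch below is a minimal example, assuming Mconst = 2 as in the text and a made-up interference value; `thp_mod` is a hypothetical helper name, not one taken from [30].

```python
import math

M_CONST = 2.0  # distance between the two BPSK points -1 and +1

def thp_mod(value, m_const=M_CONST):
    """Wrap a real value into [-m_const, m_const) by adding multiples of 2*m_const.

    Returns the wrapped value and the integer q such that
    wrapped = value + 2*m_const*q.
    """
    shift = math.floor((value + m_const) / (2.0 * m_const))
    return value - 2.0 * m_const * shift, -shift

d_e = 1.0            # BPSK data symbol of an edge user
interference = 4.7   # hypothetical value of the G_e x_in term
x_e, q_e = thp_mod(d_e - interference)
v_e = d_e + 2.0 * M_CONST * q_e   # point chosen in the extended constellation
```

Here the data symbol 1 is moved to the extended point 5, so that the transmitted value 5 − 4.7 stays inside [−2, 2). Applying `thp_mod` to `x_e + interference` at the receiver maps the extended point back to the original symbol.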
As will be seen next, the matrix Ge can contain complex components and hence xe can
be complex as well. The modulo-2Mconst operation will be performed on the real part only,
since it is the part that is actually transmitted; transmitting the imaginary part would
represent wasted energy. Simulations presented later show that better performance is
achieved when the available power is used to transmit only the real part. Consequently, to
determine the average power of the transmitted real parts of xe, we refer to the theoretical
result that states that the real and imaginary parts of the components of xe are almost
independent and identically distributed (i.i.d.) with a uniform distribution within the
region $[-M_{const}, M_{const}) = [-2, 2)$ (since $M_{const} = 2$ in our case) [30]. Therefore, the
average power of a symbol is now given by $\frac{(M_{const} - (-M_{const}))^2}{12} = \frac{4}{3}$. This is slightly higher
than the unit average power of a symbol in the original $\mathbf{d}_e$, but will be accounted for in a
power allocation step that incorporates a factor of $\frac{3}{4}$ into the expression of $\mathbf{P}_e$, the power
allocation matrix for edge users.
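The value 4/3 is the second moment of a zero-mean uniform variable, which can be verified directly. Writing $x$ for one transmitted real component (a symbol used here only for this check) and taking the uniform density $1/(2M_{const})$ on $[-M_{const}, M_{const})$:
\[
E[x^2] = \int_{-M_{const}}^{M_{const}} \frac{x^2}{2M_{const}}\,dx = \frac{M_{const}^2}{3} = \frac{(2M_{const})^2}{12} = \frac{4}{3} \quad \text{for } M_{const} = 2 .
\]
Scaling the allocated power by $3/4$ therefore restores unit average symbol power, since $\frac{3}{4} \cdot \frac{4}{3} = 1$.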
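With the expression of $\mathbf{x}_e$ at hand, the expression of $\hat{\mathbf{d}}_e$ can be presented, and the
value of $\mathbf{G}_e$ needed to cancel MUI from intra-cell users can be derived.
\[
\begin{aligned}
\hat{\mathbf{d}}_e &= \mathbf{V}_e^H \Big[\mathbf{H}_e^{(1)H} \mathbf{U}_e^{(1)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e + \mathbf{H}_e^{(2)H} \mathbf{U}_e^{(2)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e \\
&\qquad + \mathbf{H}_e^{(1)H} \mathbf{U}_{in}^{(1)} \sqrt{\mathbf{P}_{in}^{(1)}}\,\mathbf{x}_{in}^{(1)} + \mathbf{H}_e^{(2)H} \mathbf{U}_{in}^{(2)} \sqrt{\mathbf{P}_{in}^{(2)}}\,\mathbf{x}_{in}^{(2)} + \mathbf{n}_e\Big] \bmod 2M_{const}, \\
&= \mathbf{V}_e^H \Big[\mathbf{H}_e^H \mathbf{U}_e \sqrt{\mathbf{P}_e}\,\mathbf{x}_e + \Big[\mathbf{H}_e^{(1)H} \mathbf{U}_{in}^{(1)} \sqrt{\mathbf{P}_{in}^{(1)}} \,\Big|\, \mathbf{H}_e^{(2)H} \mathbf{U}_{in}^{(2)} \sqrt{\mathbf{P}_{in}^{(2)}}\Big] \mathbf{x}_{in} + \mathbf{n}_e\Big] \bmod 2M_{const}, \\
&= \mathbf{V}_e^H \Big[\mathbf{H}_e^H \mathbf{U}_e \sqrt{\mathbf{P}_e}\,\mathbf{v}_e - \mathbf{H}_e^H \mathbf{U}_e \sqrt{\mathbf{P}_e}\,\mathbf{G}_e \mathbf{x}_{in} + \Big[\mathbf{H}_e^{(1)H} \mathbf{U}_{in}^{(1)} \sqrt{\mathbf{P}_{in}^{(1)}} \,\Big|\, \mathbf{H}_e^{(2)H} \mathbf{U}_{in}^{(2)} \sqrt{\mathbf{P}_{in}^{(2)}}\Big] \mathbf{x}_{in} \\
&\qquad + \mathbf{n}_e\Big] \bmod 2M_{const}.
\end{aligned}
\tag{3.4}
\]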
An important point to note is that the transmissions from BS 1 and BS 2 both use the same
vector $\mathbf{x}_e$ (meaning that both BSs choose the same extended constellation points to battle
the global interference, use the same matrix $\mathbf{G}_e$, and know the data of the intra-cell
users of both BSs included in $\mathbf{x}_{in}$). This is exactly where the coordination of the two
BSs happens. If the BSs choose different points from the extended constellation or do
not know the data of the intra-cell users of the other BS, the receiver will not be able
to decode its message; essentially, the receiver will not be able to resolve two different
integer shifts.
Now in order to cancel the interference, Ge has to be chosen such that the second
and third terms in the last line of Eqn. (3.4) cancel each other. Accordingly, Ge takes
the following form.
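\[
\mathbf{G}_e = \left(\mathbf{V}_e^H \mathbf{H}_e^H \mathbf{U}_e \sqrt{\mathbf{P}_e}\right)^{-1} \mathbf{V}_e^H \Big[\mathbf{H}_e^{(1)H} \mathbf{U}_{in}^{(1)} \sqrt{\mathbf{P}_{in}^{(1)}} \,\Big|\, \mathbf{H}_e^{(2)H} \mathbf{U}_{in}^{(2)} \sqrt{\mathbf{P}_{in}^{(2)}}\Big]. \tag{3.5}
\]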
Note that [ · | · ] denotes the horizontal concatenation of two matrices. With the interfer-
ence from the intra-cell users of both BSs subtracted from the transmission for the edge
users, the edge users can be considered as an independent problem and the precoding
and decoding matrices can be designed for them as in [8]. This part of the algorithm will
be described in more detail in Section 3.2.3.
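The pre-subtraction chain of Eqns. (3.3) to (3.5) can be traced end to end in a scalar, noise-free toy case. The sketch below assumes one edge stream, two intra-cell streams, and an effective edge gain normalized to 1 so that the receiver-side modulo recovers the data exactly; every numerical value and name is a hypothetical stand-in for the matrix quantities in the text.

```python
import math

M_CONST = 2.0

def mod_2m(value, m_const=M_CONST):
    """Modulo-2*m_const reduction into [-m_const, m_const)."""
    return value - 2.0 * m_const * math.floor((value + m_const) / (2.0 * m_const))

# Hypothetical scalar stand-ins (one edge stream, two intra-cell streams).
eff_gain = 1.0        # V_e^H H_e^H U_e sqrt(P_e), assumed normalized to 1
leak = [0.8, -1.5]    # V_e^H [ H_e^(1)H U_in^(1) sqrt(P_in^(1)) | H_e^(2)H U_in^(2) sqrt(P_in^(2)) ]
x_in = [1.0, -1.0]    # intra-cell symbols, known to both BSs
g_e = [c / eff_gain for c in leak]                      # Eqn. (3.5), scalar case

d_e = -1.0
pre_sub = sum(g * x for g, x in zip(g_e, x_in))         # G_e x_in
x_e = mod_2m(d_e - pre_sub)                             # Eqn. (3.3)
received = eff_gain * x_e + sum(c * x for c, x in zip(leak, x_in))  # noise-free
d_hat = mod_2m(received)                                # Eqn. (3.4)
```

The received value lands on an extended constellation point (here 3), and the final modulo maps it back to the BPSK symbol −1 even though the intra-cell interference was never removed by the receiver.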
3.2.2 Whitening Interference from Edge Users at Intra-cell Users
The signal received by intra-cell user k of BS b can be expressed as follows.
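\[
\begin{aligned}
\mathbf{y}_{in,k} &= \mathbf{H}_k^{(b)H} \sum_{j \in \Omega_b, \Omega_e} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{n}_k, \\
&= \mathbf{H}_k^{(b)H} \sum_{j \in \Omega_b} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{H}_k^{(b)H} \mathbf{U}_e^{(b)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e + \mathbf{n}_k, \\
&= \mathbf{H}_k^{(b)H} \sum_{j \in \Omega_b} \mathbf{U}_j^{(b)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{z}_k.
\end{aligned}
\tag{3.6}
\]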
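The second and third terms in the second line of Eqn. (3.6) can be considered as colored
noise and are denoted by $\mathbf{z}_k$. The covariance matrix of this colored noise, denoted by
$\mathbf{R}_{z,k}$, can be approximated as:
\[
\begin{aligned}
\mathbf{R}_{z,k} = E\left[\mathbf{z}_k \mathbf{z}_k^H\right] &= E\left[\left(\mathbf{H}_k^{(b)H} \mathbf{U}_e^{(b)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e + \mathbf{n}_k\right)\left(\mathbf{H}_k^{(b)H} \mathbf{U}_e^{(b)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e + \mathbf{n}_k\right)^H\right] \\
&\approx \frac{4}{3}\,\mathbf{H}_k^{(b)H} \mathbf{U}_e^{(b)} \mathbf{P}_e \mathbf{U}_e^{(b)H} \mathbf{H}_k^{(b)} + \sigma^2 \mathbf{I}_{N_k}
\end{aligned}
\tag{3.7}
\]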
In the more general case where the cross channels are not forced to have zero power but
path loss is used to model their low power, the received signal yin,k has additional terms
representing the asynchronous inter-cell interference, and consequently its covariance
Rz,k also changes. The new expressions of yin,k and Rz,k are given below, assuming user
k belongs to BS b1 and inter-cell interference arrives from BS b2.
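\[
\begin{aligned}
\mathbf{y}_{in,k} &= \mathbf{H}_k^{(b_1)H} \sum_{j \in \Omega_{b_1}} \mathbf{U}_j^{(b_1)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{H}_k^{(b_1)H} \mathbf{U}_e^{(b_1)} \sqrt{\mathbf{P}_e}\,\mathbf{x}_e \\
&\quad + \underbrace{\mathbf{H}_k^{(b_2)H} \mathbf{U}_e^{(b_2)} \sqrt{\mathbf{P}_e}\,\mathbf{i}_{ek}^{(b_2)} + \mathbf{H}_k^{(b_2)H} \mathbf{U}_{in}^{(b_2)} \sqrt{\mathbf{P}_{in}^{(b_2)}}\,\mathbf{i}_{\Omega_2 k}^{(b_2)}}_{\text{inter-cell asynchronous interference}} + \mathbf{n}_k, \\
&= \mathbf{H}_k^{(b_1)H} \sum_{j \in \Omega_{b_1}} \mathbf{U}_j^{(b_1)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{z}_k,
\end{aligned}
\tag{3.8}
\]
\[
\begin{aligned}
\mathbf{R}_{z,k} = E\left[\mathbf{z}_k \mathbf{z}_k^H\right] &\approx \frac{4}{3}\,\mathbf{H}_k^{(b_1)H} \mathbf{U}_e^{(b_1)} \mathbf{P}_e \mathbf{U}_e^{(b_1)H} \mathbf{H}_k^{(b_1)} + \frac{4}{3}\,\rho\!\left(\delta_{ek}^{(b_2)}\right) \mathbf{H}_k^{(b_1)H} \mathbf{U}_e^{(b_1)} \mathbf{P}_e \mathbf{U}_e^{(b_2)H} \mathbf{H}_k^{(b_2)} \\
&\quad + \frac{4}{3}\,\rho\!\left(\delta_{ek}^{(b_2)}\right) \mathbf{H}_k^{(b_2)H} \mathbf{U}_e^{(b_2)} \mathbf{P}_e \mathbf{U}_e^{(b_1)H} \mathbf{H}_k^{(b_1)} \\
&\quad + \frac{4}{3}\,\beta_{ek}^{(b_2,b_2)} \mathbf{H}_k^{(b_2)H} \mathbf{U}_e^{(b_2)} \mathbf{P}_e \mathbf{U}_e^{(b_2)H} \mathbf{H}_k^{(b_2)} \\
&\quad + \beta_{\Omega_2 k}^{(b_2,b_2)} \mathbf{H}_k^{(b_2)H} \mathbf{U}_{in}^{(b_2)} \mathbf{P}_{in}^{(b_2)} \mathbf{U}_{in}^{(b_2)H} \mathbf{H}_k^{(b_2)} + \sigma^2 \mathbf{I}_{N_k}.
\end{aligned}
\tag{3.9}
\]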
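In order to whiten this colored noise, intra-cell user $k$ multiplies its received signal by a
whitening filter matrix $\mathbf{R}_{z,k}^{-1/2}$. Accordingly, the receiver then processes the modified signal
$\tilde{\mathbf{y}}_{in,k}$, which can be expressed in terms of $\tilde{\mathbf{H}}_k$, the modified channel matrix, and $\tilde{\mathbf{z}}_k$, the
whitened noise.
\[
\begin{aligned}
\tilde{\mathbf{y}}_{in,k} = \mathbf{R}_{z,k}^{-1/2}\,\mathbf{y}_{in,k} &= \mathbf{R}_{z,k}^{-1/2}\,\mathbf{H}_k^{(b_1)H} \sum_{j \in \Omega_{b_1}} \mathbf{U}_j^{(b_1)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \mathbf{R}_{z,k}^{-1/2}\,\mathbf{z}_k \\
&= \tilde{\mathbf{H}}_k^{(b_1)H} \sum_{j \in \Omega_{b_1}} \mathbf{U}_j^{(b_1)} \sqrt{\mathbf{P}_j}\,\mathbf{x}_j + \tilde{\mathbf{z}}_k
\end{aligned}
\tag{3.10}
\]
Note that $\tilde{\mathbf{H}}_k^{(b_1)} = \mathbf{H}_k^{(b_1)} \mathbf{R}_{z,k}^{-1/2}$. Consequently, each of the two groups of intra-cell users can
be treated, as well, as an independent problem, and the precoding and decoding matrices
can be derived as in [8], but using the modified channel matrices.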
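The whitening step can be checked on a small example. The sketch below assumes a user with Nk = 2 receive antennas and a real symmetric covariance so that a closed-form 2×2 matrix square root applies; the covariance entries are made up, and `inv_sqrt_2x2` is a hypothetical helper rather than a method from the cited works.

```python
import math

def inv_sqrt_2x2(r):
    """R^(-1/2) for a real symmetric positive-definite 2x2 matrix."""
    a, b, d = r[0][0], r[0][1], r[1][1]
    s = math.sqrt(a * d - b * b)          # sqrt(det R)
    t = math.sqrt(a + d + 2.0 * s)        # sqrt(tr R + 2 sqrt(det R))
    # R^(1/2) = (R + s I) / t, so R^(-1/2) is the inverse of that matrix.
    ra, rb, rd = (a + s) / t, b / t, (d + s) / t
    det = ra * rd - rb * rb
    return [[rd / det, -rb / det], [-rb / det, ra / det]]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical colored-noise covariance: (4/3) x edge-user leakage + sigma^2 I,
# with sigma^2 = 1, in the spirit of Eqn. (3.7).
R_z = [[1.0 + 4.0 / 3.0 * 0.9, 4.0 / 3.0 * 0.3],
       [4.0 / 3.0 * 0.3, 1.0 + 4.0 / 3.0 * 0.5]]
W = inv_sqrt_2x2(R_z)
whitened_cov = matmul(matmul(W, R_z), W)   # W is symmetric, so W^H = W
```

The covariance of the filtered noise comes out as the identity, which is what allows the intra-cell group to be handled by an algorithm that expects white noise.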
3.2.3 The Hybrid Algorithm
In short, the algorithm performs the following steps. It starts by initializing the uplink
precoding matrices (the V matrices) to the right singular vectors of the channel matrices.
The uplink power matrices (the Q matrices) are initialized by dividing the available
power equally among all data streams. The modified channel matrices are initialized
by adjusting the true channel matrices using the initial power and precoding matrices.
Next, a global power allocation step is performed that minimizes the SMSE of the whole
system. As explained before, the hybrid algorithm treats each user group as a separate
group; therefore, the SMSE of the whole system is simply the sum of the SMSEs of the
three user groups. Since the reception in each group is synchronous, the SMSE of each
group can be expressed as in Eqn. (2.52).
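\[
\mathrm{SMSE}_{\Omega_1} = L_1 - M + \operatorname{tr}\left(\mathbf{J}^{(1)-1}\right), \quad \mathbf{J}^{(1)} = \mathbf{H}_{in}^{(1)} \mathbf{V}_{in}^{(1)} \mathbf{Q}_{in}^{(1)} \mathbf{V}_{in}^{(1)H} \mathbf{H}_{in}^{(1)H} + \sigma^2 \mathbf{I}_M, \tag{3.11}
\]
\[
\mathrm{SMSE}_{\Omega_e} = L_e - 2M + \operatorname{tr}\left(\mathbf{J}_e^{-1}\right), \quad \mathbf{J}_e = \mathbf{H}_e \mathbf{V}_e \mathbf{Q}_e \mathbf{V}_e^H \mathbf{H}_e^H + \sigma^2 \mathbf{I}_{2M}, \tag{3.12}
\]
\[
\mathrm{SMSE}_{\Omega_2} = L_2 - M + \operatorname{tr}\left(\mathbf{J}^{(2)-1}\right), \quad \mathbf{J}^{(2)} = \mathbf{H}_{in}^{(2)} \mathbf{V}_{in}^{(2)} \mathbf{Q}_{in}^{(2)} \mathbf{V}_{in}^{(2)H} \mathbf{H}_{in}^{(2)H} + \sigma^2 \mathbf{I}_M, \tag{3.13}
\]
and
\[
\begin{aligned}
\mathrm{SMSE} &= \mathrm{SMSE}_{\Omega_1} + \mathrm{SMSE}_{\Omega_e} + \mathrm{SMSE}_{\Omega_2}, \\
&= L - 4M + \operatorname{tr}\left(\mathbf{J}^{(1)-1}\right) + \operatorname{tr}\left(\mathbf{J}_e^{-1}\right) + \operatorname{tr}\left(\mathbf{J}^{(2)-1}\right)
\end{aligned}
\tag{3.14}
\]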
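The per-group expression, SMSE = L − M + tr(J⁻¹), can be checked against a direct sum of per-stream MMSE errors. The sketch below is an illustration for a single synchronous group only, assuming σ² = 1 (as in Table 2.3), M = 2 BS antennas, three single-antenna users with one stream each, and made-up channels and powers; the helper names are hypothetical.

```python
def inv2(a):
    """Inverse of a 2x2 complex matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def build_j(channels, powers, sigma2=1.0):
    """J = sum_k q_k h_k h_k^H + sigma2 * I for single-antenna users."""
    j = [[complex(sigma2), 0j], [0j, complex(sigma2)]]
    for h, q in zip(channels, powers):
        for r in range(2):
            for c in range(2):
                j[r][c] += q * h[r] * h[c].conjugate()
    return j

def stream_mse(h, q, j_inv):
    """MMSE of one stream: 1 - q h^H J^-1 h."""
    quad = sum(h[r].conjugate() * j_inv[r][c] * h[c]
               for r in range(2) for c in range(2))
    return 1.0 - q * quad.real

H = [[1.0 + 0.5j, 0.2 - 0.1j],      # hypothetical user channels (M = 2)
     [0.3 + 0.4j, 0.9 + 0.2j],
     [-0.6 + 0.1j, 0.4 - 0.8j]]
Q = [2.0, 3.0, 1.0]                 # hypothetical uplink powers
L_STREAMS, M_ANT = len(H), 2

J_inv = inv2(build_j(H, Q))
smse_direct = sum(stream_mse(h, q, J_inv) for h, q in zip(H, Q))
smse_trace = L_STREAMS - M_ANT + (J_inv[0][0] + J_inv[1][1]).real
```

The two values agree, which is why minimizing the SMSE reduces to minimizing the trace terms once the constants are dropped.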
The total number of data streams for the intra-cell users of BS 1, the edge users, and the
intra-cell users of BS 2 are L1, Le, and L2, respectively. Note that L1+Le+L2 = L. Since
each of the SMSEs in Eqns. (3.11), (3.12), and (3.13) is convex in its power term [8], the
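SMSE in Eqn. (3.14) is convex in these power terms. It can be easily seen that minimizing
the SMSE in Eqn. (3.14) is equivalent to minimizing $\operatorname{tr}(\mathbf{J}^{(1)-1}) + \operatorname{tr}(\mathbf{J}_e^{-1}) + \operatorname{tr}(\mathbf{J}^{(2)-1})$.
Therefore, the power allocation problem can be expressed as:
\[
\left[\mathbf{Q}^{(1)}, \mathbf{Q}_e, \mathbf{Q}^{(2)}\right] = \arg\min_{\mathbf{Q}^{(1)}, \mathbf{Q}_e, \mathbf{Q}^{(2)}} \operatorname{tr}\left[\mathbf{J}^{(1)-1}\right] + \operatorname{tr}\left[\mathbf{J}_e^{-1}\right] + \operatorname{tr}\left[\mathbf{J}^{(2)-1}\right] \tag{3.15}
\]
subject to: $\operatorname{tr}[\mathbf{Q}^{(1)}] + \operatorname{tr}[\mathbf{Q}_e] + \operatorname{tr}[\mathbf{Q}^{(2)}] \le P_{tot}$.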
Note that the power allocation step assumes that the covariance matrices of the colored
noise (implicit in the modified channel matrices) are constant, since this assures that
the SMSE is convex with respect to power. The edge users are now treated as an inde-
pendent group, and new uplink precoding matrices (V) are derived for them. Using the
downlink/uplink duality, the downlink precoding and power allocation matrices (the U
and P matrices respectively) of the edge users are derived [8].
The precoding and decoding matrices and the power allocation are used, in turn, to
determine the covariance matrices of the colored noise caused by edge users at intra-cell
users as in Eqn. (3.7), which are used to modify the channel matrices of the intra-
cell users as described previously in Section 3.2.2. When inter-cell interference is to
be explicitly taken into account, the modified channel matrices are used to derive the
downlink precoding and power allocation matrices of the intra-cell users. Then, the
covariance matrices of the colored noise are found using Eqn. (3.9), and used to modify
the channel matrices of the intra-cell users. Next, new uplink precoding matrices (V) are
obtained for intra-cell users. The above steps are repeated until the relative change in
SMSE is below a certain threshold, or the SMSE experiences an increase. The increase
in SMSE can occur since the power allocation step assumes the covariance matrix of the
colored noise is constant, while in reality it is a function of the power allocation. Finally,
the downlink precoding and power allocation matrices (U and P matrices respectively)
of the intra-cell users and the Ge matrix needed by THP are derived.
Note that it is assumed that the two BSs can jointly provide a maximum power of Ptot =
2 × Pmax, where Pmax is the power that one BS is allowed to provide. This is justified
because the uniform distribution of the users is symmetric about the perpendicular
bisector of the straight line joining the two BSs. In other words, at a particular instant,
a BS might be transmitting with a power greater than or less than Pmax, but over an
extended time period the average transmitted power is Pmax.
The following list describes the steps mathematically and in more detail.
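Iteration:

1. Solve the following convex power allocation problem.
\[
\left[\mathbf{Q}^{(1)}, \mathbf{Q}_e, \mathbf{Q}^{(2)}\right]
= \arg\min_{\mathbf{Q}^{(1)}, \mathbf{Q}_e, \mathbf{Q}^{(2)}}
\mathrm{tr}\left[\left(\mathbf{J}^{(1)}\right)^{-1}\right] + \mathrm{tr}\left[\mathbf{J}_e^{-1}\right] + \mathrm{tr}\left[\left(\mathbf{J}^{(2)}\right)^{-1}\right]
\]
\[
\text{s.t. } \mathrm{tr}\left[\mathbf{Q}^{(1)}\right] + \mathrm{tr}\left[\mathbf{Q}_e\right] + \mathrm{tr}\left[\mathbf{Q}^{(2)}\right] \le P_{tot}
\]

2. Find new V matrices for edge users.
\[
\mathbf{v}_{kj} = e_{\max}\left( \mathbf{H}_{e,k}^{H} \mathbf{J}_{e,kj}^{-2} \mathbf{H}_{e,k},\; \mathbf{I}/Q_{e,kj} + \mathbf{H}_{e,k}^{H} \mathbf{J}_{e,kj}^{-1} \mathbf{H}_{e,k} \right)
\]
N.B.: e_max returns the eigenvector with the highest eigenvalue.

3. Find the U matrices with normalized columns for all edge users and the global Pe
matrix.
\[
\mathbf{U}_{e,k} = \mathbf{J}_e^{-1} \mathbf{H}_{e,k} \mathbf{V}_{e,k} \sqrt{\mathbf{Q}_{e,k}}, \qquad
\mathbf{u}_{e,kj} = \mathbf{u}_{e,kj} / \|\mathbf{u}_{e,kj}\| \quad \text{(normalizing columns)}
\]
\[
\mathbf{P}_e = \tfrac{3}{4}\, \sigma^2\, \mathrm{diag}\left[ \left( \mathbf{D}_e^{-1} - \boldsymbol{\Psi}_e \right)^{-1} \mathbf{1} \right]
\]
N.B.: The factor 3/4 is included to account for the increased average power of xe due
to THP as mentioned previously in Section 3.2.1.

4. Find Rz,k as in Eqn. (3.7) or Eqn. (3.9). Update Hk for all users in BSs 1 and 2.
\[
\mathbf{H}_k^{(b)} = \mathbf{H}_k^{(b)} \mathbf{R}_{z,k}^{-1/2}
\]

5. Find new V matrices for intra-cell users, b = 1, 2.
\[
\mathbf{v}_{kj} = e_{\max}\left( \mathbf{H}_k^{(b)H} \left(\mathbf{J}_{kj}^{(b)}\right)^{-2} \mathbf{H}_k^{(b)},\; \mathbf{I}/Q_{kj}^{(b)} + \mathbf{H}_k^{(b)H} \left(\mathbf{J}_{kj}^{(b)}\right)^{-1} \mathbf{H}_k^{(b)} \right)
\]
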
6. Repeat steps 1 to 5 above, until the relative change in SMSE is within a certain
threshold, or until SMSE shows an increase at any step.
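
Update:

1. Find the U matrices with normalized columns for all intra-cell users and their global
power allocation matrices, b = 1, 2.
\[
\mathbf{U}_k^{(b)} = \left(\mathbf{J}^{(b)}\right)^{-1} \mathbf{H}_k^{(b)} \mathbf{V}_k^{(b)} \sqrt{\mathbf{Q}_k^{(b)}}, \qquad
\mathbf{u}_{kj}^{(b)} = \mathbf{u}_{kj}^{(b)} / \|\mathbf{u}_{kj}^{(b)}\| \quad \text{(normalizing columns)}
\]
\[
\mathbf{P}^{(b)} = \sigma^2\, \mathrm{diag}\left[ \left( \left(\mathbf{D}^{(b)}\right)^{-1} - \boldsymbol{\Psi}^{UL\,(b)T} \right)^{-1} \mathbf{1} \right]
\]

2. Find Ge as given in Eqn. (3.5).
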
The matrices D and ΨUL are given by Eqns. (2.42) and (2.43) with γ(b1,b2)ck set to 1,
and the matrix Jkj is given by Eqn. (2.54), but applied to the set of users indicated by
its superscripts or subscripts. To use the above steps with OFDM, a slight modification
is needed, which will be presented later in Chapter 4.
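
The stopping rule in step 6 of the iteration can be sketched as follows. This is only an
illustration of the control flow: `smse_of_iteration` is a hypothetical callable standing in
for one pass through steps 1 to 5, and the threshold value and the choice to discard the
SMSE value that showed an increase are assumptions, since the text does not fix them.

```python
def iterate_until_converged(smse_of_iteration, rel_tol=1e-3, max_iter=100):
    """Repeat the iteration until the relative SMSE change is small or the SMSE rises.

    smse_of_iteration: hypothetical callable; each call runs steps 1 to 5 once
    and returns the resulting SMSE.
    """
    history = [smse_of_iteration()]
    for _ in range(max_iter - 1):
        new = smse_of_iteration()
        old = history[-1]
        if new > old:                       # SMSE increased: stop, keep previous value
            return history, "increase"
        history.append(new)
        if (old - new) / old < rel_tol:     # relative change below the threshold
            return history, "converged"
    return history, "max_iter"

# Stand-in SMSE sequence, purely for illustration.
values = iter([4.0, 2.0, 1.5, 1.4999, 1.0])
history, reason = iterate_until_converged(lambda: next(values))
print(history, reason)
```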
3.2.4 Data Vector Estimation for THP Users
As we have seen previously in Section 3.2.1, modifying the data vector for the edge users
that use THP is equivalent to choosing the data symbols of the data vector from an
extended constellation such that the pre-subtraction of interference decreases its power.
It is crucial for the data vector from the extended constellation, ve, to be decoded properly
at the edge users so that the modulo operation restores the correct original data vector,
de. In the regular operation of the SMSE algorithm of [8], on which the hybrid algorithm
was based, the symbols of the estimated data vector, xe, are not brought back as close as
possible to −1 or 1, and they are simply hard-decoded according to their signs. Clearly,
this is not applicable to THP, since we are dealing with an extended constellation that
has more than only two symbols. The receiver process scales the transmitted vector and
we derive below how the estimated data vector should be scaled.
A global estimated data vector for all users can be written for the first step, i.e., the
algorithm of [8], by grouping the separate estimated data vectors as given by Eqn. (2.46).
This global vector is expressed in Eqn. (3.16). Note that V is block diagonal since the
users cannot cooperate.
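\[
\hat{\mathbf{x}} = \mathbf{V}^{H} \mathbf{H}^{H} \mathbf{U} \sqrt{\mathbf{P}}\, \mathbf{x} + \mathbf{V}^{H} \mathbf{n}. \tag{3.16}
\]
The matrix U has normalized columns and hence we may express it as U = Uorig W,
where $\mathbf{U}_{orig} = \mathbf{J}^{-1} \mathbf{H} \mathbf{V} \sqrt{\mathbf{Q}}$,
$\mathbf{W} = \mathrm{diag}\left( 1/\|\mathbf{u}_1\|, \ldots, 1/\|\mathbf{u}_L\| \right)$, and $\mathbf{u}_l$ is the lth column of
Uorig, i.e., of the precoding matrix before normalization. Expressing U and J explicitly in
Eqn. (3.16), we get
\[
\hat{\mathbf{x}} = \mathbf{V}^{H} \mathbf{H}^{H} \left( \mathbf{H} \mathbf{V} \mathbf{Q} \mathbf{V}^{H} \mathbf{H}^{H} + \sigma^2 \mathbf{I}_M \right)^{-1} \mathbf{H} \mathbf{V} \sqrt{\mathbf{Q}}\, \mathbf{W} \sqrt{\mathbf{P}}\, \mathbf{x} + \mathbf{V}^{H} \mathbf{n}. \tag{3.17}
\]
From Eqn. (3.17), we can see that when J is inverted, the magnitudes of the matrices
that make up J essentially normalize the magnitudes of all the matrices outside, except
for W. Therefore, the estimated data vector should be multiplied by
$\mathbf{W}^{-1} = \mathrm{diag}\left( \|\mathbf{u}_1\|, \ldots, \|\mathbf{u}_L\| \right)$ to restore
the correct magnitude to its symbols.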
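
The effect of the scaling can be checked numerically in the simplest possible setting. The
sketch below assumes a single user with one antenna at each end and one data stream
(so all matrices are scalars and v = 1), ignores the noise term, and assumes that the
downlink power equals the uplink power, which is what the sum-power constraint implies
when there is only one stream. The function name and the numerical values are
illustrative.

```python
import math

def scalar_estimates(h, q, sigma2, x):
    """Noise-free scalar version of Eqn. (3.17) for one stream (v = 1)."""
    j = h * h * q + sigma2              # J = h q h + sigma^2
    u_orig = h * math.sqrt(q) / j       # U_orig = J^-1 H V sqrt(Q)
    w = 1.0 / abs(u_orig)               # W = 1 / ||u_orig||
    u = u_orig * w                      # normalized precoder, |u| = 1
    p = q                               # assumption: single stream, so P = Q
    x_hat = h * u * math.sqrt(p) * x    # V^H H^H U sqrt(P) x without noise
    return x_hat, x_hat / w             # before and after multiplying by W^-1

raw, rescaled = scalar_estimates(h=1.0, q=100.0, sigma2=1.0, x=1.0)
print(raw, rescaled)
```

With these values the unscaled estimate is about 10 times the transmitted symbol, while
the rescaled estimate is about 0.99 times it. A sign-based decision works in both cases,
but a decision on an extended constellation needs the rescaled value.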
3.3 A Variation on the Hybrid Algorithm
The main idea behind the hybrid algorithm presented in Section 3.2 was to pre-subtract
the transmission to the intra-cell users from that of the edge users and to treat the
transmission to the edge users at the intra-cell users as colored noise that is subsequently
whitened at each user. An intuitive variation to the above scheme would be to pre-
subtract the transmission to the edge users from that of the intra-cell users and treat the
transmission to the intra-cell users as colored noise that is whitened at the edge users.
Note that this variation is closely related to reversing the user ordering for THP when
there are only two users. In this case, x(b)in is found as follows,
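\[
\mathbf{x}_{\mathrm{in}}^{(b)} = \left( \mathbf{d}_{\mathrm{in}}^{(b)} - \mathbf{G}_{\mathrm{in}}^{(b)} \mathbf{x}_e \right) \bmod 2M_{const}, \tag{3.18}
\]
where
\[
\mathbf{G}_{\mathrm{in}}^{(b)} = \left( \mathbf{V}_{\mathrm{in}}^{(b)H} \mathbf{H}_{\mathrm{in}}^{(b)H} \mathbf{U}_{\mathrm{in}}^{(b)} \sqrt{\mathbf{P}_{\mathrm{in}}^{(b)}} \right)^{-1} \mathbf{V}_{\mathrm{in}}^{(b)H} \mathbf{H}_{\mathrm{in}}^{(b)H} \mathbf{U}_e^{(b)} \sqrt{\mathbf{P}_e}. \tag{3.19}
\]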
As for the signal received by edge user k, in this case it can be expressed as below. Note
that in this case, the cross channels are forced to have zero power.
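\[
\begin{aligned}
\mathbf{y}_{e,k} &= \mathbf{H}_{e,k}^{H} \mathbf{U}_e \sqrt{\mathbf{P}_e}\, \mathbf{x}_e
+ \mathbf{H}_{e,k}^{(1)H} \mathbf{U}_{\mathrm{in}}^{(1)} \sqrt{\mathbf{P}_{\mathrm{in}}^{(1)}}\, \mathbf{x}_{\mathrm{in}}^{(1)}
+ \mathbf{H}_{e,k}^{(2)H} \mathbf{U}_{\mathrm{in}}^{(2)} \sqrt{\mathbf{P}_{\mathrm{in}}^{(2)}}\, \mathbf{x}_{\mathrm{in}}^{(2)}
+ \mathbf{n}_{e,k}, \\
&= \mathbf{H}_{e,k}^{H} \mathbf{U}_e \sqrt{\mathbf{P}_e}\, \mathbf{x}_e + \mathbf{z}_{e,k}.
\end{aligned}
\tag{3.20}
\]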
Following from Eqn. (3.20), Rz,e,k can be approximated as follows.
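\[
\begin{aligned}
\mathbf{R}_{z,e,k} &= E\left[ \mathbf{z}_{e,k} \mathbf{z}_{e,k}^{H} \right], \\
&\approx \tfrac{4}{3}\, \mathbf{H}_{e,k}^{(1)H} \mathbf{U}_{\mathrm{in}}^{(1)} \mathbf{P}_{\mathrm{in}}^{(1)} \mathbf{U}_{\mathrm{in}}^{(1)H} \mathbf{H}_{e,k}^{(1)}
+ \tfrac{4}{3}\, \mathbf{H}_{e,k}^{(2)H} \mathbf{U}_{\mathrm{in}}^{(2)} \mathbf{P}_{\mathrm{in}}^{(2)} \mathbf{U}_{\mathrm{in}}^{(2)H} \mathbf{H}_{e,k}^{(2)}
+ \sigma^2 \mathbf{I}_{N_k}.
\end{aligned}
\tag{3.21}
\]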
Finally, the modified channel matrix for edge user k is found as
\[
\mathbf{H}_{e,k} = \mathbf{H}_{e,k} \mathbf{R}_{z,e,k}^{-1/2}. \tag{3.22}
\]
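
For a single-antenna edge user (Nk = 1), Eqns. (3.21) and (3.22) reduce to scalar
operations, which the following sketch illustrates. The helper names and the numerical
values are hypothetical; the channel of each BS is stored as a list of M complex
coefficients, and each precoding matrix as a list of its columns.

```python
import math

def hdot(h, u):
    """Inner product h^H u for vectors stored as lists of numbers."""
    return sum(a.conjugate() * b for a, b in zip(h, u))

def noise_variance(channels, precoders, powers, sigma2, thp_factor=4.0 / 3.0):
    """Scalar form of Eqn. (3.21) when the edge user has one antenna."""
    r = sigma2
    for h, columns, p in zip(channels, precoders, powers):
        # H^H U P U^H H collapses to sum_j p_j |h^H u_j|^2 when Nk = 1
        r += thp_factor * sum(pj * abs(hdot(h, u)) ** 2 for u, pj in zip(columns, p))
    return r

def whiten(h, r):
    """Eqn. (3.22) with a scalar R: scale the channel by R^(-1/2)."""
    return [a / math.sqrt(r) for a in h]

# Illustrative numbers: M = 2 antennas per BS, one intra-cell stream per BS.
h1, h2 = [1 + 0j, 0j], [0j, 0.5j]       # channels from BS 1 and BS 2 to the edge user
u1, u2 = [1 + 0j, 0j], [0j, 1 + 0j]     # unit-norm intra-cell precoding vectors
r = noise_variance([h1, h2], [[u1], [u2]], [[0.75], [3.0]], sigma2=1.0)
print(r, whiten([2.0 + 0j], r))
```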
3.4 Simulation Results
This section presents the results of illustrative simulations. The parameters used for all
the simulations are the same as in Table 2.3.
3.4.1 Zero Power Cross Channels
The results of the first simulation are captured in Figure 3.3. Note that in this simulation,
the path loss exponent is assumed to be zero, i.e., the path loss is not taken into account.
This example serves to illustrate the workings of the proposed algorithm independent
of path loss effects. Note that since the path loss exponent is set to zero, there is no
path loss to attenuate the cross channels. However, as explained at the beginning of
Section 3.1, cross channels attenuate a signal severely. Therefore, the cross channels here
are forced to have zero power. An upper bound on performance is obtained by assuming
that all signals (including interference) from both BSs arrive at all users synchronously.
This system is equivalent to a super-BS with 2M transmit antennas communicating
jointly with K users. On the other hand, a lower bound on performance is obtained
by assuming that the 4 users are uniformly distributed between two independent cells
without inter-cell interference and without joint cooperative processing for edge users.
The performance upper and lower bounds correspond to the bottom and top curves in
Figure 3.2, respectively.

Figure 3.2: Performance of the hybrid algorithm without path loss (K1 = K2 = 1, Ke = 2).
[Plot: BER versus 10 log(Pmax/σ2); B = 2, M = 4, K = 4, Nk = 1, Lk = 1; curves: Lower
Bound, Upper Bound, Hybrid Algorithm, Asynchronous Interference.]

To illustrate the efficacy of the hybrid algorithm, the case of Ke = 2, K1 = K2 = 1
was simulated, since having a substantial number of edge users is a rare event when a
uniform distribution of users is combined with a 40-meter wide cooperation band. The
result is shown by the solid curve with diamond markers. As expected, the performance
lies within the upper and lower bounds of performance. We can also see a clear increase in
diversity order from the lower bound, which is a direct result of the cooperation between
the two BSs. The diversity order is slightly less than that of the upper bound since
cooperation only happens for the edge users in the hybrid algorithm, while it happens
for all users in the upper bound case.
To evaluate the performance of the hybrid algorithm further, three other schemes
that assume zero power cross channels (ZPCC) were also simulated for Ke = 2, K1 =
K2 = 1. From here on, these schemes will be referred to as “ZPCC cases 1 to 3”. In
the three ZPCC cases, all interference is assumed to be synchronous and edge (intra-cell)
users battle MUI from other edge (intra-cell) users and intra-cell (edge) users by linear
precoding. For ZPCC case 1, the whole system is treated as one cell with one super-BS
equipped with 2M antennas. The channels between the users initially considered
as intra-cell users in cell 2 (cell 1) and the first (last) M antennas of the super-BS are
considered to have zero power. In this case, it is found that the super-BS transmits to all
users using all the 2M antennas, even though the transmission intended for some users on
some antennas never reaches those users. Despite that fact, this case of ZPCC performs
slightly better than the hybrid algorithm. Its result is shown by the solid curve with
cross markers in Figure 3.3. This can be explained by noticing that the transmission to
intra-cell users by the other BS can help the edge users better suppress the interference
they see from those intra-cell users.

Figure 3.3: Performance of the hybrid algorithm compared to ZPCC cases without path
loss (K1 = K2 = 1, Ke = 2). [Plot: BER versus 10 log(Pmax/σ2); B = 2, M = 4, K = 4,
Nk = 1, Lk = 1; curves: ZPCC Case 3, Hybrid Algorithm, ZPCC Case 2, ZPCC Case 1.]

This explanation is further supported by the performance of ZPCC case 2. In ZPCC
case 2, the same precoding vectors as in ZPCC case 1 are used by the BSs, except that
the components of those vectors corresponding to the antennas with zero power channels
are set to zero and the vectors are re-normalized to their initial unit magnitude. In other
words, the extra interference information that edge users received in ZPCC case 1
is no longer available. The result of this case is shown by the dashed curve with cross
markers in Figure 3.3. This result shows the worst performance among all the schemes.
Furthermore, plotting the results for the separate streams (Figure 3.4) shows that it is the
edge users that cause the apparent error floor, which supports our previous explanation.
In ZPCC case 3, one BS transmits to intra-cell users, and two BSs transmit to edge
users. That is, linear precoding for one group of intra-cell users is performed assuming
that only they, their BS, and the edge users exist. As for the edge users, linear precoding
is performed assuming that both BSs and all the users exist. The result of this case is
shown by the dotted curve with cross markers. With the ZPCC cases explained, it is only
fair to evaluate the performance of the hybrid algorithm by comparing it to a scheme
that uses the same amount of information, that is, ZPCC case 3. In other words, ZPCC case
1 uses the values of the cross channels (zero in this case) directly in its solution, while
the hybrid algorithm and ZPCC case 3 do not require them. Clearly, at relatively high
SNRs, the hybrid algorithm performs better than ZPCC case 3, with a higher diversity
order. This higher diversity order comes mainly from an increase in that of the intra-cell
users, as can be seen by plotting the BERs of the separate data streams in Figure 3.5. This
is a possible consequence of precoding for intra-cell users while treating the MUI caused
by edge users as colored noise (that is whitened) and not as interferers to be suppressed.
A more realistic scenario that better represents a physical environment would simulate
the effects of path loss. Accordingly, in the second simulation, the path loss exponent
is set to 3.5, a value that represents an urban environment. The cell radius was set to
250 meters and the minimum distance of any user to a BS was set to 150 meters. The
path loss of each user was normalized to the path loss experienced 150 meters away from
a BS. The results of this simulation are captured in Figure 3.6. The results are similar
to those obtained without path loss, except that the higher diversity order of the hybrid
algorithm makes it outperform the ZPCC case 3 at higher SNRs.
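
To give a feel for the numbers in this setup, the sketch below evaluates the normalized
path gain at a few distances. It assumes a simple power-law model in which the received
power falls off as distance to the power of minus the path loss exponent; the function
name and this exact model are assumptions, while the exponent (3.5), the reference
distance (150 meters), and the cell radius (250 meters) are the values quoted above.

```python
import math

def relative_path_gain(distance_m, ref_distance_m=150.0, exponent=3.5):
    """Power gain relative to the gain at the reference distance (power-law model)."""
    return (ref_distance_m / distance_m) ** exponent

for d in (150.0, 200.0, 250.0):
    g = relative_path_gain(d)
    print(d, g, 10.0 * math.log10(g))   # distance, linear gain, gain in dB
```

Under this model a user at the cell edge (250 meters) sees roughly 7.8 dB less power than
a user at the minimum distance of 150 meters.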

Figure 3.4: BER plot of the separate streams for ZPCC case 2. [Plot: BER versus
10 log(Pmax/σ2); B = 2, M = 4, K = 4, Nk = 1, Lk = 1; curves: Average BER, Edge
Stream 1, Edge Stream 2, Intra-cell Stream 1, Intra-cell Stream 2.]

Figure 3.5: BER plot of the separate streams for hybrid algorithm and ZPCC case 3.
[Plot: BER versus 10 log(Pmax/σ2); B = 2, M = 4, K1 = 1, Ke = 2, K2 = 1, Nk = 1, Lk = 1;
curves for both the hybrid algorithm (HA) and ZPCC: Edge Stream 1, Edge Stream 2,
Intra-cell Stream 1, Intra-cell Stream 2.]

Figure 3.6: Performance of the hybrid algorithm with path loss. [Plot: BER versus
10 log(Pmax/σ2); B = 2, K = 4, M = 4, Nk = 1, Lk = 1, cell radius = 250; curves: ZPCC
Case 3, HA, ZPCC Case 1, ZPCC Case 2.]

3.4.2 Non-zero Power Cross Channels
While in all the previous simulations the cross channels were set to zero, the results shown
in Figure 3.7 are those of simulations where the weak cross channel powers are modeled by
path loss. As expected, the performance slightly deteriorates in the presence of inter-cell
interference. Furthermore, when the users find their decoding matrices (the V matrices
with the whitening filters incorporated) through training, the performance is very close to
that when the theoretical ones are used; in fact, it is slightly better. This can be explained
by the fact that a relatively long training sequence of length 100 symbols has been used
and the decoding matrices are derived from simulated training transmissions, and not
based on the theoretical approximation of the covariance of the inter-cell interference.
This result might, as well, be used to show that the approximation used is a relatively
good one. Figure 3.7 also shows similar simulations performed for users that have Nk = 2
receive antennas, with a training sequence length of 50. The curves almost overlap. This
further supports the feasibility of users finding their decoding matrices through training,
since with Nk = 1, training can be thought of as the receivers estimating phase offsets
(since the decoding vectors are scalars normalized to unit magnitude).

Figure 3.7: Performance of the hybrid algorithm with inter-cell interference. [Plot: BER
versus 10 log(Pmax/σ2); B = 2, M = 4, K1 = 1, Ke = 2, K2 = 1, Lk = 1; curves: HA without
inter-cell interference (Nk = 1); HA with whitening inter-cell interference (Nk = 1); HA
with inter-cell interference and rx est. (Nk = 1, training length = 100); HA with whitening
inter-cell interference (Nk = 2); HA with inter-cell interference and rx est. (Nk = 2,
training length = 50).]

Figure 3.8 shows the performance of the variation of the hybrid algorithm. It performs
worse than the hybrid algorithm. Figure 3.9 shows the result obtained from simulating
the hybrid algorithm when transmitting the whole complex components of vector xe. As
mentioned previously in Section 3.2.1, energy is wasted and the performance deteriorates.
Finally, we state some statistical results that were obtained through the above
simulations as well. First, the power of the real parts of the modified transmitted data
vectors, averaged over 10^5 runs for each SNR from −3 to 15 dB, was 1.268, which is close
to the theoretical value of 4/3. At higher SNRs, the average power is closer to 4/3. Second,
the modulo operations at the transmitter and the receiver matched 99.6% of the time,
averaged over the same number of runs and the same SNR range.
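
The value 4/3 can be checked with a short calculation. As an assumption (a common
approximation for the output of THP rather than a result derived here), take each real
component of the modified data vector to be uniformly distributed over a modulo interval
of [−2, 2), which corresponds to BPSK symbols ±1. Its average power is then

\[
E\left[x^2\right] = \frac{1}{4} \int_{-2}^{2} x^2 \, dx = \frac{1}{4} \cdot \frac{16}{3} = \frac{4}{3} \approx 1.333 .
\]

The simulated value of 1.268 is about 5% below this figure.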

Figure 3.8: Performance of the hybrid algorithm variation. [Plot: BER versus
10 log(Pmax/σ2); B = 2, M = 4, K1 = 1, Ke = 2, K2 = 1, Nk = 1, Lk = 1; curves: HA,
Variation of HA.]

Figure 3.9: Performance of the hybrid algorithm with complex transmission. [Plot: BER
versus 10 log(Pmax/σ2); B = 2, M = 4, K1 = 1, Ke = 2, K2 = 1, Nk = 1, Lk = 1; curves: HA,
HA with complex transmission.]

3.5 Implementation Issues
In this section, we briefly review some of the implementation issues involved in the
physical deployment of the hybrid algorithm. The first issue that arises is the provision
of the needed CSI to the BSs, a common issue to all schemes that assume CSI at the
transmitter side. Moreover, in our case, the propagation delays between the BSs and
the users need to be determined and provided to the BSs. CSI usually either consists
of the exact values of the response of the channel or some statistics about it. In the
downlink, the users are responsible for performing channel estimation and feeding back
the information to the BSs. This is usually done through a feedback channel from the
users to the BS, which is often assumed to be error free. However, in reality, this feedback
channel is not necessarily error free. Moreover, even if that is achievable, the channel
response estimated by the users cannot be exactly fed back to the BSs due to the errors
that occur when the users discretize and quantize the channel values. The effect of
quantization is often studied by assuming that there is only a limited number of bits to
relay the needed information and simulating how that affects performance. Such a study
is required to determine the number of bits that allow the hybrid algorithm to perform
at an acceptable level. As for the propagation delays, they can be directly found at the
BSs since they should be the same in the downlink and uplink. Hence, determining them
should be easier and more accurate than the CSI.
With the hybrid algorithm being a cooperative scheme between 2 BSs, another im-
plementation issue is the amount of information that should be shared between the BSs,
how to share it, and how much delay is involved in the process. In what follows we
assume a simple yet logical view of the infrastructure of a cellular system, and provide
a back-of-the-envelope analysis of the resulting delays. We consider only two BSs and a
central controller (CC) that, in a non-cooperative scenario, would only provide the BSs
with the data they need to transmit to the users. Initially, the CSI and propagation
delays between all the users and a given BS are assumed to be known by that BS, and the
data to be transmitted to all the users is known by the CC.
To execute the hybrid algorithm, the CSI from both BSs is required. This can be seen
from steps 1 to 4 of the iteration of the algorithm presented in Section 3.2.3. In step 1,
a global power allocation step, i.e., one for the users of both BSs, is performed, which
requires the CSI of all users. In steps 2 and 3, the precoding and decoding matrices of the
edge users are jointly derived, which requires the CSI between both BSs and the edge users.
In step 4, estimating the covariance matrix of the colored noise seen by an intra-cell user
requires the precoding matrices and power allocation derived for the edge users and/or
the intra-cell users of the other cell. Moreover, deriving the matrix Ge required for THP
as in Eqn. (3.5) also requires knowledge of the precoding and decoding matrices, as well
as the power allocation of all the users. Finally, the same modified data vector for the
edge users should be known by both BSs.

Table 3.1: Algorithm executed at the CC

Slot | BS 1 → CC | BS 2 → CC | CC → BS 1 | CC → BS 2 | Algorithm running
T1 | H^(1), H_e^(1), τ^(1) | H^(2), H_e^(2), τ^(2) | - | - | -
T2 | - | - | - | - | X
T3 | - | - | x_e, x_in^(1), P^(1), U^(1), P_e, U_e^(1) | x_e, x_in^(2), P^(2), U^(2), P_e, U_e^(2) | -

Consequently, all the CSI and the propagation delays should be gathered in one
location for the algorithm to execute. Logically, that will either happen at one of the
BSs and the result will be sent to the other BS, or at the CC and the result is sent to
both BSs. Clearly, it is more costly to have both BSs acquire all the CSI and execute
the algorithm. Table 3.1 and Table 3.2 show the communication steps required when
the algorithm is executed at the CC and at one of the BSs (BS 1 in this example),
respectively.
As we can see, less time is required when the algorithm is executed at the CC.
However, a CC would be responsible for more than 2 BSs; therefore, having the CC
execute the algorithm for all its BSs increases its complexity drastically. Executing the
algorithm at one of the BSs keeps the complexity of the CC low, but requires more time,
creating a tradeoff. Note that at the cost of extra equipment (a direct link between each
pair of cooperating BSs), the required time for executing the algorithm at one of the BSs
becomes similar to that when it is executed at the CC.

Table 3.2: Algorithm executed at BS 1

Slot | BS 1 → CC | BS 2 → CC | CC → BS 1 | CC → BS 2 | Algorithm running
T1 | - | H_e^(2), H_in^(2), τ^(2) | d_e, x_in^(1), x_in^(2) | x_in^(2) | -
T2 | - | - | H_e^(2), H_in^(2), τ^(2) | - | -
T3 | - | - | - | - | X
T4 | U_in^(2), P_in^(2), U_e^(2), P_e, x_e | - | - | - | -
T5 | - | - | - | U_in^(2), P_in^(2), U_e^(2), P_e, x_e | -

Introducing OFDM into the solution brings in the implementation challenges of
OFDM itself, including time, frequency, and phase offsets and the associated inter-carrier
and inter-block interference. Moreover, the possibility of having to run the algorithm for
each subcarrier (or group of subcarriers [39]) also increases the running time of the whole
process. Even though these are important issues, they are not discussed further as they
are out of the scope of this work.
3.6 Summary
In this chapter, we have proposed the hybrid algorithm, which brings together linear and
nonlinear precoding to communicate with the users of two BSs. Edge users, which are
almost equally distant from the two BSs, experience limited asynchronism and joint linear
precoding based on the original algorithm of [8] is used to communicate with them. Before
transmission, nonlinear THP is used to subtract the data of intra-cell users from that of
the edge users, and accordingly the edge users do not experience any interference from
the intra-cell users. As for the intra-cell users themselves, they treat the transmission
to the edge users as colored noise, which they whiten. Simulations demonstrating the
performance of the hybrid algorithm were presented and compared to the performance
of simpler linear precoding schemes. Finally, a discussion of implementation issues was
presented, which provided some insight into the challenges that must be overcome for a
cooperative scheme to be deployed.
Chapter 4
User Selection in Multiuser
MIMO-OFDM Systems
As we have mentioned previously in Section 1.2.4, a single-carrier, MIMO, linear precod-
ing algorithm can only support a few data streams, far fewer than what would be required
in practice. A simple approach to increasing user capacity is using OFDMA and running
the same algorithm on each of the subcarriers. With many users in the system, the first
step becomes choosing which users should be communicated with on each subcarrier.
In this chapter, we review a user selection algorithm presented in [39] for a multiuser
MIMO-OFDM system. We propose a few changes to the algorithm and demonstrate the
improvement in performance. We then discuss how users might be selected when the
hybrid algorithm is the single-carrier precoding algorithm of choice for the subcarriers in
a multi-BS, MIMO-OFDM system. We present a problem formulation and a simulation
exercise that helps in determining the number of edge and intra-cell users to select for a
given subcarrier.
4.1 System Model and Problem Statement
In this section, we will describe the system model of a single-BS multiuser MIMO-OFDM
system. The model is based on that presented in Chapter 2 for a single BS. It can be
easily extended to suit the hybrid algorithm presented in Chapter 3, as will be seen later.
Once the system model is established, the user selection problem will be stated in detail.
4.1.1 System Model
The basic idea behind OFDM is to divide a broadband frequency selective channel
into a set of overlapping, yet orthogonal, narrowband flat fading channels. This can
be achieved by the use of an IFFT block at the transmitter and an FFT block at the
receiver. Accordingly, a multiuser MIMO-OFDM system can treat each subcarrier as a
single-carrier multiuser MIMO system that is independent of the systems on the other
subcarriers. This of course comes at the cost of increased computations, because the
single-carrier algorithm has to be repeated for all subcarriers, and the number of subcar-
riers in an OFDM system, denoted here as Nc, can be very high. Readers are referred
to [39] for more details on computation reduction methods that are suitable when the
algorithm of [8] is the single-carrier algorithm of choice.
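
The claim that each subcarrier sees a flat channel can be checked numerically. The sketch
below assumes a cyclic prefix long enough to make the channel act as a circular
convolution on each OFDM block (a standard OFDM assumption that is not spelled out in
this section). It then verifies that, after the FFT at the receiver, every subcarrier output
equals the subcarrier input multiplied by a single complex channel coefficient. The tap
values and block contents are arbitrary illustrative numbers, and a plain DFT is used in
place of a fast FFT routine.

```python
import cmath

def dft(x):
    """Direct discrete Fourier transform of a list of numbers."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_convolve(h, x):
    """Channel taps h applied to one block x, with wrap-around (cyclic prefix effect)."""
    n = len(x)
    return [sum(h[l] * x[(t - l) % n] for l in range(len(h))) for t in range(n)]

n_c = 8                                    # number of subcarriers in this toy block
x_time = [1, -1, 1, 1, -1, 1, -1, -1]      # time-domain block (IFFT output at the transmitter)
h_taps = [1.0, 0.5, 0.25]                  # frequency selective channel: three taps

y_freq = dft(circular_convolve(h_taps, x_time))
h_freq = dft(h_taps + [0.0] * (n_c - len(h_taps)))   # one coefficient per subcarrier
x_freq = dft(x_time)

# On every subcarrier k: Y[k] = H[k] * X[k], i.e., a flat (single-tap) channel.
max_err = max(abs(y - h * x) for y, h, x in zip(y_freq, h_freq, x_freq))
print(max_err)
```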
The system model for the MIMO-OFDM system can be easily derived from the one
in Chapter 2 for a single BS by incorporating a subcarrier index n into all the variables¹.
The expressions of the estimated data vectors in the downlink and uplink, as well as the
SMSE in the uplink are given below. Note that the channel matrices below represent the
flat fading MIMO channel response on each subcarrier in the frequency domain.
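\[
\hat{\mathbf{x}}_k^{DL}(n) = \mathbf{V}_k^{H}(n) \mathbf{H}_k^{H}(n) \sum_{j=1}^{K} \mathbf{U}_j(n) \sqrt{\mathbf{P}_j(n)}\, \mathbf{x}_j(n) + \mathbf{V}_k^{H}(n) \mathbf{n}_k(n), \tag{4.1}
\]
\[
\hat{\mathbf{x}}_k^{UL}(n) = \sum_{j=1}^{K} \mathbf{U}_k^{H}(n) \mathbf{H}_j(n) \mathbf{V}_j(n) \sqrt{\mathbf{Q}_j(n)}\, \mathbf{x}_j(n) + \mathbf{U}_k^{H}(n) \mathbf{n}(n), \tag{4.2}
\]
\[
\mathrm{SMSE} = N_c (L - M) + \sigma^2 \sum_{n=1}^{N_c} \mathrm{tr}\left( \mathbf{J}(n)^{-1} \right), \tag{4.3}
\]
where
\[
\mathbf{J}(n) = \mathbf{H}(n) \mathbf{V}(n) \mathbf{Q}(n) \mathbf{V}^{H}(n) \mathbf{H}^{H}(n) + \sigma^2 \mathbf{I}_M, \tag{4.4}
\]
and the index n = 1, . . . , Nc.

¹ Note that the system model and the analysis in this chapter ignore important issues such
as phase noise and frequency offsets, which are beyond the scope of this work.
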
4.1.2 Problem Statement
Consider a system with one BS that needs to communicate with a total of Kt ≫ M users.
Clearly, since linear precoding is used, no more than M data streams, corresponding to
a maximum of M users (assuming Lk = 1) can be transmitted on each subcarrier. With
Kt ≫ M users in the system, it is not clear which users should be allocated to each
subcarrier. Note that when the user allocation on the subcarriers is known, minimizing
the SMSE can be done using the scheme in Table 2.1 [8]. The precoding and decoding
matrices on a given subcarrier are derived in exactly the same way using that subcarrier’s
frequency domain channel matrix. The power allocation is a convex problem since it is the
sum of the Nc separate convex power allocation problems of the Nc subcarriers. Therefore,
it can be optimally and easily performed in one step over all subcarriers. On the other
hand, when the user allocation on the subcarriers is not known, finding the optimal user
group for each subcarrier that minimizes the SMSE of the system is a very complicated
problem and the brute force approach is computationally prohibitive [39]. Thus, the
problem becomes finding a practical method, which determines a user allocation that
gives an acceptable performance. Note that user fairness is not addressed as it further
complicates an already complex problem.
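
The growth of the brute force search can be illustrated by counting the candidate user
groups on a single subcarrier. The sketch below simply evaluates the binomial coefficient;
the function name is hypothetical, the first case uses the Kt = 7, K = 4 setting simulated
later in this chapter, and the second case (Kt = 50) is an arbitrary larger system chosen
for illustration.

```python
import math

def user_groups_per_subcarrier(k_total, k_selected):
    """Number of distinct groups of k_selected users out of k_total users."""
    return math.comb(k_total, k_selected)

print(user_groups_per_subcarrier(7, 4))     # small system: search is feasible
print(user_groups_per_subcarrier(50, 4))    # larger system: many more candidates
```

Each candidate group requires its own SMSE minimization, and the count applies to every
subcarrier, which is why an exhaustive search quickly becomes impractical.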
4.2 Original Algorithm and Proposed Modifications
The following approach was proposed in [39] to address the problem at hand. Basically,
the iteration step of the multiuser algorithm in Table 2.1 is run on all Kt users on a
certain subcarrier. The top K streams that were allocated the most power are then
assigned to that subcarrier. The same procedure is repeated over all subcarriers.
In the following, we propose two modifications to the original approach. First, we
propose to choose the K users with the lowest individual MSEs instead of those with the
highest power allocation. This choice is intuitive as our ultimate goal is minimizing the
SMSE of the system. Moreover, some ‘good’ users can achieve a low MSE, and hence
contribute well to decreasing the SMSE, with low power. Such users are more likely to
be discarded by the original approach. Second, we propose to choose the K users by
successively discarding the worst Kd ≤ Kt − K users and repeating the iteration until
the number of remaining users, Kr, becomes K. Clearly, as Kd increases, the running
time decreases. Note that Kd can be changed for each iteration. The intuition behind
this modification is that the presence of a high number of ‘bad’ users (Kt − K ≫ K) can
affect which are the ‘best’ users. By gradually discarding ‘bad’ users, the choice of the
‘best’ users should logically improve. The modified approach is detailed in Table 4.1.
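
The successive discarding procedure can be sketched as follows. This is an outline of the
selection logic only: `per_user_mse` is a hypothetical callable standing in for the SMSE
minimization on the remaining users, and the toy MSE values are invented for the
example. In the actual scheme the individual MSEs change as users are removed, which is
the reason for recomputing them in every round.

```python
def select_users(users, k_target, discard_schedule, per_user_mse):
    """Discard the worst users round by round until k_target users remain.

    per_user_mse: hypothetical callable mapping the list of remaining users to a
    dict {user: individual MSE} after minimizing the SMSE for those users.
    """
    remaining = list(users)
    for k_d in discard_schedule:
        if len(remaining) <= k_target:
            break
        mses = per_user_mse(remaining)
        k_d = min(k_d, len(remaining) - k_target)     # never drop below k_target
        remaining.sort(key=lambda u: mses[u])         # lowest individual MSE first
        remaining = remaining[:len(remaining) - k_d]  # drop the k_d worst users
    return remaining

# Toy stand-in: fixed MSEs that ignore the coupling between users.
base_mse = {0: 0.1, 1: 0.5, 2: 0.2, 3: 0.9, 4: 0.3, 5: 0.8, 6: 0.4}
toy_mse = lambda rem: {u: base_mse[u] for u in rem}

print(sorted(select_users(range(7), 4, [2, 1], toy_mse)))   # gradual discarding
print(sorted(select_users(range(7), 4, [3], toy_mse)))      # all at once
```

With the fixed toy MSEs both schedules select the same users; a difference between them
can only appear when the MSEs are recomputed from the channels in each round.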
4.3 Simulation Results
The following simulation was performed to evaluate the new approach. Consider a system
with one BS with M = 4 antennas, and Kt = 7 total users, each with Nk = 1 antenna
and Lk = 1 data stream. Let the number of subcarriers be Nc = 1. Having one subcarrier
logically should not affect the results, since the subcarriers are assumed independent, and
the user allocation approach simply performs the same procedure over all subcarriers.
This is further supported by the fact that when our simulation was set to select users by
the approach presented in [39] (performed using Nc = 64), it yielded very close results.
The following four scenarios were simulated. In the first scenario, Kd = 3, i.e., the
SMSE minimization iteration ran only once, and the K = 4 top users were selected
according to the highest power criterion. In the second scenario, Kd = 3 as well, but
the K = 4 top users were selected according to the lowest MSE criterion. In the third
scenario, Kd = [2, 1], i.e., the SMSE minimization iteration ran twice, and the highest
power criterion was used. In the fourth scenario, Kd = [2, 1] as well, and the lowest MSE
criterion was used. The total number of users was chosen to be relatively low (Kt = 7)
in order to be able to perform a brute force search for the best user allocation. For each
scenario, the percentage of times that the K = 4 selected users were identical to those
4.3. SIMULATION RESULTS 70
Repeat for n = 1, ..., Nc:
  Set Kr = Kt.
  Repeat until Kr = K:
    1. Choose an appropriate value for Kd.
    2. Minimize SMSE for all Kt users.
       Initialization:
         $\mathbf{V}_k = \mathrm{SVD}(\mathbf{H}_k)$, $\mathbf{Q} = \frac{P_{\max}}{L K_t}\,\mathbf{I}_{L K_t}$
       Iteration:
         i. Find virtual uplink precoding vectors, for k = 1, ..., Kr, j = 1, ..., Lk:
            $\mathbf{v}_{kj}(n) = e_{\max}\left(\mathbf{H}_k^H(n)\,\mathbf{J}_{kj}^{-2}\,\mathbf{H}_k(n),\; \mathbf{I}/q_{kj}(n) + \mathbf{H}_k^H(n)\,\mathbf{J}_{kj}^{-1}\,\mathbf{H}_k(n)\right)$
            where $e_{\max}$ returns the normalized eigenvector with the highest eigenvalue.
         ii. Find virtual uplink power allocation to minimize SMSE:
            $\mathbf{q}(n) = \arg\min_{\mathbf{q}(n)} \mathrm{tr}\left(\mathbf{J}^{-1}(n)\right)$, subject to $q_{kj}(n) > 0$, $\|\mathbf{q}(n)\| \le P_{\max}$
         iii. Repeat iteration until old SMSE − new SMSE < ε.
    3. Discard the Kd users with the highest individual MSEs.
Table 4.1: Modified user selection algorithm
[Plot: percentage match to optimal allocation versus 10 log(Pmax/σ2). Curves: Scenario 1, Kd = 3, highest power criterion; Scenario 2, Kd = 3, lowest MSE criterion; Scenario 3, Kd = [2, 1], highest power criterion; Scenario 4, Kd = [2, 1], lowest MSE criterion.]
Figure 4.1: Four different scenarios for user selection
selected by the brute force search over all C(7, 4) = 35 combinations was found at different SNRs.
Figure 4.1 shows the results.
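For a system this small, the brute-force reference can be enumerated directly. The sketch below is illustrative only: it assumes single-antenna users and scores each candidate subset by its sum MSE under an equal power split, SMSE = K − M + σ² tr(J^{-1}), rather than by the full SMSE minimization. The function names `sum_mse`, `best_subset`, and `match_rate` are hypothetical.

```python
from itertools import combinations

import numpy as np


def sum_mse(H, p_max, sigma2):
    """Sum MSE of an MMSE virtual uplink with equal power per user (assumption)."""
    M, K = H.shape
    J = (p_max / K) * (H @ H.conj().T) + sigma2 * np.eye(M)
    return K - M + sigma2 * np.real(np.trace(np.linalg.inv(J)))


def best_subset(H, K, p_max, sigma2):
    """Exhaustive search over all C(Kt, K) user subsets."""
    subsets = combinations(range(H.shape[1]), K)
    return min(subsets, key=lambda s: sum_mse(H[:, list(s)], p_max, sigma2))


def match_rate(selected, optimal):
    """Percentage of trials in which the selected set equals the optimal set."""
    hits = sum(set(s) == set(o) for s, o in zip(selected, optimal))
    return 100.0 * hits / len(optimal)
```

The exhaustive search grows combinatorially with Kt, which is why the total number of users is kept low in this comparison.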
The first observation is the vast improvement achieved by the lowest MSE criterion, which
clearly outperforms the highest power criterion, especially at high SNRs.
The second observation is the improvement achieved by discarding the ‘bad’
users gradually rather than all at once. This improvement is achieved independent of
which criterion is used. Finally, the third observation is that the lowest MSE criterion
depends less on the SNR than the highest power criterion does.
4.4 MIMO-OFDM and the Hybrid Algorithm
Before discussing how users can be chosen for the hybrid algorithm, we will present how
it can be used in conjunction with OFDM. As described in Chapter 3, the hybrid algo-
rithm cooperates only for edge users to avoid the problems associated with asynchronous
interference. Therefore, OFDM can be used and the hybrid algorithm can simply be run
on each subcarrier. Thus, the system model for the MIMO-OFDM system can be easily
derived from the one in Chapter 3 for the hybrid algorithm by adding a subcarrier
index n to all the variables. The expressions for the estimated data vectors of intra-cell
users in the downlink and uplink for b = 1, 2, the estimated data vectors of edge users
in the downlink after interference has been canceled, as well as the SMSE in the uplink
are given below. Note that the channel matrices below represent the flat fading MIMO
channel response on each subcarrier in the frequency domain.
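$$\hat{\mathbf{x}}_k^{DL}(n) = \mathbf{V}_k^H(n)\,\mathbf{H}_k^{(b)H}(n) \sum_{j\in\Omega_b,\Omega_e} \mathbf{U}_j^{(b)}(n)\sqrt{\mathbf{P}_j(n)}\,\mathbf{x}_j(n) + \mathbf{V}_k^H(n)\,\mathbf{n}_k(n), \tag{4.5}$$
$$\hat{\mathbf{x}}_k^{UL}(n) = \mathbf{U}_k^H(n) \sum_{j\in\Omega_b,\Omega_e} \mathbf{H}_j^{(b)}(n)\,\mathbf{V}_j(n)\sqrt{\mathbf{Q}_j(n)}\,\mathbf{x}_j(n) + \mathbf{U}_k^H(n)\,\mathbf{n}^{(b)}(n), \tag{4.6}$$
$$\hat{\mathbf{d}}_e(n) = \left(\mathbf{V}_e^H(n)\,\mathbf{H}_e^H(n)\,\mathbf{U}_e(n)\sqrt{\mathbf{P}_e(n)}\,\mathbf{v}_e(n) + \mathbf{V}_e^H(n)\,\mathbf{n}_e(n)\right) \bmod 2M_{\mathrm{const}}, \tag{4.7}$$
$$\mathrm{SMSE} = \sum_{n=1}^{N_c}\left[ L(n) - 4M + \mathrm{tr}\left(\mathbf{J}^{(1)-1}(n)\right) + \mathrm{tr}\left(\mathbf{J}_e^{-1}(n)\right) + \mathrm{tr}\left(\mathbf{J}^{(2)-1}(n)\right)\right], \tag{4.8}$$
where
$$\mathbf{J}^{(1)}(n) = \mathbf{H}_{in}^{(1)}(n)\,\mathbf{V}_{in}^{(1)}(n)\,\mathbf{Q}_{in}^{(1)}(n)\,\mathbf{V}_{in}^{(1)H}(n)\,\mathbf{H}_{in}^{(1)H}(n) + \sigma^2\mathbf{I}_M, \tag{4.9}$$
$$\mathbf{J}_e(n) = \mathbf{H}_e(n)\,\mathbf{V}_e(n)\,\mathbf{Q}_e(n)\,\mathbf{V}_e^H(n)\,\mathbf{H}_e^H(n) + \sigma^2\mathbf{I}_{2M}, \tag{4.10}$$
$$\mathbf{J}^{(2)}(n) = \mathbf{H}_{in}^{(2)}(n)\,\mathbf{V}_{in}^{(2)}(n)\,\mathbf{Q}_{in}^{(2)}(n)\,\mathbf{V}_{in}^{(2)H}(n)\,\mathbf{H}_{in}^{(2)H}(n) + \sigma^2\mathbf{I}_M. \tag{4.11}$$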
To run the hybrid algorithm for each subcarrier, the steps presented in Section 3.2.3 are
repeated over all subcarrier indices; however, the power allocation step is run once per
iteration over all subcarriers. This procedure is presented below. As seen for the hybrid
algorithm, the power allocation problem for minimizing the SMSE on one subcarrier is
a convex problem; therefore, the power allocation problem to minimize the sum of the
SMSEs over all the subcarriers is also a convex problem. As mentioned previously in
Section 3.2.3, the uplink precoding matrices (the V matrices) are initialized to the right
singular vectors of the channel matrices. The uplink power matrices (the Q matrices) are
initialized by dividing the available power equally among all data streams. The modified
channel matrices are initialized by adjusting the true channel matrices using the initial
power and precoding matrices.
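The initialization of the V and Q matrices can be sketched as follows. The sketch assumes that each channel matrix H_k is stored as an M x N_k array, so that its right singular vectors have the dimension of the user's antenna array; the function name `init_uplink` and the list-based layout are hypothetical.

```python
import numpy as np


def init_uplink(channels, streams, p_total):
    """Initial uplink precoders V_k and equal-power matrices Q_k.

    channels[k] is the M x N_k matrix H_k; streams[k] is L_k <= min(M, N_k).
    """
    V = []
    for H, L in zip(channels, streams):
        _, _, Vh = np.linalg.svd(H)          # H = U diag(s) Vh, s in descending order
        V.append(Vh.conj().T[:, :L])         # L strongest right singular vectors
    q = p_total / sum(streams)               # equal share per data stream
    Q = [q * np.eye(L) for L in streams]
    return V, Q
```

For the OFDM case, the same call would be made with the channels and streams of all subcarriers, so that the available power is split over every data stream in the system.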
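Iteration:
1. Solve the following convex power allocation problem.
$$\left[\mathbf{Q}^{(1)}(n), \mathbf{Q}_e(n), \mathbf{Q}^{(2)}(n)\right]_{n=1,\dots,N_c} = \arg\min_{\substack{\mathbf{Q}^{(1)}(n),\,\mathbf{Q}_e(n),\,\mathbf{Q}^{(2)}(n)\\ n=1,\dots,N_c}} \sum_{n=1}^{N_c} \left\{ \mathrm{tr}\left[\mathbf{J}^{(1)-1}(n)\right] + \mathrm{tr}\left[\mathbf{J}_e^{-1}(n)\right] + \mathrm{tr}\left[\mathbf{J}^{(2)-1}(n)\right] \right\}$$
$$\text{s.t.}\quad \sum_{n=1}^{N_c} \left\{ \mathrm{tr}\left[\mathbf{Q}^{(1)}(n)\right] + \mathrm{tr}\left[\mathbf{Q}_e(n)\right] + \mathrm{tr}\left[\mathbf{Q}^{(2)}(n)\right] \right\} \le P_{tot}$$
Repeat steps 2 to 5 for n = 1, . . . , Nc:
2. Find new V matrices for edge users.
$$\mathbf{v}_{kj}(n) = e_{\max}\left(\mathbf{H}_{e,k}^H(n)\,\mathbf{J}_{e,kj}^{-2}(n)\,\mathbf{H}_{e,k}(n),\; \mathbf{I}/Q_{e,kj}(n) + \mathbf{H}_{e,k}^H(n)\,\mathbf{J}_{e,kj}^{-1}(n)\,\mathbf{H}_{e,k}(n)\right)$$
N.B.: $e_{\max}$ returns the eigenvector with the highest eigenvalue.
3. Find the U matrices with normalized columns for all edge users and the global Pe
matrix.
$$\mathbf{U}_{e,k}(n) = \mathbf{J}_e^{-1}(n)\,\mathbf{H}_{e,k}(n)\,\mathbf{V}_{e,k}(n)\sqrt{\mathbf{Q}_{e,k}(n)}, \qquad \mathbf{u}_{e,kj}(n) \leftarrow \mathbf{u}_{e,kj}(n)/\|\mathbf{u}_{e,kj}(n)\|$$
$$\mathbf{P}_e(n) = \tfrac{3}{4}\,\sigma^2\,\mathrm{diag}\left[\left(\mathbf{D}_e^{-1}(n) - \boldsymbol{\Psi}_e(n)\right)^{-1}\mathbf{1}\right]$$
N.B.: The factor 3/4 is included to account for the increased average power of xe(n)
due to THP as mentioned previously in Section 3.2.1.
4. Find Rz,k(n) as in Eqn. (3.7) or Eqn. (3.9) and update Hk(n) for all users in BSs
1 and 2.
$$\mathbf{H}_k^{(b)}(n) \leftarrow \mathbf{H}_k^{(b)}(n)\,\mathbf{R}_{z,k}^{-\frac{1}{2}}(n)$$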
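5. Find new V matrices for intra-cell users, b = 1, 2.
$$\mathbf{v}_{kj}(n) = e_{\max}\left(\mathbf{H}_k^{(b)H}(n)\,\mathbf{J}_{kj}^{(b)-2}(n)\,\mathbf{H}_k^{(b)}(n),\; \mathbf{I}/Q_{kj}^{(b)}(n) + \mathbf{H}_k^{(b)H}(n)\,\mathbf{J}_{kj}^{(b)-1}(n)\,\mathbf{H}_k^{(b)}(n)\right)$$
6. Repeat steps 1 to 5 above, until the relative change in SMSE is within a certain
threshold, or until SMSE shows an increase at any step.
Update:
Repeat steps 1 and 2 for n = 1, . . . , Nc:
1. Find the U matrices with normalized columns for all intra-cell users and their global
power allocation matrices, b = 1, 2.
$$\mathbf{U}_k^{(b)}(n) = \mathbf{J}^{(b)-1}(n)\,\mathbf{H}_k^{(b)}(n)\,\mathbf{V}_k^{(b)}(n)\sqrt{\mathbf{Q}_k^{(b)}(n)}, \qquad \mathbf{u}_{kj}^{(b)}(n) \leftarrow \mathbf{u}_{kj}^{(b)}(n)/\|\mathbf{u}_{kj}^{(b)}(n)\|$$
$$\mathbf{P}^{(b)}(n) = \sigma^2\,\mathrm{diag}\left[\left(\mathbf{D}^{(b)-1}(n) - \boldsymbol{\Psi}^{UL\,(b)T}(n)\right)^{-1}\mathbf{1}\right]$$
2. Find Ge(n) as given in Eqn. (3.5).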
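The precoder updates in steps 2 and 5 rely on the operator emax. A minimal sketch of one way to evaluate it is given below, assuming that emax(A, B) denotes the principal generalized eigenvector of the pair (A, B), i.e. the unit-norm v that solves A v = λ B v for the largest λ. This is one reading of the notation; the text only states that emax returns the eigenvector with the highest eigenvalue. The function name `e_max` is hypothetical, and the routine requires A Hermitian and B Hermitian positive definite.

```python
import numpy as np


def e_max(A, B):
    """Unit-norm v maximizing (v^H A v) / (v^H B v).

    A must be Hermitian and B Hermitian positive definite.
    """
    Lc = np.linalg.cholesky(B)                      # B = Lc Lc^H
    X = np.linalg.solve(Lc, A)                      # Lc^{-1} A
    C = np.linalg.solve(Lc, X.conj().T)             # Lc^{-1} A Lc^{-H} (A is Hermitian)
    C = 0.5 * (C + C.conj().T)                      # remove rounding asymmetry
    w = np.linalg.eigh(C)[1][:, -1]                 # eigenvector of the largest eigenvalue
    v = np.linalg.solve(Lc.conj().T, w)             # back-substitute v = Lc^{-H} w
    return v / np.linalg.norm(v)
```

The Cholesky reduction keeps the problem Hermitian, so `numpy.linalg.eigh` can be used and the eigenvalues come back real and sorted in ascending order.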
4.5 User Selection for the Hybrid Algorithm
In Section 4.1.2, we mentioned that in a single-BS system it is clear that the maximum
number of data streams that can be transmitted on a single subcarrier in the downlink is
equal to the number of BS transmit antennas, M , since linear precoding is used. When
considering the hybrid algorithm, the theoretical maximum is not as clear; however,
logically, the maximum number of edge data streams (i.e. data streams of edge users)
should be equal to the sum of the number of antennas of the two BSs cooperating to
communicate with the edge users. This follows from the fact that the hybrid algorithm
treats the edge users as a separate group that employs linear precoding. As for the
intra-cell users, since the hybrid algorithm also treats them as separate groups that use
linear precoding, the maximum number of intra-cell data streams is equal to the number
of transmit antennas of the corresponding BS. Assuming that both BSs have the same
number of antennas M , the maximum number of edge data streams would be 2M , and
that of the intra-cell data streams would be M . Figure 4.2 shows the performance of the
hybrid algorithm for B = 2,M = 4, K1 = K2 = 4, Ke = 8, Nk = 1, and Lk = 1.
[Plot: BER versus 10 log(Pmax/σ2). Curves: K1 = K2 = 1, Ke = 4; K1 = K2 = 4, Ke = 8.]
Figure 4.2: Performance of hybrid algorithm for K1 = K2 = 4 and Ke = 8
As we can see in the figure, it is possible to communicate with 4 intra-cell users in
each cell and 8 edge users, and many users are serviced simultaneously. On the other
hand, the average performance per stream is poor, even at high SNRs. Since even
simple applications require a better BER than what was achieved, such a data stream
distribution is not desirable. Consequently, we propose to simulate the performance over
several data stream distributions, and empirically determine an appropriate choice. Note
that, as mentioned previously, in a realistic scenario, there would be a very high number of
users, of which few are allocated on a single subcarrier. Therefore, before proceeding with
the data stream simulation exercise, a method to select users for the hybrid algorithm
is necessary. One intuitive option would be running the modified selection algorithm
described in Section 4.2 on all the users, and then selecting a certain number of best
users for each user group. An example is shown in Figure 4.3 for a total of ten users:
three intra-cell users in each cell and four edge users. Another option would be running
the modified algorithm after replacing the SMSE minimizing iteration by that of the
hybrid algorithm. To make the proper choice, both options were simulated to select 1
intra-cell user for each cell and 3 edge users from a total of 4 intra-cell users per cell
and 8 edge users. Note that Kd was set such that the iteration takes place only once.
The results are shown in Figure 4.4. The selection algorithm as presented in Section 4.2
performs slightly better than using the iteration from the hybrid algorithm.
Figure 4.3: User selection method for the hybrid algorithm
Based on the above result, we present some sample curves in Figures 4.5 and 4.6
obtained by selecting different numbers of intra-cell and edge users (same as selecting
data streams, since Lk = 1) from a total of 4 intra-cell users per cell and 8 edge users.
In general, such simulations should be randomized over the numbers of total users,
selected users, BS antennas, user antennas, user positions, and channels. Obviously, this
requires a lot of time, but it can be performed once, and the statistics can be stored in
look-up tables for fast access during real-time operation. A drawback of such a scheme
is the lack of individualization, since the BER used is an average value over all the data
streams.
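A look-up of this kind could be organized as sketched below. The table layout (arrangement keyed by the number of intra-cell users per cell and the number of edge users, each mapping SNR in dB to an average BER) and the function name `pick_arrangement` are assumptions made for illustration; the BER values themselves would come from the stored simulation statistics.

```python
def pick_arrangement(ber_table, snr_db, target_ber):
    """Return the arrangement serving the most streams within the BER target.

    ber_table maps (intra_per_cell, edge_users) -> {snr_db: average BER}.
    Returns None when no stored arrangement meets the target at that SNR.
    """
    feasible = [
        arr for arr, curve in ber_table.items()
        if snr_db in curve and curve[snr_db] <= target_ber
    ]
    # Total streams with Lk = 1: two cells of intra-cell users plus the edge users.
    return max(feasible, key=lambda a: 2 * a[0] + a[1], default=None)
```

Because the stored BER is an average over all data streams, a rule like this cannot guarantee the error rate of any individual stream.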
4.6 Summary
In this chapter, we reviewed the MIMO-OFDM user selection algorithm proposed in [39]
and made two modifications to it. In the first, we proposed to rank the users according to
[Plot: BER versus 10 log(Pmax/σ2). Curves: selection using the modified selection algorithm directly; selection using the modified selection algorithm with the iteration of the hybrid algorithm.]
Figure 4.4: Two different user selection methods for the hybrid algorithm
their individual MSEs instead of their allocated powers. In the second, we proposed to
knock out ‘bad’ users gradually, instead of choosing the ‘good’ users all at once. Simula-
tions showed the significant improvement achieved after applying these two modifications.
We then presented several simulations related to user selection for the hybrid algorithm.
In the first, we demonstrated the effect of increasing the number of users on the perfor-
mance of the hybrid algorithm. In the second, we explored which of two possible ways
is better for user selection for the hybrid algorithm. In the third, we presented sample
curves that can be used to determine how many users should be allocated on each
subcarrier when the hybrid algorithm is used with OFDM, based on the available power
and the average required error rate per data stream.
[Plot: BER versus 10 log(Pmax/σ2). Curves: 121, 131, 141, and 151 arrangements.]
Figure 4.5: Varying number of edge users for 1 intra-cell user per cell (1f1 refers to f edge users)
[Plot: BER versus 10 log(Pmax/σ2). Curves: 222, 232, 242, and 252 arrangements.]
Figure 4.6: Varying number of edge users for 2 intra-cell users per cell (2f2 refers to f edge users)
Chapter 5
Conclusions and Future Work
5.1 Conclusions
In this thesis, we considered the downlink of a wireless cellular system with multiple
antennas at the transmitters (BSs) and receivers (users). Our goal was to provide insight
on how communication in such a system can be performed in order to take advantage of
the multiple antennas and the fact that the entire bandwidth is being used in all the cells.
Accordingly, we have taken into consideration both positive and negative implications,
namely the possibility of cooperation between the multiple BSs and the resulting inter-
cell interference, respectively. We studied how the BSs can communicate with the users
assuming the availability of complete CSI at the BSs. Despite the advantages that can be
gained, many implementation issues should be addressed before a practical and efficient
deployment can be made. Overall, the contributions of this thesis were:
• Developing a downlink/uplink duality in a single-carrier, multi-BS, MIMO system
with asynchronous interference.
• Proposing a multiuser, multi-BS, linear precoding algorithm that makes use of
downlink/uplink duality to minimize the SMSE of the system while accounting for
asynchronous interference.
• Proposing a cooperative hybrid algorithm that combines linear precoding and non-
linear THP and minimizes the SMSE of the system. This algorithm helps ad-
dress the difficulties associated with employing BS cooperation in conjunction with
OFDM by having two BSs cooperate only for edge users that are almost equally
distant from them.
• Modifying an existing user selection algorithm for MIMO-OFDM to enhance its
performance and suggesting how it can be used to select users when the hybrid
algorithm is used along with OFDM.
In Chapter 2, we presented a detailed model for the system described above that takes
into account the required timing advances for synchronous reception at the users, based
on the model in [20]. We provided two possible virtual uplink models, again with timing
details, and showed that, with the appropriate choice of an uplink model, a duality exists
between the downlink and uplink, despite the presence of asynchronous interference. This
proof generalizes the downlink/uplink duality to the multiuser, multi-BS, MIMO case.
With duality in hand, we extended an existing single-BS linear precoding algorithm [8]
based on the downlink/uplink duality, to accommodate the presence of multiple BSs and
asynchronous interference. In our case, the power allocation step was not provably con-
vex; however, we provided simulations that suggested that the SMSE was still convex in
the powers of the data streams. Simulations showed the performance of the extension and
compared it to the performance assuming all reception was synchronous. This quantified
the loss incurred in the presence of asynchronous interference.
We briefly discussed the complications that arise when multiple BSs attempt to com-
municate with the users through OFDM, and noted that one may study the effect of
asynchronous interference on the OFDM symbols and code against that. However, we
chose to make the BSs cooperate only when communicating with edge users, first, be-
cause those users are in need of more help than intra-cell users given they are on the
edge and suffer from inter-cell interference more, and second because the effects of asyn-
chronism are limited at edge users and the designed algorithms can be easily extended
to OFDM. In Chapter 3, we proposed an algorithm for two cooperating BSs that is a
hybrid between linear and non-linear precoding. Non-linear THP is used to pre-subtract
interference that edge users receive from intra-cell users. Intra-cell users, on the other
hand, treat the interference they see from transmissions to edge users as colored noise
and whiten it. Accordingly, each user group is treated separately when deriving the
precoding and decoding matrices used internally to protect against MUI. However, the
convex power allocation is done globally, and the algorithm attempts to minimize SMSE
of the entire system. We demonstrated that this approach performs better than sim-
ple linear precoding algorithms with the same amount of cooperation. When nonlinear
THP is used, the complicated user ordering problem often arises. In our case, this is not
an issue, since the pre-subtraction happens among only two groups and hence only two
possible orders exist and both were considered. When the transmission to the intra-cell
users is pre-subtracted from that to the edge users, the hybrid algorithm is found to per-
form better. In other simulations, we showed that the users can estimate the decoding
matrices and the whitening filters, which accordingly need not be communicated from
the BSs to the users in any manner. Finally, we provided a brief discussion regarding
the implementation issues for the hybrid algorithm. While the physical complications
might outweigh the achieved performance and hinder the practical deployment of such a
scheme, this work was an exploration of one of many possible scenarios for BS cooper-
ation, and helped demonstrate the challenge involved in designing cooperative systems
and underline the issue of asynchronous interference.
In Chapter 4, we considered the problem of user selection for the subcarriers in a
MIMO-OFDM system. We started out with the user selection algorithm presented in [39]
for the MIMO-SMSE algorithm of [8] and modified it in two ways, namely selecting
users with the lowest individual MSEs and dropping ‘bad’ users instead of picking ‘good’
users. We demonstrated the improvements achieved by the two modifications. Finally, we
provided results of a simulation exercise that suggested how many users should be placed
on one subcarrier for the hybrid algorithm and how to choose these users. Obviously, this
is not an optimal solution. The high number of users and subcarriers, varying optimality
criteria (maximizing data rates, minimizing error rates, meeting quality-of-service (QoS)
requirements, etc.), and fairness issues complicate matters considerably. Therefore, the
methodology for a solution is neither simple nor clear, and optimal user selection for
MIMO-OFDM remains an open problem.
5.2 Future Work
Several interesting problems encountered while working on this thesis can be explored
further. First, as mentioned previously in Section 2.6, one can study the effect of asyn-
chronous interference as presented in this work on OFDM. Exploring how the misalign-
ment of the pulse shapes of the arriving OFDM symbols with the matched filters at the
receivers affects the actual data symbols can help in the design of linear precoding to
guard against asynchronous interference in OFDM.
Second, it would be useful and more practical to solve similar problems to those
tackled in Chapter 2 and Chapter 3 with per-BS power constraints. It is interesting
to mention that the discussion provided by Schubert and Boche in [6] that shows that
maximizing the minimum SINR with a total power constraint leads to the same SINR-
to-target ratio across all users does not necessarily extend to a multi-BS case with per-BS
power constraints. Investigating such a scenario can give more insight into the design of
precoding schemes for multi-BS systems with per-BS power constraints.
Third, addressing the implementation issues described in Section 3.5 is important,
especially determining how partial CSI affects the performance of an algorithm that
minimizes SMSE. This includes determining how to run the algorithm when other forms
of CSI are available, for example, the channel covariance matrices instead of the exact
channel values. This also includes studying the effect of discrete, quantized CSI values
on the performance. Both of the above points were briefly discussed in [8] for the MIMO-
SMSE algorithm it proposed. The implementation issues also include the phase noise
and frequency offset associated with OFDM. Treating these problems is crucial for the
effective application of OFDM, which is an essential component of the system we are
dealing with.
Fourth, since no convincing algorithm for MIMO-OFDM user selection was presented,
this area needs further exploration. The focus should be devising an algorithm that per-
forms joint optimal user selection according to individual user or data stream QoS re-
quirements (possibly through MSE, BER, or data rate constraints) and power constraints.
For example, the user selection process can be integrated into the SMSE minimization
algorithm by using different weights for the contribution of different users and restricting
Figure 5.1: Extension of hybrid algorithm to 3 BSs
those weights to the integers 0 and 1 (integer programming).
Fifth, alternative ways to battle asynchronous interference may be explored. For
example, the RAKE receiver, common in CDMA systems, can receive delayed versions
of the same signal and combine them. It assumes that the different versions arrive at
integer multiples of the chip period. However, the chip period in CDMA systems is much
shorter than the symbol period assumed in this work, and integer multiples of it may be
used to approximate the continuous delay values. Note that the RAKE receiver combines
multiple, hopefully independent, versions of the same signal to achieve diversity, while
in our case it can combine the different signals from different BSs intended for one user
to achieve diversity.
Finally, one can study the possibility of extending the hybrid algorithm to three BSs.
Directional antennas can be used to provide cell sectorization, with 120° sectors, in order
to use the hybrid algorithm throughout the system. Figure 5.1 shows this setup, with
1 BS, 2 BSs, or 3 BSs communicating with users in the light area, slightly darker area,
or darkest area, respectively. Note that if 60° sectors were used (at the cost of more
hardware), then the hybrid algorithm could be used as is for each pair of adjacent sectors in
different cells.
Appendix A
Equating MSEs
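We generalize the MSE duality for single-carrier MIMO systems with multiple cooperating
BSs and asynchronous interference. Let $\tilde{\mathbf{v}}_{kl}$ and $\tilde{\mathbf{u}}_{kl}$ be the MMSE decoding vectors
for data stream l of user k in the downlink and uplink, respectively. From Eqn. (2.65)
and Eqn. (2.61),
$$\tilde{\mathbf{v}}_{kl} = \left[\mathbf{H}_k^H\mathbf{U}_k\mathbf{P}_k\mathbf{U}_k^H\mathbf{H}_k + \mathbf{H}_k^H \sum_{\substack{c=1\\ c\neq k}}^{K} \mathbf{B}_{ck}\,\mathbf{H}_k + \sigma^2\mathbf{I}_{N_k}\right]^{-1} \mathbf{H}_k^H\mathbf{u}_{kl}\sqrt{p_{kl}}, \tag{A.1}$$
$$\tilde{\mathbf{u}}_{kl} = \left[\mathbf{H}_k\mathbf{V}_k\mathbf{Q}_k\mathbf{V}_k^H\mathbf{H}_k^H + \sum_{\substack{c=1\\ c\neq k}}^{K} \mathbf{A}_{ck} + \sigma^2\mathbf{I}_{N_k}\right]^{-1} \mathbf{H}_k\mathbf{v}_{kl}\sqrt{q_{kl}}. \tag{A.2}$$
Let $\mathbf{v}_{kl} = \tilde{\mathbf{v}}_{kl}/\|\tilde{\mathbf{v}}_{kl}\|$ and $\mathbf{u}_{kl} = \tilde{\mathbf{u}}_{kl}/\|\tilde{\mathbf{u}}_{kl}\|$. From Eqn. (2.58), the MSE of data stream l
of user k in the uplink is
$$\varepsilon_{kl}^{UL} = \tilde{\mathbf{u}}_{kl}^H\mathbf{H}_k\mathbf{V}_k\mathbf{Q}_k\mathbf{V}_k^H\mathbf{H}_k^H\tilde{\mathbf{u}}_{kl} + \tilde{\mathbf{u}}_{kl}^H \sum_{\substack{c=1\\ c\neq k}}^{K} \mathbf{A}_{ck}\,\tilde{\mathbf{u}}_{kl} + \sigma^2\tilde{\mathbf{u}}_{kl}^H\tilde{\mathbf{u}}_{kl} - \sqrt{q_{kl}}\,\mathbf{v}_{kl}^H\mathbf{H}_k^H\tilde{\mathbf{u}}_{kl} - \tilde{\mathbf{u}}_{kl}^H\mathbf{H}_k\mathbf{v}_{kl}\sqrt{q_{kl}} + 1. \tag{A.3}$$
In a similar derivation to Eqn. (2.58), but using xDLk in Eqn. (2.13), the MSE of data
stream l of user k in the downlink is
$$\varepsilon_{kl}^{DL} = \tilde{\mathbf{v}}_{kl}^H\mathbf{H}_k^H\mathbf{U}_k\mathbf{P}_k\mathbf{U}_k^H\mathbf{H}_k\tilde{\mathbf{v}}_{kl} + \tilde{\mathbf{v}}_{kl}^H\mathbf{H}_k^H \sum_{\substack{c=1\\ c\neq k}}^{K} \mathbf{B}_{ck}\,\mathbf{H}_k\tilde{\mathbf{v}}_{kl} + \sigma^2\tilde{\mathbf{v}}_{kl}^H\tilde{\mathbf{v}}_{kl} - \sqrt{p_{kl}}\,\mathbf{u}_{kl}^H\mathbf{H}_k\tilde{\mathbf{v}}_{kl} - \tilde{\mathbf{v}}_{kl}^H\mathbf{H}_k^H\mathbf{u}_{kl}\sqrt{p_{kl}} + 1. \tag{A.4}$$
Setting $\mathrm{SINR}_{kl}^{DL} = \mathrm{SINR}_{kl}^{UL}$, and with simple mathematical manipulations, we get
$$\frac{\|\tilde{\mathbf{u}}_{kl}\|^2}{p_{kl}}\left[\tilde{\mathbf{v}}_{kl}^H\mathbf{H}_k^H\mathbf{U}_k\mathbf{P}_k\mathbf{U}_k^H\mathbf{H}_k\tilde{\mathbf{v}}_{kl} + \tilde{\mathbf{v}}_{kl}^H\mathbf{H}_k^H \sum_{\substack{c=1\\ c\neq k}}^{K} \mathbf{B}_{ck}\,\mathbf{H}_k\tilde{\mathbf{v}}_{kl} + \sigma^2\tilde{\mathbf{v}}_{kl}^H\tilde{\mathbf{v}}_{kl}\right] = \frac{\|\tilde{\mathbf{v}}_{kl}\|^2}{q_{kl}}\left[\tilde{\mathbf{u}}_{kl}^H\mathbf{H}_k\mathbf{V}_k\mathbf{Q}_k\mathbf{V}_k^H\mathbf{H}_k^H\tilde{\mathbf{u}}_{kl} + \tilde{\mathbf{u}}_{kl}^H \sum_{\substack{c=1\\ c\neq k}}^{K} \mathbf{A}_{ck}\,\tilde{\mathbf{u}}_{kl} + \sigma^2\tilde{\mathbf{u}}_{kl}^H\tilde{\mathbf{u}}_{kl}\right]. \tag{A.5}$$
Assuming $\|\tilde{\mathbf{v}}_{kl}\| = \alpha/\sqrt{p_{kl}}$ and $\|\tilde{\mathbf{u}}_{kl}\| = \alpha/\sqrt{q_{kl}}$, where α is a constant, Eqn. (A.5)
simplifies to
$$\tilde{\mathbf{v}}_{kl}^H\mathbf{H}_k^H\mathbf{U}_k\mathbf{P}_k\mathbf{U}_k^H\mathbf{H}_k\tilde{\mathbf{v}}_{kl} + \tilde{\mathbf{v}}_{kl}^H\mathbf{H}_k^H \sum_{\substack{c=1\\ c\neq k}}^{K} \mathbf{B}_{ck}\,\mathbf{H}_k\tilde{\mathbf{v}}_{kl} + \sigma^2\tilde{\mathbf{v}}_{kl}^H\tilde{\mathbf{v}}_{kl} = \tilde{\mathbf{u}}_{kl}^H\mathbf{H}_k\mathbf{V}_k\mathbf{Q}_k\mathbf{V}_k^H\mathbf{H}_k^H\tilde{\mathbf{u}}_{kl} + \tilde{\mathbf{u}}_{kl}^H \sum_{\substack{c=1\\ c\neq k}}^{K} \mathbf{A}_{ck}\,\tilde{\mathbf{u}}_{kl} + \sigma^2\tilde{\mathbf{u}}_{kl}^H\tilde{\mathbf{u}}_{kl}, \tag{A.6}$$
and we get
$$\varepsilon_{kl}^{DL} = \varepsilon_{kl}^{UL}. \tag{A.7}$$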
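The step from Eqn. (A.6) to Eqn. (A.7) also requires the cross terms of Eqn. (A.3) and Eqn. (A.4) to agree. A short sketch of that step follows, writing $\tilde{\mathbf{v}}_{kl}$ and $\tilde{\mathbf{u}}_{kl}$ for the unnormalized MMSE decoding vectors and $\mathbf{v}_{kl}$, $\mathbf{u}_{kl}$ for their unit-norm versions (the tilde notation is an assumption made here for clarity), and using the norms $\|\tilde{\mathbf{v}}_{kl}\| = \alpha/\sqrt{p_{kl}}$ and $\|\tilde{\mathbf{u}}_{kl}\| = \alpha/\sqrt{q_{kl}}$ assumed in the appendix:

```latex
\begin{align*}
\sqrt{p_{kl}}\,\mathbf{u}_{kl}^H \mathbf{H}_k \tilde{\mathbf{v}}_{kl}
  &= \frac{\sqrt{p_{kl}}}{\|\tilde{\mathbf{u}}_{kl}\|}\,
     \tilde{\mathbf{u}}_{kl}^H \mathbf{H}_k \tilde{\mathbf{v}}_{kl}
   = \frac{\sqrt{p_{kl}\,q_{kl}}}{\alpha}\,
     \tilde{\mathbf{u}}_{kl}^H \mathbf{H}_k \tilde{\mathbf{v}}_{kl}, \\
\sqrt{q_{kl}}\,\tilde{\mathbf{u}}_{kl}^H \mathbf{H}_k \mathbf{v}_{kl}
  &= \frac{\sqrt{q_{kl}}}{\|\tilde{\mathbf{v}}_{kl}\|}\,
     \tilde{\mathbf{u}}_{kl}^H \mathbf{H}_k \tilde{\mathbf{v}}_{kl}
   = \frac{\sqrt{p_{kl}\,q_{kl}}}{\alpha}\,
     \tilde{\mathbf{u}}_{kl}^H \mathbf{H}_k \tilde{\mathbf{v}}_{kl}.
\end{align*}
```

The downlink and uplink cross terms are therefore equal, and the same holds for their conjugates, so equal quadratic terms imply equal MSEs.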
Appendix B
Derivations for Asynchronous
Interference
We explain the expressions of the asynchronous interference in the downlink and uplink,
and derive the expected values in Eqn. (2.15) and Eqn. (2.27).
Recall that $\mathbf{i}_{jk}^{(b)}$ is the misaligned interference caused on user k when BS b transmits
to user j. Its expression is repeated in Eqn. (B.1).
$$\mathbf{i}_{jk}^{(b)}(m) = \rho\left(\delta_{jk}^{(b)} - T_S\right)\mathbf{x}_j\left(m_{jk}^{(b)}\right) + \rho\left(\delta_{jk}^{(b)}\right)\mathbf{x}_j\left(m_{jk}^{(b)} + 1\right), \tag{B.1}$$
where
$$\rho(\tau) = \int_0^{T_S} g(t)\,g(t-\tau)\,dt. \tag{B.2}$$
The vector $\mathbf{i}_{jk}^{(b)}$ is a linear combination of two consecutive data vectors being
transmitted to user j. The value of ρ(τ) quantifies the contribution of each of those two data
vectors to the interference caused on user k, depending on the value of $\delta_{jk}^{(b)} = \tau_{jk}^{(b)} \bmod T_S$.
This can be understood as follows. When a signal arrives at user k, it is convolved
with a matched filter and sampled every TS seconds. The matched filter that maximizes
the SNR is given by g∗(−t) = g(−t) (because g(t) is real). The convolution operation is
defined as
$$g(\tau) * h(\tau) = \int_{-\infty}^{\infty} g(t)\,h(\tau - t)\,dt. \tag{B.3}$$
Assume that, without loss of generality, $x_j(m_{jk}^{(b)})$, $x_j(m_{jk}^{(b)} + 1)$, and $x_k$ are all scalars, i.e.
Lk = 1, and hence they will no longer be in bold. Also assume, for brevity and without loss
of generality, that $m_{jk}^{(b)} = 1$ and $m_{jk}^{(b)} + 1 = 2$. The desired symbol $x_k$ arrives synchronously
at user k, and when it is matched filtered and sampled (at τ = 0 in our example), the
following result is obtained.
$$x_k\, g(\tau) * g(-\tau)\big|_{\tau=0} = \int_{-\infty}^{\infty} x_k\, g(t)\,g(t-\tau)\,dt\,\bigg|_{\tau=0} = \int_{-\infty}^{\infty} x_k\, g^2(t)\,dt = x_k, \tag{B.4}$$
since g(t) has unit power.
Symbols $x_j(1)$ and $x_j(2)$ arrive asynchronously at user k, and before matched filtering
and sampling, the continuous time signal of the asynchronous interference can be
expressed as
$$i_{jk}^{(b)}(\tau) = \left[x_j(2)\,g\left(\tau + \delta_{jk}^{(b)}\right) + x_j(1)\,g\left(\tau - T_S + \delta_{jk}^{(b)}\right)\right]\left[u(\tau) - u(\tau - T_S)\right], \tag{B.5}$$
where u(τ) is the unit step function defined as
$$u(\tau) = \begin{cases} 0, & \tau < 0 \\ 1, & \tau \ge 0. \end{cases} \tag{B.6}$$
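Therefore, when $i_{jk}^{(b)}(\tau)$ is matched filtered and sampled at τ = 0, the following result is
obtained.
\begin{align}
i_{jk}^{(b)}(\tau) * g(-\tau)\Big|_{\tau=0}
 &= \int_0^{T_S - \delta_{jk}^{(b)}} x_j(2)\, g\left(t + \delta_{jk}^{(b)}\right) g(t-\tau)\,dt
  + \int_{T_S - \delta_{jk}^{(b)}}^{T_S} x_j(1)\, g\left(t - T_S + \delta_{jk}^{(b)}\right) g(t-\tau)\,dt\,\Bigg|_{\tau=0} \notag\\
 &= \int_0^{T_S - \delta_{jk}^{(b)}} x_j(2)\, g\left(t + \delta_{jk}^{(b)}\right) g(t)\,dt
  + \int_{T_S - \delta_{jk}^{(b)}}^{T_S} x_j(1)\, g\left(t - T_S + \delta_{jk}^{(b)}\right) g(t)\,dt \notag\\
 &= \rho\left(-\delta_{jk}^{(b)}\right) x_j(2) + \rho\left(-\left(\delta_{jk}^{(b)} - T_S\right)\right) x_j(1). \tag{B.7}
\end{align}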
Recalling that g(t) is non-zero only for t ∈ [0, TS], we show that ρ(−τ) = ρ(τ).
$$\rho(\tau) = \int_0^{T_S} g(t)\,g(t-\tau)\,dt = \int_{\tau}^{T_S+\tau} g(t)\,g(t-\tau)\,dt = \int_0^{T_S} g(z+\tau)\,g(z)\,dz = \rho(-\tau) \tag{B.8}$$
Accordingly, Eqn. (B.7) simplifies to
$$\rho\left(\delta_{jk}^{(b)} - T_S\right) x_j(1) + \rho\left(\delta_{jk}^{(b)}\right) x_j(2), \tag{B.9}$$
which is analogous to Eqn. (B.1).
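The properties of ρ(τ) used here can be checked numerically for a concrete pulse. The sketch below assumes a rectangular unit-energy pulse on [0, TS], which is only one possible choice of g(t) and is not specified by the derivation above; the function names `rho` and `interference_weights` are hypothetical.

```python
import numpy as np


def rho(tau, t_s=1.0, n=1000):
    """Midpoint-rule estimate of rho(tau) for a unit-energy rectangular pulse."""
    dt = t_s / n
    t = (np.arange(n) + 0.5) * dt                    # midpoints of [0, T_S]

    def g(x):
        # Assumed pulse: constant 1/sqrt(T_S) on [0, T_S], zero elsewhere.
        return np.where((x >= 0) & (x <= t_s), 1.0 / np.sqrt(t_s), 0.0)

    return float(np.sum(g(t) * g(t - tau)) * dt)


def interference_weights(delta, t_s=1.0):
    """Weights of x_j(m) and x_j(m + 1) in Eqn. (B.1)."""
    return rho(delta - t_s, t_s), rho(delta, t_s)
```

For this pulse, ρ(τ) is triangular, ρ(τ) = 1 − |τ|/TS for |τ| ≤ TS, so ρ(0) = 1, ρ(−TS) = 0, and the two weights in Eqn. (B.1) sum to one.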
Now, consider $E\left[\mathbf{i}_{j_1k}^{(b_1)}\mathbf{i}_{j_2k}^{(b_2)H}\right]$. When $j_1 = j_2 = j = k$, $\delta_{jk}^{(b)} = 0$. Substituting in
Eqn. (B.1), and noting that ρ(0) = 1 and that ρ(−TS) = 0 since g(t) has unit power and
is non-zero only for t ∈ [0, TS], we get
$$\mathbf{i}_{jk}^{(b)}(m) = \rho(-T_S)\,\mathbf{x}_j\left(m_{jk}^{(b)}\right) + \rho(0)\,\mathbf{x}_j\left(m_{jk}^{(b)} + 1\right) = \mathbf{x}_j\left(m_{jk}^{(b)} + 1\right). \tag{B.10}$$
Therefore,
$$E\left[\mathbf{i}_{j_1k}^{(b_1)}\mathbf{i}_{j_2k}^{(b_2)H}\right] = E\left[\mathbf{i}_{jk}^{(b_1)}\mathbf{i}_{jk}^{(b_2)H}\right] = E\left[\mathbf{x}_j\left(m_{jk}^{(b_1)} + 1\right)\mathbf{x}_j^H\left(m_{jk}^{(b_2)} + 1\right)\right] = \mathbf{I}_{L_j}, \tag{B.11}$$
since the data vectors of user j arrive synchronously from all BSs, meaning that
$m_{jk}^{(b_1)} + 1 = m_{jk}^{(b_2)} + 1$, and hence $\mathbf{x}_j(m_{jk}^{(b_1)} + 1) = \mathbf{x}_j(m_{jk}^{(b_2)} + 1)$. This establishes the third line
of Eqn. (2.14).
For other values of j1, j2, and k, we expand E[i(b1)j1k i
(b2)H
j2k
].
E[i(b1)j1k i
(b2)H
j2k
]= E
[(ρ(δ
(b1)j1k − TS)xj1(m
(b1)j1k ) + ρ(δ
(b1)j1k )xj1(m
(b1)j1k + 1)
)
(ρ(δ
(b2)j2k − TS)xj2(m
(b2)j2k ) + ρ(δ
(b2)j2k )xj2(m
(b2)j2k + 1)
)H]
= E[ρ(δ
(b1)j1k − TS)ρ(δ
(b2)j2k − TS)xj1(m
(b1)j1k )xH
j2(m
(b2)j2k )
+ ρ(δ(b1)j1k − TS)ρ(δ
(b2)j2k )xj1(m
(b1)j1k )xH
j2(m
(b2)H
j2k + 1)
+ ρ(δ(b1)j1k )ρ(δ
(b2)j2k − TS)xj1(m
(b1)j1k + 1)xH
j2(m
(b2)j2k )
+ ρ(δ(b1)j1k )ρ(δ
(b2)j2k )xj1(m
(b1)j1k + 1) xH
j2(m
(b2)j2k + 1)
]
= ρ(δ(b1)j1k − TS)ρ(δ
(b2)j2k − TS)E
[xj1(m
(b1)j1k )xH
j2(m
(b2)j2k )
](B.12)
+ ρ(δ(b1)j1k − TS)ρ(δ
(b2)j2k )E
[xj1(m
(b1)j1k )xH
j2(m
(b2)j2k + 1)
](B.13)
+ ρ(δ(b1)j1k )ρ(δ
(b2)j2k − TS)E
[xj1(m
(b1)j1k + 1)xH
j2(m
(b2)j2k )
](B.14)
+ ρ(δ(b1)j1k )ρ(δ
(b2)j2k )E
[xj1(m
(b1)j1k + 1)xH
j2(m
(b2)j2k + 1)
](B.15)
When $j_1$, $j_2$, and k are all distinct, each data vector pair inside the expectation operators
in Eqns. (B.12), (B.13), (B.14), and (B.15) has two independent data vectors that belong
to user $j_1$ and user $j_2$. Therefore, all expected values evaluate to 0. This establishes the
first line of Eqn. (2.14).
Finally, when $j_1 = j_2 = j \neq k$, all data vector pairs inside the expectation operators
in Eqns. (B.12), (B.13), (B.14), and (B.15) have two data vectors that belong to the same
user j. The data vectors of user j are independent over time. Therefore, the expected
values evaluate to $\mathbf{I}_{L_j}$ only when the time indices match; otherwise, they evaluate to 0.
When the time indices differ by more than 1, no time indices match in any of
Eqns. (B.12), (B.13), (B.14), and (B.15), and all evaluate to 0. This establishes the
first line of Eqn. (2.15). When $m_{jk}^{(b_2)} = m_{jk}^{(b_1)} + 1$, only the expected value of Eqn. (B.14)
evaluates to $\mathbf{I}_{L_j}$, since it is the only term that pairs time index $m_{jk}^{(b_1)} + 1$ with $m_{jk}^{(b_2)}$.
This establishes the second line of Eqn. (2.15). When $m_{jk}^{(b_1)} = m_{jk}^{(b_2)}$, and
thus $m_{jk}^{(b_1)} + 1 = m_{jk}^{(b_2)} + 1$, only the expected values of Eqn. (B.12) and Eqn. (B.15)
evaluate to $\mathbf{I}_{L_j}$. This establishes the third line of Eqn. (2.15). Finally, when $m_{jk}^{(b_1)} = m_{jk}^{(b_2)} + 1$,
only the expected value of Eqn. (B.13) evaluates to $\mathbf{I}_{L_j}$, since it is the only term that pairs
time index $m_{jk}^{(b_1)}$ with $m_{jk}^{(b_2)} + 1$. This establishes the fourth line
of Eqn. (2.15).
The derivation for the uplink asynchronous interference has a parallel structure to
the above derivation, but it is performed using the expression of $\mathbf{e}_{jk}^{(b)}$ in Eqn. (2.23).
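The index-matching argument for the case j1 = j2 = j ≠ k can be summarized in a small routine that returns the scalar multiplying the identity matrix. This is a sketch under stated assumptions: a rectangular unit-energy pulse is assumed, so that ρ(τ) = max(0, 1 − |τ|/TS), and the function name `correlation_coefficient` is hypothetical.

```python
def correlation_coefficient(m1, m2, d1, d2, t_s=1.0):
    """Scalar c such that E[i1 i2^H] = c * I for the same interfering user j != k.

    m1, m2 are the symbol indices m_jk^(b1), m_jk^(b2); d1, d2 the offsets delta.
    """
    def rho(tau):
        # Assumed rectangular unit-energy pulse: rho(tau) = max(0, 1 - |tau| / T_S).
        return max(0.0, 1.0 - abs(tau) / t_s)

    terms = [
        (m1, m2, rho(d1 - t_s) * rho(d2 - t_s)),       # Eqn. (B.12)
        (m1, m2 + 1, rho(d1 - t_s) * rho(d2)),         # Eqn. (B.13)
        (m1 + 1, m2, rho(d1) * rho(d2 - t_s)),         # Eqn. (B.14)
        (m1 + 1, m2 + 1, rho(d1) * rho(d2)),           # Eqn. (B.15)
    ]
    # Data vectors are independent over time: only matching indices contribute.
    return sum(w for a, b, w in terms if a == b)
```

Each tuple holds the two time indices paired inside one expectation and the product of ρ values in front of it, so the returned sum keeps exactly the terms whose indices coincide.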
Bibliography
[1] A. Bourdoux and N. Khaled, “Joint TX-RX optimisation for MIMO-SDMA based
on a null-space constraint,” in Proc. IEEE Vehicular Technology Conference, Van-
couver, Canada, September 2002, pp. 171–174.
[2] Q. H. Spencer, A. L. Swindlehurst, and M. Haardt, “Zero-forcing methods for down-
link spatial multiplexing in multiuser MIMO channels,” IEEE Trans. on Signal Pro-
cessing, vol. 52, no. 2, pp. 461–471, February 2004.
[3] J. H. Chang, L. Tassiulas, and F. Rashid-Farrokhi, “Joint transmitter receiver
diversity for efficient space division multiple access,” IEEE Trans. on Wireless
Communications, vol. 1, no. 1, pp. 16–26, January 2002.
[4] A. R. S. Bahai, Multi-carrier digital communications: theory and applications of
OFDM. Springer, 2004.
[5] A. J. Tenenbaum and R. S. Adve, “Joint multiuser transmit-receive optimization
using linear processing,” in Proc. IEEE ICC 04, vol. 1, Paris, France, June 2004,
pp. 588–592.
[6] M. Schubert and H. Boche, “Solution of the multiuser downlink beamforming
problem with individual SINR constraints,” IEEE Trans. on Vehicular Technology,
vol. 53, no. 1, pp. 18–28, January 2004.
[7] S. Shi and M. Schubert, “MMSE transmit optimization for multi-user multi-antenna
systems,” in Proc. IEEE ICASSP 05, March 2005.
[8] A. M. Khachan, A. J. Tenenbaum, and R. S. Adve, “Linear processing for the
downlink in multiuser MIMO systems with multiple data streams,” in Proc. IEEE
ICC 06, June 2006.
[9] M. Codreanu, A. Tolli, M. Juntti, and M. Latva-aho, “Joint design of Tx-Rx beam-
formers in MIMO downlink channel,” IEEE Trans. on Signal Proc., vol. 55, no. 9,
pp. 4639–4655, September 2007.
[10] A. J. Tenenbaum and R. S. Adve, “Improved sum-rate optimization in the multiuser
MIMO downlink,” in Proc. CISS, Princeton, NJ, March 2008.
[11] S. Serbetli and A. Yener, “Transceiver optimization for multiuser MIMO systems,”
IEEE Trans. on Signal Proc., vol. 52, no. 1, pp. 214–226, January 2004.
[12] P. T. Boggs and J. W. Tolle, “Sequential quadratic programming,” in Acta Numer-
ica. Cambridge University Press, 1995, pp. 1–51.
[13] E. Visotsky and U. Madhow, “Optimum beamforming using transmit antenna ar-
rays,” in Proc. IEEE VTC Spring, May 1999, pp. 851–856.
[14] A. Tolli, M. Codreanu, and M. Juntti, “Linear cooperative multiuser MIMO
transceiver design with per BS power constraints,” in Proc. IEEE ICC 07, Glasgow,
Scotland, June 2007.
[15] T. Tamaki, K. Seong, and J. M. Cioffi, “Downlink MIMO systems using cooperation
among base stations in a slow fading channel,” in Proc. IEEE ICC 07, Glasgow,
Scotland, June 2007.
[16] H. Dahrouj and W. Yu, “Coordinated beamforming for the multi-cell multi-antenna
wireless system,” in Proc. CISS, Princeton, NJ, March 2008.
[17] H. Dai, A. F. Molisch, and H. V. Poor, “Downlink capacity of interference-limited
MIMO systems with joint detection,” IEEE Trans. on Wireless Commun., vol. 3,
pp. 442–453, March 2004.
[18] S. Jafar, G. Foschini, and A. Goldsmith, “Phantomnet: Exploring optimal multicel-
lular multiple antenna systems,” EURASIP J. App. Sig. Proc., pp. 591–604, October
2004.
[19] B. L. Ng, J. S. Evans, S. V. Hanly, and D. Aktas, “Transmit beamforming with
cooperative base stations,” in Proc. IEEE ISIT, 2005, pp. 1431–1435.
[20] H. Zhang, N. B. Mehta, A. F. Molisch, J. Zhang, and H. Dai, “On the fundamen-
tally asynchronous nature of interference in cooperative base station systems,” in
Proc. IEEE ICC 07, Glasgow, Scotland, June 2007.
[21] S. Verdú, Multiuser Detection. Cambridge University Press, 1998.
[22] T. A. Thomas and F. W. Vook, “Asynchronous interference suppression in broad-
band cyclic-prefix communications,” in Proc. IEEE WCNC, New Orleans, LA,
March 2003.
[23] K. Yano and M. Taromaru, “Pre-FFT type MMSE adaptive array antenna to sup-
press asynchronous interference for OFDM packet transmission,” in Proc. IEEE
WCNC, Hong Kong, March 2007.
[24] H. Jung and M. D. Zoltowski, “On the equalization of asynchronous multiuser
OFDM signals in fading channels,” in Proc. IEEE ICASSP, May 2004.
[25] M. Costa, “Writing on dirty paper,” IEEE Trans. on Information Theory, vol. 29,
no. 3, pp. 439–441, May 1983.
[26] M. Tomlinson, “New automatic equalizer employing modulo arithmetic,” Electronics
Letters, vol. 7, pp. 138–139, March 1971.
[27] H. Harashima and H. Miyakawa, “A method of code conversion for digital communi-
cation channels with intersymbol interference,” Trans. of the Institute of Electronics
and Communications Engineers of Japan, vol. 52, pp. 272–273, June 1969.
[28] ——, “Matched-transmission technique for channels with intersymbol interference,”
IEEE Trans. on Communications, vol. 20, pp. 774–780, August 1972.
[29] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, “A vector-perturbation
technique for near-capacity multiantenna multiuser communication, Part II: Perturbation,”
IEEE Trans. on Communications, vol. 53, no. 3, pp. 537–544, March 2005.
[30] R. F. H. Fischer, Precoding and Signal Shaping for Digital Transmission. Wiley-
Interscience, 2002.
[31] C. Windpassinger, R. F. H. Fischer, T. Vencel, and J. B. Huber, “Precoding in
multiantenna and multiuser communications,” IEEE Trans. on Wireless Commun.,
vol. 3, no. 4, pp. 1305–1316, July 2004.
[32] G. J. Foschini, G. D. Golden, R. A. Valenzuela, and P. W. Wolniansky, “Simplified
processing for high spectral efficiency wireless communication employing multi-element
arrays,” IEEE Jour. on Selected Areas in Communications, vol. 17, no. 11, pp.
1841–1852, November 1999.
[33] T. Vencel, C. Windpassinger, and R. Fischer, “Sorting in the V-BLAST algorithm
and loading,” in Proc. of the 2002 Communications Systems and Networks, Septem-
ber 2002.
[34] R. Doostnejad, T. J. Lim, and E. Sousa, “Joint precoding and beamforming design
for the downlink in a multiuser MIMO system,” in Proc. IEEE WiMOB, Montreal,
Canada, August 2005.
[35] C.-H. F. Fung, W. Yu, and T. J. Lim, “Precoding for the multiantenna downlink:
multiuser SNR gap and optimal user ordering,” IEEE Trans. on Communications,
vol. 55, no. 1, pp. 188–197, January 2007.
[36] Y. J. Zhang and K. B. Letaief, “An efficient resource-allocation scheme for spa-
tial multiuser access in MIMO/OFDM systems,” IEEE Trans. on Communications,
vol. 53, no. 1, pp. 107–116, January 2005.
[37] C. Pan, Y. Cai, and Y. Xu, “Adaptive subcarrier and power allocation for multiuser
MIMO-OFDM systems,” in Proc. IEEE ICC 05, May 2005.
[38] Y. Shin, T. Kang, and H. Kim, “An efficient resource allocation for multiuser MIMO-
OFDM systems with zero-forcing beamformer,” in Proc. IEEE PIMRC, September
2007.
[39] H. Karaa and R. S. Adve, “User assignment for MIMO-OFDM systems with mul-
tiuser linear precoding,” in Proc. IEEE WCNC, March-April 2008.