IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 11, 2016 1 … · 2018-10-18 · IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 11, 2016 2 the angular spread of the signal

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 11, 2016 1

Robust Pilot Decontamination Based on Joint Angleand Power Domain Discrimination

Haifan Yin, Laura Cottatellucci, David Gesbert Fellow, IEEE, Ralf R. Muller Senior Member, IEEE,and Gaoning He

Abstract—We address the problem of noise and interferencecorrupted channel estimation in massive MIMO systems. Inter-ference, which originates from pilot reuse (or contamination),can in principle be discriminated on the basis of the distributionsof path angles and amplitudes. In this paper we propose novelrobust channel estimation algorithms exploiting path diversity inboth angle and power domains, relying on a suitable combinationof the spatial filtering and amplitude based projection. Theproposed approaches are able to cope with a wide range of systemand topology scenarios, including those where, unlike in previousworks, interference channel may overlap with desired channelsin terms of multipath angles of arrival or exceed them in termsof received power. In particular we establish analytically theconditions under which the proposed channel estimator is fullydecontaminated. Simulation results confirm the overall systemgains when using the new methods.

Index Terms—massive MIMO, pilot contamination, pilot de-contamination, channel estimation, covariance, subspace, eigen-value decomposition

I. INTRODUCTION

Massive MIMO (also known as Large-Scale Antenna Sys-tems) introduced in [2], is widely believed to be one of the keyenablers of the future 5th generation (5G) wireless systemsthanks to its potential to substantially enhance spectral andenergy efficiencies [2], [3] compared to traditional MIMOwith fewer antennas. This technique is based on the law oflarge numbers, which predicts that, as the number of basestation antennas increases, the vector channel for a desireduser terminal will grow more orthogonal to the vector channelof an interfering user, thus allowing the base station toreject interference by precoding, or even, as a low-complexityapproach, simply aligning the beamforming vector with thedesired channel (“Maximum Ratio Combining”, or MRC),providing that Channel State Information (CSI) is known atbase station. In practice however, CSI is acquired based ontraining sequences sent by user terminals. Due to limited time

Copyright (c) 2015 IEEE. Personal use of this material is permitted.However, permission to use this material for any other purposes must beobtained from the IEEE by sending a request to [email protected].

This work was supported by Huawei France Research Center. Ralf Mulleracknowledges the support of the Alexander von Humboldt-Foundation. A partof this work [1] was presented on June 30, 2015 at the 16th IEEE InternationalWorkshop on Signal Processing Advances in Wireless Communications,SPAWC 2015, in Stockholm, Sweden.

H. Yin, L. Cottatellucci, and D. Gesbert are with EURECOM, 06410 Biot,France (e-mail: yin, cottatel, [email protected], ).

R. R. Muller is with Friedrich-Alexander Universitat Erlangen-Nurnberg,Cauerstr. 7/LIT, 91058 Erlangen, Germany (e-mail: [email protected]).

G. He is with Huawei France Research Center, Huawei Technologies Co.Ltd., 92100 Boulogne-Billancourt, France (e-mail: [email protected]).

Digital Object Identifier 10.1109/TSP.2016.2535204.

and frequency resources, non-orthogonal pilot sequences aretypically used by user terminals in neighboring cells, resultingin residual channel estimation error. This effect, called pilotcontamination [4], [5], has a detrimental impact on the actualachievable spectral and energy efficiencies in real systems. Asa result, considerable research efforts have been spent in thelast couple of years towards alleviating pilot interference inmassive MIMO networks.

Such techniques span from smart design of pilot reuseschemes (e.g. [6], [7]) to channel estimation techniques basedon coordinated pilot allocation (e.g. [8], [9]), to methodsrelying on multi-cell joint processing (e.g. [10]), to nonlinearchannel estimation techniques leveraging on some fundamen-tal features of massive MIMO systems (e.g. [8], [11]–[13]).

Two key features of massive MIMO channels that have beenpreviously reported are of particular interest here: 1) channelsof different users tend to be pairwise orthogonal when thenumber of antennas increases, thus leading to a specificsubspace structure for the received data vectors that dependon these channels [12] and 2) the channel covariance ma-trix exhibits a low-rankness property whenever the multipathimpinging on the MIMO array spans a finite angular spread[8], [14], [15]. The blind signal subspace estimation in [12]capitalizes on the first property. The second property has beenutilized in [8], [14]–[17], assuming the knowledge of the long-term channel covariance matrices. While the exploitation of thetwo properties individually has given rise to a set of distinctoriginal decontamination approaches, in this work we willexploit these two key features in a combined manner. Doingso we can propose a novel approach towards mitigating pilotcontamination that exhibits much higher levels of robustness.

More specifically, in [12], [18], the pairwise channel or-thogonality property allows to blindly estimate the user-of-interest channel subspace and discriminate between user-of-interest signals and interference based on the channel powers.In practice, decontamination occurs via a projection drivenby the channel amplitudes. This approach works well withinthe constraint that the interference channel is received with apower level sufficiently lower than that of the desired channel,a condition hard to guarantee for some edge-of-cell users.

In a way completely different from [12], [18], anotherapproach based on a linear minimum mean squared error(MMSE) estimator is adopted in [8] to estimate the channelof interest via projection of the received signals onto the user-of-interest subspace. This subspace, identified by a channelcovariance matrix (a long-term one, as opposed to the instan-taneous signal correlation matrix of [12], [18]), is related to

arX

iv:1

509.

0602

4v3

[cs

.IT

] 5

Jun

201

6


the angular spread of the signal of interest [8] and enablesto annihilate the interference from users with non-overlappingdomains of multipath angles-of-arrival (AoA). Interestingly,this latter approach makes no assumption on received signalamplitudes and can also discriminate against interfering usersthat are received with similar or even higher powers. Yet, theapproach fails to decontaminate pilots when propagation scat-tering creates large angle spread, causing spatial overlappingamong desired and interference channels.

In this paper, we point out that the strengths of thesetwo previously unrelated estimation methods are stronglycomplementary, offering a unique opportunity for developingrobust channel estimation schemes. Thus, we aim to properlymerge the two projections in complementary domains whilekeeping the individual benefits. In fact, we propose a family ofalgorithms striking various performance/complexity trade-offs.

We start by presenting a first scheme named “covariance-aided amplitude based projection” that effectively combinesprojections in the angular and amplitude domains and ex-hibits robustness to interference power/angles overlappingconditions. We present an asymptotic analysis which revealsthe conditions under which the channel estimation error dueto pilot contamination and noise can be made to vanish.An intuitive physical interpretation of this condition for aUniform Linear Array (ULA) is given in the form of theresidual interference channel energy contained in the multipathcomponents that overlap in angle with those of the desiredchannel. Although the physical explanation is given for theULA example, the general principle apply to other antennaplacement topologies.

The obtained condition for decontamination is in generalless restrictive than the condition required by previous MMSEand the amplitude projection-based methods taken separatelyto achieve complete removal of pilot contamination.

We then propose two low-complexity alternative schemescalled “subspace and amplitude based projection” and “MMSE+ amplitude based projection” respectively. Such schemesachieve different complexity-performance trade-off at mod-erate number of antennas. Specifically, the “subspace andamplitude based projection” can be shown to reach asymptotic(in the number of antennas) decontamination result under thesame channel topology conditions as the first scheme.

More specifically, our contributions are as follows:• We put forward a modification of the known method of

amplitude based projection, with increased robustness.• We propose a spatial filter which helps bring down the

power of interference while preserving the signal ofinterest. With this spatial filter, we present a novel channelestimation scheme called “covariance-aided amplitudebased projection”. It combines the merits of linear MMSEestimator and amplitude based projection method, yet canbe shown to have significant gains over these knownschemes.

• We analyze the asymptotic performance of this proposedmethod and provide weaker condition compared to theprevious methods where the estimation error of the pro-posed method goes to zero asymptotically in the limitof large number of antennas and data symbols. The

asymptotic analysis relies on mild technical conditionssuch as uniformly boundedness of the spectral norm ofchannel covariance.

• As the uniformly boundedness of the largest eigenvalue ofchannel covariance was reported to be useful in previousworks (such as [19]) but not formally analyzed, weidentify in the case of ULA a sufficient propagationcondition under which the uniformly bounded spectralnorm of channel covariance is satisfied exactly.

• Finally we propose two low-complexity alternatives of thefirst method. An asymptotic performance characterizationis also given.

The paper is organized as follows: In section II we introducethe system model. Section III is a brief review of MMSEchannel estimator and its asymptotic performance. In sectionIV we briefly recall the amplitude based projection of [12],[18], and we propose a first improvement of the method.Then we present the novel covariance-aided amplitude basedprojection in section V-A for the setting of single user percell, and the asymptotic performance analysis of this methodis shown in section V-B. Section V-C presents a generalizationof the proposed scheme to multi-user per cell scenario. Insection VI we propose two low-complexity alternatives of ourprevious method and similar asymptotic results on the systemperformance are given. Section VII shows numerical results.Finally section VIII concludes the paper.

The notations adopted in the paper are as follows. Weuse boldface to denote matrices and vectors. Specifically, IMdenotes the M ×M identity matrix. (X)T , (X)∗, and (X)H

denote the transpose, conjugate, and conjugate transpose ofa matrix X respectively. (X)† is the Moore-Penrose pseu-doinverse of X. tr {·} denotes the trace of a square matrix.‖·‖2 denotes the `2 norm of a vector when the argument is avector, and the spectral norm when the argument is a matrix.In particular, if A is a Hermitian matrix, ‖A‖2 is the largesteigenvalue of A. We index the eigenvalues of A in non-increasing order and denote the i-th eigenvalue of A by λi{A}and its corresponding eigenvector by ei{A}. ‖·‖F stands forthe Frobenius norm. E {·} denotes the expectation. The Kro-necker product of two matrices X and Y is denoted by X⊗Y.vec(X) is the vectorization of the matrix X. diag{a1, ...,aN}denotes a diagonal matrix or a block diagonal matrix witha1, ...,aN at the main diagonal. , is used for definition.

II. SIGNAL AND CHANNEL MODELS

We consider a network of L time-synchronized1 cells,with full spectrum reuse. Each base station (BS) is equippedwith M antennas. There are K single-antenna users in eachcell simultaneously served by their base station. The cellularnetwork operates in time-division duplexing (TDD) mode, anddue to channel reciprocity, the downlink channel is obtainedat the BS by uplink training signal and data signal. Each

1Note that assuming synchronization between uplink pilots provides a worstcase scenario from a pilot contamination point of view, since any lack ofsynchronization will tend to statistically decorrelate the pilots. Furthermore,the main methods that we propose in this paper, i.e., the covariance-aidedamplitude based projection and the subspace and amplitude based projectiondo not rely on accurate time synchronization.


base station estimates the channels of its K users during acoherence time interval. The pilot sequences inside each cellare assumed orthogonal to each other in order to avoid intra-cell interference. However the same pilot pool is reused inother cells, giving rise to pilot contamination problem. Thepilot sequence assigned to the k-th user in a certain cell isdenoted by

sk = [ sk1 sk2 · · · skτ ]T , (1)

where τ is the length of a pilot sequence. Without loss ofgenerality we assume unitary average power of pilot symbols:

sTk1s∗k2 =

{0, k1 6= k2

τ, k1 = k2.

The M × 1 channel vector between the k-th user located inthe l-th cell and the j-th base station is denoted by h

(j)lk . The

following classical multipath channel model [20] is adopted:

h(j)lk =

β(j)lk√P

P∑p=1

a(θ(j)lkp)e

iϕ(j)lkp , (2)

where P is an arbitrary large number of i.i.d. paths, and eiϕ(j)lkp

is their i.i.d. random phase, which is independent over channelindices l, k, j, and path index p. a(θ) is the steering (or phaseresponse) vector by the array to a path originating from theangle of arrival θ:

a(θ) ,

1

e−j2πDλ cos(θ)

...e−j2π

(M−1)Dλ cos(θ)

, (3)

where λ is the signal wavelength and D is the antenna spacingwhich is assumed fixed. Note that we can limit θ to θ ∈ [0, π]because any θ ∈ [−π, 0) can be replaced by −θ giving thesame steering vector. β(j)

lk is the path-loss coefficient

β(j)lk =

√α

d(j)lk

γ , (4)

in which γ is the path-loss exponent, d(j)lk is the geographical

distance between the user and the j-th base station, and α is aconstant. Note that the model is shown for a ULA example forease of exposition. Under this model, the covariance matrixcan be shown asymptotically to have low rank, as long asthe AoA support is bounded and strictly smaller than [0, π].However, several other channel models also exhibit similarlow-rank property [15], which is the essential characteristicexploited by the MMSE estimator. Hence our approach is notdependent on the use of the one ring model above described.In fact, our main results, namely Theorem 1, as well as thegeneral principle carry to other channel models and antennaplacement topologies.

We define

H(j)l ,

[h

(j)l1 h

(j)l2 · · · h

(j)lK

], (5)

and the pilot matrix

S ,[s1 s2 · · · sK

]T. (6)

During the training phase, the received signal at the basestation j is

Y(j) =

L∑l=1

H(j)l S + N(j), (7)

where N(j) ∈ CM×τ is the spatially and temporally whiteadditive Gaussian noise (AWGN) with zero-mean and element-wise variance σ2

n. Then, during the uplink data transmissionphase, each user transmits C data symbols. The received datasignal at base station j is given by:

W(j) =

L∑l=1

H(j)l Xl + Z(j), (8)

where Xl ∈ CK×C is the matrix of transmitted symbols ofall users in the l-th cell. The symbols are i.i.d. with zero-mean and unit average element-wise variance. Z(j) ∈ CM×Cis the AWGN noise with zero-mean and element-wise varianceσ2n. Note that the block fading channel is constant during the

transmission for the τ pilot symbols and the C data symbols.

III. MMSE CHANNEL ESTIMATION

We briefly recall the MMSE channel estimator in a multi-cell setting with single-user per cell. Without loss of generality,we assume cell j is the target cell, and h

(j)j ∈ CM×1 is

the desired channel, while h(j)l ∈ CM×1,∀l 6= j are the

interference channels. We rewrite (7) in a vectorized form,

y(j) = S

L∑l=1

h(j)l + n(j), (9)

where y(j) = vec(Y(j)), n(j) = vec(N(j)). A pilot sequences is shared by all users. The pilot matrix S is given by

S , s⊗ IM . (10)

We define the channel covariance matrices

R(j)l , E{h(j)

l h(j)Hl } ∈ CM×M , l = 1, . . . , L, (11)

where the expectation is taken over channel realizations.A linear MMSE estimator for h

(j)j is given by

h(j)MMSEj = R

(j)j

(τ

L∑l=1

R(j)l + σ2

nIM

)−1

SH

y(j). (12)

As shown in previous works [8], [15], for a base stationequipped with a ULA, the above MMSE estimator can fullyeliminate the effects of interfering channels when M → ∞,under a specific “non-overlap” condition on the distributionsof multipath AoAs for the desired and interference channels.This condition is formalized as follows. Assume the user incell j is our target (desired) user. Denote the angular supportof the desired channel as Φd, (i.e., the probability densityfunction (PDF) pd(θ) of the AoA of the desired channel h

(j)j

satisfies pd(θ) > 0 if θ ∈ Φd and pd(θ) = 0 if θ /∈ Φd) andsimilarly the union of the angular supports of all interferencechannels h

(j)l (l 6= j) as Φi. If Φd∩Φi = ∅, then, as M →∞,

(12) converges to an interference-free estimate. In practice the“non-overlap” condition is hard to guarantee and the finite-M


performance of the MMSE scheme depends on angular spreadand user location, although the latter can be shaped via the useof so-called coordinated pilot assignment (CPA) [8].

IV. AMPLITUDE BASED PROJECTION

Interestingly, angle is not the only domain where interfer-ence can be discriminated upon, as revealed from a completelydifferent approach to pilot decontamination [12], [18]. In thatapproach the empirical instantaneous covariance matrix builtfrom the received data (8) is exploited, in contrast to the useof long-term covariance matrices in (12). Assume cell j isour target cell and each cell has K users. The eigenvaluedecomposition (EVD) of W(j)W(j)H/C is written as

1

CW(j)W(j)H = U(j)Λ(j)U(j)H , (13)

where U(j) ∈ CM×M =[u

(j)1 u

(j)2 · · · u

(j)M

]is a unitary

matrix and Λ(j) = diag{λ(j)1 , · · · , λ(j)

M } with its diagonalentries sorted in a non-increasing order. By extracting the firstK columns of U(j), i.e., the eigenvectors corresponding to thestrongest K eigenvalues, we obtain an orthogonal basis

E(j) ,[u

(j)1 u

(j)2 · · · u

(j)K

]∈ CM×K . (14)

The basic idea in [12], [18] is to use the orthogonal basis E(j)

as an estimate for the span of H(j)j , which includes all desired

user channels in cell j. Then, by projecting the received signalonto the subspace spanned by E(j), most of the signal ofinterest is preserved. In contrast, the interference signal iscanceled out thanks to the asymptotic property that the userchannels are pairwise orthogonal as the number of antennastends to infinity. Thus after the above mentioned projection,the estimate of the multi-user channel H

(j)j is given by:

H(j)AMj =

1

τE(j)E(j)HY(j)SH . (15)

Note here that interference and desired channel directions arediscriminated on the basis of channel amplitudes and not AoA,hence the estimate is labeled “AM” for “Amplitude”. As a wayto guarantee an asymptotic separation between the signal ofinterest and the interference in terms of power, it has beensuggested to introduce power control in the network [12], [18].

Remark 1 Generalized amplitude projectionAs shown in [12], [18], the above method works well when

the desired channels and interference channels are separable inpower domain, i.e., the instantaneous powers of any desiredchannels are higher than that of any interference channels.In practice however, this assumption is not always guaran-teed. For a finite number of antennas, the short-term fadingrealization can cause the interference subspace to spill overthe desired one. An enhanced version can somewhat mitigatethis problem by considering a generalized amplitude basedprojection. This consists in selecting a possibly larger number(κ(j)) of dominant eigenvectors to form E(j), where κ(j) isthe number of eigenvalues in Λ(j) that are greater than µλ(j)

K .µ is a design parameter that satisfies 0 ≤ µ < 1. See sectionVII for details on the choice of µ.

V. COVARIANCE-AIDED AMPLITUDE BASED PROJECTION

Note that both previous methods, while being able to tacklepilot contamination in quite different ways, perform well onlyin some restricted user/channel topologies. For a ULA basestation, the MMSE method leads to interference-free channelestimates under the strict requirement that the desired and in-terference channel do not overlap in their AoA regions. Whilethe amplitude based projection requires that no interferencechannel power exceeds that of a desired channel to achieve asimilar result. Unfortunately, due to the random user locationand scattering effects, it is quite unlikely to achieve theseconditions at all times. As a result, by combining the usefulproperties of both the MMSE and the amplitude projectionmethod, we propose below novel estimation methods that willlead to enhanced robustness in a realistic cellular scenario.

A. Single user per cell

For ease of exposition we first consider a simplified scenariowhere intra-cell interference is ignored by assuming that eachcell has only one user, i.e., K = 1. The users in differentcells share the same pilot sequence s. Then, with propermodifications, we will generalize this method to the settingof multiple users per cell in section V-C.

The objective is to combine long-term statistics which in-clude spatial distribution information together with short-termempirical covariance which contains instantaneous amplitudeand direction channel information. Hence, a spatial distributionfilter can be associated to an instantaneous projection operatorto help discriminate against any interference terms whosespatial directions live in a subspace orthogonal to that of thedesired channel. The intuition is that such a spatial filter maybring the residual interference to a level that is acceptable tothe instantaneous projection-based channel estimator.

In order to carry out the above intuition, we introducea long-term statistical filter Ξj , which is based on channelcovariance matrices in a way similar to that used by the MMSEfilter in (12).

Ξj =

(L∑l=1

R(j)l + σ2

nIM

)−1

R(j)j . (16)

Note that the linear filter Ξj allows to discriminate againstthe interference in angular domain by projecting away frommultipath AoAs that are occupied by interference. Note alsothat the choice of spatial filter Ξj is justified from the factthat the full information of desired channel h

(j)j is preserved,

as h(j)j lies in the signal space of R

(j)j . In fact, the desired

channel is recoverable using another linear transformation Ξj′:

Ξj′ , R

(j)†

j

(L∑l=1

R(j)l + σ2

nIM

), (17)

as can be seen from the following equality

Ξ′jΞjh(j)j = R

(j)†j R

(j)j h

(j)j = VjV

Hj h

(j)j = h

(j)j , (18)

where the columns of Vj are the eigenvectors of R(j)j corre-

sponding to non-zero eigenvalues.


The spatial filter is applied to the received data signal atbase station j as

Wj , ΞjW(j). (19)

The amplitude-based method as shown in section IV can nowbe applied on the filtered received data to get rid of the residualinterference. Take the eigenvector corresponding to the largesteigenvalue of the matrix WjW

Hj /C:

uj1 = e1{1

CWjW

Hj }. (20)

Hence uj1 can be considered as an estimate of the directionof the vector Ξjh

(j)j .

We then cancel the effect of the pre-multiplicative matrixΞj using Ξ′j in (17), and we obtain an estimate of the directionof the channel vector h

(j)j as follows:

uj1 =Ξ′juj1∥∥Ξ′juj1∥∥2

. (21)

Finally, the phase and amplitude ambiguities of the desiredchannel can be resolved by projecting the LS estimate ontothe subspace spanned by uj1:

h(j)CAj =

1

τuj1u

Hj1Y

(j)s∗, (22)

where the superscript “CA” denotes the covariance-aidedamplitude domain projection.

The algorithm is summarized below:

Algorithm 1 Covariance-aided Amplitude based Projection

1: Take the first eigenvector of WjWHj /C as in (20), with

Wj being the filtered data signal.2: Reverse the effect of the spatial filter using (21).3: Resolve the phase and amplitude ambiguities by (22).

The complexity of this proposed estimation scheme isbriefly evaluated.

We note that the computation of the matrix inversions in(16) has a complexity order of O(M2.37). However, thesecomputations are performed in a preamble phase and theircost is negligible under the underlying assumption of channelstationarity implicitly made in this article. In practical systems,the matrix inversion in (16) is performed when the channelstatistics are updated. Since the channel statistics are typicallyupdated in a time scale much larger than the channel coherencetime, i.e., the time scale for the applicability of Algorithm 1,then their computational cost is negligible. Therefore, we canfocus on the complexity of Algorithm 1 only.

In step 1, the spatial filtering of the data signals in (19)and the computation of the covariance matrix WjW

Hj is per-

formed along with the computation of the dominant eigenvec-tor of an M×M matrix as in (20). The former computation hasa complexity order O(CM2) while, by applying the classicalpower method, the computation of the dominant vector hasa complexity order O(M2). Both step 2 and step 3 requiremultiplications of matrices by M -dimensional vectors andthus both have a complexity order O(M2). Then, the global

complexity of the algorithm is dominated by the complexityof step 1, which is O(CM2).

The ability for the above estimator to combine the advan-tages of the previously known angle and amplitude projectionbased estimators is now analyzed theoretically. In particularwe are interested in the conditions under which full pilotdecontamination can be achieved asymptotically in the limitof the number of antennas M and data symbols C. In order tofacilitate the analysis, we introduce the following condition.

Condition C1: The spectral norm of R(j)l is uniformly

bounded:

∀M ∈ Z+ and ∀l ∈ {1, . . . , L},∃ζ, s.t.∥∥∥R(j)

l

∥∥∥2< ζ, (23)

where Z+ is the set of positive integers and ζ is a constant.Condition C1 can be interpreted as describing all the

scenarios in which the channel energy is spread over asubspace whose dimension grows with M . Note that the sameassumption can be found in some other papers, e.g., [19]. Thecorresponding physical condition is now investigated for thecase of a ULA with a typical antenna spacing D (less than orequal to half wavelength).

Proposition 1 Let Φ be the AoA support of a certain user. Letp(θ) be the probability density function of AoA of that user.If p(θ) is uniformly bounded, i.e., p(θ) < +∞,∀θ ∈ Φ, andΦ lies in a closed interval that does not include the paralleldirections with respect to the array , i.e., 0, π /∈ Φ, then,the spectral norm of the user’s covariance R is uniformlybounded.

Proof: See Appendix A.Note that this result is hinted upon [14] by resorting to

approximation of R by a circulant matrix. Our Proposition 1here gives a formal proof of the previous approximated result.

As another interpretation of Condition C1, it is worth notingthat when this condition is not satisfied, there is no guaranteethat the asymptotic pairwise orthogonality of different users’channels holds. In other words, the quantity h

(j)Hj h

(j)l /M, l 6=

j may not converge to zero, which is an adverse condition forall massive MIMO methods. However, our proposed methodsstill have significant performance gains under this adversecircumstance. Moreover, C1 is a sufficient condition and webelieve it can be weakened.

B. Asymptotic performance of the proposed CA estimator

We now look into the performance analysis of the proposedestimation scheme. Let us define

α(j)l , lim

M→∞

1

Mtr{ΞjR

(j)l ΞH

j },∀l = 1, . . . , L. (24)

Theorem 1 Given Condition C1, if the following inequalityholds true

α(j)j > α

(j)l ,∀l 6= j, (25)


then, the estimation error of (22) vanishes, i.e.,

limM,C→∞

∥∥∥h(j)CAj − h

(j)j

∥∥∥2

2∥∥∥h(j)j

∥∥∥2

2

= 0. (26)

Proof: For the sake of notational convenience, in this proofwe assume the user in cell j is the target user and thus drop thesuperscript (j). The desired channel is denoted by hj = h

(j)j

and the interference channels are hl = h(j)l , l 6= j. Since

hl, l = 1, . . . , L, is considered as M × 1 complex Gaussianwith the spatial correlation matrices Rl = E{hlhHl }, thechannels can be factorized as [21]

hl = R1/2l hWl, l = 1, . . . , L, (27)

where hWl ∼ CN (0, IM ), is an i.i.d. M × 1 Gaussian vectorwith unit variance. We build the proof of Theorem 1 onthe general correlation model (27). The proof consists inthree parts, corresponding to the three steps in Algorithm 1respectively. More specifically, Lemma 1 (and the intermediateresults towards Lemma 1) is the first part of the proof. It showsthat uj1 aligns asymptotically with the direction of the filteredchannel vector hj = Ξjhj . The second part of the proof isprovided in Lemma 6, which proves that after canceling theeffect of the spatial filter using Ξ′j , we obtain the direction ofthe true channel hj in uj1. The final part of the proof showsthat by projecting the LS estimate onto the subspace of uj1,we resolve the phase and amplitude of the true channel.

Lemma 1 Given Condition C1, if α(j)j > α

(j)l ,∀l 6= j, then

there exists a unique 0 ≤ φ < 2π, such that

limM,C→∞

∥∥∥∥∥ hj∥∥hj∥∥2

− uj1ejφ

∥∥∥∥∥2

= 0. (28)

where hl , Ξjhl, l = 1, . . . , L.

Proof: The proof of Lemma 1 relies on several intermediateresults, namely Lemma 2 - Lemma 5.

Lemma 2 Under Condition C1, the spectral norm of ΞjΞHj

satisfies:

limM→∞

1

M

∥∥ΞjΞHj

∥∥2

= 0. (29)

Proof: See Appendix B.Lemma 2 indicates that the spectral norm of the covarianceof the noise (after multiplying Ξj) is bounded and does notscale with M . This conclusion will be exploited when weprove in Lemma 5 that the impact of noise on the dominanteigenvector/eigenvalue vanishes.

Lemma 3 [22] Let AM be a deterministic M ×M complexmatrix with uniformly bounded spectral radius for all M . Letq = 1√

(M)[q1, · · · , qM ]T where qi,∀i = 1, · · · ,M is i.i.d.

complex random variable with zero mean, unit variance, andfinite eighth moment. Let r be a similar vector independent ofq. Then as M →∞,

qHAMqa.s.−−→ 1

Mtr{AM}, (30)

andqHAMr

a.s.−−→0, (31)

where a.s.−−→ denotes almost sure convergence.

Note that in this paper, the condition on the finite eighthmoment always holds, as when we apply Lemma 3, the com-ponents of the vector of interest are i.i.d. complex Gaussianvariables. It is well known that a complex Gaussian variablewith zero mean, unit variance has finite eighth moment.

Lemma 4 Given Condition C1,

limM→∞

1

MhHj hl = 0,∀l 6= j (32)

limM→∞

1

MhHl hl = αl, l = 1, . . . , L. (33)

Proof: See Appendix C.

Lemma 5 When Condition C1 is satisfied,

limM,C→∞

∥∥∥∥∥WjWHj

MC

hj∥∥hj∥∥2

− αjhj∥∥hj∥∥2

∥∥∥∥∥2

= 0, (34)

Proof: See Appendix D.Lemma 5 proves that as M,C → ∞, αj is an asymptoticeigenvalue of the random matrix WjW

Hj /MC, with its

corresponding eigenvector converging to hj/∥∥hj∥∥2

up to arandom phase.

We now return to the proof of Lemma 1. Since αj >αl,∀l 6= j, one may readily obtain from Lemma 5 and (32)that

limM,C→∞

λ1

{WjW

Hj

MC

}= αj , (35)

and that there exists a unique 0 ≤ φ < 2π, such that

limM,C→∞

∥∥∥∥∥ hj∥∥hj∥∥2

− e1

{WjW

Hj

MC

}ejφ

∥∥∥∥∥2

= 0, (36)

which completes the proof of Lemma 1.Now we show the second part of the proof of Theorem 1.

Note that in this part we make the implicit assumption thatthe spectral norm of Ξ′j satisfies

∥∥Ξ′j∥∥2< +∞. A sufficient

(but not necessary) condition of such an assumption is that thespectral norm of R†j is finite.

Lemma 6 Given (28), we have

limM,C→∞

∥∥∥∥ hj‖hj‖2

− uj1ejφ

∥∥∥∥2

= 0. (37)

Proof: See Appendix E.The final part of the proof of Theorem 1 can be found

in Appendix F, which corresponds to step 3 of Algorithm1. The proof shows that projecting the LS estimate onto thesubspace of uj1 will lead to noise-free estimate asymptoticallyas M,C →∞. This concludes the proof of Theorem 1.

Interestingly, condition (25) in Theorem 1 can be replacedwith ∥∥∥Ξjh

(j)j

∥∥∥2>∥∥∥Ξjh

(j)l

∥∥∥2,∀l 6= j, (38)


which indicates that under suitable conditions on the spectralnorm of channel covariance, after multiplying the filter Ξj ,if the power of the desired channel is higher than thatof interference channel, then, pilot contamination disappearsasymptotically, along with noise.

Note that we have so far no assumption on antenna place-ment in the analysis, other than the requirement for uniformlyboundedness of the spectral norm of channel covariance. In thesequel we look into a specific model of ULA as an exampleand seek to further understand the physical meaning of theproposed method.

We still assume h(j)j is the channel of interest. Denote its

angular support as Φd. Decompose the interference channelh

(j)l ,∀l 6= j, as follows:

h(j)l = h

(j)li + h

(j)lo , (39)

where

h(j)li =

β(j)l√P

∑θ∈Φd

a(θ)eiϕθ (40)

h(j)lo =

β(j)l√P

∑θ/∈Φd

a(θ)eiϕθ , (41)

which means h(j)li is the residual multipath component of

the interference channel within the AoA region Φd of thedesired channel, while h

(j)lo is the multipath component which

is outside Φd.

Theorem 2 For a ULA base station, under Condition C1, ifthe residual multipath component of the interference channelsatisfies:

∀l 6= j,∥∥∥Ξjh

(j)li

∥∥∥2<∥∥∥Ξjh

(j)j

∥∥∥2, (42)

then, the estimation error of the estimator (22) vanishes:

limM,C→∞

∥∥∥h(j)CAj − h

(j)j

∥∥∥2

2∥∥∥h(j)j

∥∥∥2

2

= 0. (43)

Proof: See Appendix G.Theorem 2 further confirms the fact that for a base stationequipped with ULA, only the interference multipath compo-nents that overlap with those of the desired channel affect theperformance of our pilot decontamination method. In otherwords, the spatial filter Ξj removes the energy located in allinterference multipath originating from directions that do notoverlap with those of the desired channel. It is then sufficientfor the energy of the residual interference components tobe below that of the desired channel to allow for a fulldecontamination.

C. Generalization to multiple users per cell

Now we generalize the covariance-aided amplitude basedprojection into multi-user setting where K users are servedsimultaneously in each cell. We consider the estimation ofuser channel h

(j)jk in the reminder of this section.

Define a matrix H(j)j\k as a sub-matrix of H

(j)j after remov-

ing its k-th column,

H(j)j\k ,

[h

(j)j1 · · · h

(j)j(k−1) h

(j)j(k+1) · · · h

(j)jK

]. (44)

A corresponding estimate of (44), denoted by H(j)j\k, is ob-

tained by removing the k-th column of H(j)j , which can be

an LS estimate, MMSE estimate, or other linear/non-linearestimate of H

(j)j . For demonstration purpose only, in this paper

we use the simplest LS estimate, which already shows verygood performance.

In order to adapt the method in section V-A to multi-user scenario, we propose to first neutralize the intra-cellinterference with a Zero-Forcing (ZF) filter Tjk based on theLS estimate H

(j)j\k, and then apply the spatial filter Ξjk. After

these two filters, the data signal is now

Wjk , ΞjkTjkW(j), (45)

where

Tjk , IM − H(j)j\k(H

(j)Hj\k H

(j)j\k)−1H

(j)Hj\k , (46)

and

Ξjk ,

(L∑l=1

R(j)lk + σ2

nIM

)−1

R(j)jk . (47)

The rest of this method proceeds as in the single user setting.Take the dominant eigenvector of WjkW

Hjk/C:

ujk1 = e1{1

CWjkW

Hjk}. (48)

The estimate of the direction of h(j)jk is obtained by

ujk1 =Ξ′jkujk1∥∥∥Ξ′jkujk1

∥∥∥2

, (49)

where

Ξ′jk , R(j)†jk

(L∑l=1

R(j)lk + σ2

nIM

). (50)

Finally the phase and amplitude ambiguities are resolved bythe training sequence, and we have the estimate of h

(j)jk :

h(j)CAjk =

1

τujk1u

Hjk1Y

(j)SH . (51)

Note that in this method, we build the ZF type filter Tjk basedon a rough LS estimate. Further improvements can be attainedwith higher quality estimates at the cost of higher complexity.As a simple example, we can reduce the effect of noise onthe estimate H

(j)j\k by first applying EVD of W(j)W(j)H/C,

then removing the subspace where the noise lies, and finallyperforming LS estimation. These extensions are out of thescope of this paper.

VI. LOW-COMPLEXITY ALTERNATIVES

In this section, we propose two alternatives of the methodshown in section V, aiming at lower computational complexityat the cost of mild performance losses.


A. Subspace and amplitude based projection

The low-rankness of channel covariance implies that theuplink received desired signal lives in a reduced subspace.By projecting the received data signal W(j) onto the signalspace of R

(j)jk , we are able to preserve the signal from user k

in cell j while removing the interference and noise that livein its complementary subspace. In the following, we show asubspace-based signal space projection method that relies onthe covariance of desired channel only. For ease of exposition,we simplify the system setup to single user per cell. Let theuser in cell j be the target user. The EVD of the covarianceof the desired channel is

R(j)j = Vj ΣjV

Hj , (52)

where the diagonal entries of Σj contains the non-negligibleeigenvalues of R

(j)j . Then we project the received data signal

onto the signal space of R(j)j , or the column space of Vj :

Wj , VjVHj W(j). (53)

The rest of this method follows the same idea as thecovariance-aided amplitude based projection scheme. Takingthe eigenvector corresponding to the largest eigenvalue ofWjW

H

j /C:

uj1 = e1

{1

CWjW

H

j

}, (54)

the channel estimate of h(j)j is given by

h(j)SAj =

1

τuj1u

Hj1Y

(j)s∗, (55)

where the superscript “SA” stands for “subspace and ampli-tude based projection”. Note that this method does not requirethe covariance of interference channels or variance of noise. Itexplicitly relies on the assumption that the desired covariancematrix has a low-dimensional signal subspace, with somedegradations expected when this condition is not realized inpractice. In fact, if R

(j)j has full rank, this method degrades

to pure amplitude based projection.Note that this “SA” estimator has lower complexity than

the “CA” estimator (22) in the sense that 1) “SA” estimatordoes not require the statistical knowledge of the interferencechannels or the variance of the noise, and 2) “SA” estimatorskips step 2 in Algorithm 1.

The physical condition under which full decontaminationis achieved with this method is shown below in the case ofa ULA. We denote the angular support of desired channelh

(j)j by Φd and the multipath components of the interference

channel h(j)l falling in Φd as h

(j)li .

Theorem 3 For a ULA base station, if the power of interfer-ence channel that falls into the angular support Φd satisfies

∀l 6= j,∥∥∥h(j)

li

∥∥∥2<∥∥∥h(j)

j

∥∥∥2, (56)

and the channel covariance satisfies

∀M ∈ Z+,∀l 6= j,∥∥∥R(j) 1

2j VjV

Hj R

(j) 12

l

∥∥∥2< +∞, (57)

then, the estimation error of the estimator (55) vanishes

limM,C→∞

∥∥∥h(j)SAj − h

(j)j

∥∥∥2

2∥∥∥h(j)j

∥∥∥2

2

= 0. (58)

Proof: Due to lack of space, we skip the complete proof andonly give two key steps below. By applying the asymptotic or-thogonality between two steering vectors which are associatedwith different AoAs ( Lemma 3 in [8]), we may readily obtain

limM→∞

1√M

VjVHj h

(j)j =

1√M

h(j)j (59)

limM→∞

1√M

VjVHj h

(j)l =

1√M

h(j)li ,∀l 6= j, (60)

which means the multipath components of interference thatfall outside Φd disappear asymptotically after the projectionby VjV

Hj . Then, equation (57) ensures that

limM→∞

1

Mh

(j)Hj h

(j)l = 0, l 6= j, (61)

where

h(j)l , VjV

Hj h

(j)l , l = 1, . . . , L. (62)

Note that in Theorem 3 condition (57) is less restrictive thanthe uniformly boundedness of the spectral norm of the channelcovariance. In the special case of zero angular spread, the rankof channel covariance becomes one. Denote the deterministicAoA from the user in cell l to base station j as θ

(j)

l . We caneasily see that the channel estimation error of (55) vanishescompletely as M,C →∞ as long as

∀l 6= j, θ(j)

l 6= θ(j)

j , (63)

which occurs with probability one.When channel covariance is not available, we can still

benefit from the subspace projection method by approximatingVj with a subset of discrete Fourier transform (DFT) basis asshown in [1]. This DFT basis can be chosen based on a smallnumber of channel observations. The generalization to multi-user case can be done by introducing the ZF filter (46) as insection V-C. Due to lack of space, we skip the details.

B. MMSE + amplitude based projection

Another alternative is to directly project the MMSE estimateonto the subspace of E(j) obtained by EVD of W(j)W(j)H/C

as in section IV. The estimator for the multi-user channel H(j)j

is given by

H(j)MAj = E

(j)E

(j)HR

(j)

j

(τ(

L∑l=1

R(j)

l ) + σ2nIKM

)−1

SH

y(j),

(64)where

E(j)

, IK ⊗E(j), (65)

S , ST ⊗ IM =[

s1 ⊗ IM · · · sK ⊗ IM], (66)


andR

(j)

l = diag{R(j)l1 , ...,R

(j)lK}, l = 1, . . . , L. (67)

The superscript “MA” denotes MMSE + amplitude basedprojection. It is worth noting that both the amplitude-basedprojection and angular-based projection require a large numberof antennas to achieve complete decontamination. In contrast,the MMSE estimator is efficient with very small number ofantennas. As M grows, MMSE estimator starts to reduceinterference earlier than the previously proposed methods, aswill be shown by simulations in Section VII. However, unlikethe previously proposed schemes, this “MA” estimator can-not achieve complete decontamination when the interferencechannel is overlapping with desired channel in both angularand power domains.

VII. NUMERICAL RESULTS

This section contains numerical results of our differentchannel estimation schemes compared with prior methods. Inthe simulation, we have multiple hexagonally shaped adjacentcells in the network. The radius of each cell is 1000 meters.Each base station has M antennas, which forms a ULA, withhalf wavelength antenna spacing. The length of pilot sequenceis τ = 10.

Two performance metrics are considered. The first is thenormalized channel estimation error

ε ,1

KL

L∑j=1

K∑k=1

∥∥∥h(j)

jk − h(j)jk

∥∥∥2

2∥∥∥h(j)jk

∥∥∥2

2

. (68)

The estimation errors in the plots are obtained by Monte Carlosimulations and displayed in dB scale.

The second metric is the uplink per-cell rate when MRCreceiver (based on the obtained channel estimate) is used atthe base station side.

In all simulations presented in this section, we assume thatthe channel covariance matrix is estimated using 1000 exactchannel realizations. The multipath angle of arrival of anychannel (including the interference channel) follows a uniformdistribution centered at the direction corresponding to line-of-sight (LoS). The number of multipath is P = 50. Accordingto the coherence time model in [23], for a mobile user movingat a vehicular speed of 70 km/h in an environment of 2.6 GHzcarrier frequency and 5µs high delay spread (corresponding toan excess distance of 1.5 km), the channel can be assumedcoherent over 500 transmitted symbols. Thus, we will letC = 500 in simulations, although larger coherence time canbe expected in practice for a user with lower mobility.

Note that in all simulations, the amplitude-based projectionand MMSE + amplitude based projection follow the enhancedeigenvector selection strategy shown in Remark 1 with thedesign parameter µ = 0.2.

We first illustrate Theorem 1 in Fig. 1. Suppose we havea two-cell network, with each cell having one user. In orderto make the interference overlapping in power domain withthe desired signal, we set the path loss exponent γ = 0. Thepower of the interference channel has equal probability to behigher or lower than the power of the desired channel. The

user in each cell is deliberately put in a symmetrical positionsuch that the multipath angular supports of the interferenceand the desired channel are half overlapping with each other.

0 50 100 150 200 250 300 350 400 450 500−20

−15

−10

−5

0

5

Number of Antennas

Est

imat

ion

Err

or [d

B]

LS estimationPure MMSEPure amplitudeMMSE + amplitudeCovariance−aided amplitudeMMSE − no interference

Fig. 1. Estimation performance vs. M, 2-cell network, 1 user per cell, pathloss exponent γ = 0, partially overlapping angular support, AoA spread 60degrees, SNR = 0 dB.

In the figure, “LS estimation” and “Pure MMSE” denotethe system performances when an LS estimator and an MMSEestimator (12) are used respectively. “Pure amplitude” denotesthe case when we apply the generalized amplitude basedprojection method only. “MMSE + amplitude” representsthe proposed estimator (64). “Covariance-aided amplitude”denotes the proposed covariance-aided amplitude based pro-jection method (22). The curve “MMSE - no interference”shows the estimation error of an MMSE estimator in aninterference-free scenario. As can be seen from Fig. 1, dueto the overlapping interference in both angle and powerdomains, the performance of all estimators saturate quicklywith the number of antennas, except the proposed covariance-aided amplitude based projection method, which eventuallyoutperforms interference-free MMSE estimation.2

In Fig. 2 and Fig. 3, we show the performance of estimationerror and the corresponding uplink per-cell rate for a 7-cellnetwork, with single user per cell. The users are assumed tobe distributed randomly and uniformly within their own cellsexcluding a central disc with radius 100 meters. The angularspread of the user channel (including interference channel)is 30 degrees. The path loss exponent is now γ = 2. Aswe may observe, the traditional LS estimator suffers fromsevere pilot contamination. The pure amplitude based methodand the pure MMSE method alleviate the pilot interference,yet saturate with the number of antennas. These saturationeffects come from the overlapping of the interference and thedesired channels in power and angular domains respectively.The “MMSE + amplitude” approach outperforms these twoknown methods as it discriminates against interference in bothamplitude and angular domains. However this scheme cannot

2The reason is that the performance of the interference-free MMSE esti-mation has a non-vanishing lower bound due to white Gaussian noise. On thecontrary, our proposed covariance-aided amplitude based projection methodeliminates the effects of noise and interference asymptotically.


0 50 100 150 200 250 300 350 400 450 500−25

−20

−15

−10

−5

0

5

Number of Antennas

Est

imat

ion

Err

or [d

B]

LS estimationPure MMSEPure amplitudeMMSE + amplitudeCovariance−aided amplitude

Fig. 2. Estimation performance vs. M, 7-cell network, one user per cell,AoA spread 30 degrees, path loss exponent γ = 2, cell-edge SNR = 0 dB.

0 100 200 300 400 5002

3

4

5

6

7

8

9

10

11

Number of Antennas

Per

−ce

ll R

ate

[bps

]

LS estimationPure MMSEPure amplitudeMMSE + amplitudeCovariance−aided amplitude

Fig. 3. Uplink per-cell rate vs. M, 7-cell network, one user per cell, AoAspread 30 degrees, path loss exponent γ = 2, cell-edge SNR = 0 dB.

cope with the case of overlapping in both domains. Owingto its robustness, the covariance-aided amplitude projectionmethod outperforms the rest in terms of both estimation errorand uplink per-cell rate.

We now turn our attention to multi-cell multi-user scenario.Fig. 4 and Fig. 5 show the channel estimation performanceand the corresponding uplink per-cell rate for a 7-cell networkwith each cell having 4 users. In these two figures, we add thecurve of subspace and amplitude based projection, which isdenoted in the figures as “Subspace + amplitude”. The otherparameters remain unchanged compared with those in Fig.2 and Fig. 3. We can notice that in Fig. 4 the covariance-aided amplitude projection method has some performanceloss with respect to the low-complexity MMSE + amplitudemethod and the MMSE method when the number of antennasis small. It is due to the following two facts: 1) when Mis small, it is well known that MMSE works well, but notthe amplitude based methods, and 2) with small M , theasymptotical orthogonality of channels of different users is notfully exhibited, and consequently a small amount of signal ofinterest is removed by the ZF filter Tkj , along with intra-cellinterference. However it is not disturbing in the sense that 1) as

the number of antennas grows, the covariance-aided amplitudeprojection method quickly outperforms the other methods; and2) The per-cell rate of this proposed method is still good evenwith moderate number of antennas, e.g., M > 25. It is alsointeresting to note that the low-complexity alternative scheme,subspace and amplitude based projection method, has someminor performance loss, yet keeps approximately the sameslope as the covariance-aided amplitude projection.

0 50 100 150 200 250 300 350 400 450 500−16

−14

−12

−10

−8

−6

−4

−2

0

2

4

Number of AntennasE

stim

atio

n E

rror

[dB

]

LS estimationPure MMSEPure amplitudeMMSE + amplitudeSubspace + amplitudeCovariance−aided amplitude

Fig. 4. Estimation performance vs. M, 7-cell network, 4 users per cell, AoAspread 30 degrees, path loss exponent γ = 2, cell-edge SNR = 0 dB.

0 100 200 300 400 5000

5

10

15

20

25

30

Number of Antennas

Per

−ce

ll R

ate

[bps

]

LS estimationPure MMSEPure amplitudeMMSE + amplitudeSubspace + amplitudeCovariance−aided amplitude

Fig. 5. Uplink per-cell rate vs. M, 7-cell network, 4 users per cell, AoAspread 30 degrees, path loss exponent γ = 2, cell-edge SNR = 0 dB.

VIII. CONCLUSIONS

In this paper we proposed a series of robust channel esti-mation algorithms exploiting path diversity in both angle andamplitude domains. The first method called “covariance-aidedamplitude based projection” is robust even when the desiredchannel and the interference channels overlap in multipathAoA and are not separable just in terms of power. Twolow-complexity alternative schemes were proposed, namely“subspace and amplitude based projection” and “MMSE +amplitude based projection”. Asymptotic analysis shows thecondition under which the channel estimation error convergesto zero.


APPENDIX

A. Proof of Proposition 1:

Denote the associated path loss as β. The covariance R isa Toeplitz matrix, with its mn-th entry given by

R(m,n) = β

∫ π

0

p(θ)ej2πDλ (n−m) cos(θ)dθ (69)

= β

∫ 1

−1

p (arccos(x)) ej2πDλ (n−m)x 1√

1− x2dx

=βλ

2πD

∫ 2πDλ

− 2πDλ

p(arccos( λx

2πD ))√

1−(λx

2πD

)2 ej(n−m)xdx,

=1

2π

∫ 2πDλ

− 2πDλ

f(x)ej(n−m)xdx, (70)

where

f(x) ,βλ

D

p(arccos( λx

2πD ))√

1−(λx

2πD

)2 . (71)

Since 0, π /∈ Φ, or in other words, p(0) = p(π) = 0, and thatp(θ) <∞,∀θ ∈ Φ, it follows that f(x) is uniformly bounded:

f(x) < +∞,−2πD

λ≤ x ≤ 2πD

λ. (72)

Thus, the Toeplitz matrix R is related to the real integrable anduniformly bounded generating function f(x), with its entriesbeing Fourier coefficients of f(x). We now resort to the knownresult on the spectrum of the n× n Toeplitz matrices Tn(f)defined by the generating function f(x). Denote by ess inf andess sup the essential minimum and the essential maximum off , i.e., the infimum and the supremum of f up to within a setof measure zero. Let mf , ess inff and Mf , ess supf .

Theorem 4 [24] If λ(n)0 ≤ λ

(n)1 ≤ · · · ≤ λ

(n−1)n−1 are

the eigenvalues of Tn(f), then, the spectrum of Tn(f) iscontained in (mf ,Mf ); moreover lim

n→∞λ

(n)0 = mf and

limn→∞

λ(n−1)n−1 = Mf .

By invoking Theorem 4, we obtain that limM→∞

‖R‖2 = Mf <

∞. In addition, for any finite M , the inequality ‖R‖2 < ∞always holds true. This concludes the proof.

B. Proof of Lemma 2:

Since Rj and(∑L

l=1 Rl + σ2nIM

)−1

are both positivesemi-definite (PSD) Hermitian matrices, we can directly applythe inequalities of [25] on the eigenvalues of the product oftwo PSD Hermitian matrices

‖Ξj‖2 ≤

∥∥∥∥∥∥(

L∑l=1

Rl + σ2nIM

)−1∥∥∥∥∥∥

2

‖Rj‖2 <ζ

σ2n

. (73)

It is straightforward to show that∥∥ΞjΞHj

∥∥2

= ‖Ξj‖22 <ζ2

σ4n

, (74)

which indicates that the spectral norm of ΞjΞHj is also

uniformly bounded. This proves Lemma 2.

C. Proof of Lemma 4:

Using the spatial correlation model (27), we may write

1

MhHj hl =

1

MhHWjR

12j ΞH

j ΞjR12

l hWl. (75)

By an abuse of notation, we now use the operator λ1{·} torepresent the largest singular value of a matrix. Appealing tothe singular value inequalities in [26], we can show that themaximum singular value of R

12j ΞH

j ΞjR12

l yields

λ1{R12j ΞH

j ΞjR12

l } ≤ λ1{R12j }λ1{ΞH

j ΞjR12

l } (76)

< ζ12λ1{ΞH

j Ξj}λ1{R12

l } (77)

<ζ3

σ4n

, (78)

which means the spectral radius of the complex matrixR

12j ΞH

j ΞjR12

l is uniformly bounded for any M . Thus, accord-ing to Lemma 3, 1

M hHj hl,∀l 6= j, converges almost surely tozero. Thus (32) holds true. In a similar way, we can prove(33). This concludes the proof of Lemma 4.

D. Proof of Lemma 5:

Define

Γ , limC→∞

(1

CWjW

Hj

)(79)

= hjhHj +

∑l 6=j

hlhHl + σ2

nΞjΞHj . (80)

In this proof, we first consider the noise free scenario and let

Γnf = hjhHj +

∑l 6=j

hlhHl , (81)

where the subscript “nf” denotes noise free. We can then write

limM→∞

∥∥∥∥∥Γnf

M

hj∥∥hj∥∥2


∥∥∥∥∥2

2

(82)

= limM→∞

(Γnf

M

hj∥∥hj∥∥2


)H(Γnf

M

hj∥∥hj∥∥2


)

= limM→∞

1

M2

∥∥hj∥∥2

2− limM→∞

2αjM

∥∥hj∥∥2

2+ α2

j

= α2j − 2α2

j + α2j

= 0,

which proves that when M → ∞, an eigenvalue of therandom matrix Γnf/M converges to αj , with its correspondingeigenvector converging to hj/

∥∥hj∥∥2up to a random phase.

Then we consider the Hermitian matrix σ2nΞjΞ

Hj as a

perturbation on Γnf/M . Due to the Bauer-Fike Theorem [27]on the perturbation of eigenvalues of Hermitian matrices,together with Lemma 2, we have for 1 ≤ i ≤ L:

limM→∞

∣∣∣∣λi{ Γ

M

}− λi

{Γnf

M

}∣∣∣∣ (83)

≤ limM→∞

σ2n

M

∥∥ΞjΞHj

∥∥2

(84)

= 0. (85)


The above result shows that the impact of the perturbationon the eigenvalues of Γnf/M vanishes as M → ∞. In otherwords, αj is again an asymptotic eigenvalue of Γ/M . Now weverify that despite the perturbation, the eigenvector of Γ/Mcorresponding to the asymptotic eigenvalue αj also convergesto hj/

∥∥hj∥∥2up to a random phase. To prove this, it is

sufficient to show that

limM→∞

∥∥∥∥∥ Γ

M

hj∥∥hj∥∥2


∥∥∥∥∥2

(86)

≤ limM→∞

∥∥∥∥∥Γnf

M

hj∥∥hj∥∥2


∥∥∥∥∥2

+

∥∥∥∥∥σ2nΞjΞ

Hj

M

hj∥∥hj∥∥2

∥∥∥∥∥2

(a)= 0,

where (a) is due to the definition of the spectral norm

limM→∞

∥∥∥∥∥σ2nΞjΞ

Hj

M

hj∥∥hj∥∥2

∥∥∥∥∥2

= 0. (87)

It follows that

limM,C→∞

∥∥∥∥∥WjWHj

MC

hj∥∥hj∥∥2


∥∥∥∥∥2

= 0, (88)

which concludes the proof of Lemma 5.

E. Proof of Lemma 6:

We can derive

limM,C→∞

∥∥∥∥∥ Ξ′jhj∥∥Ξ′jhj∥∥2

−Ξ′juj1e

jφ∥∥Ξ′juj1∥∥2

∥∥∥∥∥2

2

(89)

= limM,C→∞

(Ξ′jhj∥∥Ξ′jhj∥∥2

−Ξ′juj1e


)H·(

Ξ′jhj∥∥Ξ′jhj∥∥2

−Ξ′juj1e


)

= 2− limM,C→∞

(hHj Ξ′j

HΞ′juj1e

jφ∥∥Ξ′jhj∥∥2

∥∥Ξ′juj1∥∥2

+e−jφuHj1Ξ

′jH


∥∥Ξ′juj1∥∥2

)

We treat the following quantity separately

limM,C→∞

hHj Ξ′jH

Ξ′juj1ejφ∥∥Ξ′jhj∥∥2

∥∥Ξ′juj1∥∥2

(90)

= limM,C→∞

hHj Ξ′jH

Ξ′j

(hj

‖hj‖2+ uj1e

jφ − hj

‖hj‖2

)∥∥Ξ′jhj∥∥2

∥∥Ξ′juj1∥∥2

= limM,C→∞

∥∥∥∥Ξ′j hj

‖hj‖2

∥∥∥∥2∥∥Ξ′juj1∥∥2

= limM,C→∞

∥∥∥∥Ξ′j ( hj

‖hj‖2− uj1e

jφ + uj1ejφ

)∥∥∥∥2∥∥Ξ′juj1∥∥2

≤ limM,C→∞

∥∥∥∥Ξ′j( hj

‖hj‖2− uj1e

jφ)

∥∥∥∥2∥∥Ξ′juj1∥∥2

+ limM,C→∞

∥∥Ξ′juj1ejφ∥∥2∥∥Ξ′juj1∥∥2

= 1 (91)

In a similar way, we can prove that

limM,C→∞

∥∥Ξ′juj1∥∥2∥∥∥∥Ξ′j hj

‖hj‖2

∥∥∥∥2

≤ 1. (92)

Combining (91) and (92), we obtain

limM,C→∞

hHj Ξ′jH

Ξ′juj1ejφ∥∥Ξ′jhj∥∥2

∥∥Ξ′juj1∥∥2

= 1. (93)

With analogous derivation, we can prove

limM,C→∞

e−jφuHj1Ξ′jH


∥∥Ξ′juj1∥∥2

= 1. (94)

Applying (93) and (94) to (89) gives

limM,C→∞

∥∥∥∥∥ Ξ′jhj∥∥Ξ′jhj∥∥2

−Ξ′juj1e


∥∥∥∥∥2

2

= 0. (95)

The following equality holds

Ξ′jhj = Ξ′jΞjhj = R†jRjhj = hj , (96)

proving that

limM,C→∞

∥∥∥∥ hj‖hj‖2

− uj1ejφ

∥∥∥∥2

= 0, (97)

which completes the proof of Lemma 6.

F. Proof of Theorem 1:

From (37) we readily obtain

limM,C→∞

hHj uj1

‖hj‖2= 1. (98)

Recall from the uplink training (7), we have

hCAj =

1

τuj1u

Hj1

hjsT +

∑l 6=j

hlsT + N

s∗, (99)

and hence

limM,C→∞

∥∥∥hCAj − hj

∥∥∥2

2

‖hj‖22(100)


= limM,C→∞

(hCAj − hj)

H(hCAj − hj)

‖hj‖22

= limM,C→∞

1

‖hj‖22

∑l 6=j

hluj1uHj1

∑l 6=j

hl +∑l 6=j

hluj1uHj1N

s∗

τ

+sT

τNH uj1u

Hj1

∑l 6=j

hl +sT

τNH uj1u

Hj1N

s∗

τ

− hHj uj1uHj1hj + hHj hj

)= limM,C→∞

1

‖hj‖22

(hHj hj − hHj uj1u

Hj1hj

). (101)

Equation (98) ensures that

limM,C→∞

1

‖hj‖22hHj uj1u

Hj1hj =

1

‖hj‖22hHj hj = 1, (102)

which concludes the proof.

G. Proof of Theorem 2:

This proof follows similar steps towards Theorem 1. Thuswe give a sketch of the proof only. Define

Γ , limC→∞

(1

CWjW

Hj

)(103)

= h(j)j h

(j)Hj +

∑l 6=j

h(j)l h

(j)Hl + σ2

nΞjΞHj , (104)

where h(j)l , Ξjh

(j)l , l = 1, . . . , L. Due to the asymptotic

orthogonality between steering vectors in disjoint angularsupport, i.e., Lemma 3 in [8], we can easily show that in largeantenna limit, h

(j)lo falls into the null space of R

(j)j . Thus

limM→∞

1

Mh

(j)l h

(j)Hl = lim

M→∞

1

Mh

(j)li h

(j)Hli . (105)

Then we have

limM→∞

Γ

M=

1

M

h(j)j h

(j)Hj +

∑l 6=j

h(j)li h

(j)Hli + σ2

nΞjΞHj

.

Under Condition C1, it is easy to show that

limM→∞

∥∥∥∥∥∥ Γ

M

h(j)j∥∥∥h(j)j

∥∥∥2

−h

(j)H

j h(j)j

M

h(j)j∥∥∥h(j)j

∥∥∥2

∥∥∥∥∥∥2

= 0. (106)

Given the following condition

∀l 6= j,∥∥∥Ξjh

(j)li

∥∥∥2<∥∥∥Ξjh

(j)j

∥∥∥2, (107)

it is clear that the dominant eigenvector of Γ/M converges toh

(j)j /∥∥∥h(j)

j

∥∥∥2

(up to a random phase), with its corresponding

eigenvalue converging to h(j)H

j h(j)j /M . Then, using the same

technique in the proof of Lemma 6, we obtain

limM,C→∞

h(j)H

j uj1∥∥∥h(j)j

∥∥∥2

= 1. (108)

Finally, we readily obtain (43) by analogous derivations inAppendix F.

REFERENCES

[1] H. Yin, L. Cottatellucci, D. Gesbert, R. Muller, and G. He, “Pilot de-contamination using combined angular and amplitude based projectionsin massive MIMO systems,” in IEEE 16th Workshop on Signal Process-ing Advances in Wireless Communications, SPAWC 2015, Stockholm,Sweden, Jun. 2015, pp. 216–220.

[2] T. L. Marzetta, “Noncooperative cellular wireless with unlimited num-bers of base station antennas,” IEEE Trans. Wireless Commun, vol. 9,no. 11, pp. 3590–3600, Nov. 2010.

[3] H. Q. Ngo, E. G. Larsson, and T. L. Marzetta, “Energy and spectral effi-ciency of very large multiuser MIMO systems,” IEEE Trans. Commun.,vol. 61, no. 4, pp. 1436–1449, Apr. 2013.

[4] J. Jose, A. Ashikhmin, T. L. Marzetta, and S. Vishwanath, “Pilotcontamination problem in multi-cell TDD systems,” in Proc. IEEEInternational Symposium on Information Theory (ISIT09), Seoul, Korea,Jun. 2009, pp. 2184–2188.

[5] ——, “Pilot contamination and precoding in multi-cell TDD systems,”IEEE Trans. Wireless Commun., vol. 10, no. 8, pp. 2640–2651, Aug.2011.

[6] J. H. Sørensen and E. D. Carvalho, “Pilot decontamination through pilotsequence hopping in massive MIMO systems,” in Proc. of IEEE GlobalTelecommunications Conference (Globecom), Austin, TX, Dec. 2014,pp. 3285–3290.

[7] I. Atzeni, J. Arnau, and M. Debbah, “Fractional pilot reuse in massiveMIMO systems,” in 2015 IEEE International Conference on Communi-cation Workshop (ICCW), London, U.K., Jun. 2015, pp. 1030–1035.

[8] H. Yin, D. Gesbert, M. Filippou, and Y. Liu, “A coordinated approachto channel estimation in large-scale multiple-antenna systems,” IEEE J.Sel. Areas Commun., vol. 31, no. 2, pp. 264–273, Feb. 2013.

[9] D. Neumann, A. Gruendinger, M. Joham, and W. Utschick, “Pilot coor-dination for large-scale multi-cell TDD systems,” in 18th InternationalITG Workshop on Smart Antennas (WSA), Erlangen, Germany, Mar.2014, pp. 1–6.

[10] A. Ashikhmi, T. L. Marzetta, and L. Li, “Interference reduction inmulti-cell massive MIMO systems I: large-scale fading precoding anddecoding,” Submitted to IEEE Trans. Inf. Theory, 2014. [Online].Available: http://arxiv.org/abs/1411.4182

[11] H. Q. Ngo and E. G. Larsson, “EVD-based channel estimation inmulticell multiuser MIMO systems with very large antenna arrays,” inProc. IEEE International Conference on Acoustics, Speech, and SignalProcessing (ICASSP). IEEE, 2012, pp. 3249–3252.

[12] R. R. Muller, L. Cottatellucci, and M. Vehkapera, “Blind pilot decontam-ination,” IEEE J. Sel. Topics Signal Process, vol. 8, no. 5, pp. 773–786,Oct. 2014.

[13] D. Hu, L. He, and X. Wang, “Semi-blind pilot decontamination formassive MIMO systems,” IEEE Trans. Wireless Commun., vol. 15, no. 1,pp. 525–536, Jan. 2016.

[14] A. Adhikary, J. Nam, J.-Y. Ahn, and G. Caire, “Joint spatial division andmultiplexing – the large-scale array regime,” IEEE Trans. Inf. Theory,vol. 59, no. 10, pp. 6441–6463, Oct 2013.

[15] H. Yin, D. Gesbert, and L. Cottatellucci, “Dealing with interference indistributed large-scale MIMO systems: A statistical approach,” IEEE J.Sel. Topics Signal Process., vol. 8, no. 5, pp. 942–953, Oct 2014.

[16] H. Yin, L. Cottatellucci, and D. Gesbert, “Enabling massive MIMOsystems in the FDD mode thanks to D2D communications,” in Proc. ofAsilomar Conference on Signals, Systems, and Computers, PacificGrove, CA, Nov 2014, pp. 656–660.

[17] A. Adhikary and G. Caire, “Joint spatial division and multiplexing:Opportunistic beamforming and user grouping,” arXiv preprint, 2013.[Online]. Available: http://arxiv.org/abs/1305.7252

[18] L. Cottatellucci, R. R. Muller, and M. Vehkapera, “Analysis of pilotdecontamination based on power control,” in Proc. of IEEE VehicularTechnology Conference (VTC), Jun. 2013, pp. 1–5.

[19] S. Wagner, R. Couillet, M. Debbah, and D. Slock, “Large systemanalysis of linear precoding in correlated MISO broadcast channelsunder limited feedback,” IEEE Trans. Inf. Theory, vol. 58, no. 7, pp.4509–4537, 2012.

[20] A. F. Molisch, Wireless communications. Wiley, 2010.[21] A. J. Paulraj, R. Nabar, and D. Gore, Introduction to space-time wireless

communications. Cambridge University Press, 2003.[22] J. Evans and D. N. C. Tse, “Large system performance of linear

multiuser receivers in multipath fading channels,” IEEE Trans. Inf.Theory, vol. 46, no. 6, pp. 2059–2078, 2000.

[23] T. S. Rappaport, Wireless communications: principles and practice.Prentice Hall PTR New Jersey, 1996, vol. 2.

http://arxiv.org/abs/1411.4182

http://arxiv.org/abs/1305.7252


[24] U. Grenander and G. Szego, Toeplitz forms and their applications. 2nded. Chelsea, New York, 1984.

[25] A. W. Marshall, I. Olkin, and B. Arnold, Inequalities: theory ofmajorization and its applications. Springer Science & Business Media,2010.

[26] R. A. Horn and C. R. Johnson, Topics in matrix analysis. CambridgeUniversity Press, 1991.

[27] F. L. Bauer and C. T. Fike, “Norms and exclusion theorems,” NumerischeMathematik, vol. 2, no. 1, pp. 137–141, 1960.

Haifan Yin received the B.Sc. degree in Electricaland Electronic Engineering and the M.Sc. degreein Electronics and Information Engineering fromHuazhong University of Science and Technology,Wuhan, China, in 2009 and 2012 respectively. From2009 to 2011, he had been with Wuhan NationalLaboratory for Optoelectronics, China, working onthe implementation of TD-LTE systems as an R&Dengineer. In September 2012, he joined the Mo-bile Communications Department at EURECOM,France, as a Ph.D. student. His current research

interests include signal processing, channel estimation, cooperative networks,and large-scale antenna systems.

H. Yin was a recipient of the 2015 Chinese Government Award forOutstanding Self-financed Students Abroad.

Laura Cottatellucci is currently assistant professorat the Dept. of Mobile Communications in Eurecom.She obtained the PhD from Technical University ofVienna, Austria (2006). Specialized in networkingat Guglielmo Reiss Romoli School (1996, Italy), sheworked in Telecom Italia (1995-2000) as responsibleof industrial projects. From April 2000 to September2005 she worked as senior research in ftw Austriaon CDMA and MIMO systems. From October toDecember 2005 she was research fellow on ad-hocnetworks in INRIA (Sophia Antipolis, France) and

guest researcher in Eurecom. In 2006 she was appointed research fellow atthe University of South Australia, Australia to work on information theoryfor networks with uncertain topology. Cottatellucci is co-editor of a specialissue on cooperative communications for EURASIP Journal on WirelessCommunications and Networking and co-chair of RAWNET/WCN3 2009 inWiOpt. Her research topics of interest are large system analysis of wirelessand complex networks based on random matrix theory and game theory.

David Gesbert (F’11) is Professor and Head of theMobile Communications Department, EURECOM,France. He obtained the Ph.D degree from Ecole Na-tionale Superieure des Telecommunications, France,in 1997. From 1997 to 1999 he has been with theInformation Systems Laboratory, Stanford Univer-sity. In 1999, he was a founding engineer of IospanWireless Inc, San Jose, Ca.,a startup company pio-neering MIMO-OFDM (now Intel). Between 2001and 2003 he has been with the Department of Infor-matics, University of Oslo as an adjunct professor.

D. Gesbert has published about 250 papers and several patents all in the areaof signal processing, communications, and wireless networks. He was namedin the 2014 Thomson-Reuters List of Highly Cited Researchers in ComputerScience. Since 2015, he holds an ERC advanced grant on the topic of “SmartDevice Communications”

D. Gesbert was a co-editor of six special issues on wireless networks andcommunications theory. He was an associate editor for IEEE Transactions onWireless Communications and the EURASIP Journal on Wireless Communi-cations and Networking. He authored or co-authored papers winning the 2004and 2015 IEEE Best Tutorial Paper Award (Communications Society), 2012SPS Signal Processing Magazine Best Paper Award. He co-authored the book“Space time wireless communications: From parameter estimation to MIMOsystems”, Cambridge Press, 2006. He is a Technical Program Chair for IEEEICC 2017, to be held in Paris.

Ralf R. Muller (S’96–M’03–SM’05) was bornin Schwabach, Germany, 1970. He received theDipl.-Ing. and Dr.-Ing. degree with distinction fromFriedrich-Alexander-Universitat (FAU) Erlangen-Nurnberg in 1996 and 1999, respectively. From2000 to 2004, he directed a research group atThe Telecommunications Research Center Viennain Austria and taught as an adjunct professor atTU Wien. In 2005, he was appointed full professorat the Department of Electronics and Telecommu-nications at the Norwegian University of Science

and Technology in Trondheim, Norway. In 2013, he joined the Institute forDigital Communications at FAU Erlangen-Nurnberg in Erlangen, Germany.He held visiting appointments at Princeton University, US, Institute Eurecom,France, University of Melbourne, Australia, University of Oulu, Finland,National University of Singapore, Babes-Bolyai University, Cluj-Napoca,Romania, Kyoto University, Japan, FAU Erlangen-Nurnberg, Germany, andTU Munchen, Germany.

Dr. Muller received the Leonard G. Abraham Prize (jointly with SergioVerdu) for the paper “Design and analysis of low-complexity interferencemitigation on vector channels” from the IEEE Communications Society. Hewas presented awards for his dissertation “Power and bandwidth efficiency ofmultiuser systems with random spreading” by the Vodafone Foundation forMobile Communications and the German Information Technology Society(ITG). Moreover, he received the ITG award for the paper “A random matrixmodel for communication via antenna arrays” as well as the Philipp-ReisAward (jointly with Robert Fischer). Dr. Muller served as an associate editorfor the IEEE TRANSACTIONS ON INFORMATION THEORY from 2003to 2006. He is currently serving on the executive editorial board of the IEEETRANSACTIONS ON WIRELESS COMMUNICATIONS.

Gaoning He received his Master and Ph.D degreesin France, from Ecole d’Ingenieurs Telecom Paris-Tech (ENST). He had been a research engineer inMotorola research center in Paris from 2006 to 2009,research fellow in Alcatel-Lucent Chair on FlexibleRadio - Supelec in Paris from 2009 to 2010, and re-search scientist at Alcatel-Lucent Bell-labs in Shang-hai from 2010 to 2011. He joined Huawei Shanghairesearch center as senior researcher in 2011. Since2014, he is assistant director and research managerof the Mathematical and Algorithmic Sciences Lab

in Huawei France Research Center. His research interests lie in fundamentalmathematics, algorithms, statistics, information & communication sciencesresearch.

Documents

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 11, 2016 1 … · 2018-10-18 · IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 11, 2016 2 the angular spread of the signal