Download pdf - On the Distribution of SINR for the MMSE MIMO Receiver and ...statistics.rutgers.edu/home/pingli/papers/mmse.pdf · On the Distribution of SINR for the MMSE MIMO Receiver and Performance

1

On the Distribution of SINR for the MMSE MIMOReceiver and Performance Analysis

Ping Li, Debashis Paul, Ravi Narasimhan, Member, IEEE, and John Cioffi, Fellow, IEEE

Abstract— This paper studies the statistical distribution of thesignal-to-interference-plus-noise ratio (SINR) for the minimummean square error (MMSE) receiver in multiple-input-multiple-output (MIMO) wireless communications. The channel modelis assumed to be (transmit) correlated Rayleigh flat-fadingwith unequal powers. The SINR can be decomposed into twoindependent random variables: SINR = SINRZF

+ T , whereSINRZF corresponds to the SINR for a zero-forcing (ZF) receiverand has an exact Gamma distribution. This paper focuses oncharacterizing the statistical properties of T using the resultsfrom random matrix theory. First three asymptotic momentsof T are derived for uncorrelated channels and channels withequicorrelations. For general correlated channels, some limitingupperbounds for the first three moments are also provided. Foruncorrelated channels and correlated channels satisfying certainconditions, it is proved that T converges to a Normal randomvariable. A Gamma distribution and a generalized Gammadistribution are proposed as approximations to the finite sampledistribution of T . Simulations suggest that these approximatedistributions can be used to accurately estimate the probability oferrors even for very small dimensions (e.g., 2 transmit antennas).

Index Terms— Multiple-input-multiple-output system, mini-mum mean square error receiver, signal-to-interference-plus-noise ratio, channel correlation, random matrix, asymptoticdistributions, Gamma approximation, error probability.

I. INTRODUCTION

We consider the following signal and channel model in amultiple-input-multiple-output (MIMO) system,

yr =1√m

HWRt

1

2 P1

2 xt + nc =1√m

Hxt + nc, (1)

where xt ∈ Cp is the (normalized) transmitted signal vectorand yr ∈ Cm is the received signal vector. Here p is thenumber of transmit antennas and m is the number of receiveantennas. HW ∈ Cm×p consists of i.i.d. standard complexNormal entries. Rt ∈ Cp×p is the transmitter correlationmatrix. P = diag[c1, c2, ..., cp] ∈ Rp×p; ck = ck

mp , where ck

is the signal to noise ratio (SNR) for the kth spatial stream.This definition of SNR is consistent with [1] (Section 7.4). Wewill treat H = HWRt

1

2 P1

2 ∈ Cm×p as the channel matrix.nc ∈ Cp×p is the complex noise vector and is assumed to havezero mean and identity covariance. Note that the power matrixP has terms involving the variance of the noise. In this study,

Ping Li is with the Department of Statistics, Stanford University (email:[email protected]). Debashis Paul is with the Department of Statis-tics, University of California, Davis (email: [email protected]).Ravi Narasimhan is with the Department of Electrical Engineering, Uni-versity of California, Santa Cruz (email: [email protected]). John Cioffi iswith the Department of Electrical Engineering, Stanford University (email:[email protected]).

we assume that the correlation matrix Rt and power matrixP are non-random. Also, we restrict our attention to p ≤ m.

We consider the popular linear minimum mean squareerror (MMSE) receiver. Conditional on the channel matrix H,the signal-to-interference-plus-noise ratio (SINR) on the kth

spatial stream can be expressed as (e.g., [1]–[6])

SINRk =1

MMSEk− 1 =

1[

(

Ip + 1mH†H

)−1]

kk

− 1, (2)

where Ip is a p × p identity matrix, and H† is the Hermitian

transpose of H. Note that (2), in the same form as (7.49) of[1], is derived based on the second order statistics of the inputsignals, not restricted to binary signals.

For binary inputs, Verdu [4] (6.47) provides the exactformula for computing the bit error rate (BER) (also see [7]).Conditional on H, this BER formula requires computing 2p−1

Q-functions. To compute BER unconditionally, we need tosample H enough times (e.g., 105) to get a reliable estimate.When p ≥ 32 (or p ≥ 64), the computations becomeintractable [4], [8].

Recently, study of the asymptotic properties of multiuserreceivers (e.g., [2]–[4], [6], [8]–[11]) has received a lot ofattention. Works that relate directly to the content of thispaper include Tse and Hanly [11] and Verdu and Shamai[6], who independently derived the asymptotic first moment ofSINR for uncorrelated channels. Tse and Zeitouni [3] provedthe asymptotic Normality of SINR for the equal power case,and commented on the possibility of extending the resultto the unequal powers scenario. Zhang et. al. [12] provedthe asymptotic Normality of the multiple access interference(MAI), which is closely related to SINR. Guo et. al. [8]proved the asymptotic Normality of the decision statisticsfor a variety of linear multiuser receivers. [8] considered ageneral power distribution and corresponding unconditionalasymptotic behavior.

Based on the asymptotic Normality results, Poor and Verdu[2] (also in [4], [8]) proposed using the limiting BER (denotedby BER∞), which is a single Q-function,

BER∞ = Q(

√

E(SINRk)∞)

=

∫ ∞

√E(SINRk)∞

e−t2/2dt, (3)

where E(SINRk)∞ denotes the asymptotic first moment ofSINRk.

Equation (3) is convenient and accurate for large dimen-sions. However, its accuracy for small dimensions is of someconcern. For instance, [8] compared the asymptotic BERwith simulation results. This study showed that even with

2

p = 64 there exist significant discrepancies. In general (3) willunderestimate the true BER. For example, in our simulations,when m = 16, p = 8, SNR = 15 dB, the asymptotic BERgiven by (3) is roughly 1

10000 of the exact BER. In currentpractice, CDMA channels with m, p between 32 and 64 aretypical and in multi-antenna systems arrays of 4 antennas aretypical but arrays with 8 to 16 antennas would be consideredfeasible in the near future [9]. Therefore it would be usefulif one can compute error probabilities both efficiently andaccurately. Another motivation to have accurate and simpleBER expressions comes from system optimization designs[13], [14].

We address three aspects related to improving the accuracyin computing probability of errors. First, the known formulaefor the asymptotic moments can be improved at small di-mensions. Secondly, while the asymptotic BER (3) convergesvery slowly [4] (page 305), we can improve its accuracy byconsidering other asymptotically equivalent distributions usinghigher moments. Thirdly, the presence of channel correlationsmay (seriously) affect the moments and BER. In fact, whenthe dimension increases, the effect of correlation tends toinvalidate the independence assumption [15]. To the best ofour knowledge, the asymptotic SINR results in the existingliterature do not take into account correlations. Muller [16] hascommented that the presence of correlations make the analysisvery difficult .

The channel model in (1) takes into account the (transmit)channel correlations. This model is somewhat restrictive inthat it does not consider the receiver correlation and it is aGaussian channel (Rayleigh flat-fading), while [3] and [8] didnot assume Gaussianity. Removing the Normality assumptionwill still result in the same first moment of SINR but differentsecond moment (see [3]). Ignoring the channel correlation,however, may produce quite different results even for the firstmoment. For example, it will be shown later that, with equalpower and constant correlation ρ (equicorrelation model), thefirst moment is only about (1 − ρ) times the first momentwithout correlation.

Our work starts with a crucial observation: assuming acorrelated channel model in (1), the SINR can be decomposedinto two independent components

SINRk = SINRZFk + T = S + T, (4)

where SINRZFk , denoted by S, is the corresponding SINR for

the zero-forcing (ZF) receiver, which, conditional on H, canbe expressed as (e.g. [1], [3], [17]),

SINRZFk =

1[

(

1mH†H

)−1]

kk

. (5)

We will show that S has a Gamma distribution. BecauseS and T are independent, we can focus only on T . As S isoften the dominating component, separating out S is expectedto improve the accuracy of the approximation.

For uncorrelated channels and channels with equicorrela-tion, we derive the first three asymptotic moments for T (thusfor SINRk also). We show that these asymptotic momentsmatch the simulations remarkably well. For general correlated

channels, we propose some limiting upperbounds for the firstthree moments.

We prove the asymptotic Normality of T for correlatedchannels under certain rather restrictive conditions. The proofof asymptotic Normality provides a rigorous basis for propos-ing approximate distributions. In addition, if T is proved toconverge to a non-random distribution (e.g., a Normal withnon-random mean and variance), then the same limits holdfor the “unconditional” SINRk, by dominated convergence.The subtle difference between conditional and unconditionalSINRk becomes important if we want to extend to randompower distributions or random correlations (see [8]).

As an alternative to the Q-function in (3), a well-knownapproximation to the true BER would be

BERa4=

∫ ∞

0

(∫ ∞

√x

1√2π

e−t2

2 dt

)

dFSINRk(x), (6)

where FSINRkis the cumulative distribution function (CDF) of

SINRk. (6) converges to (3) as long as the asymptotic Nor-mality holds. We propose using a Gamma and a generalizedGamma distribution to approximate FSINRk

. In particular, ifT is approximated as a generalized Gamma random variable,combining the exact distribution of S, we are able to producevery accurate BER curves even for m = 4.

We will use binary signals as a test case for evaluating ourmethods. We assume BFSK for obtaining the same limitingBER (3) as in [2], [4], [8]. It should be clear that ourmethodology applies to other type of constellations such asM-QAM.

The paper is organized as follows. Section II describeshow to decompose SINRk = SINRZF

k + T . Section III hasthe derivation of the exact distribution of SINRZF

k and itsindependence from T . Section IV contains the derivation of thefirst three asymptotic moments (or upperbounds on moments)of T . Asymptotic Normality of T is proved in Section V. InSection VI the Gamma and generalized Gamma distributionapproximations are proposed. Section VII has a demonstrationabout how to apply our results on SINR in computing theprobability of errors. Section VIII compares our theoreticalresults with Monte Carlo simulations.

II. DECOMPOSITION OF SINRk

Let A = Ip + 1mH

†H. We apply the following shifting

operation,

A =(

a1 a2 . . . ak . . . ap

)

shift−→(

Akk a†k(−k)

ak(−k) A(−k,−k)

)

= A, (7)

where ak ∈ Cp×1 stands for the kth column of A, ak(−k) ∈C(p−1)×1 is ak with the kth entry removed. A(−k,−k) ∈C(p−1)×(p−1) is A with the kth column and kth row removed.Then,[

A−1]

kk=[

A−1]

11

=(

Akk − a†k(−k)

(

A(−k,−k)

)−1ak(−k)

)−1

. (8)

3

From (8), it follows that (2) can be expressed as

SINRk

=1

(

Akk − a†k(−k)

(

A(−k,−k)

)−1ak(−k)

)−1 − 1

=1

mh†

khk−

1

m2h†

kH(−k)

(

Ip−1 +1

mH

†(−k)H(−k)

)−1

H†(−k)hk, (9)

where hk ∈ Cm×1 stands for the kth column of H, H(−k) ∈Cm×(p−1) is H with the kth column removed.

We consider the singular value decomposition (SVD):H(−k) = UDV

†, U ∈ Cm×(p−1), D ∈ R(p−1)×(p−1),V ∈ C(p−1)×(p−1), U

†U = Ip−1, V

†V = VV† = Ip−1.

Let Uc be the orthogonal complement of U, implying thatU

†Uc = 0(p−1)×(m−p+1), . We augment Uc to U to get

U = [U Uc] which satisfies U†U = UU

† = Im.Plug these into (9) to get

SINRk =1

mh†

kUU†hk

− 1

m2h†

kUD2

(

Ip−1 +1

mD

2

)−1

U†hk. (10)

Now, expand U†hk =

[

U†hk

U†chk

]

=

[

tksk

]

, tk ∈C(p−1)×1, sk ∈ C(m−p+1)×1. Then (10) becomes

SINRk =1

ms†ksk +

1

mt†k

(

Ip−1 +1

mD

2

)−1

tk

=1

m

m−p+1∑

i=1

‖sk,i‖2 +1

m

p−1∑

i=1

‖tk,i‖2

1 + 1md2

i

= S + T, (11)

where di is the ith diagonal entry of D.We can apply the same shifting and SVD operations on

SINRZFk in (5) to get

SINRZFk =

1

ms†ksk = S. (12)

This proves the decomposition SINRk = SINRZFk + T . So

far, we have not used any statistical properties of H, hencethe decomposition is not restricted to Gaussian channels.

In the case p > m, SINRk in (2) is still valid. The sameSVD yields

SINRk =1

m

m∑

i=1

‖tk,i‖2

1 + 1md2

i

= T, (13)

because H(−k) has only m nonzero singular values when p >m. The results from random matrix theory that we will uselater also apply to p > m. However, in this study, we restrictour attention to p ≤ m.

III. DISTRIBUTIONS OF S AND tk

We can get the distribution of S and its relationship withT from the properties of the multivariate Normal distribution(see [18] Chapter 3). We consider R = P

1

2 RtP1

2 as thegeneralized covariance matrix. R is assumed to be positivedefinite.

We denote a Normal distribution as N(mean, Var), and acomplex Normal distribution as CN(mean, Var). Thus, eachrow of HW ∼ CN(0, Ip), and each row of H ∼ CN(0,R).First observe that,

Σk4=

1

[R−1]kk=

ck

[Rt−1]kk

= rkk − r†k(−k)(R(−k,−k))−1rk(−k), (14)

where rkk is the (k, k)th element of R, rk(−k) is the kth

column of R with the kth element removed, R(−k,−k) is R

with the kth row and kth column removed, [R−1]kk is the(k, k)th entry of R

−1.Conditional on H(−k), hk, tk, and sk are all Normally

distributed:

hk|H(−k) ∼ −H(−k)(R(−k,−k))−1rk(−k)

+ CN (0, ΣkIm) , (15)

tk = U†hk|H(−k) ∼ −DV

†(R(−k,−k))−1rk(−k)

+ CN (0, ΣkIp−1) , (16)

sk = U†chk|H(−k) ∼ CN (0, ΣkIm−p+1) . (17)

(Recall that H(−k) = UDV†, U

†H(−k) = DV

†,U

†cH(−k) = U

†cUDV

† = 0.)

We state our first lemma.Lemma 1: SINRZF

k = S is a Gamma random variable, S ∼G(m − p + 1, 1

mΣk). S is independent of T .Proof: Recall that S = 1

m

∑m−p+1i=1 ‖sk,i‖2. The inde-

pendence of S and T follows from (16) and (17). Conditionaldistribution of sk given H(−k) is independent of H(−k).Therefore, (17) is the unconditional distribution of sk. Theelements of sk are i.i.d. CN(0, Σk), and therefore S ∼ G(m−p + 1, 1

mΣk).

Gore et. al. [19] proved that SINRZFk is a Gamma random

variable for the equal power case. Both Muller et. al. [17] andTse and Zeitouni [3] used a Beta distribution to approximateSINRZF

k .The first three moments of S are

E(S) =m − p + 1

mΣk,

Var(S) =m − p + 1

m2Σ2

k,

Sk(S) = E(

(S − E(S))3)

= 2m − p + 1

m3Σ3

k. (18)

The following lemma concerns the distribution of tk.Lemma 2: Conditional on H(−k), ‖tk,i‖2 is a non-central

Chi-squared random variable with a moment generating func-

4

tion (MGF)

E(

exp(‖tk,i‖2y)∣

∣H(−k)

)

=1

1 − yΣkexp

(

y‖zi‖2

1 − yΣk

)

.

(19)

where zi = −di

(

V†R

−1(−k,−k)rk(−k)

)

i. For uncorrelated

channels, ‖tk,i‖2 ∼ G (1, ck), independent of H(−k).Proof: From (16), tk,i|H(−k) ∼ CN (zi, Σk). For uncor-

related channels, zi = 0 and Σk = ck so that ‖tk,i‖2 ∼G (1, ck). Therefore, conditional on H(−k),

2Σk

‖tk,i‖2 is astandard non-central Chi-squared random variable with 2 de-grees of freedom and non-centrality parameter 2

Σk‖zi‖2, which

gives the MGF in (19).

For future use we give expressions for three moments of‖tk,i‖2, conditional on H(−k).

E(‖tk,i‖2|H(−k)) = Σk + ‖zi‖2,

Var(‖tk,i‖2|H(−k)) = Σ2k + 2‖zi‖2Σk,

Sk(‖tk,i‖2|H(−k)) = 2Σ3k + 6‖zi‖2Σ2

k. (20)

The significance of the decomposition of SINRk can be seenfrom Fig. 1, where we plot E

(

TS

)

as a function of p, c (SNR),and correlation ρ, for m = 16. Here we consider equal powerck = c case with Rt being an equicorrelation matrix (i.e., Rt

consists of 1’s in the main diagonal and ρ’s in all off diagonalentries). This figure suggests that in the major range of SNRand p

m , S might be the dominating component.

2 4 6 8 10 12 14 1610

−6

10−4

10−2

100

102

p (m = 16)

E(T

/S)

ρ = 0ρ = 0.5ρ = 0.8

SNR = 30 dB

SNR = 0 dB

Fig. 1. The ratios E“

TS

”

, obtained from 105 simulations, with respect topm

, for two (equal power) SNR levels (c = 0 dB, 30 dB), and three levelsof equicorrelations (ρ = 0, 0.5, 0.8).

IV. ASYMPTOTIC MOMENTS

In this section we derive the asymptotic moments of T . Tcan be written as

T =1

m

p−1∑

i=1

‖tk,i‖2

1 + 1md2

i

=1

m

p−1∑

i=1

‖tk,i‖2λi =1

mt†kΛtk,

where λi = 11+ 1

m d2

i

, Λ =(

Ip−1 + 1mD

)−1=

diag[λ1, ..., λp−1].

We use some known results for the empirical eigenvaluedistribution (ESD) of the product of two random matrices (e.g,Silverstein [20], Bai [21], Silverstein and Bai [22]), to studythe asymptotic properties of T . We work under the regime:p → ∞, m → ∞, p−1

m → γ ∈ (0, 1). In the rest of the paper,whenever we mention “in the limit”, we refer to this condition.

Suppose that {τi}p−1i=1 are the eigenvalues of R(−k,−k).

Suppose further that the ESD of R = P1

2 RtP1

2 convergesas p → ∞ to a non-random measure F R. By Theorem 1.1 of[20], under some weak regularity conditions on F R (includingthat the support of F R is compact and does not contain 0), inthe limit, the ESD of 1

mH†(−k)H(−k), denoted by J , converges

to a measure J , whose Stieltjes transform, denoted by MJ ,satisfies

MJ(z)4=

∫

1

x − zJ(dx) =

∫

dF R(τ)

τ(1 − γ − γzMJ(z)) − z,

(21)

(see [8]). For finite p, the last integral in (21) can be approx-imated by

1

p − 1

p−1∑

i=1

1

τi(1 − γ − γzMJ(z)) − z.

Note that for uncorrelated channels, τi = ci. In general,(21) requires that z ∈ C+ = {z ∈ C : Im(z) > 0}, andMJ(z) is the unique solution in {M ∈ C : − 1−γ

z + γM ∈C+}. However, since 0 < MJ(−1) < 1 and since R ispositive definite (implying that all τi’s are positive), and weonly consider γ ≤ 1, it follows that 1

τi(1−γ−γzMJ(z))−z andits derivatives are bounded in a complex neighborhood ofz = −1, hence MJ(z) and its derivatives are well-definedat z = −1 by the bounded convergence theorem.

Therefore,

tr(Λ)

p − 1=

1

p − 1

p−1∑

i=1

1

1 + 1md2

i

=

∫

1

1 + xJ(dx)

=

∫

1

x − (−1)J(dx)

p−→ MJ (−1)4= µγ,c, (22)

tr(Λ2)

p − 1=

1

p − 1

p−1∑

i=1

1(

1 + 1md2

i

)2

=

∫

1

(x − (−1))2J(dx)

p−→ M ′J (−1)

4= σ2

γ,c, (23)

tr(Λ3)

p − 1=

1

p − 1

p−1∑

i=1

1(

1 + 1md2

i

)3

=

∫

1

(x − (−1))3J(dx)

p−→ 1

2M ′′

J (−1)4= ηγ,c,

(24)

where “p−→” denotes “converge in probability”, and M ′

J(z)and M ′′

J (z) are the first and second derivatives of MJ(z),respectively. In general, MJ(z), M ′

J(z), and M ′′J (z) have to

be solved numerically except in some simple cases (e.g., anuncorrelated channel with equal powers).

5

Let ∆i(z) = τi(1 − γ − γzMJ(z)) − z. M ′J(z), and M ′′

J

can be approximated by solving

M ′J(z)

(

1 − 1

p − 1

p−1∑

i=1

τiγz

∆i(z)2

)

=1

p − 1

p−1∑

i=1

τiγMJ(z) + 1

∆i(z)2,

M ′′J (z)

(

1 − 1

p − 1

p−1∑

i=1

τiγz

∆i(z)2

)

=2

p − 1

p−1∑

i=1

τiγM ′J(z)

∆i(z)2

+2

p − 1

p−1∑

i=1

(τiγMJ(z) + τiγzM ′J(z) + 1)2

∆i(z)3.

MJ(z) and its derivatives are not sufficient for computingthe moments of T , which involve ‖tk,i‖2. We will combinethe conditional moments of ‖tk,i‖2 in (20) with MJ(z) to getthe asymptotic moments of T .

We consider three cases in the following three subsec-tions. First, we consider uncorrelated channels with unequalpowers. For this case, we obtain the asymptotic momentsof T explicitly, although the results cannot be expressed inclosed forms. Next, we show that, assuming equal powers,the asymptotic moments of T for uncorrelated channels haveclosed form expressions. Finally, we consider the generalcorrelated channel with unequal powers. In this case, we onlyderive some limiting upperbounds for the first three momentsof T and give some sufficient conditions under which theseupperbounds are the exact limits.

A. Asymptotic moments of T for uncorrelated channels

In this case, Rt = Ip, R = P, Σk = ck = ckmp , rk(−k) =

0, ‖zi‖2 = d2i ‖(V †

R−1(−k,−k)rk(−k))i‖2 = 0, and ‖tk,i‖2 ∼

G(1, ck) as shown in Lemma 2. The following three lemmasare proved in Appendix I.

Lemma 3:

E

(

p

p − 1T

)

→ ckµγ,c4= ckMJ(−1). (25)

Lemma 4:

Var

(

p√p − 1

T

)

→ c2kσ2

γ,c4= c2

kM ′J(−1). (26)

Lemma 5:

1

p − 1Sk(pT ) → 2c3

kηγ,c4= c3

kM ′′J (−1). (27)

Therefore, asymptotically, the first three moments of T canbe approximated as

E(T ) ≈ ckp − 1

pµγ,c, (28)

Var(T ) ≈ c2k

p − 1

p2σ2

γ,c, (29)

Sk(T ) ≈ 2c3k

p − 1

p3ηγ,c. (30)

Combining with the moments of S in (18), and usingindependence of S and T , we have

E(SINRk) ≈ ckm − p + 1

p+ ck

p − 1

pµγ,c, (31)

Var(SINRk) ≈ c2k

m − p + 1

p2+ c2

k

p − 1

p2σ2

γ,c, (32)

Sk(SINRk) ≈ 2c3k

m − p + 1

p3+ 2c3

k

p − 1

p3ηγ,c. (33)

B. Expression for moments of T for uncorrelated channelswith equal powers

In this case, H =√

cHW. In the limit, the ESD of1mH

†(−k)H(−k) converges to the well-known “Marcenko-

Pastur Law” (Theorem 2.5 in [21]), from which we canderive closed form expressions for the moments by (tedious)integration. Alternatively, we can directly solve for MJ(−1)from the simplified version (i.e., taking τi = c for all i) of(21) to get

µγ,c = MJ(−1) =κ − (c(1 − γ) + 1)

2cγ, (34)

where κ =√

c2(1 − γ)2 + 2c(1 + γ) + 1.Similarly,

σ2γ,c = M ′

J(−1) = µγ,c −1 + c(1 + γ) − κ

2cγκ, (35)

and

ηγ,c =1

2M ′′

J (−1) = σ2γ,c −

c

κ3. (36)

Because 1 + c(1 + γ) ≥ κ, we know

µγ,c ≥ σ2γ,c ≥ ηγ,c. (37)

We always replace γ by p−1m in our computations. In the

literature (e.g., [3], [4], [8]), γ = pm is often used. We show

that the choice of γ can have a significant impact for smalldimensions. 1

Our approximate formula (31) for E(SINRk) (denoted as µ)is

µ = cm − p + 1

p+ c

p − 1

pµγ,c

=c − 1

4

p − 1

m

1

γ

(

√

c (1 +√

γ)2

+ 1 −√

c (1 −√γ)

2+ 1

)2

.

(38)

We replace γ with p−1m , and the corresponding µ is denoted

by µp−1. If instead we replace γ with pm , we denote it by

µp. We can compare µ with the well-known asymptotic firstmoment (e.g. [3], [4] (6.59)), denoted by µR,

µR = c − 1

4

(

√

c (1 +√

γ)2 + 1 −√

c (1−√γ)2 + 1

)2

.

(39)Similarly, we use µR,p−1 and µR,p to indicate whether γ =p−1m or γ = p

m is used for µR. Obviously µR,p−1 = µp−1.

1Note that in our notations, c = c mp

is equivalent to “ A2

σ2” in [4] (6.59)

and equivalent to “ Pσ2

” in [3].

6

Fig. 2 plots µp−1

µR,pand µp−1

µpas functions of m and p

m . Itis clear that the difference between µp−1 and µR,p at smalldimensions can be substantial. For example, when m = 4, p =4, µp−1

µR,pis almost 3. On the other hand, the difference between

µp−1 and µp is not significant. This experiment implies thatthe approximate formulae we obtain are not only accurate (seesimulation results in Section VIII) but also not as sensitive tothe choice of γ.

4 16 32 48 640.5

1

1.5

2

2.5

3

m

Rat

ios

µp−1

µR,p,p

m=

1

2

µp−1

µR,p,p

m=1

µp−1

µp,p

m=

1

2

µp−1

µp,p

m=1

Fig. 2. The ratiosµp−1

µR,pand

µp−1

µp, with respect to m and p/m, for SNR

= 20 dB.

In fact, we can prove the following Lemma to compare µwith µR algebraically:

Lemma 6: µp ≥ µp−1 = µR,p−1 ≥ µR,p.

Proof: See Appendix II.

We can get similar results if we compare our approximatevariance formula (32) with the asymptotic variance given by[3]. See the technical report [23] for details.

C. Asymptotic upperbounds for correlated channels

Recall that E(

‖tk,i‖2∣

∣H(−k)

)

= Σk + ‖zi‖2, where

‖zi‖2 = d2i

∥

∥

(

V †(R(−k,−k))−1rk(−k)

)

i

∥

∥

2. We have an im-

portant inequality:

p−1∑

i=1

(

1

m‖zi‖2λi

)

=

p−1∑

i=1

1m‖zi‖2

1 + 1md2

i

≤p−1∑

i=1

∥

∥

(

V †(R(−k,−k))−1rk(−k)

)

i

∥

∥

2

=‖V †R(−k,−k))

−1rk(−k)‖2

=‖R(−k,−k))−1rk(−k)‖2 4

= ek. (40)

We provide the limiting upperbounds for the first threemoments for T in the next Lemma.

Lemma 7:

E(T )U =p − 1

mΣkMJ(−1) + ek, (41)

Var (T )U

=p − 1

m2Σ2

kM ′J(−1) +

1

mΣk

(

1

2+ 2

√

2

δ

)

ek + e2k,

(42)

Sk(T )U =p − 1

m3Σ3

kM ′′J (−1) +

8/9

m2Σ2

kek +6

m2Σ2

k

1

δek + e3

k

+3

2ekΣk

1

m

√

ek + 21

mΣk

√

2

δ

√

ek + 21

mΣk

√

2

δ′.

(43)

where δ and δ′ are the same constants in the proofs of Lemma4 and Lemma 5.

Proof: See Appendix III.

We are interested in the special case where ek → 0. If ek →0, at any rate (faster than O(p−

1

2 ) or faster than O(p−2

3 )),ignoring all the terms involving ek, E(T )U (Var(T )U orSk(T )U ) is still the true limit. Under these conditions, wepropose the approximate moments for SINRk

E(SINRk) ≈ m − p + 1

mΣk +

p − 1

mΣkµγ,c + ek, (44)

Var(SINRk) ≈ m − p + 1

m2Σ2

k +p − 1

m2Σ2

kσ2γ,c + e2

k, (45)

Sk(SINRk) ≈ 2m− p + 1

m3Σ3

k + 2p − 1

m3Σ3

kηγ,c + e3k, (46)

Note that we retain the terms ek, e2k, e3

k in these expressionsbecause simulation studies show that keeping these termsimproves accuracy of the approximation for very small di-mensions.

It turns out that in the equicorrelation situation, ek → 0 at agood rate under mild regularity conditions. The equicorrelationis the simplest correlation model (e.g., [15] (4.41)) and isoften used to model closely spaced antennas or for the worstcase analysis [24]. For more realistic correlation models, seeNarasimhan [25].

Lemma 8: If the correlation matrix Rt consists of 1’s inthe main diagonal and ρ’s in all off-diagonal entries,2 thenek → 0 in the limit, as long as

∑

i 6=kck

ci= O(pf ) for f < 2.

Sufficient conditions for ek → 0 faster than O(p−1

2 ) andO(p−

2

3 ) would be f < 32 and f < 4

3 , respectively.

Proof: Rt can be written as Rt = (1 − ρ)Ip +ρ1p1

Tp , where 1p denotes a column vector of p 1’s. Recall

that R = P1

2 RtP1

2 . R(−k,−k) = (1 − ρ)P(−k,−k) +

ρP1

2

(−k,−k)1p−11Tp−1P

1

2

(−k,−k). The inverse of R(−k,−k) willbe

(R(−k,−k))−1

=P

−1(−k,−k)

1 − ρ−

P− 1

2

(−k,−k)

1 − ρ

ρ

(p − 1)ρ + 1 − ρ1p−11

Tp−1P

− 1

2

(−k,−k)

(47)

2In order for Rt to be positive definite, we have to restrict − 1

p−1< ρ < 1.

7

Note that rk(−k) = ρP1

2

kkP1

2

(−k,−k)1p−1. Therefore

(

R(−k,−k)

)−1rk(−k)

=ρ

1 − ρP

1

2

kkP− 1

2

(−k,−k)1p−1

− ρ

1 − ρP

1

2

kkP− 1

2

(−k,−k)

ρ

(p − 1)ρ + 1 − ρ1p−1(p − 1)

=ρ

(p − 1)ρ + 1 − ρP

1

2

kkP− 1

2

(−k,−k)1p−1, (48)

and

ek =‖(R(−k,−k))−1rk(−k)‖2

=

(

ρ

(p − 1)ρ + 1 − ρ

)2

Pkk1Tp−1P

−1(−k,−k)1p−1

=

(

ρ

(p − 1)ρ + 1 − ρ

)2∑

i6=k

ck

ci, (49)

from which it follows that ek → 0 if∑

i6=kck

ci= O(pf ), for

some f < 2. If we want ek → 0 faster than O(p−1

2 ) andO(p−

2

3 ), we should have f < 32 and f < 4

3 , respectively. Thiscompletes the proof.

In the equal power case,∑

i6=kck

ci= p − 1, and Σk ≈

(1 − ρ)c, which implies that the first (second, third) momentof SINRk for an equicorrelation channel is roughly only 1−ρ((1 − ρ)2, (1 − ρ)3) of the moment without correlations.

To conclude this section, we mention that our asymptoticresults also apply to the real channels (e.g., in [3], [8]) with amodification that involves multiplying the variance by a factorof 2 and the third moment by a factor of 4.

V. ASYMPTOTIC NORMALITY OF T

In this section, we prove that, if appropriately normalized,T converges in distribution to a Normal random variable, bothin the case of uncorrelated channels as well as the correlatedchannels if ek → 0 faster than O(p−

1

2 ). The latter includesthe equicorrelation channel with regularity conditions as inLemma 8.

The difficulty in dealing with channels with arbitrary cor-relations is due to the term

p−1∑

i=1

1

m‖zi‖2λi

=

p−1∑

i=1

( 1md2

i

1 + 1md2

i

)

∥

∥

(

V†(R(−k,−k))

−1rk(−k)

)

i

∥

∥

2. (50)

Although we know that (50) is bounded by ek =‖(R(−k,−k))

−1rk(−k)‖2, it is challenging to show that (50)converges (not necessarily to zero). Therefore, for the rest ofthis section, we assume that ek → 0 faster than O(p−

1

2 ),which makes the contribution of the term (50) asymptoticallynegligible.

Lemma 9:

W =mT − Σktr(Λ)√

p − 1

D=⇒ N

(

0, Σ2kσ2

γ,c

)

(51)

where “D

=⇒” means “converge in distribution”.

Proof: See Appendix IV.Corollary 1:

mT − (p − 1)Σkµγ,c√p − 1

D=⇒ N

(

0, Σ2kσ2

γ,c

)

(52)

Proof:

mT − (p − 1)Σkµγ,c√p − 1

=mT − Σktr (Λ)√

p − 1+

Σktr (Λ) − (p − 1)Σkµγ,c√p − 1

.

We have shown that mT−Σk tr(Λc)√p−1

D=⇒ N

(

0, Σ2kσ2

γ,c

)

. The

fact that Σktr(Λ)−(p−1)µγ,c√

p−1

P→ 0, follows Theorem 1.1 in Bai

and Silverstein [22], which implies that (p−1)(

tr(Λ)p−1 − µγ,c

)

is stochastically bounded. Equation (52) now follows fromthe Converging Together Lemma (or Slutsky’s Theorem) [26].

Corollary 2:

SINRk − m−p+1m Σk − p−1

m Σkµγ,c√

m−p+1m2 Σ2

k + p−1m2 Σ2

kσ2γ,c

D=⇒ N(0, 1). (53)

Proof: This is a direct consequence of the independenceof S and T .

VI. DISTRIBUTION APPROXIMATIONS

Once we have proved the asymptotic Normality of T (andSINRk), it is possible to rigorously define the “approximating”distributions. Since S and T are independent, we do not needto approximate the distribution of S. However, we keep theoption of approximating the distribution of S + T itself.

Our approach is to match the first two asymptotic momentsof T with the corresponding moments of a target distribution(e.g., Gamma). The asymptotic Normality of T (Lemma 9)ensures that the approximate distribution converges to thetrue limit. Normal approximation is not accurate for smalldimensions, and to an extent this is because T is positiveand a Normal random variable has zero third central moment.We expect a Gamma distribution, which has non-negativesupport and non-zero third central moment, to be a betterapproximation.

It turns out that only considering the first two moments maynot be sufficient for accurately computing error probabilities.Therefore, we consider a generalized Gamma distribution,which matches the first three moments and hence is likelyproduce better results, as another approximation.

In this section, we assume the same conditions as in provingthe asymptotic Normality of T . That is, ek → 0 faster thanO(p−

1

2 ). When this condition is not satisfied, we do not havea rigorous asymptotic result.

We start with the Normal approximation.

8

A. Normal Approximation

Asymptotic Normality of T implies that,

T ∼ N

(

p − 1

mΣkµγ,c,

p − 1

m2Σ2

kσ2γ,c

)

. (54)

Also, the asymptotic Normality of SINRk means that,

SINRk ∼ N

(

m − p + 1

mΣk +

p − 1

mΣkµγ,c ,

m − p + 1

m2Σ2

k +p − 1

m2Σ2

kσ2γ,c

)

. (55)

B. Gamma Approximation

First, we approximate T by a Gamma random variableG(αT , βT ) whose parameters are determined by solving:

E(T ) = αT βT =p − 1

mΣkµγ,c,

Var(T ) = αT β2T =

p − 1

m2Σ2

kσ2γ,c.

The Gamma approximation of T is therefore,

T ∼ G

(

(p − 1)µ2γ,c

σ2γ,c

,1

mΣk

σ2γ,c

µγ,c

)

. (56)

According to the Gamma approximation, the third centralmoment of T should be

2αT β3T = 2

p− 1

m3Σ3

k

σ4γ,c

µγ,c. (57)

We can also approximate SINRk by a Gamma distributionG(α, β), again via the method of moments,

SINRk ∼ G

(

(m − p + 1 + (p − 1)µγ,c)2

m − p + 1 + (p − 1)σ2γ,c

,

1

mΣk

m − p + 1 + (p − 1)σ2γ,c

m − p + 1 + (p − 1)µγ,c

)

. (58)

The third central moment of SINRk then would be

2

m3Σ3

k

(

m − p + 1 + (p − 1)σ2γ,c

)2

m − p + 1 + (p − 1)µγ,c. (59)

We define RS to be the ratio of the third central moment ofthe approximated Gamma distribution to the asymptotic thirdcentral moment. For T this becomes,

RST =σ4

γ,c

µγ,cηγ,c. (60)

For SINRk,

RSSINRk=

(

1 − p−1m + p−1

m σ2γ,c

)2

(

1 − p−1m + p−1

m µγ,c

) (

1 − p−1m + p−1

m ηγ,c

) .

(61)

RST and RSSINRkare indicators of how well the Gamma

approximation captures the skewness of the distribution of T .Ideally, we would like RS = 1. We will show later that RSis also critical for the generalized Gamma approximation.

For uncorrelated channels with equal powers, we have thefollowing inequalities.

Lemma 10: Assuming equal powers and no correlations,

RST ≤ 1 (62)

RSSINRk≤ 1 for any p ≤ m (63)

RST ≤ RSSINRkin the limit. (64)

Proof: See Appendix V.Fig. 3 plots some examples of RST and RSSINRk

, for theequal power uncorrelated cases.

0 0.2 0.4 0.6 0.8 10.5

0.6

0.7

0.8

0.9

1

γR

atio

s of

3rd

mom

ents

TSINR

k

c = 0 dB

c = 10 dB

c = 20 dB

c = 0 dB

c = 10 dB

c = 20 dB

Fig. 3. RST and RSSINRkfor selected SNR levels, over the whole range

of γ, for equal power uncorrelated channels.

C. Generalized Gamma Approximation

A generalized Gamma distribution can be described by astable law or an infinite divisible distribution [27], [28], [26](Chapter 2.7, 2.8), which involves the sum of i.i.d. sequence ofrandom variables. In our case, conditionally, T is a weightedsum of non-i.i.d. random variables, hence T is not exactly astable law nor infinite divisible.

We generalize the Gamma approximation of T by introduc-ing an additional parameter to the original Gamma distribution[27], [28], i.e., assuming T ∼ G(αT , βT , ξT ). The regularGamma distribution is a special case with ξT = 1. Assuminga generalized Gamma distribution, we can write down the firstthree moments of T as

E (T ) = αT βT , Var (T ) = αT β2T , Sk (T ) = (ξT + 1)αT β3

T .(65)

Equating these moments with the asymptotic moments ofT , we will get the same αT , and βT as for the Gammaapproximation. The third parameter ξT will be

ξT =2

RST− 1. (66)

Similarly, we can also generalize the Gamma approximationof SINRk by assuming SINRk ∼ G(α, β, ξ). The thirdparameter will be

ξ =2

RSSINRk

− 1. (67)

When ξ > 1, the generalized Gamma distribution withthese parameter does not have an explicit density in general.

9

However, it can be described in terms of the stable law anddoes have a closed-form MGF [27], [28],

MGF (s; G(α, β, ξ))

= exp

(

α

ξ − 1

(

1 − (1 − βξs)ξ−1

ξ

)

)

. (68)

When ξ < 1, the generalized Gamma distribution is acompound Poisson distribution with a MGF

MGF (s; G(α, β, ξ))

= exp

(

α

1 − ξ

(

(

1

1 − βξs

)1−ξ

ξ

− 1

))

. (69)

From Lemma 10, we can infer that ξT > 1 and ξ > 1 foruncorrelated equal power channels.

VII. ANALYSIS OF THE PROBABILITY OF ERROR

Computation of the probability of errors using the distri-bution of SINRk is a way of measuring how successfully theproposed distribution approximates the truth. For this purpose,we will compute the BER using (6), which is equivalent to theBFSK BER. It should be straightforward to apply our methodsto other types of (non-binary) constellations.

To simplify the notations, we drop the subscript k inBERk for the rest of the paper. In this section, we provideBER formulae under the Gamma and the generalized Gammaapproximations. We will compare these results with the exactformula for BER, denoted by BERe, given in [4] (6.47) or [7].

If the asymptotic Normality of SINRk holds, as in Corollary2, then

BERa4=

∫ ∞

0

(∫ ∞

√x

1√2π

e−t2

2 dt

)

dFSINRk(x)

→BER∞4= Q

(

√

E(SINRk)∞)

. (70)

For uncorrelated channels with equal powers,

E(SINRk)∞

= c1 − γ

γ+ cµγ,c

=

√

c2(1 − γ)2 + 2c(1 + γ) + γ2 + c(1 − γ) − γ

2γ, (71)

by treating p−1m = p

m = γ. Since it is not convenientto compute E(SINRk)∞ for the correlated cases, we willuse the our finite dimensional moment formula to computeE(SINRk)∞, which is different for different p and m.

Next, we will derive a variety of BER formulae correspond-ing to various approximation schemes, based on BFSK, whichcan be easily generalized to other types of constellations.

A. BER by Gamma Approximation on SINRk

Denote the BER computed using the Gamma approximationby BERg . Integration of (6) by parts yields

BERg =

∫ ∞

0

1√2π

e−x2 FSINRk

(x)1

2x− 1

2 dx. (72)

Replacing FSINRk(x) with Fα,β(x), the CDF of the Gamma

distribution G(α, β) as defined in (58), and using the resultsfrom integral tables [29], we have

BERg =

∫ ∞

0

1√2π

e−x2 Fα,β(x)

1

2x− 1

2 dx

=1

2Γ(α)√

2π

∫ ∞

0

x− 1

2 e−x2 Γ

(

α,x

β

)

dx

=1

2Γ(α)√

2π

Γ (1/2 + α) (1/β)α

α (1/2 + 1/β)1/2+α

×

2F1

(

1, 1/2 + α, α + 1,1/β

1/2 + 1/β

)

, (73)

where the gamma function Γ(y) =∫∞0 ty−1e−tdt, and the

incomplete gamma function Γ (α, y) =∫ y

0 tα−1e−tdt, and2F1(.) is the hypergeometric function,

2F1

(

1, 1/2 + α, α + 1,1/β

1/2 + 1/β

)

=

∞∑

n=0

Γ(1/2+α+n)Γ(1/2+α)

Γ(α+1+n)Γ(α+1)

(

1/β1/2+1/β

)n

n!. (74)

B. BER by Generalized Gamma Approximation on SINRk

According to the results of [30] , [31] (Chapter 9.2.3), underthe generalized Gamma approximation on SINRk, the BFSKBER (denoted as BERgg) can be expressed in terms of theMGF, as

BERgg =1

π

∫ π2

0

MGF

(

− 1

2 sin2 φ; G (α, β, ξ)

)

dφ

=1

π

∫ π2

0

exp

(

α

ξ − 1

(

1 −(

1 +βξ

2 sin2 φ

)ξ−1

ξ

))

dφ, (75)

which can be evaluated numerically. Note that this expressionis only for ξ > 1, but we can similarly write down BERgg forξ < 1.

C. BER by Generalized Gamma Approximation on T

Under this approximation, when ξT > 1, we can write downthe MGF for SINRk = S + T as

MGF

(

s; G

(

m − p + 1,Σk

m

)

+ G (αT , βT , ξT )

)

=1

(

1 − Σk

m s)m−p+1 exp

(

αT

ξT − 1

(

1 − (1− βT ξT s)ξT −1

ξT

))

.

(76)

Therefore, we can write the BER (denoted as BERg+gg) tobe

BERg+gg =1

π

∫ π2

0

1(

1 + Σk

2m sin2 φ

)m−p+1×

exp

αT

ξT − 1

1 −(

1 +βT ξT

2 sin2 φ

)

ξT −1

ξT

dφ, (77)

which can be evaluated numerically.

10

VIII. SIMULATIONS

In our simulations, we consider m = 4 (p = 2, 4), and m =16 (p = 8, 16). For convenience, only the results for equalpower cases are presented. Both uncorrelated and correlated(with equicorrelation ρ = 0.5) cases are tested. The range ofSNRs is c = 0 dB - 30 dB. Without loss of generality, the firststream (i.e., k = 1) is always assumed.

We simulate HW 106 times for every combination of(m, p, c, and ρ) for computing the empirical moments anddistribution. However, when computing the exact BERe form = 16, we only sample HW 105 times.

A. Moments

The theoretical moments are computed using (44), (45) and(46). Fig. 4 plots the first three moments of T computed boththeoretically and empirically from simulations. For m = 16,the theoretical moments match the simulations very well,especially the first moment. When m = 4, the curves for thefirst moment are still quite accurate except for the correlatedcases at very small SNRs. Note that due to the log scale,the errors at small SNRs are largely exaggerated in the figure.When m = 4, for the second and third moments, there seem tobe some “significant” discrepancies between theoretical resultsand simulations at large SNRs. However, these errors con-tribute negligibly to the second and third moments of SINRk

(see Fig. 5). For example, when m = 4, p = 4, ρ = 0, exactSk(S) = 314.983. Empirical Sk(T ) = 16.403, and theoreticalSk(T ) = 5.333. Although the theoretical and empirical Sk(T )apparently differ quite significantly, theoretically, Sk(SINR) =314.983 and empirically, Sk(SINR) = 314.993.

We also compare the theoretical and empirical moments ofSINRk in Fig. 5 for m = 4. As expected, the curves matchalmost perfectly except for the observable (due to the logscale) errors at very small SNRs.

B. Distributions

Fig. 6 presents the quantile-quantile (qq) plots for distribu-tions based on Gamma and Normal approximations againstthe empirical distribution for T , at a selected SNR = 10dB. The figure shows that the Normal approximation workspoorly for small m or p. The Gamma approximation performsmuch better in all cases. Fig. 7 gives the same type of plotsfor SINRk. It shows the Gamma approximation works well,especially for p

m = 12 . When p

m = 1, the Gamma distributionapproximates the portion between 1%− 99% quantiles prettywell.

We have shown that RST ≤ RSSINRkfor uncorrelated

channels with equal powers in Lemma 10, which implies thata Gamma could approximate the distribution of SINRk betterthan that of T . Fig. 3 shows RSSINRk

is much smaller atpm = 1 than at p

m = 12 , which helps explain why Gamma

approximation works well at pm = 1

2 .

C. Error Performance

Fig. 8 plots BERs versus SNRs for uncorrelated channels.From the plot, we can see that the BER curves produced by

−0.1 0 0.1 0.2 0.3 0.4−0.1

0

0.1

0.2

0.3

0.4SNR = 10dB m = 4 p = 2

Empirical quantiles

Nor

mal

& G

amm

a qu

antil

es

NormalGamma45 degree line

(a)

0 0.1 0.2−0.05

0

0.05

0.1

0.15

0.2

0.25 SNR = 10dB m = 16 p = 8

Empirical quantiles

Nor

mal

& G

amm

a qu

antil

es


(b)

0 0.5 1

−0.2

0

0.2

0.4

0.6

0.8

1SNR = 10dB m = 4 p = 4

Empirical quantiles

Nor

mal

& G

amm

a qu

antil

es


(c)

0 0.2 0.4 0.6

0

0.1

0.2

0.3

0.4

0.5

0.6SNR = 10dB m = 16 p = 16

Empirical quantiles

Nor

mal

& G

amm

a qu

antil

es


(d)

Fig. 6. Quantile-quantile plots for T . The triangles on the curves indicatethe 1% and 99% quantiles. The range of quantiles is 0.1% - 99.9%.

0 20 40 60−20

0

20

40

60SNR = 10dB m = 4 p = 2

Empirical quantiles

Nor

mal

& G

amm

a qu

antil

es


(a)

0 10 20 300

5

10

15

20

25

30SNR = 10dB m = 16 p = 8

Empirical quantiles

Nor

mal

& G

amm

a qu

antil

es


(b)

0 5 10 15 20−5

0

5

10

15

20

SNR = 10dB m = 4 p = 4

Empirical quantiles

Nor

mal

& G

amm

a qu

antil

es


(c)

2 4 6 8

0

2

4

6

8SNR = 10dB m = 16 p = 16

Empirical quantiles

Nor

mal

& G

amm

a qu

antil

es


(d)

Fig. 7. Quantile-quantile plots for SINRk. The triangles indicate the 1% and99% quantiles. The range of quantiles is 0.1% - 99.9%.

11

0 10 20 30

0.5

1

2

3

SNR (dB)

E (

T ) Theor. ρ = 0

Simu. ρ = 0Theor. ρ = 0.5Simu. ρ = 0.5

m = 4 p = 4

m = 4 p = 2

(a)

0 10 20 30

0.5

1

2

4

SNR (dB)

Var

( T

)1/2

Theor. ρ = 0Simu. ρ = 0Theor. ρ = 0.5Simu. ρ = 0.5

m = 4 p = 4

m = 4 p = 2

(b)

0 10 20 30

0.5

1

10

SNR (dB)

Sk

( T

)1/3


m = 4 p = 4

m = 4 p = 2

(c)

0 10 20 30

0.5

1

10

SNR (dB)

E (

T )


m = 16 p = 16

m = 16 p = 8

(d)

0 10 20 30

0.5

1

5

10

SNR (dB)

Var

( T

)1/2


m = 16 p = 16

m = 16 p = 8

(e)

0 10 20 30

0.5

1

10

20

SNR (dB)

Sk

( T

)1/3


m = 16 p = 16

m = 16 p = 8

(f)

Fig. 4. Theoretical and empirical moments of T . The vertical axes are in the log10 scale. Note that the variances and third moments are in terms of theirsquare roots and cubic roots, respectively.

0 10 20 30

100

101

102

103

SNR (dB)

E (

SIN

R )


m = 4 p = 2

m = 4 p = 4

(a)

0 10 20 30

100

101

102

103

SNR (dB)

Var

( S

INR

)1/2


m = 4 p = 2

m = 4 p = 4

(b)

0 10 20 30

100

101

102

103

SNR (dB)

Sk

( S

INR

)1/

3


m = 4 p = 2

m = 4 p = 4

(c)

Fig. 5. Theoretical and empirical moments of SINR.

the Gamma approximation are almost indistinguishable fromthe simulated curves for p

m = 12 . When p

m = 1, the Gammaapproximation still works well for moderate SNRs (e.g., ,SNR < 15 dB). At larger SNRs, the Gamma approximationslightly overestimates BER. The figure also shows that thegeneralized Gamma distribution produce almost perfect fitsto the simulated BER curves. All figures indicate using theasymptotic BER (BER∞) formula will seriously underestimatethe error probabilities at large SNRs. 3

3Note that, in order to produce comparable BER results with the classicalreferences (e.g., [2], [4], [8]), we actually used real channels (only for theBER curves).

Fig. 9 presents the BER results for the correlated channelswith equicorrelation ρ = 0.5. We can see the similar trends asfor the uncorrelated cases, i.e., Gamma approximation workswell for p

m = 12 and the generalized Gamma approximation

performs remarkably well.

Finally, to see the difference between correlated and uncor-related cases more closely, we plot the interesting portions ofthe BER curves for both cases in Fig. 10, which illustratesthat the differences could be significant.

12

0 10 20 3010

−6

10−4

10−2

100

SNR (dB)

BE

R

BERe

BERg

BERgg

BERg+gg

BER∞

m = 4 p = 2

m = 4 p = 4

(a)

0 10 20 3010

−6

10−4

10−2

100

SNR (dB)

BE

R

BERe

BERg

BERgg

BERg+gg

BER∞

m = 16 p = 16

m = 16 p = 8

(b)

Fig. 8. BER curves for uncorrelated channels. γ = 1

2and γ = 1 are used

for computing BER∞ in both (a) and (b).

IX. CONCLUSION

In this study, we characterized the distribution of SINRfor the MMSE receiver in the MIMO systems, for channelswith non-random transmit correlations and unequal powers.Our work started with a key observation that SINR canbe decomposed into two independent component: SINR =SINRZF + T , and we mainly focused on T . We providedthe first three asymptotic moments of T for the uncorrelatedchannels and channels with equicorrelations. We proved theasymptotic Normality of T for the uncorrelated channels aswell as the correlated channels under certain restrictions.We proposed using the asymptotically equivalent probabilitydistributions to approximate the distribution of T for finitedimensions. Simulations showed that these distribution ap-proximations performed well. Finally, we applied our resultsto the error performance analysis and produced accurate errorprobabilities.

0 10 20 3010

−6

10−4

10−2

100

SNR (dB)

BE

R

BERe

BERg

BERgg

BERg+gg

BER∞

m = 4 p = 4

m = 4 p = 2

(a)

0 10 20 3010

−6

10−4

10−2

100

SNR (dB)

BE

R

BERe

BERg

BERgg

BERg+gg

BER∞

m = 16 p = 16

m = 16 p = 8

(b)

Fig. 9. BER curves for equicorrelation ρ = 0.5. Note that the BER∞s aredifferent because we did not simulate the true limit of the mean.

10 20 30

10−2

10−1

SNR (dB)

BE

R

BERe

BERg

BERgg

BERg+gg

ρ = 0.5

ρ = 0

Fig. 10. Compare the uncorrelated and correlated BER curves. m = 16,p = 16.

13

APPENDIX IPROOF OF LEMMA 3, 4, 5

Restate Lemma 3:

E

(

p

p − 1T

)

→ ckµγ,c4= ckMJ(−1) (78)

Proof:

E

(

p

p − 1T

)

=E

(

1

p − 1E(

pT |H(−k)

)

)

=E

(

1

p − 1

p

m

p−1∑

i=1

E(

‖tk,i‖2)

λi

)

=Σkp

mE

(

tr (Λ)

p− 1

)

→Σkp

mE (MJ(−1)) = ckMJ(−1),

by the bounded convergence theorem [26] (Section 1.3b),because tr(Λ)

p−1 ≤ 1.

Restate lemma 4:

Var

(

p√p − 1

T

)

→ c2kσ2

γ,c4= c2

kM ′J(−1). (79)

Proof:

Var

(

p√p − 1

T

)

=p2

p − 1Var

(

1

m

p−1∑

i=1

‖tk,i‖2λi

)

=p2

p − 1E

(

Var

(

1

m

p−1∑

i=1

‖tk,i‖2λi

∣

∣

∣

∣

∣

H(−k)

))

+p2

p − 1Var

(

E

(

1

m

p−1∑

i=1

‖tk,i‖2λi

∣

∣

∣

∣

∣

H(−k)

))

=p2

m2Σ2

kE

(

tr(

Λ2)

p − 1

)

+p2

m2Σ2

k(p − 1)Var

(

tr (Λ)

p − 1

)

→c2kM ′

J(−1),

because E

(

tr(Λ2)p−1

)

→ M ′J(−1) by the bounded convergence

theorem, and (p − 1)Var(

tr(Λ)p−1

)

→0, which can be provedusing the results from the concentration of spectral measuresfor random matrices [32]. The result we need is Corollary 1.8bin [32], which can be stated as follows,

P

(∣

∣

∣

∣

tr (Λ)

p − 1− E

(

tr (Λ)

p − 1

)∣

∣

∣

∣

> ε

)

≤ 2e−δ(p−1)2ε2 , (80)

for any ε > 0. δ depends on the spectral radius of R(−k,−k),the logarithmic Sobolev Inequality constant, and the Lipschitzconstant of g(x) = f(x2) = 1

1+x2 . We can easily check that

the function f(x) = 11+x is convex Lipschitz and g(x) has a

finite Lipschitz norm. Therefore,

Var

(

tr (Λ)

p − 1

)

=

∫ ∞

0

2xP

(∣

∣

∣

∣

tr (Λ)

p− 1− E

(

tr (Λ)

p− 1

)∣

∣

∣

∣

> x

)

dx.

≤4

∫ ∞

0

xe−δ(p−1)2x2

dx =2

δ(p − 1)2, (81)

which implies (p − 1)Var(

tr(Λ)p−1

)

→0. This completes theproof.

Restate lemma 5:

1

p − 1Sk(pT ) =

1

p − 1E(

(pT − E(pT ))3)

→ 2c3kηγ,c

4= c3

kM ′′J (−1) (82)

Proof:

1

p − 1E(

(pT − E(pT ))3)

=1

p − 1E(

E(

(

pT − E(

pT |H(−k)

))3∣

∣

∣H(−k)

))

+1

p − 1E(

(

E(

pT |H(−k)

)

− E(pT ))3)

+3

p − 1Cov

(

E(

pT |H(−k)

)

, Var(

pT |H(−k)

))

. (83)

We can show that the first term in the right-hand side of(83) converges to c3

kM ′′J (−1) as follows:

1

p − 1E(

E(

(

pT − E(

pT |H(−k)

))3∣

∣

∣H(−k)

))

=p3

m3

1

p − 1E

E

(

p−1∑

i=1

(

‖tk,i‖2 − Σk

)

λi

)3∣

∣

∣

∣

∣

∣

H(−k)

=p3

m3

1

p − 1E

(

p−1∑

i=1

E(

‖tk,i‖2 − Σk

)3λ3

i

)

=p3

m3

1

p − 1E

(

p−1∑

i=1

2Σ3kλ3

i

)

=2p3

m3Σ3

kE

(

tr(

Λ3)

p − 1

)

→2p3

m3Σ3

k

1

2E (M ′′

J (−1)) = c3kM ′′

J (−1). (84)

We can show the last two terms in the right-hand side of(83) tend to 0.

Expand

1

p − 1E(

(

E(

pT |H(−k)

)

− E(pT ))3)

=p3

m3Σ3

k(p − 1)2E

(

(

tr (Λ)

p − 1− E (tr (Λ))

p − 1

)3)

.

14

Apply the concentration theorem one more time,

E

(

(

tr (Λ)

p− 1− E (tr (Λ))

p − 1

)3)

≤∫ ∞

0

3x2P

(∣

∣

∣

∣

tr (Λ)

p − 1− E

(

tr (Λ)

p − 1

)∣

∣

∣

∣

> x

)

dx

≤3

∫ ∞

0

x22e−δ(p−1)2x2

dx =3

2

1

(p − 1)3

√

π

δ3, (85)

which implies 1p−1 E

(

(

E(

pT |H(−k)

)

− E(pT ))3)

→ 0.The last term in (83) also tends to zero, again using the

concentration theorem and the following inequality,

3

p − 1Cov

(

E(

pT |H(−k)

)

, Var(

pT |H(−k)

))

=3p3

m3Σ3

k(p − 1)Cov

(

tr (Λ)

p − 1,

tr(

Λ2)

p − 1

)

≤3p3

m3(p − 1)Σ3

k

√

Var

(

tr (Λ)

p − 1

)

√

Var

(

tr (Λ2)

p − 1

)

. (86)

We have shown Var(

tr(Λ)p−1

)

≤ 2δ(p−1)2 . Similar arguments

will show that Var

(

tr(Λ2)p−1

)

≤ 2δ′(p−1)2 , for a different

constant δ′. Therefore, the last term in (83) also tends to zero.Combining the results for the three terms of (83) together,

we complete the proof.

APPENDIX IIPROOF OF LEMMA 6

Restate Lemma 6:

µp ≥ µp−1 = µR,p−1 ≥ µR,p. (87)

Proof: Expand µp,

µp =1

2cm − p + 1

p+

1

2

cm

p2− 1

2

p − 1

p(1 − κp) (88)

where

κp =

√

c2m2

p2

(

1 − p

m

)2

+ 2cm

p

(

1 +p

m

)

+ 1,

(pκp)2 = (cm + p − cp)2 + 4cp2

= (cm + p)2 + c2p2 + 2cp2 − 2c2mp.

Expand µp−1,

µp−1 =1

2cm − p + 1

p− 1

2(1 − κp−1) (89)

where

κp−1 =

√

κ2p +

c2

p2+ 2c2

m

p2− 2

c2

p− 2c

p,

(pκp−1)2

= (pκp)2+ c2 + 2c2m − 2c2p − 2cp.

Therefore,

µp − µp−1 =1

2

(

cm

p2+

1

p+

p − 1

pκp − κp−1

)

.

Suffices to show

cm + p + p(p − 1)κp − p2κp−1 ≥ 0

⇐⇒(cm + p)2 + (p − 1)p2κ2p + 2(cm + p)(p − 1)pκp

≥ p4κ2p−1

⇐⇒(cm + p)(p − 1)pκp

≥ (p − 1)(cm + p)2 − cp(p − 1)(cm − p)

⇐⇒(cm + p)pκp ≥ (cm + p)2 − cp(cm − p)

⇐⇒(cm + p)2 ≥ (cm − p)2,

which is true. Therefore, we prove µp ≥ µp−1. Here weuse“⇐⇒” for “is equivalent to”.

To show µp−1 ≥ µR,p, note that

µp−1 − µR,p =1

2

(

c

p+ κp−1 − κp

)

.

Suffices to show

pκp−1 ≥ pκp − c

⇐⇒p2κ2p−1 ≥ p2κ2

p + c2 − 2cpκp

⇐⇒c2 + 2c2m − 2c2p − 2cp ≥ c2 − 2cpκp

⇐⇒pκp ≥ −cm + cp + p,

which is true. We complete the proof.

APPENDIX IIIPROOF OF LEMMA 7

We will frequently use the following three inequalities:

Z(1)k

4=

p−1∑

i=1

(

1

m‖zi‖2λi

)

=

p−1∑

i=1

1m‖zi‖2

1 + 1md2

i

≤ ‖R(−k,−k))−1rk(−k)‖2 4

= ek. (90)

Z(2)k

4=

p−1∑

i=1

1m2‖zi‖2

(

1 + 1md2

i

)2 ≤ 1

2ek. (91)

Z(3)k

4=

p−1∑

i=1

1m6‖zi‖2

(

1 + 1md2

i

)3 ≤ 8

9ek. (92)

(Note that x1+x ≤ 1, 2x

(1+x)2 ≤ 12 , 6x

(1+x)3 ≤ 89 , for any x ≥ 0.)

We begin with the first moment. Conditional on H(−k),

E(T |H(−k)) =1

m

p−1∑

i=1

Σk + ‖zi‖2

1 + 1md2

i

=1

mΣktr(Λ) + Z

(1)k

≤ 1

mΣktr(Λ) + ek. (93)

Recall tr(Λ)p−1

P→ MJ(−1), the limiting upperbound of E(T )is then given by

E(T )U =p − 1

mΣkMJ(−1) + ek. (94)

We can derive the limiting upperbound for the variance ofT by following the proof of Lemma 4.

15

In Lemma 4, we have seen

Var

(

p√p − 1

T

)

=p2

p − 1E

(

1

m2

p−1∑

i=1

Σ2k + 2‖zi‖2Σk(

1 + 1md2

i

)2

)

+p2

p − 1Var

(

1

m

p−1∑

i=1

Σk + ‖zi‖2

1 + 1md2

i

)

. (95)

The first term in the right-hand side of (95) can be writtenas

p2

m2Σ2

kE

(

tr(Λ2)

p − 1

)

+p2

m

1

p − 1ΣkE(Z

(2)k )

≤ p2

m2Σ2

kM ′′J (−1) +

p2

m

1

p − 1Σk

1

2ek, (96)

in the limiting sense.Using the fact that Var(x + y) ≤ Var(x) + E(y2) +

2√

Var(x)√

E(y2), we can bound the second term in (95) by

p2

m2(p − 1)Σ2

kVar

(

tr(Λ)

p − 1

)

+p2

p − 1E(

Z(1)k

)2

+ 2p2

mΣk

√

Var

(

tr(Λ)

p − 1

)

√

E(

Z(1)k

)2

≤ p2

m2

2Σ2k

δ(p − 1)+

p2

p − 1e2

k + 2p2

m(p − 1)Σk

√

2

δek, (97)

where δ is the same constant in Lemma 4.Combining (96), (97), and ignoring the higher order term

in (97), we can get a comprehensive upperbound

Var

(

p√p − 1

T

)U

=p2

m2Σ2

kM ′J(−1) +

p2

mΣk

1

p − 1

1

2ek

+p2

p − 1e2

k + 2p2

m(p − 1)Σk

√

2

δek, (98)

which, for convenience, is written as

Var (T )U

=p − 1

m2Σ2

kM ′J(−1) +

1

mΣk

(

1

2+ 2

√

2

δ

)

ek + e2k.

(99)

We can get the limiting upperbound for the third centralmoment by following the proof of Lemma 5 and boundingthe extra terms that can not be shown to tend to zeros by theconcentration theorem.

Recallp3

p − 1Sk (T )

=1

p − 1E(

E(

(

pT − E(

pT |H(−k)

))3∣

∣

∣H(−k)

))

+1

p − 1E(

(

E(

pT |H(−k)

)

− E(pT ))3)

+3

p − 1Cov

(

E(

pT |H(−k)

)

, Var(

pT |H(−k)

))

. (100)

We can expand the first term of the right-hand side of (100)to get

p3

m3Σ3

kE

(

tr(Λ3)

p − 1

)

+p3

m2

1

p − 1Σ2

kZ(3)k

≤ p3

m3Σ3

kM ′′J (−1) +

p3

m2

1

p − 1Σ2

k

8

9ek. (101)

Expanding the second term of (100) and selectively ignoringsome negative terms, we can get a bound

1

p − 1E(

(

E(

pT |H(−k)

)

− E(pT ))3)

≤ p3

m3

1

p − 1E ((tr (Λ) − E (tr (Λ))) Σk + mek)3

≤ p3

m3Σ3

k

(

3

2

1

p − 1

√

π

δ3

)

+p3

p − 1e3

k + 3p3

m2Σ2

k

(

2

δ(p − 1)

)

ek.

We can also bound the last term of (100) by

3

p − 1Cov

(

E(

pT |H(−k)

)

, Var(

pT |H(−k)

))

≤ 3

p − 1

√

Var(

E(

pT |H(−k)

))

√

Var(

Var(

pT |H(−k)

))

.

We know

Var(

E(

pT |H(−k)

))

≤ p2e2k + 2

p2

mΣk

√

2

δek, (102)

and

Var(

Var(

pT |H(−k)

))

=p4

m4Σ2

kVar(

Σktr(

Λ2)

+ mZ(2)k

)

≤ p4

m4Σ2

k

(

Σ2k

2

δ′+ m2

(

1

2ek

)2

+ Σkmek

√

2

δ′

)

Combing all the bounds and limits, we can obtain a limitingupperbound for the third moment

Sk(T )U =p − 1

m3Σ3

kM ′′J (−1) +

8/9

m2Σ2

kek +6

m2Σ2

k

1

δek + e3

k

+3

2ekΣk

1

m

√

ek + 21

mΣk

√

2

δ

√

ek + 21

mΣk

√

2

δ′. (103)

APPENDIX IVPROOF OF LEMMA 9

Restate Lemma 9

W =mT − Σktr(Λ)√

p − 1

D=⇒ N

(

0, Σ2kσ2

γ,c

)

(104)

Proof: It suffices to show that the characteristic functionof W , E

(

ejyW)

→ exp(

−y2

2 Σ2kσ2

γ,c

)

, where

E(

ejyW)

= E(

E(

ejyW |H(−k)

))

, j =√−1. (105)

16

Conditional on H(−k),

E(

ejyW |H(−k)

)

=E

(

exp

(

jymT − Σktr(Λ)√

p − 1

)∣

∣

∣

∣

H(−k)

)

=exp

(−jyΣktr(Λ)√p − 1

) p−1∏

i=1

E(

exp(

jy‖tk,i‖2λi

))

=exp


)

p−1∏

i=1

(

1 − 1√p − 1

jyΣkλi

)−1

exp

(

‖zi‖2

jyλi√p−1

1 − jyΣkλi√p−1

)

=exp


)

L1(p)L2(p). (106)

Note that λi = 11+ 1

m d2

i

≤ 1. For any fixed y ∈ R, since

p → ∞, we can assume |y|√p−1

< 1, |y|Σkλi√p−1

< 1, whichallows us to conduct complex series expansions.

We can write

L2(p) =

p−1∏

i=1

exp

(

‖zi‖2

jyλi√p−1


)

= exp

(

p−1∑

i=1

‖zi‖2

jyλi√p−1


)

. (107)

Then

‖ log(L2(p))‖ ≤p−1∑

i=1

∥

∥

∥

∥

∥

‖zi‖2

jyλi√p−1


∥

∥

∥

∥

∥

≤p−1∑

i=1

‖zi‖2 |y|λi√p − 1

2

≤2|y| m√p − 1

Z(1)k

≤2|y| m√p − 1

ek → 0, (108)

because of the assumption that ek → 0 faster than O(p−1

2 ).

Therefore L2(p)P→ 1, hence can be ignored.

L1(p) =

p−1∏

i=1

(

1 − 1√p − 1

jyΣkλi

)−1

=exp

(

−p−1∑

i=1

log

(

1 − jyΣkλi√p − 1

)

)

=exp

(

p−1∑

i=1

∞∑

n=1

(jyΣkλi)n

n(p − 1)n/2

)

=exp

( ∞∑

n=1

p−1∑

i=1

(jyΣkλi)n

n(p − 1)n/2

)

=exp

(

p−1∑

i=1

jyΣkλi

(p − 1)1/2

)

exp

(

p−1∑

i=1

−y2Σ2kλ2

i

2(p − 1)

)

×

exp

( ∞∑

n=3

p−1∑

i=1

(jyΣkλi)n

n(p − 1)n/2

)

=exp

(

jyΣktr(Λ)√p − 1

)

exp

(−y2Σ2ktr(Λ2)

2(p − 1)

)

L3(p). (109)

We will show that L3(p) converges to 1 in probability. Notethat in the above derivations, we can expand the complexlogarithm and switch the order of summations because of thecondition |y|√

p−1< 1.

‖log (L3(p))‖ =

∥

∥

∥

∥

∥

∞∑

n=3

p−1∑

i=1

(jyΣkλi)n

n(p − 1)n/2

∥

∥

∥

∥

∥

≤∞∑

n=3

p−1∑

i=1

(|y|Σk)n

n(p − 1)n/2

≤∞∑

n=3

p − 1

3

(|y|Σk)n

(p − 1)n/2

=p − 1

3

|y|3Σ3

k

(p−1)3/2

1 − |y|Σk

(p−1)1/2

P→ 0, i.e., L(p, c)P→ 1.

After cancellations,

E(

ejyW |H(−k)

) P→ limp→∞

exp

(−y2Σ2ktr(Λ2)

2(p − 1)

)

=exp

(

−y2

2Σ2

kσ2γ,c

)

,

which is the characteristic function of a Normal randomvariable with mean 0 and variance Σ2

kσ2γ,c.

To show E(

ejyW)

→ exp(

−y2

2 Σ2kσ2

γ,c

)

, it suffices to

show∣

∣

∣E(

exp(

jyW + y2

2 Σ2kσ2

γ,c

))

− 1∣

∣

∣→ 0.

17

∣

∣

∣

∣

E

(

exp

(

jyW +y2

2Σ2

kσ2γ,c

))

− 1

∣

∣

∣

∣

=E

∣

∣

∣

∣

E

(

exp

(

jyW +y2

2Σ2

kσ2γ,c

)∣

∣

∣

∣

H(−k)

)

− 1

∣

∣

∣

∣

=E

∣

∣

∣

∣

exp

(

y2

2Σ2

kσ2γ,c

)

E(

exp (jyW )|H(−k)

)

− 1

∣

∣

∣

∣

→E

∣

∣

∣

∣

exp

(

y2

2Σ2

kσ2γ,c

)

exp

(

−y2

2Σ2

kσ2γ,c

)

− 1

∣

∣

∣

∣

= 0.

by the bounded convergence theorem. This completes theproof.

APPENDIX VPROOF OF LEMMA 10

First we can show that RST =σ4

γ,c

µγ,cηγ,c≤ 1, i.e., σ4

γ,c ≤µγ,cηγ,c. To simplify the expressions, let σ2

γ,c = µγ,c−O, and

ηγ,c = σ2γ,c − Q = µγ,c − O − Q, where O = c(1+γ)+1−κ

2cγκ ,Q = c

κ3 . Then

µγ,cηγ,c − σ4γ,c =µγ,c

(

σ2γ,c − Q

)

− σ4γ,c

=σ2γ,cO − µγ,cQ, (110)

σ2γ,c =µγ,c −

c(1 + γ) + 1 − κ

2cγκ

=

(

c(1 − γ)2 + (1 + γ) − (1 − γ)κ)

2γκ, (111)

σ2γ,cO =

(

c(1 − γ)2 + (1 + γ) − (1 − γ)κ)

2γκ

(c(1 + γ) + 1) − κ

2cγκ

=1

2cγ2κ2

(

κ2 − κ (c(1 − γ) + 1) − 2cγ)

, (112)

µγ,cQ =κ − c(1 − γ) − 1

2cγ

c

κ3=

κ − c(1 − γ) − 1

2γκ3. (113)

Plug (112) and (113) into (110),

µγ,cηγ,c − σ4γ,c

=1

2cγ2κ3

(

κ3 − κ2(c(1 − γ) + 1) − 3cγκ + c2γ(1 − γ) + cγ)

(114)

Therefore,

µγ,cηγ,c − σ4γ,c ≥ 0

⇐⇒κ3 − κ2(c(1 − γ) + 1) − 3cγκ + c2γ(1 − γ) + cγ ≥ 0

⇐⇒κ(κ2 − 3cγ) ≥ (c(1 − γ) + 1)(

κ2 − cγ)

⇐⇒κ2(

κ2 − 3cγ)2 ≥

(

κ2 − 4cγ2) (

κ2 − cγ)2

⇐⇒κ2 (2cγ)(

2κ2 − 4cγ)

≤ 4cγ(

κ2 − cγ)2

⇐⇒2c2γ2 ≥ 0,

which is true. It is easy to check to make sure κ2 − 3cγ ≥0. This completes the proof for RST =

σ4

γ,c

µγ,cηγ,c≤ 1, with

equality held when c = 0 or γ = 0.

It is easy to show RSSINRk≤ 1 by noting σ2

γ,c ≤√µγ,cηγ,c ≤ µγ,c+ηγ,c

2 , which implies

RSSINRk

≤

(

1 − m−1p + p−1

m

√µγ,cηγ,c

)2

(

1 − p−1m + p−1

m µγ,c

) (

1 − p−1m + p−1

m ηγ,c

)

=(1 − p−1

m )2 + 2p−1m (1 − p−1

m )√

µγ,cηγ,c +(

p−1m

)2µγ,cηγ,c

(1 − p−1m )2 + (1 − p−1

m )p−1m (µγ,c + ηγ,c) +

(

p−1m

)2µγ,cηγ,c

≤1,

with equality held only when RST = 1 holds.The remaining task is to show that RST ≤ RSSINRk

in thelimit. In the limit, we can replace p−1

m by γ, thus

RST ≤ RSSINRk

⇐⇒(1 − γ)2 + 2γ(1 − γ)σ2

γ,c + γ2σ4γ,c

(1 − γ)2

+ γ(1 − γ) (µγ,c + ηγ,c) + γ2 (µγ,cηγ,c)

≥σ4

γ,c

µγ,cηγ,c

⇐⇒ (1 − γ)(

µγ,cηγ,c − σ4γ,c

)

≥ γσ2γ,c

(

σ2γ,c (µγ,c + ηγ,c) − 2µγ,cηγ,c

)

,

which is equivalent to c2γ2 ≥ 0, after pages of tedious algebra.This completes the proof for RST ≤ RSSINRk

, in the limit.See the technical report [23] for details.

ACKNOWLEDGMENT

The authors thank Persi Diaconis for reading the originaldraft of this manuscript and Amir Dembo for the helpfuladvice in the revision. Ping Li thanks Tze Leung Lai andYoungjae Kim for the enjoyable conversations. Ping Li alsoacknowledges Antonia Tulino, Sergio Verdu, and DongningGuo for the email communications during the preparation ofthe manuscript.

REFERENCES

[1] A. Paulraj, R. Nabar, and D. Gore, Introduction to Space-Time WirelessCommunications, 1st ed. New York: Cambridge University Press, 2003.

[2] H. V. Poor and S. Verdu, “Probability of error in MMSE multiuserdetection,” IEEE Trans. Inform. Theory, vol. 43, pp. 858–871, May 1997.

[3] D. N. C. Tse and O. Zeitouni, “Linear multiuser receivers in randomenvironments,” IEEE Trans. Inform. Theory, vol. 46, no. 1, pp. 171–188,Jan. 2000.

[4] S. Verdu, Multiuser Detection. New York: Cambridge University Press,1998.

[5] U. Madhow and M. L. Honig, “MMSE interference suppressionfor direct-sequence spread-spectrum CDMA,” IEEE Trans. Commun.,vol. 42, pp. 3178–3188, Dec. 1994.

[6] S. Verdu and S. Shamai, “Spectral efficiency of CDMA with randomspreading,” IEEE Trans. Inform. Theory, vol. 45, no. 2, pp. 622–640,Mar. 1999.

[7] A. Hjørungnes and P. S. R. Diniz, “Minimum BER prefilter transformfor communications systems with binary signaling and known fir MIMOchannel,” IEEE Signal Processing Lett., vol. 12, no. 3, pp. 234–237, Mar.2005.

[8] D. Guo, S. Verdu, and L. K. Rasmussen, “Asymptotic normality of linearmultiuser receiver outputs,” IEEE Trans. Inform. Theory, vol. 48, pp.3080– 3095, Dec. 2002.

[9] A. M. Tulino and S. Verdu, “Random matrix theory and wireless commu-nications,” Foundations and Trends in Communications and InformationTheory, vol. 1, June 2004.

18

[10] L. Li, A. M. Tulino, and S. Verdu, “Asymptotic eigenvalue moments forlinear multiuser detection,” Communications in Information and Systems,vol. 1, no. 3, pp. 273–304, 2001.

[11] D. N. C. Tse and S. V. Hanly, “Linear multiuser receivers: Effectiveinterference, effective bandwidth and user capacity,” IEEE Trans. Inform.Theory, vol. 45, no. 2, pp. 641–657, Mar. 1999.

[12] J. Zhang, E. K. P. Chong, and D. N. Tse, “Output MAI distribution oflinear MMSE multiuser receivers in DS-CDMA systems,” IEEE Trans.Inform. Theory, vol. 47, pp. 1128–1144, Mar. 2001.

[13] D. Gesbert, “Robust linear MIMO receivers: A minimum error-rateapproach,” IEEE Trans. Signal Processing, vol. 51, no. 11, pp. 2863–2871, Nov. 2003.

[14] A. H. rungnes and D. Gesbert, “Exact ser-precoding of orthogonal space-time block coded correlated MIMO channels: An iterative approach,” inProceedings of the 6th Nordic Signal Processing Symposium, NORSIG2004, Espoo, Finland, June 2004, pp. 336–339.

[15] E. Biglieri, “Transmission and reception with multiple antennas: The-oretical foundations,” Foundations and Trends in Communications andInformation Theory, vol. 1, 2004.

[16] R. R. M uler, “On the asymptotic eigenvalue distribution of concatenatedvector-valued fading channels,” IEEE Trans. Inform. Theory, vol. 48,no. 7, pp. 2086–2091, 2002.

[17] R. R. M uller, P. Schramm, and J. B. Huber, “Spectral efficiency ofCDMA systems with linear interference suppression,” in IEEE Workshopon Communication Engineering, Ulm, Germany, Jan. 1997, pp. 93–97.

[18] K. Mardia, J. T. Kent, and J. Bibby, Multivariate Analysis. San Diego,CA: Academic Press, 1979.

[19] D. Gore, R. W. Heath, and A. Paulraj, “On performance of the zeroforcing receiver in presence of transmit correlation,” in IEEE ISIT,Lausanne, Switzerland, July 2002, p. 159.

[20] J. W. Silverstein, “Strong convergence of the empirical distribution ofeigenvalues of large dimensional random matrices,” Journal of Multi-variate Analysis, vol. 55, pp. 331–339, 1995.

[21] Z. D. Bai, “Methodologies in spectral analysis of large dimensionalrandom matrices, a review,” Statistica Sinica, vol. 9, no. 3, pp. 611–677, 1999.

[22] Z. D. Bai and W. Silverstein, “CLT for linear spectral statistics of large-dimensional sample covariance matrices,” The Annals of Probability,vol. 32, no. 1A, pp. 553–605, 2004.

[23] P. Li, D. Paul, R. Narasimhan, and J. Cioffi, “On the asymptotic andapproximate distribution of SINR for the linear MMSE MIMO receiver,”Department of Statistics, Stanford University, CA, Tech. Rep. 2005-15,June 2005.

[24] H. Shin and J. H. Lee, “Capacity of multiple-antenna fading channels:Spatial fading correlation, double scattering and keyhole,” IEEE Trans.Inform. Theory, vol. 49, Oct. 2003.

[25] R. Narasimhan, “Spatial multiplexing with transmit antenna and con-stellation selection for correlated MIMO fading channels,” IEEE Trans.Signal Processing, vol. 51, Nov. 2003.

[26] R. Durrett, Probability: Theory and Examples, 2nd ed. Belmont, CA:Duxbury Press, 1995.

[27] P. Hougaard, “Survival models for heterogeneous populations,”Biometrika, vol. 73, pp. 387–396, 1986.

[28] H. U. Gerber, “From the generalized gamma to the generalized negativebinomial distribution,” Insurance:Mathematics and Economics, vol. 10,pp. 303–309, 1991.

[29] I. S. Gradshteyn and I. M. Ryzhik, Tables of Integrals, Series, andProducts, 2nd ed. New York, NY: Academic Press, 1980.

[30] M. S. Alouini and A. J. Goldsmith, “A unified approach for calculatingerror rates of linearly modulated signals over generalized fading chan-nels,” IEEE Trans. Commun., vol. 47, pp. 1324–1334, Sept. 1999.

[31] M. K. Simon and M.-S. Alouini, Digital Communication over FadingChannels: A Unified Approach to Performance Analysis, 1st ed. NewYork: John Wiley & Sons, Inc., 2000.

[32] A. Guionnet and O. Zeitouni, “Concentration of the spectral measurefor large matrices,” Electronic Communications in Probability, 2000.