Large System Analysis of Linear Precoding in Correlated MISO Broadcast Channels Under Limited Feedback

  • IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 58, NO. 7, JULY 2012 4509

    Sebastian Wagner, Member, IEEE, Romain Couillet, Member, IEEE, Mérouane Debbah, Senior Member, IEEE, and Dirk T. M. Slock, Fellow, IEEE

    Abstract: In this paper, we study the sum rate performance of zero-forcing (ZF) and regularized ZF (RZF) precoding in large MISO broadcast systems under the assumptions of imperfect channel state information at the transmitter and per-user channel transmit correlation. Our analysis assumes that the number of transmit antennas $M$ and the number of single-antenna users $K$ are large while their ratio remains bounded. We derive deterministic approximations of the empirical signal-to-interference plus noise ratio (SINR) at the receivers, which are tight as $M, K \to \infty$. In the course of this derivation, the per-user channel correlation model requires the development of a novel deterministic equivalent of the empirical Stieltjes transform of large dimensional random matrices with generalized variance profile. The deterministic SINR approximations enable us to solve various practical optimization problems. Under sum rate maximization, we derive 1) for RZF the optimal regularization parameter; 2) for ZF the optimal number of users; 3) for ZF and RZF the optimal power allocation scheme; and 4) the optimal amount of feedback in large FDD/TDD multiuser systems. Numerical simulations suggest that the deterministic approximations are accurate even for small $M$, $K$.

    Index Terms: Broadcast channel (BC), limited feedback, linear precoding, multiuser (MU) systems, random matrix theory (RMT).

    I. INTRODUCTION

    The pioneering work in [1] and [2] revealed that the capacity of a point-to-point [single-user (SU)] multiple-input multiple-output (MIMO) channel can potentially increase linearly with the number of antennas. However, practical implementations quickly demonstrated that in most propagation environments the promised capacity gain of SU-MIMO is unachievable due to antenna correlation and line-of-sight components [3]. In a multiuser (MU) scenario, the inherent problems of SU-MIMO transmission can largely be overcome by exploiting MU diversity, i.e., sharing the spatial dimension not only between the antennas of a single receiver, but among multiple (noncooperative) users. The underlying channel for MU-MIMO transmission is referred to as the MIMO broadcast channel (BC) or MU downlink channel. Although much more robust to channel correlation, the MIMO-BC suffers from interuser interference at the receivers which can only be efficiently mitigated by appropriate (i.e., channel aware) preprocessing at the transmitter.

    Manuscript received June 25, 2010; revised June 03, 2011; accepted September 13, 2011. Date of publication March 22, 2012; date of current version June 12, 2012. This work was supported in part by the EU NoE Newcom++ and in part by the ANR project SESAME. The material in this paper was presented in part at the 2011 IEEE International Conference on Communications.
    S. Wagner is with EURECOM, Sophia-Antipolis 06904, France, and also with ST-ERICSSON, Sophia-Antipolis 06560, France (e-mail: [email protected]).
    R. Couillet and M. Debbah are with SUPÉLEC, Gif sur Yvette 91190, France (e-mail: [email protected]; [email protected]).
    D. T. M. Slock is with EURECOM, Sophia-Antipolis 06904, France (e-mail: [email protected]).
    Communicated by A. Moustakas, Associate Editor for Communications.
    Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
    Digital Object Identifier 10.1109/TIT.2012.2191700
    0018-9448/$31.00 © 2012 IEEE

    It has been proved that dirty-paper coding (DPC) is a capacity-achieving precoding strategy for the Gaussian MIMO-BC [4]-[8]. However, the DPC precoder is nonlinear and to this day too complex to be implemented efficiently in practical systems. It has been shown in [4], [9]-[11] that suboptimal linear precoders can achieve a large portion of the BC rate region while featuring low computational complexity. Thus, a lot of research has recently focused on linear precoding strategies.

    In general, the rate maximizing linear precoder has no explicit form. Several iterative algorithms have been proposed in [12] and [13], but no global convergence has been proved. Moreover, these iterative algorithms have a high computational complexity, which motivates the use of further suboptimal linear transmit filters (i.e., precoders) by imposing more structure on the filter design. A straightforward technique is to precode by the inverse of the channel. This scheme is referred to as channel inversion or zero-forcing (ZF) [4].

    Although [9], [12], and [13] assume perfect channel state information at the transmitter (CSIT) to determine theoretically optimal performance, this assumption is untenable in practice. It is indeed a particularly strong assumption, since the performance of all precoding strategies depends crucially on the CSIT quality. In practical systems, the transmitter has to acquire the channel state information (CSI) of the downlink channel by feedback signaling from the uplink. Since in practice the channel coherence time is finite, the information of the instantaneous channel state is inherently incomplete. For this reason, a lot of research has been carried out to understand the impact of imperfect CSIT on the system behavior; see [14] for a recent survey.

    In this contribution, we focus on the multiple-input single-output (MISO) BC, where a central transmitter equipped with $M$ antennas communicates with $K$ single-antenna noncooperative receivers. We assume $M \geq K$, i.e., we do not account for user scheduling, and consider ZF and regularized ZF (RZF) precoding under imperfect CSIT (modeled as a weighted sum of the true channel plus independent noise) as well as per-user channel correlation, i.e., the vector channel $\mathbf{h}_k$ of user $k$ satisfies $\mathrm{E}[\mathbf{h}_k] = \mathbf{0}$ and $\mathrm{E}[\mathbf{h}_k \mathbf{h}_k^{\mathsf{H}}] = \boldsymbol{\Theta}_k$.

    To obtain insights into the system behavior, we approximate the signal-to-interference plus noise ratio (SINR) by a deterministic quantity, where the novelty of this study lies in the large system approach. More precisely, we approximate the SINR $\gamma_k$ of user $k$ by a deterministic equivalent $\gamma_k^{\circ}$ such that $\gamma_k - \gamma_k^{\circ} \to 0$ almost surely, as the system dimensions $M$ and $K$ go jointly to infinity with bounded ratio $M/K$. Hence, $\gamma_k^{\circ}$ becomes more accurate for increasing $M$, $K$. To derive $\gamma_k^{\circ}$, we apply tools from the well-established field of large-dimensional random matrix theory (RMT) [15], [16]. Previous work considered SINR approximations based on bounds on the average (with respect to the random channels $\mathbf{h}_k$) SINR. The deterministic equivalent $\gamma_k^{\circ}$ is not a bound but a tight approximation for asymptotically large $M$, $K$. Furthermore, the RMT tools allow us to consider advanced channel models like the per-user correlation model, which are usually extremely difficult to study exactly for finite dimensions. Interestingly, simulations suggest that $\gamma_k^{\circ}$ is very accurate even for small system dimensions. Currently, the 3GPP LTE-Advanced standard [17] already defines up to $M = 8$ transmit antennas, further motivating the application of large system approximations to characterize the performance of wireless communication systems. Subsequently, we apply these SINR approximations to various practical optimization problems.

    A. Related Literature

    To the best of the authors' knowledge, Hochwald et al. [18] were the first to carry out a large system analysis with $M, K \to \infty$ and finite ratio $\beta = M/K$ for linear precoding under the notion of channel hardening. In particular, they considered ZF precoding, called channel inversion, for $\beta > 1$ under perfect CSIT, and showed that the SINR for independent and identically distributed (i.i.d.) Gaussian channels converges to $\rho(\beta - 1)$, where $\rho$ is the signal-to-noise ratio (SNR), independent of the applied power normalization strategy. They go on to derive the sum rate maximizing system loading for a fixed $M$. Their results are a special case of our analysis in Sections III-B and V-A. The authors in [18] conclude by showing that for $\beta > 1$, ZF achieves a large fraction of the linear sum rate growth. The work in [9] extends the analysis in [18] to the case $K = M$ and shows that the sum rate of ZF is constant in $M$ as $M, K \to \infty$, i.e., the linear sum rate growth is lost. The authors in [9] counter this problem by introducing a regularization parameter in the inverse of the channel matrix. Under the assumption of large $M$, $K$, perfect CSIT and for any rotationally invariant channel distribution, [9] derives the regularization parameter that maximizes the SINR. Note here that [9] does not apply the classic tools from large-dimensional RMT to derive its results but rather finds the solution by applying various expectations and approximations. In the present contribution, the RZF precoder of [9] is referred to as the channel distortion unaware RZF (RZF-CDU) precoder, since its design assumes perfect CSIT, although in practice, the available CSIT is erroneous or distorted. It has been observed in [9] that the RZF-CDU precoder is very similar to the transmit filter derived under the minimum mean square error (MMSE) criterion [19] and both become identical in the large $M$, $K$ limit. Likewise, we will observe some similarities between RZF and MMSE filters when considering imperfect CSIT. The RZF precoder in [9] has been extended in [20] to account for channel quantization feedback under random vector quantization (RVQ). The authors in [20] do not apply tools from large RMT but use the same techniques as in [9] and obtain different results for the optimal regularization parameter and SINR compared to our results in Section VI.

    The first work applying tools from large RMT to derive the asymptotic SINR under ZF and RZF precoding for correlated channels was [21]. However, in [21] the regularization parameter of the considered RZF precoder was set to fulfill the total average power constraint. Similar work [22] was published later, where the authors considered the RZF precoder in [9] and derived the asymptotic SINR for uncorrelated Gaussian channels. Moreover, they derived the asymptotically optimal regularization parameter, already derived in [9], which is a special case of the result derived in Section IV. Another work [23], reproducing our results, noticed that the optimal regularization parameter in [9] and [22] is independent of transmit correlation when the channel correlation is identical for all users.

    In the large system limit and for channels with i.i.d. entries, the cross correlations between the user channels, and therefore the users' SINRs, are identical. It has been shown in [24] that for this symmetric case and equal noise variances, the SINR maximizing precoder is of closed form and coincides with the RZF precoder. Recently, the authors in [25] have claimed that indeed the RZF precoder structure emerges as the optimal precoding solution for $M, K \to \infty$. This asymptotic optimality further motivates a detailed analysis of the RZF precoder for large system dimensions.

    B. Contributions of the Present Work

    In this paper, we provide a concise framework that directly extends and generalizes the results in [9], [18], [22], [23], and [26] by accounting for per-user correlation and imperfect CSIT. Furthermore, we apply our SINR approximations to several limited-feedback scenarios that have been previously analyzed by applying bounds on the ergodic rate of finite-dimensional systems. Our main contributions are summarized as follows.

    1) Motivated by the channel model, we derive a deterministic equivalent of the empirical Stieltjes transform of matrices with generalized variance profile, thereby extending the results in [27] and [28].

    2) We propose deterministic equivalents for the SINR of ZF and RZF precoding under imperfect CSIT and channel with per-user correlation, i.e., deterministic approximations of the SINR, which are independent of the individual channel realizations, and (almost surely) exact as $M, K \to \infty$.

    3) Under imperfect CSIT and common correlation, we derive the sum rate maximizing RZF precoder called channel-distortion aware RZF (RZF-CDA) precoder.

    4) For ZF and RZF, under common correlation and different CSIT qualities, we derive the optimal power allocation scheme which is the solution of a water-filling algorithm.

    For uncorrelated channels, we obtain the following results.

  • WAGNER et al.: LARGE SYSTEM ANALYSIS OF LINEAR PRECODING IN CORRELATED MISO BROADCAST CHANNELS 4511

    1) Under ZF precoding and imperfect CSIT, a closed-form approximate solution of the number of users $K$ maximizing the sum rate per transmit antenna for a fixed $M$.

    2) In large frequency-division duplex (FDD) systems, underRVQ, for and high SNR , to exactly maintain aninstantaneous per-user rate gap of bits/s/Hz, almostsurely, as , the number of feedback bits peruser has to scale with

    RZF-CDA: RZF-CDU/ZF:

    That is, the RZF-CDA precoder requiresbits less than RZF-CDU and ZF.

    3) In large time-division duplex (TDD) systems with channelcoherence interval , at high uplink SNR and downlinkSNR , the sum rate maximizing amount of channeltraining scales as and for a fixed and, respectively, under both RZF-CDA and ZF precoding.

    The remainder of this paper is organized as follows. Section II presents the transmission model and channel model. In Section III, we propose deterministic equivalents for the SINR of RZF and ZF precoding. In Section IV, we derive the sum rate maximizing regularization under RZF precoding. Section V studies the sum rate maximizing number of users for ZF precoding and the optimal power allocation when the CSIT quality of the users is unequal. Section VI analyzes the optimal amount of feedback in a large FDD system. In Section VII, we study a large TDD system and derive the optimal amount of uplink channel training. Finally, in Section VIII, we summarize our results and conclude this paper.

    Most technical proofs are presented in the Appendix. In these proofs, we apply several lemmas collected in Appendix VI.

    Notation: In the following, boldface lower-case and upper-case characters denote vectors and matrices, respectively. Theoperators , and denote conjugate transpose, traceand expectation, respectively. The identity matrix is de-noted , is the natural logarithm, and is the imag-inary part of . and are the spectral radiusand theminimum eigenvalue of the Hermitianmatrix , respec-tively. The imaginary unit is denoted . The sets and aredefined as and , respec-tively. A random vector is complex Gaussiandistributed with mean vector and covariance matrix .

    II. SYSTEM MODEL

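    This section describes the transmission model as well as the underlying channel model.

    A. Transmission Model

    Consider a MISO BC composed of a central transmitter equipped with $M$ antennas and of $K$ single-antenna noncooperative receivers. We assume $M \geq K$; thus, user scheduling is not taken into account. Furthermore, we suppose narrow-band transmission. The signal $y_k$ received by user $k$ at any time instant reads

    $$y_k = \mathbf{h}_k^{\mathsf{H}} \mathbf{x} + n_k, \qquad k = 1, 2, \dots, K$$

    where $\mathbf{h}_k \in \mathbb{C}^{M}$ is the random channel from the transmitter to user $k$, $\mathbf{x} \in \mathbb{C}^{M}$ is the transmit vector, and the noise terms $n_k \sim \mathcal{CN}(0, \sigma^2)$ are independent. We assume that the channel $\mathbf{h}_k$ evolves according to a block-fading model, i.e., the channel is constant at every time instant but varies independently from one time instant to another.

    The transmit vector $\mathbf{x}$ is a linear combination of the independent user symbols $s_k$ and can be written as

    $$\mathbf{x} = \sum_{k=1}^{K} \sqrt{p_k}\, \mathbf{g}_k s_k$$

    where $\mathbf{g}_k \in \mathbb{C}^{M}$ and $p_k \geq 0$ are the precoding vector and the signal power of user $k$, respectively. Subsequently, we assume that user $k$ has perfect knowledge of $\mathbf{h}_k$ and the effective channel $\mathbf{h}_k^{\mathsf{H}} \mathbf{g}_k$. In particular, an estimate of $\mathbf{h}_k^{\mathsf{H}} \mathbf{g}_k$ can be obtained through dedicated downlink training by precoding the pilots of user $k$ by $\mathbf{g}_k$. The precoding vectors are normalized to satisfy the average total power constraint

    $$\mathrm{E}\big[\|\mathbf{x}\|^2\big] = \operatorname{tr}\big(\mathbf{P} \mathbf{G}^{\mathsf{H}} \mathbf{G}\big) \leq P \tag{1}$$

    where $\mathbf{G} \triangleq [\mathbf{g}_1, \dots, \mathbf{g}_K] \in \mathbb{C}^{M \times K}$, $\mathbf{P} \triangleq \operatorname{diag}(p_1, \dots, p_K)$, and $P > 0$ is the total available transmit power.

    Denote $\rho \triangleq P / \sigma^2$ the SNR. Under the assumption of Gaussian signaling, i.e., $s_k \sim \mathcal{CN}(0, 1)$, and single-user decoding with perfect CSI at the receivers, the SINR $\gamma_k$ of user $k$ is defined as [29]

    $$\gamma_k = \frac{p_k\, |\mathbf{h}_k^{\mathsf{H}} \mathbf{g}_k|^2}{\sum_{j=1,\, j \neq k}^{K} p_j\, |\mathbf{h}_k^{\mathsf{H}} \mathbf{g}_j|^2 + \sigma^2} \tag{2}$$

    The rate $R_k$ of user $k$ is given by

    $$R_k = \log(1 + \gamma_k) \tag{3}$$

    and the ergodic sum rate is defined as

    $$R_{\mathrm{sum}} = \sum_{k=1}^{K} \mathrm{E}\big[\log(1 + \gamma_k)\big] \tag{4}$$

    where the expectation is taken over the random channels $\mathbf{h}_k$.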
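    As a numerical illustration of (2)-(4), the following sketch evaluates the per-user SINR and rate for a single channel realization. It assumes the usual single-user-decoding SINR, i.e., the power received over the user's own precoding vector divided by the interference from all other streams plus the noise variance. The function name `sinr_and_rates` and the array layout (row k of `H` holds the conjugate-transposed channel of user k, column k of `G` holds its precoding vector) are illustrative choices, not notation from the paper.

```python
import numpy as np

def sinr_and_rates(H, G, p, sigma2):
    """Per-user SINR and rate (in nats) for one channel realization.

    H      : (K, M) array, row k is h_k^H (assumed layout)
    G      : (M, K) array, column k is the precoding vector g_k
    p      : (K,) array of signal powers p_k
    sigma2 : noise variance
    """
    H = np.asarray(H)
    G = np.asarray(G)
    p = np.asarray(p, dtype=float)
    A = np.abs(H @ G) ** 2            # A[k, j] = |h_k^H g_j|^2
    signal = p * np.diag(A)           # useful power of user k
    interference = A @ p - signal     # power leaking from all other streams
    sinr = signal / (interference + sigma2)
    rates = np.log1p(sinr)            # natural logarithm, as in the paper's notation
    return sinr, rates
```

    Averaging `rates.sum()` over many independent channel draws would give a Monte Carlo estimate of the ergodic sum rate.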

    B. Channel Model

    Each user channel $\mathbf{h}_k$ is modeled as

    $$\mathbf{h}_k = \sqrt{M}\, \boldsymbol{\Theta}_k^{1/2} \mathbf{z}_k \tag{5}$$

    where $\boldsymbol{\Theta}_k \triangleq \mathrm{E}[\mathbf{h}_k \mathbf{h}_k^{\mathsf{H}}]$ is the channel correlation matrix of user $k$ and $\mathbf{z}_k$ has i.i.d. complex entries of zero mean and variance $1/M$. The channel transmit correlation matrices $\boldsymbol{\Theta}_k$ are assumed to be slowly varying compared to the channel coherence time and thus are supposed to be perfectly known to the transmitter, whereas receiver $k$ has only knowledge about $\boldsymbol{\Theta}_k$. Moreover, only an imperfect estimate $\hat{\mathbf{h}}_k$ of the true channel $\mathbf{h}_k$ is available at the transmitter which is modeled as [30]-[33]

    $$\hat{\mathbf{h}}_k = \sqrt{M}\, \boldsymbol{\Theta}_k^{1/2} \Big( \sqrt{1 - \tau_k^2}\, \mathbf{z}_k + \tau_k \mathbf{q}_k \Big) = \sqrt{M}\, \boldsymbol{\Theta}_k^{1/2} \hat{\mathbf{z}}_k \tag{6}$$

    where $\hat{\mathbf{z}}_k = \sqrt{1 - \tau_k^2}\, \mathbf{z}_k + \tau_k \mathbf{q}_k$, and $\mathbf{q}_k$ has i.i.d. entries of zero mean and variance $1/M$ independent of $\mathbf{z}_k$ and $n_k$. The parameter $\tau_k \in [0, 1]$ reflects the accuracy or quality of the channel estimate $\hat{\mathbf{h}}_k$, i.e., $\tau_k = 0$ corresponds to perfect CSIT, whereas for $\tau_k = 1$ the CSIT is completely uncorrelated to the true channel.

    The variation in the accuracy of the available CSIT between the different user channels arises naturally. First, there might be low-mobility users and high-mobility users with large or small channel coherence intervals, respectively. Therefore, the CSIT of the high-mobility users will be outdated quickly and hence be very inaccurate. On the other hand, the CSIT of the low-mobility users remains accurate since their channel does not change significantly from the time of the channel estimation until the time of precoding and coherent data transmission. Second, different CSIT qualities arise when the feedback rate varies among the users. For instance, if the CSIT is obtained from uplink training, the training length of each user could be different, leading to different channel estimation errors at the transmitter. Similarly, if the users feed back a quantized channel, they could use channel quantization codebooks of different sizes depending on their channel quality and the available uplink resources. However, for simplicity, we assume identical CSIT qualities $\tau_k = \tau$ for all $k$ for the optimization problems considered in Sections VI and VII.
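    The channel and CSIT models (5) and (6) can be sampled along the following lines. This is a minimal sketch under stated assumptions: the fast-fading part and the estimation noise are drawn as circularly symmetric complex Gaussian vectors with entries of variance 1/M (the model only requires i.i.d. zero-mean entries of that variance), and the estimate is the correlated version of a weighted sum of the true fast-fading vector and the independent noise, with weight set by the CSIT quality parameter. The helper names `complex_normal` and `draw_channels` are hypothetical.

```python
import numpy as np

def complex_normal(rng, shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

def draw_channels(theta_sqrt, tau, rng):
    """Draw true channels and imperfect estimates for K users.

    theta_sqrt : (K, M, M) array, square roots of the correlation matrices
    tau        : (K,) CSIT quality per user, 0 = perfect, 1 = uncorrelated
    Returns h, h_hat of shape (K, M); row k holds the vector of user k.
    """
    theta_sqrt = np.asarray(theta_sqrt, dtype=complex)
    tau = np.asarray(tau, dtype=float)
    K, M, _ = theta_sqrt.shape
    z = complex_normal(rng, (K, M)) / np.sqrt(M)   # fast fading, variance 1/M
    q = complex_normal(rng, (K, M)) / np.sqrt(M)   # estimation noise, variance 1/M
    z_hat = np.sqrt(1.0 - tau**2)[:, None] * z + tau[:, None] * q
    h = np.sqrt(M) * np.einsum("kij,kj->ki", theta_sqrt, z)
    h_hat = np.sqrt(M) * np.einsum("kij,kj->ki", theta_sqrt, z_hat)
    return h, h_hat
```

    With a quality parameter of zero the estimate coincides with the true channel, and with a value of one it is statistically independent of it, which matches the two extreme cases described above.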

    Remark 1: The model for imperfect CSIT in (6) is adequate for instance in an FDD system, where the channel is finely quantized using a random codebook of i.i.d. vectors. Since the correlation matrices $\boldsymbol{\Theta}_k$ are known at both ends, user $k$ solely quantizes the fast fading channel component $\mathbf{z}_k$ to the closest codebook vector $\hat{\mathbf{z}}_k$, which can be accurately approximated as $\hat{\mathbf{z}}_k = \sqrt{1 - \tau_k^2}\, \mathbf{z}_k + \tau_k \mathbf{q}_k$. Subsequently, the user sends the codebook index back to the transmitter, where the estimated downlink channel $\hat{\mathbf{h}}_k$ is reconstructed by multiplying $\hat{\mathbf{z}}_k$ with $\sqrt{M}\, \boldsymbol{\Theta}_k^{1/2}$. For uncorrelated channels, this specific FDD system is studied in Section VI.

    Define the compound estimated channel matrix $\hat{\mathbf{H}} \triangleq [\hat{\mathbf{h}}_1, \dots, \hat{\mathbf{h}}_K]^{\mathsf{H}} \in \mathbb{C}^{K \times M}$. Therefore, the matrix $\hat{\mathbf{H}}$ can be written as

    $$\hat{\mathbf{H}} = \sqrt{M}\, \big[ \boldsymbol{\Theta}_1^{1/2} \hat{\mathbf{z}}_1, \dots, \boldsymbol{\Theta}_K^{1/2} \hat{\mathbf{z}}_K \big]^{\mathsf{H}} \tag{7}$$

    The per-user channel correlation model (also called generalized variance profile) is very general and encompasses various propagation environments. For instance, all channel coefficients of the vector channel $\hat{\mathbf{h}}_k$ may have different variances $\sigma_{k,i}^2$ resulting from different attenuation of the signal while traveling to the receivers. This so-called variance profile of the vector channel is obtained by setting $\boldsymbol{\Theta}_k = \operatorname{diag}(\sigma_{k,1}^2, \dots, \sigma_{k,M}^2)$; see [27], [28], [34]. Another possible scenario consists of an environment where all user channels have identical transmit correlation $\boldsymbol{\Theta}$, but where the users are heterogeneously scattered around the transmitter and hence experience different channel gains $a_k$. Such a setup can be modeled with $\boldsymbol{\Theta}_k = a_k \boldsymbol{\Theta}$. From a mathematical point of view, a homogeneous system with common user channel correlation $\boldsymbol{\Theta}_k = \boldsymbol{\Theta}$ for all $k$ is very attractive. In this case, the user channels are statistically equivalent and the deterministic SINR approximations can be computed by solving a single implicit equation instead of multiple systems of coupled implicit equations. A further simplification occurs when the channels are uncorrelated, $\boldsymbol{\Theta}_k = \mathbf{I}_M$ for all $k$, in which case the approximated SINRs are given explicitly. The model in (7) has never been considered in large-dimensional RMT and therefore no results are available. The most general model studied assumes a variance profile, first treated in [27] and extended in [28], which is a special case of the model in (7). Therefore, to be able to derive deterministic equivalents of the SINR, we need to extend the results in [27] and [28] to account for the per-user correlation model in (7), which is done in the next section.

    III. DETERMINISTIC EQUIVALENT OF THE SINR

    This section introduces deterministic approximations of the SINR under RZF and ZF precoding for various assumptions on the transmit correlation matrices $\boldsymbol{\Theta}_k$. These results will be used in Sections IV-VII to solve practical optimization problems.

    The following theorem extends the results in [27], [28] and [35] by assuming a generalized variance profile. This theorem is required to cope with the channel model in (5) and forms the mathematical basis of the subsequent large system analysis of the MISO BC under RZF and ZF precoding.

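    Theorem 1: Let $\mathbf{B}_N = \mathbf{X}_N^{\mathsf{H}} \mathbf{X}_N + \mathbf{S}_N$ with $\mathbf{S}_N \in \mathbb{C}^{N \times N}$ Hermitian nonnegative definite and $\mathbf{X}_N \in \mathbb{C}^{n \times N}$ random. The $i$th column $\mathbf{x}_i$ of $\mathbf{X}_N^{\mathsf{H}}$ is $\mathbf{x}_i = \boldsymbol{\Psi}_i \mathbf{y}_i$, where the entries of $\mathbf{y}_i \in \mathbb{C}^{r_i}$ are i.i.d. of zero mean, variance $1/N$ and have eighth-order moment of order $O(1/N^4)$. The matrices $\boldsymbol{\Psi}_i \in \mathbb{C}^{N \times r_i}$ are deterministic. Furthermore, let $\boldsymbol{\Theta}_i = \boldsymbol{\Psi}_i \boldsymbol{\Psi}_i^{\mathsf{H}} \in \mathbb{C}^{N \times N}$ and define $\mathbf{Q}_N \in \mathbb{C}^{N \times N}$ deterministic. Assume $\limsup_{N \to \infty} \sup_{1 \leq i \leq n} \|\boldsymbol{\Theta}_i\| < \infty$ and let $\mathbf{Q}_N$ have uniformly bounded spectral norm (with respect to $N$). Define

    $$m_{\mathbf{B}_N, \mathbf{Q}_N}(z) \triangleq \frac{1}{N} \operatorname{tr} \mathbf{Q}_N \big( \mathbf{B}_N - z \mathbf{I}_N \big)^{-1} \tag{8}$$

    Then, for $z \in \mathbb{C} \setminus \mathbb{R}^{+}$, as $n$, $N$ grow large with ratios $\beta_{N,i} \triangleq N / r_i$ and $\beta_N \triangleq N / n$ such that $0 < \liminf_N \beta_N \leq \limsup_N \beta_N < \infty$ and $0 < \liminf_N \beta_{N,i} \leq \limsup_N \beta_{N,i} < \infty$, we have that

    $$m_{\mathbf{B}_N, \mathbf{Q}_N}(z) - m^{\circ}_{\mathbf{B}_N, \mathbf{Q}_N}(z) \to 0 \tag{9}$$

    almost surely, with $m^{\circ}_{\mathbf{B}_N, \mathbf{Q}_N}(z)$ given by

    $$m^{\circ}_{\mathbf{B}_N, \mathbf{Q}_N}(z) = \frac{1}{N} \operatorname{tr} \mathbf{Q}_N \Bigg( \frac{1}{N} \sum_{j=1}^{n} \frac{\boldsymbol{\Theta}_j}{1 + e_{N,j}(z)} + \mathbf{S}_N - z \mathbf{I}_N \Bigg)^{-1} \tag{10}$$

    where the functions $e_{N,1}(z), \dots, e_{N,n}(z)$ form the unique solution of

    $$e_{N,i}(z) = \frac{1}{N} \operatorname{tr} \boldsymbol{\Theta}_i \Bigg( \frac{1}{N} \sum_{j=1}^{n} \frac{\boldsymbol{\Theta}_j}{1 + e_{N,j}(z)} + \mathbf{S}_N - z \mathbf{I}_N \Bigg)^{-1} \tag{11}$$

    which is the Stieltjes transform of a nonnegative finite measure on $\mathbb{R}^{+}$. Moreover, for $z < 0$, the scalars $e_{N,1}(z), \dots, e_{N,n}(z)$ are the unique nonnegative solutions to (11).

    Note that (11) forms a system of $n$ coupled equations, from which (10) is given explicitly.

    Proof: The proof of Theorem 1 is given in Appendix I.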

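    Proposition 1 (Convergence of the Fixed-Point Algorithm): Let $z \in \mathbb{C} \setminus \mathbb{R}^{+}$ and $\{e_{N,i}^{(t)}(z)\}$, $t \geq 0$, be the sequence defined by $e_{N,i}^{(0)}(z) = -1/z$ and

    $$e_{N,i}^{(t+1)}(z) = \frac{1}{N} \operatorname{tr} \boldsymbol{\Theta}_i \Bigg( \frac{1}{N} \sum_{j=1}^{n} \frac{\boldsymbol{\Theta}_j}{1 + e_{N,j}^{(t)}(z)} + \mathbf{S}_N - z \mathbf{I}_N \Bigg)^{-1} \tag{12}$$

    for $t \geq 0$. Then, $\lim_{t \to \infty} e_{N,i}^{(t)}(z) = e_{N,i}(z)$ defined in (11) for $i \in \{1, 2, \dots, n\}$.

    Proof: The proof of Proposition 1 is given in Appendices I-B and I-C.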
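    The fixed-point algorithm of Proposition 1 can be sketched numerically as follows. The code assumes that the coupled equations have the resolvent form used throughout this kind of analysis: each unknown equals the normalized trace of its correlation matrix times the inverse of a weighted average of all correlation matrices (each weighted by one over one plus its unknown), shifted by a deterministic matrix and by minus z times the identity. It is restricted to a real negative argument z = -alpha, so that all iterates stay real and positive. The function name `fixed_point_e`, the tolerance, and the iteration cap are illustrative choices.

```python
import numpy as np

def fixed_point_e(thetas, alpha, S=None, tol=1e-12, max_iter=10_000):
    """Iterate the coupled equations at z = -alpha < 0.

    thetas : (n, N, N) array of Hermitian nonnegative definite matrices
    alpha  : positive scalar, so that z = -alpha lies on the negative real axis
    S      : optional (N, N) Hermitian nonnegative definite matrix (default: zero)
    Returns the vector (e_1, ..., e_n).
    """
    thetas = np.asarray(thetas, dtype=complex)
    n, N, _ = thetas.shape
    S = np.zeros((N, N), dtype=complex) if S is None else np.asarray(S, dtype=complex)
    e = np.full(n, 1.0 / alpha)                     # starting point -1/z
    for _ in range(max_iter):
        weighted = np.tensordot(1.0 / (1.0 + e), thetas, axes=1) / N
        T = np.linalg.inv(weighted + S + alpha * np.eye(N))
        e_next = np.einsum("kij,ji->k", thetas, T).real / N   # (1/N) tr(Theta_k T)
        if np.max(np.abs(e_next - e)) < tol:
            return e_next
        e = e_next
    raise RuntimeError("fixed-point iteration did not converge")
```

    For identical identity correlation matrices all unknowns coincide and the system collapses to a scalar quadratic equation, which gives a convenient sanity check for the iteration.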

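    To derive a deterministic equivalent of the SINR under RZF and ZF precoding, we require the following assumptions on the correlation matrices $\boldsymbol{\Theta}_k$ and the power allocation matrix $\mathbf{P}$.

    Assumption 1: All correlation matrices $\boldsymbol{\Theta}_k$ have uniformly bounded spectral norm on $M$, i.e.,

    $$\limsup_{M \to \infty} \sup_{1 \leq k \leq K} \|\boldsymbol{\Theta}_k\| < \infty \tag{13}$$

    Assumption 2: The power $p_{\max} = \max(p_1, \dots, p_K)$ is of order $O(1/K)$, i.e.,

    $$\|\mathbf{P}\| = O(1/K) \tag{14}$$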

    A. Regularized Zero-Forcing Precoding

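    Consider the RZF precoding matrix

    $$\mathbf{G}_{\mathrm{rzf}} = \xi \big( \hat{\mathbf{H}}^{\mathsf{H}} \hat{\mathbf{H}} + M \alpha \mathbf{I}_M \big)^{-1} \hat{\mathbf{H}}^{\mathsf{H}} \tag{15}$$

    where $\hat{\mathbf{H}} \triangleq [\hat{\mathbf{h}}_1, \dots, \hat{\mathbf{h}}_K]^{\mathsf{H}} \in \mathbb{C}^{K \times M}$ is the channel estimate available at the transmitter, $\xi$ is a normalization scalar to fulfill the power constraint (1), and $\alpha > 0$ is the regularization parameter. Here, $\alpha$ is scaled by $M$ to ensure that $\alpha$ itself converges to a constant, as $M, K \to \infty$.

    From the total power constraint (1), we obtain $\xi^2$ as

    $$\xi^2 = \frac{P}{\operatorname{tr}\Big( \mathbf{P} \hat{\mathbf{H}} \big( \hat{\mathbf{H}}^{\mathsf{H}} \hat{\mathbf{H}} + M \alpha \mathbf{I}_M \big)^{-2} \hat{\mathbf{H}}^{\mathsf{H}} \Big)} \triangleq \frac{P}{\Psi}$$

    where we defined $\Psi \triangleq \operatorname{tr}\big( \mathbf{P} \hat{\mathbf{H}} ( \hat{\mathbf{H}}^{\mathsf{H}} \hat{\mathbf{H}} + M \alpha \mathbf{I}_M )^{-2} \hat{\mathbf{H}}^{\mathsf{H}} \big)$. Denoting $\hat{\mathbf{W}} \triangleq ( \hat{\mathbf{H}}^{\mathsf{H}} \hat{\mathbf{H}} + M \alpha \mathbf{I}_M )^{-1}$, the SINR $\gamma_{k,\mathrm{rzf}}$ of user $k$ in (2) under RZF precoding takes the form

    $$\gamma_{k,\mathrm{rzf}} = \frac{p_k\, |\mathbf{h}_k^{\mathsf{H}} \hat{\mathbf{W}} \hat{\mathbf{h}}_k|^2}{\mathbf{h}_k^{\mathsf{H}} \hat{\mathbf{W}} \hat{\mathbf{H}}_{[k]}^{\mathsf{H}} \mathbf{P}_{[k]} \hat{\mathbf{H}}_{[k]} \hat{\mathbf{W}} \mathbf{h}_k + \Psi / \rho} \tag{16}$$

    where $\hat{\mathbf{H}}_{[k]} \triangleq [\hat{\mathbf{h}}_1, \dots, \hat{\mathbf{h}}_{k-1}, \hat{\mathbf{h}}_{k+1}, \dots, \hat{\mathbf{h}}_K]^{\mathsf{H}}$ and $\mathbf{P}_{[k]} \triangleq \operatorname{diag}(p_1, \dots, p_{k-1}, p_{k+1}, \dots, p_K)$.

    To derive a deterministic equivalent $\gamma_{k,\mathrm{rzf}}^{\circ}$ of the SINR $\gamma_{k,\mathrm{rzf}}$ defined in (16) such that $\gamma_{k,\mathrm{rzf}} - \gamma_{k,\mathrm{rzf}}^{\circ} \to 0$, almost surely, we require the following assumption.

    Assumption 3: The random matrix $\frac{1}{M} \hat{\mathbf{H}}^{\mathsf{H}} \hat{\mathbf{H}}$ has uniformly bounded spectral norm on $M$ with probability 1, i.e.,

    $$\limsup_{M \to \infty} \Big\| \frac{1}{M} \hat{\mathbf{H}}^{\mathsf{H}} \hat{\mathbf{H}} \Big\| < \infty \tag{17}$$

    with probability 1.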
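    The RZF precoder described in this subsection can be sketched as follows, under the assumption that the precoding matrix is the regularized inverse of the estimated Gram matrix (with the regularization parameter scaled by the number of antennas) applied to the conjugate-transposed channel estimate, followed by a scalar normalization that makes the total power constraint (1) hold with equality. The function name `rzf_precoder` and the array layout (row k of `H_hat` is the conjugate-transposed estimate of user k) are illustrative.

```python
import numpy as np

def rzf_precoder(H_hat, p, alpha, P_total):
    """Regularized zero-forcing precoder normalized to total power P_total.

    H_hat   : (K, M) array, row k is the conjugate-transposed channel estimate of user k
    p       : (K,) array of per-user signal powers
    alpha   : regularization parameter, alpha > 0
    P_total : total available transmit power
    Returns G of shape (M, K) with sum_k p_k * ||g_k||^2 == P_total.
    """
    H_hat = np.asarray(H_hat, dtype=complex)
    p = np.asarray(p, dtype=float)
    K, M = H_hat.shape
    W = np.linalg.inv(H_hat.conj().T @ H_hat + M * alpha * np.eye(M))
    G_unnorm = W @ H_hat.conj().T                             # unnormalized precoder
    psi = np.sum(p * np.sum(np.abs(G_unnorm) ** 2, axis=0))   # weighted column energies
    xi = np.sqrt(P_total / psi)                               # normalization scalar
    return xi * G_unnorm
```

    Letting the regularization parameter grow large makes the columns approach scaled matched filters, while a very small value approaches channel inversion, in line with the two limiting cases of the RZF precoder.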

    Remark 2: Assumption 3 holds true if, where denotes the cardinality of the

    set . That is, belongs to a finite family [36]. In par-ticular, if , then Assumption 3 is satisfied, since

    , where and bothand are uniformly bounded for all large with

    probability 1 [37].A deterministic equivalent of is provided in the

    following theorem.

    Theorem 2: Let Assumptions 1, 2, and 3 hold true and letand be the SINR of user defined in (16). Then

    (18)

    almost surely, where is given by

    (19)

    with , where form the unique positive so-lutions of

    (20)

    (21)

    and and read

    (22)

    (23)

    with and given by

    (24)(25)


    where , , and take the form

    Proof: The proof of Theorem 2 is given in Appendix II.

    Corollary 1: Let Assumptions 1 and 2 hold true and letand ; then takes the form

    (26)where is the unique positive solution of

    (27)

    (28)

    and is given by

    (29)

    Proof: Substituting into Theorem 2, we havegiven in (27),

    , and . There-fore, the terms and become and

    , respectively. Furthermore,can be written as

    (30)

    Substituting these terms into (19) yields (26) which completesthe proof.

    Note that under Assumption 2, the term in (26) can beomitted since the convergence in (18) still holds true. We willmake use of this simplification when studying different applica-tions of the SINR approximations.

    Corollary 2: Let Assumption 2 hold true and let and; then takes the form

    (31)where is given as

    (32)

    Proof: Substituting into Corollary 1, we havewhich yields (31). Moreover, (27) becomes a

    quadratic equation in with unique positive solution (32),which completes the proof.
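    The closed-form solution mentioned in this proof can be checked numerically. The sketch below assumes that, for uncorrelated channels, the implicit equation reduces to the scalar fixed point m = 1 / (alpha + 1 / (beta (1 + m))) with beta = M/K, which rearranges to the quadratic alpha beta m^2 + (alpha beta + 1 - beta) m - beta = 0. The function names are hypothetical, and the closed form is derived here from that assumed scalar equation rather than copied from (32).

```python
import math

def m_closed_form(alpha, beta):
    """Positive root of alpha*beta*m^2 + (alpha*beta + 1 - beta)*m - beta = 0."""
    b = beta - 1.0 - alpha * beta
    return (b + math.sqrt(b * b + 4.0 * alpha * beta * beta)) / (2.0 * alpha * beta)

def fixed_point_map(m, alpha, beta):
    """Right-hand side of the assumed scalar fixed-point equation."""
    return 1.0 / (alpha + 1.0 / (beta * (1.0 + m)))
```

    Since the product of the two roots of the quadratic is negative, exactly one root is positive, which is consistent with the uniqueness statement in the corollary.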

    In particular, we will consider two different RZF precoders. The first RZF precoder is defined by $\alpha = 1/(\beta \rho)$ and is referred to as the RZF channel distortion unaware (RZF-CDU) precoder. Under imperfect CSIT the RZF-CDU precoder is mismatched to the true channel. The second RZF precoder is called the RZF channel distortion aware (RZF-CDA) precoder and does account for imperfect CSIT. The optimal regularization parameter for the RZF-CDA precoder is derived in Section IV.

    Moreover, there are two limiting cases of the RZF precoder corresponding to $\alpha \to 0$ and $\alpha \to \infty$. For $\alpha \to \infty$ the RZF precoder converges to the matched filter (MF) precoder $\mathbf{G} \propto \hat{\mathbf{H}}^{\mathsf{H}}$. A deterministic equivalent for the MF precoder can be derived by taking the limit $\alpha \to \infty$ of $\gamma_{k,\mathrm{rzf}}^{\circ}$. However, since the performance of the MF precoder is rather poor and its deterministic equivalent no longer involves Stieltjes transforms, we will not discuss this precoding scheme in this study. The reader is referred to [38] or [39] for a detailed large system analysis of the MF precoder. In the case of $\alpha \to 0$, the RZF precoder converges to the ZF precoder, which is discussed in the next section.

    B. Zero-Forcing Precoding

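    For $\alpha = 0$, the RZF precoding matrix in (15) reduces to the ZF precoding matrix $\mathbf{G}_{\mathrm{zf}}$ which reads

    $$\mathbf{G}_{\mathrm{zf}} = \xi\, \hat{\mathbf{H}}^{\mathsf{H}} \big( \hat{\mathbf{H}} \hat{\mathbf{H}}^{\mathsf{H}} \big)^{-1}$$

    where $\xi$ is a scaling factor to fulfill the power constraint (1) and is given by

    $$\xi^2 = \frac{P}{\operatorname{tr}\Big( \mathbf{P} \big( \hat{\mathbf{H}} \hat{\mathbf{H}}^{\mathsf{H}} \big)^{-1} \Big)} \triangleq \frac{P}{\Psi}$$

    where $\Psi \triangleq \operatorname{tr}\big( \mathbf{P} ( \hat{\mathbf{H}} \hat{\mathbf{H}}^{\mathsf{H}} )^{-1} \big)$. Defining $\hat{\mathbf{W}} \triangleq \hat{\mathbf{H}}^{\mathsf{H}} ( \hat{\mathbf{H}} \hat{\mathbf{H}}^{\mathsf{H}} )^{-2} \hat{\mathbf{H}}$, the SINR $\gamma_{k,\mathrm{zf}}$ of user $k$ in (2) under ZF precoding reads

    $$\gamma_{k,\mathrm{zf}} = \frac{p_k\, |\mathbf{h}_k^{\mathsf{H}} \hat{\mathbf{W}} \hat{\mathbf{h}}_k|^2}{\mathbf{h}_k^{\mathsf{H}} \hat{\mathbf{W}} \hat{\mathbf{H}}_{[k]}^{\mathsf{H}} \mathbf{P}_{[k]} \hat{\mathbf{H}}_{[k]} \hat{\mathbf{W}} \mathbf{h}_k + \Psi / \rho} \tag{33}$$

    where $\hat{\mathbf{H}}_{[k]}$ and $\mathbf{P}_{[k]}$ denote $\hat{\mathbf{H}}$ and $\mathbf{P}$ with the row and the diagonal entry of user $k$ removed, respectively.

    To obtain a deterministic equivalent of the SINR in (33), we need to ensure that the minimum eigenvalue of $\frac{1}{M} \hat{\mathbf{H}} \hat{\mathbf{H}}^{\mathsf{H}}$ is bounded away from zero for all large $M$, almost surely. Therefore, the following assumption is required.

    Assumption 4: There exists $\varepsilon > 0$ such that, for all large $M$, we have $\lambda_{\min}\big( \frac{1}{M} \hat{\mathbf{H}} \hat{\mathbf{H}}^{\mathsf{H}} \big) > \varepsilon$ with probability 1.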

    Remark 3: If and (i.e., incontrast to Theorem 2, must be invertible), for all , thenAssumption 4 holds true if . Indeed, for , from [37],there exists such that, for all large , ,where , with probability 1. Therefore, for alllarge ,almost surely.Furthermore, we require the following assumption for the

    channel model with per-user correlation.


    Assumption 5: Assume that exists forall and for some , for all .

    Remark 4: Under these conditions, are the uniquepositive solutions of (36). In particular, Assumption 5 holds trueif , and . This is detailedin the proof of Corollary 3.

    Theorem 3: Let Assumptions 15 hold true and let bethe SINR of user under ZF precoding defined in (33). Then

    almost surely, where is given by

    (34)

    where and read

    (35)

    The functions form the unique positive solution of

    (36)

    (37)

    Further, define , which is given as

    (38)

    where and take the form

    Proof: The proof of Theorem 3 is given in Appendix III.

    Corollary 3: Let Assumptions 1 and 2 hold true. Further, let, with , , for all ; then

    Theorem 3 holds true and takes the form

    with

    (39)

    (40)

    where is the unique positive solution of

    (41)

    (42)

    Proof: For , we obtain from (20)

    (43)

    A lower bound of (43) is given as whichis uniformly bounded away from zero if is invertible and

    . Thus, under these conditions, Assumption 5 is satisfied.Moreover, in (38) rewrites

    and therefore

    Dividing by and by , we ob-tain given in (40) and given in (39), respectively, whichcompletes the proof.

    Corollary 4: Let Assumption 2 hold true and let and; then takes the explicit form

    (44)

    Proof: By substituting into (41), is explicitlygiven by . We further have and

    .
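    To make the explicit ZF expression of Corollary 4 concrete, the sketch below evaluates it under an assumed form: the deterministic SINR of user k equals the power ratio p_k / (P/K), times (1 - tau^2) / (tau^2 + 1/rho), times (beta - 1), with beta = M/K > 1 and a common CSIT quality tau. This form is a reconstruction that follows from treating the CSIT error as additional interference of power tau^2 P, and it reduces to the rho (beta - 1) limit of [18] for perfect CSIT and equal power allocation. The function name and argument names are hypothetical.

```python
def zf_sinr_uncorrelated(p_k, P_total, K, M, tau, rho):
    """Assumed explicit large-system ZF SINR for uncorrelated channels.

    p_k     : signal power of user k
    P_total : total transmit power
    K, M    : number of users and transmit antennas (M > K)
    tau     : common CSIT quality in [0, 1], 0 = perfect CSIT
    rho     : SNR, P_total divided by the noise variance
    """
    if M <= K:
        raise ValueError("the explicit ZF expression requires M > K")
    beta = M / K
    power_ratio = p_k / (P_total / K)
    return power_ratio * (1.0 - tau**2) / (tau**2 + 1.0 / rho) * (beta - 1.0)
```

    Under this form the SINR saturates at (1 - tau^2)(beta - 1) / tau^2 as the SNR grows with imperfect CSIT, i.e., the system becomes interference limited.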

    C. Rate Approximations

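    We are interested in the individual rates $R_k$ of the users as well as the average system sum rate $R_{\mathrm{sum}}$. Since the logarithm is a continuous function, by applying the continuous mapping theorem [40], it follows from the almost sure convergence $\gamma_k - \gamma_k^{\circ} \to 0$ that

    $$R_k - R_k^{\circ} \to 0 \tag{45}$$

    almost surely, where $R_k^{\circ} = \log(1 + \gamma_k^{\circ})$. An approximation $R_{\mathrm{sum}}^{\circ}$ of the ergodic sum rate $R_{\mathrm{sum}}$ is obtained by replacing the instantaneous (i.e., without averaging over the channel distribution) SINR $\gamma_k$ with its large system approximation $\gamma_k^{\circ}$, i.e.,

    $$R_{\mathrm{sum}}^{\circ} = \sum_{k=1}^{K} \log\big(1 + \gamma_k^{\circ}\big) \tag{46}$$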


    It follows that

    (47)

    holds true almost surely. Another quantity of interest is the rategap between the achievable rate under perfect and imperfectCSIT. We define the rate gap of user as

    (48)

    where is the rate of user under perfect CSIT, i.e., for. Then, from (45) it follows that a deterministic equivalentof the rate gap of user such that

    almost surely is given by

    (49)

    where is a deterministic equivalent of the rate of user underperfect CSIT.Since we will require the per-user rate gaps for uncorrelated

    channels in the limited feedback analysis inSections VI and VII, we introduce hereafter for RZF-CDUand ZF precoding.

    Corollary 5 (RZF-CDU Precoding): Let ,, and define as the rate gap of

    user under RZF-CDU precoding. Then a deterministic equiv-alent such that

    almost surely is given by

    where is given in (32).Proof: With Corollary 2, compute as defined

    in (49), where .

    Corollary 6 (ZF Precoding): Let ,and define to be the rate gap of

    user under ZF precoding. Then

    almost surely, with given by

    where is defined given by

    (50)

    Proof: Substitute the SINR from Corollary 4 into (49).

    Remark 5: In practice, one is often interested in the averagesystem performance, e.g., the ergodic SINR or ergodicrate . Since the SINR is uniformly bounded on for

    the considered precoding schemes, we can apply the dominatedconvergence theorem [40, Theorem 16.4], and obtain

    where the expectation is taken over the probability spacegenerating the sequence with

    . The same holds true for the per-userrate , i.e., .

    D. Numerical Results

    We validate Theorems 2 and 3 by comparing the ergodic sum rate (4), obtained by Monte Carlo (MC) simulations of i.i.d. Rayleigh block-fading channels, to the large system approximation , for finite system dimensions and equal power allocation .

    The correlation of the user channel is modeled as in [41] by assuming a diffuse 2-D field of isotropic scatterers around the receivers. The waves impinge on the receiver uniformly at an azimuth angle ranging from to . Denoting the distance between transmit antennas and , the correlation is modeled as

    (51)

    where denotes the signal wavelength. The users are assumed to be distributed uniformly around the transmitter at an angle and as a simple example, we choose and . Note that for small (in our example for small values of ), the corresponding signal of user is highly correlated since the signal arrives from a very narrow angle. Thus, the correlation model (51) yields rank-deficient correlation matrices for some users. The transmitter is equipped with a uniform linear array (ULA). To ensure that is bounded as grows large, we assume that the distance between adjacent antennas is independent of , i.e., the length of the ULA increases with .

    The simulation results presented in Fig. 1 depict the absolute error of the sum rate approximation compared to the ergodic sum rate , averaged over independent channel realizations. The notation indicates that is modeled according to (51) with . From Fig. 1, we observe that the approximated sum rate becomes more accurate with increasing .

    Figs. 2 and 3 compare the ergodic sum rate to the deterministic approximation (46) under RZF and ZF precoding, respectively. The error bars indicate the standard deviation of the MC results. It can be observed that the approximation lies roughly within one standard deviation of the MC simulations. From Fig. 2, under imperfect CSIT , the sum rate is decreasing for high SNR, because the regularization parameter does not account for and thus the matrix in the RZF precoder becomes ill-conditioned. Fig. 3 shows that, for , the sum rate is not decreasing at high SNR, because the CSIT is much better conditioned. The optimal regularization is discussed in Section IV. Further, observe that in Fig. 2 the deterministic approximation becomes less accurate for high SNR. The reason is that in the derivation of the approximated SINR, we apply Theorem 1 in and thus the


    Fig. 1. RZF, versus for a fixed SNR of with , .

    Fig. 2. RZF, sum rate versus SNR with and ; simulation results are indicated by circle marks with error bars indicating the standard deviation.

    bounds in Proposition 12 (Appendix I-A) are proportional to the SNR. Therefore, to increase the accuracy of the approximated SINR, larger dimensions are required in the high SNR regime.

    We conclude that the approximations in Theorems 2 and 3 are accurate even for small dimensions and can be applied to various optimization problems discussed in the sequel.

    IV. SUM RATE MAXIMIZING REGULARIZATION

    The optimal regularization parameter maximizing (46) is defined as

    (52)

    In general, the optimization problem (52) is not convex in and the solution has to be computed via a 1-D line search.

    Fig. 3. ZF, sum rate versus SNR with , ; simulation results are indicated by circle marks with error bars indicating the standard deviation.

    In the following, we confine ourselves to the case of common correlation , since for per-user correlation a common regularization parameter is no longer optimal [12], [42]. Under common transmit correlation, we subsequently assume that the distortions of the CSIT are identical for all users, since the users' channels are statistically equivalent. Under these conditions maximizes (46) and the optimization problem (52) has the following solution.

    Proposition 2: Let , and . The approximated SINR of user under RZF precoding (equivalently, the approximated per-user rate and the sum rate) is maximized for a regularization parameter , given as a positive solution to the fixed-point equation

    (53)

    where is defined in (27) and is given by

    (54)

    with defined in (29).

    Proof: The proof is provided in Appendix IV.

    Note that the solution in Proposition 2 assumes a fixed distortion . Later in Section VI, the distortion becomes a function of the quantization codebook size and in Section VII it depends on the uplink SNR as well as on the amount of channel training.

    Under perfect CSIT , Proposition 2 simplifies to the well-known solution , independent of , which has previously been derived in [9], [22], and [26]. As mentioned in [9], for large the RZF-CDA precoder is identical to the MMSE precoder in [19] and [43]. The authors in [26] showed that, under perfect CSIT, is independent of the correlation . However, for imperfect CSIT , the optimal regularization parameter (53) depends on the transmit correlation through and . For uncorrelated channels


    , we have and , and therefore, the explicit solution

    (55)

    Note that in this case, it can be shown that in (55) is the unique positive solution to (52).

    For imperfect CSIT , the RZF-CDA precoder and the MMSE precoder with regularization parameter [43] are no longer identical, even in the large , limit. Unlike the case of perfect CSIT, now depends on the correlation matrix through and . The impact of and on the sum rate of RZF-CDA precoding is evaluated through numerical simulations in Fig. 5. Further, note that since and are bounded from above under the conditions explained in Remark 6, at asymptotically high SNR the regularization parameter in (53) converges to , where is a positive solution of

    (56)

    For uncorrelated channels, the limit in (56) takes the form

    Thus, for asymptotically high SNR, RZF-CDA precoding is not the same as ZF precoding, since the regularization parameter is nonzero due to the residual interference caused by the imperfect CSIT. Similar observations have been made in [43] for the MMSE precoder.
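    The behavior described above for uncorrelated channels can be illustrated numerically. The displayed equations are not legible in this copy, so the following Python sketch relies on an assumed closed form, alpha = (1 + tau2 * rho) / ((1 - tau2) * beta * rho), where beta is the number of transmit antennas per user, rho the SNR, and tau2 the common CSIT distortion. This form is chosen because it agrees with the surrounding statements: it stays strictly positive at high SNR whenever tau2 > 0 and becomes independent of the distortion when tau2 = 0. The function and argument names are hypothetical.

```python
def rzf_regularization_uncorrelated(beta, rho, tau2):
    """Regularization parameter under the ASSUMED closed form for
    uncorrelated channels with identical CSIT distortion.

    beta : transmit antennas per user (ratio M / K), must be > 0
    rho  : SNR on a linear scale, must be > 0
    tau2 : CSIT distortion, 0 <= tau2 < 1 (0 means perfect CSIT)
    """
    if beta <= 0 or rho <= 0 or not 0.0 <= tau2 < 1.0:
        raise ValueError("need beta > 0, rho > 0 and 0 <= tau2 < 1")
    return (1.0 + tau2 * rho) / ((1.0 - tau2) * beta * rho)


def rzf_regularization_high_snr(beta, tau2):
    """Limit of the assumed expression as rho grows without bound."""
    if beta <= 0 or not 0.0 <= tau2 < 1.0:
        raise ValueError("need beta > 0 and 0 <= tau2 < 1")
    return tau2 / ((1.0 - tau2) * beta)
```

    Evaluating the first function over an increasing SNR grid with tau2 > 0 shows the parameter levelling off at the value returned by the second function instead of decaying to zero, which is the numerical counterpart of the observation that RZF-CDA does not reduce to ZF at high SNR.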

    Remark 6: Note that in (56) we apply the limit on a result obtained from an SINR approximation which is almost surely exact as . This is correct if in (16) is bounded for asymptotically high SNR as . For it is clear that is bounded since for all SNR. In the case where , we have and thus for the support of the limiting eigenvalue distribution of includes zero resulting in an unbounded . From Remark 3, for , , and there exists such that for all large . Thus, is bounded. On the contrary, for , , and , it has not been proved that and we have to invoke Assumption 4 to ensure that is bounded. Thus, for , the limit (56) is only well defined for . Further, note that if is bounded as , the limits and can be interchanged without affecting the result.

    For various special cases, substituting (53) into the deterministic equivalent of the SINR in (26) yields the following simplified expressions.

    Corollary 7: Let Assumptions 1 and 2 hold true and let , , , , and be the sum rate maximizing SINR of user under RZF precoding. Then

    almost surely, where is given by

    (57)

    where is the unique positive solution to

    Proof: Substituting into (26) together with , we obtain (57) which completes the proof.

    For uncorrelated channels , the solution to (57) is explicit and summarized in the following corollary.

    Corollary 8: Let , , , and be the sum rate maximizing SINR of user under RZF precoding. Then, , almost surely, where is given by

    (58)

    where and are given by

    (59)

    (60)

    Proof: Substituting into Corollary 7 leads to a quadratic equation in for which the unique positive solution is given by (58), which completes the proof.
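    To make the explicit solution of the quadratic equation concrete, the sketch below evaluates a closed-form per-user SINR for uncorrelated channels. Because (58) to (60) are not legible in this copy, the expressions for the two auxiliary quantities (called omega and chi here) are assumptions: omega = (1 - tau2) / (1 + tau2 * rho) and chi = sqrt((beta - 1)^2 omega^2 rho^2 + 2 (1 + beta) omega rho + 1), with the SINR taken as omega * rho * (beta - 1) / 2 + chi / 2 - 1 / 2. All names are hypothetical.

```python
import math


def rzf_cda_sinr_uncorrelated(beta, rho, tau2):
    """Per-user SINR of RZF with the sum rate maximizing regularization,
    uncorrelated channels, under the ASSUMED closed form described above.

    beta : transmit antennas per user (M / K), beta >= 1
    rho  : SNR on a linear scale, rho > 0
    tau2 : CSIT distortion, 0 <= tau2 < 1
    """
    if beta < 1.0 or rho <= 0 or not 0.0 <= tau2 < 1.0:
        raise ValueError("need beta >= 1, rho > 0 and 0 <= tau2 < 1")
    # Effective SNR scaling caused by imperfect CSIT (assumed form).
    omega = (1.0 - tau2) / (1.0 + tau2 * rho)
    # Square root of the discriminant of the quadratic equation (assumed form).
    chi = math.sqrt((beta - 1.0) ** 2 * omega ** 2 * rho ** 2
                    + 2.0 * (1.0 + beta) * omega * rho + 1.0)
    return 0.5 * omega * rho * (beta - 1.0) + 0.5 * chi - 0.5
```

    For beta = 1 and perfect CSIT the assumed expression reduces to (sqrt(4 rho + 1) - 1) / 2, the positive root of gamma^2 + gamma = rho, which gives a quick sanity check of the implementation.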

    A deterministic equivalent of the rate gap under RZF-CDA precoding is provided in the following corollary.

    Corollary 9 (RZF-CDA Precoding): Let , , and define as the rate gap of user under RZF-CDA precoding. Then,

    almost surely with

    where and are defined in (59) and (60), respectively.

    Proof: With Corollary 8, compute as defined in (49).

    The impact of the regularization parameter on the ergodic sum rate is depicted in Figs. 4 and 5.

    In Fig. 4, we compare the ergodic sum rate performance for different regularization parameters with CSIT distortion . The upper bound is obtained by optimizing for every channel realization, whereas maximizes


    Fig. 4. RZF, ergodic sum rate versus SNR with , ,, and .

    Fig. 5. RZF, ergodic sum rate versus SNR with , ,and .

    the ergodic sum rate. It can be observed that both and perform close to the optimal . Furthermore, if the channel quality is unknown at the transmitter (and hence assumed to be equal to zero), the performance is decreasing as soon as dominates the noise power (i.e., the interuser interference limits the performance) and approaches the sum rate of ZF precoding for high SNR. We conclude that 1) adapting the regularization parameter yields a significant performance increase and 2) the proposed RZF-CDA precoder with performs close to optimal even for small system dimensions.

    In Fig. 5, we simulate the impact of transmit correlation in the computation of on the sum rate. For this purpose, we use the standard exponential correlation model, i.e.,

    We compare two different RZF precoders: a first precoder coined RZF common correlation aware (RZF-CCA) that takes the channel correlation into account and computes according to (53), and a second precoder called RZF common correlation unaware (RZF-CCU) that does not take into account and computes as in (55). We observe that for high correlation, i.e., , the RZF-CCA precoder significantly outperforms the RZF-CCU precoder at medium-to-high SNR, whereas both precoders perform equally well at low SNR. Therefore, we conclude that it is beneficial to account for transmit correlation, especially in highly correlated channels. Further, simulations (not provided here) suggest that the sum rate gain of RZF-CCA over RZF-CCU precoding is less pronounced for lower CSIT qualities (i.e., increasing ), because in this case the impact of the CSIT quality is more significant than the impact of on the sum rate.

    V. OPTIMAL NUMBER OF USERS AND POWER ALLOCATION

    In this section, we address two problems: 1) the determination of the sum rate maximizing number of users per transmit antenna for a fixed and 2) the optimization of the power distribution among a given set of users with unequal CSIT qualities.

    Consider problem 1. Intuitively, an optimal number of users exists because serving more users creates more interference which in turn reduces the rates of the users. At some point the accumulated rate loss, due to the additional interference caused by scheduling another user, will outweigh the sum rate gain and hence the system sum rate will decrease. In particular, we consider a fair scenario where the SINR approximation of all users is equal. Here, the (approximated) optimal solution can be expressed in closed form for ZF precoding.

    In problem 2, we optimize the power allocation matrix for a given . More precisely, we focus on common correlation with different CSIT qualities , since in this case the (approximated) optimal power distribution is the solution of a classical water-filling algorithm.

    A. Sum Rate Maximizing Number of Users

    Consider the problem of finding the system loading maximizing the approximated sum rate per transmit antenna for a fixed , i.e.,

    (61)

    where denotes either with or with . In general, (61) has to be solved by a 1-D line search. However, in the case of ZF precoding and uncorrelated antennas, the optimization problem (61) has a closed-form solution given in the following proposition.

    Proposition 3: Let , , and . The sum rate maximizing system loading per transmit antenna is given by

    (62)

    where , , and is the Lambert W-function defined as , .

    Proof: Substituting the SINR in Corollary 4 into (61) and differentiating with respect to leads to

    (63)


    Denoting , we can rewrite (63) as

    Noticing that and solving for yields (62), which completes the proof.

    For , we have and . In this case, is a well-defined function. If , we obtain the results in [18], although in [18] they are not given in closed form. Note that for , we have , i.e., the optimal system loading tends to 1. Further, note that only integer values of are meaningful in practice.
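    The closed-form loading based on the Lambert W-function can be evaluated with a few lines of Python. Since (62) and the definitions of its auxiliary quantities are not legible in this copy, the sketch below rests on stated assumptions: the per-user ZF SINR is modeled as a * (beta - 1) with a = (1 - tau2) / (tau2 + 1 / rho), the objective is log(1 + a * (beta - 1)) / beta, and setting its derivative to zero gives beta = (1 - 1/a) * (1 + 1 / W(x)) with x = (a - 1) / e. The principal branch of W is computed by a plain Newton iteration so that only the standard library is needed; all function names are hypothetical.

```python
import math


def lambert_w0(x, tol=1e-12, max_iter=200):
    """Principal branch of the Lambert W-function for x > -1/e,
    i.e. the solution w > -1 of w * exp(w) = x, via Newton's method."""
    if x <= -1.0 / math.e:
        raise ValueError("x must be larger than -1/e")
    # log(1 + x) never lies left of the root, so the iteration
    # approaches the root monotonically and stays above w = -1.
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


def optimal_loading_zf(rho, tau2):
    """Loading beta = M / K maximizing log(1 + a * (beta - 1)) / beta,
    where a = (1 - tau2) / (tau2 + 1 / rho) is an ASSUMED effective SNR."""
    if rho <= 0 or not 0.0 <= tau2 < 1.0:
        raise ValueError("need rho > 0 and 0 <= tau2 < 1")
    a = (1.0 - tau2) / (tau2 + 1.0 / rho)
    if abs(a - 1.0) < 1e-9:
        # Removable singularity: for a = 1 the objective is log(beta) / beta.
        return math.e
    w = lambert_w0((a - 1.0) / math.e)
    return (1.0 - 1.0 / a) * (1.0 + 1.0 / w)
```

    The number of users to serve then follows as the integer closest to M divided by the returned loading, in line with the remark that only integer user counts are meaningful in practice.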

    B. Power Optimization Under Common Correlation

    From Corollaries 1 and 3, the approximated sum rate (46) for both RZF and ZF precoding takes the form

    (64)

    with , where the only dependence on user stems from . The user powers that maximize (64), subject to , , are thus given by the classical water-filling solution [44]

    (65)

    where and is the water level chosen to satisfy . For , the optimal user powers (65) are all equal, i.e., and . In this case though, it could still be beneficial to adapt the number of users as discussed in Section V-A.

    C. Numerical Results

    Fig. 6 compares the optimal number of users in (62) to obtained by choosing the such that the ergodic sum rate is maximized, whereas Fig. 7 depicts the impact of a suboptimal number of users on the ergodic sum rate of the system.

    From Fig. 6, it can be observed that 1) the approximated results do fit well with the simulation results even for small dimensions; 2) increase with the SNR; and 3) for , saturate for high SNR at a value lower than . Therefore, under imperfect CSIT, it is no longer optimal to serve the maximum number of users for asymptotically high SNR. Instead, depending on , a lower number of users should be served even at high SNR which implies a reduced multiplexing gain of the system. The impact of different numbers of users on the sum rate is depicted in Fig. 7.

    From Fig. 7 we observe that 1) the approximate solution achieves most of the sum rate and 2) adapting the number of users with the SNR is beneficial compared to a fixed . Moreover, from Fig. 6, we identify as an optimal choice (for ) for medium SNR and, as expected, the performance is optimal in the medium SNR regime and suboptimal at low and high SNR. From Fig. 6, it is clear that is highly suboptimal in the medium and high SNR range and we

    Fig. 6. ZF, sum rate maximizing number of users versus SNR with, , and .

    Fig. 7. ZF, versus SNR with , , , and.

    observe a significant loss in sum rate. Consequently, the number of users must be adapted to the channel conditions and the approximate result is a good choice to determine the optimal number of users.

    In Fig. 8, under RZF-CDU precoding, we compare the ergodic sum rate performance with power allocation from (65) to equal power allocation . We consider a system with , where the CSIT qualities vary significantly among the users, i.e., with , . We observe a significant gain over the whole SNR range when optimal power allocation is applied. In contrast, if the CSIT distortion of the users' channels with does not differ considerably ( , with ), we only observe a small gain at high SNR. For increasing SNR, the SINRs become increasingly distinct depending on . Therefore, it might be optimal to turn off the users with lowest CSIT


    Fig. 8. RZF-CDU, versus with , , andand .

    accuracy as the SNR increases, which explains why the sum rate gain is larger at high SNR than at low SNR. However, recall that the water-filling solution is optimal under Assumption 2 and large . We thus conclude that the optimal power allocation proposed in (65) achieves significant performance gains, especially at high SNR, when the quality of the available CSIT varies considerably among the users' channels.

    VI. OPTIMAL FEEDBACK IN LARGE FDD MU SYSTEMS

    Consider an FDD system, where the users quantize their perfectly estimated channel vectors and send the codebook quantization index back to the transmitter over an independent feedback channel of limited rate. The feedback channels are assumed to be error free and of zero delay. The quantization codebooks are generated prior to transmission and are known to both transmitter and respective receiver. Due to the finite rate feedback link, imposing a finite codebook size, the transmitter has only access to an imperfect estimate of the true downlink channel. To obtain tractable expressions, we restrict the subsequent analysis to i.i.d. Gaussian channels .

    In the sequel, we follow the limited feedback analysis in [45], where each user's channel direction is quantized using bits which are subsequently fed back to the transmitter. Under Rayleigh fading, the channel can be decomposed as , where we suppose that the channel magnitude is perfectly known to the transmitter since it can be efficiently quantized with only a few bits [45]. Without loss of generality¹, we assume RVQ, where each user independently generates a random codebook containing vectors that are isotropically distributed on the -dimensional unit sphere. Subsequently, user quantizes its channel direction to the closest according to

    ¹ The derived scaling results hold for any quantization codebook [45].

    Under RVQ, the quantized channel direction is isotropically distributed on the -dimensional unit sphere due to the statistical properties of both the random codebook and the channels . Thus, for fine quantization with small errors, the entries of both and can be modeled with good approximation as i.i.d. Gaussian of zero mean and unit variance. The quantization error vector can be approximated as [46] and we can write

    (66)

    where is the quantization error variance. The scaling in (66) is required to ensure that the elements of have unit variance. Therefore, the effect of imperfect CSIT under RVQ in (66) is captured by the channel model (6). For RVQ, the quantization error can be upper bounded as [45, Lemma 1]

    (67)
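    The relation between the number of feedback bits and the resulting distortion can be made tangible with a small helper. The displayed bound (67) is not legible in this copy; the sketch assumes the commonly quoted RVQ bound tau2 < 2^(-B / (M - 1)) for B feedback bits and M transmit antennas, and inverts it to obtain the number of bits that guarantees a target distortion. Function names are hypothetical.

```python
import math


def rvq_distortion_bound(bits, num_antennas):
    """ASSUMED upper bound 2 ** (-B / (M - 1)) on the RVQ quantization
    error for B feedback bits and M > 1 transmit antennas."""
    if num_antennas < 2 or bits < 0:
        raise ValueError("need at least 2 antennas and bits >= 0")
    return 2.0 ** (-bits / (num_antennas - 1))


def rvq_bits_for_distortion(tau2, num_antennas):
    """Smallest integer number of bits for which the assumed bound
    does not exceed the target distortion tau2 in (0, 1]."""
    if num_antennas < 2 or not 0.0 < tau2 <= 1.0:
        raise ValueError("need at least 2 antennas and 0 < tau2 <= 1")
    return math.ceil(-(num_antennas - 1) * math.log2(tau2))
```

    Under this assumed bound, halving the distortion costs M - 1 additional bits per user, which is why the feedback scaling laws discussed in this section grow linearly in M - 1.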

    The bound in (67) is tight for large [45]. Moreover, since the quantization codebooks of the users are supposed to be of equal size, the resulting CSIT distortions can be assumed identical, i.e., . Under this assumption and equal power allocation, for large , the SINR is identical for all users and, hence, optimizing is equivalent to optimizing the per-user rate bits/s/Hz and the sum rate .

    In the following, in particular under RVQ, we will derive the necessary scaling of the distortion to ensure that

    almost surely, where is defined in (48) and . That is, a constant rate gap of is maintained exactly as . A constant rate gap ensures that the full multiplexing gain of is achieved. Thus, the proposed scaling also guarantees a larger but constant rate gap to the optimal DPC solution with perfect CSIT. The choice of a rate offset is motivated by mere mathematical convenience to avoid terms of the form and to be compliant with [45].

    With this strategy, we closely follow [45]. In [45, Theorem 1], the author derived an upper bound of the ergodic per-user gap for ZF precoding with and unit norm precoding vectors under RVQ, which is given by

    (68)

    We cannot directly compare the deterministic equivalents to the upper bound in (68) for two reasons: 1) under ZF precoding and , a deterministic equivalent for the per-user rate gap does not exist and 2) [45] considers unit norm precoding vectors, whereas in this paper we only impose a total power constraint (1). Concerning 1, at high SNR, we can use the deterministic equivalent for RZF-CDU precoding given in Corollary 5 as a good approximation for ZF precoding, since for high SNR the rates of RZF-CDU and ZF precoding converge. Regarding 2, deriving a deterministic equivalent of the SINR under linear precoding with a unit norm power constraint on the precoding vectors is difficult, since it introduces an additional nontrivial


    Fig. 9. ZF, per-user rate gap versus number of bits per user with ,.

    dependence on the channel. However, it is useful to compare the accuracy of the upper bound in (68) and the deterministic equivalent in Corollary 5 at high SNR.

    Fig. 9 depicts the per-user rate gap as a function of the feedback bits per user under ZF precoding at an SNR of 25 dB. We simulated the ergodic per-user rate gap and of ZF precoding with unit norm precoding vectors and total power constraint, respectively. We compare the numerical results to the upper bound (68) and to the deterministic equivalent for and . For both system dimensions and are close, suggesting that our results derived under the total power constraint may be good approximations for the case of unit norm precoding vectors as well. As mentioned in [45], the accuracy of the upper bound increases with increasing but the deterministic equivalent appears to be more accurate for both and . In fact, for , approximates the per-user rate gap significantly more accurately than the upper bound (68) for the given SNR. We conclude that the proposed deterministic equivalent is sufficiently accurate and can be used to derive scaling laws for the optimal feedback rate.

    In the following, we compare the scaling of under RZF-CDA, RZF-CDU, and ZF precoding to the upper bound given for ZF precoding in [45, Theorem 3]. For the sake of comparison, we restate [45, Theorem 3].

    Theorem: [45, Theorem 3]: In order to maintain a rate offset no larger than (per user) between ZF with perfect CSIT and with finite-rate feedback (i.e., ), it is sufficient to scale the number of feedback bits per mobile according to

    where . It is also mentioned that the result in [45, Theorem 3] holds true for RZF-CDU precoding for high SNR, since ZF and RZF-CDU precoding converge for asymptotically high SNR. Furthermore, it is claimed, corroborated by simulation results, that [45, Theorem 3] is true under RZF-CDU precoding for all SNR. In order to correctly interpret the subsequent results, it is important to understand the differences between our approach and the approach in [45]. The scaling given in [45, Theorem 3] is a strict upper bound on the ergodic per-user rate gap for all SNR and all under a unit norm constraint on the precoding vectors. In contrast, our approach yields a necessary scaling of that maintains a given instantaneous target rate gap exactly as under a total power constraint. Therefore, our results are not upper bounds for small , i.e., we cannot guarantee that for small dimensions. But since for asymptotically large , the rate gap is maintained exactly and we apply an upper bound on the CSIT distortion under RVQ (67), it follows that our results become indeed upper bounds for large . Simulations reveal that under the derived scaling of , the per-user rate gap is very close to even for small dimension, e.g., . Concerning the ergodic and instantaneous per-user rate gap, the reader is reminded that our results hold also for ergodic per-user rates as a consequence of the dominated convergence theorem; see Remark 5.

    Consequently, a comparison of the results in [45] to our solutions is meaningful, especially for larger values of where our results become upper bounds.

    In the following section, we apply the deterministic equivalents of the per-user rate gap under RZF-CDA, RZF-CDU, and ZF precoding provided in Corollaries 9, 5, and 6, respectively, to derive scaling laws for the amount of feedback necessary to achieve full multiplexing gain.

    A. Channel Distortion Aware RZF Precoding

    Proposition 4: Let . Then the CSIT distortion , such that the rate gap of user between RZF-CDA precoding with perfect CSIT and imperfect CSIT satisfies

    almost surely, has to scale as

    (69)

    (70)

    where is defined in (60). With , the distortion has toscale as


    Proof: Set given in Corollary 9 equal to and solve for .

    Although the proposed scaling of in (69) converges to zero for asymptotically high SNR, we can approximate the term in the high SNR regime.

    Proposition 5: For asymptotically high SNR, the term defined in (70) converges to the following limits:

    ifif

    (71)

    Proof: For observe that scales as . Thus, for , (70) converges to . If , the term takes the form

    Therefore, for , (70) converges to , which completes the proof.

    Remark 7: Note that and thus, we require to ensure that the limit of the deterministic equivalent is well defined; see Remark 6. However, for finite SNR with the approximation in Proposition 5, we have and the scaling result holds true.

    To compare Proposition 4 to [45, Theorem 3], we use the upper bound on the quantization distortion (67), i.e., , where is the number of feedback bits per user under RZF-CDA precoding. Thus, (69) can be rewritten as

    (72)

    B. Channel Distortion Unaware RZF Precoding

    Although the RZF-CDU precoder is suboptimal under imperfect CSIT, the results are useful to compare to the work in [45].

    Proposition 6: Let . Then the CSIT distortion , such that the rate gap with of user between RZF-CDU precoding with perfect CSIT and imperfect CSIT satisfies

    almost surely, has to scale as

    where is defined in (32) and .

    Proof: Set from Corollary 5 equal to and solve for .

    An approximation of the term at high SNR is given in the following proposition.

    Proposition 7: For asymptotically high SNR, converges to the following limits:

    ifif (73)

    Proof of Proposition 7: For and large , scales as . Therefore, . If , for large , the term scales as . With this approximation, we obtain , which completes the proof.

    Applying the upper bound on the CSIT distortion under RVQ (67) with bits per user, we obtain

    (74)

    C. ZF Precoding

    The following results are only valid for and thus, they cannot be compared to [45, Theorem 3], which is derived under the assumption . However, for high SNR the results for the RZF-CDU precoder are a good approximation for the ZF precoder as well, even for .

    Corollary 10: Let and . To maintain a rate offset such that

    almost surely, the distortion has to scale according to

    (75)

    Proof: From Corollary 6, set and solve for .

    Proposition 8: For asymptotically high SNR, in (75) converges to

    (76)

    Proof: From (75), the result is immediate.

    Under RVQ with feedback bits per user, we have

    (77)

    D. Discussion and Numerical Results

    At this point, we can draw the following conclusions. The optimal scaling of the CSIT distortion is lower for compared to . For , the optimal scaling of the feedback bits , , and for ZF in [45, Theorem 3] are different, even at high SNR. In fact, for large , under RZF-CDU precoding and ZF precoding, the upper bound in [45, Theorem 3] appears to be too pessimistic in the scaling of the


    Fig. 10. RZF, ergodic sum rate versus SNR under RZF precoding and RVQ with feedback bits per user, where is chosen to maintain a sum rate offset of , , and .

    feedback bits. From (74) and (73), a more accurate choice may be

    (78)

    i.e., bits less than proposed in [45, Theorem 3]. However, recall that (78) becomes an upper bound for large and a rate gap of at least bits/s/Hz cannot be guaranteed for small values of . Moreover, for high SNR, and large , to maintain a rate offset of , the RZF-CDA precoder requires bits less than the RZF-CDU and ZF precoder and bits less than the scaling proposed in [45, Theorem 3].

    In contrast, for and high SNR, we have . Intuitively, the reason is that, for , the channel matrix is well conditioned and the RZF and ZF precoders perform similarly. Therefore, both schemes are equally sensitive to imperfect CSIT and thus the scaling of is the same for high SNR. Note that our model comprises a generic distortion of the CSIT. That is, the distortion can be a combination of different additional factors, e.g., channel estimation at the receivers, channel mismatch due to feedback delay or feedback errors (see [47]) as long as they can be modeled as additive noise (6). Moreover, we consider i.i.d. block-fading channels, which can be seen as a worst case scenario in terms of feedback overhead. It is possible to exploit channel correlation in time, frequency, and space to refine the CSIT or to reduce the amount of feedback.

    Figs. 10 and 11 depict the ergodic sum rate of RZF precoding under RVQ and the corresponding number of feedback bits per user , respectively. To avoid an infinitely high regularization parameter , the minimum number of feedback bits is set to 1.

    In Fig. 10, we plot the ergodic sum rate for RZF precoding under perfect CSIT with total power constraint (red solid lines)

    Fig. 11. RZF, feedback bits per user versus SNR, with to maintain a sumrate offset of , , and .

    and unit norm constraint on the precoding vectors (red dashed line). We observe that the sum rate under unit norm constraint is slightly larger at high SNR, suggesting that our scaling results for RZF precoding derived under a total power constraint become inaccurate under the unit norm constraint at high SNR. Hence, one has to be cautious when comparing the scaling in [45, Theorem 3] directly to the scaling derived with the large system approximations at high SNR. From Fig. 10, we further observe that 1) the desired sum rate offset of 10 bits/s/Hz is approximately maintained over the given SNR range when is chosen according to (72) and the high SNR approximation in (78) under RZF-CDA and RZF-CDU precoding, respectively; 2) given an equal number of feedback bits (72), the RZF-CDA precoder achieves a significantly higher sum rate compared to RZF-CDU for medium and high SNR, e.g., about 2.5 bits/s/Hz at 20 dB; and 3) to maintain a sum rate offset of bits/s/Hz, the proposed feedback scaling of for unit norm precoding vectors [45] is very pessimistic, since the sum rate offset to RZF with total power constraint and unit norm constraint is about 6 bits/s/Hz and 7 bits/s/Hz at 20 dB, respectively.

    We conclude that the proposed RZF-CDA precoder significantly increases the sum rate for a given feedback rate or equivalently significantly reduces the amount of feedback given a target rate. Moreover, the scaling of the number of feedback bits under RZF-CDU precoding proposed in [45, Theorem 3] appears to be less accurate under a total power constraint than our large system approximation in (72).

    VII. OPTIMAL TRAINING IN LARGE TDD MU SYSTEMS

    Consider a TDD system where uplink (UL) and downlink (DL) share the same channel at different times. Therefore, the transmitter estimates the channel from known pilot signaling of the receivers. The channel coherence interval , i.e., the number of channel uses for which the channel is approximately constant, is divided into channel uses for UL training and channel uses for coherent transmission in the DL. Note that in order to coherently decode the information symbols, the users need to know their effective (precoded) channels. This is usually


    accomplished by a dedicated training phase (using precoded pilots) in the DL prior to the data transmission. As shown in [48], a minimal amount of training (at most one pilot symbol) is sufficient when data and pilots are processed jointly. Therefore, we assume that the users have perfect knowledge of their effective channels and we neglect the overhead associated with the DL training.

    In the considered TDD system, the imperfections in the CSIT are caused by 1) channel estimation errors in the UL; 2) imperfect channel reciprocity due to different hardware in the transmitter and receiver; and 3) the channel coherence interval . In what follows, we assume that the channel is perfectly reciprocal and we study the joint impact of 1 and 3 for uncorrelated channels .

    A. Uplink Training PhaseIn our setup, the distortion of the CSIT is solely caused by

    an imperfect channel estimation at the transmitter and is iden-tical for all entries of . To acquire CSIT, each user transmitsthe same amount of orthogonal pilot symbols over theUL channel to the transmitter. Subsequently, the transmitter es-timates all channels simultaneously. At the transmitter, thesignal received from user is given by

    where we assumed perfect reciprocity of UL and DL channels and is the average available transmit power at the receivers. That is, the UL and DL channel coefficients are equal and the UL noise is assumed identical for all users and statistically equivalent to its DL analog. Subsequently, the transmitter performs an MMSE estimation of each channel coefficient . Due to the orthogonality property of the MMSE estimation [49], the estimates of and the corresponding estimation errors are uncorrelated and i.i.d. complex Gaussian distributed. Hence, we can write

    where and are independent with zero mean and variance and , respectively. The variance of the estimation error is given by [47]

    (79)

    where we defined the uplink SNR as .
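    The estimation error variance can be checked numerically. The sketch below is only an illustration under stated assumptions: a scalar channel coefficient with unit variance, unit-variance noise, and a pilot model in which each of the `num_pilots` observations equals `sqrt(uplink_snr)` times the coefficient plus noise. Under these assumptions the standard linear MMSE result gives an error variance of 1/(1 + num_pilots * uplink_snr); the function and parameter names are hypothetical and the symbols of (79) are not reproduced here.

```python
import math
import random


def mmse_error_variance(num_pilots, uplink_snr):
    """Closed-form MMSE error variance for the assumed scalar pilot model."""
    return 1.0 / (1.0 + num_pilots * uplink_snr)


def simulate_error_variance(num_pilots, uplink_snr, trials=20000, seed=0):
    """Monte Carlo estimate of E|h - h_hat|^2 for the assumed pilot model."""
    rng = random.Random(seed)

    def cn():
        # Unit-variance circularly symmetric complex Gaussian sample.
        s = math.sqrt(0.5)
        return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

    amp = math.sqrt(uplink_snr)
    # Linear MMSE gain applied to the sum of the pilot observations.
    gain = amp / (1.0 + num_pilots * uplink_snr)
    acc = 0.0
    for _ in range(trials):
        h = cn()
        pilot_sum = sum(amp * h + cn() for _ in range(num_pilots))
        h_hat = gain * pilot_sum
        acc += abs(h - h_hat) ** 2
    return acc / trials
```

    The closed form decreases monotonically in both the number of pilots and the uplink SNR, which is the trade-off that the training optimization in the next subsection exploits.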

    B. Optimization of Channel Training

    We focus on equal power allocation among the users, i.e., , because it is optimal for large and ; see Section V-B. Since channel uses have already been consumed to train the transmitter about the user channels, there remains an interval of length for DL data transmission and thus we have the prelog factor . The net sum rate approximation reads

    (80)

    To compute the training length that maximizes the net sum rate approximation (80), we substitute from Corollary 4 into (80) and the approximated net sum rate under ZF precoding takes the form

    (81)

    where . Similarly, for RZF-CDA precoding the approximated net sum rate reads

    (82)

    where is given in Corollary 8.

    Substituting (79) into (81) and (82), we obtain

    (83)

    (84)

    (85)

    For under ZF precoding and for RZF-CDA precoding, it is easy to verify that the functions and are strictly concave in and in the interval , respectively, where is the minimum amount of training required, due to the orthogonality constraint of the pilot sequences. Therefore, we can apply standard convex optimization algorithms [50] to evaluate

    (86)

    (87)
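    The trade-off behind (86) can be illustrated with a small search over the training length. The sketch below is not the expression (83): it assumes a per-user net rate of the form prelog times log2(1 + SINR), with an illustrative ZF SINR model (beta - 1)(1 - tau2)/(tau2 + 1/dl_snr) and an error variance tau2 = 1/(1 + train_len * ul_snr). All names, and the SINR model itself, are assumptions made for this example.

```python
import math


def zf_net_rate(train_len, coherence_len, beta, dl_snr, ul_snr):
    """Assumed per-user net rate: prelog (1 - T_t/T) times log2(1 + SINR)."""
    tau2 = 1.0 / (1.0 + train_len * ul_snr)  # assumed estimation error variance
    # Assumed ZF SINR model under imperfect CSIT (illustrative only).
    sinr = (beta - 1.0) * (1.0 - tau2) / (tau2 + 1.0 / dl_snr)
    return (1.0 - train_len / coherence_len) * math.log2(1.0 + sinr)


def best_training_length(coherence_len, num_users, beta, dl_snr, ul_snr):
    """Exhaustive search over integer training lengths in [num_users, coherence_len)."""
    candidates = range(num_users, coherence_len)
    return max(
        candidates,
        key=lambda t: zf_net_rate(t, coherence_len, beta, dl_snr, ul_snr),
    )
```

    An exhaustive search is used here only because it is easy to verify; for a strictly concave objective, any one-dimensional convex solver would serve equally well.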

    In the following, we derive approximate explicit solutions to (86) and (87) for high SNR. We distinguish two cases: 1) the UL and DL SNR vary with finite ratio and 2) varies, while remains finite. In contrast to case 1, the system in case 2 is interference-limited due to the finite transmit power of the users.

    Case 1: Finite Ratio : We derive approximate, but explicit, solutions for the optimal training intervals in the high SNR regime and derive their limiting values for asymptotically low SNR.

    High SNR Regime: An approximate closed-form solution to (86) and (87) is summarized in the following proposition.

    Proposition 9: Let , be large with constant. Then, an approximation of the sum rate maximizing


    amount of channel training and under ZF and RZF-CDA precoding is given by

    (88)

    if

    if

    (89)

    where and.

    Proof: The proof is presented in Appendix V.

    Thus, for a fixed DL SNR , the optimal training intervals scale as . Likewise, for a constant , the optimal training intervals scale as . Under ZF precoding the same scaling has been reported in [51]–[53]. From this scaling, it is clear that as , tends to , the minimum amount of training. Moreover, for , with equality if .

    Therefore, RZF-CDA requires less training than ZF, but the training interval of both schemes is equal for asymptotically high SNR. In the case of full system loading , RZF-CDA requires less training compared to the scenario where .

    Low SNR Regime: For asymptotically low SNR , with constant ratio the optimal amount of training is given in the subsequent proposition.

    Proposition 10: Let , with constant ratio and . Then, the sum rate maximizing amount of channel training and under ZF and RZF-CDA precoding converges to

    (90)

    Proof: Applying and , (83) and (84) take the form

    (91)

    (92)

    Maximizing (91) and (92) with respect to and , respectively, yields (90). Since, by definition, we assume orthogonal pilot sequences, hence , the result (90) implies that , which completes the proof.

    For ZF precoding, the limit has also been reported in [54].

    Case 2: With Finite : This scenario models a high-capacity DL channel where the primary sum rate loss stems from the inaccurate CSIT estimate due to limited-rate UL signaling caused, e.g., by a finite transmit power of the users. Thus, the system becomes interference-limited and the optimal amount of channel training under ZF precoding is given in the following proposition.

    Fig. 12. ZF and RZF-CDA, optimal amount of training with ,, , , RZF is indicated by circle marks.

    Proposition 11: Let and finite. Then the (approximated) sum rate maximizing amount of channel training is given by

    (93)

    where is the Lambert W-function.

    Proof: For ZF precoding and , the sum rate (83) can be approximated as

    (94)

    Setting the derivative of (94) with respect to to zero yields

    (95)

    where and. Equation (95) can be written as

    Notice that . Thus, solving for yields (93).

    For asymptotically low , we obtain , implying that . For RZF-CDA precoding, no accurate closed-form solution to (87) has yet been found.
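    Evaluating a solution of the form (93) requires the Lambert W-function, which is defined implicitly by W(x) exp(W(x)) = x. As a minimal sketch, the principal branch for nonnegative arguments can be computed with Newton's method using only the standard library. The function name `lambert_w0` is hypothetical, and the specific argument that enters (93) is not reproduced here.

```python
import math


def lambert_w0(x, tol=1e-14, max_iter=100):
    """Principal branch of the Lambert W-function for x >= 0 via Newton's method."""
    if x < 0:
        raise ValueError("this sketch only handles nonnegative arguments")
    # log(1 + x) lies at or to the right of the root, so Newton's method
    # decreases monotonically toward it for the convex function w * exp(w) - x.
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            return w
    raise RuntimeError("Newton iteration did not converge")
```

    A quick sanity check is that `lambert_w0(math.e)` should be close to 1, since 1 * exp(1) = e.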

    C. Numerical Results

    In Fig. 12, we compare the approximated optimal training intervals to computed via exhaustive search and averaged over 1000 independent channel realizations. The regularization parameter is computed using the large system approximation in (55). Fig. 12 shows that the approximate solutions become very accurate for . Moreover, it can be observed that the approximations in (88) and (89) match very well. Further, note that for , ZF and RZF-CDA need approximately the same amount of training, as predicted by (88) and (89).


    Fig. 13. ZF and RZF-CDA, optimal relative amount of training versus with , , , , RZF is indicated by circle marks.

    Fig. 14. ZF, ergodic sum rate versus downlink SNR with , ,, , and .

    Fig. 13 depicts the optimal relative amount of training for ZF and RZF-CDA precoding. We observe that decreases with increasing SNR as . That is, for increasing SNR, the estimation becomes more accurate and resources for channel training are reallocated to data transmission. Furthermore, saturates at due to the orthogonality constraint on the pilot sequences. As expected from (88) and (89), we observe that the optimal amount of training is less for RZF-CDA than for ZF precoding. Moreover, the relative amount of training for both ZF and RZF-CDA converges at low SNR to and at high SNR to the minimum amount of training , as predicted by the theoretical analysis.

    Fig. 14 shows the ergodic sum rate under ZF precoding with fixed UL SNR for various training intervals. We observe 1) no significant difference in the performance of the schemes employing either optimal training , computed via exhaustive search, or obtained from a convex optimization of the large system approximation (83); 2) a small performance loss at low and medium SNR of the (high-SNR) approximation of in (93); and 3) a significant performance loss if the minimum training interval is used for all SNR. We conclude that our approximation in (93) achieves very good performance and can therefore be utilized to compute very efficiently.

    VIII. CONCLUSION

    In this paper, we presented a consistent framework for the study of ZF and RZF precoding schemes based on the theory of large-dimensional random matrices. The tools from RMT allowed us to consider a very realistic channel model accounting for per-user channel correlation as well as individual channel gains for each link. The system performance under this general type of channel is extremely difficult to study for finite dimensions but becomes feasible by assuming large system dimensions. Simulation results indicated that these approximations are very accurate even for small system dimensions and reveal the deterministic dependence of the system performance on several important system parameters, such as the transmit correlation, signal powers, SNR, and CSIT quality. Applied to practical optimization problems, the deterministic approximations lead to important insights into the system behavior, which are consistent with previous results, but go further and extend them to more realistic channel models and other linear precoding techniques. Furthermore, the proposed channel-independent performance approximations can be used to simulate the system behavior without having to carry out extensive MC simulations.

    APPENDIX I
    PROOF OF THEOREM 1

    The proof is structured as follows: in Appendix I-A, we prove that almost surely, where is an auxiliary random variable involving the terms . Appendix I-B shows that the sequence defined by (12) converges to (11) as , if properly initialized. Finally, in Appendix I-C we demonstrate that satisfies , almost surely.

    A) Convergence to an Auxiliary Variable: The objective is to approximate the random variable by an appropriate functional such that

    (96)

    almost surely. Take . From (96), we proceed by applying Lemma 2 and obtain

    (97)

    We choose as

    (98)


    where is to be determined later, and obtain

    Consider the term . Taking the trace, together with , we have

    Denoting and applying Lemma 1, we obtain

    Therefore, the left-hand side of (96) takes the form

    (99)

    The choice of an appropriate value for , such that (96) is satisfied, requires some intuition. From Lemma 4, we know that

    , almost surely. Then, from Lemma 8, we surely have

    From the previous arguments, will be chosen as

    (100)

    Note that is random since it depends on . The remainder of this section proves (96) for the specific choice of in (100). Substituting (100) into (99), we obtain

    (101)

    (102)

    In order to prove that , almost surely, we divide the left-hand side of (102) into terms, i.e.,

    (103)

    It is then easier to show that each converges to zero, sufficiently fast, as , which will imply , almost surely. are chosen as

    where we defined

    where .

    In the course of the development of the proof, we require the existence of moments of order of in (103), i.e., , for some integer . First, we bound (103) as . The application of Hölder's inequality yields

    Furthermore, for some , we can uniformly bound and as

    (104)

    (105)

    Proposition 12: Let the following upper bounds be well defined and let the entries of have eighth-order moment of


    order . Then the th-order moments can be bounded as

    (106)

    where are constants depending only on .

    Proof: The proof is based on various common inequalities. Applying Lemma 9, can be upper-bounded as

    We further bound by applying Lemmas 10 and 12 with the fact that . Together with (104), we have

    Similarly, with Lemma 2, it can be shown that and thus

    The th-order moment of thus satisfies

    Applying the inequality yields

    If the moments and exist and are bounded, we can apply Lemma 3 and obtain (106). For the sake of brevity, we omit the derivations of the remaining moments , , since the techniques are similar to the previous procedure.

    From Proposition 12, we conclude that all are summable if , . Therefore, is summable for , and hence, the Borel–Cantelli lemma [40] implies that , almost surely. Note that with the same approach, the convergence region can be extended to .

    We now prove the existence and uniqueness of a solution to (11).

    B) Proof of Convergence of the Fixed-Point Equation: In this section, we consider the fixed point (11). We first prove that, properly initialized, the sequence converges to a limit as . Subsequently, we show that this limit satisfies , almost surely.

    Proposition 13: Let and be the sequence defined by (12). If is a Stieltjes transform, then all are Stieltjes transforms as well.

    Proof: Suppose (12) is initialized by , which is the Stieltjes transform of a function with a single mass in zero. We demonstrate that at all subsequent iterations the corresponding are Stieltjes transforms for all . For ease of notation, we omit the dependence on ; are given by

    (107)

    where . In (107), multiplying from the right by , we obtain

    (108)

    where . Denoting and

    , (108) takes the form

    (109)

    Since are uniformly bounded w.r.t. , we have . To show that are Stieltjes transforms of a nonnegative finite measure, the following three conditions must be verified [28, Proposition 2.2]: for 1) ; 2) ; and 3) . From (109), it is easy to verify that all three conditions are met, which completes the proof.

    We are now in a position to show that any sequence converges to a limit as .

    Proposition 14: Any sequence defined by (12) converges to a Stieltjes transform, denoted as if is a Stieltjes transform.

    Proof: Let and

    , where


    Applying Lemma 2, the difference is

    (110)

    With Lemmas 9, 11, and 12, (110) can be bounded as

    (111)

    where . Clearly, the sequence converges to a limit for restricted to the set . Proposition 13 shows that all are uniformly bounded Stieltjes transforms, and therefore, their limit is analytic. Since for is at least countable and has a cluster point, Vitali's convergence theorem [15, Theorem 3.11] ensures that the sequence must converge for all and their limit is .

    It is straightforward to verify that the previous result also holds true for .
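    The convergence of such a fixed-point iteration can be illustrated numerically. The sketch below does not reproduce (11) and (12), whose exact form is not legible in this copy. It assumes a simplified system with diagonal correlation matrices, in which the update for user i averages theta_i[n] / (rho + (1/N) * sum_j theta_j[n] / (1 + e_j)) over the N diagonal entries, and it is evaluated at a positive regularization rho. The function names, the update, and the closed form for identity correlation are assumptions of this example.

```python
import math


def solve_fixed_point(thetas, rho, init=None, tol=1e-12, max_iter=10000):
    """Iterate the assumed fixed-point map for diagonal correlation matrices.

    thetas[i][n] is the n-th diagonal entry of user i's correlation matrix.
    """
    num_users, dim = len(thetas), len(thetas[0])
    e = list(init) if init is not None else [1.0 / rho] * num_users
    for _ in range(max_iter):
        denom = [
            rho + sum(thetas[j][n] / (1.0 + e[j]) for j in range(num_users)) / dim
            for n in range(dim)
        ]
        e_new = [
            sum(thetas[i][n] / denom[n] for n in range(dim)) / dim
            for i in range(num_users)
        ]
        if max(abs(a - b) for a, b in zip(e, e_new)) < tol:
            return e_new
        e = e_new
    raise RuntimeError("fixed-point iteration did not converge")


def identity_closed_form(num_users, dim, rho):
    """Positive root of rho*e^2 + (c + rho - 1)*e - 1 = 0 with c = num_users/dim."""
    c = num_users / dim
    b = c + rho - 1.0
    return (-b + math.sqrt(b * b + 4.0 * rho)) / (2.0 * rho)
```

    For identity correlation matrices all users share one scalar equation, so the iteration can be compared against the closed-form root; starting from different nonnegative initial points should lead to the same solution, in line with the uniqueness argument.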

    Remark 8: For , the existence of a unique solution to (11) as well as the convergence of (12) from any real initial point can be proved within the framework of standard interference functions [55]. The strategy is as follows. Let

    and, where

    [55, Th. 1 and 2] prove that if is a feasible standard interference function, then (12) converges to a unique solution with all nonnegative entries for any initial point . The proof that is feasible as well as a standard interference function is straightforward and details are omitted in this correspondence.

    The uniqueness of , whose entries are Stieltjes transforms of nonnegative finite measures, ensures the functional uniqueness of as a Stieltjes transform solution to (11) for . This completes the proof of uniqueness.

    Denote . In the following section, we prove that satisfies , almost surely.

    C) Proof of Convergence of the Deterministic Equivalent: In Appendix I-A, we showed that

    , almost surely. Furthermore, in Appendix I-B, we proved that the sequence defined by (11) converges to a limit . It remains to prove that

    (112)

    almost surely. Denote with defined in (101). Applying Lemma 2, (112) can be written as

    where ,

    and , . Applying Lemmas 9 and 11, can be bounded as

    (113)

    Similar to (111), with Lemma 12, (113) can be further bounded as

    where . Taking the supremum over all , we obtain

    (114)

    From (114), on the set , it suffices to show that goes to zero sufficiently fast. For any , we have

    (115)

    Applying Markov's inequality, (115) can be further bounded as

    For all and with , the term is summable and we can apply the Borel–Cantelli lemma, which implies , almost surely.

    On , are summable and have a cluster point. Furthermore, Proposition 13 assures that are Stieltjes transforms and hence uniformly bounded on every closed set in . Therefore, Vitali's convergence theorem [15, Theorem 3.11] applies, and extends the convergence region of (112) to .

    Since (112) holds true, the following convergence holds almost surely:

    (116)


    The convergence in (116) implies the convergence in (9), which completes the proof.

    APPENDIX II
    PROOF OF THEOREM 2

    The strategy is as follows: the SINR in (16) consists of three terms: 1) the scaled signal power ; 2) the scaled interference power (both scaled by ); and 3) the term of the power normalization. For each of these three terms, we will subsequently derive a deterministic equivalent which together constitute the final expression for .

    A) Deterministic Equivalent for : The term can be written as

    (117)

    (118)

    where with and in we applied Lemma 1 twice together with (6). For large and under Assumption 1, we apply Lemma 4 and obtain

    almost surely, where in we applied Lemma 6, the definition (8) and denoted the derivative of along at . Applying Theorem 1 to , we obtain

    almost surely, where is defined in (21) and is given by

    (119)

    Define with . The system of equations formed by the takes the form and the explicit solution is given in (24). Substituting and by their respective deterministic equivalents and , we obtain in (22) such that , almost surely.

    B) Deterministic Equivalent for : Similar to the derivations in (117) and (118), we have

    Since and are independent, we apply Lemma 5 together with Lemmas 4 and 6 and obtain

    almost surely.

    C) Deterministic Equivalent of : With (5) and , , we have

    (120)

    (121)

    Substituting with,

    where , , and into (121), we obtain a sum of five terms

    (122)

    where we denoted and . Noting that and , we apply Lemma 7 to each of the four quadratic forms in (122). Under Assumption 1, we obtain

    almost surely, where . Moreover, under Assumptions 1 and 3 and uniformly on , we have

    almost surely, where . Substituting the random terms in (122) by their respective deterministic equivalents yields

    (123)


    almost surely. The second term in brackets of (123) reduces to and we obtain

    (124)

    almost surely. From Lemma 6, we have

    almost surely, where and. Therefore, (124) becomes

    almost surely. We rewrite as

    Applying Lemmas 1, 4, and 6, we obtain almost surely

    A deterministic equivalent of such that , almost surely, is given in (20). To derive a deterministic equivalent for , we can assume the invertible because the result is also a deterministic equivalent for noninvertible matrices , which is proved in [39, Theorem 4]. Define ; we have

    Denote .Applying Theorem 1, we obtain

    , almost surely, where is given by

    (125)

    where . By differentiating along , we have

    (126)

    almost surely, where is given by

    Setting , we have with defined in (21) and are the unique positive solutions of . Define and and as

    (127)

    (128)

    Therefore, is given explicitly as

    (129)

    Note that is always invertible since is a unique positive solution. Finally, substituting and by their respective deterministic equivalents and , we obtain in (23) such that , almost surely.

    If all available transmit power is allocated to a single user (i.e., ), both and are of order and hence grows unbounded with . Therefore, we require Assumption 2 to ensure that the convergence in (18) holds true, which completes the proof.

    APPENDIX III
    PROOF OF THEOREM 3

    We bound by adding and subtracting and and applying the triangle inequality. We obtain

    (130)

    To show that almost surely as , take arbitrarily small. For small enough, we will demonstrate that almost surely and independently of and . Furthermore, we show that for large enough, almost surely, from which we conclude that (130) can be made as small as desired.

    In order to prove that for small enough, it suffices to study the matrices and in the SINR of RZF precoding (16) and ZF precoding (33). Applying the matrix inversion lemma, takes the form

    Under Assumption 4, and, since is almost surely bounded for all large , , for any continuous functional we have

    with probability 1. Therefore, uniformly on , almost surely.

    From Theorem 2, we have immediately that for any , almost surely.


    In order to prove for small enough, uniformly on , rewrite as

    (131)

    To show that , we need to verify that the limit of both numerator and denominator in (131) exists and that the denominator is uniformly bounded away from zero. Define . Under Assumption 5, all exist and are strictly positive. Since is holomorphic for , and is bounded away from zero in a neighborhood of zero, by continuity extension in , we obtain the limit as

    (132)

    where is given in (37). It is easy to verify that is uniformly bounded on . We have

    (133)

    Define , , and . Under Assumption 5, there exists a fixed point , where with . In this case, we can extend the results in [55]² and show that the iterative fixed-point algorithm defined by converges to the unique positive solution for any initial point , .

    Furthermore, we need to show that both and exist and are uniformly bounded on . Observe that

    (134)

    and we obtain

    (135)

    Therefore, for all . Similarly, define given in (38) and thus

    satisfying for all . To fulfill the constraints , we have to evoke Assumption 4. The limit is given by (34), which completes the proof.

    APPENDIX IV
    PROOF OF PROPOSITION 2

    ² Since can be extended by continuity in zero, where it satisfies , the positivity property of , defined in [55], does not hold. We precisely need to show that cannot converge to the fixed point 0, which unfolds from Assumption 5 with similar arguments as in [55].

    The proof is inspired by [26] with adaptations to account for imperfect CSIT. From Corollary 1 with and , for large , , the SINR takes the form

    where

    with and defined in (27) and (29), respectively. Taking the derivative along , we obtain

    (136)

    where

    (137)

    and thus, together with (30), we have

    Therefore, (136) becomes

    (138)

    Denoting , and , (138) takes the form

    where . Denoting

    (139)


    we obtain

    (140)

    Rewriting the term in brackets in (140), we have

    Since for and , the optimal regularization parameter is given by (53). Substituting (137) into (139), the term takes the form

    (141)

    With (30) and (137), we obtain and . Substituting these terms into (141) yields (54), which completes the proof.

    APPENDIX V
    PROOF OF PROPOSITION 9

    The sum rate can be written as a function of the per-user rate under perfect CSIT and the per-user rate gap as

    where for ZF and RZF-CDA we have and , respectively, and

    where is defined in (85). Denoting , the derivatives take the form

    (142)

    (143)

    where and . In (142) and (143), the per-user rate-gap and can be neglected, since at high SNR and , respectively. Treating as constant, for and finite, solving (142) and (143) for and , respectively, yields (88) and (89), respectively, which completes the proof.

    APPENDIX VI
    IMPORTANT LEMMAS

    Lemma 1 (Matrix Inversion Lemma): [35, Lemma 2.2]: Let be an invertible matrix and , for which is invertible. Then

    Lemma 2 (Resolvent Identity): Let and be two invertible complex matrices of size . Then

    Lemma 3: [56, Lemma B.26]: Let be a deterministic matrix and have i.i.d. complex entries of zero mean, variance , and bounded th-order moment . Then for any

    (144)

    where is a constant solely depending on .

    Lemma 4: [15, Lemma 14.2]: Let , with be a series of random matrices generated by the probability space such that, for , with , , uniformly on . Let , with , be random vectors of i.i.d. entries with zero mean, variance , and eighth-order moment of order , independent of . Then

    almost surely.

    Proof: The proof unfolds from a direct application of the Tonelli theorem [40, Theorem 18.3]. Denoting the probability space that generates the series , we have that for every (i.e., for every realization , the trace lemma [15, Theorem 3.4] holds true. From [40, Theorem 18.3], the space of couples for which the trace lemma holds satisfies

    If , then on a subset of of probability 1. Therefore, the inner integral equals 1 whenever . As for the outer integral, since , it also equals 1, and the result is proved.
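    The concentration behind the trace lemma can be observed in a small experiment. The sketch below assumes the lemma in its usual form: for a vector x with i.i.d. zero-mean entries of variance 1/N that is independent of a matrix A with bounded spectral norm, the quadratic form x^H A x approaches (1/N) tr A as N grows. For simplicity A is taken to be a deterministic diagonal matrix and x complex Gaussian; the function name and these modeling choices are assumptions of this example, and the displayed statement of Lemma 4 is not reproduced.

```python
import math
import random


def quadratic_form_gap(dim, seed=0):
    """Return |x^H A x - (1/N) tr A| for a bounded diagonal A and Gaussian x."""
    rng = random.Random(seed)
    # Deterministic diagonal entries in [0, 2], so the spectral norm is bounded.
    diag = [1.0 + math.cos(n) for n in range(dim)]
    s = math.sqrt(0.5 / dim)  # each complex entry of x has variance 1/dim
    quad = 0.0
    for a in diag:
        x = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        quad += a * abs(x) ** 2
    trace_term = sum(diag) / dim
    return abs(quad - trace_term)
```

    Running `quadratic_form_gap` for increasing `dim` typically shows the gap shrinking roughly like the inverse square root of the dimension, which is consistent with the almost sure convergence stated in the lemma.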

    Lemma 5: Let be as in Lemma 4 and be random, mutually independent with standard i.i.d. entries of zero mean, variance , and eighth-order moment of order , independent of :

    almost surely.


    Proof: Remark that for some constant independent of . The result then unfolds from the Markov inequality, the Borel–Cantelli lemma [40], and the Tonelli theorem [40, Theorem 18.3].

    Lemma 6: [15, Lemma 14.3]: Let , with, be