20
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1083 Capacity and Optimal Resource Allocation for Fading Broadcast Channels—Part I: Ergodic Capacity Lifang Li, Member, IEEE, and Andrea J. Goldsmith, Senior Member, IEEE Abstract—In multiuser wireless systems, dynamic resource al- location between users and over time significantly improves effi- ciency and performance. In this two-part paper, we study three types of capacity regions for fading broadcast channels and obtain their corresponding optimal resource allocation strategies: the er- godic (Shannon) capacity region, the zero-outage capacity region, and the outage capacity region with nonzero outage. In Part I, we derive the ergodic capacity region of an -user fading broadcast channel for code division (CD), time division (TD), and frequency division (FD), assuming that both the transmitter and the receivers have perfect channel side information (CSI). It is shown that by allowing dynamic resource allocation, TD, FD, and CD without successive decoding have the same ergodic capacity region, while optimal CD has a larger region. Optimal resource allocation poli- cies are obtained for these different spectrum-sharing techniques. A simple suboptimal policy is also proposed for TD and CD without successive decoding that results in a rate region quite close to the er- godic capacity region. Numerical results are provided for different fading broadcast channels. In Part II, we obtain analogous results for the zero-outage capacity region and the outage capacity region. Index Terms—Broadcast channels, capacity region, fading chan- nels, optimal resource allocation. I. INTRODUCTION T HE wireless communication channel for both point-to- point and broadcast communications varies with time due to user mobility, which induces time-varying path loss, shadowing, and multipath fading in the received signal power [1]. For these time-varying channels, dynamic allocation of resources such as power, rate, and bandwidth can result in better performance than fixed resource allocation strategies [2]–[5]. Indeed, adaptive techniques are currently used in both wireless and wireline systems and are being proposed as standards for next-generation cellular systems. By using an optimal dynamic power and rate allocation strategy, the ergodic (Shannon) capacity of a single-user fading Manuscript received February 11, 1999; revised April 29, 2000. This work was supported by NSF Career Award NCR-9501452 and under a grant from Pacific Bell, while L. Li pursued the Ph.D. degree under the supervision of A. J. Goldsmith at the California Institute of Technology, Pasadena, CA 91125 USA. The material in this paper was presented in part at the 36th Allerton Conference on Communication, Control, and Computing, Monticello, IL, September 1998. L. Li is with the Exeter Group, Inc., Los Angeles, CA 90013 USA (e-mail: [email protected]). A. J. Goldsmith is with the Department of Electrical Engineering, Stanford University, Stanford, CA 94305-9515 USA (e-mail: [email protected]. edu). Communicated by M. L. Honig, Associate Editor for Communications. Publisher Item Identifier S 0018-9448(01)01353-0. channel with channel side information (CSI) at both the trans- mitter and the receiver is obtained in [6]. The corresponding optimal power allocation strategy is a water-filling procedure over time or, equivalently, over the fading states. The ergodic capacity corresponds to the maximum long-term achievable rate averaged over all states of the time-varying channel. For a fading multiple-access channel (MAC), assuming perfect CSI at the receiver and at all transmitters, the optimal power control policy that maximizes the total ergodic rates of all users is derived in [7]. The ergodic capacity region of this channel and the corresponding optimal power and rate allocation are obtained in [8] using the polymatroidal structure of the region. 1 This ergodic capacity region is a multiuser generalization of the two-user capacity region derived in [9] for the Gaussian MAC with intersymbol interference (ISI), and the corresponding op- timal power allocation is a multiuser version of the single-user water-filling procedure. In [10], the zero-outage capacity region 2 and the optimal power allocation for the fading MAC are derived under the as- sumption that CSI is available at both the transmitters and the receiver. This capacity definition, in contrast with the ergodic capacity region, is important for delay-constrained applications such as voice and video, since it represents the maximum instan- taneous data rate that can be maintained in all fading conditions through optimal power control. Under this adaptation policy, the end-to-end delay is independent of the channel variation. By al- lowing some nonzero transmission outage under severe fading conditions, the minimum outage probability for a given rate and the corresponding optimal power allocation policy are de- rived for the single-user fading channel in [11] and the capacity with nonzero outage is implicitly obtained. The corresponding outage capacity region for the fading MAC with nonzero outage is derived in [12]. In Part I of this paper, we first derive the ergodic capacity of an -user flat-fading broadcast channel with transmitter and receiver CSI 3 and obtain the corresponding optimal resource al- location strategy for code division (CD) with and without suc- cessive decoding, time division (TD), and frequency division (FD). The optimal power allocation that achieves the boundary of the ergodic capacity region is derived by solving an optimiza- tion problem over a set of time-invariant additive white Gaussian noise (AWGN) broadcast channels with a total average transmit 1 The ergodic (Shannon) capacity of a fading channel is called “throughput capacity” in [8]. 2 The zero-outage capacity is called “delay-limited capacity” in [10]. 3 In practice, the CSI can be obtained either by estimating it at the receiver and sending it to the transmitter via a feedback path or through channel estimation of the opposite link in a time-division duplex system. 0018–9448/01$10.00 © 2001 IEEE

Capacity and optimal resource allocation for fading ...cs- · is derived in [7]. The ergodic capacity region of this channel and the corresponding optimal power and rate allocation

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

  • IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001 1083

    Capacity and Optimal Resource Allocation for FadingBroadcast Channels—Part I: Ergodic Capacity

    Lifang Li, Member, IEEE,and Andrea J. Goldsmith, Senior Member, IEEE

    Abstract—In multiuser wireless systems, dynamic resource al-location between users and over time significantly improves effi-ciency and performance. In this two-part paper, we study threetypes of capacity regions for fading broadcast channels and obtaintheir corresponding optimal resource allocation strategies: the er-godic (Shannon) capacity region, the zero-outage capacity region,and the outage capacity region with nonzero outage. In Part I, wederive the ergodic capacity region of an -user fading broadcastchannel for code division (CD), time division (TD), and frequencydivision (FD), assuming that both the transmitter and the receivershave perfect channel side information (CSI). It is shown that byallowing dynamic resource allocation, TD, FD, and CD withoutsuccessive decoding have the same ergodic capacity region, whileoptimal CD has a larger region. Optimal resource allocation poli-cies are obtained for these different spectrum-sharing techniques.A simple suboptimal policy is also proposed for TD and CD withoutsuccessive decoding that results in a rate region quite close to the er-godic capacity region. Numerical results are provided for differentfading broadcast channels. In Part II, we obtain analogous resultsfor the zero-outage capacity region and the outage capacity region.

    Index Terms—Broadcast channels, capacity region, fading chan-nels, optimal resource allocation.

    I. INTRODUCTION

    T HE wireless communication channel for both point-to-point and broadcast communications varies with timedue to user mobility, which induces time-varying path loss,shadowing, and multipath fading in the received signal power[1]. For these time-varying channels, dynamic allocation ofresources such as power, rate, and bandwidth can result in betterperformance than fixed resource allocation strategies [2]–[5].Indeed, adaptive techniques are currently used in both wirelessand wireline systems and are being proposed as standards fornext-generation cellular systems.

    By using an optimal dynamic power and rate allocationstrategy, the ergodic (Shannon) capacity of a single-user fading

    Manuscript received February 11, 1999; revised April 29, 2000. This workwas supported by NSF Career Award NCR-9501452 and under a grant fromPacific Bell, while L. Li pursued the Ph.D. degree under the supervision of A. J.Goldsmith at the California Institute of Technology, Pasadena, CA 91125 USA.The material in this paper was presented in part at the 36th Allerton Conferenceon Communication, Control, and Computing, Monticello, IL, September 1998.

    L. Li is with the Exeter Group, Inc., Los Angeles, CA 90013 USA (e-mail:[email protected]).

    A. J. Goldsmith is with the Department of Electrical Engineering, StanfordUniversity, Stanford, CA 94305-9515 USA (e-mail: [email protected]).

    Communicated by M. L. Honig, Associate Editor for Communications.Publisher Item Identifier S 0018-9448(01)01353-0.

    channel with channel side information (CSI) at both the trans-mitter and the receiver is obtained in [6]. The correspondingoptimal power allocation strategy is a water-filling procedureover time or, equivalently, over the fading states. The ergodiccapacity corresponds to the maximum long-term achievablerate averaged over all states of the time-varying channel. Fora fading multiple-access channel (MAC), assuming perfectCSI at the receiver and at all transmitters, the optimal powercontrol policy that maximizes the total ergodic rates of all usersis derived in [7]. The ergodic capacity region of this channeland the corresponding optimal power and rate allocation areobtained in [8] using the polymatroidal structure of the region.1

    This ergodic capacity region is a multiuser generalization of thetwo-user capacity region derived in [9] for the Gaussian MACwith intersymbol interference (ISI), and the corresponding op-timal power allocation is a multiuser version of the single-userwater-filling procedure.

    In [10], the zero-outage capacity region2 and the optimalpower allocation for the fading MAC are derived under the as-sumption that CSI is available at both the transmitters and thereceiver. This capacity definition, in contrast with the ergodiccapacity region, is important for delay-constrained applicationssuch as voice and video, since it represents the maximum instan-taneous data rate that can be maintained in all fading conditionsthrough optimal power control. Under this adaptation policy, theend-to-end delay is independent of the channel variation. By al-lowing some nonzero transmission outage under severe fadingconditions, the minimum outage probability for a given rateand the corresponding optimal power allocation policy are de-rived for the single-user fading channel in [11] and the capacitywith nonzero outage is implicitly obtained. The correspondingoutage capacity region for the fading MAC with nonzero outageis derived in [12].

    In Part I of this paper, we first derive the ergodic capacity ofan -user flat-fading broadcast channel with transmitter andreceiver CSI3 and obtain the corresponding optimal resource al-location strategy for code division (CD) with and without suc-cessive decoding, time division (TD), and frequency division(FD). The optimal power allocation that achieves the boundaryof the ergodic capacity region is derived by solving an optimiza-tion problem over a set of time-invariant additive white Gaussiannoise (AWGN) broadcast channels with a total average transmit

    1The ergodic (Shannon) capacity of a fading channel is called “throughputcapacity” in [8].

    2The zero-outage capacity is called “delay-limited capacity” in [10].3In practice, the CSI can be obtained either by estimating it at the receiver and

    sending it to the transmitter via a feedback path or through channel estimationof the opposite link in a time-division duplex system.

    0018–9448/01$10.00 © 2001 IEEE

  • 1084 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

    power constraint. For CD with successive decoding, the optimalpower allocation is similar to that of the parallel Gaussian broad-cast channels discussed in [13], [14], and the solution uses theresults therein. We solve the optimization problem for TD andshow that one of the nonunique optimal power allocation strate-gies is also optimal for CD without successive decoding, andthe ergodic capacity regions for these two techniques are thesame. TD and FD are equivalent in the sense that they have thesame ergodic capacity region and the optimal power allocationfor one of them can be directly obtained from that of the other[15]. Thus, we obtain the optimal resource allocation for FD aswell. For TD and CD without successive decoding we also pro-pose a simple suboptimal power allocation strategy that resultsin an ergodic rate region close to their capacity region. These re-sults are then extended to frequency-selective fading broadcastchannels that vary randomly.

    In Part II of this paper, we first obtain the zero-outage ca-pacity regions and the associated optimal resource allocationstrategies for an -user flat-fading broadcast channel with TD,FD, and CD with and without successive decoding. We thendetermine the outage probability region for a given rate vectorof the users and derive the optimal power allocation policythat achieves the boundaries of the outage probability regionsfor these different spectrum-sharing techniques. The outage ca-pacity regions are thus obtained implicitly from the outage prob-ability regions for given rate vectors. These results are also ex-tended to frequency-selective fading broadcast channels.

    Part I of this paper is organized as follows. In Section II, wepresent the flat-fading broadcast channel model. The ergodic ca-pacity regions and the optimal resource allocation for CD withand without successive decoding, TD, and FD, as well as thesuboptimal power and time allocations for TD are obtained inSection III. In Section IV, we extend our flat-fading model tothe case of frequency-selective fading. Section V shows variousnumerical results, followed by our conclusions in the last sec-tion.

    Notation: The prime is used to denote the derivative ofa function throughout this paper except in the proofs aboutthe convexity of a capacity region in the Appendix, SectionB, where the prime or double prime of a symbol just denotesanother symbol.

    II. THE FADING BROADCAST CHANNEL

    We consider a discrete-time -user broadcast channel withflat fading as shown in Fig. 1. In this model, the signal source

    is composed of independent information sources, andthe broadcast channel consists of independent flat-fadingsubchannels. The time-varying subchannel gains are denoted as

    and the Gaussian noises of these subchannels are denoted as. Let be the total average transmit

    power, the received signal bandwidth, and the noisedensity of , . Since the time-varyingreceived signal-to-noise ratio (SNR)

    Fig. 1. AnM -user fading broadcast channel model.

    Fig. 2. An equivalentM -user fading broadcast channel model.

    if we define4 , we have .Therefore, for slowly time-varying broadcast channels,

    we obtain an equivalent channel model, which is shown inFig. 2. In this model, the noise density of is ,

    . We assume that are known to thetransmitter and all the receivers at time. Thus, the trans-mitter can vary the transmit power for each user relativeto the noise density vector ,subject only to the average power constraint. For TD orFD, it can also vary the fraction of transmission time or band-width assigned to each user, subject to the constraint

    for all . For CD, the superposition code canbe varied at each transmission. Since every receiver knowsthe noise density vector , they can decode their individualsignals by successive decoding based on the known resourceallocation strategy given the noise densities. In practice, itis necessary to send the transmitter strategy to each receiverthrough either a header on the transmitted data or a pilot tone.We call the joint fading process and denote as theset of all possible joint fading states. denotes a givencumulative distribution function (cdf) on .

    III. ERGODIC CAPACITY REGIONS

    Under the assumption that both the transmitters and the re-ceiver have perfect CSI, the ergodic capacity region of a fading

    4Note that wheng [i] = 0 for somej, n [i] = 1 and no information canbe transmitted through thejth subchannel.

  • LI AND GOLDSMITH: CAPACITY AND OPTIMAL RESOURCE ALLOCATION FOR FADING BROADCAST CHANNELS—PART I 1085

    MAC is derived in [8] by exploiting its special polymatroidalstructure. In that work, the optimal resource allocation schemeis obtained by solving a family of optimization problems over aset of parallel Gaussian MACs, one for each fading state. In thissection, we derive the ergodic capacity region for the flat-fadingbroadcast channel under the assumption that the transmitter andall receivers have perfect CSI. The corresponding optimal re-source allocation strategy is obtained by optimizing over a set ofparallel Gaussian broadcast channels for CD with and withoutsuccessive decoding and for TD. For FD it is shown that theergodic capacity region is the same as for TD and the corre-sponding optimal power and bandwidth allocation policy canbe derived directly from that of TD [15]. We will discuss exten-sions of the results obtained in this section to the case of fre-quency-selective fading channels in Section IV.

    A. CD

    We first consider superposition coding and successive de-coding where, in each joint fading state, the-user broadcastchannel can be viewed as a degraded Gaussian broadcastchannel with noise densities and themultiresolution signal constellation is optimized relative tothese instantaneous noise densities. Given a power allocationpolicy , let be the transmit power allocated to Userfor the joint fading state and denoteas the set of all possible power policies satisfying the averagepower constraint where denotesthe expectation function. For simplicity, we assume that thestationary distributions of the fading processes have continuousdensities,5 i.e., , .

    Theorem 1: The ergodic capacity region for the fadingbroadcast channel when the transmitter and all the receiversknow the current channel state is given by

    (1)

    where

    (2)

    , and denotes the indicator func-tion ( if is true and zero otherwise). Moreover, thecapacity region is convex.

    Proof: See the Appendix, Section A.

    Since this capacity region is convex

    5If Prfn = n g 6= 0 for somei; j then, in statennn, Useri and Userj canbe viewed as a single user and superposition coding and successive decodingare applied toM � 1 users. The information for Useri and Userj are thentransmitted by time-sharing the channel.

    with , if a rate vector is a solution to the fol-lowing maximization problem, it will be on the boundary sur-face of in (1)

    subject to (3)

    The maximization problem in (3) is equivalent to

    subject to:(4)

    where and the objectivefunction

    (5)

    In (5), is the Lagrangian multiplier and can be viewed as aweighting parameter (rate reward) proportional to the priority ofUser . The problem in (4) is quite similarto that of the parallel AWGN broadcast channels discussed in[13], [14] and its solution is obtained by applying the resultstherein. For each fading state , let thepermutation be defined such that

    . The optimal power allocation procedure for each stateas derived in [13] is essentially water filling. We now describethis optimal power allocation procedure in the following.

    Initialization: Do not assign power to any user forwhich , with . Remove these usersfrom further consideration.

    Step 1: Denote the number of remaining users asand de-fine the permutation such that

    . Then, due to the removal criterion, wehave

    i.e.,

    Step 2: Define

    where and assign powerto User . If

    the total power for state has been allocated. If not,only the power for User has been allocated. Inthis case, increase the noisesby and do not assign power to any user

    for which such that with

  • 1086 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

    . Remove these users from furtherconsideration. Also remove User and return toStep 1.

    In the above procedure, is the water-filling powerlevel that satisfies the total average power constraint of theusers in (4) instead of the total power constraint for a parallelAWGN broadcast channel as in [13], [14]. In each iteration, thiswater-filling procedure consists of selecting the best receiver ac-cording to a modified noise criterion using the weighting param-eter for each user , and adding power to the correspondingsubchannel until a predetermined power is achieved. Note thatin each iteration, some subchannels will be identified to hold nopower.

    This optimal power allocation procedure can be obtained in adifferent form through a greedy algorithm [14]. Specifically, ineach fading state, define the utility function for Useras

    and let

    Obviously, and are all decreasingfunctions. Moreover, for any , the two curves and

    will cross each other at most once. Now let andlet denote the point where intersects

    for some and in the set , assumingthat for . That is,denotes the set of all intersection points for the utility functions

    , and . Thencan be expressed as6

    if

    if .(6)

    Then the power allocation for User in statethat maximizes (4) is

    if for some

    otherwise.

    This power allocation strategy is shown to be equivalent to thewater-filling procedure [16] and it has a greedy interpretation inthe following sense [14]: can be interpreted as the mar-ginal weighted rate increase in the objective functionin (5) when a marginal power is allocated to User at in-terference level . The optimal solution to (4) can be obtainedgreedily by allocating marginal power to the user with thelargest positive marginal weighted rate increase

    at each interference level . This power allocationprocess continues until no user obtains a positive marginal

    6From (6) we see thatz is defined as the smallestz for whichu (z) = 0;if no such point exists, thenz = 1.

    weighted rate increase with the addition of power [i.e.,], in which case we allocate

    no more power to state.In the special case where each user has the same rate re-

    ward , it is easily seen that the optimal power allocation policyin a fading state is to assign power to User

    and assign no power to any other user. Wewill see in the following subsection that this is actually the sameas the optimal TD policy when all users have the same rate re-ward .

    B. TD

    Now we consider the TD case where, in each fading state, the information for the users will be divided and sent in

    time slots which are functions of. For a given power and timeallocation policy , let and bethe transmit power and fraction of transmission time allocated toUser , respectively, for fading state, andlet be the set of all such possible power and time allocationpolicies satisfying

    and

    (7)

    Theorem 2: The achievable rate region for the variable powerand variable transmission time scheme is

    (8)

    where

    (9)

    Moreover, the rate region is convex.Proof: See the Appendix, Section B.

    Note that in this paper we will refer to this achievable rateregion as the capacity region for TD, though we do not have aconverse proof due to the fact that the converse only holds for theoptimal transmission strategy for this channel, which, accordingto Theorem 1, is CD with successive decoding.

    Due to the convexity of this capacity region, with, if a rate vector is a solution to

    subject to (10)

    it will be on the boundary surface of . Based on the ex-pression for in (9) and the average total power constraint in(7), we can decompose the maximization problem (10) into thefollowing two problems.

    1) Assuming that , is the total power assignedto the users, i.e., , we mustdetermine how to distribute among the users so

  • LI AND GOLDSMITH: CAPACITY AND OPTIMAL RESOURCE ALLOCATION FOR FADING BROADCAST CHANNELS—PART I 1087

    that the total weighted rate in stateis maximized. Thatis, we must find

    subject to

    (11)

    where with

    and

    (12)

    2) After we obtain the expression for by solving (11),the remaining problem is how to assign the total power

    of the users for each state so that the totalweighted rate averaged over all fading states as expressedin (10) is maximized. That is,

    subject to(13)

    where is the Lagrangian multiplier.

    We solve the maximization problems (11) and (13) for thetwo-user case first and then generalize the results to the-usercase.

    1) Two-User Case:Lemma 1: When , assuming that , the solu-

    tion to (11) is

    1) if , then , which isachieved when , , , and

    ;2) if , let

    (14)

    Then

    if

    if

    if

    (15)

    where

    and satisfies

    (16)

    in (15) is achieved by letting

    if (17)

    if (18)

    if (19)

    Proof: See the Appendix, Section C.

    From (11) we see that when , is a linearcombination of . The proof of Lemma 1 in theAppendix, Section C shows that if , then ,

    and is simply ; if, then for , and will

    cross each other once at some positive value , as shown inFig. 3.7 In this figure, the slope of the tangent line between thecurves and is and it satisfies

    i.e.,

    Thus, in this case, is the continuous contour in Fig. 3which consists of part of the curve , the tangent line,and part of the curve , as indicated with the dash-dotted line which is offset slightly for clarity. The expressionof is therefore as given in (15). Note that the slope ofthe tangent of the curve is continuous and it decreaseswith the increase of .

    For a given fading state, from (13) we know that the optimalpower satisfies

    (20)

    Therefore, for any given , is determined by thepoint(s) on the curve whose tangent has a slope. Inthe case where and , since all the pointson the tangent line between and in Fig. 3have the same tangent slope, can be any value between

    and : if , from Lemma 1we know that will be time-shared by the two users; if

    is simply chosen as or , then it is only as-signed to User 1 or to User 2, respectively. In all other cases, thepoint that has a tangent with slopeis unique and hence so is

    . The unique choice of is then allocated to a singleuser based on its relative value compared to and

    7In this figure,P ,P ,P andP are all functions ofnnn. Their explicit depen-dence onnnn is not shown to simplify the notation.

  • 1088 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

    Fig. 3. The functionsf (P ), f (P ), andJ(P ) when > .

    as discussed in Lemma 1. Consequently, we have the followingtheorem.

    Theorem 3: When , assuming that , theoptimal power and time allocation policy that achieves the TDcapacity region boundary for each fading state is

    1) if , then

    and

    2) if , then

    a) if or if and

    and

    b) if and

    and

    c) if and

    and

    In the above expressions, is given in (14) andis the water-filling power level. and satisfy

    the total average power constraint

    (21)

    and they may not be unique.Proof: See the Appendix, Section D.

    When , the optimal power policy for User 1 and User2 is similarly derived using appropriate substitutions for all sub-scripts. When , the optimal power policy is simplifiedas follows: if then

    otherwise

    That is, this is the same power allocation as for CD when.

    Note that when the cdf is continuous, , theprobability measure of the set

    (22)

    is zero. Since according to Theorem 3,is defined on a subsetof and the probability measure of any subset ofmust alsobe zero, the value of will not affect the power constraint (21)and is therefore uniquely determined by (21). Moreover, inthis case, with probability , at most a single user transmits ineach fading state. If is not continuous, the set mayhave positive probability measure and for any fading state inthis set, the broadcast channel can be either time-shared by thetwo users, occupied by a single user, or not used by any user ifthe fading is too severe.

    2) -User Case: The optimal power and time allocationpolicy that achieves the capacity region boundary for the two-user case can be generalized to the-user case .In the -user case, the optimal power allocation is again ob-tained based on the values of functions , .Specifically, for a given in (12), if such that

    , , then we can show that will notappear in the expression of , . Thus, no re-sources should be assigned to Userin the state . That is, theoptimal and are , . In thespecial case where each user has the same rate reward, as-suming that , then ,

    . This is because for any , ,. Therefore, it is clear that the optimal power allocation

    policy in a fading state is to assign power to Userand assign no power to any other user, which is the same as

    that for CD with successive decoding.For any assuming , we know

    that, as shown in the proof of Lemma 1, if then ,; if then and will cross each

    other once at some positive. In both cases, since , forlarge enough ( ), . Thus, assuming withoutloss of generality (WLOG) that , we firstremove any user [i.e., let , ] for which

    , with or which satisfies and. For the remaining users, there are still some users

    whose corresponding may not appear in the expression of. For example, assume WLOG that the remaining users

    are User 1–User 4. Due to the removal criterion, we know that

    (23)

    and

    (24)

    which means that their correspondingwill cross one another once at some positive , and

  • LI AND GOLDSMITH: CAPACITY AND OPTIMAL RESOURCE ALLOCATION FOR FADING BROADCAST CHANNELS—PART I 1089

    Fig. 4. The functionsJ(P ) andf (P ) = � ln[1 + ] (1 � j � 4)

    which satisfy (23) and (24).

    for large enough. Thus, if are as shownin Fig. 4,8 it is clear that will not appear in the expressionof since, , the curve is alwaysunder the dash-dotted contour of formed by part ofthe curves , , , and the straight tan-gent lines between them. Note that the dash-dotted curve in thisfigure is offset slightly for clarity. In the following, we use aniterative procedure to find all the usersamong the remainingusers whose corresponding will not appear in the expres-sion of and identify all other users to whom the re-sources will be allocated later. An interpretation of this proce-dure based on Fig. 4 will then be given.

    Initialization: Let .

    Step 1: Denote the number of remaining users asanddefine the permutation such that

    Then, due to the removal criterion, we have

    (25)

    Step 2: Let . If , all the users whosecorresponding will not appear in the expres-sion of have been removed and stop; if

    , go toStep 3.Step 3: For , define

    and let satisfy . Let, . If , re-

    move those users for which[i.e., let , , and removethem from further consideration]. Also remove User

    . Increase by and return toStep 1.

    8In this figure,P , P , andP (j = 1; 2) are all functions ofnnn. Theirexplicit dependence onnnn is not shown to simplify the notation.

    In this procedure we observe that in the first iteration,of User must be the first part of the curve

    where is close to zero, since

    and according to (25), for close to zero , it must betrue that

    For , satisfying correspondsto the slope of the common tangent between the two curves

    and . Since , if, all the curves will always be under the

    contour formed by part of the curve , part of thecurve , and the common tangent between them.Thus, no power should be assigned to users , .In this case, we know that part of the curveof User as well as the common tangent betweenthe curves and must be part of

    . For example, in Fig. 4, since the number of remainingusers is and , if we drawthe common tangents between curves and ,

    and , and also and , theslopes of which are , , and , respectively, then it is clearthat and , i.e., the slope of the tan-gent between and is the largest among theslopes of the three tangents. Thus, will not be partof but and the common tangent between

    and will.After removing those users , , and User ,

    the number of remaining users is reduced fromtoand User in the first iteration becomes Userin the second iteration. Similarly, in the second iteration, bycomparing the slopes of the common tangents between curves

    and , , we may re-move more users and find a new User in this iterationwhose corresponding as well as the commontangent between curves and mustbe part of . This User becomes User inthe third iteration and the iterative procedure goes on until allthe users whose corresponding will be part ofhave been identified and all other users have been removed.

    Note that in each iteration, the value of is different. As-sume that by the time the iteration stops, . If ,then is simply ; if , then iscomposed of as well as the common tangentsbetween curves and , ,the slopes of which are . That is, by denoting

    , , ,and letting and ( ) be the pointssatisfying

    (26)

    i.e.,

  • 1090 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

    we can express as

    if

    if

    (27)

    where . For example, in the case shown in Fig. 4,since only User 2 will be removed from further consideration,by the time the iterative procedure stops, we have and

    The and satisfying (26) are asshown in Fig. 4 and and denote the slopes of thetangent lines for which ranges from to ,and from to , respectively. Thus, can beexpressed as

    if

    if

    if

    if

    if .

    Once we obtain the curve , similar to the two-usercase, fixed, since the optimal power satisfies thecondition (20), i.e., , is determined bythe point(s) on the curve whose tangent has a slope.If equals any , since all the points onthe common tangent between curves and sharethe same tangent, can be any value between and

    : if , will be time-shared by the two users and ; if is simplychosen as or , then it is only assigned to User

    or User , respectively. If for all, then is uniquely determined by the single

    point on that has a tangent with slope. Therefore,based on the expression of in (27) and the condition(20), for fixed, the optimal power and time allocationpolicy for the remaining users is

    a) if , then

    b) if , by denoting , we know that forthe given , there exists a such that

    or , since

    which results from the iterative procedure generatingand from the fact that is an

    increasing, concave function. If such that

    from (27) we have , since onlywhen does the tangent slopeof decrease from to . In this case,we set

    which corresponds to transmitting the information ofUser only. If such that , then

    since only when does the tan-gent slope of equal . Therefore, as in thetwo-user case, we can set

    (28)

    and

    which indicates that the channel is time-shared by Userand User if choosing ,

    and is occupied by User or User aloneif choosing or , respectively.

    In the above policy, and satisfy the average power con-straint

    (29)

    and they may not be unique, sincecan be any value betweenand . Notice that as in the two-user case, if the cdf is

    continuous, , the probability measure of the set9

    such that and (30)

    is zero and is uniquely determined by (29). Moreover, in thiscase, the above optimal power and time allocation policy forthe -user broadcast channel implies that with probability,the information of at most a single user is transmitted in eachfading state. If is not continuous then the setmay havenonzero probability measure and for , as discussed be-fore, the channel capacity region is achieved by time-sharingbetween two users or dedicated transmission to just one user;for any other channel state , the information of atmost one user is transmitted. Thus, in all cases, the capacity re-gion boundary of TD can be achieved by sending informationto just one user in every fading state. This motivates the subop-timal TD policy we propose in the next subsection.

    C. Suboptimal TD Policy

    The optimal power and time allocation policy in Section III-Bindicates that the capacity region in (8) is achieved bytransmitting the information of at most a single user in eachfading state , although it can also be achieved by a strategy

    9Note that in (30),M is a function ofnnn, which results from the iterativeprocedure described earlier in this section.

  • LI AND GOLDSMITH: CAPACITY AND OPTIMAL RESOURCE ALLOCATION FOR FADING BROADCAST CHANNELS—PART I 1091

    that transmits the information of two users in some states bytime sharing and assigns the channel to a single user or no userin the other states. Based on this observation, we now proposea suboptimal method for resource allocation. This method se-lects a single user in each channel state and allocates appropriatepower to him according to the fading state. We now describe ourmethod to choose the single user and his corresponding power.

    In each state , , define

    and let

    where is given in (12). Then, in the state, only the infor-mation for User is sent with transmit power , wheresatisfies the -user average power constraint (29).

    For example, in the two-user case, by denoting

    (31)

    we can express the suboptimal power and time allocation policyas follows:

    a) if and , or ifand (i.e.,

    ), then

    and

    where satisfies the two-user average power constraint(21);

    b) if and , or ifand (i.e.,

    ), then

    and

    In the special case where all users have the same rate reward,i.e., , it is obvious that this suboptimal TDpower allocation policy in a fading stateis to assign power

    to User and assign no powerto any other user, which is the same as the optimal TD policy.

    Compared to the optimal power and time allocation policy,the advantage of this suboptimal scheme is that it is much easierto compute the water-filling power level using (29).As will be shown in Section V, the resulting rate region of thetwo-user case comes very close to that of the optimal TD policy.This is due to the fact that the two policies are identical exceptover a small set of fading states. Specifically, assuming WLOGthat , in the case where the cdf is continuous,the detailed comparison of the optimal and suboptimal decisionregions for the two-user fading broadcast channel in the Ap-pendix, Section E shows that for a given , the two policies

    transmit the information of different users only in the rare occa-sions when , where

    and

    with determined by:

    and

    (32)

    D. CD Without Successive Decoding

    For CD without successive decoding, each receiver treats thesignals for other users as interference noise. For a given powerallocation policy , by denoting as the transmit powerallocated to User and as the set of all possible power policiessatisfying the average power constraint ,we have the following theorem.

    Theorem 4: The achievable rate region for CD without suc-cessive decoding is given by

    (33)

    where

    The proof of the achievability follows along the same linesas that for the capacity region of CD with successive decodinggiven in the Appendix, Section A and is therefore omitted. Notethat in this paper, as in the case of TD, we refer to this achiev-able rate region as the capacity region for CD without successivedecoding, though we do not have a converse proof since the con-verse only applies to the optimal transmission strategy, which isCD with successive decoding.

    In order to show that in (33) cannot be larger than thecapacity region of TD in (8), we give the following lemma.

    Lemma 2: , , ,

    (34)

    Proof: See the Appendix, Section F.

  • 1092 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

    Theorem 5:

    (35)

    where equality is achieved using the optimal TD policy withpower allocated to at most one user in each fading state.

    Proof: Recall that the optimal power and time allocationpolicy discussed in Section III-B2) indicates that if is con-tinuous, then with probability, no more than one user is usingthe broadcast channel in each fading state; if is dis-continuous, in some fading states with nonzero probability, wecan either choose two users and transmit their information bytime-sharing the channel, or just select one of them and transmithis information alone. Therefore, in any case, the boundary ofthe capacity region in (8) can be achieved by an optimalTD policy which transmits the information of at most one userthrough the fading broadcast channel in each channel state.Obviously, this optimal policy can be used as a power allocationpolicy for CD without successive decoding to eliminate inter-ference from all other users and, therefore, to achieve the samecapacity region boundary as TD. Thus, we need only to showthat the capacity region of CD without successive decoding in(33) cannot be larger than the capacity region of TD in (8). Weuse Lemma 2 to prove this as follows.

    For CD without successive decoding, , denote

    For , let

    and let . Then according to Lemma 2, we have

    (36)

    Now consider the equal-power TD strategy which assigns thepower to each user for a fraction of the total trans-mission time in the state. By denoting

    we have

    (37)

    since . Therefore, from (36) and (37), it is clearthat given a fading state, , the capacity region ofthe equivalent AWGN broadcast channel for CD without succes-sive decoding is within that for equal-power TD and is, there-fore, within that for optimal TD. Consequently, the capacity re-gion of CD without successive decoding in (33) cannot be largerthan the capacity region of TD in (8).

    Note that since the suboptimal TD scheme proposed in Sec-tion III-C indicates that the broadcast channel is used by no more

    than one user, this suboptimal policy can also be applied to CDwithout successive decoding.

    IV. FREQUENCY-SELECTIVE FADING CHANNELS

    In the previous sections we have considered a flat-fadingbroadcast channel model which is appropriate for narrow-bandapplications. For wide-band communication systems, thetime-varying frequency-selective fading model is more appro-priate. In this section, we will extend our previous results tothis model.

    First we consider an -user time-invariant spectral Gaussianbroadcast channel with continuous noise spectra

    , where ranges over the system band-width [13]. For CD with successive decoding, given a powerallocation policy , can be viewed as the transmit powerallocated to User at frequency , . Let denotethe set of all power allocation policies satisfying the total powerconstraint , i.e.,

    Then it can be similarly shown as in Section III-A that the ca-pacity region is

    (38)

    For TD, given a resource allocation policy, , andcan be viewed as the transmit power and fraction of trans-

    mission time allocated to Userat frequency . In this case, theset is defined as

    Then the achievable rate region using TD is

    (39)

    Note that the regions in (38) and (39) are actually identical tothose in (1) and (8), respectively, with the role of the fading state

    now played by frequency. Therefore, the boundary surfacesof the regions in (38) and (39) and the corresponding optimal re-source allocation strategies can be obtained by using the resultsin Section III. Moreover, it is clear that one of the nonuniqueoptimal resource allocation strategies for TD can also be usedfor FD and CD without successive decoding to achieve the samecapacity region as TD.

    In the general case, where the channel is time-varying, if weassume as in [8] that the time variations are random and ergodic,and the channel varies very slowly relative to the multipath delayspread, then the channel can be decomposed into a set of par-allel time-invariant spectral Gaussian broadcast channels. In thiscase, for each fading state, let the continuous noise spectra

  • LI AND GOLDSMITH: CAPACITY AND OPTIMAL RESOURCE ALLOCATION FOR FADING BROADCAST CHANNELS—PART I 1093

    Fig. 5. Two-user ergodic capacity region comparison: 3 dB SNR difference.

    of the users be . Thecapacity regions and the optimal resource allocation strategiesfor CD, TD, and FD of this time-varying channel can then beobtained from the results in Section III by replacing the fadingstate with . That is, the average is now taken on both fre-quency and fading state . Note that for most physical chan-nels the time variations are correlated, not random. The capacityregion for multiuser channels under this more realistic channelmodel is unknown (see [17] for the capacity of a single-usertime-varying frequency selective fading channel).

    V. NUMERICAL RESULTS

    In this section, we present numerical results for the two-userergodic capacity regions of the Rician and Rayleigh flat-fadingchannels under different spectrum-sharing techniques. The ca-pacity regions obtained analytically in the previous section leadto double-integral formulas that were solved numerically usingMathematica to obtain the numerical results in this section. Inthe figures below, as in [15], the equal power TD scheme refersto the strategy that assigns the constant transmission powerand total bandwidth to User 1 for a fraction of the total trans-mission time, and then to User 2 for the remainder of the trans-mission. The optimal TD scheme for both the AWGN channeland the fading channels is obtained by allocating different power

    to the two users. We refer to CD without successive decoding asCDWO. Since TD and FD are equivalent in the sense that theyhave the same capacity region, all results for TD in the figuresalso apply for FD.

    In Fig. 5, the ergodic capacity regions of the Rician andRayleigh fading broadcast channels are compared to that of theGaussian broadcast channel using the CD, TD, equal powerTD, and CDWO techniques. The SNR difference between thetwo users is 3 dB ( and denote the average noise densitiesof User 1’s channel and User 2’s channel, respectively). Thetotal transmission power is 10 dB and the signal band-width 100 kHz. The ratio of the direct-path power to thescattered-path power in the Rician fading subchannels is6 dB.

    In this figure we see that while the single-user ergodic rate(the -axis or -axis intercept) in fading is smaller than therate in AWGN, the two-user capacity regions of both the Ri-cian fading and the Rayleigh fading broadcast channels usingeither optimal CD or suboptimal TD techniques in some placesdominate that of the AWGN broadcast channel using optimalCD. That is, for the fading broadcast channel, ergodic rate pairsbeyond the capacity region of the nonfading broadcast channelcan be achieved by applying optimal resource allocation overthe joint fading channel states. However, as shown in Fig. 6,this is not true when the difference between the average noise

  • 1094 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

    Fig. 6. Two-user ergodic capacity region comparison: 20 dB SNR difference with strong signal power.

    variances of the two users is quite large. An intuitive analyticalexplanation of the two cases is given in the Appendix, SectionG by comparing the sum rate on a fading broadcast channel tothe sum rate on an AWGN channel, where the sum rate refers tothe sum of the weighted rates in (3) with equal rate reward foreach user.

    In Fig. 5, for simplicity, we calculate the rate region of thefading broadcast channel for TD by applying the simple sub-optimal TD power allocation policy. The resulting rate regionturns out to be very close to the capacity region for optimalCD. Therefore, the capacity region using the optimal TD powerpolicy will also come very close to that of optimal CD. This ob-servation implies that, due to the small SNR difference betweenthe two users, superposition encoding with successive decodingis not necessary, since time sharing is near-optimal. For theAWGN broadcast channel, the capacity region boundary of theoptimal TD scheme is indistinguishable from the equal-powerTD straight line, which means that when the two users have asimilar channel noise power, constant power allocation is goodenough for TD. The CDWO capacity region boundary (omittedfrom Fig. 5 but shown in figures in [15]) includes the two end-points of the equal-power TD line but is below this straight linedue to its convexity. However, for the fading channels, optimalor suboptimal TD has a much larger capacity region than equalpower TD and the capacity region for CDWO is the same as thatfor optimal TD.

    We show in Fig. 6 that when the SNR difference between thetwo users is 20 dB and the total average power is 25 dB, theergodic capacity region for CD in Rayleigh fading is now com-pletely within the region for CD in AWGN. However, optimalTD in fading can achieve some rate pairs far beyond the capacityregion of the AWGN broadcast channel using optimal TD, andthe suboptimal TD power policy for Rayleigh fading results ina rate region almost as large as the capacity region with the op-timal TD power policy. For both AWGN and fading channels,due to the large SNR difference between the two users, the ca-pacity region for optimal CD is noticeably larger than that foroptimal TD, and the capacity region for optimal TD is notice-ably larger than that for equal power TD.

    Fig. 7 shows the case where the SNR difference between thetwo users is 20 dB and the total average power is only 10 dB. Un-like the previous cases, we see here that the ergodic single-usercapacity of User 2 for the Rayleigh fading channel is largerthan that for the AWGN channel due to its very low averageSNR. Thus, the optimal CD or suboptimal TD scheme for fadingyields a large rate region that is not achievable for the AWGNchannel. However, as in Fig. 6, due to the great SNR differencebetween the two users, optimal CD results in a capacity regionmuch larger than that for suboptimal TD and the capacity regionfor optimal TD is significantly larger than that for equal powerTD. This observation holds for both fading and AWGN chan-nels.

  • LI AND GOLDSMITH: CAPACITY AND OPTIMAL RESOURCE ALLOCATION FOR FADING BROADCAST CHANNELS—PART I 1095

    Fig. 7. Two-user ergodic capacity region comparison: 20-dB SNR difference with weak signal power.

    VI. CONCLUSION

    We have obtained the ergodic capacity region and the op-timal dynamic resource allocation strategy for fading broadcastchannels with perfect CSI at both the transmitter and the re-ceivers. These results are obtained for CD with and without suc-cessive decoding, TD, and FD. Comparisons of the capacity re-gions show that CD with successive decoding has the largest ca-pacity region, while TD and FD are equivalent and they have thesame capacity region as CD without successive decoding. ForCD without successive decoding, the optimal power policy is totransmit the information of at most one user in each joint fadingstate. This policy is also optimal for TD, though other strategieswhich allow at most two users to time-share the channel mayalso be optimal. When the average channel fading condition foreach user is similar, the capacity regions for optimal CD and TDare quite close to each other. However, when each user has anaverage channel condition quite different from that of the others,optimal CD can achieve a much larger ergodic capacity regionthan the other techniques. In Part II of this paper, we will de-rive the zero-outage capacity region and the capacity region withnonzero outage for fading broadcast channels. For narrow-bandapplications, since each rate vector in the outage capacity regionmust be achievable in every fading stateunless an outage isdeclared, we cannot average over different fading conditions.

    However, the outage and ergodic capacity regions exhibit sim-ilar relative performance between the various channel-sharingtechniques.

    APPENDIX

    A. Proof of Theorem 1

    The convexity of the capacity region in (1) can be easilyproved by using the idea of time-sharing [18, pp. 396–397]. Wenow prove the achievability and the converse of this capacityregion.

    1) Achievability: We prove the achievability of the capacityregion by proving the achievability of in (2)for each given power allocation policy . , theproof is similar to that of the achievability of the single-userfading channel capacity [6]. The main idea is a “time diversity”system with multiplexed input and demultiplexed output.That is, we first discretize the range of the time-varying noisedensity of each user into states. Therefore, there are

    joint channel states of the users and wedenote them as , the probabilities of whichare , respectively. In each joint state, thechannel can be viewed as a time-invariant AWGN broadcastchannel, the capacity region of which is known [19]. Given a

  • 1096 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

    block length , we then design an encoder/decoder pair for theusers in each state with codewords

    of average power for each user which achieve rate, , where is the maximum

    achievable rate for Useron the equivalent AWGN broadcastchannel corresponding to state, for largeenough, and

    These encoder/decoder pairs correspond to a set of input andoutput ports associated with each state. When the channel is instate , the corresponding pair of ports are connected throughthe channel. The codewords associated with each state are thusmultiplexed together for transmission, and demultiplexed at thechannel output. This effectively reduces the time-varying broad-cast channel to a set of time-invariant broadcast channels inparallel, where the th channel only operates when the time-varying channel is in state . The average rate for each user

    is thus the sum of rates associated with each stateweighted by , . Details of the proof canbe found in [16].

    2) Converse:Suppose that a rate vector is achievable,then we need to prove that any sequence of

    codes with average total powerand probability of erroras must have where

    is

    We assume that the codes are designed witha priori knowledgeof the joint channel state. Since the transmitter and receiversknow state up to the current time, this assumption can onlyresult in a higher achievable rate.

    As in the Appendix, Section A1), we first discretize the rangeof the time-varying noise density of each user intostates. Specifically, define and wesay that theth subchannel is in state if the time-varying noisedensity of User satisfies , whereThus, the set discretizes the fading range of each sub-channel into states and there are dis-crete joint channel states. We denote thethof these states as

    where is the base-expansion of , and is the th subchannel state. That is,

    for all and

    Note that a channel state if and only if ,.

    Over a given time interval , let be the number oftransmissions during which the channel is in the state, let

    be the random subset of at which times thechannel is in the state , and let be uniformly distributed on

    . By the stationarity and ergodicity of the channelvariation, as . For , let

    be the transmit power allocated to Userat time anddefine

    Since for any message from the base station to theusers,there is a power constraint on the corresponding codeword, itfollows that for each

    Therefore, for all such that , arebounded sequences in. Thus, there exist a converging subse-quence and a limiting such thatas . Moreover

    (40)

    For a given power allocation policy , we assume thatis the transmit power assigned to Userwhen the

    time-varying broadcast channel is in channel state. Letbe the set of all the power allocation policies which are piece-wise constant in each channel state and which satisfy the averagetotal power constraint (40). Assuming that the noise densitiesof the users in each channel state areconstants and are denoted as , thechannel states can be viewed as AWGNbroadcast channels where at any given time only one of thesechannels is in operation and the probability that theth broad-cast channel is in operation is given by , .We call this the probabilistic broadcast channel. We show inthe Appendix, Section A3) that if is achievable on theprobabilistic broadcast channel consisting of the broad-cast channels with probabilities

    under the assumption of perfect trans-mitter and receiver CSI (i.e., at each timeit is known at boththe transmitter and receivers which broadcast channel is inoperation), then

    (41)

    where

  • LI AND GOLDSMITH: CAPACITY AND OPTIMAL RESOURCE ALLOCATION FOR FADING BROADCAST CHANNELS—PART I 1097

    Define

    Since for and

    it is clear that

    (42)

    , let

    According to the power constraint (40), it is obvious thatsatisfies

    Thus, and

    (43)

    where is given in (2). Taking the limit of the left-handside of (43), we obtain

    (44)

    For the time-varying broadcast channel, ifis achievable, then

    (45)

    From (42) and (45) we have

    That is,

    (46)

    where denotes the capacity region of the time-varyingbroadcast channel. Combining (44) and (46) with the achiev-ability result which indicates that

    Fig. 8. Probabilistic broadcast channel: Channel 1 in operation withprobabilityp , and Channel 2 in operation with probabilityp .

    we obtain

    Since the upper bound equals the lower bound by the monotoneconvergence theorem [20], it is clear that

    3) Capacity of a Probabilistic Broadcast Channel withCSI: In the Appendix, Section A2), while proving the converseof the capacity region in Theorem 1 we have used the capacityof a probabilistic broadcast channel consisting ofdiscreteAWGN broadcast channels with given probabilities under theassumption that perfect CSI is available at both the transmitterand the receivers. We now show how to prove the capacityformula of a probabilistic broadcast channel composed of twoAWGN broadcast channels and two users. As will be discussedlater, the results can be easily generalized to the case ofchannels and users ( ).

    Assume that two discrete degraded memoryless AWGNbroadcast channels

    and

    are as shown in Fig. 8, where, , and are finitealphabets and denotes the channel transition probabilityfunction. Note that each broadcast channel has two outputsand . In the first channel (Channel 1), the Gaussian noises aredenoted as and , the noise variances of which areand , respectively. In the second channel (Channel 2), theGaussian noises are and , the noise variances of whichare and , respectively. Let and

    . We define the probabilistic broadcast channelconsisting of two AWGN broadcast channels as a channel whereChannel 1 operates with probability and Channel 2 operateswith probability , and at any time only one of thetwo channels is in operation.

    Denote as the capacity of an AWGN channel withSNR , i.e., . Let and bethe transmission rates of the particular information toand ,respectively. Here we do not consider the case where commoninformation is transmitted. , let ,

    . Assuming that the total average power is, forfixed , we first divide the total power

  • 1098 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

    into the part used in Channel 1 and used in Channel2 with . Then is divided into the power

    used to transmit the information to and usedto transmit the information to ; is divided into the power

    used to transmit the information to and usedto transmit the information to . Then, under the assumption ofperfect CSI at both the transmitter and the receivers, the capacityregion of the probabilistic broadcast channel in Fig. 8 is definedby

    (47)

    Note that this capacity region is similar to that of a parallelbroadcast channel composed of two broadcast channels [21],[22], except that in the parallel case, bothand equal .Therefore, its proof can be obtained by following very similarsteps as that for the parallel broadcast channel [21], [22], the de-tails of which are given in [16].

    The capacity region in (47) is equivalent to [21], [22]

    (48)

    since it is obvious that , and every vertex of the convexhull is inside , which means . The convexityof is easily shown by using the time-sharing technique. Theformulas (47) and (48) can be readily generalized to the case of

    channels and users by careful inspec-tion of their structures, and the generalized formulas are alsoequivalent. Since we can similarly prove that the generalizedformula of (48) is the capacity region of the-channel, -userprobabilistic broadcast channel, the generalized formula of (47)as used in the Appendix, Section A2) must also be the capacityregion.

    B. Proof of Theorem 2

    1) Achievability of the Rate Region:The proof of the achiev-ability follows along the same idea as that for the capacity regionof CD with successive decoding discussed in the Appendix, Sec-tion A1) and is therefore omitted.

    2) Convexity of the Rate Region: , , , let

    Since

    (49)

    (50)

    (51)

    from (49)–(51), we obtain the Hessian matrix

    Therefore, is a concave function of . That is,, , ,

    (52)

    where , . Thisresult will be used in the following to prove the convexity of thecapacity region.

    , , we need to show that. Let and be the two power policies

    corresponding to the rate vectorsand , respectively. In agiven channel state, according to the two policies, the transmitpower and fractions of transmission time allocated to User

    are and , , and , re-spectively. Therefore, for the two power policies, the achievablerates for User in the state are

    (53)

    and

    (54)

    respectively. Equations (53) and (54) can also be expressed as[23]

    where and ,.

    If we define a third power policy such that in each channelstate , the transmit power and fraction of transmissiontime allocated to User are

    and

    respectively, then

  • LI AND GOLDSMITH: CAPACITY AND OPTIMAL RESOURCE ALLOCATION FOR FADING BROADCAST CHANNELS—PART I 1099

    Since

    we know that . Moreover, it is proved in (52) thatfor

    is a concave function of , i.e.,

    Therefore, .

    C. Proof of Lemma 1

    In the following, , , , , , , , andare all functions of . Their explicit dependence onis not

    shown to simplify the notation.Let and . Define

    where is given in (12), then

    (55)

    1) If , then . Since byassumption, the numerator of (55) is nonnegative for any

    . Thus, , . Therefore, if ,, i.e., . For ,

    and , we have

    (56)

    The last inequality in (56) is due to the concavity ofand the equality is achieved when and

    . Thus, in this case, the solution to (11) is.

    2) If , from (55) we know that

    if

    if .(57)

    Since , for large enough, .Thus, for large , , i.e., . However,

    . Using this and (57), it is clear that andwill cross each other once at some positive value, as shownin Fig. 3. In that figure, and are the points that satisfy

    where is the slope of the common tangent line in the figure.Since

    and can also be expressed as

    we have

    (58)

    (59)

    and satisfies

    i.e.,

    (60)

    where is given in (14). Therefore, if ,by time-sharing, in (11) can achieve the values between

    and on the straight line; if or, is simply or , respectively. That

    is, in this case, the solution to (11) is (15).

    D. Proof of Theorem 3

    For a given fading state, from (13) we know that the optimalpower satisfies (20). Let and be the optimalpower and fraction of transmission time allocated to user

    at state , respectively. From Lemma 1, we know that

    1) if , then . Thus

    (61)

    and , , , .Substituting (61) into (20), we have

    Since must be nonnegative

    2) if , as shown in Fig. 3, it is clear that

    if

    if

    if

    (62)

    where , , and are all functions of and they aregiven in (58)–(60), respectively. Since we know from (20)

  • 1100 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

    that , from (15), (17)–(19), and (62) wehave

    a) if then and. Thus, .

    Consequently

    and

    b) if then and. Thus, .

    Consequently

    and

    c) if , then

    and can be any value between and .Thus

    and

    where can be any value betweenand . More-over, and must satisfy the total average powerconstraint (21).

    Because , from (58) and (59), we have

    (63)

    Furthermore, since and

    if

    if

    where

    we have

    A) if , or if and , then;

    B) if and , then ;

    C) if and , then .

    Therefore, the second part of Theorem 3 is proved by combininga), b), and c) with A), B), and C).

    E. Decision Region Comparison for Optimal and SuboptimalTD Schemes

    For the suboptimal TD policy, define

    where is given in (31). Let satisfy

    then

    (64)

    (65)

    Assuming that , from (64) and (65) we know that

    1) if , then

    and

    2) if , then

    and

    Since for , , where de-notes the derivative of with respect to , is adecreasing function. Thus, when the cdf is continuous,the suboptimal policy is equivalent to

    a) if and or if and ,then

    and

    b) if and , then

    and

    From the proof of Theorem 3 in the Appendix, Section D, it isclear that the optimal TD policy is equivalent to

    a) if and or if and ,where is defined in (60), i.e., , then

    and

    b) if and , then

    and

    Hence, when , satisfies , from thedefinition of in (14), we have

    (66)

    According to (63) and (66), it is clear that . Conse-quently, . Therefore, for a given , the only dif-ference between the optimal and suboptimal policy is that when

    , where is given in (32), the suboptimal schemetransmits the information of User 2, while the optimal schemetransmits the information of User 1. This suboptimal allocationof resources occurs only rarely. Note that the values ofin thetwo schemes satisfying the two-user power constraint (21) arenot the same, though they may be very close to each other.

  • LI AND GOLDSMITH: CAPACITY AND OPTIMAL RESOURCE ALLOCATION FOR FADING BROADCAST CHANNELS—PART I 1101

    F. Proof of Lemma 2

    For simplicity, we denote

    When , let , then , .Let

    then

    If , then for , we haveand

    (67)

    If and , then

    for

    for

    which implies

    (68)

    If and , then for, we have and

    (69)

    From (67)–(69), we conclude that

    That is,

    Now assume that (34) is true for

    then for

    Therefore, by induction, we know that (34) is true.

    G. Comparison of the Sum Rates on a Fading BroadcastChannel and on an AWGN Broadcast Channel

    In the following we consider the sum rate for an-userbroadcast system. Assume WLOG that User 1 has the smallestaverage noise variance, i.e.,

    (70)

    Then for the time-invariant AWGN broadcast channel with thesame noise variances, the maximum sum rate of theuserswill be

    AWGN

    (71)

    where is the total transmit power of the users.For the fading broadcast channel, it is shown that for both CD

    and TD, the optimal power allocation strategy in each fadingstate is to allocate power to User only,where , and satisfies

    . Therefore, the maximum sum rate of theusers is

    fading

  • 1102 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

    which is equivalent to the capacity of a single-user fadingchannel with average noise power [6]

    For this equivalent single-user fading channel, given the totalaverage power constraint, it is well known by Jensen’s in-equality that [6]

    fading (72)

    In the case where the average noise power of each user in abroadcast system with fading is dramatically different, most ofthe time the user with the best average fading condition (User 1)is chosen for transmission. Therefore, in this case, .Hence, from (71) and (72) it is clear that

    fading AWGN

    In the case where the average noise power of each user is quitesimilar, in each joint fading state, only the user with the bestchannel condition is chosen for transmission. In this case, sinceat any time, the channel chosen for transmission is the best of the

    channels with similar average fading conditions, the equiv-alent single-user fading channel is much better on the averagethan any of the individual fading channels, i.e., .Therefore, it is very likely that

    fading AWGN

    ACKNOWLEDGMENT

    The authors wish to thank the Editor and the anonymous re-viewers for their very helpful suggestions and comments.

    REFERENCES

    [1] R. H. Katz, “Adaptation and mobility in wireless information systems,”IEEE Personal Commun., pp. 6–17, First Quarter 1994.

    [2] S.-G. Chua and A. J. Goldsmith, “Variable-rate variable-power MQAMfor fading channels,”IEEE Trans. Commun., vol. 45, pp. 1218–1230,Oct. 1997.

    [3] J. F. Hayes, “Adaptive feedback communications,”IEEE Trans.Commun. Technol., vol. COM-16, pp. 29–34, Feb. 1968.

    [4] J. K. Cavers, “Variable-rate transmission for Rayleigh fading channels,”IEEE Trans. Commun., vol. COM-20, pp. 15–22, Feb. 1972.

    [5] B. Vucetic, “An adaptive coding scheme for time-varying channels,”IEEE Trans. Commun., vol. 39, pp. 653–663, May 1991.

    [6] A. J. Goldsmith and P. P. Varaiya, “Capacity of fading channels withchannel side information,”IEEE Trans. Inform. Theory, vol. 43, pp.1986–1992, Nov. 1997.

    [7] R. Knopp and P. A. Humblet, “Information capacity and power controlin single-cell multiuser communications,” inProc. IEEE Int. Communi-cations Conf. (ICC’95), Seattle, WA, June 1995.

    [8] D. N. Tse and S. V. Hanly, “Multiple-access fading channels—Part I:Polymatroidal structure, optimal resource allocation and throughput ca-pacities,” IEEE Trans. Inform. Theory, vol. 44, pp. 2796–2815, Nov.1998.

    [9] R. Cheng and S. Verdú, “Gaussian multiaccess channels with capacityregion and multiuser water-filling,”IEEE Trans. Inform. Theory, vol.39, pp. 773–785, May 1993.

    [10] S. V. Hanly and D. N. Tse, “Multiple-access fading channels—Part II:Delay-limited capacities,”IEEE Trans. Inform. Theory, vol. 44, pp.2816–2831, Nov. 1998.

    [11] G. Caire, G. Taricco, and E. Biglieri, “Optimum power control overfading channels,”IEEE Trans. Inform. Theory, vol. 45, pp. 1468–1489,July 1999.

    [12] L. Li and A. J. Goldsmith, “Outage capacities and optimal power alloca-tion for fading multiple-access channels,”IEEE Trans. Inform. Theory,submitted for publication.

    [13] D. Hughes-Hartogs, “The capacity of a degraded spectral Gaussianbroadcast channel,” Ph.D. dissertation, Inform. Syst. Lab., Ctr. Syst.Res., Stanford Univ., Stanford, CA, July 1975.

    [14] D. N. Tse, “Optimal power allocation over parallel Gaussian broad-cast channels,” inProc. Int. Symp. Inform. Theory, Ulm, Germany, June1997, p. 27.

    [15] A. J. Goldsmith, “The capacity of downlink fading channels with vari-able-rate and power,”IEEE Trans. Vehic. Technol., vol. 46, pp. 569–580,Aug. 1997.

    [16] L. Li, “Adaptive receiver design and optimal resource allocationstrategies for fading channels,” Ph.D. dissertation, Calif. Inst. Technol.,Pasadena, 2000.

    [17] A. J. Goldsmith and M. Medard, “Capacity of time-varying channelswith memory and channel side information,”IEEE Trans. Inform.Theory, submitted for publication.

    [18] T. M. Cover and J. A. Thomas,Elements of Information Theory. NewYork: Wiley, 1991.

    [19] P. P. Bergmans, “Random coding theorem for broadcast channels withdegraded components,”IEEE Trans. Inform. Theory, vol. IT-19, pp.197–207, Mar. 1973.

    [20] P. Billingsley, Probability and Measure, 2nd ed. New York: Wiley,1986.

    [21] A. A. El Gamal, “Capacity of the product and sum of two unmatchedbroadcast channels,”Probl. Pered, Inform., vol. 16, no. 1, pp. 3–23,Jan.–Mar. 1980.

    [22] , “Results in multiple user channel capacity,” Ph.D. dissertation,Stanford Univ., Stanford, CA, May 1978.

    [23] P. P. Bergmans and T. A. Cover, “Cooperative broadcasting,”IEEETrans. Inform. Theory, vol. IT-20, pp. 317–324, May 1974.