
    9 Convergence in probability

The idea is to extricate a simple deterministic component out of a random situation. This is typically possible when a large number of random effects cancel each other out, so some limit is involved. The general situation, then, is the following: given a sequence of random variables, $Y_1, Y_2, \dots$, we want to show that, when $n$ is large, $Y_n$ is approximately $f(n)$ for some simple deterministic function $f(n)$. The meaning of "approximately" is what we now make clear.

A sequence, $Y_1, Y_2, \dots$, of random variables converges to a number $a$ in probability if, as $n \to \infty$, $P(|Y_n - a| \le \epsilon)$ converges to 1, for any fixed $\epsilon > 0$. This is equivalent to $P(|Y_n - a| > \epsilon) \to 0$ as $n \to \infty$, again for any fixed $\epsilon > 0$.

Example 9.1. Toss a fair coin $n$ times, independently. Let $R_n$ be the longest run of Heads, i.e., the longest sequence of consecutive tosses of Heads. For example, if $n = 15$ and the tosses come out

HHTTHHHTHTHTHHH,

then $R_n = 3$. We will show that, as $n \to \infty$,
\[
\frac{R_n}{\log_2 n} \to 1,
\]
in probability. This means that, to a first approximation, one should expect about 20 consecutive Heads somewhere in a million tosses.

To solve a problem such as this, we need to find upper bounds on the probabilities that $R_n$ is large and that it is small, i.e., on $P(R_n \ge k)$ and $P(R_n \le k)$, for appropriately chosen $k$. Now, for arbitrary $k$,
\begin{align*}
P(R_n \ge k) &= P(k \text{ consecutive Heads start at some } i,\ 1 \le i \le n-k+1)\\
&= P\left(\bigcup_{i=1}^{n-k+1} \{i \text{ is the first Heads in a succession of at least } k \text{ Heads}\}\right)\\
&\le n\,\frac{1}{2^k}.
\end{align*}

For the lower bound, divide the string of size $n$ into disjoint blocks of size $k$. There are $\lfloor n/k \rfloor$ such blocks (if $n$ is not divisible by $k$, simply throw away the leftover smaller block at the end). Then $R_n \ge k$ as soon as one of the blocks consists of Heads only, and different blocks are independent. Therefore,
\[
P(R_n < k) \le \left(1 - \frac{1}{2^k}\right)^{\lfloor n/k \rfloor} \le \exp\left(-\frac{1}{2^k}\left\lfloor \frac{n}{k}\right\rfloor\right),
\]


using the famous inequality $1 - x \le e^{-x}$, valid for all $x$.

Below, we will use the following trivial inequalities, valid for any real number $x \ge 2$: $\lfloor x \rfloor \ge x - 1$, $\lceil x \rceil \le x + 1$, $x - 1 \ge \frac{x}{2}$, and $x + 1 \le 2x$.

To demonstrate that $\frac{R_n}{\log_2 n} \to 1$ in probability, we need to show that, for any $\epsilon > 0$,
\begin{align}
P(R_n \ge (1+\epsilon)\log_2 n) &\to 0, \tag{1}\\
P(R_n \le (1-\epsilon)\log_2 n) &\to 0, \tag{2}
\end{align}
as $n \to \infty$, because
\begin{align*}
P\left(\left|\frac{R_n}{\log_2 n} - 1\right| \ge \epsilon\right) &= P\left(\frac{R_n}{\log_2 n} \ge 1 + \epsilon \text{ or } \frac{R_n}{\log_2 n} \le 1 - \epsilon\right)\\
&= P\left(\frac{R_n}{\log_2 n} \ge 1 + \epsilon\right) + P\left(\frac{R_n}{\log_2 n} \le 1 - \epsilon\right)\\
&= P(R_n \ge (1+\epsilon)\log_2 n) + P(R_n \le (1-\epsilon)\log_2 n).
\end{align*}

A little bit of fussing in the proof comes from the fact that $(1 \pm \epsilon)\log_2 n$ are not integers. This is common in problems such as this one. To prove (1), we plug $k = \lfloor (1+\epsilon)\log_2 n \rfloor$ into the upper bound to get
\[
P(R_n \ge (1+\epsilon)\log_2 n) \le n\,\frac{1}{2^{(1+\epsilon)\log_2 n - 1}} = n\,\frac{2}{n^{1+\epsilon}} = \frac{2}{n^{\epsilon}} \to 0
\]

as $n \to \infty$. On the other hand, to prove (2), we need to plug $k = \lfloor (1-\epsilon)\log_2 n \rfloor + 1$ into the lower bound,
\begin{align*}
P(R_n \le (1-\epsilon)\log_2 n) &\le P(R_n < k)\\
&\le \exp\left(-\frac{1}{2^k}\left\lfloor \frac{n}{k}\right\rfloor\right)\\
&\le \exp\left(-\frac{1}{2^k}\left(\frac{n}{k} - 1\right)\right)\\
&\le \exp\left(-\frac{1}{32} \cdot \frac{1}{n^{1-\epsilon}} \cdot \frac{n}{(1-\epsilon)\log_2 n}\right)\\
&= \exp\left(-\frac{1}{32} \cdot \frac{n^{\epsilon}}{(1-\epsilon)\log_2 n}\right)\\
&\to 0,
\end{align*}
as $n \to \infty$, as $n^{\epsilon}$ is much larger than $\log_2 n$.
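For a numerical sanity check, here is a short Python simulation sketch (an added illustration; the helper name longest_run is ours). It estimates $R_n$ for several values of $n$ and prints the ratio $R_n/\log_2 n$, which should be close to 1; for $n = 10^6$ the run length should indeed come out near 20.

import math
import random

def longest_run(n, rng):
    # Longest run of Heads in n independent fair coin tosses.
    best = cur = 0
    for _ in range(n):
        if rng.random() < 0.5:  # Heads
            cur += 1
            best = max(best, cur)
        else:                   # Tails resets the current run
            cur = 0
    return best

rng = random.Random(1)
for n in [10**3, 10**4, 10**5, 10**6]:
    r = longest_run(n, rng)
    print(n, r, round(r / math.log2(n), 3))  # ratio should be near 1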


The most basic tool in proving convergence in probability is Chebyshev's inequality: if $X$ is a random variable with $EX = \mu$ and $\mathrm{Var}(X) = \sigma^2$, then
\[
P(|X - \mu| \ge k) \le \frac{\sigma^2}{k^2},
\]
for any $k > 0$. We proved this inequality in the previous chapter, and we will use it to prove the next theorem.
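As a numerical illustration of how conservative the bound can be (an added example, using a Binomial(100, 1/2) random variable, for which $\mu = 50$ and $\sigma^2 = 25$), the following Python sketch compares the exact tail probability $P(|X - \mu| \ge k)$ with the Chebyshev bound $\sigma^2/k^2$; the bound is always valid but usually far from tight.

import math

n, p = 100, 0.5
mu, var = n * p, n * p * (1 - p)  # mu = 50, sigma^2 = 25

def pmf(i):
    # Binomial(n, p) probability mass at i.
    return math.comb(n, i) * p**i * (1 - p)**(n - i)

for k in [5, 10, 15]:
    exact = sum(pmf(i) for i in range(n + 1) if abs(i - mu) >= k)
    print(k, round(exact, 4), round(var / k**2, 4))  # exact tail <= bound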

Theorem 9.1. Connection between variance and convergence in probability.

Assume that $Y_n$ are random variables and $a$ is a constant such that
\begin{align*}
EY_n &\to a,\\
\mathrm{Var}(Y_n) &\to 0,
\end{align*}
as $n \to \infty$. Then
\[
Y_n \to a,
\]
as $n \to \infty$, in probability.

Proof. Fix an $\epsilon > 0$. If $n$ is so large that
\[
|EY_n - a| < \epsilon/2,
\]
then
\[
P(|Y_n - a| > \epsilon) \le P(|Y_n - EY_n| > \epsilon/2) \le \frac{4\,\mathrm{Var}(Y_n)}{\epsilon^2} \to 0,
\]
as $n \to \infty$. Note that the second inequality in the computation is Chebyshev's inequality.

This is most often applied to sums of random variables. Let
\[
S_n = X_1 + \dots + X_n,
\]
where $X_i$ are random variables with finite expectation and variance. Then, without any independence assumption,
\[
ES_n = EX_1 + \dots + EX_n
\]
and
\begin{align*}
E(S_n^2) &= \sum_{i=1}^n EX_i^2 + \sum_{i \ne j} E(X_iX_j),\\
\mathrm{Var}(S_n) &= \sum_{i=1}^n \mathrm{Var}(X_i) + \sum_{i \ne j} \mathrm{Cov}(X_i, X_j).
\end{align*}


You should recall that
\[
\mathrm{Cov}(X_i, X_j) = E(X_iX_j) - EX_i\,EX_j
\]
and
\[
\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X).
\]
Moreover, if $X_i$ are independent,
\[
\mathrm{Var}(X_1 + \dots + X_n) = \mathrm{Var}(X_1) + \dots + \mathrm{Var}(X_n).
\]

Continuing with the review, let's reformulate and reprove the most famous convergence in probability theorem. We will use the common abbreviation i. i. d. for independent, identically distributed random variables.

Theorem 9.2. Weak law of large numbers. Let $X, X_1, X_2, \dots$ be i. i. d. random variables with $EX_1 = \mu$ and $\mathrm{Var}(X_1) = \sigma^2 < \infty$, and let $S_n = X_1 + \dots + X_n$. Then, as $n \to \infty$,
\[
\frac{S_n}{n} \to \mu,
\]
in probability.

Proof. We have $E\left(\frac{S_n}{n}\right) = \mu$ and $\mathrm{Var}\left(\frac{S_n}{n}\right) = \frac{\sigma^2}{n} \to 0$ as $n \to \infty$, so the claim follows from Theorem 9.1.
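Here is a minimal Python simulation sketch of the weak law (an added illustration, using i. i. d. Uniform(0,1) summands, so $\mu = 1/2$): the estimated probability $P(|S_n/n - \mu| > \epsilon)$ shrinks as $n$ grows.

import random

rng = random.Random(2)
eps, trials = 0.01, 1000
for n in [100, 1000, 10000]:
    # Fraction of trials in which the sample mean misses mu = 0.5
    # by more than eps.
    bad = sum(
        abs(sum(rng.random() for _ in range(n)) / n - 0.5) > eps
        for _ in range(trials))
    print(n, bad / trials)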

Example 9.2. Investment. Suppose that, each year, a stock either goes up by 50% with probability 0.8 (each invested dollar turns into 1.5 dollars) or becomes worthless with probability 0.2, while a bond pays a sure 6% (each invested dollar turns into 1.06 dollars). We will assume year-to-year independence of the stock's returns.

We will try to maximize the return on our investment by hedging. That is, we invest, at the beginning of each year, a fixed proportion $x$ of our current capital into the stock and the remaining proportion $1 - x$ into the bond. We collect the resulting capital at the end of the year, which is simultaneously the beginning of the next year, and reinvest with the same proportion $x$. Assume that our initial capital is $x_0$.


It is important to note that the expected value of the capital at the end of the year is maximized when $x = 1$, but using this strategy you will eventually lose everything. Let $X_n$ be your capital at the end of year $n$. Define the average growth rate of your investment as
\[
\lambda = \lim_{n \to \infty} \frac{1}{n}\log\frac{X_n}{x_0},
\]
so that
\[
X_n \approx x_0 e^{\lambda n}.
\]
We will express $\lambda$ in terms of $x$; in particular, we will show that it is a nonrandom quantity.

Let $I_i = I_{\{\text{stock goes up in year } i\}}$. These are independent indicators with $EI_i = 0.8$.

We have
\begin{align*}
X_n &= X_{n-1}(1-x) \cdot 1.06 + X_{n-1} \cdot x \cdot 1.5\,I_n\\
&= X_{n-1}\left(1.06(1-x) + 1.5x\,I_n\right),
\end{align*}
and so we can unroll the recurrence to get
\[
X_n = x_0\left(1.06(1-x) + 1.5x\right)^{S_n}\left(1.06(1-x)\right)^{n - S_n},
\]
where $S_n = I_1 + \dots + I_n$. Therefore,

\[
\frac{1}{n}\log\frac{X_n}{x_0} = \frac{S_n}{n}\log(1.06 + 0.44x) + \left(1 - \frac{S_n}{n}\right)\log(1.06(1-x))
\]
\[
\to 0.8\log(1.06 + 0.44x) + 0.2\log(1.06(1-x)),
\]
in probability, as $n \to \infty$, because $\frac{S_n}{n} \to 0.8$ by the weak law of large numbers. (Note that $1.06(1-x) + 1.5x = 1.06 + 0.44x$.) The last expression defines $\lambda$ as a function of $x$. To maximize this, we set $\frac{d\lambda}{dx} = 0$ to get
\[
0.8 \cdot \frac{0.44}{1.06 + 0.44x} = \frac{0.2}{1 - x}.
\]

The solution is $x = \frac{7}{22} \approx 0.318$, which gives $\lambda \approx 0.081$, i.e., an average growth rate of about 8.1%.
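This can be checked by simulation. The Python sketch below (an added illustration; the helper name growth_rate is ours) runs the yearly capital recursion and prints the empirical growth rate $\frac{1}{n}\log(X_n/x_0)$ for several values of $x$: it peaks near $x = 7/22$ at about $0.081$, while an overly aggressive $x$ produces a negative growth rate.

import math
import random

def growth_rate(x, n=200000, seed=3):
    # Empirical (1/n) log(X_n / x_0): each year the capital is multiplied
    # by 1.06(1-x) + 1.5x with probability 0.8 (stock goes up) and by
    # 1.06(1-x) with probability 0.2 (stock becomes worthless).
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        up = rng.random() < 0.8
        total += math.log(1.06 * (1 - x) + (1.5 * x if up else 0.0))
    return total / n

for x in [0.0, 7 / 22, 0.6, 0.9]:
    print(round(x, 3), round(growth_rate(x), 4))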

Example 9.3. Distribute $n$ balls independently at random into $n$ boxes. Let $N_n$ be the number of empty boxes. Show that $\frac{1}{n}N_n$ converges in probability and identify the limit.

Note that
\[
N_n = I_1 + \dots + I_n,
\]
where $I_i = I_{\{i\text{th box is empty}\}}$, but you cannot use the weak law of large numbers, as the $I_i$ are not independent. Nevertheless,
\[
EI_i = \left(\frac{n-1}{n}\right)^n = \left(1 - \frac{1}{n}\right)^n,
\]
and so
\[
EN_n = n\left(1 - \frac{1}{n}\right)^n.
\]


Moreover,
\[
E(N_n^2) = EN_n + \sum_{i \ne j} E(I_iI_j)
\]
with
\[
E(I_iI_j) = P(\text{boxes } i \text{ and } j \text{ are both empty}) = \left(\frac{n-2}{n}\right)^n,
\]
so that
\[
\mathrm{Var}(N_n) = E(N_n^2) - (EN_n)^2 = n\left(1 - \frac{1}{n}\right)^n + n(n-1)\left(1 - \frac{2}{n}\right)^n - n^2\left(1 - \frac{1}{n}\right)^{2n}.
\]

Now let $Y_n = \frac{1}{n}N_n$. We have
\[
EY_n \to e^{-1}
\]
as $n \to \infty$, and
\[
\mathrm{Var}(Y_n) = \frac{1}{n}\left(1 - \frac{1}{n}\right)^n + \frac{n-1}{n}\left(1 - \frac{2}{n}\right)^n - \left(1 - \frac{1}{n}\right)^{2n} \to 0 + e^{-2} - e^{-2} = 0,
\]
as $n \to \infty$. Therefore, by Theorem 9.1,
\[
Y_n = \frac{N_n}{n} \to e^{-1},
\]
as $n \to \infty$, in probability.
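A quick Python simulation sketch (an added illustration): distribute $n$ balls into $n$ boxes at random and print the fraction of empty boxes, which should be close to $e^{-1} \approx 0.3679$.

import math
import random

rng = random.Random(4)
for n in [100, 10000, 1000000]:
    occupied = {rng.randrange(n) for _ in range(n)}  # boxes hit at least once
    print(n, round((n - len(occupied)) / n, 4))      # fraction of empty boxes
print("1/e =", round(math.exp(-1), 4))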

    Problems

1. Assume that $n$ married couples (amounting to $2n$ people) are seated at random on $2n$ seats around a table. Let $T$ be the number of couples that sit together. Determine $ET$ and $\mathrm{Var}(T)$.

2. There are $n$ birds that sit in a row on a wire. Each bird looks left or right with equal probability. Let $N$ be the number of birds not seen by any neighboring bird. Determine, with proof, the constant $c$ so that, as $n \to \infty$, $\frac{1}{n}N \to c$ in probability.

3. Recall the coupon collector problem: sample from $n$ cards, with replacement, indefinitely, and let $N$ be the number of samples needed until each of the $n$ different cards is represented. Find a sequence $a_n$ so that, as $n \to \infty$, $N/a_n$ converges to 1 in probability.

4. Kings and Lakers are playing a best-of-seven playoff series, which means they play until one team wins four games. Assume that the Kings win every game independently with probability $p$.


(There is no difference between home and away games.) Let $N$ be the number of games played. Compute $EN$ and $\mathrm{Var}(N)$.

5. An urn contains $n$ red and $m$ black balls. Pull balls from the urn one by one, without replacement. Let $X$ be the number of red balls you pull before any black one, and $Y$ the number of red balls between the first and the second black one. Compute $EX$ and $EY$.

    Solutions to problems

1. Let $I_i$ be the indicator of the event that the $i$th couple sits together. Then $T = I_1 + \dots + I_n$. Moreover,
\[
EI_i = \frac{2}{2n-1}, \qquad E(I_iI_j) = \frac{2^2\,(2n-3)!}{(2n-1)!} = \frac{4}{(2n-1)(2n-2)},
\]
for any $i$ and $j \ne i$. Thus
\[
ET = \frac{2n}{2n-1}
\]
and
\[
E(T^2) = ET + n(n-1)\,\frac{4}{(2n-1)(2n-2)} = \frac{4n}{2n-1},
\]
so
\[
\mathrm{Var}(T) = \frac{4n}{2n-1} - \frac{4n^2}{(2n-1)^2} = \frac{4n(n-1)}{(2n-1)^2}.
\]
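These formulas can be checked by Monte Carlo. The Python sketch below (an added check; the helper name couples_together is ours) seats $2n$ people at random around a circular table and compares the empirical mean and variance of $T$ with the formulas above.

import random

def couples_together(n, rng):
    # Persons 0..2n-1; person k belongs to couple k // 2. Seat them
    # uniformly at random around a circular table and count the couples
    # occupying adjacent seats.
    people = list(range(2 * n))
    rng.shuffle(people)
    return sum(people[i] // 2 == people[(i + 1) % (2 * n)] // 2
               for i in range(2 * n))

n, trials, rng = 5, 100000, random.Random(5)
samples = [couples_together(n, rng) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials
print(round(mean, 3), round(2 * n / (2 * n - 1), 3))                # ET
print(round(var, 3), round(4 * n * (n - 1) / (2 * n - 1) ** 2, 3))  # Var(T)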

2. Let $I_i$ indicate the event that bird $i$ is not seen by any other bird. Then $EI_i$ is $\frac{1}{2}$ if $i = 1$ or $i = n$, and $\frac{1}{4}$ otherwise. It follows that
\[
EN = 1 + \frac{n-2}{4} = \frac{n+2}{4}.
\]
Furthermore, $I_i$ and $I_j$ are independent if $|i - j| \ge 3$ (two birds that have two or more birds between them are observed independently). Thus $\mathrm{Cov}(I_i, I_j) = 0$ if $|i - j| \ge 3$. As $I_i$ and $I_j$ are indicators, $\mathrm{Cov}(I_i, I_j) \le 1$ for any $i$ and $j$. For the same reason, $\mathrm{Var}(I_i) \le 1$. Therefore,
\[
\mathrm{Var}(N) = \sum_i \mathrm{Var}(I_i) + \sum_{i \ne j} \mathrm{Cov}(I_i, I_j) \le n + 4n = 5n.
\]
Clearly, if $M = \frac{1}{n}N$, then $EM = \frac{1}{n}EN \to \frac{1}{4}$ and $\mathrm{Var}(M) = \frac{1}{n^2}\mathrm{Var}(N) \to 0$. It follows that $c = \frac{1}{4}$.
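The value $c = \frac{1}{4}$ can also be seen in a Python simulation sketch (an added illustration): bird $i$ is unseen exactly when its left neighbor (if any) looks left and its right neighbor (if any) looks right.

import random

rng = random.Random(9)
for n in [100, 10000, 1000000]:
    looks = [rng.choice('LR') for _ in range(n)]
    # Bird i is unseen iff its left neighbor looks left ('L') and its
    # right neighbor looks right ('R'); edge birds lack one neighbor.
    unseen = sum(
        (i == 0 or looks[i - 1] == 'L') and (i == n - 1 or looks[i + 1] == 'R')
        for i in range(n))
    print(n, round(unseen / n, 4))  # fraction should approach 1/4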

3. Let $N_i$ be the number of coupons needed to get $i$ different coupons after having $i-1$ different ones. Then $N = N_1 + \dots + N_n$, and the $N_i$ are independent Geometric with success probability $\frac{n-i+1}{n}$. So
\[
EN_i = \frac{n}{n-i+1}, \qquad \mathrm{Var}(N_i) = \frac{n(i-1)}{(n-i+1)^2},
\]
and therefore
\[
EN = n\left(1 + \frac{1}{2} + \dots + \frac{1}{n}\right),
\]
\[
\mathrm{Var}(N) = \sum_{i=1}^n \frac{n(i-1)}{(n-i+1)^2} \le n^2\left(1 + \frac{1}{2^2} + \dots + \frac{1}{n^2}\right) \le n^2\,\frac{\pi^2}{6} < 2n^2.
\]
If $a_n = n \log n$, then
\[
\frac{1}{a_n}EN \to 1, \qquad \frac{1}{a_n^2}\mathrm{Var}(N) \to 0,
\]
as $n \to \infty$, so that
\[
\frac{1}{a_n}N \to 1
\]
in probability.
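A Python simulation sketch of the coupon collector (an added check; the helper name collect is ours): the printed ratio $N/(n \log n)$ should approach 1 as $n$ grows.

import math
import random

def collect(n, rng):
    # Sample cards uniformly with replacement until all n have appeared;
    # return the number of samples needed.
    seen, count = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        count += 1
    return count

rng = random.Random(6)
for n in [100, 1000, 10000, 100000]:
    N = collect(n, rng)
    print(n, N, round(N / (n * math.log(n)), 3))  # ratio slowly approaches 1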

4. Let $I_i$ be the indicator of the event that the $i$th game is played. Then $EI_1 = EI_2 = EI_3 = EI_4 = 1$,
\begin{align*}
EI_5 &= 1 - p^4 - (1-p)^4,\\
EI_6 &= 1 - p^5 - 5p^4(1-p) - 5p(1-p)^4 - (1-p)^5,\\
EI_7 &= \binom{6}{3}\,p^3(1-p)^3.
\end{align*}
Add the seven expectations to get $EN$. Now, to compute $E(N^2)$, we use the fact that, if $i > j$, then $I_iI_j = I_i$, so that $E(I_iI_j) = EI_i$. So
\[
E(N^2) = \sum_i EI_i + 2\sum_{i > j} E(I_iI_j) = \sum_i EI_i + 2\sum_i (i-1)EI_i = \sum_{i=1}^7 (2i-1)EI_i,
\]
and the final result can be obtained by plugging in the $EI_i$ and finally by the standard formula
\[
\mathrm{Var}(N) = E(N^2) - (EN)^2.
\]
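The indicator formulas can be checked numerically. The Python sketch below (an added check, with the arbitrary choice $p = 0.6$; the helper names are ours) evaluates $EN$ and $\mathrm{Var}(N)$ from the formulas above and compares them with a direct simulation of the series.

import math
import random

def indicator_means(p):
    # EI_1 .. EI_7 from the formulas above.
    q = 1 - p
    return [1.0, 1.0, 1.0, 1.0,
            1 - p**4 - q**4,
            1 - p**5 - 5 * p**4 * q - 5 * p * q**4 - q**5,
            math.comb(6, 3) * p**3 * q**3]

def simulate(p, trials, rng):
    # Play best-of-seven series; return empirical mean and variance of N.
    total = totalsq = 0
    for _ in range(trials):
        w = l = 0
        while max(w, l) < 4:
            if rng.random() < p:
                w += 1
            else:
                l += 1
        g = w + l
        total += g
        totalsq += g * g
    mean = total / trials
    return mean, totalsq / trials - mean * mean

p = 0.6
e = indicator_means(p)
EN = sum(e)
EN2 = sum((2 * i - 1) * e[i - 1] for i in range(1, 8))
print(round(EN, 4), round(EN2 - EN ** 2, 4))  # exact EN and Var(N)
print(tuple(round(v, 4) for v in simulate(p, 200000, random.Random(7))))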

5. Imagine the balls ordered in a row, the ordering specifying the sequence in which they are pulled. Let $I_i$ be the indicator of the event that the $i$th red ball is pulled before any black ones. Then $EI_i = \frac{1}{m+1}$, simply the probability that, in a random ordering of the $i$th red ball and all $m$ black ones, the red one comes first. As $X = I_1 + \dots + I_n$, we get $EX = \frac{n}{m+1}$.

Now let $J_i$ be the indicator of the event that the $i$th red ball is pulled between the first and the second black one. Then $EJ_i$ is the probability that the red ball is second in the ordering of the $m+1$ balls as above, so $EJ_i = EI_i$, and $EY = EX = \frac{n}{m+1}$.