ENGG 2040C: Probability Models and Applications
Andrej Bogdanov
Spring 2013
8. Limit theorems
Many times we do not need to calculate probabilities exactly
Sometimes it is enough to know that a probability is very small (or very large)
E.g. P(earthquake tomorrow) = ?
This is often a lot easier
What do you think?
I toss a coin 1000 times. The probability that I get 14 consecutive heads is
A: < 10%    B: ≈ 50%    C: > 90%
Consecutive heads
Let N be the number of occurrences of 14 consecutive heads in 1000 coin flips:
N = I1 + … + I987
where Ii is an indicator r.v. for the event
“14 consecutive heads starting at position i”.
E[Ii] = P(Ii = 1) = 1/2¹⁴
E[N] = 987 ⋅ 1/2¹⁴ = 987/16384 ≈ 0.0602
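As a sanity check, the expectation on this slide can be computed directly (a hypothetical side calculation, not part of the original slides):

```python
# Expected number of length-14 all-heads runs in 1000 fair coin flips.
# N = I_1 + ... + I_987, and each indicator has E[I_i] = 1/2^14,
# so by linearity of expectation E[N] = 987 / 2^14.
n, k = 1000, 14
positions = n - k + 1          # 987 possible starting positions
EN = positions / 2**k          # linearity of expectation
print(positions, round(EN, 4)) # prints: 987 0.0602
```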
Markov’s inequality
For every non-negative random variable X and every value a:
P(X ≥ a) ≤ E[X] / a.
E[N ] ≈ 0.0602
P(N ≥ 1) ≤ E[N] / 1 ≈ 6%.
Proof of Markov’s inequality
For every non-negative random variable X and every value a:
P(X ≥ a) ≤ E[X] / a.
E[X] = E[X | X ≥ a] P(X ≥ a) + E[X | X < a] P(X < a)
Since E[X | X ≥ a] ≥ a, E[X | X < a] ≥ 0, and P(X < a) ≥ 0:
E[X] ≥ a P(X ≥ a) + 0.
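A quick numerical check of Markov’s inequality on a hypothetical example (a Binomial(20, 0.3) variable, chosen only for illustration):

```python
from math import comb

# Compare the exact tail P(X >= a) of X ~ Binomial(20, 0.3)
# with Markov's bound E[X]/a, for a few values of a.
n, p = 20, 0.3
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
EX = n * p                          # E[X] = np = 6
for a in (8, 10, 14):
    tail = sum(pmf[a:])             # exact P(X >= a)
    assert tail <= EX / a           # Markov's bound holds
    print(a, round(tail, 4), round(EX / a, 4))
```

The bound is loose but valid for every a; that looseness is exactly why Chebyshev’s inequality, coming up shortly, is worth the extra work.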
Hats
1000 people throw their hats in the air. What is the probability that at least 100 people get their own hat back?
Solution
N = I1 + … + I1000, where Ii is the indicator for the event that person i gets their own hat back. Then E[Ii] = P(Ii = 1) = 1/n with n = 1000.
E[N] = n ⋅ 1/n = 1
P(N ≥ 100) ≤ E[N] / 100 = 1%.
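A hypothetical simulation of the hats experiment (a random permutation of 1000 hats; not part of the original slides):

```python
import random

# N = number of fixed points of a uniformly random permutation of n hats.
# The slide shows E[N] = 1; check this empirically, and note that
# N >= 100 essentially never happens, consistent with Markov's 1% bound.
random.seed(1)
n, trials = 1000, 5000
counts = []
for _ in range(trials):
    perm = list(range(n))
    random.shuffle(perm)                           # random hat assignment
    counts.append(sum(perm[i] == i for i in range(n)))
avg = sum(counts) / trials
print(round(avg, 2))                               # close to E[N] = 1
print(sum(c >= 100 for c in counts))               # how often N >= 100
```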
Patterns
A coin is tossed 1000 times. Give an upper bound on the probability that the pattern HH occurs:
(a) at least 500 times
(b) at most 100 times
Patterns
Let N be the number of occurrences of HH.
Last time we calculated E[N] = 999/4 = 249.75.
(a)
P(N ≥ 500) ≤ E[N] / 500 = 249.75/500 ≈ 49.95%
so 500 or more HHs occur with probability at most about 50%.
(b)
Markov’s inequality does not directly bound P(N ≤ 100), but 999 – N is non-negative, so
P(N ≤ 100) = P(999 – N ≥ 899) ≤ E[999 – N] / 899 = (999 – 249.75)/899 ≈ 83.34%.
Chebyshev’s inequality
For every random variable X and every t:
P(|X – μ| ≥ tσ) ≤ 1/t²
where μ = E[X], σ = √Var[X].
Patterns
E[N] = 999/4 = 249.75
Var[N] = (5⋅999 – 7)/16 = 311.75
μ = 249.75, σ ≈ 17.66
(a)
P(N ≥ 500) ≤ P(|N – μ| ≥ 14.17σ) ≤ 1/14.17² ≈ 0.50%
(b)
P(N ≤ 100) ≤ P(|N – μ| ≥ 8.47σ) ≤ 1/8.47² ≈ 1.39%
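The mean and variance used above can be checked by a hypothetical simulation (counting HH occurrences in repeated runs of 1000 flips; not part of the original slides):

```python
import random

# Simulate N = number of HH occurrences in 1000 fair coin flips,
# and compare the sample mean and variance with the slide's
# E[N] = 249.75 and Var[N] = 311.75.
random.seed(2)
trials, n = 5000, 1000
samples = []
for _ in range(trials):
    flips = [random.random() < 0.5 for _ in range(n)]
    samples.append(sum(flips[i] and flips[i + 1] for i in range(n - 1)))
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials
print(round(mean, 1), round(var, 1))   # near 249.75 and 311.75
```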
Proof of Chebyshev’s inequality
For every random variable X and every t:
P(|X – μ| ≥ tσ) ≤ 1/t²
where μ = E[X], σ = √Var[X].
P(|X – μ| ≥ tσ) = P((X – μ)² ≥ t²σ²) ≤ E[(X – μ)²] / (t²σ²) = 1/t²
where the inequality is Markov’s, applied to the non-negative random variable (X – μ)².
An illustration
(figure: Markov’s inequality P(X ≥ a) ≤ μ/a bounds the tail of a non-negative X beyond a; Chebyshev’s inequality P(|X – μ| ≥ tσ) ≤ 1/t² bounds the mass outside the interval [μ – tσ, μ + tσ].)
Polling
(figure: a population of voters, some preferring blue and some red; the pollster samples n of them.)
Polling
Xi = 1 if sampled voter i prefers blue, 0 if voter i prefers red
X1, …, Xn are independent Bernoulli(μ)
where μ is the fraction of blue voters
X = X1 + … + Xn
X/n is the pollster’s estimate of μ
Polling
How accurate is the pollster’s estimate X/n?
X = X1 + … + Xn, μ = E[Xi], σ = √Var[Xi]
E[X] = E[X1] + … + E[Xn] = μn
Var[X] = Var[X1] + … + Var[Xn] = σ²n
Polling
X = X1 + … + Xn, E[X] = μn, Var[X] = σ²n
By Chebyshev’s inequality: P(|X – μn| ≥ tσ√n) ≤ 1/t².
Dividing by n inside the absolute value: P(|X/n – μ| ≥ ε) ≤ δ
where ε = tσ/√n is the sampling error and δ = 1/t² is the confidence error.
The weak law of large numbers
X1, …, Xn are independent with the same p.m.f. (p.d.f.), μ = E[Xi], σ = √Var[Xi], X = X1 + … + Xn.
For every ε, δ > 0 and n ≥ σ²/(ε²δ):
P(|X/n – μ| ≥ ε) ≤ δ
Polling
Say we want confidence error δ = 10% and sampling error ε = 5%. How many people should we poll?
By the weak law, it suffices that
n ≥ σ²/(ε²δ) = σ²/(0.05² ⋅ 0.1) = 4000σ².
For Bernoulli(μ) samples, σ² = μ(1 – μ) ≤ 1/4,
so polling about 4000 ⋅ 1/4 = 1000 people suffices.
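The arithmetic behind the 1000-person figure, as a one-line check:

```python
# Chebyshev-based sample size: n >= sigma^2/(eps^2 * delta),
# with sigma^2 <= 1/4 for Bernoulli samples.
eps, delta = 0.05, 0.10
n_bound = 0.25 / (eps**2 * delta)
print(round(n_bound))   # prints: 1000
```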
A polling experiment
(figure: the running average (X1 + … + Xn)/n plotted against n, for X1, …, Xn independent Bernoulli(1/2).)
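The polling experiment can be reproduced with a short, hypothetical script (the original slide shows a plot; here we just print the running average at a few values of n):

```python
import random

# Running average of independent Bernoulli(1/2) samples:
# by the weak law of large numbers it should approach 1/2.
random.seed(5)
samples = [random.random() < 0.5 for _ in range(10000)]
running, total = [], 0
for i, x in enumerate(samples, 1):
    total += x
    running.append(total / i)        # (X1 + ... + Xi) / i
for n in (10, 100, 1000, 10000):
    print(n, round(running[n - 1], 3))
```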
A more precise estimate
Let’s assume n is large.
X1, …, Xn are independent with the same p.m.f. (p.d.f.).
Weak law of large numbers: X1 + … + Xn ≈ μn with high probability.
Chebyshev: P(|X – μn| ≥ tσ√n) ≤ 1/t²,
which suggests X1 + … + Xn ≈ μn + Tσ√n for some fluctuation T of constant size.
Some experiments
X = X1 + … + Xn, Xi independent Bernoulli(1/2)
(figures: the distribution of X for n = 6 and for n = 40.)
Some experiments
X = X1 + … + Xn, Xi independent Poisson(1)
(figures: the distribution of X for n = 3 and for n = 20.)
Some experiments
X = X1 + … + Xn, Xi independent Uniform(0, 1)
(figures: the distribution of X for n = 2 and for n = 10.)
The normal random variable
f(t) = (2π)^(-1/2) e^(-t²/2)
(figure: the p.d.f. f(t) of a normal random variable.)
The central limit theorem
X1, …, Xn are independent with the same p.m.f. (p.d.f.), μ = E[Xi], σ = √Var[Xi], X = X1 + … + Xn.
For every t (positive or negative):
lim(n → ∞) P(X ≥ μn + tσ√n) = P(T ≥ t)
where T is a normal random variable.
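The limit in the theorem can be checked numerically with a hypothetical experiment (sums of 30 Uniform(0, 1) samples, with the normal tail computed from the error function; not part of the original slides):

```python
import random, math

# CLT check: X = X1 + ... + Xn with Xi ~ Uniform(0, 1),
# mu = 1/2, sigma = 1/sqrt(12). Estimate P(X >= mu*n + t*sigma*sqrt(n))
# and compare with P(T >= t) for a standard normal T.
random.seed(3)
n, trials, t = 30, 20000, 1.0
mu, sigma = 0.5, 1 / math.sqrt(12)
threshold = mu * n + t * sigma * math.sqrt(n)
hits = sum(
    sum(random.random() for _ in range(n)) >= threshold
    for _ in range(trials)
)
estimate = hits / trials
normal_tail = 0.5 * (1 - math.erf(t / math.sqrt(2)))   # P(T >= 1) ≈ 0.159
print(round(estimate, 3), round(normal_tail, 3))
```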
Polling again
Probability model
X = X1 + … + Xn, Xi independent Bernoulli(μ)
μ = fraction that will vote blue
E[Xi] = μ, σ = √Var[Xi] = √(μ(1 – μ)) ≤ ½.
Say we want confidence error δ = 10% and sampling error ε = 5%. How many people should we poll?
Polling again
The estimate X/n misses μ by more than 5% when X ≥ μn + 5%·n or X ≤ μn – 5%·n.
Setting tσ√n = 5%·n, that is t = 5%·√n/σ, the central limit theorem gives
lim(n → ∞) P(X ≥ μn + tσ√n) = P(T ≥ t)
lim(n → ∞) P(X ≤ μn – tσ√n) = P(T ≤ –t)
so lim(n → ∞) P(X/n is not within 5% of μ) = P(T ≥ t) + P(T ≤ –t) = 2 P(T ≤ –t).
The c.d.f. of a normal random variable
(figure: the c.d.f. F(t) = P(T ≤ t) of a normal random variable; by symmetry of the p.d.f., P(T ≥ t) = P(T ≤ –t).)
Polling again
We want a confidence error of at most 10%:
confidence error = 2 P(T ≤ –t) = 2 P(T ≤ –5%·√n/σ) ≤ 2 P(T ≤ –√n/10), since σ ≤ ½.
We need to choose n so that P(T ≤ –√n/10) ≤ 5%.
Polling again
(figure: reading the 5% quantile off the normal c.d.f. F(t).)
P(T ≤ –√n/10) ≤ 5%
–√n/10 ≈ –1.645
n ≈ 16.45² ≈ 271
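The final step as a small check (the quantile value 1.645 is taken from the slide):

```python
import math

# CLT-based sample size: we need sqrt(n)/10 >= 1.645,
# the magnitude of the 5% lower quantile of the standard normal,
# i.e. n >= (10 * 1.645)^2.
z = 1.645
n_needed = math.ceil((10 * z) ** 2)
print(n_needed)   # prints: 271
```

Note how much smaller this is than the Chebyshev-based figure of 1000: the normal approximation gives a far tighter tail bound than 1/t².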
http://stattrek.com/online-calculator/normal.aspx
Party
Ten guests arrive independently at a party between 8pm and 9pm.
Give an estimate of the probability that the average arrival time of a guest is past 8:40pm.
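One way to check an answer to this exercise numerically, assuming (as the exercise presumably intends) that each arrival time is uniform on the hour:

```python
import random, math

# Party exercise: 10 guests arrive independently, uniform on [0, 60]
# minutes after 8pm. Estimate P(average arrival > 40 minutes) by
# simulation, and compare with the CLT estimate
# P(T >= (40 - 30) / (sigma / sqrt(10))), sigma = 60/sqrt(12).
random.seed(4)
trials, n = 100_000, 10
hits = sum(
    sum(random.uniform(0, 60) for _ in range(n)) / n > 40
    for _ in range(trials)
)
sim = hits / trials
t = (40 - 30) / ((60 / math.sqrt(12)) / math.sqrt(n))
clt = 0.5 * (1 - math.erf(t / math.sqrt(2)))   # normal tail P(T >= t)
print(round(sim, 3), round(clt, 3))
```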