Math 141 - Lecture 12: The LLN, CLT, and the Normal Distribution (people.reed.edu/~jones/Courses/P12.pdf, 2013-03-01)

Math 141, Lecture 12: The LLN, CLT, and the Normal Distribution

Albyn Jones

Library [email protected]

www.people.reed.edu/∼jones/courses/141


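Properties of $\bar{X}_n$

Suppose $X_1, X_2, \ldots, X_n$ are IID random variables, with mean $\mu$ and standard deviation $\sigma$. We know that

$$E(\bar{X}_n) = \mu$$

and

$$SD(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}$$

In other words, typical values for $\bar{X}_n$ are around

$$\mu \pm \frac{\sigma}{\sqrt{n}}$$

or more formally:

$$\bar{X}_n \xrightarrow{P} \mu$$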


The Law of Large Numbers

The Law of Large Numbers,

$$\bar{X}_n \xrightarrow{P} \mu,$$

tells us that $\bar{X}_n$ gets as close to $\mu$ as you like as $n \to \infty$, with probability approaching 1.

It does not tell you how close you are at any point, or how large $n$ must be to guarantee you are as close as you would like to be!

The Central Limit Theorem

Another famous theorem, called the Central Limit Theorem, answers that question!
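The convergence described by the Law of Large Numbers can be watched in a small simulation. This is only a sketch: the Exponential population with mean 2, the seed, and the sample size are assumptions chosen for illustration, not part of the lecture.

```r
# Running mean of IID draws: a sketch of the Law of Large Numbers.
set.seed(141)                       # arbitrary seed, only for reproducibility
n  <- 10000
mu <- 2                             # assumed population mean
x  <- rexp(n, rate = 1 / mu)        # IID Exponential draws with mean mu

# Sample mean after 1, 2, ..., n observations
running_mean <- cumsum(x) / seq_len(n)

# The running mean settles near mu as the sample grows
running_mean[c(10, 100, 1000, 10000)]
```

The printed values show the drift toward 2, but nothing in them says how fast that happens, which is the gap the lecture points out.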







First: The Normal Distribution

The so-called Normal distribution (aka Gaussian, or the bell-shaped curve) has its origin in approximations to Binomial probabilities for large n.

Before discussing that approximation, we study the properties of the Normal distribution.


The Normal Distribution

[Figure: The Standard Normal Density. Horizontal axis: Z, from −3 to 3. Vertical axis: density, from 0.0 to 0.4.]


The Normal Distribution, Part 2

The Normal Distribution has several important features:

- it is symmetric and unimodal,
- the mean, median, and mode coincide,
- it is completely characterized by the values of the mean µ and the standard deviation σ or variance σ²,
- every Normal distribution has the same shape.


Notation

The standard notation for a random variable X which has a Normal Distribution with mean µ and standard deviation σ is

$$X \sim N(\mu, \sigma^2)$$

In other words, list the mean and variance.

Warning! R functions are parametrized by the mean µ and standard deviation σ!
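The warning is easy to trip over in practice. A short sketch, using N(50, 100) as an assumed example, of what goes wrong when the variance is passed where R expects the SD:

```r
# X ~ N(50, 100): the second parameter in the notation is the variance,
# so the standard deviation is sqrt(100) = 10.
mu     <- 50
sigma2 <- 100

p_right <- pnorm(60, mean = mu, sd = sqrt(sigma2))  # correct: pass the SD
p_wrong <- pnorm(60, mean = mu, sd = sigma2)        # mistake: variance passed as the SD

p_right   # P(X <= 60), i.e. one SD above the mean
p_wrong   # a different (incorrect) probability
```

R raises no error in the second call; it simply answers a question about a much wider distribution.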


The Normal Distribution, Part 3

Roughly 95% of any Normal population lies within 2 SD's of the mean, and about 99.7% lies within 3 SD's of the mean.

[Figure: Areas under the Standard Normal Curve. Horizontal axis: Z, from −3 to 3. Vertical axis: density, from 0.0 to 0.4. Region areas, left to right: 0.023, 0.136, 0.341, 0.341, 0.136, 0.023.]
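The proportions within 1, 2, and 3 SD's can be checked directly with pnorm(), which defaults to the Standard Normal. A minimal sketch:

```r
# Proportion of a Standard Normal population within k SD's of the mean
within1 <- pnorm(1) - pnorm(-1)
within2 <- pnorm(2) - pnorm(-2)
within3 <- pnorm(3) - pnorm(-3)

round(c(within1, within2, within3), 4)   # 0.6827 0.9545 0.9973
```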


The Normal(50, 10²) Curve

The corresponding regions for ANY Normal distribution contain the same proportions of the population!

[Figure: Areas under the Normal(50,100) Curve. Horizontal axis: values from 20 to 80. Vertical axis: density, from 0.00 to 0.04. Region areas, left to right: 0.023, 0.136, 0.341, 0.341, 0.136, 0.023.]
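The claim that corresponding regions hold the same proportions can be verified numerically. In this sketch the band edges at 0, ±1, and ±2 SD's are an assumption, chosen because they reproduce the six areas printed on the figures.

```r
mu    <- 50
sigma <- 10

breaks_z <- c(-Inf, -2, -1, 0, 1, 2, Inf)   # band edges in SD units
breaks_x <- mu + sigma * breaks_z           # the same edges on the N(50, 10^2) scale

areas_z <- diff(pnorm(breaks_z))                          # Standard Normal bands
areas_x <- diff(pnorm(breaks_x, mean = mu, sd = sigma))   # N(50, 10^2) bands

round(areas_z, 3)
round(areas_x, 3)
```

Both vectors agree band by band, which is the sense in which every Normal distribution has the same shape.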


Linear functions of Normal RV’s are Normal!

Let $Z \sim N(0,1)$, and let $Y = \sigma Z + \mu$. Then

$$E(Y) = E(\sigma Z + \mu) = \sigma E(Z) + \mu = \mu$$

and

$$SD(Y) = SD(\sigma Z + \mu) = \sigma\, SD(Z) = \sigma$$

Thus:

$$Y \sim N(\mu, \sigma^2)$$
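A simulation gives a quick sanity check of this result. The values µ = 50 and σ = 10, the seed, and the number of draws are assumptions for illustration.

```r
set.seed(141)                 # arbitrary seed, only for reproducibility
mu    <- 50
sigma <- 10

z <- rnorm(100000)            # Standard Normal draws
y <- sigma * z + mu           # linear function of Z

c(mean = mean(y), sd = sd(y)) # should be close to 50 and 10

# A probability computed from the simulated Y's versus the N(50, 10^2) value
mean(y <= 55)
pnorm(55, mean = mu, sd = sigma)
```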


Standardization

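It works the other way too: let $Y \sim N(\mu, \sigma^2)$, then

$$Z = \frac{Y - \mu}{\sigma}$$

is a Standard Normal RV:

$$E(Z) = E\left(\frac{Y - \mu}{\sigma}\right) = \frac{1}{\sigma} E(Y - \mu) = \frac{1}{\sigma}\left(E(Y) - \mu\right) = 0$$

and

$$SD(Z) = SD\left(\frac{Y - \mu}{\sigma}\right) = \frac{1}{\sigma}\, SD(Y) = 1$$

A standardized RV is often called a Z-score, and represents the number of standard deviations the value is away from the mean.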
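Standardizing by hand and then using the Standard Normal gives the same probability as handing µ and σ to pnorm(). A small sketch with the assumed values µ = 50, σ = 10, and y = 55:

```r
mu    <- 50
sigma <- 10
y     <- 55

z <- (y - mu) / sigma        # Z-score: y is half an SD above the mean
z

p_standardized <- pnorm(z)                         # Standard Normal, after standardizing
p_direct       <- pnorm(y, mean = mu, sd = sigma)  # let R do the standardizing

c(p_standardized, p_direct)  # the two agree
```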


Historical Footnote!

Standardizing data was a common practice back before computers; then you only needed a table of probabilities for the Standard Normal distribution! Tables are unnecessary now, but it is still very useful to remember that areas under the Normal density curve depend only on the mean and SD, and that Z-scores measure in units of SD's.


R Functions for Normal Probabilities

pnorm(a, µ, σ) gives P(Y ≤ a); dnorm(a, µ, σ) gives the height of the density curve at a.

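[Figure: dnorm() and pnorm(). The N(50, 10²) density for values from 20 to 80, vertical axis from 0.00 to 0.04. The area to the left of 55 is labeled pnorm(55,50,10) = 0.691; the height of the curve at 55 is labeled dnorm(55,50,10).]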
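The two labeled quantities can be computed as follows. The upper-tail and interval calculations are additions for illustration; lower.tail is a standard argument of pnorm().

```r
p <- pnorm(55, mean = 50, sd = 10)   # P(Y <= 55) for Y ~ N(50, 10^2)
d <- dnorm(55, mean = 50, sd = 10)   # height of the density curve at 55

round(p, 3)   # 0.691
d             # a density height, not a probability

# Upper tail P(Y > 55), two equivalent ways
upper1 <- 1 - p
upper2 <- pnorm(55, mean = 50, sd = 10, lower.tail = FALSE)

# Probability of an interval: P(45 < Y <= 55)
interval <- pnorm(55, 50, 10) - pnorm(45, 50, 10)
```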


The Normal Density and CDF

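The density function for $X \sim N(\mu, \sigma^2)$ is given by

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

The CDF is the area under the curve to the left of the point $x$:

$$P(X \le x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(t-\mu)^2}{2\sigma^2}}\, dt$$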
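To connect the formulas with R's built-in functions, the density can be coded by hand and integrated numerically. The helper name normal_density is hypothetical, and integrate() is used here only as a numerical check, not as how pnorm() is implemented.

```r
# Hand-coded Normal density, following the formula for f(x)
normal_density <- function(x, mu, sigma) {
  1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
}

mu    <- 50
sigma <- 10

# Density: hand-coded formula versus dnorm()
normal_density(55, mu, sigma)
dnorm(55, mean = mu, sd = sigma)

# CDF: numerical area to the left of 55 versus pnorm()
cdf_numeric <- integrate(normal_density, lower = -Inf, upper = 55,
                         mu = mu, sigma = sigma)$value
cdf_numeric
pnorm(55, mean = mu, sd = sigma)
```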


Cumulative Normal Probabilities: pnorm() and qnorm()

The CDF is the area under the density curve up to a point, given by pnorm(). qnorm() is the inverse function of pnorm().

[Figure: The Cumulative Distribution Function of N(50, 10²), for values from 20 to 80, vertical axis from 0.0 to 1.0. Annotations: pnorm(55,50,10) = 0.691 and qnorm(.691,50,10) = 55.]
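A short sketch of the inverse relationship. The last line, which finds the central 95% of N(50, 10²), is an added illustration of a typical use of qnorm().

```r
p <- pnorm(55, mean = 50, sd = 10)   # probability to the left of 55
x <- qnorm(p, mean = 50, sd = 10)    # the point with that probability to its left
x                                    # recovers 55, up to floating-point error

# With the rounded probability 0.691 the answer is close to 55, not exactly 55
x_rounded <- qnorm(0.691, mean = 50, sd = 10)
x_rounded

# Quantiles cutting off 2.5% in each tail
central95 <- qnorm(c(0.025, 0.975), mean = 50, sd = 10)
central95
```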


Another pnorm example

[Figure: the Standard Normal density for x from −3 to 3, vertical axis from 0.0 to 0.4, with the area between −1 and 1 labeled pnorm(1)−pnorm(−1) : .68...]


pnorm() and qnorm()

QUIZ!

What is qnorm(pnorm(0))?

What is pnorm(qnorm(.5))?


Sample Means

Suppose $X_1, X_2, \ldots, X_n$ are IID random variables, with mean $\mu$ and standard deviation $\sigma$. We know that

$$E(\bar{X}_n) = \mu$$

and

$$SD(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}$$
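Both facts about the sample mean can be checked by simulating many samples. The Exponential population (mean 2, hence SD 2), the sample size n = 25, the seed, and the number of replications are assumptions for illustration.

```r
set.seed(141)                 # arbitrary seed, only for reproducibility
n     <- 25                   # size of each sample
mu    <- 2                    # mean of the assumed Exponential population
sigma <- 2                    # an Exponential with mean 2 also has SD 2

# 10000 independent sample means, each from n IID draws
means <- replicate(10000, mean(rexp(n, rate = 1 / mu)))

c(mean_of_means = mean(means),      # close to mu = 2
  sd_of_means   = sd(means),        # close to sigma / sqrt(n)
  theory_sd     = sigma / sqrt(n))  # 0.4
```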


The Central Limit Theorem

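Let $X_1, X_2, \ldots, X_n$ be IID random variables, with mean $\mu$ and standard deviation $\sigma$. Then as $n$ increases, the distribution of $\bar{X}_n$ approaches that of a Normal with mean $\mu$ and SD $\sigma/\sqrt{n}$:

$$P\left(\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le x\right) \to \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt$$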
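The theorem can be illustrated by standardizing simulated sample means from a clearly non-Normal population and comparing their empirical CDF with pnorm(). The skewed Exponential population, n = 100, the seed, and the evaluation points are assumptions made for this sketch.

```r
set.seed(141)                 # arbitrary seed, only for reproducibility
n     <- 100
mu    <- 2                    # mean of the assumed Exponential population
sigma <- 2                    # its SD

# Standardized sample means: (mean - mu) / (sigma / sqrt(n))
z <- replicate(10000, (mean(rexp(n, rate = 1 / mu)) - mu) / (sigma / sqrt(n)))

x         <- c(-2, -1, 0, 1, 2)
simulated <- sapply(x, function(a) mean(z <= a))   # empirical P(Z_n <= x)
normal    <- pnorm(x)                              # limiting Standard Normal CDF

round(rbind(simulated, normal), 3)
```

The two rows are close but not identical: the agreement is a limit statement, and for a skewed population some discrepancy remains at any finite n.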


The Central Limit Theorem, Three versions

We actually have three ways of describing the normal approximation:

1. $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$

2. $\bar{X} \sim N(\mu, \sigma^2/n)$

3. $\sum X_i \sim N(n\mu, n\sigma^2)$
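The three descriptions are the same approximation on three scales, so they give the same probability for the same event. A sketch with assumed values n = 25, µ = 2, σ = 3, for the event that the sample mean is at most 2.5:

```r
n     <- 25
mu    <- 2
sigma <- 3
xbar  <- 2.5    # event: sample mean <= 2.5, i.e. sum <= n * 2.5

# 1. Standardized average against N(0, 1)
p1 <- pnorm((xbar - mu) / (sigma / sqrt(n)))

# 2. Average against N(mu, sigma^2 / n); R wants the SD, sigma / sqrt(n)
p2 <- pnorm(xbar, mean = mu, sd = sigma / sqrt(n))

# 3. Sum against N(n * mu, n * sigma^2); the SD is sqrt(n) * sigma
p3 <- pnorm(n * xbar, mean = n * mu, sd = sqrt(n) * sigma)

c(p1, p2, p3)   # all three agree
```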







Interpretation

It is the CLT that allows us to say that

$$\bar{X} \approx \mu \pm \frac{\sigma}{\sqrt{n}}$$

For the Normal distribution, the SD really is a typical deviation!

Finally: this also explains why the SD is often more useful than other measures of spread.


Example: Binomial

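Let $X_1, \ldots, X_n$ be IID Bernoulli($p$) RV's. Then $\mu = p$ and $\sigma = \sqrt{p(1-p)}$, while $\bar{X} = \hat{p}$.

Standardized Averages:

$$\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \sim N(0,1)$$

Averages:

$$\hat{p} \sim N(p,\, p(1-p)/n)$$

Sums:

$$\sum X_i \sim \text{Binomial}(n,p) \sim N(np,\, np(1-p))$$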
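For a concrete case, the exact Binomial probability can be set beside the Normal approximation on both the sum scale and the p̂ scale. The values n = 100, p = 0.3, and the cutoff 35 are assumptions for illustration.

```r
n <- 100
p <- 0.3

# Exact: P(sum of the X_i <= 35) from the Binomial(n, p) distribution
exact <- pbinom(35, size = n, prob = p)

# Normal approximation on the sum scale: N(np, np(1-p))
approx_sum <- pnorm(35, mean = n * p, sd = sqrt(n * p * (1 - p)))

# The same event on the p-hat scale: p-hat <= 0.35, with N(p, p(1-p)/n)
approx_phat <- pnorm(0.35, mean = p, sd = sqrt(p * (1 - p) / n))

c(exact = exact, approx_sum = approx_sum, approx_phat = approx_phat)
```

The two Normal calculations coincide because they describe the same event; the remaining gap from the exact value is the approximation error.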






Example: Binomial(20,1/2)

[Figure: Binomial(20,.5) and N(10, 20*.5*.5). Binomial probabilities for counts 0 to 20 with the Normal density overlaid; vertical axis from 0.00 to 0.15.]



[Figure: Binomial(50,.5) and N(25, 50*.5*.5). Binomial probabilities for counts 0 to 50 with the Normal density overlaid; vertical axis from 0.00 to 0.12.]
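The closeness visible in the two plots can also be measured numerically, by taking the largest gap between the Binomial probabilities and the Normal density at the integers. The helper name max_gap is hypothetical, and this summary is an added illustration rather than something computed in the lecture.

```r
# Largest absolute gap between dbinom() and the matching Normal density
# over the counts 0, 1, ..., n
max_gap <- function(n, p = 0.5) {
  k <- 0:n
  binom  <- dbinom(k, size = n, prob = p)
  normal <- dnorm(k, mean = n * p, sd = sqrt(n * p * (1 - p)))
  max(abs(binom - normal))
}

gaps <- c(n20 = max_gap(20), n50 = max_gap(50))
gaps   # the gap shrinks as n grows
```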



If $X \sim \text{Binomial}(100, 1/2)$, then

$$E(X) = np = 100/2 = 50$$

$$SD(X) = \sqrt{np(1-p)} = \sqrt{100/4} = 5$$

So typical values for X are around 45 to 55, and, by the 2 SD rule, roughly 95% of the time,

$$40 \le X \le 60$$

sum(dbinom(40:60,100,.5)) gives 0.9648.
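The exact value sits a little above the 2 SD figure because X is discrete and the endpoints 40 and 60 are included. The sketch below makes the comparison; the continuity correction (widening the interval by 0.5 on each side) is a standard adjustment that is not part of the lecture and is included here as an assumption.

```r
# Exact Binomial(100, 1/2) probability of 40 <= X <= 60
exact <- sum(dbinom(40:60, size = 100, prob = 0.5))

# Plain Normal approximation with mean 50 and SD 5
clt <- pnorm(60, mean = 50, sd = 5) - pnorm(40, mean = 50, sd = 5)

# With a continuity correction (an addition, not from the lecture)
clt_cc <- pnorm(60.5, mean = 50, sd = 5) - pnorm(39.5, mean = 50, sd = 5)

round(c(exact = exact, clt = clt, clt_cc = clt_cc), 4)   # 0.9648 0.9545 0.9643
```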



If $X \sim \text{Binomial}(100, 1/100)$, then

$$E(X) = np = 100/100 = 1$$

$$SD(X) = \sqrt{np(1-p)} = \sqrt{100 \cdot .01 \cdot .99} \approx 1$$

So typical values for X are roughly 0 to 2, and according to the CLT roughly 95% of the time,

$$-1 \le X \le 3$$

sum(dbinom(0:3,100,.01)) gives 0.9816, while sum(dbinom(0:2,100,.01)) gives 0.92. The Poisson approximation works better here!
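The comparison with the Poisson approximation can be made explicit. This sketch assumes the usual choice of Poisson mean λ = np = 1, which the lecture does not spell out.

```r
n <- 100
p <- 0.01

# Exact Binomial probability of 0 <= X <= 3
exact <- sum(dbinom(0:3, size = n, prob = p))

# Poisson approximation with lambda = n * p (assumed choice of lambda)
poisson <- sum(dpois(0:3, lambda = n * p))

# Normal approximation for -1 <= X <= 3, with mean np and SD sqrt(np(1-p))
s      <- sqrt(n * p * (1 - p))
normal <- pnorm(3, mean = n * p, sd = s) - pnorm(-1, mean = n * p, sd = s)

round(c(exact = exact, poisson = poisson, normal = normal), 4)
```

With p this small the Binomial is strongly skewed, so the symmetric Normal curve fits poorly while the Poisson tracks the exact value closely.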



[Figure: Binomial(100,.01) and the Normal curve with mean 1 and SD .995. Binomial probabilities for counts 0 to 16 with the Normal density overlaid; vertical axis from 0.0 to 0.4.]


Summary

The Normal distribution originated as a means of approximating probabilities for sums and averages.

For IID RV's $X_i$ with mean $\mu$ and variance $\sigma^2$,

$$\bar{X} \sim N(\mu, \sigma^2/n)$$

