40
Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section 2.2

Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Embed Size (px)

Citation preview

Page 1: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Lectures prepared by:Elchanan MosselYelena Shvets

Introduction to probability

Stat 134 FAll 2005

Berkeley

Follows Jim Pitman’s book:

ProbabilitySection 2.2

Page 2: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Binomial Distribution:Mean

Recall: the Mean

of a binomial(n,p) distribution

is given by:

= #Trials £ P(success) = n p

• The Mean = Mode (most likely value).

• The Mean = “Center of gravity” of the distribution.

Page 3: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Standard Deviation

The Standard Deviation (SD) of a distribution, denoted measures the spread of the distribution.

Def: The Standard Deviation (SD) of the binomial(n,p) distribution is given by:

Page 4: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Binomial DistributionExample 1:

• Let’s compare Bin(100,1/4) to Bin(100,1/2).

• 1 = 25, 2 = 50

• 1 = 4.33, 2 = 5

• We expect Bin(100,1/4) to be less spread than Bin(100,1/2).

• Indeed, you are more likely to guess the value of Bin(100,1/4) distribution than of Bin(100,1/2) since:For p=1/2 : P(50) ¼ 0.079589237;

For p=1/4 : P(25) ¼ 0.092132;

Page 5: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Histograms for binomial(100, 1/4) & binomial(100, 1/2) ;

.00

.02

.04

.06

.08

.10

.12

0 10 20 30 40 50 60 70 80 90 100

Page 6: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Binomial DistributionExample 2:

• Let’s compare Bin(50,1/2) to Bin(100,1/2).

• 1 = 25, 2 = 50

• 1 = 3.54, 2 = 5

• We expect Bin(50,1/2) to be less spread than Bin(100,1/2).

• Indeed, you are more likely to guess the value of Bin(50,1/2) distribution than of Bin(100,1/2) since:

For n=100 : P(50) ¼ 0.079589237;

For n=50 : P(25) ¼ 0.112556;

Page 7: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Histograms for binomial(50, 1/2) & binomial(100,1/2)

.00

.02

.04

.06

.08

.10

.12

0 10 20 30 40 50 60 70 80 90 100

Page 8: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

, and the normal curve

•We will see today that the •Mean () and the •SD () give a very good summary of the binom(n,p) distribution via •The Normal Curve with parameters and .

Page 9: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

binomial(n,½); n=50,100,250,500

.00

.02

.04

.06

.08

.10

.12

0 50 100 150 200 250 300

25, 3.54

50, 5

125, 7.91

250, 11.2

Page 10: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

.0

.1

.2

.3

.4

.5

Normal Curve

2

2

(x- )-

21y e

2

x

y

Page 11: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

.0

.1

.2

.3

.4

.5

The Normal Distribution

a xb

2

2b

2

a

1(a,b)= dx

2

( - )x

P e

2

2

(x- )a2

-

1P(- ,a) e dx

2

Page 12: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

.0

.1

.2

.3

.4

.5

.0

.1

.2

.3

.4

.5

.0

.1

.2

.3

.4

.5

The Normal Distribution

2

2

(x- )b -2

a

1P(a,b) = e dx

2

a xb

2

2

(x- )b -21

e dx2

2(x- )a -

221e dx

2

=

Page 13: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Standard Normal

2 x2

1(x)=

2e

Standard Normal Density Function:

Corresponds to = 0 and =1

Page 14: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Standard Normal

For the normal (,) distribution: P(a,b) = (b-) - (a-);

z

-

(z)= (x)dx

Standard Normal Cumulative Distribution Function:

Page 15: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Standard Normal

.0

.1

.2

.3

.4

.5

-5 -4 -3 -2 -1 0 1 2 3 4 5

.0

.1

.2

.3

.4

.5

.6

.7

.8

.9

1.0

-5 -4 -3 -2 -1 0 1 2 3 4 5

2 x2

1(x)=

2e

z

-

(z)= (x)dx

Page 16: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Standard Normal

.0

.1

.2

.3

.4

.5

-5 -4 -3 -2 -1 0 1 2 3 4 5

.0

.1

.2

.3

.4

.5

.6

.7

.8

.9

1.0

-5 -4 -3 -2 -1 0 1 2 3 4 5

2 x2

1(x)=

2e

1(0)=

2

(z)

1 (z)

(-z)

z-z

Page 17: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Standard Normal

For the normal (,) distribution: P(a,b) = (b-) - (a-);

In order to prove this, suffices to show:

P(-1,a) = (a-);

Standard Normal Cumulative Distribution Function:

Page 18: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

2

2

(x- )a -2

-

1e dx

2

Change of variables

xs

dxds

ax a s

a

(s)ds

a( )

2a

s21

e ds 2

2

2

( )a -2

-

x-1e

2dx

Page 19: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Properties of :

(0) = 1/2;

(-z) = 1 - (z);

(-1) = 0;

(1)= 1.

does not have a closed form formula!

Page 20: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Normal Approximation of a binomial

For n independent trials with success probability p:

where:

Page 21: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

binomial(n,½); n=50,100,250,500

.00

.02

.04

.06

.08

.10

.12

0 50 100 150 200 250 300

25, 3.54

50, 5

125, 7.91

250, 11.2

Page 22: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Normal(,);

25, 3.54

50, 5

125, 7.91

250, 11.2

.00

.02

.04

.06

.08

.10

.12

0 50 100 150 200 250 300

Page 23: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Normal Approximation of a binomial

For n independent trials with success probability p:

•The 0.5 correction is called the “continuity correction”

•It is important when is small or when a and b are close.

Page 24: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Normal Approximation

Question: Find P(H>40) in 100 tosses.

Page 25: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

.00

.02

.04

.06

.08

.10

.12

Normal Approximation to bin(100,1/2).

2

2

(x-50)

2*51

2 5e

P(#H 40)

((40-50)/5) =1-(-2)

= (2) ¼ 0.9772

Exact = 0.971556

.00

.02

.04

.06

.08

.10

.12

What do we get With continuity correction?

Page 26: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

When does the Normal Approximation fail?

.00

.10

.20

.30

.40

.50

.60

.70

.80

.90

1.00

0 1

bin(1,1/2)

=0.5,=0.5

N(0.5,0.5)

Page 27: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

0.

0.2

0.4

0.6

0.8

1.

0 1 2 3 4 5 6 7 8 9 10

When does the NA fail?bin(100,1/100)

=1, ¼ 1

N(1,1)

Page 28: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Rule of Thumb

Normal works better:•The larger is.•The closer p is to ½.

Page 29: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Fluctuation in the number of successes.

From the normal approximation it follows that:

P(- to + successes in n trials) ¼ 68%

P(-2 to +2 successes in n trials) ¼ 95%

P(-3 to +3 successes in n trials) ¼ 99.7%

P(-4 to +4 successes in n trials) ¼ 99.99%

Page 30: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Fluctuation in the proportion of successes.Typical size of fluctuation in the number of successes

is:

np(1 p)

p(1 p)n n

Typical size of fluctuation in the proportion of successes is:

Page 31: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Square Root Law

Let n be a large number of independent trials with probability of success p on each.

The number of successes will, with high probability, lie in an interval, centered on the mean np, with a width a moderate multiple of .

The proportion of successes, will lie in a small interval centered on p, with the width a moderate multiple of 1/ .

n

n

Page 32: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Law of large numbers

Let n be a number of independent trials, with probability p of success on each.

For each > 0;

P(|#successes/n – p|< ) ! 1, as n ! 1

Page 33: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Confidence intervals

Suppose we observe the results of n trials with an probability of success p.

#successesˆThe observed f requency of successes

unk n

p=

now

.n

Page 34: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Conf Intervals

The Normal Curve Approximation says that f or any

fi xed p and n large enough, there is a 99.99% chance

ˆ that the observed f requency p will diff er f rom p by

p(1-p) less than 4 .

n

I t's easy to see that p(p(1-p)1 2

1-p) , so 4 .2 n n

Page 35: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Conf Intervals2 2ˆ ˆ (p ,p ) n n

is called a 99.99% confidence interval.

Page 36: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

binomial(n,½); n=50,100,250,500

.00

.02

.04

.06

.08

.10

.12

0 50 100 150 200 250 300

25, 3.54

50, 5

125, 7.91

250, 11.2

Page 37: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Normal(,);

25, 3.54

50, 5

125, 7.91

250, 11.2

.00

.02

.04

.06

.08

.10

.12

0 50 100 150 200 250 300

Page 38: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

.00

.02

.04

.06

.08

.10

.12

0 10 20 30 40 50 60 70 80 90 100

.00

.02

.04

.06

.08

.10

.12

0 10 20 30 40 50 60 70 80 90 100

bin(50,1/2)

= 25, = 3.535534

bin(50,1/2) + 25

50, =3.535534

Normal Approximation

2 (x- )221

e2

bin(100,1/2)

50, =5

(bin(50,1/2) + 25)* 5/3.535534

50, = 5

50, =5

Page 39: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Notre Dame de Rheims

Page 40: Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section

Quote

Bientôt je pus montrer quelques esquisses. Personnen'y comprit rien. Même ceux qui furent favorables à ma perception des vérités que je voulais ensuite graverdans le temple, me félicitèrent de les avoirdécouvertes au « microscope », quand je m'étais aucontraire servi d'un télescope pour apercevoir deschoses, très petites en effet, mais parce qu'elles étaient situées à une grande distance […]. Là où jecherchais les grandes lois, on m'appelait fouilleur dedétails. (TR, p.346)