More Examples:

• There are 4 security checkpoints. The probability of being searched at any one is 0.2. You may be searched more than once in total and all searches are independent. What’s the probability of being searched at least one time?

• 50 geese in a flock of 200 are tagged by a wildlife biologist. The next year, 10 geese from the flock are captured. Assume the flock still has (the same) 200 geese and no tags are lost. What’s the probability that at least 5 of the recaptured geese have tags?

• Suppose a written test has 5 True/False questions. Passing = at least 3 correct answers and the test can be taken at most 3 times. (Assume no learning occurs between tests if one fails!)
– If one randomly guesses, what’s the probability of passing?
– What’s the probability that someone who randomly guesses will eventually pass?

• An overloaded server receives an average of 25 emails per second at 12:00PM. If it receives more than 30 emails in a second, it will crash. What’s the probability of a crash at 12:00PM on a given day (based on the traffic in the previous 1 second)?

Answers to Examples

1. X = number of times searched. X has a binomial distribution with n=4 and p=0.2. We want Pr(X>0) = 1-Pr(X=0)

2. X = number of recaptured geese w/ tags. X has a hypergeometric distribution with N = 200, M = 50, n=10. We want Pr(X>=5) = Pr(X=5)+Pr(X=6)+Pr(X=7)+Pr(X=8)+Pr(X=9)+Pr(X=10)

3. X = number of questions right. X has a binomial distribution with n = 5 and p=0.5. Want Pr(X>=3) = Pr(X=3)+Pr(X=4)+Pr(X=5)

4. Pr(eventually pass) = Pr(pass on first try, or fail first and then pass, or fail twice and then pass) = Pr(X>=3) + Pr(X<3)*Pr(X>=3) + Pr(X<3)*Pr(X<3)*Pr(X>=3)

5. X = number of emails in a second. X has a Poisson distribution with rate = 25 per second. Want Pr(X>30) = 1-Pr(X<=30) = 1-[Pr(X=0)+…+Pr(X=30)]

(In each case, once you know the distribution and the parameters, Pr(X=k) can be calculated with the pdf.)
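As a rough check, the five answers can be evaluated numerically. The sketch below is not part of the original slides: it uses only Python's standard library, and the helper names (binom_pmf, hypergeom_pmf, poisson_pmf) are made up for illustration.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(k, N, M, n):
    """Pr(X = k) tagged items when n are drawn from N, of which M are tagged."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def poisson_pmf(k, rate):
    """Pr(X = k) for X ~ Poisson(rate)."""
    return exp(-rate) * rate**k / factorial(k)

# 1. Searched at least once at 4 checkpoints, p = 0.2 each.
searched = 1 - binom_pmf(0, 4, 0.2)

# 2. At least 5 tagged geese among 10 recaptured (N = 200, M = 50).
geese = sum(hypergeom_pmf(k, 200, 50, 10) for k in range(5, 11))

# 3. Pass one attempt: at least 3 of 5 True/False guesses correct.
pass_once = sum(binom_pmf(k, 5, 0.5) for k in range(3, 6))

# 4. Pass within at most 3 independent attempts.
fail_once = 1 - pass_once
pass_eventually = pass_once + fail_once * pass_once + fail_once**2 * pass_once

# 5. More than 30 emails in a second when the rate is 25 per second.
crash = 1 - sum(poisson_pmf(k, 25) for k in range(0, 31))

print(f"1. {searched:.4f}")         # 0.5904
print(f"2. {geese:.4f}")
print(f"3. {pass_once:.4f}")        # 0.5000
print(f"4. {pass_eventually:.4f}")  # 0.8750
print(f"5. {crash:.4f}")
```

Answer 4 can also be read as 1 minus the probability of failing all three attempts, which gives the same number.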

• If you’re interested in polls, an interesting “statistics related” website is: www.gallup.com

• Polls that ask questions w/ 2 answers are related to the binomial distribution:
– n = number of people asked
– p = probability of one of the answers
– Note that a poll uses data to estimate p (i.e. estimate of p = number of yeses / n)

From gallup.com (Feb 19, 2003): n = 483

Example: X = number of people who think “unfinished business” is the reason. X has a Bin(483,0.31) distribution (assume 0.31 is the true p).

Example:
• Suppose 10 people are polled:
– Is a terrorist attack at least somewhat likely at the Olympics?
• Suppose p=0.31
• Q: What’s the probability that fewer than 9 people say yes?
• A: Let X ~ Bin(10,0.31)

Want Pr(X<9) = 1-Pr(X=9)-Pr(X=10)
= 1-(10 choose 9)(0.31^9)(0.69^1)-(10 choose 10)(0.31^10)(0.69^0)
= 1-0.0002-0.0000 = 0.9998
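The arithmetic in this example can be reproduced in a few lines. This is only an illustrative sketch using the standard library; binom_pmf is a hypothetical helper name, not something from the slides.

```python
from math import comb

def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.31
p9 = binom_pmf(9, n, p)    # (10 choose 9)(0.31^9)(0.69^1)
p10 = binom_pmf(10, n, p)  # (10 choose 10)(0.31^10)(0.69^0)

# Complement rule: fewer than 9 means "not 9 and not 10".
fewer_than_9 = 1 - p9 - p10

# Same probability by summing Pr(X=0) + ... + Pr(X=8) directly.
direct = sum(binom_pmf(k, n, p) for k in range(0, 9))

print(f"{p9:.4f} {p10:.4f} {fewer_than_9:.4f}")  # 0.0002 0.0000 0.9998
```

The complement is the shorter route here because only two terms are excluded; the direct sum needs nine terms and agrees up to rounding.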

Example: Dietary Data

[Figure: histogram from the observed sample. x-axis: Folate (Calorie Adjusted mg), 3.5 to 7.5; y-axis: Percent, 0 to 20.]

• As part of an epidemiological study, physicians measured the amount of folate in the diets of 545 people.

• What’s the probability that a new person’s folate consumption equals exactly 5.5? (This is a question about the random variable describing the dietary folate of a new person.)

• In the folate example, if folate were measured accurately enough, the probability of seeing any exact value on a new person is zero.

• Note that this is different from random variables like “the number of questions right on a test”, etc.
– The folate example is an example of continuous data.
– We can compute the probability that a continuous random variable is in an interval, but any particular value has zero probability.

Chapter 6: Continuous Distributions & Normality

• Up to this point, all random variables have been discrete:
– Possible values are integers (any integer or a subset):
   • Binomial(n,p) random variables can be 0 or 1 or … or n.
   • Poisson(rate) random variables can be 0 or 1 or …
   • Hypergeometric(N,M,n) random variables can be 0 or 1 or … or n.
– PDFs give probabilities that the random variables take on any of these values
– CDFs give probabilities that the random variables are less than or equal to a certain value

• Random variables that can take on any real number are continuous.

• Continuous random variables have probability density functions (pdfs) too.

• Again, they are models for how the random variables behave.

• The probability that a continuous random variable is in an interval is the area under the pdf in that interval.
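To make "probability = area under the pdf" concrete, here is a small numerical sketch. The density f(x) = 2x on [0, 1] is a made-up example chosen because its areas are easy to check by hand; it is not the folate pdf, and the function names are illustrative only.

```python
def f(x):
    """Hypothetical pdf for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def area(pdf, a, b, steps=10_000):
    """Approximate the area under pdf between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

total = area(f, 0.0, 1.0)        # all the area under the pdf: about 1
middle = area(f, 0.25, 0.75)     # Pr(0.25 < X < 0.75): about 0.5
tiny = area(f, 0.5, 0.5 + 1e-9)  # a very narrow interval: about 1e-9

print(total, middle, tiny)
```

As the interval around 0.5 shrinks, its area shrinks toward zero, which is why any single exact value of a continuous random variable has probability zero.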

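PDF for the Folate Data (assume we know this function):

[Figure: density curve for the folate data with the area between 5 and 6 shaded. x-axis: Folate, 4 to 8; y-axis: Density, 0.0 to 0.8.]

Pr(5 < random person’s folate intake < 6) = 0.54 = shaded area, i.e.

$$\Pr(5 < \text{folate} < 6) = \int_{5}^{6} \text{folate's pdf}(x)\,dx$$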

• Continuous PDFs:
– notation: f(x)
– f(x) is greater than or equal to zero.
– All the area under f(x) is 1, i.e.

$$\Pr(-\infty < X < \infty) = \int_{-\infty}^{\infty} f(x)\,dx = 1$$

– CDF:

$$\Pr(X < y) = \int_{-\infty}^{y} f(x)\,dx$$

Let a be a number. For a continuous random variable X:

$$\Pr(X < a) = \Pr(X \le a)$$

Continuous pdfs will be known functions

• Most commonly used:
– Normal or Gaussian distribution (“bell curve”)
– We’ll see why this is so common in a few weeks.
– 2 parameters: mean and std dev

[Figure: bell-shaped normal density. x-axis: x, -4 to 4; y-axis: density, 0.0 to 0.4.]

Mean = center of normal distribution

[Figure: two normal densities on the same axes. x-axis: x, -4 to 4; y-axis: density, 0.0 to 0.8.]

2 normal distributions: Both have the same mean (0). Narrower one has a std dev of 1/2. Fatter one has a std dev of 1.

Smaller standard deviation means that the model says the data are more likely to be concentrated around the mean.

The normal pdf is this function:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-0.5\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

where μ is the mean and σ is the std dev.
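The normal pdf can be written directly as code. The sketch below is illustrative only: normal_pdf is a made-up name, and the standard deviations 1 and 0.5 are example values. It evaluates the curve at its center to show how a smaller std dev gives a taller, narrower bell.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mean, sd):
    """Normal density with the given mean and standard deviation, evaluated at x."""
    z = (x - mean) / sd  # center and scale
    return exp(-0.5 * z * z) / (sd * sqrt(2 * pi))

peak_sd_1 = normal_pdf(0, mean=0, sd=1)       # height at the mean, std dev 1
peak_sd_half = normal_pdf(0, mean=0, sd=0.5)  # height at the mean, std dev 0.5

print(f"{peak_sd_1:.4f} {peak_sd_half:.4f}")  # 0.3989 0.7979
```

Halving the std dev doubles the height at the mean, since the total area under the curve must stay equal to 1.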

Determining normal probabilities:

• Suppose X has a normal distribution with mean 5 and std dev 2.

• Notation: X~N(5,4) [notation uses N(mean,variance)]

• What’s the probability that X is less than 7?

• It turns out that no one can “solve” the integral that defines this probability: it has no closed-form answer in terms of elementary functions.

• As a result, we need to use tables, computers, or calculators to compute normal probabilities.

[Figure: density of X~N(5,4) with x=7 marked. x-axis: x, 0 to 10; y-axis: density, 0.0 to 0.20.]

Pr(X<7) = area under curve to left of x=7

[Figure: normal density. x-axis: x, -4 to 4; y-axis: density, 0.0 to 0.4.]

Fact 1: Pr(X < its mean) = 1/2

[Figure: normal density. x-axis: x, -4 to 4; y-axis: density, 0.0 to 0.4.]

Fact 2: Pr(X > its mean + a number) = Pr(X < its mean - same number)

[Figure: normal density with points b and a marked on the x-axis. x-axis: x, -4 to 4; y-axis: density, 0.0 to 0.4.]

Fact 3: Assume a > b. Pr(b < X < a) = Pr(X<a)-Pr(X<b)

Area under the curve between a and b is the area under the curve to the left of a minus the area under the curve to the left of b.

[Figure: normal density. x-axis: x, -4 to 4; y-axis: density, 0.0 to 0.4.]

Fact 4: Pr(X > a) = 1-Pr(X < a)
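Facts 1 through 4 can be checked numerically. The sketch below uses statistics.NormalDist from Python's standard library (3.8+); the particular mean, std dev, and the numbers a, b, c are arbitrary example values, not taken from the slides' figures.

```python
from statistics import NormalDist

# Example normal random variable: mean 5, std dev 2 (i.e. N(5,4)).
X = NormalDist(mu=5, sigma=2)
mean = 5
c = 1.5       # "a number" for Fact 2 (arbitrary)
a, b = 7, 4   # interval endpoints for Facts 3 and 4 (arbitrary, a > b)

# Fact 1: half the area lies to the left of the mean.
fact1 = X.cdf(mean)

# Fact 2: symmetry about the mean.
fact2_upper = 1 - X.cdf(mean + c)  # Pr(X > mean + c)
fact2_lower = X.cdf(mean - c)      # Pr(X < mean - c)

# Fact 3: area between b and a is a difference of two left-hand areas.
fact3 = X.cdf(a) - X.cdf(b)

# Fact 4: complement rule.
fact4 = 1 - X.cdf(a)

print(f"Fact 1: {fact1:.4f}")  # 0.5000
print(f"Fact 2: {fact2_upper:.4f} vs {fact2_lower:.4f}")
print(f"Fact 3: {fact3:.4f}")
print(f"Fact 4: {fact4:.4f}")
```

Because X is continuous, using < or <= in any of these statements gives the same numbers.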

[Figure: standard normal density with the point a marked on the x-axis. x-axis: x, -4 to 4; y-axis: density, 0.0 to 0.4.]

Fact 5: Tables inside the cover of your book are given in terms of Pr(0<Z<a) (where a>0 and Z~N(0,1)). (Tables with P(Z<a) are in Appendix 1.)

Table in book: (inside cover)

Z     .00    .01    .02    .03    .04  …
0.0  .0000  .0040  .0080  .0120  .0160
0.1  .0398  .0438  .0478  .0517  .0557
0.2  .0793  .0832  .0871  .0910  .0948
 .
 .
 .

Pr(0 < Z < 0.13) = 0.0517

The row label gives the ones and tenths places of a; the column label gives the hundredths place. This is the upper left hand corner of the table.
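The entries of a Pr(0 < Z < a) table can be regenerated by subtracting 1/2 from the standard normal CDF. This is a hedged sketch using statistics.NormalDist from the standard library; table_entry is a hypothetical helper name.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, std dev 1

def table_entry(a):
    """Pr(0 < Z < a) for a >= 0: the area to the left of a minus the half below 0."""
    return Z.cdf(a) - 0.5

# Rebuild the upper left corner: rows 0.0-0.2, columns .00-.04.
for tenths in (0.0, 0.1, 0.2):
    row = [table_entry(tenths + hundredths / 100) for hundredths in range(5)]
    print(f"{tenths:.1f}  " + "  ".join(f"{value:.4f}" for value in row))

lookup = table_entry(0.13)
print(f"Pr(0 < Z < 0.13) = {lookup:.4f}")  # 0.0517
```

To look up a = 0.13 by hand, go to row 0.1 and column .03, which is the same cell this code computes.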

Using Tables: 4 Easy Steps

Want Pr(X<7)

1. Draw picture (next page) (allows use of common sense)

2. Translate X to a normal random variable with mean 0 and std dev 1 (called “Z”, a standard normal r.v.)
– Do this by “centering and scaling”:
   • Rule: If X~N(5,4) then (X-5)/2 ~N(0,1)

3. Manipulate to get in terms of Pr(Z<a) form
– So, Pr(X<7) = Pr( (X-5)/2 < (7-5)/2 ) = Pr( Z < 1 ) where Z~N(0,1)

4. Look up in table: Pr(X<7) = Pr(Z<1) = 0.8413
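The centering-and-scaling steps translate directly to code. In the sketch below the table lookup is replaced by the error function math.erf, using the standard identity Pr(Z < z) = 0.5(1 + erf(z/√2)); the function names are illustrative assumptions.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Pr(Z < z) for Z ~ N(0,1); plays the role of the table lookup."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_cdf(x, mean, sd):
    """Pr(X < x) for X normal with the given mean and std dev."""
    z = (x - mean) / sd  # step 2: center (subtract mean), then scale (divide by sd)
    return std_normal_cdf(z)  # steps 3-4: evaluate Pr(Z < z)

# X ~ N(5,4): mean 5, variance 4, so std dev 2.
z = (7 - 5) / 2
prob = normal_cdf(7, mean=5, sd=2)

print(z)              # 1.0
print(f"{prob:.4f}")  # 0.8413
```

Note that the function takes the std dev (2), not the variance (4) that appears in the N(5,4) notation.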

[Figure: density of X~N(5,4) with x=7 marked. x-axis: x, 0 to 10; y-axis: density, 0.0 to 0.20.]

Pr(X<7) = area under curve to left of x=7

• What’s Pr(X < 4)?

• Draw (on next page)

• Center and scale:
– Pr(X<4) = Pr( (X-5)/2 < (4-5)/2 ) = Pr( Z < -1/2 )

• Look up: by symmetry, Pr(Z < -1/2) = Pr(Z > 1/2) = 0.5 - Pr(0 < Z < 0.5) = 0.5 - 0.1915 = 0.3085
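A negative z value cannot be read directly from a table of Pr(0 < Z < a) with a > 0, so the lookup goes through symmetry. The sketch below compares that route with a direct CDF evaluation, using statistics.NormalDist from the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()        # standard normal
X = NormalDist(5, 2)    # X ~ N(5,4): mean 5, std dev 2

z = (4 - 5) / 2         # center and scale: -0.5

# Route 1: symmetry plus a Pr(0 < Z < a) style table value.
table_value = Z.cdf(abs(z)) - 0.5   # Pr(0 < Z < 0.5)
via_symmetry = 0.5 - table_value    # Pr(Z > 0.5) = Pr(Z < -0.5)

# Route 2: evaluate the CDF of X directly.
direct = X.cdf(4)

print(f"{table_value:.4f}")   # 0.1915
print(f"{via_symmetry:.4f}")  # 0.3085
print(f"{direct:.4f}")        # 0.3085
```

The answer is below 1/2, as the picture suggests it should be, because 4 lies to the left of the mean 5.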

[Figure: density of X~N(5,4). x-axis: x, 0 to 10; y-axis: density, 0.0 to 0.20.]

Pr(X<4) = area under curve to left of x=4