View
232
Download
0
Category
Preview:
Citation preview
8/17/2019 Continuous Probability Distribution.pdf
1/47
Probability Distribution
8/17/2019 Continuous Probability Distribution.pdf
2/47
8/17/2019 Continuous Probability Distribution.pdf
3/47
Continuous Probability Distributions
The probability of the random variable assuming a value
within some given interval from x1 to x2 is defined to be the
area under the graph of the probability density function
between x1 and x2.
f (x)
x
Uniform
x
1 x
2
x
f
(
x
) Normal
x
1
x
2
x
1 x
2
Exponential
x
f (x)
x
1 x
2
8/17/2019 Continuous Probability Distribution.pdf
4/47
Normal Probability Distributions
The normal probability distribution is the most important
distribution for describing a continuous random variable.
It is widely used in statistical inference.
8/17/2019 Continuous Probability Distribution.pdf
5/47
Normal Probability Distributions
Normal Probability Density Function
2 2( ) /21( )2
x f x e
= mean
= standard deviation
= 3.14159
e = 2.71828
where:
8/17/2019 Continuous Probability Distribution.pdf
6/47
Normal Probability Distributions
It’s a probability function, so no matter what the values of
and , must integrate to 1!
12
12)(
2
1
dxe
x
E(X)= =
Var(X)=2 =
Standard Deviation(X)=
dxe x
x
2)(
2
1
2
1
2)(
2
1
2 )2
1(
2
dxe x
x
8/17/2019 Continuous Probability Distribution.pdf
7/47
The distribution is symmetric; its skewness
measure is zero.
Normal Probability Distributions
Characteristics
x
8/17/2019 Continuous Probability Distribution.pdf
8/47
The entire family of normal probability distributions
is defined by its mean and its standard deviation .
Normal Probability Distributions
Characteristics
Standard Deviation
Mean x
8/17/2019 Continuous Probability Distribution.pdf
9/47
The highest point on the normal curve is at the mean,
which is also the median and mode.
Normal Probability Distributions
Characteristics
x
8/17/2019 Continuous Probability Distribution.pdf
10/47
Normal Probability Distributions
Characteristics
-10 0 20
The mean can be any numerical value: negative, zero,
or positive.
x
8/17/2019 Continuous Probability Distribution.pdf
11/47
Normal Probability Distributions
Characteristics
= 15
= 25
The standard deviation determines the width of the
curve: larger values result in wider, flatter curves.
x
8/17/2019 Continuous Probability Distribution.pdf
12/47
Probabilities for the normal random variable are given
by areas under the curve. The total area under the curve
is 1 (.5 to the left of the mean and 0.5 to the right).
Normal Probability Distributions
Characteristics
.5 .5
x
8/17/2019 Continuous Probability Distribution.pdf
13/47
Normal Probability Distributions
Since the area under the curve represents probability, the probability of a normal random variable at one specific value is
zero . With a single value, one can’t find the area since the
area must be bound by two values. Thus,
P(x = 10) = 0 P(x = 3) = 0 P(x = 7.5) = 0
However, one can find the following probabilities:
P( 1 < x < 3) P(2.2 < x < 3.7) P(x > 3)
8/17/2019 Continuous Probability Distribution.pdf
14/47
Normal Probability Distributions
Characteristics
of values of a normal random variable
are within of its mean.
68.26%
+/- 1 standard deviation
of values of a normal random variableare within of its mean.95.44%
+/- 2 standard deviations
of values of a normal random variable
are within of its mean.
99.72%
+/- 3 standard deviations
8/17/2019 Continuous Probability Distribution.pdf
15/47
Normal Probability Distributions
Characteristics
x
– 3 – 1
– 2
+ 1
+ 2
+ 3
68.26%
95.44%
99.72%
8/17/2019 Continuous Probability Distribution.pdf
16/47
Normal Probability Distributions
There may be thousands of normal distribution curves,
each with a different mean and a different standarddeviation. Since the shapes are different, the areas under the curves between any two points are also different.
To make life easier, all normal distributions can be
converted to a standard normal distribution. A standardnormal distribution has a mean of 0 and a standarddeviation of 1.
No matter what and are, the area between - and +is about 68.26%; the area between -2 and +2 is about95.44%; and the area between -3 and +3 is about99.72%. Almost all values fall within 3 standarddeviations.
8/17/2019 Continuous Probability Distribution.pdf
17/47
How good is rule for real data?
Check some example data:
The mean of the weight of the women = 127.8
The standard deviation (SD) = 15.5
8/17/2019 Continuous Probability Distribution.pdf
18/47
80 90 100 110 120 130 140 150 160
0
5
10
15
20
25
P e r c e n t
POUNDS
127.8 143.3112.3
68% of 120 = .68x120 = ~ 82 runners
In fact, 79 runners fall within 1-SD (15.5 lbs) of the mean.
8/17/2019 Continuous Probability Distribution.pdf
19/47
80 90 100 110 120 130 140 150 160
0
5
10
15
20
25
P e r c e n t
POUNDS
127.896.8
95% of 120 = .95 x 120 = ~ 114 runners
In fact, 115 runners fall within 2-SD’s of the mean.
158.8
8/17/2019 Continuous Probability Distribution.pdf
20/47
80 90 100 110 120 130 140 150 160
0
5
10
15
20
25
P e r c e n t
POUNDS
127.881.3
99.7% of 120 = .997 x 120 = 119.6 runners
In fact, all 120 runners fall within 3-SD’s of the mean.
174.3
8/17/2019 Continuous Probability Distribution.pdf
21/47
Example
Suppose SAT scores roughly follows a normal distribution in
the U.S. population of college-bound students (with range
restricted to 200-800), and the average math SAT is 500 with a
standard deviation of 50, then:
68% of students will have scores between 450 and 550
95% will be between 400 and 600
99.7% will be between 350 and 650
8/17/2019 Continuous Probability Distribution.pdf
22/47
Standard Normal Probability Distributions
The formula for the standardized normal probabilitydensity function is
22)(
2
1)
1
0(
2
1
2
1
2)1(
1)(
Z Z
ee Z p
8/17/2019 Continuous Probability Distribution.pdf
23/47
The Standard Normal Distribution Z)
All normal distributions can be converted into the standard
normal curve by subtracting the mean and dividing by the
standard deviation:
X
Z
Somebody calculated all the integrals for the standard
normal and put them in a table! So we never have tointegrate!
Even better, computers now do all the integration.
8/17/2019 Continuous Probability Distribution.pdf
24/47
0
z
The letter z is used to designate the standardnormal random variable.
Standard Normal Probability Distributions
8/17/2019 Continuous Probability Distribution.pdf
25/47
Applications of Standard Normal Distribution
Problem:
What’s the probability of getting a math SAT score of 575 or less,
=500 and =50?
5.150
500575
Z
i.e., A score of 575 is 1.5 standard deviations above the mean
5.1
2
1575
200
)
50
500(
2
1 22
2
1
2)50(
1)575( dz edxe X P
Z x
Yikes!
But to look up Z= 1.5 in standard normal chart (or enterinto SAS) no problem! = .9332
8/17/2019 Continuous Probability Distribution.pdf
26/47
Applications of Standard Normal Distribution
Problem:
Test scores of a special examination administered to all potential
employees of a firm are normally distributed with a mean of 500
points and a standard deviation of 100 points. What is the probability
that a score selected at random will be higher than 700?
P(x > 700) = ?
If we convert this normal variable, x, to a standard normal variable, z,
z = (x - µ) / σ = (700 – 500) / 100 = 2
-------------500----------700 x-scale
P(x > 700) = P(z > 2) ----------------0-----------2 z-scale
8/17/2019 Continuous Probability Distribution.pdf
27/47
Problem
If birth weights in a population are normally distributed with a
mean of 109 oz and a standard deviation of 13 oz,
a. What is the chance of obtaining a birth weight of 141 oz
or heavier when sampling birth records at random?
b. What is the chance of obtaining a birth weight of 120 orlighter ?
8/17/2019 Continuous Probability Distribution.pdf
28/47
Solution
a. What is the chance of obtaining a birth weight of 141 oz
or heavier when sampling birth records at random?
46.213
109141
Z
From the chart or SAS Z of 2.46 corresponds to a right tail (greater
than) area of: P(Z≥2.46) = 1-(.9931)= .0069 or .69 %
8/17/2019 Continuous Probability Distribution.pdf
29/47
Solution
b. What is the chance of obtaining a birth weight of 120
or lighter ?
From the chart or SAS Z of .85 corresponds to a left tail area of:
P(Z≤.85) = .8023= 80.23%
85.13
109120
Z
8/17/2019 Continuous Probability Distribution.pdf
30/47
Looking up probabilities in the
standard normal table
Area is 93.45%
Z=1.51
Z=1.51
What is the area to
the left of Z=1.51 in
a standard normal
curve?
8/17/2019 Continuous Probability Distribution.pdf
31/47
8/17/2019 Continuous Probability Distribution.pdf
32/47
Are my data normally distributed?
1. Look at the histogram! Does it appear bell shaped?
2. Compute descriptive summary measures — are mean,median, and mode similar?
3. Do 2/3 of observations lie within 1 std dev of the mean? Do
95% of observations lie within 2 std dev of the mean?4. Look at a normal probability plot — is it approximatelylinear?
5. Run tests of normality (such as Kolmogorov-Smirnov). But,
be cautious, highly influenced by sample size!
8/17/2019 Continuous Probability Distribution.pdf
33/47
Normal approximation to the binomial
When you have a binomial distribution where n is large and p
is middle-of-the road (not too small, not too big, closer to .5),then the binomial starts to look like a normal distribution in
fact, this doesn’t even take a particularly large n
Recall: What is the probability of being a smoker among a
group of cases with lung cancer is .6, what’s the
probability that in a group of 8 cases you have less than 2
smokers?
8/17/2019 Continuous Probability Distribution.pdf
34/47
Normal approximation to the binomial
When you have a binomial distribution where n is large and p
isn’t too small (rule of thumb: mean>5), then the binomial starts
to look like a normal distribution
Recall: smoking example…
1 4 52 3 6 7 80
.27 Starting to have a normal shape
even with fairly small n. You can
imagine that if n got larger, the
bars would get thinner and thinner
and this would look more and
more like a continuous function,with a bell curve shape. Here
np=4.8.
8/17/2019 Continuous Probability Distribution.pdf
35/47
Normal approximation to binomial
1 4 52 3 6 7 80
.27
What is the probability of fewer than 2 smokers?
Normal approximation probability:=4.8
=1.39
239.1
8.2
39.1
)8.4(2
Z
Exact binomial probability (from before) = .00065 + .008 = 0.00865
P(Z
8/17/2019 Continuous Probability Distribution.pdf
36/47
A little off, but in the right ballpark… we could also use the value
to the left of 1.5 (as we really wanted to know less than but not
including 2; called the “continuity correction”)…
37.239.1
3.3
39.1
)8.4(5.1
Z
P(Z≤-2.37) =.0089A fairly good approximation of
the exact probability, .00865.
8/17/2019 Continuous Probability Distribution.pdf
37/47
Practice problem
1. You are performing a cohort study. If the probability of
developing disease in the exposed group is .25 for the study
duration, then if you sample (randomly) 500 exposed
people, What’s the probability that at most 120 people
develop the disease?
8/17/2019 Continuous Probability Distribution.pdf
38/47
Solution:
P(Z
8/17/2019 Continuous Probability Distribution.pdf
39/47
More Sample Problems
a. P(z ≤ 1.5) =
b. P(z ≤ 1.0) =
c. P(1 ≤ z ≤ 1.5) =
d. P(0 < z < 2.5) =
8/17/2019 Continuous Probability Distribution.pdf
40/47
More Sample Problems
a. P(z ≤ 1.5) = 0.9332
b. P(z ≤ 1.0) = 0.8413
c. P(1 ≤ z ≤ 1.5) = 0.9332 – 0.8413 = 0.09
d. P(0 < z < 2.5) = 0.9938 – 0.5000 = 0.4938
8/17/2019 Continuous Probability Distribution.pdf
41/47
More Sample Problems
a. P(z ≤ - 1.0) =
b. P(z ≥ - 1) =
c. P(z ≥ - 1.5) =
d. P(- 3 < z ≤ 0) =
8/17/2019 Continuous Probability Distribution.pdf
42/47
More Sample Problems
a. P(z ≤ - 1.0) = 0.1587
b. P(z ≥ - 1) = 1 – P(z ≤ - 1) = 1 – 0.1587 = 0.8413
c. P(z ≥ - 1.5) = 1 – P(z ≤ - 1.5) = 1 – 0.0668 = 0.9332
d. P(- 3 < z ≤ 0) = 0.5 – 0.0014 = 0. 4986
8/17/2019 Continuous Probability Distribution.pdf
43/47
More Sample Problems
Given: µ = 77 σ = 20
a. P(x < 50) = ?
Convert to z: z = (x - µ) / σ = (50 – 77) / 20 = - 1.35
P(x < 50) = P(z < - 1.35) = 0.0885
b. P(x > 100) = ?
z = (100 – 77) / 20 = 1.15P(x > 100) = P(z > 1.15) = 1 – P(z ≤ 1.15) = 1 – 0.8749 =
0.1251 or 12.51 %
8/17/2019 Continuous Probability Distribution.pdf
44/47
Continuation of Sample Problem
c. x = ? to be considered a heavy user
Upper 20% of the area is in the right tail of the normal curve.
80% of the area is to the left. Go to Table 1 and locate 0.8 (or
80%) as the table entry. The closest entry is 0.7995. That
point represents a z-value of 0.84. Use this value of z in the
following equation:
z = (x - µ) / σ0.84 = (x – 77)/ 20 x = 93.8 hours
8/17/2019 Continuous Probability Distribution.pdf
45/47
More Sample Problems
A statistics instructor grades on a curve. He does not want to
give more than 15 percent A in his class. If test scores of
students in statistics are normally distributed with a mean of
75 and a standard deviation of 10, what should be the cut-off
point for an A?
z = (x - µ) / σ
1.04 = (x – 75) / 10
x = 85.4 or 85
8/17/2019 Continuous Probability Distribution.pdf
46/47
Sample Problems
The service life of a certain brand of automobile battery is
normally distributed with a mean of 1000 days and a standarddeviation of 100 days. The manufacturer of the battery wants
to offer a guarantee, but does not know the length of the
warranty. It does not want to replace more than 10 percent of
the batteries sold. What should be the length of the warranty?
z = ( x - µ ) / σ
- 1.28 = (x – 1000) / 100x = 872 days
8/17/2019 Continuous Probability Distribution.pdf
47/47
Reference:
Anderson Sweeney Williams
Recommended