1
Chapter 6
2
Normal Distribution
The most important type of random variable is the normal, or Gaussian, random variable, which has a normal distribution. In fact, the binomial distribution can be approximated by the normal distribution.
Note that a normal (Gaussian) random variable is continuous.
3
Graph of Normal Probability Distribution
Let μ and σ be the mean and standard deviation of the given population. Recall that a normal (Gaussian) random variable is continuous. Its distribution function is also continuous on the set of all real numbers. Therefore, its graph is continuous on the entire real line.
The normal or Gaussian graph is represented in the next slide.
4
The Normal Curve
5
Properties of a Normal Curve
The notable features of a normal distribution (density) curve are as follows:
a. The curve is bell-shaped with the highest point (the mode) at the mean μ.
b. It is symmetrical about the mean.
c. The curve is always above the horizontal axis. In other words, the curve approaches the horizontal axis (asymptote) but never touches or crosses it.
d. It has two inflection points, at μ − σ and μ + σ.
e. The area bounded by the normal curve, the horizontal axis, and two vertical lines is the probability that the normal random variable falls in the interval determined by the two vertical lines.
6
The Normal Distribution Function
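Moreover, the normal distribution curve, y = f(x), is in fact the normal density function. Its mathematical representation is given by
$$f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}},$$
where −∞ < x < ∞, −∞ < μ < ∞, and σ > 0.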
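As an illustration only, the density function can be evaluated directly in Python. The function name `normal_pdf` and the sample points are assumptions made for this sketch; they are not part of the slides.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density f(x | mu, sigma) of the normal distribution."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Height of the curve at the mean and one standard deviation either side
# (standard normal: mu = 0, sigma = 1).
peak = normal_pdf(0.0)
left = normal_pdf(-1.0)
right = normal_pdf(1.0)  # same height as `left`, by symmetry about the mean
print(round(peak, 4), round(left, 4), round(right, 4))
```

The peak sits at the mean and the two points one standard deviation away have equal height, which matches properties a, b, and d of the normal curve.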
7
Effect of the mean and variance on the normal curve
[Figure: two panels of normal curves. Left: fixed σ², varying μ. Right: varying σ², fixed μ.]
8
Empirical Rule
For a distribution that is symmetrical and bell shaped (in particular, for a normal distribution):
▪ Approximately 68% of the data fall in the interval (μ − σ, μ + σ)
▪ Approximately 95% of the data fall in the interval (μ − 2σ, μ + 2σ)
▪ Approximately 99.7% of the data fall in the interval (μ − 3σ, μ + 3σ)
9
Empirical Rule
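In fact, from this empirical rule one can conclude that the probabilities of the events
i. μ − σ < x < μ + σ
ii. μ − 2σ < x < μ + 2σ
iii. μ − 3σ < x < μ + 3σ
are approximately 0.68, 0.95, and 0.997, respectively. That is,
$$P(\mu - \sigma < x < \mu + \sigma) \approx 0.68$$
$$P(\mu - 2\sigma < x < \mu + 2\sigma) \approx 0.95$$
$$P(\mu - 3\sigma < x < \mu + 3\sigma) \approx 0.997$$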
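The three empirical-rule probabilities can be checked numerically. The sketch below assumes an exactly normal variable and uses the standard identity P(|x − μ| < kσ) = erf(k/√2); the helper name `prob_within` is hypothetical.

```python
import math


def prob_within(k: float) -> float:
    """P(|x - mu| < k * sigma) for a normal random variable."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# prints:
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The rounded values 0.68, 0.95, and 0.997 quoted in the rule are these figures to two or three decimal places.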
10
Graphical Representation of the Empirical Rule
11
Control Charts
A control chart is used to examine data over a period of equally spaced time intervals.
For a given random variable X, the control chart is a plot of the observed values of X = x in time sequence order.
12
Procedure for Making Control Chart for Random Variable X
13
Example 1: Graphing Control Charts
14
Inferences about data using control chart
Out-of-Control Signal-I:▪ One point beyond the three standard deviation level either above or below the center line (mean, μ).
Out-of-Control Signal-II:▪ A run of nine consecutive points on one side of the center line (mean, μ).
Out-of-Control Signal-III:▪ At least two of three consecutive points beyond the two standard deviation level on the same side of the center line (mean, μ).
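The three signals can be expressed as simple checks over a sequence of observations. This is a minimal sketch, assuming μ and σ are known in advance; the function name, the message format, and the sample data are all made up for illustration.

```python
from typing import List


def out_of_control_signals(values: List[float], mu: float, sigma: float) -> List[str]:
    """Scan observations in time order and report Signals I, II and III."""
    signals = []

    # Signal I: one point beyond three standard deviations from the center line.
    for i, x in enumerate(values):
        if abs(x - mu) > 3 * sigma:
            signals.append(f"I at index {i}")

    # Signal II: nine consecutive points on the same side of the center line.
    for i in range(len(values) - 8):
        window = values[i:i + 9]
        if all(x > mu for x in window) or all(x < mu for x in window):
            signals.append(f"II at indices {i}-{i + 8}")

    # Signal III: at least two of three consecutive points beyond two
    # standard deviations on the same side of the center line.
    for i in range(len(values) - 2):
        window = values[i:i + 3]
        above = sum(x > mu + 2 * sigma for x in window)
        below = sum(x < mu - 2 * sigma for x in window)
        if above >= 2 or below >= 2:
            signals.append(f"III at indices {i}-{i + 2}")

    return signals


# Hypothetical data with center line 10 and standard deviation 2.
print(out_of_control_signals([10, 15, 9, 14.5], mu=10, sigma=2))
```

Overlapping windows are reported separately here, so one long run can produce several Signal II entries; a production chart would usually merge them.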
15
Graphical Illustration of Out-of-Control Signal I
16
Probability of Out-of-Control Signal I Using the Empirical Rule
[Figure: control chart of the sample mean (vertical axis, 0 to 30) against trial (horizontal axis, 1 to 19).]
Empirically,
$$P(|x - \mu| > 3\sigma) = 1 - 0.997 = 0.003$$
17
Graphical Illustration of Out-of-Control Signal II
18
Probability of Out-of-Control Signal II Using the Empirical Rule
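$$P(\text{nine consecutive points on one given side of the mean}) = (0.5)^{9} \approx 0.002$$
$$P(\text{nine consecutive points on the same side, above or below the mean}) = 2(0.002) = 0.004$$
[Figure: control chart of the sample mean (vertical axis, 0 to 30) against trial (horizontal axis, 1 to 19).]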
19
Graphical Illustration of Out-of-Control Signal III
20
Probability of Out-of-Control Signal III Using the Empirical Rule
[Figure: control chart of the sample mean (vertical axis, 0 to 30) against trial (horizontal axis, 1 to 19).]
21
Probability of Signal III (cont)
Empirically, above the mean:
$$P(x - \mu > 2\sigma) = \frac{1 - 0.95}{2} = 0.025$$
$$P(\text{at least two out of three data values more than two standard deviations above the mean}) = {}_{3}C_{2}(0.025)^{2}(0.975)^{1} + {}_{3}C_{3}(0.025)^{3}(0.975)^{0} = 0.0018$$
$$P(\text{at least two out of three data values more than two standard deviations above or below the mean}) = 2(0.0018) = 0.0036 \approx 0.004$$
22
Summary of Signals Probabilities
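Using the empirical rule:
$$P(\text{Signal I}) = 1 - 0.997 = 0.003$$
$$P(\text{Signal II}) = 2(0.5)^{9} \approx 0.004$$
$$P(\text{Signal III}) = 2\left[{}_{3}C_{2}(0.025)^{2}(0.975)^{1} + {}_{3}C_{3}(0.025)^{3}(0.975)^{0}\right] \approx 0.004$$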
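The arithmetic behind the three signal probabilities can be reproduced in a few lines. The variable names are illustrative, and the inputs 0.997 and 0.95 are the empirical-rule values used on the slides rather than exact normal probabilities.

```python
from math import comb

# Signal I: one point beyond three standard deviations.
p_signal_1 = 1 - 0.997

# Signal II: nine consecutive points on the same side, above or below.
p_signal_2 = 2 * 0.5 ** 9

# Signal III: at least two of three points beyond two standard deviations
# on the same side; p_tail is the empirical one-sided tail probability.
p_tail = (1 - 0.95) / 2
p_one_side = comb(3, 2) * p_tail ** 2 * (1 - p_tail) + comb(3, 3) * p_tail ** 3
p_signal_3 = 2 * p_one_side

print(round(p_signal_1, 3), round(p_signal_2, 3), round(p_signal_3, 3))
# prints: 0.003 0.004 0.004
```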
23
Chebyshev’s Theorem*
For any set of data (population or sample) with sample size greater than 1, regardless of the distribution of the data set, the proportion of the data that must be within k standard deviations on either side of the mean is at least
$$1 - \frac{1}{k^{2}}, \qquad k > 1.$$
24
Results of Chebyshev’s Theorem
According to Chebyshev’s Theorem for any set of data, the proportion of data (percentage of data) within the given number of standard deviations yields the following results:
▪ At least 75% of the data fall in the interval (μ − 2σ, μ + 2σ)
▪ At least 88.9% of the data fall in the interval (μ − 3σ, μ + 3σ)
▪ At least 93.8% of the data fall in the interval (μ − 4σ, μ + 4σ)
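The percentages above follow from the bound 1 − 1/k² with k = 2, 3, 4. The sketch below computes the bound and compares it with the observed proportion in a small, deliberately skewed data set; the helper names and the data are invented for illustration.

```python
import statistics
from typing import List


def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is informative only for k > 1")
    return 1.0 - 1.0 / (k * k)


def proportion_within(data: List[float], k: float) -> float:
    """Observed proportion of values within k standard deviations of the mean."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    inside = sum(abs(x - mu) <= k * sigma for x in data)
    return inside / len(data)


skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 100]  # made-up, far from normal
for k in (2, 3, 4):
    print(k, round(chebyshev_lower_bound(k), 3), proportion_within(skewed, k))
```

Even for this skewed sample the observed proportions stay at or above the bound, which is the point of the theorem: it makes no assumption about the shape of the distribution.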
25
The Normal Distribution
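$$f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \qquad \frac{x-\mu}{\sigma} = z\text{-score}$$
Probability Distribution Function
$$\text{PDF}: \; f(x \mid \mu, \sigma)$$
Cumulative Probability Distribution Function
$$\text{CDF}: \; F(x \mid \mu, \sigma) = \int_{-\infty}^{x} f(t \mid \mu, \sigma)\, dt$$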
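To make the relationship between the PDF and the CDF concrete, the sketch below evaluates the CDF in closed form through the error function and compares it with a direct numerical integration of the PDF. The function names, the trapezoidal rule, and the cutoff at ten standard deviations below the mean are choices made for this example.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """F(x | mu, sigma) in closed form using the error function."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf_numeric(x: float, mu: float = 0.0, sigma: float = 1.0,
                       steps: int = 20000) -> float:
    """Trapezoidal approximation of the integral of the PDF up to x."""
    lower = mu - 10.0 * sigma  # stand-in for minus infinity
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (normal_pdf(lower, mu, sigma) + normal_pdf(x, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lower + i * h, mu, sigma)
    return total * h


print(round(normal_cdf(0.63), 4), round(normal_cdf_numeric(0.63), 4))
```

Both routes agree to several decimal places, which is what the integral definition of the CDF predicts.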
26
The Z-value (or Z-score)
The z-value or z-score is the deviation of the measurement from the mean per unit standard deviation. It is defined by
$$z = \frac{x - \mu}{\sigma},$$
where x is the original measurement, µ is the mean of the x distribution, and σ is the standard deviation.
27
Remarks
Note that we take the word average to mean either the sample mean or the population mean µ. We further note that the original score x is referred to as the “raw score x”.
Knowing the z-score, σ, and µ, the raw score x is determined by
$$x = \mu + z\sigma$$
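The two conversions are inverses of each other, as this small sketch shows. The function names are hypothetical, and the numbers (mean 25, standard deviation 5, raw score 28.15) are borrowed from Example 5 later in the chapter.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Deviation of x from the mean, in units of standard deviation."""
    return (x - mu) / sigma


def raw_score(z: float, mu: float, sigma: float) -> float:
    """Recover the raw score x from its z-score."""
    return mu + z * sigma


z = z_score(28.15, mu=25, sigma=5)
x = raw_score(z, mu=25, sigma=5)
print(round(z, 2), round(x, 2))
```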
28
Standard Normal Distribution
If the original distribution of the x values is normal with mean µ, and standard deviation σ, then the corresponding z values have a normal distribution with mean, µ = 0 and standard deviation, σ = 1.
This transformed normal distribution with mean, µ = 0 and standard deviation, σ = 1 is called the standard normal distribution.
29
Proof of Mean & Variance of Z-score
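Given $E(x) = \mu$ and $V(x) = \sigma^{2}$. Then, let $y = x - \mu$; $E(y) = ?$ and $V(y) = ?$
$$E(y) = E(x - \mu) = E(x) + (-1)E(\mu) = \mu - \mu = 0$$
$$E(y) = 0$$
$$V(y) = V(x - \mu) = V(x) + (-1)^{2}V(\mu) = \sigma^{2} + 0 = \sigma^{2}$$
$$V(y) = \sigma^{2}$$
Hence, let $z = \dfrac{x - \mu}{\sigma}$; $E(z) = ?$ and $V(z) = ?$
$$E(z) = E\!\left(\frac{x - \mu}{\sigma}\right) = \frac{1}{\sigma}E(x - \mu) = \frac{1}{\sigma}(0) = 0$$
$$E(z) = 0$$
$$V(z) = V\!\left(\frac{x - \mu}{\sigma}\right) = \frac{1}{\sigma^{2}}V(x - \mu) = \frac{\sigma^{2}}{\sigma^{2}} = 1$$
$$V(z) = 1$$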
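The result can be checked numerically on any finite data set: after standardizing with the data's own mean and population standard deviation, the z-scores have mean 0 and variance 1. The function name `standardize` and the six values are chosen for this sketch only.

```python
import statistics
from typing import List


def standardize(data: List[float]) -> List[float]:
    """Convert raw scores to z-scores using the population mean and sigma."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return [(x - mu) / sigma for x in data]


z_values = standardize([54, 50, 52, 48, 50, 52])
mean_z = statistics.fmean(z_values)
var_z = statistics.pvariance(z_values)
print(abs(round(mean_z, 10)), round(var_z, 10))
```

Up to floating-point rounding, the printed mean is 0 and the printed variance is 1, matching E(z) = 0 and V(z) = 1.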
30
The Empirical Rule under the Standard Normal Curve
31
The Standard Normal Table
The textbook uses the left-tail and half-tail style tables interchangeably to solve problems involving the normal distribution. However, for the sake of uniformity we will focus only on the left-tail style table. This table provides the cumulative area to the left of a given z-score associated with an original raw score x.
32
The Standard Normal Left Tail Table (Left Tail Z Table)
33
Some Remarks about Normal Probability
▪ The total area under the normal curve is always equal to 1.
▪ The portion of the area under the curve within a given interval represents the probability that a measurement will lie in that interval.
▪ The probability that z equals a certain number is always 0.
  ▪ P (z = a) = 0
▪ Therefore, < and ≤ can be used interchangeably. Similarly, > and ≥ can be used interchangeably.
  ▪ P (z < b) = P (z ≤ b)
  ▪ P (z > c) = P (z ≥ c)
34
Convention? Argumentative.
Some instructors and books state that:
▪ The area to the left of a z-value smaller than –3.49 is 0.000. It is better to state ≤ 0.0002, from the table.
▪ The area to the left of a z-value greater than 3.49 is 1.000. It is better to state ≥ 0.9998, from the table.
Always avoid absolute statements.
35
Use of the Left Tail Normal Table (looking up area under the curve)
Example 2:
$$P(z < 0.63) = 0.7357$$
36
Example 3
a. $$P(z > 2.43) = 1 - 0.9925 = 0.0075$$
b. $$P(z < 1.78) = 0.9625$$
c. $$P(z < -3.09) = 0.001$$
d. $$P(z < 0.227) \approx P(z < 0.23) = 0.5910$$
37
Example 4
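$$P(-2.18 < z < 1.34) = P(z < 1.34) - P(z < -2.18)$$
$$P(z < -2.18) = 1 - P(z < 2.18) = 1 - 0.9854 = 0.0146$$
$$P(z < 1.34) = 0.9099$$
$$P(-2.18 < z < 1.34) = 0.9099 - 0.0146 = 0.8953$$
Table lookups: row z = 1.3, column 0.04 gives 0.9099; row z = 2.1, column 0.08 gives 0.9854.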
38
Example 5
Given that the mean is 25 and the standard deviation is 5, what is the probability that the observed data point is at most 28.15?
$$P(x \le 28.15) = P\!\left(\frac{x - \mu}{\sigma} \le \frac{28.15 - 25}{5}\right) = P(z \le 0.63) = 0.7357$$
Example 6
39
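Given μ = 4, σ = 2;
$$P(3 < x < 6) = P\!\left(\frac{3 - 4}{2} < \frac{x - \mu}{\sigma} < \frac{6 - 4}{2}\right) = P(-0.50 < z < 1.00)$$
By symmetry,
$$P(z < -0.50) = 1 - P(z < 0.50) = 1 - 0.6915 = 0.3085$$
$$P(z < 1.00) = 0.8413$$
$$P(-0.50 < z < 1.00) = 0.8413 - 0.3085 = 0.5328$$
Table lookups: row z = 1.0, column 0.00 gives 0.8413; row z = 0.5, column 0.00 gives 0.6915.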
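The table lookups in Examples 4 to 6 can be cross-checked with Python's standard library, which provides `statistics.NormalDist` (Python 3.8 or later). This is only a verification sketch; the variable names are arbitrary, and the slides themselves rely on the printed left-tail table.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal: mean 0, standard deviation 1

# Example 4: area between z = -2.18 and z = 1.34.
example_4 = std.cdf(1.34) - std.cdf(-2.18)

# Example 5: mean 25, standard deviation 5, P(x <= 28.15).
example_5 = NormalDist(mu=25, sigma=5).cdf(28.15)

# Example 6: mean 4, standard deviation 2, P(3 < x < 6).
x_dist = NormalDist(mu=4, sigma=2)
example_6 = x_dist.cdf(6) - x_dist.cdf(3)

print(round(example_4, 4), round(example_5, 4), round(example_6, 4))
```

Working with the raw scores directly gives the same areas as converting to z first, because `NormalDist(mu, sigma).cdf(x)` standardizes internally.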
40
b.
Example 7
41
c.
Example 7 (continued)
42
Inverse Normal Distribution
Sometimes we may be required to find the z value or raw score x that corresponds to a given area under the normal curve.
▪ To do this, we look up the area associated with the given problem and find the corresponding z value.
▪ Next, the raw score x can be computed as follows:
$$x = \mu + z\sigma$$
43
Example 8: Using the information given in example 7,
44
Example 9: Using the information given in example 7,
45
Example 10
1. Find the z value such that 90% of the area under the standard normal curve lies between –z and z.
2. Find the z value such that 3% of the area under the standard normal curve lies to the right of z.
3. If a random variable X is normally distributed with mean 50 and standard deviation 10, find k so that P (X ≥ k) = 0.99.
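One way to approach the three parts of Example 10 is with the inverse CDF from `statistics.NormalDist`. This is a sketch of a possible solution path, not the worked solution from the slides; each left-tail area below is derived from the wording of the question.

```python
from statistics import NormalDist

std = NormalDist()

# 1. 90% between -z and z leaves 5% in each tail, so the left area is 0.95.
z_central_90 = std.inv_cdf(0.95)

# 2. 3% to the right of z means 97% to the left.
z_right_3 = std.inv_cdf(0.97)

# 3. P(X >= k) = 0.99 means P(X < k) = 0.01 for mean 50, sigma 10.
k = NormalDist(mu=50, sigma=10).inv_cdf(0.01)

# The same k via the raw-score formula x = mu + z * sigma.
k_from_z = 50 + std.inv_cdf(0.01) * 10

print(round(z_central_90, 3), round(z_right_3, 3), round(k, 2))
```

A printed table gives slightly coarser answers because z is only tabulated to two decimal places.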
46
Sampling Distribution
A sampling distribution is a probability distribution of a sample statistic based on all possible simple random samples of the same size from the same population.
47
Example 11: Sampling Distribution
An appliance center has six sales representatives at its North Jacksonville outlet. Listed below is the number of refrigerators sold by each last month.

Sales Representative    Number Sold    Sales Representative    Number Sold
Zina                    54             Jan                     48
Woon                    50             Molly                   50
Ernie                   52             Rachel                  52

a. Select all possible samples of size 2 and compute the sample mean number sold for each sample.
b. What is the distribution of the sample means?
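Parts a and b can be carried out by brute force, since there are only 15 samples of size 2. The sketch below assumes sampling without replacement and ignores order; the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

sold = {"Zina": 54, "Woon": 50, "Ernie": 52, "Jan": 48, "Molly": 50, "Rachel": 52}

# a. Every sample of size 2 and its sample mean.
sample_means = [(a + b) / 2 for a, b in combinations(sold.values(), 2)]

# b. Distribution of the sample means (value -> number of samples).
distribution = Counter(sample_means)

population_mean = sum(sold.values()) / len(sold)
mean_of_means = sum(sample_means) / len(sample_means)

for value in sorted(distribution):
    print(value, distribution[value])
print(population_mean, mean_of_means)
```

The mean of the 15 sample means equals the population mean of 51, a small-scale preview of the result E(x̄) = μ proved later in the chapter.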
48
Central Limit Theorem (CLT)
▪ In general, if the data are normally distributed, then regardless of the sample size, the sampling distribution of the sample mean follows a normal probability distribution.
▪ On the other hand, if the data do not follow a normal distribution, the sampling distribution approaches the normal probability distribution only as the sample size increases.
49
Central Limit Theorem
Regardless of the distribution of the data, as the sample size increases, the sampling distribution approaches normality.
50
Central Limit Theorem
Let x₁, x₂, …, xₙ be a random sample from a population with finite mean μ and finite variance σ². Let x̄ be the sample mean; that is,
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Then as the sample size increases, the probability distribution of the sample mean approaches a normal probability distribution with mean μ and variance σ²/n.
Proof
51
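Given that
$$E(x_i) = \mu, \qquad V(x_i) = \sigma^{2}, \qquad \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Prove:
$$E(\bar{x}) = \mu, \qquad V(\bar{x}) = \frac{\sigma^{2}}{n},$$
and therefore the standard error for the sampling distribution is
$$SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$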
Detailed Proof
52
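$$E(\bar{x}) = E\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}E\!\left(\sum_{i=1}^{n} x_i\right) = \frac{1}{n}E(x_1 + x_2 + \cdots + x_n)$$
$$= \frac{1}{n}\left[E(x_1) + E(x_2) + \cdots + E(x_n)\right] = \frac{1}{n}\underbrace{(\mu + \mu + \cdots + \mu)}_{n \text{ times}} = \frac{1}{n}(n\mu) = \mu$$
$$E(\bar{x}) = \mu$$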
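Detailed Proof (continued)
53
Because the observations in a random sample are independent, the variance of their sum is the sum of their variances:
$$V(\bar{x}) = V\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n^{2}}V\!\left(\sum_{i=1}^{n} x_i\right) = \frac{1}{n^{2}}V(x_1 + x_2 + \cdots + x_n)$$
$$= \frac{1}{n^{2}}\left[V(x_1) + V(x_2) + \cdots + V(x_n)\right] = \frac{1}{n^{2}}\underbrace{(\sigma^{2} + \sigma^{2} + \cdots + \sigma^{2})}_{n \text{ times}} = \frac{1}{n^{2}}(n\sigma^{2}) = \frac{\sigma^{2}}{n}$$
$$V(\bar{x}) = \frac{\sigma^{2}}{n}$$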
Detailed Proof (continued)
54
$$SE_{\bar{x}} = \sqrt{V(\bar{x})} = \sqrt{\frac{\sigma^{2}}{n}} = \frac{\sigma}{\sqrt{n}}$$
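A small simulation can make the standard error formula tangible. The sketch below draws repeated samples from a uniform population on [0, 1), which has μ = 0.5 and σ² = 1/12, and compares the spread of the sample means with σ/√n. The population, the sample size of 30, the 5000 repetitions, and the fixed seed are all arbitrary choices for this illustration.

```python
import math
import random
import statistics
from typing import List


def simulate_sample_means(n: int, repetitions: int, seed: int = 0) -> List[float]:
    """Draw `repetitions` samples of size n from Uniform[0, 1) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(repetitions)
    ]


n = 30
means = simulate_sample_means(n, repetitions=5000)

observed_mean = statistics.fmean(means)
observed_se = statistics.stdev(means)
theoretical_se = math.sqrt(1 / 12) / math.sqrt(n)  # sigma / sqrt(n)

print(round(observed_mean, 3), round(observed_se, 4), round(theoretical_se, 4))
```

The population here is far from normal, yet the sample means cluster around μ with a spread close to σ/√n, as the proof and the Central Limit Theorem suggest.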
55
Example 12
Assume that the weights of marbles are normally distributed with mean 172 grams and standard deviation 29 grams.
a. If 4 marbles are selected, find the probability that their mean weight is less than 167 grams.
b. If 25 marbles are selected, find the probability that they have a mean weight of more than 167 grams.
c. If 100 marbles are selected, find the probability that they have a mean weight between 167 grams and 180 grams.
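One possible way to set up Example 12 is to model the sample mean as normal with mean μ and standard error σ/√n. The helper name `sample_mean_dist` is hypothetical, and this sketch is not the solution from the slides.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 172, 29  # population mean and standard deviation in grams


def sample_mean_dist(n: int) -> NormalDist:
    """Sampling distribution of the mean of n marbles: N(mu, sigma / sqrt(n))."""
    return NormalDist(mu=MU, sigma=SIGMA / sqrt(n))


p_a = sample_mean_dist(4).cdf(167)                                    # mean < 167
p_b = 1 - sample_mean_dist(25).cdf(167)                               # mean > 167
p_c = sample_mean_dist(100).cdf(180) - sample_mean_dist(100).cdf(167) # 167 < mean < 180

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

As n grows the standard error shrinks from 14.5 to 5.8 to 2.9 grams, so the same 5-gram shortfall below the mean becomes progressively less likely for the sample mean.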
56
Normal Approximation to the Binomial Distribution
In the binomial distribution, if the number of trials is very large, finding the probability that r ≥ j for some j, where 1 ≤ j ≤ n, requires tedious and lengthy calculations. In such cases, the problem can be solved by using the normal approximation to the binomial distribution.
Procedure:
Step 1: Given a binomial distribution with n, r, and p, where
n – stands for the total number of trials
r – stands for the number of successes (r = 0, 1, 2, …, n)
p – stands for the probability of success in a single trial.
Step 2: The criterion for the normal approximation to the binomial distribution is that
np > 5 and nq > 5 (or np ≥ 5 and nq ≥ 5), where q = 1 − p.
Then r has a binomial distribution that can be approximated by a normal distribution with
µ = np and σ = √(npq)
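Steps 1 and 2 can be sketched as a small helper that checks the criterion and returns the parameters of the approximating normal distribution. The function name is hypothetical, and the sketch uses the strict form of the criterion (np > 5 and nq > 5); the slide also allows the non-strict form.

```python
from math import sqrt
from typing import Tuple


def normal_approx_params(n: int, p: float) -> Tuple[float, float]:
    """Return (mu, sigma) for the normal approximation to Binomial(n, p).

    Raises ValueError when np > 5 and nq > 5 does not hold.
    """
    q = 1 - p
    if n * p <= 5 or n * q <= 5:
        raise ValueError("normal approximation not justified: need np > 5 and nq > 5")
    return n * p, sqrt(n * p * q)


mu, sigma = normal_approx_params(75, 0.80)
print(round(mu, 3), round(sigma, 3))
```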
57
Continuity Correction
58
Correction for Continuity
59
Converting Binomial to Standard Normal without correction for continuity
$$z = \frac{x - \mu}{\sigma} = \frac{x - np}{\sqrt{np(1 - p)}}$$
Converting Binomial to Standard Normal with correction for continuity
$$z = \frac{x \pm 0.5 - \mu}{\sigma} = \frac{x \pm 0.5 - np}{\sqrt{np(1 - p)}}$$
61
Example 13
The Denver Post stated that 80% of all new products introduced in grocery stores fail (and are taken off the market) within 2 years. Using normal approximation for this binomial distribution and correction for continuity, if a grocery store chain introduces 75 new products, a. Verify that the assumption for normal approximation to the
binomial is satisfied. b. What is the probability that within two years, 54 or more will fail?c. What is the probability that within two years, fewer than 62 will
fail?d. What is the probability that within two years, more than 49 will
fail? e. What is the probability that within two years, 58 or fewer fail?
62
Example 13 (solution)
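a.
$$np = 75(0.80) = 60 > 5 \quad \text{and} \quad nq = 75(0.20) = 15 > 5$$
b.
Without correction for continuity
$$\mu = 60 \text{ and } \sigma = 3.464$$
$$P(x \ge 54) = P\!\left(\frac{x - 60}{3.464} \ge \frac{54 - 60}{3.464}\right) = P(z \ge -1.73) = 0.9582$$
With correction for continuity
$$\mu = 60 \text{ and } \sigma = 3.464$$
$$P(x \ge 54 - 0.5) = P\!\left(\frac{x - 60}{3.464} \ge \frac{54 - 0.5 - 60}{3.464}\right) = P(z \ge -1.88) = 0.9699$$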
63
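Example 13 (solution)
c.
Without correction for continuity
$$\mu = 60 \text{ and } \sigma = 3.464$$
$$P(x < 62) = P(x \le 61) = P\!\left(\frac{x - 60}{3.464} \le \frac{61 - 60}{3.464}\right) = P(z \le 0.29) = 0.6141$$
With correction for continuity
$$\mu = 60 \text{ and } \sigma = 3.464$$
$$P(x \le 61 + 0.5) = P\!\left(\frac{x - 60}{3.464} \le \frac{61 + 0.5 - 60}{3.464}\right) = P(z \le 0.43) = 0.6664$$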
64
Example 13 (solution)
d.
▪ Without Correction for Continuity
▪ With Correction for Continuity
65
Example 13 (solution)
e.
▪ Without Correction for Continuity
▪ With Correction for Continuity
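For comparison, parts b to e of Example 13 can be computed both with the continuity-corrected normal approximation and with the exact binomial sum. This is a sketch under the problem's stated values (n = 75, p = 0.80); the helper names are invented, and the figures for parts d and e are produced by this code rather than taken from the slides.

```python
from math import comb, sqrt
from statistics import NormalDist

N, P = 75, 0.80
MU = N * P
SIGMA = sqrt(N * P * (1 - P))
Z = NormalDist()


def binom_cdf(k: int) -> float:
    """Exact P(X <= k) for X ~ Binomial(N, P)."""
    return sum(comb(N, r) * P ** r * (1 - P) ** (N - r) for r in range(k + 1))


def approx_at_most(k: int) -> float:
    """Normal approximation to P(X <= k) with correction for continuity."""
    return Z.cdf((k + 0.5 - MU) / SIGMA)


def approx_at_least(k: int) -> float:
    """Normal approximation to P(X >= k) with correction for continuity."""
    return 1 - Z.cdf((k - 0.5 - MU) / SIGMA)


# Each entry: (normal approximation, exact binomial probability).
parts = {
    "b": (approx_at_least(54), 1 - binom_cdf(53)),  # 54 or more fail
    "c": (approx_at_most(61), binom_cdf(61)),       # fewer than 62 fail
    "d": (approx_at_least(50), 1 - binom_cdf(49)),  # more than 49 fail
    "e": (approx_at_most(58), binom_cdf(58)),       # 58 or fewer fail
}

for label, (approx, exact) in parts.items():
    print(label, round(approx, 4), round(exact, 4))
```

The approximations for b and c differ slightly from the slide values because the code does not round z to two decimal places before looking up the area.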
66
PP & QQ Plots for Testing the Assumption of Normality
[Figure: PP plot (left panel) and QQ plot (right panel).]
67
Normal Probability Plot
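Let
$$z = \frac{x - \mu}{\sigma} = \frac{1}{\sigma}x - \frac{\mu}{\sigma}$$
Then
$$z = mx + b, \qquad \text{where } m = \frac{1}{\sigma} \text{ and } b = -\frac{\mu}{\sigma}$$
Hence, the data are approximately normal if the scatter plot of the data against the corresponding z-scores (matched by percentile) is approximately a straight line.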
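The points of a normal probability plot can be computed without any plotting library. The sketch below pairs each sorted observation with the standard normal z-score at the same percentile and measures how close the points are to a line with the correlation coefficient. The plotting position (i − 0.5)/n is one common convention and is an assumption here, as are the function names and both data sets.

```python
from statistics import NormalDist
from typing import List, Tuple


def normal_plot_points(data: List[float]) -> List[Tuple[float, float]]:
    """Pair each sorted observation with the z-score at the same percentile."""
    n = len(data)
    std = NormalDist()
    # Assumed plotting position: (i - 0.5) / n for the i-th smallest value.
    return [
        (x, std.inv_cdf((i - 0.5) / n))
        for i, x in enumerate(sorted(data), start=1)
    ]


def pearson_r(points: List[Tuple[float, float]]) -> float:
    """Correlation between the data values and their matched z-scores."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_z = sum(z for _, z in points) / n
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    szz = sum((z - mean_z) ** 2 for _, z in points)
    return sxz / (sxx * szz) ** 0.5


# Made-up data: exact normal quantiles with mean 50 and sigma 10, and a skewed set.
quantile_data = [50 + 10 * NormalDist().inv_cdf((i - 0.5) / 10) for i in range(1, 11)]
skewed_data = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

r_normal = pearson_r(normal_plot_points(quantile_data))
r_skewed = pearson_r(normal_plot_points(skewed_data))
print(round(r_normal, 3), round(r_skewed, 3))
```

A correlation near 1 corresponds to points lying on the line z = mx + b; the skewed data bend away from the line and give a visibly lower value.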
68
Testing the Assumption of Normality using the Probability-Probability Plot (PP Plot)
[Figure: two PP plots. Left: approximately normal. Right: not normal.]
69
Normal Quantile Plot
70
Testing the assumption of Normality using the Quantile-Quantile Plot (QQ Plot)
71
Standardized & Percentage Plots
[Figure: standardized plot (left panel) and percentage plot (right panel).]
72
Normality
Regardless of the data’s distribution, as the sample size increases, the sampling distribution approaches normality
Most continuous variables are assumed normal and even the discrete probability distribution, binomial, can be approximated using normality.
The normal probability distribution was developed by Gauss; a Gaussian probability distribution shows normality.
To test for normality we can use the PP plot (a metaphoric t-shirt) or the QQ plot (the t-shirt turned inside-out).
To force normality or normalize the data, we can use the standardized plot or the percentage change plot.
Central Limit Theorem
Continuous Random Variable
Correction for Continuity
Gaussian Probability Distribution
Normal Approximation
Normal Probability Distribution
Sampling Distribution
Standard Score
PP plot
QQ plot
Standardized plot
Percentage plot
73
Assignment Problems
Section 6.1: # 6.1
Section 6.2: # 6.6
Section 6.3: # 6.15, 6.17, 6.19, 6.27, 6.29
Section 6.4: # 6.31, 6.33, 6.41
Section 6.5: # 6.49, 6.54, 6.60
Section 6.6: # 6.65, 6.68, 6.70
Assignment for chapter-06
Section 6.1
# 6.1 Determine if the following are continuous or discrete random variables:
a. Number of characters in a document.
b. The amount of time it takes to make dinner.
c. The height of a palm tree.
Section 6.2
# 6.6 Illustrate the following curves indicating the points of inflection.
a. X~N()
b. X~N()
c. X~N()
Section 6.3
# 6.15 Determine the probability that the standard normal random variable Z will assume a single value between -1.42 and 0.75.
# 6.17 The random variable X is normally distributed with mean and . Find the following probabilities:
# 6.19 If random variable X is normally distributed with and , find K so that the .
Section6.4
# 6.27 The amount of solid fuels, X, which assumes values of x metric tons, is normally distributed with mean thousand metric tons (kmt) and standard deviation thousand metric tons.
a. Determine the probability that the amount is between 250 kmt and 320 kmt, that is
b. Find the metric tonnage such that the probability that the tonnage is exceeded is 0.80.
# 6.29 The price of coffee, X, which assumes values of x dollars, is normally distributed with mean and standard deviation .
a. Determine the probability that the cost is between $10 and $15, that is
b. Find the cost such that the probability that this price is exceeded is 0.2.
Section 6.4
# 6.31 The mean height of a group of 500 nonsmoking college students is 74 inches and the standard deviation is 5 inches. What is the probability that in a random sample of 25 students from this group, the average height will be between 73 and 75 inches?
# 6.33 Suppose that the weights of packages from a candy packing machine are distributed about a mean of 16 ounces with a standard deviation of 2 ounces. What is the probability that, if nine packages of candy are weighed, their average weight:
a. Will be less than 14 ounces?
b. Will be more than 16 ounces?
# 6.41 A survey of IQ scores of all the United States senators in the history of this body revealed a mean of and standard deviation . What is the probability that the average IQ score of a random sample of 16 senators:
a. Will be lower than 85?
b. Will exceed 85?
c. Will be between 85 and 130?
Section 6.5
# 6.49 According to Chebyshev’s rule, what percentage of the data values will lie within one standard deviation of the mean?
# 6.54 According to the empirical rule, what percentage of data values will lie within three standard deviations of the mean?
# 6.60 Based on the empirical rule, how many data values in a set of size 200 would you expect to lie within three standard deviations of the mean?
Section 6.6
# 6.65 Let the random variable X be binomially distributed with and . Evaluate the following probabilities:
# 6.68 A fair coin is tossed 15 times. Determine the probability that between 6 and 8 heads inclusive will occur:
a. Using the binomial probability distribution.
b. Using the normal approximation without correction for continuity.
c. Using the normal approximation with correction for continuity.
# 6.70 A fair die is rolled 200 times. Using normal approximation to the binomial, what is the probability that an ace (one) will appear between 34 and 36 times?