Discrete Distributions

Bernoulli
f(x) = p^x (1 − p)^(1−x), x = 0, 1; 0 < p < 1
M(t) = 1 − p + pe^t, −∞ < t < ∞
µ = p, σ² = p(1 − p)

Binomial b(n, p)
f(x) = [n!/(x!(n − x)!)] p^x (1 − p)^(n−x), x = 0, 1, 2, . . . , n; 0 < p < 1
M(t) = (1 − p + pe^t)^n, −∞ < t < ∞
µ = np, σ² = np(1 − p)

Geometric
f(x) = (1 − p)^(x−1) p, x = 1, 2, 3, . . . ; 0 < p < 1
M(t) = pe^t / [1 − (1 − p)e^t], t < −ln(1 − p)
µ = 1/p, σ² = (1 − p)/p²

Hypergeometric
f(x) = C(N₁, x) C(N₂, n − x) / C(N, n), x ≤ n, x ≤ N₁, n − x ≤ N₂; N₁ > 0, N₂ > 0, N = N₁ + N₂
µ = n(N₁/N), σ² = n(N₁/N)(N₂/N)[(N − n)/(N − 1)]

Negative Binomial
f(x) = C(x − 1, r − 1) p^r (1 − p)^(x−r), x = r, r + 1, r + 2, . . . ; 0 < p < 1, r = 1, 2, 3, . . .
M(t) = (pe^t)^r / [1 − (1 − p)e^t]^r, t < −ln(1 − p)
µ = r(1/p), σ² = r(1 − p)/p²

Poisson
f(x) = λ^x e^(−λ)/x!, x = 0, 1, 2, . . . ; λ > 0
M(t) = e^(λ(e^t − 1)), −∞ < t < ∞
µ = λ, σ² = λ

Uniform (discrete)
f(x) = 1/m, x = 1, 2, . . . , m; m > 0
µ = (m + 1)/2, σ² = (m² − 1)/12
Continuous Distributions

Beta
f(x) = [Γ(α + β)/(Γ(α)Γ(β))] x^(α−1) (1 − x)^(β−1), 0 < x < 1; α > 0, β > 0
µ = α/(α + β), σ² = αβ/[(α + β + 1)(α + β)²]

Chi-square χ²(r)
f(x) = [1/(Γ(r/2) 2^(r/2))] x^(r/2−1) e^(−x/2), 0 < x < ∞; r = 1, 2, . . .
M(t) = 1/(1 − 2t)^(r/2), t < 1/2
µ = r, σ² = 2r

Exponential
f(x) = (1/θ) e^(−x/θ), 0 ≤ x < ∞; θ > 0
M(t) = 1/(1 − θt), t < 1/θ
µ = θ, σ² = θ²

Gamma
f(x) = [1/(Γ(α) θ^α)] x^(α−1) e^(−x/θ), 0 < x < ∞; α > 0, θ > 0
M(t) = 1/(1 − θt)^α, t < 1/θ
µ = αθ, σ² = αθ²

Normal N(µ, σ²)
f(x) = [1/(σ√(2π))] e^(−(x−µ)²/(2σ²)), −∞ < x < ∞; −∞ < µ < ∞, σ > 0
M(t) = e^(µt + σ²t²/2), −∞ < t < ∞
E(X) = µ, Var(X) = σ²

Uniform U(a, b)
f(x) = 1/(b − a), a ≤ x ≤ b; −∞ < a < b < ∞
M(t) = (e^(tb) − e^(ta))/(t(b − a)), t ≠ 0; M(0) = 1
µ = (a + b)/2, σ² = (b − a)²/12
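The continuous entries can be checked the same way by numerical integration. A sketch for the exponential distribution with θ = 2 (an arbitrary illustrative value), using a plain midpoint rule so that only the standard library is needed:

```python
from math import exp

theta = 2.0                                    # exponential parameter (arbitrary choice)
f = lambda x: (1 / theta) * exp(-x / theta)    # pdf from the table

def integrate(h, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

upper = 60 * theta      # effectively infinity for this exponential tail
total = integrate(f, 0.0, upper)                             # should be ~1
mu = integrate(lambda x: x * f(x), 0.0, upper)               # should be ~theta
var = integrate(lambda x: (x - mu) ** 2 * f(x), 0.0, upper)  # should be ~theta^2
```

The numerical values reproduce the table entries µ = θ and σ² = θ² to the accuracy of the integration rule.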
Chapter 4
Bivariate Distributions
4.1 Bivariate Distributions of the Discrete Type
4.2 The Correlation Coefficient
4.3 Conditional Distributions
4.4 Bivariate Distributions of the Continuous Type
4.5 The Bivariate Normal Distribution
4.1 BIVARIATE DISTRIBUTIONS OF THE DISCRETE TYPE

So far, we have taken only one measurement on a single item under observation. However, it is clear in many practical cases that it is possible, and often very desirable, to take more than one measurement of a random observation. Suppose, for example, that we are observing female college students to obtain information about some of their physical characteristics, such as height, x, and weight, y, because we are trying to determine a relationship between those two characteristics. For instance, there may be some pattern between height and weight that can be described by an appropriate curve y = u(x). Certainly, not all of the points observed will be on this curve, but we want to attempt to find the "best" curve to describe the relationship and then say something about the variation of the points around the curve.
Another example might concern high school rank—say, x—and the ACT (or SAT) score—say, y—of incoming college students. What is the relationship between these two characteristics? More importantly, how can we use those measurements to predict a third one, such as first-year college GPA—say, z—with a function z = v(x, y)? This is a very important problem for college admission offices, particularly when it comes to awarding an athletic scholarship, because the incoming student–athlete must satisfy certain conditions before receiving such an award.
Definition 4.1-1
Let X and Y be two random variables defined on a discrete space. Let S denote the corresponding two-dimensional space of X and Y, the two random variables of the discrete type. The probability that X = x and Y = y is denoted by f(x, y) = P(X = x, Y = y). The function f(x, y) is called the joint probability mass function (joint pmf) of X and Y and has the following properties:
126 Chapter 4 Bivariate Distributions
(a) 0 ≤ f(x, y) ≤ 1.
(b) Σ_{(x,y)∈S} f(x, y) = 1.
(c) P[(X, Y) ∈ A] = Σ_{(x,y)∈A} f(x, y), where A is a subset of the space S.
The following example will make this definition more meaningful.
Example 4.1-1
Roll a pair of fair dice. For each of the 36 sample points with probability 1/36, let X denote the smaller and Y the larger outcome on the dice. For example, if the outcome is (3, 2), then the observed values are X = 2, Y = 3. The event {X = 2, Y = 3} could occur in one of two ways—(3, 2) or (2, 3)—so its probability is
1/36 + 1/36 = 2/36.
If the outcome is (2, 2), then the observed values are X = 2, Y = 2. Since the event {X = 2, Y = 2} can occur in only one way, P(X = 2, Y = 2) = 1/36. The joint pmf of X and Y is given by the probabilities
f(x, y) = 1/36, 1 ≤ x = y ≤ 6,
f(x, y) = 2/36, 1 ≤ x < y ≤ 6,
when x and y are integers. Figure 4.1-1 depicts the probabilities of the various points of the space S.
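Since the space has only 36 equally likely outcomes, the joint pmf can be confirmed by brute-force enumeration (a short sketch):

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely rolls and tally (smaller, larger).
f = {}
for d1, d2 in product(range(1, 7), repeat=2):
    x, y = min(d1, d2), max(d1, d2)
    f[(x, y)] = f.get((x, y), Fraction(0)) + Fraction(1, 36)

# Matches the stated pmf: 1/36 on the diagonal, 2/36 when x < y.
assert f[(2, 2)] == Fraction(1, 36)
assert f[(2, 3)] == Fraction(2, 36)
assert sum(f.values()) == 1     # property (b) of a joint pmf
```

The dictionary has exactly the 21 points of S with 1 ≤ x ≤ y ≤ 6, and its values sum to 1.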
[Figure 4.1-1: Discrete joint pmf. The 21 points of S are plotted in the (x, y)-plane for 1 ≤ x ≤ y ≤ 6, each labeled with its probability: 1/36 on the diagonal x = y and 2/36 when x < y. The column totals 11/36, 9/36, 7/36, 5/36, 3/36, 1/36 (for x = 1, . . . , 6) and the row totals 1/36, 3/36, 5/36, 7/36, 9/36, 11/36 (for y = 1, . . . , 6) are recorded in the margins.]
Section 4.1 Bivariate Distributions of the Discrete Type 127
Notice that certain numbers have been recorded in the bottom and left-hand margins of Figure 4.1-1. These numbers are the respective column and row totals of the probabilities. The column totals are the respective probabilities that X will assume the values in the x space SX = {1, 2, 3, 4, 5, 6}, and the row totals are the respective probabilities that Y will assume the values in the y space SY = {1, 2, 3, 4, 5, 6}. That is, the totals describe the probability mass functions of X and Y, respectively. Since each collection of these probabilities is frequently recorded in the margins and satisfies the properties of a pmf of one random variable, each is called a marginal pmf.
Definition 4.1-2
Let X and Y have the joint probability mass function f(x, y) with space S. The probability mass function of X alone, which is called the marginal probability mass function of X, is defined by
fX(x) = Σ_y f(x, y) = P(X = x), x ∈ SX,
where the summation is taken over all possible y values for each given x in the x space SX. That is, the summation is over all (x, y) in S with a given x value. Similarly, the marginal probability mass function of Y is defined by
fY(y) = Σ_x f(x, y) = P(Y = y), y ∈ SY,
where the summation is taken over all possible x values for each given y in the y space SY. The random variables X and Y are independent if and only if, for every x ∈ SX and every y ∈ SY,
P(X = x, Y = y) = P(X = x)P(Y = y)
or, equivalently,
f(x, y) = fX(x)fY(y);
otherwise, X and Y are said to be dependent.
We note in Example 4.1-1 that X and Y are dependent because there are many x and y values for which f(x, y) ≠ fX(x)fY(y). For instance,
fX(1)fY(1) = (11/36)(1/36) ≠ 1/36 = f(1, 1).
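For the dice example, the marginal pmfs and the dependence check can be carried out mechanically (a short sketch):

```python
from fractions import Fraction
from itertools import product

# Joint pmf of the smaller (X) and larger (Y) of two fair dice.
f = {}
for d1, d2 in product(range(1, 7), repeat=2):
    x, y = min(d1, d2), max(d1, d2)
    f[(x, y)] = f.get((x, y), Fraction(0)) + Fraction(1, 36)

# Marginals: sum the joint pmf over the other variable.
fX = {x: sum(p for (a, _), p in f.items() if a == x) for x in range(1, 7)}
fY = {y: sum(p for (_, b), p in f.items() if b == y) for y in range(1, 7)}

# Column/row totals of Figure 4.1-1: fX(1) = 11/36, fY(1) = 1/36.
assert fX[1] == Fraction(11, 36) and fY[1] == Fraction(1, 36)
# X and Y are dependent: f(1,1) differs from fX(1) * fY(1).
assert f[(1, 1)] == Fraction(1, 36) != fX[1] * fY[1]
```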
Example 4.1-2
Let the joint pmf of X and Y be defined by
f(x, y) = (x + y)/21, x = 1, 2, 3, y = 1, 2.
Then
fX(x) = Σ_y f(x, y) = Σ_{y=1}^{2} (x + y)/21 = (x + 1)/21 + (x + 2)/21 = (2x + 3)/21, x = 1, 2, 3,
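This marginal computation, together with the analogous marginal of Y, can be confirmed in a few lines (a sketch):

```python
from fractions import Fraction

f = lambda x, y: Fraction(x + y, 21)        # joint pmf of Example 4.1-2

for x in (1, 2, 3):
    fX = sum(f(x, y) for y in (1, 2))       # marginal of X
    assert fX == Fraction(2 * x + 3, 21)    # matches (2x + 3)/21

# The marginal of Y follows the same way: fY(y) = sum over x of f(x, y).
fY = {y: sum(f(x, y) for x in (1, 2, 3)) for y in (1, 2)}
assert fY[1] == Fraction(9, 21) and fY[2] == Fraction(12, 21)
```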
48 Probability and Statistics for Computer Scientists
[Figure 3.3: Expectation as a center of gravity. Two weight diagrams: (a) masses balancing at E(X) = 0.5; (b) masses balancing at E(X) = 0.25.]
Similar arguments can be used to derive the general formula for the expectation.
Expectation, discrete case:
µ = E(X) = Σ_x x P(x)    (3.3)

This formula returns the center of gravity for a system with masses P(x) allocated at points x. Expected value is often denoted by a Greek letter µ.

In a certain sense, expectation is the best forecast of X. The variable itself is random. It takes different values with different probabilities P(x). At the same time, it has just one expectation E(X), which is non-random.
3.3.2 Expectation of a function
Often we are interested in another variable, Y, that is a function of X. For example, downloading time depends on the connection speed, the profit of a computer store depends on the number of computers sold, and the bonus of its manager depends on this profit. The expectation of Y = g(X) is computed by a similar formula,
E{g(X)} = Σ_x g(x) P(x).    (3.4)

Remark: Indeed, if g is a one-to-one function, then Y takes each value y = g(x) with probability P(x), and the formula for E(Y) can be applied directly. If g is not one-to-one, then some values of g(x) will be repeated in (3.4). However, they are still multiplied by the corresponding probabilities. When we add in (3.4), these probabilities are also added, thus each value of g(x) is still multiplied by the probability P_Y(g(x)).
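Formulas (3.3) and (3.4) translate directly into code. A minimal sketch with a made-up three-point pmf (the probabilities are purely illustrative):

```python
from fractions import Fraction

# A made-up pmf for illustration: P(X = x) for x in {0, 1, 2}.
P = {0: Fraction(1, 5), 1: Fraction(1, 2), 2: Fraction(3, 10)}

# (3.3): E(X) is the sum of x * P(x).
EX = sum(x * p for x, p in P.items())

# (3.4): E{g(X)} is the sum of g(x) * P(x); here g(x) = x^2.
g = lambda x: x ** 2
EgX = sum(g(x) * p for x, p in P.items())

assert EX == Fraction(11, 10)       # 0*(1/5) + 1*(1/2) + 2*(3/10)
assert EgX == Fraction(17, 10)      # 0*(1/5) + 1*(1/2) + 4*(3/10)
assert g(EX) != EgX                 # E{g(X)} is generally NOT g(E(X))
```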
3.3.3 Properties
The following linear properties of expectations follow directly from (3.3) and (3.4). For any random variables X and Y and any non-random numbers a, b, and c, we have
Discrete Random Variables and Their Distributions 49
Properties of expectations:

E(aX + bY + c) = aE(X) + bE(Y) + c

In particular,
E(X + Y) = E(X) + E(Y)
E(aX) = aE(X)
E(c) = c

For independent X and Y,
E(XY) = E(X)E(Y)    (3.5)
Proof: The first property follows from the Addition Rule (3.2). For any X and Y,
E(aX + bY + c) = Σ_x Σ_y (ax + by + c) P_(X,Y)(x, y)
= Σ_x ax Σ_y P_(X,Y)(x, y) + Σ_y by Σ_x P_(X,Y)(x, y) + c Σ_x Σ_y P_(X,Y)(x, y)
= a Σ_x x P_X(x) + b Σ_y y P_Y(y) + c.
The next three equalities are special cases. To prove the last property, we recall that P_(X,Y)(x, y) = P_X(x) P_Y(y) for independent X and Y, and therefore,
E(XY) = Σ_x Σ_y (xy) P_X(x) P_Y(y) = Σ_x x P_X(x) Σ_y y P_Y(y) = E(X)E(Y). ∎
Remark: The last property in (3.5) holds for some dependent variables too, hence it cannot be used to verify independence of X and Y.
Example 3.9. In Example 3.6 on p. 46,
E(X) = (0)(0.5) + (1)(0.5) = 0.5 and
E(Y ) = (0)(0.4) + (1)(0.3) + (2)(0.15) + (3)(0.15) = 1.05,
therefore, the expected total number of errors is
E(X + Y ) = 0.5 + 1.05 = 1.65.
♦
Remark: Clearly, the program will never have 1.65 errors, because the number of errors is always an integer. Then, should we round 1.65 to 2 errors? Absolutely not, it would be a mistake. Although both X and Y are integers, their expectations, or average values, do not have to be integers at all.
3.3.4 Variance and standard deviation
Expectation shows where the average value of a random variable is located, or where the variable is expected to be, plus or minus some error. How large could this "error" be, and how much can a variable vary around its expectation? Let us introduce some measures of variability.
50 Probability and Statistics for Computer Scientists
Example 3.10. Here is a rather artificial but illustrative scenario. Consider two users. One receives either 48 or 52 e-mail messages per day, with a 50-50% chance of each. The other receives either 0 or 100 e-mails, also with a 50-50% chance. What is a common feature of these two distributions, and how are they different?
We see that both users receive the same average number of e-mails:
E(X) = E(Y ) = 50.
However, in the first case, the actual number of e-mails is always close to 50, whereas it always differs from it by 50 in the second case. The first random variable, X, is more stable; it has low variability. The second variable, Y, has high variability. ♦
This example shows that the variability of a random variable is measured by its distance from the mean µ = E(X). In its turn, this distance is random too, and therefore cannot serve as a characteristic of a distribution. It remains to square it and take the expectation of the result.
DEFINITION 3.6
Variance of a random variable is defined as the expected squared deviation from the mean. For discrete random variables, variance is
σ² = Var(X) = E(X − EX)² = Σ_x (x − µ)² P(x)
Remark: Notice that if the distance to the mean is not squared, then the result is always µ − µ = 0, bearing no information about the distribution of X.
According to this definition, variance is always non-negative. Further, it equals 0 only if x = µ for all values of x, i.e., when X is constantly equal to µ. Certainly, a constant (non-random) variable has zero variability.
Variance can also be computed as
Var(X) = E(X²) − µ².    (3.6)
A proof of this is left as Exercise 3.38a.
DEFINITION 3.7
Standard deviation is the square root of variance,
σ = Std(X) = √Var(X)
Continuing the Greek-letter tradition, variance is often denoted by σ². Then, standard deviation is σ.
If X is measured in some units, then its mean µ has the same measurement unit as X. Variance σ² is measured in squared units, and therefore it cannot be compared with X or µ. No matter how funny it sounds, it is rather normal to measure the variance of profit in squared dollars, the variance of class enrollment in squared students, and the variance of available disk space in squared gigabytes. When the square root is taken, the resulting standard deviation σ is again measured in the same units as X. This is the main reason for introducing yet another measure of variability, σ.
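Definition 3.6 and the shortcut formula (3.6), applied in code to the two users of Example 3.10, make the contrast concrete (a minimal sketch):

```python
from math import sqrt

def mean_var(pmf):
    """Return (E(X), Var(X)) for a discrete pmf given as {x: P(x)}."""
    mu = sum(x * p for x, p in pmf.items())
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())             # Definition 3.6
    var_shortcut = sum(x * x * p for x, p in pmf.items()) - mu ** 2  # formula (3.6)
    assert abs(var - var_shortcut) < 1e-9
    return mu, var

X = {48: 0.5, 52: 0.5}      # the stable user
Y = {0: 0.5, 100: 0.5}      # the volatile user

muX, varX = mean_var(X)
muY, varY = mean_var(Y)

# Same mean, very different variability:
# E(X) = E(Y) = 50, but Std(X) = 2 while Std(Y) = 50.
assert muX == muY == 50 and sqrt(varX) == 2.0 and sqrt(varY) == 50.0
```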
Discrete Random Variables and Their Distributions 51
3.3.5 Covariance and correlation
Expectation, variance, and standard deviation characterize the distribution of a single random variable. Now we introduce measures of association of two random variables.
[Figure 3.4: Positive, negative, and zero covariance. Three scatter plots of Y against X illustrating (a) Cov(X, Y) > 0, (b) Cov(X, Y) < 0, and (c) Cov(X, Y) = 0.]
DEFINITION 3.8
Covariance σ_XY = Cov(X, Y) is defined as
Cov(X, Y) = E{(X − EX)(Y − EY)} = E(XY) − E(X)E(Y)
It summarizes the interrelation of two random variables.
Covariance is the expected product of deviations of X and Y from their respective expectations. If Cov(X, Y) > 0, then positive deviations (X − EX) are more likely to be multiplied by positive (Y − EY), and negative (X − EX) are more likely to be multiplied by negative (Y − EY). In short, large X imply large Y, and small X imply small Y. These variables are positively correlated, Figure 3.4a.
Conversely, Cov(X, Y) < 0 means that large X generally correspond to small Y and small X correspond to large Y. These variables are negatively correlated, Figure 3.4b.
If Cov(X,Y ) = 0, we say that X and Y are uncorrelated, Figure 3.4c.
DEFINITION 3.9
Correlation coefficient between variables X and Y is defined as
ρ = Cov(X, Y) / [(Std X)(Std Y)]
The correlation coefficient is a rescaled, normalized covariance. Notice that covariance Cov(X, Y) has a measurement unit: it is measured in units of X multiplied by units of Y. As a result, it is not clear from its value whether X and Y are strongly or weakly correlated. Really, one has to compare Cov(X, Y) with the magnitude of X and Y. The correlation coefficient performs such a comparison, and as a result, it is dimensionless.
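Definitions 3.8 and 3.9 are direct to compute; a sketch on a small made-up joint pmf (the numbers are purely illustrative):

```python
from math import sqrt

# A made-up joint pmf P(X = x, Y = y), positively associated by construction:
# X and Y take matching values 80% of the time.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

EX  = sum(x * p for (x, y), p in joint.items())
EY  = sum(y * p for (x, y), p in joint.items())
EXY = sum(x * y * p for (x, y), p in joint.items())

cov = EXY - EX * EY                        # Definition 3.8
VarX = sum((x - EX) ** 2 * p for (x, y), p in joint.items())
VarY = sum((y - EY) ** 2 * p for (x, y), p in joint.items())
rho = cov / (sqrt(VarX) * sqrt(VarY))      # Definition 3.9
```

Here cov = 0.4 − (0.5)(0.5) = 0.15, a positive but unit-laden number, while ρ = 0.15/0.25 = 0.6 is the dimensionless version of the same association.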
52 Probability and Statistics for Computer Scientists
[Figure 3.5: Perfect correlation, ρ = ±1. Two plots of Y against X with all points on a straight line: one of positive slope (ρ = 1) and one of negative slope (ρ = −1).]
How do we interpret the value of ρ? What possible values can it take?
As a special case of the famous Cauchy-Schwarz inequality,
−1 ≤ ρ ≤ 1,
where |ρ| = 1 is possible only when all values of X and Y lie on a straight line, as in Figure 3.5. Further, values of ρ near 1 indicate strong positive correlation, values near (−1) show strong negative correlation, and values near 0 show weak correlation or no correlation.
3.3.6 Properties
The following properties of variances, covariances, and correlation coefficients hold for any random variables X, Y, Z, and W and any non-random numbers a, b, c, and d.
Properties of variances and covariances:

Var(aX + bY + c) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)
Cov(aX + bY, cZ + dW) = ac Cov(X, Z) + ad Cov(X, W) + bc Cov(Y, Z) + bd Cov(Y, W)
Cov(X, Y) = Cov(Y, X)
ρ(X, Y) = ρ(Y, X)

In particular,
Var(aX + b) = a² Var(X)
Cov(aX + b, cY + d) = ac Cov(X, Y)
ρ(aX + b, cY + d) = ρ(X, Y)

For independent X and Y,
Cov(X, Y) = 0
Var(X + Y) = Var(X) + Var(Y)    (3.7)
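These identities can be spot-checked by enumeration over any finite joint pmf; a sketch with illustrative numbers and arbitrary constants a, b, c:

```python
# A made-up joint pmf of (X, Y) and arbitrary constants.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
a, b, c = 3.0, -2.0, 7.0

def E(h):
    """Expectation of h(x, y) under the joint pmf."""
    return sum(h(x, y) * p for (x, y), p in joint.items())

def Var(h):
    m = E(h)
    return E(lambda x, y: (h(x, y) - m) ** 2)

cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

# Check: Var(aX + bY + c) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
lhs = Var(lambda x, y: a * x + b * y + c)
rhs = (a ** 2 * Var(lambda x, y: x)
       + b ** 2 * Var(lambda x, y: y)
       + 2 * a * b * cov)
assert abs(lhs - rhs) < 1e-9
```

Note that the constant c drops out of the variance, exactly as the special case Var(aX + b) = a² Var(X) predicts.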
200 Chapter 5 Distributions of Functions of Random Variables
5.6 THE CENTRAL LIMIT THEOREM

In Section 5.4, we found that the mean X̄ of a random sample of size n from a distribution with mean µ and variance σ² > 0 is a random variable with the properties that
E(X̄) = µ and Var(X̄) = σ²/n.
As n increases, the variance of X̄ decreases. Consequently, the distribution of X̄ clearly depends on n, and we see that we are dealing with sequences of distributions.
In Theorem 5.5-1, we considered the pdf of X̄ when sampling is from the normal distribution N(µ, σ²). We showed that the distribution of X̄ is N(µ, σ²/n), and in Figure 5.5-1, by graphing the pdfs for several values of n, we illustrated the property that as n increases, the probability becomes concentrated in a small interval centered at µ. That is, as n increases, X̄ tends to converge to µ, or (X̄ − µ) tends to converge to 0 in a probability sense. (See Section 5.8.)
In general, if we let
W = (√n/σ)(X̄ − µ) = (X̄ − µ)/(σ/√n) = (Y − nµ)/(√n σ),
where Y is the sum of a random sample of size n from some distribution with mean µ and variance σ², then, for each positive integer n,
E(W) = E[(X̄ − µ)/(σ/√n)] = [E(X̄) − µ]/(σ/√n) = (µ − µ)/(σ/√n) = 0
and
Var(W) = E(W²) = E[(X̄ − µ)²/(σ²/n)] = E[(X̄ − µ)²]/(σ²/n) = (σ²/n)/(σ²/n) = 1.
Thus, while X̄ − µ tends to "degenerate" to zero, the factor √n/σ in √n(X̄ − µ)/σ "spreads out" the probability enough to prevent this degeneration. What, then, is the distribution of W as n increases? One observation that might shed some light on the answer to this question can be made immediately. If the sample arises from a normal distribution, then, from Theorem 5.5-1, we know that X̄ is N(µ, σ²/n), and hence W is N(0, 1) for each positive n. Thus, in the limit, the distribution of W must be N(0, 1). So if the solution of the question does not depend on the underlying distribution (i.e., it is unique), the answer must be N(0, 1). As we will see, that is exactly the case, and this result is so important that it is called the central limit theorem, the proof of which is given in Section 5.9.
Theorem 5.6-1 (Central Limit Theorem) If X̄ is the mean of a random sample X₁, X₂, . . . , Xₙ of size n from a distribution with a finite mean µ and a finite positive variance σ², then the distribution of
W = (X̄ − µ)/(σ/√n) = (Σ_{i=1}^{n} Xᵢ − nµ)/(√n σ)
is N(0, 1) in the limit as n → ∞.
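As a numerical illustration (not from the text): when the Xᵢ are Bernoulli(p), the sum Y is b(n, p), so P(W ≤ w) can be computed exactly and compared with the standard normal cdf Φ(w). A sketch, with p = 0.3 and the grid of w values as arbitrary choices:

```python
from math import comb, erf, sqrt

def Phi(w):
    """Standard normal cdf, via the error function."""
    return 0.5 * (1 + erf(w / sqrt(2)))

def max_clt_error(n, p):
    """Max of |P(W <= w) - Phi(w)| over a small grid of w, for Y ~ b(n, p)."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    err = 0.0
    for w in (-2, -1, 0, 1, 2):
        y = mu + w * sigma                       # P(W <= w) = P(Y <= y)
        exact = sum(q for k, q in enumerate(pmf) if k <= y)
        err = max(err, abs(exact - Phi(w)))
    return err

# The normal approximation improves as n grows, as Theorem 5.6-1 predicts.
e_small = max_clt_error(10, 0.3)
e_large = max_clt_error(1000, 0.3)
```

With these choices the maximal discrepancy drops from roughly 0.15 at n = 10 to about 0.014 at n = 1000.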
Section 5.8 Chebyshev’s Inequality and Convergence in Probability 213
g(u) =
6(324u⁵/5), 0 < u < 1/6,
6(1/20 − 3u/2 + 18u² − 108u³ + 324u⁴ − 324u⁵), 1/6 ≤ u < 2/6,
6(−79/20 + 117u/2 − 342u² + 972u³ − 1296u⁴ + 648u⁵), 2/6 ≤ u < 3/6,
6(731/20 − 693u/2 + 1278u² − 2268u³ + 1944u⁴ − 648u⁵), 3/6 ≤ u < 4/6,
6(−1829/20 + 1227u/2 − 1602u² + 2052u³ − 1296u⁴ + 324u⁵), 4/6 ≤ u < 5/6,
6(324/5 − 324u + 648u² − 648u³ + 324u⁴ − 324u⁵/5), 5/6 ≤ u < 1.
We can also calculate
∫_{1/6}^{2/6} g(u) du = 19/240 = 0.0792
and
∫_{11/18}^{1} g(u) du = 5,818/32,805 = 0.17735.
Although these integrations are not difficult, they are tedious to do by hand. ∎
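This tedium is exactly what a few lines of numerical integration remove. A sketch (midpoint rule; the subdivision count is an arbitrary accuracy choice) that encodes the six polynomial pieces of g(u):

```python
def g(u):
    """The piecewise pdf g(u) on (0, 1), six polynomial pieces."""
    if u < 1/6:
        return 6 * (324 * u**5 / 5)
    if u < 2/6:
        return 6 * (1/20 - 3*u/2 + 18*u**2 - 108*u**3 + 324*u**4 - 324*u**5)
    if u < 3/6:
        return 6 * (-79/20 + 117*u/2 - 342*u**2 + 972*u**3 - 1296*u**4 + 648*u**5)
    if u < 4/6:
        return 6 * (731/20 - 693*u/2 + 1278*u**2 - 2268*u**3 + 1944*u**4 - 648*u**5)
    if u < 5/6:
        return 6 * (-1829/20 + 1227*u/2 - 1602*u**2 + 2052*u**3 - 1296*u**4 + 324*u**5)
    return 6 * (324/5 - 324*u + 648*u**2 - 648*u**3 + 324*u**4 - 324*u**5 / 5)

def integrate(h, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

I1 = integrate(g, 1/6, 2/6)    # should match 19/240  = 0.0792
I2 = integrate(g, 11/18, 1)    # should match 5,818/32,805 = 0.17735
total = integrate(g, 0, 1)     # a pdf must integrate to 1
```

It reproduces 19/240 ≈ 0.0792 and 5,818/32,805 ≈ 0.17735, and confirms that g integrates to 1 over (0, 1).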
5.8 CHEBYSHEV'S INEQUALITY AND CONVERGENCE IN PROBABILITY

In this section, we use Chebyshev's inequality to show, in another sense, that the sample mean, X̄, is a good statistic to use to estimate a population mean µ; the relative frequency of success in n independent Bernoulli trials, Y/n, is a good statistic for estimating p. We examine the effect of the sample size n on these estimates.
We begin by showing that Chebyshev's inequality gives added significance to the standard deviation in terms of bounding certain probabilities. The inequality is valid for all distributions for which the standard deviation exists. The proof is given for the discrete case, but it holds for the continuous case, with integrals replacing summations.
Theorem 5.8-1 (Chebyshev's Inequality) If the random variable X has a mean µ and variance σ², then, for every k ≥ 1,
P(|X − µ| ≥ kσ) ≤ 1/k².
Proof: Let f(x) denote the pmf of X. Then
214 Chapter 5 Distributions of Functions of Random Variables
σ² = E[(X − µ)²] = Σ_{x∈S} (x − µ)² f(x)
= Σ_{x∈A} (x − µ)² f(x) + Σ_{x∈A′} (x − µ)² f(x),    (5.8-1)
where
A = {x : |x − µ| ≥ kσ}.
The second term in the right-hand member of Equation 5.8-1 is the sum of nonnegative numbers and thus is greater than or equal to zero. Hence,
σ² ≥ Σ_{x∈A} (x − µ)² f(x).
However, in A, |x − µ| ≥ kσ; so
σ² ≥ Σ_{x∈A} (kσ)² f(x) = k²σ² Σ_{x∈A} f(x).
But the latter summation equals P(X ∈ A); thus,
σ² ≥ k²σ² P(X ∈ A) = k²σ² P(|X − µ| ≥ kσ).
That is,
P(|X − µ| ≥ kσ) ≤ 1/k². ∎
Corollary 5.8-1 If ε = kσ, then
P(|X − µ| ≥ ε) ≤ σ²/ε².
In words, Chebyshev's inequality states that the probability that X differs from its mean by at least k standard deviations is less than or equal to 1/k². It follows that the probability that X differs from its mean by less than k standard deviations is at least 1 − 1/k². That is,
P(|X − µ| < kσ) ≥ 1 − 1/k².
From the corollary, it also follows that
P(|X − µ| < ε) ≥ 1 − σ²/ε².
Thus, Chebyshev’s inequality can be used as a bound for certain probabilities.However, in many instances, the bound is not very close to the true probability.
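To see how loose the bound can be, compare it with the exact tail probability of a b(n, p) random variable; a sketch with n = 100, p = 0.5 (arbitrary illustrative choices), where µ = 50 and σ = 5:

```python
from math import comb, sqrt

n, p = 100, 0.5                                # arbitrary illustrative choices
mu, sigma = n * p, sqrt(n * p * (1 - p))       # mu = 50, sigma = 5

def tail(k):
    """Exact P(|X - mu| >= k*sigma) for X ~ b(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x)
               for x in range(n + 1) if abs(x - mu) >= k * sigma)

for k in (1, 2, 3):
    assert tail(k) <= 1 / k**2                 # Chebyshev's bound always holds
```

For k = 2, the exact probability is about 0.057 while the Chebyshev bound is 0.25, illustrating how conservative the bound can be.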
Example 5.8-1
If it is known that X has a mean of 25 and a variance of 16, then, since σ = 4, a lower bound for P(17 < X < 33) is given by
P(17 < X < 33) = P(|X − 25| < 8) = P(|X − µ| < 2σ) ≥ 1 − 1/2² = 3/4.
214 Chapter 5 Distributions of Functions of Random Variables
σ 2 = E[(X − µ)2] =∑
x∈S
(x − µ)2f (x)
=∑
x∈A
(x − µ)2f (x) +∑
x∈A′(x − µ)2f (x), (5.8-1)
where
Chapter 4

Bivariate Distributions
4.1 Bivariate Distributions of the Discrete Type
4.2 The Correlation Coefficient
4.3 Conditional Distributions
4.4 Bivariate Distributions of the Continuous Type
4.5 The Bivariate Normal Distribution
4.1 BIVARIATE DISTRIBUTIONS OF THE DISCRETE TYPE

So far, we have taken only one measurement on a single item under observation. However, it is clear in many practical cases that it is possible, and often very desirable, to take more than one measurement of a random observation. Suppose, for example, that we are observing female college students to obtain information about some of their physical characteristics, such as height, x, and weight, y, because we are trying to determine a relationship between those two characteristics. For instance, there may be some pattern between height and weight that can be described by an appropriate curve y = u(x). Certainly, not all of the points observed will be on this curve, but we want to attempt to find the "best" curve to describe the relationship and then say something about the variation of the points around the curve.
Another example might concern high school rank—say, x—and the ACT (or SAT) score—say, y—of incoming college students. What is the relationship between these two characteristics? More importantly, how can we use those measurements to predict a third one, such as first-year college GPA—say, z—with a function z = v(x, y)? This is a very important problem for college admission offices, particularly when it comes to awarding an athletic scholarship, because the incoming student–athlete must satisfy certain conditions before receiving such an award.
Definition 4.1-1

Let X and Y be two random variables defined on a discrete space. Let S denote the corresponding two-dimensional space of X and Y, the two random variables of the discrete type. The probability that X = x and Y = y is denoted by f(x, y) = P(X = x, Y = y). The function f(x, y) is called the joint probability mass function (joint pmf) of X and Y and has the following properties:
126 Chapter 4 Bivariate Distributions
(a) 0 ≤ f(x, y) ≤ 1.

(b) ∑_{(x,y)∈S} f(x, y) = 1.

(c) P[(X, Y) ∈ A] = ∑_{(x,y)∈A} f(x, y), where A is a subset of the space S.
The following example will make this definition more meaningful.
Example 4.1-1
Roll a pair of fair dice. For each of the 36 sample points with probability 1/36, let X denote the smaller and Y the larger outcome on the dice. For example, if the outcome is (3, 2), then the observed values are X = 2, Y = 3. The event {X = 2, Y = 3} could occur in one of two ways—(3, 2) or (2, 3)—so its probability is

1/36 + 1/36 = 2/36.
If the outcome is (2, 2), then the observed values are X = 2, Y = 2. Since the event {X = 2, Y = 2} can occur in only one way, P(X = 2, Y = 2) = 1/36. The joint pmf of X and Y is given by the probabilities

f(x, y) = 1/36,  1 ≤ x = y ≤ 6,
f(x, y) = 2/36,  1 ≤ x < y ≤ 6,

when x and y are integers. Figure 4.1-1 depicts the probabilities of the various points of the space S.
[Figure 4.1-1: Discrete joint pmf. The 21 points of S are plotted in the (x, y) plane with probability 1/36 at each point with x = y and 2/36 at each point with x < y; the marginal totals 11/36, 9/36, 7/36, 5/36, 3/36, 1/36 are recorded in the margins.]
Section 4.1 Bivariate Distributions of the Discrete Type 127
Notice that certain numbers have been recorded in the bottom and left-hand margins of Figure 4.1-1. These numbers are the respective column and row totals of the probabilities. The column totals are the respective probabilities that X will assume the values in the x space SX = {1, 2, 3, 4, 5, 6}, and the row totals are the respective probabilities that Y will assume the values in the y space SY = {1, 2, 3, 4, 5, 6}. That is, the totals describe the probability mass functions of X and Y, respectively. Since each collection of these probabilities is frequently recorded in the margins and satisfies the properties of a pmf of one random variable, each is called a marginal pmf.
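As a quick check of these marginal totals, the joint pmf of Example 4.1-1 can be built by enumerating all 36 equally likely rolls. The following Python sketch is illustrative only and is not part of the text:

```python
from fractions import Fraction
from collections import defaultdict

# Joint pmf of X = smaller, Y = larger outcome of two fair dice (Example 4.1-1).
f = defaultdict(Fraction)
for d1 in range(1, 7):
    for d2 in range(1, 7):
        x, y = min(d1, d2), max(d1, d2)
        f[(x, y)] += Fraction(1, 36)

# Property (b): the joint probabilities sum to 1.
assert sum(f.values()) == 1

# Marginal pmfs: column and row totals of the joint pmf.
fX = {x: sum(p for (a, b), p in f.items() if a == x) for x in range(1, 7)}
fY = {y: sum(p for (a, b), p in f.items() if b == y) for y in range(1, 7)}

print(fX[1], fY[1], f[(1, 1)])  # 11/36 1/36 1/36
```

The printed values reproduce the marginal totals recorded in the margins of Figure 4.1-1.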
Definition 4.1-2

Let X and Y have the joint probability mass function f(x, y) with space S. The probability mass function of X alone, which is called the marginal probability mass function of X, is defined by
fX(x) = ∑_y f(x, y) = P(X = x),  x ∈ SX,

where the summation is taken over all possible y values for each given x in the x space SX. That is, the summation is over all (x, y) in S with a given x value. Similarly, the marginal probability mass function of Y is defined by

fY(y) = ∑_x f(x, y) = P(Y = y),  y ∈ SY,

where the summation is taken over all possible x values for each given y in the y space SY. The random variables X and Y are independent if and only if, for every x ∈ SX and every y ∈ SY,
P(X = x, Y = y) = P(X = x)P(Y = y)
or, equivalently,
f (x, y) = fX(x)fY(y);
otherwise, X and Y are said to be dependent.
We note in Example 4.1-1 that X and Y are dependent because there are many x and y values for which f(x, y) ≠ fX(x)fY(y). For instance,

fX(1)fY(1) = (11/36)(1/36) ≠ 1/36 = f(1, 1).
Example 4.1-2
Let the joint pmf of X and Y be defined by
f(x, y) = (x + y)/21,  x = 1, 2, 3, y = 1, 2.

Then

fX(x) = ∑_y f(x, y) = ∑_{y=1}^{2} (x + y)/21 = (x + 1)/21 + (x + 2)/21 = (2x + 3)/21,  x = 1, 2, 3,
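This marginal computation can be verified by direct enumeration. A small Python sketch, for illustration only:

```python
from fractions import Fraction

# Joint pmf f(x, y) = (x + y)/21 for x = 1, 2, 3 and y = 1, 2 (Example 4.1-2).
def f(x, y):
    return Fraction(x + y, 21)

# Marginal pmf of X: sum out y for each fixed x.
fX = {x: f(x, 1) + f(x, 2) for x in (1, 2, 3)}

# Matches the closed form (2x + 3)/21, and the marginal probabilities sum to 1.
assert all(fX[x] == Fraction(2 * x + 3, 21) for x in (1, 2, 3))
assert sum(fX.values()) == 1
```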
48 Probability and Statistics for Computer Scientists
[FIGURE 3.3: Expectation as a center of gravity. Panel (a) shows a distribution with E(X) = 0.5; panel (b) shows one with E(X) = 0.25.]
Similar arguments can be used to derive the general formula for the expectation.
Expectation, discrete case:

µ = E(X) = ∑_x x P(x)   (3.3)

This formula returns the center of gravity for a system with masses P(x) allocated at points x. Expected value is often denoted by a Greek letter µ.
In a certain sense, expectation is the best forecast of X. The variable itself is random. It takes different values with different probabilities P(x). At the same time, it has just one expectation E(X), which is non-random.
3.3.2 Expectation of a function
Often we are interested in another variable, Y, that is a function of X. For example, downloading time depends on the connection speed, the profit of a computer store depends on the number of computers sold, and the bonus of its manager depends on this profit. The expectation of Y = g(X) is computed by a similar formula,

E{g(X)} = ∑_x g(x) P(x).   (3.4)
Remark: Indeed, if g is a one-to-one function, then Y takes each value y = g(x) with probability P(x), and the formula for E(Y) can be applied directly. If g is not one-to-one, then some values of g(x) will be repeated in (3.4). However, they are still multiplied by the corresponding probabilities. When we add in (3.4), these probabilities are also added; thus, each value of g(x) is still multiplied by the probability PY(g(x)).
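Formulas (3.3) and (3.4) translate directly into code. The pmf below is a hypothetical example chosen only to illustrate, not one from the text:

```python
from fractions import Fraction

# A hypothetical pmf P(x), used only to illustrate formulas (3.3) and (3.4).
P = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}

def expect(g, P):
    # E{g(X)} = sum of g(x) P(x) over all x, formula (3.4)
    return sum(g(x) * p for x, p in P.items())

mu = expect(lambda x: x, P)         # formula (3.3) is the special case g(x) = x
second = expect(lambda x: x * x, P)  # E(X^2)
print(mu, second)  # 3/4 5/4
```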
3.3.3 Properties
The following linear properties of expectations follow directly from (3.3) and (3.4). For any random variables X and Y and any non-random numbers a, b, and c, we have
Discrete Random Variables and Their Distributions 49
Properties of expectations:

E(aX + bY + c) = aE(X) + bE(Y) + c

In particular,
E(X + Y) = E(X) + E(Y)
E(aX) = aE(X)
E(c) = c

For independent X and Y,
E(XY) = E(X)E(Y)   (3.5)
Proof: The first property follows from the Addition Rule (3.2). For any X and Y,

E(aX + bY + c) = ∑_x ∑_y (ax + by + c) P(X,Y)(x, y)
= ∑_x ax ∑_y P(X,Y)(x, y) + ∑_y by ∑_x P(X,Y)(x, y) + c ∑_x ∑_y P(X,Y)(x, y)
= a ∑_x x PX(x) + b ∑_y y PY(y) + c.
The next three equalities are special cases. To prove the last property, we recall that P(X,Y)(x, y) = PX(x)PY(y) for independent X and Y, and therefore,

E(XY) = ∑_x ∑_y (xy) PX(x) PY(y) = ∑_x x PX(x) ∑_y y PY(y) = E(X)E(Y).
Remark: The last property in (3.5) holds for some dependent variables too; hence, it cannot be used to verify independence of X and Y.
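The last property can be checked by enumerating a small joint distribution. The two pmfs below are hypothetical, chosen only for illustration:

```python
from fractions import Fraction

# Two independent variables (hypothetical pmfs): the joint pmf factors as PX * PY.
PX = {0: Fraction(1, 2), 1: Fraction(1, 2)}
PY = {1: Fraction(1, 3), 2: Fraction(2, 3)}

EX = sum(x * p for x, p in PX.items())
EY = sum(y * p for y, p in PY.items())
# E(XY) summed over the product-form joint pmf.
EXY = sum(x * y * px * py for x, px in PX.items() for y, py in PY.items())

assert EXY == EX * EY  # E(XY) = E(X)E(Y) under independence
```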
Example 3.9. In Example 3.6 on p. 46,
E(X) = (0)(0.5) + (1)(0.5) = 0.5 and
E(Y ) = (0)(0.4) + (1)(0.3) + (2)(0.15) + (3)(0.15) = 1.05,
therefore, the expected total number of errors is
E(X + Y ) = 0.5 + 1.05 = 1.65.
♦
Remark: Clearly, the program will never have 1.65 errors, because the number of errors is always an integer. Then, should we round 1.65 to 2 errors? Absolutely not; that would be a mistake. Although both X and Y are integers, their expectations, or average values, do not have to be integers at all.
3.3.4 Variance and standard deviation
Expectation shows where the average value of a random variable is located, or where the variable is expected to be, plus or minus some error. How large could this "error" be, and how much can a variable vary around its expectation? Let us introduce some measures of variability.
Example 3.10. Here is a rather artificial but illustrative scenario. Consider two users. One receives either 48 or 52 e-mail messages per day, with a 50-50% chance of each. The other receives either 0 or 100 e-mails, also with a 50-50% chance. What is a common feature of these two distributions, and how are they different?
We see that both users receive the same average number of e-mails:
E(X) = E(Y ) = 50.
However, in the first case, the actual number of e-mails is always close to 50, whereas it always differs from it by 50 in the second case. The first random variable, X, is more stable; it has low variability. The second variable, Y, has high variability. ♦
This example shows that variability of a random variable is measured by its distance from the mean µ = E(X). In its turn, this distance is random too, and therefore cannot serve as a characteristic of a distribution. It remains to square it and take the expectation of the result.
DEFINITION 3.6
Variance of a random variable is defined as the expected squared deviation from the mean. For discrete random variables, variance is

σ² = Var(X) = E(X − EX)² = ∑_x (x − µ)² P(x)
Remark: Notice that if the distance to the mean is not squared, then the result is always µ − µ = 0, bearing no information about the distribution of X.

According to this definition, variance is always non-negative. Further, it equals 0 only if x = µ for all values of x, i.e., when X is constantly equal to µ. Certainly, a constant (non-random) variable has zero variability.
Variance can also be computed as
Var(X) = E(X²) − µ².   (3.6)
A proof of this is left as Exercise 3.38a.
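Both the definition of variance and the shortcut (3.6) are easy to check numerically on the two e-mail distributions of Example 3.10. A Python sketch, for illustration:

```python
from fractions import Fraction

# Example 3.10: X is 48 or 52 (prob 1/2 each); Y is 0 or 100 (prob 1/2 each).
PX = {48: Fraction(1, 2), 52: Fraction(1, 2)}
PY = {0: Fraction(1, 2), 100: Fraction(1, 2)}

def var(P):
    mu = sum(x * p for x, p in P.items())
    # Definition 3.6: E(X - mu)^2
    v_def = sum((x - mu) ** 2 * p for x, p in P.items())
    # Shortcut (3.6): E(X^2) - mu^2
    v_short = sum(x * x * p for x, p in P.items()) - mu ** 2
    assert v_def == v_short
    return v_def

print(var(PX), var(PY))  # 4 2500
```

The stable variable X has variance 4; the volatile variable Y has variance 2500.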
DEFINITION 3.7
Standard deviation is the square root of variance,

σ = Std(X) = √Var(X)

Continuing the Greek-letter tradition, variance is often denoted by σ². Then, standard deviation is σ.
If X is measured in some units, then its mean µ has the same measurement unit as X. Variance σ² is measured in squared units, and therefore it cannot be compared with X or µ. No matter how funny it sounds, it is rather normal to measure variance of profit in squared dollars, variance of class enrollment in squared students, and variance of available disk space in squared gigabytes. When a square root is taken, the resulting standard deviation σ is again measured in the same units as X. This is the main reason for introducing yet another measure of variability, σ.
3.3.5 Covariance and correlation
Expectation, variance, and standard deviation characterize the distribution of a single random variable. Now we introduce measures of association of two random variables.
[FIGURE 3.4: Positive, negative, and zero covariance. Three scatterplots of Y against X: (a) Cov(X,Y) > 0, (b) Cov(X,Y) < 0, (c) Cov(X,Y) = 0.]
DEFINITION 3.8
Covariance σXY = Cov(X, Y) is defined as

Cov(X, Y) = E{(X − EX)(Y − EY)} = E(XY) − E(X)E(Y)

It summarizes the interrelation of two random variables.
Covariance is the expected product of deviations of X and Y from their respective expectations. If Cov(X, Y) > 0, then positive deviations (X − EX) are more likely to be multiplied by positive (Y − EY), and negative (X − EX) are more likely to be multiplied by negative (Y − EY). In short, large X imply large Y, and small X imply small Y. These variables are positively correlated, Figure 3.4a.

Conversely, Cov(X, Y) < 0 means that large X generally correspond to small Y and small X correspond to large Y. These variables are negatively correlated, Figure 3.4b.

If Cov(X, Y) = 0, we say that X and Y are uncorrelated, Figure 3.4c.
DEFINITION 3.9
Correlation coefficient between variables X and Y is defined as

ρ = Cov(X, Y) / ((Std X)(Std Y))
Correlation coefficient is a rescaled, normalized covariance. Notice that covariance Cov(X, Y) has a measurement unit: it is measured in units of X multiplied by units of Y. As a result, it is not clear from its value whether X and Y are strongly or weakly correlated. Indeed, one has to compare Cov(X, Y) with the magnitude of X and Y. The correlation coefficient performs such a comparison, and as a result, it is dimensionless.
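Definitions 3.8 and 3.9 can be computed by direct summation over a joint pmf. The joint distribution below is hypothetical, chosen only for illustration:

```python
# A hypothetical joint pmf of (X, Y); values chosen only for illustration.
P = {(0, 0): 0.4, (1, 1): 0.4, (0, 1): 0.1, (1, 0): 0.1}

EX = sum(x * p for (x, y), p in P.items())
EY = sum(y * p for (x, y), p in P.items())
EXY = sum(x * y * p for (x, y), p in P.items())

cov = EXY - EX * EY                          # Definition 3.8, shortcut form
VarX = sum(x ** 2 * p for (x, y), p in P.items()) - EX ** 2
VarY = sum(y ** 2 * p for (x, y), p in P.items()) - EY ** 2
rho = cov / (VarX ** 0.5 * VarY ** 0.5)      # Definition 3.9, dimensionless

print(round(cov, 3), round(rho, 3))  # 0.15 0.6
```

Here large X tends to go with large Y, so both the covariance and the correlation coefficient come out positive.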
[FIGURE 3.5: Perfect correlation: ρ = ±1. Two scatterplots of Y against X in which all points lie on a straight line, one with ρ = 1 and one with ρ = −1.]
How do we interpret the value of ρ? What possible values can it take?
As a special case of the famous Cauchy-Schwarz inequality,

−1 ≤ ρ ≤ 1,

where |ρ| = 1 is possible only when all values of X and Y lie on a straight line, as in Figure 3.5. Further, values of ρ near 1 indicate strong positive correlation, values near (−1) show strong negative correlation, and values near 0 show weak correlation or no correlation.
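The boundary cases ρ = ±1 for exact linear relationships can be illustrated numerically. The data below are hypothetical, used only as a sketch:

```python
from statistics import mean, pstdev

xs = [1, 2, 3, 4, 5]            # hypothetical data
ys = [2 * x + 1 for x in xs]    # all points on a line with positive slope
zs = [-3 * x + 7 for x in xs]   # all points on a line with negative slope

def rho(a, b):
    # sample analogue of Definition 3.9: Cov(a, b) / ((Std a)(Std b))
    ma, mb = mean(a), mean(b)
    cov = mean([(ai - ma) * (bi - mb) for ai, bi in zip(a, b)])
    return cov / (pstdev(a) * pstdev(b))
```

Since ys and zs are exact linear functions of xs, rho(xs, ys) equals 1 and rho(xs, zs) equals −1, up to floating-point rounding.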
3.3.6 Properties
The following properties of variances, covariances, and correlation coefficients hold for any random variables X, Y, Z, and W and any non-random numbers a, b, c, and d.
Properties of variances and covariances
Var(aX + bY + c) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)

Cov(aX + bY, cZ + dW) = ac Cov(X, Z) + ad Cov(X, W) + bc Cov(Y, Z) + bd Cov(Y, W)

Cov(X, Y) = Cov(Y, X)

ρ(X, Y) = ρ(Y, X)

In particular,
Var(aX + b) = a² Var(X)
Cov(aX + b, cY + d) = ac Cov(X, Y)
ρ(aX + b, cY + d) = ρ(X, Y)

For independent X and Y,
Cov(X, Y) = 0
Var(X + Y) = Var(X) + Var(Y)   (3.7)
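The variance properties in (3.7) can be verified by enumerating small pmfs. The distributions below are hypothetical, chosen only for illustration:

```python
from fractions import Fraction

PX = {0: Fraction(1, 2), 2: Fraction(1, 2)}   # hypothetical pmf of X
PY = {1: Fraction(1, 4), 5: Fraction(3, 4)}   # hypothetical pmf of Y, independent of X

def var(P):
    mu = sum(x * p for x, p in P.items())
    return sum((x - mu) ** 2 * p for x, p in P.items())

# Var(aX + b) = a^2 Var(X): transform the pmf and compare.
a, b = 3, 7
PaXb = {a * x + b: p for x, p in PX.items()}
assert var(PaXb) == a ** 2 * var(PX)

# For independent X and Y, Var(X + Y) = Var(X) + Var(Y):
PS = {}
for x, px in PX.items():
    for y, py in PY.items():
        PS[x + y] = PS.get(x + y, Fraction(0)) + px * py
assert var(PS) == var(PX) + var(PY)
```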
200 Chapter 5 Distributions of Functions of Random Variables
5.6 THE CENTRAL LIMIT THEOREM

In Section 5.4, we found that the mean X̄ of a random sample of size n from a distribution with mean µ and variance σ² > 0 is a random variable with the properties that

E(X̄) = µ and Var(X̄) = σ²/n.

As n increases, the variance of X̄ decreases. Consequently, the distribution of X̄ clearly depends on n, and we see that we are dealing with sequences of distributions.

In Theorem 5.5-1, we considered the pdf of X̄ when sampling is from the normal distribution N(µ, σ²). We showed that the distribution of X̄ is N(µ, σ²/n), and in Figure 5.5-1, by graphing the pdfs for several values of n, we illustrated the property that as n increases, the probability becomes concentrated in a small interval centered at µ. That is, as n increases, X̄ tends to converge to µ, or (X̄ − µ) tends to converge to 0 in a probability sense. (See Section 5.8.)
In general, if we let

W = (√n/σ)(X̄ − µ) = (X̄ − µ)/(σ/√n) = (Y − nµ)/(√n σ),
where Y is the sum of a random sample of size n from some distribution with mean µ and variance σ², then, for each positive integer n,

E(W) = E[(X̄ − µ)/(σ/√n)] = (E(X̄) − µ)/(σ/√n) = (µ − µ)/(σ/√n) = 0
and

Var(W) = E(W²) = E[(X̄ − µ)²/(σ²/n)] = E[(X̄ − µ)²]/(σ²/n) = (σ²/n)/(σ²/n) = 1.
Thus, while X̄ − µ tends to "degenerate" to zero, the factor √n/σ in √n(X̄ − µ)/σ "spreads out" the probability enough to prevent this degeneration. What, then, is the distribution of W as n increases? One observation that might shed some light on the answer to this question can be made immediately. If the sample arises from a normal distribution, then, from Theorem 5.5-1, we know that X̄ is N(µ, σ²/n), and hence W is N(0, 1) for each positive n. Thus, in the limit, the distribution of W must be N(0, 1). So if the solution of the question does not depend on the underlying distribution (i.e., it is unique), the answer must be N(0, 1). As we will see, that is exactly the case, and this result is so important that it is called the central limit theorem, the proof of which is given in Section 5.9.
Theorem 5.6-1

(Central Limit Theorem) If X̄ is the mean of a random sample X1, X2, . . . , Xn of size n from a distribution with a finite mean µ and a finite positive variance σ², then the distribution of

W = (X̄ − µ)/(σ/√n) = (∑_{i=1}^{n} Xi − nµ)/(√n σ)

is N(0, 1) in the limit as n → ∞.
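The theorem can be illustrated by simulation. The sketch below standardizes sample means of exponential observations; the choice of distribution, parameters, and seed are illustrative assumptions, not from the text:

```python
import random
from statistics import mean, pstdev

random.seed(1)  # reproducible illustration

# Sample means of n exponential observations (for which mu = theta, sigma = theta),
# standardized as W = (xbar - mu) / (sigma / sqrt(n)).
theta, n, reps = 2.0, 50, 2000
ws = []
for _ in range(reps):
    xbar = mean(random.expovariate(1 / theta) for _ in range(n))
    ws.append((xbar - theta) / (theta / n ** 0.5))

# By the central limit theorem, W is approximately N(0, 1):
# the simulated mean should be near 0 and the standard deviation near 1.
print(round(mean(ws), 2), round(pstdev(ws), 2))
```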
Section 5.8 Chebyshev’s Inequality and Convergence in Probability 213
g(u) =

6(324u⁵/5),  0 < u < 1/6,

6(1/20 − 3u/2 + 18u² − 108u³ + 324u⁴ − 324u⁵),  1/6 ≤ u < 2/6,

6(−79/20 + 117u/2 − 342u² + 972u³ − 1296u⁴ + 648u⁵),  2/6 ≤ u < 3/6,

6(731/20 − 693u/2 + 1278u² − 2268u³ + 1944u⁴ − 648u⁵),  3/6 ≤ u < 4/6,

6(−1829/20 + 1227u/2 − 1602u² + 2052u³ − 1296u⁴ + 324u⁵),  4/6 ≤ u < 5/6,

6(324/5 − 324u + 648u² − 648u³ + 324u⁴ − 324u⁵/5),  5/6 ≤ u < 1.
We can also calculate

∫_{1/6}^{2/6} g(u) du = 19/240 = 0.0792

and

∫_{11/18}^{1} g(u) du = 5818/32805 = 0.17735.

Although these integrations are not difficult, they are tedious to do by hand.
5.8 CHEBYSHEV'S INEQUALITY AND CONVERGENCE IN PROBABILITY

In this section, we use Chebyshev's inequality to show, in another sense, that the sample mean, X̄, is a good statistic to use to estimate a population mean µ; the relative frequency of success in n independent Bernoulli trials, Y/n, is a good statistic for estimating p. We examine the effect of the sample size n on these estimates.

We begin by showing that Chebyshev's inequality gives added significance to the standard deviation in terms of bounding certain probabilities. The inequality is valid for all distributions for which the standard deviation exists. The proof is given for the discrete case, but it holds for the continuous case, with integrals replacing summations.
Theorem 5.8-1

(Chebyshev's Inequality) If the random variable X has a mean µ and variance σ², then, for every k ≥ 1,

P(|X − µ| ≥ kσ) ≤ 1/k².

Proof Let f(x) denote the pmf of X. Then
σ² = E[(X − µ)²] = ∑_{x∈S} (x − µ)² f(x)
= ∑_{x∈A} (x − µ)² f(x) + ∑_{x∈A′} (x − µ)² f(x),   (5.8-1)

where

A = {x : |x − µ| ≥ kσ}.

The second term in the right-hand member of Equation 5.8-1 is the sum of non-negative numbers and thus is greater than or equal to zero. Hence,

σ² ≥ ∑_{x∈A} (x − µ)² f(x).

However, in A, |x − µ| ≥ kσ; so

σ² ≥ ∑_{x∈A} (kσ)² f(x) = k²σ² ∑_{x∈A} f(x).

But the latter summation equals P(X ∈ A); thus,

σ² ≥ k²σ² P(X ∈ A) = k²σ² P(|X − µ| ≥ kσ).

That is,

P(|X − µ| ≥ kσ) ≤ 1/k².
Corollary 5.8-1

If ε = kσ, then

P(|X − µ| ≥ ε) ≤ σ²/ε².
In words, Chebyshev's inequality states that the probability that X differs from its mean by at least k standard deviations is less than or equal to 1/k². It follows that the probability that X differs from its mean by less than k standard deviations is at least 1 − 1/k². That is,

P(|X − µ| < kσ) ≥ 1 − 1/k².

From the corollary, it also follows that

P(|X − µ| < ε) ≥ 1 − σ²/ε².

Thus, Chebyshev's inequality can be used as a bound for certain probabilities. However, in many instances, the bound is not very close to the true probability.
Example 5.8-1

If it is known that X has a mean of 25 and a variance of 16, then, since σ = 4, a lower bound for P(17 < X < 33) is given by

P(17 < X < 33) = P(|X − 25| < 8) ≥ 1 − 16/64 = 3/4.
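The arithmetic of this example, in the corollary's form with ε = 8, can be sketched as:

```python
# Example 5.8-1 numbers: mu = 25, sigma^2 = 16, interval (17, 33).
mu, variance = 25, 16
eps = 33 - mu                 # the interval is |X - 25| < 8
k = eps / variance ** 0.5     # epsilon = k * sigma gives k = 2

bound = 1 - 1 / k ** 2                     # P(|X - mu| < k*sigma) >= 1 - 1/k^2
assert bound == 1 - variance / eps ** 2    # corollary form 1 - sigma^2/eps^2 agrees
print(bound)  # 0.75
```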
Section 6.2 Exploratory Data Analysis 241
Table 6.2-5 Order statistics of 50 exam scores
34 38 42 42 45 47 51 52 54 57
58 58 59 60 61 63 65 65 66 67
68 69 69 70 71 71 72 73 73 74
75 75 76 76 77 79 81 81 82 83
83 84 84 85 87 90 91 93 93 97
From either these order statistics or the corresponding ordered stem-and-leaf display, it is rather easy to find the sample percentiles. If 0 < p < 1, then the (100p)th sample percentile has approximately np sample observations less than it and also n(1 − p) sample observations greater than it. One way of achieving this is to take the (100p)th sample percentile as the (n + 1)pth order statistic, provided that (n + 1)p is an integer. If (n + 1)p is not an integer but is equal to r plus some proper fraction—say, a/b—use a weighted average of the rth and the (r + 1)st order statistics. That is, define the (100p)th sample percentile as
πp = yr + (a/b)(yr+1 − yr) = (1 − a/b)yr + (a/b)yr+1.
Note that this formula is simply a linear interpolation between yr and yr+1. [If p < 1/(n + 1) or p > n/(n + 1), that sample percentile is not defined.]
As an illustration, consider the 50 ordered test scores. With p = 1/2, we find the 50th percentile by averaging the 25th and 26th order statistics, since (n + 1)p = (51)(1/2) = 25.5. Thus, the 50th percentile is
π0.50 = (1/2)y25 + (1/2)y26 = (71 + 71)/2 = 71.
With p = 1/4, we have (n + 1)p = (51)(1/4) = 12.75, and the 25th sample percentile is then
π0.25 = (1 − 0.75)y12 + (0.75)y13 = (0.25)(58) + (0.75)(59) = 58.75.
With p = 3/4, so that (n + 1)p = (51)(3/4) = 38.25, the 75th sample percentile is
π0.75 = (1 − 0.25)y38 + (0.25)y39 = (0.75)(81) + (0.25)(82) = 81.25.
Note that approximately 50%, 25%, and 75% of the sample observations are less than 71, 58.75, and 81.25, respectively.
Special names are given to certain percentiles. The 50th percentile is the median of the sample. The 25th, 50th, and 75th percentiles are, respectively, the first, second, and third quartiles of the sample. For notation, we let q1 = π0.25, q2 = m = π0.50, and q3 = π0.75. The 10th, 20th, . . . , and 90th percentiles are the deciles of the sample, so note that the 50th percentile is also the median, the second quartile, and the fifth decile. With the set of 50 test scores, since (51)(2/10) = 10.2 and (51)(9/10) = 45.9, the second and ninth deciles are, respectively,
π0.20 = (0.8)y10 + (0.2)y11 = (0.8)(57) + (0.2)(58) = 57.2
and
π0.90 = (0.1)y45 + (0.9)y46 = (0.1)(87) + (0.9)(90) = 89.7.
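The (n + 1)p rule above is easy to mechanize; the sketch below (the function name `sample_percentile` is ours, not the text's) reproduces the quartiles and deciles computed for the 50 exam scores.

```python
def sample_percentile(ordered, p):
    """(100p)th sample percentile: the (n+1)p-th order statistic, with linear
    interpolation between the r-th and (r+1)st when (n+1)p = r + a/b."""
    n = len(ordered)
    if p < 1 / (n + 1) or p > n / (n + 1):
        raise ValueError("sample percentile not defined for this p")
    rp = (n + 1) * p
    r = int(rp)
    frac = rp - r
    if frac == 0:
        return ordered[r - 1]          # order statistics are 1-indexed
    return (1 - frac) * ordered[r - 1] + frac * ordered[r]

scores = [34, 38, 42, 42, 45, 47, 51, 52, 54, 57,
          58, 58, 59, 60, 61, 63, 65, 65, 66, 67,
          68, 69, 69, 70, 71, 71, 72, 73, 73, 74,
          75, 75, 76, 76, 77, 79, 81, 81, 82, 83,
          83, 84, 84, 85, 87, 90, 91, 93, 93, 97]

median = sample_percentile(scores, 0.50)   # 71
q1 = sample_percentile(scores, 0.25)       # 58.75
q3 = sample_percentile(scores, 0.75)       # 81.25
```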
Regression 365
[FIGURE 11.3: Least squares estimation of the regression line. Panel (a) shows the observed pairs (x1, y1), . . . , (x5, y5) plotted as Response Y against Predictor X; panel (b) shows the estimated regression line through them.]
Function G is usually sought in a suitable form: linear, quadratic, logarithmic, etc. The simplest form is linear.
11.1.3 Linear regression
The linear regression model assumes that the conditional expectation

    G(x) = E{Y | X = x} = β0 + β1 x

is a linear function of x. As any linear function, it has an intercept β0 and a slope β1.
The intercept

    β0 = G(0)

equals the value of the regression function for x = 0. Sometimes it has no physical meaning. For example, nobody will try to predict the value of a computer with 0 random access memory (RAM), and nobody will consider the Federal Reserve rate in year 0. In other cases, the intercept is quite important. For example, according to Ohm's Law (V = RI), the voltage across an ideal conductor is proportional to the current. A nonzero intercept (V = V0 + RI) would show that the circuit is not ideal and that there is an external loss of voltage.
The slope

    β1 = G(x + 1) − G(x)

is the predicted change in the response variable when the predictor changes by 1. This is a very important parameter that shows how fast we can change the expected response by varying the predictor. For example, customer satisfaction will increase by β1(Δx) when the quality of produced computers increases by Δx.
A zero slope means absence of a linear relationship between X and Y. In this case, Y is expected to stay constant when X changes.
Regression estimates:

    b0 = β̂0 = ȳ − b1 x̄,
    b1 = β̂1 = Sxy/Sxx,

where

    Sxx = Σ_{i=1}^{n} (xi − x̄)²,   Sxy = Σ_{i=1}^{n} (xi − x̄)(yi − ȳ).    (11.6)
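The estimates in (11.6) translate directly into code. This is a minimal sketch (the function name and the toy data are ours), checked on points that lie exactly on a line so the slope and intercept are recovered exactly.

```python
def least_squares(x, y):
    """Least squares estimates: b1 = Sxy/Sxx and b0 = ybar - b1*xbar, as in (11.6)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Made-up data on the exact line y = 2 + 3x; the fit must recover (2, 3).
b0, b1 = least_squares([0, 1, 2, 3, 4], [2, 5, 8, 11, 14])
```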
Example 11.3 (World population). In Example 11.1, xi is the year, and yi is the world population during that year. To estimate the regression line in Figure 11.1, we compute

    x̄ = 1980;  ȳ = 4558.1;
    Sxx = (1950 − x̄)² + . . . + (2010 − x̄)² = 4550;
    Sxy = (1950 − x̄)(2558 − ȳ) + . . . + (2010 − x̄)(6864 − ȳ) = 337250.
Then

    b1 = Sxy/Sxx = 74.1,
    b0 = ȳ − b1 x̄ = −142201.
The estimated regression line is

    G(x) = b0 + b1 x = −142201 + 74.1x.

We conclude that the world population grows at the average rate of 74.1 million every year.
We can use the obtained equation to predict the future growth of the world population. Regression predictions for years 2015 and 2020 are

    G(2015) = b0 + 2015 b1 = 7152 million people,
    G(2020) = b0 + 2020 b1 = 7523 million people.    ♦
11.1.4 Regression and correlation

Recall from Section 3.3.5 that the covariance

    Cov(X, Y) = E[(X − E(X))(Y − E(Y))]

and the correlation coefficient

    ρ = Cov(X, Y) / ((Std X)(Std Y))
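The sample versions of these quantities can be sketched as follows (the helper names are ours); perfectly linear made-up data gives correlation exactly 1.

```python
import math

def sample_cov(x, y):
    """Sample covariance: sum of (xi - xbar)(yi - ybar), divided by n - 1."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / (n - 1)

def sample_corr(x, y):
    """Sample correlation: covariance over the product of standard deviations."""
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))

r = sample_corr([1, 2, 3, 4], [10, 20, 30, 40])   # exactly linear data: r = 1
```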
302 Chapter 7 Interval Estimation
Thus, since the probability of the first of these is 1 − α, the probability of the last must also be 1 − α, because the latter is true if and only if the former is true. That is, we have

    P[ X̄ − zα/2(σ/√n) ≤ µ ≤ X̄ + zα/2(σ/√n) ] = 1 − α.

So the probability that the random interval

    [ X̄ − zα/2(σ/√n), X̄ + zα/2(σ/√n) ]
includes the unknown mean µ is 1 − α. Once the sample is observed and the sample mean computed to equal x̄, the interval [ x̄ − zα/2(σ/√n), x̄ + zα/2(σ/√n) ] becomes known. Since the probability that the random interval covers µ before the sample is drawn is equal to 1 − α, we now call the computed interval, x̄ ± zα/2(σ/√n) (for brevity), a 100(1 − α)% confidence interval for the unknown mean µ. For example, x̄ ± 1.96(σ/√n) is a 95% confidence interval for µ. The number 100(1 − α)%, or equivalently, 1 − α, is called the confidence coefficient.
We see that the confidence interval for µ is centered at the point estimate x̄ and is completed by subtracting and adding the quantity zα/2(σ/√n). Note that as n increases, zα/2(σ/√n) decreases, resulting in a shorter confidence interval with the same confidence coefficient 1 − α. A shorter confidence interval gives a more precise estimate of µ, regardless of the confidence we have in the estimate of µ. Statisticians who are not restricted by time, money, effort, or the availability of observations can obviously make the confidence interval as short as they like by increasing the sample size n. For a fixed sample size n, the length of the confidence interval can also be shortened by decreasing the confidence coefficient 1 − α. But if this is done, we achieve a shorter confidence interval at the expense of losing some confidence.
Example 7.1-1. Let X equal the length of life of a 60-watt light bulb marketed by a certain manufacturer. Assume that the distribution of X is N(µ, 1296). If a random sample of n = 27 bulbs is tested until they burn out, yielding a sample mean of x̄ = 1478 hours, then a 95% confidence interval for µ is

    [ x̄ − z0.025(σ/√n), x̄ + z0.025(σ/√n) ] = [ 1478 − 1.96(36/√27), 1478 + 1.96(36/√27) ]
                                             = [ 1478 − 13.58, 1478 + 13.58 ]
                                             = [ 1464.42, 1491.58 ].
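Example 7.1-1 is a one-liner to verify; the helper name is ours, and z0.025 = 1.96 is taken from the text.

```python
import math

def z_interval(xbar, sigma, n, z):
    """Known-sigma confidence interval: xbar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# Example 7.1-1: n = 27, xbar = 1478, sigma = 36, z_0.025 = 1.96.
lo, hi = z_interval(1478, 36, 27, 1.96)    # about (1464.42, 1491.58)
```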
The next example will help to give a better intuitive feeling for the interpretation of a confidence interval.
Example 7.1-2. Let x̄ be the observed sample mean of five observations of a random sample from the normal distribution N(µ, 16). A 90% confidence interval for the unknown mean µ is

    [ x̄ − 1.645√(16/5), x̄ + 1.645√(16/5) ].
Section 7.1 Confidence Intervals for Means 305
1 − α = P[ −tα/2(n−1) ≤ (X̄ − µ)/(S/√n) ≤ tα/2(n−1) ]

      = P[ −tα/2(n−1)(S/√n) ≤ X̄ − µ ≤ tα/2(n−1)(S/√n) ]

      = P[ −X̄ − tα/2(n−1)(S/√n) ≤ −µ ≤ −X̄ + tα/2(n−1)(S/√n) ]

      = P[ X̄ − tα/2(n−1)(S/√n) ≤ µ ≤ X̄ + tα/2(n−1)(S/√n) ].
Thus, the observations of a random sample provide x̄ and s², and

    [ x̄ − tα/2(n−1)(s/√n), x̄ + tα/2(n−1)(s/√n) ]

is a 100(1 − α)% confidence interval for µ.
Example 7.1-5. Let X equal the amount of butterfat in pounds produced by a typical cow during a 305-day milk production period between her first and second calves. Assume that the distribution of X is N(µ, σ²). To estimate µ, a farmer measured the butterfat production for n = 20 cows and obtained the following data:
481 537 513 583 453 510 570 500 457 555
618 327 350 643 499 421 505 637 599 392
For these data, x̄ = 507.50 and s = 89.75. Thus, a point estimate of µ is x̄ = 507.50. Since t0.05(19) = 1.729, a 90% confidence interval for µ is

    507.50 ± 1.729(89.75/√20),

or 507.50 ± 34.70, or equivalently, [472.80, 542.20].
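Example 7.1-5 can be reproduced from the raw data with the standard library; the critical value t0.05(19) = 1.729 is taken from the text's t table rather than computed.

```python
import math
import statistics

# Butterfat data for the n = 20 cows of Example 7.1-5.
data = [481, 537, 513, 583, 453, 510, 570, 500, 457, 555,
        618, 327, 350, 643, 499, 421, 505, 637, 599, 392]

n = len(data)
xbar = statistics.mean(data)      # 507.50
s = statistics.stdev(data)        # about 89.75
t = 1.729                         # t_0.05(19), from the t table

margin = t * s / math.sqrt(n)     # about 34.70
interval = (xbar - margin, xbar + margin)   # about (472.80, 542.20)
```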
Let T have a t distribution with n − 1 degrees of freedom. Then tα/2(n−1) > zα/2. Consequently, we would expect the interval x̄ ± zα/2(σ/√n) to be shorter than the interval x̄ ± tα/2(n−1)(s/√n). After all, we have more information, namely, the value of σ, in constructing the first interval. However, the length of the second interval is very much dependent on the value of s. If the observed s is smaller than σ, a shorter confidence interval could result by the second procedure. But on the average, x̄ ± zα/2(σ/√n) is the shorter of the two confidence intervals (Exercise 7.1-14).
Example 7.1-6. In Example 7.1-2, 50 confidence intervals were simulated for the mean of a normal distribution, assuming that the variance was known. For those same data, since t0.05(4) = 2.132, x̄ ± 2.132(s/√5) was used to calculate a 90% confidence interval for µ. For those particular 50 intervals, 46 contained the mean µ = 50. These 50 intervals are depicted in Figure 7.1-1(b). Note the different lengths of the intervals. Some are longer and some are shorter than the corresponding z intervals. The average length of the 50 t intervals is 7.137, which is quite close to the expected length of such an interval: 7.169. (See Exercise 7.1-14.) The length of the intervals that use z and σ = 4 is 5.885.
310 Chapter 7 Interval Estimation
has a t distribution with n + m − 2 degrees of freedom. That is,

    T = { [X̄ − Ȳ − (µX − µY)] / √(σ²/n + σ²/m) } / √{ [(n − 1)S²X/σ² + (m − 1)S²Y/σ²] / (n + m − 2) }

      = [X̄ − Ȳ − (µX − µY)] / √{ [((n − 1)S²X + (m − 1)S²Y)/(n + m − 2)] [1/n + 1/m] }

has a t distribution with r = n + m − 2 degrees of freedom. Thus, with t0 = tα/2(n+m−2), we have
P(−t0 ≤ T ≤ t0) = 1 − α.
Solving the inequalities for µX − µY yields

    P( X̄ − Ȳ − t0 SP √(1/n + 1/m) ≤ µX − µY ≤ X̄ − Ȳ + t0 SP √(1/n + 1/m) ) = 1 − α,

where the pooled estimator of the common standard deviation is

    SP = √{ [(n − 1)S²X + (m − 1)S²Y] / (n + m − 2) }.
If x̄, ȳ, and sp are the observed values of X̄, Ȳ, and SP, then

    [ x̄ − ȳ − t0 sp √(1/n + 1/m), x̄ − ȳ + t0 sp √(1/n + 1/m) ]

is a 100(1 − α)% confidence interval for µX − µY.
Example 7.2-2. Suppose that scores on a standardized test in mathematics taken by students from large and small high schools are N(µX, σ²) and N(µY, σ²), respectively, where σ² is unknown. If a random sample of n = 9 students from large high schools yielded x̄ = 81.31, s²x = 60.76, and a random sample of m = 15 students from small high schools yielded ȳ = 78.61, s²y = 48.24, then the endpoints for a 95% confidence interval for µX − µY are given by

    81.31 − 78.61 ± 2.074 √{ [8(60.76) + 14(48.24)] / 22 } √(1/9 + 1/15),

because t0.025(22) = 2.074. The 95% confidence interval is [−3.65, 9.05].
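A sketch of the pooled-variance interval, plugging in the numbers of Example 7.2-2 (the helper name is ours; t0.025(22) = 2.074 is the text's table value):

```python
import math

def pooled_t_interval(xbar, ybar, sx2, sy2, n, m, t):
    """CI for mu_X - mu_Y under the equal-variance normal model."""
    sp = math.sqrt(((n - 1) * sx2 + (m - 1) * sy2) / (n + m - 2))
    margin = t * sp * math.sqrt(1 / n + 1 / m)
    return xbar - ybar - margin, xbar - ybar + margin

# Example 7.2-2: n = 9, m = 15, t_0.025(22) = 2.074.
lo, hi = pooled_t_interval(81.31, 78.61, 60.76, 48.24, 9, 15, 2.074)
# about (-3.65, 9.05)
```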
REMARKS The assumption of equal variances, namely, σ²X = σ²Y, can be modified somewhat so that we are still able to find a confidence interval for µX − µY. That is, if we know the ratio σ²X/σ²Y of the variances, we can still make this type of statistical inference.
Section 7.3 Confidence Intervals for Proportions 319
    P[ −zα/2 ≤ (Y/n − p)/√(p(1 − p)/n) ≤ zα/2 ] ≈ 1 − α.    (7.3-1)
If we proceed as we did when we found a confidence interval for µ in Section 7.1, we would obtain
    P[ Y/n − zα/2 √(p(1 − p)/n) ≤ p ≤ Y/n + zα/2 √(p(1 − p)/n) ] ≈ 1 − α.
Unfortunately, the unknown parameter p appears in the endpoints of this inequality. There are two ways out of this dilemma. First, we could make an additional approximation, namely, replacing p with Y/n in p(1 − p)/n in the endpoints. That is, if n is large enough, it is still true that
    P[ Y/n − zα/2 √((Y/n)(1 − Y/n)/n) ≤ p ≤ Y/n + zα/2 √((Y/n)(1 − Y/n)/n) ] ≈ 1 − α.
Thus, for large n, if the observed Y equals y, then the interval

    [ y/n − zα/2 √((y/n)(1 − y/n)/n), y/n + zα/2 √((y/n)(1 − y/n)/n) ]
serves as an approximate 100(1 − α)% confidence interval for p. Frequently, this interval is written as

    y/n ± zα/2 √((y/n)(1 − y/n)/n)    (7.3-2)

for brevity. This formulation clearly notes, as does x̄ ± zα/2(σ/√n) in Section 7.1, the reliability of the estimate y/n, namely, that we are 100(1 − α)% confident that p is within zα/2 √((y/n)(1 − y/n)/n) of p̂ = y/n.
A second way to solve for p in the inequality in Equation 7.3-1 is to note that

    |Y/n − p| / √(p(1 − p)/n) ≤ zα/2

is equivalent to

    H(p) = (Y/n − p)² − z²α/2 p(1 − p)/n ≤ 0.    (7.3-3)
But H(p) is a quadratic expression in p. Thus, we can find those values of p for which H(p) ≤ 0 by finding the two zeros of H(p). Letting p̂ = Y/n and z0 = zα/2 in Equation 7.3-3, we have

    H(p) = (1 + z²0/n) p² − (2p̂ + z²0/n) p + p̂².
By the quadratic formula, the zeros of H(p) are, after simplifications,

    [ p̂ + z²0/(2n) ± z0 √( p̂(1 − p̂)/n + z²0/(4n²) ) ] / (1 + z²0/n),    (7.3-4)
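Both intervals can be sketched side by side; the data (y = 30 successes in n = 100 trials) are hypothetical, not from the text, and the function names are ours. The second function implements the zeros in (7.3-4).

```python
import math

def wald_interval(y, n, z):
    """First approximation (7.3-2): y/n +/- z*sqrt((y/n)(1 - y/n)/n)."""
    phat = y / n
    margin = z * math.sqrt(phat * (1 - phat) / n)
    return phat - margin, phat + margin

def score_interval(y, n, z):
    """Second approach: the two zeros of H(p), as given in (7.3-4)."""
    phat = y / n
    center = phat + z**2 / (2 * n)
    half = z * math.sqrt(phat * (1 - phat) / n + z**2 / (4 * n**2))
    denom = 1 + z**2 / n
    return (center - half) / denom, (center + half) / denom

# Hypothetical data: y = 30 successes in n = 100 trials, 95% confidence.
lo1, hi1 = wald_interval(30, 100, 1.96)    # about (0.210, 0.390)
lo2, hi2 = score_interval(30, 100, 1.96)   # about (0.219, 0.396)
```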
358 Chapter 8 Tests of Statistical Hypotheses
Table 8.1-1 Tests of hypotheses about one mean, variance known

    H0         H1         Critical Region
    µ = µ0     µ > µ0     z ≥ zα, or x̄ ≥ µ0 + zα σ/√n
    µ = µ0     µ < µ0     z ≤ −zα, or x̄ ≤ µ0 − zα σ/√n
    µ = µ0     µ ≠ µ0     |z| ≥ zα/2, or |x̄ − µ0| ≥ zα/2 σ/√n
    Z = (X̄ − µ0)/√(σ²/n) = (X̄ − µ0)/(σ/√n),    (8.1-1)
and the critical regions, at a significance level α, for the three respective alternative hypotheses would be (i) z ≥ zα, (ii) z ≤ −zα, and (iii) |z| ≥ zα/2. In terms of x̄, these three critical regions become (i) x̄ ≥ µ0 + zα(σ/√n), (ii) x̄ ≤ µ0 − zα(σ/√n), and (iii) |x̄ − µ0| ≥ zα/2(σ/√n).

The three tests and critical regions are summarized in Table 8.1-1. The underlying assumption is that the distribution is N(µ, σ²) and σ² is known.

It is usually the case that the variance σ² is not known. Accordingly, we now take a more realistic position and assume that the variance is unknown. Suppose our null hypothesis is H0: µ = µ0 and the two-sided alternative hypothesis is H1: µ ≠ µ0. Recall from Section 7.1, for a random sample X1, X2, . . . , Xn taken from a normal distribution N(µ, σ²), a confidence interval for µ is based on
    T = (X̄ − µ)/√(S²/n) = (X̄ − µ)/(S/√n).
This suggests that T might be a good statistic to use for the test of H0: µ = µ0, with µ replaced by µ0. In addition, it is the natural statistic to use if we replace σ²/n by its unbiased estimator S²/n in (X̄ − µ0)/√(σ²/n) in Equation 8.1-1. If µ = µ0, we know that T has a t distribution with n − 1 degrees of freedom. Thus, with µ = µ0,

    P[ |T| ≥ tα/2(n−1) ] = P[ |X̄ − µ0|/(S/√n) ≥ tα/2(n−1) ] = α.
Accordingly, if x̄ and s are, respectively, the sample mean and sample standard deviation, then the rule that rejects H0: µ = µ0 and accepts H1: µ ≠ µ0 if and only if

    |t| = |x̄ − µ0|/(s/√n) ≥ tα/2(n−1)

provides a test of this hypothesis with significance level α. Note that this rule is equivalent to rejecting H0: µ = µ0 if µ0 is not in the open 100(1 − α)% confidence interval

    ( x̄ − tα/2(n−1)[s/√n], x̄ + tα/2(n−1)[s/√n] ).
Table 8.1-2 summarizes tests of hypotheses for a single mean, along with the three possible alternative hypotheses, when the underlying distribution is N(µ, σ²), σ² is unknown, t = (x̄ − µ0)/(s/√n), and n ≤ 30. If n > 30, we use Table 8.1-1 for approximate tests, with σ replaced by s.
Section 8.1 Tests About One Mean 359
Table 8.1-2 Tests of hypotheses for one mean, variance unknown

    H0         H1         Critical Region
    µ = µ0     µ > µ0     t ≥ tα(n − 1), or x̄ ≥ µ0 + tα(n − 1) s/√n
    µ = µ0     µ < µ0     t ≤ −tα(n − 1), or x̄ ≤ µ0 − tα(n − 1) s/√n
    µ = µ0     µ ≠ µ0     |t| ≥ tα/2(n − 1), or |x̄ − µ0| ≥ tα/2(n − 1) s/√n
Example 8.1-3. Let X (in millimeters) equal the growth in 15 days of a tumor induced in a mouse. Assume that the distribution of X is N(µ, σ²). We shall test the null hypothesis H0: µ = µ0 = 4.0 mm against the two-sided alternative hypothesis H1: µ ≠ 4.0. If we use n = 9 observations and a significance level of α = 0.10, the critical region is

    |t| = |x̄ − 4.0|/(s/√9) ≥ tα/2(8) = 1.860.

If we are given that n = 9, x̄ = 4.3, and s = 1.2, we see that

    t = (4.3 − 4.0)/(1.2/√9) = 0.3/0.4 = 0.75.

Thus,

    |t| = |0.75| < 1.860,

and we accept (do not reject) H0: µ = 4.0 at the α = 10% significance level. (See Figure 8.1-3.) The p-value is the two-sided probability of |T| ≥ 0.75, namely,

    p-value = P(|T| ≥ 0.75) = 2P(T ≥ 0.75).

With our t tables with eight degrees of freedom, we cannot find this p-value exactly. It is about 0.50, because

    P(|T| ≥ 0.706) = 2P(T ≥ 0.706) = 0.50.

However, Minitab gives a p-value of 0.4747. (See Figure 8.1-3.)
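The arithmetic of Example 8.1-3 can be checked directly (the helper name is ours; the critical value 1.860 is the text's table entry, not computed):

```python
import math

def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Example 8.1-3: n = 9, xbar = 4.3, s = 1.2, H0: mu = 4.0, alpha = 0.10.
t = t_statistic(4.3, 4.0, 1.2, 9)   # 0.75
critical = 1.860                    # t_0.05(8) from the t table
reject = abs(t) >= critical         # False: do not reject H0
```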
[Figure 8.1-3 Test about mean of tumor growths: two t(8) density plots, one shading the rejection regions with α/2 = 0.05 in each tail, the other shading the two-sided p-value beyond t = 0.75.]
Section 8.2 Tests of the Equality of Two Means 367
[Figure 8.2-2 Box plots for pea stem growths.]

    0.5  1.0  1.5  2.0  2.5

for the X sample and

    0.8  1.15  1.6  2.2  2.6

for the Y sample. The two box plots are shown in Figure 8.2-2.
Assuming independent random samples of sizes n and m, let x̄, ȳ, and s²p represent the observed unbiased estimates of the respective parameters µX, µY, and σ²X = σ²Y of two normal distributions with a common variance. Then α-level tests of certain hypotheses are given in Table 8.2-1 when σ²X = σ²Y. If the common-variance assumption is violated, but not too badly, the test is satisfactory, but the significance levels are only approximate. The t statistic and sp are given in Equations 8.2-1 and 8.2-2, respectively.
REMARK Again, to emphasize the relationship between confidence intervals and tests of hypotheses, we note that each of the tests in Table 8.2-1 has a corresponding confidence interval. For example, the first one-sided test is equivalent to saying that we reject H0: µX − µY = 0 if zero is not in the one-sided confidence interval with lower bound

    x̄ − ȳ − tα(n+m−2) sp √(1/n + 1/m).
Table 8.2-1 Tests of hypotheses for equality of two means

    H0          H1          Critical Region
    µX = µY     µX > µY     t ≥ tα(n+m−2), or x̄ − ȳ ≥ tα(n+m−2) sp √(1/n + 1/m)
    µX = µY     µX < µY     t ≤ −tα(n+m−2), or x̄ − ȳ ≤ −tα(n+m−2) sp √(1/n + 1/m)
    µX = µY     µX ≠ µY     |t| ≥ tα/2(n+m−2), or |x̄ − ȳ| ≥ tα/2(n+m−2) sp √(1/n + 1/m)
376 Chapter 8 Tests of Statistical Hypotheses
rolled to yield a total of n = 8000 observations. Let Y equal the number of times that 6 resulted in the 8000 trials. The test statistic is

    Z = (Y/n − 1/6)/√((1/6)(5/6)/n) = (Y/8000 − 1/6)/√((1/6)(5/6)/8000).
If we use a significance level of α = 0.05, the critical region is
z ≥ z0.05 = 1.645.
The results of the experiment yielded y = 1389, so the calculated value of the test statistic is

    z = (1389/8000 − 1/6)/√((1/6)(5/6)/8000) = 1.67.
Since
z = 1.67 > 1.645,
the null hypothesis is rejected, and the experimental results indicate that these dice favor a 6 more than a fair die would. (You could perform your own experiment to check out other dice.)
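The dice calculation is a direct application of the statistic above; `proportion_z` is our name for the helper.

```python
import math

def proportion_z(y, n, p0):
    """z = (y/n - p0) / sqrt(p0 * (1 - p0) / n)."""
    return (y / n - p0) / math.sqrt(p0 * (1 - p0) / n)

# y = 1389 sixes in n = 8000 rolls, H0: p = 1/6, alpha = 0.05.
z = proportion_z(1389, 8000, 1 / 6)   # about 1.67
reject = z >= 1.645                   # True: reject H0
```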
There are times when a two-sided alternative is appropriate; that is, here we test H0: p = p0 against H1: p ≠ p0. For example, suppose that the pass rate in the usual beginning statistics course is p0. There has been an intervention (say, some new teaching method), and it is not known whether the pass rate will increase, decrease, or stay about the same. Thus, we test the null (no-change) hypothesis H0: p = p0 against the two-sided alternative H1: p ≠ p0. A test with the approximate significance level α for doing this is to reject H0: p = p0 if

    |Z| = |Y/n − p0|/√(p0(1 − p0)/n) ≥ zα/2,

since, under H0, P(|Z| ≥ zα/2) ≈ α. These tests of approximate significance level α are summarized in Table 8.3-1. The rejection region for H0 is often called the critical region of the test, and we use that terminology in the table.
The p-value associated with a test is the probability, under the null hypothesis H0, that the test statistic (a random variable) is equal to or exceeds the observed value (a constant) of the test statistic in the direction of the alternative hypothesis.
Table 8.3-1 Tests of hypotheses for one proportion

    H0         H1         Critical Region
    p = p0     p > p0     z = (y/n − p0)/√(p0(1 − p0)/n) ≥ zα
    p = p0     p < p0     z = (y/n − p0)/√(p0(1 − p0)/n) ≤ −zα
    p = p0     p ≠ p0     |z| = |y/n − p0|/√(p0(1 − p0)/n) ≥ zα/2
Confidence Intervals

    Parameter       Assumptions                           Endpoints

    µ               N(µ, σ²) or n large, σ² known         x̄ ± zα/2 σ/√n

    µ               N(µ, σ²), σ² unknown                  x̄ ± tα/2(n−1) s/√n

    µX − µY         N(µX, σ²X), N(µY, σ²Y),               x̄ − ȳ ± zα/2 √(σ²X/n + σ²Y/m)
                    σ²X, σ²Y known

    µX − µY         Variances unknown, large samples      x̄ − ȳ ± zα/2 √(s²x/n + s²y/m)

    µX − µY         N(µX, σ²X), N(µY, σ²Y),               x̄ − ȳ ± tα/2(n+m−2) sp √(1/n + 1/m),
                    σ²X = σ²Y, unknown                    sp = √{[(n − 1)s²x + (m − 1)s²y]/(n + m − 2)}

    µD = µX − µY    X and Y normal, but dependent         d̄ ± tα/2(n−1) sd/√n

    p               b(n, p), n is large                   y/n ± zα/2 √((y/n)[1 − (y/n)]/n)

    p1 − p2         b(n1, p1), b(n2, p2),                 y1/n1 − y2/n2 ± zα/2 √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2)
                    p̂1 = y1/n1, p̂2 = y2/n2
484 Appendix B Tables
Table I Binomial Coefficients

    (n choose r) = n!/(r!(n − r)!) = (n choose n − r)

    n    r = 0    1    2    3    4    5    6    7    8    9    10    11    12    13
0 1
1 1 1
2 1 2 1
3 1 3 3 1
4 1 4 6 4 1
5 1 5 10 10 5 1
6 1 6 15 20 15 6 1
7 1 7 21 35 35 21 7 1
8 1 8 28 56 70 56 28 8 1
9 1 9 36 84 126 126 84 36 9 1
10 1 10 45 120 210 252 210 120 45 10 1
11 1 11 55 165 330 462 462 330 165 55 11 1
12 1 12 66 220 495 792 924 792 495 220 66 12 1
13 1 13 78 286 715 1,287 1,716 1,716 1,287 715 286 78 13 1
14 1 14 91 364 1,001 2,002 3,003 3,432 3,003 2,002 1,001 364 91 14
15 1 15 105 455 1,365 3,003 5,005 6,435 6,435 5,005 3,003 1,365 455 105
16 1 16 120 560 1,820 4,368 8,008 11,440 12,870 11,440 8,008 4,368 1,820 560
17 1 17 136 680 2,380 6,188 12,376 19,448 24,310 24,310 19,448 12,376 6,188 2,380
18 1 18 153 816 3,060 8,568 18,564 31,824 43,758 48,620 43,758 31,824 18,564 8,568
19 1 19 171 969 3,876 11,628 27,132 50,388 75,582 92,378 92,378 75,582 50,388 27,132
20 1 20 190 1,140 4,845 15,504 38,760 77,520 125,970 167,960 184,756 167,960 125,970 77,520
21 1 21 210 1,330 5,985 20,349 54,264 116,280 203,490 293,930 352,716 352,716 293,930 203,490
22 1 22 231 1,540 7,315 26,334 74,613 170,544 319,770 497,420 646,646 705,432 646,646 497,420
23 1 23 253 1,771 8,855 33,649 100,947 245,157 490,314 817,190 1,144,066 1,352,078 1,352,078 1,144,066
24 1 24 276 2,024 10,626 42,504 134,596 346,104 735,471 1,307,504 1,961,256 2,496,144 2,704,156 2,496,144
25 1 25 300 2,300 12,650 53,130 177,100 480,700 1,081,575 2,042,975 3,268,760 4,457,400 5,200,300 5,200,300
26 1 26 325 2,600 14,950 65,780 230,230 657,800 1,562,275 3,124,550 5,311,735 7,726,160 9,657,700 10,400,600
For r > 13 you may use the identity (n choose r) = (n choose n − r).
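In Python, `math.comb` evaluates these coefficients directly, and the symmetry identity recovers entries beyond the table's last printed column:

```python
import math

c_14_7 = math.comb(14, 7)     # 3432, the r = 7 entry of the n = 14 row
c_26_20 = math.comb(26, 20)   # r > 13: equals C(26, 6) by symmetry
```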
Table II The Binomial Distribution

[Plots for b(8, 0.35): the probability histogram f(x) and the distribution function F(x).]

    F(x) = P(X ≤ x) = Σ_{k=0}^{x} [n!/(k!(n − k)!)] p^k (1 − p)^(n−k)
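The distribution function can be evaluated directly and spot-checked against the table entries (here the b(8, 0.35) column; the function name is ours):

```python
import math

def binom_cdf(x, n, p):
    """F(x) = sum over k = 0..x of C(n, k) * p**k * (1 - p)**(n - k)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

f0 = binom_cdf(0, 8, 0.35)   # about 0.0319, matching the table
f4 = binom_cdf(4, 8, 0.35)   # about 0.8939, matching the table
```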
    n   x     p = 0.05   0.10   0.15   0.20   0.25   0.30   0.35   0.40   0.45   0.50
    2   0   0.9025 0.8100 0.7225 0.6400 0.5625 0.4900 0.4225 0.3600 0.3025 0.2500
        1   0.9975 0.9900 0.9775 0.9600 0.9375 0.9100 0.8775 0.8400 0.7975 0.7500
        2   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

    3   0   0.8574 0.7290 0.6141 0.5120 0.4219 0.3430 0.2746 0.2160 0.1664 0.1250
        1   0.9928 0.9720 0.9392 0.8960 0.8438 0.7840 0.7182 0.6480 0.5748 0.5000
        2   0.9999 0.9990 0.9966 0.9920 0.9844 0.9730 0.9571 0.9360 0.9089 0.8750
        3   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

    4   0   0.8145 0.6561 0.5220 0.4096 0.3164 0.2401 0.1785 0.1296 0.0915 0.0625
        1   0.9860 0.9477 0.8905 0.8192 0.7383 0.6517 0.5630 0.4752 0.3910 0.3125
        2   0.9995 0.9963 0.9880 0.9728 0.9492 0.9163 0.8735 0.8208 0.7585 0.6875
        3   1.0000 0.9999 0.9995 0.9984 0.9961 0.9919 0.9850 0.9744 0.9590 0.9375
        4   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

    5   0   0.7738 0.5905 0.4437 0.3277 0.2373 0.1681 0.1160 0.0778 0.0503 0.0312
        1   0.9774 0.9185 0.8352 0.7373 0.6328 0.5282 0.4284 0.3370 0.2562 0.1875
        2   0.9988 0.9914 0.9734 0.9421 0.8965 0.8369 0.7648 0.6826 0.5931 0.5000
        3   1.0000 0.9995 0.9978 0.9933 0.9844 0.9692 0.9460 0.9130 0.8688 0.8125
        4   1.0000 1.0000 0.9999 0.9997 0.9990 0.9976 0.9947 0.9898 0.9815 0.9688
        5   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

    6   0   0.7351 0.5314 0.3771 0.2621 0.1780 0.1176 0.0754 0.0467 0.0277 0.0156
        1   0.9672 0.8857 0.7765 0.6553 0.5339 0.4202 0.3191 0.2333 0.1636 0.1094
        2   0.9978 0.9842 0.9527 0.9011 0.8306 0.7443 0.6471 0.5443 0.4415 0.3438
        3   0.9999 0.9987 0.9941 0.9830 0.9624 0.9295 0.8826 0.8208 0.7447 0.6562
        4   1.0000 0.9999 0.9996 0.9984 0.9954 0.9891 0.9777 0.9590 0.9308 0.8906
        5   1.0000 1.0000 1.0000 0.9999 0.9998 0.9993 0.9982 0.9959 0.9917 0.9844
        6   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

    7   0   0.6983 0.4783 0.3206 0.2097 0.1335 0.0824 0.0490 0.0280 0.0152 0.0078
        1   0.9556 0.8503 0.7166 0.5767 0.4449 0.3294 0.2338 0.1586 0.1024 0.0625
        2   0.9962 0.9743 0.9262 0.8520 0.7564 0.6471 0.5323 0.4199 0.3164 0.2266
        3   0.9998 0.9973 0.9879 0.9667 0.9294 0.8740 0.8002 0.7102 0.6083 0.5000
        4   1.0000 0.9998 0.9988 0.9953 0.9871 0.9712 0.9444 0.9037 0.8471 0.7734
        5   1.0000 1.0000 0.9999 0.9996 0.9987 0.9962 0.9910 0.9812 0.9643 0.9375
        6   1.0000 1.0000 1.0000 1.0000 0.9999 0.9998 0.9994 0.9984 0.9963 0.9922
        7   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
    8   0   0.6634 0.4305 0.2725 0.1678 0.1001 0.0576 0.0319 0.0168 0.0084 0.0039
        1   0.9428 0.8131 0.6572 0.5033 0.3671 0.2553 0.1691 0.1064 0.0632 0.0352
        2   0.9942 0.9619 0.8948 0.7969 0.6785 0.5518 0.4278 0.3154 0.2201 0.1445
        3   0.9996 0.9950 0.9786 0.9437 0.8862 0.8059 0.7064 0.5941 0.4770 0.3633
        4   1.0000 0.9996 0.9971 0.9896 0.9727 0.9420 0.8939 0.8263 0.7396 0.6367
        5   1.0000 1.0000 0.9998 0.9988 0.9958 0.9887 0.9747 0.9502 0.9115 0.8555
        6   1.0000 1.0000 1.0000 0.9999 0.9996 0.9987 0.9964 0.9915 0.9819 0.9648
        7   1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9998 0.9993 0.9983 0.9961
        8   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

    9   0   0.6302 0.3874 0.2316 0.1342 0.0751 0.0404 0.0207 0.0101 0.0046 0.0020
        1   0.9288 0.7748 0.5995 0.4362 0.3003 0.1960 0.1211 0.0705 0.0385 0.0195
        2   0.9916 0.9470 0.8591 0.7382 0.6007 0.4628 0.3373 0.2318 0.1495 0.0898
        3   0.9994 0.9917 0.9661 0.9144 0.8343 0.7297 0.6089 0.4826 0.3614 0.2539
        4   1.0000 0.9991 0.9944 0.9804 0.9511 0.9012 0.8283 0.7334 0.6214 0.5000
        5   1.0000 0.9999 0.9994 0.9969 0.9900 0.9747 0.9464 0.9006 0.8342 0.7461
        6   1.0000 1.0000 1.0000 0.9997 0.9987 0.9957 0.9888 0.9750 0.9502 0.9102
        7   1.0000 1.0000 1.0000 1.0000 0.9999 0.9996 0.9986 0.9962 0.9909 0.9805
        8   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.9992 0.9980
        9   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

   10   0   0.5987 0.3487 0.1969 0.1074 0.0563 0.0282 0.0135 0.0060 0.0025 0.0010
        1   0.9139 0.7361 0.5443 0.3758 0.2440 0.1493 0.0860 0.0464 0.0233 0.0107
        2   0.9885 0.9298 0.8202 0.6778 0.5256 0.3828 0.2616 0.1673 0.0996 0.0547
        3   0.9990 0.9872 0.9500 0.8791 0.7759 0.6496 0.5138 0.3823 0.2660 0.1719
        4   0.9999 0.9984 0.9901 0.9672 0.9219 0.8497 0.7515 0.6331 0.5044 0.3770
        5   1.0000 0.9999 0.9986 0.9936 0.9803 0.9527 0.9051 0.8338 0.7384 0.6230
        6   1.0000 1.0000 0.9999 0.9991 0.9965 0.9894 0.9740 0.9452 0.8980 0.8281
        7   1.0000 1.0000 1.0000 0.9999 0.9996 0.9984 0.9952 0.9877 0.9726 0.9453
        8   1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9995 0.9983 0.9955 0.9893
        9   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.9990
10 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
   11   0   0.5688 0.3138 0.1673 0.0859 0.0422 0.0198 0.0088 0.0036 0.0014 0.0005
        1   0.8981 0.6974 0.4922 0.3221 0.1971 0.1130 0.0606 0.0302 0.0139 0.0059
        2   0.9848 0.9104 0.7788 0.6174 0.4552 0.3127 0.2001 0.1189 0.0652 0.0327
        3   0.9984 0.9815 0.9306 0.8389 0.7133 0.5696 0.4256 0.2963 0.1911 0.1133
        4   0.9999 0.9972 0.9841 0.9496 0.8854 0.7897 0.6683 0.5328 0.3971 0.2744
        5   1.0000 0.9997 0.9973 0.9883 0.9657 0.9218 0.8513 0.7535 0.6331 0.5000
        6   1.0000 1.0000 0.9997 0.9980 0.9924 0.9784 0.9499 0.9006 0.8262 0.7256
        7   1.0000 1.0000 1.0000 0.9998 0.9988 0.9957 0.9878 0.9707 0.9390 0.8867
        8   1.0000 1.0000 1.0000 1.0000 0.9999 0.9994 0.9980 0.9941 0.9852 0.9673
        9   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9993 0.9978 0.9941
       10   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9995
       11   1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
12 0 0.5404 0.2824 0.1422 0.0687 0.0317 0.0138 0.0057 0.0022 0.0008 0.00021 0.8816 0.6590 0.4435 0.2749 0.1584 0.0850 0.0424 0.0196 0.0083 0.00322 0.9804 0.8891 0.7358 0.5583 0.3907 0.2528 0.1513 0.0834 0.0421 0.01933 0.9978 0.9744 0.9078 0.7946 0.6488 0.4925 0.3467 0.2253 0.1345 0.07304 0.9998 0.9957 0.9761 0.9274 0.8424 0.7237 0.5833 0.4382 0.3044 0.19385 1.0000 0.9995 0.9954 0.9806 0.9456 0.8822 0.7873 0.6652 0.5269 0.38726 1.0000 0.9999 0.9993 0.9961 0.9857 0.9614 0.9154 0.8418 0.7393 0.61287 1.0000 1.0000 0.9999 0.9994 0.9972 0.9905 0.9745 0.9427 0.8883 0.80628 1.0000 1.0000 1.0000 0.9999 0.9996 0.9983 0.9944 0.9847 0.9644 0.92709 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9992 0.9972 0.9921 0.9807
10 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.9989 0.996811 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.999812 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
13 0 0.5133 0.2542 0.1209 0.0550 0.0238 0.0097 0.0037 0.0013 0.0004 0.0001
1 0.8646 0.6213 0.3983 0.2336 0.1267 0.0637 0.0296 0.0126 0.0049 0.0017
2 0.9755 0.8661 0.6920 0.5017 0.3326 0.2025 0.1132 0.0579 0.0269 0.0112
3 0.9969 0.9658 0.8820 0.7473 0.5843 0.4206 0.2783 0.1686 0.0929 0.0461
4 0.9997 0.9935 0.9658 0.9009 0.7940 0.6543 0.5005 0.3530 0.2279 0.1334
5 1.0000 0.9991 0.9924 0.9700 0.9198 0.8346 0.7159 0.5744 0.4268 0.2905
6 1.0000 0.9999 0.9987 0.9930 0.9757 0.9376 0.8705 0.7712 0.6437 0.5000
7 1.0000 1.0000 0.9998 0.9988 0.9944 0.9818 0.9538 0.9023 0.8212 0.7095
8 1.0000 1.0000 1.0000 0.9998 0.9990 0.9960 0.9874 0.9679 0.9302 0.8666
9 1.0000 1.0000 1.0000 1.0000 0.9999 0.9993 0.9975 0.9922 0.9797 0.9539
10 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.9987 0.9959 0.988811 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9995 0.998312 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.999913 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
14 0 0.4877 0.2288 0.1028 0.0440 0.0178 0.0068 0.0024 0.0008 0.0002 0.00011 0.8470 0.5846 0.3567 0.1979 0.1010 0.0475 0.0205 0.0081 0.0029 0.00092 0.9699 0.8416 0.6479 0.4481 0.2811 0.1608 0.0839 0.0398 0.0170 0.00653 0.9958 0.9559 0.8535 0.6982 0.5213 0.3552 0.2205 0.1243 0.0632 0.02874 0.9996 0.9908 0.9533 0.8702 0.7415 0.5842 0.4227 0.2793 0.1672 0.08985 1.0000 0.9985 0.9885 0.9561 0.8883 0.7805 0.6405 0.4859 0.3373 0.21206 1.0000 0.9998 0.9978 0.9884 0.9617 0.9067 0.8164 0.6925 0.5461 0.39537 1.0000 1.0000 0.9997 0.9976 0.9897 0.9685 0.9247 0.8499 0.7414 0.60478 1.0000 1.0000 1.0000 0.9996 0.9978 0.9917 0.9757 0.9417 0.8811 0.78809 1.0000 1.0000 1.0000 1.0000 0.9997 0.9983 0.9940 0.9825 0.9574 0.9102
10 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9989 0.9961 0.9886 0.9713
11 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9994 0.9978 0.993512 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.999113 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.999914 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
15 0 0.4633 0.2059 0.0874 0.0352 0.0134 0.0047 0.0016 0.0005 0.0001 0.00001 0.8290 0.5490 0.3186 0.1671 0.0802 0.0353 0.0142 0.0052 0.0017 0.00052 0.9638 0.8159 0.6042 0.3980 0.2361 0.1268 0.0617 0.0271 0.0107 0.00373 0.9945 0.9444 0.8227 0.6482 0.4613 0.2969 0.1727 0.0905 0.0424 0.01764 0.9994 0.9873 0.9383 0.8358 0.6865 0.5155 0.3519 0.2173 0.1204 0.05925 0.9999 0.9978 0.9832 0.9389 0.8516 0.7216 0.5643 0.4032 0.2608 0.15096 1.0000 0.9997 0.9964 0.9819 0.9434 0.8689 0.7548 0.6098 0.4522 0.30367 1.0000 1.0000 0.9994 0.9958 0.9827 0.9500 0.8868 0.7869 0.6535 0.50008 1.0000 1.0000 0.9999 0.9992 0.9958 0.9848 0.9578 0.9050 0.8182 0.69649 1.0000 1.0000 1.0000 0.9999 0.9992 0.9963 0.9876 0.9662 0.9231 0.8491
10 1.0000 1.0000 1.0000 1.0000 0.9999 0.9993 0.9972 0.9907 0.9745 0.9408
11 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9995 0.9981 0.9937 0.9824
12 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.9989 0.9963
13 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9995
14 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
15 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
16 0 0.4401 0.1853 0.0743 0.0281 0.0100 0.0033 0.0010 0.0003 0.0001 0.0000
1 0.8108 0.5147 0.2839 0.1407 0.0635 0.0261 0.0098 0.0033 0.0010 0.0003
2 0.9571 0.7892 0.5614 0.3518 0.1971 0.0994 0.0451 0.0183 0.0066 0.0021
3 0.9930 0.9316 0.7899 0.5981 0.4050 0.2459 0.1339 0.0651 0.0281 0.0106
4 0.9991 0.9830 0.9209 0.7982 0.6302 0.4499 0.2892 0.1666 0.0853 0.0384
5 0.9999 0.9967 0.9765 0.9183 0.8103 0.6598 0.4900 0.3288 0.1976 0.1051
6 1.0000 0.9995 0.9944 0.9733 0.9204 0.8247 0.6881 0.5272 0.3660 0.2272
7 1.0000 0.9999 0.9989 0.9930 0.9729 0.9256 0.8406 0.7161 0.5629 0.4018
8 1.0000 1.0000 0.9998 0.9985 0.9925 0.9743 0.9329 0.8577 0.7441 0.5982
9 1.0000 1.0000 1.0000 0.9998 0.9984 0.9929 0.9771 0.9417 0.8759 0.7728
10 1.0000 1.0000 1.0000 1.0000 0.9997 0.9984 0.9938 0.9809 0.9514 0.894911 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9987 0.9951 0.9851 0.961612 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9991 0.9965 0.989413 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9994 0.997914 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.999715 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.000016 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
20 0 0.3585 0.1216 0.0388 0.0115 0.0032 0.0008 0.0002 0.0000 0.0000 0.00001 0.7358 0.3917 0.1756 0.0692 0.0243 0.0076 0.0021 0.0005 0.0001 0.00002 0.9245 0.6769 0.4049 0.2061 0.0913 0.0355 0.0121 0.0036 0.0009 0.00023 0.9841 0.8670 0.6477 0.4114 0.2252 0.1071 0.0444 0.0160 0.0049 0.00134 0.9974 0.9568 0.8298 0.6296 0.4148 0.2375 0.1182 0.0510 0.0189 0.0059
5 0.9997 0.9887 0.9327 0.8042 0.6172 0.4164 0.2454 0.1256 0.0553 0.02076 1.0000 0.9976 0.9781 0.9133 0.7858 0.6080 0.4166 0.2500 0.1299 0.05777 1.0000 0.9996 0.9941 0.9679 0.8982 0.7723 0.6010 0.4159 0.2520 0.13168 1.0000 0.9999 0.9987 0.9900 0.9591 0.8867 0.7624 0.5956 0.4143 0.25179 1.0000 1.0000 0.9998 0.9974 0.9861 0.9520 0.8782 0.7553 0.5914 0.4119
10 1.0000 1.0000 1.0000 0.9994 0.9961 0.9829 0.9468 0.8725 0.7507 0.588111 1.0000 1.0000 1.0000 0.9999 0.9991 0.9949 0.9804 0.9435 0.8692 0.748312 1.0000 1.0000 1.0000 1.0000 0.9998 0.9987 0.9940 0.9790 0.9420 0.868413 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9985 0.9935 0.9786 0.942314 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9984 0.9936 0.979315 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9985 0.994116 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.998717 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.999818 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.000019 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.000020 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
25 0 0.2774 0.0718 0.0172 0.0038 0.0008 0.0001 0.0000 0.0000 0.0000 0.00001 0.6424 0.2712 0.0931 0.0274 0.0070 0.0016 0.0003 0.0001 0.0000 0.00002 0.8729 0.5371 0.2537 0.0982 0.0321 0.0090 0.0021 0.0004 0.0001 0.00003 0.9659 0.7636 0.4711 0.2340 0.0962 0.0332 0.0097 0.0024 0.0005 0.00014 0.9928 0.9020 0.6821 0.4207 0.2137 0.0905 0.0320 0.0095 0.0023 0.00055 0.9988 0.9666 0.8385 0.6167 0.3783 0.1935 0.0826 0.0294 0.0086 0.00206 0.9998 0.9905 0.9305 0.7800 0.5611 0.3407 0.1734 0.0736 0.0258 0.00737 1.0000 0.9977 0.9745 0.8909 0.7265 0.5118 0.3061 0.1536 0.0639 0.02168 1.0000 0.9995 0.9920 0.9532 0.8506 0.6769 0.4668 0.2735 0.1340 0.05399 1.0000 0.9999 0.9979 0.9827 0.9287 0.8106 0.6303 0.4246 0.2424 0.1148
10 1.0000 1.0000 0.9995 0.9944 0.9703 0.9022 0.7712 0.5858 0.3843 0.2122
11 1.0000 1.0000 0.9999 0.9985 0.9893 0.9558 0.8746 0.7323 0.5426 0.3450
12 1.0000 1.0000 1.0000 0.9996 0.9966 0.9825 0.9396 0.8462 0.6937 0.5000
13 1.0000 1.0000 1.0000 0.9999 0.9991 0.9940 0.9745 0.9222 0.8173 0.6550
14 1.0000 1.0000 1.0000 1.0000 0.9998 0.9982 0.9907 0.9656 0.9040 0.7878
15 1.0000 1.0000 1.0000 1.0000 1.0000 0.9995 0.9971 0.9868 0.9560 0.8852
16 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9992 0.9957 0.9826 0.9461
17 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9988 0.9942 0.9784
18 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9984 0.9927
19 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9996 0.9980
20 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9995
21 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999
22 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
23 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
24 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
25 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
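Any entry of Table II can be reproduced directly by summing the binomial pmf b(n, p) given in the formula summary. A minimal sketch in Python (stdlib only; the function name is ours, not the book's):

```python
import math

def binom_cdf(x, n, p):
    # P(X <= x) for X ~ b(n, p), summing the pmf term by term.
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# Reproduce a Table II entry: n = 10, p = 0.30 gives P(X <= 3) = 0.6496.
print(round(binom_cdf(3, 10, 0.30), 4))
```

Rounding to four decimal places matches the printed table values.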
Table III The Poisson Distribution
[Figure: probability histogram f(x) and distribution function F(x) of the Poisson distribution with λ = 3.8.]

F(x) = P(X ≤ x) = ∑_{k=0}^{x} λ^k e^{−λ}/k!, where λ = E(X)
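The cdf above can be evaluated directly to reproduce any entry of Table III. A minimal sketch in Python (stdlib only; the function name is ours):

```python
import math

def poisson_cdf(x, lam):
    # F(x) = sum_{k=0}^{x} lam^k e^{-lam} / k!
    return sum(lam**k * math.exp(-lam) / math.factorial(k) for k in range(x + 1))

# Reproduce a Table III entry: lambda = 3.8 gives F(4) = 0.668.
print(round(poisson_cdf(4, 3.8), 3))
```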
x 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
0 0.905 0.819 0.741 0.670 0.607 0.549 0.497 0.449 0.407 0.3681 0.995 0.982 0.963 0.938 0.910 0.878 0.844 0.809 0.772 0.7362 1.000 0.999 0.996 0.992 0.986 0.977 0.966 0.953 0.937 0.9203 1.000 1.000 1.000 0.999 0.998 0.997 0.994 0.991 0.987 0.9814 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.998 0.996
5 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.9996 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
x 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
0 0.333 0.301 0.273 0.247 0.223 0.202 0.183 0.165 0.150 0.1351 0.699 0.663 0.627 0.592 0.558 0.525 0.493 0.463 0.434 0.4062 0.900 0.879 0.857 0.833 0.809 0.783 0.757 0.731 0.704 0.6773 0.974 0.966 0.957 0.946 0.934 0.921 0.907 0.891 0.875 0.8574 0.995 0.992 0.989 0.986 0.981 0.976 0.970 0.964 0.956 0.947
5 0.999 0.998 0.998 0.997 0.996 0.994 0.992 0.990 0.987 0.9836 1.000 1.000 1.000 0.999 0.999 0.999 0.998 0.997 0.997 0.9957 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.9998 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
x 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8 4.0
0 0.111 0.091 0.074 0.061 0.050 0.041 0.033 0.027 0.022 0.0181 0.355 0.308 0.267 0.231 0.199 0.171 0.147 0.126 0.107 0.0922 0.623 0.570 0.518 0.469 0.423 0.380 0.340 0.303 0.269 0.2383 0.819 0.779 0.736 0.692 0.647 0.603 0.558 0.515 0.473 0.4334 0.928 0.904 0.877 0.848 0.815 0.781 0.744 0.706 0.668 0.629
5 0.975 0.964 0.951 0.935 0.916 0.895 0.871 0.844 0.816 0.7856 0.993 0.988 0.983 0.976 0.966 0.955 0.942 0.927 0.909 0.8897 0.998 0.997 0.995 0.992 0.988 0.983 0.977 0.969 0.960 0.9498 1.000 0.999 0.999 0.998 0.996 0.994 0.992 0.988 0.984 0.9799 1.000 1.000 1.000 0.999 0.999 0.998 0.997 0.996 0.994 0.992
10 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.998 0.99711 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.99912 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
Appendix B Tables 491
Table III continued
x 4.2 4.4 4.6 4.8 5.0 5.2 5.4 5.6 5.8 6.0
0 0.015 0.012 0.010 0.008 0.007 0.006 0.005 0.004 0.003 0.0021 0.078 0.066 0.056 0.048 0.040 0.034 0.029 0.024 0.021 0.0172 0.210 0.185 0.163 0.143 0.125 0.109 0.095 0.082 0.072 0.0623 0.395 0.359 0.326 0.294 0.265 0.238 0.213 0.191 0.170 0.1514 0.590 0.551 0.513 0.476 0.440 0.406 0.373 0.342 0.313 0.285
5 0.753 0.720 0.686 0.651 0.616 0.581 0.546 0.512 0.478 0.4466 0.867 0.844 0.818 0.791 0.762 0.732 0.702 0.670 0.638 0.6067 0.936 0.921 0.905 0.887 0.867 0.845 0.822 0.797 0.771 0.7448 0.972 0.964 0.955 0.944 0.932 0.918 0.903 0.886 0.867 0.8479 0.989 0.985 0.980 0.975 0.968 0.960 0.951 0.941 0.929 0.916
10 0.996 0.994 0.992 0.990 0.986 0.982 0.977 0.972 0.965 0.95711 0.999 0.998 0.997 0.996 0.995 0.993 0.990 0.988 0.984 0.98012 1.000 0.999 0.999 0.999 0.998 0.997 0.996 0.995 0.993 0.99113 1.000 1.000 1.000 1.000 0.999 0.999 0.999 0.998 0.997 0.99614 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.999 0.999
15 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.99916 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
x 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0
0 0.002 0.001 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.0001 0.011 0.007 0.005 0.003 0.002 0.001 0.001 0.000 0.000 0.0002 0.043 0.030 0.020 0.014 0.009 0.006 0.004 0.003 0.002 0.0013 0.112 0.082 0.059 0.042 0.030 0.021 0.015 0.010 0.007 0.0054 0.224 0.173 0.132 0.100 0.074 0.055 0.040 0.029 0.021 0.015
5 0.369 0.301 0.241 0.191 0.150 0.116 0.089 0.067 0.050 0.0386 0.527 0.450 0.378 0.313 0.256 0.207 0.165 0.130 0.102 0.0797 0.673 0.599 0.525 0.453 0.386 0.324 0.269 0.220 0.179 0.1438 0.792 0.729 0.662 0.593 0.523 0.456 0.392 0.333 0.279 0.2329 0.877 0.830 0.776 0.717 0.653 0.587 0.522 0.458 0.397 0.341
10 0.933 0.901 0.862 0.816 0.763 0.706 0.645 0.583 0.521 0.46011 0.966 0.947 0.921 0.888 0.849 0.803 0.752 0.697 0.639 0.57912 0.984 0.973 0.957 0.936 0.909 0.876 0.836 0.792 0.742 0.68913 0.993 0.987 0.978 0.966 0.949 0.926 0.898 0.864 0.825 0.78114 0.997 0.994 0.990 0.983 0.973 0.959 0.940 0.917 0.888 0.854
15 0.999 0.998 0.995 0.992 0.986 0.978 0.967 0.951 0.932 0.907
16 1.000 0.999 0.998 0.996 0.993 0.989 0.982 0.973 0.960 0.944
17 1.000 1.000 0.999 0.998 0.997 0.995 0.991 0.986 0.978 0.968
18 1.000 1.000 1.000 0.999 0.999 0.998 0.996 0.993 0.988 0.982
19 1.000 1.000 1.000 1.000 0.999 0.999 0.998 0.997 0.994 0.991
20 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.998 0.997 0.99521 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.99822 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.99923 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
x 11.5 12.0 12.5 13.0 13.5 14.0 14.5 15.0 15.5 16.0
0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.0001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.0002 0.001 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.0003 0.003 0.002 0.002 0.001 0.001 0.000 0.000 0.000 0.000 0.0004 0.011 0.008 0.005 0.004 0.003 0.002 0.001 0.001 0.001 0.000
5 0.028 0.020 0.015 0.011 0.008 0.006 0.004 0.003 0.002 0.0016 0.060 0.046 0.035 0.026 0.019 0.014 0.010 0.008 0.006 0.0047 0.114 0.090 0.070 0.054 0.041 0.032 0.024 0.018 0.013 0.0108 0.191 0.155 0.125 0.100 0.079 0.062 0.048 0.037 0.029 0.0229 0.289 0.242 0.201 0.166 0.135 0.109 0.088 0.070 0.055 0.043
10 0.402 0.347 0.297 0.252 0.211 0.176 0.145 0.118 0.096 0.07711 0.520 0.462 0.406 0.353 0.304 0.260 0.220 0.185 0.154 0.12712 0.633 0.576 0.519 0.463 0.409 0.358 0.311 0.268 0.228 0.19313 0.733 0.682 0.629 0.573 0.518 0.464 0.413 0.363 0.317 0.27514 0.815 0.772 0.725 0.675 0.623 0.570 0.518 0.466 0.415 0.368
15 0.878 0.844 0.806 0.764 0.718 0.669 0.619 0.568 0.517 0.46716 0.924 0.899 0.869 0.835 0.798 0.756 0.711 0.664 0.615 0.56617 0.954 0.937 0.916 0.890 0.861 0.827 0.790 0.749 0.705 0.65918 0.974 0.963 0.948 0.930 0.908 0.883 0.853 0.819 0.782 0.74219 0.986 0.979 0.969 0.957 0.942 0.923 0.901 0.875 0.846 0.812
20 0.992 0.988 0.983 0.975 0.965 0.952 0.936 0.917 0.894 0.86821 0.996 0.994 0.991 0.986 0.980 0.971 0.960 0.947 0.930 0.91122 0.999 0.997 0.995 0.992 0.989 0.983 0.976 0.967 0.956 0.94223 0.999 0.999 0.998 0.996 0.994 0.991 0.986 0.981 0.973 0.96324 1.000 0.999 0.999 0.998 0.997 0.995 0.992 0.989 0.984 0.978
25 1.000 1.000 0.999 0.999 0.998 0.997 0.996 0.994 0.991 0.98726 1.000 1.000 1.000 1.000 0.999 0.999 0.998 0.997 0.995 0.99327 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.998 0.997 0.99628 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.999 0.99829 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999
30 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.99931 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.00032 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.00033 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.00034 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
35 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
Table IV The Chi-Square Distribution
[Figure: pdf of χ2(8), showing P(X ≤ x) as the area to the left of x, and the critical value χ2α(8) cutting off right-tail area α.]
P(X ≤ x) = ∫₀ˣ 1/[Γ(r/2) 2^{r/2}] w^{r/2−1} e^{−w/2} dw
P(X ≤ x)
0.010 0.025 0.050 0.100 0.900 0.950 0.975 0.990
r χ²_0.99(r) χ²_0.975(r) χ²_0.95(r) χ²_0.90(r) χ²_0.10(r) χ²_0.05(r) χ²_0.025(r) χ²_0.01(r)
1 0.000 0.001 0.004 0.016 2.706 3.841 5.024 6.6352 0.020 0.051 0.103 0.211 4.605 5.991 7.378 9.2103 0.115 0.216 0.352 0.584 6.251 7.815 9.348 11.344 0.297 0.484 0.711 1.064 7.779 9.488 11.14 13.285 0.554 0.831 1.145 1.610 9.236 11.07 12.83 15.09
6 0.872 1.237 1.635 2.204 10.64 12.59 14.45 16.817 1.239 1.690 2.167 2.833 12.02 14.07 16.01 18.488 1.646 2.180 2.733 3.490 13.36 15.51 17.54 20.099 2.088 2.700 3.325 4.168 14.68 16.92 19.02 21.67
10 2.558 3.247 3.940 4.865 15.99 18.31 20.48 23.21
11 3.053 3.816 4.575 5.578 17.28 19.68 21.92 24.7212 3.571 4.404 5.226 6.304 18.55 21.03 23.34 26.2213 4.107 5.009 5.892 7.042 19.81 22.36 24.74 27.6914 4.660 5.629 6.571 7.790 21.06 23.68 26.12 29.1415 5.229 6.262 7.261 8.547 22.31 25.00 27.49 30.58
16 5.812 6.908 7.962 9.312 23.54 26.30 28.84 32.0017 6.408 7.564 8.672 10.08 24.77 27.59 30.19 33.4118 7.015 8.231 9.390 10.86 25.99 28.87 31.53 34.8019 7.633 8.907 10.12 11.65 27.20 30.14 32.85 36.1920 8.260 9.591 10.85 12.44 28.41 31.41 34.17 37.57
21 8.897 10.28 11.59 13.24 29.62 32.67 35.48 38.9322 9.542 10.98 12.34 14.04 30.81 33.92 36.78 40.2923 10.20 11.69 13.09 14.85 32.01 35.17 38.08 41.6424 10.86 12.40 13.85 15.66 33.20 36.42 39.36 42.9825 11.52 13.12 14.61 16.47 34.38 37.65 40.65 44.31
26 12.20 13.84 15.38 17.29 35.56 38.88 41.92 45.6427 12.88 14.57 16.15 18.11 36.74 40.11 43.19 46.9628 13.56 15.31 16.93 18.94 37.92 41.34 44.46 48.2829 14.26 16.05 17.71 19.77 39.09 42.56 45.72 49.5930 14.95 16.79 18.49 20.60 40.26 43.77 46.98 50.89
40 22.16 24.43 26.51 29.05 51.80 55.76 59.34 63.6950 29.71 32.36 34.76 37.69 63.17 67.50 71.42 76.1560 37.48 40.48 43.19 46.46 74.40 79.08 83.30 88.3870 45.44 48.76 51.74 55.33 85.53 90.53 95.02 100.480 53.34 57.15 60.39 64.28 96.58 101.9 106.6 112.3
This table is abridged and adapted from Table III in Biometrika Tables for Statisticians, edited by E. S. Pearson and H. O. Hartley.
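The chi-square cdf is a regularized incomplete gamma function, so Table IV can be checked numerically. A minimal stdlib-only sketch using the standard power series for the lower incomplete gamma (in practice one would reach for scipy.special.gammainc; the function names here are ours):

```python
import math

def lower_gamma_reg(a, x, tol=1e-12):
    # Regularized lower incomplete gamma P(a, x), from the power series
    # gamma(a, x) = x^a e^{-x} * sum_n x^n / (a (a+1) ... (a+n)).
    term = 1.0 / a
    total = term
    n = 0
    while term > tol * total:
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def chi2_cdf(x, r):
    # P(X <= x) for X ~ chi-square(r).
    return lower_gamma_reg(r / 2, x / 2)

# Table IV check: the quantile 15.51 with r = 8 should give about 0.95.
print(round(chi2_cdf(15.51, 8), 4))
```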
Table Va The Standard Normal Distribution Function
[Figure: standard normal pdf f(z), with Φ(z0) equal to the shaded area to the left of z0.]
P(Z ≤ z) = Φ(z) = ∫_{−∞}^z (1/√(2π)) e^{−w²/2} dw

Φ(−z) = 1 − Φ(z)
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.53590.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.57530.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.61410.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.65170.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.68790.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.72240.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.75490.7 0.7580 0.7611 0.7642 0.7673 0.7703 0.7734 0.7764 0.7794 0.7823 0.78520.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.81330.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.83891.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.86211.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.88301.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.90151.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.91771.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.93191.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.94411.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.95451.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.96331.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.97061.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.97672.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.98172.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.98572.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.98902.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.99162.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.99362.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.99522.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.99642.7 0.9965 0.9966 0.9967 
0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.99742.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.99812.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.99863.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
α 0.400 0.300 0.200 0.100 0.050 0.025 0.020 0.010 0.005 0.001
zα 0.253 0.524 0.842 1.282 1.645 1.960 2.054 2.326 2.576 3.090
zα/2 0.842 1.036 1.282 1.645 1.960 2.241 2.326 2.576 2.807 3.291
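Entries of Table Va can be checked with the error function, since Φ(z) = (1 + erf(z/√2))/2. A minimal sketch in Python (the function name is ours):

```python
import math

def phi(z):
    # Standard normal cdf via the error function:
    # Phi(z) = (1 + erf(z / sqrt(2))) / 2.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Table Va check: Phi(1.96) = 0.9750; also Phi(-z) = 1 - Phi(z).
print(round(phi(1.96), 4))
print(round(phi(-1.96) + phi(1.96), 4))
```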
Table Vb The Standard Normal Right-Tail Probabilities
[Figure: standard normal pdf f(z), with right-tail area α above the critical value zα.]
P(Z > zα) = α
P(Z > z) = 1 − Φ(z) = Φ(−z)
zα 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.46410.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.42470.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.38590.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.34830.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.31210.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.27760.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.24510.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.21480.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.18670.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.16111.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.13791.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.11701.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.09851.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.08231.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.06811.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.05591.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.04551.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.03671.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.02941.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.02332.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.01832.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.01432.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.01102.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.00842.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.00642.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.00482.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.00362.7 0.0035 0.0034 0.0033 
0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.00262.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.00192.9 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.00143.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.00103.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.00073.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.00053.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.00033.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
Table VI The t Distribution
[Figure: t pdf, showing P(T ≤ t) as the area to the left of t, and the critical value tα(r) cutting off right-tail area α.]
P(T ≤ t) = ∫_{−∞}^t Γ[(r + 1)/2] / {√(πr) Γ(r/2) (1 + w²/r)^{(r+1)/2}} dw
P(T ≤ −t) = 1 − P(T ≤ t)
P(T ≤ t)
0.60 0.75 0.90 0.95 0.975 0.99 0.995
r t0.40(r) t0.25(r) t0.10(r) t0.05(r) t0.025(r) t0.01(r) t0.005(r)
1 0.325 1.000 3.078 6.314 12.706 31.821 63.6572 0.289 0.816 1.886 2.920 4.303 6.965 9.9253 0.277 0.765 1.638 2.353 3.182 4.541 5.8414 0.271 0.741 1.533 2.132 2.776 3.747 4.6045 0.267 0.727 1.476 2.015 2.571 3.365 4.032
6 0.265 0.718 1.440 1.943 2.447 3.143 3.7077 0.263 0.711 1.415 1.895 2.365 2.998 3.4998 0.262 0.706 1.397 1.860 2.306 2.896 3.3559 0.261 0.703 1.383 1.833 2.262 2.821 3.250
10 0.260 0.700 1.372 1.812 2.228 2.764 3.169
11 0.260 0.697 1.363 1.796 2.201 2.718 3.10612 0.259 0.695 1.356 1.782 2.179 2.681 3.05513 0.259 0.694 1.350 1.771 2.160 2.650 3.01214 0.258 0.692 1.345 1.761 2.145 2.624 2.99715 0.258 0.691 1.341 1.753 2.131 2.602 2.947
16 0.258 0.690 1.337 1.746 2.120 2.583 2.92117 0.257 0.689 1.333 1.740 2.110 2.567 2.89818 0.257 0.688 1.330 1.734 2.101 2.552 2.87819 0.257 0.688 1.328 1.729 2.093 2.539 2.86120 0.257 0.687 1.325 1.725 2.086 2.528 2.845
21 0.257 0.686 1.323 1.721 2.080 2.518 2.83122 0.256 0.686 1.321 1.717 2.074 2.508 2.81923 0.256 0.685 1.319 1.714 2.069 2.500 2.80724 0.256 0.685 1.318 1.711 2.064 2.492 2.79725 0.256 0.684 1.316 1.708 2.060 2.485 2.787
26 0.256 0.684 1.315 1.706 2.056 2.479 2.77927 0.256 0.684 1.314 1.703 2.052 2.473 2.77128 0.256 0.683 1.313 1.701 2.048 2.467 2.76329 0.256 0.683 1.311 1.699 2.045 2.462 2.75630 0.256 0.683 1.310 1.697 2.042 2.457 2.750
∞ 0.253 0.674 1.282 1.645 1.960 2.326 2.576
This table is taken from Table III of Fisher and Yates: Statistical Tables for Biological, Agricultural, and Medical Research, published by Longman Group Ltd., London (previously published by Oliver and Boyd, Edinburgh).
Table VII The F Distribution
P(F ≤ f) = ∫₀^f Γ[(r1 + r2)/2] (r1/r2)^{r1/2} w^{r1/2−1} / {Γ(r1/2) Γ(r2/2) (1 + r1w/r2)^{(r1+r2)/2}} dw
[Figure: pdf of F(4, 8), showing P(F ≤ f) as the area to the left of f, and the critical value Fα(4, 8) cutting off right-tail area α.]
Table VII continued
Numerator Degrees of Freedom, r1
Den. d.f.
α P(F ≤ f ) r2 1 2 3 4 5 6 7 8 9 10
0.05 0.95 1 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 240.5 241.90.025 0.975 647.79 799.50 864.16 899.58 921.85 937.11 948.22 956.66 963.28 968.630.01 0.99 4052 4999.5 5403 5625 5764 5859 5928 5981 6022 6056
0.05 0.95 2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.400.025 0.975 38.51 39.00 39.17 39.25 39.30 39.33 39.36 39.37 39.39 39.400.01 0.99 98.50 99.00 99.17 99.25 99.30 99.33 99.36 99.37 99.39 99.40
0.05 0.95 3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.790.025 0.975 17.44 16.04 15.44 15.10 14.88 14.73 14.62 14.54 14.47 14.420.01 0.99 34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.35 27.23
0.05 0.95 4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.960.025 0.975 12.22 10.65 9.98 9.60 9.36 9.20 9.07 8.98 8.90 8.840.01 0.99 21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66 14.55
0.05 0.95 5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.740.025 0.975 10.01 8.43 7.76 7.39 7.15 6.98 6.85 6.76 6.68 6.620.01 0.99 16.26 13.27 12.06 11.39 10.97 10.67 10.46 10.29 10.16 10.05
0.05 0.95 6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.060.025 0.975 8.81 7.26 6.60 6.23 5.99 5.82 5.70 5.60 5.52 5.460.01 0.99 13.75 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87
0.05 0.95 7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.640.025 0.975 8.07 6.54 5.89 5.52 5.29 5.12 4.99 4.90 4.82 4.760.01 0.99 12.25 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72 6.62
0.05 0.95 8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.350.025 0.975 7.57 6.06 5.42 5.05 4.82 4.65 4.53 4.43 4.36 4.300.01 0.99 11.26 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91 5.81
0.05 0.95 9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.140.025 0.975 7.21 5.71 5.08 4.72 4.48 4.32 4.20 4.10 4.03 3.960.01 0.99 10.56 8.02 6.99 6.42 6.06 5.80 5.61 5.47 5.35 5.26
0.05 0.95 10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.980.025 0.975 6.94 5.46 4.83 4.47 4.24 4.07 3.95 3.85 3.78 3.720.01 0.99 10.04 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94 4.85
Numerator Degrees of Freedom, r1
Den. d.f.
α P(F ≤ f ) r2 1 2 3 4 5 6 7 8 9 10
0.05 0.95 12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.750.025 0.975 6.55 5.10 4.47 4.12 3.89 3.73 3.61 3.51 3.44 3.370.01 0.99 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39 4.30
0.05 0.95 15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.540.025 0.975 6.20 4.77 4.15 3.80 3.58 3.41 3.29 3.20 3.12 3.060.01 0.99 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80
0.05 0.95 20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.350.025 0.975 5.87 4.46 3.86 3.51 3.29 3.13 3.01 2.91 2.84 2.770.01 0.99 8.10 5.85 4.94 4.43 4.10 3.87 3.70 3.56 3.46 3.37
0.05 0.95 24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.250.025 0.975 5.72 4.32 3.72 3.38 3.15 2.99 2.87 2.78 2.70 2.640.01 0.99 7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.26 3.17
0.05 0.95 30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.160.025 0.975 5.57 4.18 3.59 3.25 3.03 2.87 2.75 2.65 2.57 2.510.01 0.99 7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.07 2.98
0.05 0.95 40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.080.025 0.975 5.42 4.05 3.46 3.13 2.90 2.74 2.62 2.53 2.45 2.390.01 0.99 7.31 5.18 4.31 3.83 3.51 3.29 3.12 2.99 2.89 2.80
0.05 0.95 60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.990.025 0.975 5.29 3.93 3.34 3.01 2.79 2.63 2.51 2.41 2.33 2.270.01 0.99 7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2.72 2.63
0.05 0.95 120 3.92 3.07 2.68 2.45 2.29 2.17 2.09 2.02 1.96 1.910.025 0.975 5.15 3.80 3.23 2.89 2.67 2.52 2.39 2.30 2.22 2.160.01 0.99 6.85 4.79 3.95 3.48 3.17 2.96 2.79 2.66 2.56 2.47
0.05 0.95 ∞ 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.830.025 0.975 5.02 3.69 3.12 2.79 2.57 2.41 2.29 2.19 2.11 2.050.01 0.99 6.63 4.61 3.78 3.32 3.02 2.80 2.64 2.51 2.41 2.32
Numerator Degrees of Freedom, r1
Den. d.f.
α P(F ≤ f ) r2 12 15 20 24 30 40 60 120 ∞
0.05 0.95 1 243.9 245.9 248.0 249.1 250.1 251.1 252.2 253.3 254.30.025 0.975 976.71 984.87 993.10 997.25 1001.4 1005.6 1009.8 1014.0 1018.30.01 0.99 6106 6157 6209 6235 6261 6287 6313 6339 6366
0.05 0.95 2 19.41 19.43 19.45 19.45 19.46 19.47 19.48 19.49 19.500.025 0.975 39.42 39.43 39.45 39.46 39.47 39.47 39.48 39.49 39.500.01 0.99 99.42 99.43 99.45 99.46 99.47 99.47 99.48 99.49 99.50
0.05 0.95 3 8.74 8.70 8.66 8.64 8.62 8.59 8.57 8.55 8.530.025 0.975 14.34 14.25 14.17 14.12 14.08 14.04 13.99 13.95 13.900.01 0.99 27.05 26.87 26.69 26.60 26.50 26.41 26.32 26.22 26.13
0.05 0.95 4 5.91 5.86 5.80 5.77 5.75 5.72 5.69 5.66 5.630.025 0.975 8.75 8.66 8.56 8.51 8.46 8.41 8.36 8.31 8.260.01 0.99 14.37 14.20 14.02 13.93 13.84 13.75 13.65 13.56 13.46
0.05 0.95 5 4.68 4.62 4.56 4.53 4.50 4.46 4.43 4.40 4.360.025 0.975 6.52 6.43 6.33 6.28 6.23 6.18 6.12 6.07 6.020.01 0.99 9.89 9.72 9.55 9.47 9.38 9.29 9.20 9.11 9.02
0.05 0.95 6 4.00 3.94 3.87 3.84 3.81 3.77 3.74 3.70 3.670.025 0.975 5.37 5.27 5.17 5.12 5.07 5.01 4.96 4.90 4.850.01 0.99 7.72 7.56 7.40 7.31 7.23 7.14 7.06 6.97 6.88
0.05 0.95 7 3.57 3.51 3.44 3.41 3.38 3.34 3.30 3.27 3.23
0.025 0.975 4.67 4.57 4.47 4.42 4.36 4.31 4.25 4.20 4.14
0.01 0.99 6.47 6.31 6.16 6.07 5.99 5.91 5.82 5.74 5.65
0.05 0.95 8 3.28 3.22 3.15 3.12 3.08 3.04 3.01 2.97 2.930.025 0.975 4.20 4.10 4.00 3.95 3.89 3.84 3.78 3.73 3.670.01 0.99 5.67 5.52 5.36 5.28 5.20 5.12 5.03 4.95 4.86
0.05 0.95 9 3.07 3.01 2.94 2.90 2.86 2.83 2.79 2.75 2.710.025 0.975 3.87 3.77 3.67 3.61 3.56 3.51 3.45 3.39 3.330.01 0.99 5.11 4.96 4.81 4.73 4.65 4.57 4.48 4.40 4.31
Numerator Degrees of Freedom, r1
Den. d.f.
α P(F ≤ f ) r2 12 15 20 24 30 40 60 120 ∞
0.05   0.95   10    2.91  2.85  2.77  2.74  2.70  2.66  2.62  2.58  2.54
0.025  0.975        3.62  3.52  3.42  3.37  3.31  3.26  3.20  3.14  3.08
0.01   0.99         4.71  4.56  4.41  4.33  4.25  4.17  4.08  4.00  3.91

0.05   0.95   12    2.69  2.62  2.54  2.51  2.47  2.43  2.38  2.34  2.30
0.025  0.975        3.28  3.18  3.07  3.02  2.96  2.91  2.85  2.79  2.72
0.01   0.99         4.16  4.01  3.86  3.78  3.70  3.62  3.54  3.45  3.36

0.05   0.95   15    2.48  2.40  2.33  2.29  2.25  2.20  2.16  2.11  2.07
0.025  0.975        2.96  2.86  2.76  2.70  2.64  2.59  2.52  2.46  2.40
0.01   0.99         3.67  3.52  3.37  3.29  3.21  3.13  3.05  2.96  2.87

0.05   0.95   20    2.28  2.20  2.12  2.08  2.04  1.99  1.95  1.90  1.84
0.025  0.975        2.68  2.57  2.46  2.41  2.35  2.29  2.22  2.16  2.09
0.01   0.99         3.23  3.09  2.94  2.86  2.78  2.69  2.61  2.52  2.42

0.05   0.95   24    2.18  2.11  2.03  1.98  1.94  1.89  1.84  1.79  1.73
0.025  0.975        2.54  2.44  2.33  2.27  2.21  2.15  2.08  2.01  1.94
0.01   0.99         3.03  2.89  2.74  2.66  2.58  2.49  2.40  2.31  2.21

0.05   0.95   30    2.09  2.01  1.93  1.89  1.84  1.79  1.74  1.68  1.62
0.025  0.975        2.41  2.31  2.20  2.14  2.07  2.01  1.94  1.87  1.79
0.01   0.99         2.84  2.70  2.55  2.47  2.39  2.30  2.21  2.11  2.01

0.05   0.95   40    2.00  1.92  1.84  1.79  1.74  1.69  1.64  1.58  1.51
0.025  0.975        2.29  2.18  2.07  2.01  1.94  1.88  1.80  1.72  1.64
0.01   0.99         2.66  2.52  2.37  2.29  2.20  2.11  2.02  1.92  1.80

0.05   0.95   60    1.92  1.84  1.75  1.70  1.65  1.59  1.53  1.47  1.39
0.025  0.975        2.17  2.06  1.94  1.88  1.82  1.74  1.67  1.58  1.48
0.01   0.99         2.50  2.35  2.20  2.12  2.03  1.94  1.84  1.73  1.60

0.05   0.95   120   1.83  1.75  1.66  1.61  1.55  1.50  1.43  1.35  1.25
0.025  0.975        2.05  1.95  1.82  1.76  1.69  1.61  1.53  1.43  1.31
0.01   0.99         2.34  2.19  2.03  1.95  1.86  1.76  1.66  1.53  1.38

0.05   0.95   ∞     1.75  1.67  1.57  1.52  1.46  1.39  1.32  1.22  1.00
0.025  0.975        1.94  1.83  1.71  1.64  1.57  1.48  1.39  1.27  1.00
0.01   0.99         2.18  2.04  1.88  1.79  1.70  1.59  1.47  1.32  1.00
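The critical values of Table VII can be reproduced numerically. A minimal sketch assuming SciPy is available (its `f.ppf(q, dfn, dfd)` returns the quantile of the F distribution with `dfn` numerator and `dfd` denominator degrees of freedom):

```python
# Cross-check two Table VII entries for r1 = 12, r2 = 10 against SciPy.
from scipy.stats import f

# Upper critical value f_alpha(r1, r2) satisfies P(F <= f) = 1 - alpha.
print(f.ppf(0.95, 12, 10))   # cf. the tabled value 2.91
print(f.ppf(0.99, 12, 10))   # cf. the tabled value 4.71
```

The table entries are these quantiles rounded to two decimal places.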
Table VIII Random Numbers on the Interval (0, 1)
3407 1440 6960 8675 5649 5793 1514
5044 9859 4658 7779 7986 0520 6697
0045 4999 4930 7408 7551 3124 0527
7536 1448 7843 4801 3147 3071 4749
7653 4231 1233 4409 0609 6448 2900

6157 1144 4779 0951 3757 9562 2354
6593 8668 4871 0946 3155 3941 9662
3187 7434 0315 4418 1569 1101 0043
4780 1071 6814 2733 7968 8541 1003
9414 6170 2581 1398 2429 4763 9192

1948 2360 7244 9682 5418 0596 4971
1843 0914 9705 7861 6861 7865 7293
4944 8903 0460 0188 0530 7790 9118
3882 3195 8287 3298 9532 9066 8225
6596 9009 2055 4081 4842 7852 5915

4793 2503 2906 6807 2028 1075 7175
2112 0232 5334 1443 7306 6418 9639
0743 1083 8071 9779 5973 1141 4393
8856 5352 3384 8891 9189 1680 3192
8027 4975 2346 5786 0693 5615 2047

3134 1688 4071 3766 0570 2142 3492
0633 9002 1305 2256 5956 9256 8979
8771 6069 1598 4275 6017 5946 8189
2672 1304 2186 8279 2430 4896 3698
3136 1916 8886 8617 9312 5070 2720

6490 7491 6562 5355 3794 3555 7510
8628 0501 4618 3364 6709 1289 0543
9270 0504 5018 7013 4423 2147 4089
5723 3807 4997 4699 2231 3193 8130
6228 8874 7271 2621 5746 6333 0345

7645 3379 8376 3030 0351 8290 3640
6842 5836 6203 6171 2698 4086 5469
6126 7792 9337 7773 7286 4236 1788
4956 0215 3468 8038 6144 9753 3131
1327 4736 6229 8965 7215 6458 3937

9188 1516 5279 5433 2254 5768 8718
0271 9627 9442 9217 4656 7603 8826
2127 1847 1331 5122 8332 8195 3322
2102 9201 2911 7318 7670 6079 2676
1706 6011 5280 5552 5180 4630 4747

7501 7635 2301 0889 6955 8113 4364
5705 1900 7144 8707 9065 8163 9846
3234 2599 3295 9160 8441 0085 9317
5641 4935 7971 8917 1978 5649 5799
2127 1868 3664 9376 1984 6315 8396
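In practice a table like this is generated by a pseudo-random number generator. A minimal sketch (not the generator used for Table VIII; the seed is arbitrary and chosen only for reproducibility):

```python
# Draw pseudo-random numbers on (0, 1) and print one row of seven
# four-digit groups in the style of Table VIII.
import random

random.seed(3407)  # arbitrary seed, for reproducibility only
row = ["%04d" % int(random.random() * 10000) for _ in range(7)]
print(" ".join(row))
```

Each group is the first four decimal digits of a uniform (0, 1) draw.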
Table IX Distribution Function of the Correlation Coefficient R, ρ = 0
[Figures: the p.d.f. of R for ν = 15 d.f., illustrating the distribution function P(R ≤ r) and the upper critical value rα(ν) with right-tail probability α]
P(R ≤ r) = ∫₋₁^r { Γ[(n − 1)/2] / ( Γ(1/2) Γ[(n − 2)/2] ) } (1 − w²)^((n−4)/2) dw
P(R ≤ r):
ν = n − 2            0.95        0.975        0.99        0.995
degrees of freedom   r0.05(ν)    r0.025(ν)    r0.01(ν)    r0.005(ν)
  1   0.9877   0.9969   0.9995   0.9999
  2   0.9000   0.9500   0.9800   0.9900
  3   0.8053   0.8783   0.9343   0.9587
  4   0.7292   0.8113   0.8822   0.9172
  5   0.6694   0.7544   0.8329   0.8745

  6   0.6215   0.7067   0.7887   0.8343
  7   0.5822   0.6664   0.7497   0.7977
  8   0.5493   0.6319   0.7154   0.7646
  9   0.5214   0.6020   0.6850   0.7348
 10   0.4972   0.5759   0.6581   0.7079

 11   0.4761   0.5529   0.6338   0.6835
 12   0.4575   0.5323   0.6120   0.6613
 13   0.4408   0.5139   0.5922   0.6411
 14   0.4258   0.4973   0.5742   0.6226
 15   0.4123   0.4821   0.5577   0.6054

 16   0.4000   0.4683   0.5425   0.5897
 17   0.3887   0.4555   0.5285   0.5750
 18   0.3783   0.4437   0.5154   0.5614
 19   0.3687   0.4328   0.5033   0.5487
 20   0.3597   0.4226   0.4920   0.5367

 25   0.3232   0.3808   0.4450   0.4869
 30   0.2959   0.3494   0.4092   0.4487
 35   0.2746   0.3246   0.3809   0.4182
 40   0.2572   0.3044   0.3578   0.3931
 45   0.2428   0.2875   0.3383   0.3721

 50   0.2306   0.2732   0.3218   0.3541
 60   0.2108   0.2500   0.2948   0.3248
 70   0.1954   0.2318   0.2736   0.3017
 80   0.1829   0.2172   0.2565   0.2829
 90   0.1725   0.2049   0.2422   0.2673

100   0.1638   0.1946   0.2300   0.2540
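These critical values can be recovered from the t distribution: when ρ = 0, the statistic R√ν / √(1 − R²) has a t distribution with ν = n − 2 degrees of freedom, so rα(ν) = tα(ν) / √(ν + tα(ν)²). A sketch assuming SciPy is available:

```python
# Recover Table IX critical values of R from t critical values.
from math import sqrt
from scipy.stats import t

def r_crit(alpha, nu):
    """Upper critical value r_alpha(nu) for R when rho = 0."""
    ta = t.ppf(1 - alpha, nu)          # t_alpha(nu)
    return ta / sqrt(nu + ta * ta)

print(r_crit(0.05, 10))   # approximately 0.497; cf. the nu = 10 row, 0.4972
print(r_crit(0.05, 5))    # approximately 0.669; cf. the nu = 5 row, 0.6694
```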
Table X Discrete Distributions
For each distribution: parameter values, probability mass function f(x), moment-generating function M(t), mean E(X), variance Var(X), and typical examples.

Bernoulli
0 < p < 1, q = 1 − p
f(x) = p^x q^(1−x), x = 0, 1
M(t) = q + pe^t, −∞ < t < ∞
E(X) = p, Var(X) = pq
Example: an experiment with two possible outcomes, say success and failure, with p = P(success).

Binomial
n = 1, 2, 3, . . . ; 0 < p < 1; q = 1 − p
f(x) = C(n, x) p^x q^(n−x), x = 0, 1, . . . , n, where C(a, b) denotes the binomial coefficient
M(t) = (q + pe^t)^n, −∞ < t < ∞
E(X) = np, Var(X) = npq
Example: the number of successes in a sequence of n Bernoulli trials, p = P(success).

Geometric
0 < p < 1, q = 1 − p
f(x) = q^(x−1) p, x = 1, 2, . . .
M(t) = pe^t/(1 − qe^t), t < −ln(1 − p)
E(X) = 1/p, Var(X) = q/p²
Example: the number of trials to obtain the first success in a sequence of Bernoulli trials.

Hypergeometric
N1 > 0, N2 > 0, N = N1 + N2
f(x) = C(N1, x) C(N2, n − x)/C(N, n), x ≤ n, x ≤ N1, n − x ≤ N2
E(X) = n(N1/N), Var(X) = n(N1/N)(N2/N)(N − n)/(N − 1)
Example: selecting n objects at random without replacement from a set composed of two types of objects.

Negative Binomial
r = 1, 2, 3, . . . ; 0 < p < 1; q = 1 − p
f(x) = C(x − 1, r − 1) p^r q^(x−r), x = r, r + 1, . . .
M(t) = (pe^t)^r/(1 − qe^t)^r, t < −ln(1 − p)
E(X) = r/p, Var(X) = rq/p²
Example: the number of trials to obtain the rth success in a sequence of Bernoulli trials.

Poisson
λ > 0
f(x) = λ^x e^(−λ)/x!, x = 0, 1, 2, . . .
M(t) = e^(λ(e^t − 1)), −∞ < t < ∞
E(X) = λ, Var(X) = λ
Example: the number of events occurring in a unit interval, when events occur randomly at a mean rate of λ per unit interval.

Uniform
m > 0
f(x) = 1/m, x = 1, 2, . . . , m
E(X) = (m + 1)/2, Var(X) = (m² − 1)/12
Example: select an integer at random from 1, 2, . . . , m.
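Several of the Table X mean and variance formulas can be cross-checked numerically. A sketch assuming SciPy is available and uses the same parameterizations (its geometric distribution is also supported on x = 1, 2, . . .):

```python
# Verify mean/variance formulas for a few Table X distributions.
from scipy.stats import binom, geom, poisson

n, p, lam = 10, 0.3, 4.0
q = 1 - p
assert abs(binom.mean(n, p) - n * p) < 1e-9        # mu = np
assert abs(binom.var(n, p) - n * p * q) < 1e-9     # sigma^2 = npq
assert abs(geom.mean(p) - 1 / p) < 1e-9            # mu = 1/p
assert abs(geom.var(p) - q / p**2) < 1e-9          # sigma^2 = q/p^2
assert abs(poisson.mean(lam) - lam) < 1e-9         # mu = lambda
assert abs(poisson.var(lam) - lam) < 1e-9          # sigma^2 = lambda
print("Table X moment formulas confirmed")
```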
Table XI Continuous Distributions
For each distribution: parameter values, probability density function f(x), moment-generating function M(t), mean E(X), variance Var(X), and typical examples.

Beta
α > 0, β > 0
f(x) = [Γ(α + β)/(Γ(α)Γ(β))] x^(α−1) (1 − x)^(β−1), 0 < x < 1
E(X) = α/(α + β), Var(X) = αβ/[(α + β + 1)(α + β)²]
Example: X = X1/(X1 + X2), where X1 and X2 have independent gamma distributions with the same θ.

Chi-square χ²(r)
r = 1, 2, . . .
f(x) = x^(r/2 − 1) e^(−x/2)/[Γ(r/2) 2^(r/2)], 0 < x < ∞
M(t) = 1/(1 − 2t)^(r/2), t < 1/2
E(X) = r, Var(X) = 2r
Example: a gamma distribution with θ = 2, α = r/2; the sum of squares of r independent N(0, 1) random variables.

Exponential
θ > 0
f(x) = (1/θ) e^(−x/θ), 0 ≤ x < ∞
M(t) = 1/(1 − θt), t < 1/θ
E(X) = θ, Var(X) = θ²
Example: the waiting time to the first arrival when observing a Poisson process with a mean rate of arrivals equal to λ = 1/θ.

Gamma
α > 0, θ > 0
f(x) = x^(α−1) e^(−x/θ)/[Γ(α) θ^α], 0 < x < ∞
M(t) = 1/(1 − θt)^α, t < 1/θ
E(X) = αθ, Var(X) = αθ²
Example: the waiting time to the αth arrival when observing a Poisson process with a mean rate of arrivals equal to λ = 1/θ.

Normal N(μ, σ²)
−∞ < μ < ∞, σ > 0
f(x) = [1/(σ√(2π))] e^(−(x−μ)²/2σ²), −∞ < x < ∞
M(t) = e^(μt + σ²t²/2), −∞ < t < ∞
E(X) = μ, Var(X) = σ²
Example: errors in measurements; heights of children; breaking strengths.

Uniform U(a, b)
−∞ < a < b < ∞
f(x) = 1/(b − a), a ≤ x ≤ b
M(t) = (e^(tb) − e^(ta))/[t(b − a)], t ≠ 0; M(0) = 1
E(X) = (a + b)/2, Var(X) = (b − a)²/12
Example: select a point at random from the interval [a, b].
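As with Table X, the continuous mean and variance formulas can be cross-checked. A sketch assuming SciPy is available and that θ corresponds to SciPy's `scale` parameter for the exponential and gamma distributions:

```python
# Verify mean/variance formulas for a few Table XI distributions.
from scipy.stats import beta, expon, gamma

al, be = 2.0, 5.0        # beta shape parameters alpha, beta
a, th = 3.0, 2.0         # gamma shape alpha and scale theta
assert abs(beta.mean(al, be) - al / (al + be)) < 1e-9
assert abs(beta.var(al, be) - al * be / ((al + be + 1) * (al + be) ** 2)) < 1e-9
assert abs(expon.mean(scale=th) - th) < 1e-9           # mu = theta
assert abs(expon.var(scale=th) - th**2) < 1e-9         # sigma^2 = theta^2
assert abs(gamma.mean(a, scale=th) - a * th) < 1e-9    # mu = alpha*theta
assert abs(gamma.var(a, scale=th) - a * th**2) < 1e-9  # sigma^2 = alpha*theta^2
print("Table XI moment formulas confirmed")
```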
Table XII Tests and Confidence Intervals
For each case: the distribution, θ (the parameter of interest), W (the variable used to test H0: θ = θ0), a two-sided 1 − α confidence interval for θ, and comments.
N(μ, σ²) or n large; σ² known
θ = μ
W = (X̄ − θ0)/(σ/√n)
CI: x̄ ± zα/2 σ/√n
W is N(0, 1); P(W ≥ zα/2) = α/2.

N(μ, σ²); σ² unknown
θ = μ
W = (X̄ − θ0)/(S/√n)
CI: x̄ ± tα/2(n−1) s/√n
W has a t distribution with n − 1 degrees of freedom; P[W ≥ tα/2(n−1)] = α/2.

Any distribution with known variance σ²
θ = μ
W = (X̄ − θ0)/(σ/√n)
CI: x̄ ± zα/2 σ/√n
W has an approximate N(0, 1) distribution for n sufficiently large.

N(μX, σ²X) and N(μY, σ²Y); σ²X, σ²Y known
θ = μX − μY
W = (X̄ − Ȳ − θ0)/√(σ²X/n + σ²Y/m)
CI: x̄ − ȳ ± zα/2 √(σ²X/n + σ²Y/m)
W is N(0, 1).

N(μX, σ²X) and N(μY, σ²Y); σ²X, σ²Y unknown
θ = μX − μY
W = (X̄ − Ȳ − θ0)/√(S²X/n + S²Y/m)
CI: x̄ − ȳ ± zα/2 √(s²x/n + s²y/m)
W is approximately N(0, 1) if the sample sizes are large.

N(μX, σ²X) and N(μY, σ²Y); σ²X = σ²Y, unknown
θ = μX − μY
W = (X̄ − Ȳ − θ0)/√{ [(n − 1)S²X + (m − 1)S²Y]/(n + m − 2) · (1/n + 1/m) }
CI: x̄ − ȳ ± tα/2(n+m−2) sp √(1/n + 1/m), where sp = √{ [(n − 1)s²x + (m − 1)s²y]/(n + m − 2) }
W has a t distribution with r = n + m − 2 degrees of freedom.

D = X − Y is N(μX − μY, σ²D); X and Y dependent
θ = μX − μY
W = (D̄ − θ0)/(SD/√n)
CI: d̄ ± tα/2(n−1) sd/√n
W has a t distribution with n − 1 degrees of freedom.
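The pooled-variance interval for μX − μY above can be computed directly. A minimal sketch assuming SciPy is available; the data are made-up illustration values, not taken from the text:

```python
# Two-sided 95% pooled-t confidence interval for mu_X - mu_Y,
# assuming equal variances (Table XII, pooled-variance row).
from math import sqrt
from statistics import mean, variance   # variance uses the n - 1 divisor
from scipy.stats import t

x = [4.2, 4.7, 4.6, 4.9, 4.4]           # hypothetical sample from X
y = [3.9, 4.1, 4.3, 4.0]                # hypothetical sample from Y
n, m = len(x), len(y)
sp = sqrt(((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2))
half = t.ppf(0.975, n + m - 2) * sp * sqrt(1 / n + 1 / m)   # alpha = 0.05
diff = mean(x) - mean(y)
print(diff - half, diff + half)         # the two confidence limits
```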
Table XII continued
For each case: the distribution, θ (the parameter of interest), W (the variable used to test H0: θ = θ0), a two-sided 1 − α confidence interval for θ, and comments.
N(μ, σ²); μ unknown
θ = σ²
W = (n − 1)S²/θ0
CI: [ (n − 1)s²/χ²α/2(n−1), (n − 1)s²/χ²1−α/2(n−1) ]
W is χ²(n − 1); P[W ≤ χ²1−α/2(n−1)] = α/2 and P[W ≥ χ²α/2(n−1)] = α/2.

N(μ, σ²); μ unknown
θ = σ
W = (n − 1)S²/θ0²
CI: [ √((n − 1)s²/χ²α/2(n−1)), √((n − 1)s²/χ²1−α/2(n−1)) ]
W is χ²(n − 1); P[W ≤ χ²1−α/2(n−1)] = α/2 and P[W ≥ χ²α/2(n−1)] = α/2.

N(μX, σ²X) and N(μY, σ²Y); μX, μY unknown
θ = σ²X/σ²Y
W = (S²Y/S²X) θ0
CI: [ (s²x/s²y)/Fα/2(n−1, m−1), Fα/2(m−1, n−1)(s²x/s²y) ]
W has an F distribution with m − 1 and n − 1 degrees of freedom.

b(n, p)
θ = p
W = (Y/n − θ0)/√[ (Y/n)(1 − Y/n)/n ]
CI: y/n ± zα/2 √[ (y/n)(1 − y/n)/n ]
W is approximately N(0, 1) for n sufficiently large.

b(n, p)
θ = p
CI: p̃ ± zα/2 √( p̃(1 − p̃)/(n + 4) ), where p̃ = (y + 2)/(n + 4)
W is approximately N(0, 1) for n sufficiently large.

b(n1, p1) and b(n2, p2)
θ = p1 − p2
W = (Y1/n1 − Y2/n2 − θ0)/√{ [(Y1 + Y2)/(n1 + n2)][1 − (Y1 + Y2)/(n1 + n2)](1/n1 + 1/n2) }
CI: y1/n1 − y2/n2 ± zα/2 √[ (y1/n1)(1 − y1/n1)/n1 + (y2/n2)(1 − y2/n2)/n2 ]
W is approximately N(0, 1) when n1 and n2 are sufficiently large.
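The last interval, for p1 − p2, can be sketched as follows, assuming SciPy is available; the counts y1 and y2 are made up purely for illustration:

```python
# Approximate 95% confidence interval for p1 - p2 (last row of Table XII).
from math import sqrt
from scipy.stats import norm

y1, n1, y2, n2 = 45, 100, 30, 90        # hypothetical success counts and sizes
p1, p2 = y1 / n1, y2 / n2
z = norm.ppf(0.975)                     # z_{alpha/2} with alpha = 0.05
half = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
print(p1 - p2 - half, p1 - p2 + half)   # the two confidence limits
```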