5/20/2013
Chapter 17
Inferences:
Continuous Response
Introduction
• This chapter covers random sampling evaluations from
a population that has a continuous response.
• An example of a continuous response is the amount of
tire tread that exists after 40,000 km of automobile
usage. One tire might, for example, have 6.0 mm of
remaining tread while another tire might measure 5.5
mm.
• In this chapter the estimation of population mean and
standard deviation from sampled data is discussed in
conjunction with probability plotting.
17.1 Summarizing Sample Data
• The classical analysis of sampled data taken from a
continuous response population has focused on determining
a sample mean (x̄) and standard deviation (𝑠), along with
perhaps confidence interval statements.
• Other information with probabilities may be more meaningful.
• For example, an experiment might be able to indicate that 90% of
the automobiles using a certain type of tire will have at least 4.9
mm of tire tread after 40,000 km. Such a statement can be
more informative than a statement that only relates to the
mean tire tread after 40,000 km.
17.2 Sample Size:
Hypothesis Test of a Mean
• Given the producer’s risk 𝛼, consumer’s risk 𝛽, and the
acceptable amount of uncertainty 𝛿, Diamond’s (1989)
equation can be used to determine the sample size (n)
necessary to evaluate a hypothesis test criterion:
$$n = \frac{(U_\alpha + U_\beta)^2 \sigma^2}{\delta^2}$$
• 𝑈𝛽 is determined from the single-sided Table B in Appendix E.
• If the alternative hypothesis is single-sided (𝜇 < criterion), 𝑈𝛼 is also determined from Table B, but if the alternative
hypothesis is double-sided (𝜇 ≠ criterion), 𝑈𝛼 is determined
from Table C.
• If 𝜎 is unknown, 𝛿 can be expressed in terms of 𝜎.
• If the standard deviation is not known, the sample size
should be adjusted using (Diamond 1989):
$$n = \frac{(t_\alpha + t_\beta)^2 s^2}{\delta^2}$$
• 𝑡𝛽 is determined from the single-sided Table D in Appendix E.
• If the alternative hypothesis is single-sided (μ < criterion),
𝑡𝛼 is also determined from Table D, but if the alternative
hypothesis is double-sided (μ ≠ criterion), 𝑡𝛼 is determined
from Table E.
17.3 Example 17.1
Sample Size Determination for a Mean
Criterion Test
• A stereo amplifier output level is desired to be on average
at least 100 watts (W) per channel. Determine the sample
size that is needed to verify this criterion given the following:
𝛼 = 0.1, which from Table B yields 𝑈𝛼 = 1.282
𝛽 = 0.05, which from Table B yields 𝑈𝛽 = 1.645
𝛿 = 0.5𝜎
$$n = \frac{(U_\alpha + U_\beta)^2 \sigma^2}{\delta^2} = \frac{(1.282 + 1.645)^2 \sigma^2}{(0.5\sigma)^2} = 34.26, \text{ rounded up to } 35$$
• If the standard deviation is not known, this sample size needs to
be adjusted. Given that the number of degrees of freedom for the
t-table value equals 34 (i.e., 35 – 1), interpolation in Table D
yields $t_{0.1;34} = 1.307$ and $t_{0.05;34} = 1.692$.
$$n = \frac{(t_\alpha + t_\beta)^2 s^2}{\delta^2} = \frac{(1.307 + 1.692)^2 s^2}{(0.5s)^2} = 35.95, \text{ rounded up to } 36$$
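The two sample-size calculations in Example 17.1 can be sketched in a few lines of Python. This is only an illustration: the function name `sample_size` is hypothetical, the normal quantiles come from the standard library instead of Table B, and the t values are simply the ones quoted in the example.

```python
import math
from statistics import NormalDist

def sample_size(u_alpha, u_beta, sigma_over_delta):
    """Diamond's equation n = (U_alpha + U_beta)^2 * (sigma/delta)^2, before rounding."""
    return (u_alpha + u_beta) ** 2 * sigma_over_delta ** 2

# Sigma known: single-sided normal quantiles stand in for the Table B lookups.
z = NormalDist()
u_alpha = z.inv_cdf(1 - 0.10)   # alpha = 0.10
u_beta = z.inv_cdf(1 - 0.05)    # beta = 0.05
n_z = sample_size(u_alpha, u_beta, sigma_over_delta=2.0)  # delta = 0.5 sigma

# Sigma unknown: t values quoted in the example for 34 degrees of freedom.
n_t = sample_size(1.307, 1.692, sigma_over_delta=2.0)     # delta = 0.5 s

# The sample size is always rounded up to the next whole unit.
print(round(n_z, 2), math.ceil(n_z))
print(round(n_t, 2), math.ceil(n_t))
```

Because 𝛿 is expressed as a fraction of 𝜎 (or 𝑠), only the ratio 𝜎/𝛿 = 2 enters the calculation.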
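17.4 Confidence Intervals on the Mean
and Hypothesis Test Criteria Alternatives:
Table 17.1
σ known, single-sided (𝑈𝛼 from Table B):
$$\mu \le \bar{x} + \frac{U_\alpha \sigma}{\sqrt{n}} \quad \text{or} \quad \mu \ge \bar{x} - \frac{U_\alpha \sigma}{\sqrt{n}}$$
σ known, double-sided (𝑈𝛼 from Table C):
$$\bar{x} - \frac{U_\alpha \sigma}{\sqrt{n}} \le \mu \le \bar{x} + \frac{U_\alpha \sigma}{\sqrt{n}}$$
σ unknown, single-sided (𝑡𝛼 from Table D, 𝜈 = 𝑛 − 1):
$$\mu \le \bar{x} + \frac{t_\alpha s}{\sqrt{n}} \quad \text{or} \quad \mu \ge \bar{x} - \frac{t_\alpha s}{\sqrt{n}}$$
σ unknown, double-sided (𝑡𝛼 from Table E, 𝜈 = 𝑛 − 1):
$$\bar{x} - \frac{t_\alpha s}{\sqrt{n}} \le \mu \le \bar{x} + \frac{t_\alpha s}{\sqrt{n}}$$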
• If the sample size is calculated before conducting the
experiment using desired values of 𝛼, 𝛽, and 𝛿, the null
hypothesis is not rejected if the criterion is contained within
the appropriate confidence interval for 𝜇. This decision is
made with the 𝛽 risk of error that was used in calculating
the sample size.
• However, if the criterion is not contained within the interval,
the null hypothesis is rejected. This decision is made with
the 𝛼 risk of error.
• Other methods can be used when setting up a hypothesis
test criterion. Consider 𝐻𝑎: 𝜇 > 𝜇𝑎, where 𝜇𝑎 is a product
specification criterion. It can be determined that
$$\bar{x}_{criterion} = \mu_a + \frac{t_\alpha s}{\sqrt{n}}$$
When x̄ is greater than the test x̄_criterion, 𝐻0 is rejected.
When x̄ is less than x̄_criterion, 𝐻0 is not rejected.
• An alternative approach to this problem is to use the equation
$$t_0 = \frac{\bar{x} - \mu_a}{s/\sqrt{n}}$$
where the null hypothesis is rejected if 𝑡0 > 𝑡𝛼.
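The two decision rules above are algebraically equivalent, which a short Python sketch can make concrete. The sample summary (x̄ = 5.77, 𝑠 = 2.41, 𝑛 = 16) is borrowed from Example 17.2; the criterion 𝜇𝑎 = 5.0 is hypothetical, and 𝑡𝛼 = 1.341 is assumed as the standard single-sided table value for 𝛼 = 0.10 with 15 degrees of freedom.

```python
import math

def t_statistic(x_bar, mu_a, s, n):
    """t0 = (x_bar - mu_a) / (s / sqrt(n))."""
    return (x_bar - mu_a) / (s / math.sqrt(n))

def x_bar_criterion(mu_a, t_alpha, s, n):
    """Sample mean above which H0 is rejected in favor of Ha: mu > mu_a."""
    return mu_a + t_alpha * s / math.sqrt(n)

x_bar, s, n = 5.77, 2.41, 16   # sample summary from Example 17.2
mu_a = 5.0                     # hypothetical specification criterion
t_alpha = 1.341                # assumed table value: alpha = 0.10, 15 df, single-sided

t0 = t_statistic(x_bar, mu_a, s, n)
criterion = x_bar_criterion(mu_a, t_alpha, s, n)

reject_by_t = t0 > t_alpha          # rule based on the t statistic
reject_by_mean = x_bar > criterion  # rule based on the sample-mean criterion

print(round(t0, 3), round(criterion, 3), reject_by_t, reject_by_mean)
```

Both rules give the same decision for any input, since one inequality is a rearrangement of the other.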
17.5 Example 17.2:
Confidence Intervals on the Mean:
𝜎 is known
• x̄ = 5.77, 𝜎 = 2, 𝑛 = 16, 𝛼 = 0.10
Minitab: Stat > Basic Statistics > 1-Sample Z-test

One-Sample Z
The assumed standard deviation = 2

                      90% Upper
 N   Mean  SE Mean        Bound
16  5.770    0.500        6.411

                      90% Lower
 N   Mean  SE Mean        Bound
16  5.770    0.500        5.129

 N   Mean  SE Mean          90% CI
16  5.770    0.500  (4.948, 6.592)
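As a cross-check on the Minitab output for the 𝜎-known case, the same bounds can be computed from the Table 17.1 formulas. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x_bar, sigma, n, alpha = 5.77, 2.0, 16, 0.10

se = sigma / math.sqrt(n)               # standard error of the mean
z = NormalDist()
u_single = z.inv_cdf(1 - alpha)         # single-sided quantile (Table B)
u_double = z.inv_cdf(1 - alpha / 2)     # double-sided quantile (Table C)

upper_bound = x_bar + u_single * se     # 90% upper bound on mu
lower_bound = x_bar - u_single * se     # 90% lower bound on mu
ci = (x_bar - u_double * se, x_bar + u_double * se)  # 90% double-sided interval

print(round(se, 3), round(upper_bound, 3), round(lower_bound, 3))
print(tuple(round(v, 3) for v in ci))
```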
17.5 Example 17.2:
Confidence Intervals on the Mean:
𝜎 is unknown
• x̄ = 5.77, 𝑠 = 2.41, 𝑛 = 16, 𝛼 = 0.10
Minitab: Stat > Basic Statistics > 1-Sample t-test

One-Sample T

                             90% Upper
 N   Mean  StDev  SE Mean        Bound
16  5.770  2.410    0.603        6.578

                             90% Lower
 N   Mean  StDev  SE Mean        Bound
16  5.770  2.410    0.603        4.962

 N   Mean  StDev  SE Mean          90% CI
16  5.770  2.410    0.603  (4.714, 6.826)
17.6 Example 17.3:
Sample Size -- An Alternative Approach
• An alternative approach to determining the sample size is to fix
the width (𝑤) of the margin of error in the confidence interval:
$$w = \frac{U_\alpha \sigma}{\sqrt{n}}, \quad \text{so} \quad n = \frac{U_\alpha^2 \sigma^2}{w^2}$$
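A short sketch of this margin-of-error approach follows. The inputs (𝜎 = 2, 𝑤 = 0.5, 90% double-sided confidence) are hypothetical, the function name is illustrative, and the normal quantile is taken from the Python standard library in place of a table lookup.

```python
import math
from statistics import NormalDist

def n_for_margin(sigma, w, confidence=0.90):
    """Smallest n with U * sigma / sqrt(n) <= w for a double-sided interval."""
    u = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((u * sigma / w) ** 2)

sigma, w = 2.0, 0.5            # hypothetical standard deviation and margin of error
n = n_for_margin(sigma, w)

# Margin actually achieved with the rounded-up sample size.
u = NormalDist().inv_cdf(0.95)
achieved = u * sigma / math.sqrt(n)
print(n, round(achieved, 3))
```

Rounding up guarantees that the achieved margin is no larger than the requested one.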
17.7 Standard Deviation
Confidence Interval
• When a sample of size n is taken from a population that is
normally distributed, the double-sided confidence interval
equation for the population’s standard deviation is
$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2;\nu}}\right]^{1/2} \le \sigma \le \left[\frac{(n-1)s^2}{\chi^2_{1-\alpha/2;\nu}}\right]^{1/2}$$
• where 𝑠 is the standard deviation of the sample and the 𝜒2 values
are taken from Table G with 𝛼/2 risk and 𝜈 degrees of
freedom equal to the sample size minus 1.
• This relationship is not robust for data from a non-normal
distribution.
17.8 Example 17.4: Standard
Deviation Confidence statement
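• Consider again the 16 data points from sample 1 of Table 3.3,
which had a mean of 5.77 and a standard deviation of 2.41. Given
that the population standard deviation was not known, the 90% confidence
interval for the standard deviation of the population is
$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2;\nu}}\right]^{1/2} \le \sigma \le \left[\frac{(n-1)s^2}{\chi^2_{1-\alpha/2;\nu}}\right]^{1/2}$$
$$\left[\frac{(16-1)(2.41)^2}{\chi^2_{0.1/2;15}}\right]^{1/2} \le \sigma \le \left[\frac{(16-1)(2.41)^2}{\chi^2_{1-0.1/2;15}}\right]^{1/2}$$
$$1.87 \le \sigma \le 3.46$$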
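The arithmetic behind this interval can be sketched as follows. The chi-square values are not given on the slide; 24.996 and 7.261 are assumed here as the standard table values for upper-tail areas 0.05 and 0.95 with 15 degrees of freedom, and the function name is illustrative.

```python
import math

def sigma_interval(s, n, chi2_alpha_half, chi2_one_minus_alpha_half):
    """Double-sided confidence interval on sigma for a normal population."""
    nu = n - 1
    lower = math.sqrt(nu * s ** 2 / chi2_alpha_half)
    upper = math.sqrt(nu * s ** 2 / chi2_one_minus_alpha_half)
    return lower, upper

# Assumed chi-square table values for nu = 15 (upper-tail areas 0.05 and 0.95).
CHI2_005_15 = 24.996
CHI2_095_15 = 7.261

lower, upper = sigma_interval(2.41, 16, CHI2_005_15, CHI2_095_15)
print(round(lower, 2), round(upper, 2))
```

The larger chi-square value produces the lower limit, which is why the interval is not symmetric about 𝑠 = 2.41.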
17.9 Percentage of the Population
Assessments: Prediction Intervals, Tolerance
Intervals, and Probability Plots
• A confidence interval for a parameter is an interval that is
likely to contain the true value of the parameter.
• Prediction and tolerance intervals are concerned with the
population itself and with values that may be sampled
from it in the future.
• These intervals are only useful when the shape of the
population is known; here we assume the population is
known to be normal.
Prediction Interval
• A prediction interval is an interval that is likely to contain
the value of an item that will be sampled from the
population at a future time.
• We “predict” that a value that is yet to be sampled from
the population will fall within the prediction interval.
100(1 – α)% Prediction Interval
• Let 𝑥1, … , 𝑥𝑛 be a random sample from a normal population. Let 𝑦 be another item to be sampled from this population, whose value has not yet been observed. The 100(1 – 𝛼)% prediction interval for 𝑦 is
$$\bar{x} \pm t_{\alpha/2,\,n-1}\, s \sqrt{1 + \frac{1}{n}}$$
• The probability is 1 – 𝛼 that the value of 𝑦 will be contained in this interval.
• One-sided intervals may also be constructed:
$$\bar{x} + t_{\alpha,\,n-1}\, s \sqrt{1 + \frac{1}{n}} \quad \text{or} \quad \bar{x} - t_{\alpha,\,n-1}\, s \sqrt{1 + \frac{1}{n}}$$
Example of Prediction Intervals
• A sample of 10 concrete blocks manufactured by a
certain process has a mean compressive strength of
1312 MPa, with a standard deviation of 25 MPa. Find a
95% prediction interval for the strength of a block that has
not yet been measured.
$$\bar{x} \pm t_{\alpha/2,\,n-1}\, s \sqrt{1 + \frac{1}{n}}$$
$$1312 \pm t_{.025,9}\,(25)\sqrt{1 + \frac{1}{10}}$$
$$1312 \pm 2.262\,(25)\sqrt{1.1} \quad \text{or} \quad (1253, 1371)$$
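The concrete block calculation can be reproduced with a small helper. This is a sketch only: the function name is illustrative, and the t value 2.262 is the one used in the example, not computed.

```python
import math

def prediction_interval(x_bar, s, n, t_value):
    """Two-sided prediction interval for one future observation."""
    half_width = t_value * s * math.sqrt(1 + 1 / n)
    return x_bar - half_width, x_bar + half_width

# Values from the concrete block example; 2.262 is t(.025, 9) as quoted above.
low, high = prediction_interval(x_bar=1312, s=25, n=10, t_value=2.262)
print(round(low), round(high))
```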
Comparing CI and PI
• The formula for the PI is similar to the formula for the CI for the mean of a normal population.
• The prediction interval has a small adjustment to the standard error with the additional + 1 under the square root.
• This reflects the random variation in the value of the sampled item that is to be predicted.
• Prediction intervals are sensitive to the assumption that the population is normal.
• If the shape of the population differs much from the normal curve, the prediction interval may be misleading.
• Large samples do not help: if the population is not normal, the prediction interval is invalid.
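One way to see the effect of the additional + 1 is to compare the factors that multiply 𝑡·𝑠 in the two intervals as 𝑛 grows. The sketch below uses hypothetical function names; it ignores the t value, which is the same in both intervals for a given 𝑛.

```python
import math

def ci_factor(n):
    """Multiplier of t*s in the confidence interval for the mean."""
    return 1 / math.sqrt(n)

def pi_factor(n):
    """Multiplier of t*s in the prediction interval for a future value."""
    return math.sqrt(1 + 1 / n)

for n in (10, 100, 10_000):
    print(n, round(ci_factor(n), 4), round(pi_factor(n), 4))
```

The CI factor shrinks toward zero with more data, while the PI factor never drops below 1, because the variation of the single future item remains no matter how well the mean is estimated.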
Tolerance Intervals
• A tolerance interval is an interval that is likely to contain a specified proportion of the population.
• First assume that we have a normal population whose mean 𝜇 and standard deviation 𝜎 are known.
• To find an interval that contains 90% of the population, we have 𝜇 ± 1.645𝜎.
• In general, the interval $\mu \pm z_{\gamma/2}\,\sigma$ will contain
100(1 – 𝛾)% of the population.
• In practice, we do not know 𝜇 or 𝜎. Instead we use the sample mean and sample standard deviation.
Consequences
• Since we are estimating the mean and standard
deviation from the sample,
– We must make the interval wider than it would be if 𝜇 and 𝜎 were known.
– We cannot be 100% confident that the interval actually
contains the required proportion of the population.
Construction of Interval
• We must specify the proportion 100(1 – 𝛾)% of the population that we wish the interval to contain.
• We must also specify the confidence 100(1 – 𝛼)% that the interval actually contains the specified proportion.
• It is then possible to find a number $k_{n,\alpha,\gamma}$ such that the interval
$$\bar{x} \pm k_{n,\alpha,\gamma}\, s$$
will contain at least 100(1 – 𝛾)% of the population with
confidence 100(1 – 𝛼)%. Values of $k_{n,\alpha,\gamma}$ are presented in the following table.
Tolerance Factors for the
Normal Distribution
(percent columns give the proportion of the population covered)

           95% Confidence                  99% Confidence
  n      90%      95%      99%         90%       95%       99%
  2  32.0187  37.6746  48.4296    160.1940  188.4915  242.3004
  3   8.3795   9.9158  12.8613     18.9304   22.4009   29.0553
  4   5.3692   6.3699   8.2993      9.3984   11.1501   14.5274
  5   4.2749   5.0787   6.6338      6.6118    7.8550   10.2602
  6   3.7123   4.4140   5.7746      5.3366    6.3453    8.3013
  7   3.3686   4.0074   5.2481      4.6129    5.4877    7.1868
  8   3.1358   3.7317   4.8907      4.1473    4.9355    6.4683
  9   2.9670   3.5317   4.6310      3.8223    4.5499    5.9660
 10   2.8385   3.3794   4.4330      3.5821    4.2647    5.5943
 11   2.7372   3.2592   4.2766      3.3970    4.0449    5.3075
 12   2.6550   3.1617   4.1496      3.2497    3.8700    5.0792
 13   2.5868   3.0808   4.0441      3.1295    3.7271    4.8926
 14   2.5292   3.0124   3.9549      3.0294    3.6081    4.7371
 15   2.4799   2.9538   3.8785      2.9446    3.5073    4.6053

Source: Principles of Statistics for Engineers and Scientists, by
William Navidi, 2010
Tolerance Factors for the
Normal Distribution (continued)

           95% Confidence                  99% Confidence
  n      90%      95%      99%         90%       95%       99%
 16   2.4371   2.9029   3.8121      2.8717    3.4207    4.4920
 17   2.3995   2.8583   3.7538      2.8084    3.3453    4.3934
 18   2.3662   2.8188   3.7022      2.7527    3.2792    4.3068
 19   2.3366   2.7835   3.6560      2.7034    3.2205    4.2300
 20   2.3099   2.7518   3.6146      2.6594    3.1681    4.1614
 25   2.2083   2.6310   3.4565      2.4941    2.9715    3.9039
 30   2.1398   2.5494   3.3497      2.3848    2.8414    3.7333
 35   2.0899   2.4900   3.2719      2.3063    2.7479    3.6107
 40   2.0516   2.4445   3.2122      2.2468    2.6770    3.5177
 45   2.0212   2.4083   3.1647      2.1998    2.6211    3.4443
 50   1.9964   2.3787   3.1259      2.1616    2.5756    3.3846
 60   1.9578   2.3328   3.0657      2.1029    2.5057    3.2929
 70   1.9291   2.2987   3.0208      2.0596    2.4541    3.2251
 80   1.9068   2.2720   2.9859      2.0260    2.4141    3.1725
 90   1.8887   2.2506   2.9577      1.9990    2.3819    3.1303
100   1.8738   2.2328   2.9343      1.9768    2.3555    3.0955

Source: Principles of Statistics for Engineers and Scientists, by
William Navidi, 2010
Example of Tolerance Intervals
• The lengths of bolts manufactured by a certain process
are known to be normally distributed. In a sample of 30
bolts, the average length was 10.25 cm, with a standard
deviation of 0.20 cm. Find a tolerance interval that
includes 90% of the lengths of the bolts with 95%
confidence.
$$\bar{x} \pm k_{n,\alpha,\gamma}\, s$$
$$10.25 \pm k_{30,.05,.10}\,(0.20)$$
$$10.25 \pm 2.1398\,(0.20) \quad \text{or} \quad (9.82, 10.68)$$
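The bolt example reduces to a table lookup followed by one multiplication, as sketched below. The dictionary holds only a few entries copied from the 95%-confidence, 90%-of-population column of the tolerance factor table; the names are illustrative.

```python
# Tolerance factors k for 95% confidence and 90% of the population,
# copied from the table for a few sample sizes.
K_95CONF_90POP = {10: 2.8385, 15: 2.4799, 20: 2.3099, 25: 2.2083, 30: 2.1398}

def tolerance_interval(x_bar, s, k):
    """Interval x_bar +/- k*s for a normal population."""
    return x_bar - k * s, x_bar + k * s

low, high = tolerance_interval(x_bar=10.25, s=0.20, k=K_95CONF_90POP[30])
print(round(low, 2), round(high, 2))
```

For comparison, with 𝜇 and 𝜎 known the factor for 90% of the population would be 1.645, so the sample-based factor 2.1398 reflects the extra width needed to cover estimation uncertainty.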
17.10 Example 17.5: Percentage of
the Population Statements
17.11 Statistical Tolerancing
• Consider measurements for n components, each centered at
the mean (𝜇𝑖) of a normal distribution with a plus
or minus tolerance (𝑇𝑖) around this mean value.
• The worst-case overall tolerance (𝑇𝑤) for this situation is
simply the sum of these tolerances:
$$T_w = \pm \sum_{i=1}^{n} T_i = \pm (T_1 + T_2 + \cdots + T_n)$$
• The serial 3𝜎 combination of the component tolerances
yields an overall product 3𝜎 tolerance (𝑇3𝜎) of
$$T_{3\sigma} = \pm \left(\sum_{i=1}^{n} T_i^2\right)^{1/2} = \pm (T_1^2 + T_2^2 + \cdots + T_n^2)^{1/2}$$
• The assumption that each component follows a normal
distribution may not be valid. For example, a ±10%
tolerance resistor may follow a bimodal, truncated, and/or
skewed distribution because the best parts can be sorted
out and sold at a higher price, with a ±1% or ±5% tolerance.
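The difference between the worst-case and the 3𝜎 (root-sum-of-squares) combinations is easy to see numerically. The four component tolerances below are hypothetical and carry no units; the sketch only applies the two formulas.

```python
import math

tolerances = [0.10, 0.20, 0.05, 0.15]   # hypothetical component tolerances (+/-)

t_worst = sum(tolerances)                          # worst case: simple addition
t_rss = math.sqrt(sum(t ** 2 for t in tolerances)) # 3-sigma: root sum of squares

print(round(t_worst, 3), round(t_rss, 3))
```

The root-sum-of-squares value is never larger than the worst-case sum, and the gap widens as the number of components increases.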
• Another situation where the normality assumption may be
invalid is where a manufacturer initially produces a part at
one tolerance extreme, anticipating tool wear in the
manufacturing process.
• An alternative approach is to estimate a distribution shape
for each component. Then conduct a Monte Carlo
simulation to yield an overall expected output.
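A Monte Carlo approach of this kind might look like the following sketch. Everything specific here is an assumption made for illustration: the three components, their nominal values and tolerances, the choice of a normal shape (tolerance taken as 3𝜎) for two of them and a uniform shape for the third, and the fixed random seed.

```python
import random
import statistics

random.seed(17)  # fixed seed so the sketch is repeatable

def draw_normal(nominal, tol):
    """Normal component whose +/- tolerance is treated as 3 sigma."""
    return random.gauss(nominal, tol / 3)

def draw_uniform(nominal, tol):
    """Component spread evenly across its tolerance band (e.g. sorted parts)."""
    return random.uniform(nominal - tol, nominal + tol)

# Hypothetical stack-up: (nominal, tolerance, distribution shape).
components = [
    (10.0, 0.10, draw_normal),
    (5.0, 0.20, draw_uniform),
    (2.5, 0.05, draw_normal),
]

# Simulate the assembled dimension many times.
totals = [
    sum(draw(nominal, tol) for nominal, tol, draw in components)
    for _ in range(20_000)
]

mean_total = statistics.fmean(totals)
sd_total = statistics.stdev(totals)
print(round(mean_total, 3), round(sd_total, 3), round(3 * sd_total, 3))
```

Swapping the shape function for a component shows how a non-normal part changes the overall spread without changing its stated tolerance.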
17.12 Example 17.6:
Combining Analytical Data with
Statistical Tolerancing
• An automatic sheet feed device is to load a sheet of paper
into a printer such that the first character printed on the
paper will be 1.261.26 mm from the edge of the paper.
• In this problem, there are two sources of variability that need
consideration:
– Variability within machines
– Variability between machines
• Three alternatives:
– Worst-case analysis
– All sheets of paper on all printers manufactured
– Tolerance of the sheets produced on a worst-case machine
• Alternative 1: Worst-case analysis
– Worst-case machine tolerances are combined with the 3𝜎 limits of machine repeatability:
$$T_A = T_w + 3\sigma_r = \pm \sum_{i=1}^{n} T_i \pm 3\sigma_r = \pm 2.02$$
• Alternative 2: All sheets of paper on all printers
$$T_B = \pm \left[\sum_{i=1}^{n} T_i^2 + (3\sigma_r)^2\right]^{0.5} = \pm 0.622$$
• Alternative 3: Tolerance of the sheets produced on a worst-case machine
$$T_C = \pm \left(\sum_{i=1}^{n} T_i^2\right)^{1/2} \pm 3\sigma_r = \pm 0.85$$
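The three combinations can be compared side by side in a short sketch. The component tolerances and the 3𝜎 repeatability value below are hypothetical, since the example's underlying data are not listed on the slides, so the printed numbers will not match ±2.02, ±0.622, and ±0.85.

```python
import math

tolerances = [0.10, 0.20, 0.05, 0.15]   # hypothetical machine tolerances (+/-)
three_sigma_r = 0.12                    # hypothetical 3-sigma machine repeatability

sum_t = sum(tolerances)
sum_t_sq = sum(t ** 2 for t in tolerances)

t_a = sum_t + three_sigma_r                        # Alternative 1: worst case
t_b = math.sqrt(sum_t_sq + three_sigma_r ** 2)     # Alternative 2: all printers
t_c = math.sqrt(sum_t_sq) + three_sigma_r          # Alternative 3: worst-case machine

print(round(t_a, 3), round(t_b, 3), round(t_c, 3))
```

For any non-negative inputs the ordering is T_B ≤ T_C ≤ T_A, which matches the ordering of the three results reported in the example.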
17.13 Nonparametric Estimates:
Runs Test For Randomization
• A run is defined as a group of consecutive
observations that are either all greater than or all
less than some value. To assess whether data are in
random order using a runs test, data are
evaluated in terms of the number of runs above
and below the median. Within this
nonparametric test no assumption is required
about the population distribution.
17.14 Example 17.7: Nonparametric
Runs Test For Randomization
• Forty people are selected randomly. Each person is asked a question,
which has five possible answers that are coded 1-5. A gradual bias in the
question phrasing or a lack of randomization when selecting people would
cause non-randomization of the responses.
1 1 2 1 1 1 1 1 1
2 3 3 2 0 0 0 0 1
1 3 3 4 4 5 5 5 5
2 1 1 2 2 2 1 1 3
3 3 3 2
Minitab: Stat > Nonparametrics > Runs Test

Runs Test: C1
Runs test for C1
Runs above and below K = 2.05
The observed number of runs = 7
The expected number of runs = 19.2
14 observations above K, 26 below
P-value = 0.000
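The Minitab output can be cross-checked with a small runs test written in Python. This is a sketch under stated assumptions: the 40 responses are read row by row as listed in the example, the reference value K defaults to the sample mean (which is 2.05 for these data, matching the output), and the P-value uses the usual large-sample normal approximation, which may differ in detail from Minitab's calculation.

```python
import math
from statistics import NormalDist, fmean

# The 40 coded responses from Example 17.7, read row by row.
data = [1, 1, 2, 1, 1, 1, 1, 1, 1,
        2, 3, 3, 2, 0, 0, 0, 0, 1,
        1, 3, 3, 4, 4, 5, 5, 5, 5,
        2, 1, 1, 2, 2, 2, 1, 1, 3,
        3, 3, 3, 2]

def runs_test(values, k=None):
    """Runs above and below k, with a normal-approximation two-sided P-value."""
    k = fmean(values) if k is None else k
    above = [v > k for v in values]
    n_above = sum(above)
    n_below = len(values) - n_above
    # A new run starts whenever consecutive values fall on opposite sides of k.
    runs = 1 + sum(a != b for a, b in zip(above, above[1:]))
    n = len(values)
    expected = 1 + 2 * n_above * n_below / n
    variance = (2 * n_above * n_below * (2 * n_above * n_below - n)
                / (n ** 2 * (n - 1)))
    z = (runs - expected) / math.sqrt(variance)
    p_value = 2 * NormalDist().cdf(-abs(z))
    return k, runs, expected, n_above, n_below, p_value

k, runs, expected, n_above, n_below, p_value = runs_test(data)
print(round(k, 2), runs, round(expected, 1), n_above, n_below, round(p_value, 3))
```

Far fewer runs than expected indicates that similar responses cluster together in sequence, which is the pattern a gradual bias or a lack of randomization would produce.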