155S7.5_3 Estimating a Population Variance
1
March 29, 2011
Chapter 7: Estimates and Sample Sizes
7-1 Review and Preview
7-2 Estimating a Population Proportion
7-3 Estimating a Population Mean: σ Known
7-4 Estimating a Population Mean: σ Not Known
7-5 Estimating a Population Variance
MAT 155 Statistical Analysis
Dr. Claude Moore
Cape Fear Community College
Excel program for critical values and confidence intervals for the chi-square distribution (standard deviation and variance): http://cfcc.edu/faculty/cmoore/chisquared.xls
Key Concept
In this section we introduce the chi-square probability distribution so that we can construct confidence interval estimates of a population standard deviation or variance. We also present a method for determining the sample size required to estimate a population standard deviation or variance.
Chi-Square Distribution
In a normally distributed population with variance $\sigma^2$, assume that we randomly select independent samples of size n and, for each sample, compute the sample variance $s^2$ (which is the square of the sample standard deviation s). The sample statistic $\chi^2$ (pronounced chi-square) has a sampling distribution called the chi-square distribution.
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$
where
n = sample size
$s^2$ = sample variance
$\sigma^2$ = population variance
degrees of freedom = n – 1
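The definition above can be explored with a small simulation. The following is only a sketch under stated assumptions: the function name `simulate_chi_square_statistics`, the population mean of 100, σ = 2, the seed, and the number of trials are all illustrative choices, not part of the course materials. It repeatedly draws samples of size n = 10 from a normal population and computes (n – 1)s²/σ² for each one. It relies on the standard fact that a chi-square distribution has a mean equal to its degrees of freedom.

```python
import random
import statistics

def simulate_chi_square_statistics(n, sigma, trials, seed=0):
    """Return `trials` simulated values of (n - 1) * s^2 / sigma^2.

    Each value comes from one random sample of size n drawn from a
    normal population with standard deviation sigma (mean 100 is an
    arbitrary illustrative choice; it does not affect the statistic).
    """
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        sample = [rng.gauss(100.0, sigma) for _ in range(n)]
        s_squared = statistics.variance(sample)  # sample variance (divisor n - 1)
        values.append((n - 1) * s_squared / sigma ** 2)
    return values

chi_square_values = simulate_chi_square_statistics(n=10, sigma=2.0, trials=5000)
simulated_mean = statistics.mean(chi_square_values)      # should be near df = 9
simulated_median = statistics.median(chi_square_values)  # below the mean if right-skewed
smallest_value = min(chi_square_values)                  # never negative

print(f"mean = {simulated_mean:.2f}, median = {simulated_median:.2f}, min = {smallest_value:.2f}")
```

In a run like this, the simulated values should average close to 9 (the degrees of freedom), none should be negative, and the median should fall below the mean, which reflects the lack of symmetry.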
Properties of the Distribution of the Chi-Square Statistic
1. The chi-square distribution is not symmetric, unlike the normal and Student t distributions. As the number of degrees of freedom increases, the distribution becomes more symmetric.
[Figures: Chi-Square Distribution; Chi-Square Distribution for df = 10 and df = 20]
2. The values of chi-square can be zero or positive, but they cannot be negative.
3. The chi-square distribution is different for each number of degrees of freedom, which is df = n – 1. As the number of degrees of freedom increases, the chi-square distribution approaches a normal distribution.
In Table A-4, each critical value of $\chi^2$ corresponds to an area given in the top row of the table, and that area represents the cumulative area located to the right of the critical value.
Example
A simple random sample of ten voltage levels is obtained. Construction of a confidence interval for the population standard deviation σ requires the left and right critical values of $\chi^2$ corresponding to a confidence level of 95% and a sample size of n = 10. Find the critical value of $\chi^2$ separating an area of 0.025 in the left tail, and find the critical value of $\chi^2$ separating an area of 0.025 in the right tail.
[Figure: Critical Values of the Chi-Square Distribution]
Example
For a sample of 10 values taken from a normally distributed population, the chi-square statistic $\chi^2 = (n - 1)s^2/\sigma^2$ has a 0.95 probability of falling between the chi-square critical values of 2.700 and 19.023. Instead of using Table A-4, technology (such as STATDISK, Excel, and Minitab) can be used to find critical values of $\chi^2$. A major advantage of technology is that it can be used for any number of degrees of freedom and any confidence level, not just the limited choices included in Table A-4.
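As one illustration of finding critical values with technology, here is a minimal Python sketch that uses only the standard library. It is not the Excel program or any of the packages named above. The helper names `chi_square_cdf` and `chi_square_critical_value` are hypothetical, and the numerical approach (a series for the regularized incomplete gamma function plus bisection) is an assumption chosen for simplicity.

```python
import math

def chi_square_cdf(x, df):
    """Cumulative area to the LEFT of x under the chi-square curve.

    Uses the power series for the regularized lower incomplete gamma
    function P(a, z) with a = df / 2 and z = x / 2.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0
    total = 1.0
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return math.exp(a * math.log(z) - z - math.lgamma(a + 1.0)) * total

def chi_square_critical_value(area_to_right, df):
    """Critical value with the given area to its RIGHT (Table A-4 convention)."""
    target = 1.0 - area_to_right
    low, high = 0.0, float(df)
    while chi_square_cdf(high, df) < target:  # widen until the root is bracketed
        high *= 2.0
    for _ in range(100):                      # bisection
        mid = (low + high) / 2.0
        if chi_square_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

df = 10 - 1  # n = 10 voltage levels
left_critical = chi_square_critical_value(0.975, df)   # 0.025 in the left tail
right_critical = chi_square_critical_value(0.025, df)  # 0.025 in the right tail
print(f"{left_critical:.3f}  {right_critical:.3f}")
```

For df = 9 the two results should agree with the values 2.700 and 19.023 quoted in the example. The same functions accept degrees of freedom and tail areas that a printed table does not list.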
Estimators of $\sigma^2$
The sample variance $s^2$ is the best point estimate of the population variance $\sigma^2$.
Estimators of σ
The sample standard deviation s is a commonly used point estimate of σ (even though it is a biased estimate).
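The difference between the two estimators can be seen in a simulation. This is a sketch only: the function name `average_estimates`, the population (normal, mean 0, σ = 2), the sample size, the seed, and the number of trials are illustrative assumptions. Averaged over many samples, $s^2$ should land near $\sigma^2$ = 4, while s should average slightly below σ = 2.

```python
import random
import statistics

def average_estimates(n, sigma, trials, seed=1):
    """Average s^2 and average s over many simulated normal samples."""
    rng = random.Random(seed)
    variances = []
    std_devs = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s_squared = statistics.variance(sample)  # divisor n - 1
        variances.append(s_squared)
        std_devs.append(s_squared ** 0.5)
    return statistics.mean(variances), statistics.mean(std_devs)

mean_s_squared, mean_s = average_estimates(n=10, sigma=2.0, trials=5000)
print(f"average s^2 = {mean_s_squared:.3f} (sigma^2 = 4)")
print(f"average s   = {mean_s:.3f} (sigma = 2)")
```

The shortfall in the average of s is small for n = 10, which is consistent with s remaining a commonly used estimate despite its bias.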
Confidence Interval for Estimating a Population Standard Deviation or Variance
σ = population standard deviation
s = sample standard deviation
n = number of sample values
$\sigma^2$ = population variance
$s^2$ = sample variance
E = margin of error
$\chi^2_L$ = left-tailed critical value of $\chi^2$
$\chi^2_R$ = right-tailed critical value of $\chi^2$
Requirements:
1. The sample is a simple random sample.
2. The population must have normally distributed values (even if the sample is large).
Confidence Interval for the Population Variance $\sigma^2$
$$\frac{(n - 1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_L}$$
Confidence Interval for the Population Standard Deviation σ
$$\sqrt{\frac{(n - 1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi^2_L}}$$
Procedure for Constructing a Confidence Interval for σ or $\sigma^2$
1. Verify that the required assumptions are satisfied.
2. Using n – 1 degrees of freedom, refer to Table A-4 or use technology to find the critical values $\chi^2_L$ and $\chi^2_R$ that correspond to the desired confidence level.
3. Evaluate the upper and lower confidence interval limits using this format of the confidence interval:
$$\frac{(n - 1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_L}$$
4. If a confidence interval estimate of σ is desired, take the square root of the upper and lower confidence interval limits and change $\sigma^2$ to σ.
5. Round the resulting confidence interval limits. If using the original set of data to construct a confidence interval, round the confidence interval limits to one more decimal place than is used for the original set of data. If using the sample standard deviation or variance, round the confidence interval limits to the same number of decimal places.
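Steps 3 and 4 of the procedure can be sketched in a few lines of Python. The function names `variance_confidence_interval` and `std_dev_confidence_interval` are hypothetical. The critical values are supplied by the caller (from Table A-4 or technology), and the illustrative inputs are the voltage-level numbers used in this section: n = 10, s = 0.15, and the critical values 2.700 and 19.023.

```python
import math

def variance_confidence_interval(n, s, chi2_left, chi2_right):
    """Step 3: limits for the population variance.

    The RIGHT-tailed critical value goes under the lower limit and the
    LEFT-tailed critical value goes under the upper limit.
    """
    numerator = (n - 1) * s ** 2
    return numerator / chi2_right, numerator / chi2_left

def std_dev_confidence_interval(n, s, chi2_left, chi2_right):
    """Step 4: square roots of the variance limits (taken before rounding)."""
    lower_var, upper_var = variance_confidence_interval(n, s, chi2_left, chi2_right)
    return math.sqrt(lower_var), math.sqrt(upper_var)

var_low, var_high = variance_confidence_interval(10, 0.15, 2.700, 19.023)
sd_low, sd_high = std_dev_confidence_interval(10, 0.15, 2.700, 19.023)
print(f"{var_low:.6f} < sigma^2 < {var_high:.6f}")
print(f"{sd_low:.2f} < sigma < {sd_high:.2f}")
```

Rounding is left to the final formatting step so that the square roots are taken on unrounded variance limits, as step 5 of the procedure requires.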
Confidence Intervals for Comparing Data
Caution
Confidence intervals can be used informally to compare the variation in different data sets, but the overlapping of confidence intervals should not be used for making formal and final conclusions about equality of variances or standard deviations.
Example:
The proper operation of typical home appliances requires voltage levels that do not vary much. Listed below are ten voltage levels (in volts) recorded in the author’s home on ten different days. These ten values have a standard deviation of s = 0.15 volt. Use the sample data to construct a 95% confidence interval estimate of the standard deviation of all voltage levels.
123.3 123.5 123.7 123.4 123.6 123.5 123.5 123.4 123.6 123.8
Requirements are satisfied: simple random sample and normality.
n = 10, so df = 10 – 1 = 9. Use Table A-4 to find:
$$\chi^2_L = 2.700 \quad \text{and} \quad \chi^2_R = 19.023$$
Construct the confidence interval: n = 10, s = 0.15
$$\frac{(10 - 1)(0.15)^2}{19.023} < \sigma^2 < \frac{(10 - 1)(0.15)^2}{2.700}$$
Evaluating the preceding expression yields:
$$0.010645 < \sigma^2 < 0.075000$$
Finding the square root of each part (before rounding), then rounding to two decimal places, yields this 95% confidence interval estimate of the population standard deviation:
$$0.10 \text{ volt} < \sigma < 0.27 \text{ volt}$$
Based on this result, we have 95% confidence that the limits of 0.10 volt and 0.27 volt contain the true value of σ.
Determining Sample Sizes
The procedures for finding the sample size necessary to estimate $\sigma^2$ are much more complex than the procedures given earlier for means and proportions. Instead of using very complicated procedures, we will use Table 7-2.
STATDISK also provides sample sizes. With STATDISK, select Analysis, Sample Size Determination, and then Estimate St Dev.
Minitab, Excel, and the TI-83/84 Plus calculator do not provide such sample sizes.
Determining Sample Sizes
Example: We want to estimate the standard deviation σ of all voltage levels in a home. We want to be 95% confident that our estimate is within 20% of the true value of σ. How large should the sample be? Assume that the population is normally distributed.
From Table 7-2, we can see that 95% confidence and an error of 20% for σ correspond to a sample of size 48. We should obtain a simple random sample of 48 voltage levels from the population of voltage levels.
Recap
In this section we have discussed:
• The chi-square distribution.
• Using Table A-4.
• Confidence intervals for the population variance and standard deviation.
• Determining sample sizes.
You may use the Excel program to find critical values or confidence intervals for the chi-square distribution. It is available at http://cfcc.edu/faculty/cmoore/ChiSquared.xls
384/6. Find the critical values $\chi^2_L$ and $\chi^2_R$ that correspond to the given confidence level and sample size: 95%; n = 20
384/8. Find the critical values $\chi^2_L$ and $\chi^2_R$ that correspond to the given confidence level and sample size: 90%; n = 51
Use the given confidence level and sample data to find a confidence interval for the population standard deviation σ. In each case, assume that a simple random sample has been selected from a population that has a normal distribution.
384/10. Speeds of Drivers Ticketed in a 65 mi/h Zone on the Massachusetts Turnpike: 95% confidence; n = 25, $\bar{x}$ = 81.0 mi/h, s = 2.3 mi/h.
Use the given confidence level and sample data to find a confidence interval for the population standard deviation σ. In each case, assume that a simple random sample has been selected from a population that has a normal distribution.
384/12. Reaction Times of NASCAR Drivers: 99% confidence; n = 8, $\bar{x}$ = 1.24 sec, s = 0.12 sec.
Assume that each sample is a simple random sample obtained from a normally distributed population. Use Table 7-2 on page 382 to find the indicated sample size.
384/14. Find the minimum sample size needed to be 95% confident that the sample standard deviation s is within 30% of σ. Is this sample size practical in most applications?
Sample size is n = 21.
384/16. Find the minimum sample size needed to be 95% confident that the sample variance is within 20% of the population variance.
Sample size is n = 211.
385/20. Pulse Rates of Men and Women
Data Set 1 in Appendix B includes 40 pulse rates of men, and those pulse rates have a mean of 69.4 beats per minute and a standard deviation of 11.3 beats per minute. That data set also includes 40 pulse rates of women, and those pulse rates have a mean of 76.3 beats per minute and a standard deviation of 12.5 beats per minute.
a. Construct a 99% confidence interval estimate of the standard deviation of the pulse rates of men.
b. Construct a 99% confidence interval estimate of the standard deviation of the pulse rates of women.
c. Compare the variation of the pulse rates of men and women. Does there appear to be a difference?
(a) (b)
(c) The two intervals are similar. No, there does not appear to be a difference in the variation of pulse rates of men and women.
385/22. Designing Theater Seats
In the course of designing theater seats, the sitting heights (in mm) of a simple random sample of adult women are obtained, and the results are listed below (based on anthropometric survey data from Gordon, Churchill, et al.). Use the sample data to construct a 95% confidence interval estimate of σ, the standard deviation of sitting heights of all women. Does the confidence interval contain the value of 35 mm, which is believed to be the standard deviation of sitting heights of women?
849 807 821 859 864 877 772 848 802 807 887 815
Yes, the interval contains the traditionally believed value of 35 mm.
If you work this by formula, use the TI calculator to find the standard deviation, s = 34.981. Find the critical values for the 95% confidence level, then use the appropriate formula.
You may also use the Excel program available at http://cfcc.edu/faculty/cmoore/chisquared.xls
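The formula route for this exercise can also be checked with a short Python sketch. Two things here are assumptions rather than part of the slides: the variable names, and the critical values 3.816 and 21.920, which are the standard chi-square table entries for df = 11 with areas of 0.975 and 0.025 to the right.

```python
import math
import statistics

sitting_heights_mm = [849, 807, 821, 859, 864, 877, 772, 848, 802, 807, 887, 815]

n = len(sitting_heights_mm)               # 12 values, so df = 11
s = statistics.stdev(sitting_heights_mm)  # sample standard deviation

# Assumed critical values for df = 11 and 95% confidence (standard table entries).
chi2_left, chi2_right = 3.816, 21.920

lower = math.sqrt((n - 1) * s ** 2 / chi2_right)
upper = math.sqrt((n - 1) * s ** 2 / chi2_left)
contains_35 = lower < 35 < upper

print(f"s = {s:.3f} mm")
print(f"{lower:.1f} mm < sigma < {upper:.1f} mm; contains 35 mm: {contains_35}")
```

The computed s should agree with the value 34.981 given above, and the resulting interval should contain 35 mm, in line with the stated answer.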
386/24. Comparing Waiting Lines
a. The listed values are waiting times (in minutes) of customers at the Jefferson Valley Bank, where customers enter a single waiting line that feeds three teller windows. Construct a 95% confidence interval for the population standard deviation σ.
6.5 6.6 6.7 6.8 7.1 7.3 7.4 7.7 7.7 7.7
b. The listed values are waiting times (in minutes) of customers at the Bank of Providence, where customers may enter any one of three different lines that have formed at three teller windows. Construct a 95% confidence interval for the population standard deviation σ.
4.2 5.4 5.8 6.2 6.7 7.7 7.7 8.5 9.3 10.0
c. Interpret the results found in parts (a) and (b). Do the confidence intervals suggest a difference in the variation among waiting times? Which arrangement seems better: the single-line system or the multiple-line system?
The variation is considerably higher in part (b). Yes; since the intervals do not overlap, there appears to be a difference in the variability of the two systems (an informal comparison, in keeping with the earlier caution about comparing intervals). The single-line system in part (a) is better for the customers because it eliminates the long wait endured by some customers when one of the lines is slow.
The confidence intervals in parts (a) and (b) were generated using the Excel program available at http://cfcc.edu/faculty/cmoore/chisquared.xls
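For readers without the Excel program, the two intervals can be reproduced with a short Python sketch. The function name `std_dev_interval` and the overlap check are illustrative assumptions. The critical values 2.700 and 19.023 are the df = 9 values used earlier in this section for a 95% confidence level.

```python
import math
import statistics

def std_dev_interval(data, chi2_left, chi2_right):
    """Confidence interval limits for sigma from raw data and given critical values."""
    n = len(data)
    s_squared = statistics.variance(data)  # sample variance (divisor n - 1)
    lower = math.sqrt((n - 1) * s_squared / chi2_right)
    upper = math.sqrt((n - 1) * s_squared / chi2_left)
    return lower, upper

jefferson_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]  # single line
providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]       # multiple lines

# n = 10 in both samples, so df = 9: critical values 2.700 and 19.023.
single_line = std_dev_interval(jefferson_valley, 2.700, 19.023)
multiple_line = std_dev_interval(providence, 2.700, 19.023)

# Two intervals overlap when each one starts before the other ends.
intervals_overlap = (single_line[0] <= multiple_line[1]
                     and multiple_line[0] <= single_line[1])

print(f"(a) {single_line[0]:.2f} min < sigma < {single_line[1]:.2f} min")
print(f"(b) {multiple_line[0]:.2f} min < sigma < {multiple_line[1]:.2f} min")
print("overlap:", intervals_overlap)
```

The upper limit for the single-line data falls below the lower limit for the multiple-line data, which is the non-overlap referred to in the answer to part (c).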