22
Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189 Sampling distribution of the sample variance - pages 189 - 190 Confidence interval for a population mean - pages 209 - 212 Confidence interval for a population variance - pages 268 - 270 Confidence interval for a population proportion - pages 278 - 281

Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Embed Size (px)

Citation preview

Page 1: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Topic 6 - Confidence intervals based on a single sample

• Sampling distribution of the sample mean - pages 187 - 189

• Sampling distribution of the sample variance - pages 189 - 190

• Confidence interval for a population mean - pages 209 - 212

• Confidence interval for a population variance - pages 268 - 270

• Confidence interval for a population proportion - pages 278 - 281

Page 2: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Confidence Intervals

• To use the CLT in our examples, we had to know the population mean, , and the population standard deviation, .

• This is okay if we have a huge amount of sample data to estimate these quantities with and s, respectively.

• In most cases, the primary goal of the analysis of our sample data is to estimate and to determine a range of likely values for these population values.

X

Page 3: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Confidence interval for with known

• Suppose for a moment we know but not • The CLT says that is approximately normal

with a mean of and a variance of 2/n.

• So, will be standard normal.

• For the standard normal distribution, let z be such that P(Z > z) =

X

XZ

n

Page 4: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Confidence interval for with known• Normal calculator

Page 5: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Confidence interval for with unknown

• For large samples (n ≥ 30), we can replace with s, so that is a (1-)100% confidence interval for .

• For small samples (n < 30) from a normal population,

is a (1-)100% confidence interval for .

• The value t,n-1 is the appropriate quantile from a t distribution with n-1 degrees of freedom.

• t-distribution demonstration• t-distribution calculator

/ 2X z s n

/ 2, 1nX t s n

Page 6: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Acid rain data• The EPA states that any area where the average pH of

rain is less than 5.6 on average has an acid rain problem.

• pH values collected at Shenandoah National Park are listed below.

• Calculate 95% and 99% confidence intervals for the average pH of rain in the park.

Page 7: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Light bulb data• The lifetimes in days of 10 light bulbs of a

certain variety are given below. Give a 95% confidence interval for the expected lifetime of a light bulb of this type. Do you trust the interval?

Page 8: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Interpreting confidence statements• In making a confidence statement, we

have the desired level of confidence in the procedure used to construct the interval.

• Confidence interval demonstration

Page 9: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Sample size effect• As sample size increases, the width of our

confidence interval clearly decreases.• If is known, the width of our interval for

is

• Solving for n,

• If is known or we have a good estimate, we can use this formula to decide the sample size we need to obtain a certain interval length.

/ 22L zn

2/ 2(2 )n zL

Page 10: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Paint example• Suppose we know from past data that the

standard deviation of the square footage covered by a one gallon can of paint is somewhere between 2 and 4 square feet. How many one gallon cans do we need to test so that the width of a 95% confidence interval for the mean square footage covered will be at most 1 square foot?

Page 11: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Confidence interval for a variance

• In many cases, especially in manufacturing, understanding variability is very important.

• Often times, the goal is to reduce the variability in a system.

• For samples from a normal population,

is a (1-)100% confidence interval for 2.

• The value ,n-1 is the appropriate quantile from a

distribution with n-1 degrees of freedom.• calculator

2 2

2 2/ 2, 1 1 / 2, 1

( 1) ( 1),

n n

n s n s

Page 12: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Acid rain data• Calculate a 95% confidence interval for the variance of

pH of rain in the park.

Page 13: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Confidence interval for a proportion

• A binomial random variable, X, counts the number of successes in n Bernoulli trials where the probability of success on each trial is p.

• In sampling studies, we are often times interested in the proportion of items in the population that have a certain characteristic.

• We think of each sample, Xi, from the population as a Bernoulli trial, the selected item either has the characteristic (Xi = 1) or it does not (Xi = 0).

• The total number with the characteristic in the sample is then

1

n

ii

X X

Page 14: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Confidence interval for a proportion

• The proportion in the sample with the characteristic is then

• The CLT says that the distribution of this sample proportion is then normally distributed for large n with – E(X/n) = – Var(X/n) =

• For the CLT to work well here, we need– X ≥ 5 and n-X ≥ 5– n to be much smaller than the population

size

1

ˆ / /n

ii

p X n X n

Page 15: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Confidence interval for a proportion• How can we use this to develop a (1-

)100% confidence interval for p, the population proportion with the characteristic?

/ 2ˆ ˆ ˆ(1 )/p z p p n

Page 16: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Murder case example• Find a 99% confidence interval for the

proportion of African Americans in the jury pool. (22 out of 295 African American in sample)

• StatCrunch

Page 17: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Sample size considerations

• The width of the confidence interval is

• To get the sample size required for a specific length, we have

• We might use prior information to estimate the sample proportion or substitute the most conservative value of ½.

/ 2 ˆ ˆ2 (1 )/L z p p n

2/ 2 ˆ ˆ(2 / ) (1 )n z L p p

Page 18: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Nurse employment case

• How many sample records should be considered if we want the 95% confidence interval for the proportion all her records handled in a timely fashion to be 0.02 wide?

Page 19: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Prediction intervals

• Sometimes we are not interested in a confidence interval for a population parameter but rather we are interested in a prediction interval for a new observation.

• For our light bulb example, we might want a 95% prediction interval for the time a new light bulb will last.

• For our paint example, we might want a 95% prediction interval for the amount of square footage a new can of paint will cover.

Page 20: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Prediction intervals

• Given a random sample of size n, our best guess at a new observation, Xn+1, would be the sample mean, .

• Now consider the difference, .

X

1nX X

1( )nE X X

1( )nVar X X

Page 21: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Prediction intervals• If is known and our population is normal,

then using a pivoting procedure gives

• If is unknown and our population is normal, a (1-)100% prediction interval for a new observation is

/ 2, 1

11nX t s

n

Page 22: Topic 6 - Confidence intervals based on a single sample Sampling distribution of the sample mean - pages 187 - 189187 - 189 Sampling distribution of the

Acid rain example• For the acid rain data, the sample mean pH

was 4.577889 and the sample standard deviation was 0.2892. What is a 95% prediction interval for the pH of a new rainfall?

• Does this prediction interval apply for the light bulb data?