Healey Chapter 7 Estimation Procedures Using the Sampling Distribution to Construct Confidence...


Healey Chapter 7

Estimation Procedures

Using the Sampling Distribution to Construct

Confidence Intervals

Outline:

The logic of estimation

How to construct and interpret confidence interval estimates for:

Sample means

Sample proportions

The Logic Behind Estimation

In estimation procedures, statistics calculated from random samples are used to estimate the value of population parameters.

Example: If we know that 42% of a random sample drawn

from a city vote Liberal, we can estimate the percentage of all city residents who vote Liberal.

Logic (cont.)

Information from samples is used to estimate information about the population.

Statistics are used to estimate parameters.

POPULATION

SAMPLE

PARAMETER

STATISTIC

Logic (cont.)

Sampling Distribution is the link between sample and population.

The value of the parameters is unknown but characteristics of the Sampling Distribution are defined by theorems.

POPULATION

SAMPLING DISTRIBUTION

SAMPLE

Two Estimation Procedures

1. A point estimate is a sample statistic used to estimate a population value: The London Free Press reports that “42% of a sample of randomly selected city residents voted Liberal.”

2. Confidence intervals (for means or proportions) consist of a range of values: “…between 38% and 46% of city residents voted Liberal.”

Bias and Efficiency

Bias: An estimator of a mean (or a proportion) is unbiased if the mean of its sampling distribution is equal to the population mean (or proportion).

Efficiency: The smaller the standard error (the standard deviation of the sampling distribution), the more tightly the sample outcomes cluster around the mean of the sampling distribution. This clustering is known as efficiency.

Sample Size and Efficiency

Standard error of the sampling distribution:

σ_X̄ = s / √(N − 1)

The formula shows that as sample size N increases, the standard error (σ_X̄) decreases. The larger N is, the more efficient the estimate will be: a larger sample means the estimate is likely to fall closer to the true population mean.
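To make the effect of N concrete, the short Python sketch below evaluates the standard error for a few sample sizes. It assumes the s / √(N − 1) form of the standard error; the helper name standard_error and the values s = 3 and N = 26, 101, 401 are illustrative choices, not taken from the slides.

```python
import math

def standard_error(s: float, n: int) -> float:
    """Standard error of the mean from the sample standard deviation s.

    Hypothetical helper; assumes the s / sqrt(N - 1) form.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    return s / math.sqrt(n - 1)

# Illustrative values only: s = 3 with increasing sample sizes.
for n in (26, 101, 401):
    print(n, round(standard_error(3.0, n), 3))
```

Each time N − 1 is quadrupled here, the standard error is cut in half, so gains in efficiency become more expensive as the sample grows.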

Confidence Levels

Our level of confidence has to be converted into a Z-score that we will then use in our formula to find the confidence interval.

The 95% confidence level means that we are willing to accept a 5% probability of being wrong (alpha (α) = .05).

This probability (the area under the curve) will be divided evenly between the upper and lower tails of the distribution (.025 in each tail).

Confidence Levels (cont.)

When α = .05, .025 of the area lies in each tail of the distribution.

The .95 in the middle section is our confidence level.

The cut-off between our confidence level and the .025 in each tail is represented by a Z-value of +/- 1.96.

Z-values for Various Alpha Levels

Confidence Level    α       α/2      Z-score
90%                 .10     .0500    +/-1.65
95%                 .05     .0250    +/-1.96
99%                 .01     .0050    +/-2.58
99.9%               .001    .0005    +/-3.29

(Note: Z-scores are found in Appendix A using the area for α/2)
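As an alternative to the printed table, the Z-scores can be computed directly. The sketch below uses the standard normal distribution from Python's standard library; the function name z_for_alpha is a hypothetical helper introduced here for illustration.

```python
from statistics import NormalDist

def z_for_alpha(alpha: float) -> float:
    """Two-tailed critical Z: the point that leaves alpha/2 in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

# The four alpha levels listed in the table.
for alpha in (0.10, 0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: Z = +/-{z_for_alpha(alpha):.3f}")
```

The computed values carry three decimals (for example, about 1.645 for α = .10), so they differ slightly from the two-decimal table entries because of rounding.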

Confidence Intervals For Means

Procedure:

1. Set the alpha (the probability that the interval will be wrong). Note that the symbol for alpha is α. Setting alpha equal to 0.05, a 95% confidence level, means the researcher is willing to be wrong 5% of the time.

2. Find the Z-value associated with alpha. If alpha is equal to 0.05, we would place half (0.025) of this probability in the lower tail and half in the upper tail of the distribution.

3. Substitute values into the formula and solve.

Formula: c.i. = X̄ ± Z(s/√(N − 1))

Example: Confidence Intervals For Means

Question:

For a random sample of 178 Canadian households, average television viewing time was 6 hours/day with s = 3. What would be your estimate of the population mean viewing time at the 95% confidence level (α = .05)?

Example: Confidence Intervals For Means

The Z-score for the 95% confidence level (α = .05) is +/-1.96. Substitute all information into the formula and solve:

c.i. = X̄ ± Z(s/√(N − 1))

= 6.0 ± 1.96(3/√177)

= 6.0 ± 1.96(3/13.30)

= 6.0 ± 1.96(.23)

= 6.0 ± .44

Example (cont.)

We can estimate that households in this community average 6.0 ± .44 hours of TV watching each day.

Another way to state the interval: 5.56 ≤ μ ≤ 6.44

Interpretation: We estimate, with 95% confidence, that the population mean for TV watching is greater than or equal to 5.56 and less than or equal to 6.44. (This interval has a .05 chance of being wrong.)
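The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch, assuming the interval is the sample mean ± Z(s/√(N − 1)); the function name mean_confidence_interval is a hypothetical helper, not something from the textbook.

```python
import math

def mean_confidence_interval(mean: float, s: float, n: int, z: float = 1.96):
    """Return (lower, upper) limits of mean +/- z * s / sqrt(n - 1)."""
    margin = z * s / math.sqrt(n - 1)
    return mean - margin, mean + margin

# Values from the TV-viewing example: mean = 6.0, s = 3, N = 178.
low, high = mean_confidence_interval(6.0, 3.0, 178)
print(f"{low:.2f} to {high:.2f}")  # prints "5.56 to 6.44"
```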

Example (cont.)

In other words:

Even if the statistic is as much as ±1.96 standard errors from the mean of the sampling distribution, the confidence interval will still include the value of μ.

Only rarely (5 times out of 100) will the interval not include μ.
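The "5 times out of 100" claim can be explored by simulation. The sketch below is illustrative only: it assumes a normally distributed population with μ = 6 and σ = 3, samples of N = 100, and a fixed random seed, none of which come from the slides. It draws many samples, builds a 95% interval from each, and counts how often the interval contains μ.

```python
import math
import random
import statistics

def coverage(mu=6.0, sigma=3.0, n=100, trials=2000, z=1.96, seed=0):
    """Share of simulated 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        s = statistics.pstdev(sample)      # standard deviation dividing by N
        margin = z * s / math.sqrt(n - 1)  # Z times the standard error
        if mean - margin <= mu <= mean + margin:
            hits += 1
    return hits / trials

print(coverage())
```

The printed share should land close to .95. It will not be exactly .95, because the number of trials is finite and s is itself an estimate.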

Confidence Intervals For Proportions

Procedure:

Set alpha = .05. Find the associated Z score. Substitute the sample information into the formula:

c.i. = Ps ± Z√(Pu(1 − Pu)/N)

Note: Ps = sample proportion

Pu (the population proportion), when not known, is set to .50

Example: Confidence Intervals For Proportions

Question:

If 42% of a random sample of 764 people from an Ontario city vote Liberal, what percentage of the entire city votes Liberal?

Hint: Don’t forget to change the % to a proportion.

Example for Proportions (cont.)

c.i. = Ps ± Z√(Pu(1 − Pu)/N)

= .42 ± 1.96√(.25/764)

= .42 ± 1.96√(.00033)

= .42 ± 1.96(.018)

= .42 ± .04

Confidence Intervals For Proportions

Changing back to %, we estimate that 42% ± 4% of the city residents vote Liberal.

Another way to state the interval: 38% ≤ Pu ≤ 46%

Interpretation: We estimate that the population value is greater than or equal to 38% and less than or equal to 46% for city residents who vote Liberal.

(This interval has a .05 chance of being wrong.)
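The proportion example can be reproduced the same way. This is a hedged sketch that assumes the interval is the sample proportion ± Z√(Pu(1 − Pu)/N), with Pu set to .50 as in the example; proportion_confidence_interval is a hypothetical helper name.

```python
import math

def proportion_confidence_interval(p_s: float, n: int, z: float = 1.96, p_u: float = 0.5):
    """Return (lower, upper) limits of p_s +/- z * sqrt(p_u * (1 - p_u) / n)."""
    margin = z * math.sqrt(p_u * (1 - p_u) / n)
    return p_s - margin, p_s + margin

# Values from the voting example: sample proportion .42, N = 764.
low, high = proportion_confidence_interval(0.42, 764)
print(f"{low:.3f} to {high:.3f}")  # prints "0.385 to 0.455"
```

Rounded to whole percentages, these limits give the 38% to 46% interval reported in the example.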

Calculating Sample Sizes (note: Formula 6.4 and 6.5 in 2nd edition)

Sample sizes (cont.)

These formulae can be used to estimate the minimum required sample size for means or proportions.

Where:

n = minimum required sample size

Z = determined by your alpha level

σ or Pu = population standard deviation (use s if unknown) or population proportion

ME = margin of error (in +/- actual units of your desired estimate)
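As a hedged illustration of the symbols defined above, the Python sketch below assumes the standard sample-size forms n = Z²σ²/ME² for means and n = Z²Pu(1 − Pu)/ME² for proportions. These forms are an assumption consistent with the definitions, and should be checked against Formulas 6.4 and 6.5 in the textbook. The function names and the input values are illustrative only.

```python
import math

def sample_size_for_mean(sigma: float, me: float, z: float = 1.96) -> int:
    """Minimum n so that z * sigma / sqrt(n) is no larger than me (assumed form)."""
    return math.ceil((z * sigma / me) ** 2)

def sample_size_for_proportion(me: float, z: float = 1.96, p_u: float = 0.5) -> int:
    """Minimum n so that z * sqrt(p_u * (1 - p_u) / n) is no larger than me (assumed form)."""
    return math.ceil(z ** 2 * p_u * (1 - p_u) / me ** 2)

# Illustrative inputs: sigma = 3 hours with ME = 0.5 hours; ME = .04 for a proportion.
print(sample_size_for_mean(sigma=3.0, me=0.5))   # prints 139
print(sample_size_for_proportion(me=0.04))       # prints 601
```

Results are rounded up because a fractional respondent is not possible and rounding down would exceed the desired margin of error.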

Practice Questions:

Healey 1st Cdn #7.5, 7.7, 7.9

Healey 2nd Cdn #6.5, 6.7, 6.9
