Chapter 7annegloag.weebly.com/uploads/2/2/9/9/22998796/chapter7.pdf · Chapter 7 Sampling Distributions . Section 7.1 Sampling Distributions and the Central Limit Theorem . Sampling

Chapter 7

Sampling Distributions

Section 7.1

Sampling Distributions and the

Central Limit Theorem

Sampling Distributions

Sampling distribution

• The probability distribution of a sample statistic.

• Formed when samples of size n are repeatedly taken

from a population.

• e.g. Sampling distribution of sample means

Sampling Distribution of Sample Means

Sample 1

1

x

Sample 5

5x

Sample 2

2

x

Sample 3

3x

Sample 4

4x

Population with μ, σ

The sampling distribution consists of the values of the

sample means, 1 2 3 4 5, , , , ,...x x x x x

2. The standard deviation of the sample means, , is

equal to the population standard deviation, σ,

divided by the square root of the sample size, n.

1. The mean of the sample means, , is equal to the

population mean μ.

Properties of Sampling Distributions of

Sample Means

x

x

x

xn

• Called the standard error of the mean.

Example: Sampling Distribution of

Sample Means

The population values {1, 3, 5, 7} are written on slips of

paper and put in a box. Two slips of paper are randomly

selected, with replacement.

a. Find the mean, variance, and standard deviation of

the population.

Mean: 4x

N

22Varianc : 5e

( )x

N

Standard Deviat 5ion 236: 2.

Solution:


Sample Means

b. Graph the probability histogram for the population

values.

All values have the

same probability of

being selected (uniform

distribution)

Population values

Pro

bab

ilit

y

0.25

1 3 5 7

x

P(x) Probability Histogram of

Population of x

Solution:


Sample Means

c. List all the possible samples of size n = 2 and

calculate the mean of each sample.

5 3, 7

4 3, 5

3 3, 3

2 3, 1

4 1, 7

3 1, 5

2 1, 3

1 1, 1

7 7, 7

6 7, 5

5 7, 3

4 7, 1

6 5, 7

5 5, 5

4 5, 3

3 5, 1 These means

form the

sampling

distribution of

sample means.

Sample

Solution:

Sample x x


Sample Means

d. Construct the probability distribution of the sample

means.

x f Probability f Probability

1 1 0.0625

2 2 0.1250

3 3 0.1875

4 4 0.2500

5 3 0.1875

6 2 0.1250

7 1 0.0625

Solution: x


Sample Means

e. Find the mean, variance, and standard deviation of

the sampling distribution of the sample means.

Solution:

The mean, variance, and standard deviation of the

16 sample means are:

4x

2 5

2 52

.x

2 5 1 581x

. .

These results satisfy the properties of sampling

distributions of sample means.

4x

5 2 236

1 5812 2

..

xn


Sample Means

f. Graph the probability histogram for the sampling

distribution of the sample means.

The shape of the

graph is symmetric

and bell shaped. It

approximates a

normal distribution.

Solution:

Sample mean

Pro

bab

ilit

y 0.25

P(x) Probability Histogram of

Sampling Distribution of

0.20

0.15

0.10

0.05

6 7 5 4 3 2

x

x

The Central Limit Theorem

1. If samples of size n ≥ 30 are drawn from any

population with mean = µ and standard deviation = σ,

x

x

xx

x

xxxx x

xx

x x

then the sampling distribution of sample means

approximates a normal distribution. The greater the

sample size, the better the approximation.


2. If the population itself is normally distributed,

then the sampling distribution of sample means is

normally distribution for any sample size n.

x

x

x

xx

x

x

xxx x

xx

x


• In either case, the sampling distribution of sample

means has a mean equal to the population mean.

• The sampling distribution of sample means has a

variance equal to 1/n times the variance of the

population and a standard deviation equal to the

population standard deviation divided by the square

root of n.

Variance

Standard deviation (standard

error of the mean)

x

xn

22

x n

Mean


1. Any Population Distribution 2. Normal Population Distribution

Distribution of Sample Means,

n ≥ 30

Distribution of Sample Means,

(any n)

Example: Interpreting the Central Limit

Theorem Cellular phone bills for residents of a city have a mean

of $63 and a standard deviation of $11. Random

samples of 100 cellular phone bills are drawn from this

population and the mean of each sample is determined.

Find the mean and standard error of the mean of the

sampling distribution. Then sketch a graph of the

sampling distribution of sample means.

Solution: Interpreting the Central Limit

Theorem

• The mean of the sampling distribution is equal to the

population mean

• The standard error of the mean is equal to the


root of n.

63x

111.1

100x

n


Theorem

• Since the sample size is greater than 30, the sampling

distribution can be approximated by a normal

distribution with

$63x $1.10x

Example: Interpreting the Central Limit

Theorem Suppose the training heart rates of all 20-year-old

athletes are normally distributed, with a mean of 135

beats per minute and standard deviation of 18 beats per

minute. Random samples of size 4 are drawn from this

population, and the mean of each sample is determined.

Find the mean and standard error of the mean of the

sampling distribution. Then sketch a graph of the

sampling distribution of sample means.


Theorem

• The mean of the sampling distribution is equal to the

population mean

• The standard error of the mean is equal to the


root of n.

135x

189

4x

n


Theorem

• Since the population is normally distributed, the

sampling distribution of the sample means is also

normally distributed.

135x 9x

Probability and the Central Limit

Theorem

• To transform x to a z-score

Value Mean

Standard error

x

x

x xz

n

Example: Probabilities for Sampling

Distributions

The graph shows the length of

time people spend driving each

day. You randomly select 50

drivers ages 15 to 19. What is

the probability that the mean

time they spend driving each

day is between 24.7 and 25.5

minutes? Assume that σ = 1.5

minutes.

Solution: Probabilities for Sampling

Distributions

From the Central Limit Theorem (sample size is greater

than 30), the sampling distribution of sample means is

approximately normal with

25x 1.5

0.2121350

xn

Solution: Probabilities for Sampling

Distributions

1

24 7 251 41

1 550

xz

n

. -

..

24.7 25

P(24.7 < x < 25.5)

x

Normal Distribution

μ = 25 σ = 0.21213

2

25 5 252 36

1 550

xz

n

.

..

25.5 –1.41

z

Standard Normal Distribution

μ = 0 σ = 1

0

P(–1.41 < z < 2.36)

2.36

0.9909

0.0793

P(24 < x < 54) = P(–1.41 < z < 2.36)

= 0.9909 – 0.0793 = 0.9116

Example: Probabilities for x and x An education finance corporation claims that the

average credit card debts carried by undergraduates are

normally distributed, with a mean of $3173 and a

standard deviation of $1120. (Adapted from Sallie Mae)

Solution:

You are asked to find the probability associated with

a certain value of the random variable x.

1. What is the probability that a randomly selected

undergraduate, who is a credit card holder, has a

credit card balance less than $2700?

Solution: Probabilities for x and x

P( x < 2700) = P(z < –0.42) = 0.3372

z =

x - m

s=

2700 - 3173

1120» -0.42

2700 3173

P(x < 2700)

x

Normal Distribution

μ = 3173 σ = 1120

–0.42

z


μ = 0 σ = 1

0

P(z < –0.42)

0.3372

Example: Probabilities for x and x

2. You randomly select 25 undergraduates who are

credit card holders. What is the probability that

their mean credit card balance is less than $2700?

Solution:

You are asked to find the probability associated with

a sample mean . x

3173x 1120

22425

xn

0

P(z < –2.11)

–2.11

z


μ = 0 σ = 1

0.0174


z =x - m

sn

=2700 - 3173

112025

=-473

224» -2.11

Normal Distribution

μ = 3173 σ = 1120

2700 3173

P(x < 2700)

x

P( x < 2700) = P(z < –2.11) = 0.0174


• There is about a 34% chance that an undergraduate

will have a balance less than $2700.

• There is only about a 2% chance that the mean of a

sample of 25 will have a balance less than $2700

(unusual event).

• It is possible that the sample is unusual or it is

possible that the corporation’s claim that the mean is

$3173 is incorrect.

Chapter 8

Confidence Intervals

Section 8.1

Confidence Intervals for the Mean

(Large Samples)

Point Estimate for Population μ

Point Estimate

• A single value estimate for a population parameter

• Most unbiased point estimate of the population mean

μ is the sample mean

x

Estimate Population

Parameter…

with Sample

Statistic

Mean: μ x

Example: Point Estimate for Population μ

A social networking website allows its users to add

friends, send messages, and update their personal

profiles. The following represents a random sample of

the number of friends for 40 users of the website. Find a

point estimate of the population mean, µ. (Source:

Facebook)

140 105 130 97 80 165 232 110 214 201 122

98 65 88 154 133 121 82 130 211 153 114

58 77 51 247 236 109 126 132 125 149 122

74 59 218 192 90 117 105

Solution: Point Estimate for Population μ

The sample mean of the data is

5232130.8

40

xx

n

Your point estimate for the mean number of friends for

all users of the website is 130.8 friends.

115 120 125 130 135 140 150 145

Point estimate

115 120 125 130 135 140 150 145

Point estimate

130.8x

How confident do we want to be that the interval estimate

contains the population mean μ?

Interval Estimate

Interval estimate

• An interval, or range of values, used to estimate a

population parameter.

( )

Interval estimate

Right endpoint 146.5

Left endpoint

115.1

Level of Confidence

Level of confidence c

• The probability that the interval estimate contains the

population parameter.

z z = 0 –zc zc

Critical values

½(1 – c) ½(1 – c)

c is the area under the

standard normal curve

between the critical values.

The remaining area in the tails is 1 – c .

c

Use the Standard

Normal Table to find the

corresponding z-scores.

zc

Level of Confidence

• If the level of confidence is 90%, this means that we

are 90% confident that the interval contains the

population mean μ.

z z = 0 zc

The corresponding z-scores are ±1.645.

c = 0.90

½(1 – c) = 0.05 ½(1 – c) = 0.05

–zc = –1.645 zc = 1.645

Sampling Error

Sampling error

• The difference between the point estimate and the

actual population parameter value.

• For μ:

the sampling error is the difference – μ

μ is generally unknown

varies from sample to sample

x

x

Margin of Error

Margin of error

• The greatest possible distance between the point

estimate and the value of the parameter it is

estimating for a given level of confidence, c.

• Denoted by E.

• Sometimes called the maximum error of estimate or

error tolerance.

c x cE z zn

σ

σ When n ≥ 30, the sample

standard deviation, s, can

be used for σ.

Example: Finding the Margin of Error

Use the social networking website data and a 95%

confidence level to find the margin of error for the

mean number of friends for all users of the website.

Assume the sample standard deviation is about 53.0.

zc

Solution: Finding the Margin of Error

• First find the critical values

z zc z = 0

0.95

0.025 0.025

–zc = –1.96

95% of the area under the standard normal curve falls

within 1.96 standard deviations of the mean. (You

can approximate the distribution of the sample means

with a normal curve by the Central Limit Theorem,

because n = 40 ≥ 30.)

zc = 1.96

Solution: Finding the Margin of Error

53.01.96

40

16.4

c c

sE z z

n n

You don’t know σ, but

since n ≥ 30, you can

use s in place of σ.

You are 95% confident that the margin of error for the

population mean is about 16.4 friends.

Confidence Intervals for the Population

Mean

A c-confidence interval for the population mean μ

•

• The probability that the confidence interval contains μ

is c.

where cx E x E E zn

Constructing Confidence Intervals for μ

Finding a Confidence Interval for a Population Mean

(n ≥ 30 or σ known with a normally distributed population)

In Words In Symbols

1. Find the sample statistics n and

.

2. Specify σ, if known.

Otherwise, if n ≥ 30, find the

sample standard deviation s and

use it as an estimate for σ.

xx

n

2( )1

x xs

n

x

Constructing Confidence Intervals for μ

3. Find the critical value zc that

corresponds to the given

level of confidence.

4. Find the margin of error E.

5. Find the left and right

endpoints and form the

confidence interval.

Use the Standard

Normal Table or

technology.

Left endpoint:

Right endpoint:

Interval:

cE zn

x E

x E

x E x E

In Words In Symbols

Example: Constructing a Confidence

Interval

Construct a 95% confidence interval for the mean

number of friends for all users of the website.

Solution: Recall and E ≈ 16.4 130.8x

130.8 16.4

114.4

x E

130.8 16.4

147.2

x E

114.4 < μ < 147.2

Left Endpoint: Right Endpoint:

Solution: Constructing a Confidence

Interval

114.4 < μ < 147.2

•

With 95% confidence, you can say that the population

mean number of friends is between 114.4 and 147.2.


Interval σ Known

A college admissions director wishes to estimate the

mean age of all students currently enrolled. In a random

sample of 20 students, the mean age is found to be 22.9

years. From past studies, the standard deviation is

known to be 1.5 years, and the population is normally

distributed. Construct a 90% confidence interval of the

population mean age.

zc


Interval σ Known


z z = 0 zc

c = 0.90

½(1 – c) = 0.05 ½(1 – c) = 0.05

–zc = –1.645 zc = 1.645

zc = 1.645

• Margin of error:

• Confidence interval:


Interval σ Known

1.51.645 0.6

20cE z

n

22.9 0.6

22.3

x E

22.9 0.6

23.5

x E


22.3 < μ < 23.5


Interval σ Known

22.3 < μ < 23.5

( ) •

22.9 22.3 23.5

With 90% confidence, you can say that the mean age

of all the students is between 22.3 and 23.5 years.

Point estimate

xx E x E

Interpreting the Results

• μ is a fixed number. It is either in the confidence

interval or not.

• Incorrect: “There is a 90% probability that the actual

mean is in the interval (22.3, 23.5).”

• Correct: “If a large number of samples is collected

and a confidence interval is created for each sample,

approximately 90% of these intervals will contain μ.

Interpreting the Results

The horizontal segments

represent 90% confidence

intervals for different

samples of the same size.

In the long run, 9 of every

10 such intervals will

contain μ.

μ

Sample Size

• Given a c-confidence level and a margin of error E,

the minimum sample size n needed to estimate the

population mean µ is

• If σ is unknown, you can estimate it using s, provided

you have a preliminary sample with at least 30

members.

2

czn

E

Example: Sample Size

You want to estimate the mean number of friends for all

users of the website. How many users must be included

in the sample if you want to be 95% confident that the

sample mean is within seven friends of the population

mean? Assume the sample standard deviation is about

53.0.

zc

Solution: Sample Size


zc = 1.96

z z = 0 zc

0.95

0.025 0.025

–zc = –1.96 zc = 1.96


zc = 1.96 σ ≈ s ≈ 53.0 E = 7

2 21.96 53.0

220.237

czn

E

When necessary, round up to obtain a whole number.

You should include at least 221 users in your sample.

Section 8.2

Confidence Intervals for the Mean

(Small Samples)

The t-Distribution

• When the population standard deviation is unknown,

the sample size is less than 30, and the random

variable x is approximately normally distributed, it

follows a t-distribution.

• Critical values of t are denoted by tc.

t =x - m

s

n

Properties of the t-Distribution

1. The t-distribution is bell shaped and symmetric

about the mean.

2. The t-distribution is a family of curves, each

determined by a parameter called the degrees of

freedom. The degrees of freedom are the number

of free choices left after a sample statistic such as

is calculated. When you use a t-distribution to

estimate a population mean, the degrees of freedom

are equal to one less than the sample size.

d.f. = n – 1 Degrees of freedom

x

Properties of the t-Distribution

3. The total area under a t-curve is 1 or 100%.

4. The mean, median, and mode of the t-distribution are

equal to zero.

5. As the degrees of freedom increase, the t-distribution

approaches the normal distribution. After 30 d.f., the t-

distribution is very close to the standard normal z-

distribution.

t

0 Standard normal curve

The tails in the t-

distribution are “thicker”

than those in the standard

normal distribution. d.f. = 5

d.f. = 2

Example: Critical Values of t

Find the critical value tc for a 95% confidence level

when the sample size is 15.

Table 5: t-Distribution

tc = 2.145

Solution: d.f. = n – 1 = 15 – 1 = 14

Solution: Critical Values of t

95% of the area under the t-distribution curve with 14

degrees of freedom lies between t = ±2.145.

t

–tc = –2.145 tc = 2.145

c = 0.95

Confidence Intervals for the Population

Mean

A c-confidence interval for the population mean μ

•

• The probability that the confidence interval contains μ

is c.

where c

sx E x E E t

n

Confidence Intervals and t-Distributions

1. Identify the sample

statistics n, , and s.

2. Identify the degrees of

freedom, the level of

confidence c, and the

critical value tc.


xx

n

2( )

1x x

sn

cE tn

s

d.f. = n – 1

x

In Words In Symbols

Confidence Intervals and t-Distributions




Left endpoint:

Right endpoint:

Interval:

x Ex E

x E x E

In Words In Symbols


Interval

You randomly select 16 coffee shops and measure the

temperature of the coffee sold at each. The sample mean

temperature is 162.0ºF with a sample standard deviation

of 10.0ºF. Find the 95% confidence interval for the

population mean temperature. Assume the temperatures

are approximately normally distributed.

Solution:

Use the t-distribution (n < 30, σ is unknown,

temperatures are approximately normally distributed).


Interval

• n =16, x = 162.0 s = 10.0 c = 0.95

• df = n – 1 = 16 – 1 = 15

• Critical Value

Table 5: t-Distribution

tc = 2.131


Interval



102.131 5.3

16cE t

n

s

162 5.3

156.7

x E

162 5.3

167.3

x E


156.7 < μ < 167.3


Interval

• 156.7 < μ < 167.3

( ) • 162.0 156.7 167.3


mean temperature of coffee sold is between 156.7ºF

and 167.3ºF.

Point estimate

xx E x E

No

Normal or t-Distribution?

Is n ≥ 30?

Is the population normally,

or approximately normally,

distributed? Cannot use the normal

distribution or the t-distribution. Yes

Is σ known?

No

Use the normal distribution with

If σ is unknown, use s instead.

cE zn

σ

Yes

No

Use the normal distribution with

.cE z

n

σ

Yes

Use the t-distribution with

and n – 1 degrees of freedom.

cE tn

s

Example: Normal or t-Distribution?

You randomly select 25 newly constructed houses. The

sample mean construction cost is $181,000 and the

population standard deviation is $28,000. Assuming

construction costs are normally distributed, should you

use the normal distribution, the t-distribution, or neither

to construct a 95% confidence interval for the

population mean construction cost?

Solution:

Use the normal distribution (the population is

normally distributed and the population standard

deviation is known)

Section 8.3

Confidence Intervals for Population

Proportions

Point Estimate for Population p

Population Proportion

• The probability of success in a single trial of a

binomial experiment.

• Denoted by p

Point Estimate for p

• The proportion of successes in a sample.

• Denoted by

read as “p hat”

number of successes in sampleˆ

sample sizex

pn

Point Estimate for Population p

Point Estimate for q, the population proportion of

failures

• Denoted by

• Read as “q hat”

1ˆ ˆq p

Estimate Population

Parameter…

with Sample

Statistic

Proportion: p p̂

Example: Point Estimate for p

In a survey of 1000 U.S. adults, 662 said that it is

acceptable to check personal e-mail while at work. Find

a point estimate for the population proportion of U.S.

adults who say it is acceptable to check personal e-mail

while at work. (Adapted from Liberty Mutual)

Solution: n = 1000 and x = 662

6620.662 66.2%

1000ˆ

xp

n

Confidence Intervals for p

A c-confidence interval for a population proportion p

•

•The probability that the confidence interval contains p is c.

ˆ ˆwhereˆ ˆ c

pqp E p p E E z

n

Constructing Confidence Intervals for p

1. Identify the sample statistics n

and x.

2. Find the point estimate

3. Verify that the sampling

distribution of can be

approximated by a normal

distribution.

4. Find the critical value zc that

corresponds to the given level of

confidence c.

ˆx

pn

Use the Standard

Normal Table or

technology.

.p̂

5, 5ˆ ˆnp nq p̂

In Words In Symbols

Constructing Confidence Intervals for p





ˆ ˆc

pqE z

n

Left endpoint:

Right endpoint:

Interval:

p̂ Ep̂ E

ˆ ˆp E p p E

In Words In Symbols

Example: Confidence Interval for p

In a survey of 1000 U.S. adults, 662 said that it is

acceptable to check personal e-mail while at work.

Construct a 95% confidence interval for the population

proportion of U.S. adults who say that it is acceptable to

check personal e-mail while at work.

Solution: Recall ˆ 0.662p

1 0.6ˆ ˆ1 62 0.338q p

Solution: Confidence Interval for p

• Verify the sampling distribution of can be

approximated by the normal distribution

p̂

1000 0.662 2ˆ 66 5np

1000 0.338 8ˆ 33 5nq


(0.662) (0.ˆ ˆ 338)1.96 0.029

1000c

pqE z

n

© 2012 Pearson Education, Inc. All rights reserved. 82 of 83



ˆ

0.662 0.029

0.633

p E


0.633 < p < 0.691

ˆ

0.662 0.029

0.691

p E


• 0.633 < p < 0.691


proportion of U.S. adults who say that it is acceptable

to check personal e-mail while at work is between

63.3% and 69.1%.

Point estimate

p̂p̂ E p̂ E

Sample Size

• Given a c-confidence level and a margin of error E,

the minimum sample size n needed to estimate p is

• This formula assumes you have an estimate for

and .

• If not, use and

2

ˆ ˆ czn pq

E

ˆ 0.5.qˆ 0.5p

p̂q̂


You are running a political campaign and wish to

estimate, with 95% confidence, the population proportion

of registered voters who will vote for your candidate.

Your estimate must be accurate within 3% of the true

population proportion. Find the minimum sample size

needed if

1. no preliminary estimate is available.

Solution:

Because you do not have a preliminary estimate

for use and ˆ 5.0.q ˆ 0.5p p,ˆ


• c = 0.95 zc = 1.96 E = 0.03

2 21.96

(0.5)(0.5) 1067.110.

ˆ03

ˆ czn pq

E

Round up to the nearest whole number.

With no preliminary estimate, the minimum sample

size should be at least 1068 voters.


You are running a political campaign and wish to

estimate, with 95% confidence, the population

proportion of registered voters who will vote for your

candidate. Your estimate must be accurate within 3% of

the true population proportion. Find the minimum

sample size needed if

2. a preliminary estimate gives .

ˆ 0.31p

Solution:

Use the preliminary estimate

1 0.31 0. 9ˆ ˆ 61q p

ˆ 0.31p


• c = 0.95 zc = 1.96 E = 0.03

2 21.96

(0.31)(0.69) 913.020.

ˆ ˆ03

czn pq

E

Round up to the nearest whole number.

With a preliminary estimate of , the

minimum sample size should be at least 914 voters.

Need a larger sample size if no preliminary estimate

is available.

ˆ 0.31p

Documents

Chapter 7annegloag.weebly.com/uploads/2/2/9/9/22998796/chapter7.pdf · Chapter 7 Sampling Distributions . Section 7.1 Sampling Distributions and the Central Limit Theorem . Sampling