Outline

1

Outline

1. Review of last week

2. Sampling distributions

3. The sampling distribution of the mean

4. The Central Limit Theorem

5. Confidence intervals

6. Normal distribution example

7. Sampling distribution example

8. Confidence interval example

2

Review of last week

Last week, we learned how to use the Standard Normal Distribution to work out the probability of finding individual scores in some interval – e.g., what is the probability that the next Canadian woman we meet is taller than 175 cm?

Today, we’re going to do the same sort of thing with sample means rather than individual scores.

3

The sampling distribution of a sample statistic (such as $\bar{X}$) is the probability distribution of that statistic.

[Figure: a sample with mean $\bar{X}$ drawn from a population with mean $\mu$.]

4

The sampling distribution of the mean consists of all possible sample means, for all possible samples of size n, that you could take from the population.

[Figure: Samples 1 to 4, with means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$, and $\bar{X}_4$, each drawn from a population with mean $\mu$.]

5

When we draw a sample from a population, we are at the same time drawing a sample mean from the distribution of sample means for samples of size n.

[Figure: Distribution of sample means for samples of size n, centered at $\mu_{\bar{X}}$, with one sample mean $\bar{X}$ marked.]

6

Sampling distributions

The sampling distribution of a sample statistic is the probability distribution of that statistic.

We can have sampling distributions of any sample statistic:

Mean $\bar{X}$

Median $M$

Variance $s^2$

Standard deviation $s$

7

The sampling distribution of the mean

The sampling distribution of the sample mean $\bar{X}$.

$E(\bar{X}) = \mu_{\bar{X}} = \mu$

Variability of this distribution is given by the standard error of the mean:

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \cong s_{\bar{X}} = \frac{s}{\sqrt{n}}$

8

The Central Limit Theorem

Consider a random sample of n observations from a population with mean $\mu$ and standard deviation $\sigma$.

When n is sufficiently large, the sampling distribution of $\bar{X}$ will be approximately normal with mean $\mu_{\bar{X}} = \mu$ and standard error $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.

Note: this is true regardless of the shape of the underlying distribution of raw scores.

9


The larger the sample size, the better the approximation to the normal distribution.

For most populations, n ≥ 30 will be “sufficiently large.”
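The theorem can be checked with a small simulation. The sketch below is only an illustration: the exponential population, the number of repetitions, and the function name `simulate_sample_means` are assumptions chosen for the demonstration, not part of the lecture.

```python
import random
import statistics


def simulate_sample_means(n, num_samples=5000, seed=0):
    """Return the means of num_samples random samples of size n.

    The population is exponential with rate 1 (an assumed example):
    it has mean 1 and standard deviation 1 and is strongly right-skewed.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


means = simulate_sample_means(n=30)

mean_of_means = statistics.fmean(means)  # compare with mu = 1
sd_of_means = statistics.stdev(means)    # compare with sigma / sqrt(n)
predicted_se = 1.0 / 30 ** 0.5           # sigma / sqrt(n) for this population

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f}")
print(f"sigma / sqrt(n):      {predicted_se:.3f}")
```

Although the raw scores are far from normal, the simulated sample means should cluster around the population mean with a spread close to $\sigma/\sqrt{n}$. A histogram of `means` would look roughly bell-shaped.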

10


When we draw a sample and measure its mean, the CLT lets us assume (for sufficiently large n) that the sampling distribution of the sample mean is approximately normal.

That means we can use the standard normal distribution (SND) to work out the probability of finding a sample mean in a given range relative to the population mean.

11

[Figure: The sampling distribution of the sample mean $\bar{X}$, centered at $\mu_{\bar{X}}$.]

12

The sampling distribution of the mean

We use the sampling distribution of the mean the way we used the SND last week. We obtain probabilities of finding sample means in a given range relative to the population mean, for samples of size n.

Don’t forget to use the standard error, $\sigma_{\bar{X}}$, rather than the standard deviation, $\sigma$!

13

Confidence Intervals

There are two ways to estimate population parameters such as the mean:

1. Point estimates, such as $\bar{X}$

2. Interval estimates, which tell us a range of values that will contain the parameter with known probability.

14

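[Figure: sampling distribution of $\bar{X}$ centered at $\mu_{\bar{X}}$, with an area of .45 on each side of the mean between Z = -1.645 and Z = +1.645.]

90% of the time, $\bar{X}$ will fall within the range Z = -1.645 to Z = +1.645.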

15


If 90% of the time $\bar{X}$ falls in the range Z = -1.645 to Z = +1.645 around the mean µ, then…

90% of the time, µ must fall within a range of the same width centered on $\bar{X}$.
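The step from the first statement to the second can be written out algebraically. This is a sketch for the 90% case, writing $\sigma_{\bar{X}}$ for the standard error of the mean:

```latex
\begin{align*}
P\left(-1.645 \le \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \le 1.645\right) &= .90 \\
P\left(-1.645\,\sigma_{\bar{X}} \le \bar{X} - \mu \le 1.645\,\sigma_{\bar{X}}\right) &= .90 \\
P\left(\bar{X} - 1.645\,\sigma_{\bar{X}} \le \mu \le \bar{X} + 1.645\,\sigma_{\bar{X}}\right) &= .90
\end{align*}
```

The last line is obtained by subtracting $\bar{X}$ from each part of the inequality and multiplying through by $-1$, which reverses the inequality signs.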

16


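For a given $\alpha$, the $100(1-\alpha)\%$ confidence interval for $\mu_{\bar{X}}$ is:

$\text{C.I.} = \bar{X} \pm Z_{\alpha/2}\,\sigma_{\bar{X}}$

$\text{C.I.} = \bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$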

17


When $\sigma$ is not known and n is large (≥ 30), use s:

$\text{C.I.} = \bar{X} \pm Z_{\alpha/2}\,s_{\bar{X}}$

$\text{C.I.} = \bar{X} \pm Z_{\alpha/2}\,s/\sqrt{n}$

18

Normal Distribution Example

The amount of time that students wait to be served when buying coffee from the “Campus Perks” coffee outlet is normally distributed with a mean of 62.0 seconds and a 98.5th percentile of 79.36 seconds. In a random sample of 30 students buying coffee at Campus Perks, approximately how many will wait between 40 and 58 seconds to be served?

NOTE: This is not a question about a sample mean!

19


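[Figure: normal curve of waiting times with 40, 58, 62 (the mean), and $P_{98.5}$ marked on the axis; the area below the mean is .50 and the area between the mean and $P_{98.5}$ is .4850.]

Z for .4850 = 2.17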

20


$\sigma = \frac{79.36 - 62}{2.17} = 8$

$Z_1 = \frac{40 - 62}{8} = -2.75$ (p = .4970 from table)

$Z_2 = \frac{58 - 62}{8} = -0.50$ (p = .1915 from table)

21


P(40 ≤ X ≤ 58) = .4970 - .1915 = .3055

The probability of any one student waiting between 40 and 58 seconds is .3055.

Therefore, in a random sample of 30, we expect approximately .3055 (30) = 9.165 ≈ 9 students to wait between 40 and 58 seconds.
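The same steps can be reproduced without a Z table. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and because it uses unrounded Z values the results may differ slightly from the table-based figures.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

mean_wait = 62.0   # population mean (seconds)
p985_wait = 79.36  # 98.5th percentile (seconds)

# Step 1: recover sigma from the percentile, sigma = (P98.5 - mean) / Z.
z_985 = std_normal.inv_cdf(0.985)
sigma = (p985_wait - mean_wait) / z_985

# Step 2: probability that one student waits between 40 and 58 seconds.
wait_times = NormalDist(mu=mean_wait, sigma=sigma)
p_between = wait_times.cdf(58) - wait_times.cdf(40)

# Step 3: expected number of such students in a sample of 30.
expected_count = 30 * p_between

print(f"sigma = {sigma:.2f}")
print(f"P(40 <= X <= 58) = {p_between:.4f}")
print(f"expected count out of 30 = {expected_count:.2f}")
```

Note that the calculation uses the population standard deviation directly, not a standard error, because the question concerns individual scores.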

22

Sampling Distribution Example

People’s reaction times (RTs) to a simple visual stimulus are normally distributed with a mean of 500 milliseconds and a standard deviation of 150 milliseconds. You believe that people who go on a low-carb diet, however, will have slower (longer) RTs than this, on average, though their standard deviation will remain at 150. To test your belief, you take a random sample of 40 people who self-report having been on a low-carb diet for at least 6 months and measure their RTs. You decide that your belief will be supported if the mean RT of the low-carb group is 565 milliseconds or slower. What is the probability that you will conclude that your belief has been supported even if a low-carb diet actually has no effect on RTs whatsoever?

23

[Figure: sampling distribution of the mean RT for samples of n = 40, centered at 500; the shaded upper tail at 565 and beyond is the probability we want.]

24

Example 2

What is $P(\bar{X} \ge 565 \mid \mu = 500)$?

$Z = \frac{565 - 500}{150/\sqrt{40}}$

25

Example 2

What is $P(\bar{X} \ge 565 \mid \mu = 500)$?

$Z = \frac{565 - 500}{150/\sqrt{40}} = \frac{65}{23.72} = 2.74$

P for Z = 2.74 (from table) is .4969.

Therefore, the desired probability is .5 - .4969 = .0031.
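This calculation can be sketched in code as a check. The example below assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu = 500      # population mean RT (ms) if the diet has no effect
sigma = 150   # population standard deviation (ms)
n = 40        # sample size
cutoff = 565  # sample mean at or above which the belief counts as supported

# The sampling distribution of the mean uses the standard error, not sigma.
standard_error = sigma / sqrt(n)
sampling_dist = NormalDist(mu=mu, sigma=standard_error)

# Upper-tail probability: P(sample mean >= 565 | mu = 500).
p_false_support = 1 - sampling_dist.cdf(cutoff)

print(f"standard error = {standard_error:.2f}")
print(f"P(mean >= {cutoff}) = {p_false_support:.4f}")
```

Replacing `standard_error` with `sigma` in `NormalDist` would answer a different question: the probability that a single person's RT is 565 ms or longer.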

26

Example 3

Two variables important to a professional football player are speed and strength. Each year, camps are held to determine potential players’ speed and strength, both of which are continuous, normally-distributed, and independent of each other. The middle 95% of strength scores is bounded by 600 and 900 (on a composite strength index). The average time to run 40 yards is 4.6 seconds, and 40 yard time exceeds 6 seconds only 5% of the time.

a. In order to be considered by a team, a potential player must not exceed the 75th percentile for time to run 40 yards. What is the slowest a player can run 40 yards and still be considered?

27

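[Figure: Probability distribution for time to run 40 yards (seconds), with the mean 4.6 and the value 6 marked on the axis; the area between 4.6 and 6 is .45, and the area between 4.6 and the unknown 75th percentile, X seconds, is .25.]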

28

Example 3

$Z_{.45} = 1.645 = \frac{6 - 4.6}{\sigma}$

$\sigma = \frac{6 - 4.6}{1.645} = .851$

29

Example 3

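Now we can find X (the 75th percentile):

$Z_{.25} = 0.675 = \frac{X - 4.6}{.851}$

$X = 0.675 \times .851 + 4.6 = 5.17$ (seconds)

30

[Figure: probability distribution for time to run 40 yards, with the mean 4.6, the 75th percentile, and 6 marked on the axis.]

The 75th percentile for 40-yard times is 5.17 seconds.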
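Both steps of part (a), recovering σ from a known percentile and then converting a Z value back to a raw score, can be checked numerically. This is a minimal sketch assuming Python's standard-library `statistics.NormalDist`, with illustrative variable names.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

mean_time = 4.6  # mean 40-yard time (seconds)
time_95th = 6.0  # 40-yard time exceeded only 5% of the time

# Step 1: sigma = (95th percentile - mean) / Z for the 95th percentile.
z_95 = std_normal.inv_cdf(0.95)
sigma_time = (time_95th - mean_time) / z_95

# Step 2: the 75th percentile is the mean plus Z(.75) * sigma.
z_75 = std_normal.inv_cdf(0.75)
slowest_allowed = mean_time + z_75 * sigma_time

print(f"sigma = {sigma_time:.3f}")
print(f"75th percentile = {slowest_allowed:.2f} seconds")
```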

31

Example 3

Two variables important to a professional football player are speed and strength. Each year, camps are held to determine potential players’ speed and strength, both of which are continuous, normally-distributed, and independent of each other. The middle 95% of strength scores is bounded by 600 and 900 (on a composite strength index). The average time to run 40 yards is 4.6 seconds, and 40 yard time exceeds 6 seconds only 5% of the time.

b. You take a random sample of 200 potential players. What is the probability that the average strength score of the sample is less than or equal to 740?

32

[Figure: Probability distribution for strength scores, centered at µ = 750, with the middle 95% of scores lying between 600 and 900.]

33

Example 3

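The middle 95% leaves an area of .475 on each side of the mean, so the bounds lie at Z = ±1.96:

$Z_{.475} = 1.96 = \frac{900 - 750}{\sigma}$

$\sigma = \frac{900 - 750}{1.96} = 76.53$

$Z = \frac{740 - 750}{76.53/\sqrt{200}} = \frac{-10}{5.41} = -1.85$

34

Example 3

$P(0 < Z < 1.85) = .4678$ (from table)

Tail probability will be $.5 - .4678 = .0322$

35

[Figure: the sampling distribution of mean strength scores for samples with n = 200, centered at 750; the shaded lower tail below 740 has area .0322, the probability that the mean for a sample of 200 players is less than or equal to 740.]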
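Part (b) can also be checked numerically. The sketch below assumes Python's standard-library `statistics.NormalDist` and rests on two readings of the problem statement: the mean is the midpoint of the two bounds (by symmetry of the normal distribution), and a middle 95% leaves 2.5% in each tail. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

lower, upper = 600, 900  # bounds of the middle 95% of strength scores
n = 200                  # sample size

# By symmetry, the population mean is midway between the bounds.
mu_strength = (lower + upper) / 2

# A middle 95% leaves 2.5% in each tail, so the upper bound sits at Z(.975).
z_bound = std_normal.inv_cdf(0.975)
sigma_strength = (upper - mu_strength) / z_bound

# The sample mean follows a normal distribution with the standard error.
standard_error = sigma_strength / sqrt(n)
p_low_mean = NormalDist(mu_strength, standard_error).cdf(740)

print(f"sigma = {sigma_strength:.2f}")
print(f"standard error = {standard_error:.2f}")
print(f"P(mean <= 740) = {p_low_mean:.4f}")
```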

36

Confidence Interval Example

A researcher samples 36 undergraduates from a local university and finds it took them 36.4 days, on average, to find a job, with a standard deviation of 8 days. Use these data to form a 96% confidence interval for the true mean time it takes for graduates to find a job.

NOTE: We are not given the population standard deviation

37


Recall:

$\text{C.I.} = \bar{X} \pm Z_{\alpha/2}\,s_{\bar{X}} = \bar{X} \pm Z_{\alpha/2}\,s/\sqrt{n}$

$\bar{X} = 36.4$

$s = 8$

$n = 36$

$s/\sqrt{n} = 8/6 = 1.33$

38


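$100(1-\alpha)\% = 96\%$, so $\alpha/2 = .02$, which is the tail probability.

A tail probability of $\alpha/2 = .02$ leaves an area of .48 between the mean and $Z$; looking this up in the table gives $Z_{.48} = 2.05$.

$\text{C.I.} = 36.4 \pm 2.05\,(1.33)$

$(33.67 \le \mu \le 39.13)$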
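The large-sample interval can be wrapped in a small helper. The function below is a sketch, not a library routine: the name `z_confidence_interval` and its parameters are hypothetical, and it assumes n is large enough (≥ 30) for s to stand in for σ.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sample_sd, n, confidence):
    """Return (low, high) for a large-sample Z confidence interval.

    confidence is a proportion, e.g. 0.96 for a 96% interval.
    """
    alpha = 1 - confidence
    # Z value that leaves alpha/2 in the upper tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sample_sd / sqrt(n)  # Z times the estimated standard error
    return sample_mean - margin, sample_mean + margin


low, high = z_confidence_interval(sample_mean=36.4, sample_sd=8, n=36, confidence=0.96)
print(f"96% C.I.: ({low:.2f}, {high:.2f})")
```

Because the code keeps Z and $s/\sqrt{n}$ unrounded, the endpoints may differ in the second decimal place from the hand calculation, which rounds both values before multiplying.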
