26
Chapter 6 Sampling and Sampling Distributions 6.1 Denitions A statistical population is a set or collection of all possible observations of some characteristic. A sample is a part or subset of the population. A random sample of size is a sample that is chosen in such a way as to ensure that every sample of size has the same probability of being chosen. A parameter is a number describing some (unknown) aspect of a population. (i.e. ) A statistic is some function of the sample observations. (i.e. ¯ ) The probability distribution of a statistic is known as a sampling distribution. (How is ¯ distributed) We need to distinguish the distribution of a random variable, say ¯ from the re- alization of the random variable (ie. we get data and calculate some sample mean say ¯ =42) 1

Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

  • Upload
    dangthu

  • View
    264

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

Chapter 6

Sampling and Sampling

Distributions

6.1 Definitions

• A statistical population is a set or collection of all possible observations of some

characteristic.

• A sample is a part or subset of the population.

• A random sample of size is a sample that is chosen in such a way as to ensure

that every sample of size has the same probability of being chosen.

• A parameter is a number describing some (unknown) aspect of a population. (i.e.

)

• A statistic is some function of the sample observations. (i.e. )

• The probability distribution of a statistic is known as a sampling distribution. (How

is distributed)

• We need to distinguish the distribution of a random variable, say from the re-

alization of the random variable (ie. we get data and calculate some sample mean

say = 42)

1

Page 2: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

2 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-4

A Population is the set of all items or individuals of interest Examples: All likely voters in the next election

All parts produced todayAll sales receipts for November

A Sample is a subset of the population Examples: 1000 voters selected at random for interview

A few parts selected for destructive testingRandom receipts selected for audi t

Populations and Samples

Figure 6.1:

Page 3: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.1. DEFINITIONS 3

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-5

Population vs. Sample

a b c d

ef gh i jk l m n

o p q rs t u v w

x y z

Population Sample

b c

g i n

o r u

y

Figure 6.2:

Page 4: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

4 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

Note on Statistics

• The value of the statistic will change from sample to sample and we can therefore

think of it as a random variable with it’s own probability distribution.

• is a random variable

• Repeated sampling and calculation of the resulting statistic will give rise to a dis-tribution of values for that statistic.

Page 5: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.1. DEFINITIONS 5

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-10

Sampling Distributions

A sampling distribution is a distribution of all of the possible values of a statistic for a given size sample selected from a

population

Figure 6.3:

Page 6: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-11

Chapter Outline

Sampling Distributions

Sampling Distribution of

Sample Mean

Sampling Distribution of

Sample Proportion

Sampling Distribution of

Sample Difference in

Means

Figure 6.4:

Page 7: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.2. IMPORTANT THEOREMS RECALLED 7

6.2 Important Theorems Recalled

Suppose 12 are independent with [] = and [] = 2 ∀ =1 2 .

Suppose = 11 + 22 + + + , then:

[ ] = [X

+ ] = 1[1] + 2[2] + · · ·+ [] +

= 11 + +

=X

+

and

[ ] = [X

+ ] = 21 [1] + 22 [2] + · · ·+ 2 []

= 2121 + 22

22 + · · ·+ 2

2

=X

22 because of independence

and if is normal ∀ i.e. ∼ ( 2 ) independently ∀:

∼ (X

+ X

22 )

6.3 Frequently used Statistics

6.3.1 The sample mean

• Let 1 2 be a random sample of size from a population with mean

and variance 2. The sample mean is:

Page 8: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

8 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

=1

X=1

1. The expected value of the sample mean is the population mean:

[] = (1

X=1

) =1

X=1

[] =1

(+ + ) =

2. The variance of the sample mean ( 0 independent):

[] = (1

X=1

) =1

2

X=1

()

=1

2

X=1

2 =2

3. If we do not have independence it can be shown that

[] =2

µ −

− 1¶where is the population size

µ −

− 1¶is called the correction factor

and if is large relative to then¡−−1

¢⇒ 1 so that [] = 2

Note on Sample Mean

1. The use of the formulas for expected values and variances of sums of random

variables that we saw in chapter 5.

2. The variance of the sample mean is a decreasing function of the sample size.

3. The standard deviation of the sample mean (under independence)

=√

Page 9: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.4. SAMPLING DISTRIBUTION OF THE SAMPLE MEAN 9

6.3.2 The sample variance

• Let 1 2 be a random sample of size from a population with variance

2.

• The sample variance is:

2 =1

− 1X=1

( − )2

1. [2] = 2 (we omit the proof)

6.4 Sampling distribution of the Sample Mean

Sampling from a Normal Population

• Let be the sample mean of an independent random sample of size from a

population with mean and variance 2.

• Then we know that [] = and [] = 2

.

• If we further specify the population distribution as being normal, then

∼ ( 2) for all

and we can write:

µ

2

6.5 Equation for the Standardized Sample Mean

Since ∼ ( 2

) we can ask what transformation can give us to a standard normal

• Generically the approach is ALWAYS

=Random Variable-Mean of Random Variable

Standard Deviation of Random Variable

• What does that mean for :

Page 10: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

10 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

• Random Variable =

• Mean of Variable = [] =

• Standard Deviation of Variable = =√

• Put it all together

= −

Page 11: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.5. EQUATION FOR THE STANDARDIZED SAMPLE MEAN 11

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-22

Z-value for Sampling Distributionof the Mean

Z-value for the sampling distribution of :

where: = sample mean

= population mean

= population standard deviation

n = sample size

Xμσ

n

σμ)X(

σ

μ)X(Z

X

X

Figure 6.5:

Page 12: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

12 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-26

Sampling Distribution Properties

For sampling with replacement:

As n increases,

decreasesLarger sample size

Smaller sample size

x

(continued)

μ

Figure 6.6:

Example of Standardizing for Sample Mean

The lengths of individual machined parts coming off a production line at Morton Metal-

works are normally distributed around their mean of = 30 centimeters. Their standard

deviation around the mean is = 1 centimeter. An inspector just took a sample of

= 4 of these parts and found that for this sample is 29.875 centimeters. What is

the probability of getting a sample mean this low or lower if the process is still producing

parts at a mean of = 30?

Answer

• Given the population is normally distributed with mean = 30 and standard

deviation = 1, we know that ∼ (30 12) for all ,

so

∼ (30 12)

Now want to apply our transformation stuff:

Page 13: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.5. EQUATION FOR THE STANDARDIZED SAMPLE MEAN 13

( ≤ 29875) = ( −

√≤ 29875− 30

1√4

)

= ( ≤ 29875− 301√4

) = ( ≤ −250) = 0062

Questions: 7.3.

6.5.1 Sampling from a Non Normal Distribution

• We have seen that we can obtain the exact sampling distribution for the sample

mean if the individual are all independent normal variates.

• What happens when the 0 are not normally distributed?

Page 14: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

14 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-13

Developing a Sampling Distribution

Assume there is a population …

Population size N=4

Random variable, X,

is age of individuals

Values of X:

18, 20, 22, 24 (years)

A B C D

Figure 6.7:

Page 15: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.5. EQUATION FOR THE STANDARDIZED SAMPLE MEAN 15

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-14

.25

018 20 22 24

A B C D

Uniform Distribution

P(x)

x

(continued)

Summary Measures for the Population Distribution:

Developing a Sampling Distribution

214

24222018

N

Xμ i

2.236N

μ)(Xσ

2i

Figure 6.8:

Page 16: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

16 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

6.5.2 The Central Limit Theorem

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-15

1st 2nd Observation Obs 18 20 22 24

18 18,18 18,20 18,22 18,24

20 20,18 20,20 20,22 20,24

22 22,18 22,20 22,22 22,24

24 24,18 24,20 24,22 24,24

16 possible samples (sampling with replacement)

Now consider all possible samples of size n = 2

1st 2nd Observation Obs 18 20 22 24

18 18 19 20 21

20 19 20 21 22

22 20 21 22 23

24 21 22 23 24

(continued)

Developing a Sampling Distribution

16 Sample Means

Figure 6.9:

Page 17: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.5. EQUATION FOR THE STANDARDIZED SAMPLE MEAN 17

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-16

1st 2nd ObservationObs 18 20 22 24

18 18 19 20 21

20 19 20 21 22

22 20 21 22 23

24 21 22 23 24

Sampling Distribution of All Sample Means

18 19 20 21 22 23 240

.1

.2

.3 P(X)

X

Sample Means Distribution

16 Sample Means

_

Developing a Sampling Distribution

(continued)

(no longer uniform)

_

Figure 6.10:

Page 18: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

18 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

Let12 be an independent random sample having identical distribution from

a population of any shape with mean and variance 2.

Then if is large:

∼ (2

) approximately for large

and similarly we can use the same transformation to standard form:

= −

∼ (0 1) approximately for large

Notes on the Central Limit Theorem

1. This result holds only for large n and we refer to such results as holding asymp-

totically. In this case we say that is asymptotically normally distributed

2. We have a short-hand way to write that the distribution of is independently and

identically ( ) distributed with mean and variance 2

∼ ( 2)

3. Identically implies that [] = and [] = 2 for all . That is the distribution

of each observation ( = 1 ) is the same.

Page 19: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.5. EQUATION FOR THE STANDARDIZED SAMPLE MEAN 19

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-28

n?

Central Limit Theorem

As the sample size gets large enough…

the sampling distribution becomes almost normal regardless of shape of population

x

Figure 6.11:

Page 20: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

20 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-31

Example

Suppose a population has mean µ = 8 and standard deviation s = 3. Suppose a random sample of size n = 36 is selected.

What is the probability that the sample mean is between 7.8 and 8.2?

Figure 6.12:

See ??

Page 21: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.5. EQUATION FOR THE STANDARDIZED SAMPLE MEAN 21

Stati sti cs for Business and Economi cs, 6e © 2007 Pearson E ducation, Inc. Chap 7-33

Example

Solution (continued):(continued)

0.38300.5)ZP(-0.5

363

8-8.2

μ- μ

363

8-7.8P 8.2) μ P(7.8 X

X

Z7.8 8.2 -0.5 0.5

Sampling Distribution

Standard Normal Distribution .1915

+.1915

Population Distribution

??

??

?????

??? Sample Standardize

8μ 8μX 0μ

zxX

Figure 6.13:

Example From Transformation to Standard Form when Sampling from a Non-

Normal Distribution

• The delay time for inspection of baggage at a border station follows a bimodal

distribution with a mean of = 8 minutes and a standard deviation of = 6

minutes. A sample of = 64 from a particular minority group has a mean of

= 10 minutes.

• Is their evidence that this minority group is being detained longer than usual?

• How likely is it that a sample mean of ≥ 10 if the population mean is 8?

Answer:

• note from the question that this population from which we are sampling is non

normal

• we know this is from the word bimodal since the normal has one mode

Page 22: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

22 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

• If the are , then by applying (or invoke) the central limit theorem:

( ≥ 10) ≈ ( ≥ 10− 86√64)

= ( ≥ 267) = 0038

approximately.

Questions: NCT 7.11, 7.18, 7.19, &7.20.

6.6 Sampling Distribution: Differences of SampleMeans

1 − 2

• Let 1 and 2 be the means of two samples from two separate and independent

populations.

[1] = 1 [1] =211

[2[= 2 [2] =222

Since 1 and 2 are independent:

[1 − 2] = [1]−[2] = 1 − 2

[1 − 2] = [1] + [2] =211+

222

6.6.1 Normal Case for 1 − 2

If 1 are as (1 21) , 2 are . as (2

22), and 1 and 2 are

independent,

then we have an exact normal result for any sample size:

1 − 2 ∼ (1 − 2211+

222)

• Recall a linear combination of normals is normal

Page 23: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.6. SAMPLINGDISTRIBUTION: DIFFERENCESOF SAMPLEMEANS 1−2 23

6.6.2 Non Normal Case for 1 − 2

If 1 are with mean 1 and variance 21, 2 are . with mean 2 and

variance 22, and 1 and 2 are independent, then using the central limit theorem,

for large 1 , and 2

1 − 2 ∼ (1 − 2211+

222) approximately or asymptotically

6.6.3 Example of Difference of Means with Non-Normal Popu-

lation

Suppose that right-handed (RH) students have a mean IQ of 80 units with variance

1,400 and left-handed (LH) students have a mean IQ of 80 with variance 1,320. What is

the probability that the sample mean IQ of RH students will be at least 5 units higher

than the sample mean IQ of LH students if we take a sample of 100 RH students and 120

LH students?

Answer: Let 1 be the IQ of the ith RH student and 2 be the IQ of the ith LH

student.

1 = 80 21 = 1400 1 = 100

2 = 80 22 = 1320 2 = 120

and we want to find (1 − 2 ≥ 5).

Since 1 and 2 are both large we can apply the central limit theorem,

1 − 2 ∼ (1 − 2211+

222) approximately

here:

1 − 2 = 80− 80 = 0 and211+

222= 25

Note: Formula for the standardizing transformation is:

=(1 − 2)− (1 − 2)

1−2

Page 24: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

24 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

where

1−2=

s211+

222

So that

1 − 2 ∼ (0 25)

(1 − 2 ≥ 5) = ( ≥ 1) = 1587

Page 25: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

6.7. SAMPLING DISTRIBUTION OF SAMPLE PROPORTION 25

6.7 Sampling Distribution of Sample Proportion

Let be a binomially distributed random variable (the number of successes in trials).

• Recall the sample proportion is

=

s the fraction of successes in trials

• In Chapter 6 we used the normal approximation to the binomial as the number oftrials got large ((1− ) ≥ 9)

• This is another application of the Central Limit Theorem() = = and () = 2 = (1− )

• So we might ask what is the [] and []

•[] =

¸=

[]

=

=

• Recall trials are independent so that

[] =

¸=

[]

2=

(1− )

2=

(1− )

• So we apply the generic formula

=Random Variable-Mean of Random Variable

Standard Deviation of Random Variable

• What is mean for

• Random Variable is

• Mean of Variable is [] =

• Standard Deviation of Variable is =

q(1−)

• Put it all together

=−q(1−)

for large

Page 26: Chapter 6 Sampling and Sampling Distributions - QEDqed.econ.queensu.ca/walras/custom/200/250A/250F2009/lect6.pdf · Chapter 6 Sampling and Sampling ... Random receipts selected for

26 CHAPTER 6. SAMPLING AND SAMPLING DISTRIBUTIONS

6.8 Sampling Distribution of Sample Proportion: 1−2

• Consider two independent populations.

1. Population 1:

1 = 1 = 1 1 =1

1

Recall

[1] = 1

and

[1] =1(1− 1)

1

2. Population 2:

2 = number of successes

2 = number in sample 2

2 =2

2

[2]=2(1− 2)

2

3. Form Difference of Sample Proportion: 1 − 2

• If 1 and 2 are large, i.e. 11(1− 1) ≥ 9 and 22(1− 2) ≥ 9 then:

1 − 2 ∼ (1 − 21(1− 1)

1+

2(1− 2)

2)

• This is another application of the central limit theorem

Questions: NCT 7.21,7.22, 7.27 & 7.36.

Omit Sampling Distribution of the Sample Variance