Chapter 7 Probability and Samples: The Distribution of Sample Means


Samples and Sampling Error

The z-scores and probabilities we have looked at thus far apply to situations where the sample consists of a single score.

This chapter will extend the concepts of z-scores and probability to cover situations with larger samples.

Ex: A z-score for an entire sample

Z-scores (review)

A z-score describes exactly where the score is located in the distribution

Ex: a z-score of +2.00 is extreme

Figure 6.4

The normal distribution following a z-score transformation

Copyright © 2002 Wadsworth Group. Wadsworth is an imprint of the Wadsworth Group, a division of Thomson Learning

Figure labels: Extreme Sample; Central, Representative Sample

Probability (review)

If the distribution of scores is normal, we should be able to determine the probability value for each score.

A score with a z-score beyond +2.00 has a probability of only p = .0228

Figure 6.4

The normal distribution following a z-score transformation

Figure labels: Extreme Sample; Central, Representative Sample

Z-Scores

So far we have been limited to situations where the sample consists of a single score.

Most studies have larger samples

We will now extend the concepts of z-scores and probability to cover situations with larger samples.

A z-score near zero indicates a central, representative sample

A z-score beyond ±2.00 indicates an extreme sample

It will be possible to determine exact probabilities for a sample

Figure 6.4

The normal distribution following a z-score transformation

Figure labels: Extreme Sample; Central, Representative Sample

Difficulties with using samples

Samples provide an incomplete picture of the population

Any statistics computed will not be identical to the corresponding parameters for the entire population

Ex: The mean IQ for a sample of 25 students will differ from the mean IQ of the entire population

The difference is called a sampling error

Sampling Error

This difference, or error, between the sample statistics and the corresponding population parameters is called sampling error

A sampling error is the discrepancy, or amount of error, between a sample statistic and its corresponding population parameter.
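The definition above can be illustrated with a small simulation. This is only a sketch: the population of 1,000 IQ-like scores (mean 100, standard deviation 15), the seed, and the variable names are assumptions made for illustration and do not come from the slides.

```python
import random
from statistics import mean

# Hypothetical population: 1,000 IQ-like scores (illustrative values only).
rng = random.Random(7)
population = [rng.gauss(100, 15) for _ in range(1000)]

# Draw one random sample of n = 25 scores.
sample = rng.sample(population, 25)

mu = mean(population)          # population parameter
x_bar = mean(sample)           # sample statistic
sampling_error = x_bar - mu    # discrepancy between statistic and parameter

print(f"population mean = {mu:.2f}")
print(f"sample mean     = {x_bar:.2f}")
print(f"sampling error  = {sampling_error:+.2f}")
```

Changing the seed gives a different sample and therefore a different sampling error, which is the point of the slide: the statistic varies from sample to sample while the parameter stays fixed.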

Questions

How can you tell which sample is giving the best description of the population?

Can you predict how a sample will describe its population?

What is the probability of selecting a sample that has a certain sample mean?

We can answer these, but we need to set rules that relate samples to populations.

Distribution of Sample Means

Many different samples come up with different results.

A huge set of possible samples forms a relatively simple, orderly, and predictable pattern

This pattern makes it possible to predict the characteristics of a sample with some accuracy.

Distribution of Sample Means (cont.)

The ability to predict sample characteristics is based on the distribution of sample means.

The distribution of sample means is the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population

Distribution of Sample Means (cont.)

It is necessary to have all the possible values in order to compute probabilities.

If a set has 100 samples, the probability of obtaining any specific sample is 1 out of 100 or p = 1/100.

Before, we discussed only scores; now we are discussing statistics (sample means).

Because statistics are obtained from samples, a distribution of statistics is referred to as a sampling distribution.

Sampling Distribution

A sampling distribution is a distribution of statistics obtained by selecting all the possible samples of a specific size from a population.

To construct the distribution of sample means:

Take a sample

Get the mean

Replace the scores

Take another sample

Get the mean

Replace the scores

Do this until you have obtained all possible samples.

Look at Ex. 7.1: a population of 4 scores with n = 2 gives 16 sample means; see the histogram on p. 147.
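The construction procedure above can be sketched in a few lines. The sketch assumes the four-score population 2, 4, 6, 8 used in the chapter's figures and treats sampling as being with replacement, so every ordered pair of scores counts as one possible sample; the variable names are illustrative.

```python
from itertools import product
from collections import Counter

population = [2, 4, 6, 8]   # the four scores from the chapter's example
n = 2                       # sample size

# Sampling with replacement: every ordered pair of scores is a possible sample.
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Frequency of each sample mean (a text histogram of the distribution of sample means).
frequencies = Counter(sample_means)
for value in sorted(frequencies):
    print(f"{value:>4}: {'*' * frequencies[value]}")
```

The 16 samples produce sample means from 2 to 8, with the tallest bar at 5.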

Sample Means

Note that the sample means tend to pile up around the population mean

μ = 5

The sample means are clustered around a value of 5

Sample Means (cont.)

Samples are supposed to be representative of the population

Therefore, the sample means tend to approximate the population mean.

Sample Means (cont.)

The distribution of sample means is approximately normal in shape.

We can use the distribution of sample means to answer probability questions about sample means.

Ex: if you take a sample of n = 2 scores from the original population, what is the probability of obtaining a sample mean greater than 7?

P(X̄ > 7) = ?

Figure 7.1

Frequency distribution for a population of four scores: 2, 4, 6, 8

Table 7.1

The possible samples of n = 2 scores from the population in Figure 7.1

Ex: if you take a sample of n = 2 scores from the original population, what is the probability of obtaining a sample mean greater than 7?

P(X̄ > 7) = ?

Because probability is equivalent to proportion, the probability question can be restated as follows:

Of all the possible sample means, what proportion has values greater than 7?

In Figure 7.2, all the possible sample means are pictured, and only 1 out of the 16 means has a value greater than 7.

Answer: 1 out of 16 or p = 1/16
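The proportion argument can be checked by counting. This sketch again assumes the population 2, 4, 6, 8 with samples of n = 2 drawn with replacement, and uses exact fractions so the result prints as a ratio.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]

# All 16 possible sample means for n = 2, kept as exact fractions.
sample_means = [Fraction(a + b, 2) for a, b in product(population, repeat=2)]

# Probability = proportion of all possible sample means that exceed 7.
favorable = sum(1 for m in sample_means if m > 7)
p = Fraction(favorable, len(sample_means))
print(p)  # 1/16
```

Only the sample (8, 8) has a mean above 7; the samples (6, 8) and (8, 6) have a mean of exactly 7 and so are not counted.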

Figure 7.2

The distribution of sample means for n = 2

The Central Limit Theorem

It might not be possible to list all the samples and compute all the possible sample means.

As the size of n increases, the number of possible samples increases too.

Therefore, it is necessary to develop the general characteristics of the distribution of sample means that can be applied in any situation.

These characteristics are specified in the Central Limit Theorem

It is a cornerstone for much of inferential statistics

Central Limit Theorem

For any population with mean μ and standard deviation σ, the distribution of sample means for sample size n will have a mean of μ and a standard deviation of σ / √n, and will approach a normal distribution as n approaches infinity.
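The mean and standard deviation stated by the theorem (μ and σ / √n) can be verified exactly on a small case. The sketch below assumes the chapter's four-score population 2, 4, 6, 8 with n = 2 and sampling with replacement. It checks only the mean and the standard deviation, not the approach to a normal shape.

```python
from itertools import product
from math import sqrt, isclose
from statistics import mean, pstdev

population = [2, 4, 6, 8]
n = 2

mu = mean(population)        # population mean
sigma = pstdev(population)   # population standard deviation

# Every possible sample mean for samples of size n, drawn with replacement.
sample_means = [sum(s) / n for s in product(population, repeat=n)]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

print("mean of sample means:", mean_of_means, "| population mean:", mu)
print("sd of sample means:  ", sd_of_means, "| sigma / sqrt(n):", sigma / sqrt(n))
print("match:", isclose(mean_of_means, mu) and isclose(sd_of_means, sigma / sqrt(n)))
```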

Central Limit Theorem

Describes the distribution of sample means for any population, no matter what shape, mean, or standard deviation.

The distribution of sample means “approaches” a normal distribution very rapidly.

Describes the distribution of sample means by identifying the three basic characteristics that describe any distribution: shape, central tendency, and variability.

Shape of the Distribution of Means

The distribution of sample means tends to be a normal distribution

It is almost perfectly normal if either of these conditions holds:

The population from which the samples are selected is a normal distribution

The number of scores (n) in each sample is relatively large, around 30 or more.

Mean of the Distribution of Means

The expected value of X̄

The mean of the distribution of sample means is equal to μ (the population mean) and is called the expected value of X̄.

Standard Error of X̄

We have considered the shape and the central tendency of the distribution of sample means.

To completely describe this distribution, we need one more characteristic: variability

Standard Error of X̄

We will be working with the standard deviation for the distribution of sample means.

It is called the standard error of X̄

The standard error defines the standard, or typical, distance from the mean.

Remember, a sample is not expected to provide a perfectly accurate reflection of its population.

There will be some error between the sample and the population

Standard Error of X̄

The standard deviation of the distribution of sample means is called the standard error of X̄.

The standard error measures the standard amount of difference between X̄ and μ due to chance

Standard Error of X̄

Standard error = σ_X̄ = standard distance between X̄ and μ

The symbol σ indicates that we are measuring a standard deviation or a standard distance from the mean

The subscript X̄ indicates that we are measuring the standard deviation for a distribution of sample means.

Standard Error

Valuable because it specifies precisely how well a sample mean estimates its population mean

It tells how much error you should expect on the average

Can use the sample mean as an estimate of the population mean

Standard Error

Magnitude determined by two factors

Size of the sample

The larger the sample size (n), the more probable it is that the sample mean will be close to the population mean

The standard deviation of the population from which the sample is selected

standard error = σ_X̄ = σ / √n

Standard error

When the sample size increases, the standard error decreases

As n decreases, the standard error increases
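The inverse relationship between sample size and standard error follows from the formula σ / √n and can be tabulated directly. In this sketch the function name `standard_error`, the value σ = 100, and the sample sizes are illustrative assumptions.

```python
from math import sqrt

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma / sqrt(n)

sigma = 100  # illustrative population standard deviation
for n in (1, 4, 25, 100):
    print(f"n = {n:>3}: standard error = {standard_error(sigma, n):.1f}")
```

With n = 1 the standard error equals σ itself. Because n sits under a square root, quadrupling the sample size only halves the standard error.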

Probability and the Distribution of Sample Means

The primary use of the distribution of sample means is to find the probability associated with any specific sample.

Remember probability is equivalent to proportion.

Because the distribution of sample means presents the entire set of all possible X̄ values, we can use proportions of this distribution to determine probabilities.

Example 7.2

Population of SAT scores: normal, with μ = 500 and σ = 100

If you take a random sample of n = 25 students, what is the probability that the sample mean would be greater than X̄ = 540?

Restate the probability question as a proportion question

Out of all the possible sample means, what proportion has values greater than 540?

“All the possible sample means” is the distribution of sample means

The problem is to find a specific portion of this distribution

What we know

The distribution is normal because the population of SAT scores is normal

The distribution has a mean of 500 because the population mean is μ = 500

The distribution has a standard error of σ_X̄ = 20

σ_X̄ = σ / √n = 100 / √25 = 100 / 5 = 20

Figure 7.3

A distribution of sample means

We are interested in sample means greater than 540 (the shaded area)

Next, find the z-score value that defines the exact location of X̄ = 540

The value of 540 is located above the mean by 40 pts.

This is 2 s.d. (in this case, 2 standard errors) above the mean

The z-score for X̄ = 540 is z = +2.00

Because this distribution of sample means is normal, you can use the unit normal table to find the probability associated with z = +2.00

The table indicates that 0.0228 of the distribution is located in the tail of the distribution beyond z = +2.00

Conclusion: it is very unlikely, p = 0.0228 (2.28%), to obtain a random sample of n = 25 students with an average SAT score greater than 540
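The steps of Example 7.2 can be reproduced numerically. This sketch uses Python's standard-library `statistics.NormalDist` in place of the unit normal table; the variable names are illustrative, and the values μ = 500, σ = 100, n = 25 are those of the example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 500, 100, 25     # values from Example 7.2
x_bar = 540                     # sample mean of interest

standard_error = sigma / sqrt(n)          # 100 / 5 = 20
z = (x_bar - mu) / standard_error         # (540 - 500) / 20 = +2.00

# Upper-tail area of the standard normal beyond z (replaces the table lookup).
p = 1 - NormalDist().cdf(z)
print(f"z = {z:+.2f}, p = {p:.4f}")  # z = +2.00, p = 0.0228
```

The same three steps (standard error, z-score, tail area) apply to any sample mean, provided the distribution of sample means can be treated as normal.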

Z-scores

It is possible to use a z-score to describe the position of any specific sample within the distribution of sample means

The z-score tells exactly where a specific sample is located in relation to all the other possible samples that could have been obtained.

Figure 7.8

Showing standard error in a graph
