Estimating from Samples
© Christine Crisp
“Teach A Level Maths”
Statistics 2
Suppose we want to know the average height of 17 year-olds.
The heights of 17 year-olds form a population.
But I’ve been vague about my population.
Do I mean the 17 year-olds in my school or college? Or do I mean all the 17 year-olds in England? Or all those in the world?
Am I putting the girls and boys together? Or do I want to consider 2 populations, boys and girls, separately?
In Statistics, a population is the complete set of values of the property we are interested in, one value for each member of the entire group.
We always need to state clearly what our population is.
Even fairly small populations can be costly or impossible to investigate.
If you wanted to know the mean height of the 17 year-olds in your school or college would you have time to measure them all? What would you do about those who are absent when you measure?
If you wanted to know the mean height of the 17 year-olds in England it would be impossible to collect the data on all of them.
For these reasons we take samples. From a sample we can make a prediction about the population.
We are going to see how a calculation from a sample, called a statistic, can be used to estimate a parameter of a population.
A population doesn’t have to consist of people. It might, for example, be the times taken to run the 100m in the last Olympics, or lengths of the runner beans in my garden this year.
Suppose we are interested in the size of eggs laid by a flock of hens. For example, we may already have an estimate of the mean weight before a change of diet and we want to know whether the new diet has increased the weight of the eggs.
We can’t weigh all the eggs so we choose to take a simple random sample and weigh those.
Let’s assume that our population consists of the weights of all the eggs laid in a week and that there are 1000 of them.
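As a rough illustration of what "simple random sample" means here, the sketch below invents a population of 1000 egg weights and picks 5 of them so that every egg is equally likely to be chosen. The population values (mean 60 g, standard deviation 4 g) are assumptions made up for the example, not data from the flock.

```python
import random

random.seed(1)  # fixed seed so the sketch gives the same result each run

# Hypothetical population: 1000 egg weights in grams.
# The mean (60) and standard deviation (4) are assumed values for illustration only.
population = [round(random.gauss(60, 4), 1) for _ in range(1000)]

# A simple random sample of 5 eggs: each egg is equally likely to be picked,
# and no egg can be picked twice.
sample = random.sample(population, 5)
print(sample)
```

Running it again with a different seed gives a different sample, which is exactly why two samples from the same population can have different means.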
Suppose the sample consists of 5 eggs with the following masses in grams.
58·6, 61·0, 63·0, 64·0, 66·8
The mean, $\bar{x}_1$ = 62·7
( The subscript 1 indicates the 1st sample. )
Suppose on the second day we take another sample and this time the sample is
52·8, 57·1, 58·8, 59·9, 62·5
The mean, $\bar{x}_2$ = 58·2
It’s very likely that the population mean is neither of these values so we need to decide whether we are justified in using either of them and, if so, how accurate they will be.
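The two sample means can be checked with a few lines of Python. This is only a sketch: it assumes the second sample's values are 52·8, 57·1, 58·8, 59·9 and 62·5 grams (decimal points placed by analogy with the first sample), and the variable names are my own.

```python
from statistics import mean

# Masses in grams of the two samples of 5 eggs.
sample_1 = [58.6, 61.0, 63.0, 64.0, 66.8]
sample_2 = [52.8, 57.1, 58.8, 59.9, 62.5]  # decimal points assumed, as for sample 1

mean_1 = mean(sample_1)  # sum of the values divided by 5
mean_2 = mean(sample_2)

# Rounded to 1 decimal place to match the precision of the data.
print(round(mean_1, 1), round(mean_2, 1))
```

The two statistics differ by several grams even though both samples come from the same population.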
Before we can make a decision about the sample means we need to look at some theory.
To do this I’m going to pretend that I know the weights of all the eggs in the population. We can then compare the samples with the population.
I’ll draw a frequency diagram of the weights of all the eggs in the population.
[Figure: Population]
Now we’ll superimpose the 1st sample of 5 eggs.
[Figure: Population and 1st sample]
The weights in the sample are indicated by the arrows.
[Figure: Population and 1st sample]
This is the 1st sample mean, $\bar{x}_1$ = 62·7
Now for the 2nd sample:
[Figure: Population and 2nd sample]
This is the 2nd sample mean, $\bar{x}_2$ = 58·2
Now we’ll take 10 samples, each of size 5.
Instead of showing the individual weights, we’ll just show the means of the samples.
[Figure: Population and 10 sample means]
Now for 100 samples.
[Figure: Population and 100 sample means]
Finally, 1000 samples.
[Figure: Population and 1000 sample means]
We have a distribution of the means of 1000 samples each of size 5.
We now want to see what happens if we increase the size of each sample.
Each sample is of size 5:
[Figure: Population and 1000 sample means, n = 5]
Each sample is of size 20:
[Figure: Population and 1000 sample means, n = 20]
The means are less spread out.
We can notice 4 things about the distribution of the sample means:
• The distribution is approximately Normal.
• The spread is less than that of the population.
• The mean of the sample means is approximately the same as the population mean.
• As the sample size increases, the spread decreases.
[Figures: Population and 1000 sample means, for n = 5 and for n = 20]
N.B. The distribution is correctly called the “distribution of the sample means”.
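The observations about the centre and spread can be tried out with a small simulation. The sketch below is not the data behind the diagrams: it invents a roughly Normal population of 1000 weights (mean 60 g and standard deviation 4 g are assumed values) and takes 1000 samples of size 5 and of size 20. The helper name sample_means is my own.

```python
import random
from statistics import mean, pstdev, stdev

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1000 egg weights in grams (assumed mean 60, s.d. 4).
population = [random.gauss(60, 4) for _ in range(1000)]

def sample_means(n, num_samples=1000):
    """Return the means of num_samples simple random samples, each of size n."""
    return [mean(random.sample(population, n)) for _ in range(num_samples)]

means_5 = sample_means(5)
means_20 = sample_means(20)

print("population: mean", round(mean(population), 2), "s.d.", round(pstdev(population), 2))
print("n = 5     : mean of sample means", round(mean(means_5), 2), "s.d.", round(stdev(means_5), 2))
print("n = 20    : mean of sample means", round(mean(means_20), 2), "s.d.", round(stdev(means_20), 2))
```

In a run like this the mean of the sample means should sit close to the population mean, while the standard deviation of the sample means is smaller than that of the population and smaller again for n = 20.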
It can be shown that if we could take all samples of a given size, n, that could be constructed ( which in practice is not possible ) then
the mean of the sample means equals the population mean.
So, we are justified in using a sample mean to estimate the population mean even though on some occasions it will be a poor estimate.
• On average a sample mean will give a good estimate of the population mean.
• Poor estimates occur rarely, and even less often as the sample size increases.
[Figure: Population and 1000 sample means, n = 5, with one sample mean marked as a poor estimate of the population mean]
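For a very small population we can actually list every possible sample, so the result can be checked directly. The sketch below uses a made-up population of 6 values and all 20 samples of size 3 (sampling without replacement); the numbers are invented for the illustration. Exact fractions are used so that rounding cannot hide a difference.

```python
from fractions import Fraction
from itertools import combinations

# A tiny made-up population, small enough that every sample can be listed.
population = [52, 57, 58, 60, 63, 66]
n = 3  # size of each sample

population_mean = Fraction(sum(population), len(population))

# The mean of every possible sample of size n, chosen without replacement.
all_means = [Fraction(sum(s), n) for s in combinations(population, n)]

# The mean of all those sample means.
mean_of_means = sum(all_means) / len(all_means)

print(len(all_means), population_mean, mean_of_means)
```

Individual sample means range from well below to well above the population mean, but their average matches it exactly.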
In the example using hens’ eggs, the population was approximately Normal.
We’ll now look at an example where the population is not Normal.
A population that isn’t Normal
[Figure: the population with the means of 100 samples, n = 5]
Increasing the sample size to 30 . . . and the number of samples to 1000
We notice the same things as before.
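The same experiment can be sketched for a population that is clearly not Normal. Here I assume a strongly skewed population (1000 values drawn from an exponential distribution with mean 10), which is my own choice and not the population in the diagrams. The skewness helper is a rough measure of lopsidedness that is about 0 for a symmetric shape.

```python
import random
from statistics import mean, stdev

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population: 1000 values, exponential with mean 10 (assumed).
population = [random.expovariate(1 / 10) for _ in range(1000)]

def skewness(values):
    """Rough measure of asymmetry: about 0 for a symmetric shape."""
    m, s = mean(values), stdev(values)
    return mean(((v - m) / s) ** 3 for v in values)

# Means of 1000 simple random samples for each sample size.
results = {}
for n in (5, 30):
    results[n] = [mean(random.sample(population, n)) for _ in range(1000)]

print("population: mean", round(mean(population), 2), "skewness", round(skewness(population), 2))
for n, means in results.items():
    print("n =", n, ": mean of sample means", round(mean(means), 2),
          "s.d.", round(stdev(means), 2), "skewness", round(skewness(means), 2))
```

With n = 30 the sample means should be both less spread out and much less skewed than the population, which is the sense in which their distribution is approximately Normal.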
SUMMARY
A statistic is a quantity calculated from a sample. A population has parameters such as the mean which can be estimated from a sample using a statistic.
Using a sample to make estimates:
• The mean of a sample can be used to estimate the mean of a population.
• The population need not have a Normal distribution but in that case the sample size should be at least 30. (The less Normal the population is, the greater the sample size should be.)
• The larger the sample size, the more likely it is that the estimate of the population mean will be accurate.
Exercise
1. The table below gives a random sample of the number of goals scored in 10 Premier League football matches taken from the first 3 weeks of the 2005/06 season.
0, 3, 1, 0, 6, 4, 1, 1, 3, 2
(a) Use the sample to estimate the mean number of goals likely to be scored throughout the season.
(b) Make at least 2 comments on the accuracy of your estimate.
Solution:
(a) $\bar{x}$ = 2·1. This is the estimate of the population mean.
(b) As the population is unlikely to be Normal, the sample size is too small to give an accurate result. The weather in the first 3 weeks will not be typical of the whole season so the results may be unrepresentative.
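The working for the sample mean in part (a) can be written out as follows, using only the 10 values in the table:

```latex
\bar{x} = \frac{0 + 3 + 1 + 0 + 6 + 4 + 1 + 1 + 3 + 2}{10} = \frac{21}{10} = 2.1
```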