How confident can we be in our analysis?


Unit Plan – 10 lessons

Recap on CLT and Normal Distribution

Confidence intervals for the mean

Confidence intervals for proportions

Confidence intervals for the difference between two means

Sample size

Margin of error

Proofs


I Can Do…


Starter lesson 1:

A sample of 16 items is taken from a population with μ = 34 and σ = 4. Find the mean and standard deviation of the distribution of the total.

$E(T) = n\mu = 16(34) = 544$

$\sigma_T = \sqrt{n}\,\sigma = 4(4) = 16$

Find the mean and the standard deviation of the sample means.

$E(\bar{X}) = \mu_{\bar{X}} = \mu = 34$

$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{4}{\sqrt{16}} = \dfrac{4}{4} = 1$

Assume population is normal!


Lesson One: Re-cap: The Normal Distribution and CLT

This lesson aims to be a revision period for:

• The Normal Distribution
• The Central Limit Theorem


Distribution of the Sample Mean

When several different samples are taken from a population, the results will vary from sample to sample.

These results will have a distribution of their own

We covered these distributions in the previous unit


Distribution of the Sample Mean

If n (the sample size) is large enough (> 30) the distribution will be approximately normal.

The distribution of the sample mean (X̄) has a mean of its own, which equals the population mean Mu (μ), and a standard deviation of $\dfrac{\sigma}{\sqrt{n}}$. This is also known as the standard error of the sample mean. (It gives an indication of its spread.)

We will revisit this in another lesson.



Sample Stats and Population Parameters

                      Sample Statistics    Population Parameters
Mean                  x̄                    μ
Standard Deviation    s                    σ


Re-cap: The Central Limit Theorem

If samples of size n > 30 are repeatedly taken from any population, no matter what its distribution, then X̄, the random variable obtained by finding the means of these samples, is approximately normally distributed with mean μ and standard deviation $\dfrac{\sigma}{\sqrt{n}}$

[μ is the mean and σ the standard deviation of the population]
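The Central Limit Theorem can be checked with a quick simulation. The sketch below is only an illustration, not part of the original slides: it assumes a skewed (exponential) population with μ = 1 and σ = 1, a sample size of 36 and a fixed random seed, all chosen for the example.

```python
import random
import statistics

def simulate_sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]

n = 36  # assumed sample size (> 30)
means = simulate_sample_means(n, trials=5000)

# The CLT predicts a mean near mu = 1 and a standard deviation
# near sigma / sqrt(n) = 1 / 6 for the sample means.
print("mean of sample means:", round(statistics.fmean(means), 3))
print("SD of sample means:  ", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):     ", round(1 / n ** 0.5, 3))
```

Even though the population is far from normal, the sample means should cluster around μ with a spread close to σ/√n.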


Applications of the Central Limit Theorem: probability questions for the distributions of the sample mean and the total

Probability questions on the sample mean may be given now that we know that this distribution (of the sample means) is approximately normally distributed. We use the normal distribution to solve them.

If the underlying population is normally distributed then we may also be given probability questions on the distribution of a sample total. Again, we use the normal distribution to solve them.
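As a rough illustration of such questions, the sketch below reuses the starter values (μ = 34, σ = 4, n = 16, population assumed normal). The two questions asked, P(X̄ > 35) and P(T < 530), are made up for this example and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 34, 4, 16  # population mean, population SD, sample size

# Distribution of the sample mean: N(mu, sigma / sqrt(n))
sample_mean_dist = NormalDist(mu, sigma / sqrt(n))
# Distribution of the sample total: N(n * mu, sqrt(n) * sigma)
sample_total_dist = NormalDist(n * mu, sqrt(n) * sigma)

# Hypothetical questions: P(sample mean > 35) and P(total < 530)
p_mean_above_35 = 1 - sample_mean_dist.cdf(35)
p_total_below_530 = sample_total_dist.cdf(530)

print(f"P(mean > 35)   = {p_mean_above_35:.4f}")
print(f"P(total < 530) = {p_total_below_530:.4f}")
```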


When is the assumption of normality not required?

The assumption of normality is not required when n > 30, because the Central Limit Theorem states that the distribution of the sample mean is then approximately normal regardless of the underlying population distribution.


Lesson Two: Introduction to Confidence Intervals

By the end of this lesson students should be able to:

• Explain how a confidence interval gives an estimate of the population mean.

• Calculate a confidence interval for the population parameter μ and interpret it.


Confidence Intervals

We use samples as estimates for the population parameters.

As different samples produce different estimates we give an interval rather than specific values as an estimate of the population mean. This interval is called a confidence interval.


95% Confidence Interval

A 95% confidence interval means that if we took many samples and worked out an interval from each one, on average 95% of those intervals would contain the true population mean μ.
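One way to see what "95% of the time" means is to simulate it. The sketch below is an illustration under assumed values (a normal population with μ = 50 and σ = 10, samples of size 25, a fixed seed); none of these numbers come from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, trials, level=0.95, seed=1):
    """Return the fraction of simulated confidence intervals that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    margin = z * sigma / sqrt(n)                # sigma is treated as known
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin < mu < xbar + margin:
            hits += 1
    return hits / trials

rate = coverage(mu=50, sigma=10, n=25, trials=4000)
print(f"Fraction of intervals containing mu: {rate:.3f}")
```

The fraction should come out close to 0.95, although any single interval either contains μ or does not.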


95% Confidence Interval

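A 95% CI may be worked out using:

$\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}$

The actual formula is:

$\bar{x} - z\,\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\,\dfrac{\sigma}{\sqrt{n}}$

σ: the population standard deviation, if it is known; otherwise use ‘s’.

n: size of the sample.

z: Z value for the confidence level (1.96 for 95%).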


Bolts

A machine manufactures bolts to a set length with a standard deviation of 2.5mm.

A random sample of 20 bolts is checked and found to have a mean length of 75.2mm.

Find the 95% confidence interval for the mean length of bolts.
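A minimal sketch of the calculation, assuming bolt lengths are normally distributed (the sample of 20 is too small to rely on the Central Limit Theorem) and that 2.5 mm is the known population standard deviation. The function name is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(xbar, sigma, n, level=0.95):
    """Return (lower, upper) for xbar ± z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

lower, upper = mean_confidence_interval(xbar=75.2, sigma=2.5, n=20)
print(f"{lower:.2f} mm < mu < {upper:.2f} mm")
```

This gives an interval of roughly 74.10 mm to 76.30 mm for the mean bolt length.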


Maths Test Marks

The population of the marks resembles a normal distribution.

Xi        Mi    Fi
20-       25    |
30-       35    ||||
40-       45    ||||
50-       55    |||| |||| ||||
60-       65    |||| |||| |||| |
70-       75    |||| ||
80-       85    |||| ||||
90-100    95    ||||

Population Parameters:

μ = 64.84%
σ = 16.64%
n = 61

Sample Statistics:

μx = _____%

n = 16


Back to Maths Test Marks

Population Parameter:

μ = 64.84%


Ice Cream Factory


Lesson Three: C.I. for the Population Proportion (π)

By the end of this lesson students should be able to:

• Use a sample proportion to calculate a C.I. in order to estimate π given the formula:

$p - z\sqrt{\dfrac{p(1-p)}{n}} < \pi < p + z\sqrt{\dfrac{p(1-p)}{n}}$


Definition Check Point

Confidence Level

Inverse (Normal Value)

Sample Proportion

Sample Size

Population Proportion

(in most cases this is unknown, so we have to use p as an estimate)


Defective Cubes

We have a company that produces blue counters for commercial use.

We conduct regular quality control tests of our product.

Occasionally the odd yellow counter is produced much to our annoyance!


Defective Cubes

Collect seven (7) quality control tests. Record the proportion of yellow counters:

[Chart: blank vertical axis for the proportion of yellow counters, from 0.0 to 0.9]


Defective Cubes

Calculate the 95% confidence interval for the population proportion of yellow counters:

[Chart: vertical axis for the proportion of yellow counters, from 0.0 to 0.9]


Defective Cubes

Here’s our data for the seven (7) quality control tests:

[Chart: vertical axis for the proportion of yellow counters, from 0.0 to 0.9]

How would this data change if we knew that the actual population proportion of yellow counters was 0.25?


Cleansville

A sample of 500 Cleansville residents found that 42% of them recycle their rubbish. Calculate a 99% confidence interval for the proportion of residents who recycle in Cleansville.
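A minimal sketch of the Cleansville calculation, using the usual large-sample interval p ± z√(p(1 − p)/n) with the sample proportion p standing in for π. The function name is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(p, n, level):
    """Return (lower, upper) for p ± z * sqrt(p * (1 - p) / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 2.576 for 99%
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

lower, upper = proportion_confidence_interval(p=0.42, n=500, level=0.99)
print(f"{lower:.3f} < pi < {upper:.3f}")
```

The interval comes out at roughly 0.363 to 0.477, so we can be 99% confident that between about 36% and 48% of residents recycle.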


Your Turn…

Have a go at the questions on pages 86 and 87 of the W.O.N.

or

Exercise 14.3 Page 232 Sigma Text Book


Lesson Four: Sample Size (n)

By the end of this lesson students should be able to:

• Calculate the sample size needed for a given C.I. width for a mean and a proportion.

• Understand what the margin of error shows.


The Margin of Error

The Margin of Error is half the length of the confidence interval. It is the distance between the sample mean x̄ and one of the end points of the interval.

[Diagram: a confidence interval from its lower to its upper end point, with the margin of error marked]


The Margin of Error

To calculate the sample size, we use:

$n = \left(\dfrac{z\,\sigma}{e}\right)^2$

e is the margin of error (the accuracy, or half the “width” of the interval)


Let’s give it a go…

1. A golf ball has a bounce which is normally distributed with σ = 3.6 cm. If a sample of 100 balls is tested and x̄ = 82 cm, find a 95% C.I.

2. Find the total width of the C.I.

3. What sample size would be needed if a C.I. was within 0.5 cm with a 95% confidence?


Qu. 1

A golf ball has a bounce which is normally distributed with σ = 3.6 cm. If a sample of 100 balls is tested and x̄ = 82 cm, find a 95% C.I.

$\bar{x} \pm z\,\dfrac{\sigma}{\sqrt{n}} = 82 \pm 1.96 \times \dfrac{3.6}{\sqrt{100}} = 82 \pm 0.71$

81.29 < μ < 82.71


Qu. 2

Find the total width of the C.I.

Width = 82.71 – 81.29 = 1.42


The Margin of Error

The Margin of Error is half the length of the confidence interval. It is the distance between the sample mean x̄ and one of the end points of the interval.

For the interval from 81.29 to 82.71:

Margin of error (or accuracy) = 1.42 ÷ 2 = 0.71


Cont’d…

What sample size would be needed if a C.I. was within 0.5 cm with a 95% confidence?


Cont’d…

What sample size would be needed if the total length of a C.I. was 1.2 cm with a 95% confidence?
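Both sample-size questions can be sketched with one helper, assuming the golf ball value σ = 3.6 cm and rearranging e = zσ/√n to n = (zσ/e)². The function name is illustrative, and the answer is rounded up because a sample size must be a whole number.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, level=0.95):
    """Smallest n for which z * sigma / sqrt(n) is no larger than e."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return ceil((z * sigma / e) ** 2)

# "Within 0.5 cm" means a margin of error of e = 0.5
n_within_half_cm = sample_size_for_mean(sigma=3.6, e=0.5)

# A total length of 1.2 cm means a margin of error of e = 1.2 / 2 = 0.6
n_total_length_1_2 = sample_size_for_mean(sigma=3.6, e=1.2 / 2)

print(n_within_half_cm, n_total_length_1_2)
```

Note the difference between the two questions: a total length has to be halved before it is used as e.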


Lesson Five: Sample size for proportions: Two Cases

There are two cases for proportion:

• π is given
• π is unknown


Two Cases – π is Given

Research has shown that 42% of households have SkyTV.

How large a sample is needed to have a 99% confidence that the sample proportion is within 4% of the true percentage?


Two Cases – π is Unknown

An opinion poll is to be conducted. What is the minimum sample size needed so that the margin of error is no greater than 3% for a 95% C.I.?

Use π = 0.5 (this gives the widest C.I., and so the largest sample size we could need)
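The two cases can be sketched with one helper, assuming the usual rearrangement of e = z√(π(1 − π)/n) into n = z²π(1 − π)/e². The function name is illustrative, and the result is rounded up to a whole number.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(pi, e, level):
    """Smallest n for which z * sqrt(pi * (1 - pi) / n) is no larger than e."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return ceil(z ** 2 * pi * (1 - pi) / e ** 2)

# Case 1: pi is given (SkyTV: pi = 0.42, within 4%, 99% confidence)
n_sky = sample_size_for_proportion(pi=0.42, e=0.04, level=0.99)

# Case 2: pi is unknown (opinion poll: use pi = 0.5, within 3%, 95% confidence)
n_poll = sample_size_for_proportion(pi=0.5, e=0.03, level=0.95)

print(n_sky, n_poll)
```

Using π = 0.5 is the cautious choice because π(1 − π) is largest there, so no other value of π could require a bigger sample.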


Your Turn

W.O.N. Pages 94-96

or

Sigma Text Ex 14.4 Page 235


Starter lesson 6:

1). A random sample from a population with standard deviation 14 yielded a 99% confidence interval for the mean between 72 and 84.

Find the sample mean.

Find the sample size.

2). A gardener wants to estimate (within 3 days) the average number of days it takes for tomatoes to grow. He knows the SD of growing times is about 5 days. What is the required sample size at the 95% confidence level?


Starter solutions:

1). A random sample from a population with standard deviation 14 yielded a 99% confidence interval for the mean between 72 and 84.

Find the sample mean.

72 < μ < 84. Mid-point is $\bar{X} = \dfrac{72 + 84}{2} = 78$

Find the sample size.

First, e = 84 – 78 = 6.

Now: $e = z\,\dfrac{\sigma}{\sqrt{n}}$

For 99% CI: z = 2.576.

So:

$6 = 2.576 \times \dfrac{14}{\sqrt{n}}$

$6 = \dfrac{36.064}{\sqrt{n}}$

$\sqrt{n} = \dfrac{36.064}{6}$

$n = 36.1$ (1 d.p.)

So sample size is 37.


Starter solutions:

2). A gardener wants to estimate (within 3 days) the average number of days it takes for tomatoes to grow. He knows the SD of growing times is about 5 days. What is the required sample size at the 95% confidence level?

95% CI, so z = 1.96

e = 3 days, σ = 5 days

$e = z\,\dfrac{\sigma}{\sqrt{n}}$

$3 = 1.96 \times \dfrac{5}{\sqrt{n}}$

$3 = \dfrac{9.8}{\sqrt{n}}$

$\sqrt{n} = \dfrac{9.8}{3}$

$n = \dfrac{9.8^2}{3^2} = \dfrac{96.04}{9} = 10.7$ (1 d.p.)

So minimum sample size is 11.


Lesson Six: Difference of Two Means

By the end of this lesson students should be able to:

• Calculate a confidence interval for the difference of two means when comparing two populations.


C.I. for the difference between two means

A frequent problem in statistics is to determine whether two populations are:

• Similar, or
• Whether there is a significant difference (and they are unlikely to be the same)


Who Drives Faster?

In a recent trial, 16 girls and 12 boys took part in a drag strip simulator to refute/confirm this statement:

“Girls can drive faster than Boys”

The results were as follows…


Who Drives Faster?

                      Females            Males
Mean                  x̄F = 165.7 km/h    x̄M = 160 km/h
Standard Deviation    sF = 45            sM = 25
Sample Size           nF = 16            nM = 12

Is there or is there not a significant difference between these?

Calculate a 95% confidence interval for each of these and graph them.


Who Drives Faster?

[Graph: 95% confidence intervals for Boys and Girls]


Who Drives Faster?

How do we decide whether small differences are significant or not?

We have a small unreliable sample (if we are testing the idea that the population of girls can drive faster than boys.)

We need a formula to test this properly…


Distribution of the Difference of Two Means

We are estimating the difference between two population means (a parameter)

μ1 – μ2

The logical statistic to use is x̄1 – x̄2


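The C.I. for the Difference

A confidence interval for a parameter usually takes the form:

(Sample value) ± (z value for the Confidence Level) × (SD of sample value)

so…

$(\bar{x}_1 - \bar{x}_2) \pm z\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$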


Who Drives Faster?

Use the above formula to calculate a 95% C.I. for the difference between the two means.


Who Drives Faster?

If there is no underlying difference between the speeds that girls and boys can drive at then:

μF – μM = 0 (Zero)

So all we have to do is check whether zero is included in the 95% C.I.

If such a confidence interval does not enclose zero then it is unlikely that the two means are equal. There is probably a difference between the two means.


Who Drives Faster?

Our Conclusion:

Since _______ lies within the confidence interval, there is insufficient evidence to conclude that _________________________ (on a drag strip simulator).


Significant Differences

It IS possible for even smaller differences to be significant if the sample is large enough.

Consider…

                      Females            Males
Mean                  x̄F = 165.7 km/h    x̄M = 163.2 km/h
Standard Deviation    sF = 45            sM = 25
Sample Size           nF = 20 000        nM = 18 500
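The effect of sample size can be sketched by running both sets of drag strip figures through the same calculation. This is an illustration only: it uses the z-based interval (x̄1 − x̄2) ± z√(s1²/n1 + s2²/n2) with the sample standard deviations standing in for σ, which is a rough approximation for the small samples of 16 and 12. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def difference_confidence_interval(x1, s1, n1, x2, s2, n2, level=0.95):
    """Return (lower, upper) for (x1 - x2) ± z * sqrt(s1^2/n1 + s2^2/n2)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    difference = x1 - x2
    return difference - z * standard_error, difference + z * standard_error

# Females first, males second, as in the two tables of results
small = difference_confidence_interval(165.7, 45, 16, 160.0, 25, 12)
large = difference_confidence_interval(165.7, 45, 20000, 163.2, 25, 18500)

for label, (lower, upper) in (("small samples", small), ("large samples", large)):
    verdict = "includes" if lower < 0 < upper else "excludes"
    print(f"{label}: {lower:.2f} to {upper:.2f} km/h ({verdict} zero)")
```

With the small samples the interval is very wide and includes zero. With the large samples a smaller difference gives a narrow interval that excludes zero.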


Note on these confidence intervals:

You could be asked questions on:

• Constructing these CIs
• Finding margins of error for these CIs
• Interpreting these CIs


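Formulae to Remember

Situation                                  Standard Error
Individual Continuous Data                 $\sigma$
Distribution of Sample Means               $\dfrac{\sigma}{\sqrt{n}}$
Distribution of Sample Proportions         $\sqrt{\dfrac{\pi(1-\pi)}{n}}$
Distribution of Difference of Two Means    $\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$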


Your Height Please…

You need to record your heights to the nearest centimetre.

We will need to keep male heights and female heights separate.

Calculate μ and σ (Population Parameters) for males and females.

                      Sample Statistics    Population Parameters
Mean                  x̄                    μ
Standard Deviation    s                    σ


Starter lesson 7:

Two independent populations have means of 85.4 and 64.3 respectively, and standard deviations of 8.7 and 6.4. A random sample of 64 is drawn from the first population and 36 from the second.

μ1 = 85.4, μ2 = 64.3, σ1 = 8.7, σ2 = 6.4, n1 = 64, n2 = 36

Find the expected value of the difference between the two sample means.

$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2 = 85.4 - 64.3 = 21.1$

Find the SD of the difference between the two sample means.

$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{8.7^2}{64} + \dfrac{6.4^2}{36}} = 1.52$ (2 d.p.)

What is the probability that the difference between these sample means will be greater than 22?

Let D = difference = $\bar{X}_1 - \bar{X}_2$

So $P(D > 22) = P\left(Z > \dfrac{22 - 21.1}{1.52}\right) = P(Z > 0.5921) = 0.5 - 0.2227 = 0.28$ (2 d.p.)


Lesson Seven: Practice Q’s

By the end of this lesson students should be able to:

• Calculate and interpret confidence intervals for:

• Means
• Proportions
• Difference between means


Lesson Nine: Excellence (1)

By the end of this lesson students should be able to:

• Find standard errors of estimates
• More revision


Lesson Ten: Excellence (2)

By the end of this lesson students should be able to:

• Prove the formula for the standard error of the difference between two means

• Understand in depth applications of the Central Limit Theorem


Proof of Diff.


Proof of Diff.


Proof of Diff. Full proof
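The proof on the original slides has not survived as text. What follows is a sketch of the standard argument, not a reconstruction of those slides. It assumes the two samples are independent, each made up of independent observations, with population variances $\sigma_1^2$ and $\sigma_2^2$.

```latex
% Variance of one sample mean (independent observations X_1, ..., X_n):
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{n\sigma^2}{n^2}
  = \frac{\sigma^2}{n}

% Variance of the difference of two independent sample means:
\operatorname{Var}(\bar{X}_1 - \bar{X}_2)
  = \operatorname{Var}(\bar{X}_1) + (-1)^2\operatorname{Var}(\bar{X}_2)
  = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}

% Taking the square root gives the standard error:
\sigma_{\bar{X}_1 - \bar{X}_2}
  = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
```

The key step is that variances add even when the means are subtracted, because the coefficient −1 is squared.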


Lesson Ten: Practice Assessment