SECTION 13.2 Comparing Two Proportions



Page 1: SECTION 13.2

SECTION 13.2

Comparing Two Proportions

Page 2: SECTION 13.2

In this scenario, we desire to compare two populations or the responses to two treatments based on two independent samples.

We compare the populations by doing inference about the difference $p_1 - p_2$.

The statistic that estimates this difference is the difference between the two sample proportions, $\hat{p}_1 - \hat{p}_2$.

Page 3: SECTION 13.2

The sampling distribution of $\hat{p}_1 - \hat{p}_2$

The variance of the difference is the sum of the variances of $\hat{p}_1$ and $\hat{p}_2$, which is

$$\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}$$

Note that the variances add. The standard deviations do not.

When the samples are large, the distribution is approximately normal.

The mean of this distribution is $p_1 - p_2$.
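A minimal sketch of why the variances add while the standard deviations do not. The function names and the population values (0.5, 0.4) and sample sizes (100, 150) are hypothetical, chosen only for illustration.

```python
import math

def diff_variance(p1, n1, p2, n2):
    """Variance of p1_hat - p2_hat: the two sample-proportion variances added."""
    return p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2

def diff_std_dev(p1, n1, p2, n2):
    """Standard deviation of the difference: square root of the summed variance."""
    return math.sqrt(diff_variance(p1, n1, p2, n2))

# Hypothetical population proportions and sample sizes (not from the slides).
p1, n1, p2, n2 = 0.5, 100, 0.4, 150

sd1 = math.sqrt(p1 * (1 - p1) / n1)   # standard deviation of p1_hat alone
sd2 = math.sqrt(p2 * (1 - p2) / n2)   # standard deviation of p2_hat alone
sd_diff = diff_std_dev(p1, n1, p2, n2)

print("sum of standard deviations:", sd1 + sd2)
print("standard deviation of the difference:", sd_diff)
```

The second printed value is smaller than the first, which is the point of the note: add the variances first, then take the square root.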

Page 4: SECTION 13.2

Assumptions

1. Data are from two independent SRSs from the populations.
2. The populations are at least ten times as large as the samples.
3. A. For a significance test:

   $$n_1\hat{p}_c \ge 5,\quad n_1(1 - \hat{p}_c) \ge 5,\quad n_2\hat{p}_c \ge 5,\quad n_2(1 - \hat{p}_c) \ge 5$$

   Where $\hat{p}_c$ is the combined sample proportion.

   B. For a confidence interval:

   $$n_1\hat{p}_1 \ge 5,\quad n_1(1 - \hat{p}_1) \ge 5,\quad n_2\hat{p}_2 \ge 5,\quad n_2(1 - \hat{p}_2) \ge 5$$
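The two versions of condition 3 can be sketched as small checks. This assumes the standard definition of the combined sample proportion, (x1 + x2) / (n1 + n2), which the slides name but do not spell out. The function names and the example counts are hypothetical.

```python
def combined_proportion(x1, n1, x2, n2):
    """Combined (pooled) sample proportion: all successes over all observations."""
    return (x1 + x2) / (n1 + n2)

def large_enough_for_test(x1, n1, x2, n2, minimum=5):
    """Condition 3A: expected counts under the combined proportion are all >= 5."""
    pc = combined_proportion(x1, n1, x2, n2)
    counts = [n1 * pc, n1 * (1 - pc), n2 * pc, n2 * (1 - pc)]
    return all(c >= minimum for c in counts)

def large_enough_for_interval(x1, n1, x2, n2, minimum=5):
    """Condition 3B: observed successes and failures in each sample are all >= 5."""
    # n1 * p1_hat is just x1, and n1 * (1 - p1_hat) is n1 - x1.
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(c >= minimum for c in counts)

# Hypothetical counts: 3 of 20 successes in sample 1, 12 of 20 in sample 2.
print(large_enough_for_test(3, 20, 12, 20))
print(large_enough_for_interval(3, 20, 12, 20))
```

With these made-up counts the test condition holds but the interval condition does not, because sample 1 has only 3 observed successes.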

Page 5: SECTION 13.2

Confidence Intervals for $p_1 - p_2$

Draw an SRS of size $n_1$ from a population having proportion $p_1$ of successes and draw an independent SRS of size $n_2$ from another population having proportion $p_2$ of successes. When $n_1$ and $n_2$ are large, an approximate level C confidence interval for $p_1 - p_2$ is

$$(\hat{p}_1 - \hat{p}_2) \pm z^* SE$$

In this formula the standard error SE of $\hat{p}_1 - \hat{p}_2$ is

$$SE = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

And $z^*$ is the upper (1 - C)/2 standard normal critical value. Follow the same assumptions as for single proportion confidence intervals.
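The interval formula can be sketched as a short function. The function name and the example counts are hypothetical. The critical value comes from Python's `statistics.NormalDist` rather than a table, so it may differ slightly from a rounded table value such as 1.96.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2 (hypothetical helper)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Standard error uses each sample's own proportion (no pooling).
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # z* cuts off an upper tail of area (1 - C)/2.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

# Hypothetical counts: 60 of 100 successes versus 45 of 100.
low, high = two_prop_z_interval(60, 100, 45, 100)
print(f"95% interval: ({low:.4f}, {high:.4f})")
```

The interval is always centered on the observed difference, and a higher confidence level widens it.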

Page 6: SECTION 13.2

Significance Tests for $p_1 - p_2$

Our z test statistic

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1 - \hat{p}_c)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Where $\hat{p}_c$ is the combined sample proportion.
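A minimal sketch of the test statistic, assuming the combined sample proportion is computed as total successes over total observations, (x1 + x2) / (n1 + n2). The function name and example counts are hypothetical.

```python
import math

def two_prop_z_statistic(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the combined sample proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both samples share one proportion, so the data are pooled.
    p_c = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Hypothetical counts: 60 of 100 successes versus 45 of 100.
z = two_prop_z_statistic(60, 100, 45, 100)
print(f"z = {z:.3f}")
```

The test uses the combined proportion in the standard error because the null hypothesis says the two populations have the same proportion. The confidence interval makes no such assumption and keeps the two sample proportions separate.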

Page 7: SECTION 13.2

The Steps for a Two-Proportion z-test

1. State the hypotheses and name the test
   $H_0: p_1 = p_2$
   $H_a: p_1 < p_2$, $p_1 > p_2$, or $p_1 \ne p_2$
2. State and verify your assumptions
3. Calculate the P-value and other important values
   - Done in calculator or…
   - Using the formulas and tables
4. State conclusions (both statistically and contextually)
   - The smaller the P-value, the greater the evidence to reject $H_0$
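Step 3 depends on which alternative hypothesis was stated in step 1. The sketch below shows one way to turn a z statistic into a P-value for each of the three alternatives. The function names and the labels "less", "greater", and "two-sided" are hypothetical; the normal tail area uses `math.erfc` from the standard library in place of a table.

```python
import math

def upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def p_value(z, alternative):
    """P-value of a z statistic for the stated alternative hypothesis."""
    if alternative == "greater":      # Ha: p1 > p2
        return upper_tail(z)
    if alternative == "less":         # Ha: p1 < p2
        return upper_tail(-z)
    if alternative == "two-sided":    # Ha: p1 != p2
        return 2 * upper_tail(abs(z))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

# The same z statistic gives different P-values under different alternatives.
for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value(1.96, alt), 4))
```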

Page 8: SECTION 13.2

CALCULATOR FUNCTIONS

You may be able to find these on your own by now, but just in case, you will be looking for:
6: 2-PropZTest
B: 2-PropZInt

Note: x is your number of successes while n is your total trials.

Page 9: SECTION 13.2

+ 4 Confidence Interval for 2 Proportions

Just like before, this helps us overcome the lack of Normality when the sample sizes are too small for the large-sample procedures.

These methods cannot save us from the fact that small samples produce wide confidence intervals.

The plus four interval may be conservative for very small samples and population p’s close to 0 or 1.

It is generally much more accurate than the large-sample interval when the samples are small or the population p is close to 0 or 1.

Add 4 imaginary observations, one success and one failure in each of the two samples.

Use the large-sample procedures with the new sample sizes and counts of successes.

Use this when the sample size is at least 5 in each group, with any counts of successes and failures.
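The plus four recipe can be sketched as follows: add one success and one failure to each sample, then apply the large-sample interval to the adjusted counts. The function name and the example counts are hypothetical.

```python
import math
from statistics import NormalDist

def plus_four_interval(x1, n1, x2, n2, confidence=0.95):
    """Plus four confidence interval for p1 - p2 (hypothetical helper)."""
    # One imaginary success and one imaginary failure per sample:
    # each success count grows by 1 and each sample size by 2.
    p1_tilde = (x1 + 1) / (n1 + 2)
    p2_tilde = (x2 + 1) / (n2 + 2)
    se = math.sqrt(p1_tilde * (1 - p1_tilde) / (n1 + 2)
                   + p2_tilde * (1 - p2_tilde) / (n2 + 2))
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_tilde - p2_tilde
    return diff - z_star * se, diff + z_star * se

# Hypothetical small samples: 8 of 10 successes versus 3 of 10.
low, high = plus_four_interval(8, 10, 3, 10)
print(f"plus four 95% interval: ({low:.3f}, {high:.3f})")
```

Even with the adjustment, samples this small give a wide interval, as the slide warns.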

Page 10: SECTION 13.2

Example of Two-Proportion Confidence Interval

A surprising number of young adults (ages 19-25) still live at home with their parents. A random sample by the National Institutes of Health included 2253 men and 2629 women in this age group. The survey found that 986 of the men and 923 of the women lived at home. Is this good evidence that different proportions of young men and young women live at home? How large is the difference between the proportions of young men and young women who live at home?

Page 11: SECTION 13.2

Step 1: Parameters

Population 1: young men
Population 2: young women
$p_1$ = proportion of young men who live at home
$p_2$ = proportion of young women who live at home

We will construct a 95% confidence interval for the difference between men and women, $p_1 - p_2$.

Page 12: SECTION 13.2

Step 2: Conditions

SRSs: The data were obtained from a random sample, so we should be safe generalizing to the respective populations of interest.

Normality: To check that the large-sample confidence interval is safe, look at counts of successes and failures (show calculations) for both samples. All of these are much larger than 5, so the large-sample method will be accurate.

Independence: The sample survey in this example selected a single random sample of young adults, not two separate random samples of men and women. We divide the one sample by gender. The two-sample z procedures for comparing proportions are valid in such situations. This is an important fact about these methods.
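The counts called for in the Normality check follow directly from the survey figures (986 of 2253 men and 923 of 2629 women living at home). A small sketch, with hypothetical variable names:

```python
# Survey figures from the example.
men_n, men_home = 2253, 986
women_n, women_home = 2629, 923

# Successes are those living at home; failures are everyone else in the sample.
counts = {
    "men, successes": men_home,
    "men, failures": men_n - men_home,
    "women, successes": women_home,
    "women, failures": women_n - women_home,
}

for label, count in counts.items():
    print(f"{label}: {count}")

all_at_least_5 = all(count >= 5 for count in counts.values())
print("all counts at least 5:", all_at_least_5)
```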

Page 13: SECTION 13.2

Step 3: Calculations

Here are the needed calculations:

$$(\hat{p}_1 - \hat{p}_2) \pm z^* SE$$

$$SE = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} = \sqrt{\frac{(0.4376)(0.5624)}{2253} + \frac{(0.3511)(0.6489)}{2629}} \approx 0.014$$

$z^* = 1.96$

So, our interval is (0.059, 0.114)
Calculator: (0.05912, 0.11399)
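The arithmetic above can be reproduced step by step. This sketch uses the example's counts and the slide's rounded critical value of 1.96; the variable names are hypothetical.

```python
import math

# Survey figures from the example.
n1, x1 = 2253, 986   # young men, number living at home
n2, x2 = 2629, 923   # young women, number living at home

p1_hat = x1 / n1     # the slide rounds this to 0.4376
p2_hat = x2 / n2     # the slide rounds this to 0.3511

# Standard error of the difference, using each sample's own proportion.
se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

z_star = 1.96        # critical value for 95% confidence, as on the slide
diff = p1_hat - p2_hat
low, high = diff - z_star * se, diff + z_star * se

print(f"SE = {se:.4f}")
print(f"interval = ({low:.5f}, {high:.5f})")
```

Carrying full precision through the calculation should agree with the calculator interval to about four decimal places.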

Page 14: SECTION 13.2

Step 4: Interpretation

We are 95% confident that the percent of young men living at home is between 5.9 and 11.4 percentage points higher than the percent of young women who live at home. This is definitely good evidence that a different proportion of young men and young women live at home.

We have this level of confidence, because if we repeated our procedures over and over with new samples, 95% of our intervals would capture the true difference.

Page 15: SECTION 13.2

Testing a Claim

Considering the previous example, someone makes the claim that young men are more likely to live at home. Do our data support this claim?

$H_0: p_1 = p_2$
$H_a: p_1 > p_2$

We need to check the Normal assumption again using the combined sample proportion, $\hat{p}_c = 0.3910$.

$$n_1\hat{p}_c \ge 5,\quad n_1(1 - \hat{p}_c) \ge 5: \qquad 2253\,\hat{p}_c \ge 5,\quad 2253(1 - \hat{p}_c) \ge 5$$

$$n_2\hat{p}_c \ge 5,\quad n_2(1 - \hat{p}_c) \ge 5: \qquad 2629\,\hat{p}_c \ge 5,\quad 2629(1 - \hat{p}_c) \ge 5$$

Page 16: SECTION 13.2

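Calculations

$\hat{p}_1 - \hat{p}_2 = 0.0866$

$\hat{p}_c = 0.3910$

$$SE = \sqrt{\hat{p}_c(1 - \hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{(0.391)(0.609)\left(\frac{1}{2253} + \frac{1}{2629}\right)} = 0.0140$$

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1 - \hat{p}_c)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = 6.1782$$

P-value = 0.000000000325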
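These figures can be checked with a few lines of arithmetic. The sketch assumes the combined sample proportion is total successes over total observations, which matches the 0.3910 shown. The variable names are hypothetical, and the upper-tail area is computed with `math.erfc` rather than a table.

```python
import math

# Survey figures from the example.
n1, x1 = 2253, 986   # young men, number living at home
n2, x2 = 2629, 923   # young women, number living at home

p1_hat = x1 / n1
p2_hat = x2 / n2
p_c = (x1 + x2) / (n1 + n2)          # combined sample proportion

# Standard error under H0: p1 = p2, using the combined proportion.
se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se

# One-sided P-value for Ha: p1 > p2 (area to the right of z).
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"p_c = {p_c:.4f}, SE = {se:.4f}, z = {z:.4f}")
print(f"P-value = {p_value:.3e}")
```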

Page 17: SECTION 13.2

Interpretation

Based on our extremely low P-value, we would reject the null hypothesis.

Essentially, a difference in proportions this large would rarely ever occur by chance if there is truly no difference between the proportions of young men and young women who live at home.

We are comfortable agreeing with the claim that a higher proportion of young men than young women live at home.