Lecture 14: Hypothesis Testing for Proportions API-201ZHypothesis Testing for Proportions I Often...

Preview:

Citation preview

Lecture 14:Hypothesis Testing for Proportions

API-201Z

Maya Sen

Harvard Kennedy Schoolhttp://scholar.harvard.edu/msen

Announcements

I We will be posting examples of successful final exercisememos from years past

Announcements

I We will be posting examples of successful final exercisememos from years past

Roadmap

I Continuing with different types of hypothesis tests

I Last time: Hypothesis tests involving means and difference inmeans

I Today: Hypothesis tests involving proportions

I Interpreting hypothesis tests

I Practical versus statistical significance

I Type I and Type II errors (and power, if time)

Roadmap

I Continuing with different types of hypothesis tests

I Last time: Hypothesis tests involving means and difference inmeans

I Today: Hypothesis tests involving proportions

I Interpreting hypothesis tests

I Practical versus statistical significance

I Type I and Type II errors (and power, if time)

Roadmap

I Continuing with different types of hypothesis tests

I Last time: Hypothesis tests involving means and difference inmeans

I Today: Hypothesis tests involving proportions

I Interpreting hypothesis tests

I Practical versus statistical significance

I Type I and Type II errors (and power, if time)

Roadmap

I Continuing with different types of hypothesis tests

I Last time: Hypothesis tests involving means and difference inmeans

I Today: Hypothesis tests involving proportions

I Interpreting hypothesis tests

I Practical versus statistical significance

I Type I and Type II errors (and power, if time)

Roadmap

I Continuing with different types of hypothesis tests

I Last time: Hypothesis tests involving means and difference inmeans

I Today: Hypothesis tests involving proportions

I Interpreting hypothesis tests

I Practical versus statistical significance

I Type I and Type II errors (and power, if time)

Roadmap

I Continuing with different types of hypothesis tests

I Last time: Hypothesis tests involving means and difference inmeans

I Today: Hypothesis tests involving proportions

I Interpreting hypothesis tests

I Practical versus statistical significance

I Type I and Type II errors (and power, if time)

Roadmap

I Continuing with different types of hypothesis tests

I Last time: Hypothesis tests involving means and difference inmeans

I Today: Hypothesis tests involving proportions

I Interpreting hypothesis tests

I Practical versus statistical significance

I Type I and Type II errors (and power, if time)

Hypothesis Testing for Proportions

I Often not interested in making inferences about a populationmean, but a population proportion

I Proportions commonly used to summarize yes-or-nooutcomes:

I True U.S. presidential approval ratingI True support in the U.S. population for same-sex marriageI Proportion of people in an indigenous population who have a

certain genetic markerI Proportion of households in a city that are below national

poverty levels

I Can extend hypothesis testing to analyze all sorts ofproportions

I Tweak existing hypothesis framework only slightly

Hypothesis Testing for Proportions

I Often not interested in making inferences about a populationmean, but a population proportion

I Proportions commonly used to summarize yes-or-nooutcomes:

I True U.S. presidential approval ratingI True support in the U.S. population for same-sex marriageI Proportion of people in an indigenous population who have a

certain genetic markerI Proportion of households in a city that are below national

poverty levels

I Can extend hypothesis testing to analyze all sorts ofproportions

I Tweak existing hypothesis framework only slightly

Hypothesis Testing for Proportions

I Often not interested in making inferences about a populationmean, but a population proportion

I Proportions commonly used to summarize yes-or-nooutcomes:

I True U.S. presidential approval ratingI True support in the U.S. population for same-sex marriageI Proportion of people in an indigenous population who have a

certain genetic markerI Proportion of households in a city that are below national

poverty levels

I Can extend hypothesis testing to analyze all sorts ofproportions

I Tweak existing hypothesis framework only slightly

Hypothesis Testing for Proportions

I Often not interested in making inferences about a populationmean, but a population proportion

I Proportions commonly used to summarize yes-or-nooutcomes:

I True U.S. presidential approval ratingI True support in the U.S. population for same-sex marriageI Proportion of people in an indigenous population who have a

certain genetic markerI Proportion of households in a city that are below national

poverty levels

I Can extend hypothesis testing to analyze all sorts ofproportions

I Tweak existing hypothesis framework only slightly

Hypothesis Testing for Proportions

I Often not interested in making inferences about a populationmean, but a population proportion

I Proportions commonly used to summarize yes-or-nooutcomes:

I True U.S. presidential approval ratingI True support in the U.S. population for same-sex marriageI Proportion of people in an indigenous population who have a

certain genetic markerI Proportion of households in a city that are below national

poverty levels

I Can extend hypothesis testing to analyze all sorts ofproportions

I Tweak existing hypothesis framework only slightly

Hypothesis Testing for Proportions

I Can use hypothesis tests to explore these questions

I Objective: Use sample of data to make inferences about truepopulation proportion, π (sometimes denoted p)

I Same steps as before

I Test statistics slightly different

Hypothesis Testing for Proportions

I Can use hypothesis tests to explore these questions

I Objective: Use sample of data to make inferences about truepopulation proportion, π (sometimes denoted p)

I Same steps as before

I Test statistics slightly different

Hypothesis Testing for Proportions

I Can use hypothesis tests to explore these questions

I Objective: Use sample of data to make inferences about truepopulation proportion, π (sometimes denoted p)

I Same steps as before

I Test statistics slightly different

Hypothesis Testing for Proportions

I Can use hypothesis tests to explore these questions

I Objective: Use sample of data to make inferences about truepopulation proportion, π (sometimes denoted p)

I Same steps as before

I Test statistics slightly different

Hypothesis Testing for Proportions

I Can use hypothesis tests to explore these questions

I Objective: Use sample of data to make inferences about truepopulation proportion, π (sometimes denoted p)

I Same steps as before

I Test statistics slightly different

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Proportion Example: Physician Training Program

I Historically it has been difficult to attract general physiciansin the U.S. to rural areas, with 52% leaving rural areas within1 year of arriving

I You work on a new program designed to incentive doctors tostay in rural areas via loan forgiveness, housing allowances

I Random sample of physicians in the program shows that 62out of 130 leave within 1 year of arriving

I Based on this sample, what can you conclude about enrolleesin the program compared to physicians arriving to rural areasoverall?

Proportion Example: Physician Training Program

I Historically it has been difficult to attract general physiciansin the U.S. to rural areas, with 52% leaving rural areas within1 year of arriving

I You work on a new program designed to incentive doctors tostay in rural areas via loan forgiveness, housing allowances

I Random sample of physicians in the program shows that 62out of 130 leave within 1 year of arriving

I Based on this sample, what can you conclude about enrolleesin the program compared to physicians arriving to rural areasoverall?

Proportion Example: Physician Training Program

I Historically it has been difficult to attract general physiciansin the U.S. to rural areas, with 52% leaving rural areas within1 year of arriving

I You work on a new program designed to incentive doctors tostay in rural areas via loan forgiveness, housing allowances

I Random sample of physicians in the program shows that 62out of 130 leave within 1 year of arriving

I Based on this sample, what can you conclude about enrolleesin the program compared to physicians arriving to rural areasoverall?

Proportion Example: Physician Training Program

I Historically it has been difficult to attract general physiciansin the U.S. to rural areas, with 52% leaving rural areas within1 year of arriving

I You work on a new program designed to incentive doctors tostay in rural areas via loan forgiveness, housing allowances

I Random sample of physicians in the program shows that 62out of 130 leave within 1 year of arriving

I Based on this sample, what can you conclude about enrolleesin the program compared to physicians arriving to rural areasoverall?

Proportion Example: Physician Training Program

I Historically it has been difficult to attract general physiciansin the U.S. to rural areas, with 52% leaving rural areas within1 year of arriving

I You work on a new program designed to incentive doctors tostay in rural areas via loan forgiveness, housing allowances

I Random sample of physicians in the program shows that 62out of 130 leave within 1 year of arriving

I Based on this sample, what can you conclude about enrolleesin the program compared to physicians arriving to rural areasoverall?

Proportion Example: Physician Training Program

I Step 1: Null and Alternative Hypotheses

I H0: π = 0.52

I Ha: π 6= 0.52 for two-sided test (or π > 0.52, for one-sidedtest)

Proportion Example: Physician Training Program

I Step 1: Null and Alternative Hypotheses

I H0: π = 0.52

I Ha: π 6= 0.52 for two-sided test (or π > 0.52, for one-sidedtest)

Proportion Example: Physician Training Program

I Step 1: Null and Alternative Hypotheses

I H0: π = 0.52

I Ha: π 6= 0.52 for two-sided test (or π > 0.52, for one-sidedtest)

Proportion Example: Physician Training Program

I Step 1: Null and Alternative Hypotheses

I H0: π = 0.52

I Ha: π 6= 0.52 for two-sided test (or π > 0.52, for one-sidedtest)

Proportion Example: Physician Training Program

I Step 2: Collect sample data

I Assume this is done for us already!

I n = 130, with 62 doctors leaving within 1 year

Proportion Example: Physician Training Program

I Step 2: Collect sample data

I Assume this is done for us already!

I n = 130, with 62 doctors leaving within 1 year

Proportion Example: Physician Training Program

I Step 2: Collect sample data

I Assume this is done for us already!

I n = 130, with 62 doctors leaving within 1 year

Proportion Example: Physician Training Program

I Step 2: Collect sample data

I Assume this is done for us already!

I n = 130, with 62 doctors leaving within 1 year

Proportion Example: Physician Training Program

I Step 3: Conduct statistical test

I As before, need to find appropriate test statistic to compareour sample to distribution of data under the null hypothesis

I As before, another application of CLT

Proportion Example: Physician Training Program

I Step 3: Conduct statistical test

I As before, need to find appropriate test statistic to compareour sample to distribution of data under the null hypothesis

I As before, another application of CLT

Proportion Example: Physician Training Program

I Step 3: Conduct statistical test

I As before, need to find appropriate test statistic to compareour sample to distribution of data under the null hypothesis

I As before, another application of CLT

Proportion Example: Physician Training Program

I Step 3: Conduct statistical test

I As before, need to find appropriate test statistic to compareour sample to distribution of data under the null hypothesis

I As before, another application of CLT

Proportion Example: Physician Training Program

I We’ll work w/ sample proportion (denoted π̂ or p̂) → samplemean when all observations 1s and 0s

I x̄ = π̂ =∑

xin

I As a review, under the CLT,I (1) The sums and means of random variables have an

approximately normal distributionI (2) This distribution becomes more and more normal the more

observations are included in the sum or the meanI (3) This is the case even though underlying distribution may

not be normal

I Thus, under repeated sampling, π̂’s distribution will follow anormal distribution (see Lecture 11 slides for the simulation)

Proportion Example: Physician Training Program

I We’ll work w/ sample proportion (denoted π̂ or p̂) → samplemean when all observations 1s and 0s

I x̄ = π̂ =∑

xin

I As a review, under the CLT,I (1) The sums and means of random variables have an

approximately normal distributionI (2) This distribution becomes more and more normal the more

observations are included in the sum or the meanI (3) This is the case even though underlying distribution may

not be normal

I Thus, under repeated sampling, π̂’s distribution will follow anormal distribution (see Lecture 11 slides for the simulation)

Proportion Example: Physician Training Program

I We’ll work w/ sample proportion (denoted π̂ or p̂) → samplemean when all observations 1s and 0s

I x̄ = π̂ =∑

xin

I As a review, under the CLT,I (1) The sums and means of random variables have an

approximately normal distributionI (2) This distribution becomes more and more normal the more

observations are included in the sum or the meanI (3) This is the case even though underlying distribution may

not be normal

I Thus, under repeated sampling, π̂’s distribution will follow anormal distribution (see Lecture 11 slides for the simulation)

Proportion Example: Physician Training Program

I We’ll work w/ sample proportion (denoted π̂ or p̂) → samplemean when all observations 1s and 0s

I x̄ = π̂ =∑

xin

I As a review, under the CLT,

I (1) The sums and means of random variables have anapproximately normal distribution

I (2) This distribution becomes more and more normal the moreobservations are included in the sum or the mean

I (3) This is the case even though underlying distribution maynot be normal

I Thus, under repeated sampling, π̂’s distribution will follow anormal distribution (see Lecture 11 slides for the simulation)

Proportion Example: Physician Training Program

I We’ll work w/ sample proportion (denoted π̂ or p̂) → samplemean when all observations 1s and 0s

I x̄ = π̂ =∑

xin

I As a review, under the CLT,I (1) The sums and means of random variables have an

approximately normal distribution

I (2) This distribution becomes more and more normal the moreobservations are included in the sum or the mean

I (3) This is the case even though underlying distribution maynot be normal

I Thus, under repeated sampling, π̂’s distribution will follow anormal distribution (see Lecture 11 slides for the simulation)

Proportion Example: Physician Training Program

I We’ll work w/ sample proportion (denoted π̂ or p̂) → samplemean when all observations 1s and 0s

I x̄ = π̂ =∑

xin

I As a review, under the CLT,I (1) The sums and means of random variables have an

approximately normal distributionI (2) This distribution becomes more and more normal the more

observations are included in the sum or the mean

I (3) This is the case even though underlying distribution maynot be normal

I Thus, under repeated sampling, π̂’s distribution will follow anormal distribution (see Lecture 11 slides for the simulation)

Proportion Example: Physician Training Program

I We’ll work w/ sample proportion (denoted π̂ or p̂) → samplemean when all observations 1s and 0s

I x̄ = π̂ =∑

xin

I As a review, under the CLT,I (1) The sums and means of random variables have an

approximately normal distributionI (2) This distribution becomes more and more normal the more

observations are included in the sum or the meanI (3) This is the case even though underlying distribution may

not be normal

I Thus, under repeated sampling, π̂’s distribution will follow anormal distribution (see Lecture 11 slides for the simulation)

Proportion Example: Physician Training Program

I We’ll work w/ sample proportion (denoted π̂ or p̂) → samplemean when all observations 1s and 0s

I x̄ = π̂ =∑

xin

I As a review, under the CLT,I (1) The sums and means of random variables have an

approximately normal distributionI (2) This distribution becomes more and more normal the more

observations are included in the sum or the meanI (3) This is the case even though underlying distribution may

not be normal

I Thus, under repeated sampling, π̂’s distribution will follow anormal distribution (see Lecture 11 slides for the simulation)

Proportion Example: Physician Training Program

I Specifically, with large enough sample, π̂ will

π̂ ∼ N(π,π(1 − π)

n)

I See proofs for both mean and variance in Appendix)

I (∑

xi follows a binomial distribution, so parameters forsampling distribution derived from the parameters of abinomial distribution divided by n)

I Which means we can standardize

Z =π̂− π√π(1−π)

n

I where Z follows a standard normal

Proportion Example: Physician Training Program

I Specifically, with large enough sample, π̂ will

π̂ ∼ N(π,π(1 − π)

n)

I See proofs for both mean and variance in Appendix)

I (∑

xi follows a binomial distribution, so parameters forsampling distribution derived from the parameters of abinomial distribution divided by n)

I Which means we can standardize

Z =π̂− π√π(1−π)

n

I where Z follows a standard normal

Proportion Example: Physician Training Program

I Specifically, with large enough sample, π̂ will

π̂ ∼ N(π,π(1 − π)

n)

I See proofs for both mean and variance in Appendix)

I (∑

xi follows a binomial distribution, so parameters forsampling distribution derived from the parameters of abinomial distribution divided by n)

I Which means we can standardize

Z =π̂− π√π(1−π)

n

I where Z follows a standard normal

Proportion Example: Physician Training Program

I Specifically, with large enough sample, π̂ will

π̂ ∼ N(π,π(1 − π)

n)

I See proofs for both mean and variance in Appendix)

I (∑

xi follows a binomial distribution, so parameters forsampling distribution derived from the parameters of abinomial distribution divided by n)

I Which means we can standardize

Z =π̂− π√π(1−π)

n

I where Z follows a standard normal

Proportion Example: Physician Training Program

I Specifically, with large enough sample, π̂ will

π̂ ∼ N(π,π(1 − π)

n)

I See proofs for both mean and variance in Appendix)

I (∑

xi follows a binomial distribution, so parameters forsampling distribution derived from the parameters of abinomial distribution divided by n)

I Which means we can standardize

Z =π̂− π√π(1−π)

n

I where Z follows a standard normal

Proportion Example: Physician Training Program

I Specifically, with large enough sample, π̂ will

π̂ ∼ N(π,π(1 − π)

n)

I See proofs for both mean and variance in Appendix)

I (∑

xi follows a binomial distribution, so parameters forsampling distribution derived from the parameters of abinomial distribution divided by n)

I Which means we can standardize

Z =π̂− π√π(1−π)

n

I where Z follows a standard normal

Proportion Example: Physician Training Program

I Specifically, with large enough sample, π̂ will

π̂ ∼ N(π,π(1 − π)

n)

I See proofs for both mean and variance in Appendix)

I (∑

xi follows a binomial distribution, so parameters forsampling distribution derived from the parameters of abinomial distribution divided by n)

I Which means we can standardize

Z =π̂− π√π(1−π)

n

I where Z follows a standard normal

Proportion Example: Physician Training Program

I Specifically, with large enough sample, π̂ will

π̂ ∼ N(π,π(1 − π)

n)

I See proofs for both mean and variance in Appendix)

I (∑

xi follows a binomial distribution, so parameters forsampling distribution derived from the parameters of abinomial distribution divided by n)

I Which means we can standardize

Z =π̂− π√π(1−π)

n

I where Z follows a standard normal

Proportion Example: Physician Training Program

I Under assumption that null is true (here, π = 0.52), cancalculate test statistic using our data:

Z =π̂− π√π(1−π)

n

z =π̂− π0√π0(1−π0)

n

=62/130 − 0.52√

0.52(1−0.52)130

= −0.983

Proportion Example: Physician Training Program

I Under assumption that null is true (here, π = 0.52), cancalculate test statistic using our data:

Z =π̂− π√π(1−π)

n

z =π̂− π0√π0(1−π0)

n

=62/130 − 0.52√

0.52(1−0.52)130

= −0.983

Proportion Example: Physician Training Program

I Under assumption that null is true (here, π = 0.52), cancalculate test statistic using our data:

Z =π̂− π√π(1−π)

n

z =π̂− π0√π0(1−π0)

n

=62/130 − 0.52√

0.52(1−0.52)130

= −0.983

Proportion Example: Physician Training Program

I Note: Test statistic does not rely on any sample variance:

z =π̂− π0√π0(1−π0)

n

I In fact, fully determined by n and π0 (assuming null)

I Thus, more OK to use standard normal (b/c we are not usingsample standard deviation)

I But still want to check sufficient sample size → above n = 30and π not close to 0 or 1:

I n × π0 > 10 and n × (1 − π0) > 10

I If yes, distribution of π̂ is approximately normal, ok to usez-test (Normal)

I If not → calculate using binomial distribution directly

Proportion Example: Physician Training Program

I Note: Test statistic does not rely on any sample variance:

z =π̂− π0√π0(1−π0)

n

I In fact, fully determined by n and π0 (assuming null)

I Thus, more OK to use standard normal (b/c we are not usingsample standard deviation)

I But still want to check sufficient sample size → above n = 30and π not close to 0 or 1:

I n × π0 > 10 and n × (1 − π0) > 10

I If yes, distribution of π̂ is approximately normal, ok to usez-test (Normal)

I If not → calculate using binomial distribution directly

Proportion Example: Physician Training Program

I Note: Test statistic does not rely on any sample variance:

z =π̂− π0√π0(1−π0)

n

I In fact, fully determined by n and π0 (assuming null)

I Thus, more OK to use standard normal (b/c we are not usingsample standard deviation)

I But still want to check sufficient sample size → above n = 30and π not close to 0 or 1:

I n × π0 > 10 and n × (1 − π0) > 10

I If yes, distribution of π̂ is approximately normal, ok to usez-test (Normal)

I If not → calculate using binomial distribution directly

Proportion Example: Physician Training Program

I Note: Test statistic does not rely on any sample variance:

z =π̂− π0√π0(1−π0)

n

I In fact, fully determined by n and π0 (assuming null)

I Thus, more OK to use standard normal (b/c we are not usingsample standard deviation)

I But still want to check sufficient sample size → above n = 30and π not close to 0 or 1:

I n × π0 > 10 and n × (1 − π0) > 10

I If yes, distribution of π̂ is approximately normal, ok to usez-test (Normal)

I If not → calculate using binomial distribution directly

Proportion Example: Physician Training Program

I Note: Test statistic does not rely on any sample variance:

z =π̂− π0√π0(1−π0)

n

I In fact, fully determined by n and π0 (assuming null)

I Thus, more OK to use standard normal (b/c we are not usingsample standard deviation)

I But still want to check sufficient sample size → above n = 30and π not close to 0 or 1:

I n × π0 > 10 and n × (1 − π0) > 10

I If yes, distribution of π̂ is approximately normal, ok to usez-test (Normal)

I If not → calculate using binomial distribution directly

Proportion Example: Physician Training Program

I Note: Test statistic does not rely on any sample variance:

z =π̂− π0√π0(1−π0)

n

I In fact, fully determined by n and π0 (assuming null)

I Thus, more OK to use standard normal (b/c we are not usingsample standard deviation)

I But still want to check sufficient sample size → above n = 30and π not close to 0 or 1:

I n × π0 > 10 and n × (1 − π0) > 10

I If yes, distribution of π̂ is approximately normal, ok to usez-test (Normal)

I If not → calculate using binomial distribution directly

Proportion Example: Physician Training Program

I Note: Test statistic does not rely on any sample variance:

z =π̂− π0√π0(1−π0)

n

I In fact, fully determined by n and π0 (assuming null)

I Thus, more OK to use standard normal (b/c we are not usingsample standard deviation)

I But still want to check sufficient sample size → above n = 30and π not close to 0 or 1:

I n × π0 > 10 and n × (1 − π0) > 10

I If yes, distribution of π̂ is approximately normal, ok to usez-test (Normal)

I If not → calculate using binomial distribution directly

Proportion Example: Physician Training Program

I Note: Test statistic does not rely on any sample variance:

z =π̂− π0√π0(1−π0)

n

I In fact, fully determined by n and π0 (assuming null)

I Thus, more OK to use standard normal (b/c we are not usingsample standard deviation)

I But still want to check sufficient sample size → above n = 30and π not close to 0 or 1:

I n × π0 > 10 and n × (1 − π0) > 10

I If yes, distribution of π̂ is approximately normal, ok to usez-test (Normal)

I If not → calculate using binomial distribution directly

Proportion Example: Physician Training Program

I Note: Test statistic does not rely on any sample variance:

z =π̂− π0√π0(1−π0)

n

I In fact, fully determined by n and π0 (assuming null)

I Thus, more OK to use standard normal (b/c we are not usingsample standard deviation)

I But still want to check sufficient sample size → above n = 30and π not close to 0 or 1:

I n × π0 > 10 and n × (1 − π0) > 10

I If yes, distribution of π̂ is approximately normal, ok to usez-test (Normal)

I If not → calculate using binomial distribution directly

Proportion Example: Physician Training Program

I Step 4: Once have calculated the test statistic, use that tocalculate p-value

I Under two-tailed test:I 2× P(Z 6 −0.983) = 0.3256

Proportion Example: Physician Training Program

I Step 4: Once have calculated the test statistic, use that tocalculate p-value

I Under two-tailed test:I 2× P(Z 6 −0.983) = 0.3256

Proportion Example: Physician Training Program

I Step 4: Once have calculated the test statistic, use that tocalculate p-value

I Under two-tailed test:

I 2× P(Z 6 −0.983) = 0.3256

Proportion Example: Physician Training Program

I Step 4: Once have calculated the test statistic, use that tocalculate p-value

I Under two-tailed test:I 2× P(Z 6 −0.983) = 0.3256

Proportion Example: Physician Training Program

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I Given two-tailed test with p-value of 0.3256, what would youdo?

I Would you reject if one-tailed test?

Proportion Example: Physician Training Program

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I Given two-tailed test with p-value of 0.3256, what would youdo?

I Would you reject if one-tailed test?

Proportion Example: Physician Training Program

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I Given two-tailed test with p-value of 0.3256, what would youdo?

I Would you reject if one-tailed test?

Proportion Example: Physician Training Program

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I Given two-tailed test with p-value of 0.3256, what would youdo?

I Would you reject if one-tailed test?

Extending to Comparing Differences in Proportions

I This example used a single proportionI Just like in case of a single mean, there is “some value” to

compare sample proportion toI Ex) Whether program changes share of doctors leaving rural

areas (H0 : π = 0.52)I Ex) Whether half of jobs occupied by women (H0 : π = 0.50)

I However, can extend to explore comparisons of proportionsbetween groups (analogy to difference in means)

I Ex) Compare voter turnout on Brexit in Scotland (notcompetitive) versus England (competitive)?

I Ex) Are Canadians more likely to favor tighter gun controlrestrictions compared to Americans?

I Ex) Do police in the U.S. pull over black motorists morefrequently than white motorists?

Extending to Comparing Differences in Proportions

I This example used a single proportion

I Just like in case of a single mean, there is “some value” tocompare sample proportion to

I Ex) Whether program changes share of doctors leaving ruralareas (H0 : π = 0.52)

I Ex) Whether half of jobs occupied by women (H0 : π = 0.50)

I However, can extend to explore comparisons of proportionsbetween groups (analogy to difference in means)

I Ex) Compare voter turnout on Brexit in Scotland (notcompetitive) versus England (competitive)?

I Ex) Are Canadians more likely to favor tighter gun controlrestrictions compared to Americans?

I Ex) Do police in the U.S. pull over black motorists morefrequently than white motorists?

Extending to Comparing Differences in Proportions

I This example used a single proportionI Just like in case of a single mean, there is “some value” to

compare sample proportion to

I Ex) Whether program changes share of doctors leaving ruralareas (H0 : π = 0.52)

I Ex) Whether half of jobs occupied by women (H0 : π = 0.50)

I However, can extend to explore comparisons of proportionsbetween groups (analogy to difference in means)

I Ex) Compare voter turnout on Brexit in Scotland (notcompetitive) versus England (competitive)?

I Ex) Are Canadians more likely to favor tighter gun controlrestrictions compared to Americans?

I Ex) Do police in the U.S. pull over black motorists morefrequently than white motorists?

Extending to Comparing Differences in Proportions

I This example used a single proportionI Just like in case of a single mean, there is “some value” to

compare sample proportion toI Ex) Whether program changes share of doctors leaving rural

areas (H0 : π = 0.52)

I Ex) Whether half of jobs occupied by women (H0 : π = 0.50)

I However, can extend to explore comparisons of proportionsbetween groups (analogy to difference in means)

I Ex) Compare voter turnout on Brexit in Scotland (notcompetitive) versus England (competitive)?

I Ex) Are Canadians more likely to favor tighter gun controlrestrictions compared to Americans?

I Ex) Do police in the U.S. pull over black motorists morefrequently than white motorists?

Extending to Comparing Differences in Proportions

I This example used a single proportionI Just like in case of a single mean, there is “some value” to

compare sample proportion toI Ex) Whether program changes share of doctors leaving rural

areas (H0 : π = 0.52)I Ex) Whether half of jobs occupied by women (H0 : π = 0.50)

I However, can extend to explore comparisons of proportionsbetween groups (analogy to difference in means)

I Ex) Compare voter turnout on Brexit in Scotland (notcompetitive) versus England (competitive)?

I Ex) Are Canadians more likely to favor tighter gun controlrestrictions compared to Americans?

I Ex) Do police in the U.S. pull over black motorists morefrequently than white motorists?

Extending to Comparing Differences in Proportions

I This example used a single proportionI Just like in case of a single mean, there is “some value” to

compare sample proportion toI Ex) Whether program changes share of doctors leaving rural

areas (H0 : π = 0.52)I Ex) Whether half of jobs occupied by women (H0 : π = 0.50)

I However, can extend to explore comparisons of proportionsbetween groups (analogy to difference in means)

I Ex) Compare voter turnout on Brexit in Scotland (notcompetitive) versus England (competitive)?

I Ex) Are Canadians more likely to favor tighter gun controlrestrictions compared to Americans?

I Ex) Do police in the U.S. pull over black motorists morefrequently than white motorists?

Extending to Comparing Differences in Proportions

I This example used a single proportionI Just like in case of a single mean, there is “some value” to

compare sample proportion toI Ex) Whether program changes share of doctors leaving rural

areas (H0 : π = 0.52)I Ex) Whether half of jobs occupied by women (H0 : π = 0.50)

I However, can extend to explore comparisons of proportionsbetween groups (analogy to difference in means)

I Ex) Compare voter turnout on Brexit in Scotland (notcompetitive) versus England (competitive)?

I Ex) Are Canadians more likely to favor tighter gun controlrestrictions compared to Americans?

I Ex) Do police in the U.S. pull over black motorists morefrequently than white motorists?

Extending to Comparing Differences in Proportions

I This example used a single proportionI Just like in case of a single mean, there is “some value” to

compare sample proportion toI Ex) Whether program changes share of doctors leaving rural

areas (H0 : π = 0.52)I Ex) Whether half of jobs occupied by women (H0 : π = 0.50)

I However, can extend to explore comparisons of proportionsbetween groups (analogy to difference in means)

I Ex) Compare voter turnout on Brexit in Scotland (notcompetitive) versus England (competitive)?

I Ex) Are Canadians more likely to favor tighter gun controlrestrictions compared to Americans?

I Ex) Do police in the U.S. pull over black motorists morefrequently than white motorists?

Extending to Comparing Differences in Proportions

I This example used a single proportionI Just like in case of a single mean, there is “some value” to

compare sample proportion toI Ex) Whether program changes share of doctors leaving rural

areas (H0 : π = 0.52)I Ex) Whether half of jobs occupied by women (H0 : π = 0.50)

I However, can extend to explore comparisons of proportionsbetween groups (analogy to difference in means)

I Ex) Compare voter turnout on Brexit in Scotland (notcompetitive) versus England (competitive)?

I Ex) Are Canadians more likely to favor tighter gun controlrestrictions compared to Americans?

I Ex) Do police in the U.S. pull over black motorists morefrequently than white motorists?

Randomized Controlled Trial Example

I 2006 study to determine healing power of spirituality withcoronary surgery patients & their health outcomes

I Looking at six US hospitals, patients randomly assigned intotwo groups

I 604 assigned to have volunteers pray for them → 315developed complications

I 597 assigned to control (no prayer) → 304 developedcomplications

I Patients in both groups ”informed that they may or may notreceive prayer”

I Based on this data, should we agree with skeptics that sayprayer unrelated to surgical outcomes?

Randomized Controlled Trial Example

I 2006 study to determine healing power of spirituality withcoronary surgery patients & their health outcomes

I Looking at six US hospitals, patients randomly assigned intotwo groups

I 604 assigned to have volunteers pray for them → 315developed complications

I 597 assigned to control (no prayer) → 304 developedcomplications

I Patients in both groups ”informed that they may or may notreceive prayer”

I Based on this data, should we agree with skeptics that sayprayer unrelated to surgical outcomes?

Randomized Controlled Trial Example

I 2006 study to determine healing power of spirituality withcoronary surgery patients & their health outcomes

I Looking at six US hospitals, patients randomly assigned intotwo groups

I 604 assigned to have volunteers pray for them → 315developed complications

I 597 assigned to control (no prayer) → 304 developedcomplications

I Patients in both groups ”informed that they may or may notreceive prayer”

I Based on this data, should we agree with skeptics that sayprayer unrelated to surgical outcomes?

Randomized Controlled Trial Example

I 2006 study to determine healing power of spirituality withcoronary surgery patients & their health outcomes

I Looking at six US hospitals, patients randomly assigned intotwo groups

I 604 assigned to have volunteers pray for them → 315developed complications

I 597 assigned to control (no prayer) → 304 developedcomplications

I Patients in both groups ”informed that they may or may notreceive prayer”

I Based on this data, should we agree with skeptics that sayprayer unrelated to surgical outcomes?

Randomized Controlled Trial Example

I 2006 study to determine healing power of spirituality withcoronary surgery patients & their health outcomes

I Looking at six US hospitals, patients randomly assigned intotwo groups

I 604 assigned to have volunteers pray for them → 315developed complications

I 597 assigned to control (no prayer) → 304 developedcomplications

I Patients in both groups ”informed that they may or may notreceive prayer”

I Based on this data, should we agree with skeptics that sayprayer unrelated to surgical outcomes?

Randomized Controlled Trial Example

I 2006 study to determine healing power of spirituality withcoronary surgery patients & their health outcomes

I Looking at six US hospitals, patients randomly assigned intotwo groups

I 604 assigned to have volunteers pray for them → 315developed complications

I 597 assigned to control (no prayer) → 304 developedcomplications

I Patients in both groups ”informed that they may or may notreceive prayer”

I Based on this data, should we agree with skeptics that sayprayer unrelated to surgical outcomes?

Randomized Controlled Trial Example

I 2006 study to determine healing power of spirituality withcoronary surgery patients & their health outcomes

I Looking at six US hospitals, patients randomly assigned intotwo groups

I 604 assigned to have volunteers pray for them → 315developed complications

I 597 assigned to control (no prayer) → 304 developedcomplications

I Patients in both groups ”informed that they may or may notreceive prayer”

I Based on this data, should we agree with skeptics that sayprayer unrelated to surgical outcomes?

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Let’s review these steps using the prayer and surgery randomizedcontrolled trial (RTC) example

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Let’s review these steps using the prayer and surgery randomizedcontrolled trial (RTC) example

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Let’s review these steps using the prayer and surgery randomizedcontrolled trial (RTC) example

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Let’s review these steps using the prayer and surgery randomizedcontrolled trial (RTC) example

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Let’s review these steps using the prayer and surgery randomizedcontrolled trial (RTC) example

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Let’s review these steps using the prayer and surgery randomizedcontrolled trial (RTC) example

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Let’s review these steps using the prayer and surgery randomizedcontrolled trial (RTC) example

Hypothesis Testing for Proportions

Conducted using several steps:

1. Generate your null and alternative hypotheses

2. Collect sample/s of data

3. Calculate appropriate test statistic

4. Use that to calculate a probability called a p-value

5. Use the p-value to decide whether to reject null hypothesisand interpret results

Let’s review these steps using the prayer and surgery randomizedcontrolled trial (RTC) example

Randomized Controlled Trial Example

I Step 1: Null and Alternative HypothesesI Null Hypothesis

I No difference (prayer not effective)I H0 : π1 − π2 = 0 (or π1 = π2)

I Alternative HypothesisI Ha : π1 − π2 6= 0 (or π1 6= π2)I (Ha : π1 − π2 > 0)

Randomized Controlled Trial Example

I Step 1: Null and Alternative Hypotheses

I Null HypothesisI No difference (prayer not effective)I H0 : π1 − π2 = 0 (or π1 = π2)

I Alternative HypothesisI Ha : π1 − π2 6= 0 (or π1 6= π2)I (Ha : π1 − π2 > 0)

Randomized Controlled Trial Example

I Step 1: Null and Alternative HypothesesI Null Hypothesis

I No difference (prayer not effective)I H0 : π1 − π2 = 0 (or π1 = π2)

I Alternative HypothesisI Ha : π1 − π2 6= 0 (or π1 6= π2)I (Ha : π1 − π2 > 0)

Randomized Controlled Trial Example

I Step 1: Null and Alternative HypothesesI Null Hypothesis

I No difference (prayer not effective)

I H0 : π1 − π2 = 0 (or π1 = π2)

I Alternative HypothesisI Ha : π1 − π2 6= 0 (or π1 6= π2)I (Ha : π1 − π2 > 0)

Randomized Controlled Trial Example

I Step 1: Null and Alternative HypothesesI Null Hypothesis

I No difference (prayer not effective)I H0 : π1 − π2 = 0 (or π1 = π2)

I Alternative HypothesisI Ha : π1 − π2 6= 0 (or π1 6= π2)I (Ha : π1 − π2 > 0)

Randomized Controlled Trial Example

I Step 1: Null and Alternative HypothesesI Null Hypothesis

I No difference (prayer not effective)I H0 : π1 − π2 = 0 (or π1 = π2)

I Alternative Hypothesis

I Ha : π1 − π2 6= 0 (or π1 6= π2)I (Ha : π1 − π2 > 0)

Randomized Controlled Trial Example

I Step 1: Null and Alternative HypothesesI Null Hypothesis

I No difference (prayer not effective)I H0 : π1 − π2 = 0 (or π1 = π2)

I Alternative HypothesisI Ha : π1 − π2 6= 0 (or π1 6= π2)

I (Ha : π1 − π2 > 0)

Randomized Controlled Trial Example

I Step 1: Null and Alternative HypothesesI Null Hypothesis

I No difference (prayer not effective)I H0 : π1 − π2 = 0 (or π1 = π2)

I Alternative HypothesisI Ha : π1 − π2 6= 0 (or π1 6= π2)I (Ha : π1 − π2 > 0)

Randomized Controlled Trial Example

I Step 2: Collect sample data

I Done for us:

Prayer No Complications Complications Total

Yes 289 315 604No 293 304 597

Randomized Controlled Trial Example

I Step 2: Collect sample data

I Done for us:

Prayer No Complications Complications Total

Yes 289 315 604No 293 304 597

Randomized Controlled Trial Example

I Step 2: Collect sample data

I Done for us:

Prayer No Complications Complications Total

Yes 289 315 604No 293 304 597

Randomized Controlled Trial Example

I Step 2: Collect sample data

I Done for us:

Prayer No Complications Complications Total

Yes 289 315 604No 293 304 597

Randomized Controlled Trial Example

I Step 3: Calculated appropriate test statistic

I Again, under CLT, if n1 and n2 are relatively large, thenestimator π̂1 − π̂2 will have an approximately normaldistribution

I Also, for independent samples, the standard error will just bethe sum of the two standard errors

I So π̂1 − π̂2 will follow:

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I and we can standardize to

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

Randomized Controlled Trial Example

I Step 3: Calculated appropriate test statistic

I Again, under CLT, if n1 and n2 are relatively large, thenestimator π̂1 − π̂2 will have an approximately normaldistribution

I Also, for independent samples, the standard error will just bethe sum of the two standard errors

I So π̂1 − π̂2 will follow:

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I and we can standardize to

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

Randomized Controlled Trial Example

I Step 3: Calculated appropriate test statistic

I Again, under CLT, if n1 and n2 are relatively large, thenestimator π̂1 − π̂2 will have an approximately normaldistribution

I Also, for independent samples, the standard error will just bethe sum of the two standard errors

I So π̂1 − π̂2 will follow:

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I and we can standardize to

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

Randomized Controlled Trial Example

I Step 3: Calculated appropriate test statistic

I Again, under CLT, if n1 and n2 are relatively large, thenestimator π̂1 − π̂2 will have an approximately normaldistribution

I Also, for independent samples, the standard error will just bethe sum of the two standard errors

I So π̂1 − π̂2 will follow:

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I and we can standardize to

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

Randomized Controlled Trial Example

I Step 3: Calculated appropriate test statistic

I Again, under CLT, if n1 and n2 are relatively large, thenestimator π̂1 − π̂2 will have an approximately normaldistribution

I Also, for independent samples, the standard error will just bethe sum of the two standard errors

I So π̂1 − π̂2 will follow:

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I and we can standardize to

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

Randomized Controlled Trial Example

I Step 3: Calculated appropriate test statistic

I Again, under CLT, if n1 and n2 are relatively large, thenestimator π̂1 − π̂2 will have an approximately normaldistribution

I Also, for independent samples, the standard error will just bethe sum of the two standard errors

I So π̂1 − π̂2 will follow:

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I and we can standardize to

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

Randomized Controlled Trial Example

I Step 3: Calculated appropriate test statistic

I Again, under CLT, if n1 and n2 are relatively large, thenestimator π̂1 − π̂2 will have an approximately normaldistribution

I Also, for independent samples, the standard error will just bethe sum of the two standard errors

I So π̂1 − π̂2 will follow:

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I and we can standardize to

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

Randomized Controlled Trial Example

I Step 3: Calculated appropriate test statistic

I Again, under CLT, if n1 and n2 are relatively large, thenestimator π̂1 − π̂2 will have an approximately normaldistribution

I Also, for independent samples, the standard error will just bethe sum of the two standard errors

I So π̂1 − π̂2 will follow:

∼ N(π1 − π2,π1(1 − π1)

n1+π2(1 − π2)

n2)

I and we can standardize to

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

Randomized Controlled Trial Example

I Given that the null is true, π1 − π2 = 0, we can calculate thetest statistic

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

z =π̂1 − π̂2√

π̂1(1−π̂1)n1

+π̂2(1−π̂2)

n2

I Using our example:

=315/604 − 304/597√

315/604(1−315/604)604 +

304/597(1−304/597)597

= 0.4637

Randomized Controlled Trial Example

I Given that the null is true, π1 − π2 = 0, we can calculate thetest statistic

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

z =π̂1 − π̂2√

π̂1(1−π̂1)n1

+π̂2(1−π̂2)

n2

I Using our example:

=315/604 − 304/597√

315/604(1−315/604)604 +

304/597(1−304/597)597

= 0.4637

Randomized Controlled Trial Example

I Given that the null is true, π1 − π2 = 0, we can calculate thetest statistic

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

z =π̂1 − π̂2√

π̂1(1−π̂1)n1

+π̂2(1−π̂2)

n2

I Using our example:

=315/604 − 304/597√

315/604(1−315/604)604 +

304/597(1−304/597)597

= 0.4637

Randomized Controlled Trial Example

I Given that the null is true, π1 − π2 = 0, we can calculate thetest statistic

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

z =π̂1 − π̂2√

π̂1(1−π̂1)n1

+π̂2(1−π̂2)

n2

I Using our example:

=315/604 − 304/597√

315/604(1−315/604)604 +

304/597(1−304/597)597

= 0.4637

Randomized Controlled Trial Example

I Given that the null is true, π1 − π2 = 0, we can calculate thetest statistic

Z =π̂1 − π̂2 − (π1 − π2)√π1(1−π1)

n1+π2(1−π2)

n2

z =π̂1 − π̂2√

π̂1(1−π̂1)n1

+π̂2(1−π̂2)

n2

I Using our example:

=315/604 − 304/597√

315/604(1−315/604)604 +

304/597(1−304/597)597

= 0.4637

Randomized Controlled Trial Example

I Step 4: Calculate p-value

I Note: Again, w/ proportions, standard normal works well (noneed to use t)

I Let’s use a two-tailed testI Here, p-value = 0.642813

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I What would you think?I Evidence to reject null of no effect of prayer?I Using a one-sided test?

Randomized Controlled Trial Example

I Step 4: Calculate p-value

I Note: Again, w/ proportions, standard normal works well (noneed to use t)

I Let’s use a two-tailed testI Here, p-value = 0.642813

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I What would you think?I Evidence to reject null of no effect of prayer?I Using a one-sided test?

Randomized Controlled Trial Example

I Step 4: Calculate p-value

I Note: Again, w/ proportions, standard normal works well (noneed to use t)

I Let’s use a two-tailed testI Here, p-value = 0.642813

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I What would you think?I Evidence to reject null of no effect of prayer?I Using a one-sided test?

Randomized Controlled Trial Example

I Step 4: Calculate p-value

I Note: Again, w/ proportions, standard normal works well (noneed to use t)

I Let’s use a two-tailed test

I Here, p-value = 0.642813

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I What would you think?I Evidence to reject null of no effect of prayer?I Using a one-sided test?

Randomized Controlled Trial Example

I Step 4: Calculate p-value

I Note: Again, w/ proportions, standard normal works well (noneed to use t)

I Let’s use a two-tailed testI Here, p-value = 0.642813

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I What would you think?I Evidence to reject null of no effect of prayer?I Using a one-sided test?

Randomized Controlled Trial Example

I Step 4: Calculate p-value

I Note: Again, w/ proportions, standard normal works well (noneed to use t)

I Let’s use a two-tailed testI Here, p-value = 0.642813

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I What would you think?I Evidence to reject null of no effect of prayer?I Using a one-sided test?

Randomized Controlled Trial Example

I Step 4: Calculate p-value

I Note: Again, w/ proportions, standard normal works well (noneed to use t)

I Let’s use a two-tailed testI Here, p-value = 0.642813

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I What would you think?

I Evidence to reject null of no effect of prayer?I Using a one-sided test?

Randomized Controlled Trial Example

I Step 4: Calculate p-value

I Note: Again, w/ proportions, standard normal works well (noneed to use t)

I Let’s use a two-tailed testI Here, p-value = 0.642813

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I What would you think?I Evidence to reject null of no effect of prayer?

I Using a one-sided test?

Randomized Controlled Trial Example

I Step 4: Calculate p-value

I Note: Again, w/ proportions, standard normal works well (noneed to use t)

I Let’s use a two-tailed testI Here, p-value = 0.642813

I Step 5: Decide whether or not to reject the null hypothesisand interpret results

I What would you think?I Evidence to reject null of no effect of prayer?I Using a one-sided test?

Potential issues with Hypothesis Testing

Hypothesis tests are not perfect!

I 1) Hypothesis Tests test the null hypothesis, not the alternatehypothesis

I 2) Will not guarantee practical significance

I 3) Cannot guarantee full certainty regarding whether nullhypothesis is 100% false

Potential issues with Hypothesis Testing

Hypothesis tests are not perfect!

I 1) Hypothesis Tests test the null hypothesis, not the alternatehypothesis

I 2) Will not guarantee practical significance

I 3) Cannot guarantee full certainty regarding whether nullhypothesis is 100% false

Potential issues with Hypothesis Testing

Hypothesis tests are not perfect!

I 1) Hypothesis Tests test the null hypothesis, not the alternatehypothesis

I 2) Will not guarantee practical significance

I 3) Cannot guarantee full certainty regarding whether nullhypothesis is 100% false

Potential issues with Hypothesis Testing

Hypothesis tests are not perfect!

I 1) Hypothesis Tests test the null hypothesis, not the alternatehypothesis

I 2) Will not guarantee practical significance

I 3) Cannot guarantee full certainty regarding whether nullhypothesis is 100% false

Potential issues with Hypothesis Testing

Hypothesis tests are not perfect!

I 1) Hypothesis Tests test the null hypothesis, not the alternatehypothesis

I 2) Will not guarantee practical significance

I 3) Cannot guarantee full certainty regarding whether nullhypothesis is 100% false

Potential issues with Hypothesis Testing

Hypothesis tests are not perfect!

I 1) Hypothesis Tests test the null hypothesis, not the alternatehypothesis

I 2) Will not guarantee practical significance

I 3) Cannot guarantee full certainty regarding whether nullhypothesis is 100% false

1) Hypotheses Tests test the null hypothesis

I Interpret as to whether you do or do not reject the null

I NOT whether you accept the alternative

I (A bit like proof by contradiction)

I Is p-value the probability that the null is true?

1) Hypotheses Tests test the null hypothesis

I Interpret as to whether you do or do not reject the null

I NOT whether you accept the alternative

I (A bit like proof by contradiction)

I Is p-value the probability that the null is true?

1) Hypotheses Tests test the null hypothesis

I Interpret as to whether you do or do not reject the null

I NOT whether you accept the alternative

I (A bit like proof by contradiction)

I Is p-value the probability that the null is true?

1) Hypotheses Tests test the null hypothesis

I Interpret as to whether you do or do not reject the null

I NOT whether you accept the alternative

I (A bit like proof by contradiction)

I Is p-value the probability that the null is true?

1) Hypotheses Tests test the null hypothesis

I Interpret as to whether you do or do not reject the null

I NOT whether you accept the alternative

I (A bit like proof by contradiction)

I Is p-value the probability that the null is true?

2) Practical versus Statistical Significance

I A very small p-value provides strong evidence to reject nullhypothesis

I But: statistical significance 6= practical significance

I We can often get small p-values due to large sample sizes

2) Practical versus Statistical Significance

I A very small p-value provides strong evidence to reject nullhypothesis

I But: statistical significance 6= practical significance

I We can often get small p-values due to large sample sizes

2) Practical versus Statistical Significance

I A very small p-value provides strong evidence to reject nullhypothesis

I But: statistical significance 6= practical significance

I We can often get small p-values due to large sample sizes

2) Practical versus Statistical Significance

I A very small p-value provides strong evidence to reject nullhypothesis

I But: statistical significance 6= practical significance

I We can often get small p-values due to large sample sizes

2) Practical versus Statistical Significance

I Example: Suppose you have intervention to increase testscores of poor children (Group 1) versus affluent children(Group 2)

I H0 : µ1 − µ2 = 0, Ha : µ1 − µ2 6= 0

I n = 1 million study yields difference of 0.02 on 100-point testand p-value < 0.00001

I → What would you do?

I Would it affect your decision if intervention cost $1 billion?$500,000? Costless?

2) Practical versus Statistical Significance

I Example: Suppose you have intervention to increase testscores of poor children (Group 1) versus affluent children(Group 2)

I H0 : µ1 − µ2 = 0, Ha : µ1 − µ2 6= 0

I n = 1 million study yields difference of 0.02 on 100-point testand p-value < 0.00001

I → What would you do?

I Would it affect your decision if intervention cost $1 billion?$500,000? Costless?

2) Practical versus Statistical Significance

I Example: Suppose you have intervention to increase testscores of poor children (Group 1) versus affluent children(Group 2)

I H0 : µ1 − µ2 = 0, Ha : µ1 − µ2 6= 0

I n = 1 million study yields difference of 0.02 on 100-point testand p-value < 0.00001

I → What would you do?

I Would it affect your decision if intervention cost $1 billion?$500,000? Costless?

2) Practical versus Statistical Significance

I Example: Suppose you have intervention to increase testscores of poor children (Group 1) versus affluent children(Group 2)

I H0 : µ1 − µ2 = 0, Ha : µ1 − µ2 6= 0

I n = 1 million study yields difference of 0.02 on 100-point testand p-value < 0.00001

I → What would you do?

I Would it affect your decision if intervention cost $1 billion?$500,000? Costless?

2) Practical versus Statistical Significance

I Example: Suppose you have intervention to increase testscores of poor children (Group 1) versus affluent children(Group 2)

I H0 : µ1 − µ2 = 0, Ha : µ1 − µ2 6= 0

I n = 1 million study yields difference of 0.02 on 100-point testand p-value < 0.00001

I → What would you do?

I Would it affect your decision if intervention cost $1 billion?$500,000? Costless?

2) Practical versus Statistical Significance

I Example: Suppose you have intervention to increase testscores of poor children (Group 1) versus affluent children(Group 2)

I H0 : µ1 − µ2 = 0, Ha : µ1 − µ2 6= 0

I n = 1 million study yields difference of 0.02 on 100-point testand p-value < 0.00001

I → What would you do?

I Would it affect your decision if intervention cost $1 billion?

$500,000? Costless?

2) Practical versus Statistical Significance

I Example: Suppose you have intervention to increase testscores of poor children (Group 1) versus affluent children(Group 2)

I H0 : µ1 − µ2 = 0, Ha : µ1 − µ2 6= 0

I n = 1 million study yields difference of 0.02 on 100-point testand p-value < 0.00001

I → What would you do?

I Would it affect your decision if intervention cost $1 billion?$500,000?

Costless?

2) Practical versus Statistical Significance

I Example: Suppose you have intervention to increase testscores of poor children (Group 1) versus affluent children(Group 2)

I H0 : µ1 − µ2 = 0, Ha : µ1 − µ2 6= 0

I n = 1 million study yields difference of 0.02 on 100-point testand p-value < 0.00001

I → What would you do?

I Would it affect your decision if intervention cost $1 billion?$500,000? Costless?

3) No full certainty about correctly rejecting null

I Hypothesis tests given us p-values, or how extreme our teststatistic is given null being true

I However: Do not allow us to rule out null hypothesis w/100% certainty

I Specifically, we’re concerned about two scenarios

I We reject the null hypothesis, even though the null is true

I We fail to reject the null hypothesis, even though the null isfalse

I Sound familiar? (Conditional statements)

3) No full certainty about correctly rejecting null

I Hypothesis tests given us p-values, or how extreme our teststatistic is given null being true

I However: Do not allow us to rule out null hypothesis w/100% certainty

I Specifically, we’re concerned about two scenarios

I We reject the null hypothesis, even though the null is true

I We fail to reject the null hypothesis, even though the null isfalse

I Sound familiar? (Conditional statements)

3) No full certainty about correctly rejecting null

I Hypothesis tests given us p-values, or how extreme our teststatistic is given null being true

I However: Do not allow us to rule out null hypothesis w/100% certainty

I Specifically, we’re concerned about two scenarios

I We reject the null hypothesis, even though the null is true

I We fail to reject the null hypothesis, even though the null isfalse

I Sound familiar? (Conditional statements)

3) No full certainty about correctly rejecting null

I Hypothesis tests given us p-values, or how extreme our teststatistic is given null being true

I However: Do not allow us to rule out null hypothesis w/100% certainty

I Specifically, we’re concerned about two scenarios

I We reject the null hypothesis, even though the null is true

I We fail to reject the null hypothesis, even though the null isfalse

I Sound familiar? (Conditional statements)

3) No full certainty about correctly rejecting null

I Hypothesis tests given us p-values, or how extreme our teststatistic is given null being true

I However: Do not allow us to rule out null hypothesis w/100% certainty

I Specifically, we’re concerned about two scenarios

I We reject the null hypothesis, even though the null is true

I We fail to reject the null hypothesis, even though the null isfalse

I Sound familiar? (Conditional statements)

3) No full certainty about correctly rejecting null

I Hypothesis tests given us p-values, or how extreme our teststatistic is given null being true

I However: Do not allow us to rule out null hypothesis w/100% certainty

I Specifically, we’re concerned about two scenarios

I We reject the null hypothesis, even though the null is true

I We fail to reject the null hypothesis, even though the null isfalse

I Sound familiar? (Conditional statements)

3) No full certainty about correctly rejecting null

I Hypothesis tests given us p-values, or how extreme our teststatistic is given null being true

I However: Do not allow us to rule out null hypothesis w/100% certainty

I Specifically, we’re concerned about two scenarios

I We reject the null hypothesis, even though the null is true

I We fail to reject the null hypothesis, even though the null isfalse

I Sound familiar? (Conditional statements)

Type I versus Type II Errors

H0 is true H0 is not true

Reject H0

Do not reject H0

Type I versus Type II Errors

H0 is true H0 is not true

Reject H0

Do not reject H0

Type I versus Type II Errors

H0 is true H0 is not true

Reject H0 GoodDo not reject H0 Good

Type I versus Type II Errors

H0 is true H0 is not true

Reject H0 Bad! GoodDo not reject H0 Good Bad!

Type I versus Type II Errors

H0 is true H0 is not true

Reject H0 Type 1 GoodDo not reject H0 Good Type 2

Type I versus Type II Errors

I Both types of errors can occur w/ a hypothesis test

I Good practice: Before study carried out, set limits on howmuch error we would tolerate

Type I versus Type II Errors

I Both types of errors can occur w/ a hypothesis test

I Good practice: Before study carried out, set limits on howmuch error we would tolerate

Type I versus Type II Errors

I Both types of errors can occur w/ a hypothesis test

I Good practice: Before study carried out, set limits on howmuch error we would tolerate

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance or the critical value

I Rejection region: Values of X̄ to left/right of values of α forwhich the hypothesis test would reject the null

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Type I Error

I Type I: Null hypothesis rejected when in fact it is true

I Akin to false negativeI Mammogram analogy: Negative test, despite having the

disease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance or the critical value

I Rejection region: Values of X̄ to left/right of values of α forwhich the hypothesis test would reject the null

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance or the critical value

I Rejection region: Values of X̄ to left/right of values of α forwhich the hypothesis test would reject the null

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance or the critical value

I Rejection region: Values of X̄ to left/right of values of α forwhich the hypothesis test would reject the null

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance or the critical value

I Rejection region: Values of X̄ to left/right of values of α forwhich the hypothesis test would reject the null

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance or the critical value

I Rejection region: Values of X̄ to left/right of values of α forwhich the hypothesis test would reject the null

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance or the critical value

I Rejection region: Values of X̄ to left/right of values of α forwhich the hypothesis test would reject the null

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance or the critical value

I Rejection region: Values of X̄ to left/right of values of α forwhich the hypothesis test would reject the null

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance or the critical value

I Rejection region: Values of X̄ to left/right of values of α forwhich the hypothesis test would reject the null

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Type I Error

I Type I: Null hypothesis rejected when in fact it is trueI Akin to false negative

I Mammogram analogy: Negative test, despite having thedisease

I P(Type I error) = P(Rejecting H0|H0 true) = α

I α: Also referred to as level of significance or the critical value

I Rejection region: Values of X̄ to left/right of values of α forwhich the hypothesis test would reject the null

I Very common to set α = 0.05, α = 0.10, or α = 0.01

I Question: Why would we want to avoid Type I Error?

I Question: How do you interpret a level of significance of 0.05?

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is falseI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is false

I Akin to false positiveI Mammogram analogy: Positive test, despite not having the

disease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is falseI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is falseI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is falseI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Type II Error

I Type II: Null hypothesis is not rejected when in fact it is falseI Akin to false positive

I Mammogram analogy: Positive test, despite not having thedisease

I P(Type II error) = P(Not rejecting H0|H0 false)

I Often set at P(Type II error) = 0.20 = β

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Type II Error and Power

I 1 − β = P(Rejecting H0|H0 false)

I Probability of correctly rejecting the null

I Known as the power of a test

I Sometimes considered less important from substantiveperspective

I However: consider your specific problem → in some instances,either error may be more costly

I Note: Exact power calculation will depend on your alternativeHa

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:

I 1) You are asked to calculate the sample size you will need todetect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Power Analyses

I Power Analysis: If a given alternative hypothesis were true,how good would our test be at (correctly) rejecting the null?

I Somewhat similar to hypothesis test set up, but for purposesof calculating power, assume alternative true:P(Rejecting H0|Ha true)

I Usually approached in one of two ways:I 1) You are asked to calculate the sample size you will need to

detect an effect (difference between null and alternative) of acertain size

I Need to state a precise alternative null

I 2) You are asked to calculate what is the smallest effect(difference) you could detect given a sample size

→ Both mean that power analyses usually done before data arecollected (and $ spent!)

Power Analysis Example

I You are an urban policy expert studying commuting times

I Suppose you are considering testing whether average timespent in commuting is 4 hrs/week versus alternative that it isgreater due to recent construction

I Suppose further that you know (assume) true standarddeviation to be 2hrs

I Find power when a) sample size equals 16 and b) alternativehypothesis is that commuting hours equals 6hrs (so effect ofconstruction is 2hrs)

Power Analysis Example

I You are an urban policy expert studying commuting times

I Suppose you are considering testing whether average timespent in commuting is 4 hrs/week versus alternative that it isgreater due to recent construction

I Suppose further that you know (assume) true standarddeviation to be 2hrs

I Find power when a) sample size equals 16 and b) alternativehypothesis is that commuting hours equals 6hrs (so effect ofconstruction is 2hrs)

Power Analysis Example

I You are an urban policy expert studying commuting times

I Suppose you are considering testing whether average timespent in commuting is 4 hrs/week versus alternative that it isgreater due to recent construction

I Suppose further that you know (assume) true standarddeviation to be 2hrs

I Find power when a) sample size equals 16 and b) alternativehypothesis is that commuting hours equals 6hrs (so effect ofconstruction is 2hrs)

Power Analysis Example

I You are an urban policy expert studying commuting times

I Suppose you are considering testing whether average timespent in commuting is 4 hrs/week versus alternative that it isgreater due to recent construction

I Suppose further that you know (assume) true standarddeviation to be 2hrs

I Find power when a) sample size equals 16 and b) alternativehypothesis is that commuting hours equals 6hrs (so effect ofconstruction is 2hrs)

Power Analysis Example

I You are an urban policy expert studying commuting times

I Suppose you are considering testing whether average timespent in commuting is 4 hrs/week versus alternative that it isgreater due to recent construction

I Suppose further that you know (assume) true standarddeviation to be 2hrs

I Find power when a) sample size equals 16 and b) alternativehypothesis is that commuting hours equals 6hrs (so effect ofconstruction is 2hrs)

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(with given α values)

I Assume alternative hypothesis is true, µ = 6

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µ (e.g., µ = 6, 7, 8, 9)to plot a power curve or power function

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(with given α values)

I Assume alternative hypothesis is true, µ = 6

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µ (e.g., µ = 6, 7, 8, 9)to plot a power curve or power function

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(with given α values)

I Assume alternative hypothesis is true, µ = 6

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µ (e.g., µ = 6, 7, 8, 9)to plot a power curve or power function

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(with given α values)

I Assume alternative hypothesis is true, µ = 6

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µ (e.g., µ = 6, 7, 8, 9)to plot a power curve or power function

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(with given α values)

I Assume alternative hypothesis is true, µ = 6

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µ (e.g., µ = 6, 7, 8, 9)to plot a power curve or power function

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(with given α values)

I Assume alternative hypothesis is true, µ = 6

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µ (e.g., µ = 6, 7, 8, 9)to plot a power curve or power function

Power Analysis Example

Steps:

I Because you’ll eventually do a hypothesis test assuming thenull, first calculate the rejection region under standard set-up(with given α values)

I Assume alternative hypothesis is true, µ = 6

I Calculate probability of not being in rejection region when thisalternative is true (this is β)

I And then finally calculate 1 − β to get Power

I Note: Can repeat for different values of µ (e.g., µ = 6, 7, 8, 9)to plot a power curve or power function

Power Analysis Intuition

−4 −2 0 2 4

0.0

0.1

0.2

0.3

0.4

0.5

z

Null

Power Analysis Intuition

−4 −2 0 2 4

0.0

0.1

0.2

0.3

0.4

0.5

z

Null

Power Analysis Intuition

−4 −2 0 2 4

0.0

0.1

0.2

0.3

0.4

0.5

z

RejectReject Do not reject

Null

Power Analysis Intuition

−4 −2 0 2 4

0.0

0.1

0.2

0.3

0.4

0.5

z

RejectReject Do not reject

Null Alternative

Power Analysis Intuition

−4 −2 0 2 4

0.0

0.1

0.2

0.3

0.4

0.5

z

RejectReject Do not reject

Null Alternative

Power

Type II Error

Power Analysis Example

I Step 1: Calculate rejection region for eventual (null)hypothesis test

I Remember that test statistic for single mean is

z =X̄ − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate rejection regions under null

1.645 <X̄ − 4

1/2→ X̄ > 4.8825

I That is, if data have a sample mean of X̄ > 4.89 you wouldreject null

Power Analysis Example

I Step 1: Calculate rejection region for eventual (null)hypothesis test

I Remember that test statistic for single mean is

z =X̄ − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate rejection regions under null

1.645 <X̄ − 4

1/2→ X̄ > 4.8825

I That is, if data have a sample mean of X̄ > 4.89 you wouldreject null

Power Analysis Example

I Step 1: Calculate rejection region for eventual (null)hypothesis test

I Remember that test statistic for single mean is

z =X̄ − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate rejection regions under null

1.645 <X̄ − 4

1/2→ X̄ > 4.8825

I That is, if data have a sample mean of X̄ > 4.89 you wouldreject null

Power Analysis Example

I Step 1: Calculate rejection region for eventual (null)hypothesis test

I Remember that test statistic for single mean is

z =X̄ − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate rejection regions under null

1.645 <X̄ − 4

1/2→ X̄ > 4.8825

I That is, if data have a sample mean of X̄ > 4.89 you wouldreject null

Power Analysis Example

I Step 1: Calculate rejection region for eventual (null)hypothesis test

I Remember that test statistic for single mean is

z =X̄ − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate rejection regions under null

1.645 <X̄ − 4

1/2→ X̄ > 4.8825

I That is, if data have a sample mean of X̄ > 4.89 you wouldreject null

Power Analysis Example

I Step 1: Calculate rejection region for eventual (null)hypothesis test

I Remember that test statistic for single mean is

z =X̄ − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate rejection regions under null

1.645 <X̄ − 4

1/2→ X̄ > 4.8825

I That is, if data have a sample mean of X̄ > 4.89 you wouldreject null

Power Analysis Example

I Step 1: Calculate rejection region for eventual (null)hypothesis test

I Remember that test statistic for single mean is

z =X̄ − µ

σ/√n

I Assuming one-tailed test, we reject H0 if z > 1.645 (α = 0.05)

I Step 2: Calculate rejection regions under null

1.645 <X̄ − 4

1/2→ X̄ > 4.8825

I That is, if data have a sample mean of X̄ > 4.89 you wouldreject null

Power Analysis Example

I Now, assume alternative true, µ = 6

I If µ was 6, how often would we get a sample mean in therejection region of null of µ = 4?

I That is, what is P(X̄ > 4.89) if µ = 6?

z =X̄ − µ

σ/√n

=4.89 − 6

1/2

= −2.22

I Finally P(Z < −2.22) = 0.013

I So Power = 1 − P(Z < −2.22) = 0.987

Power Analysis Example

I Now, assume alternative true, µ = 6

I If µ was 6, how often would we get a sample mean in therejection region of null of µ = 4?

I That is, what is P(X̄ > 4.89) if µ = 6?

z =X̄ − µ

σ/√n

=4.89 − 6

1/2

= −2.22

I Finally P(Z < −2.22) = 0.013

I So Power = 1 − P(Z < −2.22) = 0.987

Power Analysis Example

I Now, assume alternative true, µ = 6

I If µ was 6, how often would we get a sample mean in therejection region of null of µ = 4?

I That is, what is P(X̄ > 4.89) if µ = 6?

z =X̄ − µ

σ/√n

=4.89 − 6

1/2

= −2.22

I Finally P(Z < −2.22) = 0.013

I So Power = 1 − P(Z < −2.22) = 0.987

Power Analysis Example

I Now, assume alternative true, µ = 6

I If µ was 6, how often would we get a sample mean in therejection region of null of µ = 4?

I That is, what is P(X̄ > 4.89) if µ = 6?

z =X̄ − µ

σ/√n

=4.89 − 6

1/2

= −2.22

I Finally P(Z < −2.22) = 0.013

I So Power = 1 − P(Z < −2.22) = 0.987

Power Analysis Example

I Now, assume alternative true, µ = 6

I If µ was 6, how often would we get a sample mean in therejection region of null of µ = 4?

I That is, what is P(X̄ > 4.89) if µ = 6?

z =X̄ − µ

σ/√n

=4.89 − 6

1/2

= −2.22

I Finally P(Z < −2.22) = 0.013

I So Power = 1 − P(Z < −2.22) = 0.987

Power Analysis Example

I Now, assume alternative true, µ = 6

I If µ was 6, how often would we get a sample mean in therejection region of null of µ = 4?

I That is, what is P(X̄ > 4.89) if µ = 6?

z =X̄ − µ

σ/√n

=4.89 − 6

1/2

= −2.22

I Finally P(Z < −2.22) = 0.013

I So Power = 1 − P(Z < −2.22) = 0.987

Power Analysis Example

I Now, assume alternative true, µ = 6

I If µ was 6, how often would we get a sample mean in therejection region of null of µ = 4?

I That is, what is P(X̄ > 4.89) if µ = 6?

z =X̄ − µ

σ/√n

=4.89 − 6

1/2

= −2.22

I Finally P(Z < −2.22) = 0.013

I So Power = 1 − P(Z < −2.22) = 0.987

Power

I Note: Power dependent on a) sample size, b) your Ha (size ofeffect), and c) type of test (change α level)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Power

I Note: Power dependent on a) sample size, b) your Ha (size ofeffect), and c) type of test (change α level)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Power

I Note: Power dependent on a) sample size, b) your Ha (size ofeffect), and c) type of test (change α level)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Power

I Note: Power dependent on a) sample size, b) your Ha (size ofeffect), and c) type of test (change α level)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Power

I Note: Power dependent on a) sample size, b) your Ha (size ofeffect), and c) type of test (change α level)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Power

I Note: Power dependent on a) sample size, b) your Ha (size ofeffect), and c) type of test (change α level)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Power

I Note: Power dependent on a) sample size, b) your Ha (size ofeffect), and c) type of test (change α level)

I Larger sample size → more power

I Bigger difference between null and alternative you are testing→ requires less power

I A two-sided hypothesis test has less power than the one- sidedhypothesis, since it is more conservative

I Rule of thumb: 80% power

I Higher power may be better, but perhaps not if it comes inchanging the type of test (b/c it affects Type 1 error)

Next time

I Interval estimation and confidence intervals

Proof of Expected Value of π̂

I If X can only take on 0, 1 values with constant probability, π,then

∑xi follows a binomial distribution

I Furthermore, the expected value of a binomial is nπ (or np, inour previous notation)

I Therefore:

E [π̂] = E [

∑xin

]

=1

nE [∑

xi ]

=np

n= p

Proof of Variance of π̂

I The variance of a binomial is nπ(1 − π) (or np(1 − p), usingour previous notation)

I Therefore:

Var [π̂] = Var [

∑xin

]

=1

n2Var [

∑xi ]

=1

n2× nπ(1 − π)

=π(1 − π)

n

Recommended