TRANSLATING RESEARCH INTO ACTION Sample size calculations for randomized evaluations Rebecca Thornton Assistant Professor of Economics University of Michigan povertyactionlab.org

Planning Sample Size for Randomized Evaluations Ed... · Sample size calculations for randomized evaluations Rebecca Thornton ... have a large enough sample •If you understand the



1. Background: The basics

2. Getting more complicated: Clusters

3. How to do this in practice

Outline


• Interviews are expensive and you have a budget

• You do not want to be disappointed that you didn’t have a large enough sample

• If you understand the basics of sample size, there are lots of things you can do to increase your power

• You are spending a lot of money and time on this evaluation

Why care?

• General question: how large does the sample need to be to credibly detect a given effect size (i.e., a certain effect of a program)?

• What does “credibly” mean here?

It means we can be reasonably sure that the difference between the control and treatment group is due to the treatment and not just to chance


Today’s Question


• Two important issues about sample size

1. Larger sample helps to ensure that the treatment and control groups are balanced (on observables and unobservables)

Helps prevent a biased estimate

2. Can detect a significant difference in outcomes between the treatment and control groups

Helps to detect a significant estimate

Sample Size


To start (and to finish)

• Doing sample size calculations is a craft

• The values estimated depend on parameters whose values are unknown and will vary.

– Power calculations involve some guess work.

– Vary across outcomes!

Basic set up

• At the end of an experiment, we compare the average outcome of interest in the treatment group with the average outcome of interest in the control group

• We are interested in the difference:

Mean (treatment) - Mean (control) = Effect (size)

• Example: We want to know the effect of giving out textbooks on test scores. You have the scores of treatment students (with books) and control students (without books)

Simple Example

[Figure: test scores (vertical axis from 60 to 90) for the No Books and Books groups]

• Subtract the average of the Control from the average of the Treatment

• Run a regression of the outcome (Y) on an indicator (T) of being in the Treatment group:

Y = a + bT

Effect of the program
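As a minimal sketch of the two calculations on this slide, the Python snippet below computes the difference in means and the least-squares slope b from a regression of Y on the indicator T, and shows that the two coincide. The scores and the helper names (`difference_in_means`, `ols_on_indicator`) are hypothetical, invented only for illustration.

```python
from statistics import mean

# Hypothetical test scores, invented for illustration only.
control = [68, 72, 70, 66, 74]      # students without books (T = 0)
treatment = [78, 82, 80, 76, 84]    # students with books (T = 1)


def difference_in_means(treated, untreated):
    """Effect estimate: Mean(treatment) - Mean(control)."""
    return mean(treated) - mean(untreated)


def ols_on_indicator(treated, untreated):
    """Fit Y = a + b*T by least squares, with T = 1 for treatment, 0 for control."""
    y = list(untreated) + list(treated)
    t = [0] * len(untreated) + [1] * len(treated)
    t_bar, y_bar = mean(t), mean(y)
    cov_ty = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    var_t = sum((ti - t_bar) ** 2 for ti in t)
    b = cov_ty / var_t        # slope: the estimated effect
    a = y_bar - b * t_bar     # intercept: the control-group mean
    return a, b


effect = difference_in_means(treatment, control)
a, b = ols_on_indicator(treatment, control)
print(f"difference in means = {effect}")
print(f"regression: Y = {a} + {b}*T")
```

With a single binary regressor, the intercept a is the control mean and the slope b is the difference in means, so either route gives the same effect estimate.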

Simple Example

[Figure: test scores (vertical axis from 60 to 90) for the No Books and Books groups]

• Effect size: Difference in means or,

Y = a + bT, where b = effect size (slope of the line)

Y = 70 + 10*T

• Treatment Effect = 10 points

– How confident am I that this is not just chance? Significance stars report the chance of seeing a difference this large if there were really no effect:

– * less than 10 percent

– ** less than 5 percent

– *** less than 1 percent

Effect of the program


• Is the estimate of b biased?

– Discussed in previous lectures

– Depends on the validity of the randomization and mitigation of other threats

• How precise is the estimate of b?

– Did this difference happen just by chance? How confident am I that there is a true effect of my program?

– Depends on the sample size, the variability of the outcome variable (Y), and the actual effect of the program

• Accuracy vs. Precision

Back to the main questions…

Accuracy versus Precision

[Figure: diagram with axes labeled Accuracy and Precision]

Unbiased and sample size

[Figure: diagram with axes labeled Unbiased and Sample Size]

• When we do survey research and estimate treatment effects…

– Randomization helps us to be accurate (unbiased)

– Sample size allows us to be precise (confident about our estimates)

• Both are independently important

– A larger sample without randomization may give an estimate that is precise, but not accurate.

– Randomization without a large enough sample will allow us to estimate the unbiased effect (accuracy), but we might not be that confident about it

Estimation

• Impact evaluation involves the scientific method

– 1) propose a hypothesis

– 2) design the experiment to test that hypothesis

• How do we test hypotheses?

– We start with a hypothesis (i.e., there will be an effect of the program)

– At the end of an experiment, we test our hypothesis

Scientific method


• In criminal law, most institutions follow the rule: “innocent until proven guilty”

• The presumption is that the accused is innocent and the burden is on the prosecutor to show guilt

– The jury or judge starts with the “null hypothesis” that the accused person is innocent

– The prosecutor has a hypothesis that the accused person is guilty


Hypothesis testing


• In program evaluation, instead of “presumption of innocence,” the rule is: “presumption of insignificance”

• The “Null hypothesis” (H0) is that there was no (zero) impact of the program

• The burden of proof is on the evaluator to show a significant effect of the program

Hypothesis testing


• If our measurements show a difference between the treatment and control group we know:

– There is some difference between the treatment and the control…

– But, our presumption is that there is no impact of the program (our H0 is still true)

– It might be that the difference is solely due to chance (random sampling error)

• We need to use statistics to calculate how likely a difference this large would be if it were due to random chance alone

Hypothesis testing

• Let’s say the sample size is 2…

Extreme Example

Less extreme: Is this difference due to random chance?

[Figure: outcomes for the Control and Treatment groups]

Perhaps…

Is this difference due to random chance?

[Figure: outcomes for the Control and Treatment groups]

Probably not…

• Using statistics, if we find that it is very unlikely (say less than a 5% probability) that the difference is solely due to chance:

– We “reject our null hypothesis”

– We may now say: “our program has a statistically significant impact”

• Are we now 100 percent certain there is an impact?

– No, we may be only 95% confident; and we accept that if we use this threshold, we may be wrong 5% of the time

Hypothesis testing: conclusions
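One way to make "solely due to chance" concrete is a permutation test, sketched below. This is an illustrative choice of test, not necessarily the one used in the lecture: under the null hypothesis the treatment labels are arbitrary, so the labels are reshuffled many times and we count how often the reshuffled gap is at least as large as the observed one. The scores and the function name `permutation_p_value` are hypothetical.

```python
import random


def avg(values):
    return sum(values) / len(values)


def permutation_p_value(treated, untreated, n_shuffles=5000, seed=0):
    """Share of random relabelings whose gap in means is at least the observed gap."""
    rng = random.Random(seed)
    observed = abs(avg(treated) - avg(untreated))
    pooled = list(treated) + list(untreated)
    k = len(treated)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign the labels at random
        gap = abs(avg(pooled[:k]) - avg(pooled[k:]))
        if gap >= observed - 1e-12:  # small tolerance for floating-point ties
            at_least_as_large += 1
    return at_least_as_large / n_shuffles


# Hypothetical test scores, invented for illustration only.
control = [68, 72, 70, 66, 74]
treatment = [78, 82, 80, 76, 84]

p_value = permutation_p_value(treatment, control)
if p_value < 0.05:
    print(f"p = {p_value:.3f}: reject the null hypothesis at the 5% level")
else:
    print(f"p = {p_value:.3f}: cannot reject the null hypothesis")
```

Rejecting at the 5% threshold still means accepting that, when there is truly no effect, about 1 experiment in 20 will cross the threshold by chance.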

• What if we can’t reject our null hypothesis?

– Does that mean we can be 100% certain there is no impact?

– No, it just didn’t meet the statistical threshold to conclude otherwise

Hypothesis testing: conclusions


• Possibility #1: There is an impact

– Could detect it – have enough statistical power

– Could not detect it – do not have enough power

• Possibility #2: There is no impact

– Conclude there was no impact

– Conclude there was an impact

Two possibilities

                     YOU CONCLUDE
THE TRUTH            Effective            No Effect
Effective            (correct)            Type II Error
No Effect            Type I Error         (correct)

Hypothesis testing

                     YOU CONCLUDE
THE TRUTH            Effective                                 No Effect
Effective            (correct)                                 Type II Error
No Effect            Type I Error (probability = sig level)    (correct)

Hypothesis testing

Significance level: set it to a level that you are comfortable with. With a level of 5%, a program that truly has no effect will be wrongly found effective only 5% of the time. For policy purposes, you want to be very confident in the answer you give, so the level is set fairly low. It is the probability of a Type I error.

                     YOU CONCLUDE
THE TRUTH            Effective                  No Effect
Effective            (probability = power)      Type II Error
No Effect            Type I Error               (correct)

Hypothesis testing

Power: how frequently we will detect effective programs. A Type II error results from low power.
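The two error rates in the table can be illustrated by simulation. The sketch below repeats a hypothetical experiment many times, once when the truth is "no effect" and once when the truth is an effect of 0.4 standard deviations, and records how often we conclude "effective". All numbers (group size, effect, number of repetitions) and function names are assumptions chosen for illustration, and the test uses a normal approximation.

```python
import random
from math import sqrt
from statistics import NormalDist


def avg(values):
    return sum(values) / len(values)


def sample_variance(values):
    m = avg(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def concludes_effective(treated, untreated, alpha=0.05):
    """Two-sided test on the difference in means (normal approximation)."""
    se = sqrt(sample_variance(treated) / len(treated)
              + sample_variance(untreated) / len(untreated))
    z = (avg(treated) - avg(untreated)) / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


def share_concluding_effective(true_effect, n_per_group=100,
                               n_experiments=1000, seed=1):
    """Repeat the experiment and report how often we conclude 'effective'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        hits += concludes_effective(treated, control)
    return hits / n_experiments


type_i_rate = share_concluding_effective(true_effect=0.0)  # truth: no effect
power = share_concluding_effective(true_effect=0.4)        # truth: effective
print(f"Type I error rate ~ {type_i_rate:.3f}")
print(f"Power ~ {power:.3f}, Type II error rate ~ {1 - power:.3f}")
```

The first rate should sit near the chosen significance level; the second is the power for this particular design, and one minus it is the Type II error rate.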

1. Variance

– The more “noisy” it is to start with, the harder it is to measure effects

2. Effect Size to be detected

– The smaller the effect size we want to detect, the larger sample we need

3. Sample Size

– The more children we sample, the more likely we are to obtain the true difference

Power: main ingredients
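A rough way to see how the three ingredients interact is the textbook normal-approximation formula for the power of a two-group comparison, sketched below. It assumes equal group sizes and a common standard deviation; the function name `power_two_groups` and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_groups(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided comparison of two group means.

    Assumes equal group sizes, a common standard deviation, and a normal
    approximation; the tiny chance of rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    standard_error = sd * sqrt(2.0 / n_per_group)
    return z.cdf(abs(effect) / standard_error - z.inv_cdf(1 - alpha / 2))


baseline = power_two_groups(effect=5, sd=20, n_per_group=100)
noisier = power_two_groups(effect=5, sd=40, n_per_group=100)          # 1. more variance
bigger_effect = power_two_groups(effect=10, sd=20, n_per_group=100)   # 2. larger effect
bigger_sample = power_two_groups(effect=5, sd=20, n_per_group=400)    # 3. larger sample

for label, value in [("baseline", baseline), ("noisier outcome", noisier),
                     ("bigger effect", bigger_effect), ("bigger sample", bigger_sample)]:
    print(f"{label:>16}: power = {value:.2f}")
```

In this approximation, halving the effect size requires four times the sample to keep the same power, because the standard error shrinks only with the square root of the sample size.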


Variance

[Figure: Low Standard Deviation. Frequency histograms over values 33 to 89 for two groups, one with mean 50 and one with mean 60]

Less Precision

[Figure: Medium Standard Deviation. Frequency histograms over values 33 to 89 for two groups, one with mean 50 and one with mean 60]

Even less precise

[Figure: High Standard Deviation. Frequency histograms over values 33 to 89 for two groups, one with mean 50 and one with mean 60]

• Variance depends first on your outcome variable: which outcome you want to measure

• Must calculate separately for each outcome

• What can help increase power? Can “absorb” variance:

– using a baseline

– controlling for other variables

– Do a pilot and measure the outcome variables, field testing

Variance


Effect Size: 1 “standard deviation”

[Figure: control and treatment distributions (horizontal axis from -4 to 6) whose means are 1 standard deviation apart]

Effect Size: 3 standard deviations

[Figure: control and treatment distributions (horizontal axis from -4 to 6) whose means are 3 standard deviations apart]

The less overlap the better… (easier to detect a difference)


• What effect do you think that the program will have?

• What is the smallest effect that you would like to be able to detect with confidence?

Effect Size

DO NOT USE: “Expected” effect size

• First start with the question: how big of an effect do I think the program will have?

– This is usually large… I like the program, why else implement?

– But if we overestimate the effect size, we overestimate the power that we will have, and our sample size may be too small

• Be conservative

– What is the smallest effect size that would justify implementing the program?

“Choosing” an effect size

• Different effect sizes for different outcome variables

• Also depends on how variable the outcome is

• How to standardize effect sizes across outcomes?

– Standardized effect size is the effect size divided by the standard deviation of the outcome

Standardized effect size = (Mean (treatment) – Mean (control)) / SD

• Common standardized effect sizes

“Choosing” an effect size

An effect size of…    Is considered…    …and it means that…
0.2                   Modest            The average member of the treatment group had a better outcome than the 58th percentile of the control group
0.5                   Large             The average member of the treatment group had a better outcome than the 69th percentile of the control group
0.8                   VERY Large        The average member of the treatment group had a better outcome than the 79th percentile of the control group

Standardized effect size

Really? Common Danger: Picking an effect size that is too large! Calculate!
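The percentiles in the table can be reproduced under one assumption that the slide does not state: outcomes are normally distributed with the same standard deviation in both groups. In that case the average treated unit sits at the percentile Φ(d) of the control distribution, where d is the standardized effect size. The function names below are hypothetical.

```python
from statistics import NormalDist


def standardized_effect(mean_treatment, mean_control, sd):
    """Effect size divided by the standard deviation of the outcome."""
    return (mean_treatment - mean_control) / sd


def control_percentile_of_average_treated(std_effect):
    """Percentile of the control distribution where the average treated unit sits.

    Assumes normally distributed outcomes with the same SD in both groups.
    """
    return round(100 * NormalDist().cdf(std_effect))


percentiles = [control_percentile_of_average_treated(d) for d in (0.2, 0.5, 0.8)]
for d, p in zip((0.2, 0.5, 0.8), percentiles):
    print(f"effect size {d}: percentile {p} of the control group")
```

The same helper can be used to sanity-check a candidate effect size before plugging it into a power calculation: a gain of 10 points on an outcome with an SD of 20 is a standardized effect of 0.5.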


[Figure: distributions of test scores (horizontal axis from 0 to 100) for control and treatment, with the control mean (μ) and treatment mean (μ) marked]

Average difference: 6 points

We only observe a random sample of the students

Sample size = 1

Say that we have a sample of 1 observation that comes from the distribution of data…

[Figure: test scores (0 to 100) for control and treatment, with the control mean (μ) and treatment mean (μ) marked, for a sample of size 1]

Sample size = 4

[Figure: test scores (0 to 100) for control and treatment, with the control mean (μ) and treatment mean (μ) marked, for a sample of size 4]

Sample size = 9

[Figure: test scores (0 to 100) for control and treatment, with the control mean (μ) and treatment mean (μ) marked, for a sample of size 9]

Sample size = 100

[Figure: test scores (0 to 100) for control and treatment, with the control mean (μ) and treatment mean (μ) marked, for a sample of size 100]

Sample size = 6,000

[Figure: test scores (0 to 100) for control and treatment, with the control mean (μ) and treatment mean (μ) marked, for a sample of size 6,000]
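The sequence of sample sizes illustrates that the average of a sample becomes less noisy as the sample grows. A minimal simulation sketch: draw many samples of size n and measure how much their means vary, then compare with the familiar sd/sqrt(n) rule. The population standard deviation of 15 points is a hypothetical value (the slides do not state one), and `spread_of_sample_mean` is an illustrative name.

```python
import random
from math import sqrt


def spread_of_sample_mean(n, population_sd, n_samples=2000, seed=0):
    """Simulate many samples of size n; return the standard deviation of their means."""
    rng = random.Random(seed)
    means = [sum(rng.gauss(0.0, population_sd) for _ in range(n)) / n
             for _ in range(n_samples)]
    center = sum(means) / n_samples
    return sqrt(sum((m - center) ** 2 for m in means) / (n_samples - 1))


POPULATION_SD = 15.0  # hypothetical; not taken from the slides

for n in (1, 4, 9, 100):
    simulated = spread_of_sample_mean(n, POPULATION_SD)
    theoretical = POPULATION_SD / sqrt(n)
    print(f"n = {n:>3}: simulated {simulated:.2f}, sd/sqrt(n) = {theoretical:.2f}")
```

Because the spread falls with the square root of n, going from 1 to 4 observations halves the noise, while a further halving requires 16.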


• What is a good level of power?

• A power of 80% tells us that, in 80% of the experiments of this sample size conducted in this population, if the null hypothesis is in fact false (i.e., there is a treatment effect of the assumed size), we will be able to reject it. In other words, 80% of the time we will be able to detect the effect.

• 20% of the time I will be disappointed

• Common Power used: 80%, 90%

• But I don’t like to be disappointed 20% of the time

Power: What level do I want?
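To see what moving from 80% to 90% power costs, the sketch below uses the standard normal-approximation formula for the sample size per group in an individually randomized, two-group design with equal group sizes. The function name `n_per_group` and the effect size of 0.2 are illustrative assumptions; dedicated software may give slightly larger numbers because it uses the t distribution.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(std_effect, power=0.80, alpha=0.05):
    """Observations needed in each group to detect a standardized effect.

    Normal approximation, two-sided test, equal group sizes, no clustering.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * multiplier ** 2 / std_effect ** 2)


n_80 = n_per_group(0.2, power=0.80)
n_90 = n_per_group(0.2, power=0.90)
print(f"80% power: {n_80} per group")
print(f"90% power: {n_90} per group")
```

Being disappointed 10% of the time rather than 20% of the time costs roughly a third more observations in this example.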


• Up to now, we have been assuming randomization at the individual level

• But often, we may want to randomize at a higher group level

– Village

– School

– District

• In that case, groups are randomized and individuals within each treatment or control group all get the same treatment

Individual vs. Group design

• Minimize or remove contamination across individuals

– Example: Deworming, information campaigns

• More feasible

• Only natural choice

– Example: Any education intervention that affects an entire classroom (e.g. flipcharts, teacher training).

• Why not? Expense (linked with power)

Reason for cluster randomization

• If the treatment is randomized at a group level you need more observations

• Why? The observations (i.e., individuals) are not independent of each other

– All villagers are exposed to the same weather

– All districts share a common history

– All students share a schoolmaster

• The more correlation between the outcomes within a group, the larger sample you need

• A value called ρ (rho), the intra-cluster correlation, measures this

Impact of Group-level randomization

• Like percentages, ρ must be between 0 and 1

• Higher values mean that outcomes within your clusters are more correlated (bad for power); lower ρ is more desirable

• It is sometimes low (0, 0.05, 0.08), but can be high (0.62)

Values of ρ (rho)

Location           Subject            ρ
Madagascar         Math + Language    0.5
Busia, Kenya       Math + Language    0.22
Udaipur, India     Math + Language    0.23
Mumbai, India      Math + Language    0.29
Vadodara, India    Math + Language    0.28
Busia, Kenya       Math               0.62


• Where do I find *my* rho?

– Use data

– Ask other researchers

– Be conservative and use a high value

Values of ρ (rho)
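If pilot or existing data are available, rho can be estimated from them. The sketch below uses the one-way ANOVA estimator for equal-sized clusters, which is one common choice and not necessarily the one the presenter had in mind. The data are simulated with a true rho of 0.2 purely for illustration, and `estimate_rho` is a hypothetical helper.

```python
import random


def estimate_rho(clusters):
    """One-way ANOVA estimate of the intra-cluster correlation.

    `clusters` is a list of equal-sized lists of outcomes (size >= 2), one
    list per cluster. In small samples the estimate can fall slightly below 0.
    """
    n_clusters = len(clusters)
    size = len(clusters[0])
    if any(len(c) != size for c in clusters):
        raise ValueError("this sketch assumes equal cluster sizes")
    cluster_means = [sum(c) / size for c in clusters]
    grand_mean = sum(cluster_means) / n_clusters
    ms_between = (size * sum((m - grand_mean) ** 2 for m in cluster_means)
                  / (n_clusters - 1))
    ms_within = (sum((y - m) ** 2 for c, m in zip(clusters, cluster_means) for y in c)
                 / (n_clusters * (size - 1)))
    return (ms_between - ms_within) / (ms_between + (size - 1) * ms_within)


# Simulated pilot: 200 schools of 20 students; a shared school effect carries
# 20% of the total variance, so the true rho is 0.2.
rng = random.Random(0)
pilot = []
for _ in range(200):
    school_effect = rng.gauss(0.0, 0.2 ** 0.5)
    pilot.append([school_effect + rng.gauss(0.0, 0.8 ** 0.5) for _ in range(20)])

rho_hat = estimate_rho(pilot)
print(f"estimated rho = {rho_hat:.3f}")
```

Estimates from a small pilot are noisy, which is one reason to lean toward a conservative (higher) value when planning.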

Impact of ρ (rho) on sample size?

• Design effect = (#observations needed with clustering) / (#observations needed without clustering)

• Design effect = 1 + (n - 1)*rho, where n is the group size

– If only one respondent per cluster, rho doesn’t matter

– Larger rho, bigger design effect

– Larger group size, larger effect of rho

           group size (n)
rho        10       50       100      200
0.02       1.18     1.98     2.98     4.98
0.05       1.45     3.45     5.95     10.95
0.10       1.9      5.9      10.9     20.9
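The table follows directly from the design-effect formula, as this short sketch shows. The function name `design_effect` is illustrative.

```python
def design_effect(group_size, rho):
    """Design effect = 1 + (n - 1) * rho for clusters of n individuals."""
    return 1 + (group_size - 1) * rho


group_sizes = (10, 50, 100, 200)
print("rho   " + "".join(f"{n:>8}" for n in group_sizes))
for rho in (0.02, 0.05, 0.10):
    row = "".join(f"{design_effect(n, rho):>8.2f}" for n in group_sizes)
    print(f"{rho:<6.2f}" + row)
```

To use it, multiply the sample size that an individually randomized design would need by the design effect: with rho = 0.05 and groups of 50, about 3.45 times as many individuals are required.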


• If experimental design is clustered, we now need to consider rho when choosing a sample size

• It is extremely important to randomize an adequate number of groups

• Often the number of individuals within groups matters less than the total number of groups

Implications
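A rough numerical illustration of why groups matter more than group size: the sketch below computes the smallest standardized effect detectable with 80% power (a normal approximation with equal arms) and compares doubling the group size with doubling the number of groups. The design values (20 groups of 50 per arm, rho = 0.10) and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(clusters_per_arm, cluster_size, rho,
                              alpha=0.05, power=0.80):
    """Smallest standardized effect detectable with the given power.

    Normal approximation, two equal arms, equal cluster sizes.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    design_effect = 1 + (cluster_size - 1) * rho
    individuals_per_arm = clusters_per_arm * cluster_size
    return multiplier * sqrt(2 * design_effect / individuals_per_arm)


base = minimum_detectable_effect(20, 50, rho=0.10)
more_people = minimum_detectable_effect(20, 100, rho=0.10)    # double the group size
more_clusters = minimum_detectable_effect(40, 50, rho=0.10)   # double the number of groups

print(f"20 groups of 50:  {base:.3f}")
print(f"20 groups of 100: {more_people:.3f}")
print(f"40 groups of 50:  {more_clusters:.3f}")
```

Both changes double the number of individuals surveyed, but only adding groups shrinks the detectable effect substantially, because extra individuals in the same group carry partly redundant information.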



• Two approaches:

• Approach one: Given budget constraints or logistics, you are given the maximum possible sample size. With your estimated effect size, will you have enough power such that it is worthwhile pursuing the project?

• Approach two: Set the power equal to some acceptable number. Given the estimated effect size, what is the sample required to obtain that power?

How to do “power calculations”?
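The two approaches can be sketched in a few lines for a clustered design. This is a normal-approximation illustration with hypothetical inputs and function names, not a replacement for dedicated power software: `power_clustered` answers approach one (power for a fixed design) and `clusters_needed` answers approach two (the design needed for a target power).

```python
from math import sqrt
from statistics import NormalDist


def power_clustered(std_effect, clusters_per_arm, cluster_size, rho, alpha=0.05):
    """Approach one: power for a fixed design (normal approximation, equal arms)."""
    z = NormalDist()
    design_effect = 1 + (cluster_size - 1) * rho
    se = sqrt(2 * design_effect / (clusters_per_arm * cluster_size))
    return z.cdf(std_effect / se - z.inv_cdf(1 - alpha / 2))


def clusters_needed(std_effect, cluster_size, rho, target_power=0.80,
                    alpha=0.05, max_clusters=100_000):
    """Approach two: smallest number of clusters per arm reaching the target power."""
    for clusters in range(1, max_clusters + 1):
        if power_clustered(std_effect, clusters, cluster_size, rho, alpha) >= target_power:
            return clusters
    raise ValueError("target power not reached within max_clusters")


# Approach one: the budget allows 20 clusters of 50 individuals per arm.
power_given_budget = power_clustered(0.2, clusters_per_arm=20, cluster_size=50, rho=0.05)
# Approach two: how many clusters per arm are needed for 80% power?
required = clusters_needed(0.2, cluster_size=50, rho=0.05, target_power=0.80)

print(f"Approach one: power with 20 clusters per arm = {power_given_budget:.2f}")
print(f"Approach two: clusters per arm for 80% power = {required}")
```

If approach one returns a power well below the usual 80% target, the choice is between finding more clusters, accepting a larger minimum effect size, or not running the study.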

• You plug in some numbers…

• The software will graph one of the following (each relates to one of the two approaches above):

– Approach 1: Power vs. effect size

– Approach 2: Power vs. observations

• Follow the graph to see #observations or effect size that gives you ~0.90 power

Power calculations using OD software

Power Calculations using the OD software

• Choose “Power vs number of clusters” in the menu “clustered randomized trials”

Cluster Size (If no clusters)

• Choose a cluster size of 1 unit… this is a bit confusing

Choose Significance Level and Standardized Effect Size

• Pick α (the significance level)

– Normally you pick 0.05

• Pick δ (the standardized effect size)

– Can experiment with 0.20

• You obtain the resulting graph showing power as a function of sample size.

Power and Sample Size

[Figures: three slides of graphs]

Availability of a Baseline

• A baseline has three main uses:

– Check whether the control (C) and treatment (T) groups are the same before the treatment

– Reduce the sample size needed (use controls)

– Interactions and subgroups

• To compute power with a baseline:

– Need to know the correlation between the two outcome measures (baseline and follow-up)

– The stronger the correlation, the bigger the gain.

– Very big gains for very persistent outcomes such as test scores
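A rough sketch of the gain from a baseline: if the baseline measure has correlation r with the follow-up outcome, controlling for it leaves about (1 - r²) of the outcome variance unexplained, so the required sample shrinks by roughly that factor. This is a standard approximation under a linear model, stated here as an assumption; the function name and the correlations used are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_with_baseline(std_effect, baseline_correlation,
                              power=0.80, alpha=0.05):
    """Sample size per group when the baseline outcome is used as a control.

    Assumes the residual variance is (1 - r**2) times the raw variance,
    a normal approximation, equal group sizes, and no clustering.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    unexplained_share = 1 - baseline_correlation ** 2
    return ceil(2 * unexplained_share * multiplier ** 2 / std_effect ** 2)


no_baseline = n_per_group_with_baseline(0.2, baseline_correlation=0.0)
moderate = n_per_group_with_baseline(0.2, baseline_correlation=0.7)
persistent = n_per_group_with_baseline(0.2, baseline_correlation=0.9)

print(f"no baseline: {no_baseline} per group")
print(f"r = 0.7:     {moderate} per group")
print(f"r = 0.9:     {persistent} per group")
```

The gain has to be weighed against the cost of the baseline survey itself, which this sketch ignores.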

Stratified Samples

• Stratification reduces sample size needed to achieve a given power

• Why?

– Reduces the variance of the outcome of interest in each stratum

– Reduces the correlation of units within clusters

• Example: if you randomize within school and grade which class is treated and which class is control:

– Variance of test score goes down

– The within cluster correlation goes down

• Common stratification variables:

– Baseline values of the outcomes when possible

– Variables along which we expect the treatment effect to vary across subgroups

Other considerations

• Are you interested in the difference between two treatments?

• Are you interested in testing whether the effect is different in different subpopulations?

• Will there be attrition?
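For the attrition question, one common back-of-the-envelope adjustment is to inflate the enrolled sample so that the required number is still present at follow-up. The sketch assumes attrition is unrelated to treatment and outcomes (otherwise it is a bias problem, not only a power problem); the function name and the 15% rate are hypothetical.

```python
from math import ceil


def enrollment_needed(n_required, expected_attrition):
    """Number to enroll so that about n_required remain after attrition."""
    if not 0 <= expected_attrition < 1:
        raise ValueError("expected_attrition must be in [0, 1)")
    return ceil(n_required / (1 - expected_attrition))


# Hypothetical: 393 per group needed at follow-up, 15% expected to drop out.
enroll = enrollment_needed(393, 0.15)
print(f"enroll {enroll} per group")
```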

Conclusions

• Sample size calculations are a craft

• Calculations depend on parameters whose values are unknown and will vary.

– Power calculations involve some guess work.

– Involve pilot testing

– Vary across outcomes!