Rerandomization to Improve Covariate Balance in Randomized Experiments
Kari Lock
Harvard Statistics
Advisor: Don Rubin
4/28/11

Source: php.scripts.psu.edu/users/k/l/klm47/Lock_Defense_2011.pdf · 4/28/2011



The Gold Standard

• Randomized experiments are the “gold standard” for estimating causal effects
• Why?
  1. They yield unbiased estimates
  2. They eliminate confounding factors

[Figure: diagram of units, each labeled R, being randomized (“Randomize”).]

Covariate Balance - Gender

12 Females, 8 Males versus 8 Females, 12 Males
5 Females, 15 Males versus 15 Females, 5 Males

• Suppose you get a “bad” randomization
• What would you do?
• Can you rerandomize? When? How?

The Origin of the Idea

Rubin: What if, in a randomized experiment, the chosen randomized allocation exhibited substantial imbalance on a prognostically important baseline covariate?

Cochran: Why didn't you block on that variable?

Rubin: Well, there were many baseline covariates, and the correct blocking wasn't obvious; and I was lazy at that time.

Cochran: This is a question that I once asked Fisher, and his reply was unequivocal:

Fisher (recreated via Cochran): Of course, if the experiment had not been started, I would rerandomize.

Covariate Imbalance

• The more covariates, the more likely at least one covariate will be imbalanced across treatment groups
• With just 10 independent covariates, the probability of a significant difference (at level α = .05) for at least one covariate is 1 − (1 − .05)^10 = 40%!
• Covariate imbalance is not limited to rare “unlucky” randomizations
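The 40% figure can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `prob_any_imbalance` is invented here, and the calculation assumes the covariates are independent, as the slide does.

```python
def prob_any_imbalance(k: int, alpha: float = 0.05) -> float:
    """Probability that at least one of k independent covariates shows a
    significant difference at level alpha under pure randomization."""
    return 1.0 - (1.0 - alpha) ** k


# Print the probability for a few illustrative numbers of covariates.
for k in (1, 5, 10, 20):
    print(k, round(prob_any_imbalance(k), 3))
```

The probability grows quickly with the number of covariates, which is the point of the slide.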

The Gold Standard

• We all know that randomized experiments are the “gold standard” for estimating causal effects
• Why?
  1. They yield unbiased estimates
  2. They eliminate confounding factors

… on average! For any particular experiment, covariate imbalance is possible (and likely!), and conditional bias exists

RERANDOMIZATION

Collect covariate data
→ Specify criteria determining when a randomization is unacceptable, based on covariate balance
→ (Re)randomize subjects to treatment and control
→ Check covariate balance (unacceptable: rerandomize; acceptable: continue)
→ Conduct experiment
→ Analyze results (with a randomization test)
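The loop in this procedure can be sketched in Python as follows. The names `rerandomize` and `small_mean_gap`, the tolerance of 0.15, and the toy gender data are all assumptions made for illustration; the slides only require that the acceptance criterion be specified in advance and be based on covariate balance.

```python
import random
from statistics import mean


def small_mean_gap(covariate, treated, tol=0.15):
    """Hypothetical acceptance criterion: the group means of one covariate
    differ by at most tol."""
    t = [x for i, x in enumerate(covariate) if i in treated]
    c = [x for i, x in enumerate(covariate) if i not in treated]
    return abs(mean(t) - mean(c)) <= tol


def rerandomize(covariate, n_treated, is_acceptable, rng, max_tries=100_000):
    """Draw random assignments until one meets the pre-specified criterion.
    Returns the set of treated unit indices and the number of draws used."""
    units = list(range(len(covariate)))
    for tries in range(1, max_tries + 1):
        treated = set(rng.sample(units, n_treated))
        if is_acceptable(covariate, treated):
            return treated, tries
    raise RuntimeError("no acceptable randomization found")


# Toy data: 20 females (1) and 20 males (0), split into two groups of 20.
gender = [1] * 20 + [0] * 20
treated, tries = rerandomize(gender, 20, small_mean_gap, random.Random(1))
print("females in treatment group:", sum(gender[i] for i in treated))
print("randomizations drawn:", tries)
```

The criterion is fixed before any assignment is drawn and treats the two groups symmetrically.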

Unbiased

Theorem: If the treatment groups are exchangeable for each randomization and for the rerandomization criteria, then rerandomization yields an unbiased estimate of the average treatment effect.

• For exchangeability:
  • Equal sized treatment groups
  • Rerandomization criteria that are objective and do not favor a specific group

Unbiased

Options for unequal sized treatment groups:

• Rerandomize into multiple equally sized groups, then combine groups after the rerandomization
• Discard the extra units to form equal sized groups
• Rerandomize to lower MSE, with perhaps slight bias
• Do not rerandomize

Rerandomization Test

• Randomization Test:
  • Simulate randomizations to see what the statistic would look like just by random chance, if the null hypothesis were true
• Rerandomization Test:
  • A randomization test, but for each simulated randomization, follow the same rerandomization criteria used in the experiment
• As long as the simulated randomizations are done using the same randomization scheme used in the experiment, this will give accurate p-values
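A rerandomization test along these lines might look like the sketch below. Everything named here is an assumption for illustration: the single covariate `x`, the tolerance-based criterion in `draw_acceptable`, the simulated outcomes, and the use of the absolute difference in means as the test statistic. The essential feature is that every simulated assignment passes through the same acceptance rule as the real one.

```python
import random
from statistics import mean


def diff_in_means(values, treated):
    t = [v for i, v in enumerate(values) if i in treated]
    c = [v for i, v in enumerate(values) if i not in treated]
    return mean(t) - mean(c)


def draw_acceptable(x, n_treated, tol, rng):
    """Redraw until the covariate means differ by at most tol
    (a hypothetical stand-in for the experiment's balance criterion)."""
    while True:
        treated = set(rng.sample(range(len(x)), n_treated))
        if abs(diff_in_means(x, treated)) <= tol:
            return treated


def rerandomization_test(y, x, treated_obs, tol, n_sims, rng):
    """Two-sided p-value under the sharp null of no treatment effect,
    simulating only randomizations the experiment would have accepted."""
    observed = diff_in_means(y, treated_obs)
    extreme = 0
    for _ in range(n_sims):
        treated = draw_acceptable(x, len(treated_obs), tol, rng)
        if abs(diff_in_means(y, treated)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_sims


# Simulated example: 40 units, one covariate, a true additive effect of 3.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(40)]
treated_obs = draw_acceptable(x, 20, 0.2, rng)
y = [xi + rng.gauss(0, 1) + (3.0 if i in treated_obs else 0.0)
     for i, xi in enumerate(x)]
observed, p_value = rerandomization_test(y, x, treated_obs, 0.2, 500, rng)
print("observed difference:", round(observed, 2), "p-value:", p_value)
```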

Alternatives for Analysis

• t-test:
  • Too conservative
  • Significant results can be trusted
• Regression:
  • Regression including the covariates that were balanced on using rerandomization more accurately estimates the true precision of the estimated treatment effect
  • Assumptions are less dangerous after rerandomization because groups should be well balanced

Criteria for Acceptable Balance

We use Mahalanobis Distance, M, to represent multivariate distance between group means:

$$M = (\bar{X}_T - \bar{X}_C)' \left[\operatorname{cov}(\bar{X}_T - \bar{X}_C)\right]^{-1} (\bar{X}_T - \bar{X}_C) = \left(\frac{1}{n_T} + \frac{1}{n_C}\right)^{-1} (\bar{X}_T - \bar{X}_C)' \operatorname{cov}(X)^{-1} (\bar{X}_T - \bar{X}_C)$$

Under adequate sample sizes and pure randomization: $M \sim \chi^2_k$

k: Number of covariates to be balanced

Choose a and rerandomize when M > a
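The Mahalanobis criterion can be computed without third-party libraries, as in this sketch. The function names (`covariance_matrix`, `solve`, `mahalanobis_balance`) and the toy data are invented for illustration. The code uses the second form of the definition, M = (1/n_T + 1/n_C)^(-1) d' cov(X)^(-1) d, where d is the difference in covariate means and cov(X) is the sample covariance of the covariates.

```python
def covariance_matrix(rows):
    """Sample covariance matrix; rows is a list of units, each a list of k covariates."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(k)] for i in range(k)]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (aug[r][n] - sum(aug[r][c] * z[c] for c in range(r + 1, n))) / aug[r][r]
    return z


def mahalanobis_balance(rows, treated):
    """M for the difference in covariate means between treated and control units."""
    k = len(rows[0])
    t = [r for i, r in enumerate(rows) if i in treated]
    c = [r for i, r in enumerate(rows) if i not in treated]
    d = [sum(r[j] for r in t) / len(t) - sum(r[j] for r in c) / len(c)
         for j in range(k)]
    z = solve(covariance_matrix(rows), d)  # cov(X)^{-1} d
    return sum(di * zi for di, zi in zip(d, z)) / (1 / len(t) + 1 / len(c))


# Toy example: 12 units, 2 covariates, even-numbered units treated.
rows = [[float(i), float((i * i) % 7)] for i in range(12)]
print(round(mahalanobis_balance(rows, {0, 2, 4, 6, 8, 10}), 4))
```

An assignment would then be accepted when the returned value is at most the chosen threshold a.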

Distribution of M

[Figure: distribution of M with the threshold a marked; randomizations with M ≤ a are labeled “Acceptable Randomizations” and those with M > a are labeled “RERANDOMIZE”.]

pa = Probability of accepting a randomization

•   Choosing  the  acceptance  probability  is  a  tradeoff  between  a  desire  for  better  balance  and  computational  time  

•   The  number  of  randomizations  needed  to  get  one  successful  one  is  Geometric(pa),  so  the  expected  number  needed  is  1/pa    

•   Computational  time  must  be  considered  in  advance,  since  many  simulated  acceptable  randomizations  are  needed  to  conduct  the  randomization  test  

Choosing  pa  
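The geometric waiting time can be checked with a short simulation. This is a sketch with invented names (`draws_until_accept`, `expected_total_draws`); it models each randomization as an independent acceptance with probability pa, which is the assumption behind the Geometric(pa) statement.

```python
import random


def draws_until_accept(p_a, rng):
    """Number of randomizations drawn until the first acceptable one,
    treating each draw as accepted independently with probability p_a."""
    draws = 1
    while rng.random() >= p_a:
        draws += 1
    return draws


def expected_total_draws(p_a, n_accepted):
    """Expected number of draws needed to collect n_accepted acceptable randomizations."""
    return n_accepted / p_a


rng = random.Random(0)
avg = sum(draws_until_accept(0.1, rng) for _ in range(20000)) / 20000
print("simulated mean draws at pa = 0.1:", round(avg, 2))  # theory: 1 / 0.1 = 10
print("expected draws for 1000 accepted at pa = 0.001:",
      expected_total_draws(0.001, 1000))
```

Since a randomization test needs many accepted draws, the total cost scales as (number of simulated randomizations) / pa.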

Another Measure of Balance

• The obvious choice may be to set limits on the acceptable balance for each covariate individually
• This destroys the joint distribution of the covariates

[Figure: plot with axes D1 and D2, each running from -3 to 3.]

Rerandomization Based on M

• Since M follows a known distribution, easy to specify the proportion of accepted randomizations
• M is affinely invariant (unaffected by affine transformations of the covariates)
• Correlations between covariates are maintained
• The balance improvement for each covariate is the same (and known)…
• … and is the same for any linear combination of the covariates

Covariates After Rerandomization

Theorem: If nT = nC, the covariate means are normally distributed, and rerandomization occurs when M > a, then

$$E(\bar{X}_T - \bar{X}_C \mid M \le a) = 0$$

$$\operatorname{cov}(\bar{X}_T - \bar{X}_C \mid M \le a) = v_a \operatorname{cov}(\bar{X}_T - \bar{X}_C)$$

$$v_a \equiv \frac{2}{k} \times \frac{\gamma\left(\frac{k}{2} + 1, \frac{a}{2}\right)}{\gamma\left(\frac{k}{2}, \frac{a}{2}\right)} = \frac{P(\chi^2_{k+2} \le a)}{P(\chi^2_k \le a)}$$

where $\gamma$ is the incomplete gamma function: $\gamma(b, c) \equiv \int_0^c y^{b-1} e^{-y}\,dy$.

Percent Reduction in Variance

• For each covariate xj, the ratio of variances is:

$$\frac{\operatorname{var}(\bar{X}_{j,T} - \bar{X}_{j,C} \mid \text{rerandomization})}{\operatorname{var}(\bar{X}_{j,T} - \bar{X}_{j,C})} = v_a$$

and the percent reduction in variance is

$$100 \left( \frac{\operatorname{var}(\bar{X}_{j,T} - \bar{X}_{j,C}) - \operatorname{var}(\bar{X}_{j,T} - \bar{X}_{j,C} \mid \text{rerandomization})}{\operatorname{var}(\bar{X}_{j,T} - \bar{X}_{j,C})} \right) = 100(1 - v_a)$$

Shadish Data

n = 445 undergraduate students, split by randomization into:

• Randomized Experiment (235), assigned by randomization to:
  • Vocab Training (116)
  • Math Training (119)
• Observational Study (210), where students choose:
  • Vocab Training (131)
  • Math Training (79)

Shadish, W. R., Clark, M. H., Steiner, P. M. (2008). Can nonrandomized experiments yield accurate answers? A randomized experiment comparing random and nonrandom assignments. JASA. 103(484): 1334-1344.

Shadish Data

k = 10 covariates

[Figure: Standardized Difference in Covariate Means (axis from -3 to 3) for the covariates male, age, collgpaa, actcomp, preflit, likelit, likemath, numbmath, mathpre, vocabpre.]

Shadish Data

To generate 1000 rerandomizations:

• pa = 0.1: 11.6 seconds
• pa = 0.01: 1.8 minutes
• pa = 0.001: 18.3 minutes
• pa = 0.0001: 3.4 hours


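Shadish Data

k = 10 covariates, pa = 0.001

$$P(\chi^2_{10} \le a) = 0.001 \Rightarrow a = 1.48$$

Rerandomize when M > 1.48

Ratio of Variances:

$$v_a = \frac{2}{k} \times \frac{\gamma\left(\frac{k}{2} + 1, \frac{a}{2}\right)}{\gamma\left(\frac{k}{2}, \frac{a}{2}\right)} = \frac{2}{10} \times \frac{\gamma\left(\frac{10}{2} + 1, \frac{1.48}{2}\right)}{\gamma\left(\frac{10}{2}, \frac{1.48}{2}\right)} = 0.12$$

Percent Reduction in Variance: 100(1 – va) = 88%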
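These numbers can be checked numerically with the standard library alone. The sketch below is an illustration, not the code used for the slides: the function names are invented, the chi-square CDF is obtained from a power series for the regularized incomplete gamma function, and the threshold a is found by bisection. It relies on the identity v_a = P(χ²_{k+2} ≤ a) / P(χ²_k ≤ a).

```python
import math


def reg_lower_gamma(s, x, terms=500):
    """Regularized lower incomplete gamma P(s, x), using the series
    gamma(s, x) = x^s e^(-x) * sum_n x^n / (s (s+1) ... (s+n))."""
    if x <= 0:
        return 0.0
    term = 1.0 / s
    total = term
    for n in range(1, terms):
        term *= x / (s + n)
        total += term
        if term < total * 1e-16:
            break
    return total * math.exp(-x + s * math.log(x) - math.lgamma(s))


def chi2_cdf(a, k):
    """P(chi-square with k degrees of freedom <= a)."""
    return reg_lower_gamma(k / 2.0, a / 2.0)


def threshold(k, p_a):
    """Find a with P(chi2_k <= a) = p_a by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, k) < p_a:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, k) < p_a:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def variance_ratio(k, a):
    """v_a = P(chi2_{k+2} <= a) / P(chi2_k <= a)."""
    return chi2_cdf(a, k + 2) / chi2_cdf(a, k)


a = threshold(10, 0.001)
v = variance_ratio(10, a)
print("a =", round(a, 2), " v_a =", round(v, 2),
      " percent reduction =", round(100 * (1 - v)))
```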

Shadish Data

[Figure: “Vocab Pre-Test”: distribution of X̄T − X̄C (axis from -2 to 2) under Pure Randomization and under Rerandomization.]

Shadish Data

[Figure: Standardized Difference in Covariate Means (axis from -4 to 4) under Actual Randomization, Pure Randomization, and Rerandomization, for act+10gpa, male, age, gpa, act, preflit, likelit, likemath, numbmath, mathpre, vocabpre. The PRIV (percent reduction in variance) shown for each of the 11 rows is 88%.]

Estimated Treatment Effect After Rerandomization

Theorem: If nT = nC, the covariate and outcome means are normally distributed, the treatment effect is additive, and rerandomization occurs when M > a, then

$$E(\bar{Y}_T - \bar{Y}_C \mid M \le a) = \tau$$

and the percent reduction in variance for the estimated treatment effect is

$$100(1 - v_a)R^2,$$

where $R^2$ is the squared canonical correlation.

Outcome Variance Reduction

[Figure: outcome variance reduction, with va on an axis.]

Shadish Data

n = 445 undergraduate students, split by randomization into:

• Randomized Experiment (235), assigned by randomization to:
  • Vocab Training (116)
  • Math Training (119)
• Observational Study (210), where students choose:
  • Vocab Training (131)
  • Math Training (79)

Outcomes: score on a vocabulary test, score on a mathematics test

Shadish Data

k = 10, pa = .001 ⇒ Covariate Percent Reduction in Variance = 88%

Percent Reduction in Variance for $\hat{\tau} = \bar{Y}_T - \bar{Y}_C$ is $88R^2$

Vocabulary: R2 = 0.11 ⇒ Percent Reduction in Variance = 88(0.11) = 9.68%

Mathematics: R2 = 0.32 ⇒ Percent Reduction in Variance = 88(0.32) = 28.16%

For Mathematics, this is equivalent to increasing the sample size by a factor of 1/(1 − 0.28) = 1.39
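The arithmetic on this slide is small enough to verify directly. In this sketch the function names are invented; the inputs (88% covariate reduction, R² of 0.11 and 0.32) are the values quoted above.

```python
def outcome_priv(covariate_priv, r_squared):
    """Percent reduction in variance of the estimated treatment effect:
    the covariate percent reduction multiplied by R^2."""
    return covariate_priv * r_squared


def sample_size_factor(priv_percent):
    """Factor by which the sample size would have to grow to achieve the
    same variance reduction without rerandomization."""
    return 1.0 / (1.0 - priv_percent / 100.0)


vocab = outcome_priv(88, 0.11)
maths = outcome_priv(88, 0.32)
print("vocabulary:", round(vocab, 2), "%")
print("mathematics:", round(maths, 2), "%")
print("sample size factor for mathematics:", round(sample_size_factor(maths), 2))
```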

Shadish Data

[Figure: two panels showing the Estimated Treatment Effect: “Vocabulary” (axis from -3 to 3) and “Mathematics” (axis from -2 to 2).]

Rerandomization Based on M

• Since M follows a known distribution, easy to specify the proportion of accepted randomizations
• M is affinely invariant (unaffected by affine transformations of the covariates)
• Correlations between covariates are maintained
• The balance improvement for each covariate is the same…
• … and is the same for any linear combination of the covariates

Is that good?

Tiers of Covariates

• In practice, covariates are often of varying importance
• We can place covariates into tiers, with the top tier including the most important covariates, the second tier less important, etc.
• Criteria for acceptable balance can be less stringent for each successive tier

Tiers of Covariates

1) Check balance for covariates in tier 1 ⇒ calculate M1 using the covariates in tier 1, rerandomize if M1 > a1
2) For each successive tier, regress each covariate on all covariates in previous tiers, and check balance for the residuals of that tier ⇒ calculate Mt using the residuals of tier t, rerandomize if Mt > at

at denotes the acceptance threshold for tier t
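A deliberately simplified sketch of the tiered check follows. It assumes exactly two tiers with a single covariate in each (`x1` in tier 1, `x2` in tier 2), so the regression is a simple linear regression and each M reduces to a squared standardized difference in means. The function names and the thresholds 0.1 and 0.5 are illustrative assumptions, not values from the slides.

```python
import random
from statistics import mean, variance


def m_single(values, treated):
    """Mahalanobis distance for one covariate: d^2 / (s^2 (1/n_T + 1/n_C))."""
    t = [v for i, v in enumerate(values) if i in treated]
    c = [v for i, v in enumerate(values) if i not in treated]
    d = mean(t) - mean(c)
    return d * d / (variance(values) * (1 / len(t) + 1 / len(c)))


def residuals(y, x):
    """Residuals from the simple linear regression of y on x."""
    xb, yb = mean(x), mean(y)
    slope = (sum((a - xb) * (b - yb) for a, b in zip(x, y))
             / sum((a - xb) ** 2 for a in x))
    return [b - yb - slope * (a - xb) for a, b in zip(x, y)]


def tiered_rerandomize(x1, x2, n_treated, a1, a2, rng, max_tries=100_000):
    """Accept an assignment only if M1 <= a1 for tier 1 and M2 <= a2 for the
    tier-2 residuals. Residuals do not depend on the assignment, so they
    are computed once."""
    res2 = residuals(x2, x1)
    for tries in range(1, max_tries + 1):
        treated = set(rng.sample(range(len(x1)), n_treated))
        if m_single(x1, treated) <= a1 and m_single(res2, treated) <= a2:
            return treated, tries
    raise RuntimeError("no acceptable randomization found")


rng = random.Random(5)
x1 = [rng.gauss(0, 1) for _ in range(40)]
x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]
treated, tries = tiered_rerandomize(x1, x2, 20, 0.1, 0.5, rng)
print("accepted after", tries, "draws")
```

With several covariates per tier, the same structure would apply with a multivariate M and a multiple regression in place of the single-covariate versions.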

Tiers of Covariates

Theorem: If nT = nC, the covariate means are normally distributed, and rerandomization occurs if any Mt > at, then the ratio of variances for covariate xj is

$$v_{a_1} \quad \text{if in tier 1,}$$

$$v_{a_1} R_1^2 + v_{a_2}(1 - R_1^2) \quad \text{if in tier 2,}$$

$$v_{a_1} R_1^2 + \sum_{i=2}^{t-1} v_{a_i}(R_i^2 - R_{i-1}^2) + v_{a_t}(1 - R_{t-1}^2) \quad \text{if in tier } t > 2,$$

where $R_t^2$ denotes the squared canonical correlation between xj and all covariates in tiers 1 through t.
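The tiered variance ratio can be evaluated with a small helper. This is a sketch under stated assumptions: the function name and argument layout are invented, with `v[i-1]` holding the variance ratio for tier i's threshold and `r2[i-1]` holding the squared canonical correlation between the covariate and all covariates in tiers 1 through i. The numeric inputs below are made up purely to exercise the formula.

```python
def tier_variance_ratio(t, v, r2):
    """Ratio of variances for a covariate in tier t.
    v:  [v_{a_1}, ..., v_{a_t}]
    r2: [R_1^2, ..., R_{t-1}^2]"""
    if t == 1:
        return v[0]
    total = v[0] * r2[0]
    for i in range(2, t):  # middle terms, i = 2 .. t-1
        total += v[i - 1] * (r2[i - 1] - r2[i - 2])
    total += v[t - 1] * (1.0 - r2[t - 2])
    return total


# Made-up inputs: stricter thresholds (smaller v) for earlier tiers.
print(tier_variance_ratio(1, [0.1], []))
print(tier_variance_ratio(2, [0.1, 0.5], [0.4]))
print(tier_variance_ratio(3, [0.1, 0.3, 0.6], [0.2, 0.5]))
```

One sanity check on the formula: if every tier uses the same variance ratio, the terms telescope and the result is that common ratio regardless of the correlations.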

[Figure: Empirical Percent Reduction in Variance (axis from 0 to 100), shown for Covariates and for Residuals, with the covariates grouped by tier.]

TIER 1: mathpre, vocabpre
TIER 2: collgpaa, actcomp, likelit, likemath, numbmath
TIER 3: afram, cauc, married, male, age, daddegr, momdegr, parentsIncome, pintell, pemot, pconsc, pagree, pextra, beck, mars, majormi, credit, hsgpaar, preflit
TIER 4: actcomp*collgpaa, likelit*collgpaa, likelit*actcomp, likemath*collgpaa, likemath*actcomp, likemath*likelit, numbmath*collgpaa, numbmath*actcomp, numbmath*likelit, numbmath*likemath, mathpre*collgpaa, mathpre*actcomp, mathpre*likelit, mathpre*likemath, mathpre*numbmath, vocabpre*collgpaa, vocabpre*actcomp, vocabpre*likelit, vocabpre*likemath, vocabpre*numbmath, vocabpre*mathpre, mathpre^2, vocabpre^2

Values printed beside the tier labels: 94.82, 78.24, 40.75, 23.24

[Figure: Standardized Difference in Covariate Means (axis from -4 to 4) for the covariates and tiers listed in the previous figure, annotated with each covariate's PRIV.]

PRIV values as printed, in extraction order:

95%, 94%

79%, 80%, 79%, 85%, 79%, 67%, 52%, 44%, 45%, 46%, 38%, 47%, 43%, 43%, 41%, 53%, 31%, 41%, 44%, 43%, 42%, 39%, 51%, 55%

94%, 89%, 92%, 79%, 78%, 84%, 90%, 85%, 81%, 82%, 84%, 91%, 88%, 74%, 76%, 79%, 78%, 72%, 78%, 77%, 80%, 78%, 81%

Empirical vs Theoretical Values

[Figure: scatterplot of Theoretical PRIV against Empirical PRIV (both axes from 0 to 100), with points marked by tier: Tier 1, Tier 2, Tier 3, Tier 4.]


•     Rerandomization  is  easily  extended  to  multiple  treatment  groups  

•   To  measure  balance,  one  of  the  MANOVA  test  statistics  (such  as  Wilks’  Λ)  can  be  used,  measuring  the  ratio  of  within  group  variability  to  between  group  variability  for  multiple  covariates  

•   This  is  equivalent  to  rerandomizing  based  on  Mahalanobis  distance  if  used  on  only  two  groups    

Multiple  Treatment  Groups  


•     This  is  not  meant  to  replace  blocking,  and  blocking  and  rerandomization  can  be  used  together  

•   Blocking  is  great  for  balancing  a  few  very  important  covariates,  but  is  not  easily  used  for  many  covariates  

•     “Block  what  you  can,  rerandomize  what  you  can’t”    

 

Blocking  

Small Samples or Non-Normality

• The theoretical results given depend on $M \sim \chi^2_k$
• If the covariate means are not normally distributed, that is fine! The theoretical results given are used nowhere in analysis. Rerandomization does NOT depend on asymptotics.
• The threshold a corresponding to pa, and the percent reduction in variance for the covariates can be estimated via simulation
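One way such a simulation could look is sketched below. It is restricted to a single covariate so that it stays short, and the function name `simulate_threshold`, the skewed toy covariate, and the simulation size are all illustrative assumptions. The threshold a is taken as the empirical pa quantile of M over pure randomizations, and the percent reduction in variance compares the difference in means among accepted draws with that among all draws.

```python
import random
from statistics import mean, pvariance, variance


def simulate_threshold(x, n_treated, p_a, n_sims, rng):
    """Estimate the threshold a and the covariate percent reduction in
    variance by simulating pure randomizations (single covariate)."""
    n_control = len(x) - n_treated
    scale = variance(x) * (1 / n_treated + 1 / n_control)
    sims = []
    for _ in range(n_sims):
        treated = set(rng.sample(range(len(x)), n_treated))
        t = [v for i, v in enumerate(x) if i in treated]
        c = [v for i, v in enumerate(x) if i not in treated]
        d = mean(t) - mean(c)
        sims.append((d * d / scale, d))  # (M, difference in means)
    sims.sort()  # ascending in M
    n_keep = max(2, int(p_a * n_sims))
    a = sims[n_keep - 1][0]  # empirical p_a quantile of M
    all_d = [d for _, d in sims]
    kept_d = [d for _, d in sims[:n_keep]]
    priv = 100 * (1 - pvariance(kept_d) / pvariance(all_d))
    return a, priv


# Skewed covariate with a small sample, where the chi-square
# approximation may be imperfect.
rng = random.Random(11)
x = [rng.expovariate(1.0) for _ in range(30)]
a_hat, priv_hat = simulate_threshold(x, 15, 0.1, 5000, rng)
print("estimated a:", round(a_hat, 4), " estimated PRIV:", round(priv_hat, 1))
```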

Conclusion

• Rerandomization improves covariate balance between the treatment groups, giving the researcher more faith that an observed effect is really due to the treatment
• If the covariates are correlated with the outcome, rerandomization also increases precision in estimating the treatment effect, giving the researcher more power to detect a significant result

Thank  you!!!