Multilevel Modeling using Stata
Andrew Hicks
CCPR Statistics and Methods Core
Workshop based on the book:
Multilevel and Longitudinal Modeling Using Stata (Second Edition)
by Sophia Rabe-Hesketh and Anders Skrondal
[Figure: Mini Wright measurements (vertical axis, 200 to 700) plotted by subject ID (1 to 17), with separate markers for Occasion 1 and Occasion 2]
Within-Subject Dependence
Within-Subject Dependence: We can predict the occasion 2 measurement if we know the subject’s occasion 1 measurement.
Between-Subject Heterogeneity: Large differences between subjects (compare subjects 9 and 15).
Within-subject dependence is due to between-subject heterogeneity.
Standard Regression Model
$y_{ij} = \beta + \xi_{ij}$
$y_{ij}$: measurement of subject $j$ on occasion $i$
$\beta$: population mean
$\xi_{ij}$: residuals (error terms), independent over subjects and occasions
This model clearly ignores the information about within-subject dependence.
Variance Component Model
$y_{ij} = \beta + \xi_{ij} = \beta + \zeta_j + \epsilon_{ij}$
$\zeta_j$: random intercept, the deviation of subject $j$’s mean from the overall mean
$\epsilon_{ij}$: within-subject residual, the deviation of observation $i$ from subject $j$’s mean
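[Figure: for subject $j$, the subject mean $\beta + \zeta_j$ is offset from the overall mean $\beta$ by $\zeta_j$, and the two observations deviate from the subject mean by $\epsilon_{1j}$ and $\epsilon_{2j}$]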
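$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$, with $\zeta_j \sim N(0, \psi)$ and $\epsilon_{ij} \sim N(0, \theta)$
$\operatorname{Var}(y_{ij}) = \operatorname{Var}(\beta) + \operatorname{Var}(\zeta_j) + \operatorname{Var}(\epsilon_{ij}) = 0 + \psi + \theta$ (there is no covariance term because $\zeta_j$ and $\epsilon_{ij}$ are independent)
$\operatorname{Var}(y_{ij}) = \psi + \theta$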
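A minimal sketch of how the two variance components might be estimated in Stata. It uses simulated data rather than the workshop dataset: the variable names (`id`, `occasion`, `zeta`, `wm`), the seed, and the parameter values (mean 450, $\sqrt{\psi} = 100$, $\sqrt{\theta} = 20$) are all assumptions made for illustration.

```stata
* Simulate 17 subjects measured on 2 occasions (hypothetical values)
clear
set seed 12345
set obs 17
generate id = _n
* Random intercept zeta_j with standard deviation sqrt(psi) = 100
generate zeta = rnormal(0, 100)
* Duplicate each subject so that there are two occasions per subject
expand 2
bysort id: generate occasion = _n
* y_ij = beta + zeta_j + epsilon_ij, with sqrt(theta) = 20
generate wm = 450 + zeta + rnormal(0, 20)

* Declare the cluster identifier and fit the model by maximum likelihood
xtset id
xtreg wm, mle
```

In the `xtreg, mle` output, `sigma_u` and `sigma_e` are the estimates of $\sqrt{\psi}$ and $\sqrt{\theta}$, and `rho` is the estimate of $\psi / (\psi + \theta)$.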
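Proportion of total variance due to subject differences:
$\rho = \frac{\operatorname{Var}(\zeta_j)}{\operatorname{Var}(y_{ij})} = \frac{\psi}{\psi + \theta}$
Intraclass correlation (within-cluster correlation):
$\rho = \operatorname{Cor}(y_{ij}, y_{i'j}) = \frac{\psi}{\psi + \theta}$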
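The reason the within-cluster correlation equals the variance proportion can be worked out in a few lines. This is a sketch under the model's usual assumptions, namely that $\zeta_j$ and the $\epsilon_{ij}$ are mutually independent; the index $i'$ for a second occasion on the same subject is notation introduced here.

```latex
\begin{aligned}
\operatorname{Cov}(y_{ij}, y_{i'j})
  &= \operatorname{Cov}(\beta + \zeta_j + \epsilon_{ij},\; \beta + \zeta_j + \epsilon_{i'j})
   = \operatorname{Var}(\zeta_j) = \psi, \qquad i \neq i' \\
\operatorname{Cor}(y_{ij}, y_{i'j})
  &= \frac{\operatorname{Cov}(y_{ij}, y_{i'j})}
          {\sqrt{\operatorname{Var}(y_{ij})\,\operatorname{Var}(y_{i'j})}}
   = \frac{\psi}{\psi + \theta}
\end{aligned}
```

The only term shared by two measurements on the same subject is $\zeta_j$, so all of the within-subject dependence comes from the between-subject variance $\psi$.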
Random or Fixed Effect?
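Since every subject has a different effect, we can think of the subjects as a categorical explanatory variable. Since the effect of each subject is random, we have been using a random effect model:
$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$, $\zeta_j \sim N(0, \psi)$
What if we want each effect to belong to a specific subject? Then we would use a fixed effect model:
$y_{ij} = \beta + \alpha_j + \epsilon_{ij}$, $\sum_j \alpha_j = 0$
where the $\alpha_j$ are fixed, subject-specific parameters.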
.xtreg wm, fe
Random effect model:
if the interest concerns the population of clusters
“generalize the potential effect”, e.g. the nurse giving the drug
Fixed effect model:
if we are interested in the “effect” of the specific clusters in a particular dataset
“replicable in life”, e.g. the actual drug
Random Intercept Model with Covariates
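without covariates:
$y_{ij} = \beta + \xi_{ij} = \beta + \zeta_j + \epsilon_{ij}$
with covariates:
$y_{ij} = \beta_1 + \beta_2 x_{2ij} + \dots + \beta_p x_{pij} + \xi_{ij} = \beta_1 + \beta_2 x_{2ij} + \dots + \beta_p x_{pij} + \zeta_j + \epsilon_{ij}$
$\zeta_j$: random parameter, not estimated along with the fixed parameters $\beta_1, \dots, \beta_p$, but whose variance $\psi$ is estimated together with the variance $\theta$ of $\epsilon_{ij}$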
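A minimal sketch of fitting a random intercept model with one covariate. The data are simulated, so the names `id`, `zeta`, `x`, `y` and all parameter values are hypothetical. The sketch uses `xtmixed`, the mixed-model command in the Stata releases contemporary with the second edition of the book; more recent releases provide the same model under the name `mixed`.

```stata
* Simulate 50 clusters with 5 observations each (hypothetical values)
clear
set seed 2024
set obs 50
generate id = _n
* Random intercept zeta_j with standard deviation sqrt(psi) = 2
generate zeta = rnormal(0, 2)
expand 5
generate x = rnormal()
* y_ij = beta1 + beta2*x_ij + zeta_j + epsilon_ij
generate y = 1 + 0.5*x + zeta + rnormal(0, 1)

* Fixed part before ||, random intercept for each id after it
xtmixed y x || id:, mle
```

The fixed-effects table reports the estimates of $\beta_1$ (`_cons`) and $\beta_2$ (`x`). The random-effects table reports the estimated spread of $\zeta_j$ and of $\epsilon_{ij}$, not the individual $\zeta_j$.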
Ecological Fallacy
occurs when between-cluster relationships differ substantially from within-cluster relationships.
• Can be caused by cluster-level confounding
For example, mothers who smoke during pregnancy may also adopt other behaviors such as drinking and poor nutritional intake, or have lower socioeconomic status and be less educated. These variables adversely affect birthweight and have not been adequately controlled for. In these cases the covariate is correlated with the error term (endogeneity).
• Because of this, the between-effect may be an overestimate of the true effect.
• In contrast, for within-effects each mother serves as her own control, so within-mother estimates may be closer to the true causal effect.
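One way to make the between/within distinction concrete is to split a covariate into its cluster mean and the deviation from that mean, and to give each part its own coefficient. The symbols $\bar{x}_{\cdot j}$, $n_j$, $\beta^{B}$ and $\beta^{W}$ are notation introduced here for illustration.

```latex
\begin{aligned}
\bar{x}_{\cdot j} &= \frac{1}{n_j} \sum_{i=1}^{n_j} x_{ij} \\
y_{ij} &= \beta_1 + \beta^{W} \,(x_{ij} - \bar{x}_{\cdot j})
         + \beta^{B} \,\bar{x}_{\cdot j} + \zeta_j + \epsilon_{ij}
\end{aligned}
```

When $\beta^{B} = \beta^{W}$, this reduces to the random intercept model with a single coefficient for $x_{ij}$. The ecological fallacy amounts to reading $\beta^{B}$ as if it were $\beta^{W}$ when the two differ.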
How to test for endogeneity?
Use the Hausman test to compare two alternative estimators of $\beta$: the fixed-effects estimator and the random-effects estimator.
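A minimal sketch of the usual Stata workflow for this comparison, using simulated data in which the covariate is deliberately correlated with the random intercept. The names `id`, `zeta`, `x`, `y`, `fixed`, `random` and all parameter values are hypothetical.

```stata
* Simulate 200 clusters of 4 observations with cluster-level confounding
clear
set seed 2024
set obs 200
generate id = _n
generate zeta = rnormal(0, 1)
expand 4
* The covariate is correlated with zeta_j, so it is endogenous
generate x = 0.8*zeta + rnormal()
generate y = 1 + 0.5*x + zeta + rnormal()

xtset id
* Fixed-effects estimator: consistent even under endogeneity
xtreg y x, fe
estimates store fixed
* Random-effects estimator: efficient only if x is uncorrelated with zeta_j
xtreg y x, re
estimates store random
* The consistent estimator is listed first, the efficient one second
hausman fixed random
```

A small p-value from `hausman` suggests that the two estimators differ systematically, which is evidence against the random-effects assumption that the covariate is uncorrelated with $\zeta_j$.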
Random-coefficient model
We’ve already considered random intercept models, where the intercept is allowed to vary over clusters after controlling for covariates.
What if we would also like the coefficients (or slopes) to vary across clusters?
Models that involve both random intercepts and random slopes are called Random Coefficient Models.
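Random Intercept Model:
$y_{ij} = \beta_1 + \beta_2 x_{ij} + \zeta_j + \epsilon_{ij}$
Random Coefficient Model:
$y_{ij} = \beta_1 + \beta_2 x_{ij} + \zeta_{1j} + \zeta_{2j} x_{ij} + \epsilon_{ij}$
$y_{ij} = (\beta_1 + \zeta_{1j}) + (\beta_2 + \zeta_{2j}) x_{ij} + \epsilon_{ij}$
$\zeta_{1j}$: cluster-specific random intercept
$\zeta_{2j}$: cluster-specific random slope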
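A minimal sketch of fitting a random coefficient model, again on simulated data. The names `id`, `zeta1`, `zeta2`, `x`, `y` and all parameter values are hypothetical, and `xtmixed` is assumed as the estimation command (called `mixed` in more recent Stata releases).

```stata
* Simulate 100 clusters with 6 observations each (hypothetical values)
clear
set seed 2024
set obs 100
generate id = _n
* Cluster-specific random intercept and random slope
generate zeta1 = rnormal(0, 2)
generate zeta2 = rnormal(0, 0.5)
expand 6
generate x = rnormal()
* y_ij = (beta1 + zeta1_j) + (beta2 + zeta2_j)*x_ij + epsilon_ij
generate y = 1 + 0.5*x + zeta1 + zeta2*x + rnormal()

* Listing x after id: adds a random slope alongside the random intercept;
* covariance(unstructured) lets intercept and slope be correlated
xtmixed y x || id: x, covariance(unstructured) mle
```

Without `covariance(unstructured)`, the command assumes that the random intercept and random slope are uncorrelated, which is rarely a sensible restriction in a random coefficient model.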