

Multilevel Modeling: Introduction

Chongming Yang, Ph.D

Social Science Research Institute

Social Capital Group Meeting, Spring 2008

“In the past twenty years we have witnessed a paradigm shift in the analysis of correlational data. Confirmatory factor analysis and structural equation modeling have replaced exploratory factor analysis and multiple regression as the standard methods. We are currently in the early stages of a paradigm shift in the analysis of experimental data. Multilevel modeling is replacing ANOVA. Certainly ANOVA will remain a basic tool in the social psychological research, but it can no longer be considered the only technique”

Kenny, D. A., Kashy, D. A., & Bolger, N. (1998). Data analysis in social psychology. In D. T. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), The Handbook of Social Psychology, Vol. 1 (pp. 233-265). New York: McGraw-Hill.

New Paradigm in Data Analysis

Alternative Labels

• Hierarchical Linear Model (HLM)

• Random Coefficient Model

• Variance Component Model

• Multilevel Model

• Contextual Analysis

• Mixed Linear Model

Hierarchical Data Structure

• Response (outcome) variable at lowest level

• Grouping at higher levels

• Explanatory (predictive) variables at all levels

• Assuming sampling at all levels

Two Types

• Persons nested within a group

• Repeated measures nested within a person

Example of Multilevel Data

Class  Student id  Math(yr1)  Verb(yr1)  ses  Math(yr2)
1      1           78         72         70   80
1      2           65         60         56   67
1      3           80         78         63   81
1      4           85         80         75   85
2      1           92         90         80   90
2      2           91         92         81   92
2      3           93         91         83   93
2      4           90         92         82   91
2      5           94         93         85   95

Properties of Hierarchical Data

• Observations are interdependent: those within the same group are more similar to each other than to those from different groups, owing to shared history, contextual effects, etc.

• Errors are not independent (longitudinal data)

Standard Modeling Assumptions

• Independent observations

• Independent errors

• Equal variances of errors for all observations

Consequences of Ignoring Hierarchical Data Properties

• Underestimated standard errors for regression coefficients, thus

• Spurious significant effects (inflated Type I error rate)

Design-based Approach

• Apply standard analysis with sampling weights to adjust standard errors, common in survey research

Design Effects of Two-level Data

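• Intraclass Correlation

$\rho$ = between-level variance / total variance

• Design Effect

$1 + (n - 1)\rho$

where $n$ = average cluster size (a design effect ≥ 2 warrants a multilevel analysis). The effective sample size is the total sample size divided by the design effect, $N / [1 + (n - 1)\rho]$.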
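As a minimal sketch of these two quantities, the following computes an intraclass correlation from two variance components and then the design effect, using the standard definition 1 + (n - 1) x ICC. The function names and the numeric inputs are illustrative assumptions, not values from the slides.

```python
def intraclass_correlation(between_var, within_var):
    """ICC = between-level variance / total variance."""
    return between_var / (between_var + within_var)


def design_effect(avg_cluster_size, icc):
    """Design effect = 1 + (n - 1) * ICC, with n the average cluster size."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


# Hypothetical variance components, chosen only for illustration.
icc = intraclass_correlation(between_var=2.0, within_var=8.0)
deff = design_effect(avg_cluster_size=11, icc=icc)

# Rule of thumb from the slide: a design effect of 2 or more
# suggests that a multilevel analysis is warranted.
needs_multilevel = deff >= 2.0

print(f"ICC = {icc:.2f}, design effect = {deff:.2f}, "
      f"multilevel warranted: {needs_multilevel}")
```

Even a modest ICC can produce a large design effect when clusters are big, which is why the cluster size enters the formula.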

Another Look

Class  Student id  Math(y1)  Verbal(y1)  ses  Teachers' Competence
1      1           78        72          70   4
1      2           65        60          56   4
1      3           80        78          63   4
1      4           85        80          75   4
2      1           92        90          80   3
2      2           91        92          81   3
2      3           93        91          83   3
2      4           90        92          82   3
2      5           94        93          85   3

Intercepts & Slopes for Each Class

[Figure: y plotted against X, with a separate regression line (intercept and slope) for each class]

Class Level Summary

Class  intercept  slope  …
1      9.72       2.50
2      13.51      3.26
3      7.64       4.07
4      16.25      0.92
5      13.17      1.27
6      11.21      3.85
7      9.05       4.21
8      17.11      1.32
9      15.32      2.11
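Per-class intercepts and slopes of this kind can be obtained by fitting a separate regression line within each class. The sketch below does so for the two classes of the example data, regressing Math (year 1) on ses; the choice of ses as the predictor and the helper name `ols_line` are assumptions made for illustration, and the nine-class summary above comes from data not shown in the slides.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# (ses, Math year 1) for the two classes in the example data table.
classes = {
    1: ([70, 56, 63, 75], [78, 65, 80, 85]),
    2: ([80, 81, 83, 82, 85], [92, 91, 93, 90, 94]),
}

# One (intercept, slope) pair per class.
class_lines = {
    class_id: ols_line(ses, math_score)
    for class_id, (ses, math_score) in classes.items()
}

for class_id, (intercept, slope) in class_lines.items():
    print(f"class {class_id}: intercept = {intercept:.2f}, slope = {slope:.2f}")
```

With only four or five students per class these estimates are very noisy, which is one reason the model-based approach pools information across classes instead of analysing each class alone.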

Modeling Intercepts & Slopes

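$\beta_0 = g_{00} + u_0$

$\beta_1 = g_{10} + u_1$

Here $g$ denotes the fixed coefficients (conventionally written $\gamma$). When the variances of $u_0$ and $u_1$ are zero, there are no group differences in $\beta_0$ and $\beta_1$. The variances of $u_0$ and $u_1$ are therefore very important parameters.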
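To make the role of these parameters concrete, the following descriptive sketch takes the nine class intercepts and slopes from the class-level summary and computes their means, variances, and covariance. These are only rough stand-ins for the average intercept and slope (the g terms) and the variances of u0 and u1: the raw variance of per-class estimates also contains sampling error, so it tends to overstate the true between-class variance, and a multilevel program would estimate these quantities by maximum likelihood instead.

```python
from statistics import mean, variance

# Intercepts and slopes of the nine classes in the class-level summary table.
intercepts = [9.72, 13.51, 7.64, 16.25, 13.17, 11.21, 9.05, 17.11, 15.32]
slopes = [2.50, 3.26, 4.07, 0.92, 1.27, 3.85, 4.21, 1.32, 2.11]

mean_intercept = mean(intercepts)      # descriptive stand-in for the mean intercept
mean_slope = mean(slopes)              # descriptive stand-in for the mean slope
var_intercept = variance(intercepts)   # rough stand-in for var(u0)
var_slope = variance(slopes)           # rough stand-in for var(u1)

# Sample covariance between intercepts and slopes across classes.
n = len(intercepts)
cov_intercept_slope = sum(
    (a - mean_intercept) * (b - mean_slope)
    for a, b in zip(intercepts, slopes)
) / (n - 1)

print(f"mean intercept = {mean_intercept:.2f}, mean slope = {mean_slope:.2f}")
print(f"var intercept = {var_intercept:.2f}, var slope = {var_slope:.2f}")
print(f"cov(intercept, slope) = {cov_intercept_slope:.2f}")
```

In these nine classes the covariance is negative: classes with higher intercepts tend to have flatter slopes.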

Model-based Approach: Multilevel Modeling

• (Multiple Equations) Multilevel Model: $y_i = \beta_0 + \beta_1 x_i + r_i$

$\beta_0 = g_{00} + u_0$

$\beta_1 = g_{10} + u_1$

• (Single Equation) Mixed Model: $y_i = g_{00} + u_0 + g_{10} x_i + u_1 x_i + r_i$
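The single-equation form follows from substituting the two second-level equations into the first-level equation. The intermediate step, written out with the same symbols:

```latex
\begin{aligned}
y_i &= \beta_0 + \beta_1 x_i + r_i \\
    &= (g_{00} + u_0) + (g_{10} + u_1)\,x_i + r_i \\
    &= g_{00} + u_0 + g_{10} x_i + u_1 x_i + r_i
\end{aligned}
```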

Multilevel Modeling (with 2nd Level Predictors)

• (Multiple Equations) Multilevel Model: $y_i = \beta_0 + \beta_1 x_i + r_i$

$\beta_0 = g_{00} + g_{01} z_j + u_0$ (main effects)

$\beta_1 = g_{10} + g_{11} z_j + u_1$ (cross-level interaction)

• (Single Equation) Mixed Model: $y_i = g_{00} + g_{01} z_j + u_0 + g_{10} x_i + g_{11} z_j x_i + u_1 x_i + r_i$
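The multiple-equation and single-equation forms describe the same model. As a quick numerical check, the sketch below evaluates both forms for randomly drawn inputs and confirms they agree. All coefficient values and distributions are hypothetical and chosen only for illustration.

```python
import random

random.seed(1)

# Hypothetical fixed effects, chosen only for illustration.
g00, g01, g10, g11 = 50.0, 2.0, 0.8, -0.1


def y_multilevel(x, z, u0, u1, r):
    """Two-step form: build the class coefficients, then the outcome."""
    beta0 = g00 + g01 * z + u0
    beta1 = g10 + g11 * z + u1
    return beta0 + beta1 * x + r


def y_mixed(x, z, u0, u1, r):
    """Single-equation form after substitution."""
    return g00 + g01 * z + u0 + g10 * x + g11 * z * x + u1 * x + r


max_gap = 0.0
for _ in range(1000):
    x = random.uniform(50, 90)           # first-level predictor
    z = random.choice([1, 2, 3, 4, 5])   # second-level predictor
    u0 = random.gauss(0, 3)              # random intercept deviation
    u1 = random.gauss(0, 0.2)            # random slope deviation
    r = random.gauss(0, 5)               # first-level residual
    gap = abs(y_multilevel(x, z, u0, u1, r) - y_mixed(x, z, u0, u1, r))
    max_gap = max(max_gap, gap)

print(f"largest difference between the two forms: {max_gap:.2e}")
```

Any difference is floating-point rounding only; the product term z * x in the single equation is what carries the cross-level interaction.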

Rearranged Single Equation

• $y_i = [g_{00} + g_{10} x_i + g_{01} z_j + g_{11} z_j x_i]$ (fixed effects)

$+ [u_1 x_i + u_0 + r_i]$ (random effects)

• Parameters to be estimated:

intercept: $g_{00}$

slopes: $g_{10}$, $g_{01}$, $g_{11}$

variances: of $r$, $u_0$, $u_1$

covariances: among the $r$s (in longitudinal data), between $u_0$ and $u_1$, among the $g$s
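As one possible way to estimate these parameters outside the HLM and SAS programs used in this presentation, the sketch below simulates two-level data and fits the model with the `mixedlm` function from the Python package statsmodels. The use of statsmodels, the simulated data, and all true parameter values are assumptions made for illustration; the packages numpy, pandas, and statsmodels must be installed.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_clusters, n_per_cluster = 40, 25

# Simulate data from the single-equation model with hypothetical parameters.
rows = []
for j in range(n_clusters):
    z = rng.normal()              # second-level predictor
    u0 = rng.normal(scale=1.0)    # random intercept deviation
    u1 = rng.normal(scale=0.5)    # random slope deviation
    for _ in range(n_per_cluster):
        x = rng.normal()          # first-level predictor
        r = rng.normal(scale=1.0) # first-level residual
        y = 2.0 + 0.5 * z + u0 + (1.0 + 0.3 * z + u1) * x + r
        rows.append({"cluster": j, "x": x, "z": z, "y": y})
data = pd.DataFrame(rows)

# "y ~ x * z" expands to x, z and the cross-level product x:z;
# re_formula="~x" requests a random intercept and a random slope for x.
model = smf.mixedlm("y ~ x * z", data, groups=data["cluster"], re_formula="~x")
result = model.fit(reml=True)

print(result.fe_params)  # fixed effects: Intercept, x, z, x:z
print(result.cov_re)     # variances and covariance of u0 and u1
print(result.scale)      # variance of r
```

The fixed-effect estimates correspond to g00, g10, g01, and g11, and the random-effects covariance matrix holds the variances of u0 and u1 together with their covariance.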

Fixed or Random?

Effect

• Fixed: all levels are present in the experiment

• Random: the levels are a random selection from all possible levels

Variable

• Fixed: known values, e.g. gender

• Random: has an expectation (mean) and a variance

Coefficient

• Fixed: e.g. gender

• Random: a probability function of other variables, has a variance, e.g. 1st-level coefficients

Cross-level Interaction

• Appears in the single equation as a product term, not in the multiple equations

• The effect of a lower-level variable depends on upper-level variables

• Example:

The effect of students' aptitude on math achievement depends on teachers' competence
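A cross-level interaction means that the slope of the lower-level predictor is itself a linear function of the upper-level variable: slope = g10 + g11 x z. The sketch below evaluates that slope at several levels of teacher competence. The coefficient values and the function name are hypothetical and serve only to illustrate the arithmetic.

```python
# Hypothetical fixed-effect estimates, for illustration only.
g10 = 0.60   # effect of aptitude when teacher competence z = 0
g11 = 0.15   # change in that effect per unit of teacher competence


def aptitude_slope(z):
    """Effect of student aptitude on achievement at competence level z."""
    return g10 + g11 * z


# Simple slopes at low, medium, and high teacher competence.
simple_slopes = {z: aptitude_slope(z) for z in (1, 3, 5)}

for z, slope in simple_slopes.items():
    print(f"teacher competence = {z}: aptitude slope = {slope:.2f}")
```

With a positive g11, aptitude matters more in classes with more competent teachers; a negative g11 would mean the opposite.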

Estimation

• Restricted Maximum Likelihood (REML): only the variance components are included in the likelihood function; the regression coefficients are estimated in a second step (variance estimates are less biased).

• Full Maximum Likelihood (FML): both the variance components and the regression coefficients are included in the likelihood function (variances are slightly underestimated).
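The difference between the two estimators is easiest to see in the simplest possible case, a single sample with only a mean to estimate. There the full maximum likelihood variance divides the sum of squares by n, while the restricted version divides by n - 1 to account for the estimated mean. The sketch below uses the Math (yr1) scores of class 1 from the example data; it illustrates the principle only and is not a multilevel estimator.

```python
from statistics import mean

scores = [78, 65, 80, 85]   # Math (yr1) scores of class 1 in the example data
n = len(scores)
center = mean(scores)

# Sum of squared deviations from the estimated mean.
sum_sq = sum((s - center) ** 2 for s in scores)

var_ml = sum_sq / n          # full ML: ignores that the mean was estimated
var_reml = sum_sq / (n - 1)  # REML: corrects for the one estimated fixed effect

print(f"ML variance   = {var_ml:.2f}")
print(f"REML variance = {var_reml:.2f}")
```

The gap between the two shrinks as the sample grows, which matches the slide's remark that full maximum likelihood underestimates variances only slightly. In a multilevel model the relevant sample size for the second-level variances is the number of groups.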

Deviance

• Deviance = -2 times the log-likelihood; the difference in deviance between two nested models follows a $\chi^2$ distribution and can be used for model comparison

• The smaller the deviance, the better the fit
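A deviance comparison between two nested models can be sketched as follows: the difference in deviance is referred to a chi-square distribution whose degrees of freedom equal the number of extra parameters. The deviance values below are hypothetical, and the helper `chi2_sf` is a small stand-in that only covers 1 or 2 degrees of freedom so the example needs no external packages.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")


# Hypothetical deviances of two nested models fitted to the same data.
deviance_simple = 1250.0    # e.g. random intercept only
deviance_complex = 1242.3   # e.g. adds a slope variance and a covariance
extra_parameters = 2

lr_statistic = deviance_simple - deviance_complex
p_value = chi2_sf(lr_statistic, extra_parameters)

print(f"deviance difference = {lr_statistic:.1f}, "
      f"df = {extra_parameters}, p = {p_value:.4f}")
```

When the extra parameters are variances, which cannot be negative, this reference distribution is known to be conservative, so the p-value from such a comparison errs on the large side.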

Explore HLM Program

• Create MDM

• Specify and run a model

• Interpret parameters in the output

Model Exploration Procedures

1. Start with an intercept-only model (Calculate intraclass correlation)

2. Add 1st level predictors for a fixed model (Test individual slopes)

3. Model intercept by 2nd level predictors (Test significance & amount of variance explained)

4. Random coefficient model (Test variance component of 1st level slopes one by one)

5. Model random slopes predicted by higher-level variables (Test significance and amount of variance explained)
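Two of the quantities named in these steps are simple ratios of variance components: the intraclass correlation from the intercept-only model, and the proportion of variance explained when predictors are added. The sketch below shows the arithmetic with hypothetical variance components; the dictionary layout and the function name are assumptions for illustration.

```python
def proportion_explained(var_baseline, var_model):
    """Share of a baseline variance component removed by added predictors."""
    return (var_baseline - var_model) / var_baseline


# Hypothetical variance components from two fitted models.
intercept_only = {"between": 20.0, "within": 60.0}
with_level2_predictor = {"between": 14.0, "within": 60.0}

# Step 1: intraclass correlation from the intercept-only model.
icc = intercept_only["between"] / (
    intercept_only["between"] + intercept_only["within"]
)

# Step 3: share of between-group intercept variance explained
# by the second-level predictor.
explained_between = proportion_explained(
    intercept_only["between"], with_level2_predictor["between"]
)

print(f"ICC = {icc:.2f}")
print(f"between-group variance explained = {explained_between:.0%}")
```

The same ratio applies in step 5, with the slope variance from the random coefficient model as the baseline.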

Longitudinal Data

[Figure: y plotted against Time, with occasions 0, 1, 2, 3, 4 on the horizontal axis]

Unconditional Growth Model

• 1st level: Occasion

$y = p_0 + p_1 t + r$

• 2nd level: Person

$p_0 = g_{00} + u_0$

$p_1 = g_{10} + u_1$
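The growth model can be illustrated by simulating trajectories from it and then recovering the average intercept and slope. The sketch below draws each person's own intercept and slope, generates five occasions of data, fits a line per person, and averages the results. All parameter values are hypothetical, and averaging per-person regressions is a simple descriptive substitute for the maximum likelihood estimation a multilevel program performs.

```python
import random
from statistics import mean

random.seed(2008)

G00, G10 = 10.0, 2.0                 # assumed mean intercept and mean slope
SD_U0, SD_U1, SD_R = 2.0, 0.5, 1.0   # assumed standard deviations of u0, u1, r
TIMES = [0, 1, 2, 3, 4]


def person_line(t, y):
    """OLS intercept and slope of one person's trajectory."""
    mean_t, mean_y = mean(t), mean(y)
    stt = sum((ti - mean_t) ** 2 for ti in t)
    sty = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
    slope = sty / stt
    return mean_y - slope * mean_t, slope


estimates = []
for _ in range(200):
    p0 = G00 + random.gauss(0, SD_U0)   # person-specific intercept
    p1 = G10 + random.gauss(0, SD_U1)   # person-specific slope
    y = [p0 + p1 * t + random.gauss(0, SD_R) for t in TIMES]
    estimates.append(person_line(TIMES, y))

mean_intercept = mean(e[0] for e in estimates)
mean_slope = mean(e[1] for e in estimates)

print(f"average intercept = {mean_intercept:.2f}")
print(f"average slope     = {mean_slope:.2f}")
```

Because time is coded 0 at the first occasion, the intercept is the expected status at the start of the study.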

Parameters to Interpret

• Means of Intercept ($g_{00}$) & Slope ($g_{10}$)

• Variances of Intercept ($u_0$) & Slope ($u_1$)

• Covariance/Correlation of Intercept & Slope ($u_0$ & $u_1$)

Extended Model

• Occasion level: Time-variant covariate x

$y = p_0 + p_1 t + p_2 x + r$

• Person level: time-invariant covariate z

$p_0 = g_{00} + g_{01} z_j + u_0$

$p_1 = g_{10} + g_{11} z_j + u_1$

$p_2 = g_{20} + g_{21} z_j + u_2$
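Substituting the three person-level equations into the occasion-level equation gives the single-equation form of the extended model, grouped into fixed and random parts. The coefficient of x is written as p2 here, matching the person-level equation for p2.

```latex
\begin{aligned}
y &= p_0 + p_1 t + p_2 x + r \\
  &= (g_{00} + g_{01} z_j + u_0) + (g_{10} + g_{11} z_j + u_1)\,t
     + (g_{20} + g_{21} z_j + u_2)\,x + r \\
  &= [\,g_{00} + g_{01} z_j + g_{10} t + g_{11} z_j t + g_{20} x + g_{21} z_j x\,]
     + [\,u_0 + u_1 t + u_2 x + r\,]
\end{aligned}
```

The terms with $z_j t$ and $z_j x$ are the cross-level interactions: the time-invariant covariate moderates the growth rate and the effect of the time-variant covariate.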

Nonlinear Growth (by Recoding T Variable)

• Linear: 0, 1, 2, 3… (0, 1, 2.5, 3.5…)

• Quadratic: 0, 1, 4, 9…

• Logarithmic: 0, 0.69, 1.10, 1.39…

• Exponential: 0, 1.72, 6.39, 19.09
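The recoded time values can be generated from the occasion numbers. The formulas ln(t + 1) for the logarithmic coding and e^t - 1 for the exponential coding are inferred from the values listed above, which they reproduce to two decimals; they are not stated in the slides.

```python
import math

occasions = [0, 1, 2, 3]

# Each coding starts at 0 so the intercept remains the status
# at the first occasion.
time_codes = {
    "linear": [float(t) for t in occasions],
    "quadratic": [float(t ** 2) for t in occasions],
    "logarithmic": [math.log(t + 1) for t in occasions],
    "exponential": [math.exp(t) - 1 for t in occasions],
}

for name, codes in time_codes.items():
    print(f"{name:12s}", [round(c, 2) for c in codes])
```

The recoded variable simply replaces t in the occasion-level equation, so the model stays linear in its parameters while the fitted trajectory becomes curved.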

Explore HLM Program Chapter 4 Example

• Create MDM

• Specify and run a model

• Interpret parameters in the output

Explore the SAS program

• Identify levels of the variables in the data

• Identify which variables could have main and/or interaction effects

• Identify random coefficients and then their variances in the output

Minimum Sample Size

• Cluster level: > 20

• Individual level: ≥ 1

Obtain Standardized Coefficients

Standardize continuous variables to obtain standardized coefficients
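Standardizing means converting each continuous variable to z-scores (mean 0, standard deviation 1) before fitting the model. The sketch below does this for two columns of the example data; the function name is an assumption, and standardizing over the whole sample rather than within groups is one possible choice.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert values to z-scores using the sample mean and sample SD."""
    center, spread = mean(values), stdev(values)
    return [(v - center) / spread for v in values]


# Math (yr1) and ses for all nine students in the example data.
math_yr1 = [78, 65, 80, 85, 92, 91, 93, 90, 94]
ses = [70, 56, 63, 75, 80, 81, 83, 82, 85]

math_z = standardize(math_yr1)
ses_z = standardize(ses)

print([round(v, 2) for v in math_z])
print([round(v, 2) for v in ses_z])
```

Coefficients estimated from the standardized variables are then expressed in standard deviation units, which makes predictors measured on different scales comparable.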

Further Topics

• Categorical dependent variables

• Multivariate dependent variables

• Latent variables + mediating effects (multilevel structural equation modeling)

• Power & Sample Size

• ...

Further Resources

• http://gseweb.harvard.edu/~faculty/singer/

• www.ats.ucla.edu/stat/sas/default.htm

• SSRI consultants

• …
