Random effects as latent variables: SEM for repeated measures data

Dr Patrick Sturgis

University of Surrey

Overview

• Random effects (multi-level) models for repeated measures data.

• Random effects as latent variables.

• Specifying time in LGC models.

• Growth parameters.

• Linear Growth.

• Plotting observed and fitted growth.

• Predicting Growth.

• An example: Issue Voting.

Repeated Measures & Random Effects

• A problem when analysing panel data is how to account for the correlation between observations on the same subject.

• Different approaches handle this problem in different ways.

• E.g. impose different structures on the residual correlations (exchangeable, unstructured, independent).

• Assume correlations between repeated observations arise because the regression coefficients vary across subjects.

• So, we have average (or ‘fixed’) effects for the population as a whole.

• And individual variability (or ‘random’) effects around these average coefficients.

• This is sometimes referred to as a ‘random effects’ or ‘multi-level’ model.
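To make the idea concrete, here is a minimal simulation sketch in which each subject has their own intercept and slope. It shows that coefficient variation alone is enough to produce correlation between observations on the same subject. All numeric values (fixed effects, standard deviations, sample size, seed) are illustrative assumptions, not estimates from any data used in this course.

```python
import random
import statistics

def simulate_panel(n_subjects=2000, n_waves=4, seed=1):
    """Simulate y_it = (b0 + u0_i) + (b1 + u1_i) * t + e_it.

    Every numeric value here is an illustrative assumption.
    """
    rng = random.Random(seed)
    b0, b1 = 10.0, 1.0                    # 'fixed' effects: average intercept and slope
    sd_u0, sd_u1, sd_e = 2.0, 0.7, 1.0    # SDs of the 'random' effects and the residual
    panel = []
    for _ in range(n_subjects):
        u0 = rng.gauss(0.0, sd_u0)        # subject-specific deviation in intercept
        u1 = rng.gauss(0.0, sd_u1)        # subject-specific deviation in slope
        row = [(b0 + u0) + (b1 + u1) * t + rng.gauss(0.0, sd_e)
               for t in range(n_waves)]
        panel.append(row)
    return panel

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

panel = simulate_panel()
wave1 = [row[0] for row in panel]
wave2 = [row[1] for row in panel]
r12 = pearson(wave1, wave2)
print(f"correlation between wave 1 and wave 2: {r12:.2f}")
```

The residuals are drawn independently, so the sizeable correlation between waves comes entirely from the subject-specific intercepts and slopes.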

SEM for Repeated Measures

• The primary focus of this course has been on how latent variables can be used on cross-sectional data.

• The same framework can be used on repeated measures data to overcome the correlated residuals problem.

• The mean of a latent variable is used to estimate the ‘average’ or fixed effect.

• The variance of a latent variable represents individual heterogeneity around the fixed coefficient – the ‘random’ effect.

• For cross-sectional data latent variables are specified as a function of different items at the same time point.

• For repeated measures data, latent variables are specified as a function of the same item at different time points.

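A Single Latent Variable Model

[Path diagram, drawn twice: a single latent variable (LV) with four observed indicators (labelled X11 to X14 in one version and X1 to X4 in the other), each with its own error term (E1 to E4) whose path is fixed to 1.]

• Cross-sectional use: 4 different items. Estimate mean and variance of underlying factor. Estimate factor loadings.

• Repeated measures use: same item at 4 time points. Estimate mean and variance of trajectory of change over time. Constrain factor loadings.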

Random Effects as Latent Variables

• So, it turns out that another way of estimating growth trajectories on repeated measures data is as latent variables in SEM models.

• The mean of the latent variable is the fixed part of the model.
– It indicates the average for the parameter in the population.

• The variance of the latent variable is the random part of the model.
– It indicates individual heterogeneity around the average.
– Or inter-individual difference in intra-individual change.

Growth Parameters

• The earlier path diagram was an over-simplification.

• In practice we require at least two latent variables to describe growth.

• One to estimate the mean and variance of the intercept (usually denoting initial status).

• And one to estimate the mean and variance of the slope (denoting change over time).

Specifying Time in LGC Models

• In random effect models, time is included as an independent variable:

$$y_{it} = \beta_{0i} + \beta_{1i} x_{t} + \varepsilon_{it}$$

• In LGC models, time is included via the factor loadings of the latent variables.

• We constrain the factor loadings to take on particular values.

• The number of latent variables and the values of the constrained loadings specify the shape of the trajectory.

A Linear Growth Curve Model

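[Path diagram: two latent variables, ICEPT and SLOPE, each loading on the four repeated measures X1t1, X1t2, X1t3 and X1t4. The ICEPT loadings are fixed to 1, 1, 1, 1 and the SLOPE loadings to 0, 1, 2, 3. Each measure has its own error term (E1 to E4) with its path fixed to 1.]

• Constraining the intercept loadings to 1 at every time point makes this parameter indicate initial status. This holds because the slope loading is 0 at the first time point.

• Constraining the slope loadings to 0, 1, 2, 3 makes this parameter indicate linear change.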
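Written out as equations, the diagram corresponds to the measurement model sketched below. The notation is an assumption adopted here for illustration and is not taken from the slides: η for the latent intercept and slope, λ for the fixed loadings, and μ for the latent means.

```latex
\begin{aligned}
y_{it} &= \lambda_{0t}\,\eta_{0i} + \lambda_{1t}\,\eta_{1i} + \varepsilon_{it},
  \qquad t = 1, \dots, 4 \\
\lambda_{0t} &= 1, \qquad \lambda_{1t} = t - 1 \in \{0, 1, 2, 3\} \\
\mathrm{E}(y_{it}) &= \mu_{0} + (t - 1)\,\mu_{1},
  \qquad \mu_{0} = \mathrm{E}(\eta_{0i}),\; \mu_{1} = \mathrm{E}(\eta_{1i})
\end{aligned}
```

The last line assumes the error terms have mean zero. At the first time point the slope loading is 0, so the expected score equals the intercept mean.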

File structure for LGC

• For random effect models, we use ‘long’ data file format.

• There are as many rows as there are observations.

• For LGC, we use ‘wide’ file formats.

• Each case (e.g. respondent) has only one row in the data file.
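The difference between the two layouts can be sketched with a small conversion routine. The field names (`id`, `time`, `score`), the `X1t` column prefix and the data values are hypothetical and chosen only to mirror the variable labels used in the path diagrams.

```python
from collections import defaultdict

# Hypothetical 'long' file: one row per observation (student, time point, score).
long_rows = [
    {"id": 1, "time": 1, "score": 10},
    {"id": 1, "time": 2, "score": 12},
    {"id": 2, "time": 1, "score": 9},
    {"id": 2, "time": 2, "score": 9},
]

def long_to_wide(rows, id_key="id", time_key="time", value_key="score", prefix="X1t"):
    """Return one dict per case, with one column per time point (X1t1, X1t2, ...)."""
    by_case = defaultdict(dict)
    for row in rows:
        column = f"{prefix}{row[time_key]}"
        by_case[row[id_key]][column] = row[value_key]
    # One output row per case, ordered by case id.
    return [{id_key: case_id, **columns} for case_id, columns in sorted(by_case.items())]

wide_rows = long_to_wide(long_rows)
for row in wide_rows:
    print(row)
```

Four rows in the long layout become two rows in the wide layout, one per student, with the repeated measures spread across columns.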

An Example

• We are interested in the development of knowledge of SEM during a course.

• We have measures of knowledge on individual students taken at 4 time points.

• Test scores have a minimum value of zero and a maximum value of 25.

• We specify linear growth.

Linear Growth Example

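[Path diagram: the linear growth curve model, with ICEPT loadings fixed to 1, 1, 1, 1 and SLOPE loadings fixed to 0, 1, 2, 3 on X1t1 to X1t4, and error terms E1 to E4.]

• ICEPT: mean = 11.2 (1.4), p < 0.001; variance = 4.1 (0.8), p < 0.001.

• SLOPE: mean = 1.3 (0.25), p < 0.001; variance = 0.6 (0.1), p < 0.001.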

Interpretation

• The average level of knowledge at time point one was 11.2.

• There was significant variation across respondents in this initial status.

• On average, students increased their knowledge score by 1.3 units at each time point.

• There was significant variation across respondents in this rate of growth.

• Having established this descriptive picture, we will want to explain this variation.
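The interpretation can be checked with a little arithmetic on the reported estimates. The sketch below uses the intercept mean (11.2), slope mean (1.3) and the two variances (4.1 and 0.6) reported with the model. The 95% ranges rest on an added assumption that the individual intercepts and slopes are normally distributed.

```python
import math

# Estimates reported for the linear growth example.
icept_mean, icept_var = 11.2, 4.1
slope_mean, slope_var = 1.3, 0.6
slope_loadings = [0, 1, 2, 3]   # time coding used in the model

# Model-implied average score at each time point:
# mean_t = icept_mean + loading_t * slope_mean
implied_means = [round(icept_mean + lam * slope_mean, 1) for lam in slope_loadings]

def central_95(mean, variance):
    """Approximate range covering 95% of individuals, ASSUMING normality."""
    half_width = 1.96 * math.sqrt(variance)
    return (round(mean - half_width, 2), round(mean + half_width, 2))

print("implied mean scores:", implied_means)
print("95% range of initial status:", central_95(icept_mean, icept_var))
print("95% range of growth rates:", central_95(slope_mean, slope_var))
```

Under the normality assumption the lower end of the range of growth rates falls slightly below zero, which suggests that some students show little or no growth even though the average slope is clearly positive.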

Graphical Displays

• It is useful to graph observed and fitted growth trajectories.

• This gives us a clear picture of heterogeneity in individual development.

• This is useful for determining which time function(s) to specify.

• And can highlight model mis-specifications in a way that is difficult to spot with just the numerical estimates.
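One simple way to produce fitted trajectories for an exploratory plot is to fit a separate least-squares line to each individual's scores. This is a descriptive stand-in, not the fitted values from the LGC model itself, and the three students' scores below are hypothetical.

```python
def ols_line(times, scores):
    """Least-squares intercept and slope for one individual's scores."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope

times = [0, 1, 2, 3]
# Hypothetical observed scores for three students (not course data).
observed = {
    "A": [10, 12, 14, 16],
    "B": [12, 12, 13, 12],
    "C": [8, 11, 11, 15],
}

for student, scores in observed.items():
    intercept, slope = ols_line(times, scores)
    fitted = [round(intercept + slope * t, 2) for t in times]
    print(student, "observed:", scores, "fitted:", fitted)
```

Plotting the observed and fitted values side by side for each student shows how well a straight line describes individual change, and where a different time function might be needed.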

Observed Individual Trajectories

Fitted Trajectories

Explaining Growth

• Up to this point the models have been concerned only with describing growth.

• These are unconditional LGC models.

• We can add predictors of growth to explain why some people grow more quickly than others.

• These are conditional LGC models.

Predicting Growth

• Some predictors of growth do not change during the period of observation.

• E.g. sex, parental social class, date of birth.

• These are referred to as ‘fixed’ or time-constant.

• Other predictors change over time and may influence the outcome variable.

• E.g. parental status, health status.

• These are referred to as time-varying covariates.

Fixed Predictors of Growth

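[Path diagram: the linear growth curve model (ICEPT loadings fixed to 1, SLOPE loadings fixed to 0, 1, 2, 3 on X1t1 to X1t4, error terms E1 to E4), with Gender (women = 0; men = 1) added as a predictor of ICEPT and SLOPE.]

• Gender and ICEPT: Do men have a different initial status than women?

• Gender and SLOPE: Do men grow at a different rate than women?

• ICEPT and SLOPE: Does initial status influence rate of growth?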
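The three questions map onto three coefficients in a pair of regression equations for the latent variables. The sketch below uses assumed notation (η for the latent intercept and slope, γ and β for the path coefficients, ζ for the disturbances) that does not appear on the slides, and it treats the link between initial status and growth as a regression path.

```latex
\begin{aligned}
\eta_{0i} &= \alpha_{0} + \gamma_{0}\,\mathrm{Gender}_{i} + \zeta_{0i} \\
\eta_{1i} &= \alpha_{1} + \gamma_{1}\,\mathrm{Gender}_{i} + \beta\,\eta_{0i} + \zeta_{1i}
\end{aligned}
```

With women coded 0 and men coded 1, γ0 is the difference between men and women in initial status, γ1 is the difference in rate of growth at a given initial status, and β is the effect of initial status on rate of growth.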

Example: Issue Voting

• Proximity to parties on issue dimensions strongly related to political preferences...

• All previous investigations use between-person analysis of cross-sectional data.

• Is individual change in issue proximity correlated with individual change in party evaluation over the 5 years of the panel?

• Is this relationship moderated by level of political knowledge?

• British Election Panel Study 1997-2001.

Direction and Proximity

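[Diagram: a Left to Right issue scale showing three parties (P1, P2, P3) and a voter (V) located between P2 and P3.]

• Proximity – voter prefers party closest to them.

• Direction – voter prefers party strongest on same side of issue as them.

• Penalty applied to parties outside ‘region of acceptability’.

Directional score:

$$D_{ij} = \sum_{k=1}^{n} (x_{ik} \cdot y_{ijk}) / n$$

Proximity score:

$$P_{ij} = \sum_{k=1}^{n} |x_{ik} - y_{ijk}| / n$$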
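The two scores can be sketched in a few lines of code. The reading of the formulas used here is an assumption: x is taken to be the voter's position on each issue, y the position of the party on the same issue, and n the number of issues. The directional score is taken as the mean product of the two positions and the proximity score as the mean absolute distance. Positions are assumed to be centred on the neutral point of the scale, which the directional score needs in order to distinguish the two sides of an issue. All positions below are hypothetical.

```python
def directional_score(voter, party):
    """Mean product of voter and party positions across issues.

    ASSUMES positions are centred on the neutral point, so that the sign
    shows which side of the issue a position is on.
    """
    if len(voter) != len(party):
        raise ValueError("voter and party must cover the same issues")
    return sum(x * y for x, y in zip(voter, party)) / len(voter)

def proximity_score(voter, party):
    """Mean absolute distance between voter and party across issues."""
    if len(voter) != len(party):
        raise ValueError("voter and party must cover the same issues")
    return sum(abs(x - y) for x, y in zip(voter, party)) / len(voter)

# Hypothetical centred positions on two issues.
voter = [1, -2]
party_a = [2, -1]    # same side as the voter on both issues
party_b = [-1, 3]    # opposite side on both issues

for name, party in (("A", party_a), ("B", party_b)):
    print(name,
          "directional:", directional_score(voter, party),
          "proximity:", proximity_score(voter, party))
```

On this reading a larger proximity score means a greater distance from the party, while a larger directional score means stronger agreement with the party's side of the issue.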

Issue Dimensions

• European Integration
– Some people feel that Britain should do all it can to unite fully with the European Union. Other people feel that Britain should do all it can to protect its independence from the European Union.

• Taxation and spending
– Some people feel that the government should put up taxes a lot and spend much more on health and social services. Other people feel that the government should cut taxes a lot and spend much less on health and social services.

Issue Dimensions

• Income redistribution
– Some people feel that government should make much greater efforts to make people’s incomes more equal. Other people feel that government should be much less concerned about how equal people’s incomes are.

• Unemployment and Inflation
– Some people feel that getting people back to work should be the government's top priority. Other people feel that keeping prices down should be the government's top priority.

Spatial & Directional Scores 97-01

[Figures 1a to 1d: line charts of scores for Labour, Tory and Lib Dem across the five waves, 97 to 01.]

• Figure 1a: spatial, mean party placement.

• Figure 1b: spatial, respondent party placement.

• Figure 1c: directional, mean party placement.

• Figure 1d: directional, respondent party placement.

Party Evaluations 1997-2001

Choose a phrase from this scale to say how you feel about the Labour/Conservative/Liberal Democrat party:

5. Strongly Against
4. Against
3. Neither/Nor
2. In Favour
1. Strongly in Favour

Path Diagram for LGC Models

Cross-Sectional Betas of party evaluation on proximity by Knowledge of Party Positions

Sample                   Party          spatial   spatial      direction   direction
                                        (mean)    (personal)   (mean)      (personal)
full sample (n=2034)     Conservative    0.35*     0.53*       -0.46*      -0.48*
                         Labour          0.27*     0.42*       -0.39*      -0.43*
low knowledge (n=513)    Conservative    0.17*     0.36*       -0.25*      -0.32*
                         Labour          0.11*     0.24*       -0.21*      -0.27*
high knowledge (n=579)   Conservative    0.54*     0.69*       -0.64*      -0.64*
                         Labour          0.46*     0.58*       -0.58*      -0.57*

* = significant at the 95% confidence level.

Betas of party evaluation slope on proximity slope from LGC models by Knowledge of Party Positions

Sample                   Party          spatial   spatial      direction   direction
                                        (mean)    (personal)   (mean)      (personal)
full sample (n=2034)     Conservative   -1.72      0.63*       -0.56*      -0.65*
                         Labour          0.66*     0.43*       -0.88*      -1.03*
low knowledge (n=513)    Conservative   -0.45      0.23        -0.16       -0.29
                         Labour          0.56      1.05        -0.52       -1.21
high knowledge (n=579)   Conservative   -4.06      0.83*       -0.81*      -0.77*
                         Labour          0.86*     0.79*       -1.30*      -1.08*

* = significant at the 95% confidence level.

Conclusions

• For more sophisticated voters, change in policy proximity correlated with change in evaluation.

• No relationship between change in policy proximity and evaluation for least sophisticated.

• Cross-sectional parameters tell us nothing about temporal dimension of relationships.
