Upload
breonna-norford
View
217
Download
1
Embed Size (px)
Citation preview
Univariate Model Fitting
Sarah Medland
QIMR – openMx workshop
Brisbane 16/08/10
Univariate Twin Models= using the twin design to assess
the aetiology of one trait (univariate)
1. Path Diagrams
2. Basic ACE R Script
1. Path Diagrams for Univariate Models
Basic Twin Model - MZ
Twin 1 Trait
A C E
c ea
Twin 2 Trait
E C A
c ae
1.0
1 1 1 111
1.0
Basic Twin Model - DZ
Twin 1 Trait
A C E
c ea
Twin 2 Trait
E C A
c ae
0.5
1 1 1 111
1.0
Basic Twin Model
Twin 1 Trait
A C E
c ea
Twin 2 Trait
E C A
c ae
rMZ = 1.0;
rDZ = 1.0
rMZ = 1.0;
rDZ = 0.5
1 1 1 111
The Variance
Since the variance of a variable is
the covariance of the variable
with itself, the expected variance
will be the sum of all paths from
the variable to itself, which follow
Wright’s rules
Variance for Twin 1 - A
Twin 1 Trait
A C E
c ea
1 1 1 a*1*a = a2
Variance for Twin 1 - C
a*1*a = a2
c*1*c = c2
Twin 1 Trait
A C E
c ea
1 1 1
Variance for Twin 1 - E
a*1*a = a2
c*1*c = c2
e*1*e = e2
Twin 1 Trait
A C E
c ea
1 1 1
Total Variance for Twin 1
a*1*a = a2
c*1*c = c2
e*1*e = e2
Twin 1 Trait
A C E
c ea
1 1 1
Total Variance = a2 + c2 + e2
Covariance - MZ
Twin 1 Trait
A C E
c ea
Twin 2 Trait
E C A
c ae
1.0
1 1 1 111
1.0 a2 + c2
Covariance - DZ
Twin 1 Trait
A C E
c ea
Twin 2 Trait
E C A
c ae
0.5
1 1 1 111
1.00.5a2 +
c2
Variance-Covariance Matrices
MZ Twin 1 Twin 2
Twin 1 a2 + c2 + e2
a2 + c2
Twin 2 a2 + c2 a2 + c2 + e2
Variance-Covariance Matrices
DZ Twin 1 Twin 2
Twin 1 a2 + c2 + e2
0.5a2 + c2
Twin 2 0.5a2 + c2 a2 + c2 + e2
Variance-Covariance Matrices
MZ Twin 1 Twin 2
Twin 1 a2 + c2 + e2 a2 + c2
Twin 2 a2 + c2 a2 + c2 + e2
DZ Twin 1 Twin 2
Twin 1 a2 + c2 + e2 0.5a2 + c2
Twin 2 0.5a2 + c2 a2 + c2 + e2
Why isn’t e2 included in the covariance?
Because, e2 refers to environmental influences UNIQUE to each twin. Therefore, this cannot explain why there is similarity between twins.
Why is a2 only .5 for DZs but not MZs?
Because DZ twins share on average half of their genes, whereas MZs share all of their genes.
2. Basic openMx ACE Script
OverviewOpenMx script
Running the script
Describing the output
ACE model
# Fit ACE Model with RawData and Matrices Input# -----------------------------------------------------------------------univACEModel <- mxModel("univACE", mxModel("ACE", # Matrices a, c, and e to store a, c, and e path coefficients mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=10, label="a11", name="a" ), mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=10, label="c11", name="c" ), mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=10, label="e11", name="e" ), # Matrices A, C, and E compute variance components mxAlgebra( expression=a %*% t(a), name="A" ), mxAlgebra( expression=c %*% t(c), name="C" ), mxAlgebra( expression=e %*% t(e), name="E" ),# Algebra to compute total variances and standard deviations (diagonal only) mxAlgebra( expression=A+C+E, name="V" ),
Specify thematrices you
needTo build the
model
Twin 1 Trait
A C E
c ea
1 1 1
Do some algebra to get the variances
Start values?
a2 = additive genetic variance (A)c2 = Shared E variance (C)e2 = Non-shared E variance (E) Sum is modelled to be expected
Total VarianceStart Values for a, c, e:
(Total Variance / 3)
Twin 1 Trait
A C E
c ea
1 1 1
Standardize parameter estimates
# Algebra to compute total variances and standard deviations (diagonal only)
mxAlgebra( expression=A+C+E, name="V" ), mxMatrix( type="Iden", nrow=nv, ncol=nv, name="I"), mxAlgebra( expression=solve(sqrt(I*V)), name="iSD"), # Algebra to compute standardized path estimares and variance
components mxAlgebra( expression=a%*%iSD, name="sta"), mxAlgebra( expression=c%*%iSD, name="stc"), mxAlgebra( expression=e%*%iSD, name="ste"), mxAlgebra( expression=A/V, name="h2"), mxAlgebra( expression=C/V, name="c2"), mxAlgebra( expression=E/V, name="e2"),The regression coefficient a is standardized by: (a * SD(A)) / SD(Trait)
where SD(Trait) is the standard deviation of the dependent variable, and SD(A) is the standard deviation of the predictor, the latent factor ‘A’ (=1)
Twin 1 Trait
A
a
1
V1 * = V V
SDinv
= 1/SD a * 1/SD = a/SD
Standardize parameter estimates
# Algebra to compute total variances and standard deviations (diagonal only)
mxAlgebra( expression=A+C+E, name="V" ), mxMatrix( type="Iden", nrow=nv, ncol=nv, name="I"), mxAlgebra( expression=solve(sqrt(I*V)), name="iSD"), # Algebra to compute standardized path estimares and variance
components mxAlgebra( expression=a%*%iSD, name="sta"), mxAlgebra( expression=c%*%iSD, name="stc"), mxAlgebra( expression=e%*%iSD, name="ste"), mxAlgebra( expression=A/V, name="h2"), mxAlgebra( expression=C/V, name="c2"), mxAlgebra( expression=E/V, name="e2"),The heritability ‘h2’ is the proportion of the total variance due to A (additive
genetic effects; = A / V. Note: this will be “sta” squared. The standardized variance components for C and E are: C / V; E / V N
Twin 1 Trait
A
sta
1
VA / = “h2” “sta”2
Standardising dataV = A + C + E A/V = 73/233= .31
V = [73] + [90] + [70] C/V = 90/233= .39
V = [233] E/V = 70/233= .30
a = 8.55c = 9.49e = 8.35
SD = sqrt(V) = 15.26
sta = 8.55 / 15.26 = 0.56 squared = .31stc = 9.49 / 15.26 = 0.62 squared = .39ste = 8.35 / 15.26 = 0.55 squared = .30
# Algebra for expected variance/covariance matrix in MZ mxAlgebra( expression= rbind ( cbind(A+C+E , A+C), cbind(A+C , A+C+E)),
name="expCovMZ" ), # Algebra for expected variance/covariance matrix in DZ mxAlgebra( expression= rbind ( cbind(A+C+E , 0.5%x%A+C), cbind(0.5%x%A+C , A+C+E)),
name="expCovDZ" ) MZ Twin 1 Twin 2
Twin 1 a2 + c2 + e2 a2 + c2
Twin 2 a2 + c2 a2 + c2 + e2
DZ Twin 1 Twin 2
Twin 1 a2 + c2 + e2 0.5a2 + c2
Twin 2 0.5a2 + c2 a2 + c2 + e2
Twin 1 Trait
A C E
c ea
Twin 2 Trait
E C A
c ae
1/ 0.5
1 1 1 111
1.0
mxModel("MZ", mxData( observed=mzData, type="raw" ), mxFIMLObjective( covariance="ACE.expCovMZ",
means="ACE.expMean", dimnames=selVars ) ), mxModel("DZ", mxData( observed=dzData, type="raw" ), mxFIMLObjective( covariance="ACE.expCovDZ",
means="ACE.expMean", dimnames=selVars ) ), mxAlgebra( expression=MZ.objective + DZ.objective,
name="m2ACEsumll" ), mxAlgebraObjective("m2ACEsumll"), mxCI(c('ACE.A', 'ACE.C', 'ACE.E')))univACEFit <- mxRun(univACEModel, intervals=T)univACESumm <- summary(univACEFit)univACESumm
Sub-models# Fit AE model# -----------------------------------------------------------------------univAEModel <- mxModel(univACEFit, name="univAE", mxModel(univACEFit$ACE, mxMatrix( type="Lower", nrow=1, ncol=1, free=FALSE, values=0, label="c11",
name="c" ) ))univAEFit <- mxRun(univAEModel)univAESumm <- summary(univAEFit)univAESumm
You can fit sub-models to test the significance of your parameters
-you simply drop the parameter and see if the model fit is significantly worse than full
model
Twin 1 Trait
A E
Twin 2 Trait
E A
Sub-models# Fit CE model# -----------------------------------------------------------------------univCEModel <- mxModel(univACEFit, name="univCE", mxModel(univACEFit$ACE, mxMatrix( type="Lower", nrow=1, ncol=1, free=FALSE, values=0, label="a11", name="a" ) ))univCEFit <- mxRun(univCEModel)univCESumm <- summary(univCEFit)univCESumm
Twin 1 Trait
C E
Twin 2 Trait
E C
# Fit E model# -----------------------------------------------------------------------univEModel <- mxModel(univAEFit, name="univE", mxModel(univAEFit$ACE, mxMatrix( type="Lower", nrow=1, ncol=1, free=FALSE, values=0, label="a11",
name="a" ) ))univEFit <- mxRun(univEModel)univESumm <- summary(univEFit)univESumm
Twin 1 Trait
E
Twin 2 Trait
E
The E parameter can never not be dropped because it includes measurement error
OpenMx Output
univACESummfree parameters: name matrix row col Estimate 1 a11 ACE.a 1 1 8.5465042 c11 ACE.c 1 1 9.4884543 e11 ACE.e 1 1 8.3521974 mean ACE.Mean 1 1 94.147803
observed statistics: 2198 estimated parameters: 4 degrees of freedom: 2194 -2 log likelihood: 17639.91 saturated -2 log likelihood: NA number of observations: 1110 chi-square: NA p: NA AIC (Mx): 13251.91 BIC (Mx): 1127.663 adjusted BIC: RMSEA: NA
tableFitStatisticsmodels compared to saturated model
Name ep -2LL df AIC diffLL diffdf p
M1 : univTwinSat 10 17637.98 2188 13261.98 - - - M2 : univACE 4 17639.91 2194 13251.91 1.93 6 0.93M3 : univAE 3 17670.38 2195 13280.38 32.4 7 0 M4 : univCE 3 17665.27 2195 13275.27 27.3 7 0 M5 : univE 2 18213.28 2196 13821.28 575.3 8 0
Smaller -2LL means better fit. -2LL of sub-model is always higher (worse fit). The question is: is it significantly worse.Chi-sq test: dif in -2LL is chi-square distributed. Evaluate sig of chi-sq test. A non-sig p-value means that the model is consistent with the data.
Nested.fitmodels compared to ACE model
Name ep -2LL df diffLL diffdf p
univACE 4 17639.91 2194 NA NA NAunivAE 3 17670.38 2195 32.47 1 0 univCE 3 17665.27 2195 25.36 1 0 univE 2 18213.28 2196 573.37 2 0
Smaller -2LL means better fit. -2LL of sub-model is always higher (worse fit). The question is: is it significantly worse.Chi-sq test: dif in -2LL is chi-square distributed. Evaluate sig of chi-sq test. Critical Chi-sq value for 1 DF = 3.84A non-sig p-value means that the dropped parameter(s) are non-significant.
Estimates ACE model> univACEFit$ACE.h2 [,1][1,] 0.3137131
> univACEFit$ACE.c2 [,1][1,] 0.3866762
> univACEFit$ACE.e2 [,1][1,] 0.2996108