Developmental Models in Genetic Research
David M. Evans, Wellcome Trust Centre for Human Genetics, Oxford, United Kingdom
Sarah E. Medland, Queensland Institute of Medical Research, Brisbane, Australia
Twin Workshop, Boulder, 2004
• These types of models are appropriate whenever one has repeated measures data
– short term: trials of an experiment
– long term: longitudinal studies
• When we have data from genetically informative individuals (e.g. MZ and DZ twins) it is possible to investigate the genetic and environmental influences affecting the trait over time.
What sorts of questions?
• Are there changes in the magnitude of genetic and environmental effects over time?
• Do the same genetic and environmental influences operate throughout time?
• If there are no cohort effects, then we can answer the first question using a cross-sectional study design
• However, to answer the second question, longitudinal data are required
“Simplex” Structure
Weight1 Weight2 Weight3 Weight4 Weight5 Weight6
Weight1 1.000
Weight2 0.985 1.000
Weight3 0.968 0.981 1.000
Weight4 0.957 0.970 0.985 1.000
Weight5 0.932 0.940 0.964 0.975 1.000
Weight6 0.890 0.897 0.927 0.949 0.973 1.000
From Fischbein (1977)
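The simplex pattern in this matrix (correlations that are highest next to the diagonal and fall away as the interval between occasions grows) can be checked numerically. The following is a minimal sketch using the correlations above; the helper name `mean_by_lag` is hypothetical and not part of the original slides.

```python
import numpy as np

# Lower triangle of the weight correlations reported by Fischbein (1977).
lower = [
    [1.000],
    [0.985, 1.000],
    [0.968, 0.981, 1.000],
    [0.957, 0.970, 0.985, 1.000],
    [0.932, 0.940, 0.964, 0.975, 1.000],
    [0.890, 0.897, 0.927, 0.949, 0.973, 1.000],
]

n = len(lower)
R = np.zeros((n, n))
for i, row in enumerate(lower):
    R[i, : len(row)] = row
# Mirror the lower triangle into the upper triangle (diagonal stays at 1).
R = R + R.T - np.diag(np.diag(R))


def mean_by_lag(corr):
    """Average correlation between occasions that are `lag` steps apart.

    Hypothetical helper for illustration only.
    """
    return {
        lag: float(np.mean(np.diag(corr, k=lag)))
        for lag in range(1, corr.shape[0])
    }


lag_means = mean_by_lag(R)
for lag, value in lag_means.items():
    print(f"lag {lag}: mean correlation {value:.3f}")
```

In a simplex structure the mean correlation should decrease steadily as the lag increases, which a single common factor has difficulty reproducing.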
• “Factor” models tend to fit this type of data poorly (Boomsma & Molenaar, 1987)
• We therefore need a type of model that explicitly takes into account the longitudinal nature of the data
[Path diagram: a single factor A1 with indicators Y1, Y2, Y3 and Y4]
Phenotypic Simplex Model
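[Path diagram: latent variables η1 to η4 linked in a chain by the transmission coefficients β2, β3 and β4; each ηi receives an innovation ζi and loads on its indicator Yi through λi; each Yi has a measurement error εi]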
Y - “indicator variable” ζ - “innovations”
η - “latent variable” λ - “factor loadings”
ε - “measurement error” β - “transmission coefficients”
Measurement Model: Yi = λi ηi + εi
Latent Variable Model: ηi = βi η(i-1) + ζi
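One way to get a feel for these two equations is to generate data from them. The sketch below is an illustration only: the function name `simulate_simplex`, the normal distributions, and all parameter values are assumptions, and the factor loadings are fixed to 1 for simplicity.

```python
import numpy as np


def simulate_simplex(n, betas, innov_var, error_var, seed=0):
    """Simulate n individuals from a simplex process (hypothetical helper).

    betas      : transmission coefficients β2..βT (length T - 1)
    innov_var  : innovation variances var(ζ1)..var(ζT) (length T)
    error_var  : measurement error variances var(ε1)..var(εT) (length T)
    Loadings λ are fixed to 1; normality is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    T = len(innov_var)
    eta = np.zeros((n, T))
    # First occasion: the latent variable is just its innovation.
    eta[:, 0] = rng.normal(0.0, np.sqrt(innov_var[0]), n)
    for t in range(1, T):
        zeta = rng.normal(0.0, np.sqrt(innov_var[t]), n)
        # Latent variable model: η_t = β_t η_(t-1) + ζ_t
        eta[:, t] = betas[t - 1] * eta[:, t - 1] + zeta
    # Measurement model with λ = 1: Y_t = η_t + ε_t
    eps = rng.normal(0.0, np.sqrt(error_var), (n, T))
    return eta + eps


# Arbitrary illustrative values, not estimates from any data set.
Y = simulate_simplex(
    200_000,
    betas=[0.9, 1.1, 0.8],
    innov_var=[1.0, 0.4, 0.3, 0.5],
    error_var=[0.2, 0.2, 0.2, 0.2],
    seed=1,
)
S = np.cov(Y, rowvar=False)
print(np.round(S, 2))
```

With a large sample, the covariance between the first two occasions in `S` should be close to β2 var(ζ1), and the variance at the first occasion close to var(ζ1) + var(ε1).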
ζ - Innovations are standardized to unit variance
λ - Factor loadings are estimated
Alternatively, the scale of the latent variables can be set through the loadings:
ζ - Variances of the innovations are estimated
λ - Factor loadings are constrained to unity
CONSTRAINTS
(1) var(ε1) = var(ε4)
(2) Need at the VERY MINIMUM three measurement occasions
Deriving the Expected Covariance Matrix
Path Analysis
Matrix Algebra
Covariance Algebra
The Rules of Path Analysis
(1) Trace backward along an arrow and then forward, or simply forward from one variable to the other, but NEVER FORWARD AND THEN BACK
(2) The contribution of each chain traced between two variables is the product of its path coefficients
(3) The expected covariance between two variables is the sum of all legitimate routes between the two variables
(4) At any change in a tracing route which is not a two-way arrow connecting different variables in the chain, the expected variance of the variable at the point of change is included in the product of path coefficients
Adapted from Neale & Cardon (1992)
Example of rule (2): the contribution of each chain traced between two variables is the product of its path coefficients
[Path diagram: a latent variable η1 with variance 1 loads on Y1 through λ1 and on Y2 through λ2]
cov(Y1, Y2) = λ1 λ2
Example of rule (3): the expected covariance between two variables is the sum of all legitimate routes between the two variables
[Path diagram: two latent variables η1 and η2, each with variance 1, both load on Y1 and Y2; one route runs through λ1 and λ2, the other through λ3 and λ4]
cov(Y1, Y2) = λ1 λ2 + λ3 λ4
Example of rule (4): at any change in a tracing route which is not a two-way arrow connecting different variables in the chain, the expected variance of the variable at the point of change is included in the product of path coefficients
[Path diagram: η1, with innovation ζ1, is linked to η2 by the transmission coefficient β2; Y1 and Y2 load on η1 and η2 with loadings fixed to 1]
cov(Y1, Y2) = β2 var(ζ1)
For the simplex model with all factor loadings fixed to 1:
cov(y1, y2) = ???
var(y1) = ???
var(y2) = ???
cov(y1, y2) = β2 var(ζ1)
This uses rules (1) and (4): trace backward from Y1 to η1, then forward through β2 to η2 and on to Y2. The route changes direction at η1, so var(η1) = var(ζ1) enters the product.
var(y1) = var(ζ1) + var(ε1)
var(y2) = β2² var(ζ1) + var(ζ2) + var(ε2)
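Expected Phenotypic Covariance Matrix
Writing var(η1) = var(ζ1) and var(ηi) = βi² var(η(i-1)) + var(ζi) for the latent variable variances:
      Y1                   Y2                  Y3                  Y4
Y1    var(η1) + var(ε1)
Y2    β2 var(η1)           var(η2) + var(ε2)
Y3    β2 β3 var(η1)        β3 var(η2)          var(η3) + var(ε3)
Y4    β2 β3 β4 var(η1)     β3 β4 var(η2)       β4 var(η3)          var(η4) + var(ε4)
Written out in full, the diagonal entries are:
var(Y1) = var(ζ1) + var(ε1)
var(Y2) = β2² var(ζ1) + var(ζ2) + var(ε2)
var(Y3) = β3² (β2² var(ζ1) + var(ζ2)) + var(ζ3) + var(ε3)
var(Y4) = β4² (β3² (β2² var(ζ1) + var(ζ2)) + var(ζ3)) + var(ζ4) + var(ε4)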
This can be expressed compactly in matrix algebra form:
Σ = (I - B)^-1 Ψ ((I - B)^-1)' + Θε
Σ is the expected covariance matrix
I is an identity matrix
B is the matrix of transmission coefficients
Ψ is the matrix of innovation variances
Θε is the matrix of measurement error variances
Ψ =
var(ζ1)  0        0        0
0        var(ζ2)  0        0
0        0        var(ζ3)  0
0        0        0        var(ζ4)

Θε =
var(ε1)  0        0        0
0        var(ε2)  0        0
0        0        var(ε3)  0
0        0        0        var(ε4)

B =
0   0   0   0
β2  0   0   0
0   β3  0   0
0   0   β4  0
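The matrix expression can be evaluated directly. The following NumPy sketch builds B, Ψ and Θε as laid out above and returns the expected covariance matrix; the function name `simplex_cov` and the numerical values are illustrative assumptions, not part of the original slides or of Mx.

```python
import numpy as np


def simplex_cov(betas, innov_var, error_var):
    """Expected covariance matrix (I - B)^-1 Ψ ((I - B)^-1)' + Θε.

    betas      : transmission coefficients β2..βT (length T - 1)
    innov_var  : innovation variances var(ζ1)..var(ζT)
    error_var  : measurement error variances var(ε1)..var(εT)
    Factor loadings are assumed to be fixed to 1.
    """
    T = len(innov_var)
    B = np.zeros((T, T))
    # Transmission coefficients sit on the first subdiagonal of B.
    B[np.arange(1, T), np.arange(T - 1)] = betas
    Psi = np.diag(innov_var)
    Theta = np.diag(error_var)
    inv = np.linalg.inv(np.eye(T) - B)
    return inv @ Psi @ inv.T + Theta


# Arbitrary illustrative values for four occasions.
Sigma = simplex_cov(
    betas=[0.9, 1.1, 0.8],
    innov_var=[2.0, 1.0, 0.5, 0.7],
    error_var=[0.3, 0.3, 0.3, 0.3],
)
print(np.round(Sigma, 3))
```

Comparing individual entries of `Sigma` with the path-tracing results (for example, the entry for Y1 and Y2 against β2 var(ζ1)) is a useful check that the two derivations agree.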
(1) Draw path model
(2) Use path analysis to derive the expected covariance matrix
(3) Decompose the expected covariance matrix into simple matrices
(4) Write out matrix formulae
(5) Implement in Mx
Phenotypic Simplex Model: Mx Example
Data taken from Fischbein (1977): 66 females had their weight measured six times, at 6-month intervals, from 11.5 years of age.
Phenotypic Simplex Model: Results
Time   Latent variable variance                  Error      Total
       βn² x var(η(n-1)) + var(ζn)               variance   variance
1      -                          51.34          0.13       51.47
2      1.05² x 51.34 + 1.50     = 58.02          0.13       58.15
3      1.03² x 58.02 + 2.07     = 63.52          0.13       63.66
4      1.06² x 63.52 + 1.86     = 72.69          0.13       72.82
5      0.97² x 72.69 + 3.27     = 71.50          0.13       71.64
6      0.94² x 71.50 + 3.27     = 66.72          0.13       66.86
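Each row of the results table applies the same recursion, var(ηn) = βn² x var(η(n-1)) + var(ζn). The sketch below recomputes the rows from the printed estimates; the helper name `one_step_latent_variance` is hypothetical. Because the printed coefficients are rounded to two decimals, the recomputed values only match the table approximately.

```python
# Estimates as printed in the results table (rounded to two decimals).
betas = [1.05, 1.03, 1.06, 0.97, 0.94]                    # β2 .. β6
innovation_var = [51.34, 1.50, 2.07, 1.86, 3.27, 3.27]    # var(ζ1) .. var(ζ6)
reported_latent = [51.34, 58.02, 63.52, 72.69, 71.50, 66.72]
error_var = 0.13


def one_step_latent_variance(beta, prev_latent_var, innov_var):
    """var(η_n) = β_n² x var(η_(n-1)) + var(ζ_n). Hypothetical helper."""
    return beta ** 2 * prev_latent_var + innov_var


# Time 1 has no predecessor, so its latent variance is var(ζ1).
recomputed = [innovation_var[0]]
for n in range(1, len(innovation_var)):
    # As in the table, each row starts from the reported variance of the
    # previous occasion.
    recomputed.append(
        one_step_latent_variance(
            betas[n - 1], reported_latent[n - 1], innovation_var[n]
        )
    )

totals = [v + error_var for v in recomputed]
for time, (latent, total) in enumerate(zip(recomputed, totals), start=1):
    print(f"time {time}: latent {latent:.2f}, total {total:.2f}")
```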
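[Path diagram: three parallel simplex chains for the additive genetic (A1 to A4), common environmental (C1 to C4) and unique environmental (E1 to E4) latent variables, with transmission coefficients βa, βc and βe, innovations ζa, ζc and ζe, and factor loadings λa, λc and λe on the phenotypes y1 to y4; each phenotype has a measurement error ε]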
Measurement Model: yi = λai Ai + λci Ci + λei Ei + εi
Latent Variable Model: Ai = βai A(i-1) + ζai
                       Ci = βci C(i-1) + ζci
                       Ei = βei E(i-1) + ζei
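To show how these equations translate into expected covariances for twin pairs, here is a minimal sketch. It assumes the standard twin-model expectations, which are not spelled out on the slides: MZ twins share all additive genetic and common environmental variance, DZ twins share half of the additive genetic variance, and unique environment and measurement error are not shared. Loadings are fixed to 1, and the function names `latent_cov` and `twin_cov` and all parameter values are hypothetical.

```python
import numpy as np


def latent_cov(betas, innov_var):
    """Covariance of one simplex chain: (I - B)^-1 Ψ ((I - B)^-1)'."""
    T = len(innov_var)
    B = np.zeros((T, T))
    B[np.arange(1, T), np.arange(T - 1)] = betas
    inv = np.linalg.inv(np.eye(T) - B)
    return inv @ np.diag(innov_var) @ inv.T


def twin_cov(A, C, E, error_var, r_a):
    """Expected 2T x 2T covariance matrix for a twin pair.

    r_a is the additive genetic correlation between twins
    (assumed 1.0 for MZ pairs and 0.5 for DZ pairs).
    """
    within = A + C + E + np.diag(error_var)   # same twin, across occasions
    cross = r_a * A + C                        # twin 1 with twin 2
    return np.block([[within, cross], [cross, within]])


# Arbitrary illustrative parameters for four occasions.
A = latent_cov([1.0, 0.9, 1.1], [4.0, 1.0, 1.0, 1.0])
C = latent_cov([0.8, 0.8, 0.8], [1.0, 0.2, 0.2, 0.2])
E = latent_cov([0.5, 0.5, 0.5], [1.0, 1.0, 1.0, 1.0])
err = [0.1, 0.1, 0.1, 0.1]

mz = twin_cov(A, C, E, err, r_a=1.0)
dz = twin_cov(A, C, E, err, r_a=0.5)
print(np.round(mz[:4, 4:], 2))   # MZ cross-twin block
print(np.round(dz[:4, 4:], 2))   # DZ cross-twin block
```

The difference between the MZ and DZ cross-twin blocks is what carries the information about the genetic chain at each occasion and across occasions.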
[Same path diagram, with all factor loadings fixed to 1]
Genetic Simplex Model: Mx Example
• Equate measurement error across all time points
• Drop the measurement error structure from the model
– Where will the measurement error go?
• Can you drop the common environmental structure from the model?
Genetic Simplex Model: Results
Time   Genetic variance                      Environmental variance              Total
       var(ζn) + βn² x var(A(n-1))           var(ζn) + βn² x var(E(n-1))
1      4.79²                    = 22.98      1.82²                   = 3.30      26.28
2      1.12² + 1.05² x 22.98    = 26.72      0.56² + 0.92² x 3.30    = 3.09      29.81
3      1.50² + 1.04² x 26.72    = 31.40      0.98² + 1.05² x 3.09    = 4.39      35.79
4      1.23² + 1.02² x 31.40    = 34.07      0.95² + 0.85² x 4.39    = 4.08      38.15
5      1.39² + 1.02² x 34.07    = 37.57      0.81² + 0.85² x 4.08    = 3.55      41.12
6      ? + ? x 37.57            = ?          ? + ? x 3.55            = ?         ?
Useful References
• Boomsma D. I. & Molenaar P. C. (1987). The genetic analysis of repeated measures. I. Simplex models. Behav Genet, 17(2), 111-23.
• Boomsma D. I., Martin, N. G. & Molenaar P. C. (1989). Factor and simplex models for repeated measures: application to two psychomotor measures of alcohol sensitivity in twins. Behav Genet, 19(1), 79-96.
Genetic Simplex Model: Results
Time   Genetic variance                      Environmental variance              Total
       var(ζn) + βn² x var(A(n-1))           var(ζn) + βn² x var(E(n-1))
1      4.79²                    = 22.98      1.82²                   = 3.30      26.28
2      1.12² + 1.05² x 22.98    = 26.72      0.56² + 0.92² x 3.30    = 3.09      29.81
3      1.50² + 1.04² x 26.72    = 31.40      0.98² + 1.05² x 3.09    = 4.39      35.79
4      1.23² + 1.02² x 31.40    = 34.07      0.95² + 0.85² x 4.39    = 4.08      38.15
5      1.39² + 1.02² x 34.07    = 37.57      0.81² + 0.85² x 4.08    = 3.55      41.12
6      1.39² + 0.97² x 37.57    = 37.40      1.00² + 1.01² x 3.55    = 4.62      42.02
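One way to read these results against the first question posed earlier (do the magnitudes of genetic and environmental effects change over time?) is to express the genetic variance as a proportion of the total at each occasion. The sketch below uses only the variances printed in the table; the proportion itself is a derived quantity that does not appear on the slides.

```python
# Variances as printed in the results table, times 1 to 6.
genetic = [22.98, 26.72, 31.40, 34.07, 37.57, 37.40]
environmental = [3.30, 3.09, 4.39, 4.08, 3.55, 4.62]
total = [26.28, 29.81, 35.79, 38.15, 41.12, 42.02]

# Proportion of the total variance attributed to the genetic chain.
genetic_proportion = [g / (g + e) for g, e in zip(genetic, environmental)]

for time, (g, e, t, p) in enumerate(
    zip(genetic, environmental, total, genetic_proportion), start=1
):
    # The printed total should equal genetic + environmental up to rounding.
    print(f"time {time}: genetic {g:.2f} + environmental {e:.2f} "
          f"= {g + e:.2f} (table: {t:.2f}), genetic proportion {p:.2f}")
```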