The role of environmental heterogeneity in meta-analysis of gene-environment interaction
Bhramar Mukherjee
Department of Biostatistics, University of Michigan
E-mail: [email protected]
Workshop on Emerging Statistical Challenges and Methods for Analysis of Massive Genomic Data in Complex Human Disease Studies, Banff
Summary Data: BIRS 14w5011
June 26, 2014 Meta-analysis of G x E 1 / 43
Outline
I saw a brown bear!
It was magical!
Gene-Environment-Wide interaction studies (GEWIS), or GE-Whiz according to Duncan Thomas!
Statistical Challenges
Data harmonization across cohorts.
Misclassification/measurement error in E.
Multiple testing, computational time.
Optimal search strategies for discovery and replication.
Timing of exposure measurement.
Model misspecification in main effects of E and interaction term.
Prohibitive sample size requirement.
Scale dependence of interaction.
G x E Search Strategies for Case-Control Data
Single Step Methods
  Case-Control, Case-Only, Empirical Bayes, Bayesian Model Averaging.
Two-Step/Hybrid Methods
  Use screening step followed by subset or weighted p-value testing.
  Screening test independent of final testing step.
  Use marginal genetic association or GE-correlation or both as screen.
  Kooperberg-LeBlanc, Two-step, Hybrid, Cocktail, EDGxE.
Gene-discovery methods that use interaction
  Joint 2 df tests for main effects and G x E interaction.
Meta Analysis of G x E
In press
Undeniable importance in the scientific literature
Meta-analysis of marginal genetic associations
Full efficiency results for fixed effect(s) models
Limited statistical literature on meta-analysis of GEI
Lots of interesting problems, thus...
Time for some creative statistics
Methods
Meta analysis of GEI
Possible choices:
Individual Patient Data (IPD) or mega analysis.
Meta-analysis: study level summary statistics (estimates, standard errors of parameters).
Meta-regression: use study level covariates to explain the heterogeneity among study-specific effects.
Rich literature on meta-analysis in clinical trials
Notations
Y: quantitative trait
E: continuous environmental exposure
G: SNP with genotypes AA, Aa and aa (A: minor allele)
  dominant (G = 1 if AA or Aa; G = 0 if aa)
  additive (G = 2 if AA; G = 1 if Aa; G = 0 if aa)
  co-dominant (G1 = 1 if Aa and 0 otherwise; G2 = 1 if AA and 0 otherwise)
Z: covariates/confounders
K: number of independent studies
$n_k$: number of participants in the k-th study, k = 1, ..., K
$N = \sum_{k=1}^{K} n_k$: total number of participants
Subscript ki: index for participant i in study k, i = 1, ..., $n_k$, k = 1, ..., K
• A set of studies investigating Type 2 Diabetes (T2D): 8 European cohorts

                                               Quantitative traits   Case-control study
Cohort      Country   Study                    available (b)         Case    Control   Total
DIAGEN      Germany   cohort                   1510                  421     622       1043
D2D2007     Finland   cohort                   2693                  287     1043      1330
DPS         Finland   randomized trial         433                   -       -         -
FUSION-FS   Finland   subset of FUSION         172                   -       -         -
FUSION-S2   Finland   matched case-control     2730                  624     794       1418
METSIM      Finland   male cohort              1456                  632     603       1235
HUNT        Norway    case-control             1324                  511     721       1232
TROMSO      Norway    matched case-control     1411                  644     693       1337
In total                                       11729                 3119    4476      7595

(b) Quantitative traits: high-density lipoprotein cholesterol (HDL), LDL, total cholesterol and many other T2D related traits are available on most genotyped participants.

Today's example
  SNPs in FTO: associated with T2D and BMI (Voight et al. 2010)
  age, gender, BMI: associated with T2D and HDL (Kim et al. 2013)
  Do SNPs in FTO (G) modify the effect of age or BMI (E) on log HDL (Y) after adjusting for gender and T2D status (Z)?
T2D study: Descriptive summary
                       E                           Z            G           Y
                       Age (year)   BMI (kg/m2)    Gender       rs1121980   HDL (mmol/l)
Study          N       mean (SD)    mean (SD)      female (%)   MAF         mean (SD)
D2D2007        2693    59.9 (8.4)   27.5 (4.8)     52           0.41        1.44 (0.35)
DIAGEN         1510    63.3 (14.3)  27.9 (5.2)     55           0.46        1.45 (0.47)
DPS            433     55.1 (7.1)   31.3 (4.6)     68           0.44        1.22 (0.29)
FUSION-FS      172     38.6 (10.9)  26.2 (4.9)     55           0.43        1.29 (0.32)
FUSION-S2      2730    57.2 (8.4)   27.9 (5.1)     44           0.40        1.45 (0.41)
HUNT           1324    67.2 (13.1)  28.0 (4.4)     48           0.47        1.26 (0.38)
METSIM         1456    56.3 (6.6)   27.9 (4.7)     0.0          0.44        1.42 (0.40)
TROMSO         1411    59.9 (12.5)  27.6 (4.7)     50           0.49        1.43 (0.42)
Entire study   11729   59.6 (11.1)  27.9 (4.9)     44           0.43        1.41 (0.40)

SD: standard deviation; MAF: minor allele frequency.
1. Underlying model and IPD analysis
• Fixed-effect model (FEM)
  $Y_{ki} = \alpha_k + \beta_G G_{ki} + \beta_E E_{ki} + \delta G_{ki} E_{ki} + \varepsilon_{ki}$   (1)
  $\alpha_k$: study-specific intercept
  $\beta_G$ and $\beta_E$: main effects of G and E
  $\delta$: GEI
  $\varepsilon_{ki} \sim N(0, \sigma_k^2)$
• Interpretation of GEI parameter $\delta$: effect of E in subgroups defined by G, e.g. dominant
  G = 0: $\beta_E$
  G = 1: $\beta_E + \delta$
• IPD analysis: fit model (1) using individual level data
  Advantage: efficiency; 'gold standard'
  Disadvantage: practical challenges with sharing raw data
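As an illustration of the IPD analysis of model (1), here is a minimal sketch in Python. It is not the speaker's implementation: the function name `fit_ipd`, the simulated data and all parameter values are hypothetical, and it assumes a common residual variance across studies for simplicity, whereas the slide allows a separate variance per study.

```python
import numpy as np

def fit_ipd(y, g, e, study):
    """Least-squares fit of Y = alpha_k + bG*G + bE*E + delta*G*E + error.

    Returns estimates of (beta_G, beta_E, delta) and their 3x3 covariance.
    Assumes one common residual variance (a simplification of the slide's
    study-specific sigma_k^2).
    """
    y, g, e = (np.asarray(a, dtype=float) for a in (y, g, e))
    labels, idx = np.unique(np.asarray(study), return_inverse=True)
    intercepts = np.eye(len(labels))[idx]        # one dummy column per study
    X = np.column_stack([intercepts, g, e, g * e])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[-3:], cov[-3:, -3:]

# Simulated individual-level data (all values hypothetical)
rng = np.random.default_rng(0)
K, n = 5, 400
study = np.repeat(np.arange(K), n)
g = rng.binomial(2, 0.4, size=K * n).astype(float)   # additive coding
e = rng.normal(50 + study, 2.0)                      # study means of E differ
y = (1.0 + 0.1 * study + 0.3 * g + 0.05 * e + 0.02 * g * e
     + rng.normal(0.0, 1.0, K * n))

beta_hat, cov_hat = fit_ipd(y, g, e, study)
delta_hat, se_delta = beta_hat[2], np.sqrt(cov_hat[2, 2])
```

The last three coefficients are the common parameters; the study dummies absorb the study-specific intercepts.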
Revisiting the efficiency results of Lin and Zeng

$Y_{ki} = \alpha_k + \beta_G G_{ki} + \beta_E E_{ki} + \delta G_{ki} E_{ki} + \varepsilon_{ki}$, $\varepsilon_{ki} \sim N(0, \sigma_k^2)$   (2)

($\alpha_k$, $\sigma_k^2$): study-specific nuisance parameters
($\beta_G$ and $\beta_E$): common nuisance parameters
$\delta$: common parameter of interest

To retain the full efficiency of IPD, the study-specific estimates of the common parameters ($\beta_G$, $\beta_E$, $\delta$) must be pooled as a vector, with the estimated multivariate (3×3) information matrices as weights.
Pooling only a subset of the common parameters, say just the univariate inverse-variance weighted estimate of $\delta$, loses efficiency except in some special cases.
2a. Meta-analysis: UIVW estimator
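• Univariate inverse-variance weighted (UIVW) estimator
  Collect $\hat\delta_k$ and $\hat v(\hat\delta_k)$ from each study k
  Fixed-effect meta-analysis: $\hat\delta_k \sim N(\delta, v(\hat\delta_k))$
  $v(\hat\delta_k)$: asymptotic model-based variance of $\hat\delta_k$
  'standard condition': $v(\hat\delta_k)$ can be estimated by $\hat v(\hat\delta_k)$ with negligible error (e.g. DerSimonian and Laird 1986, Whitehead and Whitehead 1991, Lin and Zeng 2010)
• $\hat\delta_{UIVW} = \{\sum_k \hat v(\hat\delta_k)^{-1}\}^{-1} \sum_k \hat v(\hat\delta_k)^{-1} \hat\delta_k$ and $v(\hat\delta_{UIVW}) = \{\sum_k \hat v(\hat\delta_k)^{-1}\}^{-1}$
  Advantage: only need study level estimates ($\hat\delta_k$ and $\hat v(\hat\delta_k)$)
  Disadvantage: potential efficiency loss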
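The UIVW formula can be sketched in a few lines of Python. The function name `uivw` and the three study-level inputs are hypothetical, chosen only to show the arithmetic.

```python
import numpy as np

def uivw(delta_hat, var_hat):
    """Univariate inverse-variance weighted pooling of study-level estimates."""
    delta_hat = np.asarray(delta_hat, dtype=float)
    weights = 1.0 / np.asarray(var_hat, dtype=float)   # v(delta_k)^{-1}
    pooled_var = 1.0 / weights.sum()
    pooled = pooled_var * np.sum(weights * delta_hat)
    return pooled, pooled_var

# Hypothetical interaction estimates and variances from three studies
est, var = uivw([0.10, 0.20, 0.15], [0.04, 0.01, 0.02])
```

Studies with smaller variances get larger weights, so the pooled estimate is pulled toward the most precise study.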
2b. Meta-analysis: MIVW estimator (Lin and Zeng, 2010)
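• Multivariate inverse-variance weighted (MIVW) estimator
  Consider the vector: $\beta = (\beta_G, \beta_E, \delta)$
  Collect $\hat\beta_k = (\hat\beta_{Gk}, \hat\beta_{Ek}, \hat\delta_k)$ and its estimated 3×3 covariance matrix $\hat v(\hat\beta_k)$
• $\hat\beta_{MIVW} = \{\sum_k \hat v(\hat\beta_k)^{-1}\}^{-1} \sum_k \hat v(\hat\beta_k)^{-1} \hat\beta_k$ and $v(\hat\beta_{MIVW}) = \{\sum_k \hat v(\hat\beta_k)^{-1}\}^{-1}$
  Obtain $\hat\delta_{MIVW}$ and $v(\hat\delta_{MIVW})$ from the corresponding elements of $\hat\beta_{MIVW}$ and $v(\hat\beta_{MIVW})$.
  Advantage: asymptotically as efficient as $\hat\delta_{IPD}$
  Disadvantage: the estimated multivariate covariance matrix $\hat v(\hat\beta_k)$ is rarely available in published results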
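A minimal Python sketch of the MIVW pooling step follows. The function name `mivw` and the two studies' estimates and covariance matrices are hypothetical; only the matrix-weighted formula comes from the slide.

```python
import numpy as np

def mivw(beta_hats, cov_hats):
    """Multivariate inverse-variance weighted pooling of (beta_G, beta_E, delta)."""
    infos = [np.linalg.inv(np.asarray(c, dtype=float)) for c in cov_hats]
    pooled_cov = np.linalg.inv(sum(infos))
    weighted = sum(info @ np.asarray(b, dtype=float)
                   for info, b in zip(infos, beta_hats))
    return pooled_cov @ weighted, pooled_cov

# Two hypothetical studies: estimates and 3x3 covariance matrices
beta_1 = [0.10, 0.20, 0.10]
beta_2 = [0.30, 0.10, 0.20]
cov_1 = [[0.010, 0.001, -0.002],
         [0.001, 0.020, -0.003],
         [-0.002, -0.003, 0.040]]
cov_2 = [[0.020, 0.002, -0.001],
         [0.002, 0.010, -0.002],
         [-0.001, -0.002, 0.010]]

beta_mivw, cov_mivw = mivw([beta_1, beta_2], [cov_1, cov_2])
delta_mivw, var_delta_mivw = beta_mivw[2], cov_mivw[2, 2]   # third element is delta
```

When every covariance matrix is diagonal, the third element reduces to the univariate inverse-variance weighted estimate of the interaction; the off-diagonal terms are what the multivariate version additionally uses.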
3. Meta-regression
• Intuition: interaction model (1) implies that the Y-G association depends linearly on E
• Two-stage meta-regression
  Stage 1: estimate the marginal effect of G to obtain $\hat\lambda_k$ and $\hat v(\hat\lambda_k)$
    $Y_{ki} = \lambda_{0k} + \lambda_k G_{ki} + \eta_{ki}$, i = 1, ..., $n_k$
  Stage 2: examine whether the marginal effect estimates ($\hat\lambda_k$) depend on the study-level means of E ($m_k = \sum_i E_{ki}/n_k$)
    $\hat\lambda_k = \gamma_0 + \gamma m_k + \varepsilon_k$, $\varepsilon_k \sim N(0, \hat v(\hat\lambda_k))$, k = 1, ..., K
• $\hat\delta_{MR}$: weighted least squares estimator of $\gamma$
  Advantage: can identify interaction with $\hat\lambda_k$, $\hat v(\hat\lambda_k)$ and $m_k$
  Disadvantage: huge potential for ecological bias and hard to implement with few studies
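Stage 2 of the meta-regression is a weighted least squares fit with known variances, which can be sketched as below. The function name `meta_regression` and the four studies' inputs are hypothetical; the variance of the slope is taken from the inverse of the weighted cross-product matrix, which assumes the stage 1 variances are known without error.

```python
import numpy as np

def meta_regression(lam_hat, var_lam, m):
    """WLS regression of marginal G effects on study-level means of E.

    Returns the slope (the meta-regression estimate of the interaction)
    and its variance.
    """
    lam_hat, var_lam, m = (np.asarray(a, dtype=float)
                           for a in (lam_hat, var_lam, m))
    W = np.diag(1.0 / var_lam)                    # weights v(lambda_k)^{-1}
    X = np.column_stack([np.ones_like(m), m])     # intercept gamma_0 and slope gamma
    cov = np.linalg.inv(X.T @ W @ X)
    coef = cov @ X.T @ W @ lam_hat
    return coef[1], cov[1, 1]

# Hypothetical marginal effects that depend linearly on the study mean of E
m_k = np.array([45.0, 50.0, 55.0, 60.0])
lam_k = 0.5 + 0.02 * m_k
var_k = np.array([0.01, 0.02, 0.01, 0.03])

delta_mr, var_delta_mr = meta_regression(lam_k, var_k, m_k)
```

With only K points in the regression, the slope variance shrinks only when the study means of E are well spread out, which is why the approach is hard to use with few or homogeneous studies.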
4. Adaptively Weighted Estimator (AWE)
• Motivation: a combined estimator that uses only univariate summary statistics.
• Proposal: $\hat\delta_{AWE}(w) = w \hat\delta_{UIVW} + (1-w) \hat\delta_{MR}$, $0 \le w \le 1$
  Find w that minimizes $v(\hat\delta_{AWE}(w))$
  Need to know $\mathrm{cov}(\hat\delta_{UIVW}, \hat\delta_{MR})$
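The reason the covariance is needed can be seen from a short worked derivation. The shorthand $v_U = v(\hat\delta_{UIVW})$, $v_M = v(\hat\delta_{MR})$ and $c = \mathrm{cov}(\hat\delta_{UIVW}, \hat\delta_{MR})$ is introduced here for brevity and is not notation from the slides.

```latex
\begin{align*}
v\{\hat\delta_{AWE}(w)\} &= w^2 v_U + (1-w)^2 v_M + 2w(1-w)\,c, \\
\frac{\partial}{\partial w}\, v\{\hat\delta_{AWE}(w)\}
  &= 2w\,v_U - 2(1-w)\,v_M + 2(1-2w)\,c = 0
  \;\Longrightarrow\;
  w^{*} = \frac{v_M - c}{v_U + v_M - 2c}.
\end{align*}
```

When $c = 0$ this reduces to $w^{*} = v_M/(v_U + v_M) = v_U^{-1}/(v_U^{-1} + v_M^{-1})$, the usual inverse-variance weight, assuming the solution lies in $[0, 1]$.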
Weights
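LEMMA.
For k = 1, ..., K, $\mathrm{cov}(\hat\delta_k, \hat\lambda_k) = 0$ $\Rightarrow$ $\mathrm{cov}(\hat\delta_{UIVW}, \hat\delta_{MR}) = 0$
RESULT.
$v(\hat\delta_{AWE}(w))$ attains its minimum, $[v(\hat\delta_{UIVW})^{-1} + v(\hat\delta_{MR})^{-1}]^{-1}$, if and only if the weight on $\hat\delta_{UIVW}$ is $w = \{v(\hat\delta_{UIVW})^{-1} + v(\hat\delta_{MR})^{-1}\}^{-1} v(\hat\delta_{UIVW})^{-1}$.
• $\hat\delta_{AWE} = \{v(\hat\delta_{UIVW})^{-1} + v(\hat\delta_{MR})^{-1}\}^{-1} \{v(\hat\delta_{UIVW})^{-1} \hat\delta_{UIVW} + v(\hat\delta_{MR})^{-1} \hat\delta_{MR}\}$
  $\hat\delta_{AWE}$: inverse-variance weighted combination of $\hat\delta_{UIVW}$ and $\hat\delta_{MR}$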
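The combination step is simple enough to check numerically. The sketch below uses a hypothetical helper `awe` and, as inputs, the UIVW and MR estimates and standard errors for the rs1121980 × BMI interaction from the T2D results table later in the talk (values multiplied by 1000). Up to rounding, the output should land close to the AWE row of that table.

```python
import math

def awe(delta_uivw, var_uivw, delta_mr, var_mr):
    """Inverse-variance weighted combination of the UIVW and MR estimates.

    Assumes the two estimates are uncorrelated, as stated in the lemma.
    Returns (estimate, variance, weight on UIVW).
    """
    prec_u, prec_m = 1.0 / var_uivw, 1.0 / var_mr
    var_awe = 1.0 / (prec_u + prec_m)
    w = var_awe * prec_u
    return w * delta_uivw + (1.0 - w) * delta_mr, var_awe, w

# UIVW and MR rows of the T2D results table (estimates and SEs x 1000)
est, var, w = awe(1.936, 0.700 ** 2, -0.062, 3.484 ** 2)
se = math.sqrt(var)
```

Because the MR standard error is about five times the UIVW standard error in this example, nearly all of the weight goes to UIVW.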
Notation: heterogeneity in E
• Classical ANOVA partitioning: TSS(E) = BSS(E) + WSS(E)
  $TSS = N s_E^2$, total sum of squares of E
  $BSS = \sum_k n_k (m_k - m)^2$, between-study sum of squares of E
  $WSS = \sum_k n_k s_{E_k}^2$, within-study sum of squares of E
  where
  $m_k = \sum_i E_{ki}/n_k$, $s_{E_k}^2 = \frac{1}{n_k} \sum_i (E_{ki} - m_k)^2$, for the k-th study
  $m = \sum_{k,i} E_{ki}/N$, $s_E^2 = \frac{1}{N} \sum_{k,i} (E_{ki} - m)^2$, combining all studies
• $TSS/N \xrightarrow{p} tss$, $WSS/N \xrightarrow{p} wss$, $BSS/N \xrightarrow{p} bss$, as $N \to \infty$
  assume $n_k/N \to r_k \in (0, 1)$ as $N \to \infty$
  $tss = \sigma_E^2$, $wss = \sum_k r_k \sigma_{E_k}^2$ and $bss = \sum_k r_k (\mu_k - \mu)^2$
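The partitioning can be verified on a toy example. The sketch below assumes individual-level values of E are available per study; the function name `e_heterogeneity` and the two small studies are hypothetical.

```python
import numpy as np

def e_heterogeneity(e_by_study):
    """Return (TSS, BSS, WSS) of E for a list of per-study arrays."""
    studies = [np.asarray(e, dtype=float) for e in e_by_study]
    all_e = np.concatenate(studies)
    m = all_e.mean()                                   # grand mean
    tss = np.sum((all_e - m) ** 2)
    bss = sum(len(e) * (e.mean() - m) ** 2 for e in studies)
    wss = sum(np.sum((e - e.mean()) ** 2) for e in studies)
    return tss, bss, wss

# Two hypothetical studies with different means of E
tss, bss, wss = e_heterogeneity([[1, 2, 3], [4, 5, 6, 7]])
```

In this toy example most of the variation in E is between studies, the situation in which study-level means carry most of the information about how the genetic effect changes with E.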
Summary of the methods
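Method   Data shared                                                  Bias                                                              ARE (a)
IPD      individual level data                                        unbiased                                                          1
UIVW     $\hat\delta_k$, $\hat v(\hat\delta_k)$                       unbiased                                                          wss/tss
MIVW     $\hat\beta_k$, $\hat v(\hat\beta_k)$                         unbiased                                                          1
MR       $\hat\lambda_k$, $\hat v(\hat\lambda_k)$ and $m_k$           unbiased under G-E independence, ecological bias in general       bss/tss
AWE      $\hat\delta_k$, $\hat v(\hat\delta_k)$, $\hat\lambda_k$,     unbiased under G-E independence, bias adaptively controlled       1
         $\hat v(\hat\lambda_k)$ and $m_k$

(a) ARE: asymptotic relative efficiency relative to $\hat\delta_{IPD}$ under certain assumptions

Under certain assumptions
$\hat\delta_{AWE} \approx \frac{WSS}{BSS + WSS} \hat\delta_{UIVW} + \frac{BSS}{BSS + WSS} \hat\delta_{MR}$.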
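To get a feel for these approximate weights, the sketch below computes WSS/(BSS + WSS) and BSS/(BSS + WSS) for age from the study-level N, mean and SD in the T2D descriptive summary. The function name `approx_awe_weights` is hypothetical, and the reported SDs are treated as if they used divisor $n_k$, which is an approximation.

```python
import numpy as np

# Study-level N, mean age and SD of age from the T2D descriptive summary
# (order: D2D2007, DIAGEN, DPS, FUSION-FS, FUSION-S2, HUNT, METSIM, TROMSO)
n = np.array([2693, 1510, 433, 172, 2730, 1324, 1456, 1411], dtype=float)
mean_age = np.array([59.9, 63.3, 55.1, 38.6, 57.2, 67.2, 56.3, 59.9])
sd_age = np.array([8.4, 14.3, 7.1, 10.9, 8.4, 13.1, 6.6, 12.5])

def approx_awe_weights(n, m_k, sd_k):
    """Approximate AWE weights on UIVW and MR from study-level summaries."""
    m = np.sum(n * m_k) / np.sum(n)          # pooled mean of E
    bss = np.sum(n * (m_k - m) ** 2)         # between-study sum of squares
    wss = np.sum(n * sd_k ** 2)              # within-study SS (divisor-n approximation)
    return wss / (bss + wss), bss / (bss + wss)

w_uivw, w_mr = approx_awe_weights(n, mean_age, sd_age)
```

These are the limiting weights under the slide's assumptions; the estimator itself uses the estimated variances of the UIVW and MR estimates rather than these ratios.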
Similar in spirit to the recommendation of Simmonds and Higgins (2007)
Adaptive control of bias in AWE
Figure: absolute relative bias of MR and AWE plotted against BSS/TSS (0.0 to 1.0), in two panels: additive model and dominant model.
Simulation study
• Relative performances of IPD, UIVW, MIVW, MR and AWE
• Simulation scenarios:
  1. Lack of a common set of confounders to adjust for across studies
     $Y_{ki} = \alpha_k + \beta_G G_{ki} + \beta_E E_{ki} + \delta G_{ki} E_{ki} + \beta_{Z_k}^{T} Z_k + \varepsilon_{ki}$

     $Z_k$      gender   race   smoke
     study 1    ✓        ✓      ✓
     study 2    ✓        ✓      NA
     study 3    ✓        NA     ✓
     ...        ...      ...    ...

     $\hat\delta_{NIPD}$: naive IPD analysis based on the subset of Z that is commonly available to ALL studies
  2. Non-linear interaction
• K = 20 studies, N = 10,000 participants, $n_k$ ranging from 200 to 2000
Simulation results: typical power graph
Figure: Comparison of the proposed methods in terms of empirical power. Power (0.0 to 1.0) is plotted against the Type III R2 of the G × E interaction (%) (0.0 to 0.2) in six panels: dominant, additive and co-dominant models, each under WSS < BSS and WSS > BSS. Methods shown: IPD, UIVW, REM, MIVW, MR, AWE, TS.
Power: lack of common set of confounders to adjust
Figure: Power comparison under lack of a common set of confounders to adjust for. Power (0.0 to 1.0) is plotted against the Type III R2 of the G × E interaction (%) (0.0 to 0.2) in six panels: dominant, additive and co-dominant models, each under WSS < BSS and WSS > BSS. Methods shown: IPD, NIPD, UIVW, REM, MIVW, MR, AWE, TS.
Performance regarding estimation and testing of GEI was relatively unchanged (also noted in VanderWeele et al. 2013).
Non-linear interaction
Figure: Non-linear G-E interaction model and the covariate heterogeneity of E (sigmoid curve: true association $\beta_G(E) = 2\exp(E-50)/\{1+\exp(E-50)\} + 2$). Boxplots show the covariate heterogeneity of E across studies (x-axis: value of E, 40 to 60); the curve shows the true non-linear relationship $\beta_G(E)$ (y-axis: 2 to 4).

Study   $\mu_k$   $\sigma_{E_k}$
1       45.6      1.8
2       46.1      2.0
3       46.5      1.5
4       46.9      2.0
5       47.6      3.5
6       47.9      1.5
7       48.4      1.7
8       48.9      1.4
9       49.6      2.1
10      50.1      3.5
11      50.8      3.5
12      50.9      1.7
13      51.6      1.9
14      52.0      2.1
15      52.7      3.5
16      53.0      1.9
17      53.5      2.0
18      53.9      1.6
19      54.6      1.7
20      55.0      1.3
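To make the sigmoid concrete, the sketch below evaluates the true genetic effect $\beta_G(E)$ from the figure caption, and its derivative with respect to E (interpreted here as the local strength of the interaction, which is my reading rather than a definition from the slides), at the 20 study means of E. The function names are hypothetical.

```python
import numpy as np

def beta_g(e):
    """True genetic effect as a function of E (sigmoid from the figure caption)."""
    e = np.asarray(e, dtype=float)
    s = np.exp(e - 50.0) / (1.0 + np.exp(e - 50.0))
    return 2.0 * s + 2.0

def local_gei(e):
    """Derivative of beta_g with respect to E: 2 * s * (1 - s)."""
    e = np.asarray(e, dtype=float)
    s = np.exp(e - 50.0) / (1.0 + np.exp(e - 50.0))
    return 2.0 * s * (1.0 - s)

# Study means of E as listed on the slide
mu_k = np.array([45.6, 46.1, 46.5, 46.9, 47.6, 47.9, 48.4, 48.9, 49.6, 50.1,
                 50.8, 50.9, 51.6, 52.0, 52.7, 53.0, 53.5, 53.9, 54.6, 55.0])

effect_at_mean = beta_g(mu_k)
slope_at_mean = local_gei(mu_k)
```

The derivative peaks at E = 50 and is nearly zero for studies whose E values sit in either tail of the sigmoid, so a study with a narrow range of E far from 50 sees almost no within-study interaction.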
Non-linear interaction: power
Figure: power to detect GEI in each study (bar chart, 0.0 to 0.5) and true value of GEI (green curve, 0.0 to 0.5), plotted against the value of E (45 to 54). Red bars: four cohorts with relatively larger variance of E.

Study k   $n_k$   $\sigma_{E_k}$
1         800     1.8
2         400     2.0
3         500     1.5
4         200     2.0
5         200     3.5
6         200     1.5
7         400     1.7
8         500     1.4
9         500     2.1
10        2000    3.5
11        500     3.5
12        500     1.7
13        200     1.9
14        400     2.1
15        1000    3.5
16        400     1.9
17        400     2.0
18        500     1.6
19        200     1.7
20        200     1.3

Method   Power
IPD      0.96
UIVW     0.68
MIVW     0.96
MR       0.82
AWE      0.96
T2D study: meta-analysis
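Age: $\log(HDL_{ki}) = \alpha_k + \beta_G SNP_{ki} + \beta_E Age_{ki} + \delta\, SNP_{ki} \times Age_{ki} + \beta_{z1} BMI_{ki} + \beta_{z2} Gender_{ki} + \beta_{z3} T2D + \varepsilon_{ki}$
BMI: $\log(HDL_{ki}) = \alpha_k + \beta_G SNP_{ki} + \beta_E BMI_{ki} + \delta\, SNP_{ki} \times BMI_{ki} + \beta_{z1} Age_{ki} + \beta_{z2} Gender_{ki} + \beta_{z3} T2D + \varepsilon_{ki}$
Figure: forest plots of the rs1121980 × age interaction (axis −0.010 to 0.010) and the rs1121980 × BMI interaction (axis −0.005 to 0.015). Rows: study-specific estimates (D2D2007, DIAGEN, DPS, FUSION_FS, FUSION_S2, HUNT, METSIM, TROMSO) followed by pooled estimates (UIVW, REM, MIVW2', MIVW, AWE, AWE2', IPD).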
T2D study: meta-regression
Figure: marginal SNP effect (y-axis, −0.025 to 0.005) plotted against study-specific mean age (year) (left panel, 40 to 65; FUSION_FS labelled) and study-specific mean BMI (kg/m2) (right panel, 27 to 31; DPS labelled).
T2D study
• rs1121980 × BMI interaction on log(HDL)

         rs1121980 × BMI                               P-value*
Method   Estimate (a)   SE (a)   95% CI (a)            additive   co-dominant
IPD      1.880          0.679    (0.548, 3.212)        0.006*     0.013*
UIVW     1.936          0.700    (0.565, 3.308)        0.006*     0.018*
MIVW     1.957          0.698    (0.589, 3.325)        0.005*     0.017*
MR       -0.062         3.484    (-6.890, 6.767)       0.986      0.630
AWE      1.859          0.686    (0.514, 3.204)        0.007*     0.016*

(a) SE: standard error (estimates and SEs multiplied by 1000); CI: confidence interval.

• Combining $\beta_E$ (BMI) with $\delta_{BMI}$ (BMI × SNP): for a 1 kg/m2 increase in BMI, the HDL level will decrease by (in terms of percentage change $(y_{x+1} - y_x)/y_x \times 100\%$):
  1.73 (95% CI: (1.57, 1.90)) for GG
  1.54 (95% CI: (1.44, 1.64)) for AG or GA
  1.35 (95% CI: (1.17, 1.53)) for AA (minor allele: A)
• Presence of A attenuated the (negative) association between BMI and HDL
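The percentage changes follow from the log-scale model: a one-unit increase in BMI multiplies HDL by $\exp(\beta_E + \delta G)$. The sketch below illustrates this with the IPD interaction estimate from the table. The slides do not report the BMI main effect, so `beta_e` is a hypothetical value back-solved from the reported 1.73% decrease for GG; it is used only to show that the reported pattern is roughly reproduced.

```python
import math

def pct_change_hdl(beta_e, delta, g):
    """Percent change in HDL per 1 kg/m2 increase in BMI for genotype g.

    g counts copies of the minor allele A (additive coding): GG=0, AG=1, AA=2.
    """
    return (math.exp(beta_e + delta * g) - 1.0) * 100.0

delta_hat = 1.880e-3                # IPD estimate (table values are x 1000)
beta_e = math.log(1.0 - 0.0173)     # hypothetical: back-solved from 1.73% for GG

changes = {g: pct_change_hdl(beta_e, delta_hat, g) for g in (0, 1, 2)}
```

Each additional copy of A moves the percentage change toward zero, which is the attenuation described in the last bullet.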
Conclusion
Summary
• Adaptively weighted estimator:
Instead of a discrete choice, a weighted combination of UIVW and MR
Captures precision trade-offs in terms of covariate heterogeneity in E
Uses only univariate summary statistics
Test based on AWE valid under fixed effects model
Efficiency comes at a cost of potential bias, but well-controlled trade-off.
Big Picture Question: Where have all the GEI gone?
When will we ever learn?
I will keep exploring in my yellow submarine
Thank you for listening!
Heterogeneity in E exists!
Example courtesy: Nilanjan Chatterjee, NCI
Obesity, 2012
India Health Study
Trivandrum
New Delhi
Participant Characteristics by Region
Characteristic New Delhi Trivandrum
Total (n=1,313) n=619 n=694
Age, years (mean, SD) 47.4 ± 10.0 48.8 ± 9.2
Household monthly income, %
<5,000 rupees 7.1 71.9
>10,000 rupees 76.7 3.1
Household items, %
Car 25 7
Refrigerator 87 58
Washing machine 79 14
Total physical activity, MET-hr/wk 42.5 ± 43.8 147.3 ± 85.2
Vigorous physical activity, MET-hr/wk 0.6 ± 6.8 26.2 ± 51.4
Sitting, hr/day 10.4 ± 2.0 5.0 ± 2.3
Centrally obese, % 82.1 60.2
Association of FTO rs3751812 with Waist Circumference
Characteristic             N       Effect size per T allele (95% CI)   P trend
Overall                    1,209   +1.61 cm (0.67, 2.55)               0.0008
New Delhi
  Overall                  578     +2.53 cm (1.08, 3.97)               0.0006
  By PA
    < 91 MET-hrs/wk        517     +2.36 cm (0.82, 3.89)               0.003
    92-151 MET-hrs/wk      32      +6.39 cm (1.94, 10.85)              0.005
    152-217 MET-hrs/wk     24      -0.95 cm (-7.33, 5.42)              0.77
    218+ MET-hrs/wk        5       N/A                                 N/A
Trivandrum
  Overall                  574     +0.87 cm (-0.35, 2.08)              0.16
  By PA
    < 91 MET-hrs/wk        170     +3.50 cm (0.90, 6.10)               0.008
    92-151 MET-hrs/wk      132     +1.13 cm (-1.08, 3.33)              0.32
    152-217 MET-hrs/wk     141     +1.04 cm (-1.63, 3.70)              0.45
    218+ MET-hrs/wk        131     -2.32 cm (-4.82, 0.18)              0.07

Interaction by PA (P-values as listed on the slide): 0.009, 0.59, 0.004
PloS Medicine, 2011
PloS Genetics, 2013
Note the much larger sample sizes