An Environment-Wide Association Study (EWAS) on Type 2 Diabetes Mellitus

Chirag J. Patel et al., PLoS One, May 2010

First, some context...

• Hypothesis-driven vs. data-driven research

• Tension between these two forms of research crosses all scientific disciplines

• In recent years, there has been an explosion of data-driven research...why?

Kell, 2003

Computational Speed Over Time

Kurzweil, 2010

Genome Sequencing Cost

NHGRI, 2012

The Age of Big Data

Lohr, 2012

Genome-Wide Association Study (GWAS)

• Typically case-control study design

• Examine associations of single-nucleotide polymorphisms (SNPs) with disease state

• NCBI’s SNP Database lists 187,852,828 SNPs identified in human genome (June 2012)

• GWAS typically examines hundreds of thousands of SNPs through use of DNA microarrays

Bush et al., PLOS Computational Biology, 2012

GWAS of systemic sclerosis

Radstake et al., 2010

On to the paper!

T2D Prevalence in US

[Figure: estimated prevalence (%) by year and age group]

CDC, 2011

T2D Incidence in US

[Figure: estimated incidence (per 1,000 people) by year, 1980 to 2010, for age groups 18-44, 45-64, and 65-79]

CDC, 2012

Introduction


• Type 2 Diabetes (T2D) has complex etiology, involving genetics, lifestyle, and environment

• GWAS identified multiple SNPs associated with T2D, but these don’t explain T2D trends

• Standard environmental epidemiology approaches limited by narrow focus

• Patel et al. propose first “Environment-Wide Association Study” (EWAS) to examine T2D using a large, nationally-representative dataset

Methods

• Combined four NHANES datasets (1999-2006)

• Rich cross-sectional data on demographics, chemical toxicants, pollutants, allergens, nutrients, fasting blood sugar, and self-reported medical history

• By using NHANES weighting, results can be generalized to US population
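To illustrate why weighting matters for generalizing to the US population, here is a minimal sketch of a weighted prevalence estimate. The function name `weighted_prevalence` and the toy numbers are hypothetical and not from the paper; real NHANES analyses use the survey's published sample weights and design variables, which this sketch does not model.

```python
def weighted_prevalence(is_case, weights):
    """Share of the represented population that are cases.

    Each participant counts in proportion to their sample weight,
    i.e. the number of people in the population they stand for.
    """
    total = sum(weights)
    return sum(w for c, w in zip(is_case, weights) if c) / total


# Hypothetical toy sample: weights are illustrative only.
cases = [True, False, False, True]
weights = [1000.0, 4000.0, 4000.0, 1000.0]

unweighted = sum(cases) / len(cases)
weighted = weighted_prevalence(cases, weights)
print(unweighted, weighted)
```

In this toy sample the cases carry small weights, so the weighted estimate is lower than the raw sample proportion.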

Methods: Environment Scan


• Omitted environmental factors with low variability (>90% of observations below detection limit). Also omitted factors only affecting specific subsets of population

• Across all four NHANES cohorts: 543 environmental factors

• 266 unique factors in total, with 157 factors found in more than one cohort

• Log-transformed factors when necessary. Used z-score transformations to allow comparisons between factors
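The screening and transformation steps above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the helper names `keep_factor` and `log_zscore` are hypothetical, the sample standard deviation is an assumed choice, and the toy values are made up.

```python
import math
import statistics


def keep_factor(values, detection_limit, max_below=0.90):
    """Keep a factor unless more than 90% of observations are below the detection limit."""
    below = sum(1 for v in values if v < detection_limit)
    return below / len(values) <= max_below


def log_zscore(values):
    """Log-transform positive measurements, then standardize to mean 0 and SD 1."""
    logs = [math.log(v) for v in values]
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)  # sample SD; an assumption of this sketch
    return [(x - mean) / sd for x in logs]


# Hypothetical serum measurements for one factor.
z = log_zscore([0.5, 1.0, 2.0, 4.0, 8.0])
print(z)
```

After z-scoring, a regression coefficient refers to a one standard deviation change in the (log) exposure, which is what makes effect sizes comparable across factors measured in different units.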


Methods: Case definition

• Based on ADA guidelines: fasting blood glucose level ≥ 126 mg/dL

• Did not distinguish T1D from T2D

• Did not consider medication use or medical history

ADA, 2009
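As a minimal sketch, the case definition reduces to a single threshold test. The names `is_case` and `FASTING_GLUCOSE_CUTOFF_MG_DL` are hypothetical; only the 126 mg/dL cutoff comes from the slide.

```python
FASTING_GLUCOSE_CUTOFF_MG_DL = 126  # threshold quoted on the slide


def is_case(fasting_glucose_mg_dl):
    """Classify a participant as a case from fasting blood glucose alone.

    Medication use and medical history are deliberately ignored,
    matching the case definition described above.
    """
    return fasting_glucose_mg_dl >= FASTING_GLUCOSE_CUTOFF_MG_DL
```

One consequence of this definition is that a treated diabetic with well-controlled glucose would be counted as a non-case.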

Methods: Primary Analysis

• Logistic regression (accounting for NHANES weighting) to estimate associations of 266 unique environmental factors with case status

• Estimated prevalence odds ratios

• Ran regressions for each individual NHANES cohort and with data of all combined cohorts

• Covariates: age, sex, BMI, ethnicity, and income/poverty ratio
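To make the regression step concrete, here is a simplified sketch of a weighted logistic regression with one exposure and an intercept, fitted by Newton's method, returning the prevalence odds ratio. It is not the authors' implementation: the function name `weighted_logistic_or` is hypothetical, the covariates listed above are omitted, and survey-design variance estimation is not attempted. The toy data are made up.

```python
import math


def weighted_logistic_or(x, y, w, iterations=25):
    """Fit logit P(y=1) = b0 + b1*x with observation weights w.

    Returns exp(b1), the odds ratio per one-unit increase in x.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            r = wi * (yi - p)        # weighted residual (gradient term)
            v = wi * p * (1.0 - p)   # weighted variance (Hessian term)
            g0 += r
            g1 += r * xi
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        # Newton step: solve the 2x2 system H * delta = g.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return math.exp(b1)


# Hypothetical toy data: x = exposed (1) or not (0), y = case status,
# w = how many people each row represents.
x = [1, 1, 0, 0]
y = [1, 0, 1, 0]
w = [30.0, 70.0, 10.0, 90.0]
odds_ratio = weighted_logistic_or(x, y, w)
print(odds_ratio)
```

With a single binary exposure the fitted odds ratio matches the weighted 2x2 table, (30 × 90) / (70 × 10), which is a convenient sanity check. In an EWAS this fit would be repeated once per environmental factor.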

Methods: False Discovery Rate (FDR)

• Accounted for multiple hypothesis testing

• FDR = proportion of "discoveries" (significant results) that are actually false positives

• Less stringent than Bonferroni correction
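To show what "less stringent than Bonferroni" means in practice, the sketch below compares Bonferroni with the Benjamini-Hochberg procedure, one common way to control the FDR. The slides do not say which FDR procedure the authors used, so Benjamini-Hochberg is an assumption chosen for illustration, and the p-values are made up.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most k * q / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k:
            rejected[i] = True
    return rejected


# Hypothetical p-values from ten tests.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
n_bonferroni = sum(bonferroni(p))
n_bh = sum(benjamini_hochberg(p))
print(n_bonferroni, n_bh)
```

On these made-up p-values Bonferroni keeps one result and Benjamini-Hochberg keeps two, at the cost of tolerating a controlled share of false discoveries.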

Methods: False Discovery Rate (FDR)

Alpha level: α = 5 false discoveries / 100 total tests = 0.05

FDR: FDR = 5 false discoveries / 100 significant results = 0.05

Shaffer, 1995

Methods: Primary Analysis

1) First phase: Used two-sided alpha level of 0.02 to pick factors associated with T2D in individual NHANES cohorts

2) Second phase: Determined how many of the factors selected in the first phase (37) were associated with T2D in two or more cohorts (two-sided alpha level of 0.02)
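The two phases can be sketched as a simple filter over per-cohort p-values. The function name `two_phase_selection`, the cohort labels, the factor names, and the p-values below are all hypothetical; only the 0.02 threshold and the "two or more cohorts" rule come from the slide.

```python
def two_phase_selection(p_by_cohort, alpha=0.02, min_cohorts=2):
    """Return (phase 1 factors, phase 2 factors).

    Phase 1: factors with p < alpha in at least one cohort.
    Phase 2: the subset with p < alpha in at least min_cohorts cohorts.
    """
    hits = {}
    for cohort, p_values in p_by_cohort.items():
        for factor, p in p_values.items():
            if p < alpha:
                hits[factor] = hits.get(factor, 0) + 1
    phase1 = sorted(hits)
    phase2 = sorted(f for f, n in hits.items() if n >= min_cohorts)
    return phase1, phase2


# Hypothetical p-values for three made-up factors in two cohorts.
p_by_cohort = {
    "cohort_1": {"factor_a": 0.001, "factor_b": 0.015, "factor_c": 0.40},
    "cohort_2": {"factor_a": 0.010, "factor_b": 0.300, "factor_c": 0.25},
}
phase1, phase2 = two_phase_selection(p_by_cohort)
print(phase1, phase2)
```

Requiring a hit in a second, independent cohort acts as a replication step, which is why the overall FDR after the second phase is much lower than after the first.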

Methods: secondary/sensitivity analyses

1) Reverse causality test: re-ran analysis only among people who did not report a doctor's diagnosis of T2D

2) Lipophilic chemicals: adjusted for total triglycerides and cholesterol

3) Recent diet: adjusted for diet and supplement use

Results: first phase

Identified 37 unique factors (FDR = 10-30%)

• Dioxins

• Furans

• Heavy metals

• Nutrients/vitamins

• Organochlorine pesticides

• Polychlorinated biphenyls

• Viruses

Results: second phase

Identified 5 unique factors (overall FDR = 2%)

• Cis-β-carotene

• Trans-β-carotene

• γ-tocopherol

• Heptachlor Epoxide

• PCB170: 2,2',3,3',4,4',5-heptachlorobiphenyl

Results: reverse causality?

Prevalence OR (95% CI)

Factor                Primary Analysis     Secondary Analysis
Cis-β-carotene        0.6 (0.5 – 0.7)      0.6 (0.5 – 0.7)
Trans-β-carotene      0.6 (0.5 – 0.7)      0.7 (0.5 – 0.8)
γ-tocopherol          1.5 (1.3 – 1.7)      1.8 (1.3 – 2.2)
Heptachlor Epoxide    1.7 (1.3 – 2.1)      1.6 (1.1 – 2.1)
PCB170                2.2 (1.6 – 3.2)      2.1 (1.2 – 3.9)

Results: confounding by lipid levels?

Prevalence OR (95% CI)

Factor                Primary Analysis     Secondary Analysis
Cis-β-carotene        0.6 (0.5 – 0.7)      0.7 (0.6 – 0.8)
Trans-β-carotene      0.6 (0.5 – 0.7)      0.7 (0.6 – 0.8)
γ-tocopherol          1.5 (1.3 – 1.7)      1.4 (1.2 – 1.6)
Heptachlor Epoxide    1.7 (1.3 – 2.1)      1.6 (1.3 – 2.0)
PCB170                2.2 (1.6 – 3.2)      2.3 (1.4 – 3.7)

Results: adjusting for diet/supplements?

Prevalence OR (95% CI)

Factor                Primary Analysis     Secondary Analysis
Cis-β-carotene        0.6 (0.5 – 0.7)      0.7 (0.6 – 0.8)
Trans-β-carotene      0.6 (0.5 – 0.7)      0.7 (0.6 – 0.8)
γ-tocopherol          1.5 (1.3 – 1.7)      1.3 (1.1 – 1.5)
Heptachlor Epoxide    1.7 (1.3 – 2.1)      1.6 (1.3 – 2.1)
PCB170                2.2 (1.6 – 3.2)      2.2 (1.4 – 3.5)
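One way to read the three sensitivity tables is to check, for each secondary analysis, whether every 95% confidence interval still excludes the null value of 1 and whether the direction of association matches the primary analysis. The sketch below does that with the odds ratios and intervals transcribed from the slides; the variable and function names are hypothetical.

```python
# Primary-analysis odds ratios, as reported on the slides.
PRIMARY_OR = {
    "cis-beta-carotene": 0.6,
    "trans-beta-carotene": 0.6,
    "gamma-tocopherol": 1.5,
    "heptachlor epoxide": 1.7,
    "PCB170": 2.2,
}

# Secondary analyses: factor -> (OR, lower 95% limit, upper 95% limit).
SECONDARY = {
    "reverse causality": {
        "cis-beta-carotene": (0.6, 0.5, 0.7),
        "trans-beta-carotene": (0.7, 0.5, 0.8),
        "gamma-tocopherol": (1.8, 1.3, 2.2),
        "heptachlor epoxide": (1.6, 1.1, 2.1),
        "PCB170": (2.1, 1.2, 3.9),
    },
    "lipid adjustment": {
        "cis-beta-carotene": (0.7, 0.6, 0.8),
        "trans-beta-carotene": (0.7, 0.6, 0.8),
        "gamma-tocopherol": (1.4, 1.2, 1.6),
        "heptachlor epoxide": (1.6, 1.3, 2.0),
        "PCB170": (2.3, 1.4, 3.7),
    },
    "diet/supplement adjustment": {
        "cis-beta-carotene": (0.7, 0.6, 0.8),
        "trans-beta-carotene": (0.7, 0.6, 0.8),
        "gamma-tocopherol": (1.3, 1.1, 1.5),
        "heptachlor epoxide": (1.6, 1.3, 2.1),
        "PCB170": (2.2, 1.4, 3.5),
    },
}


def is_robust(primary_or, secondary):
    """True if the secondary CI excludes 1 and the direction matches the primary OR."""
    odds_ratio, lower, upper = secondary
    excludes_null = lower > 1.0 or upper < 1.0
    same_direction = (primary_or > 1.0) == (odds_ratio > 1.0)
    return excludes_null and same_direction


robust = {
    analysis: all(is_robust(PRIMARY_OR[f], ci) for f, ci in rows.items())
    for analysis, rows in SECONDARY.items()
}
print(robust)
```

By this criterion all five associations persist in each of the three secondary analyses, although several intervals widen.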

Discussion


• EWAS confirmed previous findings (carotenes and PCB) and provided novel associations (heptachlor epoxide and γ-tocopherol)

• Limitations and Strengths?

• Dawning of age of “enviromics”?

• Next steps?

o e.g. cumulative exposure?
