An Introduction to Genome-Wide Association Studies

Shen-Chih Chang, PhD
Epi 295
Oct 2, 2009

What is a genome-wide association study?

Any study of genetic variation across the entire human genome designed to identify genetic association with a disease.

It usually refers to studies with genetic markers of 100,000 or more to represent a large proportion of variation in the human genome.

It allows efficient and comprehensive analysis of common genomic variation without a priori hypotheses based on gene function or disease pathways.

It requires very large series of cases and controls to ensure adequate statistical power, and multiple subsequent studies to confirm the initial findings.

Why are such studies possible now?

With the completion of the Human Genome Project in 2003 and the International HapMap Project in 2005, researchers now have a set of research tools that make it possible to find the genetic contributions to common diseases.

The tools include:
◦ computerized databases that contain the reference human genome sequence
◦ a map of human genetic variation
◦ a set of new technologies that can quickly and accurately analyze whole-genome samples

Molecular Epidemiology Before/After GWAS

Before GWAS
◦ A few markers at a time
◦ Gene functions
◦ Disease pathways
◦ Biological mechanisms

After GWAS
◦ Hundreds of thousands of markers at a time
◦ Gene hunting
◦ Association studies

[Diagram labels: Association studies; Gene functions, Disease pathways, Biological mechanisms]

Terms Frequently Used in GWAS*

Single-nucleotide polymorphism (SNP)
◦ DNA sequence variation resulting from a single-base substitution.
◦ Most common form of genetic variation in the genome.

Nonsynonymous SNP
◦ A polymorphism that results in a change in the amino acid sequence of a protein (and therefore may affect the function of the protein).

Alleles
◦ Alternative DNA sequences at the same physical gene locus, which may or may not result in different phenotypic traits.

*Adapted from Pearson et al., 2008.

Terms Frequently Used in GWAS (continued)

Minor allele
◦ The allele of a bi-allelic polymorphism that is less frequent in the study population.

Minor allele frequency (MAF)
◦ Proportion of the less common allele in a population, ranging from less than 1% to 50%.

Haplotype
◦ A group of specific alleles at neighboring genes or markers that tend to be inherited together.

Terms Frequently Used in GWAS (continued)

Linkage disequilibrium (LD)
◦ Association between 2 alleles located near each other on a chromosome, such that they are inherited together more frequently than expected by chance.

Tag SNP
◦ A SNP in strong LD with other SNPs, so that it can serve as a proxy for them. Only the tag SNPs need to be genotyped to represent the variation of the gene(s).

Hardy-Weinberg equilibrium (HWE)
◦ The population distribution of two alleles (with frequencies p and q) is stable from generation to generation.
◦ Genotypes occur at frequencies of p², 2pq, and q² for the major allele homozygote, heterozygote, and minor allele homozygote, respectively.
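
As a rough illustration of the HWE definition, the following Python sketch computes the expected genotype counts (p², 2pq, q² times the sample size) from observed counts and compares them with a 1-degree-of-freedom chi-square statistic. The function names are hypothetical, the chi-square goodness-of-fit test is one common choice rather than the method used in any study cited here, and the example counts are the control genotype counts for rs16969968 in Whites reported in Table 3 of these slides.

```python
import math


def hwe_expected_counts(n_major_hom, n_het, n_minor_hom):
    """Expected genotype counts under HWE: N*p^2, N*2pq, N*q^2."""
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)  # major allele frequency
    q = 1 - p                                # minor allele frequency
    return (p * p * n, 2 * p * q * n, q * q * n)


def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Chi-square statistic (1 df) and p-value for departure from HWE.

    Assumes a polymorphic marker (all expected counts > 0).
    """
    observed = (n_major_hom, n_het, n_minor_hom)
    expected = hwe_expected_counts(*observed)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Control genotype counts (ref / het / hom) for rs16969968 in Whites, Table 3.
chi2, p_value = hwe_chi_square(4827, 5019, 1373)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```

A small p-value would flag the marker for closer inspection; the cutoff is a study-specific choice.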

Other types of common DNA variation

Deletion/Insertion
Copy number variation

Single Nucleotide Polymorphism

Person 1
AAGTCAGTCTAGGATCGGG
TTCAGTCAGATCCTAGCCC

Person 2
AAGTCAGTCTAGGGTCGGG
TTCAGTCAGATCCCAGCCC

SNP: the two sequences differ at a single base pair (A/T in Person 1, G/C in Person 2)

Insertion/Deletion

Person 1
AAGTCAGTCTAGGATCGGG
TTCAGTCAGATCCTAGCCC

Person 2
AAGTCAGTCTAGGGATCGGG
TTCAGTCAGATCCCTAGCCC

Insertion/Deletion: Person 2 carries one additional base pair

Copy Number Variation

Person 1
AAGTGTCGTCGTCGTCTCGGG
TTCACAGCAGCAGCAGAGCCC

Person 2
AAGTGTCGTCGTCTCGGG
TTCACAGCAGCAGAGCCC

3 vs. 4 trinucleotide repeats

Genotyping in GWA Studies

GWA studies rely on LD information.

Usually at least 1 SNP within a group of SNPs with high LD (r² ≥ 0.8) will be included on the platform.

Genotyping platforms comprising 500,000 to 1,000,000 SNPs have been estimated to capture:
◦ 67% to 89% of common SNP variation in populations of European and Asian ancestry
◦ 46% to 66% in populations of African ancestry
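
To make the r² ≥ 0.8 criterion concrete, here is a minimal Python sketch of the standard pairwise LD measures r² and D′ for two bi-allelic SNPs. It assumes that phased haplotype frequencies are already known (in practice they are estimated); the function names are hypothetical and only the 0.8 threshold comes from the slide.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (r2, d_prime) for two bi-allelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime


def is_good_proxy(r2, threshold=0.8):
    """True if one SNP can serve as a tag for the other at the given r2 cutoff."""
    return r2 >= threshold


# Illustrative frequencies (not from the slides): D' is 1 while r2 is only 1/3,
# so this pair would not qualify as tag/proxy at r2 >= 0.8.
r2, d_prime = ld_stats(p_ab=0.25, p_a=0.5, p_b=0.25)
print(round(r2, 3), round(d_prime, 3), is_good_proxy(r2))
```

The example shows why platforms select on r² rather than D′: two SNPs can have D′ = 1 and still predict each other poorly when their allele frequencies differ.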

Genotyping Platforms

Affymetrix
◦ Genome-Wide Human SNP Array 6.0
  More than 906,600 SNPs
  More than 946,000 copy number probes

Illumina
◦ HumanOmni1-Quad / Human1M-Duo
  ~1 million markers
◦ Human660W-Quad
  ~650,000 markers
◦ HumanCytoSNP-12
  ~300,000 markers

Quality Control in GWA Studies

Certain thresholds should be set to ensure genotyping quality:
◦ the SNP call rate, typically > 95%
◦ the minor allele frequency, typically > 1%
◦ violations of Hardy-Weinberg equilibrium
◦ concordance rates in duplicate samples, typically > 99.5%
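
The thresholds above can be sketched as a simple per-SNP filter. This is an illustrative Python sketch, not a real pipeline: the class and function names are hypothetical, and the HWE p-value cutoff of 1e-4 is an assumption, since the slide does not give one.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    snp_id: str
    call_rate: float  # fraction of samples with a genotype call
    maf: float        # minor allele frequency
    hwe_p: float      # p-value of an HWE test, usually in controls


def passes_qc(snp, min_call_rate=0.95, min_maf=0.01, min_hwe_p=1e-4):
    """Apply the slide's typical thresholds; min_hwe_p is an assumed cutoff."""
    return (snp.call_rate > min_call_rate
            and snp.maf > min_maf
            and snp.hwe_p >= min_hwe_p)


def concordance_rate(calls_a, calls_b):
    """Fraction of identical genotype calls between duplicate samples."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("duplicate call lists must be non-empty and equal length")
    matches = sum(a == b for a, b in zip(calls_a, calls_b))
    return matches / len(calls_a)


# Hypothetical SNP identifiers and values, for illustration only.
snps = [
    SnpQc("snp_ok", call_rate=0.99, maf=0.30, hwe_p=0.40),
    SnpQc("snp_low_call", call_rate=0.90, maf=0.30, hwe_p=0.40),
    SnpQc("snp_rare", call_rate=0.99, maf=0.005, hwe_p=0.40),
]
kept = [s.snp_id for s in snps if passes_qc(s)]
print(kept)
```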

Study Design in GWA Studies

A multistage approach can reduce the amount of genotyping required, without sacrificing power.

In stage 1, a full set of SNPs is genotyped in a fraction of samples, and a p-value threshold is used to identify a subset of SNPs with putative associations.

In the second and possibly third stages, the SNPs identified from the first stage are re-tested in populations that are larger or of a similar size.

The replication results can be used to distinguish the few true-positive associations identified in stage 1 from the many false-positive results that occur by chance.

Multistage Study Designs

Joel N. Hirschhorn & Mark J. Daly, Nature Reviews Genetics 6, 95-108, 2005

Examining GWAS Data

Population stratification (population structure)
◦ Refers to confounding in genetic association studies caused by genetic differences between cases and controls that are unrelated to disease and arise from sampling them from populations of different ancestries.

Quantile-Quantile plot (Q-Q plot)
◦ In a GWAS, a Q-Q plot can be used to assess the magnitude of population stratification.
◦ Observed association statistics (χ² or -log10 P values) are ranked from smallest to largest on the y-axis and plotted against the distribution expected under the null hypothesis of no association on the x-axis.
◦ Deviations from the identity line suggest either a very highly associated locus or significant differences in population structure.
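
The Q-Q construction described above can be sketched in a few lines of Python. The sketch pairs ranked observed -log10 P values with their expected values under the null, and also computes the genomic inflation factor (lambda), a commonly used one-number summary that is not mentioned on the slide and is added here as an assumption about what a reader might want. Function names are hypothetical and only the standard library is used.

```python
import math
from statistics import NormalDist, median


def qq_points(p_values):
    """Pairs (expected, observed) of -log10 P, both ranked smallest to largest."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Under the null, the i-th smallest of n p-values is about (i - 0.5) / n.
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


def genomic_inflation(p_values):
    """Median observed 1-df chi-square divided by its expected median under the null."""
    nd = NormalDist()
    # Convert each two-sided p-value (0 < p <= 1) back to a 1-df chi-square statistic.
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.75) ** 2  # about 0.455
    return median(chi2) / expected_median


# Perfectly uniform p-values lie on the identity line with lambda close to 1.
null_p = [(i - 0.5) / 1000 for i in range(1, 1001)]
print(round(genomic_inflation(null_p), 3))
```

Points that rise above the identity line across the whole range, together with lambda well above 1, point to population structure; a departure only in the extreme tail is more consistent with a true associated locus.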

Pearson, T. A. et al. JAMA 2008;299:1335-1344.

Hypothetical Quantile-Quantile Plots in Genome-wide Association Studies

The sharp deviation above an expected value of approximately 8 could be due to a strong association of the disease with SNPs in a heavily genotyped region.

Inflation of observed statistics is more likely due to population structure than disease susceptibility genes.

Analyzing GWA Studies

Different genetic models
◦ Additive model: each copy of the allele is assumed to increase risk by the same amount.
◦ Dominant model: rare allele carriers compared to homozygotes of the common allele.
◦ Recessive model: homozygotes of the rare allele compared to common allele carriers.

Correction for multiple comparisons
◦ Bonferroni correction has been the most commonly used correction in GWAS.
◦ A threshold of P = 0.05/number of tests performed.
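
The three genetic models and the Bonferroni threshold can be written out as follows. This is a minimal sketch with hypothetical function names; the genotype is represented as the number of copies of the rare allele (0, 1, or 2), and the SNP count in the example is the 306,207 SNPs of the deCODE discovery stage described later in these slides.

```python
def code_genotype(n_rare_alleles, model):
    """Numeric coding of a genotype (0, 1 or 2 rare alleles) under a genetic model."""
    if n_rare_alleles not in (0, 1, 2):
        raise ValueError("expected 0, 1 or 2 copies of the rare allele")
    if model == "additive":
        return n_rare_alleles                  # each copy adds the same amount
    if model == "dominant":
        return 1 if n_rare_alleles >= 1 else 0  # carriers vs common homozygotes
    if model == "recessive":
        return 1 if n_rare_alleles == 2 else 0  # rare homozygotes vs the rest
    raise ValueError(f"unknown model: {model}")


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold: alpha divided by the number of tests."""
    return alpha / n_tests


for model in ("additive", "dominant", "recessive"):
    print(model, [code_genotype(g, model) for g in (0, 1, 2)])

print(f"{bonferroni_threshold(306207):.2e}")
```

The coded value would then enter a regression model as the genotype term; which model fits best is an empirical question for each locus.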

Selected Genome-Wide Association Studies on Lung Cancer

deCODE Genetics, Iceland
Population: European descent
Outcome: smoking quantity

First Stage – discovery

10,995 Icelandic smokers
Infinium HumanHap300 SNP chips (Illumina)
306,207 SNPs
Significance threshold = 2 × 10⁻⁷ ≈ 0.05/306,207

• Identified allele T of rs1051730 as most strongly associated with smoking quantity (P = 5 × 10⁻¹⁶)
• On chromosome 15q24
• Within the CHRNA3 gene, in a linkage disequilibrium block containing CHRNA5 and CHRNB4 (which encode nicotinic acetylcholine receptors)

Second Stage - replication

Genotyping rs1051730 on additional samples (Centaurus)

523 smokers from Spain
1,375 smokers from The Netherlands

Third Stage - expansion: association with lung cancer and peripheral arterial disease

For lung cancer: three case-control studies from Iceland, Spain, and The Netherlands

For peripheral arterial disease: five case-control studies from Iceland, New Zealand, Austria, Sweden, and Italy

*Adjusted for sex and year of birth.

IARC Central Europe lung cancer study
Population: European descent
Outcome: lung cancer

First Stage - discovery

1,926 lung cancer cases/2,522 controls

Illumina Sentrix HumanHap300 BeadChip (Illumina)

310,023 SNPs (≈ 80% of common genomic variation)

Significance threshold = 5 × 10⁻⁷

• Identified two SNPs on chromosome 15q25:
  • rs1051730 (P = 5 × 10⁻⁹), allelic ORadj 1.30 (1.19-1.43)
  • rs8034191 (P = 9 × 10⁻¹⁰), allelic ORadj 1.32 (1.21-1.45)

Second Stage - expansion

Genotyping 34 additional 15q25 SNPs (Taqman):
◦ SNPs with an association P-value of < 10⁻⁶ from the Centre d’Etude du Polymorphisme Humain Utah (CEU) HapMap (using an imputation method)
◦ SNPs of CHRNA5 and CHRNA3 from previous studies on nicotine dependence
◦ All non-synonymous SNPs in dbSNP from the six genes within or near the association region

Findings:
◦ 23 showed evidence of association exceeding the genome-wide significance level
◦ Strong LD region

Third Stage - confirmation

Genotyping rs8034191 (from the first panel) and rs16969968 (from the second panel)

In five independent studies of lung cancer:
◦ the European Prospective Investigation in Cancer and Nutrition (EPIC) cohort study (781 cases/1,578 controls)

◦ the Beta-Carotene and Retinol Efficacy Trial (CARET) cohort study (764 cases/1,515 controls)

◦ the Health Study of Nord-Trondelag (HUNT) and Tromso cohort studies (235 cases/392 controls)

◦ the Liverpool lung cancer case-control study (403 cases/814 controls)

◦ the Toronto lung cancer case-control study (330 cases/453 controls)

Third Stage - replication

Findings:
• An increased risk for both heterozygous and homozygous carriers was observed in all five replication samples.
• The two SNPs are in high LD (D’ = 1.00, r² = 0.92).
• An increased risk was also observed among non-smokers.

Fourth Stage – expansion: association with head and neck cancer

Genotyping rs8034191 in two separate studies of head and neck cancer in Europe:
◦ Five of the six countries in the original GWAS, overlapping with the lung cancer controls (726 cases/694 controls)
◦ The ARCAGE study with eight countries in Europe (1,536 cases/1,443 controls)

Findings:
◦ No association with HNC was observed.
◦ No evidence of an association with nicotine addiction.

MD Anderson
Population: European descent
Outcome: lung cancer

First Stage - discovery

1,154 lung cancer cases/1,137 controls (all smokers)

Illumina HumanHap300 v1.1 BeadChips (Illumina)

315,860 SNPs

Significance threshold = 4.9 × 10⁻⁵ (the least significant result among the ten SNPs retained for follow-up)

Choose 10 SNPs with the most significant associations

Second Stage - replication

Genotyping the 10 most significant associations from the discovery phase (Taqman) in two studies:
◦ Texas (711 cases/632 controls)
◦ UK (2,013 cases/3,062 controls)

Findings:
◦ Elevated risks were replicated in 2/10 SNPs: rs1051730 and rs8034191
◦ The two SNPs are in high LD (r² > 0.8)

*Adjusted for age, sex, packyears, and center

A Catalog of Published Genome-Wide Association Studies
http://www.genome.gov/26525384#1

GWA Replication Studies

Lung Cancer
Bladder Cancer
Head and Neck Cancer

• Manuscript submitted to JNCI
• Organizer: International Lung Cancer Consortium (ILCCO)
• Participants: 21 studies (11,645 cases/14,954 controls)
  • 9 from North America
  • 8 from Europe
  • 4 from Asia

Table 1 – Summary of the participating studies from ILCCO

Ref. | Coordinating institute | Study location | Recruitment period | Eligibility | Control source | Cases* | Controls*
Whites
 | MD Anderson Cancer Center | Texas, US | | Only ever smokers | Hospital | 709 | 629
 | Karmanos Cancer Institute (KCI), Wayne State University | Michigan, US | 1988-2007 | | Population | 575 | 860
 | University of Hawaii | Hawaii, US | 1992-1997 | 26-79 years old | Population | 138 | 175
 | Mayo Clinic | Minnesota, US | 1997-2006 | | Hospital | 1,644 | 1,021
 | Norris Cotton Cancer Center, Dartmouth Medical School (NELCS study) | New Hampshire, US | 2005-2008 | | Population | 228 | 162
16 | Penn State University College of Medicine | Florida, US | 2000-2003 | 18-79 years old, within 1 year of diagnosis, no previous cancer history | Community | 447 | 733
18 | University of California, Los Angeles (UCLA) | California, US | 1999-2004 | 18-65 years old | Population | 319 | 581
13 | University of California, San Francisco (UCSF) | San Francisco, US | 1998-2003, 2005-2009 | >18 years old | Population and community | 1,804 | 558
17 | National Institute of Occupational Health | Norway | 1986-2005 | Current smokers or quit smoking <5 years | Population | 443 | 436
 | University of Sheffield | United Kingdom | 2005-2009 | Diagnosed under age 61 or reported family history | Population recruited through family | 114 | 133
 | INSERM U794 | France | | Only ever smokers | | 135 | 146
10, 12, 14, 15 | Helmholtz Center Munich; University of Göttingen Medical School; German Cancer Research Center (DKFZ); German Cancer Research Center (DKFZ) | Munich, Göttingen, Germany; Munich, Germany; Heidelberg, Germany; Heidelberg, Germany | 2000-2008; 1990-1998; 1997-2007; 1994-1998 | LUCY 18-51 years old; INRA all ages; DKFZ from 18 years old; EPIC from 18 years old | Population | 1,839 | 3,336
8 | German Cancer Research Center (DKFZ) | Saarland, Germany | 2000-2003 | 50-75 years old | Population | 198 | 203
 | University Hospital of Cologne | Cologne, Germany | 2005-2008 | | Hospital | 450 | 327
6 | Division of Medical Oncology, University Hospital | Zaragoza, Spain | 2006-2008 | | Hospital | 350 | 1,227
6 | Radboud University Nijmegen Medical Centre | Netherlands | 2008 | 18-75 years old | Population | 396 | 2,068
 | CHS National Cancer Control Center at Carmel Medical Center and Technion | Haifa, Israel | 2007-2009 | | Population | 212 | 197
Asians
18 | University of California, Los Angeles (UCLA) | California, US | 1999-2004 | 18-65 years old | Population | 58 | 53
 | University of Hawaii | Hawaii, US | 1992-1997 | 26-79 years old | Population | 100 | 170
 | Samuel Lunenfeld Research Institute | Ontario, Canada | 1997-2002 | Residents of Greater Toronto Area | Population and hospital | 65 | 98
9 | Seoul National University | Korea | 2001-2008 | Hospital | Hospital | 271 | 276
 | National University of Singapore | Singapore | 2005-2007 | Only women | Hospital | 484 | 813
11 | Aichi Cancer Center | Japan | 2000-2005 | 20-79 years old | Hospital | 716 | 716
Total | | | | | | 11,645 | 14,954

* Maximum number of cases and controls of European and Asian ethnic groups with DNA

SNPs selected

Gene Location | Gene Name | Gene Symbol | SNP ID | Nucleotide Change | Amino Acid Change | Rare allele freq* (Han Chinese) | Rare allele freq* (European) | Gene Function
15q25 | Cholinergic receptor, nicotinic, alpha 5 | CHRNA5 | rs16969968 | Ex5-54G>A | Asp398Asn | 0.03 (A) | 0.42 (A) | Nicotinic acetylcholine receptors are members of a superfamily of ligand-gated ion channels that mediate fast signal transmission at synapses.
 | | | rs931794 | -31881G>A | | 0.31 (G) | 0.43 (G) |
 | Cholinergic receptor, nicotinic, alpha 3 | CHRNA3 | rs12914385 | IVS4-4117G>A | | 0.25 (A) | 0.43 (A) |
 | | | rs1317286 | T>C | | 0.08 (C) | 0.41 (C) |
 | Similar to RIKEN cDNA C630028N24 gene | LOC123688 | rs8034191 | IVS2+256T>C | | 0.04 (C) | 0.43 (C) | A hypothetical gene
6p | Nucleolar protein 5B pseudogene | NOL5BP | rs4324798 | G>A | | 0.00 (A) | 0.11 (A) |
 | (Unknown) | (Unknown) | rs2256543 | C>T | | 0.17 (T) | 0.43 (T) |
5p15 | Cisplatin resistance related protein CRR9p | CLPTM1L | rs402710 | IVS16+9G>A | | 0.27 (A) | 0.33 (A) | CLPTM1L is a predicted transmembrane protein that is expressed in a range of normal and malignant tissues including skin, lung, breast, ovary and cervix.
 | Telomerase reverse transcriptase | TERT | rs2736100 | IVS2-3777G>T | | 0.41 (G) | 0.45 (T) | The enzyme consists of a protein component with reverse transcriptase activity, encoded by this gene, and an RNA component which serves as a template for the telomere repeat.

*From HapMap

• For Caucasians: two variants in the 15q25 locus (rs8034191 and rs16969968), two variants in 5p15 (rs402710 and rs2736100), and two variants in 6p21 (rs4324798 and rs2256543).
• For Asians: three additional variants were selected in the 15q25 region (rs12914385, rs1317286 and rs931794), and the variants in 6p21 were not genotyped because of their low prevalence in these populations.

Methods

Study population

◦ The control group was frequency matched to cases on age and sex in most of the studies.

◦ Some other studies also matched on ethnicity, residence or smoking status

◦ Only Whites or Asians were included

Genotyping method◦ Genotyping was performed locally using Taqman probes supplied centrally from IARC

◦ Toronto and France studies used data from the Illumina HumanHap300 BeadChip

◦ German Multicentre and Saarland study used Sequenom’s iPLEX assay

◦ Spain and Netherlands studies used the Centaurus (Nanogen) platform

Method (continued)

Quality control
◦ 90 standard DNAs were used as inter-lab controls
◦ A study was excluded from the analysis of a variant if more than one discrepancy for that variant was found
◦ Average call rates per SNP: 97.1% to 99.6%
◦ No deviation from HWE was observed (P cutoff = 0.0005, considering 100 independent tests)

Statistical analysis
◦ Unconditional logistic regression to estimate ORs and 95% CIs
◦ Adjusted for age, sex, center, and smoking packyears
◦ Cochran’s Q test for heterogeneity
◦ SAS 9.1 software was used
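
For readers unfamiliar with Cochran's Q, the following Python sketch shows the usual inverse-variance form of the statistic applied to per-study log odds ratios. It is an illustration only: the analysis in these slides was run in SAS 9.1, the function names are hypothetical, and the per-study estimates in the example are made-up numbers, not results from the participating studies.

```python
import math


def log_or_and_se(or_estimate, ci_lower, ci_upper, z=1.96):
    """Recover log(OR) and its standard error from an OR and its 95% CI."""
    log_or = math.log(or_estimate)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return log_or, se


def cochran_q(estimates):
    """Cochran's Q for a list of (log_or, se) pairs.

    Returns (Q, degrees_of_freedom, pooled_log_or). Under homogeneity, Q is
    approximately chi-square distributed with k - 1 degrees of freedom.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return q, len(estimates) - 1, pooled


# Made-up per-study ORs with 95% CIs, for illustration only.
studies = [log_or_and_se(1.30, 1.10, 1.54),
           log_or_and_se(1.20, 0.95, 1.52),
           log_or_and_se(1.35, 1.15, 1.58)]
q, df, pooled = cochran_q(studies)
print(f"Q = {q:.2f} on {df} df, pooled OR = {math.exp(pooled):.2f}")
```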

Table 2 – Distribution of selected demographic variables by ethnic group

 | Whites cases n | % | Whites controls n | % | Asians cases n | % | Asians controls n | %
Sex
Male | 5,741 | 57.7 | 7,325 | 57.1 | 838 | 49.5 | 902 | 42.4
Female | 4,210 | 42.3 | 5,503 | 42.9 | 856 | 50.5 | 1,224 | 57.6
Age (years)
<50 | 1,252 | 12.6 | 2,969 | 23.1 | 182 | 10.7 | 243 | 11.4
50-59 | 2,499 | 25.1 | 3,443 | 26.8 | 426 | 25.1 | 492 | 23.1
60-69 | 3,273 | 32.9 | 3,859 | 30.1 | 565 | 33.4 | 679 | 31.9
70-79 | 2,451 | 24.6 | 2,310 | 18.0 | 443 | 26.2 | 612 | 28.8
≥80 | 476 | 4.8 | 247 | 1.9 | 78 | 4.6 | 100 | 4.7
Smoking status
Never | 962 | 9.7 | 4,136 | 32.2 | 674 | 39.8 | 1,270 | 59.7
Former smoker | 4,125 | 41.4 | 4,491 | 35.0 | 461 | 27.2 | 470 | 22.1
Current smoker | 4,644 | 46.7 | 3,173 | 24.7 | 526 | 31.1 | 308 | 14.5
Ex or current | 134 | 1.3 | 455 | 3.6 | 23 | 1.4 | 20 | 0.9
Missing | 86 | 0.9 | 573 | 4.5 | 10 | 0.6 | 58 | 2.7
Histology (cases only)
Adenocarcinoma | 3,892 | 39.1 | | | 329 | 30.1 | |
Squamous cell | 2,370 | 23.8 | | | 317 | 29.0 | |
Large cell | 413 | 4.2 | | | 96 | 8.8 | |
Small cell | 1,235 | 12.4 | | | 109 | 10.0 | |
Other or not specified | 2,041 | 20.5 | | | 243 | 22.2 | |
Total | 9,951 | | 12,828 | | 1,694 | | 2,126 |

Table 3 – Summary estimates of the main effects of the selected variants in White and Asian ethnic groups

Variant | Risk allele | Allele freq. | Cases ref/het./hom. | Controls ref/het./hom. | Heterozygotes OR (95% CI) | Homozygotes OR (95% CI) | Per allele OR (95% CI) | p-trend | p-heterogeneity (by study)
Whites
Chr 15q25
rs16969968 | A | 0.35 | 3,371 / 4,523 / 1,484 | 4,827 / 5,019 / 1,373 | 1.33 (1.24-1.41) | 1.54 (1.41-1.69) | 1.26 (1.21-1.32) | 2 × 10⁻²⁶ | 0.09
rs8034191 | G | 0.35 | 2,586 / 3,488 / 1,185 | 4,036 / 4,256 / 1,171 | 1.33 (1.24-1.43) | 1.62 (1.47-1.79) | 1.29 (1.23-1.35) | 6 × 10⁻²⁵ | 0.15
Chr 5p15
rs2736100 | C | 0.51 | 1,878 / 4,526 / 2,722 | 2,853 / 5,817 / 3,142 | 1.16 (1.07-1.25) | 1.32 (1.21-1.43) | 1.15 (1.10-1.20) | 1 × 10⁻¹⁰ | 0.60
rs402710 | G | 0.65 | 873 / 3,847 / 4,140 | 1,115 / 4,178 / 3,905 | 1.16 (1.04-1.28) | 1.30 (1.18-1.45) | 1.14 (1.09-1.19) | 5 × 10⁻⁸ | 0.73
Chr 6p
rs2256543 | A | 0.43 | 2,898 / 4,519 / 1,803 | 3,860 / 5,813 / 2,260 | 1.03 (0.96-1.10) | 1.07 (0.98-1.16) | 1.03 (0.99-1.08) | 0.14 | 0.92
rs4324798 | A | 0.08 | 8,066 / 1,630 / 111 | 10,580 / 1,911 / 99 | 1.04 (0.96-1.12) | 1.39 (1.04-1.87) | 1.07 (0.99-1.14) | 0.07 | 0.11
Asians
Chr 15q25
rs16969968 | A | 0.03 | 1,591 / 98 / 2 | 1,986 / 125 / 5 | 0.98 (0.75-1.30) | 0.44 (0.08-2.31) | 0.94 (0.73-1.23) | 0.67 | 0.07
rs8034191 | G | 0.03 | 1,583 / 104 / 3 | 1,992 / 122 / 3 | 1.06 (0.81-1.40) | 1.06 (0.21-5.36) | 1.06 (0.82-1.37) | 0.66 | 0.09
rs12914385 | T | 0.30 | 728 / 647 / 148 | 584 / 762 / 177 | 1.05 (0.91-1.21) | 1.04 (0.81-1.32) | 1.03 (0.93-1.14) | 0.58 | 0.10
rs1317286 | G | 0.10 | 1,223 / 291 / 13 | 1,521 / 313 / 22 | 1.18 (0.99-1.41) | 0.73 (0.36-1.47) | 1.10 (0.94-1.30) | 0.23 | 0.10
rs931794 | G | 0.37 | 591 / 721 / 213 | 764 / 828 / 264 | 1.12 (0.96-1.29) | 1.01 (0.82-1.25) | 1.03 (0.93-1.14) | 0.54 | 0.10
Chr 5p15
rs2736100 | C | 0.39 | 538 / 836 / 312 | 775 / 1,014 / 312 | 1.24 (1.07-1.43) | 1.51 (1.24-1.83) | 1.23 (1.12-1.35) | 2 × 10⁻⁵ | 0.32
rs402710 | G | 0.68 | 144 / 694 / 842 | 219 / 917 / 981 | 1.15 (0.91-1.46) | 1.32 (1.05-1.66) | 1.15 (1.04-1.27) | 0.007 | 0.22

Ref.: reference class; Het.: heterozygote; Hom.: homozygote for the risk allele
Risk allele frequencies are calculated among controls
ORs are adjusted for age, sex, and study

• Among Whites: increased risk was observed for two SNPs on Chr 15 and two SNPs on Chr 5
• Among Asians: increased risk was observed for two SNPs on Chr 5
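
To show how the genotype counts in Table 3 relate to the reported estimates, the Python sketch below computes crude (unadjusted) odds ratios and the risk allele frequency among controls for rs16969968 in Whites. The function names are hypothetical. Because the published ORs are adjusted for age, sex, and study, the crude values are expected to be close to them but not identical.

```python
def odds_ratio(case_exposed, case_ref, control_exposed, control_ref):
    """Crude odds ratio from a 2x2 table (exposed vs reference genotype)."""
    return (case_exposed * control_ref) / (case_ref * control_exposed)


def genotype_odds_ratios(cases, controls):
    """Crude heterozygote and homozygote ORs from (ref, het, hom) counts."""
    case_ref, case_het, case_hom = cases
    ctrl_ref, ctrl_het, ctrl_hom = controls
    het_or = odds_ratio(case_het, case_ref, ctrl_het, ctrl_ref)
    hom_or = odds_ratio(case_hom, case_ref, ctrl_hom, ctrl_ref)
    return het_or, hom_or


def risk_allele_frequency(ref, het, hom):
    """Risk allele frequency: (het + 2 * hom) alleles out of 2N."""
    return (het + 2 * hom) / (2 * (ref + het + hom))


# rs16969968 in Whites, counts as ref / het / hom from Table 3.
cases = (3371, 4523, 1484)
controls = (4827, 5019, 1373)

het_or, hom_or = genotype_odds_ratios(cases, controls)
freq = risk_allele_frequency(*controls)
print(f"crude het OR = {het_or:.2f}, crude hom OR = {hom_or:.2f}, "
      f"risk allele freq in controls = {freq:.2f}")
```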

Figure 1 – Stratified analysis for rs16969968 (Chr 15) in Whites

CHRNA5 (rs16969968) | Ca | Co | OR | 95% CI
Heterozygous | 4523 | 5019 | 1.33 | 1.24-1.41
Homozygous | 1484 | 1373 | 1.54 | 1.41-1.69
Co-dominant | 9378 | 11219 | 1.26 | 1.21-1.32
By histology (p-heterogeneity=0.38)
Adenocarcinomas | 3776 | 11219 | 1.22 | 1.15-1.29
Squamous | 2128 | 11219 | 1.31 | 1.23-1.41
Large cell | 384 | 10896 | 1.23 | 1.06-1.43
Small cell | 1106 | 10597 | 1.21 | 1.10-1.33
By smoking status (p-heterogeneity=0.0001)
Never smokers | 922 | 3706 | 1.02 | 0.91-1.14
Former smokers | 3994 | 4181 | 1.27 | 1.18-1.37
Current smokers | 4277 | 2875 | 1.39 | 1.29-1.50
By packyears (p-heterogeneity=0.75)
>0-<10 packyears | 453 | 1591 | 1.19 | 0.99-1.43
10-<20 packyears | 771 | 1271 | 1.22 | 1.05-1.42
20-<30 packyears | 1076 | 1165 | 1.22 | 1.07-1.39
30-<40 packyears | 1373 | 1057 | 1.28 | 1.13-1.46
40-<50 packyears | 1257 | 691 | 1.11 | 0.97-1.29
50-<60 packyears | 919 | 400 | 1.29 | 1.07-1.56
>=60 packyears | 2069 | 693 | 1.14 | 1.00-1.30
By age (p-trend=0.002)
<50 | 1177 | 2323 | 1.49 | 1.34-1.66
50-60 | 2356 | 3084 | 1.31 | 1.21-1.42
60-70 | 3095 | 3530 | 1.19 | 1.11-1.29
>=70 | 2750 | 2282 | 1.16 | 1.06-1.27
By gender (p-heterogeneity=0.88)
Men | 5264 | 6445 | 1.26 | 1.19-1.33
Women | 4114 | 4774 | 1.27 | 1.19-1.35

[Forest plot; x-axis: OR, ticks at 1.0, 1.2, 1.4, 1.6]

• No association among never smokers
• Stronger associations in current smokers than in former smokers

Table 4 – Association between rs16969968 and smoking intensity expressed in cigarettes per day in the White ethnic group

rs16969968 (CHRNA5) | All n | mean | 95% CI | Controls n | mean | 95% CI | Cases n | mean | 95% CI
GG | 5425 | 20.74 | 20.36-21.12 | 2610 | 17.99 | 17.45-18.53 | 2815 | 22.68 | 22.14-23.22
GA | 6597 | 21.85 | 21.49-22.20 | 2701 | 19.20 | 18.67-19.78 | 3896 | 23.70 | 23.22-24.18
AA | 2039 | 23.48 | 22.92-24.04 | 746 | 20.56 | 19.68-21.44 | 1293 | 25.56 | 24.84-26.28
p-trend | | | 7 × 10⁻¹⁹ | | | 6 × 10⁻⁹ | | | 5 × 10⁻¹²

Means are adjusted for age, sex, study and case/control status when appropriate

• The mean number of cigarettes per day was higher among homozygous carriers of the risk allele than among carriers of the common allele.

Figure 2 – Stratified analysis for rs2736100 and rs402710 (Chr 5) in Whites and Asians

Whites: TERT (rs2736100) | Ca | Co | OR | 95% CI
Heterozygous | 4526 | 5817 | 1.16 | 1.07-1.25
Homozygous | 2722 | 3142 | 1.32 | 1.21-1.43
Co-dominant | 9162 | 11812 | 1.15 | 1.10-1.20
By histology (p-heterogeneity=0.0002)
Adenocarcinomas | 3551 | 11666 | 1.20 | 1.13-1.27
Squamous | 2162 | 11812 | 1.06 | 0.99-1.13
Large cell | 405 | 11344 | 1.33 | 1.15-1.54
Small cell | 1205 | 11733 | 1.00 | 0.92-1.09
By smoking status (p-heterogeneity=0.44)
Never smokers | 934 | 3972 | 1.22 | 1.09-1.35
Former smokers | 3699 | 4095 | 1.14 | 1.07-1.23
Current smokers | 4309 | 2760 | 1.12 | 1.04-1.20
By age (p-heterogeneity=0.13)
<50 | 1192 | 2708 | 1.18 | 1.06-1.31
50-60 | 2327 | 3234 | 1.24 | 1.15-1.35
60-70 | 2986 | 3536 | 1.11 | 1.04-1.20
>=70 | 2657 | 2334 | 1.09 | 1.00-1.20
By gender (p-heterogeneity=0.02)
Men | 5288 | 6758 | 1.10 | 1.05-1.17
Women | 3874 | 5054 | 1.22 | 1.14-1.30

Asians: TERT (rs2736100) | Ca | Co | OR | 95% CI
Heterozygous | 836 | 1014 | 1.24 | 1.07-1.43
Homozygous | 312 | 312 | 1.51 | 1.24-1.83
Co-dominant | 1686 | 2101 | 1.23 | 1.12-1.35
By histology (p-heterogeneity=0.01)
Adenocarcinomas | 927 | 2101 | 1.32 | 1.18-1.48
Squamous | 317 | 2101 | 0.93 | 0.78-1.12
Large cell | 94 | 2101 | 1.17 | 0.87-1.59
Small cell | 109 | 2101 | 1.00 | 0.75-1.33
By smoking status (p-heterogeneity=0.75)
Never smokers | 671 | 1264 | 1.27 | 1.10-1.46
Former smokers | 458 | 454 | 1.24 | 1.02-1.51
Current smokers | 524 | 305 | 1.15 | 0.93-1.42
By age (p-heterogeneity=0.63)
<50 | 181 | 242 | 1.24 | 0.92-1.67
50-60 | 423 | 487 | 1.20 | 0.99-1.44
60-70 | 562 | 674 | 1.34 | 1.14-1.59
>=70 | 520 | 698 | 1.15 | 0.98-1.37
By gender (p-heterogeneity=0.03)
Men | 834 | 886 | 1.10 | 0.96-1.26
Women | 852 | 1215 | 1.35 | 1.19-1.54

[Forest plots; x-axis: OR, ticks at 0.8, 1.0, 1.2, 1.4, 1.8]

• More important in adenocarcinomas and large cell carcinomas.
• Stronger association in women (no heterogeneity by gender was observed when the analysis was restricted to adenocarcinoma).

Figure 2 – Stratified analysis for rs2736100 and rs402710 (Chr 5) in Whites and Asians (continued)

Whites: CLPTM1L (rs402710) | Ca | Co | OR | 95% CI
Heterozygous | 3847 | 4178 | 1.16 | 1.04-1.28
Homozygous | 4140 | 3905 | 1.30 | 1.18-1.45
Co-dominant | 8860 | 9198 | 1.14 | 1.09-1.19
By histology (p-heterogeneity=0.03)
Adenocarcinomas | 3583 | 9052 | 1.18 | 1.11-1.26
Squamous | 2004 | 9198 | 1.15 | 1.07-1.25
Large cell | 364 | 8731 | 1.21 | 1.03-1.42
Small cell | 1102 | 8571 | 1.00 | 0.91-1.10
By smoking status (p-heterogeneity=0.27)
Never smokers | 845 | 3070 | 1.15 | 1.01-1.30
Former smokers | 3617 | 3155 | 1.20 | 1.11-1.30
Current smokers | 4217 | 2547 | 1.09 | 1.01-1.18
By age (p-heterogeneity=0.51)
<50 | 1144 | 2108 | 1.12 | 1.00-1.25
50-60 | 2202 | 2642 | 1.21 | 1.10-1.32
60-70 | 2910 | 2783 | 1.11 | 1.02-1.20
>=70 | 2604 | 1665 | 1.13 | 1.02-1.24
By gender (p-heterogeneity=0.85)
Men | 4977 | 5472 | 1.13 | 1.07-1.20
Women | 3883 | 3726 | 1.14 | 1.07-1.23

Asians: CLPTM1L (rs402710) | Ca | Co | OR | 95% CI
Heterozygous | 694 | 917 | 1.15 | 0.91-1.46
Homozygous | 842 | 981 | 1.32 | 1.05-1.66
Co-dominant | 1680 | 2117 | 1.15 | 1.04-1.27
By histology (p-heterogeneity=0.58)
Adenocarcinomas | 921 | 2117 | 1.20 | 1.06-1.36
Squamous | 316 | 2117 | 1.15 | 0.95-1.39
Large cell | 95 | 2117 | 1.07 | 0.78-1.46
Small cell | 109 | 2117 | 0.98 | 0.73-1.31
By smoking status (p-heterogeneity=0.67)
Never smokers | 663 | 1263 | 1.08 | 0.92-1.25
Former smokers | 460 | 469 | 1.20 | 0.98-1.47
Current smokers | 525 | 307 | 1.07 | 0.87-1.33
By age (p-heterogeneity=0.61)
<50 | 182 | 241 | 1.01 | 0.75-1.35
50-60 | 420 | 491 | 1.19 | 0.97-1.45
60-70 | 562 | 676 | 1.24 | 1.04-1.48
>=70 | 516 | 709 | 1.09 | 0.92-1.31
By gender (p-heterogeneity=0.92)
Men | 836 | 899 | 1.15 | 1.00-1.32
Women | 844 | 1218 | 1.14 | 0.99-1.31

[Forest plots; x-axis: OR, ticks at 0.8, 1.0, 1.2, 1.4, 1.8]

• Heterogeneity by histology observed in Whites only

Page 51: An Introduction to Genome-Wide Association Studies Shen-Chih Chang, PhD Epi 295 Oct 2, 2009

Number of risk alleles   Ca     Co     OR     95% CI      p-value
Total                    7964   8212
0                        95     153    1.00   reference
1                        551    765    1.16   0.87-1.55   0.32
2                        1538   1856   1.30   0.98-1.71   0.06
3                        2364   2458   1.53   1.16-2.01   0.003
4                        2097   1955   1.72   1.30-2.26   1×10⁻⁴
5                        1099   883    1.98   1.49-2.63   2×10⁻⁶
6                        220    142    2.64   1.86-3.74   4×10⁻⁸
Per risk allele                        1.15   1.12-1.18   1×10⁻²⁶

Table 6 – Association between risk of lung cancer and combined genotypes of rs402710, rs2736100 and rs16969968 in Whites

• An OR of 2.64 was found for homozygous carriers of the three risk variants compared to individuals with no risk allele.
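As a rough cross-check of Table 6, the sketch below computes unadjusted (crude) odds ratios from the case and control counts, using individuals with no risk allele as the reference group. The Woolf log-scale confidence interval is an assumption made here for illustration; the ORs reported in the table may come from an adjusted model, so the crude values need not match them exactly.

```python
import math

# Case/control counts by number of risk alleles, copied from Table 6.
COUNTS = {0: (95, 153), 1: (551, 765), 2: (1538, 1856), 3: (2364, 2458),
          4: (2097, 1955), 5: (1099, 883), 6: (220, 142)}


def crude_or(cases, controls, ref_cases, ref_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval."""
    odds_ratio = (cases * ref_controls) / (controls * ref_cases)
    se_log = math.sqrt(1 / cases + 1 / controls + 1 / ref_cases + 1 / ref_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


ref_cases, ref_controls = COUNTS[0]
for n_alleles in range(1, 7):
    cases, controls = COUNTS[n_alleles]
    or_, lo, hi = crude_or(cases, controls, ref_cases, ref_controls)
    print(f"{n_alleles} risk alleles: OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The crude estimates rise with the number of risk alleles, in the same direction as the reported values.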


Conclusion

The largest pool of independent studies not included in previous GWAS.

For 15q25
◦ Replicated the results from GWAS in Whites.
◦ Expanded to Asians, with no association observed.

For 5p15
◦ Confirmed the results in Whites.
◦ Reported an association in Asians.

For 6p21
◦ Results were not replicated.


• MD Anderson
• European descent
• Bladder cancer

• Nature Genetics, 41, 991 - 995 (2009).


First Stage - discovery

969 bladder cancer cases/957 controls

Illumina HumanHap610 BeadChip (Illumina)

556,426 SNPs
• None of the SNPs reached genome-wide significance.
• After removing highly linked SNPs, three SNPs had a P-value < 10⁻⁵ and 50 SNPs showed a P-value < 10⁻⁴.
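To give a sense of scale for "genome-wide significance", the sketch below computes a Bonferroni-corrected threshold for the 556,426 SNPs on the chip. The 0.05 family-wise error rate and the Bonferroni rule itself are assumptions for illustration; the slides do not state which threshold the study used.

```python
N_SNPS = 556_426   # SNPs tested in the discovery stage (from the slide)
ALPHA = 0.05       # assumed family-wise error rate


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_SNPS)
print(f"Per-SNP threshold: {threshold:.2e}")
```

Under these assumptions the per-SNP threshold falls below 10⁻⁷, so P-values of about 10⁻⁵ would count as suggestive rather than significant.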


No evidence for inflation of the chi-squared test statistics (none of the SNPs reached genome-wide significance).
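Inflation of test statistics is commonly summarized by the genomic control factor (lambda), the ratio of the observed median chi-squared statistic to the median expected under the null. The sketch below is a minimal illustration using only the Python standard library; the function names are hypothetical and it is not the software used in the study.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a 1-df chi-squared distribution (about 0.455), expected under the null.
EXPECTED_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2(p_value):
    """Convert a two-sided p-value to the matching 1-df chi-squared statistic."""
    z = _STD_NORMAL.inv_cdf(1 - p_value / 2)
    return z * z


def genomic_inflation(p_values):
    """Genomic control lambda: observed median chi-squared over expected median."""
    observed = median([p_to_chi2(p) for p in p_values])
    return observed / EXPECTED_MEDIAN_CHI2


# Evenly spaced p-values mimic a null distribution, so lambda should be close to 1.
n = 1001
null_like = [(i + 0.5) / n for i in range(n)]
print(f"lambda = {genomic_inflation(null_like):.3f}")
```

A lambda close to 1 is usually read as little evidence of systematic inflation, for example from population stratification.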


Second Stage - replication

Genotyping the top 50 SNPs and the top 10 additional SNPs in 8q24 in 3 additional US sites:
◦ New Hampshire (800 cases/912 controls)
◦ Texas (764 cases/2,807 controls)
◦ MSKCC (149 cases/152 controls)

One SNP, rs2294008, showed consistent results in the discovery and replication phases.

9 additional European populations were used to replicate this SNP.


Overall allelic OR, 1.15 (95% CI, 1.10-1.20).

No heterogeneity between populations was observed.
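One common way to pool per-population odds ratios and check for heterogeneity is a fixed-effect inverse-variance model with Cochran's Q and I². The sketch below illustrates that approach only. The three input estimates are hypothetical values invented for the example, and the slides do not say which method produced the overall OR.

```python
import math

# Hypothetical per-population estimates (OR, lower 95% limit, upper 95% limit).
# These numbers are illustrative only and are not taken from the study.
HYPOTHETICAL_POPULATIONS = [
    (1.12, 1.00, 1.25),
    (1.18, 1.05, 1.33),
    (1.15, 0.98, 1.35),
]


def pooled_or_and_heterogeneity(estimates, z=1.96):
    """Fixed-effect pooled OR, Cochran's Q, and I^2 (percent)."""
    log_ors, weights = [], []
    for odds_ratio, lower, upper in estimates:
        # Recover the standard error of log(OR) from the confidence limits.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        log_ors.append(math.log(odds_ratio))
        weights.append(1 / se ** 2)
    pooled_log = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (b - pooled_log) ** 2 for w, b in zip(weights, log_ors))
    df = len(estimates) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return math.exp(pooled_log), q, i_squared


pooled, q, i2 = pooled_or_and_heterogeneity(HYPOTHETICAL_POPULATIONS)
print(f"pooled OR {pooled:.2f}, Q = {q:.2f}, I^2 = {i2:.0f}%")
```

A Q statistic close to its degrees of freedom (number of populations minus one) and an I² near 0% would be consistent with a statement of no heterogeneity between populations.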


Similar associations were observed across different strata of gender, smoking status, and age.


Third Stage - expansion

rs2294008 is located in exon 1 of the PSCA gene (prostate stem cell antigen), which is upregulated in most bladder tumors.

Resequenced the genomic region of PSCA in 106 individuals of European ancestry.
◦ 27 SNPs were identified.
◦ All of the high-frequency SNPs are in strong LD with rs2294008.
◦ 7 of the SNPs were genotyped in the discovery set, and their ORs were identical to that of rs2294008.


Fourth Stage - expansion

Bladder cancer cell line study of rs2294008
◦ The T allele-containing haplotypes showed significantly lower promoter activity.
◦ Substitution of C to T significantly reduced promoter activity.
◦ Substitution of T to C increased promoter activity.
◦ rs2294008 is a functional variant in vitro.

Paradoxical
◦ The T allele reduces the transcriptional activity of the PSCA promoter.
◦ PSCA has been shown to be overexpressed in bladder tumors.
◦ The functional consequence of the T allele in vivo is still unclear.


The following slides are from Dr. McKay's presentation at the 2009 INHANCE meeting, Paris.


Total sample numbers: genotypes, replication

Study              Ca     Co
Bremen             163    189
South America      1228   1076
Rome               235    222
Seattle            208    413
UNC                1288   1362
Penn State         429    685
UCLA               319    934
ORC                477    487
Brown              568    651
Pittsburgh         633    793
Netherlands        454    304
Total replicates   6002   7116

Updated: 06/26/09


Discovery phase (disc): 4429 ca/5996 co. Replication (rep): 5322 ca/6218 co.

marker      segment    GeneSymbol  coding_status  Reason                       OR_disc  95%CI_disc  P_disc  OR_rep  95%CI_rep  P_rep
rs1573496   4q         ADH7        SYNON          p_all<1×10⁻⁵                 0.70     0.62-0.79   8E-09   0.78    0.71-0.85  6E-08
rs7431530   3p2        RBMS3       -              p_all<1×10⁻⁵                 0.81     0.74-0.88   2E-06   0.94    0.88-1.00  0.03
rs4767364   12q24.13a  FLJ13089    -              p_all<1×10⁻⁵                 1.21     1.12-1.32   2E-06   1.12    1.04-1.19  0.001
rs10801805  1p         ZNF326      -              p_all<1×10⁻⁵                 1.20     1.11-1.29   3E-06   1.03    0.97-1.09  0.29
rs2287802   19p        COL5A3      -              p_all<1×10⁻⁵                 1.19     1.11-1.28   4E-06   1.02    0.96-1.08  0.46
rs4799863   18q1       FHOD3       -              p_all<1×10⁻⁵                 0.84     0.78-0.91   5E-06   0.95    0.90-1.00  0.06
rs11067362  12q24.21b  TBX3        -              p_all<1×10⁻⁵                 1.35     1.19-1.54   6E-06   1.01    0.92-1.10  0.89
rs2299851   6p         MSH5        -              p_all<1×10⁻⁵                 0.72     0.62-0.83   6E-06   1.09    0.98-1.22  0.10
rs1431918   8q         ASPH        -              p_all<1×10⁻⁵                 1.19     1.10-1.28   7E-06   1.04    0.98-1.10  0.18
rs7924284   10q        CWF19L1     -              p_all<1×10⁻⁵                 1.38     1.20-1.59   8E-06   0.97    0.88-1.07  0.59
rs2517452   6p         C6orf15     -              p_oral<5×10⁻⁷                0.86     0.80-0.93   8E-05   1.01    0.94-1.08  0.75
rs16837730  1p3        OPRD1       -              P_heavy<5×10⁻⁷               1.35     1.14-1.60   4E-04   1.11    0.98-1.25  0.11
rs1041973   2q         IL1RL1      NONSYN         NONSYN p<1×10⁻⁴              0.83     0.76-0.90   3E-05   0.93    0.87-0.99  0.01
rs3810481   20q13      PRIC285     NONSYN         NONSYN p<1×10⁻⁴              1.22     1.11-1.34   6E-05   1.06    0.98-1.15  0.14
rs2012199   1q         FCRL5       NONSYN         NONSYN p<1×10⁻⁴              1.24     1.11-1.38   1E-04   1.07    0.99-1.16  0.10
rs484870    19p        FLJ35784    NONSYN         NONSYN p<1×10⁻⁴              1.16     1.08-1.26   1E-04   0.98    0.93-1.04  0.59
rs1494961   4q         HEL308      NONSYN         NONSYN p<1×10⁻⁴              1.15     1.07-1.24   1E-04   1.11    1.05-1.17  0.0001

Additional SNPs genotyped, but not selected from GWAS study

rs1229984   4q         ADH1B       NONSYN         ADH1B                        0.49     0.40-0.60   2E-12   0.68    0.60-0.78  8E-09
rs16969968  15q25      CHRN        NONSYN         Lung cancer smoking variant  1.04     0.96-1.13   0.351   1.09    1.03-1.15  0.002
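The P-values in this table can be approximately recovered from each odds ratio and its 95% confidence interval with a Wald z-test on the log scale. The sketch below assumes the intervals are symmetric on the log scale and uses a normal approximation. It is a consistency check under those assumptions, not the study's analysis, and the function name is hypothetical.

```python
import math


def p_from_or_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Two-sided Wald p-value recovered from an OR and its 95% CI."""
    # Standard error of log(OR), assuming a symmetric interval on the log scale.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # erfc gives the two-sided normal tail without losing precision for small p.
    return math.erfc(abs(z) / math.sqrt(2))


# rs1573496 (ADH7), values copied from the table.
p_discovery = p_from_or_ci(0.70, 0.62, 0.79)
p_replication = p_from_or_ci(0.78, 0.71, 0.85)
print(f"discovery P ~ {p_discovery:.0e}, replication P ~ {p_replication:.0e}")
```

Because the published ORs and limits are rounded to two decimals, the recovered values agree with the table only to about one significant figure.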


Alcohol dehydrogenase 7
G → C substitution
Gly → Ala
MAF: Caucasian 0.09, Asian 0-0.5 (?)


RNA binding motif, single stranded interacting protein
C → T substitution
Intron
MAF: Caucasian 0.28, Asian 0.41


Chromosome 12 open reading frame 30
A → G substitution
Intron
MAF: Caucasian 0.65, Asian 0.03


Interleukin 1 receptor-like 1
C → A substitution
Ala → Glu
MAF: Caucasian 0.13, Asian 0.17


Helicase, POLQ-like
G → A substitution
Val → Ile
MAF: Caucasian 0.48, Asian 0.23


Alcohol dehydrogenase 1B
A → G substitution
His → Arg
MAF: Caucasian 0.99, Asian 0.23


Nicotinic acetylcholine receptors
G → A substitution
Asp → Asn
MAF: Caucasian 0.42, Asian 0.03


Limitations in GWAS

Potential false-positive (and false-negative) results.

Lack of information on gene function.

Insensitivity to rare variants.

Cannot assay insertion/deletion variants.

Requirement of large sample sizes.

Bias due to case and control selection.

Findings are many steps away from actual clinical application.
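To illustrate why large sample sizes are required, the sketch below approximates the number of cases (with an equal number of controls) needed to detect a modest allelic odds ratio. It uses the standard normal-approximation formula for comparing two proportions, applied to allele frequencies. The control allele frequency of 0.30, the 5×10⁻⁸ significance level, and the 80% power are assumptions chosen for illustration, not values from the slides.

```python
import math
from statistics import NormalDist


def subjects_per_group(odds_ratio, control_allele_freq, alpha=5e-8, power=0.80):
    """Approximate cases (= controls) needed for a two-sided allelic test."""
    p0 = control_allele_freq
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    # Each subject contributes two alleles (assumes Hardy-Weinberg equilibrium).
    return math.ceil(alleles_per_group / 2)


for assumed_or in (1.15, 1.30, 1.50):
    n = subjects_per_group(assumed_or, control_allele_freq=0.30)
    print(f"OR {assumed_or:.2f}: about {n} cases and {n} controls")
```

Under these assumptions, an OR near 1.15 calls for several thousand cases and as many controls, and the requirement drops quickly as the OR grows.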


What we’ve learned from the GWAS?

International collaboration.

Don’t give up on negative results.

Be an active thinker, explore all possibilities.


What’s next after GWAS?


Thank you!!


References

Thorgeirsson TE, Geller F, Sulem P, Rafnar T, Wiste A, Magnusson KP, Manolescu A, Thorleifsson G, Stefansson H, Ingason A, Stacey SN, Bergthorsson JT, Thorlacius S, Gudmundsson J, Jonsson T, Jakobsdottir M, Saemundsdottir J, Olafsdottir O, Gudmundsson LJ, Bjornsdottir G, Kristjansson K, Skuladottir H, Isaksson HJ, Gudbjartsson T, Jones GT, Mueller T, Gottsäter A, Flex A, Aben KK, de Vegt F, Mulders PF, Isla D, Vidal MJ, Asin L, Saez B, Murillo L, Blondal T, Kolbeinsson H, Stefansson JG, Hansdottir I, Runarsdottir V, Pola R, Lindblad B, van Rij AM, Dieplinger B, Haltmayer M, Mayordomo JI, Kiemeney LA, Matthiasson SE, Oskarsson H, Tyrfingsson T, Gudbjartsson DF, Gulcher JR, Jonsson S, Thorsteinsdottir U, Kong A, Stefansson K. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature. 2008 Apr 3;452(7187):638-42.

Hung RJ, McKay JD, Gaborieau V, Boffetta P, Hashibe M, Zaridze D, Mukeria A, Szeszenia-Dabrowska N, Lissowska J, Rudnai P, Fabianova E, Mates D, Bencko V, Foretova L, Janout V, Chen C, Goodman G, Field JK, Liloglou T, Xinarianos G, Cassidy A, McLaughlin J, Liu G, Narod S, Krokan HE, Skorpen F, Elvestad MB, Hveem K, Vatten L, Linseisen J, Clavel-Chapelon F, Vineis P, Bueno-de-Mesquita HB, Lund E, Martinez C, Bingham S, Rasmuson T, Hainaut P, Riboli E, Ahrens W, Benhamou S, Lagiou P, Trichopoulos D, Holcátová I, Merletti F, Kjaerheim K, Agudo A, Macfarlane G, Talamini R, Simonato L, Lowry R, Conway DI, Znaor A, Healy C, Zelenika D, Boland A, Delepine M, Foglio M, Lechner D, Matsuda F, Blanche H, Gut I, Heath S, Lathrop M, Brennan P. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature. 2008 Apr 3;452(7187):633-7.


Amos CI, Wu X, Broderick P, Gorlov IP, Gu J, Eisen T, Dong Q, Zhang Q, Gu X, Vijayakrishnan J, Sullivan K, Matakidou A, Wang Y, Mills G, Doheny K, Tsai YY, Chen WV, Shete S, Spitz MR, Houlston RS. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat Genet. 2008 May;40(5):616-22. Epub 2008 Apr 2.

Wu X, Ye Y, Kiemeney LA, Sulem P, Rafnar T, Matullo G, Seminara D, Yoshida T, Saeki N, Andrew AS, Dinney CP, Czerniak B, Zhang ZF, Kiltie AE, Bishop DT, Vineis P, Porru S, Buntinx F, Kellen E, Zeegers MP, Kumar R, Rudnai P, Gurzau E, Koppova K, Mayordomo JI, Sanchez M, Saez B, Lindblom A, de Verdier P, Steineck G, Mills GB, Schned A, Guarrera S, Polidoro S, Chang SC, Lin J, Chang DW, Hale KS, Majewski T, Grossman HB, Thorlacius S, Thorsteinsdottir U, Aben KK, Witjes JA, Stefansson K, Amos CI, Karagas MR, Gu J. Genetic variation in the prostate stem cell antigen gene PSCA confers susceptibility to urinary bladder cancer. Nat Genet. 2009 Sep;41(9):991-5. Epub 2009 Aug 2.

Pearson TA, Manolio TA. How to interpret a genome-wide association study. JAMA. 2008 Mar 19;299(11):1335-44. Erratum in: JAMA. 2008 May 14;299(18):2150.