Genetic Epidemiology

M. Tevfik DORAK

http://www.dorak.info/epi/genetepi.html

Approaches to the identification of susceptibility genes

Rebbeck TR. Cancer 1999

GENETIC EPIDEMIOLOGIC RESEARCH METHODS

Handbook of Statistical Genetics (John Wiley & Sons), Fig. 28-1

GENETIC EPIDEMIOLOGY: Flow of research

Disease characteristics: Descriptive epidemiology
Familial clustering: Family aggregation studies
Genetic or environmental: Twin/adoption/half-sibling/migrant studies
Mode of inheritance: Segregation analysis
Disease susceptibility loci: Linkage analysis
Disease susceptibility markers: Association studies

Autosomal recessive disorders are usually more common in populations with a high level of inbreeding (restricted gene pool). Examples are Tangier disease on Tangier Island off the coast of Virginia, USA; many genetic disorders in Ashkenazi Jews (Tay-Sachs disease, Gaucher disease, Fanconi anaemia, Niemann-Pick disease); congenital adrenal hyperplasia (CAH) due to 21-hydroxylase deficiency in Yupik Eskimos; CAH due to 11-beta hydroxylase deficiency in Moroccan Jews; and thalassaemias (beta and alpha) in Cyprus and Sardinia.

Populations like those of Finland, Iceland and Newfoundland exhibit an increased prevalence of rare recessive diseases (for example, congenital nephrotic syndrome of the Finnish type and Newfoundland rod-cone dystrophy).
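The link between inbreeding and recessive disease can be made concrete with the standard population-genetic result that, for a recessive allele of frequency q and inbreeding coefficient F, the expected frequency of affected homozygotes is q² + Fq(1 − q). The sketch below is only an illustration: the function name and the numbers (q = 0.01, F = 1/16 for the offspring of first cousins) are assumptions and are not taken from the slides.

```python
def recessive_homozygote_frequency(q, inbreeding_f=0.0):
    """Expected frequency of recessive homozygotes.

    q            : frequency of the recessive disease allele (hypothetical input)
    inbreeding_f : inbreeding coefficient F (0 = random mating)
    Uses the standard result q^2 + F*q*(1 - q).
    """
    if not (0.0 <= q <= 1.0 and 0.0 <= inbreeding_f <= 1.0):
        raise ValueError("q and F must both lie between 0 and 1")
    return q * q + inbreeding_f * q * (1.0 - q)


# Hypothetical illustration: a rare allele under random mating
# versus in the offspring of first cousins (F = 1/16).
random_mating = recessive_homozygote_frequency(0.01)
first_cousins = recessive_homozygote_frequency(0.01, 1.0 / 16.0)
print(random_mating, first_cousins, first_cousins / random_mating)
```

The rarer the allele, the larger the relative increase produced by a given F, which is consistent with rare recessive diseases surfacing in isolated or inbred populations.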

Study Designs in Genetic Epidemiology

* nuclear families (index case and parents)
* affected relative pairs (sibs, cousins, any two members of the family)
* extended pedigrees
* twins (monozygotic and dizygotic)
* unrelated population samples


Sibling Recurrence Risk / Sibling Risk Ratio (λs)

Curnow & Smith: J Roy Stat Soc 1975;138:139-169
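The sibling risk ratio, conventionally written λs, is the recurrence risk in siblings of affected individuals divided by the prevalence in the general population. A minimal sketch follows; the function name and the example figures (6% sibling risk, 0.4% prevalence) are hypothetical and do not come from the slides.

```python
def sibling_risk_ratio(sib_recurrence_risk, population_prevalence):
    """Return lambda_s = risk to siblings of cases / population prevalence."""
    if not 0.0 < population_prevalence <= 1.0:
        raise ValueError("population prevalence must be in (0, 1]")
    if not 0.0 <= sib_recurrence_risk <= 1.0:
        raise ValueError("sibling recurrence risk must be in [0, 1]")
    return sib_recurrence_risk / population_prevalence


# Hypothetical figures: 6% of siblings of cases affected, 0.4% prevalence.
print(sibling_risk_ratio(0.06, 0.004))
```

A value near 1 indicates no familial clustering. A larger value indicates clustering but does not by itself separate shared genes from shared environment.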


Adoption Studies

1. Compare the risk in biological relatives with that in adoptive relatives of affected adoptees (beware of adoption bias)

2. Compare the risk in biological relatives with that in adoptive relatives of unaffected adoptees

Migrant Studies

Liao CK et al. Endometrial cancer in Asian migrants to the United States and their descendants. Cancer Causes Control 2003;14:357-60

Flood DM et al. Colorectal cancer incidence in Asian migrants to the United States and their descendants. Cancer Causes Control 2000;11:403-11

Feltbower RG et al. Trends in the incidence of childhood diabetes in south Asians and other children in Bradford, UK. Diabet Med 2002;19:162-6

“Children in south Asia have a low incidence of type 1 diabetes but migrants to the UK have similar overall rates to the indigenous population. However, a more steeply rising incidence is seen in the south Asian population, and our data suggest that incidence in this group may eventually outstrip that of the non-south Asians. Genetic factors are unlikely to explain such a rapid change, implying an influence of environmental factors in disease aetiology”


Washington University

Modes of inheritance


Differences between linkage and association

Linkage
* Linkage is a property of loci
* Role: to identify a biological mechanism for transmission of a trait; to locate the gene involved
* Coarse mapping (>1 cM)
* No information about which allelic variant is associated with higher risk of disease
* Requires family pedigrees
* Uses very polymorphic markers

Association
* Association is a property of alleles
* Role: to identify association between an allelic variant and a disease; to identify linkage disequilibrium between a disease allele and a marker
* Fine mapping (<1 cM)
* Case-control or family-based approach
* Usually uses bi-allelic markers


Risch NJ. Nature 2000


Association Studies

Population-based
Cases and unrelated population controls from the same study base

Family-based
Parent-case trios analysed with the TDT are the most common design

Odds Ratio: 3.6; 95% CI = 1.3 to 10.4

ROCHE Genetic Education
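The 2x2 counts behind this odds ratio are not reproduced on the slide, so the sketch below uses hypothetical counts purely to show how an odds ratio and an approximate 95% confidence interval are usually obtained. It uses Woolf's logit method, which is an assumption here; the slide does not state which interval method was used.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("Woolf's method needs all four cells to be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: marker carriers and non-carriers among cases and controls.
print(odds_ratio_with_ci(20, 10, 10, 20))
```

With small cell counts an exact interval is often preferred over this approximation.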

Genetic Models and Case-Control Association Data Analysis

The data may also be analysed assuming a prespecified genetic model. For example, with the hypothesis that carrying allele B increases risk of disease (dominant model), the AB and BB genotypes are pooled giving a 2×2 table. This is particularly relevant when allele B is rare, with few BB observations in cases and controls. Alternatively, under a recessive model for allele B, cells AA and AB would be pooled.

Analysing by alleles provides an alternative perspective for case control data. This breaks down genotypes to compare the total number of A and B alleles in cases and controls, regardless of the genotypes from which these alleles are constructed. This analysis is counter-intuitive, since alleles do not act independently, but it provides the most powerful method of testing under a multiplicative genetic model, where risk of developing a disease increases by a factor r for each B allele carried: risk r for genotype AB and r² for genotype BB. If a multiplicative genetic model is appropriate, both case and control genotypes will be in Hardy–Weinberg equilibrium, and this can be tested for.

A fourth possible genetic model is additive, with an increased disease risk of r for AB genotypes, and 2r for BB genotypes. This model shows a clear trend of an increased number of AB and BB genotypes, with the risk for AB genotypes approximately half that for BB genotypes. The additive genetic model can be tested for using Armitage’s test for trend.

Lewis CM. Brief Bioinform 2002

Linkage disequilibrium and population demography

Mapping disease genes by association requires the identification of linkage disequilibrium (LD) between a marker and a disease phenotype. Several studies of African populations have indicated that levels and patterns of LD in these populations differ from those in non-African populations owing to the age of African populations, admixture with other African and non-African populations, and historical differences in population size and substructure.

A disease mutation (shown in violet) that occurs on a single haplotype background will initially be in complete LD with flanking markers on that chromosome (see panel a). In each generation, LD between a marker and a disease allele decays owing to recombination between the sites, and also because of the effects of mutation and gene conversion at marker loci. Young populations, and those that have undergone recent bottlenecks (as probably occurred during the migration of ancestral humans out of Africa), will have haplotype blocks of large to moderate size (panel b, shown in green). In older and larger African populations, in which there has been more recombination, the size of haplotype blocks will probably be smaller (panel c).

LD can also be established by a founder event, with the strength and extent of the LD depending on the severity and length of the bottleneck event. Population substructure increases LD owing to a smaller effective population size and to higher levels of genetic drift in subdivided populations. So, if a pooled sample derived from several African populations was analysed, spurious LD would be detected, even if the haplotypes in each subpopulation were in linkage equilibrium. This could lead to erroneous conclusions about the association between genetic markers and disease phenotype.

Small populations of stable size are expected to show LD between closely linked loci as a result of increased genetic drift, and larger populations will have fewer sites in LD. New mutations are less likely to be in LD in growing populations owing to the smaller effect of genetic drift, but allelic associations that exist before population expansion might persist for a longer period of time in an expanding population than in a population of constant size.

Tishkoff, Nat Reviews Genet 2002
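The decay of LD through recombination described above follows a simple rule in an idealised, large, randomly mating population: the disequilibrium coefficient D shrinks by a factor (1 − θ) each generation, where θ is the recombination fraction between the two sites. The sketch below assumes that idealised setting and ignores drift, mutation, gene conversion and population structure; the function names and numbers are illustrative only.

```python
import math


def ld_after(d0, recombination_fraction, generations):
    """Disequilibrium D after a number of generations: D_t = D_0 * (1 - theta)^t."""
    if not 0.0 <= recombination_fraction <= 0.5:
        raise ValueError("recombination fraction must be between 0 and 0.5")
    return d0 * (1.0 - recombination_fraction) ** generations


def generations_to_halve(recombination_fraction):
    """Number of generations for D to fall to half of its starting value."""
    return math.log(0.5) / math.log(1.0 - recombination_fraction)


# Illustrative values: markers 1 cM apart (theta about 0.01) versus 0.1 cM apart.
for theta in (0.01, 0.001):
    print(theta, ld_after(0.25, theta, 100), generations_to_halve(theta))
```

Tightly linked markers keep their LD for many more generations, which is why association is suited to fine mapping while older populations show shorter haplotype blocks.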

Mapping Disease Susceptibility Genes by Association Studies

Plot of minus log of P value for case-control test for allelic association with AD, for SNPs immediately surrounding APOE (<100 kb)

Martin, 2000

Sample size requirements for different genetic models

Palmer & Cardon, Lancet 2005

Sample size requirements as a function of allele frequencies

Johnson GC et al. Nat Genet 2001

Sample size requirements as a function of the strength of association

Botstein & Risch. Nat Genet 2003

SNP Selection for Association Studies

- Regulatory / Functional SNPs -

Yue, 2006

SNP Selection for Association Studies

- Haplotype Tagging SNPs -

Haplotype Association

Tabor HK et al. Nature Rev Genetics 2002

Illustration of tagging SNPs

a | The diagram shows five haplotypes. Twelve single nucleotide polymorphisms (SNPs) are localized in order along the chromosome. The letters on the top indicate groups of SNPs that have perfect pairwise linkage disequilibrium (LD) with one another, and the numbers on the bottom indicate each of the 12 SNPs. SNP 9 is the causal variant, which in this simple example determines drug response: allele C results in a therapeutic response, whereas allele G results in an adverse reaction. In this example, the selection of just one SNP from each of the groups A–E would be sufficient to fully represent all of the haplotype diversity. Each haplotype can be identified by just five tagging SNPs (tSNPs), and the causal variant would be tagged even if it were not itself typed (in fact, multi-marker approaches to tSNP selection would reduce the set of tags to fewer than five, but this is ignored for simplicity). So, tSNP profiles that are highlighted predict an adverse reaction to the medicine. Normally, LD patterns are not so clear-cut and statistical methods are required to select appropriate sets of tSNPs.

b | The diagram depicts the same 12 SNPs, but with different associations among them, as might happen in a different population group. Because patterns of LD are different, some patients would be misclassified if the same five tSNPs were used and interpreted in the same way; that is, using the same SNP profiles as defined in population A, haplotype profiles 1, 2 and 3 are predicted to have allele C at the causal SNP 9 (a therapeutic response), whereas haplotype profiles 4 and 5 are predicted to have an adverse response. However, because the pattern of association has changed, the new haplotypes 6 and 7 are misclassified as haplotype patterns 6 and 7 in population B.

Goldstein, Nat Rev Genet 2003
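The idea of picking one SNP from each group in perfect pairwise LD can be sketched in a few lines. The haplotypes below are invented for illustration (they are not the 12-SNP example in the figure), and the approach only handles the clear-cut case of perfect LD between bi-allelic, polymorphic SNPs. As the caption notes, real data need statistical tSNP selection methods.

```python
def perfect_ld_groups(haplotypes):
    """Group SNP positions (1-based) whose allele patterns across haplotypes
    are identical up to relabelling of the two alleles, i.e. perfect pairwise LD.

    Assumes every SNP is bi-allelic and polymorphic in the sample.
    """
    groups = {}
    n_snps = len(haplotypes[0])
    for j in range(n_snps):
        column = [hap[j] for hap in haplotypes]
        reference = column[0]
        # Encode each haplotype as "carries the same allele as haplotype 1 or not".
        pattern = tuple(allele == reference for allele in column)
        groups.setdefault(pattern, []).append(j + 1)
    return list(groups.values())


def pick_tag_snps(groups):
    """Choose the first SNP of each group as its tagging SNP."""
    return [group[0] for group in groups]


# Hypothetical haplotypes over six SNPs.
haplotypes = ["ACGTAC", "ACGTGC", "GTGAAC", "GTCAAT", "ACCTAT"]
groups = perfect_ld_groups(haplotypes)
print(groups, pick_tag_snps(groups))
```

Because the groups are derived from one sample, tags chosen this way carry over to another population only if its LD pattern is the same, which is the point of panel b.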

Associations with Ancestral Haplotypes

(Schork, 1998)

Population Stratification

Marchini, 2004

Population Stratification

Cardon & Palmer, 2003

Multiple Comparisons & Spurious Associations

Diepstra, Lancet 2005

Parent-Case Trios in TDT/HRR

[Figure: pedigrees of parent-case trios with their genotypes. Alleles transmitted from the parents to the affected child are treated as the “case”, and the non-transmitted parental alleles as the “control”.]

- AN EXAMPLE OF TDT -

TRANSMISSION DISEQUILIBRIUM OF HLA-B62 TO THE PATIENTS WITH CHILDHOOD AML

(Dorak et al, BSHI 2002)

Out of 13 parents heterozygous for B62, 12 transmitted B62 to the affected child and 1 did not

                            Nontransmitted Allele
                            B62       Other
Transmitted Allele   B62    x         12
                     Other  1         y

McNemar's test results: P = 0.006 (with continuity correction); odds ratio = 12.0, 95% CI = 1.8 to 513
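The McNemar statistic for this example uses only the two discordant cells (12 transmissions and 1 non-transmission of B62 from heterozygous parents). The sketch below recomputes the chi-square and P value with the continuity correction. The function name is an assumption, and the exact confidence interval quoted on the slide is not recomputed here.

```python
import math


def tdt_mcnemar(transmitted, not_transmitted, continuity=True):
    """McNemar chi-square (1 df) and p-value for transmission counts
    from heterozygous parents."""
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    stat = diff ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square upper tail, 1 df
    return stat, p_value


stat, p_value = tdt_mcnemar(12, 1)
odds_ratio = 12 / 1  # ratio of the two discordant cells
print(round(stat, 2), round(p_value, 4), odds_ratio)
```

With only 13 informative parents an exact binomial test would be a common alternative. The chi-square form is shown because it is the one reported on the slide.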

Multifactorial Etiology

ROCHE Genetic Education

Models of gene–environment interactions

Hunter, 2005

Sample size requirement for gene-environment interaction studies

Hunter, 2005

An example of a gene-environment interaction

In Alzheimer disease, the risk of cognitive decline as measured by the TICS test is particularly high in APOE4 carriers who have untreated hypertension (APOE4+/HT+).

Hunter, 2005

Falconer's polygenic threshold model for dichotomous nonmendelian characters: Liability to the condition is polygenic and normally distributed (upper curve). People whose liability is above a certain threshold value are affected. Their sibs (lower curve) have a higher average liability than the population mean and a greater proportion of them have liability exceeding the threshold. Therefore the condition tends to run in families (Falconer DS, 1967).
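The threshold model can be explored numerically. The sketch below assumes a standard normal liability, a population prevalence K that fixes the threshold, and a correlation in liability between sibs (for a purely additive trait with no shared environment this would be half the heritability, which is an assumption not stated on the slide). It integrates the bivariate normal tail with a simple trapezoid rule, and all names and numbers are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # liability scaled to mean 0, variance 1


def sib_recurrence_risk(prevalence, sib_correlation, steps=20000, span=10.0):
    """P(sib affected | proband affected) under the liability threshold model."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if not 0.0 <= sib_correlation < 1.0:
        raise ValueError("sib correlation must be in [0, 1)")
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    residual_sd = (1.0 - sib_correlation ** 2) ** 0.5
    width = span / steps
    total = 0.0
    for i in range(steps + 1):
        x = threshold + i * width  # proband liability above the threshold
        # Sib liability given x is normal with mean rho*x and sd residual_sd.
        sib_tail = 1.0 - STD_NORMAL.cdf((threshold - sib_correlation * x) / residual_sd)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        total += weight * STD_NORMAL.pdf(x) * sib_tail
    joint = total * width  # P(both proband and sib affected)
    return joint / prevalence


# Illustrative values: 1% prevalence, heritability 0.8 so sib correlation 0.4.
risk = sib_recurrence_risk(0.01, 0.4)
print(risk, risk / 0.01)  # recurrence risk and the implied sibling risk ratio
```

With zero correlation the sib risk equals the population prevalence. Any positive correlation shifts the sibs' liability distribution upward, which reproduces the familial clustering the model describes.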

M. Tevfik DORAK

http://www.dorak.info