Parametric and Non-Parametric analysis of complex diseases

Lecture #6

Based on: Chapter 25 & 26 in Terwilliger and Ott’s Handbook of Human Genetic Linkage.

Prepared by Dan Geiger.

Complex Diseases

1. Unknown mode of inheritance (dominant/recessive)
2. Several interacting loci (epistasis)
3. Unclear affected status (e.g., psychiatric disorders)
4. Genetic heterogeneity
5. Non-genetic factors

We start by specifying what the alternative models look like, using a Bayesian network model.

Mode of Inheritance

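[Figure: Bayesian network for persons 1, 2, 3 at two loci. Locus 1 nodes: L11m, L11f, L12m, L12f, L13m, L13f, S13m, S13f, X11, X12, X13. Locus 2 nodes: L21m, L21f, L22m, L22f, L23m, L23f, S23m, S23f, X21, X22, X23. Phenotype nodes: y1, y2, y3.]

Specify different conditional probability tables between the phenotype variables Yi and the genotypes.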

Recessive, full penetrance:
P(y1 = sick | X11 = (a,a)) = 1
P(y1 = sick | X11 = (A,a)) = 0
P(y1 = sick | X11 = (A,A)) = 0

More modes of Inheritance

Dominant, 60% penetrance:
P(y1 = sick | X11 = (a,a)) = 0.6
P(y1 = sick | X11 = (A,a)) = 0.6
P(y1 = sick | X11 = (A,A)) = 0

Dominant, full penetrance:
P(y1 = sick | X11 = (a,a)) = 1
P(y1 = sick | X11 = (A,a)) = 1
P(y1 = sick | X11 = (A,A)) = 0

Recessive, 40% penetrance, 1% penetrance for phenocopies:
P(y1 = sick | X11 = (a,a)) = 0.4
P(y1 = sick | X11 = (A,a)) = 0.01
P(y1 = sick | X11 = (A,A)) = 0.01

Dominant, 20% penetrance, 5% penetrance for phenocopies:
P(y1 = sick | X11 = (a,a)) = 0.2
P(y1 = sick | X11 = (A,a)) = 0.2
P(y1 = sick | X11 = (A,A)) = 0.05
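
The tables above all follow one pattern: at-risk genotypes get the penetrance, the others get the phenocopy rate. A minimal Python sketch of that pattern is given here; the function name `penetrance_table` and the string encoding of genotypes are illustrative assumptions, not part of the lecture.

```python
# Genotypes follow the slides: "a" is the disease allele, "A" the normal one.
GENOTYPES = ("aa", "Aa", "AA")


def penetrance_table(mode, penetrance, phenocopy=0.0):
    """Return P(sick | genotype) for a single-locus model.

    mode:       "recessive" (only aa is at risk) or
                "dominant" (aa and Aa are at risk).
    penetrance: P(sick) for at-risk genotypes.
    phenocopy:  P(sick) for genotypes that are not at risk.
    """
    if mode == "recessive":
        at_risk = {"aa"}
    elif mode == "dominant":
        at_risk = {"aa", "Aa"}
    else:
        raise ValueError(f"unknown mode: {mode}")
    return {g: (penetrance if g in at_risk else phenocopy) for g in GENOTYPES}


# The four tables on this slide, plus the full-penetrance recessive one.
recessive_full = penetrance_table("recessive", 1.0)
dominant_60 = penetrance_table("dominant", 0.6)
dominant_full = penetrance_table("dominant", 1.0)
recessive_40_phenocopy = penetrance_table("recessive", 0.4, 0.01)
dominant_20_phenocopy = penetrance_table("dominant", 0.2, 0.05)
```

Each returned dictionary plays the role of the conditional probability table attached to a phenotype node.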

Two or more interacting loci (epistasis)

Specify different conditional probability tables between the phenotype variables Yi and the 2 or more genotypes of person i.

Example: Recessive, full penetrance:
P(y1 = sick | X11 = (a,a), X21 = (a,a)) = 1
P(y1 = sick | X11 = (A,a), X21 = (a,a)) = 0
P(y1 = sick | X11 = (A,A), X21 = (a,a)) = 0
The 6 remaining genotype combinations are also assigned probability 0.
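
To make the size of the two-locus table concrete, the following Python sketch enumerates all 9 genotype combinations for the recessive, full-penetrance example. The name `double_recessive_cpt` and the string encoding of genotypes are assumptions made for illustration.

```python
from itertools import product

GENOTYPES = ("aa", "Aa", "AA")


def double_recessive_cpt():
    """P(sick | genotype at locus 1, genotype at locus 2), full penetrance.

    Only the combination (aa, aa) leads to disease; the other 8 entries are 0.
    """
    return {
        (g1, g2): 1.0 if (g1, g2) == ("aa", "aa") else 0.0
        for g1, g2 in product(GENOTYPES, repeat=2)
    }


cpt = double_recessive_cpt()
```

Other epistatic models change only which entries of this 3 x 3 table are non-zero.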

[Figure: the two-locus Bayesian network for persons 1, 2, 3 (nodes L, S, X at each locus), with phenotype nodes y1, y2, y3.]

Unclear affection status

[Figure: the two-locus Bayesian network for persons 1, 2, 3, with a measured-status node Z attached to each phenotype node Y.]

Specify a “confusion matrix” regarding the process that determines affected status.

P(z1 = measured sick | y1 = sick) = 0.9
P(z1 = measured sick | y1 = not sick) = 0.2
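
The confusion matrix composes with any penetrance table by summing over the true status y. The Python sketch below shows this for the recessive, full-penetrance table from the earlier slide; the helper name `p_measured_sick` is an assumption for illustration.

```python
SENSITIVITY = 0.9      # P(z = measured sick | y = sick), from the slide
FALSE_POSITIVE = 0.2   # P(z = measured sick | y = not sick), from the slide


def p_measured_sick(p_sick):
    """P(z = measured sick) given P(y = sick), marginalising over y."""
    return SENSITIVITY * p_sick + FALSE_POSITIVE * (1.0 - p_sick)


# Recessive, full penetrance: P(y = sick | genotype).
recessive_full = {"aa": 1.0, "Aa": 0.0, "AA": 0.0}

# P(z = measured sick | genotype).
measured = {g: p_measured_sick(p) for g, p in recessive_full.items()}
```

With these numbers a carrier of (a,a) is measured sick with probability 0.9, and every other genotype with probability 0.2.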

Genetic Heterogeneity

Non-allelic heterogeneity: several independent loci predispose to the disease.

[Figure: Bayesian network for a generic locus i, with nodes Li1f, Li1m, Li2f, Li2m, Li3f, Li3m, Si3f, Si3m, Xi1, Xi2, Xi3 and nodes labelled 1, 2, 3.]

Non genetic factors

[Figure: the two-locus Bayesian network for persons 1, 2, 3, with phenotype nodes y1, y2, y3 and a liability-class node attached to each phenotype node.]

Liability class Li. Example: Li = 1 means “old”, Li = 2 means “young”.

Under liability class 1 (L1 = 1):
P(y1 = sick | X11 = (a,a), L1 = 1) = 1
P(y1 = sick | X11 = (A,a), L1 = 1) = 0.05
P(y1 = sick | X11 = (A,A), L1 = 1) = 0.05

Under L1 = 2 (“young”), the first line changes to, say, 0.3 and the other two lines to, say, 0.
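
A liability class simply adds one more index to the penetrance table. The Python sketch below stores the two classes from this slide and, as an extra step not in the lecture, averages over classes using a hypothetical class prior (`class_prior` is an assumed input).

```python
# P(sick | genotype, liability class), values taken from the slide.
PENETRANCE = {
    1: {"aa": 1.0, "Aa": 0.05, "AA": 0.05},  # class 1, "old"
    2: {"aa": 0.3, "Aa": 0.0, "AA": 0.0},    # class 2, "young"
}


def p_sick(genotype, liability_class):
    """Look up P(sick | genotype, liability class)."""
    return PENETRANCE[liability_class][genotype]


def p_sick_marginal(genotype, class_prior):
    """Average over liability classes; class_prior is a hypothetical
    distribution over classes, e.g. {1: 0.5, 2: 0.5}."""
    return sum(w * p_sick(genotype, c) for c, w in class_prior.items())


example = p_sick_marginal("aa", {1: 0.5, 2: 0.5})
```

In a linkage analysis the class of each person is normally observed, so only the lookup is needed; the marginal is shown for the case where it is not.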

Parametric versus Non-Parametric

All analyses considered so far are “parametric”, meaning that a mode of inheritance is assumed. In some cases several candidate modes of inheritance are considered, but the analysis still assumes each one in turn.

For complex diseases it is believed that “non-parametric” methods might work better. In our context, these are methods that do not take the mode of inheritance into account. The idea is that computing linkage without assuming a mode of inheritance is more robust to errors in model specification.

Clearly, if the model is correct, parametric methods perform better. This advantage is lost when the model is wrong, as is likely for complex traits.

Some Non-Parametric Methods

Main idea: if affected siblings share more IBD alleles at some marker locus than randomly expected among siblings, then that locus might be near a locus of a predisposing gene.

Definitions: Any two identical copies of an allele l are said to be identical by state (IBS). If these alleles are inherited from the same individual then they are also identical by descent (IBD). Clearly, IBD implies IBS but not vice versa.

We will consider the following non-parametric methods:
• Affected Sib-Pair Analysis (ASP)
• Extended Affected Sib-Pair Analysis (ESPA)
• Affected Pedigree Member method (APM)

Identical By Descent (IBD)

[Figure: three example pedigrees with marker genotypes drawn from 1/1, 1/2 and 1/3, captioned as follows.]
Exactly one allele IBD.
No allele is IBD. One allele is IBS.
At least one allele IBD. Expected 1.5 alleles IBD.

Affected Sib-Pair Analysis

The idea is that any two siblings are expected to share one allele IBD by chance (and at most two IBD alleles, of course).

When a deviation from this pattern is detected by examining many sib-pairs, linkage is established between a disease gene and the marker location.

This phenomenon occurs regardless of the mode of inheritance, but its strength is different for each mode.

Affected Sib-Pair Analysis

[Figure: nuclear pedigree with parents 1/2 and 3/4 and two sons, 1/3 and 1/4.]

There are 16 combinations of sibling marker genotypes (* marks the pairs in which both sons carry allele 1):

SON1  SON2  IBD  |  SON1  SON2  IBD  |  SON1  SON2  IBD  |  SON1  SON2  IBD
1/3   1/3   2 *  |  1/4   1/4   2 *  |  2/3   2/3   2    |  2/4   2/4   2
1/3   1/4   1 *  |  1/4   1/3   1 *  |  2/3   2/4   1    |  2/4   2/3   1
1/3   2/3   1    |  1/4   2/4   1    |  2/3   1/3   1    |  2/4   1/4   1
1/3   2/4   0    |  1/4   2/3   0    |  2/3   1/4   0    |  2/4   1/3   0

Not surprisingly, the expected number of IBD alleles over all 16 equally likely combinations is (4*2 + 8*1)/16 = 1.

Now assume a dominant disease allele that comes from the father and lies on the haplotype carrying marker allele 1. For two affected sons, the only viable options are the four pairs in which both sons carry allele 1 (the ones marked in the table). The expected IBD is thus (2*2 + 2*1)/4 = 1.5, an excess that can be detected in the analysis.

For a recessive disease linked to the haplotypes carrying alleles 1 and 3, the only viable pair is 1/3, 1/3, with an expected IBD of 2.
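
The three expectations above can be checked by brute-force enumeration. The Python sketch below assumes parents 1/2 (father) and 3/4 (mother), no recombination between marker and disease locus, and full penetrance; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

FATHER = (1, 2)
MOTHER = (3, 4)


def sib_pairs():
    """Yield (son1, son2, ibd) for all 16 equally likely combinations.

    Each son is a pair (paternal allele, maternal allele). Because all four
    parental alleles are distinct, equal alleles are necessarily IBD.
    """
    sons = list(product(FATHER, MOTHER))
    for s1, s2 in product(sons, repeat=2):
        ibd = int(s1[0] == s2[0]) + int(s1[1] == s2[1])
        yield s1, s2, ibd


def expected_ibd(viable=lambda s1, s2: True):
    """Average IBD count over the combinations accepted by `viable`."""
    counts = [ibd for s1, s2, ibd in sib_pairs() if viable(s1, s2)]
    return Fraction(sum(counts), len(counts))


# No constraint: all 16 combinations.
unconditional = expected_ibd()
# Dominant disease on the father's haplotype with allele 1.
dominant = expected_ibd(lambda s1, s2: s1[0] == 1 and s2[0] == 1)
# Recessive disease on the haplotypes with alleles 1 and 3.
recessive = expected_ibd(lambda s1, s2: s1 == (1, 3) and s2 == (1, 3))
```

The results are 1, 3/2 and 2, matching the hand calculation.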

Affected Sib-Pair Analysis

[Figure: nuclear pedigree with parents 1/2 and 3/4 and two sons, 1/3 and 1/4.]

For pedigrees like the one above (two parents, two children, all observed), the standard ASP analysis can be done even by hand.

However, one can also use general pedigrees, allow some family members to be unobserved, and consider more distant relatives such as first cousins.

Extended Affected Sib-Pair Analysis (e.g., the ESPA program)

[Figure: nuclear pedigree with one parent untyped (?/?), the other parent 3/4, and two children, 1/3 and 1/4.]

Compute the probability of each possible allele configuration of the family, given the typed persons in the pedigree. Based on these probabilities compute:

E[IBD] = 1 · Pr(1 allele IBD) + 2 · Pr(2 alleles IBD)

(The ESPA program currently assumes no loops and at most 5 alleles at a locus.)
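
For the small pedigree on this slide the expectation can be computed by enumerating the untyped parent's genotype. The Python sketch below is a brute-force illustration, not the ESPA program's algorithm; it assumes Hardy-Weinberg proportions, hypothetical allele frequencies supplied by the caller, and no linkage (each parental copy transmitted with probability 1/2).

```python
from fractions import Fraction
from itertools import product


def expected_ibd_untyped_father(freqs, mother, child1, child2):
    """E[IBD] for two sibs when the father is untyped.

    freqs:  dict allele -> population frequency (assumed input).
    mother: the typed parent's genotype, e.g. (3, 4).
    child1, child2: the sibs' observed (unordered) genotypes.
    """
    num = Fraction(0)
    den = Fraction(0)
    # Ordered father genotype: two distinct gene copies, independent draws.
    for father in product(freqs, repeat=2):
        prior = freqs[father[0]] * freqs[father[1]]
        # i = which paternal copy, j = which maternal copy each sib receives.
        for i1, j1, i2, j2 in product((0, 1), repeat=4):
            sib1 = sorted((father[i1], mother[j1]))
            sib2 = sorted((father[i2], mother[j2]))
            if sib1 != sorted(child1) or sib2 != sorted(child2):
                continue  # inconsistent with the observed genotypes
            weight = prior * Fraction(1, 16)
            ibd = int(i1 == i2) + int(j1 == j2)  # same parental copy = IBD
            num += weight * ibd
            den += weight
    return num / den


# Hypothetical frequencies: four equally common alleles.
uniform = {a: Fraction(1, 4) for a in (1, 2, 3, 4)}
example = expected_ibd_untyped_father(uniform, (3, 4), (1, 3), (1, 4))
```

Under these assumed frequencies the result is 4/5 rather than 1: the sibs' shared allele 1 is IBD only if the father is heterozygous, or if he is 1/1 and happened to transmit the same copy twice.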

Affected Pedigree Members method (APM)

Computing IBD for distant relatives is considered hard on large pedigrees so researchers used IBS instead.

Consider one relative to have alleles (A1, A2) and the other to have (B1, B2). There are four possible ways for a pair of alleles, one from each relative, to be IBS.

Weeks and Lange (1988) used the following statistic z_ij for counting the IBS status of two individuals i and j:

$$z_{ij} = \frac{1}{4} \sum_{a=1}^{2} \sum_{b=1}^{2} \delta(A_a, B_b)$$

where δ(A_a, B_b) = 1 if alleles A_a and B_b are identical by state and 0 otherwise.

This measure should be compared to what is expected under no linkage. To combine many pedigrees, a conversion to standard normal variables is used.
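
The IBS count for one pair of relatives is simple enough to sketch directly. The Python function below is a minimal illustration of the four allele comparisons; the name `ibs_statistic` and the integer allele labels are assumptions.

```python
def ibs_statistic(rel_i, rel_j):
    """z_ij: a quarter of the number of (a, b) pairs with A_a == B_b.

    rel_i, rel_j: genotypes as 2-tuples of allele labels.
    """
    matches = sum(1 for a in rel_i for b in rel_j if a == b)
    return matches / 4


same_heterozygote = ibs_statistic((1, 2), (1, 2))  # two of four pairs match
same_homozygote = ibs_statistic((1, 1), (1, 1))    # all four pairs match
one_shared = ibs_statistic((1, 2), (1, 3))         # one pair matches
nothing_shared = ibs_statistic((1, 2), (3, 4))     # no pair matches
```

The statistic ranges from 0 (no alleles shared in state) to 1 (both relatives homozygous for the same allele).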

Taking Gene Frequencies into Account

Clearly it is more surprising for affected relatives to share a rare allele than a common one. So one can use a weighted average:

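$$z_{ij} = \frac{1}{4} \sum_{a=1}^{2} \sum_{b=1}^{2} \delta(A_a, B_b)\, f(A_a)$$

where δ(A_a, B_b) = 1 if the two alleles are identical by state and 0 otherwise, and

$$f(A_a) = 1 \quad \text{or} \quad f(A_a) = 1/\sqrt{p_{A_a}} \quad \text{or} \quad f(A_a) = 1/p_{A_a},$$

with p_{A_a} denoting the population frequency of allele A_a.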
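
The effect of the weighting can be seen on a toy example. The Python sketch below implements three weight choices, 1, 1/sqrt(p) and 1/p; the square-root form is read from a garbled formula and should be treated as an assumption, as should the function names and the allele frequencies used in the example.

```python
import math

# Candidate weight functions f(p) of the allele frequency p.
WEIGHTS = {
    "one": lambda p: 1.0,
    "inv_sqrt": lambda p: 1.0 / math.sqrt(p),
    "inv": lambda p: 1.0 / p,
}


def weighted_ibs_statistic(rel_i, rel_j, freq, weight="inv_sqrt"):
    """Weighted z_ij: each IBS match on allele A_a contributes f(p_{A_a}) / 4.

    freq: dict allele -> population frequency (assumed to be known).
    """
    f = WEIGHTS[weight]
    return sum(f(freq[a]) for a in rel_i for b in rel_j if a == b) / 4


# Hypothetical frequencies: allele 1 is rare, allele 2 is common.
freq = {1: 0.01, 2: 0.99}
share_rare = weighted_ibs_statistic((1, 1), (1, 1), freq, weight="inv")
share_common = weighted_ibs_statistic((2, 2), (2, 2), freq, weight="inv")
unweighted = weighted_ibs_statistic((1, 1), (1, 1), freq, weight="one")
```

With f = 1/p, sharing the rare allele scores about 100 while sharing the common allele scores about 1, which is the intended emphasis on surprising matches.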