Source: nitro.biosci.arizona.edu/workshops/GIGA/pdfs/L2-Peter.pdf


Estimating genetic variation within families

Peter M. Visscher
Queensland Institute of Medical Research
Brisbane, Australia
[email protected]


Overview

• Estimation of genetic parameters
• Variation in identity
• Applications
  – mean and variance of genome-wide IBD sharing for sibpairs
  – estimation of heritability of height
  – genome partitioning of genetic variation


Estimation of genetic parameters

• Model
  – expected covariance between relatives
    • Genetics
    • Environment
• Data
  – correlation/regression of observations between relatives
• Statistical method
  – ANOVA
  – regression
  – maximum likelihood
  – Bayesian analysis


[Figure: Galton, 1889]

The height vs. pea debate (early 1900s)

Do quantitative traits have the same hereditary and evolutionary properties as discrete characters?

Biometricians Mendelians


RA Fisher (1918). Transactions of the Royal Society of Edinburgh 52: 399-433.

[Figure: trait distributions for the genotypes QQ, Qq and qq; positions marked m-a, m+d and m+a; x-axis: Trait]


Genetic covariance between relatives

$\mathrm{cov}_G(y_i, y_j) = a_{ij}\sigma_A^2 + d_{ij}\sigma_D^2$

a = additive coefficient of relationship
  = 2 × coefficient of kinship (= E(π))

d = coefficient of fraternity
  = Prob(2 alleles are IBD)


Examples (no inbreeding)

Relatives              a     d
MZ twins               1     1
Parent-offspring       ½     0
Fullsibs               ½     ¼
Double first cousins   ¼     1/16
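
As a worked illustration of the covariance formula, the following Python sketch evaluates a·σA² + d·σD² for the relative pairs tabulated above. The variance components (σA² = 3/5, σD² = 1/5) are hypothetical values chosen only for illustration; they do not come from the slides.

```python
from fractions import Fraction

# (a, d) coefficients for relatives without inbreeding, as tabulated above.
RELATIONSHIPS = {
    "MZ twins": (Fraction(1), Fraction(1)),
    "Parent-offspring": (Fraction(1, 2), Fraction(0)),
    "Fullsibs": (Fraction(1, 2), Fraction(1, 4)),
    "Double first cousins": (Fraction(1, 4), Fraction(1, 16)),
}


def genetic_covariance(a, d, var_a, var_d):
    """Expected genetic covariance: a * sigma_A^2 + d * sigma_D^2."""
    return a * var_a + d * var_d


# Hypothetical variance components, for illustration only.
VAR_A = Fraction(3, 5)
VAR_D = Fraction(1, 5)

covariances = {
    name: genetic_covariance(a, d, VAR_A, VAR_D)
    for name, (a, d) in RELATIONSHIPS.items()
}

for name, cov in covariances.items():
    print(f"{name:22s} {float(cov):.4f}")
```

Only fullsibs and double first cousins (and MZ twins) pick up a dominance contribution, which is why dominance and shared environment are hard to separate from sibling data alone.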


Controversy/confounding: nature vs nurture

• Is observed resemblance between relatives genetic or environmental?
  – MZ & DZ twins (shared environment)
  – Fullsibs (dominance & shared environment)
• Estimation and statistical inference
  – Different models with many parameters may fit data equally well


Total mole count for MZ and DZ twins

[Figure: two scatter plots of total mole count, Twin 1 against Twin 2, both axes 0 to 400]

MZ twins: 153 pairs, r = 0.94
DZ twins: 199 pairs, r = 0.60


Sources of variation in Queensland school test results of 16-year olds

[Pie chart with three components (additive genetic, shared environment, non-shared environment) and slices of 12%, 78% and 10%]


An unbiased approach

Estimate genetic variance within families


Actual or realised genetic relationship
= proportion of genome shared IBD (πa)

• Varies around the expectation
  – Apart from parent-offspring and MZ twins
• Can be estimated using marker data


[Figure: mating diagram (×) with four outcomes, each with probability 1/4]

IDENTITY BY DESCENT

[Figure: grid of Sib 1 against Sib 2 genotypes, 16 combinations]

4/16 = 1/4 sibs share BOTH parental alleles (IBD = 2)
8/16 = 1/2 sibs share ONE parental allele (IBD = 1)
4/16 = 1/4 sibs share NO parental alleles (IBD = 0)
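
The 16 equally likely combinations can be enumerated directly. The sketch below is an illustration under simple assumptions (each sib independently receives one of two paternal and one of two maternal alleles, no inbreeding); it counts the IBD states and derives the single-locus mean and variance of πa and πd for fullsibs, using the coding πa = IBD/2 and πd = 1 when IBD = 2.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# A transmission is (paternal allele index, maternal allele index), each 0 or 1.
transmissions = list(product((0, 1), repeat=2))

# Count IBD states over all 4 x 4 = 16 sib-pair combinations.
ibd_counts = Counter()
for sib1, sib2 in product(transmissions, repeat=2):
    ibd = int(sib1[0] == sib2[0]) + int(sib1[1] == sib2[1])
    ibd_counts[ibd] += 1

total = sum(ibd_counts.values())
probs = {state: Fraction(count, total) for state, count in ibd_counts.items()}


def moments(values):
    """Mean and variance of a coefficient defined per IBD state."""
    mean = sum(probs[s] * values[s] for s in probs)
    var = sum(probs[s] * (values[s] - mean) ** 2 for s in probs)
    return mean, var


# Additive coefficient: half the number of alleles shared IBD.
mean_a, var_a = moments({0: Fraction(0), 1: Fraction(1, 2), 2: Fraction(1)})
# Dominance coefficient: indicator that both alleles are shared IBD.
mean_d, var_d = moments({0: Fraction(0), 1: Fraction(0), 2: Fraction(1)})

print(dict(ibd_counts), mean_a, var_a, mean_d, var_d)
```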


Single locus

Relatives            E(πa)   var(πa)
Fullsibs             ½       1/8
Halfsibs             ¼       1/16
Double 1st cousins   ¼       3/32


Several notations

IBD     Probability   Actual
IBD0    k0            0 or 1
IBD1    k1            0 or 1
IBD2    k2            0 or 1
        Σ=1           Σ=1

πa = ½k1 + k2 = R = 2θ
πd = k2 = ∆xy

Realisations
k0   k1   k2
1    0    0
0    1    0
0    0    1

[e.g., LW Chapter 7; Weir and Hill 2011, Genetics Research]


n multiple unlinked loci

Relatives            E(πa)   var(πa)
Fullsibs             ½       1/(8n)
Halfsibs             ¼       1/(16n)
Double 1st cousins   ¼       3/(32n)
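
The 1/n scaling follows from averaging n independent loci. As a small check, the Python sketch below computes the exact variance of mean IBD sharing for fullsibs by enumerating all combinations of single-locus states; the function name `var_mean_sharing` is introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product

# Single-locus distribution of pi_a for fullsibs: value -> probability.
SINGLE_LOCUS = {
    Fraction(0): Fraction(1, 4),
    Fraction(1, 2): Fraction(1, 2),
    Fraction(1): Fraction(1, 4),
}


def var_mean_sharing(n_loci):
    """Exact variance of pi_a averaged over n_loci unlinked loci."""
    first_moment = Fraction(0)
    second_moment = Fraction(0)
    for combo in product(SINGLE_LOCUS.items(), repeat=n_loci):
        prob = Fraction(1)
        total = Fraction(0)
        for value, p in combo:
            prob *= p
            total += value
        average = total / n_loci
        first_moment += prob * average
        second_moment += prob * average ** 2
    return second_moment - first_moment ** 2


for n in range(1, 6):
    print(n, var_mean_sharing(n))
```

The enumeration grows as 3^n, so it is only practical for a handful of loci; it is meant as a check of the formula, not as an estimation method.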


Loci are on chromosomes

• Segregation of large chromosome segments within families
  – increasing variance of IBD sharing
• Independent segregation of chromosomes
  – decreasing variance of IBD sharing


Theoretical SD of πa

Relatives            1 chrom (1 M)   genome (35 M)
Fullsibs             0.217           0.038
Halfsibs             0.154           0.027
Double 1st cousins   0.173           0.030

[Stam 1980; Hill 1993; Guo 1996; Hill & Weir 2011]


Fullsibs: genome-wide (Total length L Morgan)

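$\mathrm{var}(\pi_a) \approx \frac{1}{16L} - \frac{1}{3L^2}$

$\mathrm{var}(\pi_d) \approx \frac{5}{64L} - \frac{1}{3L^2}$

$\mathrm{var}(\pi_d) / \mathrm{var}(\pi_a) \approx 1.3$ if $L = 35$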

[Stam 1980; Hill 1993; Guo 1996]

Genome-wide variance depends more on total genome length than on the number of chromosomes
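
As a numerical check of the two approximations, the following Python sketch evaluates them at L = 35 Morgan. It gives standard deviations of roughly 0.039 and 0.044 and a variance ratio of about 1.3. The approximations are only meaningful for a long genome; at L = 1 the additive expression is negative, so the single-chromosome values quoted elsewhere in these slides must come from a different formula.

```python
import math


def var_pi_a(length_morgan):
    """Approximate genome-wide variance of additive sharing for fullsibs."""
    return 1.0 / (16.0 * length_morgan) - 1.0 / (3.0 * length_morgan ** 2)


def var_pi_d(length_morgan):
    """Approximate genome-wide variance of dominance sharing for fullsibs."""
    return 5.0 / (64.0 * length_morgan) - 1.0 / (3.0 * length_morgan ** 2)


L = 35.0
sd_a = math.sqrt(var_pi_a(L))
sd_d = math.sqrt(var_pi_d(L))
ratio = var_pi_d(L) / var_pi_a(L)

print(f"SD(pi_a) = {sd_a:.4f}, SD(pi_d) = {sd_d:.4f}, ratio = {ratio:.2f}")
```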


Fullsibs: Correlation additive and dominance relationships

$r(\pi_a, \pi_d) = \sigma(\pi_a) / \sigma(\pi_d) \approx \left[\frac{1/(16L)}{5/(64L)}\right]^{0.5} = 0.89$

Using β(πd on πa) = 1, i.e. cov(πa, πd) = var(πa)

Difficult but not impossible to disentangle additive and dominance variance

NB Practical


Summary
Additive and dominance (fullsibs)

                        SD(πa)   SD(πd)
Single locus            0.354    0.433
One chromosome (1 M)    0.217    0.247
Whole genome (35 M)     0.038    0.043

Predicted correlation (genome-wide πa and πd): 0.89


Application (1)

Aim: estimate genetic variance from actual relationships between fullsib pairs

• Two cohorts of Australian twin families

                          Adolescent   Adult
Families                  500          1512
Individuals               1201         3804
Sibpairs with genotypes   950          3451
Markers per individual    211-791      201-1717
Average marker spacing    6 cM         5 cM


Application (1)

• Phenotype = height

Number of sibpairs with phenotypes and genotypes

Adolescent cohort   931
Adult cohort        2444
Combined            3375


Mean IBD sharing across the genome for the jth sib pair was based on IBD estimated from Merlin every centimorgan and averaged over all 3491 points:

Additive: $\hat{\pi}_{a(j)} = \sum_{i=1}^{3491} \hat{\pi}_{a(ij)} / 3491$

Dominance: $\pi_{d(j)} = \sum_{i=1}^{3491} \hat{p}_{2(ij)} / 3491$


And for the cth chromosome of length $l_c$ cM:

Additive: $\hat{\pi}^{c}_{a(j)} = \sum_{i=1}^{l_c} \hat{\pi}_{a(ij)} / l_c$

Dominance: $\pi^{c}_{d(j)} = \sum_{i=1}^{l_c} \hat{p}_{2(ij)} / l_c$
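
The averaging can be sketched in Python as follows. This is a minimal illustration, not the pipeline used for the study: the per-point additive coefficient is taken as ½p1 + p2, following the notation πa = ½k1 + k2 used earlier in these slides, and the input probabilities are made-up toy values rather than Merlin output. The names `mean_sharing` and `toy_pair` are hypothetical.

```python
def mean_sharing(p1, p2):
    """Average additive and dominance IBD coefficients over grid points.

    p1[i], p2[i]: estimated probabilities that the pair shares one or two
    alleles IBD at grid point i.
    """
    if len(p1) != len(p2) or not p1:
        raise ValueError("p1 and p2 must be non-empty and of equal length")
    n_points = len(p1)
    pi_a = sum(0.5 * one + two for one, two in zip(p1, p2)) / n_points
    pi_d = sum(p2) / n_points
    return pi_a, pi_d


# Toy IBD probabilities for one sib pair, keyed by chromosome (illustrative only).
toy_pair = {
    "chr1": ([0.5, 0.5, 1.0, 1.0], [0.5, 0.5, 0.0, 0.0]),
    "chr2": ([0.0, 0.0], [1.0, 1.0]),
}

# Chromosome-wide coefficients: average within each chromosome.
per_chromosome = {c: mean_sharing(p1, p2) for c, (p1, p2) in toy_pair.items()}

# Genome-wide coefficients: average over all grid points pooled together.
all_p1 = [x for p1, _ in toy_pair.values() for x in p1]
all_p2 = [x for _, p2 in toy_pair.values() for x in p2]
genome_wide = mean_sharing(all_p1, all_p2)

print(per_chromosome, genome_wide)
```

Because the genome-wide value pools all grid points, it equals the length-weighted mean of the chromosome-wide values.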


Mean and SD of genome-wide additive relationships


Mean and SD of genome-wide dominance relationships


Empirical and theoretical SD of additive relationships
correlation = 0.98 (n = 4401)


Empirical and theoretical SD of dominance relationships
correlation = 0.98 (n = 4401)


Additive and dominance relationships
correlation = 0.91 (n = 4401)


Phenotypes

After adjustment for sex and age: σp = 7.7 cm, σp = 6.9 cm

Phenotypic correlation between siblings

              Raw    After age & sex
Adolescents   0.33   0.40
Adults        0.24   0.39


Models

C = Family effect
A = Genome-wide additive genetic
E = Residual

Full model      C + A + E
Reduced model   C + E


Estimation

• Maximum Likelihood variance components
• Likelihood-ratio-test (LRT) to calculate P-values for hypotheses
  H0: A = 0
  H1: A > 0
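
To make the estimation step concrete, here is a hedged Python sketch of how a C + A + E likelihood for sib pairs could be written. It rests on assumptions that the slides do not spell out: phenotypes are mean-centred and bivariate normal, each sib has variance σC² + σA² + σE², and the covariance of pair j is σC² + πa(j)·σA². The P-value uses a 50:50 mixture of 0 and a 1-df chi-square, a common convention for a variance tested on the boundary; whether the original analysis used this convention is not stated. All function names are hypothetical.

```python
import math


def sib_covariance(pi_a, var_c, var_a):
    """Assumed covariance between two sibs with realised sharing pi_a."""
    return var_c + pi_a * var_a


def pair_loglik(y1, y2, pi_a, var_c, var_a, var_e):
    """Bivariate-normal log-likelihood of one mean-centred sib pair."""
    v = var_c + var_a + var_e  # phenotypic variance of each sib
    c = sib_covariance(pi_a, var_c, var_a)
    det = v * v - c * c  # determinant of the 2 x 2 covariance matrix
    quad = (v * y1 * y1 - 2.0 * c * y1 * y2 + v * y2 * y2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def lrt_p_value(loglik_full, loglik_reduced):
    """P-value for H0: A = 0 against H1: A > 0 (assumed 50:50 mixture)."""
    lrt = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    # Upper tail of a 1-df chi-square is erfc(sqrt(x / 2)); halve it for the mixture.
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))
```

In use, one would sum `pair_loglik` over all pairs, maximise it once with `var_a` free (full model) and once with `var_a` fixed at 0 (reduced model), and pass the two maxima to `lrt_p_value`.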


Estimates: null model (CE)

Cohort       Family effect (C)
Adolescent   0.40 (0.34 – 0.45)
Adult        0.39 (0.36 – 0.43)
Combined     0.39 (0.36 – 0.42)


Estimates: full model (ACE)

Cohort       C   A      P
Adolescent   0   0.80   0.0869
Adult        0   0.80   0.0009
Combined     0   0.80   0.0003

► All family resemblance due to additive genetic variation


Sampling variances are large

Cohort       A (95% CI)
Adolescent   0.80 (0.00 – 0.90)
Adult        0.80 (0.43 – 0.86)
Combined     0.80 (0.46 – 0.85)


C+A more accurately estimated

Cohort       C+A (95% CI)
Adolescent   0.80 (0.36 – 0.90)
Adult        0.80 (0.61 – 0.86)
Combined     0.80 (0.62 – 0.85)

► Prediction of MZ correlation from fullsibs!


Power and SE of estimates

• True parameters (t)
• Sample size (n)
• Variance in genome-wide IBD sharing (var(π))

$\mathrm{NCP} = n h^4 \mathrm{var}(\pi)(1+t^2) / (1-t^2)^2$

$\mathrm{var}(\hat{h}^2) \approx (1-t^2)^2 / [n(1+t^2)\mathrm{var}(\pi)]$
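
The two expressions are linked by NCP = h⁴ / var(ĥ²). The Python sketch below evaluates both; the inputs (n = 3375 sib pairs, t = 0.4, SD(π) = 0.038, h² = 0.8) are picked from figures quoted elsewhere in these slides, and combining them this way is an illustration rather than a calculation reported by the author.

```python
import math


def var_h2(n, t, var_pi):
    """Approximate sampling variance of the heritability estimate."""
    return (1.0 - t ** 2) ** 2 / (n * (1.0 + t ** 2) * var_pi)


def ncp(n, h2, t, var_pi):
    """Non-centrality parameter of the test for additive variance."""
    return n * h2 ** 2 * var_pi * (1.0 + t ** 2) / (1.0 - t ** 2) ** 2


# Illustrative inputs taken from values quoted in the slides.
n_pairs = 3375
sib_corr = 0.4
var_pi = 0.038 ** 2
h2 = 0.8

se_h2 = math.sqrt(var_h2(n_pairs, sib_corr, var_pi))
ncp_value = ncp(n_pairs, h2, sib_corr, var_pi)

print(f"SE(h2) = {se_h2:.3f}, NCP = {ncp_value:.2f}")
```

With these inputs the standard error is about 0.35, which is in line with the wide confidence intervals reported for A.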


Application (2)
Genome partitioning of additive genetic variance for height

• Aims
  – Estimate genetic variance from genome-wide IBD in larger sample
  – Partition genetic variance to individual chromosomes
    • using chromosome-wide coefficients of relationship
  – Test hypotheses about the distribution of genetic variance in the genome


Sample   # Sibpairs   Sib correlation
AU       5952         0.43
US       3996         0.50
NL       1266         0.45
Total    11,214       0.46


Realised relationships

Mean    0.499
Range   0.31 – 0.64
SD      0.036


Estimates from genome-wide additive and dominance coefficients

ACE model
Heritability   0.86 (0.49 – 0.95)   P<0.0001
Family         0.03 (0.00 – 0.03)   P=0.38

ADCE model
Additive component    0.70
Dominance component   0.16 (P=0.35)


Estimates of chromosomal heritabilities

[Figure: scatter plot of chromosomal heritability estimates; x-axis: from single chromosome analyses; y-axis: from combined chromosome analysis; both axes 0.00 to 0.12. Fitted line: y = 1.006x + 0.0001, R² = 0.9715]

No epistasis?


[Figure: estimate of heritability (y-axis, 0.00 to 0.14) against length of chromosome in cM (x-axis, 50 to 300), one labelled point per chromosome 1 to 22]

WLS analysis: P<0.001; intercept NS

Longer chromosomes explain more additive genetic variance: ~0.03 per 100 cM


[Figure: chromosomal heritability estimates, USA (y-axis, 0.00 to 0.20) against AU (x-axis, 0.00 to 0.12), one labelled point per chromosome 1 to 22. Fitted line: y = 0.9623x, R² = 0.2813]

Estimates are consistent across countries


[Figure: scaled AIC (y-axis, -20 to 0) against number of chromosomal additive genetic variance components (x-axis, 0 to 8)]

Stepwise analyses: at least 6 chromosomes are needed to explain the additive genetic variance


Hypothesis test

Model              h2     c2     df   LRT
Full (22 chrom.)   0.92   0.00   22
Genome-wide        0.86   0.03   1    19.2

Additive genetic variance in proportion to length not rejected
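
The "not rejected" conclusion can be checked with a chi-square tail probability. The sketch below assumes the LRT of 19.2 is referred to a chi-square with 22 − 1 = 21 degrees of freedom (the difference in the df column); the slides do not state the reference distribution, so this is an assumption. The helper `chi2_sf_odd_df` is written for this illustration and only handles odd degrees of freedom, using the standard recurrence for the regularised upper incomplete gamma function.

```python
import math


def chi2_sf_odd_df(x, df):
    """Upper-tail probability of a chi-square variable with odd df."""
    if df < 1 or df % 2 == 0:
        raise ValueError("df must be a positive odd integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    # df = 1 term: Q(1/2, z) = erfc(sqrt(z)).
    tail = math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1).
    for j in range(1, (df - 1) // 2 + 1):
        tail += math.exp(-z) * z ** (j - 0.5) / math.gamma(j + 0.5)
    return tail


lrt_statistic = 19.2
df_difference = 22 - 1  # assumed: full model df minus genome-wide model df
p_value = chi2_sf_odd_df(lrt_statistic, df_difference)

print(f"P = {p_value:.2f}")
```

Under this assumption the P-value is well above 0.5, so the simpler genome-wide model is not rejected in favour of 22 separate chromosomal components.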


[Figure: estimate of heritability (y-axis, 0.00 to 0.12) against number of publications with LOD > 1.9 (x-axis, 0 to 4), one labelled point per chromosome]

Data consistent with published QTL results

Rank test: P=0.002


Conclusions

• Empirical variation in genome-wide IBD sharing follows theoretical predictions
• Genetic variance can be estimated from genome-wide IBD within families
  – results for height consistent with estimates from between-relative comparisons
  – no assumptions about nature/nurture causes of family resemblance
• Genetic variance can be partitioned onto chromosomes


Conclusions

• With large sample sizes it will become possible to estimate
  – dominance variance
  – epistatic variance
  – genome-wide parent-of-origin variance
  – genetic relative risk to disease


Genetic architecture for height

• Additive genetic variance
• No QTL of large effects
• Chromosomes explain ~10% of genetic variance
• Consequences for genome-wide association


Other applications: breeding programmes

• Exploit variance in genome-wide IBD by using the realised A-matrix
  – large increase in accuracy of selection if
    • variance in identity is large
    • family size is large
• “Genomic Selection”


Using the realised A-matrix: Reliability of EBV for an unphenotyped individual from n-1 phenotyped relatives (a simulation study)