Introduction to some basic concepts in quantitative genetics Course “Study Design and Data Analysis for Genetic Studies”, Universidad del Zulia, Maracaibo, Venezuela, 6 April 2005 Harald H.H. Göring


Introduction to some basic concepts

in quantitative genetics

Course “Study Design and Data Analysis for Genetic Studies”, Universidad del Zulia, Maracaibo, Venezuela, 6 April 2005

Harald H.H. Göring

“Nature vs. nurture”

[Figure: a trait is shaped by genes and environment. A scale runs from 100% genetic / 0% environmental contribution (“mendelian” traits), through “complex” traits, to 0% genetic / 100% environmental contribution (infections, accidental injuries).]


“Marker” loci

There are many different types of polymorphisms, e.g.:

• single nucleotide polymorphism (SNP):

AAACATAGACCGGTT

AAACATAGCCCGGTT

• microsatellite/variable number of tandem repeat (VNTR):

AAACATAGCACACA--CCGGTT

AAACATAGCACACACACCGGTT

• insertion/deletion (indel):

AAACATAGACCACCGGTT

AAACATAG----CCGGTT

• restriction fragment length polymorphism (RFLP)


Genetic variation in numbers

There are ~$6 \times 10^9$ humans on earth, and thus ~$12 \times 10^9$ copies of each autosomal chromosome. Assuming a mutation rate of ~$1 \times 10^{-8}$ per nucleotide per generation, every single nucleotide will be mutated (~$12 \times 10^9$) × (~$1 \times 10^{-8}$) = ~120 times in each new generation of earthlings. Thus, every nucleotide will be polymorphic in Homo sapiens, except for those where variation is incompatible with life.

Any 2 chromosomes differ from each other every ~1,000 bp. The 2 chromosomal sets inherited from the mother and the father (each with a length of ~$3 \times 10^9$ bp) therefore differ from each other at ~$3 \times 10^9$ / ~1,000 = ~$3 \times 10^6$, or ~3 million, locations.
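The arithmetic on this slide can be checked in a few lines of Python. This is only a sketch: the variable names are illustrative, and the inputs are the slide's rounded figures (population size, a per-nucleotide, per-generation mutation rate, and one difference per ~1,000 bp).

```python
# Back-of-the-envelope check of the numbers quoted above.
# All inputs are the slide's rounded values; the names are illustrative.
humans = 6e9                      # ~6 x 10^9 people
chromosome_copies = 2 * humans    # two copies of each autosome per person
mutation_rate = 1e-8              # assumed: per nucleotide, per generation

new_mutations_per_site = chromosome_copies * mutation_rate

genome_length_bp = 3e9            # length of one chromosomal set
bp_per_difference = 1_000         # two chromosomes differ every ~1,000 bp
differing_sites = genome_length_bp / bp_per_difference

print(f"new mutations per nucleotide per generation: ~{new_mutations_per_site:.0f}")
print(f"sites differing between the two chromosomal sets: ~{differing_sites:.0e}")
```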

Definitions of some important terms

locus: a position in the DNA sequence, defined relative to others; in different contexts, this might mean a specific polymorphism or a very large region of DNA sequence in which a gene might be located

gene: the sum total of the DNA sequence in a given region related to transcription of a given RNA, including introns, exons, and regulatory regions

polymorphism: the existence of 2 or more variants at some locus

allele: one of the variant forms of either a gene or a polymorphism

neutral allele: any allele which has no effect on reproductive fitness; a neutral allele could affect a phenotype, as long as the phenotype itself has no effect on fitness

silent allele: any allele which has no effect on the phenotype under study; a silent allele can affect other phenotype(s) and reproductive fitness

disease-predisposing allele: any allele which increases susceptibility to a given disease; this should not be called a mutation

mutation: the process by which the DNA sequence is altered, resulting in a different allele

Genetics vs. epidemiology: aggregate effects

• The sharing of environmental factors among related (as well as unrelated) individuals is hard to quantify as an aggregate.

• In contrast, the sharing of genetic factors among related (as well as unrelated) individuals is easy to quantify, because inheritance of genetic material follows very simple rules.

• Aggregate sharing of genetic material can therefore be predicted fairly accurately without measurements, e.g.:
  – a parent and his/her child share exactly 50% of their genetic material (autosomal DNA)
  – siblings share on average 50% of their genetic material
  – a grandparent and his/her grandchild (or half-sibs or avuncular individuals) share on average 25% of their genetic material

• genome as aggregate “exposure”: While it is not clear whether an individual has been “exposed” to good or bad factors, “co-exposure” among relatives is predictable.

Use of genetic similarity of relatives

• The genetic similarity of relatives, a result of inheritance of copies of the same DNA from a common ancestor, is the basis for
  – heritability analysis
  – segregation analysis
  – linkage analysis
  – linkage disequilibrium analysis
  – relationship inference
    • between close relatives (e.g., identification of human remains, paternity disputes)
    • between distant groups of individuals from the same species (e.g., analysis of migration pattern)
    • between different species (e.g., analysis of phylogenetic trees)
  – identification of conserved DNA sequences through sequence alignment
  – …


Relatives are not i.i.d.

• Unlike many random variables in many areas of statistics, the phenotypes and genotypes of related individuals are not independent and identically distributed (i.i.d.).

• Many standard statistical tests therefore cannot, or should not, be applied in the analysis of relatives.

• Most analyses on related individuals use likelihood-based statistical approaches, due to the modeling flexibility of this very general statistical framework.

“Mendelian” vs. “complex” traits

“simple mendelian” disease

• genotypes of a single locus cause disease
• often little genetic (locus) heterogeneity (sometimes even little allelic heterogeneity); little interaction between genotypes at different genes
• often hardly any environmental effects
• often low prevalence
• often early onset
• often clear mode of inheritance
• “good” pedigrees for gene mapping can often be found
• often straightforward to map

“complex multifactorial” disease

• genotypes of a single locus merely increase risk of disease
• genotypes of many different genes (and various environmental factors) jointly and often interactively determine the disease status
• important environmental factors
• often high prevalence
• often late onset
• no clear mode of inheritance
• not easy to find “good” pedigrees for gene mapping
• difficult to map

Genetic heterogeneity

[Figure: four panels, each drawn along a time axis:
• locus homogeneity, allelic homogeneity
• locus homogeneity, allelic heterogeneity
• locus heterogeneity, allelic homogeneity (at each locus)
• locus heterogeneity, allelic heterogeneity (at each locus)]


Study design

different traits

different study designs

different analytical methods

How to simplify the etiological architecture?

• choose tractable trait
  – Are there sub-phenotypes within trait?
    • age of onset
    • severity
    • combination of symptoms (syndrome)
  – “endophenotype” or “biomarker” vs. disease
    • quantitative vs. qualitative (discrete)
    • Dichotomizing quantitative phenotypes leads to loss of information.
    • simple/cheap measurement vs. uncertain/expensive diagnosis
    • not as clinically relevant, but with simpler etiology
• given trait, choose appropriate study design/ascertainment protocol
  – study population
    • genetic heterogeneity
    • environmental heterogeneity
  – “random” ascertainment vs. ascertainment based on phenotype of interest
    • single or multiple probands
    • concordant or discordant probands
    • pedigrees with apparent “mendelian” inheritance?
    • inbred pedigrees?
  – data structures
    • singletons, small pedigrees, large pedigrees
  – account for/stratify by known genetic and environmental risk factors

Qualitative and quantitative traits

• qualitative or discrete traits:
  – disease (often dichotomous; assessed by diagnosis): Huntington’s disease, obesity, hypertension, …
  – serological status (seropositive or seronegative)
  – Drosophila melanogaster bristle number
• quantitative or continuous traits:
  – height, weight, body mass index, blood pressure, …
  – assessed by measurement

[Figure: a discrete trait (e.g. hypertension) takes the values 0 or 1; a continuous trait (e.g. blood pressure) is shown on a continuous scale.]

Why use a quantitative trait? Why not?


Pros and cons of disease vs. quantitative trait

disease

• for rare disease, limited variation in random sample; need for non-random ascertainment

• for late-onset diseases, it is difficult/impossible to find multigenerational pedigrees

• diagnosis: often difficult, subjective, arbitrary

• treatment may cure disease or weaken symptoms, but original disease status is generally still known

• of great clinical interest

• often more complex etiologically

continuous trait

• sufficient variation in random sample; non-random ascertainment may not be necessary or advisable

• as no special ascertainment is necessary, any pedigree is suitable

• measurement: often straightforward, reliable

• medications and other covariates may influence phenotype

• often only of limited/indirect clinical interest

• often simpler etiologically

Dichotomizing quantitative phenotypes generally leads to loss of information

[Figure: probability density of the phenotype, divided into “unaffected” and “affected” regions.]

Characterization of a quantitative trait

mean (center of distribution):

$$\mu = E[X] = \frac{\sum_i X_i}{n}$$

variance (spread around center):

$$\sigma^2 = E\left[(X - E[X])^2\right] = \frac{\sum_i (X_i - \mu)^2}{n}$$

skewness (symmetry):

$$E\left[(X - E[X])^3\right] = \frac{\sum_i (X_i - \mu)^3}{n}$$

kurtosis (thickness of tails):

$$E\left[(X - E[X])^4\right] = \frac{\sum_i (X_i - \mu)^4}{n}$$
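As a small illustration, the four quantities can be computed directly from their definitions. The sketch below follows the slide's formulas literally, so the third and fourth moments are left unstandardized (many statistics packages additionally divide them by $\sigma^3$ and $\sigma^4$). The function names and the sample values are made up for the example.

```python
def central_moment(values, k):
    """k-th central moment: sum((x - mean)**k) / n, as in the formulas above."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n

def describe(values):
    """Return the mean, the variance, and the 3rd and 4th central moments."""
    return {
        "mean": sum(values) / len(values),
        "variance": central_moment(values, 2),
        "third_moment": central_moment(values, 3),    # sign indicates skew
        "fourth_moment": central_moment(values, 4),   # dominated by the tails
    }

# Hypothetical sample with one large value, i.e. a long right tail.
sample = [1.0, 2.0, 2.0, 3.0, 7.0]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name:14s} {value:.2f}")
```

A positive third moment, as in this sample, indicates that the long tail lies to the right of the mean.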


How can a continuous trait result from discrete genetic variation?

Suppose 4 genes influence the trait, each with 2 equally frequent alleles. Assume that at each locus allele 1 decreases the phenotype of an individual by 1, and that allele 2 increases the phenotype by 1.

Now, let us obtain a random sample from the population by coin tossing. Take 2 coins and toss them. 2 tails mean genotype 11, and a phenotype contribution of -2. 2 heads mean genotype 22, and a phenotype contribution of +2. 1 head and 1 tail is a heterozygote (genotype 12), with a phenotype contribution of 0. Repeat this experiment 4 times (once for each locus). Sum up the results to obtain the overall phenotype.
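The coin-tossing experiment is easy to mimic in code. The sketch below is one possible rendering, not part of the original course material: it enumerates all $2^8$ equally likely outcomes to get the exact phenotype distribution and compares it with a simulated sample. The seed and sample size are arbitrary choices.

```python
import random
from collections import Counter
from itertools import product

N_LOCI = 4

def simulate_phenotype(rng):
    """Toss 2 coins per locus: tail = allele 1 (-1), head = allele 2 (+1)."""
    return sum(rng.choice((-1, +1)) for _ in range(2 * N_LOCI))

def exact_distribution():
    """Enumerate all 2**8 equally likely toss outcomes."""
    counts = Counter(sum(tosses) for tosses in product((-1, +1), repeat=2 * N_LOCI))
    total = sum(counts.values())
    return {value: count / total for value, count in sorted(counts.items())}

rng = random.Random(2005)   # fixed seed so the run is repeatable
N_SAMPLES = 10_000
sample = Counter(simulate_phenotype(rng) for _ in range(N_SAMPLES))
exact = exact_distribution()

for value, prob in exact.items():
    bar = "#" * round(prob * 100)
    print(f"{value:+d}  exact={prob:.3f}  simulated={sample[value] / N_SAMPLES:.3f}  {bar}")
```

Even with only 4 loci the phenotype takes 9 distinct values with a bell-shaped distribution; adding more loci and environmental noise would smooth it further.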

Variance decomposition

$$\sigma_p^2 = \sigma_g^2 + \sigma_e^2$$

$\sigma_p^2$: phenotypic variance due to all causes

$\sigma_g^2$: phenotypic variance due to genetic variation

$\sigma_e^2$: phenotypic variance due to environmental variation

Decomposition of phenotypic variance attributable to genetic variation

$$\sigma_g^2 = \sigma_a^2 + \sigma_d^2$$

$\sigma_g^2$: phenotypic variance due to genetic variation

$\sigma_a^2$: phenotypic variance due to additive effects of genetic variation

$\sigma_d^2$: phenotypic variance due to dominant effects of genetic variation

[Figure: phenotypic means of genotypes on a line: AA at $-a$, BB at $+a$, the midpoint at 0, and the heterozygote AB at $d$.]

$$\sigma_a^2 = 2pq\left(a + d(q - p)\right)^2$$

$$\sigma_d^2 = (2pqd)^2$$

[Figure: phenotypic means of genotypes on a line: AA at $-a$, BB at $+a$, and the heterozygote AB at the midpoint, $d = 0$.]

If the phenotypic mean of the heterozygote is halfway between the two homozygotes, there is a “dose-response” effect, i.e. each dose of allele B increases the phenotype by the same amount. In this case, d = 0, and there is no dominance (interaction between alleles at the same polymorphism).

$$\sigma_d^2 = (2pqd)^2 = (2pq \times 0)^2 = 0$$
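The two variance formulas can be checked numerically. The sketch below makes assumptions the slides do not state explicitly: p is taken to be the frequency of allele B (the allele that raises the phenotype), q = 1 - p, and genotype frequencies follow Hardy-Weinberg proportions. The numeric values are arbitrary.

```python
def variance_components(p, a, d):
    """Additive and dominance variance at one biallelic locus.

    Assumptions: p is the frequency of allele B (genotype BB has mean +a),
    q = 1 - p, and genotypes are in Hardy-Weinberg proportions.
    """
    q = 1.0 - p
    var_additive = 2 * p * q * (a + d * (q - p)) ** 2
    var_dominance = (2 * p * q * d) ** 2
    return var_additive, var_dominance

def genotypic_variance(p, a, d):
    """Variance of the genotype means computed directly (AA: -a, AB: d, BB: +a)."""
    q = 1.0 - p
    freqs_and_means = [(q * q, -a), (2 * p * q, d), (p * p, +a)]
    mean = sum(f * m for f, m in freqs_and_means)
    return sum(f * (m - mean) ** 2 for f, m in freqs_and_means)

# With dominance (d != 0): the two components add up to the total.
va, vd = variance_components(p=0.3, a=1.0, d=0.5)
total = genotypic_variance(p=0.3, a=1.0, d=0.5)
print(f"sigma_a^2 = {va:.4f}, sigma_d^2 = {vd:.4f}, direct total = {total:.4f}")

# Without dominance (d = 0): all genetic variance is additive.
va0, vd0 = variance_components(p=0.3, a=1.0, d=0.0)
print(f"d = 0: sigma_a^2 = {va0:.4f}, sigma_d^2 = {vd0:.4f}")
```

Comparing against the directly computed variance is a useful sanity check that $\sigma_a^2 + \sigma_d^2 = \sigma_g^2$ under these assumptions.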

Decomposition of phenotypic variance attributable to environmental variation

$$\sigma_e^2 = \sigma_c^2 + \sigma_u^2$$

$\sigma_e^2$: phenotypic variance due to environmental variation

$\sigma_c^2$: phenotypic variance due to environmental variation common among individuals (e.g., culture, household)

$\sigma_u^2$: phenotypic variance due to environmental variation unique to an individual

Definition of heritability

The proportion of the phenotypic variance in a trait that is attributable to the effects of genetic variation.

$$h^2 = \frac{\sigma_g^2}{\sigma_p^2}$$

The absolute values of variance attributable to a specific factor are not important, as they depend on the scale of the phenotype. It is the relative values of variance that matter.
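A toy calculation makes the point about scale concrete. In this sketch the genetic and environmental contributions are invented numbers, constructed to be uncorrelated; changing the unit of measurement rescales every variance but leaves the ratio $h^2$ untouched.

```python
def variance(values):
    """Population variance: mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def heritability(genetic, phenotype):
    """Ratio of genetic variance to total phenotypic variance."""
    return variance(genetic) / variance(phenotype)

# Hypothetical, uncorrelated genetic and environmental contributions (in cm).
genetic_cm = [-4.0, -4.0, 4.0, 4.0]
environment_cm = [-2.0, 2.0, -2.0, 2.0]
phenotype_cm = [g + e for g, e in zip(genetic_cm, environment_cm)]
h2_cm = heritability(genetic_cm, phenotype_cm)

# The same data expressed in inches: every variance shrinks by 2.54**2 ...
genetic_in = [g / 2.54 for g in genetic_cm]
phenotype_in = [x / 2.54 for x in phenotype_cm]
h2_in = heritability(genetic_in, phenotype_in)   # ... but the ratio is unchanged

print(f"variance in cm: {variance(phenotype_cm):.2f}, in inches: {variance(phenotype_in):.2f}")
print(f"h2 in cm: {h2_cm:.2f}, h2 in inches: {h2_in:.2f}")
```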

Broad-sense and narrow-sense heritability

The proportion of the phenotypic variance in a trait that is attributable to:

- effects of genetic variation (broad sense):

$$h^2 = \frac{\sigma_g^2}{\sigma_p^2}$$

- additive effects of genetic variation (narrow sense):

$$h^2 = \frac{\sigma_a^2}{\sigma_p^2}$$

“Nature vs. nurture”

[Figure: a trait is shaped by genes and environment, on a scale from 100% genetic / 0% environmental contribution to 0% genetic / 100% environmental contribution.]

Different degrees of relationship have different phenotypic covariance/correlation

relative pair    phenotypic covariance                              phenotypic correlation
parent-child     $\frac{1}{2}\sigma_a^2$                            $\frac{1}{2}h^2$
full sibs        $\frac{1}{2}\sigma_a^2 + \frac{1}{4}\sigma_d^2$    $> \frac{1}{2}h^2$
half sibs        $\frac{1}{4}\sigma_a^2$                            $\frac{1}{4}h^2$
first cousins    $\frac{1}{8}\sigma_a^2$                            $\frac{1}{8}h^2$

(assuming absence of effect of shared environment)
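The table can be turned into a small lookup that returns the expected covariance and correlation for a relative pair. This is a sketch under the slide's assumption of no shared environment; the variance components used in the example are hypothetical, and $h^2$ is taken to be the narrow-sense heritability $\sigma_a^2 / \sigma_p^2$.

```python
# Coefficients on sigma_a^2 and sigma_d^2 for the relative pairs in the table above.
COEFFICIENTS = {
    "parent-child":  (1 / 2, 0.0),
    "full sibs":     (1 / 2, 1 / 4),
    "half sibs":     (1 / 4, 0.0),
    "first cousins": (1 / 8, 0.0),
}

def expected_covariance(pair, var_a, var_d):
    """Expected phenotypic covariance, ignoring shared environment."""
    additive_coef, dominance_coef = COEFFICIENTS[pair]
    return additive_coef * var_a + dominance_coef * var_d

def expected_correlation(pair, var_a, var_d, var_e):
    """Covariance divided by the total phenotypic variance."""
    var_p = var_a + var_d + var_e
    return expected_covariance(pair, var_a, var_d) / var_p

# Hypothetical variance components, chosen only for illustration.
VAR_A, VAR_D, VAR_E = 0.5, 0.1, 0.4
h2 = VAR_A / (VAR_A + VAR_D + VAR_E)   # narrow-sense heritability
for pair in COEFFICIENTS:
    r = expected_correlation(pair, VAR_A, VAR_D, VAR_E)
    print(f"{pair:14s} correlation = {r:.4f}  (h2 = {h2:.2f})")
```

With a non-zero dominance variance, the full-sib correlation exceeds the parent-child correlation, which is the “$> \frac{1}{2}h^2$” entry in the table.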

MZ and DZ twins have different phenotypic covariance/correlation

relative pair      phenotypic covariance                                                phenotypic correlation
identical twins    $\sigma_a^2 + \sigma_d^2 \; (+ \sigma_c^2)$                          $> h^2$
fraternal twins    $\frac{1}{2}\sigma_a^2 + \frac{1}{4}\sigma_d^2 \; (+ \sigma_c^2)$    $> \frac{1}{2}h^2$
2 × difference     $\sigma_a^2 + \frac{3}{2}\sigma_d^2$                                 $> h^2$

(assuming equal effect of shared environment)

Normal distribution

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\frac{(x-\mu)^2}{\sigma^2}}$$

[Figure: normal density curve, f(x) plotted against x.]

Variance components approach: multivariate normal distribution (MVN)

In variance components analysis, the phenotype is generally assumed to follow a multivariate normal distribution:

$$f(x) = \frac{1}{\left((2\pi)^n |\Omega|\right)^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)'\Omega^{-1}(x-\mu)\right)$$

$$\ln f(x) = -\frac{n}{2}\ln(2\pi) - \frac{1}{2}\ln|\Omega| - \frac{1}{2}(x-\mu)'\Omega^{-1}(x-\mu)$$

$n$: no. of individuals (in a pedigree)

$\Omega$: $n \times n$ covariance matrix

$x$: phenotype vector

$\mu$: mean phenotype vector
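To make the log-density concrete, here is a minimal pure-Python sketch that evaluates $\ln f(x)$ for one pedigree. It uses a Cholesky factorization to obtain both $\ln|\Omega|$ and the quadratic form; that is an implementation choice for the example, not something prescribed by the slides, and the phenotypes and covariance values are invented.

```python
import math

def cholesky(matrix):
    """Lower-triangular L with L L' = matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def mvn_log_density(x, mu, omega):
    """ln f(x) = -n/2 ln(2 pi) - 1/2 ln|Omega| - 1/2 (x-mu)' Omega^-1 (x-mu)."""
    n = len(x)
    lower = cholesky(omega)
    # Solve L z = (x - mu) by forward substitution; z'z is then the quadratic form.
    z = []
    for i in range(n):
        s = sum(lower[i][k] * z[k] for k in range(i))
        z.append((x[i] - mu[i] - s) / lower[i][i])
    quad_form = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(n))
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * log_det - 0.5 * quad_form

# Hypothetical pair of relatives with phenotypic covariance 0.4 and variance 1.
omega = [[1.0, 0.4], [0.4, 1.0]]
log_f = mvn_log_density([0.5, -0.2], [0.0, 0.0], omega)
print(f"ln f(x) = {log_f:.4f}")
```

The log-likelihood of a whole data set would be the sum of such terms over independent pedigrees.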

Variance-covariance matrix

The variance-covariance matrix describes the phenotypic covariance among pedigree members.

$$\Omega = \sum_i \Omega_i \sigma_i^2$$

$\Omega_i$: $n \times n$ structuring matrix

$\sigma_i^2$: scalar variance component (random effect)

“Sporadic” model: no phenotypic resemblance between relatives

$$\Omega = I\sigma_u^2$$

In the simplest model, the phenotypic covariance among pedigree members is only influenced by environmental exposure unique to each individual. Shared factors among relatives, such as genetic and environmental factors, do not influence the trait.

identity matrix (rows and columns indexed by individuals 1, 2, ..., n):

$$I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$

Identity matrix

[Figure: pedigree of a nuclear family with parents f and m and children 1, 2, 3.]

      f    m    1    2    3
f     1    0    0    0    0
m     0    1    0    0    0
1     0    0    1    0    0
2     0    0    0    1    0
3     0    0    0    0    1

Modeling phenotypic resemblance between relatives: “polygenic” model

$$\Omega = I\sigma_u^2 + 2\Phi\sigma_a^2$$

$\Phi$: kinship matrix

Kinship and relationship matrix

kinship matrix $\Phi$:

Each element in the kinship matrix contains the probability that the allele at a locus randomly drawn from the 2 chromosomal sets in a person is a copy of the same allele at the same locus randomly drawn from the 2 chromosomal sets in another person. For one individual, $\Phi$ = 0.5, assuming absence of inbreeding.

relationship matrix $2\Phi$:

This provides the probability that a given locus is shared identical-by-descent among 2 individuals. This is equivalent to the expected proportion of the genome that 2 individuals share in common due to common ancestry. For one individual, $2\Phi$ = 1, assuming absence of inbreeding.

Relationship matrix and $\Delta_7$ matrix

relationship                 $2\Phi$    $\Delta_7$
self                         1          1
MZ twin pair                 1          1
DZ twin pair                 0.5        0.25
full sibs                    0.5        0.25
half sibs                    0.25       0
grandparent - grandchild     0.25       0
avuncular                    0.25       0
first cousin                 1/8        0
second cousin                1/32       0

Relationship matrix: nuclear family

[Figure: pedigree of a nuclear family with parents f and m and children 1, 2, 3.]

      f      m      1      2      3
f     1      0      0.5    0.5    0.5
m     0      1      0.5    0.5    0.5
1     0.5    0.5    1      0.5    0.5
2     0.5    0.5    0.5    1      0.5
3     0.5    0.5    0.5    0.5    1

Relationship matrix: half-sibs

[Figure: pedigree in which m has child 1 with f1 and child 2 with f2.]

      f1     m      f2     1      2
f1    1      0      0      0.5    0
m     0      1      0      0.5    0.5
f2    0      0      1      0      0.5
1     0.5    0.5    0      1      0.25
2     0      0.5    0.5    0.25   1
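Kinship coefficients such as those behind the nuclear-family and half-sib matrices above can be computed recursively from a pedigree listing. The sketch below uses the usual recursion (the kinship of a child with anyone listed earlier is the average of the parents' kinships with that person); the tuple layout and the function name are choices made for this example, not an existing library's API.

```python
def kinship_matrix(pedigree):
    """Kinship coefficients for a pedigree.

    `pedigree` is a list of (person, parent_a, parent_b) tuples, with parents
    listed before their children and None for the parents of founders.
    Returns a dict mapping (person, other) to their kinship coefficient.
    """
    phi = {}
    seen = []
    for person, parent_a, parent_b in pedigree:
        founder = parent_a is None or parent_b is None
        for other in seen:
            if founder:
                value = 0.0   # founders are assumed unrelated to everyone earlier
            else:
                value = 0.5 * (phi[parent_a, other] + phi[parent_b, other])
            phi[person, other] = phi[other, person] = value
        # Self-kinship is 0.5 unless the parents are themselves related.
        phi[person, person] = 0.5 if founder else 0.5 * (1.0 + phi[parent_a, parent_b])
        seen.append(person)
    return phi

# The half-sib pedigree from the slide: m has child 1 with f1 and child 2 with f2.
half_sib_pedigree = [
    ("f1", None, None), ("m", None, None), ("f2", None, None),
    ("1", "f1", "m"), ("2", "f2", "m"),
]
phi = kinship_matrix(half_sib_pedigree)
people = [row[0] for row in half_sib_pedigree]
for a in people:
    print(a.ljust(3), "  ".join(f"{2 * phi[a, b]:.2f}" for b in people))
```

The printed values are $2\Phi$, so they can be compared entry by entry with the half-sib relationship matrix.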

Likelihood

• The likelihood of a hypothesis (e.g. specific parameter value(s)) on a given dataset, L(hypothesis|data), is defined to be proportional to the probability of the data given the hypothesis, P(data|hypothesis):

L(hypothesis|data) = constant × P(data|hypothesis)

• Because of the proportionality constant, a likelihood by itself has no interpretation.

• The likelihood ratio (LR) of 2 hypotheses is meaningful if the 2 hypotheses are nested (i.e., one hypothesis is contained within the other):

$$LR = \frac{L(H_1 \mid data)}{L(H_0 \mid data)} = \frac{c\,P(data \mid H_1)}{c\,P(data \mid H_0)} = \frac{P(data \mid H_1)}{P(data \mid H_0)}$$

• Under certain conditions, maximum likelihood estimates are asymptotically unbiased and asymptotically efficient. Likelihood theory describes how to interpret a likelihood ratio.

Inference in heritability analysis

H0: (Additive) genetic variation does not contribute to phenotypic variation

H1: (Additive) genetic variation does contribute to phenotypic variation

$$\Lambda = -2 \ln \frac{L(H_0)}{L(H_1)} = -2 \ln \frac{L(\sigma_a^2 = 0, \tilde{\sigma}_u^2)}{L(\sigma_a^2, \sigma_u^2)}$$

heritability:

$$h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2}$$

Modeling phenotypic resemblance between relatives: “polygenic” model allowing for dominance

$$\Omega = I\sigma_u^2 + 2\Phi\sigma_a^2 + \Delta_7\sigma_d^2$$

$\Delta_7$: matrix of probabilities that 2 individuals inherited the same alleles on both chromosomes from 2 common ancestors

Relationship matrix and $\Delta_7$ matrix

relationship                 $2\Phi$    $\Delta_7$
self                         1          1
MZ twin pair                 1          1
DZ twin pair                 0.5        0.25
full sibs                    0.5        0.25
half sibs                    0.25       0
grandparent - grandchild     0.25       0
avuncular                    0.25       0
first cousin                 1/8        0
second cousin                1/32       0

$\Delta_7$ matrix: nuclear family

[Figure: pedigree of a nuclear family with parents f and m and children 1, 2, 3.]

      f      m      1      2      3
f     1      0      0      0      0
m     0      1      0      0      0
1     0      0      1      0.25   0.25
2     0      0      0.25   1      0.25
3     0      0      0.25   0.25   1

Inference in heritability analysis

H0: (Additive) genetic variation does not contribute to phenotypic variation

H1: (Additive) genetic variation does contribute to phenotypic variation

$$\Lambda = -2 \ln \frac{L(H_0)}{L(H_1)} = -2 \ln \frac{L(\sigma_a^2 = 0, \sigma_d^2 = 0, \tilde{\sigma}_e^2)}{L(\sigma_a^2, \sigma_d^2, \sigma_e^2)}$$

2 degrees of freedom
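Once the two maximized log-likelihoods are available, the test statistic is a one-line computation. The sketch below is a simplified illustration: it treats $\Lambda$ as chi-square distributed with the stated degrees of freedom (2 for the test above; 1 is assumed here for the earlier test of $\sigma_a^2$ alone, since only one parameter is constrained), and the log-likelihood values are hypothetical.

```python
import math

def likelihood_ratio_test(log_lik_h0, log_lik_h1, df):
    """Return (Lambda, p-value) with Lambda = -2 ln(L(H0) / L(H1)).

    The p-value uses the chi-square distribution with `df` degrees of
    freedom. Closed forms are used, so only df = 1 or df = 2 is handled.
    """
    # Guard against tiny negative values caused by numerical noise.
    statistic = max(0.0, -2.0 * (log_lik_h0 - log_lik_h1))
    if df == 1:
        p_value = math.erfc(math.sqrt(statistic / 2.0))
    elif df == 2:
        p_value = math.exp(-statistic / 2.0)
    else:
        raise ValueError("only df = 1 or df = 2 is handled in this sketch")
    return statistic, p_value

# Hypothetical maximized log-likelihoods under H0 and H1.
LOG_LIK_H0, LOG_LIK_H1 = -105.0, -100.0
for df in (1, 2):
    statistic, p_value = likelihood_ratio_test(LOG_LIK_H0, LOG_LIK_H1, df)
    print(f"df = {df}: Lambda = {statistic:.1f}, p = {p_value:.4g}")
```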

Is it reasonable to assume that the only source for phenotypic resemblance among relatives is genetic?

No. To overcome this problem, one can try to model shared environment, either in aggregate or broken into specific environmental factors.

$$\Omega = I\sigma_u^2 + H\sigma_c^2 + 2\Phi\sigma_a^2$$

$H$: household matrix; accounts for aggregate of environmental factors shared among individuals living in the same household

Household matrix

[Figure: pedigree of a nuclear family with parents f and m and children 1, 2, 3.]

      f    m    1    2    3
f     1    1    1    0    0
m     1    1    1    0    0
1     1    1    1    0    0
2     0    0    0    1    0
3     0    0    0    0    1
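A household matrix like this one can be generated from household labels, and combined with the identity and relationship matrices to give $\Omega$. The sketch below is illustrative only: the household labels and variance components are invented, and the relationship matrix is the nuclear-family $2\Phi$ shown earlier.

```python
def household_matrix(households):
    """H[i][j] = 1 if individuals i and j live in the same household, else 0."""
    return [[1 if hi == hj else 0 for hj in households] for hi in households]

def covariance_matrix(var_u, var_c, var_a, households, relationship):
    """Omega = I*var_u + H*var_c + (2*Phi)*var_a, built element by element."""
    n = len(households)
    h = household_matrix(households)
    return [
        [(var_u if i == j else 0.0) + h[i][j] * var_c + relationship[i][j] * var_a
         for j in range(n)]
        for i in range(n)
    ]

# Order: f, m, 1, 2, 3. The labels are hypothetical; they encode the slide's
# example in which f, m and child 1 share a household.
households = ["A", "A", "A", "B", "C"]
H = household_matrix(households)

# 2*Phi for the nuclear family.
two_phi = [
    [1.0, 0.0, 0.5, 0.5, 0.5],
    [0.0, 1.0, 0.5, 0.5, 0.5],
    [0.5, 0.5, 1.0, 0.5, 0.5],
    [0.5, 0.5, 0.5, 1.0, 0.5],
    [0.5, 0.5, 0.5, 0.5, 1.0],
]
# Hypothetical variance components: sigma_u^2, sigma_c^2, sigma_a^2.
omega = covariance_matrix(0.3, 0.2, 0.5, households, two_phi)
for row in omega:
    print("  ".join(f"{value:.2f}" for value in row))
```

In this example the spouses f and m covary only through the shared household, while sibs 2 and 3 covary only through shared genes.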

“Household” effect

$$c^2 = \frac{\sigma_c^2}{\sigma_p^2}$$

Nested models for heritability analysis

model                    $\sigma_u^2$    $\sigma_c^2$    $\sigma_a^2$
“sporadic”               +               -               -
“household”              +               +               -
“additive polygenic”     +               -               +
“general”                +               +               +

The “household” and “additive polygenic” models are non-nested hypotheses.

Inclusion of covariates

Measured covariates can easily be incorporated as “fixed effects” in the multivariate normal model of the phenotype, by making the expected phenotype different for different individuals as a function of the measured covariates.

$$\ln f(x) = -\frac{n}{2}\ln(2\pi) - \frac{1}{2}\ln|\Omega| - \frac{1}{2}(x-\mu)'\Omega^{-1}(x-\mu)$$

$$\mu = \mu_{overall} + Y\beta$$

$$\mu_i = \mu_{overall} + \sum_j \beta_j Y_{ij}$$

Inclusion of covariates

If covariates are not of interest in and of themselves, one can “regress them out” before pedigree analysis.

$$\hat{X}_i = \beta_0 + \sum_j \beta_j Y_{ij}$$

$$\hat{X} = Y\beta$$

$$X_i - \hat{X}_i = e_i$$

$$X - \hat{X} = e$$

Then use residuals as phenotype of interest in pedigree analysis.
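Regressing out covariates amounts to an ordinary least-squares fit followed by taking residuals. The following pure-Python sketch solves the normal equations directly, which is adequate for a handful of covariates; the data (age and sex as covariates, an unnamed quantitative phenotype) are invented for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def regress_out(phenotype, covariates):
    """Least-squares fit X_i = b0 + sum_j b_j Y_ij; returns (beta, residuals)."""
    design = [[1.0] + list(row) for row in covariates]      # add intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * x for r, x in zip(design, phenotype)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    residuals = [x - f for x, f in zip(phenotype, fitted)]
    return beta, residuals

# Hypothetical data: covariates are (age in years, sex coded 0/1).
covariates = [(30, 0), (45, 1), (52, 0), (61, 1), (38, 1), (27, 0)]
phenotype = [120.0, 131.0, 135.0, 142.0, 127.0, 118.0]
beta, residuals = regress_out(phenotype, covariates)
print("beta:", [round(b, 3) for b in beta])
print("residuals:", [round(e, 3) for e in residuals])
```

The residuals sum to zero and are uncorrelated with each covariate; they would then serve as the phenotype vector in the pedigree analysis.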

Inference regarding covariates in heritability analysis

H0: measured covariate Y does not influence phenotype.

H1: measured covariate Y does influence phenotype.

$$\Lambda = -2 \ln \frac{L(H_0)}{L(H_1)} = -2 \ln \frac{L(\sigma_a^2, \sigma_u^2, \beta = 0)}{L(\sigma_a^2, \sigma_u^2, \beta)}$$

Inference regarding covariates in heritability analysis

H0: measured covariate Y does not influence phenotype.

H1: measured covariate Y does influence phenotype.

$$\Lambda = -2 \ln \frac{L(H_0)}{L(H_1)} = -2 \ln \frac{L(\sigma_u^2, \beta = 0)}{L(\sigma_u^2, \beta)}$$

CAUTION:

Related individuals in pedigrees are treated as unrelated. This can easily lead to false positive findings regarding the effect of the covariate!


Choice of covariates

Covariates ought to be included in the likelihood model if they are known to influence the phenotype of interest and if their own genetic regulation does not overlap the genetic regulation of the target phenotype.

Typical examples include sex and age.

In the analysis of height, information on nutrition during childhood should probably be included during analysis. However, known growth hormone levels probably should not be.

Choice of covariates

[Figure: diagrams of the phenotypic variance $\sigma_p^2$ with its components $\sigma_a^2$ and $\sigma_{cov}^2$, without and with the covariate.]

$$h^2_{\text{without cov}} = \frac{\sigma_a^2}{\sigma_p^2} \approx 0.5 > 0.25 \approx \frac{\sigma_a^2 - (\sigma_a^2 \cap \sigma_{cov}^2)}{\sigma_p^2 - \sigma_{cov}^2} = h^2_{\text{with cov}}$$

Choice of covariates

[Figure: diagrams of the phenotypic variance $\sigma_p^2$ with its components $\sigma_a^2$ and $\sigma_{cov}^2$, without and with the covariate.]

$$h^2_{\text{without cov}} = \frac{\sigma_a^2}{\sigma_p^2} \approx 0.2 < 0.3 \approx \frac{\sigma_a^2 - (\sigma_a^2 \cap \sigma_{cov}^2)}{\sigma_p^2 - \sigma_{cov}^2} = h^2_{\text{with cov}}$$
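The two cases above differ only in how much of the covariate's variance overlaps with the additive genetic variance. The sketch below plugs hypothetical variance components into the formula; the numbers are not from the slides and were chosen only so that the results land near the approximate values quoted (0.5 vs. 0.25, and 0.2 vs. 0.3).

```python
def h2_without_cov(var_a, var_p):
    """Heritability when the covariate is ignored."""
    return var_a / var_p

def h2_with_cov(var_a, var_p, var_cov, overlap):
    """(var_a - overlap) / (var_p - var_cov), where `overlap` is the part of
    the additive genetic variance that is also explained by the covariate."""
    return (var_a - overlap) / (var_p - var_cov)

# Hypothetical components, with the phenotypic variance scaled to 1.
# Case 1: the covariate overlaps heavily with the genetic variance.
before_1 = h2_without_cov(0.5, 1.0)
after_1 = h2_with_cov(0.5, 1.0, var_cov=0.4, overlap=0.35)

# Case 2: the covariate does not overlap with the genetic variance.
before_2 = h2_without_cov(0.2, 1.0)
after_2 = h2_with_cov(0.2, 1.0, var_cov=1 / 3, overlap=0.0)

print(f"case 1: h2 without cov = {before_1:.2f}, with cov = {after_1:.2f}")
print(f"case 2: h2 without cov = {before_2:.2f}, with cov = {after_2:.2f}")
```

Adjusting for a covariate can therefore lower or raise the heritability estimate, depending on whether the covariate removes mostly genetic or mostly non-genetic variance.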

Choice of covariates: special case of treatment/medication

Before treatment/medication of affected individuals

[Figure: probability density of the phenotype, with unaffected and affected individuals marked.]

After (partially effective) treatment / medication of affected individuals

[Figure: probability density of the phenotype, with unaffected and affected individuals marked; a label indicates the apparent effect of the covariate.]

Choice of covariates: special case of treatment/medication

• If medication is ineffective/partially effective, including treatment as a covariate is worse than ignoring it in the analysis.

• If medication is very effective, such that the phenotypic mean of individuals after treatment is equal to the phenotypic mean of the population as a whole, then including medication as a covariate has no effect.

• If medication is extremely effective, such that the phenotypic mean of individuals after treatment is “better” than the phenotypic mean of the population as a whole, then including medication as a covariate is better than ignoring it, but still far from satisfying.

• Either censor individuals or, better, infer or integrate over their phenotypes before treatment, based on information on efficacy etc.

Be careful in interpretation of heritability estimates

While one can attempt to account for shared environmental factors individually or in aggregate, it is notoriously difficult to do so. In contrast to genetics, where “co-exposure” among relatives is predictable due to inheritance rules, this is not the case with environmental factors of interest in epidemiology. If environmental co-exposure is not adequately modeled, shared environmental effects tend to inflate the heritability estimate, because shared exposure is generally greater among relatives, thus mimicking the effects of genetic similarity among relatives. Heritability estimates thus are often overestimates.

Be careful in interpretation of heritability estimates

Keep in mind that heritability estimates are applicable only to a specific population at a specific point in time.

Heritability of adult height (additive heritability, adjusted for sex and age)

study      sample size    heritability estimate
TOPS       2199           0.78
FLS        705            0.83
GAIT       324            0.88
SAFHS      903            0.76
SAFDS      737            0.92
SHFS
  AZ       643            0.80
  DK       675            0.81
  OK       647            0.79
Jiri       616            0.63
total      7449

Be careful in interpretation of heritability estimates

Heritability is a population level parameter, summarizing the strength of genetic influences on variation in a trait among members of the population. It does not provide any information regarding the phenotype in a given individual, such as risk of disease.

Relative risk

The risk of disease (or another phenotype) in a relative of an affected individual as compared to the risk of disease in a randomly chosen person from the population.

$$\lambda_{relationship} = \frac{p_{relationship}}{p}$$

Relative risk as a function of heritability

$$\lambda_{sib} = 1 + \frac{1 - p}{p}\left(0.5\,h_a^2 + 0.25\,h_d^2\right)$$

$p \to 0$ (rare disease): $\lambda_{sib} \to \infty$

$p \to 1$ (common disease): $\lambda_{sib} \to 1$
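The dependence of $\lambda_{sib}$ on prevalence is easy to see numerically. The sketch below simply evaluates the formula above; the heritability value of 0.5 and the list of prevalences are arbitrary illustrations, not estimates for any particular disease.

```python
def sibling_relative_risk(p, h2_additive, h2_dominance=0.0):
    """lambda_sib = 1 + (1 - p) / p * (0.5 * h_a^2 + 0.25 * h_d^2)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("prevalence p must be in (0, 1]")
    return 1.0 + (1.0 - p) / p * (0.5 * h2_additive + 0.25 * h2_dominance)

# Same (hypothetical) additive heritability, different prevalences.
PREVALENCES = (0.0004, 0.01, 0.2, 0.9)
risks = [sibling_relative_risk(p, 0.5) for p in PREVALENCES]
for p, risk in zip(PREVALENCES, risks):
    print(f"p = {p:<6}  lambda_sib = {risk:.2f}")
```

For a fixed heritability, the relative risk to sibs falls steeply as the trait becomes more common, which is why a large $\lambda_{sib}$ for a rare disease and a small one for a common disease can reflect similar heritabilities.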


Prevalence (p) and sibling relative risk ($\lambda_{sib}$) for selected phenotypes

phenotype        p         $\lambda_{sib}$
autism           0.0004    75
IDDM             0.004     15
schizophrenia    0.01      9
NIDDM            0.2       3
obesity          0.4       <2

Be careful in interpretation of heritability estimates

A heritability estimate is applicable only to a specific trait. If you alter the trait in any way, such as inclusion of additional/different covariates, this may alter the estimate and/or alter the interpretation of the finding.

Example:

• left ventricular mass not adjusted for blood pressure

• left ventricular mass adjusted for blood pressure