Page 1: QTL mapping

QTL mapping

Ades 2008, NHGRI

Simple Mendelian traits are caused by a single locus, and come in the ‘all-or-none’ flavor.

A Quantitative Trait is one in which many loci contribute. The phenotype can therefore vary in a ‘quantitative’ manner.

Modified from Mike White slides, 2010

Page 2: QTL mapping

Goals of QTL mapping

Ades 2008, NHGRI

To identify the loci that contribute to phenotypic variation

1. Cross two parents with extreme phenotypes

2. Score the progeny for the phenotype

3. Genotype the progeny at markers across the genome

4. Associate the observed phenotypic variation with the underlying genetic variation

5. Ultimate goal: identify causal polymorphisms that explain the phenotypic variation

Page 3: QTL mapping

Backcross

Broman and Sen 2009

Phenotype: Drug tolerance

80% 20% viability

Usually have at least 100 individuals

Page 4: QTL mapping

Intercross

Broman and Sen 2009

Phenotype: Drug tolerance

80% 20% viability

Page 5: QTL mapping

Backcross vs. Intercross

• An intercross recovers all three possible genotypes (AA, AB, BB). This allows dominance of either allele to be detected and the degree of dominance to be estimated.

• A backcross has more power to detect QTL with fewer individuals.

• A backcross may be the only possible scheme when crossing two different species.

Page 6: QTL mapping

Genetic map: specific markers spaced across the genome

Markers can be:

• SNPs at particular loci

• Variable-length repeats, e.g. Alu repeats

• All polymorphisms (if whole-genome sequences are available)

Ideally, markers should be spaced every 10-20 cM and span the whole genome
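To give a feel for what a 10-20 cM spacing means, the sketch below converts map distance to recombination fraction. It assumes Haldane's map function (no crossover interference), which the slides do not specify; the function name is illustrative only.

```python
import math

def haldane_recombination_fraction(distance_cm: float) -> float:
    """Recombination fraction for a map distance in cM.

    Assumes Haldane's map function (no crossover interference);
    other map functions (e.g. Kosambi) give slightly different values.
    """
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))

for d in (10, 20):
    r = haldane_recombination_fraction(d)
    print(f"{d} cM -> recombination fraction {r:.3f}")
# 10 cM -> recombination fraction 0.091
# 20 cM -> recombination fraction 0.165
```

Under this assumption, adjacent markers 10-20 cM apart recombine in roughly 9-16% of meioses, so a QTL lying between them is still fairly tightly linked to at least one marker.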

Page 7: QTL mapping

Genotype data: determine the allele at every marker in each F2 individual

Page 8: QTL mapping

Phenotype data

Page 9: QTL mapping

Statistical framework

Broman and Sen 2009

1. Missing data problem: use marker data to infer intervening genotypes

2. Model selection problem: how do the QTL across the genome combine with the covariates to generate the phenotype?

Page 10: QTL mapping

Marker regression: a simple t-test (or ANOVA) at each marker

Marker 1: no QTL

Marker 2: significant QTL (population means are different)
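The comparison on this slide can be sketched as a two-sample t statistic computed separately at each marker. This is a minimal illustration for a backcross (genotypes AA or AB), assuming equal variances in the two groups; the toy phenotypes and the function name are made up for the example and are not data from the slides.

```python
import math
from statistics import mean, variance

def marker_t_statistic(phenotypes, genotypes):
    """Pooled-variance t statistic comparing AA vs. AB phenotype means at one marker.

    Individuals whose genotype is missing (anything other than "AA"/"AB")
    are dropped, which is what marker regression has to do.
    Returns (t, degrees_of_freedom).
    """
    aa = [y for y, g in zip(phenotypes, genotypes) if g == "AA"]
    ab = [y for y, g in zip(phenotypes, genotypes) if g == "AB"]
    n1, n2 = len(aa), len(ab)
    pooled_var = ((n1 - 1) * variance(aa) + (n2 - 1) * variance(ab)) / (n1 + n2 - 2)
    t = (mean(aa) - mean(ab)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Toy data (hypothetical): eight backcross individuals, two markers.
phenotypes = [10.1, 9.8, 10.4, 10.0, 12.0, 12.3, 11.9, 12.1]
marker1 = ["AA", "AB", "AA", "AB", "AA", "AB", "AA", "AB"]  # unrelated to phenotype
marker2 = ["AA", "AA", "AA", "AA", "AB", "AB", "AB", "AB"]  # tracks phenotype

for name, genotypes in (("Marker 1", marker1), ("Marker 2", marker2)):
    t, df = marker_t_statistic(phenotypes, genotypes)
    print(f"{name}: t = {t:.2f} on {df} df")
```

In practice the statistic would be compared against a t distribution with the reported degrees of freedom; with these toy values Marker 1 gives a t statistic near zero and Marker 2 a very large one.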

Page 11: QTL mapping

Marker regression

Advantages:

• Simple test: standard t-test/ANOVA

• Covariates (e.g. gender, environment) are easy to incorporate

• No genetic map necessary, since the test is done separately on each marker

Disadvantages:

• Any individuals with missing marker data must be omitted from analysis

• Does not effectively consider positions between markers

• Does not test for genetic interactions (e.g. epistasis)

• The apparent effect size of the QTL (and hence the power to detect it) is reduced by incomplete linkage to the marker

• Difficult to pinpoint QTL position, since only the marker positions are considered

Page 12: QTL mapping

Interval mapping

• In addition to examining phenotype-genotype associations at markers, look for associations at positions between markers by inferring the genotype at a putative QTL (Q)

• The methods for calculating genotype probabilities between markers typically use hidden Markov models to account for additional factors, such as genotyping errors

• Lander and Botstein 1989
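The idea of inferring the genotype at Q from its flanking markers can be sketched for the simplest case. The example assumes a backcross, Haldane's map function (no interference) and error-free genotypes, so it is a simplification of the hidden Markov model approach mentioned above rather than an implementation of it; all names are illustrative.

```python
import math

def haldane(distance_cm):
    """Recombination fraction under Haldane's map function (assumed here)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))

def prob_q_is_aa(left, right, d_left_cm, d_right_cm):
    """Pr(genotype at Q is AA | genotypes at the two flanking markers).

    left/right are "AA" or "AB"; d_left_cm and d_right_cm are the distances
    from Q to the left and right marker. Backcross only, no genotyping error.
    """
    r1, r2 = haldane(d_left_cm), haldane(d_right_cm)

    def step(a, b, r):
        # Probability of going from genotype a to genotype b across one interval.
        return 1.0 - r if a == b else r

    weight_aa = step(left, "AA", r1) * step("AA", right, r2)
    weight_ab = step(left, "AB", r1) * step("AB", right, r2)
    return weight_aa / (weight_aa + weight_ab)

# Q halfway between two markers that are 20 cM apart.
for left, right in (("AA", "AA"), ("AA", "AB"), ("AB", "AB")):
    p = prob_q_is_aa(left, right, 10, 10)
    print(f"flanking {left}/{right}: Pr(Q = AA) = {p:.3f}")
```

When both flanking markers agree, the genotype at Q is almost certain; when they disagree and Q is at the midpoint, the two genotypes are equally likely. These per-individual probabilities are what interval mapping uses in place of an observed genotype.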

Page 13: QTL mapping

Interval mapping

Broman and Sen 2009

Page 14: QTL mapping

Interval mapping – maximum likelihood

1. Calculate genotype probabilities at intervening locations for every individual

2. At a given position, calculate the conditional probability that an individual is in each of the two QTL genotype groups (AA or AB), given their phenotype and the current estimates $\mu_{AA}^{(s-1)}$ and $\mu_{AB}^{(s-1)}$ (Expectation step)

3. Calculate new estimates $\mu_{AA}^{(s)}$ and $\mu_{AB}^{(s)}$ by combining the genotype probabilities of each individual with their phenotypic values (Maximization step)

4. Repeat steps 2 and 3 until successive estimates converge, i.e. $\mu_{AA}^{(s)} \approx \mu_{AA}^{(s-1)}$ and $\mu_{AB}^{(s)} \approx \mu_{AB}^{(s-1)}$
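The EM iteration above can be sketched at a single test position as follows. This is a minimal illustration, assuming a backcross, normally distributed phenotypes with a common standard deviation, and marker-based genotype probabilities (`prior_aa`) that have already been computed in step 1; the function names, starting values and toy data are assumptions for the example, not part of the slides.

```python
import math

def normal_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def em_at_position(phenotypes, prior_aa, tol=1e-8, max_iter=500):
    """EM estimates of (mu_AA, mu_AB, sigma) at one position.

    prior_aa[i] is Pr(individual i is AA at this position | marker data).
    """
    n = len(phenotypes)

    def m_step(w):
        # Weighted means and pooled SD, with w[i] = current Pr(AA) for individual i.
        sw = sum(w)
        mu_aa = sum(wi * y for wi, y in zip(w, phenotypes)) / sw
        mu_ab = sum((1 - wi) * y for wi, y in zip(w, phenotypes)) / (n - sw)
        ss = sum(wi * (y - mu_aa) ** 2 + (1 - wi) * (y - mu_ab) ** 2
                 for wi, y in zip(w, phenotypes))
        return mu_aa, mu_ab, math.sqrt(ss / n)

    # Starting values: use the marker-based probabilities as weights.
    mu_aa, mu_ab, sigma = m_step(prior_aa)
    for _ in range(max_iter):
        # E-step: update Pr(AA) for each individual using its phenotype.
        w = []
        for y, p in zip(phenotypes, prior_aa):
            a = p * normal_pdf(y, mu_aa, sigma)
            b = (1 - p) * normal_pdf(y, mu_ab, sigma)
            w.append(a / (a + b))
        # M-step: re-estimate the parameters from the updated probabilities.
        new_mu_aa, new_mu_ab, new_sigma = m_step(w)
        converged = abs(new_mu_aa - mu_aa) < tol and abs(new_mu_ab - mu_ab) < tol
        mu_aa, mu_ab, sigma = new_mu_aa, new_mu_ab, new_sigma
        if converged:
            break
    return mu_aa, mu_ab, sigma

# Toy data (hypothetical): two individuals have uninformative flanking markers (0.5).
phenotypes = [10.1, 9.8, 10.4, 10.0, 12.0, 12.3, 11.9, 12.1]
prior_aa = [0.99, 0.99, 0.5, 0.99, 0.01, 0.01, 0.5, 0.01]
mu_aa, mu_ab, sigma = em_at_position(phenotypes, prior_aa)
print(f"mu_AA = {mu_aa:.2f}, mu_AB = {mu_ab:.2f}, sigma = {sigma:.2f}")
```

The two individuals with uninformative markers are still used: their phenotypes pull them toward one genotype group or the other during the E-step, which is how interval mapping avoids discarding individuals with missing genotype data.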

Page 15: QTL mapping

Interval mapping

Advantages:

• Takes account of missing genotype information, so all individuals are included

• Can scan for QTL at locations in between markers

• QTL effects are better estimated

Disadvantages:

• More computation time required

• Still only a single-QTL model: it cannot separate linked QTL or test for interactions among QTL

Page 16: QTL mapping

LOD scores

• Measure of the strength of evidence for the presence of a QTL at each marker location

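LOD(λ) = log10 likelihood ratio comparing the hypothesis of a QTL at position λ versus that of no QTL:

$$\mathrm{LOD}(\lambda) = \log_{10}\left\{ \frac{\Pr(y \mid \text{QTL at } \lambda,\ \mu_{AA,\lambda},\ \mu_{AB,\lambda},\ \sigma_{\lambda})}{\Pr(y \mid \text{no QTL},\ \mu,\ \sigma)} \right\}$$

[Figure; axis label: Phenotype]

A LOD of 3 means that the data are $10^3$ times more likely under the top model (QTL at λ) than under the bottom model (no QTL)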
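As a worked illustration of this likelihood ratio, the sketch below computes the LOD score at a fully genotyped marker in a backcross, assuming normally distributed phenotypes and maximum-likelihood estimates of the means and standard deviation under each model. The toy data and function names are made up for the example.

```python
import math

def log10_normal_likelihood(values, means, sigma):
    """log10 of the joint normal likelihood, one mean per observation."""
    total = 0.0
    for y, mu in zip(values, means):
        total += -0.5 * ((y - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
    return total / math.log(10)

def lod_at_marker(phenotypes, genotypes):
    """LOD = log10 L(QTL at marker) - log10 L(no QTL), complete genotypes assumed."""
    n = len(phenotypes)
    # Bottom model: a single mean for everyone.
    mu = sum(phenotypes) / n
    sigma0 = math.sqrt(sum((y - mu) ** 2 for y in phenotypes) / n)
    ll0 = log10_normal_likelihood(phenotypes, [mu] * n, sigma0)
    # Top model: one mean per genotype group, common SD.
    group_mean = {}
    for label in ("AA", "AB"):
        group = [y for y, g in zip(phenotypes, genotypes) if g == label]
        group_mean[label] = sum(group) / len(group)
    fitted = [group_mean[g] for g in genotypes]
    sigma1 = math.sqrt(sum((y - m) ** 2 for y, m in zip(phenotypes, fitted)) / n)
    ll1 = log10_normal_likelihood(phenotypes, fitted, sigma1)
    return ll1 - ll0

phenotypes = [10.1, 9.8, 10.4, 10.0, 12.0, 12.3, 11.9, 12.1]
unlinked = ["AA", "AB", "AA", "AB", "AA", "AB", "AA", "AB"]
linked = ["AA", "AA", "AA", "AA", "AB", "AB", "AB", "AB"]
print(f"unlinked marker: LOD = {lod_at_marker(phenotypes, unlinked):.2f}")
print(f"linked marker:   LOD = {lod_at_marker(phenotypes, linked):.2f}")
```

With maximum-likelihood estimates plugged in, this quantity reduces to (n/2) log10(RSS0/RSS1), where RSS0 and RSS1 are the residual sums of squares under the no-QTL and QTL models.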

Page 17: QTL mapping

LOD curves

How do you know which peaks are really significant?

Page 18: QTL mapping

LOD threshold

Broman and Sen 2009

• Consider the null hypothesis that there are no QTL genome-wide

[Figure labels: one location; genome-wide]

1. Randomize the phenotype labels relative to the genotypes

2. Conduct interval mapping and determine the maximum LOD score genome-wide

3. Repeat a large number of times (1000-10,000) to generate a null distribution of maximum LOD scores
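The permutation procedure can be sketched as below. To keep the example short it scans with single-marker LOD scores rather than full interval mapping, and it simulates independent (unlinked) markers; both are simplifying assumptions, and the function names and simulated data are hypothetical.

```python
import math
import random

def single_marker_lod(phenotypes, genotypes):
    """(n/2) * log10(RSS0 / RSS1) for a backcross marker with complete genotypes."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss1 = 0.0
    for label in ("AA", "AB"):
        group = [y for y, g in zip(phenotypes, genotypes) if g == label]
        if not group:  # marker is not segregating: no evidence either way
            return 0.0
        m = sum(group) / len(group)
        rss1 += sum((y - m) ** 2 for y in group)
    return (n / 2) * math.log10(rss0 / rss1)

def max_lod(phenotypes, markers):
    return max(single_marker_lod(phenotypes, g) for g in markers)

def permutation_threshold(phenotypes, markers, n_perm=1000, alpha=0.05, seed=0):
    """(1 - alpha) quantile of the genome-wide maximum LOD under permutation."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # break the phenotype-genotype link
        null_max.append(max_lod(shuffled, markers))
    null_max.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return null_max[index]

# Simulated backcross: 100 individuals, 20 markers, one QTL at marker index 5.
rng = random.Random(1)
n_ind, n_markers = 100, 20
markers = [[rng.choice(("AA", "AB")) for _ in range(n_ind)] for _ in range(n_markers)]
phenotypes = [rng.gauss(0, 1) + (1.5 if markers[5][i] == "AB" else 0.0)
              for i in range(n_ind)]

threshold = permutation_threshold(phenotypes, markers, n_perm=200, alpha=0.05, seed=2)
observed = max_lod(phenotypes, markers)
print(f"5% genome-wide threshold: {threshold:.2f}, observed max LOD: {observed:.2f}")
```

Because each permutation records only the maximum LOD across the whole scan, the resulting threshold accounts for the number of positions tested.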

Page 19: QTL mapping

Genetics Colloquium

Leonie Moyle, Indiana University: “Dissecting Speciation via the Genetics of Isolation and Adaptation”

Wednesday, March 14, 3:30 pm

Biotech Center Auditorium, Room 1111

Page 20: QTL mapping

LOD threshold

• 1000 permutations

• 10% False Discovery Rate = LOD 3.19 (means that at this LOD cutoff 10% of peaks could be random chance)

• 5% FDR = LOD 3.52

• The boundaries of a peak are often taken as the points where the LOD curve crosses (max LOD - 1.5) for a backcross, or (max LOD - 1.8) for an intercross
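The peak-boundary rule can be sketched on a grid of LOD scores. This is a minimal version that walks outward from the peak without interpolating between grid points; the positions, LOD values and function name are made up for illustration.

```python
def lod_support_interval(positions, lods, drop=1.5):
    """Return (left, peak, right) positions of the LOD-drop support interval.

    Walks outward from the highest LOD score while neighbouring grid points
    stay at or above (max LOD - drop). No interpolation between grid points.
    """
    peak = max(range(len(lods)), key=lambda i: lods[i])
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[peak], positions[right]

# Hypothetical LOD curve on a 5 cM grid along one chromosome.
positions = [0, 5, 10, 15, 20, 25, 30, 35, 40]
lods = [0.4, 1.1, 2.5, 3.6, 4.2, 3.1, 2.0, 0.9, 0.3]

print(lod_support_interval(positions, lods, drop=1.5))  # backcross rule: (15, 20, 25)
print(lod_support_interval(positions, lods, drop=1.8))  # intercross rule: (10, 20, 25)
```

A more conservative variant would extend each end to the first grid point that falls below the cutoff.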

Page 21: QTL mapping

LOD curves – Marker regression vs. interval mapping

[Figure: LOD curves from interval mapping (IM) and marker regression (MR)]

• With complete marker genotype information, marker regression would give the same results as interval mapping

Page 22: QTL mapping

Other mapping methods

• Methods discussed assume single-QTL models
  • Multiple QTL on a chromosome are not estimated correctly
  • Cannot detect a QTL whose effect is dependent on the genotype at a second QTL (epistasis)

• Two-dimensional two-QTL scans
  • Consider all pairs of markers across the genome

• Multiple-QTL models
  • Jointly estimate all sets of QTL, interactions, and covariates in a single, coherent model
  • Focus on the model selection problem of QTL mapping

Other models can also be applied

Page 23: QTL mapping

From QTL to candidate genes

• F2 mapping results in large loci associated with the phenotype
  • Mapping a QTL that explains 5% of the phenotypic variance in 300 F2 animals will yield a region approximately 40 cM in size (800 genes in mice!)

• 2050 mouse and 700 rat QTL have been mapped (reviewed in Flint et al. 2005)
  • ~20 underlying genes have been identified

Strategies for getting to causal loci:

1. Generate additional recombinants to fine-map QTL
  • Effect sizes of QTL can be overestimated
  • Often one large QTL is composed of many tightly linked QTL of small effect

2. Identify candidate genes from known mutants, tissue-specific expression, etc.

3. Identify candidate genes through comparison to association mapping studies or population genomics studies
  • Are the results repeatable across environments?
  • Association mapping and population genomics approaches only identify alleles with large effect sizes