1
QTL mapping
Ades 2008, NHGRI
Simple Mendelian traits are caused by a single locus, and come in the ‘all-or-none’ flavor.
A Quantitative Trait is one in which many loci contribute. The phenotype can therefore vary in a ‘quantitative’ manner.
Modified from Mike White slides, 2010
2
Goals of QTL mapping
Ades 2008, NHGRI
To identify the loci that contribute to phenotypic variation
1. Cross two parents with extreme phenotypes
2. Score the progeny for the phenotype
3. Genotype the progeny at markers across the genome
4. Associate the observed phenotypic variation with the underlying genetic variation
5. Ultimate goal: identify causal polymorphisms that explain the phenotypic variation
3
Backcross
Broman and Sen 2009
Phenotype: Drug tolerance
80% 20% viability
Usually have at least 100 individuals
4
Intercross
Broman and Sen 2009
Phenotype: Drug tolerance
80% 20% viability
5
Backcross vs. Intercross
• An intercross recovers all three possible genotypes (AA, AB, BB). This makes it possible to detect dominance of either allele and to estimate the degree of dominance.
• A backcross has more power to detect QTL with fewer individuals.
• A backcross may be the only possible scheme when crossing two different species.
6
Genetic map: specific markers spaced across the genome
Markers can be:
• SNPs at particular loci
• Variable-length repeats, e.g. ALU repeats
• ALL polymorphisms (if whole-genome sequences are available)
Ideally, markers should be spaced every 10-20 cM and span the whole genome
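To give a feel for what a 10-20 cM spacing means in terms of recombination between adjacent markers, here is a minimal sketch that converts map distance to recombination fraction. It assumes Haldane's map function (no crossover interference), which the slides do not specify; the function name is hypothetical.

```python
import math


def haldane_recombination_fraction(distance_cm: float) -> float:
    """Convert a map distance in centimorgans to a recombination fraction.

    Assumption: Haldane's map function, i.e. crossovers occur without
    interference. Other map functions (e.g. Kosambi) give slightly
    different values.
    """
    if distance_cm < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


for d in (10, 20):
    print(f"{d} cM -> r = {haldane_recombination_fraction(d):.3f}")
# 10 cM -> r = 0.091
# 20 cM -> r = 0.165
```

Under this assumption, adjacent markers at the suggested spacing recombine in roughly 9-16% of meioses, so a QTL lying between them is never very far from a genotyped position.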
7
Genotype data: Determine allele at all markers in each F2
8
Phenotype data
9
Statistical framework
Broman and Sen 2009
1. Missing Data Problem: Use marker data to infer intervening genotypes
2. Model Selection Problem: How do the QTL across the genome combine with the covariates to generate the phenotype?
10
Marker regression: simple t-test (or ANOVA) at each marker
Marker 1: no QTL
Marker 2: significant QTL (population means are different)
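The per-marker test can be sketched as follows for a backcross with two genotype groups. This is an illustration only: the function name and the toy phenotype values are made up, and a pooled-variance two-sample t statistic is assumed.

```python
import math
from statistics import mean, variance


def marker_t_statistic(phenotypes, genotypes, group_a="AA", group_b="AB"):
    """Pooled-variance t statistic comparing phenotype means at one marker.

    Individuals whose genotype is missing (None) or not in either group
    are dropped, which is the usual limitation of marker regression.
    Returns (t, degrees_of_freedom).
    """
    a = [y for y, g in zip(phenotypes, genotypes) if g == group_a]
    b = [y for y, g in zip(phenotypes, genotypes) if g == group_b]
    if len(a) < 2 or len(b) < 2:
        raise ValueError("need at least two individuals per genotype group")
    df = len(a) + len(b) - 2
    pooled_var = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
    if pooled_var == 0:
        raise ValueError("no phenotypic variation within genotype groups")
    std_err = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / std_err, df


# Toy data (hypothetical): percent viability for eight backcross progeny.
viability = [80, 75, 78, 82, 25, 30, 22, 28]
marker = ["AA", "AA", "AA", "AA", "AB", "AB", "AB", "AB"]
t_stat, dof = marker_t_statistic(viability, marker)
print(f"t = {t_stat:.2f} on {dof} degrees of freedom")
```

A large absolute t value at a marker corresponds to the "Marker 2" case: the genotype groups have different phenotype means.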
11
Marker regression
Advantages:
• Simple test: standard t-test/ANOVA
• Covariates (e.g. gender, environment) are easy to incorporate
• No genetic map necessary, since the test is done separately on each marker
Disadvantages:
• Any individuals with missing marker data must be omitted from analysis
• Does not effectively consider positions between markers
• Does not test for genetic interactions (e.g. epistasis)
• The apparent effect size of the QTL (and hence the power to detect it) is reduced by incomplete linkage to the marker
• Difficult to pinpoint QTL position, since only the marker positions are considered
12
Interval mapping
• In addition to examining phenotype-genotype associations at markers, look for associations at positions between markers by inferring the genotype at a putative QTL (Q) from the marker data
• The methods for calculating genotype probabilities between markers typically use hidden Markov models, which can account for additional factors such as genotyping errors
• Lander and Botstein 1989
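For the simplest case, the genotype probabilities at a position between two fully genotyped flanking markers can be written down directly. The sketch below assumes a backcross (genotypes AA or AB), no crossover interference, and no genotyping errors; it is not the hidden Markov model itself, and the function name is hypothetical.

```python
def qtl_genotype_probs(left, right, r_left, r_right):
    """Pr(genotype at Q | flanking marker genotypes) in a backcross.

    left, right : genotypes ("AA" or "AB") at the flanking markers
    r_left      : recombination fraction between the left marker and Q
    r_right     : recombination fraction between Q and the right marker
    Assumes no interference, so the two intervals recombine independently.
    """
    for r in (r_left, r_right):
        if not 0.0 <= r <= 0.5:
            raise ValueError("recombination fractions must lie in [0, 0.5]")
    weights = {}
    for q in ("AA", "AB"):
        # Probability of (no) recombination in each flanking interval.
        p_left = (1 - r_left) if q == left else r_left
        p_right = (1 - r_right) if q == right else r_right
        weights[q] = p_left * p_right
    total = sum(weights.values())
    if total == 0:
        raise ValueError("marker genotypes are impossible under these r values")
    return {q: w / total for q, w in weights.items()}


# Non-recombinant flanking markers: Q almost certainly matches them.
print(qtl_genotype_probs("AA", "AA", 0.1, 0.1))
# Recombinant flanking markers with Q midway: both genotypes equally likely.
print(qtl_genotype_probs("AA", "AB", 0.1, 0.1))
```

Handling missing marker genotypes, genotyping errors, and more distant markers is what motivates the hidden Markov model formulation mentioned above.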
13
Interval mapping
Broman and Sen 2009
14
Interval mapping – maximum likelihood
1. Calculate genotype probabilities at intervening locations for every individual
2. At a given position (a marker or an intervening location), calculate the conditional probability that an individual is in each of the two QTL genotype groups (AA or AB), given their phenotype and the current estimates $\mu_{AA}^{(s-1)}$ and $\mu_{AB}^{(s-1)}$ (Expectation Step)
3. Calculate new estimates $\mu_{AA}^{(s)}$ and $\mu_{AB}^{(s)}$ by combining the genotype probabilities of each individual with their phenotypic values (Maximization Step)
4. Repeat steps 2 and 3 until successive estimates converge, i.e. $\mu_{AA}^{(s)} \approx \mu_{AA}^{(s-1)}$ and $\mu_{AB}^{(s)} \approx \mu_{AB}^{(s-1)}$
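The iteration above can be sketched for a single position in a backcross. This is a minimal illustration, not a full implementation: it assumes normally distributed phenotypes with a common standard deviation in both genotype groups, it also updates that standard deviation in the M-step (the steps above mention only the means), and the function names and toy data are hypothetical.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and s.d. sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def weighted_fit(y, w):
    """M-step: weighted means for AA (weights w) and AB (1 - w), shared s.d."""
    mu_aa = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    mu_ab = sum((1 - wi) * yi for wi, yi in zip(w, y)) / sum(1 - wi for wi in w)
    ss = sum(wi * (yi - mu_aa) ** 2 + (1 - wi) * (yi - mu_ab) ** 2
             for wi, yi in zip(w, y))
    sigma = max(math.sqrt(ss / len(y)), 1e-12)  # guard against sigma = 0
    return mu_aa, mu_ab, sigma


def em_interval_mapping(y, p_aa, tol=1e-8, max_iter=1000):
    """EM estimates of (mu_AA, mu_AB, sigma) at one position.

    y    : phenotypes
    p_aa : Pr(genotype AA at this position | marker data), per individual
           (step 1 above); must not be all 0 or all 1.
    """
    mu_aa, mu_ab, sigma = weighted_fit(y, p_aa)  # starting values
    for _ in range(max_iter):
        # E-step: Pr(AA | phenotype, markers) under the current estimates.
        w = []
        for yi, pi in zip(y, p_aa):
            a = pi * normal_pdf(yi, mu_aa, sigma)
            b = (1 - pi) * normal_pdf(yi, mu_ab, sigma)
            w.append(a / (a + b) if a + b > 0 else pi)
        # M-step: re-estimate the parameters with the updated weights.
        new_aa, new_ab, sigma = weighted_fit(y, w)
        converged = abs(new_aa - mu_aa) < tol and abs(new_ab - mu_ab) < tol
        mu_aa, mu_ab = new_aa, new_ab
        if converged:
            break
    return mu_aa, mu_ab, sigma


# Toy data (hypothetical): four likely-AA and four likely-AB individuals.
phenotypes = [80, 78, 82, 79, 20, 22, 19, 21]
prob_aa = [0.95] * 4 + [0.05] * 4
print(em_interval_mapping(phenotypes, prob_aa))
```

The weights combine the marker-based genotype probabilities with how well each phenotype fits each group, which is why individuals with missing marker data can still contribute.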
15
Interval mapping
Advantages:
• Takes account of missing genotype information, so all individuals are included
• Can scan for QTL at locations in between markers
• QTL effects are better estimated
Disadvantages:
• More computation time required
• Still only a single-QTL model: cannot separate linked QTL or test for interactions among QTL
16
LOD scores
• Measure of the strength of evidence for the presence of a QTL at each marker location
LOD(λ) = log10 likelihood ratio comparing the hypothesis of a QTL at position λ versus that of no QTL
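$$\mathrm{LOD}(\lambda) = \log_{10}\left\{ \frac{\Pr(y \mid \text{QTL at } \lambda,\ \mu_{AA\lambda},\ \mu_{AB\lambda},\ \sigma_{\lambda})}{\Pr(y \mid \text{no QTL},\ \mu,\ \sigma)} \right\}$$
LOD 3 means that the TOP model is $10^3$ times more likely than the BOTTOM model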
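At a fully genotyped marker, and assuming normally distributed residuals with maximum-likelihood variance estimates, the log10 likelihood ratio reduces to (n/2) log10(RSS0/RSS1), where RSS0 is the residual sum of squares around the overall mean and RSS1 the residual sum of squares around the genotype-group means. A minimal sketch under those assumptions (function name and toy data are hypothetical):

```python
import math


def lod_at_marker(phenotypes, genotypes):
    """LOD score at one marker where every individual is genotyped.

    Assumes a normal model with a common variance; then
    LOD = (n / 2) * log10(RSS0 / RSS1).
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)  # no-QTL model

    groups = {}
    for y, g in zip(phenotypes, genotypes):
        groups.setdefault(g, []).append(y)
    rss1 = 0.0  # QTL model: one mean per genotype group
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        rss1 += sum((y - group_mean) ** 2 for y in ys)

    if rss0 == 0:
        return 0.0  # no phenotypic variation at all
    if rss1 == 0:
        return math.inf  # genotype explains the phenotype perfectly
    return (n / 2) * math.log10(rss0 / rss1)


# Toy data (hypothetical): percent viability for eight backcross progeny.
viability = [80, 75, 78, 82, 25, 30, 22, 28]
marker = ["AA"] * 4 + ["AB"] * 4
print(f"LOD = {lod_at_marker(viability, marker):.2f}")
```

Between markers the genotype groups are not observed, so the likelihood in the numerator becomes a mixture over the possible genotypes, weighted by the genotype probabilities.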
17
LOD curves
How do you know which peaks are really significant?
18
LOD threshold
Broman and Sen 2009
• Consider the null hypothesis that there are no QTLs genome-wide
[Figure panels: one location; genome-wide]
1. Randomize the phenotype labels relative to the genotypes
2. Conduct interval mapping and determine what the maximum LOD score is genome-wide
3. Repeat a large number of times (1000-10,000) to generate a null distribution of maximum LOD scores
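The permutation procedure can be sketched as below. To keep the example self-contained, a per-marker LOD score stands in for the interval-mapping scan (an assumption, not what the slides do), the simulated markers are unlinked, and all names and simulated data are hypothetical.

```python
import math
import random


def marker_lod(phenotypes, genotypes):
    """LOD at one fully genotyped marker: (n / 2) * log10(RSS0 / RSS1)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)
    groups = {}
    for y, g in zip(phenotypes, genotypes):
        groups.setdefault(g, []).append(y)
    rss1 = 0.0
    for ys in groups.values():
        m = sum(ys) / len(ys)
        rss1 += sum((y - m) ** 2 for y in ys)
    if rss0 == 0:
        return 0.0
    if rss1 == 0:
        return math.inf
    return (n / 2) * math.log10(rss0 / rss1)


def max_lod(phenotypes, marker_genotypes):
    """Largest LOD across all markers (step 2)."""
    return max(marker_lod(phenotypes, g) for g in marker_genotypes)


def permutation_threshold(phenotypes, marker_genotypes, n_perm=1000,
                          alpha=0.05, seed=0):
    """(1 - alpha) quantile of the null distribution of the maximum LOD."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # step 1: break the phenotype-genotype link
        null_max.append(max_lod(shuffled, marker_genotypes))
    null_max.sort()  # step 3: null distribution of maximum LOD scores
    index = max(0, min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1))
    return null_max[index]


# Simulated null data: 100 backcross individuals, 20 unlinked markers.
rng = random.Random(1)
markers = [[rng.choice(("AA", "AB")) for _ in range(100)] for _ in range(20)]
phen = [rng.gauss(0, 1) for _ in range(100)]
print(f"5% threshold: {permutation_threshold(phen, markers, n_perm=200):.2f}")
```

Shuffling whole phenotype values while leaving each individual's genotypes intact preserves the correlation between linked markers, which is why the maximum over the genome is the statistic recorded in each permutation.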
Leonie Moyle, Indiana University: “Dissecting Speciation via the Genetics of Isolation and Adaptation”
Genetics Colloquium
Wednesday, March 14, 3:30 pm
Biotech Center Auditorium, Room 1111
20
LOD threshold
• 1000 permutations
10% False Discovery Rate = LOD 3.19 (means that at this LOD cutoff 10% of peaks could be random chance)
5% FDR = LOD 3.52
• Boundary of the peak is often taken as the points where the LOD curve crosses (Max LOD - 1.5), or (Max LOD - 1.8) for an intercross
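The peak boundary rule can be sketched as a small helper that walks outward from the maximum LOD until the curve drops below the cutoff. The function name and the toy LOD values are hypothetical, and the sketch reports the outermost scanned positions that are still at or above the cutoff, which is only one of several reasonable conventions.

```python
def lod_support_interval(positions, lods, drop=1.5):
    """Return (left, peak, right) positions of the LOD support interval.

    positions : scanned map positions (e.g. in cM), in increasing order
    lods      : LOD score at each position
    drop      : 1.5 as in the rule above; 1.8 is suggested for an intercross
    """
    if len(positions) != len(lods) or not lods:
        raise ValueError("positions and lods must be non-empty and equal length")
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[peak], positions[right]


# Toy LOD curve (hypothetical) scanned every 10 cM.
cm = [0, 10, 20, 30, 40, 50]
lod = [0.5, 2.3, 4.0, 3.0, 1.0, 0.2]
print(lod_support_interval(cm, lod, drop=1.5))  # (20, 20, 30)
print(lod_support_interval(cm, lod, drop=1.8))  # (10, 20, 30)
```

A denser scan (for example every 1 cM) gives a finer interval, since the boundaries here can only fall on scanned positions.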
21
LOD curves – Marker regression vs. interval mapping
[Figure legend: IM = interval mapping, MR = marker regression]
• With complete marker genotype information, marker regression would give the same results as interval mapping
22
Other mapping methods
• Methods discussed assume single-QTL models
  • Multiple QTLs on a chromosome are not estimated correctly
  • Cannot detect a QTL whose effect is dependent on the genotype at a second QTL (epistasis)
• Two-dimensional two-QTL scans
  • Consider all pairs of markers across the genome
• Multiple QTL Models
  • Jointly estimate all sets of QTL, interactions, and covariates in a single, coherent model
  • Focuses on the model selection problem of QTL mapping
Can also apply other Models
23
From QTL to candidate genes
• F2 mapping results in large loci associated with the phenotype
  • Mapping a QTL that explains 5% of the phenotypic variance in 300 F2 animals will yield a region approximately 40 cM in size (800 genes in mice!)
• 2050 mouse and 700 rat QTL have been mapped (reviewed in Flint et al. 2005)
  • ~20 underlying genes have been identified
Strategies for getting to causal loci:
1. Generate additional recombinants to fine map QTL
  • Effect sizes of QTL can be overestimated
  • Often one large QTL is composed of many tightly linked QTL of small effect
2. Identify candidate genes from known mutants, tissue-specific expression, etc.
3. Identify candidate genes through comparison to association mapping studies or population genomics studies
  • Are the results repeatable across environments?
  • Association mapping and population genomics approaches only identify alleles with large effect sizes