Upload
winfred-gordon
View
217
Download
2
Tags:
Embed Size (px)
Citation preview
Lecture 16: Individual Identity and Paternity Analysis
March 7, 2014
Last Time
Interpretation of F-statistics
More on the Structure program
Principal Components Analysis
Population assignment
Individual identity (in lab)
Today
Population assignment examples
More on forensic evidence and individual identity
Introduction to paternity analysis
Population Assignment: Likelihood "Assignment Tests" based on allele
frequencies in source populations and genetic composition of individuals
for m loci
2lilk pP
for homozygote AiAi in population l at locus k
ljlilk ppP 2for heterozygote AiAj in population l at locus k
,)|(
)|()|,(
2
121 HGP
HGPLRGHHL
Individual likelihood often summarized as:
-log10(P(G|Hn))
Low numbers mean higher probability of Hn
Likelihoods can be plotted against each other
Carmichael et al. 2001 Mol Ecol 10:2787
Population Assignment Example: Wolf Populations in Northwest Territories Wolf populations sampled on island and
mainland populations in Canadian Northwest Territories
Immigrants detected on mainland (black circles) from Banks Island (white circles)
-log likelihood from Banks Island
-log
likel
ihoo
d fr
om M
ainl
and
Population Assignment Example:Fish Stories
Fishing competition on Lake Saimaa in Southeast Finland
Contestant allegedly caught a 5.5 kg salmon, much larger than usual for the lake
Officials compared fish from the lake to fish from local markets (originating from Norway and Baltic sea)
7 microsatellites
Based on likelihood analysis, fish was purchased rather than caught in lake
Lake Saimaa Market
-
Individual Identity: Likelihood Assume you find skin cells and blood under fingernails of
a murder victim
A hitman for the Sicilian mafia is seen exiting the apartment
You gather DNA evidence from the skin cells and from the suspect
They have identical genotypes
What is the likelihood that the evidence came from the suspect?
,)|(
)|()|,(
2
121 HGP
HGPLRGHHL
What is H1 and what is H2?
Match Probability
Probability of observing a genotype at locus k by chance in population is a function of allele frequencies:
for m loci
m
kkPP
1
2ik pP
Homozygote
jik ppP 2Heterozygote
Assumes unlinked (independent loci) and Hardy-Weinberg equilibrium
Probability of Identity
Probability 2 randomly selected individuals have same profile at locus k:
i ji
jii
ikID pppP 24 )2(
Homozygotes Heterozygotes
for m loci
m
kkIDPP
1
Exclusion Probability (E): E=1-P
What if the slimy mob defense attorney argues that the most likely perpetrator is the mob hitman’s brother, who has conveniently “disappeared”?
Does the general match probability apply to near relatives?
Probability of identity for full sibs
Heterozygotes
)21(4
1 2iiksibhoID ppP
2 alleles IBD1 allele IBD
0 alleles IBD
)21(4
1jijiksibheID ppppP
General Probability of Identity for Full Sibs:
Homozygotes 2 alleles IBD
0 alleles IBD
i iii
iiksibID pppP
2
224
2
11
4
1
Probability of identity for full sibs
i iii
iiksibID pppP
2
224
2
11
4
1
For a locus with 5 alleles, each at a frequency of 0.2:PID = 0.072PIDsib = 0.368
i ji
jii
ikID pppP 24 )2(
Probability of identity unrelated individuals
What is minimum probability of identity for full sibs?
i iii
iiksibID pppP
2
224
2
11
4
1
NRC (1996) recommendations
Use population that provides highest probability of observing the genotype (unless other information is known)
Correct homozygous genotypes for substructure within selected population (e.g., Native Americans, hispanics, African Americans, caucasians, Asian Americans)
No correction for heterozygotes
jiSTiii ppFpppP 2])1([' 2
Homozygotes Heterozygotes
Why is it ‘conservative’ (from the standpoint of proving a match) to ignore substructure for
heterozygotes?
T
STST H
HHF
Example: World Trade Center Victims Match victims using
DNA collected from toothbrushes, hair brushes, or relatives
Exact matches not guaranteed
Why not?
Use likelihood to match samples to victims
A series of little NBA prospects are born to ardent basketball fans in every city with an NBA team. The mothers regularly allege that the fathers are NBA stars from visiting teams. The “players” deny this
allegation.
Can this be resolved using molecular markers and population genetics methodologies?
Paternity Exclusion Analysis
Determine multilocus genotypes of all mothers, offspring, and potential fathers
Determine paternal gamete by “subtracting” maternal genotype from that of each offspring.
Infer paternity by comparing the multilocus genotype of all gametes to those of all potential males in the population
Assign paternity if all potential males, except one, can be excluded on the basis of genetic incompatibility with the observed pollen gamete genotype
Unsampled males must be considered
Locus 1
Locus 2
Locus 3
YES NO YES YES NO
NO
NO YES YES YES
YES YESNONO
NO
First step is to determine paternal contribution based on seedling alleles that do not match mother
Notice for locus 3 both alleles match mother, so there are two potential paternal contributions
Male 3 is the putative father because he is the only one that matches paternal contributions at all loci
Paternity Exclusion
Parentage Analysis: Paternity Exclusion
Determine multilocus genotypes of all mothers, offspring, and potential fathers
Determine paternal gamete by “subtracting” maternal genotype from that of each offspring.
Infer paternity by comparing the multilocus genotype of all gametes to those of all potential males in the population
Assign paternity if all potential males, except one, can be excluded on the basis of genetic incompatibility with the observed pollen gamete genotype
Unsampled males must be considered
Paternity Exclusion Analysis
Only one male cannot be excluded
More than one male cannot be excluded
All males are excluded
Possible outcomes:
Paternity is assigned
Analyze more loci
Conclude there is migration from external sources
Consequences:Male
Female
Male
MaleFemale
Male
Male
MaleFemale
Male
?
Probabilities of Paternity Exclusion, Single Locus, 2 alleles, codominant The paternity exclusion probability is the sum of the probability of all exclusionary combinations
)1(Pr: 2121 ppppSum k
m
kk
1
)Pr1(1Pr
See Chakraborty et al. 1988 Genetics 118:527 for a more general calculation of exclusion power
Probability of a falsely accused male of not matching for at least one of m loci:
Hedrick 2005
Alleles versus Loci For a given number of alleles: one locus with many alleles provides more exclusion
power than many loci with few alleles
10 loci, 2 alleles, Pr = 0.875
1 locus, 20 alleles, Pr=0.898
Uniform allele frequencies provide more power
Characteristics of an ideal genetic marker for paternity analysis
Highly polymorphic, (i.e. with many alleles)
Codominant
Reliable
Low cost
Easy to use for genotyping large numbers of individuals
Mendelian or paternal inheritance
0.50
0.55
0.60
0.65
0.70
0.75
0.80
0.85
0.90
0.95
1.00
2345678910
23
45
67
Shortcomings of Paternity Exclusion
Requiring exact matches for potential fathers is excessively stringent
Mutation
Genotyping error
Multiple males may match, but probability of match may differ substantially
No built-in way to deal with cryptic gene flow: case when male matches, but unsampled male may also match
Type I error: wrong father assigned paternity)
Advantages and Disadvantages of Likelihood
Advantages:
Flexibility: can be extended in many ways
- Compensate for errors in genotyping- Incorporate factors influencing mating success: fecundity,
distance, and directionCompensates for lack of exclusion power
- Fractional paternity Disadvantages
Often results in ambiguous paternities
Difficult to determine proper cutoff for LOD score
Summary
Direct assessment of movement is best way to measure gene flow
Parentage analysis is powerful approach to track movements of mates retrospectively
Paternity exclusion is straightforward to apply but may lack power and is confounded by genotyping error
Likelihood-based approaches can be more flexible, but also provide ambiguous answers when power is lacking