Upload
aaron-walton
View
224
Download
0
Tags:
Embed Size (px)
Citation preview
DNA Identification:Mixture Weight & Inference
Cybergenetics © 2003-2010Cybergenetics © 2003-2010
Mark W Perlin, PhD, MD, PhDMark W Perlin, PhD, MD, PhDCybergenetics, Pittsburgh, PACybergenetics, Pittsburgh, PA
TrueAlleleTrueAllele®® Lectures LecturesFall, 2010Fall, 2010
Mixture Weight: Uncertain Quantity
Infer mixture weight fromSTR experiments:
• quantitative peak data• contributor genotypes
Pr(W=w | data, G1=g1, G2=g2, …)hierarchical Bayesian model
Perlin MW, Legler MM, Spencer CE, Smith JL, Allan WP, Belrose JL, Duceman BW. Validating TrueAllele® DNA mixture interpretation. Journal of Forensic Sciences. 2011;56(November):in press.
Mixture Weight Model
template weight
locus weights
Wk
Wk,1 Wk,2 Wk,N…
kth contributor
W0prior probability
experiment data dk,1 dk,2 dk,N…
Experiment Estimate
wk
sum of peak heightsfrom kth contributor
sum of peak heightsfrom all contributors
=
D16S539
11 12 13
Three Alleles
Allele111213
Quantity500
6,750250
11 12 13
Experiment Estimate
Allele111213
Quantity500
6,750250
500 + 250
500 + 6,750 + 250
750
7,500= 10%=
D7S820
Four Alleles
Allele8
101213
Quantity4,000
2503,000
250
10 128 13
Experiment Estimate
10 128 13
250 + 250
4,000 + 250 + 3,000 + 250
500
7,500= 6.7%=
Allele8
101213
Quantity4,000
2503,000
250
Overlapping Alleles
Template Average
mean
variance
Template Mixture Weight Probability Distribution
mean = 6.7%
standarddeviation
= 0.9%
Central Limit Theorem
• more data experiments for a template provide greater mixture weight precision
• double the precision by doing four times the number of experiments
• combine evidence from multiple experiments to obtain a more informative result
Probability Solution
w | d, g1, g2, …
g1 | d, g2, w, …
g2 | d, g1, w, …
interacting random variables
find probability distributions by iterative sampling
zi | d, g1, g2, w, …
Gelfand, A. and Smith, A. (1990). Sampling based approaches to calculating marginal densities. J. American Statist. Assoc., 85:398-409.
Markov Chain Monte Carlo
gk,l :fi2, i=j
2fifj, i≠j⎧⎨⎪⎩⎪
w: Dir1( )ml : N+ 5000, 50002( )σ−2:Gam10, 20( )τ−2:Gam10, 500( )ψ−2:Gam12, 1200( )
Prior Probability
genotype
mixture weight
varianceparameters
DNA quantity
Σl =σ 2 ⋅Vl +τ 2 wl : N0,1[ ]K−1 w, ψ2⋅I( )
μl =ml ⋅ wk,l ⋅gk,lk=1
K
∑ dl : N+ μl,Σl( )
Joint Likelihood Function
data
pattern
variation
Prσ2=s2d1,d2,...,dj,...{ }∝Prσ2=s2{ }⋅ Prdjσ2=s2,...{ }j=1
J∏Prτ2=t2d1,d2,...,dj,...{ }∝Prτ2=t2{ }⋅ Prdjτ2=t2,...{ }
j=1
J∏
PrW=w|d1,d2K,dJ,K{ }∝PrW=w{ }⋅ Prdj|W=w,K{ }
j=1
J∏
PrQ=x|dl,1,dl,2,...,dl,z,...{ }∝PrQ=x{ }⋅ Prdl,i|Q=x,...{ }i=1
I∏Posterior Probability
genotype
mixture weight
data variation
Generally Accepted Method
genotype
mixture weight
data variation
James Curran. A MCMC method for resolving two person mixtures. Science & Justice.
2008;48(4):168-77.
Hierarchical Bayesian Model with MCMC Solution
• standard approach in modern science• describes uncertainty using probability• the "new calculus"• replaces hard calculus with easy computing• can solve virtually any problem• well-suited to interpreting DNA evidence