
Current Research in Forensic Toolmark Analysis Petraco Group

Page 1: Current Research in Forensic Toolmark Analysis Petraco Group

Current Research in Forensic Toolmark Analysis

Petraco Group

Page 2: Current Research in Forensic Toolmark Analysis Petraco Group

Outline

• Introduction

• Instruments for 3D toolmark analysis

• 3D toolmark data/Features

• The statistics:

• Identification Error Rates

• “Match” confidence

• “Match” probability from Empirical Bayes

• “Match” probability from CMS data and Bayesian Networks

Page 3: Current Research in Forensic Toolmark Analysis Petraco Group

• All forms of physical evidence can be represented as numerical patterns
  o Toolmark surfaces
  o Dust and soil categories and spectra
  o Hair/Fiber categories and spectra
  o Triangulated fingerprint minutiae

• Machine learning trains a computer to recognize patterns
  o Can give “…the quantitative difference between an identification and non-identification” (Moran)
  o Can yield identification error rate estimates
  o May even give confidence measures for I.D.s

Quantitative Criminalistics

Page 4: Current Research in Forensic Toolmark Analysis Petraco Group

Data Acquisition For Toolmarks

Comparison Microscope

Confocal Microscope

Focus Variation Microscope

Scanning Electron Microscope

Page 5: Current Research in Forensic Toolmark Analysis Petraco Group

2D profiles

3D surfaces (interactive)

Screwdriver Striation Patterns in Lead

Page 6: Current Research in Forensic Toolmark Analysis Petraco Group

9mm Glock fired cartridge cases

Bottom of firing pin impression

Page 7: Current Research in Forensic Toolmark Analysis Petraco Group

Bullet base, 9mm Ruger Barrel

Bullets

Bullet base, 9mm Glock Barrel

Page 8: Current Research in Forensic Toolmark Analysis Petraco Group

Close up: Land Engraved Areas

Page 9: Current Research in Forensic Toolmark Analysis Petraco Group

• Statistical pattern comparison!

• Modern algorithms are called machine learning

• Idea is to measure features that characterize physical evidence

• Train algorithm to recognize “major” differences between groups of features while taking into account natural variation and measurement error.

What can we do with all this microscope data?

Page 10: Current Research in Forensic Toolmark Analysis Petraco Group

• We need a toolmark feature set that is:
  o Large in number
  o (possibly) translationally invariant
  o (possibly) rotationally invariant
  o Mostly statistically independent
  o DISCRIMINATORY!

Good Features are the Key!

Page 11: Current Research in Forensic Toolmark Analysis Petraco Group

Aperture primer shear on a 9mm cartridge case fired from a Glock 19

Toolmark Features

Mean total profile:

Mean “waviness” profile:

Mean “roughness” profile:
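A minimal sketch of how a profile could be split into the “waviness” and “roughness” components named on this slide. The slides do not say which filter is used; a Gaussian-weighted moving average is assumed here, and `split_profile`, `sigma`, and the sample profile are hypothetical.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized Gaussian weights on the integer offsets -radius..radius."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def split_profile(profile, sigma=5.0):
    """Return (waviness, roughness) for a list of surface heights.

    Waviness is the low-pass (smoothed) profile; roughness is the residual,
    so waviness[i] + roughness[i] equals profile[i] up to rounding.
    """
    radius = int(3 * sigma)
    kernel = gaussian_kernel(sigma, radius)
    n = len(profile)
    waviness = []
    for i in range(n):
        acc = 0.0
        norm = 0.0
        for k, w in zip(range(-radius, radius + 1), kernel):
            j = i + k
            if 0 <= j < n:  # truncate the window at the profile ends
                acc += w * profile[j]
                norm += w
        waviness.append(acc / norm)
    roughness = [p - w for p, w in zip(profile, waviness)]
    return waviness, roughness

# Hypothetical profile: a slow wave plus fine striations (not measured data).
profile = [math.sin(2 * math.pi * i / 200) + 0.1 * math.sin(2 * math.pi * i / 7)
           for i in range(400)]
waviness, roughness = split_profile(profile)
```

A larger `sigma` moves more of the signal into the roughness component; the cutoff would in practice be chosen from the instrument and the striation spacing.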

Page 12: Current Research in Forensic Toolmark Analysis Petraco Group

A toolmark is a “word” in a chapter.

A line is a “letter” in a “word”

Take a representative for a group of toolmarks made by the same tool as a “chapter” in a “dictionary”

Page 13: Current Research in Forensic Toolmark Analysis Petraco Group

“words” in the Biasotti-Murdock “dictionary”

Form the CMS-space

Database/queries

• Find best matching “word” in query to each “dictionary word”

• Similarity metric is arbitrary

• We use a mix

• Process produces a registration-free, translation- and rotation-invariant multivariate feature vector

Page 14: Current Research in Forensic Toolmark Analysis Petraco Group

Toolmarks (screwdriver striation profiles) form database

Biasotti-Murdock Dictionary

Consecutive Matching Striae (CMS)-Space

Calculation was SLOW!

~ One week on a fairly beefy desktop computer

Page 15: Current Research in Forensic Toolmark Analysis Petraco Group

Toolmarks (screwdriver striation profiles) form database

Biasotti-Murdock Dictionary

FAST-Consecutive Matching Striae (CMS)-Space

Found an approximate algorithm and parallelized it

~ 3 minutes on the same fairly beefy desktop computer

Page 16: Current Research in Forensic Toolmark Analysis Petraco Group

• Visually explore: 3D PCA of 760 real and simulated mean profiles of primer shears from 24 Glocks:

• ~45% variance retained

Page 17: Current Research in Forensic Toolmark Analysis Petraco Group

Support Vector Machines

• Support Vector Machines (SVM) determine efficient association rules

• In the absence of specific knowledge of probability densities

SVM decision boundary

Page 18: Current Research in Forensic Toolmark Analysis Petraco Group

Refined bootstrapped I.D. error rate for primer shear striation patterns = 0.35%

95% C.I. = [0%, 0.83%]

(sample size = 720 real and simulated profiles)

18D PCA-SVM Primer Shear I.D. Model, 2000 Bootstrap Resamples
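A minimal sketch of a percentile bootstrap confidence interval for an I.D. error rate. It is not the “refined” bootstrap reported on this slide, and it does not refit a PCA-SVM model on each resample; it only resamples per-item outcomes. The function name and the outcome data are hypothetical.

```python
import random

def bootstrap_error_rate(errors, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the mean of a list of 0/1 error flags."""
    rng = random.Random(seed)
    n = len(errors)
    rates = []
    for _ in range(n_resamples):
        # Draw n outcomes with replacement and record the resampled error rate.
        resample = [errors[rng.randrange(n)] for _ in range(n)]
        rates.append(sum(resample) / n)
    rates.sort()
    lo = rates[int((alpha / 2) * n_resamples)]
    hi = rates[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(errors) / n, (lo, hi)

# Hypothetical outcomes: 1 = misidentified, 0 = correct (not the group's data).
errors = [1] * 3 + [0] * 717
rate, (lo, hi) = bootstrap_error_rate(errors)
```

With so few errors the lower end of the interval sits at or near 0%, which is the same qualitative shape as the interval quoted on the slide.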

Page 19: Current Research in Forensic Toolmark Analysis Petraco Group

How good of a “match” is it? Conformal Prediction (Vovk)

• Data should be IID but that’s it

[Plot: cumulative # of errors vs. sequence of unknown observation vectors]

80% confidence, 20% error: slope = 0.2

95% confidence, 5% error: slope = 0.05

99% confidence, 1% error: slope = 0.01

• Can give a judge or jury an easy to understand measure of reliability of classification result

• This is an orthodox “frequentist” approach

• Roots in Algorithmic Information Theory

• Confidence on a scale of 0%-100%

• Testable claim: Long run I.D. error-rate should be the chosen significance level
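A minimal sketch of the conformal prediction idea: each candidate tool gets a p-value, and every tool whose p-value exceeds the chosen significance level stays in the prediction region. The nonconformity measure here (distance to a class centroid) is an assumption chosen for brevity; the slides use a PCA-SVM model. All names and data are hypothetical.

```python
import math
import random

def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def conformal_region(x, centroids, calibration_scores, epsilon):
    """Return {label: p-value} for every label not rejected at level epsilon."""
    region = {}
    for label, center in centroids.items():
        score = math.dist(x, center)  # nonconformity of x if it had this label
        scores = calibration_scores[label]
        # Fraction of calibration examples at least as "strange" as x.
        p_value = (sum(s >= score for s in scores) + 1) / (len(scores) + 1)
        if p_value > epsilon:
            region[label] = p_value
    return region

# Hypothetical 2D feature vectors for two tools (not the group's data).
rng = random.Random(1)

def sample(center, n):
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]

tool_centers = {"A": (0.0, 0.0), "B": (10.0, 10.0)}
centroids, calibration_scores = {}, {}
for label, center in tool_centers.items():
    train, calib = sample(center, 50), sample(center, 50)
    centroids[label] = centroid(train)
    calibration_scores[label] = [math.dist(p, centroids[label]) for p in calib]

region = conformal_region([0.2, -0.1], centroids, calibration_scores, epsilon=0.05)
```

Setting `epsilon=0.05` corresponds to the 95% confidence line on the slide: under the IID assumption, regions built this way should miss the true tool about 5% of the time in the long run.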

Page 20: Current Research in Forensic Toolmark Analysis Petraco Group

Conformal Prediction

Theoretical (Long Run) Error Rate: 5%

Empirical Error Rate: 5.3%

14D PCA-SVM Decision Model for screwdriver striation patterns

• For 95%-CPT (PCA-SVM), confidence intervals will not contain the correct I.D. 5% of the time in the long run

• Straightforward validation/explanation picture for court

Page 21: Current Research in Forensic Toolmark Analysis Petraco Group

• An I.D. is output for each questioned toolmark

• This is a computer “match”

• What’s the probability the tool is truly the source of the toolmark?

• Similar problem in genomics for detecting disease from microarray data

• They use data and Bayes’ theorem to get an estimate

How good of a “match” is it? Empirical Bayes’ (Efron)

Page 22: Current Research in Forensic Toolmark Analysis Petraco Group

Bayesian Statistics

• The basic Bayesian philosophy:

Prior Knowledge × Data = Updated Knowledge

A better understanding of the world

Prior × Data = Posterior

Page 23: Current Research in Forensic Toolmark Analysis Petraco Group

Empirical Bayes’

• From Bayes’ Theorem we can get (Efron):

Estimated probability of not a true “match” given the algorithm’s output z-score associated with its “match”

Names: Posterior error probability (PEP) (Käll)

Local false discovery rate (lfdr) (Efron)

• Suggested interpretation for casework:

= Estimated “believability” that the specific tool

produced the toolmark
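The equation on this slide did not survive extraction. For reference, the local false discovery rate in Efron's two-group model is usually written as below. The symbols are the standard ones and are an assumption about what the slide showed: $\hat{p}_0$ is the estimated prior probability of no true “match”, $\hat{f}_0(z)$ the estimated null density of the z-scores, and $\hat{f}(z)$ the estimated mixture density of all z-scores.

```latex
\widehat{\Pr}(\text{not a true match} \mid z)
  \;=\; \widehat{\mathrm{fdr}}(z)
  \;=\; \frac{\hat{p}_0 \, \hat{f}_0(z)}{\hat{f}(z)}
```

Under this reading, the complement $1 - \widehat{\mathrm{fdr}}(z)$ would be the estimated probability of a true “match”.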

Page 24: Current Research in Forensic Toolmark Analysis Petraco Group

Empirical Bayes’

• Model’s use with crime scene “unknowns”:

This is the estimated posterior probability of no association = 0.00027 = 0.027%

Computer outputs “match” for: unknown crime scene toolmarks compared with knowns from “Bob the burglar” tools

This is an uncertainty in the estimate

Page 25: Current Research in Forensic Toolmark Analysis Petraco Group

• Odds form of Bayes’ Rule:

Posterior Odds = Likelihood Ratio × Prior Odds

(Posterior odds in favour of Theory A) = (Likelihood Ratio) × (Prior odds in favour of Theory A)

The “Bayesian Framework”
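A short worked example of the odds form of Bayes' rule. The prior probability and likelihood ratio below are hypothetical numbers chosen only to show the arithmetic.

```python
def probability_to_odds(p):
    """Convert a probability into odds in favour."""
    return p / (1.0 - p)

def odds_to_probability(odds):
    """Convert odds in favour back into a probability."""
    return odds / (1.0 + odds)

def posterior_odds(likelihood_ratio, prior_odds):
    """Posterior odds = likelihood ratio x prior odds."""
    return likelihood_ratio * prior_odds

prior = 0.01   # hypothetical prior probability of Theory A
lr = 1000.0    # hypothetical likelihood ratio from the evidence
post = odds_to_probability(posterior_odds(lr, probability_to_odds(prior)))
```

Here the prior odds are 1:99, the posterior odds are 1000:99, and `post` is 1000/1099, or roughly 0.91.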

Page 26: Current Research in Forensic Toolmark Analysis Petraco Group

Bayes Factors/Likelihood Ratios

• Using the fitted posteriors and priors we can obtain the likelihood ratios (Tippett, Ramos)

Known match LR values

Known non-match LR values

Page 27: Current Research in Forensic Toolmark Analysis Petraco Group

• 2007 Neel and Wells study:

• Count the number of each type of CMS run for known match (KM) and known non-match (KNM) comparisons

• A CMS type is its run length:

• 4X means 4 matching adjacent lines in a comparison of two striation patterns
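A minimal sketch of tallying CMS runs by run length. It assumes an upstream step has already decided, line by line, which striae match; the boolean sequence is hypothetical. It counts maximal runs only, which is an assumption about how a “4X” run is scored.

```python
from collections import Counter
from itertools import groupby

def cms_run_counts(matches):
    """Count maximal runs of consecutive matching striae, keyed by run length."""
    runs = Counter()
    for is_match, group in groupby(matches):
        if is_match:
            runs[sum(1 for _ in group)] += 1
    return runs

# Hypothetical line-by-line match flags for one comparison.
matches = [True, True, False, True, True, True, True, False, False, True, True]
counts = cms_run_counts(matches)
# counts[2] is the number of "2X" runs, counts[4] the number of "4X" runs.
```

For this sequence there are two 2X runs and one 4X run, so the comparison would contribute to the “2” row of the 2X column and the “1” row of the 4X column.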

Model each column of counts as arising from a multinomial distribution

Bayesian Match Probabilities from CMS

914 KM comparisons

                   CMS run lengths:
Number observed    2X      3X      4X      …
0                  508     612     694     …
1                  186     172     135     …
2                  109     59      43      …
3                  39      29      19      …
4                  21      15      16      …
5                  10      9       2       …
6                  4       9       1       …
7                  10      6       3       …
8                  14      2       0       …
>8                 13      1       1       …

1411 KNM comparisons

                   CMS run lengths:
Number observed    2X      3X      4X      …
0                  771     1239    1357    …
1                  298     124     47      …
2                  143     35      4       …
3                  84      10      2       …
4                  46      2       1       …
5                  21      1       0       …
6                  13      0       0       …
7                  14      0       0       …
8                  6       0       0       …
>8                 15      0       0       …

Page 28: Current Research in Forensic Toolmark Analysis Petraco Group

• Model Neel and Wells counts in each column with a multinomial likelihood:

Bayesian Match Probabilities from CMS

• Model each cell probability before we’ve seen any data as an “uninformative” Dirichlet prior:

• Bayes’ theorem gives “updated” (posterior) cell probabilities:
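The three equations on this slide did not survive extraction. The standard multinomial-Dirichlet forms are sketched below as a reference. The notation is an assumption: $n_i$ is the count in cell $i$ of one column, $N = \sum_i n_i$, $\theta_i$ is the cell probability, and $\alpha_i$ are the prior parameters, with $\alpha_i = 1$ as one common “uninformative” choice.

```latex
% Multinomial likelihood for one column of counts
\Pr(n_1,\dots,n_k \mid \theta) = \frac{N!}{\prod_{i=1}^{k} n_i!} \prod_{i=1}^{k} \theta_i^{\,n_i}

% Dirichlet prior on the cell probabilities
p(\theta \mid \alpha) = \frac{\Gamma\!\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \prod_{i=1}^{k} \theta_i^{\,\alpha_i - 1}

% Posterior is again Dirichlet, with the counts added to the prior parameters
\theta \mid n \sim \mathrm{Dirichlet}(\alpha_1 + n_1, \dots, \alpha_k + n_k),
\qquad
\mathrm{E}[\theta_i \mid n] = \frac{n_i + \alpha_i}{N + \sum_{j=1}^{k} \alpha_j}
```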

Page 29: Current Research in Forensic Toolmark Analysis Petraco Group

• Updated CMS run length probabilities:

Bayesian Match Probabilities from CMS

• So what can we use these for?

• Lots of things, but we put them into a Bayesian network:

• BN model for Match/Non-match probabilities given observed numbers of CMS runs

KM comparisons

                   CMS run lengths:
Number observed    2X       3X       4X       …
0                  0.550    0.663    0.752    …
1                  0.202    0.187    0.147    …
2                  0.119    0.065    0.047    …
3                  0.043    0.032    0.022    …
4                  0.024    0.018    0.019    …
5                  0.012    0.011    0.003    …
6                  0.005    0.011    0.002    …
7                  0.012    0.008    0.004    …
8                  0.016    0.003    0.001    …
>8                 0.015    0.002    0.002    …

KNM comparisons

                   CMS run lengths:
Number observed    2X       3X       4X       …
0                  0.5440   0.8726   0.9556   …
1                  0.2099   0.0880   0.0338   …
2                  0.1010   0.0254   0.0035   …
3                  0.0598   0.0078   0.0021   …
4                  0.0332   0.0021   0.0014   …
5                  0.0155   0.0014   0.0007   …
6                  0.0099   0.0007   0.0007   …
7                  0.0105   0.0007   0.0007   …
8                  0.0049   0.0006   0.0007   …
>8                 0.0113   0.0007   0.0007   …
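A minimal sketch showing that the updated probabilities on this slide can be approximately reproduced as Dirichlet posterior means. It uses the 2X column of the 914-comparison (KM) Neel and Wells counts and assumes a uniform prior with every parameter equal to 1; the slides do not state the prior parameters or which posterior summary was tabulated.

```python
def dirichlet_posterior_mean(counts, alpha=1.0):
    """Posterior mean cell probabilities for multinomial counts
    under a symmetric Dirichlet(alpha, ..., alpha) prior."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]

# 2X column of the KM counts: number of comparisons with 0, 1, ..., 8, >8 runs.
km_2x_counts = [508, 186, 109, 39, 21, 10, 4, 10, 14, 13]
post = dirichlet_posterior_mean(km_2x_counts)
```

Under these assumptions the first entry is 509/924, which agrees with the tabulated 0.550 to within about 0.001. Cells with zero observed counts receive a small nonzero probability, which is why the updated tables contain no zeros.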

Page 30: Current Research in Forensic Toolmark Analysis Petraco Group

Bayesian Networks

• A “scenario” is represented by a joint probability function

• Contains variables relevant to a situation which represent uncertain information

• Contains “dependencies” between variables that describe how they influence each other.

• A graphical way to represent the joint probability function is with nodes and directed lines

• Called a Bayesian Network (Pearl)

Page 31: Current Research in Forensic Toolmark Analysis Petraco Group

Bayesian Networks

• (A Very!!) Simple example (Wikipedia):

• What is the probability the Grass is Wet?

• Influenced by the possibility of Rain

• Influenced by the possibility of Sprinkler action

• Sprinkler action influenced by possibility of Rain

• Construct joint probability function to answer questions about this scenario:

• Pr(Grass Wet, Rain, Sprinkler)

Page 32: Current Research in Forensic Toolmark Analysis Petraco Group

Bayesian Networks

Pr(Rain)

Rain:    yes    no
         20%    80%

Pr(Sprinkler | Rain)

Rain:                 yes    no
Sprinkler: was on     1%     40%
           was off    99%    60%

Pr(Grass Wet | Rain, Sprinkler)

Sprinkler:        was on   was on   was off   was off
Rain:             yes      no       yes       no
Grass Wet: yes    99%      90%      80%       0%
           no     1%       10%      20%       100%

Pr(Sprinkler) Pr(Rain)

Pr(Grass Wet)

Page 33: Current Research in Forensic Toolmark Analysis Petraco Group

Bayesian Networks

Pr(Sprinkler) Pr(Rain)

Pr(Grass Wet)

You observe the grass is wet.

Other probabilities are adjusted given the observation
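A minimal sketch of how observing wet grass adjusts the other probabilities, by brute-force enumeration of the joint probability function. The conditional probabilities are those of the Wikipedia sprinkler example cited on the earlier slide (sprinkler on with probability 1% when it rains and 40% when it does not); the variable and function names are hypothetical.

```python
from itertools import product

P_RAIN = {True: 0.20, False: 0.80}
# Pr(sprinkler on | rain)
P_SPRINKLER_ON = {True: 0.01, False: 0.40}
# Pr(grass wet | sprinkler, rain), keyed by (sprinkler, rain)
P_WET = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.80, (False, False): 0.00}

def joint(wet, rain, sprinkler):
    """Pr(Grass Wet, Rain, Sprinkler) from the chain-rule factorization."""
    p_s = P_SPRINKLER_ON[rain]
    p_w = P_WET[(sprinkler, rain)]
    return (P_RAIN[rain]
            * (p_s if sprinkler else 1.0 - p_s)
            * (p_w if wet else 1.0 - p_w))

# Marginal probability that the grass is wet.
p_wet = sum(joint(True, r, s) for r, s in product([True, False], repeat=2))
# Probability of rain after observing that the grass is wet.
p_rain_given_wet = sum(joint(True, True, s) for s in [True, False]) / p_wet
```

With these numbers the probability of rain rises from the 20% prior to about 36% once wet grass is observed.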

Page 34: Current Research in Forensic Toolmark Analysis Petraco Group

Bayesian Networks

“Prior” network based on Neel and Wells observed counts and Multinomial-Dirichlet model:

“Instantiated” network with observations from a comparison:

Estimate of the “match” probability which can be turned into an LR if so desired
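A rough sketch of how a “match” probability and an LR could be computed from the updated CMS run probabilities. The network structure on the slides is not recoverable from the text, so this assumes the simplest possible structure: the 2X, 3X and 4X counts are treated as conditionally independent given match status (a naive Bayes network), with a hypothetical 50% prior. The probabilities are the first four rows of the updated KM and KNM tables; the observed counts are hypothetical.

```python
# Updated probabilities of observing 0, 1, 2 or 3 runs of each CMS type.
KM = {"2X": [0.550, 0.202, 0.119, 0.043],
      "3X": [0.663, 0.187, 0.065, 0.032],
      "4X": [0.752, 0.147, 0.047, 0.022]}
KNM = {"2X": [0.5440, 0.2099, 0.1010, 0.0598],
       "3X": [0.8726, 0.0880, 0.0254, 0.0078],
       "4X": [0.9556, 0.0338, 0.0035, 0.0021]}

def likelihood_ratio(observed):
    """LR = Pr(observed counts | match) / Pr(observed counts | non-match),
    assuming the run types are conditionally independent (an assumption)."""
    lr = 1.0
    for run_type, n in observed.items():
        lr *= KM[run_type][n] / KNM[run_type][n]
    return lr

def match_probability(observed, prior_match=0.5):
    """Posterior probability of a match via the odds form of Bayes' rule."""
    odds = likelihood_ratio(observed) * prior_match / (1.0 - prior_match)
    return odds / (1.0 + odds)

# Hypothetical comparison: three 2X runs, one 3X run, one 4X run.
observed = {"2X": 3, "3X": 1, "4X": 1}
lr = likelihood_ratio(observed)
p_match = match_probability(observed)
```

The two outputs are interchangeable given the prior, which is the sense in which a match probability “can be turned into an LR”.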

Page 35: Current Research in Forensic Toolmark Analysis Petraco Group

Future Directions

• GUI modules for common toolmark comparison tasks/calculations using 3D microscope data

• 2D features for toolmark impressions

• Parallel implementation of computationally intensive routines

• Standards board, to review statistical methodology/algorithms

• Maybe part of OSAC?

Page 36: Current Research in Forensic Toolmark Analysis Petraco Group

Acknowledgements

• Professor Chris Saunders (SDSU)

• Professor Christophe Champod (Lausanne)

• Alan Zheng (NIST)

• Ryan Lillian and Marcus Brubaker (CADRE)

• Research Team:

• Ms. Tatiana Batson

• Dr. Martin Baiker

• Ms. Julie Cohen

• Dr. Peter Diaczuk

• Mr. Antonio Del Valle

• Ms. Carol Gambino

• Dr. James Hamby

• Mr. Nick Natalie

• Mr. Mike Neel

• Ms. Alison Hartwell, Esq.

• Ms. Loretta Kuo

• Ms. Frani Kammerman

• Dr. Brooke Kammrath

• Mr. Chris Lucky

• Off. Patrick McLaughlin

• Dr. Linton Mohammed

• Ms. Diana Paredes

• Mr. Nicholas Petraco

• Ms. Stephanie Pollut

• Dr. Peter Pizzola

• Dr. Graham Rankin

• Dr. Jacqueline Speir

• Dr. Peter Shenkin

• Mr. Chris Singh

• Mr. Peter Tytell

• Ms. Elizabeth Willie

• Ms. Melodie Yu

• Dr. Peter Zoon

Page 37: Current Research in Forensic Toolmark Analysis Petraco Group

Data, Programs, Reprints/Preprints:

[email protected]