What is necessary (and unnecessary) for analyses of offender databases Forensic Bioinformatics (www.bioforensics.com) [email protected] Jason R. Gilder August 16, 2008




  • Slide 1

What is necessary (and unnecessary) for analyses of offender databases
Forensic Bioinformatics (www.bioforensics.com)
[email protected]
Jason R. Gilder
August 16, 2008

  • Slide 2

Offender databases
    Originally designed for convicted offenders
        CODIS: Convicted Offender DNA Index System
    Expanded
        Unsolved crime samples
        Arrestees
        Elimination profiles

  • Slide 3

CODIS
    COmbined DNA Index System
        National: NDIS
        State: SDIS - fewer restrictions
        Local: LDIS - fewest restrictions
    Convicted Offender Profiles in NDIS: 6,031,000
    Forensic Profiles in NDIS: 225,400
    More than 71,800 cold hits

  • Slide 4

Why analyze a database?
    Questions remain regarding the weight of a DNA database match
        Random Match Probability (RMP)
        Database Match Probability (DMP)
        Balding & Donnelly LR
        Other
    Composition of database may affect chance of a coincidental match
        Presence of relatives

  • Slide 5

Structure of a DNA database
    Collection of records
    Structured Query Language (SQL) format

    ID#    Fname  Lname  Pop  SSN          Date     D3      vWA     FGA     D7
    AC937  John   Doe    CAU  283-24-4300  5/2/02   13, 15  16, 16  21, 23  11, 14
    BQ384  Jane   Doe    HIS  365-78-3472  7/23/03  12, 17  15, 19  25, 25  10, 10
    BZ927  Frank  Smith  AA   312-55-1476  2/9/06   13, 15  14, 15  24, 26  12, 16

  • Slide 6

Examples of possible issues with the use of DNA databases
    Michigan v. Gary Leiterman
        Evidence: blood found on victim's hand
        Cold hit to a 4-year-old boy
    R v. Sean Hoey
        Evidence: explosive device
        Cold hit to a 14-year-old boy
    Jaidyn Leskie inquest (Australia)
        Evidence: clothing from deceased
        Cold hit to a rape victim

  • Slide 7

Lab error and false cold hits

  • Slide 8

How a database can be analyzed
    Perform all pairwise profile comparisons - the "Arizona Search"
        P1 with P2, P1 with P3, P1 with P4, …, P1 with Pn
        P2 with P3, P2 with P4, P2 with P5, …, P2 with Pn
    Analyze profile similarity
        Count number of matching loci and alleles
    Perform kinship analyses

  • Slide 9

Arizona Match Data
    65,493 Profiles
    122 pairs matched at 9 of 13 loci
    20 pairs matched at 10 of 13
    1 pair matched at 11 of 13
    1 pair matched at 12 of 13

    Loci  Ave     Std Dev  p-value
    9     103.47  10.64    0.08
    10    3.06    1.68     9.6E-23
    11    0.05    0.23     4.4E-05
    12    0       0
    9+    106.59  10.83    5.8E-04

  • Slide 10

Review of Victoria State Database
    Krane/Paoletti analysis: >11,000 profiles each compared to all others across 9 loci:

    Shared alleles  Observed occurrences
    14              401
    15              27
    16              1
    17              16
    18              0

    "Aussie Bump"

  • Slide 11

    # Matching Alleles  14   15  16  17
    # Observed          401  27  1   16

    300 100 20 1

  • Slide 12

  • Slide 13

Issues with the release or analysis of a DNA database
    Privacy concerns
        Names, social security numbers, DNA profiles, addresses, etc.
    Issues with analysis
        Duplicate profiles, multiple databases, presence of relatives, processing time, CODIS requirements
    Legal issues
        California Proposition 69

  • Slide 14

Issue 1: Privacy concerns
    Database contains private information that should not be released
    Answer: provide anonymous profiles only
        Accomplished through one command

        SELECT D3, vWA, FGA, …, D7 FROM CODIS_DB

  • Slide 15

Issue 2: Duplicate profiles
    Many databases contain at least 10-15% duplicate profiles
    Answer: ignore duplicates in analysis
        A fairly thorough database analysis can take place with duplicates removed
        Also identify potential mistyping rate
        The lab may be able to cull out duplicates from the same individual with additional information (e.g. SSN)

  • Slide 16

Issue 2b: Multiple databases
    California DOJ contains information in two databases that can be cross referenced to remove duplicates
        Login DB contains unique CII ID and accession numbers of all samples for that individual
        SDIS contains accession number and profile
    Answer: JOIN the data with one command
        Only select the first accession number profile

        SELECT D3, vWA, FGA, D7 FROM SDIS JOIN LOGIN_DB WHERE (LOGIN_DB.ACCESSION1 = SDIS.ACCESSION)

  • Slide 17

Issue 3: Presence of relatives
    It is difficult to identify the presence of relatives by hand by simply looking at the CODIS records
    "There are a significant, but unknown number, of such related individuals in California's offender database." - Kenneth Konzack
    Answer: Exactly!

  • Slide 18

Issue 4: Processing time
    Performing an internal search of the database will take too long (a week or more) and will not allow for CODIS searches during that time
    Answer: perform an analysis on a separate computer or computers
        Pairwise database search is embarrassingly parallel

  • Slide 19

Issue 5: Legal issues
    Legal statutes (e.g., California Proposition 69) prohibit release of database to citizens
    Answer: 38 state statutes (including CA) allow for an outside review of their database for statistical analysis
        Many require the removal of identifying information

  • Slide 20

Questions?
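
The all-pairs comparison described on Slide 8 can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the software used for the Arizona or Victoria analyses: the function names (`matching_loci`, `shared_alleles`, `pairwise_histogram`) are hypothetical, and the three anonymous profiles reuse the four loci shown in the example records on Slide 5.

```python
from collections import Counter
from itertools import combinations

# A profile maps a locus name to its two alleles (order is irrelevant).
Profile = dict[str, tuple[int, int]]

# Anonymous profiles only (no names or SSNs), taken from the Slide 5 example rows.
PROFILES: list[Profile] = [
    {"D3": (13, 15), "vWA": (16, 16), "FGA": (21, 23), "D7": (11, 14)},
    {"D3": (12, 17), "vWA": (15, 19), "FGA": (25, 25), "D7": (10, 10)},
    {"D3": (13, 15), "vWA": (14, 15), "FGA": (24, 26), "D7": (12, 16)},
]


def matching_loci(a: Profile, b: Profile) -> int:
    """Count loci at which both alleles agree between the two profiles."""
    return sum(
        1 for locus in a.keys() & b.keys()
        if sorted(a[locus]) == sorted(b[locus])
    )


def shared_alleles(a: Profile, b: Profile) -> int:
    """Count alleles shared locus by locus (multiset overlap, 0 to 2 per locus)."""
    total = 0
    for locus in a.keys() & b.keys():
        overlap = Counter(a[locus]) & Counter(b[locus])
        total += sum(overlap.values())
    return total


def pairwise_histogram(profiles, score) -> Counter:
    """Score every unordered pair once and tally how often each score occurs."""
    return Counter(score(p, q) for p, q in combinations(profiles, 2))


if __name__ == "__main__":
    print("matching loci :", dict(pairwise_histogram(PROFILES, matching_loci)))
    print("shared alleles:", dict(pairwise_histogram(PROFILES, shared_alleles)))
```

A database of n profiles requires n(n-1)/2 comparisons, roughly 2.1 billion for the 65,493 Arizona profiles. Each pair is scored independently of every other pair, which is why the search can be split across separate computers as Slide 18 suggests.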