Bruce Budowle Assessing the Significance of Y STR Evidence



Characteristics of the Human Y Chromosome

• size: ~ 60 Mb

• ~ 35 Mb euchromatic (transcribed)

• ~ 25 Mb heterochromatic (non-transcribed)

• 95% non-recombining (NRY)

• 5% X-recombining (2 pseudoautosomal regions at telomeres)

• shape: acrocentric - very short p-arm, long q-arm (“Y” name)

• rich in different kinds of repetitive DNA sequences

• lack of recombination

• relatively poor in gene content


Genes on the Human Y Chromosome

• 23 Mb of the euchromatic region determined

• 156 transcription units

• 78 encode proteins (genes)

• 27 distinct Y-specific protein-coding genes (gene families)

• 16 ubiquitously expressed genes = housekeeping genes

– e.g. RPS4Y, ZFY, AMELY, SMCY, DBY

• 9 testis-specific genes = male sex determination, spermatogenesis

– e.g. SRY, TSPY, CDY, RBMY, DAZ

• origin of NRY genes:

– derived / preserved from the proto-sex chromosomes (X-homology)

– specialisation in male-specific function


Genes Mapped to Y Chromosome


Evolution of Mammalian Sex Chromosomes

Lahn, Pearson & Jegalian 2001

Some homology – need to consider in validation


Polymorphisms on the Human Y Chromosome

Repetitive DNA – e.g., STRs

Single-Copy DNA – e.g., SNPs, indels


Y Chromosome Polymorphisms

• ~ 200 binary polymorphisms (Y-SNPs) characterized

• > 300 microsatellites (Y-STRs) characterized

• 1 minisatellite (MSY1)


Not all mutations occur at the same rate

‘hot spots’

‘cold spots’

SNPs


-From J.M. Butler (2003) Forensic Sci. Rev. 15:91-111


Y Chromosome STRs

Nucleic Acids Res. 28(2), e8 (2000)

Marker Name    Repeat Motif     Allele Range
DYS19          TAGA             8-16
DYS385         GAAA             10-22
DYS388         ATT              12-17
DYS389 I       (TCTG) (TCTA)    7-13
DYS389 II      (TCTG) (TCTA)    23-31
DYS390         (TCTA) (TCTG)    18-27
DYS391         TCTA             8-13
DYS392         TAT              7-16
DYS393         AGAT             9-15
YCAII          CA               16-25
YCAIII         CA               19-25
DYS434         ATCT             8-11
DYS435         TGGA             9-13
DYS436         GTT              10-15
DYS437         TCTA             8-11
DYS438         TTTTC            6-12
DYS439         AGAT             9-14
Y-GATA-A4      AGAT             11-14
Y-GATA-A7.1    ATAG             7-12
Y-GATA-A7.2    TAGA             8-12
Y-GATA-A8      TCTA             8-14
Y-GATA-A10     TATC             11-14
Y-GATA-C4      TATC             11-16
Y-GATA-H4      TAGA             10-13
DYS441         CCTT             12-18
DYS442         TATC             10-14
DYS443         TTCC             12-17
DYS444         TAGA             11-15
DYS445         TTTA             10-13
G09411         TATG             8-14


Mutation Process for STR loci


Scientific Uses

• Genealogy studies, e.g., Hemings-Jefferson case


**Remember paternal lineage issues for identity testing


Scientific Use

• Genealogy studies, e.g., Hemings-Jefferson case

• Human evolution studies, e.g., human migrations


Definitions

• Haplogroup: set of haplotypes defined by slowly mutating markers (mainly SNPs) which have more phylogenetic stability

• Haplotype: combination of allelic states of a set of polymorphic markers lying on the same DNA molecule

Unique event polymorphisms record history of Y chromosome


Phylogenetic tree of Y haplogroups based on binary SNP data


Applications of Y-STRs

• Forensic Analysis

– Detect male DNA in a sample containing male and female DNA (huge background of female DNA)

– Aspermic males

– Fingernail scrapings

– Additional DP value

– Multiple male donors

– Limits of differential extraction / tissues

– Gender clarification (amelogenin)

• Paternal Lineage

– Paternity testing

– Kinship analysis

– Deficiency cases


Identification of Male Contributor DNA in Crime Scene Material

[Figure labels: Female Victim DNA; Male Suspect DNA; mixture of Large Female DNA : Perpetrator Male DNA]

Autosomal STR profile

- See only female DNA profile

- Or partial DNA profile

Y STR profile

- no female DNA

- no profile overlap

- only male component


One Forensic Application

• Some victims of sexual assault may not provide vaginal samples immediately after the incident

• Ability to retrieve typeable autosomal STR profiles from the semen diminishes rapidly

• However, sperm persist in the vaginal canal up to 3 days after intercourse (may be detectable up to 7 days after intercourse in the cervix)

• Sperm decrease in number as interval increases

• Lymphocytes and epithelial cells from male


Fingernail Scraping Case

Victim was found strangled to death

Suspect had scratches on his face

Based on STR results, suspect could not be excluded; many alleles were below minimal threshold (inconclusive result)


Evidence Profile

Suspect Profile


Applications of Y-STRs

• Forensic Analysis

• Paternal Lineages

– Paternity testing

– Kinship analysis

– Deficiency cases


Deficiency Case Male Lineage

• Y STR analysis - any male relative in pedigree can be a reference for alleged father


Y-STR Haplotype Analysis in Deficiency Paternity Case

          DYS19  DYS389I  DYS389II  DYS390  DYS391  DYS392  DYS393  DYS385  DYS413  YCAII
Nephew    14     13       30 (16)   25      11      13      12      11-14   22      7
Son       14     12       29 (16)   24      10      15      12      11-14   22      7

Exclusion

If true biological nephew, then alleged father is excluded as father of child in question

Kayser et al. Progress in Forensic Genetics (1998), 7: 494-496


[Figure: forward (F) and reverse (R) primer positions at multi-copy Y-STR markers. Figure 9.5, J.M. Butler (2005) Forensic DNA Typing, 2nd Edition © 2005 Elsevier Academic Press]

DYS385 a/b: Multi-Copy (Duplicated) Marker

• Duplicated regions are 40,775 bp apart and facing away from each other

• Products a and b may be the same size (a = b) or different (a ≠ b)

DYS389 I/II: Single Region but Two PCR Products (because forward primers bind twice)


Y-STR consensus structure and allele ranges


Y Chromosome STR Markers

Marker Name    GenBank Accession   Repeat Motif     Allele Range   PCR Product Sizes   Reference
DYS19          X77751              TAGA             8-16           178-210 bp          Roewer 1992
DYS385         Z93950              GAAA             10-22          252-300 bp          Schneider 1998
DYS388         G09695              ATT              12-17          128-143 bp          Kayser 1997
DYS389 I       G09600              (TCTG) (TCTA)    7-13           239-263 bp          Kayser 1997
DYS389 II      G09600              (TCTG) (TCTA)    23-31          353-385 bp          Kayser 1997
DYS390         G09611              (TCTA) (TCTG)    18-27          191-227 bp          Kayser 1997
DYS391         G09613              TCTA             8-13           275-295 bp          Kayser 1997
DYS392         G09867              TAT              7-16           236-263 bp          Kayser 1997
DYS393         G09601              AGAT             9-15           108-132 bp          Kayser 1997
YCAIII         AC006370            CA               19-25          192-204 bp          Kayser 1997
DYS434         AC002992            ATCT             8-11           110-122 bp          Ayub 2000
DYS435         AC002992            TGGA             9-13           210-228 bp          Ayub 2000
DYS436         AC005820            GTT              10-15          128-143 bp          Ayub 2000
DYS437         AC002992            TCTA             8-11           186-202 bp          Ayub 2000
DYS438         AC002531            TTTTC            6-12           203-233 bp          Ayub 2000
DYS439         AC002992            AGAT             9-14           238-258 bp          Ayub 2000
Y-GATA-A4      G42670              AGAT             11-14          242-254 bp          White 1999
Y-GATA-A7.1    G42675              ATAG             7-12           161-181 bp          White 1999
Y-GATA-A7.2    G42671              TAGA             8-12           174-190 bp          White 1999
Y-GATA-A8      G42672              TCTA             8-14           219-244 bp          White 1999
Y-GATA-A10     G42674              TATC             11-14          160-172 bp          White 1999
Y-GATA-C4      G42673              TATC             11-16          251-271 bp          White 1999
Y-GATA-H4      G42676              TAGA             10-13          362-370 bp          White 1999


• For effective use, guidelines are needed

• ISFG Recommendations

• Combine with existing recommendations (NRC II Report)

• Nomenclature, Allelic Ladders, Population Genetics, Statistical Issues


Basic Interpretation Guidelines

• Similar to autosomal STRs

• Thresholds for detection and interpretation

• Stutter

• Mixtures – what constitutes a mixture

• Validation studies in concert with guidelines

• Interpret evidence before knowns


Y STR LOCI

• DYS19

• DYS389 I

• DYS389 II

• DYS390

• DYS391

• DYS392

• DYS393

• DYS385 I/II

“Minimal Haplotype” – defined for research only


Y STR Loci: SWGDAM

DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS438, DYS439, DYS385a/b

DYS385 – two loci

DYS389 – two loci


Kits

• Commercially available Y-STR multiplex kits allow for standard markers and QA/QC

• Most have EMH and SWGDAM recommended loci


PowerPlex® Y System

DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS385

DYS385 – two loci


AmpFlSTR® Yfiler™ Kit

DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS385, DYS448, DYS456, DYS458, DYS635, GATA H4.1

DYS385 – two loci


Promega Amplification Performance (moderate-to-low sensitivity instrument)

1ng male DNA

0.125ng male DNA

100ng female DNA

0.1ng male + 100ng female

2s injection on ABI PRISM® 310 Genetic Analyzer


DNA Mixtures: Increasing female (Promega)

[Electropherogram panels]

0.5ng male

0.5ng male + 0.5ng female (equal mix)

0.5ng male + 5ng female (10-fold)

0.5ng male + 50ng female (100-fold)

0.5ng male + 150ng female (300-fold)

0.5ng male + 500ng female (1000-fold)

Neg. control

3s inj. on ABI PRISM® 310 Genetic Analyzer

DNA quantified by A260


AmpFlSTR® Yfiler™ Kit 1 ng Male Control DNA 007

DYS458 DYS389 I DYS390 DYS389 II

DYS438 DYS19 DYS385 a/b

DYS393 DYS391 DYS439 Y GATA C4 DYS392

Y GATA H4 DYS456 DYS437 DYS448


AmpFlSTR® Yfiler™

137 alleles

Allelic ladder


Y STR Population Data: Promega Study

Population   N
CFS AFR      37
CT AFR       182
MI AFR       86
NYC AFR      80
TX AFR       192
CFS CAU      57
CT CAU       164
MI CAU       97
NYC CAU      83
TX CAU       194
CT HIS       160
MI HIS       97
MN HIS       101
NYC HIS      80
TX HIS       192
Apache       138
Navajo       219
CFS ASN      28
MN ASN       101
NYC ASN      45
TX ASN       73
CFS EI       37

Total = 2443


DYS19 Allele Frequencies, African American

[Bar chart. x-axis: alleles 12-18; y-axis: frequency, 0 to 0.5. Series: Sinha (n=543), CFS (n=37), CT (n=182), MI (n=86), NYC (n=80), TX (n=193)]


Population Parameters

Genetic Diversity: $h = \dfrac{n\,(1 - \sum x_i^2)}{n - 1}$

Random Match Probability: $P = \sum x_i^2$

where $x_i$ = frequency of each haplotype and $n$ = # haplotypes
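The two population parameters can be computed directly from a list of observed haplotypes. The sketch below is illustrative only: it assumes that "n" on the slide means the number of haplotypes sampled, and the function name `haplotype_stats` and the toy sample are hypothetical, not taken from the presentation.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (genetic_diversity, random_match_probability).

    Assumption: n is the number of sampled haplotypes and x_i is the
    relative frequency of each distinct haplotype in that sample.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    # P = sum of squared haplotype frequencies
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    # h = n (1 - sum x_i^2) / (n - 1)
    h = n * (1 - sum_sq) / (n - 1)
    return h, sum_sq


# Hypothetical toy sample: four haplotypes, one of them seen twice.
sample = [
    (14, 13, 30, 25),
    (14, 12, 29, 24),
    (15, 13, 30, 24),
    (14, 13, 30, 25),
]
h, p = haplotype_stats(sample)
print(f"genetic diversity h = {h:.4f}, match probability P = {p:.4f}")
```

Haplotypes are stored as tuples so that the whole multi-locus profile is counted as a single unit rather than locus by locus.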


Gene Diversity, Hispanic

[Bar chart. x-axis: DYS437, DYS19, DYS392, DYS393, DYS390; y-axis: gene diversity, 0 to 0.8. Series: CT HIS, MI HIS, MN HIS, NYC HIS, TX HIS]


Y Haplotype Profiles

Population   N     # Haplotypes   % Single   Haplotype Diversity
CFS AFR      37    36             97.3       0.9985
CT AFR       182   172            94.5       0.9994
MI AFR       86    85             98.8       0.9997
NYC AFR      80    80             100        >0.9999
TX AFR       193   181            93.8       0.9993
CFS CAU      57    50             87.7       0.9944
CT CAU       163   153            93.9       0.9991
MI CAU       97    87             89.7       0.9968
NYC CAU      83    80             96.4       0.9991
TX CAU       194   170            87.6       0.9981
CT HIS       158   130            82.3       0.9963
MI HIS       97    90             92.8       0.9985
MN HIS       100   95             95.0       0.9988
NYC HIS      80    74             92.5       0.9968
TX HIS       192   179            93.2       0.9991


Y Haplotype Profiles

Population   N     # Haplotypes   % Single   Haplotype Diversity
Apache       138   70             50.7       0.9701
Navajo       219   101            46.1       0.9806
CFS ASN      28    28             100        >0.9999
MN ASN       100   96             96.0       0.9992
NYC ASN      45    43             95.6       0.9970
TX ASN       70    69             98.6       0.9996
CFS EI       37    35             94.6       0.9955


high haplotype diversity = high inter-individual variation

Haplotype Diversity, N>80

[Chart. y-axis: haplotype diversity, 0.955 to 1.005. Populations: CT CAU, NYC CAU, MI CAU, TX CAU, CT AFR, MI AFR, NYC AFR, TX AFR, CT HIS, MI HIS, MN HIS, NYC HIS, TX HIS, Apache, Navajo, MN ASN]


Multiband Y Patterns

• MN ASIAN PA0077 DYS385 - 3 Bands

• MN HISPANIC PH0031 DYS390 - 2 Bands

• MN HISPANIC PH0063 Multibands

• NYC HISPANIC 26 DYS19 - 2 Bands

• NYC CAUCASIAN 4 DYS19 - 2 Bands

• CT HISPANIC 00-1851 DYS19 - 2 Bands

• CT HISPANIC 99-1695 DYS19 - 2 Bands

• CT HISPANIC 99-0362 DYS19 - 2 Bands

• CT HISPANIC 98-2136 DYS19 - 2 Bands

• CT CAUCASIAN 00-3022 DYS385 - 3 Bands

• ASIAN A-FTA-34-F/C DYS385 - 3 Bands

• ASIAN A-FTA-36-F/C DYS19 – Primer Binding site?

• ASIAN A-FTA-32-F/C DYS385 - 4 Bands

Must consider when analyzing mixtures


What about Equilibrium?

• These markers are all on a single chromosome

• No recombination

• Predict strong linkage and a lack of independence between the loci

• Haplotype


Y STR Loci Pairwise Tests: 12 Loci – 66 tests

Population   N     # Equilibrium
CFS AFR      37    35
CT AFR       182   23
MI AFR       86    27
NYC AFR      80    34
TX AFR       193   21
CFS CAU      57    30
CT CAU       163   26
MI CAU       97    30
NYC CAU      83    22
TX CAU       194   11
CT HIS       158   26
MI HIS       97    33
MN HIS       100   42
NYC HIS      80    35
TX HIS       192   37


Y STR Loci Pairwise Tests: 12 Loci – 66 tests

Population   N     # Equilibrium
Apache       138   9
Navajo       219   12
CFS ASN      28    60
MN ASN       100   50
NYC ASN      45    47
TX ASN       70    50
CFS EI       37    35

Fewest – Native American

Most – Asian (sample size)


Y STR Loci Pairwise Tests: 22 populations; >17 Equilibrium detected

Loci         # Populations
391/389I     19
391/389II    18
389I/439     17
389I/385-2   18
439/389II    17
439/393      19
439/385-1    18
439/385-2    20
389II/393    17
437/393      18


Y STR Loci Pairwise Tests: 22 populations; <5 Equilibrium

Loci          # Populations
391/438       5
389I/389II    1
438/437       4
438/19        4
438/392       3
438/385-1     0
438/385-2     4
437/385-1     5
19/392        5
19/385-1      3
392/385-1     5
390/385-1     5
385-1/385-2   3


Y STR Loci Pairwise Tests: 22 Populations – Examples of Population Specific Disequilibrium

Loci         # Populations/Equilibria   Population/Equilibria
389I/392     15                         Caucasian
438/439      11                         Caucasian
437/439      16                         Caucasian
437/385-2    12                         African American
390/385-2    12                         African American


Equilibrium and Impact?

• There is evidence of “independence” between some pairs of loci in the population samples

• Likely due to combination of mutation rate, subdivision and random drift

• A large factor is haplogroup diversity

• The marker selection for increasing haplotype diversity is not directly correlated to gene diversity

• Conservative estimates


Approaching Analysis

• Some may suggest - “Use the set of Core Y STRs and add more as needed to resolve matches”

• First question – when do you stop?

• If you get a match, you would have to continue on ad infinitum!

• Is this a sensible policy?

How much power is needed???


Discriminatory Capacity for Three US Populations

Y-STR marker combination          African American (N=786)   Caucasians (N=778)   Hispanics (N=381)
European minimal haplotype (9)    75.8%                      61.7%                79.8%
Eur. Minimal + SWGDAM (11)        86.8%                      74.3%                85.6%
Minimal + SWGDAM + DYS437 (12)    87.7%                      76.7%                88.2%
AmpFlSTR® Yfiler kit (17)         97.6%                      95.5%                95.8%

*DC = (# of different haplotypes / pop. size) x 100

Mulero et al., JFS (2006) 51:64-75
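As a rough illustration of the DC definition above, the sketch below counts distinct haplotypes in a sample and also counts haplotypes seen exactly once. The function names and the toy data are hypothetical, and treating "unique" as "observed once" is an assumption, not something the slides state.

```python
from collections import Counter


def discriminatory_capacity(haplotypes):
    """DC = (# of different haplotypes / population sample size) x 100."""
    counts = Counter(haplotypes)
    return 100.0 * len(counts) / len(haplotypes)


def singleton_count(haplotypes):
    """Number of haplotypes observed exactly once (assumed meaning of 'unique')."""
    counts = Counter(haplotypes)
    return sum(1 for c in counts.values() if c == 1)


# Hypothetical sample of five males; haplotype "A" is shared by two of them.
toy = ["A", "A", "B", "C", "D"]
dc = discriminatory_capacity(toy)
singles = singleton_count(toy)
print(f"DC = {dc:.1f}%, haplotypes seen once = {singles}")
```

Under that assumption the count of distinct haplotypes is never smaller than the count of haplotypes seen once, so the two measures can be reported side by side without contradiction.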


Number of Unique Haplotypes Observed in Three US Populations

Y-STR marker combination          African American (N=786)   Caucasians (N=778)   Hispanics (N=381)
European minimal haplotype (9)    496                        382                  266
Eur. Minimal + SWGDAM (11)        618                        503                  295
Minimal + SWGDAM + DYS437 (12)    628                        524                  306
AmpFlSTR® Yfiler kit (17)         749                        714                  350

Mulero et al., JFS (2006) 51:64-75


So… more loci do result in better resolution

But…size of the database impacts more on statistics


Approaching Analysis

• So additional testing beyond a single kit is an unlikely approach because information gain is low

• Many samples will already be very limited

• Community will rely on commercially available kits not in-house designer systems

• QC/ Proficiency Testing

• Better to increase size of database(s) to gain power

• We will re-visit substructure issues later


Approaching Analysis

• Some may suggest - “A reference database should contain related individuals” – to better define the population

• Probability of paternal relative having the same haplotype is close to 1 (barring mutation)

• Typically comprised of unrelated individuals

• Although a small unknown number of related individuals may be in a database

• Able to address significance of a very closely related profile


Exclusion with 1 mismatch among 12 analyzed Y-STRs

Evidence 14 12 28 25 11 11 13 14,14 11 11 15

Known 14 12 28 24 11 11 13 14,14 11 11 15

By having a database of unrelated males one can assess weight of relative (with mutation) versus rarity of haplotype in population
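The locus-by-locus comparison behind this example can be sketched as follows. The two profiles are the Evidence and Known rows from the slide; because the slide does not name the loci, positions are reported as plain indexes, and the helper name `mismatched_positions` is hypothetical.

```python
def mismatched_positions(evidence, known):
    """Return the zero-based positions at which two haplotypes differ."""
    if len(evidence) != len(known):
        raise ValueError("haplotypes must cover the same set of loci")
    return [i for i, (e, k) in enumerate(zip(evidence, known)) if e != k]


# Profiles from the slide; the two-copy marker is kept as an allele pair.
evidence = [14, 12, 28, 25, 11, 11, 13, (14, 14), 11, 11, 15]
known = [14, 12, 28, 24, 11, 11, 13, (14, 14), 11, 11, 15]

diff = mismatched_positions(evidence, known)
print("mismatched positions:", diff)
for i in diff:
    print(f"position {i}: evidence {evidence[i]} vs known {known[i]}")
```

A comparison like this only flags the difference; weighing a single-step mismatch as a possible mutation in a paternal relative is an interpretive step that the code does not attempt.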


Qualitative Conclusions of Y-STR Haplotype Comparison

Exclusion- The two haplotypes are dissimilar; i.e., the reference person is excluded as the contributor of Y-specific DNA of the evidence sample

Inclusion/Match- The Y haplotypes from two samples are sufficiently similar and potentially could have originated from the same source, or from a common paternal lineage

Inconclusive- Exclusion/Inclusion cannot be definitively inferred due to insufficient data from one or both of the DNA samples


Now Let’s Think About Application

• Unique event polymorphisms record history of Y chromosome

• Effective population size

• Patrilineal inheritance reduces population size

• Variance of offspring further reduces effective population size

• Patrilocal effect causes local differentiation

• Lack of recombination

• But some detectable linkage equilibrium


Population Differentiation

• Effective population size of Y chromosome is 1/4 of autosome or 1/3 of X

– lower sequence diversity on Y

– more susceptible to genetic drift

• random changes in frequency of haplotypes due to sampling bias from one generation to next

• accelerates differences between populations

• Geographical clustering due to patrilocal behavior of men

– women move closer to man’s birthplace

– local geographical differentiation enhanced

• Variance of offspring further reduces Ne (effective population size)
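The 1/4 and 1/3 ratios follow from counting chromosome copies. The short derivation below is a sketch under a simplifying assumption that the slide does not state: equal numbers of breeding males and females, $N_m = N_f = N/2$.

```latex
\begin{align*}
\text{autosome copies} &= 2N_f + 2N_m = 2N \\
\text{X copies} &= 2N_f + N_m = \tfrac{3}{2}N \\
\text{Y copies} &= N_m = \tfrac{1}{2}N \\
\frac{\text{Y}}{\text{autosome}} &= \frac{N/2}{2N} = \frac{1}{4}, \qquad
\frac{\text{Y}}{\text{X}} = \frac{N/2}{3N/2} = \frac{1}{3}
\end{align*}
```

Unequal male reproductive success (the variance of offspring noted in the last bullet) would push the Y value lower still.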


Calculation to Convey to the Court

• Frequency estimate not possible

• Court desires a frequency estimate

• Point Estimate (Counting Method)

• Confidence Interval

• Approach the same as mtDNA


Calculation to Convey to the Court: Issues

• Estimating the rarity of a Y DNA profile is performed differently than for autosomal DNA markers

• No evidence for recombination across the majority of the Y-chromosome

• One cannot employ the product rule to estimate the rarity of the Y types in a profile

• The composite multi-locus profile is treated as a single locus or haplotype


Calculation to Convey to the Court

• The vast majority of possible haplotypes will not be observed in any database

• The counting method is likely to be conservative

• A correction for sampling

• A correction for substructure


Calculation to Convey to the Court: Counting Method

• The counting method is very simple

• A Y STR haplotype (evidence sample) is compared to a reference database(s) of unrelated individuals

• The number of times the haplotype is observed in a database is counted
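A minimal sketch of the counting method, assuming the database is simply a list of haplotypes from unrelated males. The function name `count_matches` and the toy database are hypothetical and stand in for a real reference database.

```python
def count_matches(evidence, database):
    """Count how often the evidence haplotype appears in the database.

    Returns (X, N, X / N): the number of matches, the database size,
    and the point estimate of the haplotype frequency.
    """
    n = len(database)
    if n == 0:
        raise ValueError("database is empty")
    x = sum(1 for haplotype in database if haplotype == evidence)
    return x, n, x / n


# Hypothetical database of five haplotypes typed at the same four loci.
database = [
    (14, 13, 30, 25),
    (14, 12, 29, 24),
    (15, 13, 30, 24),
    (14, 13, 30, 25),
    (16, 12, 28, 23),
]
x, n, p_hat = count_matches((14, 13, 30, 25), database)
print(f"observed {x} times in {n} profiles, point estimate {p_hat:.2f}")
```

The whole profile is compared as one unit, which reflects the point that the multi-locus Y profile is treated as a single haplotype rather than multiplied across loci.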


Calculation to Convey to the Court: Approaches

• It is more likely that the counting method will be employed by the U.S. laboratories and courts because of its operational simplicity


Limitations of the Counting Method

• Non-matched sites of the haplotype are given weight equal to that of different origin (but may have some extra value for substructure)

• Mutations are not weighted

• Haplotypes of the same paternal lineage can be excluded, when they are subject to mutations

• Does not recognize evolutionary changes, and/or effect of convergent mutations


Maximum haplotype frequency

• If a Y-haplotype is not seen in a sample of N males then, at the α level of significance, its maximum frequency is $1 - \alpha^{1/N}$

• As N becomes larger, maximum frequency becomes closer to point estimate


If the haplotype has not been observed in the database, then the upper bound of the CI is

$1 - \alpha^{1/N}$


Haplotype frequency (95%)

N         frequency
100       3/100
500       3/500
1,000     3/1,000
10,000    3/10,000
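The 3/N entries can be compared with the exact bound $1 - \alpha^{1/N}$ for an unobserved haplotype. The sketch below is only an illustration; the function name `upper_bound_unobserved` is an assumption, not something taken from the slides.

```python
def upper_bound_unobserved(n: int, alpha: float = 0.05) -> float:
    """Upper (1 - alpha) confidence bound on the frequency of a haplotype
    that was not seen in a database of n males: 1 - alpha**(1/n)."""
    if n <= 0:
        raise ValueError("database size must be positive")
    return 1.0 - alpha ** (1.0 / n)


if __name__ == "__main__":
    # Compare the exact bound with the 3/N shortcut for the tabulated sizes.
    for n in (100, 500, 1000, 10000):
        exact = upper_bound_unobserved(n)
        print(f"N = {n:>6}: exact = {exact:.6f}, 3/N = {3 / n:.6f}")
```

For α = 0.05 the exact bound is always slightly below 3/N, so the shortcut errs on the conservative side.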


For a Y haplotype that is observed, count the number of times the profile is observed (X)

$p = X/N$

$\mathrm{CI} = p \pm 1.96\sqrt{p(1-p)/N}$


A potential mtDNA statement could be:

I have confidence that the true haplotype frequency is less than the upper CI value


Calculation to Convey to the Court: Population Substructure

• Correction for population structure may be considered

• Effective population size is ¼ that of autosomal loci

• May actually be a little lower

• Substructure effects are smaller in the US than in ancestral populations

• Use when the reference database is considered not representative


Problems created by population subdivision

Haplotype frequencies calculated from population average frequencies could lead to:

– Wrong estimates!


Employ a Theta (θ) Correction

θ is used as a measure of the effects of population subdivision (inbreeding)


NRC θ recommendation was pragmatically set

Empirical values are much less for autosomal loci

National Academy of Sciences, May 1996


Still need to calculate substructure effects

But likely to be low for most major populations, if evaluated under a forensic model rather than an evolutionary model


Calculation to Convey to the Court: Population Substructure

• Population is relevant and representative

• Population is relevant, but not representative

• Population is similar and not representative

• Follow NRC II Recommendations


U.S. Y-STR Haplotype Reference Database (www.ystr.org/usa)

                                    AA           CAU          HIS          Total
Haplotype diversity                 99.8%        99.6%        99.5%        99.7%
Number of populations sampled       10           11           9            30
Number of individuals sampled       599          628          478          1,705
Number of Y-STR loci typed (EMH)    9            9            9            9
Number of different haplotypes      454 (76%)    437 (70%)    354 (74%)    1116 (65%)
Most frequent haplotype             12 (2.0%)    25 (3.98%)   19 (3.97%)   53 (3.1%)

Kayser et al. J. Forensic Sci. (2002), 47(3): 513-519


Structure of U.S. Populations with Y-STR Haplotypes

[Figure: U.S. population samples plotted by Y-STR haplotype similarity, labeled in three groups.
African-American (AA): Virginia, Florida, Maryland, Texas, New York, Pennsylvania, Missouri, Oregon, Indiana, Louisiana.
European-American (EA): Florida, Pennsylvania, New York, Indiana, Missouri, Louisiana, Maryland, Texas, Cajun, Virginia, Oregon.
Hispanic (H): Pennsylvania, Florida, New York, Connecticut, Oregon, Maryland, Louisiana, Texas, Virginia.]

RST = 0.1

RST: measure for population differentiation

Kayser et al. J. Forensic Sci. (2002), 47(3): 513-519


Based on Evolutionary Model

Not a Forensic Model


Computation of Frequency of Lineage-based Marker Profile

Using the general theory, the unconditional frequency of a haplotype (say $A_i$), which is the count divided by the sample size, can be modified to get the conditional probability

$\Pr(A_i \mid A_i) = [p_i^2 + \theta p_i(1 - p_i)]/p_i = p_i + \theta(1 - p_i) = \theta + p_i(1 - \theta)$

Hence, the conditional probability always exceeds θ, the adjustment factor for possible population substructure in the database used

θ becomes the bound


Computation of Frequency of Lineage-based Marker Profile (Contd.)

Some suggest that the quantity $p_i$ in $\Pr(A_i \mid A_i) = \theta + p_i(1 - \theta)$ can be substituted by $(\text{count of } A_i + 2)/(N + 3)$, where N is the sample size. When N is large, this has little effect, but it can be of help when the count of $A_i$ in the database is zero (i.e., the profile in evidence is not seen in the database)
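A minimal sketch of this computation follows, assuming a haplotype count, a database size N and a chosen θ. The function name `match_probability` and the `smooth` flag are illustrative assumptions; the numbers in the usage lines are hypothetical.

```python
def match_probability(count: int, n: int, theta: float, smooth: bool = False) -> float:
    """Conditional haplotype probability theta + p * (1 - theta).

    p is count / n, or (count + 2) / (n + 3) when `smooth` is True
    (the substitution some suggest for haplotypes with a zero count).
    """
    if n <= 0 or count < 0 or count > n:
        raise ValueError("need 0 <= count <= n and n > 0")
    p = (count + 2) / (n + 3) if smooth else count / n
    return theta + p * (1.0 - theta)


if __name__ == "__main__":
    # Hypothetical values: haplotype unseen in a database of 1000, theta = 0.002.
    print(match_probability(0, 1000, 0.002))               # equals theta: theta is the bound
    print(match_probability(0, 1000, 0.002, smooth=True))  # slightly above theta
```

With a zero count and no substitution the result collapses to θ itself, which is the sense in which θ becomes the bound.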


Y STR θ Value

In terms of match versus non-match, how different the haplotypes are is not an issue

The θ values for Y STRs are therefore not computed with mismatch-based approaches (such as AMOVA), but instead by treating all haplotypes as different alleles, which generally leads to a much smaller θ value


What θ to use?

• The value of θ (computed by either RST or FST) is dependent indirectly on the number of Y STR loci comprising the haplotype

• Generally, as more loci are included in the haplotype, most haplotypes in a data set will become differentiated

• Therefore, the greater the number of loci within the haplotypes, the smaller should be the value of θ


What θ to use?

• RST has been used to estimate the value of θ for forensic calculations

• However, the analyses based on RST are better applied to evolutionary biology for studying the phylogenetic history of Y-chromosomes

• The RST values are based on allele size variance, exploiting the extent of difference between different haplotypes

• Such an approach, however, typically does not apply to forensic inferences; forensic applications assess the evidence in terms of match or non-match.


What θ to use?

• In these terms, the θ values for Y STRs should not be computed on mismatch based approaches, but instead by simply treating all haplotypes as different alleles

• Thus, FST (or GST) is a better estimator for θ

• Haplotypes are identified solely by their distinctiveness (i.e., haplotypes are considered simply in terms of identity by state)

• This generally leads to a more appropriate and much smaller θ value than estimated by RST


Y STR haplotype is one locus with many alleles

Population 1: A1, A2, ..., A100

Population 2: A101, A102, ..., A200

Databases with reasonable size approximate this model

θ is almost 0
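The two-population model above can be checked numerically by treating every distinct haplotype as an allele of a single locus. The sketch below uses Nei's G_ST = (H_T − H_S)/H_T without any sample-size correction; that estimator, and the names `gene_diversity` and `gst`, are assumptions chosen for illustration and not necessarily the estimator behind the slides' values.

```python
from collections import Counter


def gene_diversity(freqs):
    """1 - sum of squared allele (haplotype) frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def gst(populations):
    """Uncorrected Nei's G_ST, each distinct haplotype counted as one allele.

    `populations` is a list of samples; each sample is a list of haplotype labels.
    """
    freqs = []
    for sample in populations:
        counts = Counter(sample)
        freqs.append({h: c / len(sample) for h, c in counts.items()})
    # H_S: mean within-population diversity.
    h_s = sum(gene_diversity(f) for f in freqs) / len(freqs)
    # H_T: diversity of the mean frequencies across populations.
    haplotypes = set().union(*freqs)
    mean = {h: sum(f.get(h, 0.0) for f in freqs) / len(freqs) for h in haplotypes}
    h_t = gene_diversity(mean)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


if __name__ == "__main__":
    pop1 = [f"A{i}" for i in range(1, 101)]    # A1 ... A100, each seen once
    pop2 = [f"A{i}" for i in range(101, 201)]  # A101 ... A200, each seen once
    print(f"G_ST = {gst([pop1, pop2]):.4f}")
```

Even though the two populations share no haplotype at all, the identity-by-state measure stays near zero because every haplotype is rare.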


Y STR haplotype is one locus with many alleles

Population 1: A1, A2, ..., An

Population 2: A1', A2', ..., An'

In reality, with a large number of loci, a few types are shared and most, if not all, have never been seen in the database

θ approaches 0


What did he just say?


Forensic Model: Population Substructure

       DYS19  DYS389I  DYS389II  DYS390  DYS391  DYS392  DYS393  DYS385
A ---  14     12       29        24      10      15      12      11-14
E ---  18     12       25        24      10      15      15      11-18
B ---  14     13       29        24      10      15      12      11-14
C ---  14     13       29        24      12      15      12      10-14
D ---  18     11       25        24      10      13      15      12-18

Which haplotypes might be more closely related?


Forensic Model: Population Substructure

       DYS19  DYS389I  DYS389II  DYS390  DYS391  DYS392  DYS393  DYS385
A ---  14     12       29        24      10      15      12      11-14
C ---  14     13       29        24      12      15      12      10-14
Exclusion

E ---  18     12       25        24      10      15      15      11-18
A ---  14     12       29        24      10      15      12      11-14
Exclusion

Are such evolutionary differences considered in forensic evaluation?
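The contrast between the two views can be sketched in code: an evolutionary comparison looks at how many loci differ, whereas the forensic call is only match or exclusion. The haplotype values are the ones listed on the slide; the helper names `differing_loci` and `forensic_call` are illustrative assumptions.

```python
LOCI = ("DYS19", "DYS389I", "DYS389II", "DYS390",
        "DYS391", "DYS392", "DYS393", "DYS385")

HAPLOTYPES = {
    "A": ("14", "12", "29", "24", "10", "15", "12", "11-14"),
    "B": ("14", "13", "29", "24", "10", "15", "12", "11-14"),
    "C": ("14", "13", "29", "24", "12", "15", "12", "10-14"),
    "D": ("18", "11", "25", "24", "10", "13", "15", "12-18"),
    "E": ("18", "12", "25", "24", "10", "15", "15", "11-18"),
}


def differing_loci(h1, h2):
    """Loci at which two haplotypes carry different alleles."""
    return [locus for locus, a, b in zip(LOCI, h1, h2) if a != b]


def forensic_call(h1, h2):
    """Match/non-match only: any difference at all is an exclusion."""
    return "match" if h1 == h2 else "exclusion"


if __name__ == "__main__":
    for x, y in (("A", "B"), ("A", "C"), ("A", "E")):
        diffs = differing_loci(HAPLOTYPES[x], HAPLOTYPES[y])
        call = forensic_call(HAPLOTYPES[x], HAPLOTYPES[y])
        print(f"{x} vs {y}: {len(diffs)} differing loci {diffs} -> {call}")
```

A differs from B at one locus and from E at four, yet both comparisons produce the same forensic result.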


θ Value and Impact (or do we need more Y STR loci?)

[Figure: F-statistics distribution. Vertical axis from 0 to 0.0025; horizontal axis: # of loci (16, 15, 13, 11, 10). Series: Fst, Fst w/o NA, Fsc, Fsc w/o NA, Fct, Fct w/o NA.]


θ Value and Impact

• FST values across major populations range from <0.002 to <0.0001 (depending on the number of loci comprising the haplotype)

• Population-specific values are smaller, excluding Native Americans

• 17 loci: A = 0.00005; C = 0.00016; H = 0.00007. 12 loci: A = 0.00005; C = 0.0002; H = 0.0004

• With the size of current reference population databases, FST rarely will influence the upper bound of the Y STR frequency for most population groups (with the current commercially available systems)

• Increasing the size of reference population data sets and including more populations would be more valuable and a better use of resources for exploiting the full power of Y STR typing

• Dedicating efforts to expanding the battery of Y STR loci will not significantly exploit the power of the assays


Key thoughts

• Diversity

• Most common types in population

• Following NRC II recommendations


Partial Profile θ: Combined Populations

• 0.0008 -- 15 loci

• 0.002 -- 11 loci

• 0.0329 -- 5 loci

• 0.1199 -- 2 loci


Y STR mutations (father:son allele transmission)

Samples: Caucasian (N = 199), Afr Amer (N = 203), Hispanic (N = 207), Asian (N = 83), Total (N = 692)

Locus      Mutations observed (father:son)   Total
DYS391     11:10                             1
DYS389I    12:13                             1
DYS389II   29:30, 30:29                      1*
DYS439     13:12, 11:12                      2
DYS438                                       0
DYS437     15:16, 15:14                      2
DYS19      16:17, 17:16                      2
DYS392                                       0
DYS393     14:15                             1
DYS390                                       0
DYS385     14:15, 14:15, 12,14:14            3

            Caucasian   Afr Amer   Hispanic   Asian    Total
Mutations   2/2388      6/2436     2/2484     3/996    13/8304
Rate        0.00084     0.00246    0.00081    0.0031   0.00157


Y STR mutations (father:son allele transmission)

• 692 confirmed father-son pairs (probability > 99.9%)

• 14 mutation events were observed

• Average rate of $1.57 \times 10^{-3}$/locus/generation (13/8304)

• With a 95% confidence bound of $0.83 \times 10^{-3}$ to $2.69 \times 10^{-3}$

• This rate is a little smaller than the estimate of Kayser et al. ($2.80 \times 10^{-3}$/locus)

• But the difference is not statistically significant (P > 0.05).

One Asian father-son pair at the DYS389I/II locus complex, (12, 29) → (13, 30), appears as a double mutation, but likely is a single original event.
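The pooled rate quoted above follows directly from the per-population counts. The short sketch below only redoes that arithmetic; the dictionary layout and the name `pooled_rate` are illustrative assumptions.

```python
# (mutations, allele transmissions) per population group, as reported on the slides.
MUTATIONS = {
    "Caucasian": (2, 2388),
    "Afr Amer": (6, 2436),
    "Hispanic": (2, 2484),
    "Asian": (3, 996),
}


def pooled_rate(data):
    """Total mutations, total transmissions, and the pooled per-locus rate."""
    mutations = sum(m for m, _ in data.values())
    transmissions = sum(t for _, t in data.values())
    return mutations, transmissions, mutations / transmissions


if __name__ == "__main__":
    m, t, rate = pooled_rate(MUTATIONS)
    print(f"{m}/{t} = {rate:.5f} per locus per generation")
    for group, (gm, gt) in MUTATIONS.items():
        print(f"{group}: {gm}/{gt} = {gm / gt:.5f}")
```

The 8304 transmissions correspond to 692 father-son pairs times 12 loci, with DYS385 counted as two loci.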


Next Task

• Test independence between autosomal loci and Y haplotypes


Independence Testing of Y Haplotype and 13 Autosomal CODIS STR Loci

(Autosomal Locus/Y Haplotype Displaying Disequilibria* - 22 populations)

1. Apache: FGA, p-value = 0.0376; D21S11, p-value = 0.0346; D18S51, p-value = 0.0282; D5S818, p-value = 0.0266

2. Minnesota Asian: D8S1179, $p < 10^{-3}$

3. Minnesota Hispanic: D16S539, p-value = 0.0334; D18S51, p-value = 0.0210

4. Canada African American: FGA, p-value = 0.0092

5. Canada Asian Indian: D7S820, p-value = 0.0282

6. Connecticut African American: FGA, p-value = 0.0430; THO1, p-value = 0.0028

7. Connecticut Caucasian: THO1, p-value = 0.0288

8. Michigan Caucasian: D16S539, p-value = 0.0482


Independence Testing of Y Haplotype and 13 Autosomal CODIS STR Loci (Contd.)

(Autosomal Locus/Y Haplotype Displaying Disequilibria* - 22 populations)

9. Michigan Hispanic: vWA, p-value = 0.0316; FGA, p-value = 0.0224

10. Native American Total: D3S1358, p-value = 0.0268; D21S11, p-value = 0.0006; D18S51, p-value = 0.0084

11. Navajo: D21S11, p-value = 0.0282

12. New York Asian: D16S539, p-value = 0.0074

13. New York Caucasian: D7S820, p-value = 0.0166

14. New York Hispanic: D21S11, p-value = 0.0134

15. Texas African American: D13S317, p-value = 0.0120; D18S51, p-value = 0.0142

16. Texas Hispanic: D5S818, p-value = 0.0188


Next Task

• Mixtures

• Assume 2 alleles for 11 loci

• $2^{11}$ = 2048 possible haplotypes

• Most haplotypes not observed in database

• Assumption of independence not correct

• Minimal haplotype frequency (≈minimum allele frequency) not practical


Mixture

• Probability of Exclusion

• Binomial distribution - haplotypes excluded and haplotypes not excluded

• Count number (m) not excluded; (PI = m/n)

• Estimate upper CI of PI

• PE = 1- PI

• Based on same principles used for autosomal loci (but at haplotype level)
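The counting steps listed above might be sketched as follows. The database here is entirely hypothetical, the normal-approximation upper bound mirrors the CI formula used elsewhere in these slides, and reporting PE as one minus the upper bound of PI is an assumption made for the sketch rather than something the slides state.

```python
import math


def is_included(haplotype, mixture):
    """True if every allele of the haplotype is present in the mixture at its locus."""
    return all(allele in alleles for allele, alleles in zip(haplotype, mixture))


def inclusion_stats(database, mixture, z=1.96):
    """Count non-excluded haplotypes (m of n), PI = m/n, its upper bound, and PE."""
    n = len(database)
    m = sum(is_included(h, mixture) for h in database)
    pi = m / n
    pi_upper = min(1.0, pi + z * math.sqrt(pi * (1.0 - pi) / n))
    return {"m": m, "n": n, "PI": pi, "PI_upper": pi_upper, "PE": 1.0 - pi_upper}


if __name__ == "__main__":
    mixture = ((13, 15), (8, 10), (22, 25))  # alleles seen at three loci
    # Hypothetical database: 2 haplotypes that cannot be excluded, 18 that can.
    database = [(13, 8, 22), (15, 10, 25)] + [(14, 9, 22)] * 18
    print(inclusion_stats(database, mixture))
```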


Mixture: Likelihood Ratio

• Four scenarios for a two-contributor sample in the example:

• Hp --- S1 and S2 are the source

• Hd --- S1 and an unknown are the source (same as PI)

• Hd --- S2 and an unknown are the source (same as PI)

• Hd --- two unknowns are the source


Mixture: Likelihood Ratio

• Assume three loci, two alleles at each locus, two male suspects

• Total alleles the same as in evidence

• Equal contribution

• 8 possible haplotypes


PE/PI

All haplotypes are included

13/15, 8/10, 22/25 --- 3 locus profile

13 8 22 --- haplotype 1
15 8 22 --- haplotype 2
13 8 25 --- haplotype 3
15 8 25 --- haplotype 4
13 10 22 --- haplotype 5
15 10 22 --- haplotype 6
13 10 25 --- haplotype 7
15 10 25 --- haplotype 8


PE/PI

All possible haplotypes are included

13/15, 8/10, 22/25 --- 3 locus profile

13 8 22 --- haplotype 1
15 8 22 --- haplotype 2
13 8 25 --- haplotype 3
15 8 25 --- haplotype 4
13 10 22 --- haplotype 5
15 10 22 --- haplotype 6
13 10 25 --- haplotype 7
15 10 25 --- haplotype 8

But certain haplotype pairs cannot explain the evidence

haplotype 1 + haplotype 2
haplotype 1 + haplotype 3
haplotype 1 + haplotype 4
haplotype 1 + haplotype 5
and so on


PE/PI

All possible haplotypes are included

13/15, 8/10, 22/25 --- 3 locus profile

13 8 22 --- haplotype 1
15 8 22 --- haplotype 2
13 8 25 --- haplotype 3
15 8 25 --- haplotype 4
13 10 22 --- haplotype 5
15 10 22 --- haplotype 6
13 10 25 --- haplotype 7
15 10 25 --- haplotype 8

Only certain haplotype pairs can explain the evidence

haplotype 1 + haplotype 8
haplotype 2 + haplotype 7
haplotype 3 + haplotype 6
haplotype 4 + haplotype 5
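The pairing can be verified by enumeration. In the sketch below the haplotype numbering is copied from the slide, and a pair is taken to explain the evidence when, at every locus, the two haplotypes together carry exactly the two alleles seen in the mixture; the name `pair_explains` is an illustrative assumption.

```python
from itertools import combinations, product

MIXTURE = ((13, 15), (8, 10), (22, 25))  # alleles at the three loci

# Haplotype numbering as used on the slide.
NUMBERED = {
    1: (13, 8, 22), 2: (15, 8, 22), 3: (13, 8, 25), 4: (15, 8, 25),
    5: (13, 10, 22), 6: (15, 10, 22), 7: (13, 10, 25), 8: (15, 10, 25),
}


def pair_explains(h1, h2, mixture=MIXTURE):
    """True if the two haplotypes jointly show exactly the mixture alleles at each locus."""
    return all({a, b} == set(alleles) for a, b, alleles in zip(h1, h2, mixture))


# All 2**3 = 8 single-source haplotypes consistent with the mixture.
all_haplotypes = set(product(*MIXTURE))

explaining_pairs = [
    (i, j)
    for i, j in combinations(sorted(NUMBERED), 2)
    if pair_explains(NUMBERED[i], NUMBERED[j])
]

if __name__ == "__main__":
    print(len(all_haplotypes), "haplotypes;", len(explaining_pairs), "explaining pairs")
    print(explaining_pairs)
```

Of the 28 unordered pairs of distinct haplotypes, only the four complementary pairs survive.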


Mixture: Likelihood Ratio

$\mathrm{LR} = \dfrac{1}{2[\Pr(H_1)\Pr(H_8) + \Pr(H_2)\Pr(H_7) + \Pr(H_3)\Pr(H_6) + \Pr(H_4)\Pr(H_5)]}$
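As a numerical illustration of this likelihood ratio, the sketch below plugs in haplotype frequencies that are purely hypothetical (0.01 for each of the eight haplotypes). The function name `mixture_lr` is an assumption as well.

```python
# Complementary haplotype pairs that can explain the three-locus mixture.
PAIRS = [(1, 8), (2, 7), (3, 6), (4, 5)]


def mixture_lr(freq, pairs=PAIRS):
    """LR = 1 / (2 * sum over explaining pairs of Pr(Hi) * Pr(Hj))."""
    denominator = 2.0 * sum(freq[i] * freq[j] for i, j in pairs)
    if denominator <= 0.0:
        raise ValueError("at least one explaining pair needs a non-zero frequency")
    return 1.0 / denominator


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only.
    freq = {h: 0.01 for h in range(1, 9)}
    print(f"LR = {mixture_lr(freq):.0f}")
```

The sketch also shows the practical difficulty: the formula needs a frequency for every candidate haplotype, and a haplotype never seen in the database contributes zero.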


Mixture LR

• Technically correct

• Cannot estimate individual haplotype frequencies

• $2^{17}$ possible haplotypes (although not all combinations can explain the evidence)

• Assuming independence is not correct

• Cannot place types in database; most never seen, too many


Use same logic as PI for the denominator in the LR

E = excluded; Ē = not excluded

Haplotypes fall into either category

E/E and E/Ē pairs cannot explain the evidence

Only Ē/Ē pairs can explain the evidence, and only a subset of these fit


Mixture LR

m* = those pairs (of Ē/Ē, i.e., both haplotypes not excluded) that explain the evidence

Calculate m*/[n(n-1)] and take its upper CI as the denominator


Mixture LR

$\mathrm{LR} = \dfrac{1}{\mathrm{UCI}\left[m^*/(n(n-1))\right]}$

(UCI = upper bound of the confidence interval)

• The denominator is the PI with an assumed number of contributors

• Makes better use of data
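A rough sketch of this counting version of the LR is given below. It counts ordered pairs of database haplotypes that jointly explain the mixture (m* out of n(n-1)) and uses a normal-approximation upper bound with n(n-1) as the sample size; that choice of sample size, the zero-count fallback, the helper names and the tiny database are all assumptions made for illustration.

```python
import math
from itertools import permutations


def pair_explains(h1, h2, mixture):
    """True if the two haplotypes jointly show exactly the mixture alleles at each locus."""
    return all({a, b} == set(alleles) for a, b, alleles in zip(h1, h2, mixture))


def count_explaining_pairs(database, mixture):
    """Return (m*, n(n-1)): ordered pairs of distinct database entries that explain the mixture."""
    n = len(database)
    m_star = sum(pair_explains(a, b, mixture) for a, b in permutations(database, 2))
    return m_star, n * (n - 1)


def counting_lr(database, mixture, z=1.96, alpha=0.05):
    """LR = 1 / upper bound of m*/(n(n-1)). The bound used here is an assumption."""
    m_star, total = count_explaining_pairs(database, mixture)
    p = m_star / total
    if m_star == 0:
        upper = 1.0 - alpha ** (1.0 / total)  # zero-count bound, as for single haplotypes
    else:
        upper = p + z * math.sqrt(p * (1.0 - p) / total)
    return 1.0 / upper


if __name__ == "__main__":
    mixture = ((13, 15), (8, 10), (22, 25))
    database = [(13, 8, 22), (15, 10, 25), (15, 8, 22), (14, 9, 23)]  # hypothetical
    print(count_explaining_pairs(database, mixture), counting_lr(database, mixture))
```

Ordered pairs from a database are not independent observations, so the bound in this sketch should be read as a placeholder for whatever interval a laboratory validates.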


Online available Y-STR haplotype reference databases / Commercial Kits

To calculate/search matches - haplotype frequencies

Applied Biosystems Yfiler: http://www.appliedbiosystems.com/yfilerdatabase/

Promega PowerPlex Y: http://www.promega.com/techserv/tools/pplexy/default.htm


AB Yfiler


Haplotype data can be input manually or through file upload


There can be no matches when testing this many loci against a database of this size


So using our formula from before and α = 0.05

$1 - \alpha^{1/N}$

$1 - (0.05)^{1/3561} = 0.00084$

But would not use the total data set; break down by population


If there is a match in the database


$\mathrm{CI} = p \pm 1.96\sqrt{p(1-p)/N}$

What p…? What N…?


$\mathrm{CI} = p \pm 1.96\sqrt{p(1-p)/N}$

As before, look at match number and sample size:

$\mathrm{CI} = 0.0006 + 1.96\sqrt{0.0006(0.9994)/3561}$

Upper bound would be: 0.00140


$\mathrm{CI} = p \pm 1.96\sqrt{p(1-p)/N}$

In practice, use the population group in which the match was found (and other populations as appropriate):

$\mathrm{CI} = 0.0016 + 1.96\sqrt{0.0016(0.9984)/1276}$

Upper bound would be: 0.00379
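Both worked upper bounds can be reproduced with a few lines of code. The sketch uses the rounded frequencies quoted on the slides (0.0006 and 0.0016); the function name `wald_upper_bound` is an illustrative assumption.

```python
import math


def wald_upper_bound(p: float, n: int, z: float = 1.96) -> float:
    """Upper end of the normal-approximation interval p + z * sqrt(p(1-p)/n)."""
    if not 0.0 <= p <= 1.0 or n <= 0:
        raise ValueError("need 0 <= p <= 1 and n > 0")
    return p + z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # Total database (N = 3561) and the single population group (N = 1276).
    print(f"{wald_upper_bound(0.0006, 3561):.5f}")
    print(f"{wald_upper_bound(0.0016, 1276):.5f}")
```

The smaller, population-specific database gives the larger upper bound, which is why the choice of N matters as much as the observed count.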


One issue to consider is

• Expressing results when database sizes are different

• 1 in 70

• 1 in 700

• May have nothing to do with variation

• But may be exploited incorrectly by some


Recap Comments on Lineage Markers

• Y STRs are paternally inherited

• Barring mutations, all paternally related persons will have the same haplotype

• Different markers on the Y chromosome are genetically linked with no recombination

• Consequently, Y STR data are treated as a single haplotype; the frequency is NOT multiplicative across markers


Comments on Lineage Markers (Contd.)

• Counting method is one approach that captures the genetic information

• Stated ethnicity of individuals does not necessarily reflect patrilineal ancestry

• e.g., mtDNA of Hispanics may be almost entirely of Native American descent, while for the autosomal STRs, only 30-50% of their genes are of Native American descent

• Thus, grouping of populations by ancestral lineage does not necessarily provide accurate frequency estimates of specific Y STR haplotypes in a forensic context


Summary Steps of Computations: Step 1

• A convenience sample, with subjects included without prior knowledge of their DNA profiles, is an adequate database

• Subjects are assigned to broad population groups according to self-identified ethnicity

• Sample size of XXX to XXX individuals is generally adequate

• Haplotype counts from samples of genetically affine groups may be pooled to enhance the sample size


Calculation to Convey to the Court: Counting Method

• The counting method is very simple

• A Y STR haplotype (evidence sample) is compared to a reference database(s) of unrelated individuals

• The number of times the haplotype is observed in a database

• The size of a database can be and is often limited

• With databases (e.g., n = 100 to 3000), many possible haplotypes will not be observed and there will be sampling error

• A confidence interval can be placed on the observation

• Can convey with a high degree of confidence that the rarity of the evidence Y STR haplotype among unrelated individuals in a given population(s) is less than the upper bound of the estimate

• θ adjustment – see presentation during meeting