

Blackett Family DNA Activity 2

Overview

DNA Analyst Bob Blackett has graciously provided The Biology Project with sample data from his own work. In this activity, you will learn the concepts and techniques behind DNA profiling of the 13 core CODIS "Short Tandem Repeat" loci used for the national DNA databank. You will then have the opportunity to collect and interpret actual STR data, and to answer one or more of the following questions:

1. How is STR data used in a DNA Paternity Test?
2. How can STR data from close relatives be used to create a genetic profile of a missing person?
3. How much genetic diversity exists among siblings?
4. How does one calculate the probability for a specific DNA profile?

Alternatively, you may wish to create your own activities, based on some suggestions for open-ended inquiry that are offered below. This activity is aimed at students with a basic knowledge of DNA structure, Mendelian genetics, and human pedigree analysis. A good preparation for this activity would be to review our problem sets and tutorials in Human Biology.

The Science of STR DNA Profile Analysis

What is a Short Tandem Repeat Polymorphism (STR)?
What are the 13 core CODIS loci?
Methods of Analysis of STRs
Genetics of STR Inheritance
DNA Profile Frequency Calculations

Structured Inquiry Activities for Students

Some representative activities involving data collection, interpretation, and analysis using Bob Blackett's data.

Create a Blackett Family Pedigree
Collecting STR DNA profile data
Paternity Testing with STR Data
DNA Profile of a "Missing Person"
DNA Profile Frequency Calculations

Open Ended Activities for Students

Here we suggest some starting points for in-depth exploration of the topic of STR DNA profiling. If you have additional suggestions for this section, contact the author about including open-ended inquiry that you have developed for your students. If selected, we will cite you and your school.

What is a Short Tandem Repeat Polymorphism (STR)?

STR Polymorphisms

Most of our DNA is identical to the DNA of others. However, there are inherited regions of our DNA that can vary from person to person. Variations in DNA sequence between individuals are termed "polymorphisms". As we will discover in this activity, sequences with the highest degree of polymorphism are very useful for DNA analysis in forensic cases and paternity testing. This activity is based on analyzing the inheritance of a class of DNA polymorphisms known as "Short Tandem Repeats", or simply STRs. STRs are short sequences of DNA, normally 2-5 base pairs long, that are repeated numerous times in a head-to-tail manner. For example, the 16 bp sequence "gatagatagatagata" represents 4 head-to-tail copies of the tetramer "gata". The polymorphisms in STRs are due to the different number of copies of the repeat element that can occur in a population of individuals.
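The idea of counting head-to-tail copies can be sketched in a few lines of Python. This is only an illustration: the function name count_tandem_repeats is our own, and it simply reports the longest uninterrupted run of a given repeat unit in a sequence.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest uninterrupted run of `motif`, measured in repeat units."""
    if not motif:
        raise ValueError("motif must not be empty")
    sequence = sequence.lower()
    motif = motif.lower()
    best = 0
    for start in range(len(sequence)):
        run = 0
        # Count consecutive head-to-tail copies beginning at this position.
        while sequence.startswith(motif, start + run * len(motif)):
            run += 1
        best = max(best, run)
    return best


# The 16 bp example from the text: 4 copies of the tetramer "gata".
print(count_tandem_repeats("gatagatagatagata", "gata"))  # 4
```

The same function could be applied to a longer sequence pasted in as a string. Note that it counts only exact copies, so a variant unit such as "gaca" would end a run.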

D7S820

D7S820 is one of the 13 core CODIS STR genetic loci. This DNA is found on human chromosome 7. The DNA sequence of a representative allele of this locus is shown below. This sequence comes from GenBank, a public DNA database. The tetrameric repeat sequence of D7S820 is "gata". Different alleles of this locus have from 6 to 15 tandem repeats of the "gata" sequence. How many tetrameric repeats are present in the DNA sequence shown below? Notice that one of the tetrameric sequences is "gaca", rather than "gata".

  1 aatttttgta ttttttttag agacggggtt tcaccatgtt ggtcaggctg actatggagt
 61 tattttaagg ttaatatata taaagggtat gatagaacac ttgtcatagt ttagaacgaa
121 ctaacgatag atagatagat agatagatag atagatagat agatagatag atagacagat
181 tgatagtttt tttttatctc actaaatagt ctatagtaaa catttaatta ccaatatttg
241 gtgcaattct gtcaatgagg ataaatgtgg aatcgttata attcttaaga atatatattc
301 cctctgagtt tttgatacct cagattttaa ggcc

What are the 13 core CODIS loci?

A National DNA Databank

The Federal Bureau of Investigation (FBI) of the US has been a leader in developing DNA typing technology for use in the identification of perpetrators of violent crime. In 1997, the FBI announced the selection of 13 STR loci to constitute the core of the United States national database, CODIS. All CODIS STRs are tetrameric repeat sequences. All forensic laboratories that use the CODIS system can contribute to a national database. DNA analysts like Bob Blackett can also attempt to match the DNA profile of crime scene evidence to DNA profiles already in the database. There are many advantages to the CODIS STR system:

The CODIS system has been widely adopted by forensic DNA analysts.
STR alleles can be rapidly determined using commercially available kits.
STR alleles are discrete, and behave according to known principles of population genetics.
The data are digital, and therefore ideally suited for computer databases.
Laboratories worldwide are contributing to the analysis of STR allele frequency in different human populations.
STR profiles can be determined with very small amounts of DNA.

A DNA Profile: The 13 CODIS STR loci

As part of his training and proficiency testing for DNA Profile analysis of STR (Short Tandem Repeat) Polymorphisms, Forensic Scientist and DNA Analyst Bob Blackett created a DNA profile on his own DNA. Here is Bob's DNA Profile for the 13 core Genetic Loci of the United States national database, CODIS (Combined DNA Index System):

Locus      Genotype    Frequency
D3S1358    15, 18      8.2%
vWA        16, 16      4.4%
FGA        19, 24      1.7%
D8S1179    12, 13      9.9%
D21S11     29, 31      2.3%
D18S51     12, 13      4.3%
D5S818     11, 13      13%
D13S317    11, 11      1.2%
D7S820     10, 10      6.3%
D16S539    11, 11      9.5%
THO1       9, 9.3      9.6%
TPOX       8, 8        3.52%
CSF1PO     11, 11      7.2%
AMEL       XY (Male)

For each genetic locus, Bob has determined his "genotype", and the expected frequency of his genotype at each locus in a representative population sample. For example, at the genetic locus known as "D3S1358", Bob has the genotype of "15, 18". This genotype is shared by about 8.2% of the population. By combining the frequency information for all 13 CODIS loci, Bob can calculate that the frequency of his profile would be 1 in 7.7 quadrillion Caucasians (1 in 7.7 times 10 to the 15th power). In Bob's forensic DNA work, he often compares the DNA profile of biological evidence from a crime scene with a known reference sample from a victim or suspect. If any two samples have matching genotypes at all 13 CODIS loci, it is a virtual certainty that the two DNA samples came from the same individual (or an identical twin).

Methods of Analysis of STRs

We will assume that you have a basic understanding of the Polymerase Chain Reaction (PCR), and gel electrophoresis, especially as applied to DNA sequence analysis. We will focus here on the special features of PCR and gel electrophoresis as they are applied to STR characterization. If you are unfamiliar with these techniques, you should still be able to complete this activity.

Methods in Analysis of the 13 CODIS STR loci

1. DNA extraction
DNA can be extracted from almost any human tissue. Buccal cells from the inside of the cheek are most commonly used for paternity tests. Sources of DNA found at a crime scene might include blood, semen, tissue from a deceased victim, cells in a hair follicle, and even saliva. DNA extracted from items of evidence is compared to DNA extracted from reference samples from known individuals.

2. PCR Amplification
DNA primers have been optimized to allow amplification of multiple STR loci in a single reaction mixture. The distance of the primers from the tetrameric repeat sequence is carefully adjusted so that products from different loci do not overlap during gel electrophoresis.

In the partial results shown above, the three STRs D3S1358, vWA, and FGA are being analyzed simultaneously. The lengths of the amplified DNAs are shown by the scale from 100 bp to 280 bp at the top of the figure. The middle panels with multiple peaks are reference standards with the known alleles for each STR locus. Notice that the alleles for the three different loci do not overlap. The lower panel shows the alleles for Bob Blackett's mother Norma for the D3S1358, vWA, and FGA loci. Norma's alleles have been compared by computer to the reference standards, and labeled. Interpreting this result, Norma's genotype is 15, 15 at the locus D3S1358, 14, 16 at vWA, and 24, 25 at FGA.

3. Detection of DNAs after PCR Amplification
The PCR primers in the commercial kits used for STR analysis have fluorescent molecules covalently linked to the primer. To extend the number of different loci that can be analyzed in a single PCR reaction, multiple sets of primers with different "color" fluorescent labels are used. Following the PCR reaction, internal DNA length standards are added to the reaction mixture and the DNAs are separated by length in a capillary gel electrophoresis machine. As DNA peaks elute from the gel, they are detected with laser activation. The sequencing machines used for allele separation and detection are the same type currently being used in the Human Genome Sequencing project, with digital output that can be analyzed by special computer software.

In the AmpFLSTR Profiler Plus PCR Amplification Kit from Applied Biosystems used by Bob Blackett, 9 STRs are analyzed by using three sets of primers. Each set has a different colored fluorescent label. In the figure above, three sets of STRs are represented by blue, three by green, and three by yellow (shown as black) fluorescent peaks. The red peaks are the DNA size standards. Special computer software is used to display the different colors as separate panels of data and determine the exact length of the DNAs. A tenth marker called AMEL is used to distinguish male DNA as X, Y or female DNA as X, X. A second kit, called Cofiler Plus, is used in a second PCR reaction to amplify 4 additional STR loci, plus repeat some of the loci from the Profiler Kit. The result from 2 PCR reactions is the analysis of the entire CODIS set of 13 STRs, with overlap of some loci, and a test for the sex chromosomes. The results are obtained as discrete, digital alleles determined from the exact size of the amplified products compared to known standards.

Genetics of STR Inheritance

Since there are no phenotypes associated with the CODIS STR loci, understanding the genetics of STR inheritance is simplified compared to other genetic problems. We need only consider the genotypes of the parents and their offspring. The alleles of different STR loci are inherited like any other Mendelian genetic markers. Diploid parents each pass on one of their two alleles to their offspring. Here is a brief review of the genetic concepts and terms important for understanding STR allele inheritance. For an in-depth tutorial, visit our Monohybrid Cross problem set.

Allele. The different forms of a gene. Different STR repeat lengths represent different alleles at a genetic locus, e.g. 8 and 9 are different alleles of the THO1 locus.
Locus. The position on a specific chromosome where the different alleles of a genetic marker are located. The plural is loci.
Monohybrid Cross. Genetic cross involving parents differing in only one trait. Inheritance of each of the 13 STR loci can be treated as a separate Monohybrid Cross.
Genotype. The genetic composition of the alleles at a locus. Since we are diploid, we each have two alleles at each locus.
Homozygous. Both alleles at a locus are the same, e.g. Fred has a genotype of 29, 29 at the D21S11 locus.
Heterozygous. Alleles at a locus are not the same, e.g. Norma has a genotype of 29, 31 at the D21S11 locus.
Multiple Allelic Series. Many different alleles at a locus, e.g. the known alleles at the vWA locus are 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, and 21.
Punnett Square. A diagram used to determine all possible genotypes that can occur in a genetic cross. All of the diagrams on this page are Punnett Squares.

Here are some examples of how STR data can be interpreted in a family DNA study. The numbers outside the Punnett Squares are the parental alleles that can be present in the egg or sperm of the parents. The numbers inside the squares are the genotypes possible for the resulting children.

Case 1

If the genotypes of both parents are known, we use a Punnett Square to predict the possible genotypes of their offspring. Each child inherits one allele of a given locus from each parent. Panel (a) - At the D21S11 locus, the children of Bob Blackett and wife Anne can have four different genotypes. Son David is 28, 31. Daughter Katie is 29, 30. Panel (b) - Bob Blackett inherited the 31 allele from his mother, Norma. Therefore the 29 allele is paternal. If Bob's paternal allele were not 29, what would be your conclusion?
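A Punnett Square for a single locus can be sketched in Python as follows. Bob's genotype of 29, 31 is taken from his profile. Anne's genotype of 28, 30 is not stated in the text; it is our inference from the children (David received 28 and Katie received 30 from their mother), so treat it as an assumption. The function name punnett is our own.

```python
from itertools import product


def punnett(parent1, parent2):
    """Return the set of possible offspring genotypes at one locus.

    Each genotype is a sorted pair of alleles, one taken from each parent.
    """
    return {tuple(sorted(pair)) for pair in product(parent1, parent2)}


bob = (29, 31)   # D21S11 genotype from Bob's profile
anne = (28, 30)  # assumption: inferred from the genotypes of David and Katie

children = punnett(bob, anne)
print(sorted(children))  # [(28, 29), (28, 31), (29, 30), (30, 31)]
```

Under this assumption, David (28, 31) and Katie (29, 30) are two of the four possible genotypes.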

Case 2

If the genotypes of a mother and several children are known, it is often possible to unambiguously predict the genotype of the father. In this case, Karen is the mother with a genotype of 9, 9.3 at the THO1 locus. From the Punnett Square we can determine that the paternal alleles of Tiffany, Melissa, and Amanda are 8, 9.3, and 9.3, respectively. Therefore, their father Steve must have a genotype of 8, 9.3. If the three daughters had three different paternal alleles, what would be your conclusion?

Case 3

Sometimes only one allele of the father can be predicted when the genotypes of a mother and several children are known. In this example, the genotype of Karen, the mother, is 16, 17 at the D18S51 locus. The genotypes of the daughters are either 16, 18 or 17, 18. In each case, Melissa, Tiffany, and Amanda inherited the 18 allele from their father, Steve. We cannot determine if the genotype of Steve is homozygous, 18, 18 or 18, ? where the ? means any other allele.
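The reasoning in Cases 2 and 3 can be sketched as a small Python function: for each child, an allele can be paternal only if the child's other allele could have come from the mother. The function name paternal_alleles is our own, and the sketch assumes no mutation and correct maternity.

```python
def paternal_alleles(mother, child):
    """Return the alleles the child could have inherited from the father."""
    first, second = child
    candidates = set()
    if first in mother:
        # The mother could have supplied `first`, so `second` may be paternal.
        candidates.add(second)
    if second in mother:
        # The mother could have supplied `second`, so `first` may be paternal.
        candidates.add(first)
    return candidates


karen = (16, 17)                   # mother's D18S51 genotype (Case 3)
daughters = [(16, 18), (17, 18)]   # the two daughter genotypes seen in Case 3

for genotype in daughters:
    print(genotype, paternal_alleles(karen, genotype))
# Both daughter genotypes yield {18}
```

Because every daughter yields only the 18 allele, the father's second allele stays undetermined, which is the "18, ?" result described above.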

Case 4

Is it possible to determine parental genotypes when only the genotypes of their children are known? Consider the case of Bob Blackett's 4 first cousins, Marilyn, Buddy, Dick and Janet. Bob did not have DNA samples for their parents, Bud and Louise, who are both deceased. In a real forensic case, Bud and Louise might represent "missing persons". In panel (a) we can arrange the 3 known genotypes of the 4 children. In panel (b) we predict the only two parental genotypes that can account for the children. Note that we cannot determine which genotype goes with which parent.

Case 5

A variation on Case 4 is when there are only two genotypes known for the children, and both parental genotypes must be predicted. Panel (a) - Marilyn and Janet are 15, 16 at the locus D3S1358. Buddy and Dick are 18, 18. Panel (b) - The only parental genotypes that can give this result are 15, 18 and 16, 18. Once again, we cannot predict which parent has which genotype.

Case 6

Sometimes the parental genotypes cannot be predicted unambiguously from the genotypes of their children. Marilyn is 16, 17 at the locus vWA. Buddy, Dick, and Janet are 16, 18. What are the parental genotypes? Panel (a) - One interpretation is that the parents are 16, 18 and 16, 17. Panel (b) - Another possibility is that one parent is 17, 18 and the other is 16, ?, where ? is any allele.
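Cases 5 and 6 can be explored with a brute-force search, sketched below in Python. The function names are our own. As a simplifying assumption, the search only considers alleles that were actually observed in the children, so the open-ended "?" allele of Case 6 is only partly covered.

```python
from itertools import combinations_with_replacement, product


def offspring(parent1, parent2):
    """All genotypes a couple can produce at one locus (a Punnett Square)."""
    return {tuple(sorted(pair)) for pair in product(parent1, parent2)}


def parent_pairs(children):
    """List the parental genotype pairs that can account for every child.

    Assumption: parents carry only alleles observed among the children.
    """
    alleles = sorted({allele for child in children for allele in child})
    genotypes = list(combinations_with_replacement(alleles, 2))
    wanted = {tuple(sorted(child)) for child in children}
    return [
        (g1, g2)
        for g1, g2 in combinations_with_replacement(genotypes, 2)
        if wanted <= offspring(g1, g2)
    ]


case5 = parent_pairs([(15, 16), (15, 16), (18, 18), (18, 18)])
print(case5)  # [((15, 18), (16, 18))]

case6 = parent_pairs([(16, 17), (16, 18), (16, 18), (16, 18)])
for pair in case6:
    print(pair)
```

For Case 5 a single pair survives, matching panel (b). For Case 6 several pairs survive, including 16, 17 with 16, 18 and the combinations of 17, 18 with a parent carrying 16, which is why that locus is ambiguous.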

DNA Profile Frequency Calculations

Genotype Probability at any STR Locus

Part of the work of forensic DNA analysis is the creation of population databases for the STR loci studied. Probability calculations are based on knowing allele frequencies for each STR locus for a representative human population (and showing Hardy-Weinberg equilibrium for the population by statistical tests). Allele frequency is defined as the number of copies of the allele in a population divided by the sum of all alleles in a population. For a heterozygous individual, if the two alleles have frequencies of p and q in a population, the probability (P) of an individual having both alleles at a single locus is P = 2pq.

If an individual is homozygous for an allele with a frequency of p, the probability (P) of the genotype is P = p^2.

We saw earlier that Bob Blackett has the genotype 15, 18 at the locus D3S1358. In a reference database of 200 U.S. Caucasians, the frequencies of the alleles 15 and 18 were 0.2825 and 0.1450, respectively. The frequency of the 15, 18 genotype is therefore P = 2 (0.2825) (0.1450) = 0.0819, or 8.2%.
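The two genotype formulas can be written as a short Python function. This is a minimal sketch that assumes Hardy-Weinberg equilibrium, as the text does; the function name and the dictionary layout are our own choices.

```python
def genotype_frequency(allele1, allele2, allele_freq):
    """Expected genotype frequency at one locus under Hardy-Weinberg equilibrium.

    Homozygous genotypes use p^2; heterozygous genotypes use 2pq.
    """
    p = allele_freq[allele1]
    q = allele_freq[allele2]
    if allele1 == allele2:
        return p * p
    return 2 * p * q


# Allele frequencies quoted in the text for D3S1358 (200 U.S. Caucasians).
d3s1358 = {15: 0.2825, 18: 0.1450}

print(round(genotype_frequency(15, 18, d3s1358), 4))  # 0.0819
```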

Probability for a DNA profile of Multiple Loci

If the different loci can be shown by appropriate statistical tests to be independently inherited, the probability for the combined genotype can be determined by multiplication (the product rule). The probability (P) for a DNA profile is the product of the probabilities (P1, P2, ... Pn) for each individual locus, i.e. Profile Probability = (P1) (P2) ... (Pn)

The probability can be an extremely low number when all 13 CODIS STR markers are included in the DNA profile. As mentioned earlier, Bob Blackett calculated his own profile probability at 1.3 times 10^-16, or no more frequent than 1 in 7.7 quadrillion individuals (7.7 million billion), which is more than a million times the population of the planet.
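The product rule is easy to sketch in Python. As an illustration only, the example below multiplies the three genotype frequencies listed in Bob's profile for D3S1358, vWA, and FGA (8.2%, 4.4%, and 1.7%); the function name is our own. The last line simply shows that "1 in 7.7 quadrillion" and "1.3 times 10^-16" are the same figure.

```python
import math


def profile_probability(locus_probabilities):
    """Multiply per-locus genotype probabilities (assumes independent loci)."""
    return math.prod(locus_probabilities)


# Genotype frequencies from Bob's profile: D3S1358, vWA, FGA.
three_loci = [0.082, 0.044, 0.017]

p = profile_probability(three_loci)
print(f"{p:.1e}")            # 6.1e-05
print(f"1 in {1 / p:,.0f}")  # the same value expressed as "1 in N"

# Converting "1 in 7.7 quadrillion" to a probability.
print(f"{1 / 7.7e15:.1e}")   # 1.3e-16
```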

Create a Blackett Family Pedigree

Blackett Family Members

The Blackett Family DNA Activity is largely a genetic study of the inheritance of alleles in an extended family. Bob Blackett has tested DNA samples from himself and 13 other relatives. The first task of a human geneticist is the creation of a family tree, or pedigree, to help with the interpretation of genotypes. From the following relationships, construct a pedigree for Bob and his relatives.

Person     Family Relationship
Bob        Our propositus
Anne       Wife
David      Son
Katie      Daughter
Fred       Father
Norma      Mother
Karen      Sister
Steve      Husband of Karen
Tiffany    Daughter of Karen and Steve
Melissa    Daughter of Karen and Steve
Amanda     Daughter of Karen and Steve
Louise     Sister of Fred; Bob's Aunt
Bud        Husband of Louise
Buddy      Son of Bud and Louise
Dick       Son of Bud and Louise
Janet      Daughter of Bud and Louise
Marilyn    Daughter of Bud and Louise

Would you like to check your answer?

We have prepared a sample pedigree chart of the Blackett family that you can use to check your answer, or to skip this activity if time is limited. There are two options:

View the Pedigree in a new web page.
Download the pdf version. The pdf version, which requires Adobe Acrobat reader for display and printing, might be useful for taking notes during the ensuing activities.

Collecting STR DNA profile data

STR Data for the Blackett Family

These data are from the actual DNA analysis of the Blackett family members by Bob Blackett. The tracings below show the genotypes for three of the 13 CODIS STR loci. In this activity, you will record the data for use in the ensuing genetic analysis of the Blackett family. Data on the other 10 loci will be provided later.

Collect the data for Bob, Anne, David, Katie, Fred and Norma for the "Paternity Testing with STR" Activity.
Collect the data for Karen, Tiffany, Melissa, and Amanda for the "DNA Profile of a Missing Person" Activity.
You will not need to collect the results for Buddy, Dick, Marilyn and Janet. They are provided for you to create your own activity, e.g. can you draw any conclusions about Louise and Bud?

You may wish to collect data in your own databook. If you would like to use partially completed spreadsheets to speed up your data collection, select from the following shortcuts:

Bob, Anne, David, Katie, Fred and Norma, either in Table or pdf format.
Karen, Tiffany, Melissa, and Amanda, either in Table or pdf format.
Completed data for Buddy, Dick, Marilyn, and Janet, in Table or pdf format.

Note: In combining all of the individual profiles into a composite diagram for this activity, the tracings were digitally modified for illustrative purposes.

Paternity Testing with STR Data

In this activity, you will assume the role of a Human Geneticist in a DNA Paternity Testing Laboratory. You have just obtained the DNA Profiles for Bob, Anne, David and Katie. You also have information about Bob's parents, Fred and Norma. In your role as a Human Geneticist, it is not essential that you know all of the laboratory techniques used to obtain the Blackett family genotypes. Your work is based on understanding the principles of Mendelian Genetics as applied to STR loci. Here are your options:

Go immediately to the questions below and interpret the data you have already collected.
Review the principles of genetics needed for this activity.
Use the data that we have collected for you or, if you prefer, use the data that you have already collected.
Download a worksheet for this activity in PDF format with the data that we have collected for you.

Choose from among the following questions to test your understanding of human genetics.

1. Who are the parents of David and Katie?
Do all of the data you have collected on the genotypes of Bob, Anne, Katie, and David support the conclusion that Bob and Anne are the biological parents of David and Katie? You should justify your answer by reference to the specific genotypes for the STR loci.

2. What is the genetic legacy of Fred and Norma?
The alleles that Bob passes on to his children have in turn been inherited from Bob's parents, Fred and Norma. Identify the alleles among the 13 CODIS STR loci in the genotypes of Katie and David that have been unambiguously inherited from each of their paternal grandparents. Now identify any additional alleles that might have been inherited from their paternal grandparents.

3. Genetic Diversity and Sexual Reproduction
Human geneticists are often asked why children have not inherited a particular trait from their parents. As a human geneticist, you know that one mechanism to ensure genetic diversity is the independent assortment of alleles of different loci during gamete (egg and sperm) production, i.e. Mendel's Second Law of Genetics. To illustrate this important genetic principle, calculate how many genotypes would be possible among the children of Bob and Anne for the combined DNA profile from the D3S1358, vWA, and FGA loci. If you feel really ambitious, now calculate the possible genotypes of the children of Bob and Anne for all 13 CODIS STR loci.

4. How many genotypes are possible in a population for a three locus DNA Profile?

If there are two alleles, A and B, at a genetic locus in a population, there are three possible genotypes, namely AA, BB, and AB. If there are three alleles, A or B or C, there are six possible genotypes, namely AA, BB, CC, AB, AC, and BC. For N different alleles, the total possible genotypes is given by the following expression:

Number of genotypes = N (N + 1) / 2

If we assume that the allele reference ladders from our data collection exercise represent all possible alleles (a conservative estimate), how many genotypes are possible in a population for the combined STR loci of D3S1358, vWA, and FGA?

5. How many genotypes are possible in a population for the combined CODIS 13 STR loci?
If you feel really ambitious, you may wish to calculate the number of possible genotypes considering all 13 CODIS STR markers. The table below shows the number of alleles for each locus. Beware, the number will be very large.

Locus      Alleles
D3S1358    8
vWA        11
FGA        14
D8S1179    12
D21S11     22
D18S51     21
D5S818     10
D13S317    8
D7S820     10
D16S539    9
THO1       7
TPOX       8
CSF1PO     10
AMEL       XY
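The genotype count can be sketched in Python. With N alleles there are N homozygous genotypes plus N(N - 1)/2 heterozygous ones, which totals N(N + 1)/2 and reproduces the 3 and 6 genotypes of the two small examples above. The function and variable names are our own, and the allele counts are those listed for the reference ladders.

```python
import math


def genotype_count(n_alleles: int) -> int:
    """Number of distinct genotypes at a locus with n_alleles alleles."""
    return n_alleles * (n_alleles + 1) // 2


# Allele counts from the reference ladders for the three loci collected.
ladder_alleles = {"D3S1358": 8, "vWA": 11, "FGA": 14}

for locus, n in ladder_alleles.items():
    print(locus, genotype_count(n))

# Independent loci: multiply the per-locus counts.
combined = math.prod(genotype_count(n) for n in ladder_alleles.values())
print(combined)  # 249480
```

Extending the dictionary with the remaining ten loci gives the 13-locus total in the same way.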

DNA Profile of a "Missing Person"

In this activity you will assume the role of a forensic DNA analyst. Your task will be to determine the DNA profile for a "missing person" from the analysis of close family members. DNA analysts often have to recreate genotypes for those whose DNA is not readily available for analysis. A recent case of great national interest was the identification of the remains of the Vietnam soldier who had been interred in the Tomb of the Unknown Soldier. Here are your options:

Go immediately to the questions below and interpret the data you have already collected.
Review the principles of genetics needed for this activity.
Skip the data collection, and use the data that we have collected for you for Question #1.
Download a worksheet for this activity in PDF format with the data that we have collected for you for Question #1.
Completed data for Buddy, Dick, Marilyn, and Janet in Table or pdf format for Question #2.

1. What is Steve's Genotype?
In our activity, we obtained data for Karen and her three daughters, Tiffany, Melissa, and Amanda. Bob Blackett has not yet had the opportunity to test the DNA of Steve, so Steve can play the role of the "missing person" in our activity. Determine Steve's genotype at the 13 CODIS STR loci. Indicate whether there is an unambiguous genotype where both alleles are known, or some uncertainty about the paternal alleles.

2. What are the Genotypes of Bud and Louise?
What happens when we have two missing people? Human geneticists are often asked to determine if adult children in the same family all have the same biological parents. Demonstrate that all of the genetic information for the children of Bud and Louise is consistent with all 4 having the same two parents.

DNA Profile Frequency Calculations

In this activity, you can calculate the probability for some of the DNA profiles you have been studying. The following tables give allele frequencies for the three STR loci D3S1358, vWA, and FGA for a combined Caucasian population. The frequency data come from the web site of the Royal Canadian Mounted Police.

Locus      Allele    Frequency
D3S1358    12        0.015
D3S1358    13        0.015
D3S1358    14        0.1341
D3S1358    15        0.2896
D3S1358    16        0.2287
D3S1358    17        0.1616
D3S1358    18        0.1616
D3S1358    19        0.0152

Locus      Allele    Frequency
VWA        12        0.015
VWA        14        0.1311
VWA        15        0.1189
VWA        16        0.186
VWA        17        0.2774
VWA        18        0.189
VWA        19        0.0884
VWA        20        0.015

Locus      Allele    Frequency
FGA        18        0.015
FGA        19        0.061
FGA        20        0.125
FGA        21        0.1799
FGA        22        0.2287
FGA        23        0.1311
FGA        24        0.1463
FGA        25        0.0945
FGA        26        0.0183
FGA        27        0.015

1. What is the Probability for a 3-Locus DNA profile?
Based on a population database of Caucasians developed by Bob Blackett and colleagues in Arizona, Bob can calculate the genotype frequency of his combined profile for the three STR loci D3S1358, vWA, and FGA to be 6 x 10^-5. Compare this frequency with the frequency you calculate from the Royal Canadian Mounted Police data. For help with this calculation, review the DNA Profile Frequency Calculation page.

2. Genotype Frequency for the 13 CODIS STR Loci.
If you feel really ambitious, retrieve additional frequency data for the other 10 STR Loci from the web site of the Royal Canadian Mounted Police and calculate the genotype frequency for all 13 STR loci for one or more individuals from this study.

3. Check your answers.
As an alternative to doing all of the arithmetic yourself, you can calculate a profile's Random Match Probability using the RCMP on-line calculator.
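If you would like to check your arithmetic for Question 1 locally, the calculation can be sketched in Python. The allele frequencies are the Royal Canadian Mounted Police values listed above, and Bob's genotypes come from his profile (D3S1358 15, 18; vWA 16, 16; FGA 19, 24). The function names and data layout are our own, and the sketch assumes Hardy-Weinberg equilibrium and independent loci.

```python
import math

# Allele frequencies from the RCMP tables in this activity.
RCMP = {
    "D3S1358": {12: 0.015, 13: 0.015, 14: 0.1341, 15: 0.2896,
                16: 0.2287, 17: 0.1616, 18: 0.1616, 19: 0.0152},
    "vWA": {12: 0.015, 14: 0.1311, 15: 0.1189, 16: 0.186,
            17: 0.2774, 18: 0.189, 19: 0.0884, 20: 0.015},
    "FGA": {18: 0.015, 19: 0.061, 20: 0.125, 21: 0.1799, 22: 0.2287,
            23: 0.1311, 24: 0.1463, 25: 0.0945, 26: 0.0183, 27: 0.015},
}


def genotype_frequency(locus, allele1, allele2):
    """p^2 for a homozygote, 2pq for a heterozygote."""
    p = RCMP[locus][allele1]
    q = RCMP[locus][allele2]
    return p * p if allele1 == allele2 else 2 * p * q


def profile_frequency(profile):
    """Product rule across all loci in the profile."""
    return math.prod(
        genotype_frequency(locus, a1, a2) for locus, (a1, a2) in profile.items()
    )


bob = {"D3S1358": (15, 18), "vWA": (16, 16), "FGA": (19, 24)}
print(f"{profile_frequency(bob):.1e}")  # 5.8e-05
```

The result is close to the 6 x 10^-5 that Bob obtained from the Arizona database; small differences are expected because the two databases report slightly different allele frequencies.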