Lab 3 : Exact tests and Measuring of Genetic Variation

χ2 test

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Where,

$O_i$ = Observed count of genotype i

$E_i$ = Expected count of genotype i

$k$ = Number of genotypes

$df = k - (\text{\# of estimated parameters}) - 1$

Chi-square Assumptions:

1. Finite # of observations.

2. Observations are independent.

3. Samples collected randomly.

4. Large sample size (>20; >50)

Example: Suppose you caught 5 Bluegill fish and detected two alleles (A1 and A2) and observed that all 5 fish were A1A2 heterozygotes. Calculate allele frequencies and do a χ2 – test to determine whether the population is in HWE.

$$p = \frac{N_{11} + \frac{1}{2}N_{12}}{N} = \frac{0 + \frac{1}{2}(5)}{5} = 0.5, \qquad q = 1 - p = 1 - 0.5 = 0.5$$

Genotype   Observed (O)   Expected (E)   O - E    (O - E)^2   (O - E)^2 / E
A1A1       0              1.25           -1.25    1.5625      1.25
A1A2       5              2.5            2.5      6.25        2.5
A2A2       0              1.25           -1.25    1.5625      1.25

χ2 = 1.25 + 2.5 + 1.25 = 5

Conclusion: Reject H0 at α = 0.05, because the calculated χ2 value (= 5) is greater than the critical χ2 value with 1 d.f. (≈ 3.84), i.e. the Bluegill population is not in HWE.
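The worked example above can be checked with a few lines of Python. This is a minimal sketch, assuming a single locus with two alleles that are both present in the sample; the function name `hwe_chi_square` is illustrative, and the critical value 3.84 is the one quoted in the conclusion.

```python
def hwe_chi_square(n11, n12, n22):
    """Chi-square goodness-of-fit statistic for HWE at a two-allele locus."""
    n = n11 + n12 + n22
    p = (n11 + 0.5 * n12) / n          # frequency of allele A1
    q = 1.0 - p                        # frequency of allele A2
    observed = [n11, n12, n22]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - 1         # k genotypes - 1 estimated parameter (p) - 1
    return p, q, expected, chi2, df


# Five bluegill, all A1A2 heterozygotes
p, q, expected, chi2, df = hwe_chi_square(0, 5, 0)
reject_h0 = chi2 > 3.84                # critical value for 1 d.f. at alpha = 0.05
print(p, q, expected, chi2, df, reject_h0)
```

The sketch reproduces p = q = 0.5, expected counts of 1.25, 2.5 and 1.25, and χ2 = 5 with 1 d.f.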

Why is the previous conclusion not reliable?

Because it violates the assumption of a large sample size.

As a rule of thumb, the Chi-square test should not be used when the expected number for any genotype class is less than 5.
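The rule of thumb can be applied before running the test. The sketch below is illustrative only: the helper name `small_expected_classes` and the default threshold argument are assumptions, and it handles a single two-allele locus.

```python
def small_expected_classes(n11, n12, n22, threshold=5.0):
    """Return the genotype classes whose expected count under HWE is below threshold."""
    n = n11 + n12 + n22
    p = (n11 + 0.5 * n12) / n
    q = 1.0 - p
    expected = {"A1A1": p * p * n, "A1A2": 2 * p * q * n, "A2A2": q * q * n}
    return {genotype: e for genotype, e in expected.items() if e < threshold}


# Bluegill example: every class is flagged, so the chi-square test is not appropriate
flagged = small_expected_classes(0, 5, 0)
print(flagged)
```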

Exact Test

1. Calculate the probability of observing N11 = 0, N12 = 5, N22 = 0 under HWE using the multinomial probability equation.

2. Generate all possible permutations of 5 A1 alleles and 5 A2 alleles into 3 genotypes, i.e. 10! = 3,628,800.

3. Calculate the probability of observing each of these samples under HWE using the multinomial probability equation.

4. Determine the proportion of samples whose probability is ≤ 0.0313.

5. If this proportion (the p-value) is less than 0.05, then reject H0 at α = 0.05.
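Steps 1 and 3 rely on the multinomial probability of a set of genotype counts under HWE, P = N! / (N11! N12! N22!) × (p²)^N11 × (2pq)^N12 × (q²)^N22. A minimal sketch follows; the function name is illustrative, and p = 0.5 is the allele frequency estimated in the bluegill example.

```python
from math import factorial


def multinomial_hwe_probability(n11, n12, n22, p):
    """Multinomial probability of genotype counts (N11, N12, N22) under HWE."""
    q = 1.0 - p
    n = n11 + n12 + n22
    # Number of ways to assign the n individuals to the three genotype classes
    coefficient = factorial(n) // (factorial(n11) * factorial(n12) * factorial(n22))
    return coefficient * (p * p) ** n11 * (2 * p * q) ** n12 * (q * q) ** n22


# The three genotype configurations possible with 5 A1 and 5 A2 alleles
for counts in [(0, 5, 0), (1, 3, 1), (2, 1, 2)]:
    print(counts, f"{multinomial_hwe_probability(*counts, p=0.5):.5f}")
```

For the observed sample (0, 5, 0) this gives 0.03125, which is the 0.0313 threshold used in step 4.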

Genotype (individuals 1 to 5)

1      2      3      4      5      N11   N12   N22   Probability
A1A1   A1B1   B1B1   B1A1   A1B1   1     3     1     0.1563
A1A1   B1B1   B1B1   A1A1   A1B1   2     1     2     0.0586
A1B1   B1B1   A1B1   A1A1   B1A1   1     3     1     0.1563
A1B1   B1A1   A1B1   A1B1   A1B1   0     5     0     0.0313
B1A1   A1B1   A1A1   A1B1   B1B1   1     3     1     0.1563
A1A1   B1A1   B1A1   B1B1   B1A1   1     3     1     0.1563
B1A1   A1A1   B1B1   A1B1   B1A1   1     3     1     0.1563
B1A1   A1B1   A1B1   A1A1   B1B1   1     3     1     0.1563
A1A1   B1A1   B1B1   B1B1   A1A1   2     1     2     0.0586
A1B1   B1A1   B1A1   A1A1   B1B1   1     3     1     0.1563
B1B1   B1A1   B1A1   A1B1   A1A1   1     3     1     0.1563
B1B1   B1A1   A1A1   A1B1   B1A1   1     3     1     0.1563
A1A1   A1A1   B1B1   B1B1   A1B1   2     1     2     0.0586
B1A1   B1A1   A1B1   A1A1   B1B1   1     3     1     0.1563
B1A1   B1B1   A1A1   B1A1   A1B1   1     3     1     0.1563
A1B1   B1B1   A1A1   B1B1   A1A1   2     1     2     0.0586
B1A1   A1B1   B1A1   B1A1   A1B1   0     5     0     0.0313
B1A1   B1A1   A1A1   B1B1   A1B1   1     3     1     0.1563
B1A1   A1A1   B1A1   A1B1   B1B1   1     3     1     0.1563
A1A1   B1A1   A1A1   B1B1   B1B1   2     1     2     0.0586
A1A1   B1A1   B1B1   A1B1   A1B1   1     3     1     0.1563
B1B1   B1A1   A1A1   A1B1   A1B1   1     3     1     0.1563
A1B1   B1B1   B1B1   A1A1   A1A1   2     1     2     0.0586
B1B1   B1A1   B1A1   A1A1   A1B1   1     3     1     0.1563
B1B1   A1A1   B1A1   B1B1   A1A1   2     1     2     0.0586
B1B1   B1A1   A1B1   A1A1   B1A1   1     3     1     0.1563
A1A1   B1A1   B1B1   A1A1   B1B1   2     1     2     0.0586
A1A1   A1B1   B1B1   B1A1   A1B1   1     3     1     0.1563
A1B1   A1A1   B1A1   B1B1   B1A1   1     3     1     0.1563
A1B1   A1B1   A1B1   A1B1   B1A1   0     5     0     0.0313

[Figure: histogram of the samples by probability. x-axis: Probability (0.00 to 0.20); y-axis: No. of samples (0 to 20). The bar at probability 0.0313 contains 3 samples.]

p-value = (number of samples with probability ≤ 0.0313) / (total number of samples) = 3/30 = 0.10

Conclusion: Because the p-value is greater than 0.05, we cannot reject H0 at α = 0.05, i.e. the data are consistent with the bluegill population being in HWE.
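The procedure above can be approximated by shuffling the 10 alleles at random instead of listing every permutation. The sketch below is one possible Monte Carlo version, not the method used by any particular program; the function names, the number of shuffles and the seed are all assumptions chosen for illustration.

```python
import random
from math import factorial


def multinomial_hwe_probability(n11, n12, n22, p):
    """Multinomial probability of genotype counts (N11, N12, N22) under HWE."""
    q = 1.0 - p
    n = n11 + n12 + n22
    coefficient = factorial(n) // (factorial(n11) * factorial(n12) * factorial(n22))
    return coefficient * (p * p) ** n11 * (2 * p * q) ** n12 * (q * q) ** n22


def genotype_counts(alleles):
    """Pair consecutive alleles into genotypes and count N11, N12, N22."""
    n11 = n12 = n22 = 0
    for a, b in zip(alleles[0::2], alleles[1::2]):
        if a != b:
            n12 += 1
        elif a == "A1":
            n11 += 1
        else:
            n22 += 1
    return n11, n12, n22


def permutation_exact_test(n11, n12, n22, n_shuffles=10_000, seed=1):
    """Estimate the exact-test p-value by random permutation of the alleles."""
    rng = random.Random(seed)
    n = n11 + n12 + n22
    alleles = ["A1"] * (2 * n11 + n12) + ["A2"] * (2 * n22 + n12)
    p = alleles.count("A1") / (2 * n)
    observed = multinomial_hwe_probability(n11, n12, n22, p)
    as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(alleles)
        prob = multinomial_hwe_probability(*genotype_counts(alleles), p)
        if prob <= observed * (1 + 1e-9):   # small tolerance for float comparison
            as_extreme += 1
    return as_extreme / n_shuffles


p_value = permutation_exact_test(0, 5, 0)
print(p_value)
```

With many shuffles the estimate should settle at roughly 0.13, in the same range as the 0.10 obtained from the 30 samples above, and the conclusion at α = 0.05 is unchanged.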

Generation of all possible samples and calculation of probability for each sample is computationally intensive. It will require too much time and is practically impossible for large samples.

In practice, exact tests are done by sampling a distribution generated from a Markov Chain (beyond the scope of this course).

Measures of Genetic Variation

1. Heterozygosity (Gene diversity).

2. Number of alleles (Allele diversity).

3. Effective number of alleles.

4. Percentage of polymorphic loci.

1. Heterozygosity (Gene diversity)

- Most commonly used measure of genetic variation.

- Observed heterozygosity (HO) = Proportion of heterozygotes in a sample.

- Expected heterozygosity (HE) = Heterozygosity expected under HWE.

$$H_E = 1 - \sum_{i=1}^{n} p_i^2$$

where $\sum_{i=1}^{n} p_i^2$ = Expected homozygosity under HWE = $p_1^2 + p_2^2 + p_3^2 + \dots + p_n^2$

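For small sample sizes (< 50), an unbiased HE can be calculated by:

$$H_E = \frac{2N}{2N - 1}\left(1 - \sum_{i=1}^{n} p_i^2\right)$$

where N is the number of individuals sampled.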
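The heterozygosity measures can be sketched as follows. The function names are illustrative, and the small-sample version assumes the usual correction factor 2N / (2N - 1), where N is the number of individuals; the data are the five bluegill heterozygotes from the earlier example.

```python
def observed_heterozygosity(genotypes):
    """Proportion of heterozygotes; genotypes is a list of (allele, allele) pairs."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(freqs):
    """HE = 1 - sum of squared allele frequencies (expected under HWE)."""
    return 1.0 - sum(p * p for p in freqs)


def unbiased_expected_heterozygosity(freqs, n_individuals):
    """HE with the assumed small-sample correction 2N / (2N - 1)."""
    two_n = 2 * n_individuals
    return two_n / (two_n - 1) * expected_heterozygosity(freqs)


genotypes = [("A1", "A2")] * 5          # five A1A2 bluegill
freqs = [0.5, 0.5]                      # p and q from the example

ho = observed_heterozygosity(genotypes)
he = expected_heterozygosity(freqs)
uhe = unbiased_expected_heterozygosity(freqs, len(genotypes))
print(ho, he, round(uhe, 4))
```

Here HO = 1.0 is well above HE = 0.5, which is the excess of heterozygotes that the HWE tests above were examining.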

2. Number of Alleles (n):

- Number of alleles present at a locus in a population.

- Also called allele diversity (A).

- Strongly influenced by sample size.

3. Effective number of Alleles (ne): The number of alleles a population would need, if all alleles were at equal frequency, to have the same expected homozygosity.

$$n_e = \frac{1}{\sum_{i=1}^{n} p_i^2}$$

4. Proportion of polymorphic loci (P):

- Not so useful for highly variable loci like microsatellites.

- Subject to locus selection bias.
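The relationship between the number of alleles and the effective number of alleles (item 3) is easiest to see with numbers. In this minimal sketch the function name and both sets of allele frequencies are hypothetical, chosen only to contrast an even and a skewed distribution.

```python
def effective_number_of_alleles(freqs):
    """ne = 1 / sum of squared allele frequencies."""
    return 1.0 / sum(p * p for p in freqs)


even = [0.25, 0.25, 0.25, 0.25]         # hypothetical: four equally frequent alleles
skewed = [0.7, 0.1, 0.1, 0.1]           # hypothetical: one common, three rare alleles

ne_even = effective_number_of_alleles(even)
ne_skewed = effective_number_of_alleles(skewed)
print(len(even), ne_even)
print(len(skewed), round(ne_skewed, 3))
```

Both loci have n = 4 alleles, but ne equals n only when the frequencies are equal; the more uneven the frequencies, the further ne falls below n.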

GenAlEx

Problem 1. Use GenAlEx to perform the following analyses based on the human SSR data:

a. Calculate the genetic variation measures HO, HE, Na, and Ne for all loci in all populations. Include the estimated values of these measures for all loci in a population you will be assigned during the lab. What can you conclude about the allele frequencies of the 10 loci by comparing Na to Ne?

b. Calculate the average HO and HE across loci for your assigned population. Can you predict anything about the test of HWE based on these values?

c. GRADUATE STUDENTS ONLY: For your assigned population, calculate the average HO and HE weighted by the number of individuals for which data were available.

d. Perform a Chi-square test of HWE for all loci in all populations. Include a summary of the test for your assigned population in the lab report. How do you interpret the results of this test?
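For part (c), a weighted average multiplies each locus value by the number of individuals scored at that locus before dividing by the total. The sketch below only illustrates the arithmetic: the function name is an assumption and the per-locus numbers are hypothetical, not taken from the human SSR data.

```python
def weighted_mean(values, weights):
    """Average of values, each weighted by the matching entry in weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical per-locus values for three loci (not from the lab data set)
ho_per_locus = [0.80, 0.60, 0.70]
n_per_locus = [50, 40, 10]              # individuals with data at each locus

unweighted_ho = sum(ho_per_locus) / len(ho_per_locus)
weighted_ho = weighted_mean(ho_per_locus, n_per_locus)
print(round(unweighted_ho, 3), round(weighted_ho, 3))
```

Loci with more missing data contribute less to the weighted average, so the two averages differ whenever sample sizes vary across loci.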

Problem 2. Perform an exact test of HWE for all loci and all populations using Arlequin. Include the results for your assigned population in the lab report. How do these results compare to those from the Chi-square test and why? Which test do you trust more?

1. Go to the data worksheet and select the Export Data submenu of GenAlEx.

2. Select Arlequin and click OK on the export parameters window.

3. Save the file with an extension “.arp” in the class data folder. You could, for example, name your file human_ssr.arp