Marcy K. Uyenoyama Department of Biology Duke University Genomic Conflict and DNA Sequence Variation


Marcy K. Uyenoyama

Department of Biology

Duke University

Genomic Conflict and DNA Sequence Variation

Page 2: Overview

• Population genetics
– Historically model-rich
– Present need: model-based interpretation of observed patterns of genomic variation
– What are hallmarks of each model?

• Self-incompatibility systems in plants
– Recognizing genomic conflict due to sexual antagonism

Page 3: Canonical models

• Neutral evolution
– Pure neutrality: distribution of offspring number is independent of any trait in the parent
– Demographic history: deme founding, gene flow
– Purifying selection: maintain functioning state against random deleterious mutations

• Selection
– Balancing selection: maintenance of different forms
– Selective sweeps: substitution of most fit for less fit

Page 4: Hallmarks of evolution

• How do we know it when we see it?
– Patterns evident in genome variation

• Model selection
– Choosing among a small number of canonical models for any particular system

Page 5: A random sample of genes

[Figure labels: ancestral sequence; sample; observed]

Page 6: Allele and mutation spectra

[Figure: site frequency spectrum. x-axis: multiplicity (1–6, 17); y-axis: number of mutations (0–7)]

a = {a_1 = 6, a_3 = 1, a_5 = 1, a_6 = 1}, for a_i the number of alleles with multiplicity i
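The allele spectrum fixes both the sample size and the number of distinct alleles. A minimal sketch, with illustrative function names that are not from the slides, recovers both from the spectrum quoted above:

```python
# Allele spectrum quoted on the slide: a_i = number of alleles with multiplicity i.
allele_spectrum = {1: 6, 3: 1, 5: 1, 6: 1}


def sample_size(a):
    """Total number of genes in the sample: n = sum over i of i * a_i."""
    return sum(i * count for i, count in a.items())


def number_of_alleles(a):
    """Number of distinct alleles (haplotypes): k = sum over i of a_i."""
    return sum(a.values())


print(sample_size(allele_spectrum), number_of_alleles(allele_spectrum))
```

For this spectrum the sample holds n = 20 genes carrying k = 9 distinct alleles.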

Page 7: The neutral coalescent

Sample root from stationary distribution of P, the mutation transition matrix, and bifurcate

After an interval $t \sim \exp(\lambda_1 + \lambda_2)$, choose a lineage at random (for $\lambda_1$ the rate of bifurcation and $\lambda_2$ the rate of mutation)

– Replace it by two identical copies with probability $\lambda_1 / (\lambda_1 + \lambda_2)$

– Mutate it according to P with probability $\lambda_2 / (\lambda_1 + \lambda_2)$

Page 8: Evolutionary rates

• Events on level k
– Bifurcation at rate $\lambda_1 = \binom{k}{2} / N$
– Mutation at rate $\lambda_2 = ku$

• Population parameters: ratios of rates
– Next event is a bifurcation/coalescence with probability

$$\frac{\lambda_1}{\lambda_1 + \lambda_2} = \frac{\binom{k}{2}/N}{\binom{k}{2}/N + ku} = \frac{k-1}{k-1+\theta}, \qquad \text{for } \theta = \lim_{N \to \infty,\ u \to 0} 2Nu$$
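The reduction of the rate ratio to a function of θ alone can be checked numerically. The sketch below assumes level-k rates of C(k, 2)/N for bifurcation and ku for mutation, with θ = 2Nu; the function names are illustrative only.

```python
from math import comb


def bifurcation_probability(k, N, u):
    """Probability that the next event on level k is a bifurcation,
    computed directly from the two competing rates (assumed forms)."""
    rate_bifurcation = comb(k, 2) / N  # assumed rate: C(k, 2) / N
    rate_mutation = k * u              # assumed rate: k * u
    return rate_bifurcation / (rate_bifurcation + rate_mutation)


def bifurcation_probability_theta(k, theta):
    """The same probability written in terms of theta = 2Nu only."""
    return (k - 1) / (k - 1 + theta)


# N = 10_000 and u = 1e-4 give theta = 2.
for k in (2, 5, 20):
    print(k, bifurcation_probability(k, 10_000, 1e-4),
          bifurcation_probability_theta(k, 2.0))
```

Only the compound parameter θ enters the event probabilities, which is why N and u cannot be estimated separately from such a sample.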

Page 10: Infinite-alleles model

• Mutation
– Novel allelic types formed at rate u per gene per generation

• Reproduction
– Frequency of allele i in the parental population: p_i
– Multinomial sampling of N genes to form the offspring

To find: probability of the sample of n genes, (n_1, n_2, …, n_k) or (a_1, a_2, …, a_n), for
– k the number of distinct haplotypes (alleles)
– n_i the number of replicates of allele i
– a_i the number of alleles with i replicates

Page 11: Ewens sampling formula

$$p(\mathbf{a}) = \frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)} \prod_{i=1}^{n} \left(\frac{\theta}{i}\right)^{a_i} \frac{1}{a_i!}$$

a = (a_1, a_2, …, a_n), for a_i the number of alleles represented by i replicates in a sample of size n

θ = 2Nu, for N the effective number of genes and u the per-locus, per-generation rate of mutation

Ewens (1972, Theoretical Population Biology)
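As a sanity check on the Ewens sampling formula, the sketch below evaluates it for an allele spectrum stored as a dictionary (multiplicity i mapped to a_i). The dictionary representation and the function name are assumptions made for illustration.

```python
from math import factorial


def ewens_probability(a, theta):
    """Ewens sampling formula for an allele spectrum.

    a     : dict mapping multiplicity i to a_i (alleles with i replicates)
    theta : scaled mutation rate, theta = 2Nu
    """
    n = sum(i * count for i, count in a.items())
    rising = 1.0
    for j in range(n):          # theta (theta + 1) ... (theta + n - 1)
        rising *= theta + j
    prob = factorial(n) / rising
    for i, count in a.items():  # product of (theta / i)^a_i / a_i!
        prob *= (theta / i) ** count / factorial(count)
    return prob


# The three possible spectra for a sample of size n = 3.
spectra_n3 = [{3: 1}, {1: 1, 2: 1}, {1: 3}]
print(sum(ewens_probability(a, 1.5) for a in spectra_n3))
```

Summing over every possible spectrum for a given n should give 1 for any θ, which is a quick way to catch a transcription error in the formula.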

Page 13: Population genomics

http://www.arabidopsis.org

– About 750 accessions isolated from natural populations worldwide
– Summary statistics for sample of 19 entire genomes

Page 14: Arabidopsis SNP spectra

Kim et al. (2007 Nature Genetics 39: 1151)

Site frequency spectra differ among functional classes

[Figure: x-axis: minor allele counts (2–8)]

Page 15: ESF conditioned on two alleles

• Biallelic sample of size m

• Multiplicities i and (m – i)

$$P(K=2 \mid m) = \theta \sum_{l=1}^{m-1} \frac{1}{l} \prod_{j=1}^{m-1} \frac{j}{j+\theta}$$

$$P(a_i = 1, a_{m-i} = 1 \mid K=2, m) = \frac{1/i + 1/(m-i)}{\sum_{j=1}^{m-1} 1/j} \quad \text{for } i < m/2$$

$$P(a_{m/2} = 2 \mid K=2, m) = \frac{2/m}{\sum_{j=1}^{m-1} 1/j}$$

independent of θ!
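The independence from θ can be verified by conditioning the Ewens sampling formula directly on two alleles and comparing against the closed form (1/i + 1/(m – i)) / Σ 1/j. This is a sketch under that reading of the slide; the helper names are illustrative.

```python
from math import factorial


def ewens_probability(a, theta):
    """Ewens sampling formula; a maps multiplicity i to a_i."""
    n = sum(i * count for i, count in a.items())
    rising = 1.0
    for j in range(n):
        rising *= theta + j
    prob = factorial(n) / rising
    for i, count in a.items():
        prob *= (theta / i) ** count / factorial(count)
    return prob


def two_allele_spectrum(m, theta):
    """P(multiplicities i and m - i | K = 2, m), obtained by renormalizing
    ESF probabilities over all two-allele spectra (i = 1 .. m // 2)."""
    probs = {}
    for i in range(1, m // 2 + 1):
        spectrum = {i: 2} if 2 * i == m else {i: 1, m - i: 1}
        probs[i] = ewens_probability(spectrum, theta)
    total = sum(probs.values())
    return {i: p / total for i, p in probs.items()}


def two_allele_spectrum_closed_form(m):
    """Closed form that involves harmonic sums only, with no theta."""
    harmonic = sum(1 / j for j in range(1, m))
    return {
        i: ((2 / m) if 2 * i == m else (1 / i + 1 / (m - i))) / harmonic
        for i in range(1, m // 2 + 1)
    }


print(two_allele_spectrum(10, 0.5))
print(two_allele_spectrum(10, 5.0))
print(two_allele_spectrum_closed_form(10))
```

The three printed distributions should agree up to floating-point error, whatever value of θ is supplied.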

Page 17: Actual site frequency spectra

Excess of rare and common types, deficiency of intermediate types

• Data from NIEHS Environmental Genome Project
– Direct resequencing of loci considered environmentally sensitive
– Global representation of ethnicities

Hernandez, Williamson, and Bustamante (2007)

Page 18: Spectrum shape

[Figure legend] Black: constant population size. Grey: recent expansion from small population size

Braverman et al. (1995)

• Signature of expansion?
– Expansions maintain more rare mutations

• Signature of selective sweep?
– Linked neutral variants experience the sweep as a population bottleneck

Page 20: Modelling a SNP data set

• Single segregating mutation in the sample genealogy
– Conditional on exactly one segregating site, determine the distribution of the size (number of descendants) of the branch on which the mutation occurs

• Exactly two alleles in the sample
– Conditional on two haplotypes, bearing any number of segregating sites, determine the distribution of numbers of the two alleles

Nordborg (2001 Handbook of Statistical Genetics)

Page 21: Conditioning

• Two alleles

$$P(K=2 \mid m) = \theta \sum_{l=1}^{m-1} \frac{1}{l} \prod_{j=1}^{m-1} \frac{j}{j+\theta}$$

• One segregating site

$$P(S=1 \mid m) = \theta \sum_{l=1}^{m-1} \frac{1}{l+\theta} \prod_{j=1}^{m-1} \frac{j}{j+\theta}$$

Page 22: Multiplicity conditioned on a SNP

• Single segregating site in a sample of size m

$$P(S=1 \mid m) = \theta \sum_{l=1}^{m-1} \frac{1}{l+\theta} \prod_{j=1}^{m-1} \frac{j}{j+\theta}$$

• Multiplicity i

$$f(i \mid m, \theta) = \frac{\displaystyle\sum_{l=2}^{m-i+1} \binom{m-l}{i-1} \frac{l-1}{i} \cdot \frac{1}{l-1+\theta}}{\displaystyle\binom{m-1}{i} \sum_{j=1}^{m-1} \frac{1}{j+\theta}}$$

dependent on θ!

Ganapathy and Uyenoyama (2009 Theoretical Population Biology)
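To see the dependence on θ concretely, the sketch below evaluates the multiplicity distribution of a single segregating mutation. It assumes the mutation falls on level l with weight proportional to 1/(l – 1 + θ), and that a branch on level l subtends i of the m sampled genes with probability C(m – l, i – 1)(l – 1) / [i C(m – 1, i)], a standard coalescent result. Function names are illustrative.

```python
from math import comb


def branch_size_probability(m, l, i):
    """Probability that a random branch on level l of the genealogy
    subtends exactly i of the m sampled genes (assumed standard result)."""
    return comb(m - l, i - 1) * (l - 1) / (i * comb(m - 1, i))


def snp_multiplicity_distribution(m, theta):
    """f(i | m, theta) for i = 1 .. m - 1, conditional on exactly one
    segregating site in a sample of size m."""
    normalizer = sum(1 / (j + theta) for j in range(1, m))
    dist = {}
    for i in range(1, m):
        weight = sum(
            branch_size_probability(m, l, i) / (l - 1 + theta)
            for l in range(2, m - i + 2)
        )
        dist[i] = weight / normalizer
    return dist


low = snp_multiplicity_distribution(10, 0.1)
high = snp_multiplicity_distribution(10, 10.0)
print(low[1], high[1])  # singleton class gains weight as theta grows
```

As θ approaches 0 the distribution tends to the familiar 1/i spectrum, normalized by the harmonic sum; larger θ shifts weight toward recent levels of the genealogy and hence toward low multiplicities.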

Page 25: Genomic conflict

• Phenotypes
– Multiple genes generally influence a given phenotype

• Conflict
– Target trait value differs among genes that control phenotype

• Sexual antagonism
– Male and female function collaborate in reproduction
– Genes influencing each function may come into conflict

Page 26: Conflict and genomic variation

• Mating type regions as a battleground
– S-locus controls self-incompatibility in flowering plants
– How does sexual antagonism affect the pattern of molecular-level variation within the S-locus?
– What are hallmarks of conflict?

• Develop a basis for inference
– Model-based approach to the analysis of genetic variation

Page 27

Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg

• Flower development
– Basic perfect flower includes both male and female components

• Fertilization
– Pollen grains deposited on stigma germinate and pollen tubes grow down style to the ovary

Page 28

Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png

Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg

• Gametophytic SI (GSI)
– Specificity expressed by individual pollen grain or tube determined by own S-allele

• Pollen rejection
– Growth of pollen tube arrested in style

Page 29

Norbert Holstein, http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png

Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg

• Sporophytic SI (SSI)
– Specificity expressed by individual pollen grain or tube determined by the S-locus genotype of its parent

• Pollen rejection
– Germination of pollen grain may be arrested at stigma surface

Page 30

Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png

Norbert Holstein, http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png

Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg

[Diagram: S-haplotype S_n with components A_n and B_n]

• Pistil (A) component: rejection of recognized specificities
• Pollen (B) component: declaration of specificity

Page 31: Mating type regions

Uyenoyama (2005)

Page 32: Human Y chromosome

Skaletsky et al. (2003 Nature 423: 825)

• Non-recombining male-specific Y (MSY)
– Euchromatic region ~23 Mb
– Differences between two random Ys every 3–4 kb

• Mammalian sex determinant SRY
– Y-linked regulator of transcription of many male-specific Y-linked genes

Page 33: Mating type regions

Uyenoyama (2005)

Linkage between pistil (A) and pollen (B) components is essential to SI function
• Pollen: declaration of specificity
• Pistil: rejection of recognized specificities

Page 34: Brassica S-locus

[Figure labels: pollen component; pistil component]

Nasrallah (2000 Curr. Opin. Plant Biol.)

Natural populations often contain 30–50 S-alleles

Page 35

Vierstra (2009, Nature Reviews Molecular Cell Biology)

Ubiquitin tags proteins for degradation

• Style: S-RNase disrupts pollen tube growth
– Upon entering a pollen tube, S-RNases are initially sequestered in a vacuole
– In incompatible crosses, the vacuole breaks down, releasing S-RNases into the cytoplasm of the pollen tube

• Pollen: SLF (S-locus F-box)
– Mediator of ubiquitinylation (attachment of ubiquitin)
– Disables all S-RNases except those of the same specificity

Page 36: Sexual antagonism

• Pistil: why reject fertilization?
– Screening of potential mates may improve offspring quality
– Cost under incomplete reproductive compensation: ovules may go unfertilized

• Pollen: why provoke rejection?
– Self-rejection may improve quality of own ovules
– Rejection by other plants reduces siring success
– Hide behind another S-specificity in sporophytic SI?
– Decline to declare S-specificity altogether?

Page 37: GSI model

• Basic discrete time recursion, for P_ij the frequency of genotype ij and q_i the frequency of allele i

$$P'_{ij} = \frac{1}{2}\left[ q_i \sum_{k \ne i,j} \frac{P_{jk}}{1 - q_j - q_k} + q_j \sum_{k \ne i,j} \frac{P_{ik}}{1 - q_i - q_k} \right]$$

Wright (1939, Genetics)

• Symmetries in genotype and allele frequencies
– Model change in frequency of focal allele i, assuming all other alleles in equal frequency

$$P_{ij} = P \ \text{for } j \ne i, \qquad P_{jk} = [1 - (n-1)P] \Big/ \binom{n-1}{2} \ \text{for } j, k \ne i$$

$$q_i = q = (n-1)P/2, \qquad q_j = (1-q)/(n-1) \ \text{for } j \ne i$$

q q P n q q n j i

Page 38: Diffusion approximation

• Change in allele frequency

$$\Delta q = \frac{q(1-nq)}{n-3+2q}, \quad \text{for } n \text{ the number of common S-alleles}$$

$$\Delta q \approx \frac{nq(1-nq)}{(n-1)(n-2)} \quad \text{for } q \approx 1/n$$

Wright (1939, Genetics)

• Diffusion equation coefficients

$$\mu(x) = \frac{nx(1-nx)}{(n-1)(n-2)} - ux$$

$$\sigma^2(x) = \frac{x(1-2x)}{2N}$$

– Holds for large population size (N) and u (rate of mutation to new S-alleles) of order 1/N
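A small numerical sketch of the diffusion coefficients follows, assuming drift nx(1 – nx)/[(n – 1)(n – 2)] – ux and variance x(1 – 2x)/(2N). The function names and the closed-form root of the drift term are worked out here for illustration, not taken from the slides.

```python
def drift(x, n, u):
    """Assumed infinitesimal mean: selection toward 1/n minus mutation loss."""
    return n * x * (1 - n * x) / ((n - 1) * (n - 2)) - u * x


def variance(x, N):
    """Assumed infinitesimal variance for a GSI allele at frequency x."""
    return x * (1 - 2 * x) / (2 * N)


def interior_equilibrium(n, u):
    """Positive root of drift(x) = 0: x = [1 - u (n - 1)(n - 2) / n] / n."""
    return (1 - u * (n - 1) * (n - 2) / n) / n


n, u, N = 20, 1e-5, 1000
x_hat = interior_equilibrium(n, u)
print(x_hat, 1 / n, drift(x_hat, n, u), variance(x_hat, N))
```

Mutation to new specificities pulls the deterministic equilibrium slightly below 1/n; with u of order 1/N the shift is small.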

Page 39: Wright’s diffusion model

[Figure: number of S-alleles (y-axis) against frequency in population (x-axis)]

• Diffusion with jumps

$$\mu(x) = \frac{nx(1-nx)}{(n-1)(n-2)} - ux, \qquad \sigma^2(x) = \frac{x(1-2x)}{2N}$$

• Turnover rate

$$\frac{4Nun}{(n-1)(n-2)}$$

Page 40: Expansion of time scale under balancing selection

Takahata (1993, Mechanisms of Molecular Evolution)

• High rate of invasion of rare alleles
– Promotes invasion of new and retention of rare types
– Maintains high numbers of alleles

• Genealogical relationships
– Tree shape similar under symmetric balancing selection and neutrality
– Greatly expanded time scale

Page 41: S-allele turnover

• Quasi-equilibrium of S-alleles
– Invasion of new, rare S-alleles balanced by extinction of common S-alleles

• Expansion of time scale
– Rate of divergence among S-allele classes similar to rate among neutral lineages, but in a population of size fN

[Equation: scaling factor f as a function of n, N, and u]

Page 42: Gametophytic SI models

• Basic discrete time recursion

$$P'_{ij} = \frac{1}{2}\left[ q_i \sum_{k \ne i,j} \frac{P_{jk}}{1 - q_j - q_k} + q_j \sum_{k \ne i,j} \frac{P_{ik}}{1 - q_i - q_k} \right]$$

• Diffusion approximation

$$\mu(x) = \frac{nx(1-nx)}{(n-1)(n-2)} - ux, \qquad \sigma^2(x) = \frac{x(1-2x)}{2N}$$

• Parameters
– Effective population size (N)
– Rate of mutation to new S-specificities (u)

Page 43: Simulation results

• Stationary distribution of allele frequency
– Most time spent close to deterministic equilibrium (1/n) or in boundary layer close to extinction

• Number of S-alleles
– Analytical expectation for number of common S-alleles

Vallejo-Marín and Uyenoyama (2008)

Page 45: Pollen specificity in GSI

• Each pollen expresses its own specificity
– Rarer specificities are incompatible with fewer plants

• Incompatible matings
– For n S-alleles in equal frequencies, a pollen type is incompatible with a proportion 2/n of all plants

Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png
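The 2/n proportion can be confirmed by direct enumeration. The sketch assumes that every plant is heterozygous at the S-locus and that all C(n, 2) genotypes are equally frequent, which is the symmetric case with n alleles in equal frequencies; the function name is illustrative.

```python
from fractions import Fraction
from itertools import combinations


def gsi_incompatible_fraction(n):
    """Fraction of plants that reject pollen carrying one given S-allele,
    assuming all heterozygous genotypes are equally frequent."""
    plants = list(combinations(range(n), 2))
    focal_allele = 0
    rejecting = sum(1 for genotype in plants if focal_allele in genotype)
    return Fraction(rejecting, len(plants))


for n in (3, 10, 50):
    print(n, gsi_incompatible_fraction(n))
```

Each allele appears in n – 1 of the n(n – 1)/2 genotypes, giving 2/n exactly.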

Page 47: Fate of style-part mutant

[Diagram: style-part mutant S_a with components A_{n+1} and B_n]

[Figure: regions of full SC, polymorphism, and full SI. x-axis: self-pollen fraction (s); y-axis: relative viability of inbred offspring; curve label: Inf]

Page 48: Fate of pollen-part mutant

[Diagram: pollen-part mutant S_b with components A_n and B_{n+1}]

[Figure: regions of full SC, polymorphism, full SI, and disruption, for n = 10. x-axis: self-pollen fraction (s); y-axis: relative viability of inbred offspring]

Uyenoyama, Zhang, and Newbigin (2001)

Page 49: Direction of pollen flow

[Diagram: S-haplotypes S_n (A_n, B_n), S_a (A_{n+1}, B_n), S_b (A_n, B_{n+1}), and S_{n+1} (A_{n+1}, B_{n+1})]

Uyenoyama, Zhang, and Newbigin (2001)

Page 50

[Diagram: S-haplotypes S_n (A_n, B_n), S_a (A_{n+1}, B_n), S_b (A_n, B_{n+1}), and S_{n+1} (A_{n+1}, B_{n+1}); two paths are marked "Evolutionarily unlikely"]

• TURN OFF: partial breakdown of SI by pollen disablement
• TURN ON: restoration of SI by stylar recognition

Uyenoyama, Zhang, and Newbigin (2001)

Page 51: Joint genealogies

Newbigin, Paape, and Kohn (2008)

[Figure panels: Solanaceae and Plantaginaceae; Rosaceae]

Unlike S-RNase genes, SLF genes show
– Low divergence between allelic types
– No trans-specific sharing of lineages

Page 52

• Family-specific genealogies
– Rosaceae: do highly diverged, ancient SFB lineages reflect continuous operation or restoration of same F-box genes?
– Solanaceae, Plantaginaceae: recruitment of new F-box genes?

• Turnover of pollen-specificity loci
– Expression and recognition of a paralogue of the former pollen specificity gene?
– Can homologues be distinguished from paralogues with new function?
– Cycles of loss/restoration of SI?

Page 54: An appeal for inference methods

• Sexual antagonism in mating type regions
– Neutral variation in linked regions
– Rates of substitution at determinants of mating type

• Inference
– Goal: use the pattern of variation in population samples of genomic regions as a basis for inference about the evolutionary process
– Detection: genomic conflict and other forms of selection; mating systems and population structure

Page 55: Pollen specificity in SSI

Norbert Holstein, http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png

• Codominance
– Both specificities expressed
– Almost twice as many incompatible styles under SSI as under GSI for the same number of S-alleles

• Complete dominance
– One specificity expressed
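The "almost twice" comparison can be made explicit by enumeration. The sketch assumes n codominant S-alleles, all C(n, 2) heterozygous genotypes equally frequent, and rejection whenever a style shares any specificity expressed by the pollen; under SSI the pollen expresses both alleles of its parent, under GSI only its own. Function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def gsi_incompatible_fraction(n):
    """Pollen expresses its own allele (allele 0 here)."""
    plants = list(combinations(range(n), 2))
    rejecting = sum(1 for genotype in plants if 0 in genotype)
    return Fraction(rejecting, len(plants))


def ssi_codominant_incompatible_fraction(n):
    """Pollen expresses both alleles of its parent (genotype {0, 1} here)."""
    plants = list(combinations(range(n), 2))
    rejecting = sum(1 for genotype in plants if 0 in genotype or 1 in genotype)
    return Fraction(rejecting, len(plants))


for n in (5, 10, 50):
    ratio = ssi_codominant_incompatible_fraction(n) / gsi_incompatible_fraction(n)
    print(n, ratio, float(ratio))
```

Under these assumptions the ratio works out to (2n – 3)/(n – 1), which approaches 2 from below as the number of S-alleles grows.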

Page 56: SRK genealogies

Edh, Widén and Ceplitis (2009)

• Sporophytic SI
– Diploid genotype of pollen parent determines S-specificity of each pollen grain
– Class I is dominant over Class II, with codominance within class

• Class II: pollen-recessive
– Lower number of segregating alleles, each with relatively higher frequency in population
– Greater genealogical relationship within class?

Page 57: Is class II younger than class I?

Uyenoyama (1995)

• MRCA ages
– Class I: 25.5 ± 8.1 MY
– Class II: 3.1 ± 0.9 MY
– I/II: 41.4 ± 12.7 MY

• Origin of SLG/SRK system
– 42.1 ± 9.0 MY