The dynamics of nuclear gene order in the eukaryotes

Page 1: The dynamics of nuclear gene order in the eukaryotes
Page 2: The dynamics of nuclear gene order in the eukaryotes

The dynamics of nuclear gene order in the eukaryotes

Page 3: The dynamics of nuclear gene order in the eukaryotes

Genome archaeology in the angiosperms

Todd Vision

Department of Biology, University of North Carolina at Chapel Hill

Page 4: The dynamics of nuclear gene order in the eukaryotes

Comparative maps: spaghetti diagram and crop circle

Livingstone et al. (1999) Genetics 152:1183

Gale & Devos (1998) PNAS 95:1972

Page 5: The dynamics of nuclear gene order in the eukaryotes

Arabidopsis as a hub for plant comparative maps

[Figure: bar chart of genome sizes in angiosperms; y-axis in megabases, 0 to 1000; bar values 145, 262, 367, 367, 372, 415, 439, 473, 560, 622, 907.]

Data from Arumuganathan & Earle (1991) Plant Mol Biol Rep 9:208-218

Page 6: The dynamics of nuclear gene order in the eukaryotes

Tomato-Arabidopsis synteny

Bancroft (2001) TIG 17, 89 after Ku et al (2000) PNAS 97, 9121

Page 7: The dynamics of nuclear gene order in the eukaryotes

Outline

• Ancient genome duplication

– How can we reconstruct genomic history?

• Computational challenges

• Role of different classes of gene duplication in genome evolution

Page 9: The dynamics of nuclear gene order in the eukaryotes

Mayer et al. (2001) Genome Res. 11, 1167

Rice-Arabidopsis synteny

Page 10: The dynamics of nuclear gene order in the eukaryotes

Paleotetraploidy?

The Arabidopsis Genome Initiative. 2000. Nature 408:796

Page 11: The dynamics of nuclear gene order in the eukaryotes

Genomic dot-plot

gene  1  2  3  4  5  6  7  8
1     1  0  0  0  1  0  0  0
2     0  1  0  0  0  1  0  0
3     0  0  1  0  0  0  1  0
4     0  0  0  1  0  0  0  1
5     1  0  0  0  1  0  0  0
6     0  1  0  0  0  1  0  0
7     0  0  1  0  0  0  1  0
8     0  0  0  1  0  0  0  1

Chromosome copy 1: genes 1 2 3 4

Chromosome copy 2: genes 5 6 7 8
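A dot-plot of this kind can be generated mechanically. The following is a minimal sketch, assuming each gene has already been assigned to a gene family and that a dot is scored whenever two genes share a family. The family labels and the function name `dot_plot` are illustrative and do not come from the talk.

```python
def dot_plot(families):
    """Return a square 0/1 matrix with a 1 wherever two genes share a family."""
    n = len(families)
    return [[1 if families[i] == families[j] else 0 for j in range(n)]
            for i in range(n)]

# Hypothetical labels: genes 1-4 (chromosome copy 1) and genes 5-8
# (chromosome copy 2) carry the same four families in the same order.
families = ["a", "b", "c", "d", "a", "b", "c", "d"]
matrix = dot_plot(families)

# Print one row per gene, numbered from 1 as on the slide.
for gene, row in enumerate(matrix, start=1):
    print(gene, *row)
```

The main diagonal is each gene matched with itself; the parallel off-diagonal run of 1s is the signature of the duplicated segment.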

Page 12: The dynamics of nuclear gene order in the eukaryotes

Duplication vs. multiplication

Multiple duplications generate abundant overlaps among homeologous regions

Page 13: The dynamics of nuclear gene order in the eukaryotes
Page 14: The dynamics of nuclear gene order in the eukaryotes

Vision et al. (2000) Science 290:2114-7.

Segmental paralogy in Arabidopsis

Page 15: The dynamics of nuclear gene order in the eukaryotes

Many duplicated segments but few duplication events

[Figure: histogram of the frequency of blocks (y-axis, 0 to 12) against amino acid substitution (x-axis, 0 to 0.9), with age classes labeled A through F.]

Page 16: The dynamics of nuclear gene order in the eukaryotes

Blanc, Hokamp, Wolfe (2003) Genome Res. 13, 137-144.

Page 17: The dynamics of nuclear gene order in the eukaryotes

[Figure: angiosperm phylogeny showing the positions of Arabidopsis, tomato, and rice.]

Angiosperm Phylogeny Website. Version 2 August 2001. http://www.mobot.org/MOBOT/research/APweb/.

Page 18: The dynamics of nuclear gene order in the eukaryotes

Block 37: after the Asterid-Rosid split

Block 57: before the monocot-dicot divergence

Raes, Vandepoele, Saeys, Simillion, Van de Peer (2003) J. Struct. Func. Genomics 3, 117-129

Page 19: The dynamics of nuclear gene order in the eukaryotes

Divergence of homeologs

• Homeologs from age class C and older share less than a third of their genes

– Gene loss

– Or subsequent gene movement?

• There is no evidence for uneven proportions of duplicated genes between homeologs

Page 20: The dynamics of nuclear gene order in the eukaryotes

Redundant gene function: SHATTERPROOF

Martin Yanofsky

Page 21: The dynamics of nuclear gene order in the eukaryotes

Implications for comparative maps

• Networks of synteny

• Goodbye to pairwise comparisons

Page 22: The dynamics of nuclear gene order in the eukaryotes

Outline

• Ancient genome duplication

– How can we reconstruct genomic history?

• Computational challenges

• Role of different classes of gene duplication in genome evolution

Page 23: The dynamics of nuclear gene order in the eukaryotes

Ghosts and Muggles

Simillion, Vandepoele, Van Montagu, Zabeau, Van de Peer (2002) PNAS 99, 13627

Page 24: The dynamics of nuclear gene order in the eukaryotes

Interspecies comparison can reveal Ghosts

Page 25: The dynamics of nuclear gene order in the eukaryotes

Things needful

• Identification of highly diverged Muggles

• A systematic way to identify Ghosts

• Centralization of mapped and sequenced DNA markers from multiple species

Page 26: The dynamics of nuclear gene order in the eukaryotes

FISH (Fast Identification of Segmental Homology)

• Identifies candidate segmental homologies

– Dynamic programming

• Statistically evaluates candidates

– Null model of transpositional duplication

• No permutations required

• Approaches limits to sensitivity

Page 27: The dynamics of nuclear gene order in the eukaryotes

FISH under null model

k   observed number   standard error   upper bound   lower bound
2   45.8              0.06             47.6          40.1
3   2.28              0.02             2.39          1.78
4   0.113             0.003            0.120         0.079
5   0.006             0.001            0.006         0.004
6   0.0003            0.0002           0.0003        0.0002

Page 28: The dynamics of nuclear gene order in the eukaryotes

eAssembler

• Reconstructs ancestral gene order by joining duplicated blocks with overlapping gene content

• Uses ‘breakpoint median’ as objective function

• Similar to algorithms used in sequence assembly

Blanc, Hokamp, Wolfe (2003) Genome Res. 13, 137-144.
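To make the 'breakpoint median' objective concrete, here is a minimal sketch of a breakpoint distance and the median score built from it. It assumes unsigned, linear gene orders over the same set of genes, which is a simplification: duplicated blocks that have lost genes would need extra handling. The function names are hypothetical and this is not the eAssembler implementation.

```python
def adjacencies(order):
    """Unordered pairs of neighbouring genes in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}

def breakpoint_distance(a, b):
    """Number of adjacencies in `a` that are absent from `b`."""
    return len(adjacencies(a) - adjacencies(b))

def median_score(candidate, orders):
    """Sum of breakpoint distances from a candidate order to each input order.

    A breakpoint median is a candidate that minimizes this sum.
    """
    return sum(breakpoint_distance(candidate, order) for order in orders)

# Illustrative gene orders (not real data): the second has genes 3 and 4 swapped.
block_a = [1, 2, 3, 4, 5]
block_b = [1, 2, 4, 3, 5]
print(breakpoint_distance(block_a, block_b))
print(median_score(block_a, [block_a, block_b]))
```

Swapping two adjacent internal genes preserves the adjacency between them but breaks the two adjacencies on either side, so the distance in this example counts two breakpoints.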

Page 29: The dynamics of nuclear gene order in the eukaryotes

PHYTOME: integrating plant genome maps, sequences and phylogenies

From www.plantgdb.org

Page 30: The dynamics of nuclear gene order in the eukaryotes

Outline

• Ancient genome duplication

– How can we reconstruct genomic history?

• Computational challenges

• Role of different classes of gene duplication in genome evolution

Page 31: The dynamics of nuclear gene order in the eukaryotes

Gene duplications in a chromosomal context

• Turnover within gene families can be high

– Rate of duplication = 0.002 per gene per MY

– Half-life = 23 MY

• Three modes of duplication

– Tandem

– Transpositional

– Segmental

• How does the mode of origin affect the molecular and functional divergence of duplicate genes?
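The half-life quoted above can be turned into a per-million-year loss rate and a survival fraction. This is a minimal sketch that assumes duplicates are lost at a constant rate (simple exponential decay); the slide gives only the half-life, so the decay model and the function names are assumptions.

```python
import math

HALF_LIFE_MY = 23.0  # half-life of a duplicate gene, from the slide

def loss_rate(half_life_my=HALF_LIFE_MY):
    """Per-MY loss rate implied by a half-life under exponential decay."""
    return math.log(2) / half_life_my

def fraction_surviving(t_my, half_life_my=HALF_LIFE_MY):
    """Expected fraction of duplicates still present after t_my million years."""
    return 0.5 ** (t_my / half_life_my)

print(round(loss_rate(), 4))
print(fraction_surviving(46.0))
```

Under this assumption a quarter of the duplicates born at a given time would remain after two half-lives (46 MY).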

Page 32: The dynamics of nuclear gene order in the eukaryotes

Gene family turnover

Lynch and Conery (2000) Science 290, 1151

Page 33: The dynamics of nuclear gene order in the eukaryotes

Importance of tandem and transpositional duplications

~10% of genes are in tandem arrays

85% of dispersed duplications are not in blocks

• Duplicates on the same chromosome are 20% more common than expected by chance

• Duplicates on the same chromosome are 86% as distant as would be expected by chance

Page 34: The dynamics of nuclear gene order in the eukaryotes
Page 35: The dynamics of nuclear gene order in the eukaryotes

Aux/IAA and ARF sister families

• Importance in Arabidopsis

Page 36: The dynamics of nuclear gene order in the eukaryotes

Diversification of the Aux/IAA gene family

David Remington and Jason Reed

Page 37: The dynamics of nuclear gene order in the eukaryotes
Page 38: The dynamics of nuclear gene order in the eukaryotes

Diversification of ARF gene family

Page 39: The dynamics of nuclear gene order in the eukaryotes

Chromosome 2-4 complex: 242 duplicated gene pairs

[Figure: dot-plot of chromosome 2 (5.6 Mb; x-axis ticks 1200 to 2800) against chromosome 4 (4.6 Mb; y-axis ticks 2600 to 4200), with regions labeled 45, 52, 49, 54, and 56.]

Page 40: The dynamics of nuclear gene order in the eukaryotes

Substitutions in coding sequences

• silent substitutions (Ks) only alter the codon, not the resulting amino acid

• replacement substitutions (Ka) alter the amino acid

• Ka and Ks are standardized by the numbers of synonymous and nonsynonymous sites

Page 41: The dynamics of nuclear gene order in the eukaryotes

Ratio of Ka to Ks

Ka/Ks < 1: selective constraint

Ka/Ks = 1: pure neutrality

Ka/Ks > 1: positive selection
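The standardization and the ratio can be illustrated with a short calculation. This sketch assumes the synonymous and nonsynonymous differences and sites have already been counted for a pair of aligned coding sequences, and applies a Jukes-Cantor correction for multiple hits, as in simple counting methods. The talk does not say which estimator was used, so the method, the function names, and the input counts are all assumptions.

```python
import math

def jukes_cantor(p):
    """Correct a proportion of differing sites for multiple substitutions."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)

def ka_ks(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """Return (Ka, Ks): differences standardized by the number of sites."""
    ks = jukes_cantor(syn_diffs / syn_sites)
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    return ka, ks

def interpret(ka, ks):
    """Map a Ka/Ks ratio onto the three categories on the slide."""
    ratio = ka / ks
    if math.isclose(ratio, 1.0):
        return "pure neutrality"
    return "selective constraint" if ratio < 1 else "positive selection"

# Made-up counts for illustration only.
ka, ks = ka_ks(syn_diffs=30, syn_sites=100, nonsyn_diffs=20, nonsyn_sites=400)
print(round(ka, 4), round(ks, 4), interpret(ka, ks))
```

In practice a point estimate of the ratio says little on its own; whether it differs significantly from 1 is a separate statistical question.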

Page 42: The dynamics of nuclear gene order in the eukaryotes

How have these ancient segmental duplicates diverged?

1. What is the variation in Ka and Ks among simultaneously duplicated pairs?

2. Do the Ka/Ks ratios suggest positive selection?

3. Do the members of each duplicated pair evolve at the same rate?

Page 43: The dynamics of nuclear gene order in the eukaryotes

[Figure: two histograms, one of frequency (0 to 70) against Ka (0 to 1) and one of frequency (0 to 120) against Ks (0 to 5).]

coefficient of variation = 0.67

coefficient of variation = 0.53

Page 44: The dynamics of nuclear gene order in the eukaryotes

Relationship between Ka and Ks

[Figure: scatter plot of Ka (y-axis, 0 to 1) against Ks (x-axis, 0 to 5), with the line Ka/Ks = 1 marked.]

r2 = 0.558, p < 0.001

Page 45: The dynamics of nuclear gene order in the eukaryotes

Relative rate test

[Figure: three-taxon tree with outgroup O and duplicates A and B, on branches of length d1, d2, and d3 respectively.]

Compare the fit of a model in which d2 = d3 with one in which they are allowed to vary
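One common way to compare two nested models of this kind is a likelihood ratio test. The sketch below assumes that both models have already been fitted and their log-likelihoods are available, and that the constrained model (d2 = d3) has one fewer free parameter, so the test statistic is referred to a chi-square distribution with one degree of freedom. The talk does not state how significance was assessed; the function name and the example log-likelihoods are hypothetical.

```python
import math

def lrt_pvalue(lnl_constrained, lnl_free):
    """Likelihood ratio test for one extra free parameter.

    Returns the test statistic 2 * (lnL_free - lnL_constrained) and its
    p-value under a chi-square distribution with 1 degree of freedom.
    """
    stat = max(2.0 * (lnl_free - lnl_constrained), 0.0)
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Made-up log-likelihoods for illustration only.
stat, p = lrt_pvalue(lnl_constrained=-1002.0, lnl_free=-1000.0)
print(stat, round(p, 4))
```

A small p-value rejects equal rates along the two duplicate lineages.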

Page 46: The dynamics of nuclear gene order in the eukaryotes

Relative rate tests

• 105 gene pairs could be evaluated against an outgroup

• >30 showed significantly unequal rates of evolution

• no evident chromosomal or regional biases

Distance measure   Significant pairs
protein            15
Ka                 29
Ks                 9

Page 47: The dynamics of nuclear gene order in the eukaryotes

Are paralogs different than orthologs?

• Homologous genes are either

– Paralogs that diverged through duplication

– Orthologs that diverged through speciation

• Paralogs must coexist in the same genome – do they diverge differently as a result?

• Comparison to 212 Arabidopsis-Brassica orthologs by Tiffin and Hahn (2002) JME 54, 746.

– For all pairs, Ka/Ks < 1

– Ka/Ks unimodal around 0.14 (as opposed to 0.20)

– CV(Ks)/CV(Ka) is approximately 2

Page 48: The dynamics of nuclear gene order in the eukaryotes

Conclusions

• A network of synteny due to duplication and gene loss makes deep comparative mapping difficult

• But phylogenetically-informed methods should allow us to go much deeper than at present

• Only by going deep will we be able to understand the varied roles of different kinds of duplication events in the diversification of gene families

Page 49: The dynamics of nuclear gene order in the eukaryotes

Acknowledgements

• Arabidopsis genome evolution

– Daniel Brown

– Steven Tanksley

• Comparative mapping

– Peter Calabrese

– Sugata Chakravarty

– Luke Huan

• Evolution of duplicated genes

– Liqing Zhang

– Brandon Gaut

– David Remington

– Jason Reed

• Support

– USDA

– NSF

Page 50: The dynamics of nuclear gene order in the eukaryotes
Page 51: The dynamics of nuclear gene order in the eukaryotes

Conservation of gene orientation

parallel

convergent

divergent

Page 52: The dynamics of nuclear gene order in the eukaryotes

Formulating the problem in terms of graph traversal

• nodes are matches

• edges are unidirectional

• edges have associated distances

The putative duplicated blocks consist of the paths through the graph that traverse edges with short distances
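The graph formulation above lends itself to dynamic programming. The following is a minimal sketch, assuming each match is a dot at integer coordinates (x, y), that an edge runs from one dot to another only when both coordinates increase, and that edge distance is the Manhattan distance. These choices, the threshold `max_dist`, and the function name are assumptions for illustration; the sketch handles only blocks in the same orientation and is not the FISH implementation.

```python
def best_chain(dots, max_dist):
    """Longest path through matches that uses only short, forward edges.

    dots     -- iterable of (x, y) positions of matches in the dot-plot
    max_dist -- largest Manhattan distance allowed for a single edge
    """
    pts = sorted(dots)
    if not pts:
        return []
    length = [1] * len(pts)   # longest chain ending at each dot
    prev = [None] * len(pts)  # predecessor on that chain
    for j, (xj, yj) in enumerate(pts):
        for i in range(j):
            dx, dy = xj - pts[i][0], yj - pts[i][1]
            if dx > 0 and dy > 0 and dx + dy <= max_dist:
                if length[i] + 1 > length[j]:
                    length[j] = length[i] + 1
                    prev[j] = i
    # Trace back from the dot that ends the longest chain.
    end = max(range(len(pts)), key=lambda k: length[k])
    chain = []
    while end is not None:
        chain.append(pts[end])
        end = prev[end]
    return chain[::-1]

# Made-up dots: three roughly collinear matches plus two scattered ones.
dots = [(1, 10), (2, 11), (4, 12), (3, 40), (9, 2)]
print(best_chain(dots, max_dist=4))
```

Because edges only point forward, the graph is acyclic and each dot needs to be visited once in sorted order; the quadratic inner loop is kept for clarity rather than speed.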

Page 53: The dynamics of nuclear gene order in the eukaryotes

Statistical framework

• Null model of duplications

– Single-gene duplication/random transposition

– Leads to uniformly distributed dots

• Null distribution for

– The edge distance between nearest neighbors

– The number of serially connected short edges

• Observed edge distances and path lengths analytically compared to null expectation

• Can be approximated by a permutation test
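A permutation approximation of the null distribution could look like the sketch below. It assumes one particular scheme: the y-coordinates of the dots are shuffled among the dots, which scatters them without regard to position, and the longest chain of short forward edges is recomputed each time. The chain definition (both coordinates increasing, Manhattan distance at most `max_dist`), the shuffling scheme, and the function names are assumptions, not the analytical calculation described on the slide.

```python
import random

def longest_chain_length(dots, max_dist):
    """Length of the longest path of short, forward edges among the dots."""
    pts = sorted(dots)
    length = [1] * len(pts)
    for j, (xj, yj) in enumerate(pts):
        for i in range(j):
            dx, dy = xj - pts[i][0], yj - pts[i][1]
            if dx > 0 and dy > 0 and dx + dy <= max_dist:
                length[j] = max(length[j], length[i] + 1)
    return max(length, default=0)

def permutation_pvalue(dots, max_dist, n_perm=1000, seed=0):
    """Fraction of shuffled dot-plots with a chain at least as long as observed."""
    rng = random.Random(seed)
    observed = longest_chain_length(dots, max_dist)
    xs = [x for x, _ in dots]
    ys = [y for _, y in dots]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ys)  # break any association between x and y positions
        if longest_chain_length(list(zip(xs, ys)), max_dist) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Made-up data: a perfect diagonal of ten matches.
diagonal = [(i, i) for i in range(10)]
print(permutation_pvalue(diagonal, max_dist=2, n_perm=200, seed=1))
```

The smallest p-value this can report is 1 / (n_perm + 1), which is one reason an analytical null distribution is attractive for long chains.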

Page 54: The dynamics of nuclear gene order in the eukaryotes

Only a fraction of the genes are (still?) duplicated

Chr2 segment: 1183 genes

Chr4 segment: 1168 genes

326 duplicates (~28%)

Page 55: The dynamics of nuclear gene order in the eukaryotes

271 (83%) pairwise duplications

Page 56: The dynamics of nuclear gene order in the eukaryotes

Tandem substitutions

• correlation between Ka and Ks disappears when tandem substitutions are excluded

• could be due to

– doublet mutations

– compensatory substitutions

Page 57: The dynamics of nuclear gene order in the eukaryotes

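[Figure: trees for four duplicated gene pairs, each with an outgroup sequence, branch lengths, and a significance level.]

Gene pairs: 49.5 calmodulin-binding protein; 49.62 beta-expansin; 49.63 NADH-ubiquinone oxidoreductase; 56.1 unknown transmembrane

Arabidopsis loci: At2g18750, AT4g31000, AT4g28250, At2g20750, At2g20800, AT4g28220, At2g23810, AT4g30430

Outgroups: tobacco 1698547, rice 8118436, Hemerocallis 3551953, potato 5734586

Significance levels: p<0.0001, p<0.05, p<0.0001, p<0.01

Branch lengths: 0.13, 0.16, 0.37, 0.30, 0.10, 0.22, 0.14, 0.16, 0.29, 0.22, 0.12, 0.70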