Transcript
The Repatterning of Eukaryotic Genomes by Random Genetic Drift · bioinformatica.uab.cat/base/documents/masterGP/Lynch et al. 2011... · GG12CH15-Lynch ARI 26 July 2011 11:14

GG12CH15-Lynch ARI 26 July 2011 11:14

The Repatterning of Eukaryotic Genomes by Random Genetic Drift

Michael Lynch,1 Louis-Marie Bobay,2 Francesco Catania,3 Jean-Francois Gout,1 and Mina Rho4

1Department of Biology and 4Department of Computer Science, Indiana University, Bloomington, Indiana 47408; email: [email protected]
Evolutionary Genomics, Institut Pasteur, CNRS, URA2171, F-75724 Paris, France
3Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, United Kingdom

Annu. Rev. Genomics Hum. Genet. 2011.12:347–66

First published online as a Review in Advance on July 13, 2011

The Annual Review of Genomics and Human Genetics is online at genom.annualreviews.org

This article’s doi:10.1146/annurev-genom-082410-101412

Copyright © 2011 by Annual Reviews. All rights reserved

1527-8204/11/0922-0347$20.00

Keywords

complexity, genome evolution, mutation, protein evolution, recombination

Abstract

Recent observations on rates of mutation, recombination, and random genetic drift highlight the dramatic ways in which fundamental evolutionary processes vary across the divide between unicellular microbes and multicellular eukaryotes. Moreover, population-genetic theory suggests that the range of variation in these parameters is sufficient to explain the evolutionary diversification of many aspects of genome size and gene structure found among phylogenetic lineages. Most notably, large eukaryotic organisms that experience elevated magnitudes of random genetic drift are susceptible to the passive accumulation of mutationally hazardous DNA that would otherwise be eliminated by efficient selection. Substantial evidence also suggests that variation in the population-genetic environment influences patterns of protein evolution, with the emergence of certain kinds of amino-acid substitutions and protein-protein complexes only being possible in populations with relatively small effective sizes. These observations imply that the ultimate origins of many of the major genomic and proteomic disparities between prokaryotes and eukaryotes and among eukaryotic lineages have been molded as much by intrinsic variation in the genetic and cellular features of species as by external ecological forces.


Annu. Rev. Genom. Human Genet. 2011.12:347-366. Downloaded from www.annualreviews.org by Universitat Autonoma de Barcelona on 04/30/12. For personal use only.


INTRODUCTION

It is generally acknowledged that the biological world was entirely prokaryotic 3 billion years ago, and that by ∼2.5 billion years ago, a key lineage had emerged that eventually gave rise to all of today’s eukaryotes. Although prokaryotes are the evolutionary cradle of metabolic diversity and still dominate the earth numerically, the emergence of eukaryotes initiated an enormous radiation of morphological diversity. The factors responsible for this diversification and the degree to which natural selection played a role remain unclear.

Based on the attributes shared across the entire eukaryotic domain, we can be reasonably certain that the ancestral eukaryotic cell harbored a complex genome and a complex internal structure (47, 65, 83), and it is tempting to further assume that the emergence of internal membrane-bound structures was a necessary precursor to the evolution of diverse external morphologies (50). However, although cellular features comprise the physical substrate upon which natural selection operates, intrinsic processes operating at the population-genetic level dictate the types of paths that are open or closed to evolutionary exploration within different phylogenetic lineages. Here, we review the evidence that alterations in the population-genetic environment played a central and possibly definitive role in establishing the unique types of evolutionary trajectories taken by various eukaryotic lineages at the genomic and proteomic levels.

As the subject material is broad, we will restrict our attention to three fundamental issues. First, we will examine the general phylogenetic patterns of the three main nonadaptive features of the population-genetic environment (random genetic drift, recombination, and mutation), as the relative powers of these forces define the types of evolutionary changes that are possible in various contexts. Second, we will review the broad set of observations on eukaryotic genome structure that have emerged via the field of comparative genomics. With considerably more data than were available in the past, this overview will clearly establish the general boundaries of the overall genome-architectural landscape within which eukaryotic lineages have wandered over evolutionary time. Third, having established the central role that random genetic drift and mutation have played in the diversification of genome structure, we will move to the next rung of the ladder in biological organization, the nature of the proteome, providing suggestive arguments that alterations in the nonadaptive forces of evolution in the eukaryotic domain are of sufficient magnitude to influence the ways in which protein evolution proceeds. New advances in population-genetic theory in this area provide a potential resource for understanding how complex cellular adaptations may have evolved in the ancestral eukaryote.

THE POPULATION-GENETIC ENVIRONMENT

Although natural selection plays an essential role in molding organismal diversity in ways that ensure population survival, the stochastic nature of the processes of random genetic drift, recombination, and mutation makes it impossible to precisely predict the genomic or phenotypic responses that will be elicited by any specific selective challenge. However, two things are clear. First, evolution follows the dictates of Darwin’s “descent with modification”: natural selection operates on standing variation, with the new variants that arise by mutation and recombination being defined by the preexisting resources. Second, the ability of natural selection to promote beneficial mutations and eradicate deleterious mutations depends on the intensity of selection at the gene level relative to the power of random genetic drift. If the magnitude of drift exceeds the power of selection, adaptations that would otherwise go forward cannot be actively promoted, whereas degenerative mutations with sufficiently mild effects will accumulate. Although the latter effect can lead to extinction of a sufficiently small population (73, 74), it can also promote novel paths of evolution as initially deleterious mutations that passively emerge at the DNA level are secondarily modified into new adaptive forms. Here, we briefly review how the magnitudes of the three major features of the population-genetic environment scale across the tree of life.

Random Genetic Drift

Random sampling of the gene pool from generation to generation is a ubiquitous source of evolutionary stochasticity. The magnitude of drift is generally defined by the inverse of the effective number of gametes sampled per generation, $2N_e$ in a diploid species, where $N_e$ is the effective population size (15). Although $N_e$ is generally expected to increase with the absolute number of reproductive adults in a species ($N$), many additional factors influence patterns of gene transmission across generations. Most aspects of population structure, such as uneven sex ratios, variation in family size, nonrandom mating, and localized inbreeding, result in nonrandom representation of genomes across generations, guaranteeing that $N_e < N$, i.e., that fluctuations of allele frequencies across generations will be much larger than expected were gametes to be sampled equally from $N$ parents.

However, of at least equal importance is the fact that the physical structure of the genome ensures that $N_e$ will be further depressed below the expectation based on gamete sampling. This is because the fates of nucleotides at a specific genomic site are determined by selection operating not just on that site, but also on all linked sites under selection. As a consequence, the direct effects of some deleterious mutations can be partially masked by fortuitous linkage to beneficial mutations and also by the simultaneous presence of competing deleterious mutations in other individuals. In the extreme case of an obligately asexual species, $N_e$ is not much more than the number of individuals in the highest fitness class of the population, as essentially all other individuals represent the “living dead” who will leave no long-term descendants (29). Likewise, the ability of selection to promote new adaptive mutations is reduced by linkage, as favorable (nonallelic) mutations arising on homologous chromosomes cannot be simultaneously fixed unless recombination occurs between the two sites. These interference effects from linkage are expected to increase with increasing $N$, as a larger number of mutational targets enhances the likelihood of simultaneous segregation of multiple mutations (28). Consequently, the relationship between $N_e$ and $N$ is almost certainly nonlinear, with the response of $N_e$ to $N$ becoming shallower (and perhaps even leveling off) at very large $N$, as the quantitative consequences of linkage (draft) overtake the magnitude of drift associated with gamete sampling. The net effect of these complexities is that $1/(2N_e)$ should be viewed simply as a composite summary measure of the long-term consequences of all sources of transmission stochasticity, including those caused by linkage, mating-system variation, etc.

In principle, $N_e$ can be estimated directly by monitoring the variance of allele-frequency change across generations, as this has an expected value $p(1 - p)/(2N_e)$, where $p$ is the initial allele frequency (88, 106). However, as the expected change in allele frequency is extremely small unless $N_e$ is tiny, this approach is notoriously difficult to apply because errors in estimating $p$ will overwhelm the true change unless the sample size is enormous. As a consequence, most attempts to estimate $N_e$ have taken a circuitous route, the most popular being indirect inference from observations on levels of within-population variation at nucleotide sites assumed to be neutral. The logic underlying this approach is that if $u$ is the rate of base-substitution mutation per generation for the sites under consideration, and $N_e$ is roughly constant, an equilibrium level of variation will be reached at which the input via mutation, $2u$, is matched by the fractional loss via drift/draft, $1/(2N_e)$. At this point, the level of nucleotide heterozygosity is approximately equal to the ratio of these two forces, $4N_e u$ for a diploid ($2N_e u$ for haploids), provided the observed heterozygosity $\ll 1$, which is almost always the case. The obvious limitation of this estimator is that it is not simply a function of $N_e$, but also depends on $u$. On the other hand, as will be noted below, a central parameter in many areas of genome evolution is the ratio of the magnitudes of mutation and drift, $u/[1/(2N_e)] = 2N_e u$, so this composite population-genetic estimator is in fact of great utility (65).
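The expected variance of allele-frequency change quoted above can be checked numerically. The following is a minimal sketch in Python and is not part of the original review: it assumes an idealized Wright-Fisher population in which the $2N_e$ gametes of each generation are drawn independently, and the function names, random seed, and parameter values are illustrative choices only.

```python
import random


def allele_freq_change_variance(p, n_e, replicates=5000, seed=1):
    """Simulate one generation of Wright-Fisher sampling in many replicate
    populations and return the sample variance of the allele-frequency change.

    Assumption (not from the source): idealized binomial sampling of
    2 * n_e gametes, each carrying the focal allele with probability p.
    """
    rng = random.Random(seed)
    n_gametes = 2 * n_e  # diploid: 2 * Ne gametes sampled per generation
    changes = []
    for _ in range(replicates):
        copies = sum(1 for _ in range(n_gametes) if rng.random() < p)
        changes.append(copies / n_gametes - p)
    mean = sum(changes) / replicates
    return sum((c - mean) ** 2 for c in changes) / (replicates - 1)


def expected_variance(p, n_e):
    """Expected variance of allele-frequency change, p(1 - p) / (2 Ne)."""
    return p * (1 - p) / (2 * n_e)


if __name__ == "__main__":
    p, n_e = 0.3, 100  # hypothetical values
    print("simulated:", allele_freq_change_variance(p, n_e))
    print("expected: ", expected_variance(p, n_e))
```

Because the expected variance shrinks in proportion to $1/(2N_e)$, rerunning the comparison with a larger `n_e` illustrates why sampling error in $p$ so easily swamps the signal in real data.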

After factoring out the contribution from mutation (below), standing levels of variation at silent sites imply $N_e \cong 10^5$ for vertebrates, $\sim 10^6$ for invertebrates and land plants, $\sim 10^7$ for unicellular eukaryotes (and fungi), and $> 10^8$ for free-living prokaryotes (66). Although crude, these estimates imply that the power of drift is substantially elevated in eukaryotes, e.g., by at least three orders of magnitude in large multicellular species relative to prokaryotes. It is also clear that the genetic effective sizes of populations are generally very far below actual population sizes.
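The step of "factoring out the contribution from mutation" amounts to inverting the equilibrium relation between heterozygosity, $N_e$, and $u$. A minimal sketch follows, under the neutral-equilibrium assumptions stated in the text; the function names and the input values (a silent-site heterozygosity of 0.004 and a mutation rate of $10^{-8}$) are hypothetical and serve only to illustrate the arithmetic.

```python
def effective_size_from_heterozygosity(pi, u, ploidy=2):
    """Solve pi = 2 * ploidy * Ne * u for Ne.

    This gives pi = 4 Ne u for diploids and pi = 2 Ne u for haploids,
    and is only meaningful for neutral sites with pi << 1.
    """
    if ploidy not in (1, 2):
        raise ValueError("ploidy must be 1 (haploid) or 2 (diploid)")
    if not 0 < pi < 1 or u <= 0:
        raise ValueError("need 0 < pi < 1 and u > 0")
    return pi / (2 * ploidy * u)


def mutation_to_drift_ratio(n_e, u):
    """Ratio of the power of mutation to drift, u / [1 / (2 Ne)] = 2 Ne u."""
    return 2 * n_e * u


if __name__ == "__main__":
    pi, u = 0.004, 1e-8  # hypothetical inputs, not data from the review
    n_e = effective_size_from_heterozygosity(pi, u)
    print("Ne estimate:", n_e)
    print("2*Ne*u:", mutation_to_drift_ratio(n_e, u))
```

Note that the composite $2N_e u$ is recovered directly from heterozygosity (as half of it in a diploid) without any knowledge of $u$, which is why the text treats it as the more robust quantity.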

Mutation

Although mutation rates have historically been estimated by indirect methods such as reporter constructs and phylogenetic analysis, the recent application of high-throughput sequencing methods to long-term mutation-accumulation lines has yielded highly refined genome-wide estimates of the mutation rate in several model systems (66). The results indicate a clear increase in the mutation rate with increasing genome size and organism complexity, from an average $< 10^{-9}$ base substitutions per site per generation in prokaryotes to $> 10^{-8}$ in mammals, with invertebrates and land plants exhibiting intermediate values. Results from in vitro assays of replicative polymerases and the mismatch-repair enzymes suggest that these modifications are due, at least in part, to variation in the efficiency of the replication/repair machinery (69), although multicellular species may also be capable of minimizing germline mutation rates per cell division by maintaining a relatively nonmutagenic environment in such cell lineages.

When combined with the above-noted estimates of standing variation on silent-site heterozygosity, these direct observations further imply an inverse relationship between the mutation rate and $N_e$ in accordance with the drift-barrier hypothesis for mutation-rate evolution (66, 69). The mechanistic underpinnings of this hypothesis are derived from two generalities. First, owing to the predominance of deleterious mutations, selection generally operates to reduce the mutation rate. Second, selection on the mutation rate is a second-order effect, in that the magnitude of selection against a weak mutator allele is a function of the excess number of deleterious mutations that it promotes and remains in linkage disequilibrium with. In sexually reproducing species, the association with unlinked mutations will be eliminated in just two generations on average, whereas mutators in asexual populations acquire a higher load, as associations with mutations are retained indefinitely unless removed by the slower action of selection. The net consequence of these effects is that the selective advantage of an antimutator allele is generally on the order of the reduction in the genome-wide deleterious-mutation rate in asexual populations and $s$ times that in sexual populations (18, 41, 45), where $s$ is the average selective disadvantage of a deleterious mutation [which is typically on the order of 0.001 to 0.01 (78)].

From these observations, it can be concluded that because genome-wide deleterious-mutation rates are typically on the order of 0.01 to 1.0 (78) and are undoubtedly influenced by many dozens of loci, most single-amino-acid substitutions in replication/repair loci will have very small selective consequences. Once the mutation rate has been driven down to the level at which the improvement of fitness by antimutators is typically less than the magnitude of drift, $\sim 1/(2N_e)$, the lower bound to the mutation rate has been reached. Thus, the increase in the per-generation mutation rate in eukaryotes relative to prokaryotes, and in multicellular species in particular, is likely to be a simple pathological consequence of a reduction in $N_e$, and not an outcome of selection to improve the long-term evolvability of such taxa.
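The drift-barrier argument reduces to a comparison of two small numbers, which the following sketch makes explicit. It is an order-of-magnitude illustration only: the function names are hypothetical, the reduction in the deleterious-mutation rate (`delta_u`) is an assumed input, and the scaling of the antimutator advantage (the reduction itself in asexuals, $s$ times the reduction in sexuals) is taken directly from the text.

```python
def antimutator_advantage(delta_u, s=0.01, sexual=True):
    """Approximate selective advantage of an antimutator allele that lowers
    the genome-wide deleterious-mutation rate by delta_u.

    Asexual populations: on the order of delta_u.
    Sexual populations: on the order of s * delta_u, where s is the average
    selective disadvantage of a deleterious mutation.
    """
    return s * delta_u if sexual else delta_u


def visible_to_selection(advantage, n_e):
    """True if the advantage exceeds the power of drift, 1 / (2 Ne)."""
    return advantage > 1.0 / (2 * n_e)


if __name__ == "__main__":
    delta_u = 0.001  # hypothetical reduction in deleterious-mutation rate
    for n_e in (1e4, 1e5, 1e8):
        adv = antimutator_advantage(delta_u, s=0.01, sexual=True)
        print(n_e, adv, visible_to_selection(adv, n_e))
```

Under these assumed values, the same antimutator is effectively neutral in the smallest population and selectable in the larger ones, which is the sense in which a reduction in $N_e$ raises the floor on the mutation rate.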


Recombination

As in the case of the mutation rate, there is a strong association between recombination rates and effective population sizes, although in this case the causal effect is indirect. As will be discussed below, eukaryotic genome sizes substantially increase in response to declining $N_e$, but such expansions are generally a consequence of increases in chromosome size rather than chromosome number. Thus, as a consequence of a nearly invariant feature of meiosis (the occurrence of approximately one crossover event per chromosome arm per meiotic event), the average rate of recombination per unit of physical distance exhibits a strong inverse relationship with genome size (Figure 1). Most of the remaining noise around this relationship is due to variation in haploid chromosome number ($C$), with the average amount of recombination per unit of physical distance on chromosomes being closely approximated by $2C/G$ for sexually reproducing eukaryotes, where $G$ is the haploid genome size.
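The $2C/G$ approximation is simple enough to evaluate directly. The sketch below is illustrative only: the function name is hypothetical, and the two example genomes use round figures commonly quoted for budding yeast (16 chromosomes, about 12 Mb) and human (23 chromosomes, about 3,100 Mb), which are assumptions introduced here rather than values taken from Figure 1.

```python
def recombination_per_mb(haploid_chromosomes, genome_size_mb):
    """Average recombination per unit of physical distance, 2C / G.

    Rests on the premise in the text of roughly one crossover per
    chromosome arm per meiosis; the result is per Mb when G is in Mb.
    """
    if haploid_chromosomes <= 0 or genome_size_mb <= 0:
        raise ValueError("C and G must be positive")
    return 2 * haploid_chromosomes / genome_size_mb


if __name__ == "__main__":
    # Hypothetical round figures used only for illustration.
    examples = {"yeast-like": (16, 12.0), "human-like": (23, 3100.0)}
    for name, (c, g) in examples.items():
        print(name, recombination_per_mb(c, g))
```

With $C$ held fixed, doubling $G$ halves the result, which corresponds to the slope of −1 on the log-log axes of Figure 1; differences in $C$ shift a species between the parallel diagonals.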

Although this pronounced pattern broadly defines the average levels of recombinational activity across the eukaryotic tree of life, it is also true that considerable variation exists in local rates of recombination within chromosomes and within and among individuals (20, 90). Nonetheless, although recombination hot spots are common (7, 38), they appear to be transient phylogenetically, and the evidence suggests that recombination-rate variation may be largely a function of neutral processes (19). Indeed, recombination hot spots are expected to be self-destructive, as they are typically converted by their less recombinogenic allelic partners during gene conversion events (86).

Thus, although considerable theoretical effort has been addressed toward understanding the role of adaptation in promoting recombination-rate modifiers (6, 33, 43), the bulk of the evidence suggests that recombination rates per meiotic event are not fine-tuned by natural selection. If anything, given the conserved deployment of just one to two crossover events per chromosome across the entire eukaryotic domain, selection appears to minimize genetic exchange within chromosomes, perhaps to minimize the damage that can accompany double-strand breaks. The simplest ways to modify global recombination rates are then to either alter the numbers of chromosomes (for which there is no phylogenetic pattern) or restrict the frequency of meiosis (in species capable of alternating episodes of sexual and asexual propagation).

Figure 1. A compilation of estimates of the average amount of recombination per unit of physical distance in eukaryotic genomes, derived from 137 meiotic genetic maps. The diagonal lines have slopes of −1. [Log-log plot: genome size (Mb), $10^1$ to $10^5$, on the horizontal axis; recombination rate per kilobase, $10^{-7}$ to $10^{-3}$, on the vertical axis. Groups plotted: uni- or oligocellular species, angiosperms, vertebrates, invertebrates. The diagonals are labeled "large numbers of chromosomes" and "small numbers of chromosomes".]

The net result of the per-site recombination rate declining and the mutation rate increasing with genome size is that the ratio of rates of recombination to mutation per nucleotide site increases from ∼1.0 in multicellular species to >100 in unicellular eukaryotes (65). However, the quantitative consequences of these differences in recombinational activity for genome evolution are not entirely clear. Because the large genomes of multicellular eukaryotes often contain >90% noncoding DNA with unknown (and in many cases, probably nonexistent) functions, it is likely that the disparity in rates of recombination between selected sites is less than that suggested above, owing to the greater dispersion of such sites in species with greater abundance of spacer DNA.

Figure 2. The scaling of genome content with genome size across ∼150 eukaryotes. The full data set is available from the authors upon request. Note that the pool of intronic DNA can contain mobile elements and/or pseudogenes. Abbreviation: LTR, long terminal repeat. [Six log-log panels: coding exons, internal introns, pseudogenes, non-LTR retrotransposons, LTR retrotransposons, and DNA transposons. Horizontal axis: genome size (Mb); vertical axis: genome content (Mb), $10^{-2}$ to $10^4$, with reference lines at 100%, 10%, and 1% of the genome. Groups plotted: vertebrates, invertebrates, land plants, green algae, fungi, alveolates, excavates, stramenopiles.]

GENOME ARCHITECTURE AND GENE-STRUCTURAL COMPLEXITY

The observations outlined in the preceding paragraphs clearly indicate that the evolution of eukaryotes was accompanied by the emergence of major differences in the population-genetic environment, with species in multicellular lineages exhibiting extreme lows for the power of recombination and extreme highs in the rates of mutation and magnitude of random genetic drift. Together, these changes produce a synergistic combination of conditions that increase the likelihood of mildly deleterious mutation accumulation while reducing the ability of selection to promote beneficial mutations with small advantages. The original argument that the consequences of such a syndrome are reflected on a genome-wide scale in a fairly regular way across the entire tree of life was based on a relatively small number of taxa (72), but it is now possible to expand this earlier survey to ∼150 eukaryotic species (Figure 2).

As genome sizes increase from ∼1 Mb in the smallest eukaryotes to >10 Gb in the largest, the most pronounced change is a progressive increase in the contribution of noncoding DNA. This pattern is common to all forms of noncoding sequence, from the three main classes of mobile elements to introns to pseudogene-associated sequences. Variation exists in the average positioning of major phylogenetic groups on an axis of genome size, but there is an overall continuity of scaling between groups such that the content of a large fungal genome approximates that of an invertebrate genome of comparable size, etc. This general syndrome of genomic bloating, operating in parallel across eukaryotic lineages, is thought to be a consequence of the reduced efficiency of selection opposing the accumulation of excess DNA in taxa with reduced $N_e$ and a mutational bias toward insertions of large segments of DNA (62, 65).

The underlying premise of this hypothesis is that essentially all forms of excess DNA are mutationally hazardous. For example, the addition of every intron to a gene imposes a constraint on ∼30 nucleotides of gene sequence required for proper recognition by the spliceosome (67). The expanded 5′ untranslated regions of eukaryotic genes serve as mutational substrate for the origin of premature translation-initiation codons, which typically result in defective transcripts (77). In addition, although completely nonfunctional intergenic DNA cannot suffer from a loss-of-function mutation, it can incur mutations that cause adjacent genes to turn on at inappropriate times or places (27, 32).

As will be described below, the mutational-hazard hypothesis provides a unifying explanatory framework for a broad array of previously disconnected observations on genome size and gene structure. A few other hypotheses have been proposed as explanations for variation in genome size, but these have no obvious connection to the issue of gene-structural evolution, and have other limitations as well. For example, Cavalier-Smith (13, 14) has been a persistent advocate of the view that genome size (independent of gene content) is a quantitative trait selected upon for its influence on a variety of interrelated cellular features, including the size of the nuclear envelope and the flow of transcripts to the cytoplasm. However, as noted in Figure 2, the primary mechanism of genome-size expansion in eukaryotes is the proliferation of diverse families of mobile elements, all of which serve as major sources of deleterious mutation, and it is unclear how often the major mutational burden imposed by such elements (resulting from the inactivation of host genes into which they hop) can be offset by any cellular advantages of bulk DNA. One situation in which such a barrier might be at least transiently surmounted is whole-genome duplication.

From a quite different perspective, Hessen et al. (34) have suggested that patterns of nutrient limitation in various lineages select for alternative strategies for investing in DNA versus RNA, most notably the reallocation of nitrogen and phosphorus from the genome to ribosomes in rapidly growing species. However, as noted by Vieira-Silva et al. (105; see also 35), there is no relationship between genome size and cell growth rates across a wide range of prokaryotes, and through increases in the numbers of origins of replication, large genomes can come to replicate just as rapidly as small genomes (65). Thus, there is no evidence that genome sizes are limited by energetic or time constraints (with viruses being a possible exception).

One of the primary predictions of the mutational-hazard hypothesis is that the susceptibility of a genome to the accumulation of excess DNA is a function of the ratio of the power of mutation to drift (65). Consider a modification to a gene’s structure that increases the susceptibility to degenerative mutations. If such an embellishment increases the mutational target size by $n$ nucleotides (e.g., as noted above, $n \approx 30$ in the case of introns), and the mutation rate per site is $u$, a new allele carrying such a change will experience an excess rate of degradation to defective alleles equal to $nu$. This differential susceptibility to mutation operates exactly like selection, in that the probability of fixation of a structurally modified allele (assuming no change in the protein sequence or other functional aspects) is defined by the standard diffusion approximation of Kimura (44), $2s/(1 - e^{-4N_e s})$, with the selective disadvantage being set equal to $s = -nu$ (61). The quantity $4N_e s$ is equivalent to the ratio of the relative strength of selection, $s$, and drift, $1/(2N_e)$. If $|4N_e s| \ll 1$, drift dominates the dynamics of allele-frequency change, and the probability of fixation of a deleterious allele is closely approximated by the initial frequency $1/(2N_e)$, whereas if $|4N_e s| \gg 1$, selection dominates, and the probability of fixation of a deleterious allele asymptotically approaches zero.
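The two limiting regimes of the fixation probability can be verified numerically. The following sketch evaluates the expression $2s/(1 - e^{-4N_e s})$ exactly as written in the text; the function names, the overflow guard, and the example values are illustrative assumptions rather than anything specified in the source.

```python
import math


def fixation_probability(s, n_e):
    """Diffusion approximation 2s / (1 - exp(-4 Ne s)) for a new mutation.

    s < 0 denotes a deleterious allele. The neutral limit is 1 / (2 Ne).
    """
    x = 4 * n_e * s
    if abs(x) < 1e-12:
        return 1.0 / (2 * n_e)  # neutral limit of the expression
    if x < -700:
        return 0.0  # exp(-x) would overflow; probability is effectively zero
    # math.expm1 keeps precision when |x| is small.
    return 2 * s / -math.expm1(-x)


def hazard_fixation_probability(n, u, n_e):
    """Fixation probability of a modification adding n mutational-target
    sites at per-site mutation rate u, i.e., with s = -n * u."""
    return fixation_probability(-n * u, n_e)


if __name__ == "__main__":
    n, u = 30, 1.3e-8  # intron example and human mutation rate from the text
    for n_e in (1e4, 1e5, 1e6, 1e7, 1e8):
        p_fix = hazard_fixation_probability(n, u, n_e)
        # Ratio to the neutral expectation 1 / (2 Ne)
        print(n_e, p_fix, p_fix * 2 * n_e)
```

The last printed column is the fixation probability relative to neutrality: it stays near 1 while $|4N_e s| \ll 1$ and collapses toward 0 once $|4N_e s| \gg 1$.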

Thus, the likelihood of a mutationally hazardous modification fixing depends critically on whether the composite quantity $4N_e nu$ is greater or less than 1. Rewriting the latter condition as $4N_e u < 1/n$, to denote the point where the efficiency of selection is compromised enough that a harmful allele (or gene region) can readily drift to fixation, clarifies a central point of the mutational-hazard hypothesis. Although it is commonly implied that the key parameter underlying this hypothesis is the effective population size (e.g., 31, 100, 112), this is not strictly true. Rather, the likelihood of success of a mutationally hazardous gene modification is a function of the ratio of the power of mutation to drift, $2N_e u$. One of the more dramatic sources of support for this point derives from observations on organelle evolution (75). Although the nuclear genomes of land plants and animals have independently arrived at quite similar architectural states (Figure 2), these lineages have evolved enormous differences in the structure of organelle genomes: those in land plants are bloated with intergenic DNA and introns, whereas those in animals are extremely compact, to the point that the translation-initiation and translation-termination codons of adjacent genes often overlap. These dramatic disparities can be explained by the fact that mutation rates in plant organelles are typically ∼100 times lower than those in animal organelles, whereas the nuclear mutation rates and average values of $N_e$ are roughly comparable across both lineages (65).

The plausibility of the mutational-hazard hypothesis derives from the fact that the mutational costs of various additions to eukaryotic genes are generally weak enough that such modifications are vulnerable to passive accumulation in species with sufficiently small $N_e$. Consider, for example, the consequences of adding an intron that imposes constraints on the sequences at ∼30 key nucleotide sites (67). Given that the human mutation rate per nucleotide site is $\sim 1.3 \times 10^{-8}$ per generation (67), the selective disadvantage ($nu$) of a newly arisen intron-containing allele in the human lineage is then $\sim 4 \times 10^{-7}$. Because the basic splicing machinery is fairly conserved across all eukaryotes, this cost is likely to be quite general over all lineages (with the caveat that mutation-rate differences must be accounted for, yielding a lower value of $nu$ for unicellular species). Thus, as a first-order approximation, $N_e$ in excess of $10^6$ is required for selection to be effective at preventing the fixation of a new intron-containing allele in a multicellular eukaryote (assuming no additional costs associated with the intron). As noted above, the effective population sizes of most land plants and animals are near or well below this threshold, whereas estimates of $N_e$ for unicellular eukaryotes are in the vicinity of the threshold and may often exceed it. Thus, the association of the position of the threshold behavior for average intron investment with unicellular lineages (Figure 2) is consistent with theoretical expectations.
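The arithmetic behind the intron threshold can be reproduced in a few lines. This is a sketch under the assumptions of the text ($n \approx 30$, $u \approx 1.3 \times 10^{-8}$, and the criterion $4N_e nu = 1$); the function names and the regime labels are hypothetical conveniences.

```python
def mutational_hazard(n, u):
    """Selective disadvantage n * u of adding n mutational-target sites."""
    return n * u


def threshold_ne(n, u):
    """Effective size at which 4 * Ne * n * u = 1."""
    return 1.0 / (4 * n * u)


def regime(n_e, n, u):
    """Label the regime implied by the composite quantity 4 * Ne * n * u."""
    return "selection-dominated" if 4 * n_e * n * u > 1 else "drift-dominated"


if __name__ == "__main__":
    n, u = 30, 1.3e-8  # values quoted in the text for introns in humans
    print("nu:", mutational_hazard(n, u))
    print("threshold Ne:", threshold_ne(n, u))
    # Approximate Ne values from the silent-site estimates quoted earlier;
    # the same u is applied throughout, which is a simplification.
    for label, n_e in (("vertebrates", 1e5), ("unicellular eukaryotes", 1e7)):
        print(label, regime(n_e, n, u))
```

Because the threshold scales as $1/(4nu)$, a 100-fold lower mutation rate raises the $N_e$ needed for effective selection by the same factor, which is the logic applied to plant versus animal organelle genomes.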

The scatter around the general gradients observed in Figure 2 is entirely expected, as species near the threshold can be expected to wander in both directions over evolutionary time, and genome size cannot respond instantaneously to such changes. For example, because many mechanisms exist for both the gain and loss of excess DNA, the mutational-hazard hypothesis implies that if $N_e$ were to expand after a sufficiently prolonged earlier period of lower $N_e$, all aspects of genome architecture should undergo a gradual modification in response to the change in the population-genetic environment. A particularly pronounced example of such behavior is observed in the age-distributional patterns for a variety of insertions in mammalian genomes (97). Long terminal repeat (LTR) retrotransposons can be aged from the divergence of their terminal sequences, which are identical at the time of birth of a new element, and the times of origin of inclusions such as processed pseudogenes and fragments of nuclear insertions of mitochondrial DNA (numts) can be estimated by reference to their native gene sources. In a wide variety of invertebrates, land plants, and unicellular eukaryotes, the age distributions of such insertions exhibit negative-exponential forms, consistent with a long-term steady-state birth/death process (65). However, mammalian genomes exhibit dramatic age-distributional bulges for a wide variety of insertion types, which can only be explained by recent increases in loss rates and/or recent decreases in insertion rates of such elements.
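The contrast between a negative-exponential age distribution and a bulged one can be illustrated with a minimal deterministic birth/death sketch. Everything here is an assumption made for illustration (the function name, the arbitrary time units, and the rate values); it is not the analysis used in the cited studies. Elements of age $a$ are taken to survive in proportion to the insertion rate $a$ time units ago, discounted by the loss accumulated since.

```python
import math


def age_distribution(ages, insertion_rate, loss_rate):
    """Expected density of surviving inserts at each age.

    ages must be positive and increasing. insertion_rate(a) is the rate at
    which elements were inserted a time units ago; loss_rate(x) is the
    per-element loss rate x time units ago.
    """
    densities = []
    cumulative_loss = 0.0
    previous_age = 0.0
    for age in ages:
        # Accumulate the loss hazard over (previous_age, age] by the midpoint rule.
        midpoint = 0.5 * (previous_age + age)
        cumulative_loss += loss_rate(midpoint) * (age - previous_age)
        densities.append(insertion_rate(age) * math.exp(-cumulative_loss))
        previous_age = age
    return densities


ages = list(range(1, 101))  # arbitrary time units

# Constant insertion and loss rates: a negative-exponential age distribution.
steady = age_distribution(ages, lambda a: 1.0, lambda x: 0.02)

# Hypothetical recent drop in insertion rate (ages < 40): a bulge appears.
bulged = age_distribution(ages, lambda a: 0.2 if a < 40 else 1.0, lambda x: 0.02)
peak_age = ages[bulged.index(max(bulged))]
```

In the second scenario the peak sits at the age at which the insertion rate dropped, so young elements are underrepresented relative to the steady-state expectation. A recent rise in the loss rate can be explored the same way by changing `loss_rate`.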

There are three remarkable features of the mammalian data. First, the peaks of the age distributions of inserts occur at roughly comparable times, ∼35–60 Mya, across a wide array of mammalian orders. Thus, because the orders of mammals began to diverge ∼100 Mya (113), these changes represent a dramatic example of parallel genomic evolution. Second, the dates of the recent declines in the age distributions of inserts vary among different types of inserts, with peaks appearing ∼60 Mya for pseudogenes, 30 Mya for LTR retrotransposons, and 20 Mya for numts. Third, extrapolation from the age distributions of inserts implies that mammalian genome sizes prior to 60 Mya were approximately double those in modern-day species, and that today’s genomes are still in a contraction phase (i.e., have not reached equilibrium).

Our interpretation of these results is that the extinction of the dinosaurs at the K-T boundary (∼65 Mya) facilitated global expansions of the effective population sizes of most mammalian lineages, thereby reducing the influence of genetic drift and increasing the efficiency of selection against weakly disadvantageous inserts. (Notably, the platypus, the ancestors of which may never have experienced a pronounced population-size expansion, is the only mammal for which a phase of genomic contraction is not obvious.) Although a gradual increase in $N_e$ would be expected to influence the demography of all forms of genomic insertions, the types to respond first would be those with the most mutationally harmful effects, which would imply a ranking of average deleterious effects of pseudogenes > LTR retrotransposons > numts. Finally, if correct, the population-size expansion hypothesis should have implications for all other aspects of mammalian genome evolution. Thus, it is notable that although there has been a loss of GC content in mammalian genomes over time, the loss rate has more recently declined, possibly because of an increase in the effectiveness of biased gene conversion toward GC content (9). Because biased gene conversion operates like selection (87), with increased efficiency in larger populations, this interpretation, if correct, would provide independent support for increased $N_e$ in post-K-T mammalian lineages.

These types of observations suggest several reasons for caution against relying on descriptions of one or a few genomes as evidence for or against the mutational-hazard hypothesis. First, owing to the fact that neutral mutations fix within $4N_e$ generations on average (46), estimates of the quantity $4N_e u$ are valid over no more than the past $4N_e$ generations, whereas the features of genomes that expand/contract in response to changes in $N_e u$ may take much longer periods to unfold. Second, single genome sequences need not always be representative of the broader lineages from which they are derived. For example, a recent comparison of the Daphnia pulex genome with those of other arthropods led to the conclusion that the Daphnia lineage has experienced a substantial expansion in intron number (17). However, the clone selected for sequencing is known to be a member of an isolated lineage with very low recent $4N_e u$, and most of its unique introns are not even found in members of the species in other nearby populations (55). Finally, although the general syndrome of genomic bloating in response to reductions in $4N_e u$ appears to be widespread in eukaryotes, this pattern arises because of the predominance of large-scale insertions in eukaryotic genomes. The opposite pattern would be expected in species with a mutational bias toward deletion. Thus, it is notable that a syndrome of genome-size reduction in response to reduced population size is observed in prokaryotes, which also exhibit a deletion bias (48).

Finally, it is worth emphasizing that the types of genomic alterations that accumulate in response to a reduction in $4N_e u$ need not be permanently deleterious, and may even lead to novel adaptations by modifying the possible pathways for descent with modification. For example, once established, introns allow the generation of multiple isoforms from identical precursor messenger RNAs (mRNAs) through alternative splicing, generating a level of diversity at the protein level that is otherwise not possible from a fixed number of genes, and also providing an additional potential layer of posttranscriptional gene regulation (51, 101). However, it is still unclear what proportion of alternative splicing is functional as opposed to being an indirect artifact of imperfect splicing (92), and in yeast, most introns appear to be unnecessary for normal function (93). Even pseudogenes and transposable elements can sometimes be recycled to yield functional elements. After losing their coding capacity, pseudogenes can still function as noncoding RNAs in regulating parent-gene expression (21, 103); likewise, some transposable elements have been domesticated by their host genome, where they serve as regulatory elements (11, 94). Whether this recycling of so-called junk DNA is more the rule than the exception is still unclear, but the examples cited above illustrate the fact that evolution is an opportunistic process that can sometimes take advantage of an unfavorable situation.

PROTEIN STRUCTURE AND COMPLEX CELLULAR ADAPTATIONS

The preceding overview provides ample evidence that many aspects of gene structure respond in dramatic fashion to changes in the population-genetic environment. A key remaining question is whether the principles suggested for the proliferation of genomic modifications in response to reductions in $4N_e u$ can be extended to observations at the protein and/or cellular levels. If this were the case, then the possibility exists that the very nature of the eukaryotic cell has been molded, at least in part, by historical aspects of the rates of drift, mutation, and recombination, rather than being exclusively a product of external ecological challenges (30, 63–65, 82, 102).

Drift, Mutation, and the Modification of Protein Features

In the spirit of encouraging future research in this direction, we close with a consideration of the potential influence of drift and mutation on the evolution of various features of the protein repertoire of species. Some aspects of the mutational-hazard theory readily extend to this level. First, alternative forms of protein architecture will be underlain by gene structures that naturally impose different levels of vulnerability to deleterious mutations, and when the fitness-related consequences are sufficiently small relative to the magnitude of genetic drift, mutational pressure in the direction of more precarious protein structures will eventually drive the latter to fixation. Second, once established, modified protein structures may be exploited secondarily as substrates for novel pathways of adaptive evolution.

What are the population-genetic mechanisms by which these types of changes in protein sequences come about? The simplest way to consider how the pace of adaptive evolution might be altered by changes in $N_e$ involves the potential fates of unconditionally beneficial single-site amino-acid alterations. Assuming an ideal random-mating population and letting $u_b$ denote the rate of mutation to advantageous amino-acid substitutions, because the rate of input of mutations at the population level is $2N_e u_b$ and the probability of fixation of strongly beneficial mutations [defined as $4N_e s \gg 1$, where $s$ is the selective advantage in heterozygotes (44)] is ∼$2s$, the rate of fixation of such mutations at the population level is ∼$4N_e u_b s$. Taken at face value, this expression suggests that the rate of adaptation will increase with population size. However, this is only strictly true if $N_e$ and $u_b$ are independent. Because the mutation rate declines as ∼$N_e^{-0.6}$ (66), the overall scaling of the rate of adaptation under this simple model is actually ∼$N_e^{0.4}$, a much more gradual response to population-size increase. However, this scaling can also be misleading, as it refers to the rate of adaptation on a per-generation basis. Small organisms with large $N_e$ have reduced generation times relative to large multicellular species with small $N_e$, with an approximate scaling of $N_e^{-0.8}$ (68), which implies a rate of adaptation on an absolute timescale roughly proportional to $N_e^{1.2}$. Given that the window of beneficial mutations subject to positive selection ($4N_e s \gg 1$) must increase with $N_e$, the opportunity for adaptive evolution on an absolute timescale via mutations with additive effects clearly increases with the effective population size.
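The chain of scalings in this argument can be written out step by step. The symbol $T_{\mathrm{gen}}$ for generation time is introduced here only for illustration, and $s$ is assumed not to vary with $N_e$.

```latex
\begin{align*}
\text{fixation rate per generation} &\approx (2N_e u_b)(2s) = 4N_e u_b s, \\
u_b \propto N_e^{-0.6} \;&\Rightarrow\; 4N_e u_b s \propto N_e^{1-0.6} = N_e^{0.4}, \\
T_{\mathrm{gen}} \propto N_e^{-0.8} \;&\Rightarrow\; \frac{4N_e u_b s}{T_{\mathrm{gen}}} \propto N_e^{0.4+0.8} = N_e^{1.2}.
\end{align*}
```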

In contrast, letting $u_d$ denote the rate of production of deleterious mutations having neutral population dynamics ($|4N_e s| \ll 1$), $2N_e u_d$ such mutations enter the population per generation and fix with probability $1/(2N_e)$, yielding an overall rate of fixation of very mildly deleterious mutations equal to $u_d$ per generation. We are then left with the issue of how $u_d$ scales with population size. Considering the negative scaling of the overall mutation rate with population size, $N_e^{-0.6}$, along with the conversion to absolute time (using $N_e^{0.8}$), the scaling of the absolute-time rate of accumulation of effectively neutral but deleterious mutations, $N_e^{0.2}$, is nearly independent of $N_e$. However, these considerations do not take into account the fact that the distribution of deleterious mutations is strongly biased toward very small effects (4, 23, 58), which will cause a rapid increase in the pool of effectively neutral (but absolutely deleterious) mutations as $N_e$ declines. Thus, there seems little question that the rate of fixation of mildly deleterious mutations increases with population-size reductions.
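Both limiting fixation probabilities used above (∼$2s$ for strongly beneficial mutations and $1/(2N_e)$ for effectively neutral ones) are special cases of the standard diffusion approximation usually attributed to Kimura. The sketch below assumes an ideal population in which census and effective sizes coincide and a new mutation starts as a single copy; the function name and the numerical guards are illustrative choices.

```python
import math


def fixation_probability(ne: float, s: float) -> float:
    """Diffusion approximation to the fixation probability of a new additive
    mutation with heterozygous selection coefficient s, starting at frequency
    1/(2*ne) in an ideal diploid population of size ne."""
    scaled = 4.0 * ne * s
    if abs(scaled) < 1e-12:
        return 1.0 / (2.0 * ne)  # neutral limit
    if -scaled > 700.0:
        return 0.0  # strongly deleterious: avoid overflow in expm1
    # (1 - exp(-2s)) / (1 - exp(-4*ne*s)), written with expm1 for accuracy.
    return math.expm1(-2.0 * s) / math.expm1(-scaled)


if __name__ == "__main__":
    print(fixation_probability(1e4, 0.01))   # close to 2s
    print(fixation_probability(1e4, -1e-7))  # close to 1/(2*ne)
    print(fixation_probability(1e4, -0.01))  # vanishingly small
```

The middle case illustrates the point made in the text: a mutation that is deleterious in absolute terms but has $|4N_e s| \ll 1$ fixes at essentially the neutral rate.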

The evidence that the routes taken in protein evolution are influenced by the magnitude of random genetic drift is manifest. For example, bacteria that have relinquished a free-living form in favor of an endosymbiotic or pathogenic lifestyle occupy an environment that encourages reductions in $N_e$, and this is often reflected in losses of adaptive codon bias, increased accumulation of mildly deleterious amino-acid substitutions, elevated expression of molecular chaperones to accommodate protein-folding defects, and (in some cases) massive release of mobile-element activity and inactivation of entire genes (8, 12, 24, 84, 85, 107, 111). Species that have abandoned sexual reproduction and consequently experience elevated levels of genetic draft accumulate an excess of mildly deleterious amino-acid substitutions (5, 40, 89, 91). Similar observations have been made with respect to island species experiencing increased levels of drift due to population-size reductions (39, 52, 114), and even isolated populations of free-living bacteria may be vulnerable to excess deleterious-mutation accumulation (22). Finally, non-recombining organelle genomes exhibit elevated rates of accumulation of mildly deleterious amino-acid alterations (59, 60, 71), as do non-recombining sex chromosomes (2, 3, 42, 80).

The Emergence of Protein Complexes

Mutations with mild enough deleterious effects to drift to fixation are unlikely to qualitatively alter the functional core of a protein and instead may have smaller, superficial effects. For example, although optimally constructed proteins typically fold into forms that protect backbone hydrogen bonds from interactions with surrounding water molecules, many kinds of amino-acid substitutions can compromise such features, resulting in adhesive surface patches (26). A comprehensive study of a set of >100 protein orthologs with known structures revealed a gradient in the average level of exposure of backbone hydrogen bonds, with an ∼50% increase from prokaryotes to unicellular eukaryotes to invertebrates and land plants to vertebrates (25). This is the same ordering noted above for the expansion of genome size and gene-structural complexity, providing suggestive evidence that the shift in population-genetic environment across these domains of life has repercussions that extend to the protein level.

Although excessive deleterious-mutation accumulation can lead to population extinction, by altering the structure of proteins, mildly detrimental mutations of this sort may also provide an opportunity for the emergence of compensatory mutations that alleviate such effects and may even open up previously inaccessible paths to adaptive evolution. Most notably, the widespread expansion of the average protein adhesiveness with a reduction in $N_e$ suggests the possibility that the shift toward higher levels of drift in eukaryotes, particularly in multicellular species, indirectly promotes an intracellular environment conducive to the evolution of protein-protein interactions. For example, although the overall consequences of mild surface defects can be mitigated by compensatory mutations at the same locus, an alternative scenario is the development of a consortium with another potentially interacting protein in a way that hides the surface defect(s) of one or both members (e.g., modification of a monomeric protein into one that can participate in a multimeric assemblage).

Protein complexes are extraordinarily common. Of the thousands of proteins whose structures have been recorded in the PDB (Protein Data Bank), ∼40% are thought to function as multimers (Figure 3). A number of biases may present themselves in such a survey, but taken at face value the data suggest that, relative to homomers (where different subunits are derived from the same locus), heteromers (subunits derived from distinct loci) are nearly three times more common in vertebrates than in unicellular microbes. Many potential advantages of protein-complex formation can be envisioned, including increased structural size and diversity, reduced problems of folding single large proteins, and increased flexibility for allosteric regulation and protein activation (81). However, if protein complexes are generally advantageous, we must explain (a) why multimers are approximately equally frequent across all domains of life, and (b) why heteromers are so much more abundant in taxa that are expected to be less efficient at promoting positively selected mutations.

[Figure 3: bar chart. Vertical axis: frequency (0 to 0.6). Horizontal axis: monomers, homomers, heteromers. Series: eubacteria, unicellular eukaryotes, vertebrates.]

Figure 3
The distribution of protein-complex types across three broad taxonomic assemblages. Relative frequencies of the three main classes of protein structures were obtained from data deposited in 3DComplex.org v2.0 (53, 54), based primarily on taxa for which at least 20 records were available. Sample sizes are 3,545, 628, and 4,125 for eubacteria, unicellular eukaryotes, and vertebrates, respectively.

A potentially relevant point is that proteins with oligomerization potential can also be costly, providing the substrate for well-known human disorders involving inappropriate protein aggregates (e.g., Alzheimer’s and Parkinson’s diseases) (16), and more generally encouraging deleterious promiscuous protein-protein interactions when overexpressed (99, 104). This fine line between adaptive complexity and pathology encourages the hypothesis that the evolutionary roots of numerous protein-protein interactions may simply reside in their initial roles as compensating mechanisms for mild structural defects of monomeric proteins (25).

As in the case of the expansion of gene-structural complexity via the passive accumulation of introns, elongation of untranslated regions, etc., once a protein complex becomes established by nonadaptive mechanisms, the novel architecture may serve as the substrate for the origin of more complex cellular adaptations. In accordance with this view, substantial evidence suggests that homodimeric forms often serve as launching pads for the evolutionary transition to more complex multimeric architectures via gene duplication and divergence of subunits (37, 95, 96). Specific examples of particularly complex traits include the flagellum (56), the nuclear pore complex (1), the spliceosome (98), the proteasome (36), and nucleosomes (79).

A precondition for the coevolutionary origin and refinement of components of a molecular interaction is colocalization of the interacting partners (49). Combined with subcellular localization of mRNAs and proteins, enhanced protein adhesiveness provides a simple starting point for such interaction. However, the opportunity for an advantageous molecular interaction need not be sufficient for its evolutionary advancement. For example, the intermediate states between one end point and another can be maladaptive, recombination can sometimes break up adaptive combinations of mutations more rapidly than they are advanced by selection, etc. Thus, to understand the conditions under which complex adaptations (requiring more than one mutation to elicit an improvement) are most likely to be promoted, we must turn to recent advances in population-genetic theory (68, 70, 108, 109). Although the work in this area is quite technical, some generalities of relevance to the current discussion can be made.

Consider the situation in which two mutations are required for the emergence of a stable complex interface, with each single-step mutation being deleterious. For example, a pair of amino-acid residues might need to be modified to produce an Arg-Asp or Glu-Lys ionic-pair or salt-bridge interaction. If $N_e$ is small, the population will evolve via sequential fixation, with the fixation of the initially deleterious state occurring before the secondary mutation appears. However, if the population size is sufficiently large that more than one single-site mutation arises per generation, the single-site (maladaptive) mutations will always be present at a selection-mutation balance frequency of ∼$u_d/\delta$, where $u_d$ is the total mutation rate to first-step alleles and $\delta$ is the selective disadvantage of such alleles in heterozygotes. The effective number of such alleles in a diploid population at any time is then ∼$2N_e u_d/\delta$. Because each such allele provides the potential substrate for the origin of an adaptive second-step mutation, which upon appearance will then have a probability of fixation of ∼$2s$, the rate of establishment of the two-site adaptation is ∼$4N_e u_b u_d s/\delta$ per generation, where $u_b$ is the rate of mutation of a first-step allele to an adaptive two-step allele. Again given the negative scaling of mutation rates with $N_e$, this quantity is expected to scale as ∼$N_e^{-0.2}$ on a per-generation basis but as ∼$N_e^{0.6}$ on an absolute-time basis (applying the same $N_e^{0.8}$ conversion to absolute time used above), implying a substantial increase in the rate of establishment of such adaptations with increasing effective population size. In fact, for populations large enough that $4N_e u_b u_d s/\delta$ approaches 1.0, the rate of origin of haplotypes containing the adaptive complex is no longer limiting, and the time to establishment is essentially determined by the time to fixation (subsequent to origin), which is nearly independent of population size. This type of scaling with $N_e$ does not change much with increasing complexity (i.e., increasing numbers of deleterious intermediate states) (70). Thus, population-genetic theory suggests that homomeric structures whose evolution involves intermediate deleterious states are at least as likely to arise in microbes as in multicellular species, a pattern that is in rough accordance with the similar incidence of multimers across diverse domains of life (Figure 3).
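The two-step establishment rate and its scaling exponents can be checked with a short sketch. The function names and the example parameter values are hypothetical; the exponents (−0.6 for each mutation rate, −0.8 for generation time) are those quoted earlier in the text.

```python
def two_step_rate(ne: float, u_b: float, u_d: float, s: float, delta: float) -> float:
    """Per-generation establishment rate of a two-site adaptation,
    approximately 4 * Ne * u_b * u_d * s / delta."""
    # Standing number of first-step alleles at selection-mutation balance.
    standing = 2.0 * ne * u_d / delta
    # Each mutates to the adaptive allele at rate u_b, which then fixes
    # with probability ~2s.
    return standing * u_b * 2.0 * s


def scaling_exponents(n_mutation_terms: int,
                      mutation_exp: float = -0.6,
                      generation_time_exp: float = -0.8):
    """Exponents of Ne for the per-generation and absolute-time rates."""
    per_generation = 1.0 + n_mutation_terms * mutation_exp
    absolute_time = per_generation - generation_time_exp
    return per_generation, absolute_time


if __name__ == "__main__":
    print(two_step_rate(1e6, 1e-8, 1e-8, 0.01, 0.001))
    print(scaling_exponents(1))  # single-site case
    print(scaling_exponents(2))  # two-site case
```

Composing the exponents this way gives 0.4 and 1.2 for a single beneficial site, and −0.2 and 0.6 for the two-site case in which both mutation rates scale with $N_e$.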

For diploid species, a key assumption underlying the scenario outlined above is that the double-mutant allele is immediately advantageous, or at least not detrimental, when heterozygous with alternative allelic forms. Should the allelic product of the double mutant form harmful complexes with products of ancestral alleles, fixation would be strongly inhibited because such alleles would not experience a net advantage until a sufficiently high frequency was reached that the benefits of homozygotes outweighed the disadvantages of heterozygotes. Thus, heterozygote disadvantage imposes a very strong barrier to fixation of alleles, even when there is strong homozygous advantage, and the larger the population size, the lower the likelihood that such alleles will stochastically drift to a high enough frequency to vault the selection barrier.
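The frequency barrier described here can be made concrete with the textbook one-locus model of heterozygote disadvantage. This is a minimal deterministic sketch under assumed conditions (random mating, effectively infinite population); the symbols `h` and `t` and the fitness scheme (ancestral homozygote 1, heterozygote 1 − h, double-mutant homozygote 1 + t) are hypothetical and not taken from the text.

```python
def barrier_frequency(h: float, t: float) -> float:
    """Unstable equilibrium frequency of the new allele, h / (2h + t).
    Below it selection removes the allele; above it selection favors it."""
    return h / (2.0 * h + t)


def next_frequency(q: float, h: float, t: float) -> float:
    """Frequency of the new allele after one generation of selection."""
    p = 1.0 - q
    mean_fitness = p * p + 2.0 * p * q * (1.0 - h) + q * q * (1.0 + t)
    return (q * q * (1.0 + t) + p * q * (1.0 - h)) / mean_fitness


if __name__ == "__main__":
    h, t = 0.05, 0.10
    print(barrier_frequency(h, t))
    print(next_frequency(0.20, h, t))  # starts below the barrier
    print(next_frequency(0.30, h, t))  # starts above the barrier
```

In this deterministic setting the allele can cross the barrier only if drift carries it there, which becomes less likely as population size grows.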

As an example of why such issues are important, consider the domain-swapping model of Bennett et al. (10), whereby a monomeric protein is predisposed to making an evolutionary transition to a homodimeric structure by virtue of its initial structure (e.g., a monomeric subunit that folds to include an interface between two domains separated by a loop). A single large deletion within the loop would prevent the formation of an interface within a single monomeric subunit, while still allowing the swapping of complementary domains across two subunits to yield a homodimer. If, however, the short-looped protein competed with the internal binding of the long-looped version (or vice versa), harmful intermediates would be produced in heterozygotes. Although a number of plausible cases of domain-swapped homodimers have been discovered (57), the preceding arguments suggest that the establishment of such forms is most likely in populations experiencing a high magnitude of random genetic drift, or in haploid species that never experience heterozygosity.

In principle, heteromers may harbor much more potential for long-term evolutionary diversification than homomers, as the amino-acid sequences of different subunits are only free to diverge substantially when they are encoded by different loci. However, although heteromers may ultimately be required for the construction of complex cellular features, their de novo emergence from preexisting loci may be possible in only a narrow set of population-genetic environments. Perhaps the most critical issue is whether the loci involved recombine. If the loci are effectively completely linked, all of the preceding arguments apply, as the consortium of genes will behave like a single locus. However, recombination (either crossing-over within chromosomes or random segregation of different chromosomes) between loci can impose a strong barrier to the emergence of a two-locus adaptation (68, 110).

Consider again the case in which two mutations are required for a novel adaptation, one at each of two loci, with the intermediate (single-mutant) haplotypes being deleterious. Each of the single-locus mutations will then be kept at low frequency by selection-mutation balance until there is an opportunity for expansion of a double mutant. Although recombination between the alternative single-mutant haplotypes can enhance the rate of production of the adaptive combination, because the nonmutant (ancestral) haplotype will predominate in the population at the time of first appearance of the adaptive combination, almost all subsequent recombination events will convert the double mutant back to the maladapted single-mutation types. Thus, if the rate of recombination between loci exceeds the selective advantage of the double mutant, it will be essentially impossible for the adaptive complex to become established unless the power of drift is sufficiently large to allow the intermediate-state haplotypes to simply drift to high enough frequencies to overcome the recombinational barrier. This again shows that despite their potential advantages when fixed in a population, some types of protein assemblages can be strongly inhibited from establishing in populations with large $N_e$, especially if the intermediate states are deleterious.

One final point addresses the issue of heteromeric complexes in a way that directly links to the mutational-hazard hypothesis. Should the function of a protein complex be equally accomplishable via a homomer or heteromer, then aside from the requirement for a gene duplication, development of the latter will be inhibited owing to the fact that heterodimers present double the target size for inactivating mutations. This excess cost of mutation is expected to be small, as it is equivalent to the per-locus mutation rate to defective alleles, which is sufficient to prevent heterodimer establishment only if it exceeds the power of random genetic drift (76). However, because the mutational burden of an entire gene locus is greater than that for small embellishments to a gene, as in the case of the expansion of gene-structural complexity, the mutational burden of excess protein complexity is likely to be sufficient to inhibit the emergence of many forms of protein-protein complexes in populations of unicellular species with very large $N_e$.

SUMMARY POINTS

1. Although it is commonly assumed that virtually all aspects of biodiversity, including those at the genomic and subcellular levels, reflect long-term adaptive tuning to persistent ecological challenges, there is substantial empirical evidence that the three major nonadaptive forces of evolution (mutation, recombination, and random genetic drift) vary by orders of magnitude among phylogenetic lineages.

2. Recent investigations in theoretical population genetics suggest that this phylogenetic range of the population-genetic environment is sufficient to impose major differences in the pathways that are open or closed to evolutionary exploitation in microbes versus multicellular eukaryotes.

3. The mutational-hazard theory, which postulates that virtually all forms of excess DNA impose a weak mutational burden, provides a null hypothesis for interpreting patterns of genome evolution, yielding a potentially unifying explanation for the dramatic gradient in genome size and many aspects of gene-structural complexity across the tree of life. These patterns are not readily explained by alternative hypotheses that invoke DNA as an architectural feature of the cell or that postulate genomic streamlining as a mechanism for enhancing replication rates or minimizing problems with nutrient limitation.

4. Ample evidence also exists that reductions in the efficiency of selection, associated with elevated rates of random genetic drift in various eukaryotic lineages, have been sufficient to facilitate the accumulation of mildly deleterious amino-acid substitutions in proteins as well as to secondarily encourage the emergence of novel protein-protein complexes.

5. The overall interpretation of these results is that a complete understanding of patterns of variation at both the genomic and proteomic levels, and perhaps beyond, cannot be achieved with a framework that assumes natural selection to be the only relevant evolutionary force and fails to consider the unique combinations of mutation, recombination, and random genetic drift experienced by various taxa.

FUTURE ISSUES

1. As the principles outlined above pertain to all levels of organismal diversity, in the bottom-up approach to understanding the mechanisms of evolution, it is now time to move beyond the level of genome architecture to higher-order features such as protein structure. This will provide a natural bridge between genomic and phenotypic evolution, and hence a potential entree into the evolution of cellular features.

2. Most previous studies of protein evolution have focused on primary sequence structure, but compelling evidence now exists that the transitions from prokaryotes to unicellular eukaryotes to multicellular species have been accompanied by changes in the general architectural features of proteins, including the evolution of multimeric structures.

3. Virtually all features of the cell are derived from protein assemblages, and it is now known that many of these originated from ancestral homomeric cores around which other components were recruited by gene duplication and divergence. Thus, general insights into how novel cellular attributes evolve are achievable if a research agenda focused on the comparative architecture of orthologous proteins across divergent taxa can be combined with population-genetic theory on the evolution of complex adaptations.

DISCLOSURE STATEMENT

The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.

ACKNOWLEDGMENTS

This work has been supported by grants to M.L. from the National Science Foundation (EF-0328516 and EF-0827411), National Institutes of Health (R01-GM036827), and U.S. Department of the Army (ONRBAA10-002). F.C. is supported by a Marie Curie International Incoming Fellowship (254202).
