Proc. Natl. Acad. Sci. USA
Vol. 87, pp. 6703-6707, September 1990
Evolution

Molecular phylogeny of Rodentia, Lagomorpha, Primates, Artiodactyla, and Carnivora and molecular clocks
(mammalian phylogeny/DNA sequence trees/branching dates)

WEN-HSIUNG LI*†, MANOLO GOUY*‡, PAUL M. SHARP§, COLM O'HUIGIN*, AND YAU-WEN YANG¶

*Center for Demographic and Population Genetics, University of Texas, P.O. Box 20334, Houston, TX 77225; ‡Laboratoire de Biométrie, Université Lyon I, 69622 Villeurbanne Cedex, France; §Department of Genetics, Trinity College, Dublin 2, Ireland; and ¶Department of Cell Biology, Baylor College of Medicine, Houston, TX 77030

Communicated by Wyatt W. Anderson, May 14, 1990 (received for review November 15, 1989)

Abbreviations: Apo, apolipoprotein; MP method, maximum parsimony method; ST method, method of Sattath and Tversky; Myr, million years.
†To whom reprint requests should be addressed.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

ABSTRACT Phylogenetic analysis of DNA sequences from primates, rodents, lagomorphs, artiodactyls, carnivores, and birds strongly suggests that the order Rodentia is an outgroup to the other four mammalian orders and that Artiodactyla and Carnivora belong to a superordinal clade. Further, there is strong evidence against the Glires concept, which unites Lagomorpha and Rodentia. The radiation among Lagomorpha, Primates, and Artiodactyla-Carnivora is very bush-like, but there is some evidence that Lagomorpha has branched off first. Thus, the branching sequence for these five orders of mammals seems to be Rodentia, Lagomorpha, Primates, Artiodactyla, and Carnivora. The branching date for Rodentia could be as early as 100 million years ago. The rate of nucleotide substitution in the rodent lineage is shown to be at least 1.5 times higher than those in the other four mammalian lineages.

Despite the efforts of numerous comparative anatomists, paleontologists, and molecular evolutionists (1-8), the branching order of the major eutherian lineages remains highly controversial. In fact, the prevailing view of eutherian evolution has always been a bush-like radiation (1, 2, 9). To untangle this phylogenetic knot, we have sequenced a number of apolipoprotein (Apo) genes from several mammalian orders and have compiled DNA sequences of these and other genes from data banks and the literature. Our current focus is on Primates, Rodentia, Lagomorpha, Artiodactyla, and Carnivora because there are many more DNA sequence data from these five orders than from others.

Although Artiodactyla and Carnivora are commonly thought to have branched off prior to the primate-rodent split (2, 5, 10), some authors (3, 4) maintain that Rodentia is an outgroup to Artiodactyla, Carnivora, and Primates. The resolution of this issue has important consequences for molecular evolutionary studies. For example, using artiodactyls and carnivores as outgroups to rodents and primates, Wu and Li (11) estimated that the rate of synonymous nucleotide substitution is ~2 times higher in the rodent lineage than in the human lineage. This view was criticized by Easteal (12), who argued that the rodents might be an outgroup to the other three orders. Should this be the case, the rate difference between the rodent and human lineages would be reduced, although it is unlikely to be completely nullified.

The evolutionary position of Lagomorpha has always been controversial. Many paleontologists (4, 5) believe that Lagomorpha is closely related to Rodentia, whereas others (2, 10) think that it separated from Rodentia at the time of eutherian radiation. Great uncertainties are also seen in molecular studies (6, 7, 13). For example, Lagomorpha and Primates were put in one clade in the maximum parsimony analysis of protein sequence data by Goodman et al. (6), whereas Lagomorpha was an outgroup to Primates and Rodentia in Shoshani's (13) immunodiffusion comparisons. Finding the true evolutionary position of Lagomorpha has important implications not only for paleontology but also for molecular evolution. Some authors (11, 14) have advocated the generation-time effect because they found much higher rates of nucleotide substitution in rodents than in humans. Under this hypothesis, substitution rates should also be high in rabbits because, like rodents, they have a short generation time. To test this hypothesis, we need to determine the evolutionary position of Lagomorpha.

DATA AND METHODS

The genes used (and their abbreviations) are apolipoprotein (Apo) A1, ApoB (3' part), ApoE, atrial natriuretic factor (ANF), A3/A1 β-crystallin (BA3CRYST), Na+,K+-ATPase α and β subunits (ATPNKA and ATPNKB), brain and muscle creatine kinases (CK-B and CK-M), glutathione peroxidase (GSHPX), growth hormone (GH), hemoglobin α and β (HBA and HBB), insulin, interleukin 1α and 1β (IL1A and IL1B), low density lipoprotein receptor [LDLR (3' part)], luteinizing hormone subunit β (LHB), lysozyme c (LYSOc), muscle nicotinic acetylcholine receptor α subunit (MACHRA), c-myc oncogene (MYC), nerve growth factor β subunit [NGFB (3' part)], parathyroid hormone (PTH), phospholipase A2 (PPLA2), prolactin (PROLAC), protein kinase C subtypes α, βII, and γ [PKINCA (internal part), PKINCB2, and PKINCG (internal part); named according to ref. 15], proopiomelanocortin (POMC), protein phosphatase 2A catalytic subunit isotypes α and β [PP2AA and PP2AB (3' part)], and transferrin (Transfer). All sequence data are from GenBank Release 63 and the EMBL data library as available by electronic mail, except for the following sequences: cow and dog ApoA1 and ApoE (ref. 16; unpublished data), cow and mouse BA3CRYST (17), rat PP2AA (18), and rabbit ApoE (19).

We use only the coding regions of protein-coding genes because alignment for noncoding regions is difficult. Sequence alignment was made at the amino acid level by a multiple alignment algorithm (20). Nucleotide alignment was then obtained according to the protein alignment. Regions where alignment was uncertain were discarded.

We use the method of Li et al. (21) to estimate the numbers of substitutions per synonymous site (KS) and per nonsynonymous site (KA) between two genes. This method assumes that both substitution rates are uniform over the coding regions of the gene. This assumption may hold approximately for the synonymous rate but is unlikely to hold well for the nonsynonymous rate. Currently we are ignorant of how KA



varies among sites. Jin and Nei (22) have suggested the use of the Γ distribution

$$f(\lambda) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, e^{-\beta\lambda}\, \lambda^{\alpha-1}, \tag{1}$$

where λ is the substitution rate, β is a trivial scale parameter, and α > 0 determines the shape and width of the distribution. The density f(λ) is unbounded near the origin if α < 1, decreases exponentially if α = 1, and becomes bell-shaped if α > 1. Following Jin and Nei (22) and using the two-parameter method (23), we can show that the mean number of substitutions per site is given by

$$K = \frac{\alpha}{4}\left[2(1 - 2P - Q)^{-1/\alpha} + (1 - 2Q)^{-1/\alpha} - 3\right], \tag{2}$$

where P and Q are the proportions of transitional and transversional differences between the two sequences. If we use the one-parameter model, then

$$K = \frac{3\alpha}{4}\left[\left(1 - \frac{4p}{3}\right)^{-1/\alpha} - 1\right], \tag{3}$$

where p is the proportion of nucleotide differences between the two sequences. For simplicity, we have used Eq. 3.
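As an illustration only, Eqs. 2 and 3 can be evaluated numerically as in the following Python sketch. The function names are hypothetical and are not part of the original analysis; the Jukes-Cantor distance is included only because Eq. 3 approaches it as the shape parameter grows large.

```python
import math


def gamma_distance_one_param(p, alpha):
    """Eq. 3: distance under the one-parameter model with Gamma-distributed rates.

    p     -- proportion of nucleotide differences between two sequences
    alpha -- shape parameter of the Gamma distribution (alpha > 0)
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)


def gamma_distance_two_param(P, Q, alpha):
    """Eq. 2: distance under the two-parameter model with Gamma-distributed rates.

    P, Q -- proportions of transitional and transversional differences
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    a = 1.0 - 2.0 * P - Q
    b = 1.0 - 2.0 * Q
    if a <= 0 or b <= 0:
        raise ValueError("P and Q are too large for the correction to be defined")
    return alpha / 4.0 * (2.0 * a ** (-1.0 / alpha) + b ** (-1.0 / alpha) - 3.0)


def jukes_cantor_distance(p):
    """Uniform-rate one-parameter distance, the limit of Eq. 3 as alpha grows."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    p = 0.15  # arbitrary illustrative value, not taken from the data
    for alpha in (0.5, 1.0):
        print(alpha, gamma_distance_one_param(p, alpha))
    print("uniform", jukes_cantor_distance(p))
```

For a fixed p, a smaller α gives a larger corrected distance, which is why the values computed with α = 0.5 exceed those computed with α = 1 or with a uniform rate.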

In phylogenetic analysis we consider four taxa at one time because this can greatly increase the number of genes for analysis and therefore can increase the power of a statistical test. We use the maximum parsimony (MP) method (24); we infer either the minimum number of nonsynonymous substitutions or the minimum number of all substitutions. In the former case we infer the minimum number of nonsynonymous changes between two codons among all possible evolutionary paths. The computer program is a minor modification of the PROTPARS program of the PHYLIP package (25).

We also use the method of Sattath and Tversky (26) (the ST method), which is efficient (27) and is simple to use when only four taxa are involved. Let dij be the evolutionary distance between taxa i and j. Then taxa 1 and 2 are neighbors (i.e., in one cluster), and so are taxa 3 and 4, if d12 + d34 is smaller than both d13 + d24 and d14 + d23.
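The four-taxon neighbor criterion just described can be sketched in Python as follows. This is a minimal illustration, not the program used in the study: the function names are hypothetical, and the central-branch formula is the standard four-point estimate, assumed here rather than taken from the text. The example distances are the uniform-rate KS values of Table 2.

```python
def pair_distance(d, i, j):
    """Look up the distance between taxa i and j in a dict keyed by frozensets."""
    return d[frozenset((i, j))]


def st_four_taxa(d, taxa):
    """Choose the pairing of four taxa with the smallest sum of within-pair distances.

    Returns the chosen pairing and a four-point estimate of the central branch
    (an assumed estimator: half the difference between the mean of the two
    larger sums and the smallest sum).
    """
    t1, t2, t3, t4 = taxa
    pairings = [
        ((t1, t2), (t3, t4)),
        ((t1, t3), (t2, t4)),
        ((t1, t4), (t2, t3)),
    ]
    sums = {
        pairing: pair_distance(d, *pairing[0]) + pair_distance(d, *pairing[1])
        for pairing in pairings
    }
    best = min(sums, key=sums.get)
    other_sums = [s for pairing, s in sums.items() if pairing != best]
    central = (sum(other_sums) / 2.0 - sums[best]) / 2.0
    return best, central


# Uniform-rate synonymous distances (KS) from Table 2:
# P = Primates, A = Artiodactyla, C = Carnivora, R = Rodentia.
KS_TABLE2 = {
    frozenset("PA"): 0.390,
    frozenset("PC"): 0.414,
    frozenset("PR"): 0.614,
    frozenset("AC"): 0.352,
    frozenset("AR"): 0.665,
    frozenset("CR"): 0.650,
}

if __name__ == "__main__":
    best, central = st_four_taxa(KS_TABLE2, ("A", "C", "P", "R"))
    print(best, round(central, 3))
```

With these distances the pairing (AC)-(PR) has the smallest sum, and the estimated central branch is close to the value of about 5% discussed for Fig. 2a.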

RESULTS

Evidence Supporting Rodentia as an Outgroup to Primates, Lagomorpha, Artiodactyla, and Carnivora. An important question concerning the evolutionary relationships among Rodentia, Primates, Lagomorpha, Artiodactyla, and Carnivora is which order has branched off first? To address this question, we use a bird species (chicken) as an outgroup. To maximize the use of sequence data, we consider only three of the five mammalian orders at one time. As no reliable estimate of KS can be obtained between a chicken and a mammalian gene, we use only KA in the following analysis. Also, in the MP analysis, we use only nonsynonymous changes.

First, we infer the branching order of Rodentia, Primates, and Artiodactyla, by using chicken as an outgroup. We use 14 genes with a total of 4347 codons (see footnote to Table 1). In the ST method we consider three conditions for KA: the nonsynonymous rate is uniform among the coding regions of a gene, and the rate follows a Γ distribution with α = 1 or α = 0.5. Under each condition, we compute first the KA value for each gene and then the weighted average over genes using the number of codons in a gene as the weight. The results are shown in Table 1, group I. Fig. 1 a-c shows that under all three conditions, Rodentia is an outgroup to Primates and Artiodactyla and that the internal branch is fairly long.

Table 1. Average number of substitutions per nonsynonymous site (KA) between genes from Primates (P), Artiodactyla (A), Carnivora (C), Rodentia (R), and birds (B)

                  Uniform rate                  Γ rate distribution*
                  distribution            α = 1                  α = 0.5
                  P      A      R         P      A      R        P      A      R
Group I      A  0.081                   0.089                  0.098
             R  0.128  0.128            0.150  0.150           0.178  0.179
             B  0.203  0.197  0.225     0.250  0.243  0.287    0.314  0.306  0.376

                  P      L      R         P      L      R        P      L      R
Group II     L  0.059                   0.062                  0.066
             R  0.093  0.088            0.104  0.099           0.118  0.114
             B  0.167  0.165  0.183     0.201  0.200  0.231    0.247  0.249  0.301

                  P      C      R         P      C      R        P      C      R
Group III    C  0.036                   0.037                  0.039
             R  0.067  0.060            0.075  0.067           0.084  0.075
             B  0.159  0.151  0.171     0.188  0.178  0.209    0.226  0.214  0.265

In group I the following genes from Primates, Artiodactyla, Rodentia, and birds are compared: ApoA1, ApoB, ATPNKA, ATPNKB, BA3CRYST, GH, HBA, HBB, LYSOc, MACHRA, NGFB, PROLAC, PTH, and Transfer. In group II the following genes from Primates, Lagomorpha, Rodentia, and birds are compared: ApoA1, CK-B, CK-M, HBA, and HBB. In group III the following genes from Primates, Carnivora, Rodentia, and birds are compared: ApoA1, ATPNKB, CK-B, CK-M, insulin, and MYC. See Materials and Methods for gene names and sequence sources. In all cases, the human and chicken (Gallus gallus) genes were used. The other species employed are Macaca fascicularis (ApoA1 and insulin), Mus musculus (BA3CRYST, GH, HBA, HBB, LYSOc, MACHRA, NGFB, PROLAC, Transfer, CK-M, insulin, and MYC), Rattus norvegicus (ApoA1, ApoB, ATPNKA, ATPNKB, GH, HBA, HBB, PROLAC, PTH, CK-B, CK-M, insulin, and MYC), Mastomys natalensis (a rodent) (NGFB), Bos taurus (bovine) (ApoA1, BA3CRYST, GH, HBB, LYSOc, MACHRA, NGFB, PROLAC, and PTH), Sus scrofa (pig) (ApoB, ATPNKA, ATPNKB, GH, PROLAC, PTH, and Transfer), Ovis aries (sheep) (ATPNKA, ATPNKB, and GH), Capra hircus (goat) (GH, HBA, and HBB), Oryctolagus cuniculus (rabbit) (ApoA1, HBA, HBB, CK-B, and CK-M), Canis familiaris (dog) (ApoA1, ATPNKB, CK-B, CK-M, and insulin), and Felis silvestris (cat) (MYC).
*For the Γ distribution, the KA value for each gene was computed by using Eq. 3.

FIG. 1. Phylogenetic relationships among Rodentia (R), Primates (P), Artiodactyla (A), Lagomorpha (L), and Carnivora (C) inferred by the ST method (26) with a bird species (chicken) as an outgroup. The branch lengths of mammalian lineages are drawn in proportion to the number of nonsynonymous substitutions (KA), whereas the length of the bird lineage is indicated by a number. The KA values (Table 1) were computed by assuming that the rate of nonsynonymous substitution is uniform within each gene (a, d, and g) or follows the Γ distribution with α = 1 (b, e, and h) or α = 0.5 (c, f, and i). In a, d, and g, the means ± SE of the central branch length are 0.011 ± 0.002, 0.007 ± 0.003, and 0.006 ± 0.002, respectively. [Tree diagrams not reproduced; the bird-lineage lengths shown in panels a-i are 0.15, 0.19, 0.25, 0.13, 0.17, 0.22, 0.13, 0.16, and 0.20.]

In the MP method, there are 265 informative nonsynonymous sites, of which 136 sites support Rodentia as an outgroup, whereas only 76 and 53 sites support, respectively, Artiodactyla or Primates as an outgroup. If each informative site has a 1/3 probability of supporting each of the three alternative trees (28), then the tree with Rodentia as an outgroup is highly significant (i.e., P < 0.0001). (This is a rough test because the assumption of rate constancy does not hold. The same comment applies to the other tests.)
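The significance statement above can be illustrated with a simple calculation. The following Python sketch assumes a one-sided exact binomial test with success probability 1/3; the exact form of the test used in ref. 28 is not given here, so this is an approximation of the idea rather than a reproduction of the original procedure, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p=1.0 / 3.0):
    """Probability of observing k or more supporting sites out of n
    when each informative site supports a given tree with probability p."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # 136 of 265 informative nonsynonymous sites support Rodentia as an outgroup.
    print(binomial_upper_tail(136, 265))
```

Under this assumed test the tail probability for 136 of 265 sites is far below 0.0001, in line with the statement in the text.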

Next, we consider the branching order of Rodentia, Primates, and Lagomorpha. Table 1, group II shows the average KA values based on five genes with a total of 1326 codons. The ST method supports Rodentia as an outgroup if KA is assumed to be uniform or to follow the Γ distribution with α = 1 (Fig. 1 d and e), whereas it supports Primates as an outgroup if KA follows the Γ distribution with α = 0.5 (Fig. 1f). However, if α = 0.5, f(λ) is unbounded near λ = 0 and the distribution is very skewed toward 0; i.e., the majority of sites are invariable or nearly so. This may not be the case for the ApoA1 gene because ApoA1 evolves rather rapidly and the majority of its residues are variable (16). If ApoA1 is excluded from analysis, the data support Rodentia as an outgroup, even under the assumption of α = 0.5. When the MP method is applied to the five genes, 29 of the 57 informative nonsynonymous sites support Rodentia as an outgroup (significant at the 1% level), whereas only 11 and 17 sites support, respectively, Lagomorpha or Primates as an outgroup. Therefore, there is significant support for Rodentia as an outgroup to Lagomorpha and Primates.

Finally, we consider the branching order of Rodentia, Primates, and Carnivora. We use six genes with a total of 1756 codons (Table 1, group III). The ST method supports Rodentia as an outgroup if the nonsynonymous rate is assumed to be uniform or to follow the Γ distribution with α = 1 (Fig. 1 g and h). On the other hand, it supports Carnivora as an outgroup if the rate follows the Γ distribution with α = 0.5 (Fig. 1i), and this remains the case when the ApoA1 gene is excluded from the analysis. However, the internal branch in Fig. 1i is very short and so this branching order is very uncertain. Moreover, this tree is supported only by 15 of the 62 informative sites, whereas the tree with Rodentia as an outgroup is supported by 35 sites (significant at the 0.1% level); the remaining 12 sites support Primates as an outgroup.

Evidence for the Artiodactyla-Carnivora Superordinal Clade. To study the branching order of Artiodactyla (A), Carnivora (C), Primates (P), and Rodentia (R), we use seven genes with a total of 1417 codons. When the ST method is applied to the synonymous distances in Table 2, the result supports the (AC)-(PR) tree (Fig. 2a) [(AC)-(PR) means that A and C belong to one cluster, P and R belong to another, and the two clusters are connected by a central branch]; the length (5%) of the central branch, which connects (AC) and (PR), is significantly greater than 0 because the standard error computed by the method of Li (29) is only 1.6%. The same tree is obtained when the ST method is applied to the nonsynonymous distances, even if KA follows a Γ distribution with α = 0.5 (Fig. 2b). When the MP method is applied to nonsynonymous changes, the result also tends to support the same tree, though it is not significant; the numbers of informative sites supporting this tree, the (AP)-(CR) tree, and the (AR)-(PC) tree are 32, 21, and 28, respectively. When the MP method is applied to all changes together, the (AC)-(PR) tree is supported by 75 of the 170 informative sites (significant at the 1% level), whereas the other two trees are supported by only 50 and 45 sites, respectively.

Table 2. Average number of substitutions per synonymous site (KS, above diagonal) and per nonsynonymous site (KA, below diagonal) between genes from Primates (P), Rodentia (R), Artiodactyla (A), and Carnivora (C)

                                            Γ rate distribution*
     Uniform rate distribution         α = 1                  α = 0.5
       P      A      C      R          P      A      C        P      A      C
P           0.390  0.414  0.614    P                      P
A    0.103         0.352  0.665    A  0.112               A  0.123
C    0.089  0.095         0.650    C  0.096  0.104        C  0.104  0.115
R    0.130  0.149  0.133           R  0.147  0.170  0.150 R  0.167  0.196  0.171

The genes used are ANF, ApoA1, ApoE, ATPNKB, LHB, POMC, and PPLA2. In each case the human and Rattus norvegicus sequences were included. Other species used are Macaca fascicularis (ApoA1), Mus musculus (ANF, ApoE, and POMC), Bos taurus (ANF, ApoA1, ApoE, LHB, POMC, and PPLA2), Ovis aries (ATPNKB), Sus scrofa (ATPNKB, POMC, and PPLA2), Canis familiaris (ANF, ApoA1, ApoE, ATPNKB, LHB, and PPLA2), and Mustela vison (a carnivore) (POMC).
*For the Γ distribution, the KA value for each gene was computed by using Eq. 3.

FIG. 2. Phylogenetic relationships among Rodentia (R), Primates (P), Artiodactyla (A), and Carnivora (C) inferred by the ST method (26). (a) The number on each branch is the number of synonymous substitutions (KS). (b) The three numbers on each branch are the numbers of nonsynonymous substitutions (KA) computed under the three different assumptions: the rate is uniform within a gene or follows the Γ distribution with α = 1 or α = 0.5 (Table 2). [Tree diagrams not reproduced; the central branch is 0.047 ± 0.016 in a and 0.006, 0.007, and 0.008 in b.]

Bush-Like Radiation of Lagomorpha, Primates, and Artiodactyla-Carnivora. First, we consider Primates, Artiodactyla, Rodentia, and Lagomorpha (L). We use 14 genes with a total of 4031 codons (Table 3). The (LA)-(PR) tree is obtained when the ST method is applied to the synonymous distances, whereas the (LR)-(PA) tree is obtained when the method is applied to the nonsynonymous distances (Table 3). In both cases the central branch is extremely short (i.e., 0.005 and 0.002). When the MP method is applied to the nonsynonymous changes, the (LR)-(PA) tree is slightly favored (supported by 47 sites) over the (LA)-(PR) and (LP)-(AR) trees (39 and 37 sites, respectively). When all changes are considered, the result is also slightly in favor of the (LR)-(PA) tree (149 sites versus 142 and 121 sites). None of the alternative trees is statistically significant.

Next, we consider data from Lagomorpha, Primates, Carnivora, and Rodentia: five genes with a total of 1464 codons (Table 3). The ST method supports the (LR)-(PC) tree for both synonymous and nonsynonymous distances, but the central branch is extremely short in both cases (0.007 and 0.003). The MP method also slightly favors this tree: the three alternative trees (LR)-(PC), (LP)-(RC), and (LC)-(RP) are supported, respectively, by 18, 14, and 17 informative sites when only nonsynonymous changes are used and by 54, 43, and 48 informative sites when all changes are used.

As noted above, Rodentia is probably an outgroup to the other four orders, and Artiodactyla and Carnivora probably belong to one clade. Thus, the above result suggests that Lagomorpha is an outgroup to Primates, Artiodactyla, and Carnivora. But the result also suggests that the radiation among Lagomorpha, Primates, and Artiodactyla-Carnivora is very bush-like, and their branching order is uncertain. However, it is clear that Lagomorpha is no closer to Rodentia than to Primates, Artiodactyla, and Carnivora.

Molecular Clocks. In Fig. 1 a-c, the branch lengths for the primate and artiodactyl lineages are about the same, whereas that for the rodent lineage is about 1.5, 1.7, or 2.2 times longer, depending on whether the nonsynonymous rate follows a uniform distribution or a Γ distribution with α = 1 or 0.5. Therefore, the primate and artiodactyl lineages evolve at about the same rate, whereas the rodent lineage evolves at least 1.5 times faster. The higher rate in the rodent lineage confirms the conclusion of Wu and Li (11) and Li et al. (30).

The branch lengths in Fig. 1 d-f suggest that the nonsynonymous rate is about the same in the primate and lagomorph lineages. The rate in the rodent lineage is 1.5, 1.8, or 2.7 times that in the lagomorph lineage, depending on the distribution of the nonsynonymous rate.
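The rate ratios quoted for Fig. 1 a-c can be approximated from the group I distances in Table 1. The Python sketch below is an illustration under stated assumptions, not the procedure of the original analysis: branch lengths are obtained with the standard four-point formulas for the tree ((P,A),(R,B)), and the rodent branch is compared with the depth of the primate-artiodactyl clade measured from the same node (central branch plus half the P-A distance). The function name and argument names are hypothetical.

```python
def rodent_rate_ratio(d_pa, d_pr, d_ar, d_pb, d_ab, d_rb):
    """Four-point branch lengths for the tree ((P,A),(R,B)) and the ratio of the
    rodent branch to the average depth of the P and A lineages.

    P = Primates, A = Artiodactyla, R = Rodentia, B = bird outgroup.
    """
    # Central branch separating (P,A) from (R,B).
    central = ((d_pr + d_ab + d_pb + d_ar) / 2.0 - (d_pa + d_rb)) / 2.0
    # Terminal branch leading to Rodentia.
    b_r = d_rb / 2.0 + (d_pr + d_ar - d_pb - d_ab) / 4.0
    # Terminal branch leading to the bird outgroup.
    b_b = d_rb - b_r
    # Average distance from the rodent divergence to the tips P and A.
    ingroup_depth = central + d_pa / 2.0
    return central, b_r, b_b, b_r / ingroup_depth


# Group I distances from Table 1, ordered (P-A, P-R, A-R, P-B, A-B, R-B).
GROUP_I = {
    "uniform": (0.081, 0.128, 0.128, 0.203, 0.197, 0.225),
    "alpha=1": (0.089, 0.150, 0.150, 0.250, 0.243, 0.287),
    "alpha=0.5": (0.098, 0.178, 0.179, 0.314, 0.306, 0.376),
}

if __name__ == "__main__":
    for label, distances in GROUP_I.items():
        central, b_r, b_b, ratio = rodent_rate_ratio(*distances)
        print(label, round(central, 3), round(b_b, 2), round(ratio, 1))
```

Under these assumptions the ratios come out close to the 1.5, 1.7, and 2.2 quoted above, and the central branch for the uniform-rate case is close to the 0.011 given in the legend of Fig. 1.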

In Fig. 1 g-i, the branch length for the carnivore lineage is substantially shorter than that for the primate lineage. However, the data set used is relatively small. The data set used in Fig. 2 is considerably larger; it is seen from Fig. 2b that the nonsynonymous rate (KA) in the carnivore lineage is slightly higher than that in the primate lineage, and the rate in the artiodactyl lineage is even higher. Fig. 2a shows that the synonymous rates (KS) in the artiodactyl and the carnivore lineage are about the same and are somewhat (~25%) higher than that in the primate lineage.

The evolutionary position of Lagomorpha is uncertain, but the substitution rates in this lineage seem to be similar to those in the artiodactyl and carnivore lineages if the rodent lineage is used as a reference (Table 3).

In conclusion, the substitution rate in the rodent lineage is at least 1.5 times higher than those in the artiodactyl, carnivore, and lagomorph lineages, which in turn are about 1.25 times higher than the rate in the primate lineage.

DISCUSSION

Although many previous authors used more taxa than we did, they used only a limited number of genes so that the branching order inferred varies from study to study, and no rigorous statistical testing could be made (e.g., refs. 6, 7, 13). However, since our conclusion that Rodentia is an outgroup to Primates, Lagomorpha, Artiodactyla, and Carnivora was inferred with chicken as an outgroup, it should be taken with some caution because birds and mammals diverged a long time [≈270 million years (Myr)] ago. Nevertheless, the conclusion is probably reliable in view of the overwhelming support from the MP analysis of nonsynonymous changes.

Table 3. Average number of substitutions per synonymous site (Ks, above diagonal) and per nonsynonymous site(KA, below diagonal) between genes from Lagomorpha (L), Primates (P), Artiodactyla (A), Carnivora (C), andRodentia (R) and tree topologies inferred by the ST method (26)

L P A R L P C R

L 0.414 0.458 0.612 L 0.338 0.357 0.568P 0.062 0.426 0.577 P 0.069 0.335 0.560A 0.074 0.067 0.638 C 0.071 0.067 0.577R 0.085 0.082 0.094 R 0.102 0.104 0.104

Tree topology and branch lengths*KS: (PR)-(LA), 0.186 0.391 0.005 0.220 0.239 KS: (LR)-(PC), 0.173 0.395 0.007 0.159 0.177KA: (LR)(PA), 0.032 0.053 0.002 0.028 0.040 KA: (LR)-(PC), 0.034 0.068 0.003 0.033 0.034

The genes used are ANF, ApoAl, ApoE, GSHPX, HBA, HBB, IL1A, IL1B, LDLR, PKINCA, PKINCB2, PKINCG,PP2AA, and PP2AB for the left matrix; and ANF, ApoAl, ApoE, CK-B, and CK-M for the right matrix. All KA and KSvalues were computed under the assumption of uniform rate. In each case the human and Oryctolagus cuniculus sequenceswere included. Other species used are Macaca fascicularis (ApoAl), Rattus norvegicus (ANF, ApoAl, ApoE, GSHPX,HBA, HBB, LDLR, PKINCB2, PP2AA, and PP2AB), Rattus rattus (PKINCA and PKINCG), Mus musculus (ANF, ApoE,GSHPX, HBA, HBB, IL1A, IL1B, and PKINCA), Bos taurus (ANF, ApoAl, ApoE, GSHPX, HBB, IL1A, IL1B, LDLR,PKINCA, PKINCB2, and PKINCG), Capra hircus (HBA and HBB), and Sus scrofa (PP2AA and PP2AB).*(PRH(LA) means that P and R belong to one cluster, L and A belong to another, and the two clusters are connected bya central branch; 0.186, 0.391, 0.005, 0.220, and 0.239 are, respectively, the branch lengths for the P lineage, the R lineage,the central branch, the L lineage, and the A lineage. The same interpretation applies to the other cases.

The analysis by the ST method (26) has also provided strong support, except for Fig. 1 f and i, where Rodentia is not an outgroup to Lagomorpha and Carnivora. In these two cases the nonsynonymous rate is assumed to follow a distribution very skewed toward 0 so that the corrected KA values between rodents and birds (0.301 and 0.265 in Table 1) are close to two times the observed values (0.162 and 0.153). This does not seem reasonable for nonconservative proteins (e.g., ApoAI), because such proteins generally have many variable residues. Actually, the branching order in Fig. 1i with Carnivora being an outgroup to Primates and Rodentia is contradictory to that in Fig. 1 a-c and Fig. 2, which provide strong evidence for Rodentia being an outgroup to Artiodactyla and Primates and for Carnivora and Artiodactyla being in one clade.

Our conclusion is contradictory to the traditional view based on paleontological and morphological data that Artiodactyla and Carnivora branched off before the divergence between the primate and rodent lineages (5, 10), but it is in agreement with the view of McKenna (3) and Szalay (4) that Rodentia is an outgroup to Primates, Artiodactyla, and Carnivora.

The Glires concept has become popular among some authors (4, 5), though there has long been a diversity of opinion concerning its validity. Our analysis provides strong evidence against a close relationship between Rodentia and Lagomorpha and suggests that Lagomorpha branched off after Rodentia but before Primates, Artiodactyla, and Carnivora.
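The branching order argued for here can be written down as a small rooted tree and queried for which groupings form clades. The following is a minimal sketch; the nested-tuple encoding and the helper names are illustrative assumptions rather than anything from the paper, and rooting on Rodentia simply follows the conclusion stated in the text.

```python
# Branching order suggested in the text, outgroup first at every level.
# The nested-tuple encoding is an illustrative assumption.
TREE = ("Rodentia", ("Lagomorpha", ("Primates", ("Artiodactyla", "Carnivora"))))


def leaves(node):
    """Return the set of order names at or below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def clades(node):
    """Return the leaf set of every node in the rooted tree."""
    found = [frozenset(leaves(node))]
    if not isinstance(node, str):
        for child in node:
            found.extend(clades(child))
    return found


def is_monophyletic(group, tree=TREE):
    """True if `group` is exactly the leaf set of some node in the tree."""
    return frozenset(group) in clades(tree)


def to_newick(node):
    """Serialise the nested tuples as a Newick string (without the final ';')."""
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


if __name__ == "__main__":
    print(to_newick(TREE) + ";")
    print("Artiodactyla + Carnivora:", is_monophyletic({"Artiodactyla", "Carnivora"}))
    print("Rodentia + Lagomorpha:", is_monophyletic({"Rodentia", "Lagomorpha"}))
```

Under this encoding, Artiodactyla plus Carnivora is a clade, whereas Rodentia plus Lagomorpha (the Glires grouping) is not.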

Carnivores are commonly thought to be closer to ungulates than to other mammals. In fact, Simpson (1) grouped carnivores with ungulate orders within the cohort Ferungulata. Our analysis strongly supports this grouping (Fig. 2).

The radiation of eutherians is commonly thought to have
occurred near the end of the Mesozoic, about 80 Myr ago (2, 4). Fig. 1 a-c suggests that it might have occurred considerably earlier. There is fossil evidence that the primate and artiodactyl lineages have been distinct since late in the Mesozoic (10). If we assume that the two lineages diverged 80 Myr ago and if we use the branch lengths estimated under the assumption of the Γ distribution with a = 1 (Fig. 1b), then the divergence time between the ancestor of rodents and the common ancestor of primates and artiodactyls is estimated to be 80 × (0.010 + 0.043)/0.043 ≈ 100 Myr. The estimate becomes 81 Myr if the primate and artiodactyl lineages diverged only 65 Myr ago.

One possible factor for the higher rate of nucleotide
substitution in the rodent lineage than in the primate and artiodactyl lineages is the generation-time effect; i.e., the higher rate occurs because rodents have a short generation time (11, 14). This factor can also explain the higher rate in monkeys than in humans (30). It is, however, difficult to explain why the substitution rate in the lagomorph lineage is similar to those in the primate and artiodactyl lineages, despite the fact that the generation time is much shorter in rabbits than in primates and artiodactyls. Of course, the substitution rate depends more on the number of DNA replications per unit time in the germ line than on the number of generations per unit time, and for two organisms the difference in the former number may be considerably smaller than that in the latter; e.g., for mice and humans the two differences are about 7- and 100-fold, respectively (31). One should also note that each of the rates estimated above refers to the average rate over the entire lineage since the eutherian radiation. These estimates are unlikely to reflect the rate differences at the present time because at the early stage of divergence all eutherian lineages would have similar generation times and substitution rates. These might be the reasons why the rate difference between the rodent and primate lineages is only about 2-fold, although mice and humans differ by 100-fold in generation time. Since the generation time in
rabbits is longer than that in mice and rats, about 2 times or larger (32), the difference in substitution rate between the lagomorph and primate lineages would be smaller and more difficult to detect. Another possible factor is that the efficiency of DNA repair could be lower in rodents than in humans and artiodactyls (33). This hypothesis also offers a simple explanation for the lower rate in lemurs than in galagos and tarsiers (34). A third possible factor is that the G+C content or the isochore structure of G+C content in rodents has become different from that in the other mammals (35) and this change has accelerated the rate in rodents. Further studies are required to evaluate the relative importance of these factors.

We thank K. H. Wolfe for his help in data search. This study was
supported by National Institutes of Health grants.

1. Simpson, G. G. (1945) Bull. Am. Mus. Nat. Hist. 85, 1-350.

2. Romer, A. S. (1966) Vertebrate Paleontology (Univ. of Chicago Press, Chicago).

3. McKenna, M. C. (1975) in Phylogeny of the Primates, eds. Luckett, W. P. & Szalay, F. S. (Plenum, New York), pp. 21-46.

4. Szalay, F. S. (1977) in Major Patterns in Vertebrate Evolution, eds. Hecht, M. K., Goody, P. C. & Hecht, B. M. (Plenum, New York), pp. 315-374.

5. Novacek, M. J. (1982) in Macromolecular Sequences in Systematic and Evolutionary Biology, ed. Goodman, M. (Plenum, New York), pp. 3-41.

6. Goodman, M., Czelusniak, J. & Beeber, J. E. (1985) Cladistics 1, 171-185.

7. Wyss, A. R., Novacek, M. J. & McKenna, M. C. (1987) Mol. Biol. Evol. 4, 99-116.

8. Benton, M. J. (1988) Trends Ecol. Evol. 3, 40-45.

9. Gregory, W. K. (1910) Bull. Am. Mus. Nat. Hist. 27, 1-524.

10. Kielan-Jaworowska, Z., Bown, T. M. & Lillegraven, J. A. (1979) in Mesozoic Mammals, eds. Lillegraven, J. A., Kielan-Jaworowska, Z. & Clemens, W. A. (Univ. of California Press, Berkeley), pp. 221-258.

11. Wu, C.-I. & Li, W.-H. (1985) Proc. Natl. Acad. Sci. USA 82, 1741-1745.

12. Easteal, S. (1988) Proc. Natl. Acad. Sci. USA 85, 7622-7626.

13. Shoshani, J. (1986) Mol. Biol. Evol. 3, 222-242.

14. Kohne, D. E., Chiscon, J. A. & Hoyer, B. H. (1972) J. Hum. Evol. 1, 627-644.

15. Coussens, L., Parker, P. J., Rhee, L., Yang-Feng, T. L., Chen, E., Waterfield, M. D., Francke, U. & Ullrich, A. (1986) Science 233, 859-866.

16. Luo, C.-C., Li, W.-H. & Chan, L. (1989) J. Lipid Res. 30, 1735-1746.

17. Aarts, H. J. M., Jacobs, E. H. M., Willigen, G. V., Lubsen, N. H. & Schoenmakers, J. G. G. (1989) J. Mol. Evol. 28, 313-321.

18. Kitagawa, Y., Tahira, T., Ikeda, I., Kikuchi, K., Tsuiki, S., Sugimura, T. & Nagao, M. (1988) Biochim. Biophys. Acta 951, 123-129.

19. Hao, Q.-L., Yamin, T.-T., Pan, T.-C., Chen, S.-L., Chen, B.-S., Kroon, P. & Chao, Y.-S. (1987) Atherosclerosis 66, 125-130.

20. Higgins, D. G. & Sharp, P. M. (1989) Comput. Appl. Biosci. 5, 151-153.

21. Li, W.-H., Wu, C.-I. & Luo, C.-C. (1985) Mol. Biol. Evol. 2, 150-174.

22. Jin, L. & Nei, M. (1990) Mol. Biol. Evol. 7, 82-102.

23. Kimura, M. (1980) J. Mol. Evol. 16, 111-120.

24. Fitch, W. M. (1977) Am. Nat. 111, 223-257.

25. Felsenstein, J. (1982) Q. Rev. Biol. 57, 379-404.

26. Sattath, S. & Tversky, A. (1977) Psychometrika 42, 319-345.

27. Saitou, N. & Nei, M. (1987) Mol. Biol. Evol. 4, 406-425.

28. Felsenstein, J. (1985) Syst. Zool. 34, 152-161.

29. Li, W.-H. (1989) Mol. Biol. Evol. 6, 424-435.

30. Li, W.-H., Tanimura, M. & Sharp, P. M. (1987) J. Mol. Evol. 25, 330-342.

31. Vogel, F., Kopun, M. & Rathenberg, R. (1976) in Molecular Anthropology, eds. Goodman, M. & Tashian, R. E. (Plenum, New York), pp. 13-33.

32. Nowak, R. M. & Paradiso, J. L. (1983) Walker's Mammals of the World (Johns Hopkins Univ. Press, Baltimore), 4th Ed.

33. Britten, R. J. (1986) Science 231, 1393-1398.

34. Koop, B. F., Tagle, D. A., Goodman, M. & Slightom, J. L. (1989) Mol. Biol. Evol. 6, 580-612.

35. Bernardi, G., Mouchiroud, D., Gautier, C. & Bernardi, G. (1988) J. Mol. Evol. 28, 7-18.
