ORIGINAL ARTICLE

Phylogenetic utility of MORE AXILLARY GROWTH4 (MAX4)-like genes: a case study in Digitalis/Isoplexis (Plantaginaceae)

L. J. Kelly · A. Culham

Received: 4 June 2007 / Accepted: 21 December 2007 / Published online: 10 May 2008

© Springer-Verlag 2008

Abstract We present the first assessment of phylogenetic utility of a potential novel low-copy nuclear gene region in flowering plants. A fragment of the MORE AXILLARY GROWTH4 gene (MAX4, also known as RAMOSUS1 and DECREASED APICAL DOMINANCE1), predicted to span two introns, was isolated from members of Digitalis/Isoplexis. Phylogenetic analyses, under both maximum parsimony and Bayesian inference, were performed and revealed evidence of putative MAX4-like paralogues. The MAX4-like trees were compared with those obtained for Digitalis/Isoplexis using ITS and trnL-F, revealing a high degree of incongruence between these different DNA regions. Network analyses indicate complex patterns of evolution between the MAX4 sequences, which cannot be adequately represented on bifurcating trees. The incidence of paralogy restricts the use of MAX4 in phylogenetic inference within the study group, although MAX4 could potentially be used in combination with other DNA regions for resolving species relationships in cases where paralogues can be clearly identified.

Keywords Digitalis · Isoplexis · Low-copy nuclear gene region · MAX4/RMS1/DAD1 · Molecular phylogeny · Network · Paralogy

Introduction

One of the greatest challenges facing plant molecular phylogenetics remains the development of appropriate DNA regions to act as sources of characters. It is widely understood that in order to reconstruct species relationships accurately, data from multiple independently evolving DNA regions need to be incorporated into phylogenetic analyses (Pamilo and Nei 1988; Doyle 1992; Strand et al. 1997; Soltis and Soltis 2000; Cronn et al. 2002a; Rokas et al. 2003; Rokas and Carroll 2005). Plastid DNA has been the most widely used source of characters for phylogenetic inference in plants (Small et al. 2004). The internal transcribed spacer region (ITS) of nuclear ribosomal DNA (nrDNA) is one of the most common sources of characters for use in low-level phylogenetic investigations (Álvarez and Wendel 2003; Hughes et al. 2006). Whilst congruence between plastid DNA and nrDNA trees can be used to provide confidence that the relationships inferred are an accurate reconstruction of organismal history, in some cases it is necessary to seek additional loci to allow resolution of conflict between competing hypotheses (Cronn et al. 2002a). It has been suggested that nuclear genes represent a virtually unlimited source of molecular characters for use in phylogenetic inference of plants (Sang 2002; Álvarez and Wendel 2003; Small et al. 2004), and may offer a variety of other beneficial features (for reviews see Sang 2002; Mort and Crawford 2004; Small et al. 2004). Consequently there has been a steady increase in the number of low-copy nuclear gene regions being developed for use in phylogenetic investigations of plants (e.g. Mathews and Sharrock 1996; Mason-Gamer et al. 1998; Tank and Sang 2001; Martins and Barkman 2005; Syring et al. 2005; Whittall et al. 2006). Nevertheless, there are still relatively few regions that have been tested in a range of plant lineages (Mort and Crawford 2004; Small et al. 2004). Moreover, it has been shown that the complex evolutionary dynamics that characterise nuclear genes mean that the regions currently available will not be appropriate for phylogeny reconstruction in all plant groups (e.g. Bailey et al. 2002, 2004; Archambault and Bruneau 2004), and as yet no universally useful low-copy nuclear DNA sequence loci have been developed (Hughes et al. 2006).

L. J. Kelly · A. Culham
Centre for Plant Diversity and Systematics, School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK

L. J. Kelly (✉)
Jodrell Laboratories, Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3DS, UK
e-mail: [email protected]

Plant Syst Evol (2008) 273:133–149
DOI 10.1007/s00606-008-0008-0

In this paper we present an assessment of the utility of a potential novel low-copy nuclear gene region for phylogenetics, using Digitalis/Isoplexis as an exemplar. Digitalis L. and Isoplexis (Lindley) Loudon are closely related groups within the Plantaginaceae (sensu Angiosperm Phylogeny Group II 2003; some authors advocate the use of Veronicaceae to refer to the family, e.g. Olmstead et al. 2001; Tank et al. 2006). The c. 19 continental Digitalis species are almost exclusively herbaceous (Bräuchler et al. 2004), whilst the four Macaronesian island Isoplexis species exhibit insular woodiness, one of the classical patterns of plant evolution on islands (Carlquist 1974). It would appear that the majority, if not all, of the species within Digitalis/Isoplexis are octoploids, with the base number for this group estimated to be x = 7 (Albach et al. 2004) and the most commonly reported chromosome number for species of Digitalis/Isoplexis being n = 28 (Index to Plant Chromosome Numbers, http://mobot.mobot.org/W3T/Search/ipcn.html). Recent molecular phylogenies for these genera, using sequences from the nrDNA region ITS and the plastid region trnL-F (trnL gene and trnL-F intergenic spacer), indicate that Isoplexis nests within Digitalis and should therefore be reduced to sectional rank (Carvalho 1999; Bräuchler et al. 2004). However, in common with phylogenies of many other insular woody plant lineages, the use of nrDNA and plastid DNA regions has been insufficient to fully resolve species-level relationships (e.g. Böhle et al. 1996; Kim et al. 1996; Francisco-Ortega et al. 1997, 2002; Panero et al. 1999; Ganders et al. 2000; Helfgott et al. 2000; Barber et al. 2002). Consequently, additional character sources are needed to allow the resolution of relationships within Digitalis/Isoplexis, making it an appropriate group in which to test the utility of a novel gene region. The suitability of Digitalis/Isoplexis as a test case is also promoted by the existence of phylogenies based on conventional molecular character sources (Carvalho and Culham 1998; Carvalho 1999; Bräuchler et al. 2004), which provide a framework within which to judge the potential utility of newly developed gene regions.

The MORE AXILLARY GROWTH4/RAMOSUS1/DECREASED APICAL DOMINANCE1 (MAX4/RMS1/DAD1) gene was first isolated from Arabidopsis thaliana by Sorefan et al. (2003), who revealed that it contains six exons, predicted to encode a 570 amino acid protein that is a member of the polyene chain dioxygenase superfamily. Loss-of-function mutation in MAX4 results in plants with a characteristically bushy phenotype, which develops as a result of increased bud growth from the axils of rosette leaves. Evidence from studies of these mutants, including gene expression analyses and grafting experiments, led Sorefan et al. (2003) to propose that MAX4 regulates shoot branching via the production of a novel mobile branch-inhibiting signal. Further investigation has also revealed that MAX4 is orthologous to RAMOSUS1 (RMS1) in Pisum sativum (Sorefan et al. 2003) and to the DECREASED APICAL DOMINANCE1 (DAD1) gene in Petunia hybrida (Snowden et al. 2005). It has also been shown that MAX4/RMS1/DAD1 has a similar exon/intron structure in these three species (Sorefan et al. 2003; Snowden et al. 2005). A number of features of this gene favoured its selection over other potential candidates. Firstly, MAX4/RMS1/DAD1 (henceforth referred to as MAX4) is apparently single copy in A. thaliana, as a tblastx search (Altschul et al. 1997) with the MAX4 sequence against the complete genome of A. thaliana on GenBank (Benson et al. 2007) fails to reveal any other significantly similar sequences. However, similar searches of the genomic sequence for Oryza sativa indicate that this monocotyledonous species may have more than one copy of MAX4. Another advantage of MAX4 was the availability of information regarding both intron size and position in A. thaliana and P. sativum. Knowledge of gene structure is essential in order to allow confident positioning of primers. In addition, expressed sequence tag (EST) sequences were available from taxonomically diverse species, thus allowing the identification of evolutionarily conserved areas of amino acids at which primers can be placed (Strand et al. 1997).

In this study we aimed to assess the potential of MAX4 to act as a novel source of molecular characters for species-level phylogeny reconstruction by investigating and characterising the evolutionary dynamics of this gene in Digitalis/Isoplexis. Suitability of this gene for use in phylogenetic inference was gauged by comparing the numbers of phylogenetically informative characters with those available for Digitalis/Isoplexis from two separate and independently evolving DNA regions, the nrDNA region ITS and the plastid region trnL-F (Bräuchler et al. 2004). In addition, we examined congruence, levels of support for congruent clades, and the levels of resolution in the MAX4 phylogeny compared to those obtained for ITS and trnL-F.

Materials and methods

Taxon sampling

Ingroup and outgroup sampling was aimed towards assembling a taxon set complementary to those of Carvalho (1999) and Bräuchler et al. (2004). In total, 24 members of Digitalis/Isoplexis were sampled for MAX4, with two outgroup genera: Erinus and Veronica (Table 1).

Molecular methods

DNA was extracted from fresh material using a CTAB protocol modified from Doyle and Doyle (1987). DNA from herbarium voucher material was extracted using the method of Harris (1995) with minor modifications. For Isoplexis chalcantha and Isoplexis isabelliana, DNA extractions prepared by Carvalho (1999) were used.

Angiosperm sequences, identified via tblastx searches of the nuclear and EST databases on GenBank (Benson et al. 2007) using the MAX4 sequence from Arabidopsis thaliana, were aligned and used for design of degenerate PCR primers (Table 2). Primers were positioned to flank the fourth and fifth introns of MAX4 and amplify a region of approximately 555–587 bp (Fig. 1), with intron position and size based on information from A. thaliana and P. sativum (Sorefan et al. 2003). The degenerate MAX(1) primer pair was used for initial characterisation of MAX4 in five exemplar taxa of Digitalis/Isoplexis, selected to represent each of the major clades recognised in the ITS-trnL-F phylogeny of Carvalho (1999). We aimed to sequence 20 MAX4-like clones from each of these species. PCR amplification was carried out in 50 µl reactions containing ~200 ng genomic DNA, 10× NH4 reaction buffer (Bioline), 3 mM MgCl2 (Bioline), 200 µM of each dNTP (Promega), 100 pmol of each primer and 1 unit of BioTaq DNA polymerase (Bioline). The following temperature profile was used: 94°C for 5 min; 35 cycles of 94°C for 30 s, 60°C for 30 s, 72°C for 60 s; 72°C for 7 min. Preliminary phylogenetic analysis identified two main classes of sequences (A and B). Amplification of the target region from the remaining taxa was achieved using the non-degenerate, study-group-specific primers (DIGMAX(1)3′/(3)5′), aiming to sequence six MAX4-like clones, including both A and B copies, from each taxon. Reactions contained 50–500 ng genomic DNA, 25 pmol of each primer and 45 µl of 3 mM MgCl2 ReddyMix PCR pre-mix (ABgene) in a final volume of 50 µl. A two-step temperature regime was used: 94°C for 5 min; 10 cycles of 94°C for 30 s, 44°C for 30 s, 72°C for 60 s; 32 cycles of 94°C for 30 s, 50°C for 30 s, 72°C for 60 s; 72°C for 15 min.

Appropriately sized PCR products were excised and purified from agarose gels using a Nucleospin Extract 2 in 1 kit (Macherey Nagel) following the manufacturer's protocol. PCR products were cloned into the pCR®2.1-TOPO® vector (Invitrogen) following the manufacturer's protocol. White colonies were screened via two rounds of colony PCR: first using the COLMAX(2) primer pair, to identify clones containing MAX4-like sequences; positives were then screened with the DIGMAXA3′ and COLMAX(1)5′ primers to identify putative A and B copies. Colony PCR samples contained 10× NH4 reaction buffer (Bioline), 1.5 mM MgCl2 (Bioline), 200 µM of each dNTP (Promega), 25 pmol of each primer, 0.4 units of BioTaq DNA polymerase (Bioline) and part of a single white colony in a final volume of 20 µl. The following temperature profile was used: 94°C for 10 min; 25 cycles of 94°C for 30 s, 55°C for 30 s, 72°C for 30 s; 72°C for 7 min. Selected colonies were grown in LB medium containing ampicillin (100 mg ml⁻¹), then extracted using a NucleoSpin Plasmid kit (Macherey Nagel). Sequencing of cloned PCR products was carried out with M13 primers using a BigDye™ Terminator v 3.1 100 Reaction Ready kit (Applied Biosystems) following the manufacturer's protocol, and analysed on an ABI PRISM 3100 automated capillary sequencer.

Table 1 Plant material used in the study of MAX4

| Taxon | Voucherᵃ |
|---|---|
| Ingroup | |
| Digitalis ciliata Trautv. | Ross. 786 |
| D. davisiana Heywood | P0002797 |
| D. ferruginea subsp. ferruginea L. | P0019015/16 |
| D. ferruginea subsp. schischkinii (Ivan.) Werner | P0019018 |
| D. grandiflora Mill. | P0019029/30 |
| D. lamarckii Ivanina | Nesbitt & Samuel. 2725 |
| D. lanata subsp. lanata Ehrh. | Optima Iter. IX. 1118 |
| D. lutea subsp. australis (Ten.) Arcang. | P0019021 |
| D. lutea subsp. lutea (Ten.) Arcang. | P0019017 |
| D. minor L. | Bowen. 7316 |
| D. nervosa Steud. & Hochst. ex Benth. | P0019022/23 |
| D. obscura subsp. laciniata (Lindl.) Maire | Mateos et al. 7119/95 |
| D. obscura subsp. obscura L. | Jury et al. 132 |
| D. parviflora Jacq. | P0019019 |
| D. purpurea subsp. heywoodii P. Silva & M. Silva | P0019020 |
| D. purpurea subsp. mauretanica (Humbert & Maire) A. M. Romo | P0006303 |
| D. purpurea subsp. purpurea L. | P0019031 |
| D. thapsi L. | Photographic voucher |
| D. trojana Ivanina | Nesbitt & Samuel. 1797 |
| D. viridiflora Lindl. | Richards et al. 23/06/1999 |
| Isoplexis canariensis (L.) Loudon | P0019028 |
| I. chalcantha Svent. & O'Shann | Carvalho (1999) |
| I. isabelliana (L.) Loudon | Carvalho (1999) |
| I. sceptrum (L.f.) Loudon | P0019032 |
| Outgroup | |
| Erinus alpinus L. | P0019033 |
| Veronica persica Poir. | P0019035 |
| V. serpyllifolia L. | P0019034 |

ᵃ All vouchers are lodged at the herbarium of The University of Reading, United Kingdom (RNG), with the exception of I. chalcantha and I. isabelliana [see Carvalho (1999) for details]
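To make concrete what "degenerate" means for the MAX4(1) primers listed in Table 2, the following minimal Python sketch counts how many distinct oligonucleotides a primer written with IUPAC ambiguity codes represents. The function name `degeneracy` is an illustrative assumption and is not part of the original study; the primer sequences are those given in Table 2.

```python
# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer.

    Hypothetical helper for illustration: multiplies the number of bases
    allowed at each position.
    """
    total = 1
    for code in primer.upper():
        total *= len(IUPAC[code])
    return total


if __name__ == "__main__":
    primers = {
        "MAX4(1)3'": "CCARCADCCRTGVARGCC",
        "MAX4(1)5'": "ATMCCAYTKGAYGGRAGC",
        "DIGMAX(1)3'": "CCAACGGCCGTGCAAGCCATAG",
    }
    for name, seq in primers.items():
        print(name, degeneracy(seq))
```

A non-degenerate primer such as DIGMAX(1)3′ has a degeneracy of 1, which is one way to see why the study-group-specific primers are the more selective pair.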

Sequence alignment and comparative analysis

Similarity of Digitalis/Isoplexis sequences to those available through GenBank (Benson et al. 2007) was assessed using blastn and tblastx searches (Altschul et al. 1997). Digitalis/Isoplexis sequences identified as putative MAX4 homologues were compared with the corresponding segment of the AtMAX4 gene, and the intron/exon structure deduced. Variable positions within the final alignment of Digitalis/Isoplexis MAX4-like sequences were cross-checked against the electropherograms. Sequences were visually inspected for evidence of chimeras resulting from PCR recombination (Bradley and Hillis 1997; Cronn et al. 2002b). Sequence divergence resulting from Taq error during PCR was estimated by calculating the predicted frequency of point mutations under the rate of 0.27–0.85 × 10⁻⁴ errors bp⁻¹ per cycle (Bracho et al. 1998). MAX4-like sequences were aligned by eye in MegAlign (Lasergene, DNASTAR Inc., USA). Putative paralogues, initially identified by the pattern of base substitutions and indels, were further assessed by comparison of sequence distances within and between copies, using the uncorrected "p" distance as implemented in PAUP* v4.0b10 (Swofford 2002). A second measure of sequence distance was calculated under the best-fit model of sequence evolution for the entire sequences, as selected by MrModelTest v2.0 (Nylander 2004) and implemented in PAUP* v4.0b10. An alignment of the MAX4-like partial exons was constructed in MegAlign. Comparison of the predicted amino acid sequences between putative copies, and with the corresponding AtMAX4 segment, was made to identify residues of divergent functional groups.

Table 2 Oligonucleotide primers used in the amplification of the MAX4 region

| Primer name | Sequence (5′–3′) | Use |
|---|---|---|
| MAX4(1)3′ | CCARCADCCRTGVARGCC | Amplification of target region |
| MAX4(1)5′ | ATMCCAYTKGAYGGRAGC | Amplification of target region |
| DIGMAX(1)3′ | CCAACGGCCGTGCAAGCCATAG | Amplification of target region |
| DIGMAX(3)5′ | TATACCATTGGATGGGAGCCCAAATG | Amplification of target region |
| DIGMAXA3′ | CAAGCAAAAATGACRYTTAAACC | Screening of colonies |
| COLMAX(1)5′ | GGATATGTGCAGCATTAACC | Screening of colonies |
| COLMAX(2)3′ | CTTTGTGAGGGTGTTAGGG | Screening of colonies |
| COLMAX(2)5′ | GAACATGGAAGAGGCATGG | Screening of colonies |

[Figure 1: gene diagram drawn 5′ to 3′, with the 555–587 bp target region enlarged and primer sites numbered 1–6; legend: untranslated region, coding region]

Fig. 1 Diagram of the AtMAX4 gene, with the segment targeted during this study enlarged. Boxed areas represent exons, introns are represented by lines. The size of the target region has been estimated by taking into account the intron sizes for MAX4 in Arabidopsis thaliana and Pisum sativum (Sorefan et al. 2003). Arrows indicate primer annealing sites; key to numbers: 1, MAX(1)5′/DIGMAX(3)5′; 2, MAX(1)3′/DIGMAX(1)3′; 3, COLMAX(2)5′; 4, COLMAX(2)3′; 5, COLMAX(1)5′; 6, DIGMAXA3′
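The uncorrected "p" distance mentioned above is the proportion of aligned sites at which two sequences differ. The Python sketch below illustrates the idea only; the function name and the decision to skip sites containing a gap or missing-data symbol are assumptions made here, and PAUP* may handle gaps and ambiguity codes differently.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption for this sketch: any site with a gap ('-') or missing
    data ('?' or 'N') in either sequence is excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    skipped = set("-?N")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skipped or b in skipped:
            continue
        compared += 1
        if a != b:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # Toy sequences, not data from the study.
    print(p_distance("ACGTACGT", "ACGTACGA"))
```

Multiplying the result by 100 gives a percentage divergence of the kind reported for comparisons within and between putative copies.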

In order to place the Digitalis/Isoplexis sequences in a broader context, a wide-scale dataset was constructed. MAX4-like sequences from other taxa were identified by conducting tblastn searches with the predicted amino acid consensus sequence of the MAX4-like clones against the nuclear and EST databases available on GenBank (Benson et al. 2007) and the Populus trichocarpa draft genome sequence v1.0 at the Joint Genome Institute (DoE Joint Genome Institute and Poplar Genome Consortium 2004). All non-duplicated accessions that could be aligned unambiguously to the Digitalis/Isoplexis MAX4-like clones (excluding those with >50% missing data) were included in the dataset. Sequences were trimmed to retain only those nucleotides corresponding to the MAX4 exon segments under investigation and added to the alignment of MAX4-like Digitalis/Isoplexis exons, with adjustment to maintain the correct reading frame with respect to AtMAX4. Prior to further analysis, any identical MAX4-like sequences within the study group were identified and merged to form "combined" sequences. Tests for saturation were carried out as described by Hirt et al. (1999) and Cox et al. (2004) for both the wide-scale and exons + intron datasets. Sequences included in the wide-scale and exons + intron datasets have been submitted to GenBank (accession numbers AJ870346–AJ870385 and AJ870787–AJ870920); see Appendix 1 for a complete list. DNA sequence alignments for both datasets are available by emailing the corresponding author.
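The merging of identical sequences into "combined" sequences described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure the authors used: the function name, the clone labels and the toy sequences are all hypothetical, and sequences are treated as identical only if their aligned strings match exactly (ignoring case).

```python
from collections import defaultdict


def merge_identical(sequences: dict[str, str]) -> dict[str, list[str]]:
    """Group clone names that share exactly the same aligned sequence.

    Returns a mapping from each unique sequence (upper case) to the list
    of clone names carrying it, in input order.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for name, seq in sequences.items():
        groups[seq.upper()].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical clone labels and toy sequences.
    clones = {"clone_1": "ACGTAC", "clone_2": "acgtac", "clone_3": "ACGTAA"}
    for seq, names in merge_identical(clones).items():
        print(seq, "+".join(names))
```

Each group with more than one name would correspond to a single "combined" terminal in the downstream analyses.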

Phylogenetic analyses

Maximum parsimony analyses were implemented in PAUP* v4.0b10 Altivec for Macintosh™ (Swofford 2002) using the heuristic search algorithm, with equal character weighting. Un-rooted analysis of the wide-scale dataset was conducted. Character-state optimisation was set to DELTRAN; 100 random addition sequence replicates were performed with TBR branch swapping, holding ten trees at each step. The MulTrees option was in effect; gaps were treated as missing data. Branch support was assessed using the bootstrap (Felsenstein 1985); 1,000 pseudoreplicates of the full-heuristic search were conducted, with a single random addition sequence replicate, saving no more than ten trees per replicate. Goodness-of-fit statistics were calculated for most parsimonious trees (MPTs) in PAUP* v4.0b10. Maximum parsimony analysis of the MAX4-like exons and intron from the study group was carried out under the same settings, with the exception that an outgroup was defined prior to analysis and that the heuristic search was limited to saving 5,000 trees per replicate (unconstrained searches exhausted the computer's memory capacity, resulting in the premature termination of the search).

For the phylogenetic analyses using Bayesian inference, the best-fit model of evolution for each dataset was selected using MrModelTest v2.0 (Nylander 2004). The wide-scale dataset was evaluated with the third codon positions excluded. For the exons + intron dataset, models were evaluated for three different data partitions: exons + intron, exons only, and intron only. Where the two tests (hLRT and AIC) selected different nested models of evolution, the difference in likelihood between the two models was evaluated using a χ² approximation to assess whether one model was a significantly better fit to the data than the other. Where there was no significant difference in the likelihood of nested models, the more parameter-rich model was applied, as evidence suggests that underparameterisation may compromise accuracy of Bayesian analyses more than overparameterisation (Erixon et al. 2003; Lemmon and Moriarty 2004). To account for site-specific among-site rate variation (SS ASRV), the likelihood of the neighbour-joining tree with Jukes-Cantor distances created by MrModelTest was assessed in PAUP* v4.0b10, under the model of evolution selected by MrModelTest. Likelihood values were calculated with and without correction for SS ASRV.

Analysis of both datasets by Bayesian inference was carried out using MrBayes v3.0b4 (Huelsenbeck and Ronquist 2001). In each search four chains were run, three of which were heated using the default setting, starting from a random tree and sampling every 100 generations. In each case flat priors were implemented. Stationarity of the chain was assessed by separately plotting log likelihood and tree length against generation number; visual inspection of the resulting graphs allowed the point at which the parameters converged on a stable value to be assessed and the trees from generations before this point (the burn-in period) to be discarded. To further confirm that chains had reached convergence, the post burn-in trees were used to construct majority rule consensus trees for independent searches. Topologies, and the posterior probability for each clade, were compared among consensus trees. For the wide-scale dataset, successive searches of two, four and six million generations were conducted. Having established convergence of the parameters on a stable value within the six million generation search by the methods outlined above, a second six million generation search was implemented and the post burn-in trees from both searches pooled to allow a single majority rule consensus tree showing all compatible groupings to be constructed in PAUP* v4.0b10. For the analysis of the exons + intron dataset, one million generations was sufficient to allow full convergence of the chain to be reached. Three independent one million generation searches were carried out, and the sets of post burn-in trees combined to allow the construction of a single majority rule consensus tree.
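The comparison of nested models by a χ² approximation, as described above, is commonly framed as a likelihood ratio test: twice the difference in log likelihood is compared against a χ² distribution whose degrees of freedom equal the number of extra free parameters. The Python sketch below illustrates that calculation using only the standard library. It is not MrModelTest's or PAUP*'s implementation; the function names are assumptions, and the log-likelihood values in the usage example are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df.

    Uses the closed-form series for integer degrees of freedom.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2k: exp(-x/2) * sum_{i<k} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df = 2k+1: erfc(sqrt(x/2)) plus a finite correction series.
    k = (df - 1) // 2
    total = 0.0
    term = 1.0  # x**i / (1*3*...*(2i+1)), starting at i = 0
    for i in range(k):
        if i > 0:
            term *= x / (2 * i + 1)
        total += term
    tail = math.erfc(math.sqrt(half))
    return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


def likelihood_ratio_test(lnl_simple: float, lnl_complex: float,
                          extra_params: int) -> tuple[float, float]:
    """Return (test statistic, p-value) for two nested models."""
    statistic = 2.0 * (lnl_complex - lnl_simple)
    return statistic, chi2_sf(statistic, extra_params)


if __name__ == "__main__":
    # Hypothetical log likelihoods for a simpler and a richer nested model.
    stat, p = likelihood_ratio_test(-1000.0, -998.0, extra_params=1)
    print(f"2*delta lnL = {stat:.2f}, p = {p:.4f}")
```

Under the rule stated in the text, a non-significant result would lead to the more parameter-rich model being retained rather than the simpler one.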

It has been speculated that hybridisation may have played an important role in the evolution of Digitalis (Bräuchler et al. 2004). Therefore, in order to explore potential patterns of reticulate evolution, phylogenetic networks were constructed for the MAX4-like sequences. Network analysis of the MAX4 exons + intron dataset (both including and excluding the outgroup sequence from P. sativum) was carried out in SplitsTree4 V4.8 (Huson 1998) using the NeighbourNet algorithm with the distance measure set to uncorrected "p". This is the best network option for complex datasets as it responds better to these situations than alternative methods (Morrison 2005); for example, SplitDecomposition produces uninformative multifurcations when applied to the full MAX4 exons + intron dataset. SplitDecomposition analysis based on uncorrected "p" distances was also carried out in SplitsTree4 V4.8 (Huson 1998) for a sub-set of the MAX4 sequences, comprising those taxa from section Digitalis (Bräuchler et al. 2004). Support for the graph topology was tested by 1,000 bootstrap pseudoreplicates. SplitDecomposition was chosen as the method of analysis for the reduced dataset as it has a good display of characters in simple to moderate cases of complexity (Morrison 2005) and only represents the strongest signals of incompatibility, thus limiting visual complexity of the resulting graphs and aiding biological interpretation (Winkworth et al. 2005).

Results

Isolation of MAX4 from Digitalis/Isoplexis

One hundred and seventy-four MAX4-like sequences were obtained from 24 of the study taxa. It was not possible to amplify the target from D. minor or from the outgroup taxa Veronica persica and V. serpyllifolia (additional species within the Plantaginaceae also failed to amplify: Callitriche cribrosa, C. hermaphroditica, Globularia cordifolia, and Plantago lanceolata; L. Kelly and A. Culham, unpublished data). Amplification was achieved from all other study taxa (Table 1).

Two major MAX4-like sequence types were detected within Digitalis/Isoplexis, and their sister group Erinus, on the basis of distinctive patterns of substitutions and indels; these were tentatively designated as A and B copies. Despite repeated and targeted attempts, with amplification being carried out under a range of conditions, it was not possible to isolate both of these main sequence types from every taxon. Copy A was not found in D. lamarckii and I. sceptrum, and copy B was not found in D. ciliata, D. davisiana, both subspecies of D. lutea, D. parviflora, the three subspecies of D. purpurea, D. thapsi and D. viridiflora. This is most likely to be a result of selection during the PCR (Wagner et al. 1994), as both copies appear to be under purifying selection (L. Kelly and A. Culham, unpublished data), thus making differential gene loss a less probable scenario.

Digitalis/Isoplexis MAX4-like sequences

The A copy sequences show greater variation in total length, and intron size, than is seen in the B copies (Table 3). Within the sequences, coding sequence length and intron position were highly conserved; the intron splice site was also conserved in comparison with the corresponding fourth intron in AtMAX4. In contrast with AtMAX4, only a single intron was detected in the MAX4-like sequences, the fifth intron with respect to the A. thaliana gene (Fig. 1) being absent in all clones analysed. The length of coding sequence is also conserved in the majority of MAX4-like sequences compared with the corresponding segment in AtMAX4, except in the sequences from D. purpurea subsp. mauretanica, where all four clones analysed had the same 12 bp deletion.

Table 3 Main features of the MAX4-like sequences from Digitalis/Isoplexis

| Feature | Copy A | Copy B |
|---|---|---|
| Full-length, bp | 483–511 | 479–495 |
| Coding length, bp | 397–409 | 397–409 |
| Intron length, bp | 75–102 | 71–86 |
| Intron start position, bp | 176 | 175–176 |
| Divergence within copyᵃ | 0–3.8 (0–3.6) | 0.2–4.6 (0.2–4.3) |
| Divergence between copiesᵃ | 12.1–15.6 (10.3–12.7) | 12.1–15.6 (10.3–12.7) |

ᵃ Percentage divergence based on GTR + γ distances, followed by percentages based on uncorrected pairwise distances in parentheses

Measures of sequence divergence (Table 3) illustrate

that greater genetic distance exists between the A and B

copies than within each putative copy. Distances between

the putative A and B copy sequences from a single species

range between 12.1 and 15.6%. This is far greater than the

maximum distance (4.6%) detected within a set of

sequences for either the A or B copy from a single species.

This highlights the possibility that the A and B type

sequences may be derived from distinct loci. Nevertheless,

the level of variation within copies for certain taxa, as well

as the presence of distinct intron size groups, suggests that

the complexity of MAX4-like sequences within Digitalis/

Isoplexis and Erinus is not fully explained by the presence

of two paralogues. The lower levels of divergence within

copies for some taxa, for example 0.2% within the A copy

sequences of D. viridiflora, could be accounted for by Taq

polymerase incorporation errors during PCR, which

are predicted to equate to 0.1–0.4% sequence divergence

under the rate of Bracho et al. (1998) (although under an

alternative rate a 10-fold lower level of Taq polymerase

errors would be expected; Cline et al. 1996). However,

other taxa show variation within copies that is substantially

greater than 0.4%, which might be indicative of allelic

variation within heterozygous individuals, homoeologues

resulting from polyploidisation events in Digitalis/Iso-

plexis, or additional distinct MAX4-like loci within the

major sequence types.
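The uncorrected pairwise distances reported in Table 3 can be illustrated with a minimal sketch. It assumes pre-aligned sequences and skips any column in which either sequence carries a gap or ambiguity code (pairwise deletion); how gaps were actually treated is not stated here, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected pairwise distance: the proportion of differing sites
    among aligned columns where both sequences have an unambiguous base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = "ACGT"
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:  # assumption: pairwise deletion of gaps
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def divergence_range(aligned_seqs):
    """Minimum and maximum pairwise p-distance, as percentages."""
    distances = [100 * p_distance(a, b) for a, b in combinations(aligned_seqs, 2)]
    return min(distances), max(distances)


# Toy alignment, invented for illustration (not data from the study)
clones = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACG-AT"]
low, high = divergence_range(clones)
print(f"{low:.1f}-{high:.1f}%")
```

Applied separately to the clones of one copy from one taxon, and then to A versus B clones, a calculation of this kind would give within-copy and between-copy ranges of the sort compared above; the model-corrected (GTR + Γ) distances require a likelihood framework and are not reproduced by this sketch.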

Comparison of predicted protein sequences between the

MAX4-like putative copies and the corresponding sequence

from AtMAX4 indicates that some of the variation seen at

the nucleotide level between the major groups of MAX4-

like sequences has translated to divergence within amino

acids (Table 4). Whilst the majority of these changes have

been between amino acids of the same functional group,

three cases show some divergence in functional group.

Divergence between AtMAX4 and MAX4-like predicted

protein sequences from the study group was 27.4% for both

A and B copies.

Phylogenetic analyses

Tests for saturation (Hirt et al. 1999; Cox et al. 2004) on

the wide-scale dataset indicated that multiple hits had

occurred at the third codon positions; the corresponding

nucleotides were therefore excluded from all analyses of

this dataset. The parsimony analysis revealed that the

sequences obtained from Digitalis/Isoplexis form a single

well-supported monophyletic group. Sequences from out-

side of the study group were used to root the trees (Fig. 2).

Whilst the Digitalis/Isoplexis sequences form a well-sup-

ported monophyletic group (91% bootstrap support, BS),

resolution and support within this group are poor (BS of

less than 70%). There is also insufficient support and res-

olution to determine whether the Digitalis/Isoplexis

sequences are paralogous or orthologous to AtMAX4, or to

any of the other MAX4-like sequences from taxa outside of

the study group. In Bayesian analyses the GTR + Γ model

was used. Results of the parsimony and Bayesian analyses

of the wide-scale dataset are largely congruent (Fig. 2).
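The exclusion of third codon positions described above can be sketched as follows. This is a minimal illustration, assuming the intron has already been removed and the reading frame of the fragment is known; the function name and the `frame_offset` parameter are hypothetical and do not describe the software used in the study.

```python
def drop_third_codon_positions(coding_seq, frame_offset=0):
    """Return the sequence with every third codon position removed.

    frame_offset is the index of the first base of the first complete
    codon (0 if the fragment starts in frame).
    """
    return "".join(
        base
        for i, base in enumerate(coding_seq)
        if (i - frame_offset) % 3 != 2  # position 2 within a codon is the third base
    )


# Toy example, invented for illustration
print(drop_third_codon_positions("ATGGCCTAA"))
```

The same filter would be applied to every sequence in the alignment, so that only first and second codon positions contribute characters to the wide-scale analysis.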

The wide-scale analysis revealed that the sequences

from Erinus are not sister to the Digitalis/Isoplexis

sequences, rather they nest within those from the ingroup

species. At the same time it was demonstrated that the

study group sequences form a well-supported monophy-

letic group, thus allowing any of the non-study group

sequences to act as an outgroup during analysis of the

Digitalis/Isoplexis exons and intron and to allow rooting of

the tree. To reduce the impact of saturation on the analysis

of the exons and intron a single outgroup sequence was

included, P. sativum (AY557341), having been selected on

the basis of sequence distance to the Digitalis/Isoplexis

sequences under the GTR + Γ model. Analyses were

carried out with and without the outgroup, revealing that

the inclusion of the outgroup sequence does not confound

the recovery of relationships within Digitalis/Isoplexis. We

therefore included this sequence in the final analysis of the

exons + intron dataset.

Maximum parsimony analysis of the exons + intron

dataset resulted in a total of 430,924 MPTs, each of 399

steps (Fig. 3). Despite the inclusion of the third codon

Table 4 Comparison of the predicted amino acid consensus from A and B copy MAX4-like sequences with AtMAX4

Divergent residuesᵃ (classified by functional group)

                 Conserved             Semi-conserved        Non-conserved         Total
                 1          2          1          2          1          2          1          2
AtMAX4 segment   –          –          –          –          –          –          –          –
A copy           15.6 (21)  –          6.7 (9)    –          5.2 (7)    –          27.4 (37)  –
B copy           15.6 (21)  5.2 (7)    5.8 (8)    0.7 (1)    5.9 (8)    1.5 (2)    27.4 (37)  7.4 (10)

ᵃ Percentages of the total 135 residues, followed by whole numbers in parentheses


positions and intron sequence in the exons + intron dataset,

branch lengths for many of the terminals are very short,

emphasising the paucity of available characters (Table 5).

The two major groups of MAX4-like sequences, identified

during alignment, form strongly supported (95% BS) A and

B copy clades. Although resolution for the terminal groups

is generally poor, a few groups with >80% bootstrap

support can be detected within the two major clades.

Within the A copy clade, the sequences from the three

subspecies of D. purpurea and D. thapsi form a mono-

phyletic group (86% BS). However, sequence seven from

D. purpurea subsp. purpurea is sister to other sequences

within this clade. This nested clade also has good bootstrap

support (91%), suggesting the existence of alleles or

additional MAX4-like homoeologues or paralogues within

D. purpurea subsp. purpurea. The nesting of sequences

from Erinus alpinus within those from Digitalis/Isoplexis

provides further indication that some of the A copy

sequences are non-orthologous. Within the B copy clade

the sequences from D. grandiflora are sister to the rest of

the B copies, with sequences from E. alpinus nested within

the main B clade. This emphasises the incongruence

between the MAX4-like phylogenetic tree and the existing

taxonomy. Sequences from several of the taxa are spread

among the different well-supported groups in the B copy

clade, indicating that multiple alleles, homoeologues or

paralogues may be represented.

In Bayesian analyses the GTR + SS model was applied

to the coding sequences and the HKY + Γ model to the

intron. Comparison of the topologies obtained under

maximum parsimony and Bayesian inference reveals a

high degree of congruence between the results. All clades

with 70% bootstrap support or above are found to have at

least a 95% posterior probability in the Bayesian tree (not

shown). There are, however, two clades within the Bayesian

consensus tree with a posterior probability of 95% or more

that have less than 50% bootstrap support in the parsimony

tree, marked with asterisks in Fig. 3.

NeighbourNet analysis of the exons + intron dataset

revealed a high degree of character conflict, represented by

the multiple reticulations within the graph (Fig. 4a). This

emphasises the complexity in the patterns of shared char-

acter states among the sequences, which may indicate that

evolution of the sequences is more complex than can be

represented on a bifurcating tree. The SplitDecomposition

network (Fig. 4b) for section Digitalis also shows a num-

ber of reticulations between the sequences.
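The character conflict that appears as reticulation in a splits graph can be illustrated with the pairwise compatibility criterion for splits: two bipartitions of the taxa can be shown on the same tree only if at least one of the four intersections between their sides is empty. The sketch below is not an implementation of NeighbourNet or SplitDecomposition; the function names and the toy taxa are hypothetical.

```python
from itertools import combinations


def splits_compatible(side_a, side_b, taxa):
    """True if the two splits can both be displayed on a single tree,
    i.e. at least one of the four side-by-side intersections is empty."""
    other_a = taxa - side_a
    other_b = taxa - side_b
    return any(not (x & y) for x in (side_a, other_a) for y in (side_b, other_b))


def count_conflicts(splits, taxa):
    """Number of pairwise incompatible splits in a collection."""
    return sum(
        1 for s, t in combinations(splits, 2) if not splits_compatible(s, t, taxa)
    )


# Toy example, invented for illustration: each split is given by one of its sides
taxa = frozenset({"t1", "t2", "t3", "t4", "t5"})
splits = [
    frozenset({"t1", "t2"}),
    frozenset({"t3", "t4"}),
    frozenset({"t1", "t3"}),  # conflicts with both splits above
]
print(count_conflicts(splits, taxa))
```

A set of splits with no incompatible pairs fits a bifurcating tree; each incompatible pair needs a box-like structure in a network, which is why heavily conflicting data produce the reticulate graphs seen in Fig. 4.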

Comparison with ITS and trnL-F

Comparison of the number of phylogenetically informative

characters between the MAX4-like sequences and other

molecular character sources, such as ITS and trnL-F, is

made difficult by unresolved issues concerning paralogy

and orthology within the MAX4 dataset. Proportions of

parsimony informative characters within the MAX4A and B

copy sequences from Digitalis/Isoplexis (11 and 10.7%

respectively) are substantially higher than seen in trnL-F

(1.3%), but lower than in ITS (15%), based on a largely

similar set of ingroup taxa from the study of Brauchler

Fig. 2 Phylogenetic trees from the analysis of the wide-scale dataset, illustrating the monophyly of the study group sequences (78 terminals). Left: a single MPT from the set of 1,706 retained. Length = 334; CI = 0.692; RI = 0.823; RC = 0.575. Numbers above branches are bootstrap support values; those below 50% are not shown. Right: Bayesian inference consensus tree, showing all compatible groupings. Numbers above branches are posterior probabilities; those of <1 are not shown. An arrow indicates the group with >95% posterior probability but <50% bootstrap support in the parsimony analysis


Fig. 3 A single MPT from the set of 430,924 found during parsimony analysis of the exons + intron dataset. Length = 399; CI = 0.769; RI = 0.967; RC = 0.744. Numbers above branches are bootstrap support values; asterisks denote clades with <50% bootstrap support but with >95% posterior probability in the Bayesian analysis. Arrows indicate branches that collapse in the strict consensus tree. a “A copy” clade and outgroup, parallel bars indicate points at which branch lengths have been truncated to aid visual representation; b “B copy” clade. Taxa that appear in more than one part of the two major clades, or that are found within combined sequences also containing other species, are coded in the bar on the right hand side (see key in top left hand corner of A). Subspecies of a single species are given the same code, as are the four Isoplexis species. Numbers in parentheses indicate the number of clones used to construct combined sequences


et al. (2004). However, the results obtained from the

analyses of the MAX4-like exons and intron give a strong

indication that further alleles, homoeologues or paralogues

are present within both the A and B copy datasets. Con-

sequently, any set of orthologous sequences would have a

lower proportion of phylogenetically informative charac-

ters than is indicated by the figures above.
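The character categories compared here (and tabulated in Table 5) can be sketched as a simple column-by-column classification of an alignment. The sketch assumes the usual parsimony definition, in which a site is informative when at least two states each occur in at least two sequences, and it treats every other variable site as "autapomorphic"; gaps and ambiguity codes are ignored. Function names and the toy alignment are hypothetical.

```python
from collections import Counter


def classify_column(column, valid="ACGT"):
    """Classify one alignment column as constant, autapomorphic
    (variable but parsimony-uninformative) or informative."""
    counts = Counter(s.upper() for s in column if s.upper() in valid)
    if len(counts) <= 1:
        return "constant"
    states_shared = sum(1 for n in counts.values() if n >= 2)
    return "informative" if states_shared >= 2 else "autapomorphic"


def summarise_alignment(aligned_seqs):
    """Count and percentage of columns in each class."""
    classes = Counter(classify_column(col) for col in zip(*aligned_seqs))
    total = sum(classes.values())
    return {
        name: (classes[name], 100 * classes[name] / total)
        for name in ("constant", "autapomorphic", "informative")
    }


# Toy alignment, invented for illustration (not data from the study)
toy = ["ACGA", "ACTA", "ATTC", "ATGA"]
for name, (n, pct) in summarise_alignment(toy).items():
    print(f"{name}: {pct:.1f} ({n})")
```

Because the classification depends on which sequences are included, mixing non-orthologous sequences in one alignment inflates the informative count, which is the reason given above for expecting a lower proportion in any truly orthologous subset.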

Comparison of the MAX4-like strict consensus tree

obtained under maximum parsimony (not shown) with the

ITS-trnL-F tree for Digitalis/Isoplexis (Brauchler et al.

Fig. 3 continued (b “B copy” clade)


2004) reveals substantial incongruence. For example, evi-

dence from ITS-trnL-F indicates that the four species of

Isoplexis form a strongly supported monophyletic group,

with 100% bootstrap support. Furthermore, this group is

also recovered in the separate analyses of ITS and trnL-F

data carried out by Brauchler et al. (2004), although not

highly supported in the trnL-F tree. However, within the

MAX4-like phylogenetic tree sequences from Isoplexis are

not monophyletic. Overall, only three groups from the

MAX4-like strict consensus parsimony tree, that include

sequences from more than a single taxon, were found to

contain the same species as in the phylogenetic tree pro-

duced with ITS-trnL-F. However, the results of the

phylogenetic analysis of the ITS-trnL-F dataset also

revealed incongruence with the existing species classification,

with three species (D. ferruginea, D. lutea, and

D. purpurea) lacking monophyly.

The ITS-trnL-F phylogeny of Digitalis/Isoplexis

(Brauchler et al. 2004) failed to fully resolve the relation-

ships between species. Taxa from the ingroup were

represented by 32 terminals, only 7 of which (22%) were

fully resolved. However, an even lower level of resolution

was observed within the MAX4-like tree. Of the 113 ter-

minals representing the study group only four (3.5%) were

completely resolved in the parsimony strict consensus tree.

Discussion

Evolution of MAX4-like sequences in Digitalis/

Isoplexis

It is evident from this study that MAX4-like gene evolution

within Digitalis/Isoplexis has involved a complex series of

events. Given the occurrence of polyploidy within this

group we could expect to detect multiple homoeologues,

and thus a higher copy number than is the case in func-

tionally diploid species such as Arabidopsis thaliana

(where MAX4 is apparently single copy). However, the

polyploid nature of the majority of the study group species

is not sufficient to account for the full range of sequence

variants detected, as the diploid sister taxon Erinus alpinus

is also characterised by the presence of multiple sequence

types.

Within the MAX4-like dataset two main putative copies,

A and B, can be detected on the basis of distinctive patterns

of substitutions and indels. The designation of these two

main paralogues is also supported by the fact that the

sequences corresponding to these copies form separate and

well-supported monophyletic groups under both maximum

parsimony and Bayesian inference. The A and B copies are

both present within Erinus alpinus, indicating that MAX4

apparently underwent at least one duplication event prior to

the divergence of Digitalis/Isoplexis and Erinus. However,

the complexity of sequence variants within the MAX4-like

dataset is not limited to these two main putatively paralo-

gous copies. Examination of both the A and B clades

reveals major incongruence with the existing species

classification and previous phylogenetic reconstructions

(Carvalho 1999; Brauchler et al. 2004). The exact number

of sequence classes is not easily identifiable, but given the

fact that sequences from E. alpinus nest within those from

Digitalis/Isoplexis in both the A and B clades it is apparent

that for this diploid taxon at least two distinct groups must

be present within the A copy clade, and another two within

the B copy clade; with the expectation of the polyploid

Digitalis/Isoplexis species containing additional sequence

classes.

Recent evidence of the duplication of floral regulatory

genes within the Lamiales has led to the suggestion that an

Table 5 Distribution of characters within the datasets

                      Total length, bp   Constant characters   Autapomorphic characters   Parsimony informative characters
Wide-scale dataset
  Codon position 1    137                32.8 (45)             28.5 (39)                  38.7 (53)
  Codon position 2    138                47.1 (65)             23.9 (33)                  29.0 (40)
  Codon position 3    137                2.9 (4)               4.4 (6)                    92.7 (127)
  Total               412                27.7 (114)            18.9 (78)                  53.4 (220)
Full-length datasetᵃ
  Codon position 1    136                61.0 (83)             26.5 (36)                  12.5 (17)
  Codon position 2    136                69.1 (94)             20.6 (28)                  10.3 (14)
  Codon position 3    137                27.0 (37)             33.6 (46)                  39.4 (54)
  Intron              125                56.8 (71)             8.8 (11)                   34.4 (43)
  Total               534                53.4 (285)            22.7 (121)                 24.1 (128)

Percentages expressed as a proportion of the total number of characters for each data partition, followed by whole numbers in parentheses
ᵃ Outgroup sequence from P. sativum is excluded from the comparison of character distribution within the full-length dataset


ancient whole genome duplication event has occurred in

this order, although it is likely to have taken place after the

divergence of the Plantaginaceae and the lineage leading to

many of the other families (Aagaard et al. 2005). It has also

been concluded that despite the presence of paralogues for

multiple genes among many families of the Lamiales, there

is little evidence within extant chromosome numbers to

support a whole genome duplication (Aagaard et al. 2005).

The extant chromosome number of Erinus alpinus also

gives little expectation for the presence of multiple copies

of MAX4, but the results presented here raise the possibility

that a more recent genome duplication event may have

occurred within the Plantaginaceae.

It is likely that the MAX4-like dataset obtained during

this investigation represents a mixture of different

sequence types, including paralogues, homoeologues and

alleles that, whilst arising from different processes, are

not easily distinguishable on the basis of sequence data

alone (Sang 2002; Small et al. 2004). Thus, several

processes, such as gene duplication, polyploidisation and

lineage sorting may have contributed to the complex

pattern of sequence relationships revealed from the

MAX4-like tree and resulted in the incongruence with the

pattern of species relationships inferred from the results

of the phylogenetic study of Brauchler et al. (2004). It

has been speculated previously that hybridisation may

have played an important part in the evolution of Digi-

talis (Brauchler et al. 2004). Phylogenetic trees cannot

directly depict hybrids (Vriesendorp and Bakker 2005),

but situations involving reticulate evolution can be vis-

ualised on phylogenetic networks (Winkworth et al.

2005). The results of the network analyses of the MAX4

sequences from Digitalis/Isoplexis reveal numerous

character incompatibilities, which are displayed as

Fig. 4 Splits graphs of the MAX4-like sequences. a NeighbourNet graph based on uncorrected “p” distances of 118 MAX4-like sequences (exons + intron) from Digitalis/Isoplexis, goodness of fit = 97.75%. Uppercase letters denote groups formed by the A and B type sequences. An arrow indicates the placement of the outgroup sequence (P. sativum AY557341) when included in the analysis. b SplitDecomposition network based on uncorrected “p” distances for the 12 MAX4-like sequences (all A copy) from Digitalis sect. Digitalis, goodness of fit = 90.99%. Values above the branches indicate bootstrap support. Key to species abbreviations: Dph, Digitalis purpurea subsp. heywoodii; Dpm, D. purpurea subsp. mauretanica; Dpp, D. purpurea subsp. purpurea; Dt, D. thapsi


reticulations on the graphs (see Fig. 4). There are three

possible interpretations of the reticulations displayed in

these graphs (Morrison 2005). They may represent

homoplasy in the dataset (such as parallelisms or

reversals), result from uncertainty or ambiguity (such as

through the comparison of non-orthologous sequences) or

they represent events involving gene exchange between

unrelated organisms (such as hybridisation or lateral gene

transfer) (Morrison 2005). Sequence divergence is relatively

low within the MAX4 sequences (see “Results”),

and NeighbourNet graphs constructed with the intron and

nucleotides corresponding to the third codon position

excluded still show multiple reticulations (not shown).

Thus, it is unlikely that all of the conflict can be

accounted for by homoplasy. Population-level processes

can also mimic the patterns expected from species-level

reticulations (Linder and Rieseberg 2004), and in the

case of the graph for the full dataset it seems likely that

at least some of the reticulations result from these con-

founding processes. However, in the case of the

reticulations shown in the SplitDecomposition graph for

section Digitalis it is more plausible that some of them

may represent true reticulations at the species level, as

D. purpurea subsp. heywoodii and D. thapsi form areas

of hybridisation within their native range (Brauchler

et al. 2004).

Potential for use of MAX4 in phylogenetic inference

The complexity of MAX4-like gene evolution within Digi-

talis/Isoplexis, as revealed by this study, has significant

implications for the potential use of these genes as sources

of phylogenetically informative data. Comparison of the

number of phylogenetically informative characters

between the MAX4-like sequences and conventional

molecular character sources, such as ITS and trnL-F, is

complicated by the unresolved issues concerning copy

number. However, it can be seen from the trees generated

by the analysis of the MAX4-like exons and intron (Fig. 3)

that large areas of the topologies lack resolution, or where

clades are resolved support for this resolution is poor. In

some cases the lack of resolution is within different clones

from a particular species; however, in other clades

sequences from separate genera are unresolved. This is an

obvious indication that the level of variation within a single

class of sequences is not sufficient to provide full resolu-

tion within the study group.

In addition to low levels of sequence variation, the

presence of multiple sequence variants within the dataset

severely limits the use of the MAX4-like genes for recov-

ering species relationships within Digitalis/Isoplexis. At

present the inability to reliably identify and distinguish the

various classes of sequences within all of the species

studied means that the accuracy of phylogenetic estimation

is compromised, as it cannot be confirmed that orthologues

are being compared (Doyle 1992; Martin and Burg 2002).

Previous studies have successfully incorporated data from

paralogous genes into phylogenetic analyses; these include

the use of PHY (Phytochrome) genes in the Poaceae

(Mathews and Sharrock 1996), ADH (Alcohol Dehydro-

genase) genes in Paeonia (Sang et al. 1997) and LEGCYC

(LEG CYCLOIDEA) genes in Lupinus (Ree et al. 2004).

However, in the case of the region examined in this study

the use of paralogues to provide multiple datasets is not an

option, as it is currently not possible to confirm the exact

number of sequence variants within the MAX4-like dataset,

and consequently orthologous sequences cannot be deter-

mined and delineated.

Whilst the MAX4-like region is not appropriate for

species level delimitation within Digitalis/Isoplexis, the

features that make the development of low-copy nuclear

genes so challenging, such as the variability in copy

number and levels of divergence, may also suggest that

this region could be of use in other groups of plants.

Vicilin and Chalcone Synthase are examples of genes

that have multiple copies in some lineages but have been

used successfully for phylogenetic inference in plant

groups where they are single copy (Whitlock and Baum

1999; Koch et al. 2001). However, any future use of

MAX4 in studies of species relationships would require

careful characterisation of the gene’s evolutionary

dynamics for the plant group in question (for example,

through the use of Southern hybridisation to establish

copy number), to allow the confident identification of

orthologous sequences. In addition to applications to the

study of species relationships, genes such as MAX4 may

be of relevance to studies of evolutionarily important

processes such as gene duplication events. As gene

duplications are the ultimate source of evolutionary

novelty (Charlesworth et al. 2001) there has been interest

in studying the adaptive significance of these events

(Lawton-Rauh 2003). Thus, whilst not all low-copy

nuclear gene regions may be useful in delimiting species

relationships, they may have the potential to contribute

to our understanding of processes that help shape the

evolution of plants. One example of this may be through

the use of networks to explore conflicting patterns of

characters within the data, as the complex evolutionary

processes that characterise the speciation of plants are

not likely to be well represented by bifurcating trees

(Winkworth et al. 2005). Networks allow the represen-

tation of more of the phylogenetic information within a

dataset (Posada and Crandall 2001) and are important

tools for studying complex patterns in molecular

sequence data (Winkworth et al. 2005). The studies of

Joly and Bruneau (2006) and Brysting et al. (2007) are


two examples that combine the use of low-copy nuclear

gene sequences and network algorithms to address

complex cases of species-level evolution in plants, and

illustrate the potential utility of low-copy nuclear gene

data for purposes other than the reconstruction of

bifurcating trees.

Conclusions

The current study is the first to assess the utility of MAX4 for

use in phylogenetic inference, and the first report of the use

of low-copy nuclear gene data in phylogenetic analyses of

Digitalis/Isoplexis. Although the approach adopted was

successful in isolating the target region from species in

which the MAX4 gene was previously unknown, the data

were not ultimately appropriate for use in resolving phy-

logenetic relationships between the study species.

Alternative approaches are now available that may have an

increased likelihood of successful loci development, and

avoid the need for heavy investment in a single candidate

gene (see reviews by Schluter et al. 2005; Hughes et al.

2006). It has been suggested that the development of low-

copy nuclear gene regions is like a lottery, with occasional

successes against a backdrop of generally disappointing

results that are rarely reported in the literature (Hughes et al.

2006). The results of our development of MAX4 can be seen

as conforming to this under-reported trend. In light of the

relatively slow accumulation of variable species-level loci,

the reporting of such results will allow the most efficient use

of future research efforts. In conclusion, we recommend

that any future use of MAX4 as a source of characters for

phylogenetic investigations in flowering plants should be

limited to groups where this gene exhibits less complex

evolutionary dynamics, and where orthologous sequences

can be confidently identified and isolated.

Acknowledgements

We thank Victor Albert, Jim Dunwell, Gary Rosenberg,

James Tosh, Ben Warren and an anonymous reviewer for

their helpful comments on earlier drafts of this manuscript,

Ovidiu Paun for help with the network analyses and Mike

Wilkinson for useful discussion. We also thank Jose

Carvalho for providing Isoplexis chalcantha and I. isabel-

liana DNA extractions and for use of his Digitalis/Isoplexis

ITS/trnL-F dataset, and the Chelsea Physic Garden for

providing plant material of I. canariensis and I. sceptrum.

This work was supported financially by the Natural Envi-

ronment Research Council.

Appendix

Table 6

Table 6 Sequences included in the MAX4-like datasets, with Gen-

Bank accession numbers

Sequences GenBank number

Wide-scale dataset: ingroup sequences

Digitalis ciliata 3–4 AJ870348–49

D. davisiana 1–2 AJ870350-51

D. ferruginea subsp. ferruginea3, 6–10, 12

AJ870354, AJ870357-61,

AJ870363

D. ferruginea subsp. schischkinii2–3

AJ870366-67

D. grandiflora 3 AJ870370

D. lanata subsp. lanata 1 AJ870378

D. lutea subsp. australis 4, 10, 14 AJ870383, AJ870790, AJ870794

D. nervosa 2, 4, 6–7, 9 AJ870803, AJ870805, AJ870807-

08, AJ870810

D. obscura subsp. laciniata 5–6 AJ870815-16

D. obscura subsp. obscura 3–6,

10–13, 18, 22–24, 26

AJ870819-22, AJ870826-29,

AJ870834, AJ870838-40,

AJ870842

D. parviflora 3 AJ870845

D. purpurea subsp. heywoodii 1 AJ870850

D. purpurea subsp. mauretanica3

AJ870853

D. purpurea subsp. purpurea 1, 7 AJ870855, AJ870861

D. thapsi 1–3, 5 AJ870862-64, AJ870866

D. trojana 5, 7–8 AJ870871, AJ870873-74

Isoplexis canariensis 1, 4, 7 AJ870878, AJ870881, AJ870884

I. chalcantha 2 AJ870886

I. sceptrum 5, 7– 9, 13, 16 AJ870899, AJ870901-03,

AJ870907, AJ870910

Erinus alpinus 1, 4, 7 AJ870914, AJ870917, AJ870920

D. grandiflora A Combined AJ870368-69

D. grandiflora B Combined AJ870371-73

D. nervosa B Combined AJ870802, AJ870804, AJ870806,

AJ870809

D. obscura subsp. obscura B

Combined

AJ870817-18, AJ870823,

AJ870825, AJ870830-33,

AJ870835-37, AJ870841

D. purpurea subsp. mauretanicaA Combined

AJ870851-52, AJ870858

D. purpurea subsp. purpurea A

Combined 2

AJ870857-58

D. viridiflora A Combined AJ870875-77

Combined A1 (D. lutea subsp.

australis & D. obscura subsp.

laciniata)

AJ870796, AJ870814

Combined A3 (D. ciliata,

D. lutea both subspecies,

D. obscura both subspecies,

D. parviflora, D. trojana, &

Erinus)

AJ870346-47, AJ870380-82,

AJ870384-85, AJ870787-89,

AJ870791-93, AJ870795,

AJ870797, AJ870798-01,

AJ870811, AJ870824,

AJ870844, AJ870847,

AJ870849, AJ870867,

AJ870918

146 Plant Syst Evol (2008) 273:133–149

123

Table 6 continued

Sequences GenBank number

Combined A4 (D. purpureasubsp. purpurea & D. thapsi)

AJ870856, AJ870859-60,

AJ870865

Combined A5 (D. ferruginea subsp. ferruginea, I. canariensis, I. chalcantha & I. isabelliana) AJ870352, AJ870880, AJ870883, AJ870885, AJ870887, AJ870891

Combined B1 (D. lamarckii, D. obscura subsp. laciniata, D. trojana & I. sceptrum) AJ870374-77, AJ870812, AJ870868, AJ870904

Combined B2 (D. ferruginea subsp. ferruginea, D. lanata subsp. lanata, I. sceptrum & Erinus) AJ870355, AJ870362, AJ870364, AJ870379, AJ870895, AJ870897-98, AJ870900, AJ870904-05, AJ870908-09, AJ870911-13, AJ870916

Combined B4 (D. trojana, I. canariensis, I. isabelliana, I. sceptrum, Erinus) AJ870870, AJ870879, AJ870882, AJ870889, AJ870893-94, AJ870896, AJ870915, AJ870919

Combined B5 (D. ferruginea subsp. ferruginea, D. obscura subsp. laciniata, D. trojana, I. chalcantha & I. isabelliana) AJ870353, AJ870356, AJ870813, AJ870869, AJ870872, AJ870888, AJ870892

Wide-scale dataset: outgroup sequences

Arabidopsis thaliana AL161582

Medicago truncatula AJ499477

Oryza sativa AP003376

Physcomitrella patens subsp. patens BJ167696

Picea glauca CO482424

Populus trichocarpa POPS114408a

Saccharum officinarum CA100587

Sorghum bicolor CD462656

Triticum aestivum BQ788859

Zea mays CO526902

Oryza sativa Combined AP003141, AK058473

Pisum sativum Combined AY557342, AY55734

Populus trichocarpa Combined TREE470814a, TREE818199a

Exons + intron dataset: ingroup sequences

D. ciliata 3–4 AJ870348-49

D. davisiana 1–2 AJ870350-51

D. ferruginea subsp. ferruginea 1–3, 5–10, 12 AJ870352-54, AJ870356-61, AJ870363

D. ferruginea subsp. schischkinii 1–3 AJ870365-67

D. grandiflora 3, 6 AJ870370, AJ870373

D. lamarckii 3 AJ870376

D. lanata subsp. lanata 1–2 AJ870378-79

D. lutea subsp. australis 1, 3–4, 7, 10, 14–15, 17 AJ870380, AJ870382-83, AJ870787, AJ870790, AJ870794-95, AJ870797

D. nervosa 2–4, 6–9 AJ870803-05, AJ870807-10

D. obscura subsp. laciniata 1–2, 5–6 AJ870811-12, AJ870815-16


D. obscura subsp. obscura 1, 3–6, 9–13, 18, 20–26 AJ870817, AJ870819-22, AJ870825-29, AJ870834, AJ870836-42

D. parviflora 1, 3–4, 6 AJ870843, AJ870845-46, AJ870848

D. purpurea subsp. heywoodii 1 AJ870850

D. purpurea subsp. mauretanica 3 AJ870853

D. purpurea subsp. purpurea 1, 7 AJ870855, AJ870861

D. thapsi 1–5 AJ870862-66

D. trojana 2, 5, 7–8 AJ870868, AJ870871, AJ870873-74

D. viridiflora 1 AJ870875

I. canariensis 1, 3–4, 6–7 AJ870878, AJ870880-81, AJ870883-84

I. chalcantha 2, 4 AJ870886, AJ870888

I. isabelliana 1, 3 AJ870889, AJ870891

I. sceptrum 3, 5, 7–9, 11–13, 16 AJ870897, AJ870899, AJ870901-03, AJ870905-07, AJ870910

Erinus alpinus 1, 2, 4–5, 7 AJ870914-15, AJ870917-18, AJ870920

D. grandiflora A Combined AJ870368-69

D. grandiflora B Combined AJ870371-72

D. nervosa B Combined AJ870802, AJ870806

D. obscura subsp. obscura B Combined AJ870818, AJ870823, AJ870830-33, AJ870835

D. parviflora A Combined AJ870844, AJ870849

D. purpurea subsp. mauretanica A Combined AJ870851-52, AJ870854

D. purpurea subsp. purpurea A Combined AJ870856, AJ870859-60

D. purpurea subsp. purpurea A Combined 2 AJ870857-58

D. viridiflora A Combined AJ870876-77

I. chalcantha A Combined AJ870885, AJ870887

Combined A1 (D. lutea subsp. australis & D. obscura subsp. laciniata) AJ870796, AJ870814

Combined A2 (D. obscura subsp. obscura & D. trojana) AJ870824, AJ870867

Combined A3 (D. ciliata, D. lutea both subspecies & D. parviflora) AJ870346-47, AJ870381, AJ870384-85, AJ870788-89, AJ870791-93, AJ870798-01, AJ870847


Combined B1 (D. lamarckii & I. sceptrum) AJ870374-75, AJ870377, AJ870904

Combined B2 (D. ferruginea subsp. ferruginea & I. sceptrum) AJ870355, AJ870900, AJ870908-09, AJ870911, AJ870913


References

Aagaard JE, Olmstead RG, Willis JH, Phillips PC (2005) Duplication

of floral regulatory genes in the Lamiales. Amer J Bot

92(8):1284–1293

Albach DC, Martinez-Ortega M, Fischer MA, Chase MW (2004)

Evolution of Veronicaceae: a phylogenetic perspective. Ann

Missouri Bot Gard 91:275–302

Altschul SF, Madden TL, Schaffer AA, Zhang JH, Zhang Z, Miller

W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new

generation of protein database search programs. Nucl Acids Res

25:3389–3402

Alvarez I, Wendel JF (2003) Ribosomal ITS sequences and plant

phylogenetic inference. Molec Phylogenet Evol 29:417–434

Angiosperm Phylogeny Group II (2003) An update of the angiosperm

phylogeny group classification for the orders and families of

flowering plants: APG II. Bot J Linn Soc 141:399–436

Archambault A, Bruneau A (2004) Phylogenetic utility of the LEAFY/FLORICAULA gene in the Caesalpinioideae (Leguminosae):

gene duplication and a novel insertion. Syst Bot 29:609–626

Bailey CD, Price RA, Doyle JJ (2002) Systematics of the halimolobine

Brassicaceae: evidence from three loci and morphology.

Syst Bot 27:318–332

Bailey CD, Hughes CE, Harris SA (2004) Using RAPDs to identify

DNA sequence loci for species level phylogeny reconstruction:

an example from Leucaena (Fabaceae). Syst Bot 29:4–14

Barber JC, Francisco-Ortega J, Santos-Guerra A, Turner KG, Jansen

RK (2002) Origin of Macaronesian Sideritis L. (Lamioideae:

Lamiaceae) inferred from nuclear and chloroplast sequence

datasets. Molec Phylogenet Evol 23:293–306

Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DJ

(2007) GenBank. Nucleic Acids Res 35:D21–D25

Bohle UR, Hilger HH, Martin WF (1996) Island colonization and

evolution of the insular woody habit in Echium L. (Boraginaceae).

Proc Natl Acad Sci USA 93:11740–11745

Bracho MA, Moya A, Barrio E (1998) Contribution of Taq polymerase-induced errors to the estimation of RNA virus

diversity. J Gen Virol 79:2921–2928

Bradley RD, Hillis DM (1997) Recombinant DNA sequences

generated by PCR amplification. Molec Biol Evol 14:592–593

Brauchler C, Meimberg H, Heubl G (2004) Molecular phylogeny of

the genera Digitalis L. and Isoplexis (Lindley) Loudon

(Veronicaceae) based on ITS- and trnL-F sequences. Pl Syst Evol

248:111–128

Brysting AK, Oxelman B, Huber KT, Moulton V, Brochmann C

(2007) Untangling complex histories of genome mergings in

high polyploids. Syst Biol 56:467–476

Carlquist S (1974) Island biology. Columbia University Press, New

York

Carvalho JA (1999) Systematic studies of the genera Digitalis L. and

Isoplexis (Lindl.) Loud. (Scrophulariaceae: Digitaleae) and

conservation of Isoplexis species. Ph. D. thesis, University of

Reading, United Kingdom

Carvalho JA, Culham A (1998) Conservation status and preliminary

results on the phylogenetics of Isoplexis an endemic Macaronesian

genus. Bol Mus Municipal Funchal 5:109–127

Charlesworth D, Charlesworth B, McVean GAT (2001) Genome

sequences and evolutionary biology, a two-way interaction.

Trends Ecol Evol 16:235–242

Cline J, Braman JC, Hogrefe HH (1996) PCR fidelity of Pfu DNA

polymerase and other thermostable DNA polymerases. Nucl

Acids Res 24:3546–3551

Cox CJ, Goffinet B, Shaw AJ, Boles SB (2004) Phylogenetic

relationships among the mosses based on heterogeneous Bayesian

analysis of multiple genes from multiple genomic

compartments. Syst Bot 29:234–250

Cronn RC, Small RL, Haselkorn T, Wendel JF (2002a) Rapid

diversification of the cotton genus (Gossypium: Malvaceae)

revealed by analysis of sixteen nuclear and chloroplast genes.

Amer J Bot 89:707–725

Cronn RC, Cedroni M, Haselkorn T, Grover C, Wendel JF (2002b)

PCR-mediated recombination in amplification products derived

from polyploid cotton. Theor Appl Genet 104:482–489

DoE Joint Genome Institute and Poplar Genome Consortium (2004)

Populus trichocarpa genome v1.0.

http://genome.jgi-psf.org/Poptr1/Poptr1.home.html. Last accessed 28th Nov 2007

Doyle JJ (1992) Gene trees and species trees—molecular systematics

as one-character taxonomy. Syst Bot 17:144–163

Doyle JJ, Doyle JL (1987) A rapid DNA isolation procedure for small

quantities of fresh leaf tissue. Phytochem Bull Bot Soc Am

19:11–15

Erixon P, Svennblad B, Britton T, Oxelman B (2003) Reliability of

Bayesian posterior probabilities and bootstrap frequencies in

phylogenetics. Syst Biol 52:665–673

Felsenstein J (1985) Confidence limits on phylogenies: an approach

using the bootstrap. Evolution 39:783–791

Francisco-Ortega J, Santos-Guerra A, Hines A, Jansen RK (1997)

Molecular evidence for a Mediterranean origin of the Macaronesian

endemic genus Argyranthemum (Asteraceae). Amer J Bot

84:1595–1613

Francisco-Ortega J, Fuertes-Aguilar J, Kim SC, Santos-Guerra A,

Crawford DJ, Jansen RK (2002) Phylogeny of the Macaronesian

endemic Crambe section Dendrocrambe (Brassicaceae) based on

internal transcribed spacer sequences of nuclear ribosomal DNA.

Amer J Bot 89:1984–1990

Ganders FR, Berbee M, Pirseyedi M (2000) ITS base sequence

phylogeny in Bidens (Asteraceae): evidence for the continental

relatives of Hawaiian and Marquesan Bidens. Syst Bot 25:122–133

Harris SA (1995) Systematics and randomly amplified polymorphic

DNA in the genus Leucaena (Leguminosae, Mimosoideae). Pl

Syst Evol 197:195–208

Helfgott DM, Francisco-Ortega J, Santos-Guerra A, Jansen RK,

Simpson BB (2000) Biogeography and breeding system evolution

of the woody Bencomia alliance (Rosaceae) in Macaronesia

based on ITS sequence data. Syst Bot 25:82–97

Hirt RP, Logsdon JM, Healy B, Dorey MW, Doolittle WF, Embley

TM (1999) Microsporidia are related to Fungi: evidence from the

largest subunit of RNA polymerase II and other proteins. Proc

Natl Acad Sci USA 96:580–585

Table 6 continued

Sequences GenBank number

Combined B3 (D. ferruginea subsp. ferruginea, I. sceptrum & Erinus) AJ870362, AJ870364, AJ870895, AJ870898, AJ870912, AJ870916

Combined B4 (D. trojana, I. canariensis, I. isabelliana, I. sceptrum) AJ870870, AJ870879, AJ870882, AJ870893-94, AJ870896, AJ870912

Combined B5 (D. obscura subsp. laciniata, D. trojana & I. isabelliana) AJ870813, AJ870869, AJ870872, AJ870892

Exons + intron dataset: outgroup sequence

Pisum sativum AY557341

a Populus trichocarpa draft genome v1.0 accession numbers

(http://genome.jgi-psf.org/Poptr1/Poptr1.home.html)


Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference

of phylogenetic trees. Bioinformatics 17:754–755

Hughes CE, Eastwood RJ, Bailey CD (2006) From famine to feast?

Selecting nuclear DNA sequence loci for plant species-level

phylogeny reconstruction. Philos Trans Roy Soc London B Biol

Sci 361:211–225

Huson DH (1998) SplitsTree: analysing and visualizing evolutionary

data. Bioinformatics 14:68–73

Joly S, Bruneau A (2006) Incorporating allelic variation for reconstructing

the evolutionary history of organisms from multiple

genes: an example from Rosa in North America. Syst Biol

55:623–636

Kim SC, Crawford DJ, Francisco-Ortega J, Santos Guerra A (1996) A

common origin for woody Sonchus and five related genera in the

Macaronesian islands: molecular evidence for extensive radiation.

Proc Natl Acad Sci USA 93:7743–7748

Koch M, Haubold B, Mitchell-Olds T (2001) Molecular systematics

of the Brassicaceae: evidence from coding plastidic matK and

nuclear Chs sequences. Amer J Bot 88:534–544

Lawton-Rauh A (2003) Evolutionary dynamics of duplicated genes in

plants. Molec Phylogenet Evol 29:396–409

Lemmon AR, Moriarty EC (2004) The importance of proper model

assumption in Bayesian phylogenetics. Syst Biol 53:265–277

Linder CR, Rieseberg LH (2004) Reconstructing patterns of reticulate

evolution in plants. Amer J Bot 91:1700–1708

Martin AP, Burg TM (2002) Perils of paralogy: using HSP70 genes

for inferring organismal phylogenies. Syst Biol 51:570–587

Martins TR, Barkman TJ (2005) Reconstruction of Solanaceae

phylogeny using the nuclear gene SAMT. Syst Bot 30:435–447

Mason-Gamer RJ, Weil CF, Kellogg EA (1998) Granule-bound starch

synthase: structure, function and phylogenetic utility. Molec Biol

Evol 15:1658–1673

Mathews S, Sharrock RA (1996) The phytochrome gene family in

grasses (Poaceae): a phylogeny and evidence that grasses have a

subset of the loci found in dicot angiosperms. Molec Biol Evol

13:1141–1150

Morrison DA (2005) Networks in phylogenetic analysis: new tools for

population biology. Int J Parasitol 35:567–582

Mort ME, Crawford DJ (2004) The continuing search: low-copy

nuclear sequences for lower-level plant molecular phylogenetic

studies. Taxon 53:257–261

Nylander JAA (2004) MrModelTest 2.0. program distributed by the

author. Evolutionary Biology Centre, Uppsala University

Olmstead RG, DePamphilis CW, Wolfe AD, Young ND, Elisons WJ,

Reeves PA (2001) Disintegration of the Scrophulariaceae. Amer

J Bot 88:348–361

Pamilo P, Nei M (1988) Relationships between gene trees and species

trees. Molec Biol Evol 5:568–583

Panero JL, Francisco-Ortega J, Jansen RK, Santos-Guerra A (1999)

Molecular evidence for multiple origins of woodiness and a New

World biogeographic connection of the Macaronesian Island

endemic Pericallis (Asteraceae: Senecioneae). Proc Natl Acad

Sci USA 96:13886–13891

Posada D, Crandall KA (2001) Intraspecific gene genealogies: trees

grafting into networks. Trends Ecol Evol 16:37–45

Ree RH, Citerne HL, Lavin M, Cronk QCB (2004) Heterogeneous

selection on LEGCYC paralogs in relation to flower morphology

and the phylogeny of Lupinus (Leguminosae). Molec Biol Evol

21:321–331

Rokas A, Carroll SB (2005) More genes or more taxa? The relative

contribution of gene number and taxon number to phylogenetic

accuracy. Molec Biol Evol 22:1337–1344

Rokas A, Williams BL, King N, Carroll SB (2003) Genome-scale

approaches to resolving incongruence in molecular phylogenies.

Nature 425:798–804

Sang T (2002) Utility of low-copy nuclear gene sequences in plant

phylogenetics. Crit Rev Biochem Molec Biol 37:121–147

Sang T, Donoghue MJ, Zhang DM (1997) Evolution of alcohol

dehydrogenase genes in peonies (Paeonia): phylogenetic relationships

of putative nonhybrid species. Molec Biol Evol

14:994–1007

Schluter PM, Stuessy TF, Paulus HF (2005) Making the first step:

practical considerations for the isolation of low-copy nuclear

sequence markers. Taxon 54:766–770

Small RL, Cronn RC, Wendel JF (2004) Use of nuclear genes for

phylogeny reconstruction in plants. Austral Syst Bot 17:145–170

Snowden KC, Simkin AJ, Janssen BJ, Templeton KR, Loucas HM,

Simons JL, Karunairetnam S, Gleave AP, Clark DG, Klee HJ

(2005) The Decreased apical dominance1/Petunia hybrida CAROTENOID CLEAVAGE DIOXYGENASE8 gene affects

branch production and plays a role in leaf senescence, root

growth and flower development. Pl Cell 17:746–759

Soltis DE, Soltis PS (2000) Contributions of plant molecular systematics

to studies of molecular evolution. Pl Molec Biol 42:45–75

Sorefan K, Booker J, Haurogne K, Goussot M, Bainbridge K, Foo E,

Chatfield S, Ward S, Beveridge C, Rameau C, Leyser O (2003)

MAX4 and RMS1 are orthologous dioxygenase-like genes that

regulate shoot branching in Arabidopsis and pea. Genes Dev

17:1469–1474

Strand AE, LeebensMack J, Milligan BG (1997) Nuclear DNA-based

markers for plant evolutionary biology. Molec Ecol 6:113–118

Swofford DL (2002) PAUP*-Phylogenetic analysis using parsimony

(*and other methods): version 4.0b10. Sinauer, Sunderland

Syring J, Willyard A, Cronn R, Liston A (2005) Evolutionary

relationships among Pinus (Pinaceae) subsections inferred from

multiple low-copy nuclear loci. Amer J Bot 92:2086–2100

Tank DC, Sang T (2001) Phylogenetic utility of the

glycerol-3-phosphate acyltransferase gene: evolution and implications in

Paeonia (Paeoniaceae). Molec Phylogenet Evol 19:421–429

Tank DC, Beardsley PM, Kelchner SA, Olmstead RG (2006) Review

of the systematics of Scrophulariaceae s.l. and their current

disposition. Austral Syst Bot 19:289–307

Vriesendorp B, Bakker FT (2005) Reconstructing patterns of

reticulate evolution in angiosperms: what can we do? Taxon

54:593–604

Wagner A, Blackstone N, Cartwright R, Dick M, Misof B, Snow P,

Wagner GP, Bartels J, Murtha M, Pendleton J (1994) Surveys of

gene families using polymerase chain reaction–PCR selection

and PCR drift. Syst Biol 43:250–261

Whitlock BA, Baum DA (1999) Phylogenetic relationships of

Theobroma and Herrania (Sterculiaceae) based on sequences

of the nuclear gene Vicilin. Syst Bot 24:128–138

Whittall JB, Medina-Marino A, Zimmer EA, Hodges SA (2006)

Generating single-copy nuclear gene data for a recent adaptive

radiation. Molec Phylogenet Evol 39:124–134

Winkworth RC, Bryant D, Lockhart PJ, Havell D, Moulton V (2005)

Biogeographic interpretation of splits graphs: least squares

optimization of branch lengths. Syst Biol 54:56–65
