ORIGINAL ARTICLE
Phylogenetic utility of MORE AXILLARY GROWTH4 (MAX4)-like genes: a case study in Digitalis/Isoplexis (Plantaginaceae)
L. J. Kelly · A. Culham
Received: 4 June 2007 / Accepted: 21 December 2007 / Published online: 10 May 2008
© Springer-Verlag 2008
Abstract We present the first assessment of phylogenetic
utility of a potential novel low-copy nuclear gene region in
flowering plants. A fragment of the MORE AXILLARY
GROWTH 4 gene (MAX4, also known as RAMOSUS1 and
DECREASED APICAL DOMINANCE1), predicted to span
two introns, was isolated from members of Digitalis/
Isoplexis. Phylogenetic analyses, under both maximum
parsimony and Bayesian inference, were performed and
revealed evidence of putative MAX4-like paralogues. The
MAX4-like trees were compared with those obtained for
Digitalis/Isoplexis using ITS and trnL-F, revealing a high
degree of incongruence between these different DNA
regions. Network analyses indicate complex patterns of
evolution between the MAX4 sequences, which cannot be
adequately represented on bifurcating trees. The incidence
of paralogy restricts the use of MAX4 in phylogenetic
inference within the study group, although MAX4 could
potentially be used in combination with other DNA regions
for resolving species relationships in cases where para-
logues can be clearly identified.
Keywords Digitalis · Isoplexis · Low-copy nuclear gene region · MAX4/RMS1/DAD1 · Molecular phylogeny · Network · Paralogy
Introduction
One of the greatest challenges facing plant molecular phy-
logenetics remains the development of appropriate DNA
regions to act as sources of characters. It is widely understood
that in order to reconstruct species relationships accurately,
data from multiple independently evolving DNA regions
need to be incorporated into phylogenetic analyses (Pamilo
and Nei 1988; Doyle 1992; Strand et al. 1997; Soltis and
Soltis 2000; Cronn et al. 2002a; Rokas et al. 2003; Rokas and
Carroll 2005). Plastid DNA has been the most widely used
source of characters for phylogenetic inference in plants
(Small et al. 2004). The internal transcribed spacer region
(ITS) of nuclear ribosomal DNA (nrDNA) is one of the most
common sources of characters for use in low-level phylo-
genetic investigations (Alvarez and Wendel 2003; Hughes
et al. 2006). Whilst congruence between plastid DNA and
nrDNA trees can be used to provide confidence that the
relationships inferred are an accurate reconstruction of
organismal history, in some cases it is necessary to seek
additional loci to allow resolution of conflict between com-
peting hypotheses (Cronn et al. 2002a). It has been suggested
that nuclear genes represent a virtually unlimited source of
molecular characters for use in phylogenetic inference of
plants (Sang 2002; Alvarez and Wendel 2003; Small et al.
2004), and may offer a variety of other beneficial features
(for reviews see Sang 2002; Mort and Crawford 2004; Small
et al. 2004). Consequently there has been a steady increase in
the number of low-copy nuclear gene regions being devel-
oped for use in phylogenetic investigations of plants (e.g.
Mathews and Sharrock 1996; Mason-Gamer et al. 1998;
Tank and Sang 2001; Martins and Barkman 2005; Syring
et al. 2005; Whittall et al. 2006). Nevertheless, there are still
relatively few regions that have been tested in a range of plant
lineages (Mort and Crawford 2004; Small et al. 2004).
L. J. Kelly · A. Culham
Centre for Plant Diversity and Systematics,
School of Biological Sciences, University of Reading,
Whiteknights, Reading RG6 6AS, UK
L. J. Kelly (✉)
Jodrell Laboratories, Royal Botanic Gardens,
Kew, Richmond, Surrey TW9 3DS, UK
e-mail: [email protected]
Plant Syst Evol (2008) 273:133–149
DOI 10.1007/s00606-008-0008-0
Moreover, it has been shown that the complex evolutionary
dynamics that characterise nuclear genes mean that the
regions currently available will not be appropriate for phy-
logeny reconstruction in all plant groups (e.g. Bailey et al.
2002, 2004; Archambault and Bruneau 2004), and as yet no
universally useful low-copy nuclear DNA sequence loci
have been developed (Hughes et al. 2006).
In this paper we present an assessment of the utility of a
potential novel low-copy nuclear gene region for phylo-
genetics, using Digitalis/Isoplexis as an exemplar. Digitalis
L. and Isoplexis (Lindley) Loudon are closely related
groups within the Plantaginaceae (sensu Angiosperm
Phylogeny Group II 2003; some authors advocate the use
of Veronicaceae to refer to the family, e.g. Olmstead et al.
2001; Tank et al. 2006). The c. 19 continental Digitalis
species are almost exclusively herbaceous (Brauchler et al.
2004), whilst the four Macaronesian island Isoplexis spe-
cies exhibit insular woodiness, one of the classical patterns
of plant evolution on islands (Carlquist 1974). It would
appear that the majority, if not all, of the species within
Digitalis/Isoplexis are octoploids, with the base number for
this group estimated to be x = 7 (Albach et al. 2004) and
the most commonly reported chromosome number for
species of Digitalis/Isoplexis being n = 28 (Index to Plant
Chromosome Numbers, http://mobot.mobot.org/W3T/Search/ipcn.html).
Recent molecular phylogenies for
these genera, using sequences from the nrDNA region ITS
and the plastid region trnL-F (trnL gene and trnL-F
intergenic spacer), indicate that Isoplexis nests within
Digitalis and should therefore be reduced to a sectional
rank (Carvalho 1999; Brauchler et al. 2004). However, in
common with phylogenies of many other insular woody
plant lineages the use of nrDNA and plastid DNA regions
has been insufficient to fully resolve species level rela-
tionships (e.g. Bohle et al. 1996; Kim et al. 1996;
Francisco-Ortega et al. 1997, 2002; Panero et al. 1999;
Ganders et al. 2000; Helfgott et al. 2000; Barber et al.
2002). Consequently, additional character sources are
needed to allow the resolution of relationships within
Digitalis/Isoplexis, making it an appropriate group in
which to test the utility of a novel gene region. The suit-
ability of Digitalis/Isoplexis as a test case is also promoted
by the existence of the phylogenies using conventional
molecular character sources (Carvalho and Culham 1998;
Carvalho 1999; Brauchler et al. 2004), which provide a
framework within which to judge the potential utility of
newly developed gene regions.
The MORE AXILLARY GROWTH4/RAMOSUS1/DEC-
REASED APICAL DOMINANCE1 (MAX4/RMS1/DAD1)
gene was first isolated from Arabidopsis thaliana by Sorefan
et al. (2003), who revealed that it contains six exons, pre-
dicted to encode a 570 amino acid protein that is a member of
the polyene chain dioxygenase superfamily. Loss of function
mutation in MAX4 results in plants with a characteristically
bushy phenotype, which develops as a result of increased bud
growth from the axils of rosette leaves. Evidence from
studies of these mutants, including gene expression analyses
and grafting experiments, led Sorefan et al. (2003) to propose
that MAX4 regulates shoot branching via the production of a
novel mobile branch-inhibiting signal. Further investigation
has also revealed that MAX4 is orthologous to RAMOSUS1
(RMS1) in Pisum sativum (Sorefan et al. 2003) and to the
DECREASED APICAL DOMINANCE1 gene in Petunia hybrida
(Snowden et al. 2005). It has also been shown that
MAX4/RMS1/DAD1 has a similar exon/intron structure in
these three species (Sorefan et al. 2003; Snowden et al.
2005). A number of features of this gene favoured its
selection over other potential candidates. Firstly MAX4/
RMS1/DAD1 (henceforth referred to as MAX4) is apparently
single copy in A. thaliana, as a tblastx search (Altschul et al.
1997) with the MAX4 sequence against the complete genome
of A. thaliana on GenBank (Benson et al. 2007) fails to reveal
any other significantly similar sequences. However, similar
searches of the genomic sequence for Oryza sativa indicate
that this monocotyledonous species may have more than
one copy of MAX4. Another advantage of MAX4 was the
availability of information regarding both intron size and
position in A. thaliana and P. sativum. Knowledge of gene
structure is essential in order to allow confident positioning
of primers. In addition, expressed sequence tag (EST)
sequences were available from taxonomically diverse spe-
cies, thus allowing the identification of evolutionarily
conserved areas of amino acids at which primers can be
placed (Strand et al. 1997).
In this study we aimed to assess the potential of MAX4
to act as a novel source of molecular characters for species
level phylogeny reconstruction by investigating and cha-
racterising the evolutionary dynamics of this gene in
Digitalis/Isoplexis. Suitability of this gene for use in phy-
logenetic inference was gauged by comparing the numbers
of phylogenetically informative characters with those
available for Digitalis/Isoplexis from two separate and
independently evolving DNA regions, the nrDNA region
ITS and the plastid region trnL-F (Brauchler et al. 2004). In
addition, we examined congruence, levels of support for
congruent clades, and the levels of resolution in the MAX4
phylogeny compared to those obtained for ITS and trnL-F.
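The comparison of "phylogenetically informative characters" mentioned above can be illustrated with a minimal sketch. It assumes the usual parsimony definition (a site is informative when at least two character states each occur in at least two sequences) and treats gaps and ambiguity symbols as missing data; the toy alignment is invented for illustration and is not taken from the study.

```python
from collections import Counter

def is_parsimony_informative(column, missing="-?N"):
    """True if at least two states each occur in at least two sequences."""
    counts = Counter(base for base in column.upper() if base not in missing)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def count_informative_sites(alignment):
    """alignment: list of aligned sequences (strings of equal length)."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*alignment) walks the alignment column by column
    return sum(is_parsimony_informative("".join(col)) for col in zip(*alignment))

# Hypothetical toy alignment (not real MAX4 data)
toy_alignment = [
    "ACGTA",
    "ACGTC",
    "ATGAC",
    "ATG-A",
]
print(count_informative_sites(toy_alignment))
```

Counts obtained this way for different regions are only comparable when the same taxa are sampled, which is one reason a complementary taxon set matters.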
Materials and methods
Taxon sampling
Ingroup and outgroup sampling was aimed towards
assembling a taxa set complementary to Carvalho (1999)
and Brauchler et al. (2004). In total, 24 members of
Digitalis/Isoplexis were sampled for MAX4, with two out-
group genera: Erinus and Veronica (Table 1).
Molecular methods
DNA was extracted from fresh material using a CTAB
protocol modified from Doyle and Doyle (1987). DNA
from herbarium voucher material was extracted using the
method of Harris (1995) with minor modifications. For
Isoplexis chalcantha and Isoplexis isabelliana DNA
extractions prepared by Carvalho (1999) were used.
Angiosperm sequences, identified via tblastx searches of
the nuclear and EST databases on GenBank (Benson et al.
2007) using the MAX4 sequence from Arabidopsis thali-
ana, were aligned and used for design of degenerate PCR
primers (Table 2). Primers were positioned to flank the
fourth and fifth introns of MAX4 and amplify a region of
approximately 555–587 bp (Fig. 1), with intron position
and size based on information from A. thaliana and P.
sativum (Sorefan et al. 2003). The degenerate MAX(1)
primer pair was used for initial characterisation of MAX4 in
five exemplar taxa of Digitalis/Isoplexis, selected to rep-
resent each of the major clades recognised in the ITS-trnL-
F phylogeny of Carvalho (1999). We aimed to sequence 20
MAX4-like clones from each of these species. PCR
amplification was carried out in 50 µl reactions containing
~200 ng genomic DNA, 10× NH4 reaction buffer (Bioline),
3 mM MgCl2 (Bioline), 200 µM of each dNTP
(Promega), 100 pmol of each primer, 1 unit of BioTaq
DNA polymerase (Bioline). The following temperature
profile was used: 94°C for 5 min; 35 cycles of 94°C for
30 s, 60°C for 30 s, 72°C for 60 s; 72°C for 7 min.
Preliminary phylogenetic analysis identified two main classes
of sequences (A and B). Amplification of the target region
from the remaining taxa was achieved using the non-degenerate
study group specific primers (DIGMAX(1)3′/(3)5′), aiming to
sequence six MAX4-like clones, including
both A and B copies, from each taxon. Reactions contained
50–500 ng genomic DNA, 25 pmol of each primer and
45 µl of 3 mM MgCl2 ReddyMix PCR pre-mix (ABgene)
Table 1 Plant material used in the study of MAX4

Taxon                                                         Voucher (a)
Ingroup
Digitalis ciliata Trautv.                                     Ross. 786
D. davisiana Heywood                                          P0002797
D. ferruginea subsp. ferruginea L.                            P0019015/16
D. ferruginea subsp. schischkinii (Ivan.) Werner              P0019018
D. grandiflora Mill.                                          P0019029/30
D. lamarckii Ivanina                                          Nesbitt & Samuel. 2725
D. lanata subsp. lanata Ehrh.                                 Optima Iter. IX. 1118
D. lutea subsp. australis (Ten.) Arcang.                      P0019021
D. lutea subsp. lutea (Ten.) Arcang.                          P0019017
D. minor L.                                                   Bowen. 7316
D. nervosa Steud. & Hochst. ex Benth.                         P0019022/23
D. obscura subsp. laciniata (Lindl.) Maire                    Mateos et al. 7119/95
D. obscura subsp. obscura L.                                  Jury et al. 132
D. parviflora Jacq.                                           P0019019
D. purpurea subsp. heywoodii P. Silva & M. Silva              P0019020
D. purpurea subsp. mauretanica (Humbert & Maire) A. M. Romo   P0006303
D. purpurea subsp. purpurea L.                                P0019031
D. thapsi L.                                                  Photographic voucher
D. trojana Ivanina                                            Nesbitt & Samuel. 1797
D. viridiflora Lindl.                                         Richards et al. 23/06/1999
Isoplexis canariensis (L.) Loudon                             P0019028
I. chalcantha Svent. & O’Shann                                Carvalho (1999)
I. isabelliana (L.) Loudon                                    Carvalho (1999)
I. sceptrum (L.f.) Loudon                                     P0019032
Outgroup
Erinus alpinus L.                                             P0019033
Veronica persica Poir.                                        P0019035
V. serpyllifolia L.                                           P0019034

(a) All vouchers are lodged at the herbarium of The University
of Reading, United Kingdom (RNG), with the exception of
I. chalcantha and I. isabelliana [see Carvalho (1999) for details]
in a final volume of 50 µl. A two-step temperature regime
was used: 94°C for 5 min; 10 cycles of 94°C for 30 s, 44°C
for 30 s, 72°C for 60 s; 32 cycles of 94°C for 30 s, 50°C
for 30 s, 72°C for 60 s; 72°C for 15 min. Appropriately sized
PCR products were excised and purified from agarose gels
using a Nucleospin Extract 2 in 1 kit (Macherey Nagel)
following the manufacturer’s protocol. PCR products were
cloned into the pCR®2.1-TOPO® vector (Invitrogen)
following the manufacturer’s protocol. White colonies were
screened via two rounds of colony PCR: first using the
COLMAX(2) primer pair, to identify clones containing
MAX4-like sequences; positives were then screened with
the DIGMAXA3′ and COLMAX(1)5′ primers to identify
putative A and B copies. Colony PCR samples contained
10× NH4 reaction buffer (Bioline), 1.5 mM MgCl2 (Bioline),
200 µM of each dNTP (Promega), 25 pmol of each
primer, 0.4 units of BioTaq DNA polymerase (Bioline) and
part of a single white colony in a final volume of 20 µl. The
following temperature profile was used: 94°C for 10 min;
25 cycles of 94°C for 30 s, 55°C for 30 s, 72°C for 30 s;
72°C for 7 min. Selected colonies were grown in LB medium
containing ampicillin (100 mg ml⁻¹) then extracted using a
NucleoSpin Plasmid kit (Macherey Nagel). Sequencing of
cloned PCR products was carried out with M13 primers
using a BigDye™ Terminator v 3.1 100 Reaction Ready
kit (Applied Biosystems) following the manufacturer’s
protocol, and analysed on an ABI PRISM 3100 automated
capillary sequencer.
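As an aside on the degenerate primers described in this section, the following minimal sketch shows how the degeneracy of a primer written with IUPAC ambiguity codes can be counted and expanded. The two primer sequences are those listed for the degenerate pair in Table 2; the function names are illustrative only and do not correspond to any software used in the study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    n = 1
    for code in primer.upper():
        n *= len(IUPAC[code])
    return n

def expand(primer):
    """List every non-degenerate sequence encoded by the primer."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in primer.upper()))]

# Degenerate primer pair as given in Table 2
REVERSE_PRIMER = "CCARCADCCRTGVARGCC"
FORWARD_PRIMER = "ATMCCAYTKGAYGGRAGC"

print(degeneracy(REVERSE_PRIMER), degeneracy(FORWARD_PRIMER))
```

Higher degeneracy broadens the range of templates that can be amplified but dilutes the concentration of any single matching oligonucleotide, which is one plausible reason for moving to non-degenerate, study group specific primers once initial sequences are available.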
Sequence alignment and comparative analysis
Similarity of Digitalis/Isoplexis sequences to those avail-
able through GenBank (Benson et al. 2007) was assessed
using blastn and tblastx searches (Altschul et al. 1997).
Digitalis/Isoplexis sequences identified as putative MAX4
homologues were compared with the corresponding seg-
ment of the AtMAX4 gene, and the intron/exon structure
deduced. Variable positions within the final alignment of
Digitalis/Isoplexis MAX4-like sequences were cross-
checked against the electropherograms. Sequences were
visually inspected for evidence of chimeras resulting from
PCR recombination (Bradley and Hillis 1997; Cronn et al.
2002b). Sequence divergence resulting from Taq error
during PCR was estimated by calculating the predicted
Table 2 Oligonucleotide primers used in the amplification of the MAX4 region

Primer name    Sequence (5′–3′)             Use
MAX4(1)3′      CCARCADCCRTGVARGCC           Amplification of target region
MAX4(1)5′      ATMCCAYTKGAYGGRAGC           Amplification of target region
DIGMAX(1)3′    CCAACGGCCGTGCAAGCCATAG       Amplification of target region
DIGMAX(3)5′    TATACCATTGGATGGGAGCCCAAATG   Amplification of target region
DIGMAXA3′      CAAGCAAAAATGACRYTTAAACC      Screening of colonies
COLMAX(1)5′    GGATATGTGCAGCATTAACC         Screening of colonies
COLMAX(2)3′    CTTTGTGAGGGTGTTAGGG          Screening of colonies
COLMAX(2)5′    GAACATGGAAGAGGCATGG          Screening of colonies
Fig. 1 Diagram of the AtMAX4 gene (5′ to 3′), with the segment targeted
during this study (555–587 bp) enlarged. Boxed areas represent exons
(shaded to distinguish untranslated region from coding region), introns are
represented by lines. The size of the target region has been estimated
by taking into account the intron sizes for MAX4 in Arabidopsis
thaliana and Pisum sativum (Sorefan et al. 2003). Arrows indicate
primer annealing sites, key to numbers: 1, MAX(1)5′/DIGMAX(3)5′;
2, MAX(1)3′/DIGMAX(1)3′; 3, COLMAX(2)5′; 4, COLMAX(2)3′;
5, COLMAX(1)5′; 6, DIGMAXA3′
frequency of point mutations under the rate of 0.27–0.85 × 10⁻⁴
errors bp⁻¹ per cycle (Bracho et al. 1998).
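The arithmetic behind such an estimate can be sketched as follows. The text does not state the formula used, so this assumes the simplest linear approximation (error rate per bp per cycle multiplied by the number of cycles); the cycle counts of 35 and 42 (10 + 32) are taken from the PCR profiles in Molecular methods, whereas the 500 bp fragment length is a round illustrative value.

```python
# Error-rate bounds quoted in the text (Bracho et al. 1998), errors per bp per cycle
LOW_RATE = 0.27e-4
HIGH_RATE = 0.85e-4

def expected_error_fraction(rate_per_bp_per_cycle, cycles):
    """Assumed linear model: fraction of sites altered = rate x number of cycles."""
    if rate_per_bp_per_cycle < 0 or cycles < 0:
        raise ValueError("rate and cycles must be non-negative")
    return rate_per_bp_per_cycle * cycles

def expected_errors_per_clone(rate_per_bp_per_cycle, cycles, length_bp):
    """Expected number of polymerase-induced substitutions in one cloned fragment."""
    return expected_error_fraction(rate_per_bp_per_cycle, cycles) * length_bp

for cycles in (35, 42):  # 35-cycle profile and 10 + 32 cycle two-step profile
    low = 100 * expected_error_fraction(LOW_RATE, cycles)
    high = 100 * expected_error_fraction(HIGH_RATE, cycles)
    print(f"{cycles} cycles: {low:.2f}-{high:.2f}% of sites")
```

Under this assumption the expected error lies well below one percent of sites, so only very small within-copy differences could plausibly be attributed to polymerase error alone.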
MAX4-like sequences were aligned by eye in MegAlign
(Lasergene, DNASTAR Inc., USA). Putative paralogues,
initially identified by the pattern of base substitutions and
indels, were further assessed by comparison of sequence
distances within and between copies, using the uncorrected
“p” distance as implemented in PAUP* v4.0b10 (Swofford
2002). A second measure of sequence distance was cal-
culated under the best-fit model of sequence evolution for
the entire sequences, as selected by MrModelTest v2.0
(Nylander 2004) and implemented in PAUP* v4.0b10. An
alignment of the MAX4-like partial exons was constructed
in MegAlign. Comparison of the predicted amino acid
sequences between putative copies, and with the corre-
sponding AtMAX4 segment, was made to identify residues
of divergent functional groups.
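For readers unfamiliar with the uncorrected "p" distance, a minimal sketch of the calculation is given here: the proportion of compared sites at which two aligned sequences differ. Skipping sites with gaps or ambiguity symbols is an assumption made for this sketch; the exact handling of such sites in PAUP* may differ.

```python
def p_distance(seq_a, seq_b, missing="-?N"):
    """Uncorrected p distance: differing sites / compared sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue  # assumption: sites with missing data are not compared
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared

# Hypothetical aligned fragments (not real MAX4 clones)
print(p_distance("ACGTACGT", "ACGTACGA"))
```

Because no correction is made for multiple substitutions at the same site, p distances tend to underestimate divergence between more distant sequences, which is why a model-based distance is reported alongside them.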
In order to place the Digitalis/Isoplexis sequences in a
broader context, a wide-scale dataset was constructed.
MAX4-like sequences from other taxa were identified by
conducting tblastn searches with the predicted amino acid
consensus sequence of the MAX4-like clones against the
nuclear and EST databases available on GenBank (Benson
et al. 2007) and the Populus trichocarpa draft genome
sequence v1.0 at the Joint Genome Institute (DoE Joint
Genome Institute and Poplar Genome Consortium 2004).
All non-duplicated accessions that could be aligned
unambiguously to the Digitalis/Isoplexis MAX4-like clones
(excluding those with >50% missing data) were included
in the dataset. Sequences were trimmed to retain only those
nucleotides corresponding to the MAX4 exon segments
under investigation and added to the alignment of MAX4-
like Digitalis/Isoplexis exons, with adjustment to maintain
the correct reading frame with respect to AtMAX4. Prior to
further analysis, any identical MAX4-like sequences within
the study group were identified and merged to form
“combined” sequences. Tests for saturation were carried
out as described by Hirt et al. (1999) and Cox et al. (2004)
for both the wide-scale and exons + intron datasets.
Sequences included in the wide-scale and exons + intron
datasets have been submitted to GenBank (accession
numbers AJ870346–AJ870385 and AJ870787–AJ870920),
see Appendix 1 for a complete list. DNA sequence align-
ments for both datasets are available by emailing the
corresponding author.
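Two of the data-handling steps described in this section, excluding accessions with more than 50% missing data and merging identical sequences into "combined" sequences, can be sketched as below. This is an illustrative reconstruction rather than the procedure actually used; the record names and the "+" naming convention for merged sequences are hypothetical.

```python
def missing_fraction(seq, missing="-?N"):
    """Fraction of alignment positions that are gaps or ambiguous."""
    if not seq:
        return 1.0
    return sum(base in missing for base in seq.upper()) / len(seq)

def drop_sparse(records, max_missing=0.5):
    """Keep (name, sequence) records with at most max_missing missing data."""
    return [(n, s) for n, s in records if missing_fraction(s) <= max_missing]

def merge_identical(records):
    """Collapse identical sequences, joining their names with '+'."""
    merged = {}
    for name, seq in records:
        merged.setdefault(seq.upper(), []).append(name)
    return [("+".join(names), seq) for seq, names in merged.items()]

# Hypothetical records (not real accessions)
records = [
    ("clone1", "ACGT"),
    ("clone2", "ACGT"),
    ("clone3", "ACGA"),
    ("est1", "A---"),
]
print(merge_identical(drop_sparse(records)))
```

Merging identical sequences reduces the number of terminals without discarding any character information, which shortens tree searches.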
Phylogenetic analyses
Maximum parsimony analyses were implemented in
PAUP* v4.0b10 Altivec for Macintosh™ (Swofford 2002)
using the heuristic search algorithm, with equal character
weighting. Un-rooted analysis of the wide-scale dataset
was conducted. Character-state optimisation was set to
DELTRAN, and 100 random addition sequence replicates were
performed with TBR branch swapping, holding ten trees at
each step. The MulTrees option was in effect; gaps were
treated as missing data. Branch support was assessed using
the bootstrap (Felsenstein 1985); 1,000 pseudoreplicates of
the full-heuristic search were conducted, with a single
random addition sequence replicate, saving no more than
ten trees per replicate. Goodness-of-fit statistics were cal-
culated for most parsimonious trees (MPTs) in PAUP*
v4.0b10. Maximum parsimony analysis of the MAX4-like
exons and intron from the study group was carried out
under the same settings, with the exception that an out-
group was defined prior to analysis and that the heuristic
search was limited to saving 5,000 trees per replicate
(unconstrained searches exhausting the computer’s mem-
ory capacity, resulting in the premature termination of the
search).
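The resampling step that underlies the bootstrap (Felsenstein 1985) can be illustrated with a short sketch: each pseudoreplicate is built by drawing alignment columns with replacement until the original length is reached. Only the resampling is shown; the tree search on each pseudoreplicate, performed in PAUP* in this study, is not reproduced, and the toy alignment and seed are arbitrary.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement to the original length."""
    n_sites = len(alignment[0])
    sampled = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in sampled) for seq in alignment]

def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Generate n_replicates pseudoreplicate alignments (seeded for repeatability)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]

# Hypothetical toy alignment
toy = ["ACGT", "ACGA", "TCGA"]
replicates = bootstrap_replicates(toy, n_replicates=5, seed=1)
print(replicates[0])
```

Bootstrap support for a clade is then the percentage of pseudoreplicate trees in which that clade is recovered.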
For the phylogenetic analyses using Bayesian inference,
the best-fit model of evolution for each dataset was selected
using MrModelTest v2.0 (Nylander 2004). The wide-scale
dataset was evaluated with the third codon positions
excluded. For the exons + intron dataset, models were
evaluated for three different data partitions: exons + intron;
exons only; and intron only. Where the two tests
(hLRT and AIC) selected different nested models of evolution,
the difference in likelihood between the two models
was evaluated using a χ² approximation to assess whether
one model was a significantly better fit of the data than the
other. Where there was no significant difference in the
likelihood of nested models the more parameter rich model
was applied, as evidence suggests that underparameterisa-
tion may compromise accuracy of Bayesian analyses more
than overparameterisation (Erixon et al. 2003; Lemmon
and Moriarty 2004). To account for site specific among site
rate variation (SS ASRV) the likelihood of the neighbour
joining tree with Jukes Cantor distances created by
MrModelTest was assessed in PAUP* v4.0b10, under the
model of evolution selected by MrModelTest. Likelihood
values were calculated with and without correction for SS
ASRV. Analysis of both datasets by Bayesian inference
was carried out using MrBayes v3.0b4 (Huelsenbeck and
Ronquist 2001). In each search four chains were run, three
of which were heated using the default setting, starting
from a random tree and sampling every 100 generations. In
each case flat priors were implemented. Stationarity of the
chain was assessed by separately plotting log likelihood
and tree length against generation number; visual inspection
of the resulting graphs allowed the point at which the
parameters converged on a stable value to be identified and
the trees from generations before this point (the burn-in
period) to be discarded. To further confirm that chains had
reached convergence, the post burn-in trees were used to
construct majority rule consensus trees for independent
searches. Topologies, and the posterior probability for each
clade, were compared among consensus trees. For the
wide-scale dataset, successive searches of two, four and six
million generations were conducted. Having established
convergence of the parameters on a stable value within the
six million generation search by the methods outlined
above, a second six million generation search was imple-
mented and the post burn-in trees from both searches
pooled to allow a single majority rule consensus tree
showing all compatible groupings to be constructed in
PAUP* v4.0b10. For the analysis of the exons + intron dataset,
one million generations were sufficient for the chain to reach
convergence. Three independent one
million generation searches were carried out, and the sets
of post burn-in trees combined to allow the construction of
a single majority rule consensus tree.
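The model-comparison step described earlier in this paragraph, a χ² approximation for the difference in likelihood between two nested models, can be sketched as follows, together with the AIC for reference. The log-likelihood values and parameter counts are hypothetical, and the χ² upper-tail probability is computed from the standard recurrence for the incomplete gamma function so that only the Python standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))   # df = 1
    else:
        a, q = 1.0, math.exp(-y)              # df = 2
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(lnl_simple, lnl_complex, extra_params):
    """Return (test statistic, p value) for two nested models."""
    statistic = 2.0 * (lnl_complex - lnl_simple)
    return statistic, chi2_sf(statistic, extra_params)

def aic(lnl, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2.0 * n_params - 2.0 * lnl

# Hypothetical log-likelihoods for a simpler and a more parameter-rich model
statistic, p_value = likelihood_ratio_test(-1000.0, -995.0, extra_params=1)
print(statistic, p_value)
```

A non-significant p value here would correspond to the case in the text where the more parameter-rich model was applied regardless, on the grounds that underparameterisation is thought to be the greater risk.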
It has been speculated that hybridisation may have
played an important role in the evolution of Digitalis
(Brauchler et al. 2004). Therefore, in order to explore
potential patterns of reticulate evolution, phylogenetic
networks were constructed for the MAX4-like sequences.
Network analysis of the MAX4 exons + intron dataset
(both including and excluding the outgroup sequence from
P. sativum) was carried out in SplitsTree4 V4.8 (Huson
1998) using the NeighbourNet algorithm with the distance
measure set to uncorrected “p”. NeighbourNet was chosen
because it handles complex datasets better than alternative
network methods (Morrison 2005); for example,
SplitDecomposition produces uninformative
multifurcations when applied to the full MAX4
exons + intron dataset. SplitDecomposition analysis based on
uncorrected “p” distances was also carried out in
SplitsTree4 V4.8 (Huson 1998) for a sub-set of the MAX4
sequences, comprising those taxa from section Digitalis
(Brauchler et al. 2004). Support for the graph topology was
tested by 1,000 bootstrap pseudoreplicates. SplitDecom-
position was chosen as the method of analysis for the
reduced dataset as it has a good display of characters in
simple to moderate cases of complexity (Morrison 2005)
and only represents the strongest signals of incompatibility,
thus limiting visual complexity of the resulting graphs and
aiding biological interpretation (Winkworth et al. 2005).
Results
Isolation of MAX4 from Digitalis/Isoplexis
One hundred and seventy four MAX4-like sequences were
obtained from 24 of the study taxa. It was not possible to
amplify the target from D. minor or from the outgroup taxa
Veronica persica and V. serpyllifolia (additional species
within the Plantaginaceae also failed to amplify: Callitriche
cribrosa, C. hermaphroditica, Globularia cordifolia,
and Plantago lanceolata; L. Kelly and A. Culham,
unpublished data). Amplification was achieved from all
other study taxa (Table 1).
Two major MAX4-like sequence types were detected
within Digitalis/Isoplexis, and their sister group Erinus, on
the basis of distinctive patterns of substitutions and indels;
tentatively designated as A and B copies. Despite repeated
and targeted attempts, with amplification being carried out
under a range of conditions, it was not possible to isolate
both of these main sequence types from every taxon. Copy
A was not found in D. lamarckii and I. sceptrum, and copy
B was not found in D. ciliata, D. davisiana, both subspe-
cies of D. lutea, D. parviflora, the three subspecies of D.
purpurea, D. thapsi and D. viridiflora. This is most likely
to be a result of selection during the PCR (Wagner et al.
1994) as both copies appear to be under purifying selection
(L. Kelly and A. Culham, unpublished data), thus making
differential gene loss a less probable scenario.
Digitalis/Isoplexis MAX4-like sequences
The A copy sequences show greater variation in total
length, and intron size, than is seen in the B copies
(Table 3). Coding sequence length and intron position
were highly conserved among the sequences; the intron splice
site was also conserved in comparison with the corresponding
fourth intron in AtMAX4. In contrast with
AtMAX4, only a single intron was detected in the MAX4-like
sequences, the fifth intron with respect to the A. thaliana
gene (Fig. 1) being absent in all clones analysed. The
length of coding sequence is also conserved in the majority
of MAX4-like sequences compared with the corresponding
Table 3 Main features of the MAX4-like sequences from Digitalis/Isoplexis

                               Copy A                   Copy B
Full-length, bp                483–511                  479–495
Coding length, bp              397–409                  397–409
Intron length, bp              75–102                   71–86
Intron start position, bp      176                      175–176
Divergence within copy (a)     0–3.8 (0–3.6)            0.2–4.6 (0.2–4.3)
Divergence between copies (a)  12.1–15.6 (10.3–12.7)    12.1–15.6 (10.3–12.7)

(a) Percentage divergence based on GTR + γ distances,
followed by percentages based on uncorrected pairwise
distances in parentheses
segment in AtMAX4, except in the sequences from
D. purpurea subsp. mauretanica where all four clones
analysed had the same 12 bp deletion.
Measures of sequence divergence (Table 3) illustrate
that greater genetic distance exists between the A and B
copies than within each putative copy. Distances between
the putative A and B copy sequences from a single species
range between 12.1 and 15.6%. This is far greater than the
maximum distance (4.6%) detected within a set of
sequences for either the A or B copy from a single species.
This highlights the possibility that the A and B type
sequences may be derived from distinct loci. Nevertheless,
the level of variation within copies for certain taxa, as well
as the presence of distinct intron size groups, suggests that
the complexity of MAX4-like sequences within Digitalis/
Isoplexis and Erinus is not fully explained by the presence
of two paralogues. The lower levels of divergence within
copies for some taxa, for example 0.2% within the A copy
sequences of D. viridiflora, could be accounted for by Taq
polymerase incorporation errors during PCR, which
are predicted to equate to 0.1–0.4% sequence divergence
under the rate of Bracho et al. (1998) (although under an
alternative rate a 10-fold lower level of Taq polymerase
errors would be expected; Cline et al. 1996). However,
other taxa show variation within copies that is substantially
greater than 0.4%, which might be indicative of allelic
variation within heterozygous individuals, homoeologues
resulting from polyploidisation events in Digitalis/Iso-
plexis, or additional distinct MAX4-like loci within the
major sequence types.
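The within-copy and between-copy comparison discussed above can be sketched as a simple summary over a pairwise distance matrix. The clone names, copy assignments and distance values below are invented for illustration; only the logic of separating within-copy from between-copy pairs is intended to reflect the text.

```python
from itertools import combinations

def summarise_divergence(distances, copy_of):
    """Return ({copy: (min, max)} for within-copy pairs, (min, max) between copies).

    distances: dict mapping frozenset({name_a, name_b}) to a pairwise distance
    copy_of:   dict mapping each sequence name to its putative copy label
    """
    within = {}
    between = []
    for a, b in combinations(sorted(copy_of), 2):
        d = distances[frozenset((a, b))]
        if copy_of[a] == copy_of[b]:
            within.setdefault(copy_of[a], []).append(d)
        else:
            between.append(d)
    within_ranges = {copy: (min(v), max(v)) for copy, v in within.items()}
    return within_ranges, (min(between), max(between))

# Hypothetical clones and distances (proportions, not percentages)
copy_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
distances = {
    frozenset(("a1", "a2")): 0.002,
    frozenset(("b1", "b2")): 0.046,
    frozenset(("a1", "b1")): 0.121,
    frozenset(("a1", "b2")): 0.130,
    frozenset(("a2", "b1")): 0.156,
    frozenset(("a2", "b2")): 0.140,
}
within_ranges, between_range = summarise_divergence(distances, copy_of)
print(within_ranges, between_range)
```

A clear gap between the largest within-copy distance and the smallest between-copy distance is the pattern that would be expected if the two sequence types derive from distinct loci.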
Comparison of predicted protein sequences between the
MAX4-like putative copies and the corresponding sequence
from AtMAX4 indicates that some of the variation seen at
the nucleotide level between the major groups of MAX4-
like sequences has translated to divergence within amino
acids (Table 4). Whilst the majority of these changes have
been between amino acids of the same functional group,
three cases show some divergence in functional group.
Divergence between AtMAX4 and MAX4-like predicted
protein sequences from the study group was 27.4% for both
A and B copies.
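The percentages in Table 4 follow from the residue counts and the total of 135 residues given in the table footnote, as the minimal sketch below shows. The function name is illustrative. Recomputing from the counts gives 5.9% for 8 of 135 residues, so the value of 5.8 printed for the B copy semi-conserved class may reflect a rounding or typesetting difference.

```python
TOTAL_RESIDUES = 135  # total number of residues compared (Table 4 footnote)

def percent_divergent(count, total=TOTAL_RESIDUES):
    """Percentage of residues that differ, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 1)

# Residue counts reported in Table 4 for comparisons with AtMAX4
for label, count in [("conserved", 21), ("semi-conserved", 9),
                     ("non-conserved", 7), ("total", 37)]:
    print(label, percent_divergent(count))
```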
Phylogenetic analyses
Tests for saturation (Hirt et al. 1999; Cox et al. 2004) on
the wide-scale dataset indicated that multiple hits had
occurred at the third codon positions, therefore the corre-
sponding nucleotides were excluded from all analyses of
this dataset. The parsimony analysis revealed that the
sequences obtained from Digitalis/Isoplexis form a single
well-supported monophyletic group. Sequences from out-
side of the study group were used to root the trees (Fig. 2).
Whilst the Digitalis/Isoplexis sequences form a well-sup-
ported monophyletic group (91% bootstrap support, BS),
resolution and support within this group are poor (BS of
less than 70%). There is also insufficient support and res-
olution to determine whether the Digitalis/Isoplexis
sequences are paralogous or orthologous to AtMAX4, or to
any of the other MAX4-like sequences from taxa outside of
the study group. In Bayesian analyses the GTR + Γ model
was used. Results of the parsimony and Bayesian analyses
of the wide-scale dataset are largely congruent (Fig. 2).
The wide-scale analysis revealed that the sequences
from Erinus are not sister to the Digitalis/Isoplexis
sequences, rather they nest within those from the ingroup
species. At the same time it was demonstrated that the
study group sequences form a well-supported monophyletic
group, thus allowing any of the non-study group
sequences to act as an outgroup, and so root the tree, during
analysis of the Digitalis/Isoplexis exons and intron.
To reduce the impact of saturation on the analysis
of the exons and intron, a single outgroup sequence was
included, P. sativum (AY557341), selected on
the basis of sequence distance to the Digitalis/Isoplexis
sequences under the GTR + Γ model. Analyses were
carried out with and without the outgroup, revealing that
the inclusion of the outgroup sequence does not confound
the recovery of relationships within Digitalis/Isoplexis. We
therefore included this sequence in the final analysis of the
exons + intron dataset.
Maximum parsimony analysis of the exons + intron
dataset resulted in a total of 430,924 MPTs, each of 399
steps (Fig. 3). Despite the inclusion of the third codon
Table 4 Comparison of the predicted amino acid consensus from A and B copy MAX4-like sequences with AtMAX4

Divergent residues (a) (classified by functional group)

                Conserved             Semi-conserved        Non-conserved         Total
                1          2          1          2          1          2          1          2
AtMAX4 segment  –          –          –          –          –          –          –          –
A copy          15.6 (21)  –          6.7 (9)    –          5.2 (7)    –          27.4 (37)  –
B copy          15.6 (21)  5.2 (7)    5.8 (8)    0.7 (1)    5.9 (8)    1.5 (2)    27.4 (37)  7.4 (10)

(a) Percentages of the total 135 residues, followed by whole numbers in parentheses
Plant Syst Evol (2008) 273:133–149 139
123
positions and intron sequence in the exons + intron dataset
branch lengths for many of the terminals are very short,
emphasising the paucity of available characters (Table 5).
The two major groups of MAX4-like sequences, identified
during alignment, form strongly supported (95% BS) A and
B copy clades. Although resolution for the terminal groups
is generally poor, a few groups with >80% bootstrap
support can be detected within the two major clades.
Within the A copy clade, the sequences from the three
subspecies of D. purpurea and D. thapsi form a mono-
phyletic group (86% BS). However, sequence seven from
D. purpurea subsp. purpurea is sister to other sequences
within this clade. This nested clade also has good bootstrap
support (91%), suggesting the existence of alleles or
additional MAX4-like homoeologues or paralogues within
D. purpurea subsp. purpurea. The nesting of sequences
from Erinus alpinus within those from Digitalis/Isoplexis
provides further indication that some of the A copy
sequences are non-orthologous. Within the B copy clade
the sequences from D. grandiflora are sister to the rest of
the B copies, with sequences from E. alpinus nested within
the main B clade. This emphasises the incongruence
between the MAX4-like phylogenetic tree and the existing
taxonomy. Sequences from several of the taxa are spread
among the different well-supported groups in the B copy
clade, indicating that multiple alleles, homoeologues or
paralogues may be represented.
In Bayesian analyses the GTR + SS model was applied
to the coding sequences and the HKY + Γ model to the
intron. Comparison of the topologies obtained under
maximum parsimony and Bayesian inference reveals a
high degree of congruence between the results. All clades
with 70% bootstrap support or above are found to have at
least a 95% posterior probability in the Bayesian tree (not
shown). There are, however, two clades within the Bayesian
consensus tree with a posterior probability of 95% or more
that have less than 50% bootstrap support in the parsimony
tree, marked with asterisks in Fig. 3.
NeighbourNet analysis of the exons + intron dataset
revealed a high degree of character conflict, represented by
the multiple reticulations within the graph (Fig. 4a). This
emphasises the complexity in the patterns of shared char-
acter states among the sequences, which may indicate that
evolution of the sequences is more complex than can be
represented on a bifurcating tree. The SplitDecomposition
network (Fig. 4b) for section Digitalis also shows a num-
ber of reticulations between the sequences.
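The splits graphs in Fig. 4 are based on uncorrected "p" distances. As a minimal sketch of that quantity (not the implementation used by the network software), the Python below computes the proportion of differing sites between aligned sequences. The function names, the toy alignment and the decision to skip sites where either sequence has a gap or ambiguous base are illustrative assumptions.

```python
from itertools import combinations

# Assumption: sites with a gap or missing/ambiguous symbol in either
# sequence are skipped (pairwise deletion); other software may differ.
MISSING = set("-?N")


def p_distance(seq_a, seq_b):
    """Return the proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def p_distance_matrix(alignment):
    """Map each unordered pair of sequence names to its p distance."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment, not data from this study.
toy = {
    "seq1": "ACGTACGT",
    "seq2": "ACGTACGA",
    "seq3": "AC-TTCGA",
}

if __name__ == "__main__":
    for pair, dist in p_distance_matrix(toy).items():
        print(pair, round(dist, 3))
```

Because no substitution model is applied, such distances underestimate divergence when multiple substitutions have occurred at a site, which matters less for closely related sequences.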
Comparison with ITS and trnL-F
Fig. 2 Phylogenetic trees from the analysis of the wide-scale dataset,
illustrating the monophyly of the study group sequences. Left: a single
MPT from the set of 1,706 retained. Length = 334; CI = 0.692;
RI = 0.823; RC = 0.575. Numbers above branches are bootstrap
support values, those of below 50% are not shown. Right: Bayesian
inference consensus tree, showing all compatible groupings. Numbers
above branches are posterior probabilities, those of <1 are not shown.
An arrow indicates the group with >95% posterior probability but
<50% bootstrap support in the parsimony analysis
Fig. 3 A single MPT from the set of 430,924 found during
parsimony analysis of the exons + intron dataset. Length = 399;
CI = 0.769; RI = 0.967; RC = 0.744. Numbers above branches are
bootstrap support values; asterisks denote clades with <50% bootstrap
support but with >95% posterior probability in the Bayesian
analysis. Arrows indicate branches that collapse in the strict
consensus tree. a "A copy" clade and outgroup, parallel bars
indicate points at which branch lengths have been truncated to aid
visual representation; b "B copy" clade. Taxa that appear in more
than one part of the two major clades, or that are found within
combined sequences also containing other species, are coded in the
bar on the right hand side (see key in top left hand corner of A).
Subspecies of a single species are given the same code, as are the four
Isoplexis species. Numbers in parentheses indicate the number of
clones used to construct combined sequences
Comparison of the number of phylogenetically informative
characters between the MAX4-like sequences and other
molecular character sources, such as ITS and trnL-F, is
made difficult by unresolved issues concerning paralogy
and orthology within the MAX4 dataset. Proportions of
parsimony informative characters within the MAX4 A and B
copy sequences from Digitalis/Isoplexis (11 and 10.7%
respectively) are substantially higher than seen in trnL-F
(1.3%), but lower than in ITS (15%), based on a largely
similar set of ingroup taxa from the study of Brauchler
et al. (2004). However, the results obtained from the
analyses of the MAX4-like exons and intron give a strong
indication that further alleles, homoeologues or paralogues
are present within both the A and B copy datasets.
Consequently, any set of orthologous sequences would have a
lower proportion of phylogenetically informative characters
than is indicated by the figures above.
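The proportions compared above rest on sorting alignment columns into constant, variable but parsimony-uninformative, and parsimony-informative sites, the classes tabulated in Table 5. The following is a minimal sketch, assuming the standard definition of a parsimony-informative site (at least two states, each present in at least two sequences) and assuming that the "autapomorphic" class corresponds to all other variable sites. The function names and the toy alignment are hypothetical.

```python
from collections import Counter

MISSING = set("-?N")  # assumption: gaps and ambiguous bases are ignored


def classify_site(column):
    """Classify one alignment column as constant, autapomorphic or informative."""
    counts = Counter(c for c in column.upper() if c not in MISSING)
    if len(counts) <= 1:
        return "constant"
    # Informative: at least two states that each occur in two or more sequences.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "autapomorphic"


def summarise(sequences):
    """Count site classes and express each as a percentage of all sites."""
    columns = ["".join(col) for col in zip(*sequences)]
    counts = Counter(classify_site(col) for col in columns)
    total = len(columns)
    return {
        name: (counts[name], round(100.0 * counts[name] / total, 1))
        for name in ("constant", "autapomorphic", "informative")
    }


# Hypothetical four-sequence alignment, not data from this study.
toy = ["AAGA", "AAGC", "ATCA", "ATCG"]

if __name__ == "__main__":
    print(summarise(toy))
```

If undetected paralogues are pooled in one alignment, differences between copies are counted as informative sites, which is why the proportion for a truly orthologous set would be expected to be lower.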
Comparison of the MAX4-like strict consensus tree
obtained under maximum parsimony (not shown) with the
ITS-trnL-F tree for Digitalis/Isoplexis (Brauchler et al.
2004) reveals substantial incongruence. For example, evidence
from ITS-trnL-F indicates that the four species of
Isoplexis form a strongly supported monophyletic group,
with 100% bootstrap support. Furthermore, this group is
also recovered in the separate analyses of ITS and trnL-F
data carried out by Brauchler et al. (2004), although not
highly supported in the trnL-F tree. However, within the
MAX4-like phylogenetic tree sequences from Isoplexis are
not monophyletic. Overall, only three groups from the
MAX4-like strict consensus parsimony tree, that include
sequences from more than a single taxon, were found to
contain the same species as in the phylogenetic tree pro-
duced with ITS-trnL-F. However, the results of the
phylogenetic analysis of the ITS-trnL-F dataset also
revealed incongruence with the existing species classification,
with three species (D. ferruginea, D. lutea, and
D. purpurea) lacking monophyly.
The ITS-trnL-F phylogeny of Digitalis/Isoplexis
(Brauchler et al. 2004) failed to fully resolve the relation-
ships between species. Taxa from the ingroup were
represented by 32 terminals, only 7 of which (22%) were
fully resolved. However, an even lower level of resolution
was observed within the MAX4-like tree. Of the 113 ter-
minals representing the study group only four (3.5%) were
completely resolved in the parsimony strict consensus tree.
Discussion
Evolution of MAX4-like sequences in Digitalis/
Isoplexis
It is evident from this study that MAX4-like gene evolution
within Digitalis/Isoplexis has involved a complex series of
events. Given the occurrence of polyploidy within this
group we could expect to detect multiple homoeologues,
and thus a higher copy number than is the case in func-
tionally diploid species such as Arabidopsis thaliana
(where MAX4 is apparently single copy). However, the
polyploid nature of the majority of the study group species
is not sufficient to account for the full range of sequence
variants detected, as the diploid sister taxon Erinus alpinus
is also characterised by the presence of multiple sequence
types.
Within the MAX4-like dataset two main putative copies,
A and B, can be detected on the basis of distinctive patterns
of substitutions and indels. The designation of these two
main paralogues is also supported by the fact that the
sequences corresponding to these copies form separate and
well-supported monophyletic groups under both maximum
parsimony and Bayesian inference. The A and B copies are
both present within Erinus alpinus, indicating that MAX4
apparently underwent at least one duplication event prior to
the divergence of Digitalis/Isoplexis and Erinus. However,
the complexity of sequence variants within the MAX4-like
dataset is not limited to these two main putatively paralo-
gous copies. Examination of both the A and B clades
reveals major incongruence with the existing species
classification and previous phylogenetic reconstructions
(Carvalho 1999; Brauchler et al. 2004). The exact number
of sequence classes is not easily identifiable, but given the
fact that sequences from E. alpinus nest within those from
Digitalis/Isoplexis in both the A and B clades it is apparent
that for this diploid taxon at least two distinct groups must
be present within the A copy clade, and another two within
the B copy clade; with the expectation of the polyploid
Digitalis/Isoplexis species containing additional sequence
classes.
Table 5 Distribution of characters within the datasets
Data partition | Total length, bp | Constant characters | Autapomorphic characters | Parsimony informative characters
Wide-scale dataset
Codon position 1 | 137 | 32.8 (45) | 28.5 (39) | 38.7 (53)
Codon position 2 | 138 | 47.1 (65) | 23.9 (33) | 29.0 (40)
Codon position 3 | 137 | 2.9 (4) | 4.4 (6) | 92.7 (127)
Total | 412 | 27.7 (114) | 18.9 (78) | 53.4 (220)
Full-length dataset^a
Codon position 1 | 136 | 61.0 (83) | 26.5 (36) | 12.5 (17)
Codon position 2 | 136 | 69.1 (94) | 20.6 (28) | 10.3 (14)
Codon position 3 | 137 | 27.0 (37) | 33.6 (46) | 39.4 (54)
Intron | 125 | 56.8 (71) | 8.8 (11) | 34.4 (43)
Total | 534 | 53.4 (285) | 22.7 (121) | 24.1 (128)
Percentages expressed as a proportion of the total number of characters for each data partition, followed by whole numbers in parentheses
^a Outgroup sequence from P. sativum is excluded from the comparison of character distribution within the full-length dataset
Recent evidence of the duplication of floral regulatory
genes within the Lamiales has led to the suggestion that an
ancient whole genome duplication event has occurred in
this order, although it is likely to have taken place after the
divergence of the Plantaginaceae and the lineage leading to
many of the other families (Aagaard et al. 2005). It has also
been concluded that despite the presence of paralogues for
multiple genes among many families of the Lamiales, there
is little evidence within extant chromosome numbers to
support a whole genome duplication (Aagaard et al. 2005).
The extant chromosome number of Erinus alpinus also
gives little expectation for the presence of multiple copies
of MAX4, but the results presented here raise the possibility
that a more recent genome duplication event may have
occurred within the Plantaginaceae.
Fig. 4 Splits graphs of the MAX4-like sequences.
a NeighbourNet graph based on uncorrected "p" distances of
118 MAX4-like sequences (exons + intron) from Digitalis/Isoplexis,
goodness of fit = 97.75%. Uppercase letters denote groups formed
by the A and B type sequences. An arrow indicates the placement of
the outgroup sequence (P. sativum AY557341) when included in
the analysis. b SplitDecomposition network based on uncorrected
"p" distances for the 12 MAX4-like sequences (all A copy) from
Digitalis sect. Digitalis, goodness of fit = 90.99%. Values above
the branches indicate bootstrap support. Key to species
abbreviations, Dph, Digitalis purpurea subsp. heywoodii; Dpm,
D. purpurea subsp. mauretanica; Dpp, D. purpurea subsp. purpurea;
Dt, D. thapsi
It is likely that the MAX4-like dataset obtained during
this investigation represents a mixture of different
sequence types, including paralogues, homoeologues and
alleles that, whilst arising from different processes, are
not easily distinguishable on the basis of sequence data
alone (Sang 2002; Small et al. 2004). Thus, several
processes, such as gene duplication, polyploidisation and
lineage sorting may have contributed to the complex
pattern of sequence relationships revealed from the
MAX4-like tree and resulted in the incongruence with the
pattern of species relationships inferred from the results
of the phylogenetic study of Brauchler et al. (2004). It
has been speculated previously that hybridisation may
have played an important part in the evolution of Digitalis
(Brauchler et al. 2004). Phylogenetic trees cannot
directly depict hybrids (Vriesendorp and Bakker 2005),
but situations involving reticulate evolution can be
visualised on phylogenetic networks (Winkworth et al.
2005). The results of the network analyses of the MAX4
sequences from Digitalis/Isoplexis reveal numerous
character incompatibilities, which are displayed as
reticulations on the graphs (see Fig. 4). There are three
possible interpretations of the reticulations displayed in
these graphs (Morrison 2005). They may represent
homoplasy in the dataset (such as parallelisms or
reversals), result from uncertainty or ambiguity (such as
through the comparison of non-orthologous sequences) or
they represent events involving gene exchange between
unrelated organisms (such as hybridisation or lateral gene
transfer) (Morrison 2005). Sequence divergence is rela-
tively low within the MAX4 sequences (see ‘‘Results’’),
and NeighbourNet graphs constructed with the intron and
nucleotides corresponding to the third codon position
excluded still show multiple reticulations (not shown).
Thus, it is unlikely that all of the conflict can be
accounted for by homoplasy. Population-level processes
can also mimic the patterns expected from species-level
reticulations (Linder and Rieseberg 2004), and in the
case of the graph for the full dataset it seems likely that
at least some of the reticulations result from these con-
founding processes. However, in the case of the
reticulations shown in the SplitDecomposition graph for
section Digitalis it is more plausible that some of them
may represent true reticulations at the species level, as
D. purpurea subsp. heywoodii and D. thapsi form areas
of hybridisation within their native range (Brauchler
et al. 2004).
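The character incompatibilities referred to above can be made concrete with the pairwise compatibility test for two-state characters: two such characters can fit on the same tree without homoplasy only if at most three of the four possible state combinations occur across the taxa. The sketch below is a generic illustration of that test, not the algorithm used to build the NeighbourNet or SplitDecomposition graphs; the function names and the 0/1 toy characters are hypothetical.

```python
from itertools import combinations


def compatible(char_x, char_y):
    """True if two binary characters show at most three of the four state pairs."""
    observed = set(zip(char_x, char_y))
    return len(observed) < 4


def incompatible_pairs(characters):
    """List index pairs of characters that cannot both fit one tree without homoplasy."""
    return [
        (i, j)
        for (i, x), (j, y) in combinations(enumerate(characters), 2)
        if not compatible(x, y)
    ]


# Hypothetical binary characters scored for the same five taxa
# (one string per character, one position per taxon).
toy_characters = [
    "00111",
    "00011",
    "01101",
]

if __name__ == "__main__":
    print(incompatible_pairs(toy_characters))
```

Each incompatible pair is a conflict that a single bifurcating tree must explain by homoplasy, whereas a splits graph can display both groupings at once as a reticulation.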
Potential for use of MAX4 in phylogenetic inference
The complexity of MAX4-like gene evolution within Digi-
talis/Isoplexis, as revealed by this study, has significant
implications for the potential use of these genes as sources
of phylogenetically informative data. Comparison of the
number of phylogenetically informative characters
between the MAX4-like sequences and conventional
molecular character sources, such as ITS and trnL-F, is
complicated by the unresolved issues concerning copy
number. However, it can be seen from the trees generated
by the analysis of the MAX4-like exons and intron (Fig. 3)
that large areas of the topologies lack resolution or, where
clades are resolved, support for this resolution is poor. In
some cases the lack of resolution is within different clones
from a particular species; however, in other clades
sequences from separate genera are unresolved. This is an
obvious indication that the level of variation within a single
class of sequences is not sufficient to provide full resolu-
tion within the study group.
In addition to low levels of sequence variation, the
presence of multiple sequence variants within the dataset
severely limits the use of the MAX4-like genes for recov-
ering species relationships within Digitalis/Isoplexis. At
present the inability to reliably identify and distinguish the
various classes of sequences within all of the species
studied means that the accuracy of phylogenetic estimation
is compromised, as it cannot be confirmed that orthologues
are being compared (Doyle 1992; Martin and Burg 2002).
Previous studies have successfully incorporated data from
paralogous genes into phylogenetic analyses; these include
the use of PHY (Phytochrome) genes in the Poaceae
(Mathews and Sharrock 1996), ADH (Alcohol Dehydro-
genase) genes in Paeonia (Sang et al. 1997) and LEGCYC
(LEG CYCLOIDEA) genes in Lupinus (Ree et al. 2004).
However, in the case of the region examined in this study
the use of paralogues to provide multiple datasets is not an
option, as it is currently not possible to confirm the exact
number of sequence variants within the MAX4-like dataset,
and consequently orthologous sequences cannot be deter-
mined and delineated.
Whilst the MAX4-like region is not appropriate for
species level delimitation within Digitalis/Isoplexis, the
features that make the development of low-copy nuclear
genes so challenging, such as the variability in copy
number and levels of divergence, may also suggest that
this region could be of use in other groups of plants.
Vicilin and Chalcone Synthase are examples of genes
that have multiple copies in some lineages but have been
used successfully for phylogenetic inference in plant
groups where they are single copy (Whitlock and Baum
1999; Koch et al. 2001). However, any future use of
MAX4 in studies of species relationships would require
careful characterisation of the gene’s evolutionary
dynamics for the plant group in question (for example,
through the use of Southern hybridisation to establish
copy number), to allow the confident identification of
orthologous sequences. In addition to applications to the
study of species relationships, genes such as MAX4 may
be of relevance to studies of evolutionarily important
processes such as gene duplication events. As gene
duplications are the ultimate source of evolutionary
novelty (Charlesworth et al. 2001) there has been interest
in studying the adaptive significance of these events
(Lawton-Rauh 2003). Thus, whilst not all low-copy
nuclear gene regions may be useful in delimiting species
relationships, they may have the potential to contribute
to our understanding of processes that help shape the
evolution of plants. One example of this may be through
the use of networks to explore conflicting patterns of
characters within the data, as the complex evolutionary
processes that characterise the speciation of plants are
not likely to be well represented by bifurcating trees
(Winkworth et al. 2005). Networks allow the represen-
tation of more of the phylogenetic information within a
dataset (Posada and Crandall 2001) and are important
tools for studying complex patterns in molecular
sequence data (Winkworth et al. 2005). The studies of
Joly and Bruneau (2006) and Brysting et al. (2007) are
two examples that combine the use of low-copy nuclear
gene sequences and network algorithms to address
complex cases of species-level evolution in plants, and
illustrate the potential utility of low-copy nuclear gene
data for purposes other than the reconstruction of
bifurcating trees.
Conclusions
The current study is the first to assess the utility of MAX4 for
use in phylogenetic inference, and the first report of the use
of low-copy nuclear gene data in phylogenetic analyses of
Digitalis/Isoplexis. Although the approach adopted was
successful in isolating the target region from species in
which the MAX4 gene was previously unknown, the data
were not ultimately appropriate for use in resolving phy-
logenetic relationships between the study species.
Alternative approaches are now available that may have an
increased likelihood of successful loci development, and
avoid the need for heavy investment in a single candidate
gene (see reviews by Schluter et al. 2005; Hughes et al.
2006). It has been suggested that the development of low-
copy nuclear gene regions is like a lottery, with occasional
successes against a backdrop of generally disappointing
results that are rarely reported in the literature (Hughes et al.
2006). The results of our development of MAX4 can be seen
as conforming to this under-reported trend. In light of the
relatively slow accumulation of variable species-level loci,
the reporting of such results will allow the most efficient use
of future research efforts. In conclusion, we recommend
that any future use of MAX4 as a source of characters for
phylogenetic investigations in flowering plants should be
limited to groups where this gene exhibits less complex
evolutionary dynamics, and where orthologous sequences
can be confidently identified and isolated.
Acknowledgments
We thank Victor Albert, Jim Dunwell, Gary Rosenberg,
James Tosh, Ben Warren and an anonymous reviewer for
their helpful comments on earlier drafts of this manuscript,
Ovidiu Paun for help with the network analyses and Mike
Wilkinson for useful discussion. We also thank Jose
Carvalho for providing Isoplexis chalcantha and I. isabel-
liana DNA extractions and for use of his Digitalis/Isoplexis
ITS/trnL-F dataset, and the Chelsea Physic Garden for
providing plant material of I. canariensis and I. sceptrum.
This work was supported financially by the Natural Envi-
ronment Research Council.
Appendix
Table 6 Sequences included in the MAX4-like datasets, with GenBank accession numbers
Sequences | GenBank number
Wide-scale dataset: ingroup sequences
Digitalis ciliata 3–4 | AJ870348-49
D. davisiana 1–2 | AJ870350-51
D. ferruginea subsp. ferruginea 3, 6–10, 12 | AJ870354, AJ870357-61, AJ870363
D. ferruginea subsp. schischkinii 2–3 | AJ870366-67
D. grandiflora 3 | AJ870370
D. lanata subsp. lanata 1 | AJ870378
D. lutea subsp. australis 4, 10, 14 | AJ870383, AJ870790, AJ870794
D. nervosa 2, 4, 6–7, 9 | AJ870803, AJ870805, AJ870807-08, AJ870810
D. obscura subsp. laciniata 5–6 | AJ870815-16
D. obscura subsp. obscura 3–6, 10–13, 18, 22–24, 26 | AJ870819-22, AJ870826-29, AJ870834, AJ870838-40, AJ870842
D. parviflora 3 | AJ870845
D. purpurea subsp. heywoodii 1 | AJ870850
D. purpurea subsp. mauretanica 3 | AJ870853
D. purpurea subsp. purpurea 1, 7 | AJ870855, AJ870861
D. thapsi 1–3, 5 | AJ870862-64, AJ870866
D. trojana 5, 7–8 | AJ870871, AJ870873-74
Isoplexis canariensis 1, 4, 7 | AJ870878, AJ870881, AJ870884
I. chalcantha 2 | AJ870886
I. sceptrum 5, 7–9, 13, 16 | AJ870899, AJ870901-03, AJ870907, AJ870910
Erinus alpinus 1, 4, 7 | AJ870914, AJ870917, AJ870920
D. grandiflora A Combined | AJ870368-69
D. grandiflora B Combined | AJ870371-73
D. nervosa B Combined | AJ870802, AJ870804, AJ870806, AJ870809
D. obscura subsp. obscura B Combined | AJ870817-18, AJ870823, AJ870825, AJ870830-33, AJ870835-37, AJ870841
D. purpurea subsp. mauretanica A Combined | AJ870851-52, AJ870858
D. purpurea subsp. purpurea A Combined 2 | AJ870857-58
D. viridiflora A Combined | AJ870875-77
Combined A1 (D. lutea subsp. australis & D. obscura subsp. laciniata) | AJ870796, AJ870814
Combined A3 (D. ciliata, D. lutea both subspecies, D. obscura both subspecies, D. parviflora, D. trojana, & Erinus) | AJ870346-47, AJ870380-82, AJ870384-85, AJ870787-89, AJ870791-93, AJ870795, AJ870797, AJ870798-01, AJ870811, AJ870824, AJ870844, AJ870847, AJ870849, AJ870867, AJ870918
Combined A4 (D. purpurea subsp. purpurea & D. thapsi) | AJ870856, AJ870859-60, AJ870865
Combined A5 (D. ferruginea subsp. ferruginea, I. canariensis, I. chalcantha & I. isabelliana) | AJ870352, AJ870880, AJ870883, AJ870885, AJ870887, AJ870891
Combined B1 (D. lamarckii, D. obscura subsp. laciniata, D. trojana & I. sceptrum) | AJ870374-77, AJ870812, AJ870868, AJ870904
Combined B2 (D. ferruginea subsp. ferruginea, D. lanata subsp. lanata, I. sceptrum & Erinus) | AJ870355, AJ870362, AJ870364, AJ870379, AJ870895, AJ870897-98, AJ870900, AJ870904-05, AJ870908-09, AJ870911-13, AJ870916
Combined B4 (D. trojana, I. canariensis, I. isabelliana, I. sceptrum, Erinus) | AJ870870, AJ870879, AJ870882, AJ870889, AJ870893-94, AJ870896, AJ870915, AJ870919
Combined B5 (D. ferruginea subsp. ferruginea, D. obscura subsp. laciniata, D. trojana, I. chalcantha & I. isabelliana) | AJ870353, AJ870356, AJ870813, AJ870869, AJ870872, AJ870888, AJ870892
Wide-scale dataset: outgroup sequences
Arabidopsis thaliana | AL161582
Medicago truncatula | AJ499477
Oryza sativa | AP003376
Physcomitrella patens subsp. patens | BJ167696
Picea glauca | CO482424
Populus trichocarpa | POPS114408a
Saccharum officinarum | CA100587
Sorghum bicolor | CD462656
Triticum aestivum | BQ788859
Zea mays | CO526902
Oryza sativa Combined | AP003141, AK058473
Pisum sativum Combined | AY557342, AY55734
Populus trichocarpa Combined | TREE470814a, TREE818199a
Exons + intron dataset: ingroup sequences
D. ciliata 3–4 | AJ870348-49
D. davisiana 1–2 | AJ870350-51
D. ferruginea subsp. ferruginea 1–3, 5–10, 12 | AJ870352-54, AJ870356-61, AJ870363
D. ferruginea subsp. schischkinii 1–3 | AJ870365-67
D. grandiflora 3, 6 | AJ870370, AJ870373
D. lamarckii 3 | AJ870376
D. lanata subsp. lanata 1–2 | AJ870378-79
D. lutea subsp. australis 1, 3–4, 7, 10, 14–15, 17 | AJ870380, AJ870382-83, AJ870787, AJ870790, AJ870794-95, AJ870797
D. nervosa 2–4, 6–9 | AJ870803-05, AJ870807-10
D. obscura subsp. laciniata 1–2, 5–6 | AJ870811-12, AJ870815-16
D. obscura subsp. obscura 1, 3–6, 9–13, 18, 20–26 | AJ870817, AJ870819-22, AJ870825-29, AJ870834, AJ870836-42
D. parviflora 1, 3–4, 6 | AJ870843, AJ870845-46, AJ870848
D. purpurea subsp. heywoodii 1 | AJ870850
D. purpurea subsp. mauretanica 3 | AJ870853
D. purpurea subsp. purpurea 1, 7 | AJ870855, AJ870861
D. thapsi 1–5 | AJ870862-66
D. trojana 2, 5, 7–8 | AJ870868, AJ870871, AJ870873-74
D. viridiflora 1 | AJ870875
I. canariensis 1, 3–4, 6–7 | AJ870878, AJ870880-81, AJ870883-84
I. chalcantha 2, 4 | AJ870886, AJ870888
I. isabelliana 1, 3 | AJ870889, AJ870891
I. sceptrum 3, 5, 7–9, 11–13, 16 | AJ870897, AJ870899, AJ870901-03, AJ870905-07, AJ870910
Erinus alpinus 1, 2, 4–5, 7 | AJ870914-15, AJ870917-18, AJ870920
D. grandiflora A Combined | AJ870368-69
D. grandiflora B Combined | AJ870371-72
D. nervosa B Combined | AJ870802, AJ870806
D. obscura subsp. obscura B Combined | AJ870818, AJ870823, AJ870830-33, AJ870835
D. parviflora A Combined | AJ870844, AJ870849
D. purpurea subsp. mauretanica A Combined | AJ870851-52, AJ870854
D. purpurea subsp. purpurea A Combined | AJ870856, AJ870859-60
D. purpurea subsp. purpurea A Combined 2 | AJ870857-58
D. viridiflora A Combined | AJ870876-77
I. chalcantha A Combined | AJ870885, AJ870887
Combined A1 (D. lutea subsp. australis & D. obscura subsp. laciniata) | AJ870796, AJ870814
Combined A2 (D. obscura subsp. obscura & D. trojana) | AJ870824, AJ870867
Combined A3 (D. ciliata, D. lutea both subspecies & D. parviflora) | AJ870346-47, AJ870381, AJ870384-85, AJ870788-89, AJ870791-93, AJ870798-01, AJ870847
Combined B1 (D. lamarckii and I. sceptrum) | AJ870374-75, AJ870377, AJ870904
Combined B2 (D. ferruginea subsp. ferruginea & I. sceptrum) | AJ870355, AJ870900, AJ870908-09, AJ870911, AJ870913
References
Aagaard JE, Olmstead RG, Willis JH, Phillips PC (2005) Duplication
of floral regulatory genes in the Lamiales. Amer J Bot
92(8):1284–1293
Albach DC, Martinez-Ortega M, Fischer MA, Chase MW (2004)
Evolution of Veronicaceae: a phylogenetic perspective. Ann
Missouri Bot Gard 91:275–302
Altschul SF, Madden TL, Schaffer AA, Zhang JH, Zhang Z, Miller
W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new
generation of protein database search programs. Nucl Acids Res
25:3389–3402
Alvarez I, Wendel JF (2003) Ribosomal ITS sequences and plant
phylogenetic inference. Molec Phylogenet Evol 29:417–434
Angiosperm Phylogeny Group II (2003) An update of the angiosperm
phylogeny group classification for the orders and families of
flowering plants: APG II. Bot J Linn Soc 141:399–436
Archambault A, Bruneau A (2004) Phylogenetic utility of the LEAFY/FLORICAULA gene in the Caesalpinioideae (Leguminosae):
gene duplication and a novel insertion. Syst Bot 29:609–626
Bailey CD, Price RA, Doyle JJ (2002) Systematics of the halimol-
obine Brassicaceae: evidence from three loci and morphology.
Syst Bot 27:318–332
Bailey CD, Hughes CE, Harris SA (2004) Using RAPDs to identify
DNA sequence loci for species level phylogeny reconstruction:
an example from Leucaena (Fabaceae). Syst Bot 29:4–14
Barber JC, Francisco-Ortega J, Santos-Guerra A, Turner KG, Jansen
RK (2002) Origin of Macaronesian Sideritis L. (Lamioideae:
Lamiaceae) inferred from nuclear and chloroplast sequence
datasets. Molec Phylogenet Evol 23:293–306
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL
(2007) GenBank. Nucleic Acids Res 35:D21–D25
Bohle UR, Hilger HH, Martin WF (1996) Island colonization and
evolution of the insular woody habit in Echium L (Boragina-
ceae). Proc Natl Acad Sci USA 93:11740–11745
Bracho MA, Moya A, Barrio E (1998) Contribution of Taq polymerase-induced errors to the estimation of RNA virus
diversity. J Gen Virol 79:2921–2928
Bradley RD, Hillis DM (1997) Recombinant DNA sequences
generated by PCR amplification. Molec Biol Evol 14:592–593
Brauchler C, Meimberg H, Heubl G (2004) Molecular phylogeny of
the genera Digitalis L. and Isoplexis (Lindley) Loudon (Veron-
icaceae) based on ITS- and trnL-F sequences. Pl Syst Evol
248:111–128
Brysting AK, Oxelman B, Huber KT, Moulton V, Brochmann C
(2007) Untangling complex histories of genome mergings in
high polyploids. Syst Biol 56:467–476
Carlquist S (1974) Island biology. Columbia University Press, New
York
Carvalho JA (1999) Systematic studies of the genera Digitalis L. and
Isoplexis (Lindl.) Loud. (Scrophulariaceae: Digitaleae) and
conservation of Isoplexis species. Ph. D. thesis, University of
Reading, United Kingdom
Carvalho JA, Culham A (1998) Conservation status and preliminar
results on the phylogenetics of Isoplexis an endemic Macarone-
sian genus. Bol Mus Municipal Funchal 5:109–127
Charlesworth D, Charlesworth B, McVean GAT (2001) Genome
sequences and evolutionary biology, a two-way interaction.
Trends Ecol Evol 16:235–242
Cline J, Braman JC, Hogrefe HH (1996) PCR fidelity of Pfu DNA
polymerase and other thermostable DNA polymerases. Nucl
Acids Res 24:3546–3551
Cox CJ, Goffinet B, Shaw AJ, Boles SB (2004) Phylogenetic
relationships among the mosses based on heterogeneous Bayes-
ian analysis of multiple genes from multiple genomic
compartments. Syst Bot 29:234–250
Cronn RC, Small RL, Haselkorn T, Wendel JF (2002a) Rapid
diversification of the cotton genus (Gossypium: Malvaceae)
revealed by analysis of sixteen nuclear and chloroplast genes.
Amer J Bot 89:707–725
Cronn RC, Cedroni M, Haselkorn T, Grover C, Wendel JF (2002b)
PCR-mediated recombination in amplification products derived
from polyploid cotton. Theor Appl Genet 104:482–489
DoE Joint Genome Institute and Poplar Genome Consortium (2004)
Populus trichocarpa genome v1.0. http://genome.jgi-psf.org/Poptr1/Poptr1.home.html.
Last accessed 28th Nov 2007
Doyle JJ (1992) Gene trees and species trees—molecular systematics
as one-character taxonomy. Syst Bot 17:144–163
Doyle JJ, Doyle JL (1987) A rapid DNA isolation procedure for small
quantities of fresh leaf tissue. Phytochem Bull Bot Soc Am
19:11–15
Erixon P, Svennblad B, Britton T, Oxelman B (2003) Reliability of
Bayesian posterior probabilities and bootstrap frequencies in
phylogenetics. Syst Biol 52:665–673
Felsenstein J (1985) Confidence limits on phylogenies: an approach
using the bootstrap. Evolution 39:783–791
Francisco-Ortega J, Santos-Guerra A, Hines A, Jansen RK (1997)
Molecular evidence for a Mediterranean origin of the Macaro-
nesian endemic genus Argyranthemum (Asteraceae). Amer J Bot
84:1595–1613
Francisco-Ortega J, Fuertes-Aguilar J, Kim SC, Santos-Guerra A,
Crawford DJ, Jansen RK (2002) Phylogeny of the Macaronesian
endemic Crambe section Dendrocrambe (Brassicaceae) based on
internal transcribed spacer sequences of nuclear ribosomal DNA.
Amer J Bot 89:1984–1990
Ganders FR, Berbee M, Pirseyedi M (2000) ITS base sequence
phylogeny in Bidens (Asteraceae): evidence for the continental
relatives of Hawaiian and Marquesan Bidens. Syst Bot 25:122–133
Harris SA (1995) Systematics and randomly amplified polymorphic
DNA in the genus Leucaena (Leguminosae, Mimosoideae). Pl
Syst Evol 197:195–208
Helfgott DM, Francisco-Ortega J, Santos-Guerra A, Jansen RK,
Simpson BB (2000) Biogeography and breeding system evolu-
tion of the woody Bencomia alliance (Rosaceae) in Macaronesia
based on ITS sequence data. Syst Bot 25:82–97
Hirt RP, Logsdon JM, Healy B, Dorey MW, Doolittle WF, Embley
TM (1999) Microsporidia are related to Fungi: evidence from the
largest subunit of RNA polymerase II and other proteins. Proc
Natl Acad Sci USA 96:580–585
Table 6 continued
Sequences GenBank number
Combined B3 (D. ferruginea subsp. ferruginea, I. sceptrum & Erinus)
AJ870362, AJ870364, AJ870895,
AJ870898, AJ870912,
AJ870916
Combined B4 (D. trojana,
I. canariensis, I. isabelliana,
I. sceptrum)
AJ870870, AJ870879, AJ870882,
AJ870893-94, AJ870896,
AJ870912
Combined B5 (D. obscura subsp.
laciniata, D. trojana &
I. isabelliana)
AJ870813, AJ870869, AJ870872,
AJ870892
Exons + intron dataset: outgroup sequence
Pisum sativum AY557341
a Populus trichocarpa draft genome v1.0 accession numbers (http://genome.jgi-psf.org/Poptr1/Poptr1.home.html)
Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference
of phylogenetic trees. Bioinformatics 17:754–755
Hughes CE, Eastwood RJ, Bailey CD (2006) From famine to feast?
Selecting nuclear DNA sequence loci for plant species-level
phylogeny reconstruction. Philos Trans Roy Soc London B Biol
Sci 361:211–225
Huson DH (1998) SplitsTree: analysing and visualizing evolutionary
data. Bioinformatics 14:68–73
Joly S, Bruneau A (2006) Incorporating allelic variation for recon-
structing the evolutionary history of organisms from multiple
genes: an example from Rosa in North America. Syst Biol
55:623–636
Kim SC, Crawford DJ, Francisco-Ortega J, Santos Guerra A (1996) A
common origin for woody Sonchus and five related genera in the
Macaronesian islands: molecular evidence for extensive radia-
tion. Proc Natl Acad Sci USA 93:7743–7748
Koch M, Haubold B, Mitchell-Olds T (2001) Molecular systematics
of the Brassicaceae: evidence from coding plastidic matK and
nuclear Chs sequences. Amer J Bot 88:534–544
Lawton-Rauh A (2003) Evolutionary dynamics of duplicated genes in
plants. Molec Phylogenet Evol 29:396–409
Lemmon AR, Moriarty EC (2004) The importance of proper model
assumption in Bayesian phylogenetics. Syst Biol 53:265–277
Linder CR, Rieseberg LH (2004) Reconstructing patterns of reticulate
evolution in plants. Amer J Bot 91:1700–1708
Martin AP, Burg TM (2002) Perils of paralogy: using HSP70 genes
for inferring organismal phylogenies. Syst Biol 51:570–587
Martins TR, Barkman TJ (2005) Reconstruction of Solanaceae
phylogeny using the nuclear gene SAMT. Syst Bot 30:435–447
Mason-Gamer RJ, Weil CF, Kellogg EA (1998) Granule-bound starch
synthase: structure, function and phylogenetic utility. Molec Biol
Evol 15:1658–1673
Mathews S, Sharrock RA (1996) The phytochrome gene family in
grasses (Poaceae): a phylogeny and evidence that grasses have a
subset of the loci found in dicot angiosperms. Molec Biol Evol
13:1141–1150
Morrison DA (2005) Networks in phylogenetic analysis: new tools for
population biology. Int J Parasitol 35:567–582
Mort ME, Crawford DJ (2004) The continuing search: low-copy
nuclear sequences for lower-level plant molecular phylogenetic
studies. Taxon 53:257–261
Nylander JAA (2004) MrModelTest 2.0. program distributed by the
author. Evolutionary Biology Centre, Uppsala University
Olmstead RG, DePamphilis CW, Wolfe AD, Young ND, Elisens WJ,
Reeves PA (2001) Disintegration of the Scrophulariaceae. Amer
J Bot 88:348–361
Pamilo P, Nei M (1988) Relationships between gene trees and species
trees. Molec Biol Evol 5:568–583
Panero JL, Francisco-Ortega J, Jansen RK, Santos-Guerra A (1999)
Molecular evidence for multiple origins of woodiness and a New
World biogeographic connection of the Macaronesian Island
endemic Pericallis (Asteraceae: Senecioneae). Proc Natl Acad
Sci USA 96:13886–13891
Posada D, Crandall KA (2001) Intraspecific gene genealogies: trees
grafting into networks. Trends Ecol Evol 16:37–45
Ree RH, Citerne HL, Lavin M, Cronk QCB (2004) Heterogeneous
selection on LEGCYC paralogs in relation to flower morphology
and the phylogeny of Lupinus (Leguminosae). Molec Biol Evol
21:321–331
Rokas A, Carroll SB (2005) More genes or more taxa? The relative
contribution of gene number and taxon number to phylogenetic
accuracy. Molec Biol Evol 22:1337–1344
Rokas A, Williams BL, King N, Carroll SB (2003) Genome-scale
approaches to resolving incongruence in molecular phylogenies.
Nature 425:798–804
Sang T (2002) Utility of low-copy nuclear gene sequences in plant
phylogenetics. Crit Rev Biochem Molec Biol 37:121–147
Sang T, Donoghue MJ, Zhang DM (1997) Evolution of alcohol
dehydrogenase genes in peonies (Paeonia): phylogenetic rela-
tionships of putative nonhybrid species. Molec Biol Evol
14:994–1007
Schluter PM, Stuessy TF, Paulus HF (2005) Making the first step:
practical considerations for the isolation of low-copy nuclear
sequence markers. Taxon 54:766–770
Small RL, Cronn RC, Wendel JF (2004) Use of nuclear genes for
phylogeny reconstruction in plants. Austral Syst Bot 17:145–170
Snowden KC, Simkin AJ, Janssen BJ, Templeton KR, Loucas HM,
Simons JL, Karunairetnam S, Gleave AP, Clark DG, Klee HJ
(2005) The Decreased apical dominance1/Petunia hybrida CAROTENOID CLEAVAGE DIOXYGENASE8 gene affects
branch production and plays a role in leaf senescence, root
growth and flower development. Pl Cell 17:746–759
Soltis DE, Soltis PS (2000) Contributions of plant molecular system-
atics to studies of molecular evolution. Pl Molec Biol 42:45–75
Sorefan K, Booker J, Haurogne K, Goussot M, Bainbridge K, Foo E,
Chatfield S, Ward S, Beveridge C, Rameau C, Leyser O (2003)
MAX4 and RMS1 are orthologous dioxygenase-like genes that
regulate shoot branching in Arabidopsis and pea. Genes Dev
17:1469–1474
Strand AE, Leebens-Mack J, Milligan BG (1997) Nuclear DNA-based
markers for plant evolutionary biology. Molec Ecol 6:113–118
Swofford DL (2002) PAUP*-Phylogenetic analysis using parsimony
(*and other methods): version 4.0b10. Sinauer, Sunderland
Syring J, Willyard A, Cronn R, Liston A (2005) Evolutionary
relationships among Pinus (Pinaceae) subsections inferred from
multiple low-copy nuclear loci. Amer J Bot 92:2086–2100
Tank DC, Sang T (2001) Phylogenetic utility of the glycerol-3-
phosphate acyltransferase gene: evolution and implications in
Paeonia (Paeoniaceae). Molec Phylogenet Evol 19:421–429
Tank DC, Beardsley PM, Kelchner SA, Olmstead RG (2006) Review
of the systematics of Scrophulariaceae s.l. and their current
disposition. Austral Syst Bot 19:289–307
Vriesendorp B, Bakker FT (2005) Reconstructing patterns of
reticulate evolution in angiosperms: what can we do? Taxon
54:593–604
Wagner A, Blackstone N, Cartwright R, Dick M, Misof B, Snow P,
Wagner GP, Bartels J, Murtha M, Pendleton J (1994) Surveys of
gene families using polymerase chain reaction–PCR selection
and PCR drift. Syst Biol 43:250–261
Whitlock BA, Baum DA (1999) Phylogenetic relationships of
Theobroma and Herrania (Sterculiaceae) based on sequences
of the nuclear gene Vicilin. Syst Bot 24:128–138
Whittall JB, Medina-Marino A, Zimmer EA, Hodges SA (2006)
Generating single-copy nuclear gene data for a recent adaptive
radiation. Molec Phylogenet Evol 39:124–134
Winkworth RC, Bryant D, Lockhart PJ, Havell D, Moulton V (2005)
Biogeographic interpretation of splits graphs: least squares
optimization of branch lengths. Syst Biol 54:56–65