Upload
doanmien
View
219
Download
0
Embed Size (px)
Citation preview
Indian Journal of Biotechnology
Vol 12, January 2013, pp 58-66
A comparative computational study of the ‘rbcL’ gene in plants and in the
three prokaryotic families—Archaea, cyanobacteria and proteobacteria
Sunil Kanti Mondal1,2#
, Subhadeep Shit3#
and Sudip Kundu1*
1Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, Kolkata 700 009, India
2Department of Biotechnology, The University of Burdwan, Rajbati, Burdwan 713 104, India
3School of Biotechnology and Life Sciences, Haldia Institute of Technology, Haldia 721 657, India
The rbcL (ribulose-1,5-biphosphate carboxylase oxygenase) gene plays a crucial role in carbon fixation. Previous
studies shed light on its evolutionary relationship among different Phyla. Here, authors have done a comparative study of
rbcL genes among proteobacteria, archaea, cyanobacteria and plants based on their compositional variations (GC%, amino
acid frequency, codon usage, etc.). In addition we have checked the mutational pressure on rbcL genes. The results indicate
that the rbcL genes of cyanobacteria have a wide range of GC%. On the other hand, those of the proteobacteria have mainly
higher GC%. Preferences of some amino acids usages have observed in rbcL genes among all species with an exception
of plant. Analysis of RSCU (relative synonymous codon usage) values depicts GC ending codon biasness in proteobacteria,
archaea and with few exceptions in cyanobacterial species. On the other hand, AU ending codon biasness has been observed
for plants. The correspondence analysis shows the significant difference in codon usage pattern among the selected four
groups of species. The ENc (expected effective number of codons) plot implys the choice of rbcL gene codon is constrained
only by mutational biasness. Moreover, the rbcL genes’ expression ability, as predicted by CAI (codon adaptation index), is
similar for most of the species from different groups.
Keywords: Codon adaptation index (CAI), expected effective number of codons (ENc), ribulose-1,5-biphosphate carboxylase
oxygenase, relative synonymous codon usage (RSCU)
Introduction Ribulose-1,5-biphosphate carboxylase-oxygenase
(E.C. 4.1.1.39, Rubisco), one of the most abundant
enzymes on the globe1, is responsible for catalyzing
CO2 assimilation to organic carbon via Calvin cycle.
In its most prevalent conformation, found in most
proteobacteria, cyanobacteria, algae and higher plants,
Rubisco occurs as a hexadecamer composing of eight
large subunit (rbcL, MW ≈ 56,000) and eight small
subunits (rbcS, MW ≈ 14,000), assembling into an
L8S8 holoenzyme2. The large subunit plays the crucial
role of carbon fixation3. In Cyanobacteria and other
prokaryotes, genes for both rbcL and rbcS are
chromosomally encoded and co-transcribed2. As
evolutionary rate of rbcL is suitable for study of
phylogeny, it is often used as model for phylogenetic
investigation4. Understanding of these evolution
patterns may shed light on functional/structural
features governing Rubisco activity.
Previous investigations on rbcL genes suggest a
biphyletic origin of phototrophic eukaryotes5-7
. It is
also found that rbcL of green algae/plant lineages is
derived from cyanobacteria and forms one rbcL
lineage; the second rbcL lineage consists of the
non-green algae and is derived from proteobacteria.
This scenario of evolution is also well supported by
molecular phylogenies of other chloroplast-encoded
genes like psbA, tufA5,8
, atpB, ClpC9,10
, etc.
Phylogenies based on these genes suggest that there is
a single cyanobacterial ancestor of plastids. These
resulted in a number of hypotheses explaining the
apparently contradictory results. For example, (a) a
lateral gene transfer of rbcLS genes may have occurred
from a proteobacteria into the ancestor that gave rise to
non-green plants; (b) a lateral transfer of rbcLS operon
may have occurred into cyanobacterial ancestor that
gave rise to non-green plants; or (c) two rbcLS operons
may have been present in cyanobacterial-like ancestor
(that gave rise to plastids) and different copies were
retained in green versus nongreen lineages11
.
While a large number of investigations have
already done to understand the ancestry of Rubisco
——————
*Author for correspondence:
Mobile: +91-9433428324
E-mail: [email protected]; [email protected] #Both authors have equal contribution
MONDAL et al: A COMPARATIVE COMPUTATIONAL STUDY OF rbcL GENES
59
and its evolutionary relationship among different
Phyla, present study deals with the rbcL gene’s
evolutionary pattern in prokaryotes (archaea,
proteobacteria, cyanobacteria) and eukaryotes (plant);
an extensive analysis on the compositional variation
and similarity (GC content, amino acid usage
frequency, codon bias, etc.) of the rbcL genes;
and also to understand the mutational pressure on
rbcL genes.
Materials and Methods
Collection of Data
Taxonomic and other related information, whole
genome and 16S rRNA sequences of 43 species
(10 Archaea spp., 10 Cyanobacteria spp., 15
Proteobacteria spp. & 8 Plant spp.) were collected
from NCBI (http://www.ncbi.nlm.nih.gov). Annotated
rbcL gene and protein sequences were collected from
the KEGG database (http://www.genome.jp/dbget-bin/).
Evolutionary Analysis
Bootstrapped trees of 16S rRNA and rbcL gene
sequences were generated using ClustalX (version 2)
and PHYLIP (version 3.69)12,13
. Hierarchical
clustering based on GC content was generated using
DIANA14
.
Compositional Analysis
Parameters like GC content, RSCU (a measure
of relative synonymous codon usage biasness)
values, amino acid and tRNA frequencies were
considered for compositional analysis. The GC, GC3s
and Nc values15
were generated using CodonW
(http://codonw.sourceforge.net/). These provided useful
information regarding existence of mutational
pressures acting on the genes16
. The expected
effective number of codons (ENc) values from
GC3s under H0 (null hypothesis, i.e., no selection)
were calculated according to Equation 1, where S
denoted GC3s.
ENc= 2+S+ {29/ [S2+ (1-S)
2]} … (1)
Correspondence analysis was performed using
CodonW to investigate major trend in RSCU variation
among the genes. It identifies major trends in the
variation of synonymous codon usage data and
distributes genes along continuous axes in accordance
with these trends. RSCU value close to 1 indicates
lack of biasness, while much higher and lower values
indicate preference and avoidance of that particular
codon, respectively.
Types of motifs in gene sequence were
calculated using the online tool MOTIFSCAN
(http://hits.isb-sib.ch/cgi-bin/PFSCAN) and were
aligned determining whether each of the genes are
conserved among groups.
Statistical Significance Test
A statistical significance (z) test was performed
based on G and C ending codons of amino acids
having minimum four degenerative codons with
variation only in the third position to understand the
preference of either codons on the respective
organisms group.
Z = (Xc - Xg)/√((Sc2+Sg
2)/(Nc+Ng) … (2)
Here, Xi and Si denote the average and standard
deviation of i (i=C & G) ended codon’s RSCU values,
respectively. N indicates the sample size.
Expressional Probability
Codon adaptation index (CAI), a measure of gene’s
probable expression, was calculated17
. The codon
preference model was also generated to check an
existence of mutational biasness as well as also to
understand the preferential effect over the different
amino acids.
Results and Discussion
Phylogenetic Analysis
Phylogenetic trees generated based on rbcL genes
and 16S rRNA sequences are shown in Fig. 1. The
main cluster of phylogenetic tree of the rbcL genes of
prokaryotes (Fig. 1a) has two sub clusters; one of
archaeal and the other of α- and β-proteobacterial
secies. It is joined by a cluster of γ-proteobacterial
and finally preceded by cyanobacterial species. The
phylogenetic tress of 16S rRNA sequences is
presented in Fig. 1b. Its main cluster comprises of
archaeal and cyanobacterial species as sub
clusters, which is followed by cluster of β- and
γ-proteobacterial, and of α-proteobacterial species.
Comparison of both phylogenetic trees shows that
rbcL genes of archaea and proteobacterial species
have more similarity. On the other hand, 16S rRNA
based tree clearly shows genetical similarity of
archaeal and cyanobacterial species. The higher
similarity among the rbcL genes of cyanobaceria and
plant species is evident from the phylogenetic tree of
both the prokaryotic and plant species (Fig. 1c), and
confirms about the evolutionary relationships as
already observed by several previous studies.
INDIAN J BIOTECHNOL, JANUARY 2013
60
Fig. 1 (a-c)—(a) Phylogenetic tree based on the rbcL gene sequences; (b) Phylogenetic tree based on the 16SrRNA sequences of the
species; & (c) Phylogenetic tree of both prokaryotes and eukaryotes based on the rbcL gene sequences.
MONDAL et al: A COMPARATIVE COMPUTATIONAL STUDY OF rbcL GENES
61
Compositional Variability
Variation in rbcL Gene
The hierarchical cluster based on GC content
of the rbcL genes is presented in Fig 2. It has
two main clusters. The first cluster forms two sub
clusters: Cluster A and B, while the other cluster
comprises of three sub clusters: Cluster C, D and E.
Cluster A contains plant, whereas cluster B contains
mainly cyanobacterial and few archaea species
(Staphylothermus marinus, Methanosarcina mazei &
Methanoculleus marisnigri). These two clusters have
species possesing rbcL genes of low GC content. The
cluster C represents species having comparatively
higher GC content (56.42%) in their rbcL genes and it
has species of different groups, such as, cyanobacteria
(Thermocynechococcus elongatus & Synechoccus sp.),
archaea (M. barkeri, Archaeoglobus fulgidus &
Thermococcus gammatolerens) and proteobacteria
(Acidithiobacillus ferroxidans & Nitrosomonas
eutropha). However, the species having highest GC
content in rbcL genes are present in cluster D and E,
comprising mainly of the proteobacteria species
and few cyanobacteria (Synechococcus sp. A-prime &
Synechoccus sp. RCC307) and archaea (Thermofilum
pendens & Natronomonas pharaonis). The results
indicate that the rbcL genes of cyanobacterial species
have a wide range of GC contents, whereas those of
the proteobacterial species have mainly higher GC
contents. It has been observed that the range of GC
content in green plants varies from 28 to 42%18
.
It is also evident from the present results that the
rbcL genes of plants have comparatively lower
GC contents.
The gene has also been analyzed based on the
physico-chemical properties of the amino acids
present in its sequence. Amino acid frequencies are
calculated and presented in three groups: amino acids
with hydrophobic side chains in cluster I, polar amino
acids with neutral side chains in cluster II and polar
amino acids with charged groups in cluster III
(Fig. 3). Amino acids like alanine, valine, isoleucine
and leucine from cluster I show higher amino acid
usage among all the four groups of species. Similarly,
amino acids like glycine, proline and threonine from
the cluster II show comparative higher usage ratio. On
the other hand, glycine and cystiene usages are
maximum and minimum, respectively among all the
four groups of species. All the polar amino acids with
charged groups, except histidine, (cluster III) show
comparatively higher usage ratio in all groups of
species. However, few exceptions are observed in a
plant species, namely, Nicotiana benthamiana; it
shows higher usage ratio of arginine (from cluster III),
cystein and serine (from cluster II).
An intra and inter group correlation check was
done based on the rbcL genes’ RSCU (Table 1) and
the results show that proteobacterial and archaeal
species have greater RSCU values in the GC-ending
Fig. 2—Hierarchical clustering based on the GC content of rbcL genes.
INDIAN J BIOTECHNOL, JANUARY 2013
62
codons, indicating greater usage of these codons.
Cyanobacterial species also showed greater values for
GC-ending codons, except amino acids arginine,
alanine, valine, glutamic acid, glycine and lysine,
while plant specied showed greater RSCU values in
the AU-ending codons, except amino acids aspargine
and histidine.
Codon Preference Check
The rbcL genes’ RSCU values also lead to the
establishment of a codon preference model, which
thereby shows preference towards GC-ending codons
by the prokaryotes, signifying a mutational bias
towards G+C, but no such case is observed in the
plant species (Table 1). A preference check of G3
Fig. 3—Amino acid frequencies of rbcL genes obtained from archaea (AR), proteobacteria(PR), cyanobacteria (CY) and plant species (PL).
(afu = Archaeoglobus fulgidus; hbu = Hyperthermus butylicus; mem = Methanoculleus marisnigri; mac = Methanosarcina acetivorans;
mba = Methanosarcina barkeri; mma = Methanosarcina mazei; nph = Natronomonas pharaonis; smr = Staphylothermus marinus;
tga = Thermococcus gammatolerans; tpe = Thermofilum pendens; ana = Anabaena sp. PCC7120; cya = Cyanobacteria Yellowstone A-Prime;
cyb = Cyanobacteria Yellowstone B-Prime; cyc = Cyanothece; mar = Microcystis aeruginosa; pmb = Prochlorococcus marinus AS9601;
pmj = Prochlorococcus marinus MIT9211; pme = Prochlorococcus marinus NATL1A; syr = Synechococcus sp. RCC307;
tel = Thermosynechococcus elongates; ath = Arabidopsis thaliana; P01 = Beta vulgaris; P02 = Brassica napus; cre = Chlamydomonas
reinhardtii; P03 = Gossypium raimondii; P04 = Helianthus annus; P05 = Nicotiana benthamiana; P06 = Petunia hybrid; acr = Acidiphilium
cryptum; afr = Acidithiobacillus ferrooxidans; aeh = Alkalilimnicola ehrlichei; bbt = Bradyrhizobium; bph = Burkholderia phymatum;
cti = Cupriavidus taiwanensis; mca = Methylococcus capsulatus; net = Nitrosomonas eutropha; rsp = Rhodobacter sphaeroides 2.4.1;
rpe = Rhodopseudomonas palustris BisA53; rpa = Rhodopseudomonas palustris CGA009; rce = Rhodospirillum centenum;
sme = Sinorhizobium meliloti; tbd = Thiobacillus denitrificans; & xau = Xanthobacter autotrophicus)
MONDAL et al: A COMPARATIVE COMPUTATIONAL STUDY OF rbcL GENES
63
over C3 or vice versa (Table 2) shows few significant
results; archaeal species show biasness of C3 over G3
in all codons, except arginine and proline. Among
cyanobacterial species, preference of G3 over C3 are
found in amino acids like leucine, while C3 over G3
are seen in alanine, glycine, proline, threonine and
serine. Among proteobacteria species, biasness
towards C3 ending codons is observed, except
amino acids leucine and proline. In plant species,
preference exists in AU-ending codons, except
aspargine and histidine showing a mild higher
RSCU values in C3 ending codons.
Statistical Significance Test
Some amino acids like alanine and proline show no
significant difference (P<0.05) between G and C
ending codons both in archaea and plant species, while
threonine shows significant difference in archaea,
cyanobacteria, proteobacteria and plant species.
Motif Analysis
The general sequence of the rbcL motif is “G-x-
[D]-F-x-K-x-D-E”. Degree of conservation using
motif analysis shows that rbcL motifs of the different
species are found almost conserved in all the species,
except N. eutropha, Brassica napus, N. benthamiana
Table 1—Summary of the RSCU values of rbcL genes among the four groups
Amino
acids
Codons* Archaea Cyano-
bacteria
Proteo-
bacteria
Plant Amino
acids
Codons* Archaea Cyano-
bacteria
Proteo-
bacteria
Plant
gca 1 0.71 0.24 1.18 Met aug 1 1 1 1
gcc 1.25 1.18 2.21 0.57 cca 0.88 0.64 0.19 1.25
gcg 0.88 0.55 1.32 0.46 ccc 1.02 1.42 1.22 0.51
Ala
gcu 0.87 1.58 0.22 1.79
Pro
ccg 1.44 0.59 2.34 0.53
ugc 1.78 1.16 1.73 0.9 ccu 0.67 1.35 0.26 1.71 Cys
ugu 0.23 0.85 0.27 1.1 Gln caa 0.56 0.97 0.39 1.45
gac 1.28 1.04 1.48 0.71 cag 1.66 1.03 1.81 0.55 Asp
gau 0.72 0.96 0.52 1.29 Arg aga 1.53 1.88 0.21 1.49
gaa 0.78 1.31 0.81 1.53 agg 2.51 0.46 0.28 0.87 Glu
gag 1.22 0.69 1.19 0.47 cga 0.34 0.35 0.39 1.01
uuc 1.29 1.22 1.89 0.84 cgc 0.88 1.85 4.09 0.55 Phe
uuu 0.72 0.78 0.2 1.16 cgg 2.28 1.61 1.09 0.49
gga 0.98 0.6 0.08 1 cgu 0.74 2.14 0.68 2.16
ggc 1.51 1.13 3.1 0.5 agc 1.99 0.39 1.04 0.87
ggg 0.91 0.35 0.28 0.73 agu 0.71 0.33 0.07 0.95
Gly
ggu 0.6 1.93 0.53 1.77
Ser
uca 0.6 0.75 0.11 1.38
cac 1.31 1.43 1.43 1.1 ucc 1.16 1.99 1.52 0.78 His
cau 0.69 0.57 0.57 0.9 ucg 0.99 0.63 2.93 0.35
aua 0.87 0.17 0.03 0.36 ucu 0.56 1.93 0.34 1.67
auc 1.41 1.86 2.77 1.19 aca 0.79 0.7 0.15 1.18
Ile
auu 0.73 0.97 0.21 1.45
Thr
acc 1.54 2.05 2.7 0.76
aaa 0.68 1.11 0.28 1.51 acg 0.91 0.31 0.95 0.22 Lys
aag 1.32 0.89 1.72 0.49 acu 0.76 0.94 0.21 1.85
uua 0.24 0.74 0.05 1.23 gua 0.72 0.8 0.15 1.62
uug 0.35 1.03 0.28 1.34 guc 1.33 0.7 1.79 0.27
cua 0.64 0.74 0.11 0.94 gug 0.72 0.93 1.88 0.58
cuc 1.99 1.17 2.03 0.4
Val
guu 1.23 1.57 0.18 1.54
cug 1.39 1.52 3.35 0.63 Trp ugg 1 1 1 1
Leu
cuu 1.39 0.82 0.18 1.47 Tyr uac 1.44 1.22 1.45 0.83
aac 1.5 1.54 1.69 1.1 uau 0.59 0.78 0.55 1.17 Asn
aau 0.54 0.48 0.42 1.05
*Preferred codons are marked in ‘bold’.
INDIAN J BIOTECHNOL, JANUARY 2013
64
and Gossypium raimondii (Fig. 4). Within the groups,
cyanobacterial species show the most conserved
sequence, except Prochlorococcus marinus where
phenyl alanine residue has changed to leucine.
This change has also been detected in archaea where
in few species (Hyperthermus butylicus, S. marinus,
T. gammatolerans & 3 Methanosarcina species),
phenyl alanine residue has changed either to leucine
or tyrosine. Proteobacteria species also show a
conserved rbcL motif sequence.
Expressional Variability
ENc Plot Analysis
The ENc plot analysis [ENC plotted against %
(G+C)3 (Fig. 5) ] was used to investigate patterns of
synonymous codon usage, which shows that all the
species lie below the expected curve, except two plant
species B. napus and N. benthamiana, which are lying
on the curve of the predicted values. This implies that
the choice of the rbcL gene codon of the different
species is constrained only by a mutational biasness.
Correspondence Analysis
To determine the codon usage of rbcL gene among
the four groups, correspondence analysis on the genes
RSCU values of the 43 organisms was carried out by
a standard procedure19
. The distribution of the rbcL
gene from the 43 species on the first two major
axes of the correspondence analysis is shown in
Fig. 6. Genes are recognized based on their groups.
Proteobacteria and plant species are separated along
the first major axis, while cyanobacteria and archaea
species are separated along the second major axis,
thereby depicting significant difference in their codon
usage pattern. A closer look depicts that archaea
species (N. pharaonis & Methanoculleus marisnigri)
and cyanobacteria species (Cyanobacteria A &
B-Prime, Synechoccus RCC307 & T. elongates) show
similar RSCU pattern with proteobacteria species.
Similarly, P. marinus (NATL1A, MIT9211 & AS9601)
Table 2—Z-scores of the rbcL genes for the four different groups
Z-Scores Amino
acids Archaea
spp.
Cyanobacteria
spp.
Proteobacteria
spp.
Plant
spp.
Alanine 1.43 2.11 4.25 0.72
Glycine 2.13 2.33 7.94 -0.97
Leucine 0.54 -2.43 -3.97 -4.68
Proline -1.5 2.56 -4.49 -0.08
Arginine -5.42 -0.42 6.91 -2.61
Serine 4.11 3.68 -0.86 4.49
Threonine 2.43 4.89 7.12 2.98
Valine 2.3 -0.69 -0.38 -1.33
Amino acids having more than two codons have been considered.
Z values greater than or less than 1.96 (at P<0.05) are significant.
Fig. 4—Alignment of ‘rbcL’ motif sequences to identify the
degree of conservation among the archaea, proteobacteria,
cyanobacteria and plants.
MONDAL et al: A COMPARATIVE COMPUTATIONAL STUDY OF rbcL GENES
65
species of cyanobacteria are found to have closer
resemblance to all the plant species, except B. napus,
N. benthamiana and G. raimondii, which are similar
to the S. marinus and M. barkeri from archaea species.
Codon Adaptation Index Variation
Codon Adaptation Index (CAI) predicting the
degree of expression of the rbcL gene analyzed shows
an average rate of expression of 0.72-0.76 (Fig. 7).
This indicates that rbcL is highly expressible gene
and is playing an important role in these species life.
Only three species from plant, Helianthus annuus,
Arabidopsis thaliana and Chlamydomonas reinhardtii
are found showing a comparative lower CAI values.
This lower value might be due to usage of other
Rubisco genes.
Conclusion
The comparative study of rbcL genes among different groups of species highlighted few significant outcomes. First of all, the results show that rbcL
genes of cyanobacteria and plants are similar in comparison to the other groups, which is an already known evolutionary relationship. In addition to that, it further shows that rbcL genes of these two groups have lower GC% in comparison to those of proteobacteria and archaeal species, which show higher GC% values. On the other hand, cyanobacteria and the other prokaryotic groups show differences from plants on the basis of their higher usage ratio in GC preference codons, thereby signifying the existence of mutational pressure among their rbcL genes. Such mutational pressure is not observed in case of plants. Though the rbcL gene expression is nearly similar among the four groups but the pattern in their codon usages is found different.
Acknowledgement Authors wish to thank the Department of
Biotechnology, University of Burdwan, Burdwan for
providing the facility of Computational Biology
laboratory.
Fig. 5—The effective number of codons (ENc) plotted against
the GC3s for the rbcL gene. Each expected ENc from GC3s is
shown as a standard curve.
Fig. 6—Correspondence analysis of the rbcL gene from the
four groups based on its RSCU values. The proteobacteria and
the plant species are separated along the first major axis, while
the cyanobacteria and archaea species are separated along the second
major axis, thereby depicting the difference in their codon usage pattern.
Fig. 7—Representation of the CAI values for the ‘rbcL’ gene from four groups. An average of above 70% CAI is shown by all the
organisms, except H. annus, A. thaliana and C. reinhardtii from plants.
INDIAN J BIOTECHNOL, JANUARY 2013
66
References 1 Chase M W, Soltis D E, Olmstead R G, Morgan D,
Les D H et al, Phylogenetics of seed plants: An analysis of
nucleotide sequences from the plastid gene rbcL, Ann Mo Bot
Gard, 80 (1993) 528-580.
2 Tabita F R, Molecular and cellular regulation of autotrophic
carbon dioxide fixation in microorganisms, Microbiol Res,
52 (1988)155-189.
3 Miziorko H M & Lorimer G H, Ribulose-l,5-bisphosphate
carboxylase-oxygenase, Ann Rev Biochem, 52 (1983) 507-535.
4 Delwiche C F & Palmer J D, Rampant horizontal transfer
and duplication of rubisco genes in eubacteria and plastids,
Mol Biol Evol, 13 (1996) 873-882.
5 Morden C W, Delwiche C F, Kuhsel M & Palmer J D, Gene
phylogenies and the endosymbiotic origin of plastids,
Biosystems, 28 (1992) 75-90.
6 Ueda K & Shibuya H, Molecular phylogeny of rbcL and its
bearing on the origin of plastids: Bacteria or cyanobacteria?,
in Endocytobiology, edited by S Sato, M Ishida & H
Ishikawa, (V Tiibingen University Press, Tiibingen,
Germany) 1993, 369-376.
7 Mcfadden G I, Gilson P R & Waller R E, Molecular
phylogeny of chlorarachniophytes based on plastid rRNA
and rbcL sequences, Arch Protistenkd, 145 (1995) 231-239.
8 Delwiche C F, Kuhsel M & Palmer J D, Phylogenetic analysis of tufA sequences indicates a cyanobacterial origin of all plastids, Mol Phylogenet Evol, 4 (1995) 110-128.
9 Douglas S E & Murphy C A, Structural, transcriptional, and
phylogenetic analysis of the atpB gene cluster from the
plastid of CryptomonusΨ (Cryptophyceae), J Phycol, 30
(1994) 329-340.
10 Clarke A K & Eriksson M J, The cyanobacterium
Synechococcus sp. PCC 7942 possesses a close homologue
to the chloroplast ClpC protein in higher plants, Plant Mol
Biol, 31 (1996) 721-730.
11 Palmer J D, A genetic rainbow of plastids, Nature (Lond),
364 (1993) 762-763.
12 Felsenstein J, PHYLIP: Phylogeny interference package
(version 3.69) (Department of Genome Sciences and
Department of Biology, University of Washington,
Washington, USA) 1989, 164-166.
13 Swofford D L, Olsen G J, Waddell P J & Hillis D M,
Phylogenetic inference, in Molecular systematics, 2nd edn,
edited by D M Hillis, C Moritz & B K Mable (Sinauer
Associates, Inc., Publishers, Sunderland, USA) 1996, 407-514.
14 Kaufman L & Rousseeuw P J, Finding groups in data: An
introduction to cluster analysis, (John Wiley & Sons, Inc.,
New Jersey, USA) 1990.
15 Fu C, Xiong J & Miao W, Genome-wide identification and
characterization of cytochrome P450 monooxygenase genes
in the ciliate Tetrahymena thermophila, BMC Genomics, 10
(2009) 208.
16 Meng Z, Wei L & Xia L, Analysis of synonymous codon
usage in chloroplast genome of Populus alba, J For Res, 19
(2008) 293-297.
17 Sharp P M & Li W H, The codon adaptation index—A
measure of directional synonymous codon usage bias, and
its potential applications, Nucleic Acids Res, 15 (1987)
1281-1295.
18 Kusumi J & Tachida H, Compositional properties of green-
plant plastid genomes, J Mol Evol, 60(2005) 417-425.
19 Sharp P M, Tuohy T M F & Mosurski K R, Codon usage
in yeast: Cluster analysis clearly differentiates highly
and lowly expressed genes, Nucleic Acids Res, 14(1986)
5125-5143.