Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
CHARACTERISATION OF DUPLICATEDHAEMOGLOBIN GENES IN BIVALVES
Mathilde Klein Bachelor of Medical Laboratory Science, QUT
Submitted in fulfilment of the requirements for the degree of
Master of Applied Science (Research)
School of Biomedical Sciences, IHBI
Faculty of Health
Queensland University of Technology
2017
Keywords
Arcoida, Bivalves, Gene duplication, Genomics, Globins, Haemoglobin,
Limoida, Molluscs, Transcriptome
Keywords i
Abstract
Haemoglobins (Hbs) are found in virtually all phyla and are some of the most
investigated proteins in biomedical sciences. These proteins exhibit an extraordinary
diversity of form and function in invertebrate lineages. This provides a unique
opportunity to explore the origin and evolution of Hbs yet little is known about their
distribution, function and evolution in invertebrate lineages. To explore further the
functions and evolution of those Hbs, recent transcriptome data for the Arcid bivalve
Anadara trapezia is investigated here. This species shows the presence of duplicated
Hb encoding genes suggesting that gene duplication may have been more extensive
than previously thought in bivalves. This study tests the hypothesis that these
duplicated genes show patterns of tissue specific expression and evidence of
neofunctionalisation. This is shown here for at least three Hb encoding genes present
in A. trapezia with strong tissue specific expression in haemolymph compared to
other tissues. Furthermore, the expression of these genes remains unaffected by
prolonged air exposure suggesting that neofunctionalisation may confer an
evolutionary advantage to this bivalve. As well as the unique Hbs found in the
bivalve order Arcoida, Hbs are also found in three other bivalve orders: Carditoida,
Solemyoida and Veneroida. These four orders that possess Hbs provide compelling
evidence for the independent evolution of these proteins in multiple bivalve lineages.
To expand data on the distribution of Hbs in bivalves, a transcriptome sequence for
Ctenoides ales in the order Limoida was generated in this project. Interrogation of
the transcriptome shows the presence of at least three globin-like encoding genes
including two Hb-like encoding genes providing preliminary evidence for another
independent origin of Hb in a bivalve lineage. Overall, this study provides novel
insights into the function, evolution and distribution of Hbs in bivalves by
investigating two distantly related species. Results of this study are consistent with
current theories that Hb diversity in bivalves is a result of repeated rounds of gene
duplication providing the raw material for evolution. Investigation of hypoxic
resistance also reinforces that greater expression of Hbs in haemolymph confers a
physiological advantage suggesting that Hb would evolve more often in some
lineages during adaptation to unfavourable environment conditions, particularly
Abstract ii
hypoxia and prolonged air exposure. The finding of Hb-like encoding genes in
another bivalve lineage also supports the evolution of this gene family through
independent evolution and gene duplication, and gives insight into the distribution of
globin genes in bivalves which is still poorly understood. The investigation of Hb
genes in these bivalves also contributes to further understand the role of Hbs and
provides potential novel insights for resistance in hypoxic environments, disease
control and resistance to pollution in aquaculture.
Abstract iii
Table of Contents Keywords .................................................................................................................................. i
Abstract .................................................................................................................................... ii
List of Figures ......................................................................................................................... vi
List of Tables .......................................................................................................................... xi
List of Abbreviations ............................................................................................................. xii
Statement of Original Authorship ......................................................................................... xiii
Acknowledgements ............................................................................................................... xiv
Chapter 1: Introduction ............................................................................................ 1
1.1 Functional and structural diversity within the globin superfamily .................................1 1.1.1 Recently discovered globin genes ....................................................................................... 2 1.1.2 Myoglobins and haemoglobins ........................................................................................... 2
1.2 Evolution of haemoglobins .............................................................................................4 1.2.1 Gene duplication ............................................................................................................... 5 1.2.2 Divergent evolution of duplicated genes .......................................................................... 8 1.2.3 Convergent evolution ........................................................................................................ 9 1.2.4 Invertebrate haemoglobins .............................................................................................. 10
1.3 Bivalve haemoglobins .......................................................................................................12 1.3.1 Haemoglobins in the family Arcidae, Pteriomorphia subclass.......................................... 16
1.4 Aims ..................................................................................................................................19
1.5 Thesis Outline ...............................................................................................................20
Chapter 2: Tissue specificity and neofunctionalisation of haemoglobin genes in Anadara trapezia ....................................................................................................... 22
2.1 Background .......................................................................................................................22
2.2 Material and Methods .......................................................................................................23 2.2.1 Sample acquisition and tissue dissection .......................................................................... 23 2.2.2 RNA extraction and cDNA synthesis ............................................................................... 24 2.2.3 RT–PCR (Real Time PCR) ............................................................................................... 25 2.2.4 Candidate gene validation ................................................................................................. 26 2.2.5 RT-qPCR (Real Time quantitative PCR) for quantification of gene expression .............. 26 2.2.6 RT-qPCR data analysis ..................................................................................................... 27
2.3 Results ...............................................................................................................................28 2.3.1 Haemolymph analysis ....................................................................................................... 28 2.3.2 RT-PCR ............................................................................................................................ 30 2.3.3 Candidate gene validation ................................................................................................. 31 2.3.4 RT-qPCR for quantification of gene expression ............................................................... 31
2.4 Discussion .........................................................................................................................35 2.4.1 Haemolymph characteristics ............................................................................................. 35 2.4.2 Tissue specific expression and neofunctionalisation ......................................................... 36
2.5 Conclusion ........................................................................................................................37
Chapter 3: Functional annotation of the Ctenoides ales transcriptome .............. 39
3.1 Background .......................................................................................................................39
Table of contents iv
3.2 Materials and Methods ......................................................................................................41 3.2.1 Sample collection .............................................................................................................. 41 3.2.2 RNA extraction, library preparation and sequencing ........................................................ 41 3.2.3 Transcriptome assembly and validation ............................................................................ 44 3.2.4 Functional annotation of transcripts and mapping ............................................................ 44 3.2.5 Comparative transcriptomics ............................................................................................ 46 3.2.6 Candidate genes identification .......................................................................................... 47 3.2.7 Primer design and candidate gene validation .................................................................... 47 3.2.8 Phylogenetic analysis of sequences .................................................................................. 48
3.3 Results ...............................................................................................................................48 3.3.1 RNA extraction, library preparation and sequencing ........................................................ 48 3.3.2 Transcriptome assembly and validation ............................................................................ 50 3.3.3 Functional annotation of transcripts and mapping ............................................................ 51 3.3.4 Comparative transcriptomics ............................................................................................ 52 3.3.5 Candidate genes ................................................................................................................ 53 3.3.6 Primer design and candidate gene validation .................................................................... 55 3.3.7 Phylogeny of globins in bivalves ...................................................................................... 56
3.4 Discussion .........................................................................................................................58
3.5 Conclusion ........................................................................................................................60
Chapter 4: General Discussion ............................................................................... 61
4.1 Role of gene duplication in the current diversity of bivalve Hbs ......................................61
4.2 Globin gene evolution in hypoxic environments ..............................................................64
4.3 The importance of globin gene evolution to aquaculture ..................................................66
4.4 Limitations of the study ....................................................................................................67
4.5 Future research ..................................................................................................................67
4.6 Conclusions .......................................................................................................................67
References… ............................................................................................................. 69
Appendices.. .............................................................................................................. 98
Appendix A: Poster presented at the annual Lorne Genome conference 2015 ................... 98
Table of contents v
List of Figures
Figure 1.1. Maximum likelihood tree highlighting relationships between Hbs of jawed (gnathostomes) and jawless (agnathans ) vertebrates. In this phylogeny vertebrate-specific globins are grouped into two distinct clades: (i) Cyclostome Hbs + Cygb + Mb + GbE + GbY, (ii) β-Hbs + α-Hbs (Hoffmann et al., 2010a). ................................................................................................... 5
Figure 1.2. Comparison of chromosomal organization of α and β globin gene clusters in avian and mammalian taxa. The α and β globin genes represented here encode the α and β subunits of a tetrameric haemoglobin (α2β2) (Zhang et al., 2014). .......................... 8
Figure 1.3. This simplified phylogeny represents evolutionary relationships between major metazoan taxa, some lesser known phyla are not included for simplicity or due to unclear relationships. Taxa in which Hbs have been found are boxed in red. This phylogeny illustrates the independent evolution of Hbs through their presence in 11 major phyla shown here : Arthropods, Nematodes, Nemertines, Mollusks, Annelids, Echiurans, Pogonophorans, Phoronids, Playelminthes, Echinoderms and Chordates. Adapted from (Halanych & Passamaneck, 2001). ........... 11
Figure 1.4 Gene structure from two Hbs found in annelid worms (Branchipolynoe spp.). This diagram illustrates the exon-intron structure and domain architecture for the single-domain (top) and tetra domain (bottom) globins found in these species (Projecto-Garcia et al., 2010). ............................................................. 12
Figure 1.5 Phylogenetic classification of class Bivalvia of molluscs. The main bivalve subclasses are represented here in different colours: Protobranchia, Pteriomorpha, Palaeoheterodonta, Archiheterodonta, Anomalodesmata and Imparidenta. This basic phylogeny demonstrates the distribution of Hb (indicated as Hb following species and order name) and Hc (indicated as Hc) among bivalve taxa and the order of the species in which those respiratory pigments are found is indicated in grey brackets: Solemyoida, Nuculanoida, Arcoida, Carditoida and Veneroida. Adapted from (Gonzáles et al., 2015). ............................ 15
Figure 1.6. Quaternary assembly of Homo sapiens and Scapharca inaequivalvis Hbs. (a) HbA refers to adult Hb in H. sapiens; (b) HbII refers to hetero-tetrameric arcid Hb in S.inaequivalis and (c) HbI refers to homo-dimeric arcid Hb in S.inaequivalis. For each Hb structure represented here, haeme groups are shown in red, α (alpha) subunits are shown in dark grey, β (beta) subunits are shown in light grey and E and F helices are shown in blue. This illustrates the diversity of structures found in arcid Hbs with a heterotetramer (b) and a homodimer (c), both found
List of figures vi
in S.inaequivalis. It also shows the back to front assembly of the arcid Hbs with E and F helices arranging on the outside of the molecule compare to human Hb. (Ronda et al., 2013). ................ 17
Figure 2.1 Photograph of an A. trapezia specimen illustrating the anatomical position of all five tissues used to assess tissue specific expression of Hb genes in this study. .................................... 24
Figure 2.2 Summary of haemolymph analysis results. This data was obtained by testing a few microliters of fresh haemolymph using and Abbott i-STAT blood analyser. This was done for all haemolymph samples across two conditions of experiment and two timepoints: water 6 h, air 6 h, water 12 h and air 12 h. A) pH measurements, B) percentage of O2 saturation, C) partial pressure of CO2. Significant differences were assessed through ANOVA testing using the program SPSS with pH values, percentage of O2 saturation values and pCO2 values as dependent variables respectively. Condition was used as a factor for each test and results were considered statistically significant at p < 0.05. Statistically different groups are represented by the symbols (*) and (•) in each graph and error bars represent 2 standard errors around the mean. .............................. 29
Figure 2.3. PCR products amplified from two mantle samples to validate candidate Hb genes. Amplified PCR products obtained here were purified using the Bioline isolate PCR purification kit followed by cloning using the Promega pGEM-T and pGEM-T easy vector systems quick protocol. Samples were then sequenced on the ABI Genetic Analyzer 3500 in duplicate. Each gel of this figure represents one mantle sample. For both gels A (sample 1) and B (sample 2): lane 1 contains 100 bp Hyperladder, lane 2 through to 7 contain amplified products for 2D, AG, BG, HB, HD and 18S genes respectively. ......................................................................................... 31
Figure 2.4. Relative expression ratios of candidate globin genes amplified using RT-qPCR. Candidate genes amplified in each tissue from the bivalve A.trapezia are represented as follows: 2D, AG, BG and HB. Relative quantification analysis was performed using the ∆∆CT method. Relative expression is expressed as a ratio of levels of target sequences to levels of reference sequences with 18S used here as a housekeeping gene. Significant differences were assessed through ANOVA testing using the program SPSS and differences were considered statistically significant at p < 0.05. Significantly different groups are shown here with an asterix (*). Ratio values were used as a dependent variable against tissue type for each gene. Error bars represent 2 standard errors around the mean. ..................... 32
Figure 2.5. Examples of amplification curves obtained for target genes (2D, AG, BG, HB) and housekeeping gene 18S in each tissue type. Target amplification curves are represented in orange (left),
List of figures vii
negative controls in green (left) and reference amplification curves in blue (right). RT-qPCR was performed using a Lightcycler® measuring specific fluorescence at each cycle. All quantitative PCR analyses were repeated in three technical replicates along with negative controls and 18S as a housekeeping gene. ............................................................................. 33
Figure 2.6. Relative expression ratios of candidate globin genes amplified using RT-qPCR. Candidate genes amplified in each tissue from the bivalve A.trapezia are represented as follows: haemolymph, foot, gills, mantle and muscle. Relative quantification analysis was performed using the ∆∆CT method. Relative expression is expressed as a ratio of levels of target sequences to levels of reference sequences with 18S used here as a housekeeping gene. Significant differences were assessed through ANOVA testing using the program SPSS with ratio values as a dependent variable and condition as a factor in each tissue. Differences were considered statistically significant at p < 0.05 and no statistical differences were found here. Error bars represent 2 standard errors around the mean. .............................. 34
Figure 3.1. Workflow overview of functional annotation of the C. ales transcriptome using the Trinotate annotation pipeline. This is a comprehensive annotation suite based on homology searches to known sequence data. Contigs were first used as BASLTx queries against the TrEMBL and Swiss-Prot databases (stringency E-value of 1 x 10-6). TransDecoder was used to generate a predicted proteome then used as a BLASTp query against the TrEMBL and Swiss-Prot databases. SignalP was used to predict the presence and location of signal peptides and Pfam to determine the presence and position of protein domains. All results were uploaded to the SQLite database and an annotation report was generated. Adapted from van der Burg et al. (2016). ............................................................................... 46
Figure 3.2. Whole tissue from one C.ales individual was sampled and used for RNA extraction. This 1.5 % agarose gel electrophoresis shows RNA extracted from four samples (extraction was performed in quadruplicate) from this individual as follows: lane 1 contains 100 bp Hyperladder, lanes 2-5 contain samples 1-4 respectively. The RNA is visible as strong bands in each sample around the 700 bp mark. RNA obtained was then assessed for quantity and integrity and sequencing libraries were prepared. ..................................................................................... 49
Figure 3.3. Bioanalyser results for total RNA quality and quantity. Total RNA samples obtained from whole tissue of one C.ales individual were assessed for quantity and integrity on a Bioanalyzer 2100 RNA nano chip. Results shown here are for two RNA samples labelled here C10 and C11. RNA concentrations are given for each sample............................................ 50
List of figures viii
Figure 3.4. WEGO output for newly generated transcript for C. ales. Web gene Annotation Plotting was used to characterise the transcriptomic data obtained for C. ales. This figure represents the proportion and number of transcripts assigned Gene Ontology (GO) terms in three different gene ontology categories developed to represent common and basic biological information: cellular component (CC), molecular function (MF) and biological process (BP). ...................................................... 52
Figure 3.5. Venn diagram illustrating the number of gene clusters shared and unique between the four bivalve species C. gigas, L. gigantea, A. trapezia and C. ales. Orthologous gene clusters for C. ales, A. trapezia and the two model species L. gigantea and C. gigas are annotated and compared here using the program OrthoVenn. For L. gigantea and C. gigas, the predicted proteomes obtained from the genomes are used and the predicted proteomes from whole organism transcriptomes are used for A. trapezia and C. ales. Ortholog groups shown here were identified by an all-against-all reciprocal BLASTp alignment. ............................................................................................ 53
Figure 3.6. Globin domains identified for three candidate globin genes. The annotated transcriptome obtained for C.ales was interrogated for candidate globin genes here defined as contigs that possess a characteristic globin fold sequence and globin domain. The three contigs found to possess these characteristics are shown above as follows: CalesGl1 (top), CalesGl2 (middle) and CalesGl3 (bottom). For candidate gene 2, the purple circle represents a signal peptide and for candidate gene 3, the blue rectangle represents a transmembrane domain. Both were identified using SMART searches for homologous Pfam domains, signal peptides and internal repeats. ................................................................................... 54
Figure 3.7. Candidate gene products from C. ales transcriptome. Each candidate was amplified using primers as shown in table 3.2 above. Lanes 1 in all gels displayed here are Hyperladder 100 bp. Candidate gene 1 is shown in lane 2 of gel (A); candidate gene 2 is shown in lane 3 of gel (B); candidate gene 3 is shown in lane three of gel (C). All gels are 1.5 % agarose stained with GelRedTM (Biotum). Products of candidate genes seen here were purified using an Ethanol/EDTA precipitation protocol and samples were sequences on ABI Genetic Analyzer 3500 (ThermoFisher). ................................................................................... 55
Figure 3.8. Molecular Phylogenetic analyses by Maximum Likelihood method. Phylogenetic relationships of globin genes were here resolved for the following species: bivalves A. trapezia and C. ales, two mollusc model species L. gigantea and C. gigas, three vertebrate model species H. sapiens, G. gallus and D. rerio. The percentage of trees in which the associated taxa clustered
List of figures ix
together is shown next to the branches. A discrete Gamma distribution was used to model evolutionary rate differences among sites (3 categories (+G, parameter = 5.3023)). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 3.8405 % sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. ................................................................................................ 57
Figure 3.9. Multiple alignments of globin protein sequences. All mollusc globin protein sequences used in the phylogeny represented in Figure 3.8 were aligned for sequence comparison and to identify residue conservation using the multiple sequence alignment with high accuracy and high throughput MUSCLE in MEGA. Sequences are grouped by species and include A. trapezia, C. ales, C. gigas and L. gigantea. Residues conserved across all sequences and all species are indicated above by an asterix (*). ....................................................................... 58
List of figures x
List of Tables
Table 2.1 List of primers designed to amplify products of candidate globin genes identified in the bivalve species A. trapezia using RT-PCR. All primers were designed to amplify the entire ORF of each gene with product sizes between 350 bp and 619 bp. ................. 26
Table 2.2 List of primers designed to amplify products of globin genes identified in the bivalve species A. trapezia using RT–qPCR and determine their expression levels in different tissues. All primers are designed to amplify products with sizes between 109 bp and 148 bp. (Prentis & Pavasovic, 2014). ............................... 27
Table 2.3 RT- PCR results summary for tissue specific expression. All five genes (2D, AG, BG, HB and HD) have been amplified in six biological replicates for each tissue and each condition. To summarise the results, in this table, a score is given out of 6 (total number of biological replicates) for each candidate gene amplified in each tissue and in each condition. An asterix * represents weak bands obtained on gel electrophoresis for at least 2 biological replicates out of 6 in that group. ............................. 30
Table 3.1 Summary of sequencing and assembly data for the bivalve C. ales. Sequencing libraries were prepared using an Illumina TruSeq® stranded RNA library prep kit and the final cDNA library was sequenced on an Illumina NextSeq500 using 150bp paired-end chemistry. Libraries obtained were then assembled and the table below summarises results obtained. ............................... 51
Table 3.2 Primers designed for validation of three candidate genes identified from the newly generated C. ales transcriptome. Primers were designed using Primer3 software to amplify the entire ORFs and validate the candidate genes identified. Forward and reverse primers were designed for each candidate as shown in the table below. ................................................................ 55
List of tables xi
List of Abbreviations
AA- Amino acid
Angb- Androglobin
BLAST- Basic local alignment search tool
CO- Carbon monoxide
Cyg- Cytoglobin
GbE- Globin E
GbX- Globin X
GbY- Globin Y
Hb- Haemoglobin
LUCA- Last universal common ancestor
NO- Nitric oxide
Mb- Myoglobin
NCBI- National Centre for Biotechnology Information
Ngb- Neuroglobin
ORF- Open reading frame
PCR- Polymerase chain reaction
ROS- Reactive Oxygen Species
List of abbreviations xii
Statement of Original Authorship
The work contained in this thesis has not been previously submitted to meet
requirements for an award at this or any other higher education institution. To the
best of my knowledge and belief, the thesis contains no material previously
published or written by another person except where due reference is made.
Signature:
Date: _____June 2017___________
Statement of original authorship xiii
QUT Verified Signature
Acknowledgements
I would firstly like to express my gratitude to my supervisors Dr. Ana Pavasovic
(School of Biomedical Sciences, Queensland University of Technology), Dr. Peter
Prentis (School of Earth, Environmental and Biological Sciences, QUT) and
Professor Louise Hafner (School of Biomedical Sciences, QUT) for their supervision
and guidance throughout my research and writing of this thesis. I would also like to
thank them for their continuous support, patience and encouragement throughout
these challenging two years.
A special thank you to my principal supervisors Dr. Ana Pavasovic and Dr. Peter
Prentis for their flexibility, genuine caring and faith in me along the way.
I would like to thank all my colleagues in the evolutionary and physiological
genomics lab (ePGL), especially Shorash Amin, Hayden Smith, Joachim Surm and
Chloe van de Burg for their ongoing help and support. I would also like to thank
Associate Professor Christopher Collet (School of Biomedical Sciences, QUT) and
Dr. David Hurwood (School of Earth, Environmental and Biological Sciences, QUT)
for reviewing my thesis draft and providing valuable feedback during my final
seminar.
I would like to thank all the staff at the QUT marine lab facility for their helpful
advice regarding care of the marine animals and tanks. I would also like to thank
QUT HPC and QUT MGRF for use of their facilities.
I would like to thank my family for giving me the opportunity to study in Australia
and for their ongoing loving support and encouragement.
Lastly, I would like to thank my partner Tom for his emotional support, endless
patience and encouragement. He has always been a source of strength and has kept
me determined throughout this challenging journey.
Acknowledgements xiv
Chapter 1: Introduction
1.1 Functional and structural diversity within the globin superfamily
The globins are an ancient gene superfamily, thought to have been present in
the last universal common ancestor (LUCA) of the three domains of life (Hoffmann et
al., 2010a). Globin genes encode for small iron metalloproteins, typically ~ 150 amino acids
in length. Globin domains consist of a proximal histidine residue in the F helix for iron
binding and a distal histidine residue on the opposite side of this iron for oxygen binding
(Bashford et al., 1987). Structurally, globin domains consist of eight alpha (α) helical
segments which form the characteristic globin fold in a three-on-three α helical structure with
a central haeme group comprising a proto-porphyrin ring (Perutz, 1979; Blank & Burmester,
2012). It is this haeme prosthetic group that allows globin proteins to reversibly bind oxygen
(O2) and other gaseous ligands (Pesce et al., 2002). For a long time, the metazoan (animal)
globin superfamily was thought to consist of only two globin types, haemoglobin (Hb) and
myoglobin (Mb), but recent research has shown that this superfamily is functionally and
structurally far more diverse than originally thought (Götting & Nikinmaa, 2015).
The globin gene superfamily includes genes encoding Hbs, neuroglobins (Ngbs),
cytoglobins (Cygbs), globin X (GbX), androglobins (Angbs), Mbs, globin E (GbE) and
globin Y (GbY) among others (Hoffmann et al., 2010a; Storz et al., 2013; Opazo et al.,
2015). This diverse group of genes is thought to have evolved from a common ancestral
gene through repeated rounds of gene and genome duplication, some 1.8 billion years ago
(Efstratiadis et al., 1980; Wajcman et al., 2009). This coincides with the accumulation of O2
levels in the atmosphere, suggesting that those globin genes arose as a mechanism to
scavenge toxic O2, carbon monoxide (CO) and nitric oxide (NO) gases (Hardison, 1996;
Koch & Britton, 2008). The repeated rounds of duplication and functional divergence of
duplicated genes have resulted in a large and diverse group of structurally similar
proteins. In this superfamily, GbX and Ngb are some of the more recently described proteins
(Pesce et al., 2002; Burmester & Hankeln, 2004; Roesner et al., 2005) but are thought to be
the most ancestral globin genes found in metazoans. In fact, GbX and Ngb are the only
globin genes found in early divergent phyla such as Cnidaria and Porifera, and predate the
split of deuterostomes and protostomes (Schwarze et al., 2014).
Chapter 1: Introduction 1
1.1.1 Recently discovered globin genes
Neuroglobin and GbX are both expressed in the nervous system of vertebrates, but
unlike Ngb, which is typically found in the cellular cytoplasm, GbX is a membrane bound
protein (Pesce et al., 2002; Burmester & Hankeln, 2014). The functions of both of these
proteins are still uncertain and they have been the focus of numerous studies to elucidate their
physiological function (Hankeln et al., 2005; Burmester & Hankeln, 2009; Wawrowski et al.,
2011). Nonetheless, initial studies speculate that Ngb and GbX largely play a protective role
in the cell (Brunori & Vallone, 2007; Burmester & Hankeln, 2009; Corti et al., 2016) as well
as being involved in cellular signaling processes (Burmester & Hankeln, 2009; Su et al.,
2014). Globin E, another described vertebrate globin (Kugelstadt et al., 2004), is typically
expressed in eye tissue. Its role is thought to be related to O2 supply of retinal cells, which
are metabolically highly active (Blank et al., 2011). Androglobin is an ancient chimeric gene
with high levels of expression in the testes of vertebrates (Hoogewijs et al., 2011), however,
its function remains unresolved (Burmester & Hankeln, 2014). Yet another recently
discovered globin, Cygb, is expressed in a diverse range of tissues and cell types including
epithelium, fibroblast cell lineages, macrophages, neurons and muscle fibers (Oleksiewicz et
al., 2011; Motoyama et al., 2014). Functions of Cygb appear similarly diverse, with roles in
protection from reactive oxygen species and in nitrous oxide metabolism (Pesce et al., 2002;
Singh et al., 2014).
1.1.2 Myoglobins and haemoglobins
Stemming from their early discovery, Mbs and Hbs are the most extensively studied
globins. Most metazoan species possess only a single copy of the Mb gene except in rare
cases of some fish lineages where an increase in copy number (up to seven) is seen (Koch &
Burmester, 2016). Mbs are small monomeric proteins whose expression is principally
restricted to striated muscle tissue where they play a key role in supplying the mitochondria
of myocytes with O2 during periods of hypoxia, also defined as oxygen deficiency in a biotic
environment (Wittenberg & Wittenberg, 2003). This protein is also reported to play an
important role in the decomposition of nitrous oxide during high cellular metabolic activity
(Flögel et al., 2001). Unlike the other globin proteins, Hb principally transports O2 from
respiratory surfaces to the working tissue within an organism.
Haemoglobins in metazoans have been reported in multiple animal phyla as key
oxygen-transport proteins (Weber, 1980; Mangum, 1992; Hardison, 1996; Hourdez et al,
2 Chapter 1: Introduction
2000). Most frequently Hbs are found in the circulatory system of metazoan species, but can
sometimes be confined to specific tissues, such as in the bivalve Yoldia eightsii where Hb is
expressed in the gill tissue (Dewilde et al., 2003). Outside of vertebrates, Hbs exhibit a
remarkable diversity in structure (Terwilliger, 1998; Weber & Vinogradov, 2001). For
example, Hb proteins can occur as monomers, dimers, tetramers or even in polymeric forms
(Weber, 1980). Circulating Hbs may also be found within erythrocytes (intracellular) or
freely dissolved in fluid tissue such as blood or haemolymph (extracellular) (Ching Ming
Chung & Ellerton, 1980). Extracellular Hbs are extremely diverse in subunit size and
quaternary structure, however, they all share the same globin fold as intracellular Hbs
(Negrisolo et al., 2001). Notable structural variations can often be found in invertebrate
species such as the bivalve mollusc Barbatia reeveana where a very large polymeric Hb of
430 kDa composed of 34 kDa di-domain subunits occurs, each containing two covalently
linked functional units for O2 binding (Grinich & Terwilliger, 1980). Other examples of
structural variation include annelid species, such as Riftia pachyptila and Lumbricus
terrestris, which possess Hbs consisting of 24 and 144 subunits respectively (Royer et al.,
2000; Strand et al., 2004; Flores et al., 2005).
Structural variations in Hbs can be observed even among vertebrate lineages,
where jawless vertebrates (agnathans) have weakly cooperative dimers compared to the
canonical tetrameric Hbs of jawed vertebrates (gnathostomes) (Hoffmann et al., 2010a).
By far the best-studied Hb is the tetrameric form found in humans. Structurally, the human
Hb consists of 4 globin subunits (2 alpha (α) and 2 beta (β)), each with their own haeme
group (Pesce et al., 2002; Storz et al., 2013). α and β chains consist of 141 and 146 amino
acids, respectively and are held together by noncovalent interactions to form a heterotetramer
(Antonini & Chiancone, 1977; Hardison et al.,1997).
The genes that encode the different subunits of Hb proteins are found as clusters in the
human genome. For example, the α globin gene cluster is located on chromosome 16 and
includes seven loci (NC_000016.10; 5’ – zeta – pseudozeta – mu – pseudoalpha1 – alpha2 –
alpha1 – theta – 3’) (Vernimmen, 2014) while the β gene cluster is found on chromosome 11
and is comprised of five loci (NC_000011.10; 5’-epsilon – gammaG – gammaA – delta – beta
– 3’) (Moleirinho et al., 2013). The order in which the genes are found within these clusters,
determines the timeline at which they are expressed during development (Hardison et al.,
1997) with those closest to the locus control region at the 5’end, expressed in early
Chapter 1: Introduction 3
development and those further away expressed at a later stage (Hanscombe et al., 1991).
This developmental expression has been associated with varying oxygen levels throughout
embryonic and foetal development (Efstratiadis et al., 1980) and demonstrates that oxygen
availability can influence the expression and evolution of Hbs.
1.2 Evolution of haemoglobins
Haemoglobins are among the most extensively studied proteins, yet our understanding of
their evolutionary history is becoming increasingly unclear. A recent phylogenetic analysis of
the vertebrate Hbs by Hoffmann et al., (2010a) suggests that Hbs in gnathostomes and
agnathans have evolved in each lineage independently. It is also likely that these Hb genes
have evolved from different ancestral genes based on the placement of gnathostome and
agnathan Hbs in two distinct clades (Figure 1.1) (Hoffmann et al., 2010a). Findings such as
these are in contrast to the prevailing paradigm that the current vertebrate Hbs have evolved
from an ancestral Mb gene. In invertebrates, the evolution of Hb genes remains even less
clear than in vertebrates, largely due to insufficient genomic resources for the groups that
possess Hbs.
The current paradigm to explain the diversity of vertebrate globin genes largely involves
gene and whole genome duplication events, mutation and natural selection acting to promote
evolutionary innovation in this gene family. Both, the two rounds of whole genome
duplication (2R hypothesis; (Hokamp et al., 2003)) that preceded the evolution of vertebrates
and further tandem duplications provided the raw substrate for the evolution and
diversification in the vertebrate globin genes seen today. Divergent and convergent evolution
have played major roles in the evolution of new functions observed in duplicated vertebrate
globin genes.
4 Chapter 1: Introduction
Figure 1.1. Maximum likelihood tree highlighting relationships between Hbs of jawed (gnathostomes) and
jawless (agnathans ) vertebrates. In this phylogeny vertebrate-specific globins are grouped into two distinct
clades: (i) Cyclostome Hbs + Cygb + Mb + GbE + GbY, (ii) β-Hbs + α-Hbs (Hoffmann et al., 2010a).
1.2.1 Gene duplication
The role of both gene duplication and whole-genome duplication is now widely
accepted as a key driver of evolution as it is the most frequent mechanism responsible for
the generation of genes with new functions (Ohno et al., 1968; Cañestro et al., 2013). It is
through the accumulation of mutations and repeated rounds of duplication of existing genes
that new functions are acquired (Zhang, 2003). Both whole-genome and gene duplication
have promoted evolutionary innovation and led to the current diversity within the Hb gene
family (Storz et al., 2013). Gene duplication can occur through five mechanisms: (i) the
Chapter 1: Introduction 5
unequal crossing-over between two sister chromatids of one chromosome or (ii) between
two homologous chromosomes during replication, (iii) regional redundant duplication of
DNA molecules (segmental duplication), (iv) polyploidization or (v) retrotransposition
(Ohno et al., 1968 ; Zhang, 2003). Unequal crossing over events can also be explained as
recombination between DNA sequences at different sites on sister chromatids or
homologous chromosomes. This results in a decrease in gene copy number on one
chromosome or one chromatid and an increase on the other (Roeder, 1983). The variations
in globin gene copy numbers have often been associated with this type of recombination
(Goossens et al., 1980; Higgs et al., 1980; Trent et al., 1981; Roeder, 1983; Hurles, 2004).
Vertebrate Hb gene clusters, described previously, provide classic examples of evolution
through gene duplication. In 1961, Ingram suggested that polyploidization initiated two
duplication events from Mb to the primordial Hb α gene which then produced the Hb β-like
gene. While this is an overly simplistic explanation for the diversity found within the globin
gene family, it provided an early plausible hypothesis that gene duplication was a major
driver of the evolution of the vertebrate globin gene family.
Gene duplication is known to have four possible outcomes (Zhang, 2003; Cañestro
et al., 2013). One possible outcome is dosage repetition, where limited sequence evolution
is seen, and gene function is conserved (Zhang, 2003; Cañestro et al., 2013). Another
outcome following a duplication event may be pseudogenisation of one duplicated copy
through the accumulation of mutations creating a non-functional gene (pseudogene) (Force
et al., 1999; Zhang, 2003; Bischof et al., 2006). This may result in a duplicate transcribed
into RNA but not translated. Neofunctionalisation represents another possible outcome of
gene duplication whereby accumulation of mutations in one of the copies may lead to new
functions (Innan & Kondrashov, 2010). Lastly, subfunctionalisation is a sub-category of
neofunctionalisation where the function remains the same but changes in regulatory
elements that control expression of the gene may lead to different copies of the duplicate
being expressed in different tissues (Lynch & Force, 2000; Huerta-Cepas et al., 2011).
Neofunctionalisation and subfunctionalisation are largely the result of divergent selection
acting on newly formed duplicates. Mutations in DNA coding regions will lead to changes
in the amino acid sequence and affect protein-protein interactions which are the foundation
of cellular molecular function therefore giving rise to new functions for those proteins
(Wray et al., 2003).
6 Chapter 1: Introduction
In vertebrates, repeated rounds of gene and whole genome duplication events have
led to copy number variation across the eight different classes of globin genes, as listed in
section 1.1. For example, in fish Mb copy number variation can range from none (ice
fishes: Chaenocephalus aceratus, Pseudochaenichthys georgianus and stickleback:
Gasterosteus aculeatus) (Sidell & O’Brien, 2006; Hoffmann et al., 2011) to seven copies
(lungfish: Protopterus annectens) (Koch et al., 2016). Pseudogenisation has led to attrition
in some of these globin classes in certain vertebrate lineages. Specifically, the GbY is
absent in birds, marsupials and placental mammals, while it is present in monotremes and
most other vertebrate lineages with the exception of lampreys and ray-finned fishes
(Burmester & Hankeln, 2014). Similarly, this process of gene loss is responsible for the
absence of Mb in a number of amphibian species (e.g., anurans (frogs)) (Maeda & Fitch,
1982; Fuchs et al., 2006). In the vertebrates, the formation of the α globin gene cluster is a
consequence of tandem duplication leading to functional copy number variation, such as two
in green anole (Anolis carolinensis) (Hoffmann et al., 2010b), three in chicken (Gallus
gallus) (Reitman et al., 1993) and four in platypus (Ornithorhynchus anatinus) (Opazo et
al., 2008). A recent systematic analysis of 22 avian and 22 mammalian genomes revealed a
significant conservation in copy number within the avian α globin gene cluster (2-3 copies)
but substantial variation among the surveyed mammalian lineages (2-8 copies)(Figure 1.2)
(Zhang et al., 2014). In addition to this duplication of α globin genes within the mammalian
lineages, there is strong evidence to suggest that mutations which have accumulated in some
duplicated copies may have led to the evolution of novel functions (He & Zhang, 2005).
This process of neofunctionalisation is often a result of divergent natural selection acting on
new genetic variation in duplicated gene copies.
Chapter 1: Introduction 7
Figure 1.2. Comparison of chromosomal organization of α and β globin gene clusters in avian and mammalian
taxa. The α and β globin genes represented here encode the α and β subunits of a tetrameric haemoglobin (α2β2)
(Zhang et al., 2014).
1.2.2 Divergent evolution of duplicated genes
Divergent evolution of duplicated genes is the process in which genes of a similar
function diverge from each other following duplication from a common ancestral gene
(Bikard et al., 2009). An example of divergent evolution in the globin superfamily is
s een in a number of vertebrate Hb proteins that undertake slightly different but similar
functions. These Hb proteins are encoded by genes which have evolved from the same
ancestral gene but exhibit differences in their ontogenetic timing of expression and
functional properties, such as affinity for oxygen (Gribaldo et al., 2003). In fact, the
differences in O2 scavenging and O2 transport roles of embryonic Hb versus adult Hb are
attributable to amino acid replacements (non-synonymous mutations) in the zeta/epsilon and
beta/alpha genes expressed in the embryonic and adult individual, respectively (Goodman et
al., 1987). Another example of divergent evolution driving evolutionary innovation in Hb
proteins can be seen in the avian lineage. In this instance, adult birds express two
functionally distinct Hb isoforms, HbA and HbD (Grispo et al., 2012). While the β globin
chains in both isoforms are identical, two functionally distinct α chains, one encoded by the
alphaA globin gene and the other by the alphaD globin gene are found in the HbA and HbD
8 Chapter 1: Introduction
isoforms, respectively. The alphaA and alphaD globin genes share a common ancestral gene,
but possess amino acid replacements that substantially alter O2 affinity in the presence of
allosteric modulators (Storz et al., 2015). Other examples of divergent evolution include the
expression differences found in GbX paralogs (orthologous genes, which have diversified
through duplication within the species) in the elephant shark (Callorhinchus milii) (Opazo et
al., 2015). In this example, GbX1 is expressed across a wide range of tissues, while the
expression of GbX2 is primarily restricted to the gonads. Taken together these examples
highlight the process of divergent evolution operating on non-synonymous mutations in
duplicated globin genes, but does not, however, explain the independent evolution of Hb in a
number of metazoan lineages such as vertebrates, bivalves and annelids (Wray et al., 1996).
1.2.3 Convergent evolution
Convergent evolution is the independent evolution of the same function or
phenotype in different lineages from different ancestral genes (Hoffmann et al., 2010b).
As a consequence, it can lead to analogous molecules with similar functions in unrelated
lineages (Hoffmann et al., 2010b; Burmester & Hankeln, 2014; Opazo et al., 2015). Such
adaptive phenotypic convergence appearing in unrelated taxa is partly attributed to similar
selection pressures operating on duplicated genes and this process is thought to be
widespread in nature (Parker et al., 2013). An important example of convergent evolution
is observed in the independent evolution of electric organs in numerous fish species, where
the myogenic electric organ produces electrical currents for the purposes of communication
and is believed to have evolved multiple times (Gallant et al., 2014). Similarly,
echolocation in Cetacea (whales and dolphins) and Chiroptera (bats) has also been attributed
to phenotypic convergence (Liu et al., 2010). Genome-wide analysis of cetaceans and two
different bat lineages (Yinpterochiroptera and Yangochiroptera) have shown extensive
convergent changes in nearly 200 loci containing coding sequences involved in echolocation
(Parker et al., 2013). Functionally different Hb proteins occurring within erythrocytes of
agnathan (jawless) and gnathostome (jawed) vertebrates present another example of
convergence (Hoffman et al., 2010a). In these two disparate lineages, phylogenetic
analyses have determined that the Hbs have not evolved from orthologous genes and are
structurally distinct, reflected in a weakly cooperative dimeric Hb form produced in
agnathans and a cooperative tetrameric Hb form found in gnathostomes (Figure 1.1)
(Hoffmann et al., 2010a; Schwarze et al., 2014). In addition to evidence of convergent
Chapter 1: Introduction 9
evolution of Hb among vertebrates, evidence of convergence between vertebrate and
invertebrate Hb proteins is also observed. A specific example is the repeated evolution of
intracellular tetrameric Hb in some bivalve mollusc species and vertebrates (O’Gower &
Nicol, 1968). In invertebrates, a variety of Hbs and Mbs have also been observed (Weber
& Vinogradov, 2001) and are believed to be the result of globin proteins emerging several
times convergently from different ancestral globin genes (Blank & Burmester, 2012;
Schwarze et al., 2014).
1.2.4 Invertebrate haemoglobins
From primary sequence to secondary, tertiary and quaternary structures, a large
diversity exists among invertebrate Hbs. This variability is thought to represent
specialization of Hb molecules to the wide range of environments they inhabit (Terwilliger,
1998; Weber & Vinogradov, 2001; Alyakrinskaya, 2002; Gow et al., 2005). Structurally,
invertebrate Hbs also retain the globin fold (Gow et al., 2005) and exhibit a crystal structure
that consists of at least six α-helices (Weber & Vinogradov, 2001). Haemoglobin in
invertebrates has evolved independently at least 11 times (Figure 1.3) (Weber & Vinogradov,
2001), however, it is likely that this number may increase as more data becomes available for
many understudied invertebrate lineages.
10 Chapter 1: Introduction
Figure 1.3. This simplified phylogeny represents evolutionary relationships between major metazoan taxa, some
lesser known phyla are not included for simplicity or due to unclear relationships. Taxa in which Hbs have been
found are boxed in red. This phylogeny illustrates the independent evolution of Hbs through their presence in
11 major phyla shown here : Arthropods, Nematodes, Nemertines, Mollusks, Annelids, Echiurans,
Pogonophorans, Phoronids, Playelminthes, Echinoderms and Chordates. Adapted from (Halanych &
Passamaneck, 2001).
The diversity in Hb proteins is a result of sequence diversity seen in genes encoding
invertebrate Hbs. Within invertebrates; nematodes, annelids, arthropods and molluscs all
possess genes encoding Hb proteins with more than a single functional globin domain which
has resulted in a diversity of functional proteins (Natarajan et al., 2015). The brine shrimp
(Artemia), for instance, expresses two functional Hb genes (an α and β subunit type), each of
which consists of nine tandem globin domains. The association of the polypeptide chains
encoded by these two genes results in the expression of three Hb proteins; a hetero-dimer (Hb
II) and two homodimers (HbI and HbIII) (Manning et al., 1990). Structurally, HbII is
characterised by α and β subunit types, while HbI and HbIII are composed of α and β subunit
types, respectively (Manning et al., 1990). Interestingly, in addition to possessing multi-
domain Hbs, production of multiple types of Hbs appears common among invertebrates. For
example, in the annelid worms (Branchipolynoe symmytilida and B. seepensis) a single
domain globin gene (SD) encodes a 137 amino acid Hb protein, while a tetradomain gene
(TD) encodes a Hb protein of 552 amino acids (Projecto-Garcia et al., 2010). Figure 1.4
Chapter 1: Introduction 11
illustrates the exon-intron structure and domain architecture of the two Hbs from
Branchipolynoe spp. Similarly, in nematodes, there is a diversity of globins including single-
domain and di-domain Hbs (Projecto-Garcia et al., 2015). It is notable that despite this
diversity, the di-domain globin genes appear to have a restricted distribution only in two
parasitic species (Ascaris suum and Pseudoterranova decipiens) (Darawshe et al., 1987;
Dixon et al., 1991). These di-domain globin genes from nematodes encode a large polymeric
multi-domain Hb (octamer consisting of eight two domain subunits). Overall, based on the
observed diversity, invertebrate Hbs have been classified into four distinct groups according
to their gene and protein subunit structures. These include single domain, single-subunit
Hbs; two-domain, multi-subunit Hbs; multi-domain, multi-subunit Hbs and single-domain,
multi-subunit Hbs (Vinogradov, 1985).
Figure 1.4 Gene structure from two Hbs found in annelid worms (Branchipolynoe spp.). This diagram
illustrates the exon-intron structure and domain architecture for the single-domain (top) and tetra domain
(bottom) globins found in these species (Projecto-Garcia et al., 2010).
1.3 Bivalve haemoglobins
Similar to the diversification patterns seen in other invertebrate lineages, bivalve
molluscs show strikingly diverse patterns of Hb distribution and evolution. This is of interest
as bivalves typically do not possess a functional respiratory pigment and those that do were
previously thought to utilise a copper-based respiratory pigment known as haemocyanin (Hc),
with Hb expression considered rare (O’Gower & Nicol, 1968; Terwilliger, 1998). Species
that contain Hc are restricted to the earliest branching lineage of bivalves and it has been
suggested that Hc is the ancestral oxygen-transport protein in class Bivalvia. It has been
shown, however, that Hb is more widely distributed in bivalve molluscs than previously
12 Chapter 1: Introduction
thought (Manwell, 1963; Smith, 1967; Terwilliger, 1998; Alyakrinskaya, 2002; Wajcman et
al., 2009).
Currently, four independent origins of Hbs have been hypothesised in bivalves
(Figure 1.5). The first instance is thought to have occurred in the primitive bivalve lineage
containing Solemya velum, which possesses several different types of Hbs expressed
predominately in gill tissue (Dando, 1985; Doeller et al., 1988; Torres-Mercado et al., 2003).
In Lucina pectinata, also found in this lineage, three Hbs are expressed; HbI which is a
sulphide reactive protein and HbII and III, which are O2 reactive globins (Montes-Rodriguez
et al., 2016). A second instance has occurred in the bivalve family Arcidae where species
contain multiple forms of Hb in circulating erythrocytes (Mangum, 1997) represented in
Figure 1.5 by Arca noae. Haemoglobin diversity of Arcidae will be discussed in more detail
in the following section as it constitutes one of the questions addressed by this thesis. The
third independent origin of bivalve Hb is found in the deep-sea clam genus Calyptogena.
Species within this genus generally present with two homo-dimeric Hb proteins (HbI and
HbII) in erythrocytes but C. nautilei possess two monomeric Hbs (HbIII and HbIV) which
show low sequence identity to HbI and HbII (Kawano et al., 2003). The fourth instance is an
extracellular Hb found in the heterodont clams Cardita borealis and C. floridana (Manwell,
1963; Terwilliger et al., 1978). Figure 1.5 depicts the basic bivalve phylogeny with the
distribution of Hc and Hb among its taxa. While these four cases listed above provide
compelling evidence for the independent evolution of Hb in multiple bivalve lineages, it is
not known whether Hb has evolved in other lineages where O2 transport proteins have not
been examined.
One lineage, in particular, that has not been extensively examined for the presence of
Hb proteins are the members of the family Limidae. This is significant as there is some
evidence to indicate that some species in this family may also possess Hb proteins. This
evidence, however, is principally restricted to a single report of this observation (Rawat,
2010) with no molecular or biochemical studies reported to date. It is plausible to
hypothesise that Limidae indeed possess Hb proteins based on their high metabolic rate
associated with the ability to swim and the red pigmentation of their tissues (Baldwin & Lee,
1978; Harper & Skelton, 1993; Mikkelsen & Bieler, 2003). If present, Hb could represent an
important physiological advantage for a more efficient oxygen delivery system to respiring
tissues during swimming. Consequently, if Hb were found in Limidae, then this would
Chapter 1: Introduction 13
constitute a fifth independent origin of Hb in bivalve molluscs. This represents a key
knowledge gap which will be addressed in this thesis.
14 Chapter 1: Introduction
Figure 1.5 Phylogenetic classification of class Bivalvia of molluscs. The main bivalve subclasses are
represented here in different colours: Protobranchia, Pteriomorpha, Palaeoheterodonta, Archiheterodonta,
Anomalodesmata and Imparidenta. This basic phylogeny demonstrates the distribution of Hb (indicated as Hb
following species and order name) and Hc (indicated as Hc) among bivalve taxa and the order of the species in
which those respiratory pigments are found is indicated in grey brackets: Solemyoida, Nuculanoida, Arcoida,
Carditoida and Veneroida. Adapted from (Gonzáles et al., 2015).
Chapter 1: Introduction 15
1.3.1 Haemoglobins in the family Arcidae, Pteriomorphia subclass
The bivalve family Arcidae, also referred to as the blood clams, are unusual in
comparison to most bivalve species as they possess haemolymph which has a deep red
coloration attributed to its circulating erythrocytes (Mangum, 1998). These erythrocytes
contain multiple Hbs; which are often dimeric (~ 32 kDa) and tetrameric (~ 65 kDa) (Como &
Thompson, 1980b; Furuta & Kajita, 1983; Suzuki et al., 2000), with the exception of a 430 kDa
polymeric Hb found in some members of the Barbatia genus, such as B. reeveana and B. lima
(Grinich & Terwilliger, 1980; Suzuki & Arita, 1995). Dimeric Hbs in the Arcidae exhibit further
structural complexity and can be homo-dimeric or hetero-dimeric (Furuta & Kajita, 1983; Mann
et al., 1986; Suzuki et al., 1992). In some species, such as blood cockles (Anadara trapezia and
Scapharca inaequivalvis), multiple homo-dimeric Hbs have been found circulating in
erythrocytes at the same time (O’Gower & Nicol, 1968; Fisher et al., 1984; Ronda et al., 2013).
The tetrameric Hbs in Arcidae are formed from two different subunits in an alpha2beta2
(α2β2) arrangement, analogous to the vertebrate Hb (Ronda et al., 2013). The α and β globin
chains in these species range from ~ 145 to 165 amino acids in length (Mann et al., 1986; Suzuki
et al., 1996). Both the dimeric and tetrameric Hbs show different pairing of their helices to the
canonical arrangement seen in vertebrate Hbs (Ronda et al., 2013). Specifically, the E and F
helices show subunit pairing, with the G and H helices on the outside of the quaternary structure,
creating a back-to-front formation when compared to vertebrate Hb (Vinogradov, 1985). Figure
1.6 illustrates the comparison of the E and F helical arrangement of the human and S.
inaequivalvis Hb. This back-to-front structural arrangement is common to all invertebrate Hbs
for which crystal structures have been resolved (Royer et al., 2005). This structural variation is
also considered to provide evidence to suggest that arcid Hb proteins are a result of convergent
evolution with vertebrate (Royer et al., 2005).
16 Chapter 1: Introduction
Figure 1.6. Quaternary assembly of Homo sapiens and Scapharca inaequivalvis Hbs. (a) HbA refers to adult Hb
in H. sapiens; (b) HbII refers to hetero-tetrameric arcid Hb in S.inaequivalis and (c) HbI refers to homo-dimeric
arcid Hb in S.inaequivalis. For each Hb structure represented here, haeme groups are shown in red, α (alpha)
subunits are shown in dark grey, β (beta) subunits are shown in light grey and E and F helices are shown in blue.
This illustrates the diversity of structures found in arcid Hbs with a heterotetramer (b) and a homodimer (c),
both found in S.inaequivalis. It also shows the back to front assembly of the arcid Hbs with E and F helices
arranging on the outside of the molecule compare to human Hb. (Ronda et al., 2013).
Based on the current literature, protein sequences have been more extensively studied in
comparison to the underlying genetic variation. Currently, most of the information regarding the
genes encoding Hb proteins in Arcidae has been generated in studies which utilised small scale
cDNA sequencing (Suzuki & Arita, 1995; Piro et al., 1996; Suzuki et al., 2000). This
sequencing has demonstrated that the majority of species examined most frequently have two to
three Hb encoding genes. For example, Tegillarca granosa contains three different Hb genes; α
and β encoding genes of the tetrameric Hb and a minor globin gene encoding the homo-dimeric
Hb (Bao et al., 2013a). S.inaequivalvis also expresses three homologous Hb genes (α, β and
minor) (Piro et al., 1996; Piro et al., 1998). Interestingly, these three genes have the same
exon/intron structure seen in vertebrate Hb genes, consisting of three exons and two introns (Piro
et al., 1996). Unlike in the previous two examples, B. lima expresses four distinct Hb genes
which encode a minor homo-dimeric globin (delta (δ)), a hetero-tetramer (α and β) and a
polymeric globin (2D) (Suzuki et al., 1996). The 2D gene is a di-domain globin and is thought
to have resulted from a duplication of delta gene, producing two consecutive domains within
a gene due to the loss of a stop codon (Suzuki et al., 1996). While these studies capture some of
the diversity of Arcidae Hb genes, they are principally based on previous protein evidence. There
Homo sapiens HbA Scapharca inaequivalis HbII Scapharca inaequivalis HbI
Chapter 1: Introduction 17
is a need for a systematic evaluation of diversity at the sequence level of all expressed Hb genes
in this group of organisms.
In arcid bivalves, there is preliminary evidence to suggest that some duplicated Hb
encoding genes may have taken on new functions. For example, in T. granosa molecular
analysis has revealed that some Hb encoding genes may play a role in immune response to
bacterial pathogens (Bao et al., 2013b). In this instance, single nucleotide polymorphisms
identified in α and β Hb encoding genes (HbIIA and HbIIB) showed an association with
resistance to pathogenic bacteria Vibrio parahaemolyticus (Bao et al., 2013b). In addition to
this example of neofunctionalisation, there is also some evidence that duplicated Hb genes in
some bivalves show tissue specific expression (Burmester et al., 2000), although, this has
only been reported for species outside of the Arcidae. Some authors speculate that the
diversity of globin genes in Arcidae may have contributed to the capacity of these organisms
to persist in anoxic or intertidal environments (Terwilliger et al., 1978; Alyakrinskaya, 2002).
While these studies are few and speculative, further investigation is needed to determine the
extent of gene duplication within a single species and if the duplicated genes have undergone
neofunctionalisation. Assessment of expression levels of Hb genes across different tissues
under experimental perturbations (e.g., hypoxic versus normoxic abiotic stress) would allow
identification and quantification of potentially new function in these genes.
A.trapezia was the first arcid species to have its Hb proteins isolated and characterised.
Early work by Nicol & O’Gower (1967) identified the presence of multiple circulating Hbs
including a tetrameric (HbI) with an α2β2 configuration and two homo-dimeric Hbs (HbIIa
and HbIIb). The homo-dimeric forms appear to be allelic variants of the same gene as they are
in Hardy-Weinberg equilibrium within populations (O’Gower & Nicol, 1968; Como &
Thompson, 1980b; Mann et al., 1986). One of the genes that encodes the β subunit of the
tetrameric Hb (HbI) has been fully characterised (At & Eo, 1984), however, the complete gene
sequence for the two homo-dimeric variants and the α subunit of the tetramer have not.
Interestingly, another Hb gene encoding a minor Hb, possibly seen as a ghost band by
O’Gower & Nicol (1968), has been characterised fully in A. trapezia (Titchen et al., 1991).
Based on this protein and DNA evidence, it appears that, at least, four different Hb genes are
present in this species. Despite this significant early work on Hb in A. trapezia, there have
been numerous inconsistencies regarding the nomenclature of the Hb proteins, and
corresponding genes (if at all characterised). In addition to this limitation, it remains unsure
18 Chapter 1: Introduction
whether the DNA and protein sequences reported for A. trapezia represent the complete
repertoire of Hb encoding genes in this species. This makes inferences about the evolution of
Hb in bivalves difficult and highlights the need for a systematic evaluation of the Hb genes in
A. trapezia and arcids in general. Significantly, even less is known about the functional
diversification of this gene family demonstrating the requirement for further functional
genomic interrogation of Hb genes in the Arcidae. This paucity of information represents a
key knowledge gap which will be addressed in this thesis.
1.4 Aims
The overarching aim of this study is to expand our understanding of the evolution of
haemoglobin (Hb) encoding genes, one of the most important gene families in the metazoan
tree of life. Specifically, this thesis aims to address the lack of information regarding the
distribution of Hb genes in bivalves and investigate potential evidence of
neofunctionalisation in duplicated Hb genes of Arcidae bivalves. In order to address these
aims, the thesis will consist of two main objectives each with its own hypothesis. These
include:
• Objective 1
o H1 – Duplicated Hb genes in Anadara trapezia, a member of the Arcidae
family, show tissue specific expression patterns and therefore evidence of
neofunctionalisation.
This hypothesis will be tested by interrogating the recently published whole
organism transcriptome of A. trapezia for all expressed copies of Hb genes and their
expression measured across multiple tissues using quantitative Polymerase Chain Reaction
(qPCR) techniques.
• Objective 2
o H1 – Bivalve Ctenoides ales, a member of the Limidae family, expresses Hb
encoding genes and therefore provides evidence of a fifth independent origin
of Hb in bivalve molluscs.
The hypothesis in objective 2 will be tested by sequencing the expressed portion of
the C. ales’ genome for presence of Hb encoding genes. Whole organism transcriptome will
Chapter 1: Introduction 19
be generated, sequenced utilizing high throughput sequencing techniques and
bioinformatically interrogated for evidence of Hb expression.
1.5 Thesis Outline
Thesis presented here includes four main chapters, consisting of an introductory literature
review (chapter 1), followed by two studies each described within their individual chapter
(chapters 2 and 3). Final section of the thesis is the general discussion (chapter 4).
• Chapter 1
The first chapter is a literature review which highlights the current understanding of
vertebrate and invertebrate haemoglobins (Hb), focusing on their vast structural diversity and
evolution. This section also explores evolution by gene duplication and discusses divergent
and convergent evolution. In addition, it focuses on invertebrate Hbs and more specifically
bivalves in light of the topic for this study.
• Chapter 2
This is the first data chapter which addresses Objective 1. It provides an overview of
duplicated Hb genes in Arcidae bivalves and highlights the knowledge gap in our
understanding of the evolutionary innovation through neofunctionalisation of these genes.
The chapter provides detailed methodological information on the measurement of tissue
specific expression of candidate Hb genes in A. trapezia and interpretation of the results.
Findings of this study are interpreted in detail in discussion.
• Chapter 3
Second data chapter which addresses Objective 2, describes the interrogation of the
transcriptome of C. ales for presence of Hb encoding genes. Methodology section in this
chapter concerns the isolation, sequencing and bioinformatics interrogation of the C. ales
transcriptome. Results present the detailed outcomes of this interrogation and describe the
transcriptional profile this species as well as the relevant candidate genes, followed by
detailed discussion.
20 Chapter 1: Introduction
• Chapter 4
In the fourth chapter, the overall outcomes of the project are discussed in context of the
current body of work relating to evolution of bivalve Hbs. Interrogation of the results is
extended to encompass a discussion on the limitations of the study and potential applications
of the outcomes reported in this thesis.
Chapter 1: Introduction 21
Chapter 2: Tissue specificity and neofunctionalisation of haemoglobin
genes in Anadara trapezia
2.1 Background
Haemoglobins (Hbs) are among the best studied proteins in vertebrates but little is
known about their distribution, function and evolution in invertebrate lineages
(Alyakrinskaya, 2002; Bao et al., 2013a). A number of invertebrate lineages have
independently evolved circulating Hbs as a mechanism to transport O2 for cellular respiration
(Mangum et al., 1975). Invertebrate lineages that have independently evolved circulating
Hbs include some arthropods, some annelids and the Arcid bivalves (Terwilliger, 1998;
Weber & Vinogradov, 2001; Wajcman et al., 2009).
The Arcid bivalves are an interesting group of intertidal mollusc species known as
blood clams. This group displays a diversity of Hb proteins with monomeric, dimeric,
tetrameric and polymeric protein forms (Terwilliger, 1980; Royer et al., 1985; Weber &
Vinogradov, 2001). In fact, the presence of multiple Hb proteins in circulating erythrocytes
was first observed in the Australian blood clam, Anadara trapezia by Nicol & O’Gower
(1967). This study revealed the presence of at least three distinct proteins in this species, two
homodimers and one tetramer, while protein sequencing indicated that at least two
duplication events produced these proteins (Suzuki et al., 1996). Having such a variety of Hb
genes may allow these organisms to survive in the intertidal zone and endure long periods
submerged or exposed to air. A clinal pattern of allele frequencies has also been observed for
the homo-dimer Hb gene in A. trapezia, which the authors hypothesised was associated with
changes in environmental variables including temperature and salinity (O’Gower & Nicol,
1968).
The large number and variation in blood clam Hb proteins is thought to be the result
of repeated rounds of gene duplications but the exact function of these diverse Hb proteins
remains unclear. Current data suggests that blood clam Hbs evolved from an ancestral
mollusc globin gene which, following the divergence of the blood clam ancestor underwent
repeated rounds of lineage specific gene duplication in different blood clam lineages (Riggs,
1991). More recent transcriptome data for A. trapezia (Prentis & Pavasovic, 2014) has
22 Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia
identified seven full length and genetically distinct Hb genes in this single species and
indicates that gene duplication may have been more extensive than previously reported.
The Hbs used in this study are HbI, a tetramer made up of α and β subunits encoded
by the genes AG (alpha globin) and BG (beta globin); HbII, a homo-dimer molecule encoded
by the HD gene, a hetero-dimer Hb encoded by the gene HB and a putative dimer Hb
encoded by the gene 2D. These genes were used for this study as they encode structurally
distinct proteins including a homo-dimeric and tetrameric Hb proteins. It is postulated that
these different Hbs may show distinct patterns of tissue specific or environment specific gene
expression. Consequently, differences in their patterns of expression were tested across five
tissues, that could be dissected distinctly and where Hbs where most likely to play a role,
(haemolymph, foot, gills, mantle and muscle) in four different experimental treatments to
determine if they showed tissue specific or environment specific gene expression.
2.2 Material and Methods
2.2.1 Sample acquisition and tissue dissection
Anadara trapezia specimens were collected from the intertidal zone in Wynnum,
Queensland, Australia (27°26'08.7"S 153°10'25.0"E) and transferred to holding tanks at
QUT Marine Lab facility until required for experimentation. All animals were collected
under the general fisheries permit number 166312. Ethics approval was not required since
the specimens do not qualify as animals as described by the Queensland Animal Care and
Protection Act 2001 (ACPA). Water conditions during the acclimation period of bivalves
were as follows: water salinity 30-35 ppt, pH 7.9-8.1, temperature 20-25 °C and ammonia
levels (< 0.1 mg/L). A 12 h light/dark cycle was also maintained in the tanks and the animals
were kept unfed for one week prior to the start of the experiment.
To best capture the variation in Hb expression, tissues were extracted from
individual animals submerged in seawater, as well as animals undergoing aerial exposure in
a moist tank (as experienced during periods of regular and prolonged low-tide).
Consequently, animals used for tissue dissection consisted of 12 individuals collected at six
hours (n = 6), six control animals submerged in water and six animals from a moist tank.
This was repeated at 12 hours (n = 6 for both treatments). In total, 24 A. trapezia individuals
had following five specific tissues removed and used for analysis: haemolymph, mantle,
muscle, gills and foot (Figure 2.1). A few microlitres of fresh haemolymph from each animal
Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia 23
was used to conduct analysis to measure pH, pCO2 and sO2 % using Abbott i-STAT blood
analysers. Haemolymph samples were spun at 10,000 rpm for 3 min, supernatant removed
and RNA was extracted immediately. Tissue samples were snap frozen in liquid N2.
Figure 2.1 Photograph of an A. trapezia specimen illustrating the anatomical position of all five tissues used to
assess tissue specific expression of Hb genes in this study.
2.2.2 RNA extraction and cDNA synthesis
RNA was extracted from each tissue separately using Mollusc RNA extraction kit
from Omega Biotek, following its longest protocol. Specifically, tissues were ground in
liquid N2 and the fine powder obtained was transferred into 1.5 mL microcentrifuge tubes.
These were rapidly brought to a fume hood and 350 µL of MRL buffer provided in the kit
was added and vortexed. Three hundred and fifty µL of a 24:1 solution of
chloroform:isopropanol freshly made were then added to each tube, vortexed mix and placed
in a centrifuge for 2 min at 10,000 x g. The upper aqueous phase of each tube was carefully
removed and transferred to clean 1.5 mL microcentrifuge tubes. One volume of isopropanol
was added to each tube with the upper aqueous phase, vortexed and immediately centrifuged
for 2 min at 10,000 x g. Supernatant was carefully aspirated and tubes were briefly inverted
over a paper towel to remove any residual liquid. RB buffer (100 µL) provided in the kit was
then added to each tube, vortexed to re-suspend the pellets and briefly incubated in water
baths at 65 °C to elute RNA. Another 350 µL of RB buffer and 100 µL of 100 % EtOH was
24 Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia
added to each tube and vortexed. The entire samples were then transferred into a HiBind®
RNA (Omega Biotek) mini column inside a 2 mL collection tube for washing and elution.
Samples were first centrifuged for 1 min at 10, 000 x g and filtrates were discarded. Five
hundred µL of RNA Wash Buffer I provided in the kit was added to the mini column and
tubes were centrifuged for 1 min at 10,000 x g. Filtrates were discarded and 500 µL of Wash
Buffer II provided in the kit was added to the mini columns and centrifuged for 1 min at
10,000 x g. Filtrates were discarded and another wash step was done with Wash Buffer II.
All samples were then centrifuged for 2 min at max speed to dry the columns. These were
then transferred to clean 1.5 mL microcentrifuge tubes for elution. Fifty µL of DEPC
(Diethylpyrocarbonate) water provided in the kit was added to each tube and samples were
centrifuged for 2 min at maximum speed and eluted RNA samples were stored at -70 °C.
cDNA synthesis was performed using the SensiFASTTM cDNA Synthesis Kit (Bioline)
following the manufacturer’s protocol. cDNA synthesis was performed in a total volume of
50 µL (20 µL water, 20 µL total RNA, 8 µL 5xTransAmp buffer and 2 µL reverse
transcriptase) with the following cycling conditions: 1 cycle at 25 °C for 10 min, 1 cycle at
42 °C for 60 min, 1 cycle at 85 °C for 5 min and a hold step at 4 °C. The cDNA was used for
all PCR reactions in downstream analysis.
2.2.3 RT–PCR (Real Time PCR)
Five candidate globin genes: putative dimer (2D), α chain (AG), homo-dimer (HB),
hetero-dimer (HD) and β chain (BG) previously identified were amplified in each sample by
using primers designed to amplify the entire Open reading frame (ORF) based on
transcriptomic data from A. trapezia (Prentis & Pavasovic, 2014) (Table 2.1). Polymerase
chain reaction amplification was then performed to determine the presence or absence of
each of the five genes in the different tissue samples using the following conditions: initial
denaturation for 3 min at 94 °C and 30 cycles of 30 sec at 94 °C, 30 sec at 52 °C and 1 min at
72 °C followed by one cycle of 5 min at 72 °C and a hold step at 4 °C. Amplicons were
separated by electrophoresis on a 1.5 % agarose gel and stained with GelRedTM (Biotum).
The intensity of PCR products was determined using an image analysis program assisted by a
gel documentation system (Chemidoc XRS, Bio-Rad). Some of these PCR products were
also used for candidate gene validation through sequencing.
Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia 25
Table 2.1 List of primers designed to amplify products of candidate globin genes identified in the bivalve
species A. trapezia using RT-PCR. All primers were designed to amplify the entire ORF of each gene with
product sizes between 350 bp and 619 bp (Prentis & Pavasovic, 2014).
Gene Primer Name Primer Sequence Product size (bp)
AG (Alpha chain) AGORF CGAGTCCGATTTATTGCTGA 545 AGORR TGTCAAACGAGACAGGTCCA
BG (Beta chain) BGORF TAATGCAGCCTGGACAACAG 501 BGORR TTTGTTGTAGACGCCCTTTG
HB (Homo-dimer) HBORF TTGTCACCTCCAGTCTGTCG 618 HBORR ACGCTACCCTGGTGATTGTC
2D (Putative dimer) 2DORF CGAAACCCAAGTCCATCAT 506 2DORR ATCCCTCACAGAGTGCTGCT
HD (Hetero-dimer)
HDORF ATCTGACGGAAGCAGACG 515 HDORR CGCGAGGTAGTGATATCGAA
2.2.4 Candidate gene validation
Amplicons from two samples were purified using the Bioline isolate PCR purification
kit followed by cloning using the pGEM®-T and pGEM®-T Easy (Promega) vector systems
quick protocol. Duplicate samples for each gene in each biological replicate were selected and
sequenced on the ABI Genetic Analyzer 3500 (ThermoFisher). Sequences obtained were
imported into the Geneious® software version 8.1.6 for visualization and comparison to the de
novo assembled contigs from the A. trapezia annotation report using a pairwise global
alignment. Sequences were aligned to the contig they were designed from to determine
percentage nucleotide similarity.
2.2.5 RT-qPCR (Real Time quantitative PCR) for quantification of gene expression
RT-qPCR was performed on each sample using the One-Step Real Time PCR kit
(Roche) and short primers designed based on past analysis of transcriptomic data from A.
trapezia (Prentis & Pavasovic, 2014) (Table 2.2). These reactions were performed using the
Lightcycler®480 real-time PCR system (Roche), measuring specific fluorescence at each
cycle and quantifying the initial levels of mRNA for each Hb gene in each tissue. The PCR
26 Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia
reactions were performed using the following conditions: 1 cycle of 5 min at 95 °C and 45
cycles of 10 sec at 95 °C, 10 sec at 60 °C and 10 sec at 72 °C. All quantitative PCR
analyses were repeated in three technical replicates to determine the validity of results
along with negative controls for each gene in each sample and 18S as a house keeping control
gene.
Table 2.2 List of primers designed to amplify products of globin genes identified in the bivalve species A.
trapezia using RT–qPCR and determine their expression levels in different tissues. All primers are designed to
amplify products with sizes between 109 bp and 148 bp. (Prentis & Pavasovic, 2014).
Gene Primer Name Primer Sequence Product size (bp)
AG (Alpha chain) AGF TGATGACCCATCCAGATTGA 117
AGR TTCAGGGTCTCCTTCATTGG
BG (Beta chain) BGF TGTCGAGAGCATCGATGAAG 109
BGR AAATTCGCTCGTCTTGGAGA
HB (Homo-dimer) HBF GCTGTCAACCACATCACCAG 139
HBR CCTGGACAACGCCTACAAGT
2D (Putative dimer) 2DF ATCCGACCCATGGAATAACA 114
2DR CGTGCAACATCTTCCAAGTC
HD (Hetero-dimer) HDF AAGGGACATGCCACAACATT 148
HDR CTCCGAGTGCCTGAAATTCT
18S 18F CGGCGACGTATCTTTCAAAT 136
18R CTTGGATGTGGTAGCCGTTT
2.2.6 RT-qPCR data analysis
RT-qPCR data was exported and viewed in the Lightcycler96 software associated with
the instrument. Relative quantification analysis was performed using the analysis function of
the Lightcycler96 software based on the ∆∆CT method. In this method, the housekeeping
gene provides a basis for comparing levels of target sequences to levels of reference
sequences and the final result is expressed as a relative ratio. These relative ratio values were
exported into Microsoft Excel version 14.0 to be converted to a format compatible with the
program SPSS (Statistical Package for the Social Science) used for statistical analysis. In this
program, significance of results was assessed through ANOVA testing followed by Tukey’s
post-hoc test and differences were considered significant at p < 0.05. Firstly, to assess levels
of target genes in each tissue, ratio values were used as a dependent variable against tissue
type for each gene. Secondly, significant differences in expression among conditions (water
or air 6 hours and 12 hours) were assessed in the same way with ratio values as a dependent
Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia 27
variable and condition as factor. Due to very low relative ratio values, the following formula
was applied to each value and plotted in SPSS: (-1/log (relative ratio)).
2.3 Results
2.3.1 Haemolymph analysis
Haemolymph analysis results obtained from iSTAT are summarised in Figure 2.2.
Haemolymph pH measurements ranged from 6.7 to 7.2 across the treatment groups with
significant differences among samples exposed to air for 12 h and samples in water for 6 h
and 12 h. Oxygen saturation measurements ranged from 24 to 89 % with significant
differences (p < 0.05) between the two time points of aerial exposure at 6 h and 12 h. Figure
2.2 shows a drop of nearly 30 % for oxygen bound to Hb molecules among samples kept in
water for 6 h and samples exposed to air for 6 h. Measurements for pCO2 ranged from 8.9 to
20.9. Values in water 6 h, air 6 h and water 12 h were relatively consistent across treatments,
however, measurements for samples exposed to air for 12 h fluctuated significantly with
values between 10.4 and 20.9. Similar values were observed for samples in water for 6 h and
in water for 12 h, but there is a slight increase in pCO2 in samples exposed to air for 6 h and a
large increase samples exposed to air for 12 h, showing a gradual build-up of CO2 in the
haemolymph of those animals.
28 Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia
Figure 2.2 Summary of haemolymph analysis results. This data was obtained by testing a few microliters of
fresh haemolymph using and Abbott i-STAT blood analyser. This was done for all haemolymph samples across
two conditions of experiment and two timepoints: water 6 h, air 6 h, water 12 h and air 12 h. A) pH
measurements, B) percentage of O2 saturation, C) partial pressure of CO2. Significant differences were assessed
through ANOVA testing using the program SPSS with pH values, percentage of O2 saturation values and pCO2
values as dependent variables respectively. Condition was used as a factor for each test and results were
considered statistically significant at p < 0.05. Statistically different groups are represented by the symbols (*)
and (•) in each graph and error bars represent 2 standard errors around the mean.
A
C
B
• •
•
*
*
A B
C Variation of partial pressure of CO2 across conditions
Mea
n pC
O2
Variation of sO2 % across conditions
C
Mea
n sO
2 %
Mea
n pH
Variation of pH across conditions
*
*
• •
• •
Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia 29
2.3.2 RT-PCR
RT-PCR results are summarised in Table 2.3. Results for the HD gene have been
removed here as it could not be validated. In haemolymph, all genes were amplified in all
samples. For foot samples, BG and HB were amplified in all samples, AG was amplified in
23 out of 24 samples, 2D was amplified in 14 out of 24 samples. Low amplification results
for foot samples overall may be due to low yields of RNA obtained for this tissues. In gill
samples, 2D, AG and BG were amplified in at least 21 out of 24 samples and HB was
amplified in 20 samples. In mantle, all five genes were amplified in at least 22 out of 24
samples. In muscle samples, AG, BG and HB were amplified in all 24 samples and 2D in 21
samples. Overall, all four genes tested here are found in all tissues but their expression was
significantly higher in the haemolymph while in the other tissues tested their expression was
minor. Inconsistencies with RT-qPCR results in section 2.3.4 may be due to RNA
degradation in some of those samples or failed amplifications during RT-PCR.
Table 2.3 RT- PCR results summary for tissue specific expression. All five genes (2D, AG, BG, HB and HD)
have been amplified in six biological replicates for each tissue and each condition. To summarise the results, in
this table, a score is given out of 6 (total number of biological replicates) for each candidate gene amplified in
each tissue and in each condition. An asterix * represents weak bands obtained on gel electrophoresis for at
least 2 biological replicates out of 6 in that group.
Tissue
Candidate globin genes
Water 6 h Air 6 h Water 12 h Air 12 h
Haemolymph
2D 6 6 6 6 AG 6 6 6 6 BG 6 6 6 6 HB 6 6 6 6
Foot
2D 4 4 5 1 AG 6 5 6* 6 BG 6 6 6 6 HB 6 6 6 6
Gills
2D 6 6 6 5 AG 5 6 5 5 BG 5 6 5 5 HB 5* 6* 4* 5*
Mantle
2D 5 6* 5 6 AG 6 6 5 6 BG 6 6 5 6 HB 6 6 5 6
Muscle
2D 6 6 6 4 AG 6 6 6 6 BG 6 6 6 6 HB 6 6 6 6
30 Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia
2.3.3 Candidate gene validation
Figure 2.3 shows PCR products for all candidate genes used in sequencing for
validation. Following Sanger sequencing, alignments of amplified sequences showed 100 %
similarity with the original candidate ORFs for 2D, AG, BG and HB, while HD did not show
the correct sequence and therefore was not validated.
Figure 2.3. PCR products amplified from two mantle samples to validate candidate Hb genes. Amplified PCR
products obtained here were purified using the Bioline isolate PCR purification kit followed by cloning using
the Promega pGEM-T and pGEM-T easy vector systems quick protocol. Samples were then sequenced on
the ABI Genetic Analyzer 3500 in duplicate. Each gel of this figure represents one mantle sample. For both
gels A (sample 1) and B (sample 2): lane 1 contains 100 bp Hyperladder, lane 2 through to 7 contain amplified
products for 2D, AG, BG, HB, HD and 18S genes respectively.
2.3.4 RT-qPCR for quantification of gene expression
RT-qPCR results revealed that 2D, AG and BG had significantly higher expression
levels (p < 0.05) in haemolymph compared to other tissues (Figure 2.4). In the case of HB,
there were no significant differences (p > 0.05) in expression levels across tissues with
overall lower expression than the other genes tested. These results are consistent with
amplification curves for target genes and for the housekeeping gene 18S as shown in Figure
2.5. In haemolymph, fluorescence is first detected in cycle 12 and this first cluster represents
amplification curves for 2D, AG and BG. The second cluster of amplification curves begins
at cycle 32 and represents HB expression. This difference is also observed across all tissues
in relative expression of 2D, AG and BG compared to HB as seen in Figure 2.4.
Amplification curves for foot, mantle and muscle follow a similar pattern with fluorescence
detected at cycle 20 for 2D, AG, BG (first cluster) and at cycle 26 for HB (second cluster).
The housekeeping gene 18S used in all PCR runs was consistently detected at cycle 8, across
A B
Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia 31
all tissues. Relative expression levels showed no significant differences between air or water
exposed samples (Figure 2.6).
Figure 2.4. Relative expression ratios of candidate globin genes amplified using RT-qPCR. Candidate genes
amplified in each tissue from the bivalve A.trapezia are represented as follows: 2D, AG, BG and HB. Relative
quantification analysis was performed using the ∆∆CT method. Relative expression is expressed as a ratio of
levels of target sequences to levels of reference sequences with 18S used here as a housekeeping gene.
Significant differences were assessed through ANOVA testing using the program SPSS and differences were
considered statistically significant at p < 0.05. Significantly different groups are shown here with an asterix (*).
Ratio values were used as a dependent variable against tissue type for each gene. Error bars represent 2
standard errors around the mean.
Relative expression of 2D across tissues Relative expression of AG across tissues
p < 0.05
Relative expression of BG across tissues Relative expression of HB across tissues
p < 0.05
p < 0.05
32 Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia
Figure 2.5. Examples of amplification curves obtained for target genes (2D, AG, BG, HB) and housekeeping
gene 18S in each tissue type. Target amplification curves are represented in orange (left), negative controls in
green (left) and reference amplification curves in blue (right). RT-qPCR was performed using a Lightcycler®
measuring specific fluorescence at each cycle. All quantitative PCR analyses were repeated in three technical
replicates along with negative controls and 18S as a housekeeping gene.
0.350 0.300 0.250 0.200 0.150 0.100 0.050
0.350 0.300 0.250 0.200 0.150 0.100 0 050
Fluo
resc
ence
Fluo
resc
ence
5 10 15 20 25 30 35 40 45 Cycle
5 10 15 20 25 30 35 40 45 Cycle
0.360
0.300
0.240
0.180
0.120
0.060
Fluo
resc
ence
5 10 15 20 25 30 35 40 45 Cycle
Fluo
resc
ence
0.600
0.500
0.400
0.300
0.200
0.100
5 10 15 20 25 30 35 40 45 Cycle
5 10 15 20 25 30 35 40 45 Cycle
Fluo
resc
ence
0.500
0.400
0.300
0.200
0.100
Fluo
resc
ence
0.720 0.600 0.480 0.360 0.240 0.120
5 10 15 20 25 30 35 40 45 Cycle
5 10 15 20 25 30 35 40 45 Cycle
Fluo
resc
ence
0.700 0.600 0.500 0.400 0.300 0.200 0.100
Fluo
resc
ence
5 10 15 20 25 30 35 40 45 Cycle
Fluo
resc
ence
0.600
0.500
0.400
0.300
0.200
0.100
5 10 15 20 25 30 35 40 45 Cycle
Fluo
resc
ence
0.720 0.600 0.480 0.360 0.240 0.120
0.500
0.400
0.300
0.200
0.100
5 10 15 20 25 30 35 40 45 Cycle
Haemolymph
Foot
Gills
Mantle
Muscle
Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia 33
Figure 2.6. Relative expression of candidate globin genes in A. trapezia amplified using RT-qPCR. Relative
quantification analysis was performed using the ∆∆CT method. Relative expression is shown as a ratio of levels
of target sequences to 18S sequences, used here as a housekeeping gene. Significant differences (p < 0.05) were
assessed through ANOVA testing using SPSS with ratio values as dependent variable, condition as factor in
each tissue. No statistical differences were found here. Error bars represent 2 standard errors around the mean.
Relative expression of all genes across conditions in haemolymph
Rela
tive
expr
essi
on
2.50
2.00
1.50
1.00
0.50
0.00
2.50
2.00
1.50
1.00
0.50
0.00
Rela
tive
expr
essi
on
2.50
2.00
1.50
1.00
0.50
0.00
Rela
tive
expr
essi
on
2.50
2.00
1.50
1.00
0.50
0.00
Rela
tive
expr
essi
on
2.50
2.00
1.50
1.00
0.50
0.00
Rela
tive
expr
essi
on
2D AG BG HB
Gene
2D AG BG HB
2D AG BG HB
2D AG BG HB
2D AG BG HB
Gene
Gene
Gene
Gene
Relative expression of all genes across conditions in foot
Relative expression of all genes across conditions in gills Relative expression of all genes across conditions in mantle
Relative expression of all genes across conditions in muscle
Water 6h Air 6h Water 12h Air 12h Condition
Water 6h Air 6h Water 12h Air 12h Condition
Water 6h Air 6h Water 12h Air 12h Condition Water 6h Air 6h Water 12h Air 12h
Water 6h Air 6h Water 12h Air 12h
Condition
Condition
34 Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia
2.4 Discussion
In this study, the expression of Hb encoding genes was examined (2D, AG, BG and
HB) across different tissues and environmental conditions, in A. trapezia, in order to
determine if these genes show patterns of expression consistent with neofunctionalisation.
Based on these findings, three Hb encoding genes appear to be predominantly expressed in
the haemolymph. This pattern was observed to be consistent across the two environmental
conditions tested (submersion and aerial exposure). Physico-chemical properties of the
haemolymph, however, indicated that major physiological changes are occurring in the
haemolymph without major changes in expression of Hb genes.
2.4.1 Haemolymph characteristics
Haemolymph analysis was performed in order to gain insights on the state of Hb
molecules and gas exchange throughout the experiment. pH values remained constant in
animals kept in water for 6 h and 12 h. However, animals exposed to air for 6 h (pH= 7.2)
showed a slight drop in pH and a significant drop was observed in animals exposed to air for
12 h (pH= 6.9). This finding is consistent with observations in another bivalve species
(Crassostrea gigas) (Michaelidis et al., 2005). This is an interesting physiological response
as C. gigas does not have a circulating respiratory pigment and this therefore suggests that
these changes may be independent of Hb in the haemolymph. The drop in pH and increase in
pCO2 in animals exposed to air for 6 h was not statistically significant (p > 0.05), however,
this may be due to the fact that A. trapezia is typically exposed to six hours (high to low) tide
cycles. Consequently, the animals may still be consuming their oxygen stores and relying on
aerobic metabolism (Booth et al., 1984).
The observed significant decline in pH is consistent with the increase in pCO2 in
animals under prolonged aerial exposure (12 h), and may be explained by a shift of the
organism from aerobic metabolism to anaerobic metabolism, an adaptation to hypoxia
widespread among bivalves (Brooks et al., 1991; De Zwann et al., 1993). This is well
supported in literature on marine bivalves, where a large number of studies have shown that
hypoxia causes pCO2 to rise in the haemolymph due to the accumulation of CO2 as an
anaerobic by-product, leading to acidification of body fluids and a drop in pH levels
(Crenshaw & Neff, 1969; Jokumsen & Fyhn, 1982; De Zwann et al., 1983; Booth et al.,
Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia 35
1984; Michaelidis et al., 2005; Shumway & Parsons, 2011). This gradual switch from
aerobic to partial or complete anaerobic metabolism upon prolonged air exposure is
documented in bivalves such as the deep-sea clam (Calyptogena magnifica) (Hourdez &
Lallier, 2006) and the sea mussel (Mytilus edulis). Another study on the intertidal lugworm
Arenicola marina (Booth et al., 1984; Toulmond & Tchernigovtzeff, 1984; Wang &
Widdows, 1991), which is known for its ability to rapidly acclimate to restricted O2
availability in its habitat, reinforces this finding and reports that acidosis was also found to be
coupled with a rise in pCO2 as a result of decreased gas exchange during low tide (Toulmond
& Tchernigovtzeff, 1984). Moreover, from results in this study, globins present in the
haemolymph of A. trapezia do not seem to show a Bohr effect unlike Hbs found in teleost
fish species (Souza & Bonilla-Rodiguez, 2007; Witeska, 2013). The saturation of oxygen
appears independent of pH as a significant drop in pH is coupled with a significant increase
in sO2 % for animals exposed to air for 12 h. This is consistent with previous studies on Hb
containing bivalves such as Barbatia reeveeana (Grinich & Terwilliger, 1980), Cardita
floridana (Manwell, 1963) and Anadara broughtonii (Furuta & Kajita, 1983) for which no
Bohr effect was observed.
2.4.2 Tissue specific expression and neofunctionalisation
It is now widely accepted that invertebrate Hbs are understudied, particularly in
reference to their evolution, distribution and functions (Alyakrinskaya, 2002; Bao et al.,
2013a). In an attempt to address this knowledge gap, a quantitative investigation of Hb
encoding genes was performed across five different tissues in A. trapezia individuals
undergoing aerial exposure. A. trapezia is particularly suited to examine neofunctionalisation
of Hb genes as it expresses multiple different Hb genes (Como & Thompson, 1980b) and is
commonly under abiotic stress due to aerial exposure during tidal cycles (Davenport &
Wong, 1986). A. trapezia, like most bivalves, possesses an open circulatory system where
surface areas for O2 uptake are composed of a pair of gills and mantle tissue anchored to the
shell (Herreid, 1980).
While strong patterns of differential expression across tissues were observed, there was
no evidence to support any changes in expression differences across environmental
conditions. This may indicate that these genes have similar functions under environmental
stress. Alternatively, the aerial exposure in this experiment mimics a tidal cycle at 6 h and a
36 Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia
prolonged period of 12 h, the absence of significant differences (p > 0.05) in expression
levels of Hb encoding genes during air exposure may indicate that multiple Hbs maximise O2
binding and supply during times of hypoxia (Projecto-Garcia et al., 2015).
All Hb encoding genes were expressed across all tissues but the genes encoding AG,
BG and 2D had significantly higher expression in haemolymph than foot, gills, mantle and
muscle (p < 0.05). Other authors have used patterns of tissue specific expression of
duplicated genes to infer that they have undergone neofunctionalisation (e.g., Na/K- ATPase
in fish in electrical organs) (Norman et al., 2011; Gallant et al., 2014). Consequently, results
in this study would indicate that at least three of the candidate genes investigated (AG, BG
and 2D) have potentially undergone neofunctionalisation based on their expression patterns.
Although it may be expected that Hb genes have a higher expression in haemolymph, most of
closely related bivalve lineages to Arcidae do not possess circulating erythrocytes (Sullivan,
1961; Read, 1966). This indicates that these genes have undergone altered expression
patterns following duplication from an ancestral globin gene. This may have been influenced
by their challenging living environment in intertidal zones where they are regularly exposed
to changing oxygen availabilities.
While Hb genes were expressed across multiple tissues, the expression levels may
have been somewhat influenced by the permeability of tissues to haemolymph. For example,
the gills are highly permeable to haemolymph because they are made up of folded and
ciliated epithelium to maximise surface area for gas exchange (Widdows et al., 1979).
Similarly, mantle tissue is rich in capillaries where oxygenated haemolymph is distributed to
the rest of the tissues for O2 delivery (Weber, 1980). Muscle and foot tissue also depend on
haemolymph for O2 supply and consequently the lower expression patterns seen across all
tissues apart from haemolymph may be the result of the residual haemolymph content within
them.
2.5 Conclusion
Overall, air exposure for 6 hours mimicking the duration of a low tide does not
change the haemolymph pH in A. trapezia whereas prolonged air exposure causes acidosis in
the haemolymph. This drop in pH on prolonged air exposure coincides with a rise in pCO2
and therefore indicates that the organism may be switching to anaerobic metabolism in order
to deal with this stress. In terms of expression of Hb encoding genes, the multiple Hbs
Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia 37
expressed in this species are present in all tissues but their expression is significantly higher
in haemolymph than in foot, gills, mantle and muscle. For at least three of these Hb encoding
genes, this contributes to the theory that they have undergone neofunctionalisation through
gene duplication. In this case, the unfavourable O2 conditions of the intertidal environment
these bivalves live in may have been a driver for this selective adaptation.
38 Chapter 2: Tissue specificity of haemoglobin genes in Anadara trapezia
Chapter 3: Functional annotation of the Ctenoides ales transcriptome
3.1 Background
Respiratory pigments, or oxygen-transport proteins, within the bivalve lineage show a
patchy distribution, as well as extensive variation in both form and function across species in
which they are expressed (Terwilliger, 1980; Booth et al., 1984; Morse et al., 1986; Mangum
et al., 1987; Weber & Vinogradov, 2001). The majority of bivalves do not have a functional
respiratory pigment and rely on filtration of highly oxygenated water over the gills, to meet
their respiratory demands (Angelini et al., 1998). Those bivalve species that do possess a
respiratory pigment usually express haemocyanin (Hc) (Terwilliger et al., 1988), a complex
copper – based protein that reversibly binds O2. In rare cases, however, bivalves utilise a type
of haemoglobin (Hb), as seen in orders Arcoida, Carditoida, Solemyoida and Veneroida
(Manwell, 1963; Terwilliger et al., 1978; Dando et al., 1985; Doeller et al., 1988; Suzuki et
al., 2000). Based on the current phylogenetic reconstruction of the bivalve species by
González et al. (2015), it appears that the earliest lineages possess Hc as the primary
respiratory pigment. Based on this observation, and the fact that almost all gastropods (sister
lineage to bivalves) possess Hc (Linzen et al., 1985), it is likely that Hc is the ancestral
oxygen-transport protein in Bivalvia. Unlike Hc, Hb is thought to have evolved from
ancestral globin genes on multiple occasions within bivalves.
Currently, four independent origins of Hbs have been hypothesised in bivalve
molluscs. The first instance is thought to have occurred in the primitive bivalve lineage
containing Solemya velum in the order Solemyoida. In addition to circulating hemocyanins,
this species contains several tissue Hbs, predominantly localised in the gills (Dando et al.,
1985; Doeller et al., 1988). The second independent origin of Hbs was reported in the bivalve
family Arcidae in the order Arcoida, where evidence suggests that all species investigated
contain multiple Hbs in circulating red blood cells (Mangum, 1997). The diversity of Hbs in
this group consists of monomeric, homo-dimeric, hetero-dimeric, tetrameric and polymeric
proteins (Terwilliger, 1980). The third reported origin of bivalve Hb is found in the deep-sea
clam genus, Calyptogena in the order Veneroida. In this genus, species such as C. kaikoi
contain two types of homo-dimeric Hb proteins, where one is found in erythrocytes and the
other is restricted to abductor muscle tissue (Suzuki et al., 2000). The fourth instance is an
extracellular Hb found in the heterodont clams, Cardita borealis and Cardita floridana in the
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 39
order Carditoida (Manwell, 1963; Terwilliger et al., 1978). While these four cases provide
compelling evidence for the independent evolution of Hb in multiple bivalve lineages it is not
known whether Hb has evolved in other lineages where oxygen-transport proteins have not
been examined.
Interestingly, there is some sparse evidence to suggest that members of the family
Limidae may also possess Hb proteins. This is largely restricted to one textbook publication
(Rawat, 2010) with no molecular or biochemical studies having been reported. It is plausible
to hypothesise that Limidae indeed possess Hb proteins based on their high metabolic rate
associated with the ability to swim and the red pigmentation of their tissues (Harper &
Skelton, 1993; Mikkelsen & Bieler, 2003). If present, Hb could represent an important
physiological advantage for a more efficient O2 delivery system to respiring tissues during
swimming. Consequently, if Hb were found in Limidae, then this would constitute a fifth
independent origin of Hb in bivalve molluscs.
An efficient approach to determine if Hbs are present in a given species is to sequence
the entire expressed portion of their genome, as it has successfully been performed in the
blood clams, Tegillarca granosa (Bao & Lin, 2010) and A. trapezia (Prentis & Pavasovic,
2014). Research into the presence of Hb in the family Limidae has lagged principally due to a
lack of genomic and proteomic resources developed for species from this bivalve family.
Here, the transcriptome sequencing, assembly and annotation for C. ales, a non-model bivalve
mollusc from the family Limidae is described, for which no transcriptome data currently
exists. The aim of the present study is to test the hypothesis that Hb genes, that encode for a
functional Hb protein, are present in this species and this would therefore likely constitute a
fifth independent origin of Hb in bivalves, as these proteins have only been found in the
bivalve orders Arcoida, Carditoida, Solemyoida and Veneroida. The newly generated
transcriptome will also greatly increase the genomic resources for C. ales and provide an
initial candidate gene list for further genetic research.
40 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
3.2 Materials and Methods
3.2.1 Sample collection
C. ales specimens were obtained from Cairns Marine Pty Ltd, Australia and kept in
holding tanks at the Marine Facility (QUT) , until required for experimentation. This
experiment required no ethics approval as the specimens do not qualify as animals as
described by the Queensland Animal Care and Protection Act 2001 (ACPA). Water
conditions were salinity at 30 - 35 ppt, pH 7.9 - 8.1, temperature 20 – 25 °C and ammonia
levels < 0.1 mg/L. A 12h light/dark cycle was maintained. Animals were fed every three days
with phyto-blast containing a wide range of aquaculture marine phytoplankton (Continuum
Aquatics).
3.2.2 RNA extraction, library preparation and sequencing
One individual tissue sample was used and divided into four for RNA extraction, as
per previously optimised TRIzol/chloroform RNA extraction protocol for mollusc species
(Prentis & Pavasovic, 2014). Specifically the protocol consisted of following steps: 1.0 mL
of Trizol reagent was added to the 1.5 mL microcentrifuge tubes containing tissue and each
tube was vortexed to mix for 5 min. The samples were then incubated at room temperature
for 2 min. Chloroform (0.2 mL) was added to each tube and vortexed for 20 – 30 sec,
followed by a 5 min incubation at room temperature. Samples were spun at 12,000 g for 15
min and the aqueous clear phase in each tube was collected carefully without disturbing the
other layers, and placed in four new 1.5 mL microcentrifuge tubes. Following this,
isopropanol (0.5 mL) was added to each tube of newly collected solution and vortexed for
approximately 10 sec. Sodium chloride (1.2 M, 100 µL) was added to each tube, followed by
a 10 min incubation at room temperature. Samples were then spun at 12,000 g for 15 min and
supernatant removed from each tube without disturbing the pellets. Ethanol (70%, 200 µL)
was then added to each tube to wash the pellets, followed by centrifugation at 12,000 g for 10
min and removal of supernatant. The pellets were then allowed to air dry for 10 min before
being eluted in 50 µL of RNase free water. The RNA was then visualised on a 1.5 % agarose
gel. The quantity and integrity of total RNA was validated on a Bioanalyzer 2100 RNA Nano
chip (Agilent Technologies).
Sequencing libraries were prepared from 20 µg of total RNA using a TruSeq®
stranded RNA library prep kit (Illumina), as per manufacturer’s low sample (LS) protocol.
This consisted of selective purification of mRNA using oligo(dT) magnetic beads and
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 41
subsequent selection of 200 – 700 bp fragments. Specifically, total RNA was diluted with
nuclease-free ultra-pure water to a final volume of 50 µL in a 96 - well 0.3 mL PCR plate
provided in the kit and labelled with a RNA Purification beads (RPB) code. After being
brought to room temperature, the tube containing the RNA Purification Beads was vortexed to
resuspend the oligo-dT beads. 50 µL of RNA Purification Beads was added to each well of
the RBP plate to bind the polyA RNA to the oligo-dT magnetic beads. Each well was then
mixed thoroughly by pipetting the entire volume up and down six times followed by sealing
of the plate with a Microseal adhesive seal provided. The RBP plate was then placed in a
thermal cycler on a cycle of 65 °C for 5 min and 4 °C hold to denature the RNA and facilitate
binding of the polyA RNA to the beads. After being taken out of the cycler, the RBP plate
was incubated at room temperature for 5 min to allow the RNA to bind to the beads. The
adhesive seal was then removed and the plate was placed on the magnetic stand at room
temperature for 5 min to separate the polyA RNA bound beads from the solution.
Supernatant from each well was then removed, discarded and the plate was taken off the
magnetic stand. Beads were washed by adding 200 µL of Bead Washing Buffer provided in
each well to remove unbound RNA. Wells were mixed thoroughly by pipetting up and down
six times. The RBP was again placed on the magnetic stand for 5 min at room temperature.
After being thawed, the Elution Buffer solution provided was centrifuged at 6,000 g for 5 sec.
Supernatant from each well was removed after incubation and discarded as it contained most
of the ribosomal RNA and other non-messenger RNA. The RBP plate was removed from the
magnetic stand and 50 µL of Elution Buffer was added to each well and mixed thoroughly by
pipetting up and down six times. The RBP plate was sealed and placed in the thermal cycler
on a cycle of 80 °C for 2 min and 25 °C hold to elute the mRNA from the beads. This step
releases both the mRNA and any contaminant rRNA that may have bound to the beads non-
specifically. The plate was removed from the thermal cycler and unsealed. After thawing,
the Bead Binding Buffer provided was centrifuged at 600g for 5 sec. Bead Binding Buffer
(50 µL) was added to each well of the RBP plate to allow mRNA to specifically rebind the
beads as well as reducing the amount of rRNA that is bound non-specifically. Each well was
mixed. The RBP plate was then incubated at room temperature for 5 min and then placed on
the magnetic stand where supernatant was removed and discarded. The plate was removed
from the stand and the beads were washed by adding 200 µL of Bead Washing Buffer in each
well and mixing. The plate was placed on the magnetic stand at room temperature for 5 min
and supernatant was removed and discarded to avoid any residual rRNA and other
contaminants. The plate was removed from the magnetic stand and 19.5 µL of Fragment,
42 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
Prime, Finish Mix provided were added and mixed thoroughly with the RNA to serve as a
first strand cDNA synthesis reaction buffer as they contain random hexamers. The plate was
sealed and placed in the thermal cycler on a cycle of 80 °C for 8 mins and 4 °C hold. The
RBP plate was then removed from the thermal cycler and centrifuged briefly. Complementary
DNA synthesis was then carried out on enriched mRNA samples as follows: the RBP plate
was placed on the magnetic stand at room temperature for 5 min and 17 µL of supernatant in
each well was transferred to the corresponding well on a new 0.3 mL PCR plate labelled with
a cDNA Plate (CDP) barcode. After thawing, the First Strand Synthesis Act D mix was
centrifuged at 6,000 g for 5 sec. SuperScript II (50 µL) was added to the First Strand
Synthesis Act D tube. Eight µL of this mix was added to each well of the CDP plate and
mixed thoroughly. The CDP plate was sealed and centrifuged briefly before being transferred
to the thermal cycler on a cycle 25 °C for 10 min, 42 °C for 15 min, 70 °C for 15 min and
hold at 4 °C. The CDP plate was then taken out of the thermal cycler and unsealed. Five µL
of Resuspension Buffer was added to each well followed by 20 µL of thawed and centrifuged
Second Strand Marking Master Mix provided. Each well was thoroughly mixed and the plate
sealed. The CDP plate was placed in the thermal cycler and incubated at 16 °C for one hour.
It was then removed from the cycler and unsealed. Well-mixed AMPure XP beads (90 µL)
provided was added to each well now containing 50 µL of double stranded cDNA and this
was mixed by pipetting the entire volume up and down ten times. The CDP plate was then
incubated at room temperature for 15 min before being placed on the magnetic stand for 5
min. 135 µL of supernatant was removed and discarded from each well. Freshly made 80 %
EtOH (200 µL) was then added to each well without disturbing the beads and the plate was
incubated at room temperature for 30 sec before removing and discarding all of the
supernatant from each well. This EtOH wash step was repeated one more time before leaving
the CDP plate to dry at room temperature for 15 min. The CDP plate was then removed from
the stand and 17.5 µL of thawed and centrifuged Resuspension Buffer provided was added to
each well and mixed. The plate was incubated at room temperature for 2 min and placed on
the magnetic stand for 5 min. Fifteen µL of supernatant from each well was then transferred
from the CDP plate to a new 96 well 0.3 mL PCR plate labelled with an Adapter Ligation
Plate (ALP) barcode.
Purified fragments were poly-A tailed, ligated to Illumina paired-end adapters, and
size selected by gel purification. Finally, PCR was used to enrich the DNA fragments with
the following conditions: one cycle at 98 °C for 30 sec, 15 cycles of 98 °C for 10 sec, 60 °C
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 43
for 30 sec, 72 °C for 30 sec, one cycle at 72 °C for 5 min and hold at 4 °C. The final cDNA
library was sequenced on an Illumina NextSeq500 using 150 bp paired-end chemistry.
3.2.3 Transcriptome assembly and validation
Following sequencing, the libraries were concatenated into two separate files based on
read direction (-left all_R1_reads.fastq and –right all_R2_reads.fastq). Quality control was
then performed on both files to validate the quality of the raw reads prior to assembly. This
was done using the FastQC program which provides summary statistics for read quality,
length and GC content (Andrews, 2010). Low quality reads (sequences with > 1 % N bases
and/or greater than 1 % Q < 20) were discarded and only reads with quality scores above Q20
and less than 1 % ambiguity were used for downstream analysis. These remaining reads were
assembled into contigs ≥ 200 bp using the Trinity short read de novo assembler (version 2.0.6)
(Grabherr et al., 2011). Assembled contiguous sequences (contigs) were filtered for
redundancy and chimeric transcripts using the program CD-HIT (Cluster Database at High
Identity with Tolerance), version 4.6.1 (Huang et al., 2010) and sequences with > 95 %
similarity were clustered into a single contig. At this point, the C. ales transcriptome
assembly was evaluated for completeness using CEGMA (Core Eukaryotic Genes Mapping
Approach) to determine the presence of a group of 248 highly conserved core eukaryotic
genes.
3.2.4 Functional annotation of transcripts and mapping
The transcriptome assembly was functionally annotated using sequence based searches
implemented in the Trinotate annotation pipeline (Haas et al., 2013), which is a
comprehensive annotation suite designed specifically for de novo assembled transcriptomes of
model or non-model organisms. Trinotate generates functional annotation based on
homology searches to known sequence data, protein domain identification and protein signal
peptide/transmembrane domain predictions before integrating the analysis of transcripts into a
SQLite database to obtain an annotation report as described below and as summarised in
Figure 3.1. Firstly, contigs were used as BLASTx queries against the TrEMBL and Swiss-
Prot databases with a stringency E-value of 1 x 10-6. Both of these databases are a collection
of core data on proteins such as the amino acid sequence, the protein name or description, the
taxonomic data, the citation information and as much annotation information as possible.
44 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
They are used concurrently as TrEMBL is computationally analysed while Swiss-Prot is
manually annotated based on literature and curated computational analysis. Gene ontology
terms were assigned to contigs that received BLAST hits that also contained functional
information. In order to characterise the transcriptomic data obtained, these terms were
analysed using WEGO (Web gene Annotation Plotting) (Ye et al., 2006). TransDecoder
v.2.0.1 (Haas et al., 2013) was used to generate a predicted proteome for the transcriptome
assembly based on the longest open reading frame (ORF) in each transcript. The predicted
proteome was then used as a BLASTp query against the TrEMBL and Swiss-Prot databases.
SignalP 4.1 was used to predict the presence and location of signal peptide cleavage sites in
coding sequences. Pfam domains were assigned using the Hidden Markov Model-based
sequence alignment tool (HMMER) to determine the presence and position of domains within
each predicted proteome. The analyses obtained for all transcripts were then uploaded to the
SQLite database and an annotation report was generated. Trinity output consisted of sequence
clusters identified as genes through BLASTx searches against the TrEMBL and Swiss-Prot
databases.
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 45
Figure 3.1. Workflow overview of functional annotation of the C. ales transcriptome using the Trinotate
annotation pipeline. This is a comprehensive annotation suite based on homology searches to known sequence
data. Contigs were first used as BASLTx queries against the TrEMBL and Swiss-Prot databases (stringency E-
value of 1 x 10-6). TransDecoder was used to generate a predicted proteome then used as a BLASTp query
against the TrEMBL and Swiss-Prot databases. SignalP was used to predict the presence and location of signal
peptides and Pfam to determine the presence and position of protein domains. All results were uploaded to the
SQLite database and an annotation report was generated. Adapted from van der Burg et al. (2016).
3.2.5 Comparative transcriptomics
In order to further validate the newly generated transcriptome, orthologous gene clusters
for C. ales, A. trapezia and the two model species Lottia gigantea and Crassostrea gigas were
compared and annotated using the program OrthoVenn (Wang et al., 2015). The predicted
proteomes from the genome of L. gigantea and C. gigas were used and compared to the
predicted proteome from the whole organism transcriptome of A. trapezia and C. ales.
Ortholog groups were identified by an all-against-all or reciprocal BLASTp alignment.
46 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
3.2.6 Candidate genes identification
The annotated transcriptome was interrogated for candidate globin genes, here defined
as contigs that possess a characteristic globin fold sequence and globin domain as identified in
the Trinotate pipeline through BLASTp and Pfam. Candidate contigs identified were
analysed through ORF finder (Stothard, 2000) to determine open reading frames. Coding
regions were translated into protein sequences using the translating tool ExPAsy (Gasteiger et
al., 2003) and used as BLASTp queries against the NR database at NCBI. Relevant
conserved domains were identified using SMART searches for homologous Pfam domains
(Finn et al., 2014), signal peptides and internal repeats.
3.2.7 Primer design and candidate gene validation
Primers for the candidate sequences were designed using Primer3 software, to
amplify the entire ORFs (open reading frames) of the candidate genes: contigs c97022_g1_i3
(CalesGl1), c89016_g1_i4 (CalesGl2) and c97010_g2_i1 (CalesGl3) and used for validation.
This was done with the following PCR protocol: 12.5 µL of MyFi® 2x, 8.5 µL of H2O, 1 µL
of forward primer, 1 µL of reverse primer, 2 µL of MgCl2 and 1 µL of template to a total
volume of 26 µL. Different conventional PCR amplification conditions were used for each
candidate gene. For contig c97022_g1_i3: one cycle at 94 °C for 3 min, 30 cycles at 94 °C 30
sec, 54 °C 30 sec, 72 °C 1 min, one cycle at 72 °C for 5 min and a 4 °C hold. For contig
c89016_g1_i4: one cycle at 94 °C for 3 min, 30 cycles at 94 °C 30 sec, 50 °C 45 sec, 72 °C 1
min, one cycle at 72 °C for 5 min and a 4 °C hold. For contig c97010_g2_i1: one cycle at 94 °C
for 3 min, 30 cycles at 94 °C 30 sec, 48 °C 1 min, 72 °C 1 min, one cycle at 72 °C for 5 min and
a 4 °C hold. Amplified products were size-separated by electrophoresis on 1.5 % agarose gels
stained with GelRedTM (Biotum) alongside a 100 bp Hyperladder (Bioline). The intensity of
PCR products was determined using an image analysis program assisted by a gel
documentation system (Chemidoc XRS, Bio-Rad). PCR products were then purified using an
Ethanol/EDTA precipitation protocol as follows: 5 µL of 125 mM EDTA disod ium salt (pH
8.0) was added to each PCR tube, vortexed and spun briefly. Ethanol (100 %, 60 µL) was
added and tubes were left to incubate at room temperature for 15 min. Tubes were then spun
in a centrifuge at 13,000 g for 20 min and the supernatant was carefully aspirated and
discarded. Freshly made 80 % ethanol (350 µL) was then added and tubes were spun at
13,000 g for 5 min before the supernatant was aspirated and discarded. Pellets in the tubes
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 47
were then left to dry at room temperature for 1 hr covered with aluminium foil. Samples were
sequenced on the ABI Genetic Analyzer 3500 (ThermoFisher). Sequence chromatograms
were visualised using Geneious® software version 8.1.6 and sequences were re-aligned to
candidate ORFs that primers were designed from.
3.2.8 Phylogenetic analysis of sequences
Phylogenetic relationships of globin genes in this study and model bivalve species were
resolved using maximum likelihood analysis in MEGA version 6.0 (Tamura et al., 2013).
This software conducts manual and automatic sequence alignments to infer molecular
relationships and build phylogenetic trees (Tamura et al., 2013). Firstly, the globin gene
sequences were isolated from the genomes of the two bivalve model species C. gigas and L.
gigantea by using a custom BLAST against haemoglobins, neuroglobins, globin-X,
cytoglobins and myoglobins in the three vertebrate model species: Homo sapiens (human),
Gallus gallus (chicken) and Danio rerio (zebra fish). For C. gigas, 12 globin genes were
found and 10 for L. gigantea. Globin gene sequences were isolated from the transcriptome of
A. trapezia. All sequences were further validated using Pfam searches to ensure the presence
of a globin domain. Alignments were performed using MUSCLE (Edgar, 2004) protein plug
in with default settings in MEGA and a best-fit model test was performed resulting in
LG+G+I (Tamura et al., 2013). This LG gamma distributed (G) with invariant sites (I) model
was therefore used with three discrete gamma categories to construct a maximum-likelihood
tree using the bootstrap method with 500 bootstrap replications. All bivalve globin protein
sequences used in the phylogeny were also aligned for comparison and to identify residue
conservation using the multiple sequence alignment with high accuracy and high throughput
MUSCLE in MEGA version 6.0.
3.3 Results
3.3.1 RNA extraction, library preparation and sequencing
Total RNA extraction, performed in quadruplicate, for C. ales is visualised in Figure
3.2, where all four samples showed strong bands around 700 bp, which represent ribosomal
RNA. Bioanalyzer results (Figure 3.3) confirm these findings with high concentration and
48 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
yields for both samples submitted. Transcriptome sequencing of mRNA from C. ales on
Illumina NextSeq 500 resulted in 55,592,966 150 bp paired-end reads and a GC content of
36.53 %. This slightly low GC content may be due to shortness of sequences and instability
of mRNA.
Figure 3.2. Whole tissue from one C.ales individual was sampled and used for RNA extraction. This 1.5 %
agarose gel electrophoresis shows RNA extracted from four samples (extraction was performed in quadruplicate)
from this individual as follows: lane 1 contains 100 bp Hyperladder, lanes 2-5 contain samples 1-4 respectively.
The RNA is visible as strong bands in each sample around the 700 bp mark. RNA obtained was then assessed
for quantity and integrity and sequencing libraries were prepared.
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 49
Figure 3.3. Bioanalyser results for total RNA quality and quantity. Total RNA samples obtained from whole
tissue of one C.ales individual were assessed for quantity and integrity on a Bioanalyzer 2100 RNA nano chip.
Results shown here are for two RNA samples labelled here C10 and C11. RNA concentrations are given for
each sample.
3.3.2 Transcriptome assembly and validation
Assembly of this data resulted in 182,908 contigs (≥ 200 bp). The mean contig length
was 357 bp, with an N50 of 1,167 bp. Assembly statistics are presented in Table 3.1.
50 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
Table 3.1 Summary of sequencing and assembly data for the bivalve C. ales. Sequencing libraries were
prepared using an Illumina TruSeq® stranded RNA library prep kit and the final cDNA library was sequenced on
an Illumina NextSeq500 using 150 bp paired-end chemistry. Libraries obtained were then assembled and the
table below summarises results obtained.
3.3.3 Functional annotation of transcripts and mapping
Overall, 22,903 (12.5 %) transcripts received significant BLASTx hits and 19,788 (10.8
%) BLASTp hits against the Swissprot database. Against the TrEMBL database of predicted
proteins, 34,098 (18.6 %) transcripts received significant BLASTx hits and 26,344 (14.4 %)
BLASTp hits. Gene ontology (GO) terms were assigned to 30,315 transcripts (Figure 3.4).
The most frequently assigned GO terms were cellular process (20,887), biological regulation
(11,020) and metabolic process (15,717) for the broad biological process category. For the
cellular component category, most GO terms were assigned to cell and cell part (23,916),
followed by organelle (16,087). In the molecular function category, GO terms were most
commonly assigned to binding (19,803) and catalytic activity (12,179).
Assembly statistic
Total assembled reads 127,196,743
Total trinity ‘genes’ 160,227
Total contigs 182,908
Mean contig length (bp) 357
Average contig 695.41
N50 1,167
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 51
Figure 3.4. WEGO output for newly generated transcript for C. ales. Web gene Annotation Plotting was used
to characterise the transcriptomic data obtained for C. ales. This figure represents the proportion and number of
transcripts assigned Gene Ontology (GO) terms in three different gene ontology categories developed to
represent common and basic biological information: cellular component, molecular function and biological
process.
3.3.4 Comparative transcriptomics
A comparative analysis was conducted across the four mollusc transcriptomes
including the two model species C. gigas and L. gigantea, as well as the bivalves A. trapezia
and C. ales. Orthologous clusters conserved among the four molluscs and those unique to
each species are represented in the Venn diagram in Figure 3.5. Overall, 6,743 clusters were
shared by all four species, which represents 55.5 % of the total number of clusters for C.
gigas, 58.1 % for L. gigantea, 53.2 % for A. trapezia and 38.8 % for C. ales. The
transcriptome with the lowest number of unique clusters was L. gigantea with 918, while C.
ales had the highest number of 4,355. A. trapezia shared the most orthologous clusters with
C. ales (1,607).
Cellular component Molecular function Biological process
Perc
ent o
f gen
es
Num
ber o
f gen
es
52 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
Figure 3.5. Venn diagram illustrating the number of gene clusters shared and unique between the four bivalve
species C. gigas, L. gigantea, A. trapezia and C. ales. For L. gigantea and C. gigas, the predicted proteomes
obtained from the genomes are used and the predicted proteomes from whole organism transcriptomes are used
for A. trapezia and C. ales. Ortholog groups shown here were identified by an all-against-all reciprocal BLASTp
alignment.
3.3.5 Candidate genes
Contigs identified as globins were extracted from the Trinotate annotation report
(c97022_g1_i3; c89016_g1_i4; c97010_g2_i1). Contig c97022_g1_i3 (CalesGl1) was found
to be 1,930 bp in length, consisting of an ORF of 498 bp encoding a polypeptide of 165 amino
acids. Analysis of the predicted protein sequence identified a globin domain (Figure 3.6) with
40 % sequence identity to a Ngb-like gene in the bivalve C. gigas. Contig c89016_g1_i4
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 53
(CalesGl2) was 715 bp in length, consisting of an ORF of 585 bp encoding a polypeptide of
194 amino acids. The predicted amino acid sequence of this contig consisted of a signal
peptide and a globin domain (Figure 3.6). It had 36 % protein identity to a Hb gene found in
the crab Carcinus maenas. Contig c97010_g2_i1 (CalesGl3) was found to be 1,835 bp in
length consisting of an ORF of 1,107 bp encoding a polypeptide chain of 368 amino acids.
The predicted amino acid sequence of this contig was found to encode a transmembrane
domain and two globin domains (Figure 3.6), with 38 % sequence identity to the HbI gene
found in the bivalve T. granosa.
bp
Figure 3.6. Globin domains identified for three candidate globin genes. The annotated transcriptome obtained
for C.ales was interrogated for candidate globin genes here defined as contigs that possess a characteristic globin
fold sequence and globin domain. The three contigs found to possess these characteristics are shown above as
follows: CalesGl1 (top), CalesGl2 (middle) and CalesGl3 (bottom). For candidate gene 2, the purple circle
represents a signal peptide and for candidate gene 3, the blue rectangle represents a transmembrane domain.
Both were identified using SMART searches for homologous Pfam domains, signal peptides and internal
repeats.
CalesGl1
CalesGl2
CalesGl3
54 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
3.3.6 Primer design and candidate gene validation
Forward and reverse primers were designed based on the longest ORF for each of the
three candidate genes for validation (Table 3.2). PCR reactions were optimised for each
primer from each candidate gene and sequences successfully amplified (Figure 3.7).
Alignments of chromatograms obtained from Sanger sequencing and original candidate ORFs
were successful and all three genes were validated at > 97 %.
Table 3.2 Primers designed for validation of three candidate genes identified from the newly generated C. ales
transcriptome. Primers were designed using Primer3 software to amplify the entire ORFs and validate the
candidate genes identified. Forward and reverse primers were designed for each candidate as shown in the table
below.
Figure 3.7. Candidate gene products from C. ales transcriptome. Each candidate was amplified using primers as
shown in table 3.2 above. Lanes 1 in all gels shown here are Hyperladder 100 bp. Candidate gene 1 is shown in
lane 2 of gel (A); candidate gene 2 is shown in lane 3 of gel (B); candidate gene 3 is shown in lane 3 of gel (C).
All gels are 1.5 % agarose stained with GelRedTM (Biotum). Products of candidate genes seen here were purified
using an Ethanol/EDTA precipitation protocol and samples were sequences on ABI Genetic Analyzer 3500
(ThermoFisher). Lanes 2 and 4 of gel (B) and lane 2 of gel (C) are candidate gene products prior to PCR
reactions being optimised.
Primer name Primer sequence 5’-3’ Contig Annealing T° Product size
CalesGl1_F AGA AGC GCA GGC AGA AGA AA c97022_g1_i3 55° 670
CalesGl1_R TGT GAA TCA ACG CAT TGC ACA
CalesGl2_F ATC AGT CGA CTG GTG CAT AG c89016_g1_i4 55° 670
CalesGl2_R TGC ATG TAC AGA ATA AAG GCA
CalesGl3_F GCA CAC AGC TAC ACT CTT GT c97010_g2_i1 55° 1400
CalesGl3_R TGC GTA GAT CGG AGT AGA GA
B A C
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 55
3.3.7 Phylogeny of globins in bivalves
The phylogenetic relationships of globins in the four mollusc species (L. gigantea, C.
gigas, A. trapezia and C. ales) and three model vertebrate species (D. rerio, G. gallus and H.
sapiens) are represented in Figure 3.8. In this maximum-likelihood tree, there are three
weakly supported major clades (A, B and C). Clade C contained Hb and other related globin
(all globins apart from Ngb and GbX) genes from the three vertebrate model species. Clades
A and B both contained globin genes for L. gigantea, C. gigas, C. ales and A. trapezia. Clade
A, although poorly supported, contained globin genes from L. gigantea, A. trapezia and C.
gigas, C. ales and vertebrate GbX but no genes encoding molluscan Hb proteins. Clade B is
strongly supported and contains all known Hb genes in A. trapezia. All bivalve globin protein
sequences used for phylogeny were also aligned using MUSCLE in MEGA. Results are
shown from position 260-331 where residues are conserved across all 30 bivalve globins. In
position 271, a phenylalanine residue and in positions 295 and 328, two histidine residues are
conserved across all 30 globins from the four different bivalve species (Figure 3.9).
56 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
Figure 3.8. Molecular Phylogenetic analyses by Maximum Likelihood method. Phylogenetic relationships of
globin genes were here resolved for the following species: bivalves A. trapezia and C. ales, two mollusc model
species L. gigantea and C. gigas, three vertebrate model species H. sapiens, G. gallus and D. rerio. The
percentage of trees in which the associated taxa clustered together is shown next to the branches. A discrete
Gamma distribution was used to model evolutionary rate differences among sites (3 categories (+G, parameter =
5.3023)). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 3.8405 % sites).
The tree is drawn to scale, with branch lengths measured in the number of substitutions per site.
L. gigantea globin-like 6 C. gigas neuroglobin-like
L. gigantea globin-like 5
C. gigas cytoglobin-2-like X1
A. trapezia neuroglobin NG
D. rerio GbX
C. gigas cytoglobin-2-like X3
L. gigantea globin-like 8
C. ales neuroglobin-like CalesGl1
D. rerio neuroglobin G. gallus neuroglobin
C. gigas globin-like
C. gigas cytoglobin-1
C. gigas cytoglobin-1-like
C. gigas neuroglobin (2)
C. gigas haemoglobin-1-like
L. gigantea globin-like 2
L. gigantea globin-like 1
C. gigas neuroglobin
H. sapiens neuroglobin
L. gigantea globin-like 7
L. gigantea globin-like 10
C. gigas neuroglobin-1
L. gigantea globin-like 3 L. gigantea globin-like 4
L. gigantea globin-like 9 C. gigas globin-like X1
A. trapezia heterodimer HB A. trapezia dimer 2D
A. trapezia homodimer HD A. trapezia beta globin BG
A. trapezia alpha globin AG H. sapiens cytoglobin D. rerio cytoglobin 2 G. gallus cytoglobin
D. rerio cytoglobin 1 G. gallus myoglobin
H. sapiens myoglobin D. rerio myoglobin
G. gallus globin E G. gallus HbA H. sapiens HbA D. rerio HbA
D. rerio HbB G. gallus HbB
H. sapiens HbB
C. ales haemoglobin-like CalesGl2 C. ales haemoglobin 1-like CalesGl3
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 57
Figure 3.9. Multiple alignments of globin protein sequences. All mollusc globin protein sequences used in the
phylogeny represented in Figure 3.8 were aligned for sequence comparison and to identify residue conservation
using the multiple sequence alignment with high accuracy and high throughput MUSCLE in MEGA. Sequences
are grouped by species and include A. trapezia, C. ales, C. gigas and L. gigantea. Residues conserved across all
sequences and all species are indicated above by an asterix (*).
3.4 Discussion
In this study, the transcriptome of C. ales, a bivalve mollusc, three full length globin
genes were identified. All predicted proteins of the candidate genes showed the two histidines
as well as the phenylalanine residues characteristic of globin proteins (Bashford et al., 1987).
This indicates that the proteins encoded by the globin genes identified here are capable of the
formation of the hydrophobic pocket with a haeme group (Royer et al., 2001). The sequence
similarity of the proteins, however, were highly divergent, with candidate gene one being
most similar to vertebrate Ngb genes and a Ngb-like gene from C. gigas, while candidate
genes two and three were most similar to Hb encoding genes from A. trapezia.
Of most interest here, is that two globin genes from C. ales are sister to the known Hb
encoding genes from A. trapezia. The finding that candidate genes CalesGl2 and CalesGl3
were most similar to genes encoding Hb proteins may indicate that these genes in C. ales
represent a fifth independent origin of Hb in Bivalvia. Currently, Hb has been identified in
species from Arcoida, Veneroida, Carditoida and Solemyoida orders in Bivalvia. It remains
to be determined if Hb encoding genes show patterns of molecular convergence as seen in
58 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
genes underlying convergently evolved phenotypic traits in echo-locating animals (Parker et
al., 2013) and marine mammals (Foote et al., 2015). The repeated evolution of Hb proteins in
bivalves indicates that when these mutations arise in ancestral globin genes they are strongly
selected for. To fully support a fifth independent origin of Hb in Bivalvia will require
functional protein work which was outside the scope of the current project.
Candidate gene CalesGl3 is of particular interest as this gene had multiple globin
domains in a single globin gene. Previously, di-domain Hbs in bivalve molluscs have only
been found in Barbatia reeveana and Barbatia lima from order Arcoida (Naito et al., 1991;
Suzuki et al., 1996). The sporadic occurrence of a multi-domain globin in another bivalve
class suggests that the di-domain globin gene identified in C. ales has evolved independently
of those in order Arcoida. Multi-domain globin genes have been identified in other animal
phyla including Annelida (Branchipolynoe symmytilida and Barbatia seepensis (Projecto-
Garcia et al., 2010), and Arthropoda (Artemia; (Jellie et al., 1996)). This indicates that multi-
domain globin genes may arise relatively frequently, but this idea remains to be tested. In B.
reeveana the di-domain gene was generated through incomplete gene duplication which
resulted from unequal crossing over during meiosis (Naito et al., 1991). It would not be
surprising if a similar mechanism generated the di-domain globin gene found in C. ales. In
fact, both gene duplication and adaptive evolution have been implicated in the origin and
diversity of many multi-domain proteins (Vogel et al., 2005) reinforcing the hypothesis of
evolution by incomplete duplication in this case.
Candidate gene CalesGl1 was identified as a Ngb-like gene and was found in a weakly
supported clade with vertebrate genes encoding Ngb proteins. It was not surprising to find
Ngb-like genes in bivalves as these genes were present in the common ancestor of
Eumetazoans (Cnidaria + Bilateria) (Roesner et al., 2005). The presence of only a single
copy of Ngb-like gene in both C. ales and A. trapezia was unexpected as most invertebrate
lineages often have more than one Ngb-like gene (i.e., C. gigas has 6 copies (UniProt
accession numbers: K1QVD6, K1Q9R1, K1QT48, K1QF07, K1RX51, K1R7G1). The low
number observed in C. ales and A. trapezia could be associated with low levels of gene
expression of Ngb-like genes (Schindelmeiser et al., 1979; Burmester & Hankeln, 2004) and
the use of transcriptome sequencing to identify these genes in both species investigated.
Vertebrate Ngbs are monomeric proteins involved in various physiological functions
including oxygen supply, storage, and interactions with mitochondria in nerve cells (Roesner
et al., 2005; Watanabe et al., 2012), but a number of functions of these proteins remain
Chapter 3: Functional annotation of the Ctenoides ales transcriptome 59
uncharacterised. The predicted proteins identified here, when fully characterised, may help to
better understand the function of neuroglobin-like proteins outside of vertebrate species.
3.5 Conclusion
These data provide the most comprehensive transcriptomic resource currently
available for the bivalve mollusc C. ales. In this transcriptome, there are at least three genes
encoding globin-like proteins. Candidate genes CalesGl2 and CalesGl3 are likely to encode
Hb proteins based on sequence data characteristics and similarities with existing sequences.
Candidate gene CalesGl3 in particular contains two globin domains in a single globin gene
which has only been found in another two bivalve species in a different lineage. In addition
to this, sequence similarities of candidate globin genes identified here to Hb encoding genes
in other bivalves further contributes to the theory of a common bivalve ancestor and
independent evolution of Hbs in bivalve lineages. These results provide preliminary evidence
for a possible fifth independent origin of Hbs in Bivalvia.
60 Chapter 3: Functional annotation of the Ctenoides ales transcriptome
Chapter 4: General Discussion
Haemoglobin genes are highly diverse and have evolved independently in multiple
metazoan lineages (Hardison, 1996; Mangum, 1998; Weber & Vinogradov, 2001; Hofmann et
al., 2010a; Hoffmann et al., 2012). This diversity is particularly evident in invertebrates
where Hbs show exceptional variation in their form, function and structural arrangement
(Weber, 1980; Terwilliger, 1998; Weber & Vinogradov, 2001; Alyakrinskaya, 2002;
Hoffmann et al., 2012). Such variability is reinforced by the patchy distribution of Hbs across
invertebrate groups. Bivalve molluscs for example, represent one such group, where Hbs
have evolved independently across multiple lineages (Manwell, 1963; Terwilliger et al., 1978;
Dando et al., 1985; Doeller et al., 1988; Suzuki et al., 2000). This has been shown to be the
result of gene duplication and can be correlated to species living in environments with low or
fluctuating oxygen availabilities. Knowledge on the sequence, role and evolution of Hbs is
not only important for phylogenomics but is also crucial for the bivalve farming industry.
Despite this, only a limited number of studies have examined the full range of Hb genes, or
functionally characterised their expression in bivalves (Terwilliger et al., 1983; Angelini et
al., 1998; Hourdez & Weber, 2005; Decker et al., 2014). To better understand the evolution
and expression of Hbs in bivalves, an investigation of Hb genes in two distantly related
species of bivalves is performed here. Firstly, by quantifying patterns of tissue specific Hb
gene expression in A. trapezia under submerged and aerially exposed treatments. This
provided insights into the function and role of Hb encoding genes in this species. Secondly,
by generating a new high-quality transcriptome resource for C. ales, the globin gene
repertoire in this species was examined and a fifth independent origin of Hb in bivalve
molluscs potentially identified.
4.1 Role of gene duplication in the current diversity of bivalve Hbs
The current diversity of Hb genes in bivalves has been hypothesised to be a result of
repeated rounds of gene duplication driving evolution. One of the mechanisms responsible
for this duplication has been demonstrated, by some authors, to be unequal crossing-over
during meiosis (Naito et al., 1991; Dewilde et al., 1999; Kato et al., 2001). For example, the
two-domain Hb (2D) from the blood clam B. lima is thought to have evolved from an unequal
crossing over event between two ancestral globin genes and the loss of a stop codon (Suzuki
Chapter 4: General discussion 61
et al., 1996). Duplication through unequal crossing over is best illustrated in datasets in this
study by the di-domain globin gene found in C. ales. This di-domain gene has most likely
evolved as a result of unequal crossing over that led to the deletion of a stop codon in a single
domain gene, generating an extended open reading frame with two globin domains.
Interestingly, most Arcidae species studied to date do not possess this di-domain gene.
Therefore, as C. ales, a member of the family Limidae and B. lima, a member of the Arcidae,
both have a di-domain globin gene, it is likely that this gene may have arisen independently in
both species.
The presence of at least five globin genes that encode Hb proteins in A. trapezia,
which form a single clade in the globin phylogeny for bivalves, supports the hypothesis that
extensive duplication has played an important role in the evolution of this gene family. In
arcid bivalves most species have more than a single Hb gene (Como & Thompson, 1980b;
Mangum, 1997), however, A. trapezia is the first species examined to have more than four Hb
genes. For example, three Hb genes have been identified in T. granosa and S. inaequivalvis,
while B. reeveana has four distinct Hb genes (Ikeda-Saito et al., 1983; Petruzelli et al., 1985;
Terwilliger, 1998; Royer et al., 2001; Bao & Lin, 2010). This indicates that either extensive
lineage specific duplication has occurred independently in the species of this family, or that
duplication events occurred in the common ancestor of the family Arcidae.
Investigation of expression patterns of globin genes in A. trapezia highlighted strong
tissue specific expression with some genes being predominantly expressed in erythrocytes.
Often tissue or developmental specific expression in duplicated genes is associated with
changes in the regulatory elements of these genes (Sankaran et al., 2008), which has led to
higher expression in certain tissues or developmental times. This has been demonstrated in
mammals as recently duplicated genes that shared regulatory sequences were more likely to
be co-expressed than duplicated genes that do not share regulatory sequences (Lan &
Pritchard, 2016). It remains to be determined whether divergence in regulatory sequences are
responsible for the tissue specific expression observed in A. trapezia globin genes encoding
Hb proteins, but this study is the first to show tissue specific expression. This indicates that
globin genes may have undergone neofunctionalisation following duplication in A. trapezia to
be dominantly expressed in haemocytes. This data together with data from C. ales shows that
duplication and subsequent divergence have played a dominant role in globin gene evolution
in bivalve molluscs.
62 Chapter 4: General discussion
Neofunctionalisation is the evolution of new functions in duplicated genes and has
been well demonstrated in vertebrate Hb genes (Aguileta et al., 2004; Hoffmann & Storz,
2007; Opazo et al., 2008). The pattern observed in results from chapter 2 of this study is
consistent with neofunctionalisation of clade B globin genes in A. trapezia, as two of these
genes (AG and BG) encode a tetrameric Hb (Como & Thompson, 1980a), while another gene
(HB) in this clade encodes a dimeric Hb (Petruzelli et al., 1985). Two of these three genes
(AG and BG), are dominantly expressed in haemocytes and as these cells are a novel
phenotypic trait in Arcidae, this expression pattern is indicative of neofunctionalisation.
Multiple examples exist that support the idea that neofunctionalisation of gene duplicates has
played an important role in the generation of novel phenotypic traits (Birchler & Veitia, 2010;
Kaessmann, 2010; Osborn et al., 2003). Of these, a well-known example is that of a
neofunctionalised copy of an elastin gene which contributed to the evolution of the bulbus
arteriosus, a novel organ found in the heart of teleost fish (Moriyama et al., 2016). While it is
tempting to speculate that neofunctionalised Hb genes may contribute to evolutionary novelty
in Arcid bivalve erythrocytes, the extensive duplication of globin genes encoding Hb proteins
may be associated with escape from adaptive conflict (Des Marais & Rausher, 2008; Storz et
al., 2008) if Hb proteins undertake multiple roles within this cell type.
Globin and Hb genes from many species have been demonstrated to encode
multifunctional proteins that serve not only in O2 transport, but also play a role in immune
function and contribute to disease phenotypes with haemocytes recognised as the immune
effectors (Bao et al., 2013b; Vinogradov & Moens, 2008; Weatherall, 2001; Donaghy,
2009). For example, a duplicated gene encoding a dimeric Hb from the arcid bivalve T.
granosa, was upregulated following an immune challenge with V. parahaemolyticus (Bao et
al., 2013b). This study demonstrates that the duplicated Hb genes of arcid bivalves may
have multifunctional roles in both O2 transport and the innate immune system response.
Such acquisition of new roles in pre-existing genes requires changes in the regulatory
elements of these genes as well as through coding regions. This is also necessary as these
genes adapt to different living conditions such as hypoxic environments. For example,
globin diversity in vesicomyid clams is believed to be a result of monomers modulation to
accommodate for oxygen levels in their surrounding environment. This is hypothesised to
be the result of structural genetic diversity in populations or changes in globin gene
transcription according to environmental changes (Carney et al., 2007). Based on findings in
this study for the bivalve A. trapezia, we can argue that it is most likely not a result of
Chapter 4: General discussion 63
changes in the transcription of globin genes, at least for animals exposed to air for up to 12
hours as expression levels of globins did not significantly change. Consequently, the
extensive duplication of Hb encoding globin genes in A. trapezia may allow natural selection
to drive specialization of different members of this multifunctional gene family. Further
functional studies will be required to determine if this idea is correct.
4.2 Globin gene evolution in hypoxic environments
The variability in functional properties of bivalve Hbs are thought to be associated with
the wide range of environmental conditions that these organisms encounter in their habitat
(O’Gower & Nicol, 1968; Terwilliger et al., 1978; Alyakrinskaya, 2002). Among these
environmental conditions, one of the most common stresses affecting bivalves is hypoxia
(Widdows et al., 1979; Weber, 1980; Booth et al., 1984; Burnett, 1997; Gobler et al., 2014).
It develops in organisms where the depletion of oxygen through respiration is faster than its
replenishment and some bivalve species show better tolerance and survival rates than others
when subject to extended periods of hypoxia (Officer et al., 1984; Rabalais et al., 2010). One
of the reasons for this is their ability to close their shells therefore avoiding oxygen depletion
and switching from aerobic metabolism to anaerobic metabolism (Brooks et al., 1991; De
Zwann et al., 1993). The present study showed that this may also be an adaptive feature in
the blood clam A. trapezia.
Furthermore, the evolution of circulating Hbs in bivalve lineages has been hypothesised
to allow for maximised O2 binding and transport during times of hypoxia (Projecto-Garcia et
al., 2015), since both Hbs and Hcs occur in bivalve species that experience periodic hypoxia
(Morse et al., 1986; Mangum et al., 1987; Terwilliger et al., 1988; Riggs, 1991; Weber &
Vinogradov, 2001; Projecto-Garcia et al., 2015). The presence of multiple Hb proteins in
bivalve species is also a common observation and has been linked to many organisms that
synthesise multiple oxygen carriers with different oxygen affinities to meet physiological
demands (Mangum, 1997; Weber & Vinogradov, 2001; Projecto-Garcia et al., 2015). For
example, the clam C. magnifica is found lodged into rock fissures outside hydrothermal vents
and exposed to both deep-sea water and vent fluid meaning that it is frequently subject to
chronic hypoxia (Berg, 1980). This species possesses an intracellular Hb with high O2
affinity for carrying and transport required for its sustainability in such a challenging
environment (Terwilliger et al., 1983; Scott & Fisher, 1995; Hourdez & Weber, 2005). The
64 Chapter 4: General discussion
deep-sea clams C. kaikoi, Calyptogena soyoae and Calyptogena tsubasa all possess two
homo-dimeric Hbs with more than 90 % identity, of which, HbI and HbII in C. kaikoi have
been found to be involved in O2 storage under low O2 conditions in the deep sea rather than
O2 transport (Kawano et al., 2003; Suzuki et al., 2000). Arcid bivalves are also frequently
found in habitats with low levels of O2 such as intertidal zones or deep-sea waters and often
experience prolonged periods of hypoxia (Arp et al., 1984; Abele-Oeschger & Oeschger,
1995; Terwilliger, 1998; Weber & Vinogradov, 2001). This is the case with the intertidal
bivalve A. trapezia used in this study but also with other arcid bivalves such as Anadara
kagoshimensis. As with A. trapezia, this species is found in sandy-muddy areas of the Indo-
Pacific region and possesses haemoglobin-containing erythrocytes (Golovina et al., 2016).
Compared to other bivalve species in the same habitat, A. kagoshimensis has also shown
better tolerance to hypoxia and this has been attributed to the presence of Hbs (Holden et al.,
1994). The multiple Hbs in these bivalves may be produced simultaneously and may have
different functions such as O2 carrying or storage, or can be produced sequentially to follow
changing conditions (Terwilliger, 1998). Different Hb structures confers them these different
functions and thus their potential resistance to hypoxia in the habitats where the bivalves
settle (Decker et al., 2016). Overall, the presence of Hbs seems to play a role in adaptation of
bivalves to hypoxia. Therefore, mutations that confer the evolution of such complex
respiratory pigments such as Hb are likely selected for.
Since all blood clams seem to have originated from a common ancestor, globin
structural variations and levels of expression is thought to have been influenced directly by
the environmental conditions these clam species are subjected to and specifically oxygen
concentration and availability (Kawano et al., 2003). The evidence from this study reinforces
that greater expression of Hbs in haemolymph confers a physiological advantage in tolerating
hypoxia. It is shown here by expression levels obtained for the blood clam A. trapezia.
Given the hypoxic nature of the intertidal environment where this bivalve is found, it is
almost certain that a duplication event was driven by the need for oxygen availability during
low tides. This is also consistent with the finding of Hb-like genes that may encode Hb
proteins in C.ales as this species lives in the tropical waters of the Indo-Pacific area at depths
between 30 – 35 m (and sometimes up to 50 m) where both depth and temperature reduce the
solubility of O2 creating a hypoxic environment (Garcia et al., 2005; Karstensen et al., 2008).
Nonetheless, the idea that Hb has evolved more often in lineages that live in hypoxic
environments requires further analysis before it can be supported.
Chapter 4: General discussion 65
4.3 The importance of globin gene evolution to aquaculture
Aquaculture is one of the fastest growing food-producing sectors and accounts for nearly 50
% of world consumption. The major groups currently produced in aquacultureinclude finfish,
crustaceans and molluscs. Culture of molluscs contributed approximately 20 % of total
aquaculture production in 2014 and this amount has been steadily increasing in recent years.
Bivalve molluscs of the family Arcidae includea number of major fishery and aquaculture
species such as T. granosa, S. inaequivalis, S. broughtonii and A. trapezia (Donaghy et al.,
2009; Bao et al., 2013b). Besides their importance as food sources, bivalve molluscs are
frequently used as indicators of pollution and overall health of ecosystems. Some of the
major issues encountered in clam farming are disease outbreaks, low survival rates due to
environmental pollution and slow growth rates (Alkarkhi et al., 2008; Vuddhakul et al.,
2006). Haemocytes found in bivalve molluscs have been shown to be involved in various
biological functions (Donaghy et al., 2009) such as immune defence against bacteria and
viruses (Bao et al., 2013b) but also detoxification. In fact, the role of haemoglobin in
haemocytes goes beyond supporting aerobic metabolism and in some species SNPs in specific
Hb genes have been observed to correlate with disease resistence. Hbs in bivalves have been
shown to be a source of antibacterial activity but are also responsible for eliminating harmful
reactive oxygen species (ROS) and Nitric Oxide (NO) which may be present in polluted
environments. Investigation of the haemoglobin genes in these bivalves as well as their
expression under environmental stress as it is done in this study for A.trapezia contributes to
our understanding of the functions of Hbs and haemocytes which may provide new
perspectives for disease control and resistance to pollution in cultures.
Aquaculture production systems are often classified into three general types: extensive ponds,
intensive ponds and intensive recirculating tank and raceway systems (Ebeling, 2006). There
is also a tendency of many aquaculture enterprises to intensify production using
superintensive systems, for example, super intensive culture using Atlantic salmon and
rainbow trout. Currently, dissolved oxygen is the most important limiting factor to increase
production in intensive systems. The study of haemoglobin gene expression under stress in A.
trapezia and the transcriptome generated for C. ales may be used in future studies looking at
dealing with this major issue. In particular, examining haemoglobin gene expression under
different oxygen concentrations will provide much needed data about how molluscs cope with
low dissolved oxygen in culture.
66 Chapter 4: General discussion
Other bivalves such as scallops are also used for farming. For example, in the scallop family
Pectinidae, only about 10 species are currently cultured from the 400 living species present in
this family (Shumway & Parsons, 2011). New sequence data for species is the ultimate
resource for the introduction new species in cultures or the addition of genetic material to
existing species to increase their diversity and their performance in cultures (Guo, 2009). The
transcriptome generated in this study for the scallop C. ales and the haemoglobin-like genes
described can therefore be used as potential molecular markers for bivalve selection and
breeding to improve aquaculture production. Overall, research based on trancriptomics and
proteomics proves valuable as it increases genetic data to study molecular traits of interest for
cultured species and especially evaluate their disease susceptibility and resistance to
environmental stresses.
4.4 Limitations of the study
Despite some limitations in the two studies conducted as part of this Masters thesis,
the findings about the evolution and expression of Hb genes in bivalve molluscs are valid.
One major limitation of the study is the use of transcriptome sequences for the identification
of Hb genes in both A. trapezia and C. ales instead of full genome sequences. Consequently,
the presence of only six and three full-length globin transcripts in A. trapezia and C. ales,
respectively may be an underestimation of the actual number of Hb genes in these species and
should be viewed with some caution. Many functional genes are not captured by
transcriptome sequencing because they are only expressed at specific developmental stages, in
certain tissues or at very low levels (Yagil et al., 2005). Thus this is a limitation of using
transcriptome sequencing in isolation to identify the number of different genes within gene
families in a species. Complete genome sequencing was not feasible in the current study due
to the time restrictions and financial constraint.
4.5 Future research
Data presented here for A. trapezia provides insight into the multiple functions of Hbs,
possible future studies could consist of further validating Hb genes. This could include
whether Hbs are translated into proteins or whether some might be pseudogenes. Data
presented here for C. ales provide the most comprehensive transcriptomic resource currently
available for this species and therefore allows for numerous gene families to be examined
Chapter 4: General discussion 67
detail. Overall this resource should lay an important foundation for future genetic or genomic
studies in this species. In future studies, complete genome sequences could also improve the
detection of the entire complement of globin genes for these two bivalve species.
4.6 Conclusions
Overall, the evolution of Hbs in bivalves is intriguing due to the great diversity of
proteins found across lineages both in structure and function. This study looked at the blood
clam A. trapezia which possesses five duplicated Hb encoding genes and determined that
expression levels of those five genes are much higher in haemolymph than in foot, gills,
mantle or muscle therefore validating the first hypothesis of this project that they have
undergone neofunctionalisation through gene duplication. Furthermore, the expression of
those genes was not affected by short-term or long-term hypoxia; therefore it can be
concluded that the neofunctionalisation acquired through gene duplication of existing Hb
encoding genes may provide some evolutionary advantage for this bivalve species. Findings
on the sensitivity and reaction of A. trapezia to hypoxia may also be a valuable indicator of
the potential for bivalve populations to survive in challenging environments.
Additionally, next generation sequencing was used to obtain the expressed portion of
the genome and present the first transcriptome for the species C. ales. Three globin-like
encoding genes were found in the newly generated transcriptome for this bivalve species. The
presence of a di-domain globin in particular represents the first in bivalves since the findings
of a di-domain in B. lima and B. reeveana in the Arcoida order. This suggests that multi-
domain globin genes may arise repeatedly through incomplete gene duplication. Although
more investigation is required on the three candidate genes identified here, preliminary
findings in this study support a fifth independent origin of Hb in bivalves and contribute to
validate the second hypothesis of this project.
68 Chapter 4: General discussion
References
Abele-Oeschger, D., & Oeschger, R. (1995). Hypoxia-induced autoxidation of haemoglobin
in the benthic invertebrates Arenicola marina (Polychaeta) and Astarte borealis (Bivalvia)
and the possible effects of sulphide. Journal of Experimental Marine Biology and
Ecology, 187(1), 63–80. https://doi.org/10.1016/0022-0981(94)00172-A
Aguileta, G., Bielawski, J. P., & Yang, Z. (2004). Gene conversion and functional divergence
in the β-globin gene family. Journal of Molecular Evolution, 59(2), 177–189.
https://doi.org/10.1007/s00239-004-2612-0
Alkarkhi, F. A., Ismail, N., & Easa, A. M. (2008). Assessment of arsenic and heavy metal
contents in cockles (Anadara granosa) using multivariate statistical techniques. Journal
of hazardous materials, 150(3), 783-789. https://doi.org/10.1016/j.jhazmat.2007.05.035
Alyakrinskaya, I. O. (2002). Physiological and biochemical adaptations to respiration of
haemoglobin-containing hydrobionts. Biology Bulletin of the Russian Academy of
Sciences, 29(3), 268–283. https://doi.org/10.1023/A:1015438615417
Andrews, S. (2010). FastQC: A quality control tool for high throughput sequence data.
Angelini, E., Salvato, B., Muro, P. D., & Beltramini, M. (1998). Respiratory pigments of
Yoldia eightsi, an Antarctic bivalve. Marine Biology, 131(1), 15–23.
https://doi.org/10.1007/s002270050291
Antonini, E., & Chiancone, E. (1977). Assembly of multisubunit respiratory proteins. Annual
Review of Biophysics and Bioengineering, 6(1), 239–271.
https://doi.org/10.1146/annurev.bb.06.060177.001323
Arp, A. J., Childress, J. J., & Fisher, C. R. (1984). Metabolic and blood gas transport
characteristics of the hydrothermal vent bivalve Calyptogena magnifica. Physiological
Zoology, 57(6), 648–662. https://doi.org/%7B%7Barticle.doi%7D%7D
References 69
At, G., & Eo, T. (1984). Amino acid sequence of the beta-chain of the tetrameric
haemoglobin of the bivalve mollusc, Anadara trapezia. Australian Journal of Biological
Sciences, 38(3), 221–236. https://doi.org/10.1071/BI9800653
Baldwin, J., & Lee, A. K. (1978). Contributions of aerobic and anaerobic energy production
during swimming in the bivalve mollusc Limaria fragilis (family Limidae). Journal of
Comparative Physiology, 129(4), 361–364. https://doi.org/10.1007/BF00686994
Bao, Y. B., Wang, Q., Guo, X. M., & Lin, Z. H. (2013a). Structure and immune expression
analysis of haemoglobin genes from the blood clam Tegillarca granosa. Genetics and
Molecular Research, 12(3), 3110–3123. http://dx.doi.org/10.4238/2013.February.28.5
Bao, Y., Li, P., Dong, Y., Xiang, R., Gu, L., Yao, H., …& Lin, Z. (2013b). Polymorphism of
the multiple haemoglobins in blood clam Tegillarca granosa and its association with
disease resistance to Vibrio parahaemolyticus. Fish & Shellfish Immunology, 34(5), 1320–
1324. https://doi.org/10.1016/j.fsi.2013.02.022
Bao, Y., & Lin, Z. (2010). Generation, annotation, and analysis of ESTs from hemocyte of
the bloody clam, Tegillarca granosa. Fish & Shellfish Immunology, 29(5), 740–746.
https://doi.org/10.1016/j.fsi.2010.07.009
Bashford, D., Chothia, C., & Lesk, A. M. (1987). Determinants of a protein fold. Journal of
Molecular Biology, 196(1), 199–216. https://doi.org/10.1016/0022-2836(87)90521-3
Berg Jr, C. J. (1980). Description of living specimens of Calyptogena magnifica Boss and
Turner with notes on their distribution and ecology. Appendix 1. The giant white clam
from the Galapagos Rift, Calyptogena magnifica species novum. Malacologia, 20, 183–
185.
Bikard, D., Patel, D., Metté, C. L., Giorgi, V., Camilleri, C., Bennett, M. J., & Loudet, O.
(2009). Divergent Evolution of Duplicate Genes Leads to Genetic Incompatibilities Within
70 References
Arabidopsis thaliana. Science, 323(5914), 623–626.
https://doi.org/10.1126/science.1165917
Birchler, J. A., & Veitia, R. A. (2010). The gene balance hypothesis: implications for gene
regulation, quantitative traits and evolution. New Phytologist, 186(1), 54–62.
https://doi.org/10.1111/j.1469-8137.2009.03087.x
Bischof, J. M., Chiang, A. P., Scheetz, T. E., Stone, E. M., Casavant, T. L., Sheffield, V. C.,
& Braun, T. A. (2006). Genome-wide identification of pseudogenes capable of disease-
causing gene conversion. Human Mutation, 27(6), 545–552.
https://doi.org/10.1002/humu.20335
Blank, M., & Burmester, T. (2012). Widespread occurrence of N-terminal acylation in animal
globins and possible origin of respiratory globins from a membrane-bound ancestor.
Molecular Biology and Evolution, 23(11), 3553–3561.
https://doi.org/10.1093/molbev/mss164
Blank, M., Kiger, L., Thielebein, A., Gerlach, F., Hankeln, T., Marden, M. C., & Burmester,
T. (2011). Oxygen supply from the bird’s eye perspective globin E is a respiratory protein
in the chicken retina. Journal of Biological Chemistry, 286(30), 26507–26515.
https://doi.org/10.1074/jbc.M111.224634
Booth, C. E., McDonald, D. G., & Walsh, P. J. (1984). Acid-base balance in the sea mussel,
Mytilus edulis. I. Effects of hypoxia and air-exposure on haemolymph acid-base status.
Marine Biology Letters, 5, 347–358.
Brooks, S. P. J., Zwaan, A. de, Thillart, G. van den, Cattani, O., Cortesi, P., & Storey, K. B.
(1991). Differential survival of Venus gallina and Scapharca inaequivalvis during anoxic
stress: Covalent modification of phosphofructokinase and glycogen phosphorylase during
anoxia. Journal of Comparative Physiology , 161(2), 207–212.
https://doi.org/10.1007/BF00262885
References 71
Brunori, M., & Vallone, B. (2007). Neuroglobin, seven years after. Cellular and Molecular
Life Sciences, 64(10), 1259. https://doi.org/10.1007/s00018-007-7090-2
Burmester, T., & Hankeln, T. (2014). Function and evolution of vertebrate globins. Acta
Physiologica, 211(3), 501–514. https://doi.org/10.1111/apha.12312
Burmester, T., & Hankeln, T. (2009). What is the function of neuroglobin? Journal of
Experimental Biology, 212(10), 1423–1428. https://doi.org/10.1242/jeb.000729
Burmester, T., & Hankeln, T. (2004). Neuroglobin: A Respiratory Protein of the Nervous
System. Physiology, 19(3), 110–113. https://doi.org/10.1152/nips.01513.2003
Burmester, T., Weich, B., Reinhardt, S., & Hankeln, T. (2000). A vertebrate globin expressed
in the brain. Nature, 407(6803), 520–523. https://doi.org/10.1038/35035093
Burnett, L. E. (1997). The challenges of living in hypoxic and hypercapnic aquatic
environments. American Zoologist, 37(6), 633–640. https://doi.org/10.1093/icb/37.6.633
Cañestro, C., Albalat, R., Irimia, M., & Garcia-Fernàndez, J. (2013). Impact of gene gains,
losses and duplication modes on the origin and diversification of vertebrates. Seminars in
Cell & Developmental Biology, 24(2), 83–94.
https://doi.org/10.1016/j.semcdb.2012.12.008
Carney, S. L., Flores, J. F., Orobona, K. M., Butterfield, D. A., Fisher, C. R., & Schaeffer, S.
W. (2007). Environmental differences in haemoglobin gene expression in the
hydrothermal vent tubeworm, Ridgeia piscesae. Comparative Biochemistry and
Physiology Part B: Biochemistry and Molecular Biology, 146(3), 326-337.
https://doi.org/10.1016/j.cbpb.2006.11.002
Ching Ming Chung, M., & Ellerton, H. D. (1980). The physico-chemical and functional
properties of extracellular respiratory haemoglobins and chlorocruorins. Progress in
Biophysics and Molecular Biology, 35, 53–102. https://doi.org/10.1016/0079-
6107(80)90003-6
72 References
Como, P. F., & Thompson, E. O. P. (1980a). Amino acid sequence of the α-chain of the
tetrameric haemoglobin of the bivalve mollusc Anadara trapezia. Australian Journal of
Biological Sciences, 33(6), 653–664. https://doi.org/10.1071/BI9800653
Como, P. F., & Thompson, E. O. P. (1980b). Multiple haemoglobins of the bivalve mollusc
Anadara trapezia. Australian Journal of Biological Sciences, 33(6), 643–652.
https://doi.org/10.1071/BI9800643
Corti, P., Xue, J., Tejero, J., Wajih, N., Sun, M., Stolz, D. B., … & Gladwin, M. T. (2016).
Globin X is a six-coordinate globin that reduces nitrite to nitric oxide in fish red blood
cells. Proceedings of the National Academy of Sciences, 113(30), 8538–8543.
https://doi.org/10.1073/pnas.1522670113
Crenshaw, M. A., & Neff, J. M. (1969). Decalcification at the mantle-shell interface in
molluscs. American Zoologist, 9(3), 881–885. https://doi.org/10.1093/icb/9.3.881
Dando, P. R., Southward, A. J., Southward, E. C., Terwilliger, N. B., & Terwilliger, R. C.
(1985). Sulphur-oxidising bacteria and haemoglobin in gills of the bivalve mollusc Myrtea
spinifera. Retrieved from http://agris.fao.org/agris-
search/search.do?recordID=AV20120126232
Darawshe, S., Tsafadyah, Y., & Daniel, E. (1987). Quaternary structure of erythrocruorin
from the nematode Ascaris suum. Evidence for unsaturated haem-binding sites.
Biochemical Journal, 242(3), 689–694. https://doi.org/10.1042/bj2420689
Davenport, J., & Wong, T. M. (1986). Responses of the blood cockle Anadara granosa (L.)
(Bivalvia: Arcidae) to salinity, hypoxia and aerial exposure. Aquaculture, 56(2), 151–162.
https://doi.org/10.1016/0044-8486(86)90024-4
De Zwaan, A., Cattan, O., & Putzer, V. M. (1993). Sulfide and cyanide induced mortality and
anaerobic metabolism in the arcid blood clam Scapharca inaequivalvis. Comparative
References 73
Biochemistry and Physiology Part C: Comparative Pharmacology, 105(1), 49–54.
https://doi.org/10.1016/0742-8413(93)90056-Q
Decker, C., Zorn, N., Le Bruchec, J., Caprais, J. C., Potier, N., Leize-Wagner, E., ... &
Andersen, A. C. (2016). Can the haemoglobin characteristics of vesicomyid clam species
influence their distribution in deep-sea sulfide-rich sediments? A case study in the Angola
Basin. Deep Sea Research Part II: Topical Studies in Oceanography.
http://dx.doi.org/10.1016/j.dsr2.2016.11.009
Decker, C., Zorn, N., Potier, N., Leize-Wagner, E., Lallier, F. H., Olu, K., & Andersen, A. C.
(2014). Globin’s structure and function in vesicomyid bivalves from the gulf of guinea
cold seeps as an adaptation to life in reduced sediments. Physiological and Biochemical
Zoology: Ecological and Evolutionary Approaches, 87(6), 855–869.
https://doi.org/10.1086/678131
Des Marais, D. L., & Rausher, M. D. (2008). Escape from adaptive conflict after duplication
in an anthocyanin pathway gene. Nature, 454(7205), 762–765.
https://doi.org/10.1038/nature07092
Dewilde, S., Angelini, E., Kiger, L., Marden, M., Beltramini, M., Salvato, B., & Moens, L.
(2003). Structure and function of the globin and globin gene from the Antarctic mollusc
Yoldia eightsi. Biochemical Journal, 370, 245–253. https://doi.org/10.1042/bj20020727
Dewilde, S., Hauwaert, M.-. L., Peeters, K., Vanfleteren, J., & Moens, L. (1999). Daphnia
pulex didomain haemoglobin: structure and evolution of polymeric haemoglobins and
their coding genes. Molecular Biology Evolution, 16.
https://doi.org/10.1093/oxfordjournals.molbev.a026211
Dixon, B., Walker, B., Kimmins, W., & Pohajdak, B. (1991). Isolation and sequencing of a
cDNA for an unusual haemoglobin from the parasitic nematode Pseudoterranova
74 References
decipiens. Proceedings of the National Academy of Sciences, 88(13), 5655–5659.
https://doi.org/10.1073/pnas.88.13.5655
Doeller, J. E., Kraus, D. W., Colacino, J. M., & Wittenberg, J. B. (1988). Gill Haemoglobin
May Deliver Sulfide to Bacterial Symbionts of Solemya velum (Bivalvia, Mollusca). The
Biological Bulletin, 175(3), 388–396. https://doi.org/%7B%7Barticle.doi%7D%7D
Donaghy, L., Lambert, C., Choi, K. S., & Soudant, P. (2009). Hemocytes of the carpet shell
clam (Ruditapes decussatus) and the Manila clam (Ruditapes philippinarum): current
knowledge and future prospects. Aquaculture, 297(1), 10-24.
Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high
throughput. Nucleic Acids Research, 32(5), 1792–1797.
https://doi.org/10.1093/nar/gkh340
Efstratiadis, A., Posakony, J. W., Maniatis, T., Lawn, R. M., O’Connell, C., Spritz, R. A., …
& Blechl, A.E. (1980). The structure and evolution of the human β-globin gene family.
Cell, 21(3), 653–668. https://doi.org/10.1016/0092-8674(80)90429-8
Ekblom, R., & Galindo, J. (2011). Applications of next generation sequencing in molecular
ecology of non-model organisms. Heredity, 107(1), 1–15.
https://doi.org/10.1038/hdy.2010.152
FAO. (2009). Fisheries & Aquaculture - Fishery statistical collections - global aquaculture
production. Retrieved August 23, 2016, from http://www.fao.org/fishery/statistics/global-
aquaculture-production/en
Finn, R. D., Miller, B. L., Clements, J., & Bateman, A. (2014). iPfam: a database of protein
family and domain interactions found in the Protein Data Bank. Nucleic Acids Research,
42(D1), D364–D373. https://doi.org/10.1093/nar/gkt1210
References 75
Fisher, A., Comly, M., Do, R., Temarkin, L., Ghazanfari, A. F., & Mukherjee, A. B. (1984).
Two pools of β-endorphin-like immunoreactivity in blood: plasma and erythrocytes. Life
Sciences, 34(19), 1839–1846. https://doi.org/10.1016/0024-3205(84)90677-5
Flögel, U., Merx, M. W., Gödecke, A., Decking, U. K. M., & Schrader, J. (2001).
Myoglobin: A scavenger of bioactive NO. Proceedings of the National Academy of
Sciences, 98(2), 735–740. https://doi.org/10.1073/pnas.98.2.735
Flores, J. F., Fisher, C. R., Carney, S. L., Green, B. N., Freytag, J. K., Schaeffer, S. W., &
Royer, W. E. (2005). Sulfide binding is mediated by zinc ions discovered in the crystal
structure of a hydrothermal vent tubeworm haemoglobin. Proceedings of the National
Academy of Sciences of the United States of America, 102(8), 2713–2718.
https://doi.org/10.1073/pnas.0407455102
Foote, A. D., Liu, Y., Thomas, G. W. C., Vinař, T., Alföldi, J., Deng, J., … & Gibbs, R. A.
(2015). Convergent evolution of the genomes of marine mammals. Nature Genetics, 47(3),
272–275. https://doi.org/10.1038/ng.3198
Force, A., Lynch, M., Pickett, F. B., Amores, A., Yan, Y., & Postlethwait, J. (1999).
Preservation of Duplicate Genes by Complementary, Degenerative Mutations. Genetics,
151(4), 1531–1545.
Fuchs, C., Burmester, T., & Hankeln, T. (2006). The amphibian globin gene repertoire as
revealed by the Xenopus genome. Cytogenetic and Genome Research, 112(3–4), 296–306.
Furuta, H., & Kajita, A. (1983). Dimeric haemoglobin of the bivalve mollusc Anadara
broughtonii: complete amino acid sequence of the globin chain. Biochemistry, 22(4), 917–
922. https://doi.org/10.1021/bi00273a032
Gallant, J. R., Traeger, L. L., Volkening, J. D., Moffett, H., Chen, P.-H., Novina, C. D., … &
Sussman, M. R. (2014). Genomic basis for the convergent evolution of electric organs.
Science, 344(6191), 1522–1525. https://doi.org/10.1126/science.1254432
76 References
Garcia, H. E., Boyer, T. P., Levitus, S., Locarnini, R. A., & Antonov, J. (2005). On the
variability of dissolved oxygen and apparent oxygen utilization content for the upper
world ocean: 1955 to 1998. Geophysical Research Letters, 32(9).
https://doi.org/10.1029/2004GL022286
Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R. D., & Bairoch, A. (2003).
ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic
Acids Research, 31(13), 3784–3788.
Gobler, C. J., DePasquale, E. L., Griffith, A. W., & Baumann, H. (2014). Hypoxia and
acidification have additive and synergistic negative effects on the growth, survival, and
metamorphosis of early life stage bivalves. PLoS one, 9(1), e83648.
https://doi.org/10.1371/journal.pone.0083648
Golovina, I. V., Gostyukhina, O. L., & Andreyenko, T. I. (2016). Specific metabolic features
in tissues of the ark clam Anadara kagoshimensis. Russian Journal of Biological
Invasions, 7(2), 137-145.
González, V. L., Andrade, S. C. S., Bieler, R., Collins, T. M., Dunn, C. W., Mikkelsen, P. M.,
Taylor, J.D., & Giribet, G. (2015). A phylogenetic backbone for Bivalvia: an RNA-seq
approach. Proceedings of the Royal Society B, 282(1801), 20142332.
https://doi.org/10.1098/rspb.2014.2332
Goodman, M., Czelusniak, J., Koop, B. F., Tagle, D. A., & Slightom, J. L. (1987). Globins: A
case study in molecular phylogeny. Cold Spring Harbor Symposia on Quantitative
Biology, 52, 875–890. https://doi.org/10.1101/SQB.1987.052.01.096
Goossens, M., Dozy, A. M., Embury, S. H., Zachariades, Z., Hadjiminas, M. G.,
Stamatoyannopoulos, G., & Kan, Y. W. (1980). Triplicated alpha-globin loci in humans.
Proceedings of the National Academy of Sciences, 77(1), 518–521.
References 77
Götting, M., & Nikinmaa, M. (2015). More than haemoglobin – the unexpected diversity of
globins in vertebrate red blood cells. Physiological Reports, 3(2), e12284.
https://doi.org/10.14814/phy2.12284
Gow, A. J., Payson, A. P., & Bonaventura, J. (2005). Invertebrate haemoglobins and nitric
oxide: How heme pocket structure controls reactivity. Journal of Inorganic Biochemistry,
99(4), 903–911. https://doi.org/10.1016/j.jinorgbio.2004.12.001
Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., … &
Regev, A. (2011). Full-length transcriptome assembly from RNA-Seq data without a
reference genome. Nature Biotechnology, 29(7), 644–652.
https://doi.org/10.1038/nbt.1883
Gribaldo, S., Casane, D., Lopez, P., & Philippe, H. (2003). Functional divergence prediction
from evolutionary analysis: A case study of vertebrate haemoglobin. Molecular Biology
and Evolution, 20(11), 1754–1759. https://doi.org/10.1093/molbev/msg171
Grinich, N. P., & Terwilliger, R. C. (1980). The quarternary structure of an unusual high-
molecular-weight intracellular haemoglobin from the bivalve mollusc Barbatia reeveana.
Biochemical Journal, 189(1), 1–8. https://doi.org/10.1042/bj1890001
Grispo, M. T., Natarajan, C., Projecto-Garcia, J., Moriyama, H., Weber, R. E., & Storz, J. F.
(2012). Gene duplication and the evolution of haemoglobin isoform differentiation in
birds. Journal of Biological Chemistry, 287(45), 37647–37658.
https://doi.org/10.1074/jbc.M112.375600
Guo, X. (2009). Use and exchange of genetic resources in molluscan aquaculture. Reviews in
Aquaculture, 1(3–4), 251–259. https://doi.org/10.1111/j.1753-5131.2009.01014.x
Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden, J.,… &
Regev, A. (2013). De novo transcript sequence reconstruction from RNA-seq using the
78 References
Trinity platform for reference generation and analysis. Nature Protocols, 8(8), 1494–1512.
https://doi.org/10.1038/nprot.2013.084
Halanych, K. M., & Passamaneck, Y. (2001). A brief review of metazoan phylogeny and
future prospects in hox-research. American Zoologist, 41(3), 629–639.
https://doi.org/10.1093/icb/41.3.629
Hankeln, T., Ebner, B., Fuchs, C., Gerlach, F., Haberkamp, M., Laufs, T. L., …& Burmester,
T. (2005). Neuroglobin and cytoglobin in search of their role in the vertebrate globin
family. Journal of Inorganic Biochemistry, 99(1), 110–119.
https://doi.org/10.1016/j.jinorgbio.2004.11.009
Hanscombe, O., Whyatt, D., Fraser, P., Yannoutsos, N., Greaves, D., Dillon, N., & Grosveld,
F. (1991). Importance of globin gene order for correct developmental expression. Genes &
Development, 5(8), 1387–1394. https://doi.org/10.1101/gad.5.8.1387
Hardison, R., Slightom, J. L., Gumucio, D. L., Goodman, M., Stojanovic, N., & Miller, W.
(1997). Locus control regions of mammalian β-globin gene clusters: combining
phylogenetic analyses and experimental results to gain functional insights. Gene, 205(1–
2), 73–94. https://doi.org/10.1016/S0378-1119(97)00474-5
Hardison, R. (1996). A brief history of haemoglobins: plant, animal, protist, and bacteria.
Proceedings of the National Academy of Sciences of the United States of America, 93(12),
5675.
Harper, E. M., & Skelton, P. W. (1993). The Mesozoic marine revolution and epifaunal
bivalves. Scripta Geologica, Special, (2), 127–153.
He, X., & Zhang, J. (2005). Rapid subfunctionalization accompanied by prolonged and
substantial neofunctionalization in duplicate gene evolution. Genetics, 169(2), 1157–1164.
https://doi.org/10.1534/genetics.104.037051
References 79
Herreid, C. F. (1980). Hypoxia in invertebrates. Comparative Biochemistry and Physiology
Part A: Physiology, 67(3), 311–320. https://doi.org/10.1016/S0300-9629(80)80002-8
Higgs, D. R., Old, J. M., Pressley, L., Clegg, J. B., & Weatherall, D. J. (1980). A novel
[alpha]-globin gene arrangement in man. Nature, 284(5757), 632–635.
https://doi.org/10.1038/284632a0
Hoffmann, F. G., Opazo, J. C., Hoogewijs, D., Hankeln, T., Ebner, B., Vinogradov, S. N., …
& Storz, J. F. (2012). Evolution of the globin gene family in deuterostomes: lineage-
specific patterns of diversification and attrition. Molecular Biology and Evolution, 29(7),
1735–1745. https://doi.org/10.1093/molbev/mss018
Hoffmann, F. G., Opazo, J. C., & Storz, J. F. (2011). Differential loss and retention of
cytoglobin, myoglobin, and globin-E during the radiation of vertebrates. Genome Biology
and Evolution, 3, 588–600. https://doi.org/10.1093/gbe/evr055
Hoffmann, F. G., Opazo, J. C., & Storz, J. F. (2010a). Gene cooption and convergent
evolution of oxygen transport haemoglobins in jawed and jawless vertebrates. Proceedings
of the National Academy of Sciences, 107(32), 14274–14279.
https://doi.org/10.1073/pnas.1006756107
Hoffmann, F. G., Storz, J. F., Gorr, T. A., & Opazo, J. C. (2010b). Lineage-specific patterns
of functional diversification in the α- and β-globin gene families of tetrapod vertebrates.
Molecular Biology and Evolution, 27(5), 1126–1138.
https://doi.org/10.1093/molbev/msp325
Hoffmann, F. G., & Storz, J. F. (2007). The αD-globin gene originated via duplication of an
embryonic α-like globin gene in the ancestor of tetrapod vertebrates. Molecular Biology
and Evolution, 24(9), 1982–1990. https://doi.org/10.1093/molbev/msm127
Hokamp, K., McLysaght, A., & Wolfe, K. H. (2003). The 2R hypothesis and the human
genome sequence. In A. Meyer & Y. V. de Peer (Eds.), Genome Evolution (pp. 95–110).
80 References
Springer Netherlands. Retrieved from http://link.springer.com/chapter/10.1007/978-94-
010-0263-9_10
Holden, J. A., Pipe, R. K., Quaglia, A., & Ciani, G. (1994). Blood cells of the arcid clam,
Scapharca inaequivalvis. Journal of the Marine Biological Association of the United
Kingdom, 74(02), 287-299.
Hoogewijs, D., Ebner, B., Germani, F., Hoffmann, F. G., Fabrizius, A., Moens, L., … &
Hankeln, T. (2011). Androglobin: A chimeric globin in metazoans that is preferentially
expressed in mammalian testes. Molecular Biology and Evolution, 29(4), 1105-1114.
https://doi.org/10.1093/molbev/msr246
Hourdez, S., & Lallier, F. H. (2006). Adaptations to hypoxia in hydrothermal-vent and cold-
seep invertebrates. Reviews in Environmental Science and Bio/Technology, 6(1–3), 143–
159. https://doi.org/10.1007/s11157-006-9110-3
Hourdez, S., & Weber, R. E. (2005). Molecular and functional adaptations in deep-sea
haemoglobins. Journal of Inorganic Biochemistry, 99(1), 130–141.
https://doi.org/10.1016/j.jinorgbio.2004.09.017
Hourdez, S., Lamontagne, J., Peterson, P., Weber, R. E., & Fisher, C. R. (2000).
Haemoglobin from a deep-sea hydrothermal-vent copepod. The Biological Bulletin,
199(2), 95–99.
Huang, Y., Niu, B., Gao, Y., Fu, L., & Li, W. (2010). CD-HIT Suite: a web server for
clustering and comparing biological sequences. Bioinformatics, 26(5), 680–682.
https://doi.org/10.1093/bioinformatics/btq003
Huerta-Cepas, J., Dopazo, J., Huynen, M. A., & Gabaldon, T. (2011). Evidence for short-time
divergence and long-time conservation of tissue-specific expression after gene duplication.
Briefings in Bioinformatics, 12(5), 442–448. https://doi.org/10.1093/bib/bbr022
References 81
Hurles, M. (2004). Gene duplication: the genomic trade in spare parts. PLoS Biology, 2(7),
e206. https://doi.org/10.1371/journal.pbio.0020206
Ikeda-Saito, M., Yonetani, T., Chiancone, E., Ascoli, F., Verzili, D., & Antonini, E. (1983).
Thermodynamic properties of oxygen equilibria of dimeric and tetrameric haemoglobins
from Scapharca inaequivalvis. Journal of Molecular Biology, 170(4), 1009–1018.
https://doi.org/10.1016/S0022-2836(83)80200-9
Ingram, V. M. (1961). Gene evolution and the haemoglobins. Nature, 189(4766), 704-708.
Innan, H., & Kondrashov, F. (2010). The evolution of gene duplications: classifying and
distinguishing between models. Nature Reviews Genetics, 11(2), 97–108.
https://doi.org/10.1038/nrg2689
Jellie, A. M., Tate, W. P., & Trotman, C. N. A. (1996). Evolutionary history of introns in a
multidomain globin gene. Journal of Molecular Evolution, 42(6), 641–647.
https://doi.org/10.1007/BF02338797
Jokumsen, A., & Fyhn, H. J. (1982). The influence of aerial exposure upon respiratory and
osmotic properties of haemolymph from two intertidal mussels, Mytilus edulis L. and
Modiolus modiolus L. Journal of Experimental Marine Biology and Ecology, 61(2), 189–
203. https://doi.org/10.1016/0022-0981(82)90008-9
Kaessmann, H. (2010). Origins, evolution, and phenotypic impact of new genes. Genome
Research, 20(10), 1313–1326. https://doi.org/10.1101/gr.101386.109
Karstensen, J., Stramma, L., & Visbeck, M. (2008). Oxygen minimum zones in the eastern
tropical Atlantic and Pacific oceans. Progress in Oceanography, 77(4), 331–350.
https://doi.org/10.1016/j.pocean.2007.05.009
Kato, K., Tokishita, S., Mandokoro, Y., Kimura, S., Ohta, T., Kobayashi, M., & Yamagata,
H. (2001). Two-domain haemoglobin gene of the water flea Moina macrocopa:
82 References
duplication in the ancestral Cladocera, diversification, and loss of a bridge intron. Gene,
273(1), 41–50. https://doi.org/10.1016/S0378-1119(01)00569-8
Kawano, K., Iwasaki, N., & Suzuki, T. (2003). Notable diversity in haemoglobin expression
patterns among species of the deep-sea clam, Calyptogena. Cellular and Molecular Life
Sciences CMLS, 60(9), 1952–1956. https://doi.org/10.1007/s00018-003-3184-7
Koch, J., & Burmester, T. (2016). Membrane-bound globin X protects the cell from reactive
oxygen species. Biochemical and Biophysical Research Communications, 469(2), 275–
280. https://doi.org/10.1016/j.bbrc.2015.11.105
Koch, J., Lüdemann, J., Spies, R., Last, M., Amemiya, C. T., & Burmester, T. (2016).
Unusual diversity of myoglobin genes in the lungfish. Molecular Biology and Evolution,
msw159. https://doi.org/10.1093/molbev/msw159
Koch, L. G., & Britton, S. L. (2008). Aerobic metabolism underlies complexity and capacity.
The Journal of Physiology, 586(1), 83–95. https://doi.org/10.1113/jphysiol.2007.144709
Kugelstadt, D., Haberkamp, M., Hankeln, T., & Burmester, T. (2004). Neuroglobin,
cytoglobin, and a novel, eye-specific globin from chicken. Biochemical and Biophysical
Research Communications, 325(3), 719–725. https://doi.org/10.1016/j.bbrc.2004.10.080
Lan, X., & Pritchard, J. K. (2016). Coregulation of tandem duplicate genes slows evolution of
subfunctionalization in mammals. Science, 352(6288), 1009–1013.
https://doi.org/10.1126/science.aad8411
Linzen, B., Soeter, N. M., Riggs, A. F., Schneider, H. J., Schartau, W., & Moore, M. D.
(1985). The structure of arthropod hemocyanins. Science, 229(4713), 519–524.
https://doi.org/10.1126/science.4023698
Liu, Y., Cotton, J. A., Shen, B., Han, X., Rossiter, S. J., & Zhang, S. (2010). Convergent
sequence evolution between echolocating bats and dolphins. Current Biology, 20(2), R53–
R54. https://doi.org/10.1016/j.cub.2009.11.058
References 83
Lynch, M., & Force, A. (2000). The probability of duplicate gene preservation by
subfunctionalization. Genetics, 154(1), 459–473.
Maeda, N., & Fitch, W. M. (1982). Frog heart monomeric haemoglobin. In Methods in
Protein Sequence Analysis (Eds), (pp. 569–570). Humana Press. Retrieved from
http://link.springer.com/chapter/10.1007/978-1-4612-5832-2_63
Mangum, C. P. (1998). Major Events in the Evolution of the Oxygen Carriers. American
Zoologist, 38(1), 1–13.
Mangum, C. P. (1997). Introduction The Red Blood Cell Haemoglobins Distribution and
localization Molecular structure. (Vol. 2).
Mangum, C. P. (1992). Respiratory function of the red blood cell haemoglobins of six animal
phyla. In C. P. Mangum (Eds.), Blood and Tissue Oxygen Carriers (pp. 117–149). Berlin,
Heidelberg: Springer Berlin Heidelberg. Retrieved from http://dx.doi.org/10.1007/978-3-
642-76418-9_5
Mangum, C. P., Scott, J. L., Miller, K. I., Holde, K. E. V., & Morse, M. P. (1987). Bivalve
hemocyanin: structural, functional, and phylogenetic relationships. The Biological
Bulletin, 173(1), 205–221.
Mangum, C. P., Woodin, B. R., Bonaventura, C., Sullivan, B., & Bonaventura, J. (1975). The
role of coelomic and vascular haemoglobin in the annelid family Terebellidae.
Comparative Biochemistry and Physiology Part A: Physiology, 51(2), 281–294.
https://doi.org/10.1016/0300-9629(75)90372-2
Mann, R. G., Fisher, W. K., Gilbert, A. T., & Thompson, E. O. P. (1986). Genetic variation
of the dimeric haemoglobin of the bivalve mollusc Anadara trapezia. Australian Journal
of Biological Sciences, 39(2), 109–116.
Manning, A. M., Trotman, C. N. A., & Tate, W. P. (1990). Evolution of a polymeric globin in
the brine shrimp Artemia. Nature, 348(6302), 653–656. https://doi.org/10.1038/348653a0
84 References
Manwell, C. (1963). The chemistry and biology of haemoglobin in some marine clams—I.
Distribution of the pigment and properties of the oxygen equilibrium. Comparative
Biochemistry and Physiology, 8(3), 209–218. https://doi.org/10.1016/0010-
406X(63)90125-7
Michaelidis, B., Haas, D., & Grieshaber, M. K. (2005). Extracellular and intracellular acid‐
base status with regard to the energy metabolism in the oyster Crassostrea gigas during
exposure to air. Physiological and Biochemical Zoology: Ecological and Evolutionary
Approaches, 78(3), 373–383. https://doi.org/10.1086/430223
Mikkelsen, P. M., & Bieler, R. (2003). Systematic revision of the western Atlantic file clams,
Lima and Ctenoides (Bivalvia : Limoida : Limidae). Invertebrate Systematics, 17(5), 667–
710.
Moleirinho, A., Seixas, S., Lopes, A. M., Bento, C., Prata, M. J., & Amorim, A. (2013).
Evolutionary constraints in the β-globin cluster: The signature of purifying selection at the
δ-globin (HBD) locus and its role in developmental gene regulation. Genome Biology and
Evolution, 5(3), 559–571. https://doi.org/10.1093/gbe/evt029
Montes-Rodríguez, I. M., Rivera, L. E., López-Garriga, J., & Cadilla, C. L. (2016).
Characterization and expression of the Lucina pectinata oxygen and sulfide binding
haemoglobin genes. PLoS one, 11(1), e0147977.
https://doi.org/10.1371/journal.pone.0147977
Moriyama, Y., Ito, F., Takeda, H., Yano, T., Okabe, M., Kuraku, S., ... & Koshiba-Takeuchi,
K. (2016). Evolution of the fish heart by sub/neofunctionalization of an elastin
gene. Nature communications, 7.
Morse, M. P., Meyhofer, E., Otto, J. J., & Kuzirian, A. M. (1986). Hemocyanin respiratory
pigment in bivalve mollusks. Science, 231(4743), 1302–1304.
https://doi.org/10.1126/science.3945826
References 85
Motoyama, H., Komiya, T., Thuy, L. T. T., Tamori, A., Enomoto, M., Morikawa, H., … &
Kawada, N. (2014). Cytoglobin is expressed in hepatic stellate cells, but not in
myofibroblasts, in normal and fibrotic human liver. Laboratory Investigation, 94(2), 192–
207. https://doi.org/10.1038/labinvest.2013.135
Naito, Y., Riggs, C. K., Vandergon, T. L., & Riggs, A. F. (1991). Origin of a “bridge” intron
in the gene for two domain globin. Proceedings of the National Academy of Sciences of
the USA, 88(15). https://doi.org/10.1073/pnas.88.15.6672
Natarajan, C., Projecto-Garcia, J., Moriyama, H., Weber, R. E., Muñoz-Fuentes, V., Green,
A. J., … & Storz, J. F. (2015). Convergent Evolution of Haemoglobin Function in High-
Altitude Andean Waterfowl Involves Limited Parallelism at the Molecular Sequence
Level. PLoS Genetics, 11(12), e1005681. https://doi.org/10.1371/journal.pgen.1005681
Negrisolo, E., Pallavicini, A., Barbato, R., Dewilde, S., Ghiretti-Magaldi, A., Moens, L., &
Lanfranchi, G. (2001). The evolution of extracellular haemoglobins of annelids,
vestimentiferans, and pogonophorans. Journal of Biological Chemistry, 276(28), 26391–
26397. https://doi.org/10.1074/jbc.M100557200
Nicol, P. I., & O’Gower, A. K. (1967). Haemoglobin variation in Anadara trapezia. Nature,
216, 684. https://doi.org/10.1038/216684a0
Norman, J. D., Danzmann, R. G., Glebe, B., & Ferguson, M. M. (2011). The genetic basis of
salinity tolerance traits in Arctic charr (Salvelinus alpinus). BMC Genetics, 12(1), 81.
https://doi.org/10.1186/1471-2156-12-81
Officer, C. B., Biggs, R. B., Taft, J. L., Cronin, L. E., Tyler, M. A., & Boynton, W. R. (1984).
Chesapeake Bay anoxia: origin, development, and significance. Science, 223(6).
O’Gower, A., & Nicol, P. I. (1968). A latitudinal cline of haemoglobins in a bivalve mollusc.
Heredity, 23(4), 485–491.
86 References
Ohno, S., Wolf, U., & Atkin, N. B. (1968). Evolution from Fish to Mammals by Gene
Duplication. Hereditas, 59(1), 169–187. https://doi.org/10.1111/j.1601-
5223.1968.tb02169.x
Oleksiewicz, U., Liloglou, T., Field, J. K., & Xinarianos, G. (2011). Cytoglobin: biochemical,
functional and clinical perspective of the newest member of the globin family. Cellular
and Molecular Life Sciences, 68(23), 3869–3883. https://doi.org/10.1007/s00018-011-
0764-9
Opazo, J. C., Lee, A. P., Hoffmann, F. G., Toloza-Villalobos, J., Burmester, T., Venkatesh,
B., & Storz, J. F. (2015). Ancient duplications and expression divergence in the globin
gene superfamily of vertebrates: Insights from the elephant shark genome and
transcriptome. Molecular Biology and Evolution, 32(7), 1684-1694.
https://doi.org/10.1093/molbev/msv054
Opazo, J. C., Hoffmann, F. G., & Storz, J. F. (2008). Genomic evidence for independent
origins of β-like globin genes in monotremes and therian mammals. Proceedings of the
National Academy of Sciences, 105(5), 1590–1595.
https://doi.org/10.1073/pnas.0710531105
Osborn, T. C., Pires, J.C., Birchler, J. A., Auger, D. L., Chen, Z.J., Lee, H.-S., …
Martienssen, R.A. (2003). Understanding mechanisms of novel gene expression in
polyploids. Trends in Genetics, 19(3), 141–147. https://doi.org/10.1016/S0168-
9525(03)00015-5
Parker, J., Tsagkogeorga, G., Cotton, J. A., Liu, Y., Provero, P., Stupka, E., & Rossiter, S. J.
(2013). Genome-wide signatures of convergent evolution in echolocating mammals.
Nature, 502(7470), 228–231. https://doi.org/10.1038/nature12511
Perutz, M., F. (1979). Regulation of oxygen affinity of haemoglobin: influence of structure of
the globin on the heme iron. Annual review of biochemistry, 48(1), 327–386.
References 87
Pesce, A., Bolognesi, M., Bocedi, A., Ascenzi, P., Dewilde, S., Moens, L., … & Burmester,
T. (2002). Neuroglobin and cytoglobin. EMBO reports, 3(12), 1146–1151.
Petruzzelli, R., Goffredo, B. M., Barra, D., Bossa, F., Boffi, A., Verzili, D., … & Chiancone,
E. (1985). Amino acid sequence of the cooperative homodimeric haemoglobin from the
mollusc Scapharca inaequivalvis and topology of the intersubunit contacts. FEBS Letters,
184(2), 328–332. https://doi.org/10.1016/0014-5793(85)80632-3
Piro, M. C., Gambacurta, A., Basili, P., & Ascoli, F. (1998). The exon/intron organization of
the globin gene of Scapharca inaequivalvis homodimeric haemoglobin: unusual intron
homology with other bivalve mollusc globin genes. Gene, 221(1), 45–49.
https://doi.org/10.1016/S0378-1119(98)00442-9
Piro, M. C., Gambacurta, A., & Ascoli, F. (1996). Scapharca inaequivalvis tetrameric
haemoglobin α and β chains: cDNA sequencing and genomic organization. Journal of
Molecular Evolution, 43(6), 594–601. https://doi.org/10.1007/BF02202107
Prentis, P. J., & Pavasovic, A. (2014). The Anadara trapezia transcriptome: A resource for
molluscan physiological genomics. Marine Genomics, 18, 113–115.
https://doi.org/10.1016/j.margen.2014.08.004
Projecto-Garcia, J., Jollivet, D., Mary, J., Lallier, F. H., Schaeffer, S. W., & Hourdez, S.
(2015). Selective forces acting during multi-domain protein evolution: the case of multi-
domain globins. SpringerPlus, 4(1), 1–14. https://doi.org/10.1186/s40064-015-1124-2
Projecto-Garcia, J., Zorn, N., Jollivet, D., Schaeffer, S. W., Lallier, F. H., & Hourdez, S.
(2010). Origin and evolution of the unique tetra-domain haemoglobin from the
hydrothermal vent scale worm branchipolynoe. Molecular Biology and Evolution, 27(1),
143–152. https://doi.org/10.1093/molbev/msp218
88 References
Rabalais, N. N., Diaz, R. J., Levin, L. A., Turner, R. E., Gilbert, D., & Zhang, J. (2010).
Dynamics and distribution of natural and human-caused hypoxia. Biogeosciences, 7(2),
585-619.
Rawat, R. (2010). Anatomy of Mollusca. Mittal Publications.
Read, K. R. (1966). Molluscan haemoglobin and myoglobin. Academic Press New York,
(Vol. 2).
Reitman, M., Grasso, J. A., Blumenthal, R., & Lewit, P. (1993). Primary sequence, evolution,
and repetitive elements of the Gallus gallus (chicken) β-globin cluster. Genomics, 18(3),
616–626. https://doi.org/10.1016/S0888-7543(05)80364-7
Riggs, A. F. (1991). Aspects of the origin and Evolution of Non-Vertebrate Haemoglobins.
American Zoologist, 31(3), 535–545. https://doi.org/10.1093/icb/31.3.535
Roeder, G. S. (1983). Unequal crossing-over between yeast transposable elements. Molecular
and General Genetics MGG, 190(1), 117–121.
Roesner, A., Fuchs, C., Hankeln, T., & Burmester, T. (2005). A globin gene of ancient
evolutionary origin in lower vertebrates: evidence for two distinct globin families in
animals. Molecular Biology and Evolution, 22(1), 12–20.
https://doi.org/10.1093/molbev/msh258
Ronda, L., Bettati, S., Henry, E. R., Kashav, T., Sanders, J. M., Royer, W. E., & Mozzarelli,
A. (2013). Tertiary and quaternary allostery in tetrameric haemoglobin from Scapharca
inaequivalvis. Biochemistry, 52(12), 2108–2117. https://doi.org/10.1021/bi301620x
Royer, W. E., Zhu, H., Gorr, T. A., Flores, J. F., & Knapp, J. E. (2005). Allosteric
haemoglobin assembly: diversity and similarity. Journal of Biological Chemistry, 280(30),
27477–27480. https://doi.org/10.1074/jbc.R500006200
Royer Jr, W. E., Knapp, J. E., Strand, K., & Heaslet, H. A. (2001). Cooperative
haemoglobins: conserved fold, diverse quaternary assemblies and allosteric mechanisms.
References 89
Trends in Biochemical Sciences, 26(5), 297–304. https://doi.org/10.1016/S0968-
0004(01)01811-4
Royer, W. E., Strand, K., Heel, M. van, & Hendrickson, W. A. (2000). Structural hierarchy in
erythrocruorin, the giant respiratory assemblage of annelids. Proceedings of the National
Academy of Sciences, 97(13), 7107–7111. https://doi.org/10.1073/pnas.97.13.7107
Royer, W. E., Love, W. E., & Fenderson, F. F. (1985). Cooperative dimeric and tetrameric
clam haemoglobins are novel assemblages of myoglobin folds. Nature, 316(6025), 277–
280. https://doi.org/10.1038/316277a0
Sankaran, V. G., Menne, T. F., Xu, J., Akie, T. E., Lettre, G., Handel, B. V., … & Orkin, S.
H. (2008). Human fetal haemoglobin expression is regulated by the developmental stage-
specific repressor BCL11A. Science, 322(5909), 1839–1842.
https://doi.org/10.1126/science.1165409
Schindelmeiser, I., Kuhlmann, D., & Nolte, A. (1979). Localization and characterization of
hemoproteins in the central nervous tissue of some gastropods. Comparative Biochemistry
and Physiology Part B: Comparative Biochemistry, 64(2), 149–154.
https://doi.org/10.1016/0305-0491(79)90153-6
Schwarze, K., Campbell, K. L., Hankeln, T., Storz, J. F., Hoffmann, F. G., & Burmester, T.
(2014). The globin gene repertoire of lampreys: convergent evolution of haemoglobin and
myoglobin in jawed and jawless vertebrates. Molecular Biology and Evolution, 31(10),
2708–2721. https://doi.org/10.1093/molbev/msu216
Scott, K. M., & Fisher, C. R. (1995). Physiological ecology of sulfide metabolism in
hydrothermal vent and cold seep vesicomyid clams and vestimentiferan tube worms.
American Zoologist, 35(2), 102–111. https://doi.org/10.1093/icb/35.2.102
Shumway, S. E., & Parsons, G. J. (2011). Scallops: Biology, Ecology and Aquaculture.
Elsevier.
90 References
Sidell, B. D., & O’Brien, K. M. (2006). When bad things happen to good fish: the loss of
haemoglobin and myoglobin expression in Antarctic icefishes. Journal of Experimental
Biology, 209(10), 1791–1802. https://doi.org/10.1242/jeb.02091
Singh, S., Canseco, D. C., Manda, S. M., Shelton, J. M., Chirumamilla, R. R., Goetsch, S. C.,
… & Mammen, P. P. A. (2014). Cytoglobin modulates myogenic progenitor cell viability
and muscle regeneration. Proceedings of the National Academy of Sciences, 111(1),
E129–E138. https://doi.org/10.1073/pnas.1314962111
Smith, M. H. (1967). Occurrence of haemoglobin in some molluscs. Comparative
Biochemistry and Physiology, 20(1), 361–364. https://doi.org/10.1016/0010-
406X(67)90755-4
Souza, P. C. de, & Bonilla-Rodriguez, G. O. (2007). Fish haemoglobins. Brazilian Journal of
Medical and Biological Research, 40(6), 769–778. https://doi.org/10.1590/S0100-
879X2007000600004
Stapley, J., Reger, J., Feulner, P. G. D., Smadja, C., Galindo, J., Ekblom, R., … Slate, J.
(2010). Adaptation genomics: the next generation. Trends in Ecology & Evolution, 25(12),
705–712. https://doi.org/10.1016/j.tree.2010.09.002
Storz, J. F., Bridgham, J. T., Kelly, S. A., & Garland, T. (2015). Genetic approaches in
comparative and evolutionary physiology. American Journal of Physiology - Regulatory,
Integrative and Comparative Physiology, 309(3), R197–R214.
https://doi.org/10.1152/ajpregu.00100.2015
Storz, J. F., Hoffmann, F. G., Opazo, J. C., & Moriyama, H. (2008). Adaptive functional
divergence among triplicated α-globin genes in rodents. Genetics, 178(3), 1623–1638.
https://doi.org/10.1534/genetics.107.080903
References 91
Storz, J. F., Opazo, J. C., & Hoffmann, F. G. (2013). Gene duplication, genome duplication,
and the functional diversification of vertebrate globins. Molecular Phylogenetics and
Evolution, 66(2), 469–478. https://doi.org/10.1016/j.ympev.2012.07.013
Stothard, P. (2000). The sequence manipulation suite: JavaScript programs for analyzing and
formatting protein and DNA sequences. BioTechniques, 28(6), 1102, 1104.
Strand, K., Knapp, J. E., Bhyravbhatla, B., & Royer Jr, W. E. (2004). Crystal structure of the
haemoglobin dodecamer from Lumbricus erythrocruorin: allosteric core of giant annelid
respiratory complexes. Journal of Molecular Biology, 344(1), 119–134.
https://doi.org/10.1016/j.jmb.2004.08.094
Su, C.-Y., Kemp, H. A., & Moens, C. B. (2014). Cerebellar development in the absence of
GbX function in zebrafish. Developmental Biology, 386(1), 181–190.
https://doi.org/10.1016/j.ydbio.2013.10.026
Sullivan, G. (1961). Functional morphology, micro-anatomy, and histology of the “Sydney
cockle” Anadara trapezia (Deshayes )(Lamellibranchia: Arcidae). Australian Journal of
Zoology, 9(2), 219–257.
Surm, J. M., Prentis, P. J., & Pavasovic, A. (2015). Comparative analysis and distribution of
omega-3 lcPUFA biosynthesis genes in marine molluscs. PLoS one, 10(8), e0136301.
https://doi.org/10.1371/journal.pone.0136301
Suzuki, T., Kawamichi, H., Ohtsuki, R., Iwai, M., & Fujikura, K. (2000). Isolation and
cDNA-derived amino acid sequences of haemoglobin and myoglobin from the deep-sea
clam Calyptogena kaikoi. Biochimica et Biophysica Acta (BBA)-Protein Structure and
Molecular Enzymology, 1478(1), 152–158.
Suzuki, T., Kawasaki, Y., Arita, T., & Nakamura, A. (1996). Two-domain haemoglobin of
the blood clam Barbatia lima resulted from the recent gene duplication of the single-
domain delta chain. Biochemical Journal, 313, 561–566.
92 References
Suzuki, T., & Arita, T. (1995). Two-domain haemoglobin from the blood clam, Barbatia
lima. The cDNA-derived amino acid sequence. Journal of Protein Chemistry, 14(7), 499–
502.
Suzuki, T., Nakamura, A., Satoh, Y., Inai, C., Furukohri, T., & Arita, T. (1992). Primary
structure of chain I of the heterodimeric haemoglobin from the blood clam Barbatia
virescens. Journal of Protein Chemistry, 11(6): 629–633 .
https://doi.org/10.1007/BF01024963
Tamura, K., Stecher, G., Peterson, D., Filipski, A., & Kumar, S. (2013). MEGA6: Molecular
Evolutionary Genetics Analysis version 6.0. Molecular Biology and Evolution, 30(12),
2725–2729. https://doi.org/10.1093/molbev/mst197
Terwilliger, N. B. (1998). Functional adaptations of oxygen-transport proteins. Journal of
Experimental Biology, 201(8), 1085–1098.
Terwilliger, N. B., Terwilliger, R. C., Meyhöfer, E., & Morse, M. P. (1988). Bivalve
hemocyanins—a comparison with other molluscan hemocyanins. Comparative
Biochemistry and Physiology Part B: Comparative Biochemistry, 89(1), 189–195.
https://doi.org/10.1016/0305-0491(88)90282-9
Terwilliger, R. C., Terwilliger, N. B., & Arp, A. (1983). Thermal vent clam (Calyptogena
magnifica) haemoglobin. Science, 219(4587), 981–983.
https://doi.org/10.1126/science.219.4587.981
Terwilliger, R. C. (1980). Structures of invertebrate haemoglobins. American Zoologist,
20(1), 53–67. https://doi.org/10.1093/icb/20.1.53
Terwilliger, R. C., Terwilliger, N. B., & Schabtach, E. (1978). Extracellular haemoglobin of
the clam, Cardita borealis (conrad): An unusual polymeric haemoglobin. Comparative
Biochemistry and Physiology Part B: Comparative Biochemistry, 59(1), 9–14.
https://doi.org/10.1016/0305-0491(78)90262-6
References 93
Titchen, D. A., Glenn, W. K., Nassif, N., Thompson, A. R., & Thompson, E. O. P. (1991). A
minor globin gene of the bivalve mollusc Anadara trapezia. Biochimica et Biophysica
Acta (BBA) - Gene Structure and Expression, 1089(1), 61–67.
https://doi.org/10.1016/0167-4781(91)90085-Z
Torres-Mercado, E., Renta, J. Y., Rodríguez, Y., López-Garriga, J., & Cadilla, C. L. (2003).
The cDNA-derived amino acid sequence of haemoglobin II from Lucina pectinata.
Journal of Protein Chemistry, 22(7–8), 683–690.
https://doi.org/10.1023/B:JOPC.0000008734.44356.b7
Toulmond, A., & Tchernigovtzeff, C. (1984). Ventilation and respiratory gas exchanges of
the lugworm Arenicola marina (L.) as functions of ambient PO2 (20–700 torr).
Respiration Physiology, 57(3), 349–363. https://doi.org/10.1016/0034-5687(84)90083-5
Trent, R. J., Bowden, D. K., Old, J. M., Wainscoat, J. S., Clegg, J. B., & Weatherall, D. J.
(1981). A novel rearrangement of the human β-like globin gene cluster. Nucleic Acids
Research, 9(24), 6723–6734. https://doi.org/10.1093/nar/9.24.6723
van der Burg, C. A., Prentis, P. J., Surm, J. M., & Pavasovic, A. (2016). Insights into the
innate immunome of actiniarians using a comparative genomic approach. BMC Genomics,
17, 850. https://doi.org/10.1186/s12864-016-3204-2
Vernimmen, D. (2014). Uncovering enhancer functions using the α-globin locus. PLoS
Genetics, 10(10), e1004668. https://doi.org/10.1371/journal.pgen.1004668
Vinogradov, S. N. (1985). The structure of invertebrate extracellular haemoglobins
(erythrocruorins and chlorocruorins). Comparative Biochemistry and Physiology Part B:
Comparative Biochemistry, 82(1), 1–15. https://doi.org/10.1016/0305-0491(85)90120-8
Vinogradov, S. N., & Moens, L. (2008). Diversity of globin function: enzymatic, transport,
storage, and sensing. Journal of Biological Chemistry, 283(14), 8773–8777.
https://doi.org/10.1074/jbc.R700029200
94 References
Vogel, C., Teichmann, S. A., & Pereira-Leal, J. (2005). The relationship between domain
duplication and recombination. Journal of Molecular Biology, 346(1), 355–365.
https://doi.org/10.1016/j.jmb.2004.11.050
Vuddhakul, V., Soboon, S., Sunghiran, W., Kaewpiboon, S., Chowdhury, A., Ishibashi, M.,
... & Nishibuchi, M. (2006). Distribution of virulent and pandemic strains of Vibrio
parahaemolyticus in three molluscan shellfish species (Meretrix meretrix, Perna viridis,
and Anadara granosa) and their association with foodborne disease in southern
Thailand. Journal of food protection, 69(11), 2615-2620.
Wajcman, H., Kiger, L., & Marden, M. C. (2009). Structure and function evolution in the
superfamily of globins. Comptes Rendus Biologies, 332(2–3), 273–282.
https://doi.org/10.1016/j.crvi.2008.07.026
Wang, Y., Coleman-Derr, D., Chen, G., & Gu, Y. Q. (2015). OrthoVenn: a web server for
genome wide comparison and annotation of orthologous clusters across multiple species.
Nucleic Acids Research, 43(W1), W78–W84. https://doi.org/10.1093/nar/gkv487
Wang, W. X., & Widdows, J. (1991). Physiological responses of mussel larvae Mytilus edulis
to environmental hypoxia and anoxia., (70), 223–236.
Watanabe, S., Takahashi, N., Uchida, H., & Wakasugi, K. (2012). Human neuroglobin
functions as an oxidative stress-responsive sensor for neuroprotection. Journal of
Biological Chemistry, 287(36), 30128–30138. https://doi.org/10.1074/jbc.M112.373381
Wawrowski, A., Gerlach, F., Hankeln, T., & Burmester, T. (2011). Changes of globin
expression in the Japanese medaka (Oryzias latipes) in response to acute and chronic
hypoxia. Journal of Comparative Physiology. B, Biochemical, Systemic, and
Environmental Physiology, 181(2), 199–208. https://doi.org/10.1007/s00360-010-0518-2
References 95
Weatherall, D. J. (2001). Phenotype—genotype relationships in monogenic disease: lessons
from the thalassaemias. Nature Reviews Genetics, 2(4), 245–255.
https://doi.org/10.1038/35066048
Weber, R. E., & Vinogradov, S. N. (2001). Nonvertebrate haemoglobins: functions and
molecular adaptations. Physiological Reviews, 81(2), 569–628.
Weber, R. E. (1980). Functions of invertebrate haemoglobins with special reference to
adaptations to environmental hypoxia. American Zoologist, 20(1), 79–101.
https://doi.org/10.1093/icb/20.1.79
Widdows, J., Bayne, B. L., Livingstone, D. R., Newell, R. I. E., & Donkin, P. (1979).
Physiological and biochemical responses of bivalve molluscs to exposure to air.
Comparative Biochemistry and Physiology Part A: Physiology, 62(2), 301–308.
https://doi.org/10.1016/0300-9629(79)90060-4
Witeska, M. (2013). Erythrocytes in teleost fishes: a review. Zoology and Ecology, 23(4),
275–281. https://doi.org/10.1080/21658005.2013.846963
Wittenberg, J. B., & Wittenberg, B. A. (2003). Myoglobin function reassessed. Journal of
Experimental Biology, 206(12), 2011–2020. https://doi.org/10.1242/jeb.00243
Wray, G. A., Hahn, M. W., Abouheif, E., Balhoff, J. P., Pizer, M., Rockman, M. V., &
Romano, L. A. (2003). The evolution of transcriptional regulation in eukaryotes.
Molecular Biology and Evolution, 20(9), 1377–1419.
https://doi.org/10.1093/molbev/msg140
Wray, G. A., Levinton, J. S., & Shapiro, L. H. (1996). Molecular evidence for deep
precambrian divergences among metazoan phyla. Science, 274(5287), 568–573.
Yagil, C., Hubner, N., Monti, J., Schulz, H., Sapojnikov, M., Luft, F. C., … & Yagil, Y.
(2005). Identification of hypertension-related genes through an integrated genomic-
96 References
transcriptomic approach. Circulation Research, 96(6), 617–625.
https://doi.org/10.1161/01.RES.0000160556.52369.61
Ye, J., Fang, L., Zheng, H., Zhang, Y., Chen, J., Zhang, Z., … Wang, J. (2006). WEGO: a
web tool for plotting GO annotations. Nucleic Acids Research, 34(Web Server issue),
W293–W297. https://doi.org/10.1093/nar/gkl031
Zhang, G., Li, C., Li, Q., Li, B., Larkin, D. M., Lee, C., … & Wang, J.(2014). Comparative
genomics reveals insights into avian genome evolution and adaptation. Science,
346(6215), 1311–1320. https://doi.org/10.1126/science.1251385
Zhang, J. (2003). Evolution by gene duplication: an update. Trends in Ecology & Evolution,
18(6), 292–298. https://doi.org/10.1016/S0169-5347(03)00033-8
References 97
Appendices
Appendix A: Poster presented at the annual Lorne Genome conference 2015
98 Appendix