The Fasciola hepatica genome: gene duplication and polymorphism reveals adaptation to the host environment and the capacity for rapid evolution Cwiklinski, K., Dalton, J. P., Dufresne, P. J., La Course, J., Williams, D. J. L., Hodgkinson, J., & Paterson, S. (2015). The Fasciola hepatica genome: gene duplication and polymorphism reveals adaptation to the host environment and the capacity for rapid evolution. Genome Biology, 16, [71]. DOI: 10.1186/s13059-015-0632-2 Published in: Genome Biology Document Version: Publisher's PDF, also known as Version of record Queen's University Belfast - Research Portal: Link to publication record in Queen's University Belfast Research Portal Publisher rights © 2015 The authors This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. General rights Copyright for the publications made accessible via the Queen's University Belfast Research Portal is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights. Take down policy The Research Portal is Queen's institutional repository that provides access to Queen's research output. Every effort has been made to ensure that content in the Research Portal does not infringe any person's rights, or applicable UK laws. If you discover content in the Research Portal that you believe breaches copyright or violates any law, please contact [email protected]. Download date:16. Apr. 2018


RESEARCH Open Access

The Fasciola hepatica genome: gene duplication and polymorphism reveals adaptation to the host environment and the capacity for rapid evolution

Krystyna Cwiklinski1,2, John Pius Dalton2,3, Philippe J Dufresne3,4, James La Course5, Diana JL Williams1, Jane Hodgkinson1 and Steve Paterson6*

Abstract

Background: The liver fluke Fasciola hepatica is a major pathogen of livestock worldwide, causing huge economic losses to agriculture, as well as 2.4 million human infections annually.

Results: Here we provide a draft genome for F. hepatica, which we find to be among the largest known pathogen genomes at 1.3 Gb. This size cannot be explained by genome duplication or expansion of a single repeat element, and remains a paradox given the burden it may impose on egg production necessary to transmit infection. Despite the potential for inbreeding by facultative self-fertilisation, substantial levels of polymorphism were found, which highlights the evolutionary potential for rapid adaptation to changes in host availability, climate change or to drug or vaccine interventions. Non-synonymous polymorphisms were elevated in genes shared with parasitic taxa, which may be particularly relevant for the ability of the parasite to adapt to a broad range of definitive mammalian and intermediate molluscan hosts. Large-scale transcriptional changes, particularly within expanded protease and tubulin families, were found as the parasite migrated from the gut, across the peritoneum and through the liver to mature in the bile ducts. We identify novel members of anti-oxidant and detoxification pathways and define their differential expression through infection, which may explain the stage-specific efficacy of different anthelmintic drugs.

Conclusions: The genome analysis described here provides new insights into the evolution of this important pathogen, its adaptation to the host environment and external selection pressures. This analysis also provides a platform for research into novel drugs and vaccines.

Background

The digenean trematode Fasciola hepatica is one of the most important pathogens of domestic livestock and has a global distribution [1-4]. The disease, fasciolosis, results in huge losses to the agricultural industry associated with poor food conversion, lower weight gains, impaired fertility and reduced milk (cattle) and wool (sheep) production. Heavy, acute infections can result in death, particularly in sheep and goats. Economic losses attributable to F. hepatica infection have been estimated at more than US$3 billion per annum worldwide [5,6], although even this estimate may be conservative as F. hepatica infection modulates its host's immune response and its ability to resist or eliminate common microbial pathogens [7,8]. Fasciolosis is also an important zoonosis in regions where agricultural management practices are less advanced, particularly in South America and North Africa [3,9]. It is estimated that between 2.4 and 17 million people are infected with this liver fluke worldwide, with a further 91 million people living at risk, resulting in fasciolosis being included on the World Health Organization list of major neglected tropical diseases [1-3].

The zoonotic potential of F. hepatica is enabled by its remarkable ability to infect and mature in an extensive range of terrestrial mammals. Thus, while the typical definitive host for F. hepatica is one of many species of domestic or wild ruminant that ingest contaminated pasture (Figure 1), F. hepatica is also able to exploit disparate host species including humans and rodents, and has rapidly adapted to novel hosts such as llamas and

* Correspondence: [email protected] of Integrative Biology, University of Liverpool, Liverpool, UK. Full list of author information is available at the end of the article.

© 2015 Cwiklinski et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Cwiklinski et al. Genome Biology (2015) 16:71 DOI 10.1186/s13059-015-0632-2


kangaroos, which it has recently come into contact with in South America and Australia, respectively [10]. This is in contrast to most digenean trematodes, such as the human pathogen Schistosoma mansoni, which have a far more restricted host range. F. hepatica can also adapt rapidly to drug interventions and the emergence of resistance within F. hepatica populations to triclabendazole (TCBZ) is of major concern, since most drugs used against other digeneans are only partly protective against F. hepatica [10]. TCBZ is also the only drug currently available that is able to protect livestock and humans against early stage juveniles, which cause significant pathology as they migrate through the liver. The ability of F. hepatica to adapt rapidly to novel hosts or to drug interventions is perhaps more remarkable given that F. hepatica is a hermaphrodite that can facultatively self-fertilise, and so F. hepatica populations might be expected to lose, through inbreeding, the genetic diversity that would be an essential basis for adaptation.

Here we provide a genome assembly for F. hepatica and assess genome-wide polymorphism and transcriptional profiles in order to identify key features of its genome that underlie its ability to migrate through different physiological environments, to parasitise different host species, and to respond rapidly to external selection pressures.

Results and discussion

A large genome with high gene polymorphism

A draft genome for F. hepatica was generated with an assembled length of approximately 1.3 Gb (Table 1). The genome of F. hepatica is considerably larger than that of other sequenced digenean parasites - Schistosoma spp (363 to 397 Mb), Clonorchis sinensis (547 Mb) or Opisthorchis viverrini (634.5 Mb) [11-16] - and is one of the largest pathogen genomes sequenced to date. Genome size does not appear to be related to chromosome number among trematodes; F. hepatica has 10 pairs of chromosomes [17],

Figure 1 Fasciola hepatica lifecycle. (a) Graphical representation of the F. hepatica lifecycle (modified from [76]). a1 Definitive host - host range includes cattle, sheep and humans. a1.1 Parasite excysts in the intestine of the definitive host, releasing newly excysted juveniles (NEJ) that migrate across the intestinal wall, through the peritoneal cavity to the liver. a1.2 NEJ migrate through the liver parenchyma, increasing in size to juvenile flukes as they migrate a1.3 into the bile ducts a1.4, where they grow and develop into fully mature adults. a2 Eggs are released in the faeces and develop on pasture. a3 From each embryonated egg hatches a single miracidium, which infects the snail intermediate host (Galba truncatula). a4 Within the snail the parasite undergoes a clonal expansion, developing through the sporocyst, rediae and cercariae lifecycle stages. a5 Cercariae are released from the snail and encyst on vegetation as dormant metacercariae, which are ingested by the definitive host a1. (b, c, d) Graphical representation of the development of the parasite through the definitive host. (b) The parasite increases dramatically (approximately 1,000-fold) in size over the course of approximately 12 weeks, from NEJ to adult. (c) Expression of key enzymes of metabolism reveals how the growth of the parasite limits oxygen diffusion into the parasite tissue, switching from aerobic energy metabolism (Krebs cycle; PK: pyruvate kinase; SD: succinate dehydrogenase) to aerobic acetate production (ME: malic enzyme) to anaerobic dismutation (PEPCK: phosphoenolpyruvate carboxykinase), as shown by the log fold-change in expression between the lifecycle stages (expression is shown relative to metacercariae lifecycle stage). (d) In addition to the dramatic growth, maturation of the parasite occurs, with the fully mature adult digesting host blood, which provides the nutrient for massive egg production (approximately 20,000 eggs per day per parasite), as shown by the increased expression of the egg shell component, vitelline.


S. mansoni and O. viverrini have eight pairs and six pairs of chromosomes, respectively [18,19], but C. sinensis, also with a smaller genome than F. hepatica, has 28 pairs [20]. Comparative analysis with other sequenced trematode species indicated that the mean number of exons per gene is comparable between species, but that mean exon and intron lengths tend to increase with genome size (Additional file 1: Table S1). Most core eukaryotic genes appeared as single copy, as evidenced by both CEGMA (Additional file 2: Table S2) and analysis of read coverage (Additional file 3: Figure S1), suggesting that the large genome size of F. hepatica has not arisen by genome duplication. At least 32% of the genome was estimated to consist of repetitive DNA, which is consistent with other trematode genomes [11-15]. The median repeat length was 26 bp (Additional file 4: Figure S2) and we observed retrotransposons, including 27 Mbp of long terminal repeats and 59 Mbp of long interspersed elements (LINEs); however, there was no obvious expansion of a single repeat element to account for the large genome size. A LINE RTE BovB repeat, previously found in ruminants, was observed distributed widely across the genome (at least 67,000 full or partial copies in total across approximately 30% of the scaffolds and totalling 28.1 Mbp). This was not due to contamination by host DNA, since no other host sequence, such as sheep mitochondrial sequence, could be identified either in the assembly or in individual reads. BovB has previously been reported as exhibiting horizontal transfer between snakes and ruminants and its presence in F. hepatica suggests that transfer of BovB elements between disparate vertebrate taxa may be facilitated by digenean infection [21].

We investigated levels of polymorphism among F. hepatica genes by re-sequencing the genomes of individual fluke from each of five isolates, all from the UK. Substantial polymorphism among isolates was observed; 48% of genes exhibited at least one non-synonymous SNP and the level of non-synonymous nucleotide diversity, pi, averaged across 21.8 Mbp of coding sequence, was 5.2 × 10⁻⁴ (that is, two randomly sampled sequences differed approximately every 1,900 bp). By comparison, this figure is higher than in humans [22], similar to most vertebrates [23] and, on limited data, smaller than in some parasitic nematode populations [24]. Although F. hepatica is a self-fertilising hermaphrodite, and so has the potential to inbreed and lose genetic diversity, our data show that F. hepatica populations, as a whole, harbour substantial genetic variation. A likely explanation is that parasite populations are typically large, often larger than those of their hosts, which greatly slows any enhanced effects of genetic drift caused by self-fertilisation [25].

By analysing the distribution of genetic diversity amongst F. hepatica genes, we found higher non-synonymous polymorphism in genes shared with parasitic cestodes and digeneans relative to orthologs shared with the free-living turbellaria (Figure 2a and Additional file 5: Table S3 and Additional file 6: Table S4). These data suggest high adaptability in F. hepatica genes that mediate infection and survival in the host environment, which is consistent with F. hepatica's ability to infect a range of both mammalian and molluscan hosts [3,9,10]. We then assessed whether high non-synonymous polymorphism was associated with particular biological functions and discovered a marked over-representation of biological processes associated with axonogenesis and chemotaxis among the top 1% quantile of polymorphic genes (Figure 2b and Additional file 7: Table S5 and Additional file 8: Table S6). These genes included cadherin, semaphorin, fasciclin and rabconnectin, which are involved in cell adhesion and migration of neurons [26-28]. The high polymorphism observed in chemosensory and neural development pathways may relate to the challenge faced by F. hepatica in locating its snail host or in tissue migration in its vertebrate host, and to variation in host preference within parasite populations [29,30]. Such polymorphism may be particularly relevant for development of new anthelmintics targeting the parasite's neuromuscular system [31].
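The nucleotide diversity figure quoted above can be illustrated with a small worked example. The sketch below uses the textbook definition of pi (mean pairwise differences per site over aligned sequences); it is not the authors' pipeline, which distinguished synonymous from non-synonymous sites, and the function name and toy sequences are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site (generic definition of pi).

    Assumes the sequences are already aligned and of equal length.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment: three sequences of 10 bp each.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi_toy = nucleotide_diversity(toy_alignment)

# The reported value of 5.2e-4 corresponds to roughly one difference
# every 1/pi bases between two randomly sampled sequences.
reported_pi = 5.2e-4
bases_per_difference = 1 / reported_pi
```

With the reported value, `bases_per_difference` is a little over 1,900, which matches the "approximately every 1,900 bp" stated in the text.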

Expression patterns from multi-gene families reveal important developmental host-parasite interactions

In order to understand how F. hepatica has adapted to survive within its vertebrate host, we characterised its developmental time-course of gene expression using RNAseq. Progressively more genes were differentially expressed, and with larger fold-changes, following initial infection and subsequent development in the host; that

Table 1 Fasciola hepatica assembly statistics

Metric                                            Value
Scaffold N50                                      204 Kbp (REAPR 155 Kbpᵃ)
Number of scaffolds ≥3 Kbp                        20,158
Number of scaffolds ≥1 Kbp                        45,354
Contig N50 (≥100 bp)                              9.7 Kbp
Number of contigs (≥100 bp)                       254,014
Total assembly length                             1.275 Gbp
Total length of gaps                              91.6 Mbp
Repetitive content                                32%
Number of RNAseq-supported gene models            22,676 (15,740ᵇ)
Mean number of exons/gene                         5.3
Mean exon size (95% range)                        303 bp (36 bp - 1,369 bp)
Mean intron size (95% range)                      3.7 Kbp (33 bp - 17.5 Kbp)
Proportion CEGMA core eukaryotic genes found      90%

ᵃ N50 following breakage of some scaffolds at areas of low support. Both assemblies are available from ENA under project accession PRJEB6687.
ᵇ Number of non-overlapping, distinct genome intervals covered by RNAseq-supported gene models.
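For readers unfamiliar with the N50 statistic reported in Table 1, the following is a minimal sketch of its standard definition: the length of the scaffold (or contig) at which the cumulative length, counted from the longest downwards, first reaches half of the total assembly length. The function name and the toy lengths are hypothetical and are not taken from the assembly itself.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest; N50 is the length of the
    sequence at which the running total first reaches half of the overall total.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths supplied")
    half_total = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= half_total:
            return length


# Hypothetical scaffold lengths in Kbp (total 300, so the half-way point is 150).
toy_scaffolds = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_scaffolds)  # 100 + 80 = 180 >= 150, so N50 is 80
```

Because N50 depends on where scaffolds are joined, breaking scaffolds at poorly supported positions, as in footnote a, can only lower it or leave it unchanged.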


is, associated with excysting of the metacercaria stage to form the newly excysted juvenile (NEJ), traversal of the intestinal wall and penetration of the liver as an immature tissue-migrating parasite (Figure 3a). Thereafter, the transition from these immature parasites to mature adults living in the bile duct was accompanied by further changes in the overall profile of differentially expressed genes, but not in the number of genes differentially expressed relative to metacercariae. Known aspects of Fasciola biology were recapitulated in our expression data. Thus, as the parasite grows, diffusion of oxygen into parasite tissue is limiting and our expression analysis confirmed the switch from aerobic to anaerobic metabolism (Figure 1b and c) [32]. Similarly, our data confirmed that the production of vitelline, the egg shell component, is limited to the mature parasites (Figure 1d).

To explore novel processes associated with developmental changes, genes were clustered based on expression pattern (Figure 3b). Several biological processes were greatly downregulated in maturing and adult parasites relative to metacercariae (Cluster 12, Figure 3), including molecules involved in cell adhesion (cadherins, integrins) and cytoskeletal proteins (talins) that could play an important role in sensing changes in the physiological environment and rapidly initiating excystment following ingestion by the definitive host. Clusters showing strongest patterns of differential expression (both up- and downregulation) were markedly over-represented

Figure 2 Polymorphism within Fasciola hepatica. (a) Levels of non-synonymous polymorphism for F. hepatica genes exhibiting orthology with Clonorchis, Schistosoma, Schmidtea or Echinococcus indicated within the phylogenetic tree. Numbers by branches refer to numbers of orthologous groups specifically shared by that branch; for example, 464 orthologs are shared only between Fasciola and Clonorchis, a further 388 are also shared with Schistosoma but not with Schmidtea or Echinococcus and so on. Branches not drawn to scale. Polymorphism is significantly (P <0.001) higher in Fasciola orthologs shared among the digenean and cestode parasites Clonorchis, Schistosoma and Echinococcus (green, red and blue crosses, respectively) relative to orthologs conserved with the turbellarian Schmidtea (black dots). (b, c) Directed acyclic graphs indicating over-representation of biological processes within the top 1% most polymorphic genes, based on non-synonymous diversity. Shaded boxes indicate significant (P <0.01) over-representation. Black and blue arrows indicate, respectively, 'is a' and 'part of' relationships between terms.


by peptidases and terms associated with regulation of peptidase activity (Clusters 2, 8, 10 and 11, Figure 3). Genes associated with F. hepatica structure, particularly protein polymerisation and microtubule-based movement, such as tubulin, dynein and surface tegumental genes, were highly upregulated in immature and adult fluke relative to earlier stages (Clusters 1 and 3, Figure 3b and Additional file 9: Table S7, Additional file 10: Table S8 and Additional file 11: Table S9). Our data show that, across the transcriptome as a whole, the strongest changes in expression were observed among different members of the multigenic protease and tubulin families, as detailed below.

Peptidases play essential roles in parasite infection, tissue migration and feeding [33]. F. hepatica is unique among helminth parasites because it relies almost exclusively on the secretion of papain-like cysteine peptidases (Clan A, family C1), classes cathepsins L (FhCL) and cathepsins B (FhCB) [33,34]. These two peptidase classes, however, have expanded and diverged to form multigenic families, all of which were found within our assembly (Figure 4a and Additional file 12: Figure S3). Our data show that FhCL peptidases form clades with a total of 17 members and we found that each clade has a distinctive peptidolytic activity based on residues that

Figure 3 Expression of genes exhibiting at least a 16-fold difference in expression between any two developmental stages and grouped by hierarchical clustering. (a) Heatmap with upregulation in blue and downregulation in red relative to metacercariae. Colours to left of heatmap correspond to different clusters. (b) Expression of genes within each cluster on a log2 scale. The number of genes and the enrichment of biological processes are shown for the most specific terms within the gene ontology structure and with a significance of <0.1%. GO:0006508 Proteolysis is shown in brackets for clusters where it appears with a significance of 0.1%< P < 1%.
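The selection criterion in the Figure 3 caption (at least a 16-fold difference between any two stages) is equivalent to a difference of at least 4 on the log2 scale used in panel (b). The sketch below shows one way such a filter could be written; the pseudocount, function names and expression values are assumptions made for illustration and do not reproduce the authors' analysis, which went on to cluster the selected genes hierarchically.

```python
import math


def max_log2_fold_change(expression, pseudocount=1.0):
    """Largest log2 difference between any two stages for one gene.

    A pseudocount (an assumption here) avoids taking the log of zero.
    """
    logs = [math.log2(value + pseudocount) for value in expression]
    return max(logs) - min(logs)


def select_variable_genes(table, min_fold=16.0, pseudocount=1.0):
    """Keep genes whose expression differs by at least min_fold between stages."""
    threshold = math.log2(min_fold)  # 16-fold corresponds to 4 log2 units
    return [
        gene
        for gene, values in table.items()
        if max_log2_fold_change(values, pseudocount) >= threshold
    ]


# Hypothetical expression values across three developmental stages.
toy_expression = {
    "geneA": [1, 2, 40],     # strongly upregulated in the last stage
    "geneB": [10, 12, 15],   # nearly constant
    "geneC": [0, 0, 31],     # switched on late
}
selected = select_variable_genes(toy_expression)
```

In this toy table only geneA and geneC pass the filter; geneB changes by well under 16-fold and would be excluded before clustering.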


make up the S2 sub-site of the active site (Additional file 12: Figure S3 [35,36]). Members within each clade exhibited a concerted pattern of expression that can be correlated with migration of the parasite through different tissues where they would encounter a different profile of macromolecular substrates. For example, FhCL3s are known to exhibit a unique collagenolytic activity required to disrupt the interstitial matrix and their secretion in high amounts enables NEJs to traverse the intestinal wall and penetrate the liver capsule [33,34,37]. Our data reveal that the FhCL3 clade has expanded to five members that are all abundantly expressed by NEJs and share critical residues (H61, W67 and V205) within the active site, suggesting that enlargement of this family has been driven by the need to produce plentiful collagenolytic peptidase at this critical time in the parasite's life cycle. Conversely, members of the FhCL1 clade, the largest with six members, were not expressed during the early migratory stages but all showed an increase in expression as the parasite matures (Figure 4b). Mature adults are obligate blood feeders [30,37] and secrete copious amounts of FhCL1 and, correspondingly, evolution of the FhCL1 active site (residues N61, L/F67 and L/M205) has resulted in loss of collagenolytic activity but a gain in the ability to effectively digest blood proteins, particularly haemoglobin, which are the mature parasite's prime source of nutrients required for egg production [30,37].

Unlike the FhCL peptidases, the FhCB family consists of a single clade of seven members. Nevertheless, these peptidases were also temporally regulated (Figure 4c) such that three members (FhCB1, FhCB2 and FhCB3) exhibited parallel expression patterns to FhCL3 and were thus highly expressed in the NEJs and downregulated as the parasites matured. These data suggest a concerted role for FhCLs and FhCBs in the early infection stage. Also, the constitution and specific expression of a family of peptidases, asparaginyl endopeptidases or legumains, that are responsible for the processing of the inactive FhCL and FhCB zymogens to functional enzymes [38] suggest specific and important developmental roles for these peptidases (Figure 4d). Thus, legumain 1, which was the most highly expressed member in NEJs, could be required for activation of the FhCL3 and FhCB 1/2/3 peptidases at the time the parasite emerges from its encysted stage in the intestine and initiates infection. Furthermore, legumain 3, which was switched on late in parasite development, is the prospective candidate for activating FhCL1, FhCL2 and FhCL5 in mature blood-feeding adult parasites.

Our data highlight tubulins as another multigenic family that exhibits among the highest log fold-changes in expression throughout F. hepatica development. We identified the full complement of five α-tubulin and six β-tubulin isotypes [39,40] and found duplication of β-tubulin isotype 3 (Figure 5a, b). Basal expression levels revealed two functional subsets of β-tubulin molecules; high constitutive expression of β-tubulin isotypes 2, 3a, 3b and 4 suggests they play an essential role in general microtubule structure

Figure 4 Analysis of the cysteine proteases belonging to F. hepatica cathepsin L (FhCL) and cathepsin B (FhCB) clades and their activators, the asparaginyl endopeptidases/legumains. (a) Representation of the number of genes identified for each FhCL clade and for the groups of FhCB and legumain genes, based on analysis by BLAST and manual annotation. The nucleotide identity is shown for each clade/group of genes (Clustal Omega). (b, c, d) Graphical representation of the expression for FhCL, FhCB and legumain proteases across the F. hepatica lifecycle in reads per kilobase per million (RPKM). In the case of the FhCL gene expression, all the genes within the same clade showed a similar pattern of expression, and so are represented here as an average log RPKM. Likewise, where multiple gene models with a similar level of expression were identified for a particular gene, they are represented here as an average log RPKM.
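The RPKM unit used in Figures 4 and 5 normalises a raw read count by gene length (in kilobases) and by sequencing depth (in millions of mapped reads). The sketch below shows the standard formula and one possible reading of "average log RPKM" for a clade; the log base, the pseudocount, the function names and all numbers are assumptions for illustration, since the captions do not specify them.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def mean_log_rpkm(values, pseudocount=1.0):
    """Average of log2(RPKM + pseudocount) over a group of genes.

    Both log2 and the pseudocount are assumptions; they are not stated in the source.
    """
    return sum(math.log2(value + pseudocount) for value in values) / len(values)


# Hypothetical gene: 500 reads on a 2,000 bp gene in a library of 10 million mapped reads.
example_rpkm = rpkm(500, 2_000, 10_000_000)

# Hypothetical clade of two genes with RPKM values 3 and 15.
clade_average = mean_log_rpkm([3, 15])
```

Here `example_rpkm` is 25 and `clade_average` is 3, that is, the mean of log2(4) and log2(16). Averaging on the log scale keeps a single very highly expressed clade member from dominating the summary.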


and function, while a more specialised role is implicated for β-tubulin isotypes 1 and 5, which were particularly expressed in the immature liver stages. Tubulins are known targets of benzimidazole (BZ) drugs [40,41], which include TCBZ, and the unique nature of the interaction of F. hepatica with TCBZ remains undefined. Protein similarity searches with the closely related species C. sinensis and S. mansoni, which are refractory to TCBZ, identified six and 10 β-tubulin gene models, respectively. While F. hepatica β-tubulin isotypes 1 to 4 show >95% similarity with multiple sequences from these species, that observed for β-tubulin isotypes 5 and 6 was <70% (Additional file 13: Table S10). Of the two BZ-derived drugs effective against F. hepatica, TCBZ targets immature and adult fluke while albendazole (ABZ), a widely used anti-nematode drug, kills only adult stages. Field isolates of F. hepatica that demonstrate resistance to ABZ remain susceptible to TCBZ, suggesting independent modes of action [42] that may be related to the diverse β-tubulin isotypes found in this parasite.

Anti-oxidant and detoxification systems

We investigated the evolution of anti-oxidant systems in F. hepatica, which are essential for adaptation to the host environment. Thus, as the parasite rapidly develops and enters different aerobic/anaerobic environments, anti-oxidant systems are critical not only for the detoxification of reactive oxygen and nitrogen species (ROS, RNS) generated by endogenous cellular metabolism but also as a frontline defence against superoxide and nitric oxide radicals released by host immune effector cells such as macrophages, eosinophils and neutrophils [43,44]. Parasitic platyhelminths express genes encoding superoxide dismutase (SOD), which dismutates superoxide radicals to H2O2, but they do not possess catalase, the enzyme responsible for converting H2O2 to water and oxygen (although a catalase gene is present in the genome of free-living flatworms, such as Schmidtea mediterranea). A catalase gene was not found in the F. hepatica genome, which is consistent with other parasitic platyhelminths, where the function of peroxide detoxification has been supplanted by the newly discovered thiol-dependent peroxiredoxin and its reducing partner thioredoxin [45]. The presence of the gene encoding the recently described thioredoxin glutathione reductase (TGR; Additional file 14: Table S11), together with the absence of distinct thioredoxin reductase and glutathione reductase genes, verified that TGR is the sole reductive enzyme that links the thioredoxin-dependent and glutathione-dependent anti-oxidant defence systems in this parasite, as observed in other platyhelminths [43,46,47]. The pivotal position of TGR between these two essential anti-oxidant systems makes it a promising target for the development of new anti-trematode drugs [46]. Included in the repertoire of F. hepatica anti-oxidants are genes encoding SOD, glutathione transferases (GSTs) and three fatty acid binding proteins (FABP; Additional file 14: Table S11). With the exception of the GSTs, we found that in F. hepatica each of these components is encoded by a single gene, which is in stark contrast to the expanded anti-oxidant gene families in the closely related trematodes, O. viverrini and C. sinensis [13,15]. Peroxiredoxin, GST and FABP are secreted by F. hepatica via non-classical secretory pathways, perhaps as cargoes of exosomes [37], into the host circulation where they influence host immune responses by recruiting and activating M2 macrophages [7,48,49] and suppressing dendritic cell activity [50]. These immunomodulatory effects contribute to the establishment of an immune suppressive environment, which aids parasite survival and the development of chronic disease.

Defence against chemical toxins, including those generated by the immune response, is essential in allowing helminths to adapt to, and survive in, the host environment. This is mediated in large part by the three-phase drug detoxification pathway: Phase I (activation), Phase II (conjugation) and Phase III (efflux). This pathway also acts to reduce drug activity and/or bioavailability and has been linked with TCBZ resistance [41,51-53]. Until now only indirect evidence existed for the presence of

Figure 5 Expression of tubulin genes. (a, b) Graphical representation of the expression of the tubulin genes across the F. hepatica lifecycle represented by α-tubulins (a) and β-tubulins (b). The expression of each tubulin isotype (iso) is shown in reads per kilobase per million (RPKM), which were calculated as an average log RPKM value for the multiple transcriptome datasets at each lifecycle stage.


Phase I cytochrome P450 (CYP 450) genes in F. hepatica [54]. Here, we report that the F. hepatica genome contains two Phase I CYP 450 gene models, one monooxygenase and one reductase; both are present in S. mansoni but only the reductase is found in C. sinensis. Basal expression levels did not reveal significant changes throughout development (Figure 6). Our analysis reveals that of the Phase II GSTs, the cytosolic GSTs were represented in the F. hepatica genome by at least seven separate genes comprising four different classes: four mu, one omega, one sigma and a putative novel zeta class. Zeta class GSTs demonstrate different activities to other GSTs and their role in helminths is poorly understood [55]. Comparison with S. mansoni and C. sinensis revealed this zeta class GST is unique to F. hepatica. Within the Phase III pathway [56,57], we identified 18 ATP-Binding Cassette (ABC) and five Multidrug and Toxic compound Extrusion (MATE) transporters in F. hepatica, similar to the numbers found in S. mansoni and C. sinensis. Although no significant developmental changes in gene expression were observed for the cytosolic GSTs, we found that two microsomal GSTs and several putative multidrug resistance proteins were upregulated in the immature and adult stages (Figure 6 and Additional file 15: Table S12), supporting the importance of the broad-spectrum detoxification phases in mediating defence against host-immune-mediated toxic assault and targeting via anti-parasitic drugs.

Conclusions
The F. hepatica genome is one of the largest pathogen genomes sequenced to date but we found no evidence of genome duplication or repeat expansion to explain this. Why this large genome should have evolved is unclear, especially given that its replication may be energetically costly or slow cell division [58]. For a parasite, such as Fasciola, that relies on the production of large numbers of eggs to facilitate transmission, one would expect strong selection against the accumulation of junk DNA if a large genome imposed a cost on egg production. It is possible that much of the non-coding portion of the F. hepatica genome is involved in gene regulation [59], and the size of F. hepatica’s genome may be related to its complex life cycle and variety of developmental morphs. If so, however, it would be difficult to explain why the F. hepatica genome is around three times the size of the Schistosoma genome [11], which has a similar life cycle. The large genome of F. hepatica therefore remains a paradox for which we may have to wait for comparative genome sequencing, currently underway [60], across other platyhelminth taxa to provide an answer.

The ability of F. hepatica to infect and survive in different tissue environments as it migrates from the intestine, through the liver and into the bile ducts is underpinned by gene duplication. Thus, our results show that, across the whole transcriptome, the strongest patterns of differential expression were observed among members of protease and tubulin gene families, and, in the case of proteases, these can be associated with changes in the active site and substrate specificity. While gene duplication appears to be a key process of adaptation to the parasitic life-style used by many helminths, it is notable that different helminth taxa have arrived at different evolutionary solutions. Comparison between

Figure 6 Expression of detoxification genes. Expression of Phase I, II and III detoxification genes through development relative to metacercariae.


F. hepatica and the bile-dwelling liver flukes C. sinensis and O. viverrini shows the expansion of the anti-oxidant SOD families [13] and cathepsin F protease families [13,61] in C. sinensis and O. viverrini but not in F. hepatica, and conversely the expansion of the cathepsin L family in the latter species only, which suggests differences in how these related parasites tackle life within the same environment. Differences in the specificity or developmental expression of detoxification genes, such as members of the ABC and MATE families, or of the tubulin gene family, may be important in understanding why F. hepatica responds so markedly differently to drugs at different stages of development compared to other digeneans. The exclusive activity of TCBZ against F. hepatica, and the potency of praziquantel against all these other digeneans except F. hepatica, points to a uniqueness in this parasite that needs to be resolved [37].

Fasciola hepatica is a highly adaptable parasite, evidenced by its ability to infect novel hosts, and it is notable that our results reveal high levels of polymorphism in genes specific to parasitic digeneans. Diversity within F. hepatica populations at genes important for the host-parasite interface may underpin a high evolutionary potential for F. hepatica to respond to changes in host availability or to other selection pressures. Similarly, the broad geographic range of F. hepatica would suggest that it is able to adapt to different climatic conditions and, in this respect, F. hepatica may also be able to exploit changes in climate in temperate regions, such as the UK, where warmer and wetter winters favour transmission and increased prevalence [62]. F. hepatica is seen to rapidly develop drug resistance to TCBZ [52] and the standing genetic diversity that we find across the genome suggests that it harbours the potential to evolve resistance to any novel drug treatment, which compounds the difficulty of controlling F. hepatica given the shortage of drugs effective against the juvenile stages. Nevertheless, the availability of a genome for F. hepatica that we provide, plus the characterisation of early molecular events in infection, should help support the development of novel drugs and vaccines, particularly against the migrating juveniles that cause much of the pathology. For example, RNAi-mediated knockdown of cathepsin L and B expression has been shown to prevent NEJs from crossing the intestine [45] and vaccine trials with cathepsin L1 have provided partial protection against infection and pathology [63]. The exploitation of a broader range of targets within the F. hepatica genome is now a priority given the widespread prevalence of resistance to TCBZ within F. hepatica populations. The commercial importance for agriculture of developing a replacement treatment for TCBZ may stimulate new treatments that could be translated to other important digenean parasites, including those of humans.

Methods
Source of parasite material
Adult parasites from each of five isolates were used for genome sequencing: (1) FhepLivSP, from the laboratory-maintained Shrewsbury isolate (Ridgeway Research, UK); (2) FhepLivS1, a clonal line derived from the Shrewsbury population; (3) and (4) FhepLivR1 and R3, clonal lines derived from two isolates recovered from sheep in Northern England naturally infected with F. hepatica; and (5) FhepLivR2, a clonal line derived from a F. hepatica population from naturally infected sheep in South West Wales. For RNA sequencing, the following were used: (1) metacercariae and newly excysted juveniles (NEJ) at 1, 3 and 24 h post excystment from a North American isolate (Baldwin Aquatics Inc., Monmouth, OR, USA), together with 21-day-old juvenile flukes recovered from mice infected with the same isolate; and (2) an adult parasite recovered from the bile ducts of cattle naturally infected with F. hepatica in Uruguay. All animal work was conducted with ethical approval from the Universities of Liverpool (UK) and McGill (Canada).

Sequencing
Approximately 10 μg of DNA from a single, adult fluke taken from the FhepLivS1 strain was used to prepare Illumina TruSeq fragment libraries and 26 Gbp of 2 × 250 bp reads were generated on an Illumina MiSeq (mean insert sizes 470 bp and 580 bp). Further individuals of FhepLivS1 were used to prepare Nextera Mate-Pair libraries (3 Kbp and 10 Kbp) and approximately 60 million 2 × 100 bp Illumina reads were generated from each library. Approximately 200 μg of DNA prepared from several flukes of the FhepLivSP isolate was used to construct 2 Kbp, 5 Kbp and 8 Kbp mate-pair libraries, which were sequenced either on an Illumina GAII or HiSeq 2000. For each of the parasite isolates FhepLivSP, FhepLivS1, FhepLivR1, FhepLivR2 and FhepLivR3, DNA was isolated from a single adult; a fragment library was prepared and sequenced on a single lane of an Illumina HiSeq 2000 to yield approximately 24 Gbp of sequence (Centre for Genomic Research, Liverpool, UK). For RNA sequencing, Illumina TruSeq RNA libraries were prepared from biological replicates of metacercariae (3 replicates), NEJ 1 h (2 replicates), NEJ 3 h (2 replicates), NEJ 24 h (2 replicates), Juveniles 21 days (1 replicate) and Adult (1 replicate) (Genome Quebec, Montreal, Canada).

Assembly and annotation
Illumina MiSeq reads were trimmed to Q ≥30 and adaptors removed using Sickle and Perl, and assembled using Newbler (Roche GS-Assembler v2.6) with flags set for large genome and a heterozygote sample. Mate-pair reads were first mapped to these contigs using Bowtie2 [64] to remove duplicates and wrongly orientated reads, and scaffolded into contigs using SSPACE [65]. Gap filling was achieved using GapFiller for 2 × 250 bp and 2 × 100 bp paired-end reads and run for three iterations (available as ENA accessions LN627018-LN647175). RNAseq data were mapped to scaffolds within the assembled genome greater than 3 Kbp using TopHat2 to identify transcribed regions and splice junctions. These, together with RNAseq data assembled using Trinity and S. mansoni protein sequence, were passed to the MAKER pipeline [66] to predict genes. Repeatmasker, Windowmasker and Dustmasker were used to identify repetitive regions. CEGMA v2.4 [67], which searches for 248 highly conserved genes, was used to assess the completeness of the genome with the settings for vertebrates to allow long introns. REAPR [68] was used to assess the quality of scaffolding within the assembly and to produce an alternative, more conservative assembly by splitting scaffolds at locations with lower support (available as ENA accessions LN736597-LN774150). Homologs of F. hepatica predicted protein sequences were identified within UniProt using BLAST and functional domains identified using InterPro. InParanoid and MultiParanoid [69] were used to identify ortholog clusters from Schistosoma mansoni (v3.1.16), Clonorchis sinensis (v3.7), Schmidtea mediterranea and Echinococcus multilocularis (v29042013) predicted proteins [11-15,70].
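The read-trimming step to Q ≥30 can be illustrated with a minimal sketch. This is a simplified sliding-window trimmer written for illustration only; it is not a reimplementation of Sickle, and the function names, the window size of 4 and the Phred+33 encoding are assumptions that do not come from the Methods.

```python
from typing import List


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into integer scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq: str, quality: str, threshold: int = 30, window: int = 4) -> str:
    """Cut the read at the first window whose mean quality is below threshold.

    Hypothetical helper: the window size is an illustrative choice.
    """
    scores = phred_scores(quality)
    if not scores:
        return ""
    window = min(window, len(scores))
    for start in range(len(scores) - window + 1):
        mean_q = sum(scores[start:start + window]) / window
        if mean_q < threshold:
            # Everything from this window onwards is discarded.
            return seq[:start]
    return seq


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
trimmed = trim_3prime("ACGTACGTAC", "IIIIIIII##")
```

A read whose windows all average Q30 or better is returned unchanged, while a read that is poor from its first window is reduced to an empty string.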

Gene expression analysis
RNAseq libraries were mapped to MAKER gene models using TopHat2 [71] and read counts extracted using htseq-count. Genes with a sum of at least five reads across all libraries were analysed for differential expression in edgeR [72] using a negative binomial model of successive developmental stages relative to metacercariae and with tagwise dispersion estimated from all samples. Hierarchical clustering, based on model coefficients, was used to group differentially expressed genes by similarity of expression into 12 clusters and hypergeometric tests were used to test for over-representation of gene ontology terms within each cluster relative to the whole gene-set.
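The hypergeometric test for over-representation of a gene ontology term within a cluster can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name and the example counts are hypothetical, and the software actually used for these tests is not stated in the Methods.

```python
from math import comb


def hypergeom_upper_tail(k: int, cluster_size: int, annotated: int, total: int) -> float:
    """Return P(X >= k) for a hypergeometric draw.

    total        -- genes in the whole gene-set
    annotated    -- genes in the whole gene-set carrying the GO term
    cluster_size -- genes in the co-expression cluster
    k            -- genes in the cluster carrying the GO term
    """
    upper = min(cluster_size, annotated)
    denom = comb(total, cluster_size)
    numer = sum(
        comb(annotated, i) * comb(total - annotated, cluster_size - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


# Illustrative numbers only: 10 genes, 3 annotated, cluster of 4 with 2 hits.
p_value = hypergeom_upper_tail(k=2, cluster_size=4, annotated=3, total=10)
```

A small p-value suggests that the term occurs in the cluster more often than expected from random sampling of the gene-set; in practice, a correction for testing many terms would normally follow.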

Genetic diversity analysis
Reads for each isolate were mapped to the genome using Bowtie2 and resulting bam files passed together to the GATK pipeline [73] for local realignment and SNP calling. SNP calls within genes were filtered by score >100, FS <60 and combined coverage between 10 and 250. Because gene duplicates collapsed within the assembly might give erroneously high diversity for some genes, genes were excluded with a median coverage greater than 213 and by heuristic scoring of SNPs appearing as heterozygotes in all samples. IGV was used to manually assess the success of filtering parameters. Levels of nucleotide diversity for different classes of SNPs were calculated for all genes. To identify the most polymorphic genes, a generalised linear model with a Poisson distribution was fitted to the number of non-synonymous SNPs as the response variable versus number of non-synonymous sites as a quadratic function. This accounts for the distribution of SNP counts within a gene, the fact that longer genes have the potential to have more SNPs and the possibility that genes of different lengths may evolve at different rates. Genes were classified according to their conservation across platyhelminths by the presence of orthologs retained across increasing taxonomic scales. This was preferred to assessing conservation on the basis of non-synonymous versus synonymous substitution ratios between species, since synonymous sites were saturated by multiple substitutions at such broad taxonomic scales. The robustness of the generalised linear model was tested by randomly sampling equal proportions of genes from each quartile of the length distribution for each class of conserved gene, to ensure that no bias was introduced if the length of a gene was correlated to its level of conservation. From the generalised linear model, the most polymorphic genes were identified as having residuals in the top 1% quantile and hypergeometric tests were used to test for over-representation of GO terms within these highly polymorphic genes relative to the gene-set as a whole.
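The SNP filtering thresholds described above can be expressed as a short sketch. The record layout, field names and function names below are hypothetical and are not GATK's API; treating the coverage bounds as inclusive is also an assumption. The heuristic scoring of SNPs that appear heterozygous in all samples is not covered.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class SnpCall:
    gene_id: str
    qual: float   # variant score
    fs: float     # FS strand-bias value
    depth: int    # combined coverage across isolates


def passes_filters(call: SnpCall, min_qual: float = 100.0, max_fs: float = 60.0,
                   min_depth: int = 10, max_depth: int = 250) -> bool:
    """Apply score >100, FS <60 and coverage between 10 and 250 (bounds assumed inclusive)."""
    return (call.qual > min_qual
            and call.fs < max_fs
            and min_depth <= call.depth <= max_depth)


def filter_calls(calls: Iterable[SnpCall],
                 gene_median_coverage: Dict[str, float],
                 max_median_coverage: float = 213.0) -> List[SnpCall]:
    """Keep SNPs that pass the site filters and lie in genes not flagged as collapsed duplicates."""
    kept = []
    for call in calls:
        # Genes with median coverage above the cut-off are excluded entirely.
        if gene_median_coverage.get(call.gene_id, 0.0) > max_median_coverage:
            continue
        if passes_filters(call):
            kept.append(call)
    return kept


# Illustrative records only.
calls = [
    SnpCall("g1", 150.0, 10.0, 50),   # passes
    SnpCall("g1", 90.0, 10.0, 50),    # score too low
    SnpCall("g2", 200.0, 75.0, 50),   # FS too high
    SnpCall("g3", 200.0, 5.0, 300),   # coverage too high
    SnpCall("g4", 120.0, 5.0, 40),    # gene excluded by median coverage
]
median_cov = {"g1": 60.0, "g2": 80.0, "g3": 100.0, "g4": 230.0}
kept = filter_calls(calls, median_cov)
```

Separating the per-site filter from the per-gene exclusion keeps each threshold adjustable on its own, which is convenient when the filtered calls are inspected manually afterwards.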

Discovery and characterisation of gene families
Discovery of F. hepatica gene families was carried out using BLAST analysis (NCBI v2.2.27 and v2.2.29), with available published gene sequences of interest (Additional file 16: Table S13), followed by manual annotation. Comparative analysis was carried out using the closely related trematode genome sequence datasets: Clonorchis sinensis and Schistosoma mansoni. Sequence alignment and phylogenetic analysis were carried out using Clustal Omega [74] and MEGA5 [75], respectively.

Data availability
Data are freely available from WormBase ParaSite and the European Nucleotide Archive under accessions LN627018-LN647175 (assembly data), PRJEB6687 (genomic read data) and PRJEB6904 (transcriptomic read data).

Additional files

Additional file 1: Table S1. Comparison of the F. hepatica genome with other parasitic trematode genomes.

Additional file 2: Table S2. Summary statistics of CEGMA analysis.

Additional file 3: Figure S1. Distribution of average read coverage across gene models.

Additional file 4: Figure S2. Distribution of repeat length within the Fasciola hepatica genome.

Additional file 5: Table S3. Number of Fasciola hepatica ortholog clusters identified in Clonorchis sinensis, Schistosoma mansoni, Schmidtea mediterranea and Echinococcus multilocularis.

Additional file 6: Table S4. Association between number of non-synonymous SNPs per gene and orthology with other platyhelminths. Showing the output of a generalised linear model with a quasi-Poisson error structure.

Additional file 7: Table S5. Over-representation of gene ontology (GO) terms in the top 1% quantile polymorphic genes, as estimated from number of non-synonymous SNPs controlled for length.

Additional file 8: Table S6. Top 1% most polymorphic genes.

Additional file 9: Table S7. Over-representation of GO terms associated with each co-expression cluster.

Additional file 10: Table S8. Identity of gene models within each of 12 co-expression clusters, including annotation.

Additional file 11: Table S9. Differential expression of genes presented in Figure 3. Values are log2 fold-changes relative to metacercariae.

Additional file 12: Figure S3. Analysis of the clades representing the cathepsin L cysteine proteases.

Additional file 13: Table S10. Comparative analysis of the F. hepatica β-tubulin genes with C. sinensis and S. mansoni, based on protein similarity.

Additional file 14: Table S11. F. hepatica antioxidant analysis.

Additional file 15: Table S12. Differential expression of detoxification genes.

Additional file 16: Table S13. Table of Fasciola sequences used for BLAST analysis.

Competing interests
The authors declare that they have no competing interests.

Authors’ contributions
JH, JPD, SP and DW conceived the study, obtained funding and contributed resources. KC, JH, JPD and SP designed the study, interpreted the data and wrote the manuscript, with substantial input from PD, DW and JLC. SP performed genome assembly, automated annotation and analysis of polymorphism and differential expression. KC and JLC annotated and analysed gene families. KC, JH, PD and DW generated and prepared parasite material for sequencing. All authors read and approved the final manuscript.

Acknowledgements
We are grateful to Katherine Allen for producing clonal lines of liver fluke and Catherine Hartley, University of Liverpool and Paula Martin and Oliver Gladstone, Ridgeway Research, UK for technical assistance in producing parasite material. We are grateful for sequencing support provided by Margaret Hughes and Suzanne Kay at the Centre for Genomic Research, University of Liverpool, and Mathieu Bourgey, Genome Quebec, Canada. This study was funded by grants from the Biotechnology and Biological Sciences Research Council, UK (BB1002480/1) to JH, SP, DW and JLC, and a Ministère Économie, Innovation et Exportation (MEIE), Québec, award and a European Research Council Advanced Grant (HELIVAC, 322725) to JPD.

Author details
1 Institute of Infection and Global Health, University of Liverpool, Liverpool, UK. 2 School of Biological Sciences, Medical Biology Centre, Queen’s University of Belfast, Belfast, Northern Ireland, UK. 3 Institute of Parasitology, McGill University, Montreal, Quebec, Canada. 4 Institut National de Santé Publique du Québec, Montreal, Quebec, Canada. 5 Liverpool School of Tropical Medicine, Liverpool, UK. 6 Institute of Integrative Biology, University of Liverpool, Liverpool, UK.

Received: 10 November 2014 Accepted: 13 March 2015

References
1. Keiser J, Utzinger J. Food-borne trematodiases. Clin Microbiol Rev. 2009;22:466–83.

2. Fürst T, Keiser J, Utzinger J. Global burden of human food-borne trematodiasis: a systematic review and meta-analysis. Lancet Infect Dis. 2012;12:210–21.

3. Robinson MW, Dalton JP. Zoonotic helminth infections with particular emphasis on fasciolosis and other trematodiases. Philos Trans R Soc Lond B Biol Sci. 2009;364:2763–76.

4. Videnova K, Mackay DKJ. Availability of vaccines against major animal diseases in the European Union. Rev Sci Tech. 2012;31:971–8.

5. Spithill TW, Carmona C, Piedrafita D, Smooker PM. Prospects for immunoprophylaxis against Fasciola hepatica (liver fluke). In: Targets, Screens, Drugs and Vaccines. Weinheim: Wiley-VCH Verlag GmbH & Co. KGaA; 2012. p. 465–84.

6. Piedrafita D, Spithill TW, Smith RE, Raadsma HW. Improving animal and human health through understanding liver fluke immunology. Parasite Immunol. 2010;32:572–81.

7. Dalton JP, Robinson MW, Mulcahy G, O’Neill SM, Donnelly S. Immunomodulatory molecules of Fasciola hepatica: Candidates for both vaccine and immunotherapeutic development. Vet Parasitol. 2013;195:272–85.

8. Claridge J, Diggle P, McCann CM, Mulcahy G, Flynn R, McNair J, et al. Fasciola hepatica is associated with the failure to detect bovine tuberculosis in dairy cattle. Nat Commun. 2012;3:853.

9. García HH, Moro PL, Schantz PM. Zoonotic helminth infections of humans: echinococcosis, cysticercosis and fascioliasis. Curr Opin Infect Dis. 2007;20:489–94.

10. Mas-Coma S, Bargues M-D, Valero MA. Fascioliasis and other plant-borne trematode zoonoses. Int J Parasitol. 2005;35:1255–78.

11. Berriman M, Haas BJ, LoVerde PT, Wilson RA, Dillon GP, Cerqueira GC, et al. The genome of the blood fluke Schistosoma mansoni. Nature. 2009;460:352–65.

12. Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, et al. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 2012;6:e1455.

13. Young ND, Nagarajan N, Lin SJ, Korhonen PK, Jex AR, Hall RS, et al. The Opisthorchis viverrini genome provides insights into life in the bile duct. Nat Commun. 2014;5:e4378.

14. Zhou Y, Zheng H, Chen Y, Zhang L, Wang K, Guo J, et al. The Schistosoma japonicum genome reveals features of host-parasite interplay. Nature. 2009;460:345–51.

15. Huang Y, Chen W, Wang X, Liu H, Chen Y, Guo L, et al. The carcinogenic liver fluke, Clonorchis sinensis: new assembly, reannotation and analysis of the genome and characterization of tissue transcriptomes. PLoS One. 2013;8:e54732.

16. Young ND, Jex AR, Li B, Liu S, Yang L, Xiong Z, et al. Whole-genome sequence of Schistosoma haematobium. Nat Genet. 2012;44:221–5.

17. Sanderson AR. Maturation and probable gynogenesis in the liver fluke, Fasciola hepatica L. Nature. 1953;172:110–2.

18. Hirai H, Hirai Y, LoVerde PT. Evolution of sex chromosomes ZW of Schistosoma mansoni inferred from chromosome paint and BAC mapping analyses. Parasitol Int. 2012;61:684–9.

19. Komalamisra C. Chromosomes and C-banding of Opisthorchis viverrini. Southeast Asian J Trop Med Public Health. 1999;30:576–9.

20. Park G-M, Im K-I, Huh S, Yong T-S. Chromosomes of the liver fluke, Clonorchis sinensis. Korean J Parasitol. 2000;38:201.

21. Adelson DL, Raison JM, Edgar RC. Characterization and distribution of retrotransposons and simple sequence repeats in the bovine genome. Proc Natl Acad Sci. 2009;106:12855–60.

22. Li W-H, Sadler LA. Low nucleotide diversity in man. Genetics. 1991;129:513–23.

23. Gayral P, Melo-Ferreira J, Glémin S, Bierne N, Carneiro M, Nabholz B, et al. Reference-free population genomics from next-generation transcriptome data and the vertebrate–invertebrate gap. PLoS Genet. 2013;9:e1003457.

24. Skuce P, Stenhouse L, Jackson F, Hypša V, Gilleard J. Benzimidazole resistance allele haplotype diversity in United Kingdom isolates of Teladorsagia circumcincta supports a hypothesis of multiple origins of resistance by recurrent mutation. Int J Parasitol. 2010;40:1247–55.

25. Anderson TJC, Romero-Abel ME, Jaenike J. Genetic structure and epidemiology of Ascaris populations: patterns of host affiliation in Guatemala. Parasitol. 1993;107:319–34.

26. Scheiffele P. Cell-cell signaling during synapse formation in the CNS. Annu Rev Neurosci. 2003;26:485–508.

27. Zito K, Fetter RD, Goodman CS, Isacoff EY. Synaptic clustering of Fasciclin II and Shaker: essential targeting sequences and role of Dlg. Neuron. 1997;19:1007–16.


28. Tuttle AM, Hoffman TL, Schilling TF. Rabconnectin-3a regulates vesicle endocytosis and canonical wnt signaling in zebrafish neural crest migration. PLoS Biol. 2014;12:e1001852.

29. Kalbe M, Haberl B, Haas W. Snail host finding by Fasciola hepatica and Trichobilharzia ocellata: compound analysis of “miracidia-attracting glycoproteins”. Exp Parasitol. 2000;96:231–42.

30. Andrews SJ. The life cycle of Fasciola hepatica. In: Dalton JP, editor. Fasciolosis. Wallingford: CABI; 1999.

31. McVeigh P, Atkinson L, Marks NJ, Mousley A, Dalzell JJ, Sluder A, et al. Parasite neuropeptide biology: Seeding rational drug target selection? Int J Parasitol Drugs Drug Resist. 2012;2:76–91.

32. Tielens AGM. Metabolism. In: Dalton JP, editor. Fasciolosis. Wallingford: CABI; 1999.

33. McVeigh P, Maule AG, Dalton JP, Robinson MW. Fasciola hepatica virulence-associated cysteine peptidases: a systems biology perspective. Microbes Infect. 2012;14:301–10.

34. Robinson MW, Tort JF, Lowther J, Donnelly SM, Wong E, Xu W, et al. Proteomics and phylogenetic analysis of the cathepsin L protease family of the helminth pathogen Fasciola hepatica: expansion of a repertoire of virulence-associated factors. Mol Cell Proteomics. 2008;7:1111–23.

35. Turk V, Stoka V, Vasiljeva O, Renko M, Sun T, Turk B, et al. Cysteine cathepsins: from structure, function and regulation to new frontiers. Biochim Biophys Acta. 2012;1824:68–88.

36. Corvo I, O’Donoghue AJ, Pastro L, Pi-Denis N, Eroy-Reveles A, Roche L, et al. Dissecting the active site of the collagenolytic cathepsin L3 protease of the invasive stage of Fasciola hepatica. PLoS Negl Trop Dis. 2013;7:e2269.

37. Robinson MW, Menon R, Donnelly SM, Dalton JP, Ranganathan S. An integrated transcriptomics and proteomics analysis of the secretome of the helminth pathogen Fasciola hepatica: proteins associated with invasion and infection of the mammalian host. Mol Cell Proteomics. 2009;8:1891–907.

38. Dalton JP, Brindley PJ, Donnelly S, Robinson MW. The enigmatic asparaginyl endopeptidase of helminth parasites. Trends Parasitol. 2009;25:59–61.

39. Ryan LA, Hoey E, Trudgett A, Fairweather I, Fuchs M, Robinson MW, et al. Fasciola hepatica expresses multiple α- and β-tubulin isotypes. Mol Biochem Parasitol. 2008;159:73–8.

40. Fuchs MA, Ryan LA, Chambers EL, Moore CM, Fairweather I, Trudgett A, et al. Differential expression of liver fluke β-tubulin isotypes at selected life cycle stages. Int J Parasitol. 2013;43:1133–9.

41. Devine C, Brennan GP, Lanusse CE, Alvarez LI, Trudgett A, Hoey E, et al. Inhibition of triclabendazole metabolism in vitro by ketoconazole increases disruption to the tegument of a triclabendazole-resistant isolate of Fasciola hepatica. Parasitol Res. 2011;109:981–95.

42. Sanabria R, Ceballos L, Moreno L, Romero J, Lanusse C, Alvarez L. Identification of a field isolate of Fasciola hepatica resistant to albendazole and susceptible to triclabendazole. Vet Parasitol. 2013;193:105–10.

43. Williams DL, Bonilla M, Gladyshev VN, Salinas G. Thioredoxin glutathione reductase-dependent redox networks in platyhelminth parasites. Antioxid Redox Signal. 2013;19:735–45.

44. Lu J, Holmgren A. The thioredoxin antioxidant system. Free Radic Biol Med. 2014;66:75–87.

45. McGonigle L, Mousley A, Marks NJ, Brennan GP, Dalton JP, Spithill TW, et al. The silencing of cysteine proteases in Fasciola hepatica newly excysted juveniles using RNA interference reduces gut penetration. Int J Parasitol. 2008;38:149–55.

46. Prast-Nielsen S, Huang H-H, Williams DL. Thioredoxin glutathione reductase: Its role in redox biology and potential as a target for drugs against neglected diseases. Biochim Biophys Acta. 2011;1810:1262–71.

47. Maggioli G, Silveira F, Martín-Alonso JM, Salinas G, Carmona C, Parra F. A recombinant thioredoxin-glutathione reductase from Fasciola hepatica induces a protective response in rabbits. Exp Parasitol. 2011;129:323–30.

48. Robinson MW, Hutchinson WF, Dalton JP, Donnelly S. Peroxiredoxin: a central player in immune modulation. Parasite Immunol. 2010;32:305–13.

49. Donnelly S, Stack CM, O’Neill SM, Sayed AA, Williams DL, Dalton JP. Helminth 2-Cys peroxiredoxin drives Th2 responses through a mechanism involving alternatively activated macrophages. FASEB J. 2008;22:4022–32.

50. Dowling DJ, Hamilton CM, Donnelly S, La Course J, Brophy PM, Dalton J, et al. Major secretory antigens of the helminth Fasciola hepatica activate a suppressive dendritic cell phenotype that attenuates Th17 cells but fails to activate Th2 immune responses. Infect Immun. 2010;78:793–801.

51. Wilkinson R, Law CJ, Hoey EM, Fairweather I, Brennan GP, Trudgett A. An amino acid substitution in Fasciola hepatica P-glycoprotein from triclabendazole-resistant and triclabendazole-susceptible populations. Mol Biochem Parasitol. 2012;186:69–72.

52. Brennan GP, Fairweather I, Trudgett A, Hoey E, McCoy, McConville M, et al. Understanding triclabendazole resistance. Exp Mol Pathol. 2007;82:104–9.

53. Virkel G, Lifschitz A, Sallovitz J, Ballent M, Scarcella S, Lanusse C. Inhibition of cytochrome P450 activity enhances the systemic availability of triclabendazole metabolites in sheep. J Vet Pharmacol Ther. 2009;32:79–86.

54. Devine C, Brennan GP, Lanusse CE, Alvarez LI, Trudgett A, Hoey E, et al. Potentiation of triclabendazole action in vivo against a triclabendazole-resistant isolate of Fasciola hepatica following its co-administration with the metabolic inhibitor, ketoconazole. Vet Parasitol. 2012;184:37–47.

55. Morphew RM, Eccleston N, Wilkinson TJ, McGarry J, Perally S, Prescott M, et al. Proteomics and in silico approaches to extend understanding of the glutathione transferase superfamily of the tropical liver fluke Fasciola gigantica. J Proteome Res. 2012;11:5876–89.

56. Reed MB, Panaccio M, Strugnell RA, Spithill TW. Developmental expression of a Fasciola hepatica sequence homologous to ABC transporters. Int J Parasitol. 1998;28:1375–81.

57. Young ND, Hall RS, Jex AR, Cantacessi C, Gasser RB. Elucidating the transcriptome of Fasciola hepatica – A key to fundamental and biotechnological discoveries for a neglected parasite. Biotechnol Adv. 2010;28:222–31.

58. Pagel M, Johnstone RA. Variation across species in the size of the nuclear genome supports the junk-dna explanation for the c-value paradox. Proc Biol Sci. 1992;249:119–24.

59. Consortium TEP. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74.

60. Holroyd N, Sanchez-Flores A. Producing parasitic helminth reference and draft genomes at the Wellcome Trust Sanger Institute. Parasite Immunol. 2012;34:100–7.

61. Kang J-M, Bahk Y-Y, Cho P-Y, Hong S-J, Kim T-S, Sohn W-M, et al. A family of cathepsin F cysteine proteases of Clonorchis sinensis is the major secreted proteins that are expressed in the intestine of the parasite. Mol Biochem Parasitol. 2010;170:7–16.

62. Fox NJ, White PCL, McClean CJ, Marion G, Evans A, Hutchings MR. Predicting impacts of climate change on Fasciola hepatica risk. PLoS One. 2011;6:e16126.

63. Zafra R, Pérez-Écija RA, Buffoni L, Moreno P, Bautista MJ, Martínez-Moreno A, et al. Early and late peritoneal and hepatic changes in goats immunized with recombinant cathepsin L1 and infected with Fasciola hepatica. J Comp Pathol. 2013;148:373–84.

64. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.

65. Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics. 2011;27:578–9.

66. Cantarel BL, Korf I, Robb SMC, Parra G, Ross E, Moore B, et al. MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 2007;18:188–96.

67. Parra G, Bradnam K, Korf I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics. 2007;23:1061–7.

68. Hunt M, Kikuchi T, Sanders M, Newbold C, Berriman M, Otto TD. REAPR: a universal tool for genome assembly evaluation. Genome Biol. 2013;14:R47.

69. Alexeyenko A, Tamas I, Liu G, Sonnhammer ELL. Automatic clustering of orthologs and inparalogs shared by multiple proteomes. Bioinformatics. 2006;22:e9–15.

70. Blythe MJ, Kao D, Malla S, Rowsell J, Wilson R, Evans D, et al. A dual platform approach to transcript discovery for the planarian Schmidtea mediterranea to establish RNAseq for stem cell and regeneration biology. PLoS One. 2010;5:e15617.

71. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14:R36.

72. McCarthy DJ, Chen Y, Smyth GK. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 2012;40:4288–97.

73. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43:491–8.

74. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011;7:539–9.


75. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–9.

76. Centers for Disease Control and Prevention. Laboratory identification of parasitic diseases of public health concern. [http://www.cdc.gov/dpdx/fascioliasis/index.html]
