Sequencing, biochemical characterization, crystal structure and molecular dynamics of Cellobiohydrolase Cel7A from Geotrichum candidum 3C

Anna S. Borisova¹,², Elena V. Eneyskaya², Kirill S. Bobrov², Suvamay Jana³, Anton Logachev⁴, Dmitrii E. Polev⁵, Alla L. Lapidus⁶, Farid M. Ibatullin², Umair Saleem¹, Mats Sandgren¹, Christina M. Payne³, Anna A. Kulminskaya²,⁷ and Jerry Ståhlberg¹

1 Department of Chemistry and Biotechnology, Swedish University of Agricultural Sciences, Uppsala, Sweden

2 National Research Centre «Kurchatov Institute», B.P. Konstantinov Petersburg Nuclear Physics Institute, Gatchina, Orlova roscha, Russia

3 Department of Chemical and Materials Engineering, University of Kentucky, Lexington, KY, USA

4 Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, Russia

5 Research Resource Centre «Molecular and Cell Technologies», St. Petersburg State University, Russia

6 Centre for Algorithmic Biotechnology, St. Petersburg Academic University, Russia

7 Department of Medical Physics, Peter the Great St. Petersburg Polytechnic University, Russia

Keywords
biomass degradation; cellulase; Geotrichum candidum; molecular dynamics; X-ray structure

Correspondence
J. Ståhlberg, Department of Chemistry and Biotechnology, Swedish University of Agricultural Sciences, PO Box 7015, SE-750 07 Uppsala, Sweden
Tel: +46-18-673182
E-mail: [email protected]

C. M. Payne, Department of Chemical and Materials Engineering, University of Kentucky, 177 F. Paul Anderson Tower, Lexington, KY 40506, USA
Fax: +1 859 323 1929
Tel: +1 859 257 2902
E-mail: [email protected]

A. A. Kulminskaya, National Research Centre «Kurchatov Institute», B.P. Konstantinov Petersburg Nuclear Physics Institute, 188300, Gatchina, Orlova roscha, Russia
Fax: +7 81371 32303
Tel: +7 813 7132014
E-mail: [email protected]

Present address
Umair Saleem, Birkedommervej 17, 3TH, 2400, København NV, Denmark

(Received 17 June 2015, revised 13 August 2015, accepted 4 September 2015)

The ascomycete Geotrichum candidum is a versatile and efficient decay fungus that is involved, for example, in biodeterioration of compact discs; notably, the 3C strain was previously shown to degrade filter paper and cotton more efficiently than several industrial enzyme preparations. Glycoside hydrolase (GH) family 7 cellobiohydrolases (CBHs) are the primary constituents of industrial cellulase cocktails employed in biomass conversion, and feature tunnel-enclosed active sites that enable processive hydrolytic cleavage of cellulose chains. Understanding the structure–function relationships defining the activity and stability of GH7 CBHs is thus of keen interest. Accordingly, we report the comprehensive characterization of the GH7 CBH secreted by G. candidum (GcaCel7A). The bimodular cellulase consists of a family 1 cellulose-binding module (CBM) and linker connected to a GH7 catalytic domain that shares 64% sequence identity with the archetypal industrial GH7 CBH of Hypocrea jecorina (HjeCel7A). GcaCel7A shows activity on Avicel cellulose similar to HjeCel7A, with less product inhibition, but has a lower temperature optimum (50 °C versus 60–65 °C, respectively). Five crystal structures, with and without bound thio-oligosaccharides, show conformational diversity of tunnel-enclosing loops, including a form with partial tunnel collapse at subsite −4 not reported previously in GH7. Also, the first O-glycosylation site in a GH7 crystal structure is reported, on a loop where the glycan probably influences loop contacts across the active site and interactions with the cellulose surface. The GcaCel7A structures indicate higher loop flexibility than HjeCel7A, in accordance with sequence modifications. However, GcaCel7A retains small fluctuations in molecular simulations, suggesting high processivity and low endo-initiation probability, similar to HjeCel7A.

Database
Structural data are available in the Protein Data Bank under the accession numbers 5AMP, 4ZZV, 4ZZW, 4ZZT, and 4ZZU. The Geotrichum candidum GH family 7 cellobiohydrolase nucleotide sequence is available in GenBank under accession number KJ958925.

Enzymes
Glycoside hydrolase family 7 reducing end acting cellobiohydrolase

FEBS Journal (2015) © 2015 FEBS
doi:10.1111/febs.13509

Introduction

Cellulose-degrading enzymes play a key role in global carbon recycling, a process of considerable ecological importance. As much as ~15% of all atmospheric carbon dioxide is fixed yearly by plants and incorporated into vast amounts of biomass, which is eventually degraded and recycled by enzymes from fungi and bacteria [1]. Lignocellulosic biomass is also by far the most abundant renewable carbon source available to humanity in the transition from fossil-based to sustainable production of fuels and chemicals. The potential of enzymes for sustainable biomass utilization is attracting growing attention and is expected to play a vital role in the future. Already today, cellulases and associated biomass-degrading enzymes constitute the third largest group of industrially produced enzymes [2]; these enzymes are commonly used in applications such as cotton processing and paper recycling, as detergent enzymes, in juice extraction, and as animal feed additives [3,4]. Given a viable worldwide biomass-to-biofuel industry, cellulases will undoubtedly be the most highly produced proteins in the world, by orders of magnitude.

Basic and applied research on cellulases for biofuel production began in the 1970s [5,6]. The enzyme system most extensively studied is that of the filamentous ascomycete fungus Hypocrea jecorina (also known as Trichoderma reesei). H. jecorina remains the predominant organism used for the production of cellulases, thanks to the development of hyperproducing industrial strains capable of secreting > 50 g·L⁻¹ of protein [7]. However, each component of the native H. jecorina secretome is not necessarily primed for efficient hydrolytic turnover and stability under industrial conditions, prompting the search for alternative enzymes among the vast array of cellulolytic microorganisms that exist in nature.

The filamentous yeast-like fungus Geotrichum candidum strain 3C was isolated from a rotting rope [8] and found to have high cellulolytic and xylanolytic activities [9,10]. As early as the 1970s, it was shown that the G. candidum 3C cellulase complex was more efficient than that of well-studied Trichoderma sp. [8,11,12]. The cellulases of G. candidum 3C were initially characterized, but unfortunately, the enzymes have been undeservedly forgotten for decades. Nevertheless, an enzyme preparation from this fungus, ‘Cellokandin G10x’ [11], has been used in the pulp and paper industry for waste paper utilization [13], and applied research has been carried out on G. candidum 3C to improve the process of bleaching of softwood and hardwood kraft pulp [9,14]. Interestingly, G. candidum and related species have been found to be able to degrade various natural and artificial materials, and to be responsible for biodeterioration of compact discs by destroying their information pits [15]. From an evolutionary point of view, G. candidum is relatively distantly related to other ascomycete fungi with characterized glycoside hydrolase (GH) family 7 cellobiohydrolases (CBHs). The draft genome sequence of G. candidum 3C has been published [16] and is available in GenBank under the strain name Galactomyces candidus 3C. Also, the draft genome sequence of another G. candidum strain, CLIB 918 (ATCC 204307), described as a dairy yeast within the Saccharomycotina, has been published very recently [17].

The most abundant protein secreted by G. candidum 3C was previously isolated and partially characterized, showing properties that are typical of GH7 CBHs (EC 3.2.1.176) [8,10,11]. GH7 CBHs are commonly the major components in the secretomes of potent cellulolytic fungi and are also key rate-limiting factors in commercial cellulase cocktails [18]. Approximately one-third of the known GH7 members, including Cel7A from G. candidum (GcaCel7A) and Cel7A from H. jecorina (HjeCel7A), are bimodular proteins with a family 1 cellulose-binding module (CBM) connected to the GH7 catalytic module by a highly glycosylated and flexible Ser/Thr-rich linker peptide.

Abbreviations
APO1, apo structure 1 of catalytic domain of Geotrichum candidum Cel7A; APO2, apo structure 2 of catalytic domain of Geotrichum candidum Cel7A; BC, bacterial cellulose; CBH, cellobiohydrolase; CBM, cellulose-binding module; CMC, carboxymethyl cellulose; CNP-G2, 2-chloro-4-nitrophenyl β-cellobioside; G2, catalytic domain of Geotrichum candidum Cel7A ligand complex with cellobioside; G3, catalytic domain of Geotrichum candidum Cel7A ligand complex with cellotrioside; G4, catalytic domain of Geotrichum candidum Cel7A ligand complex with cellotetraoside; GcaCel7A_CD, catalytic domain of Geotrichum candidum Cel7A; GcaCel7A, Geotrichum candidum Cel7A; GH, glycoside hydrolase; HirCel7A, Heterobasidion irregulare Cel7A; HjeCel7A, Hypocrea jecorina Cel7A; MD, molecular dynamics; PASC, phosphoric acid-swollen cellulose; PchCel7D, Phanerochaete chrysosporium Cel7D; PDB, Protein Data Bank; PEG, poly(ethylene glycol); pNP-G2, p-nitrophenyl β-cellobioside; pNP-Lac, p-nitrophenyl β-lactoside; RemCel7A, Rasamsonia emersonii Cel7A; RMSF, root-mean-square fluctuation; XGO, xyloglucan oligosaccharide.

Three-dimensional structures of the catalytic domains of nine GH7 CBHs have been published previously, the first being HjeCel7A and the most recent being Aspergillus fumigatus Cel7A [19–27]. The structures share a common β-jelly roll fold with a curved β-sandwich constructed from two largely antiparallel β-sheets packing face-to-face, forming an approximately 50-Å-long substrate-binding groove along the GH7 catalytic module. A key structural feature of GH7 CBHs is that long loops extend the edges of the β-sandwich and effectively enclose the active site in a tunnel. This enables the enzymes to act processively along a cellulose chain and cleave off numerous cellobiose units before detachment from the substrate, which is believed to be key to their efficiency on highly crystalline cellulose [18]. GH7 CBHs act preferentially from the reducing towards the nonreducing end of cellulose chains, in contrast with GH6 CBHs, which work in the opposite direction.

The active site harbours 11 glucosyl-binding subsites, numbered −7 to +4 from the nonreducing end of the cellulose chain, and cleavage occurs between subsites −1 and +1 [28]. Sequence identities are high (> 50%) among the CBHs of family 7, and the cellulose-binding active site is highly conserved, including four Trp residues that serve as sugar-binding platforms at subsites −7, −4, −2, and +1. Differences that may relate to function occur primarily in the length and sequence of loop regions, varying the accessibility of the active site and the dynamics of loop movements; in turn, these variations influence key enzyme properties, such as processivity, product inhibition, endo-initiation propensity, and the rate of release of non-productively bound enzyme [23,29]. Computational investigation offers complementary insights to those from X-ray crystallography regarding protein dynamics and the thermodynamics of protein–ligand interactions [30–33]. Indeed, molecular dynamics (MD) simulations revealed significant differences in loop dynamics between HjeCel7A and Phanerochaete chrysosporium Cel7D (PchCel7D), which have the most closed and most open tunnels among known GH7 CBHs, respectively. Heterobasidion irregulare Cel7A (HirCel7A) showed intermediate properties [23]. Furthermore, quantum mechanics/molecular mechanics MD simulations of hydrolysis and cellulose chain threading in the tunnel of HjeCel7A have enabled the calculation of the free energy profile along the whole reaction coordinate of the hydrolytic–processive cycle [33].
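The subsite numbering described above can be made concrete with a short sketch. The Python below is purely illustrative: the names `SUBSITES`, `TRP_PLATFORMS`, `label`, and `cleavage_bond` are our own and do not come from any published tool. It assumes only what the text states, namely 11 subsites running from −7 to +4 with no subsite 0, Trp platforms at −7, −4, −2, and +1, and cleavage between −1 and +1.

```python
# Glucosyl-binding subsites of a GH7 CBH, numbered from the nonreducing end.
# There is no subsite 0: the scissile bond lies between -1 and +1.
SUBSITES = [n for n in range(-7, 5) if n != 0]

# Subsites with a Trp sugar-binding platform, as listed in the text.
TRP_PLATFORMS = {-7, -4, -2, 1}


def label(subsite: int) -> str:
    """Write a subsite index with an explicit sign, e.g. '-7' or '+4'."""
    return f"{subsite:+d}"


def cleavage_bond() -> tuple:
    """Return the two adjacent subsites that flank the scissile bond."""
    i = SUBSITES.index(-1)
    return SUBSITES[i], SUBSITES[i + 1]


def describe() -> str:
    """One line per subsite, flagging the Trp platforms."""
    lines = []
    for s in SUBSITES:
        note = "Trp platform" if s in TRP_PLATFORMS else ""
        lines.append(f"{label(s):>3}  {note}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe())
    print("cleavage between", " and ".join(label(s) for s in cleavage_bond()))
```

Counting the negative subsites (seven) and the positive ones (four) recovers the 11 subsites quoted in the text.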

In this study, we report the identification, sequencing, biochemical characterization, and crystallization of GcaCel7A, including five crystal structures of the enzyme, both in its apo-form and in complex with oligosaccharides. With the new structures, we also conducted MD simulations to examine loop dynamics and protein–substrate interactions with both soluble and insoluble substrates for comparison with GH7 homologues (HjeCel7A, HirCel7A, and PchCel7D). Overall, our results highlight molecular-level features that are important for understanding this biologically and industrially relevant family of GHs.

Results

Isolation and identification of GcaCel7A

From a culture of G. candidum 3C grown on filter paper as the sole carbon source, the major cellulase enzyme was purified to homogeneity with cellulose affinity, ion exchange, and hydrophobic interaction chromatography. The yield was 32 mg of purified protein per litre of culture, with a specific activity against crystalline cellulose of 2.15 × 10⁻³ U/mg. SDS/PAGE analysis confirmed that the enzyme was the most abundant protein in the culture filtrate. Figure 1 shows the analysis of the purified protein after papain treatment to remove the linker-CBM and isolate the catalytic domain (GcaCel7A_CD) for protein crystallization. The proteolytic cleavage was not complete. Both the full-length enzyme and the catalytic domain were present, at 75 kDa and 46 kDa, respectively, but as seen from the gel, they were well separated from each other. Trypsin digestion and peptide mapping by MALDI-TOF mass spectrometry (data not shown) identified the enzyme as a member of GH family 7.

Sequencing of the GcaCel7A-encoding gene

At the time of initial crystallization of GcaCel7A_CD, neither the gene nor the protein had been sequenced. Nevertheless, it was possible to solve the structure with PchCel7D as a search model to build an initial structure model of GcaCel7A_CD. From this model, partial amino acid sequence information could be derived for the search for homologous genes in the NCBI and MycoCosm databases. Alignment of the nucleotide sequences of three close homologues (from Thermoascus aurantiacus, Neosartorya fischeri, and Botryosphaeria dothidea) revealed two conserved regions, near the beginning and the end of the genes, suitable for primer design. PCR against genomic DNA extracted from G. candidum 3C yielded a PCR product of ~1200 bp that was purified and sequenced. Subsequently, genome sequence data of G. candidum 3C became available [16], and the full-length gene could be retrieved. The GcaCel7A gene consists of 1671 bp, including one intron of 66 bp. The nucleotide sequence of the protein-encoding region, corresponding to mRNA, has been deposited in GenBank (accession number KJ958925), and was confirmed by sequencing of PCR-amplified cDNA.

The encoded GcaCel7A preprotein consists of 535 amino acids, divided into a 17-residue signal peptide followed by a GH7 catalytic module of ~436 residues, a Ser/Thr-rich linker region of ~47 residues, and finally a C-terminal CBM1 of ~35 residues (Fig. 2). Three putative N-glycosylation sites and 56 potential O-glycosylation sites were predicted in GcaCel7A_CD by the NetNGlyc 1.0 (http://www.cbs.dtu.dk/services/NetNGlyc/) and NetOGlyc 4.0 (http://www.cbs.dtu.dk/services/NetOGlyc/ [35]) servers, respectively. The crystal structures show attached N-glycans at two of the predicted N-glycosylation sites (Asn57 and Asn206) and also at Asn432 near the C terminus of GcaCel7A_CD, whereas no attached sugar is visible at the Asn98 site in any of the structures. Furthermore, O-glycosylation at Ser196 is indicated by electron density for an α-linked sugar, presumably mannose, in two of the structures. No glycosylation sites on the catalytic domain are conserved throughout GH7 CBHs (Fig. 3).
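The gene and preprotein lengths quoted above are mutually consistent, which a few lines of arithmetic make explicit. The sketch below is illustrative only: the function names `codons` and `mature_numbering` are ours, the module lengths are the approximate values given in the text, and we assume the mature-protein numbering used in Fig. 2 (signal peptide numbered negatively, no residue 0). Whether the stated 1671 bp includes a stop codon is not specified in the text; the arithmetic matches 535 residues without one.

```python
GENE_LENGTH_BP = 1671     # GcaCel7A gene, including the intron (from the text)
INTRON_LENGTH_BP = 66     # single intron (from the text)
PREPROTEIN_LENGTH = 535   # residues (from the text)

# Module lengths in residues; the text gives the last three as approximate.
MODULES = [
    ("signal peptide", 17),
    ("catalytic module", 436),
    ("linker", 47),
    ("CBM1", 35),
]


def codons(gene_bp: int, intron_bp: int) -> int:
    """Number of whole codons left after removing the intron."""
    exon_bp = gene_bp - intron_bp
    if exon_bp % 3 != 0:
        raise ValueError("exon length is not a multiple of three")
    return exon_bp // 3


def mature_numbering(modules):
    """Map each module to (first, last) residue in mature-protein numbering.

    The signal peptide gets negative numbers and the mature protein
    starts at residue 1, so there is no residue 0.
    """
    (signal_name, signal_len), rest = modules[0], modules[1:]
    ranges = {signal_name: (-signal_len, -1)}
    first = 1
    for name, length in rest:
        ranges[name] = (first, first + length - 1)
        first += length
    return ranges


if __name__ == "__main__":
    print("codons:", codons(GENE_LENGTH_BP, INTRON_LENGTH_BP))
    print("sum of modules:", sum(length for _, length in MODULES))
    for name, (first, last) in mature_numbering(MODULES).items():
        print(f"{name}: {first}..{last}")
```

With these inputs the CBM1 range comes out as 484..518, matching the label in Fig. 2.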

Temperature and pH dependence of GcaCel7A activity and stability

Figure 4 shows the activity of GcaCel7A on Avicel cellulose at different pH values and temperatures, and the residual activity at 37 °C and pH 5.0, after preincubation at the indicated temperatures (1 h, pH 5.0) and pH values (24 h, 37 °C). The enzyme showed a broad pH optimum, with the highest activity at pH 5.0 and > 85% of the maximum activity from pH 4.0 to 6.5. There was no loss of activity from pH 4.0 to pH 7.0, and > 80% activity was retained after preincubation at pH 3.5 and pH 7.5 (Fig. 4A). The temperature optimum was at 50 °C (Fig. 4B). Loss of activity after preincubation was detected from 50 °C and upwards. After 1 h at 60 °C, the residual activity had decreased to 45%.

GcaCel7A activity on polysaccharides

Specific activities of GcaCel7A against selected polysaccharide substrates are shown in Table 1. On the insoluble cellulose substrates, the enzyme showed the highest activity on amorphous cellulose [phosphoric acid swollen cellulose (PASC)], followed by bacterial cellulose (BC) and microcrystalline cellulose (Avicel). The low activity on carboxymethyl cellulose (CMC), although it is soluble, is in accordance with poor accessibility of the restricted substrate-binding tunnel of CBHs to substituted cellulose. Significant activity was also detected on lichenan (β-1,3-1,4-glucan) but not on beechwood xylan (data not shown). The isolated GcaCel7A_CD hydrolysed CMC at the same rate as full-length GcaCel7A, whereas activity on Avicel was reduced by half by cleaving off the CBM (1.02 × 10⁻³ U·mg⁻¹). The latter finding is consistent with previous reports indicating that the CBM enhances hydrolysis of crystalline cellulose under relatively low solid loading conditions [37]. Activity comparisons of GcaCel7A and HjeCel7A were performed with Avicel as substrate, both with the CBH acting alone and with the addition of a commercial cellulase cocktail (Accellerase) from which HjeCel7A was selectively removed. In both cases, GcaCel7A yielded similar amounts of soluble sugar as HjeCel7A, within the standard error of the experiment (Table 2). Approximately 1.7–1.8-fold higher conversion was achieved with the enzyme cocktail than with either Cel7A acting alone.

Fig. 1. SDS/PAGE analysis of fractions from the final size exclusion chromatography purification step after papain cleavage of GcaCel7A. Lanes 1–3, full-length GcaCel7A fractions. Lanes 4 and 5: GcaCel7A_CD fractions. Molecular masses in kDa are indicated for the reference ladder proteins.
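The 1.7–1.8-fold figure quoted for the cocktail experiment follows directly from the conversion values in Table 2, as the short sketch below shows. The dictionary layout and the function names `fold_increase` and `relative_conversion` are our own choices for illustration; only the four conversion percentages are taken from the table.

```python
# Avicel conversion (%) after 2 h at 40 °C and pH 5.0, copied from Table 2.
CONVERSION_PERCENT = {
    "GcaCel7A": {"alone": 6.9, "with_cocktail": 11.7},
    "HjeCel7A": {"alone": 7.6, "with_cocktail": 13.5},
}


def fold_increase(enzyme: str) -> float:
    """Conversion with the Cel7A-depleted cocktail relative to Cel7A alone."""
    row = CONVERSION_PERCENT[enzyme]
    return row["with_cocktail"] / row["alone"]


def relative_conversion(enzyme: str, reference: str, condition: str) -> float:
    """Conversion by one enzyme relative to another under one condition."""
    return (CONVERSION_PERCENT[enzyme][condition]
            / CONVERSION_PERCENT[reference][condition])


if __name__ == "__main__":
    for name in CONVERSION_PERCENT:
        print(f"{name}: {fold_increase(name):.2f}-fold with cocktail")
    for condition in ("alone", "with_cocktail"):
        ratio = relative_conversion("GcaCel7A", "HjeCel7A", condition)
        print(f"GcaCel7A / HjeCel7A ({condition}): {ratio:.2f}")
```

Note that, according to the Table 2 legend, the cocktail reactions contained half the Cel7A dose of the single-enzyme reactions (25 versus 50 µg·mL⁻¹), so the fold increase is obtained at equal total protein loading rather than equal Cel7A loading.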

Enzyme kinetics and product inhibition

GcaCel7A_CD was used for the determination of enzyme kinetic parameters, kcat and KM, on three chromogenic disaccharide substrates, i.e. 2-chloro-4-nitrophenyl β-cellobioside (CNP-G2), p-nitrophenyl β-cellobioside (pNP-G2), and p-nitrophenyl β-lactoside (pNP-Lac), and for studies of product inhibition by cellobiose. The results are shown in Table 3 along with a comparison of previously published data on pNP-Lac for HjeCel7A and PchCel7D [38]. The kinetic parameters for GcaCel7A_CD on the cellobioside substrates are practically identical. Thus, the 2-chloro substitution on the nitrophenyl group of CNP-G2 is accommodated in subsite +1 without influencing the activity. On pNP-Lac, kcat is almost twice as high, and kcat/KM is higher, than with CNP-G2 and pNP-G2. GcaCel7A_CD shows an approximately two-fold higher kcat on pNP-Lac than HjeCel7A and PchCel7D [38]. At the same time, substrate binding is weaker (higher KM) and catalytic efficiency (kcat/KM) is slightly lower for GcaCel7A_CD than for HjeCel7A, whereas the opposite is true relative to PchCel7D. Inhibition constants indicate approximately two-fold weaker binding of cellobiose to GcaCel7A_CD than to HjeCel7A.

Fig. 2. Schematic representation of the amino acid sequence encoded by the GcaCel7A gene. Predicted N-glycosylation sites, two of which were observed, and the observed O-glycosylation site are indicated by triangles. The picture was generated with the Vector NTI program by using services of the SignalP 4.1 server (http://www.cbs.dtu.dk/services/SignalP/ [34]). [Figure labels: signal peptide (−17…−1); catalytic region (1…436); N-glycosylation sites (57…60, 98…101, 206…209); O-glycosylation site (196); linker; CBM1 (484…518).]

Fig. 3. Structure-based sequence alignment of the GH7 catalytic domains of GcaCel7A (PDB code 5AMP), RemCel7A (PDB code 1Q9H), HjeCel7A (PDB code 1CEL), Trichoderma harzianum Cel7A (ThaCel7A; PDB code 2YOK), and HirCel7A (PDB code 2YG1). Secondary structural elements of the GcaCel7A structure are indicated above the alignment (β-strand arrows and α-helices). Strictly identical residues are marked in white letters on a black background. Regions of conserved, highly similar residues are framed in thin-lined boxes with bold letters. Red frames indicate loop regions of interest, with loop nomenclature underneath. The green triangle indicates the O-glycosylated Ser196, and blue triangles indicate the N-glycosylated Asn residues observed in GcaCel7A structures. The figure was prepared with the ESPript web server with default parameters (http://espript.ibcp.fr [36]).
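The ratios quoted in the kinetics discussion can be recomputed from the Table 3 entries. The sketch below is illustrative: the `Kinetics` record and the function `catalytic_efficiency` are names we introduce here, and the only inputs are the tabulated kcat, KM, and Ki values. It does not assume any particular inhibition model.

```python
from typing import NamedTuple, Optional


class Kinetics(NamedTuple):
    kcat_per_s: float        # turnover number, s^-1
    km_mM: float             # Michaelis constant, mM
    ki_uM: Optional[float]   # cellobiose inhibition constant, µM (if reported)


# Values copied from Table 3 (pH 5.0, 30 °C).
TABLE_3 = {
    ("GcaCel7A_CD", "CNP-G2"): Kinetics(0.11, 0.84, None),
    ("GcaCel7A_CD", "pNP-G2"): Kinetics(0.10, 0.85, None),
    ("GcaCel7A_CD", "pNP-Lac"): Kinetics(0.19, 1.07, 50.0),
    ("HjeCel7A", "pNP-Lac"): Kinetics(0.093, 0.41, 24.0),
    ("PchCel7D", "pNP-Lac"): Kinetics(0.10, 1.3, 180.0),
}


def catalytic_efficiency(k: Kinetics) -> float:
    """kcat/KM in M^-1 s^-1 (KM is converted from mM to M)."""
    return k.kcat_per_s / (k.km_mM * 1e-3)


if __name__ == "__main__":
    for (enzyme, substrate), k in TABLE_3.items():
        print(f"{enzyme} on {substrate}: "
              f"kcat/KM = {catalytic_efficiency(k):.0f} M^-1 s^-1")

    gca = TABLE_3[("GcaCel7A_CD", "pNP-Lac")]
    hje = TABLE_3[("HjeCel7A", "pNP-Lac")]
    print(f"kcat ratio Gca/Hje: {gca.kcat_per_s / hje.kcat_per_s:.2f}")
    print(f"Ki ratio Gca/Hje:   {gca.ki_uM / hje.ki_uM:.2f}")
```

For the three GcaCel7A_CD rows this reproduces the tabulated kcat/KM values (131, 118, and 178 M⁻¹ s⁻¹). Recomputing the two published rows from the rounded table entries gives about 227 and 77 rather than 226 and 76, presumably because the tabulated inputs are themselves rounded.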

GcaCel7A_CD crystal structures

The isolated GcaCel7A_CD was successfully crystallized, and we report five crystal structures herein: two apo structures (referred to as APO1 and APO2), and three ligand complexes, with cellobioside (G2), cellotrioside (G3), and cellotetraoside (G4) bound at the active site. X-ray diffraction data and refinement statistics are summarized in Table 4. The initial structure model of GcaCel7A was solved by molecular replacement with the APO1 dataset and a structure of PchCel7D as the search model [Protein Data Bank (PDB) code 1GPI] [25]. All of the structures were solved in the monoclinic space group P2₁ with one protein chain per asymmetric unit, but two discrete crystal packings differing by the length of the b-axis were represented. To distinguish between those forms, we use the terms ‘long’ and ‘short’ for the b-axis. The APO1 and G4 structures have long b-axes (90.5 Å), whereas the APO2, G2 and G3 structures have short b-axes (81–82 Å). APO1 was refined at 2.12 Å, and the other structures at higher resolution (1.37–1.56 Å). Figure 5 shows the protein backbones of all of the GcaCel7A structures superimposed with HjeCel7A.

In both of the apo structures, all residues of the protein, 1–438, are visible in the electron density and are included in the structure models. The tip of loop B2 is disordered in the G2 and G3 structures, and residues 199–201 have been excluded from the G2 structure model, and residues 196–199 from the G3 structure model, owing to insufficient electron density. In the G4 structure, the density is weak and ambiguous for Ser21, Gly22, and Gly23, indicating multiple conformations. As compared with the other GcaCel7A_CD structures and with Cel7 homologues, the APO1 structure seemed to fit best to the electron density in this region of the G4 structure. Thus, the conformation of these residues was taken from APO1, manually adjusted, refined, and included in the G4 structure model. The terminal residues Gly437 and Thr438 are not visible in the G4 structure, whereas the other structures clearly show Thr438 as the C-terminal residue. In all structures, the N-terminal Gln is cyclized to pyroglutamate, and all 16 Cys residues form disulfide bonds. N-glycosylation is evident at three sites in all five GcaCel7A_CD structures, with GlcNAc attached to Asn57, Asn206, and Asn432. Interestingly, two consecutive GlcNAc residues are visible at Asn206, despite deglycosylation treatment of the protein, which indicates that endo-β-N-acetylglucosaminidase may be unable to cleave the N-glycan at this site. Furthermore, there is convincing density for O-glycosylation at Ser196 in the APO2 and G2 structures, where an α-linked mannose residue was refined. Positive density in the difference map (Fo–Fc) of the G4 structure also supports O-glycosylation at Ser196 but is not clear enough for confident positioning of mannose. No sugar is visible at this position in APO1 or in the G3 structure, where the entire Ser196 residue is missing.

Fig. 4. Effects of pH and temperature on the activity of GcaCel7A on Avicel cellulose. (A) Temperature dependence. (B) pH dependence. Filled circles show relative activities in terms of released reducing sugar during 2 h of incubation of enzyme with 5 mg·mL⁻¹ Avicel at the indicated temperatures (at pH 5.0) and pH values (at 37 °C). Open circles show the residual activity measured at pH 5.0 and 37 °C after preincubation at different temperatures for 1 h at pH 5.0, and after preincubation at different pH values for 24 h at 37 °C. Error bars indicate standard errors of at least three measurements.

Table 1. Specific activity of full-length GcaCel7A on polysaccharides. Released reducing sugar was assayed after incubation of 5 mg·mL⁻¹ substrate with GcaCel7A at pH 5.0 and 37 °C. PASC and BC were incubated for 1 h with 0.075 mg·mL⁻¹ and 0.5 mg·mL⁻¹ enzyme, respectively, and the other substrates were incubated for 20 h with 0.5 mg·mL⁻¹ enzyme.

| Substrate | Avicel | CMC | BC | PASC | Lichenan |
|---|---|---|---|---|---|
| Enzyme activity (U·mg⁻¹ × 10³) a | 2.15 ± 0.13 | 1.86 ± 0.11 | 40.0 ± 2.4 | 275 ± 17 | 1.91 ± 0.11 |

a The specific activity, U·mg⁻¹, is defined as the accumulated amount in µmol of glucose equivalents released per mg of enzyme divided by the incubation time in minutes under the assay conditions.
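The specific activity unit defined in the footnote to Table 1 can be written out as a one-line formula, and inverted to estimate how much sugar each assay released. The sketch below is a back-calculation of our own, not a result reported in the text: the function names and the `ASSAYS` layout are illustrative, the specific activities and assay conditions are taken from Table 1 and its legend, and the amounts are expressed per mL of reaction.

```python
import math


def specific_activity(umol_released: float, enzyme_mg: float, minutes: float) -> float:
    """U/mg: µmol glucose equivalents per mg enzyme per minute of incubation."""
    if enzyme_mg <= 0 or minutes <= 0:
        raise ValueError("enzyme amount and time must be positive")
    return umol_released / (enzyme_mg * minutes)


def released_umol(activity_u_per_mg: float, enzyme_mg: float, minutes: float) -> float:
    """Invert the definition: total µmol glucose equivalents released."""
    return activity_u_per_mg * enzyme_mg * minutes


# substrate: (specific activity in U/mg, enzyme in mg per mL, incubation in min)
ASSAYS = {
    "Avicel": (2.15e-3, 0.5, 20 * 60),
    "CMC": (1.86e-3, 0.5, 20 * 60),
    "BC": (40.0e-3, 0.5, 60),
    "PASC": (275e-3, 0.075, 60),
    "Lichenan": (1.91e-3, 0.5, 20 * 60),
}

if __name__ == "__main__":
    for substrate, (activity, enzyme, minutes) in ASSAYS.items():
        total = released_umol(activity, enzyme, minutes)
        print(f"{substrate}: ~{total:.2f} µmol glucose equivalents per mL")
```

Under these assumptions every assay corresponds to roughly 1.1 to 1.3 µmol of glucose equivalents per mL, which suggests that enzyme dose and incubation time were chosen to give comparable sugar release across substrates; this is our reading of the numbers rather than a statement from the authors.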

As shown in Fig. 6, there is distinct electron density

for cello-oligosaccharides bound in the active site of

the GcaCel7A_CD ligand complexes. The G2 structure

was obtained by cocrystallization with laminaribiose,

and a disaccharide clearly occupies the product sub-

sites +1 and +2. Surprisingly, the density is not com-

patible with the b-1,3 linkage of laminaribiose, but

unambiguously shows that the bound sugar is cel-

lobiose (Fig. 6A). In the G3 complex, two thio-cel-

lotriose molecules are visible and included in the

structure model, in subsites �4/�3/�2 and �1/+1/+2,respectively (Fig. 6B). For the G4 structure, thio-cel-

lobiose was present in the cocrystallization and is visi-

ble at subsites +1 and +2. There is also contiguous

density for glucose residues all the way from subsite

�5 to subsite �1, indicating overlapping binding

modes (Fig. 6C). Consecutive glucose units connect

with proper thio-glycosidic bond geometry, which is

why we chose to refine two overlapping thio-cellote-

traose molecules at partial occupancy, spanning sub-

sites �5/�4/�3/�2 and �4/�3/�2/�1, respectively.

All glucosyl units are well defined except in subsite

�1, where the density is weaker and indicates the

presence of multiple conformations. The predominant

�1 conformer appears to be the a-anomer of a4C1 glucopyranosyl unit at partial occupancy. The

APO1 structure also shows a long stretch of contiguous density in the active site, probably representing bound poly(ethylene glycol) (PEG) and/or glycerol. The APO2 structure was obtained by crystal soaking with xyloglucan oligosaccharide (XGO). Cocrystallization with XGO was also performed in an attempt to trap an ‘open-loop’ structure of the enzyme, which would be required in order to accommodate this branched and bulky sugar. However, no bound sugars were seen in any structures from XGO-exposed crystals.

Table 2. Comparison of hydrolysis of 5 mg·mL⁻¹ Avicel cellulose by GcaCel7A and HjeCel7A, acting alone (50 µg·mL⁻¹) or added to a Cel7A-depleted Hypocrea jecorina enzyme cocktail (25 µg·mL⁻¹ Cel7A + 25 µg·mL⁻¹ HjeCel7A-free Accellerase 1500), for 2 h at 40 °C and pH 5.0. Excess β-glucosidase was subsequently added to convert all soluble sugars to glucose prior to the reducing sugar assay.

| Enzyme | Cel7A alone: [Glc] (mg·mL⁻¹)a | Cel7A alone: Conversion (%) | Cel7A + HjeCel7A-free Accellerase: [Glc] (mg·mL⁻¹)a | Cel7A + HjeCel7A-free Accellerase: Conversion (%) |
| GcaCel7A | 0.39 ± 0.021 | 6.9 | 0.65 ± 0.072 | 11.7 |
| HjeCel7A | 0.42 ± 0.025 | 7.6 | 0.75 ± 0.039 | 13.5 |

a Average and standard deviation of five replicates.

Table 3. Enzyme kinetics parameters for GcaCel7A_CD on chromogenic disaccharide substrates and inhibition constants for cellobiose. Enzyme kinetic parameters were derived by nonlinear regression from experiments performed at pH 5.0 and 30 °C with 0.01–5 mM substrate. Previously published data for HjeCel7A and PchCel7D are provided here for comparison.

| Enzyme | Substrate | kcat (s⁻¹) | KM (mM) | kcat/KM (M⁻¹ s⁻¹) | Ki (µM) |
| GcaCel7A_CD | CNP-G2 | 0.11 ± 0.02 | 0.84 ± 0.12 | 131 | – |
| GcaCel7A_CD | pNP-G2 | 0.10 ± 0.02 | 0.85 ± 0.12 | 118 | – |
| GcaCel7A_CD | pNP-Lac | 0.19 ± 0.06 | 1.07 ± 0.10 | 178 | 50 ± 5 |
| HjeCel7Aa | pNP-Lac | 0.093 | 0.41 | 226 | 24 |
| PchCel7Da | pNP-Lac | 0.10 | 1.3 | 76 | 180 |

a Published data obtained at pH 5.0 and 30 °C [38]. Reported standard errors were 7–15% for KM and 3–6% for kcat.

FEBS Journal (2015) © 2015 FEBS

A. S. Borisova et al. Characterization of G. candidum Cel7A
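The catalytic efficiencies listed in Table 3 follow directly from the tabulated kcat and KM values. The short sketch below recomputes them as a consistency check; the dictionary layout and the function name `catalytic_efficiency` are illustrative choices, not part of the original analysis, and the only step involved is converting KM from mM to M.

```python
# Kinetic parameters as listed in Table 3: (kcat in s^-1, KM in mM).
TABLE3 = {
    ("GcaCel7A_CD", "CNP-G2"): (0.11, 0.84),
    ("GcaCel7A_CD", "pNP-G2"): (0.10, 0.85),
    ("GcaCel7A_CD", "pNP-Lac"): (0.19, 1.07),
    ("HjeCel7A", "pNP-Lac"): (0.093, 0.41),
    ("PchCel7D", "pNP-Lac"): (0.10, 1.3),
}


def catalytic_efficiency(kcat_per_s, km_mM):
    """Return kcat/KM in M^-1 s^-1 (hypothetical helper; converts KM from mM to M)."""
    return kcat_per_s / (km_mM * 1e-3)


# Recompute the kcat/KM column for every enzyme/substrate pair.
efficiencies = {key: catalytic_efficiency(*vals) for key, vals in TABLE3.items()}

for (enzyme, substrate), value in efficiencies.items():
    print(f"{enzyme:12s} {substrate:8s} kcat/KM = {value:6.1f} M^-1 s^-1")
```

The recomputed values agree with the tabulated kcat/KM column to within rounding of the reported parameters.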

Structure of GcaCel7A and comparison with

related enzymes

Overall, the GcaCel7A structures are very similar, as reflected by low pairwise rmsd values (0.16–0.35 Å). Notable differences occur at loop B2, which seems to be flexible and shows elevated temperature factors and local disorder (the G2 and G3 structures), as observed in other Cel7 CBHs [18,23]. In the APO2 structure, loop B2 adopts a new conformation not previously seen in any Cel7 structure. The loop bends further inwards and partially obstructs the cellulose-binding path through the tunnel. As compared with the APO1 structure, Ala199 CB at the tip of loop B2 has moved 10.7 Å into the tunnel, causing it to clash with a glucosyl unit binding at subsite −4 (Fig. 7A). This B2 loop conformation correlates with the short unit cell b-axis and may be a crystallization artefact caused by tight crystal packing in this region. The crystal contacts also involve the O-glycosylation at Ser196, which explains why the attached mannose residue is most clearly revealed in the APO2 and G2 structures. A more relaxed B2 loop conformation, similar to that of HjeCel7A, is seen in the two structures with long b-axes, i.e. APO1 and G4 (Figs 5 and 7).

Table 4. Diffraction data and refinement statistics for the five determined GcaCel7A_CD structures. The space group is P21 in all cases, with one protein chain per asymmetric unit.

| Structure | Apo structure (APO1) | Apo structure (APO2) | Cellobioside complex (G2) | Cellotrioside complex (G3) | Cellotetraoside complex (G4) |
| PDB code | 5AMP | 4ZZV | 4ZZW | 4ZZT | 4ZZU |
| b-axis | Long | Short | Short | Short | Long |
| Unit cell, a, b, c (Å); β-angle (°) | 42.88, 90.56, 55.27; 109.86 | 42.42, 80.73, 54.91; 109.54 | 42.64, 81.82, 55.03; 109.63 | 42.47, 80.76, 55.06; 109.6 | 42.71, 90.52, 55.09; 109.62 |
| Beamline | I911-2, MAX-lab | ID23-1, ESRF | I911-3, MAX-lab | ID23-1, ESRF | I911-3, MAX-lab |
| Wavelength (Å) | 1.0409 | 0.9918 | 1.0001 | 0.9918 | 0.9000 |
| No. of observations | 84 009 | 258 951 | 140 868 | 187 944 | 217 429 |
| No. of unique reflections | 21 970 | 68 772 | 54 088 | 48 793 | 70 198 |
| Resolution range (Å) | 30.12–2.12 (2.17–2.12)a | 43.57–1.37 (1.39–1.37) | 24.9–1.50 (1.53–1.50) | 51.87–1.56 (1.59–1.56) | 27.62–1.44 (1.47–1.44) |
| Multiplicity | 3.7 (2.6) | 3.8 (3.8) | 2.6 (2.6) | 3.9 (4.0) | 3.1 (2.5) |
| Completeness (%) | 97.9 (75.1) | 93.7 (92.4) | 97.2 (95.7) | 97.8 (97.8) | 99.4 (97.9) |
| Rmerge (%)b | 5.9 (16.9) | 7.8 (37.6) | 5.7 (36.8) | 6.3 (22.6) | 5.1 (22.4) |
| Mean [I/SD(I)] | 15.3 (6.9) | 7.3 (2.6) | 7.1 (1.8) | 10.5 (4.0) | 13.7 (4.4) |
| CC1/2 | 99.6 (92.4) | 99.5 (78.3) | 99.2 (73.7) | 97.7 (94.9) | 99.7 (89.7) |
| Rwork/Rfree (%) | 15.7/19.5 | 18.0/20.3 | 17.2/19.5 | 16.9/19.5 | 16.3/17.8 |
| RMSD, bond lengths (Å) | 0.0041 | 0.0045 | 0.0046 | 0.0047 | 0.0043 |
| RMSD, bond angles (°) | 0.97 | 1.06 | 1.09 | 1.14 | 1.12 |
| Protein atoms: no., average B-factor (Å²) | 3889, 14.0 | 4011, 14.0 | 3981, 13.0 | 3963, 14.0 | 4252, 11.0 |
| Water molecules: no., average B-factor (Å²) | 467, 21.96 | 530, 23.94 | 369, 24.0 | 394, 25.64 | 609, 21.20 |
| Sugar ligands: no. of residues, average B-factor (Å²) | 0 | 0 | 2, 21.22 | 6, 29.93 | 6, 15.56 |
| Glycosylation: no. of residues, average B-factor (Å²) | 4, 29.45 | 5, 22.73 | 5, 25.40 | 5, 26.54 | 4, 15.33 |
| Other heteroatoms | Glycerol ×2 | Mg2+ | Glycerol ×3; PEG ×2; Mg2+ ×2 | Mg2+ | Glycerol |
| Ramachandran plot outliersc, no. of residues | 0 | 1 | 1 | 2 | 1 |

a Data within parentheses are for the outermost resolution shell.
b $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} \left| I_i(hkl) - \langle I(hkl) \rangle \right| \,/\, \sum_{hkl}\sum_{i} I_i(hkl)$.
c Calculated by use of a strict boundary Ramachandran plot [39].
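The pairwise rmsd comparison described above can be sketched as follows. This is a minimal illustration, assuming that equivalent atoms have already been paired and superposed (the superposition step itself is not shown); the coordinates are hypothetical toy values and the helper names `rmsd` and `pairwise_rmsd` are not taken from the original work.

```python
import math
from itertools import combinations


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, assumed already superposed."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def pairwise_rmsd(structures):
    """Map each unordered pair of structure names to the RMSD between them."""
    return {
        (m, n): rmsd(structures[m], structures[n])
        for m, n in combinations(sorted(structures), 2)
    }


# Hypothetical toy coordinates (three atoms per structure), not real model data.
toy = {
    "APO1": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)],
    "APO2": [(0.0, 0.3, 0.0), (1.5, 0.3, 0.0), (3.0, 0.3, 0.0)],
    "G4": [(0.0, 0.0, 0.4), (1.5, 0.0, 0.4), (3.0, 0.0, 0.4)],
}
table = pairwise_rmsd(toy)
for pair, value in table.items():
    print(pair, round(value, 2))
```

Applied to real structures, the inputs would be matched atom coordinates (for example Cα atoms) extracted from each superposed model.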

The fold of GcaCel7A is very similar to those in other Cel7 CBH structures (rmsd of < 0.8 Å), as expected from high sequence identities [Rasamsonia emersonii Cel7A (RemCel7A), 71%; HjeCel7A, 64%; Trichoderma harzianum Cel7A, 64%]. rmsd values and sequence identities for further homologues are provided in Table S1. The cellulose-binding active site is highly conserved, including the four Trp platforms at subsites −7, −4, −2 and +1 (Trp40, Trp38, Trp371, and Trp380), and the catalytic triad Glu212 (nucleophile), Asp214, and Glu217 (acid/base), having the same residue numbers as in HjeCel7A. In GcaCel7A, Leu213 replaces the Met that is conserved in the nearest Cel7 homologues (Fig. 3).

At the B2/A3/B3 loop contact region, two Tyr resi-

dues at the tip of loop A3 play an important role in

tunnel-enclosing contacts with loops B2 and B3

across the active site (Tyr374 and Tyr375 in

GcaCel7A; Tyr370 and Tyr371 in HjeCel7A). In

GcaCel7A, the side chain of Tyr375 is flipped relative

to the predominant position of HjeCel7A Tyr371

above subsite −1, towards the +1 and +2 product

sites in a similar position as in the HjeCel7A struc-

tures 1CEL and 4P1J. There is also a shift outwards

of loop B3 and Tyr247 at its tip (Fig. 7). Loops A3

and B3 are practically identical in GcaCel7A and

HjeCel7A, whereas differences in loop B2 exist. At

the tip of loop B2, Asn198 in HjeCel7A, which interacts with the −3 glucosyl unit, is replaced by the

smaller Ser198 in GcaCel7A, leading to the loss of

one direct protein–sugar interaction. Further size

reduction, of Ser195 and Thr201 in HjeCel7A to

Ala195 and Ser201 in GcaCel7A, reduces the contact

surface within the loop, with substrate, and with

the rest of the protein, suggesting that loop B2 would

be more flexible in GcaCel7A than in HjeCel7A.

The loop dynamics may also be influenced by the

O-glycosylation at Ser196 in GcaCel7A. To our

knowledge, there are no reports of O-glycosylation at

this position on any other Cel7 enzyme.

Superposition of the GcaCel7A ligand complexes

with the HjeCel7A Michaelis complex (4C4C; [40]),

shows very similar sugar binding at subsite −2, with a gradual increase in deviation towards the entrance of the tunnel (Fig. 8). At subsite −1, two binding modes are observed. In the G4 complex, the −1 glucosyl is connected to the −2 unit, and it binds in a similar position as in HjeCel7A. However, the −1 glucosyl of GcaCel7A is in the form of the α-anomer and adopts a ⁴C₁ chair conformation. In the G3 structure, on the other hand, the −1 residue is connected to the +1 glucosyl unit and is located further away from the catalytic residues, illustrating that the −1 subsite is rather

spacious. At the product sites +1 and +2, the so-called

‘primed’ binding mode is observed in all of the

GcaCel7A complexes, in contrast with the ‘unprimed’

position in the HjeCel7A Michaelis complex. After

cleavage of the cellulose chain, the cellobiose product

can move away from the catalytic centre by pivoting

around OH6 of the +2 glucosyl unit [40,41]. In

GcaCel7A, the ‘primed’ mode is stabilized by Asp343

in loop B4, at hydrogen bonding distance to OH1 of

the reducing-end +2 glucose unit. Most GH7 CBHs

have an Asp at this position, but in H. jecorina

and Trichoderma species, it is missing owing to a one-

residue deletion in loop B4 [41].

MD

To investigate the significance of structural differences,

we conducted MD simulations of the GcaCel7A and

HjeCel7A catalytic domains in solution without a

bound ligand, in solution bound to a cellononaose

ligand, and complexed with the surface of a cellulose

microfibril (Fig. 9). Simulations of the enzymes in each

representative state enable us to directly examine the

effects of substrate on protein dynamics and to com-

pare relative endo-initiation abilities.

The relative degree of protein flexibility in response

to the environment over the course of MD simulation

provides a means to estimate similarity in dynamic

behaviour. Root-mean-square fluctuation (RMSF) of

Fig. 5. Overall structural alignment of the five GcaCel7A structures

with the HjeCel7A cellononaose complex (PDB code 4C4C [40]).

HjeCel7A protein and cellononaose ligand are coloured in light grey

with labelled loops in red. The GcaCel7A structures are coloured as

follows: APO1, orange; APO2, pale green; G2, yellow; G3, pink;

G4, cyan. Ligands in the GcaCel7A structures are not shown.


the protein backbone was calculated to evaluate the

flexibility of these two GH7s under the three different

substrate scenarios (Fig. 10A,C). The RMSF of

GcaCel7A and HjeCel7A without a ligand is visually

illustrated on the protein backbones in Fig. 10(B) and

Fig. 10(D), respectively.
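As an illustration of the quantity plotted in Fig. 10, the following sketch computes a per-atom RMSF from a set of trajectory frames. It assumes that all frames have already been aligned to a common reference and uses a hypothetical two-atom toy trajectory; it is not the analysis code used for the simulations reported here.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for a trajectory given as a list of frames.

    Each frame is a list of (x, y, z) tuples, one per atom. Frames are
    assumed to be pre-aligned to a common reference (an assumption of
    this sketch).
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared deviation from the average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Hypothetical toy trajectory: atom 0 is rigid, atom 1 oscillates along y.
toy_traj = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, -1.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
]
values = rmsf(toy_traj)
print([round(v, 3) for v in values])
```

In this toy case the rigid atom has an RMSF of zero and the oscillating atom a nonzero value, mirroring the contrast between the β-sandwich core and the loop regions.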

Fig. 6. Electron density for the ligands in the GcaCel7A G2, G3, and G4 structures. (A) The G2 structure shows cellobiose bound in the product-binding subsites +1 and +2. (B) The G3 complex shows two molecules of thio-cellotriose, at subsites −4/−3/−2 and −1/+1/+2, respectively. (C) In the G4 complex, overlapping thio-cellotetraose molecules at partial occupancy were refined at subsites −5 to −2 and −4 to −1, respectively, and thio-cellobiose at subsites +1 and +2. At subsite −1, the density indicates the presence of multiple binding modes. All glucosyl units adopt the ⁴C₁ chair conformation. Sigma-averaged 2Fo−Fc electron density maps are contoured at 0.53 e/Å³ in (A) and at 0.26 e/Å³ in (B) and (C).

Fig. 7. Tunnel-enclosing loop contacts. (A) In the GcaCel7A APO2 structure (green), loop B2 bends into the tunnel, causing Ala199 at the tip

of the loop to clash with a glucosyl binding at subsite −4. (B) The B2/A3/B3 loop contacts in the GcaCel7A APO1, G3, and G4 structures.

(C) B2/A3/B3 loop contacts in HjeCel7A (G9 complex; PDB code 4C4C [40]). Two Tyr residues at the tip of loop A3 play an important role in

tunnel-enclosing contacts with loops B2 and B3 across the active site. In GcaCel7A, the Tyr375 side chain is flipped towards the +1 and +2

product sites, relative to the position of Tyr371 in HjeCel7A above subsite –1, accompanied by a slight shift outwards of loop B3 and Tyr247

at its tip. The colouring scheme is the same as in Fig. 5: HjeCel7A protein and cellononaose ligand, light grey with loops in red; GcaCel7A

APO1, orange; APO2, green; G2, yellow; G3, magenta; G4, cyan.


As one might expect, the corresponding loop regions

in GcaCel7A and HjeCel7A show higher fluctuations

than the β-sandwich core of the proteins (Fig. 10A,C).

For HjeCel7A, the overall dynamics are dampened

with either bound cellulosic substrate as compared

with the ligand-free enzyme (Fig. 10C). The most sta-

bilizing effect is seen with the microfibril-bound

enzyme, in particular in the regions near the entrance

to the tunnel, including loops A1 and B1, which are in

direct contact with the microfibril surface. GcaCel7A

behaves differently, in the sense that there is little over-

all difference in fluctuations between the three scenar-

ios except for loops A2 and A4, which are more

flexible in the ligand-free enzyme than with substrate

bound in the active site (Fig. 10A). Surprisingly, there

is no significant stabilization of the microfibril-bound

GcaCel7A as compared with the cellononaose-bound

enzyme. One could reasonably expect that the presence

of a solid, crystalline substrate underneath the active

site loops would contribute additional stability, at the

very least on a localized basis. However, MD simula-

tions clearly indicate that this is not the case for

GcaCel7A; the RMSFs of the two substrate-bound

catalytic domains are almost indistinguishable

(Fig. 10A).

HjeCel7A shows a greater degree of flexibility in the

A4 and B4 loop regions at the product side of the

active site tunnel, as illustrated by multiple conforma-

tions and red regions in the rightmost loops in the

RMSF-coloured snapshot (Fig. 10D). The higher

stability of loop B4 in GcaCel7A (Fig. 10A) is attribu-

table to the insertion of Asp343, which forms a salt

bridge with Arg267. Also, an additional hydrogen

bond is provided by Lys338 in GcaCel7A (replacing

Glu335 in HjeCel7A). Both interactions will anchor

the loop more firmly to the rest of the protein. The

difference in behaviour of loop A4 between the two

enzymes is more difficult to rationalize. In GcaCel7A,

loop A4 is clearly stabilized by substrate binding

(Fig. 10A), as expected from the increase in local van

der Waals and hydrogen bond interaction opportuni-

ties when the product sites are occupied. However,

similar stabilization is not seen in HjeCel7A. The

sequences differ at 5 of 10 positions in loop A4, but

no obvious determinants of flexibility/stability can be

identified.

In a previous study, we investigated the structure

and dynamics of Cel7A from the tree pathogen Heter-

obasidion irregulare (HirCel7A [23]). Our investigation

revealed that the HirCel7A active site shows an ‘inter-

mediate’ degree of openness and flexibility relative to

PchCel7D and HjeCel7A (Fig. 11). We further suggest

that this degree of closure, resulting from active site

loop formation, impacts on processive ability, degree

Fig. 8. Oligosaccharide binding in the GcaCel7A complex structures in comparison with the HjeCel7A cellononaose complex (PDB code 4C4C; white). In (B), the view is rotated ~90° around the x-axis. The GcaCel7A G2 structure (yellow) contains cellobiose at subsites +1 and +2, and the G3 complex (magenta) contains two 4,4′-dithiocellotriose molecules at subsites −4/−3/−2 and −1/+1/+2, respectively. In the G4 structure (blue), 4-thiocellobiose was refined at subsites +1 and +2, and two overlapping molecules of 4,4′,4″-trithiocellotetraose in subsites −5/−4/−3/−2 and −4/−3/−2/−1 at partial occupancy. Oxygen atoms are red, and sulfur atoms from thio-linkages are in yellow. Numbers indicate the glucosyl-binding subsites.


of endo-initiation, and substrate dissociation. Compar-

ison of the flexibility of GcaCel7A with that of

HjeCel7A, HirCel7A, and PchCel7D indicates that

GcaCel7A is even more rigid than HjeCel7A. Nearly

all of the GcaCel7A primary active site loops fluctuate

less than those of the other three GH7s, which may

translate to higher ligand-binding free energy and a

higher degree of processive ability [42,43].

MD simulations also suggest that GcaCel7A and

HjeCel7A show a similarly low degree of endo-initi-

ated processive action in comparison with PchCel7D

and HirCel7A. In the literature, loop B3 has been ter-

med the exo-loop, as it is beneficial in facilitating pro-

cessive crystalline substrate degradation [38]. The

ability to conduct endo-initiated attack of crystalline

substrates is thought to be related to both the flexibil-

ity and the length of this loop, along with that of the

nearby loop B2. Both of these loops must open suffi-

ciently to allow the entry into the active site of an

internal part of a cellulose chain. PchCel7D shows the

shortest B2 and B3 loops of all four GH7s discussed

here and is known to conduct endo-initiated attack

Fig. 9. MD simulations of the behaviour of the GcaCel7A and HjeCel7A catalytic domains in solution without a ligand, bound to a

cellononaose ligand in solution, and bound to a portion of a cellulose Iβ microfibril.

Fig. 10. Comparison of RMSFs of GcaCel7A and HjeCel7A. (A) The RMSF of each GcaCel7A residue over a 100-ns MD simulation, in

solution without a ligand, bound to a cellononaose ligand, and catalytically engaged with a cellulose Iβ microfibril. Key active site loops are

labelled on the plot. (B) Twenty aligned snapshots of GcaCel7A from the ‘no ligand’ MD simulation are shown coloured by RMSF. Red

indicates regions of higher fluctuation, and blue indicates regions of lower fluctuation. White regions are intermediate. (C) The RMSF of

each HjeCel7A residue over the three 100-ns MD simulations. (D) Twenty aligned snapshots of the HjeCel7A ‘no ligand’ simulation shown

coloured by RMSF. The colour scales of (B) and (D) are identical from 0 to 3.3, the maximum value of the RMSF from the GcaCel7A ‘no

ligand’ simulation.


more frequently than HjeCel7A [44]. Using this infor-

mation, we extrapolated substrate initiation modes for

GcaCel7A and the previously described HirCel7A [23],

based on evidence from MD simulation. We calculated

the minimum distances of nearby active site loops to

characterize the openness and movement of these

loops. The minimum distances were binned into his-

tograms to determine the probability that the loops

would be in a given position relative to each other.

Examination of the minimum distance between loops B2 and A3 indicates that in GH7s performing primarily endo-initiated attack, here HirCel7A and PchCel7D [23], these loops adopt a broad range of conformations relative to each other (Fig. 12). Conversely, loops B2 and A3 of exo-initiating HjeCel7A seldom stray from each other, maintaining a general minimum distance of ~3.5 Å (Fig. 12). On the basis of these data, we

suggest that GcaCel7A also conducts exo-initiated

attack on crystalline cellulose substrates. As before,

this observed behaviour is the same regardless of the

bound cellulosic substrate. The minimum distances

between loops A1 and B2 (Fig. S3A) and between

loops B2 and B3 (Fig. S3B) in GcaCel7A and

HjeCel7A are greater than that between loops B2 and

A3, but demonstrate the same relative behaviour.
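The loop-distance analysis described above (minimum inter-loop distance per frame, binned into a probability histogram) can be sketched as follows. The frame data are hypothetical toy coordinates, the bin width of 1 Å is an arbitrary choice for illustration, and the function names are not taken from the original analysis.

```python
import math


def min_distance(group_a, group_b):
    """Smallest Euclidean distance between any atom in group_a and any atom in group_b."""
    return min(math.dist(p, q) for p in group_a for q in group_b)


def histogram(values, bin_width, lower=0.0):
    """Return {bin_start: probability} for the given values."""
    counts = {}
    for v in values:
        idx = int((v - lower) // bin_width)
        counts[idx] = counts.get(idx, 0) + 1
    total = len(values)
    return {lower + idx * bin_width: n / total for idx, n in sorted(counts.items())}


# Hypothetical frames: (atoms of one loop, atoms of the other loop), in Å.
frames = [
    ([(0, 0, 0), (1, 0, 0)], [(4.5, 0, 0), (6, 0, 0)]),
    ([(0, 0, 0), (1, 0, 0)], [(4.6, 0, 0), (6, 0, 0)]),
    ([(0, 0, 0), (1, 0, 0)], [(7.2, 0, 0), (9, 0, 0)]),
    ([(0, 0, 0), (1, 0, 0)], [(4.7, 0, 0), (6, 0, 0)]),
]
distances = [min_distance(a, b) for a, b in frames]
hist = histogram(distances, bin_width=1.0)
print(hist)
```

A narrow histogram centred on a short distance corresponds to loops that stay in contact, whereas a broad distribution indicates frequent opening.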

Discussion

We have found that GcaCel7A is biochemically and

structurally similar to the well-characterized HjeCel7A

and shows comparable activity on crystalline cellulose.

Slight differences between the two do exist, however.

In terms of modularity, the linker peptide connecting

the catalytic domain with the CBM in GcaCel7A is

significantly longer than that in HjeCel7A. In fact, the

47-residue linker of GcaCel7A is longer than nearly all

other fungal GH7 linker peptides. In 2012, Sammond

et al. reported the analyses of available sequence data

for linkers in bimodular GH6 and GH7 CBHs and

found an average length of 30 residues for GH7-

CBM1 linkers (42 residues for GH6-CBM1 linkers)

[45]. The GcaCel7A linker contains 18 Thr and 13 Ser

residues, which together make up roughly two-thirds (31 of 47) of the amino acids in the linker. The Ser and

Thr residues are distributed along the entire length of

the linker region, suggesting that the whole peptide

may be O-glycosylated and able to extend the physical

distance between the CBM1 and the catalytic module

by ~ 50% as compared with the average GH7-CBM1

linker [46]. The GcaCel7A linker also shows an unusu-

ally large number of positively charged residues, i.e.

Fig. 11. Snapshots of GcaCel7A, HjeCel7A, HirCel7A, and

PchCel7D from MD simulations of the four enzymes bound to

cellononaose ligands. The aligned snapshots, 20 in each figure, are

shown coloured by RMSF calculated from the simulation. Red

regions indicate a high degree of fluctuation, and blue regions

indicate relatively little fluctuation. The colour scale in this figure is

from 0 to 6, where 6 is the highest RMSF observed for any

residue of the four enzymes during simulation [23]. The HirCel7A

and PchCel7D figures are based on data from previously published

simulations described here for comparison with GcaCel7A

behaviour [23].


three His residues and one Lys, and no negatively

charged side chains, indicating a strong positive net

charge of the linker at physiological pH [45]. It is

currently unknown how the longer linker and positive

charge may modulate the enzymatic properties of

GcaCel7A as compared with other GH7 CBHs.

From a functional standpoint, the activity pH pro-

file of GcaCel7A is broader and slightly shifted in the

alkaline direction, with an optimum at pH 5.0, as com-

pared with 4.5 for HjeCel7A [47,48]. At pH 6 and pH

7, GcaCel7A shows ~90% and ~60% activity, compared with ~75% and ~15% for HjeCel7A. As GcaCel7A is a CBH with similar functional capacity to that of HjeCel7A, its inclusion in industrial cellulase cocktails may be advantageous where alkaline pretreatment methods are employed, as it retains activity under a broader range of conditions.

Another significant advantage of employing

GcaCel7A rather than HjeCel7A in industrial applica-

tions is reduced cellobiose product inhibition. Enzyme

kinetic data indicate that cellobiose product inhibition

for GcaCel7A is approximately two-fold weaker than

that of HjeCel7A. This is somewhat surprising, given

the similarity of the product-binding sites. Identical

amino acid side chains overlap closely around the

bound sugar in product subsites +1 and +2 in both

enzymes. The nearest differences are found in loop B4

beyond the reducing end of the bound cellobiose,

where the Asp343 insertion in GcaCel7A offers an

opportunity to form an additional hydrogen bond with

the reducing end hydroxyl of the +2 glucoside unit.

Intuitively, one would thus expect stronger cellobiose

binding in GcaCel7A, which is not the case. Rather,

the Asp343 side chain is not oriented towards and

does not interact directly with the +2 glucosyl unit in

the ligand complex structures. Instead, Asp343 is

rotated towards Arg267, forming a salt bridge. We

note that Arg267, the nearby Arg399 that binds to

OH6 and OH1 of the +2 glucoside unit, and Asp262

are shifted slightly away from the sugar as compared

with HjeCel7A. Although differences are small, the +2 sugar unit appears to fit somewhat more snugly in HjeCel7A than in GcaCel7A; moreover, the HjeCel7A −1 binding site forms several more hydrogen bonds with the substrate than that of GcaCel7A (Fig. S2A).

Together, these differences may account for stronger

cellobiose product binding in HjeCel7A.
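To make the practical meaning of the roughly two-fold difference in Ki concrete, the sketch below compares the residual activity of the two enzymes on pNP-Lac in the presence of cellobiose, using the parameters from Table 3. It assumes a purely competitive inhibition model, which is an assumption of this illustration; the substrate and inhibitor concentrations (1 mM and 100 µM) are arbitrary example values.

```python
def rate_competitive(kcat, km, ki, s, i):
    """Turnover rate (s^-1 per enzyme) under competitive inhibition.

    km and s share one unit (here mM); ki and i share another (here µM).
    """
    return kcat * s / (km * (1.0 + i / ki) + s)


def residual_activity(kcat, km, ki, s, i):
    """Fraction of the uninhibited rate that remains at inhibitor concentration i."""
    return rate_competitive(kcat, km, ki, s, i) / rate_competitive(kcat, km, ki, s, 0.0)


# Table 3 parameters for pNP-Lac: kcat (s^-1), KM (mM), Ki for cellobiose (µM).
GCA = dict(kcat=0.19, km=1.07, ki=50.0)
HJE = dict(kcat=0.093, km=0.41, ki=24.0)

# Example conditions (assumed for illustration): 1 mM substrate, 100 µM cellobiose.
residual_gca = residual_activity(s=1.0, i=100.0, **GCA)
residual_hje = residual_activity(s=1.0, i=100.0, **HJE)
print(f"GcaCel7A_CD: {residual_gca:.2f}, HjeCel7A: {residual_hje:.2f}")
```

Under these assumed conditions GcaCel7A_CD retains a somewhat larger fraction of its uninhibited rate than HjeCel7A; the size of the difference depends on the substrate concentration relative to KM.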

Although the pH profile of GcaCel7A is more forgiv-

ing than that of HjeCel7A, and reduced product inhibi-

tion is a distinct advantage, GcaCel7A appears to be less

thermostable, with a temperature optimum at ~50 °C, which is ~10 °C lower than that for HjeCel7A

(60–65 °C [47]). This is probably a direct result of fewer

possible disulfide bridges in GcaCel7A than in

HjeCel7A; two of the 10 disulfide bridges formed in

HjeCel7A are missing in GcaCel7A. The Cys4 to Cys72

disulfide bridge in HjeCel7A is located at the base of

surface loops near the entrance to the cellulose-binding

tunnel. This particular disulfide bridge is missing in most

other GH7 CBHs, including GcaCel7A and several ther-

mostable homologues, e.g. RemCel7A, Melanocarpus

albomyces Cel7B, Humicola grisea var. thermoidea Cel7A, and A. fumigatus Cel7A [22,24,49,50]. As a result, the GcaCel7A ligand fluctuates slightly more than that of HjeCel7A in the −7 and −6 sites (Fig. S2B). The other

‘missing’ disulfide is highly conserved within GH7s, with

a new exception being GcaCel7A. Instead, we find that

Cys230 and Cys256 of HjeCel7A are replaced by Ala230

and Thr256, respectively. The region between these

positions constitutes a long surface loop, including loop

B3, that lies on the surface of the protein and projects

towards the catalytic centre. The loop showed elevated

fluctuations in the MD simulations (Fig. 10), which are

typically indicative of lower thermal stability. We

hypothesize that the disulfide bridge, when present, plays

an important role in stabilization by anchoring the base

of the loop to the β-sheet framework, which prevents

loop unfolding from propagating into the hydrophobic

core of the protein.

As discussed previously, flexibility of loop B2 is

essential for endolytic initiation of cellulose hydrolysis,

Fig. 12. Histograms of the minimum distance between loops B2

and A3 from 100-ns MD simulations of GcaCel7A, HjeCel7A,

HirCel7A, and PchCel7D. These two loops are thought to be critical

in endo-initiated catalysis. In the case of GcaCel7A and HjeCel7A,

the distances have been measured in the absence of a ligand,

bound to cellononaose, and bound to a cellulose microfibril. The

simulations of HirCel7A and PchCel7D examine the behaviour of

the enzymes bound to cellononaose and have previously been

reported by Momeni et al. [23]. In all cases, the distances have

been measured on the basis of the minimum distance between

the loops.


and is also likely to affect processivity and enzyme dissociation [23]. Loop B2 constitutes a 13–15-residue insertion in CBHs relative to GH7 endoglucanases and folds over the β-sandwich core to define the tunnel around subsite −4. Residues at the tip of the loop

interact with the opposing loop A3 across the active

site, which effectively closes the ‘roof’ of the tunnel.

Thus, loop B2 must open to allow the enzyme to

access an internal unit of a cellulose chain. Several

indications from the crystal structure analysis suggest

that loop B2 is more flexible in GcaCel7A than in

HjeCel7A, such as elevated temperature factors, partial

disorder of the loop in some structures, and fewer con-

tacts with substrate and the opposing loop. However,

in the crystal structures, the loop is affected by crystal

contacts with neighbouring protein molecules; the MD

simulations do not show any increase in RMSFs or

the opening frequency of loop B2, but rather show

very similar dynamics in GcaCel7A and HjeCel7A.

Accordingly, we predict that the probability of endo-

mode initiation for GcaCel7A is low and similar to

what has been determined for HjeCel7A [29].

The GcaCel7A structure represents the first report

of O-glycosylation of a GH7 catalytic domain [18],

where a mannose residue is attached to Ser196 near

the tip of loop B2. We suggest that this reflects the undercharacterization of GH7 glycosylation patterns rather than GcaCel7A being an outlier among family members. To date, analysis of glycosylation has been

reported for only a very limited number of GH7

enzymes, and almost exclusively concerns N-glycosylation

[51–53]. The only GH7 CBH that has been extensively

characterized in terms of glycosylation is HjeCel7A,

where O-glycosylation was found on the CBM and at

numerous sites on the linker region but not in the

catalytic domain [54–58]. In previous GH7 crystal

structures, N-glycosylation has been observed, but no

O-glycosylation has been reported up to now. On the

other hand, carbohydrates attached at the surfaces of

proteins may be highly flexible and are often not

visible in the electron density, even if they are known

to be present. Thus, the question of the extent of

O-glycosylation on GH7 catalytic domains and the

frequency with which it occurs remains open. Ser is

conserved at this position in the enzymes aligned in

Fig. 3, but not across all known GH7 CBH sequences.

However, if the adjacent positions immediately before

and after Ser196 are also taken into consideration, Ser

or Thr seem to be ubiquitously present near the tip of

loop B2. A potential O-glycosylation site is missing

here in only one of 42 sequences covering phylogeneti-

cally diverse species in which GH7 CBHs have been

found, from ascomycete and basidiomycete fungi to

oomycetes, haptophytes, and parabasilids (Fig. S4).

The location of the O-glycan near the tip of loop B2

in GcaCel7A is interesting, as the mannose (or a

longer O-glycan) would significantly increase the sur-

face area of the loop contact across the active site. In

an earlier study, the introduction of an N-glycosylation

site at an adjacent position, Asn194, in Penicillium

funiculosum Cel7A, by the A196S mutation, yielded a

70% increase in cellulase activity [51]. The authors

point out that the site is located at the space between

the catalytic domain and the cellulose and speculate

that introduction of the N-glycan may deter nonspeci-

fic interactions with the substrate surface. Given

the similarities in circumstances, we suggest that the

O-glycan of loop B2 serves a similar functional role.

One could feasibly conceive that the addition of an

O-glycan in the same loop B2 region of HjeCel7A

would serve to further enhance the activity over the

wild-type; however, the requirements for inducing

O-glycosylation by the expression host are not well

understood, making engineering such a construct

difficult.

The implementation of GcaCel7A in whole cellulase

cocktails results in similar synergistic effects as

observed with HjeCel7A. Almost two-fold higher con-

version of Avicel cellulose was achieved when half of

the enzyme was replaced with a Cel7A-depleted

H. jecorina enzyme cocktail (Table 2). The overall

yields were slightly lower for GcaCel7A, although this

can potentially be attributed to coevolved synergistic

effects. In other words, GcaCel7A has coevolved with

the other enzymes in the G. candidum secretome and

may be fine-tuned for optimal synergism in that con-

text rather than in the context of the specific composi-

tion and properties of the H. jecorina cocktail. In fact,

earlier studies showed that naturally secreted cocktails

from G. candidum degraded filter paper and cotton

more efficiently than the industrial enzyme prepara-

tions to which they were compared, including Cellulase

Onozuka R-10 from Trichoderma viride (Japan), Meiji

cellulose from Acremonium cellulolyticus (Japan), and

rapidase from Aspergillus niger and Trichoderma

longibrachiatum (France) [8,11].
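The conversion percentages in Table 2 can be approximately reproduced from the measured glucose concentrations. The sketch below assumes that conversion was calculated on an anhydroglucose basis (mass factor 162.14/180.16, i.e. about 0.9); this factor is not stated in the text and is an assumption of this illustration, chosen because it matches the tabulated values to within rounding.

```python
# Assumed mass ratio of anhydroglucose (in cellulose) to free glucose.
ANHYDRO_FACTOR = 162.14 / 180.16

# Table 2 values: glucose released (mg/mL) and reported conversion (%).
TABLE2 = {
    ("GcaCel7A", "alone"): (0.39, 6.9),
    ("HjeCel7A", "alone"): (0.42, 7.6),
    ("GcaCel7A", "cocktail"): (0.65, 11.7),
    ("HjeCel7A", "cocktail"): (0.75, 13.5),
}


def conversion_percent(glc_mg_per_ml, cellulose_mg_per_ml=5.0):
    """Percent of cellulose converted, given the glucose concentration released."""
    return 100.0 * glc_mg_per_ml * ANHYDRO_FACTOR / cellulose_mg_per_ml


computed = {key: conversion_percent(glc) for key, (glc, _) in TABLE2.items()}

# Fold increase in glucose release when half of the Cel7A is replaced by the cocktail.
fold_gca = TABLE2[("GcaCel7A", "cocktail")][0] / TABLE2[("GcaCel7A", "alone")][0]
fold_hje = TABLE2[("HjeCel7A", "cocktail")][0] / TABLE2[("HjeCel7A", "alone")][0]
print(computed, round(fold_gca, 2), round(fold_hje, 2))
```

With these inputs the fold increases come out at about 1.7 for GcaCel7A and 1.8 for HjeCel7A, consistent with the "almost two-fold" description in the text.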

In conclusion, we have found that GcaCel7A shows

similar structural and functional characteristics to the

industrially relevant HjeCel7A. Nevertheless, notable

industrial advantages over HjeCel7A, including

reduced product inhibition and a broader optimal pH

range, are valuable in the development of cellulase

products, where the implemented process conditions

may vary significantly. For example, we envision that

replacing a portion of the HjeCel7A component with

GcaCel7A in lower-temperature, high-solid-loading


applications would result in improvements of

H. jecorina-based enzyme cocktails. Moreover, the minor

differences in the two structures, such as O-glycosylation

and modifications at loops B2 and B4, offer new direc-

tions in the pursuit of strategies for protein engineering

of GH7 CBHs. Finally, the performance of the G. can-

didum whole cellulase cocktail should be given serious

consideration in the development of new industrial

strains.

Experimental procedures

Fungal cultivation and protein preparation

The wild-type strain G. candidum 3C (a kind gift from A. M. Bezborodov, Bach Institute of Biochemistry, Moscow, Russia) was cultivated for 6 days at 28 °C in a rotary shaker at 250 r.p.m. in 2-L flasks with 500 mL of medium [1 g·L⁻¹ KH2PO4; 1.5 g·L⁻¹ NaNO3; 1.5 g·L⁻¹ (NH4)2SO4; 0.5 g·L⁻¹ MgSO4·7H2O; 40 g·L⁻¹ wheat bran; and 10 g·L⁻¹ filter paper]. The enzyme was purified according to a protocol described previously [12]. Briefly, the fungal biomass was removed by centrifugation, and the broth supernatant was concentrated, desalted, and transferred to 0.1 M sodium acetate (pH 4.5) (buffer A) by ultrafiltration (Vivaflow 200, 30-kDa cutoff; Sartorius, Germany). The concentrated protein solution was loaded onto a glass column with Sigmacell Cellulose Type 50 (Sigma-Aldrich, St. Louis, MO, USA). Buffer A was used to equilibrate the column and to wash out unbound proteins; this was followed by elution with distilled water. Fractions containing GcaCel7A, as judged by cellulase activity measurements and SDS/PAGE analysis, were pooled and transferred to 0.1 M Tris/HCl (pH 7.4) (buffer B) by dialysis. The protein solution was fractionated on a BioSuite Q column (Waters Co., Tokyo, Japan; 21.5 × 150 mm, flow rate of 2 mL·min⁻¹) with a linear gradient of 0–0.5 M NaCl in buffer B. Then, fractions with cellulase activity were loaded onto a BioSuite Phenyl column (Waters Co.; 21.5 × 150 mm) in buffer B with 1.7 M (NH4)2SO4, followed by a linear gradient of 1.7–0 M (NH4)2SO4 at a flow rate of 2 mL·min⁻¹.

For structural studies, the CBM-linker portion of GcaCel7A was removed proteolytically. Papain (26 µg), activated in 0.1 M sodium phosphate (pH 7.0), 2 mM DTT and 2 mM EDTA, was added to 2 mg of GcaCel7A, and the mixture was incubated for 8 h at room temperature. The papain was then inactivated by addition of iodoacetate to a concentration of 3.3 mM. Then, the protein was deglycosylated with 5 µg of endo-β-N-acetylglucosaminidase (EC 3.2.1.96, from Streptomyces plicatus) at room temperature overnight, and this was followed by size exclusion chromatography on a HiLoad Superdex 200 16/60 column with 50 mM sodium phosphate (pH 7.0) and 0.15 M NaCl as eluent. Fractions containing GcaCel7A_CD were pooled, and the resulting stock solution was concentrated and stored at −20 °C for use in crystallization and biochemical experiments. Protein purity was assessed by SDS/PAGE, and protein concentrations were determined by measuring the absorbance at 280 nm and using extinction coefficients of 93 040 M⁻¹·cm⁻¹ and 78 810 M⁻¹·cm⁻¹ for full-length GcaCel7A and GcaCel7A_CD, respectively, which were calculated by use of the ExPASy server [59].
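The A280-based concentration determination follows the Beer-Lambert law. The following is a minimal Python sketch that uses the two extinction coefficients quoted in the text; the 1 cm path length, the dilution factor and the example absorbance are illustrative assumptions and not values from this study.

```python
# Molar extinction coefficients at 280 nm quoted in the text (M^-1 cm^-1).
EPSILON_280 = {
    "GcaCel7A": 93_040.0,     # full-length enzyme
    "GcaCel7A_CD": 78_810.0,  # catalytic domain only
}


def molar_concentration(a280, protein, path_cm=1.0, dilution=1.0):
    """Return the protein concentration in mol/L from an A280 reading.

    Beer-Lambert law: c = A / (epsilon * l), multiplied by the dilution factor.
    path_cm and dilution are assumptions supplied by the user; the text does
    not state the cuvette path length or any dilution.
    """
    if a280 < 0 or path_cm <= 0 or dilution <= 0:
        raise ValueError("absorbance must be >= 0; path and dilution must be > 0")
    return a280 * dilution / (EPSILON_280[protein] * path_cm)


if __name__ == "__main__":
    # Hypothetical reading: A280 = 0.50 measured on a 10-fold diluted stock.
    c = molar_concentration(0.50, "GcaCel7A_CD", dilution=10.0)
    print(f"GcaCel7A_CD stock: {c * 1e6:.1f} uM")
```

Converting the molar value to mg·mL⁻¹ would additionally require the molar mass of the construct, which is not given in this section.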

Gene sequencing

For DNA and RNA isolation, mycelium of the strain

G. candidum 3C grown as described above was used. Cells

were filtered through gauze, washed with 15 mM Tris/HCl

and 150 mM NaCl (pH 7.0), and frozen with liquid nitro-

gen. Genomic DNA was extracted according to [60]. Total

RNA was isolated with the Fungal/Bacterial RNA Micro-

Prep kit (ZymoResearch, Irvine, CA, USA), according to

the manufacturer’s manual. The cDNA was synthesized on

the basis of total RNA with a RevertAid First Strand

cDNA Synthesis Kit (Thermo Scientific, Waltham, MA,

USA) and the Oligo(dT)18 primer.

The partial amino acid sequence of GcaCel7A, derived

from the initial structure, was used for a PBLAST homology

search against the NCBI and MycoCosm databases (http://

genome.jgi-psf.org/programs/fungi/index.jsf) and was found

to be most similar to a GH7 CBH from T. aurantiacus.

The MycoCosm database was then browsed with the geno-

mic nucleotide sequence of T. aurantiacus Cel7A. Two

other homologous Pezizomycotina gene sequences were

retrieved, from the species Neosartorya fischeri and

Botryosphaeria dothidea. Alignment of the three gene

sequences revealed two conserved regions of 21 bp and

17 bp near the ends of the genes, which served for the

design of four primers: GcCel7A_22, 5′-CCGACCTTGATGTTGGAGTAGA-3′; GcCel7A_22_cut21, 5′-CCGACCTTGATGTTGGAGTAG-3′; GcCel7A_22_cut17, 5′-CCTTGATGTTGGAGTAG-3′; and GcCel7A_17, 5′-TACACCAACTGCTACAC-3′. Gene amplification was performed with an Eppendorf Mastercycler under the following conditions: 2 min at 95 °C; 35 cycles of 30 s at 95 °C, 30 s at 52 °C and 45 s at 72 °C; and a final extension at 72 °C for 5 min. The reaction mixture contained 50 ng of template DNA, 5 µg of each primer, 0.2 mM each dNTP, and 2.5 U of Taq

polymerase (Evrogen, Moscow, Russia), in the appropriate

buffer. The resulting amplicons were sequenced at Evrogen

Co. The resulting PCR product, with a size of ~ 1200 bp,

was purified and sequenced, and the sequence was aligned

to a draft of the G. candidum whole genome sequence

(BioProject ID: PRJNA243259 [16]) to identify the com-

plete coding region (CDS) of the GcaCel7A gene and to

design primers for the CDS amplification. The complete

GcaCel7A CDS was PCR-amplified under the above conditions, with primers CDS-F (5′-ACCTTTGTCGTCCATCATGGC-3′) and CDS-R (5′-CCTTGCCTTGGATCTTAGTTTGG-3′), and cDNA from G. candidum as the PCR template. The resulting PCR product was sequenced with a BigDye Terminator v3.1 Cycle Sequencing Kit (Life Technologies, Carlsbad, CA, USA) and a 3500xL Genetic

Analyzer (Life Technologies). The nucleotide sequence cod-

ing for GcaCel7A has been deposited in NCBI GenBank

under the accession number KJ958925.
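The primer set listed in this section can be sanity-checked in a few lines. The sketch below (Python, standard library only) stores the sequences as given in the text and reports length and GC fraction; the helper names are hypothetical, and no melting temperatures are estimated because the text gives none.

```python
# Primer sequences (5'->3') copied from the text.
PRIMERS = {
    "GcCel7A_22": "CCGACCTTGATGTTGGAGTAGA",
    "GcCel7A_22_cut21": "CCGACCTTGATGTTGGAGTAG",
    "GcCel7A_22_cut17": "CCTTGATGTTGGAGTAG",
    "GcCel7A_17": "TACACCAACTGCTACAC",
    "CDS-F": "ACCTTTGTCGTCCATCATGGC",
    "CDS-R": "CCTTGCCTTGGATCTTAGTTTGG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in an upper-case DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name:18s} {len(seq):2d} nt  GC={gc_fraction(seq):.2f}")
```

The two "cut" primers are truncations of GcCel7A_22, which the test below confirms directly from the sequences.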

Cellulase activity assays

Assessment of cellulase activity, e.g. during protein purification, was routinely performed by incubating enzyme samples with 5 mg·mL⁻¹ Avicel PH-101 (Fluka-Sigma-Aldrich) microcrystalline cellulose as substrate in 100 µL of 0.1 M sodium acetate (pH 5.0), at 37 °C for 120 min with vigorous agitation every 20 min. The reaction was stopped by addition of 250 µL of 1 M NaOH. After removal of the remaining solids by centrifugation, the amount of released reducing sugar was determined with p-hydroxybenzoic acid hydrazide reagent as previously described [61], against glucose standards.
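Quantifying reducing sugar "against glucose standards" usually means fitting a calibration line to the standards and inverting it for the samples. A minimal sketch of that step is shown here, assuming a linear response over the standard range; the standard concentrations and absorbances are hypothetical, and the text does not state how the calibration was actually fitted.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sugar_from_absorbance(absorbance, slope, intercept):
    """Invert the calibration line to get glucose equivalents (same unit as xs)."""
    return (absorbance - intercept) / slope


if __name__ == "__main__":
    # Hypothetical glucose standards (mM) and their assay absorbances.
    standards_mM = [0.0, 0.25, 0.5, 1.0]
    readings = [0.05, 0.25, 0.45, 0.85]
    slope, intercept = fit_line(standards_mM, readings)
    print(f"sample: {sugar_from_absorbance(0.60, slope, intercept):.3f} mM")
```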

Temperature and pH dependence of GcaCel7A

activity and stability

The pH dependence of GcaCel7A activity on Avicel was

determined as above, but with 0.1 M sodium/citrate/phos-

phate buffers ranging from pH 3.0 to pH 7.5. The pH depen-

dence of protein stability was assessed by preincubation of

GcaCel7A for 24 h at 37 °C in the sodium/citrate/phosphate

buffers, followed by measurement of the residual activity on

Avicel at pH 5.0 as above. The temperature dependence of

GcaCel7A activity was monitored by performing incubations

with Avicel at temperatures ranging from 30 °C to 60 °C (at

pH 5.0) and measuring the release of reducing sugar. Enzyme

thermal stability was assessed by preincubation of GcaCel7A

at different temperatures (20–60 °C) for 1 h, followed by

measurement of the residual activity on Avicel under the

standard assay conditions (pH 5.0, 37 °C, 2 h). The data points are means of at least two independent experiments, and errors were within 7–10%.

Substrate specificity assays

The specific activity of GcaCel7A on different polysaccharides was assessed by incubation with 5 mg·mL⁻¹ substrate and determination of released reducing sugar as described above (pH 5.0, 37 °C), except that enzyme concentrations and incubation times were chosen to yield 3–5% substrate conversion. The substrates were: Avicel, CMC (Sigma-Aldrich, cat. no. 419311), xylan from beech wood (Sigma-Aldrich; cat. no. X-4252), lichenan (β-1,3-1,4-glucan; MP Biomedicals, Moscow, Russia, cat. no. 02155231), PASC prepared from Avicel PH-101 as previously described [62,63], and BC produced by Gluconacetobacter hansenii B6756 (purchased from the Russian National Collection of Industrial Microorganisms, Moscow, Russia) and prepared as previously described [64]. PASC and BC were incubated for 1 h with 0.075 mg·mL⁻¹ and 0.5 mg·mL⁻¹ enzyme, respectively, and the other substrates were incubated for 20 h with 0.5 mg·mL⁻¹ enzyme. The specific activity, U·mg⁻¹, is defined as the accumulated amount in µmol of glucose equivalents released per mg of enzyme divided by the incubation time in minutes under the assay conditions.
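The definition of specific activity given here translates directly into a short calculation. The sketch below assumes that the product concentration refers to the reaction mixture before any stop-solution dilution, and that conversion of a glucan is computed on an anhydroglucose basis (162.14 g·mol⁻¹); the example numbers are hypothetical.

```python
ANHYDROGLUCOSE_G_PER_MOL = 162.14  # mass of one glucose unit within a glucan chain


def specific_activity(product_mM, enzyme_mg_per_mL, minutes):
    """Specific activity in U/mg (umol glucose equivalents per mg enzyme per min).

    product_mM is umol/mL in the reaction mixture; dividing by the enzyme
    concentration (mg/mL) gives umol per mg, independent of reaction volume.
    """
    if enzyme_mg_per_mL <= 0 or minutes <= 0:
        raise ValueError("enzyme concentration and time must be positive")
    return product_mM / enzyme_mg_per_mL / minutes


def percent_conversion(product_mM, substrate_mg_per_mL=5.0):
    """Percentage of a glucan substrate converted to glucose equivalents."""
    released_mg_per_mL = product_mM * ANHYDROGLUCOSE_G_PER_MOL / 1000.0
    return 100.0 * released_mg_per_mL / substrate_mg_per_mL


if __name__ == "__main__":
    # Hypothetical PASC assay: 1.0 mM product after 60 min with 0.075 mg/mL enzyme.
    print(f"{specific_activity(1.0, 0.075, 60):.3f} U/mg")
    print(f"{percent_conversion(1.0):.1f} % conversion")
```

With these hypothetical inputs the conversion lands inside the 3–5% window targeted in the assay design, which is the kind of check the second function is meant for.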

Comparison of GcaCel7A and HjeCel7A activity

The activities of GcaCel7A and HjeCel7A (prepared as previously described [65]) were compared on Avicel with Cel7A enzyme acting alone and in cooperation with other cellulases. For the latter experiment, an industrial H. jecorina enzyme cocktail, Accellerase 1500 (Danisco-Genencor, Palo Alto, CA, USA), was fractionated by anion exchange chromatography, and Cel7A-containing fractions were removed. The remaining fractions were pooled to yield a Cel7A-depleted enzyme cocktail, referred to as Cel7A-free Accellerase. Enzyme reactions contained 50 µg·mL⁻¹ Cel7A enzyme alone, or 25 µg·mL⁻¹ Cel7A + 25 µg·mL⁻¹ Cel7A-free Accellerase, and 5 g·L⁻¹ Avicel PH-101 in 300 µL of 0.1 M sodium acetate (pH 5.0). Five replicates were incubated at 40 °C in closed Eppendorf tubes for 2 h with occasional mixing (~ 15-min intervals). The reactions were stopped by removal of the remaining substrate by filtration with 1-µm Whatman glass fibre 96-well plates in a vacuum filtration unit (Porvair Sciences, Wrexham, Wales, UK). The filtrate was incubated with 67 µg·mL⁻¹ almond β-glucosidase (Sigma-Aldrich) for 1 h at 37 °C to hydrolyse cellobiose to glucose, and this was followed by the p-hydroxybenzoic acid hydrazide reducing sugar assay [61].

Enzyme kinetics and cellobiose inhibition on

chromogenic substrates

Enzyme kinetics experiments were performed in triplicate in 96-well microtitre plates with three chromogenic substrates: CNP-G2, pNP-G2 and pNP-Lac (all from Sigma-Aldrich). Reaction mixtures of 150 µL containing 0.01–5 mM substrate and 0.64 µM GcaCel7A_CD in 50 mM sodium acetate (pH 5.0) were incubated at 30 °C for 30 min, and the reactions were stopped by the addition of 150 µL of 0.5 M sodium carbonate. Absorbance was read at 404 nm with an Eon Microplate Spectrophotometer (BioTek Instruments, Winooski, VT, USA), and the concentration of released chromophore was calculated by the use of extinction coefficients of 16 800 M⁻¹·cm⁻¹ for 2-chloro-4-nitrophenolate and 18 300 M⁻¹·cm⁻¹ for 4-nitrophenolate ion. Product inhibition experiments were carried out as above in duplicate, with 0.64 µM and 0.17 µM GcaCel7A_CD and pNP-Lac as substrate, in the absence and presence of 20 µM and 80 µM cellobiose. Appropriate controls were included to compensate for any background absorbance by substrate solutions and/or other compounds used. The Michaelis–Menten kinetics parameters kcat and KM, and the competitive inhibition constant, Ki, were derived by nonlinear regression with the program ORIGIN 8.0 (OriginLab, Northampton, MA, USA).
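For orientation, the rate law that underlies such a fit can be written out explicitly. The sketch below evaluates the standard Michaelis–Menten equation with competitive inhibition, v = kcat·[E]0·[S] / (KM·(1 + [I]/Ki) + [S]), and converts an A404 reading into chromophore concentration. It only evaluates the model and does not reproduce the ORIGIN regression; the optical path length of the well is an assumption that would have to be measured for the plate used.

```python
EPSILON_404 = {
    "2-chloro-4-nitrophenolate": 16_800.0,  # M^-1 cm^-1, from the text
    "4-nitrophenolate": 18_300.0,           # M^-1 cm^-1, from the text
}


def chromophore_uM(a404, chromophore, path_cm, dilution=2.0):
    """Chromophore concentration (uM) in the reaction mixture before stopping.

    dilution=2.0 reflects 150 uL reaction + 150 uL sodium carbonate.
    path_cm is plate- and volume-dependent and is an assumption here.
    """
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a404 / (EPSILON_404[chromophore] * path_cm) * 1e6 * dilution


def rate_competitive(s, kcat, km, e0, i=0.0, ki=float("inf")):
    """Initial rate under competitive inhibition.

    s, km, i and ki share one concentration unit; the result has the unit
    of kcat * e0. With i = 0 this reduces to plain Michaelis-Menten.
    """
    apparent_km = km * (1.0 + i / ki)
    return kcat * e0 * s / (apparent_km + s)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    for inhibitor_uM in (0.0, 20.0, 80.0):
        v = rate_competitive(s=500.0, kcat=1.0, km=500.0, e0=0.64,
                             i=inhibitor_uM, ki=40.0)
        print(f"[cellobiose] = {inhibitor_uM:4.0f} uM -> v = {v:.3f}")
```

A competitive inhibitor raises the apparent KM by the factor (1 + [I]/Ki) and leaves the maximal rate unchanged, which is why the inhibition series is measured at more than one cellobiose concentration.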

X-ray crystallography

Crystallization experiments were carried out with deglycosylated GcaCel7A_CD. Screening for crystallization conditions was performed in 96-well sitting drop trays with a Mosquito crystallization robot (TTP Labtech, UK). The most promising crystallization hits were obtained at room temperature with Hampton PEG/Ion screen conditions, with 0.2 M MgCl2 and 20% PEG 3350 as a precipitant. The crystals used for data collection were grown by sitting drop vapour diffusion under the same conditions after 1 : 1 mixing of precipitant with 6.5 mg·mL⁻¹ GcaCel7A_CD in 0.15 M NaCl and 50 mM sodium phosphate (pH 7.0). No ligand was added to the crystallization conditions for the APO1 structure. The APO2 crystal was soaked for 15 min with 5 mM XGO, but this ligand was not visible in the structure. The G2, G3 and G4 structure complexes were obtained from cocrystallization drops including laminaribiose (5 mM; Seikagaku Corporation, Tokyo, Japan), 4,4′-dithio-cellotriose (5 mM), or 4-thio-cellobiose (10 mM), respectively. Thio-cellooligosaccharides were synthesized by sequential elongation of the thio-oligosaccharide chain from the nonreducing to the reducing end, starting from corresponding S-glycosyl isothiourea precursors and the 4-O-trifluoromethanesulfonyl derivative of galactose, with a previously described approach [66,67]. A detailed description of the synthesis will be published elsewhere. NMR spectra of all synthesized thiocellooligosaccharides were in full agreement with previously published data [68]. Crystals selected for X-ray data collection were briefly soaked in mother liquor supplemented with 30% glycerol as cryoprotectant and then flash-frozen in liquid nitrogen. X-ray diffraction data were collected at 100 K at the synchrotron beamlines I911-2 and I911-3 (MAX-Lab, Lund, Sweden) and ID23-1 (ESRF, Grenoble, France), as indicated in Table 3. The data were integrated with XDS [69] and scaled with SCALA or AIMLESS in the CCP4 suite [70]. The initial GcaCel7A_CD structure model was solved by molecular replacement with PHASER [71] and a structure of PchCel7D as the search model (PDB code 1GPI [25]).

REFMAC5 [72] was used for structure model refinements, and manual model rebuilding was performed with COOT [73,74], with maximum likelihood σA-weighted 2Fo−Fc electron density maps [74]. For cross-validation of R and Rfree calculations, 5% of the data were excluded from the structure refinement [75]. Solvent molecules were added with the automatic water-picking function in the ARP/WARP package [76]. Picked water molecules were selected or discarded manually by visual inspection of 2Fo−Fc and Fo−Fc electron density maps. The coordinates for the five final GcaCel7A_CD structure models and the structure factors have been deposited in the PDB (http://wwpdb.org/) with accession codes 5AMP, 4ZZV, 4ZZT, 4ZZW and 4ZZU, respectively.
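The cross-validation step amounts to setting aside a random 5% of reflections that are never used in refinement. The sketch below illustrates that idea on reflection indices only; the text does not say which program assigned the free-set flags, so the function name, the seed and the reflection count are hypothetical.

```python
import random


def split_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Split reflection indices into a working set and a free (test) set.

    The free set is drawn at random without replacement and is excluded
    from refinement; Rfree is later computed on it alone.
    """
    if not 0.0 < free_fraction < 1.0:
        raise ValueError("free_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    n_free = round(n_reflections * free_fraction)
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


if __name__ == "__main__":
    work, free = split_free_set(40_000)  # hypothetical number of reflections
    print(len(work), "working reflections,", len(free), "free reflections")
```

In practice the same free set has to be carried through every refinement round of a data set, which is why the split is made once and stored with the data.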

MD simulations

A total of six MD simulations were performed to examine the protein dynamics of GcaCel7A relative to those of HjeCel7A: GcaCel7A with no ligand, GcaCel7A bound to cellononaose, GcaCel7A bound to a cellulose microfibril, HjeCel7A with no ligand, HjeCel7A bound to cellononaose, and HjeCel7A bound to a cellulose microfibril (Fig. S1). In each case, only the catalytic domain of the enzyme was considered. Additionally, the catalytic domains were simulated in a deglycosylated state, as we have previously determined that catalytic domain glycosylation contributes little to protein dynamics on the timescale of an MD simulation [77,78]. Here, we briefly describe our approach to construction, equilibration and execution of the six MD simulations. Detailed simulation methodology is provided in Doc. S1.

CHARMM was used to build and solvate the GcaCel7A and HjeCel7A systems from the GcaCel7A structure reported here (PDB code 5AMP) and the 4C4C structure, respectively [41,79]. The simulations of the catalytic domains with no bound ligands were performed by simply removing the ligand from the active site. Simulations of the bound catalytic domains in solution included the cellononaose ligand from the 4C4C structure of HjeCel7A, which was docked to GcaCel7A_CD by alignment of the protein backbones. In the cellulose microfibril simulations, the active site tunnels again bind the 4C4C cellononaose ligand spanning the −7 to +2 binding sites, but the cellononaose ligand is also covalently bound at the −7 pyranose to an edge chain of the cellulose Iβ hydrophobic surface [80,81]. The cellulose microfibril represents the top half of a 36-chain cellulose Iβ microfibril with the corner chains removed [82]. The microfibril is 18 glucose residues in length and has a sufficient footprint to allow the catalytic domain to fully interact. We have previously described similar system construction for the multidomain HjeCel7A [83]. After construction, the systems were explicitly solvated, and sodium ions were added for charge neutrality. The catalytic domains in solution were solvated in (80 Å)³ boxes, with ~ 52 000 atoms. The catalytic domains engaged with cellulose microfibrils were solvated in 135 × 100 × 90 Å boxes, resulting in systems of ~ 123 000 atoms.

The fully solvated systems were minimized, heated to 300 K and density-equilibrated in CHARMM. The 100-ns production MD simulations were conducted in the canonical ensemble at 300 K in NAMD with a 2-fs time step [84]. For all simulations, the proteins were modelled by use of the CHARMM force field with CMAP correction [85,86]. The cellononaose ligand and microfibrils were described by use of the CHARMM C36 force field [87,88], and water was described by use of the TIP3P model [89,90]. Analysis of the trajectories was performed with CHARMM and VMD [79,91].
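The simulation sizes quoted in this section can be cross-checked with simple arithmetic. The sketch below computes the number of integration steps implied by a 100-ns run at a 2-fs time step and compares the atom number density of the two box types; the atom counts are the approximate values from the text, and the function names are hypothetical.

```python
FS_PER_NS = 1_000_000  # femtoseconds per nanosecond


def n_steps(length_ns, timestep_fs):
    """Number of integration steps for a run of length_ns at timestep_fs."""
    total_fs = length_ns * FS_PER_NS
    if total_fs % timestep_fs:
        raise ValueError("run length is not a whole number of time steps")
    return total_fs // timestep_fs


def box_volume(a, b, c):
    """Volume of an orthorhombic box in cubic angstroms."""
    return a * b * c


def atom_density(n_atoms, volume_A3):
    """Atoms per cubic angstrom."""
    return n_atoms / volume_A3


if __name__ == "__main__":
    print("steps per production run:", n_steps(100, 2))
    solution = atom_density(52_000, box_volume(80, 80, 80))
    fibril = atom_density(123_000, box_volume(135, 100, 90))
    print(f"solution box: {solution:.3f} atoms/A^3")
    print(f"microfibril box: {fibril:.3f} atoms/A^3")
```

Both box types come out at roughly 0.1 atoms per cubic angstrom, as expected for explicitly solvated systems, so the quoted box dimensions and atom counts are mutually consistent.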

Acknowledgements

A. S. Borisova, M. Sandgren and J. Ståhlberg acknowledge financial support from Formas (The Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning) (grant no. 213-2013-1607), and the Faculty of Natural Resources and Agricultural Sciences, Swedish University of Agricultural Sciences, through the programme ‘MicroDrivE’. We thank Dr Majid Haddad Momeni for help with activity measurements. C. M. Payne and S. Jana thank the August T. Larsson Guest Researcher Programme at the Swedish University of Agricultural Sciences for funding and the opportunity to work alongside the Sandgren and Ståhlberg research group. Computational time for this research was provided in part by the National Science Foundation through Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1053575 (TG-MCB090159). The research was also supported by the Russian Foundation for Basic Research (project #14-08-01041-a), the Russian Scientific Foundation (grant #14-50-00069), and the Centre for Molecular and Cell Technologies, Research Park, St Petersburg State University. A. Logachev was supported by the St Petersburg State University (grant #1.38.253.2015).

Author contributions

A. S. Borisova, E. V. Eneyskaya, K. S. Bobrov, S. Jana, A. L. Lapidus, F. M. Ibatullin, U. Saleem, M. Sandgren, C. M. Payne, A. A. Kulminskaya and J. Ståhlberg planned experiments. A. S. Borisova, E. V. Eneyskaya, K. S. Bobrov, S. Jana, A. Logachev, D. E. Polev and U. Saleem performed experiments. A. S. Borisova, E. V. Eneyskaya, K. S. Bobrov, S. Jana, A. Logachev, D. E. Polev, A. L. Lapidus, M. Sandgren, C. M. Payne and J. Ståhlberg analysed data. F. M. Ibatullin contributed essential material. A. S. Borisova, C. M. Payne, A. A. Kulminskaya and J. Ståhlberg wrote the paper.

References

1 Gottschalk G (1988) Cellulose degradation and the carbon cycle. In Biochemistry and Genetics of Cellulose Degradation (Aubert J-P, Beguin P & Millet J, eds), pp. 3–8. Academic Press, London.

2 Vuong TV & Wilson DB (2009) Processivity, synergism, and substrate specificity of Thermobifida fusca Cel6B. Appl Environ Microbiol 75, 6655–6661.

3 Bhat MK (2000) Cellulases and related enzymes in biotechnology. Biotechnol Adv 18, 355–383.

4 Kirk O, Borchert TV & Fuglsang CC (2002) Industrial enzyme applications. Curr Opin Biotechnol 13, 345–351.

5 Mandels M (1976) Saccharification technology. Biotechnol Bioeng Symp 6, 221–222.

6 Reese ET & Mandels M (1971) Enzymatic Degradation. Wiley-InterScience, New York.

7 Cherry JR & Fidantsef AL (2003) Directed evolution of industrial enzymes: an update. Curr Opin Biotechnol 14, 438–443.

8 Rodionova NA, Tiunova NA, Feniksova RV, Kudriashova TI & Martinovich LI (1974) Cellulolytic enzymes of Geotrichum candidum. Doklady Akademii nauk SSSR 214, 1206–1209.

9 Rodionova NA, Dubovaya NV, Eneyskaya EV, Martinovich LI, Grasheva IM & Bezborodov AM (2000) Purification and characterization of endo-1,4-beta-xylanase from Geotrichum candidum 3C (in Russian). Prikl Biokhim Mikrobiol 36, 535–540.

10 Tiunova NA, Rodionova NA, Martinovich LI & Gogolev MN (1980) Preparation of cellulolytic enzymes from Geotrichum candidum. Prikl Biokhim Mikrobiol 16, 185–190.

11 Rodionova NA, Martinovich LI, Gukasyan GS & Amelina DS (1988) Purification of cellulases from Geotrichum candidum and Trichoderma longibrachiatum by chromatography on microcrystalline cellulose. Prikl Biokhim Mikrobiol 24, 370–379.

12 Vasil’eva NV, Rodionova NA, Martinovich LI, Tavobilov IM & Bezborodov AM (1989) Isolation of endo-1,4-beta-glucanase II, bound to microcrystalline cellulose, from a Geotrichum candidum 3C cellulase preparation. Prikl Biokhim Mikrobiol 25, 322–332.

13 Rodionova NA, Martinovich LI, Vasil’eva NV, Tovobilov IM, Zagustina NA, Zacharov VI, De Milo LE, Losyakova LS, Bochkareva NG, Belogortsev YA et al. (1987) The production of glucose. In Fermentation of Cellulose Containing Materials to the Sugars and Biofuel. pp. 31–42. CBNTI, Minmedprom, Moscow, 1987 (PRI GKNT USSR (21) 4262039/31-13 (22) 12.06.87 (46) 15.11.89, Å., ed) USSR.

14 Lapin VV, Rodionova NA, Zagustina NA, Kapanchan AT, Dubovaya NV & Bezborodov AM (2002) Application of enzyme preparation cellokandin from Geotrichum candidum 3C in waste paper utilization and dehydration of glucose suspension. Prikl Biokhim Mikrobiol 38, 452–454.

15 Garcia-Guinea J, Cardenes V, Martinez AT & Martinez MJ (2001) Fungal bioturbation paths in a compact disk. Die Naturwissenschaften 88, 351–354.

16 Polev DE, Bobrov KS, Eneyskaya EV & Kulminskaya AA (2014) Draft genome sequence of Geotrichum candidum strain 3C. Genome Announc 2, e00956–14.

17 Morel G, Sterck L, Swennen D, Marcet-Houben M, Onesime D, Levasseur A, Jacques N, Mallet S, Couloux A, Labadie K et al. (2015) Differential gene retention as an evolutionary mechanism to generate biodiversity and adaptation in yeasts. Sci Rep 5, 11571.

18 Payne CM, Knott BC, Mayes HB, Hansson H, Himmel ME, Sandgren M, Stahlberg J & Beckham GT (2015) Fungal cellulases. Chem Rev 115, 1308–1448.

19 Divne C, Stahlberg J, Reinikainen T, Ruohonen L, Pettersson G, Knowles JK, Teeri TT & Jones TA (1994) The three-dimensional crystal structure of the catalytic core of cellobiohydrolase I from Trichoderma reesei. Science 265, 524–528.

20 Grassick A, Murray PG, Thompson R, Collins CM, Byrnes L, Birrane G, Higgins TM & Tuohy MG (2004) Three-dimensional structure of a thermostable native cellobiohydrolase, CBH IB, and molecular characterization of the cel7 gene from the filamentous fungus, Talaromyces emersonii. Eur J Biochem 271, 4495–4506.

21 Kern M, McGeehan JE, Streeter SD, Martin RN, Besser K, Elias L, Eborall W, Malyon GP, Payne CM, Himmel ME et al. (2013) Structural characterization of a unique marine animal family 7 cellobiohydrolase suggests a mechanism of cellulase salt tolerance. Proc Natl Acad Sci USA 110, 10189–10194.

22 Momeni MH, Goedegebuur F, Hansson H, Karkehabadi S, Askarieh G, Mitchinson C, Larenas EA, Stahlberg J & Sandgren M (2014) Expression, crystal structure and cellulase activity of the thermostable cellobiohydrolase Cel7A from the fungus Humicola grisea var. thermoidea. Acta Crystallogr D Biol Crystallogr 70, 2356–2366.

23 Momeni MH, Payne CM, Hansson H, Mikkelsen NE, Svedberg J, Engström Å, Sandgren M, Beckham GT & Ståhlberg J (2013) Structural, biochemical, and computational characterization of the glycoside hydrolase family 7 cellobiohydrolase of the tree-killing fungus Heterobasidion irregulare. J Biol Chem 288, 5861–5872.

24 Moroz OV, Maranta M, Shaghasi T, Harris PV, Wilson KS & Davies GJ (2015) The three-dimensional structure of the cellobiohydrolase Cel7A from Aspergillus fumigatus at 1.5 Å resolution. Acta Crystallogr F Struct Biol Commun 71, 114–120.

25 Munoz IG, Ubhayasekera W, Henriksson H, Szabo I, Pettersson G, Johansson G, Mowbray SL & Stahlberg J (2001) Family 7 cellobiohydrolases from Phanerochaete chrysosporium: crystal structure of the catalytic module of Cel7D (CBH58) at 1.32 Å resolution and homology models of the isozymes. J Mol Biol 314, 1097–1111.

26 Parkkinen T, Koivula A, Vehmaanpera J & Rouvinen J (2008) Crystal structures of Melanocarpus albomyces cellobiohydrolase Cel7B in complex with cello-oligomers show high flexibility in the substrate binding. Protein Sci 17, 1383–1394.

27 Textor LC, Colussi F, Silveira RL, Serpa V, de Mello BL, Muniz JR, Squina FM, Pereira N Jr, Skaf MS & Polikarpov I (2013) Joint X-ray crystallographic and molecular dynamics study of cellobiohydrolase I from Trichoderma harzianum: deciphering the structural features of cellobiohydrolase catalytic activity. FEBS J 280, 56–69.

28 Divne C, Stahlberg J, Teeri TT & Jones TA (1998) High-resolution crystal structures reveal how a cellulose chain is bound in the 50 Å long tunnel of cellobiohydrolase I from Trichoderma reesei. J Mol Biol 275, 309–325.

29 Kurasin M & Valjamae P (2011) Processivity of cellobiohydrolases is limited by the substrate. J Biol Chem 286, 169–177.

30 Beckham GT, Bomble YJ, Bayer EA, Himmel ME & Crowley MF (2011) Applications of computational science for understanding enzymatic deconstruction of cellulose. Curr Opin Biotechnol 22, 231–238.

31 Bu L, Nimlos MR, Shirts MR, Stahlberg J, Himmel ME, Crowley MF & Beckham GT (2012) Product binding varies dramatically between processive and nonprocessive cellulase enzymes. J Biol Chem 287, 24807–24813.

32 Chundawat SP, Beckham GT, Himmel ME & Dale BE (2011) Deconstruction of lignocellulosic biomass to fuels and chemicals. Annu Rev Chem Biomol Eng 2, 121–145.

33 Knott BC, Crowley MF, Himmel ME, Stahlberg J & Beckham GT (2014) Carbohydrate-protein interactions that drive processive polysaccharide translocation in enzymes revealed from a computational study of cellobiohydrolase processivity. J Am Chem Soc 136, 8810–8819.

34 Petersen TN, Brunak S, von Heijne G & Nielsen H (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods 8, 785–786.

35 Steentoft C, Vakhrushev SY, Joshi HJ, Kong Y, Vester-Christensen MB, Schjoldager KT, Lavrsen K, Dabelsteen S, Pedersen NB, Marcos-Silva L et al. (2013) Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology. EMBO J 32, 1478–1488.

36 Robert X & Gouet P (2014) Deciphering key features in protein structures with the new ENDscript server. Nucleic Acids Res 42, W320–W324.


37 Tomme P, Van Tilbeurgh H, Pettersson G, Van

Damme J, Vandekerckhove J, Knowles J, Teeri T &

Claeyssens M (1988) Studies of the cellulolytic system

of Trichoderma reesei QM 9414. Analysis of domain

function in two cellobiohydrolases by limited

proteolysis. Eur J Biochem 170, 575–581.38 von Ossowski I, St�ahlberg J, Koivula A, Piens K,

Becker D, Boer H, Harle R, Harris M, Divne C, Mahdi

S et al. (2003) Engineering the Exo-loop of Trichoderma

reesei Cellobiohydrolase, Cel7A. A comparison with

Phanerochaete chrysosporium Cel7D. J Mol Biol 333,

817–829.39 Kleywegt GJ & Jones TA (1996) Phi/psi-chology:

ramachandran revisited. Structure 4, 1395–1400.40 Knott BC, Haddad Momeni M, Crowley MF,

Mackenzie LF, G€otz AW, Sandgren M, Withers SG,

St�ahlberg J & Beckham GT (2013) The mechanism of

cellulose hydrolysis by a two-step, retaining

cellobiohydrolase elucidated by structural and

transition path sampling studies. J Am Chem Soc 136,

321–329.41 Ubhayasekera W, Munoz IG, Vasella A, Stahlberg J &

Mowbray SL (2005) Structures of Phanerochaete

chrysosporium Cel7D in complex with product and

inhibitors. FEBS J 272, 1952–1964.42 Payne CM, Baban J, Horn SJ, Backe PH, Arvai AS,

Dalhus B, Bjør�as M, Eijsink VGH, Sørlie M, Beckham

GT et al. (2012) Hallmarks of processivity in glycoside

hydrolases from Crystallographic and computational

studies of the Serratia marcescens chitinases. J Biol

Chem 287, 36322–36330.43 Payne CM, Jiang W, Shirts MR, Himmel ME, Crowley

MF & Beckham GT (2013) Glycoside hydrolase

processivity is directly related to oligosaccharide

binding free energy. J Am Chem Soc 135, 18831–18839.44 Kurasin M & Valjamae P (2010) Processivity of

cellobiohydrolases is limited by the substrate. J Biol

Chem 286, 169–177.45 Sammond DW, Payne CM, Brunecky R, Himmel ME,

Crowley MF & Beckham GT (2012) Cellulase linkers

are optimized based on domain type and function:

insights from sequence analysis, biophysical

measurements, and molecular simulation. PLoS One 7,

e48615.

46 Beckham GT, Bomble YJ, Matthews JF, Taylor CB,

Resch MG, Yarbrough JM, Decker SR, Bu L, Zhao X,

McCabe C et al. (2010) The O-glycosylated linker from

the Trichoderma reesei Family 7 cellulase is a flexible,

disordered protein. Biophys J 99, 3773–3781.47 Boer H & Koivula A (2003) The relationship between

thermal stability and pH optimum studied with wild-

type and mutant Trichoderma reesei cellobiohydrolase

Cel7A. Eur J Biochem 270, 841–848.48 Becker D, Braet C, Brumer H 3rd, Claeyssens M,

Divne C, Fagerstrom BR, Harris M, Jones TA,

Kleywegt GJ, Koivula A et al. (2001) Engineering of a

glycosidase Family 7 cellobiohydrolase to more alkaline

pH optimum: the pH behaviour of Trichoderma reesei

Cel7A and its E223S/A224H/L225V/T226A/D262G

mutant. Biochem J 356, 19–30.49 Tuohy MG, Walsh DJ, Murray PG, Claeyssens M,

Cuffe MM, Savage AV & Coughlan MP (2002) Kinetic

parameters and mode of action of the

cellobiohydrolases produced by Talaromyces emersonii.

Biochim Biophys Acta 1596, 366–380.50 Voutilainen SP, Boer H, Linder MB, Puranen T,

Rouvinen J, Vehmaanper€a J & Koivula A (2007)

Heterologous expression of Melanocarpus albomyces

cellobiohydrolase Cel7B, and random mutagenesis to

improve its thermostability. Enzyme Microb Technol 41,

234–243.51 Andey WS, Jeoh T, Beckham GT, Chou Y-C, Baker

JO, Michener W, Brunecky R & Himmel ME (2009)

Probing the role of N-linked glycans in the stability and

activity of fungal cellobiohydrolases by mutational

analysis. Cellulose 16, 669–709.52 Gao L, Gao F, Wang L, Geng C, Chi L, Zhao J & Qu

Y (2012) N-glycoform diversity of cellobiohydrolase I

from Penicillium decumbens and synergism of

nonhydrolytic glycoform in cellulose degradation.

J Biol Chem 287, 15906–15915.53 Garcia-Viloca M, Gao J, Karplus M & Truhlar DG

(2004) How enzymes work: analysis by modern rate

theory and computer simulations. Science 303, 186–195.54 Chen L, Drake MR, Resch MG, Greene ER, Himmel

ME, Chaffey PK, Beckham GT & Tan Z (2014)

Specificity of O-glycosylation in enhancing the stability

and cellulose binding affinity of Family 1 carbohydrate-

binding modules. Proc Natl Acad Sci USA 111,

7612–7617.55 Harrison MJ, Nouwens AS, Jardine DR, Zachara NE,

Gooley AA, Nevalainen H & Packer NH (1998)

Modified glycosylation of cellobiohydrolase I from a

high cellulase-producing mutant strain of Trichoderma

reesei. Eur J Biochem 256, 119–127.56 Jeoh T, Michener W, Himmel ME, Decker SR &

Adney WS (2008) Implications of cellobiohydrolase

glycosylation for use in biomass conversion. Biotechnol

Biofuels 1, 10.

57 Klarskov K, Piens K, Stahlberg J, Hoj PB, Beeumen

JV & Claeyssens M (1997) Cellobiohydrolase I from

Trichoderma reesei: identification of an active-site

nucleophile and additional information on sequence

including the glycosylation pattern of the core protein.

Carbohydr Res 304, 143–154.

58 Taylor CB, Talib MF, McCabe C, Bu L, Adney WS,

Himmel ME, Crowley MF & Beckham GT (2012)

Computational investigation of glycosylation effects on

a family 1 carbohydrate-binding module. J Biol Chem

287, 3147–3155.

59 Gasteiger EHC, Gattiker A, Duvaud S, Wilkins MR,

Appel RD & Bairoch A (2005) Protein identification

and analysis tools on the ExPASy server. In The

Proteomics Protocols Handbook (Walker JM, ed) pp.

571–601. Humana Press Inc., Totowa, NJ.

60 Lebedeva L & Tvaruzek L (2006) Specialisation of

Rhynchosporium secalis (Oud.) J.J Davis infecting

barley and rye. Plant Protect Sci 42, 85–93.

61 Hori C, Igarashi K, Katayama A & Samejima M

(2011) Effects of xylan and starch on secretome of the

basidiomycete Phanerochaete chrysosporium grown on

cellulose. FEMS Microbiol Lett 321, 14–23.

62 Walseth CS (1952) Occurrence of cellulases in enzyme

preparations from microorganisms. Tappi J 35,

228–233.

63 Wood TM (1988) Preparation of crystalline,

amorphous, and dyed cellulase substrates. Methods

Enzymol 160, 19–25.

64 Tang W, Jia S, Jia Y & Yang H (2010) The influence of

fermentation conditions and post-treatment methods on

porosity of bacterial cellulose membrane. World J

Microbiol Biotechnol 26, 125–131.

65 Stahlberg J, Divne C, Koivula A, Piens K, Claeyssens

M, Teeri TT & Jones TA (1996) Activity studies and

crystal structures of catalytically deficient mutants of

cellobiohydrolase I from Trichoderma reesei. J Mol Biol

264, 337–349.

66 Ibatullin FM, Selivanov SI & Shavva AG (2001) A

general procedure for conversion of S-glycosyl

isothiourea derivatives into thioglycosides,

thiooligosaccharides and glycosyl thioesters. Synthesis

2001, 419–422.

67 Ibatullin FM, Shabalin KA, Janis JV & Selivanov SI

(2001) Stereoselective synthesis of

thioxylooligosaccharides from S-glycosyl isothiourea

precursors. Tetrahedron Lett 42, 4565–4567.

68 Schou C, Rasmussen G, Schulein M, Henrissat B &

Driguez H (1993) 4-Thiocellooligosaccharides - their

synthesis and use as inhibitors of cellulases.

J Carbohydr Chem 12, 743–752.

69 Kabsch W & Sander C (1983) Dictionary of protein

secondary structure: pattern recognition of hydrogen-

bonded and geometrical features. Biopolymers 22,

2577–2637.

70 Evans PR (2011) An introduction to data reduction:

space-group determination, scaling and intensity

statistics. Acta Crystallogr D Biol Crystallogr 67,

282–292.

71 McCoy AJ, Grosse-Kunstleve RW, Adams PD, Winn

MD, Storoni LC & Read RJ (2007) Phaser

crystallographic software. J Appl Crystallogr 40,

658–674.

72 Murshudov GN, Skubak P, Lebedev AA, Pannu NS,

Steiner RA, Nicholls RA, Winn MD, Long F & Vagin

AA (2011) REFMAC5 for the refinement of

macromolecular crystal structures. Acta Crystallogr D

Biol Crystallogr 67, 355–367.

73 Emsley P & Cowtan K (2004) Coot: model-building

tools for molecular graphics. Acta Crystallogr D Biol

Crystallogr 60, 2126–2132.

74 Emsley P, Lohkamp B, Scott WG & Cowtan K (2010)

Features and development of Coot. Acta Crystallogr D

Biol Crystallogr 66, 486–501.

75 Collaborative Computational Project, Number 4 (1994) The

CCP4 suite: programs for protein crystallography. Acta

Crystallogr D Biol Crystallogr 50, 760–763.

76 Langer G, Cohen SX, Lamzin VS & Perrakis A (2008)

Automated macromolecular model building for X-ray

crystallography using ARP/wARP version 7. Nat

Protoc 3, 1171–1179.

77 Payne CM, Bomble YJ, Taylor CB, McCabe C,

Himmel ME, Crowley MF & Beckham GT (2011)

Multiple functions of aromatic-carbohydrate

interactions in a processive cellulase examined with

molecular simulation. J Biol Chem 286, 41028–41035.

78 Taylor CB, Payne CM, Himmel ME, Crowley MF,

McCabe C & Beckham GT (2013) Binding site

dynamics and aromatic-carbohydrate interactions in

processive and non-processive family 7 glycoside

hydrolases. J Phys Chem B 117, 4924–4933.

79 Brooks BR, Brooks CL III, Mackerell AD Jr, Nilsson

L, Petrella RJ, Roux B, Won Y, Archontis G, Bartels

C, Boresch S et al. (2009) CHARMM: the biomolecular

simulation program. J Comput Chem 30, 1545–1614.

80 Beckham GT, Matthews JF, Peters B, Bomble YJ,

Himmel ME & Crowley MF (2011) Molecular-Level

origins of biomass recalcitrance: decrystallization free

energies for four common cellulose polymorphs. J Phys

Chem B 115, 4118–4127.

81 Lehtio J, Sugiyama J, Gustavsson M, Fransson L,

Linder M & Teeri TT (2003) The binding specificity

and affinity determinants of family 1 and family 3

cellulose binding modules. Proc Natl Acad Sci USA

100, 484–489.

82 Nishiyama Y, Langan P & Chanzy H (2002) Crystal

structure and hydrogen-bonding system in cellulose Iβ from synchrotron X-ray and neutron fiber diffraction.

J Am Chem Soc 124, 9074–9082.

83 Payne CM, Resch MG, Chen L, Crowley MF, Himmel

ME, Taylor LE, Sandgren M, Ståhlberg J, Stals I, Tan

Z et al. (2013) Glycosylated linkers in multimodular

lignocellulose-degrading enzymes dynamically bind to

cellulose. Proc Natl Acad Sci 110, 14646–14651.

84 Phillips JC, Braun R, Wang W, Gumbart J,

Tajkhorshid E, Villa E, Chipot C, Skeel RD, Kale L &

Schulten K (2005) Scalable molecular dynamics with

NAMD. J Comput Chem 26, 1781–1802.

85 MacKerell AD, Bashford D, Bellott M, Dunbrack RL,

Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha

S et al. (1998) All-atom empirical potential for

molecular modeling and dynamics studies of proteins.

J Phys Chem B 102, 3586–3616.

86 MacKerell AD Jr, Feig M & Brooks CL III (2004)

Extending the treatment of backbone energetics in

protein force fields: limitations of gas-phase quantum

mechanics in reproducing protein conformational

distributions in molecular dynamics simulations.

J Comput Chem 25, 1400–1415.

87 Guvench O, Greene SN, Kamath G, Brady JW,

Venable RM, Pastor RW & MacKerell AD Jr (2008)

Additive empirical force field for hexopyranose

monosaccharides. J Comput Chem 29, 2543–2564.

88 Guvench O, Hatcher E, Venable RM, Pastor RW &

MacKerell AD Jr (2009) CHARMM additive all-atom

force field for glycosidic linkages between

hexopyranoses. J Chem Theory Comput 5, 2353–2370.

89 Durell SR, Brooks BR & Bennaim A (1994) Solvent-

induced forces between 2 hydrophilic groups. J Phys

Chem-Us 98, 2198–2202.

90 Jorgensen WL, Chandrasekhar J, Madura JD, Impey

RW & Klein ML (1983) Comparison of simple

potential functions for simulating liquid water. J Chem

Phys 79, 926–935.

91 Humphrey W, Dalke A & Schulten K (1996) VMD:

visual molecular dynamics. J Mol Graph Model 14,

33–38.

Supporting information

Additional supporting information may be found in

the online version of this article at the publisher’s web

site:

Table S1. Sequence identities and rmsd values between

GcaCel7A and other GH7 CBH structures.

Table S2. Simulation parameters for all six MD

simulations included in this study.

Fig. S1. Illustrations of the six MD simulations

performed as part of this study.

Fig. S2. MD simulation results comparing the active

site properties and dynamics of GcaCel7A and

HjeCel7A.

Fig. S3. Histograms of the minimum distances of (A)

loops A1 to B2 and (B) loops B2 to B3 for GcaCel7A

and HjeCel7A.

Fig. S4. Multiple sequence alignment at loop B2 of 42

GH7 CBH sequences from phylogenetically distant

species in the eukaryote tree of life suggests that

potential O-glycosylation sites in the form of Ser or

Thr are ubiquitously present near the tip of loop B2.

Doc. S1. Detailed MD protocol.

Doc. S2. Additional MD simulation results.

Doc. S3. References.
