

STRUCTURAL ORDER AND DISORDER DICTATE SEQUENCE AND FUNCTIONAL EVOLUTION OF THE PAPILLOMAVIRUS E7 PROTEIN

LUCIA B. CHEMES¶, JULIANA GLAVINA§, CRISTINA MARINO-BUSLJE¶, GONZALO DE PRAT-GAY¶ AND IGNACIO E. SANCHEZ§

¶PROTEIN STRUCTURE, FUNCTION AND ENGINEERING LABORATORY, FUNDACION INSTITUTO LELOIR AND IIBBA-CONICET, BUENOS AIRES, ARGENTINA.

§PROTEIN PHYSIOLOGY LABORATORY, DEPARTAMENTO DE QUIMICA BIOLOGICA, FACULTAD DE CIENCIAS EXACTAS Y NATURALES-UNIVERSIDAD DE BUENOS AIRES, ARGENTINA

INTRODUCTION

E7 is the main transforming protein in papillomaviruses and plays an important role in oncogenesis [1]. The globular C-terminal domain (E7C) mediates zinc binding and homodimerization [2]. The intrinsically disordered N-terminal domain (E7N) harbors several linear motifs that mediate interaction with cellular targets, including the high-affinity LxCxE binding site for the retinoblastoma protein (Rb) [3]. We have analyzed the sequence and functional evolution of E7 using 210 natural sequences.

CONCLUDING REMARKS

• Sequence evolution in the disordered E7N domain shows that some of its short functional motifs evolve in a coordinated manner and that the domain has been subject to several episodes of adaptive evolution. The high functional density within E7N could explain the large number of targets found for this small protein.

• Evolution of the E7C domain is dictated by dimerization, canonical zinc binding by the two CxxC motifs and likely also by zinc binding by unpaired cysteines and binding of short Ser/Pro-rich sequences within host proteins.

[Figure: sequence-logo panels "E7N DOMAIN CONSERVATION" and "E7C DOMAIN CONSERVATION". E7N region labels: CR1 helix, LxCxE, CKII-PEST. Target labels: RbAB (E2F site), p600, p300, IRF-1, Cullin-2, FHL2.]

The sequence logos [4] show that E7N is as conserved as E7C in spite of the lack of a stable structure (figure A). The highly conserved E7N motifs are separated by variable regions. The CR1 region shows high conservation at the helix-forming, Rb-targeting residues 6-13 and at the uncharacterized residues 1-3. The LxCxE motif also shows conservation at residues outside the canonical motif (positions 19, 23, 25 and 26). One third of the E7 sequences lack a CKII phosphorylation site, while only 2.5% of them lack a stretch of acidic residues (n>3) (figure B). The tight restriction in sequence separation between the LxCxE motif and the CKII/PEST region, together with the coevolution of residues 25 and 29 [5] (black asterisks), suggests that the two motifs form an evolutionary and functional unit.
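The conservation shown in a sequence logo is the per-column information content. As a minimal sketch, assuming an ungapped-or-gapped protein alignment of equal-length strings and omitting the small-sample correction, it can be computed as log2(20) minus the Shannon entropy of each column. The function names and the toy alignment are illustrative only and are not the poster's data or pipeline.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_information(column):
    """Information content (bits) of one alignment column; gaps and
    non-standard symbols are ignored. No small-sample correction."""
    residues = [aa for aa in column.upper() if aa in AMINO_ACIDS]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum possible entropy for 20 amino acids minus the observed entropy.
    return math.log2(len(AMINO_ACIDS)) - entropy

def alignment_information(sequences):
    """Per-column information content for equal-length aligned sequences."""
    return [column_information("".join(col)) for col in zip(*sequences)]

# Toy alignment (made-up sequences, NOT real E7 data).
toy_alignment = ["LYCYE", "LHCYE", "LTCHE", "LRCDE"]
profile = alignment_information(toy_alignment)
for position, bits in enumerate(profile, start=1):
    print(f"column {position}: {bits:.2f} bits")
```

Fully conserved columns reach the maximum of log2(20), about 4.32 bits, while variable columns score lower, which is how conserved motifs stand out from the variable regions between them.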

The phylogenetic analysis suggests that the LxCxE motif, the acidic stretch and the CKII sites have changed several times during papillomavirus evolution. Sequence changes in the LxCxE motif are coupled to changes in phenotype (delta, gamma and alpha 2 sp.), pointing to adaptive evolution events. In reptilian, avian and some artiodactyl papillomaviruses, E7N is replaced by a domain with no sequence similarity to canonical E7N sequences. Whenever the LxCxE motif is present, the acidic stretch follows, further supporting the functional association between them. Gamma papillomaviruses often harbor an LxSxE motif. (Figure adapted from [6])
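Presence or absence of these linear motifs can be tabulated with simple pattern matching. The sketch below is an illustration under stated assumptions, not the authors' method: the acidic stretch is taken as four or more consecutive D/E residues (one reading of "n>3"), the CKII site uses the minimal [ST]-x-x-[DE] consensus, and the sequences are made up.

```python
import re

# Assumed motif definitions (hypothetical choices for illustration).
MOTIFS = {
    "LxCxE": re.compile(r"L.C.E"),
    "LxSxE": re.compile(r"L.S.E"),
    "acidic_stretch": re.compile(r"[DE]{4,}"),   # assumes consecutive D/E, n > 3
    "CKII_site": re.compile(r"[ST]..[DE]"),      # minimal CK2 consensus
}

def motif_presence(sequence):
    """Return {motif name: True/False} for one sequence (gaps removed)."""
    seq = sequence.upper().replace("-", "")
    return {name: bool(pattern.search(seq)) for name, pattern in MOTIFS.items()}

def fraction_lacking(sequences, motif_name):
    """Fraction of sequences in which the named motif is absent."""
    lacking = sum(1 for s in sequences if not motif_presence(s)[motif_name])
    return lacking / len(sequences)

# Made-up toy sequences, NOT real E7 sequences.
toy_sequences = [
    "MKTDLYCYEQLSDSEEEDEA",  # LxCxE, acidic stretch and CKII site
    "MKTDLYSYEQLNDAEEEDEA",  # LxSxE and acidic stretch, no CKII site
    "MKTGAYAYKQLNGAKKAGKA",  # none of the motifs
]

for name in MOTIFS:
    print(name, round(fraction_lacking(toy_sequences, name), 2))
```

Applied to a real alignment, the same tabulation per phylogenetic group would show where a motif was gained, lost or changed, for example LxCxE versus LxSxE.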

The sequence logo [4] for the C-terminal domain displays (1) four cysteine residues involved in zinc binding (positions 44, 47, 77 and 80, displayed in red in figure A), (2) a highly conserved leucine-rich region that acts as a nuclear export signal, (3) six conserved positions (displayed in blue in figure A) that form the core of each monomer, and (4) six conserved positions (displayed in cyan in figure A) that stabilize the dimer interface. Four conserved residues are surface exposed (yellow in figure A). A mutual information analysis [5] reveals two pairs of coevolving amino acid positions that form contacts across the dimerization interface (figure B).
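Coevolution between two alignment positions can be quantified with mutual information. The following is a minimal sketch of plain, uncorrected mutual information between two columns; it does not implement the corrections for phylogeny, small sample size and redundancy described in [5], and the toy columns are invented for illustration.

```python
import math
from collections import Counter

def mutual_information(col_a, col_b, gap="-"):
    """Plain mutual information (bits) between two alignment columns.
    Sequences with a gap in either column are skipped."""
    pairs = [(a, b) for a, b in zip(col_a, col_b) if a != gap and b != gap]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    freq_a = Counter(a for a, _ in pairs)
    freq_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) written with raw counts.
        mi += p_ab * math.log2(count * n / (freq_a[a] * freq_b[b]))
    return mi

# Toy columns (made up): one residue per sequence at each position.
covarying = mutual_information("DDTT", "TTDD")    # residues change together
independent = mutual_information("DDTT", "CQCQ")  # no association
print(covarying, independent)
```

Pairs of positions that change together score high, while independent positions score near zero; on real data the raw score is biased by shared ancestry, which is the motivation for the corrected measure in [5].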

[Figure labels, binding targets: P107, p130, p21CIP, p300, IRF-1, FHL2; RbAB (LxCxE site), TBP, F-Actin, CKII, HPV-E2C; RbC, p21CIP, p27KIP, TBP, AP1, Mi2B, IGFBP3, S4 proteasome, MPP2, IRF-1, h-TID, pCAF, Cullin2-UBC, E2F1, FHL2, NuMA, DNMT1.]

Two E7C sequence regions show high frequencies of cysteine (6 to 21%), with most E7 proteins having at least one extra cysteine in addition to the two canonical CxxC motifs. One cysteine-rich region (blue) is close in space and sequence to the first CxxC motif and the zinc ion. The second region (green) is close in space to the zinc ion coordinated by the opposite monomer. The additional cysteines may stabilize alternative conformations of the domain through non-native coordination of the zinc ion.
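Per-position cysteine frequencies and the number of extra cysteines per sequence can be tallied as sketched below. This is a rough heuristic under stated assumptions: the two canonical motifs are taken to be the first two non-overlapping C-x-x-C matches in each sequence, which may misassign cysteines in real data, and the sequences are made up.

```python
import re

CXXC = re.compile(r"C..C")

def extra_cysteines(sequence):
    """Count cysteines outside the first two non-overlapping CxxC matches
    (a heuristic stand-in for the two canonical zinc-binding motifs)."""
    seq = sequence.upper().replace("-", "")
    canonical = set()
    for match in list(CXXC.finditer(seq))[:2]:
        # Record the positions of the two cysteines of this CxxC match.
        canonical.update((match.start(), match.end() - 1))
    return sum(1 for i, aa in enumerate(seq) if aa == "C" and i not in canonical)

def cysteine_frequency(sequences):
    """Fraction of sequences with cysteine at each alignment column."""
    n = len(sequences)
    return [sum(1 for aa in col if aa.upper() == "C") / n for col in zip(*sequences)]

# Made-up aligned toy sequences, NOT real E7C sequences.
toy_e7c = [
    "AACKKCAACAAACPPCA",  # two CxxC motifs plus one extra cysteine
    "AACKKCAAAAAACPPCA",  # two CxxC motifs only
]
print([extra_cysteines(s) for s in toy_e7c])
print(cysteine_frequency(toy_e7c))
```

Columns with intermediate cysteine frequency outside the canonical motifs would correspond to the cysteine-rich regions described above.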

HPV45 E7C binds to a short peptide from the host cellular protein p21 [2]. NMR measurements indicate that the interaction is mediated by residues on the exposed surface of each monomer (upper panel, left). The proposed site overlaps well with a conserved surface patch in E7C (upper panel, middle), suggesting that peptide binding is a property shared by most E7C domains. The structures of the E7C domain and host PHD domains superimpose well [7] (lower panel, right), pointing to a plausible evolutionary origin for E7C. Many PHD domains bind peptides at a site that corresponds to the proposed functional surface of E7C (upper panel, right). A motif search in the sequences of E7C targets [8] suggests that the domain binds peptides rich in proline and serine residues (lower panel, left).

REFERENCES

[1] Chemes LB et al. Intrinsic disorder in the human papillomavirus E7 protein. In: Flexible Viruses, Uversky VN and Longhi S, Eds. In press.
[2] Ohlenschlager O et al. Solution structure of the partially folded high-risk human papilloma virus 45 oncoprotein E7. Oncogene 2006, 25:5953-9.
[3] Lee JO et al. Structure of the retinoblastoma tumour suppressor pocket domain bound to a peptide from HPV E7. Nature 1998, 391:859-65.
[4] Schneider TD, Stephens RM. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990, 18:6097-100.
[5] Marino Buslje C et al. Correction for phylogeny, small number of observations and data redundancy improves the identification of coevolving amino acid pairs using mutual information. Bioinformatics 2009, 25:1125-1131.
[6] Bravo IG et al. The clinical importance of understanding the evolution of papillomaviruses. Trends in Microbiology 2010, 18:432-438.
[7] Suhrer S et al. COPS-a novel workbench for explorations in fold space. Nucleic Acids Res. 2009, 37(Web Server issue):W539-544.
[8] Radusky L et al. Discovery of functional protein linear motifs using a greedy algorithm and information theory. Poster.

[Figure panel titles: EVOLUTION OF E7N LINEAR MOTIFS; E7C CYSTEINE CLUSTERS; CONSERVED PEPTIDE-BINDING SITE IN E7C; PUTATIVE E7C BINDING MOTIF.]

[Figure legend labels: asterisks, coevolving residue pairs; residue pairs D61/T72 and C45/Q56; CKII-PEST region; RbAB (LxCxE site); information content; co-evolution; conservation; Zn-binding cysteines; surface residues; monomer interface; dimer interface; NES; Region 1; Region 2; zinc ion; p21-binding site; H3-binding site; HPV45 E7C (PDB 2B9D); PYGO1_MOUSE (PDB 2YYR); E7C/PYGO1; panels A and B.]