This article was downloaded by: [University Of Pittsburgh] On: 25 November 2014, At: 17:50 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of Biomolecular Structure and Dynamics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tbsd20 Three-dimensional Models of NB-ARC Domains of Disease Resistance Proteins in Tomato, Arabidopsis, and Flax Rajagopal Chattopadhyaya a & Amita Pal b a Department of Biochemistry b Plant Molecular & Cellular Genetics Section , Bose Institute , Calcutta , 700054 , India Published online: 15 May 2012. To cite this article: Rajagopal Chattopadhyaya & Amita Pal (2008) Three-dimensional Models of NB-ARC Domains of Disease Resistance Proteins in Tomato, Arabidopsis, and Flax, Journal of Biomolecular Structure and Dynamics, 25:4, 357-371, DOI: 10.1080/07391102.2008.10507184 To link to this article: http://dx.doi.org/10.1080/07391102.2008.10507184 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. 
Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http:// www.tandfonline.com/page/terms-and-conditions

Three-dimensional Models of NB-ARC Domains of Disease Resistance Proteins in Tomato, Arabidopsis, and Flax



http://www.jbsdonline.com

Abstract

Three-dimensional models of NB-ARC domains in five different proteins were constructed based on the recently published crystal structure of the apoptotic protease activating factor 1 (Apaf-1): two for tomato proteins and one each for flax, Arabidopsis, and nematode proteins. Standard multiple sequence alignment was performed for chosen members of the NB-ARC domain family, which are very divergent from each other in protein sequence, followed by homology model building and structure refinement. In this alignment, amino acid insertions and deletions between members generally fall in loop regions or at the ends of alpha helices. Despite the sequence divergence between the species, it is argued that the NB-ARC domains carry out similar biological functions in the various species, notably ATP binding and ATPase activity. From our comparative study of these models, it is predicted that NB-ARC domains should bind ADP/ATP rather than GDP/GTP. Both natural and induced mutants of Arabidopsis within the RPS2 locus and their phenotypes for disease reaction against Pseudomonas syringae are rationalized from the protein model. The residues at the Apaf-1 Thr263 and Arg265 positions, totally conserved within the NB-ARC domains, are predicted to take an active part in the catalytic activity of the kinase-3 motif, the arginine being known as the sensor I motif in AAA+ proteins. This was later verified for the Ced-4 crystal structure in complex with Ced-9. Our model of Ced-4 based on Apaf-1 was also compared with its crystal structure in the Ced-4-Ced-9 complex; the three-layered α/β domain superposes quite well and helical domain I is shifted by about 5 Å, but the winged helix domain is rotated away to a new position. Since Apaf-1 was crystallized with ADP and Ced-4-Ced-9 with magnesium-ATP, this rotation signifies a change in structure of these NB-ARC domains between the two forms. Further, we hypothesize that certain mutants in the plant R proteins called ‘constitutive gain-of-function’ or ‘autocatalytic’ dispose their winged helix domains permanently like the magnesium-ATP form observed for Ced-4, avoiding the closed ADP conformation. The models are also validated with mutagenesis data for a related tomato protein I-2, tomato Prf, and flax, including loss of function, wild type, and autocatalytic phenotypes, and compared with similar data for potato and tobacco proteins, for which models were not built. These three-dimensional models should help us to understand the spatial arrangement and function of R proteins and their conserved motifs.

Introduction

Plants defend themselves against a wide range of pathogenic organisms. In their genetic armoury, the presence of a class of resistance (R) genes is an important part. Plants lacking the R genes or possessing defective ones are known to be susceptible to the pathogenic organisms (1). Though we do not possess a full understanding of how R proteins trigger signal transduction pathways leading to host cell defense, it is known that the largest class of R genes encodes proteins that have a variable N terminus, a conserved central domain predicted to be a nucleotide binding (NB) site, and a variable number of leucine-rich repeats (LRRs) at the C terminus. The amino terminal domain of NBS shows homology to both Toll of Drosophila and interleukin-receptor-like proteins of mammals (2), hence named TIR (Toll/Interleukin receptor) domain. Toll, interleukin receptor, and related domains seem to be involved in non-specific cellular immunity in animals. Plant TIRs are presumed to possess functional analogy to these animal proteins (3).

Journal of Biomolecular Structure & Dynamics, ISSN 0739-1102, Volume 25, Issue Number 4 (2008). ©Adenine Press (2008)

Rajagopal Chattopadhyaya1,* and Amita Pal2
1Department of Biochemistry, 2Plant Molecular & Cellular Genetics Section, Bose Institute, Calcutta 700054, India
* Phone: 91-33-2355-9544, ext 327; Fax: 91-33-2355-3886; Email: [email protected]

The conserved central NB domain was originally thought to bind either ATP or GTP (4), containing three peptide motifs critical for nucleotide binding found in many such proteins (5). Among the three motifs, the first is kinase-1a, also called the P-loop or Walker A motif, the second is kinase-2 or Walker B motif, and the third is the less conserved kinase-3a (5-7). Database searches for proteins with homology to eukaryotic cell death effectors like Ced4 in nematode (8) and Apaf-1 in human (9) identified several of these plant R proteins (10), thus resulting in a broader category named NB-ARC domains (www.sanger.ac.uk and www.ncbi.nlm.nih.gov). The NB-ARC domain is presumed to be involved in signal transduction cascades through phosphorylation/dephosphorylation events (11).

The R proteins are also similar to the Ced4 and Apaf-1 structurally with an N-terminal effector domain and C-terminal LRRs that are, like the WD-40 repeats in Apaf-1 (9), often involved in self-association of R proteins. The leucine-rich repeat is described as a pathogen protein recognition motif (12). A number of crystal structures on LRRs exist, but no experimental information is available for the plant sub-class (12).

The consistent structural arrangement of eight conserved motifs suggests that the NB-ARC domains of the plant R proteins have similar or identical biochemical functions (13). We have built three-dimensional models of this domain to gain insights into the spatial arrangements of these conserved motifs present within NB-ARC domains from five species with divergent protein sequences. The three dimensional models of these NB-ARC domains may also help us to understand the protein-protein interactions between the R protein of the host and the Avr protein of the pathogen as emphasized in ‘gene for gene’ theory (14); alternatively, it may help us to understand the role of an intermediary protein complex, which has been envisaged in ‘guard theory’ of plant disease resistance (11).

There have been few biochemical studies of these plant R proteins, e.g., a study on the tomato gene products I-2 and Mi-1 (7), and a recent mutational study on I-2 (15) presumably due to low abundance of R proteins. On the other hand, several biochemical studies have been published on Ced4 (16, 17). At first the models of these NB-ARC domains were built by employing the primary sequences of several R proteins as inputs in the program 3D-PSSM (18) in search of a suitable template. Later, the crystal structure of WD40 deleted Apaf-1 in the ADP bound form was published (19) and we realized that part of this protein structure should be used as the template for our homology models as Apaf-1 is a member of the NB-ARC family (the ‘A’ within ‘NB-ARC’ stands for Apaf-1).

The three-dimensional models were generated through multiple sequence alignment of the five chosen protein sequences with that of Apaf-1, followed by loop building and protein structure refinement. The published crystal structure of Apaf-1 is in the ADP-bound form; hence, we also built models including bound ADP. It was shown that the multiple sequence alignment automatically optimizes the ADP binding site in the protein structures by the presence of appropriate groups in the correct places, despite considerable sequence divergence, amino acid insertions, and deletions between the members of the NB-ARC family employed in the present study. Finally, reported avrRpt2-dependent phenotypes of 14 mutants of the Arabidopsis RPS2 gene (20-23) were rationalized on the basis of the presently constructed protein model.

Materials and Methods

Sequences of NB-ARC Domains From the Conserved Domain Database of NCBI


I. Arabidopsis thaliana (RPS2, gi 30173240; RPS2 like, gi 2443884; RPS2 like, gi 2244817),
II. Lycopersicon esculentum (tomato, I2C1, gi 2258315; PRF, gi 1513144),
III. Linum usitatissimum (flax, L6, gi 862904),
IV. Vigna mungo (urd-bean, VMYR1, AY297425),
V. Caenorhabditis elegans (nematode Ced4, gi 231729), and
VI. Human Apaf-1, gi 20141188.

Background for Template Search for Theoretical Modeling

The interest in the NB-ARC domains emerged from a study on the development of a Yellow Mosaic Virus resistance-linked DNA marker in Vigna mungo (24). The translated amino acid sequence described in that study was shown to possess homology with other NB-ARC domains, but with 147 residues it does not represent the full domain, as it lacks the Walker A motif or P-loop (6). One theoretical model for this (incomplete) putative V. mungo NBS domain, with PDB code 1W71, was created by one of us (A.P.) in 2004 using a low resolution crystal structure of the ATP-binding cassette (ABC) transporter protein MsbA from Vibrio cholerae (25) as template. Although structural similarity exists between ABC transporters and AAA+ proteins, as both are P-loop NTPases, dissimilarities were also noted (26). Recently, Apaf-1 has also been shown to be a member of the AAA+ proteins (19), characterized by the strand order 51432 in the central β strands; this fold is not seen in other P-loop NTPases (26). In addition, the nucleotide binding domains of Vibrio cholerae MsbA and E. coli MsbA can sample a large conformational space in the absence of nucleotide (25). The nucleotide binding site leucine rich repeat (NBS-LRR) proteins are probably localized in the cytosol (20), whereas MsbA proteins typically have a helical transmembrane domain (25); thus functionally, the role of the former is to recognize intracellular and extracellular pathogens in the cytoplasm whereas the latter are involved in transport.

Template Searches Using 3D-PSSM

At the start of the present study, however, the Apaf-1 crystal structure was not available, and hence we used eight different NB-ARC domains, from tomato (gi 2258315, gi 1513144), flax (gi 862904), Arabidopsis (gi 30173240, gi 2443884, gi 2244817), nematode Ced4 (gi 231729), and human Apaf-1 (gi 20141188), plus a slightly enlarged sequence for VMYR1 (AY297425) containing twelve additional N-terminal amino acids from the highly homologous Glycine max sequence (gi 27764544; Score bits 229; E-value 6e-60, inversely related to the level of homology), thereby introducing the Walker A motif into the query sequence. These sequences ranged from 169 to 183 amino acids and were individually used as query sequences in the program 3D-PSSM (17). The searches found various adenylate kinases (PDB codes 1ak2, 1aky, 1ake) as possible templates, and occasionally chloramphenicol phosphotransferase (PDB code 1qhx) and era (PDB code 1ega). However, these hits were rejected once the crystal structure of Apaf-1 (19) became available, since Apaf-1 is a member of the NB-ARC domain family.

Sequence Alignment and Database Search

Sequences were aligned starting at the Walker A/P-loop motif and ending with the GLPL motif to get the best possible alignment, using CLUSTAL W (ver 1.83) (27) with the default gap opening penalty (10.0) and gap extension penalty (0.1). The Gonnet 250 protein weight matrix was used. Residue numbers of Apaf-1 as found in the PDB entry 1Z6T have been included for reference, but the secondary structure elements have been renamed in Figure 1 since the CARD is not included in the plant R proteins. However, the alignment was later modified in a few places during model building so as to preserve good stereochemistry of the protein chain, and the final alignment incorporating these structural corrections is depicted in Figure 1.
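The use of Apaf-1 residue numbers as a reference along a gapped alignment can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the paper: the helper `number_columns` is a hypothetical name, the row is the first block of the Apaf-1 line of Figure 1, and the starting residue 106 is assumed from the template range (residues 106-451) given for model building.

```python
def number_columns(aligned_row, first_residue):
    """Map each alignment column to a residue number; gap columns get None."""
    numbers = []
    current = first_residue
    for symbol in aligned_row:
        if symbol == "-":
            numbers.append(None)      # a gap consumes no residue number
        else:
            numbers.append(current)
            current += 1
    return numbers


# First block of the Apaf-1 row of Figure 1 (73 alignment columns).
APAF_BLOCK_1 = ("ITSYVRTVLCEGGVPQRPVVFVTRKKLVNAIQQKLSKLKG---"
                "EPGWVTIHGMAGCGKSVLAAEAVRDHSL--")

# Assumption: the first aligned column corresponds to Apaf-1 residue 106.
numbering = number_columns(APAF_BLOCK_1, first_residue=106)

# Locate the column holding residue 160 (the P-loop lysine discussed in the text).
column = numbering.index(160)
print(APAF_BLOCK_1[column], numbering[column])
```

Under this assumption the column found for residue 160 holds a lysine, which agrees with the P-loop Lys 160 discussed in the Results.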


Selection of NB-ARC Members for Model Building

Out of eight most diverse NB-ARC members, model building was continued for only one Arabidopsis (RPS2, gi 30173240) as the two other Arabidopsis sequences were found to possess 40% sequence identity or higher in the stretch of about 170 amino acids initially chosen in the 3D-PSSM template searches. The two tomato sequences, Prf (28) and I2C1 (29), were retained as they were quite divergent from each other and dissimilar in the total number of residues. The V. mungo model was not built as its sequence information is incomplete, though used for the purpose of multiple sequence alignment (Figure 1). Flax L6 (30) and nematode Ced4 (8) sequences were retained for model building.
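A redundancy filter of this kind can be sketched as below. The paper does not state how sequence identity was computed, so the definition used here (identical columns divided by columns in which neither sequence has a gap) and the function name `percent_identity` are assumptions for illustration only. The two fragments are the P-loop stretches of I2C1 and Apaf-1 as they appear in Figure 1.

```python
def percent_identity(row_a, row_b):
    """Percent identity over aligned columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            continue                  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# P-loop fragments taken from the I2C1 and Apaf-1 rows of Figure 1.
pid = percent_identity("GMGGMGKTT", "GMAGCGKSV")
print(f"{pid:.1f}% identity")

# The 40% figure is the cut-off quoted in the text for the Arabidopsis sequences.
is_redundant = pid >= 40.0
```

Other denominators (alignment length, length of the shorter sequence) are also in common use and give different percentages, so any threshold should be read together with the definition chosen.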

Crystal Structure of Apaf-1

The final atomic model (PDB code 1Z6T) of WD40-deleted Apaf-1 contains 586 residues. It comprises five distinct domains: an N-terminal caspase recruitment domain (CARD, residues 1-107), a three-layered α/β domain (residues 108-284), helical domain I (residues 286-365), a winged-helix domain (residues 366-450), and helical domain II (residues 451-586). These stack against each other through extensive inter-domain interactions (19). A deeply buried ADP molecule serves as an organizing center to strengthen these interactions, locking Apaf-1 in a presumably inactive conformation, as Apaf-1 is known to possess higher affinity for ATP/dATP, which it hydrolyzes as part of its activity in forming the apoptosome and activating caspase-9 (19). Incidentally, the NBS-containing parts of the I-2 and Mi-1 proteins of tomato are also endowed with ATPase activity, as demonstrated earlier by Tameling et al. (7). It is envisaged that R proteins utilize the energy released during hydrolysis, thereby activating downstream signaling cascades leading to the plant defense response.
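The domain boundaries quoted above lend themselves to a simple lookup, sketched here for convenience when reading the residue-level discussion that follows. The table copies the ranges as stated in the text; the names `APAF1_DOMAINS` and `domain_of` are hypothetical, and residue 285, which falls in none of the quoted ranges, is deliberately left unassigned.

```python
# Domain ranges of WD40-deleted Apaf-1 (PDB 1Z6T) as quoted in the text.
APAF1_DOMAINS = [
    ("CARD", 1, 107),
    ("three-layered alpha/beta domain", 108, 284),
    ("helical domain I", 286, 365),
    ("winged-helix domain", 366, 450),
    ("helical domain II", 451, 586),
]


def domain_of(residue):
    """Return the domain name for an Apaf-1 residue number, or None."""
    for name, start, end in APAF1_DOMAINS:
        if start <= residue <= end:
            return name
    return None                       # e.g. residue 285, not in any quoted range


for residue in (160, 263, 299, 403):
    print(residue, domain_of(residue))
```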

Model Building Using InsightII

All five models were constructed using residues 106-451 of the 2.2 Å crystal structure of Apaf-1, i.e., the structure lacking the CARD and helical domain II (19), as the template, since NBS domains of disease resistance (R) proteins have a striking homology in the conserved motifs with those of Apaf-1 and CED-4. An alignment very close to the one displayed in Figure 1 was initially used for every model we built. As seen in Figure 1, insertions/deletions between the several sequences being compared are mainly in five loop regions between the secondary structure elements α2-β1, α5-α6, α6-β3, β5-α8, and α11-α12. Smaller insertions/deletions are seen at the start of α4, the start of α7, the end of α8, and the start of α15. Other than these insertions and deletions, most of the models were constructed using the program Homology within InsightII. The starting positions of the loops were manually chosen to follow those in Apaf-1 as far as possible on the graphics terminal with stereo viewing for the aforementioned sections, and the stereochemistry was regularized by subsequent constrained refinement using the program Discover within InsightII. Portions of these models derived from the crystal structure using the program Homology were retained as far as possible and held fixed during refinement cycles using the program Discover, with refinement performed only in the regions of insertions/deletions. Side chains in the structurally conserved regions were allowed to move once the insertions/deletions, usually in loops, were refined to reasonable positions. Bad bond lengths, bond angles, torsion angles, and omega values at various stages of refinement were identified using the program Prostat within the Homology module, and those sections of the proteins were refined again until an acceptable structure resulted. The refined model was superimposed on the template Apaf-1 at this stage, and the positions of the ADP atoms were copied for inclusion in the new structure. Usually no bad contacts resulted upon inserting the ADP. When bad contacts involving side chains of the modeled structure were noticed, those side chains were refined again holding the ADP fixed.
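As a rough illustration of the kind of geometry screening described above, the sketch below flags consecutive Cα atoms whose separation departs from the roughly 3.8 Å expected across a trans peptide bond. It is not the Prostat procedure: the function name, the tolerance of 0.2 Å, and the coordinates are all hypothetical, and a real check would also cover bond lengths, bond angles, and omega values.

```python
import math

IDEAL_CA_CA = 3.8   # approximate Cα-Cα distance across a trans peptide (Å)
TOLERANCE = 0.2     # assumed tolerance for this sketch (Å)


def ca_distance_outliers(ca_coords, ideal=IDEAL_CA_CA, tol=TOLERANCE):
    """Return (i, i+1, distance) for consecutive Cα pairs outside ideal ± tol."""
    outliers = []
    for i in range(len(ca_coords) - 1):
        d = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(d - ideal) > tol:
            outliers.append((i, i + 1, d))
    return outliers


# Hypothetical Cα trace: the second pair is stretched to 5.0 Å.
trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 5.0, 0.0)]
print(ca_distance_outliers(trace))
```

A cis peptide (Cα-Cα near 2.9 Å) would also be flagged by this test, so flagged pairs are candidates for inspection rather than certain errors.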


Results and Discussion

Description of the Models

As discussed in the methods section, all five protein structure models were derived from the three central domains of the Apaf-1 crystal structure: the three-layered α/β domain (residues 108-284), helical domain I (residues 286-365), and the winged-helix domain (residues 366-450). Figure 1 shows the multiple alignment of all five sequences together with the template. Alpha carbon stereo diagrams of three of our five models are displayed superimposed on Apaf-1 in Figure 2.

In Figure 2, the three-layered α/β domain comprising secondary structural elements up to β5 (Figure 1) is seen in the top half above the ADP molecule. Helical domain I appears at the bottom right below the ADP and the winged-helix domain at the bottom left. The winged-helix domain is mainly helical; its three β strands β6-β7-β8 are towards the center and near the ADP. The CARD domain in Apaf-1 was shown to stack against the α/β domain and the winged-helix domain (19); being absent in plant R proteins, it would have to be imagined on the left side of the molecule in Figure 2, towards the viewer.

The three-layered α/β domain contains a parallel-stranded β sheet comprising five β strands. In Figure 2, β2 is seen closest to the viewer and β5 furthest. As noted earlier, the strand order is 51432 in these AAA+ proteins. The polypeptide between β2 and β3 shows the most variation in this domain (Figs. 1, 2). In contrast, the polypeptide between β1 and β2 containing the P-loop region shows the lowest variation (Figs. 1, 2), and the variation exhibited is away from the ADP. The other two regions, between β3 and β4 and between β4 and β5, show intermediate variation among the various proteins (Figs. 1, 2), not so much as insertions/deletions but as differences in structure.

Helical domain I is the smallest of the three domains in our models and is structurally more conserved (Fig. 2), differences being found beginning at Apaf-1 residues 299 and 352, regions away from the ADP. This domain is a bundle of four helices (Figs. 1, 2).

Structurally the most conserved is the C-terminal winged-helix domain, the only significant difference in main chain conformation being around Apaf-1 residue 403 (Fig. 2).

Several conserved motifs in the NBS region were identified by analyzing 481 plant R proteins by Meyers et al. (13). These motifs were named as P-loop/Kinase-1, RNBS-A, Kinase-2, RNBS-B/Kinase-3, RNBS-C, GLPL, RNBS-D. In our alignment, these are respectively located at the start of α3, β2, end of α6, β4, β5 to α8, α10, α13, and the following loop (Fig. 1). Conserved regions observed by Meyers et al. (13) are not obvious in Figure 1 as we have aligned TIR, non-TIR, nematode, and human sequences together. Of these, P-loop/Kinase-1, RNBS-A, RNBS-B/Kinase-3, RNBS-C, and GLPL are all close in space to the ADP, while Kinase-2 and RNBS-D are not that close (Fig. 2).

The side chain of Leu 140 in Apaf-1 is surrounded by two neighboring residues at/following α2, Ile 136 and Leu 143; by Ala 167, Val 168, Leu 173, and Phe 178, residues from α3/α4; and by Leu 239 from β3 (Fig. 1). Most of these are favorable van der Waals contacts and many of these are maintained in other NB-ARC domains. Gly 154, Gly 157, and Gly 159 were all found to possess positive values (162°, 77°, 103°) of phi, the main chain torsion angle, in the Apaf-1 structure. These glycines have to be conserved as other amino acids rarely adopt positive values of phi. They are also part of the P-loop or Walker A motif. Lys 160, also part of the P-loop, is conserved as it provides a hydrogen bond to the β phosphate of ATP/ADP (Fig. 3). The two remaining conserved residues, Thr 263 and Arg 265, are both close to the putative position of the γ phosphate of ATP and are, therefore, predicted to take an active part in the kinase activity of these R proteins.

          110       120       130       140          150       160       170
          |         |         |         |            |         |         |
I2C1  -ISTKQETRTPSTSLVDDSGIFGRKNEIENLVGRLLSMDTKRKNLAVVPIVGMGGMGKTTLAKAVYNDERVQK
Prf   -TTTYVAPSFSAYTQRANEEMEGFQDTIDELKDKLLGGSP---ELDVISIVGMPGLGKTTLAKKIYNDPEVTS
L6    VSADIWSHISKENLILETDELVGIDDHITAVLEKLSLDSE---NVTMVGLYGMGGIGKTTTAKAVYN--KISS
VMYR                                                                  VYN--SIAD
RPS2  AIKTDGGSIQVTCREIPIKSVVGNTTMMEQVLEFLSEEE----ERGIIGVYGPGGVGKTTLMQSINN-ELITK
CED4  -SRQMLDRKLLLGNVPKQMTCYIREYHVDRVIKKLDEMCD--LDSFFLFLHGRAGSGKSVIASQALSKSDQLI
APAF  ITSYVRTVLCEGGVPQRPVVFVTRKKLVNAIQQKLSKLKG---EPGWVTIHGMAGCGKSVLAAEAVRDHSL--
sstr  |---α1------|---------|-----α2------|-------|-β1---|---|----α3-----|--α4-
      Motifs in this block: P-LOOP

            180        190       200       210                 220       230
            |          |         |         |                   |         |
I2C1  HFGLTAWFCVSEAYDAFRITKGLLQEIGSTDLKADDNLNQLQVKLKADDNLNQLQVKLKEKLNGKRFLVVLDD
Prf   RFDVHAQCVVTQLYSWRELLLTILNDVLEPSDRNEK-----------EDGEIA--DELRRFLLTKRFLILIDD
L6    CFDCCCFIDNIRETQEKDGVVVLQKKLVSEILRIDSGS-------VGFNNDSGGRKTIKERVSRFKILVVLDD
VMYR  DFDSSCFLQNVREESNKHGLKHLQSLLLSEILGEND---------IILASVQRGISVIQQRLGRKKVLLILDD
RPS2  GHQYDVLIWVQMSREFGECTIQQAVGARLGLSWDEK-----------ETGENR-ALKIYRALRQKRFLLLLDD
CED4  GINYDSIVWLKDSGTAPKSTFDLFTDILLMLKSEDD-----------LLNFPSVEHVTSVVLKRMICNALIDR
APAF  LEGCFPGGVHWVSVGK-QDKSGLLMKLQNLCTRLDQDES-------FSQRLP---LNIEEAKDRLRILMLRKH
sstr  -α4--|-|--β2---|-|--------α5-------|-------------------|------α6--------|
      Motifs in this block: RNBS-A, Kin-2

            240       250        260       270       280       290       300
            |         |          |         |         |         |         |
I2C1  VWNDNYPEWDDLRN--LF-LQGDIGSKIIVTTRKESVALMMDSGAIYMGI---LSSEDSWALFKRHSLEHKDP
Prf   VW--DYKVWDNLCM--CF-SDVSNRSRIILTTRLNDVAEYVKCESDPHHL-RLFRDDESWTLLQKEV--FQGE
L6    VD--EKFKFEDMLG--SP-KDFISQSRFIITSRSMRVLGTLNENQCKLYEVGSMSKPRSLELFSKHA--FKKN
VMYR  VD--NRKQLQAFAG--RS-DWFGPGSRVIVTTRDEQLLKSHEIERTYEVE--ELNDNDSLQLLIWNA--FKRE
RPS2  VWEEIDLEKTGVPR-----PDRENKCKVMFTTRSIALCNNMGAEYKLRVE--FLEKKHAWELFCSKVW-RKDL
CED4  PN--TLFVFDDVVQEETIRWAQELRLRCLVTTRDVEISNAASQTCEFIEV-TSLEIDECYDFLEAYG---MPM
APAF  PR--SLLILDDVWDSWVL-KAFDSQCQILLTTRDKSVTDSVMGPKYVVPVESSLGKEKGLEILSLFVN-MKKA
sstr  ----|---β3--|---α7---|---|-β4--||-- ---|---|-β5--|---|-----α8-------|----
      Motifs in this block: RNBS-B, RNBS-C

           310       320       330       340       350            360       370
           |         |         |         |         |              |         |
I2C1  KEHPEFEEVGKQIADKCKGLPLALKALAGMLRSKSEVDEWRNILRSEIWELP-------SCSNGILPALMLSY
Prf   SCPPELEDVGFEISKSCRGLPLSVVLVAGVLKQKKKTLDSWKVVEQSLSSQR------IGSLEESISIIGFSY
L6    TPPSYYETLANDVVDTTAGLPLTLKVIGSLLFKQEIAVWEDTLEQLRRT----------LNLDEVYDRLKISY
VMYR  KVDPRYEDVLNRS
RPS2  LESSSIRRLAEIIVSKCGGLPLALITLGGAMAHRETEEEWIHASEVLTRFPA-----EMKGMNYVFALLKFSY
CED4  PVGEKEEDVLNKTIELSSGNPATLMMFFKSCEPKTFEKMAQLNNKLESRGLVGVECITPYSYKSLAMALQRCV
APAF  DLPEQAHSIIKECKGSPLVVSLIGALLRDFPNRWEYYLKQLQNKQFKRIRKS-----SSYDYEALDEAMSISV
sstr  --|---α9-----||-----α10-------||----α11----|----------------|-----α12----
      Motifs in this block: GLPL

              380       390       400           410       420       430       440
              |         |         |             |         |         |         |
I2C1  NDLP-AHLKQCFAYCAIYPKDYQFRKEQVIHLWIANGLVHQFHSGNQYFIELRSRSLFEMASEPSERDVEEFL
Prf   KNLP-HYLKPCFLYFGGFLQGKDIHVSKMTKLWVAEG---FVQANNEKGQEDTAQGFLDDLIGRNVVMAMEKR
L6    DALN-PEAKEIFLDIACFFIGQNK-EEPYYMWTDCN----FYPASNIIFLIQRCMIQVGDDDEFKMHDQLRDM
RPS2  DNLESDLLRSCFLYCALFPEEHSIEIEQLVEYWVGEG---FLTSSHGVNTIYKGYFLIGDLKAACLLETGDEK
CED4  EVLS-DEDRSALAFAVVMPPGVDIPVKLWSCVIPVD-----ICSNEEEQLDDEVADRLKRLSKRGALLSGKRM
APAF  EMLR-EDIKDYYTDLSILQKDVKVPTKVLCILWDME----TEEVEDILQEFVNKSLLFCDRNGKSFRYYLHDL
sstr  --|---|--α13---| -|---|β6|--α14---|---α15-------------|-|-β7--||-β8--|--
      Motifs in this block: RNBS-D, MHD

               450     Protein    PDB
               |       seq code   code
I2C1  MHDLVNDLAQI      AAB63274   2FT4
Prf   PNTKVKTCRIH      AAF76308   2B84
L6    GREIVRREDVL      AAD25973   2B85
RPS2  TQVKMHNVVRS      Q8LKZ8     2FT5
CED4  PVLTFKIDHII      CAA48781   2G2M
APAF  QVDFLTEKNCS      Q9UJ66     1Z6T
sstr  --α16----|

Figure 1: Multiple sequence alignment of two tomato (I2C1, PRF), one flax (L6), one Vigna mungo (VMYR), one Arabidopsis (RPS2), one nematode (CED4), and one human (Apaf-1) NB-ARC domains obtained by running the program CLUSTALW with default parameters. Residue numbers displayed are for Apaf-1, whose crystal structure is available in PDB entry 1Z6T. Secondary structure elements displayed have been renamed for the five models constructed in the present study, with conserved R protein motifs marked, and protein sequence and PDB codes given at the lower right. It is remarkable that despite low sequence identity between the proteins, this alignment predicts deletions and insertions mostly in loop regions. Positions marked in red show perfect conservation or identities among the aligned proteins; similarities, including positions where all proteins possess charged or polar residues, are colored blue, and positions with hydrophobic residues are colored green.

Figure 2: Alpha-carbon superposition in stereo of three of the five models constructed with the Apaf-1 crystal structure in light blue. The tomato I2C1 protein structure (2FT4) is in red, the Arabidopsis RPS2 structure (2FT5) is in deep blue, and the nematode CED4 structure (2G2M) is in green, while only one common ADP molecule, in yellow, is displayed. The models for tomato PRF (2B84) and flax L6 (2B85) have been omitted for simplicity in this figure. Numbered residues are at N-terminal ends of stretches where the various superimposed proteins are significantly divergent in structure from each other, e.g., at residues 144, 190, 207, 236, 299, 352, 403 for the Apaf-1 protein in light blue. As seen in Figure 1 and here, these significantly divergent residues are in loop regions.
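The phi values quoted for the P-loop glycines are main chain torsion angles about the N-Cα bond, defined by the atoms C(i-1), N(i), Cα(i), and C(i). A minimal sketch of how such a signed torsion can be computed from four atomic positions is given below; the function names and the test coordinates are hypothetical and are not taken from the Apaf-1 structure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical positions standing in for C(i-1), N(i), CA(i), C(i).
c_prev, n, ca, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
phi = dihedral(c_prev, n, ca, c)
print(round(phi, 1), "positive" if phi > 0 else "negative")

# The three glycine phi values quoted in the text are all positive.
quoted_phi = {"Gly 154": 162.0, "Gly 157": 77.0, "Gly 159": 103.0}
all_positive = all(value > 0 for value in quoted_phi.values())
```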

Why do R Proteins Bind ADP and not GDP?

Incubation of caspase-9 with Apaf-1 improved its catalytic activity about fourfold, while the addition of ATP further enhanced the caspase-9 activity by about two orders of magnitude; further, ATP/dATP hydrolysis is required for caspase-9 activation, since non-hydrolysable analogues did not work (19). ATP/dATP is needed by Apaf-1 to form an oligomeric complex called the apoptosome (19). In the case of the tomato R protein I-2, it was demonstrated by Tameling et al. (7) that it binds specifically to adenosine nucleotides rather than CTP, GTP, or UTP. In a recent study, Pal et al. (31) have shown the presence of a conserved G protein receptor signature within the non-TIR subfamily of NB-ARC domains in members of the Fabaceae family. However, such motifs were not found within the three non-TIR sequences, namely I2C1, Prf, and RPS2, employed in this investigation. Moreover, Meyers et al. (13) had pointed out that the plant R genes lack the G-4 region typical of the GTPase superfamily and therefore the R proteins may bind ATP rather than GTP. Combining our models with a study on recognition templates for predicting adenylate-binding sites in proteins (32), it can be predicted with confidence that all these plant R proteins prefer adenosine nucleotides, for the following reasons.

Most of the interactions of the ADP molecule in our I2C1 model are displayed in Figure 3. The exocyclic N6 amino group of ADP hydrogen bonds to the carbonyl oxygen of Phe 21. With GTP or any other guanine-based nucleotide, this bond could not be formed, since guanine has the exocyclic O6 atom in place of the N6 of adenine. Zhao et al. (32) described nucleotide binding sites within 31 protein complexes by means of their consensus affinity maps; they found that in adenylate, the N6 amino group shows a strong minimum. In addition, any guanine nucleotide contains an additional exocyclic group, the N2 amino group, at the C2 atom. This additional group would clash with the main chain of Ile 20 and the side chain of Leu 201 (Fig. 3); hence the existing loop structure would have to be perturbed for GTP binding. The calculated discrimination between adenine and guanine was due in large part to the strongly unfavorable clashes of the guanine N2 position in adenine-binding proteins (32). Further, the steric contact used to exclude guanine is provided by entirely different combinations of amino acid residues in different ATP- and NAD-binding proteins (32). The active site is highly conserved around the base-pairing face of the base (32), and this is the case for our models, as the region GKTT is on the base-pairing face of the ADP (Fig. 3). Thus the experimental observation of Tameling et al. (7) may now be generalized to all these NB-ARC domains in the light of our models and the analysis above.
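The two criteria used above to exclude guanine (no hydrogen-bond donor at position 6, and an extra exocyclic group at C2) can be written down as a small rule check. The following is only an illustrative sketch: the class, field, and function names are hypothetical, and the rules are a qualitative encoding of the argument in the text, not an energy calculation or the affinity-map method of Zhao et al. (32).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PurineBase:
    """Exocyclic groups of a purine base (hypothetical helper type)."""
    name: str
    c6_group: str            # exocyclic group attached to C6
    c6_donates_hbond: bool   # True if that group can donate a hydrogen bond
    c2_group: Optional[str]  # exocyclic group attached to C2, if any


ADENINE = PurineBase("adenine", "N6 amino", True, None)
GUANINE = PurineBase("guanine", "O6 carbonyl", False, "N2 amino")


def check_adenine_pocket(base: PurineBase) -> Tuple[bool, List[str]]:
    """Apply the two qualitative criteria from the text to a purine base."""
    problems = []
    # Criterion 1: the Phe 21 carbonyl oxygen is an acceptor and needs a donor.
    if not base.c6_donates_hbond:
        problems.append(
            f"{base.c6_group} cannot donate a hydrogen bond to the Phe 21 carbonyl oxygen"
        )
    # Criterion 2: any exocyclic group at C2 is assumed to clash with the loop.
    if base.c2_group is not None:
        problems.append(
            f"{base.c2_group} at C2 would clash with Ile 20 main chain and Leu 201 side chain"
        )
    return (not problems, problems)


for base in (ADENINE, GUANINE):
    accepted, problems = check_adenine_pocket(base)
    print(base.name, "accepted" if accepted else "rejected", problems)
```

Under these assumptions adenine passes both checks, while guanine fails both, mirroring the reasoning given for the I2C1 pocket.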

Validation of the Model Using RPS2 Mutagenesis Data

The Arabidopsis R gene, RPS2, confers resistance to Pseudomonas syringae strains that express the avrRpt2 gene (33, 34). It has been demonstrated that the AvrRpt2 type III effector avirulence protein is transported into plant cells, where the pathogen-derived avirulence factor interacts with the RPS2 protein-mediated surveillance system (20). This protein-protein interaction triggers a defense system that acts via its interaction with RIN4, and probably triggers the plant resistance when RIN4 is degraded by AvrRpt2 (1).

In the mutant R2M4, a lysine residue of the NB-ARC domain of RPS2 that is highly conserved in P-loops (5) was replaced with a leucine residue (K188L, Table I; represented as K54L in Fig. 4). According to Meyers et al. (13), two threonine residues (positions 189 and 190) are highly conserved, whereas Traut (5) stated that the residues at these positions may be either serine or threonine. Two amino acid substitution mutants in which serine was substituted for threonine, R2M4a (T189S) and R2M4b (T190S), were found to be as active as the wild type RPS2 (23). In contrast, the mutant R2M4c (T189S and T190S, represented as T55S and T56S, colored white above adenine in Fig. 4), when tested for its activity in the transient expression assay, was found to be slightly weaker than the wild type (23). This evidence shows that replacing either of the conserved threonine residues with a serine residue does not seem to affect RPS2 activity, but replacing both of them with serines may weaken the resistance conferred by RPS2 of Arabidopsis against P. syringae.

In the present investigation we analyzed these mutants on the basis of the proposed model structure. It is evident from Figure 3 that the ADP molecule is held in a pocket in the P-loop region of these NB-ARC domains by multiple noncovalent bonds. Hence, it is not surprising that, of the three positions mutated in the P-loop region, i.e., K54L, T55S, and T56S (marked in white above the ADP in Fig. 4), only K54L abolished disease resistance to P. syringae, while the single threonine-to-serine substitutions were tolerated (Table I), as the lack of one interaction is compensated by the remaining interactions; none of these mutations displays a dominant negative effect.

Axtell et al. (20) have demonstrated that the rps2-204C and rps2-205C mutants are loss-of-function mutants raised by mutagenic treatment; these alleles of RPS2 completely lost their resistance against P. syringae, while related R genes remained functional. This further supports the gene-for-gene specificity and the evidence put forward by Leister and Katagiri (21) for RPS2-AvrRpt2. Structural analysis of the RPS2 mutant 204C reveals that position 142 is at the back of the RPS2 molecule, in the top half, as shown in Figure 4. Owing to the paucity of residues between the secondary structural elements β3 and β4 (α7 is lacking in RPS2, vide Fig. 1), proline is essential in RPS2 for providing a sharp turn; introducing leucine in place of proline probably renders the protein functionally inactive.

Table I Natural and induced mutants of Arabidopsis thaliana with mutated amino acid residue(s) within the RPS2 locus and structural explanation of their avrRpt2-dependent phenotypes based on the 3-D model of the RPS2 protein.

| Natural/induced mutant* | Geographic origin of the germplasm | Original aa residue/aa position/mutated aa$$ | Phenotype** (Reference No.) | Structural explanation for the observation based on the RPS2 protein model |
|---|---|---|---|---|
| BG-4, Po-1, Zu-0, Knox-2 | Seattle, USA; Germany; Switzerland; Indiana, USA | S419P, H439N, H472Y | S (22) | S419 is part of β6 and Pro affects β6-β8 bonding; H439 and H472 are part of the RPS2-AvrRpt2 interaction surface |
| Wu-0 | Germany | S419P, H439N | mR (22) | See above |
| Pog-0 | Canada | S154Y | sR (22) | Side chain covered, not important in RPS2-AvrRpt2 interactions |
| R2M4 | Columbia, SC, USA$ | K188L | S (23) | Part of the crucial P-loop, H-bonds to ADP/ATP |
| R2M4a | do | T189S | R (23) | Ser can substitute for Thr; multiple interactions hold ADP/ATP (Fig. 3) |
| R2M4b | do | T190S | R (23) | do |
| R2M4c | do | T189S & T190S | mR (23) | do |
| 204C | do | P276L | S (20) | RPS2 has a lower number of residues between β3 and β4 (Fig. 1) and uses this Pro for a sharp turn |
| I>K | do | I353K | R (20) | Insignificant change in protein surface |
| 205C | do | A421V | S (20) | Introduces additional bulk at the α13-π helix turn and destabilizes the structure |
| 209C | do | A456T | S (20) | Fully exposed and probably part of the RPS2-AvrRpt2 interaction, near H439 and H472 |

*Amino acid substitution mutants; $mutants of Col-0 background, origin Columbia, SC, USA; **avrRpt2-dependent phenotypes: S, susceptible; mR, mildly resistant; R, resistant; sR, strongly resistant; $$note that residue numbers shown in the above table are 134 more than in our PDB entry 2FT5 for the same residue, e.g., S419P, H439N, H472Y, and K188L above are referred to as S285P, H305N, H338Y, and K54L, respectively, in our model, text, and figures.

Figure 3: Ribbon diagram of a portion of our I2C1 model in PDB entry 2FT4, with nonhydrogen atoms of the various side chains displayed in stereo, containing the bound ADP molecule and its immediate neighbors in a viewing direction as in Figure 2. The protein backbone is marked by the red ribbons, with the following color codes for the various atom types: green for carbon, blue for nitrogen, red for oxygen, yellow for sulfur, and pink for phosphorus. The central ADP molecule is displayed in ball and stick. Interatomic distances for hydrogen bonds and van der Waals contacts with various atoms of ADP are displayed in Angstroms, joined by dotted green lines. The adenine base is held between the side chains of Thr59, Ile20, Cys229, Leu201, and Asp17 (these I2C1 positions correspond to Apaf-1 positions 162, 126, 321, 294, and 123, respectively); in addition, its exocyclic N6 shows hydrogen bonding with the main chain carbonyl oxygen of Phe21. The ribose of ADP has a van der Waals contact with the Lys 230 side chain. The α-phosphate hydrogen bonds with the main chain amide and the side chain Oγ of Thr58, and the β-phosphate hydrogen bonds with Lys57 Nζ and its main chain amide. Thus, the region KTT (part of the P-loop) is directly involved in holding the ADP molecule. In CED4 and Apaf-1, however, the KTT is replaced by the sequence KSV (Figure 1), which is fully consistent with the above bonding scheme, as the same bonds can be maintained.

Figure 4: The RPS2 structure (2FT5) is shown in stereo as a deep blue ribbon, generally omitting side chains. Side chains are displayed for some positions where mutations have been experimentally tested for disease reactions against Pseudomonas syringae, as summarized in Table I. Residues mutated by site-directed mutagenesis, i.e., K54L, T55S, T56S, P142L, I219K, A287V, and A322T, are colored white. Natural mutant positions like S20Y, S285P, H305N, and H338Y are colored light green, while ADP is shown in yellow. Except for residues K54, T55, and T56, all other mutation sites tested are generally on the surface of the molecule. Maintenance or loss of function observed for these mutations can be rationalized on the basis of this model for RPS2. Note that residue numbering in our PDB entry is 134 less than in the protein sequence, e.g., S20Y is shown as S154Y in Table I. Several conserved sequence motifs are labeled in the right-hand molecule of the stereo pair. The residue labels may not be clear unless viewed in stereo, particularly the ones at the back of the molecule, owing to the depth cueing used.
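Table I gives residue numbers in full-sequence numbering, whereas the RPS2 model (2FT5), the text, and Figure 4 use numbers that are 134 lower. A minimal sketch of that conversion follows, assuming mutations are written as single-letter strings such as K188L; the function name, constant name, and regular expression are illustrative assumptions and are not part of the original study.

```python
import re

# Offset between full-sequence numbering (Table I) and the numbering
# used in the RPS2 model, as stated in the Table I footnote.
RPS2_OFFSET = 134

# Single-letter mutation notation: wild-type residue, position, new residue.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def to_model_numbering(mutation: str, offset: int = RPS2_OFFSET) -> str:
    """Convert a mutation such as 'K188L' to model numbering ('K54L')."""
    match = MUTATION_PATTERN.match(mutation)
    if match is None:
        raise ValueError(f"not a single-residue mutation: {mutation!r}")
    wild_type, position, mutant = match.groups()
    model_position = int(position) - offset
    if model_position < 1:
        raise ValueError(f"{mutation!r} lies before the start of the model")
    return f"{wild_type}{model_position}{mutant}"


for mutation in ("S419P", "H439N", "H472Y", "K188L", "A421V"):
    print(mutation, "->", to_model_numbering(mutation))
```

Applied to the footnote examples, this reproduces the pairs S419P/S285P, H439N/H305N, H472Y/H338Y, and K188L/K54L.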

In the mutant I>K, the I219K substitution, in the middle of α10 and seen slightly below the ADP in Figure 4, had no effect on the disease resistance phenotype. The isoleucine side chain is partly exposed, but changing it to lysine did not affect the structure, particularly as the side chains of Lys 188 and Lys 312 are already in proximity.

The mutant residues of rps2-205C (represented by A287V) and of rps2-209C (represented by A322T) are shown in white near the left and the center of the protein in Figure 4, respectively. Ala 287 is partly exposed and in a pocket surrounded by the Met 337 carbonyl oxygen and the Val 341 side chain, so the A287V mutation introduces additional bulk at the border of α13 and the π helix, and may not be tolerated structurally; this is probably the reason for the loss of disease resistance against P. syringae. Ala 322 is fully exposed, at the turn between β7 and β8. The A322T mutation is probably detrimental to the RPS2-AvrRpt2 interaction.

One of the natural mutations, H305N, in the middle of α15, produces susceptible variants and is very close in space to the A322T mutation discussed just above, suggesting that this is the surface involved in interacting with AvrRpt2. Likewise, another natural mutation, H338Y, is close in space to the A287 position, and both are found to abolish disease resistance against P. syringae. Both of these mutant residues are on the left side of the molecule as shown in Figure 4, and comprise a second probable surface for the interaction. The natural S419P mutation in some variants resulted in susceptibility to P. syringae (vide Table I). This might have happened because the Ser (shown at position 285 in Fig. 4) is at the start of β6, and mutating it to Pro probably disturbed the original protein structure. The natural S154Y mutation (shown at position 20 in Fig. 4) does not result in a loss-of-function phenotype, since the side chain is not exposed and points inwards; it is guarded by the Ser 187 side chain and therefore cannot play a significant role in the interaction with P. syringae.

Thus, it is evident from the above structural analyses that the phenotypic responses of these induced and natural mutants of Arabidopsis against P. syringae can be rationalized using the present 3D structure of RPS2.

Validation of the Model with I-2 Mutagenesis Data

Some mutants were studied for the tomato protein I-2 (7, 15, 35), which is not the same as, but is homologous to, a complete variant, I2C1, used in our study. The first mutation studied in I-2, K207R, was in the crucial Walker A or P-loop region (Apaf-1 position K160); it showed strongly reduced ATP binding and hydrolysis (7) and results in a loss of biological function (15). Another position, S233 in I-2, differs in I2C1, which has a D there. This position is seen to have S, T, D, Q, or E in other NB-ARC domains instead of Apaf-1 G188 (at the turn at the end of β2, Fig. 1). Thus, the S233F mutation (15) would introduce an abnormal aromatic residue in this RNBS-A motif. It is seen from the Apaf-1 crystal structure that introducing a phenylalanine side chain at G188 would disturb D244, believed to act as a catalytic base in ATP hydrolysis (36), and hence open up the molecule in the ATP form via D392 in the winged helix domain. Since we find that the active or ATP-bound form of these NB-ARC domains is likely to have a different disposition for the winged helix domain from that in the ADP form (see comparison of Ced4 structures, Fig. 5), this S233F mutation would result in autoactivation, as observed (15). The position of the I-2 mutation D283E (15), corresponding to Apaf-1 position D244, is also an aspartate in I2C1. The autoactivation observed for this D283E mutant (15) can also be rationalized by a pathway similar to that in S233F. Another mutation, D495V in the MHD motif, at the interface of the winged helix domain and the 3-layered α/β domain, triggered a cell death response (35) by autoactivation. This autoactivation can be caused by introducing a different relative position of the winged helix domain by altering an interfacial residue, as predicted for all other cases of autoactivation (see below).

Validation of the Model Using Prf Mutagenesis Data

In tomato Prf protein, Salmeron et al. (28) studied the effect of one mutation T1230A corresponding to Apaf-1 residue T264 in the highly conserved kinase-3 or RNBS-B motif. This position tolerates only threonine or serine at the end of β4 (Figure 1); hence, the introduction of alanine can be expected to disrupt normal function, as observed (28).

Comparison of the Model with L6 Mutagenesis Data

There are seven known mutations studied for L6 (36, 37), but all of these are located near the C-terminus at β8, in the MHD region, which is perhaps not properly aligned among the chosen seven sequences using Clustal W. If one were to manually align the MHD (present only in I2C1 and L6 among the seven proteins, as seen in Fig. 1) and introduce breaks, it should align with LHD at residues 437-439 of Apaf-1 (36). The MHD region is predicted to be in the loop between β8 and α11 (Fig. 1), spatially close to the sugar and phosphate groups of ADP, thus justifying its role as an important motif. Our present models did not use such a manual alignment forcing the alignment of the MHD regions in I2C1 and L6, but it can be seen that all seven mutations, i.e., M539K, H540A, and five mutations at D541 (37), are in the winged helix domain. Of these, the first mutation, M539K, corresponds to L437 of Apaf-1 (36); thus, M539 should be part of the hydrophobic core of the winged helix domain in L6. The M539K mutation should therefore disturb the domain structure and hence result in loss of function, as observed (37). H540A and the five mutations of the type D541X (37) all result in a change in the interface of the winged helix domain with the 3-layered α/β domain (and with the ADP as well for H540A); hence, autoactivation happens.

Comparison of Ced4 Model with a Crystal Structure in Ced4-Ced9 Complex

The present authors were unaware of the publication of the Ced4-Ced9 complex crystal structure (38) while building their Ced4 model based on Apaf-1. The complex crystal has one molecule of Ced9 binding to an asymmetric dimer of Ced4, with both Ced4 molecules bound to ATP and associated magnesium ions. Afterwards we compared our model in 2G2M with the two Ced4 molecules in the crystal structure entry 2A5Y, which are very similar to each other (38). Our Ced4 in 2G2M is superimposed with one Ced4 molecule from entry 2A5Y in Figure 5. The three-layered α/β domain (residues 1-180 in 2G2M, till β5) is seen to superimpose quite well in Figure 5a, which shows an alpha-carbon stereo view, the r.m.s.d. in 142 aligned Cα positions being 1.5 Å. Helical domain I is also seen to retain an extremely similar structure, though shifted by about 5 Å from its predicted location in our model (Figure 5a). Superimposing helical domain I alone between 2G2M (residues 183-196) and 2A5Y, an r.m.s.d. of 2.1 Å in 35 Cα positions is found, though this superimposition is not shown in Figure 5. In contrast, the C-terminal winged helix domain seems totally different between 2G2M and 2A5Y at first glance (Figure 5a). However, on closer examination, it is seen that the winged helix domain in 2G2M needs a proper rotation and translation for comparison with that in 2A5Y (Figure 5b). With such a rotation and translation, the r.m.s.d. in 83 aligned Cα positions is 1.8 Å. Thus, the relationship between the Ced4 and Apaf-1 structures, both members of the NB-ARC family, has been clarified here for the first time; it was not elucidated explicitly by Yan et al. (38). Singleton et al. (39) reported the structure of a related protein within the AAA+ family, Cdc6/ORC from Aeropyrum pernix, in five different crystal forms, three in complex with ADP and two in complex with a non-hydrolysable ATP analogue, ADPNP. They found that domains I and II of all five crystal forms superimposed quite well with each other and with another ORC2 structure, but domain III did not (39). They suggested a rigid-body rotation of domain III for Cdc6/ORC relative to the rest of the protein, as also seen for the winged helix domain of Ced4 in Figure 5.
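The r.m.s.d. values quoted above are root-mean-square distances over paired Cα atoms after superposition. As a minimal sketch of that final step only, the function below computes the r.m.s.d. for two equally long lists of already superposed coordinates; finding the optimal rotation and translation (for example with a least-squares fitting method) is not shown, and the coordinates in the example are made up for illustration rather than taken from 2G2M or 2A5Y.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square distance between paired, already superposed atoms."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must contribute the same number of atoms")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two atoms of this pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical Cα coordinates in Å, for illustration only.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
crystal = [(0.0, 0.0, 0.0), (3.8, 0.0, 3.0), (7.6, 0.0, 0.0)]
print(round(rmsd(model, crystal), 3))  # prints 1.732
```

Because one displaced atom is averaged over all pairs, a single 3 Å deviation among three pairs gives about 1.7 Å; the same averaging explains how localized loop differences can coexist with a low overall r.m.s.d.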

Table II Structural explanation of the effect of mutations induced in potato R protein Rx observed by Bendahmane et al. (40).

| Mutation(s) induced, motif | Phenotype | Structural explanation for the observation based on our study and the Apaf-1 crystal structure |
|---|---|---|
| G175A + K176A, Walker A | Failed to show HR in an Agrobacterium infiltration assay (loss of function) | Equivalent Apaf-1 positions G159 and K160 are totally conserved (Fig. 1) and part of the Walker A motif. These are needed for the proper structure of the ADP/ATP binding region and for maintaining hydrogen bonds (Fig. 3). |
| D244A + D245A, Walker B | do | Equivalent Apaf-1 positions D243 and D244 are part of the Walker B motif. The first D plays a role in indirect coordination of Mg2+, whereas the second is thought to be a catalytic base for hydrolysis. |
| G330A + P332A, GLPL | do | Equivalent Apaf-1 positions G319 and P321 are needed for maintaining the structure of this region, which interacts with the adenine base of ADP/ATP. |
| F393I, RNBS-D | Shows HR in an Agrobacterium infiltration assay (constitutive gain-of-function) | Equivalent Apaf-1 position L386, part of RNBS-D; the side chain forms part of the hydrophobic core of the winged helix domain. This mutation probably also results in a rotation of the altered winged helix domain with respect to the other domains and hence autoactivation. |
| D399V, RNBS-D | do | Equivalent Apaf-1 position D392, which is on the surface of the winged helix domain but points towards the 3-layered α/β domain. Thus changing this residue leads to a rotation of the winged helix domain correlated with autoactivation. |
| E400K, RNBS-D | do | Equivalent Apaf-1 position V393, which is on the winged helix domain surface and near the 3-layered α/β domain. Same explanation as D399V above. |
| D460V, MHD | do | Equivalent Apaf-1 position D439; mutation at the winged helix domain interface with other domains. Same explanation as with D399V and E400K. |

Figure 5: (a) Our Apaf-1-based Ced-4 model in blue (PDB entry 2G2M) is superimposed in stereo on one Ced-4 molecule from the crystal structure of the Ced-4-Ced-9 complex in red (PDB entry 2A5Y), using 142 pairs of Cα atoms taken from the 3-layered α/β domains. Larger differences in position in this 3-layered domain are seen in the loops between (α1, α2) and (α5, α6) and in the helices α4 and α7, but otherwise the Cα superposition is good (r.m.s.d. 1.5 Å). Helical domain I at the bottom of the picture is shifted to the right in the crystal (in red) relative to our model (in blue) by about 5 Å, probably because of distortion caused by dimer/complex formation. However, the winged helix domain is in a drastically new position and the relation between our model and the crystal is not apparent for this domain. (b) If the winged helix domain of our model is taken separately and superimposed with the Ced-4 molecule in the crystal, a reasonable superimposition (r.m.s.d. of 1.8 Å for 83 Cα atom pairs) can be achieved, as seen here. Loops preceding the helical winged helix domain are colored in yellow for the Ced-4 crystal structure in red and in light blue for our Ced-4 model (much of it omitted for simplicity here) in blue. This illustrates the relation in the three-dimensional structures between the ADP form and the magnesium-ATP form of these NB-ARC domains through a significant rotation of the winged helix domain.


Rationalization of Mutational Data for Potato and Tobacco R Proteins

In the present study, we have not built any models for the R proteins Rx in potato or N in tobacco, since we selected the five most diverse NB-ARC domains, as stated in the Materials and Methods section. However, the effects of several mutations in these two R proteins have been studied (40, 41), with the corresponding equivalent positions in Apaf-1 tabulated (36). The effects of most of these mutations can be structurally rationalized, as shown in Tables II and III.

Comparison with Previous Studies

Albrecht and Takken (42) presented an update on the domain architectures of mammalian NACHT-LRR and plant R proteins, with insights obtained from the Apaf-1 crystal structure exploiting their assumed close evolutionary relationships; a major part of their discussion is devoted to nomenclature. We broadly agree with this idea of conservation. The ADP form of the R proteins is ‘inactive and closed’, with the nucleotide deeply buried, and the autoactivating variants could disturb this compact organization, resulting in an open conformation (36, 42). We arrived at this idea by comparing our Ced-4 model (2G2M, based on Apaf-1) with the Ced-4 crystal structure (2A5Y) and noting the movement of the winged helix domain (Figure 5). However, neither Albrecht and Takken (42) nor Takken et al. (36) built or deposited three-dimensional models of R proteins, as has been undertaken in the present study. McHale et al. (43) built predicted structures of TNL RPS4 and CNL RPS5 of Arabidopsis using a self-consistent mean-field homology modeling technique, but did not deposit their models in the PDB. Comparing Figure 3a of McHale et al. (43) with our Figure 4, most of the important motifs are found to be similarly positioned. However, in our R protein models the kinase 2 (Walker B) motif is far from the ADP, whereas in their models the kinase 3 (RNBS-B) motif is far from the ADP (43). In our models the RNBS-C motif is close to the ADP, but in their models it is far (43). These differences probably result from the lack of manual sequence alignment in this study, though there is no such alignment problem for Ced-4 (Fig. 1). The Ced-4 crystal structure shows that the invariant Thr 263 and Arg 265 (Apaf-1 numbers) are closer to the γ phosphate of ATP than the aspartates in Walker B (38).

Table III Structural explanation of the effect of mutations induced in tobacco R protein N observed by Dinesh-Kumar et al. (41).

| Mutation(s) induced, motif | Phenotype | Structural explanation for the observation based on our study and the Apaf-1 crystal structure |
|---|---|---|
| G216A, G216E, G216R, G216V, Walker A | Loss of resistance to TMV (loss of function) | Equivalent Apaf-1 position G154 is totally conserved (Fig. 1) and part of the Walker A motif. These are needed for the proper structure of the ADP/ATP binding region and for maintaining hydrogen bonds (Fig. 3). |
| G218P, G218S, Walker A | Resistance to TMV (wild-type) | Equivalent Apaf-1 position G156, part of the Walker A motif. However, other residues like P and A are found in other organisms instead of G (Fig. 1), and some variation is tolerated as long as the new residue is small and uncharged. |
| G218V, G219V, Walker A | Partial resistance to TMV (partial loss of function) | Equivalent Apaf-1 positions G156/G157. The introduction of V is not as deleterious as that of charged residues like D or R. |
| G219D, G218R, Walker A | Loss of resistance to TMV (loss of function) | See above. Loss of function results when a large or charged side chain is introduced at these positions. |
| K222E, K222N, Walker A | do | Equivalent Apaf-1 position K160, part of Walker A, totally conserved in all proteins (Fig. 1). Has a role in ADP/ATP binding (Fig. 3). |
| T223A, T223N, T223S, Walker A | do; partial loss of function for T223S | Equivalent Apaf-1 position S161, part of the Walker A motif. The T residue shows a mild effect on changing to S in some plants (Table I), but A and N are more different. |
| D301H, D301N, D301Y, Walker B | do | Equivalent Apaf-1 position D243, part of the Walker B motif, plays a role in indirect coordination of Mg2+. Mutations probably prevent the movement of the winged helix domain needed for normal function. |
| R325G, R325Y, RNBS-B | do | Equivalent Apaf-1 position Q259, at the start of β4, far away from ADP. Not clearly understood, but this position has either K or R (Fig. 1). Probably has a role in interaction with other proteins. |


McHale et al. (43; their Fig. 3b) seem to have moved the lysine side chain in the important Walker A motif so that it no longer hydrogen bonds to the phosphate of ADP, whereas we have left it at the original position (Fig. 3). Further, in the present study an attempt has been made to rationalize the mutational data for a number of plant R proteins in the light of the three-dimensional models and protein structural principles.

Acknowledgements

The authors are thankful to Ms. Jolly Basak for extending her help at the initiation of this study. They are grateful for the graphics facilities provided at the Bioinformatics Center, Bose Institute. Financial assistance (BT/PR5433/BID/07/93/2004) from the Department of Biotechnology, Government of India to A.P. is also hereby acknowledged.

References and Footnotes

1. D. Mackey, Y. Belkhadir, J. M. Alonso, J. R. Ecker, J. L. Dangl. Cell 112, 379-389 (2003).
2. J. A. Hoffmann, F. C. Kafatos, C. A. Janeway Jr., R. A. B. Ezekowitz. Science 284, 1313-1317 (1999).
3. J. Ellis, D. Jones. Curr Opin Plant Biol 1, 288-293 (1998).
4. M. Saraste, P. R. Sibbald, A. Wittinghofer. Trends Biochem Sci 15, 430-434 (1990).
5. T. W. Traut. Eur J Biochem 222, 9-19 (1994).
6. J. E. Walker, M. Saraste, M. J. Runswick, N. J. Gay. EMBO J 1, 945-951 (1982).
7. W. I. L. Tameling, S. D. J. Elzinga, P. S. Darmin, J. H. Vossen, F. L. W. Takken, M. A. Haring, B. J. C. Cornelissen. Plant Cell 14, 2929-2939 (2002).
8. J. Yuan, H. R. Horwitz. Development 116, 309-320 (1992).
9. H. Zou, W. J. Henzel, X. Liu, A. Lutschg, X. Wang. Cell 90, 405-413 (1997).
10. E. A. van der Biezen, J. D. G. Jones. Curr Biol 8, R226-R227 (1998).
11. L. L. Dangl, J. D. G. Jones. Nature 411, 826-833 (2001).
12. B. Kobe, A. V. Kajava. Curr Opin Struc Biol 11, 725-732 (2001).
13. B. C. Meyers, A. W. Dickerman, R. W. Michelmore, S. Sivaramakrishnan, B. W. Sobral, N. D. Young. Plant J 20, 317-332 (1999).
14. H. H. Flor. Ann Rev Phytopath 9, 275-296 (1971).
15. W. I. L. Tameling, J. H. Vossen, M. Albrecht, T. Lengauer, J. A. Berden, M. A. Haring, B. J. C. Cornelissen, F. L. W. Takken. Plant Physiol 140, 1233-1245 (2006).
16. S. Seshagiri, L. K. Miller. Curr Biol 7, 455-460 (1997).
17. B. M. Seiffert, J. Vier, G. Hacker. Biochem Biophys Res Commun 290, 359-365 (2002).
18. L. A. Kelly, R. M. MacCallum, M. J. E. Sternberg. J Mol Biol 299, 499-520 (2000).
19. S. J. Riedl, W. Li, Y. Chao, R. Schwarzenbacher, Y. Shi. Nature 434, 926-933 (2005).
20. M. J. Axtell, T. W. McNellis, M. B. Mudgett, C. S. Hsu, B. J. Staskawicz. Mol Plant-Microb Interact 14, 181-188 (2001).
21. R. T. Leister, F. Katagiri. Plant J 22, 345-354 (2000).
22. R. Mauricio, E. A. Stahl, T. Korves, D. Tian, M. Kreitman, J. Bergelson. Genetics 163, 735-746 (2003).
23. Y. Tao, F. Yuan, R. T. Leister, F. M. Ausubel, F. Katagiri. Plant Cell 12, 2541-2554 (2000).
24. J. Basak, S. Kundagrami, T. K. Ghose, A. Pal. Mol Breed 14, 375-383 (2004).
25. G. Chang. J Mol Biol 330, 419-430 (2003).
26. A. N. Lupas, J. Martin. Curr Opin Struct Biol 12, 746-753 (2002).
27. J. D. Thompson, D. G. Higgins, T. J. Gibson. Nucl Acids Res 22, 4673-4680 (1994).
28. J. M. Salmeron, G. E. Oldroyd, C. M. Rommens, S. R. Scofield, H. S. Kim, D. T. Lavelle, D. Dahlbeck, B. J. Staskawicz. Cell 86, 123-133 (1996).
29. N. Ori, Y. Eshed, I. Paran, G. Presting, D. Aviv, S. Tanksley, D. Zamir, R. Fluhr. Plant Cell 9, 521-532 (1997).
30. G. J. Lawrence, E. J. Finnegan, M. A. Ayliffe, J. G. Ellis. Plant Cell 7, 1195-1206 (1995).
31. A. Pal, A. Chakrabarti, J. Basak. J Theor Biol doi:10.1016/j.jtbi.2007.01.013 (2007).
32. S. Zhao, G. M. Morris, A. J. Olson, D. S. Goodsell. J Mol Biol 314, 1245-1255 (2001).
33. A. F. Bent, B. N. Kunkel, D. Dahlbeck, K. L. Brown, R. Schmidt, J. Giraudat, J. Leung, B. J. Staskawicz. Science 265, 1856-1860 (1994).
34. M. Mindrinos, F. Katagiri, D. Dahlbeck, K. L. Brown, R. Schmidt, J. Giraudat, J. Leung, B. J. Staskawicz. Science 265, 1856-1860 (1994).
35. S. De la Fuente van Bentem, J. H. Vossen, K. de Vries, S. C. van Wees, W. I. L. Tameling, H. Dekker, C. G. de Koster, M. A. Haring, F. L. W. Takken, B. J. C. Cornelissen. Plant J 43, 284-298 (2005).
36. F. L. W. Takken, M. Albrecht, W. I. L. Tameling. Curr Opin Plant Biol 9, 383-390 (2006).
37. P. Howles, G. Lawrence, J. Finnegan, H. McFadden, M. Ayliffe, P. Dodds, J. Ellis. Mol Plant Microbe Interact 18, 570-582.
38. N. Yan, J. Chai, E. S. Lee, L. Gu, Q. Liu, J. He, J. W. Wu, D. Kokel, H. Li, Q. Hao, D. Xue, Y. Shi. Nature 437, 831-837 (2005).
39. M. R. Singleton, R. Morales, I. Grainge, N. Cook, M. N. Isupov, D. B. Wigley. J Mol Biol 343, 547-557 (2004).
40. A. Bendahmane, G. Farnham, P. Moffet, D. C. Baulcombe. Plant J 32, 195-204 (2002).
41. S. P. Dinesh-Kumar, W. H. Tham, B. J. Baker. Proc Natl Acad Sci USA 97, 14789-14794 (2000).
42. M. Albrecht, F. L. W. Takken. Biochem Biophys Res Comm 339, 459-462 (2006).
43. L. McHale, X. Tan, P. Koehl, R. W. Michelmore. Genome Biology 7, 212.1-212.11 (2006).

Date Received: May 23, 2007

Communicated by the Editor Ramaswamy H. Sarma
