rajesh-g
Protein Folds
Three-dimensional structures may differ substantially from each other, at the sequence
level and even at the structural level, and still share the same type of topology, called
a fold.
What is a Protein Fold?
Or
What is the difference between fold, domain and tertiary structure of a protein?
A fold is a certain way of arranging secondary structure elements in space.
93,436 structures and 1,393 folds
This leads to the broad acceptance of the view that there are a finite and relatively small
number of folds found in nature.
Class                                 Folds   Superfamilies   Families
All alpha proteins (α)                284     507             871
All beta proteins (β)                 174     354             742
Alpha and beta proteins (α/β)         147     244             803
Alpha and beta proteins (α+β)         376     552             1055
Multi-domain proteins                 66      66              89
Membrane and cell surface proteins    58      110             123
Small proteins                        90      129             219
Total                                 1195    1962            3902
SCOP classification of protein folds
At the lowest level of the SCOP hierarchy are the individual domains. Sets of domains are
grouped into families of homologues. These comprise domains for which the similarities in
structure, function and sequence imply a common evolutionary origin.
Families that share common structure and function, but lack adequate sequence similarity, so
that the evidence for evolutionary relationship is suggestive but not compelling, are grouped
into superfamilies.
Superfamilies that share a common folding topology, for at least a large central portion of the
structure, are grouped as folds.
Finally, each fold group falls into one of the general classes.
Class: Alpha/beta
Fold: Flavodoxin-like
Superfamily: Flavoproteins
Family: Flavodoxin-related
Protein: Flavodoxin
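The hierarchy described above (class, fold, superfamily, family, protein) can be pictured as a simple record. The following is a minimal sketch, not part of SCOP itself: the class name `ScopLineage`, its methods, and the second "hypothetical" entry are assumptions made only for illustration. The flavodoxin entry is the example given in these notes.

```python
from dataclasses import dataclass

# SCOP levels from most general to most specific, as described in the notes.
SCOP_LEVELS = ("Class", "Fold", "Superfamily", "Family", "Protein")


@dataclass(frozen=True)
class ScopLineage:
    """One protein's position in the hierarchy (illustrative type, not a SCOP API)."""
    scop_class: str
    fold: str
    superfamily: str
    family: str
    protein: str

    def levels(self):
        # Values in the same order as SCOP_LEVELS.
        return (self.scop_class, self.fold, self.superfamily,
                self.family, self.protein)

    def as_path(self):
        return " > ".join(self.levels())

    def shared_level(self, other):
        """Most specific level at which both lineages still agree, or None."""
        deepest = None
        for name, mine, theirs in zip(SCOP_LEVELS, self.levels(), other.levels()):
            if mine != theirs:
                break
            deepest = name
        return deepest


# Example taken from the notes.
flavodoxin = ScopLineage("Alpha/beta", "Flavodoxin-like", "Flavoproteins",
                         "Flavodoxin-related", "Flavodoxin")

# Hypothetical entry: same fold, but made-up superfamily, family and protein.
other = ScopLineage("Alpha/beta", "Flavodoxin-like", "Hypothetical superfamily",
                    "Hypothetical family", "Hypothetical protein")

print(flavodoxin.as_path())
print(flavodoxin.shared_level(other))
```

Two domains that agree only down to the fold level share a topology without any implied evolutionary relationship, which is the distinction the superfamily and family levels are meant to capture.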
Classification of protein folds: All-α
Examples: Alamethicin (a helix); Rop (helix-turn-helix); Cytochrome C (four-helix bundle)
All-α proteins are a class of structural domains in which the secondary structure is
composed entirely of α-helices, with the possible exception of a few isolated β-sheets on the
periphery. Common examples include the bromodomain, the globin fold and the
homeodomain fold.
Classification of protein folds: All-β
Examples: β sandwich; β barrel
All-β proteins are a class of structural domains in which the secondary structure is
composed entirely of β-sheets, with the possible exception of a few isolated α-helices on the
periphery. Common examples include the SH3 domain, the beta-propeller domain, the
immunoglobulin fold and the B3 DNA-binding domain.
Classification of protein folds: α/β
Examples: Placental ribonuclease inhibitor (α/β horseshoe); Triose phosphate isomerase (α/β barrel)
• The α/β barrel is the most common tertiary fold observed in high-resolution protein crystal structures
• 10% of all known enzymes have this domain
α/β proteins are a class of structural domains in which the secondary structure is composed of
alternating α-helices and β-strands along the backbone. The β-strands are therefore mostly
parallel. Common examples include the flavodoxin fold, the TIM barrel and leucine-rich-repeat
(LRR) proteins such as ribonuclease inhibitor.
Classification of protein folds: α+β
α+β proteins are a class of structural domains in which the secondary structure is composed
of α-helices and β-strands that occur separately along the backbone. The β-strands are
therefore mostly antiparallel. Common examples include the ferredoxin fold, ribonuclease A,
and the SH2 domain.
TtMoaC (3JQJ, 3JQK, 3JQM)
Oligonucleotide Binding (OB) fold
A novel folding motif was observed in four different proteins which bind oligonucleotides or
oligosaccharides: staphylococcal nuclease, anticodon binding domain of asp-tRNA synthetase
and B-subunits of heat-labile enterotoxin and verotoxin-1. The common fold of the four
proteins has a five-stranded beta-sheet coiled to form a closed beta-barrel.
Murzin et al., 1993, EMBO J, 12, 861-867.
Rossmann fold
TtMogA (3MCH); AaMogA (3MCI and 3MCJ)
Hot dog fold
Leesong et al., 1996, Structure, 4, 253-264.
The Hotdog fold was first observed in the structure of E. coli β-hydroxydecanoyl thiol ester
dehydratase (FabA). The Hotdog fold is found in eukaryotes, bacteria and archaea, and is
involved in a range of cellular processes, from thioester hydrolysis to phenylacetic acid
degradation and transcriptional regulation of fatty acid biosynthesis.
Dillon and Bateman, 2004, BMC Bioinformatics, 5, 109
How many biological functions do we know?
1. Transport proteins (Hemoglobin, carrier proteins in membrane)
2. Nutrient and storage proteins (Ovalbumin, Casein, Ferritin)
3. Contractile or motile proteins (Actin, Myosin, Tubulin)
4. Structural proteins (Collagen, Elastin, Keratin, Fibroin)
5. Defense proteins (Fibrinogen, Thrombin)
6. Regulatory proteins (Insulin)
7. Other functional proteins (Monellin, Antifreeze, Resilin)
8. Enzymes (Enzymoproteins)
Can we classify proteins based on their functions?
Major classes of enzymes according to their biological functions
1. Oxidoreductases
These are enzymes which catalyze the reduction or oxidation of a molecule.
2. Transferases
These enzymes catalyze the transfer of a group of atoms from one molecule to
another. For example, transfer of a phosphate between ATP and a sugar molecule.
3. Hydrolases
These enzymes catalyze hydrolysis reactions. For example, hydrolysis of an ester.
4. Lyases
Reactions which add a small molecule such as water or ammonia to a double bond
(and the reverse, elimination, reactions) are catalyzed by lyases.
5. Isomerases
These enzymes catalyze the conversion of a molecule into an isomer. For example,
cis-trans interconversion of maleate and fumarate.
6. Ligases
These enzymes catalyze reactions which make bonds to join together (ligate) smaller
molecules to make larger ones.
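In an Enzyme Commission (EC) number, the first digit identifies the class: 1 oxidoreductases, 2 transferases, 3 hydrolases, 4 lyases, 5 isomerases, 6 ligases. The sketch below looks up the class from an EC number and tallies the classes of the glycolytic enzymes whose EC numbers appear in the glycolysis example in these notes. The function name `ec_class` and the dictionary names are assumptions made for illustration.

```python
from collections import Counter

# First digit of an EC number -> enzyme class (the six classes listed in the notes).
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_class(ec_number):
    """Return the class name for an EC number such as '5.3.1.9'."""
    first_digit = int(ec_number.strip().split(".")[0])
    if first_digit not in EC_CLASSES:
        raise ValueError(f"EC class {first_digit} is not one of the six listed here")
    return EC_CLASSES[first_digit]


# Abbreviations and EC numbers as given in the glycolysis example of these notes.
GLYCOLYSIS = {
    "HK": "2.7.1.1",
    "PGI": "5.3.1.9",
    "PFK": "2.7.1.11",
    "ALD": "4.1.2.13",
    "TIM": "5.3.1.1",
    "GAPDH": "1.2.1.12",
    "PGK": "2.7.2.3",
    "PGM": "5.4.2.1",
    "Enolase": "4.2.1.11",
    "PK": "2.7.1.40",
}

class_counts = Counter(ec_class(ec) for ec in GLYCOLYSIS.values())

for enzyme, ec in GLYCOLYSIS.items():
    print(f"{enzyme:8s} {ec:10s} {ec_class(ec)}")
print(dict(class_counts))
```

The tally shows that one pathway draws on several functional classes, which is useful background for the question of whether function at the top EC level tracks fold.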
Is there any significant relationship between the fold of a protein and its biological
function?
The great majority of proteins which exhibit significant structural similarity are homologues
and perform identical or similar functions.
However, beyond these inherited similarities, different enzyme functions are performed by
proteins with a wide variety of different architectures and topologies.
Can there be any rules or guidelines which may suggest function from structure?
Why does one particular protein perform a given function?
To date, all protein pairs with sequences which indicate a definite evolutionary relationship
are observed to adopt the ‘same’ fold, with only minor variation (e.g. changes in domain
orientations, lengths of loops or additional secondary structures).
For example, globins from a wide variety of species with widely diverged sequences, all adopt
the same fold and perform an oxygen carrier/storage function.
Kosloff and Kolodny, 2008, Proteins, 71, 891-902.
Sequence-similar, structure-dissimilar
Sequence identity (%)   Total number of chain pairs (a)
                        ≥ 0 Å (b)    ≥ 3 Å (c)    ≥ 6 Å (c)
100                     1,941        444          158
99                      12,868       757          278
70                      114,021      6,873        1,575
50                      147,186      11,749       2,653
(a) Number of chain pairs after removing redundant structures from PDB.
(b) Total number of chain pairs for each of the four subsets.
(c) Total number of structurally-dissimilar pairs, restricted to RMSD 3 Å or 6 Å.
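To put the counts in the table in proportion, it can help to express the structurally dissimilar pairs as a fraction of all chain pairs in each sequence-identity subset. The sketch below does only this arithmetic on the numbers from the table; the names `CHAIN_PAIRS` and `dissimilar_fractions` are assumptions made for illustration.

```python
# Counts copied from the table: identity (%) -> (all pairs, RMSD >= 3 Å, RMSD >= 6 Å).
CHAIN_PAIRS = {
    100: (1941, 444, 158),
    99: (12868, 757, 278),
    70: (114021, 6873, 1575),
    50: (147186, 11749, 2653),
}


def dissimilar_fractions(identity):
    """Fractions of pairs in one subset with RMSD >= 3 Å and RMSD >= 6 Å."""
    total, at_least_3, at_least_6 = CHAIN_PAIRS[identity]
    return at_least_3 / total, at_least_6 / total


for identity in CHAIN_PAIRS:
    frac3, frac6 = dissimilar_fractions(identity)
    print(f"{identity:3d}% identity: {frac3:.1%} at >= 3 Å, {frac6:.1%} at >= 6 Å")
```

Read this way, structurally dissimilar pairs are a minority in every subset, yet they are far from negligible even among chains with identical sequences.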
There are hundreds if not thousands of examples in the structure database demonstrating that
highly similar structures may have radically different sequences. So although it is true that
highly similar sequences adopt highly similar structures, so too do highly dissimilar sequences
sometimes adopt similar structures.
Same fold, different function
However, there are also a few homologous proteins which clearly have different functions,
despite adopting the same structure. The classic example is that of lysozyme and
α-lactalbumin. Although these proteins share ~35% sequence identity, the latter has lost the
catalytic carboxylates from glutamate and aspartate residues necessary for sugar cleavage.
Super folds are very popular
The average number of functions per fold is 1.2 (1.8 for enzyme-related folds alone).
The average number of folds for a given function is 3.6 (2.5 for enzymes alone).
Orengo et al., 1994, Nature, 372, 631-634.
Gerstein and Hegyi, 1998, FEMS Microbiol. Rev. 22, 277-304.
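The two averages quoted above come from counting in opposite directions over a fold-to-function assignment. The sketch below shows the calculation on a toy mapping built from proteins mentioned in these notes. The mapping, the fold and function labels, and the function names are illustrative assumptions, not the data used in the cited studies, so the resulting numbers are not expected to match the published values.

```python
from collections import defaultdict

# Toy fold -> functions mapping (illustrative labels only, not survey data).
FOLD_TO_FUNCTIONS = {
    "lysozyme/alpha-lactalbumin fold": {"sugar cleavage",
                                        "lactose synthesis regulation"},
    "trypsin fold": {"serine proteinase"},
    "subtilisin fold": {"serine proteinase"},
    "globin fold": {"oxygen carrier/storage"},
}


def mean_functions_per_fold(mapping):
    """Average number of distinct functions assigned to each fold."""
    return sum(len(functions) for functions in mapping.values()) / len(mapping)


def mean_folds_per_function(mapping):
    """Average number of distinct folds that carry out each function."""
    function_to_folds = defaultdict(set)
    for fold, functions in mapping.items():
        for function in functions:
            function_to_folds[function].add(fold)
    return (sum(len(folds) for folds in function_to_folds.values())
            / len(function_to_folds))


print(mean_functions_per_fold(FOLD_TO_FUNCTIONS))
print(mean_folds_per_function(FOLD_TO_FUNCTIONS))
```

In the toy mapping, the lysozyme/α-lactalbumin fold raises the first average (one fold, two functions) and the two unrelated serine proteinase folds raise the second (one function, two folds).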
Different fold, same function
In contrast, there are several examples of proteins which perform the same function, but are
clearly not evolutionarily related. Here the classic examples are the trypsin and subtilisin
proteinases, which not only perform the same function despite having totally different
structures, but have evolved the same Asp–His–Ser catalytic-triad mechanism. This is a
genuine example of functional convergence.
The current status is like this
It is clear that the majority of topologies contain a single homologous family.
Analysis of all the structures in the PDB reveals that most proteins with the same topology
belong to the same homologous family (i.e. they are evolutionarily related).
Glycolysis: An example
Enzymes of the glycolytic pathway and their EC numbers:
HK 2.7.1.1
PGI 5.3.1.9
PFK 2.7.1.11
ALD 4.1.2.13
TIM 5.3.1.1
GAPDH 1.2.1.12
PGK 2.7.2.3
PGM 5.4.2.1
Enolase 4.2.1.11
PK 2.7.1.40
Given this observation, it is striking that the structures of the 10 enzymes of the glycolytic pathway all
belong to the αβ class of structures and use only three architectures.
Functional classification for other proteins is more difficult, but we do find distinct structural
class preferences for those proteins that bind some of the most common biomolecules —
haems, sugars, nucleic acids and nucleotides.
Nevertheless, within such a group, the individual proteins adopt a wide variety of different
topologies to bind their similar ligands, which are used for different functions.
It may be concluded that function at the top level of the EC number enzyme classification is
not related to fold, as only a very few specific residues are actually responsible for enzyme
activity. Conversely, the fold is much more closely related to ligand type.
Martin et al., 1998, Structure, 6, 875-884.
Then, what could be the best strategy to find the relationship between protein
fold and function?
Assignment No. 4
For your protein of interest:
1. Find out how many domains are present.
2. Determine which fold your protein belongs to.
3. Assess how stable your protein is.
4. Is it a secretory or cytosolic protein?
5. How many disulfide bonds are present in your protein?
6. Is there any post-translational modification? If yes, which one?