By Michael Schroeder, Biotec 5
Motivation: Conformational changes
Upon ligand binding, structures may change
Structural alignment can highlight the changes
Conformational changes: Small GTPases
Small GTPases act as molecular switches to control and regulate important functions and pathways within the cell
Activated by guanine nucleotide exchange factors (GEFs)
Inactivated by GTPase-activating proteins (GAPs)
Open and closed conformation of citrate synthase (1cts, 5cts)
Open: oxaloacetate; closed: oxaloacetate and coenzyme A
Loop between two helices moves by 6 Å and rotates by 28°; some atoms move by 10 Å
Hinge motion in lactoferrin (1lfh, 1lfg)
Lactoferrin is an iron-binding protein found in secretions such as milk or tears
Rotation of 54° upon iron binding
Motivation: (Distant) Relatives
Sequence similarity may be low, but structural similarity can still be high
Picture from www.jenner.ac.uk/YBF/DanielleTalbot.ppt
Distant relatives
Globins occur widely
Primary function: binding oxygen
Assembly of helices surrounding haem group
Relatives
Sperm whale myoglobin (1mbd) and lupin leghaemoglobin (2lh7)
Relatives
Actinidin (2act) and papain (9pap)
Sequence identity 49%, rmsd 0.77 Å
Same family: Papain-like
Relatives
Plastocyanin (5pcy) and azurin (2aza)
Core of structure is conserved
Relatives
Structure classifications like CATH and FSSP use structural alignments to identify superfamilies.
Sequence similarity: low
>1cse Subtilisin
AQTVPYGIPLIKADKVQAQGFKGANVKVAVLDTGIQASHPDLNVVGGASFVAGEAYNTDGNGHGTHVAGTVAALDNTTGVLGVAPSVSLYAVKVLNSSGSGSYSGIVSGIEWATTNGMDVINMSLGGASGSTAMKQAVDNAYARGVVVVAAAGNSGNSGSTNTIGYPAKYDSVIAVGAVDSNSNRASFSSVGAELEVMAPGAGVYSTYPTNTYATLNGTSMASPHVAGAAALILSKHPNLSASQVRNRLSSTATYLGSSFYYGKGLINVEAAAQ
>1acb Chymotrypsin
CGVPAIQPVLSGLSRIVNGEEAVPGSWPWQVSLQDKTGFHFCGGSLINENWVVTAAHCGVTTSDVVVAGEFDQGSSSEKIQKLKIAKVFKNSKYNSLTINNDITLLKLSTAASFSQTVSAVCLPSASDDFAAGTTCVTTGWGLTRYTNANTPDRLQQASLPLLSNTNCKKYWGTKIKDAMICAGASGVSSCMGDSGGPLVCKKNGAWTLVGIVSWGSSTCSTSTPGVYARVTALVNWVQQTLAAN
Convergent Evolution
c.41.1 and b.47.1 share interaction partners
c.41.1: Subtilisin-like
d.58.3: Protease propeptides/inhibitors
d.84.1: Subtilisin inhibitor
d.40.1: CI-2 family of serine protease inhibitors
b.47.1: Trypsin-like serine proteases
c.56.5: Zn-dependent exopeptidase
g.15.1: Ovomucoid/PCI-1 like inhibitor
Convergent Evolution
4sgb: Ovomucoid/PCI-1 like inhibitor, g.15.1 (top); Trypsin-like serine proteases, b.47.1.2 (bottom)
1oyv: Ovomucoid/PCI-1 like inhibitor, g.15.1 (top); Subtilisin-like, c.41.1 (bottom)
Convergent Evolution
Aligned structures
1cse: CI-2 family of serine protease inhibitors, d.40.1 (top); Subtilisin-like, c.41.1 (bottom)
1acb: CI-2 family of serine protease inhibitors, d.40.1 (top); Trypsin-like serine proteases, b.47.1.2 (bottom)
Catalytic Triad
>1cse Subtilisin
AQTVPYGIPLIKADKVQAQGFKGANVKVAVLDTGIQASHPDLNVVGGASFVAGEAYNTDGNGHGTHVAGTVAALDNTTGVLGVAPSVSLYAVKVLNSSGSGSYSGIVSGIEWATTNGMDVINMSLGGASGSTAMKQAVDNAYARGVVVVAAAGNSGNSGSTNTIGYPAKYDSVIAVGAVDSNSNRASFSSVGAELEVMAPGAGVYSTYPTNTYATLNGTSMASPHVAGAAALILSKHPNLSASQVRNRLSSTATYLGSSFYYGKGLINVEAAAQ
>1acb Chymotrypsin
CGVPAIQPVLSGLSRIVNGEEAVPGSWPWQVSLQDKTGFHFCGGSLINENWVVTAAHCGVTTSDVVVAGEFDQGSSSEKIQKLKIAKVFKNSKYNSLTINNDITLLKLSTAASFSQTVSAVCLPSASDDFAAGTTCVTTGWGLTRYTNANTPDRLQQASLPLLSNTNCKKYWGTKIKDAMICAGASGVSSCMGDSGGPLVCKKNGAWTLVGIVSWGSSTCSTSTPGVYARVTALVNWVQQTLAAN
Convergent evolution
A and B are native, C is viral
[Figure: schematic with proteins labelled A, A', B and C]
Henschel et al., Bioinformatics 2006
HIV Nef mimics kinase in binding SH3
Comparison of the Nef-SH3 interaction and the intra-chain interaction between the catalytic domain and the SH3 domain of Hck (PDBs: 1efn and 2hck)
No evidence of homology between Nef and the kinase
Figure labels: HIV-1 Nef; Kinase (Src haematopoietic cell kinase, catalytic domain); Fyn-SH3/Hck-SH3
Henschel et al., Bioinformatics 2006
Automatic calculation of equivalent residues
Apart from the PxxP motif, matching residue pairs (Nef/kinase): Arg71/Lys249, Phe90/His289
Residues with equivalents are strictly conserved in HIV-Nef
Henschel et al., Bioinformatics 2006
Mimicry of baculovirus p35 and human inhibitor of apoptosis
Caspase (red), p35 (yellow), IAP (green)
Upon infection the cell starts its apoptosis programme; p35 tries to stop it
Henschel et al., Bioinformatics 2006
Mimicry of Capsids and Cyclophilin
HIV capsid protein (yellow)
Cyclophilin (red, green)
Cyclophilin A restricts HIV infectivity
Upon mutation of cyclophilin or inhibition with cyclosporin, infectivity goes up more than 100-fold (Towers, Nature Medicine, 2003)
Henschel et al., Bioinformatics 2006
What do we need?
Two main operations to align structures: translation and rotation
How to evaluate a structural alignment? Root mean square deviation, rmsd
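The two operations can be sketched in a few lines. This is a minimal illustration under assumptions, not code from the lecture: the function names `translate` and `rotate_z` are made up here, atoms are plain (x, y, z) tuples, and the rotation is restricted to the z-axis for brevity.

```python
import math

def translate(coords, shift):
    """Move every atom by the same shift vector (dx, dy, dz)."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in coords]

def rotate_z(coords, angle_deg):
    """Rotate every atom about the z-axis by angle_deg degrees."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

# Two hypothetical atoms, used only to exercise the functions.
atoms = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]
moved = translate(atoms, (1.0, 1.0, 0.0))
turned = rotate_z(atoms, 90.0)
```

A general 3D rotation needs a full 3x3 rotation matrix; the single-axis case only shows that both operations move all atoms rigidly, so distances within a structure are preserved.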
Root Mean Square Deviation
What is the distance between two points a, with coordinates $x_a$ and $y_a$, and b, with coordinates $x_b$ and $y_b$?
Euclidean distance: $d(a,b) = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}$
And in 3D?
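In 3D the same formula simply gains a third squared term for the z-coordinates. As a small sketch (the function name `distance` is an assumption, not part of the slides), one function can cover both cases:

```python
import math

def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

print(distance((0.0, 0.0), (3.0, 4.0)))            # 2D example
print(distance((1.0, 2.0, 3.0), (1.0, 2.0, 5.0)))  # 3D example
```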
Root Mean Square Deviation
In a structure alignment the score measures how far the aligned atoms are from each other on average
Given the distances $d_i$ between n aligned atoms, the root mean square deviation is defined as
$\mathrm{rmsd} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} d_i^2}$
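A minimal sketch of this definition, assuming the two structures are given as equally long lists of (x, y, z) tuples whose i-th entries are the aligned atoms and that superposition has already been done; the function name `rmsd` is chosen here for illustration:

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of aligned, superposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Squared distance d_i^2 for every aligned atom pair.
    squared = [
        sum((p - q) ** 2 for p, q in zip(a, b))
        for a, b in zip(coords_a, coords_b)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

Note that the value depends on the superposition: the same alignment gives a larger rmsd if the structures have not been translated and rotated onto each other first.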
Quality of Alignment and Example
Unit of RMSD => e.g. Ångstroms
Identical structures => RMSD = 0
Similar structures => RMSD is small (1 – 3 Å)
Distant structures => RMSD > 3 Å
A very simple algorithm…
…to align identical structures with conformational changes
Generate a sequence alignment (not necessary if both sequences are really 100% identical)
Compute center of mass for both structures
Move both structures so that the centers of mass are at the origin
Compute the angle between all aligned residues
Rotate structure by median of all angles
Question: How? Assume n atoms $(x_1,y_1,z_1)$ to $(x_n,y_n,z_n)$ (for one structure)
Center of mass: $(x_{CoM}, y_{CoM}, z_{CoM}) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i,\ \frac{1}{n}\sum_{i=1}^{n} y_i,\ \frac{1}{n}\sum_{i=1}^{n} z_i\right)$
For all i: $x_i := x_i - x_{CoM}$, $y_i := y_i - y_{CoM}$, $z_i := z_i - z_{CoM}$
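The centre-of-mass and centring steps might look as follows. This is a sketch that treats all atoms as having equal mass, as the 1/n formula does; the function names `center_of_mass` and `center` are illustrative.

```python
def center_of_mass(coords):
    """Unweighted centre of mass: the mean of each coordinate over all atoms."""
    n = len(coords)
    return tuple(sum(atom[k] for atom in coords) / n for k in range(3))

def center(coords):
    """Shift the structure so that its centre of mass lies at the origin."""
    cx, cy, cz = center_of_mass(coords)
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]

# Three hypothetical atoms; their centre of mass is (3, 4, 5).
atoms = [(1.0, 2.0, 3.0), (3.0, 4.0, 5.0), (5.0, 6.0, 7.0)]
centered = center(atoms)
```

Applying `center` to both structures removes the translation, so that only the rotation remains to be determined.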
Why median and not mean?
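One way to approach this question is to look at what a single moved loop does to the two statistics. The angles below are invented for illustration and do not come from any real structure:

```python
import statistics

# Hypothetical per-residue rotation angles in degrees: most residues agree on
# about 30 degrees, one residue sits in a loop that moved on its own.
angles = [29.0, 30.0, 30.0, 31.0, 120.0]

print(statistics.mean(angles))    # 48.0, pulled towards the outlier
print(statistics.median(angles))  # 30.0, stays with the majority
```

Since the algorithm targets structures with conformational changes, a few residues are expected to deviate strongly, which is exactly the situation in which the median is the more robust choice.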
A refinement: Alternating alignment and superposition
1. P = initial alignment (e.g. based on sequence alignment)
2. Superpose structures A and B based on P
3. Generate distance-based scoring matrix R from superposition
4. Use dynamic programming to align A and B using scoring matrix R
5. P' = new alignment derived from dynamic programming step
6. If P' is different from P, then set P = P' and go to step 2 again
Distance-based scoring matrix
Let d(Ai, Bj) be the Euclidean distance between Ai and Bj
Let t be the upper distance limit for residues to be rewarded
The scoring matrix R is defined as follows:
R(Ai, Bj) = 1 / d(Ai, Bj) - 1 / t
if R(Ai, Bj) > max. score then R(Ai, Bj) = max. score
The gap/mismatch penalty is set to 0
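The alternating scheme and the scoring matrix R can be put together in a toy sketch. Several things here are assumptions made to keep it short: points are 2D so that the best rotation has a closed form, the values `t=4.0` and `max_score=2.0` are arbitrary, `max_rounds` is an added safeguard against endless cycling, and all function names are hypothetical.

```python
import math

def superpose(a, b, pairs):
    """Centre a and b on the aligned atoms and rotate b onto a (2D least squares)."""
    n = len(pairs)
    ca = [sum(a[i][k] for i, _ in pairs) / n for k in range(2)]
    cb = [sum(b[j][k] for _, j in pairs) / n for k in range(2)]
    a0 = [(x - ca[0], y - ca[1]) for x, y in a]
    b0 = [(x - cb[0], y - cb[1]) for x, y in b]
    dot = sum(b0[j][0] * a0[i][0] + b0[j][1] * a0[i][1] for i, j in pairs)
    cross = sum(b0[j][0] * a0[i][1] - b0[j][1] * a0[i][0] for i, j in pairs)
    theta = math.atan2(cross, dot)  # rotation angle minimising the rmsd
    c, s = math.cos(theta), math.sin(theta)
    return a0, [(c * x - s * y, s * x + c * y) for x, y in b0]

def score_matrix(a, b, t=4.0, max_score=2.0):
    """R(Ai, Bj) = 1/d(Ai, Bj) - 1/t, capped at max_score."""
    r = []
    for p in a:
        row = []
        for q in b:
            d = math.dist(p, q)
            s = max_score if d == 0 else 1.0 / d - 1.0 / t
            row.append(min(s, max_score))
        r.append(row)
    return r

def align(r):
    """Dynamic programming with gap penalty 0; returns aligned index pairs."""
    n, m = len(r), len(r[0])
    f = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            f[i][j] = max(f[i - 1][j - 1] + r[i - 1][j - 1], f[i - 1][j], f[i][j - 1])
    pairs, i, j = [], n, m
    while i > 0 and j > 0:  # traceback, pairing only residues with positive score
        if r[i - 1][j - 1] > 0 and f[i][j] == f[i - 1][j - 1] + r[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif f[i][j] == f[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]

def refine(a, b, pairs, max_rounds=20):
    """Steps 2 to 6: superpose, score, realign, repeat until P stops changing."""
    for _ in range(max_rounds):
        a0, b0 = superpose(a, b, pairs)
        new_pairs = align(score_matrix(a0, b0))
        if not new_pairs or new_pairs == pairs:
            break
        pairs = new_pairs
    return pairs

# Toy input: b is a rotated and shifted copy of a; P starts with three pairs.
a = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (3.0, 3.0), (5.0, 3.0)]
b = [(-y + 10.0, x - 3.0) for x, y in a]
print(refine(a, b, [(0, 0), (1, 1), (2, 2)]))
```

In this toy case the initial three pairs already determine the rigid motion, so the first realignment extends P to all five residues and the second round confirms it. For real 3D structures the superposition step would need a proper 3D least-squares fit instead of the 2D shortcut used here.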
What size does PAM have?
What size does R have?
Double dynamic programming
Goal: Simultaneously align and superpose structures
Double dynamic programming is a heuristic which tries to achieve this goal
Implemented as part of SSAP (used e.g. by CATH)
Idea of double dynamic programming
Use two levels of dynamic programming:
High level, which summarises low level DP
Low level, which generates alignment based on assumption that $a_i$ and $b_j$ are part of an optimal alignment
Low level matrix
$^{ij}R$ is the low-level scoring matrix assuming the pair $a_i$ and $b_j$ is aligned
$^{ij}R_{kl}$ is the score showing how well $a_k$ fits onto $b_l$ under the constraint that $a_i$ and $b_j$ are aligned
Perform dynamic programming for all pairs i,j using $^{ij}R$ with the constraint that the optimal alignment includes (i,j)
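The two levels can be illustrated with a toy sketch. It is not SSAP: the low-level score used here, `max(0, 1 - |d(a_i, a_k) - d(b_j, b_l)|)`, is a hypothetical stand-in for SSAP's actual residue comparison, the gap penalty is 0, every low-level path is added to the high-level matrix without any cutoff, and all function names are made up.

```python
import math

def dp(score):
    """Global alignment with gap penalty 0; returns the aligned index pairs."""
    n = len(score)
    m = len(score[0]) if n else 0
    f = [[0.0] * (m + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        for l in range(1, m + 1):
            f[k][l] = max(f[k - 1][l - 1] + score[k - 1][l - 1],
                          f[k - 1][l], f[k][l - 1])
    pairs, k, l = [], n, m
    while k > 0 and l > 0:
        s = score[k - 1][l - 1]
        if s > 0 and f[k][l] == f[k - 1][l - 1] + s:
            pairs.append((k - 1, l - 1))
            k, l = k - 1, l - 1
        elif f[k][l] == f[k - 1][l]:
            k -= 1
        else:
            l -= 1
    return pairs[::-1]

def low_level(a, b, i, j):
    """ijR: how well a_k fits onto b_l as seen from the assumed pair (a_i, b_j)."""
    return [[max(0.0, 1.0 - abs(math.dist(a[i], a[k]) - math.dist(b[j], b[l])))
             for l in range(len(b))] for k in range(len(a))]

def constrained_path(r, i, j):
    """Best alignment under ijR that is forced to contain the pair (i, j)."""
    before = dp([row[:j] for row in r[:i]])
    after = dp([row[j + 1:] for row in r[i + 1:]])
    return before + [(i, j)] + [(k + i + 1, l + j + 1) for k, l in after]

def double_dp(a, b):
    """Sum the low-level best paths into a high-level matrix, then align it."""
    high = [[0.0] * len(b) for _ in a]
    for i in range(len(a)):
        for j in range(len(b)):
            r = low_level(a, b, i, j)
            for k, l in constrained_path(r, i, j):
                high[k][l] += r[k][l]
    return dp(high)

# Toy input: b is a rotated and shifted copy of five points on a line.
a = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0), (15.0, 0.0)]
b = [(5.0, x + 2.0) for x, _ in a]
print(double_dp(a, b))
```

Because the low-level score compares distances within each structure, no superposition is needed in this sketch. Cells that lie on many low-level paths collect the highest totals, which is what steers the final high-level alignment.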
Summary
Structural alignments are useful to study conformational changes, to classify domains into families (double dynamic programming, DDP, is used in CATH), and to study proteins with distant relationships and hence low sequence similarity
Algorithms
Basic operations: translate and rotate
Simple algorithm based on dynamic programming
Double dynamic programming:
low-level programming using a substitution matrix based on residue distances
Aggregation of best paths for high-level programming