Thinking Outside the Box: Applications Including Finding Off-targets for Major Pharmaceuticals
Philip E. Bourne [email protected]
Agenda
• Overall Theme - Thinking differently about proteins:
– Spherical harmonics and phylogeny
– The Gaussian Network Model and new modes of motion
– The Geometric Potential for Describing Ligand Binding Sites
– SOIPPA for finding off-site targets
The Curse of the Ribbon
The conventional view of a protein (left) has had a remarkable impact on our understanding of living systems, but it's time for new views. It is not how a ligand sees a protein, after all.
Limitations
• A local viewpoint does not capture the global properties of a protein
• Cartesian coordinates do not necessarily capture the properties of the protein
• Comparative analysis is limited
Protein Kinase A – Open Book View
Scheeff & Bourne 2005 PLoS Comp. Biol. 1(5): e49
Superfamily Members – The Same But Different
Scheeff & Bourne 2005 PLoS Comp. Biol. 1(5): e49
• Roots in spherical harmonics
• Parameter space and boundary conditions can be a variety of properties
• Order of the multipoles defines the granularity of the descriptors
• Bottom line – interpreted as shape descriptors
An Alternative Approach: Multipolar Representation
Gramada & Bourne 2006 BMC Bioinformatics 7:242
Geometric Comparison Does Not Reflect Biological Reality
Gramada & Bourne 2006 BMC Bioinformatics 7:242
Results – Protein Kinase Like Superfamily Alignment
Clear distinction between families.
Some clustering is seen within the TPKs that resembles various groups, even though there is little shape discrimination at this level.
Gramada & Bourne 2006 BMC Bioinformatics 7:242
Results – Protein Kinase Like Superfamily Alignment
Gramada & Bourne 2006 BMC Bioinformatics 7:242
Possibilities – Structure Based Phylogenetic Analysis
Scheeff & Bourne Multipoles
Gramada & Bourne 2007 PLoS ONE submitted
Protein Motion
Ordered Structures
Disordered Structures
Structures exist in a spectrum from order to disorder
Gu, Gribskov & Bourne 2006 PLoS Comp. Biol. 2(7) e90
Obtaining Protein Dynamic Information
Protein Structures Treated as a 3-D Elastic Network
Bahar, I., A.R. Atilgan, and B. Erman. Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential. Folding & Design, 1997. 2(3): p. 173-181.
Gaussian Network Model
• Each Cα is a node in the network.
• Each node undergoes Gaussian-distributed fluctuations influenced by neighboring interactions within a given cutoff distance (7 Å).
• Decompose protein fluctuation into a summation of different modes.
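The network construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `kirchhoff_matrix` and the toy coordinates are assumptions, and only the 7 Å cutoff comes from the slide. In the Gaussian Network Model the modes come from the eigen-decomposition of this connectivity (Kirchhoff) matrix; that step is left out here to keep the sketch dependency-free.

```python
import math


def kirchhoff_matrix(ca_coords, cutoff=7.0):
    """Build the GNM connectivity (Kirchhoff) matrix from C-alpha coordinates.

    Off-diagonal entry (i, j) is -1 when nodes i and j lie within `cutoff`
    angstroms of each other; each diagonal entry counts the node's neighbors.
    """
    n = len(ca_coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Hypothetical toy chain: four C-alpha atoms spaced 3.8 angstroms apart.
toy_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
gamma = kirchhoff_matrix(toy_chain)
```

With this spacing only consecutive atoms fall inside the cutoff, so the matrix is tridiagonal and every row sums to zero.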
Functional Flexibility Score
• Utilize correlated movements to help define regional flexibility with functional importance.
Functionally Flexible Score
For each residue:
1. Find maximum and minimum correlation.
2. Use these to scale the normalized fluctuation to determine functional importance.
Gu, Gribskov & Bourne 2006 PLoS Comp. Biol. 2(7) e90
Identifying FFRs in HIV Protease
Gu, Gribskov & Bourne 2006 PLoS Comp. Biol. 2(7) e90
Other Examples: BPTI and Calmodulin
Gu, Gribskov & Bourne 2006 PLoS Comp. Biol. 2(7) e90
Side Note: Gaussian Network Model vs Molecular Dynamics
• GNM is relatively coarse-grained
• GNM is fast to compute vs. MD
– Look over larger time scales
– Suitable for high throughput
Motivation
• What if we can characterize a protein-ligand binding site from a 3D structure (primary site) and search for that site on a proteome wide scale?
• We could perhaps find alternative binding sites (secondary sites) for existing pharmaceuticals
• We could use it for lead optimization and possible ADME/Tox prediction
Background – PDB Contains Major Pharmaceuticals Bound to Receptors
Generic Name   Other Name           Treatment                             PDBid
Lipitor        Atorvastatin         High cholesterol                      1HWK, 1HW8…
Testosterone   Testosterone         Osteoporosis                          1AFS, 1I9J ..
Taxol          Paclitaxel           Cancer                                1JFF, 2HXF, 2HXH
Viagra         Sildenafil citrate   ED, pulmonary arterial hypertension   1TBF, 1UDT, 1XOS..
Digoxin        Lanoxin              Congestive heart failure              1IGJ
Background – Superfamily (Derived from Structure) Covers 38% of the Human Proteome
http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY
Background – Advantage to Using Functional Site Similarity
Protein Sequence/Structure Similarity
• Not adequately reflecting functional relationship
• Not directly addressing drug design problem
Small Molecule Similarity
• Poor correlation between structure and activity
• Infinite chemical space
Protein Functional Site Similarity
• Build closer structure-function relationships
• Limit chemical space through co-evolution
Overview of Algorithm
Protein structure is represented with Cα atoms only and is characterized with a geometric potential
• tolerant to protein flexibility and model uncertainty
Optimum superimposition is achieved with a maximum weighted sub-graph algorithm with geometric constraints
• sequence order independent to detect cross-fold relationships
• to identify sub site similarity
Functional site similarity is measured with both evolutionary correlation and physicochemical similarity
• to distinguish divergent and convergent evolution
Xie and Bourne 2007 BMC Bioinformatics, 8(Suppl 4):S9
Characterization of the Ligand Binding Site - Conceptual
1. Represent the protein structure
2. Determine the environmental boundary
3. Determine the protein boundary
4. Computation of the geometric potential
5. Computation of the virtual ligand
Xie and Bourne 2007 BMC Bioinformatics, 8(Suppl 4):S9
• Initially assign each Cα atom a value that is its distance to the environmental boundary
• Update the value with those of surrounding Cα atoms, depending on distance and orientation – atoms within a 10 Å radius define the neighbors i
$$GP = P + \sum_{i \in \text{neighbors}} \frac{P_i}{D_i + 1.0} \times \frac{\cos(\alpha_i) + 1.0}{2.0}$$
Conceptually similar to hydrophobicity or electrostatic potential that is dependent on both global and local environments
Characterization of the Ligand Binding Site - Conceptual
Xie and Bourne 2007 BMC Bioinformatics, 8(Suppl 4):S9
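The update rule above (start from the distance to the environmental boundary, then add contributions from neighbors weighted by distance and orientation) can be written out as a small function. This is a sketch under stated assumptions, not the published implementation: the function name, the tuple layout for neighbors, and reading the orientation term as an angle `alpha_i` in radians are all illustrative choices.

```python
import math


def geometric_potential(p, neighbors):
    """Geometric potential of one C-alpha atom.

    p          initial value: distance of the atom to the environmental boundary
    neighbors  iterable of (p_i, d_i, alpha_i) for atoms within the radius
               (10 angstroms on the slide): neighbor's initial value, distance
               to the neighbor, and an orientation angle in radians
               (the angle convention is assumed here)
    """
    return p + sum(
        p_i / (d_i + 1.0) * (math.cos(alpha_i) + 1.0) / 2.0
        for p_i, d_i, alpha_i in neighbors
    )


# A neighbor with alpha = 0 contributes fully; one with alpha = pi contributes nothing.
aligned = geometric_potential(2.0, [(4.0, 1.0, 0.0)])
opposed = geometric_potential(2.0, [(4.0, 1.0, math.pi)])
```

The orientation factor ranges from 0 to 1, so closer and better aligned neighbors raise the potential the most.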
Discrimination Power of the Geometric Potential
[Figure: distribution over Geometric Potential (0 to 99) for two series: binding site and non-binding site]
• Geometric potential can distinguish binding and non-binding sites
[Figure legend: Geometric Potential Scale, 100 to 0]
Boundary Accuracy of Ligand Binding Site Prediction
[Figure: two histograms, Distribution (%) versus Sensitivity (%) and Distribution (%) versus Specificity (%)]
• ~90% of the binding sites can be identified with above 50% sensitivity
• For ~70% of the binding sites identified, the specificity is above 90%
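For reference, sensitivity and specificity of a predicted binding-site boundary can be computed as below. The slides do not say how the counts were taken, so this sketch assumes a residue-level comparison between predicted and known binding-site residues; the function name and arguments are hypothetical.

```python
def boundary_accuracy(predicted, actual, all_residues):
    """Return (sensitivity, specificity) for a predicted binding site.

    predicted, actual  residue identifiers predicted / known to be in the site
    all_residues       every residue identifier in the protein
    """
    predicted, actual, all_residues = set(predicted), set(actual), set(all_residues)
    tp = len(predicted & actual)                   # site residues found
    fn = len(actual - predicted)                   # site residues missed
    fp = len(predicted - actual)                   # non-site residues predicted
    tn = len(all_residues - predicted - actual)    # non-site residues left out
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Hypothetical 10-residue protein with a 4-residue site and a 3-residue prediction.
sens, spec = boundary_accuracy({3, 4, 5}, {1, 2, 3, 4}, range(10))
```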
So Far…
• Geometric potential is dependent on the local environment of a residue – relative to other residues and the environmental boundary
• Geometric potential reasonably good at discriminating between ligand binding sites and non-ligand binding sites
• Boundary of the binding site reasonably well defined
• How to compare sites ???
• Geometric and graph characterization of the protein structure
• Chemical similarity matrix and evolutionary relationship with profile-profile comparison
• Optimum alignment with maximum-weight sub-graph algorithm
Identification of Functional Similarity with Local Sequence Order Independent Alignment
Xie and Bourne 2007 PNAS, Submitted
Similarity Matrix of Alignment
Chemical Similarity
• Amino acid grouping: (LVIMC), (AGSTP), (FYW), and (EDNQKRH)
• Amino acid chemical similarity matrix
Evolutionary Correlation
• Amino acid substitution matrix such as BLOSUM45
• Similarity score between two sequence profiles
$$d = \sum_{i} \left( f_a^{i} S_b^{i} + f_b^{i} S_a^{i} \right)$$
fa, fb are the 20 amino acid target frequencies of profile a and b, respectively
Sa, Sb are the PSSM of profile a and b, respectively
Xie and Bourne 2007 PNAS, Submitted
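The profile-profile score can be illustrated with a short function. This is a sketch assuming the score is the sum, over amino acid types, of each profile's target frequency times the other profile's PSSM entry; the function name and the three-letter toy alphabet are assumptions made for brevity (a real profile column has 20 entries).

```python
def profile_profile_score(f_a, s_a, f_b, s_b):
    """Similarity d between one column of profile a and one column of profile b.

    f_a, f_b  target frequencies per amino acid type
    s_a, s_b  PSSM scores per amino acid type, in the same order
    """
    if not (len(f_a) == len(s_a) == len(f_b) == len(s_b)):
        raise ValueError("all four vectors must have the same length")
    return sum(fa * sb + fb * sa for fa, sa, fb, sb in zip(f_a, s_a, f_b, s_b))


# Toy example over a hypothetical 3-letter alphabet.
d = profile_profile_score(
    f_a=[0.5, 0.5, 0.0], s_a=[2, 1, -1],
    f_b=[0.0, 1.0, 0.0], s_b=[-1, 3, 0],
)
```

The score is symmetric in a and b, so the order in which two sites are compared does not matter.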
Local Sequence-order Independent Alignment with Maximum-Weight Sub-Graph Algorithm
[Figure: residues L, E, R, V, K, D, L shown for Structure A and Structure B]
• Build an association graph from the graph representations of the two structures being compared. Each node is assigned a weight from the similarity matrix.
• The maximum-weight clique corresponds to the optimum alignment of the two structures.
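The two bullets above can be made concrete with a toy implementation. This is a brute-force sketch for illustration only, not the algorithm used in the talk: the function names, the `weight` callback, and the distance tolerance `tol` are assumptions, and the exhaustive clique search is only practical for a handful of residues.

```python
import math
from itertools import combinations


def association_graph(coords_a, coords_b, weight, tol=1.0):
    """Nodes are residue pairs (i, j) with positive similarity weight.

    Two nodes are connected when they pair distinct residues and the
    inter-residue distance in structure A matches that in B within `tol`.
    """
    nodes = [(i, j) for i in range(len(coords_a)) for j in range(len(coords_b))
             if weight(i, j) > 0]
    adj = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue
        d_a = math.dist(coords_a[i], coords_a[k])
        d_b = math.dist(coords_b[j], coords_b[l])
        if abs(d_a - d_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def max_weight_clique(adj, weight):
    """Exhaustively enumerate cliques and return (total weight, aligned pairs)."""
    best = (0.0, [])

    def extend(clique, total, candidates):
        nonlocal best
        if total > best[0]:
            best = (total, sorted(clique))
        remaining = set(candidates)
        for node in sorted(candidates):
            remaining.discard(node)  # visit each clique only once
            extend(clique + [node], total + weight(*node), remaining & adj[node])

    extend([], 0.0, set(adj))
    return best


# Toy example: the same three residues listed in opposite order in A and B.
labels_a, coords_a = "LER", [(0, 0, 0), (3, 0, 0), (0, 4, 0)]
labels_b, coords_b = "REL", [(0, 4, 0), (3, 0, 0), (0, 0, 0)]
same_type = lambda i, j: 1.0 if labels_a[i] == labels_b[j] else 0.0
score, alignment = max_weight_clique(
    association_graph(coords_a, coords_b, same_type), same_type)
```

Because nodes are residue pairs rather than positions along the chain, the recovered alignment does not depend on sequence order, which is the property the slide emphasizes.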
Efficient Functional Site Comparison with Evolutionary and Geometric Constraints
• The search space is segmented with the residue clusters determined from the geometric potential
• The numbers of nodes and edges are greatly reduced using the robust residue boundary orientation and neighbors
The time complexity is almost linearly dependent on the number of residues
Improved Performance of Alignment Quality and Search Sensitivity and Specificity
[Figure: Frequency (%) versus RMSD (Angstroms) for four scores: Amino Acid Grouping, Chemical Similarity, Substitution Matrix, Profile-Profile]
[Figure: False Positive Ratio and True Positive Ratio for the same four scores]
RMSD distribution of the aligned common fragments of ligands from 247 test cases showing four scores: amino acid grouping, chemical similarity, substitution matrix and profile-profile.
So What is the Potential of this Methodology?
Lead Discovery from Fragment Assembly
• Privileged molecular moieties in medicinal chemistry
• Structural genomics and high throughput screening generate a large number of protein-fragment complexes
• Similar sub-site detection enhances the application of fragment assembly strategies in drug discovery
1HQC: Holliday junction migration motor protein from Thermus thermophilus
1ZEF: Rio1 atypical serine protein kinase from A. fulgidus
Lead Optimization from Conformational Constraints
• Same ligand can bind to different proteins, but with different conformations
• By recognizing the conformational changes in the binding site, it is possible to improve the binding specificity with conformational constraints placed on the ligand
1ECJ: amido-phosphoribosyltransferase from E. coli
1H3D: ATP-phosphoribosyltransferase from E. coli
Finding Secondary Binding Sites for Major Pharmaceuticals
• Scan known binding sites for major pharmaceuticals bound to their receptors against the human proteome
• Try to correlate strong hits with known data from the literature, databases, clinical trials, etc. to provide molecular evidence of secondary effects
A Case Study
Selective Estrogen Receptor Modulators (SERM)
• One of the largest classes of drugs
• Breast cancer, osteoporosis, birth control etc.
• Amine and benzene moiety
Xie, Wang and Bourne 2007 Nature Biotechnology, Submitted.
Adverse Effects of SERMs
cardiac abnormalities
thromboembolic disorders
ocular toxicities
loss of calcium homeostasis ?????
Xie, Wang and Bourne 2007 Nature Biotechnology, Submitted.
Ligand Binding Site Similarity Search On a Proteome Scale
• Searching human proteins covering ~38% of the druggable genome against the SERM binding site
• Matching Sarcoplasmic Reticulum (SR) Ca2+ ion channel ATPase (SERCA) TG1 inhibitor site
• ER ranked top with p-value < 0.0001 from the reversed search against SERCA
[Figure: Density versus Score (0 to 80), with ER and SERCA marked]
Xie, Wang and Bourne 2007 Nature Biotechnology, Submitted.
Structure and Function of SERCA
• Regulating cytosolic calcium levels in cardiac and skeletal muscle
• Cytosolic and transmembrane domains
• Predicted SERM binding site is located in the transmembrane (TM) domain, inhibiting Ca2+ uptake
Xie, Wang and Bourne 2007 Nature Biotechnology, Submitted.
Binding Poses of SERMs in SERCA from Docking Studies
• Salt bridge interaction between amine group and GLU
• Aromatic interactions for both N- and C-moieties
6 SERMs A-F (red)
Off-Target of SERMs
cardiac abnormalities
thromboembolic disorders
ocular toxicities
loss of calcium homeostasis SERCA !
in vivo and in vitro Studies
• TAM plays a role in regulating calcium uptake activity of cardiac SR
• TAM reduces intracellular calcium concentration and release in platelets
• Cataract results from TG1-inhibited SERCA up-regulation
• EDS increases intracellular calcium in lens epithelial cells by inhibiting SERCA
in silico Studies
• Ligand binding site similarity
• Binding affinity correlation
Conclusion
• By thinking differently about how to represent proteins we have seen potential value in:
– Phylogenetic analysis
– The study of the dynamics of proteins
– Improvements to the drug discovery process
Acknowledgements
Jenny Gu: Protein Motions
Apostol Gramada: Multipole Analysis
Support Open Access
Lei Xie
Jian Yang
Swiss-Prot - 20 Year Celebration
www.pdb.org • [email protected]
on Drug Development
SERM                       Affinity (ER Site)   Affinity (SERCA)   Affinity Difference
Bazedoxifene (BAZ)         -9.44 +/- 0.54       -7.23 +/- 0.13     2.21
Lasofoxifene (LAS)         -8.66 +/- 0.40       -6.54 +/- 0.20     2.12
Ormeloxifene (ORM)         -8.67 +/- 0.18       -5.84 +/- 0.33     2.83
Raloxifene (RAL)           -8.08 +/- 0.64       -5.78 +/- 0.23     2.30
4-hydroxytamoxifen (OHT)   -7.67 +/- 0.47       -5.40 +/- 0.15     2.27
Tamoxifen (TAM)            -7.30 +/- 0.28       -5.64 +/- 0.28     1.66
• Taking account of both target and off-target for lead optimization
• Drug delivery and administration regime
A Protein is More than the Union of its Parts
• Breaking the protein into parts changes the object of the comparison
• This is interpreted in many cases to imply that the RMSD measure is inadequate.
• The reality is that it is the aligning of structures that breaks the triangle inequality, not the measure per se. The reason for failure is that we effectively compare different objects than we say we do.
From Røgen & Fain (2003), PNAS 100:119-124
New Tricks – Protein Representation
An Alternative Approach: Multipolar Representation
Roots in Spherical Harmonics
• Parameterization + boundary conditions
Charge distribution (i.e. structure) ↔ $\{q_{lm}^{\mathrm{out}};\ M_{lm}^{\mathrm{in}};\ q_{lm}^{i};\ M_{lm}^{i}\}$
Scalar potential
Gramada & Bourne 2006 BMC Bioinformatics 7:242
New Tricks – Protein Representation
Spatial distribution of a scalar quantity
• “Out” Multipoles
$$q_{lm} = \sum_{i=1}^{N} r_i^{l}\, Y^{*}_{lm}(\theta_i, \phi_i); \quad l = 0, \ldots, \infty; \quad m = -l, \ldots, l$$
For a given rank l, they form a 2l+1 dimensional vector under 3D rotations
$$q_l = \{ q_{l,m} \}_{m = -l, \ldots, l}$$
Vector algebra applies => metric properties
Gramada & Bourne 2006 BMC Bioinformatics 7:242
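As a small worked example of the "out" multipoles, the sketch below sums r^l times the conjugated spherical harmonic over a set of points for ranks l = 0, 1, 2, using the standard closed-form expressions in Cartesian coordinates. It is an illustration only: the function names are hypothetical, every point is given unit weight (an assumption), and coordinates are taken relative to whatever origin the caller chooses. The norm of each rank-l vector does not change when the points are rotated, which is what allows the multipoles to serve as shape descriptors.

```python
import math


def solid_harmonics(x, y, z):
    """r^l * Y_lm(theta, phi) for l = 0, 1, 2 (Condon-Shortley phase)."""
    plus, minus = complex(x, y), complex(x, -y)   # x + iy and x - iy
    c1 = math.sqrt(3 / (8 * math.pi))
    c2 = math.sqrt(15 / (2 * math.pi))
    return {
        (0, 0): 0.5 / math.sqrt(math.pi),
        (1, -1): c1 * minus,
        (1, 0): math.sqrt(3 / (4 * math.pi)) * z,
        (1, 1): -c1 * plus,
        (2, -2): 0.25 * c2 * minus ** 2,
        (2, -1): 0.5 * c2 * minus * z,
        (2, 0): 0.25 * math.sqrt(5 / math.pi) * (2 * z * z - x * x - y * y),
        (2, 1): -0.5 * c2 * plus * z,
        (2, 2): 0.25 * c2 * plus ** 2,
    }


def out_multipoles(points):
    """q_lm = sum_i r_i^l * conj(Y_lm(theta_i, phi_i)) for l = 0, 1, 2."""
    q = {}
    for x, y, z in points:
        for lm, value in solid_harmonics(x, y, z).items():
            q[lm] = q.get(lm, 0) + complex(value).conjugate()
    return q


def rank_norm(q, l):
    """Euclidean norm of the (2l + 1)-dimensional vector q_l."""
    return math.sqrt(sum(abs(q[(l, m)]) ** 2 for m in range(-l, l + 1)))


# Hypothetical point set and the same set rotated by 90 degrees about the x axis.
points = [(1.0, 0.5, -0.3), (-0.7, 1.2, 0.4), (0.2, -0.9, 1.1)]
rotated = [(x, -z, y) for x, y, z in points]
norms = [rank_norm(out_multipoles(points), l) for l in range(3)]
norms_rotated = [rank_norm(out_multipoles(rotated), l) for l in range(3)]
```

Comparing `norms` with `norms_rotated` gives matching values for each rank up to floating-point error.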
An Alternative Approach: Multipolar Representation
New Tricks – Protein Representation
The multipoles can be interpreted as shape descriptors
In principle, from the entire series of multipoles one can reconstruct the scalar field and therefore the density, i.e. the entire set of Cartesian coordinates, i.e. the structure with a geometric level of detail
The partitioning of the multipole series according to the various representations of the rotation group allows for a multi-scale description of the structure
An Alternative Approach: Multipolar Representation
Gramada & Bourne 2006 BMC Bioinformatics 7:242
New Tricks – Protein Representation