Upload
others
View
7
Download
0
Embed Size (px)
Citation preview
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Prediction of Protein-Protein
Binding Sites and
Epitope Mapping
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Epitope Mapping
▪ Antibodies interact with antigens at epitopes
– Epitope is collection residues on antigen
– Continuous (sequence) or non-continuous
– Patentable druggable sites
▪ Methods for mapping epitopes
– X-ray crystallography (gold standard)
– Peptide scanning
– Display technologies (e.g. phage display)
– Hydrogen-Deuterium exchange (HDX)
▪ In silico methods for mapping epitopes
– Patch Analysis (antigen surface properties)
– Protein Docking (pose prediction)
– Very challenging: large surface area / conformational flexibility
▪ Key question: are in silico methods accurate?
– Suggest mutagenesis experiments early in development
– Reduce crystallographic efforts / expense
Antigen
Non-continuous interaction
Continuousinteraction
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Hydrophobic Patches
▪ Identify key hydrophobic regions on the molecular surface
– Hydrophobic patches implicated in protein binding
– Desolvation of hydrophobic patches is a driver of protein-protein interactions (e.g. binding affinity, HIC, aggregation, etc.)
▪ Protein Patch Analyzer Methodology
– Derive patches from local lipophilic potential1
– Map atomic logP (octanol/water partition coefficient) onto surface
– Remove portions of surface with local logP < 0.1 (≈ methyl C)
– Retain contiguous patches of at least 50 Å2 (≈ methane)
Hydrophobic
Hydrophilic
Hydrophobic patch
1. Wildman,S.A., Crippen,G.M.;J. Chem. Inf. Comput. Sci. 39(5), 868-873, (1999)
Antibody(PDB:3G6D)
Negative patch
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
% Inclusion Body Predictions for 31 Adnectins
▪ Sequences and experimental dataprovided in publication1
– % IB: a measure of insoluble proteinaggregate formation
– 31 triple mutants in series
▪ Mutating key residues of hydrophobicpatches improves solution behavior
– Mutated residues correspond tohydrophobic patches
– Polar variantsreduce % IB
– Rational designusing homologymodels effectivein this case
Adnectin % IB
31 76
1 17
2 19
3 20
Adnectin 31Homology Model
Trainor et al, JMB 2016,428(6), p1365-1374
31
1
2
3
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Hydrophobic Patches Predict Adnectin Solubility
▪ Homology model of all 31 mutant adnectins
– Measure hydrophobic patch surface area
– Correlate with experimental data – compare with SAP score
0
10
20
30
40
50
60
70
80
10 20 30 40 50 60 70 80
Pre
dic
ted
% in
Incl
usi
on
Bo
die
s
Experimental % in Inclusion Bodies
MOE (R2=0.701)
SAP (R2=0.709)
MOE Hydrophobic Patch Descriptor
– Sum of hydrophobic patch surface areas
– Averaged over 25 LowModeMD conformations
SAP Spatial Aggregation Propensity (Trout)
– Score of exposed residues
– Averaged over MD simulation
Chennamsetty et al, J Phys Chem B 2010,114(19), p6614-6624
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
▪ ~3000 crystal structures of antibodies in PDB, 1582 used
– Removed structures with resolution > 3 Å or missing segments in Fv domain
▪ Small hydrophobic patches are common
– Unusually large patches present concernfor developability
– Multiple hydrophobic patches togethercan reduce solubility
Hydrophobic Patch StatisticsFr
eq
ue
ncy
120 Å2
0
50
100
150
200
250
300
350
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
# of Hydrophobic Patches
0
50
100
150
200
250
300
60
80
10
01
20
14
01
60
18
02
00
22
02
40
26
02
80
30
03
20
34
03
60
38
04
00
42
04
40
46
0
SA of Largest Hydrophobic Patch (Å2)
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
HIC Predictions for 137 Clinical Candidates
▪ Recent publication providing sequences and experimental data1
– Diverse set of antibody Fv regions expressed on common IgG1 scaffold
– Use data as benchmark to measure broad applicability of HIC predictions
▪ Generate homology models and predictions with descriptors / QSPR
– Automatic antibody modeling and patch surface area calculation
AntibodyHomology Model
1: Jain et al. PNAS 2017;114:944-949
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
HIC Predictions for 137 Clinical Candidates
▪ 3D hydrophobicity descriptorsoutperform sequence-based index
– Signals are generally modest
– Likely due to clinical candidateshaving been solubility optimized
– Difficult to differentiate narrowrange of low HIC RT values
▪ CDR region important
– Candidates -> framework mutations
– LowModeMD sampling improvement
▪ Prediction quality is case dependent
– Adnectin predictions more accurate
– Triple mutants easier than multi-class
▪ QSPR model more predictive
– 4-point QSPR can enhance signal
r2 = 0.17
r2 = 0.10
r2 = 0.23
r2 = 0.33
r2 = 0.41
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
▪ Sum of hydrophobic patch surface areas of CDRs calculated
▪ Many candidates hydrophobic vs PDB antibodies
– Trastuzumab has a low HIC RT (9.7 min)
– Cixutumumab has higher HIC RT (11.8 min) but longer H3 loop dilutes the hydrophobicity
– Cituxumumab removed from pipeline after phase II
▪ Predictions across classes are difficult
– Relative predictions within a series is more reliable
– Ab:Ag co-crystal structure useful for optimization
HIC Predictions for 137 Clinical Candidates
Fre
qu
en
cy
cdr_hyd SA (Å2)Trastuzumab
Cixutumumab
700 Å2
0
5
10
15
20
25
0 50 100 150 200 250 300 350 400 450 500 550 600 650 700
240 Å2
PDB
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
▪ 750 antibody:antigen complexes from antibody structural database
– Antigens larger than 2000 residues removed
– Count hydrophobic patch frequency at PPI
– Order by patch surface area (blue is largest)
▪ Useful for epitope mapping and identifying residues for PPI optimization
– To reduce undesired PPIs, mutate residues forming hydrophobic patches
Hydrophobic Patches Prevalent at PPI Sites
%
0
10
20
30
40
50
60
70
80
90
100
ab ag
Antibody
Antigen
Antibody
Antigen
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Patch Analysis of Interleukin-1β
▪ Known complexes with Canakinumab and Gevokizumab
– Protein-protein interfaces consist of clusters of multiple patches
– Patches are present in other exposed regions
▪ Patch analysis alone is insufficient to predict epitopes
– Ranking patches by size does not correlate with interface
Interleukin-1β (4I1B)
Canakinumab (4G6J)Gevokizumab (4G6M)
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Protein-Protein Docking
▪ Predict macromolecular complexes with protein docking
– More comprehensive approach for identifying binding sites
– Computationally more expensive than Protein Patch Analysis
▪ First principles physical approximations
– No protein structure training set used in parametrization
▪ Site restraints incorporated using generic energy term
– Applied automatically to antibody CDR regions
AntibodyPDB: 3G6D
Antigen PosePredictions (100-200)
Antibody CDRs
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
0 10 20 30 40 50 60 70 80 90 100
PatchDockHEX/G
GRAMMHEX
FTDock
FTDock/GZDOCK2.1MolFit/G
MolFig/GHDOT
ZDOCK1.3SDOCK
ZDOCK2.…FRODOCK
ATTRACT
Success rate (%)
Performance of Protein Dockers on Antibodies
Top 5
Top 30
Top 100
Antibody Docking Results
▪ Repacking/refinement improves ranking– Backbone and sidechain
rearrangement in complex
– Scoring the single correct pose is not realistic
– Ranking of top 100 shows good enrichment
– Use ensemble of poses to refine prediction
▪ ZDOCK 4.0 Benchmark test set– Success defined as a structure with ligand RMSD < 10 Å
– Prior knowledge (site restraint) needed to achieve high accuracy with state-of-the-art dockers
Huang, S.-Y.; Exploring the potential of global protein–protein
docking; Drug Discovery Today, 20(8) (2015) 969–977
21 antibody-antigen complexes in ZDOCK set
ATTRACT:LJ
ZDOCK3.0.2
PIPER
MOE
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Epitope Mapping Uses Multiple Docked Poses
▪ An approach for analyzing the protein poses is required
– 100-200 poses needed for reasonable performance
– Use protein-protein interaction fingerprints to describe poses
▪ Goal: Identify sets of interacting residues
– Dock antibody Fv restricted to CDRs to the entire antigen
– For each pose, identify antigen residues that are implicated
– Generate statistical summaries of docked poses
– Is epitope identification more robust than predicting a single pose?
▪ Validation: Similarity to native complex structure
– Measure overlap betweensets of interacting residues in different poses
– Use ZDOCK5 benchmark complex structures
Interacting residues(spheres)
Antibody pose 1 (blue)
Antibody pose 2 (red)
Antigen
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Identify Interactions using Surface Patch Contacts
▪ Identify residue-residue interactions from antibody poses
– Fingerprints: surface patch contacts on the antigen in each pose
– Count residue on the antigen if it is interacting with the antibody
▪ Classify patch contacts into four categories
– Both HYD, One HYD, Any POS/NEG, None
▪ Example: Colicin E7 nuclease (7CEI), key residue N26
Native complex crystal contacts Unbound structure docked pose (5Å)
H-bonds
ReceptorsurfaceInhibitor
surface
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
▪ Benchmark data set of protein-protein complexes
– 150 rigid-body targets from ZDOCK test set
– 32 antibody-antigen complexes, 118 others
▪ Protein contacts calculated using distance and energy
– Cut-offs applied to select mostsignificant interactions
– Average number of contacts:~6 per complex
▪ Surface interactions calculatedfrom patch fingerprints
– Most contacts correspond tosurface interactions
– Energetic calculations tend toover-emphasize charged interactions
– Surface contacts betterrepresent hydrophobic contacts
0
5
10
15
20
25
0 1 2 3 4 5 6 7 8 9 10 11 12 13
Protein Contacts
non-Ab
Antibody
0
5
10
15
20
25
30
0 1 2 3 4 5 6 7 8 9
Residues in Common
non-Ab
Antibody
Surface Contacts Describe Native Complexes
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Protein-Protein Interaction Fingerprints
▪ Populations are calculated for each residue in all poses
– The top-200 refined poses are used in the analysis
– The most likely residues to interact with the antibody are identified
▪ Highest peaks do not necessarily represent epitopes
– Cluster poses to identify residues that are spatially close
Interleukin-1β/canakinumabPose Interaction Histogram
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Using Fingerprints to Determine Epitopes
Poses intop cluster
selected
Consensusresidues in
epitope
Interleukin-1β/canakinumab Epitope Selection▪ Cluster poses by fingerprint
similarity
– Connected poses are defined by a similarity cutoff (0.4)
▪ Rank clusters by ensemble free energy
– Effective temperature is an empirical parameter fit to score values
▪ For each cluster generate the epitope
– Each cluster is a potential epitope (spatially close)
– Select highest-population residues from each cluster
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Epitope Benchmark Results
▪ Predictions carried out using ZDOCK test set
– Selected targets with representative docking performance
– Divided into antibody (20) and non-antibody (30) classes
▪ Native contacts identified in predicted epitopes
▪ At least 3 native contacts in top-ranked epitope
– Antibody: 60%, non-antibody: 53%
0 25 50 75 100
Top 5
Top 1
Antibody
5+ 3-4 1-2
0 25 50 75 100
Top5
Top 1
non-Antibody
5+ 3-4 1-2
0
5
10
0 1 2 3-5 6-9 10-14 15+
Nu
mb
er o
f ta
rget
s
Number of Hits in top-200
Benchmark Docking ResultsAntibody
non-Antibody
Percentage of targets with native contacts
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Predicted Epitopes of IL-1β Complexes
▪ Results of docking with different antibody structures
– Correct ordering impossible from patch analysis alone
– Top-ranked docking poses are not in top-ranked clusters
▪ PPI fingerprint analysis correctly predicts crystal epitopes
Gevokizumab#1#2#3
Canakinumab#1#2#3
IL-1β
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Hydrogen-Deuterium Exchange (HDX)
▪ Soak the protein with deuterium
– Deuterium will replace H over time; buried atoms no replacement
– Solvent accessibility, h-bonding, sidechain neighbors affect uptake
▪ Digest with pepsin and use mass spectrometry
– Collection of partially deuterated sequence fragments
▪ Measure deuteration of backbone amides in solution
D2O
Peptide N-DPeptide N-H
Iacob et al, Expert Rev. Proteomics 2015,12(2), p1-11
D
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
▪ Compare bound and unbound deuterated structures
– Reduction in deuteration indicates interface residues
▪ Provides low-resolution information – sequence segments
– Qualitative effect, divided into weak and strong categories
– Effects may not be due to contact, e.g. flexibility changes
– Not all interactions appear
▪ Idea: use HDX data to bias docking to find IL-23:adnectin epitope residues
Adnectin interface residuesshow less deuteration
(less exposed to solvent)
IL-23p40
HDX Useful to Probe Binding Interfaces
Iacob et al, Expert Rev. Proteomics 2015,12(2), p1-11
p40 domain sequence p19 domain sequence
HDX MS Data (IL-23)
0 50 100 150 200 250 300 0 50 100 150
p19
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Characterization of Binding Interface
▪ HDX data alone is not enough to characterize epitope
– Resolution does not label single residues
– Energy-based methods assume accurate atomic structure
– Buried surface alone does not distinguish different types of interactions (e.g. hydrophobic, H-bonds)
– Surface Patch Fingerprints can be effective
▪ Patch contacts reflect key interactions
– Close agreement with atomic-levelinteractions in native structure
– More selective than contacts basedon energy alone
– Surface patches provide a more robustcharacterization of docked conformations
Epitoperesidues
HDX MS Data (IL-23)
p40 50 100 150 200 250 300 p19 50 100 150
Co-crystal Inspection
Co-crystal Patch FP
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Interleukin-23 : Adnectin Calculation
▪ IL-23:Adnectin show conformational changes in complex
– Docking the bound conformations works very well (RMSD < 0.5 Å, #1)
– Docking unbound conformations fails (too much loop movement)
▪ Docking procedure
1. 20 Adnectin models
2. 20 IL-23 models
3. HDX IL-23 constraints
4. FG-loop constraints
5. Dock Pairs – 400 runs
▪ Epitope Analysis
– Keep top 1,000 poses
– Calculate patch FPsand cluster
– Rank clusters by dockingscore, contact area, HDX overlap
– Output top rank
IL-23 loop notwell resolvedin unboundcrystal structure
20 IL-23 Homology Models(from PDB:3D87 - unbound)
20 Adnectin Homology Models(from PDB:1FNA - unbound)
AdnectinFG-loop
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Interleukin-23 : Adnectin Epitope Prediction
▪ Ensemble modeling preserveskey features of complex
– Consensus fingerprints showgood correspondence (4/8)to crystal interactions
– Homology models outperformdocking of unbound structureswhen there is significantconformational flexibility
▪ HDX biased protein docking can elucidate epitope
0 50 100 150 200 250 300
3D87:1FNA models
3D87 models:Xtal
Xtal:1FNA models
Crystal inspection
0 50 100 150p40 p19
Receptormodels
DockedposesPredicted
epitope
Variableloops
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.
Conclusions
▪ Protein Patch Analyzer for detecting protein hot spots
– Fast method for predicting protein binding sites on a single body
– Identify hydrophobic patches implicated in PPIs
▪ Predict protein-protein complexes with docking
– Predicts protein interactions between two bodies
– Site restraints derived from automatic sequence annotation are effective for docking antibody-antigen complexes
▪ Analysis of docked poses can be used for epitope mapping
– Similar poses determined by clustering PPI fingerprints
– First-ranked epitope correct in 56% of antibody-antigen targets
▪ Epitope analysis can be used to interpret HDX data
– Case study: Adnectin complex with IL-23
– HDX data can be incorporated into docking calculations
– Consensus fingerprints show good agreement with crystal structure
▪ Epitope mapping available in MOE 2018
Copyright © 2017 Chemical Computing Group ULC All Rights Reserved.