Computational Aspects of HTS Planning and Analysis Course
Introduction to ChEMBL
Anne Hersey
ChEMBL Group
EMBL-EBI
Outline
• ChEMBL
  • Background
  • Content
  • Identifying compounds binding to a protein target
  • Assessing compounds
• Using other resources
  • Crystal structures
  • Druggability algorithms
Compound Selection
[Flowchart: Target annotation possible?]
• No: Diversity based list
• Yes, activity-based branch (ChEMBL): Known actives? Are the actives ‘Drug like’? If yes, select compounds similar to known actives, giving activity based hits and a virtual screening file
• Yes, structure-based branch (PDBe, DrugEBIlity): Known structure? Is the binding site ‘Drug like’? If yes, select compounds compatible with binding site, giving structure based hits and a virtual screening file
What is ChEMBL
• Open access database for drug discovery
• Freely available (searchable and downloadable)
• Content:
  • Bioactivity data manually extracted from the primary medicinal chemistry literature from journals such as J. Med. Chem.
  • Subset of data from PubChem
  • Deposited data e.g. neglected disease screening, GSK kinase set
• Bioactivity data is associated with a biological target and a chemical structure
• Compounds are stored in a structure searchable format
• Protein targets are linked to protein sequences in UniProt
• Updated regularly with new data
• Secure searching (https://www.ebi.ac.uk/chembldb)
Data Example
EP1 Antagonists for Inflammatory Pain
A. Hall et al., Bioorg. Med. Chem. Lett. 19 (2009) 2599–2603
View of data in ChEMBL
[Figure labels: Compound, Target, Activity, Assay, Lit ref]
Some Numbers (ChEMBL17)
Accessing ChEMBL Data
Drug Targets
Data for:
• ~260 drug targets
• ~6000 protein targets (single proteins, families and complexes)
Are there known Active Compounds for my Target?
From ChEMBL, identify compounds that bind to the target. Select:
• Potent compounds
• Rule of 5 compliant (drug-like)
• Ligand efficient molecules
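The "Rule of 5 compliant" criterion can be sketched as a simple count of violations. This is a minimal sketch, assuming the commonly cited Lipinski thresholds (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) and descriptors that have already been calculated elsewhere. The function names, argument names and example values are illustrative assumptions, not part of ChEMBL.

```python
def ro5_violations(mw, logp, hbd, hba):
    """Count Lipinski rule-of-five violations from precomputed descriptors."""
    checks = [
        mw > 500,   # molecular weight above 500 Da
        logp > 5,   # calculated logP above 5
        hbd > 5,    # more than 5 hydrogen-bond donors
        hba > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


def is_ro5_compliant(mw, logp, hbd, hba, max_violations=0):
    """True if the compound has at most `max_violations` violations."""
    return ro5_violations(mw, logp, hbd, hba) <= max_violations


# Hypothetical descriptor values, for illustration only.
small = dict(mw=350.4, logp=2.1, hbd=2, hba=6)
large = dict(mw=720.9, logp=6.3, hbd=4, hba=12)
print(ro5_violations(**small), is_ro5_compliant(**small))
print(ro5_violations(**large), is_ro5_compliant(**large))
```

Some selection workflows tolerate a single violation; passing `max_violations=1` covers that case.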
Example (DPP4):
Linagliptin
Saxagliptin
Sitagliptin
Alogliptin
Compounds with DPP4 data in ChEMBL
Are they drug like?
Ligand Efficiencies
• LE = -RT ln(Ki) / Heavy_atoms (Hopkins AL et al., DDT 2004)
• BEI = pKi * 1000 / MW (Abad-Zapatero C et al., DDT 2005)
• SEI = pKi * 100 / PSA
• LLE = pKi - ALogP (Leeson PD et al., NRDD 2007)
In ChEMBL, LE is calculated for: IC50, Ki, EC50, Kd, XC50, AC50, Potency
Select most ligand efficient compounds
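The four efficiency metrics listed on this slide can be computed directly from a potency value and a few descriptors. The following is a minimal sketch, assuming Ki is supplied in nM, R is expressed in kcal/(mol·K) and the temperature is 300 K; these unit and temperature choices, the function name and the example inputs are assumptions made for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K); unit choice is an assumption


def ligand_efficiencies(ki_nM, heavy_atoms, mw, psa, alogp, temp_K=300.0):
    """Return LE, BEI, SEI and LLE as defined on the slide."""
    if ki_nM <= 0 or heavy_atoms <= 0 or mw <= 0 or psa <= 0:
        raise ValueError("ki_nM, heavy_atoms, mw and psa must be positive")
    ki_M = ki_nM * 1e-9        # convert nM to mol/L
    pki = -math.log10(ki_M)    # pKi = -log10(Ki in mol/L)
    return {
        "LE": -R_KCAL * temp_K * math.log(ki_M) / heavy_atoms,
        "BEI": pki * 1000.0 / mw,
        "SEI": pki * 100.0 / psa,
        "LLE": pki - alogp,
    }


# Hypothetical compound, for illustration only.
metrics = ligand_efficiencies(ki_nM=1.0, heavy_atoms=30, mw=400.0,
                              psa=90.0, alogp=3.0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sorting a list of compounds by any of these values, highest first, is one way to pick out the most ligand efficient candidates.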
Other Information about compounds
Linagliptin bound to DPP4
Compound availability
Another example
Sci Transl Med 5, 206ra138 (2013)
Information on PERK Target
Searching by Compound
Extending dataset – Similar Targets
>sp|Q9NZJ5|E2AK3_HUMAN Eukaryotic translation initiation factor 2-alpha kinase 3 OS=Homo sapiens GN=EIF2AK3 PE=1 SV=3 MERAISPGLLVRALLLLLLLLGLAARTVAAGRARGLPAPTAEAAFGLGAAAAPTSATRVPAAGAVAAAEVTVEDAEALPAAAGEQEPRGPEPDDETELRPRGRSLVIISTLDGRIAALDPENHGKKQWDLDVGSGSLVSSSLSKPEVFGNKMIIPSLDGALFQWDQDRESMETVPFTVESLLESSYKFGDDVVLVGGKSLTTYGLSAYSGKVRYICSALGCRQWDSDEMEQEEDILLLQRTQKTVRAVGPRSGNEKWNFSVGHFELRYIPDMETRAGFIESTFKPNENTEESKIISDVEEQEAAIMDIVIKVSVADWKVMAFSKKGGHLEWEYQFCTPIASAWLLKDGKVIPISLFDDTSYTSNDDVLEDEEDIVEAARGATENSVYLGMYRGQLYLQSSVRISEKFPSSPKALESVTNENAIIPLPTIKWKPLIHSPSRTPVLVGSDEFDKCLSNDKFSHEEYSNGALSILQYPYDNGYYLPYYKRERNKRSTQITVRFLDNPHYNKNIRKKDPVLLLHWWKEIVATILFCIIATTFIVRRLFHPHPHRQRKESETQCQTENKYDSVSGEANDSSWNDIKNSGYISRYLTDFEPIQCLGRGGFGVVFEAKNKVDDCNYAIKRIRLPNRELAREKVMREVKALAKLEHPGIVRYFNAWLEAPPEKWQEKMDEIWLKDESTDWPLSSPSPMDAPSVKIRRMDPFATKEHIEIIAPSPQRSRSFSVGISCDQTSSSESQFSPLEFSGMDHEDISESVDAAYNLQDSCLTDCDVEDGTMDGNDEGHSFELCPSEASPYVRSRERTSSSIVFEDSGCDNASSKEEPKTNRLHIGNHCANKLTAFKPTSSKSSSEATLSISPPRPTTLSLDLTKNTTEKLQPSSPKVYLYIQMQLCRKENLKDWMNGRCTIEERERSVCLHIFLQIAEAVEFLHSKGLMHRDLKPSNIFFTMDDVVKVGDFGLVTAMDQDEEEQTVLTPMPAYARHTGQVGTKLYMSPEQIHGNSYSHKVDIFSLGLILFELLYPFSTQMERVRTLTDVRNLKFPPLFTQKYPCEYVMVQDMLSPSPMERPEAINIIENAVFEDLDFPGKTVLRQRSRSLSSSGTKHSRQSNNSHSPLPSN
PERK sequence from UniProt
BLAST search for similar sequences
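Before submitting a sequence to a similarity search, it can help to pull the accession and gene name out of the UniProt FASTA header. The following is a minimal sketch that parses the header shown above using only the standard library; the function name and the returned dictionary layout are illustrative assumptions.

```python
import re


def parse_uniprot_header(header):
    """Split a UniProt FASTA header into identifiers, description and KEY=value fields."""
    if not header.startswith(">"):
        raise ValueError("FASTA header must start with '>'")
    ids, _, rest = header[1:].partition(" ")
    database, accession, entry_name = ids.split("|")
    # The free-text description runs up to the first two-letter KEY= field.
    description = re.split(r"\s[A-Z]{2}=", rest, maxsplit=1)[0]
    # Collect fields such as OS=, GN=, PE= and SV=.
    fields = dict(re.findall(r"\b([A-Z]{2})=(.*?)(?=\s[A-Z]{2}=|$)", rest))
    return {
        "database": database,
        "accession": accession,
        "entry_name": entry_name,
        "description": description,
        "fields": fields,
    }


header = (">sp|Q9NZJ5|E2AK3_HUMAN Eukaryotic translation initiation factor "
          "2-alpha kinase 3 OS=Homo sapiens GN=EIF2AK3 PE=1 SV=3")
info = parse_uniprot_header(header)
print(info["accession"], info["fields"]["GN"], info["fields"]["OS"])
```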
Compound Selection
[Flowchart repeated: target annotation, activity-based branch (ChEMBL) and structure-based branch (PDBe, druggability)]
Structure Based Design
Is my Protein Druggable?
• Structure based methods identify cavities in protein crystal structures and assess the properties of these cavities
• Rules for properties that indicate a druggable cavity are learnt from analysis of co-crystal complexes with drug-like ligands
• Examples of Algorithms:
• PocketFinder – An, Totrov & Abagyan, 2005
• Druggability Indices – Hajduk, Huth & Fesik, 2005
• Rule based method - Perola, Herman & Weiss, 2012
DrugEBIlity
• https://www.ebi.ac.uk/chembl/drugebility/structure
• All potential pockets in crystal structures from PDB predicted using a pocket-finding algorithm (based on SurfNet, Laskowski 1995)
• Decision tree algorithm trained on known binding pockets for drug-like ligands (e.g., rule-of-five)
• Decision tree used to classify unknown pockets into druggable/undruggable
• Second ‘tractability’ algorithm also trained with more relaxed ligand criteria (e.g., Mwt < 800)
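The two ligand criteria mentioned above can be read as two ways of labelling known binding pockets for training. The sketch below is one possible reading, assuming a pocket counts towards the "druggable" training set when its bound ligand has no rule-of-five violations and towards the "tractable" set when the ligand's molecular weight is below 800. The slide gives these criteria only as examples, so the exact rules, the function name and the field names here are hypothetical.

```python
def training_labels(ligand):
    """Label a pocket's bound ligand for the two training sets.

    `ligand` is a dict with hypothetical keys 'mw' (molecular weight)
    and 'ro5_violations' (count of rule-of-five violations).
    """
    return {
        # Stricter set: drug-like ligands (assumed: zero rule-of-five violations).
        "druggable_set": ligand["ro5_violations"] == 0,
        # Relaxed set: assumed molecular weight cutoff of 800.
        "tractable_set": ligand["mw"] < 800,
    }


# Hypothetical ligands, for illustration only.
print(training_labels({"mw": 420.0, "ro5_violations": 0}))
print(training_labels({"mw": 650.0, "ro5_violations": 2}))
print(training_labels({"mw": 910.0, "ro5_violations": 3}))
```

Under this reading a ligand can fall in the relaxed set without being in the strict one, which is how a pocket could be judged tractable but not druggable.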
Is the Binding Site Druggable?
Acknowledgements
• John Overington
• Anna Gaulton
• Mark Davies
• Patricia Bento
• Jon Chambers
• Francis Atkinson
• Louisa Bellis
• Yvonne Light
• George Papadatos
• Shaun McGlinchey
• Nathan Dedman
• Michal Nowotka
• Ruth Akhtar
• Kaz Ikeda