CeMM, Vienna (2008-06-12) Paper: http://www.ncbi.nlm.nih.gov/pubmed/18621671
Using side effects of medicines to identify drug targets
Michael Kuhn, Peer Bork lab, EMBL Heidelberg
Drugs and their targets
Side effects
Prediction of drug targets
Drugs
• only consider organic small molecules
• no antibodies, peptides, ions
• ~750 drugs with side effects
• ~600 drugs with targets
Targets
• most drugs bind to proteins
• few exceptions, e.g.:
• alkylating antineoplastic agents (which modify DNA)
• contrast agents for X-ray
• saline solution
[Figure 1, pie chart. Families shown: Penicillin-binding protein; Myeloperoxidase-like; Sodium:neurotransmitter symporter family; Type II DNA topoisomerase; Fibronectin type III; Cytochrome P450; Rhodopsin-like GPCRs; Nuclear receptors; Ligand-gated ion channels; Voltage-gated ion channels. Percentage labels in the chart: 26.8, 13, 7.9, 5.5, 4.1, 3, 2.7, 2.3, 2.1, 1.9.]
[Figure 2, histogram. x-axis: -log10 affinity (1.8 to 11.8); y-axis: frequency (0 to 400).]
protein is believed to be the sole or major route through which a drug achieves its efficacy, we assign the drug against this single target; for example, the histamine H1 receptor is believed to be the major mechanistic target for cetirizine and hydroxyzine, and acebutolol acts through the β1 adrenoceptor, although all these drugs show binding to other G-protein-coupled receptors (GPCRs) in in vitro assays. In other cases, the drug acts through a number of target subtypes: for example, carvedilol acts through blocking a number of α- and β-adrenoceptors. Finally, a drug can act through multiple distinct mechanisms, and therefore unrelated targets. For example, ritonavir is an HIV protease inhibitor; however, it is usually given in combination with other HIV protease inhibitors because it inhibits the cytochrome P450 3A4 (CYP3A4)-mediated metabolism of other HIV protease inhibitors such as lopinavir. In such cases, both HIV1 protease and human CYP3A4 are regarded as the molecular targets.
There is a relatively small, but clinically significant, class of drugs that bind to either ribosomes or DNA, or that have no distinct or an unknown mode of action. The literature changes frequently in terms of the knowledge available about drug indications and mechanisms of action, and so this information needs to be reviewed regularly.
Efficacy targets of current drugs
On the basis of existing knowledge, we were able to determine that all current drugs with a known mode-of-action act through 324 distinct molecular drug targets. Of these, 266 are human-genome-derived proteins, and the remainder are bacterial, viral, fungal or other pathogenic organism targets. Small-molecule drugs modulate 248 proteins, of which 207 are targets encoded by the human genome (TABLE 1). Oral small-molecule drugs target 227 molecular targets, of which 186 are human targets.
A complicating feature of any such analysis is that many drugs have complex and relatively poorly understood pharmacology, and often limited selectivity against related proteins, and some targets are actually complex multimeric proteins with variable subunit compositions and so on. If one makes the assumption that proteins related down to 50% identity show related pharmacology, then this list of 324 targets expands to 604 genes for the human genome (comparison carried out against ENSEMBL genome June 2006 release containing 29,679 genes). Extending the analysis to include all close homologues (35% identity or closer) increases the number to 1,048 genes (3.5% of the genome). This line of reasoning led to the initial estimate of the size of the druggable genome (ref. 4). Understanding the real pharmacological footprint of current drugs offers many opportunities for both developing new, optimized agents with different selectivity profiles, and also more efficient lead discovery and optimization strategies.
Current biological drugs target 76 proteins, with currently marketed monoclonal antibody therapeutics acting on 15 distinct human targets. So far, only nine targets are modulated by both small-molecule and biological drugs, with the differing agent types usually targeting different domains or binding sites. This relatively small number of jointly modulated targets is driven by both technical and commercial considerations. For example, the biological drugs cetuximab and panitumumab target the extracellular domain of the receptor tyrosine kinase EGFR (ERBB1), whereas the small-molecule drugs gefitinib and erlotinib target the adenine portion of the ATP-binding site of the cytosolic catalytic kinase domain within the same receptor.
Drug polypharmacology
It was clear from both our curation of drug targets from the literature and also data-mining of known affinity values of drugs for targets (as abstracted in a large database of medicinal chemistry literature, ref. 17) that many drugs show clinically relevant polypharmacology (that is, they are 'dirty' drugs, ref. 18). Quite expectedly, closely related members of the gene family will show significant drug promiscuity, and, as a result of the generally similar function of these proteins, give rise to complex composite clinical pharmacology. The point of genuine multitarget effects of drugs is well illustrated by several recently launched protein kinase inhibitors. Imatinib, originally developed as a highly selective inhibitor of c-ABL (ref. 11), a target association that led to its first approval for chronic myeloid leukaemia, has subsequently been discovered to have significant activity against several other clinically relevant kinases, such as c-KIT (refs 12–14), leading to expansion of the clinical utility of this important therapeutic. Sorafenib has been recently launched as an explicit multikinase inhibitor, affecting both tumour proliferation and tumour angiogenesis pathways, and acting
Figure 1 | Gene-family distribution of current drugs per drug substance. The family share as a percentage of all FDA-approved drugs is displayed for the top ten families. Beyond the ten most commonly drugged families, there are a further 120 domain families or singletons for which only a few drugs have been successfully launched. Data based on 1,357 dosed components from >20,000 approved products, FDA, December 2005. GPCR, G-protein-coupled receptor.
Figure 2 | Frequency distribution for small-molecule drug potencies.
Overington et al. How many drug targets are there? Nat Rev Drug Discov (2006) vol. 5 (12) pp. 993-6
Protein families
Drug-target databases
• DrugBank
• Matador (Bork/Russell groups at EMBL)
• PDSP Ki database (Ki ≤ 10 µM)
Drugs per target
[Plot: number of binding drugs (log scale, 1 to 1000) for each protein; x-axis: proteins, 0 to 500.]
Top proteins
Metabolizing (number of binding drugs):
• 187 Cytochrome P450 3A4
• 81 Cytochrome P450 3A5
• 79 Cytochrome P450 3A7
• 67 Cytochrome P450 2D6
• 62 Cytochrome P450 2C9
Non-metabolizing (number of binding drugs):
• 70-76 α-adrenergic receptors (5x)
• 59 Histamine H1 receptor
• 54 Muscarinic acetylcholine receptors (2x)
• 48-54 Serotonin receptors (3x)
• 53 Noradrenaline transporter
Targets per drug
[Plot: number of binding proteins (log scale, 1 to 1000) for each drug; x-axis: drugs, 0 to 700.]
stitch.embl.de
Top Drugs
# of targets Drugs
37-48 antipsychotics (9x, e.g. clozapine)
37 calcium channel blocker (verapamil)
30-35 more antipsychotics (4x)
28 antidepressants (2x, e.g. fluoxetine)
...
4 aspirin
Drugs and their targets
Side effects
Prediction of drug targets
Clinical trials
• Phase I: initial safety (20-80 healthy volunteers)
• Phase II: efficacy (20-300 patients)
• Phase III: large-scale testing for efficacy and side effects (300-3000 patients)
• Phase IV: post-marketing surveillance
Example
hematologic abnormalities, specifically anemia (unusual tiredness or weakness), agranulocytosis (chills; fever; sore throat; unusual tiredness or weakness), sometimes fatal; hemolytic anemia (continuing unusual tiredness or weakness), ...
Ontology
• COSTART: Coding Symbols for a Thesaurus of Adverse Reaction Terms
• aggregated with other ontologies in UMLS (Unified Medical Language System)
• Concept: [CUI: C0002871] Anemia
• Semantic Type: Disease or Syndrome
• Definitions: subnormal levels or function of erythrocytes, resulting in symptoms of tissue hypoxia. ...
• Atoms (54), sorted by source and string:
  anaemia [A1386521/AOD/NP/0000023514]
  anemia [A0473599/AOD/DE/0000005870]
  Anemia [A0023657/CCS/MD/4.1]
  Anemia; unspecified [A8298367/CCS/MD/4.1.3.7]
  ANEMIA [A0386215/COSTAR/PT/051]
  anemia [A0473601/CSP/PT/0427-0313]
Recognized entities
hematologic abnormalities, specifically anemia (unusual tiredness or weakness), agranulocytosis (chills; fever; sore throat; unusual tiredness or weakness), sometimes fatal; hemolytic anemia (continuing unusual tiredness or weakness), ...
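A minimal sketch of how side-effect terms could be recognized in label text by dictionary lookup. The slides do not describe the actual text-mining procedure, so the matching strategy here is an assumption. The CUI C0002871 is the one shown on the ontology slide; the second identifier is a made-up placeholder, not a real UMLS CUI.

```python
import re

# Synonym dictionary: lower-cased string -> concept identifier.
# C0002871 (Anemia) is taken from the slide; the agranulocytosis
# identifier is a hypothetical placeholder, not a real UMLS CUI.
SYNONYMS = {
    "anemia": "C0002871",
    "anaemia": "C0002871",
    "agranulocytosis": "PLACEHOLDER_AGRANULOCYTOSIS",
}


def recognize(text, synonyms=SYNONYMS):
    """Return sorted (start, end, matched_text, concept_id) tuples."""
    hits = []
    for term, concept_id in synonyms.items():
        # Whole-word, case-insensitive match of each dictionary entry.
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((m.start(), m.end(), m.group(0), concept_id))
    return sorted(hits)


label = ("hematologic abnormalities, specifically anemia, "
         "agranulocytosis; hemolytic anemia")
for hit in recognize(label):
    print(hit)
```

This naive matcher maps "hemolytic anemia" to the generic anemia concept only; a fuller dictionary would carry the more specific term as its own entry and prefer the longest match.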
Drugs per side effect
[Plot: number of drugs (log scale, 1 to 1000) for each side effect; x-axis: side effects, 0 to 1200.]
Top side effects
# of drugs side effect
578 nausea
554 rash
538 vomiting
518 headache
517 dizziness
Side effects per drug
[Plot: number of side effects (log scale, 1 to 1000) for each drug; x-axis: drugs, 0 to 800.]
Recap
[Diagram: drugs linked to side effects and to targets.]
Drugs and their targets
Side effects
Prediction of drug targets
Goal: Side-effect similarity measure
• idea: drugs with similar side effects might share targets
Problem 1/3
• similar/interchangeable side effects
• e.g. macrocytic and megaloblastic anemia
• need to capture similarities!
Solution 1/3
• use parent terms from the ontology
[Fig. 1, panels "Extraction of side effects from package inserts" and "Assignment of parent terms from the ontology": the ADVERSE REACTIONS section of the DARAPRIM® (pyrimethamine) label lists megaloblastic anemia (A), …; parent terms are assigned from an ontology tree over the terms A to G.]
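A minimal sketch of the parent-term idea: each drug's side-effect set is expanded with all ancestors from the ontology, so that closely related terms end up sharing entries. The child-to-parent links below are hypothetical illustration data, not the real UMLS hierarchy.

```python
# Hypothetical child -> parent links; a real run would read them
# from the ontology (these are illustration data only).
PARENT = {
    "megaloblastic anemia": "macrocytic anemia",
    "macrocytic anemia": "anemia",
    "anemia": "hematologic disease",
}


def with_ancestors(terms, parent=PARENT):
    """Return the terms together with all of their ancestor terms."""
    expanded = set()
    for term in terms:
        # Walk up the hierarchy until the root or an already-seen term.
        while term is not None and term not in expanded:
            expanded.add(term)
            term = parent.get(term)
    return expanded


x = with_ancestors({"megaloblastic anemia"})
y = with_ancestors({"macrocytic anemia"})
shared = x & y
print(sorted(shared))
```

Without the expansion the two drugs share no term at all; with it they share the more general terms, which is what lets similar side effects contribute to the similarity score.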
Problem 2/3
• some side effects are very common and not predictive
• high-level parent terms (e.g. “disease”)
• nausea, dizziness, ...
Solution 2/3
• weigh side effects according to frequency (i.e. fraction of labels with the side effect)
[Fig. 1, panel "Downweighting of frequent side effects: weight r_i = -log f_i": plot of enrichment over random (-1 to 4) against the frequency f_i of a side effect (0.001 to 1).]
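The frequency weighting can be sketched as follows, with f_i taken as the fraction of labels that mention side effect i and r_i = -log f_i. The base of the logarithm is not given on the slide, so the natural logarithm used here is an assumption, and the four labels are toy data.

```python
import math


def frequency_weights(labels):
    """labels: dict drug -> set of side effects. Returns r_i = -log(f_i)."""
    n_labels = len(labels)
    counts = {}
    for effects in labels.values():
        for effect in effects:
            counts[effect] = counts.get(effect, 0) + 1
    # f_i = fraction of labels listing the side effect.
    return {e: -math.log(c / n_labels) for e, c in counts.items()}


# Toy labels (hypothetical drugs and side-effect sets).
labels = {
    "drug1": {"nausea", "megaloblastic anemia"},
    "drug2": {"nausea", "rash"},
    "drug3": {"nausea", "rash"},
    "drug4": {"nausea"},
}
r = frequency_weights(labels)
for effect in sorted(r):
    print(effect, round(r[effect], 3))
```

A side effect listed on every label gets weight zero and so cannot contribute to a similarity score, while rare side effects get the largest weights.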
Problem 3/3
• side effects are correlated with each other
Solution 3/3
• Gerstein-Sonnhammer-Chothia weights (from HMM building)
[Fig. 1, panel "Downweighting of correlated side effects: weight g_i": example weights g_i for side effects A to G are 0.62, 0.62, 0.62, 1.85, 2.44, 2.21, 2.21.]
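A minimal sketch of Gerstein-Sonnhammer-Chothia weighting on a small rooted tree: each leaf starts with the length of its own branch, and every internal branch length is split among the leaves below it in proportion to their current weights. The slides do not say how the tree over side effects was built, so the nested-list tree and its branch lengths are made-up inputs.

```python
def gsc_weights(tree):
    """Gerstein-Sonnhammer-Chothia weights for the leaves of a rooted tree.

    tree is either a leaf name (str) or a list of
    (subtree, branch_length) pairs for the children of a node.
    """
    if isinstance(tree, str):
        return {tree: 0.0}
    weights = {}
    for child, length in tree:
        below = gsc_weights(child)
        total = sum(below.values())
        for leaf, w in below.items():
            if total > 0:
                # Share the branch in proportion to current weights.
                share = length * w / total
            else:
                # Leaves with no weight yet split the branch equally.
                share = length / len(below)
            weights[leaf] = w + share
    return weights


# Hypothetical tree: B and C are closely related, A is distant.
tree = [("A", 2.0), ([("B", 0.5), ("C", 0.5)], 1.0)]
weights = gsc_weights(tree)
print(weights)
```

The weights sum to the total branch length of the tree, and the two closely related leaves each receive less than the isolated one, which is the downweighting effect wanted for correlated side effects.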
Raw similarity score
[Fig. 1, panel "Calculation of raw score S_{X,Y}" for a pair of drugs X and Y (example pair: DARAPRIM® (pyrimethamine) and PROLOPRIM® (trimethoprim)):]
S_{X,Y} = \sum_{i \in X \cap Y} r_i g_i
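The raw score sums, over the side effects shared by drugs X and Y, the product of the frequency weight r_i and the correlation weight g_i. A minimal sketch with made-up weights and side-effect sets:

```python
def raw_score(x, y, r, g):
    """Sum of r_i * g_i over the side effects shared by x and y."""
    return sum(r[i] * g[i] for i in x & y)


# Hypothetical weights for three side effects A, B, C.
r = {"A": 0.5, "B": 2.0, "C": 1.0}
g = {"A": 0.62, "B": 1.0, "C": 1.85}

x = {"A", "B"}
y = {"B", "C"}
print(raw_score(x, y, r, g))  # only B is shared
```

Drugs with many side effects tend to reach higher raw scores simply because they share more terms, which is why the score still has to be normalized.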
Need to normalize
[Fig. 1, panel "Randomization yields p-values used as side effect similarity measure".]
• shuffling yields p-value: side effect similarity!
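A minimal sketch of turning a raw score into an empirical p-value by randomization. The slides do not specify the shuffling scheme; drawing random side-effect sets of the same sizes uniformly from all side effects is an assumption made here, as are the function names and the combined per-side-effect weight (standing in for r_i * g_i).

```python
import random


def raw_score(x, y, weight):
    """Sum of per-side-effect weights over the shared side effects."""
    return sum(weight[i] for i in x & y)


def shuffled_p_value(x, y, weight, n_iter=1000, seed=0):
    """Fraction of random set pairs scoring at least as high as (x, y)."""
    rng = random.Random(seed)
    universe = sorted(weight)
    observed = raw_score(x, y, weight)
    at_least = 0
    for _ in range(n_iter):
        # Random sets with the same number of side effects as x and y.
        rx = set(rng.sample(universe, len(x)))
        ry = set(rng.sample(universe, len(y)))
        if raw_score(rx, ry, weight) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_iter + 1)


# Toy universe of 50 equally weighted side effects.
weight = {f"se{i}": 1.0 for i in range(50)}
x = {f"se{i}" for i in range(5)}
y = {f"se{i}" for i in range(5, 10)}
p_same = shuffled_p_value(x, x, weight, n_iter=200)
p_disjoint = shuffled_p_value(x, y, weight, n_iter=200)
print(p_same, p_disjoint)
```

Because the random sets have the same sizes as the real ones, the p-value corrects for the fact that drugs with long side-effect lists reach high raw scores by chance.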
Chemical similarity
[Fig. 1, panel "Chemical similarity": structures of DARAPRIM® (pyrimethamine) and PROLOPRIM® (trimethoprim); Tanimoto score = 0.34.]
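The Tanimoto score compares two structural fingerprints as the number of shared features divided by the number of features present in either. The slides do not state which fingerprints were used, so the feature sets below are arbitrary integers for illustration, not the real fingerprints of the two drugs.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of features."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical fingerprints (sets of on-bit positions).
fp_x = {1, 2, 3}
fp_y = {2, 3, 4}
print(tanimoto(fp_x, fp_y))  # 2 shared features out of 4 in total
```

The score ranges from 0 (no shared features) to 1 (identical fingerprints).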
[Fig. 1, panel "Combination of side effect and chemical structure similarity for prediction of shared targets": axes are side effect similarity and chemical similarity of a drug pair X, Y; the scale shows the probability of a shared drug target, from low to high.]
Benchmarking similarity
Benchmark results
Drug–drug network
[Network figure; labeled drugs: Donepezil, Paroxetine, Fluoxetine, Rabeprazole, Zolmitriptan, Pergolide, Venlafaxine.]
Rabeprazole
• Rabeprazole: proton pump inhibitor, used against ulcers
• Pergolide: dopamine receptor agonist, used for the treatment of Parkinson’s disease
• binds dopamine receptor!
• inhibits dopamine receptor!?!
13 of 20 drug pairs
• 11: Ki ≤ 10 µM
• 2: Ki > 10 µM
[Figure: 13 binding-curve panels, numbered 1 to 13; y-axis: inhibition of control specific binding (%), 0 to 100; x-axis: -log[drug] (M), 8 to 3.]
Cell assays
• nine candidates could be tested
• all showed activity (activation/inhibition)
Conclusion
• information about drugs makes human phenotypes accessible
• new drug targets might lead to new indication areas
Acknowledgements
Monica Campillos
Lars Juhl Jensen
Anne-Claude Gavin
Peer Bork
STITCH: http://stitch.embl.de/