Presentation of a Structurally Diverse and Commercially Available Drug Data Set for
Correlation and Benchmarking Studies
Anders Karlén, Uppsala University
Aim of study
• Derive a “benchmark data set”
  – Drug-like
  – Physicochemically diverse
  – Commercially available and inexpensive
  – Amenable to analytical measurements
• Start the generation of benchmark data
  – Derive good-quality data from the same lab

Possible use of the data set
• General description of drugs
• Developing ADME/TOX filters (permeability, solubility, plasma protein binding etc.)
• To validate novel experimental techniques
Generation of a “benchmark” data set based on the list of drugs in Sweden (FASS 2001)
799 cpds
• Remove compounds: molecular weight >900; polymers, polypeptides; inorganic and metal containing → 691 cpds
• Select commercially available → 450 cpds
• Select < $800/g → 370 cpds
• Select only oral, nasal, pulmonary, ocular, parenteral and rectal administered drugs → 332 cpds
• Remove “odd” ATC classes, e.g. A01 (Mouth and teeth), A05 (Bile acids), A06 (Laxative)… → 284 cpds
• Exp. design → 24-compound data set
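The selection steps on this slide can be read as a chain of filters. The following is only a minimal sketch of that idea: the record layout, the field names and the example drugs are invented for illustration, and the rule that one allowed administration route is enough is an assumption. Only the ATC classes actually named on the slide are excluded here, since the slide's list continues beyond them.

```python
# Hypothetical record layout; field names and example drugs are invented.
ALLOWED_ROUTES = {"oral", "nasal", "pulmonary", "ocular", "parenteral", "rectal"}
EXCLUDED_KINDS = {"polymer", "polypeptide", "inorganic", "metal-containing"}
# Only the "odd" ATC classes named on the slide; the full list is not given.
ODD_ATC_PREFIXES = ("A01", "A05", "A06")
MAX_MW = 900        # compounds with a molecular weight above this are removed
MAX_PRICE = 800.0   # US dollars per gram; a compound must be cheaper than this


def passes_filters(drug):
    """Return True if a drug record survives every selection step."""
    if drug["mw"] > MAX_MW or drug["kind"] in EXCLUDED_KINDS:
        return False
    price = drug["price_per_gram"]  # None means not commercially available
    if price is None or price >= MAX_PRICE:
        return False
    # Assumption: one allowed administration route is sufficient.
    if not ALLOWED_ROUTES.intersection(drug["routes"]):
        return False
    return not drug["atc"].startswith(ODD_ATC_PREFIXES)


candidates = [
    {"name": "drug_a", "mw": 349.0, "kind": "small molecule",
     "price_per_gram": 12.5, "routes": ["oral"], "atc": "C03"},
    {"name": "drug_b", "mw": 1200.0, "kind": "small molecule",
     "price_per_gram": 5.0, "routes": ["oral"], "atc": "J01"},
    {"name": "drug_c", "mw": 300.0, "kind": "small molecule",
     "price_per_gram": None, "routes": ["oral"], "atc": "N05"},
    {"name": "drug_d", "mw": 420.0, "kind": "small molecule",
     "price_per_gram": 2500.0, "routes": ["parenteral"], "atc": "L01"},
    {"name": "drug_e", "mw": 250.0, "kind": "small molecule",
     "price_per_gram": 3.0, "routes": ["cutaneous"], "atc": "D07"},
    {"name": "drug_f", "mw": 180.0, "kind": "small molecule",
     "price_per_gram": 1.0, "routes": ["oral"], "atc": "A06"},
]

selected = [d["name"] for d in candidates if passes_filters(d)]
print(selected)
```

In this toy input only `drug_a` passes every step; each of the other records is rejected by exactly one filter.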
Cost and availability of the 691-compound data set
[Figure: histogram over binned price/gram ($): 0.0284 to 24.9, 24.9 to 50.2, 50.2 to 79.6, 79.6 to 100, 100 to 995, 995 to 3,228,000]
• 450 of the 691 compounds can be bought
• Price range $0.03/gram to $3,228,000/gram (2001)
[Structures shown: Methenamine, Calcitriol]
Principal component analysis
[Figure: PCA score plot, horizontal axis −8 to 16, vertical axis −10 to 8 (SIMCA-P 11, 2006-11-01)]
Lipophilicity
Size
Polarity
• General descriptors
• General hydrogen bonding descriptors
• Hydrogen bond donor descriptors
• Hydrogen bond acceptor descriptors
Σ28 molecular descriptors
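As a rough illustration of what a principal component analysis of a descriptor matrix involves, the sketch below autoscales a small matrix and extracts the first component by power iteration. It is not the SIMCA-P procedure used for the slides. The four descriptor columns and all values are invented, and the function names are hypothetical.

```python
import math

# Invented descriptor values (MW, logP, PSA, HBD) for five made-up compounds.
descriptors = [
    [114.0, -0.3, 15.0, 1.0],
    [349.0, 1.9, 99.0, 3.0],
    [777.0, 4.0, 93.0, 3.0],
    [250.0, 2.5, 40.0, 1.0],
    [460.0, 0.5, 150.0, 5.0],
]


def autoscale(rows):
    """Center every column to mean 0 and scale it to unit sample variance."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
           for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def first_component(x, iterations=200):
    """Power iteration: return the unit loading vector p and the scores t = X p."""
    k = len(x[0])
    p = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        t = [sum(a * b for a, b in zip(row, p)) for row in x]              # t = X p
        q = [sum(row[j] * ti for row, ti in zip(x, t)) for j in range(k)]  # q = X^T t
        norm = math.sqrt(sum(v * v for v in q))
        p = [v / norm for v in q]
    scores = [sum(a * b for a, b in zip(row, p)) for row in x]
    return p, scores


loadings, scores = first_component(autoscale(descriptors))
print("loadings:", [round(v, 3) for v in loadings])
print("scores t[1]:", [round(v, 3) for v in scores])
```

The scores correspond to the t[1] axis of a score plot. A second component could be obtained by subtracting the first component from the scaled matrix and repeating the iteration.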
Principal component analysis
[Figure: three PCA score plots, t[1] (−8 to 16) vs t[2] (−10 to 8), SIMCA-P+ 11, with compounds colored by:
• MOL_WEIGHT: 0 to 200, 200 to 400, 400 to 600, 600 to 800, 800 to 1000
• MLOGP: −7 to −4, −4 to −1, −1 to 2, 2 to 5, 5 to 8
• PSASAVOL: 0 to 100, 100 to 200, 200 to 300, 300 to 400]
[Figure: PCA score plot with the Polarity, Size and Lipophilicity directions marked]
The factorial design
“A face-centered central composite design”
[Figure: design cube with axes PC1, PC2 and PC3; corner points (+ + −), (+ − −), (+ − +), (− + −), (+ + +), (− + +), (− − +), (− − −)]
20 protolytes, 4 nonprotolytes
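A face-centered central composite design in three factors has 8 corner points, 6 face-centered axial points and a center point. The sketch below enumerates them in coded units (−1, 0, +1). The function name is hypothetical, and the replication (two compounds per corner, one per axial point, two at the center) is read off the compound list elsewhere in this deck, not stated as a design rule.

```python
from itertools import product


def face_centered_ccd(n_factors=3):
    """Return (corners, axial, center) points in coded units -1, 0, +1."""
    corners = list(product((-1, 1), repeat=n_factors))
    axial = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level          # one factor at its face, the others at 0
            axial.append(tuple(point))
    center = [(0,) * n_factors]
    return corners, axial, center


corners, axial, center = face_centered_ccd()

# Replication as it appears in the deck's compound list (an observation,
# not a general rule): 2 per corner, 1 per axial point, 2 at the center.
n_compounds = 2 * len(corners) + 1 * len(axial) + 2 * len(center)

print(len(corners), len(axial), len(center), n_compounds)
```

With these replication counts the design accommodates 24 compounds, matching the size of the selected data set.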
24-compound data set
[Figure: structures of the 24 compounds placed in the design cube with axes PC1, PC2 and PC3; design coordinates in parentheses]
Captopril (0 + 0)
Bendroflumethiazideᵃ (+ − −)
Glipizide (+ − −)
Levodopa (+ + +)
Levothyroxine (0 − 0)
Thiamazole (− + +)
Amantadine (− + +)
Sulindac (0 0 −)
Amiloride (+ + +)
Carbamazepine (− + −)
Chlorprothixene (− 0 0)
Hydrochlorothiazide (+ + −)
Chlorzoxazone (− + −)
Prednisone (0 0 0)
Tinidazole (+ + −)
Flupenthixol (− − −)
Metoclopramide (0 0 0)
Fenofibrate (− − −)
Tetracycline (+ 0 0)
Folic acid (+ − +)
Carisoprodolᵃ (0 0 +)
Meclizineᵃ (− − +)
Terfenadineᵇ (− − +)
Erythromycin (+ − +)
The cost of buying the entire data set (at least 1 gram of each compound) is less than $1,500
Comparison of the data sets with respect to some common molecular descriptors

Descriptor     24-compound data set        691-compound data set
               Min      Max     Mean       Min      Max     Mean
MW             114      777     349        60       854     347
PSA            8        246     99         0        373     93
logPMor        −2.0     5.3     1.9        −6.4     7.6     1.9
logDACD_6.5    −5.0     4.8     0.94       −10.6    12.3    0.74
HBD            0        8       2.7        0        19      2.4
HBA            1        14      4.7        0        19      4.9

[Structures shown: Candesartan cilexetil (logPMor = 7.6) and Neomycin (HBD = 19)]
Comparison of the data sets with respect to functional groups
[Figure: bar chart of the percent of compounds containing each functional group (axis 0% to 75%) for the 24-set, FASS (druglike only) and the 691-set. Functional groups: aliphatic q-amine, aliphatic t-amine, aliphatic s-amine, aliphatic p-amine, COOH, benzene, aliphatic OH, aromatic t-amine, aromatic s-amine, aromatic p-amine, aromatic OH, ester, heterocyclic]
Comparison of the data sets with respect to ATC classes
The Anatomical Therapeutic Chemical (ATC) classification system is the most commonly used classification system for drug substances.

Distribution in ATC
ATC  Description     Number of substances    Percent of data set
                     24-set    691-set       24-set    691-set
A    GI              1         69            4.2%      9.99%
B    Blood           0         21            0.0%      3.04%
C    Cardio          2         89            8.3%      12.88%
D    Topical         0         36            0.0%      5.21%
G    Gen.hormones    1         38            4.2%      5.50%
H    Hormones        3         14            12.5%     2.03%
J    Infection       5         89            20.8%     12.88%
L    Tum.,immuno     1         53            4.2%      7.67%
M    Muscle,mov.     3         37            12.5%     5.35%
N    Nervous         6         134           25.0%     19.39%
P    Antiparasite    0         13            0.0%      1.88%
R    Respiration     1         52            4.2%      7.53%
S    Eye,ear         1         24            4.2%      3.47%
V    Various         0         22            0.0%      3.18%
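The percentages in the ATC table follow directly from the counts. As a small check, the sketch below recomputes them; the counts are copied from the table and the variable names are illustrative.

```python
# Number of substances per ATC main group, copied from the table:
# (24-set, 691-set).
atc_counts = {
    "A": (1, 69), "B": (0, 21), "C": (2, 89), "D": (0, 36), "G": (1, 38),
    "H": (3, 14), "J": (5, 89), "L": (1, 53), "M": (3, 37), "N": (6, 134),
    "P": (0, 13), "R": (1, 52), "S": (1, 24), "V": (0, 22),
}

total_24 = sum(small for small, _ in atc_counts.values())
total_691 = sum(large for _, large in atc_counts.values())

# Percent of each data set that falls into each ATC main group.
percentages = {
    code: (100 * small / total_24, 100 * large / total_691)
    for code, (small, large) in atc_counts.items()
}

for code, (p24, p691) in percentages.items():
    print(f"{code}  {p24:5.1f}%  {p691:6.2f}%")
```

The column totals come out as 24 and 691, so every compound in either set is counted under exactly one ATC main group in this table.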
Start the generation of benchmark data
Derive good-quality data from the same lab
1. Measurement of pKa by pH-metric or pH-UV technique (n=20)
2. Measurement of lipophilicity
   (a) pH-metric logP (n=18)
   (b) capacity factors by RP-HPLC (n=21)
3. Measurement of intrinsic and kinetic solubility: pH-metric solubility (CheqSol technique) or shake-plate solubility (n=17)
4. Measurement of permeability across Caco-2 cells, A to B direction (n=22)
2. Lipophilicity: pH-metric measurement of logP and logD
[Figure: bar chart of logP (neutral) and logD (pH 7.4), axis −3 to 7, for Amantadine, Amiloride, Bendroflumethiazide, Captopril, Chlorprothixene, Chlorzoxazone, Erythromycin, Fenofibrate, Flupenthixol, Glipizide, Hydrochlorothiazide, Levodopa, Levothyroxine, Meclizine, Metoclopramide, Sulindac, Terfenadine, Tetracycline, Thiamazole, Tinidazole]
logP missing for:
• Folic acid
• Carbamazepine
• Prednisone
• Carisoprodol
2. Lipophilicity: Experimental logP vs calculated logP
[Figure: four scatter plots of calculated logP against experimental logP (logPexp)]
• Crippen logP: R2 = 0.70
• ACD/LogP: R2 = 0.88
• ClogP (BioByte): R2 = 0.89
• Moriguchi logP: R2 = 0.80
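The slides do not say how the R2 values were computed. For a straight-line fit with an intercept, R2 equals the squared Pearson correlation between the experimental and the calculated values, which the sketch below evaluates. The function name and the logP values are invented for illustration and are not data from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Invented example values, not measurements from the 24-compound data set.
logp_experimental = [-1.2, 0.4, 1.1, 2.3, 3.0, 4.6]
logp_calculated = [-0.5, 0.1, 1.6, 1.9, 3.4, 5.2]

r2 = r_squared(logp_experimental, logp_calculated)
print(f"R2 = {r2:.2f}")
```

Because R2 only measures linear association, a calculation method with a constant offset from experiment can still score highly, so it is worth inspecting the slope and intercept as well.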
2. Lipophilicity: Correlation between the measured HPLC capacity factor (k) and pH-metric logD (pH 6.8)
• Compounds from the 8 corner points have different colors
• The 2 compounds at each corner point have the same color
• The axis points are colored black
• The center point is pink
R2 = 0.92 (pH = 6.8)
3. Solubility: Measurement of intrinsic solubility using CheqSol (24-compound data set)
[Figure: bar chart of Log(μg/mL), axis −3 to 4, for Terfenadine, Meclizine, Chlorprothixene, Fenofibrate, Glipizide, Folic acid, Sulindac, Bendroflumethiazide, Levothyroxine, Flupenthixol, Metoclopramide, Carbamazepine, Prednisone, Tetracycline, Hydrochlorothiazide, Chlorzoxazone, Amantadine]
Solubility ranges from 0.009 μg/mL to 2119 μg/mL
3. Solubility
http://www.cheqsol.com/download%20files/download01.pdf
• 19 of the compounds studied are also present in the 691-compound data set
• CheqSol solubility ranges from 0.9 μg/mL to 3500 μg/mL in these 19 compounds
• Legend: compound not present in the 691 data set

All results in µg/mL
Columns: Name; Equilibrium solubility (CheqSol, Shake-Flask, Literature); Kinetic Solubility (Chaser, non-chaser). Values are listed in column order; empty cells are omitted.
1 Phthalic Acid: 5330, 5950, 8462
2 Quinine: 363, 201, 491, 391
3 Trazodone: 134.6, 138.0, 435
4 Nitrofurantoin: 112.5, 109.5, 78.9, 319
5 Nortriptyline: 27.0, 49.3, 20.0, 27.3
6 Verapamil: 48.5, 48.5, 9.7, 47.8
7 Niflumic Acid: 9.53, 29.5, 59
8 Imipramine: 17.2, 21.7, 18.1, 17.3
9 Flumequine: 34.2, 20.7, 121
10 Furosemide: 19.7, 20.4, 5.9, 96
11 Maprotiline: 5.80, 8.05, 3.49, 77
12 Piroxicam: 5.92, 5.95, 3.16, 233
13 Warfarin: 5.30, 5.25, 5.60, 120
14 Chlorpromazine: 2.70, 2.41, 1.71, 2.70
15 Lidocaine: 3500, 3810, 4600
16 Famotidine: 740, 1100, 5900
17 Hydrochlorothiazide: 630, 700, 2400
18 Chlorpheniramine: 608.3, 615.2, 668
19 Sulfamerazine: 200.3, 203.0, 701
20 Ketoprofen: 130.6, 178.0, 336
21 Propranolol: 81.0, 70.0, 340
22 Ibuprofen: 50.0, 49.0, 180
23 Pindolol: 41.7, 32.7, 1424
24 Miconazole: 1.00, 0.67
25 Diclofenac: 0.90, 0.80, 45
26 Amodiaquin: 0.41, 8.8
27 Pamoic acid: 0.0003, 0.019
In the 24-compound data set the solubility ranges from 0.009 μg/ml to 2119 μg/ml
24-compound data set is structurally diverse
[Figure: PCA score plot, t[1] vs t[2] (SIMCA-P+ 11); legend: No Class, Class 1, Class 2]
[Figure: PCA score plot with the Polarity, Size and Lipophilicity directions marked; legend: No class, 19-data set, 24-data set]
4. Permeability/absorption
[Figure: log-log plot of human jejunum permeability (×10⁻⁴ cm/s) at pH 6.5, Y, against Caco-2 permeability (×10⁻⁶ cm/s) at pH 6.5, X. Compounds: Furosemide, Hydrochlorothiazide, Atenolol, Cimetidine, Mannitol, Terbutaline, Amoxicillin (C), Lisinopril (C), Metoprolol, Cephalexin (C), Enalapril (C), Propranolol, Phenylalanine (C), Desipramine, Antipyrine, Piroxicam, Verapamil (C), Ketoprofen, Naproxen, D-Glucose (C)]
• log Y = 0.6532 log X − 0.3036, R2 = 0.7276 (all drugs)
• log Y = 0.7524 log X − 0.5441, R2 = 0.8492 (passively diffusive)
• log Y = 0.542 log X + 0.06, R2 = 0.7854 (carrier-mediated)
Sun, D. et al. Comparison of Human and Caco-2 Gene Expression Profiles for 12,000 Genes and the Permeabilities of 26 Drugs in the Human Intestine and Caco-2 Cells. Pharm Res 2002, 19, 1398-1413
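The regression lines quoted from Sun et al. can be used to estimate a human jejunal permeability from a Caco-2 value. The sketch below does this under two assumptions: the logarithms are base 10, and X is the Caco-2 permeability in units of 10⁻⁶ cm/s while Y is the human jejunum permeability in units of 10⁻⁴ cm/s, as the axis labels suggest. The function and dictionary names are hypothetical.

```python
import math

# (slope, intercept) of log10(Y) = slope * log10(X) + intercept,
# copied from the regression lines on the slide.
FITS = {
    "all drugs": (0.6532, -0.3036),
    "passively diffusive": (0.7524, -0.5441),
    "carrier-mediated": (0.542, 0.06),
}


def predict_human_peff(caco2_papp, fit="all drugs"):
    """Estimate human jejunum permeability (1e-4 cm/s) from Caco-2 Papp (1e-6 cm/s)."""
    slope, intercept = FITS[fit]
    return 10 ** (slope * math.log10(caco2_papp) + intercept)


for papp in (0.1, 1.0, 10.0, 100.0):
    estimate = predict_human_peff(papp, "passively diffusive")
    print(f"Caco-2 Papp {papp:6.1f} x 1e-6 cm/s -> human Peff {estimate:.2f} x 1e-4 cm/s")
```

With R2 values between roughly 0.73 and 0.85, such estimates are best treated as order-of-magnitude guidance, not as predictions for an individual compound.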
4. Permeability/absorption: In vitro Papp values in human Caco-2 cells
[Figure: bar chart of Papp (10⁻⁶ cm s⁻¹), log axis 0.01 to 100, with bands Low, Medium and High, for Erythromycin, Captopril, Levodopa, Tetracycline, Hydrochlorothiazide, Amiloride, Folic acid, Bendroflumethiazide, Levothyroxine, Sulindac, Terfenadine, Flupenthixol, Metoclopramide, Chlorprothixene, Glipizide, Carisoprodol, Amantadine, Prednisone, Tinidazole, Carbamazepine, Thiamazole, Chlorzoxazone]
Suggestions on the “Uppsala diverse data set” usage
• The 24 compounds can be used
  – as a test set for testing already derived models of permeability, lipophilicity, solubility etc.
  – as a validation set for new experimental techniques
  – on its own for building and validating models by dividing it into a training set and a test set
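For the last suggestion, one simple way to divide the 24 compounds is a seeded random split. The sketch below is only an illustration: the 18/6 proportion, the seed and the function name are arbitrary choices that the deck does not prescribe, and a split that takes the experimental design into account may be preferable.

```python
import random

compounds = [
    "Captopril", "Bendroflumethiazide", "Glipizide", "Levodopa",
    "Levothyroxine", "Thiamazole", "Amantadine", "Sulindac",
    "Amiloride", "Carbamazepine", "Chlorprothixene", "Hydrochlorothiazide",
    "Chlorzoxazone", "Prednisone", "Tinidazole", "Flupenthixol",
    "Metoclopramide", "Fenofibrate", "Tetracycline", "Folic acid",
    "Carisoprodol", "Meclizine", "Terfenadine", "Erythromycin",
]


def split_data_set(names, test_fraction=0.25, seed=0):
    """Shuffle reproducibly and return (training_set, test_set)."""
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return sorted(shuffled[n_test:]), sorted(shuffled[:n_test])


training_set, test_set = split_data_set(compounds)
print("training:", training_set)
print("test:", test_set)
```

Fixing the seed keeps the split reproducible, so that models built by different groups can be compared on the same test compounds.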
We hope that other groups are willing to help us supplement the characterization started here.
“Benchmark data set”
J. Med. Chem.; (ASAP); 2006; 49(23); 6660-6671
Acknowledgements
AstraZeneca R&D Mölndal: Susanne Winiwarter, Anna-Lena Ungell, Johan Wernevik, Fredrik Bergström, Leif Engström
Sirius Analytical Instruments Ltd: John Comer, Karl Box, Ruth Allen, Jon Mole
Faculty of Pharmacy, Uppsala University: Christian Sköld, Torbjörn Lundstedt, Anders Hallberg, Hans Lennernäs