Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies
Anders Karlén, Uppsala University


Page 1: Presentation of a Structurally Diverse and Commercially

Presentation of a Structurally Diverse and Commercially Available Drug Data Set for Correlation and Benchmarking Studies

Anders Karlén
Uppsala University

Page 2: Presentation of a Structurally Diverse and Commercially

Aim of study

• Derive a "benchmark data set"
  – Drug-like
  – Physicochemically diverse
  – Commercially available and inexpensive
  – Amenable to analytical measurements

• Start the generation of benchmark data
  – Derive good-quality data from the same lab

Page 3: Presentation of a Structurally Diverse and Commercially

Possible use of the data set

• General description of drugs
• Developing ADME/TOX filters (permeability, solubility, plasma protein binding etc.)
• To validate novel experimental techniques

Page 4: Presentation of a Structurally Diverse and Commercially

Generation of a "benchmark" data set based on the list of drugs in Sweden (FASS 2001)

[Flowchart: selection steps leading from the FASS list to the 24-compound data set]

• Remove compounds
  – Molecular weight >900
  – Polymers, polypeptides
  – Inorganic and metal-containing
• Select commercially available compounds (< $800/g)
• Select only oral, nasal, pulmonary, ocular, parenteral and rectally administered drugs
• Remove "odd" ATC classes, e.g. A01 (Mouth and teeth), A05 (Bile acids), A06 (Laxative)…
• Exp. design → 24-compound data set

Compound counts shown in the flowchart: 799, 691, 450, 370, 332 and 284 cpds
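The selection criteria in the flowchart can be sketched as a simple filter. This is a minimal illustration only: the `Drug` record, its field names and the example entries are hypothetical, a drug is assumed to pass the route filter if any of its routes is on the allowed list, and only the three ATC classes named on the slide are excluded (the slide's list ends in an ellipsis).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record layout; field names are illustrative, not from the study.
@dataclass
class Drug:
    name: str
    mol_weight: float
    is_polymer_or_peptide: bool
    contains_metal_or_inorganic: bool
    price_per_gram: Optional[float]  # None means not commercially available
    routes: frozenset
    atc_code: str

ALLOWED_ROUTES = frozenset(
    {"oral", "nasal", "pulmonary", "ocular", "parenteral", "rectal"}
)
# Only the classes named explicitly on the slide; the full list is not given.
ODD_ATC_PREFIXES = ("A01", "A05", "A06")

def passes_filters(d: Drug) -> bool:
    """Apply the flowchart criteria; the checks are independent, so order is irrelevant."""
    if d.mol_weight > 900 or d.is_polymer_or_peptide or d.contains_metal_or_inorganic:
        return False
    if d.price_per_gram is None or d.price_per_gram >= 800:
        return False
    if not (d.routes & ALLOWED_ROUTES):
        return False
    if d.atc_code.startswith(ODD_ATC_PREFIXES):
        return False
    return True

# Made-up entries, one per rejection reason plus one that passes.
candidates = [
    Drug("hypothetical_A", 350.0, False, False, 12.0, frozenset({"oral"}), "C03XX"),
    Drug("hypothetical_B", 1200.0, False, False, 30.0, frozenset({"oral"}), "J01XX"),
    Drug("hypothetical_C", 300.0, False, False, None, frozenset({"oral"}), "N05XX"),
    Drug("hypothetical_D", 280.0, False, False, 5.0, frozenset({"topical"}), "D01XX"),
    Drug("hypothetical_E", 400.0, False, False, 20.0, frozenset({"oral"}), "A06XX"),
]
selected = [d.name for d in candidates if passes_filters(d)]
print(selected)
```

The experimental-design step that reduces the filtered list to 24 compounds is not covered by this sketch.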

Page 5: Presentation of a Structurally Diverse and Commercially

Cost and availability of the 691-compound data set

[Figure: histogram of the number of compounds (y-axis 50 to 200) per binned price/gram ($): 0.03 – 24.9, 24.9 – 50.2, 50.2 – 79.6, 79.6 – 100, 100 – 995, 995 – 3,228,000. Structures shown: Methenamine and Calcitriol]

450 of the 691 compounds can be bought
Price range $0.03/gram – $3,228,000/gram (2001)

Page 6: Presentation of a Structurally Diverse and Commercially

Principal component analysis

[Figure: PCA score plot (SIMCA-P 11, 2006-11-01) with the directions labelled Lipophilicity, Size and Polarity]

Σ 28 molecular descriptors

• General descriptors
• General hydrogen bonding descriptors
• Hydrogen bond donor descriptors
• Hydrogen bond acceptor descriptors
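As a rough illustration of how such score plots arise, the sketch below autoscales a small descriptor matrix and extracts the first principal component by power iteration. The three descriptor columns and all values are made-up placeholders, not data from the study, and the slides used SIMCA-P rather than this hand-written routine.

```python
import math

def autoscale(rows):
    """Center each descriptor column to mean 0 and scale it to unit sample SD."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def first_component(x, iterations=200):
    """Power iteration on X^T X; returns (loadings p, scores t) for PC1."""
    k = len(x[0])
    p = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        t = [sum(a * b for a, b in zip(row, p)) for row in x]            # t = X p
        w = [sum(row[j] * ti for row, ti in zip(x, t)) for j in range(k)]  # w = X^T t
        norm = math.sqrt(sum(v * v for v in w))
        p = [v / norm for v in w]
    t = [sum(a * b for a, b in zip(row, p)) for row in x]
    return p, t

# Placeholder matrix: 5 hypothetical compounds x 3 descriptors (size, lipophilicity, polarity).
descriptors = [
    [150.0, 0.5, 40.0],
    [300.0, 2.0, 80.0],
    [450.0, 3.5, 60.0],
    [600.0, 1.0, 150.0],
    [750.0, 4.0, 120.0],
]
loadings, scores = first_component(autoscale(descriptors))
print("PC1 loadings:", loadings)
print("PC1 scores (t[1]):", scores)
```

Further components could be obtained by subtracting the outer product of scores and loadings from the scaled matrix and repeating the iteration.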

Page 7: Presentation of a Structurally Diverse and Commercially

Principal component analysis

[Figure: PCA score plots, t[1] vs t[2] (SIMCA-P+ 11, 2006-11-10), colored by
  – MOL_WEIGHT (0 - 200, 200 - 400, 400 - 600, 600 - 800, 800 - 1000)
  – MLOGP (-7 to -4, -4 to -1, -1 to 2, 2 to 5, 5 to 8)
  – PSASAVOL (0 - 100, 100 - 200, 200 - 300, 300 - 400)
together with the overview plot (SIMCA-P 11, 2006-11-01) with the directions labelled Polarity, Size and Lipophilicity]

Page 8: Presentation of a Structurally Diverse and Commercially

The factorial design
"A face-centered central composite design"

[Figure: two cubes with axes PC1, PC2 and PC3; the corner points are labelled + + −, + − −, + − +, − + −, + + +, − + +, − − + and − − −]
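The points of a face-centered central composite design in three factors can be enumerated as below. This is a generic sketch of the design type named on the slide, with levels coded as −1, 0 and +1; the final count assumes two compounds per corner, one per face-center (axis) point and two at the center, which is how the 24 compounds appear to be distributed in the compound list.

```python
from itertools import product

def face_centered_ccd(n_factors=3):
    """Return (corners, face_centers, center) with levels coded as -1, 0, +1."""
    corners = list(product((-1, 1), repeat=n_factors))
    face_centers = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level          # one factor at an extreme, the others at 0
            face_centers.append(tuple(point))
    center = [tuple([0] * n_factors)]
    return corners, face_centers, center

corners, face_centers, center = face_centered_ccd()
design_points = corners + face_centers + center

# Assumed occupancy: 2 compounds per corner, 1 per face center, 2 at the center.
n_compounds = 2 * len(corners) + len(face_centers) + 2 * len(center)
print(len(design_points), n_compounds)
```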

Page 9: Presentation of a Structurally Diverse and Commercially

24-compound data set

20 protolytes
4 nonprotolytes

[Figure: chemical structures of the 24 compounds arranged around the design cube with axes PC1, PC2 and PC3; design coordinates in parentheses]

• Captopril (0 + 0)
• Bendroflumethiazide [a] (+ − −)
• Glipizide (+ − −)
• Levodopa (+ + +)
• Levothyroxine (0 − 0)
• Thiamazole (− + +)
• Amantadine (− + +)
• Sulindac (0 0 −)
• Amiloride (+ + +)
• Carbamazepine (− + −)
• Chlorprothixene (− 0 0)
• Hydrochlorothiazide (+ + −)
• Chlorzoxazone (− + −)
• Prednisone (0 0 0)
• Tinidazole (+ + −)
• Flupenthixol (− − −)
• Metoclopramide (0 0 0)
• Fenofibrate (− − −)
• Tetracycline (+ 0 0)
• Folic acid (+ − +)
• Carisoprodol [a] (0 0 +)
• Meclizine [a] (− − +)
• Terfenadine [b] (− − +)
• Erythromycin (+ − +)

The cost of buying the entire data set (at least 1 gram of each compound) is less than $1,500

Page 10: Presentation of a Structurally Diverse and Commercially

Comparison of the data sets with respect to some common molecular descriptors

               24-compound data set      691-compound data set
Descriptor     Mean    Max     Min       Mean    Max     Min
MW             349     777     114       347     854     60
PSA            99      246     8         93      373     0
logPMor        1.9     5.3     −2.0      1.9     7.6     −6.4
logDACD_6.5    0.94    4.8     −5.0      0.74    12.3    −10.6
HBD            2.7     8       0         2.4     19      0
HBA            4.7     14      1         4.9     19      0

[Figure: structures of the two extreme compounds in the 691-compound data set]
• Candesartan cilexetil: logPMor = 7.6
• Neomycin: HBD = 19

Page 11: Presentation of a Structurally Diverse and Commercially

Comparison of the data sets with respect to functional groups

[Figure: bar chart of the percent of compounds containing the functional group (y-axis 0% to 75%) per functional group: aliphatic q-amine, aliphatic t-amine, aliphatic s-amine, aliphatic p-amine, COOH, benzene, aliphatic OH, aromatic t-amine, aromatic s-amine, aromatic p-amine, aromatic OH, ester, heterocyclic. Legend: 24-set; FASS (druglike only) 691-set]

Page 12: Presentation of a Structurally Diverse and Commercially

Comparison of the data sets with respect to ATC classes

Distribution in ATC

                       Number of substances    Percent of data set
ATC  Description       24-set    691-set       24-set    691-set
A    GI                1         69            4.2%      9.99%
B    Blood             0         21            0.0%      3.04%
C    Cardio            2         89            8.3%      12.88%
D    Topical           0         36            0.0%      5.21%
G    Gen.hormones      1         38            4.2%      5.50%
H    Hormones          3         14            12.5%     2.03%
J    Infection         5         89            20.8%     12.88%
L    Tum.,immuno       1         53            4.2%      7.67%
M    Muscle,mov.       3         37            12.5%     5.35%
N    Nervous           6         134           25.0%     19.39%
P    Antiparasite      0         13            0.0%      1.88%
R    Respiration       1         52            4.2%      7.53%
S    Eye,ear           1         24            4.2%      3.47%
V    Various           0         22            0.0%      3.18%

The Anatomical Therapeutic Chemical (ATC) classification system is the most commonly used classification system for drug substances
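The percentages in the ATC table follow directly from the substance counts. A short check, using the counts from the table above (the variable and function names are only illustrative):

```python
# ATC class -> (number of substances in 24-set, number in 691-set), from the table.
atc_counts = {
    "A": (1, 69), "B": (0, 21), "C": (2, 89), "D": (0, 36), "G": (1, 38),
    "H": (3, 14), "J": (5, 89), "L": (1, 53), "M": (3, 37), "N": (6, 134),
    "P": (0, 13), "R": (1, 52), "S": (1, 24), "V": (0, 22),
}

total_24 = sum(n24 for n24, _ in atc_counts.values())
total_691 = sum(n691 for _, n691 in atc_counts.values())

def percent(count, total):
    """Share of the data set, in percent."""
    return 100.0 * count / total

for atc, (n24, n691) in atc_counts.items():
    print(f"{atc}: {percent(n24, total_24):5.1f}%  {percent(n691, total_691):6.2f}%")
```

The counts sum to 24 and 691, so every compound is assigned to exactly one class in this table.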

Page 13: Presentation of a Structurally Diverse and Commercially

Start the generation of benchmark data
Derive good-quality data from the same lab

1. Measurement of pKa by pH-metric or pH-UV technique (n=20)

2. Measurement of lipophilicity
   (a) pH-metric logP (n=18)
   (b) capacity factors by RP-HPLC (n=21)

3. Measurement of intrinsic and kinetic solubility
   pH-metric solubility (CheqSol technique) or shake-plate solubility (n=17)

4. Measurement of permeability across Caco-2 cells, A to B direction (n=22)

Page 14: Presentation of a Structurally Diverse and Commercially

2. Lipophilicity
pH-metric measurement of logP and logD

[Figure: bar chart of logP (neutral) and logD (pH 7.4), y-axis −3 to 7, for Amantadine, Amiloride, Bendroflumethiazide, Captopril, Chlorprothixene, Chlorzoxazone, Erythromycin, Fenofibrate, Flupenthixol, Glipizide, Hydrochlorothiazide, Levodopa, Levothyroxine, Meclizine, Metoclopramide, Sulindac, Terfenadine, Tetracycline, Thiamazole, Tinidazole]

logP missing for:
• Folic acid
• Carbamazepine
• Prednisone
• Carisoprodol
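For orientation, logP (neutral species) and logD (at a given pH) are related through the pKa. The sketch below uses the standard textbook relation for a monoprotic acid or base, assuming that only the neutral form partitions into octanol; it is not the pH-metric fitting procedure used for the measurements, and the numbers in the example are made up.

```python
import math

def log_d(log_p, pka, ph, kind):
    """logD of a monoprotic compound, assuming only the neutral form partitions.

    kind: "acid" (neutral form HA) or "base" (neutral form B).
    """
    if kind == "acid":
        ionized_ratio = 10 ** (ph - pka)   # [A-]/[HA]
    elif kind == "base":
        ionized_ratio = 10 ** (pka - ph)   # [BH+]/[B]
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p - math.log10(1 + ionized_ratio)

# Hypothetical acid with logP 3.0 and pKa 4.4, evaluated at pH 7.4.
print(log_d(3.0, 4.4, 7.4, "acid"))
```

At pH equal to pKa half of the compound is ionized, so logD lies about 0.3 log units below logP.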

Page 15: Presentation of a Structurally Diverse and Commercially

2. Lipophilicity
Experimental logP vs calculated logP

[Figure: four scatter plots of calculated logP against experimental logP (logPexp)]

• Crippen logP: R² = 0.70
• ACD/LogP: R² = 0.88
• ClogP (BioByte): R² = 0.89
• Moriguchi logP: R² = 0.80

Page 16: Presentation of a Structurally Diverse and Commercially

2. Lipophilicity
Correlation between the measured HPLC capacity factor (k) and pH-metric logD (pH 6.8)

• Compounds from the 8 corner points have different colors
• The 2 compounds at each corner point have the same color
• The axis points are colored black
• Center point pink

R² = 0.92 (pH = 6.8)

Page 17: Presentation of a Structurally Diverse and Commercially

3. Solubility
Measurement of intrinsic solubility using CheqSol (24-compound data set)

[Figure: bar chart of log solubility (log μg/mL, y-axis −3 to 4) for Terfenadine, Meclizine, Chlorprothixene, Fenofibrate, Glipizide, Folic acid, Sulindac, Bendroflumethiazide, Levothyroxine, Flupenthixol, Metoclopramide, Carbamazepine, Prednisone, Tetracycline, Hydrochlorothiazide, Chlorzoxazone, Amantadine]

Solubility ranges from 0.009 μg/mL to 2119 μg/mL

Page 18: Presentation of a Structurally Diverse and Commercially

3. Solubility

http://www.cheqsol.com/download%20files/download01.pdf

Columns: Name; Equilibrium solubility (CheqSol, Shake-Flask Literature); Kinetic solubility (Chaser, non-chaser)
All results in µg/mL
Legend: Compound not present in the 691 data set

Name, followed by the reported values:
 1 Phthalic Acid: 5330, 5950, 8462
 2 Quinine: 363, 201, 491, 391
 3 Trazodone: 134.6, 138.0, 435
 4 Nitrofurantoin: 112.5, 109.5, 78.9, 319
 5 Nortriptyline: 27.0, 49.3, 20.0, 27.3
 6 Verapamil: 48.5, 48.5, 9.7, 47.8
 7 Niflumic Acid: 9.53, 29.5, 59
 8 Imipramine: 17.2, 21.7, 18.1, 17.3
 9 Flumequine: 34.2, 20.7, 121
10 Furosemide: 19.7, 20.4, 5.9, 96
11 Maprotiline: 5.80, 8.05, 3.49, 77
12 Piroxicam: 5.92, 5.95, 3.16, 233
13 Warfarin: 5.30, 5.25, 5.60, 120
14 Chlorpromazine: 2.70, 2.41, 1.71, 2.70
15 Lidocaine: 3500, 3810, 4600
16 Famotidine: 740, 1100, 5900
17 Hydrochlorothiazide: 630, 700, 2400
18 Chlorpheniramine: 608.3, 615.2, 668
19 Sulfamerazine: 200.3, 203.0, 701
20 Ketoprofen: 130.6, 178.0, 336
21 Propranolol: 81.0, 70.0, 340
22 Ibuprofen: 50.0, 49.0, 180
23 Pindolol: 41.7, 32.7, 1424
24 Miconazole: 1.00, 0.67
25 Diclofenac: 0.90, 0.80, 45
26 Amodiaquin: 0.41, 8.8
27 Pamoic acid: 0.0003, 0.019

19 of the compounds studied are also present in the 691-compound data set

CheqSol solubility ranges from 0.9 μg/mL to 3500 μg/mL in these 19 compounds

In the 24-compound data set the solubility ranges from 0.009 μg/mL to 2119 μg/mL

Page 19: Presentation of a Structurally Diverse and Commercially

24-compound data set is structurally diverse

[Figure: PCA score plot, t[1] vs t[2] (SIMCA-P+ 11, 2006-11-10), with compounds marked as No class, 19-data set or 24-data set, shown next to the overview plot (SIMCA-P 11, 2006-11-01) with the directions labelled Polarity, Size and Lipophilicity]

Page 20: Presentation of a Structurally Diverse and Commercially

4. Permeability/absorption

[Figure: log-log plot of human jejunum permeability (× 10⁻⁴ cm/s) at pH 6.5 against Caco-2 permeability (× 10⁻⁶ cm/s) at pH 6.5. Compounds: Furosemide, Hydrochlorothiazide, Atenolol, Cimetidine, Mannitol, Terbutaline, Amoxicillin (C), Lisinopril (C), Metoprolol, Cephalexin (C), Enalapril (C), Propranolol, Phenylalanine (C), Desipramine, Antipyrine, Piroxicam, Verapamil (C), Ketoprofen, Naproxen, D-Glucose (C); (C) marks carrier-mediated compounds]

• log Y = 0.6532 log X − 0.3036, R² = 0.7276 (all drugs)
• log Y = 0.7524 log X − 0.5441, R² = 0.8492 (passively diffusive)
• log Y = 0.542 log X + 0.06, R² = 0.7854 (carrier-mediated)

Sun, D. et al. Comparison of Human and Caco-2 Gene Expression Profiles for 12,000 Genes and the Permeabilities of 26 Drugs in the Human Intestine and Caco-2 Cells. Pharm Res 2002, 19, 1398-1413
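The regression lines quoted from Sun et al. can be applied as a simple conversion from a Caco-2 value to an estimated human jejunal permeability. This sketch assumes that X is the Caco-2 permeability in units of 10⁻⁶ cm/s and Y the human jejunum permeability in units of 10⁻⁴ cm/s, as the axis labels suggest; the function and dictionary names are illustrative.

```python
import math

# (slope, intercept) of log10(Y) = slope * log10(X) + intercept, as quoted on the slide.
FITS = {
    "all": (0.6532, -0.3036),
    "passive": (0.7524, -0.5441),
    "carrier": (0.542, 0.06),
}

def predict_human_permeability(caco2_papp, fit="all"):
    """Estimate human jejunal permeability (in 1e-4 cm/s) from Caco-2 Papp (in 1e-6 cm/s)."""
    if caco2_papp <= 0:
        raise ValueError("permeability must be positive for a log-log relation")
    slope, intercept = FITS[fit]
    return 10 ** (slope * math.log10(caco2_papp) + intercept)

# Example: a made-up Caco-2 Papp of 10 x 1e-6 cm/s under each of the three fits.
for name in FITS:
    print(name, predict_human_permeability(10.0, fit=name))
```

With R² values between roughly 0.73 and 0.85, such estimates are approximate at best.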

Page 21: Presentation of a Structurally Diverse and Commercially

4. Permeability/absorption
In vitro Papp values in human Caco-2 cells

[Figure: bar chart of Papp (10⁻⁶ cm s⁻¹, log scale from 0.01 to 100), with regions marked Low, Medium and High, for Erythromycin, Captopril, Levodopa, Tetracycline, Hydrochlorothiazide, Amiloride, Folic acid, Bendroflumethiazide, Levothyroxine, Sulindac, Terfenadine, Flupenthixol, Metoclopramide, Chlorprothixene, Glipizide, Carisoprodol, Amantadine, Prednisone, Tinidazole, Carbamazepine, Thiamazole, Chlorzoxazone]

Page 22: Presentation of a Structurally Diverse and Commercially

Suggestions on the "Uppsala diverse data set" usage

• The 24 compounds can be used
  – as a test set for testing already derived models of permeability, lipophilicity, solubility etc.
  – as a validation set for new experimental techniques
  – on its own for building and validating models by dividing it into a training set and a test set
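The last suggestion, dividing the 24 compounds into a training set and a test set, could look like the sketch below. A seeded random split with a 25% test fraction is assumed purely for illustration; the slides do not prescribe a splitting scheme, and a design-aware split (for example one compound from each corner pair) may be preferable.

```python
import random

COMPOUNDS = [
    "Captopril", "Bendroflumethiazide", "Glipizide", "Levodopa",
    "Levothyroxine", "Thiamazole", "Amantadine", "Sulindac",
    "Amiloride", "Carbamazepine", "Chlorprothixene", "Hydrochlorothiazide",
    "Chlorzoxazone", "Prednisone", "Tinidazole", "Flupenthixol",
    "Metoclopramide", "Fenofibrate", "Tetracycline", "Folic acid",
    "Carisoprodol", "Meclizine", "Terfenadine", "Erythromycin",
]

def split_train_test(compounds, test_fraction=0.25, seed=0):
    """Shuffle reproducibly and return (training set, test set)."""
    rng = random.Random(seed)
    shuffled = list(compounds)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

train, test = split_train_test(COMPOUNDS)
print("training:", sorted(train))
print("test:", sorted(test))
```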

We hope that other groups are willing to help us to supplement the herein-started characterization

"Benchmark data set"

J. Med. Chem.; (ASAP); 2006; 49(23); 6660-6671

Page 23: Presentation of a Structurally Diverse and Commercially

Acknowledgements

AstraZeneca R&D Mölndal
• Susanne Winiwarter
• Anna-Lena Ungell
• Johan Wernevik
• Fredrik Bergström
• Leif Engström

Sirius Analytical Instruments Ltd
• John Comer
• Karl Box
• Ruth Allen
• Jon Mole

Faculty of Pharmacy, Uppsala University
• Christian Sköld
• Torbjörn Lundstedt
• Anders Hallberg
• Hans Lennernäs