Drugs R D 2007; 8 (6): 349-362
REVIEW ARTICLE
1174-5886/07/0006-0349/$44.95/0
© 2007 Adis Data Information BV. All rights reserved.

Improving Early Drug Discovery through ADME Modelling
An Overview

David S. Wishart

Departments of Biological Science and Computing Science, University of Alberta, Edmonton, Alberta, Canada

Contents
Abstract
1. Measuring ADME
2. ADME Databases
3. Chemical Determinants of ADME
4. ADME Prediction Methods
5. ADME Prediction and Modelling Software
5.1 ADME Parameter Prediction for Rationalising Drug Discovery Efforts
5.2 Metabolic Fate/Stability Prediction for Rationalising Drug Discovery Efforts
5.3 Physiology-Based Pharmacokinetic Prediction for Rationalising Drug Discovery Efforts
6. Looking Ahead

Abstract

Drug development is an intrinsically risky business. Like a high stakes poker game the entry costs are high and the probability of winning is low. Indeed, only a tiny percentage of lead compounds ever reach US FDA approval. At any point during the drug development process a prospective drug lead may be terminated owing to lack of efficacy, adverse effects, excessive toxicity, poor absorption or poor clearance. Unfortunately, the more promising a drug lead appears to be, the more costly it is to terminate its development. Typically, the cost of killing a drug grows exponentially as a drug lead moves further down the development pipeline. As a result there is considerable interest in developing either experimental or computational methods that can identify potentially problematic drug leads at the earliest stages in their development. One promising route is through the prediction or modelling of ADME (absorption, distribution, metabolism and excretion). ADME data, whether experimentally measured or computationally predicted, provide key insights into how a drug will ultimately be treated or accepted by the body. So while a drug lead may exhibit phenomenal efficacy in vitro, poor ADME results will almost invariably terminate its development. This review focuses on the use of ADME modelling to reduce late-stage attrition in drug discovery programmes. It also highlights what tools exist today for visualising and predicting ADME data, what tools need to be developed, and the importance of integrating ADME data to aid in compound selection during the earliest phases of drug discovery. In particular, it highlights what tools exist today for visualising and predicting ADME data including: (1) ADME parameter predictors; (2) metabolic fate predictors; (3) metabolic stability predictors; (4) cytochrome P450 substrate predictors; and (5) physiology-based pharmacokinetic (PBPK) modelling software. It also discusses what kinds of tools need to be developed, and the importance of integrating ADME data to aid in compound selection during the earliest phases of drug discovery.

Drug discovery and development is both expensive and risky. Recent studies have shown that the average cost of bringing a drug to market is in excess of $US800 million.[1,2] This figure is calculated by amortising the costs of 12 very long years of research and development needed to take an interesting ‘hit’ to a marketable product. At any point during this development process, the drug could run into an unforeseen snag or regulatory hurdle and its development may be terminated. Typically only 1 in 30 validated lead compounds makes it into phase I clinical trials and only 1 in 6 drugs makes it from phase I trials into the marketplace.[2] Even drugs that are eventually approved still stand a ~5% chance (42 of 1100 new drug applications over the past 40 years) of having to be withdrawn because of significant adverse drug reactions (figure 1). The cost of failure at any of these stages grows exponentially. Failures in the lead discovery or preclinical stage are typically measured in hundreds of thousands of dollars, failures in phase I trials may cost a company millions, failures in phase III trials typically cost tens to hundreds of millions, and approved-drug withdrawals because of adverse drug reactions can cost pharmaceutical companies billions of dollars.[3,4] Even among widely used or ‘safe’ drugs there is still a risk, not only for the drug company but also for the patient and the prescribing physician. Adverse drug reactions lead to an average of 2 million hospitalisations, 100 000 deaths and thousands of malpractice suits per year in the US alone.[5]

Given the high costs of fending off adverse-drug-reaction suits or recouping losses from late-stage drug withdrawal, the mantra in the drug industry these days is ‘fail early and fail often’. As a result there is considerable interest in developing either experimental or computational methods that can identify problematic drug leads at the earliest stages in their development. One promising route is through the prediction or modelling of how drugs will be absorbed, processed and eliminated by the body. Most pharmaceutical scientists refer to this biological processing as ADME (absorption, distribution, metabolism and excretion). It may also be called ADMET or ADME/Tox, where the ‘T’ and ‘Tox’ refer to toxicology. ADME data, whether predicted or experimentally measured, can provide key insights into whether a drug lead has the ‘right stuff’ to ultimately be a marketable drug. It can also be used to prioritise hits from high throughput screening efforts or to enrich the results from in silico docking experiments.[6-8] In other words, ADME prediction can, when applied to the earliest phases of drug testing and development, be used to reduce late-stage attrition in drug discovery programmes. This review provides a summary of how ADME is measured, where ADME data are archived, the chemical determinants of ADME, how ADME can be predicted and a variety of software tools for visualising and predicting ADME data. It also provides some perspective on how in silico ADME has evolved and where it needs to go to play a more integral role in drug discovery and development.

[Figure 1 graphic not reproduced. Values recovered from the figure labels, in pipeline order. Cost and time: $US280, 3.5 y; $US60, 1 y; $US90, 2 y; $US250, 3 y; $US120 million, 2.5 y. % Success: discovery and preclinical testing 0.02%; phase I 20%; phase II 40%; phase III 60%; US FDA approval 95%.]

Fig. 1. The drug development pipeline. This outlines the (amortised) costs and time taken for many of the key phases of drug development, from lead compound discovery to US FDA approval. The % Success indicates the likelihood a compound at any given stage will obtain or retain FDA approval.

1. Measuring ADME

The barriers a chemical entity has to clear in going from a lead compound to an approved drug are quite daunting. Not only does the compound have to exhibit high activity and specificity against its target, it must also have appropriate physical and chemical properties to be readily absorbed, to pass through multiple cell or tissue layers, to get to the right organs, to perform its activity safely, to be deactivated and ultimately to be eliminated. In other words, activity and specificity are only a small part of what ultimately makes a lead compound into a viable drug. Given the importance of these ‘other’ qualities in drug assessment it is not surprising to learn that almost every compound of interest is thoroughly tested for its absorption, distribution, metabolism, excretion and toxicity (ADME, ADME/Tox or ADMET). Failure in any one of these tests is sufficient and appropriate reason to kill a drug lead.

While ADME is clearly a key issue in drug discovery and development, it is also important to remember that ADME actually refers to a relatively broad set of ill-defined physiological processes. As a result ADME measurements made by pharmaceutical scientists are usually reported in terms of a more refined or quantitative set of parameters.[9] These include aqueous solubility, Caco-2 permeability, blood-brain barrier penetration, volume of distribution (Vd), human intestinal absorption (HIA), oral bioavailability, plasma protein binding, cytochrome P450 (CYP) metabolic stability, and elimination half-life. Table I provides a more detailed description of what these terms mean, how they relate to ADME and how they are measured experimentally.

As seen in this table, the measurement of ADME can be done through a variety of in vivo and in vitro assays. Historically most ADME measurements were done using radiolabelled (14C or 3H) compounds in rodent models.[10,11] These permitted not only the identification of drug metabolites but also the measurement of absorption, bioavailability, tissue distribution, mass balance and routes of excretion. More recently, radioactive in vivo approaches have been supplanted by non-radioactive liquid chromatography-mass spectrometry (LC-MS), gas chromatography-mass spectrometry (GC-MS) or nuclear magnetic resonance (NMR) methods.[12] In particular, the emergence of high-throughput metabolomics (or metabonomics) methods has opened up new vistas for experimentally monitoring ADME and ADMET.[13]

In addition to these recent instrumental advances, many in vivo ADME assays are now being replaced by in vitro cell models, such as Caco-2 or Madin-Darby canine kidney (MDCK) cells for intestinal absorption, liver microsomes for monitoring drug metabolism, immobilised artificial membrane chromatography (IAMC) for measuring lipid partitioning and bovine brain microvessel endothelial cells for monitoring blood-brain barrier penetration. The use of well defined in vitro assays has made many ADME measurements more accessible and more reproducible.[14] Similarly, the development of quantitative gene chip assays such as the Affymetrix AmpliChip CYP450[15] is allowing the effects of CYP polymorphisms on drug metabolism, drug-drug interactions and drug toxicity to be monitored in phase I, II and III trials.

Even with these advances and improvements, the experimental measurement of ADME or ADMET is still expensive and time consuming. As a result there has been a strong motivation, particularly over the last decade, to develop alternative ways to inexpensively and rapidly measure, model or predict ADME. Most of these alternative methods are based on theoretical, statistical or computational approaches. However, for these methods to work it is critical to have a sufficiently large and reliable training set (derived from experiments) from which to learn or develop the appropriate modelling software. This is where ADME databases come in.


Table I. Summary of common ADME (absorption, distribution, metabolism and excretion) parameters and experimental measurement methods

ADME measurement | Definition or description | Experimental tools
Aqueous or water solubility | Ability of a drug to dissolve in water. Usually reported as logS. Typically, poor solubility leads to poor absorption and therefore a general goal is to avoid poorly soluble compounds. 80% of approved drugs have a logS > -4. Relates to A and D in ADME | Turbidimetry; laser nephelometry
Permeability coefficient, membrane permeability, Caco-2 permeability, intestinal transcellular permeability | A measure of in vitro intestinal drug absorption and transport. It is commonly reported as an apparent permeability coefficient (Papp or Pm) and is usually measured in cm/sec. Most methods use Caco-2 or MDCK cell monolayers. The Caco-2 cell line is derived from human colorectal carcinoma and exhibits close similarity to the small intestinal columnar epithelium. Other in vitro assays (IAMC or PAMPA) can also be used to measure passive membrane diffusion. Relates to A and D in ADME | Caco-2 cell lines; MDCK cells; IAMC; PAMPA
Blood-brain barrier penetration or permeability | Blood-brain barrier penetration is based on a measure of the partitioning of a compound across the blood-brain barrier. The property (logBB) is measured experimentally as the logarithm of the ratio of the concentration in the brain to that in the blood. Relates to A, D and M in ADME | Bovine brain microvessel cells; IAMC
Volume of distribution | Defined as the volume in which the amount of drug would need to be uniformly distributed to produce the observed blood concentration. It is equal to the total amount of the drug in the body divided by the concentration of the drug in the blood (usually expressed as L/kg). It is used to quantify drug distribution. A large volume of distribution implies wide distribution and/or extensive tissue binding. Relates to D in ADME | Animal pharmacokinetics; radio-HPLC; LC-MS/MS; GC-MS
% Human intestinal absorption or % oral absorption or % fractional absorption | % Human intestinal absorption (% oral absorption) is an in vivo measure defined as the % dose of orally administered drug to reach the hepatic portal vein. It can also be described as the % of urinary excretion of drug-related material following oral administration, or the ratio of the total mass absorbed divided by the drug dose (% fractional absorption). Oral absorption takes into consideration metabolism that occurs in the gut wall, but not first-pass metabolism in the liver. Relates to A, D and M in ADME | Radio-HPLC; autoradiography; LC-MS/MS; GC-MS; NMR
Oral bioavailability | Defined as the availability of a drug to the general circulation or site of pharmacological actions. It is also defined as the % of oral dose available to the general blood circulation. Relates to A and E in ADME | GC-MS; LC-MS/MS
Plasma or serum protein binding | Plasma protein binding values (% fraction bound) are often given as % of the total plasma concentration of a drug that is bound to all plasma proteins. Drugs with high % protein binding values tend to have a greater half-life than those with lower values. Relates to D in ADME | HPLC; ultrafiltration; equilibrium dialysis
Cytochrome P450 (CYP) metabolic stability or phase I-II metabolic fate | Measures a drug’s resistance or proclivity to first-pass hepatic metabolism by CYP enzymes. It also identifies which CYP variants are most likely to degrade the drug. Metabolic stability is usually expressed as in vitro half-life and intrinsic clearance. Relates to M and E in ADME | Microsome or supersome assays; hepatocyte assays; LC-MS/MS
Elimination or plasma half-life | The time taken for the body to eliminate or break down one-half of a dose of a drug. More precisely, it is the time taken for the plasma concentration to fall to one-half its original value. Usually measured in minutes or hours. Relates to M and E in ADME | Radio-HPLC; LC-MS/MS; GC-MS; NMR

GC-MS = gas chromatography-mass spectrometry; HPLC = high-performance liquid chromatography; IAMC = immobilised artificial membrane chromatography; LC-MS/MS = liquid chromatography-mass spectrometry/mass spectrometry; MDCK = Madin-Darby canine kidney; NMR = nuclear magnetic resonance; PAMPA = parallel artificial membrane permeability analysis.
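
Several of the definitions in table I are simple ratios, so they can be written down directly. The following is a minimal Python sketch rather than a validated pharmacokinetic tool. The function names are hypothetical, logBB is taken as the base-10 logarithm of the brain-to-blood concentration ratio, and the half-life helper assumes simple first-order (mono-exponential) elimination, which table I itself does not state.

```python
import math


def volume_of_distribution(amount_in_body_mg, blood_conc_mg_per_l, body_weight_kg):
    """Vd in L/kg: amount of drug in the body divided by the blood
    concentration, normalised by body weight."""
    return amount_in_body_mg / blood_conc_mg_per_l / body_weight_kg


def log_bb(conc_brain, conc_blood):
    """logBB: log10 of the brain-to-blood concentration ratio.
    Both concentrations must be in the same units."""
    return math.log10(conc_brain / conc_blood)


def percent_bound(bound_conc, total_plasma_conc):
    """Plasma protein binding as % of the total plasma concentration."""
    return 100.0 * bound_conc / total_plasma_conc


def half_life(rate_constant_per_h):
    """Elimination half-life in hours.
    ASSUMPTION: first-order decay, C(t) = C0 * exp(-k * t)."""
    return math.log(2) / rate_constant_per_h


def conc_after(c0, rate_constant_per_h, hours):
    """Plasma concentration after `hours` under the same first-order assumption."""
    return c0 * math.exp(-rate_constant_per_h * hours)
```

With illustrative inputs (not measurements), a brain concentration one-tenth of the blood concentration gives a logBB of -1, and the concentration returned by `conc_after` at `half_life(k)` hours is half the starting value, which matches the more precise definition of half-life in the table.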

2. ADME Databases

Pharmaceutical scientists have been measuring and publishing ADME and ADMET data for decades. As a result there is a growing number of both freely available and commercially distributed ADME or ADMET databases. These contain quantitative data about many of the ADME parameters listed in table I. Almost all of these databases represent compilations of literature-derived data, although some represent the results of comprehensive, controlled experimental assessments. Table II summarises some of the main ADME databases and published data sets that are now available, along with brief descriptions of their content.

Table II. Useful ADME (absorption, distribution, metabolism and excretion) databases and data sets

Database name | URL or reference | Contents
DrugBank (free) | http://www.drugbank.ca | Elimination half-life; serum protein binding; % oral absorption; water solubility; CYP interactions
UCSD ADME Database (free) | http://modem.ucsd.edu/adme/databases/databases_extend.htm | Water solubility; Caco-2 permeability; BBB permeability; % oral absorption; oral bioavailability
AQUASOL Database (subscription) | http://www.pharmacy.arizona.edu/outreach/aquasol/ | Water solubility
PhysProp Database (free/subscription) | http://www.syrres.com/esc/physdemo.htm | Water solubility; logP; melting/boiling point
CYP450 Drug Interaction Table (free) | http://medicine.iupui.edu/flockhart/table.htm | CYP drug interaction data
BBB Permeability Data Set | Garg & Verma[16] | BBB permeability
HIA or % Oral Absorption Data Set | Zhao et al.,[17] Iyer et al.[18] | HIA (% oral absorption)
Volume of Distribution Data Set | Lombardo et al.[19] | Human volume of distribution
MDL Metabolite Database (commercial) | http://www.mdl.com/products/predictive/metabolite/index.jsp | Drug metabolism and transformation data
BioRad Know-It-All Databases (commercial) | http://www.knowitall.com/adme | LogP; HIA; bioavailability; BBB permeability; serum protein binding
PharmaInformatic Databases (commercial) | http://www.pharmainformatic.com | Bioavailability; elimination half-life; volume of distribution; BBB permeability; water solubility; serum protein binding; CYP interactions

BBB = blood-brain barrier; CYP = cytochrome P450; HIA = human intestinal absorption.

As can be seen from this table, there is no single database that contains all known ADME descriptors or parameters. With the exception of the PhysProp and AQUASOL databases (which contain 10 000+ compounds), most of these ADME resources contain a relatively modest number of ADME parameters. For instance, most Caco-2 permeability and blood-brain barrier permeability databases have fewer than 120 measurements, while most oral bioavailability/absorption databases contain fewer than 700 entries. The relatively small size of these databases has proven to be a bit of a hindrance to the testing and training of many ADME predictors.[20] Similarly, the continuing confusion about the meaning of certain ADME terms (oral bioavailability, oral absorption, intestinal permeability, permeability coefficient) combined with their frequent interchangeability has made data collection from literature values somewhat difficult.[21] In addition to these issues, there are a number of lingering concerns over the quality or consistency of some types of ADME measurements, particularly with in vivo data. Indeed, many in vivo data sets are gathered from widely different animal models (mice, rats, dogs, humans) using somewhat different protocols.

Nevertheless, the existence of these databases, and the continuing efforts to expand and improve the quality and quantity of data in them, is what is driving most of the field of predictive ADME or ADME modelling. Certainly the dozen or so databases listed in table II serve as the primary training, testing and validating data sets for most of today’s predictive ADME software.

3. Chemical Determinants of ADME

The primary goal of computational ADME prediction and modelling is to be able to accurately predict multiple ADME parameters without having to do costly in vivo or in vitro experiments. In an ideal world, ADME predictions would serve as hypothesis generators allowing medicinal chemists and pharmaceutical chemists to make informed decisions about whether to conduct a series of pharmacokinetic experiments or to go back to the drawing board and redesign a completely new chemical entity. The goal of in silico ADME prediction is to have users submit only a simple two-dimensional sketch (or an MOL file or a SMILES [Simplified Molecular Input Line Entry System] string) of a molecule as input and the programme should automatically generate a predicted set of physiological parameters (such as blood-brain permeability [logBB], aqueous solubility [logS], apparent permeability coefficient [Papp], oral absorption [%OA] or 50% lethal dose [LD50]) as output.

The challenge in ADME prediction is to find quantitative relationships between the physicochemical properties of the compound of interest and its physiological (i.e. ADME) properties. This requires having both quantitative descriptors of the molecules and quantitative descriptors of their physiological properties. As we have seen in tables I and II, quantitative descriptors and tabulations already exist for many ADME properties. What about quantitative descriptors of molecules? Fortunately, the fields of cheminformatics and quantitative structure-activity relationships (QSAR) have laid a solid foundation for providing quantitative and informative descriptors for almost any chemical entity.[22,23] Table III provides a summary of some of the more common or well known chemical property descriptors that are used in ADME modelling. These features can describe the structure, shape, size, flexibility, solubility and electrical properties of almost any organic compound. Furthermore, many of these physicochemical properties can be rapidly derived or accurately predicted using only the molecule’s two-dimensional structure or even its SMILES string.[24] In fact, physicochemical property prediction or calculation has been an integral part of many chemistry software packages for more than 30 years.

Most of today’s commercial chemistry software vendors, such as ACD/Labs, CambridgeSoft, Tripos and Accelrys, offer very high quality and very accurate chemical property prediction tools. However, many of these property descriptions or predictions are freely available over the internet through a variety of high quality web servers such as FAF-Drugs[25] and other open source or open access tools.[22,26]

Table III also outlines which molecular property descriptors have been shown (in different predictive models) to contribute significantly to which ADME properties. As a general rule any single ADME property (such as logBB or Papp) is determined by a combination of anywhere from 2 to 20 individual chemical property descriptors. Determining which combination of chemical descriptors provides the most reliable prediction of a given ADME property has occupied the efforts of hundreds of scientists and statisticians for most of the past decade.

4. ADME Prediction Methods

The availability of quantitative and reliable QSAR data about known drugs in combination with quantitative (but somewhat less reliable) data about their known ADME properties led a number of scientists in the 1990s to use existing data to predict ADME properties for new drug candidates. These early efforts led to some very simple empirical rules to predict the drug-likeness or oral absorption/intestinal permeability of a given compound.


Table III. Common chemical descriptors used in ADME (absorption, distribution, metabolism and excretion) modelling and prediction

Chemical property | Description | ADME relevance
LogP | Water-octanol partition coefficient. The ratio of the respective concentrations of a compound in the octanol and water phases of a 2-phase system at equilibrium. Measures hydrophobicity | Oral bioavailability; BBB partitioning; intestinal absorption; plasma protein binding
ClogP | Calculated logP or water-octanol partition coefficient | Oral bioavailability; BBB partitioning; plasma protein binding
AlogP | Ghose-Crippen-Viswanadhan water-octanol partition coefficient | Oral bioavailability; BBB partitioning
LogS | Aqueous or water solubility usually expressed as mol/L or mg/mL | Oral bioavailability; BBB partitioning; plasma protein binding
LogD | The apparent water-octanol partition (or distribution) coefficient measured for ionic species. A combination of logP and pKa | Permeability coefficient; volume of distribution
pKa | The negative log of the acid ionisation constant (Ka). It describes the ability of an ionisable group of an organic compound to donate a proton (H+) in an aqueous medium | Water solubility; volume of distribution
Molecular weight (MW) | The sum of the atomic weights of all the atoms in a molecule | Oral bioavailability; BBB partitioning; CYP interactions
Polar surface area (PSA) | The surface area (Å²) of N, O, P and S atoms | Oral bioavailability; intestinal absorption; BBB partitioning
No. of hydrogen bond donors (nHBD) | Number of relatively electronegative atoms such as N, O, S or F with attached hydrogen atoms | Oral bioavailability; BBB partitioning; intestinal absorption
No. of hydrogen bond acceptors (nHBA) | Number of relatively electronegative atoms such as N, O, S or halogens with an available lone pair | Oral bioavailability; BBB partitioning
No. of rotatable bonds (nRB) | Number of sp3 (single) bonds, not in a ring, bound to a non-terminal heavy atom | Oral bioavailability; BBB partitioning
Molar refractivity (MR) | Molar refractivity is a measure of the volume occupied by an atom or group. It varies with temperature, index of refraction and pressure | Oral bioavailability; intestinal absorption
Molecular volume (Vm) | Volume occupied by 1 mole of molecule; it equals the molecular weight divided by the density | CYP interactions; water solubility; intestinal absorption; oral bioavailability; BBB partitioning
Dipole moment (Dp or μ) | A measure of the electrical polarity of a molecule with partially charged atoms | Water solubility; CYP interactions; serum protein binding
F(H2O) water solvation energy | Free energy of dissolving a compound in water | BBB partitioning; permeability coefficient
Radius of gyration (Rg) | Distance between the axis of a rotating body and its centre of gyration. A combined measure of molecular volume and shape | CYP interactions
Miscellaneous topological descriptors (Wiener index, Kier shape index, Kier flexibility index, Kier-Hall valence connectivity) | These features describe the bond connectivity, bond types and overall shape for a given molecule | Water solubility; oral absorption; serum protein binding

BBB = blood-brain barrier; CYP = cytochrome P450.
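
Two of the descriptors in table III follow almost mechanically from their definitions, which makes them convenient to illustrate. The Python sketch below is a hedged illustration only and is not taken from any of the software packages named in the text. The function names are hypothetical, the formula parser handles only simple formulas without brackets, the atomic-weight table is a small hand-entered subset of rounded standard values, and the logD expression assumes a monoprotic acid whose neutral form alone partitions into octanol.

```python
import math
import re

# Rounded standard atomic weights; a small subset chosen for illustration.
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
                  "F": 18.998, "P": 30.974, "S": 32.06, "Cl": 35.45}


def molecular_weight(formula):
    """Sum the atomic weights of all atoms in a simple formula such as 'C9H8O4'."""
    total = 0.0
    matched = 0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
        matched += len(symbol) + len(count)
    if matched != len(formula):
        # Brackets, charges or other characters are outside this sketch.
        raise ValueError(f"unsupported formula: {formula!r}")
    return total


def log_d_monoprotic_acid(log_p, pka, ph=7.4):
    """Apparent distribution coefficient from logP and pKa.
    ASSUMPTION: monoprotic acid, only the neutral species enters octanol."""
    return log_p - math.log10(1.0 + 10.0 ** (ph - pka))
```

For a monoprotic base the exponent would change sign (pKa minus pH); that case is left out to keep the sketch short. When the pH equals the pKa, half of the compound is ionised and the function returns logP minus log10(2).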

The most well known of these is ‘Lipinski’s rule of 5’.[27] According to this rule, if the molecular weight of a drug lead is >500 daltons, its water-octanol partition coefficient (logP) >5, its number of hydrogen bond donors (nHBD) >5 and its number of hydrogen bond acceptors (nHBA) >10, it will exhibit poor absorption and permeability. In other words it is a ‘non-drug’. Other rules for predicting drug-likeness, developed using larger training sets and more sophisticated statistical methods, have followed.[28,29] Not unexpectedly, these have proven to be somewhat more accurate than Lipinski’s original rule of five.
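
Because the rule of 5 is just four threshold tests, it is straightforward to express in code. The following Python sketch uses the thresholds quoted above; the class and function names are hypothetical, the descriptor values are assumed to have been computed elsewhere, and the number of exceeded thresholds needed to flag a compound is exposed as a parameter because screening tools differ on that choice.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mw: float    # molecular weight, daltons
    logp: float  # water-octanol partition coefficient
    n_hbd: int   # number of hydrogen bond donors
    n_hba: int   # number of hydrogen bond acceptors


def rule_of_five_violations(d):
    """Return the names of the rule-of-5 thresholds that the compound exceeds."""
    checks = {
        "MW > 500": d.mw > 500,
        "logP > 5": d.logp > 5,
        "nHBD > 5": d.n_hbd > 5,
        "nHBA > 10": d.n_hba > 10,
    }
    return [name for name, violated in checks.items() if violated]


def likely_poor_absorption(d, min_violations=4):
    """Flag a compound once at least `min_violations` thresholds are exceeded.

    The default of 4 mirrors the wording in the text (all four conditions
    together). A lower cut-off gives a stricter filter and is an assumption
    of this sketch, not something the text specifies.
    """
    return len(rule_of_five_violations(d)) >= min_violations
```

For example, a hypothetical compound described by `Descriptors(mw=650.0, logp=3.0, n_hbd=2, n_hba=11)` exceeds two thresholds, so it is flagged with `min_violations=2` but not with the default.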


Table IV. Summary of benefits and limitations of different machine learning algorithms

Decision tree or rule-based system
Benefits: Easy to understand and efficient training algorithm; order of training instances has no effect on training; pruning can deal with the problem of over-fitting
Assumptions and/or limitations: Classes must be mutually exclusive; final decision tree dependent on order of attribute selection; errors in training set can result in overly complex decision trees; missing values for an attribute make it unclear about which branch to take when that attribute is tested

Naive Bayes
Benefits: Foundation based on statistical modelling; easy to understand and efficient training algorithm; order of training instances has no effect on training; useful across multiple domains; works well with noisy data
Assumptions and/or limitations: Assumes attributes are statistically independent; assumes normal distribution on numeric attributes; classes must be mutually exclusive; redundant attributes mislead classification; attribute and class frequencies affect accuracy

Neural network
Benefits: Can be used for classification or regression; able to represent Boolean functions (AND, OR, NOT); tolerant of noisy inputs; instances can be classified by more than one output
Assumptions and/or limitations: Difficult to understand structure of algorithm; too many attributes can result in over-fitting; optimal network structure can only be determined by experimentation

Support vector machine
Benefits: Models nonlinear class boundaries; over-fitting is unlikely to occur; computational complexity reduced to quadratic optimisation problem; easy to control complexity of decision rule and frequency of error; can be used for classification or regression
Assumptions and/or limitations: Training is slow compared with Bayes and decision trees; difficult to determine optimal parameters when training data are not linearly separable; difficult to understand structure of algorithm

Genetic algorithm
Benefits: Simple algorithm, easy to implement; can be used in feature classification and feature selection; primarily used in optimisation; always finds a ‘good’ solution (not always the best solution)
Assumptions and/or limitations: Computation or development of scoring function is non-trivial; not the most efficient method to find some optima, tends to find local optima rather than global; complications involved in the representation of training/output data
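Over-fitting appears repeatedly in table IV, and one common check for it is n-fold cross validation, in which every observation is predicted by a model that never saw it during training. The sketch below is a minimal illustration that uses a one-descriptor least squares line as a stand-in for a real ADME model; the function names, the choice of model and the synthetic data are all assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def cross_validated_rmse(xs, ys, k=5, seed=0):
    """Root mean squared error over predictions made on held-out folds only."""
    squared_error = 0.0
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        squared_error += sum(
            (ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out
        )
    return (squared_error / len(xs)) ** 0.5


# Synthetic, exactly linear data: the held-out error should be close to zero.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
rmse = cross_validated_rmse(xs, ys, k=5)
```

A large gap between the error on the training data and the cross-validated error is the usual warning sign that a model has more adjustable parameters than the data can support.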

The success of these early ‘qualitative’ ADME efforts has led to the development of a number of more advanced approaches aimed at quantitatively predicting many ADME parameters listed in table I.[14,18,20,30] These approaches can basically be divided into two categories: statistical methods and machine learning methods. The statistical methods used to date generally employ the classical methods of QSAR and multivariate statistics, such as partial least squares fitting and linear regression. These approaches generally work well, although more recently they have been supplanted by machine learning techniques such as artificial neural networks, decision trees or rule-based systems, support vector machines, naive Bayes methods and genetic algorithms. A brief summary of the strengths and limitations of these machine learning methods is given in table IV.

A more detailed description of all of these machine learning techniques is beyond the scope of this review; however, some excellent primers or summaries are available.[31-33] As a general rule, machine learning methods tend to work better than statistical approaches, especially with noisy data and smaller training sets. This is especially true in the area of ADME prediction. Recently a comprehensive analysis of more than 60 published approaches to drug absorption and drug permeability has appeared.[34] This study affirmed the superior predictive strength of many machine learning methods, with many of the newer approaches obtaining correlation coefficients of >0.90 between a variety of observed and predicted ADME parameters (logBB, fractional absorption [%FA] and Papp).

The reason for the success of machine learning lies in its ability to make inferences and decisions that are ‘not allowed’ in conventional statistical methodologies. Unlike statistical methods, which are purely numerical, machine learning methods can employ Boolean logic (AND, OR, NOT), absolute conditionality (IF, THEN, ELSE), conditional probabilities (the probability of X given Y) and unconventional optimisation strategies to model data or classify patterns. Machine learning still draws heavily from statistics and probability, but it is fundamentally more powerful because it allows inferences or decisions to be made that could not otherwise be made using conventional statistical methodologies.[31] For instance, many statistical methods such as multivariate regression or correlation analysis assume that the variables are independent and that data can be modelled using linear combinations of these variables. When the relationships are nonlinear and the variables are interdependent (or conditionally dependent) conventional statistics usually flounders. It is in these situations that machine learning tends to excel.

Regardless of whether an ADME predictor uses multivariate statistics or neural networks, it can still fall victim to ‘the curse of dimensionality’. This refers to the problem of over-fitting data when the number of variables is large and the training set (i.e. the number of observations) is small. Given that many ADME data sets are quite small (<200 observations) and the number of QSAR variables can be quite large (>200), there have been more than a few published ADME models that appear to over-fit their data or which were later found to perform poorly when given novel inputs (see assessments in Hou et al.[30,34]). The lack of rigorous testing, particularly in the early days of in silico ADME or ADMET, led to a number of pessimistic assessments of the field’s future.[35,36] With the advent of larger data sets and more rigorous testing protocols (such as n-fold cross validation or testing on never-before-seen data), most ADME modelling methods published today are significantly more reliable and robust.

5. ADME Prediction and Modelling Software

Many of the methods described above for ADME modelling and prediction have found their way into a variety of freely available web servers or commercially available software packages. A partial list of these packages appears in table V. There are basically three types of ADME prediction and modelling systems: ADME parameter/property prediction software, metabolic fate prediction/modelling software, and physiology-based pharmacokinetic (PBPK) modelling software. All three kinds of software typically accept ‘raw’ chemical structures (MOL, SDF [Structure Data Format] or SMILES files) or ‘raw’ chemical descriptor data (logP, molecular weight, the water-octanol distribution coefficient [logD]) and generate numerical or visual ADME predictions. A more detailed discussion of the specifics of each of the three kinds of software tools and their applications to rationalising drug discovery efforts is given below.

5.1 ADME Parameter Prediction for Rationalising Drug Discovery Efforts

ADME parameter-prediction software typically generates numeric estimates for logP, logD, logS, %FA, %OA, HIA, Vd, plasma half-life (t1/2), logBB and/or Papp. Most of these predictions are generated using statistical (partial least squares fitting), neural network or support vector machine methods. As shown in table V, different software packages provide very different levels of ADME parameter coverage, with the commercial packages (e.g. BioRad’s KnowItAll or Simulation-Plus’s ADMET Predictor) typically providing the most comprehensive set of predictions and the free web servers (such as Pre-ADME and Actelion) providing the least. Because the field is quite competitive most vendors provide very detailed descriptions of their methods and comparative performance.

Table V. Commonly used or well known ADME (absorption, distribution, metabolism and excretion) servers and software packages

Actelion Property Explorer Server
URL: http://www.actelion.com/uninet/www/www_main_p.nsf/Content/Technologies+Property+Explorer
Capabilities: Toxicity risks; ClogP; solubility; drug-likeness; drug-score

Pre-ADME and Pre-ADMET Server
URL: http://preadmet.bmdrc.org/preadmet/index.php
Capabilities: Drug-likeness; Caco-2 permeability; MDCK permeability; BBB penetration; intestinal absorption; plasma protein binding; Ames test

ADMET Predictor
URL: http://www.simulations-plus.com/products/predictor/predictor.html
Capabilities: pKa; intestinal permeability (Peff); MDCK permeability (Papp); water solubility; logD; volume of distribution; BBB penetration; plasma protein binding; multi-toxicity models

KnowItAll
URL: http://www.knowitall.com/adme
Capabilities: Absorption rate; bioavailability; BBB permeability; elimination half-life; first-pass metabolism; logD; plasma protein binding; drug-likeness; immuno- and genotoxicity

VolSurf
URL: http://www.moldiscovery.com/soft_volsurf.php
Capabilities: Water solubility; Caco-2 permeability; BBB penetration; plasma protein binding; volume of distribution; hERG modelling; CYP3A4 metabolism

ChemSilico
URL: http://www.chemsilico.com/
Capabilities: LogD; logP; pKa; BBB penetration; serum protein binding; intestinal absorption; genotoxicity

Pharma Algorithms
URL: http://www.ap-algorithms.com/
Capabilities: Oral bioavailability; logD; P-glycoprotein substrate binding; absorption; active transport properties; acute toxicity

MexAlert
URL: http://www.compudrug.com/
Capabilities: First-pass metabolic fate and metabolic route prediction

MetabolExpert
URL: http://www.compudrug.com/
Capabilities: Metabolic fate and transformation prediction

MetaDrug
URL: http://www.genego.com/metadrug.php
Capabilities: Phase I and II metabolite prediction; off-target effect prediction

Meteor
URL: http://www.lhasalimited.org/index.php
Capabilities: Metabolic fate and transformation prediction

MetaSite
URL: http://www.moldiscovery.com/soft_metasite.php
Capabilities: CYP metabolic fate and metabolising site prediction

GastroPlus
URL: http://www.simulations-plus.com/products/gastro_plus.html
Capabilities: Multicompartment PK models; PK data fitting; physiological system modelling; modelling drug release, dissolution, transport and absorption

BBB = blood-brain barrier; ClogP = calculated logP or water-octanol partition coefficient; CYP = cytochrome P450; hERG = human ether-a-go-go related gene potassium channel Kv11.1; logD = water-octanol distribution coefficient; logP = water-octanol partition coefficient; MDCK = Madin-Darby canine kidney; Papp = apparent permeability coefficient; Peff = effective intestinal permeability; PK = pharmacokinetic; pKa = acid dissociation constant.

ADME parameter predictors can be and have been used quite effectively to focus, reduce or rationalise the number of compounds tested in chemical library screens. They have also been used to limit the composition of a chemical library or the number/type of compounds in virtual libraries. More recently, ADME parameter predictors have been used in conjunction with compound screening to generate lead compounds with very favourable oral absorption properties. Published examples of these applications include the use of in silico ADME techniques to assist with the simplification and characterisation of an anti-HIV quinolone library,[37] the simplification and reduction in size of a virtual library of peroxisome proliferator-activated receptor inhibitors,[38] the creation of orally active CRTH2 (chemoattractant receptor expressed on TH2 cells) anti-inflammatory antagonists from an ADME-focused library[39] and the creation of a focused natural product scaffold library for drug screening.[40] Many other examples are also discussed in other in silico ADME reviews.[14,25,34]
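As a rough illustration of how predicted ADME values might be used to focus a virtual library, the sketch below keeps only compounds whose predicted fractional absorption and solubility clear user-chosen cut-offs and then ranks the survivors. The record fields, compound names and cut-off values are hypothetical and are not recommendations drawn from any of the packages in table V.

```python
from typing import List, NamedTuple


class PredictedADME(NamedTuple):
    """Hypothetical record of predictions returned by some ADME predictor."""
    name: str
    percent_fa: float  # predicted fractional absorption (%FA)
    log_s: float       # predicted aqueous solubility (logS)


def focus_library(
    library: List[PredictedADME],
    min_percent_fa: float = 70.0,  # illustrative cut-off, an assumption
    min_log_s: float = -5.0,       # illustrative cut-off, an assumption
) -> List[PredictedADME]:
    """Drop compounds below either cut-off, then rank by predicted %FA."""
    kept = [
        c for c in library
        if c.percent_fa >= min_percent_fa and c.log_s >= min_log_s
    ]
    return sorted(kept, key=lambda c: c.percent_fa, reverse=True)


# Invented predictions for four placeholder compounds.
virtual_library = [
    PredictedADME("cmpd-001", percent_fa=92.0, log_s=-3.1),
    PredictedADME("cmpd-002", percent_fa=45.0, log_s=-2.0),
    PredictedADME("cmpd-003", percent_fa=88.0, log_s=-6.4),
    PredictedADME("cmpd-004", percent_fa=75.0, log_s=-4.2),
]
focused = focus_library(virtual_library)
```

With these made-up numbers, `cmpd-002` fails the absorption cut-off and `cmpd-003` fails the solubility cut-off, so only two of the four compounds would go forward to screening.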


5.2 Metabolic Fate/Stability Prediction for Rationalising Drug Discovery Efforts

The prediction of a lead compound’s metabolic fate or stability is particularly important if researchers are attempting to identify which enzymes are metabolising a given compound, how they are metabolising it, where the transformations are happening and how it is being cleared.[41] Of particular interest in biotransformation studies is the possibility that a potentially useful drug compound could be transformed into a reactive intermediate. Such an intermediate could elicit toxic adverse effects through the covalent reaction and modification of important signalling proteins or enzymes. This is one of the prime motivations for developing metabolic fate/stability predictors.

There are three types of ADME-related metabolic predictors: metabolic fate predictors, metabolic stability predictors and CYP substrate predictors. Metabolic fate predictors typically generate hypothetical molecular structures of possible phase I or phase II metabolites using rule-based systems, database similarities (to known structures) or decision trees. Meteor, MetabolExpert and MexAlert are all examples of commercial metabolic fate predictors. The second group of predictors, metabolic stability predictors, typically predict in vitro t1/2 and intrinsic clearance (CLint), which in turn can be used to predict pharmacokinetic parameters such as bioavailability and in vivo t1/2. Unfortunately, there are relatively few software packages for these kinds of predictors.[41] The third class of metabolic predictors, CYP substrate predictors, use rule-based systems or machine learning tools to predict a compound’s CYP enzyme isotype specificity, its metabolic hotspots or its activity (activator or inhibitor).[42,43] This latter feature is particularly useful in predicting drug-drug interactions.[41] MetaSite is an example of a software package that predicts metabolic hotspots, while KnowItAll is an example of a package that supports CYP substrate prediction. There are also a number of published (but not yet available) CYP substrate predictors that use machine learning methods to predict more extensive CYP interactions.[42,43]

Among the three types of metabolism predictors, drug-drug interaction (CYP interaction) predictors and metabolic fate predictors are generally considered the most reliable, while the predictors of metabolic stability and consequent t1/2 or induction potential are considered less reliable.[44] In terms of rationalising or focusing early-stage drug discovery, it is quite clear that in silico prediction of drug metabolism and drug-drug interactions is routinely being used as a screening tool in the pharmaceutical industry. These tools are allowing pharmaceutical researchers to reduce the size of their high-throughput screening libraries and to focus on smaller numbers of higher likelihood leads.[41,44] Likewise, the prediction of metabolic hotspots (as done with MetaSite) is proving to be quite important in early-phase drug discovery, particularly since it helps guide chemists towards the synthesis of new compounds with reduced metabolic liability. However, the literature is not exactly replete with examples showing how metabolic prediction software has substantially changed the drug discovery processes. This may be because these tools are somewhat less developed and less reliable than other in silico ADME techniques.[44]

5.3 Physiology-Based Pharmacokinetic Prediction for Rationalising Drug Discovery Efforts

Long before the introduction of ‘omics’ techniques to ADME, PBPK modelling actually played a very prominent role in the field of ADME prediction and analysis.[45] With improvements in both computer technology and metabolite measurement techniques, PBPK is experiencing something of a resurgence in its application to ADME modelling.[46] Rather than predicting ADME properties from standard statistical approaches, PBPK methods attempt to predict some of these properties by accurately modelling physiological processes using differential equations and chemical property data. For instance, tools such as GastroPlus (table V) and IDEA[47] accept chemical property descriptors (logP, diffusion coefficient, solubility) and perform pharmacokinetic simulations of the rate and extent of absorption in the gastrointestinal tract. These PBPK models are quite sophisticated. For example, the GastroPlus package is based on the advanced compartmental absorption and transit (ACAT) model.[48] In this model the gastrointestinal tract is divided into nine compartments, one corresponding to the stomach, seven for the small intestine and one for the colon, with the absorption process being described via >80 differential equations. The information generated from these PBPK simulations can be used to help identify which molecular parameters have the greatest effect on the predicted absorption for a given class of screened compounds. This allows drug researchers to rationalise their choice of follow-on compound characterisation experiments. For instance, if solubility is predicted to have little effect on the absorption of a certain class of drug leads, there would be no need to run extensive solubility experiments on this class of drug leads.

These kinds of PBPK models can also be used to assist later stages in drug development, particularly during the drug formulation phases.[49] This is because PBPK simulations can help evaluate and design dosage formulations (changing particle size or particle density in silico), numerically assess or simulate controlled release profiles and evaluate variations in drug transit times. Obviously these sophisticated simulation approaches could make a number of aspects of drug formulation and development much faster and more focused.

6. Looking Ahead

Computational ADME and ADMET prediction is clearly having a positive impact on the pharmaceutical industry. The growing number and improving quality of dedicated ADME predictors, web servers, databases, software packages and companies is a strong testament to this fact. However, as with any new or emerging field there is a tendency to over-inflate its performance or potential applications. This can often lead to unreasonable expectations and a subsequent degree of disenchantment or pessimism about the future of the field.[35,36,44] Fortunately, the in silico ADME community appears to be keen to find ways to boost its predictive performance and to embrace novel approaches that promise improved prediction and modelling. This is a good sign.

Three areas that ADME prediction and modelling will need to focus on in the coming years are: (1) improving the quality and quantity of data in ADME databases; (2) developing consensus approaches to improve ADME prediction accuracy; and (3) developing ADME data visualisation tools. The Achilles heel of most ADME predictions is the lack of sufficient numbers of high quality experimental ADME data sets.[34-36] Future efforts must focus on generating gold standard training and test data, preferably obtained from a small number of core laboratories using a single consensus experimental protocol for each ADME measurement. This kind of effort is not without precedent, as similar standardisation approaches have been adopted or proposed to generate validated proteomic and microarray data.[50] There is little doubt that significant improvements in the quality and size of ADME databases will lead to significant improvements in the quality of ADME predictors.[34,44]

A second area that is likely to lead to long-term improvements in ADME prediction quality is the use of multiple models and consensus scores. The use of consensus predictions is common in many areas of bioinformatics and it often leads to improvements of 5–10% in accuracy. However, the use of multiple ADME predictors and consensus ADME predictions appears to be relatively uncommon.[34,51] As others begin to adopt this strategy, the quality of ADME predictions and models is likely to improve. The third area within ADME prediction and modelling that needs to be developed lies in the field of data visualisation. With the move to virtual libraries, virtual screening, virtual docking and, now, virtual ADME there is going to be a growing need for significantly improved data handling.[52] Rather than processing 10 or 100 compounds at a time, it is possible that billions of compounds could be routinely processed in these ‘virtual laboratories’. Given that many screening and ADME predictors generate dozens of data points for each molecule, there is a clear need to
develop far more sophisticated data reduction and data visualisation tools.[53] The use of microarray-style visualisation tools such as heat maps, hierarchical clustering diagrams and graphing tools could greatly aid the analysis and interpretation of in silico ADME results. Likewise, the spatio-temporal mapping of PBPK simulations or even spatio-temporal mapping of predicted/measured ADME parameters onto high quality ‘virtual human models’[54] could also facilitate the interpretation and investigation of ADME data.

ADME prediction and modelling is already being used to reduce late-stage attrition in the drug discovery programmes of a number of major pharmaceutical companies.[14,36,49] Given the interest and investment by today’s pharmaceutical industries in predictive ADME, it is likely that in silico ADME will become more and more integrated into other aspects of drug discovery and development, including high-throughput screening, formulation, preclinical animal studies and phase I-II clinical trials. However, it is important to remember that computational predictions cannot serve as complete surrogates to real experimental results. So while computational ADME may be growing in popularity, its ascendancy will only lead to a greater emphasis on obtaining faster, better and cheaper data from experimental ADME. In other words, both experimental ADME and computational ADME must be seen as inseparable partners on the road to better and safer drugs.

Acknowledgements

The author acknowledges Genome Canada, Genome Alberta and the National Institute for Nanotechnology (NRC) for their financial support in the preparation of this review. The author has no conflicts of interest that are directly relevant to the content of this review.

References

1. DiMasi JA, Hansen RW, Grabowski HG. The price of innovation: new estimates of drug development costs. J Health Econ 2003; 22: 151-85
2. Bains W. Failure rates in drug discovery and development: will we ever get any better? Drug Disc World 2004; Fall: 9-17
3. Horton R. Vioxx, the implosion of Merck and aftershocks at the FDA. Lancet 2004; 364: 1995-6
4. Lang L. Valdecoxib (Bextra) withdrawal leaves pain relief treatment gap. Gastroenterology 2005; 128: 1769-70
5. Lazarou J, Pomeranz BH, Corey PN. Incidence of adverse drug reactions in hospitalized patients: a meta-analysis of prospective studies. JAMA 1998; 279: 1200-5
6. Glick M, Klon AE, Acklin P, et al. Enriching extremely noisy high-throughput screening data using a naive Bayes classifier. J Biomol Screening 2004; 9: 32-6
7. Rogers D, Brown RD, Hahn M. Using extended-connectivity fingerprints with Laplacian-modified Bayesian analysis in high-throughput screening follow-up. J Biomol Screening 2005; 10: 682-6
8. Klon AE, Glick M, Thoma M, et al. Finding more needles in the haystack: a simple and efficient method for improving high-throughput docking results. J Med Chem 2004; 47: 2743-9
9. Li AP. Screening for human ADME/Tox drug properties in drug discovery. Drug Discov Today 2001; 6: 357-66
10. Dalvie D. Recent advances in the applications of radioisotopes in drug metabolism, toxicology and pharmacokinetics. Curr Pharm Des 2000; 6: 1009-28
11. Marathe PH, Shyu WC, Humphreys WG. The use of radiolabeled compounds for ADME studies in discovery and exploratory development. Curr Pharm Des 2004; 10: 2991-3008
12. Chu I, Nomeir AA. Utility of mass spectrometry for in-vitro ADME assays. Curr Drug Metab 2006; 7: 467-77
13. Nicholson JK, Connelly J, Lindon JC, et al. Metabonomics: a platform for studying drug toxicity and gene function. Nat Rev Drug Discov 2002; 1: 153-61
14. Van de Waterbeemd H. From in vivo to in vitro/in silico ADME: progress and challenges. Expert Opin Drug Metab Toxicol 2005; 1: 1-4
15. Jain KK. Applications of AmpliChip CYP450. Mol Diagn 2005; 9: 119-27
16. Garg P, Verma J. In silico prediction of blood brain barrier permeability: an artificial neural network model. J Chem Inf Model 2006; 46: 289-97
17. Zhao YH, Le J, Abraham MH, et al. Evaluation of human intestinal absorption data and subsequent derivation of a quantitative structure-activity relationship (QSAR) with the Abraham descriptors. J Pharm Sci 2001; 90 (6): 749-84
18. Iyer M, Tseng YJ, Senese CL, et al. Prediction and mechanistic interpretation of human oral drug absorption using MI-QSAR analysis. Mol Pharmaceutics 2007; 4: 218-31
19. Lombardo F, Obach RS, Shalaeva MY, et al. Prediction of human volume of distribution values for neutral and basic drugs: 2. Extended data set and leave-class-out statistics. J Med Chem 2004; 47 (5): 1242-50
20. Klon AE, Lowrie JF, Diller DJ. Improved naive Bayesian modeling of numerical data for absorption, distribution, metabolism and excretion (ADME) property prediction. J Chem Inf Model 2006; 46: 1945-56
21. Chiou WL. The rate and extent of oral bioavailability versus the rate and extent of oral absorption: clarification and recommendation of terminology. J Pharmacokinet Pharmacodyn 2001; 28: 3-6
22. Tetko IV. The WWW as a tool to obtain molecular parameters. Mini Rev Med Chem 2003; 3: 809-20
23. Selassie CD, Mekapati SB, Verma RP. QSAR: then and now. Curr Top Med Chem 2002; 2: 1357-79
24. Weininger D. SMILES, a chemical language and information system: 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 1988; 28: 31-6
25. Miteva MA, Violas S, Montes M, et al. FAF-Drugs: free ADME/tox filtering of compound collections. Nucleic Acids Res 2006; 34: W738-44
26. Geldenhuys WJ, Gaasch KE, Watson M, et al. Optimizing the use of open-source software applications in drug discovery. Drug Discov Today 2006; 11: 127-32
27. Lipinski CA. Drug-like properties and the causes of poor solubility and poor permeability. J Pharmacol Toxicol Methods 2000; 44: 235-49
28. Andrews CW, Bennett L, Yu LX. Predicting human oral bioavailability of a compound: development of a novel quantitative structure-bioavailability relationship. Pharm Res 2000; 17: 639-44
29. Yoshida F, Topliss JG. QSAR model for drug human oral bioavailability. J Med Chem 2000; 43: 2575-85
30. Hou T, Wang J, Zhang W, et al. ADME evaluation in drug discovery: 6. Can oral bioavailability in humans be effectively predicted by simple property-based rules? J Chem Inf Model 2007; 47: 460-3
31. Mitchell T. Machine learning. New York: McGraw Hill, 1997
32. Cruz JA, Wishart DS. Applications of machine learning in cancer prediction and prognosis. Cancer Informatics 2006; 2: 59-67
33. Rodvold DM, McLeod DG, Brandt JM, et al. Introduction to artificial neural networks for physicians: taking the lid off the black box. Prostate 2001; 46: 39-44
34. Hou T, Wang J, Zhang W. Recent advances in computational prediction of drug absorption and permeability in drug discovery. Curr Medicinal Chem 2006; 13: 2653-67
35. Stouch TR, Kenyon JR, Johnson SR, et al. In silico ADME/Tox: why models fail. J Comput Aided Mol Des 2003; 17: 83-92
36. Lombardo F, Gifford E, Shalaeva MY. In silico ADME prediction: data, models, facts and myths. Mini Rev Med Chem 2003; 3: 861-75
37. Filliponi E, Cruciani G, Tabarrini O, et al. QSAR study and Volsurf characterization of anti-HIV quinolone library. J Comput Aided Mol Des 2001; 15: 203-7
38. Liao C, Liu B, Shi L, et al. Construction of a virtual combinatorial library using SMILES strings to discover potential structure-diverse PPAR modulators. Eur J Med Chem 2005; 40: 632-40
39. Ulven T, Receveur JM, Grimstrup M, et al. Novel selective orally active CRTH2 antagonists for allergic inflammation developed from in silico derived hits. J Med Chem 2006; 49: 6638-41
40. Samiulla DS, Vaidyanathan VV, Arun PC, et al. Rational selection of structurally diverse natural product scaffolds with favorable ADME properties for drug discovery. Mol Divers 2005; 9: 131-9
41. Baranczewski P, Stanczak A, Sundberg K, et al. Introduction to in vitro estimation of metabolic stability and drug interactions of new chemical entities in drug discovery and development. Pharmacol Rep 2006; 58: 453-72
42. Yap CW, Xue Y, Li ZR, et al. Application of support vector machines to in silico prediction of cytochrome p450 enzyme substrates and inhibitors. Curr Top Med Chem 2006; 6: 1593-607
43. Fox T, Kriegl JM. Machine learning techniques for in silico modeling of drug metabolism. Curr Top Med Chem 2006; 6: 1579-91
44. Pelkonen O, Raunio H. In vitro screening of drug metabolism during drug development: can we trust the predictions? Expert Opin Drug Metab Toxicol 2005; 1: 49-59
45. Gerlowski LE, Jain PK. Physiologically based pharmacokinetic modelling: principles and applications. J Pharm Sci 1983; 72: 1103-127
46. Theil FP, Guentert TW, Haddad S, et al. Utility of physiologically based pharmacokinetic models to drug development and rational drug discovery candidate selection. Toxicol Lett 2003; 138: 29-49
47. Norris DA, Leesman GD, Sinko PJ, et al. Development of predictive pharmacokinetic simulation models for drug discovery. J Control Release 2000; 65: 55-62
48. Agoram B, Woltosz WS, Bolger MB. Predicting the impact of physiological and biochemical processes on oral drug bioavailability. Adv Drug Deliv Rev 2001; 50 Suppl. 1: S41-67
49. Kuentz M, Nick S, Parrott N, et al. A strategy for preclinical formulation development using GastroPlus as pharmacokinetic simulation tool and a statistical screening design applied to a dog study. Eur J Pharm Sci 2006; 27: 91-9
50. Qin LX, Beyer RP, Hudson FN, et al. Evaluation of methods for oligonucleotide array data via quantitative real-time PCR. BMC Bioinformatics 2006; 7: 23
51. Hou TJ, Xia K, Zhang W, et al. ADME evaluation in drug discovery: 4. Prediction of aqueous solubility based on atom contribution approach. J Chem Inf Comp Sci 2004; 44: 266-75
52. Green DV. Virtual screening of virtual libraries. Prog Med Chem 2003; 41: 61-97
53. Stoner CL, Gifford E, Stankovic C, et al. Implementation of an ADME enabling selection and visualization tool for drug discovery. J Pharm Sci 2004; 93: 1131-41
54. Turinsky A, Sensen CW. On the way to building an integrated computational environment for the study of developmental patterns and genetic diseases. Int J Nanomed 2006; 1: 89-96

Correspondence: Dr David S. Wishart, 2-21 Athabasca Hall, University of Alberta, Edmonton, AB, Canada T6G 2E8.
E-mail: [email protected]