Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott

Page 1: Martin Ott

Bioinformatics IV

Quantitative Structure-Activity Relationships (QSAR)

and

Comparative Molecular Field Analysis (CoMFA)

Martin Ott

Page 2: Martin Ott

Outline

• Introduction
• Structures and activities
• Regression techniques: PCA, PLS
• Analysis techniques: Free-Wilson, Hansch
• Comparative Molecular Field Analysis

Page 3: Martin Ott

QSAR: The Setting

Quantitative structure-activity relationships are used when there is little or no receptor information, but there are measured activities of (many) compounds

They are also useful to supplement docking studies which take much more CPU time

Page 4: Martin Ott

From Structure to Property

[Figure: a set of structurally related compounds and a chart of their EC50 values]

Page 5: Martin Ott

From Structure to Property

[Figure: a set of structurally related compounds; property shown: LD50]

Page 6: Martin Ott

From Structure to Property

[Figure: a set of structurally related compounds]

Page 7: Martin Ott

QSAR: Which Relationship?

Quantitative structure-activity relationships correlate chemical/biological activities with structural features or atomic, group or molecular properties within a range of structurally similar compounds

Page 8: Martin Ott

Free Energy of Binding and Equilibrium Constants

The free energy of binding is related to the reaction constants of ligand-receptor complex formation:

$$\Delta G_{binding} = -2.303\,RT \log K = -2.303\,RT \log (k_{on} / k_{off})$$

Equilibrium constant $K$
Rate constants $k_{on}$ (association) and $k_{off}$ (dissociation)
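The relation above can be evaluated directly. The following is a minimal sketch, assuming SI units (R in J/(mol·K), T in kelvin), K taken relative to a 1 M standard state, and purely hypothetical rate constants; the function name is illustrative and not from the slides.

```python
import math

R = 8.314  # gas constant in J/(mol*K)

def binding_free_energy(k_on, k_off, temperature=298.15):
    """Return Delta G of binding in J/mol from the two rate constants."""
    K = k_on / k_off  # equilibrium (association) constant
    return -2.303 * R * temperature * math.log10(K)

# Hypothetical rate constants: k_on in 1/(M*s), k_off in 1/s
dg = binding_free_energy(k_on=1.0e6, k_off=1.0e-3)
print(f"{dg / 1000:.1f} kJ/mol")  # about -51.4 kJ/mol
```

A larger K (faster association or slower dissociation) gives a more negative free energy, i.e. tighter binding.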

Page 9: Martin Ott

Concentration as Activity Measure

• A critical molar concentration C that produces the biological effect is related to the equilibrium constant K

• Usually log (1/C) is used (cf. pH)

• For meaningful QSARs, activities need to be spread out over at least 3 log units
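As a small illustration of the log (1/C) scale, the sketch below converts the EC50 values of the capsaicin analog example later in these slides. It assumes those EC50 values are in micromolar, which is the unit that reproduces the tabulated log(1/EC50) column; the function name is illustrative.

```python
import math

def activity_from_concentration(c_molar):
    """Return log10(1/C) for a molar concentration C."""
    return math.log10(1.0 / c_molar)

# EC50 values from the capsaicin analog example, assumed to be in micromolar
ec50_um = {"H": 11.80, "Cl": 1.24, "NO2": 4.58, "CN": 26.50,
           "C6H5": 0.24, "NMe2": 4.39, "I": 0.35}

activities = {x: activity_from_concentration(c * 1e-6) for x, c in ec50_um.items()}
spread = max(activities.values()) - min(activities.values())

for x, a in activities.items():
    print(f"{x:5s} {a:.2f}")
print(f"spread: {spread:.2f} log units")
```

The spread printed at the end is the quantity the last bullet refers to; for this particular set it comes out at roughly two log units.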

Page 10: Martin Ott

Molecules Are Not Numbers!

[Figure: a chemical structure shown next to an assortment of unrelated numbers]

Where are the numbers? Numerical descriptors

Page 11: Martin Ott

An Example: Capsaicin Analogs

X       EC50 (μM)   log(1/EC50)
H       11.80       4.93
Cl       1.24       5.91
NO2      4.58       5.34
CN      26.50       4.58
C6H5     0.24       6.62
NMe2     4.39       5.36
I        0.35       6.46
NHCHO    ?          ?

[Figure: capsaicin analog scaffold with variable substituent X]

Page 12: Martin Ott

An Example: Capsaicin Analogs

X       log(1/EC50)   MR      π       σ       Es
H       4.93           1.03    0.00    0.00    0.00
Cl      5.91           6.03    0.71    0.23   -0.97
NO2     5.34           7.36   -0.28    0.78   -2.52
CN      4.58           6.33   -0.57    0.66   -0.51
C6H5    6.62          25.36    1.96   -0.01   -3.82
NMe2    5.36          15.55    0.18   -0.83   -2.90
I       6.46          13.94    1.12    0.18   -1.40
NHCHO   ?             10.31   -0.98    0.00   -0.98

MR = molar refractivity (polarizability) parameter; π = hydrophobicity parameter; σ = electronic sigma constant (para position); Es = Taft size parameter

Page 13: Martin Ott

An Example: Capsaicin Analogs

[Figure: capsaicin analog scaffold with variable substituent X]

$$\log(1/EC_{50}) = -0.89 + 0.019\,MR + 0.23\,\pi - 0.31\,\sigma - 0.14\,E_s$$
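An equation of this form can be obtained by ordinary least squares on the seven measured analogs. The sketch below refits the model from the descriptor table and predicts the unmeasured NHCHO analog. It is only an illustration: the solver and function names are made up here, the slides do not say how the printed coefficients were derived, and with five parameters for seven compounds the fit is heavily over-parameterised. Applied directly to X = H, the printed coefficients give about -0.87 rather than the tabulated 4.93, so the fitted coefficients need not match them.

```python
# Descriptor table from the slides: X -> (log(1/EC50), MR, pi, sigma, Es)
data = {
    "H":    (4.93,  1.03,  0.00,  0.00,  0.00),
    "Cl":   (5.91,  6.03,  0.71,  0.23, -0.97),
    "NO2":  (5.34,  7.36, -0.28,  0.78, -2.52),
    "CN":   (4.58,  6.33, -0.57,  0.66, -0.51),
    "C6H5": (6.62, 25.36,  1.96, -0.01, -3.82),
    "NMe2": (5.36, 15.55,  0.18, -0.83, -2.90),
    "I":    (6.46, 13.94,  1.12,  0.18, -1.40),
}
nhcho = (10.31, -0.98, 0.00, -0.98)  # descriptors of the unmeasured analog

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear(rows, ys):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    xs = [[1.0] + list(r) for r in rows]
    p = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(p)]
    return solve(xtx, xty)

def predict(coef, row):
    """Evaluate intercept + sum of coefficient * descriptor."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

ys = [v[0] for v in data.values()]
rows = [list(v[1:]) for v in data.values()]
coef = fit_linear(rows, ys)
print("coefficients (intercept, MR, pi, sigma, Es):", [round(c, 3) for c in coef])
print("predicted log(1/EC50) for NHCHO:", round(predict(coef, nhcho), 2))
```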

Page 14: Martin Ott

Basic Assumption in QSAR

The structural properties of a compound contribute in a linearly additive way to its biological activity, provided there are no non-linear dependencies of transport or binding on some properties

Page 15: Martin Ott

Molecular Descriptors

• Simple counts of features, e.g. of atoms, rings, H-bond donors, molecular weight

• Physicochemical properties, e.g. polarisability, hydrophobicity (logP), water-solubility

• Group properties, e.g. Hammett and Taft constants, volume

• 2D Fingerprints based on fragments

• 3D Screens based on fragments

Page 16: Martin Ott

2D Fingerprints

[Figure: structure of the bromo-substituted capsaicin analog]

C   N   O   P   S   X   F   Cl  Br  I   Ph  CO  NH  OH  Me  Et  Py  CHO SO  C=C C≡C C=N Am  Im
1   1   1   0   0   1   0   0   1   0   1   1   1   1   1   0   0   0   0   1   0   0   1   0
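A fingerprint of this kind is simply one bit per fragment key. The sketch below rebuilds the bit row above from the set of keys that are switched on; the key order is copied from the slide, while the function name and the decision to reject unknown keys are assumptions of this example. Detecting the fragments in a real structure would need a cheminformatics toolkit and is not attempted here.

```python
# Fragment keys in the order shown on the slide
KEYS = ["C", "N", "O", "P", "S", "X", "F", "Cl", "Br", "I", "Ph", "CO",
        "NH", "OH", "Me", "Et", "Py", "CHO", "SO", "C=C", "C≡C", "C=N",
        "Am", "Im"]

def fingerprint(present):
    """Return a list of 0/1 bits, one per key, for a set of present fragments."""
    unknown = set(present) - set(KEYS)
    if unknown:
        raise ValueError(f"unknown fragment keys: {sorted(unknown)}")
    return [1 if key in present else 0 for key in KEYS]

# Keys that are set for the bromo analog, read off the bit row on the slide
bromo_analog = {"C", "N", "O", "X", "Br", "Ph", "CO", "NH", "OH", "Me", "C=C", "Am"}

bits = fingerprint(bromo_analog)
print("".join(map(str, bits)))  # 111001001011111000010010
```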

Page 17: Martin Ott

Principal Component Analysis (PCA)

• Many (>3) variables to describe objects = high dimensionality of descriptor data

• PCA is used to reduce dimensionality

• PCA extracts the most important factors (principal components or PCs) from the data

• Useful when correlations exist between descriptors

• The result is a new, small set of variables (PCs) which explain most of the data variation

Page 18: Martin Ott

PCA – From 2D to 1D

Page 19: Martin Ott

PCA – From 3D to 3D-

Page 20: Martin Ott

Different Views on PCA

• Statistically, PCA is a multivariate analysis technique closely related to eigenvector analysis

• In matrix terms, PCA is a decomposition of matrix X into two smaller matrices plus a set of residuals: $X = TP^{T} + R$

• Geometrically, PCA is a projection technique in which X is projected onto a subspace of reduced dimensions
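The three views can be seen together in the smallest possible case, going from 2D to 1D. The sketch below mean-centers a set of 2D points, finds the direction of largest variance in closed form (possible because the covariance matrix is only 2×2), and splits each centered point into a score times the loading vector plus a residual, i.e. one column of T, one column of P, and R. The data points and the function name are hypothetical.

```python
import math

def pca_2d(points):
    """Project 2D points onto their first principal component.

    Returns (scores t, loading p, residuals R, explained fraction) so that each
    mean-centered point equals t_i * p + R_i.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # Angle of the axis of largest variance for a 2x2 covariance matrix
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    p = (math.cos(theta), math.sin(theta))
    t = [x * p[0] + y * p[1] for x, y in centered]
    residuals = [(x - ti * p[0], y - ti * p[1]) for (x, y), ti in zip(centered, t)]
    total = sum(x * x + y * y for x, y in centered)
    explained = sum(ti * ti for ti in t) / total
    return t, p, residuals, explained

# Hypothetical, strongly correlated pair of descriptors
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]
scores, loading, resid, frac = pca_2d(points)
print("loading:", tuple(round(v, 3) for v in loading))
print("fraction of variance explained by PC1:", round(frac, 3))
```

For more than two descriptors the same decomposition is normally computed with an eigenvector or singular value routine rather than a closed-form angle.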

Page 21: Martin Ott

Partial Least Squares (PLS)

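$$
\begin{aligned}
y_1 &= a_0 + a_1 x_{11} + a_2 x_{12} + a_3 x_{13} + \dots + e_1 && \text{(compound 1)} \\
y_2 &= a_0 + a_1 x_{21} + a_2 x_{22} + a_3 x_{23} + \dots + e_2 && \text{(compound 2)} \\
y_3 &= a_0 + a_1 x_{31} + a_2 x_{32} + a_3 x_{33} + \dots + e_3 && \text{(compound 3)} \\
&\;\;\vdots \\
y_n &= a_0 + a_1 x_{n1} + a_2 x_{n2} + a_3 x_{n3} + \dots + e_n && \text{(compound n)}
\end{aligned}
$$

$$Y = XA + E$$

X = independent variables
Y = dependent variables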
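The slides do not spell out the PLS algorithm itself. As a hedged illustration, the sketch below implements one common formulation for a single response (NIPALS-style PLS1): each component is the direction in descriptor space with the largest covariance with y, and both X and y are deflated after each component. All function names are made up for this example, and the data are the capsaicin descriptors from the earlier table.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model; returns what pls1_predict needs."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [yi - y_mean for yi in y]                               # centered y
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(wj * wj for wj in w))
        if norm < 1e-12:  # nothing left in y to explain
            break
        w = [wj / norm for wj in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(ti * ti for ti in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part described by this component
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict y for one descriptor vector x."""
    x_mean, y_mean, components = model
    e = [xj - x_mean[j] for j, xj in enumerate(x)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat

# Capsaicin analog descriptors (MR, pi, sigma, Es) and log(1/EC50) from the slides
X = [[1.03, 0.00, 0.00, 0.00], [6.03, 0.71, 0.23, -0.97],
     [7.36, -0.28, 0.78, -2.52], [6.33, -0.57, 0.66, -0.51],
     [25.36, 1.96, -0.01, -3.82], [15.55, 0.18, -0.83, -2.90],
     [13.94, 1.12, 0.18, -1.40]]
y = [4.93, 5.91, 5.34, 4.58, 6.62, 5.36, 6.46]

model = pls1_fit(X, y, n_components=2)
print("predicted log(1/EC50) for NHCHO:",
      round(pls1_predict(model, [10.31, -0.98, 0.00, -0.98]), 2))
```

In practice the descriptors are usually scaled before PLS and the number of components is chosen by cross-validation; neither step is shown here.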

Page 22: Martin Ott

PLS – Cross-validation

• Squared correlation coefficient R²
  • Value between 0 and 1 (> 0.9)
  • Indicating explanative power of regression equation

With cross-validation:

• Squared correlation coefficient Q²
  • Value of at most 1, which can become negative for poor models (> 0.5)
  • Indicating predictive power of regression equation
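The difference between the two coefficients can be made concrete with leave-one-out cross-validation. The sketch below uses a deliberately simple one-descriptor regression (log(1/EC50) against the hydrophobicity parameter of the capsaicin analogs) rather than PLS, so that the bookkeeping stays visible. It assumes the common definition Q² = 1 − PRESS / SS_total with the overall mean in SS_total; the function names are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def r2_and_q2(xs, ys):
    """Return (R2 of the full fit, leave-one-out Q2)."""
    n = len(ys)
    my = sum(ys) / n
    ss_tot = sum((y - my) ** 2 for y in ys)
    a, b = fit_line(xs, ys)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    press = 0.0
    for i in range(n):
        # Refit without compound i, then predict compound i
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a_i + b_i * xs[i])) ** 2
    return 1.0 - rss / ss_tot, 1.0 - press / ss_tot

# Hydrophobicity parameter and log(1/EC50) of the seven measured analogs
pi_values = [0.00, 0.71, -0.28, -0.57, 1.96, 0.18, 1.12]
log_activity = [4.93, 5.91, 5.34, 4.58, 6.62, 5.36, 6.46]

r2, q2 = r2_and_q2(pi_values, log_activity)
print(f"R2 = {r2:.2f}, Q2 = {q2:.2f}")
```

Because each left-out compound is predicted by a model that never saw it, Q² cannot exceed R² for this kind of fit.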

Page 23: Martin Ott

Free-Wilson Analysis

$$\log(1/C) = \sum_i a_i x_i + \mu$$

$x_i$: presence of group i (0 or 1)
$a_i$: activity contribution of group i
$\mu$: activity value of unsubstituted compound
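Evaluating a Free-Wilson model amounts to adding up the contributions of the groups that are present. The sketch below does this with indicator variables; the group names, contribution values and function name are all hypothetical and serve only to show the arithmetic.

```python
def free_wilson_activity(groups_present, contributions, mu):
    """Return log(1/C) = sum_i a_i * x_i + mu for one compound."""
    x = {group: 0 for group in contributions}  # indicator variables x_i
    for group in groups_present:
        if group not in contributions:
            # No contribution was fitted for this group, so no prediction
            raise KeyError(f"no contribution available for group {group!r}")
        x[group] = 1
    return mu + sum(contributions[g] * x[g] for g in contributions)

# Hypothetical group contributions a_i and unsubstituted activity mu
contributions = {"4-Cl": 0.45, "4-Me": 0.20, "3-OH": -0.30}
mu = 5.00

value = free_wilson_activity({"4-Cl", "3-OH"}, contributions, mu)
print(round(value, 2))  # 5.15
```

In a real analysis the contributions are obtained by regression over a matrix of such indicator variables, one row per compound.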

Page 24: Martin Ott

Free-Wilson Analysis

+ Computationally straightforward

– Predictions only for substituents already included

– Requires large number of compounds

Page 25: Martin Ott

Hansch Analysis

Drug transport and binding affinity depend nonlinearly on lipophilicity:

$$\log(1/C) = a (\log P)^2 + b \log P + c\,\sigma + k$$

P: n-octanol/water partition coefficient
σ: Hammett electronic parameter
a, b, c: regression coefficients
k: constant term
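The quadratic term is what lets the model describe an optimum lipophilicity. The sketch below evaluates the equation and locates that optimum at log P = −b / (2a), which exists only when a is negative. The coefficient values are hypothetical and the function names are not from the slides.

```python
def hansch_activity(log_p, sigma, a, b, c, k):
    """Return log(1/C) = a*(log P)^2 + b*log P + c*sigma + k."""
    return a * log_p ** 2 + b * log_p + c * sigma + k

def optimal_log_p(a, b):
    """Return the log P at which the parabola in log P peaks (needs a < 0)."""
    if a >= 0:
        raise ValueError("an optimum exists only for a negative quadratic coefficient")
    return -b / (2.0 * a)

# Hypothetical regression coefficients
a, b, c, k = -0.2, 1.0, 0.5, 3.0

best = optimal_log_p(a, b)
print("optimal log P:", best)
print("activity at optimum (sigma = 0):", hansch_activity(best, 0.0, a, b, c, k))
```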

Page 26: Martin Ott

Hansch Analysis

+ Fewer regression coefficients needed for correlation

+ Interpretation in physicochemical terms

+ Predictions for other substituents possible

Page 27: Martin Ott

Pharmacophore

• Set of structural features in a drug molecule recognized by a receptor

• Sample features: H-bond donor, charge, hydrophobic center

• Distances, 3D relationship
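The 3D relationship between features is often summarised by the distances between them. The sketch below computes all pairwise distances for a set of named feature points and compares two such sets within a tolerance. The coordinates, the tolerance value and the function names are hypothetical; real pharmacophore tools also handle conformational flexibility, which is ignored here.

```python
import math
from itertools import combinations

def feature_distances(features):
    """Pairwise distances between named feature points (name -> (x, y, z))."""
    return {(a, b): math.dist(features[a], features[b])
            for a, b in combinations(sorted(features), 2)}

def matches(query, candidate, tolerance=0.5):
    """True if every feature-pair distance agrees within the tolerance."""
    dq, dc = feature_distances(query), feature_distances(candidate)
    return dq.keys() == dc.keys() and all(
        abs(dq[key] - dc[key]) <= tolerance for key in dq)

# Hypothetical feature coordinates in angstrom
query = {"L": (0.0, 0.0, 0.0), "PD": (3.0, 4.0, 0.0), "D": (0.0, 0.0, 2.0)}
candidate = {"L": (1.0, 1.0, 1.0), "PD": (4.2, 5.0, 1.0), "D": (1.0, 1.0, 3.1)}

for pair, d in feature_distances(query).items():
    print(pair, round(d, 2))
print("candidate matches query:", matches(query, candidate))
```

Using distances rather than raw coordinates makes the comparison independent of how each molecule is positioned and rotated.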

Page 28: Martin Ott

Pharmacophore Selection

L = lipophilic site; A = H-bond acceptor; D = H-bond donor; PD = protonated H-bond donor

[Figure: dopamine pharmacophore with sites L, PD and D separated by distances d1, d2 and d3, shown on several ligand structures]

Page 29: Martin Ott

Pharmacophore Selection

L = lipophilic site; A = H-bond acceptor; D = H-bond donor; PD = protonated H-bond donor

[Figure: dopamine pharmacophore with sites L, PD and D separated by distances d1, d2 and d3, mapped onto further ligand structures]

Page 30: Martin Ott

Comparative Molecular Field Analysis (CoMFA)

• Set of chemically related compounds

• Common pharmacophore or substructure required

• 3D structures needed (e.g., Corina-generated)

• Flexible molecules are “folded” into pharmacophore constraints and aligned

Page 31: Martin Ott

CoMFA Alignment

[Figure: several molecules with lipophilic sites L and distances d1, d2 and d3, aligned onto a common "Pharmacophore"]

Page 32: Martin Ott

CoMFA Grid and Field Probe

(Only one molecule shown for clarity)

Page 33: Martin Ott

Electrostatic Potential Contour Lines

Page 34: Martin Ott

CoMFA Model Derivation

• Molecules are positioned in a regular grid according to alignment

• Probes are used to determine the molecular field:

Van der Waals field (probe is neutral carbon)

$$E_{vdw} = \sum (A_i r_{ij}^{-12} - B_i r_{ij}^{-6})$$

Electrostatic field (probe is charged atom)

$$E_c = \sum q_i q_j / (D r_{ij})$$
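The field calculation can be sketched as a loop over grid points, summing the two energy terms over all atoms of the aligned molecule. Everything numeric below is hypothetical: the atom positions, the A and B parameters, the partial charges and the grid extent. The short-distance clamp `min_r` is an assumption of this sketch to avoid division by zero and is not part of the slide; units are left arbitrary, as in the formulas.

```python
import math
from itertools import product

def grid_points(lo, hi, step):
    """Regular cubic grid from lo to hi (inclusive) along each axis."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    return list(product(axis, repeat=3))

def probe_fields(point, atoms, probe_charge=1.0, dielectric=1.0, min_r=0.5):
    """Return (E_vdw, E_c) for a probe placed at `point`.

    Each atom is (position, A, B, charge). min_r clamps very short distances
    so that a probe sitting on an atom does not divide by zero (assumption).
    """
    e_vdw = 0.0
    e_c = 0.0
    for position, a, b, charge in atoms:
        r = max(math.dist(point, position), min_r)
        e_vdw += a * r ** -12 - b * r ** -6
        e_c += probe_charge * charge / (dielectric * r)
    return e_vdw, e_c

# Hypothetical two-atom "molecule": made-up A, B and partial charges
atoms = [((0.0, 0.0, 0.0), 1.0e5, 1.0e2, -0.4),
         ((1.5, 0.0, 0.0), 1.0e5, 1.0e2, 0.4)]

grid = grid_points(-4.0, 4.0, 2.0)
fields = [probe_fields(point, atoms) for point in grid]
print(len(grid), "grid points")  # 125 grid points
```

Computed for every aligned molecule, the field values at the grid points form one row of descriptors per compound, which can then be related to the measured activities.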

Page 35: Martin Ott

3D Contour Map for Electronegativity

Page 36: Martin Ott

CoMFA Pros and Cons

+ Suitable to describe receptor-ligand interactions

+ 3D visualization of important features

+ Good correlation within related set

+ Predictive power within scanned space

– Alignment is often difficult

– Training required