Compound Set Enrichment: A novel approach to analysis of primary HTS data
Thibault Varin, Ansgar Schuffenhauer, Gubler, H., Parker, C., Zhang, JH., Raman, P., Ertl, P.


Page 1: Compound Set Enrichment

A novel approach to analysis of primary HTS data

Compound Set Enrichment

Thibault Varin Ansgar Schuffenhauer

Gubler, H., Parker, C., Zhang, JH., Raman, P., Ertl, P.

Page 2: Compound Set Enrichment

INTRODUCTION

| Compound Set Enrichment | Thibault Varin | 10/07/142

Compound Set Enrichment

Page 3: Compound Set Enrichment

Introduction

Active series identification: Can relevant SAR be extracted from primary HTS data?

Are activity data binary or continuous?


Page 4: Compound Set Enrichment

Introduction: Active series identification

Hypothesis 1: Within primary HTS data, structure-activity relationships (SAR) are apparent and can be used to help select active compound classes.

Page 5: Compound Set Enrichment

Introduction: Are the activity data binary or continuous?

[Figure: activity of the compounds belonging to Scaffold 1 and Scaffold 2. Legend: active compound (binary), inactive compound (binary).]

Binary activity:
- 1 active / 5 inactives
- Scaffold 1 = Scaffold 2

Continuous activity: Scaffold 1 > Scaffold 2
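The contrast on this slide can be sketched with a few lines of Python. The activity values and the cut-off of 50 below are made up for illustration and are not taken from the presentation; they only reproduce the "1 active / 5 inactives" situation for both scaffolds.

```python
from statistics import mean

# Hypothetical activity values (e.g. % inhibition); invented for illustration only.
scaffold_1 = [92, 48, 45, 41, 38, 35]
scaffold_2 = [92, 8, 5, 3, 2, 1]
THRESHOLD = 50  # assumed activity cut-off


def count_actives(values, threshold):
    """Number of compounds called active at the given cut-off."""
    return sum(v >= threshold for v in values)


# Binary view: both scaffolds have 1 active and 5 inactives.
binary_1 = count_actives(scaffold_1, THRESHOLD)
binary_2 = count_actives(scaffold_2, THRESHOLD)

# Continuous view: the full distributions differ clearly.
mean_1 = mean(scaffold_1)
mean_2 = mean(scaffold_2)

print(f"actives: {binary_1} vs {binary_2}")
print(f"mean activity: {mean_1:.1f} vs {mean_2:.1f}")
```

With these invented numbers the binary counts are identical, while the continuous values separate the two scaffolds.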

Page 6: Compound Set Enrichment

Introduction: Are the activity data binary or continuous?

[Figure: the same activity data shown with Threshold 1 and with Threshold 2. Legend: active compound (binary), inactive compound (binary).]

Binary scaffold activity differs according to the threshold.

Hypothesis 2: Methods based on an activity cut-off distort the activity information, leading to the incorrect assignment of active series of compounds.
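A minimal sketch of the threshold effect, assuming two hypothetical scaffolds with invented activity values and two arbitrary cut-offs (50 and 80). None of these numbers come from the presentation.

```python
# Hypothetical activity values for two scaffolds; invented for illustration only.
scaffold_a = [55, 52, 51, 10, 5, 2]
scaffold_b = [95, 90, 20, 15, 10, 5]


def active_fraction(values, threshold):
    """Fraction of compounds called active at the given cut-off."""
    return sum(v >= threshold for v in values) / len(values)


# The scaffold that looks "more active" depends on the cut-off that is chosen.
for threshold in (50, 80):
    frac_a = active_fraction(scaffold_a, threshold)
    frac_b = active_fraction(scaffold_b, threshold)
    print(f"threshold={threshold}: A={frac_a:.2f}, B={frac_b:.2f}")
```

At the lower cut-off scaffold A has the larger active fraction; at the higher cut-off scaffold B does, although the underlying data are unchanged.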


Page 7: Compound Set Enrichment

METHODS


Compound Set Enrichment

Page 8: Compound Set Enrichment

The Scaffold Tree – Visualization of the Scaffold Universe by Hierarchical Scaffold Classification A. Schuffenhauer, P. Ertl et al. J. Chem. Inf. Model., 47, 47, 2007

Methods: The Scaffold Tree classification

Page 9: Compound Set Enrichment

Methods: Datasets

PubChem Annotation from CRC

Simulation of the primary screening data

Hypothesis 1

Page 10: Compound Set Enrichment

Methods: Single hypothesis test (summary procedure)

1. State the null and the alternative hypotheses

- H0: "the scaffold is inactive"

- H1: "the scaffold is active"

2. Specify a significance level: α = 0.01

3. Compute the test statistic and the p-value → p-value = probability of obtaining a result at least as extreme as the one observed if the scaffold is inactive (H0 is true)

4. Decision step:

- p-value > α: H0 is not rejected

- p-value < α: H0 is rejected and H1 is accepted: "The scaffold is active"

Page 11: Compound Set Enrichment

Methods: The KS and the Binomial hypothesis tests

[Figure legend: Actives, Inactives; Bioassay, Scaffold]

Continuous data: KS test

H0: there is no difference between the activity distribution of the compounds having the scaffold S3-2 and the background distribution.

Binary data: Binomial test

H0: there is no difference between the proportion of active compounds among the compounds having the scaffold S3-2 and the proportion of active compounds in the full dataset.
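The two tests can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the implementation used for the presentation: both tests are taken to be one-sided (the slides do not say), higher activity values are assumed to mean "more active", the KS p-value uses the standard large-sample approximation exp(-2·nm/(n+m)·D²), and the function names and example data are hypothetical.

```python
import math
from bisect import bisect_right


def ks_one_sided(scaffold, background):
    """One-sided two-sample KS test (assumption: higher value = more active).

    Returns (D+, approximate p-value), where D+ is the largest amount by
    which the background empirical CDF exceeds the scaffold empirical CDF.
    """
    s, b = sorted(scaffold), sorted(background)
    n, m = len(s), len(b)
    d_plus = 0.0
    # Empirical CDFs only change at observed values, so checking those suffices.
    for x in set(s) | set(b):
        gap = bisect_right(b, x) / m - bisect_right(s, x) / n
        d_plus = max(d_plus, gap)
    # Large-sample approximation of the one-sided p-value.
    p_value = math.exp(-2.0 * n * m / (n + m) * d_plus ** 2)
    return d_plus, p_value


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k actives among
    n scaffold compounds if the scaffold had the dataset-wide active rate p."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


# Hypothetical data: a background of 50 compounds and a 5-compound scaffold.
background = list(range(50))
scaffold = [80, 85, 90, 95, 100]
d_plus, ks_p = ks_one_sided(scaffold, background)
print(f"KS: D+={d_plus:.2f}, p={ks_p:.2e}")

# Hypothetical counts: 4 actives among 6 scaffold compounds, 5% actives overall.
print(f"Binomial: p={binomial_upper_tail(4, 6, 0.05):.2e}")
```

The KS variant uses every measured value of the scaffold, whereas the binomial variant only sees the active/inactive counts produced by a cut-off.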

Page 12: Compound Set Enrichment

Methods: Multiple hypothesis tests (Bonferroni correction)

Problem of false positives

• α = probability of identifying an inactive scaffold as active (for each test performed)

• With 100 inactive scaffolds, the probability of identifying at least one "active" by chance is 63% (1 - 0.99^100)

Bonferroni correction: test each scaffold at a corrected significance level of α = 0.01 / number of scaffolds

Makes the assumption that the individual tests are independent

Each level of the Scaffold Tree was tested separately
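The arithmetic behind the 63% figure and the corrected significance level can be checked with a short sketch. The function names and the example p-values are hypothetical; the family-wise error formula assumes independent tests, as stated on the slide.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def significant_scaffolds(p_values, alpha=0.01):
    """Names of scaffolds whose p-value passes the Bonferroni-corrected level.

    p_values: dict mapping scaffold name -> p-value, for one tree level.
    """
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [name for name, p in p_values.items() if p < threshold]


# 100 inactive scaffolds tested at alpha = 0.01 without correction.
print(f"uncorrected: {familywise_error_rate(0.01, 100):.3f}")

# The same 100 tests at the corrected level 0.01 / 100.
corrected = bonferroni_alpha(0.01, 100)
print(f"corrected:   {familywise_error_rate(corrected, 100):.4f}")

# Hypothetical p-values for three scaffolds on one level of the tree.
print(significant_scaffolds({"S1": 1e-6, "S2": 0.004, "S3": 0.5}))
```

Applying the correction per tree level means `len(p_values)` is the number of scaffolds on that level only, which matches testing each level separately.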


Page 13: Compound Set Enrichment

Methods: Determining the activity of classes

Hypothesis 1

Hypothesis 2

Scaffold activity evaluation

Comparison of results

Multiple hypothesis test correction (Bonferroni)

Page 14: Compound Set Enrichment

RESULTS


Compound Set Enrichment

Page 15: Compound Set Enrichment

Results: Comparison of KSP and BTP predictions

| Bioassay | Total: KSP | Total: BTP | Total: Δ | BPCA significantly actives: BPCA | BPCA significantly actives: KSP | BPCA significantly actives: BTP | BPCA significantly actives: Δ | BPCA non significantly actives: KSP | BPCA non significantly actives: BTP | BPCA non significantly actives: Δ |
|---|---|---|---|---|---|---|---|---|---|---|
| Hydroxysteroid dehydrogenase | 330 | 231 | +99 | 199 | 183 | 168 | +15 | 147 | 63 | +84 |
| Caspase-1 | 331 | 114 | +217 | 5 | 2 | 2 | 0 | 329 | 112 | +217 |
| PK | 12 | 4 | +8 | 12 | 3 | 3 | 0 | 9 | 1 | +8 |
| Luciferase | 67 | 12 | +55 | 15 | 13 | 11 | +2 | 54 | 1 | +53 |
| Luciferase | 178 | 48 | +130 | 41 | 32 | 35 | -3 | 146 | 13 | +133 |
| CYP450 2C9 | 58 | 33 | +25 | 34 | 34 | 31 | +3 | 24 | 2 | +22 |
| CYP450 3A4 | 121 | 64 | +57 | 60 | 60 | 53 | +7 | 61 | 11 | +50 |

With:
- KSP: KS Prediction
- BTP: Binomial Threshold Prediction
- Δ: KSP - BTP
- BPCA: Binomial PubChem Annotation

Both KSP and BTP retrieve BPCA significantly active classes

Number of active classes: KSP > BTP

Most of the new KSP active classes are not BPCA significantly active

Page 16: Compound Set Enrichment

Results: KSP significantly active scaffolds that are inactive in PubChem

[Figure: scaffold structures annotated "Inconclusives?". Legend, compound activity (PubChem Annotation): Active, Inconclusive, Inactive, WA.]

Page 17: Compound Set Enrichment

Results: Prioritize nodes instead of individual scaffolds

[Figure legend, scaffold activity (KS Prediction / Bonferroni): not significantly active, significantly active.]

Page 18: Compound Set Enrichment

Results: Visualization tool (Peter Ertl)

Page 19: Compound Set Enrichment

CONCLUSION


Compound Set Enrichment

Page 20: Compound Set Enrichment

Conclusion: Compound Set Enrichment

Validation of initial hypotheses

A method to mine HTS data and identify active series of compounds

• Chemical classification: Scaffold Tree

• Statistical analysis: Kolmogorov-Smirnov hypothesis test

• Multiple hypothesis test correction: Bonferroni correction

Use all primary data

No activity cut-off

Identification of new active scaffolds not necessarily represented by very active compounds (latent hits) during the primary screen

Page 21: Compound Set Enrichment

With many thanks to

Acknowledgments

Primary mentor: Ansgar Schuffenhauer

Scientific advisers:
- Christian Parker
- Hanspeter Gubler
- Ji-Hu Zhang
- Peter Ertl
- Edgar Jacoby

Help: MLI group

Fellowship: Education office

Discussions:
- Martin Beibel
- Sebastian Bergling
- Meir Glick
- Alain Dietrich
- Marie-Cecile Didiot

Page 22: Compound Set Enrichment

Questions?
