Shape and Color Clustering with SAESAR Norah E. MacCuish, John D. MacCuish, and Mitch Chapman Mesa Analytics & Computing, Inc.


Shape and Color Clustering with SAESAR

Norah E. MacCuish, John D. MacCuish, and

Mitch Chapman

Mesa Analytics & Computing, Inc.


ABSTRACT

SAESAR identifies potentially interesting patterns in shape and color space for leads from HTS screening data. Analyses of several public datasets are described, along with a successful analysis in an industrial setting.


SAESAR Features

• Data exploration, unsupervised and supervised learning with shape, electrostatics, and 2D structure and properties
• Powerful OpenEye Scientific Software and Mesa Analytics & Computing tools with visualization and 2D and 3D depictions
• Clustering
  • Taylor’s (symmetric, asymmetric, non-disjoint, disjoint versions)
  • Hierarchical (RNN implementations of Ward’s, Complete Link, Group Average)
• Conformer Generation
  • OMEGA
  • User supplied
• Modeling - Model Builder
  • Classification
  • Linear and quadratic discrimination
  • KNN
• Example Tasks
  • Find Key Shapes
  • Find Key Structures
  • Find Key Color Groups
  • Generate Predictive Model with Shape, Electrostatics, Color, 2D Structure, other variables
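The slide names Taylor's clustering but does not show how it works. The following is a minimal sketch of the disjoint, symmetric variant (often called Taylor-Butina exclusion-region clustering), assuming fingerprints are given as sets of "on" key indices and that Tanimoto similarity is the measure. The function names `tanimoto` and `taylor_cluster`, the tie-breaking rule, and the threshold are illustrative assumptions, not SAESAR's actual implementation.

```python
from typing import Dict, FrozenSet, List, Set


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity between two sets of 'on' key indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def taylor_cluster(fps: List[FrozenSet[int]], threshold: float) -> List[List[int]]:
    """Disjoint, symmetric exclusion-region clustering (illustrative sketch).

    Returns clusters as lists of indices into fps; the first index in each
    list is the cluster's representative.
    """
    n = len(fps)
    # Near-neighbor table: j is a neighbor of i when similarity >= threshold.
    neighbors: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if tanimoto(fps[i], fps[j]) >= threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)

    unassigned = set(range(n))
    clusters: List[List[int]] = []
    while unassigned:
        # Pick the unassigned item with the most unassigned neighbors.
        # Ties go to the lowest index (an assumption made for determinism).
        rep = min(unassigned, key=lambda i: (-len(neighbors[i] & unassigned), i))
        members = sorted(neighbors[rep] & unassigned)
        clusters.append([rep] + members)
        # Exclude the representative and its neighbors from later rounds.
        unassigned -= {rep, *members}
    return clusters


if __name__ == "__main__":
    # Hypothetical toy fingerprints, not data from the presentation.
    toy = [frozenset({1, 2, 3}), frozenset({1, 2, 3, 4}), frozenset({10, 11})]
    print(taylor_cluster(toy, threshold=0.6))
```

Items with no neighbors above the threshold come out as singleton clusters. The non-disjoint and asymmetric versions listed on the slide would need a different assignment step and a different similarity measure, which this sketch does not cover.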


SAESAR - 2D & 3D Clustering on Shape and Pharmacophore Features

• 2D Descriptors
  • MACCS ‘drug like’ keys and public keys from PubChem, 768 key fingerprints*
• 3D Descriptors
  • OEShape - volume overlap
  • OEColor - hydrogen-bond donors, hydrogen-bond acceptors, hydrophobes, anions, cations, and rings; can be user defined

*New key-based molecular fingerprinter for visualization and data analysis in compound clustering, similarity searching, and substructure commonality analysis, N. MacCuish, J. D. MacCuish, 233rd ACS, Chicago, March 25-29, 2007.


Mining Primary Screening Data

• Three primary screens - JNK3, Rock2, FAK
• Cluster hits in 3D shape (full, subshape)
• Cluster in 3D color
• Identify ‘Key shape’ clusters
• Identify ‘Key color’ clusters
• Validate with secondary screening data


Datasets

Dataset   Screen      Structures   Conformers
JNK3      Primary     366          1724
JNK3      Secondary   57           256
FAK       Primary     756          3264
FAK       Secondary   189          434
Rock-2    Primary     212          869
Rock-2    Secondary   67           273


Results Summary

Dataset   Descriptor   Secondary Screen Matches   Expected if Random   Significant
JNK3      Shape        21                         14                   Yes
JNK3      Color        23                         12                   Yes
FAK       Shape        24                         11                   Yes
FAK       Color        42                         44                   No
Rock-2    Shape        47                         41                   Marginal
Rock-2    Color        31                         32                   No
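The slides do not say which test produced the "Significant" column. As one way to read the numbers, the sketch below treats the number of secondary-screen matches as a Poisson count whose mean is the "expected if random" value and computes the upper-tail probability of seeing at least the observed count. The Poisson model and the function name `poisson_upper_tail` are assumptions made for illustration; they are not the authors' stated method, and the resulting p-values need not reproduce the slide's yes/no calls.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected). Assumed model, see lead-in."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    # Sum the probability mass strictly below `observed`, in log space.
    below = sum(
        math.exp(-expected + k * math.log(expected) - math.lgamma(k + 1))
        for k in range(observed)
    )
    return max(0.0, 1.0 - below)


# (dataset, descriptor, secondary-screen matches, expected if random),
# copied from the Results Summary slide.
RESULTS = [
    ("JNK3", "Shape", 21, 14),
    ("JNK3", "Color", 23, 12),
    ("FAK", "Shape", 24, 11),
    ("FAK", "Color", 42, 44),
    ("Rock-2", "Shape", 47, 41),
    ("Rock-2", "Color", 31, 32),
]

if __name__ == "__main__":
    for dataset, descriptor, matches, expected in RESULTS:
        p = poisson_upper_tail(matches, expected)
        print(f"{dataset:7s} {descriptor:6s} matches={matches:3d} "
              f"expected={expected:3d} p={p:.4f}")
```

A small tail probability means the clusters captured more confirmed hits than chance alone would be likely to give. A hypergeometric or permutation test would be the more careful choice if the total numbers of compounds and cluster members were available.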


JNK3 ‘Key Shapes’


JNK3 Color & Shape: 8 matches


JNK3 Color & Shape Common Hits

Secondary screening hits which group by both shape and color


X-ray Structures and ‘Key Shapes’

• Rock2 (2H9V): matches 1st key shape
• FAK (2ETM): matches 1st key shape
• JNK3 (2EXC): sub-shape match


Lead Hopping For SIRT1 Activators*

SIRT1 Actives and Not Actives: Input to SAESAR

3D ‘Key Shape’ Query

Potential leads are in a different 2D space from, but a similar 3D space to, the active SIRT1 compounds

Available Compounds

*See J. Bemis, Bioorganic Gordon Research Conference, June 2008.


Lead Hopping For SIRT1 Activators

• SAESAR was used to identify key shapes that encapsulate 3D shape features of SIRT1 active compounds
• Key shapes were used as queries in a virtual screen against a 3D database of available compounds
• Sets of hits were identified:
  • 20 compounds had the highest overall shape-matching Tanimoto scores
  • 47 compounds had shape Tanimoto scores > 0.6
  • 172 compounds had Tversky scores > 0.8
• Compounds were ordered and screened in the SIRT1 assay:
  • one novel scaffold was identified with low micromolar activity
  • optimization lowered SIRT1 activation potency
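The hit sets above are defined by Tanimoto and Tversky cutoffs on shape overlap. The sketch below shows how the two scores can be computed from overlap volumes, assuming the self-overlaps of the query (`o_aa`) and the candidate (`o_bb`) and their mutual overlap (`o_ab`) are already available from a shape-alignment tool. The function names and the Tversky weights `alpha=1.0, beta=0.0` are illustrative assumptions; the slides do not state which weights were used.

```python
def shape_tanimoto(o_aa: float, o_bb: float, o_ab: float) -> float:
    """Tanimoto score from self-overlaps o_aa, o_bb and mutual overlap o_ab."""
    if o_aa <= 0 or o_bb <= 0:
        raise ValueError("self-overlap volumes must be positive")
    return o_ab / (o_aa + o_bb - o_ab)


def shape_tversky(o_aa: float, o_bb: float, o_ab: float,
                  alpha: float = 1.0, beta: float = 0.0) -> float:
    """Tversky score in its set form; the weights here are assumed defaults.

    With alpha=1 and beta=0 the score is o_ab / o_aa, the fraction of the
    query shape covered by the candidate, so volume the candidate has
    outside the query is not penalized.
    """
    if o_aa <= 0 or o_bb <= 0:
        raise ValueError("self-overlap volumes must be positive")
    return o_ab / (o_ab + alpha * (o_aa - o_ab) + beta * (o_bb - o_ab))


if __name__ == "__main__":
    # Hypothetical volumes: the candidate fully contains the query shape
    # but is twice as large.
    o_aa, o_bb, o_ab = 100.0, 200.0, 100.0
    print(shape_tanimoto(o_aa, o_bb, o_ab) > 0.6)  # cutoff from the slide
    print(shape_tversky(o_aa, o_bb, o_ab) > 0.8)   # cutoff from the slide
```

In this hypothetical case the candidate fails the Tanimoto cutoff but passes the Tversky one, which illustrates why an asymmetric score can retrieve sub-shape matches that a symmetric score misses.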


Acknowledgements

• Jean Bemis, Sirtris Pharmaceuticals, a GSK Company
• Evan Bolton, PubChem, NIH
• Software and Databases: CDK, R, PDB, ZINC, PubChem
• OpenEye Scientific Software, Inc.