Classification Analysis of HIV RNase H Bioassay
Lianyi Han
Computational Biology Branch, NCBI/NLM/NIH
Rocky '07, December 1, 2007



Introduction

The need for new anti-HIV agents
  - Drug-resistant mutations
  - Side effects / toxicity

The limits of virtual screening techniques
  - Huge chemical space
  - Structures and activities

The challenge of generating new hypotheses
  - Noise reduction
  - Knowledge exploration


HIV-1 RT-RNase H assay

HIV-1 reverse transcriptase associated ribonuclease H assay
  - Designed by Dr. Michael Parniak of the University of Pittsburgh
  - PubChem, AID 565
  - 65,218 compounds tested, 1,250 of them active

[Figure: Associations among actives and inactives (Tanimoto ≥ 0.95); legend: actives, inactives]

Distribution of all compounds tested in the HIV-1 RT-RNase H assay

Compound     Total number    Total number   Isolated clusters   Non-isolated clusters
collection   of compounds    of clusters    (only 1 member)     (2 members and above)
Active       1,250           602            424                 178
Inactive     63,969          3,245          1,663               1,582
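The cluster counts on this slide depend on a Tanimoto cutoff of 0.95. The slide does not say which clustering procedure was used, so the following is only a minimal sketch under stated assumptions: fingerprints are represented as sets of on-bit positions, and compounds are grouped into connected components of the "similarity ≥ 0.95" graph. The names `tanimoto`, `threshold_clusters`, and the toy fingerprints are hypothetical and are not data from AID 565.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def threshold_clusters(fps, threshold=0.95):
    """Group fingerprints into connected components of the similarity graph.

    Assumption: two compounds are linked when their Tanimoto coefficient is
    at least `threshold`; the slide does not specify the actual procedure.
    """
    parent = list(range(len(fps)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(fps)), 2):
        if tanimoto(fps[i], fps[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(fps)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


# Hypothetical toy fingerprints (sets of on-bit positions).
toy = [
    frozenset(range(0, 40)),     # compound 0
    frozenset(range(0, 39)),     # compound 1: shares 39 of 40 bits with compound 0
    frozenset(range(100, 120)),  # compound 2: unrelated
]
clusters = threshold_clusters(toy)
isolated = [c for c in clusters if len(c) == 1]
non_isolated = [c for c in clusters if len(c) >= 2]
print(len(clusters), len(isolated), len(non_isolated))
```

The all-pairs loop is quadratic in the number of compounds, so a collection of the size tested here would likely need a faster neighbor search; the sketch only illustrates how isolated and non-isolated clusters can be counted.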


A learning machine

PubChem fingerprint: numerical understanding of molecular structures
  - Example: 2-methylpentane → (1,1,…0)

Probabilistic Neural Network: machine learning

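[Diagram: New compounds → Fingerprint processing → input bits (1, 1, 0, …) → Hidden layer → Summation layer → Output layer]

Hidden layer: $\mathrm{Output}_{\mathrm{pattern}} = e^{-d^{2}/\sigma^{2}}$

Summation layer: $P_{\mathrm{class}_i} = \sum \mathrm{Output}_{\mathrm{pattern}_i}$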
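The layered structure on this slide (hidden pattern layer, summation layer, output layer) can be sketched in a few lines. This is an illustrative sketch, not the author's implementation: it assumes a Gaussian kernel of the form exp(-d²/σ²) on the squared Euclidean distance between binary fingerprint vectors, a plain per-class sum, and a hypothetical smoothing parameter `sigma`. The function names and toy training data are invented for the example.

```python
import math


def pnn_scores(query, training, sigma=1.0):
    """Return {label: summed kernel output} for one query fingerprint.

    training: list of (fingerprint_vector, label) pairs.
    sigma: smoothing parameter (an assumption; the slide gives no value).
    """
    sums = {}
    for vector, label in training:
        # Squared Euclidean distance between query and stored pattern.
        d2 = sum((q - v) ** 2 for q, v in zip(query, vector))
        # Hidden (pattern) layer: one Gaussian unit per training compound.
        out = math.exp(-d2 / sigma ** 2)
        # Summation layer: accumulate pattern outputs per class.
        sums[label] = sums.get(label, 0.0) + out
    return sums


def pnn_predict(query, training, sigma=1.0):
    """Output layer: pick the class with the largest summed response."""
    scores = pnn_scores(query, training, sigma)
    return max(scores, key=scores.get)


# Hypothetical 4-bit fingerprints, far shorter than a real PubChem fingerprint.
training = [
    ((1, 1, 0, 0), "active"),
    ((1, 1, 1, 0), "active"),
    ((0, 0, 1, 1), "inactive"),
    ((0, 0, 0, 1), "inactive"),
]
print(pnn_predict((1, 1, 0, 1), training))
```

Many PNN formulations also divide each class sum by the number of training patterns in that class, which matters when one class is much larger than the other; whether that was done here is not stated on the slide.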


Model evaluation

10-fold cross-validation
  - Sensitivity: 86.4%
  - Specificity: 92.0%
  - Matthews correlation coefficient: 0.26

Receiver Operating Characteristic (ROC) curve analysis
  - Area Under Curve (AUC): 0.90
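For reference, sensitivity, specificity, and the Matthews correlation coefficient are all functions of the four confusion-matrix counts. The sketch below uses the standard definitions; the function name `confusion_metrics` and the counts are hypothetical and are not the cross-validation results reported on this slide.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of actives recovered
    specificity = tn / (tn + fp)  # fraction of inactives rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical imbalanced example: 10 actives versus 990 inactives.
sens, spec, mcc = confusion_metrics(tp=9, fn=1, tn=900, fp=90)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

In this toy case both sensitivity and specificity are around 0.9 while the MCC stays below 0.3, because the false positives drawn from the much larger inactive class outnumber the true positives. A similar effect may contribute to a modest MCC alongside high sensitivity and specificity on a heavily imbalanced assay.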

[Figure: ROC curve; x-axis: False Positive Rate (0 to 1), y-axis: True Positive Rate (0 to 1)]
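The area under a ROC curve equals the probability that a randomly chosen active receives a higher score than a randomly chosen inactive, with ties counted as one half. The sketch below computes the AUC directly from that definition; the function name `roc_auc` and the scores are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of active (1) and inactive (0) scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # active ranked above inactive
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores and true labels (1 = active, 0 = inactive).
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
print(roc_auc(example_scores, example_labels))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of actives from inactives.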


Conclusions
  - The bioactivity data of the HIV-1 RT-RNH assay can be learned to generate new hypotheses
  - Machine learning on HTS data can be used for virtual hit exploration

Acknowledgements
  - Yanli Wang
  - Steve Bryant
  - This research was supported by the Intramural Research Program of the NIH/NLM