Julio E. Peironcely @peyron
Juliopeironcely.com
PhD student at Leiden University and TNO
Structure Generation, Metabolite Space, and Metabolite-Likeness
Metabolomics
the quantitative and qualitative analysis of all metabolites in
samples of cells, body fluids, tissues, etc.
Metabolomics
Workflow steps: Biological question, Experimental design, Sampling, Sample preparation, Data acquisition, Data pre-processing, Data analysis, Biological interpretation
Intermediate outputs: Protocol, Samples, Raw data, List of peaks/biomolecules, Relevant biomolecules/connectivities & Models, Metabolites
We want
A list of candidate structures
As short as possible
With the correct structure in the list
We need
Structure Generator
Keep only metabolites
Use experimental information to filter molecules
Structure Generator
In collaboration with Jean-Loup Faulon, Evry University
Elemental Formula + Fragments
Generate by adding bonds
Keep molecules if they pass Canonical Augmentation
Candidate Structures
Isomorphism
[Figure: graphs on four numbered vertices, grouped into the isomorphic class “triangle + 1 edge” and the isomorphic class “3-edge chain”]
Output ONLY orange graphs
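A minimal sketch of what membership in an isomorphic class means, assuming graphs are stored as edge lists over vertices 1..4. The names `normalize` and `are_isomorphic` are hypothetical, and the brute-force search over all relabelings is only practical for very small graphs; it is not the method used by the generator on the slides.

```python
from itertools import permutations

def normalize(edges):
    """Return an order-independent representation of an undirected edge list."""
    return frozenset(frozenset(e) for e in edges)

def are_isomorphic(edges_a, edges_b, n):
    """Brute force: try every relabeling of vertices 1..n (tiny n only)."""
    target = normalize(edges_b)
    if len(normalize(edges_a)) != len(target):
        return False
    vertices = range(1, n + 1)
    for perm in permutations(vertices):
        relabel = dict(zip(vertices, perm))
        mapped = normalize((relabel[u], relabel[v]) for u, v in edges_a)
        if mapped == target:
            return True
    return False

# Two drawings of "triangle + 1 edge" and one "3-edge chain" on four vertices.
triangle_plus_edge_a = [(1, 2), (1, 3), (2, 3), (1, 4)]
triangle_plus_edge_b = [(2, 3), (2, 4), (3, 4), (1, 3)]
chain = [(1, 2), (2, 3), (3, 4)]
```

With these toy inputs, the two triangle drawings compare as isomorphic and neither matches the chain, which is why only one drawing per class needs to be output.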
Canonical Labeling
Canonizer (Nauty): “triangle + 1 edge” → (1,2)(1,3)(1,4)(2,3); “3-edge chain” → (1,2)(1,3)(2,4)
[Figure: graphs on five vertices built up one bond at a time; a duplicate is marked X]
1 bond: (1,2)
2 bonds: (1,2)(1,3)
3 bonds: (1,2)(1,3)(1,4), (1,2)(1,3)(2,3)
4 bonds: (1,2)(1,3)(1,4)(2,3), (1,2)(1,3)(2,3)(2,4), (1,2)(1,3)(1,4)(3,4), (1,2)(1,3)(1,4)(4,5)
Use canonizer to remove duplicates after each extension
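The "canonize, then remove duplicates after each extension" idea can be sketched as follows. This is an illustration under stated assumptions, not the generator's code: `canonical_form` is a hypothetical brute-force stand-in for a real canonizer such as Nauty (it takes the smallest sorted edge list over all vertex relabelings), and graphs are plain edge tuples without atom types or valences.

```python
from itertools import combinations, permutations

def canonical_form(edges, n):
    """Smallest sorted edge tuple over all relabelings of vertices 1..n.

    Brute-force stand-in for a canonizer; only usable for tiny n.
    """
    best = None
    vertices = range(1, n + 1)
    for perm in permutations(vertices):
        relabel = dict(zip(vertices, perm))
        candidate = tuple(sorted(
            tuple(sorted((relabel[u], relabel[v]))) for u, v in edges
        ))
        if best is None or candidate < best:
            best = candidate
    return best

def extend_and_deduplicate(parents, n):
    """Add one edge to every parent, keeping one graph per canonical form."""
    seen = set()  # global set of canonical forms for this level
    for parent in parents:
        present = set(parent)
        for edge in combinations(range(1, n + 1), 2):
            if edge not in present:
                seen.add(canonical_form(parent + (edge,), n))
    return sorted(seen)

def generate_levels(n, max_edges):
    """Return one list of canonical graphs per edge count, from 0 to max_edges."""
    levels = [[()]]
    for _ in range(max_edges):
        levels.append(extend_and_deduplicate(levels[-1], n))
    return levels
```

The cost of this approach is the `seen` set: every graph of a level has to be canonized and held in memory before duplicates can be discarded.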
Canonical Augmentation
A canonical object augmented in a canonical way produces a canonical object
Check For Canonical Augmentation
Keep an object if a canonical deletion takes you back to its canonical parent
Accept only canonically augmented graphs
[Figure: generation tree of edge lists on five vertices, from (1,2) up to four-bond graphs such as (1,2)(1,3)(1,4)(2,3); rejected graphs are marked X]
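A minimal sketch of the canonical augmentation check for plain graphs grown one edge at a time. Several choices here are assumptions made for illustration: the canonical labeling is found by brute force instead of Nauty, the "canonical deletion" is defined as removing the last edge of the canonical edge list, and the function names are hypothetical. A child is kept only if the edge just added is equivalent, under the child's symmetries, to the edge the canonical deletion would remove.

```python
from itertools import combinations, permutations

def canonizing_maps(edges, n):
    """Return (canonical form, all relabelings that produce it) by brute force."""
    best, maps = None, []
    vertices = range(1, n + 1)
    for perm in permutations(vertices):
        relabel = dict(zip(vertices, perm))
        form = tuple(sorted(
            tuple(sorted((relabel[u], relabel[v]))) for u, v in edges
        ))
        if best is None or form < best:
            best, maps = form, [relabel]
        elif form == best:
            maps.append(relabel)
    return best, maps

def is_canonically_augmented(child_edges, new_edge, n):
    """True if some canonizing relabeling sends new_edge to the canonical last edge."""
    form, maps = canonizing_maps(child_edges, n)
    u, v = new_edge
    return any(tuple(sorted((m[u], m[v]))) == form[-1] for m in maps)

def generate(n, max_edges):
    """Return one list of graphs per edge count, without a global duplicate set."""
    level = [()]
    levels = [level]
    for _ in range(max_edges):
        next_level = []
        for parent in level:
            accepted = set()  # only merges repeats among children of this parent
            for edge in combinations(range(1, n + 1), 2):
                if edge in parent:
                    continue
                child = parent + (edge,)
                if is_canonically_augmented(child, edge, n):
                    accepted.add(canonizing_maps(child, n)[0])
            next_level.extend(sorted(accepted))
        level = next_level
        levels.append(level)
    return levels
```

Unlike duplicate removal with a canonizer, the decision to keep a graph is local: it depends only on the graph and its parent, so no list of previously generated graphs has to be stored.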
Structure Generator Results
                        Glycine   Phenylalanine   Malic acid   D-Cysteine   p-Cresol sulfate
Elemental Composition   C2H5NO2   C9H11NO2        C4H6O5       C3H7NO2S     C7H8O3S
# Output Molecules      84        277,810,163     8,070        3,838        10,203,389
1 Fragment              6         4,037,499       1,601        100          19,940
2 Fragments / 3 Fragments: 93,137, 948, 584, 278
MOLGEN generates the same number of molecules
Elemental Composition → Structure Generation → Molecules → Metabolite Likeness → Metabolites
What do metabolites look like?
Understanding and Classifying Metabolite Space and Metabolite-Likeness. Julio E. Peironcely et al. PLoS One (in press)
Metabolite-likeness
Data: HMDB 8K, ZINC 21M
Standardization, Diversity Selection
Training Set 532 + 532
Test Set 6.4K + 6.4K
Representation: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4
Classification: Support Vector Machines (SVM), Random Forest (RF), Naïve Bayes (NB, abbreviated BC in the results)
5-fold CV
Metabolite likeness: 3 classifiers x 5 representations
Model                            Sensitivity   Specificity   AUC
Best: RF, MDL Public Keys        99.84%        87.52%        99.20%
Bad: BC, Physicochemical desc.   42.51%        86.56%        61.57%
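For readers less familiar with the three reported measures, here is a small sketch of how sensitivity, specificity, and AUC are computed from labels and classifier scores. It assumes metabolites are the positive class (label 1); the function names and the toy labels and scores are invented for illustration and are unrelated to the results above.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: 1 = metabolite, 0 = non-metabolite; scores are invented.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision cutoff

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen cutoff, while AUC summarizes the ranking quality over all cutoffs, which is why a model can score very differently on the three measures.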
Metabolite-likeness, external validation
Sources: HMDB, ChEMBL, DrugBank → Standardization → Random Selection → External validation set → Metabolite likeness
Structure Generation for C9H11NO2
Molecule table columns: Minimized_Energy, ALogP, Index (example row: 0.1100, -1.605, 5142)
277 M
41 K (44%, 99%)
8 K (E < 10; 40%)
31 (E < 10, ALogP < -1; 76%)
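The stepwise narrowing of the candidate list can be sketched as a chain of filters. The thresholds `E < 10` and `ALogP < -1` come from the slides; everything else is an assumption for illustration: the `Candidate` record, the `metabolite_likeness` score and its 0.5 cutoff, and all toy rows except the energy and ALogP of the example row (index 5142).

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    index: int
    minimized_energy: float
    alogp: float
    metabolite_likeness: float  # hypothetical classifier score in [0, 1]

def apply_filters(candidates, filters):
    """Apply named filters in order and record how many candidates survive each."""
    survivors = list(candidates)
    counts = [("generated", len(survivors))]
    for name, keep in filters:
        survivors = [c for c in survivors if keep(c)]
        counts.append((name, len(survivors)))
    return survivors, counts

FILTERS = [
    ("metabolite-like", lambda c: c.metabolite_likeness >= 0.5),  # assumed cutoff
    ("E < 10", lambda c: c.minimized_energy < 10),
    ("ALogP < -1", lambda c: c.alogp < -1),
]

candidates = [
    Candidate(5142, 0.1100, -1.605, 0.9),  # energy and ALogP from the slide; score invented
    Candidate(1, 25.0, -2.0, 0.8),         # invented: fails the energy filter
    Candidate(2, 3.0, 1.2, 0.7),           # invented: fails the ALogP filter
    Candidate(3, 1.0, -1.5, 0.1),          # invented: fails the likeness filter
]

survivors, counts = apply_filters(candidates, FILTERS)
```

Recording the count after each filter makes it easy to see which constraint removes the most candidates, in the same spirit as the 277 M to 31 reduction shown above.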
Conclusions
Metabolite-likeness prediction is good; interpretation is not
Local models are needed
Structure Generator + Metabolite-likeness + other constraints = improved metabolite identification
Acknowledgements
TNO Quality of Life: Leon Coulier, Albert Tas
Evry University: Jean-Loup Faulon, Davide Fichera
HMP, University of Alberta: David Wishart, Ying (Edison) Dong
Leiden University: Miguel Rojas-Cherto, Piotr Kasper, Michael van Vliet, Theo Reijmers, Rob Vreeken, Ronnie van Doorn, Thomas Hankemeier
University of Cambridge: Andreas Bender