BIOMOLECULES IN R&D INFORMATICS

Roland Knispel

WHAT’S IN A STRUCTURE

Cyclosporin A

Common name: Cyclosporin A

InChI key: PMATZTZNYRCHOR-CGLBZJNRSA-N

IUPAC name (biological): cyclo[((2S)-2-aminobutyryl)-sarcosyl-N-methyl-L-leucyl-L-valyl-N-methyl-L-leucyl-L-alanyl-D-alanyl-N-methyl-L-leucyl-N-methyl-L-leucyl-N-methyl-L-valyl-N-methyl-(4R)-4-[(E)-but-2-enyl]-4-methyl-L-threonyl]

IUPAC name (chemical): (3S,6S,9S,12R,15S,18S,21S,24S,30S,33S)-30-ethyl-33-[(E,1R,2R)-1-hydroxy-2-methylhex-4-enyl]-1,4,7,10,12,15,19,25,28-nonamethyl-6,9,18,24-tetrakis(2-methylpropyl)-3,21-di(propan-2-yl)-1,4,7,10,13,16,19,22,25,28,31-undecazacyclotritriacontane-2,5,8,11,14,17,20,23,26,29,32-undecone

CAS number: 59865-13-3

InChI: InChI=1S/C62H111N11O12/c1-25-27-28-40(15)52(75)51-56(79)65-43(26-2)58(81)67(18)33-48(74)68(19)44(29-34(3)4)55(78)66-49(38(11)12)61(84)69(20)45(30-35(5)6)54(77)63-41(16)53(76)64-42(17)57(80)70(21)46(31-36(7)8)59(82)71(22)47(32-37(9)10)60(83)72(23)50(39(13)14)62(85)73(51)24/h25,27,34-47,49-52,75H,26,28-33H2,1-24H3,(H,63,77)(H,64,76)(H,65,79)(H,66,78)/b27-25+/t40-,41+,42-,43+,44+,45+,46+,47+,49+,50+,51+,52-/m1/s1

Ref: PubChem CID 5284373

Closest natural sequence*: AALLVTAGLVL

Canonical HELM*: PEPTIDE1{A.[dA].[meL].[meL].[meV].[BMT].[Abu].[Sar].[meL].V.[meL]}$PEPTIDE1,PEPTIDE1,11:R2-1:R1$$$

Canonical SMILES: CC[C@@H]1NC(=O)[C@H]([C@H](O)[C@H](C)C\C=C\C)N(C)C(=O)[C@H](C(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)N(C)C(=O)[C@@H](NC(=O)[C@H](CC(C)C)N(C)C(=O)CN(C)C1=O)C(C)C

*ChemAxon generated
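The HELM string on this slide packs the monomer list and the ring-closing connection into one line. As a rough illustration of how that line breaks down, here is a minimal Python sketch that splits a single-polymer HELM string into its monomers and connections. The function name `parse_simple_helm` is hypothetical, this is not ChemAxon's parser, and it assumes one simple polymer with no inline SMILES monomers, groups, or annotations.

```python
import re

# HELM string for cyclosporin A, copied from the slide.
HELM = ("PEPTIDE1{A.[dA].[meL].[meL].[meV].[BMT].[Abu].[Sar].[meL].V.[meL]}"
        "$PEPTIDE1,PEPTIDE1,11:R2-1:R1$$$")


def parse_simple_helm(helm):
    """Split a single-polymer HELM string into (polymer id, monomers, connections).

    Hypothetical helper for illustration only: it handles one simple polymer
    and ignores the groups and annotation sections.
    """
    polymer_part, connection_part = helm.split("$")[:2]
    match = re.fullmatch(r"(\w+)\{(.*)\}", polymer_part)
    if match is None:
        raise ValueError("expected exactly one simple polymer")
    polymer_id, body = match.groups()
    # Monomers are dot-separated; multi-character IDs are wrapped in brackets.
    monomers = [m.strip("[]") for m in body.split(".")]
    connections = []
    for conn in filter(None, connection_part.split("|")):
        source, target, bond = conn.split(",")
        left, right = bond.split("-")
        source_pos, source_r = left.split(":")
        target_pos, target_r = right.split(":")
        connections.append(
            (source, int(source_pos), source_r, target, int(target_pos), target_r)
        )
    return polymer_id, monomers, connections


polymer_id, monomers, connections = parse_simple_helm(HELM)
print(polymer_id, len(monomers))
print(connections)
```

The single connection links R2 of monomer 11 to R1 of monomer 1 within the same polymer, which is how the head-to-tail cyclization of the peptide is expressed in this notation.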

Sequence-based depiction


2D structure (Marvin JS)


Sequence (BioEddie)

In a workflow*

How a chemist refines a structure

How SAR is performed

Original / Modified

How to make the transition seamless

*See Joshua Bishop, UGM presentation, 2017, San Francisco

THE TOOLS

Equivalents of Marvin JS and JChem Base for large molecules

BioEddie

- JS application for all major browsers
- Easy editing
- No-structure components
- Native support for MOL/HELM/sequence
- Customizable views
- Multi-level annotations

Recently added capabilities

• Domain support in sequences, e.g. for antibodies
• Improved support for the HELM2 specification

BioEddie: next up

• Library manager (shared with Biomolecule Toolkit)
• HELM2 group support, e.g. for ADCs
• Canvas layout improvements

Biomolecule Toolkit

API (Java and RESTful) for:

– Native HELM support (HELM, HELM2, xHELM)
– Standardization
– Centralized DB storage
– Registration of entities and batches with custom business logic
– Search by sequence/chemical structure/metadata
– Conversion to/from Mol/FASTA/HELM
– Property calculations
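To make the format conversion item concrete, the sketch below converts a FASTA record of natural amino acids into a linear peptide HELM string and back. It is plain Python written for illustration and does not use or reproduce the Biomolecule Toolkit API. The function names `fasta_to_helm` and `helm_to_sequence` are hypothetical, and the sketch assumes a single record, the 20 standard one-letter codes, and no connections.

```python
# One-letter codes of the 20 standard amino acids.
NATURAL = set("ACDEFGHIKLMNPQRSTVWY")


def fasta_to_helm(fasta_text):
    """Convert a single-record FASTA string into a linear peptide HELM string.

    Hypothetical helper: header lines (starting with '>') are skipped and
    only the standard one-letter codes are accepted.
    """
    lines = [line.strip() for line in fasta_text.splitlines() if line.strip()]
    sequence = "".join(line for line in lines if not line.startswith(">")).upper()
    unknown = sorted(set(sequence) - NATURAL)
    if unknown:
        raise ValueError(f"unsupported residue codes: {unknown}")
    # No connections, groups, or annotations: the remaining sections stay empty.
    return "PEPTIDE1{" + ".".join(sequence) + "}$$$$"


def helm_to_sequence(helm):
    """Recover a one-letter sequence from a linear HELM string of natural residues."""
    body = helm[helm.index("{") + 1 : helm.index("}")]
    monomers = body.split(".")
    if any(len(m) != 1 for m in monomers):
        raise ValueError("non-natural monomers have no one-letter code")
    return "".join(monomers)


helm = fasta_to_helm(">example\nMTGCRLCYWEC\n")
print(helm)
print(helm_to_sequence(helm))
```

The reverse direction fails on purpose for bracketed monomers such as `[meL]`: a plain sequence cannot carry them without losing information, which is one reason a HELM-aware toolkit is needed for modified peptides.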

Biomolecule Toolkit

Recently added capabilities

• support
• Improved support for HELM2 specification
• Entity type filter for querying

Next up

• KNIME/PP nodes
• Search for similar sequence
• Position-based sequence enumeration
• Genealogy tracking

Enumerate (Pos. 3) / Find (distance=1)

Starting sequence: MTGCRLCYWEC

MT A CRLCYWEC
MT L CRLCYWEC
MT I CRLCYWEC
MT V CRLCYWEC
MT T CRLCYWEC
MT[Dha]CRLCYWEC
MT[Hse]CRLCYWEC
MT[Nle]CRLCYWEC
MT[Nva]CRLCYWEC
MT[Sar]CRLCYWEC
MT[Aha]CRLCYWEC
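The enumeration and distance search shown on this slide can be sketched in a few lines of Python. This is an illustrative stand-in, not the Biomolecule Toolkit implementation. The names `tokenize`, `enumerate_position`, `distance`, and `find_within` are hypothetical, and "distance" is assumed here to mean the number of substituted positions between two sequences of equal length.

```python
import re

# A monomer is either a bracketed ID such as [Nle] or a single letter.
TOKEN = re.compile(r"\[[^\]]+\]|[A-Za-z]")


def tokenize(sequence):
    """Split a sequence into monomers; bracketed IDs count as one position."""
    return TOKEN.findall(sequence.replace(" ", ""))


def enumerate_position(sequence, position, substitutes):
    """Yield variants with the monomer at the 1-based position replaced."""
    tokens = tokenize(sequence)
    for sub in substitutes:
        variant = tokens.copy()
        # Multi-character monomer IDs are written in brackets.
        variant[position - 1] = sub if len(sub) == 1 else f"[{sub}]"
        yield "".join(variant)


def distance(a, b):
    """Count differing positions; None if the monomer counts differ."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if len(tokens_a) != len(tokens_b):
        return None
    return sum(x != y for x, y in zip(tokens_a, tokens_b))


def find_within(query, library, max_distance=1):
    """Return library sequences within max_distance substitutions of the query."""
    hits = []
    for candidate in library:
        d = distance(query, candidate)
        if d is not None and d <= max_distance:
            hits.append(candidate)
    return hits


substitutes = ["A", "L", "I", "V", "T", "Dha", "Hse", "Nle", "Nva", "Sar", "Aha"]
variants = list(enumerate_position("MTGCRLCYWEC", 3, substitutes))
print(variants)
print(find_within("MTGCRLCYWEC", variants, max_distance=1))
```

Tokenizing before comparing matters: a character-wise comparison would treat `[Nle]` as five positions and report the wrong distance for non-natural monomers.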

USE CASE: LIBRARY MANAGEMENT

Using Marvin JS, BioEddie + Biomolecule Toolkit

Marvin JS + BioEddie

Synchronizing...

New monomer

- User draws structure
- Monomer library check
- Not found? Registration:
  - Verify and curate structure
  - Add missing data
- Push to library
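The new-monomer flow (library check, registration if not found, push to library) could look roughly like the following. This is a hedged, in-memory sketch only: `Monomer`, `MonomerLibrary`, and `submit_structure` are hypothetical names, not part of Marvin JS, BioEddie, or the Biomolecule Toolkit, and the structure key is assumed to be a SMILES string that has already been canonicalized elsewhere.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Monomer:
    symbol: str  # short ID used in sequences, e.g. "Nle"
    smiles: str  # structure key; assumed to be canonicalized upstream
    name: str


@dataclass
class MonomerLibrary:
    entries: Dict[str, Monomer] = field(default_factory=dict)

    def check(self, smiles: str) -> Optional[Monomer]:
        """Monomer library check: look the drawn structure up by its key."""
        return self.entries.get(smiles)

    def push(self, monomer: Monomer) -> None:
        """Push to library; refuse entries that still miss required data."""
        if not monomer.symbol or not monomer.name:
            raise ValueError("monomer is missing required data")
        self.entries[monomer.smiles] = monomer


def submit_structure(
    library: MonomerLibrary,
    smiles: str,
    curate: Callable[[str], Monomer],
) -> Tuple[Monomer, bool]:
    """Return (monomer, created). The curate callback stands in for the
    manual 'verify and curate structure' and 'add missing data' steps."""
    existing = library.check(smiles)
    if existing is not None:
        return existing, False
    monomer = curate(smiles)
    library.push(monomer)
    return monomer, True


library = MonomerLibrary()
drawn = "CCCCC(N)C(=O)O"  # norleucine without stereo, used only as example input
first, created = submit_structure(library, drawn, lambda s: Monomer("Nle", s, "norleucine"))
second, created_again = submit_structure(library, drawn, lambda s: Monomer("Nle", s, "norleucine"))
print(created, created_again)
```

Keying the library on structure rather than on symbol is the design choice that makes the "not found?" branch meaningful: the same drawn structure resolves to the existing entry instead of creating a duplicate.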

BIOMOLECULE TOOLKIT AS A PLATFORM

Biomolecules in Instant JChem

- Create new project pointing to Biomolecule Toolkit DB
- Grid view with molecule visualization
- Form views with additional data
- Querying

Texelia BioScity - Data management/tracking and inventory for the Bio/Chem lab

IDBS EWB Web – bringing HELM support into an ELN

Mockups only! POC development won't be possible until UGM.

HELM (auto-generated): untitled_molecule_file3.mol

More on this by Paul Denny-Gouldson, IDBS

Summary

• Platform or add-on to provide large molecule informatics support

• Seamless transition between chemical space and sequence space

• HELM-compliant

THANK YOU