1
3. Arficial Intelligence Machine learning algorithms such as neural networks have found use in a number of branches in com- puter science due to their high predicvity. Their use in toxicology has so far been limited, but they are gaining much aenon [8]. One drawback of such complex methods is difficulty in understanding their predicons - making them less aracve in risk assessment procedures. A cartoon of a neural network that might be used for MIE predicon is shown below. Inputs are fed into the leſt hand side of the network through the input neurons. Each layer of neurons is connected to the adjacent layers by synapses. When the network is trained on known data its adjusts the weights of each synapse to alter how signals propagate through the network. Output nodes on the right correspond to desired predicons. In the case of MIE predicon we can feed a neural network chemical informaon to its input layer and predict biological acvity at its output layer . The types of in- put informaon, the number of neurons and hidden layers, and the mathemacal relaonships con- tained within neurons and synapses can all be changed before training to invesgate which set of pa- rameters provide the highest levels of predicvity. Models constructed using both structural alerts and neural networks for predicng androgen receptor binding as an MIE are shown in the table above. Three structural alerts-based methodologies [4,5,9] are compared to neural networks using a single hidden layer of 1000 neurons and two hidden layers of 100 neurons. The neural networks use extended connecvity fingerprints as inputs and L2 regularizaon. Sigmoid and ReLU acvaon funcons are used as shown in the table. The neural network approaches perform admirably, considering the high accuracy achieved over a relavely short development period. Further invesgaon of hyperparameters is required to achieve maximum model performance. 1.Introducon The adverse outcome pathway (AOP) provides a framework to encapsulate the chemical and biological processes that can lead to toxicological outcomes [1]. The molecular iniang event (MIE) is a chemical interacon which starts an AOP [2,3]. Some examples of MIEs include receptor acvaon, DNA modifi- caon and enzyme inhibion. Chemistry is key to understanding the MIE. What is it about these com- pounds that allow them to do this? We have used several in silico methodologies to invesgate MIEs using chemical informaon [4,5]. Structural features, reacvity paerns and arficial intelligence can be used to make predicons and gain greater understanding of why chemicals lead to toxicological effects. Quantum mechanical calcula- ons have been performed on a subset of α,β unsaturated carbonyls to idenfy their transion states and acvaon energies during reacons with a DNA nucleobase model nucleophile. Arficial intelli- gence approaches such as machine learning neural networks are being used to make MIE predicons, and similarity calculaons are being invesgated to idenfy how they may allow us to beer under- stand these complex and powerful predicve methods for use in chemical risk assessment. Computaonal approaches for predicng Molecular Iniang Events Timothy E. H. Allen 1 , Jonathan M. Goodman 1 , Mahew N. Grayson 1 , Paul J. Russell 2 , and Steve Gutsell 2 1. Centre for Molecular Informacs, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW 2. Safety & Environmental Assurance Centre, Unilever, Colworth Science Park, Sharnbrook, Bedfordshire, MK44 1LQ 4. Chemical and Biological Similarity To beer understand how neural networks “think” we can explore the similarity between network sig- nal propagaon between different chemical compounds . Once a network is trained compounds can be fed in and provide different numerical values at each individual node. These values can be thought of as coordinates in n-dimensional space, and the distance between them calculated. Compounds that are close have a high similarity in the neural network. Because the network links chemical informaon to biological predicons the compounds can be thought of as being both chemi- cally and biologically similar . When making predicons the neural network could provide a predicon and list of most similar compounds, increasing confidence in its predicons in risk assessment. 2. Transion State Modelling The Ames mutagenicity assay is a long established in vitro test for compound mutagenicity potenal. One of the MIEs that can lead to mutagenicity is the covalent binding of electrophilic chemicals to DNA nucleobases. To explore this we modelled reacons between electrophilic α,β unsaturated carbonyls and a model methylamine nucleophile using quantum mechanical density funcon theory calculaons. Transion states were located and acvaon energies calculated for a number of example molecules [6] with Ames test data in the OECD QSAR Toolbox [7]. These are shown below with Ames posive com- pounds in red on the right hand side and Ames negave compounds in green on the leſt hand side. The calculaons allow us to separate the graph into three regions. Above 25.7 Kcal/mol all the com- pounds calculated are Ames negave as the cannot bind directly to DNA. Below 22.0 Kcal/mol they are all Ames posive and between 22.0 and 25.7 Kcal/mol there are examples of both. These results sug- gest a clear link between acvaon energy and mutagenicity potenal . This methodology allows us to calculate an energy barrier for this process, and other covalent bond forming MIEs and compare them to experimentally tested compounds. References 5. Conclusions MIEs are important chemical-biological interacons in toxicology that can be modelled in silico. Quantum mechanical transion state modelling and acvaon energy calculaons have been used to invesgate DNA binding as an MIE leading to genotoxicity. Acvaon energy was found to correlate well with Ames mutagenicity studies for α,β unsaturated carbonyls, with high acvaon energies corresponding to negave Ames test results. Neural networks have been invesgated for the predicon of receptor binding MIEs, with promis- ing results reported for the Androgen receptor. Future work will involve aempng to beer understand neural network predicons, using a com- binaon of chemical and biological similarity between novel and training compounds. 1. Ankley G. T. et al., Environ. Toxicol. Chem., (2000) 29, 730-741. 2. Allen T. E. H. et. al., Chem. Res. Toxicol., (2014) 27, 2100-2112. 3. Allen T. E. H. et. al., Chem. Res. Toxicol., (2016) 29, 2060-2070. 4. Allen T. E. H. et. al., Chem. Res. Toxicol., (2016) 29, 1611-1627. 5. Allen T. E. H. et. al., Toxicol. Sci., (2018) Advanced Access. 6. Allen T. E. H. et. al., J. Chem. Inf. Model., (2018) 58, 1266-1271. 7. OECD QSAR Toolbox, hp://www.oecd.org/chemicalsafety/risk- assessment/oecd-qsar-toolbox.htm 8. Mayr, A. et al. Front. Environ Sci. (2016) 3, 1-15. 9. Wedlake A. J. et al. Unpublished Data. Ankley's conceptual diagram of an AOP, including the MIE. Image adapted from Ankley 2010 [1]. Acknowledgements Unilever Andrew J. Wedlake Centre for Molecular Informacs St John’s College, Cambridge MIE Predictions for the Androgen Receptor Model TP FN TN FP ACC Alerts (2016) [4] 847 2250 6854 44 77.0 Alerts (2018) [5] 1017 2080 6838 60 78.6 Alerts (Bayesian) [9] 1550 1547 6738 160 82.9 NN [1000] (Sigmoid) 85.6 NN [100,100] (Sigmoid, ReLU) 83.4

omputational approaches for predicting Molecular ... › wp-content › uploads › 2019 › 02 › ... · Machine learning algorithms such as neural networks have found use in a

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: omputational approaches for predicting Molecular ... › wp-content › uploads › 2019 › 02 › ... · Machine learning algorithms such as neural networks have found use in a

3. Artificial Intelligence Machine learning algorithms such as neural networks have found use in a number of branches in com-

puter science due to their high predictivity. Their use in toxicology has so far been limited, but they are

gaining much attention [8]. One drawback of such complex methods is difficulty in understanding their

predictions - making them less attractive in risk assessment procedures. A cartoon of a neural network

that might be used for MIE prediction is shown below.

Inputs are fed into the left hand side of the network through the input neurons. Each layer of neurons

is connected to the adjacent layers by synapses. When the network is trained on known data its adjusts

the weights of each synapse to alter how signals propagate through the network. Output nodes on the

right correspond to desired predictions. In the case of MIE prediction we can feed a neural network

chemical information to its input layer and predict biological activity at its output layer. The types of in-

put information, the number of neurons and hidden layers, and the mathematical relationships con-

tained within neurons and synapses can all be changed before training to investigate which set of pa-

rameters provide the highest levels of predictivity.

Models constructed using both structural alerts and neural networks for predicting androgen receptor

binding as an MIE are shown in the table above. Three structural alerts-based methodologies [4,5,9] are

compared to neural networks using a single hidden layer of 1000 neurons and two hidden layers of 100

neurons. The neural networks use extended connectivity fingerprints as inputs and L2 regularization.

Sigmoid and ReLU activation functions are used as shown in the table. The neural network approaches

perform admirably, considering the high accuracy achieved over a relatively short development period.

Further investigation of hyperparameters is required to achieve maximum model performance.

1. Introduction

The adverse outcome pathway (AOP) provides a framework to encapsulate the chemical and biological

processes that can lead to toxicological outcomes [1]. The molecular initiating event (MIE) is a chemical

interaction which starts an AOP [2,3]. Some examples of MIEs include receptor activation, DNA modifi-

cation and enzyme inhibition. Chemistry is key to understanding the MIE. What is it about these com-

pounds that allow them to do this?

We have used several in silico methodologies to investigate MIEs using chemical information [4,5].

Structural features, reactivity patterns and artificial intelligence can be used to make predictions and

gain greater understanding of why chemicals lead to toxicological effects. Quantum mechanical calcula-

tions have been performed on a subset of α,β unsaturated carbonyls to identify their transition states

and activation energies during reactions with a DNA nucleobase model nucleophile. Artificial intelli-

gence approaches such as machine learning neural networks are being used to make MIE predictions,

and similarity calculations are being investigated to identify how they may allow us to better under-

stand these complex and powerful predictive methods for use in chemical risk assessment.

Computational approaches for predicting Molecular Initiating Events

Timothy E. H. Allen1, Jonathan M. Goodman1, Matthew N. Grayson1,

Paul J. Russell2, and Steve Gutsell2

1. Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW

2. Safety & Environmental Assurance Centre, Unilever, Colworth Science Park, Sharnbrook, Bedfordshire, MK44 1LQ

4. Chemical and Biological Similarity

To better understand how neural networks “think” we can explore the similarity between network sig-

nal propagation between different chemical compounds. Once a network is trained compounds can be

fed in and provide different numerical values at each individual node. These values can be thought of as

coordinates in n-dimensional space, and the distance between them calculated.

Compounds that are close have a high similarity in the neural network. Because the network links

chemical information to biological predictions the compounds can be thought of as being both chemi-

cally and biologically similar. When making predictions the neural network could provide a prediction

and list of most similar compounds, increasing confidence in its predictions in risk assessment.

2. Transition State Modelling

The Ames mutagenicity assay is a long established in vitro test for compound mutagenicity potential.

One of the MIEs that can lead to mutagenicity is the covalent binding of electrophilic chemicals to DNA

nucleobases. To explore this we modelled reactions between electrophilic α,β unsaturated carbonyls

and a model methylamine nucleophile using quantum mechanical density function theory calculations.

Transition states were located and activation energies calculated for a number of example molecules [6]

with Ames test data in the OECD QSAR Toolbox [7]. These are shown below with Ames positive com-

pounds in red on the right hand side and Ames negative compounds in green on the left hand side.

The calculations allow us to separate the graph into three regions. Above 25.7 Kcal/mol all the com-

pounds calculated are Ames negative as the cannot bind directly to DNA. Below 22.0 Kcal/mol they are

all Ames positive and between 22.0 and 25.7 Kcal/mol there are examples of both. These results sug-

gest a clear link between activation energy and mutagenicity potential. This methodology allows us to

calculate an energy barrier for this process, and other covalent bond forming MIEs and compare them

to experimentally tested compounds.

References

5. Conclusions

MIEs are important chemical-biological interactions in toxicology that can be modelled in silico.

Quantum mechanical transition state modelling and activation energy calculations have been used

to investigate DNA binding as an MIE leading to genotoxicity.

Activation energy was found to correlate well with Ames mutagenicity studies for α,β unsaturated

carbonyls, with high activation energies corresponding to negative Ames test results.

Neural networks have been investigated for the prediction of receptor binding MIEs, with promis-

ing results reported for the Androgen receptor.

Future work will involve attempting to better understand neural network predictions, using a com-

bination of chemical and biological similarity between novel and training compounds.

1. Ankley G. T. et al., Environ. Toxicol. Chem., (2000) 29, 730-741.

2. Allen T. E. H. et. al., Chem. Res. Toxicol., (2014) 27, 2100-2112.

3. Allen T. E. H. et. al., Chem. Res. Toxicol., (2016) 29, 2060-2070.

4. Allen T. E. H. et. al., Chem. Res. Toxicol., (2016) 29, 1611-1627.

5. Allen T. E. H. et. al., Toxicol. Sci., (2018) Advanced Access.

6. Allen T. E. H. et. al., J. Chem. Inf. Model., (2018) 58, 1266-1271.

7. OECD QSAR Toolbox, http://www.oecd.org/chemicalsafety/risk-assessment/oecd-qsar-toolbox.htm

8. Mayr, A. et al. Front. Environ Sci. (2016) 3, 1-15.

9. Wedlake A. J. et al. Unpublished Data.

Ankley's conceptual diagram of an AOP, including the MIE. Image adapted from Ankley 2010 [1].

Acknowledgements Unilever

Andrew J. Wedlake

Centre for Molecular Informatics

St John’s College, Cambridge

MIE Predictions for the Androgen Receptor

Model TP FN TN FP ACC

Alerts (2016) [4] 847 2250 6854 44 77.0

Alerts (2018) [5] 1017 2080 6838 60 78.6

Alerts (Bayesian) [9] 1550 1547 6738 160 82.9

NN [1000] (Sigmoid) 85.6

NN [100,100] (Sigmoid, ReLU) 83.4