
TOXICITY MODELLING OF “EEC PRIORITY LIST 1” COMPOUNDS


INTRODUCTION

Council Directive 76/464/EEC of the European Communities (EEC 1976a) includes the so-called “List 1 compounds”, which are dangerous for aquatic environments and were selected mainly on the basis of their toxicity, persistence and bioaccumulation. It is therefore very important to obtain all the information and data relevant to these substances in living aquatic organisms. If no data are available to make an appropriate judgement for a specific substance, the substance is considered a candidate for List 1 until such data become available. For many chemicals there is little reliable information on their relative toxicity, so molecular descriptors and chemometric methods are applied in Quantitative Structure-Activity Relationship (QSAR) studies to predict toxicological data for different aquatic organisms.

All the toxicity data are expressed in mmol/l and on a logarithmic scale as log(1/response). The values used for the calculations were selected by Prof. Marco Vighi (Dept. of Environmental Sciences, Milano) from among the more reliable data of all the sets available. The selected data were produced with comparable, officially accepted testing methods (e.g. standard OECD or EEC guidelines).

Bacteria: 30 min EC50 for the light emitted by a luminescent bacterium (Photobacterium phosphoreum), obtained by a standard automated method (Microtox). Available for 33 homogeneous molecules.

Algae: 96 h EC50 for unicellular chlorophyceans (Selenastrum, Chlorella or comparable species), obtained with standard methods. Available for 45 molecules.

Daphnia: 48 h EC50 obtained with standard methods. Available for 94 molecules.

Fish: 96 h LC50 obtained with standard methods on Oncorhynchus mykiss, Poecilia reticulata or Pimephales promelas. Available for 88 molecules.
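The log(1/response) transformation used for all endpoints can be sketched as follows. This is a minimal illustration, assuming a base-10 logarithm (the poster only says "log"); the function name and the concentrations are hypothetical and are not taken from the data set.

```python
import math


def log_inverse_response(conc_mmol_per_l: float) -> float:
    """Return log10(1/C) for an effect concentration C given in mmol/l.

    Base 10 is an assumption; the poster only states "log (1/response)".
    """
    if conc_mmol_per_l <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(conc_mmol_per_l)


# Hypothetical EC50 values in mmol/l, for illustration only.
for ec50 in (10.0, 1.0, 0.001):
    print(ec50, log_inverse_response(ec50))
```

On this scale a lower effect concentration gives a larger value, so higher numbers mean higher toxicity.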

[Figure: PCA score plot (PC 1 vs. PC 2) of the experimental Daphnia and fish toxicity data. Compounds are labelled by number; loadings T. Daphnia and T. Fish; PC 1 runs from low to high toxicity.]

[Figure: PCA score plot (PC 1 vs. PC 2) of the experimental Daphnia, fish and algae toxicity data. Compounds are labelled by number; loadings T. Daphnia, T. Fish and T. Algae; PC 1 runs from low to high toxicity.]

[Figure: PCA score plot (PC 1 vs. PC 2) of the experimental algae, Daphnia, fish and bacteria toxicity data. Compounds are labelled by number; loadings T. Algae, T. Daphnia, T. Fish and T. Bacteria; PC 1 runs from low to high toxicity.]

The minimum energy conformations of all the compounds were obtained by the molecular mechanics method of Allinger (MM+), using the package HyperChem. All descriptors were calculated from the resulting coordinates using the package WHIM-3D/QSAR.

Principal Component Analysis (PCA) was performed with STATISTICA.

The best subset of variables for modelling the toxicity was selected (Variable Subset Selection, VSS) with a Genetic Algorithm (GA-VSS) approach, in which the response is modelled by Ordinary Least Squares (OLS) regression, using the package Moby Digs.

All the calculations were performed using the leave-one-out procedure of cross-validation, maximising the cross-validated R squared ($Q^2$) (Quick rule). To avoid overestimating the predictive capability of the models, the leave-more-out procedure (with N cross-validation groups, i.e. 30% of the objects left out at each step) was also performed ($Q^2_{LMO}$). The Standard Deviation Error in Prediction (SDEP) and the Standard Deviation Error in Calculation (SDEC) are also reported, together with the multiple correlation coefficient ($R^2$). For the models obtained, the leverage approach was applied to estimate the reliability of the predicted data, so that only reliable predictions are considered.
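The validation statistics named above can be sketched for the simplest case of a single descriptor. This is an illustration under stated assumptions, not the Moby Digs implementation: it uses the usual definitions $Q^2 = 1 - PRESS/TSS$, $SDEC = \sqrt{RSS/n}$ and $SDEP = \sqrt{PRESS/n}$, the data are hypothetical, and the leave-more-out variant is omitted for brevity.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def validation_stats(x, y):
    """Return R2, Q2 (leave-one-out), SDEC and SDEP for a one-descriptor model."""
    n = len(y)
    mean_y = sum(y) / n
    tss = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares

    # Fitting error: residuals of the model built on all objects.
    b0, b1 = fit_line(x, y)
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    # Prediction error: each object is predicted by a model built without it.
    press = 0.0
    for i in range(n):
        c0, c1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (c0 + c1 * x[i])) ** 2

    return {
        "R2": 1.0 - rss / tss,
        "Q2_LOO": 1.0 - press / tss,
        "SDEC": math.sqrt(rss / n),
        "SDEP": math.sqrt(press / n),
    }


# Hypothetical descriptor values and responses, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
stats = validation_stats(x, y)
print(stats)
```

Because every left-out object is predicted by a model that never saw it, $Q^2_{LOO}$ cannot exceed $R^2$ and SDEP cannot be smaller than SDEC, which is the pattern seen in the statistics reported for the four models.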

MOLECULAR DESCRIPTORS

The molecular structure has been represented by different sets of descriptors: mono-dimensional (count), two-dimensional (graph invariants) and three-dimensional (3D-WHIM, 3D Weighted Holistic Invariant Molecular), calculated with the software produced by the Milano Chemometric Research Group of Prof. Roberto Todeschini (1).

Count descriptors (38) directly encode particular features of the molecular structure and are obtained simply from the chemical structural formula of the molecules by counting defined elements such as atoms (nAT), bonds (nBT), rings (nCIC), H-bond acceptors (nHA) and H-bond donors (nHD); atom type counts are also obtained, such as the number of hydrogens, carbons and halogens (nH, nC and nX respectively). The second set consists of the 34 most frequently used graph-invariant descriptors (topological and information indices). The molecular weight (MW) is always used.

For the 3D representation of the molecules, the WHIM descriptors, recently proposed and widely applied by Todeschini and Gramatica (2), have been used: a set consisting of the 33 non-directional WHIM and the 66 directional WHIM. WHIM descriptors are molecular indices that represent different sources of chemical information about the whole 3D molecular structure in terms of size, shape, symmetry and atom distribution. These indices are calculated in a straightforward manner from the (x,y,z)-coordinates of a 3D structure of the molecule, usually a minimum energy conformation, within different weighting schemes, and they represent a very general approach to describing molecules in a unitary conceptual framework.
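As a small illustration of how atom-type count descriptors follow from a chemical formula, the sketch below derives nAT, nC, nH and nX from a flat molecular formula. It is not the WHIM-3D/QSAR code: the function names are hypothetical, the parser handles only formulas without parentheses, and treating F, Cl, Br and I as the halogens counted by nX is an assumption. Descriptors such as nBT, nCIC, nHA and nHD need the full structure and are not covered here.

```python
import re
from collections import Counter

# Assumption: nX counts these four halogens.
HALOGENS = ("F", "Cl", "Br", "I")


def atom_counts(formula: str) -> Counter:
    """Count atoms per element in a flat formula such as 'C6H3Cl3'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def count_descriptors(formula: str) -> dict:
    """Return the atom-type count descriptors nAT, nC, nH and nX."""
    counts = atom_counts(formula)
    return {
        "nAT": sum(counts.values()),
        "nC": counts["C"],
        "nH": counts["H"],
        "nX": sum(counts[element] for element in HALOGENS),
    }


# Example formula chosen for illustration (a trichlorobenzene).
print(count_descriptors("C6H3Cl3"))
```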

(1) R. Todeschini, WHIM-3D/QSAR - Software for the calculation of the WHIM descriptors, rel. 4.1 for Windows, Talete srl, Milano (Italy) 1996. Download: http://www.disat.unimi.it/chm.

(2) R. Todeschini and P. Gramatica, 3D-modelling and prediction by WHIM descriptors. Part 5. Theory development and chemical meaning of the WHIM descriptors, Quant. Struct.-Act. Relat., 16 (1997) 113-119; Part 6. Applications in QSAR Studies, ibid., 120-125.

[Figures: four plots of calculated toxicity (Calc. toxicity) against experimental toxicity (Exp. toxicity), one per model, with compounds labelled by number.]

F. Consolaro and P. Gramatica QSAR Research Unit, Dept. of Structural and Functional Biology, University of Insubria, Varese, Italy.

Web-site: http://andromeda.varbio.unimi.it/~QSAR/

EXPERIMENTAL DATA

Uniform-dimension Principal Component Analyses were performed on all the experimental toxicity data to highlight the distribution of the studied compounds. Along the first principal component the compounds are well separated by their global toxicity, while along the second principal component they are separated by their specific toxicity.

Toxicity in Bacteria (33 objects)

Tox = 6.83 + 1.21 nBO - 0.32 nO - 4.84 WIA - 2.90 P2s - 7.70 Ke

$R^2 = 89.8\%$, $Q^2_{LOO} = 86.1\%$, $Q^2_{LMO} = 82.0\%$

SDEP = 0.26, SDEC = 0.22, $F_{5,27} = 47.65$, S = 0.23

nBO: number of skeleton bonds; nO: number of oxygen atoms; WIA: average Wiener index; P2s: shape directional WHIM descriptor; Ke: shape global WHIM descriptor
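Applying a model of this kind amounts to evaluating a linear equation. The sketch below does this for the bacteria model with the coefficients reported above; the descriptor values are hypothetical placeholders and do not belong to any compound in the data set, so the printed number has no toxicological meaning.

```python
# Coefficients of the bacteria model as reported on the poster.
BACTERIA_MODEL = {
    "intercept": 6.83,
    "nBO": 1.21,
    "nO": -0.32,
    "WIA": -4.84,
    "P2s": -2.90,
    "Ke": -7.70,
}


def predict(model: dict, descriptors: dict) -> float:
    """Evaluate intercept + sum(coefficient * descriptor value)."""
    return model["intercept"] + sum(
        coefficient * descriptors[name]
        for name, coefficient in model.items()
        if name != "intercept"
    )


# Hypothetical descriptor values, for illustration only.
example = {"nBO": 7, "nO": 1, "WIA": 1.5, "P2s": 0.3, "Ke": 0.5}
print(round(predict(BACTERIA_MODEL, example), 2))
```

A missing descriptor raises a KeyError rather than silently defaulting to zero, which seems the safer choice when a prediction depends on every term.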

Toxicity in Algae (45 objects)

Tox = - 0.21 - 0.66 nS - 0.63 nOH - 0.53 nNH2 - 4.12 P1s + 1.02 Tm - 0.23 As

$R^2 = 70.6\%$, $Q^2_{LOO} = 61.5\%$, $Q^2_{LMO} = 58.1\%$

SDEP = 0.56, SDEC = 0.49, $F_{6,38} = 15.20$, S = 0.52

nS: number of sulphur atoms; nOH: number of OH groups; nNH2: number of NH2 groups; P1s: shape directional WHIM descriptor; Tm and As: dimensional global WHIM descriptors

Toxicity in Daphnia (94 objects)

Tox = - 3.57 + 4.05 nP - 0.39 nHA + 1.02 IDM + 0.67 E1m

$R^2 = 84.2\%$, $Q^2_{LOO} = 82.1\%$, $Q^2_{LMO} = 81.7\%$

SDEP = 0.68, SDEC = 0.64, $F_{4,89} = 118.66$, S = 0.65

nP: number of phosphorus atoms; nHA: number of H-bond acceptors; IDM: mean information content on the distance magnitude; E1m: atom distribution directional WHIM descriptor

Toxicity in Fish (88 objects)

Tox = - 2.29 - 0.66 nNO - 0.91 nHD + 0.94 IDM - 10.39 Du + 7.39 De + 2.01 Ds

$R^2 = 81.5\%$, $Q^2_{LOO} = 78.1\%$, $Q^2_{LMO} = 77.8\%$

SDEP = 0.58, SDEC = 0.53, $F_{6,81} = 59.55$, S = 0.55

nNO: number of NO groups; nHD: number of H-bond donors; IDM: mean information content on the distance magnitude; Du, De and Ds: atom distribution global WHIM descriptors
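The reported F values can be cross-checked against $R^2$, the number of objects n and the number of descriptors p through the standard regression identity $F_{p,\,n-p-1} = \dfrac{R^2/p}{(1-R^2)/(n-p-1)}$. The sketch below applies it to the four models; small differences are expected because the published $R^2$ values are rounded.

```python
# name: (n objects, p descriptors, R2 in percent, F as reported on the poster)
MODELS = {
    "Bacteria": (33, 5, 89.8, 47.65),
    "Algae": (45, 6, 70.6, 15.20),
    "Daphnia": (94, 4, 84.2, 118.66),
    "Fish": (88, 6, 81.5, 59.55),
}


def f_from_r2(n: int, p: int, r2: float) -> float:
    """F ratio of an OLS model with intercept, from R2 given as a fraction."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


for name, (n, p, r2_percent, f_reported) in MODELS.items():
    f_calc = f_from_r2(n, p, r2_percent / 100.0)
    print(f"{name}: F({p},{n - p - 1}) = {f_calc:.2f} (reported {f_reported})")
```

The degrees of freedom obtained this way (5 and 27, 6 and 38, 4 and 89, 6 and 81) match the subscripts reported with each F value.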

[Panel labels: Bacteria, Daphnia, Algae, Fish]

PCA on toxicity of bacteria, algae, Daphnia and fish. Training set: 15 molecules. Cumulative explained variance (Cum. E.V.) = 86.6% (PC1 = 72.8%).

PCA on toxicity of algae, Daphnia and fish. Training set: 37 molecules. Cum. E.V. = 90.0% (PC1 = 69.0%).

PCA on toxicity of Daphnia and fish. Training set: 79 molecules. Cum. E.V. = 100% (PC1 = 86.5%).
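The 100% cumulative explained variance of the Daphnia and fish analysis follows from having only two variables: two components always reproduce two variables completely. Assuming the PCA was run on autoscaled data (that is, on the correlation matrix, which the poster does not state explicitly), the two eigenvalues are $1 + |r|$ and $1 - |r|$, where r is the correlation between the two toxicities, so PC1 explains $(1 + |r|)/2$ of the variance. The sketch below uses hypothetical helper names and hypothetical data.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def two_variable_pca_shares(a, b):
    """Explained-variance fractions (PC1, PC2) of a correlation-matrix PCA."""
    r = abs(pearson_r(a, b))
    return (1.0 + r) / 2.0, (1.0 - r) / 2.0


def implied_abs_r(pc1_share: float) -> float:
    """Absolute correlation implied by a reported PC1 share (two variables)."""
    return 2.0 * pc1_share - 1.0


# Hypothetical toxicity values for two endpoints, for illustration only.
tox_daphnia = [0.5, 1.2, 2.0, 2.9, 4.1]
tox_fish = [0.8, 1.0, 2.4, 2.6, 3.9]
print(two_variable_pca_shares(tox_daphnia, tox_fish))
print(implied_abs_r(0.865))
```

Under that assumption, the reported PC1 share of 86.5% corresponds to a correlation of about 0.73 between the Daphnia and fish toxicities in the training set.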

CONCLUSIONS

The procedures used have confirmed the quite satisfactory predictive capability of the models obtained. The role of the descriptors in predicting the toxic effects can be explained, though a few uncertainties remain. Count descriptors play an important role in all models because they capture particular features of some groups of chemicals; the shape (P, K) and density (E, D) factors are also decisive in predicting the toxicity of the studied compounds.

Using the reliable predicted data, it was possible to add many toxicological values to the available experimental ones. The graphs below and the annexed table report all the available experimental data and, in addition, the values predicted by our models (pink data).
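The leverage approach used to decide which predictions count as reliable can be sketched for a one-descriptor model. For simple regression the leverage of a value x is $h = 1/n + (x - \bar{x})^2 / \sum_i (x_i - \bar{x})^2$, computed from the training set alone. The cut-off $h^* = 3(p+1)/n$ used below is a commonly quoted warning value and is an assumption here, since the poster does not state which threshold was applied; the data and function names are hypothetical.

```python
def leverage(x_train, x_new: float) -> float:
    """Leverage of x_new with respect to a one-descriptor training set."""
    n = len(x_train)
    mean_x = sum(x_train) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x_train)
    return 1.0 / n + (x_new - mean_x) ** 2 / sxx


def is_reliable(x_train, x_new: float, n_descriptors: int = 1) -> bool:
    """True if the leverage does not exceed the assumed cut-off 3(p+1)/n."""
    h_star = 3.0 * (n_descriptors + 1) / len(x_train)
    return leverage(x_train, x_new) <= h_star


# Hypothetical training descriptor values, for illustration only.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
print(is_reliable(x_train, 5.5))   # close to the centre of the training set
print(is_reliable(x_train, 20.0))  # far outside the training range
```

A compound whose descriptor values lie far from those of the training set gets a high leverage, so its prediction is an extrapolation and would be excluded from the data added to the experimental values.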

[Figure: PCA score plot (PC 1 vs. PC 2) of Daphnia and fish toxicity with experimental and predicted data. Compounds are labelled by number; loadings T. Daphnia and T. Fish; PC 1 runs from low to high toxicity.]

[Figure: PCA score plot (PC 1 vs. PC 2) of Daphnia, fish and algae toxicity with experimental and predicted data. Compounds are labelled by number; loadings T. Daphnia, T. Fish and T. Algae; PC 1 runs from low to high toxicity.]

[Figure: PCA score plot (PC 1 vs. PC 2) of algae, Daphnia, fish and bacteria toxicity with experimental and predicted data. Compounds are labelled by number; loadings T. Algae, T. Daphnia, T. Fish and T. Bacteria; PC 1 runs from low to high toxicity.]

PCA on toxicity of bacteria, algae, Daphnia and fish. Total number of molecules: 54. Cum. E.V. = 81.5% (PC1 = 58.2%).

PCA on toxicity of algae, Daphnia and fish. Total number of molecules: 97. Cum. E.V. = 93.7% (PC1 = 77.0%).

PCA on toxicity of Daphnia and fish. Total number of molecules: 125. Cum. E.V. = 100% (PC1 = 88.3%).

METHODS

2s/P003