Alternative approaches for skin sensitization evaluation: Statistical and integrated approach for the combination of non-animal methods

C. Gomes, H. Noçairi, M. Thomas

Research & Innovation, Advanced Research

ESTIV2012

© L’Oréal - Reproduction prohibited without written agreement of L’Oréal.



Overview

Introduction/Context

Specific methodology

Visualization of the methodology

Process of validation rules

Data and Application

Conclusions and Perspectives



Statutory Context

L'Oréal is developing alternative approaches to the safety evaluation of ingredients for skin sensitization by combining multiple in vitro and in silico data.

Purpose: develop a predictive model for hazard identification (Sensitizer/Non-Sensitizer).

Data: a full data set of 165 chemicals described by 35 variables: in silico predictions (Derek, TIMES, Toxtree), in vitro test results (DPRA, MUSST, Nrf-2 and PGE-2), and numerous experimental or calculated physico-chemical parameters.



Specific Methodology

Objective: prediction of a binary outcome (Sensitizer/Non-Sensitizer)

A large number of supervised classification models have been proposed in the literature. Which one to choose?

Using one single statistical approach induces a bias.

Solution: a "stacking" meta-model.



Specific Methodology

Repeated sub-sampling for variable selection

• Small number of observations

• Choice of different models: Boosting, Naïve Bayes, SVM, Sparse PLS-DA, and Expert Scoring



Visualization of the methodology

Data: input variables (qualitative and quantitative) and a binary outcome (Sensitizer class (S) / Non-Sensitizer class (NS)) for N subjects.

Five base models: (1) Boosting, (2) Sparse PLS-DA, (3) SVM, (4) Naïve Bayes, (5) Score Method.

Each model provides, for each subject, a probability of being dangerous.

Stacking meta-model by logistic PLS-DA: the five model outputs (1 to 5) are the inputs, and the S/NS class is the response variable.

Stacking is a combination of 5 supervised classification methods, giving a robust prediction.
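As an illustration of the stacking idea, the sketch below fits a meta-model on the probabilities returned by five base models. It is only a minimal sketch: the slides use a logistic PLS-DA meta-model, whereas this example swaps in a plain logistic regression fitted by gradient descent, and the probabilities in `TRAIN_PROBS` are hypothetical values, not results from the study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta_model(base_probs, labels, lr=0.5, epochs=2000):
    """Fit a logistic-regression meta-model on base-model probabilities.

    base_probs: one row per chemical, one probability per base model.
    labels: 1 = Sensitizer, 0 = Non-Sensitizer.
    """
    n_models = len(base_probs[0])
    n = len(labels)
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(base_probs, labels):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
            err = p - y  # gradient of the log-loss w.r.t. the linear score
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_meta(weights, bias, row):
    """Stacked probability of being a sensitizer for one chemical."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))

# Hypothetical outputs of the five base models (columns) for six chemicals.
TRAIN_PROBS = [
    [0.9, 0.8, 0.7, 0.95, 0.85],
    [0.8, 0.6, 0.9, 0.70, 0.75],
    [0.7, 0.9, 0.6, 0.80, 0.65],
    [0.2, 0.3, 0.1, 0.25, 0.15],
    [0.1, 0.4, 0.3, 0.20, 0.35],
    [0.3, 0.2, 0.4, 0.10, 0.25],
]
TRAIN_LABELS = [1, 1, 1, 0, 0, 0]

weights, bias = fit_meta_model(TRAIN_PROBS, TRAIN_LABELS)
p_high = predict_meta(weights, bias, [0.85, 0.9, 0.8, 0.9, 0.7])
p_low = predict_meta(weights, bias, [0.15, 0.2, 0.1, 0.3, 0.2])
```

The meta-model only sees the base-model probabilities, so a chemical on which the base models disagree is weighted according to how informative each model was on the learning data.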



Process of validation rules

Step 1: Data (N observations) split into a learning set (70%) and a validation set (30%)

Step 2: Learning set split into Q subsets; each qth subset (q = 1, ..., Q) is divided into learning (80%) and test (20%) parts

Step 3: Parameterization of each model on every subset (1st, ..., qth, ..., Qth stacking meta-model), and selection of the variables common to all subsets

Step 4: Global stacking: stacking model on the learning set with the variables selected in step 3

Step 5: Stacking model on the validation set
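The splitting scheme (70/30 learning/validation split, then Q repeated 80/20 sub-samples of the learning set, then the variables common to all subsets) can be sketched as follows. This is a hedged illustration only: the value of `Q`, the random seed and the variable names in `selected` are hypothetical, and the actual variable-selection step of each model is replaced by fixed example sets.

```python
import random

def split(items, percent_first, rng):
    """Randomly split items into two lists; the first gets percent_first %."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = len(shuffled) * percent_first // 100
    return shuffled[:cut], shuffled[cut:]

def common_variables(selections):
    """Variables selected in every subset (set intersection)."""
    return set.intersection(*map(set, selections))

rng = random.Random(0)          # fixed seed, for reproducibility only
chemicals = range(165)          # 165 chemicals, as in the data set

# Step 1: learning (70%) / validation (30%)
learning, validation = split(chemicals, 70, rng)

# Step 2: Q repeated sub-samples of the learning set, each 80% / 20%
Q = 10                          # hypothetical number of subsets
subsets = [split(learning, 80, rng) for _ in range(Q)]

# Step 3 (placeholder): variables each subset would select; names are made up
selected = [
    {"var1", "var2", "var3"},
    {"var2", "var3"},
    {"var2", "var3", "var7"},
]
kept = common_variables(selected)
```

With 165 chemicals this integer split leaves 50 chemicals in the validation set.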



Data and Application

Figure: predicted probabilities (Sensitizer / Non-Sensitizer) for the Stacking model and the Boosting model. Probability ≥ 85%: Sensitizer conclusion; probability ≤ 15%: Non-Sensitizer conclusion; in between: inconclusive. Panel annotations: (N=67: ≥ 85% and ≤ 15%) and (N=135: ≥ 85% and ≤ 15%).
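The three-way decision rule based on the 85% and 15% cut-offs can be written as a small function. This is a minimal sketch; the function name `conclude` and the example probabilities are illustrative, not taken from the study.

```python
def conclude(probability, upper=0.85, lower=0.15):
    """Turn a predicted probability of being a sensitizer into a conclusion."""
    if probability >= upper:
        return "Sensitizer"
    if probability <= lower:
        return "Non-Sensitizer"
    return "Inconclusive"

# Hypothetical predicted probabilities
probabilities = [0.97, 0.85, 0.50, 0.15, 0.03]
conclusions = [conclude(p) for p in probabilities]
```

Only the chemicals falling outside the inconclusive band are counted when performance is reported on high-confidence predictions.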



Performances on the validation set (N = 50)

Performance comparisons on a validation set (25 Sensitizers and 25 Non-Sensitizers):

Taking into account only high probabilities (≥ 85% and ≤ 15%):

Results show that the stacking model performs better than all the other models taken separately, and on a larger set.



Conclusions and Perspectives

Conclusions:

The stacking meta-model gives a prediction model that performs better, for the development of alternative approaches in the safety evaluation of chemicals, than each of the five initial models taken separately.

This kind of alternative prediction tool will ultimately contribute to risk assessment decision-making in a Weight of Evidence approach.

Perspectives:

Implementing another prediction model in the stacking meta-model

Linking the outputs of the statistical approach with the understanding of biological mechanisms

Obtaining a predictive model for potency evaluation of sensitizers (multi-class case)

Thank you for your attention



Backup



Naïve Bayes

To refine the a priori probability attached to each test, a quality criterion (Quality Factor, noted QF) is used, based on a Klimisch-like code (1, 2 or 3):

o Klimisch-like "code 1": reliable results, QF = 1
o Klimisch-like "code 2": doubtful results, QF = 0.8
o Klimisch-like "code 3": unreliable results, QF = 0.2

Bayes' theorem relates the conditional and marginal probabilities of stochastic events A and B:

$P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$

For a test giving result A, with a priori probability $p_0$, test sensitivity $p_1$ and test specificity $p_2$, the posterior probability is:

$P = \dfrac{p_0\,p_1}{p_0\,p_1 + (1 - p_0)(1 - p_2)}$

Worked example: the tests are applied one after the other (result A = 1 or B = 0), the posterior of one test becoming the prior of the next.

Test | Sensitivity | Specificity | Result | QF | Prior P(A) / P(B) | Posterior P(A) / P(B)
Test 1 | 0.8 | 0.67 | 1 | 1 | 0.5 / 0.5 | 0.705 / 0.29
Test 2 | 0.75 | 0.875 | 0 | 1 | 0.705 / 0.29 | 0.41 / 0.59
Test 3 | 0.67 | 0.88 | 1 | 1 | 0.41 / 0.59 | 0.80 / 0.20

First update (result A): $P(A) = \dfrac{0.5 \times 0.8}{0.5 \times 0.8 + 0.5 \times (1 - 0.67)} \approx 0.705$

Second update (result B): $P(B) = \dfrac{0.29 \times 0.875}{0.29 \times 0.875 + 0.705 \times (1 - 0.75)} \approx 0.59$, so $P(A) = 1 - 0.59 = 0.41$

The aim of the quality criterion is to correct the observed "raw" prediction by taking the reliability of the test into account, in the following way:

o Corrected Sensitivity = 0.5 + QF × (Sensitivity − 0.5)
o Corrected Specificity = 0.5 + QF × (Specificity − 0.5)

Example with QF = 0.2, Sensitivity = 0.75 and Specificity = 0.875:

0.5 + 0.2 × (0.75 − 0.5) = 0.55
0.5 + 0.2 × (0.875 − 0.5) = 0.575
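The sequential Bayesian update with the Quality Factor correction can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the function names are made up, and the three tests use sensitivity/specificity pairs (0.8, 0.67), (0.75, 0.875) and (0.67, 0.88) with results A, B, A and a prior of 0.5. Because the code does not round intermediate values, its posteriors agree with the slide values 0.705, 0.41 and 0.80 only to within rounding.

```python
def corrected(performance, qf):
    """Shrink a sensitivity or specificity toward 0.5 according to QF."""
    return 0.5 + qf * (performance - 0.5)

def update(prior_a, result, sensitivity, specificity, qf=1.0):
    """Posterior P(A) after one test; result is 1 (A) or 0 (B)."""
    sens = corrected(sensitivity, qf)
    spec = corrected(specificity, qf)
    if result == 1:
        like_a, like_b = sens, 1.0 - spec      # P(result | A), P(result | B)
    else:
        like_a, like_b = 1.0 - sens, spec
    numerator = prior_a * like_a
    return numerator / (numerator + (1.0 - prior_a) * like_b)

# (result, sensitivity, specificity) for three tests applied in sequence
TESTS = [(1, 0.8, 0.67), (0, 0.75, 0.875), (1, 0.67, 0.88)]

p_a = 0.5                       # non-informative prior
posteriors = []
for result, sens, spec in TESTS:
    p_a = update(p_a, result, sens, spec)
    posteriors.append(p_a)

# Effect of a poor-quality test (QF = 0.2) on its performances
low_quality = (corrected(0.75, 0.2), corrected(0.875, 0.2))
```

A test with QF = 0.2 ends up with performances close to 0.5, so its result barely moves the posterior.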



Expert Score

The score method makes it possible, through graphical visualization, to select important variables and to fix thresholds.

Example for qualitative variables (score scenario for Var 1):

Modality | A score (+1, +2, +3) | B score (-1, -2, -3)
Modality 1 | 0 | -1
Modality 2 | 0 | 0
Modality 3 | +2 | 0

Figure: number of subjects (N) of class A and class B for each modality (1, 2, 3) of Parameter 1.



Example for continuous variables (score scenario for Var 2):

Value of Var 2 | B score (+1, +2, +3) | A score (-1, -2, -3)
<= Threshold | +2 | 0
> Threshold | 0 | 0

Figure: values of Var 2 for classes B and A, with the threshold marked.




Table: global score values for all subjects

Subject | Global Score
Subject-1 | 7
Subject-n | 6

Choice of the threshold: best compromise between sensitivity and specificity.

Figure: ROC curve (Sensitivity versus 1-Specificity, both axes from 0 to 1).
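The scoring and threshold choice can be illustrated as follows. This is a sketch under stated assumptions: the per-variable scores, the cut-off of 10.0 on the continuous variable and the six subjects are hypothetical, and "best compromise between sensitivity and specificity" is interpreted here as maximizing Youden's index (sensitivity + specificity − 1), which is one common choice rather than necessarily the one used in the study.

```python
def global_score(subject, rules):
    """Sum the per-variable scores; each rule maps a value to a signed score."""
    return sum(rule(subject[name]) for name, rule in rules.items())

def best_threshold(scores, labels):
    """Cut-off maximizing sensitivity + specificity - 1 (Youden's index).

    A subject is predicted positive (label 1) when score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / positives + tn / negatives - 1
        if best is None or j > best[1]:
            best = (t, j)
    return best

# Hypothetical score scenarios for one qualitative and one continuous variable
RULES = {
    "var1": lambda modality: {1: -1, 2: 0, 3: 2}[modality],
    "var2": lambda value: 2 if value <= 10.0 else 0,
}

# Hypothetical subjects with their known class (1 or 0)
SUBJECTS = [
    ({"var1": 3, "var2": 4.0}, 1),
    ({"var1": 3, "var2": 12.0}, 1),
    ({"var1": 2, "var2": 8.0}, 1),
    ({"var1": 1, "var2": 15.0}, 0),
    ({"var1": 2, "var2": 20.0}, 0),
    ({"var1": 1, "var2": 9.0}, 0),
]

scores = [global_score(subject, RULES) for subject, _ in SUBJECTS]
labels = [label for _, label in SUBJECTS]
threshold, youden = best_threshold(scores, labels)
```

Each candidate threshold corresponds to one point of the ROC curve; the selected one is the point farthest above the diagonal.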



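Sparse PLS-DA

PLS-DA is a classification technique that combines the properties of PLS regression with the discriminating power of discriminant analysis.

Figure: the scaled matrix X (n observations × p variables) is linked to the class matrix Y (n × q) by the PLS-DA model Y = b.X, where b is the regression vector; the score plot (t1, t2) shows the maximum between-class (B) variation.

PLS solves the optimization problem:

$\max_{\|u_h\| = 1,\ \|v_h\| = 1} \operatorname{cov}(X u_h,\, Y v_h)$, i.e. $\min_{u_h,\, v_h} \left\| X^T Y - u_h v_h^T \right\|^2$

Sparse PLS solves the optimization problem:

$\min_{u_h,\, v_h} \left\| X^T Y - u_h v_h^T \right\|^2 + P_{\lambda_1}(u_h) + P_{\lambda_2}(v_h)$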
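To give an idea of how the penalty in sparse PLS produces variable selection, the sketch below computes a sparse loading vector for a single response. It rests on assumptions not stated in the slides: the penalty is taken to be a lasso-type penalty, whose minimizer is a soft-thresholding of X^T y, only one response is used (so the Y-side vector reduces to 1), and the data in `X` and `y` are toy values. How `lam` maps to the penalty parameter depends on the scaling chosen for the penalty.

```python
import math

def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    return math.copysign(max(abs(value) - lam, 0.0), value)

def sparse_loading(X, y, lam):
    """Sparse loading vector for a single response y.

    Computes m = X^T y, soft-thresholds each entry, then normalizes.
    """
    p = len(X[0])
    m = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    u = [soft_threshold(mj, lam) for mj in m]
    norm = math.sqrt(sum(uj * uj for uj in u))
    return [uj / norm for uj in u] if norm > 0 else u

# Toy data: 4 observations, 3 variables; the second variable is nearly unrelated to y
X = [
    [1.0, 0.3, -1.0],
    [2.0, 0.1, -2.0],
    [-1.0, 0.2, 1.0],
    [-2.0, -0.1, 2.0],
]
y = [1.0, 1.0, -1.0, -1.0]

u = sparse_loading(X, y, lam=1.0)
```

The weakly related variable receives a loading of exactly zero, which is how the sparse version selects variables while building each component.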



Boosting



What are SVMs?

Support Vector Machines (SVMs) are a set of machine learning approaches used for classification and regression, developed by Vladimir Vapnik.

How does it work?

An SVM is based on the concept of decision planes that define decision boundaries.




Example 1: Linear SVMs

Figure: points of Class A and Class B in the (x1, x2) plane.

How would you classify these points using a linear discriminant function in order to minimize the error rate?

Infinite number of answers!

Which one is the best?



Example 1: Linear SVMs

The linear discriminant function with the maximum margin is the best. The margin ("safety zone") is defined as the maximal width by which the boundary could be moved away from the separating hyperplane before hitting the first data point.

Why is it the best? It is robust to outliers and thus has a strong generalization ability.

Figure: Class A and Class B points in the (x1, x2) plane, with the margin around the separating hyperplane and the support vectors (x+, x-) on its edges.
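The notion of margin can be made concrete by computing, for a given separating line, the distance to the closest data point. The sketch below only compares two hand-picked candidate lines on hypothetical points; an actual SVM searches over all separating hyperplanes for the one with the largest margin.

```python
import math

def margin(w, b, points, labels):
    """Smallest signed distance from the line w.x + b = 0 to any point.

    Labels are +1 / -1; the result is positive only if every point
    lies on the correct side of the line.
    """
    norm = math.hypot(*w)
    return min(
        y * (w[0] * x1 + w[1] * x2 + b) / norm
        for (x1, x2), y in zip(points, labels)
    )

# Hypothetical, linearly separable points
POINTS = [(1.0, 1.0), (2.0, 1.0), (4.0, 4.0), (5.0, 4.0)]
LABELS = [-1, -1, +1, +1]

# Two candidate separating lines, given as (w, b)
CANDIDATES = {
    "vertical": ((1.0, 0.0), -3.0),   # x1 = 3
    "diagonal": ((1.0, 1.0), -5.5),   # x1 + x2 = 5.5
}

margins = {name: margin(w, b, POINTS, LABELS) for name, (w, b) in CANDIDATES.items()}
best = max(margins, key=margins.get)
```

Both lines classify every point correctly, but the one with the wider margin leaves more room before a new or noisy point crosses the boundary.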



Example 2: Non-Linear SVMs

Datasets that are linearly separable, even with some noise, work out great.

Figure: points on a one-dimensional axis (0, x).

But what are we going to do if the data set is just impossible to separate into 2 parts?

How about mapping the data to a higher-dimensional space?

Figure: the points plotted against x and x².
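The idea of mapping data to a higher-dimensional space can be illustrated on a one-dimensional example: points of one class sit in the middle of the axis and cannot be isolated by a single cut, but become separable by a straight line once each x is mapped to (x, x²). The data, the class labels and the cut value 2.5 below are hypothetical and chosen only for illustration.

```python
def feature_map(x):
    """Map a 1-D point to 2-D: x -> (x, x**2)."""
    return (x, x * x)

def separable_by_one_cut(points, labels):
    """True if a single cut on the line leaves one class on each side."""
    ordered = sorted(zip(points, labels))
    switches = sum(1 for (_, a), (_, b) in zip(ordered, ordered[1:]) if a != b)
    return switches <= 1

# Hypothetical 1-D data: class A in the middle, class B on both sides
XS = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
LABELS = ["B", "B", "A", "A", "A", "B", "B"]

one_cut_is_enough = separable_by_one_cut(XS, LABELS)

# After mapping, the horizontal line x**2 = 2.5 separates the two classes
mapped = [feature_map(x) for x in XS]
predicted = ["A" if x_squared < 2.5 else "B" for _, x_squared in mapped]
```

In practice an SVM obtains this effect through a kernel function rather than by computing the mapped coordinates explicitly.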
