University of Cagliari
Master Science in Chemical and Process Engineering
Statistical control of FTIR measurements in commercial detergents production
Supervisor: Ing. Massimiliano GROSSO
Co-supervisor: Ing. Vincenzo GUIDA
Student: Alessandra TARIS
Scientific committee: Prof. Ing. Roberto BARATTI
In collaboration with Procter & Gamble
2011-2012
Aim: ensure standard quality in detergents
Focus on surface detergents
Complex formulations containing:
• potassium hydroxide
• surfactants (anionic, amphoteric, non-ionic)
• chelating agents
• sodium carbonate
• perfume
• ethanol
• fatty acid
• polymers
• etc.
Steps in liquid detergent production:
• ingredients mixing
• packaging
• quality control
Problems:
• Interpretation and manipulation of collected process variables may be difficult
• Online quality control is not always feasible
• Analytical techniques are slow (e.g. concentration measurements)
Process deviations are due to composition variations of the detergent
FTIR spectroscopy: fast analytical technique, can be used online
Reproduction using a set of 142 detergent samples
Joint variation of 11 experimental conditions (compound concentrations)
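Samples FTIR spectra
$Y = \begin{bmatrix} y_{11} & y_{12} & \cdots & y_{1P} \\ y_{21} & y_{22} & \cdots & y_{2P} \\ \vdots & \vdots & \ddots & \vdots \\ y_{N1} & y_{N2} & \cdots & y_{NP} \end{bmatrix}$, N = 142, P = 1738
[Figure: sample FTIR spectra; x-axis: wavenumber (cm-1), y-axis: absorbance]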
Experimental campaign (P&G, Bruxelles)
142 spectra, 1738 absorbances for each spectrum
Samples FTIR spectra
Deviations reflect on spectra
Problem: how can we identify differences between samples using spectra analysis?
Thesis aims:
1. Development of methods for statistical control of experimental measurements (spectra) using multivariate statistical techniques (to be implemented online in the future)
2. Detect compounds that significantly affect the spectra
[Figure: FTIR spectra; x-axis: wavenumber (cm-1), y-axis: absorbance]
PCA goals: data compression, information extraction
Example: two-dimensional case study (x1-x2 set)
Original variables: high-dimensional, highly correlated
Principal components (PC): fewer, independent
• PC1: greatest variance
• PC2: residual variance
• PC1 and PC2 are independent (orthogonal)
[Figure: data in the x1-x2 plane with the PC1 and PC2 directions]
Data coordinates in the new space: scores (T)
Score1 (t1): projections on the PC1 direction
Score2 (t2): projections on the PC2 direction
Scores variance: Sscore1 >> Sscore2
PCA model: only one principal component (PC1)
[Figure: x1-x2 data projected on the PC1 and PC2 directions]
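The two-dimensional example can be reproduced numerically. The following is a minimal sketch, assuming NumPy and a synthetic x1-x2 data set; the data and the function name pca_2d are illustrative and not taken from the thesis. The principal directions are computed as the eigenvectors of the covariance matrix, and the scores as the projections of the centred data on them.

```python
import numpy as np

def pca_2d(X):
    """Return (mean, loadings, scores, score variances) for an (n, 2) data set."""
    mean = X.mean(axis=0)
    Xc = X - mean                               # mean-centre the data
    cov = np.cov(Xc, rowvar=False)              # 2x2 sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder so that PC1 comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                       # t1, t2: projections on PC1, PC2
    return mean, eigvecs, scores, eigvals

# Synthetic, strongly correlated x1-x2 data (assumption, for illustration only)
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)
X = np.column_stack([x1, x2])

mean, P, T, var = pca_2d(X)
print("score variances:", var)
print("fraction of variance on PC1:", var[0] / var.sum())
```

When the variance of score 1 dominates, as in this synthetic set, keeping only PC1 loses little information, which is the situation described in the slide.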
Hotelling T2
Measures the distance from O′ within the PCA model
Q statistic
Measures the distance of the sample from the PCA model
(that is, from its orthogonal projection on the PC1 line)
Two-dimensional case study: two samples assumed to be out-of-control
Identification of out-of-control samples using the Q and T2 statistics
• One sample: low T2, high Q
• Other sample: high T2, low Q
If T2 > T2lim or Q > Qlim, the sample is out-of-control
[Figure: Q and T2 geometric interpretation in the x1-x2 plane, with O′ the origin of the PCA model]
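The geometric meaning of the two statistics can be checked on a small numerical case. This is a minimal sketch, assuming NumPy, synthetic training data and a one-component model; the function names fit_pca and t2_and_q and the two test points are hypothetical. T2 is computed as the squared score scaled by the score variance, Q as the squared norm of the residual left after projection on the model.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a PCA model: return mean, loadings (J x A) and score variances (A,)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order], eigvals[order]

def t2_and_q(x, mean, loadings, score_var):
    """Hotelling T2 and Q statistic of one sample with respect to a PCA model."""
    xc = np.asarray(x, dtype=float) - mean
    t = xc @ loadings                       # scores: coordinates inside the model
    t2 = float(np.sum(t**2 / score_var))    # scaled squared distance from O'
    residual = xc - t @ loadings.T          # part of x not described by the model
    q = float(residual @ residual)          # squared distance from the model
    return t2, q

# Synthetic training set lying close to a line (assumption)
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
X = np.column_stack([x1, x1 + 0.1 * rng.normal(size=200)])
mean, P1, var1 = fit_pca(X, n_components=1)

p1 = P1[:, 0]                                    # PC1 direction
p_perp = np.array([-p1[1], p1[0]])               # direction orthogonal to PC1
far_along = mean + 5.0 * np.sqrt(var1[0]) * p1   # on the PC1 line, far from O'
off_model = mean + 1.0 * p_perp                  # projects on O', off the line

print(t2_and_q(far_along, mean, P1, var1))       # high T2, Q close to 0
print(t2_and_q(off_model, mean, P1, var1))       # T2 close to 0, Q close to 1
```

The first point is unusual within the model (only T2 detects it); the second breaks the correlation structure (only Q detects it), which is why both statistics are monitored.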
Multivariate data: number of variables >> 2
N = 142, J = 1738
1) Components decomposition: $Y_{N \times J} = T_{N \times J}\, P^{\top}_{J \times J}$
• Y: original experimental measurements
• T: scores matrix (new coordinates)
• P: loadings matrix (space rotation)
2) PCA model: $Y \approx \hat{Y}$
How many principal components A?
Cumulative variance explained by principal components
[Figure: explained variance (%) versus number of principal components (1 to 20)]
Explained variance = 95% with 16 components
1738 original variables (absorbances) → 16 principal components
Spectra can be well characterized using 16 PCs
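The choice of the number of components from the cumulative explained variance can be sketched as follows. This is an illustrative example, assuming NumPy and a synthetic data matrix with three hidden sources of variation; the function name n_components_for and the data are hypothetical, and the 95% target is the threshold quoted in the slide.

```python
import numpy as np

def n_components_for(Y, target=0.95):
    """Smallest number of PCs whose cumulative explained variance reaches target."""
    Yc = Y - Y.mean(axis=0)                     # mean-centre each variable
    s = np.linalg.svd(Yc, compute_uv=False)     # singular values, descending
    explained = s**2 / np.sum(s**2)             # variance fraction of each PC
    cumulative = np.cumsum(explained)
    A = int(np.searchsorted(cumulative, target) + 1)
    return A, cumulative

# Synthetic stand-in for the spectra matrix (assumption): 142 samples, 50 variables
rng = np.random.default_rng(2)
latent = rng.normal(size=(142, 3))              # 3 hidden sources of variation
mixing = rng.normal(size=(3, 50))
Y = latent @ mixing + 0.01 * rng.normal(size=(142, 50))

A, cumulative = n_components_for(Y, target=0.95)
print("components retained:", A)
print("cumulative explained variance:", cumulative[:5])
```

Applied to the real 142 x 1738 absorbance matrix, the same rule would be expected to return the 16 components reported above, although this has not been verified here.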
Determination of the T2 and Q statistics for each spectrum (Jackson, 1991)
T2 and Q limits (confidence level 95%)
T2lim = 31.13 (MacGregor, 1995)
Qlim = 109.6 (Jackson, 1979)
Determination of the (rectangular) region in which in-control samples have to fall
Synthetic chart, easy interpretation
False-positive samples
Joint region of a multivariate Gaussian distribution (limits more selective for outliers): ellipse
[Figure: Q versus T2 control chart with the T2lim and Qlim limits]
Auto-validation
New definition of normal operating region
[Figure: Q versus T2 control chart with the T2lim and Qlim limits]
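1. Gaussian test for T2 and Q:
• Q can be approximated as Gaussian
• T2 is not Gaussian
2. Non-linear transformation: T2 → T2bx
3. Confidence ellipse equation, with (x, y) = (T2bx, Q):
$\begin{bmatrix} x - \bar{x} & y - \bar{y} \end{bmatrix} V^{-1} \begin{bmatrix} x - \bar{x} \\ y - \bar{y} \end{bmatrix} = \text{const}$
[Figure: Q versus T2bx plots]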
New control region limits
Confidence limits:
• 95% and 99% (red)
• 100th percentile (green)
[Figure: Q versus T2bx with the confidence ellipses]
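A joint elliptical control region of this kind can be sketched as a test on the squared Mahalanobis distance. This is a minimal illustration under stated assumptions: the pair (T2bx, Q) is treated as bivariate Gaussian, V is taken to be the sample covariance of the training points, and the ellipse constant is the chi-square quantile with 2 degrees of freedom, which has the closed form -2 ln(1 - confidence). None of these choices is confirmed by the slides, and the function names and data are hypothetical.

```python
import math
import numpy as np

def fit_ellipse(points):
    """Centre and inverse covariance of an (n, 2) set of training points."""
    center = points.mean(axis=0)
    V = np.cov(points, rowvar=False)
    return center, np.linalg.inv(V)

def inside_ellipse(point, center, V_inv, confidence=0.95):
    """True if point falls inside the confidence ellipse."""
    d = np.asarray(point, dtype=float) - center
    d2 = float(d @ V_inv @ d)                    # squared Mahalanobis distance
    limit = -2.0 * math.log(1.0 - confidence)    # chi-square quantile, 2 dof
    return d2 <= limit

# Synthetic training statistics standing in for (T2bx, Q) pairs (assumption)
rng = np.random.default_rng(3)
train = rng.multivariate_normal([0.0, 100.0], [[1.0, 3.0], [3.0, 400.0]], size=2000)
center, V_inv = fit_ellipse(train)

print(inside_ellipse(center, center, V_inv))          # the centre is in control
print(inside_ellipse([10.0, 100.0], center, V_inv))   # far along x: out of control
```

Compared with a rectangular region built from two separate limits, the ellipse accounts for the correlation between the two statistics, which is what makes it more selective for outliers.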
Statistical control simulation: identification of out-of-control spectra
1. Load the FTIR spectrum
2. Projection on the PCA model (developed on the training set)
3. Calculation of the T2bx and Q statistics
4. Statistical control using the joint confidence region calculated on the training set
[Figure: joint confidence region]
$T_A = f(X)$, $Y \approx T_A P_A^{\top}$ $\Rightarrow$ $Y = f(X)$
• Linear model
• Goal: define the relationship Y-X
• Choice of significant variables
N = 142 samples
M = 11 experimental conditions (concentrations)
A = 16 scores (16 regression models)
Stepwise methods (Draper and Smith, 1998)
Identification of the most significant variables
• Model examples:
$t_1 = a_1 + b_{11}\,\text{soda} + b_{12}\,\text{non-ionic surfactant}$
$t_2 = a_2 + b_{21}\,\text{soda} + b_{22}\,\text{surfactants} + b_{23}\,\text{pH buffer} + b_{24}\,\mathrm{Na_2CO_3} + b_{25}\,\text{perfume}$
Influential variables: sodium hydroxide and surfactants
Non-influential variables: co-solvent (ethanol)
• Qualitative influence of compounds on spectra (figure labels): non-ionic surfactant, sodium hydroxide, anionic surfactant, amphoteric surfactant, sodium carbonate
Qualitative relationship between experimental conditions (X) and scores (TA):
• Solvent does not influence spectra
• Spectra depend on soda, surfactants and sodium carbonate
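The selection of significant concentrations for one score can be illustrated with a simplified forward-selection loop. This is a sketch only: it implements the forward step of a stepwise procedure with a fixed F-to-enter threshold of 4.0 (a common rule of thumb, assumed here), without the backward removal step of the full method described by Draper and Smith. The function name forward_stepwise and the synthetic data are hypothetical.

```python
import numpy as np

def forward_stepwise(X, t, f_to_enter=4.0):
    """Forward selection of the columns of X for a linear model of the score t."""
    n, m = X.shape

    def rss(cols):
        # Residual sum of squares of the least-squares fit with an intercept
        A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
        coef, *_ = np.linalg.lstsq(A, t, rcond=None)
        r = t - A @ coef
        return float(r @ r)

    selected = []
    current = rss(selected)
    while len(selected) < m:
        candidates = [j for j in range(m) if j not in selected]
        trial = {j: rss(selected + [j]) for j in candidates}
        best = min(trial, key=trial.get)        # largest reduction of the RSS
        dof = n - len(selected) - 2             # residual dof after adding best
        f_stat = (current - trial[best]) / (trial[best] / dof)
        if f_stat < f_to_enter:                 # no candidate is significant
            break
        selected.append(best)
        current = trial[best]
    return selected

# Synthetic example (assumption): 142 samples, 11 conditions, 2 truly influential
rng = np.random.default_rng(4)
X = rng.normal(size=(142, 11))
t = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=142)
print(forward_stepwise(X, t))
```

In the setting of the slides the loop would be run once per score, giving 16 regression models; variables that never enter any model, such as the co-solvent, are the non-influential ones.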
Development of general methods for statistical control:
• Spectra analysis and compression using PCA
• Variables reduction from 1738 to 16
• T2-Q control chart definition
• New different joint confidence region T2bx-Q
This work has been realized in cooperation with the
Procter & Gamble Research Centre in Pomezia (RM)