Efficient handling of raw material variation in industry
Tormod Naes
Ingunn Berget (post-doc)
Kjetil Joergensen (Ph.D. student)
Overview
• IBION-project
• The problem. Handling raw material variation
– Why?
– Different aspects
• Building relevant models
– Data/design + modelling
• Different types of use of the equation
• Robustness
• Sorting
• Continuous updating
• Combinations
• Robustness, validation
Industrial Biostatistics Network (IBION) www.ibion.no
Efficient use of raw materials in industry
The research project IBION (Industrial Biostatistics Network) is a consortium with partners from
• five Norwegian bio-processing companies (Tine, Ewos, Stabburet, Mills, Borregaard)
• the software and consulting company Prediktor
• three research institutes:
– Department of Mathematics, Agricultural University of Norway (AUN), Ås
– Matforsk, Norwegian Food Research Institute, Ås
– CPAC, University of Washington, Seattle
The project is financed by the industrial partners and by The Research Council of Norway (NFR).
It was started July 1, 2001 and will continue until July 1, 2005. The total budget for the project is about 40 million NOK.
Handling raw material variation
Why important?
• Raw materials vary in quality
• Raw material costs represent a large portion of the total costs
• Customers require good and stable quality
Handling raw material variation
• Adjust processes to account for unwanted raw material variation
– Stable, good quality
– Avoid waste
• Utilise potential in raw material
– Best raw materials for best product
– Good combinations
– Quality and cost
In all these cases
We would like to have a model or a strategy that tells us what to do when a batch of raw material is received and characterised:
“process = f(raw materials, target values of output)”
Important
• All serious industries have strategies and techniques for handling raw material variation (including the partners)
– Expert knowledge, practical experience
• We help them improve their strategies
– Using statistical/chemometric methods
Important criteria
• Methods should be easy to use and understand
• Results should be easy to interpret
• Methods should encourage user interaction
• Methods should be flexible/versatile/robust
• Realistic validation of results
Important steps
• Problem formulation
• Measurements: where? what type?
• Data collection
– Design or historical data?
• Modelling
– Type of model, variable selection
• Use of the models
– Interpretation, optimisation
• Properties, robustness, validation
• New round?
Collaboration
• Typical area for collaboration
– raw material knowledge
– process knowledge
– spectroscopy
– statistics
• Without close collaboration, no results!
Knowledge available
• A large number of useful components are available
• Experimental design
• Empirical modelling (polynomials)
• Variable selection
• Optimisation
• Validation
• etc.
• Little focus on this particular problem area
Raw material handling
• The best that can be done before processing starts
• Should be followed up by process monitoring and/or control strategies when appropriate.
Problem Description
Process parameters
Raw-materials
PROCESS
Final Product Quality
Time
Different approaches
• Robustness with respect to raw material variation
• Continuous updating of process settings
• Sorting of raw materials (define classes with corresponding optimal processing and with good properties in each)
• Combinations
• Useful for different situations, depending on local conditions
Two pieces of work done by students: one related to modelling, one related to use of models
• Design and analysis strategy for situations with uncontrolled raw material variation
– Jorgensen and Næs, (2004). J. Chem. (in press)
• Optimal sorting of raw materials based on the predicted end product quality.
– Berget and Næs (2002). Qual. Eng.
• Batch oriented
A possible strategy for model building
K. Jorgensen and T. Næs (2004). J. Chem. (in press)
Proposal for a strategy and a case study: cheese, dry matter
• End product quality = F(raw materials, process settings)
– In practice often used in inverted form
• Problems
– Raw materials are natural products, can not be controlled/designed
• How can we set up an experimental design?
– Raw material characterisation
• Time consuming
• Sometimes one does not know what to measure
Possible solution
• Design
– Block design with raw materials as blocks
• Measurements
– Spectroscopy, use principal components of spectra directly in model
– end product quality = F(PC’s, process)
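The idea of using principal components of the spectra as covariates next to the process factors can be sketched as follows. This is a minimal illustration on simulated data, not the analysis from the case study: the array names (`spectra`, `process`, `quality`), the sizes, the noise levels and the number of components are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 32 runs, 200 spectral channels, 4 two-level process factors.
n_runs, n_channels, n_factors, n_pcs = 32, 200, 4, 2

# Simulated raw material spectra driven by two latent properties (assumption).
latent = rng.normal(size=(n_runs, 2))
profiles = rng.normal(size=(2, n_channels))
spectra = latent @ profiles + 0.01 * rng.normal(size=(n_runs, n_channels))

# Coded process settings (-1/+1) and a simulated end product quality.
process = rng.choice([-1.0, 1.0], size=(n_runs, n_factors))
quality = (0.5 + 0.03 * process[:, 1] + 0.02 * latent[:, 0]
           + 0.001 * rng.normal(size=n_runs))

# PCA of the mean-centred spectra via SVD; scores are U * S.
centred = spectra - spectra.mean(axis=0)
u, s, vt = np.linalg.svd(centred, full_matrices=False)
scores = u[:, :n_pcs] * s[:n_pcs]

# Least squares fit of: end product quality = F(PCs, process).
design = np.column_stack([np.ones(n_runs), scores, process])
coef, *_ = np.linalg.lstsq(design, quality, rcond=None)
fitted = design @ coef
residual_sd = np.sqrt(np.sum((quality - fitted) ** 2) / (n_runs - design.shape[1]))

print("coefficients:", np.round(coef, 4))
print("residual SD:", round(float(residual_sd), 4))
```

No decision about which raw material property to measure is needed here: the scores summarise whatever varies in the spectra, and the fitted coefficients show which components matter for the response.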
Experiment
• 4 two-level design factors + 3 two-level block factors
– 7 factors in total
– 2^4 factorial design in 8 blocks
• The 4 design factors:
– protein content
– renneting time
– amount of starter culture added
– coagulum cutting
Design variable              Low (-) value    High (+) value
Protein content (A)          3.15%            3.50%
Renneting time (B)           “Standard”       “Standard” + 7 minutes
Starter culture added (C)    1.7%             2.2%
Coagulum cutting (D)         “Reduced”        “Standard”
Replicate #1                    Replicate #2
Run #  Block  A  B  C  D        Run #  Block  A  B  C  D
  1      1    -  -  -  -         17      5    +  +  +  -
  2      1    -  +  +  +         18      5    +  -  -  +
  3      1    +  -  +  +         19      5    -  -  +  -
  4      1    +  +  -  -         20      5    -  +  -  +
  5      2    +  +  +  +         21      6    -  -  -  +
  6      2    +  -  -  -         22      6    +  -  +  -
  7      2    -  -  +  +         23      6    +  +  -  +
  8      2    -  +  -  -         24      6    -  +  +  -
  9      3    -  +  -  +         25      7    -  +  +  +
 10      3    -  -  +  -         26      7    +  +  -  -
 11      3    +  +  +  -         27      7    -  -  -  -
 12      3    +  -  -  +         28      7    +  -  +  +
 13      4    -  -  -  +         29      8    +  +  +  +
 14      4    -  +  +  -         30      8    -  +  -  -
 15      4    +  -  +  -         31      8    -  -  +  +
 16      4    +  +  -  +         32      8    +  -  -  -
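The block structure of the run table can be checked with a few lines of code. In the table, the four runs in each block share the same signs of the ABC and CD interactions, so the blocks within a replicate look as if they were formed by confounding ABC and CD (and hence ABD) with blocks. The sketch below rebuilds the blocks under that assumption; the confounding pattern is inferred from the table and is not stated in the slides.

```python
from itertools import product

# Full 2^4 factorial in coded units: factors A, B, C, D at -1/+1.
runs = list(product([-1, 1], repeat=4))


def block_key(run):
    """Signs of the ABC and CD interactions (assumed block generators)."""
    a, b, c, d = run
    return (a * b * c, c * d)


# Group the 16 runs of one replicate into blocks by the generator signs.
blocks = {}
for run in runs:
    blocks.setdefault(block_key(run), []).append(run)

for key, members in sorted(blocks.items()):
    signs = ["".join("+" if v > 0 else "-" for v in run) for run in members]
    print(f"ABC={key[0]:+d}, CD={key[1]:+d}: {signs}")
```

Two replicates of four blocks give the 8 blocks of the experiment. With CD confounded with blocks, five of the six two-factor interactions remain estimable, which agrees with the 5 degrees of freedom for 2-way interactions in the ANOVA tables.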
ANOVA table
Source                DF    Adjusted SS    Adjusted MS    F        P-value
Blocks                 7    0.0334         0.00478         4.38    0.008
Main Effects           4    0.0523         0.01308        12.00    0.000
2-Way Interactions     5    0.0010         0.00020         0.19    0.963
Residual Error        15    0.0163         0.00109
Total                 31
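As a quick check on how the columns of the ANOVA table above relate: each mean square is the sum of squares divided by its degrees of freedom, and each F-ratio is that mean square divided by the residual mean square. The sketch recomputes them from the rounded sums of squares printed in the table, so the last digit can differ slightly from the printed values.

```python
# Sums of squares and degrees of freedom as printed (rounded) in the table.
table = {
    "Blocks": (0.0334, 7),
    "Main Effects": (0.0523, 4),
    "2-Way Interactions": (0.0010, 5),
}
residual_ss, residual_df = 0.0163, 15

residual_ms = residual_ss / residual_df
f_ratios = {}
for source, (ss, df) in table.items():
    ms = ss / df                          # mean square = SS / DF
    f_ratios[source] = ms / residual_ms   # F = MS / residual MS
    print(f"{source:20s} MS={ms:.5f} F={f_ratios[source]:.2f}")
```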
Estimated regression coefficients and p-values for individual main effects
Term              Coefficient    SE of coeff.    P-value
Protein            0.003         0.0058          0.565
Starter           -0.007         0.0058          0.237
Renneting time     0.037         0.0058          0.000
Cutting            0.015         0.0058          0.019
ANOVA table
Source                DF    Adjusted SS    Adjusted MS    F        P-value
Covariates             6    0.0328         0.00547         5.16    0.004
Main Effects           4    0.0352         0.00879         8.29    0.001
2-Way Interactions     5    0.0014         0.00029         0.27    0.922
Residual Error        16    0.0170         0.00106
Total                 31

Estimated regression coefficients and p-values for individual main effects and principal components
Term              Coefficient    SE of coeff.    P-value
PC 1              -0.093         0.1202          0.449
PC 2               0.026         0.0061          0.001
PC 3               0.024         0.0285          0.420
PC 4               0.023         0.0087          0.017
PC 5               0.009         0.0076          0.239
PC 6              -0.007         0.0081          0.436
Protein            0.099         0.1216          0.426
Starter           -0.007         0.0060          0.291
Renneting time     0.030         0.0063          0.000
Cutting            0.017         0.0064          0.018
Same level of residual standard deviation
ANOVA table
Source                DF    Adjusted SS    Adjusted MS    F        P-value
Covariates             2    0.0304         0.01519        15.65    0.000
Main Effects           4    0.0410         0.01024        10.56    0.000
2-Way Interactions     5    0.0013         0.00026         0.26    0.928
Residual Error        20    0.0194         0.00097
Total                 31

Estimated regression coefficients and p-values for individual main effects and principal components
Term              Coefficient    SE of coeff.    P-value
PC 2               0.026         0.0058          0.000
PC 4               0.018         0.0062          0.007
Protein            0.004         0.0055          0.455
Starter           -0.007         0.0057          0.241
Renneting time     0.033         0.0057          0.000
Cutting            0.016         0.0057          0.010
[Figure: Residuals plotted against fitted values. x-axis: Fitted Value (0.45 to 0.65); y-axis: Residual (-0.05 to 0.05)]
FT-IR spectra
[Figure: Absorbance (-0.4 to 0.6) plotted against Wavelength (cm-1), 1000 to 3000]
Loadings for PC2 and PC4
[Figure, two panels: Loading PC2 and Loading PC4 (both -0.2 to 0.15) plotted against Wavelength (cm-1), 1000 to 3000]
PLS instead of PCA
• Preliminary investigations based on simulations and real data indicate
– PLS on the residuals (iteratively) gives better interpretability of spectral information (fewer components)
Conclusions
• Block designs, with raw materials as blocks, combined with rapid characterisation of the raw materials: a useful tool for model building.
– (verified also in other experiments)
• Using PC’s of spectra is flexible and does not need decisions about which properties to measure. Interpretation is important
• First step towards a good model
– Possible to interpret
– Model is verified
• May need to be extended or fine-tuned by extra experiments to incorporate nonlinearities etc.
• Best possible combination of process variables and blocks? Research!
• Combinations of collinear spectral data and factors. Research going on!
Utilising equations for process improvements
• Some ideas, possibilities, feasibility
– not finished industrial implementations
• General goal: adjust process after measurement of raw material quality
– Optimal for each batch
– Optimal, but robust with respect to certain types of noise
– A simpler strategy: Identify a small number of homogeneous raw material classes and their corresponding optimal process settings
• robust with respect to measurement error of raw material measurements
• simple in use (if difficult to change process)
• well suited for situations where it is possible to sort.
– Receive raw materials, sort, store in bins and process from same bin
Goal: Reduce the effect of variable raw material quality on end product quality
Procedure:
Identify optimal classes with corresponding processing conditions, using cluster analysis.
After identification: measure raw material and put in best class (with known processing conditions).
Industrial process with sorting
RAW MATERIALS
PROCESS 1
END PRODUCT
“POOR QUALITY” “GOOD QUALITY”
PROCESS 2
Model: Predicted quality = Raw material + Process
Sorting: Predicted quality depends on category
Objective: Minimise (Predicted quality - Target)^2 for all objects in all categories
Model: $\hat{y} = f(x, z)$
Sorting: $\hat{y}_{ij} = f(x_j, z_i)$
i = object index (i = 1,…,n)
j = group index (j = 1,…,C)
y = end product quality
x = process variables
z = raw material variables
n = number of objects
C = number of categories
T = target
Distance between objects and groups = the loss from object i when it is allocated to category j:
$d_{ij}^2 = (\hat{y}_{ij} - T)^2$
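A minimal sketch of the loss d_ij^2 for a hard allocation: each object i is evaluated under the process setting of each category j and put in the category with the smallest squared deviation from target. The model `f`, its coefficients, the two process settings and the raw material values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def f(x, z):
    """Hypothetical prediction model: quality from process setting x and raw material z."""
    return 300.0 + 5.0 * x + 15.0 * z


T = 530.0                             # target value
z_values = [10.5, 11.5, 12.5, 13.5]   # raw material measurements, objects i
x_settings = [12.0, 8.0]              # process setting of each category j

# d2[i][j] = (y_hat_ij - T)^2 with y_hat_ij = f(x_j, z_i)
d2 = [[(f(x, z) - T) ** 2 for x in x_settings] for z in z_values]

# Hard allocation: each object goes to the category with the smallest loss.
allocation = [row.index(min(row)) for row in d2]
total_loss = sum(row[j] for row, j in zip(d2, allocation))

print("allocation:", allocation)
print("total loss:", total_loss)
```

With these made-up numbers the two low values of z are allocated to the first setting and the two high values to the second.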
Optimal sorting of raw materials, based on the predicted end product quality
Paper I
Fuzzy clustering
• Fuzzy clustering as strategy for finding groups
– Flexible with respect to distance
– Easy to implement
– Good convergence properties
• Gives a quantitative description of how well each object fits in each cluster
• Membership values
– Numbers between 0 and 1
– Sum up to 1 for each object
– Relative numbers
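The interplay between membership values and process settings can be sketched with a fuzzy c-means style alternation: memberships are updated from the losses d_ij^2, then each category's process setting is updated to minimise the membership-weighted loss. This is an illustration under stated assumptions, not the algorithm from the papers: the linear model `f`, the fuzzifier `m = 2`, the grid search over settings and the starting values are all hypothetical.

```python
def f(x, z):
    """Hypothetical linear prediction model (for illustration only)."""
    return 300.0 + 5.0 * x + 15.0 * z


T, m, eps = 530.0, 2.0, 1e-12
z_values = [10.2 + 0.1 * i for i in range(42)]   # raw material grid, 10.2 to 14.3
grid = [5.0 + 0.1 * k for k in range(201)]       # candidate process settings
x_opt = [8.0, 12.0]                              # starting settings, C = 2


def memberships(settings):
    """Standard fuzzy c-means membership update from the losses d_ij^2."""
    U = []
    for z in z_values:
        d2 = [(f(x, z) - T) ** 2 + eps for x in settings]
        U.append([1.0 / sum((d2[j] / d2[k]) ** (1.0 / (m - 1.0))
                            for k in range(len(settings)))
                  for j in range(len(settings))])
    return U


def weighted_loss(x, j, U):
    """Membership-weighted loss of category j when processed at setting x."""
    return sum(U[i][j] ** m * (f(x, z) - T) ** 2 for i, z in enumerate(z_values))


for iteration in range(100):
    U = memberships(x_opt)
    new_x = [min(grid, key=lambda x, j=j: weighted_loss(x, j, U))
             for j in range(len(x_opt))]
    if new_x == x_opt:      # settings no longer change: stop
        break
    x_opt = new_x

U = memberships(x_opt)
print("optimal settings:", [round(x, 1) for x in x_opt])
```

Each row of `U` holds the membership values of one object: numbers between 0 and 1 that sum to 1, so they describe how well the object fits each category relative to the others.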
[Diagram: inputs and outputs of the fuzzy cluster analysis]
Inputs:
• Model ŷ = f(x, z), from experimental data X, Y, Z_Exp
• Data Z ~ p(z)
• Target, T
• Number of categories (C)
Outputs:
• Optimal process settings x^o_1, x^o_2, …, x^o_C
• Membership values U = {u_ij}
Example: Baking of hearth bread
• Data taken from a study of baking process and flour quality*
– 10 flour blends
– 3 levels of mixing and proofing time ( = resting time after dough has been shaped)
– 90 combinations of flours and baking process
– Response: bread loaf volume
*Færgestad, E. M. et al. Influence of flour quality and baking process on hearth bread characteristics using gentle mixing. Journal of Cereal Science 1999, 30 61-70.
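Input to cluster analysis
• Model
– Second-order polynomial with terms in x1, x2, z, x1^2, x2^2, x1·z and x2·z:
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 z + b_{11} x_1^2 + b_{22} x_2^2 + b_{13} x_1 z + b_{23} x_2 z$
[fitted coefficient values and signs not recoverable from the slide]
– z: protein
– x1: mixing time
– x2: proofing time
– y: volume
• Target
– T = 530 mL
• Raw material data
– 100 equally spaced points within experimental region (10.2 - 14.3% protein)
• Number of groups
– C = 2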
Membership values and loss
[Figure, two panels plotted against protein content (%), 10 to 14.5. Panel 1: membership values (0 to 1) for Category 1 and Category 2. Panel 2: loss (0 to 3000) with two groups and without sorting.]
                   Mixing    Proofing    Average loss
p < 12%            11.7      61.3        199.1 (two groups together)
p > 12%             7.6      51.9
Without sorting     8.7      54.4        705.9
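The size of the improvement follows directly from the two average losses in the table. A short calculation, using only those two numbers:

```python
# Average losses taken from the table above.
loss_sorted, loss_unsorted = 199.1, 705.9

# Relative reduction in average loss when sorting into two groups.
reduction = 1.0 - loss_sorted / loss_unsorted
print(f"relative reduction in average loss: {reduction:.1%}")
```

In this example, sorting into two groups removes roughly 72% of the average loss compared with a single process setting for all flours.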
Results
• Optimal process settings
• Convergence properties
– With two groups the clustering algorithm converges in fewer than 20 iterations.
– More iterations needed with more groups.
– No sensitivity to different initialisations.
… but what about bread shape?
Volume = 354 ml
Form ratio = 0.64 (height/width)
Volume = 352 ml
Form ratio = 0.52 (height/width)
Sorting of raw materials with focus on multiple end-product properties
• Product quality is often defined by several product characteristics
• Different responses have different optima
– Example: volume and form ratio
– longer proofing times give larger, but flatter breads
Paper II
Suggested approaches
• Optimise one response under constraints on the others
– Convergence problems in investigated example
• Weighted squared loss
– Use weights to prioritise responses
• Desirability functions
– Functions of predicted product quality
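The last two approaches can be sketched for the two responses of the bread example, volume and form ratio. The weighted squared loss sums the weighted squared deviations from target over the responses; the desirability version maps each prediction to a value between 0 and 1 and combines them by a geometric mean, in the style commonly attributed to Derringer and Suich. All predictions, targets, weights and limits below are hypothetical, and the linear two-sided desirability shape is an assumption.

```python
def weighted_squared_loss(pred, target, weight):
    """Sum over responses of weight * (prediction - target)^2."""
    return sum(weight[k] * (pred[k] - target[k]) ** 2 for k in pred)


def desirability_target(y, low, target, high):
    """Two-sided 'target is best' desirability in [0, 1] (linear version)."""
    if y <= low or y >= high:
        return 0.0
    if y <= target:
        return (y - low) / (target - low)
    return (high - y) / (high - target)


def overall_desirability(values):
    """Geometric mean of the individual desirabilities."""
    result = 1.0
    for d in values:
        result *= d
    return result ** (1.0 / len(values))


# Hypothetical predictions, targets and weights for one object in one category.
pred = {"volume": 520.0, "form_ratio": 0.60}
target = {"volume": 530.0, "form_ratio": 0.65}
weight = {"volume": 1.0, "form_ratio": 1.0e4}   # weights also absorb the scale difference

loss = weighted_squared_loss(pred, target, weight)
d_volume = desirability_target(pred["volume"], 450.0, 530.0, 610.0)
d_form = desirability_target(pred["form_ratio"], 0.50, 0.65, 0.80)
overall = overall_desirability([d_volume, d_form])

print("weighted squared loss:", round(loss, 2))
print("overall desirability:", round(overall, 3))
```

Either quantity can take the place of the single-response loss in the sorting criterion; for the desirability, one would maximise it, or equivalently minimise one minus the desirability.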
Alternatively
RAW MATERIALS
PROCESS 1
PRODUCT 1
POOR QUALITY GOOD QUALITY
PROCESS 2
PRODUCT 2
Assessing robustness
• Various approaches exist
– Box et al., optimisation of polynomial models
– Bootstrapping (parametric) for assessing robustness
• Estimate model.
• Simulate data from the model
• Repeat optimisations.
• Visualise the optimal points
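The four bootstrap steps above can be sketched for a single coded process variable. Everything specific here is an assumption made for illustration: a quadratic model in one variable, a balanced three-level design with 10 replicates per level, normally distributed errors, and "true" coefficients used only to generate the example data.

```python
import random
from statistics import mean, stdev

random.seed(1)
levels = [-1.0, 0.0, 1.0]   # coded three-level process variable
n_rep = 10                  # replicates per level (assumption)


def simulate(b0, b1, b2, sigma):
    """Simulate responses from a quadratic model with normal noise."""
    return {x: [b0 + b1 * x + b2 * x * x + random.gauss(0.0, sigma)
                for _ in range(n_rep)]
            for x in levels}


def fit(data):
    """Least squares quadratic fit; with a balanced 3-level design
    the fitted curve passes through the three level means."""
    lo, mid, hi = (mean(data[x]) for x in levels)
    b0, b1, b2 = mid, (hi - lo) / 2.0, (hi + lo) / 2.0 - mid
    resid = [y - (b0 + b1 * x + b2 * x * x) for x in levels for y in data[x]]
    sigma = (sum(r * r for r in resid) / (len(resid) - 3)) ** 0.5
    return b0, b1, b2, sigma


def optimum(b1, b2):
    """Stationary point of the fitted quadratic (a maximum when b2 < 0)."""
    return -b1 / (2.0 * b2)


# Step 1: estimate the model (here from simulated "experimental" data).
observed = simulate(b0=500.0, b1=10.0, b2=-20.0, sigma=5.0)
b0, b1, b2, sigma = fit(observed)

# Steps 2 and 3: simulate from the fitted model, refit, repeat the optimisation.
boot_optima = []
for _ in range(500):
    _, c1, c2, _ = fit(simulate(b0, b1, b2, sigma))
    boot_optima.append(optimum(c1, c2))

# Step 4: summarise (or plot) the spread of the optimal points.
print("estimated optimum:", round(optimum(b1, b2), 3))
print("bootstrap SD of optimum:", round(stdev(boot_optima), 3))
```

A tight cloud of bootstrap optima suggests that the recommended setting is robust to estimation error in the coefficients; a wide or elongated cloud shows in which direction the optimum is poorly determined.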
[Figure, two contour-plot panels: proofing time (35 to 60) against mixing time (5 to 25), with contour levels from 20 to 120. The first panel is labelled with the point [13.5, 60.0], the second with the point [7.7, 52.3].]
Can also be used in other situations
• Robustness of robust process optimization
– Mevik (2003) Qual. Eng.
– Some variables controlled, robustness to others
– robustness to model and target uncertainty
• Product and process improvement using mixture-process variable design and robust optimization techniques.
– Sahni, Piepel and Næs (2004), in prep.
– Some variables controlled, robustness to others (mixture-process)
– robustness to coefficient and model selection uncertainty
Robustness of prediction sorting
Paper IV
• Results
– Memberships and optimal process settings are variable when regression coefficients are uncertain.
– Misclassification rate due to variable membership values is small.
– Average error in predicted response due to variable optimal process settings is small compared to the prediction error
• Indicates that prediction sorting is rather robust to random error in the regression coefficients.
• Paper submitted
List of papers
Paper I: I. Berget and T. Næs. “Optimal sorting of raw materials, based on the predicted end-product quality”. Quality Engineering (2002) 14 (3) 459-478.
Paper II: I. Berget and T. Næs. “Sorting of raw materials with focus on multiple end-product properties”. Journal of Chemometrics (2002) 16 263-273.
Paper III: I. Berget, A. Aamodt, E. M. Færgestad and T. Næs. “Optimal sorting of raw materials for use in different products”. Chemometrics and Intelligent Laboratory Systems (in press).
Paper IV: I. Berget and T. Næs. “Robustness of prediction sorting”. (submitted)
Conclusions
• Fuzzy clustering combined with a suitable distance measure can be used for sorting
• Robust splitting and reasonably robust process settings
• Clear improvements over non-sorting
• Method can be extended to multivariate data and can be penalised
• Bootstrap can be used for evaluating robustness