On Sequential Experimental Design for Empirical Model-Building
under Interval Error
Sergei Zhilin
Altai State University, Barnaul, Russia
Outline
• Regression under interval error
• Experimental design: refining context
• Classical and “interval” design optimality criteria
• Sequential experimental design for regression models under interval error
• Comparative simulation study of classical and “interval” sequential design procedures
• Conclusions
Regression under Interval Error
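• Model structure: $y = x^T\beta + \varepsilon$
– Input variables $x = (x_1, \dots, x_p)^T$ are measured without error
– Output variable $y$ is measured with error
– $x^T\beta$ is the linear-parameterized modeling function
– $\beta$ are the model parameters to be estimated; $\varepsilon$ is the measurement error
• “Interval” error means “unknown but bounded”: $\varepsilon \in [-\bar{\varepsilon}, \bar{\varepsilon}]$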
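• Each row $(x_j, y_j, \bar{\varepsilon}_j)$ of the measurements table constrains possible values of the parameter $\beta$ with the set
$S_j = \{\beta : y_j - \bar{\varepsilon}_j \le x_j^T\beta \le y_j + \bar{\varepsilon}_j\}, \quad j = 1, \dots, n.$
• Values of the parameter $\beta$ consistent with all constraints form an uncertainty set
$A = \bigcap_{j=1}^{n} S_j$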
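The membership condition behind the uncertainty set can be sketched in a few lines of Python. This is only an illustration: the function name `in_uncertainty_set` and the three-row dataset are hypothetical and not taken from the slides.

```python
def in_uncertainty_set(beta, X, Y, E, tol=1e-12):
    """Return True if beta satisfies |y_j - x_j^T beta| <= eps_j for every row j."""
    for x, y, eps in zip(X, Y, E):
        prediction = sum(xi * bi for xi, bi in zip(x, beta))
        if abs(y - prediction) > eps + tol:
            return False  # beta violates the strip S_j of this row
    return True

# Hypothetical measurements for the model y = b1 + b2*x (rows are (1, x)).
X = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
Y = [1.1, 2.8, 5.2]
E = [0.4, 0.4, 0.4]  # error bound of each measurement

print(in_uncertainty_set((1.0, 2.0), X, Y, E))  # consistent with all rows
print(in_uncertainty_set((2.0, 2.0), X, Y, E))  # violates the bounds
```

A parameter vector belongs to A exactly when it passes this check for every row, which is the intersection of the strips S_j.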
• Fitting data with the model $y = \beta_1 + \beta_2 x$
[Figure: the set of feasible models in the $(x, y)$ domain and the uncertainty set $A$ in the $(\beta_1, \beta_2)$ domain.]
Uncertainty set $A$ is unbounded = not enough data to build the model
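• Problems that may be stated with respect to uncertainty set $A$
• Model parameters estimation
– Interval estimates of $\beta_1, \dots, \beta_p$: $\mathrm{IA} = [\underline{\beta}_1, \overline{\beta}_1] \times \dots \times [\underline{\beta}_p, \overline{\beta}_p]$, where
$\underline{\beta}_i = \min_{\beta \in A} \beta_i, \quad \overline{\beta}_i = \max_{\beta \in A} \beta_i, \quad i = 1, \dots, p.$
– Point estimates of $\beta$: $\hat{\beta}_i = \tfrac{1}{2}(\underline{\beta}_i + \overline{\beta}_i), \quad i = 1, \dots, p.$
[Figure: the uncertainty set $A$ in the $(\beta_1, \beta_2)$ plane with its interval hull and the point estimate $(\hat{\beta}_1, \hat{\beta}_2)$.]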
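For two parameters, the interval and point estimates can be sketched by enumerating the vertices of A. The sketch assumes p = 2 and a bounded A; the helper names (`halfplanes`, `vertices`, `interval_estimates`) and the data are hypothetical. For larger p, each bound is a linear program that a solver would handle.

```python
from itertools import combinations

def halfplanes(X, Y, E):
    """Rewrite each strip |y - x.b| <= eps as two half-planes h1*b1 + h2*b2 <= h3."""
    H = []
    for (x1, x2), y, eps in zip(X, Y, E):
        H.append((x1, x2, y + eps))
        H.append((-x1, -x2, eps - y))
    return H

def vertices(X, Y, E, tol=1e-9):
    """Vertices of a bounded 2-D set A: feasible intersections of boundary lines."""
    H = halfplanes(X, Y, E)
    V = []
    for (a1, a2, c), (b1, b2, d) in combinations(H, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel boundary lines never meet
        p = ((c * b2 - a2 * d) / det, (a1 * d - c * b1) / det)
        if all(h1 * p[0] + h2 * p[1] <= h3 + tol for h1, h2, h3 in H):
            V.append(p)
    return V

def interval_estimates(V):
    """Interval hull [lo, hi] of A and its midpoint used as the point estimate."""
    lo = [min(v[i] for v in V) for i in range(2)]
    hi = [max(v[i] for v in V) for i in range(2)]
    mid = [(l + h) / 2 for l, h in zip(lo, hi)]
    return lo, hi, mid

# Hypothetical measurements for y = b1 + b2*x with error bound 0.4.
X = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
Y = [1.1, 2.8, 5.2]
E = [0.4, 0.4, 0.4]
lo, hi, mid = interval_estimates(vertices(X, Y, E))
print("interval estimates:", list(zip(lo, hi)))
print("point estimates:", mid)
```

Minimizing or maximizing a single coordinate over a bounded polytope is attained at a vertex, which is why scanning the vertex list is enough here.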
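• Problems that may be stated with respect to uncertainty set $A$ (continued)
• Prediction of the output variable value for fixed values of input variables $x$
– Interval estimate of $y$: $\mathbf{y}(x) = [\underline{y}(x), \overline{y}(x)]$, where
$\underline{y}(x) = \min_{\beta \in A} x^T\beta, \quad \overline{y}(x) = \max_{\beta \in A} x^T\beta$
– Point estimate of $y$: $\hat{y}(x) = \tfrac{1}{2}\big(\underline{y}(x) + \overline{y}(x)\big)$
[Figure: the prediction corridor $[\underline{y}(x), \overline{y}(x)]$ and the point prediction $\hat{y}(x)$ in the $(x, y)$ domain.]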
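When A is a bounded polytope, the prediction bounds are attained at its vertices, so the interval and point predictions can be sketched as below. The function name `predict_interval` and the vertex list are hypothetical; the vertices are assumed to have been computed beforehand.

```python
def predict_interval(x, V):
    """Lower bound, upper bound and midpoint of x^T beta over the vertices V of A."""
    values = [sum(xi * bi for xi, bi in zip(x, v)) for v in V]
    y_lo, y_hi = min(values), max(values)
    return y_lo, y_hi, 0.5 * (y_lo + y_hi)

# Hypothetical vertices of a rectangular uncertainty set in the (b1, b2) plane.
V = [(0.6, 1.6), (1.4, 1.6), (1.4, 2.4), (0.6, 2.4)]

# Prediction for the model y = b1 + b2*x at x = 0.5, i.e. input vector (1, 0.5).
y_lo, y_hi, y_hat = predict_interval((1.0, 0.5), V)
print(y_lo, y_hi, y_hat)
```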
Experimental Design: Refining Context
• Product or process optimization
• Model quality optimization
– Simultaneous experimental design: Begin → Design for N observations → Experiment → Analysis → End
– Sequential experimental design: Begin → Design for ~1 observation → Experiment → Analysis (Is the model quality satisfactory?) → if not, design the next observation; otherwise End
Experimental Design for Regression under Interval Error
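• Notations
– model: $y = x^T\beta + \varepsilon$
– design space: $x \in \mathcal{X} \subseteq \mathbb{R}^p$
– design matrix: $X = (x_1, \dots, x_n)^T$
– measurements: $Y = (y_1, \dots, y_n)^T$
– error bounds: $E = (\bar{\varepsilon}_1, \dots, \bar{\varepsilon}_n)^T$
– information matrix: $M = X^T X$
– covariance matrix: $D = M^{-1}$
– standardized variance function of $y(x, \beta)$: $d(x) = x^T D x$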
• Design optimality criteria
– Classical (name: what it minimizes)
    D-optimality: $\det D$ (volume of the joint confidence region)
    G-optimality: $\max_{x \in \mathcal{X}} d(x)$ (maximal variance of prediction)
  where $D = (X^T X)^{-1}$ and $d(x) = x^T D x$. Both depend only on $X$, hence are applicable for interval error as well.
– Interval (by M.P. Dyvak) (name: what it minimizes)
    ID-optimality: squared volume of $A$
    IE-optimality: squared maximal diagonal of $A$
    IG-optimality: maximal prediction error
  IE- and IG-optimality are equivalent for a spherical design space and $n > p$
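The two classical criteria can be evaluated directly from the design matrix. The sketch below assumes p = 2 and a unit-circle design space scanned on an angular grid; all function names and the example design are hypothetical.

```python
import math

def information_matrix(X):
    """M = X^T X for a design with two input variables."""
    m11 = sum(x1 * x1 for x1, _ in X)
    m12 = sum(x1 * x2 for x1, x2 in X)
    m22 = sum(x2 * x2 for _, x2 in X)
    return ((m11, m12), (m12, m22))

def det_2x2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inverse_2x2(M):
    det = det_2x2(M)
    return ((M[1][1] / det, -M[0][1] / det), (-M[1][0] / det, M[0][0] / det))

def variance_function(x, D):
    """d(x) = x^T D x."""
    x1, x2 = x
    return x1 * (D[0][0] * x1 + D[0][1] * x2) + x2 * (D[1][0] * x1 + D[1][1] * x2)

def g_criterion(D, steps=360):
    """Maximal d(x) over the unit circle, approximated on an angular grid."""
    angles = (2 * math.pi * k / steps for k in range(steps))
    return max(variance_function((math.cos(t), math.sin(t)), D) for t in angles)

# Hypothetical design: two replicated orthogonal points on the unit circle.
X = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
D = inverse_2x2(information_matrix(X))
print("D-criterion det(D):", det_2x2(D))
print("G-criterion max d(x):", g_criterion(D))
```

Neither quantity involves Y or E, which is the point the slide makes about classical criteria.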
• Motivation
– Classical methods of experimental design use only the information that X carries, and neither Y nor E.
– Interval methods of experimental design developed by Dyvak work for saturated designs (p = n) and use X and E, but not Y.
– Does using the information contained in Y improve the quality of the constructed model or increase the “speed” of the sequential experimental design procedure?
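• How to use the information which $Y$ brings?
$x_{\text{next}} = \mathrm{IEDesign}(\mathcal{X}, X, Y, E)$
1. Find the direction $a$ of maximal spread of the uncertainty set $A(X, Y, E)$:
$\{\beta_1^*, \beta_2^*\} = \arg\max_{\beta_1, \beta_2 \in A} \|\beta_1 - \beta_2\|, \quad a = \beta_1^* - \beta_2^*$
(here $\beta_1$ and $\beta_2$ denote two points of $A$, not components of $\beta$)
2. The next experimental point $x_{\text{next}}$ is selected in such a way that it
• induces the constraint orthogonal to $a$
• has maximal norm (width of constraint $w = 2\bar{\varepsilon} / \|x_{\text{next}}\|$):
$x_{\text{next}} = k^* a, \quad k^* = \arg\max_{k \in \mathbb{R},\ ka \in \mathcal{X}} |k|$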
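The two steps can be sketched for p = 2 under two assumptions that go beyond the slides: A is given by its vertex list, and the design space is the unit disk, so that the maximal-norm point along a has length 1. The function name `ie_design` and the vertex data are hypothetical.

```python
import math
from itertools import combinations

def ie_design(V):
    """Next point for a unit-disk design space, given the vertices V of A."""
    # Step 1: the longest diagonal of A joins two of its vertices.
    p, q = max(combinations(V, 2), key=lambda pq: math.dist(pq[0], pq[1]))
    a = (p[0] - q[0], p[1] - q[1])
    # Step 2: x_next = k* a with |k*| as large as the unit disk allows, i.e. 1/||a||.
    norm = math.hypot(a[0], a[1])
    return (a[0] / norm, a[1] / norm)

# Hypothetical uncertainty set: a parallelogram in the (b1, b2) plane.
V = [(0.0, 0.0), (2.0, 0.0), (3.0, 1.0), (1.0, 1.0)]
x_next = ie_design(V)
eps = 0.4
width = 2 * eps / math.hypot(x_next[0], x_next[1])  # w = 2*eps / ||x_next||
print(x_next, width)
```

The new measurement adds a strip whose boundary lines are orthogonal to a, so it cuts A across its longest diagonal. Because the point has the largest admissible norm, the strip is as narrow as the design space permits.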
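• IE-optimal sequential design
($X_0$, $Y_0$, $E_0$) – initial dataset
i = 0;
repeat
    x = IEDesign($\mathcal{X}$, $X_i$, $Y_i$, $E_i$);
    y = measurement in x with error $\bar{\varepsilon}$;
    $X_{i+1} = \begin{pmatrix} X_i \\ x^T \end{pmatrix}$; $Y_{i+1} = \begin{pmatrix} Y_i \\ y \end{pmatrix}$; $E_{i+1} = \begin{pmatrix} E_i \\ \bar{\varepsilon} \end{pmatrix}$;
    i = i + 1;
until i > N or IA($X_i$, $Y_i$, $E_i$) is small;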
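The loop can be sketched end to end for p = 2. Everything beyond the loop structure is an assumption made for illustration: the unit-disk design space, vertex enumeration as the way to find the longest diagonal, the interval-hull volume as the measure of "small", and a simulated `measure` callback with uniform noise.

```python
import math
import random
from itertools import combinations

def vertices(X, Y, E, tol=1e-9):
    """Vertices of a bounded 2-D uncertainty set (feasible line intersections)."""
    H = []
    for (x1, x2), y, eps in zip(X, Y, E):
        H += [(x1, x2, y + eps), (-x1, -x2, eps - y)]
    V = []
    for (a1, a2, c), (b1, b2, d) in combinations(H, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue
        p = ((c * b2 - a2 * d) / det, (a1 * d - c * b1) / det)
        if all(h1 * p[0] + h2 * p[1] <= h3 + tol for h1, h2, h3 in H):
            V.append(p)
    return V

def hull_volume(V):
    """Volume of the interval hull IA of the set with vertices V."""
    return ((max(v[0] for v in V) - min(v[0] for v in V))
            * (max(v[1] for v in V) - min(v[1] for v in V)))

def ie_design(V):
    """Unit-norm point along the longest diagonal of A (unit-disk design space)."""
    p, q = max(combinations(V, 2), key=lambda pq: math.dist(pq[0], pq[1]))
    a = (p[0] - q[0], p[1] - q[1])
    n = math.hypot(a[0], a[1])
    return (a[0] / n, a[1] / n)

def sequential_ie_design(X0, Y0, E0, measure, eps, n_max, vol_tol):
    X, Y, E = list(X0), list(Y0), list(E0)
    i = 0
    while True:  # repeat ... until
        x = ie_design(vertices(X, Y, E))
        y = measure(x)  # measurement in x with error bounded by eps
        X.append(x); Y.append(y); E.append(eps)
        i += 1
        if i > n_max or hull_volume(vertices(X, Y, E)) <= vol_tol:
            return X, Y, E

# Hypothetical experiment: true beta = (1, 2), bound 0.4, uniform noise (assumption).
rng = random.Random(0)
beta_true, eps = (1.0, 2.0), 0.4
measure = lambda x: x[0] * beta_true[0] + x[1] * beta_true[1] + rng.uniform(-eps, eps)
X0, Y0, E0 = [(1.0, 0.0), (0.0, 1.0)], [1.0, 2.0], [eps, eps]
X, Y, E = sequential_ie_design(X0, Y0, E0, measure, eps, n_max=9, vol_tol=0.05)
print(len(X), hull_volume(vertices(X, Y, E)))
```

The initial dataset must already make A bounded (two non-parallel rows here). Otherwise the vertex-based direction search has nothing to work with.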
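• Simulation study 1. Comparison of IE- and D-optimal sequential designs under zero errors
Setup: $\mathcal{X} = \{x \in \mathbb{R}^2 : x^T x \le 1\}$, $\beta = (1, 2)^T$, $\bar{\varepsilon} = 0.4$; the entries of the initial design matrix $X_0$ are garbled in the source (31.049.024.059.061.0260).
IE-optimal sequential design:
    i = 0
    repeat
        $Y_i = X_i \beta$
        $x_{\text{next}} = \mathrm{IEDesign}(\mathcal{X}, X_i, Y_i, E)$
        $X_{i+1} = \begin{pmatrix} X_i \\ x_{\text{next}}^T \end{pmatrix}$
        i = i + 1
    until i > 9
D-optimal sequential design:
    i = 0
    repeat
        $x_{\text{next}} = \mathrm{DDesign}(\mathcal{X}, X_i)$
        $X_{i+1} = \begin{pmatrix} X_i \\ x_{\text{next}}^T \end{pmatrix}$
        i = i + 1
    until i > 9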
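A single step of a D-optimal sequential design can be sketched as follows. Assumptions: p = 2, the candidate points lie on the unit circle, and the step is a greedy one that picks the point maximizing $\det(X^T X + x x^T)$ on an angular grid. The name `d_design` is hypothetical; the slides do not specify how DDesign is computed.

```python
import math

def d_design(X, steps=3600):
    """Point on the unit circle that maximizes det(X^T X + x x^T) on an angular grid."""
    m11 = sum(x1 * x1 for x1, _ in X)
    m12 = sum(x1 * x2 for x1, x2 in X)
    m22 = sum(x2 * x2 for _, x2 in X)
    best_x, best_det = None, -math.inf
    for k in range(steps):
        t = math.pi * k / steps  # x and -x carry the same information: half circle
        c, s = math.cos(t), math.sin(t)
        det = (m11 + c * c) * (m22 + s * s) - (m12 + c * s) ** 2
        if det > best_det:
            best_x, best_det = (c, s), det
    return best_x

# Hypothetical start: a single observation at (1, 0).
X = [(1.0, 0.0)]
x_next = d_design(X)
print(x_next)
```

Maximizing the determinant of the updated information matrix is equivalent to minimizing $\det D$, and it uses X only.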
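• Simulation study 1. D-optimal sequential design results
[Figure: the “Variables domain” panel shows the selected points numbered 1 to 10 in order of selection (points 1, 5, 9; 2, 6, 10; 3, 7; and 4, 8 coincide); the “Parameters domain” panel shows the resulting uncertainty set in the $(\beta_1, \beta_2)$ plane.]
Volume($A$) = 0.6400 = $4\bar{\varepsilon}^2$
IA = [0.45, 1.55] × [1.45, 2.55], Volume(IA) = 1.21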
• Simulation study 1. IE-optimal sequential design results
[Figure: the “Variables domain” panel shows the selected points; the “Parameters domain” panel shows the resulting uncertainty set in the $(\beta_1, \beta_2)$ plane.]
Volume($A$) = 0.5077
IA = [0.59, 1.41] × [1.60, 2.40], Volume(IA) = 0.66
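• Simulation study 2. Comparison of IE- and D-optimal sequential designs under error which follows a truncated normal distribution
Setup: $\mathcal{X} = \{x \in \mathbb{R}^d : x^T x \le 1\}$, $\beta = (1, 2)^T$, $\bar{\varepsilon} = 0.4$
Errors are simulated by $TN(\dots)$, a truncated normal distribution (its parameters are garbled in the source)
$X_0$ = { 3 uniformly distributed points from $\mathcal{X}$ }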
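Errors of this kind can be simulated by rejection sampling from a zero-mean normal distribution truncated to the error bounds. The standard deviation $\bar{\varepsilon}/3$ used below is an assumption made only for illustration, because the distribution parameters are not recoverable from the slide. The function name `truncated_normal` is hypothetical.

```python
import random

def truncated_normal(eps, sigma, rng):
    """Draw from N(0, sigma^2) conditioned on |value| <= eps (rejection sampling)."""
    while True:
        value = rng.gauss(0.0, sigma)
        if abs(value) <= eps:
            return value

rng = random.Random(1)
eps = 0.4
sigma = eps / 3  # assumed scale, not taken from the source
errors = [truncated_normal(eps, sigma, rng) for _ in range(1000)]
print(min(errors), max(errors))
```

Rejection sampling is adequate here because almost all of the normal mass lies inside the bounds when sigma is small relative to eps. For a much larger sigma an inverse-CDF method would waste fewer draws.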
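Simulation study 2
k = 0;
for r = 1 to 1500 do
    $X_0$ = { 3 uniformly distributed points from $\mathcal{X}$ };
    $\Xi_0$ = { 3 random values from $TN(\dots)$ };
    $Y_0 = X_0\beta + \Xi_0$;
    $X_0^D = X_0$; $X_0^I = X_0$; $Y_0^I = Y_0$; $Y_0^D = Y_0$;
    i = 0;
    repeat
        $x^I = \mathrm{IEDesign}(\mathcal{X}, X_i^I, Y_i^I, E)$;
        $x^D = \mathrm{DDesign}(\mathcal{X}, X_i^D)$;
        $\xi$ = random value from $TN(\dots)$;
        $y^I = (x^I)^T\beta + \xi$; $y^D = (x^D)^T\beta + \xi$;
        $X_{i+1}^I = \begin{pmatrix} X_i^I \\ (x^I)^T \end{pmatrix}$; $Y_{i+1}^I = \begin{pmatrix} Y_i^I \\ y^I \end{pmatrix}$;
        $X_{i+1}^D = \begin{pmatrix} X_i^D \\ (x^D)^T \end{pmatrix}$; $Y_{i+1}^D = \begin{pmatrix} Y_i^D \\ y^D \end{pmatrix}$;
        i = i + 1;
    until i > N
    if Volume(IA($X_N^I$, $Y_N^I$)) < Volume(IA($X_N^D$, $Y_N^D$)) then k = k + 1;
end for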
• Simulation study 2. Results for $\mathcal{X} = \{x \in \mathbb{R}^2 : x^T x \le 1\}$
[Figure: number of winnings $k$, $(1500 - k)$ for IE-Design and D-Design, on a count axis from 0 to 1500 and a percentage axis from 0% to 100%, plotted against the number of selected points $N$ from 0 to 25.]
• Simulation study 2. Results for $\mathcal{X} = \{x \in \mathbb{R}^3 : x^T x \le 1\}$
[Figure: number of winnings $k$, $(1500 - k)$ for IE-Design and D-Design, on a count axis from 0 to 1500 and a percentage axis from 0% to 100%, plotted against the number of selected points $N$ from 0 to 25.]
• The “cost” of IE-optimal design
– The problem of finding the maximal spread direction of $A$,
$\{\beta_1^*, \beta_2^*\} = \arg\max_{\beta_1, \beta_2 \in A} \|\beta_1 - \beta_2\|$,
is a concave quadratic programming problem (CQPP)
– CQPP is proved to be NP-hard: no polynomial-time algorithm is known, and the solving time of exact methods grows exponentially with the problem dimension (the number of input variables p)
– To overcome this difficulty we can either use special computational means (such as parallel computers) or settle for near-optimal solutions
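One way to see where the combinatorial cost comes from is sketched below, assuming $A$ is a bounded polytope with vertex set $V(A)$ (a notation introduced here for illustration). The squared distance is convex in each argument, so the maximum over $A$ is reached at a pair of vertices:

```latex
\|\lambda u + (1-\lambda) u' - w\|^2
  \le \lambda \|u - w\|^2 + (1-\lambda) \|u' - w\|^2 ,
  \qquad \lambda \in [0, 1],
\]
\[
\max_{\beta_1, \beta_2 \in A} \|\beta_1 - \beta_2\|^2
  = \max_{v, w \in V(A)} \|v - w\|^2 .
```

An exact search may therefore have to compare vertex pairs, and the number of vertices of $A$ can grow rapidly with p and n. This fits the need for parallel computation or near-optimal solutions noted above.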
Conclusions
• The interval model of error makes it possible to use the information about measured values of the output variable for effective sequential experimental design
• The results of the simulation study motivate a careful analytical investigation of the properties of IE-optimal sequential design procedures
• IE-optimal sequential design for high-dimensional models demands special computational techniques