
OPTIMAL SENSOR PLACEMENT FOR JOINT PARAMETER AND STATE ESTIMATION PROBLEMS IN LARGE-SCALE DYNAMICAL SYSTEMS WITH APPLICATIONS TO THERMO-MECHANICS

Roland Herzog∗ Ilka Riedel† Dariusz Uciński‡

February 7, 2017

We consider large-scale dynamical systems in which both the initial state and some parameters are unknown. These unknown quantities must be estimated from partial state observations over a time window. A data assimilation framework is applied for this purpose. Specifically, we focus on large-scale linear systems with multiplicative parameter-state coupling as they arise in the discretization of parametric linear time-dependent partial differential equations. Another feature of our work is the presence of a quantity of interest different from the unknown parameters, which is to be estimated based on the available data. In this setting, we develop a simplicial decomposition algorithm for an optimal sensor placement and set forth formulae for the efficient evaluation of all required quantities. As a guiding example, we consider a thermo-mechanical PDE system with the temperature constituting the system state and the induced displacement at a certain reference point as the quantity of interest.

∗Technische Universität Chemnitz, Faculty of Mathematics, Professorship Numerical Mathematics (Partial Differential Equations), D–09107 Chemnitz, Germany, [email protected], http://www.tu-chemnitz.de/herzog

†Technische Universität Chemnitz, Faculty of Mathematics, Professorship Numerical Mathematics (Partial Differential Equations), D–09107 Chemnitz, Germany, [email protected], http://www.tu-chemnitz.de/mathematik/people/part_dgl/riedel

‡University of Zielona Góra, Institute of Control and Computation Engineering, ul. Podgórna 50, 65-246 Zielona Góra, Poland, [email protected], http://staff.uz.zgora.pl/ducinski


Sensor Placement for Joint Parameter and State Estimation Herzog, Riedel, Uciński

1. INTRODUCTION

In this paper, we consider joint parameter and state estimation problems for large-scale dynamical systems of the form

\[
E\,\dot{x}(t) = A(p)\,x(t) + f(t), \quad t \in [0, t_f], \qquad x(0) = x_0 \in \mathbb{R}^n. \tag{1.1}
\]

Here $x(t) \in \mathbb{R}^n$ is the state vector, $E \in \mathbb{R}^{n\times n}$ is a non-singular matrix, $f(t) \in \mathbb{R}^n$ signifies a known forcing input, $p \in \mathbb{R}^q$ stands for a set of system parameters, and $A(p) \in \mathbb{R}^{n\times n}$ is a matrix representing parameter-dependent dynamics. The purpose of our estimation procedure is to infer an estimate $\hat{x}_0$ of the unknown initial state $x_0$ as well as an estimate $\hat{p}$ of the unknown parameters $p$ from partial measurements

\[
y_j = C_y\,x(t_j) + \eta_j \in \mathbb{R}^m, \quad j = 1, \dots, N \tag{1.2}
\]

of the state trajectory evaluated at sampling instants $t_1, \dots, t_N$, which are fixed in a given time horizon $[0, t_f]$. Here $\eta_j \in \mathbb{R}^m$ denotes measurement noise, which accounts for factors such as measurement errors and inadequacies of the mathematical model (1.1). We adopt a Bayesian setting, which means that some prior information about $x_0$ and $p$ is available when estimating them.

Once determined, the estimates $\hat{x}_0$ and $\hat{p}$ are supposed to be plugged into the model (1.1) so as to produce an estimate $\hat{x}(t_f)$ of the terminal state $x(t_f)$ and then finally yield the estimate $\hat{z} = C_z\,\hat{x}(t_f)$ of a quantity of interest (QOI) $z$, which depends linearly on the terminal state $x(t_f)$,

\[
z = C_z\,x(t_f) \in \mathbb{R}^r, \tag{1.3}
\]

where $C_z \in \mathbb{R}^{r\times n}$ is given. In practice, the measurements of the observable quantity $y$ are subject to measurement error. The output noise propagates into the estimate of $(x_0, p)$, thereby influencing the estimate of the QOI $z$. The amount of perturbation in $\hat{z}$ depends on the matrix $C_y$, which encodes which parts of the state trajectory are being observed.

It is the purpose of this paper to optimize the measurement matrix $C_y$ in order to minimize the influence of the measurement error on the estimate of the QOI, in a sense to be made precise below. We envision that the state vector $x(t) \in \mathbb{R}^n$ is high-dimensional and that it represents a distributed quantity, as for instance in the discretization of time-dependent partial differential equations. It is assumed that the measurement matrix $C_y$ consists of $m$ distinct rows of the $n \times n$ identity matrix. In this setting, the optimization of $C_y$ can be understood as choosing optimal sensor locations. We point out that we consider the sensors to be static here.
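To make the setting concrete, the following is a minimal sketch, not taken from the paper, of how a measurement matrix built from rows of the identity picks state components out of a simulated trajectory of (1.1). The implicit Euler scheme, the helper names `selection_matrix` and `simulate`, and the toy matrices are all assumptions chosen for illustration.

```python
import numpy as np

def selection_matrix(n, sensor_indices):
    """Return C_y: the rows of the n-by-n identity listed in sensor_indices."""
    return np.eye(n)[list(sensor_indices), :]

def simulate(E, A, f, x0, t_final, steps):
    """Implicit Euler for E x'(t) = A x(t) + f(t); returns the time grid and the states."""
    h = t_final / steps
    times = np.linspace(0.0, t_final, steps + 1)
    states = np.empty((steps + 1, x0.size))
    states[0] = x0
    lhs = E - h * A  # (E - h A) x_{k+1} = E x_k + h f(t_{k+1})
    for k in range(steps):
        states[k + 1] = np.linalg.solve(lhs, E @ states[k] + h * f(times[k + 1]))
    return times, states

# Hypothetical toy data: n = 4 decoupled states, sensors at components 0 and 2.
n = 4
E = np.eye(n)
A = -np.diag([1.0, 2.0, 3.0, 4.0])
x0 = np.ones(n)
times, states = simulate(E, A, lambda t: np.zeros(n), x0, t_final=1.0, steps=200)

C_y = selection_matrix(n, [0, 2])
y_noise_free = states @ C_y.T  # row k holds C_y x(t_k), i.e. (1.2) without noise
```

Choosing a different index list in `selection_matrix` corresponds to a different sensor placement; the rest of the model is untouched.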

NOTATION. Throughout the paper, $\mathbb{R}_+$ and $\mathbb{R}_{++}$ stand for the sets of nonnegative and positive real numbers, respectively. We adopt the convention that all vectors have column form. The set of real $m \times n$ matrices is denoted by $\mathbb{R}^{m\times n}$. We use $\mathbb{S}^m$ to denote the set of symmetric $m \times m$ matrices, $\mathbb{S}^m_+$ to denote the set of symmetric nonnegative definite $m \times m$ matrices, and $\mathbb{S}^m_{++}$ to denote the set of symmetric positive definite $m \times m$ matrices. The symbol $\mathrm{id}_n$ denotes the $n \times n$ identity matrix. The symbol $\mathbf{1}_n$ denotes an $n$-vector whose components are all equal to one. Given two vectors $x$ and $y$ of dimension $n$, $x \cdot y$ is an $n$-vector whose $i$-th component is $x_i y_i$ (the componentwise multiplication operator). Finally, the symbol $\operatorname{conv}(q_1, \dots, q_\ell)$ denotes the convex hull of a set of vectors $q_i$, $i = 1, \dots, \ell$.

MOTIVATION: THERMO-MECHANICAL PDE SYSTEM

As a motivation to consider sensor placement problems for systems of type (1.1), we mention an application described by a thermo-mechanical PDE system. More details are given in Section 5. Suppose that the temperature $T$ of a machine tool constitutes the state of the system and that it is governed by the heat equation

\[
\rho\,c_p\,\dot{T} - \operatorname{div}(\lambda \nabla T) = 0,
\]

endowed with boundary conditions

\[
\lambda \frac{\partial}{\partial n} T + \alpha(x)\,(T - T_{\mathrm{ref}}) = r(x, t)
\]

describing the heat flux. The heat transfer coefficient, $\alpha(x)$, depends on the spatial position $x$ and subsumes various physical phenomena, such as convective and radiative heat transfer. Its true value is therefore unknown and must be estimated from a time series of temperature measurements. A second unknown is the initial temperature state $T_0(x)$, which arises from previous operation of the machine and cannot be measured directly. The right-hand side $r(x, t)$ represents heat sources acting on the machine tool. A table describing these correspondences with the model (1.1) is provided as Table 5.2. It is not our primary goal to estimate the temperature distribution of the machine at time $t_f$, but rather to estimate the QOI, that is, the displacement of a certain relevant point of the machine structure induced by that temperature. Notice that thermally induced displacements can be the source of dominating positioning errors in machine tools. It is the precision of the estimation of these displacements that we are concerned with. To increase this precision, we wish to find optimal locations of temperature (state) sensors on the machine's surface.

RELATED WORK AND STRUCTURE OF THE PAPER

Let us put our paper into perspective. In the absence of unknown parameters $p$ in (1.1), the estimation of the terminal state $x(t_f)$ in a dynamical model such as (1.1) from previous measurements of the state is known as a data assimilation problem, see, e.g., (Freitag and Potthast, 2013; Law et al., 2015; Cacuci et al., 2014). Notice that unknown parameters could be easily incorporated by declaring them as artificial state variables satisfying $\dot{p}(t) = 0$. We do not follow this approach but prefer to keep $p$ and $x$ separate. Such joint parameter and state estimation problems were considered, for instance, in Kühl et al. (2011); Küpper et al. (2009).

A key design problem in state and/or parameter estimation of distributed parameter systems (DPSs) consists in properly deploying the available measurement sensors. They should be placed at sites which provide the most valuable information about the estimated quantities. As it is desirable to determine 'best' sensor positions before the actual data collection, the issue that must be primarily addressed is the appropriate choice of the optimality criterion. As for state estimation, various criteria quantifying observability were employed in deterministic scenarios (El Jai and Pritchard, 1988), whereas in stochastic settings the research was focused on minimizing criteria which aggregated the covariance matrix of the estimation error, see (Kubrusly and Malebranche, 1985) for the state of the art in the mid-1980s. Since the Kalman filter, which was the main tool to produce state estimates, was hard to implement in realistic settings due to its prohibitive computational and memory requirements, this line of research was abandoned for nearly two decades. Interest in it was then revived in the framework of variational data assimilation (Cacuci et al., 2014) and spatial statistics (Cressie and Wikle, 2011).

In turn, sensor location for parameter estimation usually follows the traditional approach of statistical experimental design (Atkinson et al., 2007; Pázman, 1986; Pronzato and Pázman, 2013; Pukelsheim, 2006) and is based on various scalar measures of performance defined on the Fisher information matrix (FIM) associated with the estimated parameters. The inverse of the FIM constitutes the Cramér-Rao bound to the covariance matrix of the estimates. The approach dates back to the work of Uspenskii and Fedorov (1975), whose ideas were then extended by Rafajłowicz (1981, 1986). A comprehensive overview of this currently very active research area is contained in the monograph (Uciński, 2005).

Over the past decade, the number of publications on sensor location has continued to grow. Results regarding various types of PDEs have been reported, e.g., for reaction-diffusion or convection-diffusion problems (Alonso et al., 2004a; Armaou and Demetriou, 2006; Alonso et al., 2004b; García et al., 2007), as well as for models in fluid dynamics (Mokhasi and Rempfer, 2004; Cohen et al., 2006; Willcox, 2006; Yildirim et al., 2009). By the same token, the problem has been considered in numerous applications, e.g., in environmental and water resource systems (Sun and Sun, 2015), for mechanical deformation problems (Yi et al., 2011; Meo and Zumpano, 2005), as well as in sensor networks (Song et al., 2009).

A great difficulty in estimation of DPSs arises due to the infinite-dimensional nature of the parameter space. Some theoretical problems, such as existence of a least-squares estimator, continuous dependence of the estimator on the data and convergence of approximations, require compactness of the parameter space. If these aspects are not properly addressed, the estimation process may be ill-posed in the sense that noise in the data may give rise to significant errors in the estimate. Therefore, techniques known as regularization methods have been developed to deal with this ill-posedness, e.g., Tikhonov regularization (Vogel, 2002). They, however, hardly ever consider the statistical aspects of the estimation problem. Alternatively, a Bayesian framework can be employed, which quite naturally makes it possible to take account of prior statistical information on the unknown parameters and/or states. Bayesian methods, unlike asymptotic methods of classical statistics, turn out to be well-suited theoretically and computationally to infinite-dimensional parameter spaces and can well handle the above-mentioned theoretical problems (Fitzpatrick, 1991).

Unfortunately, sensor location for Bayesian inference in DPSs (or, in general, estimation combined with regularization) has not been sufficiently considered yet. Recent research, however, points to some breakthrough in this area, especially in the context of variational data assimilation. Gejadze and Shutyaev (2012) approached the problem of efficiently evaluating the gradient of the A-optimality criterion with respect to the spatial coordinates of the sensors for estimating the initial condition of a one-dimensional Burgers equation with a nonlinear viscous term. To this end, they used a limited-memory approximation of the inverse Hessian of the data assimilation cost function (up to a multiplier, the Hessian is equal to the FIM associated with the coefficients of a finite-dimensional parametrization of the initial state). The cost of the attendant computations is substantially reduced by extensive use of adjoint equations. In turn, selection of an optimal subset of candidate sensor locations has been studied by Alexanderian et al. (2014) for estimation of the initial state of a three-dimensional advection-diffusion equation. The optimality criterion was the trace of the posterior covariance, implemented in practice through a randomized trace estimator. Substantial computational savings result from using a randomized SVD to get a low-rank surrogate for the prior-preconditioned parameter-to-observable map. Efficiency is additionally increased by specifying the covariance operator of the Gaussian prior as the inverse of an elliptic differential operator, which can be evaluated using fast solvers for elliptic PDEs. A successful attempt to generalize this approach to a parameter estimation problem (i.e., a nonlinear inverse problem) for inferring a coefficient field in a two-dimensional elliptic problem has been made in (Alexanderian et al., 2016). This inspired the formulation we use in our paper.

In an earlier paper (Herzog and Riedel, 2015), we focused on sensor placement problems for thermo-mechanical systems, but in the absence of a dynamical system (1.1). To be precise, the temperature field was estimated directly from instantaneous measurements and in a reduced-order temperature space. This is not possible here, since the heat transfer coefficient $\alpha$ is considered unknown. Notice that an estimation of $\alpha$ is only possible in a time-dependent model.

The particular features of the problem at hand and the novelties of the present paper compared with previous work on sensor placement are the following. The presence of the QOI prevents us from directly using the Fisher information matrix (FIM) of the $(x_0, p)$-estimation problem to formulate the objective for the optimal sensor placement problem. Instead, we must use the (approximate) covariance matrix of the QOI estimator, which involves the solution map of a linearized state system. Since we assume the dimension of the QOI to be much lower than the state dimension ($r \ll n$), we employ an adjoint technique to evaluate that covariance matrix efficiently. In order to solve the sensor placement problem, we employ a simplicial decomposition algorithm, which was analyzed in Patriksson (1999) and Bertsekas (2015). To solve the main subproblem we make use of the classical multiplicative algorithm, which goes back to Silvey et al. (1978) but needs to be adapted to the objective at hand. We refer the reader to Torsney (2009); Yu (2010) for a historical overview.

In principle, when solving the relaxed convex sensor selection problem (Problem 3.2) we could adapt the approach outlined by Joshi and Boyd (2009), which advocates an interior-point method. As will be shown, however, the implementation of simplicial decomposition is strikingly easy, the algorithm usually runs very fast, and most often the solutions produced by it are rather sparse (i.e., the number of nonzero weights is low). Sparsity may be quite an acute problem as far as relaxed solutions are concerned and usually requires augmenting the criterion by sparsifying penalty functions (Chepuri and Leus, 2015; Alexanderian et al., 2014; Haber et al., 2010, 2008). The linear programming subproblem built into simplicial decomposition seems to successfully retain a moderate number of nonzero weights.

Due to the multiplicative coupling of parameters $p$ and state vector $x(t)$ in (1.1), the covariance of the QOI is going to depend not only on the measurement matrix $C_y$ but also on the unknown parameters $p$ themselves (but not on the unknown initial state $x_0$). Often, this feature is addressed in sensor placement or similar experimental design problems by embedding the latter in a robust formulation, where the unknown parameter is confined to an uncertainty set. This significantly adds to the level of complexity of the problem; see, e.g., (Uciński, 2005; Pronzato and Pázman, 2013) or (Körkel et al., 2004; Diehl et al., 2006; Bock et al., 2007).

In this paper, we focus on the sensor placement problem for systems of type (1.1) in the presence of a QOI and therefore content ourselves with a given set-point (nominal value) $p^0$ in the parameter space. In Section 2, we formulate the data assimilation problem, which is used to jointly estimate the unknown initial state $x_0$ and the parameters $p$. The sensor placement problem is addressed in Section 3 and we propose a simplicial decomposition algorithm for its solution in Section 4. Subsequently, we elaborate on a specific thermo-mechanical system modeling a machine tool, where the temperature constitutes the system state $x(t)$ and the thermo-mechanically induced displacement at a certain reference point (the tool center point, or TCP) serves as the quantity of interest $z$. We seek optimal locations of temperature sensors on the surface of the machine in order to obtain an accurate estimate of the TCP displacement. The details are given in Section 5 and illustrated with numerical results in Section 6.

2. DATA ASSIMILATION PROBLEM

We consider the dynamical system (1.1) with state $x(t) \in \mathbb{R}^n$, unknown initial state $x_0 \in \mathbb{R}^n$ and unknown parameter vector $p \in \mathbb{R}^q$. We assume that measurements (1.2) of certain parts of the state trajectory are taken at given measurement times $t_j$, $j = 1, \dots, N$, during the time interval $[0, t_f]$ under consideration. The measurements $y_j$ are subject to measurement errors $\eta_j$, $j = 1, \dots, N$, which we assume to be i.i.d. random variables with normal distribution $\mathcal{N}(0, V_y)$, where $V_y = \sigma^2\,\mathrm{id}_m$. This means that the components of each $\eta_j$ are independent zero-mean random variables with the same variance $\sigma^2$, or equivalently, that the measurements from different sensors are independent of one another and that their accuracy is the same.

The unknowns in the model (1.1) are $x_0$ and $p$. Our prior (background) information consists of their prior estimates $x_0^{\mathrm{bg}}$ and $p^{\mathrm{bg}}$, which are supposed to be realizations of Gaussian random vectors with means $x_0 \in \mathbb{R}^n$ and $p \in \mathbb{R}^q$, and covariance matrices $V_{x_0} \in \mathbb{R}^{n\times n}$ and $V_p \in \mathbb{R}^{q\times q}$, respectively, i.e., $x_0^{\mathrm{bg}} \sim \mathcal{N}(x_0, V_{x_0})$ and $p^{\mathrm{bg}} \sim \mathcal{N}(p, V_p)$. Here $x_0$ and $p$ are unknown and interpreted as the 'true' initial state and the 'true' parameter, respectively. In turn, as for $V_{x_0}$ and $V_p$, we assume that they are known and positive definite, and hence invertible.

As is usually the case in data assimilation problems, the number of unknowns ($n + q$) exceeds the number of measurements ($N m$). Consequently, regularization terms expressing the above-mentioned prior information about the unknowns are needed. We thus state our data assimilation problem as follows, cf. Cacuci et al. (2014):

\[
\min_{x_0 \in \mathbb{R}^n,\; p \in \mathbb{R}^q} \; J_{\mathrm{DA}}(x_0, p) = \frac{1}{2} \bigl\| x_0 - x_0^{\mathrm{bg}} \bigr\|^2_{V_{x_0}^{-1}} + \frac{1}{2} \bigl\| p - p^{\mathrm{bg}} \bigr\|^2_{V_p^{-1}} + \frac{1}{2} \sum_{j=1}^{N} \bigl\| y_j - C_y\,x(t_j; x_0, p) \bigr\|^2_{V_y^{-1}}, \tag{2.1}
\]

where the term $x(t_j; x_0, p)$ is the solution to (1.1) at sampling time $t_j$ evaluated at given $x_0$ and $p$.

In order to solve the nonlinear least-squares problem (2.1), one can employ a standard derivative-based method such as the Gauss-Newton or Levenberg-Marquardt algorithms; see for instance (Nocedal and Wright, 2006, Section 10.3). In order to formulate the Jacobian of the model output w.r.t. the unknowns $(x_0, p)$, we introduce the sensitivities $X_0(t) = \frac{\partial}{\partial x_0} x(t; x_0, p) \in \mathbb{R}^{n\times n}$ and $X_p(t) = \frac{\partial}{\partial p} x(t; x_0, p) \in \mathbb{R}^{n\times q}$ of the state $x(t; x_0, p)$ with respect to the initial state $x_0$ and the parameters $p$.

By the implicit function theorem, it follows from (1.1) that $X_0$ is given by the linear system

\[
E\,\dot{X}_0(t) = A(p)\,X_0(t), \quad t \in [0, t_f], \qquad X_0(0) = \mathrm{id}_n, \tag{2.2}
\]

and $X_p$ satisfies

\[
E\,\dot{X}_p(t) = A'(p)\,x(t) + A(p)\,X_p(t), \quad t \in [0, t_f], \qquad X_p(0) = 0 \in \mathbb{R}^{n\times q}. \tag{2.3}
\]


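Note that, for simplicity of notation, we let $A'(p)\,x(t)$ stand for the Jacobian matrix of the mapping $p \mapsto A(p)\,x(t)$ with respect to $p$ while holding $x(t)$ constant. Using the Chain Rule Theorem (Magnus and Neudecker, 1999, Thm. 12, p. 108), we easily deduce that

\[
A'(p)\,x(t) = \Bigl( \frac{\partial}{\partial p} A(p)\,x \Bigr)_{x = x(t)} = \bigl( x(t)^\top \otimes \mathrm{id}_n \bigr) \frac{\partial \operatorname{vec} A(p)}{\partial p} = \Bigl[ \frac{\partial A(p)}{\partial p_1} \;\; \dots \;\; \frac{\partial A(p)}{\partial p_q} \Bigr] \bigl( \mathrm{id}_q \otimes x(t) \bigr), \tag{2.4}
\]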

where ‘vec’ is the column-stacking operator and ⊗ signifies the Kronecker product.
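The two Kronecker-product expressions in (2.4) can be checked numerically on a small example. The sketch below assumes, purely for illustration, an affine parameter dependence $A(p) = A_0 + \sum_k p_k A_k$, so that $\partial A / \partial p_k = A_k$; the function names are hypothetical.

```python
import numpy as np

def jacobian_blockwise(dA_list, x):
    """Last form in (2.4): [dA/dp_1 ... dA/dp_q] (id_q kron x), an n-by-q matrix."""
    q = len(dA_list)
    return np.hstack(dA_list) @ np.kron(np.eye(q), x.reshape(-1, 1))

def jacobian_vec_form(dA_list, x):
    """Middle form in (2.4): (x^T kron id_n) d vec A / dp, with column-stacking vec."""
    n = x.size
    dvecA = np.column_stack([dA.flatten(order="F") for dA in dA_list])  # n^2-by-q
    return np.kron(x.reshape(1, -1), np.eye(n)) @ dvecA

# Hypothetical data: n = 3 states, q = 2 parameters.
rng = np.random.default_rng(0)
n, q = 3, 2
dA_list = [rng.standard_normal((n, n)) for _ in range(q)]
x = rng.standard_normal(n)

J_block = jacobian_blockwise(dA_list, x)
J_vec = jacobian_vec_form(dA_list, x)
# Column k of A'(p) x is simply (dA/dp_k) x.
J_columns = np.column_stack([dA @ x for dA in dA_list])
```

In practice the column-by-column form `J_columns` avoids building any Kronecker product and is the cheapest of the three.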

Due to the linearity of the output equation (1.2), the sensitivity of the model output to changes in $(x_0, p)$ is given by

\[
\frac{\partial\, C_y\,x(t_j; x_0, p)}{\partial (x_0, p)} = C_y \bigl[ X_0(t_j) \;\; X_p(t_j) \bigr] \in \mathbb{R}^{m\times(n+q)}, \quad j = 1, \dots, N. \tag{2.5}
\]

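The data assimilation problem (2.1) can be written as a weighted least-squares problem of the form

\[
\min_{x_0 \in \mathbb{R}^n,\; p \in \mathbb{R}^q} \; \frac{1}{2}\, r(x_0, p)^\top H\, r(x_0, p) \tag{2.6}
\]

with the residual vector

\[
r(x_0, p) = \begin{bmatrix} x_0 - x_0^{\mathrm{bg}} \\ p - p^{\mathrm{bg}} \\ y_1 - C_y\,x(t_1; x_0, p) \\ \vdots \\ y_N - C_y\,x(t_N; x_0, p) \end{bmatrix} \in \mathbb{R}^{n+q+N m} \tag{2.7}
\]

and the symmetric non-negative definite weight matrix

\[
H = \operatorname{diag}\bigl( V_{x_0}^{-1},\; V_p^{-1},\; \underbrace{V_y^{-1}, \dots, V_y^{-1}}_{N \text{ times}} \bigr).
\]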
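As an illustration of the equivalence between the three-term cost (2.1) and the stacked form (2.6), the following sketch evaluates both on made-up numbers. The model misfits $y_j - C_y\,x(t_j; x_0, p)$ are supplied directly as an array, so no dynamical system is solved here; all names and values are assumptions for illustration only.

```python
import numpy as np

def cost_as_sum(dx, dp, misfits, Vx0, Vp, sigma2):
    """Form (2.1): dx = x0 - x0_bg, dp = p - p_bg, misfits[j] = y_j - C_y x(t_j)."""
    prior_x = dx @ np.linalg.solve(Vx0, dx)
    prior_p = dp @ np.linalg.solve(Vp, dp)
    data = np.sum(misfits ** 2) / sigma2      # uses V_y = sigma2 * id_m
    return 0.5 * (prior_x + prior_p + data)

def cost_as_weighted_residual(dx, dp, misfits, Vx0, Vp, sigma2):
    """Form (2.6): 0.5 * r^T H r with r from (2.7) and block-diagonal H."""
    r = np.concatenate([dx, dp, misfits.ravel()])
    n, q = dx.size, dp.size
    H = np.zeros((r.size, r.size))
    H[:n, :n] = np.linalg.inv(Vx0)
    H[n:n + q, n:n + q] = np.linalg.inv(Vp)
    H[n + q:, n + q:] = np.eye(misfits.size) / sigma2
    return 0.5 * (r @ H @ r)

# Hypothetical toy numbers: n = 3, q = 2, N = 4 sampling instants, m = 2 sensors.
rng = np.random.default_rng(1)
dx, dp = rng.standard_normal(3), rng.standard_normal(2)
misfits = rng.standard_normal((4, 2))
Vx0, Vp, sigma2 = np.diag([1.0, 2.0, 0.5]), np.diag([0.1, 0.3]), 0.04

j_sum = cost_as_sum(dx, dp, misfits, Vx0, Vp, sigma2)
j_stacked = cost_as_weighted_residual(dx, dp, misfits, Vx0, Vp, sigma2)
```

Forming $H$ densely is only sensible for such toy sizes; for large $n$ one would apply the inverse covariances blockwise, as in `cost_as_sum`.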

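The Jacobian of the residual can be computed from the sensitivities defined above in the following way:

\[
J(x_0, p) = \frac{\partial r(x_0, p)}{\partial (x_0, p)} = \begin{bmatrix} \mathrm{id}_n & 0 \\ 0 & \mathrm{id}_q \\ -C_y X_0(t_1) & -C_y X_p(t_1) \\ \vdots & \vdots \\ -C_y X_0(t_N) & -C_y X_p(t_N) \end{bmatrix}. \tag{2.8}
\]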


Notice that for a large state dimension $n$, the sensitivity trajectory $X_0 : [0, t_f] \to \mathbb{R}^{n\times n}$ will be of a formidable size. Also, since the number of model outputs and measurements, $N m$, is typically smaller than the number of unknowns $n + q$, it is more economical to evaluate the Jacobian using the adjoint technique. We will now show that one single adjoint variable $S : [0, t_f] \to \mathbb{R}^{n\times m}$ is enough to attain this objective.

To this end, consider one typical output estimate $\hat{y}_j = C_y\,x(t_j; x_0, p)$ of the actual output $y_j$. Adjoin (1.1) to this estimate with an arbitrary time-varying Lagrange multiplier matrix $S_j(t) \in \mathbb{R}^{n\times m}$ as follows:

\[
\hat{y}_j = C_y\,x(t_j; x_0, p) + \int_0^{t_j} S_j(t)^\top \underbrace{\bigl[ E\,\dot{x}(t; x_0, p) - A(p)\,x(t; x_0, p) - f(t) \bigr]}_{=0} \, dt. \tag{2.9}
\]

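Let us integrate the $S_j(t)^\top E\,\dot{x}(t)$ term in (2.9) by parts, yielding

\[
\hat{y}_j = \bigl[ C_y + S_j(t_j)^\top E \bigr] x(t_j) - S_j(0)^\top E\,x(0) - \int_0^{t_j} \bigl[ \dot{S}_j(t)^\top E + S_j(t)^\top A(p) \bigr] x(t) \, dt - \int_0^{t_j} S_j(t)^\top f(t) \, dt. \tag{2.10}
\]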

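Differentiating both sides of (2.10) with respect to $x_0$, we thus get

\[
C_y X_0(t_j) = \frac{\partial \hat{y}_j}{\partial x_0} = \bigl[ C_y + S_j(t_j)^\top E \bigr] X_0(t_j) - S_j(0)^\top E\,X_0(0) - \int_0^{t_j} \bigl[ \dot{S}_j(t)^\top E + S_j(t)^\top A(p) \bigr] X_0(t) \, dt. \tag{2.11}
\]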

To avoid having to determine the function $X_0(t)$, we choose the multiplier function $S_j(t)$ so that the coefficients of $X_0(t)$ and $X_0(t_j)$ vanish, i.e., we specify it as the solution to the following backwards-in-time adjoint differential equation:

\[
E^\top \dot{S}_j(t) = -A(p)^\top S_j(t), \quad t \in [0, t_j], \qquad E^\top S_j(t_j) = -C_y^\top. \tag{2.12}
\]

Equation (2.11) then becomes

\[
C_y X_0(t_j) = -S_j(0)^\top E\,X_0(0) = -S_j(0)^\top E. \tag{2.13}
\]

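In turn, differentiating both sides of (2.10) with respect to $p$, we get

\[
C_y X_p(t_j) = \frac{\partial \hat{y}_j}{\partial p} = \bigl[ C_y + S_j(t_j)^\top E \bigr] X_p(t_j) - S_j(0)^\top E\,X_p(0) - \int_0^{t_j} \bigl[ \dot{S}_j(t)^\top E + S_j(t)^\top A(p) \bigr] X_p(t) \, dt - \int_0^{t_j} S_j(t)^\top \bigl( A'(p)\,x(t) \bigr) \, dt. \tag{2.14}
\]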


But on account of (2.12) and the initial condition for (2.3), this simplifies to

\[
C_y X_p(t_j) = -\int_0^{t_j} S_j(t)^\top \bigl( A'(p)\,x(t) \bigr) \, dt. \tag{2.15}
\]

Consequently, the block row of the Jacobian (2.8) associated with the output at time $t_j$ can be expressed as

\[
\bigl[ -C_y X_0(t_j) \;\; -C_y X_p(t_j) \bigr] = \Bigl[ S_j(0)^\top E \;\;\; \int_0^{t_j} S_j(t)^\top \bigl( A'(p)\,x(t) \bigr) \, dt \Bigr].
\]

It is now important to observe that (2.12) is an autonomous system. Therefore, $S_j(t) = S_k(t - t_j + t_k)$ holds whenever both are defined. We conclude that in place of $N$ different systems of type (2.12) it is enough to consider a single adjoint system for the adjoint state $S : [0, t_f] \to \mathbb{R}^{n\times m}$,

\[
E^\top \dot{S}(t) = -A(p)^\top S(t), \quad t \in [0, t_f], \qquad E^\top S(t_f) = -C_y^\top. \tag{2.16}
\]

Since $S_j(t) = S(t - t_j + t_f)$ holds, each block row of the Jacobian can be evaluated according to

\[
\bigl[ -C_y X_0(t_j) \;\; -C_y X_p(t_j) \bigr] = \Bigl[ S(t_f - t_j)^\top E \;\;\; \int_0^{t_j} S(t - t_j + t_f)^\top \bigl( A'(p)\,x(t) \bigr) \, dt \Bigr]. \tag{2.17}
\]
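As a plausibility check of the first block in (2.17), which is not part of the original derivation, one can compare the closed-form solutions of the sensitivity system for $X_0$ and of the adjoint system for $S$. The matrix exponential $\exp(\cdot)$ is the only ingredient introduced here; $E$ is non-singular by assumption.

```latex
\begin{align*}
X_0(t) &= \exp\bigl(E^{-1}A(p)\,t\bigr)
  && \text{solves } E\,\dot{X}_0 = A(p)\,X_0,\ X_0(0) = \mathrm{id}_n, \\
S(t) &= -E^{-\top}\exp\bigl(A(p)^\top E^{-\top}(t_f - t)\bigr)\,C_y^\top
  && \text{solves } E^\top\dot{S} = -A(p)^\top S,\ E^\top S(t_f) = -C_y^\top, \\
S(t_f - t_j)^\top E &= -C_y \exp\bigl(E^{-1}A(p)\,t_j\bigr)\,E^{-1}E
  = -C_y X_0(t_j).
\end{align*}
```

The last line uses $\bigl(A(p)^\top E^{-\top}\bigr)^\top = E^{-1}A(p)$, so transposing the exponential in $S$ reproduces the exponential in $X_0$.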

We provide in Table 3.1 an overview of the quantities required during the solution of the data assimilation problem (2.1) by gradient-based methods.

3. SENSOR PLACEMENT PROBLEM

3.1. COVARIANCE OF THE QOI ESTIMATOR

Having solved the data assimilation problem (2.1), we obtain estimates $\hat{x}_0$ and $\hat{p}$ of the sought 'true' values $x_0$ and $p$, respectively. In the sequel, we shall concatenate $x_0$ and $p$ so as to have only one vector of unknown true parameters $\theta = (x_0, p)$ and its estimate $\hat{\theta} = (\hat{x}_0, \hat{p})$.

As was mentioned in the introduction, our main concern is not to estimate the unknown initial state $x_0$ or the parameter vector $p$ directly, but rather to estimate a quantity of interest $z$ depending on the terminal state $x(t_f)$ at time $t_f$,

\[
z = C_z\,x(t_f; \theta) \in \mathbb{R}^r \tag{3.1}
\]

through

\[
\hat{z} = C_z\,x(t_f; \hat{\theta}) \in \mathbb{R}^r \tag{3.2}
\]

with $r$ small compared with the dimension $n$ of the state variable.

To be able to assess the quality of the estimator (3.2), we investigate the expected dispersion of the estimates produced by it, which is quantified by the covariance matrix $\operatorname{Cov}(\hat{z})$. Clearly, the QOI $z$ depends on the unknowns $(x_0, p)$ in an indirect way, and its dependence on $p$ is nonlinear. Therefore, obtaining an exact expression for the covariance matrix of the estimator $\hat{z}$ is a real challenge. That is why we follow here a standard approach in the literature, cf. Mehra (1974), and resort to the covariance of a linearized estimator, which is obtained by linearizing the parameter-to-QOI map. This approach is backed up by asymptotic considerations; see for instance (Pronzato and Pázman, 2013, Chapter 3).

From now on, let $\theta^0 = (x_0^0, p^0)$ denote a given set-point in the parameter space (we may set $\theta^0 = \theta^{\mathrm{bg}} = (x_0^{\mathrm{bg}}, p^{\mathrm{bg}})$), where (3.1) is linearized. An application of the chain rule to (3.1) and (2.2)–(2.3) shows that this linearization is given by the matrix

\[
Q = \frac{\partial z}{\partial \theta} \bigg|_{\theta = \theta^0} = C_z\,X(t_f; \theta^0) \in \mathbb{R}^{r\times d}, \tag{3.3}
\]

where here and subsequently we abbreviate $d = n + q$ and

\[
X(t; \theta) = \bigl[ X_0(t; \theta) \;\; X_p(t; \theta) \bigr].
\]

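Consequently, the covariance of the linearized QOI estimator is related via

\[
\operatorname{Cov}(\hat{z}) = Q\,\operatorname{Cov}(\hat{\theta})\,Q^\top \tag{3.4}
\]

to the covariance $\operatorname{Cov}(\hat{\theta})$ of the parameter estimator $\hat{\theta}$. Throughout the paper we assume that the matrix $Q$ has full row rank:

\[
\operatorname{rank} Q = r. \tag{3.5}
\]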

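In order to form the matrix $Q$, we exploit the similarity of (3.3) and (2.5) and follow an adjoint approach. To be precise, we solve the additional adjoint system for $S_Q : [0, t_f] \to \mathbb{R}^{n\times r}$,

\[
E^\top \dot{S}_Q(t) = -A(p^0)^\top S_Q(t), \quad t \in [0, t_f], \qquad E^\top S_Q(t_f) = -C_z^\top, \tag{3.6}
\]

and evaluate

\[
Q = \bigl[ C_z X_0(t_f; \theta^0) \;\; C_z X_p(t_f; \theta^0) \bigr] = \Bigl[ -S_Q(0)^\top E \;\;\; -\int_0^{t_f} S_Q(t)^\top \bigl( A'(p^0)\,x(t) \bigr) \, dt \Bigr]. \tag{3.7}
\]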

Quantity                      defined in   evaluate using   requires
r(x_0, p)  residual           (2.7)        (2.7)            solution x of (1.1)
J(x_0, p)  Jacobian           (2.8)        (2.17)           solution S of (2.16)
Q          Jacobian of QOI    (3.3)        (3.7)            solution S_Q of (3.6)

Table 3.1: Overview of quantities for the solution of the data assimilation and the sensor placement problem, and how to evaluate them efficiently.

The problem of characterizing and evaluating $\operatorname{Cov}(\hat{\theta})$ has been extensively investigated by researchers concerned with variational data assimilation. Gejadze et al. outlined an approach to obtain approximations to the covariance matrices for the data assimilation problem in which either the initial state $x_0$ or the parameter vector $p$ is unknown, see Gejadze et al. (2008) and Gejadze et al. (2010), respectively, as well as Gejadze et al. (2013); Gejadze and Shutyaev (2012). It is rather straightforward to combine these results in our problem of joint estimation of $x_0$ and $p$, thereby obtaining

\[
\operatorname{Cov}(\hat{\theta}) \approx \Bigl( V_\theta^{-1} + \sum_{j=1}^{N} X(t_j)^\top C_y^\top V_y^{-1} C_y\,X(t_j) \Bigr)^{-1}, \tag{3.8}
\]

where $V_\theta = \operatorname{diag}(V_{x_0}, V_p)$ and $X(t_j) = X(t_j; \theta)$. The dependence of the right-hand side on the 'true' vector $\theta$ is not surprising; it is the rule whenever estimates of the covariance matrices of estimators are constructed in settings where the outputs depend nonlinearly on the estimated parameters. Clearly, we do not know $\theta$ and, in practice, we approximate it by a preliminary estimate $\theta^0$ (e.g., a logical choice is $\theta^0 = \theta^{\mathrm{bg}}$).

3.2. THE CRITERION TO BE OPTIMIZED

Our optimal design problem consists in determining an $m$-element subset selected out of a total of $n$ state variables, which would yield the lowest variability in the estimates of the QOI as measured by the covariance matrix (3.4). In order to express this formally, we define a decision variable which is the $n$-dimensional vector $w$ whose component $w_i$ is one if $x_i$ is supposed to be measured and zero if $x_i$ is not going to be measured. In consequence, the observation matrix takes the form

\[
C_y(w) = D(\operatorname{diag}(w)), \tag{3.9}
\]

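where $D$ stands for the operation of forming a submatrix of its matrix argument by deleting all zero rows. Since we assume that the measurements of the observed state components are independent of one another and taken by equally accurate sensors, i.e., $V_y = \sigma^2\,\mathrm{id}_m$ for some known variance $\sigma^2$, it follows that

\[
\operatorname{Cov}(\hat{\theta}) \approx I(w)^{-1}, \tag{3.10}
\]

where

\[
I(w) = V_\theta^{-1} + \frac{1}{\sigma^2} \sum_{j=1}^{N} X(t_j)^\top \operatorname{diag}(w)\,X(t_j) = V_\theta^{-1} + \sum_{i=1}^{n} w_i \Upsilon_i, \tag{3.11}
\]

\[
\Upsilon_i = \frac{1}{\sigma^2} \sum_{j=1}^{N} \operatorname{row}_i(X(t_j))^\top \operatorname{row}_i(X(t_j)), \quad i = 1, \dots, n. \tag{3.12}
\]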

Here $\operatorname{row}_i$ signifies the $i$-th row of its matrix argument. We call $I(w)$ the Bayesian information matrix for $\theta$, cf. Chepuri and Leus (2015). Observe that the positive definiteness of $V_{x_0}$ and $V_p$ implies that of $V_\theta$, and this, in turn, forces $I(w)$ to be positive definite (since the term $\sum_{i=1}^{n} w_i \Upsilon_i$ is nonnegative definite). Consequently, there is no problem with the inversion of $I(w)$.
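The assembly of the Bayesian information matrix from per-location contributions can be sketched as follows. The sensitivity matrices are random stand-ins rather than solutions of the sensitivity equations, and the function names are hypothetical; the sketch only illustrates that the two forms in (3.11) agree and that the result is positive definite.

```python
import numpy as np

def upsilon_matrices(X_list, sigma2):
    """Upsilon_i of (3.12) for all i; X_list holds the N matrices X(t_j), each n-by-d."""
    X = np.stack(X_list)                               # shape (N, n, d)
    return np.einsum("jia,jib->iab", X, X) / sigma2    # shape (n, d, d)

def information_matrix(w, V_theta, Ups):
    """I(w) = V_theta^{-1} + sum_i w_i Upsilon_i, the second form in (3.11)."""
    return np.linalg.inv(V_theta) + np.tensordot(w, Ups, axes=1)

# Hypothetical toy data: n = 5 candidate locations, d = 3 unknowns, N = 4 instants.
rng = np.random.default_rng(0)
n, d, N, sigma2 = 5, 3, 4, 0.01
X_list = [rng.standard_normal((n, d)) for _ in range(N)]
V_theta = np.diag([1.0, 2.0, 0.5])
w = np.array([1.0, 0.0, 1.0, 0.0, 0.0])               # sensors at locations 1 and 3

I_w = information_matrix(w, V_theta, upsilon_matrices(X_list, sigma2))
# First form in (3.11), for comparison.
I_direct = np.linalg.inv(V_theta) + sum(X.T @ np.diag(w) @ X for X in X_list) / sigma2
```

Because the matrices $\Upsilon_i$ do not depend on $w$, they can be computed once and reused for every candidate design.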

For the intended search for an optimal $w$, we have to introduce an appropriate optimality criterion. As nonnegative-definite matrices can be only partially ordered, instead of directly comparing the covariance matrices for different choices of the output matrix, a scalar performance index $\Psi$ defined on $\operatorname{Cov}(\hat{\theta})$ can be used here. Thus, our sensor selection problem can be ultimately expressed as the following optimization problem.

Problem 3.1 (Sensor Selection Problem). Find a vector $w^\star_{\mathrm{bin}} \in \mathbb{R}^n$ to minimize

\[
\mathcal{J}(w) = \Psi\bigl( Q\,I(w)^{-1} Q^\top \bigr) \tag{3.13}
\]

subject to the constraints

\begin{align*}
\mathbf{1}_n^\top w &= m, \tag{3.14} \\
w_i &\in \{0, 1\}, \quad i = 1, \dots, n. \tag{3.15}
\end{align*}

In the role of $\Psi$, various 'alphabetical' optimality criteria commonly used in experimental design can be considered. Specifically, three possible criteria follow:

(i) $D_Q$-optimality (or generalized D-optimality), which corresponds to $\Psi = \log\det$,

\[
\mathcal{J}(w) = \log\det\bigl( Q\,I(w)^{-1} Q^\top \bigr), \tag{3.16}
\]

(ii) $A_Q$-optimality (or generalized A-optimality), which corresponds to $\Psi = \operatorname{trace}$,

\[
\mathcal{J}(w) = \operatorname{trace}\bigl( Q\,I(w)^{-1} Q^\top \bigr), \tag{3.17}
\]

(iii) $E_Q$-optimality (or generalized E-optimality), which corresponds to $\Psi = \lambda_{\max}$,

\[
\mathcal{J}(w) = \lambda_{\max}\bigl( Q\,I(w)^{-1} Q^\top \bigr), \tag{3.18}
\]

where $\lambda_{\max}$ is the maximal eigenvalue of its matrix argument. See (Atkinson et al., 2007, p. 137) or (Silvey, 1980, p. 10) for justification of this terminology and notation.
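For a given information matrix, the three criteria are straightforward to evaluate. The following is a minimal sketch with a hypothetical helper `design_criteria`; the example data are chosen so that $Q\,I(w)^{-1}Q^\top = \operatorname{diag}(0.5, 0.25)$ can be verified by hand.

```python
import numpy as np

def design_criteria(Q, info):
    """Values of (3.16)-(3.18) for a given information matrix info = I(w)."""
    cov_z = Q @ np.linalg.solve(info, Q.T)       # Q I(w)^{-1} Q^T
    cov_z = 0.5 * (cov_z + cov_z.T)              # symmetrize against round-off
    return {
        "D_Q": np.linalg.slogdet(cov_z)[1],      # log det
        "A_Q": np.trace(cov_z),                  # trace
        "E_Q": np.linalg.eigvalsh(cov_z)[-1],    # largest eigenvalue
    }

# Hypothetical example with r = d = 2: Q = id_2 and I(w) = diag(2, 4).
values = design_criteria(np.eye(2), np.diag([2.0, 4.0]))
```

Here `values["D_Q"]` equals $\log(0.125)$, `values["A_Q"]` equals $0.75$ and `values["E_Q"]` equals $0.5$.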

Different optimality criteria may produce different solutions to Problem 3.1, but this results from their slightly different interpretations in terms of the uncertainty ellipsoid for the estimates $\hat{z}$. Roughly speaking, a $D_Q$-optimum design minimizes its volume, an $A_Q$-optimum design suppresses the mean squared length of its axes, and an $E_Q$-optimum design minimizes the length of its largest axis. In what follows, our attention will be focused on the $D_Q$-optimality criterion (3.16). Note that the assumption (3.5) implies

\[
\operatorname{rank}\bigl( Q\,I(w)^{-1} Q^\top \bigr) = \operatorname{rank} Q = r, \tag{3.19}
\]

see, e.g., the Range Inclusion Lemma in (Pukelsheim, 2006, p. 17), which clearly demonstrates that $Q\,I(w)^{-1} Q^\top$ is always nonsingular.

3.3. RELAXED SENSOR SELECTION PROBLEM

Owing to the combinatorial nature of Problem 3.1, which may make its solution intractable even for small-scale problems, we relax it by replacing the non-convex Boolean constraints $w_i \in \{0, 1\}$ with the convex box constraints $w_i \in [0, 1]$. Thus we get the following convex relaxed sensor selection problem.

Problem 3.2 (Relaxed Sensor Selection Problem). Find a vector $w^\star \in \mathbb{R}^n$ to minimize

\[
\mathcal{J}(w) = \Psi\bigl( Q\,I(w)^{-1} Q^\top \bigr) = \Psi\Bigl( Q \Bigl( V_\theta^{-1} + \sum_{i=1}^{n} w_i \Upsilon_i \Bigr)^{-1} Q^\top \Bigr) \tag{3.20}
\]

subject to the constraints

\begin{align*}
\mathbf{1}_n^\top w &= m, \tag{3.21} \\
0 \le w_i &\le 1, \quad i = 1, \dots, n. \tag{3.22}
\end{align*}

It goes without saying that the above relaxed problem is not equivalent to the original problem, as some components of the computed optimal solution w⋆ may be fractional and not binary. It is, however, by no means useless, as J(w⋆) constitutes a lower bound on J(w⋆_bin), where w⋆_bin solves Problem 3.1. What is more, by rounding the m largest components of w⋆ to one and the remaining components to zero, we can produce a suboptimal solution for Problem 3.1. This option is typical for sensor selection problems, see, e.g., Joshi and Boyd (2009). In addition, solutions to Problem 3.2 can be embedded into a general branch-and-bound scheme to yield a solution w⋆_bin, see (Ucinski and Patan, 2007) for details.
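The rounding heuristic mentioned above can be sketched in a few lines. The function name `round_to_binary` is hypothetical, and the tie-breaking between equal weights is an arbitrary choice of this sketch.

```python
import numpy as np

def round_to_binary(w, m):
    """Set the m largest entries of w to one and all others to zero."""
    w = np.asarray(w, dtype=float)
    w_bin = np.zeros_like(w)
    if m <= 0:
        return w_bin
    # argsort is ascending, so the last m indices hold the largest weights;
    # ties are resolved by the sort order, which is arbitrary here.
    w_bin[np.argsort(w, kind="stable")[-m:]] = 1.0
    return w_bin
```

Since the rounded vector is feasible for Problem 3.1, its criterion value is an upper bound on J(w⋆_bin), while J(w⋆) is a lower bound; the difference of the two values bounds the loss incurred by rounding.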

Problem 3.2 possesses a number of notable features which, in theory, should make its solution straightforward. First of all, note that the performance index J(w) is convex over the convex feasible set W defined by the constraints (3.21) and (3.22), being the intersection of a hyperplane and a hyperbox. The convexity results from the fact that, under the assumption (3.5), the mapping Φ : M ↦ log det(Q M⁻¹Qᵀ) is convex on the set of positive-definite matrices in R^{d×d} (Marshall et al., 2011, Theorem 16.F.4, p. 688). What is more, J is differentiable with

\[ \varphi(w) := -\nabla J(w) = \bigl[\varphi_1(w), \dots, \varphi_n(w)\bigr]^\top, \tag{3.23} \]

where

\[ \varphi_i(w) = -\operatorname{trace}\bigl(\Phi'(I(w))\, \Upsilon_i\bigr) \in \mathbb{R}, \tag{3.24} \]

and Φ′(X) ≡ dΦ(X)/dX signifies the derivative of Φ with respect to its matrix argument X ∈ R^{d×d}, which is the d × d matrix whose (i, j) entry is ∂Φ(X)/∂X(j,i), cf. (Bernstein, 2005, p. 410). As I ∈ R^{d×d} is positive definite, we have

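\[ \Phi'(I) = \frac{d}{dI} \log\det\bigl(Q\, I^{-1} Q^\top\bigr) = -I^{-1} Q^\top \bigl(Q\, I^{-1} Q^\top\bigr)^{-1} Q\, I^{-1}, \tag{3.25} \]

cf. (Bernstein, 2005, Eqn. (10.6.13), p. 411). Substituting this into (3.24) and using the cyclic commutativity of the trace of a product of matrices, we get

\[ \varphi_i(w) = \operatorname{trace}\Bigl(\bigl(Q\, I(w)^{-1} Q^\top\bigr)^{-1} Q\, I(w)^{-1}\, \Upsilon_i\, I(w)^{-1} Q^\top\Bigr), \quad i = 1, \dots, n. \tag{3.26} \]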
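Formula (3.26) can be checked numerically on a small dense example. The sketch below evaluates (3.26) and the DQ-criterion so that the two can be compared via central finite differences; with the sign convention of (3.24), φi(w) equals the negative of ∂J/∂wi. The function names and the dense storage of the matrices Υi in an array `Ups` of shape (n, d, d) are assumptions made for illustration only.

```python
import numpy as np

def fim(w, V_inv, Ups):
    """I(w) = V_theta^{-1} + sum_i w_i * Upsilon_i for dense Ups of shape (n, d, d)."""
    return V_inv + np.tensordot(w, Ups, axes=1)

def dq_criterion(w, Q, V_inv, Ups):
    """J(w) = log det(Q I(w)^{-1} Q^T)."""
    C = Q @ np.linalg.solve(fim(w, V_inv, Ups), Q.T)
    return np.linalg.slogdet(C)[1]

def phi(w, Q, V_inv, Ups):
    """phi_i(w) from (3.26) for all i."""
    H = np.linalg.solve(fim(w, V_inv, Ups), Q.T)   # H = I(w)^{-1} Q^T  (d x r)
    C_inv = np.linalg.inv(Q @ H)                   # (Q I(w)^{-1} Q^T)^{-1}
    # Q I(w)^{-1} = H^T because I(w) is symmetric.
    return np.array([np.trace(C_inv @ H.T @ U @ H) for U in Ups])
```

Note that only the d × r matrix H = I(w)⁻¹Qᵀ and an r × r inverse are needed, which is the observation exploited later for the efficient implementation.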

As the feasible set W is a rather nice convex set, numerous computational methods can potentially be employed for solving Problem 3.2, e.g., the conditional gradient method or a gradient projection method. Unfortunately, if the number of the support points n is large, which is rather a common situation in applications, then these algorithms require additional implementation effort in order to avoid unsatisfactory computational times.

On the other hand, an extremely simple multiplicative algorithm (Silvey et al., 1978; Yu, 2010) is available to optimize the DQ-optimality criterion over the canonical simplex. Its idea is reminiscent of the EM algorithm used for maximum likelihood estimation, and a decisive advantage is ease of implementation.

In what follows, it will be shown how this multiplicative algorithm can be built into a very simple and efficient computational scheme which takes account of the additional upper-bound constraint in (3.22). The principal tool in its construction will be simplicial decomposition.

4. SIMPLICIAL DECOMPOSITION FOR PROBLEM 3.2

4.1. ALGORITHM MODEL

Simplicial decomposition (SD) has proved extremely useful for large-scale pseudoconvex programming problems encountered, e.g., in traffic assignment or other network flow problems (Patriksson, 1999). In its basic form, it proceeds by alternately solving linear and nonlinear programming subproblems, called the column generation problem (CGP) and the restricted master problem (RMP), respectively. In the RMP, the original problem is relaxed by replacing the original constraint set W with its inner approximation being the convex hull of a finite set of feasible solutions. In the CGP, this inner approximation is improved by incorporating a point in the original constraint set that lies furthest along the negative gradient direction computed at the solution of the RMP. This basic strategy has been discussed and extended in numerous references (Bertsekas, 2015; Patriksson, 1999). A marked characteristic of the SD method is that the sequence of solutions to the RMP tends to a solution of the original problem in such a way that the objective function strictly monotonically approaches its optimal value.

The SD algorithm may be viewed as a form of modular nonlinear programming, provided that one has an effective computer code for solving the RMP, as well as access to a code which can take advantage of the linearity of the CGP. One of the aims of this paper is to show that this is the case within the framework of Problem 3.2. What is more, since we deal with minimization of the convex function J over a bounded polyhedral set W, this will automatically imply the convergence of the resulting SD scheme in a finite number of RMP steps (Bertsekas, 2015).

Tailoring the SD scheme to our needs, we obtain Algorithm 1. In the sequel, its consecutive steps will be discussed in turn.

4.2. CHARACTERIZATION OF THE OPTIMAL DESIGN AND TERMINATION OF ALGORITHM 1

In the original SD setting, the criterion for terminating the iterations is checked only after solving the column generation problem. The computation is then stopped if the current point w^(k) satisfies the condition of nondecrease, to first order, in performance measure value in the whole constraint set, i.e.,

\[ \min_{w \in W} \nabla J(w^{(k)})^\top \bigl(w - w^{(k)}\bigr) \ge 0. \tag{4.8} \]

The condition (4.4) is less costly in terms of the number of floating-point operations. It results from the following characterization of w⋆ which has the property that J(w⋆) = min_{w∈W} J(w).

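Theorem 4.1. A vector w⋆ constitutes a global minimum of J over W if, and only if, there exists a number λ⋆ such that

\[ \varphi_i(w^\star) \begin{cases} \ge \lambda^\star & \text{if } w_i^\star = 1, \\ = \lambda^\star & \text{if } 0 < w_i^\star < 1, \\ \le \lambda^\star & \text{if } w_i^\star = 0 \end{cases} \tag{4.9} \]

for i = 1, . . . , n.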



Algorithm 1 Algorithm model for solving Problem 3.2 via simplicial decomposition.

Step 0: (Initialization)
Guess an initial solution w^(0) ∈ W such that I(w^(0)) is nonsingular. Set the index set I = {1, . . . , n}, G^(0) = {w^(0)} and k = 0.

Step 1: (Termination check)
Set

\[ I_{\mathrm{ub}}^{(k)} = \bigl\{ i \in I \mid w_i^{(k)} = 1 \bigr\}, \tag{4.1} \]
\[ I_{\mathrm{im}}^{(k)} = \bigl\{ i \in I \mid 0 < w_i^{(k)} < 1 \bigr\}, \tag{4.2} \]
\[ I_{\mathrm{lb}}^{(k)} = \bigl\{ i \in I \mid w_i^{(k)} = 0 \bigr\}. \tag{4.3} \]

If

\[ \varphi_i(w^{(k)}) \begin{cases} \ge \lambda & \text{if } i \in I_{\mathrm{ub}}^{(k)}, \\ = \lambda & \text{if } i \in I_{\mathrm{im}}^{(k)}, \\ \le \lambda & \text{if } i \in I_{\mathrm{lb}}^{(k)} \end{cases} \tag{4.4} \]

for some positive λ, then STOP and w^(k) is optimal.

Step 2: (Solution of the column generation subproblem, CGP)
Compute

\[ g^{(k+1)} = \arg\max_{w \in W} \varphi(w^{(k)})^\top w \tag{4.5} \]

and set

\[ G^{(k+1)} = G^{(k)} \cup \bigl\{ g^{(k+1)} \bigr\}. \tag{4.6} \]

If g^(k+1) ∈ conv(G^(k)), then STOP.

Step 3: (Solution of the restricted master subproblem, RMP)
Find

\[ w^{(k+1)} = \arg\min_{w \in \operatorname{conv}(G^{(k+1)})} \Psi\bigl(Q\, I(w)^{-1} Q^\top\bigr), \tag{4.7} \]

and purge G^(k+1) of all extreme points with zero weights in the resulting expression of w^(k+1) as a convex combination of elements in G^(k+1). Increment k by one and go back to Step 1.



The proof of this result proceeds in much the same way as that of Proposition 1 in (Ucinski and Patan, 2007).
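In floating-point arithmetic, the conditions (4.9) can only be verified up to tolerances. The following sketch tests whether a common multiplier λ exists for given vectors φ and w. The function name, the thresholds for classifying a weight as zero or one, and the way the tolerance `tol` enters (as an absolute slack) are assumptions of this sketch and are not prescribed by Theorem 4.1.

```python
import numpy as np

def satisfies_optimality(phi, w, zero_tol=0.05, one_tol=0.95, tol=0.01):
    """Check whether some positive lambda fulfils (4.9) up to tolerances."""
    phi = np.asarray(phi, dtype=float)
    w = np.asarray(w, dtype=float)
    ub = w > one_tol                  # weights treated as one
    lb = w < zero_tol                 # weights treated as zero
    im = ~(ub | lb)                   # intermediate weights
    lo = phi[lb].max() if lb.any() else -np.inf   # lambda must be >= lo
    hi = phi[ub].min() if ub.any() else np.inf    # lambda must be <= hi
    if im.any():
        # For intermediate weights phi_i must equal lambda.
        lo = max(lo, phi[im].max())
        hi = min(hi, phi[im].min())
    # A positive lambda exists iff [lo, hi] is nonempty and reaches above zero.
    return bool(lo <= hi + tol and hi + tol > 0)
```

The check costs only a few passes over the vector φ, in contrast to the linear program behind (4.8).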

4.3. SOLUTION OF THE COLUMN GENERATION SUBPROBLEM

In Step 2 of Algorithm 1 we deal with the linear programming problem

\[ \text{maximize } c^\top w \quad \text{subject to } w \in W, \tag{4.10} \]

where c = φ(w^(k)), in which the feasible region is defined by 2n bound constraints (3.22) and one equality constraint (3.21). Making use of this special form of the constraints, we can develop an algorithm to solve this problem, which is almost as simple as a closed-form solution. The key idea is to make use of the following assertion which can be demonstrated in much the same way as Theorem 4.1.

Theorem 4.2. A vector g ∈ W constitutes a global solution to the problem (4.10) if, and only if, there exists a scalar ρ such that

\[ c_i \begin{cases} \ge \rho & \text{if } g_i = 1, \\ = \rho & \text{if } 0 < g_i < 1, \\ \le \rho & \text{if } g_i = 0 \end{cases} \tag{4.11} \]

for i = 1, . . . , n.

We thus see that, in order to solve (4.10), it is sufficient to pick the m largest components ci of c and set the corresponding weights gi to one, and the remaining weights to zero.
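The selection rule just stated (ones at the m largest components of c, zeros elsewhere) can be sketched as follows. The function name `solve_cgp` is hypothetical; `numpy.argpartition` is used so that no full sort of the n components is required.

```python
import numpy as np

def solve_cgp(c, m):
    """Vertex g of W with ones at the m largest entries of c (largest c^T g)."""
    c = np.asarray(c, dtype=float)
    g = np.zeros_like(c)
    if m > 0:
        # argpartition moves the m largest entries to the last m positions
        # without sorting the whole vector.
        g[np.argpartition(c, -m)[-m:]] = 1.0
    return g
```

The resulting vector satisfies (4.11) with ρ chosen as the m-th largest component of c.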

4.4. SOLUTION OF THE RESTRICTED MASTER SUBPROBLEM

Suppose that in the (k + 1)-th iteration of Algorithm 1, we have

\[ G^{(k+1)} = \bigl\{ g_1, \dots, g_\ell \bigr\}, \tag{4.12} \]

possibly with ℓ < k + 1 owing to the built-in deletion mechanism of points in G^(j), 1 ≤ j ≤ k, which did not contribute to the convex combinations yielding the corresponding iterates w^(j). Step 3 of Algorithm 1 involves minimization of the design criterion (3.20) over

\[ \operatorname{conv}(G^{(k+1)}) = \Bigl\{ \sum_{j=1}^{\ell} v_j g_j \;\Bigm|\; \sum_{j=1}^{\ell} v_j = 1, \; v_j \ge 0, \; j = 1, \dots, \ell \Bigr\}. \tag{4.13} \]

From the representation of any w ∈ conv(G^(k+1)) as

\[ w = \sum_{j=1}^{\ell} v_j g_j, \tag{4.14} \]



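or, in component-wise form,

\[ w_i = \sum_{j=1}^{\ell} v_j g_{j,i}, \quad i = 1, \dots, n, \tag{4.15} \]

g_{j,i} being the i-th component of g_j, it follows that

\[ I(w) = V_\theta^{-1} + \sum_{i=1}^{n} w_i \Upsilon_i = \sum_{j=1}^{\ell} v_j \Bigl( V_\theta^{-1} + \sum_{i=1}^{n} g_{j,i} \Upsilon_i \Bigr) = \sum_{j=1}^{\ell} v_j\, I(g_j). \tag{4.16} \]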

From this, we see that the RMP can equivalently be formulated as the following problem:

Problem 4.3. Find the sequence of weights v ∈ R^ℓ to minimize

\[ P(v) = \log\det\bigl(Q\, H(v)^{-1} Q^\top\bigr) \tag{4.17} \]

subject to

\[ 1_\ell^\top v = 1, \tag{4.18} \]
\[ v_j \ge 0, \quad j = 1, \dots, \ell, \tag{4.19} \]

where

\[ H(v) = \sum_{j=1}^{\ell} v_j H_j, \qquad H_j = I(g_j), \quad j = 1, \dots, \ell. \tag{4.20} \]

Basically, since the constraints (4.18) and (4.19) define the probability simplex in R^ℓ, i.e., a very nice convex feasible domain, it is intuitively appealing to determine optimal weights using a numerical algorithm specialized for solving convex optimization problems. Note, however, that this formulation has already captured close attention in optimum experimental design theory, where various characterizations of optimal solutions and efficient computational schemes have been proposed (Atkinson et al., 2007). In particular, in the case of the DQ-optimality criterion studied here, we can employ the General Equivalence Theorem of (Ucinski, 2005, Theorem 3.2, p. 48) to get the following conditions for global optimality:

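Theorem 4.4. A vector v⋆ constitutes a global solution to Problem 4.3 if and only if

\[ \psi_j(v^\star) \begin{cases} = r & \text{if } v_j^\star > 0, \\ \le r & \text{if } v_j^\star = 0 \end{cases} \tag{4.21} \]

for each j = 1, . . . , ℓ, where

\[ \psi_j(v) = \operatorname{trace}\Bigl(\bigl(Q\, H(v)^{-1} Q^\top\bigr)^{-1} Q\, H(v)^{-1} H_j\, H(v)^{-1} Q^\top\Bigr), \quad j = 1, \dots, \ell. \tag{4.22} \]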



A very simple multiplicative algorithm (Yu, 2010) can be adapted to the above RMP. It is summarized in Algorithm 2. Although only its monotonicity can be proven for the DQ-optimality criterion, and not global convergence, cf. (Yu, 2010), in practice it behaves flawlessly. As an alternative, an interior-point method has recently been proposed by Lu and Pong (2013), for which global convergence is guaranteed, but at the cost of a much more complicated implementation.

Algorithm 2 Algorithm model for the restricted master problem.

Step 0: (Initialization)
Select a weight vector v^(0) with positive components which sum up to one, e.g., set v^(0) = (1/ℓ) 1_ℓ. Set κ = 0.

Step 1: (Termination check)
If

\[ \frac{1}{r}\, \psi(v^{(\kappa)}) \le 1_\ell \tag{4.23} \]

then STOP.

Step 2: (Multiplicative update)
Evaluate

\[ v^{(\kappa+1)} = \frac{1}{r}\, \psi(v^{(\kappa)}) \cdot v^{(\kappa)}, \tag{4.24} \]

where the product is taken componentwise. Increment κ by one and go to Step 1.
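A compact dense-matrix sketch of Algorithm 2 is given below. It assumes that the matrices Hj are available as an array `Hs` of shape (ℓ, d, d), which is an illustrative simplification, since the matrices are never formed explicitly in the large-scale setting. The default values of `tol` and `max_iter` are borrowed from Table 6.2, but the way the tolerance enters the test (4.23) is an assumption of this sketch.

```python
import numpy as np

def psi(v, Q, Hs):
    """psi_j(v) from (4.22) for all j; Hs has shape (ell, d, d)."""
    H = np.tensordot(v, Hs, axes=1)            # H(v) = sum_j v_j H_j
    B = np.linalg.solve(H, Q.T)                # B = H(v)^{-1} Q^T  (d x r)
    C_inv = np.linalg.inv(Q @ B)               # (Q H(v)^{-1} Q^T)^{-1}
    # trace(C^{-1} B^T H_j B) for every j in one contraction
    return np.einsum("ab,kb,jkl,la->j", C_inv, B, Hs, B)

def multiplicative_rmp(Q, Hs, v0=None, tol=0.01, max_iter=30):
    """Multiplicative update (4.24) with the stopping test (4.23)."""
    ell, r = Hs.shape[0], Q.shape[0]
    v = np.full(ell, 1.0 / ell) if v0 is None else np.asarray(v0, dtype=float)
    for _ in range(max_iter):
        p = psi(v, Q, Hs)
        if np.all(p / r <= 1.0 + tol):         # (4.23), up to a tolerance
            break
        v = v * p / r                          # (4.24), componentwise product
    return v
```

The update preserves the constraint (4.18) automatically, because the weighted sum of the ψj(v) with weights vj equals trace(id_r) = r for every feasible v.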

5. APPLICATION TO A THERMO-MECHANICAL SYSTEM

In this section, we describe in more detail the application of the sensor placement procedure to a certain thermo-mechanical system. To be more precise, we consider the temperature evolution T(x, t) of the machine tool column depicted in Figure 5.1. We denote the solid body of the machine column by Ω and its surface by Γ. The temperature evolution is governed by the linear heat equation,

\[
\begin{aligned}
\rho\, c_p\, \dot{T} - \operatorname{div}(\lambda \nabla T) &= 0 && \text{in } \Omega \times (0, t_f), \\
\lambda \frac{\partial}{\partial n} T + \alpha(x)\,(T - T_{\mathrm{ref}}) &= r(x, t) && \text{on } \Gamma \times (0, t_f), \\
T(x, 0) &= T_0(x) && \text{in } \Omega.
\end{aligned}
\tag{5.1}
\]

The boundary conditions represent a simplified model for the heat transfer occurring at the different parts of the machine’s surface. Since the underlying heat transfer mechanism includes both convective and radiative phenomena, the value of the effective coefficient α(x) is considered unknown and also dependent on the spatial position x. We make here the following ansatz:

\[ \alpha(x) := \sum_{k=1}^{q} \alpha_k\, \chi_k(x), \tag{5.2} \]

where each χk is an indicator function with values in {0, 1}, which selects a certain portion of the machine’s surface Γ. Here the surface of the machine is divided into five parts. The value of α is fixed to zero on those two areas where the two heat sources act, which are expressed through the right-hand side r(x, t). The heat sources are assumed to be known and are described in Section 6, where numerical results are presented. They originate, on the one hand, from an electrical drive mounted on top of the machine column and, on the other hand, from the spindle driving the horizontal movement of the column, see Figure 5.1(c). On the remaining q = 4 parts of the surface, the heat transfer coefficients α1, . . . , α4 need to be estimated, but some background information αbg is available. We have 12 W K−1 m−2 on the vertical surfaces, 10 W K−1 m−2 and 8 W K−1 m−2 on the horizontal planes with the outer normal facing upwards and downwards, respectively, and 5 W K−1 m−2 on all enclosed surfaces, including the inner surfaces of the cavities; see Figure 5.1(c). At those surface parts where the electrical drives are mounted, the heat transfer coefficient α(x) is zero. All symbols occurring in (5.1) are summarized in Table 5.1.

Figure 5.1: Auerbach ACW 630 machine column. (a) Photograph of the machine column. (b) CAD model with mounting points determining the TCP location. (c) Background values of αbg.

Symbol  Meaning                             Value     Units
T       temperature                                   K
r       thermal surface load                          W m−2
ρ       density                             7 250     kg m−3
cp      specific heat at constant pressure  500       J kg−1 K−1
λ       thermal conductivity                46.8      W K−1 m−1
Tref    ambient temperature                 20        °C
αbg     background information on α         0 to 12   W K−1 m−2
α       heat transfer coefficient           unknown   W K−1 m−2
T0      initial temperature                 unknown   K

Table 5.1: Table of symbols associated with the thermal model.

We now switch to a spatial finite element model of (5.1) w.r.t. a basis ϕi, i = 1, . . . , n. In our computations, we are using the standard nodal basis composed of piecewise linear, continuous elements on a tetrahedral grid of the geometry depicted in Figure 5.1(b). In a slight abuse of notation, we denote the coefficient vector representing the temperature field T also by T. By converting (5.1) to its weak formulation and restricting it to the finite element space, we arrive at the following semi-discretized version of (5.1)

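\[
\begin{aligned}
M\, \dot{T}(t) + K\, T(t) + \sum_{k=1}^{q} \alpha_k M_k \bigl(T(t) - T_{\mathrm{ref}}\bigr) &= r(t), \quad t \in [0, t_f], \\
T(0) &= T_0.
\end{aligned}
\tag{5.3}
\]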

Here M and Mk denote mass and boundary mass matrices, respectively, and K is the stiffness matrix:

\[ M = \rho\, c_p \Bigl( \int_\Omega \phi_i\, \phi_j \, dx \Bigr)_{i,j}, \qquad M_k = \Bigl( \int_\Gamma \phi_i\, \phi_j\, \chi_k \, dx \Bigr)_{i,j}, \qquad K = \lambda \Bigl( \int_\Omega \nabla\phi_i \cdot \nabla\phi_j \, dx \Bigr)_{i,j} \]

with indices i, j = 1, . . . , n. Tref is a coefficient vector in R^n with identical entries. The right-hand side vector r(t) represents the load vector generated by the given boundary heat sources:

\[ r(t) = \Bigl( \int_\Gamma r(x, t)\, \phi_j \, dx \Bigr)_j. \]

Finally, we recall that the coefficient vector T0 representing the initial temperature distribution

\[ T_0(x) = \sum_{j=1}^{n} T_{0,j}\, \phi_j(x) \]

is unknown. It is clear that the finite element model (5.3) is of the form (1.1) when the identifications given in Table 5.2 are made.

Our model output y(t) = Cy T(t), which is adjusted to the temperature measurements during the data assimilation process, is described by the measurement matrix Cy. In the present setting, we wish to use as potential measurement locations all finite element mesh nodes which are located on the surface of the machine column. Therefore, Cy is composed of all rows of the n × n identity matrix corresponding to the surface degrees of freedom. The specific form of the adjoint system (2.16) reads

\[
\begin{aligned}
M\, \dot{S}(t) &= -K\, S(t) - \sum_{k=1}^{q} \alpha_k M_k\, S(t), \quad t \in [0, t_f], \\
M\, S(t_f) &= -C_y^\top.
\end{aligned}
\tag{5.4}
\]

Notice that the symmetry of M and K has been used. The block rows of the Jacobian according to (2.17) are

\[ \Bigl[\; S(t_f - t_j)^\top M \quad \int_0^{t_j} S(t - t_j + t_f)^\top \bigl[\, M_1 T(t) \;\cdots\; M_4 T(t) \,\bigr]\, dt \;\Bigr]. \tag{5.5} \]

Symbol in (1.1)   Symbol in (5.3)                      Remark
x                 T
p                 α                                    unknown
x0                T0                                   unknown
E                 M
A(p)              −K − ∑_{k=1}^{q} αk Mk
f(t)              r(t) + ∑_{k=1}^{q} αk Mk Tref

Table 5.2: Correspondence of symbols in the general dynamical system (1.1) and the finite element model of the heat equation (5.3).

We recall that our emphasis is not on the estimation of the temperature distribution of the machine, but rather on the estimation of the QOI, i.e., the temperature-induced displacement of a certain reference point of the machine structure, at time t_f. The overall displacement field is governed by a quasi-static linear elasticity model since the time scale of the heat equation is unable to generate wave motion in the machine structure. The linear elasticity model is based on the balance of forces,

\[ -\operatorname{div} \sigma\bigl(\varepsilon(u), T(t_f)\bigr) = 0 \quad \text{in } \Omega. \tag{5.6} \]

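We employ an additive split of the stress tensor σ into its mechanically and thermally induced parts. (An alternative, equivalent approach would apply such a split to the strains.) Together with the usual homogeneous and isotropic stress-strain relation, we obtain the following constitutive law; see (Boley and Weiner, 1960, Section 1.12), (Eslami et al., 2013, Section 2.8):

\[
\begin{aligned}
\sigma\bigl(\varepsilon(u), T(t_f)\bigr) &= \sigma_{\mathrm{el}}(\varepsilon(u)) + \sigma_{\mathrm{th}}(T(t_f)), \\
\sigma_{\mathrm{el}}(\varepsilon(u)) &= \frac{E}{1 + \nu}\, \varepsilon(u) + \frac{E \nu}{(1 + \nu)(1 - 2\nu)}\, \operatorname{trace}(\varepsilon(u))\, \mathrm{id}, \\
\sigma_{\mathrm{th}}(T(t_f)) &= -\frac{E}{1 - 2\nu}\, \beta\, \bigl(T(t_f) - T_{\mathrm{ref}}\bigr)\, \mathrm{id}_3.
\end{aligned}
\tag{5.7}
\]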

Herein, ε denotes the linearized strain tensor

\[ \varepsilon(u) = \tfrac{1}{2}\bigl(\nabla u + \nabla u^\top\bigr). \]

The elasticity modulus E and Poisson ratio ν of the cast iron machine column are given. For convenience, all quantities relevant for the displacement model are summarized in Table 5.3.

Symbol  Meaning                                     Value      Units
u       displacement                                           m
σ       stress                                                 N m−2
ε       strain                                                 1
ν       Poisson’s ratio                             0.3        1
E       modulus of elasticity                       114·10^9   N m−2
β       thermal volumetric expansion coefficient    1.1·10^−5  K−1
L       length of the main spindle                  0.993      m
ℓ       auxiliary quantity, see Appendix A          0.535      m
σ       standard deviation of temperature sensors   0.0333     K

Table 5.3: Table of symbols associated with the displacement model.
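As a rough plausibility check, the magnitude of the thermally induced stress in (5.7) can be evaluated for a temperature rise of 1 K above Tref, taking the formula and the values of E, ν and β from Table 5.3 at face value. This worked number is an illustration only and does not appear in the original computations:

\[ \sigma_{\mathrm{th}} = -\frac{E\, \beta}{1 - 2\nu}\, (1\ \mathrm{K})\, \mathrm{id}_3 = -\frac{114 \cdot 10^{9} \cdot 1.1 \cdot 10^{-5}}{1 - 0.6}\ \mathrm{N\,m^{-2}}\ \mathrm{id}_3 \approx -3.1 \cdot 10^{6}\ \mathrm{N\,m^{-2}}\ \mathrm{id}_3. \]

Under these assumptions, each kelvin of temperature deviation therefore contributes a compressive thermal stress of about 3 MPa.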

We continue with a specification of the mechanical boundary conditions for the elasticity equation (5.6)–(5.7). The machine column is free to move in the X-direction on the rail by which it connects to the machine bed, see Figure 5.1(a). Movements in Y- and Z-directions are prohibited. Moreover, the machine column is connected by a spindle nut to the spindle in the machine bed which drives the horizontal movement during operation. This leads to the following mixture of essential and natural boundary conditions for (5.6)–(5.7):

\[
\begin{aligned}
u_2 = 0, \quad u_3 = 0, \quad [\sigma n]_1 &= 0 && \text{on } \Gamma_{\mathrm{rail}}, \\
u &= 0 && \text{on } \Gamma_{\mathrm{nut}}, \\
\sigma n &= 0 && \text{on } \Gamma \setminus \bigl(\Gamma_{\mathrm{nut}} \cup \Gamma_{\mathrm{rail}}\bigr).
\end{aligned}
\tag{5.8}
\]

The third boundary condition expresses the absence of boundary loads on the remainder of the surface.


We discretize (5.6)–(5.8) by standard nodal (vector-valued) linear finite elements on the same mesh employed for the discretization of the heat equation (5.1). This leads to a stationary, discrete problem of the following form:

\[ K\, u + F\, \bigl(T(t_f) - T_{\mathrm{ref}}\bigr) = 0, \tag{5.9} \]

where K is the stiffness matrix and F is a matrix associated with the thermally induced stress. Clearly, the solution map T(t_f) ↦ u taking the terminal temperature to the induced displacement is affine.

Our quantity of interest z = u(xTCP) ∈ R^r with r = 3 is the displacement at a certain reference point xTCP, the tool center point. As TCP, we use the tip of the main spindle (holding the tool) seen in the left of Figure 5.1(a). We consider the main spindle assembly as a rigid body which is thermally insulated from the machine column. Consequently, the TCP displacement is determined by the displacement at the four mounting points x1, . . . , x4 of the sledge holding the main spindle, see Figure 5.1(b). The dependence u(xTCP) = N(u(x1), . . . , u(x4)) is nonlinear, and we refer the reader to (Herzog and Riedel, 2015, Section 3.2) for more details. Here we are only interested in the linearization Cz of the map described by (5.9),

\[ T(t_f) \mapsto u \mapsto u(x_{\mathrm{TCP}}) \]

at the constant reference temperature Tref. By the chain rule, it is evident that

\[ C_z = -N'(0)\, K^{-1} F \in \mathbb{R}^{3 \times n} \]

holds. The specific form of N′(0) is given in Appendix A.

Clearly, it is advantageous to evaluate Cz in an adjoint fashion according to

\[ C_z^\top = -F^\top K^{-\top} N'(0)^\top. \]

This amounts to the solution of only r = 3 adjoint elasticity equations with point sources acting in x1, . . . , x4.
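The saving can be illustrated with a small dense sketch: the forward evaluation solves one linear system per column of F (n right-hand sides), whereas the adjoint evaluation needs only r right-hand sides. The function names and the use of dense solves are assumptions for illustration; a finite element code would use a sparse factorization or an iterative solver instead.

```python
import numpy as np

def cz_forward(N0, K, F):
    """C_z = -N'(0) K^{-1} F via one solve per column of F (n right-hand sides)."""
    return -N0 @ np.linalg.solve(K, F)

def cz_adjoint(N0, K, F):
    """Same matrix via r adjoint solves: C_z^T = -F^T K^{-T} N'(0)^T."""
    Y = np.linalg.solve(K.T, N0.T)   # only r right-hand sides
    return -(F.T @ Y).T
```

Both functions return the same r × n matrix; they differ only in the number of linear systems that have to be solved.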

With the matrix Cz available, the output matrix Q can be evaluated by solving the adjoint system (3.6) and applying (3.7). For the forward system (5.3) under consideration, this amounts to solving

\[
\begin{aligned}
M\, \dot{S}_Q(t) &= -K\, S_Q(t) - \sum_{k=1}^{q} \alpha_k M_k\, S_Q(t), \quad t \in [0, t_f], \\
M\, S_Q(t_f) &= -C_z^\top
\end{aligned}
\tag{5.10}
\]

for SQ : [0, t_f] → R^{n×r} and subsequently evaluating

\[ Q = \Bigl[\; -S_Q(0)^\top M \quad -\int_0^{t_f} S_Q(t)^\top \bigl[\, M_1 T(t) \;\cdots\; M_4 T(t) \,\bigr]\, dt \;\Bigr]. \]

The symmetry of M and K has been used in these formulas.


6. NUMERICAL RESULTS

In this section we present some numerical results. We focus on the sensor placement problem and its solution by the simplicial decomposition sensor placement method described in Algorithm 1. The algorithm is applied to the thermo-mechanical system described in Section 5. We therefore assume that the set-point θ⁰ = (T₀⁰, α⁰) is given and no data assimilation problem needs to be solved.

6.1. DESCRIPTION OF PROBLEM DATA

We fix the set-point of the initial temperature state equal to the ambient temperature, i.e., T₀⁰(x) ≡ Tref. The set-point of the heat transfer parameter α⁰(x) varies over different parts of the boundary, and it is zero where the heat sources are applied, see (5.2) and Figure 5.1(c). We have chosen typical values for the heat transfer coefficient,

\[
\alpha^0(x) =
\begin{cases}
12\ \mathrm{W\,K^{-1}\,m^{-2}} & \text{if } x \in \Gamma_{\mathrm{vert}} \text{ (vertical surfaces)}, \\
10\ \mathrm{W\,K^{-1}\,m^{-2}} & \text{if } x \in \Gamma_{\mathrm{up}} \text{ (horizontal surfaces facing up)}, \\
8\ \mathrm{W\,K^{-1}\,m^{-2}} & \text{if } x \in \Gamma_{\mathrm{down}} \text{ (horizontal surfaces facing down)}, \\
5\ \mathrm{W\,K^{-1}\,m^{-2}} & \text{if } x \in \Gamma_{\mathrm{inner}} \text{ (enclosed surfaces)}, \\
0\ \mathrm{W\,K^{-1}\,m^{-2}} & \text{if } x \in \Gamma_{r_1} \cup \Gamma_{r_2} \text{ (surfaces with heat sources)}.
\end{cases}
\]

The inverse covariance matrices for the initial state and for the parameter were chosen as V_{x0}⁻¹ = M (finite element mass matrix) and V_p⁻¹ = id4. The machine column experiences the influence of two heat sources, see Figure 5.1(c). One originates from an electrical drive mounted on the top of the machine column (Γr1) and the other one from the spindle driving the horizontal movement of the column (Γr2). The heat sources are described by

\[
r(x, t) =
\begin{cases}
6700\ \mathrm{W\,m^{-2}} & \text{if } x \in \Gamma_{r_1} \text{ and } 0\ \mathrm{s} \le t \le 2400\ \mathrm{s}, \\
2700\ \mathrm{W\,m^{-2}} & \text{if } x \in \Gamma_{r_2} \text{ and } 0\ \mathrm{s} \le t \le 4800\ \mathrm{s}, \\
6700\ \mathrm{W\,m^{-2}} & \text{if } x \in \Gamma_{r_1} \text{ and } 4800\ \mathrm{s} < t \le 7200\ \mathrm{s}, \\
0 & \text{else}.
\end{cases}
\]

All calculations are done in the time interval [0 s, 7200 s]. The standard deviation of the measurements was assumed to be σ = 1.

6.2. DISCRETIZATION

As described in Section 5, we used a finite element model with a standard nodal basis of piecewise linear, continuous elements for the temperature T as well as for the displacement u on a tetrahedral grid of the geometry depicted in Figure 5.1(b). The size of the mesh can be seen in Table 6.1. All finite element nodes on the boundary are potential sensor positions in the sensor placement problem.

number of mesh nodes                                            n = 25 615
number of mesh cells                                            79 197
number of nodes on the boundary (potential sensor locations)    25 288

Table 6.1: Size of the finite element mesh.

In order to compute the required quantities for the sensor placement problem, particularly the Jacobian J and the output matrix Q, we need to solve the time-dependent forward system (5.1), the adjoint system for the sensitivities (5.4) and the adjoint system (5.10) for the matrix Q. For the forward system we employed the implicit Euler method with time step length ∆t = 360 s. The adjoint systems were discretized with the consistent adjoint time stepping scheme. The measurements y(tj) = Cy T(tj) were taken at the same time instances tj = j ∆t, j = 1, . . . , N = 20, which occur during integration.
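For illustration, an implicit Euler step for the semi-discrete heat equation (5.3) reads (M + ∆t (K + B)) T^{j+1} = M T^j + ∆t (r(t_{j+1}) + B Tref) with B = ∑k αk Mk. The following dense NumPy sketch implements this recursion; the function name, the argument layout and the dense solves are assumptions of this sketch, and a practical code would factorize the constant sparse system matrix once and reuse the factorization in every step.

```python
import numpy as np

def implicit_euler(M, K, Mk, alpha, T0, T_ref, r, dt=360.0, N=20):
    """Integrate M T' + K T + sum_k alpha_k M_k (T - T_ref) = r(t) by implicit Euler."""
    B = np.tensordot(alpha, Mk, axes=1)        # B = sum_k alpha_k M_k
    A = M + dt * (K + B)                       # constant system matrix
    T = [np.asarray(T0, dtype=float)]
    for j in range(1, N + 1):
        # Right-hand side evaluated at the new time level t_j = j * dt.
        rhs = M @ T[-1] + dt * (r(j * dt) + B @ T_ref)
        T.append(np.linalg.solve(A, rhs))
    return np.array(T)                         # shape (N + 1, n)
```

With N = 20 steps of length 360 s, the final time level coincides with the end of the time interval [0 s, 7200 s].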

6.3. EFFICIENT IMPLEMENTATION

Notice that the sensitivities X(tj) = [X0(tj), Xp(tj)] ∈ R^{n×(n+q)}, j = 1, . . . , N, as well as the matrices Υi ∈ R^{(n+q)×(n+q)}, i = 1, . . . , n, are dense and therefore would require a large amount of memory to store. Moreover, the assembly of the matrix I(w) appearing in the evaluation of φi during the CGP step, see (4.5) and (3.26), and during the RMP step (4.7) of Algorithm 1 would be computationally rather expensive. Here we take advantage of the fact that only the product h = I(w)⁻¹Qᵀ is needed to compute all required quantities. Instead of forming I(w), we therefore solve I(w) h = Qᵀ by means of r = 3 calls to a preconditioned conjugate gradient method based on matrix-vector products with I(w), which are much more economical to implement. As preconditioner we use the inverse background covariance matrix Vθ⁻¹ = diag(V_{x0}⁻¹, V_p⁻¹) = diag(M, id4). Similar considerations apply to the evaluation of ψj in (4.22).
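The matrix-free solve can be sketched as follows. The sketch assumes that the information matrix has the structure I(w) = Vθ⁻¹ + σ⁻² ∑j X(tj)ᵀ diag(w) X(tj), i.e., that Υi collects the outer products of the i-th rows of the sensitivity matrices; this structure is defined outside the present section and is therefore stated here as an assumption. All function names are hypothetical, and the dense solve with `V_inv` merely stands in for an efficient solve with the mass matrix.

```python
import numpy as np

def fim_matvec(x, w, V_inv, X_list, sigma=1.0):
    """y = I(w) x without forming I(w); one term per time step."""
    y = V_inv @ x
    for X in X_list:                           # X = sensitivity matrix X(t_j)
        y += X.T @ (w * (X @ x)) / sigma**2
    return y

def pcg(matvec, b, prec_solve, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for a symmetric positive definite operator."""
    x = np.zeros_like(b)
    r = b.copy()
    z = prec_solve(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = matvec(p)
        a = rz / (p @ Ap)
        x += a * p
        r -= a * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = prec_solve(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

def solve_fim(Q, w, V_inv, X_list):
    """h = I(w)^{-1} Q^T, computed column by column (r PCG calls)."""
    mv = lambda x: fim_matvec(x, w, V_inv, X_list)
    ps = lambda res: np.linalg.solve(V_inv, res)   # preconditioner: solve with V_theta^{-1}
    return np.column_stack([pcg(mv, q, ps) for q in Q])
```

The loop over the time steps in `fim_matvec` consists of independent terms, which is what makes a parallel evaluation with one thread per time step possible.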

In addition, the computation of φ(w) and ψ(v), see (3.26) and (4.22), as well as the matrix-vector products with the FIM I(w) are executed in parallel with N = 20 threads, where each thread j only uses sensitivity information X(tj) for time step j.

6.4. RESULTS AND PERFORMANCE

For practical purposes the termination criteria for the simplicial decomposition problem (4.4) and for the restricted master problem (4.23) are implemented only up to certain tolerances. In (4.4) a weight wi is considered zero (one) if it is below 0.05 (above 0.95), and hence i is taken to belong to the set Ilb (Iub). After solving the RMP, a column gj is purged if the corresponding vj is below 0.05.

All values for tolerances as well as maximal iteration numbers for solving both problems can be found in Table 6.2.

Parameter                                      Value
maximal number of iterations for SDP           40
zero weight in termination check for SDP       0.05
unit weight in termination check for SDP       0.95
tolerance in termination check for SDP         0.01
tolerance for purging columns in RMP           0.05
maximal number of iterations for RMP           30
tolerance in termination check for RMP         0.01

Table 6.2: Parameters used in Algorithm 1.

The sensor placement problem for the thermo-mechanical system described in Section 5 was solved for the setting described in Section 6.1 with a desired number of m = 10 sensors. Algorithm 1 stopped after 6 iterations, because the column generated in the CGP was already contained in the previous column set G. In this case the RMP to be solved would be the same as in the step before and no further progress could be achieved. The computation took about 2.5 h; further details about the performance of the algorithm are listed in Table 6.3.

time for computation of sensitivities (5.5)    ≈ 15 min
time for computation of φ(w)                   150 s
average number of RMP steps per SDP step       18
time for RMP step                              70 s
number of SDP steps                            6
average time for SDP step                      1360 s
overall time                                   2.5 h

Table 6.3: Computation times for the application of Algorithm 1.

Figure 6.1 shows the evolution of the distribution of measurement weights for each SDP step over all possible sensor locations, which were all 25 288 boundary nodes of the FE mesh. The final solution is achieved practically after 4 iterations, which is also reflected in the objective values Ψ(Q I(w)⁻¹Qᵀ) in Figure 6.2(b). The optimal sensors are all placed in the vicinity of the two heat sources, see Figure 6.2(a).

Figure 6.1: Evolution of the measurement weights w^(k) during SDP iterations.

Figure 6.2: Optimal sensors and objective values. (a) Optimal sensor positions (m = 10). (b) Objective values Ψ(Q I(w^(k))⁻¹Qᵀ) over iteration number.

Figure 6.3: Eigenvectors of the final covariance matrix Q I(w)⁻¹Qᵀ for the largest (blue), the second largest (green) and the smallest (red) eigenvalue.

Since the DQ-criterion targets the volume of the confidence ellipsoid of the QOI (TCP displacement), one might expect that its minimization could lead to an ellipsoid with largely unequal half axes. This would mean that the precision of the TCP displacement estimation would be highly dependent on the direction. However, this is not the case in our example. The lengths of the ellipsoid’s half axes are proportional to the square roots of the eigenvalues of the QOI estimator covariance matrix Q I(w)⁻¹Qᵀ. In our example, we obtain [√λ1, √λ2, √λ3] = [124.2, 94.5, 70.6]. Consequently, the TCP displacement can be estimated with comparable precision in the x-, y- and z-directions. Figure 6.3 shows the corresponding eigenvectors.
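The half-axis lengths and directions follow from a symmetric eigenvalue decomposition of the 3 × 3 covariance matrix. The sketch below shows this step and, using only the three values reported above, the ratio of the longest to the shortest half axis as a simple measure of anisotropy. The function name `ellipsoid_half_axes` is hypothetical.

```python
import numpy as np

def ellipsoid_half_axes(C):
    """Relative half-axis lengths (largest first) and directions of the ellipsoid."""
    eigvals, eigvecs = np.linalg.eigh(C)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder: largest first
    return np.sqrt(eigvals[order]), eigvecs[:, order]

# Square roots of the eigenvalues reported in the text.
half_axes = np.array([124.2, 94.5, 70.6])
anisotropy = half_axes[0] / half_axes[-1]      # longest over shortest half axis
```

For the reported values the ratio is below two, which quantifies the statement that the precision is comparable in all directions.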

7. CONCLUSION AND OUTLOOK

In this paper we addressed a sensor placement problem for large-scale linear dynamical systems for which the initial state and a number of parameters are unknown and are estimated within a data assimilation framework. A thermo-mechanical PDE system was used as an illustrative example. Particular emphasis was placed on the efficient evaluation of the arising quantities, where we exploited the fact that the quantity of interest is low-dimensional.

It should be noted that the sensitivities X(tj) = [X0(tj) Xp(tj)] of the model output and consequently the QOI estimator’s covariance matrix depend on the unknown parameter and initial state θ⁰ = (x₀⁰, p⁰). Therefore, an array of optimal sensor locations may not necessarily be optimal for a variation of θ⁰. This issue is particularly unsatisfactory when the data assimilation problem (2.1) is considered in a moving horizon context where updated estimates on the unknowns become continually available but updates on sensor locations during the operation are impossible to realize. Robust sensor selection strategies are then required and will be investigated elsewhere. Another issue of future investigation is the proper use of parametric model order reduction in order to reduce the computational effort.

ACKNOWLEDGMENT

This work was supported by a DFG grant within the Special Research Program SFB/Transregio 96 (Thermo-energetische Gestaltung von Werkzeugmaschinen), which is gratefully acknowledged.

A. SPECIFIC FORM OF N′(0)

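\[
\begin{aligned}
&\begin{pmatrix}
\tfrac{1}{4} & \tfrac{L}{\ell}(x_{2z} - x_{4z}) & \tfrac{L}{\ell}(x_{4y} - x_{2y}) \\
\tfrac{L}{\ell}(x_{4z} - x_{2z}) & \tfrac{1}{4} & \tfrac{L}{\ell}(x_{2x} - x_{4x}) \\
\tfrac{L}{\ell}(x_{2y} - x_{4y}) & \tfrac{L}{\ell}(x_{4x} - x_{2x}) & \tfrac{1}{4}
\end{pmatrix} u(x_1)
+
\begin{pmatrix}
\tfrac{1}{4} & \tfrac{L}{\ell}(x_{3z} - x_{1z}) & \tfrac{L}{\ell}(x_{1y} - x_{3y}) \\
\tfrac{L}{\ell}(x_{1z} - x_{3z}) & \tfrac{1}{4} & \tfrac{L}{\ell}(x_{3x} - x_{1x}) \\
\tfrac{L}{\ell}(x_{3y} - x_{1y}) & \tfrac{L}{\ell}(x_{1x} - x_{3x}) & \tfrac{1}{4}
\end{pmatrix} u(x_2) \\
{}+{}
&\begin{pmatrix}
\tfrac{1}{4} & \tfrac{L}{\ell}(x_{4z} - x_{2z}) & \tfrac{L}{\ell}(x_{2y} - x_{4y}) \\
\tfrac{L}{\ell}(x_{2z} - x_{4z}) & \tfrac{1}{4} & \tfrac{L}{\ell}(x_{4x} - x_{2x}) \\
\tfrac{L}{\ell}(x_{4y} - x_{2y}) & \tfrac{L}{\ell}(x_{2x} - x_{4x}) & \tfrac{1}{4}
\end{pmatrix} u(x_3)
+
\begin{pmatrix}
\tfrac{1}{4} & \tfrac{L}{\ell}(x_{1z} - x_{3z}) & \tfrac{L}{\ell}(x_{3y} - x_{1y}) \\
\tfrac{L}{\ell}(x_{3z} - x_{1z}) & \tfrac{1}{4} & \tfrac{L}{\ell}(x_{1x} - x_{3x}) \\
\tfrac{L}{\ell}(x_{1y} - x_{3y}) & \tfrac{L}{\ell}(x_{3x} - x_{1x}) & \tfrac{1}{4}
\end{pmatrix} u(x_4)
\end{aligned}
\]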

We refer the reader to Herzog and Riedel (2015) for more details.

REFERENCES

Alexanderian, A., Petra, N., Stadler, G. and Ghattas, O. (2014). A-optimal design of experiments for infinite-dimensional Bayesian linear inverse problems with regularized ℓ0-sparsification. SIAM Journal on Scientific Computing 36: A2122–A2148, doi: 10.1137/130933381.

Alexanderian, A., Petra, N., Stadler, G. and Ghattas, O. (2016). A fast and scalable method for A-optimal design of experiments for infinite-dimensional Bayesian nonlinear inverse problems. SIAM Journal on Scientific Computing 38: A243–A272, doi: 10.1137/140992564.

Alonso, A. A., Frouzakis, C. E. and Kevrekidis, I. G. (2004a). Optimal sensor placement for state reconstruction of distributed process systems. AIChE Journal 50: 1438–1452, doi: 10.1002/aic.10121.



Alonso, A. A., Kevrekidis, I. G., Banga, J. R. and Frouzakis, C. E. (2004b). Optimal sensor location and reduced order observer design for distributed process systems. Computers & Chemical Engineering 28: 27–35, doi: 10.1016/S0098-1354(03)00175-3.

Armaou, A. and Demetriou, M. A. (2006). Optimal actuator/sensor placement for linear parabolic PDEs using spatial norm. Chemical Engineering Science 61: 7351–7367, doi: 10.1016/j.ces.2006.07.027.

Atkinson, A., Donev, A. and Tobias, R. (2007). Optimum Experimental Designs, With SAS. Oxford: Oxford University Press.

Bernstein, D. S. (2005). Matrix Mathematics: Theory, Facts, and Formulas with Application to Linear Systems Theory. Princeton, NJ: Princeton University Press.

Bertsekas, D. P. (2015). Convex Optimization Algorithms. Belmont, MA: Athena Scientific.

Bock, H. G., Körkel, S., Kostina, E. and Schlöder, J. P. (2007). Robustness aspects in parameter estimation, optimal design of experiments and optimal control. In Reactive flows, diffusion and transport. Springer, Berlin, 117–146, doi: 10.1007/978-3-540-28396-6_6.

Boley, B. A. and Weiner, J. H. (1960). Theory of thermal stresses. New York London: John Wiley & Sons Inc.

Cacuci, D. G., Navon, I. M. and Ionescu-Bujor, M. (2014). Computational Methods for Data Evaluation and Assimilation. Boca Raton: CRC Press.

Chepuri, S. P. and Leus, G. (2015). Sparsity-promoting sensor selection for non-linear measurement models. IEEE Transactions on Signal Processing 63: 684–698, doi: 10.1109/TSP.2014.2379662.

Cohen, K., Siegel, S. and McLaughlin, T. (2006). A heuristic approach to effective sensor placement for modeling of a cylinder wake. Computers & Fluids 35: 103–120, doi: 10.1016/j.compfluid.2004.11.002.

Cressie, N. and Wikle, C. K. (2011). Statistics for Spatio-Temporal Data. Hoboken, NJ: Wiley.

Diehl, M., Bock, H. G. and Kostina, E. (2006). An approximation technique for robust nonlinear optimization. Mathematical Programming 107: 213–230.

El Jai, A. and Pritchard, A. J. (1988). Sensors and Controls in the Analysis of Distributed Systems. New York, NY, USA: Halsted Press.

Eslami, M. R., Hetnarski, R. B., Ignaczak, J., Noda, N., Sumi, N. and Tanigawa, Y. (2013). Theory of elasticity and thermal stresses, Solid Mechanics and its Applications 197. Dordrecht: Springer, doi: 10.1007/978-94-007-6356-2, explanations, problems and solutions.

Fitzpatrick, B. G. (1991). Bayesian analysis in inverse problems. Inverse Problems 7: 675–702, doi: 10.1088/0266-5611/7/5/003.

Freitag, M. A. and Potthast, R. W. E. (2013). Synergy of inverse problems and data assimilation techniques. In Large scale inverse problems. De Gruyter, Berlin, Radon Ser. Comput. Appl. Math. 13, 1–53.

García, M. R., Vilas, C., Banga, J. R. and Alonso, A. A. (2007). Optimal field reconstruction of distributed process systems from partial measurements. Industrial & Engineering Chemistry Research 46: 530–539, doi: 10.1021/ie0604167.

Gejadze, I. Y., Le Dimet, F.-X. and Shutyaev, V. (2008). On analysis error covariances in variational data assimilation. SIAM Journal on Scientific Computing 30: 1847–1874, doi: 10.1137/07068744X.

Gejadze, I. Y., Le Dimet, F.-X. and Shutyaev, V. (2010). On optimal solution error covariances in variational data assimilation problems. Journal of Computational Physics 229: 2159–2178, doi: 10.1016/j.jcp.2009.11.028.

Gejadze, I. Y. and Shutyaev, V. (2012). On computation of the design function gradient for the sensor-location problem in variational data assimilation. SIAM Journal on Scientific Computing 34: B127–B147, doi: 10.1137/110825121.

Gejadze, I. Y., Shutyaev, V. and Le Dimet, F.-X. (2013). Analysis error covariance versus posterior covariance in variational data assimilation. Quarterly Journal of the Royal Meteorological Society 139: 1826–1841, doi: 10.1002/qj.2070.

Haber, E., Horesh, L. and Tenorio, L. (2008). Numerical methods for experimental design of large-scale linear ill-posed inverse problems. Inverse Problems 24: 055012, doi: 10.1088/0266-5611/24/5/055012.

Haber, E., Horesh, L. and Tenorio, L. (2010). Numerical methods for the design of large-scale nonlinear discrete ill-posed inverse problems. Inverse Problems 26: 025002, doi: 10.1088/0266-5611/26/2/025002.

Herzog, R. and Riedel, I. (2015). Sequentially optimal sensor placement in thermoelastic models for real time applications. Optimization and Engineering 16: 737–766, doi: 10.1007/s11081-015-9275-0.

Joshi, S. and Boyd, S. (2009). Sensor selection via convex optimization. IEEE Transactions on Signal Processing 57: 451–462, doi: 10.1109/TSP.2008.2007095.

Körkel, S., Kostina, E., Bock, H. G. and Schlöder, J. P. (2004). Numerical methods for optimal control problems in design of robust optimal experiments for nonlinear dynamic processes. Optimization Methods & Software 19: 327–338, doi: 10.1080/10556780410001683078, the First International Conference on Optimization Methods and Software. Part II.

Kubrusly, C. and Malebranche, H. (1985). Sensors and controllers location in distributed systems—a survey. Automatica 21: 117–128, doi: 10.1016/0005-1098(85)90107-4.

Kühl, P., Diehl, M., Kraus, T., Schlöder, J. P. and Bock, H. G. (2011). A real-time algorithm for moving horizon state and parameter estimation. Computers & Chemical Engineering 35: 71–83, doi: 10.1016/j.compchemeng.2010.07.012.

Küpper, A., Diehl, M., Schlöder, J., Bock, H. G. and Engell, S. (2009). Efficient moving horizon state and parameter estimation for SMB processes. Journal of Process Control 19: 785–802, doi: 10.1016/j.jprocont.2008.10.004.

Law, K., Stuart, A. and Zygalakis, K. (2015). Data assimilation, Texts in Applied Mathematics 62. Springer, Cham, doi: 10.1007/978-3-319-20325-6, a mathematical introduction.

Lu, Z. and Pong, T. K. (2013). Computing optimal experimental designs via interior point method. SIAM Journal on Matrix Analysis and Applications 34: 1556–1580, doi: 10.1137/120895093.

Magnus, J. R. and Neudecker, H. (1999). Matrix Differential Calculus with Applications in Statistics and Econometrics. Chichester: John Wiley & Sons, 2nd ed.

Marshall, A. W., Olkin, I. and Arnold, B. C. (2011). Inequalities: Theory of Majorization and Its Applications. New York: Springer, 2nd ed.

Mehra, R. (1974). Optimal input signals for parameter estimation in dynamic systems–survey and new results. IEEE Transactions on Automatic Control 19: 753–768, doi: 10.1109/TAC.1974.1100701.

Meo, M. and Zumpano, G. (2005). On the optimal sensor placement techniques for a bridge structure. Engineering Structures 27: 1488–1497, doi: 10.1016/j.engstruct.2005.03.015.

Mokhasi, P. and Rempfer, D. (2004). Optimized sensor placement for urban flow measurement. Physics of Fluids 16: 1758–1764, doi: 10.1063/1.1689351.

Nocedal, J. and Wright, S. (2006). Numerical Optimization. New York: Springer, 2nd ed., doi: 10.1007/978-0-387-40065-5.

Patriksson, M. (1999). Nonlinear Programming and Variational Inequality Problems: A Unified Approach. Applied Optimization. Dordrecht: Kluwer Academic Publishers.

Pázman, A. (1986). Foundations of optimum experimental design, Mathematics and its Applications (East European Series) 14. D. Reidel Publishing Co., Dordrecht, translated from the Czech.

Pronzato, L. and Pázman, A. (2013). Design of experiments in nonlinear models, Lecture Notes in Statistics 212. Springer, New York, doi: 10.1007/978-1-4614-6363-4, asymptotic normality, optimality criteria and small-sample properties.

Pukelsheim, F. (2006). Optimal design of experiments, Classics in Applied Mathematics 50. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM), reprint of the 1993 original.

Rafajłowicz, E. (1981). Design of experiments for eigenvalue identification in distributed-parameter systems. International Journal of Control 34: 1079–1094, doi: 10.1080/00207178108922583.

Rafajłowicz, E. (1986). Optimum choice of moving sensor trajectories for distributed-parameter system identification. International Journal of Control 43: 1441–1451, doi: 10.1080/00207178608933550.

Silvey, S. (1980). Optimal Design: An Introduction to the Theory for Parameter Estimation. London: Chapman and Hall.

Silvey, S., Titterington, D. and Torsney, B. (1978). An algorithm for optimal designs on a design space. Communications in Statistics - Theory and Methods 7: 1379–1389, doi: 10.1080/03610927808827719.

Song, Z., Chen, Y., Sastry, C. R. and Tas, N. C. (2009). Optimal Observation for Cyber-physical Systems: A Fisher-information-matrix-based Approach. London: Springer.

Sun, N.-Z. and Sun, A. (2015). Model Calibration and Parameter Estimation: For Environmental and Water Resource Systems. New York: Springer.

Torsney, B. (2009). W-iterations and ripples therefrom. In Optimal design and related areas in optimization and statistics. Springer, New York, Springer Optim. Appl. 28, 1–12, doi: 10.1007/978-0-387-79936-0_1.

Uciński, D. (2005). Optimal measurement methods for distributed parameter system identification. Systems and Control Series. Boca Raton, FL: CRC Press.

Uciński, D. and Patan, M. (2007). D-optimal design of a monitoring network for parameter estimation of distributed systems. Journal of Global Optimization 39: 291–322, doi: 10.1007/s10898-007-9139-z.

Uspenskii, A. B. and Fedorov, V. V. (1975). Computational Aspects of the Least-Squares Method in the Analysis and Design of Regression Experiments. Moscow: Moscow University Press, (In Russian).

Vogel, C. R. (2002). Computational Methods for Inverse Problems. Philadelphia, PA: Society for Industrial and Applied Mathematics, doi: 10.1137/1.9780898717570.

Willcox, K. (2006). Unsteady flow sensing and estimation via the gappy proper orthogonal decomposition. Computers & Fluids 35: 208–226, doi: 10.1016/j.compfluid.2004.11.006.

Yi, T.-H., Li, H.-N. and Gu, M. (2011). Optimal sensor placement for structural health monitoring based on multiple optimization strategies. The Structural Design of Tall and Special Buildings 20: 881–900, doi: 10.1002/tal.712.

Yildirim, B., Chryssostomidis, C. and Karniadakis, G. (2009). Efficient sensor placement for ocean measurements using low-dimensional concepts. Ocean Modelling 27: 160–173, doi: 10.1016/j.ocemod.2009.01.001.

Yu, Y. (2010). Monotonic convergence of a general algorithm for computing optimal designs. The Annals of Statistics 38: 1593–1606, doi: 10.1214/09-AOS761.
