Automatica, Vol. 5, pp. 731-739. Pergamon Press, 1969. Printed in Great Britain.

A Parameter-Adaptive Control Technique*

Une technique de commande à adaptation de paramètres

Ein Verfahren der Parameter-adaptiven Regelung

Техника управления с адаптацией параметров

G. STEIN† and G. N. SARIDIS‡

An approximate solution of the functional equation of dynamic programming has been used to develop a simple adaptive controller for linear stochastic systems with unknown parameters.

Summary: Control of linear stochastic systems with unknown parameters is accomplished by means of an approximate solution of the associated functional equation of dynamic programming. The approximation is based on repeated linearizations of a quadratic weighting matrix appearing in the optimal cost function for the control process. This procedure leads to an adaptive control system which is linear in an expanded vector of state estimates, with feedback gains which are explicit functions of a posteriori parameter probabilities. The performance of this controller is illustrated with a simple example.

1. INTRODUCTION

FOR SOME years now the dual control formulation [5, 6] and the dynamic programming formulation [2, 3] for the so-called "optimal adaptive control problem" have been available to solve control problems involving certain unknown quantities such as parameters of the system's mathematical model, parameters of the statistical descriptions for various disturbances affecting the system, or entire functional relationships involved in the mathematical representation of the control problem. Various efforts have been made to utilize these formulations and to modify them for numerous special circumstances [1, 11, 13]. Only limited success, however, has been achieved in dealing with the significant analytical complexities and computational burdens associated with both formulations.

This paper considers a special case of the optimal adaptive control problem for which it is possible to exploit a simple approximation technique to obtain an approximate solution of the functional equation associated with the dynamic programming formulation. The adaptive control problem itself is formulated in section 2 of the paper, followed by a discussion of the approximation technique in section 3 and the resulting adaptive control system in section 4. The solution is then illustrated with a simple example in section 5.

* Received 14 February 1969; revised 26 May 1969. The original version of this paper was presented at the 4th IFAC Congress which was held in Warsaw, Poland during June 1969. It was recommended for publication in revised form by associate editor P. Dorato.

† Honeywell Inc., Systems & Research Div., Research Dept., 2345 Walnut Street, St. Paul, Minnesota 55113, USA.

‡ Purdue University, Department of Electrical Engineering, West Lafayette, Indiana, USA.

2. A FORMULATION OF THE ADAPTIVE CONTROL PROBLEM

Let the system be described by the following linear, discrete-time, stochastic model,

$$x(k+1) = A(\alpha, k)\,x(k) + B(\alpha, k)\,u(k) + F(\alpha, k)\,\xi(k), \qquad k = 0, 1, \ldots, N-1, \qquad \alpha \in \Omega_\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_s\} \tag{1}$$

with the measurement equation

$$y(k) = C(\alpha, k)\,x(k) + D(\alpha, k)\,\eta(k), \qquad k = 1, 2, \ldots, N. \tag{2}$$

The vector $x(k)$ is an $n$-vector of state variables defined at time instant $t_k$, $u(k)$ is an unconstrained $m$-vector of control inputs, and $y(k)$ is an $r$-vector of measured outputs. The $r_1$-vectors $\xi(k)$ and the $r_2$-vectors $\eta(k)$ form two independent sequences of independent, identically distributed Gaussian random vectors:

$$\xi(k) \sim N(0, I_{r_1}), \qquad I_{r_1} = E\{\xi(k)\,\xi^T(k)\}, \qquad k = 0, 1, \ldots, N-1$$

$$\eta(k) \sim N(0, I_{r_2}), \qquad I_{r_2} = E\{\eta(k)\,\eta^T(k)\}, \qquad k = 1, 2, \ldots, N. \tag{3}$$

Similarly, the system's initial state x(0) is assumed to be a Gaussian distributed random vector:

$$x(0) \sim N[\mu(0), P(0)], \qquad P(0) = E\{[x(0) - \mu(0)]\,[x(0) - \mu(0)]^T\}. \tag{4}$$

The quantities $A(\alpha, k)$, $B(\alpha, k)$, $F(\alpha, k)$, $C(\alpha, k)$ and $D(\alpha, k)$ are matrices with appropriate dimensions whose elements are arbitrary but known functions of the time index $k$ and of the $l$-vector $\alpha$. The vector $\alpha$ consists of unknown system parameters. It is assumed to belong to the finite set $\Omega_\alpha$ and to be constant on the control interval $k = 0, 1, \ldots, N$.
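As an illustration only, a scalar, time-invariant instance of models (1)-(4) might be simulated as in the following Python sketch. The function name `simulate`, the two-element parameter set, and all numerical values are assumptions introduced for this example; they do not come from the paper.

```python
import random

def simulate(a, b, f, c, d, controls, x0_mean=0.0, x0_var=1.0, seed=0):
    """Simulate a scalar, time-invariant instance of models (1)-(2).

    Returns the states x(0..N) and the measurements y(1..N).
    """
    rng = random.Random(seed)
    x = rng.gauss(x0_mean, x0_var ** 0.5)    # x(0) ~ N[mu(0), P(0)], eq. (4)
    xs, ys = [x], []
    for u in controls:                        # k = 0, 1, ..., N-1
        xi = rng.gauss(0.0, 1.0)              # unit-variance process noise, eq. (3)
        x = a * x + b * u + f * xi            # state equation (1)
        eta = rng.gauss(0.0, 1.0)             # unit-variance measurement noise
        ys.append(c * x + d * eta)            # measurement equation (2) at k+1
        xs.append(x)
    return xs, ys

# Hypothetical finite parameter set: alpha selects the pole a.
alphas = [0.5, 0.9]
true_alpha = alphas[1]                        # unknown to the controller
xs, ys = simulate(a=true_alpha, b=1.0, f=0.1, c=1.0, d=0.2,
                  controls=[0.0] * 10)
```

Here the unknown parameter enters only through the coefficient `a`; in the general model every matrix may depend on $\alpha$ and on $k$.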

The adaptive control problem for this system consists of finding a sequence of control inputs {u(k), k=0, 1, 2 , . . . , N - l} as functions of the available measurements,

$$u(k) = f_k(y^k), \qquad y^k = \{y(1), y(2), \ldots, y(k)\}, \qquad k = 0, 1, \ldots, N-1 \tag{5}$$

such that the following average cost function is minimized:

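$$J = E\left\{\sum_{k=1}^{N} \left[\, \|x(k)\|^2_{Q(\alpha, k)} + \|u(k-1)\|^2_{R(\alpha, k)} \right]\right\}. \tag{6}$$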

The symmetric matrices $Q(\alpha, k)$ and $R(\alpha, k)$ represent the relative weights to be placed upon various components of state and control deviations. Their dependence on $\alpha$ is included to reflect the empirical fact that quadratic weights are not chosen a priori but rather are chosen to suit the particular plant and the designer's overall conception of satisfactory performance.

The following additional assumptions are made:

(i) $D(\alpha, k)\,D^T(\alpha, k) > 0$,

(ii) $Q(\alpha, k) = Q^T(\alpha, k) \ge 0$,

(iii) $R(\alpha, k) = R^T(\alpha, k) > 0$,

for all $\alpha \in \Omega_\alpha$ and $k = 1, 2, \ldots, N$; (7)

(iv) an a priori discrete probability distribution function $q(0)$ for the vector $\alpha$ is available, where $q(0)$ is an $s$-vector with components

$$0 \le q_i(0) = \mathrm{Prob}[\alpha = \alpha_i] \le 1, \qquad i = 1, 2, \ldots, s,$$

satisfying

$$\sum_{i=1}^{s} q_i(0) = 1.$$

Since a feedback control of the form (5) is desired, the method of dynamic programming will be used to minimize criterion (6).

Define the "optimal return function":

$V(y^k, N-k) \triangleq$ cost of an $(N-k)$-stage adaptive control process using the optimal control sequence $\{u^*(k), u^*(k+1), \ldots, u^*(N-1)\}$ based upon a priori information (4) and (7 iv) and upon the measurement sequence $y^k = \{y(1), y(2), \ldots, y(k)\}$.

Applying the "Principle of Optimality" [2], the optimal return function obeys the following recursive functional equation:

$$V(y^k, N-k) = \min_{u(k)} E\left\{ \|x(k+1)\|^2_Q + \|u(k)\|^2_R + V(y^{k+1}, N-k-1) \,\middle|\, y^k \right\} \tag{8}$$

with $V(y^N, 0) \equiv 0$ (with probability one).

In this equation, $E\{\ldots \mid y^k\}$ denotes the mathematical expectation conditioned on the sequence $y^k$ and on the a priori data (4) and (7 iv). The dependence of $Q$ and $R$ upon parameters $\alpha$ and time index $k$ has been suppressed. As a matter of convenience, this practice is continued in all subsequent derivations.

Equation (8) may be solved backwards, starting with a one-stage process:

$$V(y^{N-1}, 1) = \min_{u(N-1)} E\left\{ \|x(N)\|^2_Q + \|u(N-1)\|^2_R \,\middle|\, y^{N-1} \right\}. \tag{9}$$

The conditional expectation of equation (9) can be expressed as

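$$E\{(\ldots) \mid y^{N-1}\} = \sum_{i=1}^{s} \mathrm{Prob}[\alpha = \alpha_i \mid y^{N-1}]\; E\{(\ldots) \mid \alpha = \alpha_i,\, y^{N-1}\} = \sum_{i=1}^{s} q_i(N-1)\, E\{(\ldots) \mid \alpha = \alpha_i,\, y^{N-1}\} \tag{10}$$

where $q_i(N-1)$, $i = 1, 2, \ldots, s$, is the a posteriori probability distribution of the parameter vector $\alpha$ based upon measurements $y^{N-1}$. So (9) becomes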

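$$V(y^{N-1}, 1) = \min_{u(N-1)} \sum_{i=1}^{s} q_i(N-1)\, E\left\{ \|x(N)\|^2_Q + \|u(N-1)\|^2_R \,\middle|\, \alpha = \alpha_i,\, y^{N-1} \right\}$$

$$\phantom{V(y^{N-1}, 1)} = \min_{u(N-1)} \sum_{i=1}^{s} q_i(N-1)\, E\left\{ \|x(N)\|^2_{Q(\alpha_i)} + \|u(N-1)\|^2_{R(\alpha_i)} \,\middle|\, \alpha = \alpha_i,\, y^{N-1} \right\}. \tag{11}$$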
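The weighting in (10) and (11) relies on the a posteriori probabilities $q_i$. A minimal Python sketch of how such probabilities could be propagated by Bayes' rule, and then used to form the weighted expectation of (10), is given below. The recursion is the standard Bayes update rather than a formula quoted from this paper, and the function names and likelihood values are hypothetical.

```python
def update_posterior(q_prev, likelihoods):
    """Bayes rule: q_i(k) is proportional to p(y(k) | alpha_i, y^{k-1}) * q_i(k-1)."""
    weighted = [q * lik for q, lik in zip(q_prev, likelihoods)]
    total = sum(weighted)
    if total <= 0.0:
        raise ValueError("all candidate parameter values have zero likelihood")
    return [w / total for w in weighted]

def mixture_expectation(q, conditional_values):
    """Equation (10): E{. | y} = sum_i q_i * E{. | alpha = alpha_i, y}."""
    return sum(qi * v for qi, v in zip(q, conditional_values))

# Hypothetical numbers: a uniform prior over two candidate parameter values
# and the likelihood of the newest measurement under each candidate.
q = update_posterior([0.5, 0.5], [0.2, 0.6])
expected = mixture_expectation(q, [1.0, 3.0])
```

With these illustrative numbers, the second candidate receives three times the weight of the first, since its likelihood is three times larger and the prior is uniform.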

Replacing $x(N)$ by the system equation (1) for each value $\alpha = \alpha_i$ and carrying out the expectations and minimization, it can now be readily verified [12] that $V(y^{N-1}, 1)$ is quadratic and that the corresponding optimal control is linear in terms of the following expanded vector of state estimates:

$$\hat{x}(k) \triangleq [\mu^T(\alpha_1, k),\ \mu^T(\alpha_2, k),\ \ldots,\ \mu^T(\alpha_s, k)]^T \tag{12}$$

where

$$\mu(\alpha_i, k) \triangleq E\{x(k) \mid \alpha = \alpha_i,\, y^k\}, \qquad i = 1, 2, \ldots, s.$$
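For the linear Gaussian model (1)-(4), each conditional mean $\mu(\alpha_i, k)$ can be produced by a Kalman filter designed for the fixed value $\alpha = \alpha_i$. The Python sketch below shows one such step for a scalar, time-invariant model and stacks the results as in (12). The function names, the dictionary layout, and the numerical values are assumptions made for illustration, not notation from the paper.

```python
import math

def kalman_step(mu, P, u, y_next, a, b, f, c, d):
    """One predict/update step of a scalar Kalman filter for a fixed alpha_i.

    mu, P: conditional mean and variance of x(k) given alpha_i and y^k.
    Returns the mean and variance of x(k+1) given y^{k+1}, together with
    the likelihood of y(k+1) under this model.
    """
    mu_pred = a * mu + b * u              # predicted mean of x(k+1)
    P_pred = a * a * P + f * f            # predicted variance
    innov = y_next - c * mu_pred          # innovation
    S = c * c * P_pred + d * d            # innovation variance, > 0 if d != 0
    K = P_pred * c / S                    # Kalman gain
    mu_next = mu_pred + K * innov
    P_next = (1.0 - K * c) * P_pred
    lik = math.exp(-0.5 * innov * innov / S) / math.sqrt(2.0 * math.pi * S)
    return mu_next, P_next, lik

def expanded_estimate(mus, Ps, u, y_next, models):
    """Advance every model-conditioned filter and stack the means as in (12)."""
    results = [kalman_step(mu, P, u, y_next, **m)
               for mu, P, m in zip(mus, Ps, models)]
    x_hat = [r[0] for r in results]       # expanded vector of state estimates
    variances = [r[1] for r in results]
    likelihoods = [r[2] for r in results]
    return x_hat, variances, likelihoods

# Hypothetical two-model bank differing only in the coefficient a.
models = [dict(a=0.5, b=1.0, f=0.1, c=1.0, d=0.2),
          dict(a=0.9, b=1.0, f=0.1, c=1.0, d=0.2)]
x_hat, Ps, liks = expanded_estimate([0.0, 0.0], [1.0, 1.0], u=0.0,
                                    y_next=0.3, models=models)
```

The per-model likelihoods returned here are the quantities a Bayes update of the a posteriori probabilities would consume.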

The one-stage cost and control are

$$V(y^{N-1}, 1) = \|\hat{x}(N-1)\|^2_{S[q(N-1),\,1]} + T[q(N-1), 1] \tag{13}$$

$$u^*(N-1) = -G[q(N-1), 1]\,\hat{x}(N-1) \tag{14}$$

where matrices $S$ and $G$ and scalar $T$ are non-linear functions of the a posteriori distribution $q_i(N-1)$, $i = 1, 2, \ldots, s$. These are defined in the Appendix, equations (A.1), (A.2), and (A.3).
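To make the structure of (14) concrete, consider a scalar, time-invariant model with coefficients $a_i$, $b_i$ and weights $Q_i$, $R_i$ for each $\alpha_i$. Setting the derivative of the bracketed sum in (11) with respect to $u(N-1)$ to zero gives $u^* = -\sum_i q_i Q_i a_i b_i \mu_i \,/\, \sum_j q_j (R_j + Q_j b_j^2)$, which is linear in the expanded estimate and non-linear in $q$. The Python sketch below implements this scalar special case. It is derived here directly from (11) and is not a transcription of the Appendix formulas (A.1)-(A.3); the function name and data layout are hypothetical.

```python
def one_stage_control(q, mus, models):
    """Minimize sum_i q_i E{Q_i x(N)^2 + R_i u^2 | alpha_i, y^{N-1}} over u.

    q:      a posteriori probabilities q_i(N-1)
    mus:    conditional means mu(alpha_i, N-1)
    models: one dict per alpha_i with scalar entries a, b, Q, R
    Returns (u_star, gains) with u_star = -sum_i gains[i] * mus[i].
    """
    denom = sum(qi * (m["R"] + m["Q"] * m["b"] ** 2)
                for qi, m in zip(q, models))
    gains = [qi * m["Q"] * m["a"] * m["b"] / denom
             for qi, m in zip(q, models)]
    u_star = -sum(g * mu for g, mu in zip(gains, mus))
    return u_star, gains

# Hypothetical example: two candidate models with equal probabilities.
models = [dict(a=0.5, b=1.0, Q=1.0, R=1.0),
          dict(a=0.9, b=1.0, Q=1.0, R=1.0)]
u_star, gains = one_stage_control([0.5, 0.5], [2.0, 2.0], models)
```

The noise and estimation-error variances do not affect the minimizing control in this one-stage case; they contribute only to the additive term of the cost, in the role played by $T$ in (13).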

It is now evident that the vectors $\hat{x}(k)$ and $q(k)$ constitute a set of "sufficient coordinates" [15] for the adaptive control problem formulated in equations (1)-(7). The optimal return function can be expressed as

$$V(y^k, N-k) = V[\hat{x}(k), q(k), N-k]$$

and the functional equation (8) becomes

$$V[\hat{x}(k), q(k), N-k] = \min_{u(k)} \sum_{i=1}^{s} q_i(k)\, E\left\{ \|x(k+1)\|^2_Q + \|u(k)\|^2_R + V[\hat{x}(k+1), q(k+1), N-k-1] \,\middle|\, \alpha = \alpha_i,\, \hat{x}(k) \right\} \tag{15}$$

with $V[\hat{x}(N), q(N), 0] \equiv 0$ (with probability one). The existence of sufficient coordinates reduces the dependence of $V(\ldots)$ upon a growing number of variables ($y^k$) to the dependence upon a constant and finite number of variables [$\hat{x}(k)$, $q(k)$].

Equation (15) can be used to continue the dynamic programming solution, starting with the quadratic return function $V[\hat{x}(N-1), q(N-1), 1]$ of (13). As defined by equation (A.2), however, the matrix $S[q(N-1), 1]$ is a non-linear function of the a posteriori distribution $q_i(N-1)$, $i = 1, 2, \ldots, s$. This fact prevents the successful completion of the solution in closed form. The function $V[\hat{x}(N-2), q(N-2), 2]$ and all subsequent optimal return functions are no longer expressible in terms of quadratics or in terms of other similarly convenient functional forms.

The solution of (15) must therefore be obtained by numerical techniques [10] or by approximation methods. Because the computing time and memory requirements of numerical solutions are prohibitive for all but the simplest problems, the following discussion will deal with an approximation method which is based upon a very intuitive and appealing linearization tec