Model Following Neuro-Adaptive Control Design for Non-square, Non-affine
Nonlinear Systems
Radhakant Padhi 1, Nishant Unnikrishnan 2, S. N. Balakrishnan 3
1Department of Aerospace Engineering, Indian Institute of Science – Bangalore, India
2-3Department of Mechanical and Aerospace Engineering, University of Missouri – Rolla, USA
Abstract
This paper proposes a new model-following adaptive control design technique for a class of non-affine and non-
square nonlinear systems using neural networks. An appropriate stabilizing controller is assumed available for a
nominal system model. This nominal controller may not be able to guarantee stability/satisfactory performance in
the presence of unmodeled dynamics (neglected algebraic terms in the mathematical model) and/or parameter
uncertainties present in the system model. In order to ensure stable behavior, an online control adaptation procedure
is proposed in this paper. The controller design is carried out in two steps: (i) synthesis of a set of neural networks
which capture matched unmodeled (neglected) dynamics or model uncertainties due to parametric variations and (ii)
synthesis of a controller that drives the state of the actual plant to that of a desired nominal model. The neural
network weight update rule is derived using Lyapunov theory, which guarantees both stability of the error dynamics
(in a practical stability sense) and boundedness of the weights of the neural networks. A desirable characteristic of
the adaptation procedure presented in this paper is that it is independent of the technique used to design the nominal
controller, and hence can be used in conjunction with any known control design technique. Numerical results for
two challenging illustrative problems are presented in this paper, which demonstrate these features and clearly bring
out the potential of the proposed approach.
Keywords: Nonlinear systems, Lyapunov stability, Neural Networks, Online adaptation, Unmodeled dynamics
1 Asst. Professor, Email: [email protected], Tel: +91-80-2293-2756, Fax: +91-80-2360-0134
2 Ph.D. Student, Email: [email protected]
3 Professor, Email: [email protected], Tel: +1-573-341-4675, Fax: +1-573-341-4607
1. Introduction
The field of artificial neural networks and its application to control systems has seen phenomenal growth in the last
two decades. The origin of research on artificial neural networks can be traced back to the 1940s [1]. In 1990, a
compiled book was published [2] detailing various applications of artificial neural networks. A good survey paper
appeared in 1992 [3], which outlined various applications of artificial neural networks to control system design. The
main philosophy that is exploited in system theory applications is the universal function approximation property of
neural networks [4]. Benefits of using neural networks for control applications include their ability to effectively
control nonlinear plants while adapting to unmodeled dynamics and time-varying parameters.
In 1990, a paper by Narendra and Parthasarathy demonstrated the potential and applicability of neural networks for
the identification and control of nonlinear dynamical systems [5]. The authors suggested various architectures as
well as learning algorithms useful for identification and adaptive control of nonlinear dynamic systems using
recurrent neural networks. Since then, Narendra and his co-workers have come up with a variety of useful adaptive
control design techniques using neural networks, including applications concerning multiple models [6].
In 1992, Sanner and Slotine [7] developed a direct adaptive tracking control architecture with Gaussian Radial Basis
Function (RBF) networks to compensate for plant nonlinearities. The update process also kept the weights of the
neural networks bounded. In 1996, Lewis et al. [8] proposed an online neural network that approximated unknown
functions and it was used in designing a controller for a robot. Their approach avoided some of the limiting
assumptions (like linearized models) of traditional adaptive control techniques. More importantly, their theoretical
development also provided a Lyapunov stability analysis that guaranteed both tracking performance as well as
boundedness of weights. However, the applicability of this technique was limited to systems which could be
expressed in the “Brunovsky form” [9] and which were affine in the control variable (in state space form). A robust
adaptive output feedback controller for SISO systems with bounded disturbance was studied by Aloliwi and Khalil
[10]. In a more recent paper, an adaptive output feedback control scheme for the output tracking of a class of
nonlinear systems was presented by Seshagiri and Khalil using RBF neural networks [11].
A relatively simpler and popular method of nonlinear control design is the technique of dynamic inversion (e.g. [12,
13, 14]), which is essentially based on the philosophy of feedback linearization [9, 15]. In this approach, an
appropriate co-ordinate transformation is carried out to make the system dynamics take a linear form. Linear control
design tools are then used to synthesize the controller. A drawback of this approach is its sensitivity to modeling
errors and parameter inaccuracies. One way of addressing the problem is to augment the dynamic inversion
technique with the H∞ robust control theory [14]. Important contributions have come from Calise and his co-
workers in a number of publications (e.g. [16 - 20]), who have proposed to augment the dynamic inversion
technique with neural networks so that the inversion error is cancelled out. The neural networks are trained online
using a Lyapunov-based approach (similar to the approach followed in [7] and [8]). This basic idea has been
extended to a variety of cases, namely output based control design [19, 20], reconfigurable control design [21] etc.
The feasibility and usefulness of this technique has been demonstrated in a number of applications in the field of
flight control.
There is a need to develop a universal control design technique to address modeling error issues, which can be
applicable not only with dynamic inversion, but also with any other control design technique. Almost all techniques
mentioned in the literature are applicable only to certain classes of nonlinear systems (control-affine systems, SISO
systems, etc.). In this context, a more powerful tool which can be applied to nonlinear systems in a general form will
be useful in solving a wide variety of control problems in many engineering applications. Keeping these issues in
mind, an approach was first presented in [22], where the authors followed a model-following philosophy. The idea
presented in this reference is to design an 'extra control' online, which, when added to a nominal controller, leads to
overall stability and improves the overall performance of the plant.
The objective of this paper is to present a new systematic approach relying on the philosophy presented in [22] to
address the important issues pointed out in the previous paragraph. In this paper, the controller design is carried out
in two steps: (i) synthesis of a set of neural networks which capture matched unmodeled dynamics (due to neglected
algebraic terms) or due to uncertainties in the parameters and (ii) computation of a controller that drives the state of
the actual plant to that of a desired nominal model. The neural network weight update rule is derived using a
Lyapunov-based approach, which guarantees both stability of the error dynamics (in a practical stability sense) as
well as boundedness of weights of the neural networks. Note that this technique is applicable for non-square and
non-affine systems. In this study, numerical results from two challenging problems are presented. The results
obtained from this research bring out the potential of this new approach.
The rest of the paper is organized as follows. In Section 2, the control design technique is laid out and the two-step
process proposed in this paper is explained. The neural network structure is discussed and the derivation of the
weight update rule is outlined in this section. Simulation studies on two challenging nonlinear problems were
carried out. Simulation details and promising results are presented in Section 3. Appropriate conclusions are drawn
in Section 4.
2. Control Design Procedure
The control design procedure proposed in this paper has two main components. In one part, assuming a neural
network approximation of the unknown algebraic function (that represents the unmodeled dynamics and/or
parametric uncertainties), the objective is to get the solution for the control variable that guarantees model-
following. In the other part, the aim is to train the weights of the neural networks, in a stable manner, so as to
capture the unknown function that is part of the plant dynamics. These two components of the control design
procedure are discussed in detail in the following subsections.
2.1 Problem Description
Consider a nonlinear system which has the following representation:

$$\dot{X}_d = f^*(X_d, U_d) \tag{1}$$

where $X_d \in \mathbb{R}^n$ is the desired state vector and $U_d \in \mathbb{R}^m$ $(m \le n)$ is the nominal control vector of a nominal system.
It is assumed that the order of the system n is known and a satisfactory nominal controller dU has been designed
using some standard method (e.g. Dynamic inversion technique, Optimal control theory, Lyapunov theory etc.) such
that this controller meets some desired performance goal. However, Eq. (1) may not truly represent the actual plant
because (i) there may be neglected algebraic terms in this mathematical model (this study is restricted to this class of
unmodeled dynamics) and (ii) the numerical values of the parameters may not perfectly represent the actual plant
and this error results in unknown functions in the model. As a consequence, the actual plant is assumed to have the
following structure
$$\dot{X} = f(X,U) + d(X) \tag{2}$$

where $X \in \mathbb{R}^n$ is the state of the actual plant and $U \in \mathbb{R}^m$ is the modified controller. The unknown algebraic
function $d(X) \in \mathbb{R}^n$ arises because of the two reasons mentioned above. Note that the two functions $f^*(X_d, U_d)$
and $f(X,U)$ may or may not have the same algebraic expressions. However, $f(X,U)$ contains the known part of the
dynamics of Eq. (2). The task here is to design a modified controller $U$ online in such a way that the states of the
actual plant track the respective states of the nominal model. In other words, the goal is to ensure that $X \to X_d$ as
$t \to \infty$, which ensures that the actual system performs like the nominal system. As a means to achieve this, the aim
is to first capture the unknown function $d(X)$, which is accomplished through a neural network approximation
$\hat{d}(X)$. A necessary intermediate step towards this end is the definition of an 'approximate system' as follows:

$$\dot{X}_a = f(X,U) + \hat{d}(X) + (X - X_a), \qquad X_a(0) = X(0) \tag{3}$$

Through this artifice, one can ensure that $X \to X_a \to X_d$ as $t \to \infty$. Obviously this introduces two tasks: (i)
ensuring $X \to X_a$ as $t \to \infty$ and (ii) ensuring $X_a \to X_d$ as $t \to \infty$. The reason for choosing an approximate system
of the form in Eq. (3) is to facilitate meaningful bounds on the errors and weights.
2.2 Control Solution (Ensuring $X_a \to X_d$)
In this loop, it is assumed that a neural network approximation of the unknown function $d(X)$ is available. The
goal in this step is to drive $X_a \to X_d$ as $t \to \infty$, which is achieved by enforcing the following first-order
asymptotically stable error dynamics:

$$(\dot{X}_a - \dot{X}_d) + K(X_a - X_d) = 0 \tag{4}$$

where a positive definite gain matrix $K$ is chosen. A relatively easy way of choosing the gain matrix is to have

$$K = \mathrm{diag}(1/\tau_1, \ldots, 1/\tau_n) \tag{5}$$

where $\tau_i,\ i = 1, \ldots, n$ can be interpreted as the desired time constant for the $i$th channel of the error dynamics in
Eq. (4). Substituting Eqs. (1) and (3) in Eq. (4) leads to

$$f(X,U) + \hat{d}(X) + (X - X_a) - f^*(X_d, U_d) + K(X_a - X_d) = 0 \tag{6}$$

Solving for $f(X,U)$ from Eq. (6),

$$f(X,U) = b(X, X_a, X_d, U_d) \tag{7}$$

where

$$b(X, X_a, X_d, U_d) \triangleq f^*(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}(X) \tag{8}$$
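The role of the gain of Eq. (5) can be checked numerically: each channel of the error dynamics of Eq. (4) decays with its own time constant. The following illustrative sketch (the time constants, initial error, and Euler integration step are assumptions made up for the demo, not from the paper) verifies this for a two-channel example.

```python
# Illustrative sketch of Eqs. (4)-(5): with K = diag(1/tau_1, ..., 1/tau_n),
# each channel of the error e = X_a - X_d obeys e_dot = -e/tau_i.
# The time constants, initial error, and step size below are assumptions.
import numpy as np

tau = np.array([0.5, 2.0])           # desired time constants per channel
K = np.diag(1.0 / tau)               # Eq. (5)

dt = 1e-4
e = np.array([1.0, -3.0])            # initial tracking error (made up)
for _ in range(int(tau[0] / dt)):    # integrate e_dot = -K e for tau_1 seconds
    e = e + dt * (-K @ e)

# After one time constant, channel 1 has decayed by a factor of ~ e^-1.
print(e[0])   # ≈ 0.368
```

After $\tau_1$ seconds the first channel has shrunk to roughly $e^{-1}$ of its initial value, while the slower channel has decayed only by a factor of $e^{-\tau_1/\tau_2}$.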
The next step is to solve for the control $U$ from Eq. (7). A few different cases and issues need to be considered in
this context, which are discussed next.
Case 1:
If the following conditions are satisfied:
• The system is square; i.e. $m = n$
• The system dynamics is affine in the control variable; i.e. $f(X,U)$ can be written as

$$f(X,U) = f_1(X) + g_1(X)U \tag{9}$$

• $g_1(X) \in \mathbb{R}^{n \times n}$ is nonsingular $\forall t$

then from Eqs. (7)-(9), $U$ can be obtained in a straightforward manner as

$$U = g_1^{-1}(X)\left\{ b(X, X_a, X_d, U_d) - f_1(X) \right\} \tag{10}$$
Case 2:
The question is what if the system is control-affine but non-square? Two cases may arise; i.e. either $m > n$ or
$m < n$. If $m > n$, a technique that can be made use of is linear programming. Linear programming is the process
of optimizing a linear objective function subject to a finite number of linear equality and inequality constraints [23].
Control allocation problems in the face of redundant controllers have been dealt with successfully using linear
programming in aerospace applications, as shown in [24]. However, if $m < n$, which is usually the case in many
engineering applications, a novel method of introducing extra variables to augment the control vector to make it
square is proposed. This technique leads to a square problem that facilitates a solution. From this solution, the
components of the augmented control vector that represent the actual controller can be extracted. This idea is
elaborated in the following paragraphs.
When $m < n$, the number of equations is more than the number of control variables and Eq. (6) cannot be solved for
$U$. In order to find a solution, a slack-variable vector $U_s$ is introduced first. Next, an $n \times (n-m)$ matrix $\Psi(X)$ is
designed and $\Psi(X)U_s$ is added to the right hand side of the approximate system (Eq. (3)) to get

$$\dot{X}_a = f(X,U) + \Psi(X)U_s + \hat{d}_a(X, U_s) + (X - X_a), \qquad X_a(0) = X(0) \tag{11}$$
The following quantities are defined:

$$V \triangleq [U^T \;\; U_s^T]^T \tag{12}$$

$$f_a(X,V) \triangleq f(X,U) + \Psi(X)U_s \tag{13}$$

$$\hat{d}_a(X, U_s) \triangleq \hat{d}(X) - \Psi(X)U_s \tag{14}$$

Using the definitions in Eqs. (12)-(14), Eq. (11) can be expressed as

$$\dot{X}_a = f_a(X,V) + \hat{d}_a(X, U_s) + (X - X_a), \qquad X_a(0) = X(0) \tag{15}$$

Note that Eq. (15) defines a square system in $X$ and $V$ and therefore it is feasible to get a solution for $V$. The first
$m$ elements of $V$ represent $U$. As a part of this process the control designer needs to obtain $\hat{d}_a(X, U_s)$.
A neural network is used in this study for this purpose. $\hat{d}_{a_i}(X, U_s)$ can be obtained as the output of a neural
network represented by $\hat{W}_i^T \Phi_i(X, U_s)$. Here $\hat{W}_i$ and $\Phi_i$ are the weight vector and basis function vector of a neural
network, respectively. $[X^T \;\; U_s^T]^T$ is the input vector to the neural network. The subscript $i$ stands for each state
of the plant model; i.e., each state equation has a separate neural network associated with it.
Similar to the expression in Eq. (4), the error dynamic equation for a control-affine but non-square system can be
written as

$$f_1(X) + [g_1(X) \;\; \Psi(X)] \begin{bmatrix} U \\ U_s \end{bmatrix} + \hat{d}_a(X, U_s) + (X - X_a) - f^*(X_d, U_d) + K(X_a - X_d) = 0 \tag{16}$$

$$f_1(X) + [g_1(X) \;\; \Psi(X)]\, V + \hat{d}_a(X, U_s) + (X - X_a) - f^*(X_d, U_d) + K(X_a - X_d) = 0 \tag{17}$$

This leads to the solution

$$V = G^{-1}(X)\, b_s(X, X_a, X_d, U_d, U_s) \tag{18}$$

where

$$G(X) \triangleq [g_1(X) \;\; \Psi(X)] \tag{19}$$

$$b_s(X, X_a, X_d, U_d, U_s) \triangleq f^*(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}_a(X, U_s) - f_1(X) \tag{20}$$

$$\hat{d}_{a_i}(X, U_s) = \hat{W}_i^T \Phi_i(X, U_s), \qquad i = 1, \ldots, n \tag{21}$$
Note that the function $\Psi(X)$ should be chosen carefully such that the square matrix $G(X)$ does not become
singular. Choosing such a function $\Psi(X)$, however, is problem dependent and care should be taken while choosing
it.

It has to be noted that this formulation results in a fixed-point problem in the control solution because the control
vector $V$ contains the vector $U_s$, and the control solution equation (Eq. (18)) also contains $U_s$ on the right hand
side. In Eq. (18), $U_s$ is an input to the neural network that approximates the uncertain function. The solution for $V$ is
obtained numerically as $V_{k+1} = G^{-1}\left( H - \hat{d}_a(X, V_k) \right)$, where $H$ collects the terms of $b_s$ that do not depend on the
network output. The validity of this solution has been proved using the contraction mapping theorem (see Appendix A).
The proof, together with the conditions required for the existence of a unique control solution, is given for the most
general case, i.e. the non-square, non-affine case.
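The fixed-point iteration above can be sketched numerically. In this illustrative example, the matrices $g_1$ and $\Psi$, the frozen vector $H$, and the stand-in for the network output $\hat{d}_a(X, U_s)$ are all assumptions made up for the demo, not the paper's plant; the iteration converges here because the stand-in uncertainty has a small Lipschitz constant, which is the flavor of condition the contraction mapping argument requires.

```python
# Sketch of the fixed-point control solution of Eq. (18) for a 2-state,
# 1-control system (m = 1, n = 2). g1, Psi, H, and d_hat are illustrative
# assumptions; d_hat stands in for the network output d_hat_a(X, U_s).
import numpy as np

g1 = np.array([[0.0], [1.0]])         # g_1(X), n x m
Psi = np.array([[-10.0], [10.0]])     # Psi(X), n x (n - m), chosen so G is nonsingular
G = np.hstack([g1, Psi])              # Eq. (19): G = [g_1  Psi]
H = np.array([1.0, -2.0])             # terms of b_s independent of the network output

def d_hat(V):
    u_s = V[1]                        # the network input includes the slack variable
    return np.array([0.1 * np.sin(u_s), 0.05 * u_s])

V = np.zeros(2)                       # initial guess: U = 0, U_s = 0
for _ in range(50):                   # V_{k+1} = G^{-1} (H - d_hat(X, V_k))
    V = np.linalg.solve(G, H - d_hat(V))

u = V[0]                              # first m elements of V are the actual control
```

In the paper's scheme $H$ would be recomputed at every integration step from $X$, $X_a$, $X_d$ and $U_d$; here it is frozen to isolate the fixed-point behavior.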
Case 3:
The system dynamics is square $(m = n)$, but not control-affine. In such a situation, the following three options are
available:
(i) The form of the equation may be such that it may still facilitate a closed-form solution for the control
variable.
(ii) Another option is the use of a numerical technique such as the standard Newton-Raphson technique [25].
With the availability of fast computational algorithms and high-speed processors, fast online numerical solution of
algebraic equations is not considered to be an infeasible task. For example, the viability of the Newton-Raphson
technique for online applications is discussed in [26, 27], where the authors have used the technique for complex
real-life problems. Note that a good initial guess solution can be provided at any time step $k$ as

$$U_{guess}^{(k)} = \begin{cases} U_d & : \; k = 1 \\ U^{(k-1)} & : \; k = 2, 3, \ldots \end{cases} \tag{22}$$
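Option (ii) with the warm start of Eq. (22) can be sketched as follows. The scalar non-affine map `f` and the sequence of target values are illustrative assumptions; at each time step the previous control solution seeds the Newton iteration, so only a few iterations are needed online.

```python
# Sketch of an online Newton-Raphson control solve (option (ii)) with the
# initial-guess strategy of Eq. (22). The map f and the targets are made up.
import numpy as np

def f(x, u):
    return x + u + 0.2 * np.tanh(u)          # non-affine in the control u

def solve_u(x, b, u0, tol=1e-12, iters=25):
    """Solve f(x, u) = b for u by Newton-Raphson, starting from u0."""
    u = u0
    for _ in range(iters):
        r = f(x, u) - b
        if abs(r) < tol:
            break
        h = 1e-6                             # central-difference Jacobian df/du
        J = (f(x, u + h) - f(x, u - h)) / (2.0 * h)
        u = u - r / J
    return u

u_d = 0.0                                    # nominal control: initial guess at k = 1
x_seq = [0.1, 0.2, 0.3]
b_seq = [0.5, 0.6, 0.7]                      # right-hand-side targets at each step

u = u_d
for x, b in zip(x_seq, b_seq):
    u = solve_u(x, b, u)                     # Eq. (22): warm-start from last solution
```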
(iii) Following the idea in [28, 29], a novel method is introduced to deal with a class of control non-affine
smooth nonlinear systems of the form $\dot{X} = f(X,U)$, where $f$ is a smooth mapping and $f(0,0) = 0$. If the unforced
dynamic equation $\dot{X} = f(X,0) \triangleq f_0(X)$ of a system that falls in the class of systems mentioned earlier is Lyapunov
stable, the system equation can be represented as

$$\dot{X} = f_0(X) + g_0(X)U + \sum_{i=1}^{m} u_i R_i(X,U)\, U \tag{23}$$

as shown in [28, 29]. In the above mentioned representation,

$$f_0(X) \triangleq f(X,0), \qquad g_0(X) \triangleq \frac{\partial f}{\partial u}(X,0) = [g_{0_1}(X) \cdots g_{0_m}(X)] \in \mathbb{R}^{n \times m} \tag{24}$$

and $R_i(X,U): \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^{n \times m}$ is a smooth mapping for $1 \le i \le m$. The actual plant equation $\dot{X} = f(X,U) + d(X)$
for this class of nonlinear non-affine systems can be expressed as

$$\dot{X} = f_0(X) + g_0(X)U + \sum_{i=1}^{m} u_i R_i(X,U)\, U + d(X) \tag{25}$$
The approximate plant equation now becomes

$$\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}(X,U) + (X - X_a) \tag{26}$$

In this case, the online neural network output $\hat{d}(X,U)$ captures the uncertainty $\sum_{i=1}^{m} u_i R_i(X,U)\, U + d(X)$. Now the
control solution can be obtained from the error dynamic equation Eq. (4) between the approximate state and the
desired state as

$$U = g_0^{-1}(X)\left\{ b(X, X_a, X_d, U_d, U) - f_0(X) \right\} \tag{27}$$

where

$$b(X, X_a, X_d, U_d, U) \triangleq f^*(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}(X, U) \tag{28}$$

Note that $g_0(X)$ is assumed to be nonsingular $\forall t$. Here again it can be seen that Eq. (27) constitutes a fixed-point
problem. The control solution is obtained numerically using $U_{k+1} = g_0^{-1}\left( H - \hat{d}(X, U_k) \right)$. Such a solution is
shown to be valid by proving that the mapping in Eq. (27) is a contraction mapping. The proofs and conditions that
lead to the validity of the solution are given in Appendix A.
Case 4:
If the system is both non-square and non-affine in control, the approximate plant equation takes the form

$$\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}_a(X, U, U_s) + (X - X_a) + \Psi(X)U_s \tag{29}$$

which reduces to

$$\dot{X}_a = f_0(X) + [g_0(X) \;\; \Psi(X)] \begin{bmatrix} U \\ U_s \end{bmatrix} + \hat{d}_a(X, U, U_s) + (X - X_a) \tag{30}$$

with

$$\hat{d}_a(X, U, U_s) \triangleq \hat{d}(X) - \Psi(X)U_s + \sum_{i=1}^{m} u_i R_i(X,U)\, U \tag{31}$$

Define $V \triangleq [U^T \;\; U_s^T]^T$ and $G(X) \triangleq [g_0(X) \;\; \Psi(X)]$. The error dynamic equation can be expressed as

$$f_0(X) + G(X)V + \hat{d}_a(X, U, U_s) + (X - X_a) - f^*(X_d, U_d) + K(X_a - X_d) = 0 \tag{32}$$

The control can be solved as

$$V = G^{-1}(X)\, b_s(X, X_a, X_d, U_d, U, U_s) \tag{33}$$

$$b_s(X, X_a, X_d, U_d, U, U_s) \triangleq f^*(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}_a(X, U, U_s) - f_0(X) \tag{34}$$

Only the first $m$ elements of $V$ will be needed for the implementation of the control on the actual plant. $G(X)$
is assumed to be nonsingular $\forall t$. The control solution is obtained numerically using $V_{k+1} = G^{-1}\left( H - \hat{d}_a(X, V_k) \right)$.
This solution is shown to be valid by proving that the mapping in Eq. (33) is a contraction mapping. The detailed
proof is provided in Appendix A.
2.3 Capturing the Unknown Function and Neural Network Training (Ensuring $X \to X_a$)
In this section, the process of realizing the uncertainties in the actual plant equations (which is crucial for controller
synthesis) is discussed in detail. The Stone-Weierstrass theorem from classical real analysis can be used to show
that certain network architectures possess the universal approximation capability. Such networks typically have the
desirable properties that larger networks produce smaller approximation errors and that almost all functions of practical
interest can be modeled by them. This makes the authors believe that neural networks are more efficient in
approximating complex functions if there are a large number of neurons in the hidden layer.
2.3.1 Selection of Neural Network Structure
An important idea used in this work is to separate all the channels in the system equations. Thus there will be $n$
independent neural networks to approximate the uncertainties in each of the $n$ channels, which facilitates easier
mathematical analysis. Define $d(X) \triangleq [d_1(X) \cdots d_n(X)]^T$, where $d_i(X),\ i = 1, \ldots, n$ is the $i$th component of
$d(X)$, which is the uncertainty in the $i$th state equation. Since each element of $d(X)$ is represented by a separate
neural network, each network output can be expressed as $\hat{W}_i^T \Phi_i(X)$. It should be noted here that the neural
network input vector may contain the states, the control vector and the slack variable vector. Separation of channels has
been carried out in this work to keep the uncertainties in each system equation distinct. During system operation, the
magnitudes of the uncertain terms in the system equations may be of different orders. In such a case, having one network
approximate the uncertainties of the whole system may affect the convergence of that single network. In order to prevent
this from happening, all channels were separated.
Trigonometric basis neural networks [30, 31] are used in this study for approximating each of the unknown
functions $d_i(X)$. The online 'uncertainty approximating neural network' can be represented by a linearly
parameterized feedforward structure. Radial Basis Functions (RBFs) can be used in these structures because these
functions are universal approximators. However, RBFs are very poor at interpolating between their design centers,
and in such cases a large number of basis functions are needed. Researchers typically use basis functions constructed
from functions that they think richly represent the nature of the unknown terms that are being approximated. There
is no standard procedure for choosing basis functions for any application. A Fourier series has the ability to
approximate a wide class of nonlinear functions quite well; note that such a choice also makes the design application
independent. Trigonometric basis functions are therefore used here, as the authors believe that the trigonometric sine
and cosine functions and their combinations have the capability to represent many nonlinear functions well. In order
to form the vector of basis functions, the input data is first pre-processed. In the numerical experiments carried out,
vectors $C_i$, $i = 1, \ldots, n$ with the structure $C_i = [1 \;\; \sin(x_i) \;\; \cos(x_i)]^T$ were created. The vector of basis functions
was generated as

$$\Phi = kron(C_n, \ldots, kron(C_3, kron(C_1, C_2)) \ldots) \tag{35}$$
where $kron(*,*)$ represents the Kronecker product and is defined as [30]

$$kron(Y, Z) = [y_1 z_1 \;\; y_1 z_2 \;\; \cdots \;\; y_n z_m]^T \tag{36}$$

where $Y \in \mathbb{R}^n$ and $Z \in \mathbb{R}^m$. The dimension of the neural network weight vector is the same as the dimension of $\Phi$.
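A minimal sketch of the basis construction in Eqs. (35)-(36), using the fact that the elementwise definition of Eq. (36) coincides with the standard Kronecker product of vectors; the three-state input values are illustrative assumptions:

```python
# Sketch of Eqs. (35)-(36): per-channel vectors C_i = [1, sin(x_i), cos(x_i)]^T
# combined by nested Kronecker products into the basis vector Phi.
import numpy as np

def channel_vector(x_i):
    return np.array([1.0, np.sin(x_i), np.cos(x_i)])   # C_i

def trig_basis(x):
    C = [channel_vector(xi) for xi in x]
    Phi = np.kron(C[0], C[1])                # kron(C_1, C_2)
    for Ci in C[2:]:
        Phi = np.kron(Ci, Phi)               # kron(C_n, ..., kron(C_3, kron(C_1, C_2)))
    return Phi

x = np.array([0.3, -1.2, 2.0])               # n = 3 states (made-up values)
Phi = trig_basis(x)
print(Phi.shape)                             # (27,): 3^n entries, the weight-vector size
```

The weight vector of each channel's network then has the same $3^n$ dimension as $\Phi$, as noted above.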
The neural network outputs for each of the different cases considered in this study have been tabulated in Table 1.
Table 1: Uncertainties and neural network outputs for different system types

No.  System Type              Uncertainty                                                Neural Network Output
1    Square, affine           $d(X)$                                                     $\hat{d}(X)$
2    Non-square, affine       $d(X) - \Psi(X)U_s$                                        $\hat{d}_a(X, U_s)$
3    Square, non-affine       $d(X) + \sum_{i=1}^{m} u_i R_i(X,U)\, U$                   $\hat{d}(X, U)$
4    Non-square, non-affine   $d(X) - \Psi(X)U_s + \sum_{i=1}^{m} u_i R_i(X,U)\, U$      $\hat{d}_a(X, U, U_s)$
2.3.2 Training of Neural Networks
The technique for updating the weights of the neural networks (i.e. training the networks) for accurate
representation of the unknown functions $d_i(X),\ i = 1, \ldots, n$ is discussed here. Define

$$e_{i_a} \equiv x_i - x_{i_a} \tag{37}$$

From Eqs. (2)-(3), the equations for the $i$th channel can be decomposed as

$$\dot{x}_i = f_i(X,U) + d_i(X) \tag{38}$$

$$\dot{x}_{i_a} = f_i(X,U) + \hat{d}_i(X, U, U_s) + e_{i_a} \tag{39}$$

Subtracting Eq. (39) from Eq. (38) and using the definition in Eq. (37) gives

$$\dot{e}_{i_a} = d_i(X) - \hat{d}_i(X, U, U_s) - e_{i_a} \tag{40}$$

From the universal function approximation property of neural networks [31], it can be stated that there exists an
ideal neural network with an optimum weight vector $W_i$ and basis function vector $\Phi_i(X)$ that approximates $d_i(X)$
to an accuracy of $\varepsilon_i$; i.e.

$$d_i(X) = W_i^T \Phi_i(X) + \varepsilon_i \tag{41}$$
Let the actual weight of the network used to approximate the uncertainties be $\hat{W}_i$. The approximated function can be
written as

$$\hat{d}_i(X, U, U_s) = \hat{W}_i^T \Phi_i(X, U, U_s) \tag{42}$$

Substituting Eqs. (41)-(42) in Eq. (40) leads to

$$\dot{e}_{i_a} = \tilde{W}_i^T \Phi_i(X, U, U_s) + \varepsilon_i - e_{i_a} \tag{43}$$

where $\tilde{W}_i \triangleq W_i - \hat{W}_i$ is the error between the ideal weight and the actual weight of the neural network. Note that
$\dot{\tilde{W}}_i = -\dot{\hat{W}}_i$ since $W_i$ is constant. An important point to be noted here is that the aim of each neural network is to
capture the resulting function in each state equation, not parameter estimation or system identification. The
magnitudes of the uncertainties/nonlinearities in the state equations are then used to make the plant track the desired
reference trajectory.
Theorem: The stable adaptive weight update rule proposed as

$$\dot{\hat{W}}_i = \gamma_{l_i} e_{i_a} \Phi_i(X, U, U_s) - \gamma_{l_i} \sigma_i \hat{W}_i \tag{44}$$

ensures bounds on the error signal $e_{i_a}$ and on the adaptive weights $\hat{W}_i$ of the online networks. Here $\gamma_{l_i}$ is the
learning rate of the $i$th online network and $\sigma_i$ is a sigma-modification factor used to ensure a bound on the network
weights.
Proof: Choose a Lyapunov function for each state equation as

$$v_i = \frac{1}{2} e_{i_a}^2 + \frac{1}{2} \gamma_{l_i}^{-1} \tilde{W}_i^T \tilde{W}_i \tag{45}$$

Taking the derivative of the Lyapunov function,

$$\dot{v}_i = e_{i_a} \dot{e}_{i_a} + \gamma_{l_i}^{-1} \tilde{W}_i^T \dot{\tilde{W}}_i \tag{46}$$

On substituting the expression for $\dot{e}_{i_a}$ in Eq. (46),

$$\dot{v}_i = e_{i_a} \left( \tilde{W}_i^T \Phi_i(X, U, U_s) + \varepsilon_i - e_{i_a} \right) + \gamma_{l_i}^{-1} \tilde{W}_i^T \dot{\tilde{W}}_i \tag{47}$$

If the proposed weight update rule $\dot{\hat{W}}_i = \gamma_{l_i} e_{i_a} \Phi_i(X, U, U_s) - \gamma_{l_i} \sigma_i \hat{W}_i$ is used, the dynamics of the error
between the optimal weight vector that represents the uncertainty and the weight vector used in the online adaptive
network can be represented as

$$\dot{\tilde{W}}_i = -\gamma_{l_i} e_{i_a} \Phi_i(X, U, U_s) + \gamma_{l_i} \sigma_i \hat{W}_i \tag{48}$$
since $\dot{\tilde{W}}_i = -\dot{\hat{W}}_i$.

On substituting Eq. (48) in Eq. (47),

$$\dot{v}_i = e_{i_a} \tilde{W}_i^T \Phi_i + e_{i_a} \varepsilon_i - e_{i_a}^2 + \gamma_{l_i}^{-1} \tilde{W}_i^T \left( -\gamma_{l_i} e_{i_a} \Phi_i + \gamma_{l_i} \sigma_i \hat{W}_i \right) = -e_{i_a}^2 + e_{i_a} \varepsilon_i + \sigma_i \tilde{W}_i^T \hat{W}_i \tag{49}$$
However,

$$\tilde{W}_i^T \hat{W}_i = \frac{1}{2}\left( 2 \tilde{W}_i^T \hat{W}_i \right) = \frac{1}{2}\left( 2 \tilde{W}_i^T (W_i - \tilde{W}_i) \right) = \frac{1}{2}\left( 2 \tilde{W}_i^T W_i - 2 \tilde{W}_i^T \tilde{W}_i \right) \tag{50}$$
The first term in Eq. (50) can be expanded in the following fashion:

$$\begin{aligned} 2 \tilde{W}_i^T W_i &= \tilde{W}_i^T W_i + \tilde{W}_i^T W_i \\ &= \tilde{W}_i^T W_i + \tilde{W}_i^T \left( \hat{W}_i + \tilde{W}_i \right) \\ &= \tilde{W}_i^T W_i + \tilde{W}_i^T \hat{W}_i + \tilde{W}_i^T \tilde{W}_i \\ &= \left( W_i - \hat{W}_i \right)^T W_i + \left( W_i - \hat{W}_i \right)^T \hat{W}_i + \tilde{W}_i^T \tilde{W}_i \\ &= W_i^T W_i - \hat{W}_i^T \hat{W}_i + \tilde{W}_i^T \tilde{W}_i \end{aligned} \tag{51}$$
Eq. (50) can now be expressed as

$$\tilde{W}_i^T \hat{W}_i = \frac{1}{2}\left( W_i^T W_i - \hat{W}_i^T \hat{W}_i + \tilde{W}_i^T \tilde{W}_i - 2 \tilde{W}_i^T \tilde{W}_i \right) \tag{52}$$

which can be expressed as

$$\tilde{W}_i^T \hat{W}_i = \frac{1}{2}\left( \left\| W_i \right\|^2 - \left\| \hat{W}_i \right\|^2 - \left\| \tilde{W}_i \right\|^2 \right) \le -\frac{1}{2} \left\| \tilde{W}_i \right\|^2 - \frac{1}{2} \left\| \hat{W}_i \right\|^2 + \frac{1}{2} \left\| W_i \right\|^2 \tag{53}$$
Therefore the last term in Eq. (49) can be written in terms of the inequality

$$\sigma_i \tilde{W}_i^T \hat{W}_i \le -\frac{1}{2} \sigma_i \left\| \tilde{W}_i \right\|^2 - \frac{1}{2} \sigma_i \left\| \hat{W}_i \right\|^2 + \frac{1}{2} \sigma_i \left\| W_i \right\|^2 \tag{54}$$
The expression for $\dot{v}_i$ then becomes

$$\begin{aligned} \dot{v}_i &\le -e_{i_a}^2 + e_{i_a} \varepsilon_i - \frac{1}{2} \sigma_i \left\| \tilde{W}_i \right\|^2 - \frac{1}{2} \sigma_i \left\| \hat{W}_i \right\|^2 + \frac{1}{2} \sigma_i \left\| W_i \right\|^2 \\ &\le -e_{i_a}^2 + \frac{e_{i_a}^2}{2} + \frac{\varepsilon_i^2}{2} - \frac{1}{2} \sigma_i \left\| \tilde{W}_i \right\|^2 - \frac{1}{2} \sigma_i \left\| \hat{W}_i \right\|^2 + \frac{1}{2} \sigma_i \left\| W_i \right\|^2 \\ &= -\frac{e_{i_a}^2}{2} + \left( \frac{\varepsilon_i^2}{2} + \frac{1}{2} \sigma_i \left\| W_i \right\|^2 \right) - \frac{1}{2} \sigma_i \left\| \tilde{W}_i \right\|^2 - \frac{1}{2} \sigma_i \left\| \hat{W}_i \right\|^2 \end{aligned} \tag{55}$$

where the second step uses $e_{i_a} \varepsilon_i \le \frac{e_{i_a}^2}{2} + \frac{\varepsilon_i^2}{2}$.
Define

$$\beta_i \triangleq \frac{\varepsilon_i^2}{2} + \frac{1}{2} \sigma_i \left\| W_i \right\|^2 \tag{56}$$

For $\dot{v}_i < 0$,

$$\frac{e_{i_a}^2}{2} > \beta_i \tag{57}$$

or

$$\left| e_{i_a} \right| > \sqrt{2 \beta_i} \tag{58}$$

Thus it can be seen that selecting a sufficiently small $\sigma_i$ and choosing a sufficiently good set of basis functions,
which will reduce the approximation error $\varepsilon_i$, will help in keeping the error bound small. The error bound for the
proposed weight update scheme is $\sqrt{2 \beta_i}$.
The following steps prove that the weight update rule is stable and that all the signals in the weight update rule are
bounded. From Eq. (48), $\dot{\tilde{W}}_i = -\gamma_{l_i} e_{i_a} \Phi_i(X, U, U_s) + \gamma_{l_i} \sigma_i \hat{W}_i$, which can be expanded as

$$\dot{\tilde{W}}_i = -\gamma_{l_i} e_{i_a} \Phi_i(X, U, U_s) + \gamma_{l_i} \sigma_i \left( W_i - \tilde{W}_i \right) = -\gamma_{l_i} \sigma_i \tilde{W}_i + \gamma_{l_i} \left( \sigma_i W_i - e_{i_a} \Phi_i(X, U, U_s) \right) \tag{59}$$

Define $X_c \triangleq \tilde{W}_i$, $A \triangleq -\gamma_{l_i} \sigma_i$, $B \triangleq \gamma_{l_i}$ and $U_c \triangleq \sigma_i W_i - e_{i_a} \Phi_i(X, U, U_s)$. Eq. (59) can then be expressed
as a linear differential equation of the form

$$\dot{X}_c = A X_c + B U_c \tag{60}$$
For the above mentioned linear time-invariant system with a negative definite $A$, the solution can be written as

$$X_c(t) = e^{A(t - t_0)} X_c(t_0) + \int_{t_0}^{t} e^{A(t - \tau)} B\, U_c(\tau)\, d\tau \tag{61}$$

On using the bound $\left\| e^{A(t - t_0)} \right\| \le k e^{-\lambda (t - t_0)}$, the bound on the solution to Eq. (60) can be expressed as [32]

$$\left\| X_c(t) \right\| \le k e^{-\lambda (t - t_0)} \left\| X_c(t_0) \right\| + \int_{t_0}^{t} k e^{-\lambda (t - \tau)} \left\| B \right\| \left\| U_c(\tau) \right\| d\tau \le k e^{-\lambda (t - t_0)} \left\| X_c(t_0) \right\| + \frac{k \left\| B \right\|}{\lambda} \sup_{t_0 \le \tau \le t} \left\| U_c(\tau) \right\| \tag{62}$$
Such a system is input-to-state stable [32]. This proves that $X_c = \tilde{W}_i$ is bounded for all bounded inputs. Since the
input $U_c$ to the system in Eq. (60) is bounded, $\tilde{W}_i$ is bounded, which proves that $\hat{W}_i$ is bounded as well. This completes
the proof.
3. Simulation Studies
In this section, two motivating examples that demonstrate the ideas of Section 2 are presented. The
examples show that the methodology discussed in this paper can indeed be used to design controllers for complex
nonlinear systems.
3.1 Van der Pol Problem
As the first exercise, the Van der Pol system [34] was selected. The motivations for selecting it were (i) it is a vector
problem (ii) it is a non-square problem ( 1, 2m n= = ) (iii) the homogeneous system has an unstable equilibrium at
the origin and (iv) the system exhibits limit cycle behavior. These properties make it a challenging problem for state
regulation. The desired system dynamics for this problem is given by
( ) ( )1 2
2 2 22 1 2 1 1 21 1
d d
d d d d d d d
x x
x x x x x x uα
=
= − − + + + (63)
where 1dx represents position and 2d
x represents velocity. The goal was to drive 1 2[ , ] 0Xd d
Tx x → as t → ∞ .
Formulating a regulator problem, the desired state and control trajectories were obtained using a new method known
as ‘Single Network Adaptive Critic’ (SNAC) using a quadratic cost function shown below
$$J = \frac{1}{2} \int_0^{\infty} \left( X_d^T Q X_d + r\, u_d^2 \right) dt \tag{64}$$

where $Q = I_2$ (the $2 \times 2$ identity matrix) and $r = 1$. Details of the SNAC technique can be obtained from [35]. In the SNAC synthesis
for this problem, the critic neural network was made up of two sub-networks, each having a 2-6-1 structure.
The plant dynamics was assumed to be

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = -x_1 - \alpha\left( x_1^2 - 1 \right) x_2 + d(X) + \left( 1 + x_1^2 + x_2^2 \right) u \tag{65}$$

where $d(X) = 2\cos(x_2)$ was the assumed unmodeled dynamics. Following the discussions in Section 2, the
approximate system can be expressed as

$$\begin{aligned} \dot{x}_{1_a} &= x_2 + \left( x_1 - x_{1_a} \right) \\ \dot{x}_{2_a} &= -x_1 - \alpha\left( x_1^2 - 1 \right) x_2 + \hat{d}(X) + \left( x_2 - x_{2_a} \right) + \left( 1 + x_1^2 + x_2^2 \right) u \end{aligned} \tag{66}$$
Since the problem is in a non-square form, the technique mentioned in Section 2 was used and $\Psi = [-10 \;\; 10]^T$ was
selected. The approximate system was expressed as

$$\begin{aligned} \dot{x}_{1_a} &= x_2 + \left( x_1 - x_{1_a} \right) - 10 u_s + \hat{d}_{a_1}(X, U_s) \\ \dot{x}_{2_a} &= -x_1 - \alpha\left( x_1^2 - 1 \right) x_2 + 10 u_s + \hat{d}_{a_2}(X, U_s) + \left( x_2 - x_{2_a} \right) + \left( 1 + x_1^2 + x_2^2 \right) u \end{aligned} \tag{67}$$

with $\hat{d}_{a_1}(X, U_s)$ expected to approximate the uncertainty $10 u_s$ introduced in the first state equation as a result of the
approximate system being made square, and $\hat{d}_{a_2}(X, U_s)$ expected to approximate the uncertainty created by the
algebraic sum of $d(X)$ and $-10 u_s$, which are the uncertain terms in the second state equation of the approximate
system. The augmented controller in Eq. (18) was expressed as
\mathbf{V} = \begin{bmatrix} 0 & 10 \\ (1 + x_1^2 + x_2^2) & -10 \end{bmatrix}^{-1} \begin{bmatrix} -\Delta f_1 - k_1 (x_{1a} - x_{1d}) - (x_1 - x_{1a}) \\ -\Delta f_2 - k_2 (x_{2a} - x_{2d}) - (x_2 - x_{2a}) \end{bmatrix}    (68)
where

\Delta f_1 \triangleq (x_2 - x_{2d}) + \hat{d}_1(\mathbf{X}, U_s)
\Delta f_2 \triangleq \hat{d}_2(\mathbf{X}, U_s) - x_1 + \alpha (1 - x_1^2) x_2 + x_{1d} - \alpha (1 - x_{1d}^2) x_{2d} - (1 + x_{1d}^2 + x_{2d}^2) u_d    (69)
The gain matrix was selected as \mathbf{K} = \mathrm{diag}(1, 1). After solving for \mathbf{V}, the control variable u was extracted from it as the first element of \mathbf{V}. For the first iteration in the control solution process, u_s = 0 was used. With this, the explicit expression for u is given by
u = \frac{1}{1 + x_1^2 + x_2^2} \left[ -k_1 (x_{1a} - x_{1d}) - k_2 (x_{2a} - x_{2d}) - \Delta f_1 - \Delta f_2 - (x_1 - x_{1a}) - (x_2 - x_{2a}) \right]    (70)
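For concreteness, the control law of Eq. (70) can be evaluated as in the sketch below. The Δf terms follow the form of Eq. (69); the network estimates d̂₁, d̂₂ are passed in as plain numbers (placeholders for the actual network outputs), and α = 1, u_d = 0 are illustrative values not fixed by the excerpt.

```python
def adaptive_control(x, xa, xd, dhat1, dhat2, k1=1.0, k2=1.0):
    """Evaluate the control law reconstructed in Eq. (70).

    dhat1, dhat2: current estimates of the uncertainties in the two
    state equations (placeholders standing in for the network outputs).
    alpha and ud are assumed values for illustration only.
    """
    alpha, ud = 1.0, 0.0
    x1, x2 = x
    x1a, x2a = xa
    x1d, x2d = xd
    # Delta-f terms, following the form of Eq. (69)
    df1 = (x2 - x2d) + dhat1
    df2 = (dhat2 - x1 + alpha * (1 - x1**2) * x2
           + x1d - alpha * (1 - x1d**2) * x2d
           - (1 + x1d**2 + x2d**2) * ud)
    g2 = 1.0 + x1**2 + x2**2   # control effectiveness, never zero
    return (-k1 * (x1a - x1d) - k2 * (x2a - x2d)
            - df1 - df2 - (x1 - x1a) - (x2 - x2a)) / g2

u = adaptive_control(x=[0.5, -0.2], xa=[0.5, -0.2], xd=[0.0, 0.0],
                     dhat1=0.0, dhat2=0.0)
```

Note that the effectiveness term (1 + x₁² + x₂²) is bounded below by 1, so the division is always well posed.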
The basis function vectors for the two neural networks were selected as

C_1 = [1 \;\; \sin(x_1) \;\; \cos(x_1)]^T, \quad C_2 = [1 \;\; \sin(x_2) \;\; \cos(x_2)]^T, \quad \Phi_1(\mathbf{X}) = \Phi_2(\mathbf{X}) = \mathrm{kron}(C_1, C_2)    (71)
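The Kronecker-product basis of Eq. (71) maps the two 3-element vectors into a single 9-element feature vector; a minimal sketch using numpy:

```python
import numpy as np

def basis(x1, x2):
    # C1 and C2 as in Eq. (71); Phi = kron(C1, C2) is a 9-element vector
    c1 = np.array([1.0, np.sin(x1), np.cos(x1)])
    c2 = np.array([1.0, np.sin(x2), np.cos(x2)])
    return np.kron(c1, c2)

phi = basis(0.3, -0.7)
print(phi.shape)  # (9,)
```

Every entry of this basis is bounded, which is the property the contraction argument in Appendix A relies on.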
and the neural network learning parameters were selected as \gamma_{l_1} = \gamma_{l_2} = 10 and \sigma_1 = \sigma_2 = 1 \times 10^{-6}.
In Figure 1, the resulting state trajectories for the nominal system with the nominal controller, the actual system with
the nominal controller and the actual system with the adaptive controller are given. First, it is clear from the plot that
the nominal control is doing the intended job (of driving the states to zero) for the nominal plant. Next, it can be
observed that if the same nominal controller is applied to the actual plant (with the unmodeled dynamics d(\mathbf{X}) = 2\cos(x_2)), x_1 cannot reach the origin. However, if the adaptive controller is applied, the resulting controller drives the states to the origin by forcing them to follow the states of the nominal system.
Figure 1: State trajectories vs. time. (Fig. 1(a): state x_1 vs. time; Fig. 1(b): state x_2 vs. time.)
Figure 2 illustrates control trajectories and neural network approximations of uncertainties in the two state
equations. In Figure 2(a), a comparison between the histories of the nominal control and the adaptive control is
presented. Figure 2(b) shows the output of the first neural network, \hat{d}_1(\mathbf{X}), tracking the uncertainty in the first state equation due to -\psi_1 u_s. From Figure 2(c) it can be seen how well the neural network approximates the unknown function (d(\mathbf{X}) - \psi_2 u_s), which is critical in deriving the appropriate adaptive controller.
Figure 2: Control and Uncertainty approximation trajectories vs. Time. (Fig. 2(a): Control trajectory vs. Time,
Fig. 2(b): Network approximation of uncertainty in the first state equation, Fig. 2(c): Network approximation of uncertainty in the second state equation)
3.2 Double Inverted Pendulum Problem
The next problem considered is a double inverted pendulum [36, 19]. The interesting aspects of this problem are (i) the equations of motion consist of four states, (ii) it is a non-square problem (n = 4, m = 2) and, more importantly, (iii) it is non-affine in the control variable. In this problem, both parameter variation and unmodeled dynamics are considered simultaneously. These characteristics make the problem sufficiently challenging to demonstrate that the proposed technique works for complex problems. The nominal system dynamics for this problem is given by [36]
\dot{x}_{1d} = x_{2d}
\dot{x}_{2d} = \alpha_1 \sin(x_{1d}) + \beta_1 + \xi_1 \tanh(u_{1d}) + \sigma_1 \sin(x_{4d})
\dot{x}_{3d} = x_{4d}
\dot{x}_{4d} = \alpha_2 \sin(x_{3d}) + \beta_2 + \xi_2 \tanh(u_{2d}) + \sigma_2 \sin(x_{2d})    (72)
where for 1, 2i = the parameters are defined as:
\alpha_i \triangleq \frac{m_i g r}{J_i} - \frac{k r^2}{4 J_i}, \quad \beta_i \triangleq \frac{k r (l - b)}{2 J_i}, \quad \xi_i \triangleq \frac{u_i^{\max}}{J_i}, \quad \sigma_i \triangleq \frac{k r^2}{4 J_i}    (73)
In Eqs. (72)-(73), x_{1d} and x_{2d} denote the desired position and velocity of mass-1 respectively. Similarly, x_{3d} and x_{4d} denote the desired position and velocity of mass-2. Note that the control variables u_{id} (torques applied by the servomotors) enter the system dynamics in a non-affine fashion. The system parameters and their values are listed in Table 2.
Table 2: System Parameter Values

System Parameter                              Value    Units
End mass of pendulum 1 (m_1)                  2        kg
End mass of pendulum 2 (m_2)                  2.5      kg
Moment of inertia (J_1)                       0.5      kg m^2
Moment of inertia (J_2)                       0.625    kg m^2
Spring constant of connecting spring (k)      100      N/m
Pendulum height (r)                           0.5      m
Natural length of spring (l)                  0.5      m
Gravitational acceleration (g)                9.81     m/s^2
Distance between pendulum hinges (b)          0.4      m
Maximum torque input (u_1^max)                20       N m
Maximum torque input (u_2^max)                20       N m
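As a check, the Table 2 values can be substituted into the parameter definitions of Eq. (73). The formulas below follow the reconstruction used here (e.g. α_i = m_i g r / J_i − k r² / (4 J_i)) and should be read as an illustration, not the authors' code:

```python
# Plug Table 2 values into the parameter definitions of Eq. (73),
# as reconstructed here (an assumption of this sketch).
m = [2.0, 2.5]
J = [0.5, 0.625]
k, r, l, b, g, umax = 100.0, 0.5, 0.5, 0.4, 9.81, 20.0

alpha = [m[i] * g * r / J[i] - k * r**2 / (4 * J[i]) for i in range(2)]
beta  = [k * r * (l - b) / (2 * J[i]) for i in range(2)]
xi    = [umax / J[i] for i in range(2)]
sigma = [k * r**2 / (4 * J[i]) for i in range(2)]

print([round(a, 2) for a in alpha])  # [7.12, 9.62]
print(xi)                            # [40.0, 32.0]
```

Note that α_i comes out positive, i.e. the upright equilibria of the uncontrolled pendulums are unstable, which is what makes tracking nontrivial here.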
The objectives of the nominal controllers were to make x_{1d} and x_{3d} track a reference signal R = \sin(2\pi t / T), with T = 10. Since x_{2d} and x_{4d} are the derivatives of x_{1d} and x_{3d} respectively, x_{2d} and x_{4d} must track the reference signal \dot{R} = (2\pi / T) \cos(2\pi t / T). The nominal controller was designed using the dynamic inversion technique [33]. A second-order error dynamic equation (\ddot{X}_d - \ddot{R}) + \mathbf{K}_d (\dot{X}_d - \dot{R}) + \mathbf{K}_p (X_d - R) = 0 was made use of in the controller design, as the objective was tracking. The gain matrices used were \mathbf{K}_d = \mathbf{K}_p = 2I.
In this problem, parametric uncertainties \Delta\alpha_1 and \Delta\alpha_2 were added to the parameters \alpha_1 and \alpha_2 respectively. Functions f_1(\mathbf{X}) and f_2(\mathbf{X}) were added as unmodeled dynamic terms. The true plant equations now were of the following form
\dot{x}_1 = x_2
\dot{x}_2 = (\alpha_1 + \Delta\alpha_1) \sin(x_1) + \beta_1 + \xi_1 \tanh(u_1) + \sigma_1 \sin(x_4) + f_1(\mathbf{X})
\dot{x}_3 = x_4
\dot{x}_4 = (\alpha_2 + \Delta\alpha_2) \sin(x_3) + \beta_2 + \xi_2 \tanh(u_2) + \sigma_2 \sin(x_2) + f_2(\mathbf{X})    (74)
To test the robustness of the proposed method, \Delta\alpha_1 and \Delta\alpha_2 were selected to be 20% of their corresponding nominal values. Similarly, f_1(\mathbf{X}) and f_2(\mathbf{X}) were assumed to be exponential functions of the form K_{m_1} e^{a_1 x_1} and K_{m_2} e^{a_2 x_3} respectively, with positive values for a_1 and a_2. The parameters K_{m_1} = K_{m_2} = 0.1 and a_1 = a_2 = 0.01 were chosen. In this case, the goal for the neural networks was to learn d(\mathbf{X}) = [0 \;\; d_2(\mathbf{X}) \;\; 0 \;\; d_4(\mathbf{X})]^T, where d_2(\mathbf{X}) = \Delta\alpha_1 \sin(x_1) + K_{m_1} e^{a_1 x_1} and d_4(\mathbf{X}) = \Delta\alpha_2 \sin(x_3) + K_{m_2} e^{a_2 x_3}.
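A small sketch of the uncertainty terms the networks must learn follows; the values Δα_i = 0.2 α_i use α_1 ≈ 7.12 and α_2 ≈ 9.62 as computed from the parameter definitions reconstructed above (an assumption of this sketch):

```python
import numpy as np

# Uncertainty terms per the text:
# d2 = dAlpha1*sin(x1) + Km1*exp(a1*x1), d4 = dAlpha2*sin(x3) + Km2*exp(a2*x3)
Km1 = Km2 = 0.1
a1 = a2 = 0.01
dAlpha1, dAlpha2 = 0.2 * 7.12, 0.2 * 9.62   # 20% of assumed nominal alphas

def d2(x1):
    return dAlpha1 * np.sin(x1) + Km1 * np.exp(a1 * x1)

def d4(x3):
    return dAlpha2 * np.sin(x3) + Km2 * np.exp(a2 * x3)

print(round(d2(0.0), 3))  # 0.1  (only the exponential term survives at x1 = 0)
```

Because a_1 = a_2 = 0.01 is small and the states stay bounded during tracking, these terms remain smooth and slowly varying, which suits the trigonometric basis used for the networks.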
It can be seen that the system dynamics is non-affine in the control variable, and it is also a non-square problem where the number of control variables is less than the number of states. Applying the transformations given in Eq. (23), f_0(\mathbf{X}) and g_0(\mathbf{X}) were defined as follows
f_0(\mathbf{X}) = \begin{bmatrix} x_2 \\ \alpha_1 \sin(x_1) + \beta_1 + \sigma_1 \sin(x_4) \\ x_4 \\ \alpha_2 \sin(x_3) + \beta_2 + \sigma_2 \sin(x_2) \end{bmatrix}, \quad g_0(\mathbf{X}) = \begin{bmatrix} 0 & 0 \\ \xi_1 & 0 \\ 0 & 0 \\ 0 & \xi_2 \end{bmatrix}    (75)
The actual plant equations were expressed as
\dot{\mathbf{X}} = f_0(\mathbf{X}) + g_0(\mathbf{X}) U + \sum_{i=1}^{m} R_i(\mathbf{X}, U)\, u_i + d(\mathbf{X})    (76)
Since g_0(\mathbf{X}) is not a square matrix, the (4 \times 2) slack matrix

\Psi = \begin{bmatrix} 10 & -10 & 0 & 0 \\ 0 & 0 & 10 & -10 \end{bmatrix}^T

was chosen and a square problem was formulated. The approximate system equation was
\dot{\mathbf{X}}_a = f_0(\mathbf{X}) + g_0(\mathbf{X}) U + \hat{d}(\mathbf{X}, U) + (\mathbf{X} - \mathbf{X}_a)    (77)
where \hat{d}(\mathbf{X}, U) represents \sum_{i=1}^{m} R_i(\mathbf{X}, U) u_i + d(\mathbf{X}). This could further be rewritten as
\dot{\mathbf{X}}_a = f_0(\mathbf{X}) + g_0(\mathbf{X}) U + \hat{d}(\mathbf{X}, U, U_s) + (\mathbf{X} - \mathbf{X}_a) + \Psi U_s    (78)
where \hat{d}(\mathbf{X}, U, U_s) is the output of the function-approximating neural networks and represents \sum_{i=1}^{m} R_i(\mathbf{X}, U) u_i + d(\mathbf{X}) - \Psi U_s. Note that U_s (2 \times 1) is the slack variable used to create a square control effectiveness matrix to help solve for the real control variable U. The gain matrix for the linear error dynamic equation was selected as \mathbf{K} = \mathrm{diag}(1/\tau_1, 1/\tau_2, 1/\tau_3, 1/\tau_4) with \tau_1 = \tau_2 = \tau_3 = \tau_4 = 0.2. The control solution vector was obtained as \mathbf{V} = [g_0(\mathbf{X}) \;\; \Psi]^{-1} \left[ -\left( f_0(\mathbf{X}) + \hat{d}(\mathbf{X}, U, U_s) + (\mathbf{X} - \mathbf{X}_a) \right) + \dot{\mathbf{X}}_d - \mathbf{K}(\mathbf{X}_a - \mathbf{X}_d) \right]. The numerical solution was obtained using \mathbf{V}_{k+1} = [g_0(\mathbf{X}) \;\; \Psi]^{-1} \left[ -\left( f_0(\mathbf{X}) + \hat{d}(\mathbf{X}, \mathbf{V}_k) + (\mathbf{X} - \mathbf{X}_a) \right) + \dot{\mathbf{X}}_d - \mathbf{K}(\mathbf{X}_a - \mathbf{X}_d) \right]. After solving for \mathbf{V}, the first two elements, which made up U, were extracted from \mathbf{V}.
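The squaring-up step can be illustrated numerically: stacking g_0(X) with the slack matrix Ψ gives a constant, invertible 4×4 matrix G for this problem, so V can be solved from a linear system at each iteration. The right-hand-side values below are placeholders, not quantities from the simulation, and ξ_1 = 40, ξ_2 = 32 follow the parameter reconstruction assumed earlier.

```python
import numpy as np

# Squared-up control effectiveness matrix G = [g0(X)  Psi] for the
# double-pendulum problem; xi values are assumed from the reconstructed
# Eq. (73), and Psi is the slack matrix chosen in the text.
xi1, xi2 = 40.0, 32.0
g0 = np.array([[0.0, 0.0], [xi1, 0.0], [0.0, 0.0], [0.0, xi2]])
Psi = np.array([[10.0, 0.0], [-10.0, 0.0], [0.0, 10.0], [0.0, -10.0]])
G = np.hstack([g0, Psi])          # 4x4 and invertible

# V = [u1, u2, us1, us2]^T solves G V = (H - dhat); placeholder RHS here
rhs = np.array([0.1, -0.2, 0.05, 0.3])   # illustrative values only
V = np.linalg.solve(G, rhs)
U = V[:2]                          # real controls are the first two elements

print(G.shape, np.linalg.matrix_rank(G))  # (4, 4) 4
```

The full rank of G is exactly what the slack columns buy: without Ψ, the first and third state equations would have no control channel to invert.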
The basis function vectors were selected in the following manner
C_1 = [1 \;\; \sin(x_1) \;\; \cos(x_1)]^T, \quad C_2 = [1 \;\; \sin(x_2) \;\; \cos(x_2)]^T, \quad \Phi_1(\mathbf{X}) = \Phi_2(\mathbf{X}) = \mathrm{kron}(C_1, C_2)    (79)
C_3 = [1 \;\; \sin(x_3) \;\; \cos(x_3)]^T, \quad C_4 = [1 \;\; \sin(x_4) \;\; \cos(x_4)]^T, \quad \Phi_3(\mathbf{X}) = \Phi_4(\mathbf{X}) = \mathrm{kron}(C_3, C_4)    (80)
For this problem, the neural network training parameters selected were \sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = 1 \times 10^{-6} and \gamma_{l_1} = \gamma_{l_2} = \gamma_{l_3} = \gamma_{l_4} = 20. For the first iteration of the control solution scheme, \mathbf{V} = [u_{1d} \;\; u_{2d} \;\; 0 \;\; 0]^T was used.
Numerical results for this problem, obtained by simulating the system dynamics with the fourth-order Runge-Kutta method [27] with step size \Delta t = 0.01, are presented in Figures 3-5. State trajectories are given in Figure 3. It can be seen that the nominal controller is inadequate to achieve satisfactory tracking. However, with adaptive tuning, the resulting modified controller does a better job of forcing the state variables to track the reference signals.
Figure 3: State trajectories vs. Time
The nominal and modified controller trajectories are plotted in Figure 4. These plots indicate that the online
adaptation comes up with a significantly different control history, which is the key in achieving the controller goal.
Figure 4: Control trajectories vs. Time
An important component of the control design procedure is the proper approximation of the unknown functions by the neural network outputs \hat{d}_i(\mathbf{X}, U, U_s). These unknown functions and the neural network outputs (approximations) are plotted in Figure 5. It can be seen how efficiently and accurately the neural networks learn the unknown functions.
Figure 5: Neural Network approximations vs. Time
4. Conclusions
Dynamic systems and processes are difficult to model accurately and/or their parameters may change with time. It is
essential that these unmodeled terms or changes in parameters are captured and are used to adapt the controller for
better performance. A model-following adaptive controller using neural networks has been developed in this paper
for a fairly general class of nonlinear systems which may be non-square and non-affine in the control variable. The
nonlinear system for which the method is applicable is assumed to be of known order, but it may contain unmodeled
dynamics and/or parameter uncertainties. Simulation results have been shown for two challenging problems. The
potential of this technique has been demonstrated by applying it to non-square systems and a system that is non-
affine in control. Another distinct characteristic of the adaptation procedure presented in this paper is that it is
independent of the technique used to design the nominal controller; and hence can be used in conjunction with any
known control design technique. This powerful technique can be made use of in practical applications with relative
ease.
Acknowledgement:
This research was supported by NSF-USA grants 0201076 and 0324428.
References
[1] McCulloch, W. S. and Pitts, W.: ‘A Logical Calculus of the Ideas Immanent in Nervous Activity’, Bulletin of
Mathematical Biophysics, 1943, Vol. 9, pp. 127-147.
[2] Miller, W. T., Sutton, R. and Werbos, P. J. (Eds.): ‘Neural Networks for Control’, MIT Press, 1990.
[3] Hunt, K. J., Zbikowski, R., Sbarbaro, D. and Gawthrop, P. J.: ‘Neural Networks for Control Systems - A Survey’, Automatica, 1992, Vol. 28, No. 6, pp. 1083-1112.
[4] Barto, A. G., Sutton, R. S., and Anderson, C. W.: ‘Neuron-like Adaptive Elements that Can Solve Difficult
Control Problems’, IEEE Transactions on Systems, Man and Cybernetics, 1983, Vol. SMC-13, No. 5, pp. 834-846.
[5] Narendra, K. S. and Parthasarathy, K.: ‘Identification and Control of Dynamical Systems Using Neural
Networks’, IEEE Transactions on Neural Networks, 1990, Vol. 1, No. 1, pp. 4-27.
[6] Chen, L. and Narendra, K. S.: ‘Nonlinear Adaptive Control Using Neural Networks and Multiple Models’,
Proceedings of the American Control Conference, 2000.
[7] Sanner, R. M. and Slotine, J. J. E.: ‘Gaussian Networks for Direct Adaptive Control’, IEEE Transactions on Neural Networks, 1992, Vol. 3, No. 6, pp. 837-863.
[8] Lewis, F. L., Yesildirek, A. and Liu, K.: ‘Multilayer Neural Net Robot Controller with Guaranteed Tracking
Performance’, IEEE Transactions on Neural Networks, 1996, Vol. 7, No. 2, pp. 388-399
[9] Khalil, H. K.: ‘Nonlinear Systems: 2nd Ed.’, Prentice Hall Inc, New Jersey, USA, 1996.
[10] Aloliwi, B. and Khalil, H. K.: ‘Adaptive Output Feedback Regulation of a Class of Nonlinear Systems: Convergence and Robustness’, IEEE Transactions on Automatic Control, 1997, Vol. 42, No. 12, pp. 1714-1716.
[11] Seshagiri, S., and Khalil, H. K.: ‘Output Feedback Control of Nonlinear Systems Using RBF Neural Networks’,
IEEE Transactions on Neural Networks, 2000, Vol. 11, No. 1, pp. 69-79
[12] Enns, D., Bugajski, D., Hendrick, R. and Stein, G.: ‘Dynamic Inversion: An Evolving Methodology for Flight
Control Design’, International Journal of Control, 1994, Vol.59, No.1, pp. 71-91.
[13] Lane, S. H. and Stengel, R. F.: ‘Flight Control Using Non-Linear Inverse Dynamics’, Automatica, 1988,
Vol.24, No.4, pp. 471-483.
[14] Ngo, A. D., Reigelsperger, W. C. and Banda, S. S.: ‘Multivariable Control Law Design for a Tailless Airplane’, Proceedings of the AIAA Conference on Guidance, Navigation and Control, 1996, AIAA-96-3866.
[15] Slotine, J-J. E., and Li, W.: ‘Applied Nonlinear Control’, Prentice Hall, 1991.
[16] Kim, B. S. and Calise, A. J.: ‘Nonlinear Flight Control using Neural Networks’, AIAA Journal of Guidance, Control, and Dynamics, 1997, Vol. 20, No. 1, pp. 26-33.
[17] Leitner, J., Calise, A. and Prasad, J. V. R.: ‘Analysis of Adaptive Neural Networks for Helicopter Flight
Controls’, AIAA Journal of Guidance, Control, and Dynamics, 1997, Vol. 20, No. 5, pp.972-979.
[18] McFarland, M. B., Rysdyk, R. T., and Calise, A. J.: ‘Robust Adaptive Control Using Single-Hidden-Layer Feed-forward Neural Networks’, Proceedings of the American Control Conference, 1999, pp. 4178-4182.
[19] Hovakimyan, N., Nardi, F., Calise A. J. and Lee, H.: ‘Adaptive Output Feedback Control of a Class of
Nonlinear Systems Using Neural Networks’, International Journal of Control, 2001, Vol. 74, No. 12, pp.1161-1169.
[20] Hovakimyan, N., Nardi, F., Nakwan, K. and Calise A. J.: ‘Adaptive Output Feedback control of Uncertain
Systems Using Single Hidden Layer Neural Networks’, IEEE Transactions on Neural Networks, 2002, Vol. 13, No.
6, pp. 1420-1431.
[21] Calise, A. J., Lee, S. and Sharma, M.: ‘Development of a Reconfigurable Flight Control Law for the X-36
Tailless Fighter Aircraft’, Proceedings of the AIAA Conference on Guidance, Navigation and Control, Denver, CO,
2000.
[22] Balakrishnan, S. N. and Huang, Z.: ‘Robust Adaptive Critic Based Neurocontrollers for Helicopter with
Unmodeled Uncertainties’, Proceedings of the 2001 AIAA Conference on Guidance, Navigation and Control, 2001.
[23] Karloff, H.: ‘Linear Programming’, Birkhauser Boston, 1991.
[24] Paradiso, J. A.: ‘A Highly Adaptable Method of Managing Jets and Aerosurfaces for Control of Aerospace
Vehicles’, Journal of Guidance Control and Dynamics, Vol. 14, No. 1, pp. 44-50, 1991.
[25] Gupta S. K.: ‘Numerical Methods for Engineers’, Wiley Eastern Ltd, 1995.
[26] Soloway, D. and Haley P.: ‘Aircraft Reconfiguration Using Generalized Predictive Control’, Proceedings of
American Control Conference, Arlington, 2001, VA, USA, pp. 2924-2929.
[27] Soloway, D. and Haley P.: ‘Neural Generalized Predictive Control: A Newton-Raphson Implementation’,
Proceedings of the IEEE CCA/ISIC/CACSD, 1996.
[28] Lin, W.: ‘Stabilization of Non-Affine Nonlinear Systems via Smooth State Feedback’ Proceedings of the 33rd
Conference on Decision and Control, Lake Buena Vista, FL, December, 1994.
[29] Lin, W.: ‘Feedback Stabilization of General Nonlinear Control Systems: A Passive System Approach’, Systems
and Control Letters, 1995, 25, pp. 41-52.
[30] Ham, F. M. and Kostanic, I.: ‘Principles of Neurocomputing for Science and Engineering’, Mc Graw Hill, Inc,
2001.
[31] Hassoun, M. H.: ‘Fundamentals of Artificial Neural Networks’, MIT Press, Cambridge, MA, 1995.
[32] Khalil, H. K.: ‘Nonlinear Systems: 3rd Ed.’, Prentice Hall Inc, New Jersey, USA, 2002.
[33] Padhi, R. and Balakrishnan, S. N.: ‘Implementation of Pilot Commands in Aircraft Control: A New Dynamic
Inversion Approach’, Proceedings of AIAA Guidance, Navigation, and Control Conference, 2003, Austin, TX,
USA.
[34] Yesildirek, A.: ‘Nonlinear Systems Control Using Neural Networks’, Ph.D. Thesis, 1994, University of Texas,
Arlington.
[35] Padhi, R., Unnikrishnan, N., Balakrishnan, S. N.: ‘Optimal Control Synthesis of a Class of Nonlinear Systems
Using Single Network Adaptive Critics’, Proceedings of American Control Conference, 2004.
[36] Spooner, J. T., and Passino, K. M.: ‘Decentralized Adaptive Control of Nonlinear Systems Using Radial Basis
Neural Networks’, IEEE Transactions on Automatic Control, Vol. 44, No. 11, November 1999.
Appendix A
The most general form of the approximate system dynamics will be considered in this proof. A resolution to the
fixed point problem that arises due to the particular structure of the control solution equation is discussed in this
section. The approximate plant model can be represented by
\dot{\mathbf{X}}_a = f_0(\mathbf{X}) + g_0(\mathbf{X}) U + \hat{d}(\mathbf{X}, U, U_s) + (\mathbf{X} - \mathbf{X}_a) + \Psi(\mathbf{X}) U_s    (1a)
Substitution of Eq. (1a) in the stable error dynamic equation (\dot{\mathbf{X}}_a - \dot{\mathbf{X}}_d) + \mathbf{K}(\mathbf{X}_a - \mathbf{X}_d) = 0 leads to

f_0(\mathbf{X}) + g_0(\mathbf{X}) U + \hat{d}(\mathbf{X}, U, U_s) + (\mathbf{X} - \mathbf{X}_a) + \Psi(\mathbf{X}) U_s - \dot{\mathbf{X}}_d + \mathbf{K}(\mathbf{X}_a - \mathbf{X}_d) = 0    (2a)
Eq. (2a) can be rewritten as
[g_0(\mathbf{X}) \;\; \Psi(\mathbf{X})] \mathbf{V} = \mathbf{H} - \hat{d}(\mathbf{X}, U, U_s)    (3a)

where \mathbf{H} \triangleq -[f_0(\mathbf{X}) + (\mathbf{X} - \mathbf{X}_a)] + \dot{\mathbf{X}}_d - \mathbf{K}(\mathbf{X}_a - \mathbf{X}_d). Define \mathbf{G} \triangleq [g_0(\mathbf{X}) \;\; \Psi(\mathbf{X})]. From Eq. (3a), the control vector \mathbf{V} can be solved for as
\mathbf{V} = \mathbf{G}^{-1} [\mathbf{H} - \hat{d}(\mathbf{X}, \mathbf{V})]    (4a)
Eq. (4a) represents a fixed point problem to be solved at each instant. Assuming the state vector X to be fixed, let
the mapping in Eq. (4a) be represented by T . Let S be a closed subset of a Banach space χ and let T be a
mapping that maps S into S . The contraction mapping theorem is as follows [32].
Suppose that \| \mathbf{T}(x) - \mathbf{T}(y) \| \le \rho \| x - y \| for all x, y \in \mathbf{S}, with 0 \le \rho < 1. Then:
• There exists a unique vector x^* \in \mathbf{S} satisfying x^* = \mathbf{T}(x^*).
• x^* can be obtained by the method of successive approximation, starting from any arbitrary initial vector in \mathbf{S}.
A solution to Eq. (4a) can be obtained by successive approximation if it can be shown that the mapping shown in
Eq. (4a) is a contraction mapping. Let k denote the iteration instant. Eq. (4a) at instants k and 1k + can be
expressed as
\mathbf{T}(\mathbf{V}_k) = \mathbf{G}^{-1} [\mathbf{H} - \hat{d}(\mathbf{X}, \mathbf{V}_k)]
\mathbf{T}(\mathbf{V}_{k+1}) = \mathbf{G}^{-1} [\mathbf{H} - \hat{d}(\mathbf{X}, \mathbf{V}_{k+1})]    (5a)
Let

\mathbf{T}(\mathbf{V}_{k+1}) - \mathbf{T}(\mathbf{V}_k) = \mathbf{G}^{-1} [\mathbf{H} - \hat{d}(\mathbf{X}, \mathbf{V}_{k+1})] - \mathbf{G}^{-1} [\mathbf{H} - \hat{d}(\mathbf{X}, \mathbf{V}_k)]    (6a)

On simplification and expressing the neural network output in terms of weights and the basis function vector, Eq. (6a) becomes

\mathbf{T}(\mathbf{V}_{k+1}) - \mathbf{T}(\mathbf{V}_k) = \mathbf{G}^{-1} [\hat{\mathbf{W}}^T \Phi(\mathbf{X}, \mathbf{V}_k) - \hat{\mathbf{W}}^T \Phi(\mathbf{X}, \mathbf{V}_{k+1})] = \mathbf{G}^{-1} \hat{\mathbf{W}}^T [\Phi(\mathbf{X}, \mathbf{V}_k) - \Phi(\mathbf{X}, \mathbf{V}_{k+1})]    (7a)
Taking the norm of the right-hand side of Eq. (7a),

\| \mathbf{G}^{-1} \hat{\mathbf{W}}^T [\Phi(\mathbf{X}, \mathbf{V}_k) - \Phi(\mathbf{X}, \mathbf{V}_{k+1})] \| \le \| \mathbf{G}^{-1} \| \, \| \hat{\mathbf{W}} \| \, \| \Phi(\mathbf{X}, \mathbf{V}_k) - \Phi(\mathbf{X}, \mathbf{V}_{k+1}) \|    (8a)
The trigonometric basis functions used in this work ensure that
30
\| \mathbf{G}^{-1} \| \, \| \hat{\mathbf{W}} \| \, \| \Phi(\mathbf{X}, \mathbf{V}_k) - \Phi(\mathbf{X}, \mathbf{V}_{k+1}) \| \le \| \mathbf{G}^{-1} \| \, \| \hat{\mathbf{W}} \| \, \| \mathbf{V}_{k+1} - \mathbf{V}_k \|    (9a)
In order to prove that the mapping in Eq. (4a) is a contraction mapping, it needs to be shown from the inequality

\| \mathbf{T}(\mathbf{V}_{k+1}) - \mathbf{T}(\mathbf{V}_k) \| \le \| \mathbf{G}^{-1} \| \, \| \hat{\mathbf{W}} \| \, \| \mathbf{V}_{k+1} - \mathbf{V}_k \|    (10a)

that \| \mathbf{G}^{-1} \| \, \| \hat{\mathbf{W}} \| < 1. In order to ensure this inequality, the terms used to form the matrix \mathbf{G} should be carefully chosen. Since the matrix \mathbf{G} is formed from the control effectiveness matrix and the slack matrix \Psi(\mathbf{X}), the control designer has control over \| \mathbf{G}^{-1} \|. A proper choice of the matrix \mathbf{G} will guarantee convergence of the control iteration.