Neural Predictive Control for a Car-like Mobile · PDF file · 2002-06-01Neural Predictive Control for a Car-like ... The multi-layer back-propagation neural network is constructed

1

Neural Predictive Control for a Car-like Mobile RobotDongbing Gu and Huosheng Hu

Department of Computer Science, University of EssexWivenhoe Park, Colchester CO4 3SQ, UK

Fax: +44-1206-872788Email: [email protected], [email protected]

Abstract: This paper presents a new path-tracking scheme for a car-li ke mobile robot based on neuralpredictive control. A multi -layer back-propagation neural network is employed to model non-linearkinematics of the robot instead of a linear regression estimator in order to adapt the robot to a largeoperating range. The neural predictive control for path tracking is a model-based predictive controlbased on neural network modelli ng, which can generate its output in term of the robot kinematics and adesired path. The desired path for the robot is produced by a polar polynomial with a simple closedform. The multi -layer back-propagation neural network is constructed by a wavelet orthogonaldecomposition to form a wavelet neural network that can overcome the problem caused by the localminima when training the neural network. The wavelet neural network has the advantage of using anexplicit way to determine the number of the hidden nodes and initial value of weights. Simulationresults for the modelli ng and control are provided to justify the proposed scheme.

Key words: Model predictive control, Wavelet neural network, Path tracking

1. INTRODUCTION

Car-li ke mobile robots are mainly used in industry, ports, planet exploration, nuclear waste cleanup,agriculture and mining since they have the necessary loading capabilit y. The control scheme for suchmobile robots is traditionally decomposed into three subtasks: trajectory generation, position estimationand path tracking. Although behaviour-based approaches have been very popular since Brooks'invention in 1980s[2], their sole implementation on these mobile robots for many complex applicationscauses some diff iculty. Instead, a behaviour-based approach can be incorporated with a traditionalplanning-based system to form a hybrid system to deal with unexpected situations [1]. The aim of trajectory generation is to generate a feasible path that allows the robots to movesmoothly. A path consisting of straight lines and circular arc segments is easy to generate, but may notbe suitable for a car-li ke mobile robot which has the non-holonomic constraint and its steering angles atthe line-arc transition points have discontinuous curvature. Although a Clothoid's trajectory [10] canprovide smooth transitions with a continuous curvature, its disadvantage is lack of the closed-formexpression. Nelson proposed a continuous-curvature polar polynomial for path generation [14], whichhas a closed form expression and is easy to calculate. Pinchard improved Nelson's polar polynomialwith variable speeds, leading to a fifth-order polynomial for a continuous-curvature path [16]. Position estimation plays a key role in path tracking. Many approaches have been proposed in termsof the different sensors on robots. Odometry is simple, inexpensive, and easy to implement in real time,but suffers from the unbounded accumulation of errors. The triangulation approach is appropriate forinitiali sing position, but is too sensiti ve to angle measurements when robots move around. Grid-basedposition [12] is a successful method in dealing with sensory uncertainty, but is not suitable for model-based predictive control due to the time it takes. A Kalman filter is a feasible estimation approach thatcan fuse multiple sensory measurements to provide relatively accurate results. Path tracking for a car-li ke robot is in fact a feedback control problem to which model-basedpredictive control (MPC) has been an effective mechanism [4][19]. Most MPC control applications arebased on linear models of dynamic systems to predict outputs over a certain horizon. Future sequences

International Journal of Robotics and Autonomous Systems, Vol. 39, No. 2-3, May, 2002

2

of control signals are evaluated by minimising a cost index that takes account of the future outputprediction errors over a reference trajectory, as well as control efforts. However, when the systems arenon-linear and are operated over a large dynamic range, the use of linear models becomes impractical,and the identification of non-linear models for control becomes absolutely necessary. A Neural network is an attractive tool to model complex non-linear systems due to its inherent abilityto approximate arbitrary continuous functions. Conclusive proofs were given that feed-forward neuralnetworks with one hidden layer are capable of approximating any continuous function on a compact setin a very precise and satisfactory sense [8]. Some researchers have successful applied feed-forwardneural networks as system models to MPC, namely neural predictive control (NPC). For example, NPChas been used in manipulator control [3][5][20][21], chemical process control [17][24], mobile robotpath tracking [18][28], and other applications [11][23]. In these research works, the control schemes ofNPC were changed into non-linear optimisation problems. The optimisation constraints were the systemdynamics represented by neural networks. The main drawbacks are undesirable local minima and slowconvergence of back-propagation learning. Moreover, the implementation of multiple feed-forwardneural networks suffers from the lack of efficient constructive approaches, both for determiningparameters of neurones and for choosing network structures. Other algorithms such as random searchtechniques and genetic algorithms have been adopted to overcome the local minima caused by gradientdescent rules [11][18]. However, the computational cost of these algorithms will degrade theperformance of real-time control systems. Recently, wavelet decomposition emerged as a new powerful tool for function approximation in amanner that readily reveals properties of the arbitrary L2 function (energy-finite and continuous ordiscontinuous). Combining wavelets and neural networks can result in wavelet neural networks withefficient constructive approaches [15][25]. Because wavelet neural networks can approximate arbitraryfunctions belonging to L2 space, they have results in a wide variety of application areas, like speechrecognition [22] and digital communication [6]. They were also applied in robot applications whererobot kinematics is a continuous function and operates in a bounded range [20]. The wavelet neuralnetworks can further result in a convex cost index to which simple iterative solutions such as gradientdescent rules are justifiable and are not in danger of being trapped in local minima when choosing theorthogonal wavelets as the activation functions of nodes in neural networks [25].

Figure 1 A car-like mobile robot The rest of this paper is organised as follows. A car-like mobile robot being modelled in this researchis outlined in section 2, including a brief explanation of robot kinematics, trajectory generation based ona polar polynomial, and position estimation by a Kalman filter. Section 3 investigates the wavelet neuralnetwork modelling of robot kinematics. Neural predictive control for path tracking is described in

x

y θ

α

X

Y

H

O

R

3

section 4. The simulation results for the proposed control scheme are given in section 5, includingmodelling results and control results. Finally, section 6 presents a brief conclusion and future work.

2. A CAR-LIKE MOBILE ROBOT

The robot considered in this research is a car-like mobile robot with four pneumatic tyre wheels. Twofront wheels serve for steering and two rear wheels serve for tracking, as shown in figure 1. The robot has a modular structure such that it can be readily extended and easily configured to allowthe use of a range of sensor modules or actuator modules. Two 500 count/revolution rotary encoders arefixed onto two motor shafts for position servo control by a PID controller. Another two 500count/revolution rotary encoders have been connected to the rear axes near the rear wheels using 3:1ratio pulley and belts. The reason we use these two extra encoders is to compensate for the positionerrors caused by inaccuracy of the steering angle. The robot has a weight of 100 Kg and a maximumspeed of 1 m/s. Sensors include optical encoders, a sonar array, a laser scanner, an infra-red proximity,and a stereo head [7].

2.1 Robot kinematics

Assume that the midpoint of the rear axis of the mobile robot is a reference point (R as shown in figure1) in the global frame XOY. A state vector x(k) =[x(k), y(k), θ(k)]T is used to represent the position (x, y)and the orientation (θ ) of the robot. Its discrete kinematics equation can be described as follows:

∆+∆+∆+

=

+++

=+Hktkvk

kktkvky

kktkvkx

k

ky

kx

k

/))(sin()()(

))(cos())(sin()()(

))(cos())(cos()()(

)1(

)1(

)1(

)1(

αθαθαθ

θx (1)

where Tk�kvk )](),([)( =u is the control vector for motion tracking, ∆t is the sample period and H is thewheelbase between front and rear axes. The two control variables are the speed of robot ( )(kv ) and thesteer angle ( )(k� ). Such a non-linear system is open loop controllable, which can be linearised in order to use traditionallinear feedback control to regulate the robot. But if the robot operates over a large range in its statespace, especially when the robot turns around corners, the linearisation of the kinematics will l ead to theloss of controllabilit y. Since the MPC’s models are based on linear regressions, they are not suitable tonon-linear robotic systems. Furthermore, the kinematics is diff icult to model precisely because thereexist many physical sources of uncertainty, for example backlash, friction, load, and pneumatic tyres.Adaptive control should be employed to deal with the uncertainty.

2.2 Position estimation

Two kinds of sensors are used to estimate the robot state in this research. The optical encoders mountedon two rear wheels are used for odometry calculation. The problems for odometry are (i) surfaceroughness and undulation may cause the distance to be over-estimated. (ii ) wheel slippage can causedistance to be under-estimated. (iii ) variations in load can distort the odometry wheels and introduceadditional errors. Because odometry leads to unbounded accumulation of the errors, it is clear thatabsolute position estimation would be necessary. Another kind of sensors is the laser scanner on therobot interacting with artificial beacons mounted around the operating space, which can be used for theabsolute position estimation. A priori information about the beacons can be viewed as a form of worldmodel. With this model, the state of the robot is determined by integrating odometry with the bearing ofthe beacons measured by the laser scanner. To do this, an Extended Kalman Filter (EKF) is used toestimate the state based on information from the odometry and the laser scanner under the assumptionthat the errors of measurement follow a Gaussian distribution [7].

4

2.3 Path Generation

A path consisting of straight lines and circular arc segments is easy to generate, but may not be suitablefor a car-li ke mobile robot which has non-holonomic constraints and its steering angles at the line-arctransition points can generate discontinuities of the trajectory curvatures. A Clothoid's trajectory [10]provides smooth transitions with a continuous curvature. Its disadvantage is lack of closed-form positionexpression. Nelson proposed a continuous-curvature polar polynomial for path generation [14], which isa closed form expression and easy to calculate. Pinchard [16] improved the Nelson's polar polynomial with variable speeds, leading to a fifth-orderpolynomial for a continuous-curvature path, which is used here for trajectory generation. The position,tangent and curvature constraints yield a fifth-order polynomial for a continuous-curvature path. Itsgeneral form is:

55

44

33

2210)( φφφφφφ aaaaaar +++++= (2)

where r is the polar radius and φ the polar angle shown in figure 2. The initial and final conditions of thetransition between paths are:

r=R1, r'=0, c=0, at φ = 0r=R2, r'=0, c=0, at φ = Φ

where c is curvature. Φ is the angle the robot will t urn. R1 and R2 are the initial and final radii . v1 and v2

are the initial and final velocities. With these conditions, the coeff icients of a general polar polynomialcan be given as follows:

5

22

2121

5

4

22

2121

4

3

22

2121

3

12110

2

1212

2

233030

2

32020

2/,0,

ΦΦ+Φ−+−=

ΦΦ−Φ+−=

Φ

Φ+Φ−+−=

===

RRRRa

RRRRa

RRRRa

RaaRa

(3)

r

φ

Φ

v1

v2

R1

R2

X

Y

O

(x1, y1)

(x2, y2)

Figure 2 Path generation

2.4 NPC Path tracking scheme

To control a robot to track a path, the desired path and the measured path are two basic informationsources used for a controller. The trajectory generation module shown in figure 3 produced a desiredpath represented by a set of states xr(k)=[xr(k), yr(k), θr(k)] T. The controller, either MPC controller orPID controller shown in figure 3, yields the control vector u(k) =[v(k), α(k)] T. The motor movementleads to position changes that are estimated by an EKF, based on the sensory information. The current

5

state of the robot obtained from the position estimation module is estimated as xm(k)=[xm(k), ym(k),θm(k)]T. PID control is a simple and feasible choice. It uses the state errors (xr(k)- xm(k)) as feedbacksignals, which are further decomposed into tangential and normal errors. The dash line loop, which is inparallel with a MPC controller in figure 3, shows the PID control scheme. More details for the PIDcontrol scheme in path tracking can be found in our previous paper [7]. MPC is based on an assumed model of the robot kinematics and on an assumed scenario for thefuture control signals. The solid line parts in figure 3 show the MPC control scheme. With the desiredtrajectory and measured path mentioned above, it could produce a sequence of control signals. Only thefirst one is applied to the robot. Due to the reasons in modelling kinematics discussed in subsection 2.1,a neural network is employed to model the kinematics expressed in (1). It uses the measurement statesxm(k) and control variables u(k) to produce an adaptive kinematics model that provides a substrate forthe MPC controller. Section 4 will present the details about the algorithm. The neural network used to model the kinematics is trained according to u(k) and xm(k). Initially, thePID controller controls the robot to track the path in order to produce the training data for neuralnetwork learning. During the training stage, the MPC controller is replaced with the PID controllershown as a dash line in figure 3. Once the neural network has been trained, it can be used for the MPCcontroller and will be adapted during the path tracking process.

TrajectoryGeneration

RobotDriving

PositionEstimation

NeuralNetwork

MPCController

PIDController

Figure 3 NPC for path tracking

3. MODELLING ROBOT KINEMATICS

Due to the similarity between wavelet decomposition and hidden layers in neural networks, waveletneural networks were proposed by combining wavelet and neural networks in [25]. These waveletneural networks can remedy the weakness of a large dimension of wavelet decomposition and theconstructive problem of neural networks. In this section, the way to model the robot kinematics by usingwavelet neural networks is proposed.

3.1 Wavelet neural network

In multi-resolution signal analysis, a square integral function f(t) in L2 space can be decomposed into asummation of a series of weighted wavelet components[25]:

∑=nm

nmnm tftf,

,, )(,)( ψψ (4)

6

where )2(2)( 2/, ntt mmnm −= ψψ is the orthogonal wavelet function. The weight in each wavelet

component is nmf ,,ψ that denotes the inner product of function f(t) and wavelet nm,ψ . The formula

(4) can be rewritten in the following form according to the wavelet theory:

∑∑>

+=nMm

nmnmn

nMnM tftftf,

,,,, )(,)(,)( ψψϕϕ (5)

where )(, tnMϕ is called scaling function. Most of the physical functions used in engineering have

bounded energy and can be regarded as square integral functions. Therefore they can be decomposedinto an expression as shown in (5). What is more important for the function decomposition is that for a suff iciently large M, any squareintegral function f(t) can be approximated arbitraril y closely by the first term of (5)[25]. That is for any

0>ε

εϕϕ <− ∑n

nMnM tftf )(,)( ,,

It shows that the function approximation for f(t) in (5) can be expressed by the truncated waveletdecomposition as:

∑∑ =≈n

nMn

n

nMnM tCtftf )()(,)( ,,, ϕϕϕ (6)

It means that some fine components (high frequency) that can be represented by a series of thewavelet functions for the function f(t) are neglected and some coarse components (low frequency) thatcan be represented by a series of the scaling function are preserved to approximate the originatefunction under the M scale. How to choose a wavelet function is very important in wavelet decomposition. Most orthogonalwavelets lack the finite support set that can guarantee good time resolution. Daubechies’ waveletspossess the finite support set, but are not suff iciently smooth to guarantee good frequency resolution.Lemarie-Meyer’s wavelets are orthogonal and posses good frequency and time resolution, but there isno simple closed form expression required by the real-time applications in this paper. Walter and Zhangin [27] found that two classes of wavelets can be represented by the simple closed form expressions andare orthogonal, which can be seen as the particular examples of Lemarie-Meyer’s wavelets. We chooseone of them as the function approximation base in this paper. Its scaling function expression is

))4(1(

)1(cos4)1(sin)(

2tt

tttt

βπβπββπϕ

−++−=

Figure 4 shows its scaling function and its wavelet where β=1/4.

(a) Scaling function (b) Wavelet functionFigure 4 Orthogonal wavelet

7

The function approximation expression (6) has a similar structure with a 3-layer neural network asshown in figure 5. The number of hidden nodes is decided by wavelet translation n that depends on thesupport set of f(t). Here we can assume a positive integer N to form a range [-N, N] that can cover thesupport set of f(t). For a multiple dimension case, the scaling functions or wavelets are generated bytensor products of one-dimensional scaling functions or wavelets.

Σ t

ϕ

ϕ

ϕ

2M

n

1

n+1Cn-1

Cn

Cn+1

f(t)

•

•

•

•

•

•n-1

Figure 5 The wavelet neural network where nMn fC ,,ϕ=

3.2 Neural network modelling

We use a one-step-ahead predictive model to describe the robot kinematics (1). Thus, formula (1) canbe rewritten as:

x(k+1)=F(z(k)) (7)where z(k)=[v(k), α(k), xm(k), ym(k), θm(k)]T and x(k)=[x(k), y(k), θ(k)]T. This predictive model can berepresented by a wavelet neural network as shown in figure 6, where the input vector z(k) with its sizep=5 consists of the control vector and the estimated state, and the output vector x(k) with its size q=3 isthe predicted state. The activation function ϕd(t) in the hidden layer is a multidimensional scalingfunction that is the tensor product of one-dimensional scaling functions. Other network parameters are:

• the number of the hidden nodes r=2N+1;• the weight matrix from input to hidden layer is W1;• the weight matrix from input to hidden layer is W2.

Σ

v (k -1 )

ϕ d

ϕ d

ϕ d

•

•

•

ϕ d

Σ

Σ

α (k -1 )

x m (k -1 )

y m (k -1 )

θ m (k -1 )

x (k )

y (k )

θ (k )

Figure 6 Wavelet neural network for robot kinematics

8

Based on wavelet neural networks, the robot kinematics can be represented by:

x(k+1)= W2Φ( W1z(k)) (8)

where Φ=[ϕd,…, ϕd]T. The weight matrix W1 is invariable during the training cycle and expressed as

follows in terms of the wavelet decomposition.

5)12(

1

22

22

×+

=

N

MM

MM

W�

��

�

(9)

Initial values of weight matrix W2 can also be generated by the wavelet decomposition:

)12(3,31,3,3

,21,2,2

,11,1,1

2 )0(

+×+−−

+−−

+−−

=

NNNN

NNN

NNN

CCC

CCC

CCC

W�

�

�

(10)

It can be further updated by a steepest-gradient-decent algorithm.

W2(k+1)= W2(k)- µµΦeT (11)

where e =[xm(k)-x(k)] is the error between the estimated state from EKF and the predicted state from thewavelet neural network, and µµ is the update rate.

4. PREDICTIVE CONTROL SCHEME

A predictive controller is proposed here, based on the prediction model that predicts the future values ofthe robot state. The prediction model is constructed by the wavelet neural network shown in figure 6.The errors between predicted states and state estimates, combining with the control variations, are usedto form a quadratic index. The process of minimising the quadratic index is to find an optimal controlvector. At each control cycle during the path tracking, the wavelet neural network is first adapted basedon the estimated state, and then optimisation control is executed.

4.1 MPC formation

The outputs of the neural network (8) presented in the previous section can be rewritten in the statespace form as follows:

)](),([)]([

))((

))((

))((

)1(

)1(

)1(

)1(

3

2

1

kkFkF

kf

kf

kf

k

ky

kx

k m uxz

z

z

z

x ==

=

+++

=+θ

(12)

The MPC controller is to optimise a cost index J(x(k), u(k)) under the constraints of formula (12):

))(),((min)(

kkJk

uxu

The current control vector is chosen to minimise the state errors and control energy over severalsteps in the future so that the path tracking of the robot is smooth and stable. Therefore, the cost indexcan be expressed as

∑∑==

−+∆++−+=µN

ii

N

Nir dikikikJ

1

22)(

2

1)()(

2

1 2

1

u�

xx (13)

9

where λλ(i) is a weighting matrix (2 by 2 positive definite symmetric), penalising the control effort. Thiscost index reflects the desire to drive the next robot state x (k) close enough to the desired state xr(k)generated by the fifth order polynomial introduced in section 2.3 without excessive expenditure ofcontrol effort. The cost index is conditioned on data up to control cycle k, assuming no futuremeasurements are available. At each control cycle, an optimal control sequence is calculated, but onlythe first one is applied to the robot. This process will be repeated at the next control cycle to form areceding horizon optimisation procedure. The notations used for equation (13) are

• N1 and N2 are the minimum and maximum output horizons respectively;• Nµ is called the control horizon;• ∆u(k) is a set of control increments, ∆u(k) =∆u(k) -∆u(k-1) and ∆u(k+i)=0, if i>Nµ-1;• d is the dead-time of the system.

In general, N2 is chosen to encompass all the responses that are significantly affected by the currentcontrol. In practice, it is more typically set to approximate rise-time of the system. N1 is selectedaccording to dead time of the robot system, and is often taken as 1.

4.2 Optimisation algorithm

In each control cycle k, we need to compute a control increment matrix ∆U(k) (∆U(k)=[∆u(k), …,∆u(k+Nµ -1) ] T that has K vectors and choose the first one as the actual control increments for outputs ofthe controller. For the non-linear system presented in (12), there are no analytic expressions available.We have to implement the non-linear optimal algorithm to find an optimal solution of a control vectorduring the interval k and k+1 on the fly. It should be noticed that the constraints on the cost index (13)are a convex expression shown in (8) due to the wavelet orthogonal decomposition. Therefore, these Kcontrol increments can be calculated recursively from the steepest gradient decent algorithm during theinterval between k and k+1 without a local minimum. Let j be the iteration step for the optimisation (j=0, …, T) at the kth control cycle. T is the number ofthe discrete times and decided by the optimal terminal conditions. Figure 7 shows the calculating cyclerelationship during the control and optimisation process where k is the current control cycle and thetotal number of control cycles in the path tracking is S. In our tracking case, we choose:

N1 =1, N2 =L, d=1, Nµ =K

0 k k+1 k+K-1 S-1

timecyclej j+1 T-1

k+L-1

optimisation horizon

control horizon

output horizon

Figure 7 Time step for the predictive control

The control increment ∆U(k) can be obtained based on the cost index (13) according to the gradientdescent algorithm. Assume that µ is the optimising rate. We have:

jj JUU µδ−=∆ (14)

The cost index change jJUδ is caused by the control increment ∆U(k) and the state change jUXδ . It

can be represented in (15) by differentiating (13),jjjjJ U

�EX UU ∆+−= δδ (15)

10

where the state change jUXδ is also caused by control increment ∆U(k). The state-changing gain and

control-changing gain are jE and � respectively. They can be expressed as:

−+∂+∂

−+∂+∂

∂+∂

∂+∂

∂+∂

=

)1(

)(

)1(

)(0

)(

)(

)(

)(

)(

)1(

Kk

Lk

Kk

Kk

k

Lk

k

Kk

k

k

j

j

j

j

j

j

j

j

j

j

j

u

x

u

xu

x

u

x

u

x

XU ��

��δ

[ ]Tjr

jr

j LkLkkk )()()1()1( +−++−+= xxxxE �],,[ 1 Kdiag ��

=

Substituting (15) into (14), control increments during one control cycle can be expressed as:

jjj I EX�

U Uµδµ 1)( −+=∆ (16)

At each optimisation step j, the predicted states xj(k+i1) (i1=1, ..., L) are produced by the neuralnetwork based on the control vector uj-1(k+i2) (i2=0, ..., K-1). For the first optimisation step j=1,u0(k+i2) is initialised as uT(k-1+i2) (the control vector at the last control cycle). After T iterations, anoptimal control increment ∆U*(k) at the kth control cycle is produced. The main computation during this

optimising procedure is the matrix calculation )(

)(

lk

ik

+∂+∂

ux (i=1, …, L; l=0, …, T-1). With reference to the

neural network model (8), one of the matrix elements is presented as:

+≠

+=′=+∂

∂=

+∂−+∂

=+∂+∂ ∑∑

==

)1(0

)1(2)()(

))1((

)(

)(1

211

21

1

li

liwlk

w

lk

ikf

lk

ikxr

m

Mdm

r

m

dmϕ

α

ϕ

ααz

(17)

The optimal control vector can be obtained by choosing the first vector in ∆U*(k):

u*(k)=u*(k-1)+[1, 0, …, 0] ∆U*(k) (18)

5. SIMULATION RESULTS 1

A simulator based on the robot kinematics (1) is developed to validate the proposed approaches. TheGaussian noise with zero mean value is added to the robot state to simulate the estimated state xm(k).As the first stage, we need to collect data of the robot kinematics to train the wavelet neural network.This can be done by employing a PID controller to track the path (see figure 3). Previously a PIDcontrol scheme has been used for the real robot in [7] where we estimated the position by an EKF andused the same trajectory generation. With the tedious trials, PID can track the path. However, thisprocess has to be repeated when the robot travels on different road surface, or even on the same road at

1 One important aspect in modelling is that the system has to be excited enough (the control signal has to bepersistently exciting). However, the training data is generated based on the PID controller without any additionalexcitation noise. Because there is no drift term in the kinematics model (1), the lack of excitation is not a problem.This only holds if a model similar to (1) represents the dynamics of the true robot. The motivation to model the robotand not use (1) is to make the controller adaptable in case of changes in the system, such as increasing the loadof the robot. This will lead to an increase in inertia resulting in lower accelerations and decelerations. These effectscannot be modelled in the kinematics model (which only deals with the speed). The modelling may try toincorporate this in the model by introducing a drift term (which is possible according to the general kinematicsmodel in (7)). Without excitation noise, the model obtained along the PID trajectory may not be valid along thegenerated path, because the model does not have to generalise over a "larger" part of the state space.

11

different times for reasons of load change or tyre inflation. This is the main reason why we investigateadaptive control for path tracking. Once the training data is collected, formula (11) is used to train the wavelet neural network until acertain number of the training trials, or the errors between the network output and the simulated positionestimation are less than given thresholds. An initial kinematics is built up, which can be used for theNPC control scheme. In the NPC control process, the PID controller is replaced with the NPCcontroller. The control vector (18) output from the NPC controller is calculated at each control cycle,while on-line training of the network and the optimisation of the cost index are executed within thecurrent control interval (see figure 7).

5.1 Wavelets

The scaling function is shown in figure 4. The dilation and translation of the scaling function ϕM,n(t)=2Mϕ(2 M t-n) are centred in n/2 M. We transform the input and output ranges of the network into [-1/2,1/2]. So, the number of hidden nodes can be determined roughly as (2 M+1) p and p is the number ofinputs to the hidden nodes. For different outputs, p is selected according to the relationship betweeninputs and outputs described by the kinematics (1). To choose a proper M, a trial and error approach isused as follows [25]:

Step 1: Start from a small M.Step 2: Computed the mean square error (MSE).Step 3: Increase M by 1 and repeat from Step 1 if MSE is bigger than a threshold.

The sparse data in our path tracking case can further decrease the support range N of the networkinputs, e.g. the steer angle is changed in a smaller region.

5.2 Modelling ValidationWe implemented three path-tracking courses to collect the training data: straight line, turn right, andturn left. 65 hidden nodes were used to cover the range of the robot state. Only the weights in an outputlayer need to train (see figure 5). A path starting from (19000mm, 12600mm, 0rad) to (23000mm,14600mm, π/2rad) is used to validate the model of the kinematics. The results are shown in figure 8,where the unit of x, y, and x error is mm and of θ is radians. The number on the horizontal axis ismotion steps.

1 9 0 0 0

2 1 0 0 0

2 3 0 0 0

2 5 0 0 0

1 4 1 8 1 1 2 1 1 6 1 2 0 1

(a) A plot of x

12

1 2 5 0 0

1 3 5 0 0

1 4 5 0 0

1 5 5 0 0

1 4 1 8 1 1 2 1 1 6 1 2 0 1 2 4 1

(b) A plot of y

- 0 . 8

0

0 . 8

1 . 6

2 . 4

1 4 1 8 1 1 2 1 1 6 1 2 0 1 2 4 1

(c) A plot of θ

-4 0 0

-2 0 0

0

2 0 0

4 0 0

1 4 1 8 1 1 2 1 1 6 1 2 0 1 2 4 1

(d) A plot of x errorFigure 8 Neural modelling: (a) x, (b) θ , (c) y, and (d) x error.

This desired path was produced by the trajectory generation part shown in figure 3, which consists oftwo parts: a straight line starting from (19000mm, 12600mm, 0rad) to (21000mm, 12600mm, 0rad) anda left turning starting from (21000mm, 12600mm, 0rad) to (23000mm, 14600mm, π/2rad). The solidlines shown in the figure 8 (a)(b)(c) are the results of PID controller, which start to turn left aroundpoint (21400mm, 12600mm, 0rad) rather than (21000mm, 12600mm, 0rad) as desired since the PIDcontroller is not a perfect controller. It did not affect our simulation results since the collected data isonly used to train the wavelet neural network to model the kinematics. The dash lines in the figure 8(a)(b)(c) are the modelling results of the wavelet neural network. The data shows that the results of the

13

modelli ng can approximate to the dash lines. The approximate error of x is shown in the figure 8(d).They fall within a range of (-200mm, 200mm). In the straight-line part, the modelli ng error of x isrelative large comparing with the errors of y and θ since only x increased while y and θ nearly keptconstant. In the left turning part, all three variables increased to achieve their desired values.

5.3 Path tracking2

The trajectory generation produced a desired trajectory as shown in figure 9, which is the same as theone used in the modelli ng validation. The PID controller was first implemented to make the robot trackthis desired trajectory, namely “PID control” in figure 9. Based on the trained wavelet neural networkthe NPC controller was then implemented and the tracking result was also shown in figure 9, namely“NPC control” . In the NPC tracking, we use L=10, K=10, T=5. Although the modelli ng of the wavelet neural network was based on the data collected from theprocess of PID control, the NPC control has different performance with the PID control. There is notmuch difference in the straight-line part where only x error is large (this can be seen from the starting ofthe turning point). Then PID control turned the robot left with a larger angle than NPC control. This ledto the PID control trajectory diverging from the desired path than the NPC control trajectory at thebeginning of the turning. Later on, the deviation of the NPC trajectory from the desired path is causedmainly by x error, while the PID control trajectory deviates from the desired path in all three variables.

12500

13000

13500

14000

14500

15000

18000 19000 20000 21000 22000 23000 24000

PID control

NPC control

Desired Trajectory

Figure 9 Path tracking

PID NPC

Position (x,y) ASE 98.06 57.29

Orientation θ ASE 0.167 0.168

Table 1 Comparison of ASE for PID AND NPC

2 The general application domains of a MPC are slow systems such as process plants. The primary objective inthese cases is not the "optimality" of the control algorithm but the safety. The main advantage of a MPC is that itcan take into account arbitrary constraints. Not the first "optimal action" is applied, but the first optimal action thatdoes not violate the constraints within the prediction horizon is applied. In the mobile robot context we can say thatthe robot is not a point as in the simulation, the true robot has a certain width. This would seem like a more naturalmotivation for the use of a MPC for mobile robots. Take the trajectories in figure 9 as an example. Suppose that thepath is generated to make the robot turn around a corner. The PID result makes the robot move too much along theinside of the trajectory, which may lead to a collision of the robot with the corner. In this case the NPC result did notcollide, but only because it followed the path very closely. The collision was not prevented by rejecting all actionsthat may have resulted in a collision.

14

In NPC control, maximum errors in x and y are 118mm and 48mm respectively. In the PID tracking,maximum errors in x and y are 300.48mm and 51mm. A numerical comparison between PID and NPCcontrollers through the average squared error (ASE) with reference to the desired trajectory is shown intable 1. The ASE is an average value for whole trajectory. The position (x,y) ASE is the root of sum ofsquared error x and y. The position (x,y) ASEs in table 1 show that the NPC control trajectory is muchcloser to the desired path than the PID control trajectory. The orientation ASEs show that averageorientation deviations of the two trajectories from the desired are nearly the same in this section of thepath.

6. CONCLUSIONS

The neural predictive control scheme proposed in this paper includes path generation, positionestimation, and a wavelet neural network based MPC. Benefits resulted from wavelet neural networkswill be the convex cost index without the local minima dilemma, appropriate initial value setting ofweights, and a constructive approach for the hidden layer of feed-forward neural networks. Simulationresults show the ability of approximation of wavelet neural networks to the non-linear kinematics andadaptive path tracking by this wavelet neural network based predictive controller. This paper shows that adaptive neural network based MPC is one of the solutions for time-variablenon-linear systems. The aim is to provide a constructive way to build up the neural networks used formodelling and limit the size of neural networks. Further research should be conducted to investigate thestructurally dynamic wavelet neural networks that exploit the properties of wavelet decomposition tochoose the wavelets on the fly. It is equivalent to dynamically choosing the hidden nodes in the neuralnetworks based on the measurement of robot states [20]. In this way, the identification of non-linearsystems can be preceded on both the model parameters and the model structure. The implementation ofthis neural predictive control scheme on the real robot will be also conducted in the next stage.

Acknowledgement: The authors wish to thank the anonymous reviewers for their constructivecomments and suggestions which improved the quality of this paper.

REFERENCES

[1] R. C. Arkin, Behaviour-Based Robotics, The MIT Press, 1998.[2] R. Brooks, A Robust Layered Control System for a Mobile Robot, IEEE Journal of Robotics and Automation,

Vol. RA-2, No. 1, pages 14-23, 1986.[3] J. Canderle and D. Caldwell, Generalised Predictive Control of Pneumatic Muscle Actuator, European

Advanced Robotics Systems Master class and Conference-Robotics 2000, Salford, UK, April 2000.[4] D. Clarke, C. Mohtadi, and P. Tuffs, Generalised Predictive Control-Part I. The Basic Algorithm, Automatica,

Vol. 23, No. 2, pages 137-148, 1987.[5] J. A. Gangloff and M. F. Mathelin, High Speed Visual Servoing of a 6 DOF Manipulator Using MIMO

Predictive Control, Proc. of the 2000 IEEE Int. Conf. on Robotics and Automation, San Francisco, CA, April2000, pages 3751-3756.

[6] R. Gomez-Sanchez and D. Andina, A Wavelet Neural Network for detection of Signals in Communications,Proc. of SPIE, Vol. 3391, pages 265-272, 1998.

[7] D. Gu, H. Hu, M. Brady, F. Li, Navigation System for Autonomous Mobile Robots at Oxford, Proceedings ofInternational Workshop on Recent Advances in Mobile Robots, Leicester, U.K., pages 24-33, 1-2 July 1997.

[8] S. Haykin, Neural Network - A Comprehensive Foundation, Prentice Hall, 1999.[9] H. Hu, D. Gu, and M. Brady, A Modular Computing Architecture for Autonomous Robots, Int. Journal of

Microprocessors and Microsystems, Vol. 21, No. 6, pages 349-362, March 1998.[10] Y. Kanayama, and N. Miyaki, Trajectory generation for mobile robots, Proc. of the IEEE Int. Conf. Robotics

and Automation, pages 333-340, France, 1985.[11] M. Martinez, J. S. Senent, and X. Blasco, Generalised Predictive Control Using Genetic Algorithms

(GAGPC), Artificial Intelligence, Vol. 11, 1998, pages 355-367.

15

[12] H. P. Moravec, Sensor Fusion in Certainty Grids for Mobile Robots, AI Magazine, 9(2), pages 61-74, 1988.[13] K. S. Narendra and K. Parthasarathy, Identification and Control of Dynamical Systems, IEEE Trans. on

Neural Network, Vol. 1, No. 1, March 1990, pages 4-27.[14] W. L., Nelson, Continuous-curvature Paths for Autonomous Vehicle, Proc. of the IEEE Int. Conf. Robotics

and Automation, pages 1260-1264, 1989.[15] Y. C. Pati and P. S. Krishnaprasad, Analysis and Synthesis of Feedforward Neural Networks Using Discrete

Affine Wavelet Transformations, IEEE Trans. on Neural Network, Vol. 4, No. 1, Jan. 1993, pages 73-85.[16] O. Pinchard, A. Liegeois, and F. Pougent, Generalised polar polynomials for vehicle path generation with

dynamic constraints, Proc. of the IEEE Int. Conf. Robotics and Automation, pages 915-920, Minneapolis,Minnesota, 1996.

[17] S. Piche, et al, Non-linear Model Predictive Control Using Neural Networks, IEEE Control SystemsMagazine, Vol. 20, No. 3, June 2000, pages 53-62.

[18] D. Ramirez, et al, Non-linear MBPC for Mobile Robot Navigation Using Genetic Algorithms, Proceedings ofthe 1999 IEEE Int. Conf. on Robotics & Automation, Detroit, Michigan, May 1999, pages 2452-2457.

[19] J. B. Rawlings, Tutorial Overview of Model Predictive Control, IEEE Control Systems Magazine, Vol. 20,No. 3, June 2000, pages 38-52.

[20] R. M. Sanner and J. E. Slotine, Structurally Dynamic Wavelet Networks for the Adaptive Control of UncertainRobotic Systems, Proceedings of the 34th Conf. on Decision & Control, New Orland, LA, Dec. 1995, pages2460-2467.

[21] B. Song and A. Koivo, Neural Network Model Based Control of a Flexible Link Manipulator, Proc. of theIEEE Int. Conf. on Robotics & Automation, Leuven, Belgium, May 1998, pages 812-817.

[22] H. H. Szu, B. Telfer, and S. Kadambe, Neural Network Adaptive Wavelets for Signal Representation andClassification, Optical Engineering, Vol. 31, No. 9, September 1992, pages 1907-1916.

[23] Y. Tan and R. Keyser, Neural Network Based Adaptive Predictive Control, Advances in Model BasedPredictive Control, Edited by D. Clarke, pages 358-369, 1994.

[24] J. M. Zamarreno and P. Vega, Neural Predictive Control: Application to a Highly Non-linear System,Artificial Intelligence, Vol. 12, 1999, pages 149-158.

[25] J. Zhang, G. Walter, Y. Miao, and W. Lee, Wavelet Neural Networks for Function Learning, IEEE Trans. onSignal Processing, Vol. 43, No. 6, June 1995, pages 1485-1495.

[26] Q. Zhang, Using Wavelet Network in Non-parametric Estimation, IEEE Trans. on Neural Network, Vol. 8,No. 2, March 1997, pages 227-236.

[27] G. Walter and J. Zhang, Orthonormal Wavelets with Simple Closed-Form Expressions, IEEE Trans. on SignalProcessing, Vol. 46, No. 8, 1998, pages 2248-2251.

[28] J. G. Ortega, and E. F. Camacho, Neural Network GPC for Mobile Robot Path Tracking, Robotics andComputer-Integrated manufacturing, Vol. 11, pages 271-278, 1994.

Documents

Neural Predictive Control for a Car-like Mobile · PDF file · 2002-06-01Neural Predictive Control for a Car-like ... The multi-layer back-propagation neural network is constructed