Neural network-based control of balanced robotic manipulators with joint flexibility

Mechatronics Vol. 1, No. 4, pp. 487-507, 1991. 0957-4158/91 $3.00+0.00. Printed in Great Britain. © 1991 Pergamon Press plc

NEURAL NETWORK-BASED CONTROL OF BALANCED ROBOTIC MANIPULATORS WITH JOINT FLEXIBILITY

S. B. LEE and H. S. CHO*

Department of Production Engineering, Korea Advanced Institute of Science and Technology, P.O. Box 150, Chongryangri, Seoul, Korea

(Received 16 January 1991; accepted 23 May 1991)

Abstract--This paper addresses the improvement of the controlled performance of balanced manipulators at a practical level by implementing a neural network-based controller. Mass balancing of robotic manipulators has been shown to have favorable effects on their dynamic characteristics. However, it has also been pointed out that, for manipulators having a certain degree of flexibility at the joints, mass balancing results in vibratory motion at high speed operation because of the lowered structural natural frequencies and the reduced velocity related terms. Such a vibratory tendency of the balanced flexible joint manipulator limits the admissible range of servo gains of conventional controllers, making those controllers unsuitable for controlling the manipulator at high speeds.

To overcome this difficulty, an artificial neural network (NN) controller is introduced to realize the dynamic control of balanced flexible joint manipulators. A feedforward type of NN controller is proposed and its performance is evaluated through a series of numerical simulations. The proposed NN controller showed much better tracking performance than the conventional PD controller.

1. INTRODUCTION

There have been ever-increasing needs and many efforts for improving the motion accuracy of robotic manipulators at high speed. Recently, software-oriented control methods [1] have been preferred, partly due to the advancement in computer technology. There have also been many efforts to reduce the complexity of manipulator dynamics through physical modification of the kinematic and mass distribution structures [2-4].

Among the latter approaches, Chung and Cho [3, 4] proposed an effectively implementable method by introducing an automatic balancing mechanism (ABM). Chung [4] showed that the dynamic characteristics can be greatly enhanced through mass balancing. This study, however, was performed without taking into account the joint flexibility, which is an inherent characteristic of most robotic manipulators. The joint flexibility can cause inaccurate movement and even instability at high speed

*To whom correspondence should be addressed.


motion. Such effects become critical for the operations that require dynamic force control such as assembly and machining [6, 7].

Lee and Cho [8] investigated the effects of the joint flexibility on the dynamic performances of the balanced manipulators. They found that the joint flexibility effect lowers the structural natural frequencies and renders the balanced manipulator, which has reduced velocity related terms such as Coriolis and centrifugal terms in its dynamics, very susceptible to vibratory motion. Since such a vibratory trend of balanced flexible joint manipulators results partly from the reduced Coriolis terms, the difference in vibratory trend between balanced and unbalanced manipulators becomes most apparent at high operational speed, in which the effects of the velocity related terms are large. This high vibrational tendency of the flexible joint balanced manipulator limits the use of high feedback gain for the conventional controllers such as PD or PID controllers, and thus renders them unsuitable for high speed motion control. It is known that such characteristics may give equally undesirable results under the implementation of various adaptive control schemes. In order to achieve accurate tracking capability of robotic manipulators at high speed, dynamic control is often required. The computed torque control method was widely applied to the control of rigid manipulators in diverse forms and a few research efforts were made to extend it to the control of manipulators with flexible joints [9-13]. Forrest [9] applied the inverse dynamics control scheme to a cylindrical manipulator with flexible joints. For the revolute manipulators which generally do not have global nonlinear feedback linearizability, Spong [13] designed an approximate nonlinear feedback linearization controller using reduced mathematical models based on the concept of integral manifolds [12] and assumptions on the rotational kinetic energy of the motor [10]. 
The performance of this mathematical inverse dynamics control largely depends on the completeness and exactness of the mathematical models used for the inverse dynamics calculation. In addition, successful implementation of these controllers requires the measurement of the arm link accelerations and jerks. In view of the uncertainty and complexity in the dynamic structure of flexible joint manipulators and the limited capability of present sensor technology, these schemes may present considerable difficulties in actual implementation.

To avoid such criticism, this paper introduces the use of artificial neural network (NN) controller to realize successful dynamic control of the flexible joint balanced manipulators. Recently, diverse types of NN controllers for rigid manipulators [14, 16, 18, 19, 21] have been proposed and evaluated to achieve good control performance. One previous research effort [20] can be found on the application of the neural network to the dynamic control of flexible joint manipulators. Yet, its application is still limited to a single axis manipulator. The proposed neural network based controller operates in two phases. In the first phase, the learning capability of the NN controller is utilized to identify the inverse dynamics of the manipulator without a priori knowledge about the dynamic model equation. Then, in the second phase, the trained NN controller functions as a dynamic controller. The controller consists of a NN feedforward controller implemented in parallel with the motor servoing PD controller. The performance of the proposed controller is evaluated through a series of numerical simulations, using a PUMA 760 model. The results show that the NN controller gives better tracking performance over the conventional PD controller.


2. DYNAMICS OF A BALANCED MANIPULATOR WITH JOINT FLEXIBILITY

This study considers a three-joint PUMA 760 robot model, which is balanced at each joint as shown in Fig. 1. In this figure, arm links 2 and 3 are equipped with a balancing mechanism which can position counter-balancing weights to eliminate the gravity loadings acting on joints 2 and 3. Each joint of the manipulator is composed of an actuator, a transmission unit and an arm link. The transmission units of the PUMA 760 are made of gear trains consisting of spur and bevel gears as shown in Fig. 2(a). As in most other research [9-13] on flexible joint manipulators, the transmission units are considered as equivalent springs, and the gear inertia and damping properties are appropriately allocated to the motor rotors and arm links. The transmissions then act as serially connected torsional springs and the resulting torsional stiffness $K_i^J$ of the ith joint transmission unit, as referred to the arm link side, becomes

$$(K_i^J)^{-1} = \sum_{m=1}^{r_i} \left( \prod_{k=m}^{r_i} n_{ik}^{2} \right)^{-1} (K_{im}^{G})^{-1} + (K_{i(r_i+1)}^{G})^{-1}, \qquad (1)$$

where $K_{im}^{G}$ denotes the torsional spring constant of the mth gear shaft of the ith axis, $n_{ik}$ is the gear ratio of the kth gear pair of the ith axis and $r_i$ represents the number of gear pairs in the ith axis.
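The series-spring combination in equation (1) can be illustrated with a short numerical sketch. The function name, the argument layout and the indexing convention (shafts listed from the motor side to the arm link side, with each gear ratio being the reduction between consecutive shafts) are assumptions made for illustration, not details taken from the paper.

```python
from math import prod


def equivalent_joint_stiffness(shaft_stiffness, gear_ratios):
    """Equivalent torsional stiffness of one joint, referred to the arm link side.

    shaft_stiffness: r + 1 spring constants, motor-side shaft first (assumed order).
    gear_ratios:     r reduction ratios; gear_ratios[m] sits between shaft m and m + 1.
    """
    r = len(gear_ratios)
    if len(shaft_stiffness) != r + 1:
        raise ValueError("expected one more shaft than gear pairs")

    # The last shaft is already on the arm link side, so it is not reflected.
    compliance = 1.0 / shaft_stiffness[r]
    for m in range(r):
        # Reflect shaft m through every gear pair between it and the arm link:
        # stiffness scales with the square of the downstream gear ratios.
        reflected = shaft_stiffness[m] * prod(n ** 2 for n in gear_ratios[m:])
        compliance += 1.0 / reflected

    # Springs in series add their compliances.
    return 1.0 / compliance
```

With a single gear pair of ratio 2 and shaft stiffnesses of 10 and 40, the motor-side shaft reflects to 40 on the link side, and the two equal springs in series give 20.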

A schematic diagram of a representative joint having an equivalent torsional spring is shown in Fig. 2(b). The dynamic equations are derived using Lagrangian formulation. In the dynamic formulation, it was assumed that the rotational kinetic energy of motor rotors and gear pairs originates entirely from their rotational motions, which

Fig. 1. Schematic of a balanced PUMA 760 robot.


Fig. 2. Schematics of a flexible joint model: (a) a representative motor rotor-transmission-arm link mechanism; (b) a flexible joint represented as an equivalent torsional spring.

indicates that the effect of arm link motion can be neglected [10]. Then, the resulting equations of motion of the balanced manipulator with flexible transmission units can be written as

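$$[D]\ddot{\theta}^{L} + H(\theta^{L}, \dot{\theta}^{L}) + [B^{L}]\dot{\theta}^{L} + \tau^{LC} + \tau^{E} = [K^{J}]\{[N^{J}]^{-1}\theta^{M} - \theta^{L}\} \qquad (2)$$

$$[J^{M}]\ddot{\theta}^{M} + [B^{M}]\dot{\theta}^{M} + \tau^{MC} + [N^{J}]^{-1}[K^{J}]\{[N^{J}]^{-1}\theta^{M} - \theta^{L}\} = \tau^{M}, \qquad (3)$$

where

$$H(\theta^{L}, \dot{\theta}^{L}) = \begin{bmatrix} \dot{\theta}^{LT}[H_1]\dot{\theta}^{L} \\ \dot{\theta}^{LT}[H_2]\dot{\theta}^{L} \\ \dot{\theta}^{LT}[H_3]\dot{\theta}^{L} \end{bmatrix},$$

$\theta^{L}$ and $\theta^{M}$ are the (3 x 1) angular position vectors of the arm links and the motor rotors, respectively, $[D]$ and $[B^{L}]$ denote the (3 x 3) inertia and damping matrices of the arm links, respectively, $[H_i]$ (i = 1, 2, 3) is the (3 x 3) Coriolis and centripetal coefficient matrix of the ith axis, $[J^{M}]$ and $[B^{M}]$ represent the inertia and damping matrices of the motor rotors, respectively, $\tau^{LC}$ and $\tau^{MC}$ are the (3 x 1) Coulomb friction vectors of the arm links and motor rotors, respectively, and $\tau^{M}$ denotes the (3 x 1) motor input torque vector. In the above,

$$[K^{J}] = \begin{bmatrix} K_1^{J} & 0 & 0 \\ 0 & K_2^{J} & 0 \\ 0 & 0 & K_3^{J} \end{bmatrix}, \quad \text{and} \quad [N^{J}] = \begin{bmatrix} N_1 & 0 & 0 \\ 0 & N_2 & 0 \\ 0 & 0 & N_3 \end{bmatrix},$$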

where $N_i$ represents the total gear ratio of the ith joint. The inertia matrix $[D]$ and the Coriolis and centripetal coefficient matrices $[H_i]$ of the balanced manipulator are found to have significantly simplified structures compared with those of the unbalanced manipulator [3, 4]. In particular, the inertia elements in $[D]$, except $D_{11}$, become constant and more than half of the Coriolis and centripetal coefficient elements are eliminated. With its simplified dynamic structure, the balanced manipulator exhibits many favorable dynamic characteristics such as reduced driving torque, high speed motion, and increased payload capacity [4]. Nevertheless, when joint flexibility exists, it causes the manipulator to have a highly vibratory tendency at high speed [8]. This vibrational tendency limits the maximum admissible feedback gains of the conventional PD controller, making it incapable of achieving accurate motion control at high speed.

3. NEURAL NETWORK-BASED CONTROLLER

Recently, various types of neural network-based controllers for robot motion control were proposed and evaluated [14, 16, 18, 19,21]. Most of the studied NN controllers have multilayer NN structure and error back propagation learning architecture. Generalized learning architecture [14] shown in Fig. 3(a) can be used when the input torques and motion responses of the manipulator are known. Since in most cases the amount of available training data sets are limited, and often only the desired path trajectory of the manipulator is known, the generalized training method is inappropriate to train the NN controller to render a good performance tracking the desired trajectory. Specialized learning [14] and inverse transfer learning schemes [15], as shown in Fig. 3(b) and (c), respectively, use only the manipulator output error information to train the NN controller. By backpropagating the manipulator dynamic response error through the manipulator dynamics and the inverse transfer function of the manipulator, respectively, these schemes complete the gradient descent search for NN weights that minimize the manipulator dynamic response error energy function. Although these methods enable unsupervisory training of the NN controller in the on-line mode, they require calculations of the input-output sensitivity and inverse transfer function of the manipulator, respectively. Such requirements can be fulfilled only when an adequate amount of a priori knowledge is given about the manipulator dynamics. These requirements end up debasing one of the important merits of NN controllers, that is, no need for a priori knowledge about the controlled plant dynamics. Feedback error learning architecture, proposed by Kawato [16], as shown in Fig. 3(d), uses the feedback error directly without passing through the dynamics or inverse transfer function of the manipulator to train the NN controller. Since this method does not take into consideration the manipulator dynamics, the


Fig. 3. Various learning architectures: (a) general learning; (b) specialized learning; (c) inverse transfer function learning; (d) feedback error learning.

learning convergence and stability properties largely rely on the selected values of training parameters such as learning rate and the initial choice of the NN weights [19]. Despite such drawbacks, this scheme has been favored because it does not require the plant sensitivity parameters. In this study, we mainly follow the concept of feedback error learning to construct the NN controller for the flexible joint manipulator. But here, we do not directly use the output of the feedback controller to train the NN. Instead, the weighted sum of the position and velocity output errors of the arm link will be used as training errors, which will be illustrated in Fig. 6.


3.1. Multilayer NN feedforward controller

The implementation of a feedforward control can improve the tracking performance of the manipulator, especially in high speed motion. In this study, we attempt to adopt a multilayer neural network to construct the feedforward controller for improving the high speed tracking ability. Theoretically, the inverse dynamics feedforward controller, such as the computed torque feedforward controller, can accomplish perfect tracking if the inverse dynamics model is exact. As mentioned previously, the mathematical algorithmic realization of this is extremely difficult for the flexible joint manipulator. By adopting the NN feedforward controller such difficulty can be resolved. Since the trajectories are planned only for the arm links but not for the motor rotors, we must use only the arm link trajectories for the feedforward controller input variables. It can be seen in the dynamic equations (2) and (3) that the jerk rate, jerk, acceleration, velocity and position trajectories of the arm links are required to calculate the complete inverse dynamics feedforward torque only in terms of arm link variables. The NN feedforward controller should require the use of all these trajectory variables up to the 4th order derivative, i.e. jerk rate so as to approximate the complete inverse dynamics. But, our initial attempt to feed all these high order derivative trajectories to the NN feedforward controller has failed to give satisfactory performances. This has occurred mainly because the rapid variation of the feedforward torque generated by the NN controller during early learning phase induced undesirable vibration and resulted in unstable learning. Therefore, a modified NN controller is constructed using the acceleration, velocity and position trajectories for its input variables. This NN feedforward controller can approximately estimate the inverse dynamics in a relatively low speed range with respect to the structural natural frequencies [20].

Figure 4 shows the structure of the NN designed for the feedforward controller. In the figure, $i_n$ represents the input variable at the nth node of the input layer, $o_j$ and $o_k$ are the outputs of the jth node of the hidden layer and the kth node of the output layer, respectively, $w_{ji}$ and $b_j$ represent the weight connecting the input layer node i and the hidden layer node j and the bias of the hidden layer node j, respectively, and $w_{kj}$ and $b_k$ denote the connecting weight and bias between the hidden layer and the output layer, respectively. The network consists of an input layer receiving the

Fig. 4. Three layer neural network model having one internal hidden layer which consists of both nonlinear and linear nodes.


planned trajectory data ($\theta^{Ld}$, $\dot{\theta}^{Ld}$, $\ddot{\theta}^{Ld}$) of the three arm links, one hidden layer having 30 nodes, and an output layer generating the feedforward control torques ($\tau_{n1}$, $\tau_{n2}$, $\tau_{n3}$) for the three axes. Since the balanced manipulator has simplified dynamics and some of the nonlinear properties in its dynamics are removed, a hidden layer composed of both nonlinear and linear nodes is used to improve learning performance. Half of the hidden layer nodes use nonlinear activation functions, while the rest use linear activation functions. The following sigmoid function [19] was used as the activation function for the nonlinear nodes:

$$f(x) = \frac{s[1 - \exp(-4x/s)]}{2[1 + \exp(-4x/s)]}, \qquad (4)$$

where s is the shape parameter. The linear nodes are used in the output layer so as not to restrain the magnitude and sign of the generated torques.
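As a quick check on equation (4), the function can be rewritten as $(s/2)\tanh(2x/s)$, which makes its properties easy to read off: the output saturates at $\pm s/2$ and the slope at the origin is 1 for any $s$. The sketch below uses that equivalent form because it avoids overflow for large negative $x$; the function names are illustrative only.

```python
import math


def shaped_sigmoid(x, s):
    """Equation (4), evaluated through the identity f(x) = (s / 2) * tanh(2 x / s)."""
    return 0.5 * s * math.tanh(2.0 * x / s)


def shaped_sigmoid_derivative(x, s):
    """f'(x) = 1 - tanh(2 x / s)^2, which equals 1 at x = 0 for every s."""
    return 1.0 - math.tanh(2.0 * x / s) ** 2


def shaped_sigmoid_direct(x, s):
    """Equation (4) exactly as written, for comparison with the tanh form."""
    e = math.exp(-4.0 * x / s)
    return s * (1.0 - e) / (2.0 * (1.0 + e))
```

A small $s$ gives a curve that saturates quickly (strongly nonlinear), while a large $s$ keeps the node nearly linear over a wide input range.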

We attempted to realize the NN feedforward controller by the feedback error learning scheme with the arm link servo PD controller as shown in Fig. 3(d). This servo generates the following compensation torque:

$$\tau_c(t) = [k_p][\theta^{Ld} - \theta^{L}] + [k_v][\dot{\theta}^{Ld} - \dot{\theta}^{L}]. \qquad (5)$$

In this scheme the NN directly uses $\tau_c$ as the training error. Our initial attempt to construct this controller failed due to the inherent instability of the arm link servo PD controller, as highly flexible joint manipulators are reported to be marginally stable [12]. In order to investigate the stability of the balanced manipulator with joint flexibility, the system pole locations of the manipulator are surveyed. To obtain the poles at a certain posture of the manipulator, the dynamic equations (2) and (3) are linearized about a specified configuration $q = q_0$. Then, neglecting the second order terms of $\Delta\dot{q}$, the linearized vector equation can be written as

$$[M(q_0)]\Delta\ddot{q}(t) + [B]\Delta\dot{q}(t) + [K]\Delta q(t) = \Delta\tau(t), \qquad (6)$$

where $\Delta q$ is the 6 × 1 perturbed angular position vector, $\Delta\tau$ represents the 6 × 1 perturbed torque vector,

$$q(t) = [\theta^{L} : \theta^{M}]^{T}, \quad \tau^{C}(t) = [\tau^{LC} : \tau^{MC}]^{T}, \quad \tau(t) = [0 : \tau^{M}]^{T},$$

and

$$[M] = \begin{bmatrix} [D] & [0] \\ [0] & [J^{M}] \end{bmatrix}, \quad [B] = \begin{bmatrix} [B^{L}] & [0] \\ [0] & [B^{M}] \end{bmatrix}, \quad [K] = \begin{bmatrix} [K^{J}] & -[K^{J}][N^{J}]^{-1} \\ -[K^{J}][N^{J}]^{-1} & [N^{J}]^{-1}[K^{J}][N^{J}]^{-1} \end{bmatrix}.$$

Figure 5(a) shows the 12 open loop poles of the manipulator, which are obtained at the manipulator posture displayed in the figure. It can be seen that they are located very closely to the imaginary axis of the s-plane. Figure 5(b) shows the closed loop pole locations at the same posture for the case of the arm link servo PD controller which utilizes the arm link state variables as feedback signals. The following PD servo gains were used to calculate the poles:


Fig. 5. Poles of the balanced flexible joint manipulator, plotted as imaginary versus real part of s: (a) open loop; (b) arm link servo PD controller; (c) motor servo PD controller.


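$$[k_p] = \begin{bmatrix} 1500 & 0 & 0 \\ 0 & 1200 & 0 \\ 0 & 0 & 3000 \end{bmatrix}, \quad [k_v] = \begin{bmatrix} 70 & 0 & 0 \\ 0 & 60 & 0 \\ 0 & 0 & 140 \end{bmatrix}.$$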

It can be seen that the poles are moved into the right half s-plane and the system becomes unstable due to the arm link PD controller. Although the magnitudes of the pole movements depend on the PD gain values, the arm link servo PD controller tends to further reduce the stability of the marginally stable open loop system and can even make the system unstable for an improper choice of the feedback gains. Since this inherent property of the control structure makes the manipulator highly oscillatory or even divergent during the early stages of learning, it seems that the feedback error learning NN with the arm link servo PD is not suitable for the flexible joint manipulator.
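The pole survey described above amounts to an eigenvalue computation on the state-space form of a linearized second-order system $[M]\ddot{q} + [B]\dot{q} + [K]q = 0$. The sketch below shows one way to carry it out with NumPy; the function name is illustrative, and the manipulator matrices themselves are not reproduced here because their numerical values are not given in the text.

```python
import numpy as np


def second_order_poles(M, B, K):
    """Poles of M q'' + B q' + K q = 0 via the first-order form x' = A x, x = [q, q']."""
    M = np.atleast_2d(np.asarray(M, dtype=float))
    B = np.atleast_2d(np.asarray(B, dtype=float))
    K = np.atleast_2d(np.asarray(K, dtype=float))
    n = M.shape[0]

    # A = [[0, I], [-M^-1 K, -M^-1 B]]; solve() avoids forming M^-1 explicitly.
    A = np.block([
        [np.zeros((n, n)), np.eye(n)],
        [-np.linalg.solve(M, K), -np.linalg.solve(M, B)],
    ])
    return np.linalg.eigvals(A)


# Single degree of freedom check: s^2 + 2 s + 5 = 0 has roots -1 +/- 2j.
poles = second_order_poles([[1.0]], [[2.0]], [[5.0]])
```

A system with n coordinates yields 2n poles, which matches the 12 poles reported for the six coordinates (three arm links and three motor rotors). For a closed loop study, one plausible approach is to add the feedback gain terms to the motor rows of the stiffness and damping matrices before calling the function.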

Therefore, for the successful unsupervisory on-line training of the NN controller, another means to improve the system stability and to guarantee the manipulator motion in the vicinity of the planned trajectory during the early stages of learning should be contrived. Tomei [23] showed that the motor velocity feedback could solve the intrinsic instability problem of the arm link motion feedback alone. Potkonjak [24] showed that the motor variable feedback PD controller made the manipulator more stable than the arm link servo PD controller. Based upon these findings, a motor servo PD controller was adopted as a mate controller of the NN controller, as shown in Fig. 6. This arrangement was made to improve the system stability and guarantee the manipulator motion in the vicinity of the planned trajectory during the NN training. The motor servo PD controller outputs the compensating torque according to the following equation:

$$\tau_c(t) = [k_p][N^{J}]^{-1}[\theta^{Md} - \theta^{M}] + [k_v][N^{J}]^{-1}[\dot{\theta}^{Md} - \dot{\theta}^{M}], \qquad (7)$$

where $\theta^{Md}$ indicates the desired motor displacement vector. Figure 5(c) shows the pole locations of the manipulator with the motor servo controller (7) when the same servo gains as those used in the arm link servo are used. We can see that the system poles are moved farther into the left half s-plane by the motor servo PD controller,

Fig. 6. Neural network based feedforward controller implemented in parallel with motor servo PD controller.


compared to the open loop poles in Fig. 5(a). This indicates the apparent improvement of the system stability.

Since, in the case of flexible joint manipulators, the desired motor motions $\theta^{Md}$ compatible with the desired arm link motions $\theta^{Ld}$ cannot be obtained due to flexibility in the joints, the desired arm link motions are used as the desired motor motions of the motor servo PD controller, i.e. $\theta^{Md} = [N^{J}]\theta^{Ld}$. Since the joint stiffnesses are still very high, the motor servo controller thus designed shows reasonable control performance at low speed operations and guarantees the manipulator motion in the vicinity of the desired trajectory even at high speed. In practice, the motor servo PD controller is a typical controller implemented in most conventional manipulators. We will use this controller as a mate controller with the NN controller, as shown in Fig. 6.
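The motor servo law of equation (7), combined with the substitution of the gear-scaled arm link trajectory for the desired motor motion, can be sketched as follows. Diagonal gain and gear ratio matrices are stored as 1-D arrays; the function name and the gear ratio of 50 used in the example are hypothetical, since the paper does not list the gear ratios here.

```python
import numpy as np


def motor_servo_pd_torque(kp, kv, gear_ratios,
                          theta_ld, dtheta_ld, theta_m, dtheta_m):
    """Compensating torque of equation (7) with diagonal [kp], [kv] and [N]."""
    kp, kv, N = (np.asarray(a, dtype=float) for a in (kp, kv, gear_ratios))

    # Desired motor motion taken as the gear-scaled desired arm link motion.
    theta_md = N * np.asarray(theta_ld, dtype=float)
    dtheta_md = N * np.asarray(dtheta_ld, dtype=float)

    pos_err = theta_md - np.asarray(theta_m, dtype=float)
    vel_err = dtheta_md - np.asarray(dtheta_m, dtype=float)

    # Multiplication by [N]^-1 is an element-wise division for diagonal [N].
    return (kp * pos_err + kv * vel_err) / N


# Servo gains quoted in the text; the gear ratios are placeholder values.
tau_c = motor_servo_pd_torque(
    kp=[1500.0, 1200.0, 3000.0], kv=[70.0, 60.0, 140.0],
    gear_ratios=[50.0, 50.0, 50.0],
    theta_ld=[0.1, 0.0, 0.0], dtheta_ld=[0.0, 0.0, 0.0],
    theta_m=[0.0, 0.0, 0.0], dtheta_m=[0.0, 0.0, 0.0],
)
```

Note that this law uses only motor-side measurements, so its output carries no direct information about the arm link tracking error.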

Now, because the output $\tau_c(t)$ of the motor servo PD controller does not reflect the arm link output error, the NN training cannot be performed directly by the PD feedback error learning. Instead, as shown in the figure, the NN training is performed by the weighted sum of the angular position and velocity errors of the arm link:

$$e_i = a_{i0}(\theta_i^{Ld} - \theta_i^{L}) + a_{i1}(\dot{\theta}_i^{Ld} - \dot{\theta}_i^{L}) \quad (i = 1, 2, 3), \qquad (8)$$

where $a_{i0}$ and $a_{i1}$ are the weighting coefficients of the position and velocity errors of the ith arm link, respectively.

In the proposed NN controller architecture, the motor servo PD controller takes the role to improve the system stability and guarantee the manipulator motion in the vicinity of the planned trajectory, while the role to make the manipulator accurately track the desired trajectory is taken by the NN controller.

3.2. Learning architecture

Figure 6 shows the NN feedforward controller implemented with the motor servo PD controller and its training architecture. The objective of training is to minimize the following energy function E, which represents the weighted sum of the squared angular position and velocity errors of the arm links.

$$E = \frac{1}{2}\sum_{i=1}^{3}\{\alpha_i(\theta_i^{Ld} - \theta_i^{L})^2 + \beta_i(\dot{\theta}_i^{Ld} - \dot{\theta}_i^{L})^2\}, \qquad (9)$$

where $\alpha_i$ and $\beta_i$ denote the weighting parameters of the position error and velocity error, respectively. The weights $w_{kj}$ between the hidden layer and output layer are modified by the following steepest descent rule:

$$\Delta w_{kj} = -\eta \frac{\partial E}{\partial w_{kj}}, \qquad (10)$$

where $\Delta w_{kj}$ denotes the incremental change of the weight $w_{kj}$ and $\eta$ represents the learning rate. The total input sum to the kth node of the output layer is calculated by the following equation:

$$net_k = \sum_{j=1}^{30} w_{kj} o_j + b_k. \qquad (11)$$

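Then, equation (10) can be rewritten by the chain rule as follows:

$$\Delta w_{kj} = -\eta \frac{\partial E}{\partial net_k}\frac{\partial net_k}{\partial w_{kj}} = -\eta \frac{\partial E}{\partial net_k} o_j = -\eta \frac{\partial E}{\partial o_k}\frac{\partial o_k}{\partial net_k} o_j = -\eta \frac{\partial E}{\partial o_k} \cdot f'(net_k) \cdot o_j, \qquad (12)$$

where $f'(\cdot)$ is the derivative of the node activation function, which is constant for the output layer nodes. In the above equation, $\partial E/\partial o_k$ represents the rate of change of the energy function with respect to the kth node output of the neural net, i.e. the NN feedforward torque to the kth axis of the manipulator. This can be expressed as

$$\frac{\partial E}{\partial o_k} = -\sum_{i=1}^{3}\left\{\alpha_i(\theta_i^{Ld} - \theta_i^{L})\frac{\partial \theta_i^{L}}{\partial o_k} + \beta_i(\dot{\theta}_i^{Ld} - \dot{\theta}_i^{L})\frac{\partial \dot{\theta}_i^{L}}{\partial o_k}\right\}. \qquad (13)$$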

The values of $\partial\theta_i^{L}/\partial o_k$ and $\partial\dot{\theta}_i^{L}/\partial o_k$ are required to evaluate the above equation. This means that we must have a priori knowledge of the manipulator dynamics. As explained in the previous section, various methods exist for the determination of these terms. All these methods require additional plant modelling by other means. For the present problem, a total of 18 sensitivities of the arm link output to the NN output should be determined in advance. However, as explained previously, by adopting the feedback error learning [16] or the direct error learning method [19], such a procedure can be excluded. It can be regarded that in these methods the plant input-output sensitivity effects are included in the learning parameter $\eta$. Yabuta [19] showed that the learning stability and convergence properties of these learning architectures depend largely on the learning parameters such as the learning rate and initial weights. In this study, these parameter values were selected by trial and error to guarantee the learning convergence. In the above equations, the coupling effect terms $\partial\theta_i^{L}/\partial o_k$ and $\partial\dot{\theta}_i^{L}/\partial o_k$ ($i \neq k$) are neglected in consideration of the simplified decoupling structure of the balanced manipulator. Equation (13) can then be rewritten in the form of the weighted sum of the position and velocity output errors as

$$\frac{\partial E}{\partial o_k} = -\{a_{k0}(\theta_k^{Ld} - \theta_k^{L}) + a_{k1}(\dot{\theta}_k^{Ld} - \dot{\theta}_k^{L})\}, \qquad (14)$$

where $a_{k0}$ and $a_{k1}$ are the coefficients reflecting the combined effects of the parameters $\alpha_k$, $\partial\theta_k^{L}/\partial o_k$ and $\beta_k$, $\partial\dot{\theta}_k^{L}/\partial o_k$, respectively. If the $-\partial E/\partial o_k$ term is represented by $e_k$ given in equation (8), the weights of the output layer can be changed according to the following equation:

$$\Delta w_{kj} = \eta \cdot e_k \cdot f'(net_k) \cdot o_j. \qquad (15)$$

The weights $w_{ji}$ between the input and hidden layers were changed according to the standard generalized delta rule [15]. The biases $b_j$ and $b_k$ were learned in the same manner as were the other weights.
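The network structure and the update rule of equation (15) can be put together in a compact sketch. It assumes 9 inputs (position, velocity and acceleration of three links), 30 hidden nodes split evenly between sigmoid and linear activations, and 3 linear output nodes. The class name, the uniform weight initialization and the default parameter values are illustrative assumptions; the hidden layer update is a standard generalized delta rule written out for this structure.

```python
import numpy as np


class HybridFeedforwardNN:
    """Three layer network with a half-sigmoid, half-linear hidden layer (a sketch)."""

    def __init__(self, n_in=9, n_hidden=30, n_out=3,
                 s=5.0, eta=1e-4, w_max=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.s, self.eta = s, eta
        self.n_nonlinear = n_hidden // 2          # first half uses equation (4)
        # Small random initial weights (uniform distribution is an assumption).
        self.W1 = rng.uniform(-w_max, w_max, (n_hidden, n_in))
        self.b1 = rng.uniform(-w_max, w_max, n_hidden)
        self.W2 = rng.uniform(-w_max, w_max, (n_out, n_hidden))
        self.b2 = rng.uniform(-w_max, w_max, n_out)

    def forward(self, x):
        """Return the feedforward torques for one input vector."""
        k = self.n_nonlinear
        self.x = np.asarray(x, dtype=float)
        self.net1 = self.W1 @ self.x + self.b1
        self.o1 = self.net1.copy()                # linear hidden nodes
        self.o1[:k] = 0.5 * self.s * np.tanh(2.0 * self.net1[:k] / self.s)
        return self.W2 @ self.o1 + self.b2        # linear output nodes

    def update(self, e):
        """Apply one learning step given the weighted error vector of equation (8)."""
        k = self.n_nonlinear
        delta2 = np.asarray(e, dtype=float)       # f'(net_k) = 1 for linear outputs
        fprime = np.ones_like(self.net1)
        fprime[:k] = 1.0 - np.tanh(2.0 * self.net1[:k] / self.s) ** 2
        # Back-propagate through the output weights before they are changed.
        delta1 = fprime * (self.W2.T @ delta2)
        self.W2 += self.eta * np.outer(delta2, self.o1)   # equation (15)
        self.b2 += self.eta * delta2
        self.W1 += self.eta * np.outer(delta1, self.x)
        self.b1 += self.eta * delta1
```

A positive error $e_k$ (the link lagging behind its desired motion) raises the kth output torque on the next pass for the same input, which is the intended direction of correction when the plant sensitivity is positive.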

4. SIMULATION RESULTS AND DISCUSSIONS

4.1. Simulation procedure

In the simulation, the dynamic equations (2) and (3) of the balanced flexible joint manipulator derived in section 2 were used to obtain the instantaneous motion data of the three joint motors and arm links at each sampling time. The sampling times for the control and learning were both chosen to be 5 msec. At every learning instance, the NN controller receives the planned desired trajectory data from the trajectory planner, which consist of the desired positions, velocities, accelerations and future accelerations of five control sample times ahead for the three arm links, and computes the feedforward torque $\tau_n(t)$ using the weights updated during the previous learning sampling interval. This NN feedforward torque is added to the PD controller torque $\tau_c(t)$ to give the motor driving torque $\tau^{M}(t)$ by which the balanced flexible joint manipulator is driven. It is conceivable that the NN controller using only the angular position, velocity and acceleration trajectory data cannot accurately perform the complete inverse dynamics feedforward control. This is because the dynamic equations (2) and (3) reveal that the NN requires the desired trajectory up to the 4th derivative in order to map the complete inverse dynamics of the flexible joint manipulator. However, these trajectory variables are not easily measurable. One way to overcome this is to adopt the idea of the predictive control law, which utilizes the future trajectory input data so as to improve the tracking ability. Though it was not reasoned analytically, it is anticipated, by analogy with the feedforward part of conventional predictive control, that the added future acceleration command input data would improve the performance of the NN feedforward controller. The angular positions and velocities of the motors are assumed to be measured and fed back to the PD controller, while those of the arm links are used to update the weighted error sum $e_k$.
Then, the error sum updating equation (8) determines $e_k$ using the desired and measured position and velocity data of the arm links. The weighted error sum is backpropagated through equation (15) in order to train the NN controller and updates the NN weights in the direction that minimizes the energy function E given in equation (9). The above procedure continues along the entire planned trajectory to complete the first trial of learning. The learning trial is repeated until the RMS combined output errors for the three arm links reach stationary values.
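The assembly of the NN input at each control sample, including the acceleration five samples ahead, can be sketched as below. The array layout (one row per control sample, one column per arm link), the function name, and the clamping of the look-ahead index at the end of the trajectory are assumptions made for illustration.

```python
import numpy as np


def nn_input(theta_d, dtheta_d, ddtheta_d, k, lookahead=5):
    """Stack desired position, velocity, acceleration and future acceleration.

    Each trajectory array has shape (T, 3): T control samples, three arm links.
    Returns a 12-element vector for control sample k.
    """
    # Near the end of the trajectory the look-ahead index is held at the last
    # sample (an assumption; the text does not say how the end is handled).
    k_future = min(k + lookahead, len(ddtheta_d) - 1)
    return np.concatenate([
        theta_d[k], dtheta_d[k], ddtheta_d[k], ddtheta_d[k_future],
    ])


# Tiny synthetic trajectory with 10 samples, used only to show the indexing.
T = 10
theta_d = np.arange(3 * T, dtype=float).reshape(T, 3)
dtheta_d = 10.0 * theta_d
ddtheta_d = 100.0 * theta_d
x2 = nn_input(theta_d, dtheta_d, ddtheta_d, k=2)
```

Because the trajectory is planned in advance, the look-ahead sample is simply read from the stored plan rather than predicted.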

The weighting constant matrices of the output error sum vector e were chosen as

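$$[a_0] = \begin{bmatrix} 10^3 & 0 & 0 \\ 0 & 10^3 & 0 \\ 0 & 0 & 10^3 \end{bmatrix}, \quad [a_1] = \begin{bmatrix} 10^2 & 0 & 0 \\ 0 & 10^2 & 0 \\ 0 & 0 & 10^2 \end{bmatrix}.$$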

With these weightings, the learning rate $\eta = 0.0001$ was used. Random numbers with a maximum magnitude of 0.01 were used as the initial weights. The shape parameter of the sigmoidal function was chosen as s = 5.0, which is the value giving high nonlinearity as shown in Fig. 8.


All parameter values, except for the joint stiffnesses and balancing masses, were chosen to be the same as those obtained from the PUMA 760 model in [5]. The balancing mass values of the manipulator, $M_{b2}$ for the 2nd joint and $M_{b3}$ for the 3rd joint, were taken from [3] and are 109.2 kg and 48.4 kg, respectively. The exact values of the joint stiffness were not available, but they were chosen to be a little higher than the ones reported for the model of a PUMA 560 [22]. They were $K_1^{J} = 5 \times 10^4$ for the 1st joint, $K_2^{J} = 8 \times 10^4$ for the 2nd joint, and $K_3^{J} = 2 \times 10^4$ N m rad$^{-1}$ for the 3rd joint, respectively.

Two circular trajectories were adopted for the simulation, which differ in the ratio in which each joint participates in forming the planned trajectory. These are displayed in Fig. 7. The trajectory circles have a common center point (50 cm, 11.4 cm, 0) in the world coordinates and both have a diameter of 60 cm. The trajectories were chosen to take a trapezoidal velocity profile. The maximum tracking velocity was kept constant at $V_{max} = 1.2$ m sec$^{-1}$, which is comparatively high from the viewpoint of the manipulator speed capability. This is to investigate the possibility of using the NN controller in the vibratory environment. Numerical integration was performed using the 4th order Runge-Kutta algorithm and the integration time step was chosen as 1.25 msec.
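With a 1.25 msec integration step and a 5 msec control sample, four Runge-Kutta steps are taken per control update. The sketch below shows that stepping pattern for a generic state derivative function; holding the control torque constant over the four sub-steps is an assumption, and the harmonic oscillator used at the end only exercises the integrator and is not the manipulator model.

```python
import numpy as np


def rk4_step(f, t, y, h):
    """One classical 4th order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(t + 0.5 * h, y + 0.5 * h * k2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def advance_one_control_sample(f, t, y, control_dt=5e-3, h=1.25e-3):
    """Integrate over one control sample using several fixed RK4 sub-steps."""
    steps = int(round(control_dt / h))   # 4 sub-steps for the values in the text
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return t, y


# Check on a unit harmonic oscillator: y = [position, velocity].
oscillator = lambda t, y: np.array([y[1], -y[0]])
t_end, y_end = advance_one_control_sample(oscillator, 0.0, np.array([1.0, 0.0]))
```

In a full simulation, `f` would evaluate equations (2) and (3) for the stacked link and motor states with the motor torque fixed at its value for the current control sample.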

4.2. Results and discussions

The inputs for the NN controller consist of desired positions, velocities, accelerations and future accelerations of five control sample times ahead for the three arm links. The inclusion of the desired accelerations of the future time in the NN inputs was found to result in good performance. Since the motion trajectories for most manipulator works are preplanned, such an inclusion of future trajectory data is allowable in a practical sense.

It appears that the value of the learning rate has critical influence on the learning behavior. A large learning rate gives rapid changes in the NN weights during the early learning stage and, thus, makes the feedforward torques generated from the NN controller change abruptly. The rapidly changing driving torques contain high frequency spectral power and activate structural vibration in the flexible joint

Fig. 7. Trajectories used for simulation (X-Y plane: trajectory 1; X-Z plane: trajectory 2).


5.0 • ' " , f ~ S = 1 0

/ 1 - 0.0 S=l

- 5 . 0 - 1 5 - 5 5 15

x

Fig. 8. Sigmoidal activation function shapes for various shape parameters.

balanced manipulator. Such induced vibration appears to lead to failure in learning convergency.
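The sensitivity to the learning rate can be illustrated with a small sketch. It assumes a linear output layer updated by a plain steepest-descent (delta rule) step; the weights, hidden activations and error gradient below are hypothetical numbers, not values from the simulation. Under these assumptions the jump in output torque produced by one update scales directly with the learning rate.

```python
import numpy as np


def torque_jump(eta, W, h, dE_dtau):
    """Norm of the change in output torque caused by one steepest-descent step."""
    tau_before = W @ h
    W_new = W - eta * np.outer(dE_dtau, h)   # delta rule for a linear output layer
    return float(np.linalg.norm(W_new @ h - tau_before))


rng = np.random.default_rng(0)
W = rng.uniform(-0.01, 0.01, size=(3, 30))   # output weights, 3 joints x 30 hidden nodes
h = rng.uniform(-1.0, 1.0, size=30)          # hypothetical hidden-layer activations
dE_dtau = np.array([1.0, -0.5, 0.2])         # hypothetical error gradient w.r.t. torque

small = torque_jump(0.01, W, h, dE_dtau)
large = torque_jump(0.1, W, h, dE_dtau)      # ten times the learning rate
```

In this simplified setting a tenfold larger learning rate gives a tenfold larger torque step between trials, which is the kind of abrupt change the text associates with excited structural vibration.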

Random numbers were used as the initial weights. The magnitudes of the initial weights also appeared to have a significant effect on the learning behavior. Large initial weights make the untrained NN controller generate arbitrarily large feedforward torques in the early learning stage, exciting the structural vibration. Again, such vibration destabilized the manipulator motion and resulted in learning failure. To avoid this problem, the maximum magnitude of the random numbers used to initialize the NN weights was constrained to 0.01 in this simulation.
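A bounded random initialization of this kind might look as follows. The 0.01 bound is from the text; the uniform distribution, the layer sizes (12 inputs, 30 hidden nodes, 3 outputs) and the function name are assumptions for illustration.

```python
import numpy as np


def init_weights(n_out, n_in, max_magnitude=0.01, rng=None):
    """Random weights whose magnitudes never exceed max_magnitude."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: a uniform distribution; the text only states the bound.
    return rng.uniform(-max_magnitude, max_magnitude, size=(n_out, n_in))


W_hidden = init_weights(30, 12, rng=np.random.default_rng(1))   # input -> hidden
W_out = init_weights(3, 30, rng=np.random.default_rng(2))       # hidden -> joint torques

# If the hidden outputs were bounded by 1 in magnitude, each untrained output
# could not exceed 30 * 0.01 = 0.3 in magnitude.
worst_case_output = np.abs(W_out).sum(axis=1).max()
```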

An NN with two hidden layers of more than 30 nodes was also tested. However, its learning performance was at best no better than that of the network with one hidden layer of 30 nodes. In fact, the learning behavior appeared to be more sluggish and oscillatory, and cases in which the learning never converged were also found. Considering the simplified dynamics of the balanced manipulator, a hidden layer composed of mixed linear and nonlinear nodes was used. This hybrid hidden layer showed better learning speed and performance than purely linear or purely nonlinear hidden layers.
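The hybrid hidden layer can be sketched as a forward pass in which some nodes pass their weighted sum through unchanged and the rest apply a squashing function. The 15/15 split, the use of `tanh` in place of the sigmoid with a shape parameter, the 12-element input and all names are assumptions for this example only.

```python
import numpy as np


def hybrid_forward(x, W1, b1, W2, b2, n_linear):
    """One hidden layer whose first n_linear nodes are linear and the rest nonlinear."""
    z = W1 @ x + b1
    # tanh stands in for the sigmoidal activation; this is an assumption.
    h = np.concatenate([z[:n_linear], np.tanh(z[n_linear:])])
    return W2 @ h + b2                      # one feedforward torque per joint


rng = np.random.default_rng(3)
n_in, n_hidden, n_out = 12, 30, 3
W1 = rng.uniform(-0.01, 0.01, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.uniform(-0.01, 0.01, size=(n_out, n_hidden))
b2 = np.zeros(n_out)

x = rng.uniform(-1.0, 1.0, size=n_in)       # hypothetical input vector
tau_ff = hybrid_forward(x, W1, b1, W2, b2, n_linear=15)
```

Setting `n_linear` to the full layer width reduces the network to a purely linear map, and setting it to zero gives a purely nonlinear hidden layer, so the same function covers the three structures compared in the text.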

Figure 9 compares the tracking performance of the NN feedforward controller with that of the motor servo PD controller alone. The tracking was performed in trajectory 1. The figure shows the tracking motion profiles projected onto the plane of the planned trajectory. The tracking profile obtained with the motor servo PD controller alone reveals its performance limit in high speed tracking control of the balanced flexible joint manipulator. The servo gains were tuned heuristically through computer simulations. Since the balanced flexible joint manipulator becomes susceptible to vibratory motion at high speed, the high feedback gains often required to guarantee better tracking could not be used. Compared with this result, a significant improvement in tracking performance is observed with the NN controller.

Fig. 9. Tracking profiles simulated with a neural network feedforward controller during various learning trials: trajectory 1 (curves: PD controller, 10th trial, 80th trial; horizontal axis: X-direction (x), cm).

As learning proceeds, the NN controller feedforward torques change in the direction that minimizes the weighted square sum of the motion errors. The NN controller gradually takes over the control from the PD controller, which carries the control role during the early learning stage. Since the initial weights were chosen as very small values below 0.01, the tracking profile after the first learning attempt remained almost unchanged from the one with the PD controller alone. Although the learning started with small initial weights, the learning procedure converged rapidly. After 80 learning trials, the tracking performance did not show further improvement.

The position tracking errors, expressed as the Euclidean norm of the Cartesian error vector, are shown in Fig. 10. The tracking error is greatly reduced with the NN controller, especially in the region of rapidly changing dynamics during the period from approximately 1 to 2 sec. After 20 learning trials, the error history was found to be nearly the same as the one after 80 learning attempts, except in the region of abruptly changing dynamics.

Figures 11 and 12 show the torques generated by the NN and PD controllers, respectively, as learning proceeds. In Fig. 11, it is seen that the 1st joint is trained more rapidly than the 2nd and 3rd joints.
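The two quantities used in this discussion, the Cartesian tracking error norm and the sum of the NN and PD torques, can be sketched as below. The PD law is written on the link angles for simplicity, whereas the simulated controller is a motor servo PD loop; the gains, the sample values and the function names are hypothetical.

```python
import numpy as np


def cartesian_error_norm(p_desired, p_actual):
    """Euclidean norm of the Cartesian position error, one value per time sample."""
    diff = np.asarray(p_desired, dtype=float) - np.asarray(p_actual, dtype=float)
    return np.linalg.norm(diff, axis=-1)


def total_torque(tau_nn, q_d, q, qd_d, qd, kp, kd):
    """Parallel structure: NN feedforward torque plus a simplified PD feedback torque."""
    tau_pd = kp * (q_d - q) + kd * (qd_d - qd)
    return tau_nn + tau_pd, tau_pd


# Hypothetical sample: end-effector positions in cm
err = cartesian_error_norm([[53.0, 15.4, 0.0]], [[50.0, 11.4, 0.0]])

# Hypothetical gains and joint states; with an untrained NN (near-zero output)
# the PD term carries essentially all of the control effort.
kp = np.array([2000.0, 2000.0, 1000.0])
kd = np.array([50.0, 50.0, 20.0])
tau, tau_pd = total_torque(np.zeros(3),
                           np.array([0.10, 0.20, 0.30]), np.array([0.09, 0.20, 0.31]),
                           np.zeros(3), np.zeros(3), kp, kd)
```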
Figure 12 shows the torques generated by the PD controller over the learning trials. Unlike the case of the feedback error learning method [16], the PD output torques do not decay as learning proceeds. This is because the NN training is performed in the direction that minimizes the weighted sum of the squared angular position and velocity errors of the arm links. If the feedback error learning method could be used, the PD output torque would decay to zero for a well-trained neural net, since the neural net controller would then generate the exact inverse dynamics feedforward torque.
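One possible form of such an error measure is sketched below. The weights `w_pos` and `w_vel`, the factor 0.5 and the function name are assumptions; the text states only that a weighted sum of squared angular position and velocity errors is minimized.

```python
import numpy as np


def learning_cost(e_pos, e_vel, w_pos=1.0, w_vel=0.1):
    """Weighted sum of squared link position and velocity errors (weights are assumed)."""
    e_pos = np.asarray(e_pos, dtype=float)
    e_vel = np.asarray(e_vel, dtype=float)
    return 0.5 * float(np.sum(w_pos * e_pos**2 + w_vel * e_vel**2))


# Hypothetical link errors (rad, rad/s) for the three joints at one sample
cost = learning_cost([0.1, 0.0, 0.0], [0.0, 1.0, 0.0])
```

In contrast, feedback error learning would use the PD output torque itself as the training signal, which is why that torque would vanish once the network reproduced the inverse dynamics.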

Fig. 10. Position tracking errors simulated with a neural network feedforward controller during various learning trials: trajectory 1 (curves: PD controller, 10th trial, 80th trial; horizontal axis: Time (t), sec).


Fig. 11. Neural network controller output torques during various learning trials (panels: joint 1, joint 2, joint 3; curves include the 1st and 10th trials; horizontal axis: Time (t), sec).

Fig. 12. PD controller output torques during various learning trials (panels: joint 1, joint 2, joint 3; curves: 1st trial, 10th trial, 80th trial; horizontal axis: Time (t), sec).

Figure 13 shows the reduction trend in the RMS angular position errors of the arm links in trajectory 1. The 1st joint, which has a much larger initial RMS error than the other joints, shows a significantly higher rate and larger magnitude of error reduction. For the 3rd joint, however, almost no reduction occurs: its RMS angular position error remains at about 0.01 rad regardless of the learning iteration number. The somewhat large tracking errors that still remain with the proposed NN feedforward controller seem to be attributable to the fact that the controller employed here is not designed to generate an exact inverse dynamics torque. Rather, it generates torque to minimize the weighted sum of the squared error variables. Also, NN training by the steepest descent rule does not guarantee reaching the global minimum of the error performance function [15]. In view of the similar RMS error values remaining for the three joints after training, it seems that the initial RMS error of the 3rd joint may be the limit of the tracking error that can be achieved by the proposed NN controller.
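The per-joint RMS error plotted against the learning iteration number can be computed as in the following sketch. The function name and the constant-offset test data are hypothetical; only the 0.01 rad level for the 3rd joint echoes a value mentioned in the text.

```python
import numpy as np


def rms_joint_error(q_desired, q_actual):
    """RMS angular position error per joint over one trial; inputs have shape (n_samples, 3)."""
    err = np.asarray(q_desired, dtype=float) - np.asarray(q_actual, dtype=float)
    return np.sqrt(np.mean(err**2, axis=0))


# Hypothetical trial: constant offsets of 0.04, 0.02 and 0.01 rad on joints 1 to 3
q_desired = np.zeros((100, 3))
q_actual = q_desired - np.array([0.04, 0.02, 0.01])
rms = rms_joint_error(q_desired, q_actual)
```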

Figures 14 and 15 show the tracking profiles and position tracking errors, respectively, in trajectory 2. Note that the 1st joint motion is not involved in this trajectory. The performance is similar to that observed in trajectory 1.


Fig. 13. RMS joint angle error reductions along the learning iteration: trajectory 1 (curves: joint 1, joint 2, joint 3; horizontal axis: Learning iteration number).

Fig. 14. Tracking profiles simulated with the NN controller during various learning trials: trajectory 2 (curves: PD controller, 10th trial, 80th trial; horizontal axis: X-direction (x), cm).

Fig. 15. Position tracking errors simulated with the NN controller during various learning trials: trajectory 2 (curves include the PD controller and the 10th trial; horizontal axis: Time (t), sec).


5. CONCLUSIONS

A neural network-based feedforward controller has been proposed and evaluated for the balanced manipulator with joint flexibility, which has a somewhat highly vibratory tendency in the region of high speed motion. From the observations obtained through the simulation study, we have reached the following conclusions.

(1) The NN feedforward controller implemented in parallel with the motor servoing PD controller shows apparently improved tracking performance over the PD controller alone, especially in the trajectory region of the rapidly changing dynamics.

(2) Due to the highly vibratory tendency of the controlled system, the learning parameters such as the learning rate and initial weights significantly influence the learning performance. A high learning rate and large initial weights make the NN controller generate rapidly changing untrained feedforward torques in the early learning stage. This excites the structural vibration of the manipulator and leads to learning instability or failure.

(3) The motor servo PD controller implemented in parallel with the NN controller performs a very important role in suppressing the vibratory tendency and guarantees the learning convergency of the NN controller.

(4) In consideration of the simplified dynamics of the balanced manipulator, the hidden layer composed of mixed linear and nonlinear nodes is used. This NN structure shows a better learning performance over the pure linear or nonlinear NN structures.

REFERENCES

1. Lee C. S. G. and Chung M. J., An adaptive control strategy for mechanical manipulator. IEEE Trans. Automatic Control AC-29, 837-840 (1984).

2. Toumi K. Y. and Asada H., The design of open-loop manipulator arms with decoupled and configuration-invariant inertia tensors. ASME J. Dyn. Sys. Meas. Cont. 109, 268-275 (1987).

3. Chung W. K. and Cho H. S., On the dynamics and control of robotic manipulators with an automatic balancing mechanism. Proc. Instit. Mech. Eng. 201, No. B1, 25-34 (1987).

4. Chung W. K. and Cho H. S., On the dynamic characteristics of a balanced PUMA-760 Robot. IEEE Trans. Industrial Electronics IE-35, 222-230 (1988).

5. Moon J. I., Chung W. K., Cho H. S. and Gweon D. G., A dynamic parameter identification method for the PUMA-760 robot. 16th ISIR, Brussels, 55-65, September (1986).

6. Sweet L. M. and Good M. C., Redefinition of the robot motion control problem. IEEE Control Systems Magazine 18-25, August (1985).

7. Chiou B. C. and Shahinpoor M., The effects of joint and link flexibilities on the dynamic stability of force controlled manipulators. IEEE Conf. Robotics Automation 398-403 (1989).

8. Lee S. B. and Cho H. S., Dynamic characteristics of balanced robotic manipulators with joint flexibility. Robotica (1991).

9. Forrest-Barlach M. G. and Babcock S. M., Inverse dynamics position control of a compliant manipulator. IEEE J. Robotics Automation RA-3, 75-83 (1987).

10. Spong M. W., Modeling and control of elastic joint manipulators. ASME J. Dyn. Sys. Meas. Cont. 109, 310-319 (1987).

11. De Luca A., Isidori A. and Nicolo F., Control of robot arm with elastic joints via nonlinear dynamic feedback. Proc. 24th IEEE Conference on Decision and Control, Ft Lauderdale, Florida (1985).

12. Spong M. W., Khorasani K. and Kokotovic P. V., An integral manifold approach to the feedback control of flexible joint robots. IEEE J. Robotics Automation RA-3, 291-300 (1987).

13. Cesareo G. and Marino R., On the controllability properties of elastic robots. Proc. 6th International Conference on Analysis and Optimization Systems, INRIA, Nice (1984).

14. Psaltis D., Sideris A. and Yamamura A. A., A multilayered neural network controller. IEEE Control Systems Magazine 17-21, April (1988).

15. Chen V. C. and Pao Y. H., Learning control with neural networks. Proc. IEEE Conference on Robotics and Automation, Phoenix, AZ, 1448-1453 (1989).

16. Kawato M., Uno Y., Isobe M. and Suzuki R., Hierarchical neural network for voluntary movement with application to robotics. IEEE Control Systems Magazine 8-15, April (1988).

17. Pao Yoh-Han, Adaptive Pattern Recognition and Neural Networks. Addison Wesley, Reading, MA (1988).

18. Guez A. and Selinsky J. W., A neurocontroller with guaranteed performance for rigid robots. Proc. International Joint Conference on Neural Networks 2, 347-350 (1989).

19. Yabuta T. and Yamada T., Possibility of neural networks controller for robot manipulators. Proc. IEEE Conf. on Robotics and Automation 1686-1693 (1990).

20. Zeman V., Patel R. V. and Khorasani K., A neural network based control strategy for flexible joint manipulators. Proc. IEEE 28th Conf. on Decision and Control 1759-1764 (1989).

21. Miller W. T., Hewes R. P., Glanz E. and Kraft L. G., Real time dynamic control of an industrial manipulator using a neural-network based learning controller. IEEE Trans. Robotics Automation 6, 1-9 (1990).

22. Whitney D. E., Lozinski C. A. and Rourke J. M., Industrial robot forward calibration method and results. ASME J. Dyn. Sys. Meas. Cont. 108, 1-8 (1986).

23. Tomei P., Nicosia S. and Ficola A., An approach to the adaptive control of elastic at joints robots. Proc. IEEE Conf. Robotics Automation 552-558 (1986).

24. Potkonjak V., Contribution to the dynamics and control of robots having elastic transmission. Robotica 6, 63-69 (1988).
