
Neural Process Lett, DOI 10.1007/s11063-013-9332-7

Exploiting Chaos in Learning System Identification forNonlinear State Space Models

Mehmet Ölmez · Cüneyt Güzeliş

© Springer Science+Business Media New York 2013

Abstract The paper presents two learning methods for nonlinear system identification. Both methods employ neural network models to represent the state and output functions. The first method of learning a nonlinear state space is based on using chaotic or noise signals in the training of the state neural network, so that the state neural network is designed to produce a state sequence recursively under the excitation of the system input. The second method of learning a nonlinear state space has an observer neural network devoted to estimating the states as a function of the system inputs and the outputs of the output neural network. This observer neural network is trained to produce a state sequence when the output neural network is forced by the same sequence, and then the state neural network is trained to produce the estimated states recursively under the excitation of the system input. The developed identification methods are tested on a set of benchmark plants including a non-autonomous chaotic system, namely the Duffing oscillator. Both proposed methods are observed to be much superior to well-known identification methods including nonlinear ARX, nonlinear ARMAX, Hammerstein, Wiener, Hammerstein–Wiener, Elman network, and state space models with subspace and prediction error methods.

Keywords Neural networks · State space · System identification · Learning · Chaos

1 Introduction

In the analysis and design of control systems, it is usually necessary to have a mathematical model of the considered plant. Such a model may need to be constructed from a set

M. Ölmez (B) Technical Programs Department, İzmir Vocational School, Dokuz Eylül University, Buca, Izmir, Turkey. e-mail: [email protected]

C. Güzeliş Electrical and Electronics Engineering Department, İzmir University of Economics, Balçova, Izmir, Turkey. e-mail: [email protected]


of input-output measurements by so-called system identification methods [1]. Herein, the models must fit the available input-output measurements well enough and, at the same time, must perform well for inputs not considered in the model design phase. To tackle this issue, especially for systems possessing nonlinear dynamics, a number of artificial neural network (ANN) based system identification methods have been developed in the literature [2–9].

In the black-box approach, different ANN models such as the multi-layer perceptron (MLP) and the radial basis function network (RBFN), or machine learning models such as support vector regression (SVR) and least squares SVR (LS-SVR), are employed as nonlinear algebraic building blocks in nonlinear ARX, nonlinear ARMAX, Hammerstein, Wiener or Hammerstein–Wiener models [3,6–8,10–12]. In the state space approach, ANNs are used to represent state functions only, or state and output functions together [13–18].

In state space based identification methods, which have the potential of learning the internal structure of nonlinear dynamical systems, defining target values for the states when approximating the original state function is problematic, since the states are not always available in the measurements. On the other hand, learning the state transitions defined by the state function in an indirect way may yield poor generalization when a limited number of measurement samples are used in the training of the neural network devoted to the state function. This paper introduces two system identification methods to overcome these problems [19].

The developed system identification methods are both based on state space models and employ ANNs. The first method employs a state ANN for learning the state function and an output ANN for learning the output function. Here, the state ANN is first trained to produce a chaotic or noise sequence recursively under the excitation of the input signals, and then the output ANN is trained on the input-output measurement samples together with the state provided at the output of the already trained state ANN under the system input. Hence, the first method is called learning state space with forced chaotic state recursion (LSS–CSR). The second method of learning a nonlinear state space has an observer ANN in addition to the state and output ANNs. The observer ANN is devoted to estimating the states as a function of the outputs of the output ANN and the system inputs. After training the output ANN to learn the mapping from the system input and the state, defined by a chosen chaotic or noise sequence, into the system output, the observer ANN is trained to produce the same state sequence as a function of the outputs of the output ANN forced by that sequence and the system inputs. The state ANN in this second method is trained in the last phase of learning by providing the outputs of the observer ANN and the system inputs as the input samples, and accepting the samples of the sequence used in the training of the observer and output ANNs as the target values at the outputs of the state ANN. Hence, the second method is called learning state space with chaotic state observer (LSS–CSO).

The two developed identification methods are tested on a set of benchmark plants including first, second (a non-autonomous chaotic Duffing oscillator) and third order nonlinear dynamical plants. Between the two methods, the best identification performances are obtained with the LSS–CSR method. Extensive simulations on the same benchmark plants show that the introduced LSS–CSR method provides much superior identification performance, especially in generalization, compared to well-known identification methods such as nonlinear ARX, nonlinear ARMAX, Hammerstein, Wiener, Hammerstein–Wiener, Elman network, and state space models with subspace and prediction error methods. Herein, normalized mean square error and signal to error ratio are used as the performance measures.

When the system to be identified is chaotic, the LSS–CSR and LSS–CSO methods use chaotic sequences as state sequences in the training phase; if the system to be identified is non-chaotic,


Fig. 1 Discrete-time state space model for the LSS–CSR method: the state ANN maps (xk, uk) to xk+1, a delay element feeds xk back, and the output ANN maps (xk, uk) to yk

a noise sequence is used instead of a chaotic sequence in the training phase. It may be argued that the chaotic signals used, especially in the training of the state ANN, provide an improvement in learning the input-output measurements, and further that this improvement is due to the broad spectrum of chaotic signals possessing a rich variety of state samples, thus providing a considerable state contribution on the system's input space for configuring the system's output space. The relatively high performance of the LSS–CSR identification method may also be explained by the fact that, as described in Sect. 2, the state ANN of the LSS–CSR method, implementing a kind of conditional and delayed auto-association from the current state to the next state under the knowledge of the current input, learns better with a chaotic signal carrying a rich variety of states. The same argument is also valid for the cascade of the output ANN and observer ANN of the LSS–CSO method, which is taught to learn a conditional auto-association from the current state into itself under the knowledge of the current input, with a better performance when a broad spectrum signal such as a chaotic signal is used.

This paper is organized as follows. Section 2 presents the LSS–CSR and LSS–CSO methods. Section 3 presents, in a comparative way, the simulation results obtained by implementing the developed methods and the methods available in the literature. Section 4 concludes the paper.

2 Learning Nonlinear State Space with Forced Chaotic State Recursion and with Chaotic State Observer

Both of the methods developed for learning the intrinsic dynamics of the system to be identified assume the following discrete-time state space model, where xk ∈ Rn, uk ∈ Rr and yk ∈ Rm:

xk+1 = f(xk, uk)
yk = h(xk, uk)    (1)

The LSS–CSR method of learning state space with a forced chaotic state recursion uses two ANNs. One of the ANNs is for learning the state function f(xk, uk) defining the state recursion xk+1 = f(xk, uk) under the excitation of the system input uk. The other ANN is for learning the output function yk = h(xk, uk). The block diagram of the assumed state space model is shown in Fig. 1.
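In code, the model (1) amounts to a simple recursion. A minimal sketch, with `f` and `h` as placeholders for the maps the two ANNs will realize:

```python
import numpy as np

def rollout(f, h, x0, inputs):
    """Iterate the model (1): emit y_k = h(x_k, u_k), then step x_{k+1} = f(x_k, u_k)."""
    x, ys = x0, []
    for u in inputs:
        ys.append(h(x, u))
        x = f(x, u)
    return np.array(ys)
```

For instance, with a scalar linear plant `f = lambda x, u: 0.5 * x + u` and `h = lambda x, u: x`, `rollout` reproduces the usual state-space simulation loop.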


Fig. 2 ANN structures of the proposed LSS–CSR system identification method: state and output MLPs, each with input, hidden and output layers, with the state fed back through a delay element

The state and output ANNs in the LSS–CSR method are chosen as multi-layer perceptrons (MLPs), as depicted in Fig. 2.

In the LSS–CSR method of learning state space with a forced chaotic state recursion, the state ANN and the output ANN are trained with a considered finite set {(uk, yk)}, k = 0, . . . , K, of input-output measurement samples in the following way.

• Firstly, the state ANN is trained in a supervised way to learn to predict the next state vector xk+1 from the current state xk and input uk. Here, the state sequence {xk}, k = 0, . . . , K, with a finite natural number K, is either a chaotic signal or a noise signal.

• Secondly, the output ANN is trained in a supervised way to learn to predict the output yk from the current state xk and input uk. Here, the state xk is the delayed output of the state ANN already trained in the first step as the response to the input uk.

It should be noted that the state ANN trained in the above way tries to learn a kind of conditional delayed auto-association defined by the state recursion xk+1 = f(xk, uk). That is, the state ANN learns to predict the next state as a function of the current state and input. As will be reported in Sect. 3, the training and generalization performance of the overall identification procedure is comparatively much higher when a chaotic sequence is attempted to be learned by the state ANN as a conditional state association. A reason for this relatively high performance may be the richness of chaotic signals in carrying very different state transitions under the considered inputs.
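The two-phase LSS–CSR procedure above can be sketched as follows. The `TinyNet` class is an illustrative stand-in (random tanh features with a least-squares linear readout) for the paper's backpropagation-trained MLPs, and the function names are ours, not the authors'; the sketch only mirrors the structure of the two training phases.

```python
import numpy as np

class TinyNet:
    """Stand-in for the paper's small MLPs: fixed random tanh hidden layer
    with a least-squares linear readout (not the authors' exact training)."""
    def __init__(self, n_in, hidden=20, seed=0):
        rng = np.random.default_rng(seed)
        self.Win = rng.normal(size=(n_in, hidden))
        self.b = rng.normal(size=hidden)
        self.W = None
    def fit(self, X, Y):
        H = np.tanh(X @ self.Win + self.b)
        self.W, *_ = np.linalg.lstsq(H, Y, rcond=None)
    def predict(self, X):
        return np.tanh(X @ self.Win + self.b) @ self.W

def train_lss_csr(u, y, x_target, n_states=2):
    """LSS-CSR: phase 1 teaches the state net the chaotic/noise recursion;
    phase 2 runs it recursively under u and fits the output net on its states."""
    K = len(u)
    state_net = TinyNet(n_states + 1, seed=1)
    out_net = TinyNet(n_states + 1, seed=2)
    # Phase 1: learn x_{k+1} = f(x_k, u_k) with the chosen sequence as targets.
    state_net.fit(np.column_stack([x_target[:-1], u[:-1]]), x_target[1:])
    # Phase 2: generate the states recursively under the system input ...
    xs = np.zeros((K, n_states))
    xs[0] = x_target[0]
    for k in range(K - 1):
        xs[k + 1] = state_net.predict(np.r_[xs[k], u[k]][None, :])[0]
    # ... and fit the output net y_k = h(x_k, u_k) on them.
    out_net.fit(np.column_stack([xs, u]), y[:, None])
    return state_net, out_net, xs
```

Note that only the first phase sees the chaotic/noise state targets; the output net is fit against the states the trained state net actually produces, exactly as in the bulleted procedure.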

The second method, LSS–CSO, of learning a nonlinear state space employs an observer ANN in addition to the state and output ANNs, as depicted in Fig. 3.

The state, observer, and output ANNs in the LSS–CSO method are chosen as multi-layer perceptrons (MLPs), as depicted in Fig. 4.

In the LSS–CSO method of learning state space with a chaotic state observer, the overall (closed loop) state space model defined by the state, output and observer ANNs is designed by a supervised learning scheme to learn a considered finite set {(uk, yk)}, k = 0, . . . , K, of input-output measurement samples. In both the training and test phases of the LSS–CSO method, the considered identification model given in Fig. 3 indeed implements the following state model, since the state estimated by the observer is just an estimate, not the state itself.

xk+1 = f(x̂k, uk)
yk = h(xk, uk)
x̂k = h+(yk, uk)    (2)


Fig. 3 Discrete-time state space model for the LSS–CSO method: the state ANN realizes f(x̂k, uk), the output ANN realizes h(xk, uk), and the observer ANN realizes h+(yk, uk), with a delay element closing the state loop

Fig. 4 ANN structures of the proposed LSS–CSO system identification method: state, output and observer MLPs, each with input, hidden and output layers

The observer ANN is used for estimating the states as a function of the outputs of the output ANN and the system inputs. The output ANN is for learning the mapping from the system input and the state into the system output. The state, output and observer ANNs in the LSS–CSO method are trained as follows.

• Firstly, the output ANN is trained in a supervised way to learn the output function yk = h(xk, uk) providing the output vector yk from the current state xk and input uk. In the training, the considered finite set {(uk, yk)}, k = 0, . . . , K, of input-output measurement samples and a chaotic or noise state sequence {xk}, k = 0, . . . , K, with a finite K are used.

• Secondly, the observer ANN is trained in a supervised way to learn to estimate the state xk from the output yk of the output ANN and the system input uk. Here, the desired output of the observer ANN is the state xk fed to the output ANN when training the output ANN in the first phase of the training.

• Thirdly, the state ANN is trained in a supervised way again. In the training of the state ANN, the input samples are the outputs x̂k of the observer ANN, which are the estimates of the states, together with the system inputs uk, and the targets are the samples of the chaotic or noise state sequence {xk}, k = 0, . . . , K, used in the training of the output and observer ANNs in the two preceding phases of learning.
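The three training phases of LSS–CSO can be sketched in the same spirit. As before, `TinyNet` is an illustrative stand-in (random tanh features with a least-squares readout) for the paper's backpropagation-trained MLPs, and the function names are ours; only the phase structure follows the bulleted procedure.

```python
import numpy as np

class TinyNet:
    """Stand-in for the paper's small MLPs: fixed random tanh hidden layer
    with a least-squares linear readout (not the authors' exact training)."""
    def __init__(self, n_in, hidden=20, seed=0):
        rng = np.random.default_rng(seed)
        self.Win = rng.normal(size=(n_in, hidden))
        self.b = rng.normal(size=hidden)
        self.W = None
    def fit(self, X, Y):
        H = np.tanh(X @ self.Win + self.b)
        self.W, *_ = np.linalg.lstsq(H, Y, rcond=None)
    def predict(self, X):
        return np.tanh(X @ self.Win + self.b) @ self.W

def train_lss_cso(u, y, x_seq):
    """LSS-CSO in three phases: output ANN, then observer ANN, then state ANN."""
    n = x_seq.shape[1]
    out_net = TinyNet(n + 1, seed=1)
    obs_net = TinyNet(2, seed=2)        # inputs: system output and system input
    state_net = TinyNet(n + 1, seed=3)
    # Phase 1: output net learns y_k = h(x_k, u_k) with the chosen state sequence.
    out_net.fit(np.column_stack([x_seq, u]), y[:, None])
    # Phase 2: observer learns x_k = h+(y_k, u_k) from the output net's responses.
    y_hat = out_net.predict(np.column_stack([x_seq, u]))
    obs_net.fit(np.column_stack([y_hat, u]), x_seq)
    # Phase 3: state net learns the recursion from the observer's state estimates.
    x_hat = obs_net.predict(np.column_stack([y_hat, u]))
    state_net.fit(np.column_stack([x_hat[:-1], u[:-1]]), x_seq[1:])
    return out_net, obs_net, state_net
```

The cascade of `obs_net` after `out_net` is exactly the conditional auto-association noted below: it maps the state sequence back onto itself through the output.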

Note that the cascade of the observer ANN and the output ANN functions as a conditional auto-association from the current state into itself under the knowledge of the current input. As will be seen in Sect. 3, the LSS–CSO method has a performance comparable to the system identification methods available in the literature, but its performance is inferior to that of LSS–CSR.

3 A Performance Evaluation of the Developed Identification Methods

In order to assess the performances of both the LSS–CSR and LSS–CSO identification methods, three nonlinear dynamical plants are chosen for the computer experiments. All three plants under consideration are single-input single-output non-autonomous systems widely used in the literature as benchmark plants. For constructing the state, output and observer ANNs in both identification methods, MLP neural networks are used.

3.1 Structures and Learning of ANN Blocks

The considered plants to be identified are all single-input single-output and first, second or third order dynamical systems. The order of the system dynamics is assumed unknown, so the number of state variables is chosen as 2 for the state space models used in the identification, irrespective of the original system order. The state ANN used in both identification methods is chosen to have 3 input neurons, one for the system input and two others for the current values of the 2 state variables, and 2 output neurons, each of which represents the next value of one of the 2 state variables. The output ANN again has 3 input neurons, one for the system input and two others for the current values of the 2 state variables, and it has a single output neuron representing the system output. The observer ANN of the LSS–CSO method has 2 input neurons, one for the system input and the other for the system output, and 2 output neurons representing the current values of the state variables. 3 hidden neurons are observed to be sufficient for all of the state, output and observer ANNs, so the identification results are obtained with 3-hidden-neuron ANNs in all experiments. All ANNs are trained with gradient descent type back propagation with momentum term 0.9 and a constant learning rate of 0.05, with the tansig (i.e. hyperbolic tangent sigmoid) activation function in the input and hidden layers and a linear transfer function in the output layer.
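A minimal sketch of the stated training setup (one tansig hidden layer, linear output layer, gradient descent with momentum 0.9 and learning rate 0.05). The weight initialization scale and the number of epochs are assumptions, since the paper does not report them:

```python
import numpy as np

def train_mlp(X, Y, hidden=3, lr=0.05, momentum=0.9, epochs=3000, seed=0):
    """One-hidden-layer MLP trained by batch gradient descent with momentum,
    using the hyperparameters stated in Sect. 3.1; init scale and epoch
    count are our assumptions. Returns a prediction function."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, Y.shape[1])); b2 = np.zeros(Y.shape[1])
    vel = [np.zeros_like(p) for p in (W1, b1, W2, b2)]
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)              # tansig hidden layer
        E = H @ W2 + b2 - Y                   # linear output layer error
        gW2, gb2 = H.T @ E / len(X), E.mean(0)
        dH = (E @ W2.T) * (1.0 - H ** 2)      # backpropagate through tanh
        gW1, gb1 = X.T @ dH / len(X), dH.mean(0)
        for p, v, g in zip((W1, b1, W2, b2), vel, (gW1, gb1, gW2, gb2)):
            v *= momentum
            v -= lr * g
            p += v                            # momentum update, in place
    return lambda Xn: np.tanh(Xn @ W1 + b1) @ W2 + b2
```

With `hidden=3` this matches the 3-hidden-neuron networks used throughout the experiments; the state, output and observer ANNs differ only in their input/output dimensions.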

For both the LSS–CSR and LSS–CSO identification methods, 150 samples of the input-output measurement data {(uk, yk)}, k = 0, . . . , K, which are obtained by simulating the chaotic plant, are used for training and the remaining 150 samples are reserved for the test. For the non-chaotic plants, 48 samples of the input-output measurement data are used for training and the remaining 22 samples are reserved for the test.


3.2 Chaotic Signal Production for Training

The chaotic signals used for the training described in Sect. 2 are produced by the Lorenz and Henon chaotic systems [20,21]. The second state of the Lorenz system defined in (3), with the parameters σ = 10, r = 56.6, b = 5.02 and initial conditions x1(0) = 1, x2(0) = 0 and x3(0) = 1, is applied as the first chaotic state variable in the experiments.

ẋ1 = σ(x2 − x1)
ẋ2 = r x1 − x2 − 20 x1 x3
ẋ3 = 5 x1 x2 − b x3    (3)

The state of the Henon map defined in (4) with x(−1) = x(0) = 0.5 is used as the secondchaotic state variable needed in the training.

x(k + 1) = −1.4 x^2(k) + 0.3 x(k − 1) + 1    (4)
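The two chaotic state sequences can be generated as follows. The forward Euler integration and the step size for the Lorenz system (3) are assumptions, since the paper does not state how the continuous-time system was sampled:

```python
import numpy as np

def lorenz_state2(n, dt=0.005):
    """Second state x2 of the modified Lorenz system (3) with sigma = 10,
    r = 56.6, b = 5.02; forward Euler with an assumed step size dt."""
    sigma, r, b = 10.0, 56.6, 5.02
    x1, x2, x3 = 1.0, 0.0, 1.0            # initial conditions from the paper
    out = []
    for _ in range(n):
        d1 = sigma * (x2 - x1)
        d2 = r * x1 - x2 - 20.0 * x1 * x3
        d3 = 5.0 * x1 * x2 - b * x3
        x1, x2, x3 = x1 + dt * d1, x2 + dt * d2, x3 + dt * d3
        out.append(x2)
    return np.array(out)

def henon_state(n):
    """State sequence of the Henon map (4) with x(-1) = x(0) = 0.5."""
    xprev, x = 0.5, 0.5
    out = []
    for _ in range(n):
        xprev, x = x, -1.4 * x * x + 0.3 * xprev + 1.0
        out.append(x)
    return np.array(out)
```

These two sequences provide the two state variables of the 2-dimensional identification models during training.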

The three nonlinear dynamical plants on which the performances of the proposed identification methods are compared to each other and to a set of well-known identification methods are described below.

3.3 Plant I

One of the benchmark plants to be identified is the forced Duffing oscillator given by the following non-autonomous second order ordinary differential equation [21].

ẋ1 = x2
ẋ2 = 7.6 cos(t) − 0.04 x2 − x1^3
y = 0.05 x1 + 10    (5)

Here, an affine function of the state is chosen as the output in the identification. The forced Duffing oscillator with the considered parameter values is known to possess chaotic dynamics. In the simulations for identification, the Duffing system is run with the initial conditions x1(0) = 3 and x2(0) = 4.
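A simulation sketch of the forced Duffing plant (5). The semi-implicit Euler scheme and the step size are our choices, made for numerical stability of the weakly damped oscillation, and are not specified in the paper:

```python
import numpy as np

def simulate_duffing(n_steps, dt=0.01):
    """Forced Duffing oscillator (5) with x1(0) = 3, x2(0) = 4; semi-implicit
    Euler (velocity updated first) keeps the cubic oscillation bounded."""
    x1, x2, t = 3.0, 4.0, 0.0
    ys = []
    for _ in range(n_steps):
        ys.append(0.05 * x1 + 10.0)                          # y = 0.05 x1 + 10
        x2 = x2 + dt * (7.6 * np.cos(t) - 0.04 * x2 - x1 ** 3)
        x1 = x1 + dt * x2                                    # uses updated velocity
        t += dt
    return np.array(ys)
```

Sampling the resulting output sequence provides the 300 measurement samples used for Plant I in the experiments.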

3.4 Plant II

The second plant considered in the identification experiments is the following single-input, single-output first order discrete-time plant, which is a modified version of a benchmark plant used in the literature [22].

x(k + 1) = 10 sin(x(k)) + u(k)[0.1 + cos(x(k)u(k))]
y(k) = 0.025 x(k) + 0.5    (6)

In the computer experiments, the input is chosen as the chaotic signal produced by the logistic map u(k + 1) = 4u(k)(1 − u(k)) with u(0) = 0.1, and the initial state is chosen as x(0) = 0.5.
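Plant II together with its logistic-map input can be simulated directly from (6):

```python
import numpy as np

def simulate_plant2(n):
    """Plant II of Eq. (6) driven by the logistic-map input
    u(k+1) = 4u(k)(1-u(k)), with u(0) = 0.1 and x(0) = 0.5 as in the paper."""
    u, x = 0.1, 0.5
    us, ys = [], []
    for _ in range(n):
        us.append(u)
        ys.append(0.025 * x + 0.5)                        # y(k) = 0.025 x(k) + 0.5
        x = 10.0 * np.sin(x) + u * (0.1 + np.cos(x * u))  # x(k+1)
        u = 4.0 * u * (1.0 - u)                           # next logistic input
    return np.array(us), np.array(ys)
```

Since |10 sin(x)| ≤ 10 and the input term is bounded by 1.1, the state stays within ±11.1 and the output within roughly [0.22, 0.78].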


Table 1 Performance comparison of the identification methods for Plant I (SER values in dB, given as worst / mean / best over trials)

Method | NMSE (training) | SER (training) | NMSE (test) | SER (test)
LSS–CSR | 0.0074 | 20.803 / 21.3191 / 22.977 | 0.0117 | 18.093 / 19.315 / 21.706
LSS–CSO | 0.0095 | 19.958 / 20.237 / 21.362 | 0.010 | 19.485 / 19.95 / 20.496
Nonlinear ARX | 0.028 | 15.102 / 15.53 / 15.721 | 0.0688 | 11.403 / 11.62 / 11.832
State space model with prediction error method | 0.197 | 6.663 / 7.03 / 7.904 | 2.541 | −4.589 / −4.05 / −3.453
State space model with subspace method | 0.749 | 0.432 / 1.25 / 1.698 | 1.013 | −0.332 / −0.056 / 0.102
Wiener model | 0.029 | 14.895 / 15.40 / 15.793 | 0.151 | 7.773 / 8.204 / 8.502
Hammerstein model | 0.0548 | 12.093 / 12.61 / 12.856 | 0.167 | 7.551 / 7.75 / 7.896
Hammerstein–Wiener model | 0.0275 | 15.398 / 15.604 / 15.873 | 0.295 | 4.706 / 5.293 / 5.554
Nonlinear ARMAX | 0.039 | 13.794 / 14.092 / 14.443 | 0.069 | 11.394 / 11.56 / 11.707
Elman network | 0.0320 | 14.607 / 14.9446 / 15.254 | 0.0335 | 14.392 / 14.750 / 14.967

3.5 Plant III

The third plant used as an example for evaluating the identification performances is the following single-input, single-output third order discrete-time plant, which is a modified version of a benchmark plant used in the literature [2].

x(k + 1) = [2.6 x(k) x(k − 1) x(k − 2) u(k − 1)(x(k − 2) − 1) + u(k)] / [1 + x^2(k − 1) + x^2(k − 2)]
y(k) = x(k) + 1    (7)

In the computer experiments, the input is chosen as the sinusoidal signal u(k) = sin(k) and the initial states are chosen as x(0) = 1, x(1) = 1.5 and x(2) = 2.
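Plant III can be simulated directly from (7) with the stated input and initial states; the quadratic denominator keeps the state sequence bounded:

```python
import numpy as np

def simulate_plant3(n):
    """Plant III of Eq. (7): sinusoidal input u(k) = sin(k), initial states
    x(0) = 1, x(1) = 1.5, x(2) = 2 from the paper; returns y(0) .. y(n-1)."""
    xs = [1.0, 1.5, 2.0]                      # x(0), x(1), x(2)
    while len(xs) < n:
        k = len(xs) - 1                       # current time index
        num = (2.6 * xs[k] * xs[k - 1] * xs[k - 2]
               * np.sin(k - 1) * (xs[k - 2] - 1.0) + np.sin(k))
        den = 1.0 + xs[k - 1] ** 2 + xs[k - 2] ** 2
        xs.append(num / den)                  # x(k+1)
    return np.array(xs[:n]) + 1.0             # y(k) = x(k) + 1
```

The 70 samples used for this plant in the experiments correspond to `simulate_plant3(70)`.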

3.6 Identification Methods Used in the Comparison

For the Hammerstein, Wiener and Hammerstein–Wiener models, 10 piecewise-linear blocks at the nonlinear static part are used in the experiments. The linear parts of these models are chosen to have the z-domain input-output transfer function yL[z] = (B[z]/F[z]) uL[z] z^(−n), where L denotes the linear part, the numerator B[z] is of second order, F[z] is of third order, and the input delay is n. The nonlinear ARX model employed admits two regressors, namely the outputs y(t−1) and y(t−2), and 10 sigmoid networks as the nonlinear static block. The nonlinear ARMAX model uses four regressors, u(t−1), u(t−2), y(t−1) and y(t−2), and 10 sigmoid networks as the nonlinear static block. The prediction error and subspace methods are applied


Table 2 Performance comparison of the identification methods for Plant II (SER values in dB, given as worst / mean / best over trials)

Method | NMSE (training) | SER (training) | NMSE (test) | SER (test)
LSS–CSR | 0.075 | 10.963 / 11.201 / 11.577 | 0.077 | 10.892 / 11.13 / 11.395
LSS–CSO | 0.072 | 12.673 / 13.053 / 13.559 | 0.086 | 10.486 / 10.617 / 10.995
Nonlinear ARX | Non convergent | Non convergent | Non convergent | Non convergent
Prediction error method | 0.102 | 9.603 / 9.90 / 10.005 | 0.141 | 7.920 / 8.48 / 8.7732
Subspace method | 0.075 | 10.403 / 11.23 / 11.584 | 0.18 | 6.892 / 7.44 / 7.771
Wiener model | 0.054 | 12.283 / 12.64 / 12.805 | 0.097 | 9.850 / 10.11 / 10.426
Hammerstein model | 0.08 | 10.589 / 10.84 / 10.908 | 0.69 | 1.322 / 1.58 / 1.7128
Hammerstein–Wiener model | 0.055 | 11.998 / 12.59 / 12.874 | 0.53 | 1.992 / 2.75 / 2.995
Nonlinear ARMAX | Non convergent | Non convergent | Non convergent | Non convergent
Elman network | 0.084 | 10.362 / 10.76 / 11.067 | 0.1004 | 9.752 / 9.98 / 10.3296


Table 3 Performance comparison of the identification methods for Plant III (SER values in dB, given as worst / mean / best over trials)

Method | NMSE (training) | SER (training) | NMSE (test) | SER (test)
LSS–CSR | 0.135 | 8.0128 / 8.674 / 8.9932 | 0.157 | 7.779 / 8.017 / 8.209
LSS–CSO | 0.149 | 9.435 / 9.883 / 10.043 | 0.220 | 8.001 / 8.208 / 8.462
Nonlinear ARX | 0.22 | 6.382 / 6.55 / 6.770 | 0.37 | 3.336 / 4.30 / 4.508
Prediction error method | 0.49 | 2.589 / 3.07 / 3.453 | 1.51 | −1.903 / −1.81 / −0.040
Subspace method | 0.63 | 1.403 / 1.94 / 2.004 | 1.181 | −1.023 / −0.70 / −0.068
Wiener model | 0.078 | 10.873 / 11.03 / 11.110 | 0.204 | 6.201 / 6.89 / 6.970
Hammerstein model | 0.112 | 8.994 / 9.49 / 9.743 | 0.951 | 0.003 / 0.217 / 0.702
Hammerstein–Wiener model | 0.016 | 17.302 / 17.89 / 18.003 | 0.286 | 5.274 / 5.48 / 5.596
Nonlinear ARMAX | 0.071 | 10.890 / 11.44 / 11.774 | 0.79 | 0.403 / 1.012 / 1.205
Elman network | 0.105 | 7.3596 / 9.79 / 12.385 | 0.152 | 5.9402 / 8.18 / 10.362

Fig. 5 Waveforms of the desired and the obtained outputs by the LSS–CSR, LSS–CSO, Hammerstein and Elman network models for Plant I. Blue represents the original system output; the LSS–CSR, LSS–CSO, Hammerstein and Elman network outputs are indicated. (Color figure online)


Fig. 6 Waveforms of the desired and the obtained outputs by the LSS–CSR, LSS–CSO and Hammerstein models for Plant II. Blue represents the original system output; the LSS–CSR, LSS–CSO and Hammerstein outputs are indicated. (Color figure online)

to a 2-dimensional linear time-invariant (A, B, C, D) state model. The Elman network [13,16] used has one hidden layer of three neurons.

3.7 Training and Test Performances of the Proposed Methods

For the three plants in (5)–(7), the developed identification methods LSS–CSR and LSS–CSO and the well-known identification methods of the literature described above are compared to each other. Their identification performances are measured by the Normalized Mean Square Error (NMSE) and the Signal to Error Ratio (SER) applied to the system outputs.

MSE = (1/n) Σ_{i=1}^{n} (y_i^p − y_i)^2    (8)

NMSE = MSE / (desired output signal power)    (9)

SER = 10 log10 (mean square of desired signal / MSE)    (10)
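The two performance measures can be computed as follows; note that, with the desired signal power as the normalizer, the SER in dB is simply −10 log10 NMSE:

```python
import numpy as np

def nmse(y_true, y_pred):
    """Normalized mean square error, Eqs. (8)-(9): MSE over desired signal power."""
    mse = np.mean((y_pred - y_true) ** 2)
    return mse / np.mean(y_true ** 2)

def ser_db(y_true, y_pred):
    """Signal-to-error ratio in dB, Eq. (10)."""
    mse = np.mean((y_pred - y_true) ** 2)
    return 10.0 * np.log10(np.mean(y_true ** 2) / mse)
```

For example, a prediction that is off by ±0.1 around a unit-power desired signal gives NMSE = 0.01 and SER = 20 dB.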

The training and test performances of LSS–CSR and LSS–CSO and the above explained identification methods of the literature for Plant I, Plant II and Plant III are listed in Tables 1, 2 and 3, respectively. For each plant, the training and test procedure is implemented for 100 different trials. At each trial, 300 sample data are produced for Plant I, and 70 sample data


Fig. 7 Waveforms of the desired and the obtained outputs by the LSS–CSR, LSS–CSO and prediction error method for Plant III. Blue represents the original system output; the LSS–CSR, LSS–CSO and state space model with prediction error method outputs are indicated. (Color figure online)

for Plant II and Plant III. Half of the 300 samples related to Plant I are used for training and the other half for test. In the training and test for Plant II and Plant III, 48 of the 70 samples generated from the plants are used for training and the remaining 22 samples for test. Tables 1, 2 and 3 present the training and test performances in terms of NMSE and SER. The scores given for NMSE are average values over all trials. For the SER measure, the average over all trials, the worst trial and the best trial scores are provided separately.

The waveforms of the desired and the obtained outputs by the LSS–CSR, LSS–CSO and some well-known models for Plants I, II and III are given in Figs. 5, 6 and 7, respectively.

4 Conclusion

Two novel nonlinear system identification methods, based on learning state space by employing MLP neural networks as building blocks to realize the state and output functions, are presented in the paper. It is observed that training the state ANN in the first method, LSS–CSR, and training the cascade of the observer and output ANNs in the second method, LSS–CSO, by using chaotic signals provides much better overall identification performance in comparison to the methods available in the literature. Considering the results obtained from the extensive computer experimentation, it is concluded that the performance improvements may be due to the exploitation of the broad spectrum of the chaotic signals in learning a kind of conditional and delayed auto-association of states


by the state ANN in the LSS–CSR method, and also in learning a kind of conditional auto-association of states by the cascade of the observer and output ANNs in the LSS–CSO method.

A theoretical performance analysis of the developed identification methods in a future study might be useful for explaining the performance improvements achieved in the identification by the usage of chaotic signals in training the mentioned state, output and observer ANN blocks. The methods are by no means restricted to the multilayer neural networks employed in this paper; other models such as radial basis function networks and support vector machines may also be used to represent the state, output and observer blocks. In this direction, the generalization ability of the identified models may be further increased depending on the suitability of the chosen neural network model and associated learning algorithm to the plant under consideration.

References

1. Ljung L (1987) System identification: theory for the user. PTR Prentice Hall, Englewood Cliffs
2. Narendra KS, Parthasarathy K (1990) Identification and control of dynamical systems using neural networks. IEEE Trans Neural Netw 1(1):4–27
3. Chen S, Billings SA (1992) Neural networks for nonlinear dynamic system modelling and identification. Int J Control 56:319–346
4. Ljung L, Sjöberg J (1992) A system identification perspective on neural nets. IEEE Neural Netw Signal Process 423–435
5. Sjöberg J, Hjalmarsson H, Ljung L (1994) Neural networks in system identification. In: Proceedings of the 10th IFAC symposium on system identification, vol 2:49–71
6. Goethals I, Pelckmans K, Suykens JAK, De Moor B (2005a) Identification of MIMO Hammerstein models using least squares support vector machines. Automatica 41:1263–1272
7. Goethals I, Pelckmans K, Suykens JAK, De Moor B (2005b) Subspace identification of Hammerstein systems using least squares support vector machines. IEEE Trans Autom Control 50(10):1509–1519
8. Hong X, Chen S (2012) The system identification and control of Hammerstein system using non-uniform rational B-spline neural network and particle swarm optimization. Neurocomputing 82:216–223
9. Ljung L (2010) Perspectives on system identification. Annu Rev Control 34(1):1–12
10. Billings SA, Wei HL (2005) A new class of wavelet networks for nonlinear system identification. IEEE Trans Neural Netw 16(4):862–874
11. Martínez-Ramón M, Rojo-Álvarez JL, Camps-Valls G, Muñoz-Marí J, Navia-Vázquez A, Soria-Olivas E, Figueiras-Vidal AR (2006) Support vector machines for nonlinear kernel ARMA system identification. IEEE Trans Neural Netw 17(6):1617–1622
12. Hong X, Mitchell RJ (2007) A Hammerstein model identification algorithm using Bezier–Bernstein approximation. IET Proc Control Theory Appl 1(4):1149–1159
13. Elman JL (1990) Finding structure in time. Cogn Sci 14:179–211
14. Suykens JAK, De Moor B, Vandewalle J (1995) Nonlinear system identification using neural state space models, applicable to robust control design. Int J Control 62(1):129–152
15. Suykens JAK, Vandewalle J (1995) Learning a simple recurrent neural state space model to behave like Chua's double scroll. IEEE Trans Circuits Syst Part I 42(8):499–502
16. Gao XZ, Gao XM, Ovaska SJ (1996) A modified Elman neural network model with application to dynamical systems identification. Proc IEEE Syst Man Cybern Int Conf 2:1376–1381
17. Yu W, Poznyak AS, Li X (2001) Multilayer dynamic neural networks for nonlinear system on-line identification. Int J Control 74(18):1858–1864
18. Yu W (2005) State-space recurrent fuzzy neural networks for nonlinear system identification. Neural Process Lett 22:391–404
19. Ölmez M (2013) Exploiting chaos in system identification and control. PhD Thesis, Graduate School of Natural and Applied Sciences, Dokuz Eylül University
20. Cuomo KM, Oppenheim AV (1993) Circuit implementation of synchronized chaos with applications to communications. Phys Rev Lett 71(1):65–68
21. Chen G, Chen Y, Ögmen H (1997) Identifying chaotic systems via a Wiener-type cascade model. IEEE Control Syst Mag 8:29–36
22. Narendra KS (1996) Neural networks for control theory and practice. Proc IEEE 84(10):1385–1406
