System Identification and Optimization Methods Based on Derivatives
Chapters 5 & 6 from Jang
Neuro-Fuzzy and Soft Computing: Fuzzy Sets
Neuro-Fuzzy and Soft Computing

[Figure: soft computing spans a model space (fuzzy inference systems, neural networks, adaptive networks) and an approach space (derivative-based and derivative-free optimization).]
Outline

• System Identification
• Least-Squares Optimization
• Optimization Based on Differentiation
  – First-order differentiation: steepest descent
  – Second-order differentiation
System Identification: Introduction
Goal:
• Determine a mathematical model for an unknown system (or target system) by observing its input-output data pairs
System Identification: Introduction

Purpose:
• To predict a system's behavior, as in time-series prediction and weather forecasting
• To explain the interactions and relationships between the inputs and outputs of a system
• To design a controller based on the model of a system, as in aircraft or ship control
• To simulate the system under control once the model is known
System Identification: Introduction

System identification involves two main steps:

• Structure identification

• Parameter identification
System Identification: Introduction

• Structure identification

Apply a priori knowledge about the target system to determine a class of models within which the search for the most suitable model is conducted. This class of models is denoted by a parameterized function y = f(u, θ), where:

y is the model output
u is the input vector
θ is the parameter vector

The form of f depends on the problem at hand, on the designer's experience, and on the laws of nature governing the target system.
System Identification: Introduction

• Parameter identification

The structure of the model is known; however, we need to apply optimization techniques to determine a parameter vector θ = θ̂ such that the resulting model ŷ = f(u, θ̂) describes the system appropriately:

ŷi − yi → 0, where yi is the desired output assigned to input ui
System Identification: Introduction

The data set composed of m desired input-output pairs (ui, yi) (i = 1, …, m) is called the training data.

System identification repeats both structure and parameter identification until a satisfactory model is found, as follows:

1) Specify and parameterize a class of mathematical models representing the system to be identified
2) Perform parameter identification to choose the parameters that best fit the training data set
3) Conduct validation tests to see whether the identified model responds correctly to an unseen data set
4) Terminate the procedure once the results of the validation test are satisfactory; otherwise, select another class of models and repeat steps 2 to 4
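The four steps above can be sketched in code. This is a minimal illustration, not from the slides: the model classes are polynomials of increasing degree, the data are synthetic, and all names (`identify`, `tol`, etc.) are made up for the example.

```python
import numpy as np

def identify(u_train, y_train, u_val, y_val, max_degree=5, tol=0.01):
    """Sketch of the identification loop: try polynomial model classes of
    increasing degree (structure identification), fit each one by least
    squares (parameter identification), and stop when the validation
    error on unseen data is satisfactory."""
    for degree in range(1, max_degree + 1):           # step 1: pick a model class
        theta = np.polyfit(u_train, y_train, degree)  # step 2: fit the parameters
        val_err = np.mean((np.polyval(theta, u_val) - y_val) ** 2)
        if val_err < tol:                             # steps 3-4: validate, terminate
            return degree, theta
    return None  # no candidate class was satisfactory

# Toy data from a quadratic system with mild noise; even-indexed points
# are used for training, odd-indexed points for validation.
rng = np.random.default_rng(0)
u = np.linspace(0.0, 4.0, 40)
y = 1.0 + 0.5 * u + 0.25 * u**2 + rng.normal(0.0, 0.05, u.size)
degree, theta = identify(u[::2], y[::2], u[1::2], y[1::2])
print(degree)
```

A linear model class fails the validation test here, so the loop moves on to the quadratic class, which passes.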
Least-Squares Estimators

General form:

y = θ1 f1(u) + θ2 f2(u) + … + θn fn(u)   (*)

where:
• u = (u1, …, up)T is the model input vector
• f1, …, fn are known functions of u
• θ1, …, θn are unknown parameters to be estimated
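The key point of form (*) is that the model is linear in the parameters θ even when the basis functions are nonlinear in u. A minimal sketch (the basis functions chosen here are illustrative, not from the slides):

```python
import numpy as np

# Illustrative known basis functions f1, f2, f3; any known functions of u
# qualify, and the model stays linear in theta.
basis = [lambda u: np.ones_like(u),  # f1(u) = 1 (constant term)
         lambda u: u,                # f2(u) = u
         lambda u: np.sin(u)]        # f3(u) = sin(u)

def model(u, theta):
    """Evaluate y = theta1*f1(u) + theta2*f2(u) + theta3*f3(u)."""
    return sum(t * f(u) for t, f in zip(theta, basis))

u = np.array([0.0, 1.0, 2.0])
theta = np.array([1.0, 2.0, 0.5])
y = model(u, theta)
print(y)  # y[0] = 1*1 + 2*0 + 0.5*sin(0) = 1.0
```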
Least-Squares Estimators

The task of fitting data using a linear model is referred to as linear regression.

We collect a training data set {(ui, yi), i = 1, …, m}.

Equation (*) becomes:

f1(u1)θ1 + f2(u1)θ2 + … + fn(u1)θn = y1
f1(u2)θ1 + f2(u2)θ2 + … + fn(u2)θn = y2
  ⋮
f1(um)θ1 + f2(um)θ2 + … + fn(um)θn = ym

which is equivalent to the matrix equation: A θ = y
Least-Squares Estimators

where A is an m×n matrix:

A = [ f1(u1) … fn(u1) ]
    [   ⋮          ⋮   ]
    [ f1(um) … fn(um) ]

θ is an n×1 unknown parameter vector:

θ = [θ1, …, θn]T

and y is an m×1 output vector:

y = [y1, …, ym]T

so that the i-th row of A is aiT = [f1(ui), …, fn(ui)].

When A is square (m = n) and nonsingular, the system has the exact solution:

A θ = y ⇔ θ = A⁻¹y
Least-Squares Estimators

We have m outputs and n fitting parameters to find (i.e., m equations in n unknown variables).

Usually m is greater than n: the model is only an approximation of the target system, and the observed data may be corrupted by noise. Therefore, an exact solution is not always possible.

To overcome this inherent conceptual problem, an error vector e is added to compensate:

A θ + e = y
Least-Squares Estimators

The goal now consists of finding the θ that minimizes the error between each yi and its model prediction ŷi = aiTθ:

minimize over θ:  Σ_{i=1}^{m} (yi − aiTθ)²
Least-Squares Estimators

If e = y − Aθ, then:

E(θ) = Σ_{i=1}^{m} (yi − aiTθ)² = eTe = (y − Aθ)T(y − Aθ)

We need to compute:

min over θ of (y − Aθ)T(y − Aθ)
Least-Squares Estimators

Theorem [least-squares estimator]

The squared error is minimized when θ = θ̂ (called the least-squares estimator, LSE) satisfies the normal equation ATA θ̂ = ATy. If ATA is nonsingular, θ̂ is unique and is given by:

θ̂ = (ATA)⁻¹ATy
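As a sketch of the theorem in code (the data values here are made up for illustration), the normal-equation solution can be checked against NumPy's own least-squares routine:

```python
import numpy as np

# Illustrative overdetermined system: m = 5 observations, n = 2 parameters.
u = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Design matrix A for the model y = theta1 * 1 + theta2 * u
A = np.column_stack([np.ones_like(u), u])

# LSE via the normal equation: solve (A^T A) theta = A^T y
theta_ne = np.linalg.solve(A.T @ A, A.T @ y)

# Reference answer from NumPy's built-in least-squares solver
theta_ls, *_ = np.linalg.lstsq(A, y, rcond=None)

print(theta_ne)  # both solutions agree
```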
Least-Squares Estimators: Example

Example: the relationship between the spring length and the applied force, L = k1 f + k0 (linear model)

• Goal: find θ = (k0, k1)T that best fits the data, so that for a given force f0 we can determine the corresponding spring length L0

– Solution 1: take 2 pairs (f0, L0) and (f1, L1) and solve a linear system of 2 equations in the 2 variables k0 and k1 ⇒ k0 & k1. However, because of noisy data, this solution is not reliable.
– Solution 2: use a larger training set (fi, Li)
Least-Squares Estimators: Example

Since y = Aθ + e, the seven observed (force, length) pairs give:

y = [1.5, 2.1, 2.5, 3.3, 4.1, 4.6, 5.0]T

A = [ 1  1.1 ]
    [ 1  1.9 ]
    [ 1  3.2 ]
    [ 1  4.4 ]
    [ 1  5.9 ]
    [ 1  7.4 ]
    [ 1  9.2 ]

θ = [k0, k1]T,  e = [e1, …, e7]T
Least-Squares Estimators: Example

Therefore, the LSE of [k0, k1]T, which minimizes eTe = Σ_{i=1}^{7} ei², is equal to:

[k0, k1]T = (ATA)⁻¹ATy ≈ [1.20, 0.44]T
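The slide's numbers can be reproduced directly; a minimal sketch, assuming the seven data pairs shown above:

```python
import numpy as np

# Spring data from the example: applied force f and observed length L
f = np.array([1.1, 1.9, 3.2, 4.4, 5.9, 7.4, 9.2])
L = np.array([1.5, 2.1, 2.5, 3.3, 4.1, 4.6, 5.0])

A = np.column_stack([np.ones_like(f), f])   # rows [1, f_i], theta = [k0, k1]
k0, k1 = np.linalg.solve(A.T @ A, A.T @ L)  # normal-equation LSE

print(k0, k1)  # k0 ≈ 1.20, k1 ≈ 0.44
```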
Least-Squares Estimators: Example

We rely on this estimate because we have more data.

If we are not happy with the linear LSE fit, we can increase the model's degrees of freedom:

L = k0 + k1 f + k2 f² + … + kn fⁿ

(least-squares polynomial)
Least-Squares Estimators: Example

Higher-order models fit the data better, but they do not always reflect the underlying law that governs the system.

For example, under a high-order polynomial fit, the predicted length decreases as f increases toward 10 N, which contradicts the physics of a spring.
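This effect can be sketched with the spring data (the degree choices are illustrative): a degree-6 polynomial interpolates all seven points, so its training error drops essentially to zero, yet its behavior beyond the data is untrustworthy.

```python
import numpy as np

f = np.array([1.1, 1.9, 3.2, 4.4, 5.9, 7.4, 9.2])
L = np.array([1.5, 2.1, 2.5, 3.3, 4.1, 4.6, 5.0])

# Fit a linear model and a degree-6 polynomial (7 points -> interpolation)
lin = np.polyfit(f, L, 1)
poly = np.polyfit(f, L, 6)

# Sum of squared errors on the training data
sse_lin = np.sum((np.polyval(lin, f) - L) ** 2)
sse_poly = np.sum((np.polyval(poly, f) - L) ** 2)

print(sse_poly < sse_lin)      # the high-order model fits the training data better
print(np.polyval(poly, 10.0))  # ...but extrapolating it to 10 N is unreliable
```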
Derivative-Based Optimization

Methods based on first derivatives:
• Steepest descent
• Conjugate gradient method
• Gauss-Newton method
• Levenberg-Marquardt method
• and many others

Methods based on second derivatives:
• Newton's method
• and many others
Steepest Descent

θ = [θ1, θ2, …, θn]T defines an n-dimensional search space.

We want to find θ = θ* such that:

E(θ*) = min over θ of E(θ)

Investigate the search space via the iteration:

θ_{k+1} = θ_k + η_k d_k,   k = 1, 2, 3, …

such that:

E(θ_{k+1}) = E(θ_k + η_k d_k) < E(θ_k)
Steepest Descent

If the direction d is determined from the gradient of E(θ), i.e.,

g ≜ ∇E(θ) = [∂E/∂θ1, ∂E/∂θ2, …, ∂E/∂θn]T

such that:

θ_{k+1} = θ_k − η g

then we talk about the method of steepest descent.
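The update rule can be sketched on a simple objective (the objective, step size, and starting point below are illustrative, not from the slides):

```python
import numpy as np

def E(theta):
    """Illustrative quadratic objective with minimum at theta* = (3, -1)."""
    return (theta[0] - 3.0) ** 2 + 2.0 * (theta[1] + 1.0) ** 2

def grad_E(theta):
    """Gradient g = [dE/dtheta1, dE/dtheta2]^T of E."""
    return np.array([2.0 * (theta[0] - 3.0), 4.0 * (theta[1] + 1.0)])

eta = 0.1                      # fixed step size eta
theta = np.array([0.0, 0.0])   # starting point theta_1
for _ in range(200):
    theta = theta - eta * grad_E(theta)  # theta_{k+1} = theta_k - eta * g

print(theta)  # converges toward the minimizer (3, -1)
```

Each step moves opposite the gradient, so with a small enough η every iteration satisfies E(θ_{k+1}) < E(θ_k).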