
  • Stochastic, Robust, and Adaptive Control

    Robert Stengel

    Robotics and Intelligent Systems, MAE 345, Princeton University, 2011

    Uncertain dynamic systems

    State estimation

    Stochastic control

    Robust control

    Adaptive control

    Copyright 2011 by Robert Stengel. All rights reserved. For educational use only. http://www.princeton.edu/~stengel/MAE345.html

    The problem: physical plants have uncertain

    Initial conditions

    Inputs

    Measurements

    System parameters or dynamic structure

    Design goal: control systems that provide satisfactory stability and response in the presence of uncertainty

    Systems with Uncertainty

    Stochastic controller minimizes response to random initial conditions, disturbances, and measurement errors

    Robust controller has fixed gains and structure, and it minimizes likelihood of instability or unsatisfactory performance due to parameter uncertainty in the plant

    Adaptive controller has variable gains and/or structure, and it minimizes likelihood of instability or unsatisfactory performance due to plant parameter uncertainty, disturbances, and measurement errors

    Stochastic, Robust, and Adaptive Control

    Optimal State Estimation

  • State Estimation

    Goals

    Minimize effects of measurement error on knowledge of the state

    Reconstruct full state from reduced measurement set (r < n)

    Average redundant measurements (r ≥ n) to produce estimate of the full state

    Method

    Provide optimal balance between measurements and estimates based on the dynamic model alone

    Continuous- or discrete-time implementation

    Linear-Optimal State Estimation

    Continuous-time linear dynamic process with random disturbance, and measurement with random error:

    $\dot{x}(t) = F(t)x(t) + G(t)u(t) + L(t)w(t)$

    $z(t) = Hx(t) + n(t)$

    Uncertainty model for initial condition, disturbance input, and measurement error:

    $\bar{x}(t_0) = E[x(t_0)]; \quad P(t_0) = E\{[x(t_0) - \bar{x}(t_0)][x(t_0) - \bar{x}(t_0)]^T\}$

    $\bar{u}(t) = E[u(t)]; \quad U(t_0) = 0$

    $\bar{w}(t) = 0; \quad W(t) = E\{[w(t)][w(\tau)]^T\}$

    $\bar{n}(t) = 0; \quad N(t) = E\{[n(t)][n(\tau)]^T\}$

    Linear-Optimal State Estimator (Kalman-Bucy Filter)

    Optimal estimate of state:

    $\dot{\hat{x}}(t) = F(t)\hat{x}(t) + G(t)u(t) + K(t)[z(t) - H\hat{x}(t)], \quad \hat{x}(t_0) = \bar{x}(t_0)$

    $K(t)$: Optimal estimator gain matrix $(n \times r)$

    Two parts to the optimal estimator: propagation of the expected value of $x$, plus a least-squares correction to the model-based estimate:

    $\Delta\dot{\hat{x}}(t) = F(t)\Delta\hat{x}(t) + G(t)\Delta u(t) + K(t)[\Delta z(t) - H\Delta\hat{x}(t)]$

    Estimator Gain for the Kalman-Bucy Filter

    Optimal filter gain matrix:

    $K(t) = P(t)H^T N^{-1}(t)$

    Matrix Riccati equation for the estimator:

    $\dot{P}(t) = F(t)P(t) + P(t)F^T(t) + L(t)W(t)L^T(t) - P(t)H^T N^{-1} H P(t), \quad P(t_0) = P_0$

    Same equations as those that define the LQ control gain, except

    Solution matrix, P, is propagated forward in time

    Matrices and matrix sequences are different
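    As a concrete illustration of the gain and covariance equations above, the sketch below integrates the estimator Riccati equation forward in time with a simple Euler step and recomputes K(t) at each step. It is a minimal numerical sketch; the values of F, L, W, H, N, and P(t0) are illustrative assumptions, not taken from the lecture.

```python
# Minimal sketch: forward integration of the estimator Riccati equation
#   dP/dt = F P + P F' + L W L' - P H' N^{-1} H P,   K(t) = P(t) H' N^{-1}
# All matrices below are illustrative placeholders.
import numpy as np

F = np.array([[-1.0, 0.0], [1.0, 0.0]])   # system dynamics matrix
L = np.array([[1.0], [0.0]])              # disturbance input matrix
W = np.array([[0.1]])                     # disturbance spectral density
H = np.eye(2)                             # measurement matrix
N = np.diag([0.01, 0.001])                # measurement-noise spectral density
P = np.eye(2)                             # initial state-error covariance P(t0)

dt, t_final = 0.001, 5.0
N_inv = np.linalg.inv(N)
for _ in range(int(t_final / dt)):
    K = P @ H.T @ N_inv                   # optimal estimator gain K(t)
    Pdot = F @ P + P @ F.T + L @ W @ L.T - K @ H @ P
    P = P + Pdot * dt                     # propagate covariance forward in time

print("Covariance P:\n", P)
print("Kalman-Bucy gain K:\n", P @ H.T @ N_inv)
```

    In practice, a stiff ODE integrator or the steady-state algebraic Riccati solution would replace the Euler step.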

  • Comparison of Running Average and Kalman Estimate of Velocity from Position Measurement [figure]

    Second-Order Example of Kalman-Bucy Filter

    Rolling motion of an airplane:

    $\begin{bmatrix} \dot{p}(t) \\ \dot{\phi}(t) \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} p(t) \\ \phi(t) \end{bmatrix} + \begin{bmatrix} L_{\delta A} \\ 0 \end{bmatrix} \delta A(t) + \begin{bmatrix} L_p \\ 0 \end{bmatrix} p_w(t)$

    $\begin{bmatrix} p \\ \phi \end{bmatrix} = \begin{bmatrix} \text{Roll rate, rad/s} \\ \text{Roll angle, rad} \end{bmatrix}$

    $\delta A$ = Aileron deflection, rad

    $p_w$ = Turbulence disturbance, rad/s

    $L_p$: Roll-rate damping and turbulence sensitivity; $L_{\delta A}$: Control effectiveness

    Measurement of roll rate and angle:

    $\begin{bmatrix} p_M(t) \\ \phi_M(t) \end{bmatrix}_k = \begin{bmatrix} p(t) + n_p(t) \\ \phi(t) + n_\phi(t) \end{bmatrix}_k = I_2\, x(t) + n(t)$

    Second-Order Example of Kalman-Bucy Filter

    Covariance extrapolation (time arguments omitted):

    $\begin{bmatrix} \dot{p}_{11} & \dot{p}_{12} \\ \dot{p}_{12} & \dot{p}_{22} \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} + \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} \begin{bmatrix} L_p & 1 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} L_p^2 \sigma_{p_W}^2 & 0 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} \begin{bmatrix} \sigma_{p_M}^2 & 0 \\ 0 & \sigma_{\phi_M}^2 \end{bmatrix}^{-1} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix}$

    Estimator gain computation:

    $\begin{bmatrix} k_{11}(t) & k_{12}(t) \\ k_{21}(t) & k_{22}(t) \end{bmatrix} = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} \begin{bmatrix} \sigma_{p_M}^2 & 0 \\ 0 & \sigma_{\phi_M}^2 \end{bmatrix}^{-1}$

    Kalman-Bucy Filter with Two Measurements

    State estimate with roll rate and angle measurements:

    $\begin{bmatrix} \dot{\hat{p}}(t) \\ \dot{\hat{\phi}}(t) \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \hat{p}(t) \\ \hat{\phi}(t) \end{bmatrix} + \begin{bmatrix} L_{\delta A} \\ 0 \end{bmatrix} \delta A(t) + \begin{bmatrix} k_{11}(t) & k_{12}(t) \\ k_{21}(t) & k_{22}(t) \end{bmatrix} \begin{bmatrix} p_M(t) - \hat{p}(t) \\ \phi_M(t) - \hat{\phi}(t) \end{bmatrix}$
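    The rolling-motion example above can be simulated directly. The sketch below propagates the true state with random turbulence and measurement noise and runs the two-measurement Kalman-Bucy filter alongside it; the numerical values of L_p, L_δA, and the noise intensities are assumptions for illustration only.

```python
# Sketch of the second-order rolling-motion example with both measurements.
# L_p, L_dA, and all noise intensities are assumed values, not lecture data.
import numpy as np

rng = np.random.default_rng(0)
Lp, LdA = -2.0, 5.0                       # roll damping, control effectiveness (assumed)
F = np.array([[Lp, 0.0], [1.0, 0.0]])
G = np.array([[LdA], [0.0]])
Lw = np.array([[Lp], [0.0]])              # turbulence enters through roll damping
W = np.array([[0.05]])                    # turbulence spectral density (assumed)
H = np.eye(2)                             # measure roll rate and roll angle
N = np.diag([0.01, 0.001])                # measurement-noise intensities (assumed)
N_inv = np.linalg.inv(N)

dt, steps = 0.005, 2000
x = np.array([0.0, 0.1])                  # true state [p, phi]
xhat = np.zeros(2)                        # estimate starts at zero
P = 0.1 * np.eye(2)

for k in range(steps):
    dA = 0.05 * np.sin(0.01 * k)          # aileron input
    pw = rng.normal(0.0, np.sqrt(W[0, 0] / dt))        # discretized turbulence
    noise = rng.normal(0.0, np.sqrt(np.diag(N) / dt))  # discretized sensor noise
    x = x + (F @ x + G.flatten() * dA + Lw.flatten() * pw) * dt
    z = H @ x + noise                     # noisy measurement [p_M, phi_M]
    K = P @ H.T @ N_inv
    xhat = xhat + (F @ xhat + G.flatten() * dA + K @ (z - H @ xhat)) * dt
    P = P + (F @ P + P @ F.T + Lw @ W @ Lw.T - K @ H @ P) * dt

print("true state:", x, " estimate:", xhat)
```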

  • State Estimate with Angle Measurement Only

    Covariance extrapolation ($H = [0 \;\; 1]$):

    $\begin{bmatrix} \dot{p}_{11} & \dot{p}_{12} \\ \dot{p}_{12} & \dot{p}_{22} \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} + \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} \begin{bmatrix} L_p & 1 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} L_p^2 \sigma_{p_W}^2 & 0 \\ 0 & 0 \end{bmatrix} - \frac{1}{\sigma_{\phi_M}^2} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix}$

    Gain computation:

    $\begin{bmatrix} k_{11}(t) \\ k_{21}(t) \end{bmatrix} = \frac{1}{\sigma_{\phi_M}^2} \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{\sigma_{\phi_M}^2} \begin{bmatrix} p_{12}(t) \\ p_{22}(t) \end{bmatrix}$

    State estimate with roll angle measurement:

    $\begin{bmatrix} \dot{\hat{p}}(t) \\ \dot{\hat{\phi}}(t) \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \hat{p}(t) \\ \hat{\phi}(t) \end{bmatrix} + \begin{bmatrix} L_{\delta A} \\ 0 \end{bmatrix} \delta A(t) + \begin{bmatrix} k_{11}(t) \\ k_{21}(t) \end{bmatrix} \left[\phi_M(t) - \hat{\phi}(t)\right]$

    Stochastic Optimal Control

    Deterministic vs. Stochastic Optimization

    Deterministic: the state is defined by a known dynamic process and by

    precise input

    precise initial condition

    precise measurement

    $J^* = J(x^*, u^*)$

    Stochastic: the state is defined by a known dynamic process and by

    an unknown input

    an imprecise initial condition

    an imprecise or incomplete measurement

    Optimal cost = $E\{J[x^*, u^*]\}$

    Linear-Quadratic-Gaussian (LQG) Optimal Control Law

    Minimize expected value of cost, subject to uncertainty: $\min_u E(J) \geq E(J^*)$

    Stochastic optimal feedback control law combines the linear-optimal control law with a linear-optimal state estimate:

    $u^*(t) = -R^{-1}G^T(t)S(t)\hat{x}(t) = -C(t)\hat{x}(t)$

    where $\hat{x}(t)$ is an optimal estimate of the state perturbation

    Certainty equivalence:

    Feedback control is computed from the optimal estimate of the state

    The stochastic feedback control law is the same as the deterministic control law

  • Linear-Quadratic-Gaussian Control of a Dynamic Process

    $z(t) = Hx(t) + n(t)$

    $\dot{\hat{x}}(t) = F(t)\hat{x}(t) + G(t)u(t) + K(t)[z(t) - H\hat{x}(t)]$

    $u(t) = -C(t)\hat{x}(t)$
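    A minimal sketch of this certainty-equivalence structure follows, assuming time-invariant matrices so that the steady-state regulator gain C and estimator gain K come from scipy's algebraic Riccati solver; all numerical values are placeholders.

```python
# Sketch of steady-state LQG structure (certainty equivalence):
# regulator gain C from the control Riccati equation, estimator gain K from
# the filter Riccati equation, control computed from the state estimate.
# All matrices are illustrative placeholders.
import numpy as np
from scipy.linalg import solve_continuous_are

F = np.array([[0.0, 1.0], [-1.0, -0.5]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
Lw = np.array([[0.0], [1.0]])
Q, R = np.diag([10.0, 1.0]), np.array([[1.0]])        # LQ cost weights
W, N = np.array([[0.1]]), np.array([[0.01]])          # noise intensities

S = solve_continuous_are(F, G, Q, R)                  # control Riccati solution
C = np.linalg.solve(R, G.T @ S)                       # u = -C xhat
P = solve_continuous_are(F.T, H.T, Lw @ W @ Lw.T, N)  # filter Riccati solution
K = P @ H.T @ np.linalg.inv(N)                        # Kalman-Bucy gain

def controller_derivative(xhat, z):
    """Estimator dynamics xhat_dot = F xhat + G u + K (z - H xhat), with u = -C xhat."""
    u = -C @ xhat
    return F @ xhat + G @ u + K @ (z - H @ xhat), u
```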

    LQG Rolling Mill Control System Design Example

    Maintain desired thickness of shaped beam

    Account for random variations in

    thickness/hardness of incoming beam

    eccentricity in rolling cylinders

    measurement errors

    Open- and Closed-Loop Response [figure]

    http://www.mathworks.com/help/toolbox/control/ug/f0-1004500.html

    Stochastic Robust Control

    Robust Control System Design

    Make closed-loop response insensitive to plant parameter variations

    Robust controller:

    Fixed gains and structure

    Minimize likelihood of instability

    Minimize likelihood of unsatisfactory performance

  • Probabilistic Robust Control Design

    Design a fixed-parameter controller for stochastic robustness

    Monte Carlo evaluation of competing designs

    Genetic algorithm or simulated annealing search for best design

    Representations of Uncertainty

    Characteristic equation of the uncontrolled system:

    $|sI - F| = \det(sI - F) \triangleq \Delta(s) = s^n + a_{n-1}s^{n-1} + \ldots + a_1 s + a_0 = (s - \lambda_1)(s - \lambda_2)\ldots(s - \lambda_n) = 0$

    Uncertainty can be expressed in

    Elements of $F$

    Coefficients of $\Delta(s)$

    Eigenvalues

    Root Locations for an Uncertain System

    Variation may be represented by

    Worst case, e.g., upper/lower bounds

    Probability, e.g., Gaussian distribution

    [s-plane root-location plots for uniform and Gaussian parameter distributions]

    Stochastic Root Loci for Second-Order Example

    Root distributions are nonlinear functions of parameter distributions

    Unbounded parameter distributions always lead to non-zero probability of instability

    Bounded distributions may be guaranteed to be stable

  • Probability of Satisfying a Design Metric

    Probability of satisfying a design metric:

    $\Pr(d, v) \approx \frac{1}{N}\sum_{i=1}^{N} e\left[C(d), H(v)\right]$

    $d$: Control design parameter vector [e.g., SA, GA, ...]

    $v$: Uncertain plant parameter vector [e.g., RNG]

    $e$: Binary indicator, e.g., 0: satisfactory, 1: unsatisfactory

    $H(v)$: Plant

    $C(d)$: Controller (compensator)

    Design Control System to Minimize Probability of Instability

    Characteristic equation of the closed-loop system:

    $\Delta_{closed\text{-}loop}(s) = \left|sI - F(v) - G(v)C(d)\right| = \left[(s - \lambda_1)(s - \lambda_2)\ldots(s - \lambda_n)\right]_{closed\text{-}loop} = 0$

    Monte Carlo evaluation of probability of instability

    Minimize probability of instability over the control parameters using numerical search:

    $\min_d \Pr\left\{\mathrm{Re}(\lambda_i) > 0, \; i = 1, n\right\}$
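    A sketch of the Monte Carlo counting procedure for the probability of instability follows; the second-order plant and state-feedback gains inside closed_loop_matrix are hypothetical stand-ins, and only the sampling-and-counting logic mirrors the slides.

```python
# Sketch: Monte Carlo estimate of the probability of instability for an
# uncertain closed-loop system.  The plant/controller construction is a
# generic placeholder; the counting of unstable samples follows the slide.
import numpy as np

rng = np.random.default_rng(1)

def closed_loop_matrix(v, d):
    """Placeholder: closed-loop matrix for one sample v of plant parameters."""
    F = np.array([[0.0, 1.0], [-v[0], -v[1]]])
    G = np.array([[0.0], [1.0]])
    C = np.atleast_2d(d)                            # state-feedback gains (assumed)
    return F - G @ C

def prob_instability(d, n_trials=10000):
    unstable = 0
    for _ in range(n_trials):
        v = rng.uniform([0.5, 0.1], [2.0, 1.0])     # uncertain plant parameters
        eig = np.linalg.eigvals(closed_loop_matrix(v, d))
        unstable += np.any(eig.real > 0.0)          # indicator e = 1 if unstable
    return unstable / n_trials                      # Pr ~ (1/N) * sum of e

print(prob_instability(np.array([2.0, 1.0])))
```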

    Control Design Example*

    Challenge: Design a feedback compensator for a 4th-order spring-mass system (the plant) whose parameters are bounded but unknown

    Minimize the likelihood of instability

    Satisfy a settling time requirement

    Don't use too much control

    * 1990 American Control Conference Robust Control Benchmark Problem

    Uncertain Plant*

    * 1990 American Control Conference Robust Control Benchmark Problem

    Two-mass, one-spring system: control force $u$ acts on mass $m_1$, disturbance $w$ acts on mass $m_2$, and the measured output is the position of mass 2, $y = x_2$

    Plant dynamic equation:

    $\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -k/m_1 & k/m_1 & 0 & 0 \\ k/m_2 & -k/m_2 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 1/m_1 \\ 0 \end{bmatrix} u + \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1/m_2 \end{bmatrix} w$

    Plant characteristic equation:

    $\Delta(s) = s^2\left[s^2 + \frac{k(m_1 + m_2)}{m_1 m_2}\right] = s^2\left(s^2 + \omega_n^2\right)$
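    The benchmark plant is easy to assemble numerically. The sketch below builds (F, G, L, H) as functions of the uncertain parameters m1, m2, and k and checks the nominal eigenvalues against the characteristic equation.

```python
# Sketch of the ACC benchmark two-mass-spring plant matrices as a function of
# the uncertain parameters m1, m2, k (state x = [x1, x2, x1_dot, x2_dot]).
import numpy as np

def benchmark_plant(m1, m2, k):
    F = np.array([[0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0],
                  [-k / m1,  k / m1, 0.0, 0.0],
                  [ k / m2, -k / m2, 0.0, 0.0]])
    G = np.array([[0.0], [0.0], [1.0 / m1], [0.0]])   # control force on mass 1
    L = np.array([[0.0], [0.0], [0.0], [1.0 / m2]])   # disturbance force on mass 2
    H = np.array([[0.0, 1.0, 0.0, 0.0]])              # measured output y = x2
    return F, G, L, H

# Nominal case m1 = m2 = k = 1: expect poles at s = 0, 0, +/- j*sqrt(k(m1+m2)/(m1*m2))
F, G, L, H = benchmark_plant(1.0, 1.0, 1.0)
print(np.linalg.eigvals(F))                           # expect 0, 0, +/- j*sqrt(2)
```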

  • Parameter Uncertainties, Root Locus, and Control Law

    Parameters of mass-spring system: uniform probability density functions for

    $0.5 < m_1, m_2 < 1.5$

    $0.5 < k < 2$

    Effects of parameters on root locations [root-locus plot]

    Single-input/single-output feedback control law:

    $u(s) = -C(s)\,y(s)$

    Design Cost Function

    Probability of instability, $\Pr_i$: $e_i$ = 1 (unstable) or 0 (stable)

    Probability of settling-time exceedance, $\Pr_{ts}$: $e_{ts}$ = 1 (exceeded) or 0 (not exceeded)

    Probability of control-limit exceedance, $\Pr_u$: $e_u$ = 1 (exceeded) or 0 (not exceeded)

    Design cost function:

    $J = a\Pr_i^2 + b\Pr_{ts}^2 + c\Pr_u^2, \quad a = 1, \; b = c = 0.01$

    Monte Carlo Evaluation of Probability of Satisfying a Design Metric

    Compute $v$ using random-number generators over $N$ trials:

    $\Pr_k(d, v) \approx \frac{1}{N}\sum_{i=1}^{N} e_k\left[C(d), H(v)\right], \quad k = 1, 3$

    Required number of trials depends on outcome probability

    Search for best $d$ using a genetic algorithm to minimize $J$:

    $J = a\Pr_i^2(d, v) + b\Pr_{ts}^2(d, v) + c\Pr_u^2(d, v)$

    Stabilization Requires Compensation

    Proportional feedback alone, $u(s) = -c\,y(s)$, cannot stabilize the system

    Feedback of either sign drives at least one root into the right half plane
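    A sketch of how the three exceedance probabilities combine into the design cost J follows; the per-trial stability, settling-time, and control-limit checks are placeholders for closed-loop simulations, and only the indicator averaging and the weights a, b, c follow the slides.

```python
# Sketch of the three-metric Monte Carlo design cost J = a*Pr_i^2 + b*Pr_ts^2 + c*Pr_u^2.
# The per-trial checks are schematic stand-ins for closed-loop evaluation.
import numpy as np

rng = np.random.default_rng(2)
a, b, c = 1.0, 0.01, 0.01

def evaluate_metrics(d, v):
    """Return binary indicators (e_i, e_ts, e_u) for one plant sample.
    Placeholder logic standing in for simulation of the closed loop."""
    stable = rng.random() > 0.02          # stand-in for an eigenvalue test
    ts_ok = rng.random() > 0.10           # stand-in for a settling-time test
    u_ok = rng.random() > 0.05            # stand-in for a control-limit test
    return (not stable), (not ts_ok), (not u_ok)

def design_cost(d, n_trials=5000):
    counts = np.zeros(3)
    for _ in range(n_trials):
        v = rng.uniform([0.5, 0.5, 0.5], [1.5, 1.5, 2.0])   # m1, m2, k samples
        counts += evaluate_metrics(d, v)
    pr_i, pr_ts, pr_u = counts / n_trials
    return a * pr_i**2 + b * pr_ts**2 + c * pr_u**2

print(design_cost(d=None))
```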

  • Search-and-Sweep Design of Family of Robust Feedback Compensators

    1) Begin with lowest-order feedback compensator

    2) Arrange parameters as binary design vector

    3) Genetic algorithm search for best values of the design vector, i.e., the design vector that minimizes J

    $C_{12}(s) = \frac{a_0 + a_1 s}{b_0 + b_1 s + b_2 s^2} \triangleq C(d)$

    $d = \{a_0, a_1, b_0, b_1, b_2\} \rightarrow d^* = \{a_0^*, a_1^*, b_0^*, b_1^*, b_2^*\}$

    Search-and-Sweep Design of Family of Robust Feedback Compensators

    1) Define next higher-order compensator

    2) Optimize over all parameters, including optimal coefficients in starting population

    3) Sweep to satisfactory design or no further improvement

    $C_{22}(s) = \frac{a_0 + a_1 s + a_2 s^2}{b_0 + b_1 s + b_2 s^2}$

    $d = \{a_0^*, a_1^*, a_2, b_0^*, b_1^*, b_2^*\} \rightarrow d^{**} = \{a_0^{**}, a_1^{**}, a_2^{**}, b_0^{**}, b_1^{**}, b_2^{**}\}$

    $C_{23}(s) = \frac{a_0 + a_1 s + a_2 s^2}{b_0 + b_1 s + b_2 s^2 + b_3 s^3}$

    $C_{33}(s) = \frac{a_0 + a_1 s + a_2 s^2 + a_3 s^3}{b_0 + b_1 s + b_2 s^2 + b_3 s^3}$

    $C_{34}(s) = \frac{a_0 + a_1 s + a_2 s^2 + a_3 s^3}{b_0 + b_1 s + b_2 s^2 + b_3 s^3 + b_4 s^4}$

    ...
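    One way to handle this compensator family numerically is to map a design vector d to a transfer function; the sketch below does so for a C22-type compensator using scipy.signal, with illustrative coefficients.

```python
# Sketch: represent a member of the compensator family, e.g. C_22(s), as a
# scipy transfer function built from a design vector d = {a0, a1, a2, b0, b1, b2}.
import numpy as np
from scipy import signal

def compensator(d):
    a0, a1, a2, b0, b1, b2 = d
    # scipy orders polynomial coefficients from the highest power of s downward
    return signal.TransferFunction([a2, a1, a0], [b2, b1, b0])

C22 = compensator([1.0, 0.5, 0.1, 1.0, 2.0, 1.0])   # illustrative coefficients
w, mag, phase = signal.bode(C22)                     # frequency response of C(s)
```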

    Design Cost and Probabilities for Optimal 2nd- to 5th-Order Compensators (Number of Zeros = Number of Poles)

    [Bar chart comparing cost, probability of instability, probability of settling-time violation (x 0.01), and probability of control violation (x 0.1) for the 2nd-, 3rd-, 4th-, and 5th-order compensators]

    System Identification

  • Parameter-Dependent Linear System

    $\dot{x}(t) = F(p)x(t) + G(p)u(t) + L(p)w(t)$

    $z(t) = Hx(t) + n(t)$

    The linear system contains parameters

    What if the parameter vector, $p$, is unknown?

    Dynamic Model for Parameter Estimation

    Augment the vector to include the original state and the parameter vector:

    $x_A(t) = \begin{bmatrix} x(t) \\ p(t) \end{bmatrix}$

    The system model for parameter identification becomes nonlinear because the parameter is contained in the augmented state:

    $\dot{x}_A(t) = \begin{bmatrix} F(p)x(t) + G(p)u(t) + L(p)w(t) \\ f_p[p(t), w(t)] \end{bmatrix} \triangleq f_A[x(t), p(t), u(t), w(t)]$

    $z(t) = H_A \begin{bmatrix} x(t) \\ p(t) \end{bmatrix} + n(t) = \begin{bmatrix} H & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ p(t) \end{bmatrix} + n(t)$

    System Identification Using an Extended Kalman-Bucy Filter

    Add the unknown parameter vector to the estimator state

    Extend the state estimator derived for linear systems to account for the nonlinear dynamics:

    $\begin{bmatrix} \dot{\hat{x}}(t) \\ \dot{\hat{p}}(t) \end{bmatrix} = f[\hat{x}(t), \hat{p}(t), u(t)] + K\left\{z(t) - H\begin{bmatrix} \hat{x}(t) \\ \hat{p}(t) \end{bmatrix}\right\}$

    Multiple-Model Testing for System Identification

    Create a separate estimator for each hypothetical model:

    $\dot{\hat{x}}_1(t) = F_1\hat{x}_1(t) + G_1 u(t) + K_1[z(t) - H_1\hat{x}_1(t)]$

    $\dot{\hat{x}}_2(t) = F_2\hat{x}_2(t) + G_2 u(t) + K_2[z(t) - H_2\hat{x}_2(t)]$

    ...

    $\dot{\hat{x}}_n(t) = F_n\hat{x}_n(t) + G_n u(t) + K_n[z(t) - H_n\hat{x}_n(t)]$

    Choose the model with the minimum error residual:

    $J_i = \frac{1}{2}\epsilon^T\epsilon = \frac{1}{2}[z(t) - H_i\hat{x}_i(t)]^T[z(t) - H_i\hat{x}_i(t)], \quad i = 1, n$
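    The model-selection rule reduces to comparing residual costs across the parallel estimators, as in the sketch below; maintaining each estimate with its own filter is assumed to happen elsewhere.

```python
# Sketch of multiple-model identification: run one estimator per hypothesized
# model and pick the model whose residual cost J_i is smallest.
import numpy as np

def residual_cost(z, H_i, xhat_i):
    """J_i = 1/2 (z - H_i xhat_i)' (z - H_i xhat_i) for one model hypothesis."""
    eps = z - H_i @ xhat_i
    return 0.5 * float(eps.T @ eps)

def best_model(z, models):
    """models: list of (H_i, xhat_i) pairs maintained by parallel estimators."""
    costs = [residual_cost(z, H_i, xhat_i) for H_i, xhat_i in models]
    return int(np.argmin(costs)), costs
```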

  • Adaptive Control

    Adaptive Control System Design

    Control logic changes to accommodate changes or unknown parameters of the plant

    System identification

    Gain scheduling

    Learning systems

    Control law is nonlinear:

    $u(t) = c[z(t), a, y^*(t)]$

    $c[\cdot]$: Control law

    $x(t)$: State

    $z[x(t)]$: Measurement of state

    $a$: Parameters of operating point

    $y^*(t)$: Command input

    Operating Points Within a Flight Envelope

    Dynamic model is a function of altitude and airspeed

    Design LTI controllers throughout the flight envelope

    Gain Scheduling

    Proportional-integral controller with scheduled gains:

    $u(t) = C_F(a)\,y^* + C_B(a)\,\Delta x + C_I(a)\int\Delta y(t)\,dt \triangleq c[x(t), a, y^*(t)]$

    Scheduling variables, $a$: e.g., altitude, speed, properties of chemical process, ...
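    A minimal sketch of gain scheduling follows: the forward, feedback, and integral gains are designed at a few operating points and interpolated on the scheduling variable a; the breakpoints and gain values are illustrative assumptions.

```python
# Sketch of a gain-scheduled proportional-integral control law: gains are
# interpolated over a scheduling variable a (e.g., airspeed) between point designs.
import numpy as np

a_grid = np.array([100.0, 200.0, 300.0])       # scheduling-variable breakpoints (assumed)
CF_grid = np.array([0.8, 0.5, 0.3])            # forward gains designed at each point
CB_grid = np.array([2.0, 1.2, 0.7])            # state-feedback gains
CI_grid = np.array([0.4, 0.25, 0.15])          # integral gains

def scheduled_control(a, y_cmd, dx, integral_dy):
    """u = C_F(a) y* + C_B(a) dx + C_I(a) * integral of dy."""
    CF = np.interp(a, a_grid, CF_grid)         # linear interpolation between designs
    CB = np.interp(a, a_grid, CB_grid)
    CI = np.interp(a, a_grid, CI_grid)
    return CF * y_cmd + CB * dx + CI * integral_dy
```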

  • Cerebellar Model Articulation Controller (CMAC)

    Inspired by models of the cerebellum

    CMAC: two-stage mapping of a vector input to a scalar output

    First mapping: input space to association space; $s$ is fixed, $a$ is binary

    $s: x \rightarrow a$ (Input $\rightarrow$ Selector vector)

    Second mapping: association space to output space; $g$ contains learned weights

    $g: a \rightarrow y$ (Selector $\rightarrow$ Output)

    Example of Single-Input CMAC Association Space

    $x$ is in $(x_{min}, x_{max})$

    Selector vector is binary and has $N$ elements

    Receptive regions of association space map $x$ to $a$, analogous to neurons that fire in response to a stimulus

    $N_A$ = Number of receptive regions = $N + C - 1$ = dim($a$)

    $C$ = Generalization parameter = number of overlapping regions

    Input quantization = $(x_{max} - x_{min})/N$

    $a = [0\;0\;0\;1\;1\;1\;0\;0]^T$

    CMAC Output and Training

    CMAC output (i.e., control command) from the activated cells of $C$ associative-memory layers:

    $y_{CMAC} = \mathbf{w}^T\mathbf{a} = \sum_{i=j}^{j+C-1} w_{i,activated}, \quad j = \text{index of first activated region}$

    Least-squares training of the CMAC weights, $\mathbf{w}$ (analogous to synapses between neurons):

    $w_{j,new} = w_{j,old} + \frac{\eta}{C}\left[y_{desired} - \sum_{i=1}^{C} w_{i,old}\right]$

    $\eta$ is the learning rate and $w_j$ is an activated-cell weight

    Localized generalization and training

    [Diagram: 2-dimensional input space (n = 2) mapped to association memory with C = 3 overlapping layers]

    CMAC Output and Training

    In higher dimensions, the association space is dim($x$): a plane, cube, or hypercube

    Potentially large memory requirements

    Granularity (quantization) of output

    Variable generalization and granularity

    [Diagram: 2-dimensional association space, C = 3 layers]
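    A sketch of a single-input CMAC that follows the two-stage mapping and the averaged weight update described above; N, C, the input range, and the learning rate are assumed values.

```python
# Sketch of a single-input CMAC: binary selector vector over overlapping
# receptive regions, weighted sum for the output, and the averaged
# least-squares weight update.
import numpy as np

N, C = 20, 4                                  # quantization cells, generalization (assumed)
x_min, x_max = 0.0, 1.0                       # input range (assumed)
NA = N + C - 1                                # number of receptive regions = dim(a)
w = np.zeros(NA)                              # learned weights
eta = 0.1                                     # learning rate (assumed)

def selector(x):
    """First mapping s: x -> a (binary selector over C overlapping regions)."""
    j = int((x - x_min) / (x_max - x_min) * N)
    j = min(max(j, 0), N - 1)                 # index of first activated region
    a = np.zeros(NA)
    a[j:j + C] = 1.0
    return a

def cmac_output(x):
    """Second mapping g: a -> y, the sum of the activated weights."""
    return w @ selector(x)

def train(x, y_desired):
    """Averaged least-squares correction distributed over the C activated cells."""
    a = selector(x)
    error = y_desired - w @ a
    w[a > 0] += (eta / C) * error
```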

  • CMAC Control of a Fuel-Cell Pre-Processor (Iwan and Stengel)

    [Block diagram: fuel storage feeds a fuel processor (reformer or partial-oxidation reactor, shift reactor, and PrOx, with water and air inputs), which feeds the fuel-cell stack, power conditioning and motor control, batteries, and gear/motor-generator]

    Fuel cell produces electricity for electric motor

    Proton-exchange-membrane fuel cell converts hydrogen and oxygen to water and electrical power

    Steam reformer/partial oxidizer-shift reactor converts fuel (e.g., alcohol or gasoline) to H2, CO2, H2O, and CO; fuel flow rate is proportional to power demand

    CO poisons the fuel cell and must be removed from the reformate

    Catalyst promotes oxidation of CO to CO2 over oxidation of H2 in a Preferential Oxidizer (PrOx)

    PrOx reactions are nonlinear functions of catalyst, reformate composition, temperature, and air flow

    CMAC/PID Control System for Preferential Oxidizer

    [Hybrid control system block diagram: the H2-conversion error between the desired H2 conversion and the computed actual conversion drives a CMAC (ANN) element and a conventional PID element with gains scheduled on flow rate; their commands sum (airCMAC + airPID = airTOTAL) to set the PrOx air injection; PrOx operating conditions include reformate flow rate, inlet [CO], and inlet coolant temperature; actual H2 conversion is computed from airTOTAL, [H2]in, [H2]out, flow rate, and sensor dynamics; the CMAC is trained on-line from the conversion error]

    Summary of CMAC Characteristics

    Inputs and number of divisions:

    PrOx inlet reformate flow rate (95)

    PrOx inlet cooling temperature (80)

    PrOx inlet CO concentration (100)

    Output: PrOx air injection rate

    Associative layers, C: 24

    Number of associative memory cells/weights: 1,276; layer offsets: [1, 5, 7]

    Learning rate, $\eta$: ~0.01

    Sampling interval: 100 ms

  • Flow Rate and Hydrogen Conversion of CMAC/PID Controller

    H2 conversion command (across PrOx only): 1.5%

    Novel data, with and without pre-training [plot]

    Federal Urban Driving Cycle (FUDS)

    Comparison of PrOx Controllers on FUDS:

                    mean H2     maximum H2   mean CO    max. CO    net H2
                    error, %    error, %     out, ppm   out, ppm   output, %
    Fixed-Air         0.68        0.87          6.3        28        57.2
    Table Look-up     0.13        1.43          6.5        26        57.8
    PID               0.05        0.51          7.7        30        58.1
    CMAC/PID          0.02        0.16          7.3        26        58.1

    Reinforcement Learning

    Learn from success and failure

    Repetitive trials

    Reward correct behavior

    Penalize incorrect behavior

    Learn to control from a human operator

    http://en.wikipedia.org/wiki/Reinforcement_learning

    Next Time: Classification of Data Sets

  • Supplementary Material

    Dynamic Models for the Parameter Vector

    Unknown constant: $p(t)$ = constant

    $\dot{p}(t) = 0; \quad p(0) = p_0; \quad P_p(0) = P_{p_0}$

    Random $p(t)$ (integrated white noise):

    $\dot{p}(t) = w(t); \quad p(0) = 0; \quad P_p(0) = P_{p_0}$

    $E[w(t)] = 0; \quad E[w(t)w^T(\tau)] = Q_w\,\delta(t - \tau)$

    Linear dynamic system (Markov process):

    $\dot{p}(t) = Ap(t) + Bw(t) \triangleq f_p[p(t), w(t)]; \quad w(t) \sim N(0, Q_w)$
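    These three parameter models can be written as alternative choices of f_p(p, w) for the augmented-state estimator; in the sketch below, A and B for the Markov-process case are illustrative placeholders.

```python
# Sketch of the three parameter models as propagation functions f_p(p, w):
# unknown constant, integrated white noise (random walk), and a Markov process.
import numpy as np

def fp_constant(p, w):
    return np.zeros_like(p)              # p_dot = 0

def fp_random_walk(p, w):
    return w                             # p_dot = w(t), w ~ N(0, Qw)

A = -0.5 * np.eye(1)                     # Markov-process dynamics (assumed value)
B = np.eye(1)                            # disturbance input matrix (assumed value)

def fp_markov(p, w):
    return A @ p + B @ w                 # p_dot = A p + B w
```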

    Inputs for System Identification

    Transient inputs: step or square wave; impulse or pulse train

    Persistent excitation: random noise; sinusoidal frequency sweep