
  • Stochastic, Robust, and Adaptive Control

    Robert Stengel

    Robotics and Intelligent Systems, MAE 345, Princeton University, 2011

    Uncertain dynamic systems

    State estimation

    Stochastic control

    Robust control

    Adaptive control

    Copyright 2011 by Robert Stengel. All rights reserved. For educational use only. http://www.princeton.edu/~stengel/MAE345.html

    The problem: physical plants have uncertain

    Initial conditions

    Inputs

    Measurements

    System parameters or dynamic structure

    Design goal: control systems that provide satisfactory stability and response in the presence of uncertainty

    Systems with Uncertainty

    Stochastic controller minimizes response to random initial conditions, disturbances, and measurement errors

    Robust controller has fixed gains and structure, and it minimizes likelihood of instability or unsatisfactory performance due to parameter uncertainty in the plant

    Adaptive controller has variable gains and/or structure, and it minimizes likelihood of instability or unsatisfactory performance due to plant parameter uncertainty, disturbances, and measurement errors

    Stochastic, Robust, and Adaptive Control

    Optimal State Estimation

  • State Estimation

    Goals

    Minimize effects of measurement error on knowledge of the state

    Reconstruct full state from reduced measurement set (r < n)

    Average redundant measurements (r ≥ n) to produce estimate of the full state

    Method

    Provide optimal balance between measurements and estimates based on the dynamic model alone

    Continuous- or discrete-time implementation

    Linear-Optimal State Estimation

    Continuous-time linear dynamic process with random disturbance, and measurement with random error:

    $\dot{x}(t) = F(t)x(t) + G(t)u(t) + L(t)w(t)$

    $z(t) = Hx(t) + n(t)$

    Uncertainty model for initial condition, disturbance input, and measurement error:

    $\bar{x}(t_0) = E[x(t_0)]; \quad P(t_0) = E\{[x(t_0) - \bar{x}(t_0)][x(t_0) - \bar{x}(t_0)]^T\}$

    $\bar{u}(t) = E[u(t)]; \quad U(t_0) = 0$

    $\bar{w}(t) = 0; \quad W(t) = E\{[w(t)][w(\tau)]^T\}$

    $\bar{n}(t) = 0; \quad N(t) = E\{[n(t)][n(\tau)]^T\}$

    Linear-Optimal State Estimator (Kalman-Bucy Filter)

    Optimal estimate of state:

    $\dot{\hat{x}}(t) = F(t)\hat{x}(t) + G(t)u(t) + K(t)[z(t) - H\hat{x}(t)], \quad \hat{x}(t_0) = \bar{x}(t_0)$

    $K(t)$: Optimal estimator gain matrix $(n \times r)$

    Two parts to the optimal estimator: propagation of the expected value of $x$, plus a least-squares correction to the model-based estimate:

    $\Delta\dot{\hat{x}}(t) = F(t)\Delta\hat{x}(t) + G(t)\Delta u(t) + K(t)[\Delta z(t) - H\Delta\hat{x}(t)]$

    Estimator Gain for the Kalman-Bucy Filter

    Optimal filter gain matrix:

    $K(t) = P(t)H^T N^{-1}(t)$

    Matrix Riccati equation for the estimator:

    $\dot{P}(t) = F(t)P(t) + P(t)F^T(t) + L(t)W(t)L^T(t) - P(t)H^T N^{-1} H P(t), \quad P(t_0) = P_0$

    Same equations as those that define the LQ control gain, except

    Solution matrix, P, is propagated forward in time

    Matrices and matrix sequences are different
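    As a concrete illustration of the gain and covariance equations above, the sketch below integrates the estimator Riccati equation forward in time with a simple Euler step and recomputes K(t) at each step. It is a minimal numerical sketch; the values of F, L, W, H, N, and P(t0) are illustrative assumptions, not taken from the lecture.

```python
# Minimal sketch: forward integration of the estimator Riccati equation
#   dP/dt = F P + P F' + L W L' - P H' N^{-1} H P,   K(t) = P(t) H' N^{-1}
# All matrices below are illustrative placeholders.
import numpy as np

F = np.array([[-1.0, 0.0], [1.0, 0.0]])   # system dynamics matrix
L = np.array([[1.0], [0.0]])              # disturbance input matrix
W = np.array([[0.1]])                     # disturbance spectral density
H = np.eye(2)                             # measurement matrix
N = np.diag([0.01, 0.001])                # measurement-noise spectral density
P = np.eye(2)                             # initial state-error covariance P(t0)

dt, t_final = 0.001, 5.0
N_inv = np.linalg.inv(N)
for _ in range(int(t_final / dt)):
    K = P @ H.T @ N_inv                   # optimal estimator gain K(t)
    Pdot = F @ P + P @ F.T + L @ W @ L.T - K @ H @ P
    P = P + Pdot * dt                     # propagate covariance forward in time

print("Covariance P:\n", P)
print("Kalman-Bucy gain K:\n", P @ H.T @ N_inv)
```

    In practice, a stiff ODE integrator or the steady-state algebraic Riccati solution would replace the Euler step.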

  • Comparison of Running Average and Kalman Estimate of Velocity from Position Measurement [figure]

    Second-Order Example of Kalman-Bucy Filter

    Rolling motion of an airplane:

    $\begin{bmatrix} \dot{p}(t) \\ \dot{\phi}(t) \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} p(t) \\ \phi(t) \end{bmatrix} + \begin{bmatrix} L_{\delta A} \\ 0 \end{bmatrix} \delta A(t) + \begin{bmatrix} L_p \\ 0 \end{bmatrix} p_w(t)$

    $\begin{bmatrix} p \\ \phi \end{bmatrix} = \begin{bmatrix} \text{Roll rate, rad/s} \\ \text{Roll angle, rad} \end{bmatrix}$

    $\delta A$ = Aileron deflection, rad

    $p_w$ = Turbulence disturbance, rad/s

    $L_p$: Roll-rate damping and turbulence sensitivity; $L_{\delta A}$: Control effectiveness

    Measurement of roll rate and angle:

    $\begin{bmatrix} p_M(t) \\ \phi_M(t) \end{bmatrix}_k = \begin{bmatrix} p(t) + n_p(t) \\ \phi(t) + n_\phi(t) \end{bmatrix}_k = I_2\, x(t) + n(t)$

    Second-Order Example of Kalman-Bucy Filter

    Covariance extrapolation (time arguments omitted):

    $\begin{bmatrix} \dot{p}_{11} & \dot{p}_{12} \\ \dot{p}_{12} & \dot{p}_{22} \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} + \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} \begin{bmatrix} L_p & 1 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} L_p^2 \sigma_{p_W}^2 & 0 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} \begin{bmatrix} \sigma_{p_M}^2 & 0 \\ 0 & \sigma_{\phi_M}^2 \end{bmatrix}^{-1} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix}$

    Estimator gain computation:

    $\begin{bmatrix} k_{11}(t) & k_{12}(t) \\ k_{21}(t) & k_{22}(t) \end{bmatrix} = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} \begin{bmatrix} \sigma_{p_M}^2 & 0 \\ 0 & \sigma_{\phi_M}^2 \end{bmatrix}^{-1}$

    Kalman-Bucy Filter with Two Measurements

    State estimate with roll rate and angle measurements:

    $\begin{bmatrix} \dot{\hat{p}}(t) \\ \dot{\hat{\phi}}(t) \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \hat{p}(t) \\ \hat{\phi}(t) \end{bmatrix} + \begin{bmatrix} L_{\delta A} \\ 0 \end{bmatrix} \delta A(t) + \begin{bmatrix} k_{11}(t) & k_{12}(t) \\ k_{21}(t) & k_{22}(t) \end{bmatrix} \begin{bmatrix} p_M(t) - \hat{p}(t) \\ \phi_M(t) - \hat{\phi}(t) \end{bmatrix}$
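    The rolling-motion example above can be simulated directly. The sketch below propagates the true state with random turbulence and measurement noise and runs the two-measurement Kalman-Bucy filter alongside it; the numerical values of L_p, L_δA, and the noise intensities are assumptions for illustration only.

```python
# Sketch of the second-order rolling-motion example with both measurements.
# L_p, L_dA, and all noise intensities are assumed values, not lecture data.
import numpy as np

rng = np.random.default_rng(0)
Lp, LdA = -2.0, 5.0                       # roll damping, control effectiveness (assumed)
F = np.array([[Lp, 0.0], [1.0, 0.0]])
G = np.array([[LdA], [0.0]])
Lw = np.array([[Lp], [0.0]])              # turbulence enters through roll damping
W = np.array([[0.05]])                    # turbulence spectral density (assumed)
H = np.eye(2)                             # measure roll rate and roll angle
N = np.diag([0.01, 0.001])                # measurement-noise intensities (assumed)
N_inv = np.linalg.inv(N)

dt, steps = 0.005, 2000
x = np.array([0.0, 0.1])                  # true state [p, phi]
xhat = np.zeros(2)                        # estimate starts at zero
P = 0.1 * np.eye(2)

for k in range(steps):
    dA = 0.05 * np.sin(0.01 * k)          # aileron input
    pw = rng.normal(0.0, np.sqrt(W[0, 0] / dt))        # discretized turbulence
    noise = rng.normal(0.0, np.sqrt(np.diag(N) / dt))  # discretized sensor noise
    x = x + (F @ x + G.flatten() * dA + Lw.flatten() * pw) * dt
    z = H @ x + noise                     # noisy measurement [p_M, phi_M]
    K = P @ H.T @ N_inv
    xhat = xhat + (F @ xhat + G.flatten() * dA + K @ (z - H @ xhat)) * dt
    P = P + (F @ P + P @ F.T + Lw @ W @ Lw.T - K @ H @ P) * dt

print("true state:", x, " estimate:", xhat)
```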

  • State Estimate with Angle Measurement Only

    Covariance extrapolation ($H = [0 \;\; 1]$):

    $\begin{bmatrix} \dot{p}_{11} & \dot{p}_{12} \\ \dot{p}_{12} & \dot{p}_{22} \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} + \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} \begin{bmatrix} L_p & 1 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} L_p^2 \sigma_{p_W}^2 & 0 \\ 0 & 0 \end{bmatrix} - \frac{1}{\sigma_{\phi_M}^2} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix}$

    Gain computation:

    $\begin{bmatrix} k_{11}(t) \\ k_{21}(t) \end{bmatrix} = \frac{1}{\sigma_{\phi_M}^2} \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{\sigma_{\phi_M}^2} \begin{bmatrix} p_{12}(t) \\ p_{22}(t) \end{bmatrix}$

    State estimate with roll angle measurement:

    $\begin{bmatrix} \dot{\hat{p}}(t) \\ \dot{\hat{\phi}}(t) \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \hat{p}(t) \\ \hat{\phi}(t) \end{bmatrix} + \begin{bmatrix} L_{\delta A} \\ 0 \end{bmatrix} \delta A(t) + \begin{bmatrix} k_{11}(t) \\ k_{21}(t) \end{bmatrix} \left[\phi_M(t) - \hat{\phi}(t)\right]$

    Stochastic Optimal Control

    Deterministic vs. Stochastic Optimization

    Deterministic: the state is defined by a known dynamic process and by

    precise input

    precise initial condition

    precise measurement

    $J^* = J(x^*, u^*)$

    Stochastic: the state is defined by a known dynamic process and by

    an unknown input

    an imprecise initial condition

    an imprecise or incomplete measurement

    Optimal cost = $E\{J[x^*, u^*]\}$

    Linear-Quadratic-Gaussian (LQG) Optimal Control Law

    Minimize expected value of cost, subject to uncertainty: $\min_u E(J) \geq E(J^*)$

    Stochastic optimal feedback control law combines the linear-optimal control law with a linear-optimal state estimate:

    $u^*(t) = -R^{-1}G^T(t)S(t)\hat{x}(t) = -C(t)\hat{x}(t)$

    where $\hat{x}(t)$ is an optimal estimate of the state perturbation

    Certainty equivalence:

    Feedback control is computed from the optimal estimate of the state

    The stochastic feedback control law is the same as the deterministic control law

  • Linear-Quadratic-Gaussian Control of a Dynamic Process

    $z(t) = Hx(t) + n(t)$

    $\dot{\hat{x}}(t) = F(t)\hat{x}(t) + G(t)u(t) + K(t)[z(t) - H\hat{x}(t)]$

    $u(t) = -C(t)\hat{x}(t)$
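    A minimal sketch of this certainty-equivalence structure follows, assuming time-invariant matrices so that the steady-state regulator gain C and estimator gain K come from scipy's algebraic Riccati solver; all numerical values are placeholders.

```python
# Sketch of steady-state LQG structure (certainty equivalence):
# regulator gain C from the control Riccati equation, estimator gain K from
# the filter Riccati equation, control computed from the state estimate.
# All matrices are illustrative placeholders.
import numpy as np
from scipy.linalg import solve_continuous_are

F = np.array([[0.0, 1.0], [-1.0, -0.5]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
Lw = np.array([[0.0], [1.0]])
Q, R = np.diag([10.0, 1.0]), np.array([[1.0]])        # LQ cost weights
W, N = np.array([[0.1]]), np.array([[0.01]])          # noise intensities

S = solve_continuous_are(F, G, Q, R)                  # control Riccati solution
C = np.linalg.solve(R, G.T @ S)                       # u = -C xhat
P = solve_continuous_are(F.T, H.T, Lw @ W @ Lw.T, N)  # filter Riccati solution
K = P @ H.T @ np.linalg.inv(N)                        # Kalman-Bucy gain

def controller_derivative(xhat, z):
    """Estimator dynamics xhat_dot = F xhat + G u + K (z - H xhat), with u = -C xhat."""
    u = -C @ xhat
    return F @ xhat + G @ u + K @ (z - H @ xhat), u
```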

    LQG Rolling Mill Control System Design Example

    Maintain desired thickness of shaped beam

    Account for random variations in

    thickness/hardness of incoming beam

    eccentricity in rolling cylinders

    measurement errors

    Open- and Closed-Loop Response [figure]

    http://www.mathworks.com/help/toolbox/control/ug/f0-1004500.html

    Stochastic Robust Control

    Robust Control System Design

    Make closed-loop response insensitive to plant parameter variations

    Robust controller:

    Fixed gains and structure

    Minimize likelihood of instability

    Minimize likelihood of unsatisfactory performance

  • Probabilistic Robust Control Design

    Design a fixed-parameter controller for stochastic robustness

    Monte Carlo evaluation of competing designs

    Genetic algorithm or simulated annealing search for best design

    Representations of Uncertainty

    Characteristic equation of the uncontrolled system:

    $|sI - F| = \det(sI - F) \triangleq \Delta(s) = s^n + a_{n-1}s^{n-1} + \ldots + a_1 s + a_0 = (s - \lambda_1)(s - \lambda_2)\ldots(s - \lambda_n) = 0$

    Uncertainty can be expressed in

    Elements of $F$

    Coefficients of $\Delta(s)$

    Eigenvalues

    Root Locations for an Uncertain System

    Variation may be represented by

    Worst case, e.g., upper/lower bounds

    Probability, e.g., Gaussian distribution

    [s-plane root-location plots for uniform and Gaussian parameter distributions]

    Stochastic Root Loci for Second-Order Example

    Root distributions are nonlinear functions of parameter distributions

    Unbounded parameter distributions always lead to non-zero probability of instability

    Bounded distributions may be guaranteed to be stable

  • Probability of Satisfying a Design Metric

    Probability of satisfying a design metric:

    $\Pr(d, v) \approx \frac{1}{N}\sum_{i=1}^{N} e\left[C(d), H(v)\right]$

    $d$: Control design parameter vector [e.g., SA, GA, ...]

    $v$: Uncertain plant parameter vector [e.g., RNG]

    $e$: Binary indicator, e.g., 0: satisfactory, 1: unsatisfactory

    $H(v)$: Plant

    $C(d)$: Controller (compensator)

    Design Control System to Minimize Probability of Instability

    Characteristic equation of the closed-loop system:

    $\Delta_{closed\text{-}loop}(s) = \left|sI - F(v) - G(v)C(d)\right| = \left[(s - \lambda_1)(s - \lambda_2)\ldots(s - \lambda_n)\right]_{closed\text{-}loop} = 0$

    Monte Carlo evaluation of probability of instability

    Minimize probability of instability over the control parameters using numerical search:

    $\min_d \Pr\left\{\mathrm{Re}(\lambda_i) > 0, \; i = 1, n\right\}$
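    A sketch of the Monte Carlo counting procedure for the probability of instability follows; the second-order plant and state-feedback gains inside closed_loop_matrix are hypothetical stand-ins, and only the sampling-and-counting logic mirrors the slides.

```python
# Sketch: Monte Carlo estimate of the probability of instability for an
# uncertain closed-loop system.  The plant/controller construction is a
# generic placeholder; the counting of unstable samples follows the slide.
import numpy as np

rng = np.random.default_rng(1)

def closed_loop_matrix(v, d):
    """Placeholder: closed-loop matrix for one sample v of plant parameters."""
    F = np.array([[0.0, 1.0], [-v[0], -v[1]]])
    G = np.array([[0.0], [1.0]])
    C = np.atleast_2d(d)                            # state-feedback gains (assumed)
    return F - G @ C

def prob_instability(d, n_trials=10000):
    unstable = 0
    for _ in range(n_trials):
        v = rng.uniform([0.5, 0.1], [2.0, 1.0])     # uncertain plant parameters
        eig = np.linalg.eigvals(closed_loop_matrix(v, d))
        unstable += np.any(eig.real > 0.0)          # indicator e = 1 if unstable
    return unstable / n_trials                      # Pr ~ (1/N) * sum of e

print(prob_instability(np.array([2.0, 1.0])))
```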

    Control Design Example*

    Challenge: Design a feedback compensator for a 4th-order spring-mass system (the plant) whose parameters are bounded but unknown

    Minimize the likelihood of instability

    Satisfy a settling time requirement

    Don't use too much control

    * 1990 American Control Conference Robust Control Benchmark Problem

    Uncertain Plant*

    * 1990 American Control Conference Robust Control Benchmark Problem

    Two-mass, one-spring system: control force $u$ acts on mass $m_1$, disturbance $w$ acts on mass $m_2$, and the measured output is the position of mass 2, $y = x_2$

    Plant dynamic equation:

    $\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -k/m_1 & k/m_1 & 0 & 0 \\ k/m_2 & -k/m_2 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 1/m_1 \\ 0 \end{bmatrix} u + \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1/m_2 \end{bmatrix} w$

    Plant characteristic equation:

    $\Delta(s) = s^2\left[s^2 + \frac{k(m_1 + m_2)}{m_1 m_2}\right] = s^2\left(s^2 + \omega_n^2\right)$
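    The benchmark plant is easy to assemble numerically. The sketch below builds (F, G, L, H) as functions of the uncertain parameters m1, m2, and k and checks the nominal eigenvalues against the characteristic equation.

```python
# Sketch of the ACC benchmark two-mass-spring plant matrices as a function of
# the uncertain parameters m1, m2, k (state x = [x1, x2, x1_dot, x2_dot]).
import numpy as np

def benchmark_plant(m1, m2, k):
    F = np.array([[0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0],
                  [-k / m1,  k / m1, 0.0, 0.0],
                  [ k / m2, -k / m2, 0.0, 0.0]])
    G = np.array([[0.0], [0.0], [1.0 / m1], [0.0]])   # control force on mass 1
    L = np.array([[0.0], [0.0], [0.0], [1.0 / m2]])   # disturbance force on mass 2
    H = np.array([[0.0, 1.0, 0.0, 0.0]])              # measured output y = x2
    return F, G, L, H

# Nominal case m1 = m2 = k = 1: expect poles at s = 0, 0, +/- j*sqrt(k(m1+m2)/(m1*m2))
F, G, L, H = benchmark_plant(1.0, 1.0, 1.0)
print(np.linalg.eigvals(F))                           # expect 0, 0, +/- j*sqrt(2)
```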

  • Parameter Uncertainties, Root Locus, and Control Law

    Parameters of mass-spring system: uniform probability density functions for

    $0.5 < m_1, m_2 < 1.5$

    $0.5 < k < 2$

    Effects of parameters on root locations [root-locus plot]

    Single-input/single-output feedback control law:

    $u(s) = -C(s)\,y(s)$

    Design Cost Function

    Probability of instability, $\Pr_i$: $e_i$ = 1 (unstable) or 0 (stable)

    Probability of settling-time exceedance, $\Pr_{ts}$: $e_{ts}$ = 1 (exceeded) or 0 (not exceeded)

    Probability of control-limit exceedance, $\Pr_u$: $e_u$ = 1 (exceeded) or 0 (not exceeded)

    Design cost function:

    $J = a\Pr_i^2 + b\Pr_{ts}^2 + c\Pr_u^2, \quad a = 1, \; b = c = 0.01$

    Monte Carlo Evaluation of Probability of Satisfying a Design Metric

    Compute $v$ using random-number generators over $N$ trials:

    $\Pr_k(d, v) \approx \frac{1}{N}\sum_{i=1}^{N} e_k\left[C(d), H(v)\right], \quad k = 1, 3$

    Required number of trials depends on outcome probability

    Search for best $d$ using a genetic algorithm to minimize $J$:

    $J = a\Pr_i^2(d, v) + b\Pr_{ts}^2(d, v) + c\Pr_u^2(d, v)$

    Stabilization Requires Compensation

    Proportional feedback alone, $u(s) = -c\,y(s)$, cannot stabilize the system

    Feedback of either sign drives at least one root into the right half plane
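    A sketch of how the three exceedance probabilities combine into the design cost J follows; the per-trial stability, settling-time, and control-limit checks are placeholders for closed-loop simulations, and only the indicator averaging and the weights a, b, c follow the slides.

```python
# Sketch of the three-metric Monte Carlo design cost J = a*Pr_i^2 + b*Pr_ts^2 + c*Pr_u^2.
# The per-trial checks are schematic stand-ins for closed-loop evaluation.
import numpy as np

rng = np.random.default_rng(2)
a, b, c = 1.0, 0.01, 0.01

def evaluate_metrics(d, v):
    """Return binary indicators (e_i, e_ts, e_u) for one plant sample.
    Placeholder logic standing in for simulation of the closed loop."""
    stable = rng.random() > 0.02          # stand-in for an eigenvalue test
    ts_ok = rng.random() > 0.10           # stand-in for a settling-time test
    u_ok = rng.random() > 0.05            # stand-in for a control-limit test
    return (not stable), (not ts_ok), (not u_ok)

def design_cost(d, n_trials=5000):
    counts = np.zeros(3)
    for _ in range(n_trials):
        v = rng.uniform([0.5, 0.5, 0.5], [1.5, 1.5, 2.0])   # m1, m2, k samples
        counts += evaluate_metrics(d, v)
    pr_i, pr_ts, pr_u = counts / n_trials
    return a * pr_i**2 + b * pr_ts**2 + c * pr_u**2

print(design_cost(d=None))
```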

  • Search-and-Sweep Design of Family of Robust Feedback Compensators

    1) Begin with lowest-order feedback compensator

    2) Arrange parameters as binary design vector

    3) Genetic algorithm search for best values of the design vector, i.e., the design vector that minimizes J

    $C_{12}(s) = \frac{a_0 + a_1 s}{b_0 + b_1 s + b_2 s^2} \triangleq C(d)$

    $d = \{a_0, a_1, b_0, b_1, b_2\} \rightarrow d^* = \{a_0^*, a_1^*, b_0^*, b_1^*, b_2^*\}$

    Search-and-Sweep Design of Family of Robust Feedback Compensators

    1) Define next higher-order compensator

    2) Optimize over all parameters, including optimal coefficients in starting population

    3) Sweep to satisfactory design or no further improvement

    $C_{22}(s) = \frac{a_0 + a_1 s + a_2 s^2}{b_0 + b_1 s + b_2 s^2}$

    $d = \{a_0^*, a_1^*, a_2, b_0^*, b_1^*, b_2^*\} \rightarrow d^{**} = \{a_0^{**}, a_1^{**}, a_2^{**}, b_0^{**}, b_1^{**}, b_2^{**}\}$

    $C_{23}(s) = \frac{a_0 + a_1 s + a_2 s^2}{b_0 + b_1 s + b_2 s^2 + b_3 s^3}$

    $C_{33}(s) = \frac{a_0 + a_1 s + a_2 s^2 + a_3 s^3}{b_0 + b_1 s + b_2 s^2 + b_3 s^3}$

    $C_{34}(s) = \frac{a_0 + a_1 s + a_2 s^2 + a_3 s^3}{b_0 + b_1 s + b_2 s^2 + b_3 s^3 + b_4 s^4}$

    ...
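    One way to handle this compensator family numerically is to map a design vector d to a transfer function; the sketch below does so for a C22-type compensator using scipy.signal, with illustrative coefficients.

```python
# Sketch: represent a member of the compensator family, e.g. C_22(s), as a
# scipy transfer function built from a design vector d = {a0, a1, a2, b0, b1, b2}.
import numpy as np
from scipy import signal

def compensator(d):
    a0, a1, a2, b0, b1, b2 = d
    # scipy orders polynomial coefficients from the highest power of s downward
    return signal.TransferFunction([a2, a1, a0], [b2, b1, b0])

C22 = compensator([1.0, 0.5, 0.1, 1.0, 2.0, 1.0])   # illustrative coefficients
w, mag, phase = signal.bode(C22)                     # frequency response of C(s)
```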

    Design Cost and Probabilities for Optimal 2nd- to 5th-Order Compensators (Number of Zeros = Number of Poles)

    [Bar chart comparing cost, probability of instability, probability of settling-time violation (x 0.01), and probability of control violation (x 0.1) for the 2nd-, 3rd-, 4th-, and 5th-order compensators]

    System Identification

  • Parameter-Dependent Linear System

    $\dot{x}(t) = F(p)x(t) + G(p)u(t) + L(p)w(t)$

    $z(t) = Hx(t) + n(t)$

    The linear system contains parameters

    What if the parameter vector, $p$, is unknown?

    Dynamic Model for Parameter Estimation

    Augment the vector to include the original state and the parameter vector:

    $x_A(t) = \begin{bmatrix} x(t) \\ p(t) \end{bmatrix}$

    The system model for parameter identification becomes nonlinear because the parameter is contained in the augmented state:

    $\dot{x}_A(t) = \begin{bmatrix} F(p)x(t) + G(p)u(t) + L(p)w(t) \\ f_p[p(t), w(t)] \end{bmatrix} \triangleq f_A[x(t), p(t), u(t), w(t)]$

    $z(t) = H_A \begin{bmatrix} x(t) \\ p(t) \end{bmatrix} + n(t) = \begin{bmatrix} H & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ p(t) \end{bmatrix} + n(t)$

    System Identification Using an Extended Kalman-Bucy Filter

    Add the unknown parameter vector to the estimator state

    Extend the state estimator derived for linear systems to account for the nonlinear dynamics:

    $\begin{bmatrix} \dot{\hat{x}}(t) \\ \dot{\hat{p}}(t) \end{bmatrix} = f[\hat{x}(t), \hat{p}(t), u(t)] + K\left\{z(t) - H\begin{bmatrix} \hat{x}(t) \\ \hat{p}(t) \end{bmatrix}\right\}$

    Multiple-Model Testing for System Identification

    Create a separate estimator for each hypothetical model:

    $\dot{\hat{x}}_1(t) = F_1\hat{x}_1(t) + G_1 u(t) + K_1[z(t) - H_1\hat{x}_1(t)]$

    $\dot{\hat{x}}_2(t) = F_2\hat{x}_2(t) + G_2 u(t) + K_2[z(t) - H_2\hat{x}_2(t)]$

    ...

    $\dot{\hat{x}}_n(t) = F_n\hat{x}_n(t) + G_n u(t) + K_n[z(t) - H_n\hat{x}_n(t)]$

    Choose the model with the minimum error residual:

    $J_i = \frac{1}{2}\epsilon^T\epsilon = \frac{1}{2}[z(t) - H_i\hat{x}_i(t)]^T[z(t) - H_i\hat{x}_i(t)], \quad i = 1, n$
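    The model-selection rule reduces to comparing residual costs across the parallel estimators, as in the sketch below; maintaining each estimate with its own filter is assumed to happen elsewhere.

```python
# Sketch of multiple-model identification: run one estimator per hypothesized
# model and pick the model whose residual cost J_i is smallest.
import numpy as np

def residual_cost(z, H_i, xhat_i):
    """J_i = 1/2 (z - H_i xhat_i)' (z - H_i xhat_i) for one model hypothesis."""
    eps = z - H_i @ xhat_i
    return 0.5 * float(eps.T @ eps)

def best_model(z, models):
    """models: list of (H_i, xhat_i) pairs maintained by parallel estimators."""
    costs = [residual_cost(z, H_i, xhat_i) for H_i, xhat_i in models]
    return int(np.argmin(costs)), costs
```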

  • Adaptive Control

    Adaptive Control System Design

    Control logic changes to accommodate changes or unknown parameters of the plant

    System identification

    Gain scheduling

    Learning systems

    Control law is nonlinear:

    $u(t) = c[z(t), a, y^*(t)]$

    $c[\cdot]$: Control law

    $x(t)$: State

    $z[x(t)]$: Measurement of state

    $a$: Parameters of operating point

    $y^*(t)$: Command input

    Operating Points Within a Flight Envelope

    Dynamic model is a function of altitude and airspeed

    Design LTI controllers throughout the flight envelope

    Gain Scheduling

    Proportional-integral controller with scheduled gains:

    $u(t) = C_F(a)\,y^* + C_B(a)\,\Delta x + C_I(a)\int\Delta y(t)\,dt \triangleq c[x(t), a, y^*(t)]$

    Scheduling variables, $a$: e.g., altitude, speed, properties of chemical process, ...
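    A minimal sketch of gain scheduling follows: the forward, feedback, and integral gains are designed at a few operating points and interpolated on the scheduling variable a; the breakpoints and gain values are illustrative assumptions.

```python
# Sketch of a gain-scheduled proportional-integral control law: gains are
# interpolated over a scheduling variable a (e.g., airspeed) between point designs.
import numpy as np

a_grid = np.array([100.0, 200.0, 300.0])       # scheduling-variable breakpoints (assumed)
CF_grid = np.array([0.8, 0.5, 0.3])            # forward gains designed at each point
CB_grid = np.array([2.0, 1.2, 0.7])            # state-feedback gains
CI_grid = np.array([0.4, 0.25, 0.15])          # integral gains

def scheduled_control(a, y_cmd, dx, integral_dy):
    """u = C_F(a) y* + C_B(a) dx + C_I(a) * integral of dy."""
    CF = np.interp(a, a_grid, CF_grid)         # linear interpolation between designs
    CB = np.interp(a, a_grid, CB_grid)
    CI = np.interp(a, a_grid, CI_grid)
    return CF * y_cmd + CB * dx + CI * integral_dy
```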

  • Cerebellar Model Articulation Controller (CMAC)

    Inspired by models of the cerebellum

    CMAC: two-stage mapping of a vector input to a scalar output

    First mapping: input space to association space; $s$ is fixed, $a$ is binary

    $s: x \rightarrow a$ (Input $\rightarrow$ Selector vector)

    Second mapping: association space to output space; $g$ contains learned weights

    $g: a \rightarrow y$ (Selector $\rightarrow$ Output)

    Example of Single-Input CMAC Association Space

    $x$ is in $(x_{min}, x_{max})$

    Selector vector is binary and has $N$ elements

    Receptive regions of association space map $x$ to $a$, analogous to neurons that fire in response to a stimulus

    $N_A$ = Number of receptive regions = $N + C - 1$ = dim($a$)

    $C$ = Generalization parameter = number of overlapping regions

    Input quantization = $(x_{max} - x_{min})/N$

    $a = [0\;0\;0\;1\;1\;1\;0\;0]^T$

    CMAC Output and Training

    CMAC output (i.e., control command) from the activated cells of $C$ associative-memory layers:

    $y_{CMAC} = \mathbf{w}^T\mathbf{a} = \sum_{i=j}^{j+C-1} w_{i,activated}, \quad j = \text{index of first activated region}$

    Least-squares training of the CMAC weights, $\mathbf{w}$ (analogous to synapses between neurons):

    $w_{j,new} = w_{j,old} + \frac{\eta}{C}\left[y_{desired} - \sum_{i=1}^{C} w_{i,old}\right]$

    $\eta$ is the learning rate and $w_j$ is an activated-cell weight

    Localized generalization and training

    [Diagram: 2-dimensional input space (n = 2) mapped to association memory with C = 3 overlapping layers]

    CMAC Output and Training

    In higher dimensions, the association space is dim($x$): a plane, cube, or hypercube

    Potentially large memory requirements

    Granularity (quantization) of output

    Variable generalization and granularity

    [Diagram: 2-dimensional association space, C = 3 layers]
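    A sketch of a single-input CMAC that follows the two-stage mapping and the averaged weight update described above; N, C, the input range, and the learning rate are assumed values.

```python
# Sketch of a single-input CMAC: binary selector vector over overlapping
# receptive regions, weighted sum for the output, and the averaged
# least-squares weight update.
import numpy as np

N, C = 20, 4                                  # quantization cells, generalization (assumed)
x_min, x_max = 0.0, 1.0                       # input range (assumed)
NA = N + C - 1                                # number of receptive regions = dim(a)
w = np.zeros(NA)                              # learned weights
eta = 0.1                                     # learning rate (assumed)

def selector(x):
    """First mapping s: x -> a (binary selector over C overlapping regions)."""
    j = int((x - x_min) / (x_max - x_min) * N)
    j = min(max(j, 0), N - 1)                 # index of first activated region
    a = np.zeros(NA)
    a[j:j + C] = 1.0
    return a

def cmac_output(x):
    """Second mapping g: a -> y, the sum of the activated weights."""
    return w @ selector(x)

def train(x, y_desired):
    """Averaged least-squares correction distributed over the C activated cells."""
    a = selector(x)
    error = y_desired - w @ a
    w[a > 0] += (eta / C) * error
```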

  • CMAC Control of a Fuel-Cell Pre-Processor (Iwan and Stengel)

    [Block diagram: fuel storage feeds a fuel processor (reformer or partial-oxidation reactor, shift reactor, and PrOx, with water and air inputs), which feeds the fuel-cell stack, power conditioning and motor control, batteries, and gear/motor-generator]

    Fuel cell produces electricity for electric motor

    Proton-exchange-membrane fuel cell converts hydrogen and oxygen to water and electrical power

    Steam reformer/partial oxidizer-shift reactor converts fuel (e.g., alcohol or gasoline) to H2, CO2, H2O, and CO; fuel flow rate is proportional to power demand

    CO poisons the fuel cell and must be removed from the reformate

    Catalyst promotes oxidation of CO to CO2 over oxidation of H2 in a Preferential Oxidizer (PrOx)

    PrOx reactions are nonlinear functions of catalyst, reformate composition, temperature, and air flow

    CMAC/PID Control System for Preferential Oxidizer

    [Hybrid control system block diagram: the H2-conversion error between the desired H2 conversion and the computed actual conversion drives a CMAC (ANN) element and a conventional PID element with gains scheduled on flow rate; their commands sum (airCMAC + airPID = airTOTAL) to set the PrOx air injection; PrOx operating conditions include reformate flow rate, inlet [CO], and inlet coolant temperature; actual H2 conversion is computed from airTOTAL, [H2]in, [H2]out, flow rate, and sensor dynamics; the CMAC is trained on-line from the conversion error]

    Summary of CMAC Characteristics

    Inputs and number of divisions:

    PrOx inlet reformate flow rate (95)

    PrOx inlet cooling temperature (80)

    PrOx inlet CO concentration (100)

    Output: PrOx air injection rate

    Associative layers, C: 24

    Number of associative memory cells/weights: 1,276; layer offsets: [1, 5, 7]

    Learning rate, $\eta$: ~0.01

    Sampling interval: 100 ms

  • Flow Rate and Hydrogen Conversion of CMAC/PID Controller

    H2 conversion command (across PrOx only): 1.5%

    Novel data, with and without pre-training [plot]

    Federal Urban Driving Cycle (FUDS)

    Comparison of PrOx Controllers on FUDS:

                    mean H2     maximum H2   mean CO    max. CO    net H2
                    error, %    error, %     out, ppm   out, ppm   output, %
    Fixed-Air         0.68        0.87          6.3        28        57.2
    Table Look-up     0.13        1.43          6.5        26        57.8
    PID               0.05        0.51          7.7        30        58.1
    CMAC/PID          0.02        0.16          7.3        26        58.1

    Reinforcement Learning

    Learn from success and failure

    Repetitive trials

    Reward correct behavior

    Penalize incorrect behavior

    Learn to control from a human operator

    http://en.wikipedia.org/wiki/Reinforcement_learning

    Next Time: Classification of Data Sets

  • Supplementary Material

    Dynamic Models for the Parameter Vector

    Unknown constant: $p(t)$ = constant

    $\dot{p}(t) = 0; \quad p(0) = p_0; \quad P_p(0) = P_{p_0}$

    Random $p(t)$ (integrated white noise):

    $\dot{p}(t) = w(t); \quad p(0) = 0; \quad P_p(0) = P_{p_0}$

    $E[w(t)] = 0; \quad E[w(t)w^T(\tau)] = Q_w\,\delta(t - \tau)$

    Linear dynamic system (Markov process):

    $\dot{p}(t) = Ap(t) + Bw(t) \triangleq f_p[p(t), w(t)]; \quad w(t) \sim N(0, Q_w)$
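    These three parameter models can be written as alternative choices of f_p(p, w) for the augmented-state estimator; in the sketch below, A and B for the Markov-process case are illustrative placeholders.

```python
# Sketch of the three parameter models as propagation functions f_p(p, w):
# unknown constant, integrated white noise (random walk), and a Markov process.
import numpy as np

def fp_constant(p, w):
    return np.zeros_like(p)              # p_dot = 0

def fp_random_walk(p, w):
    return w                             # p_dot = w(t), w ~ N(0, Qw)

A = -0.5 * np.eye(1)                     # Markov-process dynamics (assumed value)
B = np.eye(1)                            # disturbance input matrix (assumed value)

def fp_markov(p, w):
    return A @ p + B @ w                 # p_dot = A p + B w
```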

    Inputs for System Identification

    Transient inputs: step or square wave; impulse or pulse train

    Persistent excitation: random noise; sinusoidal frequency sweep