32
Performance Bounds for Constrained Linear Stochastic Control Yang Wang and Stephen Boyd Electrical Engineering Department Stanford University Workshop on Optimization and Control with Applications, Harbin, June 9 2009

Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Performance Bounds for

Constrained Linear Stochastic Control

Yang Wang and Stephen Boyd

Electrical Engineering DepartmentStanford University

Workshop on Optimization and Control with Applications, Harbin, June 9 2009

Page 2: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Outline

• constrained linear stochastic control problem

• some heuristic control schemes

– projected linear control– control-Lyapunov– certainty-equivalent model predictive control (MPC)

• the linear quadratic case

• performance bound

• performance bound parameter choices for control schemes

• numerical examples

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 1

Page 3: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Linear stochastic system

• linear dynamical system with process noise:

xt+1 = Axt + But + wt, t = 0, 1, . . . ,

– xt ∈ Rn is the state– ut ∈ U is the control input– U ⊂ Rm is the input constraint set, with 0 ∈ U– wt ∈ Rn is zero mean IID process noise, Ewtw

T

t= W

• state feedback control policy:

ut = φ(xt), t = 0, 1, . . . ,

φ : Rn → U is the state feedback function

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 2

Page 4: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Objective

• objective is average stage cost:

J = lim supT→∞

1

TE

T−1∑

t=0

(ℓx(xt) + ℓu(ut))

– ℓx : Rn → R is state stage cost function– ℓu : U → R is the input state cost function

• ℓx, ℓu, U need not be convex

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 3

Page 5: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Stochastic control problem

• stochastic control problem: choose feedback function φ to minimize J

• infinite dimensional nonconvex optimization problem

• problem data:

– dynamics and input matrices A, B– distribution of process noise wt

– state and input cost functions ℓx, ℓu

– input constraint set U

• φ⋆ denotes an optimal feedback function

• J⋆ denotes optimal objective value

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 4

Page 6: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

‘Solution’ via dynamic programming

• find V ⋆ : Rn → R and α with

V ⋆(z) + α = minv∈U

(ℓu(v) + EV ⋆(Az + Bv + wt))

(expectation is over wt)

• optimal feedback function is then

φ⋆(z) = argminv∈U

(ℓu(v) + EV ⋆(Az + Bv + wt))

• optimal value of stochastic control problem is J⋆ = α

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 5

Page 7: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Stochastic control problem

• generally very hard to solve(even more: how would we represent a general function φ?)

• can be effectively solved

– when the problem dimensions are very small, e.g., n = m = 1– when U = Rm and ℓx, ℓu are convex quadratic;

in this case optimal policy is linear: φ⋆(z) = Kz

• many suboptimal methods have been proposed

– can evaluate J for a given φ via Monte Carlo simulation– but how suboptimal is it?

• this talk: an effective method for finding a (good) lower bound on J⋆

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 6

Page 8: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Projected linear state feedback

• a simple suboptimal policy:

φpl(z) = P(Kplz)

– Kpl ∈ Rm×n is a gain matrix (to be chosen)– P is projection onto U

• when U is a box, i.e., U = {u | ‖u‖∞ ≤ Umax}, reduces to saturatedlinear state feedback

φpl(z) = Umaxsat((1/Umax)Kplz)

sat is (entrywise) unit saturation

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 7

Page 9: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Control-Lyapunov policy

• control-Lyapunov policy is

φclf(z) = argminv∈U

(ℓu(v) + EVclf(Az + Bv + wt))

– Vclf : Rn → R (which is to be chosen) is the control-Lyapunovfunction

– when Vclf = V ⋆, this is optimal policy

• when Vclf is quadratic, the control-Lyapunov policy simplifies to

φclf(z) = argminv∈U

(ℓu(v) + Vclf(Az + Bv))

since Ewt = 0, and term involving EwtwT

t= W is constant

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 8

Page 10: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Certainty-equivalent model predictive control (MPC)

• φmpc(z) is found by solving (possibly approximately)

minimize∑

T−1

τ=0 (ℓx(xτ) + ℓu(vτ)) + Vmpc(xT )subject to xτ+1 = Axτ + Bvτ , τ = 0, . . . , T − 1

vτ ∈ U , τ = 0, . . . , T − 1x0 = z

– variables are v0, . . . , vT−1, x0, . . . , xT

– Vmpc : Rn → R is the terminal cost (to be chosen)– T is the planning horizon (also to be chosen)

• let solution be v⋆

0, . . . , v⋆

T−1, x⋆

0, . . . , x⋆

T

• MPC policy is φmpc(z) = v⋆

0

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 9

Page 11: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Parameters in heuristic control policies

• performance of suboptimal policies depends on choice of parameters(Kpl, Vclf, Vmpc and T )

• one choice for Vclf, Vmpc: (quadratic) value function for someunconstrained linear quadratic problem

• one choice for Kpl: optimal gain matrix for some unconstrained linearquadratic problem

• we will suggest some parameters later . . .

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 10

Page 12: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

The performance bound

our method:

• computes a lower bound J lb ≤ J⋆ using convex optimization(hence is tractable)

• bound is computed for each specific problem instance

• (at this time) cannot guarantee tightness of bound

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 11

Page 13: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Judging a heuristic policy

• suppose we have a heuristic policy φ with objective J(evaluated by Monte Carlo, say)

• since J lb ≤ J⋆ ≤ J , if J − Jlb is small, then

– policy φ is nearly optimal– bound J lb is nearly tight

• if J − J lb is big, then for this problem instance, either

– policy is poor, or,– bound is poor (or both)

• examples suggest that J − J lb is often small

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 12

Page 14: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Unconstrained linear quadratic control

• can effectively solve stochastic control problem when

– U = Rm (no constraints)– ℓx(z) = zTQz, ℓu(v) = vTRv, Q � 0, R � 0

• optimal cost is J⋆

lq = Tr(P ⋆

lqW )

• optimal state feedback function is φ⋆(z) = K⋆

lqz, where

K⋆

lq = −(R + BTP ⋆

lqB)−1BTP ⋆

lqA

• P ⋆

lq is positive semidefinite solution of ARE

P ⋆

lq = Q + ATP ⋆

lqA − ATP ⋆

lqB(R + BTP ⋆

lqB)−1BTP ⋆

lqA

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 13

Page 15: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Linear quadratic control via LMI/SDP

• can characterize J⋆

lq and P ⋆

lq via the semidefinite program (SDP)

maximize Tr(PW )

subject to P � 0[

R + BTPB BTPAATPB Q + ATPA − P

]

� 0

– variable is P– optimal point is P = P ⋆

lq; optimal value is J⋆

lq

• solution does not depend on W , as long as W ≻ 0

• constraints are convex in (P, Q,R), so J⋆

lq(Q,R) is a concave functionof (Q,R)

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 14

Page 16: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Basic bound

• suppose Q � 0, R � 0, s satisfy

zTQz + vTRv + s ≤ ℓx(z) + ℓu(v) for all z ∈ Rn, v ∈ U

i.e., quadratic stage costs are everywhere smaller than ℓx + ℓv

• then J⋆

lq(Q, R) + s is a lower bound on J⋆

• follows from monotonicity of stochastic control cost w.r.t. stage costs

• lefthand side is optimal value of unconstrained quadratic problem

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 15

Page 17: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Optimizing the bound

• can optimize the lower bound over Q, R, s by solving

maximize J⋆

lq(Q,R) + s

subject to Q � 0, R � 0,zTQz + vTRv + s ≤ ℓx(z) + ℓu(v) for all z ∈ Rn, v ∈ U

• a convex optimization problem

– objective is concave– constraints are convex– last constraint is convex in Q, R, s for each z and v

• last constraint is semi-infinite, parameterized by the (infinite) setz ∈ Rn, u ∈ U

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 16

Page 18: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Optimizing the bound

• semi-infinite constraint makes problem difficult in general

• can solve exactly in a few cases

• in other cases, can replace semi-infinite constraint with conservativeapproximation, which still gives a lower bound

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 17

Page 19: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Quadratic stage cost and finite input set

• can solve optimization problem exactly when

– ℓx(z) = zTQ0z, ℓu(v) = vTR0v, Q � 0, R � 0– U = {u1, . . . , uK} (finite input constraint set)

• constraint

zTQz + vTRv + s ≤ ℓx(z) + ℓu(v) for all z ∈ Rn, v ∈ U

becomes

Q � Q0, uT

iRui + s ≤ uT

iR0ui, i = 1, . . . , K

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 18

Page 20: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

• to optimize the bound we solve SDP (with variables P , Q, R, s)

maximize Tr(PW ) + s

subject to P � 0, Q � 0, R � 0, Q � Q0[

R + BTPB BTPAATPB Q + ATPA − P

]

� 0

uT

iRui + s ≤ uT

iR0ui, i = 1, . . . , K

• monotone in Q, so we can set Q = Q0 w.l.o.g.

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 19

Page 21: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

S-procedure relaxation

• suppose stage costs are quadratic

• suppose we can find R1, . . . , RM and s1, . . . , sM for which

U ⊆ U = {v | vTRiv + si ≤ 0, i = 1, . . . , M}

• a sufficient condition for

zTQz + vTRv + s ≤ ℓx(z) + ℓu(v) for all z ∈ Rn, v ∈ U

is

zTQz + vTRv + s ≤ zTQ0z + vTR0v for all z ∈ Rn, v ∈ U

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 20

Page 22: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

• equivalent to Q � Q0 and

vTRiv + si ≤ 0, i = 1, . . . , M =⇒ vTRv + s ≤ vTR0v

• which is implied by Q � Q0 and the existence of λ1, . . . , λM ≥ 0 with

R − R0 �M∑

i=1

λiRi, s ≤M∑

i=1

λisi

(by the S-procedure)

• so J⋆

lq(Q,R) + s is a still a lower bound on J⋆

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 21

Page 23: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

• to optimize the bound we solve the SDP

maximize Tr(PW ) + s0

subject to P � 0, Q � 0, R � 0, Q � Q0

[

R + BTPB BTPAATPB Q + ATPA − P

]

� 0

R − R0 �∑

M

i=1 λiRi, s0 ≤∑

M

i=1 λisi

λi ≥ 0, i = 1, . . . , M

with variables P , Q, R, λ1, . . . , λM , s0, . . . , sM

• can set Q = Q0 w.l.o.g.

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 22

Page 24: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Suboptimal control policies

• optimizing the lower bound gives Plb

• can interpret Tr(PlbW ) as optimal cost of an unconstrained quadraticproblem that approximates (and underestimates) our problem

• suggests thatVlb(z) = zTPlbz,

andKlb = −(Rlb + BTPlbB)−1BTPlbA

are good choices of parameters for suboptimal control policies

• examples show this is the case

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 23

Page 25: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Numerical examples

• illustrate bounds for 3 examples

– small problem with trilevel inputs– large problem with box constraints– discretized mechanical control system

• compare lower bound with various heuristic policies

– projected linear state feedback– model predictive control– control-Lyapunov policy

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 24

Page 26: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Small problem with trilevel inputs

• n = 8, m = 2

• A, B matrices randomly generated; A scaled so maxi |λi(A)| = 1

• quadratic stage costs with R0 = I, Q0 = I

• wt ∼ N (0, 0.25I)

• finite input set: U = {−0.2, 0, 0.2}2

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 25

Page 27: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Large problem with box constraints

• n = 30, m = 10

• A, B matrices randomly generated; A scaled so maxi |λi(A)| = 1

• quadratic stage costs with R0 = I, Q0 = I

• wt ∼ N (0, 0.25I)

• box input constraints: U = {v ∈ Rm | ‖v‖∞ ≤ 0.1}

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 26

Page 28: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Discretized mechanical control system

u1 u2

u3

• 6 masses connected by springs; 3 input tensions between masses

• quadratic stage costs with R0 = I, Q0 = I

• wt uniform on [−0.5, 0.5]

• box input constraints: U = {v ∈ Rm | ‖v‖∞ ≤ 0.1}

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 27

Page 29: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Heuristic policies

• projected linear state feedback with Kpl = K⋆

lq

• control-Lyapunov policy with Vclf(z) = zTPlbz

• model predictive control (MPC) with T = 30, Vmpc(z) = zTPlbz

(for trilevel example we solve convex relaxation with u(t) ∈ [−0.2, 0.2],then round value to {−0.2, 0, 0.2})

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 28

Page 30: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Results

small trilevel large random masses

PLSF 12.9 31.3 269.8

CLF 10.8 25.6 61.1

MPC 10.9 25.7 58.9

J lb 9.1 23.8 43.2

• control-Lyapunov with Plb and MPC achieve similar performance

• control-Lyapunov policy can be computed very fast (in tens ofmicroseconds); MPC policy can be computed in milliseconds

• bound Jlb is reasonably close to J for these examples

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 29

Page 31: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Conclusions

• we’ve shown how to find lower bounds on optimal performance forconstrained linear stochastic control problems

• requires solution of convex optimization problem, hence is tractable

• provides only provable lower bound on optimal performance that we areaware of

• as a by-product, provides excellent choice for quadraticcontrol-Lyapunov function

• in many cases, gives everything you want:

– a provable lower bound on performance– a relatively simple heuristic policy that comes close

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 30

Page 32: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

References

• Wang & Boyd, Performance Bounds for Linear Stochastic Control,Systems Control Letters 58(3):178–182, 2008

• Wang & Boyd, Performance Bounds and Suboptimal Policies for LinearStochastic Control via LMIs, (manuscript)

• Cogill & Lall, Suboptimality Bounds in Stochastic Control: A QueueingExample, Proceedings ACC p1642–1647, 2006

(similar results in MDP setting)

• Lincoln & Rantzer, Relaxing Dynamic Programming, IEEE T-AC51(8):1249-2006, 2006

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 31