Upload
marko-petkovic
View
38
Download
0
Embed Size (px)
DESCRIPTION
EPLFL MPC JONES OPERATOR SPLITTING
Citation preview
Model Predictive Control
Lecture: Operator Splitting Methods for Fast MPC
Colin Jones
Laboratoire dAutomatique, EPFLSome material from Stephen Boyds lecture notes on ADMM
Outline
1. Duality
2. Dual Decomposition
3. Method of Multipliers
4. Alternating Direction Method of Multipliers
5. Common Patterns in Control
6. ADMM for MPC
Operator Splitting Methods for Fast MPC 102 Model Predictive Control ME-425
Duality
Primal problem: minz
f (z)
s.t. Az = b
Define the Lagrangian
L(z , ) = f (z) + T (Az b)
and the dual function
d() = minz
L(z , )
The dual problem is
max
d()
Recover the primal optimal solution from
z? = argminz L(z , ?)
Operator Splitting Methods for Fast MPC 103 Model Predictive Control ME-425
Properties of the Dual Function
d() is concave1
d() = minz
f (z) + T (Az b)
The dual function is the pointwise minimum of affine functions
d() f (z) for all and all z such that Az = bGiven a feasible z such that Az = b
f (z) = f (z) + T (Az b) minz
L(z , ) = d()
Dual function gives lower bounds on the optimal solution.
1This is true whether f is convex or notOperator Splitting Methods for Fast MPC 104 Model Predictive Control ME-425
Properties of the Dual Function
d() is concave1
d() = minz
f (z) + T (Az b)
The dual function is the pointwise minimum of affine functions
d() f (z) for all and all z such that Az = bGiven a feasible z such that Az = b
f (z) = f (z) + T (Az b) minz
L(z , ) = d()
Dual function gives lower bounds on the optimal solution.
1This is true whether f is convex or notOperator Splitting Methods for Fast MPC 105 Model Predictive Control ME-425
Dual ProblemLagrange dual problem (find the best lower bound)
max
d()
Always a convex optimization problem
max d() minAx=b f (x)If problem is convex, then (under mild assumptions), we have strong duality:
max
d() = minz
f (z) s.t. Az = b
We can solve the primal, or the dual (whichever is easier).
Note: Duality extends to problems with inequality constraints too!
Operator Splitting Methods for Fast MPC 106 Model Predictive Control ME-425
Example: Linearly Constrained QPs
minz
12zTQz + cT z
s.t. Az = b
Goal: Find a feasible z and a such that the dual function equals the primal
d() = minz
12zTQz + cT z + T (Az b)
Take the gradient and set it to zero: Qz+c+AT=0
= f (z)
Conditions for optimality:
Qz + c + AT = 0
Az = b
(z
)=
[Q AT
A 0
]1(cb
)
Operator Splitting Methods for Fast MPC 107 Model Predictive Control ME-425
Dual Ascent MethodDual problem:
max
d()
This is a convex, unconstrained optimization problem.
We would like to apply a gradient ascent approach:
k+1 = k + ckd(k)
How do we compute a gradient of the dual function?
(Note: Were making the strong assumption that the dual function isdifferentiable here. Similar procedure in the non-differentiable case too.)
Operator Splitting Methods for Fast MPC 108 Model Predictive Control ME-425
Gradient of the DualTheorem:If z = argminL(z , ), then Az b d()
Recall: g is a subgradient of a function h if and only if
h(x) gT (x y) h(y)
for all x and y .
d() (Az b)T ( ) = L(z , ) (Az b)T ( )= f (z) + T (Az b) (Az b)T ( )= f (z) + T (Az b) d()
Note: If d is differentiable, then d() = d()
Operator Splitting Methods for Fast MPC 109 Model Predictive Control ME-425
Dual Gradient MethodWe can compute the gradient of the dual and implement the gradient method:
xk+1 = argminx L(x , k)
k+1 = k + c(Axk+1 b)
This works, but requires a number of strong assumptions.
Operator Splitting Methods for Fast MPC 1010 Model Predictive Control ME-425
Outline
1. Duality
2. Dual Decomposition
3. Method of Multipliers
4. Alternating Direction Method of Multipliers
5. Common Patterns in Control
6. ADMM for MPC
Operator Splitting Methods for Fast MPC 1011 Model Predictive Control ME-425
Dual Decomposition
Suppose our problem has the form:
min f (x) + g(y)
s.t. Ax + By = d
Computing the dual function:
minx ,y
L(x , y , ) = minx ,y
f (x) + g(y) + T (Ax + By d)= min
x(f (x) + TAx) + min
y(g(y) + TBy) d
The Lagrangian function is separable for a fixed !
Algorithm becomes:
xk+1 = argminx f (x) + Tk Ax
yk+1 = argminy g(y) + Tk By
k+1 = k + c(Axk+1 + Byk+1 b)Minimizing f and g independently and in parallelOperator Splitting Methods for Fast MPC 1012 Model Predictive Control ME-425
Dual Decomposition
Benefits:
Can solve very large problems in parallel
min f and min g may be much simpler to solve than min f + g
Limitations:
The function value converges non-monotonically to the optimal valueDoesnt matter for control
Slow (sub-gradient method)
If objective is not strictly convex, then the primal iterates xk do notnecessarily converge
MPC objectives will almost never be stricty convex because they willinclude indicator functionsThe primal iterates are the control law these must converge!
Operator Splitting Methods for Fast MPC 1013 Model Predictive Control ME-425
Outline
1. Duality
2. Dual Decomposition
3. Method of Multipliers
4. Alternating Direction Method of Multipliers
5. Common Patterns in Control
6. ADMM for MPC
Operator Splitting Methods for Fast MPC 1014 Model Predictive Control ME-425
Augmented Lagrangian
min f (z)
s.t. Az = b
Add a penalty term to the cost function to make it strictly convex:
min f (z) +
2Az b2
s.t. Az = b
Note that this doesnt change the solution!
Recall: f (z) is strictly convex if
z1, z2,t (0, 1) f (tz1 + (1 t)z2) < tf (z1) + (1 t)f (z2)
Operator Splitting Methods for Fast MPC 1015 Model Predictive Control ME-425
Augmented Lagrangian
min f (z)
s.t. Az = b
Add a penalty term to the cost function to make it strictly convex:
min f (z) +
2Az b2
s.t. Az = b
Note that this doesnt change the solution!
Recall: f (z) is strictly convex if
z1, z2,t (0, 1) f (tz1 + (1 t)z2) < tf (z1) + (1 t)f (z2)
Operator Splitting Methods for Fast MPC 1016 Model Predictive Control ME-425
Dual FunctionThe (augmented) Lagrangian is:
L(z , ) = f (z) + T (Az b) + 2Az b2
The dual function is
d() = minz
L(z , )
Note that the dual function has changed
Theorem: Convex Analysis, Rockafellar (1970)If a convex program has a strictly convex objective, it has a unique solutionand its Lagrangian dual function is differentiable.
Convergence of the iterates
Differentiable function faster gradient method, rather than sub-gradient
Operator Splitting Methods for Fast MPC 1017 Model Predictive Control ME-425
Augmented Lagrangian Method
xk+1 = argminx f (x) + Tk (Ax b) +
2Ax b2
k=1 = k + (Axk+1 b)
We want to apply this to problems of the form
min f (x) + g(y)
s.t. Ax + By = d
Problem: Augmented Lagrangian doesnt decompose
minx ,y
f (x) + g(y) + Tk (Ax + By b) +
2Ax + By b2
Couples x and y
Operator Splitting Methods for Fast MPC 1018 Model Predictive Control ME-425
Augmented Lagrangian Method
Positive:
Converges under extremely loose conditions:non-differentiable functions, unbounded functions / indicator functions, etc
Negative:
Does not decompose / parallelize due to the quadratic term
Operator Splitting Methods for Fast MPC 1019 Model Predictive Control ME-425
Outline
1. Duality
2. Dual Decomposition
3. Method of Multipliers
4. Alternating Direction Method of Multipliers
5. Common Patterns in Control
6. ADMM for MPC
Operator Splitting Methods for Fast MPC 1020 Model Predictive Control ME-425
Alternating Direction Method of Multipliers
min f (x) + g(y)
s.t. Ax + By = b
L(x , y , ) = f (x) + g(y) + T (Ax + By b) + 2Ax + By b2
ADMM:
xk+1 = argminx L(x , yk , k)
y k+1 = argminy L(xk+1, y , k)
k+1 = k + (Axk+1 + By k+1 b)
Idea: Approximate the computation of the dual (sort of) via one step ofGauss-Seidel / block coordinate descent.Operator Splitting Methods for Fast MPC 1021 Model Predictive Control ME-425
ADMM: A Cleaner FormulationCombine linear and quadratic terms:
L(x , y , ) = f (x) + g(y) + T (Ax + By b) + 2Ax + By b2 (1)
= f (x) + g(y) +
2Ax + By b + 2 (2)
where = 1
ADMM (scaled form):
xk+1 = argminx f (x) +
2Ax + By k b + k2
y k+1 = argminy g(y) +
2Axk+1 + By b + k2
k+1 = k + Axk+1 + By k+1 b
This is the form that we will use in the exercises
Operator Splitting Methods for Fast MPC 1022 Model Predictive Control ME-425
Convergence of ADMM
If
f , g convex, closed, proper
L has a saddle point (i.e., an optimal solution exists).Note that this requires that the problem is feasible!
then
iterates approach feasibility Axk + By k b 0 objective approaches optimal valuef (xk) + g(y k) minx ,y f (x) + g(y) s.t. Ax + By = c
Operator Splitting Methods for Fast MPC 1023 Model Predictive Control ME-425
Outline
1. Duality
2. Dual Decomposition
3. Method of Multipliers
4. Alternating Direction Method of Multipliers
5. Common Patterns in Control
6. ADMM for MPC
Operator Splitting Methods for Fast MPC 1024 Model Predictive Control ME-425
Easily Evaluated Updates
Given a function f (x), we need to compute
x+ = argminx f (x) +
2Ax v2
If A = I (common), then this is called the proximal operator of f
proxf , (v) = argminx f (x) +
2x v2
We types of functions f can we evaluate this easily?
Operator Splitting Methods for Fast MPC 1025 Model Predictive Control ME-425
Common Patterns in Control. Upper / lower bounds
f (x) =
x l0 l x u x u
proxf , (v) = argminx f (x) +
2x v2
= argminx x v2s.t. l x u
=
l v lv l v uu v u
Evaluation of the proximal operator is trivial!
Operator Splitting Methods for Fast MPC 1026 Model Predictive Control ME-425
Common Patterns in Control. Polytopic constraints
min f (x)
s.t. Hx h
Re-write using slack variables:
min f (x) + g(s)
s.t. Hx + s = h
where g(s) is the indicator function for the positive orthant
g(s) =
{0 s 0 otherwise
proxg, (s) = max{s, 0}
Operator Splitting Methods for Fast MPC 1027 Model Predictive Control ME-425
Common Patterns in Control. Quadratic function
f (x) =12xTQx + cT x
proxf , (v) = argminx12xTQx + cT x +
2(x v)T (x v)
Take gradient, set to zero
Qx + c + (x v) = 0proxf , (v) = (Q + I )
1(v c)
Operator Splitting Methods for Fast MPC 1028 Model Predictive Control ME-425
Other Common PatternsMany other common constraints and functions have nice proximal operators
Ellipsoidal / ball-constraints
Vector norms: 1, 2, inf norms Matrix norms: Frobenius-norm, 2norm, Nuclear-norm Standard convex cones: second-order cone, semi-definite cone, positiveorthant, etc
Operator Splitting Methods for Fast MPC 1029 Model Predictive Control ME-425
Distributed Optimization
Separable cost function with shared variables
min
fi (xi )
s.t. xi = z for all i
Note that g(z) = 0.
Augmented Lagrangian is:
L(x0, . . . , xn, ) =
fi (xi ) +
2xi z + i2
ADMM:
xk+1i = argminxi fi (xi ) +
2xi zk + ki 2 Parallel
zk+1 = argminz
2xk+1i z + ki 2
=1n
xk+1i +
ki Consensus
k+1i = ki + x
k+1i + z
k+1 Parallel
Operator Splitting Methods for Fast MPC 1030 Model Predictive Control ME-425
Outline
1. Duality
2. Dual Decomposition
3. Method of Multipliers
4. Alternating Direction Method of Multipliers
5. Common Patterns in Control
6. ADMM for MPC
Operator Splitting Methods for Fast MPC 1031 Model Predictive Control ME-425
Linear Quadratic Predictive Control
14
Assumption: Prox operators for X and U are simple(Also possible for more complex sets)
Linear dynamics Quadratic stage costs Simple stage constraints
How to define functions f and g?
minx,u
N1i=0
x iQxi + uiRui
Z[ xi+1 = Axi + Buixi X, ui U
Sequential Convex Program
15
Make a copy of states and inputsmin
x,u
N1i=0
x iQxi + uiRui
Z[ xi+1 = Axi + Buixi = xi , ui = ui
xi X, ui U
Sequential Convex Program
16
minx,u
N1i=0
x iQxi + uiRui
Z[ xi+1 = Axi + Buixi = xi , ui = ui
xi X, ui U
f (x,u) Linear quadratic regulator
Sequential Convex Program
17
minx,u
N1i=0
x iQxi + uiRui
Z[ xi+1 = Axi + Buixi = xi , ui = ui
xi X, ui U
f (x,u) Linear quadratic regulator
Linear coupling constraints
Sequential Convex Program
18
minx,u
N1i=0
x iQxi + uiRui
Z[ xi+1 = Axi + Buixi = xi , ui = ui
xi X, ui U xi , ui xi , ui
f (x,u) Linear quadratic regulator
Simple constraints
Linear coupling constraints
Sequential Convex Program
19
minx,u
N1i=0
x iQxi + uiRui
Z[ xi+1 = Axi + Buixi = xi , ui = ui
xi X, ui U xi , ui xi , ui
g(xi , ui)
Sequential Convex Program Proximal Operators
20
2. Stage constraints Box (upper/lower bounds) Clipping Sphere Scaling
Also possible with additional scaling: Polyhedron Ellipse
1. LQR (Linearly constrained least-squares)
proxf (x , u) = minx,u
N1i=0
x iQxi + uiRui +
2xi xi22 +
2ui ui22
Z[ xi+1 = Axi + Bui
= M
x
u
.
Putting it Together: Sequential Convex Programming
21
minx,u
N1i=0
x iQxi + uiRui
Z[ xi+1 = Axi + Buixi = xi , ui = ui
xi X, ui U
min f (x) + g(y)
Z[ Ax + By = c
xk+1
uk+1
= M
xk + k
uk + k
xk+1 = X(x
k+1) uk+1 = U(uk+1)
k+1 = k + xk+1 xk+1k+1 = k + uk+1 uk+1
Multiplication
Clipping
Addition
Second-Order Example
22
k = 1
)SHJR! x6YHUNL! x
Second-Order Example
23
k = 2
)SHJR! x6YHUNL! x
Second-Order Example
24
k = 3
)SHJR! x6YHUNL! x
Second-Order Example
25
k = 5
)SHJR! x6YHUNL! x
Second-Order Example
26
k = 10
)SHJR! x6YHUNL! x
Second-Order Example
27
k = 15
)SHJR! x6YHUNL! x
Second-Order Example
28
k = 20
)SHJR! x6YHUNL! x
Second-Order Example
29
k = 30
)SHJR! x6YHUNL! x
Second-Order Example
30
k = 40
)SHJR! x6YHUNL! x
Second-Order Example
31
k = 50
)SHJR! x6YHUNL! x
Second-Order Example
32
k = 60
)SHJR! x6YHUNL! x
Second-Order Example
33
k = 70
)SHJR! x6YHUNL! x
Second-Order Example
34
k = 75
)SHJR! x6YHUNL! x
10 20 30 40 50 60 70 80 90 1000
0.5
1
1.5
2
Solve
Tim
e (m
s)
Iteration
Example: Quad-Copter
35
7 states, 4 inputs, horizon 8 Upper/lower bounds on states and inputs Ellipsoidal terminal set Algorithm: Fast AMA
Max: 1.5ms
Mean: 250s
Computation time sensitive to number of active constraints
Worst-case analysis critical!
Time [Ye, et al, 2013]
Example: AC/DC Converter
36
AC grid DC source
Switched inverter (150 kHz)CL filter
[Richter, et al, 2010]
min1
2uTHu + g(x, xss , uss , w)
T u
Z[uk U() uss , k = 0, . . . , N 1
0 100 200 300 400 500
100 ns1 us
10 us100 us
1 ms
Performance of Auto-Tuned FGM on 2.5GHz PC
37
Time step
CPLEX: 0.51 ms
0 100 200 300 400 500
100 ns1 us
10 us100 us
1 ms
Performance of Auto-Tuned FGM on 2.5GHz PC
38
Time step
CPLEX: 0.51 ms
0 100 200 300 400 500
100 ns1 us
10 us100 us
1 ms
Fast gradient: 360ns
On average 1400x faster than CPLEX
[Richter, et al, 2010]
Exercise: Box-constrained QP via ADMM
minz
12zTQz + cT z
s.t. Az = b
l z u
minz
12zTQz + cT z
s.t. Az = b
l z uz = z
zupdate: zk+1 = argminz12zTQz + cT z +
2z zk + k2
s.t. Az = b(zk+1
?
)=
[Q + I AT
A 0
]1(c + (xk k)b
)
zupdate: zk+1 = min{max{zk+1 + k , l}, u}
Dual update: k+1 = k + zk+1 zk+1
Operator Splitting Methods for Fast MPC 1057 Model Predictive Control ME-425
Simple prox operators
Least-squares
Toolbox for Deployment of Embedded Optimization
64Release date: Very soon ;)
% Optimization variablesx = splitvar(n, N);u = splitvar(m, N-1);x(:,1) = parameter(n,1); % Objective and dynamicsobj = 0;for i = 1:N-1 x(:,i+1) == A*x(:,i) + B*u(:,i); obj = obj + x(:,i)'*Q*x(:,i) + ...
u(:,i)'*R*u(:,i);endobj = obj + x(:,end)'*x(:,end); % set up constraints-5
Splitting Methods for Control
Main idea:
Separate complex optimization into a sequence of simpler operations
Use dual to push the individual problems into consensus
Key properties
Centralized optimization : Each iteration is extremely cheap
Parallel optimization : Sub-problems can be solved in parallel
Major downside
Number of iterations may be very high. Lots of ongoing research to dealwith this issue
Operator Splitting Methods for Fast MPC 1059 Model Predictive Control ME-425
DualityDual DecompositionMethod of MultipliersAlternating Direction Method of MultipliersCommon Patterns in ControlADMM for MPC