Ellipsoid Method
• ellipsoid method
• convergence proof
• inequality constraints
• feasibility problems
Prof. S. Boyd, EE392o, Stanford University
Challenges in cutting-plane methods
• can be difficult to compute appropriate next query point
• localization polyhedron grows in complexity as algorithm progresses
can get around these challenges . . .
ellipsoid method is another approach
• developed in 70s by Shor and Yudin
• used in 1979 by Khachian to give polynomial time algorithm for LP
Ellipsoid algorithm
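idea: localize $x^\star$ in an ellipsoid instead of a polyhedron
1. at iteration $k$ we know $x^\star \in \mathcal{E}^{(k)}$
2. set $x^{(k+1)} := \mathrm{center}(\mathcal{E}^{(k)})$; evaluate $\nabla f(x^{(k+1)})$ (or $g^{(k)} \in \partial f(x^{(k+1)})$)
3. hence we know
$$x^\star \in \mathcal{E}^{(k)} \cap \{ z \mid \nabla f(x^{(k+1)})^T (z - x^{(k+1)}) \le 0 \}$$
(a half-ellipsoid)
4. set $\mathcal{E}^{(k+1)}$ := minimum volume ellipsoid covering $\mathcal{E}^{(k)} \cap \{ z \mid \nabla f(x^{(k+1)})^T (z - x^{(k+1)}) \le 0 \}$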
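[Figure: ellipsoid $\mathcal{E}^{(k)}$ with center $x^{(k+1)}$, the gradient $\nabla f(x^{(k+1)})$ at the center, and the covering ellipsoid $\mathcal{E}^{(k+1)}$]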
compared to cutting-plane method:
• localization set doesn’t grow more complicated
• easy to compute query point
• but, we add unnecessary points in step 4
Properties of ellipsoid method
• reduces to bisection for n = 1
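• simple formula for $\mathcal{E}^{(k+1)}$ given $\mathcal{E}^{(k)}$, $\nabla f(x^{(k+1)})$
• $\mathcal{E}^{(k+1)}$ can be larger than $\mathcal{E}^{(k)}$ in diameter (max semi-axis length), but is always smaller in volume
• $\mathrm{vol}(\mathcal{E}^{(k+1)}) < e^{-\frac{1}{2n}} \, \mathrm{vol}(\mathcal{E}^{(k)})$
(note that volume reduction factor depends on $n$)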
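A quick numerical check of the volume bound, as a sketch rather than part of the original slides. It assumes the closed-form ratio obtained by applying the minimum-volume update to the unit ball, $\frac{n}{n+1}\left(\frac{n^2}{n^2-1}\right)^{(n-1)/2}$, and compares it with $e^{-1/(2n)}$. The function names are illustrative.

```python
import math

def volume_ratio(n: int) -> float:
    """vol(E+)/vol(E) for one minimum-volume update in dimension n > 1."""
    return (n / (n + 1)) * (n * n / (n * n - 1)) ** ((n - 1) / 2)

def volume_bound(n: int) -> float:
    """The bound e^{-1/(2n)} quoted on the slide."""
    return math.exp(-1 / (2 * n))

# Both values approach 1 as n grows, so each cut removes less volume.
for n in (2, 10, 100):
    print(n, volume_ratio(n), volume_bound(n))
```

The ratio stays strictly below the bound, but the gap closes quickly with $n$, which is the dependence on $n$ noted above.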
Example
[Figures: the ellipsoids at iterations 0 through 5, with centers $x^{(0)}, \ldots, x^{(5)}$]
Updating the ellipsoid
$$\mathcal{E}(x, A) = \{ z \mid (z - x)^T A^{-1} (z - x) \le 1 \}$$
[Figure: ellipsoid $\mathcal{E}$ with center $x$ and cut direction $g$, and the updated ellipsoid $\mathcal{E}^+$ with center $x^+$]
(for $n > 1$) minimum volume ellipsoid containing
$$\mathcal{E} \cap \{ z \mid g^T (z - x) \le 0 \}$$
is given by
$$x^+ = x - \frac{1}{n+1} A \tilde g$$
$$A^+ = \frac{n^2}{n^2 - 1} \left( A - \frac{2}{n+1} A \tilde g \tilde g^T A \right)$$
where $\tilde g \triangleq g / \sqrt{g^T A g}$
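The update formula can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the course: the function name `ellipsoid_update` and the list-of-lists matrix representation are assumptions, and $A$ is assumed symmetric positive definite so that $A \tilde g \tilde g^T A = (A \tilde g)(A \tilde g)^T$.

```python
import math

def ellipsoid_update(x, A, g):
    """Return (x_plus, A_plus) for the cut g^T (z - x) <= 0; requires n > 1."""
    n = len(x)
    # A g and the normalization sqrt(g^T A g)
    Ag = [sum(A[i][j] * g[j] for j in range(n)) for i in range(n)]
    scale = math.sqrt(sum(g[i] * Ag[i] for i in range(n)))
    Agt = [v / scale for v in Ag]  # A g~, with g~ = g / sqrt(g^T A g)
    # x+ = x - A g~ / (n + 1)
    x_plus = [x[i] - Agt[i] / (n + 1) for i in range(n)]
    # A+ = n^2/(n^2 - 1) * (A - 2/(n + 1) * (A g~)(A g~)^T)
    c = n * n / (n * n - 1)
    A_plus = [[c * (A[i][j] - 2 / (n + 1) * Agt[i] * Agt[j]) for j in range(n)]
              for i in range(n)]
    return x_plus, A_plus

# Unit disk in R^2, cut direction g = (1, 0): keep the half with z_1 <= 0.
x_plus, A_plus = ellipsoid_update([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
print(x_plus, A_plus)
```

For this example the new center is $(-1/3, 0)$ and $A^+ = \mathrm{diag}(4/9, 4/3)$: the ellipsoid shrinks along the cut direction and stretches in the orthogonal one.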
Stopping criterion
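$x^\star \in \mathcal{E}^{(k)}$, so
$$\begin{aligned}
f(x^\star) &\ge f(x^{(k)}) + \nabla f(x^{(k)})^T (x^\star - x^{(k)}) \\
&\ge f(x^{(k)}) + \inf_{x \in \mathcal{E}^{(k)}} \nabla f(x^{(k)})^T (x - x^{(k)}) \\
&= f(x^{(k)}) - \sqrt{\nabla f(x^{(k)})^T A^{(k)} \nabla f(x^{(k)})}
\end{aligned}$$
simple stopping criterion:
$$\sqrt{\nabla f(x^{(k)})^T A^{(k)} \nabla f(x^{(k)})} \le \epsilon$$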
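The final equality in the lower bound uses the closed form for the minimum of a linear function over an ellipsoid. A short derivation, writing $g$ for the gradient at the center $x$ and parametrizing the ellipsoid by the unit ball (the variable $u$ is introduced here for illustration, and $A$ is assumed symmetric positive definite):

```latex
\begin{aligned}
\mathcal{E}(x, A) &= \{\, x + A^{1/2} u \mid \|u\|_2 \le 1 \,\}, \\
\inf_{z \in \mathcal{E}(x, A)} g^T (z - x)
  &= \inf_{\|u\|_2 \le 1} (A^{1/2} g)^T u
   = -\|A^{1/2} g\|_2
   = -\sqrt{g^T A g}.
\end{aligned}
```

For $g \ne 0$ the infimum is attained at $u = -A^{1/2} g / \|A^{1/2} g\|_2$, by the Cauchy-Schwarz inequality.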
[Figure: $f(x^{(k)})$ and the lower bound $f(x^{(k)}) - \sqrt{\nabla f(x^{(k)})^T A^{(k)} \nabla f(x^{(k)})}$ versus iteration $k$ (0 to 30), compared with $f^\star$]
more sophisticated stopping criterion: $U_k - L_k \le \epsilon$, where
$$U_k = \min_{i \le k} f(x^{(i)})$$
$$L_k = \max_{i \le k} \left( f(x^{(i)}) - \sqrt{\nabla f(x^{(i)})^T A^{(i)} \nabla f(x^{(i)})} \right)$$
[Figure: upper bound $U_k$ and lower bound $L_k$ versus iteration $k$ (0 to 30), compared with $f^\star$]
Basic ellipsoid algorithm
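ellipsoid described as $\mathcal{E}(x, A) = \{ z \mid (z - x)^T A^{-1} (z - x) \le 1 \}$
given ellipsoid $\mathcal{E}(x, A)$ containing $x^\star$, accuracy $\epsilon > 0$
repeat
1. evaluate $\nabla f(x)$ (or $g \in \partial f(x)$)
2. if $\sqrt{\nabla f(x)^T A \nabla f(x)} \le \epsilon$, return($x$)
3. update ellipsoid
3a. $\tilde g := \nabla f(x) / \sqrt{\nabla f(x)^T A \nabla f(x)}$
3b. $x := x - \frac{1}{n+1} A \tilde g$
3c. $A := \frac{n^2}{n^2 - 1} \left( A - \frac{2}{n+1} A \tilde g \tilde g^T A \right)$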
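The steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the course's implementation: the function name, the `max_iter` safeguard, and the quadratic test problem are all hypothetical, and the routine returns the best point seen (the point achieving $U_k$) instead of the final center.

```python
import math

def ellipsoid_minimize(f, grad, x, A, eps, max_iter=1000):
    """Basic ellipsoid method for n > 1; returns (best point seen, its value)."""
    n = len(x)
    best_x, best_f = list(x), f(x)
    for _ in range(max_iter):
        fx = f(x)
        if fx < best_f:                      # not a descent method: track the best
            best_x, best_f = list(x), fx
        g = grad(x)                          # step 1
        Ag = [sum(A[i][j] * g[j] for j in range(n)) for i in range(n)]
        gAg = sum(g[i] * Ag[i] for i in range(n))
        if gAg <= eps * eps:                 # step 2: sqrt(g^T A g) <= eps
            break
        s = math.sqrt(gAg)
        Agt = [v / s for v in Ag]            # step 3a, stored as A g~
        x = [x[i] - Agt[i] / (n + 1) for i in range(n)]          # step 3b
        c = n * n / (n * n - 1)
        A = [[c * (A[i][j] - 2 / (n + 1) * Agt[i] * Agt[j])      # step 3c
              for j in range(n)] for i in range(n)]
    return best_x, best_f

# Hypothetical test problem: f(x) = (x1 - 1)^2 + 2 (x2 + 0.5)^2, with f* = 0.
f = lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]
# Initial ellipsoid: ball of radius 2 around the origin, which contains x*.
best_x, best_f = ellipsoid_minimize(f, grad, [0.0, 0.0],
                                    [[4.0, 0.0], [0.0, 4.0]], eps=1e-3)
print(best_x, best_f)
```

When the loop exits through the stopping test, the current center is $\epsilon$-suboptimal, so the best point returned is as well.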
Interpretation
• change coordinates so uncertainty (E) is unit ball
• take gradient (or subgradient) step with fixed length 1/(n + 1)
properties:
• can propagate Cholesky factor of $A$; get $O(n^2)$ update
• not a descent method
• often slow but robust in practice
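The change-of-coordinates interpretation can be checked directly from the update formula. A sketch of the calculation, where the transformed variable $u$ and transformed gradient $\hat g$ are notation introduced here, and $A$ is assumed symmetric positive definite:

```latex
\begin{aligned}
u &= A^{-1/2}(z - x) \quad\Longrightarrow\quad \mathcal{E}(x, A) \mapsto \{\, u \mid \|u\|_2 \le 1 \,\}, \\
\hat g &= A^{1/2} g \quad \text{(gradient of } u \mapsto f(x + A^{1/2} u) \text{ at } u = 0\text{)}, \\
A^{-1/2}(x^+ - x) &= -\frac{1}{n+1}\, \frac{A^{1/2} g}{\sqrt{g^T A g}}
  = -\frac{1}{n+1}\, \frac{\hat g}{\|\hat g\|_2}.
\end{aligned}
```

In the new coordinates the center moves along the negative normalized (sub)gradient by exactly $1/(n+1)$, regardless of the size of $g$.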
Proof of convergence
assumptions:
• $f$ is Lipschitz: $|f(y) - f(x)| \le G \|y - x\|$
• $\mathcal{E}^{(0)}$ is ball with radius $R$
suppose $f(x^{(i)}) > f^\star + \epsilon$, $i = 0, \ldots, k$
then
$$f(x) \le f^\star + \epsilon \implies x \in \mathcal{E}^{(k)}$$
since at iteration $i$ we only discard points with $f \ge f(x^{(i)})$
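from Lipschitz condition,
$$\|x - x^\star\| \le \epsilon/G \implies f(x) \le f^\star + \epsilon \implies x \in \mathcal{E}^{(k)}$$
so $B = \{ x \mid \|x - x^\star\| \le \epsilon/G \} \subseteq \mathcal{E}^{(k)}$
hence $\mathrm{vol}(B) \le \mathrm{vol}(\mathcal{E}^{(k)})$, so
$$\beta_n (\epsilon/G)^n \le e^{-k/2n} \, \mathrm{vol}(\mathcal{E}^{(0)}) = e^{-k/2n} \beta_n R^n$$
($\beta_n$ is volume of unit ball in $\mathbf{R}^n$)
therefore $k \le 2n^2 \log(RG/\epsilon)$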
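The step from the volume inequality to the iteration bound is a matter of cancelling $\beta_n$ and taking logarithms; spelled out:

```latex
\begin{aligned}
(\epsilon/G)^n &\le e^{-k/2n} R^n \\
n \log(\epsilon/G) &\le -\frac{k}{2n} + n \log R \\
\frac{k}{2n} &\le n \log R - n \log(\epsilon/G) = n \log(RG/\epsilon) \\
k &\le 2 n^2 \log(RG/\epsilon)
\end{aligned}
```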
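[Figure: initial ball $\mathcal{E}^{(0)}$, current ellipsoid $\mathcal{E}^{(k)}$ with center $x^{(k)}$, the sublevel set $\{x \mid f(x) \le f^\star + \epsilon\}$, and the ball $B = \{x \mid \|x - x^\star\| \le \epsilon/G\}$ around $x^\star$]
conclusion: for $K > 2n^2 \log(RG/\epsilon)$,
$$\min_{i=0,\ldots,K} f(x^{(i)}) \le f^\star + \epsilon$$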
Interpretation of complexity
since $x^\star \in \mathcal{E}^{(0)} = \{x \mid \|x - x^{(0)}\| \le R\}$, our prior knowledge of $f^\star$ is
$$f^\star \in [\, f(x^{(0)}) - GR, \; f(x^{(0)}) \,]$$
our prior uncertainty in $f^\star$ is $GR$
after $k$ iterations our knowledge of $f^\star$ is
$$f^\star \in \left[ \min_{i=0,\ldots,k} f(x^{(i)}) - \epsilon, \; \min_{i=0,\ldots,k} f(x^{(i)}) \right]$$
posterior uncertainty in $f^\star$ is $\le \epsilon$
iterations required:
$$2n^2 \log \frac{RG}{\epsilon} = 2n^2 \log \frac{\text{prior uncertainty}}{\text{posterior uncertainty}}$$
efficiency: $0.72/n^2$ bits per gradient evaluation (degrades with $n$)
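The efficiency figure follows from dividing the information gained, $\log_2(\text{prior}/\text{posterior})$ bits, by the iteration count $2n^2 \log(\text{prior}/\text{posterior})$, which gives $1/(2 n^2 \ln 2)$. A small sketch that evaluates both quantities; the function names and the sample values of $R$, $G$, $\epsilon$ are illustrative assumptions.

```python
import math

def iteration_bound(n, R, G, eps):
    """Worst-case iteration count 2 n^2 log(R G / eps) from the convergence proof."""
    return 2 * n ** 2 * math.log(R * G / eps)

def bits_per_evaluation(n):
    """log2(ratio) / (2 n^2 ln(ratio)) = 1 / (2 n^2 ln 2), independent of the ratio."""
    return 1.0 / (2 * n ** 2 * math.log(2))

# Hypothetical numbers: n = 2, R = 2, G = 10, eps = 1e-3.
print(iteration_bound(2, 2.0, 10.0, 1e-3))
print(bits_per_evaluation(1), bits_per_evaluation(10))
```

For $n = 1$ the value is about 0.72 bits, and it falls off as $1/n^2$.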
Inequality constrained problems
$$\begin{array}{ll} \text{minimize} & f_0(x) \\ \text{subject to} & f_i(x) \le 0, \quad i = 1, \ldots, m \end{array}$$
same idea: maintain ellipsoids $\mathcal{E}^{(k)}$ that
• contain $x^\star$
• decrease in volume to zero
case 1: $x^{(k)}$ feasible, i.e., $f_i(x^{(k)}) \le 0$, $i = 1, \ldots, m$
• then do usual update of $\mathcal{E}^{(k)}$ based on $\nabla f_0(x^{(k)})$
• rules out halfspace of points with larger function value than current point
case 2: $x^{(k)}$ infeasible, say, $f_j(x^{(k)}) > 0$
• then $\nabla f_j(x^{(k)})^T (x - x^{(k)}) \ge 0 \implies f_j(x) > 0 \implies x$ infeasible, so update $\mathcal{E}^{(k)}$ based on $\nabla f_j(x^{(k)})$
• rules out halfspace of infeasible points
Example
[Figures: iterations 0 through 5 of a constrained example, showing the constraint boundary $f_1(x) = 0$ and the cut direction used at each center: $\nabla f_1(x^{(0)})$, $\nabla f_0(x^{(1)})$, $\nabla f_0(x^{(2)})$, $\nabla f_1(x^{(3)})$, $\nabla f_0(x^{(4)})$, $\nabla f_0(x^{(5)})$]
Stopping criterion
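if $x^{(k)}$ is feasible, we have a lower bound on $f^\star$ as before:
$$f^\star \ge f_0(x^{(k)}) - \sqrt{\nabla f_0(x^{(k)})^T A^{(k)} \nabla f_0(x^{(k)})}$$
if $x^{(k)}$ is infeasible, we have for all $x \in \mathcal{E}^{(k)}$
$$\begin{aligned}
f_j(x) &\ge f_j(x^{(k)}) + \nabla f_j(x^{(k)})^T (x - x^{(k)}) \\
&\ge f_j(x^{(k)}) + \inf_{x \in \mathcal{E}^{(k)}} \nabla f_j(x^{(k)})^T (x - x^{(k)}) \\
&= f_j(x^{(k)}) - \sqrt{\nabla f_j(x^{(k)})^T A^{(k)} \nabla f_j(x^{(k)})}
\end{aligned}$$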
hence, problem is infeasible if for some $j$,
$$f_j(x^{(k)}) - \sqrt{\nabla f_j(x^{(k)})^T A^{(k)} \nabla f_j(x^{(k)})} > 0$$
stopping criteria:
• if $x^{(k)}$ is feasible and $\sqrt{\nabla f_0(x^{(k)})^T A^{(k)} \nabla f_0(x^{(k)})} \le \epsilon$ ($x^{(k)}$ is $\epsilon$-suboptimal)
• if $f_j(x^{(k)}) - \sqrt{\nabla f_j(x^{(k)})^T A^{(k)} \nabla f_j(x^{(k)})} > 0$ (problem is infeasible)
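The two cases and the two stopping criteria can be combined into one loop. The following is a minimal sketch under assumptions, not the course's code: the function name, the status strings, the `max_iter` safeguard, and the small test problem are hypothetical. It cuts on the first violated constraint when the center is infeasible and on the objective otherwise.

```python
import math

def constrained_ellipsoid(f0, grad0, cons, x, A, eps, max_iter=2000):
    """cons: list of (f_j, grad_j). Returns (status, best feasible point or None)."""
    n = len(x)
    best_x, best_f = None, math.inf
    for _ in range(max_iter):
        violated = next(((fj, gj) for fj, gj in cons if fj(x) > 0), None)
        if violated is None:                 # case 1: objective cut
            if f0(x) < best_f:
                best_x, best_f = list(x), f0(x)
            g = grad0(x)
        else:                                # case 2: feasibility cut
            g = violated[1](x)
        Ag = [sum(A[i][j] * g[j] for j in range(n)) for i in range(n)]
        root = math.sqrt(max(sum(g[i] * Ag[i] for i in range(n)), 0.0))
        if violated is None and root <= eps:
            return "optimal", best_x         # center is eps-suboptimal
        if violated is not None and violated[0](x) - root > 0:
            return "infeasible", None        # f_j > 0 on the whole ellipsoid
        Agt = [v / root for v in Ag]         # root > 0 once both tests fail
        x = [x[i] - Agt[i] / (n + 1) for i in range(n)]
        c = n * n / (n * n - 1)
        A = [[c * (A[i][j] - 2 / (n + 1) * Agt[i] * Agt[j])
              for j in range(n)] for i in range(n)]
    return "max_iter", best_x

# Hypothetical problem: minimize (x1-2)^2 + (x2-2)^2 subject to x1 + x2 - 1 <= 0.
# The solution is (0.5, 0.5) with optimal value 4.5.
f0 = lambda x: (x[0] - 2.0) ** 2 + (x[1] - 2.0) ** 2
grad0 = lambda x: [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 2.0)]
f1 = lambda x: x[0] + x[1] - 1.0
grad1 = lambda x: [1.0, 1.0]
status, x_best = constrained_ellipsoid(f0, grad0, [(f1, grad1)], [0.0, 0.0],
                                       [[9.0, 0.0], [0.0, 9.0]], eps=1e-3)
print(status, x_best)
```

Cutting on the first violated constraint is one simple choice; any violated constraint gives a valid cut.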
Ellipsoid method for feasibility
abstract feasibility problem: find $x \in C \subset \mathbf{R}^n$ or determine $C = \emptyset$
separating hyperplane oracle: for any $x$, oracle either
• confirms $x \in C$, or
• returns $g \ne 0$ s.t. $z \in C \Rightarrow g^T (z - x) \le 0$
[Figure: set $C$, ellipsoid $\mathcal{E}^{(k)}$ with center $x^{(k)}$ and cut direction $g^{(k)}$, and the updated ellipsoid $\mathcal{E}^{(k+1)}$]
start with $\mathcal{E}^{(0)}$ which intersects $C$
1. If $x^{(k)} := \mathrm{center}(\mathcal{E}^{(k)}) \in C$, quit. Else, compute $g \ne 0$, s.t. $x \in C \Rightarrow g^T (x - x^{(k)}) \le 0$
2. $\mathcal{E}^{(k+1)}$ := minimum volume ellipsoid covering $\mathcal{E}^{(k)} \cap \{z \mid g^T (z - x^{(k)}) \le 0\}$
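A minimal sketch of the feasibility version with an explicit oracle. Everything named here is an assumption for illustration: the function `ellipsoid_feasibility`, the convention that the oracle returns `None` when $x \in C$, and the example set, a small disk for which $g = x - c$ is a valid separating direction.

```python
import math

def ellipsoid_feasibility(oracle, x, A, max_iter=1000):
    """oracle(x) returns None if x is in C, else g != 0 with g^T (z - x) <= 0 on C.
    Returns a point in C, or None if none was found within max_iter cuts."""
    n = len(x)
    for _ in range(max_iter):
        g = oracle(x)
        if g is None:                        # step 1: center is in C, quit
            return x
        Ag = [sum(A[i][j] * g[j] for j in range(n)) for i in range(n)]
        gAg = sum(g[i] * Ag[i] for i in range(n))
        if gAg <= 0.0:                       # ellipsoid numerically degenerate
            return None
        s = math.sqrt(gAg)
        Agt = [v / s for v in Ag]
        x = [x[i] - Agt[i] / (n + 1) for i in range(n)]          # step 2
        c = n * n / (n * n - 1)
        A = [[c * (A[i][j] - 2 / (n + 1) * Agt[i] * Agt[j])
              for j in range(n)] for i in range(n)]
    return None

# Hypothetical set C: disk of radius 0.1 centered at (1.5, -1.0).
center, radius = (1.5, -1.0), 0.1

def disk_oracle(x):
    d = [x[0] - center[0], x[1] - center[1]]
    if math.hypot(d[0], d[1]) <= radius:
        return None
    return d  # for z in C: d^T (z - x) <= radius*|d| - |d|^2 < 0

# Initial ellipsoid: ball of radius 3 around the origin, which contains C.
point = ellipsoid_feasibility(disk_oracle, [0.0, 0.0], [[9.0, 0.0], [0.0, 9.0]])
print(point)
```

Because $C$ stays inside every ellipsoid and the volume shrinks by at least $e^{-1/(2n)}$ per cut, the loop must hit $C$ before the ellipsoid volume drops below that of $C$.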
Example
[Figures: successive iterations of the ellipsoid method on a feasibility example]