
Monotone Operator Splitting Methods

Stephen Boyd (with help from Neal Parikh and Eric Chu)
EE364b, Stanford University

Outline

1. Operator splitting
2. Douglas-Rachford splitting
3. Consensus optimization


Operator splitting

want to solve 0 ∈ F(x) with F maximal monotone
main idea: write F as F = A + B, with A and B maximal monotone (called operator splitting)
solve using methods that require evaluation of the resolvents

    R_A = (I + A)^{-1},    R_B = (I + B)^{-1}

(or Cayley operators C_A = 2R_A − I and C_B = 2R_B − I)
useful when R_A and R_B can be evaluated more easily than R_F
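As a concrete illustration (not from the slides), for the monotone linear operator F(x) = Mx with M + Mᵀ ⪰ 0, evaluating the resolvent is just a linear solve, and the resulting Cayley operator can be checked numerically to be nonexpansive; all names here are illustrative:

```python
import numpy as np

# Sketch: resolvent and Cayley operator of the maximal monotone linear
# operator F(x) = M x, where M + M^T is positive semidefinite.
rng = np.random.default_rng(0)
G = rng.standard_normal((3, 3))
M = G - G.T + np.eye(3)          # skew-symmetric part + I, so M + M^T = 2I (monotone)

def resolvent(M, z):
    # R(z) = (I + M)^{-1} z, computed as a linear solve, not an explicit inverse
    return np.linalg.solve(np.eye(len(z)) + M, z)

def cayley(M, z):
    # C(z) = 2 R(z) - z
    return 2 * resolvent(M, z) - z

# the Cayley operator of a monotone operator is nonexpansive:
u, v = rng.standard_normal(3), rng.standard_normal(3)
lhs = np.linalg.norm(cayley(M, u) - cayley(M, v))
rhs = np.linalg.norm(u - v)
assert lhs <= rhs + 1e-12
```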


Main result

A, B maximal monotone, so Cayley operators C_A, C_B nonexpansive; hence C_A C_B nonexpansive
key result:

    0 ∈ A(x) + B(x)    ⟺    C_A C_B(z) = z,  x = R_B(z)

so solutions of 0 ∈ A(x) + B(x) can be found from fixed points of the nonexpansive operator C_A C_B


Proof of main result

write C_A C_B(z) = z and x = R_B(z) as

    x = R_B(z),    z̃ = 2x − z,    x̃ = R_A(z̃),    z = 2x̃ − z̃

subtract the 2nd and 4th equations to conclude x = x̃
the 4th equation is then 2x = z + z̃
now add x + B(x) ∋ z and x + A(x) ∋ z̃ (which restate x = R_B(z) and x̃ = R_A(z̃)) to get

    2x + (A(x) + B(x)) ∋ z + z̃ = 2x

hence A(x) + B(x) ∋ 0
the argument goes the other way too (but we don't need it)



Peaceman-Rachford and Douglas-Rachford splitting

Peaceman-Rachford splitting is the undamped iteration

    z^{k+1} := C_A C_B(z^k)

doesn't converge in the general case; need C_A or C_B to be a contraction
Douglas-Rachford splitting is the damped iteration

    z^{k+1} := (1/2)(I + C_A C_B)(z^k)

always converges when 0 ∈ A(x) + B(x) has a solution
these methods trace back to the mid-1950s (!!)


Douglas-Rachford splitting

write the D-R iteration z^{k+1} := (1/2)(I + C_A C_B)(z^k) as

    x^{k+1/2} := R_B(z^k)
    z^{k+1/2} := 2x^{k+1/2} − z^k
    x^{k+1}   := R_A(z^{k+1/2})
    z^{k+1}   := z^k + x^{k+1} − x^{k+1/2}

the last update follows from

    z^{k+1} = (1/2)(2x^{k+1} − z^{k+1/2}) + (1/2)z^k
            = x^{k+1} − (1/2)(2x^{k+1/2} − z^k) + (1/2)z^k
            = z^k + x^{k+1} − x^{k+1/2}

can consider x^{k+1} − x^{k+1/2} as a residual; z^k is a running sum of residuals
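The four-step iteration can be sketched numerically. This is an illustrative toy (not from the slides): A(x) = Px and B(x) = Qx + q with P, Q positive definite, so 0 ∈ A(x) + B(x) means (P + Q)x = −q, and both resolvents are linear solves:

```python
import numpy as np

# Sketch: D-R iteration for 0 in A(x) + B(x) with toy monotone linear
# operators A(x) = P x and B(x) = Q x + q; the solution solves (P+Q)x = -q.
P = np.diag([1.0, 2.0, 3.0])
Q = np.eye(3)
q = np.array([1.0, -2.0, 0.5])

R_A = lambda v: np.linalg.solve(np.eye(3) + P, v)        # (I + A)^{-1} v
R_B = lambda v: np.linalg.solve(np.eye(3) + Q, v - q)    # (I + B)^{-1} v

z = np.zeros(3)
for k in range(300):
    x_half = R_B(z)              # x^{k+1/2} := R_B(z^k)
    z_half = 2 * x_half - z      # z^{k+1/2} := 2 x^{k+1/2} - z^k
    x_next = R_A(z_half)         # x^{k+1}   := R_A(z^{k+1/2})
    z = z + x_next - x_half      # z^{k+1}   := z^k + x^{k+1} - x^{k+1/2}

x = R_B(z)                       # x = R_B(z*) recovers the solution
x_star = np.linalg.solve(P + Q, -q)
assert np.allclose(x, x_star, atol=1e-6)
```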

Douglas-Rachford algorithm

many ways to rewrite/rearrange the D-R algorithm
equivalent to many other algorithms; often not obvious
need very little: A, B maximal monotone; solution exists
A and B are handled separately (via R_A and R_B); they are uncoupled


Alternating direction method of multipliers

to minimize f(x) + g(x), we solve 0 ∈ ∂f(x) + ∂g(x) with A(x) = ∂g(x), B(x) = ∂f(x)
D-R is

    x^{k+1/2} := argmin_x ( f(x) + (1/2)||x − z^k||_2^2 )
    z^{k+1/2} := 2x^{k+1/2} − z^k
    x^{k+1}   := argmin_x ( g(x) + (1/2)||x − z^{k+1/2}||_2^2 )
    z^{k+1}   := z^k + x^{k+1} − x^{k+1/2}

a special case of the alternating direction method of multipliers (ADMM)
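A minimal numerical sketch of this iteration, on a toy problem I've chosen for its closed-form proxes (not from the slides): f(x) = (1/2)||x − a||² and g(x) = (1/2)||x − b||², whose sum is minimized at (a + b)/2:

```python
import numpy as np

# Sketch: D-R / ADMM iteration for minimize f(x) + g(x) with
# f(x) = (1/2)||x - a||^2, g(x) = (1/2)||x - b||^2; both prox operators
# are simple averages, and the minimizer is (a + b)/2.
a = np.array([4.0, 0.0])
b = np.array([0.0, 2.0])

prox_f = lambda v: (a + v) / 2   # argmin_x f(x) + (1/2)||x - v||^2
prox_g = lambda v: (b + v) / 2

z = np.zeros(2)
for k in range(100):
    x_half = prox_f(z)           # x^{k+1/2}
    z_half = 2 * x_half - z      # z^{k+1/2}
    x = prox_g(z_half)           # x^{k+1}
    z = z + x - x_half           # z^{k+1}

assert np.allclose(x, (a + b) / 2, atol=1e-8)
```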


Constrained optimization

constrained convex problem:

    minimize    f(x)
    subject to  x ∈ C

take B(x) = ∂f(x) and A(x) = ∂I_C(x) = N_C(x), so R_B(z) = prox_f(z) and R_A(z) = Π_C(z)
D-R is

    x^{k+1/2} := prox_f(z^k)
    z^{k+1/2} := 2x^{k+1/2} − z^k
    x^{k+1}   := Π_C(z^{k+1/2})
    z^{k+1}   := z^k + x^{k+1} − x^{k+1/2}

Dykstra's alternating projections

find a point in the intersection of convex sets C, D
D-R gives the algorithm

    x^{k+1/2} := Π_C(z^k)
    z^{k+1/2} := 2x^{k+1/2} − z^k
    x^{k+1}   := Π_D(z^{k+1/2})
    z^{k+1}   := z^k + x^{k+1} − x^{k+1/2}

this is Dykstra's alternating projections algorithm
much faster than classical alternating projections
(e.g., for C, D polyhedral, converges in a finite number of steps)
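A small sketch under illustrative assumptions (my own toy sets, not from the slides): C a hyperplane and D the nonnegative orthant in R², both with closed-form projections; the iteration lands on a point of C ∩ D:

```python
import numpy as np

# Sketch: D-R iteration for finding a point in the intersection of
# C = {x : x1 + x2 = 1} (hyperplane) and D = R^2_+ (nonnegative orthant).
c = np.array([1.0, 1.0])
proj_C = lambda v: v - (c @ v - 1.0) / (c @ c) * c   # projection onto hyperplane
proj_D = lambda v: np.maximum(v, 0.0)                # projection onto orthant

z = np.array([3.0, -2.0])
for k in range(50):
    x_half = proj_C(z)           # x^{k+1/2} := Pi_C(z^k)
    z_half = 2 * x_half - z      # z^{k+1/2} := 2 x^{k+1/2} - z^k
    x = proj_D(z_half)           # x^{k+1}   := Pi_D(z^{k+1/2})
    z = z + x - x_half           # z^{k+1}   := z^k + x^{k+1} - x^{k+1/2}

# at the fixed point, x_half = x lies in both sets
assert np.allclose(x, x_half, atol=1e-9)
assert abs(c @ x - 1.0) < 1e-9 and (x >= -1e-12).all()
```

Consistent with the finite-convergence remark for polyhedral sets, this example reaches its fixed point (0.5, 0.5) after a handful of iterations.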


Positive semidefinite matrix completion

some entries of a matrix in S^n are known; find values for the others so the completed matrix is PSD
take C = S^n_+, D = {X | X_ij = X̄_ij, (i, j) ∈ K}, where X̄_ij are the known entries
projection onto C: find the eigendecomposition X = Σ_{i=1}^n λ_i q_i q_i^T; then

    Π_C(X) = Σ_{i=1}^n max{0, λ_i} q_i q_i^T

projection onto D: set specified entries to known values
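The two projections can be sketched directly in numpy (a minimal illustration; the dict-of-known-entries representation is my own choice, not from the slides):

```python
import numpy as np

# Sketch: the two projections used for PSD matrix completion.
def proj_psd(X):
    # eigendecomposition X = sum_i lam_i q_i q_i^T, then clip eigenvalues at 0
    lam, Q = np.linalg.eigh((X + X.T) / 2)           # symmetrize for safety
    return (Q * np.maximum(lam, 0.0)) @ Q.T

def proj_known(X, known):
    # overwrite the specified entries with their known values (symmetrically)
    X = X.copy()
    for (i, j), v in known.items():
        X[i, j] = X[j, i] = v
    return X

X = np.array([[2.0, 0.0], [0.0, -1.0]])
P = proj_psd(X)                  # clips the -1 eigenvalue to 0
assert np.linalg.eigvalsh(P).min() >= -1e-10
```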


Positive semidefinite matrix completion

specific example: 50 × 50 matrix missing about half of its entries
initialize Z^0 = 0

Positive semidefinite matrix completion

[figure: convergence plot of ||X^{k+1} − X^{k+1/2}||_F (log scale, roughly 10^{-4} to 10^2) versus iteration k, for k up to 100; blue: alternating projections; red: D-R; here X^{k+1/2} ∈ C, X^{k+1} ∈ D]


Consensus optimization

want to minimize Σ_{i=1}^N f_i(x)
rewrite as the consensus problem

    minimize    Σ_{i=1}^N f_i(x_i)
    subject to  x ∈ C = {(x_1, . . . , x_N) | x_1 = ⋯ = x_N}

D-R consensus optimization:

    x^{k+1/2} := prox_f(z^k)
    z^{k+1/2} := 2x^{k+1/2} − z^k
    x^{k+1}   := Π_C(z^{k+1/2})
    z^{k+1}   := z^k + x^{k+1} − x^{k+1/2}


Douglas-Rachford consensus

the x^{k+1/2}-update splits into N separate (parallel) problems:

    x_i^{k+1/2} := argmin_{x_i} ( f_i(x_i) + (1/2)||x_i − z_i^k||_2^2 ),    i = 1, . . . , N

the x^{k+1}-update is averaging:

    x_i^{k+1} := z̄^{k+1/2} = (1/N) Σ_{i=1}^N z_i^{k+1/2},    i = 1, . . . , N

the z^{k+1}-update becomes

    z_i^{k+1} = z_i^k + z̄^{k+1/2} − x_i^{k+1/2}
              = z_i^k + 2x̄^{k+1/2} − z̄^k − x_i^{k+1/2}
              = z_i^k + (x̄^{k+1/2} − x_i^{k+1/2}) + (x̄^{k+1/2} − z̄^k)

taking the average of the last equation, we get z̄^{k+1} = x̄^{k+1/2}

Douglas-Rachford consensus

renaming x^{k+1/2} as x^{k+1}, D-R consensus becomes

    x_i^{k+1} := prox_{f_i}(z_i^k)
    z_i^{k+1} := z_i^k + (x̄^{k+1} − x_i^{k+1}) + (x̄^{k+1} − x̄^k)

subsystem (local) state: x̄^k, z_i^k, x_i^k
gather the x_i^k to compute x̄^k, which is then scattered
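The renamed iteration can be sketched on a toy consensus problem (my own illustrative choice, not from the slides): f_i(x) = (1/2)||x − a_i||², whose prox is an average and whose overall minimizer is the mean of the a_i:

```python
import numpy as np

# Sketch: D-R consensus for minimize sum_i f_i(x) with toy terms
# f_i(x) = (1/2)||x - a_i||^2; the minimizer is mean(a_i).
a = np.array([[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]])
N, n = a.shape
prox = lambda i, v: (a[i] + v) / 2.0                 # prox_{f_i}(v)

z = np.zeros((N, n))                                 # local variables z_i
xbar = np.zeros(n)                                   # x-bar; z^0 = 0 makes zbar^0 = xbar^0
for k in range(200):
    x = np.stack([prox(i, z[i]) for i in range(N)])  # N parallel prox steps
    xbar_new = x.mean(axis=0)                        # gather: average the x_i
    z = z + (xbar_new - x) + (xbar_new - xbar)       # scatter: local z_i updates
    xbar = xbar_new

assert np.allclose(xbar, a.mean(axis=0), atol=1e-6)
```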


Distributed QP

we use D-R consensus to solve the QP

    minimize    f(x) = Σ_{i=1}^N (1/2)||A_i x − b_i||_2^2
    subject to  F_i x ≤ g_i,    i = 1, . . . , N

with variable x ∈ R^n
each of N processors will handle an objective term, block of constraints
coordinate N QP solvers to solve the big QP


Distributed QP

the D-R consensus algorithm is

    x_i^{k+1} := argmin_{F_i x_i ≤ g_i} ( (1/2)||A_i x_i − b_i||_2^2 + (1/2)||x_i − z_i^k||_2^2 )
    z_i^{k+1} := z_i^k + (x̄^{k+1} − x_i^{k+1}) + (x̄^{k+1} − x̄^k)

first step is N parallel QP solves
second step gives coordination, to solve the large problem
inequality constraint residual is 1^T(F x̄^k − g)_+


Distributed QP

example with n = 100 variables, N = 10 subsystems

[figure: two convergence plots versus iteration k, for k up to 50: the constraint residual 1^T(F x̄^k − g)_+ on a log scale (roughly 10^{-2} to 10^2), and the objective f(x̄^k) decreasing from about 2000 toward 500]
