
  • Claudia Sagastizábal, IMECC-UNICAMP, Brazil

    One World Optimization Seminar, UniVie, June 22nd, 2020

  • Context: when to use decomposition methods?

    - problems too difficult to solve directly: mixed 0-1, linear, quadratic, or nonlinear programs; stochastic UC with nonconvex HPF

    - problems with partially separable structure: complicating variables (block-diagonal 2nd-stage constraints coupled with the 1st stage) or complicating constraints (separable objective)

    - problems of problems: equilibrium problems, games, variational inequalities; energy markets

    - information not accessible: big data, machine learning, commercial oracles


  • Which decomposition method?

    It all depends on the output of interest!


  • Decomposition: what and how?

    - Primal
    - Dual
    - Primal-Dual

    It all depends on the output of interest


  • Illustration with a simple example

    min  f_T(y_T) + f_H(y_H)
    s.t. y_T ∈ S_T, y_H ∈ S_H
         y_T + y_H = d

    [Figure: two power plants. The thermal plant has y_T ∈ S_T and cost ⟨F, x⟩ + f_T(y_T), with x ∈ {0,1} and y_T ≤ x y^up; the hydro plant has y_H ∈ S_H and cost f_H(y_H); together they meet the demand y_T + y_H = d.]


  • A less simple example

    min  ⟨F, x⟩ + f_T(y_T) + f_H(y_H)
    s.t. y_T ∈ S_T, y_H ∈ S_H
         y_T + y_H = d
         x ∈ {0,1} and y_T ≤ x y^up  ⇐⇒  (x, y_T) ∈ S_T

    (Same two plants; the commitment variable x and the bound y_T ≤ x y^up are absorbed into the thermal set S_T.)
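
    To fix ideas, the sketch below solves a small instance of this toy model by brute force; the quadratic costs f_T, f_H, the capacity y^up, the fixed cost F, and the demand d are made-up values, not data from the talk.

    ```python
    import numpy as np

    # Hypothetical instance of the two-plant toy problem (made-up data):
    # fixed commitment cost F, demand d, thermal capacity y_up.
    F, d, y_up = 5.0, 10.0, 8.0
    f_T = lambda yT: 0.5 * yT**2     # thermal generation cost (assumed quadratic)
    f_H = lambda yH: yH**2           # hydro generation cost (assumed quadratic)

    best = None
    for x in (0, 1):                                  # commitment decision
        for yT in np.linspace(0.0, x * y_up, 201):    # enforces y_T <= x * y_up
            yH = d - yT                               # demand: y_T + y_H = d
            if 0.0 <= yH <= d:                        # simple box set S_H = [0, d]
                cost = F * x + f_T(yT) + f_H(yH)
                if best is None or cost < best[0]:
                    best = (cost, x, yT, yH)

    print("optimal value %.3f at x=%d, yT=%.2f, yH=%.2f" % best)
    ```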

  • Primal scissors (Benders)

    - Good: splits the 0-1 variables from possible NLP relations
    - Bad: all technologies are dealt with together

    For instance,

    min  ⟨F, x⟩ + f_T(y_T) + f_H(y_H)
    s.t. (x^k, y_T) ∈ S_T, y_H ∈ S_H
         y_T + y_H = d

    =⇒ an operational subproblem in both y_T and y_H for each given x^k (a master program defines x^{k+1}).

    To separate technologies we need dual scissors acting on the demand constraint.

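
    Schematically, the primal loop looks as follows on the toy instance above; for illustration the cut-based master program is replaced by plain enumeration of the single 0-1 variable, which is an oversimplification of Benders.

    ```python
    import numpy as np

    def operational_subproblem(x, F=5.0, d=10.0, y_up=8.0):
        """Subproblem in (y_T, y_H) for a fixed commitment x^k (brute-force grid)."""
        best = np.inf
        for yT in np.linspace(0.0, x * y_up, 201):
            yH = d - yT                               # demand constraint
            if 0.0 <= yH <= d:
                best = min(best, F * x + 0.5 * yT**2 + yH**2)
        return best

    # Toy "master program": enumerate x in {0,1} instead of accumulating cuts.
    values = {x: operational_subproblem(x) for x in (0, 1)}
    x_star = min(values, key=values.get)
    print("master picks x = %d with value %.3f" % (x_star, values[x_star]))
    ```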

  • Dual scissors: Lagrangian relaxation

    min  f_T(x, y_T) + f_H(y_H)
    s.t. (x, y_T) ∈ S_T
         y_H ∈ S_H
         y_T + y_H = d   ↔ u

    =⇒  L(x, y, u) = f_T(x, y_T) + f_H(y_H) + ⟨u, d − y_T − y_H⟩
                   = L_T(x, y_T, u) + L_H(y_H, u) + ⟨u, d⟩

    The problem is equivalent to

    min_{x,y} max_u L(x, y, u)   s.t. (x, y_T) ∈ S_T, y_H ∈ S_H

    DUAL: maxmin replaces minmax,

    max_u min_y L(x, y, u)   s.t. (x, y_T) ∈ S_T, y_H ∈ S_H

    =⇒  max_u θ_T(u) + θ_H(u) + ⟨u, d⟩

    for  θ_T(u) := min { L_T(x, y_T, u) : (x, y_T) ∈ S_T }
         θ_H(u) := min { L_H(y_H, u) : y_H ∈ S_H }

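
    The separable dual can be maximized, for instance, by subgradient ascent. The sketch below does this on a convex relaxation of the toy instance (thermal plant committed, x = 1) with made-up costs, so that θ_T and θ_H have closed-form minimizers; the bundle machinery of the next slides is a more robust replacement for this crude ascent.

    ```python
    import numpy as np

    # Maximize theta_T(u) + theta_H(u) + u*d by subgradient ascent (illustration;
    # data are made up, and x is fixed to 1 so that both blocks are convex).
    F, d, y_up = 5.0, 10.0, 8.0

    def theta_T(u):
        # theta_T(u) = min over 0 <= yT <= y_up of F + 0.5*yT^2 - u*yT
        yT = np.clip(u, 0.0, y_up)            # unconstrained minimizer is yT = u
        return F + 0.5 * yT**2 - u * yT, yT

    def theta_H(u):
        # theta_H(u) = min over 0 <= yH <= d of yH^2 - u*yH
        yH = np.clip(u / 2.0, 0.0, d)         # unconstrained minimizer is yH = u/2
        return yH**2 - u * yH, yH

    u = 0.0
    for k in range(1, 201):
        _, yT = theta_T(u)
        _, yH = theta_H(u)
        g = d - yT - yH                       # subgradient of the (concave) dual
        u += (1.0 / k) * g                    # diminishing step size
    print("u ~ %.3f, yT ~ %.3f, yH ~ %.3f, mismatch %.2e" % (u, yT, yH, d - yT - yH))
    ```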

  • Dual scissors: Lagrangian relaxation à la bundle

    min  f_T(y_T) + f_H(y_H)
    s.t. x ∈ {0,1}, y_T ∈ S_T
         y_T ≤ x y^up
         y_H ∈ S_H
         y_T + y_H = d   ↔ u

    =⇒  L(y, u) = f_T(y_T) + f_H(y_H) + ⟨u, d − y_T − y_H⟩
                = L_T(y_T, u) + L_H(y_H, u) + ⟨u, d⟩

    DUAL: maxmin replaces minmax,

    max_u min_y L(x, y, u)   s.t. (x, y_T) ∈ S_T, y_H ∈ S_H

    =⇒  max_u θ_T(u) + θ_H(u) + ⟨u, d⟩

    for  θ_T(u) ≈ min { L_T(x, y_T, u) : (x, y_T) ∈ S_T }
         θ_H(u) ≈ min { L_H(y_H, u) : y_H ∈ S_H }

    (the "≈" signals inexact evaluations, which the bundle machinery tolerates)


  • Dual scissors

    - Good: separates technologies and tolerates inexact information
    - Good: if the output of interest is u (the shadow price)
    - Bad: if the output of interest is x, y: the final ones may be infeasible (the dual approach solves the bi-dual of the original problem)


  • Scissor features

    - Primal scissors: primal feasibility
    - Dual scissors: separability

    Can we have both (x, y) and u??? We need primal-dual scissors.


  • Primal-dual scissors: Augmented Lagrangians

    L_r(x, y, u) = L(x, y, u) + (r/2)‖d − y_T − y_H‖²

    - Good: closes the duality gap
    - Bad: puts the technologies back together
    - Also: inexact calculations make it difficult to manage the parameter r

    Need to sharpen our scissors.

  • Primal-dual scissors: Sharp Augmented Lagrangians

    L_#(x, y, u, r) = L(x, y, u) + r |d − y_T − y_H|₁

    where r is a dual variable, no longer a mere parameter.

  • Generalized Augmented Lagrangians (GAL): the perturbation function p

    p(u) = inf { f_T(x, y_T) + f_H(y_H) : (x, y_T) ∈ S_T, y_H ∈ S_H, y_T + y_H = d + u }

    ⇐⇒  p(u) = inf { ϕ(x) : x ∈ X, h(x) = u }

    If in the example f_T, f_H are quadratic, p is the minimum of two quadratic functions.

    [Figure: graphs of p and of its convexification p**, with p(0) above p**(0).]

    Need a "wedge" to close the duality gap and bring the magenta curve closer to the blue one.

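
    A quick numerical check of this picture, with a made-up p given as the minimum of two quadratics (illustrative coefficients only): the plain dual value, the best affine minorant at 0, stays below p(0), while adding a sharp wedge r|u|₁ essentially recovers p(0).

    ```python
    import numpy as np

    # p(u) = min of two quadratics (made-up coefficients), nonconvex, p(0) = 1.
    u = np.linspace(-4.0, 4.0, 2001)
    p = np.minimum((u - 1.0)**2, (u + 1.0)**2 + 0.5)
    p0 = np.interp(0.0, u, p)

    # Plain dual value: best affine minorant at 0, i.e., sup_g inf_u p(u) - g*u.
    plain = max(np.min(p - g * u) for g in np.linspace(-5, 5, 201))

    # Sharp-augmented dual value: sup over (g, r >= 0) of inf_u p(u) - g*u + r*|u|.
    sharp = max(np.min(p - g * u + r * np.abs(u))
                for g in np.linspace(-5, 5, 201) for r in np.linspace(0, 5, 51))

    print("p(0) = %.3f  plain dual = %.3f (gap)  sharp dual = %.3f (gap closed)"
          % (p0, plain, sharp))
    ```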

  • The GAL according to R&W

    p(u) = inf { ϕ(x) : x ∈ X, h(x) = u } = inf_{x∈X} ( ϕ(x) + I_{h(x)−u=0}(x) ) =: inf_{x∈X} D(x, u)

    - p is a marginal function of D
    - D is the conjugate of the Lagrangian: D* = L
    - "fix" D by adding a "σ-term": D_σ = D + (1/r) σ*, so that D_σ* = D* +∨ rσ = L_σ is the GAL
    - L_σ(x, u, r) = L(x, u) + r σ(h(x))
    - p_σ is the marginal function of D_σ; take σ = |·|₁

    [Figure: p(0) and p_σ(0) coincide: no duality gap!]

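
    For completeness, here is the standard one-line argument for why the σ-term never cuts off feasible points, so the augmented dual is always a lower bound (the "no gap" statement is the nontrivial converse):

    ```latex
    % For feasible x (h(x) = 0) and any (u, r) with r >= 0, since sigma(0) = 0:
    \[
      L_\sigma(x,u,r) = \varphi(x) + \langle u, h(x)\rangle + r\,\sigma(h(x))
                      = \varphi(x),
    \]
    % hence the augmented dual function is a lower bound on the optimal value:
    \[
      \theta_\sigma(u,r) = \inf_{x\in X} L_\sigma(x,u,r)
      \;\le\; \inf\{\varphi(x) : x\in X,\ h(x)=0\} \;=\; p(0).
    \]
    ```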

  • Sharp and Proximal GAL according to R&W

    L_σ(x, u, r) = L(x, u) + r |h(x)|₁  (sharp)   and   L(x, u) + (r/2)|h(x)|₂²  (proximal)

    [Figure: graph of L_σ over the x-component and r-component.]

    No duality gap, but two catches:
    - the separable L(x, u) = L_T(x_T, u) + L_H(x_H, u) is turned into the non-separable L_σ(x, u, r) = L(x, u) + r |x_T + x_H − d|₁
    - exact evaluation of the dual function θ_σ(u, r) = min_{x∈X} L_σ(x, u, r) is required, a global optimization problem

    Our work is a proposal to address these issues.


  • GAL according to …

    GAL of   min { ϕ(x) : x ∈ X, h(x) = 0 }
    ≡  Lagrangian of   min { ϕ(x) : x ∈ X, h(x) = 0, σ(h(x)) ≤ 0 }

    From now on we consider L(x, u, r) for the problem on the right, to examine the relations of approximate solutions to min_x L(x, u, r)
    - with ε-subgradients of the dual function θ(u, r)
    - with numerical schemes à la bundle
    - with solutions to the problem on the left

    Any CQ for the original problem yields multipliers (u, r) for the σ-augmented problem.


  • Everett's theorem (σ continuous, non-negative, with unique minimizer at 0)

    Fix (ũ, r̃) and consider evaluating θ(ũ, r̃) = min_x L(x, ũ, r̃): what primal problem is solved by a minimizer x̃?

    NLP Everett's: a stationary point x̃ for θ(ũ, r̃), i.e., solving

        0 ∈ ∂_x L(x, ũ, r̃) + N_X(x), with x ∈ X,

    is a stationary point for the original problem ⇐⇒ h(x̃) = 0.

    ε-Everett's: an ε-solution x̃ for θ(ũ, r̃), i.e., satisfying

        L(x̃, ũ, r̃) ≤ L(x, ũ, r̃) + ε for all x ∈ X,

    approximately solves the original problem (ϕ(x̃) ≤ ϕ* + ε) ⇐⇒ h(x̃) = 0.

    Summing up:
    - augmentation (approximately) closes the duality gap, provided h(x̃) = 0
    - h(x) = 0 ⇐⇒ σ(h(x)) = 0
    - an inexact bundle method drives σ(h(x^k)) to 0

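
    The forward implication in ε-Everett's is a one-line computation, spelled out here:

    ```latex
    % If h(\tilde x) = 0, then sigma(h(\tilde x)) = 0, so for every feasible x:
    \[
      \varphi(\tilde x) \;=\; L(\tilde x,\tilde u,\tilde r)
      \;\le\; L(x,\tilde u,\tilde r) + \varepsilon
      \;=\; \varphi(x) + \varepsilon ,
    \]
    % using the epsilon-minimality of \tilde x over X and h(x) = 0 on both ends;
    % taking the infimum over feasible x gives
    % \varphi(\tilde x) \le \varphi^* + \varepsilon.
    ```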

  • GAL and the σ-simplex (recall that h(x) = 0 ⇐⇒ σ(h(x)) = 0)

    θ(ũ, r̃) = min_{x∈X} L(x, ũ, r̃) = min_{x∈X} ϕ(x) + ⟨ũ, h(x)⟩ + r̃ σ(h(x))

    =⇒  (h(x_min), σ(h(x_min))) ∈ ∂_ε θ(ũ, r̃)

    Suppose X is compact; an explicit calculus rule then gives the

    σ-simplex = conv { σ(h(x^i)) : x^i approximate minimizers }

    Theorem: 0 ∈ σ-simplex is equivalent to
    - 0 ∈ ∂_ε θ(ũ, r̃)
    - one x^{i_best} in the σ-simplex is primal feasible: h(x^{i_best}) = 0

    =⇒  x^{i_best} is an approximate solution of the original problem

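
    The ε-subgradient claim follows directly from the definition of θ as a min; writing the standard inequality for an ε-minimizer x̃ of L(·, ũ, r̃):

    ```latex
    % For any (u, r), minimizing over x gives, with \tilde x an eps-minimizer:
    \[
      \theta(u,r) \;\le\; \varphi(\tilde x) + \langle u, h(\tilde x)\rangle
                          + r\,\sigma(h(\tilde x))
      \;\le\; \theta(\tilde u,\tilde r) + \varepsilon
              + \langle u-\tilde u,\, h(\tilde x)\rangle
              + (r-\tilde r)\,\sigma(h(\tilde x)),
    \]
    % which is precisely the statement that (h(\tilde x), sigma(h(\tilde x)))
    % is an eps-supergradient of the concave function theta at (\tilde u,\tilde r).
    ```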

  • PDBM: a primal-dual bundle method for GAL

    Init:   choose (u¹, r¹) and compute x¹ ≈ argmin_{x∈X} L(x, u¹, r¹)
    Dual:   solve the bundle QP with a model for θ to obtain (u⁺, r⁺)
    Noise?: if there is too much noise, adjust the prox-parameter and go to Dual
    Primal: compute x⁺ ≈ argmin_{x∈X} L(x, u⁺, r⁺); stop if σ(h(x⁺)) is small
    Bundle: classify (u⁺, r⁺) as a serious or null step, update the model, loop to Dual

    Theorem:
    1. The noise-attenuation loop is finite
    2. There is a primal feasible limit point x̄^{i_best} that is approximately optimal
    3. If the dual sequence has accumulation points, they approximately solve the dual problem
    4. For the existence of dual accumulation points, see "Convex proximal bundle methods in depth", MP 2014

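
    A schematic skeleton of this loop on the toy two-plant problem, purely for intuition: the bundle QP, the model for θ, and the noise test are replaced by crude surrogates (a subgradient step in u and a geometric increase of r), so this is not the talk's method, only its control flow.

    ```python
    import numpy as np

    # PDBM-style control flow on the toy problem with sharp sigma = |.|_1.
    # All numeric choices (steps, growth factor, grids) are illustrative guesses.
    F, d, y_up = 5.0, 10.0, 8.0

    def primal_oracle(u, r):
        """x+ ~ argmin of L_sigma over a discretized X (brute force)."""
        best, arg = np.inf, None
        for x in (0, 1):
            for yT in np.linspace(0.0, x * y_up, 81):
                for yH in np.linspace(0.0, d, 81):
                    h = d - yT - yH
                    val = F * x + 0.5 * yT**2 + yH**2 + u * h + r * abs(h)
                    if val < best:
                        best, arg = val, (x, yT, yH, h)
        return best, arg

    u, r = 0.0, 0.5
    for k in range(1, 60):
        _, (x, yT, yH, h) = primal_oracle(u, r)   # Primal step
        if abs(h) <= 1e-2:                        # Stop test: sigma(h(x+)) small
            break
        u += (2.0 / k) * h                        # crude stand-in for the Dual step
        r = min(r * 1.5, 50.0)                    # crude increase of the r variable
    print("stopped with |h| = %.3f at x=%d, yT=%.2f, yH=%.2f; (u, r)=(%.2f, %.2f)"
          % (abs(h), x, yT, yH, u, r))
    ```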

  • Solving difficult problems with PDBM for GAL: DC problems with explicit nonconvexity (exact solution of subproblems)

    min_{x∈X=ℝⁿ}  ϕ(x) := (1/2)⟨x, Qx⟩ + ⟨x, q⟩ − max_{i∈{1,…,N}} { ⟨x, αᵢ⟩ + βᵢ }
    s.t.  h(x) := Ax − b = 0

    Augmented with σ(·) = (1/2)‖·‖₂², this yields an easy dual function θ(u, r) = min_{i∈{1,…,N}} θᵢ(u, r).

    Each θᵢ is a QP with the same quadratic term for all i:

    θᵢ(u, r) = min_x  (1/2)⟨x, Qx⟩ + ⟨x, q⟩ + ⟨x, αᵢ⟩ + βᵢ + ⟨Ax − b, u⟩ + (r/2)‖Ax − b‖₂²

                 PDBM             MSM (Gas02)       ENUM
                 avg      stdev   avg      stdev    avg    stdev
    Δϕ          -1E-03    3E-03   -7E-02   3E-01    0      0
    h(x̄)         4E-08    3E-08    1E-03   6E-03
    #primal      25       42       105     115
    CPU (s)      5        8        21      36       209    515

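
    Since each θᵢ is an unconstrained strongly convex QP, it can be evaluated with a single linear solve; a sketch with placeholder data (random Q, q, αᵢ, βᵢ, A, b, not the paper's instances):

    ```python
    import numpy as np

    # Evaluating theta_i(u, r) in closed form (placeholder data, not the talk's).
    rng = np.random.default_rng(0)
    n, m, N = 5, 2, 4
    M = rng.standard_normal((n, n))
    Q = M @ M.T + n * np.eye(n)              # symmetric positive definite
    q = rng.standard_normal(n)
    alpha = rng.standard_normal((N, n))
    beta = rng.standard_normal(N)
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)

    def theta_i(i, u, r):
        """Stationarity gives (Q + r A'A) x = -(q + alpha_i + A'(u - r b))."""
        H = Q + r * A.T @ A                  # same quadratic term for every i
        x = np.linalg.solve(H, -(q + alpha[i] + A.T @ (u - r * b)))
        res = A @ x - b
        return (0.5 * x @ Q @ x + q @ x + alpha[i] @ x + beta[i]
                + u @ res + 0.5 * r * res @ res)

    u, r = np.zeros(m), 1.0
    theta = min(theta_i(i, u, r) for i in range(N))  # theta(u,r) = min_i theta_i
    print("theta(u, r) =", theta)
    ```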

  • Solving difficult problems with PDBM for GAL: unit-commitment problems (inexact solution of subproblems, using ADMM)

    min  Σ_{i∈I} ( ⟨F, xᵢ⟩ + Cᵢ(yᵢ) )            min  Σ_{i∈I} ϕᵢ(xᵢ, yᵢ)
    s.t. (xᵢ, yᵢ) ∈ Sᵢ                  ⇐⇒       s.t. (xᵢ, yᵢ) ∈ Sᵢ
         Σᵢ yᵢ = D                                     Σᵢ zᵢ = D
                                                       h(x, y) = z − y = 0

    Augmented with σ(·) = |·|₁, this gives

    L(x, y, z, u, r) = L_{0-1}(x, y, u) + L_cont(z, u) + r |y − z|₁

                     ≈ [ L_{0-1}(x, y, u) + (r/2)|y − z_fixed|₁ ] + [ L_cont(z, u) + (r/2)|y_fixed − z|₁ ]

    where the first bracket separates into Σᵢ min over (xᵢ, yᵢ) ∈ Sᵢ and the second bracket is a min over Σᵢ zᵢ = D; the two blocks are alternated with the other one fixed, ADMM-style (see the sketch after this slide).

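
    A schematic version of this two-block alternation: block 1 splits by unit, block 2 handles the demand-coupled z. All data below are made up, and the z-update is a crude feasibility-restoration stand-in for the actual ADMM subproblem.

    ```python
    import numpy as np

    # Schematic two-block alternation for the l1-coupled UC Lagrangian (toy data).
    I, D, r = 3, 12.0, 4.0
    u = np.zeros(I)                              # multiplier on h = z - y
    c = np.array([0.5, 1.0, 1.5])                # quadratic cost coefficients C_i
    F = np.array([2.0, 1.0, 3.0])                # commitment costs
    cap = np.array([6.0, 6.0, 6.0])              # generation capacities
    y, z = np.zeros(I), np.full(I, D / I)

    for _ in range(50):
        # Block 1 (separable by unit): min over x_i in {0,1}, 0 <= y_i <= x_i*cap_i
        # of F_i*x_i + c_i*y_i^2 - u_i*y_i + (r/2)*|y_i - z_fixed_i|.
        for i in range(I):
            off_val = 0.5 * r * abs(z[i])                    # x_i = 0 => y_i = 0
            grid = np.linspace(0.0, cap[i], 121)             # x_i = 1 candidates
            on_vals = F[i] + c[i]*grid**2 - u[i]*grid + 0.5*r*np.abs(grid - z[i])
            j = int(np.argmin(on_vals))
            y[i] = 0.0 if off_val <= on_vals[j] else grid[j]
        # Block 2 (coupled by demand): min over sum(z) = D of <u,z> + (r/2)|y - z|_1;
        # here replaced by an even-spread feasibility restoration (crude choice).
        z = y + (D - y.sum()) / I
        u = u + 0.1 * (z - y)                                # multiplier step
    print("y =", np.round(y, 2), "z =", np.round(z, 2), "sum(y) =", round(y.sum(), 2))
    ```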

  • Solving difficult problems with PDBM for GAL: unit-commitment problems (inexact solution of subproblems, using ADMM)

    Very good performance, provided r⁰ is well chosen (not too large).

    Results for 56 synthetic instances, horizon from 1 to 7 days, hourly discretization:

                 PDBM              MSM (Gas02)
                 avg      stdev    avg      stdev
    Gap (%)      4.5      6.5      14.6     12.1
    h(x̄)         4.5E-03  7.2E-04  5.3E-03  1.1E-03
    #primal      105      52       208      131
    CPU (s)      96       121      148      140

  • Some works on GAL

    - E. Golshtein and N. Tretyakov, Modified Lagrangians and Monotone Maps in Optimization, Wiley, 1996
    - R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, Springer, 1998
    - A. M. Rubinov, B. M. Glover, and X. Q. Yang, "Decreasing Functions with Applications to Penalization", SiOPT, 1999
    - R. N. Gasimov, "Augmented Lagrangian Duality and Nondifferentiable Optimization Methods in Nonconvex Programming", JOGO, 2002 ← MSM
    - X. X. Huang and X. Q. Yang, "A Unified Augmented Lagrangian Approach to Duality and Exact Penalization", MOR, 2003
    - R. Burachik, R. Gasimov, N. Ismayilova, and C. Kaya, "On a Modified Subgradient Algorithm for Dual Problems via Sharp Augmented Lagrangian", JOGO, 2006
    - R. Burachik and A. Rubinov, "Abstract Convexity and Augmented Lagrangians", SiOPT, 2007
    - A. Nedić, A. Ozdaglar, and A. Rubinov, "Abstract convexity for nonconvex optimization duality", OPT, 2007
    - R. Burachik, A. N. Iusem, and J. G. Melo, "A primal dual modified subgradient algorithm with sharp Lagrangian", JOGO, 2010
    - R. Burachik, "On primal convergence for augmented Lagrangian duality", OPT, 2011
    - R. Burachik, A. Iusem, and J. Melo, "An Inexact Modified Subgradient Algorithm for Primal-Dual Problems via Augmented Lagrangians", JOTA, 2013
    - N. L. Boland and A. C. Eberhard, "On the augmented Lagrangian dual for integer programming", MP, 2015
    - M. J. Feizollahi, S. Ahmed, and A. Sun, "Exact augmented Lagrangian duality for mixed integer linear programming", MP, 2017
    - A. M. Bagirov, G. Ozturk, and R. Kasimbeyli, "A sharp augmented Lagrangian-based method in constrained non-convex optimization", OMS, 2018
    - X. Gu, S. Ahmed, and S. S. Dey, "Exact Augmented Lagrangian Duality for Mixed Integer Quadratic Programming", SiOPT, 2020

    Questions?
