arXiv:1701.01549v2 [math.OC] 6 Apr 2018
TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION
AND LAGRANGE MULTIPLIER EXPRESSIONS
JIAWANG NIE
Abstract. This paper proposes tight semidefinite relaxations for polynomial optimization. The optimality conditions are investigated. We show that generally Lagrange multipliers can be expressed as polynomial functions in decision variables over the set of critical points. The polynomial expressions are determined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations are constructed for solving the polynomial optimization. We show that the hierarchy of new relaxations has finite convergence, or equivalently, the new relaxations are tight for a finite relaxation order.
1 Introduction
A general class of optimization problems is

(1.1)  f_min := min f(x)  s.t.  c_i(x) = 0 (i ∈ E),  c_j(x) ≥ 0 (j ∈ I),

where f and all c_i, c_j are polynomials in x := (x_1, ..., x_n), the real decision variable. The E and I are two disjoint finite index sets of constraining polynomials. Lasserre's relaxations [17] are generally used for solving (1.1) globally, i.e., to find the global minimum value f_min and minimizer(s), if any. The convergence of Lasserre's relaxations is related to optimality conditions.
1.1. Optimality conditions. A general introduction of optimality conditions in nonlinear programming can be found in [1, Section 3.3]. Let u be a local minimizer of (1.1). Denote the index set of active constraints

(1.2)  J(u) := { i ∈ E ∪ I | c_i(u) = 0 }.

If the constraint qualification condition (CQC) holds at u, i.e., the gradients ∇c_i(u) (i ∈ J(u)) are linearly independent (∇ denotes the gradient), then there exist Lagrange multipliers λ_i (i ∈ E ∪ I) satisfying

(1.3)  ∇f(u) = Σ_{i ∈ E∪I} λ_i ∇c_i(u),

(1.4)  c_i(u) = 0 (i ∈ E),  λ_j c_j(u) = 0 (j ∈ I),

(1.5)  c_j(u) ≥ 0 (j ∈ I),  λ_j ≥ 0 (j ∈ I).
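For concreteness, the conditions (1.3)-(1.5) can be checked numerically at a candidate minimizer. The following sketch uses a made-up toy instance (not from the paper): minimize f(x) = x_1^2 + x_2 over the unit sphere, with minimizer u = (0, −1); the multiplier is recovered by least squares.

```python
import numpy as np

# Hypothetical example: minimize f(x) = x1^2 + x2 subject to c(x) = 1 - x^T x = 0.
# The minimizer is u = (0, -1); grad f(u) = (0, 1) and grad c(u) = (0, 2).
u = np.array([0.0, -1.0])

grad_f = np.array([2 * u[0], 1.0])          # gradient of f at u
grad_c = np.array([-2 * u[0], -2 * u[1]])   # gradient of c at u

# Solve the (overdetermined) system grad_f = lambda * grad_c by least squares.
lam = np.linalg.lstsq(grad_c.reshape(-1, 1), grad_f, rcond=None)[0][0]

assert abs(1 - u @ u) < 1e-12               # feasibility: c(u) = 0
assert np.allclose(grad_f, lam * grad_c)    # stationarity, as in (1.3)
assert lam >= 0                             # sign condition, as in (1.5)
print(lam)  # 0.5
```

Here the multiplier λ = 1/2 is unique because ∇c(u) ≠ 0, i.e., the constraint qualification condition holds at u.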
2010 Mathematics Subject Classification. 65K05, 90C22, 90C26.
Key words and phrases. Lagrange multiplier, Lasserre relaxation, tight relaxation, polynomial optimization, critical point.
The second equation in (1.4) is called the complementarity condition. If λ_j + c_j(u) > 0 for all j ∈ I, the strict complementarity condition (SCC) is said to hold. For the λ_i's satisfying (1.3)-(1.5), the associated Lagrange function is

L(x) := f(x) − Σ_{i ∈ E∪I} λ_i c_i(x).
Under the constraint qualification condition, the second order necessary condition (SONC) holds at u, i.e. (∇² denotes the Hessian),

(1.6)  v^T (∇²L(u)) v ≥ 0  for all v ∈ ⋂_{i ∈ J(u)} ∇c_i(u)^⊥.

Here, ∇c_i(u)^⊥ is the orthogonal complement of ∇c_i(u). If it further holds that

(1.7)  v^T (∇²L(u)) v > 0  for all 0 ≠ v ∈ ⋂_{i ∈ J(u)} ∇c_i(u)^⊥,

then the second order sufficient condition (SOSC) is said to hold. If the constraint qualification condition holds at u, then (1.3), (1.4) and (1.6) are necessary conditions for u to be a local minimizer. If (1.3), (1.4), (1.7) and the strict complementarity condition hold, then u is a strict local minimizer.
1.2. Some existing work. Under the archimedean condition (see §2), the hierarchy of Lasserre's relaxations converges asymptotically [17]. Moreover, in addition to the archimedeanness, if the constraint qualification, strict complementarity and second order sufficient conditions hold at every global minimizer, then Lasserre's hierarchy converges in finitely many steps [33]. For convex polynomial optimization, Lasserre's hierarchy has finite convergence under the strict convexity or sos-convexity condition [7, 20]. For unconstrained polynomial optimization, the standard sum of squares relaxation was proposed in [35]. When the equality constraints define a finite set, Lasserre's hierarchy also has finite convergence, as shown in [18, 24, 31]. Recently, a bounded degree hierarchy of relaxations was proposed for solving polynomial optimization [23]. General introductions to polynomial optimization and moment problems can be found in the books and surveys [21, 22, 25, 26, 39]. Lasserre's relaxations provide lower bounds for the minimum value. There also exist methods that compute upper bounds [8, 19]. A convergence rate analysis for such upper bounds is given in [9, 10]. When a polynomial optimization problem does not have minimizers (i.e., the infimum is not achievable), there are relaxation methods for computing the infimum [38, 42].
A new type of Lasserre relaxations, based on Jacobian representations, was recently proposed in [30]. The hierarchy of such relaxations always has finite convergence when the tuple of constraining polynomials is nonsingular (i.e., at every point in C^n, the gradients of active constraining polynomials are linearly independent; see Definition 5.1). When there are only equality constraints c_1(x) = ··· = c_m(x) = 0, the method needs the maximal minors of the matrix

[∇f(x)  ∇c_1(x)  ···  ∇c_m(x)].

When there are inequality constraints, it requires enumerating all possibilities of active constraints. The method in [30] is expensive when there are a lot of constraints. For unconstrained optimization, it reduces to the gradient sum of squares relaxations in [27].
1.3. New contributions. When Lasserre's relaxations are used to solve polynomial optimization, the following issues are typically of concern:

• The convergence depends on the archimedean condition (see §2), which is satisfied only if the feasible set is compact. If the set is noncompact, how can we get convergent relaxations?
• The cost of Lasserre's relaxations depends significantly on the relaxation order. For a fixed order, can we construct tighter relaxations than the standard ones?
• When the convergence of Lasserre's relaxations is slow, can we construct new relaxations whose convergence is faster?
• When the optimality conditions fail to hold, Lasserre's hierarchy might not have finite convergence. Can we construct a new hierarchy of stronger relaxations that also has finite convergence for such cases?
This paper addresses the above issues. We construct tighter relaxations by using optimality conditions. In (1.3)-(1.4), under the constraint qualification condition, the Lagrange multipliers λ_i are uniquely determined by u. Consider the polynomial system in (x, λ):

(1.8)  Σ_{i ∈ E∪I} λ_i ∇c_i(x) = ∇f(x),  c_i(x) = 0 (i ∈ E),  λ_j c_j(x) = 0 (j ∈ I).
A point x satisfying (1.8) is called a critical point, and such (x, λ) is called a critical pair. In (1.8), once x is known, λ can be determined by linear equations. Generally, the value of x is not known. One can try to express λ as a rational function in x. Suppose E ∪ I = {1, ..., m} and denote

G(x) := [∇c_1(x)  ···  ∇c_m(x)].

When m ≤ n and rank G(x) = m, we can get the rational expression

(1.9)  λ = (G(x)^T G(x))^{−1} G(x)^T ∇f(x).
Typically, the matrix inverse (G(x)^T G(x))^{−1} is expensive for usage. The denominator det(G(x)^T G(x)) is typically a high degree polynomial. When m > n, G(x)^T G(x) is always singular and we cannot express λ as in (1.9).

Do there exist polynomials p_i (i ∈ E ∪ I) such that each

(1.10)  λ_i = p_i(x)

for all (x, λ) satisfying (1.8)? If they exist, then we can do the following:
• The polynomial system (1.8) can be simplified to

(1.11)  Σ_{i ∈ E∪I} p_i(x) ∇c_i(x) = ∇f(x),  c_i(x) = 0 (i ∈ E),  p_j(x) c_j(x) = 0 (j ∈ I).

• For each j ∈ I, the sign condition λ_j ≥ 0 is equivalent to

(1.12)  p_j(x) ≥ 0.

The new conditions (1.11) and (1.12) are only about the variable x, not λ. They can be used to construct tighter relaxations for solving (1.1).
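The substitution can be made concrete with a small sympy sketch. The instance below is made up (not from the paper): hypercube constraints c_j(x) = 1 − x_j^2, for which the paper later derives the multiplier expression λ_j = −(1/2) x_j f_{x_j}. We form the substituted system and check stationarity at one critical pair.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
x = [x1, x2]
f = x1**4 + x2**2 - x1*x2           # a made-up objective
c = [1 - x1**2, 1 - x2**2]          # hypercube constraints c_j(x) >= 0

# Polynomial multiplier expressions p_j(x) = -(1/2) x_j * df/dx_j for this case.
p = [-sp.Rational(1, 2) * x[j] * sp.diff(f, x[j]) for j in range(2)]

# Substituted system: grad f - sum_j p_j grad c_j = 0 and p_j c_j = 0;
# the sign conditions become p_j(x) >= 0.
phi = [sp.expand(sp.diff(f, x[i]) - sum(p[j] * sp.diff(c[j], x[i]) for j in range(2)))
       for i in range(2)] + [sp.expand(p[j] * c[j]) for j in range(2)]

# Sanity check at the critical point x = (1, 1): with lambda_j = p_j(1, 1),
# stationarity grad f = sum_j lambda_j grad c_j holds exactly.
vals = {x1: 1, x2: 1}
lam = [pj.subs(vals) for pj in p]
for i in range(2):
    assert sp.simplify(sp.diff(f, x[i]).subs(vals)
                       - sum(lam[j] * sp.diff(c[j], x[i]).subs(vals)
                             for j in range(2))) == 0
```

At this particular point the multipliers come out negative, so the sign conditions p_j(x) ≥ 0 exclude it: exactly the role they play in the new relaxations.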
When do there exist polynomials p_i satisfying (1.10)? If they exist, how can we compute them? How can we use them to construct tighter relaxations? Do the new relaxations have advantages over the old ones? These questions are the main topics of this paper. Our major results are:
• We show that the polynomials p_i satisfying (1.10) always exist when the tuple of constraining polynomials is nonsingular (see Definition 5.1). Moreover, they can be determined by linear equations.
• Using the new conditions (1.11)-(1.12), we can construct tight relaxations for solving (1.1). To be more precise, we construct a hierarchy of new relaxations, which has finite convergence. This is true even if the feasible set is noncompact and/or the optimality conditions fail to hold.
• For every relaxation order, the new relaxations are tighter than the standard ones in the prior work.
The paper is organized as follows. Section 2 reviews some basics in polynomial optimization. Section 3 constructs new relaxations and proves their tightness. Section 4 characterizes when the polynomials p_i satisfying (1.10) exist, and shows how to determine them, for polyhedral constraints. Section 5 discusses the case of general nonlinear constraints. Section 6 gives examples of using the new relaxations. Section 7 discusses some related issues.
2 Preliminaries
Notation. The symbol N (resp., R, C) denotes the set of nonnegative integral (resp., real, complex) numbers. The symbol R[x] = R[x_1, ..., x_n] denotes the ring of polynomials in x := (x_1, ..., x_n) with real coefficients. The R[x]_d stands for the set of real polynomials with degrees ≤ d. Denote

N^n_d := { α = (α_1, ..., α_n) ∈ N^n | |α| := α_1 + ··· + α_n ≤ d }.

For a polynomial p, deg(p) denotes its total degree. For t ∈ R, ⌈t⌉ denotes the smallest integer ≥ t. For an integer k > 0, denote [k] := {1, 2, ..., k}. For x = (x_1, ..., x_n) and α = (α_1, ..., α_n), denote

x^α := x_1^{α_1} ··· x_n^{α_n},  [x]_d := [1, x_1, ..., x_n, x_1^2, x_1x_2, ..., x_n^d]^T.

The superscript T denotes the transpose of a matrix/vector. The e_i denotes the i-th standard unit vector, while e denotes the vector of all ones. The I_m denotes the m-by-m identity matrix. By writing X ⪰ 0 (resp., X ≻ 0), we mean that X is a symmetric positive semidefinite (resp., positive definite) matrix. For matrices X_1, ..., X_r, diag(X_1, ..., X_r) denotes the block diagonal matrix whose diagonal blocks are X_1, ..., X_r. In particular, for a vector a, diag(a) denotes the diagonal matrix whose diagonal vector is a. For a function f in x, f_{x_i} denotes its partial derivative with respect to x_i.
We review some basics in computational algebra and polynomial optimization. They could be found in [4, 21, 22, 25, 26]. An ideal I of R[x] is a subset such that I · R[x] ⊆ I and I + I ⊆ I. For a tuple h = (h_1, ..., h_m) of polynomials, Ideal(h) denotes the smallest ideal containing all h_i, which is the set

h_1 · R[x] + ··· + h_m · R[x].

The 2k-th truncation of Ideal(h) is the set

Ideal(h)_{2k} := h_1 · R[x]_{2k−deg(h_1)} + ··· + h_m · R[x]_{2k−deg(h_m)}.

The truncation Ideal(h)_{2k} depends on the generators h_1, ..., h_m. For an ideal I, its complex and real varieties are respectively defined as

V_C(I) := { v ∈ C^n | p(v) = 0 ∀ p ∈ I },  V_R(I) := V_C(I) ∩ R^n.
A polynomial σ is said to be a sum of squares (SOS) if σ = s_1^2 + ··· + s_k^2 for some polynomials s_1, ..., s_k ∈ R[x]. The set of all SOS polynomials in x is denoted as Σ[x]. For a degree d, denote the truncation

Σ[x]_d := Σ[x] ∩ R[x]_d.

For a tuple g = (g_1, ..., g_t), its quadratic module is the set

Qmod(g) := Σ[x] + g_1 · Σ[x] + ··· + g_t · Σ[x].

The 2k-th truncation of Qmod(g) is the set

Qmod(g)_{2k} := Σ[x]_{2k} + g_1 · Σ[x]_{2k−deg(g_1)} + ··· + g_t · Σ[x]_{2k−deg(g_t)}.

The truncation Qmod(g)_{2k} depends on the generators g_1, ..., g_t. Denote

(2.1)  IQ(h, g) := Ideal(h) + Qmod(g),  IQ(h, g)_{2k} := Ideal(h)_{2k} + Qmod(g)_{2k}.

The set IQ(h, g) is said to be archimedean if there exists p ∈ IQ(h, g) such that p(x) ≥ 0 defines a compact set in R^n. If IQ(h, g) is archimedean, then

K := { x ∈ R^n | h(x) = 0, g(x) ≥ 0 }

must be a compact set. Conversely, if K is compact, say K ⊆ B(0, R) (the ball centered at 0 with radius R), then IQ(h, (g, R^2 − x^T x)) is always archimedean, and h = 0, (g, R^2 − x^T x) ≥ 0 give the same set K.
Theorem 2.1 (Putinar [36]). Let h, g be tuples of polynomials in R[x]. Let K be as above. Assume IQ(h, g) is archimedean. If a polynomial f ∈ R[x] is positive on K, then f ∈ IQ(h, g).

Interestingly, if f is only nonnegative on K but standard optimality conditions hold (see Subsection 1.1), then we still have f ∈ IQ(h, g) [33].
Let R^{N^n_d} be the space of real multi-sequences indexed by α ∈ N^n_d. A vector in R^{N^n_d} is called a truncated multi-sequence (tms) of degree d. A tms y = (y_α)_{α ∈ N^n_d} gives the Riesz functional R_y, acting on R[x]_d, as

(2.2)  R_y( Σ_{α ∈ N^n_d} f_α x^α ) := Σ_{α ∈ N^n_d} f_α y_α.

For f ∈ R[x]_d and y ∈ R^{N^n_d}, we denote

(2.3)  ⟨f, y⟩ := R_y(f).
Let q ∈ R[x]_{2k}. The k-th localizing matrix of q, generated by y ∈ R^{N^n_{2k}}, is the symmetric matrix L^{(k)}_q(y) such that

(2.4)  vec(a_1)^T ( L^{(k)}_q(y) ) vec(a_2) = R_y(q a_1 a_2)

for all a_1, a_2 ∈ R[x]_{k−⌈deg(q)/2⌉}. (The vec(a_i) denotes the coefficient vector of a_i.) When q = 1, L^{(k)}_q(y) is called a moment matrix and we denote

(2.5)  M_k(y) := L^{(k)}_1(y).

The columns and rows of L^{(k)}_q(y), as well as M_k(y), are indexed by α ∈ N^n with 2|α| + deg(q) ≤ 2k. When q = (q_1, ..., q_r) is a tuple of polynomials, we define

(2.6)  L^{(k)}_q(y) := diag( L^{(k)}_{q_1}(y), ..., L^{(k)}_{q_r}(y) ),

a block diagonal matrix. For the polynomial tuples h, g as above, the set

(2.7)  S(h, g)_{2k} := { y ∈ R^{N^n_{2k}} | L^{(k)}_h(y) = 0, L^{(k)}_g(y) ⪰ 0 }

is a spectrahedral cone in R^{N^n_{2k}}. The set IQ(h, g)_{2k} is also a convex cone in R[x]_{2k}. The dual cone of IQ(h, g)_{2k} is precisely S(h, g)_{2k} [22, 25, 34]. This is because ⟨p, y⟩ ≥ 0 for all p ∈ IQ(h, g)_{2k} and for all y ∈ S(h, g)_{2k}.
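For the univariate case n = 1 these objects are easy to build by hand. The sketch below (a made-up illustration, not from the paper) forms the moment matrix and a localizing matrix from the tms of a point evaluation y = [u]_4, where the moment matrix is Hankel and rank one.

```python
import numpy as np

k, u = 2, 0.5
y = np.array([u**j for j in range(2 * k + 1)])   # tms y = [u]_4 = (1, u, ..., u^4), n = 1

# Moment matrix M_k(y): Hankel, entry (i, j) = y_{i+j}, for 0 <= i, j <= k.
M = np.array([[y[i + j] for j in range(k + 1)] for i in range(k + 1)])

# Localizing matrix of q(x) = 1 - x^2 (deg q = 2): entry (i, j) = R_y(q * x^{i+j}),
# i.e. y_{i+j} - y_{i+j+2}, with rows/columns up to degree k - 1.
L = np.array([[y[i + j] - y[i + j + 2] for j in range(k)] for i in range(k)])

assert np.linalg.matrix_rank(M) == 1             # a point evaluation gives rank-one M_k(y)
assert np.all(np.linalg.eigvalsh(M) >= -1e-12)   # M_k(y) is positive semidefinite
assert np.all(np.linalg.eigvalsh(L) >= -1e-12)   # q(u) >= 0 makes the localizing matrix PSD
```

Here L equals q(u)·[u]_1[u]_1^T, so its positive semidefiniteness reflects exactly the feasibility q(u) ≥ 0 of the point u.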
3 The construction of tight relaxations
Consider the polynomial optimization problem (1.1). Let

λ := (λ_i)_{i ∈ E∪I}

be the vector of Lagrange multipliers. Denote the set

(3.1)  K := { (x, λ) ∈ R^n × R^{E∪I} | c_i(x) = 0 (i ∈ E), λ_j c_j(x) = 0 (j ∈ I), ∇f(x) = Σ_{i ∈ E∪I} λ_i ∇c_i(x) }.

Each point in K is called a critical pair. The projection

(3.2)  K_c := { u | (u, λ) ∈ K }

is the set of all real critical points. To construct tight relaxations for solving (1.1), we need the following assumption for Lagrange multipliers.
Assumption 3.1. For each i ∈ E ∪ I, there exists a polynomial p_i ∈ R[x] such that for all (x, λ) ∈ K it holds that

λ_i = p_i(x).
Assumption 3.1 is generically satisfied, as shown in Proposition 5.7. For the following special cases, we can get polynomials p_i explicitly.

• (Simplex) For the simplex { e^T x − 1 = 0, x_1 ≥ 0, ..., x_n ≥ 0 }, it corresponds to that E = {0}, I = [n], c_0(x) = e^T x − 1, c_j(x) = x_j (j ∈ [n]). The Lagrange multipliers can be expressed as

(3.3)  λ_0 = x^T ∇f(x),  λ_j = f_{x_j} − x^T ∇f(x)  (j ∈ [n]).

• (Hypercube) For the hypercube [−1, 1]^n, it corresponds to that E = ∅, I = [n] and each c_j(x) = 1 − x_j^2. We can show that

(3.4)  λ_j = −(1/2) x_j f_{x_j}  (j ∈ [n]).

• (Ball or sphere) The constraint is 1 − x^T x = 0 or 1 − x^T x ≥ 0. It corresponds to that E ∪ I = {1} and c_1 = 1 − x^T x. We have

(3.5)  λ_1 = −(1/2) x^T ∇f(x).

• (Triangular constraints) Suppose E ∪ I = {1, ..., m} and each

c_i(x) = τ_i x_i + q_i(x_{i+1}, ..., x_n)

for some polynomials q_i ∈ R[x_{i+1}, ..., x_n] and scalars τ_i ≠ 0. The matrix T(x), consisting of the first m rows of [∇c_1(x), ..., ∇c_m(x)], is an invertible lower triangular matrix with constant diagonal entries. Then

λ = T(x)^{−1} · [f_{x_1}, ..., f_{x_m}]^T.

Note that the inverse T(x)^{−1} is a matrix polynomial.
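The triangular case is easy to check symbolically. The sympy sketch below (a made-up instance, not from the paper) builds T(x) for two triangular constraints and confirms that T(x)^{−1}, and hence the multiplier vector, is polynomial in x.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 * x2 + x2**3                # a made-up objective
# Triangular constraints c_i = tau_i x_i + q_i(x_{i+1}, ...): tau_1 = 2, tau_2 = 3.
c = [2*x1 + x2**2 - 1, 3*x2 + 1]

# T(x): first m = 2 rows of [grad c_1, grad c_2] -- lower triangular,
# with constant diagonal entries (tau_1, tau_2), so det T(x) is a nonzero constant.
T = sp.Matrix([[sp.diff(c[j], xi) for j in range(2)] for xi in (x1, x2)])
Tinv = T.inv()                        # the inverse is again a matrix polynomial

lam = (Tinv * sp.Matrix([sp.diff(f, x1), sp.diff(f, x2)])).applyfunc(sp.expand)

# Each multiplier expression is a polynomial in (x1, x2): no denominators appear.
for entry in lam:
    assert sp.together(entry).is_polynomial(x1, x2)
```

The key point is that det T(x) is the constant τ_1 ··· τ_m, so inverting T(x) never introduces a polynomial denominator.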
For more general constraints, we can also express λ as a polynomial function in x on the set K_c. This will be discussed in §4 and §5.

For the polynomials p_i as in Assumption 3.1, denote

(3.6)  φ := ( ∇f − Σ_{i ∈ E∪I} p_i ∇c_i,  (p_j c_j)_{j ∈ I} ),  ψ := (p_j)_{j ∈ I}.
When the minimum value f_min of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem (denoting c_eq := (c_i)_{i ∈ E} and c_in := (c_j)_{j ∈ I})

(3.7)  f_c := min f(x)  s.t.  c_eq(x) = 0, c_in(x) ≥ 0, φ(x) = 0, ψ(x) ≥ 0.

We apply Lasserre relaxations to solve it. For an integer k > 0 (called the relaxation order), the k-th order Lasserre's relaxation for (3.7) is

(3.8)  f'_k := min ⟨f, y⟩  s.t.  ⟨1, y⟩ = 1, M_k(y) ⪰ 0, L^{(k)}_{c_eq}(y) = 0, L^{(k)}_{c_in}(y) ⪰ 0, L^{(k)}_φ(y) = 0, L^{(k)}_ψ(y) ⪰ 0, y ∈ R^{N^n_{2k}}.
Since x^0 = 1 (the constant one polynomial), the condition ⟨1, y⟩ = 1 means that (y)_0 = 1. The dual optimization problem of (3.8) is

(3.9)  f_k := max γ  s.t.  f − γ ∈ IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k}.

We refer to §2 for the notation used in (3.8)-(3.9). They are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For k = 1, 2, ..., we get a hierarchy of Lasserre relaxations. In (3.8)-(3.9), if we remove the usage of φ and ψ, they are reduced to standard Lasserre relaxations in [17]. So (3.8)-(3.9) are stronger relaxations.
By the construction of φ as in (3.6), Assumption 3.1 implies that

K_c = { u ∈ R^n : c_eq(u) = 0, φ(u) = 0 }.

By Lemma 3.3 of [6], f achieves only finitely many values on K_c, say

(3.10)  v_1 < ··· < v_N.

A point u ∈ K_c might not be feasible for (3.7), i.e., it is possible that c_in(u) ≱ 0 or ψ(u) ≱ 0. In applications, we are often interested in the optimal value f_c of (3.7). When (3.7) is infeasible, by convention, we set

f_c = +∞.

When the optimal value f_min of (1.1) is achieved at a critical point, f_c = f_min. This is the case if the feasible set is compact, or if f is coercive (i.e., for each ℓ, the sublevel set { f(x) ≤ ℓ } is compact) and the constraint qualification condition holds. As in [17], one can show that

(3.11)  f_k ≤ f'_k ≤ f_c

for all k. Moreover, f_k and f'_k are both monotonically increasing. If for some order k it occurs that

f_k = f'_k = f_c,

then the k-th order Lasserre's relaxation is said to be tight (or exact).
3.1. Tightness of the relaxations. Let c_in, ψ, K_c, f_c be as above. We refer to §2 for the notation Qmod(c_in, ψ). We begin with a general assumption.

Assumption 3.2. There exists ρ ∈ Qmod(c_in, ψ) such that if u ∈ K_c and f(u) < f_c, then ρ(u) < 0.

In Assumption 3.2, the hypersurface ρ(x) = 0 separates feasible and infeasible critical points. Clearly, if u ∈ K_c is a feasible point for (3.7), then c_in(u) ≥ 0 and ψ(u) ≥ 0, and hence ρ(u) ≥ 0. Assumption 3.2 generally holds. For instance, it is satisfied for the following general cases:
a) When there are no inequality constraints, c_in and ψ are empty tuples. Then Qmod(c_in, ψ) = Σ[x] and Assumption 3.2 is satisfied for ρ = 0.

b) Suppose the set K_c is finite, say K_c = {u_1, ..., u_D}, and

f(u_1), ..., f(u_{t−1}) < f_c ≤ f(u_t), ..., f(u_D).

Let ℓ_1, ..., ℓ_D be real interpolating polynomials such that ℓ_i(u_j) = 1 for i = j and ℓ_i(u_j) = 0 for i ≠ j. For each i = 1, ..., t−1, there must exist j_i ∈ I such that c_{j_i}(u_i) < 0. Then the polynomial

(3.12)  ρ := Σ_{i<t} (−1/c_{j_i}(u_i)) c_{j_i}(x) ℓ_i(x)^2 + Σ_{i≥t} ℓ_i(x)^2

satisfies Assumption 3.2.

c) For each x with f(x) = v_i < f_c, at least one of the constraints c_j(x) ≥ 0, p_j(x) ≥ 0 (j ∈ I) is violated. Suppose for each critical value v_i < f_c there exists g_i ∈ {c_j, p_j}_{j∈I} such that

g_i < 0 on K_c ∩ { f(x) = v_i }.

Let ϕ_1, ..., ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Suppose v_t = f_c. Then the polynomial

(3.13)  ρ := Σ_{i<t} g_i(x) (ϕ_i(f(x)))^2 + Σ_{i≥t} (ϕ_i(f(x)))^2

satisfies Assumption 3.2.
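The construction in case b) can be carried out explicitly on a tiny made-up instance (not from the paper): K_c = {−1, 1} on the real line with the single constraint c(x) = x ≥ 0, where u_1 = −1 is an infeasible critical point.

```python
import sympy as sp

x = sp.symbols('x')
# Made-up instance of case b): Kc = {-1, 1}, constraint c(x) = x >= 0,
# and u1 = -1 is infeasible since c(u1) = -1 < 0.
u = [-1, 1]
c = x

# Interpolating polynomials: ell_i(u_j) = 1 if i == j, else 0.
ell = [(1 - x) / 2, (1 + x) / 2]
for i in range(2):
    for j in range(2):
        assert ell[i].subs(x, u[j]) == (1 if i == j else 0)

# rho built as in the separating construction: the term for the infeasible point
# is scaled by -1/c(u1), so rho is negative there and nonnegative at feasible points.
rho = sp.expand((-1 / c.subs(x, u[0])) * c * ell[0]**2 + ell[1]**2)
assert rho.subs(x, u[0]) < 0    # rho(-1) = -1: infeasible critical point separated
assert rho.subs(x, u[1]) >= 0   # rho(1) = 1
```

Note that ρ lies in Qmod(c_in, ψ) by construction: the first term is c(x) times a square (with a positive scalar), and the second term is a square.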
We refer to §2 for the archimedean condition and the notation IQ(h, g) as in (2.1). The following is about the convergence of relaxations (3.8)-(3.9).

Theorem 3.3. Suppose K_c ≠ ∅ and Assumption 3.1 holds. If

i) IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean, or
ii) IQ(c_eq, c_in) is archimedean, or
iii) Assumption 3.2 holds,

then f_k = f'_k = f_c for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_k = f'_k = f_min for all k big enough, if one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. The condition ii) is only about constraining polynomials of (1.1). It can be checked without φ, ψ. Clearly, the condition ii) implies the condition i).
The proof for Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties. The objective f is a constant on each one of them. We can get an SOS type representation for f on each subvariety, and then construct a single one for f over the entire set of critical points.
Proof of Theorem 3.3. Clearly, every point in the complex variety

K_1 := { x ∈ C^n | c_eq(x) = 0, φ(x) = 0 }

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on K_c = K_1 ∩ R^n, say they are v_1 < ··· < v_N. Up to the shifting of a constant in f, we can further assume that f_c = 0. Clearly, f_c equals one of v_1, ..., v_N, say v_t = f_c = 0.
Case I: Assume IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean. Let

I := Ideal(c_eq, φ),

the critical ideal. Note that K_1 = V_C(I). The variety V_C(I) is a union of irreducible subvarieties, say V_1, ..., V_ℓ. If V_i ∩ R^n ≠ ∅, then f is a real constant on V_i, which equals one of v_1, ..., v_N. This can be implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_C(I):

T_i := K_1 ∩ { f(x) = v_i }  (i = t, ..., N).

Let T_{t−1} be the union of the irreducible subvarieties V_i such that either V_i ∩ R^n = ∅, or f ≡ v_j on V_i with v_j < v_t = f_c. Then it holds that

V_C(I) = T_{t−1} ∪ T_t ∪ ··· ∪ T_N.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, I_t, ..., I_N ⊆ R[x] such that

I = I_{t−1} ∩ I_t ∩ ··· ∩ I_N

and T_i = V_C(I_i) for all i = t−1, t, ..., N. Denote the semialgebraic set

(3.14)  S := { x ∈ R^n | c_in(x) ≥ 0, ψ(x) ≥ 0 }.

For i = t−1, we have V_R(I_{t−1}) ∩ S = ∅, because v_1, ..., v_{t−1} < f_c. By the Positivstellensatz [2, Corollary 4.4.3], there exists p_0 ∈ Preord(c_in, ψ)¹ satisfying 2 + p_0 ∈ I_{t−1}. Note that 1 + p_0 > 0 on V_R(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(c_in, ψ) is archimedean, because I ⊆ I_{t−1} and

IQ(c_eq, c_in) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(c_in, ψ).
By Theorem 2.1, we have

p_1 := 1 + p_0 ∈ I_{t−1} + Qmod(c_in, ψ).

Then 1 + p_1 ∈ I_{t−1}. There exists p_2 ∈ Qmod(c_in, ψ) such that

−1 ≡ p_1 ≡ p_2  mod I_{t−1}.

Since f = (f/4 + 1)^2 − 1 · (f/4 − 1)^2, we have

f ≡ σ_{t−1} := (f/4 + 1)^2 + p_2 (f/4 − 1)^2  mod I_{t−1}.

So, when k is big enough, we have σ_{t−1} ∈ Qmod(c_in, ψ)_{2k}.

¹It is the preordering of the polynomial tuple (c_in, ψ); see §7.1.
For i = t, v_t = 0 and f(x) vanishes on V_C(I_t). By Hilbert's Strong Nullstellensatz [4], there exists an integer m_t > 0 such that f^{m_t} ∈ I_t. Define the polynomial

s_t(ε) := √ε Σ_{j=0}^{m_t−1} (1/2 choose j) ε^{−j} f^j,

where (1/2 choose j) is the generalized binomial coefficient. Then we have that

D_1 := ε(1 + ε^{−1} f) − (s_t(ε))^2 ≡ 0  mod I_t.

This is because, in the subtraction of D_1, after expanding (s_t(ε))^2, all the terms f^j with j < m_t are cancelled, and f^j ∈ I_t for j ≥ m_t. So D_1 ∈ I_t. Let σ_t(ε) := s_t(ε)^2; then f + ε − σ_t(ε) = D_1 and

(3.15)  f + ε − σ_t(ε) = Σ_{j=0}^{m_t−2} b_j(ε) f^{m_t+j}

for some real scalars b_j(ε) depending on ε.
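The cancellation behind (3.15) is just the truncated Taylor series of √(ε + f): squaring the truncation reproduces ε + f up to terms of f-degree ≥ m_t. A sympy check (an illustration, not from the paper) with m_t = 3:

```python
import sympy as sp

f, eps = sp.symbols('f epsilon', positive=True)
m_t = 3

# s_t(eps) = sqrt(eps) * sum_{j < m_t} binom(1/2, j) eps^{-j} f^j:
# the truncation of the series of sqrt(eps + f) = sqrt(eps) * sqrt(1 + f/eps).
s = sp.sqrt(eps) * sum(sp.binomial(sp.Rational(1, 2), j) * eps**(-j) * f**j
                       for j in range(m_t))

# D_1 = eps*(1 + eps^{-1} f) - s_t(eps)^2: every monomial of f-degree < m_t cancels,
# which is exactly why D_1 lies in I_t when f^{m_t} does.
D1 = sp.expand(eps * (1 + f / eps) - s**2)
assert all(D1.coeff(f, j) == 0 for j in range(m_t))   # low-order terms cancel
assert D1.coeff(f, m_t) != 0                          # first surviving power is f^{m_t}
```

With m_t = 3 the remainder is f^3/(8ε^2) − f^4/(64ε^3), matching the form Σ_j b_j(ε) f^{m_t+j} in (3.15).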
For each i = t+1, ..., N, v_i > 0 and f(x)/v_i − 1 vanishes on V_C(I_i). By Hilbert's Strong Nullstellensatz [4], there exists 0 < m_i ∈ N such that (f/v_i − 1)^{m_i} ∈ I_i. Let

s_i := √v_i Σ_{j=0}^{m_i−1} (1/2 choose j) (f/v_i − 1)^j.

Like for the case i = t, we can similarly show that f − s_i^2 ∈ I_i. Let σ_i := s_i^2; then f − σ_i ∈ I_i.
Note that V_C(I_i) ∩ V_C(I_j) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, ..., a_N ∈ R[x] such that

a_{t−1}^2 + ··· + a_N^2 − 1 ∈ I,  a_i ∈ ⋂_{i≠j ∈ {t−1,...,N}} I_j.

For ε > 0, denote the polynomial

σ_ε := σ_t(ε) a_t^2 + Σ_{t≠j ∈ {t−1,...,N}} (σ_j + ε) a_j^2;

then

f + ε − σ_ε = (f + ε)(1 − a_{t−1}^2 − ··· − a_N^2) + Σ_{t≠i ∈ {t−1,...,N}} (f − σ_i) a_i^2 + (f + ε − σ_t(ε)) a_t^2.
For each i ≠ t, f − σ_i ∈ I_i, so

(f − σ_i) a_i^2 ∈ ⋂_{j=t−1}^{N} I_j = I.

Hence, there exists k_1 > 0 such that

(f − σ_i) a_i^2 ∈ I_{2k_1}  (t ≠ i ∈ {t−1, ..., N}).

Since f + ε − σ_t(ε) ∈ I_t, we also have

(f + ε − σ_t(ε)) a_t^2 ∈ ⋂_{j=t−1}^{N} I_j = I.

Moreover, by the equation (3.15),

(f + ε − σ_t(ε)) a_t^2 = Σ_{j=0}^{m_t−2} b_j(ε) f^{m_t+j} a_t^2.
Each f^{m_t+j} a_t^2 ∈ I, since f^{m_t+j} ∈ I_t. So there exists k_2 > 0 such that, for all ε > 0,

(f + ε − σ_t(ε)) a_t^2 ∈ I_{2k_2}.

Since 1 − a_{t−1}^2 − ··· − a_N^2 ∈ I, there also exists k_3 > 0 such that, for all ε > 0,

(f + ε)(1 − a_{t−1}^2 − ··· − a_N^2) ∈ I_{2k_3}.

Hence, if k* ≥ max{k_1, k_2, k_3}, then we have

f(x) + ε − σ_ε ∈ I_{2k*}

for all ε > 0. By the construction, the degrees of all σ_i and a_i are independent of ε. So σ_ε ∈ Qmod(c_in, ψ)_{2k*} for all ε > 0, if k* is big enough. Note that

I_{2k*} + Qmod(c_in, ψ)_{2k*} = IQ(c_eq, c_in)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ f_c − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ f_c. So f_{k*} = f_c. Moreover, since f_k is monotonically increasing, we must have f_k = f_c for all k ≥ k*.
Case II: Assume IQ(c_eq, c_in) is archimedean. Because

IQ(c_eq, c_in) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ),

the set IQ(c_eq, c_in) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.
Case III: Suppose Assumption 3.2 holds. Let ϕ_1, ..., ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Let

s := s_t + ··· + s_N,  where each s_i := (v_i − f_c)(ϕ_i(f))^2.

Then s ∈ Σ[x]_{2k_4} for some integer k_4 > 0. Let

f̃ := f − f_c − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(c_in, ψ) such that

f̃^{2ℓ} + q ∈ Ideal(c_eq, φ).

This is because, by Assumption 3.2, f̃(x) ≡ 0 on the set

K_2 := { x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }.

It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ N and q = b_0 + ρ b_1 (b_0, b_1 ∈ Σ[x]) such that f̃^{2ℓ} + q ∈ Ideal(c_eq, φ). By Assumption 3.2, ρ ∈ Qmod(c_in, ψ), so we have q ∈ Qmod(c_in, ψ).
For all ε > 0 and τ > 0, we have f̃ + ε = φ_ε + θ_ε, where

φ_ε := −τ ε^{1−2ℓ} ( f̃^{2ℓ} + q ),  θ_ε := ε( 1 + f̃/ε + τ (f̃/ε)^{2ℓ} ) + τ ε^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k_5 such that, for all ε > 0,

φ_ε ∈ Ideal(c_eq, φ)_{2k_5},  θ_ε ∈ Qmod(c_in, ψ)_{2k_5}.

Hence, we can get

f − (f_c − ε) = φ_ε + σ_ε,

where σ_ε := θ_ε + s ∈ Qmod(c_in, ψ)_{2k_5} for all ε > 0. Note that

IQ(c_eq, c_in)_{2k_5} + IQ(φ, ψ)_{2k_5} = Ideal(c_eq, φ)_{2k_5} + Qmod(c_in, ψ)_{2k_5}.

For all ε > 0, γ = f_c − ε is feasible in (3.9) for the order k_5, so f_{k_5} ≥ f_c. Because of (3.11) and the monotonicity of f_k, we have f_k = f'_k = f_c for all k ≥ k_5. ∎
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is f_c, and the optimal value of (1.1) is f_min. If f_min is achievable at a critical point, then f_c = f_min. In Theorem 3.3, we have shown that f_k = f_c for all k big enough, where f_k is the optimal value of (3.9). The value f_c or f_min is often not known. How do we detect the tightness f_k = f_c in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16)  d := ⌈deg(c_eq, c_in, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)  rank M_t(y*) = rank M_{t−d}(y*),

then f_k = f_c and we can get r := rank M_t(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
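The flat truncation test is a pure rank computation on moment matrices. The numpy sketch below (a made-up illustration, not from the paper) checks it for n = 1 on the tms of a single point, where both ranks equal one and exactly one minimizer would be extracted.

```python
import numpy as np

u, k, d = 0.7, 3, 1
y = np.array([u**j for j in range(2 * k + 1)])   # tms [u]_{2k} of the point u

def moment_matrix(y, t):
    """M_t(y) for n = 1: Hankel matrix with entries y_{i+j}."""
    return np.array([[y[i + j] for j in range(t + 1)] for i in range(t + 1)])

# Flat truncation: rank M_t(y) = rank M_{t-d}(y) for some t in [d, k].
t = 2
r1 = np.linalg.matrix_rank(moment_matrix(y, t))
r0 = np.linalg.matrix_rank(moment_matrix(y, t - d))
assert r1 == r0 == 1   # rank one: exactly one minimizer is extracted
```

For a tms coming from r distinct point evaluations, both ranks would equal r, and r candidate minimizers could be extracted as in [14].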
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., f_c < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = f_c.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17), for some t ∈ [d, k], when k is big enough.
Proof. By Assumption 3.1, u is a critical point if and only if c_eq(u) = 0, φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

{ x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(c_eq, φ) + Qmod(ρ). By Assumption 3.2,

Ideal(c_eq, φ) + Qmod(ρ) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, f achieves finitely many values on K_c, so (3.7) must achieve its optimal value f_c. By Theorem 3.3, we know that f_k = f'_k = f_c for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).
iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say u_1, ..., u_r, such that f_k = f(u_i) for each i. Clearly, f_k = f(u_i) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)  min f(x)  s.t.  c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its k-th order Lasserre's relaxation is

(3.19)  γ'_k := min ⟨f, y⟩  s.t.  ⟨1, y⟩ = 1, M_k(y) ⪰ 0, L^{(k)}_{c_eq}(y) = 0, L^{(k)}_φ(y) = 0, L^{(k)}_ρ(y) ⪰ 0.

Its dual optimization problem is

(3.20)  γ_k := max γ  s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

γ_k = γ'_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. ∎

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].
4 Polyhedral constraints
In this section, we assume the feasible set of (1.1) is the polyhedron

P := { x ∈ R^n | Ax − b ≥ 0 },

where A = [a_1 ··· a_m]^T ∈ R^{m×n}, b = [b_1 ··· b_m]^T ∈ R^m. This corresponds to that E = ∅, I = [m] and each c_i(x) = a_i^T x − b_i. Denote

(4.1)  D(x) := diag(c_1(x), ..., c_m(x)),  C(x) := [ A^T ; D(x) ],

the (n+m)-by-m matrix obtained by stacking A^T on top of D(x).
The Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies

(4.2)  [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

(4.3)  λ = (A A^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(4.4)  L(x) C(x) = I_m,

then we can get

λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),

where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists, and give a degree bound for it.
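When rank A = m, the closed form (4.3) can be verified symbolically. The sympy sketch below uses a made-up invertible A (so m = n), in which case A^T λ = ∇f(x) holds identically in x, not only at critical points.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**3 + x1*x2 + x2**2                 # a made-up objective
A = sp.Matrix([[1, 1], [1, -1]])          # rank A = m = 2, so (4.3) applies
grad_f = sp.Matrix([sp.diff(f, x1), sp.diff(f, x2)])

# lambda = (A A^T)^{-1} A grad f(x): a polynomial vector in x here,
# since (A A^T)^{-1} is a constant matrix.
lam = (A * A.T).inv() * A * grad_f

# With A square and invertible, A^T (A A^T)^{-1} A = I, so the first block
# row of the system A^T lambda = grad f holds for every x.
assert sp.simplify(A.T * lam - grad_f) == sp.zeros(2, 1)
```

For rank A = m < n the same identity holds only on the critical set, which is why the matrix-polynomial left inverse L(x) of (4.4) is needed in general.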
The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (also see Definition 5.1). This is equivalent to that, for every u, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then a_{i_1}, ..., a_{i_k} are linearly independent.

Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.

Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.
For a subset I = {i_1, . . . , i_{m−n}} of [m], denote

c_I(x) = ∏_{i∈I} c_i(x),   E_I(x) = c_I(x) · diag( c_{i_1}(x)^{−1}, . . . , c_{i_{m−n}}(x)^{−1} ),

D_I(x) = diag( c_{i_1}(x), . . . , c_{i_{m−n}}(x) ),   A_I = [ a_{i_1} ··· a_{i_{m−n}} ]^T.

For the case that I = ∅ (the empty set), we set c_∅(x) = 1. Let

V = { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5)   L_I(x) C(x) = c_I(x) I_m.
The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix of L_I whose row indices are from J and whose column indices are from K):

(4.6)
   J \ K     [n]                       n + I                                n + [m]\I
   I         0                         E_I(x)                               0
   [m]\I     c_I(x)·(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T} (A_I)^T E_I(x)     0

Equivalently, the blocks of L_I are

L_I( I, [n] ) = 0,   L_I( I, n + [m]\I ) = 0,   L_I( [m]\I, n + [m]\I ) = 0,

L_I( I, n + I ) = E_I(x),   L_I( [m]\I, [n] ) = c_I(x) (A_{[m]\I})^{−T},

L_I( [m]\I, n + I ) = −(A_{[m]\I})^{−T} (A_I)^T E_I(x).
For each I ∈ V, the matrix A_{[m]\I} is invertible; the superscript −T denotes the inverse of the transpose. Let G = L_I(x)C(x); then one can verify that

G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},   G(I, [m]\I) = 0,

G([m]\I, [m]\I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T} A_I^T E_I(x) ] · [ (A_{[m]\I})^T ; 0 ] = c_I(x) I_n,

G([m]\I, I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T} (A_I)^T E_I(x) ] · [ A_I^T ; D_I(x) ] = 0.

This shows that the above L_I(x) satisfies (4.5).
Step II: We show that there exist real scalars ν_I satisfying

(4.7)   Σ_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.

• When m > n, let

(4.8)   N = { i ∈ [m] : rank A_{[m]\{i}} = n }.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(4.9)   Σ_{I′∈V_i} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, . . . , a_n} is linearly independent. So there exist scalars α_1, . . . , α_n such that

a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows; so (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

{ i : α_i ≠ 0 } = { i_1, . . . , i_k }.

Then a_{i_1}, . . . , a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence there exist real scalars μ_1, . . . , μ_{k+1} such that

μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

This can be deduced from the echelon form of an inconsistent linear system. Note that i_1, . . . , i_{k+1} ∈ N. For each j = 1, . . . , k+1, by (4.9),

Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = 1.

Then we can get

1 = Σ_{j=1}^{k+1} μ_j c_{i_j}(x) = Σ_{j=1}^{k+1} μ_j Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{i_j}(x) c_{I′}(x) = Σ_{I = I′∪{i_j}, I′∈V_{i_j}, 1≤j≤k+1} ν^{(i_j)}_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.
Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)   L(x) = Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

L(x)C(x) = Σ_{I∈V} ν_I L_I(x)C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x) satisfying L(x)C(x) = I_m for polyhedral sets.
Example 4.2. Consider the simplicial set

x_1 ≥ 0, . . . , x_n ≥ 0,  1 − e^T x ≥ 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ 1−x_1   −x_2  ···   −x_n  1 ··· 1
          −x_1  1−x_2  ···   −x_n  1 ··· 1
            ⋮      ⋮    ⋱      ⋮   ⋮     ⋮
          −x_1   −x_2  ···  1−x_n  1 ··· 1
          −x_1   −x_2  ···   −x_n  1 ··· 1 ].
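The identity L(x)C(x) = I_{n+1} above is a polynomial identity, so it can be spot-checked numerically at arbitrary sample points. A minimal pure-Python sketch for n = 3 (the random sample points are hypothetical; this is only a sanity check, not part of the method):

```python
# Numerical spot check of L(x)C(x) = I_{n+1} for the simplicial set,
# where C(x) stacks A^T over D(x) as in (4.1).  Here n = 3, m = n + 1.
import random

n = 3
def check(x):
    s = sum(x)
    c = x + [1.0 - s]                       # c_i = x_i, c_{n+1} = 1 - e^T x
    # A^T = [I_n | -e], since a_i = e_i (i <= n) and a_{n+1} = -e
    At = [[(1.0 if i == j else 0.0) for j in range(n)] + [-1.0] for i in range(n)]
    D = [[c[i] if i == j else 0.0 for j in range(n + 1)] for i in range(n + 1)]
    C = At + D                              # (2n+1) x (n+1)
    # L(x): entry (i, j) of first n columns is delta_ij - x_j; rest are ones
    L = [[(1.0 if i == j else 0.0) - x[j] for j in range(n)] + [1.0] * (n + 1)
         for i in range(n + 1)]
    LC = [[sum(L[i][k] * C[k][j] for k in range(2 * n + 1)) for j in range(n + 1)]
          for i in range(n + 1)]
    return all(abs(LC[i][j] - (1.0 if i == j else 0.0)) < 1e-9
               for i in range(n + 1) for j in range(n + 1))

random.seed(0)
ok = all(check([random.uniform(-2, 2) for _ in range(n)]) for _ in range(5))
assert ok
```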
Example 4.3. Consider the box constraints

x_1 ≥ 0, . . . , x_n ≥ 0,  1 − x_1 ≥ 0, . . . , 1 − x_n ≥ 0.

The equation L(x)C(x) = I_{2n} is satisfied by

L(x) = [ I_n − diag(x)   I_n   I_n
           −diag(x)      I_n   I_n ].
Example 4.4. Consider the polyhedral set

1 − x_4 ≥ 0,  x_4 − x_3 ≥ 0,  x_3 − x_2 ≥ 0,  x_2 − x_1 ≥ 0,  x_1 + 1 ≥ 0.

The equation L(x)C(x) = I_5 is satisfied by

L(x) = (1/2) · [ −x_1−1   −x_2−1   −x_3−1   −x_4−1   1  1  1  1  1
                 −x_1−1   −x_2−1   −x_3−1    1−x_4   1  1  1  1  1
                 −x_1−1   −x_2−1    1−x_3    1−x_4   1  1  1  1  1
                 −x_1−1    1−x_2    1−x_3    1−x_4   1  1  1  1  1
                  1−x_1    1−x_2    1−x_3    1−x_4   1  1  1  1  1 ].
Example 4.5. Consider the polyhedral set

1 + x_1 ≥ 0,  1 − x_1 ≥ 0,  2 − x_1 − x_2 ≥ 0,  2 − x_1 + x_2 ≥ 0.

The matrix L(x) satisfying L(x)C(x) = I_4 is

(1/6) · [ x_1^2 − 3x_1 + 2     x_1x_2 − x_2         4 − x_1    2 − x_1   1 − x_1     1 − x_1
          3x_1^2 − 3x_1 − 6    3x_2 + 3x_1x_2       6 − 3x_1   −3x_1     −3x_1 − 3   −3x_1 − 3
          1 − x_1^2            −2x_2 − x_1x_2 − 3   x_1 − 1    x_1 + 1   x_1 + 2     x_1 + 2
          1 − x_1^2            3 − x_1x_2 − 2x_2    x_1 − 1    x_1 + 1   x_1 + 2     x_1 + 2 ].
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.
Suppose there are altogether m equality and inequality constraints, i.e.,

E ∪ I = {1, . . . , m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1, . . . , λ_m]^T satisfies the equation

(5.1)   [ ∇c_1(x)  ∇c_2(x)  ···  ∇c_m(x)
          c_1(x)      0     ···     0
            0      c_2(x)   ···     0
            ⋮         ⋮      ⋱      ⋮
            0         0     ···  c_m(x) ]  λ  =  [ ∇f(x) ; 0 ; ··· ; 0 ],

where the (n+m)×m matrix on the left-hand side is denoted by C(x). Let C(x) be as in the above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)   L(x)C(x) = I_m,

then we can get

(5.3)   λ = L(x) [ ∇f(x) ; 0 ] = L_1(x)∇f(x),
where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.
Definition 5.1. The tuple c = (c_1, . . . , c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to: for each u ∈ C^n, if J(u) = {i_1, . . . , i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), . . . , ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For each W(x) ∈ C[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

P(x)W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ C^n,

t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

W(x) = [ w_1(x)  w_2(x)  ···  w_t(x) ].
Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, . . . , t, denote

r_{1i}(x) = ξ_1(x)^T w_i(x);

then (we use ∼ to denote row equivalence between matrices)

W(x) ∼ [    1      r_{12}(x)  ···  r_{1t}(x)
         w_1(x)    w_2(x)     ···  w_t(x)   ]  ∼  W_1(x) = [ 1   r_{12}(x)     ···  r_{1t}(x)
                                                             0   w_2^{(1)}(x)  ···  w_t^{(1)}(x) ],

where each (i = 2, . . . , t)

w_i^{(1)}(x) = w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ C[x]^{(s+1)×s} such that

P_1(x)W(x) = W_1(x).
Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

w_2^{(1)}(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^s such that

ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, . . . , t, let r_{2i}(x) = ξ_2(x)^T w_i^{(1)}(x); then
W_1(x) ∼ [ 1   r_{12}(x)      r_{13}(x)      ···  r_{1t}(x)
           0   1              r_{23}(x)      ···  r_{2t}(x)
           0   w_2^{(1)}(x)   w_3^{(1)}(x)   ···  w_t^{(1)}(x) ]

       ∼ W_2(x) = [ 1   r_{12}(x)   r_{13}(x)      ···  r_{1t}(x)
                    0   1           r_{23}(x)      ···  r_{2t}(x)
                    0   0           w_3^{(2)}(x)   ···  w_t^{(2)}(x) ],

where each (i = 3, . . . , t)

w_i^{(2)}(x) = w_i^{(1)}(x) − r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ C[x]^{(s+2)×(s+1)} such that

P_2(x)W_1(x) = W_2(x).
Continuing this process, we can finally get

W_2(x) ∼ ··· ∼ W_t(x) = [ 1   r_{12}(x)   r_{13}(x)   ···  r_{1t}(x)
                          0   1           r_{23}(x)   ···  r_{2t}(x)
                          0   0           1           ···  r_{3t}(x)
                          ⋮                            ⋱      ⋮
                          0   0           0           ···  1
                          0   0           0           ···  0 ].

Consequently, there exists P_i(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, . . . , t, such that

P_t(x) P_{t−1}(x) ··· P_1(x) W(x) = W_t(x).
Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x)W_t(x) = I_t. Let

P(x) = P_{t+1}(x) P_t(x) P_{t−1}(x) ··· P_1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (here P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). □
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing the coefficients on both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints

1 − x_1^2 ≥ 0,  1 − x_2^2 ≥ 0, . . . ,  1 − x_n^2 ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

L(x) = [ −(1/2)diag(x)   I_n ].
Example 5.4. Consider the nonnegative portion of the unit sphere:

x_1 ≥ 0,  x_2 ≥ 0, . . . ,  x_n ≥ 0,  x_1^2 + ··· + x_n^2 − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ I_n − xx^T     x·1_n^T        2x
         (1/2)x^T      −(1/2)·1_n^T    −1 ].
Example 5.5. Consider the set

1 − x_1^3 − x_2^4 ≥ 0,   1 − x_3^4 − x_4^3 ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

L(x) = [ −x_1/3   −x_2/4    0        0       1   0
           0        0      −x_3/4   −x_4/3   0   1 ].
Example 5.6. Consider the quadratic set

1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0,   1 − x_1^2 − x_2^2 − x_3^2 ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

[ 25x_1^3 + 10x_1^2x_2 + 40x_1x_2^2 − 25x_1 − 2x_3     −25x_1^3 − 10x_1^2x_2 − 40x_1x_2^2 + (49/2)x_1 + 2x_3
  −15x_1^2x_2 + 10x_1x_2^2 + 20x_1x_2x_3 − 10x_1        15x_1^2x_2 − 10x_1x_2^2 − 20x_1x_2x_3 + 10x_1 − x_2/2
  25x_1^2x_3 − 20x_1x_2^2 + 10x_1x_2x_3 + 2x_1         −25x_1^2x_3 + 20x_1x_2^2 − 10x_1x_2x_3 − 2x_1 − x_3/2
  1 − 20x_1x_3 − 10x_1^2 − 20x_1x_2                     20x_1x_2 + 20x_1x_3 + 10x_1^2
  −50x_1^2 − 20x_1x_2                                   50x_1^2 + 20x_1x_2 + 1 ].
We would like to remark that a polynomial tuple c = (c_1, . . . , c_m) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, . . . , d_m, there exists an open dense subset U of D = R[x]_{d_1} × ··· × R[x]_{d_m} such that every tuple c = (c_1, . . . , c_m) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof needs to use resultants and discriminants, for which we refer to [29]. First, let J_1 be the set of all (i_1, . . . , i_{n+1}) with 1 ≤ i_1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i_1}, . . . , c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, . . . , c_{i_{n+1}} such that, if Res(c_{i_1}, . . . , c_{i_{n+1}}) ≠ 0, then the equations

c_{i_1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

F_1(c) = ∏_{(i_1,...,i_{n+1})∈J_1} Res(c_{i_1}, . . . , c_{i_{n+1}}).

For the case that m ≤ n, J_1 = ∅ and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n of the polynomials c_1, . . . , c_m have a common complex zero.

Second, let J_2 be the set of all (j_1, . . . , j_k) with k ≤ n and 1 ≤ j_1 < ··· < j_k ≤ m. When one of c_{j_1}, . . . , c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, . . . , c_{j_k}) is a polynomial in the coefficients of c_{j_1}, . . . , c_{j_k} such that, if ∆(c_{j_1}, . . . , c_{j_k}) ≠ 0, then the equations

c_{j_1}(x) = ··· = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every complex common solution u, the gradients of c_{j_1}, . . . , c_{j_k} at u are linearly independent. When all of c_{j_1}, . . . , c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, . . . , c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, . . . , c_{j_k}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

F_2(c) = ∏_{(j_1,...,j_k)∈J_2} ∆(c_{j_1}, . . . , c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer of the polynomials c_1, . . . , c_m have a singular complex common zero.

Last, let F(c) = F_1(c)F_2(c) and

U = { c = (c_1, . . . , c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c_1, . . . , c_m can have a complex common zero. For any k polynomials (k ≤ n) of c_1, . . . , c_m, if they have a complex common zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.
The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, . . . , c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p_1, . . . , p_m) to be the product L_1(x)∇f(x), i.e.,

p_i = ( L_1(x)∇f(x) )_i.
In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ), and the constraint qualification condition holds.

By Theorem 3.3, we have f_k = fmin for all k big enough if f_c = fmin and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, as is the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

min   x_1x_2(10 − x_3)
s.t.  x_1 ≥ 0,  x_2 ≥ 0,  x_3 ≥ 0,  1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = fmin for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

              w.o. LME               with LME
order k   lower bound   time     lower bound   time
   2        −0.0521    0.6841      −0.0521    0.1922
   3        −0.0026    0.2657     −3·10^{−8}  0.2285
   4        −0.0007    0.6785     −6·10^{−9}  0.4431
   5        −0.0004    1.6105     −2·10^{−9}  0.9567

² Note that 1 − x^Tx = (1 − e^Tx)(1 + x^Tx) + Σ_{i=1}^n x_i(1 − x_i)^2 + Σ_{i≠j} x_i^2 x_j ∈ IQ(ceq, cin).
Example 6.2. Consider the optimization problem

min   x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 − 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4)
s.t.  x_1^2 + x_2^2 + x_3^2 ≥ 1.

The matrix polynomial is L(x) = [ (1/2)x_1  (1/2)x_2  (1/2)x_3  −1 ]. The objective f is the sum of the positive definite form x_1^4 + x_2^4 + x_3^4 and the Motzkin polynomial

M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 − 3x_1^2x_2^2x_3^2.

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive, and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

c_1(x)p_1(x) = (x_1^2 + x_2^2 + x_3^2 − 1)( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = fmin for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

              w.o. LME                 with LME
order k   lower bound     time     lower bound   time
   3          −∞         0.4466       0.1111    0.1169
   4          −∞         0.4948       0.3333    0.3499
   5      −2.1821·10^5   1.1836       0.3333    0.6530
Example 6.3. Consider the optimization problem

min   x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1^3 + ··· + x_4^3)
s.t.  x_1, x_2, x_3, x_4 ≥ 0,  1 − x_1 − x_2 ≥ 0,  1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

[ 1−x_1   −x_2    0       0      1  1  0  0  1  0
  −x_1    1−x_2   0       0      1  1  0  0  1  0
   0       0      1−x_3   −x_4   0  0  1  1  0  1
   0       0      −x_3    1−x_4  0  0  1  1  0  1
  −x_1    −x_2    0       0      1  1  0  0  1  0
   0       0      −x_3    −x_4   0  0  1  1  0  1 ].

The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0, and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(ceq, cin) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because −c_1^2p_1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ), and the set {−c_1(x)^2p_1(x)^2 ≥ 0} is compact.
⁴ This is because 1 − x_1^2 − x_2^2 belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3^2 − x_4^2 belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

              w.o. LME                   with LME
order k   lower bound      time     lower bound     time
   3      −2.9·10^{−5}    0.7335    −6·10^{−7}     0.6091
   4      −1.4·10^{−5}    2.5055    −8·10^{−8}     2.7423
   5      −1.4·10^{−5}   12.7092    −5·10^{−8}    13.7449
Example 6.4. Consider the polynomial optimization problem

min_{x∈R^2}   x_1^2 + 50x_2^2
s.t.   x_1^2 − 1/2 ≥ 0,   x_2^2 − 2x_1x_2 − 1/8 ≥ 0,   x_2^2 + 2x_1x_2 − 1/8 ≥ 0.

It is motivated from an example in [12, §3]. The first column of L(x) is

[ (8x_1^3 + x_1)/5
  (288x_2x_1^4 − 16x_1^3 − 124x_2x_1^2 + 8x_1)/5 − 2x_2
  (−288x_2x_1^4 − 16x_1^3 + 124x_2x_1^2 + 8x_1)/5 + 2x_2 ]

and the second column of L(x) is

[ −8x_1^2x_2/5 + 4x_2^3/5 − x_2/10
  (288x_1^3x_2^2 + 16x_1^2x_2 − 142x_1x_2^2)/5 − 9x_1/20 − 8x_2^3/5 + 11x_2/5
  (−288x_1^3x_2^2 + 16x_1^2x_2 + 142x_1x_2^2)/5 + 9x_1/20 − 8x_2^3/5 + 11x_2/5 ].

The objective is coercive, so fmin is achieved at a critical point. The minimum value fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = fmin for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

              w.o. LME             with LME
order k   lower bound   time    lower bound   time
   3        6.7535     0.4611     56.7500    0.1309
   4        6.9294     0.2428    112.6517    0.2405
   5        8.8519     0.3376    112.6517    0.2167
   6       16.5971     0.4703    112.6517    0.3788
   7       35.4756     0.6536    112.6517    0.4537
Example 6.5. Consider the optimization problem

min_{x∈R^3}   x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 − ( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) )
s.t.   x_1 ≥ 0,   x_1x_2 − 1 ≥ 0,   x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

[ 1−x_1x_2   0    0   x_2   x_2   0
   x_1       0    0   −1    −1    0
  −x_1      x_2   0    1     0   −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = fmin for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

              w.o. LME                 with LME
order k   lower bound      time     lower bound   time
   2          −∞          0.4129        −∞       0.1900
   3      −7.8184·10^6    0.4641       0.9492    0.3139
   4      −2.0575·10^4    0.6499       0.9492    0.5057
Example 6.6. Consider the optimization problem (x_0 = 1)

min_{x∈R^4}   x^Tx + Σ_{i=0}^4 ∏_{j≠i}(x_i − x_j)
s.t.   x_1^2 − 1 ≥ 0,  x_2^2 − 1 ≥ 0,  x_3^2 − 1 ≥ 0,  x_4^2 − 1 ≥ 0.

The matrix polynomial is L(x) = [ (1/2)diag(x)  −I_4 ]. The first part of the objective is x^Tx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

(1, 1, 1, 1),  (1, −1, −1, 1),  (1, −1, 1, −1),  (1, 1, −1, −1),
(1, −1, −1, −1),  (−1, −1, 1, 1),  (−1, 1, −1, 1),  (−1, 1, 1, −1),
(−1, −1, −1, 1),  (−1, −1, 1, −1),  (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = fmin for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

              w.o. LME                 with LME
order k   lower bound      time     lower bound    time
   3          −∞           1.1377      3.5480     1.1765
   4      −6.6913·10^4     4.7677      4.0000     3.0761
   5      −21.3778        22.9970      4.0000    10.3354
Example 6.7. Consider the optimization problem

min_{x∈R^3}   x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 − 3x_1^2x_2^2x_3^2 + x_2^2
s.t.   x_1 − x_2x_3 ≥ 0,   −x_2 + x_3^2 ≥ 0.

The matrix polynomial is

L(x) = [  1     0   0  0  0
         −x_3  −1   0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are the points (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for the standard Lasserre
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = fmin for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

              w.o. LME                  with LME
order k   lower bound      time      lower bound    time
   3          −∞           0.6144        −∞        0.3418
   4      −1.0909·10^7     1.0542      −3.9476     0.7180
   5      −942.6772        1.6771     −3·10^{−9}   1.4607
   6      −0.0110          3.3532     −8·10^{−10}  3.1618
Example 6.8. Consider the optimization problem

min_{x∈R^4}   x_1^2(x_1 − x_4)^2 + x_2^2(x_2 − x_4)^2 + x_3^2(x_3 − x_4)^2
              + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)^2 + (x_2 − 1)^2 + (x_3 − 1)^2
s.t.   x_1 − x_2 ≥ 0,   x_2 − x_3 ≥ 0.

The matrix polynomial is

L(x) = [ 1  0  0  0  0  0
         1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = fmin for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

              w.o. LME                 with LME
order k   lower bound      time     lower bound   time
   2          −∞           0.3984     −0.3360    0.9321
   3          −∞           0.7634      0.9413    0.5240
   4      −6.4896·10^5     4.5496      0.9413    1.7192
   5      −3.1645·10^3    24.3665      0.9413    8.1228
Example 6.9. Consider the optimization problem

min_{x∈R^4}   (x_1 + x_2 + x_3 + x_4 + 1)^2 − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
s.t.   0 ≤ x_1, . . . , x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that 4 − Σ_{i=1}^4 x_i^2 = Σ_{i=1}^4 ( x_i(1 − x_i)^2 + (1 − x_i)(1 + x_i^2) ) ∈ IQ(ceq, cin).
Table 9. Computational results for Example 6.9

             w.o. LME                  with LME
order    lower bound     time      lower bound    time
  2      −0.0279        0.2262     −5·10^{−6}    1.1835
  3      −0.0005        0.4691     −6·10^{−7}    1.6566
  4      −0.0001        3.1098     −2·10^{−7}    5.5234
  5      −4·10^{−5}    16.5092     −6·10^{−7}   19.7320
For some polynomial optimization problems, the standard Lasserre relaxations may converge fast; e.g., the lowest-order relaxation is often tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x_0 = 1)

min_{x∈R^n}   Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
s.t.   0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = fmin. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their consumed time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

n            9        10       11        12        13        14
w.o. LME   1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
with LME   1.9714   3.8288   8.2519   20.0310   37.6373   82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g_1, . . . , g_ℓ). Its preordering is the set

Preord(cin, ψ) = Σ_{r_1,...,r_ℓ ∈ {0,1}} g_1^{r_1} ··· g_ℓ^{r_ℓ} · Σ[x].
The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)   f'_{pre,k} = min  ⟨f, y⟩
                    s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{ceq}(y) = 0,  L^{(k)}_{φ}(y) = 0,
                          L^{(k)}_{g_1^{r_1}···g_ℓ^{r_ℓ}}(y) ⪰ 0  for all r_1, . . . , r_ℓ ∈ {0,1},
                          y ∈ R^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)   f_{pre,k} = max  γ   s.t.  f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

f_{pre,k} = f'_{pre,k} = f_c

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = fmin for all k big enough.
Proof. The proof is very similar to that of Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

K_3 = { x ∈ R^n | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same. □
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for the critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied at each usage, then a degree bound for L(x) can eventually be obtained. However, the bound obtained in this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a high degree polynomial. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
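As an illustration of this rational representation (with a hypothetical objective, not taken from the paper's examples), consider the hypercube constraints 1 − x_i^2 ≥ 0 from earlier, with f = (x_1 − 2)^2 + (x_2 − 2)^2, whose minimizer over the box is x* = (1, 1). At a critical point the system C(x)λ = [∇f(x); 0] is consistent, so the least-squares formula agrees with the polynomial expression λ_i = −x_i f_{x_i}/2:

```python
# Sketch comparing the rational (least-squares) multiplier formula with the
# polynomial expression, at a critical point of an illustrative problem.
x = [1.0, 1.0]                                         # minimizer of f on the box
g = [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 2.0)]           # grad f(x*) = (-2, -2)

# C(x) as in (5.1) for c_i = 1 - x_i^2: gradients on top, diag(c) below
C = [[-2.0 * x[0], 0.0],
     [0.0, -2.0 * x[1]],
     [1.0 - x[0] ** 2, 0.0],
     [0.0, 1.0 - x[1] ** 2]]
rhs = g + [0.0, 0.0]

# Here C^T C is diagonal, so the least-squares solution is componentwise
lam_rat = [sum(C[k][i] * rhs[k] for k in range(4)) /
           sum(C[k][i] ** 2 for k in range(4)) for i in range(2)]

lam_poly = [-0.5 * x[i] * g[i] for i in range(2)]      # polynomial expression
assert all(abs(lam_rat[i] - lam_poly[i]) < 1e-9 for i in range(2))
```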
Acknowledgements. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for their valuable comments on improving the paper.
References
[1] D. Bertsekas. Nonlinear Programming, second edition, Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry, Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions, Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition, Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables, Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals, Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems, SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations, Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization, Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization, Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra, Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization, SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Löfberg. GloptiPoly 3: moments, optimization and semidefinite programming, Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly, Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach, Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz, J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments, SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals, Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 29
[19] J.B. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. In Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (M. Putinar and S. Sullivant, eds.), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pp. 83–99, AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. In Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
E-mail address: njw@math.ucsd.edu
2 JIAWANG NIE
The second equation in (1.4) is called the complementarity condition. If $\lambda_j + c_j(u) > 0$ for all $j \in I$, the strict complementarity condition (SCC) is said to hold. For the $\lambda_i$'s satisfying (1.3)-(1.5), the associated Lagrange function is
$$ \mathscr{L}(x) = f(x) - \sum_{i \in E \cup I} \lambda_i c_i(x). $$
Under the constraint qualification condition, the second order necessary condition (SONC) holds at $u$, i.e. ($\nabla^2$ denotes the Hessian)
$$(1.6)\qquad v^T \big( \nabla^2 \mathscr{L}(u) \big) v \ge 0 \quad \text{for all } v \in \bigcap_{i \in J(u)} \nabla c_i(u)^{\perp}. $$
Here, $\nabla c_i(u)^{\perp}$ is the orthogonal complement of $\nabla c_i(u)$. If it further holds that
$$(1.7)\qquad v^T \big( \nabla^2 \mathscr{L}(u) \big) v > 0 \quad \text{for all } 0 \ne v \in \bigcap_{i \in J(u)} \nabla c_i(u)^{\perp}, $$
then the second order sufficient condition (SOSC) is said to hold. If the constraint qualification condition holds at $u$, then (1.3), (1.4) and (1.6) are necessary conditions for $u$ to be a local minimizer. If (1.3), (1.4), (1.7) and the strict complementarity condition hold, then $u$ is a strict local minimizer.
1.2. Some existing work. Under the archimedean condition (see §2), the hierarchy of Lasserre's relaxations converges asymptotically [17]. Moreover, in addition to the archimedeanness, if the constraint qualification, strict complementarity and second order sufficient conditions hold at every global minimizer, then Lasserre's hierarchy converges in finitely many steps [33]. For convex polynomial optimization, Lasserre's hierarchy has finite convergence under the strict convexity or sos-convexity condition [7, 20]. For unconstrained polynomial optimization, the standard sum of squares relaxation was proposed in [35]. When the equality constraints define a finite set, Lasserre's hierarchy also has finite convergence, as shown in [18, 24, 31]. Recently, a bounded degree hierarchy of relaxations was proposed for solving polynomial optimization [23]. General introductions to polynomial optimization and moment problems can be found in the books and surveys [21, 22, 25, 26, 39]. Lasserre's relaxations provide lower bounds for the minimum value. There also exist methods that compute upper bounds [8, 19]; a convergence rate analysis for such upper bounds is given in [9, 10]. When a polynomial optimization problem does not have minimizers (i.e., the infimum is not achievable), there are relaxation methods for computing the infimum [38, 42].
A new type of Lasserre relaxations, based on Jacobian representations, was recently proposed in [30]. The hierarchy of such relaxations always has finite convergence when the tuple of constraining polynomials is nonsingular (i.e., at every point in $\mathbb{C}^n$, the gradients of the active constraining polynomials are linearly independent; see Definition 5.1). When there are only equality constraints $c_1(x) = \cdots = c_m(x) = 0$, the method needs the maximal minors of the matrix
$$ \begin{bmatrix} \nabla f(x) & \nabla c_1(x) & \cdots & \nabla c_m(x) \end{bmatrix}. $$
When there are inequality constraints, it requires enumerating all possibilities of active constraints. The method in [30] is therefore expensive when there are many constraints. For unconstrained optimization, it reduces to the gradient sum of squares relaxations in [27].
1.3. New contributions. When Lasserre's relaxations are used to solve polynomial optimization, the following issues are typically of concern:
• The convergence depends on the archimedean condition (see §2), which is satisfied only if the feasible set is compact. If the set is noncompact, how can we get convergent relaxations?
• The cost of Lasserre's relaxations depends significantly on the relaxation order. For a fixed order, can we construct tighter relaxations than the standard ones?
• When the convergence of Lasserre's relaxations is slow, can we construct new relaxations whose convergence is faster?
• When the optimality conditions fail to hold, Lasserre's hierarchy might not have finite convergence. Can we construct a new hierarchy of stronger relaxations that has finite convergence in such cases?
This paper addresses the above issues. We construct tighter relaxations by using optimality conditions. In (1.3)-(1.4), under the constraint qualification condition, the Lagrange multipliers $\lambda_i$ are uniquely determined by $u$. Consider the polynomial system in $(x, \lambda)$:
$$(1.8)\qquad \sum_{i \in E \cup I} \lambda_i \nabla c_i(x) = \nabla f(x), \quad c_i(x) = 0 \ (i \in E), \quad \lambda_j c_j(x) = 0 \ (j \in I). $$
A point $x$ satisfying (1.8) is called a critical point, and such a pair $(x, \lambda)$ is called a critical pair. In (1.8), once $x$ is known, $\lambda$ can be determined by linear equations. Generally, the value of $x$ is not known. One can try to express $\lambda$ as a rational function in $x$. Suppose $E \cup I = \{1, \ldots, m\}$ and denote
$$ G(x) = \begin{bmatrix} \nabla c_1(x) & \cdots & \nabla c_m(x) \end{bmatrix}. $$
When $m \le n$ and $\operatorname{rank} G(x) = m$, we can get the rational expression
$$(1.9)\qquad \lambda = \big( G(x)^T G(x) \big)^{-1} G(x)^T \nabla f(x). $$
Typically, the matrix inverse $\big( G(x)^T G(x) \big)^{-1}$ is expensive to use, and the denominator $\det\big( G(x)^T G(x) \big)$ is typically a polynomial of high degree. When $m > n$, $G(x)^T G(x)$ is always singular, and we cannot express $\lambda$ as in (1.9).

Do there exist polynomials $p_i$ ($i \in E \cup I$) such that each
$$(1.10)\qquad \lambda_i = p_i(x) $$
for all $(x, \lambda)$ satisfying (1.8)? If they exist, then we can do the following:
• The polynomial system (1.8) can be simplified to
$$(1.11)\qquad \sum_{i \in E \cup I} p_i(x) \nabla c_i(x) = \nabla f(x), \quad c_i(x) = 0 \ (i \in E), \quad p_j(x) c_j(x) = 0 \ (j \in I). $$
• For each $j \in I$, the sign condition $\lambda_j \ge 0$ is equivalent to
$$(1.12)\qquad p_j(x) \ge 0. $$
The new conditions (1.11) and (1.12) are only about the variable $x$, not $\lambda$. They can be used to construct tighter relaxations for solving (1.1).
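To make the role of the linear equations in (1.8) concrete, the following minimal sketch recovers the multiplier numerically via formula (1.9). The instance (the objective, the constraint, and the critical point) is our own assumption, chosen only for illustration: minimize $f = x_1^2 + x_2^2$ subject to $c_1 = x_1 + x_2 - 1 = 0$, whose critical point is $x = (1/2, 1/2)$ with $\lambda_1 = 1$.

```python
import numpy as np

def grad_f(x):
    return 2.0 * x                     # gradient of f = x1^2 + x2^2

def G(x):
    return np.array([[1.0], [1.0]])    # gradient of c1 = x1 + x2 - 1, as a column

x = np.array([0.5, 0.5])               # the critical point of this instance
GT = G(x).T
# formula (1.9): lambda = (G^T G)^{-1} G^T grad f(x)
lam = np.linalg.solve(GT @ G(x), GT @ grad_f(x))

# stationarity check from (1.8): grad f(x) = lambda * grad c1(x)
residual = np.linalg.norm(grad_f(x) - G(x) @ lam)
```

For this instance the solve returns $\lambda_1 = 1$ and the stationarity residual vanishes; for general constraints the same least-squares formula only applies where $G(x)$ has full column rank, which is exactly the limitation the paper's polynomial expressions (1.10) are designed to avoid.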
When do there exist polynomials $p_i$ satisfying (1.10)? If they exist, how can we compute them? How can we use them to construct tighter relaxations? Do the new relaxations have advantages over the old ones? These questions are the main topics of this paper. Our major results are:
• We show that the polynomials $p_i$ satisfying (1.10) always exist when the tuple of constraining polynomials is nonsingular (see Definition 5.1). Moreover, they can be determined by linear equations.
• Using the new conditions (1.11)-(1.12), we can construct tight relaxations for solving (1.1). More precisely, we construct a hierarchy of new relaxations which has finite convergence. This is true even if the feasible set is noncompact and/or the optimality conditions fail to hold.
• For every relaxation order, the new relaxations are tighter than the standard ones in the prior work.
The paper is organized as follows. Section 2 reviews some basics in polynomial optimization. Section 3 constructs the new relaxations and proves their tightness. Section 4 characterizes when the polynomials $p_i$ satisfying (1.10) exist and shows how to determine them for polyhedral constraints. Section 5 discusses the case of general nonlinear constraints. Section 6 gives examples of using the new relaxations. Section 7 discusses some related issues.
2. Preliminaries

Notation. The symbol $\mathbb{N}$ (resp., $\mathbb{R}$, $\mathbb{C}$) denotes the set of nonnegative integers (resp., real, complex numbers). The symbol $\mathbb{R}[x] = \mathbb{R}[x_1, \ldots, x_n]$ denotes the ring of polynomials in $x = (x_1, \ldots, x_n)$ with real coefficients, and $\mathbb{R}[x]_d$ stands for the set of real polynomials with degrees $\le d$. Denote
$$ \mathbb{N}^n_d = \{ \alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n \mid |\alpha| = \alpha_1 + \cdots + \alpha_n \le d \}. $$
For a polynomial $p$, $\deg(p)$ denotes its total degree. For $t \in \mathbb{R}$, $\lceil t \rceil$ denotes the smallest integer $\ge t$. For an integer $k > 0$, denote $[k] = \{1, 2, \ldots, k\}$. For $x = (x_1, \ldots, x_n)$ and $\alpha = (\alpha_1, \ldots, \alpha_n)$, denote
$$ x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}, \qquad [x]_d = \begin{bmatrix} 1 & x_1 & \cdots & x_n & x_1^2 & x_1 x_2 & \cdots & x_n^d \end{bmatrix}^T. $$
The superscript $T$ denotes the transpose of a matrix/vector, $e_i$ denotes the $i$th standard unit vector, $e$ denotes the vector of all ones, and $I_m$ denotes the $m$-by-$m$ identity matrix. By writing $X \succeq 0$ (resp., $X \succ 0$), we mean that $X$ is a symmetric positive semidefinite (resp., positive definite) matrix. For matrices $X_1, \ldots, X_r$, $\operatorname{diag}(X_1, \ldots, X_r)$ denotes the block diagonal matrix whose diagonal blocks are $X_1, \ldots, X_r$; in particular, for a vector $a$, $\operatorname{diag}(a)$ denotes the diagonal matrix whose diagonal vector is $a$. For a function $f$ in $x$, $f_{x_i}$ denotes its partial derivative with respect to $x_i$.
We review some basics in computational algebra and polynomial optimization; they can be found in [4, 21, 22, 25, 26]. An ideal $I$ of $\mathbb{R}[x]$ is a subset such that $I \cdot \mathbb{R}[x] \subseteq I$ and $I + I \subseteq I$. For a tuple $h = (h_1, \ldots, h_m)$ of polynomials, $\operatorname{Ideal}(h)$ denotes the smallest ideal containing all $h_i$, which is the set
$$ h_1 \cdot \mathbb{R}[x] + \cdots + h_m \cdot \mathbb{R}[x]. $$
The $2k$th truncation of $\operatorname{Ideal}(h)$ is the set
$$ \operatorname{Ideal}(h)_{2k} = h_1 \cdot \mathbb{R}[x]_{2k - \deg(h_1)} + \cdots + h_m \cdot \mathbb{R}[x]_{2k - \deg(h_m)}. $$
The truncation $\operatorname{Ideal}(h)_{2k}$ depends on the generators $h_1, \ldots, h_m$. For an ideal $I$, its complex and real varieties are respectively defined as
$$ V_{\mathbb{C}}(I) = \{ v \in \mathbb{C}^n \mid p(v) = 0 \ \forall\, p \in I \}, \qquad V_{\mathbb{R}}(I) = V_{\mathbb{C}}(I) \cap \mathbb{R}^n. $$
A polynomial $\sigma$ is said to be a sum of squares (SOS) if $\sigma = s_1^2 + \cdots + s_k^2$ for some polynomials $s_1, \ldots, s_k \in \mathbb{R}[x]$. The set of all SOS polynomials in $x$ is denoted as $\Sigma[x]$. For a degree $d$, denote the truncation
$$ \Sigma[x]_d = \Sigma[x] \cap \mathbb{R}[x]_d. $$
For a tuple $g = (g_1, \ldots, g_t)$, its quadratic module is the set
$$ \operatorname{Qmod}(g) = \Sigma[x] + g_1 \cdot \Sigma[x] + \cdots + g_t \cdot \Sigma[x]. $$
The $2k$th truncation of $\operatorname{Qmod}(g)$ is the set
$$ \operatorname{Qmod}(g)_{2k} = \Sigma[x]_{2k} + g_1 \cdot \Sigma[x]_{2k - \deg(g_1)} + \cdots + g_t \cdot \Sigma[x]_{2k - \deg(g_t)}. $$
The truncation $\operatorname{Qmod}(g)_{2k}$ depends on the generators $g_1, \ldots, g_t$. Denote
$$(2.1)\qquad \operatorname{IQ}(h, g) = \operatorname{Ideal}(h) + \operatorname{Qmod}(g), \qquad \operatorname{IQ}(h, g)_{2k} = \operatorname{Ideal}(h)_{2k} + \operatorname{Qmod}(g)_{2k}. $$
The set $\operatorname{IQ}(h, g)$ is said to be archimedean if there exists $p \in \operatorname{IQ}(h, g)$ such that $p(x) \ge 0$ defines a compact set in $\mathbb{R}^n$. If $\operatorname{IQ}(h, g)$ is archimedean, then
$$ K = \{ x \in \mathbb{R}^n \mid h(x) = 0, \ g(x) \ge 0 \} $$
must be a compact set. Conversely, if $K$ is compact, say $K \subseteq B(0, R)$ (the ball centered at $0$ with radius $R$), then $\operatorname{IQ}(h, (g, R^2 - x^T x))$ is always archimedean, and $h = 0$, $(g, R^2 - x^T x) \ge 0$ define the same set $K$.
Theorem 2.1 (Putinar [36]). Let $h, g$ be tuples of polynomials in $\mathbb{R}[x]$, and let $K$ be as above. Assume $\operatorname{IQ}(h, g)$ is archimedean. If a polynomial $f \in \mathbb{R}[x]$ is positive on $K$, then $f \in \operatorname{IQ}(h, g)$.

Interestingly, if $f$ is only nonnegative on $K$ but the standard optimality conditions hold (see Subsection 1.1), then we still have $f \in \operatorname{IQ}(h, g)$ [33].
Let $\mathbb{R}^{\mathbb{N}^n_d}$ be the space of real multi-sequences indexed by $\alpha \in \mathbb{N}^n_d$. A vector in $\mathbb{R}^{\mathbb{N}^n_d}$ is called a truncated multi-sequence (tms) of degree $d$. A tms $y = (y_\alpha)_{\alpha \in \mathbb{N}^n_d}$ gives the Riesz functional $R_y$, acting on $\mathbb{R}[x]_d$ as
$$(2.2)\qquad R_y\Big( \sum_{\alpha \in \mathbb{N}^n_d} f_\alpha x^\alpha \Big) = \sum_{\alpha \in \mathbb{N}^n_d} f_\alpha y_\alpha. $$
For $f \in \mathbb{R}[x]_d$ and $y \in \mathbb{R}^{\mathbb{N}^n_d}$, we denote
$$(2.3)\qquad \langle f, y \rangle = R_y(f). $$
Let $q \in \mathbb{R}[x]_{2k}$. The $k$th localizing matrix of $q$, generated by $y \in \mathbb{R}^{\mathbb{N}^n_{2k}}$, is the symmetric matrix $L_q^{(k)}(y)$ such that
$$(2.4)\qquad \operatorname{vec}(a_1)^T \big( L_q^{(k)}(y) \big) \operatorname{vec}(a_2) = R_y(q a_1 a_2) $$
for all $a_1, a_2 \in \mathbb{R}[x]_{k - \lceil \deg(q)/2 \rceil}$. (Here, $\operatorname{vec}(a_i)$ denotes the coefficient vector of $a_i$.) When $q = 1$, $L_q^{(k)}(y)$ is called a moment matrix, and we denote
$$(2.5)\qquad M_k(y) = L_1^{(k)}(y). $$
The columns and rows of $L_q^{(k)}(y)$, as well as of $M_k(y)$, are indexed by $\alpha \in \mathbb{N}^n$ with $2|\alpha| + \deg(q) \le 2k$. When $q = (q_1, \ldots, q_r)$ is a tuple of polynomials, we define
$$(2.6)\qquad L_q^{(k)}(y) = \operatorname{diag}\big( L_{q_1}^{(k)}(y), \ldots, L_{q_r}^{(k)}(y) \big), $$
a block diagonal matrix. For the polynomial tuples $h, g$ as above, the set
$$(2.7)\qquad \mathscr{S}(h, g)_{2k} = \Big\{ y \in \mathbb{R}^{\mathbb{N}^n_{2k}} \ \Big| \ L_h^{(k)}(y) = 0, \ L_g^{(k)}(y) \succeq 0 \Big\} $$
is a spectrahedral cone in $\mathbb{R}^{\mathbb{N}^n_{2k}}$. The set $\operatorname{IQ}(h, g)_{2k}$ is also a convex cone, in $\mathbb{R}[x]_{2k}$. The dual cone of $\operatorname{IQ}(h, g)_{2k}$ is precisely $\mathscr{S}(h, g)_{2k}$ [22, 25, 34], because $\langle p, y \rangle \ge 0$ for all $p \in \operatorname{IQ}(h, g)_{2k}$ and all $y \in \mathscr{S}(h, g)_{2k}$.
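To make the moment matrix concrete, here is a minimal numerical sketch for the univariate case $n = 1$, where the $(i, j)$ entry of $M_k(y)$ is $y_{i+j}$. The example tms (that of a point mass at $u$, so $y_\alpha = u^\alpha$) is our own choice; for such a tms, $M_k(y) = [u]_k [u]_k^T$ is positive semidefinite of rank one.

```python
import numpy as np

# (i, j) entry of the moment matrix M_k(y) for n = 1 is y_{i+j} (a Hankel matrix)
def moment_matrix(y, k):
    return np.array([[y[i + j] for j in range(k + 1)] for i in range(k + 1)])

u, k = 2.0, 2
y = np.array([u**a for a in range(2 * k + 1)])   # tms of degree 2k for a point mass at u

M = moment_matrix(y, k)                # equals [u]_k [u]_k^T here
eigs = np.linalg.eigvalsh(M)           # all eigenvalues nonnegative
rank = np.linalg.matrix_rank(M)        # rank one for a point mass
```

A tms coming from a nonnegative measure always yields $M_k(y) \succeq 0$; the localizing matrix $L_q^{(k)}(y)$ of (2.4) is built the same way, with $y_{i+j}$ replaced by $R_y(q \, x^{i+j})$.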
3. The construction of tight relaxations

Consider the polynomial optimization problem (1.1). Let
$$ \lambda = (\lambda_i)_{i \in E \cup I} $$
be the vector of Lagrange multipliers. Denote the set
$$(3.1)\qquad \mathcal{K} = \left\{ (x, \lambda) \in \mathbb{R}^n \times \mathbb{R}^{E \cup I} \ \middle| \ \begin{array}{l} c_i(x) = 0 \ (i \in E), \quad \lambda_j c_j(x) = 0 \ (j \in I), \\ \nabla f(x) = \sum_{i \in E \cup I} \lambda_i \nabla c_i(x) \end{array} \right\}. $$
Each point in $\mathcal{K}$ is called a critical pair. The projection
$$(3.2)\qquad \mathcal{K}_c = \{ u \mid (u, \lambda) \in \mathcal{K} \} $$
is the set of all real critical points. To construct tight relaxations for solving (1.1), we need the following assumption on the Lagrange multipliers.

Assumption 3.1. For each $i \in E \cup I$, there exists a polynomial $p_i \in \mathbb{R}[x]$ such that, for all $(x, \lambda) \in \mathcal{K}$, it holds that
$$ \lambda_i = p_i(x). $$
Assumption 3.1 is generically satisfied, as shown in Proposition 5.7. For the following special cases, we can get the polynomials $p_i$ explicitly.

• (Simplex) For the simplex $\{ e^T x - 1 = 0, \ x_1 \ge 0, \ldots, x_n \ge 0 \}$, we have $E = \{0\}$, $I = [n]$, $c_0(x) = e^T x - 1$, and $c_j(x) = x_j$ ($j \in [n]$). The Lagrange multipliers can be expressed as
$$(3.3)\qquad \lambda_0 = x^T \nabla f(x), \qquad \lambda_j = f_{x_j} - x^T \nabla f(x) \ (j \in [n]). $$
• (Hypercube) For the hypercube $[-1, 1]^n$, we have $E = \emptyset$, $I = [n]$, and each $c_j(x) = 1 - x_j^2$. We can show that
$$(3.4)\qquad \lambda_j = -\tfrac{1}{2} x_j f_{x_j} \ (j \in [n]). $$
• (Ball or sphere) The constraint is $1 - x^T x = 0$ or $1 - x^T x \ge 0$. Then $E \cup I = \{1\}$, $c_1 = 1 - x^T x$, and we have
$$(3.5)\qquad \lambda_1 = -\tfrac{1}{2} x^T \nabla f(x). $$
• (Triangular constraints) Suppose $E \cup I = \{1, \ldots, m\}$ and each
$$ c_i(x) = \tau_i x_i + q_i(x_{i+1}, \ldots, x_n) $$
for some polynomials $q_i \in \mathbb{R}[x_{i+1}, \ldots, x_n]$ and scalars $\tau_i \ne 0$. The matrix $T(x)$, consisting of the first $m$ rows of $[\nabla c_1(x), \ldots, \nabla c_m(x)]$, is an invertible lower triangular matrix with constant diagonal entries. Then
$$ \lambda = T(x)^{-1} \cdot \begin{bmatrix} f_{x_1} & \cdots & f_{x_m} \end{bmatrix}^T. $$
Note that the inverse $T(x)^{-1}$ is a matrix polynomial.
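The hypercube expression (3.4) can be checked symbolically on a small instance. The sketch below is our own illustration (the instance $n = 1$, $f = x$, $c_1 = 1 - x^2$ is an assumption, not from the paper): it solves the critical-pair equations (1.8) and verifies that $\lambda = -\tfrac{1}{2} x f_x$ holds at every critical pair.

```python
import sympy as sp

x, lam = sp.symbols('x lam', real=True)
f = x                      # objective of the toy instance
c1 = 1 - x**2              # hypercube constraint for n = 1

# critical-pair equations (1.8): f_x = lam * (c1)_x and lam * c1 = 0
sols = sp.solve([sp.Eq(sp.diff(f, x), lam * sp.diff(c1, x)),
                 sp.Eq(lam * c1, 0)], [x, lam], dict=True)

# expression (3.4): lam + x * f_x / 2 should vanish at every critical pair
ok = all(sp.simplify(s[lam] + s[x] * sp.diff(f, x).subs(x, s[x]) / 2) == 0
         for s in sols)
```

For this instance the critical pairs are $(x, \lambda) = (1, -1/2)$ and $(-1, 1/2)$, and the identity holds at both; note (3.4) makes no sign claim, so the infeasible pair with $\lambda < 0$ is filtered out later by the condition $\psi \ge 0$.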
For more general constraints, we can also express $\lambda$ as a polynomial function in $x$ on the set $\mathcal{K}_c$. This will be discussed in §4 and §5.
For the polynomials $p_i$ as in Assumption 3.1, denote
$$(3.6)\qquad \phi = \Big( \nabla f - \sum_{i \in E \cup I} p_i \nabla c_i, \ \big( p_j c_j \big)_{j \in I} \Big), \qquad \psi = (p_j)_{j \in I}. $$
When the minimum value $f_{\min}$ of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem
$$(3.7)\qquad \begin{array}{rl} f_c := \min & f(x) \\ \text{s.t.} & c_{eq}(x) = 0, \ c_{in}(x) \ge 0, \\ & \phi(x) = 0, \ \psi(x) \ge 0. \end{array} $$
We apply Lasserre relaxations to solve it. For an integer $k > 0$ (called the relaxation order), the $k$th order Lasserre relaxation for (3.7) is
$$(3.8)\qquad \begin{array}{rl} f'_k := \min & \langle f, y \rangle \\ \text{s.t.} & \langle 1, y \rangle = 1, \ M_k(y) \succeq 0, \\ & L_{c_{eq}}^{(k)}(y) = 0, \ L_{c_{in}}^{(k)}(y) \succeq 0, \\ & L_{\phi}^{(k)}(y) = 0, \ L_{\psi}^{(k)}(y) \succeq 0, \ y \in \mathbb{R}^{\mathbb{N}^n_{2k}}. \end{array} $$
Since $x^0 = 1$ (the constant one polynomial), the condition $\langle 1, y \rangle = 1$ means that $(y)_0 = 1$. The dual optimization problem of (3.8) is
$$(3.9)\qquad \begin{array}{rl} f_k := \max & \gamma \\ \text{s.t.} & f - \gamma \in \operatorname{IQ}(c_{eq}, c_{in})_{2k} + \operatorname{IQ}(\phi, \psi)_{2k}. \end{array} $$
We refer to §2 for the notation used in (3.8)-(3.9). These problems are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For $k = 1, 2, \ldots$, we get a hierarchy of Lasserre relaxations. If the terms involving $\phi$ and $\psi$ are removed from (3.8)-(3.9), they reduce to the standard Lasserre relaxations in [17]; hence (3.8)-(3.9) are stronger relaxations.
By the construction of $\phi$ in (3.6), Assumption 3.1 implies that
$$ \mathcal{K}_c = \{ u \in \mathbb{R}^n \mid c_{eq}(u) = 0, \ \phi(u) = 0 \}. $$
By Lemma 3.3 of [6], $f$ achieves only finitely many values on $\mathcal{K}_c$, say
$$(3.10)\qquad v_1 < \cdots < v_N. $$
A point $u \in \mathcal{K}_c$ might not be feasible for (3.7), i.e., it is possible that $c_{in}(u) \not\ge 0$ or $\psi(u) \not\ge 0$. In applications, we are often interested in the optimal value $f_c$ of (3.7). When (3.7) is infeasible, by convention, we set
$$ f_c = +\infty. $$
When the optimal value $f_{\min}$ of (1.1) is achieved at a critical point, $f_c = f_{\min}$. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., for each $\ell$, the sublevel set $\{ f(x) \le \ell \}$ is compact) and the constraint qualification condition holds. As in [17], one can show that
$$(3.11)\qquad f_k \le f'_k \le f_c $$
for all $k$. Moreover, both $f_k$ and $f'_k$ are monotonically increasing. If, for some order $k$, it occurs that
$$ f_k = f'_k = f_c, $$
then the $k$th order Lasserre relaxation is said to be tight (or exact).
3.1. Tightness of the relaxations. Let $c_{in}$, $\psi$, $\mathcal{K}_c$, $f_c$ be as above. We refer to §2 for the notation $\operatorname{Qmod}(c_{in}, \psi)$. We begin with a general assumption.

Assumption 3.2. There exists $\rho \in \operatorname{Qmod}(c_{in}, \psi)$ such that if $u \in \mathcal{K}_c$ and $f(u) < f_c$, then $\rho(u) < 0$.

In Assumption 3.2, the hypersurface $\rho(x) = 0$ separates feasible and infeasible critical points. Clearly, if $u \in \mathcal{K}_c$ is a feasible point for (3.7), then $c_{in}(u) \ge 0$ and $\psi(u) \ge 0$, and hence $\rho(u) \ge 0$. Assumption 3.2 holds quite generally; for instance, it is satisfied in the following cases.

a) When there are no inequality constraints, $c_{in}$ and $\psi$ are empty tuples. Then $\operatorname{Qmod}(c_{in}, \psi) = \Sigma[x]$, and Assumption 3.2 is satisfied with $\rho = 0$.

b) Suppose the set $\mathcal{K}_c$ is finite, say $\mathcal{K}_c = \{u_1, \ldots, u_D\}$, and
$$ f(u_1), \ldots, f(u_{t-1}) < f_c \le f(u_t), \ldots, f(u_D). $$
Let $\ell_1, \ldots, \ell_D$ be real interpolating polynomials such that $\ell_i(u_j) = 1$ for $i = j$ and $\ell_i(u_j) = 0$ for $i \ne j$. For each $i = 1, \ldots, t-1$, there must exist $j_i \in I$ such that $c_{j_i}(u_i) < 0$. Then the polynomial
$$(3.12)\qquad \rho = \sum_{i < t} \frac{-1}{c_{j_i}(u_i)}\, c_{j_i}(x)\, \ell_i(x)^2 + \sum_{i \ge t} \ell_i(x)^2 $$
satisfies Assumption 3.2.

c) For each $x$ with $f(x) = v_i < f_c$, at least one of the constraints $c_j(x) \ge 0$, $p_j(x) \ge 0$ ($j \in I$) is violated. Suppose for each critical value $v_i < f_c$ there exists $g_i \in \{c_j, p_j\}_{j \in I}$ such that
$$ g_i < 0 \quad \text{on } \mathcal{K}_c \cap \{ f(x) = v_i \}. $$
Let $\varphi_1, \ldots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Suppose $v_t = f_c$. Then the polynomial
$$(3.13)\qquad \rho = \sum_{i < t} g_i(x) \big( \varphi_i(f(x)) \big)^2 + \sum_{i \ge t} \big( \varphi_i(f(x)) \big)^2 $$
satisfies Assumption 3.2.
We refer to §2 for the archimedean condition and the notation $\operatorname{IQ}(h, g)$ as in (2.1). The following theorem concerns the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose $\mathcal{K}_c \ne \emptyset$ and Assumption 3.1 holds. If
i) $\operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$ is archimedean, or
ii) $\operatorname{IQ}(c_{eq}, c_{in})$ is archimedean, or
iii) Assumption 3.2 holds,
then $f_k = f'_k = f_c$ for all $k$ sufficiently large. Therefore, if the minimum value $f_{\min}$ of (1.1) is achieved at a critical point, then $f_k = f'_k = f_{\min}$ for all $k$ big enough, provided one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. Condition ii) is only about the constraining polynomials of (1.1); it can be checked without $\phi, \psi$. Clearly, condition ii) implies condition i).

The proof of Theorem 3.3 is given below. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties, and the objective $f$ is constant on each of them. We can get an SOS type representation
for $f$ on each subvariety, and then construct a single one for $f$ over the entire set of critical points.
Proof of Theorem 3.3. Clearly, every point in the complex variety
$$ K_1 = \{ x \in \mathbb{C}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0 \} $$
is a critical point. By Lemma 3.3 of [6], the objective $f$ achieves finitely many real values on $\mathcal{K}_c = K_1 \cap \mathbb{R}^n$, say $v_1 < \cdots < v_N$. Up to shifting $f$ by a constant, we can further assume that $f_c = 0$. Clearly, $f_c$ equals one of $v_1, \ldots, v_N$, say $v_t = f_c = 0$.
Case I: Assume $\operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$ is archimedean. Let
$$ I = \operatorname{Ideal}(c_{eq}, \phi) $$
be the critical ideal. Note that $K_1 = V_{\mathbb{C}}(I)$. The variety $V_{\mathbb{C}}(I)$ is a union of irreducible subvarieties, say $V_1, \ldots, V_\ell$. If $V_i \cap \mathbb{R}^n \ne \emptyset$, then $f$ is a real constant on $V_i$, which equals one of $v_1, \ldots, v_N$; this is implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of $V_{\mathbb{C}}(I)$
$$ T_i = K_1 \cap \{ f(x) = v_i \} \quad (i = t, \ldots, N). $$
Let $T_{t-1}$ be the union of the irreducible subvarieties $V_i$ such that either $V_i \cap \mathbb{R}^n = \emptyset$, or $f \equiv v_j$ on $V_i$ with $v_j < v_t = f_c$. Then it holds that
$$ V_{\mathbb{C}}(I) = T_{t-1} \cup T_t \cup \cdots \cup T_N. $$
By the primary decomposition of $I$ [11, 41], there exist ideals $I_{t-1}, I_t, \ldots, I_N \subseteq \mathbb{R}[x]$ such that
$$ I = I_{t-1} \cap I_t \cap \cdots \cap I_N $$
and $T_i = V_{\mathbb{C}}(I_i)$ for all $i = t-1, t, \ldots, N$. Denote the semialgebraic set
$$(3.14)\qquad S = \{ x \in \mathbb{R}^n \mid c_{in}(x) \ge 0, \ \psi(x) \ge 0 \}. $$
For $i = t-1$, we have $V_{\mathbb{R}}(I_{t-1}) \cap S = \emptyset$, because $v_1, \ldots, v_{t-1} < f_c$. By the Positivstellensatz [2, Corollary 4.4.3], there exists $p_0 \in \operatorname{Preord}(c_{in}, \psi)$¹ satisfying $2 + p_0 \in I_{t-1}$. Note that $1 + p_0 > 0$ on $V_{\mathbb{R}}(I_{t-1}) \cap S$. The set $I_{t-1} + \operatorname{Qmod}(c_{in}, \psi)$ is archimedean, because $I \subseteq I_{t-1}$ and
$$ \operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi) \subseteq I_{t-1} + \operatorname{Qmod}(c_{in}, \psi). $$
By Theorem 2.1, we have
$$ p_1 := 1 + p_0 \in I_{t-1} + \operatorname{Qmod}(c_{in}, \psi). $$
Then $1 + p_1 \in I_{t-1}$, so there exists $p_2 \in \operatorname{Qmod}(c_{in}, \psi)$ such that
$$ -1 \equiv p_1 \equiv p_2 \mod I_{t-1}. $$
Since $f = (f/4 + 1)^2 - 1 \cdot (f/4 - 1)^2$, we have
$$ f \equiv \sigma_{t-1} := (f/4 + 1)^2 + p_2 (f/4 - 1)^2 \mod I_{t-1}. $$
So, when $k$ is big enough, we have $\sigma_{t-1} \in \operatorname{Qmod}(c_{in}, \psi)_{2k}$.

¹It is the preordering of the polynomial tuple $(c_{in}, \psi)$; see §7.1.
For $i = t$, $v_t = 0$ and $f(x)$ vanishes on $V_{\mathbb{C}}(I_t)$. By Hilbert's Strong Nullstellensatz [4], there exists an integer $m_t > 0$ such that $f^{m_t} \in I_t$. Define the polynomial
$$ s_t(\epsilon) := \sqrt{\epsilon} \sum_{j=0}^{m_t - 1} \binom{1/2}{j} \epsilon^{-j} f^j. $$
Then we have
$$ D_1 := \epsilon \big( 1 + \epsilon^{-1} f \big) - \big( s_t(\epsilon) \big)^2 \equiv 0 \mod I_t. $$
This is because, after expanding $\big( s_t(\epsilon) \big)^2$ in the subtraction defining $D_1$, all the terms $f^j$ with $j < m_t$ are cancelled, and $f^j \in I_t$ for $j \ge m_t$. So $D_1 \in I_t$. Let $\sigma_t(\epsilon) = s_t(\epsilon)^2$; then $f + \epsilon - \sigma_t(\epsilon) = D_1$ and
$$(3.15)\qquad f + \epsilon - \sigma_t(\epsilon) = \sum_{j=0}^{m_t - 2} b_j(\epsilon) f^{m_t + j} $$
for some real scalars $b_j(\epsilon)$ depending on $\epsilon$.
For each $i = t+1, \ldots, N$, $v_i > 0$ and $f(x)/v_i - 1$ vanishes on $V_{\mathbb{C}}(I_i)$. By Hilbert's Strong Nullstellensatz [4], there exists an integer $m_i > 0$ such that $(f/v_i - 1)^{m_i} \in I_i$. Let
$$ s_i := \sqrt{v_i} \sum_{j=0}^{m_i - 1} \binom{1/2}{j} (f/v_i - 1)^j. $$
As in the case $i = t$, we can similarly show that $f - s_i^2 \in I_i$. Let $\sigma_i = s_i^2$; then $f - \sigma_i \in I_i$.
Note that $V_{\mathbb{C}}(I_i) \cap V_{\mathbb{C}}(I_j) = \emptyset$ for all $i \ne j$. By Lemma 3.3 of [30], there exist polynomials $a_{t-1}, \ldots, a_N \in \mathbb{R}[x]$ such that
$$ a_{t-1}^2 + \cdots + a_N^2 - 1 \in I, \qquad a_i \in \bigcap_{i \ne j \in \{t-1, \ldots, N\}} I_j. $$
For $\epsilon > 0$, denote the polynomial
$$ \sigma_\epsilon := \sigma_t(\epsilon)\, a_t^2 + \sum_{t \ne j \in \{t-1, \ldots, N\}} (\sigma_j + \epsilon)\, a_j^2; $$
then
$$ f + \epsilon - \sigma_\epsilon = (f + \epsilon) \big( 1 - a_{t-1}^2 - \cdots - a_N^2 \big) + \sum_{t \ne i \in \{t-1, \ldots, N\}} (f - \sigma_i)\, a_i^2 + \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2. $$
For each $i \ne t$, $f - \sigma_i \in I_i$, so
$$ (f - \sigma_i)\, a_i^2 \in \bigcap_{j=t-1}^{N} I_j = I. $$
Hence there exists $k_1 > 0$ such that
$$ (f - \sigma_i)\, a_i^2 \in I_{2k_1} \quad (t \ne i \in \{t-1, \ldots, N\}). $$
Since $f + \epsilon - \sigma_t(\epsilon) \in I_t$, we also have
$$ \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 \in \bigcap_{j=t-1}^{N} I_j = I. $$
Moreover, by the equation (3.15),
$$ \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 = \sum_{j=0}^{m_t - 2} b_j(\epsilon) f^{m_t + j} a_t^2. $$
Each $f^{m_t + j} a_t^2 \in I$, since $f^{m_t + j} \in I_t$. So there exists $k_2 > 0$ such that, for all $\epsilon > 0$,
$$ \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 \in I_{2k_2}. $$
Since $1 - a_{t-1}^2 - \cdots - a_N^2 \in I$, there also exists $k_3 > 0$ such that, for all $\epsilon > 0$,
$$ (f + \epsilon) \big( 1 - a_{t-1}^2 - \cdots - a_N^2 \big) \in I_{2k_3}. $$
Hence, if $k^* \ge \max\{k_1, k_2, k_3\}$, then we have
$$ f(x) + \epsilon - \sigma_\epsilon \in I_{2k^*} $$
for all $\epsilon > 0$. By the construction, the degrees of all $\sigma_i$ and $a_i$ are independent of $\epsilon$, so $\sigma_\epsilon \in \operatorname{Qmod}(c_{in}, \psi)_{2k^*}$ for all $\epsilon > 0$, if $k^*$ is big enough. Note that
$$ I_{2k^*} + \operatorname{Qmod}(c_{in}, \psi)_{2k^*} = \operatorname{IQ}(c_{eq}, c_{in})_{2k^*} + \operatorname{IQ}(\phi, \psi)_{2k^*}. $$
This implies that $f_{k^*} \ge f_c - \epsilon$ for all $\epsilon > 0$. On the other hand, we always have $f_{k^*} \le f_c$, so $f_{k^*} = f_c$. Moreover, since $f_k$ is monotonically increasing, we must have $f_k = f_c$ for all $k \ge k^*$.
Case II: Assume $\operatorname{IQ}(c_{eq}, c_{in})$ is archimedean. Because
$$ \operatorname{IQ}(c_{eq}, c_{in}) \subseteq \operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi), $$
the set $\operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$ is also archimedean. Therefore, the conclusion follows by applying the result of Case I.
Case III: Suppose Assumption 3.2 holds. Let $\varphi_1, \ldots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Let
$$ s := s_t + \cdots + s_N, \quad \text{where each } s_i = (v_i - f_c) \big( \varphi_i(f) \big)^2. $$
Then $s \in \Sigma[x]_{2k_4}$ for some integer $k_4 > 0$. Let
$$ \hat{f} := f - f_c - s. $$
We show that there exist an integer $\ell > 0$ and $q \in \operatorname{Qmod}(c_{in}, \psi)$ such that
$$ \hat{f}^{2\ell} + q \in \operatorname{Ideal}(c_{eq}, \phi). $$
This is because, by Assumption 3.2, $\hat{f}(x) \equiv 0$ on the set
$$ K_2 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0 \}, $$
which has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist $0 < \ell \in \mathbb{N}$ and $q = b_0 + \rho b_1$ ($b_0, b_1 \in \Sigma[x]$) such that $\hat{f}^{2\ell} + q \in \operatorname{Ideal}(c_{eq}, \phi)$. By Assumption 3.2, $\rho \in \operatorname{Qmod}(c_{in}, \psi)$, so we have $q \in \operatorname{Qmod}(c_{in}, \psi)$.

For all $\epsilon > 0$ and $\tau > 0$, we have $\hat{f} + \epsilon = \phi_\epsilon + \theta_\epsilon$, where
$$ \phi_\epsilon = -\tau \epsilon^{1-2\ell} \big( \hat{f}^{2\ell} + q \big), \qquad \theta_\epsilon = \epsilon \Big( 1 + \hat{f}/\epsilon + \tau \big( \hat{f}/\epsilon \big)^{2\ell} \Big) + \tau \epsilon^{1-2\ell} q. $$
By Lemma 2.1 of [31], when $\tau \ge \frac{1}{2\ell}$, there exists $k_5$ such that, for all $\epsilon > 0$,
$$ \phi_\epsilon \in \operatorname{Ideal}(c_{eq}, \phi)_{2k_5}, \qquad \theta_\epsilon \in \operatorname{Qmod}(c_{in}, \psi)_{2k_5}. $$
Hence, we get
$$ f - (f_c - \epsilon) = \phi_\epsilon + \sigma_\epsilon, $$
where $\sigma_\epsilon := \theta_\epsilon + s \in \operatorname{Qmod}(c_{in}, \psi)_{2k_5}$ for all $\epsilon > 0$. Note that
$$ \operatorname{IQ}(c_{eq}, c_{in})_{2k_5} + \operatorname{IQ}(\phi, \psi)_{2k_5} = \operatorname{Ideal}(c_{eq}, \phi)_{2k_5} + \operatorname{Qmod}(c_{in}, \psi)_{2k_5}. $$
For all $\epsilon > 0$, $\gamma = f_c - \epsilon$ is feasible in (3.9) for the order $k_5$, so $f_{k_5} \ge f_c$. Because of (3.11) and the monotonicity of $f_k$, we have $f_k = f'_k = f_c$ for all $k \ge k_5$. $\square$
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is $f_c$, and the optimal value of (1.1) is $f_{\min}$. If $f_{\min}$ is achievable at a critical point, then $f_c = f_{\min}$. In Theorem 3.3, we have shown that $f_k = f_c$ for all $k$ big enough, where $f_k$ is the optimal value of (3.9). The value $f_c$ or $f_{\min}$ is often not known; how do we detect the tightness $f_k = f_c$ in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose $y^*$ is a minimizer of (3.8) for the order $k$. Let
$$(3.16)\qquad d = \lceil \deg(c_{eq}, c_{in}, \phi, \psi)/2 \rceil. $$
If there exists an integer $t \in [d, k]$ such that
$$(3.17)\qquad \operatorname{rank} M_t(y^*) = \operatorname{rank} M_{t-d}(y^*), $$
then $f_k = f_c$, and we can get $r := \operatorname{rank} M_t(y^*)$ minimizers for (3.7) [5, 14, 32].
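The rank test (3.17) is easy to run numerically once $y^*$ is available. The toy check below is our own illustration of the condition itself (not a full relaxation): for $n = 1$ and $d = 1$, it builds the moment matrices of the tms of a point mass at $u = 3$, verifies the rank equality, and reads the minimizer off the rank-one matrix.

```python
import numpy as np

def moment_matrix(y, k):
    # (i, j) entry of M_k(y) for n = 1 is y_{i+j}
    return np.array([[y[i + j] for j in range(k + 1)] for i in range(k + 1)])

u, k, d = 3.0, 2, 1
y = np.array([u**a for a in range(2 * k + 1)])   # tms of a point mass at u

t = k
flat = (np.linalg.matrix_rank(moment_matrix(y, t)) ==
        np.linalg.matrix_rank(moment_matrix(y, t - d)))   # condition (3.17)
minimizer = y[1] / y[0]    # for a rank-one moment matrix, the point is y_1 / y_0
```

In floating point, `matrix_rank` needs a sensible tolerance for SDP output; for higher ranks, the extraction procedure of [14] (implemented in GloptiPoly 3) replaces the last line.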
The method in [14] can be used to extract minimizers; it is implemented in the software GloptiPoly 3 [13]. Generally, (3.17) serves as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$) can also be detected by solving the relaxations (3.8)-(3.9).
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:
i) If (3.8) is infeasible for some order $k$, then no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order $k$ is big enough.
In the following, assume (3.7) is feasible (i.e., $f_c < +\infty$). Then, for all $k$ big enough, (3.8) has a minimizer $y^*$. Moreover:
iii) If (3.17) is satisfied for some $t \in [d, k]$, then $f_k = f_c$.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer $y^*$ of (3.8) must satisfy (3.17) for some $t \in [d, k]$, when $k$ is big enough.
Proof. By Assumption 3.1, $u$ is a critical point if and only if $c_{eq}(u) = 0$ and $\phi(u) = 0$.
i) For every feasible point $u$ of (3.7), the tms $[u]_{2k}$ (see §2 for the notation) is feasible for (3.8), for all $k$. Therefore, if (3.8) is infeasible for some $k$, then (3.7) must be infeasible.
ii) By Assumption 3.2, when (3.7) is infeasible, the set
$$ \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0 \} $$
is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that $-1 \in \operatorname{Ideal}(c_{eq}, \phi) + \operatorname{Qmod}(\rho)$. By Assumption 3.2,
$$ \operatorname{Ideal}(c_{eq}, \phi) + \operatorname{Qmod}(\rho) \subseteq \operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi). $$
Thus, for all $k$ big enough, (3.9) is unbounded from above; hence (3.8) must be infeasible, by weak duality.
When (3.7) is feasible, $f$ achieves finitely many values on $\mathcal{K}_c$, so (3.7) must achieve its optimal value $f_c$. By Theorem 3.3, we know that $f_k = f'_k = f_c$ for all $k$ big enough. For each minimizer $u^*$ of (3.7), the tms $[u^*]_{2k}$ is a minimizer of (3.8).
iii) If (317) holds we can get r = rankMt(ylowast) minimizers for (37) [5 14] say
u1 ur such that fk = f(ui) for each i Clearly fk = f(ui) ge fc On the otherhand we always have fk le fc So fk = fc
iv) By Assumption 32 (37) is equivalent to the problem
(318)
min f(x)st ceq(x) = 0 φ(x) = 0 ρ(x) ge 0
The optimal value of (318) is also fc Its kth order Lasserrersquos relaxation is
(319)
γprimek = min 〈f y〉st 〈1 y〉 = 1Mk(y) 0
L(k)ceq (y) = 0 L
(k)φ (y) = 0 L
(k)ρ (y) 0
Its dual optimization problem is
(320)
γk = max γ
st f minus γ isin Ideal(ceq φ)2k + Qmod(ρ)2k
By repeating the same proof as for Theorem 33(iii) we can show that
γk = γprimek = fc
for all k big enough Because ρ isin Qmod(cin ψ) each y feasible for (38) is alsofeasible for (319) So when k is big each ylowast is also a minimizer of (319) Theproblem (318) also has finitely many minimizers By Theorem 26 of [32] thecondition (317) must be satisfied for some t isin [d k] when k is big enough
If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].
4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

    P := { x ∈ ℝ^n : Ax − b ≥ 0 },

where A = [a_1 ⋯ a_m]^T ∈ ℝ^{m×n} and b = [b_1 ⋯ b_m]^T ∈ ℝ^m. This corresponds to the case that E = ∅, I = [m], and each c_i(x) = a_i^T x − b_i. Denote
(4.1)    D(x) := diag( c_1(x), ..., c_m(x) ),    C(x) := [ A^T  ]
                                                         [ D(x) ]

(the blocks A^T and D(x) are stacked). The Lagrange multiplier vector λ = [λ_1 ⋯ λ_m]^T satisfies

(4.2)    [ A^T  ] λ = [ ∇f(x) ]
         [ D(x) ]     [   0   ].

If rank A = m, we can express λ as

(4.3)    λ = (A A^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ ℝ[x]^{m×(n+m)} such that

(4.4)    L(x) C(x) = I_m,

then we can get

    λ = L(x) [ ∇f(x) ]  =  L_1(x) ∇f(x),
             [   0   ]
where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists, and we give a degree bound for it.
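For the full-row-rank case, the closed form (4.3) is easy to test numerically. The following sketch (numpy) uses a made-up toy instance chosen for illustration, not an example from the paper:

```python
import numpy as np

# Toy instance with rank A = m = 1: minimize f(x) = x1^2 + x2^2 + x3^2
# over the half-space x1 + x2 + x3 - 1 >= 0. The minimizer is
# u = (1/3, 1/3, 1/3) and the constraint is active there.
A = np.array([[1.0, 1.0, 1.0]])
u = np.full(3, 1 / 3)
grad_f = 2 * u                                  # gradient of f at u
lam = np.linalg.solve(A @ A.T, A @ grad_f)      # formula (4.3)
assert np.allclose(grad_f, A.T @ lam)           # KKT stationarity
assert lam[0] >= 0                              # multiplier of an active inequality
```

Here the linear system in (4.2) is square after multiplying by A, so the multiplier is recovered exactly.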
The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ ℂ^n (see also Definition 5.1). This is equivalent to the condition that, for every u, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the vectors a_{i_1}, ..., a_{i_k} are linearly independent.
Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.

Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.
Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ ℝ[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.
For a subset I = {i_1, ..., i_{m−n}} of [m], denote

    c_I(x) := ∏_{i∈I} c_i(x),    E_I(x) := c_I(x) · diag( c_{i_1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1} ),

    D_I(x) := diag( c_{i_1}(x), ..., c_{i_{m−n}}(x) ),    A_I := [ a_{i_1} ⋯ a_{i_{m−n}} ]^T.

For the case that I = ∅ (the empty set), we set c_∅(x) = 1. Let

    V := { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5)    L_I(x) C(x) = c_I(x) I_m.
The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix of L_I whose row indices are from J and whose column indices are from K):

(4.6)
    rows \ columns   [n]                         n + I                              n + [m]\I
    I                0                           E_I(x)                             0
    [m]\I            c_I(x)·(A_{[m]\I})^{−T}     −(A_{[m]\I})^{−T} (A_I)^T E_I(x)   0

Equivalently, the blocks of L_I are

    L_I( I, [n] ) = 0,    L_I( I, n + [m]\I ) = 0,    L_I( [m]\I, n + [m]\I ) = 0,

    L_I( I, n + I ) = E_I(x),    L_I( [m]\I, [n] ) = c_I(x) · (A_{[m]\I})^{−T},

    L_I( [m]\I, n + I ) = −(A_{[m]\I})^{−T} (A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = L_I(x) C(x); then one can verify that

    G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},    G(I, [m]\I) = 0,

    G([m]\I, [m]\I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] · [ (A_{[m]\I})^T ]  =  c_I(x) I_n,
                                                                                     [       0       ]

    G([m]\I, I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] · [ (A_I)^T ]  =  0.
                                                                                 [ D_I(x)  ]

This shows that the above L_I(x) satisfies (4.5).
Step II: We show that there exist real scalars ν_I satisfying

(4.7)    Σ_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.
• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.
• When m > n, let

(4.8)    N := { i ∈ [m] : rank A_{[m]\{i}} = n }.

For each i ∈ N, let V_i be the set of all I' ⊆ [m]\{i} such that |I'| = m − n − 1 and rank A_{[m]\(I'∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^(i)_{I'} satisfying

(4.9)    Σ_{I'∈V_i} ν^(i)_{I'} c_{I'}(x) = 1.

Since rank A = n, we can generally assume that {a_1, ..., a_n} is linearly independent. So there exist scalars α_1, ..., α_n such that

    a_m = α_1 a_1 + ⋯ + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows; so (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

    { i : α_i ≠ 0 } = { i_1, ..., i_k }.

Then a_{i_1}, ..., a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

    c_{i_1}(x) = ⋯ = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, ..., μ_{k+1} such that

    μ_1 c_{i_1}(x) + ⋯ + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This can be implied by the echelon form for inconsistent linear systems.) Note that i_1, ..., i_{k+1} ∈ N. For each j = 1, ..., k + 1, by (4.9),

    Σ_{I'∈V_{i_j}} ν^(i_j)_{I'} c_{I'}(x) = 1.

Then we can get

    1 = Σ_{j=1}^{k+1} μ_j c_{i_j}(x) = Σ_{j=1}^{k+1} μ_j Σ_{I'∈V_{i_j}} ν^(i_j)_{I'} c_{i_j}(x) c_{I'}(x)
      = Σ_{ I = I'∪{i_j},  I'∈V_{i_j},  1≤j≤k+1 } ν^(i_j)_{I'} μ_j c_I(x).

Since each I' ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.
Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)    L(x) := Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

    L(x) C(x) = Σ_{I∈V} ν_I L_I(x) C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n.
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does exist, a degree bound for L(x) is m − rank A. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x) satisfying L(x)C(x) = I_m for polyhedral sets.
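The coefficient-matching computation just described can be sketched in a few lines. The instance below (a one-variable box, handled with sympy) is a small made-up illustration, not an example from the paper: the unknown coefficients of a degree-1 candidate L(x) are solved from the linear system obtained by equating coefficients in L(x)C(x) = I_2.

```python
import sympy as sp

x = sp.symbols('x')
# One-variable box: c1(x) = x, c2(x) = 1 - x, so A = [1, -1]^T and
# C(x) stacks A^T over D(x) = diag(c1, c2).
C = sp.Matrix([[1, -1],
               [x,  0],
               [0, 1 - x]])

a = sp.symbols('a0:12')   # one (constant, linear) coefficient pair per entry of L
L = sp.Matrix(2, 3, lambda i, j: a[2 * (3 * i + j)] + a[2 * (3 * i + j) + 1] * x)

# L(x) C(x) = I_2 must hold identically in x: match coefficients of each entry.
eqs = []
for entry in (L * C - sp.eye(2)).expand():
    eqs += sp.Poly(entry, x).all_coeffs()
sol = sp.solve(eqs, a, dict=True)[0]   # linear, underdetermined system
L_sol = L.subs(sol)
assert (L_sol * C).expand() == sp.eye(2)
```

Since the system is underdetermined, sympy returns a parametric family; every member satisfies L(x)C(x) = I_2 identically, which is consistent with L(x) not being unique.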
Example 4.2. Consider the simplicial set

    { x_1 ≥ 0, ..., x_n ≥ 0, 1 − e^T x ≥ 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ 1−x_1   −x_2   ⋯   −x_n    1 ⋯ 1 ]
           [ −x_1   1−x_2   ⋯   −x_n    1 ⋯ 1 ]
           [   ⋮       ⋮    ⋱     ⋮     ⋮   ⋮ ]
           [ −x_1    −x_2   ⋯  1−x_n    1 ⋯ 1 ]
           [ −x_1    −x_2   ⋯   −x_n    1 ⋯ 1 ].
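This identity can be checked symbolically. The sketch below (Python with sympy, fixing n = 3) rebuilds C(x) and the displayed L(x) and verifies L(x)C(x) = I_{n+1}; it is a verification aid added here, not part of the original computation:

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n + 1}'))
# Simplicial set: c_i = x_i for i <= n, and c_{n+1} = 1 - e^T x.
A = sp.Matrix.vstack(sp.eye(n), -sp.ones(1, n))        # rows are a_i^T
c = sp.Matrix.vstack(x, sp.Matrix([1 - sum(x)]))
C = sp.Matrix.vstack(A.T, sp.diag(*c))                 # C(x) = [A^T; D(x)]
# L(x): row i of the first block is e_i^T - x^T (last row: -x^T);
# the remaining n+1 columns are all ones.
first = sp.Matrix.vstack(sp.eye(n), sp.zeros(1, n)) - sp.Matrix.vstack(*([x.T] * (n + 1)))
L = sp.Matrix.hstack(first, sp.ones(n + 1, n + 1))
assert (L * C).expand() == sp.eye(n + 1)
```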
Example 4.3. Consider the box constraint

    { x_1 ≥ 0, ..., x_n ≥ 0, 1 − x_1 ≥ 0, ..., 1 − x_n ≥ 0 }.

The equation L(x)C(x) = I_{2n} is satisfied by

    L(x) = [ I_n − diag(x)   I_n   I_n ]
           [ −diag(x)        I_n   I_n ].
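The box-constraint identity can be verified the same way; the following sympy sketch (n = 3, added here for illustration) checks L(x)C(x) = I_{2n}:

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n + 1}'))
X = sp.diag(*x)
# Box: c_i = x_i and c_{n+i} = 1 - x_i, so m = 2n and A = [I_n; -I_n].
A = sp.Matrix.vstack(sp.eye(n), -sp.eye(n))
c = sp.Matrix.vstack(x, sp.ones(n, 1) - x)
C = sp.Matrix.vstack(A.T, sp.diag(*c))
L = sp.Matrix([[sp.eye(n) - X, sp.eye(n), sp.eye(n)],
               [-X,            sp.eye(n), sp.eye(n)]])
assert (L * C).expand() == sp.eye(2 * n)
```

Note that this L(x) has degree 1, smaller than the bound m − rank A = n from Proposition 4.1, illustrating that the bound is not always attained.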
Example 4.4. Consider the polyhedral set

    { 1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0 }.

The equation L(x)C(x) = I_5 is satisfied by

    L(x) = (1/2) · [ −x_1−1   −x_2−1   −x_3−1   −x_4−1    1 1 1 1 1 ]
                   [ −x_1−1   −x_2−1   −x_3−1    1−x_4    1 1 1 1 1 ]
                   [ −x_1−1   −x_2−1    1−x_3    1−x_4    1 1 1 1 1 ]
                   [ −x_1−1    1−x_2    1−x_3    1−x_4    1 1 1 1 1 ]
                   [  1−x_1    1−x_2    1−x_3    1−x_4    1 1 1 1 1 ].
Example 4.5. Consider the polyhedral set

    { 1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0 }.

The matrix L(x) satisfying L(x)C(x) = I_4 is

    L(x) = (1/6) · [ x_1²−3x_1+2    x_1x_2−x_2       4−x_1    2−x_1    1−x_1     1−x_1   ]
                   [ 3x_1²−3x_1−6   3x_2+3x_1x_2     6−3x_1   −3x_1    −3x_1−3   −3x_1−3 ]
                   [ 1−x_1²         −x_1x_2−2x_2−3   x_1−1    x_1+1    x_1+2     x_1+2   ]
                   [ 1−x_1²         3−x_1x_2−2x_2    x_1−1    x_1+1    x_1+2     x_1+2   ].
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.
Suppose there are m equality and inequality constraints in total, i.e.,

    E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1 ⋯ λ_m]^T satisfies the equation

(5.1)    C(x) λ = [ ∇f(x) ; 0 ; ⋯ ; 0 ],

where

    C(x) := [ ∇c_1(x)   ∇c_2(x)   ⋯   ∇c_m(x) ]
            [  c_1(x)      0      ⋯      0    ]
            [    0       c_2(x)   ⋯      0    ]
            [    ⋮          ⋮     ⋱      ⋮    ]
            [    0          0     ⋯    c_m(x) ].

Let C(x) be as above. If there exists L(x) ∈ ℝ[x]^{m×(m+n)} such that

(5.2)    L(x) C(x) = I_m,

then we can get

(5.3)    λ = L(x) [ ∇f(x) ]  =  L_1(x) ∇f(x),
                  [   0   ]

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.
Definition 5.1. The tuple c = (c_1, ..., c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ ℂ^n.

Clearly, c being nonsingular is equivalent to the condition that, for each u ∈ ℂ^n, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), ..., ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For W(x) ∈ ℂ[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ ℂ^n if and only if there exists P(x) ∈ ℂ[x]^{t×s} such that

    P(x) W(x) = I_t.

Moreover, for W(x) ∈ ℝ[x]^{s×t}, we can choose P(x) ∈ ℝ[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ ℝ[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ ℂ^n,

    t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.
"⇒": Suppose rank W(u) = t for all u ∈ ℂ^n. Write W(x) in columns:

    W(x) = [ w_1(x)   w_2(x)   ⋯   w_t(x) ].
Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ ℂ[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, ..., t, denote

    r_{1i}(x) := ξ_1(x)^T w_i(x);

then (we use ∼ to denote row equivalence between matrices)

    W(x) ∼ [ 1        r_{12}(x)   ⋯   r_{1t}(x) ]  ∼  W_1(x) := [ 1   r_{12}(x)    ⋯   r_{1t}(x)  ]
           [ w_1(x)   w_2(x)      ⋯   w_t(x)    ]               [ 0   w_2^(1)(x)   ⋯   w_t^(1)(x) ],

where each (i = 2, ..., t)

    w_i^(1)(x) := w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ ℝ[x]^{(s+1)×s} such that

    P_1(x) W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

    w_2^(1)(x) = 0

does not have a complex solution. Again by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ ℂ[x]^s such that

    ξ_2(x)^T w_2^(1)(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) := ξ_2(x)^T w_i^(1)(x); then

    W_1(x) ∼ [ 1   r_{12}(x)    r_{13}(x)    ⋯   r_{1t}(x)  ]  ∼  W_2(x) := [ 1   r_{12}(x)   r_{13}(x)    ⋯   r_{1t}(x)  ]
             [ 0   1            r_{23}(x)    ⋯   r_{2t}(x)  ]               [ 0   1           r_{23}(x)    ⋯   r_{2t}(x)  ]
             [ 0   w_2^(1)(x)   w_3^(1)(x)   ⋯   w_t^(1)(x) ]               [ 0   0           w_3^(2)(x)   ⋯   w_t^(2)(x) ],

where each (i = 3, ..., t)

    w_i^(2)(x) := w_i^(1)(x) − r_{2i}(x) w_2^(1)(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ ℂ[x]^{(s+2)×(s+1)} such that

    P_2(x) W_1(x) = W_2(x).

Continuing this process, we can finally get

    W_2(x) ∼ ⋯ ∼ W_t(x) := [ 1   r_{12}(x)   r_{13}(x)   ⋯   r_{1t}(x) ]
                           [ 0   1           r_{23}(x)   ⋯   r_{2t}(x) ]
                           [ 0   0           1           ⋯   r_{3t}(x) ]
                           [ ⋮                           ⋱   ⋮         ]
                           [ 0   0           0           ⋯   1         ]
                           [ 0   0           0           ⋯   0         ].

Consequently, there exist P_i(x) ∈ ℝ[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

    P_t(x) P_{t−1}(x) ⋯ P_1(x) W(x) = W_t(x).
Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ ℝ[x]^{t×(s+t)} such that P_{t+1}(x) W_t(x) = I_t. Let

    P(x) := P_{t+1}(x) P_t(x) P_{t−1}(x) ⋯ P_1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ ℂ[x]^{t×s}.
For W(x) ∈ ℝ[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (here P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.
(ii) The conclusion is implied directly by item (i).
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), the matrix L(x) can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints

    1 − x_1² ≥ 0,  1 − x_2² ≥ 0,  ...,  1 − x_n² ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −(1/2) diag(x)   I_n ].
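For these quadratic constraints, the identity is immediate: −(1/2)diag(x)·(−2diag(x)) + diag(1 − x_i²) = I_n. A sympy check (n = 3, added for illustration):

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n + 1}'))
# Constraints c_i = 1 - x_i^2, with gradient -2 x_i e_i.
C = sp.Matrix.vstack(-2 * sp.diag(*x), sp.diag(*[1 - xi**2 for xi in x]))
L = sp.Matrix.hstack(-sp.diag(*x) / 2, sp.eye(n))
assert (L * C).expand() == sp.eye(n)
```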
Example 5.4. Consider the nonnegative portion of the unit sphere:

    x_1 ≥ 0,  x_2 ≥ 0,  ...,  x_n ≥ 0,  x_1² + ⋯ + x_n² − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − x x^T    x·1_n^T        2x ]
           [ (1/2) x^T      −(1/2) 1_n^T   −1 ].
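This block identity can also be verified symbolically; the sketch below (sympy, n = 3, added for illustration) builds C(x) from the gradients and constraint values and checks L(x)C(x) = I_{n+1}:

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n + 1}'))
ones = sp.ones(n, 1)
# Constraints: c_i = x_i (i <= n) and c_{n+1} = x^T x - 1.
grads = sp.Matrix.hstack(sp.eye(n), 2 * x)      # columns are the gradients
c = sp.Matrix.vstack(x, sp.Matrix([(x.T * x)[0, 0] - 1]))
C = sp.Matrix.vstack(grads, sp.diag(*c))
L = sp.Matrix([[sp.eye(n) - x * x.T, x * ones.T,  2 * x],
               [x.T / 2,             -ones.T / 2, sp.Matrix([[-1]])]])
assert (L * C).expand() == sp.eye(n + 1)
```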
Example 5.5. Consider the set

    1 − x_1³ − x_2⁴ ≥ 0,  1 − x_3⁴ − x_4³ ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

    L(x) = [ −x_1/3   −x_2/4   0        0        1   0 ]
           [ 0        0        −x_3/4   −x_4/3   0   1 ].
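Here each row pairs a scaled version of the gradient with the constraint itself, so that the degrees cancel: for instance (−x_1/3)(−3x_1²) + (−x_2/4)(−4x_2³) + c_1(x) = 1. A sympy check (added for illustration):

```python
import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1:5')
c1 = 1 - x1**3 - x2**4
c2 = 1 - x3**4 - x4**3
grads = sp.Matrix(4, 2, lambda i, j: sp.diff([c1, c2][j], [x1, x2, x3, x4][i]))
C = sp.Matrix.vstack(grads, sp.diag(c1, c2))
L = sp.Matrix([[-x1/3, -x2/4, 0, 0, 1, 0],
               [0, 0, -x3/4, -x4/3, 0, 1]])
assert (L * C).expand() == sp.eye(2)
```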
Example 5.6. Consider the quadratic set

    1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0,  1 − x_1² − x_2² − x_3² ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

    [ 25x_1³ + 10x_1²x_2 + 40x_1x_2² − 25x_1 − 2x_3    −25x_1³ − 10x_1²x_2 − 40x_1x_2² + 49x_1/2 + 2x_3  ]
    [ −15x_1²x_2 + 10x_1x_2² + 20x_1x_2x_3 − 10x_1     15x_1²x_2 − 10x_1x_2² − 20x_1x_2x_3 + 10x_1 − x_2/2 ]
    [ 25x_1²x_3 − 20x_1x_2² + 10x_1x_2x_3 + 2x_1       −25x_1²x_3 + 20x_1x_2² − 10x_1x_2x_3 − 2x_1 − x_3/2 ]
    [ 1 − 20x_1x_3 − 10x_1² − 20x_1x_2                 20x_1x_2 + 20x_1x_3 + 10x_1²                        ]
    [ −50x_1² − 20x_1x_2                               50x_1² + 20x_1x_2 + 1                               ].
We would like to remark that a polynomial tuple c = (c_1, ..., c_m) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, ..., d_m, there exists an open dense subset U of D := ℝ[x]_{d_1} × ⋯ × ℝ[x]_{d_m} such that every tuple c = (c_1, ..., c_m) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof needs to use resultants and discriminants, for which we refer to [29].
First, let J_1 be the set of all (i_1, ..., i_{n+1}) with 1 ≤ i_1 < ⋯ < i_{n+1} ≤ m. The resultant Res(c_{i_1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, ..., c_{i_{n+1}} such that, if Res(c_{i_1}, ..., c_{i_{n+1}}) ≠ 0, then the equations

    c_{i_1}(x) = ⋯ = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

    F_1(c) := ∏_{(i_1,...,i_{n+1})∈J_1} Res(c_{i_1}, ..., c_{i_{n+1}}).

For the case that m ≤ n, J_1 = ∅, and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n of the polynomials c_1, ..., c_m can have a common complex zero.
Second, let J_2 be the set of all (j_1, ..., j_k) with k ≤ n and 1 ≤ j_1 < ⋯ < j_k ≤ m. When one of c_{j_1}, ..., c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, ..., c_{j_k}) is a polynomial in the coefficients of c_{j_1}, ..., c_{j_k} such that, if ∆(c_{j_1}, ..., c_{j_k}) ≠ 0, then the equations

    c_{j_1}(x) = ⋯ = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j_1}, ..., c_{j_k} at u are linearly independent. When all of c_{j_1}, ..., c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, ..., c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, ..., c_{j_k}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

    F_2(c) := ∏_{(j_1,...,j_k)∈J_2} ∆(c_{j_1}, ..., c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer of the polynomials c_1, ..., c_m can have a singular common complex zero.
Last, let F(c) := F_1(c) F_2(c) and

    U := { c = (c_1, ..., c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open and dense in D. For all c ∈ U, no more than n of c_1, ..., c_m can have a common complex zero; and for any k polynomials (k ≤ n) of c_1, ..., c_m, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.
Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically.
6 Numerical examples
This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.
The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, ..., c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of the first n columns. Then choose (p_1, ..., p_m) to be the product L_1(x)∇f(x), i.e.,

    p_i = ( L_1(x) ∇f(x) )_i.

In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
By Theorem 3.3, we have f_k = f_min for all k big enough if f_c = f_min and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables. The computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

    min  x_1 x_2 (10 − x_3)
    s.t. x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0, 1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1 x_2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

    order k | w/o LME: lower bound, time | with LME: lower bound, time
    2       | −0.0521,   0.6841          | −0.0521,   0.1922
    3       | −0.0026,   0.2657          | −3·10⁻⁸,   0.2285
    4       | −0.0007,   0.6785          | −6·10⁻⁹,   0.4431
    5       | −0.0004,   1.6105          | −2·10⁻⁹,   0.9567

² Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^n x_i(1 − x_i)² + Σ_{i≠j} x_i² x_j ∈ IQ(c_eq, c_in).
Example 6.2. Consider the optimization problem

    min  x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3² + (x_1⁴ + x_2⁴ + x_3⁴)
    s.t. x_1² + x_2² + x_3² ≥ 1.

The matrix polynomial is L(x) = [ x_1/2   x_2/2   x_3/2   −1 ]. The objective f is the sum of the positive definite form x_1⁴ + x_2⁴ + x_3⁴ and the Motzkin polynomial

    M(x) := x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive, and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c_1(x) p_1(x) = (x_1² + x_2² + x_3² − 1)( 3M(x) + 2(x_1⁴ + x_2⁴ + x_3⁴) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value is f_min = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

    order k | w/o LME: lower bound, time | with LME: lower bound, time
    3       | −∞,            0.4466      | 0.1111,  0.1169
    4       | −∞,            0.4948      | 0.3333,  0.3499
    5       | −2.1821·10⁵,   1.1836      | 0.3333,  0.6530
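The multiplier polynomial p_1 = (L_1(x)∇f(x))_1 in this example can be double-checked with Euler's identity for homogeneous polynomials: applying (1/2)x^T∇ scales the degree-6 part M(x) by 3 and the quartic part by 2, giving p_1 = 3M(x) + 2(x_1⁴ + x_2⁴ + x_3⁴). A small sympy check, added here for illustration:

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1:4')
M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2   # Motzkin form
quartic = x1**4 + x2**4 + x3**4
f = M + quartic
grad = sp.Matrix([sp.diff(f, v) for v in (x1, x2, x3)])
L1 = sp.Matrix([[x1/2, x2/2, x3/2]])        # first n columns of L(x)
p1 = sp.expand((L1 * grad)[0, 0])
# Euler's identity: x.grad multiplies the degree-6 and degree-4 parts by 6 and 4.
assert p1 == sp.expand(3*M + 2*quartic)
```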
Example 6.3. Consider the optimization problem

    min  x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1³ + ⋯ + x_4³)
    s.t. x_1, x_2, x_3, x_4 ≥ 0,  1 − x_1 − x_2 ≥ 0,  1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1   −x_2    0       0       1  1  0  0  1  0 ]
    [ −x_1    1−x_2   0       0       1  1  0  0  1  0 ]
    [ 0       0       1−x_3   −x_4    0  0  1  1  0  1 ]
    [ 0       0       −x_3    1−x_4   0  0  1  1  0  1 ]
    [ −x_1    −x_2    0       0       1  1  0  0  1  0 ]
    [ 0       0       −x_3    −x_4    0  0  1  1  0  1 ].

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because −c_1²p_1² ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c_1(x)²p_1(x)² ≥ 0} is compact.
⁴ This is because 1 − x_1² − x_2² belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3² − x_4² belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

    order k | w/o LME: lower bound, time | with LME: lower bound, time
    3       | −2.9·10⁻⁵,   0.7335        | −6·10⁻⁷,   0.6091
    4       | −1.4·10⁻⁵,   2.5055        | −8·10⁻⁸,   2.7423
    5       | −1.4·10⁻⁵,   12.7092       | −5·10⁻⁸,   13.7449
Example 6.4. Consider the polynomial optimization problem

    min_{x∈ℝ²}  x_1² + 50x_2²
    s.t.  x_1² − 1/2 ≥ 0,  x_2² − 2x_1x_2 − 1/8 ≥ 0,  x_2² + 2x_1x_2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The first column of L(x) is
    [ (8x_1³ + x_1)/5                                       ]
    [ (288x_2x_1⁴ − 16x_1³ − 124x_2x_1² + 8x_1)/5 − 2x_2    ]
    [ (−288x_2x_1⁴ − 16x_1³ + 124x_2x_1² + 8x_1)/5 + 2x_2   ]

and the second column of L(x) is

    [ (−8x_1²x_2 + 4x_2³)/5 − x_2/10                                       ]
    [ (288x_1³x_2² + 16x_1²x_2 − 142x_1x_2² − 8x_2³ + 11x_2)/5 − 9x_1/20   ]
    [ (−288x_1³x_2² + 16x_1²x_2 + 142x_1x_2² − 8x_2³ + 11x_2)/5 + 9x_1/20  ].
The objective is coercive, so f_min is achieved at a critical point. The minimum value is f_min = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

    order k | w/o LME: lower bound, time | with LME: lower bound, time
    3       | 6.7535,   0.4611           | 56.7500,   0.1309
    4       | 6.9294,   0.2428           | 112.6517,  0.2405
    5       | 8.8519,   0.3376           | 112.6517,  0.2167
    6       | 16.5971,  0.4703           | 112.6517,  0.3788
    7       | 35.4756,  0.6536           | 112.6517,  0.4537
Example 6.5. Consider the optimization problem

    min_{x∈ℝ³}  x_1³ + x_2³ + x_3³ + 4x_1x_2x_3 − ( x_1(x_2² + x_3²) + x_2(x_3² + x_1²) + x_3(x_1² + x_2²) )
    s.t.  x_1 ≥ 0,  x_1x_2 − 1 ≥ 0,  x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1x_2   0     0   x_2   x_2   0  ]
    [ x_1        0     0   −1    −1    0  ]
    [ −x_1       x_2   0   1     0     −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant ℝ³₊, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

    order k | w/o LME: lower bound, time | with LME: lower bound, time
    2       | −∞,            0.4129      | −∞,       0.1900
    3       | −7.8184·10⁶,   0.4641      | 0.9492,   0.3139
    4       | −2.0575·10⁴,   0.6499      | 0.9492,   0.5057
Example 6.6. Consider the optimization problem (x_0 = 1)

    min_{x∈ℝ⁴}  x^T x + Σ_{i=0}^4 ∏_{j≠i} (x_i − x_j)
    s.t.  x_1² − 1 ≥ 0,  x_2² − 1 ≥ 0,  x_3² − 1 ≥ 0,  x_4² − 1 ≥ 0.

The matrix polynomial is L(x) = [ (1/2) diag(x)   −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers:

    (1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
    (1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
    (−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

    order k | w/o LME: lower bound, time | with LME: lower bound, time
    3       | −∞,            1.1377      | 3.5480,   1.1765
    4       | −6.6913·10⁴,   4.7677      | 4.0000,   3.0761
    5       | −21.3778,      22.9970     | 4.0000,   10.3354
Example 6.7. Consider the optimization problem

    min_{x∈ℝ³}  x_1⁴x_2² + x_2⁴x_3² + x_3⁴x_1² − 3x_1²x_2²x_3² + x_2²
    s.t.  x_1 − x_2x_3 ≥ 0,  −x_2 + x_3² ≥ 0.

The matrix polynomial is

    L(x) = [ 1      0    0   0   0 ]
           [ −x_3   −1   0   0   0 ].

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are the points (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for the standard Lasserre
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

    order k | w/o LME: lower bound, time | with LME: lower bound, time
    3       | −∞,            0.6144      | −∞,          0.3418
    4       | −1.0909·10⁷,   1.0542      | −3.9476,     0.7180
    5       | −942.6772,     1.6771      | −3·10⁻⁹,     1.4607
    6       | −0.0110,       3.3532      | −8·10⁻¹⁰,    3.1618
Example 6.8. Consider the optimization problem

    min_{x∈ℝ⁴}  x_1²(x_1 − x_4)² + x_2²(x_2 − x_4)² + x_3²(x_3 − x_4)²
                + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)² + (x_2 − 1)² + (x_3 − 1)²
    s.t.  x_1 − x_2 ≥ 0,  x_2 − x_3 ≥ 0.

The matrix polynomial is

    L(x) = [ 1   0   0   0   0   0 ]
           [ 1   1   0   0   0   0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

    order k | w/o LME: lower bound, time | with LME: lower bound, time
    2       | −∞,            0.3984      | −0.3360,   0.9321
    3       | −∞,            0.7634      | 0.9413,    0.5240
    4       | −6.4896·10⁵,   4.5496      | 0.9413,    1.7192
    5       | −3.1645·10³,   24.3665     | 0.9413,    8.1228
Example 6.9. Consider the optimization problem

    min_{x∈ℝ⁴}  (x_1 + x_2 + x_3 + x_4 + 1)² − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
    s.t.  0 ≤ x_1, ..., x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value is f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that 4 − Σ_{i=1}^4 x_i² = Σ_{i=1}^4 ( x_i(1 − x_i)² + (1 − x_i)(1 + x_i²) ) ∈ IQ(c_eq, c_in).
Table 9. Computational results for Example 6.9

    order k | w/o LME: lower bound, time | with LME: lower bound, time
    2       | −0.0279,   0.2262          | −5·10⁻⁶,   1.1835
    3       | −0.0005,   0.4691          | −6·10⁻⁷,   1.6566
    4       | −0.0001,   3.1098          | −2·10⁻⁷,   5.5234
    5       | −4·10⁻⁵,   16.5092         | −6·10⁻⁷,   19.7320
For some polynomial optimization problems, the standard Lasserre relaxations may converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x_0 = 1)

    min_{x∈ℝⁿ}  Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
    s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time that is consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

    n         | 9      | 10     | 11     | 12      | 13      | 14
    w/o LME   | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111
    with LME  | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(c_eq, c_in)_2k + IQ(φ, ψ)_2k = Ideal(c_eq, φ)_2k + Qmod(c_in, ψ)_2k.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

    Preord(c_in, ψ) := Σ_{r_1,...,r_ℓ ∈ {0,1}} g_1^{r_1} ⋯ g_ℓ^{r_ℓ} · Σ[x].
The truncation Preord(c_in, ψ)_2k is defined similarly to Qmod(c_in, ψ)_2k in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)    f'^pre_k := min ⟨f, y⟩
         s.t.  ⟨1, y⟩ = 1,  L^(k)_{c_eq}(y) = 0,  L^(k)_φ(y) = 0,
               L^(k)_{g_1^{r_1} ⋯ g_ℓ^{r_ℓ}}(y) ⪰ 0  for all r_1, ..., r_ℓ ∈ {0, 1},
               y ∈ ℝ^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)    f^pre_k := max γ   s.t.  f − γ ∈ Ideal(c_eq, φ)_2k + Preord(c_in, ψ)_2k.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

    f^pre_k = f'^pre_k = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f^pre_k = f'^pre_k = f_min for all k big enough.
Proof. The proof is very similar to that of Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

    K_3 := { x ∈ ℝ^n : c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same.
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. For each time of its usage, if the degree bound in [16] is used, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

    λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x) ]
                                    [   0   ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgements. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References
[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: an introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC].
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Löfberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 29
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semialgebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. EURO J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. González-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA, 92093
E-mail address: njw@math.ucsd.edu
1.3. New contributions. When Lasserre relaxations are used to solve polynomial optimization, the following issues are typically of concern:
• The convergence depends on the archimedean condition (see §2), which is satisfied only if the feasible set is compact. If the set is noncompact, how can we get convergent relaxations?
• The cost of Lasserre relaxations depends significantly on the relaxation order. For a fixed order, can we construct tighter relaxations than the standard ones?
• When the convergence of Lasserre relaxations is slow, can we construct new relaxations whose convergence is faster?
• When the optimality conditions fail to hold, the Lasserre hierarchy might not have finite convergence. Can we construct a new hierarchy of stronger relaxations that also has finite convergence for such cases?
This paper addresses the above issues. We construct tighter relaxations by using optimality conditions. In (1.3)-(1.4), under the constraint qualification condition, the Lagrange multipliers λᵢ are uniquely determined by u. Consider the polynomial system in (x, λ):

(1.8)  ∑_{i∈E∪I} λᵢ ∇cᵢ(x) = ∇f(x),  cᵢ(x) = 0 (i ∈ E),  λⱼ cⱼ(x) = 0 (j ∈ I).
A point x satisfying (1.8) is called a critical point, and such a pair (x, λ) is called a critical pair. In (1.8), once x is known, λ can be determined by linear equations. Generally, the value of x is not known. One can try to express λ as a rational function in x. Suppose E ∪ I = {1, …, m} and denote

G(x) = [ ∇c₁(x) ⋯ ∇c_m(x) ].

When m ≤ n and rank G(x) = m, we can get the rational expression

(1.9)  λ = ( G(x)^T G(x) )^{−1} G(x)^T ∇f(x).
Typically, the matrix inverse (G(x)^T G(x))^{−1} is expensive for usage. The denominator det(G(x)^T G(x)) is typically a polynomial of high degree. When m > n, G(x)^T G(x) is always singular, and we cannot express λ as in (1.9). Do there exist polynomials pᵢ (i ∈ E ∪ I) such that

(1.10)  λᵢ = pᵢ(x)

for all (x, λ) satisfying (1.8)? If they exist, then we can do the following:
• The polynomial system (1.8) can be simplified to

(1.11)  ∑_{i∈E∪I} pᵢ(x) ∇cᵢ(x) = ∇f(x),  cᵢ(x) = 0 (i ∈ E),  pⱼ(x) cⱼ(x) = 0 (j ∈ I).
• For each j ∈ I, the sign condition λⱼ ≥ 0 is equivalent to

(1.12)  pⱼ(x) ≥ 0.
The new conditions (1.11) and (1.12) are only about the variable x, not λ. They can be used to construct tighter relaxations for solving (1.1).
When do there exist polynomials pᵢ satisfying (1.10)? If they exist, how can we compute them? How can we use them to construct tighter relaxations? Do the new relaxations have advantages over the old ones? These questions are the main topics of this paper. Our major results are as follows.
• We show that the polynomials pᵢ satisfying (1.10) always exist when the tuple of constraining polynomials is nonsingular (see Definition 5.1). Moreover, they can be determined by linear equations.
• Using the new conditions (1.11)-(1.12), we construct tight relaxations for solving (1.1). More precisely, we construct a hierarchy of new relaxations which has finite convergence. This is true even if the feasible set is noncompact and/or the optimality conditions fail to hold.
• For every relaxation order, the new relaxations are tighter than the standard ones in the prior work.

The paper is organized as follows. Section 2 reviews some basics of polynomial optimization. Section 3 constructs the new relaxations and proves their tightness. Section 4 characterizes when the polynomials pᵢ satisfying (1.10) exist and shows how to determine them for polyhedral constraints. Section 5 discusses the case of general nonlinear constraints. Section 6 gives examples of using the new relaxations. Section 7 discusses some related issues.
2. Preliminaries
Notation. The symbol ℕ (resp., ℝ, ℂ) denotes the set of nonnegative integers (resp., real, complex numbers). The symbol ℝ[x] = ℝ[x₁, …, xₙ] denotes the ring of polynomials in x = (x₁, …, xₙ) with real coefficients, and ℝ[x]_d stands for the set of real polynomials with degrees ≤ d. Denote

ℕⁿ_d = { α = (α₁, …, αₙ) ∈ ℕⁿ : |α| = α₁ + ⋯ + αₙ ≤ d }.

For a polynomial p, deg(p) denotes its total degree. For t ∈ ℝ, ⌈t⌉ denotes the smallest integer ≥ t. For an integer k > 0, denote [k] = {1, 2, …, k}. For x = (x₁, …, xₙ) and α = (α₁, …, αₙ), denote

x^α = x₁^{α₁} ⋯ xₙ^{αₙ},  [x]_d = [ 1, x₁, …, xₙ, x₁², x₁x₂, …, xₙ^d ]^T.

The superscript T denotes the transpose of a matrix/vector. The eᵢ denotes the ith standard unit vector, while e denotes the vector of all ones. The I_m denotes the m-by-m identity matrix. By writing X ⪰ 0 (resp., X ≻ 0), we mean that X is a symmetric positive semidefinite (resp., positive definite) matrix. For matrices X₁, …, X_r, diag(X₁, …, X_r) denotes the block diagonal matrix whose diagonal blocks are X₁, …, X_r. In particular, for a vector a, diag(a) denotes the diagonal matrix whose diagonal vector is a. For a function f in x, f_{xᵢ} denotes its partial derivative with respect to xᵢ.
We review some basics of computational algebra and polynomial optimization; they can be found in [4, 21, 22, 25, 26]. An ideal I of ℝ[x] is a subset such that I · ℝ[x] ⊆ I and I + I ⊆ I. For a tuple h = (h₁, …, h_m) of polynomials, Ideal(h) denotes the smallest ideal containing all hᵢ, which is the set

h₁ · ℝ[x] + ⋯ + h_m · ℝ[x].

The 2kth truncation of Ideal(h) is the set

Ideal(h)_{2k} = h₁ · ℝ[x]_{2k−deg(h₁)} + ⋯ + h_m · ℝ[x]_{2k−deg(h_m)}.

The truncation Ideal(h)_{2k} depends on the generators h₁, …, h_m. For an ideal I, its complex and real varieties are respectively defined as

V_ℂ(I) = { v ∈ ℂⁿ : p(v) = 0 ∀ p ∈ I },  V_ℝ(I) = V_ℂ(I) ∩ ℝⁿ.
A polynomial σ is said to be a sum of squares (SOS) if σ = s₁² + ⋯ + s_k² for some polynomials s₁, …, s_k ∈ ℝ[x]. The set of all SOS polynomials in x is denoted as Σ[x]. For a degree d, denote the truncation

Σ[x]_d = Σ[x] ∩ ℝ[x]_d.

For a tuple g = (g₁, …, g_t), its quadratic module is the set

Qmod(g) = Σ[x] + g₁ · Σ[x] + ⋯ + g_t · Σ[x].

The 2kth truncation of Qmod(g) is the set

Qmod(g)_{2k} = Σ[x]_{2k} + g₁ · Σ[x]_{2k−deg(g₁)} + ⋯ + g_t · Σ[x]_{2k−deg(g_t)}.

The truncation Qmod(g)_{2k} depends on the generators g₁, …, g_t. Denote

(2.1)  IQ(h, g) = Ideal(h) + Qmod(g),  IQ(h, g)_{2k} = Ideal(h)_{2k} + Qmod(g)_{2k}.

The set IQ(h, g) is said to be archimedean if there exists p ∈ IQ(h, g) such that p(x) ≥ 0 defines a compact set in ℝⁿ. If IQ(h, g) is archimedean, then

K = { x ∈ ℝⁿ : h(x) = 0, g(x) ≥ 0 }

must be a compact set. Conversely, if K is compact, say K ⊆ B(0, R) (the ball centered at 0 with radius R), then IQ(h, (g, R² − x^T x)) is always archimedean, and h = 0, (g, R² − x^T x) ≥ 0 define the same set K.
Theorem 2.1 (Putinar [36]). Let h, g be tuples of polynomials in ℝ[x], and let K be as above. Assume IQ(h, g) is archimedean. If a polynomial f ∈ ℝ[x] is positive on K, then f ∈ IQ(h, g).
Interestingly, if f is only nonnegative on K but the standard optimality conditions hold (see Subsection 1.1), then we still have f ∈ IQ(h, g) [33].
Let ℝ^{ℕⁿ_d} be the space of real multi-sequences indexed by α ∈ ℕⁿ_d. A vector in ℝ^{ℕⁿ_d} is called a truncated multi-sequence (tms) of degree d. A tms y = (y_α)_{α∈ℕⁿ_d} gives the Riesz functional R_y acting on ℝ[x]_d as

(2.2)  R_y( ∑_{α∈ℕⁿ_d} f_α x^α ) = ∑_{α∈ℕⁿ_d} f_α y_α.

For f ∈ ℝ[x]_d and y ∈ ℝ^{ℕⁿ_d}, we denote

(2.3)  ⟨f, y⟩ = R_y(f).

Let q ∈ ℝ[x]_{2k}. The kth localizing matrix of q, generated by y ∈ ℝ^{ℕⁿ_{2k}}, is the symmetric matrix L_q^{(k)}(y) such that

(2.4)  vec(a₁)^T ( L_q^{(k)}(y) ) vec(a₂) = R_y(q a₁ a₂)

for all a₁, a₂ ∈ ℝ[x]_{k−⌈deg(q)/2⌉}. (The vec(aᵢ) denotes the coefficient vector of aᵢ.) When q = 1, L_q^{(k)}(y) is called a moment matrix, and we denote

(2.5)  M_k(y) = L_1^{(k)}(y).

The columns and rows of L_q^{(k)}(y), as well as of M_k(y), are indexed by α ∈ ℕⁿ with 2|α| + deg(q) ≤ 2k. When q = (q₁, …, q_r) is a tuple of polynomials, we define

(2.6)  L_q^{(k)}(y) = diag( L_{q₁}^{(k)}(y), …, L_{q_r}^{(k)}(y) ),
a block diagonal matrix. For the polynomial tuples h, g as above, the set

(2.7)  S(h, g)_{2k} = { y ∈ ℝ^{ℕⁿ_{2k}} : L_h^{(k)}(y) = 0, L_g^{(k)}(y) ⪰ 0 }

is a spectrahedral cone in ℝ^{ℕⁿ_{2k}}. The set IQ(h, g)_{2k} is also a convex cone in ℝ[x]_{2k}. The dual cone of IQ(h, g)_{2k} is precisely S(h, g)_{2k} [22, 25, 34]. This is because ⟨p, y⟩ ≥ 0 for all p ∈ IQ(h, g)_{2k} and all y ∈ S(h, g)_{2k}.
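To make (2.2)-(2.5) concrete, here is a minimal numpy sketch (the point u and the test polynomial are assumptions chosen for illustration) for n = 2 and k = 1. If y is the degree-2 moment vector of the Dirac measure at u, i.e., y_α = u^α, then M₁(y) = [u]₁[u]₁^T has rank one, and the Riesz functional satisfies R_y(f) = f(u):

```python
import numpy as np
from itertools import product

# Assumed data: moments of the Dirac measure at u, for n = 2, degree 2k = 2.
n, u = 2, np.array([0.5, -1.0])
y = {a: u[0] ** a[0] * u[1] ** a[1]
     for a in product(range(3), repeat=n) if sum(a) <= 2}

# monomial exponents of degree <= 1 index the rows/columns of M_1(y)
basis = [(0, 0), (1, 0), (0, 1)]
# moment matrix: entry (alpha, beta) is y_{alpha + beta}, as in (2.4) with q = 1
M1 = np.array([[y[(a[0] + b[0], a[1] + b[1])] for b in basis] for a in basis])

# Riesz functional (2.2) of f = 3 + x1*x2 - 2*x2^2, stored as {exponent: coeff}
f = {(0, 0): 3.0, (1, 1): 1.0, (0, 2): -2.0}
Rf = sum(c * y[a] for a, c in f.items())

print(np.linalg.matrix_rank(M1), Rf)   # rank 1, and f(u) = 3 - 0.5 - 2 = 0.5
```

The rank-one structure M₁(y) = [u]₁[u]₁^T is exactly what the extraction procedures referenced in §3.2 exploit when recovering minimizers from a flat moment vector.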
3. The construction of tight relaxations
Consider the polynomial optimization problem (1.1). Let

λ = (λᵢ)_{i∈E∪I}

be the vector of Lagrange multipliers. Denote the set

(3.1)  K = { (x, λ) ∈ ℝⁿ × ℝ^{E∪I} : cᵢ(x) = 0 (i ∈ E), λⱼcⱼ(x) = 0 (j ∈ I), ∇f(x) = ∑_{i∈E∪I} λᵢ∇cᵢ(x) }.
(32) Kc = u | (u λ) isin Kis the set of all real critical points To construct tight relaxations for solving (11)we need the following assumption for Lagrange multipliers
Assumption 3.1. For each i ∈ E ∪ I, there exists a polynomial pᵢ ∈ ℝ[x] such that for all (x, λ) ∈ K it holds that λᵢ = pᵢ(x).
Assumption 3.1 is generically satisfied, as shown in Proposition 5.7. For the following special cases, we can give the polynomials pᵢ explicitly.
• (Simplex) For the simplex {e^T x − 1 = 0, x₁ ≥ 0, …, xₙ ≥ 0}, we have E = {0}, I = [n], c₀(x) = e^T x − 1, and cⱼ(x) = xⱼ (j ∈ [n]). The Lagrange multipliers can be expressed as

(3.3)  λ₀ = x^T ∇f(x),  λⱼ = f_{xⱼ} − x^T ∇f(x) (j ∈ [n]).

• (Hypercube) For the hypercube [−1, 1]ⁿ, we have E = ∅, I = [n], and each cⱼ(x) = 1 − xⱼ². We can show that

(3.4)  λⱼ = −½ xⱼ f_{xⱼ} (j ∈ [n]).

• (Ball or sphere) The constraint is 1 − x^T x = 0 or 1 − x^T x ≥ 0. Here E ∪ I = {1} and c₁ = 1 − x^T x. We have

(3.5)  λ₁ = −½ x^T ∇f(x).

• (Triangular constraints) Suppose E ∪ I = {1, …, m} and each

cᵢ(x) = τᵢ xᵢ + qᵢ(x_{i+1}, …, xₙ)

for some polynomials qᵢ ∈ ℝ[x_{i+1}, …, xₙ] and scalars τᵢ ≠ 0. The matrix T(x), consisting of the first m rows of [∇c₁(x), …, ∇c_m(x)], is an invertible lower triangular matrix with constant diagonal entries. Then

λ = T(x)^{−1} · [ f_{x₁} ⋯ f_{x_m} ]^T.

Note that the inverse T(x)^{−1} is a matrix polynomial.
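The simplex expressions (3.3) can be checked numerically. The sketch below (with an assumed objective f = x₁² + 2x₂², which is not from the paper) computes the multipliers directly from the critical-pair equations at an interior critical point and at a vertex, and compares them with p₀(x) = x^T∇f(x) and pⱼ(x) = f_{xⱼ} − x^T∇f(x):

```python
import numpy as np

# Assumed test objective on the simplex {x1 + x2 = 1, x >= 0}.
def grad_f(x):
    return np.array([2.0 * x[0], 4.0 * x[1]])

def multipliers_from_kkt(x):
    # Solve grad f(x) = lam0 * e + sum_j lam_j * e_j together with the
    # complementarity lam_j * x_j = 0, directly from the critical-pair system.
    g = grad_f(x)
    if np.all(x > 0):                     # interior point: all lam_j = 0
        return g[0], np.zeros(2)          # then g = lam0 * e
    j = int(np.argmax(x))                 # vertex e_j: x_j > 0 forces lam_j = 0
    lam0 = g[j]
    lam = g - lam0
    lam[j] = 0.0
    return lam0, lam

# an interior critical point and a vertex (both critical pairs of this f)
for x in [np.array([2/3, 1/3]), np.array([1.0, 0.0])]:
    lam0, lam = multipliers_from_kkt(x)
    p0 = x @ grad_f(x)                    # polynomial expression for lam0
    p = grad_f(x) - p0                    # polynomial expressions for lam_j
    assert abs(lam0 - p0) < 1e-12 and np.allclose(lam, p)
print("expressions (3.3) verified at both critical points")
```

Note that at the vertex (1, 0) the expression gives λ₂ = −2 < 0, so the point is a critical pair but fails the sign condition (1.12); this is exactly the kind of point the constraint ψ(x) ≥ 0 excludes.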
For more general constraints, we can also express λ as a polynomial function in x on the set K_c. This will be discussed in §4 and §5.
For the polynomials pᵢ as in Assumption 3.1, denote

(3.6)  φ = ( ∇f − ∑_{i∈E∪I} pᵢ∇cᵢ,  (pⱼcⱼ)_{j∈I} ),  ψ = ( pⱼ )_{j∈I}.
When the minimum value f_min of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem

(3.7)  f_c = min f(x)  s.t.  c_eq(x) = 0, c_in(x) ≥ 0, φ(x) = 0, ψ(x) ≥ 0.
We apply Lasserre relaxations to solve it. For an integer k > 0 (called the relaxation order), the kth order Lasserre relaxation for (3.7) is

(3.8)  f′_k = min ⟨f, y⟩
        s.t. ⟨1, y⟩ = 1, M_k(y) ⪰ 0,
             L^{(k)}_{c_eq}(y) = 0, L^{(k)}_{c_in}(y) ⪰ 0,
             L^{(k)}_φ(y) = 0, L^{(k)}_ψ(y) ⪰ 0, y ∈ ℝ^{ℕⁿ_{2k}}.
Since x⁰ = 1 (the constant one polynomial), the condition ⟨1, y⟩ = 1 means that (y)₀ = 1. The dual optimization problem of (3.8) is

(3.9)  f_k = max γ  s.t.  f − γ ∈ IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k}.

We refer to §2 for the notation used in (3.8)-(3.9). These problems are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For k = 1, 2, …, we get a hierarchy of Lasserre relaxations. If the usage of φ and ψ is removed from (3.8)-(3.9), they reduce to the standard Lasserre relaxations in [17]. So (3.8)-(3.9) are stronger relaxations.
By the construction of φ as in (3.6), Assumption 3.1 implies that

K_c = { u ∈ ℝⁿ : c_eq(u) = 0, φ(u) = 0 }.

By Lemma 3.3 of [6], f achieves only finitely many values on K_c, say

(3.10)  v₁ < ⋯ < v_N.
A point u ∈ K_c might not be feasible for (3.7), i.e., it is possible that c_in(u) ≱ 0 or ψ(u) ≱ 0. In applications, we are often interested in the optimal value f_c of (3.7). When (3.7) is infeasible, by convention, we set

f_c = +∞.
When the optimal value f_min of (1.1) is achieved at a critical point, f_c = f_min. This is the case if the feasible set is compact, or if f is coercive (i.e., for each ℓ, the sublevel set {f(x) ≤ ℓ} is compact) and the constraint qualification condition holds. As in [17], one can show that

(3.11)  f_k ≤ f′_k ≤ f_c

for all k. Moreover, f_k and f′_k are both monotonically increasing. If for some order k it occurs that

f_k = f′_k = f_c,

then the kth order Lasserre relaxation is said to be tight (or exact).
3.1. Tightness of the relaxations. Let c_in, ψ, K_c, f_c be as above. We refer to §2 for the notation Qmod(c_in, ψ). We begin with a general assumption.
Assumption 3.2. There exists ρ ∈ Qmod(c_in, ψ) such that if u ∈ K_c and f(u) < f_c, then ρ(u) < 0.
In Assumption 3.2, the hypersurface ρ(x) = 0 separates the feasible and infeasible critical points. Clearly, if u ∈ K_c is a feasible point of (3.7), then c_in(u) ≥ 0 and ψ(u) ≥ 0, and hence ρ(u) ≥ 0. Assumption 3.2 generally holds. For instance, it is satisfied in the following general cases:
a) When there are no inequality constraints, c_in and ψ are empty tuples. Then Qmod(c_in, ψ) = Σ[x], and Assumption 3.2 is satisfied with ρ = 0.
b) Suppose the set K_c is finite, say K_c = {u₁, …, u_D}, and

f(u₁), …, f(u_{t−1}) < f_c ≤ f(u_t), …, f(u_D).

Let ℓ₁, …, ℓ_D be real interpolating polynomials such that ℓᵢ(uⱼ) = 1 for i = j and ℓᵢ(uⱼ) = 0 for i ≠ j. For each i = 1, …, t − 1, there must exist jᵢ ∈ I such that c_{jᵢ}(uᵢ) < 0. Then the polynomial

(3.12)  ρ = ∑_{i<t} ( −1 / c_{jᵢ}(uᵢ) ) c_{jᵢ}(x) ℓᵢ(x)² + ∑_{i≥t} ℓᵢ(x)²

satisfies Assumption 3.2.
satisfies Assumption 32c) For each x with f(x) = vi lt fc at least one of the constraints cj(x) ge
0 pj(x) ge 0(j isin I) is violated Suppose for each critical value vi lt fcthere exists gi isin cj pjjisinI such that
gi lt 0 on Kc cap f(x) = viLet ϕ1 ϕN be real univariate polynomials such that ϕi(vj) = 0 for i 6= jand ϕi(vj) = 1 for i = j Suppose vt = fc Then the polynomial
(313) ρ =sum
iltt
gi(x)(ϕi(f(x))
)2+sum
iget
(ϕi(f(x))
)2
satisfies Assumption 32
We refer to §2 for the archimedean condition and the notation IQ(h, g) as in (2.1). The following theorem concerns the convergence of the relaxations (3.8)-(3.9).
Theorem 3.3. Suppose K_c ≠ ∅ and Assumption 3.1 holds. If

i) IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean, or
ii) IQ(c_eq, c_in) is archimedean, or
iii) Assumption 3.2 holds,

then f_k = f′_k = f_c for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_k = f′_k = f_min for all k big enough, provided one of the conditions i)-iii) is satisfied.
Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. The condition ii) is only about the constraining polynomials of (1.1); it can be checked without φ, ψ. Clearly, the condition ii) implies the condition i).
The proof of Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties, and the objective f is a constant on each of them. We can get an SOS type representation for f on each subvariety, and then construct a single one for f over the entire set of critical points.
Proof of Theorem 3.3. Clearly, every point in the complex variety

K₁ = { x ∈ ℂⁿ : c_eq(x) = 0, φ(x) = 0 }

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on K_c = K₁ ∩ ℝⁿ, say v₁ < ⋯ < v_N. Up to shifting f by a constant, we can further assume that f_c = 0. Clearly, f_c equals one of v₁, …, v_N, say v_t = f_c = 0.
Case I: Assume IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean. Let

I = Ideal(c_eq, φ),

the critical ideal. Note that K₁ = V_ℂ(I). The variety V_ℂ(I) is a union of irreducible subvarieties, say V₁, …, V_ℓ. If Vᵢ ∩ ℝⁿ ≠ ∅, then f is a real constant on Vᵢ, which equals one of v₁, …, v_N. This is implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_ℂ(I)

Tᵢ = K₁ ∩ { f(x) = vᵢ }  (i = t, …, N).

Let T_{t−1} be the union of the irreducible subvarieties Vᵢ such that either Vᵢ ∩ ℝⁿ = ∅ or f ≡ vⱼ on Vᵢ with vⱼ < v_t = f_c. Then it holds that

V_ℂ(I) = T_{t−1} ∪ T_t ∪ ⋯ ∪ T_N.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, I_t, …, I_N ⊆ ℝ[x] such that

I = I_{t−1} ∩ I_t ∩ ⋯ ∩ I_N

and Tᵢ = V_ℂ(Iᵢ) for all i = t − 1, t, …, N. Denote the semialgebraic set

(3.14)  S = { x ∈ ℝⁿ : c_in(x) ≥ 0, ψ(x) ≥ 0 }.
For i = t − 1, we have V_ℝ(I_{t−1}) ∩ S = ∅, because v₁, …, v_{t−1} < f_c. By the Positivstellensatz [2, Corollary 4.4.3], there exists p₀ ∈ Preord(c_in, ψ)¹ satisfying 2 + p₀ ∈ I_{t−1}. Note that 1 + p₀ > 0 on V_ℝ(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(c_in, ψ) is archimedean, because I ⊆ I_{t−1} and

IQ(c_eq, c_in) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(c_in, ψ).

By Theorem 2.1, we have

p₁ := 1 + p₀ ∈ I_{t−1} + Qmod(c_in, ψ).

Then 1 + p₁ ∈ I_{t−1}, so there exists p₂ ∈ Qmod(c_in, ψ) such that

−1 ≡ p₁ ≡ p₂  mod I_{t−1}.

Since f = (f/4 + 1)² − 1 · (f/4 − 1)², we have

f ≡ σ_{t−1} := (f/4 + 1)² + p₂ (f/4 − 1)²  mod I_{t−1}.

So, when k is big enough, we have σ_{t−1} ∈ Qmod(c_in, ψ)_{2k}.

¹Preord(c_in, ψ) is the preordering of the polynomial tuple (c_in, ψ); see §7.1.
For i = t, we have v_t = 0 and f(x) vanishes on V_ℂ(I_t). By Hilbert's Strong Nullstellensatz [4], there exists an integer m_t > 0 such that f^{m_t} ∈ I_t. Define the polynomial

s_t(ε) := √ε ∑_{j=0}^{m_t−1} (1/2 choose j) ε^{−j} f^j.

Then we have that

D₁ := ε(1 + ε^{−1}f) − ( s_t(ε) )² ≡ 0  mod I_t.

This is because, after expanding (s_t(ε))² in the subtraction D₁, all the terms f^j with j < m_t are cancelled, while f^j ∈ I_t for j ≥ m_t. So D₁ ∈ I_t. Let σ_t(ε) := s_t(ε)²; then f + ε − σ_t(ε) = D₁ and

(3.15)  f + ε − σ_t(ε) = ∑_{j=0}^{m_t−2} b_j(ε) f^{m_t+j}

for some real scalars b_j(ε) depending on ε.
For each i = t + 1, …, N, we have vᵢ > 0, and f(x)/vᵢ − 1 vanishes on V_ℂ(Iᵢ). By Hilbert's Strong Nullstellensatz [4], there exists an integer mᵢ > 0 such that (f/vᵢ − 1)^{mᵢ} ∈ Iᵢ. Let

sᵢ := √vᵢ ∑_{j=0}^{mᵢ−1} (1/2 choose j) (f/vᵢ − 1)^j.

As in the case i = t, we can similarly show that f − sᵢ² ∈ Iᵢ. Let σᵢ := sᵢ²; then f − σᵢ ∈ Iᵢ.
Note that V_ℂ(Iᵢ) ∩ V_ℂ(Iⱼ) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, …, a_N ∈ ℝ[x] such that

a_{t−1}² + ⋯ + a_N² − 1 ∈ I,  aᵢ ∈ ⋂_{j ∈ {t−1,…,N}, j≠i} Iⱼ.
For ε > 0, denote the polynomial

σ_ε := σ_t(ε) a_t² + ∑_{j ∈ {t−1,…,N}, j≠t} (σⱼ + ε) aⱼ²;

then

f + ε − σ_ε = (f + ε)(1 − a_{t−1}² − ⋯ − a_N²) + ∑_{i ∈ {t−1,…,N}, i≠t} (f − σᵢ) aᵢ² + (f + ε − σ_t(ε)) a_t².
For each i ≠ t, f − σᵢ ∈ Iᵢ, so

(f − σᵢ) aᵢ² ∈ ⋂_{j=t−1}^{N} Iⱼ = I.

Hence there exists k₁ > 0 such that

(f − σᵢ) aᵢ² ∈ I_{2k₁}  (t ≠ i ∈ {t−1, …, N}).
Since f + ε − σ_t(ε) ∈ I_t, we also have

(f + ε − σ_t(ε)) a_t² ∈ ⋂_{j=t−1}^{N} Iⱼ = I.
Moreover, by the equation (3.15),

(f + ε − σ_t(ε)) a_t² = ∑_{j=0}^{m_t−2} b_j(ε) f^{m_t+j} a_t².

Each f^{m_t+j} a_t² ∈ I, since f^{m_t+j} ∈ I_t. So there exists k₂ > 0 such that, for all ε > 0,

(f + ε − σ_t(ε)) a_t² ∈ I_{2k₂}.
Since 1 − a_{t−1}² − ⋯ − a_N² ∈ I, there also exists k₃ > 0 such that, for all ε > 0,

(f + ε)(1 − a_{t−1}² − ⋯ − a_N²) ∈ I_{2k₃}.

Hence, if k* ≥ max{k₁, k₂, k₃}, then we have

f(x) + ε − σ_ε ∈ I_{2k*}

for all ε > 0. By the construction, the degrees of all σᵢ and aᵢ are independent of ε. So σ_ε ∈ Qmod(c_in, ψ)_{2k*} for all ε > 0, if k* is big enough. Note that

I_{2k*} + Qmod(c_in, ψ)_{2k*} = IQ(c_eq, c_in)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ f_c − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ f_c, so f_{k*} = f_c. Moreover, since f_k is monotonically increasing, we must have f_k = f_c for all k ≥ k*.
Case II: Assume IQ(c_eq, c_in) is archimedean. Because

IQ(c_eq, c_in) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ),

the set IQ(c_eq, c_in) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion follows by applying the result of Case I.
Case III: Suppose Assumption 3.2 holds. Let ϕ₁, …, ϕ_N be real univariate polynomials such that ϕᵢ(vⱼ) = 0 for i ≠ j and ϕᵢ(vⱼ) = 1 for i = j. Let

s := s_t + ⋯ + s_N,  where each sᵢ := (vᵢ − f_c)( ϕᵢ(f) )².

Then s ∈ Σ[x]_{2k₄} for some integer k₄ > 0. Let

f̃ := f − f_c − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(c_in, ψ) such that

f̃^{2ℓ} + q ∈ Ideal(c_eq, φ).

This is because, by Assumption 3.2, f̃(x) ≡ 0 on the set

K₂ = { x ∈ ℝⁿ : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 },

which has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ ℕ and q = b₀ + ρ b₁ (b₀, b₁ ∈ Σ[x]) such that f̃^{2ℓ} + q ∈ Ideal(c_eq, φ). By Assumption 3.2, ρ ∈ Qmod(c_in, ψ), so we have q ∈ Qmod(c_in, ψ).
For all ε > 0 and τ > 0, we have f̃ + ε = φ_ε + θ_ε, where

φ_ε := −τ ε^{1−2ℓ} ( f̃^{2ℓ} + q ),  θ_ε := ε( 1 + f̃/ε + τ (f̃/ε)^{2ℓ} ) + τ ε^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k₅ such that, for all ε > 0,

φ_ε ∈ Ideal(c_eq, φ)_{2k₅},  θ_ε ∈ Qmod(c_in, ψ)_{2k₅}.

Hence we can get

f − (f_c − ε) = φ_ε + σ_ε,

where σ_ε := θ_ε + s ∈ Qmod(c_in, ψ)_{2k₅} for all ε > 0. Note that

IQ(c_eq, c_in)_{2k₅} + IQ(φ, ψ)_{2k₅} = Ideal(c_eq, φ)_{2k₅} + Qmod(c_in, ψ)_{2k₅}.
For all ε > 0, γ = f_c − ε is feasible in (3.9) for the order k₅, so f_{k₅} ≥ f_c. Because of (3.11) and the monotonicity of f_k, we have f_k = f′_k = f_c for all k ≥ k₅. □
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is f_c, and the optimal value of (1.1) is f_min. If f_min is achievable at a critical point, then f_c = f_min. In Theorem 3.3, we have shown that f_k = f_c for all k big enough, where f_k is the optimal value of (3.9). The value f_c or f_min is often not known. How do we detect the tightness f_k = f_c in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16)  d = ⌈deg(c_eq, c_in, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)  rank M_t(y*) = rank M_{t−d}(y*),

then f_k = f_c, and we can get r = rank M_t(y*) minimizers for (3.7) [5, 14, 32].
The method in [14] can be used to extract minimizers; it is implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
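For n = 1, the moment matrix M_t(y) is a Hankel matrix, so the rank test (3.17) can be sketched very simply (the two-point measure and the value d = 1 below are assumptions chosen for illustration):

```python
import numpy as np

# Toy illustration of flat truncation (3.17): y is the moment vector, up to
# degree 4 (so k = 2), of the measure (delta_0 + delta_1)/2 on the real line.

def hankel_moment_matrix(moms, t):
    # For n = 1, M_t(y) is the Hankel matrix [y_{i+j}], 0 <= i, j <= t.
    return np.array([[moms[i + j] for j in range(t + 1)] for i in range(t + 1)])

pts = [0.0, 1.0]
moms = [sum(p ** j for p in pts) / len(pts) for j in range(5)]  # y_0, ..., y_4

d = 1   # assumed value of (3.16), as for linear constraints
t = 2
r1 = np.linalg.matrix_rank(hankel_moment_matrix(moms, t))
r2 = np.linalg.matrix_rank(hankel_moment_matrix(moms, t - d))
print(r1, r2, r1 == r2)   # 2 2 True: flat, so y comes from r = 2 atoms
```

In practice the ranks are computed with a numerical tolerance, since the minimizer y* of (3.8) is returned by an SDP solver only up to floating-point accuracy.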
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., f_c < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = f_c.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17) for some t ∈ [d, k], when k is big enough.
Proof. By Assumption 3.1, u is a critical point if and only if c_eq(u) = 0 and φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.
ii) By Assumption 3.2, when (3.7) is infeasible, the set

{ x ∈ ℝⁿ : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(c_eq, φ) + Qmod(ρ). By Assumption 3.2,

Ideal(c_eq, φ) + Qmod(ρ) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence (3.8) must be infeasible, by weak duality.
When (3.7) is feasible, f achieves finitely many values on K_c, so (3.7) must achieve its optimal value f_c. By Theorem 3.3, we know that f_k = f′_k = f_c for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).
iii) If (3.17) holds, we can get r = rank M_t(y*) minimizers for (3.7) [5, 14], say u₁, …, u_r, such that f_k = f(uᵢ) for each i. Clearly, f_k = f(uᵢ) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.
iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)  min f(x)  s.t.  c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its kth order Lasserre relaxation is

(3.19)  γ′_k = min ⟨f, y⟩  s.t.  ⟨1, y⟩ = 1, M_k(y) ⪰ 0, L^{(k)}_{c_eq}(y) = 0, L^{(k)}_φ(y) = 0, L^{(k)}_ρ(y) ⪰ 0.

Its dual optimization problem is

(3.20)  γ_k = max γ  s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

γ_k = γ′_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big enough, each minimizer y* of (3.8) is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. □
If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6] for such cases.
4. Polyhedral constraints
In this section, we assume the feasible set of (1.1) is the polyhedron

P = { x ∈ ℝⁿ : Ax − b ≥ 0 },

where A = [a₁ ⋯ a_m]^T ∈ ℝ^{m×n} and b = [b₁ ⋯ b_m]^T ∈ ℝ^m. This corresponds to E = ∅, I = [m], and each cᵢ(x) = aᵢ^T x − bᵢ. Denote

(4.1)  D(x) = diag( c₁(x), …, c_m(x) ),  C(x) = [ A^T ; D(x) ],

where [A^T; D(x)] denotes the (n+m)×m matrix obtained by stacking A^T on top of D(x).
The Lagrange multiplier vector λ = [λ₁ ⋯ λ_m]^T satisfies

(4.2)  [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].
If rankA = m we can express λ as
(43) λ = (AAT )minus1Anablaf(x)If rankA lt m how can we express λ in terms of x In computation we oftenprefer a polynomial expression If there exists L(x) isin R[x]mtimes(n+m) such that
(4.4) $\quad L(x) C(x) = I_m,$

then we can get
$$\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),$$
14 JIAWANG NIE
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. In this section, we characterize when such an $L(x)$ exists, and we give a degree bound for it.
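Formula (4.3) is just the solution of the first block of (4.2) when $A$ has full row rank, and it can be checked numerically. A minimal sketch (the matrix $A$ and the gradient below are made-up illustrations; the gradient is fabricated to lie in the range of $A^T$, as happens at a critical point):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5
A = rng.standard_normal((m, n))   # rank A = m with probability one
c = rng.standard_normal(m)

# at a critical point, grad f(x) lies in the range of A^T;
# here we fabricate such a gradient with known multipliers c
grad_f = A.T @ c

lam = np.linalg.solve(A @ A.T, A @ grad_f)   # formula (4.3)
assert np.allclose(lam, c)                   # recovers the multipliers
assert np.allclose(A.T @ lam, grad_f)        # stationarity: A^T lam = grad f
```
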
The linear function $Ax - b$ is said to be nonsingular if $\operatorname{rank} C(u) = m$ for all $u \in \mathbb{C}^n$ (also see Definition 5.1). This is equivalent to the condition that, for every $u$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the vectors $a_{i_1}, \ldots, a_{i_k}$ are linearly independent.
Proposition 4.1. The linear function $Ax - b$ is nonsingular if and only if there exists a matrix polynomial $L(x)$ satisfying (4.4). Moreover, when $Ax - b$ is nonsingular, we can choose $L(x)$ in (4.4) with $\deg(L) \le m - \operatorname{rank} A$.

Proof. Clearly, if (4.4) is satisfied by some $L(x)$, then $\operatorname{rank} C(u) \ge m$ for all $u$. This implies that $Ax - b$ is nonsingular.
Next, assume that $Ax - b$ is nonsingular. We show that (4.4) is satisfied by some $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ with degree $\le m - \operatorname{rank} A$. Let $r = \operatorname{rank} A$. Up to a linear coordinate transformation, we can reduce $x$ to an $r$-dimensional variable. Without loss of generality, we can assume that $\operatorname{rank} A = n$ and $m \ge n$.

For a subset $I = \{i_1, \ldots, i_{m-n}\}$ of $[m]$, denote
$$c_I(x) = \prod_{i \in I} c_i(x), \qquad E_I(x) = c_I(x) \cdot \operatorname{diag}\big(c_{i_1}(x)^{-1}, \ldots, c_{i_{m-n}}(x)^{-1}\big),$$
$$D_I(x) = \operatorname{diag}\big(c_{i_1}(x), \ldots, c_{i_{m-n}}(x)\big), \qquad A_I = [a_{i_1} \ \cdots \ a_{i_{m-n}}]^T.$$
For the case that $I = \emptyset$ (the empty set), we set $c_\emptyset(x) = 1$. Let
$$\mathcal{V} = \big\{ I \subseteq [m] : |I| = m - n, \ \operatorname{rank} A_{[m] \setminus I} = n \big\}.$$

Step I: For each $I \in \mathcal{V}$, we construct a matrix polynomial $L_I(x)$ such that

(4.5) $\quad L_I(x) C(x) = c_I(x) I_m.$

The matrix $L_I = L_I(x)$ satisfying (4.5) can be given by the following $2 \times 3$ block matrix ($L_I(J, K)$ denotes the submatrix whose row indices are from $J$ and whose column indices are from $K$):

(4.6)
$$\begin{array}{c|ccc}
J \backslash K & [n] & n + I & n + ([m] \setminus I) \\ \hline
I & 0 & E_I(x) & 0 \\
[m] \setminus I & c_I(x) \cdot (A_{[m]\setminus I})^{-T} & -(A_{[m]\setminus I})^{-T} (A_I)^T E_I(x) & 0
\end{array}$$

Equivalently, the blocks of $L_I$ are
$$L_I\big(I, [n]\big) = 0, \quad L_I\big(I, n + ([m]\setminus I)\big) = 0, \quad L_I\big([m]\setminus I, n + ([m]\setminus I)\big) = 0,$$
$$L_I\big(I, n + I\big) = E_I(x), \quad L_I\big([m]\setminus I, [n]\big) = c_I(x) \big(A_{[m]\setminus I}\big)^{-T},$$
$$L_I\big([m]\setminus I, n + I\big) = -\big(A_{[m]\setminus I}\big)^{-T} (A_I)^T E_I(x).$$
For each $I \in \mathcal{V}$, $A_{[m]\setminus I}$ is invertible. The superscript $-T$ denotes the inverse of the transpose. Let $G = L_I(x) C(x)$; then one can verify that
$$G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m-n}, \qquad G(I, [m]\setminus I) = 0,$$
$$G([m]\setminus I, [m]\setminus I) = \begin{bmatrix} c_I(x)\big(A_{[m]\setminus I}\big)^{-T} & -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \end{bmatrix} \begin{bmatrix} \big(A_{[m]\setminus I}\big)^T \\ 0 \end{bmatrix} = c_I(x) I_n,$$
$$G([m]\setminus I, I) = \begin{bmatrix} c_I(x)\big(A_{[m]\setminus I}\big)^{-T} & -\big(A_{[m]\setminus I}\big)^{-T} (A_I)^T E_I(x) \end{bmatrix} \begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix} = 0.$$
This shows that the above $L_I(x)$ satisfies (4.5).
Step II: We show that there exist real scalars $\nu_I$ satisfying

(4.7) $\quad \sum_{I \in \mathcal{V}} \nu_I c_I(x) = 1.$

This can be shown by induction on $m$.

• When $m = n$, $\mathcal{V} = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.
• When $m > n$, let

(4.8) $\quad N = \big\{ i \in [m] \mid \operatorname{rank} A_{[m]\setminus\{i\}} = n \big\}.$

For each $i \in N$, let $\mathcal{V}_i$ be the set of all $I' \subseteq [m]\setminus\{i\}$ such that $|I'| = m - n - 1$ and $\operatorname{rank} A_{[m]\setminus(I' \cup \{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m]\setminus\{i\}} x - b_{[m]\setminus\{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying

(4.9) $\quad \sum_{I' \in \mathcal{V}_i} \nu^{(i)}_{I'} c_{I'}(x) = 1.$

Since $\operatorname{rank} A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \ldots, \alpha_n$ such that
$$a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.$$
If all $\alpha_i = 0$, then $a_m = 0$, and hence $A$ can be replaced by its first $m-1$ rows. So (4.7) is true by the induction. In the following, suppose at least one $\alpha_i \ne 0$, and write
$$\{ i : \alpha_i \ne 0 \} = \{ i_1, \ldots, i_k \}.$$
Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system
$$c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0$$
has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that
$$\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.$$
This can be seen from the echelon form of an inconsistent linear system. Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),
$$\sum_{I' \in \mathcal{V}_{i_j}} \nu^{(i_j)}_{I'} c_{I'}(x) = 1.$$
Then we can get
$$1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) \sum_{I' \in \mathcal{V}_{i_j}} \nu^{(i_j)}_{I'} c_{I'}(x) = \sum_{\substack{I = I' \cup \{i_j\} \\ I' \in \mathcal{V}_{i_j}, \ 1 \le j \le k+1}} \nu^{(i_j)}_{I'} \mu_j c_I(x).$$
Since each $I' \cup \{i_j\} \in \mathcal{V}$, (4.7) must be satisfied by some scalars $\nu_I$.
Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as

(4.10) $\quad L(x) = \sum_{I \in \mathcal{V}} \nu_I L_I(x).$

Clearly, $L(x)$ satisfies (4.4), because
$$L(x) C(x) = \sum_{I \in \mathcal{V}} \nu_I L_I(x) C(x) = \sum_{I \in \mathcal{V}} \nu_I c_I(x) I_m = I_m.$$
Each $L_I(x)$ has degree $\le m - n$, so $L(x)$ has degree $\le m - n$. $\square$
Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does exist, a degree bound for $L(x)$ is $m - \operatorname{rank} A$. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)$ with $L(x) C(x) = I_m$ for polyhedral sets.
Example 4.2. Consider the simplicial set
$$\{ x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - e^T x \ge 0 \}.$$
The equation $L(x) C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix}
1 - x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
-x_1 & 1 - x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
-x_1 & -x_2 & \cdots & 1 - x_n & 1 & \cdots & 1 \\
-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1
\end{bmatrix}.$$
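This identity can be confirmed numerically at a random point; a short sketch (the helper name is ours):

```python
import numpy as np

def simplex_LC(x):
    """Build C(x) of (4.1) and L(x) of Example 4.2 for the simplex."""
    n = len(x)
    # constraints: c_i = x_i for i <= n, and c_{n+1} = 1 - sum(x)
    A = np.vstack([np.eye(n), -np.ones((1, n))])       # rows are a_i
    cvals = np.append(x, 1.0 - x.sum())
    C = np.vstack([A.T, np.diag(cvals)])               # (n+m) x m, m = n+1
    L = np.hstack([np.tile(-x, (n + 1, 1)), np.ones((n + 1, n + 1))])
    L[:n, :n] += np.eye(n)                             # row i gets 1 - x_i on the diagonal
    return L, C

x = np.random.default_rng(1).random(4)
L, C = simplex_LC(x)
assert np.allclose(L @ C, np.eye(5))
```
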
Example 4.3. Consider the box constraint
$$\{ x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - x_1 \ge 0, \ \ldots, \ 1 - x_n \ge 0 \}.$$
The equation $L(x) C(x) = I_{2n}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - \operatorname{diag}(x) & I_n & I_n \\ -\operatorname{diag}(x) & I_n & I_n \end{bmatrix}.$$
Example 4.4. Consider the polyhedral set
$$\{ 1 - x_4 \ge 0, \ x_4 - x_3 \ge 0, \ x_3 - x_2 \ge 0, \ x_2 - x_1 \ge 0, \ x_1 + 1 \ge 0 \}.$$
The equation $L(x) C(x) = I_5$ is satisfied by
$$L(x) = \frac{1}{2} \begin{bmatrix}
-x_1 - 1 & -x_2 - 1 & -x_3 - 1 & -x_4 - 1 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & -x_2 - 1 & -x_3 - 1 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & -x_2 - 1 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
1 - x_1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}.$$
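The identity $L(x) C(x) = I_5$ for this example can also be checked numerically (the helper below is ours; its entries are transcribed from the display above):

```python
import numpy as np

def chain_LC(x):
    """C(x) of (4.1) and L(x) of Example 4.4 for the chain of constraints."""
    x1, x2, x3, x4 = x
    A = np.array([[0, 0, 0, -1], [0, 0, -1, 1], [0, -1, 1, 0],
                  [-1, 1, 0, 0], [1, 0, 0, 0]], dtype=float)
    b = np.array([-1.0, 0.0, 0.0, 0.0, -1.0])
    cvals = A @ x - b      # (1-x4, x4-x3, x3-x2, x2-x1, x1+1)
    C = np.vstack([A.T, np.diag(cvals)])
    L = 0.5 * np.array([
        [-x1-1, -x2-1, -x3-1, -x4-1, 1, 1, 1, 1, 1],
        [-x1-1, -x2-1, -x3-1, 1-x4,  1, 1, 1, 1, 1],
        [-x1-1, -x2-1, 1-x3,  1-x4,  1, 1, 1, 1, 1],
        [-x1-1, 1-x2,  1-x3,  1-x4,  1, 1, 1, 1, 1],
        [1-x1,  1-x2,  1-x3,  1-x4,  1, 1, 1, 1, 1]])
    return L, C

x = np.random.default_rng(2).standard_normal(4)
L, C = chain_LC(x)
assert np.allclose(L @ C, np.eye(5))
```
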
Example 4.5. Consider the polyhedral set
$$\{ 1 + x_1 \ge 0, \ 1 - x_1 \ge 0, \ 2 - x_1 - x_2 \ge 0, \ 2 - x_1 + x_2 \ge 0 \}.$$
The matrix $L(x)$ satisfying $L(x) C(x) = I_4$ is
$$\frac{1}{6} \begin{bmatrix}
x_1^2 - 3x_1 + 2 & x_1 x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\
3x_1^2 - 3x_1 - 6 & 3x_2 + 3x_1 x_2 & 6 - 3x_1 & -3x_1 & -3x_1 - 3 & -3x_1 - 3 \\
1 - x_1^2 & -x_1 x_2 - 2x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\
1 - x_1^2 & 3 - x_1 x_2 - 2x_2 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2
\end{bmatrix}.$$
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.

Suppose there are $m$ equality and inequality constraints in total, i.e.,
$$E \cup I = \{1, \ldots, m\}.$$
If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in E \cup I$. So the Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies the equation

(5.1) $\quad \underbrace{\begin{bmatrix}
\nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\
c_1(x) & 0 & \cdots & 0 \\
0 & c_2(x) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & c_m(x)
\end{bmatrix}}_{C(x)} \ \lambda \ = \ \begin{bmatrix} \nabla f(x) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$

Let $C(x)$ be as above. If there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ such that

(5.2) $\quad L(x) C(x) = I_m,$

then we can get

(5.3) $\quad \lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),$

where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such an $L(x)$ exists.
Definition 5.1. The tuple $c = (c_1, \ldots, c_m)$ of constraining polynomials is said to be nonsingular if $\operatorname{rank} C(u) = m$ for every $u \in \mathbb{C}^n$.

Clearly, $c$ being nonsingular is equivalent to the condition that, for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \ldots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.
Proposition 5.2. (i) For $W(x) \in \mathbb{C}[x]^{s \times t}$ with $s \ge t$, we have $\operatorname{rank} W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t \times s}$ such that
$$P(x) W(x) = I_t.$$
Moreover, for $W(x) \in \mathbb{R}[x]^{s \times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t \times s}$ in the above.
(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ satisfying (5.2).
Proof. (i) "$\Leftarrow$": If $P(x) W(x) = I_t$, then for all $u \in \mathbb{C}^n$,
$$t = \operatorname{rank} I_t \le \operatorname{rank} W(u) \le t.$$
So $W(x)$ must have full column rank everywhere.

"$\Rightarrow$": Suppose $\operatorname{rank} W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:
$$W(x) = \big[\, w_1(x) \ \ w_2(x) \ \ \cdots \ \ w_t(x) \,\big].$$
Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote
$$r_{1i}(x) = \xi_1(x)^T w_i(x);$$
then (we use $\sim$ to denote row equivalence between matrices)
$$W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix} \sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},$$
where each ($i = 2, \ldots, t$)
$$w_i^{(1)}(x) = w_i(x) - r_{1i}(x) w_1(x).$$
So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1) \times s}$ such that
$$P_1(x) W(x) = W_1(x).$$
Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
$$w_2^{(1)}(x) = 0$$
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
$$\xi_2(x)^T w_2^{(1)}(x) = 1.$$
For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then
$$W_1(x) \sim \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x)
\end{bmatrix} \sim W_2(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x)
\end{bmatrix},$$
where each ($i = 3, \ldots, t$)
$$w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x) w_2^{(1)}(x).$$
Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2) \times (s+1)}$ such that
$$P_2(x) W_1(x) = W_2(x).$$
Continuing this process, we can finally get
$$W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & 1 & \cdots & r_{3t}(x) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}.$$
Consequently, there exist $P_i(x) \in \mathbb{C}[x]^{(s+i) \times (s+i-1)}$, for $i = 1, 2, \ldots, t$, such that
$$P_t(x) P_{t-1}(x) \cdots P_1(x) W(x) = W_t(x).$$
Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t \times (s+t)}$ such that $P_{t+1}(x) W_t(x) = I_t$. Let
$$P(x) = P_{t+1}(x) P_t(x) P_{t-1}(x) \cdots P_1(x);$$
then $P(x) W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t \times s}$.

For $W(x) \in \mathbb{R}[x]^{s \times t}$, we can replace $P(x)$ by $\big(P(x) + \overline{P(x)}\big)/2$ (here $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). $\square$
In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, its coefficients can be determined by comparing coefficients on both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints
$$1 - x_1^2 \ge 0, \quad 1 - x_2^2 \ge 0, \quad \ldots, \quad 1 - x_n^2 \ge 0.$$
The equation $L(x) C(x) = I_n$ is satisfied by
$$L(x) = \big[ -\tfrac{1}{2} \operatorname{diag}(x) \quad I_n \big].$$
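For these quadratic constraints, the matrix $C(x)$ of (5.1) is easy to form, and the identity $L(x)C(x) = I_n$ can be confirmed numerically; a short sketch (the helper name is ours):

```python
import numpy as np

def hypercube_LC(x):
    """C(x) as in (5.1) for c_i = 1 - x_i^2, and L(x) of Example 5.3."""
    n = len(x)
    grads = -2.0 * np.diag(x)                    # column i is grad c_i
    C = np.vstack([grads, np.diag(1.0 - x**2)])  # stack gradients over diag(c)
    L = np.hstack([-0.5 * np.diag(x), np.eye(n)])
    return L, C

x = np.random.default_rng(3).standard_normal(5)
L, C = hypercube_LC(x)
assert np.allclose(L @ C, np.eye(5))
```
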
Example 5.4. Consider the nonnegative portion of the unit sphere:
$$x_1 \ge 0, \ x_2 \ge 0, \ \ldots, \ x_n \ge 0, \quad x_1^2 + \cdots + x_n^2 - 1 = 0.$$
The equation $L(x) C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - x x^T & x \mathbf{1}_n^T & 2x \\ \tfrac{1}{2} x^T & -\tfrac{1}{2} \mathbf{1}_n^T & -1 \end{bmatrix}.$$
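A numerical check of this identity (a sketch; variable names are ours):

```python
import numpy as np

n = 4
x = np.random.default_rng(4).standard_normal(n)

# C(x) as in (5.1): c_i = x_i for i <= n, and c_{n+1} = x^T x - 1
grads = np.hstack([np.eye(n), 2.0 * x[:, None]])   # gradients as columns
cvals = np.append(x, x @ x - 1.0)
C = np.vstack([grads, np.diag(cvals)])             # (2n+1) x (n+1)

# L(x) of Example 5.4
top = np.hstack([np.eye(n) - np.outer(x, x), np.outer(x, np.ones(n)), 2.0 * x[:, None]])
bottom = np.hstack([0.5 * x, -0.5 * np.ones(n), [-1.0]])
L = np.vstack([top, bottom])
assert np.allclose(L @ C, np.eye(n + 1))
```
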
Example 5.5. Consider the set
$$1 - x_1^3 - x_2^4 \ge 0, \qquad 1 - x_3^4 - x_4^3 \ge 0.$$
The equation $L(x) C(x) = I_2$ is satisfied by
$$L(x) = \begin{bmatrix} -\frac{x_1}{3} & -\frac{x_2}{4} & 0 & 0 & 1 & 0 \\ 0 & 0 & -\frac{x_3}{4} & -\frac{x_4}{3} & 0 & 1 \end{bmatrix}.$$
Example 5.6. Consider the quadratic set
$$1 - x_1 x_2 - x_2 x_3 - x_1 x_3 \ge 0, \qquad 1 - x_1^2 - x_2^2 - x_3^2 \ge 0.$$
The matrix $L(x)^T$ satisfying $L(x) C(x) = I_2$ is
$$\begin{bmatrix}
25x_1^3 + 10x_1^2 x_2 + 40x_1 x_2^2 - 25x_1 - 2x_3 & -25x_1^3 - 10x_1^2 x_2 - 40x_1 x_2^2 + 49x_1^2 + 2x_3 \\
-15x_1^2 x_2 + 10x_1 x_2^2 + 20x_3 x_1 x_2 - 10x_1 & 15x_1^2 x_2 - 10x_1 x_2^2 - 20x_3 x_1 x_2 + 10x_1 - x_2^2 \\
25x_3 x_1^2 - 20x_1 x_2^2 + 10x_3 x_1 x_2 + 2x_1 & -25x_3 x_1^2 + 20x_1 x_2^2 - 10x_3 x_1 x_2 - 2x_1 - x_3^2 \\
1 - 20x_1 x_3 - 10x_1^2 - 20x_1 x_2 & 20x_1 x_2 + 20x_1 x_3 + 10x_1^2 \\
-50x_1^2 - 20x_2 x_1 & 50x_1^2 + 20x_2 x_1 + 1
\end{bmatrix}.$$
We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and hence Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $\mathcal{U}$ of $\mathcal{D} = \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1, \ldots, c_m) \in \mathcal{U}$ is nonsingular. Indeed, such a $\mathcal{U}$ can be chosen as a Zariski open subset of $\mathcal{D}$, i.e., it is the complement of a proper real variety of $\mathcal{D}$. Moreover, Assumption 3.1 holds for all $c \in \mathcal{U}$, i.e., it holds generically.
Proof. The proof needs resultants and discriminants, for which we refer to [29]. First, let $J_1$ be the set of all tuples $(i_1, \ldots, i_{n+1})$ with $1 \le i_1 < \cdots < i_{n+1} \le m$. The resultant $\operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that, if $\operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \ne 0$, then the equations
$$c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0$$
have no complex solutions. Define
$$F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}).$$
For the case that $m \le n$, $J_1 = \emptyset$, and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \ne 0$, then no more than $n$ of the polynomials $c_1, \ldots, c_m$ have a common complex zero.

Second, let $J_2$ be the set of all tuples $(j_1, \ldots, j_k)$ with $k \le n$ and $1 \le j_1 < \cdots < j_k \le m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that, if $\Delta(c_{j_1}, \ldots, c_{j_k}) \ne 0$, then the equations
$$c_{j_1}(x) = \cdots = c_{j_k}(x) = 0$$
have no singular complex solution [29, Section 3], i.e., at every complex common solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximum minors of its Jacobian (a constant matrix). Define
$$F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}).$$
Clearly, if $F_2(c) \ne 0$, then no $n$ or fewer of the polynomials $c_1, \ldots, c_m$ have a singular complex common zero.

Last, let $F(c) = F_1(c) F_2(c)$ and
$$\mathcal{U} = \{ c = (c_1, \ldots, c_m) \in \mathcal{D} : F(c) \ne 0 \}.$$
Note that $\mathcal{U}$ is a Zariski open subset of $\mathcal{D}$, and it is open dense in $\mathcal{D}$. For all $c \in \mathcal{U}$, no more than $n$ of $c_1, \ldots, c_m$ can have a complex common zero; and for any $k$ polynomials ($k \le n$) of $c_1, \ldots, c_m$, if they have a complex common zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in \mathcal{U}$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in \mathcal{U}$, so it holds generically. $\square$
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9), with Lagrange multiplier expressions, for solving the optimization problem (1.1). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.
The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x) \nabla f(x)$, i.e.,
$$p_i = \big( L_1(x) \nabla f(x) \big)_i.$$
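For instance, with the constraints $1 - x_i^2 \ge 0$ of Example 5.3 and the made-up objective $f(x) = -\sum_i x_i^4$ (whose box-constrained minimizers are the corner points), this recipe gives $p_i = -\tfrac{1}{2} x_i \, \partial f / \partial x_i = 2x_i^4$, which reproduces the true Lagrange multipliers $\lambda_i = 2$ at the minimizer $u = (1, \ldots, 1)$:

```python
import numpy as np

n = 3
u = np.ones(n)                 # a minimizer of f(x) = -sum x_i^4 over the box
grad_f = -4.0 * u**3

L1 = -0.5 * np.diag(u)         # first n columns of L(x) from Example 5.3
p = L1 @ grad_f                # p_i = (L1(x) grad f(x))_i
assert np.allclose(p, 2.0)

# sanity check: stationarity grad f = sum_i lambda_i grad c_i with lambda = p
grads_c = -2.0 * np.diag(u)    # column i is the gradient of c_i = 1 - x_i^2
assert np.allclose(grad_f, grads_c @ p)
```
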
In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sublevel set $\{ f(x) \le \ell \}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough, if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, and the computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem
$$\min \ x_1 x_2 (10 - x_3) \quad \text{s.t.} \quad x_1 \ge 0, \ x_2 \ge 0, \ x_3 \ge 0, \ 1 - x_1 - x_2 - x_3 \ge 0.$$
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1 x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

  order k |  w.o. LME: lower bound, time  |  with LME: lower bound, time
     2    |  -0.0521,  0.6841             |  -0.0521,  0.1922
     3    |  -0.0026,  0.2657             |  -3·10⁻⁸,  0.2285
     4    |  -0.0007,  0.6785             |  -6·10⁻⁹,  0.4431
     5    |  -0.0004,  1.6105             |  -2·10⁻⁹,  0.9567

² Note that $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_{i=1}^{n} x_i (1 - x_i)^2 + \sum_{i \ne j} x_i^2 x_j \in \operatorname{IQ}(c_{eq}, c_{in})$.
Example 6.2. Consider the optimization problem
$$\min \ x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4) \quad \text{s.t.} \quad x_1^2 + x_2^2 + x_3^2 \ge 1.$$
The matrix polynomial $L(x) = \big[ \tfrac{1}{2} x_1 \ \ \tfrac{1}{2} x_2 \ \ \tfrac{1}{2} x_3 \ \ -1 \big]$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
$$M(x) = x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2.$$
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive, and $f_{min}$ is achieved at a critical point. The set $\operatorname{IQ}(\phi, \psi)$ is archimedean, because
$$c_1(x) p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) \big) = 0$$
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{min} = \tfrac{1}{3}$, and there are 8 minimizers $(\pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}})$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

  order k |  w.o. LME: lower bound, time  |  with LME: lower bound, time
     3    |  -∞,           0.4466         |  0.1111,  0.1169
     4    |  -∞,           0.4948         |  0.3333,  0.3499
     5    |  -2.1821·10⁵,  1.1836         |  0.3333,  0.6530
Example 6.3. Consider the optimization problem
$$\min \ x_1 x_2 + x_2 x_3 + x_3 x_4 - 3x_1 x_2 x_3 x_4 + (x_1^3 + \cdots + x_4^3)$$
$$\text{s.t.} \quad x_1, \ x_2, \ x_3, \ x_4 \ge 0, \quad 1 - x_1 - x_2 \ge 0, \quad 1 - x_3 - x_4 \ge 0.$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix}
1 - x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1 - x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 - x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1 - x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.$$
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $\operatorname{IQ}(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because $-c_1^2 p_1^2 \in \operatorname{Ideal}(\phi) \subseteq \operatorname{IQ}(\phi, \psi)$ and the set $\{-c_1(x)^2 p_1(x)^2 \ge 0\}$ is compact.
⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

  order k |  w.o. LME: lower bound, time  |  with LME: lower bound, time
     3    |  -2.9·10⁻⁵,   0.7335          |  -6·10⁻⁷,   0.6091
     4    |  -1.4·10⁻⁵,   2.5055          |  -8·10⁻⁸,   2.7423
     5    |  -1.4·10⁻⁵,  12.7092          |  -5·10⁻⁸,  13.7449
Example 6.4. Consider the polynomial optimization problem
$$\min_{x \in \mathbb{R}^2} \ x_1^2 + 50 x_2^2$$
$$\text{s.t.} \quad x_1^2 - \tfrac{1}{2} \ge 0, \quad x_2^2 - 2x_1 x_2 - \tfrac{1}{8} \ge 0, \quad x_2^2 + 2x_1 x_2 - \tfrac{1}{8} \ge 0.$$
It is motivated by an example in [12, §3]. The first column of $L(x)$ is
$$\begin{bmatrix}
\frac{8x_1^3}{5} + \frac{x_1}{5} \\[4pt]
\frac{288 x_2 x_1^4}{5} - \frac{16 x_1^3}{5} - \frac{124 x_2 x_1^2}{5} + \frac{8 x_1}{5} - 2x_2 \\[4pt]
-\frac{288 x_2 x_1^4}{5} - \frac{16 x_1^3}{5} + \frac{124 x_2 x_1^2}{5} + \frac{8 x_1}{5} + 2x_2
\end{bmatrix},$$
and the second column of $L(x)$ is
$$\begin{bmatrix}
-\frac{8 x_1^2 x_2}{5} + \frac{4 x_2^3}{5} - \frac{x_2}{10} \\[4pt]
\frac{288 x_1^3 x_2^2}{5} + \frac{16 x_1^2 x_2}{5} - \frac{142 x_1 x_2^2}{5} - \frac{9 x_1}{20} - \frac{8 x_2^3}{5} + \frac{11 x_2}{5} \\[4pt]
-\frac{288 x_1^3 x_2^2}{5} + \frac{16 x_1^2 x_2}{5} + \frac{142 x_1 x_2^2}{5} + \frac{9 x_1}{20} - \frac{8 x_2^3}{5} + \frac{11 x_2}{5}
\end{bmatrix}.$$
The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value
$$f_{min} = 56 + \tfrac{3}{4} + 25\sqrt{5} \approx 112.6517,$$
and the minimizers are $\big( \pm\sqrt{1/2}, \ \pm(\sqrt{5/8} + \sqrt{1/2}) \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

  order k |  w.o. LME: lower bound, time  |  with LME: lower bound, time
     3    |   6.7535,  0.4611             |   56.7500,  0.1309
     4    |   6.9294,  0.2428             |  112.6517,  0.2405
     5    |   8.8519,  0.3376             |  112.6517,  0.2167
     6    |  16.5971,  0.4703             |  112.6517,  0.3788
     7    |  35.4756,  0.6536             |  112.6517,  0.4537
Example 6.5. Consider the optimization problem
$$\min_{x \in \mathbb{R}^3} \ x_1^3 + x_2^3 + x_3^3 + 4x_1 x_2 x_3 - \big( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) \big)$$
$$\text{s.t.} \quad x_1 \ge 0, \quad x_1 x_2 - 1 \ge 0, \quad x_2 x_3 - 1 \ge 0.$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix}
1 - x_1 x_2 & 0 & 0 & x_2 & x_2 & 0 \\
x_1 & 0 & 0 & -1 & -1 & 0 \\
-x_1 & x_2 & 0 & 1 & 0 & -1
\end{bmatrix}.$$
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

  order k |  w.o. LME: lower bound, time  |  with LME: lower bound, time
     2    |  -∞,           0.4129         |  -∞,      0.1900
     3    |  -7.8184·10⁶,  0.4641         |  0.9492,  0.3139
     4    |  -2.0575·10⁴,  0.6499         |  0.9492,  0.5057
Example 6.6. Consider the optimization problem ($x_0 = 1$)
$$\min_{x \in \mathbb{R}^4} \ x^T x + \sum_{i=0}^{4} \prod_{j \ne i} (x_i - x_j)$$
$$\text{s.t.} \quad x_1^2 - 1 \ge 0, \quad x_2^2 - 1 \ge 0, \quad x_3^2 - 1 \ge 0, \quad x_4^2 - 1 \ge 0.$$
The matrix polynomial $L(x) = \big[ \tfrac{1}{2} \operatorname{diag}(x) \ \ -I_4 \big]$. The first part of the objective is $x^T x$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and 11 global minimizers:
$$(1,1,1,1), \ (1,-1,-1,1), \ (1,-1,1,-1), \ (1,1,-1,-1),$$
$$(1,-1,-1,-1), \ (-1,-1,1,1), \ (-1,1,-1,1), \ (-1,1,1,-1),$$
$$(-1,-1,-1,1), \ (-1,-1,1,-1), \ (-1,1,-1,-1).$$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

  order k |  w.o. LME: lower bound, time  |  with LME: lower bound, time
     3    |  -∞,            1.1377        |  3.5480,   1.1765
     4    |  -6.6913·10⁴,   4.7677        |  4.0000,   3.0761
     5    |  -21.3778,     22.9970        |  4.0000,  10.3354
Example 6.7. Consider the optimization problem
$$\min_{x \in \mathbb{R}^3} \ x_1^4 x_2^2 + x_2^4 x_3^2 + x_3^4 x_1^2 - 3x_1^2 x_2^2 x_3^2 + x_2^2$$
$$\text{s.t.} \quad x_1 - x_2 x_3 \ge 0, \quad -x_2 + x_3^2 \ge 0.$$
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1 x_3 = 0$. The computational results for the standard Lasserre
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \ge 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

  order k |  w.o. LME: lower bound, time  |  with LME: lower bound, time
     3    |  -∞,            0.6144        |  -∞,        0.3418
     4    |  -1.0909·10⁷,   1.0542        |  -3.9476,   0.7180
     5    |  -942.6772,     1.6771        |  -3·10⁻⁹,   1.4607
     6    |  -0.0110,       3.3532        |  -8·10⁻¹⁰,  3.1618
Example 6.8. Consider the optimization problem
$$\min_{x \in \mathbb{R}^4} \ x_1^2 (x_1 - x_4)^2 + x_2^2 (x_2 - x_4)^2 + x_3^2 (x_3 - x_4)^2$$
$$\qquad\qquad + \ 2 x_1 x_2 x_3 (x_1 + x_2 + x_3 - 2x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2$$
$$\text{s.t.} \quad x_1 - x_2 \ge 0, \quad x_2 - x_3 \ge 0.$$
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$ and a minimizer
$$(0.5632, \ 0.5632, \ 0.5632, \ 0.7510).$$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

  order k |  w.o. LME: lower bound, time  |  with LME: lower bound, time
     2    |  -∞,            0.3984        |  -0.3360,  0.9321
     3    |  -∞,            0.7634        |   0.9413,  0.5240
     4    |  -6.4896·10⁵,   4.5496        |   0.9413,  1.7192
     5    |  -3.1645·10³,  24.3665        |   0.9413,  8.1228
Example 6.9. Consider the optimization problem
$$\min_{x \in \mathbb{R}^4} \ (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4(x_1 x_2 + x_2 x_3 + x_3 x_4 + x_4 + x_1)$$
$$\text{s.t.} \quad 0 \le x_1, \ldots, x_4 \le 1.$$
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value $f_{min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1-t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that $4 - \sum_{i=1}^{4} x_i^2 = \sum_{i=1}^{4} \big( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in \operatorname{IQ}(c_{eq}, c_{in})$.
Table 9. Computational results for Example 6.9

  order k |  w.o. LME: lower bound, time  |  with LME: lower bound, time
     2    |  -0.0279,   0.2262            |  -5·10⁻⁶,   1.1835
     3    |  -0.0005,   0.4691            |  -6·10⁻⁷,   1.6566
     4    |  -0.0001,   3.1098            |  -2·10⁻⁷,   5.5234
     5    |  -4·10⁻⁵,  16.5092            |  -6·10⁻⁷,  19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may already be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem ($x_0 = 1$)
$$\min_{x \in \mathbb{R}^n} \ \sum_{0 \le i \le j \le k \le n} c_{ijk} \, x_i x_j x_k \quad \text{s.t.} \quad 0 \le x \le 1,$$
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{min}$. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their consumed time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

      n    |    9    |   10   |   11   |   12    |   13    |   14
  w.o. LME | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111
  with LME | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
$$\operatorname{IQ}(c_{eq}, c_{in})_{2k} + \operatorname{IQ}(\phi, \psi)_{2k} = \operatorname{Ideal}(c_{eq}, \phi)_{2k} + \operatorname{Qmod}(c_{in}, \psi)_{2k}.$$
If we replace the quadratic module $\operatorname{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
$$\operatorname{Preord}(c_{in}, \psi) = \sum_{r_1, \ldots, r_\ell \in \{0,1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x].$$
The truncation $\operatorname{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\operatorname{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is

(7.1) $\quad f'_{pre,k} = \min \ \langle f, y \rangle \quad \text{s.t.} \quad \langle 1, y \rangle = 1, \ L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\phi}(y) = 0,$
$\qquad\qquad L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall \, r_1, \ldots, r_\ell \in \{0,1\}, \quad y \in \mathbb{R}^{\mathbb{N}^n_{2k}}.$

Similar to (3.9), the dual optimization problem of the above is

(7.2) $\quad f_{pre,k} = \max \ \gamma \quad \text{s.t.} \quad f - \gamma \in \operatorname{Ideal}(c_{eq}, \phi)_{2k} + \operatorname{Preord}(c_{in}, \psi)_{2k}.$

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
$$f_{pre,k} = f'_{pre,k} = f_c$$
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{min}$ for all $k$ big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
$$K_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ c_{in}(x) \ge 0, \ \psi(x) \ge 0 \}.$$
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \operatorname{Preord}(c_{in}, \psi)$ such that $f^{2\ell} + q \in \operatorname{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same. $\square$
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x) C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such an $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for $L(x)$. For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. For each usage, if the degree bound in [16] is applied, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
$$\lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}$$
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References

[1] D. Bertsekas. Nonlinear Programming, second edition, Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry, Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition, Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra, Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 29
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications, Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization, Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (M. Putinar and S. Sullivant, eds.), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99, AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations, CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA

E-mail address: njw@math.ucsd.edu
• We show that the polynomials p_i satisfying (1.10) always exist when the tuple of constraining polynomials is nonsingular (see Definition 5.1). Moreover, they can be determined by linear equations.
• Using the new conditions (1.11)-(1.12), we can construct tight relaxations for solving (1.1). To be more precise, we construct a hierarchy of new relaxations, which has finite convergence. This is true even if the feasible set is noncompact and/or the optimality conditions fail to hold.
• For every relaxation order, the new relaxations are tighter than the standard ones in the prior work.

The paper is organized as follows. Section 2 reviews some basics in polynomial optimization. Section 3 constructs new relaxations and proves their tightness. Section 4 characterizes when the polynomials p_i satisfying (1.10) exist, and shows how to determine them for polyhedral constraints. Section 5 discusses the case of general nonlinear constraints. Section 6 gives examples of using the new relaxations. Section 7 discusses some related issues.
2. Preliminaries

Notation. The symbol N (resp., R, C) denotes the set of nonnegative integral (resp., real, complex) numbers. The symbol R[x] = R[x_1, ..., x_n] denotes the ring of polynomials in x = (x_1, ..., x_n) with real coefficients. The R[x]_d stands for the set of real polynomials with degrees ≤ d. Denote

  N^n_d := { α = (α_1, ..., α_n) ∈ N^n | |α| = α_1 + ··· + α_n ≤ d }.

For a polynomial p, deg(p) denotes its total degree. For t ∈ R, ⌈t⌉ denotes the smallest integer ≥ t. For an integer k > 0, denote [k] := {1, 2, ..., k}. For x = (x_1, ..., x_n) and α = (α_1, ..., α_n), denote

  x^α := x_1^{α_1} ··· x_n^{α_n},   [x]_d := [ 1, x_1, ..., x_n, x_1^2, x_1x_2, ..., x_n^d ]^T.

The superscript T denotes the transpose of a matrix/vector. The e_i denotes the i-th standard unit vector, while e denotes the vector of all ones. The I_m denotes the m-by-m identity matrix. By writing X ⪰ 0 (resp., X ≻ 0), we mean that X is a symmetric positive semidefinite (resp., positive definite) matrix. For matrices X_1, ..., X_r, diag(X_1, ..., X_r) denotes the block diagonal matrix whose diagonal blocks are X_1, ..., X_r. In particular, for a vector a, diag(a) denotes the diagonal matrix whose diagonal vector is a. For a function f in x, f_{x_i} denotes its partial derivative with respect to x_i.
We review some basics in computational algebra and polynomial optimization. They can be found in [4, 21, 22, 25, 26]. An ideal I of R[x] is a subset such that I·R[x] ⊆ I and I + I ⊆ I. For a tuple h = (h_1, ..., h_m) of polynomials, Ideal(h) denotes the smallest ideal containing all h_i, which is the set

  h_1·R[x] + ··· + h_m·R[x].

The 2k-th truncation of Ideal(h) is the set

  Ideal(h)_{2k} := h_1·R[x]_{2k−deg(h_1)} + ··· + h_m·R[x]_{2k−deg(h_m)}.

The truncation Ideal(h)_{2k} depends on the generators h_1, ..., h_m. For an ideal I, its complex and real varieties are respectively defined as

  V_C(I) := { v ∈ C^n | p(v) = 0 ∀ p ∈ I },   V_R(I) := V_C(I) ∩ R^n.
A polynomial σ is said to be a sum of squares (SOS) if σ = s_1^2 + ··· + s_k^2 for some polynomials s_1, ..., s_k ∈ R[x]. The set of all SOS polynomials in x is denoted as Σ[x]. For a degree d, denote the truncation

  Σ[x]_d := Σ[x] ∩ R[x]_d.

For a tuple g = (g_1, ..., g_t), its quadratic module is the set

  Qmod(g) := Σ[x] + g_1·Σ[x] + ··· + g_t·Σ[x].

The 2k-th truncation of Qmod(g) is the set

  Qmod(g)_{2k} := Σ[x]_{2k} + g_1·Σ[x]_{2k−deg(g_1)} + ··· + g_t·Σ[x]_{2k−deg(g_t)}.

The truncation Qmod(g)_{2k} depends on the generators g_1, ..., g_t. Denote

(2.1)  IQ(h, g) := Ideal(h) + Qmod(g),   IQ(h, g)_{2k} := Ideal(h)_{2k} + Qmod(g)_{2k}.

The set IQ(h, g) is said to be archimedean if there exists p ∈ IQ(h, g) such that p(x) ≥ 0 defines a compact set in R^n. If IQ(h, g) is archimedean, then

  K := { x ∈ R^n | h(x) = 0, g(x) ≥ 0 }

must be a compact set. Conversely, if K is compact, say K ⊆ B(0, R) (the ball centered at 0 with radius R), then IQ(h, (g, R^2 − x^Tx)) is always archimedean, and h = 0, (g, R^2 − x^Tx) ≥ 0 give the same set K.
Theorem 2.1 (Putinar [36]). Let h, g be tuples of polynomials in R[x]. Let K be as above. Assume IQ(h, g) is archimedean. If a polynomial f ∈ R[x] is positive on K, then f ∈ IQ(h, g).

Interestingly, if f is only nonnegative on K but standard optimality conditions hold (see Subsection 1.1), then we still have f ∈ IQ(h, g) [33].
Let R^{N^n_d} be the space of real multi-sequences indexed by α ∈ N^n_d. A vector in R^{N^n_d} is called a truncated multi-sequence (tms) of degree d. A tms y = (y_α)_{α∈N^n_d} gives the Riesz functional R_y acting on R[x]_d as

(2.2)  R_y( ∑_{α∈N^n_d} f_α x^α ) := ∑_{α∈N^n_d} f_α y_α.

For f ∈ R[x]_d and y ∈ R^{N^n_d}, we denote

(2.3)  ⟨f, y⟩ := R_y(f).
Let q ∈ R[x]_{2k}. The k-th localizing matrix of q, generated by y ∈ R^{N^n_{2k}}, is the symmetric matrix L_q^{(k)}(y) such that

(2.4)  vec(a_1)^T ( L_q^{(k)}(y) ) vec(a_2) = R_y(q a_1 a_2)

for all a_1, a_2 ∈ R[x]_{k−⌈deg(q)/2⌉}. (The vec(a_i) denotes the coefficient vector of a_i.) When q = 1, L_q^{(k)}(y) is called a moment matrix, and we denote

(2.5)  M_k(y) := L_1^{(k)}(y).

The columns and rows of L_q^{(k)}(y), as well as M_k(y), are indexed by α ∈ N^n with 2|α| + deg(q) ≤ 2k. When q = (q_1, ..., q_r) is a tuple of polynomials, we define

(2.6)  L_q^{(k)}(y) := diag( L_{q_1}^{(k)}(y), ..., L_{q_r}^{(k)}(y) ),
a block diagonal matrix. For the polynomial tuples h, g as above, the set

(2.7)  S(h, g)_{2k} := { y ∈ R^{N^n_{2k}} | L_h^{(k)}(y) = 0, L_g^{(k)}(y) ⪰ 0 }

is a spectrahedral cone in R^{N^n_{2k}}. The set IQ(h, g)_{2k} is also a convex cone in R[x]_{2k}. The dual cone of IQ(h, g)_{2k} is precisely S(h, g)_{2k} [22, 25, 34]. This is because ⟨p, y⟩ ≥ 0 for all p ∈ IQ(h, g)_{2k} and for all y ∈ S(h, g)_{2k}.
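To make the objects above concrete, here is a minimal numeric sketch (not part of the paper) of a moment matrix and a localizing matrix in the univariate case n = 1, where N^1_{2k} is just {0, 1, ..., 2k}; the tms below is the moment sequence of a (hypothetical) Dirac measure at u = 2.

```python
import numpy as np

k = 2
u = 2.0
y = np.array([u**a for a in range(2 * k + 1)])   # y_a = integral of x^a w.r.t. delta_u

# Moment matrix M_k(y), per (2.5): entry (i, j) is y_{i+j}
M = np.array([[y[i + j] for j in range(k + 1)] for i in range(k + 1)])

# Localizing matrix of q(x) = x - 1 (coefficients listed low degree first), per (2.4):
# entry (i, j) is R_y(q * x^i * x^j), with i, j up to k - ceil(deg(q)/2)
q = [-1.0, 1.0]
m = k - (len(q) - 1 + 1) // 2 + 1
Lq = np.array([[sum(c * y[i + j + l] for l, c in enumerate(q))
                for j in range(m)] for i in range(m)])

# Moments of a point mass give a rank-one PSD moment matrix,
# and q(u) = 1 > 0 makes the localizing matrix PSD as well.
assert np.linalg.matrix_rank(M) == 1
assert np.all(np.linalg.eigvalsh(M) >= -1e-9)
assert np.all(np.linalg.eigvalsh(Lq) >= -1e-9)
```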
3. The construction of tight relaxations

Consider the polynomial optimization problem (1.1). Let

  λ := (λ_i)_{i∈E∪I}

be the vector of Lagrange multipliers. Denote the set

(3.1)  K := { (x, λ) ∈ R^n × R^{E∪I} | c_i(x) = 0 (i ∈ E), λ_j c_j(x) = 0 (j ∈ I), ∇f(x) = ∑_{i∈E∪I} λ_i ∇c_i(x) }.

Each point in K is called a critical pair. The projection

(3.2)  K_c := { u | (u, λ) ∈ K }

is the set of all real critical points. To construct tight relaxations for solving (1.1), we need the following assumption for Lagrange multipliers.

Assumption 3.1. For each i ∈ E ∪ I, there exists a polynomial p_i ∈ R[x] such that for all (x, λ) ∈ K it holds that

  λ_i = p_i(x).
Assumption 3.1 is generically satisfied, as shown in Proposition 5.7. For the following special cases, we can get the polynomials p_i explicitly.

• (Simplex) For the simplex {e^Tx − 1 = 0, x_1 ≥ 0, ..., x_n ≥ 0}, it corresponds to E = {0}, I = [n], c_0(x) = e^Tx − 1, c_j(x) = x_j (j ∈ [n]). The Lagrange multipliers can be expressed as

(3.3)  λ_0 = x^T∇f(x),   λ_j = f_{x_j} − x^T∇f(x)  (j ∈ [n]).

• (Hypercube) For the hypercube [−1, 1]^n, it corresponds to E = ∅, I = [n], and each c_j(x) = 1 − x_j^2. We can show that

(3.4)  λ_j = −(1/2) x_j f_{x_j}  (j ∈ [n]).

• (Ball or sphere) The constraint is 1 − x^Tx = 0 or 1 − x^Tx ≥ 0. It corresponds to E ∪ I = {1} and c_1 = 1 − x^Tx. We have

(3.5)  λ_1 = −(1/2) x^T∇f(x).

• (Triangular constraints) Suppose E ∪ I = {1, ..., m} and each

  c_i(x) = τ_i x_i + q_i(x_{i+1}, ..., x_n)

for some polynomials q_i ∈ R[x_{i+1}, ..., x_n] and scalars τ_i ≠ 0. The matrix T(x), consisting of the first m rows of [∇c_1(x), ..., ∇c_m(x)], is an invertible lower triangular matrix with constant diagonal entries. Then

  λ = T(x)^{−1} [ f_{x_1}, ..., f_{x_m} ]^T.

Note that the inverse T(x)^{−1} is a matrix polynomial.
For more general constraints, we can also express λ as a polynomial function in x on the set K_c. This will be discussed in §4 and §5.
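The simplex expressions (3.3) can be checked numerically; the critical pair below is hypothetical data built to satisfy the stationarity and complementarity conditions defining K (it need not be a minimizer), and the formulas recover the multipliers from x and ∇f(x) alone.

```python
import numpy as np

# A hypothetical critical pair for the simplex {e^T x = 1, x >= 0} in R^3
x = np.array([0.5, 0.5, 0.0])
lam0 = 1.7                           # multiplier of c0 = e^T x - 1
lam = np.array([0.0, 0.0, -0.3])     # lam_j * x_j = 0 (complementarity)

# Any gradient value consistent with stationarity: grad_f = lam0*e + sum_j lam_j*e_j
grad_f = lam0 * np.ones(3) + lam

# formulas (3.3): lam0 = x^T grad_f, and lam_j = f_{x_j} - x^T grad_f
assert np.isclose(x @ grad_f, lam0)
assert np.allclose(grad_f - x @ grad_f, lam)
```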
For the polynomials p_i as in Assumption 3.1, denote

(3.6)  φ := ( ∇f − ∑_{i∈E∪I} p_i ∇c_i,  (p_j c_j)_{j∈I} ),   ψ := (p_j)_{j∈I}.
When the minimum value f_min of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem

(3.7)  f_c := min f(x)  s.t.  c_eq(x) = 0, c_in(x) ≥ 0, φ(x) = 0, ψ(x) ≥ 0.

We apply Lasserre relaxations to solve it. For an integer k > 0 (called the relaxation order), the k-th order Lasserre relaxation for (3.7) is

(3.8)  f'_k := min ⟨f, y⟩  s.t.  ⟨1, y⟩ = 1, M_k(y) ⪰ 0,
         L_{c_eq}^{(k)}(y) = 0, L_{c_in}^{(k)}(y) ⪰ 0,
         L_φ^{(k)}(y) = 0, L_ψ^{(k)}(y) ⪰ 0, y ∈ R^{N^n_{2k}}.

Since x^0 = 1 (the constant one polynomial), the condition ⟨1, y⟩ = 1 means that (y)_0 = 1. The dual optimization problem of (3.8) is

(3.9)  f_k := max γ  s.t.  f − γ ∈ IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k}.

We refer to §2 for the notation used in (3.8)-(3.9). They are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For k = 1, 2, ..., we get a hierarchy of Lasserre relaxations. In (3.8)-(3.9), if we remove the usage of φ and ψ, they are reduced to the standard Lasserre relaxations in [17]. So (3.8)-(3.9) are stronger relaxations.
By the construction of φ as in (3.6), Assumption 3.1 implies that

  K_c = { u ∈ R^n : c_eq(u) = 0, φ(u) = 0 }.

By Lemma 3.3 of [6], f achieves only finitely many values on K_c, say

(3.10)  v_1 < ··· < v_N.

A point u ∈ K_c might not be feasible for (3.7), i.e., it is possible that c_in(u) ≱ 0 or ψ(u) ≱ 0. In applications, we are often interested in the optimal value f_c of (3.7). When (3.7) is infeasible, by convention, we set

  f_c = +∞.

When the optimal value f_min of (1.1) is achieved at a critical point, f_c = f_min. This is the case if the feasible set is compact, or if f is coercive (i.e., for each ℓ, the sublevel set {f(x) ≤ ℓ} is compact), and the constraint qualification condition holds. As in [17], one can show that

(3.11)  f_k ≤ f'_k ≤ f_c

for all k. Moreover, f_k and f'_k are both monotonically increasing. If for some order k it occurs that

  f_k = f'_k = f_c,

then the k-th order Lasserre relaxation is said to be tight (or exact).
3.1. Tightness of the relaxations. Let c_in, ψ, K_c, f_c be as above. We refer to §2 for the notation Qmod(c_in, ψ). We begin with a general assumption.

Assumption 3.2. There exists ρ ∈ Qmod(c_in, ψ) such that if u ∈ K_c and f(u) < f_c, then ρ(u) < 0.

In Assumption 3.2, the hypersurface ρ(x) = 0 separates feasible and infeasible critical points. Clearly, if u ∈ K_c is a feasible point for (3.7), then c_in(u) ≥ 0 and ψ(u) ≥ 0, and hence ρ(u) ≥ 0. Assumption 3.2 generally holds. For instance, it is satisfied in the following general cases:

a) When there are no inequality constraints, c_in and ψ are empty tuples. Then Qmod(c_in, ψ) = Σ[x], and Assumption 3.2 is satisfied for ρ = 0.

b) Suppose the set K_c is finite, say K_c = {u_1, ..., u_D}, and

  f(u_1), ..., f(u_{t−1}) < f_c ≤ f(u_t), ..., f(u_D).

Let ℓ_1, ..., ℓ_D be real interpolating polynomials such that ℓ_i(u_j) = 1 for i = j and ℓ_i(u_j) = 0 for i ≠ j. For each i = 1, ..., t−1, there must exist j_i ∈ I such that c_{j_i}(u_i) < 0. Then the polynomial

(3.12)  ρ := ∑_{i<t} ( −1/c_{j_i}(u_i) ) c_{j_i}(x) ( ℓ_i(x) )^2 + ∑_{i≥t} ( ℓ_i(x) )^2

satisfies Assumption 3.2.

c) For each x with f(x) = v_i < f_c, at least one of the constraints c_j(x) ≥ 0, p_j(x) ≥ 0 (j ∈ I) is violated. Suppose for each critical value v_i < f_c there exists g_i ∈ {c_j, p_j}_{j∈I} such that

  g_i < 0 on K_c ∩ {f(x) = v_i}.

Let ϕ_1, ..., ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Suppose v_t = f_c. Then the polynomial

(3.13)  ρ := ∑_{i<t} g_i(x) ( ϕ_i(f(x)) )^2 + ∑_{i≥t} ( ϕ_i(f(x)) )^2

satisfies Assumption 3.2.
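The interpolating polynomials ℓ_i used in (3.12) can be built by standard Lagrange interpolation; below is a small sketch (hypothetical critical points on the real line) verifying their defining property.

```python
import numpy as np

# Hypothetical finite set K_c = {u1, u2, u3} on the real line
U = np.array([0.0, 1.0, 2.0])

def ell(i, x):
    """Lagrange basis polynomial: ell_i(u_j) = 1 if i == j, else 0."""
    num = np.prod([x - U[j] for j in range(len(U)) if j != i])
    den = np.prod([U[i] - U[j] for j in range(len(U)) if j != i])
    return num / den

# verify the interpolation property used in constructing (3.12)
for i in range(3):
    for j in range(3):
        assert np.isclose(ell(i, U[j]), 1.0 if i == j else 0.0)
```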
We refer to §2 for the archimedean condition and the notation IQ(h, g) as in (2.1). The following is about the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose K_c ≠ ∅ and Assumption 3.1 holds. If

i) IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean, or
ii) IQ(c_eq, c_in) is archimedean, or
iii) Assumption 3.2 holds,

then f_k = f'_k = f_c for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_k = f'_k = f_min for all k big enough, if one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. The condition ii) is only about the constraining polynomials of (1.1). It can be checked without φ, ψ. Clearly, the condition ii) implies the condition i).
The proof for Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties. The objective f is a constant on each of them. We can get an SOS type representation
for f on each subvariety, and then construct a single one for f over the entire set of critical points.
Proof of Theorem 3.3. Clearly, every point in the complex variety

  K_1 := { x ∈ C^n | c_eq(x) = 0, φ(x) = 0 }

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on K_c = K_1 ∩ R^n, say they are v_1 < ··· < v_N. Up to shifting f by a constant, we can further assume that f_c = 0. Clearly, f_c equals one of v_1, ..., v_N, say v_t = f_c = 0.
Case I: Assume IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean. Let

  I := Ideal(c_eq, φ),

the critical ideal. Note that K_1 = V_C(I). The variety V_C(I) is a union of irreducible subvarieties, say V_1, ..., V_ℓ. If V_i ∩ R^n ≠ ∅, then f is a real constant on V_i, which equals one of v_1, ..., v_N. This can be implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_C(I):

  T_i := K_1 ∩ { f(x) = v_i }  (i = t, ..., N).

Let T_{t−1} be the union of the irreducible subvarieties V_i such that either V_i ∩ R^n = ∅, or f ≡ v_j on V_i with v_j < v_t = f_c. Then it holds that

  V_C(I) = T_{t−1} ∪ T_t ∪ ··· ∪ T_N.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, I_t, ..., I_N ⊆ R[x] such that

  I = I_{t−1} ∩ I_t ∩ ··· ∩ I_N

and T_i = V_C(I_i) for all i = t−1, t, ..., N. Denote the semialgebraic set

(3.14)  S := { x ∈ R^n | c_in(x) ≥ 0, ψ(x) ≥ 0 }.

For i = t−1, we have V_R(I_{t−1}) ∩ S = ∅, because v_1, ..., v_{t−1} < f_c. By the Positivstellensatz [2, Corollary 4.4.3], there exists p_0 ∈ Preord(c_in, ψ)¹ satisfying 2 + p_0 ∈ I_{t−1}. Note that 1 + p_0 > 0 on V_R(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(c_in, ψ) is archimedean, because I ⊆ I_{t−1} and

  IQ(c_eq, c_in) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(c_in, ψ).

By Theorem 2.1, we have

  p_1 := 1 + p_0 ∈ I_{t−1} + Qmod(c_in, ψ).

Then 1 + p_1 ∈ I_{t−1}. There exists p_2 ∈ Qmod(c_in, ψ) such that

  −1 ≡ p_1 ≡ p_2  mod I_{t−1}.

Since f = (f/4 + 1)^2 − 1·(f/4 − 1)^2, we have

  f ≡ σ_{t−1} := (f/4 + 1)^2 + p_2 (f/4 − 1)^2  mod I_{t−1}.

So, when k is big enough, we have σ_{t−1} ∈ Qmod(c_in, ψ)_{2k}.
¹It is the preordering of the polynomial tuple (c_in, ψ); see §7.1.
For i = t, v_t = 0 and f(x) vanishes on V_C(I_t). By Hilbert's Strong Nullstellensatz [4], there exists an integer m_t > 0 such that f^{m_t} ∈ I_t. Define the polynomial

  s_t(ε) := √ε ∑_{j=0}^{m_t−1} \binom{1/2}{j} ε^{−j} f^j.

Then we have that

  D_1 := ε(1 + ε^{−1}f) − ( s_t(ε) )^2 ≡ 0  mod I_t.

This is because, in the subtraction D_1, after expanding ( s_t(ε) )^2, all the terms f^j with j < m_t are cancelled, and f^j ∈ I_t for j ≥ m_t. So D_1 ∈ I_t. Let σ_t(ε) := ( s_t(ε) )^2; then f + ε − σ_t(ε) = D_1, and

(3.15)  f + ε − σ_t(ε) = ∑_{j=0}^{m_t−2} b_j(ε) f^{m_t+j}

for some real scalars b_j(ε) depending on ε.
For each i = t+1, ..., N, v_i > 0 and f(x)/v_i − 1 vanishes on V_C(I_i). By Hilbert's Strong Nullstellensatz [4], there exists 0 < m_i ∈ N such that (f/v_i − 1)^{m_i} ∈ I_i. Let

  s_i := √v_i ∑_{j=0}^{m_i−1} \binom{1/2}{j} (f/v_i − 1)^j.

As for the case i = t, we can similarly show that f − s_i^2 ∈ I_i. Let σ_i := s_i^2; then f − σ_i ∈ I_i.
Note that V_C(I_i) ∩ V_C(I_j) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, ..., a_N ∈ R[x] such that

  a_{t−1}^2 + ··· + a_N^2 − 1 ∈ I,   a_i ∈ ⋂_{j≠i, j∈{t−1,...,N}} I_j.

For ε > 0, denote the polynomial

  σ_ε := σ_t(ε) a_t^2 + ∑_{t≠j∈{t−1,...,N}} (σ_j + ε) a_j^2;

then

  f + ε − σ_ε = (f + ε)(1 − a_{t−1}^2 − ··· − a_N^2) + ∑_{t≠i∈{t−1,...,N}} (f − σ_i) a_i^2 + (f + ε − σ_t(ε)) a_t^2.

For each i ≠ t, f − σ_i ∈ I_i, so

  (f − σ_i) a_i^2 ∈ ⋂_{j=t−1}^{N} I_j = I.

Hence, there exists k_1 > 0 such that

  (f − σ_i) a_i^2 ∈ I_{2k_1}  (t ≠ i ∈ {t−1, ..., N}).

Since f + ε − σ_t(ε) ∈ I_t, we also have

  (f + ε − σ_t(ε)) a_t^2 ∈ ⋂_{j=t−1}^{N} I_j = I.

Moreover, by the equation (3.15),

  (f + ε − σ_t(ε)) a_t^2 = ∑_{j=0}^{m_t−2} b_j(ε) f^{m_t+j} a_t^2.
Each f^{m_t+j} a_t^2 ∈ I, since f^{m_t+j} ∈ I_t. So there exists k_2 > 0 such that, for all ε > 0,

  (f + ε − σ_t(ε)) a_t^2 ∈ I_{2k_2}.

Since 1 − a_{t−1}^2 − ··· − a_N^2 ∈ I, there also exists k_3 > 0 such that, for all ε > 0,

  (f + ε)(1 − a_{t−1}^2 − ··· − a_N^2) ∈ I_{2k_3}.

Hence, if k* ≥ max{k_1, k_2, k_3}, then we have

  f(x) + ε − σ_ε ∈ I_{2k*}

for all ε > 0. By the construction, the degrees of all σ_i and a_i are independent of ε. So σ_ε ∈ Qmod(c_in, ψ)_{2k*} for all ε > 0, if k* is big enough. Note that

  I_{2k*} + Qmod(c_in, ψ)_{2k*} = IQ(c_eq, c_in)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ f_c − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ f_c. So f_{k*} = f_c. Moreover, since f_k is monotonically increasing, we must have f_k = f_c for all k ≥ k*.
Case II: Assume IQ(c_eq, c_in) is archimedean. Because

  IQ(c_eq, c_in) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ),

the set IQ(c_eq, c_in) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.
Case III: Suppose Assumption 3.2 holds. Let ϕ_1, ..., ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Let

  s := s_t + ··· + s_N, where each s_i = (v_i − f_c)( ϕ_i(f) )^2.

Then s ∈ Σ[x]_{2k_4} for some integer k_4 > 0. Let

  f̃ := f − f_c − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(c_in, ψ) such that

  f̃^{2ℓ} + q ∈ Ideal(c_eq, φ).

This is because, by Assumption 3.2, f̃(x) ≡ 0 on the set

  K_2 := { x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }.

It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ N and q = b_0 + ρ b_1 (b_0, b_1 ∈ Σ[x]) such that f̃^{2ℓ} + q ∈ Ideal(c_eq, φ). By Assumption 3.2, ρ ∈ Qmod(c_in, ψ), so we have q ∈ Qmod(c_in, ψ).

For all ε > 0 and τ > 0, we have f̃ + ε = φ_ε + θ_ε, where

  φ_ε := −τ ε^{1−2ℓ} ( f̃^{2ℓ} + q ),   θ_ε := ε( 1 + f̃/ε + τ(f̃/ε)^{2ℓ} ) + τ ε^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k_5 such that, for all ε > 0,

  φ_ε ∈ Ideal(c_eq, φ)_{2k_5},   θ_ε ∈ Qmod(c_in, ψ)_{2k_5}.

Hence, we can get

  f − (f_c − ε) = φ_ε + σ_ε,

where σ_ε := θ_ε + s ∈ Qmod(c_in, ψ)_{2k_5} for all ε > 0. Note that

  IQ(c_eq, c_in)_{2k_5} + IQ(φ, ψ)_{2k_5} = Ideal(c_eq, φ)_{2k_5} + Qmod(c_in, ψ)_{2k_5}.
For all ε > 0, γ = f_c − ε is feasible in (3.9) for the order k_5, so f_{k_5} ≥ f_c. Because of (3.11) and the monotonicity of f_k, we have f_k = f'_k = f_c for all k ≥ k_5. □
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is f_c, and the optimal value of (1.1) is f_min. If f_min is achievable at a critical point, then f_c = f_min. In Theorem 3.3, we have shown that f_k = f_c for all k big enough, where f_k is the optimal value of (3.9). The value f_c or f_min is often not known. How do we detect the tightness f_k = f_c in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16)  d := ⌈deg(c_eq, c_in, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)  rank M_t(y*) = rank M_{t−d}(y*),

then f_k = f_c, and we can get r := rank M_t(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
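The rank condition (3.17) is easy to test numerically once y* is at hand; the sketch below (n = 1, hypothetical data) uses the moments of a Dirac measure at u, for which flat truncation holds with rank one and the single minimizer can be read off from the moments.

```python
import numpy as np

u, k, d = 1.5, 3, 1                              # hypothetical minimizer, order, and d
y = np.array([u**a for a in range(2 * k + 1)])   # y* = moments of delta_u (n = 1)

def Mt(t):
    # univariate moment matrix M_t(y): entry (i, j) = y_{i+j}
    return np.array([[y[i + j] for j in range(t + 1)] for i in range(t + 1)])

t = 2
assert np.linalg.matrix_rank(Mt(t)) == np.linalg.matrix_rank(Mt(t - d)) == 1

# with rank one, the minimizer is recovered from the first-order moment
assert np.isclose(y[1] / y[0], u)
```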
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., f_c < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = f_c.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17) for some t ∈ [d, k], when k is big enough.
Proof. By Assumption 3.1, u is a critical point if and only if c_eq(u) = 0, φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

  { x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(c_eq, φ) + Qmod(ρ). By Assumption 3.2,

  Ideal(c_eq, φ) + Qmod(ρ) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, f achieves finitely many values on K_c, so (3.7) must achieve its optimal value f_c. By Theorem 3.3, we know that f_k = f'_k = f_c for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).
iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say u_1, ..., u_r, such that f_k = f(u_i) for each i. Clearly, f_k = f(u_i) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)  min f(x)  s.t.  c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its k-th order Lasserre relaxation is

(3.19)  γ'_k := min ⟨f, y⟩  s.t.  ⟨1, y⟩ = 1, M_k(y) ⪰ 0,
          L_{c_eq}^{(k)}(y) = 0, L_φ^{(k)}(y) = 0, L_ρ^{(k)}(y) ⪰ 0.

Its dual optimization problem is

(3.20)  γ_k := max γ  s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

  γ_k = γ'_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. □

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].
4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

  P := { x ∈ R^n | Ax − b ≥ 0 },

where A = [ a_1, ..., a_m ]^T ∈ R^{m×n}, b = [ b_1, ..., b_m ]^T ∈ R^m. This corresponds to E = ∅, I = [m], and each c_i(x) = a_i^T x − b_i. Denote

(4.1)  D(x) := diag( c_1(x), ..., c_m(x) ),   C(x) := [ A^T ; D(x) ].

The Lagrange multiplier vector λ = [ λ_1, ..., λ_m ]^T satisfies

(4.2)  [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

(4.3)  λ = (AA^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(4.4)  L(x) C(x) = I_m,

then we can get

  λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),
where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists, and give a degree bound for it.

The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (also see Definition 5.1). This is equivalent to: for every u, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then a_{i_1}, ..., a_{i_k} are linearly independent.
Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.

Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.
For a subset I = {i_1, ..., i_{m−n}} of [m], denote

  c_I(x) := ∏_{i∈I} c_i(x),   E_I(x) := c_I(x) · diag( c_{i_1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1} ),

  D_I(x) := diag( c_{i_1}(x), ..., c_{i_{m−n}}(x) ),   A_I := [ a_{i_1}, ..., a_{i_{m−n}} ]^T.

For the case I = ∅ (the empty set), we set c_∅(x) = 1. Let

  V := { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5)  L_I(x) C(x) = c_I(x) I_m.
The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

(4.6)
   rows \ columns      [n]                          n + I                              n + [m]\I
   I                   0                            E_I(x)                             0
   [m]\I               c_I(x)·(A_{[m]\I})^{−T}      −(A_{[m]\I})^{−T} (A_I)^T E_I(x)   0

Equivalently, the blocks of L_I are

  L_I( I, [n] ) = 0,   L_I( I, n + [m]\I ) = 0,   L_I( [m]\I, n + [m]\I ) = 0,
  L_I( I, n + I ) = E_I(x),   L_I( [m]\I, [n] ) = c_I(x) (A_{[m]\I})^{−T},
  L_I( [m]\I, n + I ) = −(A_{[m]\I})^{−T} (A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = L_I(x) C(x); then one can verify that

  G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},   G(I, [m]\I) = 0,

  G([m]\I, [m]\I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T} A_I^T E_I(x) ] [ (A_{[m]\I})^T ; 0 ] = c_I(x) I_n,

  G([m]\I, I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T} (A_I)^T E_I(x) ] [ A_I^T ; D_I(x) ] = 0.

This shows that the above L_I(x) satisfies (4.5).
Step II: We show that there exist real scalars ν_I satisfying

(4.7)  ∑_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.
• When m > n, let

(4.8)  N := { i ∈ [m] | rank A_{[m]\{i}} = n }.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(4.9)  ∑_{I′∈V_i} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, ..., a_n} is linearly independent. So there exist scalars α_1, ..., α_n such that

  a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows. So (4.7) is true, by induction. In the following, suppose at least one α_i ≠ 0, and write

  { i : α_i ≠ 0 } = { i_1, ..., i_k }.

Then a_{i_1}, ..., a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

  c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, ..., μ_{k+1} such that

  μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

This can be seen from the echelon form of an inconsistent linear system. Note that i_1, ..., i_{k+1} ∈ N. For each j = 1, ..., k+1, by (4.9),

  ∑_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = 1.

Then we can get

  1 = ∑_{j=1}^{k+1} μ_j c_{i_j}(x) = ∑_{j=1}^{k+1} μ_j ∑_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{i_j}(x) c_{I′}(x) = ∑_{I=I′∪{i_j}, I′∈V_{i_j}, 1≤j≤k+1} ν^{(i_j)}_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.
16 JIAWANG NIE
Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)   L(x) = Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4) because

  L(x)C(x) = Σ_{I∈V} ν_I L_I(x)C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m-n, so L(x) has degree ≤ m-n.
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m - rank A. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x)C(x) = I_m for polyhedral sets.
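As a concrete illustration of determining L(x) from linear equations, the sketch below (not from the paper; the degree choice and sample points are assumptions) matches coefficients numerically for the one-dimensional box {x ≥ 0, 1 - x ≥ 0}:

```python
import numpy as np

# Sketch: determine L(x) with L(x)C(x) = I_m by matching coefficients, for
# the 1-D box {x >= 0, 1 - x >= 0}.  Here n = 1, m = 2,
# C(x) = [[1, -1], [x, 0], [0, 1 - x]] (3 x 2), and we seek L(x), 2 x 3,
# with affine entries a_ij + b_ij * x (12 unknowns).  Imposing
# L(x_s)C(x_s) = I_2 at 4 sample points gives a linear system in (a, b).

def C(x):
    return np.array([[1.0, -1.0], [x, 0.0], [0.0, 1.0 - x]])

samples = [0.0, 1.0, 2.0, 3.0]
rows, rhs = [], []
for xs in samples:
    Cx = C(xs)
    for i in range(2):          # row of L
        for j in range(2):      # column of the identity
            row = np.zeros(12)
            for k in range(3):
                row[i * 3 + k] = Cx[k, j]            # coefficient of a_ik
                row[6 + i * 3 + k] = xs * Cx[k, j]   # coefficient of b_ik
            rows.append(row)
            rhs.append(1.0 if i == j else 0.0)

u, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
a, b = u[:6].reshape(2, 3), u[6:].reshape(2, 3)

# Verify the identity at a fresh point
xt = 0.37
assert np.allclose((a + b * xt) @ C(xt), np.eye(2), atol=1e-8)
```

Since every entry of L(x)C(x) - I_2 here is a univariate polynomial of degree at most 2, forcing it to vanish at four sample points forces it to vanish identically, which is why the check at a fresh point passes.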
Example 4.2. Consider the simplicial set

  { x_1 ≥ 0, ..., x_n ≥ 0, 1 - e^T x ≥ 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

  L(x) = [ 1-x_1   -x_2   ···   -x_n   1  ···  1
           -x_1   1-x_2   ···   -x_n   1  ···  1
             ⋮       ⋮     ⋱      ⋮    ⋮       ⋮
           -x_1    -x_2   ···  1-x_n   1  ···  1
           -x_1    -x_2   ···   -x_n   1  ···  1 ],

where the last n+1 columns are all ones.
Example 4.3. Consider the box constraint

  { x_1 ≥ 0, ..., x_n ≥ 0, 1 - x_1 ≥ 0, ..., 1 - x_n ≥ 0 }.

The equation L(x)C(x) = I_{2n} is satisfied by

  L(x) = [ I_n - diag(x)   I_n   I_n
             -diag(x)      I_n   I_n ].
Example 4.4. Consider the polyhedral set

  { 1 - x_4 ≥ 0, x_4 - x_3 ≥ 0, x_3 - x_2 ≥ 0, x_2 - x_1 ≥ 0, x_1 + 1 ≥ 0 }.

The equation L(x)C(x) = I_5 is satisfied by

  L(x) = (1/2) [ -x_1-1  -x_2-1  -x_3-1  -x_4-1  1 1 1 1 1
                 -x_1-1  -x_2-1  -x_3-1   1-x_4  1 1 1 1 1
                 -x_1-1  -x_2-1   1-x_3   1-x_4  1 1 1 1 1
                 -x_1-1   1-x_2   1-x_3   1-x_4  1 1 1 1 1
                  1-x_1   1-x_2   1-x_3   1-x_4  1 1 1 1 1 ].
Example 4.5. Consider the polyhedral set

  { 1 + x_1 ≥ 0, 1 - x_1 ≥ 0, 2 - x_1 - x_2 ≥ 0, 2 - x_1 + x_2 ≥ 0 }.

The matrix L(x) satisfying L(x)C(x) = I_4 is

  (1/6) [ x_1^2-3x_1+2     x_1x_2-x_2       4-x_1   2-x_1   1-x_1     1-x_1
          3x_1^2-3x_1-6    3x_1x_2+3x_2     6-3x_1  -3x_1   -3x_1-3   -3x_1-3
          1-x_1^2          -x_1x_2-2x_2-3   x_1-1   x_1+1   x_1+2     x_1+2
          1-x_1^2          3-x_1x_2-2x_2    x_1-1   x_1+1   x_1+2     x_1+2 ].
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are totally m equality and inequality constraints, i.e.,

  E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = (λ_1, ..., λ_m)^T satisfies the equation

(5.1)   [ ∇c_1(x)  ∇c_2(x)  ···  ∇c_m(x)
          c_1(x)   0        ···  0
          0        c_2(x)   ···  0
          ⋮        ⋮        ⋱    ⋮
          0        0        ···  c_m(x) ]  λ  =  [ ∇f(x)
                                                   0
                                                   ⋮
                                                   0 ],

where the coefficient matrix on the left is denoted by C(x).
Let C(x) be as in the above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)   L(x)C(x) = I_m,

then we can get

(5.3)   λ = L(x) [ ∇f(x) ; 0 ] = L_1(x)∇f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.
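To illustrate (5.3), the sketch below (not from the paper) recovers the multipliers for the box constraints {x ≥ 0, 1 - x ≥ 0} of Example 4.3, with the assumed test objective f(x) = e^T x:

```python
import numpy as np

# Illustration: lambda = L1(x) grad f(x) for the box {x >= 0, 1 - x >= 0},
# using the L(x) of Example 4.3.  The (assumed) objective f(x) = e^T x is
# minimized at u = 0, where grad f = e and the active constraints are
# x_i >= 0, so KKT gives lambda = (1, ..., 1, 0, ..., 0).

n = 3
u = np.zeros(n)
grad_f = np.ones(n)

# L(x) = [I - diag(x), I, I; -diag(x), I, I]; L1(x) is its first n columns
L1 = np.vstack([np.eye(n) - np.diag(u), -np.diag(u)])   # (2n) x n
lam = L1 @ grad_f
assert np.allclose(lam, np.concatenate([np.ones(n), np.zeros(n)]))

# KKT stationarity: grad f(u) = sum_i lambda_i grad c_i(u),
# with grad c_i = e_i (i <= n) and grad c_{n+i} = -e_i
grads = np.hstack([np.eye(n), -np.eye(n)])              # columns = grad c_i
assert np.allclose(grads @ lam, grad_f)
```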
Definition 5.1. The tuple c = (c_1, ..., c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to the following: for each u ∈ C^n, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), ..., ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

  P(x)W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ C^n,

  t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

  W(x) = [ w_1(x)  w_2(x)  ···  w_t(x) ].
Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, ..., t, denote

  r_{1i}(x) = ξ_1(x)^T w_i(x);

then (we use ∼ to denote row equivalence between matrices)

  W(x) ∼ [ 1       r_{12}(x)  ···  r_{1t}(x)
           w_1(x)  w_2(x)     ···  w_t(x) ]
       ∼ W_1(x) = [ 1  r_{12}(x)     ···  r_{1t}(x)
                    0  w_2^{(1)}(x)  ···  w_t^{(1)}(x) ],

where, for each i = 2, ..., t,

  w_i^{(1)}(x) = w_i(x) - r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ C[x]^{(s+1)×s} such that

  P_1(x)W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

  w_2^{(1)}(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^s such that

  ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) = ξ_2(x)^T w_i^{(1)}(x); then

  W_1(x) ∼ [ 1  r_{12}(x)     r_{13}(x)     ···  r_{1t}(x)
             0  1             r_{23}(x)     ···  r_{2t}(x)
             0  w_2^{(1)}(x)  w_3^{(1)}(x)  ···  w_t^{(1)}(x) ]
         ∼ W_2(x) = [ 1  r_{12}(x)  r_{13}(x)     ···  r_{1t}(x)
                      0  1          r_{23}(x)     ···  r_{2t}(x)
                      0  0          w_3^{(2)}(x)  ···  w_t^{(2)}(x) ],

where, for each i = 3, ..., t,

  w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere, and there exists P_2(x) ∈ C[x]^{(s+2)×(s+1)} such that

  P_2(x)W_1(x) = W_2(x).

Continuing this process, we can finally get

  W_2(x) ∼ ··· ∼ W_t(x) = [ 1  r_{12}(x)  r_{13}(x)  ···  r_{1t}(x)
                            0  1          r_{23}(x)  ···  r_{2t}(x)
                            0  0          1          ···  r_{3t}(x)
                            ⋮  ⋮          ⋮          ⋱    ⋮
                            0  0          0          ···  1
                            0  0          0          ···  0 ].

Consequently, for i = 1, 2, ..., t, there exists P_i(x) ∈ C[x]^{(s+i)×(s+i-1)} such that

  P_t(x)P_{t-1}(x)···P_1(x)W(x) = W_t(x).
Since W_t(x) is a unit upper triangular matrix polynomial (with zero rows at the bottom), there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x)W_t(x) = I_t. Let

  P(x) = P_{t+1}(x)P_t(x)P_{t-1}(x)···P_1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial; since W(x) is real, it still satisfies P(x)W(x) = I_t.

(ii) The conclusion is implied directly by item (i).
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), the matrix L(x) can be determined by comparing coefficients on both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints

  1 - x_1^2 ≥ 0, 1 - x_2^2 ≥ 0, ..., 1 - x_n^2 ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

  L(x) = [ -(1/2)diag(x)   I_n ].
Example 5.4. Consider the nonnegative portion of the unit sphere:

  x_1 ≥ 0, x_2 ≥ 0, ..., x_n ≥ 0, x_1^2 + ··· + x_n^2 - 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

  L(x) = [ I_n - xx^T   x·1_n^T      2x
           (1/2)x^T     -(1/2)1_n^T  -1 ].
Example 5.5. Consider the set

  1 - x_1^3 - x_2^4 ≥ 0, 1 - x_3^4 - x_4^3 ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

  L(x) = [ -x_1/3  -x_2/4  0       0       1  0
           0       0       -x_3/4  -x_4/3  0  1 ].
Example 5.6. Consider the quadratic set

  1 - x_1x_2 - x_2x_3 - x_1x_3 ≥ 0, 1 - x_1^2 - x_2^2 - x_3^2 ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

  [ 25x_1^3 + 10x_1^2x_2 + 40x_1x_2^2 - 25x_1 - 2x_3       -25x_1^3 - 10x_1^2x_2 - 40x_1x_2^2 + (49/2)x_1 + 2x_3
    -15x_1^2x_2 + 10x_1x_2^2 + 20x_1x_2x_3 - 10x_1         15x_1^2x_2 - 10x_1x_2^2 - 20x_1x_2x_3 + 10x_1 - x_2/2
    25x_1^2x_3 - 20x_1x_2^2 + 10x_1x_2x_3 + 2x_1           -25x_1^2x_3 + 20x_1x_2^2 - 10x_1x_2x_3 - 2x_1 - x_3/2
    1 - 20x_1x_3 - 10x_1^2 - 20x_1x_2                      20x_1x_2 + 20x_1x_3 + 10x_1^2
    -50x_1^2 - 20x_1x_2                                    50x_1^2 + 20x_1x_2 + 1 ].
We would like to remark that a polynomial tuple c = (c_1, ..., c_m) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, ..., d_m, there exists an open dense subset U of D = R[x]_{d_1} × ··· × R[x]_{d_m} such that every tuple c = (c_1, ..., c_m) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let J_1 be the set of all (i_1, ..., i_{n+1}) with 1 ≤ i_1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i_1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, ..., c_{i_{n+1}} such that if Res(c_{i_1}, ..., c_{i_{n+1}}) ≠ 0 then the equations

  c_{i_1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

  F_1(c) = Π_{(i_1,...,i_{n+1})∈J_1} Res(c_{i_1}, ..., c_{i_{n+1}}).

For the case that m ≤ n, J_1 = ∅ and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n polynomials of c_1, ..., c_m have a common complex zero.

Second, let J_2 be the set of all (j_1, ..., j_k) with k ≤ n and 1 ≤ j_1 < ··· < j_k ≤ m. When one of c_{j_1}, ..., c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, ..., c_{j_k}) is a polynomial in the coefficients of c_{j_1}, ..., c_{j_k} such that if ∆(c_{j_1}, ..., c_{j_k}) ≠ 0 then the equations

  c_{j_1}(x) = ··· = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j_1}, ..., c_{j_k} at u are linearly independent. When all of c_{j_1}, ..., c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, ..., c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, ..., c_{j_k}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

  F_2(c) = Π_{(j_1,...,j_k)∈J_2} ∆(c_{j_1}, ..., c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer polynomials of c_1, ..., c_m have a singular common complex zero.

Last, let F(c) = F_1(c)F_2(c) and

  U = { c = (c_1, ..., c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c_1, ..., c_m can have a common complex zero; and for any k polynomials (k ≤ n) of c_1, ..., c_m, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically.
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.
The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, ..., c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p_1, ..., p_m) to be the product L_1(x)∇f(x), i.e.,

  p_i = ( L_1(x)∇f(x) )_i.
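For instance, for the hypercube constraints of Example 5.3, L_1(x) = -(1/2)diag(x), so p_i(x) = -(1/2) x_i ∂f/∂x_i. The sketch below (not from the paper; the test objective is an assumption) evaluates these multiplier polynomials at a minimizer and checks KKT stationarity:

```python
import numpy as np

# Sketch: p_i = (L1(x) grad f(x))_i for the constraints 1 - x_i^2 >= 0 of
# Example 5.3, where L1(x) = -(1/2) diag(x).  Assumed test objective:
# f(x) = sum_i (x_i - 2)^2, minimized over [-1, 1]^n at u = (1, ..., 1).

n = 3
u = np.ones(n)
grad_f = 2.0 * (u - 2.0)              # grad f at u, equals (-2, ..., -2)

L1 = -0.5 * np.diag(u)
p = L1 @ grad_f                       # multiplier values p_i(u)
assert np.allclose(p, np.ones(n))     # lambda_i = 1 for each active c_i

# KKT stationarity: grad f(u) = sum_i lambda_i grad c_i(u), grad c_i = -2 x_i e_i
grads = -2.0 * np.diag(u)             # columns = grad c_i at u
assert np.allclose(grads @ p, grad_f)
```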
In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {x : f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.

By Theorem 3.3, we have fk = fmin for all k big enough, if fc = fmin and any one of the conditions i)-iii) there holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

  min  x_1x_2(10 - x_3)
  s.t. x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0, 1 - x_1 - x_2 - x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 6.1

  order k |       w/o LME        |       with LME
          | lower bound    time  | lower bound    time
     2    |  -0.0521      0.6841 |  -0.0521      0.1922
     3    |  -0.0026      0.2657 |  -3·10⁻⁸      0.2285
     4    |  -0.0007      0.6785 |  -6·10⁻⁹      0.4431
     5    |  -0.0004      1.6105 |  -2·10⁻⁹      0.9567
² Note that 1 - x^Tx = (1 - e^Tx)(1 + x^Tx) + Σ_{i=1}^n x_i(1 - x_i)² + Σ_{i≠j} x_i²x_j ∈ IQ(ceq, cin).
Example 6.2. Consider the optimization problem

  min  x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4)
  s.t. x_1^2 + x_2^2 + x_3^2 ≥ 1.

The matrix polynomial L(x) = [ x_1/2  x_2/2  x_3/2  -1 ]. The objective f is the sum of the positive definite form x_1^4 + x_2^4 + x_3^4 and the Motzkin polynomial

  M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2.

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive, and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

  c_1(x)p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 2. Computational results for Example 6.2

  order k |        w/o LME        |       with LME
          | lower bound     time  | lower bound    time
     3    |  -∞            0.4466 |  0.1111       0.1169
     4    |  -∞            0.4948 |  0.3333       0.3499
     5    |  -2.1821·10⁵   1.1836 |  0.3333       0.6530
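The reported minimum of Example 6.2 can be spot-checked numerically; the sketch below (not part of the paper) evaluates the objective at the 8 minimizers and checks the multiplier expression p_1(x) = (1/2)x^T∇f(x) = 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) via a finite-difference gradient:

```python
import numpy as np

# Spot check of Example 6.2: f = M(x) + x1^4 + x2^4 + x3^4 equals 1/3 at
# (+-1/sqrt(3), +-1/sqrt(3), +-1/sqrt(3)); and p1 = (1/2) x . grad f
# matches 3 M(x) + 2 sum x_i^4 (Euler's identity: x . grad M = 6 M).

def M(x):
    x1, x2, x3 = x
    return x1**4 * x2**2 + x1**2 * x2**4 + x3**6 - 3 * x1**2 * x2**2 * x3**2

def f(x):
    return M(x) + float(np.sum(np.asarray(x) ** 4))

t = 1.0 / np.sqrt(3.0)
for s1 in (t, -t):
    for s2 in (t, -t):
        for s3 in (t, -t):
            assert abs(f((s1, s2, s3)) - 1.0 / 3.0) < 1e-12

rng = np.random.default_rng(2)
x = rng.standard_normal(3)
eps = 1e-6
grad_f = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                   for e in np.eye(3)])
p1 = 0.5 * x @ grad_f
assert abs(p1 - (3 * M(x) + 2 * np.sum(x ** 4))) < 1e-4
```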
Example 6.3. Consider the optimization problem

  min  x_1x_2 + x_2x_3 + x_3x_4 - 3x_1x_2x_3x_4 + (x_1^3 + ··· + x_4^3)
  s.t. x_1, x_2, x_3, x_4 ≥ 0, 1 - x_1 - x_2 ≥ 0, 1 - x_3 - x_4 ≥ 0.

The matrix polynomial L(x) is

  [ 1-x_1  -x_2   0      0      1  1  0  0  1  0
    -x_1   1-x_2  0      0      1  1  0  0  1  0
    0      0      1-x_3  -x_4   0  0  1  1  0  1
    0      0      -x_3   1-x_4  0  0  1  1  0  1
    -x_1   -x_2   0      0      1  1  0  0  1  0
    0      0      -x_3   -x_4   0  0  1  1  0  1 ].

The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(ceq, cin) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.
³ This is because -c_1^2p_1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {-c_1(x)^2p_1(x)^2 ≥ 0} is compact.
⁴ This is because 1 - x_1^2 - x_2^2 belongs to the quadratic module of (x_1, x_2, 1-x_1-x_2), and 1 - x_3^2 - x_4^2 belongs to the quadratic module of (x_3, x_4, 1-x_3-x_4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

  order k |        w/o LME         |       with LME
          | lower bound      time  | lower bound     time
     3    |  -2.9·10⁻⁵      0.7335 |  -6·10⁻⁷       0.6091
     4    |  -1.4·10⁻⁵      2.5055 |  -8·10⁻⁸       2.7423
     5    |  -1.4·10⁻⁵     12.7092 |  -5·10⁻⁸      13.7449
Example 6.4. Consider the polynomial optimization problem

  min_{x∈R²}  x_1^2 + 50x_2^2
  s.t.  x_1^2 - 1/2 ≥ 0,  x_2^2 - 2x_1x_2 - 1/8 ≥ 0,  x_2^2 + 2x_1x_2 - 1/8 ≥ 0.

It is motivated from an example in [12, §3]. The first column of L(x) is

  [ (8x_1^3)/5 + x_1/5
    (288x_1^4x_2)/5 - (16x_1^3)/5 - (124x_1^2x_2)/5 + (8x_1)/5 - 2x_2
    -(288x_1^4x_2)/5 - (16x_1^3)/5 + (124x_1^2x_2)/5 + (8x_1)/5 + 2x_2 ],

and the second column of L(x) is

  [ -(8x_1^2x_2)/5 + (4x_2^3)/5 - x_2/10
    (288x_1^3x_2^2)/5 + (16x_1^2x_2)/5 - (142x_1x_2^2)/5 - (9x_1)/20 - (8x_2^3)/5 + (11x_2)/5
    -(288x_1^3x_2^2)/5 + (16x_1^2x_2)/5 + (142x_1x_2^2)/5 + (9x_1)/20 - (8x_2^3)/5 + (11x_2)/5 ].

The objective is coercive, so fmin is achieved at a critical point. The minimum value fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ∓(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

  order k |      w/o LME       |      with LME
          | lower bound  time  | lower bound   time
     3    |   6.7535    0.4611 |   56.7500    0.1309
     4    |   6.9294    0.2428 |  112.6517    0.2405
     5    |   8.8519    0.3376 |  112.6517    0.2167
     6    |  16.5971    0.4703 |  112.6517    0.3788
     7    |  35.4756    0.6536 |  112.6517    0.4537
Example 6.5. Consider the optimization problem

  min_{x∈R³}  x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 - ( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) )
  s.t.  x_1 ≥ 0, x_1x_2 - 1 ≥ 0, x_2x_3 - 1 ≥ 0.

The matrix polynomial L(x) is

  [ 1-x_1x_2  0    0  x_2  x_2  0
    x_1       0    0  -1   -1   0
    -x_1      x_2  0  1    0    -1 ].

The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R³₊, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

  order k |        w/o LME         |      with LME
          | lower bound      time  | lower bound   time
     2    |  -∞             0.4129 |  -∞          0.1900
     3    |  -7.8184·10⁶    0.4641 |  0.9492      0.3139
     4    |  -2.0575·10⁴    0.6499 |  0.9492      0.5057
Example 6.6. Consider the optimization problem (x_0 = 1):

  min_{x∈R⁴}  x^Tx + Σ_{i=0}^4 Π_{j≠i}(x_i - x_j)
  s.t.  x_1^2 - 1 ≥ 0, x_2^2 - 1 ≥ 0, x_3^2 - 1 ≥ 0, x_4^2 - 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  -I_4 ]. The first part of the objective is x^Tx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

  (1, 1, 1, 1), (1, -1, -1, 1), (1, -1, 1, -1), (1, 1, -1, -1),
  (1, -1, -1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1), (-1, 1, 1, -1),
  (-1, -1, -1, 1), (-1, -1, 1, -1), (-1, 1, -1, -1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6

  order k |        w/o LME         |      with LME
          | lower bound      time  | lower bound   time
     3    |  -∞             1.1377 |  3.5480       1.1765
     4    |  -6.6913·10⁴    4.7677 |  4.0000       3.0761
     5    |  -21.3778      22.9970 |  4.0000      10.3354
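The value fmin = 4 at the 11 minimizers of Example 6.6 is easy to confirm numerically: with x_0 = 1, each of the 11 listed points has at least two coinciding coordinates in every factor list, so every product vanishes and f = x^Tx = 4. The sketch below (not part of the paper) checks this; the characterization of the 11 points by their number of -1 entries is an observation, not a statement from the paper:

```python
import numpy as np
from itertools import product

# Spot check of Example 6.6: with x0 = 1, evaluate
# f(x) = x^T x + sum_{i=0}^{4} prod_{j != i} (x_i - x_j).

def f(x):
    z = np.concatenate([[1.0], np.asarray(x, float)])   # z[0] = x0 = 1
    s = sum(np.prod([z[i] - z[j] for j in range(5) if j != i])
            for i in range(5))
    return float(z[1:] @ z[1:] + s)

# The 11 listed minimizers are exactly the sign vectors whose number of
# -1 entries is 0, 2 or 3 (then no coordinate, counting x0 = 1, is unique).
count = 0
for p in product([1.0, -1.0], repeat=4):
    if p.count(-1.0) in (0, 2, 3):
        assert abs(f(p) - 4.0) < 1e-12
        count += 1
assert count == 11
```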
Example 6.7. Consider the optimization problem

  min_{x∈R³}  x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 - 3x_1^2x_2^2x_3^2 + x_2^2
  s.t.  x_1 - x_2x_3 ≥ 0,  -x_2 + x_3^2 ≥ 0.

The matrix polynomial L(x) = [ 1  0  0  0  0 ; -x_3  -1  0  0  0 ]. By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7

  order k |        w/o LME         |       with LME
          | lower bound      time  | lower bound     time
     3    |  -∞             0.6144 |  -∞            0.3418
     4    |  -1.0909·10⁷    1.0542 |  -3.9476       0.7180
     5    |  -942.6772      1.6771 |  -3·10⁻⁹       1.4607
     6    |  -0.0110        3.3532 |  -8·10⁻¹⁰      3.1618
Example 6.8. Consider the optimization problem

  min_{x∈R⁴}  x_1^2(x_1 - x_4)^2 + x_2^2(x_2 - x_4)^2 + x_3^2(x_3 - x_4)^2
              + 2x_1x_2x_3(x_1 + x_2 + x_3 - 2x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2
  s.t.  x_1 - x_2 ≥ 0,  x_2 - x_3 ≥ 0.

The matrix polynomial L(x) = [ 1  0  0  0  0  0 ; 1  1  0  0  0  0 ]. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

  (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

  order k |        w/o LME         |      with LME
          | lower bound      time  | lower bound   time
     2    |  -∞             0.3984 |  -0.3360      0.9321
     3    |  -∞             0.7634 |  0.9413       0.5240
     4    |  -6.4896·10⁵    4.5496 |  0.9413       1.7192
     5    |  -3.1645·10³   24.3665 |  0.9413       8.1228
Example 6.9. Consider the optimization problem

  min_{x∈R⁴}  (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
  s.t.  0 ≤ x_1, ..., x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1-t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.
⁵ Note that 4 - Σ_{i=1}^4 x_i^2 = Σ_{i=1}^4 ( x_i(1 - x_i)^2 + (1 - x_i)(1 + x_i^2) ) ∈ IQ(ceq, cin).
Table 9. Computational results for Example 6.9

  order k |       w/o LME        |       with LME
          | lower bound    time  | lower bound     time
     2    |  -0.0279      0.2262 |  -5·10⁻⁶        1.1835
     3    |  -0.0005      0.4691 |  -6·10⁻⁷        1.6566
     4    |  -0.0001      3.1098 |  -2·10⁻⁷        5.5234
     5    |  -4·10⁻⁵     16.5092 |  -6·10⁻⁷       19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x_0 = 1):

  min_{x∈Rⁿ}  Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
  s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, both the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their computational time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

  n         |    9       10      11       12       13       14
  w/o LME   |  1.2569  2.5619  6.3085  15.8722  35.1675  78.4111
  with LME  |  1.9714  3.8288  8.2519  20.0310  37.6373  82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

  IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

  Preord(cin, ψ) = { Σ_{r_1,...,r_ℓ∈{0,1}} g_1^{r_1} ··· g_ℓ^{r_ℓ} σ_{r_1···r_ℓ} : each σ_{r_1···r_ℓ} ∈ Σ[x] }.
The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)   f′^{pre}_k := min  ⟨f, y⟩
        s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{ceq}(y) = 0,  L^{(k)}_φ(y) = 0,
              L^{(k)}_{g_1^{r_1}···g_ℓ^{r_ℓ}}(y) ⪰ 0  for all r_1, ..., r_ℓ ∈ {0,1},
              y ∈ R^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)   f^{pre}_k := max  γ
        s.t.  f - γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

  f^{pre}_k = f′^{pre}_k = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f^{pre}_k = f′^{pre}_k = fmin for all k big enough.

Proof. The proof is very similar to Case III in the proof of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

  K_3 = { x ∈ Rⁿ : ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same.
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If, for each usage, the degree bound in [16] is applied, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

  λ = ( C(x)^T C(x) )^{-1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator both have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References
[1] D Bertsekas Nonlinear programming second edition Athena Scientific 1995[2] J Bochnak M Coste and M-F Roy Real algebraic geometry Springer 1998[3] F Bugarin D Henrion and J Lasserre Minimizing the sum of many rational functions
Math Programming Comput 8(2016) pp 83ndash111[4] D Cox J Little and D OrsquoShea Ideals varieties and algorithms An introduction to com-
putational algebraic geometry and commutative algebra Third edition Undergraduate Textsin Mathematics Springer New York 1997
[5] R Curto and L Fialkow Truncated K-moment problems in several variables Journal ofOperator Theory 54(2005) pp 189-226
[6] J Demmel JNie and V Powers Representations of positive polynomials on non-compactsemialgebraic sets via KKT ideals Journal of Pure and Applied Algebra 209 (2007) no 1pp 189ndash200
[7] E de Klerk and M Laurent On the Lasserre hierarchy of semidefinite programming re-laxations of convex polynomial optimization problems SIAM Journal On Optimization 21(2011) pp 824-832
[8] E de Klerk J Lasserre M Laurent and Z Sun Bound-constrained polynomial optimizationusing only elementary calculations Mathematics of Operations Research to appear
[9] E de Klerk M Laurent and Z Sun Convergence analysis for Lasserrersquos measure-based hier-archy of upper bounds for polynomial optimization Mathematical Programming to appear
[10] E de Klerk R Hess and M Laurent Improved convergence rates for Lasserre-type hi-erarchies of upper bounds for box-constrained polynomial optimization Preprint 2016arXiv160303329[mathOC]
[11] G-M Greuel and G Pfister A Singular introduction to commutative algebra Springer 2002[12] S He Z Luo J Nie and S Zhang Semidefinite relaxation bounds for indefinite homogeneous
quadratic optimization SIAM J Optim 19 (2) pp 503ndash523[13] D Henrion J Lasserre and J Loefberg GloptiPoly 3 moments optimization and semidef-
inite programming Optimization Methods and Software 24 (2009) no 4-5 pp 761ndash779[14] D Henrion and JB Lasserre Detecting global optimality and extracting solutions in Glop-
tiPoly Positive polynomials in control 293ndash310 Lecture Notes in Control and Inform Sci312 Springer Berlin 2005
[15] D Jibetean and E de Klerk Global optimization of rational functions A semidefinite pro-gramming approach Mathematical Programming 106 (2006) no 1 pp 93ndash109
[16] Janos Kollar Sharp effective Nullstellensatz J Amer Math Soc 1 (1988) no 4 963975[17] JB Lasserre Global optimization with polynomials and the problem of moments SIAM J
Optim 11(3) pp 796ndash817 2001[18] J Lasserre M Laurent and P Rostalski Semidefinite characterization and computation of
zero-dimensional real radical ideals Foundations of Computational Mathematics 8(2008)no 5 pp 607ndash647
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 29
[19] J. Lasserre, A new look at nonnegativity on closed sets and polynomial optimization, SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre, Convexity in semi-algebraic geometry and polynomial optimization, SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre, Moments, Positive Polynomials and Their Applications, Imperial College Press, 2009.
[22] J.B. Lasserre, Introduction to Polynomial and Semi-Algebraic Optimization, Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang, A bounded degree SOS hierarchy for polynomial optimization, Euro. J. Comput. Optim., to appear.
[24] M. Laurent, Semidefinite representations for finite varieties, Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent, Sums of squares, moment matrices and optimization over polynomials, Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent, Optimization over polynomials: selected topics, Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels, Minimizing polynomials via sum of squares over the gradient ideal, Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu, Global minimization of rational functions and the nearest GCDs, Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie, Discriminants and nonnegative polynomials, Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie, An exact Jacobian SDP relaxation for polynomial optimization, Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie, Polynomial optimization with real varieties, SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie, Certifying convergence of Lasserre's hierarchy via flat truncation, Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie, Optimality conditions and finite convergence of Lasserre's hierarchy, Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie, Linear optimization with cones of moments and nonnegative polynomials, Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels, Minimizing polynomial functions, in S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99, AMS, 2003.
[36] M. Putinar, Positive polynomials on compact semi-algebraic sets, Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick, Some concrete aspects of Hilbert's 17th problem, in Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer, Global optimization of polynomials using gradient tentacles and sums of squares, SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer, Positivity and sums of squares: a guide to recent results, Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm, SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones, Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels, Solving Systems of Polynomial Equations, CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son, Global optimization of polynomials using the truncated tangency variety and sums of squares, SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093
E-mail address: njw@math.ucsd.edu
A polynomial σ is said to be a sum of squares (SOS) if σ = s_1^2 + · · · + s_k^2 for some polynomials s_1, . . . , s_k ∈ R[x]. The set of all SOS polynomials in x is denoted as Σ[x]. For a degree d, denote the truncation

Σ[x]_d := Σ[x] ∩ R[x]_d.

For a tuple g = (g_1, . . . , g_t), its quadratic module is the set

Qmod(g) := Σ[x] + g_1 · Σ[x] + · · · + g_t · Σ[x].

The 2k-th truncation of Qmod(g) is the set

Qmod(g)_{2k} := Σ[x]_{2k} + g_1 · Σ[x]_{2k−deg(g_1)} + · · · + g_t · Σ[x]_{2k−deg(g_t)}.

The truncation Qmod(g)_{2k} depends on the generators g_1, . . . , g_t. Denote

(2.1)    IQ(h, g) := Ideal(h) + Qmod(g),    IQ(h, g)_{2k} := Ideal(h)_{2k} + Qmod(g)_{2k}.

The set IQ(h, g) is said to be archimedean if there exists p ∈ IQ(h, g) such that p(x) ≥ 0 defines a compact set in R^n. If IQ(h, g) is archimedean, then

K = {x ∈ R^n | h(x) = 0, g(x) ≥ 0}

must be a compact set. Conversely, if K is compact, say K ⊆ B(0, R) (the ball centered at 0 with radius R), then IQ(h, (g, R^2 − x^T x)) is always archimedean, and h = 0, (g, R^2 − x^T x) ≥ 0 give the same set K.

Theorem 2.1 (Putinar [36]). Let h, g be tuples of polynomials in R[x]. Let K be as above. Assume IQ(h, g) is archimedean. If a polynomial f ∈ R[x] is positive on K, then f ∈ IQ(h, g).

Interestingly, if f is only nonnegative on K but standard optimality conditions hold (see Subsection 1.1), then we still have f ∈ IQ(h, g) [33].
Let R^{N^n_d} be the space of real multi-sequences indexed by α ∈ N^n_d. A vector in R^{N^n_d} is called a truncated multi-sequence (tms) of degree d. A tms y = (y_α)_{α∈N^n_d} gives the Riesz functional R_y acting on R[x]_d as

(2.2)    R_y( Σ_{α∈N^n_d} f_α x^α ) := Σ_{α∈N^n_d} f_α y_α.

For f ∈ R[x]_d and y ∈ R^{N^n_d}, we denote

(2.3)    ⟨f, y⟩ := R_y(f).
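As a quick illustration (not from the paper), the Riesz functional in (2.2)-(2.3) can be sketched in a few lines of Python, storing a polynomial and a tms as dictionaries keyed by exponent tuples; all names here are illustrative:

```python
# A small illustration of the Riesz functional R_y in (2.2)-(2.3).
# Polynomials and truncated multi-sequences (tms) are both stored as
# dictionaries mapping an exponent tuple alpha to a coefficient f_alpha
# or a moment value y_alpha.

def riesz(f, y):
    """R_y(f) = sum over alpha of f_alpha * y_alpha, as in (2.2)."""
    return sum(c * y[a] for a, c in f.items())

# Example with n = 2, d = 2: let y be the tms of the point u = (2, 3),
# i.e. y_alpha = u^alpha, so R_y(f) should equal f(2, 3).
u = (2.0, 3.0)
y = {(i, j): u[0]**i * u[1]**j
     for i in range(3) for j in range(3) if i + j <= 2}

f = {(0, 0): 1.0, (1, 0): -2.0, (0, 2): 1.0}   # f = 1 - 2*x1 + x2^2
print(riesz(f, y))   # f(2, 3) = 1 - 4 + 9 = 6.0
```

For a tms of a point u, the Riesz functional is just evaluation at u; general tms vectors need not come from any point.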
Let q ∈ R[x]_{2k}. The k-th localizing matrix of q, generated by y ∈ R^{N^n_{2k}}, is the symmetric matrix L_q^{(k)}(y) such that

(2.4)    vec(a_1)^T ( L_q^{(k)}(y) ) vec(a_2) = R_y(q a_1 a_2)

for all a_1, a_2 ∈ R[x]_{k−⌈deg(q)/2⌉}. (The vec(a_i) denotes the coefficient vector of a_i.) When q = 1, L_q^{(k)}(y) is called a moment matrix, and we denote

(2.5)    M_k(y) := L_1^{(k)}(y).

The columns and rows of L_q^{(k)}(y), as well as M_k(y), are indexed by α ∈ N^n with 2|α| + deg(q) ≤ 2k. When q = (q_1, . . . , q_r) is a tuple of polynomials, we define

(2.6)    L_q^{(k)}(y) := diag( L_{q_1}^{(k)}(y), . . . , L_{q_r}^{(k)}(y) ),
a block diagonal matrix. For the polynomial tuples h, g as above, the set

(2.7)    S(h, g)_{2k} := { y ∈ R^{N^n_{2k}} | L_h^{(k)}(y) = 0, L_g^{(k)}(y) ⪰ 0 }

is a spectrahedral cone in R^{N^n_{2k}}. The set IQ(h, g)_{2k} is also a convex cone in R[x]_{2k}. The dual cone of IQ(h, g)_{2k} is precisely S(h, g)_{2k} [22, 25, 34]. This is because ⟨p, y⟩ ≥ 0 for all p ∈ IQ(h, g)_{2k} and for all y ∈ S(h, g)_{2k}.
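To make the matrix definitions (2.4)-(2.5) concrete, here is a small pure-Python sketch (my own, for the univariate case n = 1) of a moment matrix and a localizing matrix built from a tms:

```python
# Sketch of the moment matrix M_k(y) in (2.5) for n = 1, k = 2.
# y is indexed by the powers 0, 1, ..., 2k, so M_k(y)[i][j] = y_{i+j}.
# If y is the tms of a point u (y_a = u^a), then M_k(y) = v v^T with
# v = (1, u, u^2), a rank-one positive semidefinite matrix.

def moment_matrix(y, k):
    """Return M_k(y) as a nested list, M[i][j] = y_{i+j} (univariate case)."""
    return [[y[i + j] for j in range(k + 1)] for i in range(k + 1)]

def localizing_matrix(q, y, k, dq):
    """L^{(k)}_q(y) for a univariate q given by its coefficient list,
    per (2.4): entries R_y(q * x^i * x^j) with i, j <= k - ceil(dq/2)."""
    s = k - (dq + 1) // 2
    return [[sum(q[a] * y[a + i + j] for a in range(len(q)))
             for j in range(s + 1)] for i in range(s + 1)]

u = 0.5
y = [u**a for a in range(5)]            # tms of degree 2k = 4
M2 = moment_matrix(y, 2)
# localizing matrix of q = 1 - x^2 (note q(u) = 0.75 > 0 at u = 0.5)
L = localizing_matrix([1.0, 0.0, -1.0], y, 2, 2)
print(M2[1][1], L[0][0])   # 0.25 0.75
```

For a tms coming from a feasible point, both matrices are positive semidefinite, which is exactly the membership condition in the spectrahedral cone (2.7).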
3. The construction of tight relaxations

Consider the polynomial optimization problem (1.1). Let

λ = (λ_i)_{i∈E∪I}

be the vector of Lagrange multipliers. Denote the set

(3.1)    K = { (x, λ) ∈ R^n × R^{E∪I} | c_i(x) = 0 (i ∈ E), λ_j c_j(x) = 0 (j ∈ I), ∇f(x) = Σ_{i∈E∪I} λ_i ∇c_i(x) }.

Each point in K is called a critical pair. The projection

(3.2)    K_c := { u | (u, λ) ∈ K }

is the set of all real critical points. To construct tight relaxations for solving (1.1), we need the following assumption for Lagrange multipliers.
Assumption 3.1. For each i ∈ E ∪ I, there exists a polynomial p_i ∈ R[x] such that for all (x, λ) ∈ K it holds that

λ_i = p_i(x).

Assumption 3.1 is generically satisfied, as shown in Proposition 5.7. For the following special cases, we can get the polynomials p_i explicitly.

• (Simplex) For the simplex {e^T x − 1 = 0, x_1 ≥ 0, . . . , x_n ≥ 0}, it corresponds to E = {0}, I = [n], c_0(x) = e^T x − 1, c_j(x) = x_j (j ∈ [n]). The Lagrange multipliers can be expressed as

(3.3)    λ_0 = x^T ∇f(x),    λ_j = f_{x_j} − x^T ∇f(x)  (j ∈ [n]).

• (Hypercube) For the hypercube [−1, 1]^n, it corresponds to E = ∅, I = [n] and each c_j(x) = 1 − x_j^2. We can show that

(3.4)    λ_j = −(1/2) x_j f_{x_j}  (j ∈ [n]).

• (Ball or sphere) The constraint is 1 − x^T x = 0 or 1 − x^T x ≥ 0. It corresponds to E ∪ I = {1} and c_1 = 1 − x^T x. We have

(3.5)    λ_1 = −(1/2) x^T ∇f(x).

• (Triangular constraints) Suppose E ∪ I = {1, . . . , m} and each

c_i(x) = τ_i x_i + q_i(x_{i+1}, . . . , x_n)

for some polynomials q_i ∈ R[x_{i+1}, . . . , x_n] and scalars τ_i ≠ 0. The matrix T(x), consisting of the first m rows of [∇c_1(x), . . . , ∇c_m(x)], is an invertible lower triangular matrix with constant diagonal entries. Then

λ = T(x)^{−1} · [ f_{x_1} · · · f_{x_m} ]^T.

Note that the inverse T(x)^{−1} is a matrix polynomial.
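The closed-form expressions above are easy to spot-check numerically. The following pure-Python snippet (not from the paper; the objectives f are chosen for illustration) confirms (3.3)-(3.5) at critical pairs computed by hand:

```python
# Numeric sanity checks of the multiplier expressions (3.3)-(3.5) at
# critical pairs computed by hand (the objectives f are illustrative).

# Simplex (3.3): f = x1^2 + 2*x2^2 on {x1 + x2 = 1, x >= 0} has the
# critical point u = (2/3, 1/3), where lambda_0 = 4/3, lambda_1 = lambda_2 = 0.
u = (2.0 / 3, 1.0 / 3)
grad = (2 * u[0], 4 * u[1])                  # gradient of f at u
xTg = u[0] * grad[0] + u[1] * grad[1]        # x^T grad f(x)
lam0 = xTg                                   # formula for lambda_0
lam = [grad[j] - xTg for j in range(2)]      # formulas for lambda_j
ok_simplex = abs(lam0 - 4.0 / 3) < 1e-9 and all(abs(v) < 1e-9 for v in lam)

# Hypercube (3.4): f = x1 on [-1, 1] has the minimizer u = -1, where the
# multiplier of c_1 = 1 - x1^2 is 1/2; the formula gives -x1 * f_x1 / 2.
ok_cube = abs(-(-1.0) * 1.0 / 2 - 0.5) < 1e-9

# Sphere (3.5): f = x1 on the unit sphere has the minimizer u = -e_1, where
# the multiplier of c_1 = 1 - x^T x is 1/2; the formula gives -x^T grad f / 2.
ok_sphere = abs(-(-1.0) / 2 - 0.5) < 1e-9

print(ok_simplex, ok_cube, ok_sphere)   # True True True
```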
For more general constraints, we can also express λ as a polynomial function in x on the set K_c. This will be discussed in §4 and §5.

For the polynomials p_i as in Assumption 3.1, denote

(3.6)    φ := ( ∇f − Σ_{i∈E∪I} p_i ∇c_i, (p_j c_j)_{j∈I} ),    ψ := (p_j)_{j∈I}.

When the minimum value f_min of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem

(3.7)    f_c := min f(x)  s.t.  c_eq(x) = 0, c_in(x) ≥ 0, φ(x) = 0, ψ(x) ≥ 0.

We apply Lasserre relaxations to solve it. For an integer k > 0 (called the relaxation order), the k-th order Lasserre relaxation for (3.7) is

(3.8)    f'_k := min ⟨f, y⟩
         s.t.  ⟨1, y⟩ = 1, M_k(y) ⪰ 0,
               L_{c_eq}^{(k)}(y) = 0, L_{c_in}^{(k)}(y) ⪰ 0,
               L_φ^{(k)}(y) = 0, L_ψ^{(k)}(y) ⪰ 0, y ∈ R^{N^n_{2k}}.

Since x^0 = 1 (the constant one polynomial), the condition ⟨1, y⟩ = 1 means that (y)_0 = 1. The dual optimization problem of (3.8) is

(3.9)    f_k := max γ  s.t.  f − γ ∈ IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k}.

We refer to §2 for the notation used in (3.8)-(3.9). They are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For k = 1, 2, . . . , we get a hierarchy of Lasserre relaxations. In (3.8)-(3.9), if we remove the usage of φ and ψ, they reduce to the standard Lasserre relaxations in [17]. So (3.8)-(3.9) are stronger relaxations.
By the construction of φ as in (3.6), Assumption 3.1 implies that

K_c = { u ∈ R^n : c_eq(u) = 0, φ(u) = 0 }.

By Lemma 3.3 of [6], f achieves only finitely many values on K_c, say

(3.10)    v_1 < · · · < v_N.

A point u ∈ K_c might not be feasible for (3.7), i.e., it is possible that c_in(u) ≱ 0 or ψ(u) ≱ 0. In applications, we are often interested in the optimal value f_c of (3.7). When (3.7) is infeasible, by convention we set

f_c = +∞.

When the optimal value f_min of (1.1) is achieved at a critical point, f_c = f_min. This is the case if the feasible set is compact, or if f is coercive (i.e., for each ℓ, the sublevel set {f(x) ≤ ℓ} is compact), and the constraint qualification condition holds. As in [17], one can show that

(3.11)    f_k ≤ f'_k ≤ f_c

for all k. Moreover, f_k and f'_k are both monotonically increasing. If for some order k it occurs that

f_k = f'_k = f_c,

then the k-th order Lasserre relaxation is said to be tight (or exact).
3.1. Tightness of the relaxations. Let c_in, ψ, K_c, f_c be as above. We refer to §2 for the notation Qmod(c_in, ψ). We begin with a general assumption.

Assumption 3.2. There exists ρ ∈ Qmod(c_in, ψ) such that if u ∈ K_c and f(u) < f_c, then ρ(u) < 0.

In Assumption 3.2, the hypersurface ρ(x) = 0 separates feasible and infeasible critical points. Clearly, if u ∈ K_c is a feasible point for (3.7), then c_in(u) ≥ 0 and ψ(u) ≥ 0, and hence ρ(u) ≥ 0. Assumption 3.2 generally holds. For instance, it is satisfied in the following general cases:

a) When there are no inequality constraints, c_in and ψ are empty tuples. Then Qmod(c_in, ψ) = Σ[x], and Assumption 3.2 is satisfied for ρ = 0.

b) Suppose the set K_c is finite, say K_c = {u_1, . . . , u_D}, and

f(u_1), . . . , f(u_{t−1}) < f_c ≤ f(u_t), . . . , f(u_D).

Let ℓ_1, . . . , ℓ_D be real interpolating polynomials such that ℓ_i(u_j) = 1 for i = j and ℓ_i(u_j) = 0 for i ≠ j. For each i = 1, . . . , t − 1, there must exist j_i ∈ I such that c_{j_i}(u_i) < 0. Then the polynomial

(3.12)    ρ = Σ_{i<t} ( −1 / c_{j_i}(u_i) ) c_{j_i}(x) ( ℓ_i(x) )^2 + Σ_{i≥t} ( ℓ_i(x) )^2

satisfies Assumption 3.2.

c) For each x with f(x) = v_i < f_c, at least one of the constraints c_j(x) ≥ 0, p_j(x) ≥ 0 (j ∈ I) is violated. Suppose for each critical value v_i < f_c there exists g_i ∈ {c_j, p_j}_{j∈I} such that

g_i < 0 on K_c ∩ {f(x) = v_i}.

Let ϕ_1, . . . , ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Suppose v_t = f_c. Then the polynomial

(3.13)    ρ = Σ_{i<t} g_i(x) ( ϕ_i(f(x)) )^2 + Σ_{i≥t} ( ϕ_i(f(x)) )^2

satisfies Assumption 3.2.
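The interpolating polynomials ℓ_i in case b) and the univariate ϕ_i in case c) are standard Lagrange basis polynomials. As a small illustration (my own sketch, with sample nodes), the univariate ϕ_i can be constructed and checked by evaluation:

```python
# The univariate polynomials phi_1, ..., phi_N used in (3.13) form the
# Lagrange basis at the critical values v_1 < ... < v_N: phi_i(v_j) is 1
# if i = j and 0 otherwise.  A minimal construction by evaluation:

def lagrange_basis(values, i):
    """Return the function t -> phi_i(t) for the nodes in `values`."""
    def phi(t):
        p = 1.0
        for j, vj in enumerate(values):
            if j != i:
                p *= (t - vj) / (values[i] - vj)
        return p
    return phi

v = [-1.0, 0.5, 2.0]                     # sample critical values v_1 < v_2 < v_3
phi1 = lagrange_basis(v, 1)
print([round(phi1(t), 10) for t in v])   # [0.0, 1.0, 0.0]
```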
We refer to §2 for the archimedean condition and the notation IQ(h, g) as in (2.1). The following is about the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose K_c ≠ ∅ and Assumption 3.1 holds. If

i) IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean, or
ii) IQ(c_eq, c_in) is archimedean, or
iii) Assumption 3.2 holds,

then f_k = f'_k = f_c for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_k = f'_k = f_min for all k big enough, if one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. The condition ii) is only about the constraining polynomials of (1.1); it can be checked without φ, ψ. Clearly, the condition ii) implies the condition i).

The proof of Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties, and the objective f is a constant on each of them. We can get an SOS type representation
for f on each subvariety, and then construct a single one for f over the entire set of critical points.

Proof of Theorem 3.3. Clearly, every point in the complex variety

K_1 = { x ∈ C^n | c_eq(x) = 0, φ(x) = 0 }

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on K_c = K_1 ∩ R^n, say they are v_1 < · · · < v_N. Up to shifting f by a constant, we can further assume that f_c = 0. Clearly, f_c equals one of v_1, . . . , v_N, say v_t = f_c = 0.
Case I: Assume IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean. Let

I := Ideal(c_eq, φ)

be the critical ideal. Note that K_1 = V_C(I). The variety V_C(I) is a union of irreducible subvarieties, say V_1, . . . , V_ℓ. If V_i ∩ R^n ≠ ∅, then f is a real constant on V_i, which equals one of v_1, . . . , v_N. This is implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_C(I):

T_i := K_1 ∩ { f(x) = v_i }  (i = t, . . . , N).

Let T_{t−1} be the union of the irreducible subvarieties V_i such that either V_i ∩ R^n = ∅, or f ≡ v_j on V_i with v_j < v_t = f_c. Then it holds that

V_C(I) = T_{t−1} ∪ T_t ∪ · · · ∪ T_N.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, I_t, . . . , I_N ⊆ R[x] such that

I = I_{t−1} ∩ I_t ∩ · · · ∩ I_N

and T_i = V_C(I_i) for all i = t − 1, t, . . . , N. Denote the semialgebraic set

(3.14)    S := { x ∈ R^n | c_in(x) ≥ 0, ψ(x) ≥ 0 }.

For i = t − 1, we have V_R(I_{t−1}) ∩ S = ∅, because v_1, . . . , v_{t−1} < f_c. By the Positivstellensatz [2, Corollary 4.4.3], there exists p_0 ∈ Preord(c_in, ψ) (the preordering of the polynomial tuple (c_in, ψ); see §7.1) satisfying 2 + p_0 ∈ I_{t−1}. Note that 1 + p_0 > 0 on V_R(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(c_in, ψ) is archimedean, because I ⊆ I_{t−1} and

IQ(c_eq, c_in) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(c_in, ψ).

By Theorem 2.1, we have

p_1 := 1 + p_0 ∈ I_{t−1} + Qmod(c_in, ψ).

Then 1 + p_1 ∈ I_{t−1}. There exists p_2 ∈ Qmod(c_in, ψ) such that

−1 ≡ p_1 ≡ p_2  mod I_{t−1}.

Since f = (f/4 + 1)^2 − 1 · (f/4 − 1)^2, we have

f ≡ σ_{t−1} := (f/4 + 1)^2 + p_2 · (f/4 − 1)^2  mod I_{t−1}.

So, when k is big enough, we have σ_{t−1} ∈ Qmod(c_in, ψ)_{2k}.
For i = t: v_t = 0, and f(x) vanishes on V_C(I_t). By Hilbert's Strong Nullstellensatz [4], there exists an integer m_t > 0 such that f^{m_t} ∈ I_t. Define the polynomial

s_t(ǫ) := √ǫ · Σ_{j=0}^{m_t−1} \binom{1/2}{j} ǫ^{−j} f^j.

Then we have that

D_1 := ǫ (1 + ǫ^{−1} f) − ( s_t(ǫ) )^2 ≡ 0  mod I_t.

This is because, in the subtraction defining D_1, after expanding ( s_t(ǫ) )^2, all the terms f^j with j < m_t are cancelled, and f^j ∈ I_t for j ≥ m_t. So D_1 ∈ I_t. Let σ_t(ǫ) := ( s_t(ǫ) )^2; then f + ǫ − σ_t(ǫ) = D_1 and

(3.15)    f + ǫ − σ_t(ǫ) = Σ_{j=0}^{m_t−2} b_j(ǫ) f^{m_t+j}

for some real scalars b_j(ǫ) depending on ǫ.
For each i = t + 1, . . . , N: v_i > 0, and f(x)/v_i − 1 vanishes on V_C(I_i). By Hilbert's Strong Nullstellensatz [4], there exists 0 < m_i ∈ N such that (f/v_i − 1)^{m_i} ∈ I_i. Let

s_i := √v_i · Σ_{j=0}^{m_i−1} \binom{1/2}{j} (f/v_i − 1)^j.

As in the case i = t, we can similarly show that f − s_i^2 ∈ I_i. Let σ_i := s_i^2; then f − σ_i ∈ I_i.

Note that V_C(I_i) ∩ V_C(I_j) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, . . . , a_N ∈ R[x] such that

a_{t−1}^2 + · · · + a_N^2 − 1 ∈ I,    a_i ∈ ∩_{j∈{t−1,...,N}, j≠i} I_j.

For ǫ > 0, denote the polynomial

σ_ǫ := σ_t(ǫ) a_t^2 + Σ_{j∈{t−1,...,N}, j≠t} (σ_j + ǫ) a_j^2;

then

f + ǫ − σ_ǫ = (f + ǫ)(1 − a_{t−1}^2 − · · · − a_N^2) + Σ_{i∈{t−1,...,N}, i≠t} (f − σ_i) a_i^2 + (f + ǫ − σ_t(ǫ)) a_t^2.

For each i ≠ t, f − σ_i ∈ I_i, so

(f − σ_i) a_i^2 ∈ ∩_{j=t−1}^{N} I_j = I.

Hence, there exists k_1 > 0 such that

(f − σ_i) a_i^2 ∈ I_{2k_1}  (i ∈ {t − 1, . . . , N}, i ≠ t).

Since f + ǫ − σ_t(ǫ) ∈ I_t, we also have

(f + ǫ − σ_t(ǫ)) a_t^2 ∈ ∩_{j=t−1}^{N} I_j = I.

Moreover, by the equation (3.15),

(f + ǫ − σ_t(ǫ)) a_t^2 = Σ_{j=0}^{m_t−2} b_j(ǫ) f^{m_t+j} a_t^2.
Each f^{m_t+j} a_t^2 ∈ I, since f^{m_t+j} ∈ I_t. So there exists k_2 > 0 such that, for all ǫ > 0,

(f + ǫ − σ_t(ǫ)) a_t^2 ∈ I_{2k_2}.

Since 1 − a_{t−1}^2 − · · · − a_N^2 ∈ I, there also exists k_3 > 0 such that, for all ǫ > 0,

(f + ǫ)(1 − a_{t−1}^2 − · · · − a_N^2) ∈ I_{2k_3}.

Hence, if k* ≥ max{k_1, k_2, k_3}, then we have

f(x) + ǫ − σ_ǫ ∈ I_{2k*}

for all ǫ > 0. By the construction, the degrees of all σ_i and a_i are independent of ǫ. So σ_ǫ ∈ Qmod(c_in, ψ)_{2k*} for all ǫ > 0, if k* is big enough. Note that

I_{2k*} + Qmod(c_in, ψ)_{2k*} = IQ(c_eq, c_in)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ f_c − ǫ for all ǫ > 0. On the other hand, we always have f_{k*} ≤ f_c, so f_{k*} = f_c. Moreover, since f_k is monotonically increasing, we must have f_k = f_c for all k ≥ k*.

Case II: Assume IQ(c_eq, c_in) is archimedean. Because

IQ(c_eq, c_in) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ),

the set IQ(c_eq, c_in) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.
Case III: Suppose Assumption 3.2 holds. Let ϕ_1, . . . , ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Let

s := s_t + · · · + s_N,  where each s_i = (v_i − f_c) ( ϕ_i(f) )^2.

Then s ∈ Σ[x]_{2k_4} for some integer k_4 > 0. Let

f̃ := f − f_c − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(c_in, ψ) such that

f̃^{2ℓ} + q ∈ Ideal(c_eq, φ).

This is because, by Assumption 3.2, f̃(x) ≡ 0 on the set

K_2 := { x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }.

The set K_2 has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ N and q = b_0 + ρ b_1 (b_0, b_1 ∈ Σ[x]) such that f̃^{2ℓ} + q ∈ Ideal(c_eq, φ). By Assumption 3.2, ρ ∈ Qmod(c_in, ψ), so we have q ∈ Qmod(c_in, ψ).

For all ǫ > 0 and τ > 0, we have f̃ + ǫ = φ_ǫ + θ_ǫ, where

φ_ǫ := −τ ǫ^{1−2ℓ} ( f̃^{2ℓ} + q ),    θ_ǫ := ǫ ( 1 + f̃/ǫ + τ (f̃/ǫ)^{2ℓ} ) + τ ǫ^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k_5 such that, for all ǫ > 0,

φ_ǫ ∈ Ideal(c_eq, φ)_{2k_5},    θ_ǫ ∈ Qmod(c_in, ψ)_{2k_5}.

Hence, we can get

f − (f_c − ǫ) = φ_ǫ + σ_ǫ,

where σ_ǫ := θ_ǫ + s ∈ Qmod(c_in, ψ)_{2k_5} for all ǫ > 0. Note that

IQ(c_eq, c_in)_{2k_5} + IQ(φ, ψ)_{2k_5} = Ideal(c_eq, φ)_{2k_5} + Qmod(c_in, ψ)_{2k_5}.
For all ǫ > 0, γ = f_c − ǫ is feasible in (3.9) for the order k_5, so f_{k_5} ≥ f_c. Because of (3.11) and the monotonicity of f_k, we have f_k = f'_k = f_c for all k ≥ k_5. □
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is f_c, and the optimal value of (1.1) is f_min. If f_min is achievable at a critical point, then f_c = f_min. In Theorem 3.3, we have shown that f_k = f_c for all k big enough, where f_k is the optimal value of (3.9). The value f_c or f_min is often not known. How do we detect the tightness f_k = f_c in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16)    d := ⌈deg(c_eq, c_in, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)    rank M_t(y*) = rank M_{t−d}(y*),

then f_k = f_c, and we can get r := rank M_t(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers; it is implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
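As an illustration (my own sketch, not the GloptiPoly implementation), the rank test (3.17) can be carried out in pure Python for a univariate tms of the form y = [u]_{2k}, where flatness holds trivially and the unique minimizer is recovered as y_1/y_0:

```python
# A sketch of the flat truncation test (3.17) for a univariate tms.
# For y = [u]_{2k} (the tms of a single point u), every moment matrix
# M_t(y) has rank one, so rank M_t(y) = rank M_{t-d}(y) holds, and one
# minimizer can be extracted (here simply u = y_1 / y_0).

def numerical_rank(M, tol=1e-8):
    """Rank by Gaussian elimination with partial pivoting (pure Python)."""
    A = [row[:] for row in M]
    rank, rows, cols = 0, len(A), len(A[0])
    for c in range(cols):
        piv = max(range(rank, rows), key=lambda r: abs(A[r][c]), default=None)
        if piv is None or abs(A[piv][c]) < tol:
            continue
        A[rank], A[piv] = A[piv], A[rank]
        for r in range(rank + 1, rows):
            fac = A[r][c] / A[rank][c]
            A[r] = [a - fac * b for a, b in zip(A[r], A[rank])]
        rank += 1
    return rank

u, k, d = 0.7, 2, 1
y = [u**a for a in range(2 * k + 1)]
Mt = [[y[i + j] for j in range(k + 1)] for i in range(k + 1)]    # M_2(y)
Mtd = [[y[i + j] for j in range(k)] for i in range(k)]           # M_1(y)
flat = numerical_rank(Mt) == numerical_rank(Mtd)
print(flat, y[1] / y[0])    # True 0.7
```

In practice the rank is decided with a numerical tolerance, and the extraction for rank r > 1 uses the eigenvalue method of [14] rather than the trivial ratio above.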
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., f_c < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = f_c.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17), for some t ∈ [d, k], when k is big enough.

Proof. By Assumption 3.1, u is a critical point if and only if c_eq(u) = 0, φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

{ x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(c_eq, φ) + Qmod(ρ). By Assumption 3.2,

Ideal(c_eq, φ) + Qmod(ρ) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, f achieves finitely many values on K_c, so (3.7) must achieve its optimal value f_c. By Theorem 3.3, we know that f_k = f'_k = f_c for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).
iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say u_1, . . . , u_r, such that f_k = f(u_i) for each i. Clearly, f_k = f(u_i) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)    min f(x)  s.t.  c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its k-th order Lasserre relaxation is

(3.19)    γ'_k := min ⟨f, y⟩  s.t.  ⟨1, y⟩ = 1, M_k(y) ⪰ 0, L_{c_eq}^{(k)}(y) = 0, L_φ^{(k)}(y) = 0, L_ρ^{(k)}(y) ⪰ 0.

Its dual optimization problem is

(3.20)    γ_k := max γ  s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Case III of Theorem 3.3, we can show that

γ_k = γ'_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. □

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].
4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

P := { x ∈ R^n | Ax − b ≥ 0 },

where A = [a_1 · · · a_m]^T ∈ R^{m×n} and b = [b_1 · · · b_m]^T ∈ R^m. This corresponds to E = ∅, I = [m], and each c_i(x) = a_i^T x − b_i. Denote

(4.1)    D(x) := diag( c_1(x), . . . , c_m(x) ),    C(x) := [ A^T ; D(x) ].

The Lagrange multiplier vector λ = [λ_1 · · · λ_m]^T satisfies

(4.2)    [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

(4.3)    λ = (A A^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(4.4)    L(x) C(x) = I_m,

then we can get

λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),
where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such an L(x) exists, and we give a degree bound for it.

The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (see also Definition 5.1). This is equivalent to: for every u, if J(u) = {i_1, . . . , i_k} (see (1.2) for the notation), then a_{i_1}, . . . , a_{i_k} are linearly independent.

Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.

Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.

For a subset I = {i_1, . . . , i_{m−n}} of [m], denote

c_I(x) := Π_{i∈I} c_i(x),    E_I(x) := c_I(x) · diag( c_{i_1}(x)^{−1}, . . . , c_{i_{m−n}}(x)^{−1} ),
D_I(x) := diag( c_{i_1}(x), . . . , c_{i_{m−n}}(x) ),    A_I := [ a_{i_1} · · · a_{i_{m−n}} ]^T.

For the case that I = ∅ (the empty set), we set c_∅(x) = 1. Let

V := { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5)    L_I(x) C(x) = c_I(x) I_m.
The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2 × 3 block matrix (L_I(J, K) denotes the submatrix of L_I whose row indices are from J and whose column indices are from K):

(4.6)
    J \ K     [n]                          n + I                                   n + [m]\I
    I         0                            E_I(x)                                  0
    [m]\I     c_I(x) · (A_{[m]\I})^{−T}    −(A_{[m]\I})^{−T} (A_I)^T E_I(x)        0

Equivalently, the blocks of L_I are

L_I( I, [n] ) = 0,    L_I( I, n + [m]\I ) = 0,    L_I( [m]\I, n + [m]\I ) = 0,
L_I( I, n + I ) = E_I(x),    L_I( [m]\I, [n] ) = c_I(x) · (A_{[m]\I})^{−T},
L_I( [m]\I, n + I ) = −(A_{[m]\I})^{−T} (A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. (The superscript −T denotes the inverse of the transpose.) Let G = L_I(x) C(x); then one can verify that

G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},    G(I, [m]\I) = 0,

G([m]\I, [m]\I) = [ c_I(x) (A_{[m]\I})^{−T}    −(A_{[m]\I})^{−T} (A_I)^T E_I(x) ] · [ (A_{[m]\I})^T ; 0 ] = c_I(x) I_n,

G([m]\I, I) = [ c_I(x) (A_{[m]\I})^{−T}    −(A_{[m]\I})^{−T} (A_I)^T E_I(x) ] · [ (A_I)^T ; D_I(x) ] = 0.

This shows that the above L_I(x) satisfies (4.5).
Step II: We show that there exist real scalars ν_I satisfying

(4.7)    Σ_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.
• When m > n, let

(4.8)    N := { i ∈ [m] | rank A_{[m]\{i}} = n }.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(4.9)    Σ_{I′∈V_i} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, . . . , a_n} is linearly independent. So there exist scalars α_1, . . . , α_n such that

a_m = α_1 a_1 + · · · + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows; so (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

{ i : α_i ≠ 0 } = { i_1, . . . , i_k }.

Then a_{i_1}, . . . , a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

c_{i_1}(x) = · · · = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, . . . , μ_{k+1} such that

μ_1 c_{i_1}(x) + · · · + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This can be implied by the echelon form of inconsistent linear systems.) Note that i_1, . . . , i_{k+1} ∈ N. For each j = 1, . . . , k + 1, by (4.9),

Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = 1.

Then we can get

1 = Σ_{j=1}^{k+1} μ_j c_{i_j}(x) = Σ_{j=1}^{k+1} μ_j Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{i_j}(x) c_{I′}(x) = Σ_{I=I′∪{i_j}, I′∈V_{i_j}, 1≤j≤k+1} ν^{(i_j)}_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.
Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)    L(x) := Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

L(x) C(x) = Σ_{I∈V} ν_I L_I(x) C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □

Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x) with L(x) C(x) = I_m for polyhedral sets.
Example 4.2. Consider the simplicial set

{ x_1 ≥ 0, . . . , x_n ≥ 0, 1 − e^T x ≥ 0 }.

The equation L(x) C(x) = I_{n+1} is satisfied by

L(x) = [ 1−x_1   −x_2    · · ·   −x_n    1  · · ·  1
         −x_1    1−x_2   · · ·   −x_n    1  · · ·  1
          ⋮        ⋮       ⋱       ⋮      ⋮          ⋮
         −x_1    −x_2    · · ·   1−x_n   1  · · ·  1
         −x_1    −x_2    · · ·   −x_n    1  · · ·  1 ].

Example 4.3. Consider the box constraint

{ x_1 ≥ 0, . . . , x_n ≥ 0, 1 − x_1 ≥ 0, . . . , 1 − x_n ≥ 0 }.

The equation L(x) C(x) = I_{2n} is satisfied by

L(x) = [ I_n − diag(x)   I_n   I_n
         −diag(x)        I_n   I_n ].
Example 4.4. Consider the polyhedral set

{ 1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0 }.

The equation L(x) C(x) = I_5 is satisfied by

L(x) = (1/2) [ −x_1−1   −x_2−1   −x_3−1   −x_4−1   1 1 1 1 1
               −x_1−1   −x_2−1   −x_3−1   1−x_4    1 1 1 1 1
               −x_1−1   −x_2−1   1−x_3    1−x_4    1 1 1 1 1
               −x_1−1   1−x_2    1−x_3    1−x_4    1 1 1 1 1
               1−x_1    1−x_2    1−x_3    1−x_4    1 1 1 1 1 ].
Example 4.5. Consider the polyhedral set

{ 1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0 }.

The matrix L(x) satisfying L(x) C(x) = I_4 is

L(x) = (1/6) [ x_1^2−3x_1+2     x_1 x_2 − x_2        4−x_1    2−x_1    1−x_1     1−x_1
               3x_1^2−3x_1−6    3x_2 + 3x_1 x_2      6−3x_1   −3x_1    −3x_1−3   −3x_1−3
               1−x_1^2          −x_1 x_2 − 2x_2 − 3  x_1−1    x_1+1    x_1+2     x_1+2
               1−x_1^2          3 − x_1 x_2 − 2x_2   x_1−1    x_1+1    x_1+2     x_1+2 ].
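The matrix in Example 4.5 can likewise be checked numerically (a sketch of my own, with C(x) = [A^T; D(x)] as in (4.1)):

```python
# Numeric check of Example 4.5: L(x) C(x) = I_4 at a random point.
import random

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

random.seed(1)
x1, x2 = random.uniform(-3, 3), random.uniform(-3, 3)

# constraints: 1 + x1, 1 - x1, 2 - x1 - x2, 2 - x1 + x2 (all >= 0)
A = [[1.0, 0.0], [-1.0, 0.0], [-1.0, -1.0], [-1.0, 1.0]]
c = [1 + x1, 1 - x1, 2 - x1 - x2, 2 - x1 + x2]
C = [[A[j][i] for j in range(4)] for i in range(2)] \
    + [[c[i] if i == j else 0.0 for j in range(4)] for i in range(4)]

L = [[v / 6.0 for v in row] for row in [
    [x1*x1 - 3*x1 + 2,   x1*x2 - x2,         4 - x1,   2 - x1,  1 - x1,    1 - x1],
    [3*x1*x1 - 3*x1 - 6, 3*x2 + 3*x1*x2,     6 - 3*x1, -3*x1,   -3*x1 - 3, -3*x1 - 3],
    [1 - x1*x1,          -x1*x2 - 2*x2 - 3,  x1 - 1,   x1 + 1,  x1 + 2,    x1 + 2],
    [1 - x1*x1,          3 - x1*x2 - 2*x2,   x1 - 1,   x1 + 1,  x1 + 2,    x1 + 2]]]

G = matmul(L, C)
ok = all(abs(G[i][j] - (1.0 if i == j else 0.0)) < 1e-9
         for i in range(4) for j in range(4))
print(ok)   # True
```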
5. General constraints

We consider general nonlinear constraints, as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are totally m equality and inequality constraints, i.e.,

E ∪ I = {1, . . . , m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1 · · · λ_m]^T satisfies the equation

(5.1)    C(x) λ = [ ∇f(x) ; 0 ; · · · ; 0 ],    where    C(x) := [ ∇c_1(x)   ∇c_2(x)   · · ·   ∇c_m(x)
                                                                   c_1(x)    0         · · ·   0
                                                                   0         c_2(x)    · · ·   0
                                                                   ⋮         ⋮         ⋱       ⋮
                                                                   0         0         · · ·   c_m(x) ].

Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)    L(x) C(x) = I_m,

then we can get

(5.3)    λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such an L(x) exists.

Definition 5.1. The tuple c = (c_1, . . . , c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to: for each u ∈ C^n, if J(u) = {i_1, . . . , i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), . . . , ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.

Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

P(x) W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x) W(x) = I_t, then for all u ∈ C^n,

t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

W(x) = [ w_1(x)   w_2(x)   · · ·   w_t(x) ].
Then the equation w1(x) = 0 does not have a complex solution By Hilbertrsquos WeakNullstellensatz [4] there exists ξ1(x) isin C[x]s such that ξ1(x)
Tw1(x) = 1 For eachi = 2 t denote
r1i(x) = ξ1(x)Twi(x)
then (use sim to denote row equivalence between matrices)
W (x) sim[
1 r12(x) middot middot middot r1t(x)w1(x) w2(x) middot middot middot wm(x)
]
simW1(x) =
[1 r12(x) middot middot middot r1m(x)
0 w(1)2 (x) middot middot middot w
(1)m (x)
]
where each (i = 2 m)
w(1)i (x) = wi(x)minus r1i(x)w1(x)
So there exists P1(x) isin R[x](s+1)timess such that
P1(x)W (x) =W1(x)
Since W (x) and W1(x) are row equivalent W1(x) must also have full column rankeverywhere Similarly the polynomial equation
w(1)2 (x) = 0
does not have a complex solution Again by Hilbertrsquos Weak Nullstellensatz [4]there exists ξ2(x) isin C[x]s such that
ξ2(x)Tw
(1)2 (x) = 1
For each i = 3 t let r2i(x) = ξ2(x)Tw
(1)2 (x) then
W1(x) ∼ [ 1  r12(x)     r13(x)     ···  r1t(x)
          0  1          r23(x)     ···  r2t(x)
          0  w2^(1)(x)  w3^(1)(x)  ···  wt^(1)(x) ]

      ∼ W2(x) = [ 1  r12(x)  r13(x)     ···  r1t(x)
                  0  1       r23(x)     ···  r2t(x)
                  0  0       w3^(2)(x)  ···  wt^(2)(x) ],

where each (i = 3, . . . , t)

wi^(2)(x) = wi^(1)(x) − r2i(x)w2^(1)(x).

Similarly, W1(x) and W2(x) are row equivalent, so W2(x) has full column rank everywhere. There exists P2(x) ∈ C[x]^{(s+2)×(s+1)} such that

P2(x)W1(x) = W2(x).
Continuing this process, we finally get

W2(x) ∼ ··· ∼ Wt(x) = [ 1  r12(x)  r13(x)  ···  r1t(x)
                        0  1       r23(x)  ···  r2t(x)
                        0  0       1       ···  r3t(x)
                        ⋮                       ⋮
                        0  0       0       ···  1
                        0  0       0       ···  0 ].

Consequently, there exist Pi(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, . . . , t, such that

Pt(x)Pt−1(x) ··· P1(x)W(x) = Wt(x).
Since the top t×t block of Wt(x) is a unit upper triangular matrix polynomial, there exists Pt+1(x) ∈ C[x]^{t×(s+t)} such that Pt+1(x)Wt(x) = It. Let

P(x) = Pt+1(x)Pt(x)Pt−1(x) ··· P1(x);

then P(x)W(x) = It. Note that P(x) ∈ C[x]^{t×s}.
For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (where P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.
(ii) The conclusion follows directly from item (i). □
In Proposition 52, there is no explicit degree bound for L(x) satisfying (52). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), its coefficients can be determined by comparing the coefficients of both sides of (52), which amounts to solving a linear system. In the following, we give some examples of L(x) satisfying (52).
Example 53. Consider the hypercube, given by the quadratic constraints

1 − x1² ≥ 0,  1 − x2² ≥ 0,  . . . ,  1 − xn² ≥ 0.

The equation L(x)C(x) = In is satisfied by

L(x) = [ −(1/2)diag(x)   In ].
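This identity is easy to sanity-check numerically. The sketch below is our own illustration (the function name `hypercube_check` is ours, not from the paper); it forms C(x) by stacking the Jacobian [∇c1(x) ··· ∇cn(x)] on top of diag(c1(x), . . . , cn(x)) and multiplies by L(x) at random points:

```python
import random

def hypercube_check(n, trials=10, tol=1e-9):
    """Verify L(x)C(x) = I_n at random points for c_j(x) = 1 - x_j^2."""
    for _ in range(trials):
        x = [random.uniform(-2, 2) for _ in range(n)]
        # L(x) = [ -(1/2)diag(x) | I_n ], size n x 2n
        L = [[-0.5 * x[i] if j == i else 0.0 for j in range(n)]
             + [1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
        # C(x) = [ Jacobian of c ; diag(c) ], size 2n x n
        C = [[-2.0 * x[i] if j == i else 0.0 for j in range(n)] for i in range(n)]
        C += [[1.0 - x[i] ** 2 if j == i else 0.0 for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                entry = sum(L[i][k] * C[k][j] for k in range(2 * n))
                if abs(entry - (1.0 if i == j else 0.0)) > tol:
                    return False
    return True

print(hypercube_check(5))  # → True
```

The product equals In identically in x, so the check passes at any sample point.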
Example 54. Consider the nonnegative portion of the unit sphere:

x1 ≥ 0,  x2 ≥ 0,  . . . ,  xn ≥ 0,  x1² + ··· + xn² − 1 = 0.

The equation L(x)C(x) = In+1 is satisfied by

L(x) = [ In − xxᵀ    x·1nᵀ       2x
         (1/2)xᵀ    −(1/2)1nᵀ   −1 ].
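The same kind of numerical check works here; the following is again our own illustration, assuming the block structure C(x) = [Jacobian; diag(c)] with constraints c = (x1, . . . , xn, xᵀx − 1):

```python
import random

def sphere_check(n, tol=1e-9):
    """Verify L(x)C(x) = I_{n+1} at a random point."""
    x = [random.uniform(-2, 2) for _ in range(n)]
    xtx = sum(v * v for v in x)
    # C(x) is (2n+1) x (n+1): top n rows [I_n | 2x], bottom diag(x_1,...,x_n, x^Tx - 1)
    C = [[1.0 if j == i else 0.0 for j in range(n)] + [2.0 * x[i]] for i in range(n)]
    C += [[x[i] if j == i else 0.0 for j in range(n)] + [0.0] for i in range(n)]
    C += [[0.0] * n + [xtx - 1.0]]
    # L(x) = [ I_n - x x^T | x 1_n^T | 2x ; (1/2)x^T | -(1/2)1_n^T | -1 ]
    L = [[(1.0 if j == i else 0.0) - x[i] * x[j] for j in range(n)]
         + [x[i]] * n + [2.0 * x[i]] for i in range(n)]
    L += [[0.5 * x[j] for j in range(n)] + [-0.5] * n + [-1.0]]
    prod = [[sum(L[i][k] * C[k][j] for k in range(2 * n + 1)) for j in range(n + 1)]
            for i in range(n + 1)]
    return all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n + 1) for j in range(n + 1))

print(sphere_check(4))  # → True
```

As in Example 53, the identity holds for all x, not only feasible points.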
Example 55. Consider the set

1 − x1³ − x2⁴ ≥ 0,  1 − x3⁴ − x4³ ≥ 0.

The equation L(x)C(x) = I2 is satisfied by

L(x) = [ −x1/3   −x2/4   0       0       1   0
         0       0       −x3/4   −x4/3   0   1 ].
Example 56. Consider the quadratic set

1 − x1x2 − x2x3 − x1x3 ≥ 0,  1 − x1² − x2² − x3² ≥ 0.

The matrix L(x)ᵀ satisfying L(x)C(x) = I2 is
[ 25x1³ + 10x1²x2 + 40x1x2² − 25x1 − 2x3     −25x1³ − 10x1²x2 − 40x1x2² + 49x1² + 2x3
  −15x1²x2 + 10x1x2² + 20x1x2x3 − 10x1        15x1²x2 − 10x1x2² − 20x1x2x3 + 10x1 − x2²
  25x1²x3 − 20x1x2² + 10x1x2x3 + 2x1          −25x1²x3 + 20x1x2² − 10x1x2x3 − 2x1 − x3²
  1 − 20x1x3 − 10x1² − 20x1x2                 20x1x2 + 20x1x3 + 10x1²
  −50x1² − 20x1x2                             50x1² + 20x1x2 + 1 ].
We would like to remark that a polynomial tuple c = (c1, . . . , cm) is generically nonsingular, and that Assumption 31 holds generically.

Proposition 57. For all positive degrees d1, . . . , dm, there exists an open dense subset U of D = R[x]d1 × ··· × R[x]dm such that every tuple c = (c1, . . . , cm) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 31 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof uses resultants and discriminants, for which we refer to [29]. First, let J1 be the set of all (i1, . . . , in+1) with 1 ≤ i1 < ··· < in+1 ≤ m. The resultant Res(ci1, . . . , cin+1) [29, Section 2] is a polynomial in the coefficients of ci1, . . . , cin+1 such that if Res(ci1, . . . , cin+1) ≠ 0 then the equations

ci1(x) = ··· = cin+1(x) = 0

have no complex solutions. Define

F1(c) = Π_{(i1,...,in+1)∈J1} Res(ci1, . . . , cin+1).
For the case m ≤ n, J1 = ∅ and we simply let F1(c) = 1. Clearly, if F1(c) ≠ 0, then no more than n of the polynomials c1, . . . , cm have a common complex zero.
Second, let J2 be the set of all (j1, . . . , jk) with k ≤ n and 1 ≤ j1 < ··· < jk ≤ m. When one of cj1, . . . , cjk has degree bigger than one, the discriminant ∆(cj1, . . . , cjk) is a polynomial in the coefficients of cj1, . . . , cjk such that if ∆(cj1, . . . , cjk) ≠ 0 then the equations

cj1(x) = ··· = cjk(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of cj1, . . . , cjk at u are linearly independent. When all of cj1, . . . , cjk have degree one, the discriminant of the tuple (cj1, . . . , cjk) is not a single polynomial, but we can define ∆(cj1, . . . , cjk) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

F2(c) = Π_{(j1,...,jk)∈J2} ∆(cj1, . . . , cjk).
Clearly, if F2(c) ≠ 0, then no n or fewer of the polynomials c1, . . . , cm have a singular common complex zero.
Last, let F(c) = F1(c)F2(c) and

U = { c = (c1, . . . , cm) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c1, . . . , cm can have a common complex zero, and for any k polynomials (k ≤ n) of c1, . . . , cm, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.
Since every c ∈ U is nonsingular, Proposition 52 implies (52), whence Assumption 31 is satisfied. Therefore, Assumption 31 holds for all c ∈ U, so it holds generically. □
6 Numerical examples

This section gives examples of using the new relaxations (38)-(39), with Lagrange multiplier expressions, for solving the optimization problem (11). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (38)-(39) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.
The polynomials pi in Assumption 31 are constructed as follows. Order the constraining polynomials as c1, . . . , cm. First, find a matrix polynomial L(x) satisfying (44) or (52). Let L1(x) be the submatrix of L(x) consisting of the first n columns. Then choose (p1, . . . , pm) to be the product L1(x)∇f(x), i.e.,

pi = ( L1(x)∇f(x) )_i.

In all our examples, the global minimum value fmin of (11) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
By Theorem 33, we have fk = fmin for all k big enough if fc = fmin and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (317) is more convenient for usage. When there are finitely many global minimizers, Theorem 34 proved that (317) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 61, 67 and 69 (they have infinitely many minimizers).
We compare the new relaxations (38)-(39) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (38)-(39) (using Lagrange multiplier expressions) are shown in the tables, and the computational time (in seconds) is also compared. The results for standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (38)-(39) are titled "with LME".
Example 61. Consider the optimization problem

min   x1x2(10 − x3)
s.t.  x1 ≥ 0,  x2 ≥ 0,  x3 ≥ 0,  1 − x1 − x2 − x3 ≥ 0.

The matrix polynomial L(x) is given in Example 42. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 33 is satisfied.² Each feasible point (x1, x2, x3) with x1x2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (38)-(39) are in Table 1. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 61

         |     w.o. LME         |     with LME
 order k | lower bound   time   | lower bound   time
    2    |  −0.0521     0.6841  |  −0.0521     0.1922
    3    |  −0.0026     0.2657  |  −3·10⁻⁸     0.2285
    4    |  −0.0007     0.6785  |  −6·10⁻⁹     0.4431
    5    |  −0.0004     1.6105  |  −2·10⁻⁹     0.9567
²Note that 1 − xᵀx = (1 − eᵀx)(1 + xᵀx) + Σ_{i=1}^n xi(1 − xi)² + Σ_{i≠j} xi²xj ∈ IQ(ceq, cin).
Example 62. Consider the optimization problem

min   x1⁴x2² + x1²x2⁴ + x3⁶ − 3x1²x2²x3² + (x1⁴ + x2⁴ + x3⁴)
s.t.  x1² + x2² + x3² ≥ 1.

The matrix polynomial L(x) = [ (1/2)x1  (1/2)x2  (1/2)x3  −1 ]. The objective f is the sum of the positive definite form x1⁴ + x2⁴ + x3⁴ and the Motzkin polynomial

M(x) = x1⁴x2² + x1²x2⁴ + x3⁶ − 3x1²x2²x3².
Note that M(x) is nonnegative everywhere but not SOS [37] Clearly f is coerciveand fmin is achieved at a critical point The set IQ(φ ψ) is archimedean because
c1(x)p1(x) = (x21 + x22 + x23 minus 1)(3M(x) + 2(x41 + x42 + x43)
)= 0
defines a compact set So the condition i) of Theorem 33 is satisfied3 Theminimum value fmin = 1
3 and there are 8 minimizers (plusmn 1radic3 plusmn 1radic
3 plusmn 1radic
3) The
computational results for standard Lasserrersquos relaxations and the new ones (38)-(39) are in Table 2 It confirms that fk = fmin for all k ge 4 up to numericalround-off errors
Table 2. Computational results for Example 62

         |      w.o. LME           |     with LME
 order k | lower bound      time   | lower bound   time
    3    |  −∞             0.4466  |   0.1111     0.1169
    4    |  −∞             0.4948  |   0.3333     0.3499
    5    |  −2.1821·10⁵    1.1836  |   0.3333     0.6530
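The reported optimum fmin = 1/3 can be cross-checked independently of the SDP solver (this script is our own illustration, not part of the paper's MATLAB computation):

```python
import itertools, math, random

def f(x1, x2, x3):
    M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2  # Motzkin form
    return M + x1**4 + x2**4 + x3**4

# f equals 1/3 at the eight points (±1/√3, ±1/√3, ±1/√3)
s = 1.0 / math.sqrt(3.0)
vals = [f(a*s, b*s, c*s) for a, b, c in itertools.product((-1, 1), repeat=3)]
print(all(abs(v - 1.0/3.0) < 1e-12 for v in vals))  # → True

# random feasible points (x1^2 + x2^2 + x3^2 >= 1) never beat 1/3
ok = all(f(*x) >= 1.0/3.0 - 1e-9
         for x in ([random.uniform(-2, 2) for _ in range(3)] for _ in range(20000))
         if sum(v*v for v in x) >= 1.0)
print(ok)  # → True
```

This is consistent with Table 2, where the new relaxations reach 0.3333 at order k = 4.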
Example 63. Consider the optimization problem

min   x1x2 + x2x3 + x3x4 − 3x1x2x3x4 + (x1³ + ··· + x4³)
s.t.  x1, x2, x3, x4 ≥ 0,  1 − x1 − x2 ≥ 0,  1 − x3 − x4 ≥ 0.

The matrix polynomial L(x) is

[ 1−x1   −x2    0      0      1  1  0  0  1  0
  −x1    1−x2   0      0      1  1  0  0  1  0
  0      0      1−x3   −x4    0  0  1  1  0  1
  0      0      −x3    1−x4   0  0  1  1  0  1
  −x1    −x2    0      0      1  1  0  0  1  0
  0      0      −x3    −x4    0  0  1  1  0  1 ].
The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0 and that the minimizer is the origin. The condition ii) of Theorem 33 is satisfied, because IQ(ceq, cin) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (38)-(39) are in Table 3.
³This is because −c1²p1² ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c1(x)²p1(x)² ≥ 0} is compact.
⁴This is because 1 − x1² − x2² belongs to the quadratic module of (x1, x2, 1 − x1 − x2), and 1 − x3² − x4² belongs to the quadratic module of (x3, x4, 1 − x3 − x4). See the footnote in Example 61.
Table 3. Computational results for Example 63

         |      w.o. LME           |     with LME
 order k | lower bound      time   | lower bound    time
    3    |  −2.9·10⁻⁵      0.7335  |  −6·10⁻⁷      0.6091
    4    |  −1.4·10⁻⁵      2.5055  |  −8·10⁻⁸      2.7423
    5    |  −1.4·10⁻⁵     12.7092  |  −5·10⁻⁸     13.7449
Example 64. Consider the polynomial optimization problem

min_{x∈R²}   x1² + 50x2²
s.t.  x1² − 1/2 ≥ 0,  x2² − 2x1x2 − 1/8 ≥ 0,  x2² + 2x1x2 − 1/8 ≥ 0.
It is motivated by an example in [12, §3]. The first column of L(x) is

[ (8/5)x1³ + (1/5)x1
  (288/5)x1⁴x2 − (16/5)x1³ − (124/5)x1²x2 + (8/5)x1 − 2x2
  −(288/5)x1⁴x2 − (16/5)x1³ + (124/5)x1²x2 + (8/5)x1 + 2x2 ]

and the second column of L(x) is

[ −(8/5)x1²x2 + (4/5)x2³ − (1/10)x2
  (288/5)x1³x2² + (16/5)x1²x2 − (142/5)x1x2² − (9/20)x1 − (8/5)x2³ + (11/5)x2
  −(288/5)x1³x2² + (16/5)x1²x2 + (142/5)x1x2² + (9/20)x1 − (8/5)x2³ + (11/5)x2 ].
The objective is coercive, so fmin is achieved at a critical point. The minimum value fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (38)-(39) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 64

         |     w.o. LME         |     with LME
 order k | lower bound   time   | lower bound    time
    3    |    6.7535    0.4611  |   56.7500     0.1309
    4    |    6.9294    0.2428  |  112.6517     0.2405
    5    |    8.8519    0.3376  |  112.6517     0.2167
    6    |   16.5971    0.4703  |  112.6517     0.3788
    7    |   35.4756    0.6536  |  112.6517     0.4537
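The claimed minimizers and the optimal value can be double-checked directly (an independent verification of ours, not produced by GloptiPoly):

```python
import math

def feasible(x1, x2, tol=1e-9):
    return (x1**2 - 0.5 >= -tol and
            x2**2 - 2*x1*x2 - 0.125 >= -tol and
            x2**2 + 2*x1*x2 - 0.125 >= -tol)

def f(x1, x2):
    return x1**2 + 50*x2**2

a = math.sqrt(0.5)
b = math.sqrt(5.0/8.0) + math.sqrt(0.5)
fstar = 56 + 0.75 + 25*math.sqrt(5.0)   # = 56 + 3/4 + 25*sqrt(5)
for x1, x2 in [(a, b), (-a, -b), (a, -b), (-a, b)]:
    assert feasible(x1, x2) and abs(f(x1, x2) - fstar) < 1e-9
print(round(fstar, 4))  # → 112.6517
```

The value matches the 112.6517 reported in Table 4.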
Example 65. Consider the optimization problem

min_{x∈R³}   x1³ + x2³ + x3³ + 4x1x2x3 − ( x1(x2² + x3²) + x2(x3² + x1²) + x3(x1² + x2²) )
s.t.  x1 ≥ 0,  x1x2 − 1 ≥ 0,  x2x3 − 1 ≥ 0.

The matrix polynomial L(x) is

[ 1−x1x2   0    0   x2   x2   0
  x1       0    0   −1   −1   0
  −x1      x2   0   1    0    −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R³₊, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (38)-(39) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 65

         |      w.o. LME           |     with LME
 order k | lower bound      time   | lower bound   time
    2    |  −∞             0.4129  |  −∞          0.1900
    3    |  −7.8184·10⁶    0.4641  |   0.9492     0.3139
    4    |  −2.0575·10⁴    0.6499  |   0.9492     0.5057
Example 66. Consider the optimization problem (x0 = 1)

min_{x∈R⁴}   xᵀx + Σ_{i=0}^4 Π_{j≠i} (xi − xj)
s.t.  x1² − 1 ≥ 0,  x2² − 1 ≥ 0,  x3² − 1 ≥ 0,  x4² − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I4 ]. The first part of the objective is xᵀx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

(1, 1, 1, 1),  (1, −1, −1, 1),  (1, −1, 1, −1),  (1, 1, −1, −1),
(1, −1, −1, −1),  (−1, −1, 1, 1),  (−1, 1, −1, 1),  (−1, 1, 1, −1),
(−1, −1, −1, 1),  (−1, −1, 1, −1),  (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (38)-(39) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
The computational results for standard Lasserrersquos relaxations and the new ones(38)-(39) are in Table 6 It confirms that fk = fmin for all k ge 4 up to numericalround-off errors
Table 6 Computational results for Example 66
order kwo LME with LME
lower bound time lower bound time3 minusinfin 11377 35480 117654 minus66913 middot 104 47677 40000 307615 minus213778 229970 40000 103354
Example 67. Consider the optimization problem

min_{x∈R³}   x1⁴x2² + x2⁴x3² + x3⁴x1² − 3x1²x2²x3² + x2²
s.t.  x1 − x2x3 ≥ 0,  −x2 + x3² ≥ 0.

The matrix polynomial L(x) = [ 1    0   0  0  0
                               −x3  −1  0  0  0 ].

By the arithmetic-geometric mean inequality, x1⁴x2² + x2⁴x3² + x3⁴x1² ≥ 3x1²x2²x3², so f ≥ x2² ≥ 0, and one can show that fmin = 0. The global minimizers are (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0. The computational results for the standard Lasserre
relaxations and the new ones (38)-(39) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 67

         |      w.o. LME           |     with LME
 order k | lower bound      time   | lower bound     time
    3    |  −∞             0.6144  |  −∞            0.3418
    4    |  −1.0909·10⁷    1.0542  |  −3.9476       0.7180
    5    |  −942.6772      1.6771  |  −3·10⁻⁹       1.4607
    6    |  −0.0110        3.3532  |  −8·10⁻¹⁰      3.1618
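The value fmin = 0 and the structure of the minimizers can be probed numerically (our own check, not from the paper): points (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0 are feasible with f = 0, and the AM-GM bound f ≥ x2² ≥ 0 holds at random samples:

```python
import random

def f(x1, x2, x3):
    return x1**4*x2**2 + x2**4*x3**2 + x3**4*x1**2 - 3*x1**2*x2**2*x3**2 + x2**2

def feasible(x1, x2, x3):
    return x1 - x2*x3 >= 0 and -x2 + x3**2 >= 0

# minimizers (x1, 0, x3) with x1 >= 0 and x1*x3 = 0
assert f(0.7, 0.0, 0.0) == 0.0 and feasible(0.7, 0.0, 0.0)
assert f(0.0, 0.0, 1.3) == 0.0 and feasible(0.0, 0.0, 1.3)

# AM-GM gives f >= x2^2 >= 0 for all real x, up to round-off
ok = all(f(*[random.uniform(-2, 2) for _ in range(3)]) > -1e-9
         for _ in range(20000))
print(ok)  # → True
```

This is consistent with the "with LME" bounds in Table 7 approaching 0 from below.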
Example 68. Consider the optimization problem

min_{x∈R⁴}   x1²(x1 − x4)² + x2²(x2 − x4)² + x3²(x3 − x4)²
             + 2x1x2x3(x1 + x2 + x3 − 2x4) + (x1 − 1)² + (x2 − 1)² + (x3 − 1)²
s.t.  x1 − x2 ≥ 0,  x2 − x3 ≥ 0.

The matrix polynomial L(x) = [ 1  0  0  0  0  0
                               1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (38)-(39) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 68

         |      w.o. LME            |     with LME
 order k | lower bound      time    | lower bound    time
    2    |  −∞              0.3984  |  −0.3360      0.9321
    3    |  −∞              0.7634  |   0.9413      0.5240
    4    |  −6.4896·10⁵     4.5496  |   0.9413      1.7192
    5    |  −3.1645·10³    24.3665  |   0.9413      8.1228
Example 69. Consider the optimization problem

min_{x∈R⁴}   (x1 + x2 + x3 + x4 + 1)² − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
s.t.  0 ≤ x1, . . . , x4 ≤ 1.

The matrix L(x) is given in Example 43. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 33 is satisfied.⁵ The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (38)-(39) are in Table 9.
⁵Note that 4 − Σ_{i=1}^4 xi² = Σ_{i=1}^4 ( xi(1 − xi)² + (1 − xi)(1 + xi²) ) ∈ IQ(ceq, cin).
Table 9. Computational results for Example 69

       |     w.o. LME          |     with LME
 order | lower bound    time   | lower bound    time
   2   |  −0.0279      0.2262  |  −5·10⁻⁶      1.1835
   3   |  −0.0005      0.4691  |  −6·10⁻⁷      1.6566
   4   |  −0.0001      3.1098  |  −2·10⁻⁷      5.5234
   5   |  −4·10⁻⁵     16.5092  |  −6·10⁻⁷     19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast, e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (38)-(39) have the same convergence property, but they might take more computational time. The following is such a comparison.
Example 610. Consider the optimization problem (x0 = 1)

min_{x∈Rⁿ}   Σ_{0≤i≤j≤k≤n} cijk xi xj xk
s.t.  0 ≤ x ≤ 1,

where each coefficient cijk is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 43. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 33 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (38)-(39). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (38)-(39). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (38)-(39) are tight for the order k = 2, while their time is a bit different. We can observe that (38)-(39) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 610

     n     |    9       10       11        12        13        14
 w.o. LME  | 1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
 with LME  | 1.9714   3.8288   8.2519   20.0310   37.6373   82.4778
7 Discussions

71 Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (11) is equivalent to (37). We proposed the relaxations (38)-(39) for solving (37). Note that

IQ(ceq, cin)2k + IQ(φ, ψ)2k = Ideal(ceq, φ)2k + Qmod(cin, ψ)2k.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, . . . , gℓ). Its preordering is the set

Preord(cin, ψ) = Σ_{r1,...,rℓ∈{0,1}} g1^{r1} ··· gℓ^{rℓ} · Σ[x].
The truncation Preord(cin, ψ)2k is defined similarly to Qmod(cin, ψ)2k in Section 2. A tighter relaxation than (38), of the same order k, is

(71)   f′pre,k = min  ⟨f, y⟩
       s.t.  ⟨1, y⟩ = 1,  L(k)ceq(y) = 0,  L(k)φ(y) = 0,
             L(k)_{g1^{r1}···gℓ^{rℓ}}(y) ⪰ 0  for all r1, . . . , rℓ ∈ {0, 1},
             y ∈ R^{N^n_{2k}}.
Similar to (39), the dual optimization problem of the above is

(72)   fpre,k = max  γ
       s.t.  f − γ ∈ Ideal(ceq, φ)2k + Preord(cin, ψ)2k.
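The 2^ℓ products g1^{r1} ··· gℓ^{rℓ} appearing in (71) can be enumerated mechanically. A small sketch (illustrative only; the helper name is ours), representing polynomials as callables:

```python
from itertools import product

def preordering_products(gs):
    """All 2^l products g1^r1 * ... * gl^rl for r in {0,1}^l, as callables.
    r = (0,...,0) gives the constant-one polynomial."""
    def make(r):
        def p(x):
            v = 1.0
            for ri, g in zip(r, gs):
                if ri:
                    v *= g(x)
            return v
        return p
    return [make(r) for r in product((0, 1), repeat=len(gs))]

# example with l = 2 constraints: g1 = 1 - x1^2 - x2^2, g2 = x1
g1 = lambda x: 1 - x[0]**2 - x[1]**2
g2 = lambda x: x[0]
prods = preordering_products([g1, g2])
print(len(prods))                      # → 4
print([p((0.5, 0.0)) for p in prods])  # → [1.0, 0.5, 0.75, 0.375]
```

With ℓ inequality constraints, (71) thus involves 2^ℓ localizing matrix conditions, which is why preordering-based relaxations are more expensive than quadratic-module ones.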
An attractive property of the relaxations (71)-(72) is that the conclusion of Theorem 33 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 71. Suppose Kc ≠ ∅ and Assumption 31 holds. Then

fpre,k = f′pre,k = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (11) is achieved at a critical point, then fpre,k = f′pre,k = fmin for all k big enough.
Proof. The proof is very similar to that of Case III of Theorem 33. Follow the same argument there. Without Assumption 32, we still have f(x) ≡ 0 on the set

K3 = { x ∈ Rⁿ | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same. □
72 Singular constraining polynomials. As shown in Proposition 52, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = Im. Hence, the Lagrange multiplier λ can be expressed as in (53). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for the critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
73 Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 52? When c is linear, a degree bound is given in Proposition 41. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 52, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If, for each usage, the degree bound in [16] is applied, then a degree bound for L(x) can eventually be obtained. However, the bound obtained in this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
74 Rational representation of Lagrange multipliers. In (51), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

λ = ( C(x)ᵀC(x) )⁻¹ C(x)ᵀ [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
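For a single sphere constraint, this formula is easy to exercise by hand (an illustrative check of ours, not from the paper): with c1 = 1 − xᵀx, we have C(x) = [∇c1(x); c1(x)] = [−2x; 1 − xᵀx], and at a critical point the least-squares value agrees with the polynomial expression λ1 = −(1/2)xᵀ∇f(x) from (35):

```python
def lambda_rational(x, grad):
    # C(x) = [-2x ; 1 - x^T x] (a single column); λ1 = (C^T C)^{-1} C^T [∇f; 0]
    col = [-2.0 * v for v in x] + [1.0 - sum(v * v for v in x)]
    rhs = list(grad) + [0.0]
    num = sum(c * r for c, r in zip(col, rhs))
    den = sum(c * c for c in col)
    return num / den

def lambda_poly(x, grad):
    return -0.5 * sum(v * g for v, g in zip(x, grad))  # expression (35)

# f = x1^2 on the unit circle; u = (1, 0) is a critical point with ∇f(u) = (2, 0)
u, grad = (1.0, 0.0), (2.0, 0.0)
print(lambda_rational(u, grad), lambda_poly(u, grad))  # → -1.0 -1.0
```

Both computations give λ1 = −1, matching ∇f(u) = λ1∇c1(u) at u.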
Acknowledgement The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973 He would like to thank the anonymous referees forvaluable comments on improving the paper
References
[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput. 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties and algorithms: an introduction to computational algebraic geometry and commutative algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim. 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming 106 (2006), no. 1, pp. 93–109.
[16] J. Kollar. Sharp effective Nullstellensatz. J. Amer. Math. Soc. 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim. 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim. 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J. 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim. 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim. 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093
E-mail address: njw@math.ucsd.edu
a block diagonal matrix. For the polynomial tuples h, g as above, the set

(27)   S(h, g)2k = { y ∈ R^{N^n_{2k}} | L(k)h(y) = 0, L(k)g(y) ⪰ 0 }

is a spectrahedral cone in R^{N^n_{2k}}. The set IQ(h, g)2k is also a convex cone in R[x]2k. The dual cone of IQ(h, g)2k is precisely S(h, g)2k [22, 25, 34]. This is because ⟨p, y⟩ ≥ 0 for all p ∈ IQ(h, g)2k and for all y ∈ S(h, g)2k.
3 The construction of tight relaxations

Consider the polynomial optimization problem (11). Let

λ = (λi)_{i∈E∪I}

be the vector of Lagrange multipliers. Denote the set

(31)   K = { (x, λ) ∈ Rⁿ × R^{E∪I} |  ci(x) = 0 (i ∈ E),  λjcj(x) = 0 (j ∈ I),
                                       ∇f(x) = Σ_{i∈E∪I} λi∇ci(x) }.

Each point in K is called a critical pair. The projection

(32)   Kc = { u | (u, λ) ∈ K }

is the set of all real critical points. To construct tight relaxations for solving (11), we need the following assumption on the Lagrange multipliers.
Assumption 31. For each i ∈ E ∪ I, there exists a polynomial pi ∈ R[x] such that for all (x, λ) ∈ K it holds that

λi = pi(x).

Assumption 31 is generically satisfied, as shown in Proposition 57. For the following special cases, we can get the polynomials pi explicitly.
• (Simplex) For the simplex {eᵀx − 1 = 0, x1 ≥ 0, . . . , xn ≥ 0}, it corresponds to E = {0}, I = [n], c0(x) = eᵀx − 1, cj(x) = xj (j ∈ [n]). The Lagrange multipliers can be expressed as

(33)   λ0 = xᵀ∇f(x),   λj = fxj − xᵀ∇f(x)  (j ∈ [n]).

• (Hypercube) For the hypercube [−1, 1]ⁿ, it corresponds to E = ∅, I = [n], and each cj(x) = 1 − xj². We can show that

(34)   λj = −(1/2)xj fxj  (j ∈ [n]).
• (Ball or sphere) The constraint is 1 − xᵀx = 0 or 1 − xᵀx ≥ 0. It corresponds to E ∪ I = {1} and c1 = 1 − xᵀx. We have

(35)   λ1 = −(1/2)xᵀ∇f(x).
• (Triangular constraints) Suppose E ∪ I = {1, . . . , m} and each

ci(x) = τixi + qi(xi+1, . . . , xn)

for some polynomial qi ∈ R[xi+1, . . . , xn] and scalar τi ≠ 0. The matrix T(x), consisting of the first m rows of [∇c1(x) ··· ∇cm(x)], is an invertible lower triangular matrix with constant diagonal entries. Then

λ = T(x)⁻¹ · [ fx1 ··· fxm ]ᵀ.

Note that the inverse T(x)⁻¹ is a matrix polynomial.
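Substituting (33) into ∇f(x) = λ0∇c0(x) + Σj λj∇cj(x) gives an identity that holds for every x, which can be checked numerically. The sketch below is our own illustration, with an arbitrarily chosen f and a hand-coded gradient:

```python
import random

# An arbitrary test polynomial (our choice): f = x1^2*x2 + 3*x2*x3 - x3^3 + x1
def f_grad(x):
    x1, x2, x3 = x
    return [2*x1*x2 + 1.0, x1**2 + 3*x3, 3*x2 - 3*x3**2]

def stationarity_residual(x):
    g = f_grad(x)
    xg = sum(xi * gi for xi, gi in zip(x, g))  # x^T ∇f(x)
    lam0 = xg                                  # λ0 from (33)
    lam = [gj - xg for gj in g]                # λj from (33)
    # ∇c0 = e (all-ones vector), ∇cj = e_j; residual of ∇f = λ0 e + Σ λj e_j
    return max(abs(g[j] - lam0 - lam[j]) for j in range(3))

x = [random.uniform(-1, 1) for _ in range(3)]
print(stationarity_residual(x))  # → 0.0
```

The residual is zero for any x, confirming that (33) solves the stationarity equation identically; on K, the complementarity conditions then make these valid multiplier expressions.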
For more general constraints, we can also express λ as a polynomial function in x on the set Kc. This will be discussed in §4 and §5.
For the polynomials pi as in Assumption 31, denote

(36)   φ = ( ∇f − Σ_{i∈E∪I} pi∇ci,  (pjcj)_{j∈I} ),   ψ = (pj)_{j∈I}.
When the minimum value fmin of (11) is achieved at a critical point, (11) is equivalent to the problem

(37)   fc = min  f(x)
       s.t.  ceq(x) = 0,  cin(x) ≥ 0,
             φ(x) = 0,  ψ(x) ≥ 0.
We apply Lasserre relaxations to solve it. For an integer k > 0 (called the relaxation order), the kth order Lasserre relaxation for (37) is

(38)   f′k = min  ⟨f, y⟩
       s.t.  ⟨1, y⟩ = 1,  Mk(y) ⪰ 0,
             L(k)ceq(y) = 0,  L(k)cin(y) ⪰ 0,
             L(k)φ(y) = 0,  L(k)ψ(y) ⪰ 0,  y ∈ R^{N^n_{2k}}.
Since x0 = 1 (the constant one polynomial), the condition ⟨1, y⟩ = 1 means that (y)0 = 1. The dual optimization problem of (38) is

(39)   fk = max  γ
       s.t.  f − γ ∈ IQ(ceq, cin)2k + IQ(φ, ψ)2k.

We refer to §2 for the notation used in (38)-(39). They are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For k = 1, 2, . . . , we get a hierarchy of Lasserre relaxations. If we remove the usage of φ and ψ in (38)-(39), they reduce to the standard Lasserre relaxations in [17]. So (38)-(39) are stronger relaxations.
By the construction of φ as in (36) Assumption 31 implies that
Kc = u isin Rn ceq(u) = 0 φ(u) = 0
By Lemma 33 of [6] f achieves only finitely many values on Kc say(310) v1 lt middot middot middot lt vN
A point u ∈ Kc might not be feasible for (3.7), i.e., it is possible that cin(u) ≱ 0 or ψ(u) ≱ 0. In applications, we are often interested in the optimal value fc of (3.7). When (3.7) is infeasible, we set fc = +∞ by convention.
When the optimal value fmin of (1.1) is achieved at a critical point, fc = fmin. This is the case if the feasible set is compact, or if f is coercive (i.e., for each ℓ, the sublevel set {f(x) ≤ ℓ} is compact) and the constraint qualification condition holds. As in [17], one can show that

    (3.11)    f_k ≤ f′_k ≤ fc

for all k. Moreover, both f_k and f′_k are monotonically increasing. If for some order k it occurs that

    f_k = f′_k = fc,

then the kth order Lasserre relaxation is said to be tight (or exact).
3.1. Tightness of the relaxations. Let cin, ψ, Kc, fc be as above. We refer to §2 for the notation Qmod(cin, ψ). We begin with a general assumption.

Assumption 3.2. There exists ρ ∈ Qmod(cin, ψ) such that if u ∈ Kc and f(u) < fc, then ρ(u) < 0.
In Assumption 3.2, the hypersurface ρ(x) = 0 separates feasible and infeasible critical points: if u ∈ Kc is a feasible point of (3.7), then cin(u) ≥ 0 and ψ(u) ≥ 0, and hence ρ(u) ≥ 0. Assumption 3.2 generally holds. For instance, it is satisfied in the following cases.
a) When there are no inequality constraints, cin and ψ are empty tuples. Then Qmod(cin, ψ) = Σ[x], and Assumption 3.2 is satisfied with ρ = 0.
b) Suppose the set Kc is finite, say Kc = {u₁, …, u_D}, and

    f(u₁), …, f(u_{t−1}) < fc ≤ f(u_t), …, f(u_D).

Let ℓ₁, …, ℓ_D be real interpolating polynomials such that ℓᵢ(uⱼ) = 1 for i = j and ℓᵢ(uⱼ) = 0 for i ≠ j. For each i = 1, …, t−1, there must exist jᵢ ∈ I such that c_{jᵢ}(uᵢ) < 0. Then the polynomial

    (3.12)    ρ = ∑_{i<t} (−1/c_{jᵢ}(uᵢ)) c_{jᵢ}(x) ℓᵢ(x)² + ∑_{i≥t} ℓᵢ(x)²

satisfies Assumption 3.2.
c) For each x with f(x) = vᵢ < fc, at least one of the constraints cⱼ(x) ≥ 0, pⱼ(x) ≥ 0 (j ∈ I) is violated. Suppose for each critical value vᵢ < fc there exists gᵢ ∈ {cⱼ, pⱼ}_{j∈I} such that

    gᵢ < 0 on Kc ∩ {f(x) = vᵢ}.

Let ϕ₁, …, ϕ_N be real univariate polynomials such that ϕᵢ(vⱼ) = 0 for i ≠ j and ϕᵢ(vⱼ) = 1 for i = j. Suppose vₜ = fc. Then the polynomial

    (3.13)    ρ = ∑_{i<t} gᵢ(x)(ϕᵢ(f(x)))² + ∑_{i≥t} (ϕᵢ(f(x)))²

satisfies Assumption 3.2.
We refer to §2 for the archimedean condition and the notation IQ(h, g) as in (2.1). The following theorem concerns the convergence of the relaxations (3.8)-(3.9).
Theorem 3.3. Suppose Kc ≠ ∅ and Assumption 3.1 holds. If
i) IQ(ceq, cin) + IQ(φ, ψ) is archimedean, or
ii) IQ(ceq, cin) is archimedean, or
iii) Assumption 3.2 holds,
then f_k = f′_k = fc for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_k = f′_k = fmin for all k big enough, provided one of the conditions i)-iii) is satisfied.
Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. Condition ii) involves only the constraining polynomials of (1.1); it can be checked without φ, ψ. Clearly, condition ii) implies condition i).
The proof of Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties, and the objective f is a constant on each of them. We get an SOS type representation for f on each subvariety and then construct a single one for f over the entire set of critical points.
Proof of Theorem 3.3. Clearly, every point in the complex variety

    K₁ = {x ∈ ℂⁿ | ceq(x) = 0, φ(x) = 0}

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on Kc = K₁ ∩ ℝⁿ, say v₁ < ⋯ < v_N. Up to shifting f by a constant, we can further assume that fc = 0. Clearly, fc equals one of v₁, …, v_N, say vₜ = fc = 0.
Case I: Assume IQ(ceq, cin) + IQ(φ, ψ) is archimedean. Let

    I = Ideal(ceq, φ)

be the critical ideal. Note that K₁ = V_ℂ(I). The variety V_ℂ(I) is a union of irreducible subvarieties, say V₁, …, V_ℓ. If Vᵢ ∩ ℝⁿ ≠ ∅, then f is a real constant on Vᵢ, which equals one of v₁, …, v_N. This is implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_ℂ(I)

    Tᵢ = K₁ ∩ {f(x) = vᵢ}   (i = t, …, N).

Let T_{t−1} be the union of the irreducible subvarieties Vᵢ such that either Vᵢ ∩ ℝⁿ = ∅ or f ≡ vⱼ on Vᵢ with vⱼ < vₜ = fc. Then it holds that

    V_ℂ(I) = T_{t−1} ∪ Tₜ ∪ ⋯ ∪ T_N.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, Iₜ, …, I_N ⊆ ℝ[x] such that

    I = I_{t−1} ∩ Iₜ ∩ ⋯ ∩ I_N

and Tᵢ = V_ℂ(Iᵢ) for all i = t−1, t, …, N. Denote the semialgebraic set

    (3.14)    S = {x ∈ ℝⁿ | cin(x) ≥ 0, ψ(x) ≥ 0}.

For i = t−1, we have V_ℝ(I_{t−1}) ∩ S = ∅, because v₁, …, v_{t−1} < fc. By the Positivstellensatz [2, Corollary 4.4.3], there exists p₀ ∈ Preord(cin, ψ)¹ satisfying 2 + p₀ ∈ I_{t−1}. Note that 1 + p₀ > 0 on V_ℝ(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(cin, ψ) is archimedean, because I ⊆ I_{t−1} and

    IQ(ceq, cin) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(cin, ψ).

By Theorem 2.1, we have

    p₁ := 1 + p₀ ∈ I_{t−1} + Qmod(cin, ψ).

Then 1 + p₁ ∈ I_{t−1}, so there exists p₂ ∈ Qmod(cin, ψ) such that

    −1 ≡ p₁ ≡ p₂   mod I_{t−1}.

Since f = (f/4 + 1)² + (−1)·(f/4 − 1)², we have

    f ≡ σ_{t−1} := (f/4 + 1)² + p₂(f/4 − 1)²   mod I_{t−1}.

So, when k is big enough, σ_{t−1} ∈ Qmod(cin, ψ)_{2k}.
¹It is the preordering of the polynomial tuple (cin, ψ); see §7.1.
For i = t, we have vₜ = 0 and f(x) vanishes on V_ℂ(Iₜ). By Hilbert's Strong Nullstellensatz [4], there exists an integer mₜ > 0 such that f^{mₜ} ∈ Iₜ. Define the polynomial

    sₜ(ε) := √ε ∑_{j=0}^{mₜ−1} (½ choose j) ε^{−j} f^j,

where (½ choose j) denotes the binomial coefficient. Then we have that

    D₁ := ε(1 + ε^{−1}f) − (sₜ(ε))² ≡ 0   mod Iₜ.

This is because, after expanding (sₜ(ε))² in the subtraction D₁, all the terms f^j with j < mₜ are cancelled, and f^j ∈ Iₜ for j ≥ mₜ. So D₁ ∈ Iₜ. Let σₜ(ε) := (sₜ(ε))²; then f + ε − σₜ(ε) = D₁ and

    (3.15)    f + ε − σₜ(ε) = ∑_{j=0}^{mₜ−2} bⱼ(ε) f^{mₜ+j}

for some real scalars bⱼ(ε) depending on ε.
For each i = t+1, …, N, we have vᵢ > 0, and f(x)/vᵢ − 1 vanishes on V_ℂ(Iᵢ). By Hilbert's Strong Nullstellensatz [4], there exists an integer mᵢ > 0 such that (f/vᵢ − 1)^{mᵢ} ∈ Iᵢ. Let

    sᵢ := √vᵢ ∑_{j=0}^{mᵢ−1} (½ choose j) (f/vᵢ − 1)^j.

As in the case i = t, we can similarly show that f − sᵢ² ∈ Iᵢ. Let σᵢ := sᵢ²; then f − σᵢ ∈ Iᵢ.
Note that V_ℂ(Iᵢ) ∩ V_ℂ(Iⱼ) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, …, a_N ∈ ℝ[x] such that

    a_{t−1}² + ⋯ + a_N² − 1 ∈ I,    aᵢ ∈ ⋂_{j≠i, j∈{t−1,…,N}} Iⱼ.

For ε > 0, denote the polynomial

    σ_ε := σₜ(ε)aₜ² + ∑_{t≠j∈{t−1,…,N}} (σⱼ + ε)aⱼ²;

then

    f + ε − σ_ε = (f + ε)(1 − a_{t−1}² − ⋯ − a_N²) + ∑_{t≠i∈{t−1,…,N}} (f − σᵢ)aᵢ² + (f + ε − σₜ(ε))aₜ².
For each i ≠ t, f − σᵢ ∈ Iᵢ, so

    (f − σᵢ)aᵢ² ∈ ⋂_{j=t−1}^{N} Iⱼ = I.

Hence there exists k₁ > 0 such that

    (f − σᵢ)aᵢ² ∈ I_{2k₁}   (t ≠ i ∈ {t−1, …, N}).

Since f + ε − σₜ(ε) ∈ Iₜ, we also have

    (f + ε − σₜ(ε))aₜ² ∈ ⋂_{j=t−1}^{N} Iⱼ = I.

Moreover, by the equation (3.15),

    (f + ε − σₜ(ε))aₜ² = ∑_{j=0}^{mₜ−2} bⱼ(ε) f^{mₜ+j} aₜ².
Each f^{mₜ+j}aₜ² ∈ I, since f^{mₜ+j} ∈ Iₜ. So there exists k₂ > 0 such that, for all ε > 0,

    (f + ε − σₜ(ε))aₜ² ∈ I_{2k₂}.

Since 1 − a_{t−1}² − ⋯ − a_N² ∈ I, there also exists k₃ > 0 such that, for all ε > 0,

    (f + ε)(1 − a_{t−1}² − ⋯ − a_N²) ∈ I_{2k₃}.

Hence, if k* ≥ max{k₁, k₂, k₃}, then we have

    f(x) + ε − σ_ε ∈ I_{2k*}

for all ε > 0. By the construction, the degrees of all σᵢ and aᵢ are independent of ε, so σ_ε ∈ Qmod(cin, ψ)_{2k*} for all ε > 0 if k* is big enough. Note that

    I_{2k*} + Qmod(cin, ψ)_{2k*} = IQ(ceq, cin)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ fc − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ fc, so f_{k*} = fc. Moreover, since f_k is monotonically increasing, we must have f_k = fc for all k ≥ k*.
Case II: Assume IQ(ceq, cin) is archimedean. Because

    IQ(ceq, cin) ⊆ IQ(ceq, cin) + IQ(φ, ψ),

the set IQ(ceq, cin) + IQ(φ, ψ) is also archimedean. Therefore the conclusion follows by applying the result for Case I.
Case III: Suppose Assumption 3.2 holds. Let ϕ₁, …, ϕ_N be real univariate polynomials such that ϕᵢ(vⱼ) = 0 for i ≠ j and ϕᵢ(vⱼ) = 1 for i = j. Let

    s := sₜ + ⋯ + s_N,   where each sᵢ = (vᵢ − fc)(ϕᵢ(f))².

Then s ∈ Σ[x]_{2k₄} for some integer k₄ > 0. Let

    f̃ := f − fc − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(cin, ψ) such that

    f̃^{2ℓ} + q ∈ Ideal(ceq, φ).

This is because, by Assumption 3.2, f̃(x) ≡ 0 on the set

    K₂ = {x ∈ ℝⁿ : ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}.

The description of K₂ has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ ℕ and q = b₀ + ρb₁ (b₀, b₁ ∈ Σ[x]) such that f̃^{2ℓ} + q ∈ Ideal(ceq, φ). By Assumption 3.2, ρ ∈ Qmod(cin, ψ), so we have q ∈ Qmod(cin, ψ).
For all ε > 0 and τ > 0, we have f̃ + ε = φ_ε + θ_ε, where

    φ_ε := −τε^{1−2ℓ}(f̃^{2ℓ} + q),    θ_ε := ε(1 + f̃/ε + τ(f̃/ε)^{2ℓ}) + τε^{1−2ℓ}q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k₅ such that, for all ε > 0,

    φ_ε ∈ Ideal(ceq, φ)_{2k₅},    θ_ε ∈ Qmod(cin, ψ)_{2k₅}.

Hence we can get

    f − (fc − ε) = φ_ε + σ_ε,

where σ_ε := θ_ε + s ∈ Qmod(cin, ψ)_{2k₅} for all ε > 0. Note that

    IQ(ceq, cin)_{2k₅} + IQ(φ, ψ)_{2k₅} = Ideal(ceq, φ)_{2k₅} + Qmod(cin, ψ)_{2k₅}.

For all ε > 0, γ = fc − ε is feasible in (3.9) for the order k₅, so f_{k₅} ≥ fc. Because of (3.11) and the monotonicity of f_k, we have f_k = f′_k = fc for all k ≥ k₅. □
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is fc, and the optimal value of (1.1) is fmin. If fmin is achievable at a critical point, then fc = fmin. In Theorem 3.3 we have shown that f_k = fc for all k big enough, where f_k is the optimal value of (3.9). The value fc or fmin is often not known. How do we detect the tightness f_k = fc in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

    (3.16)    d := ⌈deg(ceq, cin, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

    (3.17)    rank Mₜ(y*) = rank M_{t−d}(y*),

then f_k = fc and we can get r := rank Mₜ(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers; it is implemented in the software GloptiPoly 3 [13]. Generally, (3.17) serves as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
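The rank condition above is easy to test numerically. For the truncated moment sequence of a single point evaluation y = [u]_{2k}, every moment matrix is rank one, so flat truncation holds trivially and the single atom u is recovered. A small illustrative sketch (the helper names are not from the paper; n = 2):

```python
import itertools
import numpy as np

def monomials(n, t):
    # exponent tuples of total degree <= t
    return [a for a in itertools.product(range(t + 1), repeat=n) if sum(a) <= t]

def moment_matrix(u, t):
    # M_t(y) for the point evaluation y = [u]_{2t}: entries y_{a+b} = u^(a+b)
    basis = monomials(len(u), t)
    val = lambda a: np.prod([ui ** ai for ui, ai in zip(u, a)])
    return np.array([[val(tuple(p + q for p, q in zip(a, b))) for b in basis]
                     for a in basis])

u = np.array([0.5, -1.0])
r2 = np.linalg.matrix_rank(moment_matrix(u, 2), tol=1e-8)
r1 = np.linalg.matrix_rank(moment_matrix(u, 1), tol=1e-8)
print(r2, r1)   # 1 1: the flat truncation condition holds with one atom
```

In practice the moment matrices come from an SDP solver, and the numerical rank is evaluated with a tolerance as above.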
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:
i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.
In the following, assume (3.7) is feasible (i.e., fc < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:
iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = fc.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17) for some t ∈ [d, k] when k is big enough.
Proof. By Assumption 3.1, u is a critical point if and only if ceq(u) = 0, φ(u) = 0.
i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8) for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.
ii) By Assumption 3.2, when (3.7) is infeasible, the set

    {x ∈ ℝⁿ : ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(ceq, φ) + Qmod(ρ). By Assumption 3.2,

    Ideal(ceq, φ) + Qmod(ρ) ⊆ IQ(ceq, cin) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence (3.8) must be infeasible, by weak duality.
When (3.7) is feasible, f achieves finitely many values on Kc, so (3.7) must achieve its optimal value fc. By Theorem 3.3, we know that f_k = f′_k = fc for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).
iii) If (3.17) holds, we can get r := rank Mₜ(y*) minimizers for (3.7) [5, 14], say u₁, …, u_r, such that f_k = f(uᵢ) for each i. Clearly, f_k = f(uᵢ) ≥ fc. On the other hand, we always have f_k ≤ fc. So f_k = fc.
iv) By Assumption 3.2, (3.7) is equivalent to the problem

    (3.18)    min f(x)
              s.t. ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also fc. Its kth order Lasserre relaxation is

    (3.19)    γ′_k := min ⟨f, y⟩
              s.t. ⟨1, y⟩ = 1, M_k(y) ⪰ 0,
                   L^(k)_{ceq}(y) = 0, L^(k)_φ(y) = 0, L^(k)_ρ(y) ⪰ 0.

Its dual optimization problem is

    (3.20)    γ_k := max γ
              s.t. f − γ ∈ Ideal(ceq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the proof for Theorem 3.3(iii), we can show that

    γ_k = γ′_k = fc

for all k big enough. Because ρ ∈ Qmod(cin, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big enough, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k] when k is big enough. □
If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied; we refer to [25, §6.6].
4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

    P := {x ∈ ℝⁿ | Ax − b ≥ 0},

where A = [a₁ ⋯ a_m]ᵀ ∈ ℝ^{m×n} and b = [b₁ ⋯ b_m]ᵀ ∈ ℝ^m. This corresponds to E = ∅, I = [m], and each cᵢ(x) = aᵢᵀx − bᵢ. Denote

    (4.1)    D(x) := diag(c₁(x), …, c_m(x)),    C(x) := [ Aᵀ ; D(x) ],

where [ Aᵀ ; D(x) ] stacks Aᵀ on top of D(x).
The Lagrange multiplier vector λ = [λ₁ ⋯ λ_m]ᵀ satisfies

    (4.2)    [ Aᵀ ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

    (4.3)    λ = (AAᵀ)⁻¹A∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ ℝ[x]^{m×(n+m)} such that

    (4.4)    L(x)C(x) = I_m,

then we can get

    λ = L(x) [ ∇f(x) ; 0 ] = L₁(x)∇f(x),

where L₁(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists and give a degree bound for it.
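The full-row-rank case can be checked numerically: if ∇f(x) lies in the row space of A (as it does at a critical point), then λ = (AAᵀ)⁻¹A∇f(x) is the unique solution of Aᵀλ = ∇f(x). An illustrative sketch with random data (variable names are assumptions, not from the paper):

```python
import numpy as np

np.random.seed(1)
m, n = 3, 5
A = np.random.randn(m, n)              # generically, rank A = m (full row rank)
lam_true = np.random.randn(m)
g = A.T @ lam_true                     # a "gradient" lying in the row space of A
lam = np.linalg.solve(A @ A.T, A @ g)  # lam = (A A^T)^{-1} A g
print(np.allclose(lam, lam_true))      # True: the multiplier is recovered uniquely
```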
The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ ℂⁿ (see also Definition 5.1). Equivalently, for every u, if J(u) = {i₁, …, i_k} (see (1.2) for the notation), then a_{i₁}, …, a_{i_k} are linearly independent.
Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.
Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.
Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ ℝ[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.
For a subset I = {i₁, …, i_{m−n}} of [m], denote

    c_I(x) := ∏_{i∈I} cᵢ(x),    E_I(x) := c_I(x) · diag(c_{i₁}(x)⁻¹, …, c_{i_{m−n}}(x)⁻¹),
    D_I(x) := diag(c_{i₁}(x), …, c_{i_{m−n}}(x)),    A_I := [a_{i₁} ⋯ a_{i_{m−n}}]ᵀ.

For the case I = ∅ (the empty set), we set c_∅(x) = 1. Let

    V := {I ⊆ [m] : |I| = m − n, rank A_{[m]∖I} = n}.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

    (4.5)    L_I(x)C(x) = c_I(x)I_m.
The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix, where L_I(J, K) denotes the submatrix of L_I whose row indices are from J and whose column indices are from K:

    (4.6)
      J∖K      [n]                       n+I                              n+[m]∖I
      I        0                         E_I(x)                           0
      [m]∖I    c_I(x)·(A_{[m]∖I})⁻ᵀ     −(A_{[m]∖I})⁻ᵀ(A_I)ᵀE_I(x)       0

Equivalently, the blocks of L_I are

    L_I(I, [n]) = 0,  L_I(I, n+[m]∖I) = 0,  L_I([m]∖I, n+[m]∖I) = 0,
    L_I(I, n+I) = E_I(x),  L_I([m]∖I, [n]) = c_I(x)(A_{[m]∖I})⁻ᵀ,
    L_I([m]∖I, n+I) = −(A_{[m]∖I})⁻ᵀ(A_I)ᵀE_I(x).

For each I ∈ V, A_{[m]∖I} is invertible; the superscript −T denotes the inverse of the transpose. Let G = L_I(x)C(x); then one can verify that

    G(I, I) = E_I(x)D_I(x) = c_I(x)I_{m−n},    G(I, [m]∖I) = 0,
    G([m]∖I, [m]∖I) = [ c_I(x)(A_{[m]∖I})⁻ᵀ  −(A_{[m]∖I})⁻ᵀ(A_I)ᵀE_I(x) ] [ (A_{[m]∖I})ᵀ ; 0 ] = c_I(x)I_n,
    G([m]∖I, I) = [ c_I(x)(A_{[m]∖I})⁻ᵀ  −(A_{[m]∖I})⁻ᵀ(A_I)ᵀE_I(x) ] [ (A_I)ᵀ ; D_I(x) ] = 0.

This shows that the above L_I(x) satisfies (4.5).
Step II: We show that there exist real scalars ν_I satisfying

    (4.7)    ∑_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.
• When m = n, we have V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.
• When m > n, let

    (4.8)    N := {i ∈ [m] | rank A_{[m]∖{i}} = n}.

For each i ∈ N, let Vᵢ be the set of all I′ ⊆ [m]∖{i} such that |I′| = m−n−1 and rank A_{[m]∖(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]∖{i}}x − b_{[m]∖{i}} is nonsingular. By induction, there exist real scalars ν_{I′}^{(i)} satisfying

    (4.9)    ∑_{I′∈Vᵢ} ν_{I′}^{(i)} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a₁, …, a_n} is linearly independent. So there exist scalars α₁, …, α_n such that

    a_m = α₁a₁ + ⋯ + α_n a_n.

If all αᵢ = 0, then a_m = 0, and hence A can be replaced by its first m−1 rows; so (4.7) is true by induction. In the following, suppose at least one αᵢ ≠ 0, and write

    {i : αᵢ ≠ 0} = {i₁, …, i_k}.

Then a_{i₁}, …, a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

    c_{i₁}(x) = ⋯ = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence there exist real scalars μ₁, …, μ_{k+1} such that

    μ₁c_{i₁}(x) + ⋯ + μ_k c_{i_k}(x) + μ_{k+1}c_{i_{k+1}}(x) = 1.

This can be implied by the echelon form of inconsistent linear systems. Note that i₁, …, i_{k+1} ∈ N. For each j = 1, …, k+1, by (4.9),

    ∑_{I′∈V_{i_j}} ν_{I′}^{(i_j)} c_{I′}(x) = 1.

Then we can get

    1 = ∑_{j=1}^{k+1} μⱼ c_{i_j}(x) = ∑_{j=1}^{k+1} μⱼ ∑_{I′∈V_{i_j}} ν_{I′}^{(i_j)} c_{i_j}(x)c_{I′}(x) = ∑_{I=I′∪{i_j}, I′∈V_{i_j}, 1≤j≤k+1} ν_{I′}^{(i_j)} μⱼ c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.
Step III: For L_I(x) as in (4.5), we construct L(x) as

    (4.10)    L(x) := ∑_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

    L(x)C(x) = ∑_{I∈V} ν_I L_I(x)C(x) = ∑_{I∈V} ν_I c_I(x)I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A; sometimes its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x) with L(x)C(x) = I_m for polyhedral sets.
Example 4.2. Consider the simplicial set

    x₁ ≥ 0, …, x_n ≥ 0, 1 − eᵀx ≥ 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ 1−x₁   −x₂   ⋯   −x_n    1  ⋯  1 ]
           [ −x₁   1−x₂   ⋯   −x_n    1  ⋯  1 ]
           [  ⋮      ⋮           ⋮    ⋮      ⋮ ]
           [ −x₁   −x₂   ⋯   1−x_n    1  ⋯  1 ]
           [ −x₁   −x₂   ⋯   −x_n     1  ⋯  1 ].
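The identity L(x)C(x) = I_{n+1} for the simplicial set above can be verified symbolically. An illustrative sympy sketch (for n = 3; the construction of C follows the stacked form [Aᵀ; D(x)]):

```python
import sympy as sp

n = 3
x = sp.symbols(f'x1:{n + 1}')
c = list(x) + [1 - sum(x)]                    # simplex constraints, m = n + 1
A = sp.Matrix([[sp.diff(ci, xj) for xj in x] for ci in c])   # m x n Jacobian
C = sp.Matrix.vstack(A.T, sp.diag(*c))        # C(x) = [A^T; D(x)]
# L(x): first n columns are [I_n; 0] minus copies of x^T, last m columns are ones
xrow = sp.Matrix([list(x)])
L = sp.Matrix.hstack(
    sp.Matrix.vstack(sp.eye(n), sp.zeros(1, n)) - sp.ones(n + 1, 1) * xrow,
    sp.ones(n + 1, n + 1))
print(sp.expand(L * C) == sp.eye(n + 1))      # True
```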
Example 4.3. Consider the box constraint

    x₁ ≥ 0, …, x_n ≥ 0, 1 − x₁ ≥ 0, …, 1 − x_n ≥ 0.

The equation L(x)C(x) = I_{2n} is satisfied by

    L(x) = [ I_n − diag(x)   I_n   I_n ]
           [ −diag(x)        I_n   I_n ].
Example 4.4. Consider the polyhedral set

    1 − x₄ ≥ 0, x₄ − x₃ ≥ 0, x₃ − x₂ ≥ 0, x₂ − x₁ ≥ 0, x₁ + 1 ≥ 0.

The equation L(x)C(x) = I₅ is satisfied by

    L(x) = ½ · [ −x₁−1   −x₂−1   −x₃−1   −x₄−1    1 1 1 1 1 ]
               [ −x₁−1   −x₂−1   −x₃−1   1−x₄     1 1 1 1 1 ]
               [ −x₁−1   −x₂−1   1−x₃    1−x₄     1 1 1 1 1 ]
               [ −x₁−1   1−x₂    1−x₃    1−x₄     1 1 1 1 1 ]
               [ 1−x₁    1−x₂    1−x₃    1−x₄     1 1 1 1 1 ].
Example 4.5. Consider the polyhedral set

    1 + x₁ ≥ 0, 1 − x₁ ≥ 0, 2 − x₁ − x₂ ≥ 0, 2 − x₁ + x₂ ≥ 0.

The matrix L(x) satisfying L(x)C(x) = I₄ is

    (1/6) · [ x₁² − 3x₁ + 2      x₁x₂ − x₂          4 − x₁    2 − x₁    1 − x₁     1 − x₁  ]
            [ 3x₁² − 3x₁ − 6     3x₂ + 3x₁x₂        6 − 3x₁   −3x₁      −3x₁ − 3   −3x₁ − 3 ]
            [ 1 − x₁²            −2x₂ − x₁x₂ − 3    x₁ − 1    x₁ + 1    x₁ + 2     x₁ + 2  ]
            [ 1 − x₁²            3 − x₁x₂ − 2x₂     x₁ − 1    x₁ + 1    x₁ + 2     x₁ + 2  ].
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λᵢ as polynomial functions in x on the set of critical points.
Suppose there are totally m equality and inequality constraints, i.e.,

    E ∪ I = {1, …, m}.

If (x, λ) is a critical pair, then λᵢcᵢ(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ₁ ⋯ λ_m]ᵀ satisfies the equation

    (5.1)    C(x) λ = [ ∇f(x) ; 0 ; ⋯ ; 0 ],

where

    C(x) := [ ∇c₁(x)   ∇c₂(x)   ⋯   ∇c_m(x) ]
            [ c₁(x)     0       ⋯    0      ]
            [ 0        c₂(x)    ⋯    0      ]
            [ ⋮         ⋮       ⋱    ⋮      ]
            [ 0         0       ⋯   c_m(x)  ].
Let C(x) be as above. If there exists L(x) ∈ ℝ[x]^{m×(m+n)} such that

    (5.2)    L(x)C(x) = I_m,

then we can get

    (5.3)    λ = L(x) [ ∇f(x) ; 0 ] = L₁(x)∇f(x),

where L₁(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.
Definition 5.1. The tuple c = (c₁, …, c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ ℂⁿ.

Clearly, c being nonsingular is equivalent to: for each u ∈ ℂⁿ, if J(u) = {i₁, …, i_k} (see (1.2) for the notation), then the gradients ∇c_{i₁}(u), …, ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) Let W(x) ∈ ℂ[x]^{s×t} with s ≥ t. Then rank W(u) = t for all u ∈ ℂⁿ if and only if there exists P(x) ∈ ℂ[x]^{t×s} such that

    P(x)W(x) = I_t.

Moreover, if W(x) ∈ ℝ[x]^{s×t}, we can choose P(x) ∈ ℝ[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ ℝ[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ ℂⁿ,

    t = rank I_t ≤ rank W(u) ≤ t.

So W(x) has full column rank everywhere.
"⇒": Suppose rank W(u) = t for all u ∈ ℂⁿ. Write W(x) in columns:

    W(x) = [ w₁(x)  w₂(x)  ⋯  w_t(x) ].
Then the equation w₁(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ₁(x) ∈ ℂ[x]^s such that ξ₁(x)ᵀw₁(x) = 1. For each i = 2, …, t, denote

    r_{1i}(x) := ξ₁(x)ᵀwᵢ(x);

then (we use ∼ to denote row equivalence between matrices)

    W(x) ∼ [ 1       r₁₂(x)   ⋯   r_{1t}(x) ]
           [ w₁(x)   w₂(x)    ⋯   w_t(x)    ]
         ∼ W₁(x) = [ 1   r₁₂(x)     ⋯   r_{1t}(x)  ]
                   [ 0   w₂⁽¹⁾(x)   ⋯   w_t⁽¹⁾(x)  ],

where each (i = 2, …, t)

    wᵢ⁽¹⁾(x) = wᵢ(x) − r_{1i}(x)w₁(x).

So there exists P₁(x) ∈ ℂ[x]^{(s+1)×s} such that

    P₁(x)W(x) = W₁(x).
Since W(x) and W₁(x) are row equivalent, W₁(x) must also have full column rank everywhere. Similarly, the polynomial equation

    w₂⁽¹⁾(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ₂(x) ∈ ℂ[x]^s such that

    ξ₂(x)ᵀw₂⁽¹⁾(x) = 1.

For each i = 3, …, t, let r_{2i}(x) := ξ₂(x)ᵀwᵢ⁽¹⁾(x); then

    W₁(x) ∼ [ 1   r₁₂(x)     r₁₃(x)     ⋯   r_{1t}(x) ]
            [ 0   1          r₂₃(x)     ⋯   r_{2t}(x) ]
            [ 0   w₂⁽¹⁾(x)   w₃⁽¹⁾(x)   ⋯   w_t⁽¹⁾(x) ]
          ∼ W₂(x) = [ 1   r₁₂(x)   r₁₃(x)     ⋯   r_{1t}(x) ]
                    [ 0   1        r₂₃(x)     ⋯   r_{2t}(x) ]
                    [ 0   0        w₃⁽²⁾(x)   ⋯   w_t⁽²⁾(x) ],

where each (i = 3, …, t)

    wᵢ⁽²⁾(x) = wᵢ⁽¹⁾(x) − r_{2i}(x)w₂⁽¹⁾(x).

Similarly, W₁(x) and W₂(x) are row equivalent, so W₂(x) has full column rank everywhere. There exists P₂(x) ∈ ℂ[x]^{(s+2)×(s+1)} such that

    P₂(x)W₁(x) = W₂(x).
Continuing this process, we can finally get

    W₂(x) ∼ ⋯ ∼ W_t(x) = [ 1   r₁₂(x)   r₁₃(x)   ⋯   r_{1t}(x) ]
                          [ 0   1        r₂₃(x)   ⋯   r_{2t}(x) ]
                          [ 0   0        1        ⋯   r_{3t}(x) ]
                          [ ⋮   ⋮        ⋮        ⋱   ⋮         ]
                          [ 0   0        0        ⋯   1         ]
                          [ 0   0        0        ⋯   0         ].

Consequently, for i = 1, 2, …, t there exists Pᵢ(x) ∈ ℂ[x]^{(s+i)×(s+i−1)} such that

    P_t(x)P_{t−1}(x) ⋯ P₁(x)W(x) = W_t(x).

Since W_t(x) is a unit upper triangular matrix polynomial (with additional zero rows), there exists P_{t+1}(x) ∈ ℂ[x]^{t×(s+t)} such that P_{t+1}(x)W_t(x) = I_t. Let

    P(x) := P_{t+1}(x)P_t(x)P_{t−1}(x) ⋯ P₁(x);

then P(x)W(x) = I_t. Note that P(x) ∈ ℂ[x]^{t×s}.
For W(x) ∈ ℝ[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.
(ii) The conclusion is implied directly by item (i). □
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing coefficients on both sides of (5.2); this reduces to solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints

    1 − x₁² ≥ 0, 1 − x₂² ≥ 0, …, 1 − x_n² ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −½ diag(x)   I_n ].
Example 5.4. Consider the nonnegative portion of the unit sphere

    x₁ ≥ 0, x₂ ≥ 0, …, x_n ≥ 0, x₁² + ⋯ + x_n² − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − xxᵀ   x·1_nᵀ    2x ]
           [ ½xᵀ        −½·1_nᵀ   −1 ].
Example 5.5. Consider the set

    1 − x₁³ − x₂⁴ ≥ 0, 1 − x₃⁴ − x₄³ ≥ 0.

The equation L(x)C(x) = I₂ is satisfied by

    L(x) = [ −x₁/3   −x₂/4   0       0       1   0 ]
           [ 0        0      −x₃/4   −x₄/3   0   1 ].
Example 5.6. Consider the quadratic set

    1 − x₁x₂ − x₂x₃ − x₁x₃ ≥ 0, 1 − x₁² − x₂² − x₃² ≥ 0.

The matrix L(x)ᵀ satisfying L(x)C(x) = I₂ is

    [ 25x₁³ + 10x₁²x₂ + 40x₁x₂² − 25x₁ − 2x₃    −25x₁³ − 10x₁²x₂ − 40x₁x₂² + 49x₁/2 + 2x₃ ]
    [ −15x₁²x₂ + 10x₁x₂² + 20x₃x₁x₂ − 10x₁      15x₁²x₂ − 10x₁x₂² − 20x₃x₁x₂ + 10x₁ − x₂/2 ]
    [ 25x₃x₁² − 20x₁x₂² + 10x₃x₁x₂ + 2x₁        −25x₃x₁² + 20x₁x₂² − 10x₃x₁x₂ − 2x₁ − x₃/2 ]
    [ 1 − 20x₁x₃ − 10x₁² − 20x₁x₂               20x₁x₂ + 20x₁x₃ + 10x₁² ]
    [ −50x₁² − 20x₂x₁                           50x₁² + 20x₂x₁ + 1 ].
We would like to remark that a polynomial tuple c = (c₁, …, c_m) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d₁, …, d_m, there exists an open dense subset U of D := ℝ[x]_{d₁} × ⋯ × ℝ[x]_{d_m} such that every tuple c = (c₁, …, c_m) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof uses resultants and discriminants; we refer to [29].
First, let J₁ be the set of all (i₁, …, i_{n+1}) with 1 ≤ i₁ < ⋯ < i_{n+1} ≤ m. The resultant Res(c_{i₁}, …, c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i₁}, …, c_{i_{n+1}} such that, if Res(c_{i₁}, …, c_{i_{n+1}}) ≠ 0, then the equations

    c_{i₁}(x) = ⋯ = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

    F₁(c) := ∏_{(i₁,…,i_{n+1})∈J₁} Res(c_{i₁}, …, c_{i_{n+1}}).

For the case m ≤ n, J₁ = ∅, and we simply let F₁(c) = 1. Clearly, if F₁(c) ≠ 0, then no more than n of the polynomials c₁, …, c_m have a common complex zero.
Second, let J₂ be the set of all (j₁, …, j_k) with k ≤ n and 1 ≤ j₁ < ⋯ < j_k ≤ m. When one of c_{j₁}, …, c_{j_k} has degree bigger than one, the discriminant ∆(c_{j₁}, …, c_{j_k}) is a polynomial in the coefficients of c_{j₁}, …, c_{j_k} such that, if ∆(c_{j₁}, …, c_{j_k}) ≠ 0, then the equations

    c_{j₁}(x) = ⋯ = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j₁}, …, c_{j_k} at u are linearly independent. When all of c_{j₁}, …, c_{j_k} have degree one, the discriminant of the tuple (c_{j₁}, …, c_{j_k}) is not a single polynomial, but we can define ∆(c_{j₁}, …, c_{j_k}) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

    F₂(c) := ∏_{(j₁,…,j_k)∈J₂} ∆(c_{j₁}, …, c_{j_k}).

Clearly, if F₂(c) ≠ 0, then no n or fewer of the polynomials c₁, …, c_m have a singular complex common zero.
Last, let F(c) := F₁(c)F₂(c) and

    U := {c = (c₁, …, c_m) ∈ D : F(c) ≠ 0}.

Note that U is a Zariski open subset of D, and it is open and dense in D. For all c ∈ U, no more than n of c₁, …, c_m can have a common complex zero. For any k polynomials (k ≤ n) of c₁, …, c_m, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.
Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9), with Lagrange multiplier expressions, for solving the optimization problem (1.1). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.
The polynomials pᵢ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c₁, …, c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L₁(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p₁, …, p_m) to be the product L₁(x)∇f(x), i.e.,

    pᵢ = (L₁(x)∇f(x))ᵢ.

In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
By Theorem 3.3, we have f_k = fmin for all k big enough if fc = fmin and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions; however, in computation we do not need to check them at all. Indeed, the condition (3.17) is more convenient to use. When there are finitely many global minimizers, Theorem 3.4 shows that (3.17) is an appropriate criterion for detecting convergence. It is satisfied in all our examples except Examples 6.1, 6.7 and 6.9 (which have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and those given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

    min  x₁x₂(10 − x₃)
    s.t. x₁ ≥ 0, x₂ ≥ 0, x₃ ≥ 0, 1 − x₁ − x₂ − x₃ ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. Condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x₁, x₂, x₃) with x₁x₂ = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. They confirm that f_k = fmin for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 6.1.

    order k |      w.o. LME        |      with LME
            | lower bound   time   | lower bound   time
       2    |  −0.0521     0.6841  |  −0.0521     0.1922
       3    |  −0.0026     0.2657  |  −3·10⁻⁸     0.2285
       4    |  −0.0007     0.6785  |  −6·10⁻⁹     0.4431
       5    |  −0.0004     1.6105  |  −2·10⁻⁹     0.9567
²Note that 1 − xᵀx = (1 − eᵀx)(1 + xᵀx) + ∑_{i=1}ⁿ xᵢ(1 − xᵢ)² + ∑_{i≠j} xᵢ²xⱼ ∈ IQ(ceq, cin).
Example 62 Consider the optimization problem
min x41x22 + x21x
42 + x63 minus 3x21x
22x
23 + (x41 + x42 + x43)
st x21 + x22 + x23 ge 1
The matrix polynomial L(x) =[12x1
12x2
12x3 minus1
] The objective f is the sum
of the positive definite form x41 + x42 + x43 and the Motzkin polynomial
M(x) = x1^4 x2^2 + x1^2 x2^4 + x3^6 − 3 x1^2 x2^2 x3^2.
Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

c1(x)p1(x) = (x1^2 + x2^2 + x3^2 − 1)(3M(x) + 2(x1^4 + x2^4 + x3^4)) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
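As an added numerical illustration (not in the original), the value fmin = 1/3 at the 8 reported minimizers, and the nonnegativity of M(x), can be probed as follows:

```python
import itertools
import math
import random

def M(x1, x2, x3):
    # Motzkin polynomial: nonnegative everywhere but not SOS
    return x1**4 * x2**2 + x1**2 * x2**4 + x3**6 - 3 * x1**2 * x2**2 * x3**2

def f(x1, x2, x3):
    return M(x1, x2, x3) + x1**4 + x2**4 + x3**4

r = 1 / math.sqrt(3)
vals = [f(s1 * r, s2 * r, s3 * r)
        for s1, s2, s3 in itertools.product([-1, 1], repeat=3)]
err = max(abs(v - 1 / 3) for v in vals)

random.seed(0)
m_min = min(M(*(random.uniform(-2, 2) for _ in range(3))) for _ in range(10000))
print(err < 1e-9, m_min > -1e-9)  # True True
```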
Table 2. Computational results for Example 6.2

order k | w.o. LME: lower bound, time | with LME: lower bound, time
3 | −∞, 0.4466 | 0.1111, 0.1169
4 | −∞, 0.4948 | 0.3333, 0.3499
5 | −2.1821·10⁵, 1.1836 | 0.3333, 0.6530
Example 6.3. Consider the optimization problem

min  x1x2 + x2x3 + x3x4 − 3x1x2x3x4 + (x1^3 + · · · + x4^3)
s.t. x1, x2, x3, x4 ≥ 0, 1 − x1 − x2 ≥ 0, 1 − x3 − x4 ≥ 0.
The matrix polynomial L(x) is

[ 1−x1  −x2   0     0     1  1  0  0  1  0 ]
[ −x1   1−x2  0     0     1  1  0  0  1  0 ]
[ 0     0     1−x3  −x4   0  0  1  1  0  1 ]
[ 0     0     −x3   1−x4  0  0  1  1  0  1 ]
[ −x1   −x2   0     0     1  1  0  0  1  0 ]
[ 0     0     −x3   −x4   0  0  1  1  0  1 ].
The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(ceq, cin) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.
³This is because −c1^2 p1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c1(x)^2 p1(x)^2 ≥ 0} is compact.
⁴This is because 1 − x1^2 − x2^2 belongs to the quadratic module of (x1, x2, 1 − x1 − x2) and 1 − x3^2 − x4^2 belongs to the quadratic module of (x3, x4, 1 − x3 − x4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

order k | w.o. LME: lower bound, time | with LME: lower bound, time
3 | −2.9·10⁻⁵, 0.7335 | −6·10⁻⁷, 0.6091
4 | −1.4·10⁻⁵, 2.5055 | −8·10⁻⁸, 2.7423
5 | −1.4·10⁻⁵, 12.7092 | −5·10⁻⁸, 13.7449
Example 6.4. Consider the polynomial optimization problem

min_{x∈R²}  x1^2 + 50 x2^2
s.t. x1^2 − 1/2 ≥ 0, x2^2 − 2x1x2 − 1/8 ≥ 0, x2^2 + 2x1x2 − 1/8 ≥ 0.
It is motivated by an example in [12, §3]. The first column of L(x) is

( (8x1^3 + x1)/5,
  288x2x1^4/5 − 16x1^3/5 − 124x2x1^2/5 + 8x1/5 − 2x2,
  −288x2x1^4/5 − 16x1^3/5 + 124x2x1^2/5 + 8x1/5 + 2x2 ),

and the second column of L(x) is

( −8x1^2x2/5 + 4x2^3/5 − x2/10,
  288x1^3x2^2/5 + 16x1^2x2/5 − 142x1x2^2/5 − 9x1/20 − 8x2^3/5 + 11x2/5,
  −288x1^3x2^2/5 + 16x1^2x2/5 + 142x1x2^2/5 + 9x1/20 − 8x2^3/5 + 11x2/5 ).
The objective is coercive, so fmin is achieved at a critical point. The minimum value fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

order k | w.o. LME: lower bound, time | with LME: lower bound, time
3 | 6.7535, 0.4611 | 56.7500, 0.1309
4 | 6.9294, 0.2428 | 112.6517, 0.2405
5 | 8.8519, 0.3376 | 112.6517, 0.2167
6 | 16.5971, 0.4703 | 112.6517, 0.3788
7 | 35.4756, 0.6536 | 112.6517, 0.4537
Example 6.5. Consider the optimization problem

min_{x∈R³}  x1^3 + x2^3 + x3^3 + 4x1x2x3 − ( x1(x2^2 + x3^2) + x2(x3^2 + x1^2) + x3(x1^2 + x2^2) )
s.t. x1 ≥ 0, x1x2 − 1 ≥ 0, x2x3 − 1 ≥ 0.
The matrix polynomial L(x) is

[ 1−x1x2  0   0  x2  x2  0  ]
[ x1      0   0  −1  −1  0  ]
[ −x1     x2  0  1   0   −1 ].
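One can verify numerically that this L(x) satisfies the defining identity L(x)C(x) = I3, where C(x) stacks the constraint gradients over diag(c); the check below is an added illustration:

```python
import random

def LC(x1, x2, x3):
    c = [x1, x1 * x2 - 1, x2 * x3 - 1]
    grads = [(1, 0, 0), (x2, x1, 0), (0, x3, x2)]  # gradients of c1, c2, c3
    # C(x) is 6x3: first 3 rows are the partial derivatives, last 3 are diag(c)
    C = [[grads[j][i] for j in range(3)] for i in range(3)]
    C += [[c[j] if i == j else 0 for j in range(3)] for i in range(3)]
    L = [[1 - x1 * x2, 0, 0, x2, x2, 0],
         [x1, 0, 0, -1, -1, 0],
         [-x1, x2, 0, 1, 0, -1]]
    return [[sum(L[i][k] * C[k][j] for k in range(6)) for j in range(3)]
            for i in range(3)]

random.seed(1)
P = LC(*(random.uniform(-2, 2) for _ in range(3)))
print(P)  # the 3x3 identity matrix, up to round-off
```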
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R³₊, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

order k | w.o. LME: lower bound, time | with LME: lower bound, time
2 | −∞, 0.4129 | −∞, 0.1900
3 | −7.8184·10⁶, 0.4641 | 0.9492, 0.3139
4 | −2.0575·10⁴, 0.6499 | 0.9492, 0.5057
Example 6.6. Consider the optimization problem (x0 = 1)

min_{x∈R⁴}  xᵀx + Σ_{i=0}^{4} Π_{j≠i} (xi − xj)
s.t. x1^2 − 1 ≥ 0, x2^2 − 1 ≥ 0, x3^2 − 1 ≥ 0, x4^2 − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I4 ]. The first part of the objective is xᵀx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

(1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
(1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
(−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
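As an added check, the objective (with the convention x0 = 1) indeed evaluates to 4 at each of the 11 listed points, and to a strictly larger value at the remaining five sign patterns:

```python
from itertools import product
from math import prod

def f(x):
    y = (1,) + tuple(x)  # prepend x0 = 1
    return sum(v * v for v in x) + sum(
        prod(y[i] - y[j] for j in range(5) if j != i) for i in range(5))

minimizers = [(1, 1, 1, 1), (1, -1, -1, 1), (1, -1, 1, -1), (1, 1, -1, -1),
              (1, -1, -1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1), (-1, 1, 1, -1),
              (-1, -1, -1, 1), (-1, -1, 1, -1), (-1, 1, -1, -1)]
others = [x for x in product([-1, 1], repeat=4) if x not in minimizers]
print({f(x) for x in minimizers})     # {4}
print(min(f(x) for x in others) > 4)  # True
```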
Table 6. Computational results for Example 6.6

order k | w.o. LME: lower bound, time | with LME: lower bound, time
3 | −∞, 1.1377 | 3.5480, 1.1765
4 | −6.6913·10⁴, 4.7677 | 4.0000, 3.0761
5 | −21.3778, 22.9970 | 4.0000, 10.3354
Example 6.7. Consider the optimization problem

min_{x∈R³}  x1^4 x2^2 + x2^4 x3^2 + x3^4 x1^2 − 3 x1^2 x2^2 x3^2 + x2^2
s.t. x1 − x2x3 ≥ 0, −x2 + x3^2 ≥ 0.

The matrix polynomial L(x) =

[ 1    0   0  0  0 ]
[ −x3  −1  0  0  0 ].
By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.
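The AM-GM argument (the first three terms dominate 3x1^2x2^2x3^2, and x2^2 ≥ 0) can be probed by random sampling; the script below is an added illustration, not the paper's proof:

```python
import random

def f(x1, x2, x3):
    return (x1**4 * x2**2 + x2**4 * x3**2 + x3**4 * x1**2
            - 3 * x1**2 * x2**2 * x3**2 + x2**2)

random.seed(0)
fmin_sampled = min(
    f(*(random.uniform(-3, 3) for _ in range(3))) for _ in range(100000))
print(fmin_sampled > -1e-9)  # True: consistent with fmin = 0
```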
Table 7. Computational results for Example 6.7

order k | w.o. LME: lower bound, time | with LME: lower bound, time
3 | −∞, 0.6144 | −∞, 0.3418
4 | −1.0909·10⁷, 1.0542 | −3.9476, 0.7180
5 | −942.6772, 1.6771 | −3·10⁻⁹, 1.4607
6 | −0.0110, 3.3532 | −8·10⁻¹⁰, 3.1618
Example 6.8. Consider the optimization problem

min_{x∈R⁴}  x1^2(x1 − x4)^2 + x2^2(x2 − x4)^2 + x3^2(x3 − x4)^2
            + 2x1x2x3(x1 + x2 + x3 − 2x4) + (x1 − 1)^2 + (x2 − 1)^2 + (x3 − 1)^2
s.t. x1 − x2 ≥ 0, x2 − x3 ≥ 0.

The matrix polynomial L(x) =

[ 1  0  0  0  0  0 ]
[ 1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
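For this example, the identity L(x)C(x) = I2 can be confirmed by direct multiplication (added illustration; C(x) stacks the two constraint gradients over diag(c)):

```python
import random

def LC(x):
    c = [x[0] - x[1], x[1] - x[2]]
    grads = [(1, -1, 0, 0), (0, 1, -1, 0)]  # gradients of c1 = x1-x2, c2 = x2-x3
    C = [[grads[j][i] for j in range(2)] for i in range(4)]
    C += [[c[j] if i == j else 0 for j in range(2)] for i in range(2)]
    L = [[1, 0, 0, 0, 0, 0],
         [1, 1, 0, 0, 0, 0]]
    return [[sum(L[i][k] * C[k][j] for k in range(6)) for j in range(2)]
            for i in range(2)]

random.seed(2)
print(LC([random.uniform(-1, 1) for _ in range(4)]))  # [[1.0, 0.0], [0.0, 1.0]]
```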
Table 8. Computational results for Example 6.8

order k | w.o. LME: lower bound, time | with LME: lower bound, time
2 | −∞, 0.3984 | −0.3360, 0.9321
3 | −∞, 0.7634 | 0.9413, 0.5240
4 | −6.4896·10⁵, 4.5496 | 0.9413, 1.7192
5 | −3.1645·10³, 24.3665 | 0.9413, 8.1228
Example 6.9. Consider the optimization problem

min_{x∈R⁴}  (x1 + x2 + x3 + x4 + 1)^2 − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
s.t. 0 ≤ x1, . . . , x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.
⁵Note that 4 − Σ_{i=1}^{4} xi^2 = Σ_{i=1}^{4} ( xi(1 − xi)^2 + (1 − xi)(1 + xi^2) ) ∈ IQ(ceq, cin).
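The segment of minimizers can be verified by direct evaluation (an added illustration): along (t, 0, 0, 1 − t) the objective reduces to (2)^2 − 4 = 0 for every t.

```python
def f(x1, x2, x3, x4):
    return (x1 + x2 + x3 + x4 + 1) ** 2 - 4 * (x1 * x2 + x2 * x3 + x3 * x4 + x4 + x1)

vals = [f(t, 0, 0, 1 - t) for t in (0, 0.25, 0.5, 0.75, 1)]
print(vals)  # all zeros
```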
Table 9. Computational results for Example 6.9

order k | w.o. LME: lower bound, time | with LME: lower bound, time
2 | −0.0279, 0.2262 | −5·10⁻⁶, 1.1835
3 | −0.0005, 0.4691 | −6·10⁻⁷, 1.6566
4 | −0.0001, 3.1098 | −2·10⁻⁷, 5.5234
5 | −4·10⁻⁵, 16.5092 | −6·10⁻⁷, 19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast, e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x0 = 1)

min_{x∈Rⁿ}  Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
s.t. 0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

n        | 9      | 10     | 11     | 12      | 13      | 14
w.o. LME | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111
with LME | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778
7. Discussions
7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, . . . , gℓ). Its preordering is the set

Preord(cin, ψ) = Σ_{r1,...,rℓ ∈ {0,1}} g1^{r1} · · · gℓ^{rℓ} · Σ[x].
The truncation Preord(cin, ψ)_{2k} is similarly defined like Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)  f′_{pre,k} = min ⟨f, y⟩
       s.t. ⟨1, y⟩ = 1, L^{(k)}_{ceq}(y) = 0, L^{(k)}_{φ}(y) = 0,
            L^{(k)}_{g1^{r1}···gℓ^{rℓ}}(y) ⪰ 0 for all r1, . . . , rℓ ∈ {0, 1},
            y ∈ R^{N^n_{2k}}.
Similar to (3.9), the dual optimization problem of the above is

(7.2)  f_{pre,k} = max γ
       s.t. f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.
Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

f_{pre,k} = f′_{pre,k} = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_{pre,k} = f′_{pre,k} = fmin for all k big enough.
Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f̃(x) ≡ 0 on the set

K3 = {x ∈ Rⁿ | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0}.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f̃^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same. □
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = Im. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x). In the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times. There exist sharp degree bounds for Hilbert's Nullstellensatz [16]. For each time of its usage, if the degree bound in [16] is used, then a degree bound for L(x) can eventually be obtained. However, such an obtained bound is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

λ = ( C(x)ᵀ C(x) )⁻¹ C(x)ᵀ [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgement The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973 He would like to thank the anonymous referees forvaluable comments on improving the paper
References
[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. González-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093
E-mail address: njw@math.ucsd.edu
For more general constraints, we can also express λ as a polynomial function in x on the set Kc. This will be discussed in §4 and §5.
For the polynomials pi as in Assumption 3.1, denote

(3.6)  φ = ( ∇f − Σ_{i∈E∪I} pi ∇ci,  (pj cj)_{j∈I} ),   ψ = ( pj )_{j∈I}.
When the minimum value fmin of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem

(3.7)  fc = min f(x)
       s.t. ceq(x) = 0, cin(x) ≥ 0, φ(x) = 0, ψ(x) ≥ 0.
We apply Lasserre relaxations to solve it. For an integer k > 0 (called the relaxation order), the kth order Lasserre relaxation for (3.7) is

(3.8)  f′_k = min ⟨f, y⟩
       s.t. ⟨1, y⟩ = 1, M_k(y) ⪰ 0,
            L^{(k)}_{ceq}(y) = 0, L^{(k)}_{cin}(y) ⪰ 0,
            L^{(k)}_{φ}(y) = 0, L^{(k)}_{ψ}(y) ⪰ 0, y ∈ R^{N^n_{2k}}.
Since x^0 = 1 (the constant one polynomial), the condition ⟨1, y⟩ = 1 means that (y)_0 = 1. The dual optimization problem of (3.8) is
(3.9)  f_k = max γ
       s.t. f − γ ∈ IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k}.
We refer to §2 for the notation used in (3.8)-(3.9). They are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For k = 1, 2, . . ., we get a hierarchy of Lasserre relaxations. In (3.8)-(3.9), if we remove the usage of φ and ψ, they are reduced to the standard Lasserre relaxations in [17]. So, (3.8)-(3.9) are stronger relaxations.
By the construction of φ as in (3.6), Assumption 3.1 implies that

Kc = {u ∈ Rⁿ : ceq(u) = 0, φ(u) = 0}.

By Lemma 3.3 of [6], f achieves only finitely many values on Kc, say

(3.10)  v1 < · · · < vN.
A point u ∈ Kc might not be feasible for (3.7), i.e., it is possible that cin(u) ≥ 0 or ψ(u) ≥ 0 fails to hold. In applications, we are often interested in the optimal value fc of (3.7). When (3.7) is infeasible, by convention, we set

fc = +∞.
When the optimal value fmin of (1.1) is achieved at a critical point, fc = fmin. This is the case if the feasible set is compact, or if f is coercive (i.e., for each ℓ, the sublevel set {f(x) ≤ ℓ} is compact), and the constraint qualification condition holds. As in [17], one can show that

(3.11)  f_k ≤ f′_k ≤ fc

for all k. Moreover, f_k and f′_k are both monotonically increasing. If, for some order k, it occurs that

f_k = f′_k = fc,

then the kth order Lasserre relaxation is said to be tight (or exact).
3.1. Tightness of the relaxations. Let cin, ψ, Kc, fc be as above. We refer to §2 for the notation Qmod(cin, ψ). We begin with a general assumption.

Assumption 3.2. There exists ρ ∈ Qmod(cin, ψ) such that if u ∈ Kc and f(u) < fc, then ρ(u) < 0.
In Assumption 3.2, the hypersurface ρ(x) = 0 separates feasible and infeasible critical points. Clearly, if u ∈ Kc is a feasible point for (3.7), then cin(u) ≥ 0 and ψ(u) ≥ 0, and hence ρ(u) ≥ 0. Assumption 3.2 generally holds. For instance, it is satisfied in the following general cases:

a) When there are no inequality constraints, cin and ψ are empty tuples. Then Qmod(cin, ψ) = Σ[x], and Assumption 3.2 is satisfied for ρ = 0.

b) Suppose the set Kc is finite, say Kc = {u1, . . . , uD}, and

f(u1), . . . , f(u_{t−1}) < fc ≤ f(u_t), . . . , f(u_D).

Let ℓ1, . . . , ℓD be real interpolating polynomials such that ℓi(uj) = 1 for i = j and ℓi(uj) = 0 for i ≠ j. For each i = 1, . . . , t − 1, there must exist ji ∈ I such that c_{ji}(ui) < 0. Then the polynomial

(3.12)  ρ = Σ_{i<t} ( −1 / c_{ji}(ui) ) c_{ji}(x) ℓi(x)^2 + Σ_{i≥t} ℓi(x)^2

satisfies Assumption 3.2.
satisfies Assumption 32c) For each x with f(x) = vi lt fc at least one of the constraints cj(x) ge
0 pj(x) ge 0(j isin I) is violated Suppose for each critical value vi lt fcthere exists gi isin cj pjjisinI such that
gi lt 0 on Kc cap f(x) = viLet ϕ1 ϕN be real univariate polynomials such that ϕi(vj) = 0 for i 6= jand ϕi(vj) = 1 for i = j Suppose vt = fc Then the polynomial
(313) ρ =sum
iltt
gi(x)(ϕi(f(x))
)2+sum
iget
(ϕi(f(x))
)2
satisfies Assumption 32
We refer to §2 for the archimedean condition and the notation IQ(h, g) as in (2.1). The following is about the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose Kc ≠ ∅ and Assumption 3.1 holds. If

i) IQ(ceq, cin) + IQ(φ, ψ) is archimedean, or
ii) IQ(ceq, cin) is archimedean, or
iii) Assumption 3.2 holds,

then f_k = f′_k = fc for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_k = f′_k = fmin for all k big enough, if one of the conditions i)-iii) is satisfied.
The proof for Theorem 33 is given in the following The main idea is to considerthe set of critical points It can be expressed as a union of subvarieties Theobjective f is a constant in each one of them We can get an SOS type representation
for f on each subvariety, and then construct a single one for f over the entire set of critical points.

Proof of Theorem 3.3. Clearly, every point in the complex variety

K1 = {x ∈ Cⁿ | ceq(x) = 0, φ(x) = 0}

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on Kc = K1 ∩ Rⁿ, say they are v1 < · · · < vN. Up to the shifting of a constant in f, we can further assume that fc = 0. Clearly, fc equals one of v1, . . . , vN, say vt = fc = 0.
Case I: Assume IQ(ceq, cin) + IQ(φ, ψ) is archimedean. Let

I = Ideal(ceq, φ),

the critical ideal. Note that K1 = V_C(I). The variety V_C(I) is a union of irreducible subvarieties, say V1, . . . , Vℓ. If Vi ∩ Rⁿ ≠ ∅, then f is a real constant on Vi, which equals one of v1, . . . , vN. This can be implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_C(I)

Ti = K1 ∩ {f(x) = vi}  (i = t, . . . , N).

Let T_{t−1} be the union of the irreducible subvarieties Vi such that either Vi ∩ Rⁿ = ∅, or f ≡ vj on Vi with vj < vt = fc. Then it holds that

V_C(I) = T_{t−1} ∪ T_t ∪ · · · ∪ T_N.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, I_t, . . . , I_N ⊆ R[x] such that

I = I_{t−1} ∩ I_t ∩ · · · ∩ I_N

and Ti = V_C(I_i) for all i = t − 1, t, . . . , N. Denote the semialgebraic set

(3.14)  S = {x ∈ Rⁿ | cin(x) ≥ 0, ψ(x) ≥ 0}.
For i = t − 1, we have V_R(I_{t−1}) ∩ S = ∅, because v1, . . . , v_{t−1} < fc. By the Positivstellensatz [2, Corollary 4.4.3], there exists p0 ∈ Preord(cin, ψ)¹ satisfying 2 + p0 ∈ I_{t−1}. Note that 1 + p0 > 0 on V_R(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(cin, ψ) is archimedean, because I ⊆ I_{t−1} and

IQ(ceq, cin) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(cin, ψ).

By Theorem 2.1, we have

p1 := 1 + p0 ∈ I_{t−1} + Qmod(cin, ψ).

Then 1 + p1 ∈ I_{t−1}. There exists p2 ∈ Qmod(cin, ψ) such that

−1 ≡ p1 ≡ p2  mod I_{t−1}.

Since f = (f/4 + 1)^2 − 1 · (f/4 − 1)^2, we have

f ≡ σ_{t−1} := (f/4 + 1)^2 + p2 (f/4 − 1)^2  mod I_{t−1}.

So, when k is big enough, σ_{t−1} ∈ Qmod(cin, ψ)_{2k}.
¹It is the preordering of the polynomial tuple (cin, ψ); see §7.1.
For i = t, vt = 0 and f(x) vanishes on V_C(I_t). By Hilbert's Strong Nullstellensatz [4], there exists an integer mt > 0 such that f^{mt} ∈ I_t. Define the polynomial

s_t(ε) = √ε Σ_{j=0}^{mt−1} (1/2 choose j) ε^{−j} f^j.

Then, we have that

D1 := ε (1 + ε^{−1} f) − ( s_t(ε) )^2 ≡ 0  mod I_t.

This is because, in the subtraction of D1, after expanding ( s_t(ε) )^2, all the terms f^j with j < mt are cancelled, and f^j ∈ I_t for j ≥ mt. So D1 ∈ I_t. Let σ_t(ε) = s_t(ε)^2; then f + ε − σ_t(ε) = D1 and

(3.15)  f + ε − σ_t(ε) = Σ_{j=0}^{mt−2} b_j(ε) f^{mt+j}

for some real scalars b_j(ε) depending on ε.
For each i = t + 1, . . . , N, vi > 0 and f(x)/vi − 1 vanishes on V_C(I_i). By Hilbert's Strong Nullstellensatz [4], there exists 0 < mi ∈ N such that (f/vi − 1)^{mi} ∈ I_i. Let

s_i = √vi Σ_{j=0}^{mi−1} (1/2 choose j) (f/vi − 1)^j.

Like for the case i = t, we can similarly show that f − s_i^2 ∈ I_i. Let σ_i = s_i^2; then f − σ_i ∈ I_i.
Note that V_C(I_i) ∩ V_C(I_j) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, . . . , a_N ∈ R[x] such that

a_{t−1}^2 + · · · + a_N^2 − 1 ∈ I,   a_i ∈ ∩_{i≠j∈{t−1,...,N}} I_j.

For ε > 0, denote the polynomial

σ_ε := σ_t(ε) a_t^2 + Σ_{t≠j∈{t−1,...,N}} (σ_j + ε) a_j^2;

then

f + ε − σ_ε = (f + ε)(1 − a_{t−1}^2 − · · · − a_N^2)
              + Σ_{t≠i∈{t−1,...,N}} (f − σ_i) a_i^2 + (f + ε − σ_t(ε)) a_t^2.
For each i ≠ t, f − σ_i ∈ I_i, so

(f − σ_i) a_i^2 ∈ ∩_{j=t−1}^{N} I_j = I.

Hence, there exists k1 > 0 such that

(f − σ_i) a_i^2 ∈ I_{2k1}  (t ≠ i ∈ {t − 1, . . . , N}).

Since f + ε − σ_t(ε) ∈ I_t, we also have

(f + ε − σ_t(ε)) a_t^2 ∈ ∩_{j=t−1}^{N} I_j = I.

Moreover, by the equation (3.15),

(f + ε − σ_t(ε)) a_t^2 = Σ_{j=0}^{mt−2} b_j(ε) f^{mt+j} a_t^2.
Each f^{mt+j} a_t^2 ∈ I, since f^{mt+j} ∈ I_t. So there exists k2 > 0 such that, for all ε > 0,

(f + ε − σ_t(ε)) a_t^2 ∈ I_{2k2}.

Since 1 − a_{t−1}^2 − · · · − a_N^2 ∈ I, there also exists k3 > 0 such that, for all ε > 0,

(f + ε)(1 − a_{t−1}^2 − · · · − a_N^2) ∈ I_{2k3}.

Hence, if k* ≥ max{k1, k2, k3}, then we have

f(x) + ε − σ_ε ∈ I_{2k*}

for all ε > 0. By the construction, the degrees of all σ_i and a_i are independent of ε. So, σ_ε ∈ Qmod(cin, ψ)_{2k*} for all ε > 0, if k* is big enough. Note that

I_{2k*} + Qmod(cin, ψ)_{2k*} = IQ(ceq, cin)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ fc − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ fc. So f_{k*} = fc. Moreover, since f_k is monotonically increasing, we must have f_k = fc for all k ≥ k*.
Case II: Assume IQ(ceq, cin) is archimedean. Because

IQ(ceq, cin) ⊆ IQ(ceq, cin) + IQ(φ, ψ),

the set IQ(ceq, cin) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.
Case III: Suppose Assumption 3.2 holds. Let ϕ1, . . . , ϕN be real univariate polynomials such that ϕi(vj) = 0 for i ≠ j and ϕi(vj) = 1 for i = j. Let

s = s_t + · · · + s_N, where each s_i = (v_i − fc) ( ϕi(f) )^2.

Then s ∈ Σ[x]_{2k4} for some integer k4 > 0. Let

f̃ = f − fc − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(cin, ψ) such that

f̃^{2ℓ} + q ∈ Ideal(ceq, φ).

This is because, by Assumption 3.2, f̃(x) ≡ 0 on the set

K2 = {x ∈ Rⁿ : ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}.

It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ N and q = b0 + ρ b1 (b0, b1 ∈ Σ[x]) such that f̃^{2ℓ} + q ∈ Ideal(ceq, φ). By Assumption 3.2, ρ ∈ Qmod(cin, ψ), so we have q ∈ Qmod(cin, ψ).

For all ε > 0 and τ > 0, we have f̃ + ε = φ_ε + θ_ε, where

φ_ε = −τ ε^{1−2ℓ} ( f̃^{2ℓ} + q ),   θ_ε = ε ( 1 + f̃/ε + τ (f̃/ε)^{2ℓ} ) + τ ε^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k5 such that, for all ε > 0,

φ_ε ∈ Ideal(ceq, φ)_{2k5},   θ_ε ∈ Qmod(cin, ψ)_{2k5}.

Hence, we can get

f − (fc − ε) = φ_ε + σ_ε,

where σ_ε := θ_ε + s ∈ Qmod(cin, ψ)_{2k5} for all ε > 0. Note that

IQ(ceq, cin)_{2k5} + IQ(φ, ψ)_{2k5} = Ideal(ceq, φ)_{2k5} + Qmod(cin, ψ)_{2k5}.
For all ε > 0, γ = fc − ε is feasible in (3.9) for the order k5, so f_{k5} ≥ fc. Because of (3.11) and the monotonicity of f_k, we have f_k = f′_k = fc for all k ≥ k5. □
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is fc, and the optimal value of (1.1) is fmin. If fmin is achievable at a critical point, then fc = fmin. In Theorem 3.3, we have shown that f_k = fc for all k big enough, where f_k is the optimal value of (3.9). The value fc or fmin is often not known. How do we detect the tightness f_k = fc in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16)  d = ⌈deg(ceq, cin, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)  rank M_t(y*) = rank M_{t−d}(y*),

then f_k = fc, and we can get r := rank M_t(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., fc < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = fc.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17) for some t ∈ [d, k], when k is big enough.
Proof. By Assumption 3.1, u is a critical point if and only if ceq(u) = 0, φ(u) = 0.
i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.
ii) By Assumption 3.2, when (3.7) is infeasible, the set

{x ∈ R^n : ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(ceq, φ) + Qmod(ρ). By Assumption 3.2,

Ideal(ceq, φ) + Qmod(ρ) ⊆ IQ(ceq, cin) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.
When (3.7) is feasible, f achieves finitely many values on Kc, so (3.7) must achieve its optimal value fc. By Theorem 3.3, we know that fk = f'_k = fc for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).
iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say u1, ..., ur, such that fk = f(ui) for each i. Clearly, fk = f(ui) ≥ fc. On the other hand, we always have fk ≤ fc. So fk = fc.
iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)  min f(x)  s.t. ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also fc. Its kth order Lasserre relaxation is

(3.19)  γ'_k = min ⟨f, y⟩  s.t. ⟨1, y⟩ = 1, M_k(y) ⪰ 0, L^{(k)}_{ceq}(y) = 0, L^{(k)}_{φ}(y) = 0, L^{(k)}_{ρ}(y) ⪰ 0.

Its dual optimization problem is

(3.20)  γ_k = max γ  s.t. f − γ ∈ Ideal(ceq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

γ_k = γ'_k = fc

for all k big enough. Because ρ ∈ Qmod(cin, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied, for some t ∈ [d, k], when k is big enough. □
If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].
4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

P := {x ∈ R^n | Ax − b ≥ 0},

where A = [a1 ··· am]^T ∈ R^{m×n} and b = [b1 ··· bm]^T ∈ R^m. This corresponds to the case that E = ∅, I = [m], and each ci(x) = a_i^T x − b_i. Denote

(4.1)  D(x) = diag(c1(x), ..., cm(x)),   C(x) = [ A^T ; D(x) ] ∈ R[x]^{(n+m)×m}.

The Lagrange multiplier vector λ = [λ1 ··· λm]^T satisfies

(4.2)  [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

(4.3)  λ = (AA^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(4.4)  L(x)C(x) = Im,

then we can get

λ = L(x) [ ∇f(x) ; 0 ] = L1(x)∇f(x),
where L1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists, and we give a degree bound for it.
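As a quick numerical sanity check of (4.3) (my illustration, not from the paper): when rank A = m, the matrix (AA^T)^{−1}A is a left inverse of A^T, so it recovers λ from any gradient of the form ∇f(x) = A^T λ:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5                       # m <= n; a generic random A has rank m
A = rng.standard_normal((m, n))
lam_true = rng.standard_normal(m)
grad_f = A.T @ lam_true           # suppose nabla f(x) = A^T lambda at a critical point

# Formula (4.3): lambda = (A A^T)^{-1} A nabla f(x)
lam = np.linalg.solve(A @ A.T, A @ grad_f)
print(np.allclose(lam, lam_true))  # True
```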
The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (also see Definition 5.1). This is equivalent to the condition that, for every u, if J(u) = {i1, ..., ik} (see (1.2) for the notation), then a_{i1}, ..., a_{ik} are linearly independent.
Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.
Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.
Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.
For a subset I = {i1, ..., i_{m−n}} of [m], denote

cI(x) = ∏_{i∈I} ci(x),   EI(x) = cI(x) · diag(c_{i1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1}),

DI(x) = diag(c_{i1}(x), ..., c_{i_{m−n}}(x)),   AI = [a_{i1} ··· a_{i_{m−n}}]^T.

For the case that I = ∅ (the empty set), we set c∅(x) = 1. Let

V = {I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n}.

Step I: For each I ∈ V, we construct a matrix polynomial LI(x) such that

(4.5)  LI(x)C(x) = cI(x) · Im.
The matrix LI = LI(x) satisfying (4.5) can be given by the following 2 × 3 block pattern (LI(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

(4.6)
  rows \ columns      [n]                        n + I                              n + ([m]\I)
  I                   0                          EI(x)                              0
  [m]\I               cI(x)·(A_{[m]\I})^{−T}     −(A_{[m]\I})^{−T}(AI)^T EI(x)      0

Equivalently, the blocks of LI are

LI(I, [n]) = 0,  LI(I, n + ([m]\I)) = 0,  LI([m]\I, n + ([m]\I)) = 0,
LI(I, n + I) = EI(x),  LI([m]\I, [n]) = cI(x)·(A_{[m]\I})^{−T},
LI([m]\I, n + I) = −(A_{[m]\I})^{−T}(AI)^T EI(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = LI(x)C(x); then one can verify that

G(I, I) = EI(x)DI(x) = cI(x)·I_{m−n},   G(I, [m]\I) = 0,

G([m]\I, [m]\I) = [ cI(x)(A_{[m]\I})^{−T}  −(A_{[m]\I})^{−T}(AI)^T EI(x) ] [ (A_{[m]\I})^T ; 0 ] = cI(x)·In,

G([m]\I, I) = [ cI(x)(A_{[m]\I})^{−T}  −(A_{[m]\I})^{−T}(AI)^T EI(x) ] [ (AI)^T ; DI(x) ] = 0.

This shows that the above LI(x) satisfies (4.5).
Step II: We show that there exist real scalars νI satisfying

(4.7)  Σ_{I∈V} νI · cI(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c∅(x) = 1, so (4.7) is clearly true.
• When m > n, let

(4.8)  N = {i ∈ [m] | rank A_{[m]\{i}} = n}.

For each i ∈ N, let Vi be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(4.9)  Σ_{I′∈Vi} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a1, ..., an} is linearly independent. So, there exist scalars α1, ..., αn such that

am = α1 a1 + ··· + αn an.

If all αi = 0, then am = 0, and hence A can be replaced by its first m − 1 rows; so (4.7) is true by the induction. In the following, suppose at least one αi ≠ 0, and write

{i : αi ≠ 0} = {i1, ..., ik}.

Then a_{i1}, ..., a_{ik}, am are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

c_{i1}(x) = ··· = c_{ik}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ1, ..., μ_{k+1} such that

μ1 c_{i1}(x) + ··· + μk c_{ik}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This can be deduced from the echelon form of an inconsistent linear system.) Note that i1, ..., i_{k+1} ∈ N. For each j = 1, ..., k + 1, by (4.9),

Σ_{I′∈V_{ij}} ν^{(ij)}_{I′} c_{I′}(x) = 1.

Then, we can get

1 = Σ_{j=1}^{k+1} μj c_{ij}(x) = Σ_{j=1}^{k+1} μj Σ_{I′∈V_{ij}} ν^{(ij)}_{I′} c_{ij}(x) c_{I′}(x) = Σ_{I=I′∪{ij}, I′∈V_{ij}, 1≤j≤k+1} ν^{(ij)}_{I′} μj cI(x).

Since each I′ ∪ {ij} ∈ V, (4.7) must be satisfied by some scalars νI.
Step III: For LI(x) as in (4.5), we construct L(x) as

(4.10)  L(x) = Σ_{I∈V} νI · LI(x).

Clearly, L(x) satisfies (4.4), because

L(x)C(x) = Σ_{I∈V} νI · LI(x)C(x) = Σ_{I∈V} νI · cI(x) · Im = Im.

Each LI(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x) with L(x)C(x) = Im for polyhedral sets.
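The coefficient-matching computation just described can be sketched in a computer algebra system (my illustration; the degree-1 ansatz, which matches the bound m − rank A = 1, and all variable names are mine). Here it is carried out for the simplicial set of Example 4.2 with n = 2:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
# Simplicial set in R^2: x1 >= 0, x2 >= 0, 1 - x1 - x2 >= 0  (m = 3, n = 2)
A = sp.Matrix([[1, 0], [0, 1], [-1, -1]])
b = sp.Matrix([0, 0, 1])                      # c_i(x) = a_i^T x - b_i
c = A * sp.Matrix([x1, x2]) - b
C = sp.Matrix.vstack(A.T, sp.diag(*c))        # C(x) = [A^T; D(x)], size 5 x 3

# Ansatz: deg(L) <= m - rank(A) = 1, so each entry is u0 + u1*x1 + u2*x2.
unknowns, L = [], sp.zeros(3, 5)
for i in range(3):
    for j in range(5):
        u = sp.symbols(f'u_{i}_{j}_0:3')
        unknowns += list(u)
        L[i, j] = u[0] + u[1]*x1 + u[2]*x2

# Match coefficients of L(x)*C(x) - I3 = 0 entrywise: a linear system in u.
eqs = []
for entry in (L*C - sp.eye(3)):
    eqs += sp.Poly(sp.expand(entry), x1, x2).coeffs()
sol = sp.linsolve(eqs, unknowns)
vals = dict(zip(unknowns, next(iter(sol))))
L0 = L.subs(vals)
L0 = L0.subs({s: 0 for s in (L0.free_symbols - {x1, x2})})  # pick one solution
assert (L0*C).applyfunc(sp.expand) == sp.eye(3)
```

The solution set is a linear family; setting the free parameters to zero picks one valid L(x), which need not coincide entry-by-entry with the one displayed in Example 4.2.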
Example 4.2. Consider the simplicial set

{x1 ≥ 0, ..., xn ≥ 0, 1 − e^T x ≥ 0}.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ 1−x1   −x2   ···  −xn   1 ··· 1 ]
       [ −x1   1−x2   ···  −xn   1 ··· 1 ]
       [  ⋮      ⋮     ⋱    ⋮           ]
       [ −x1    −x2   ···  1−xn  1 ··· 1 ]
       [ −x1    −x2   ···  −xn   1 ··· 1 ],

whose last n + 1 columns are all equal to the vector of ones.
Example 4.3. Consider the box constraint

{x1 ≥ 0, ..., xn ≥ 0, 1 − x1 ≥ 0, ..., 1 − xn ≥ 0}.

The equation L(x)C(x) = I_{2n} is satisfied by

L(x) = [ In − diag(x)   In   In ]
       [ −diag(x)       In   In ].
Example 4.4. Consider the polyhedral set

{1 − x4 ≥ 0, x4 − x3 ≥ 0, x3 − x2 ≥ 0, x2 − x1 ≥ 0, x1 + 1 ≥ 0}.

The equation L(x)C(x) = I5 is satisfied by

L(x) = (1/2) · [ −x1−1  −x2−1  −x3−1  −x4−1   1 1 1 1 1 ]
               [ −x1−1  −x2−1  −x3−1  1−x4    1 1 1 1 1 ]
               [ −x1−1  −x2−1  1−x3   1−x4    1 1 1 1 1 ]
               [ −x1−1  1−x2   1−x3   1−x4    1 1 1 1 1 ]
               [ 1−x1   1−x2   1−x3   1−x4    1 1 1 1 1 ].
Example 4.5. Consider the polyhedral set

{1 + x1 ≥ 0, 1 − x1 ≥ 0, 2 − x1 − x2 ≥ 0, 2 − x1 + x2 ≥ 0}.

The matrix L(x) satisfying L(x)C(x) = I4 is

(1/6) · [ x1² − 3x1 + 2    x1x2 − x2         4 − x1    2 − x1    1 − x1    1 − x1   ]
        [ 3x1² − 3x1 − 6   3x1x2 + 3x2       6 − 3x1   −3x1      −3x1 − 3  −3x1 − 3 ]
        [ 1 − x1²          −x1x2 − 2x2 − 3   x1 − 1    x1 + 1    x1 + 2    x1 + 2   ]
        [ 1 − x1²          −x1x2 − 2x2 + 3   x1 − 1    x1 + 1    x1 + 2    x1 + 2   ].
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λi as polynomial functions in x on the set of critical points.
Suppose there are m equality and inequality constraints in total, i.e.,

E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λi ci(x) = 0 for all i ∈ E ∪ I. So, the Lagrange multiplier vector λ = [λ1 ··· λm]^T satisfies the equation

(5.1)
[ ∇c1(x)  ∇c2(x)  ···  ∇cm(x) ]       [ ∇f(x) ]
[ c1(x)    0      ···   0     ]       [   0   ]
[  0      c2(x)   ···   0     ]  λ =  [   0   ]
[  ⋮               ⋱    ⋮     ]       [   ⋮   ]
[  0       0      ···  cm(x)  ]       [   0   ]

(the matrix on the left-hand side is C(x)). Let C(x) be as in the above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)  L(x)C(x) = Im,

then we can get

(5.3)  λ = L(x) [ ∇f(x) ; 0 ] = L1(x)∇f(x),

where L1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.
Definition 5.1. The tuple c = (c1, ..., cm) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to the condition that, for each u ∈ C^n, if J(u) = {i1, ..., ik} (see (1.2) for the notation), then the gradients ∇c_{i1}(u), ..., ∇c_{ik}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

P(x)W(x) = It.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x)W(x) = It, then, for all u ∈ C^n,

t = rank It ≤ rank W(u) ≤ t.

So, W(x) must have full column rank everywhere.
"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

W(x) = [ w1(x)  w2(x)  ···  wt(x) ].
Then, the equation w1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ1(x) ∈ C[x]^s such that ξ1(x)^T w1(x) = 1. For each i = 2, ..., t, denote

r_{1i}(x) = ξ1(x)^T wi(x);

then (we use ∼ to denote row equivalence between matrices)

W(x) ∼ [ 1       r12(x)  ···  r1t(x) ]  ∼  W1(x) = [ 1   r12(x)       ···  r1t(x)      ]
       [ w1(x)   w2(x)   ···  wt(x)  ]             [ 0   w2^{(1)}(x)  ···  wt^{(1)}(x) ],

where, for each i = 2, ..., t,

wi^{(1)}(x) = wi(x) − r_{1i}(x)·w1(x).

So, there exists P1(x) ∈ C[x]^{(s+1)×s} such that

P1(x)W(x) = W1(x).

Since W(x) and W1(x) are row equivalent, W1(x) must also have full column rank everywhere. Similarly, the polynomial equation

w2^{(1)}(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ2(x) ∈ C[x]^{s+1} such that

ξ2(x)^T w2^{(1)}(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) = ξ2(x)^T wi^{(1)}(x); then

W1(x) ∼ [ 1  r12(x)       r13(x)       ···  r1t(x)      ]  ∼  W2(x) = [ 1  r12(x)  r13(x)       ···  r1t(x)      ]
        [ 0  1            r23(x)       ···  r2t(x)      ]             [ 0  1       r23(x)       ···  r2t(x)      ]
        [ 0  w2^{(1)}(x)  w3^{(1)}(x)  ···  wt^{(1)}(x) ]             [ 0  0       w3^{(2)}(x)  ···  wt^{(2)}(x) ],

where, for each i = 3, ..., t,

wi^{(2)}(x) = wi^{(1)}(x) − r_{2i}(x)·w2^{(1)}(x).

Similarly, W1(x) and W2(x) are row equivalent, so W2(x) has full column rank everywhere. There exists P2(x) ∈ C[x]^{(s+2)×(s+1)} such that

P2(x)W1(x) = W2(x).

Continuing this process, we can finally get

W2(x) ∼ ··· ∼ Wt(x) = [ 1  r12(x)  r13(x)  ···  r1t(x) ]
                      [ 0  1       r23(x)  ···  r2t(x) ]
                      [ 0  0       1       ···  r3t(x) ]
                      [ ⋮                   ⋱    ⋮     ]
                      [ 0  0       0       ···  1      ]
                      [ 0  0       0       ···  0      ].

Consequently, there exist Pi(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

Pt(x)P_{t−1}(x)···P1(x)W(x) = Wt(x).
Since Wt(x) is a unit upper triangular matrix polynomial (followed by zero rows), there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x)Wt(x) = It. Let

P(x) = P_{t+1}(x)Pt(x)P_{t−1}(x)···P1(x);

then P(x)W(x) = It. Note that P(x) ∈ C[x]^{t×s}.
For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (here P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.
(ii) The conclusion is implied directly by item (i). □
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), the matrix L(x) can be determined by comparing coefficients on both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints

{1 − x1² ≥ 0, 1 − x2² ≥ 0, ..., 1 − xn² ≥ 0}.

The equation L(x)C(x) = In is satisfied by

L(x) = [ −(1/2)diag(x)   In ].
Example 5.4. Consider the nonnegative portion of the unit sphere

{x1 ≥ 0, x2 ≥ 0, ..., xn ≥ 0, x1² + ··· + xn² − 1 = 0}.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ In − xx^T    x·1_n^T       2x ]
       [ (1/2)x^T    −(1/2)1_n^T    −1 ].
Example 5.5. Consider the set

{1 − x1³ − x2⁴ ≥ 0, 1 − x3⁴ − x4³ ≥ 0}.

The equation L(x)C(x) = I2 is satisfied by

L(x) = [ −x1/3   −x2/4   0       0       1   0 ]
       [ 0       0       −x3/4   −x4/3   0   1 ].
Example 5.6. Consider the quadratic set

{1 − x1x2 − x2x3 − x1x3 ≥ 0, 1 − x1² − x2² − x3² ≥ 0}.

The matrix L(x)^T satisfying L(x)C(x) = I2 is

[ 25x1³ + 10x1²x2 + 40x1x2² − 25x1 − 2x3    −25x1³ − 10x1²x2 − 40x1x2² + 49x1/2 + 2x3  ]
[ −15x1²x2 + 10x1x2² + 20x3x1x2 − 10x1      15x1²x2 − 10x1x2² − 20x3x1x2 + 10x1 − x2/2 ]
[ 25x3x1² − 20x1x2² + 10x3x1x2 + 2x1        −25x3x1² + 20x1x2² − 10x3x1x2 − 2x1 − x3/2 ]
[ 1 − 20x1x3 − 10x1² − 20x1x2               20x1x2 + 20x1x3 + 10x1²                    ]
[ −50x1² − 20x2x1                           50x1² + 20x2x1 + 1                         ].
We would like to remark that a polynomial tuple c = (c1, ..., cm) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d1, ..., dm, there exists an open dense subset U of D = R[x]_{d1} × ··· × R[x]_{dm}, such that every tuple c = (c1, ..., cm) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof needs to use resultants and discriminants, for which we refer to [29].
First, let J1 be the set of all (i1, ..., i_{n+1}) with 1 ≤ i1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i1}, ..., c_{i_{n+1}}, such that, if Res(c_{i1}, ..., c_{i_{n+1}}) ≠ 0, then the equations

c_{i1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

F1(c) = ∏_{(i1,...,i_{n+1})∈J1} Res(c_{i1}, ..., c_{i_{n+1}}).

For the case that m ≤ n, J1 = ∅, and we simply let F1(c) = 1. Clearly, if F1(c) ≠ 0, then no more than n of the polynomials c1, ..., cm can have a common complex zero.
Second, let J2 be the set of all (j1, ..., jk) with k ≤ n and 1 ≤ j1 < ··· < jk ≤ m. When one of c_{j1}, ..., c_{jk} has degree bigger than one, the discriminant ∆(c_{j1}, ..., c_{jk}) is a polynomial in the coefficients of c_{j1}, ..., c_{jk}, such that, if ∆(c_{j1}, ..., c_{jk}) ≠ 0, then the equations

c_{j1}(x) = ··· = c_{jk}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every complex common solution u, the gradients of c_{j1}, ..., c_{jk} at u are linearly independent. When all of c_{j1}, ..., c_{jk} have degree one, the discriminant of the tuple (c_{j1}, ..., c_{jk}) is not a single polynomial, but we can define ∆(c_{j1}, ..., c_{jk}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

F2(c) = ∏_{(j1,...,jk)∈J2} ∆(c_{j1}, ..., c_{jk}).

Clearly, if F2(c) ≠ 0, then no n or fewer of the polynomials c1, ..., cm have a singular complex common zero.
Last, let F(c) = F1(c)·F2(c) and

U = {c = (c1, ..., cm) ∈ D : F(c) ≠ 0}.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c1, ..., cm can have a common complex zero; and, for any k (k ≤ n) polynomials of c1, ..., cm, if they have a complex common zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.
Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9), with Lagrange multiplier expressions, for solving the optimization problem (1.1). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.
The polynomials pi in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c1, ..., cm. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p1, ..., pm) to be the product L1(x)∇f(x), i.e.,

pi = (L1(x)∇f(x))_i.
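This construction can be carried out symbolically. As a concrete check (my SymPy sketch), using the data of Example 6.2 below, where L(x) = [x1/2, x2/2, x3/2, −1] for the single constraint x1² + x2² + x3² − 1 ≥ 0, the product p1 = (L1(x)∇f(x))_1 evaluates to exactly the factor appearing in that example's compactness argument:

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2   # Motzkin form
f = M + x1**4 + x2**4 + x3**4                                 # objective of Example 6.2
grad_f = sp.Matrix([sp.diff(f, v) for v in (x1, x2, x3)])

# L1(x) = first n = 3 columns of L(x) = [x1/2, x2/2, x3/2, -1]:
L1 = sp.Matrix([[x1/2, x2/2, x3/2]])
p1 = sp.expand((L1 * grad_f)[0])
assert p1 == sp.expand(3*M + 2*(x1**4 + x2**4 + x3**4))
```

The identity reflects Euler's relation: (1/2)x^T∇ maps the degree-6 part M to 3M and the quartic part to twice itself.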
In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
By Theorem 3.3, we have fk = fmin for all k big enough, if fc = fmin and any one of the conditions i)-iii) there holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, where the computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

min  x1x2(10 − x3)
s.t. x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, 1 − x1 − x2 − x3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x1, x2, x3) with x1x2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 6.1

 order k |       w/o LME        |      with LME
         | lower bound    time  | lower bound   time
    2    |  -0.0521     0.6841  |  -0.0521     0.1922
    3    |  -0.0026     0.2657  |  -3·10^{-8}  0.2285
    4    |  -0.0007     0.6785  |  -6·10^{-9}  0.4431
    5    |  -0.0004     1.6105  |  -2·10^{-9}  0.9567
² Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^{n} xi(1 − xi)² + Σ_{i≠j} xi²xj ∈ IQ(ceq, cin).
Example 6.2. Consider the optimization problem

min  x1⁴x2² + x1²x2⁴ + x3⁶ − 3x1²x2²x3² + (x1⁴ + x2⁴ + x3⁴)
s.t. x1² + x2² + x3² ≥ 1.

The matrix polynomial L(x) = [ x1/2  x2/2  x3/2  −1 ]. The objective f is the sum of the positive definite form x1⁴ + x2⁴ + x3⁴ and the Motzkin polynomial

M(x) = x1⁴x2² + x1²x2⁴ + x3⁶ − 3x1²x2²x3².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive, and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

c1(x)p1(x) = (x1² + x2² + x3² − 1)(3M(x) + 2(x1⁴ + x2⁴ + x3⁴)) = 0

defines a compact set. So, the condition i) of Theorem 3.3 is satisfied.³ The minimum value fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 2. Computational results for Example 6.2

 order k |        w/o LME         |     with LME
         | lower bound      time  | lower bound  time
    3    |     -∞         0.4466  |   0.1111    0.1169
    4    |     -∞         0.4948  |   0.3333    0.3499
    5    | -2.1821·10^5   1.1836  |   0.3333    0.6530
Example 6.3. Consider the optimization problem

min  x1x2 + x2x3 + x3x4 − 3x1x2x3x4 + (x1³ + ··· + x4³)
s.t. x1, x2, x3, x4 ≥ 0, 1 − x1 − x2 ≥ 0, 1 − x3 − x4 ≥ 0.

The matrix polynomial L(x) is

[ 1−x1   −x2    0      0      1 1 0 0 1 0 ]
[ −x1    1−x2   0      0      1 1 0 0 1 0 ]
[ 0      0      1−x3   −x4    0 0 1 1 0 1 ]
[ 0      0      −x3    1−x4   0 0 1 1 0 1 ]
[ −x1    −x2    0      0      1 1 0 0 1 0 ]
[ 0      0      −x3    −x4    0 0 1 1 0 1 ].

The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0, and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(ceq, cin) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.
³ This is because −c1²p1² ∈ Ideal(φ) ⊆ IQ(φ, ψ), and the set {−c1(x)²p1(x)² ≥ 0} is compact.
⁴ This is because 1 − x1² − x2² belongs to the quadratic module of (x1, x2, 1 − x1 − x2), and 1 − x3² − x4² belongs to the quadratic module of (x3, x4, 1 − x3 − x4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

 order k |        w/o LME          |       with LME
         | lower bound      time   | lower bound     time
    3    | -2.9·10^{-5}    0.7335  | -6·10^{-7}     0.6091
    4    | -1.4·10^{-5}    2.5055  | -8·10^{-8}     2.7423
    5    | -1.4·10^{-5}   12.7092  | -5·10^{-8}    13.7449
Example 6.4. Consider the polynomial optimization problem

min_{x∈R²}  x1² + 50x2²
s.t.  x1² − 1/2 ≥ 0,  x2² − 2x1x2 − 1/8 ≥ 0,  x2² + 2x1x2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The first column of L(x) is
[ 8x1³/5 + x1/5 ]
[ 288x2x1⁴/5 − 16x1³/5 − 124x2x1²/5 + 8x1/5 − 2x2 ]
[ −288x2x1⁴/5 − 16x1³/5 + 124x2x1²/5 + 8x1/5 + 2x2 ]

and the second column of L(x) is

[ −8x1²x2/5 + 4x2³/5 − x2/10 ]
[ 288x1³x2²/5 + 16x1²x2/5 − 142x1x2²/5 − 9x1/20 − 8x2³/5 + 11x2/5 ]
[ −288x1³x2²/5 + 16x1²x2/5 + 142x1x2²/5 + 9x1/20 − 8x2³/5 + 11x2/5 ].
The objective is coercive, so fmin is achieved at a critical point. The minimum value fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

 order k |     w/o LME        |     with LME
         | lower bound  time  | lower bound  time
    3    |   6.7535   0.4611  |   56.7500   0.1309
    4    |   6.9294   0.2428  |  112.6517   0.2405
    5    |   8.8519   0.3376  |  112.6517   0.2167
    6    |  16.5971   0.4703  |  112.6517   0.3788
    7    |  35.4756   0.6536  |  112.6517   0.4537
Example 6.5. Consider the optimization problem

min_{x∈R³}  x1³ + x2³ + x3³ + 4x1x2x3 − (x1(x2² + x3²) + x2(x3² + x1²) + x3(x1² + x2²))
s.t.  x1 ≥ 0, x1x2 − 1 ≥ 0, x2x3 − 1 ≥ 0.

The matrix polynomial L(x) is

[ 1−x1x2   0    0   x2   x2   0  ]
[ x1       0    0   −1   −1   0  ]
[ −x1      x2   0   1    0    −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R³₊, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

 order k |         w/o LME         |    with LME
         | lower bound       time  | lower bound  time
    2    |     -∞          0.4129  |    -∞       0.1900
    3    | -7.8184·10^6    0.4641  |  0.9492     0.3139
    4    | -2.0575·10^4    0.6499  |  0.9492     0.5057
Example 6.6. Consider the optimization problem (x0 := 1)

min_{x∈R⁴}  x^T x + Σ_{i=0}^{4} ∏_{j≠i} (xi − xj)
s.t.  x1² − 1 ≥ 0, x2² − 1 ≥ 0, x3² − 1 ≥ 0, x4² − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

(1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
(1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
(−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6

 order k |        w/o LME          |    with LME
         | lower bound       time  | lower bound   time
    3    |     -∞          1.1377  |   3.5480     1.1765
    4    | -6.6913·10^4    4.7677  |   4.0000     3.0761
    5    | -21.3778       22.9970  |   4.0000    10.3354
Example 6.7. Consider the optimization problem

min_{x∈R³}  x1⁴x2² + x2⁴x3² + x3⁴x1² − 3x1²x2²x3² + x2²
s.t.  x1 − x2x3 ≥ 0, −x2 + x3² ≥ 0.

The matrix polynomial

L(x) = [ 1     0    0  0  0 ]
       [ −x3   −1   0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are the points (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7

 order k |        w/o LME         |      with LME
         | lower bound      time  | lower bound     time
    3    |     -∞         0.6144  |     -∞         0.3418
    4    | -1.0909·10^7   1.0542  |  -3.9476       0.7180
    5    | -942.6772      1.6771  |  -3·10^{-9}    1.4607
    6    |  -0.0110       3.3532  |  -8·10^{-10}   3.1618
Example 6.8. Consider the optimization problem

min_{x∈R⁴}  x1²(x1 − x4)² + x2²(x2 − x4)² + x3²(x3 − x4)² + 2x1x2x3(x1 + x2 + x3 − 2x4) + (x1 − 1)² + (x2 − 1)² + (x3 − 1)²
s.t.  x1 − x2 ≥ 0, x2 − x3 ≥ 0.

The matrix polynomial

L(x) = [ 1  0  0  0  0  0 ]
       [ 1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

 order k |        w/o LME          |    with LME
         | lower bound       time  | lower bound  time
    2    |     -∞          0.3984  |  -0.3360    0.9321
    3    |     -∞          0.7634  |   0.9413    0.5240
    4    | -6.4896·10^5    4.5496  |   0.9413    1.7192
    5    | -3.1645·10^3   24.3665  |   0.9413    8.1228
Example 6.9. Consider the optimization problem

min_{x∈R⁴}  (x1 + x2 + x3 + x4 + 1)² − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
s.t.  0 ≤ x1, ..., x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.
⁵ Note that 4 − Σ_{i=1}^{4} xi² = Σ_{i=1}^{4} ( xi(1 − xi)² + (1 − xi)(1 + xi²) ) ∈ IQ(ceq, cin).
Table 9. Computational results for Example 6.9

 order   |       w/o LME          |      with LME
         | lower bound      time  | lower bound     time
    2    | -0.0279        0.2262  | -5·10^{-6}     1.1835
    3    | -0.0005        0.4691  | -6·10^{-7}     1.6566
    4    | -0.0001        3.1098  | -2·10^{-7}     5.5234
    5    | -4·10^{-5}    16.5092  | -6·10^{-7}    19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x0 := 1)

min_{x∈R^n}  Σ_{0≤i≤j≤k≤n} c_{ijk} xi xj xk
s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time that is consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different: we can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

     n     |    9       10      11       12       13       14
 w/o LME   | 1.2569   2.5619  6.3085  15.8722  35.1675  78.4111
 with LME  | 1.9714   3.8288  8.2519  20.0310  37.6373  82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, ..., gℓ). Its preordering is the set

Preord(cin, ψ) = Σ_{r1,...,rℓ ∈ {0,1}} g1^{r1} ··· gℓ^{rℓ} · Σ[x].
The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)  f'^{pre}_k := min ⟨f, y⟩
       s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{ceq}(y) = 0,  L^{(k)}_{φ}(y) = 0,
             L^{(k)}_{g1^{r1}···gℓ^{rℓ}}(y) ⪰ 0  for all r1, ..., rℓ ∈ {0, 1},
             y ∈ R^{N^n_{2k}}.
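As a side illustration (mine, not the paper's): the 2^ℓ products g1^{r1}···gℓ^{rℓ} that index the localizing conditions in (7.1) can be enumerated programmatically, here for a hypothetical tuple with ℓ = 2:

```python
import itertools
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
g = [1 - x1**2, x2]   # a hypothetical tuple (cin, psi) with ell = 2 (my example data)
products = [sp.expand(sp.Mul(*(gi**ri for gi, ri in zip(g, r))))
            for r in itertools.product([0, 1], repeat=len(g))]
print(len(products))  # 2^ell localizing conditions; r = (0,...,0) gives M_k(y) >= 0
```

This exponential growth in ℓ is the price paid for the stronger convergence guarantee of (7.1)-(7.2) compared with the quadratic-module relaxations.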
Similar to (3.9), the dual optimization problem of the above is
\[
(7.2)\quad f_{pre,k} := \max \ \gamma \quad \text{s.t.} \quad f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}.
\]
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.
Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then
\[
f_{pre,k} = f'_{pre,k} = f_c
\]
for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = fmin for all k big enough.
Proof. The proof is very similar to that of Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f̃(x) ≡ 0 on the set
\[
K_3 = \{x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ c_{in}(x) \ge 0, \ \psi(x) \ge 0\}.
\]
By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f̃^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same.
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence the Lagrange multipliers λ can be expressed as in (5.3). However, if c is singular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each application, a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get
\[
\lambda = \big(C(x)^T C(x)\big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}
\]
when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might admit rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
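On a toy instance, the least-squares formula above can be checked with exact arithmetic (a sketch under assumed data, not from the paper: one equality constraint c = x1 + x2 − 1 = 0 and f = x1² + x2², whose critical point is u = (1/2, 1/2)):

```python
from fractions import Fraction as F

# At u = (1/2, 1/2): grad f = (1, 1), grad c = (1, 1), c(u) = 0, so
# C(u) = [[1], [1], [0]] and the right-hand side is (grad f, 0).
u = (F(1, 2), F(1, 2))
grad_f = [2 * u[0], 2 * u[1]]
C = [[F(1)], [F(1)], [F(0)]]          # rows: grad c, then c(u)
rhs = grad_f + [F(0)]

# lambda = (C^T C)^{-1} C^T rhs; a 1x1 linear system here
ctc = sum(row[0] * row[0] for row in C)
ctr = sum(row[0] * r for row, r in zip(C, rhs))
lam = ctr / ctc
print(lam)  # 1, matching grad f = lam * grad c at u
```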
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References
[1] D Bertsekas Nonlinear programming second edition Athena Scientific 1995[2] J Bochnak M Coste and M-F Roy Real algebraic geometry Springer 1998[3] F Bugarin D Henrion and J Lasserre Minimizing the sum of many rational functions
Math Programming Comput 8(2016) pp 83ndash111[4] D Cox J Little and D OrsquoShea Ideals varieties and algorithms An introduction to com-
putational algebraic geometry and commutative algebra Third edition Undergraduate Textsin Mathematics Springer New York 1997
[5] R Curto and L Fialkow Truncated K-moment problems in several variables Journal ofOperator Theory 54(2005) pp 189-226
[6] J Demmel, J Nie and V Powers, Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals, Journal of Pure and Applied Algebra 209 (2007), no 1, pp 189–200
[7] E de Klerk and M Laurent On the Lasserre hierarchy of semidefinite programming re-laxations of convex polynomial optimization problems SIAM Journal On Optimization 21(2011) pp 824-832
[8] E de Klerk J Lasserre M Laurent and Z Sun Bound-constrained polynomial optimizationusing only elementary calculations Mathematics of Operations Research to appear
[9] E de Klerk M Laurent and Z Sun Convergence analysis for Lasserrersquos measure-based hier-archy of upper bounds for polynomial optimization Mathematical Programming to appear
[10] E de Klerk R Hess and M Laurent Improved convergence rates for Lasserre-type hi-erarchies of upper bounds for box-constrained polynomial optimization Preprint 2016arXiv160303329[mathOC]
[11] G-M Greuel and G Pfister A Singular introduction to commutative algebra Springer 2002[12] S He Z Luo J Nie and S Zhang Semidefinite relaxation bounds for indefinite homogeneous
quadratic optimization SIAM J Optim 19 (2) pp 503ndash523[13] D Henrion J Lasserre and J Loefberg GloptiPoly 3 moments optimization and semidef-
inite programming Optimization Methods and Software 24 (2009) no 4-5 pp 761ndash779[14] D Henrion and JB Lasserre Detecting global optimality and extracting solutions in Glop-
tiPoly Positive polynomials in control 293ndash310 Lecture Notes in Control and Inform Sci312 Springer Berlin 2005
[15] D Jibetean and E de Klerk Global optimization of rational functions A semidefinite pro-gramming approach Mathematical Programming 106 (2006) no 1 pp 93ndash109
[16] Janos Kollar, Sharp effective Nullstellensatz, J Amer Math Soc 1 (1988), no 4, pp 963–975 [17] JB Lasserre Global optimization with polynomials and the problem of moments SIAM J
Optim 11(3) pp 796ndash817 2001[18] J Lasserre M Laurent and P Rostalski Semidefinite characterization and computation of
zero-dimensional real radical ideals Foundations of Computational Mathematics 8(2008)no 5 pp 607ndash647
[19] J Lasserre A new look at nonnegativity on closed sets and polynomial optimization SIAMJ Optim 21 (2011) pp 864ndash885
[20] JB Lasserre Convexity in semi-algebraic geometry and polynomial optimization SIAM JOptim 19 (2009) pp 1995-2014
[21] JB Lasserre Moments positive polynomials and their applications Imperial College Press2009
[22] JB Lasserre Introduction to polynomial and semi-algebraic optimization Cambridge Uni-versity Press Cambridge 2015
[23] JB Lasserre KC Toh and S Yang A bounded degree SOS hierarchy for polynomial opti-mization Euro J Comput Optim to appear
[24] M Laurent Semidefinite representations for finite varieties Mathematical Programming 109(2007) pp 1ndash26
[25] M Laurent Sums of squares moment matrices and optimization over polynomials Emerg-ing Applications of Algebraic Geometry Vol 149 of IMA Volumes in Mathematics and itsApplications (Eds M Putinar and S Sullivant) Springer pages 157-270 2009
[26] M Laurent Optimization over polynomials Selected topics Proceedings of the InternationalCongress of Mathematicians 2014
[27] J Nie J Demmel and B Sturmfels Minimizing polynomials via sum of squares over thegradient ideal Mathematical Programming Ser A 106 (2006) no 3 pp 587ndash606
[28] J Nie J Demmel and M Gu Global minimization of rational functions and the nearestGCDs Journal of Global Optimization 40 (2008) no 4 pp 697ndash718
[29] J Nie Discriminants and Nonnegative Polynomials Journal of Symbolic ComputationVol 47 no 2 pp 167ndash191 2012
[30] J Nie An exact Jacobian SDP relaxation for polynomial optimization Mathematical Pro-gramming Ser A 137 (2013) pp 225ndash255
[31] J Nie Polynomial optimization with real varieties SIAM J Optim Vol 23 No 3 pp1634-1646 2013
[32] J Nie Certifying convergence of Lasserrersquos hierarchy via flat truncation Mathematical Pro-gramming Ser A 142 (2013) no 1-2 pp 485ndash510
[33] J Nie Optimality conditions and finite convergence of Lasserrersquos hierarchy MathematicalProgramming Ser A 146 (2014) no 1-2 pp 97ndash121
[34] J Nie Linear optimization with cones of moments and nonnegative polynomials Mathemat-ical Programming Ser B 153 (2015) no 1 pp 247ndash274
[35] P A Parrilo and B Sturmfels Minimizing polynomial functions In S Basu and L Gonzalez-Vega editors Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathe-matics and Computer Science volume 60 of DIMACS Series in Discrete Mathematics andComputer Science pages 83-99 AMS 2003
[36] M Putinar Positive polynomials on compact semi-algebraic sets Ind Univ Math J 42(1993) 969ndash984
[37] B Reznick Some concrete aspects of Hilbertrsquos 17th problem In Contemp Math Vol 253pp 251-272 American Mathematical Society 2000
[38] M Schweighofer Global optimization of polynomials using gradient tentacles and sums ofsquares SIAM J Optim 17 (2006) no 3 pp 920ndash942
[39] C Scheiderer Positivity and sums of squares A guide to recent results Emerging Appli-cations of Algebraic Geometry (M Putinar S Sullivant eds) IMA Volumes Math Appl149 Springer 2009 pp 271ndash324
[40] J F Sturm SeDuMi 102 A MATLAB toolbox for optimization over symmetric conesOptim Methods Softw 11 amp 12 (1999) pp 625ndash653
[41] B Sturmfels Solving systems of polynomial equations CBMS Regional Conference Series inMathematics 97 American Mathematical Society Providence RI 2002
[42] H H Vui and P T Son Global optimization of polynomials using the truncated tangencyvariety and sums of squares SIAM J Optim 19 (2008) no 2 pp 941ndash951
Department of Mathematics University of California at San Diego 9500 Gilman
Drive La Jolla CA USA 92093
E-mail address njwmathucsdedu
3.1. Tightness of the relaxations. Let cin, ψ, Kc, fc be as above. We refer to §2 for the notation Qmod(cin, ψ). We begin with a general assumption.
Assumption 3.2. There exists ρ ∈ Qmod(cin, ψ) such that if u ∈ Kc and f(u) < fc, then ρ(u) < 0.
In Assumption 3.2, the hypersurface ρ(x) = 0 separates feasible and infeasible critical points. Clearly, if u ∈ Kc is a feasible point for (3.7), then cin(u) ≥ 0 and ψ(u) ≥ 0, and hence ρ(u) ≥ 0. Assumption 3.2 generally holds. For instance, it is satisfied in the following general cases:

a) When there are no inequality constraints, cin and ψ are empty tuples. Then Qmod(cin, ψ) = Σ[x], and Assumption 3.2 is satisfied for ρ = 0.

b) Suppose the set Kc is finite, say, Kc = {u1, ..., uD}, and
\[
f(u_1), \dots, f(u_{t-1}) < f_c \le f(u_t), \dots, f(u_D).
\]
Let ℓ1, ..., ℓD be real interpolating polynomials such that ℓi(uj) = 1 for i = j and ℓi(uj) = 0 for i ≠ j. For each i = 1, ..., t−1, there must exist ji ∈ I such that c_{ji}(ui) < 0. Then the polynomial
\[
(3.12)\quad \rho = \sum_{i<t} \frac{-1}{c_{j_i}(u_i)}\, c_{j_i}(x)\, \ell_i(x)^2 + \sum_{i \ge t} \ell_i(x)^2
\]
satisfies Assumption 3.2.

c) For each x with f(x) = vi < fc, at least one of the constraints cj(x) ≥ 0, pj(x) ≥ 0 (j ∈ I) is violated. Suppose for each critical value vi < fc there exists gi ∈ {cj, pj}_{j∈I} such that
\[
g_i < 0 \quad \text{on} \quad K_c \cap \{f(x) = v_i\}.
\]
Let ϕ1, ..., ϕN be real univariate polynomials such that ϕi(vj) = 0 for i ≠ j and ϕi(vj) = 1 for i = j. Suppose vt = fc. Then the polynomial
\[
(3.13)\quad \rho = \sum_{i<t} g_i(x)\big(\varphi_i(f(x))\big)^2 + \sum_{i \ge t} \big(\varphi_i(f(x))\big)^2
\]
satisfies Assumption 3.2.
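The interpolation construction (3.12) can be illustrated on a hypothetical one-variable instance (the data below are assumptions for illustration, not from the paper): two critical points u1 = −1, u2 = 1 with f(u1) < fc = f(u2), and one constraint c(x) = x ≥ 0 violated at u1:

```python
from fractions import Fraction as F

u1, u2 = F(-1), F(1)
c = lambda x: x                        # c(u1) = -1 < 0, c(u2) = 1 >= 0
l1 = lambda x: (u2 - x) / (u2 - u1)    # interpolant: l1(u1) = 1, l1(u2) = 0
l2 = lambda x: (x - u1) / (u2 - u1)    # interpolant: l2(u1) = 0, l2(u2) = 1

# rho = (-1/c(u1)) * c(x) * l1(x)^2 + l2(x)^2, as in (312)
rho = lambda x: (F(-1) / c(u1)) * c(x) * l1(x) ** 2 + l2(x) ** 2
print(rho(u1), rho(u2))  # -1 1: negative exactly at the infeasible critical point
```

The coefficient −1/c(u1) is positive, so ρ lies in the quadratic module generated by c, as Assumption 3.2 requires.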
We refer to §2 for the archimedean condition and the notation IQ(h, g) as in (2.1). The following theorem is about the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose Kc ≠ ∅ and Assumption 3.1 holds. If
i) IQ(ceq, cin) + IQ(φ, ψ) is archimedean, or
ii) IQ(ceq, cin) is archimedean, or
iii) Assumption 3.2 holds,
then fk = f'_k = fc for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then fk = f'_k = fmin for all k big enough, provided one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. The condition ii) is only about the constraining polynomials of (1.1); it can be checked without φ, ψ. Clearly, the condition ii) implies the condition i).
The proof of Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties, and the objective f is a constant on each of them. We can get an SOS type representation for f on each subvariety, and then construct a single one for f over the entire set of critical points.
Proof of Theorem 3.3. Clearly, every point in the complex variety
\[
K_1 = \{x \in \mathbb{C}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0\}
\]
is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on Kc = K1 ∩ R^n, say, v1 < ··· < vN. Up to shifting f by a constant, we can further assume that fc = 0. Clearly, fc equals one of v1, ..., vN, say, vt = fc = 0.
Case I: Assume IQ(ceq, cin) + IQ(φ, ψ) is archimedean. Let
\[
I = \mathrm{Ideal}(c_{eq}, \phi)
\]
be the critical ideal. Note that K1 = V_C(I). The variety V_C(I) is a union of irreducible subvarieties, say, V1, ..., Vℓ. If Vi ∩ R^n ≠ ∅, then f is a real constant on Vi, which equals one of v1, ..., vN. This is implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_C(I):
\[
T_i = K_1 \cap \{f(x) = v_i\} \quad (i = t, \dots, N).
\]
Let T_{t−1} be the union of the irreducible subvarieties Vi such that either Vi ∩ R^n = ∅ or f ≡ vj on Vi with vj < vt = fc. Then it holds that
\[
V_{\mathbb{C}}(I) = T_{t-1} \cup T_t \cup \cdots \cup T_N.
\]
By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, I_t, ..., I_N ⊆ R[x] such that
\[
I = I_{t-1} \cap I_t \cap \cdots \cap I_N
\]
and T_i = V_C(I_i) for all i = t−1, t, ..., N. Denote the semialgebraic set
\[
(3.14)\quad S = \{x \in \mathbb{R}^n \mid c_{in}(x) \ge 0, \ \psi(x) \ge 0\}.
\]
For i = t−1, we have V_R(I_{t−1}) ∩ S = ∅, because v1, ..., v_{t−1} < fc. By the Positivstellensatz [2, Corollary 4.4.3], there exists p0 ∈ Preord(cin, ψ)¹ satisfying 2 + p0 ∈ I_{t−1}. Note that 1 + p0 > 0 on V_R(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(cin, ψ) is archimedean, because I ⊆ I_{t−1} and
\[
IQ(c_{eq}, c_{in}) + IQ(\phi, \psi) \subseteq I_{t-1} + \mathrm{Qmod}(c_{in}, \psi).
\]
By Theorem 2.1, we have
\[
p_1 := 1 + p_0 \in I_{t-1} + \mathrm{Qmod}(c_{in}, \psi).
\]
Then 1 + p1 ∈ I_{t−1}, so there exists p2 ∈ Qmod(cin, ψ) such that
\[
-1 \equiv p_1 \equiv p_2 \mod I_{t-1}.
\]
Since f = (f/4 + 1)² + (−1)·(f/4 − 1)², we have
\[
f \equiv \sigma_{t-1} := (f/4+1)^2 + p_2\,(f/4-1)^2 \mod I_{t-1}.
\]
So, when k is big enough, σ_{t−1} ∈ Qmod(cin, ψ)_{2k}.

¹It is the preordering of the polynomial tuple (cin, ψ); see §7.1.
For i = t, we have vt = 0, and f(x) vanishes on V_C(I_t). By Hilbert's Strong Nullstellensatz [4], there exists an integer mt > 0 such that f^{mt} ∈ I_t. Define the polynomial
\[
s_t(\varepsilon) = \sqrt{\varepsilon}\, \sum_{j=0}^{m_t-1} \binom{1/2}{j}\, \varepsilon^{-j} f^j.
\]
Then we have that
\[
D_1 := \varepsilon\,(1 + \varepsilon^{-1} f) - \big(s_t(\varepsilon)\big)^2 \equiv 0 \mod I_t.
\]
This is because, in the subtraction defining D1, after expanding (s_t(ε))², all the terms f^j with j < mt are cancelled, and f^j ∈ I_t for j ≥ mt. So D1 ∈ I_t. Let σ_t(ε) = s_t(ε)²; then f + ε − σ_t(ε) = D1 and
\[
(3.15)\quad f + \varepsilon - \sigma_t(\varepsilon) = \sum_{j=0}^{m_t-2} b_j(\varepsilon)\, f^{m_t+j}
\]
for some real scalars b_j(ε) depending on ε.
For each i = t+1, ..., N, we have vi > 0, and f(x)/vi − 1 vanishes on V_C(I_i). By Hilbert's Strong Nullstellensatz [4], there exists 0 < mi ∈ N such that (f/vi − 1)^{mi} ∈ I_i. Let
\[
s_i = \sqrt{v_i}\, \sum_{j=0}^{m_i-1} \binom{1/2}{j}\, \big(f/v_i - 1\big)^j.
\]
As in the case i = t, we can similarly show that f − s_i² ∈ I_i. Let σ_i = s_i²; then f − σ_i ∈ I_i.
Note that V_C(I_i) ∩ V_C(I_j) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, ..., a_N ∈ R[x] such that
\[
a_{t-1}^2 + \cdots + a_N^2 - 1 \in I, \qquad a_i \in \bigcap_{i \ne j \in \{t-1,\dots,N\}} I_j.
\]
For ε > 0, denote the polynomial
\[
\sigma_\varepsilon := \sigma_t(\varepsilon)\, a_t^2 + \sum_{t \ne j \in \{t-1,\dots,N\}} (\sigma_j + \varepsilon)\, a_j^2;
\]
then
\[
f + \varepsilon - \sigma_\varepsilon = (f+\varepsilon)\big(1 - a_{t-1}^2 - \cdots - a_N^2\big) + \sum_{t \ne i \in \{t-1,\dots,N\}} (f - \sigma_i)\, a_i^2 + \big(f + \varepsilon - \sigma_t(\varepsilon)\big)\, a_t^2.
\]
For each i ≠ t, we have f − σ_i ∈ I_i, so
\[
(f - \sigma_i)\, a_i^2 \in \bigcap_{j=t-1}^{N} I_j = I.
\]
Hence there exists k1 > 0 such that
\[
(f - \sigma_i)\, a_i^2 \in I_{2k_1} \quad (t \ne i \in \{t-1,\dots,N\}).
\]
Since f + ε − σ_t(ε) ∈ I_t, we also have
\[
\big(f + \varepsilon - \sigma_t(\varepsilon)\big)\, a_t^2 \in \bigcap_{j=t-1}^{N} I_j = I.
\]
Moreover, by the equation (3.15),
\[
\big(f + \varepsilon - \sigma_t(\varepsilon)\big)\, a_t^2 = \sum_{j=0}^{m_t-2} b_j(\varepsilon)\, f^{m_t+j}\, a_t^2.
\]
Each f^{mt+j} a_t² belongs to I, since f^{mt+j} ∈ I_t. So there exists k2 > 0 such that, for all ε > 0,
\[
\big(f + \varepsilon - \sigma_t(\varepsilon)\big)\, a_t^2 \in I_{2k_2}.
\]
Since 1 − a²_{t−1} − ··· − a²_N ∈ I, there also exists k3 > 0 such that, for all ε > 0,
\[
(f + \varepsilon)\big(1 - a_{t-1}^2 - \cdots - a_N^2\big) \in I_{2k_3}.
\]
Hence, if k* ≥ max{k1, k2, k3}, then we have
\[
f(x) + \varepsilon - \sigma_\varepsilon \in I_{2k^*}
\]
for all ε > 0. By the construction, the degrees of all σ_i and a_i are independent of ε, so σ_ε ∈ Qmod(cin, ψ)_{2k*} for all ε > 0, if k* is big enough. Note that
\[
I_{2k^*} + \mathrm{Qmod}(c_{in}, \psi)_{2k^*} = IQ(c_{eq}, c_{in})_{2k^*} + IQ(\phi, \psi)_{2k^*}.
\]
This implies that f_{k*} ≥ fc − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ fc, so f_{k*} = fc. Moreover, since fk is monotonically increasing, we must have fk = fc for all k ≥ k*.
Case II: Assume IQ(ceq, cin) is archimedean. Because
\[
IQ(c_{eq}, c_{in}) \subseteq IQ(c_{eq}, c_{in}) + IQ(\phi, \psi),
\]
the set IQ(ceq, cin) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion follows by applying the result of Case I.
Case III: Suppose Assumption 3.2 holds. Let ϕ1, ..., ϕN be real univariate polynomials such that ϕi(vj) = 0 for i ≠ j and ϕi(vj) = 1 for i = j. Let
\[
s = s_t + \cdots + s_N, \quad \text{where each} \quad s_i = (v_i - f_c)\big(\varphi_i(f)\big)^2.
\]
Then s ∈ Σ[x]_{2k_4} for some integer k4 > 0. Let
\[
\tilde{f} = f - f_c - s.
\]
We show that there exist an integer ℓ > 0 and q ∈ Qmod(cin, ψ) such that
\[
\tilde{f}^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi).
\]
This is because, by Assumption 3.2, f̃(x) ≡ 0 on the set
\[
K_2 = \{x \in \mathbb{R}^n :\ c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0\},
\]
which has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ N and q = b0 + ρb1 (b0, b1 ∈ Σ[x]) such that f̃^{2ℓ} + q ∈ Ideal(ceq, φ). By Assumption 3.2, ρ ∈ Qmod(cin, ψ), so we have q ∈ Qmod(cin, ψ).
For all ε > 0 and τ > 0, we have f̃ + ε = φ_ε + θ_ε, where
\[
\phi_\varepsilon = -\tau \varepsilon^{1-2\ell}\big(\tilde{f}^{2\ell} + q\big), \qquad
\theta_\varepsilon = \varepsilon\Big(1 + \tilde{f}/\varepsilon + \tau\,(\tilde{f}/\varepsilon)^{2\ell}\Big) + \tau \varepsilon^{1-2\ell} q.
\]
By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k5 such that, for all ε > 0,
\[
\phi_\varepsilon \in \mathrm{Ideal}(c_{eq}, \phi)_{2k_5}, \qquad \theta_\varepsilon \in \mathrm{Qmod}(c_{in}, \psi)_{2k_5}.
\]
Hence we can get
\[
f - (f_c - \varepsilon) = \phi_\varepsilon + \sigma_\varepsilon,
\]
where σ_ε = θ_ε + s ∈ Qmod(cin, ψ)_{2k_5} for all ε > 0. Note that
\[
IQ(c_{eq}, c_{in})_{2k_5} + IQ(\phi, \psi)_{2k_5} = \mathrm{Ideal}(c_{eq}, \phi)_{2k_5} + \mathrm{Qmod}(c_{in}, \psi)_{2k_5}.
\]
For all ε > 0, γ = fc − ε is feasible in (3.9) for the order k5, so f_{k5} ≥ fc. Because of (3.11) and the monotonicity of fk, we have fk = f'_k = fc for all k ≥ k5.
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is fc, and the optimal value of (1.1) is fmin. If fmin is achievable at a critical point, then fc = fmin. In Theorem 3.3, we have shown that fk = fc for all k big enough, where fk is the optimal value of (3.9). The value fc or fmin is often not known. How do we detect the tightness fk = fc in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let
\[
(3.16)\quad d = \lceil \deg(c_{eq}, c_{in}, \phi, \psi)/2 \rceil.
\]
If there exists an integer t ∈ [d, k] such that
\[
(3.17)\quad \mathrm{rank}\, M_t(y^*) = \mathrm{rank}\, M_{t-d}(y^*),
\]
then fk = fc, and we can get r := rank M_t(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers; it was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
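The rank condition (3.17) can be illustrated on a minimal univariate sketch (an illustration under assumed data, not the paper's code): for the moments of a Dirac measure, every truncated moment matrix is Hankel of rank one, so flatness holds:

```python
from fractions import Fraction as F

def rank(mat):
    # exact matrix rank by Gaussian elimination over the rationals
    m = [row[:] for row in mat]
    rows, cols, r = len(m), len(m[0]), 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(rows):
            if i != r and m[i][c] != 0:
                fac = m[i][c] / m[r][c]
                m[i] = [a - fac * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Moments y_j = a^j of the Dirac measure at a = 2, up to degree 4 (k = 2).
a = F(2)
y = [a ** j for j in range(5)]
M = lambda t: [[y[i + j] for j in range(t + 1)] for i in range(t + 1)]
d = 1                                  # assumed degree bound, as in (316)
print(rank(M(2)), rank(M(2 - d)))      # 1 1: the flat truncation (317) holds
```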
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:
i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.
In the following, assume (3.7) is feasible (i.e., fc < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:
iii) If (3.17) is satisfied for some t ∈ [d, k], then fk = fc.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17) for some t ∈ [d, k], when k is big enough.
Proof. By Assumption 3.1, u is a critical point if and only if ceq(u) = 0, φ(u) = 0.
i) For every feasible point u of (3.7), the tms [u]_2k (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.
ii) By Assumption 3.2, when (3.7) is infeasible, the set
\[
\{x \in \mathbb{R}^n :\ c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0\}
\]
is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(ceq, φ) + Qmod(ρ). By Assumption 3.2,
\[
\mathrm{Ideal}(c_{eq}, \phi) + \mathrm{Qmod}(\rho) \subseteq IQ(c_{eq}, c_{in}) + IQ(\phi, \psi).
\]
Thus, for all k big enough, (3.9) is unbounded from above. Hence (3.8) must be infeasible, by weak duality.
When (3.7) is feasible, f achieves finitely many values on Kc, so (3.7) must achieve its optimal value fc. By Theorem 3.3, we know that fk = f'_k = fc for all k big enough. For each minimizer u* of (3.7), the tms [u*]_2k is a minimizer of (3.8).
iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say, u1, ..., ur, such that fk = f(ui) for each i. Clearly, fk = f(ui) ≥ fc. On the other hand, we always have fk ≤ fc. So fk = fc.
iv) By Assumption 3.2, (3.7) is equivalent to the problem
\[
(3.18)\quad
\begin{aligned}
\min \ & f(x) \\
\text{s.t.} \ & c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0.
\end{aligned}
\]
The optimal value of (3.18) is also fc. Its kth order Lasserre relaxation is
\[
(3.19)\quad
\begin{aligned}
\gamma'_k := \min \ & \langle f, y \rangle \\
\text{s.t.} \ & \langle 1, y \rangle = 1, \ M_k(y) \succeq 0, \\
& L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\phi}(y) = 0, \ L^{(k)}_{\rho}(y) \succeq 0.
\end{aligned}
\]
Its dual optimization problem is
\[
(3.20)\quad \gamma_k := \max \ \gamma \quad \text{s.t.} \quad f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(\rho)_{2k}.
\]
By repeating the same proof as for Theorem 3.3(iii), we can show that
\[
\gamma_k = \gamma'_k = f_c
\]
for all k big enough. Because ρ ∈ Qmod(cin, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough.
If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].
4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron
\[
P = \{x \in \mathbb{R}^n \mid Ax - b \ge 0\},
\]
where A = [a1 ··· am]^T ∈ R^{m×n}, b = [b1 ··· bm]^T ∈ R^m. This corresponds to E = ∅, I = [m], and each ci(x) = a_i^T x − b_i. Denote
\[
(4.1)\quad D(x) = \mathrm{diag}\big(c_1(x), \dots, c_m(x)\big), \qquad C(x) = \begin{bmatrix} A^T \\ D(x) \end{bmatrix}.
\]
The Lagrange multiplier vector λ = [λ1 ··· λm]^T satisfies
\[
(4.2)\quad \begin{bmatrix} A^T \\ D(x) \end{bmatrix} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}.
\]
If rank A = m, we can express λ as
\[
(4.3)\quad \lambda = (AA^T)^{-1} A \nabla f(x).
\]
If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that
\[
(4.4)\quad L(x)\, C(x) = I_m,
\]
then we can get
\[
\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x)\, \nabla f(x),
\]
where L1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists and give a degree bound for it.
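For the full-row-rank case, formula (4.3) can be checked on a small hypothetical instance (the data are assumptions for illustration): minimize f = 3x1 + x2 over x1 ≥ 0, x1 + x2 ≥ 0, whose minimizer is x = 0 with both constraints active:

```python
from fractions import Fraction as F

A = [[F(1), F(0)], [F(1), F(1)]]     # rows a_1, a_2; rank A = m = 2
grad_f = [F(3), F(1)]                # gradient of f (constant here)

# lambda = (A A^T)^{-1} A grad_f, computed exactly for the 2x2 case
AAt = [[sum(p * q for p, q in zip(r1, r2)) for r2 in A] for r1 in A]
Agf = [sum(p * g for p, g in zip(r, grad_f)) for r in A]
det = AAt[0][0] * AAt[1][1] - AAt[0][1] * AAt[1][0]
inv = [[AAt[1][1] / det, -AAt[0][1] / det],
       [-AAt[1][0] / det, AAt[0][0] / det]]
lam = [sum(inv[i][j] * Agf[j] for j in range(2)) for i in range(2)]
print(lam)  # [2, 1]

# A^T lam reproduces grad f, so (42) holds at the minimizer x = 0,
# where D(0) = 0 makes the complementarity rows trivial.
At_lam = [sum(A[i][j] * lam[i] for i in range(2)) for j in range(2)]
print(At_lam == grad_f)  # True
```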
The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (see also Definition 5.1). This is equivalent to: for every u, if J(u) = {i1, ..., ik} (see (1.2) for the notation), then a_{i1}, ..., a_{ik} are linearly independent.
Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.
Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.
Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.
For a subset I = {i1, ..., i_{m−n}} of [m], denote
\[
c_I(x) = \prod_{i \in I} c_i(x), \qquad
E_I(x) = c_I(x) \cdot \mathrm{diag}\big(c_{i_1}(x)^{-1}, \dots, c_{i_{m-n}}(x)^{-1}\big),
\]
\[
D_I(x) = \mathrm{diag}\big(c_{i_1}(x), \dots, c_{i_{m-n}}(x)\big), \qquad
A_I = \big[a_{i_1} \ \cdots \ a_{i_{m-n}}\big]^T.
\]
For the case I = ∅ (the empty set), we set c_∅(x) = 1. Let
\[
V = \big\{ I \subseteq [m] \ :\ |I| = m-n, \ \mathrm{rank}\, A_{[m]\setminus I} = n \big\}.
\]
Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that
\[
(4.5)\quad L_I(x)\, C(x) = c_I(x)\, I_m.
\]
The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix of L_I whose row indices are from J and whose column indices are from K):
\[
(4.6)\quad
\begin{array}{c|ccc}
J \backslash K & [n] & n+I & n+([m]\setminus I) \\ \hline
I & 0 & E_I(x) & 0 \\
[m]\setminus I & c_I(x)\cdot\big(A_{[m]\setminus I}\big)^{-T} & -\big(A_{[m]\setminus I}\big)^{-T}\big(A_I\big)^T E_I(x) & 0
\end{array}
\]
Equivalently, the blocks of L_I are
\[
L_I\big(I, [n]\big) = 0, \quad
L_I\big(I, n+I\big) = E_I(x), \quad
L_I\big(I, n+([m]\setminus I)\big) = 0,
\]
\[
L_I\big([m]\setminus I, [n]\big) = c_I(x)\big(A_{[m]\setminus I}\big)^{-T}, \quad
L_I\big([m]\setminus I, n+I\big) = -\big(A_{[m]\setminus I}\big)^{-T}\big(A_I\big)^T E_I(x), \quad
L_I\big([m]\setminus I, n+([m]\setminus I)\big) = 0.
\]
For each I ∈ V, the matrix A_{[m]\I} is invertible; the superscript −T denotes the inverse of the transpose. Let G = L_I(x)C(x); then one can verify that
\[
G(I, I) = E_I(x)\, D_I(x) = c_I(x)\, I_{m-n}, \qquad G(I, [m]\setminus I) = 0,
\]
\[
G([m]\setminus I, [m]\setminus I) =
\Big[\, c_I(x)\big(A_{[m]\setminus I}\big)^{-T} \ \ -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \,\Big]
\begin{bmatrix} \big(A_{[m]\setminus I}\big)^T \\ 0 \end{bmatrix}
= c_I(x)\, I_n,
\]
\[
G([m]\setminus I, I) =
\Big[\, c_I(x)\big(A_{[m]\setminus I}\big)^{-T} \ \ -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \,\Big]
\begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix}
= 0.
\]
This shows that the above L_I(x) satisfies (4.5).
Step II: We show that there exist real scalars ν_I satisfying
\[
(4.7)\quad \sum_{I \in V} \nu_I\, c_I(x) = 1.
\]
This can be shown by induction on m.
• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.
• When m > n, let
\[
(4.8)\quad N = \{ i \in [m] \ :\ \mathrm{rank}\, A_{[m]\setminus \{i\}} = n \}.
\]
For each i ∈ N, let V_i be the set of all I' ⊆ [m]\{i} such that |I'| = m−n−1 and rank A_{[m]\(I'∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I'} satisfying
\[
(4.9)\quad \sum_{I' \in V_i} \nu^{(i)}_{I'}\, c_{I'}(x) = 1.
\]
Since rank A = n, we can generally assume that {a1, ..., an} is linearly independent. So there exist scalars α1, ..., αn such that
\[
a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.
\]
If all αi = 0, then am = 0, and hence A can be replaced by its first m−1 rows; so (4.7) is true by induction. In the following, suppose at least one αi ≠ 0, and write
\[
\{i : \alpha_i \ne 0\} = \{i_1, \dots, i_k\}.
\]
Then a_{i1}, ..., a_{ik}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system
\[
c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0
\]
has no solutions. Hence there exist real scalars µ1, ..., µ_{k+1} such that
\[
\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.
\]
(This can be implied by the echelon form of inconsistent linear systems.) Note that i1, ..., i_{k+1} ∈ N. For each j = 1, ..., k+1, by (4.9),
\[
\sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'}\, c_{I'}(x) = 1.
\]
Then we can get
\[
1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x)
= \sum_{j=1}^{k+1} \mu_j \sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'}\, c_{i_j}(x)\, c_{I'}(x)
= \sum_{\substack{I = I' \cup \{i_j\} \\ I' \in V_{i_j},\ 1 \le j \le k+1}} \nu^{(i_j)}_{I'}\, \mu_j\, c_I(x).
\]
Since each I' ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.
Step III: For L_I(x) as in (4.5), we construct L(x) as
\[
(4.10)\quad L(x) = \sum_{I \in V} \nu_I\, L_I(x).
\]
Clearly, L(x) satisfies (4.4), because
\[
L(x)\, C(x) = \sum_{I \in V} \nu_I\, L_I(x)\, C(x) = \sum_{I \in V} \nu_I\, c_I(x)\, I_m = I_m.
\]
Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n.
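Steps I-III can be made concrete on a small worked illustration (my own example, not from the paper): the interval constraints c1 = 1 − x ≥ 0, c2 = 1 + x ≥ 0, with m = 2, n = 1, a1 = −1, a2 = 1. Here V = {{1}, {2}}, the block formula gives L_{1} and L_{2} with L_I C = c_I I_2, and the weights ν_{1} = ν_{2} = 1/2 satisfy (1/2)c1 + (1/2)c2 = 1, so the combined L obeys LC = I_2:

```python
from fractions import Fraction as F

def C(x):   # C(x) = [A^T; D(x)], a 3x2 matrix
    return [[F(-1), F(1)], [1 - x, F(0)], [F(0), 1 + x]]

def L(x):   # L = (1/2)*L_{1} + (1/2)*L_{2}, built from the block formula
    L1 = [[F(0), F(1), F(0)], [1 - x, F(1), F(0)]]       # L1 C = (1-x) I_2
    L2 = [[-(1 + x), F(0), F(1)], [F(0), F(0), F(1)]]    # L2 C = (1+x) I_2
    return [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(L1, L2)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

I2 = [[F(1), F(0)], [F(0), F(1)]]
# The identity L(x)C(x) = I_2 is polynomial, so exact checks at sample
# points (degree considerations make a few points convincing) suffice.
ok = all(matmul(L(x), C(x)) == I2 for x in [F(0), F(1, 3), F(-7), F(5, 2)])
print(ok)  # True
```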
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does exist, a degree bound for L(x) is m − rank A. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given A, b, a matrix polynomial L(x) satisfying (4.4) can be determined from linear equations, obtained by matching coefficients on both sides. In the following, we give some examples of L(x) with L(x)C(x) = I_m for polyhedral sets.
Example 4.2. Consider the simplicial set
\[
\{x_1 \ge 0,\ \dots,\ x_n \ge 0,\ 1 - e^T x \ge 0\}.
\]
The equation L(x)C(x) = I_{n+1} is satisfied by
\[
L(x) =
\begin{bmatrix}
1-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
-x_1 & 1-x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
-x_1 & -x_2 & \cdots & 1-x_n & 1 & \cdots & 1 \\
-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1
\end{bmatrix}.
\]
Example 4.3. Consider the box constraint
\[
\{x_1 \ge 0,\ \dots,\ x_n \ge 0,\ 1-x_1 \ge 0,\ \dots,\ 1-x_n \ge 0\}.
\]
The equation L(x)C(x) = I_{2n} is satisfied by
\[
L(x) =
\begin{bmatrix}
I_n - \mathrm{diag}(x) & I_n & I_n \\
-\mathrm{diag}(x) & I_n & I_n
\end{bmatrix}.
\]
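The identity in Example 4.3 (the matrix used in Example 6.10) can be verified exactly for n = 2 (a sketch; the sample points are arbitrary):

```python
from fractions import Fraction as F
import random

# For n = 2: constraints x1 >= 0, x2 >= 0, 1 - x1 >= 0, 1 - x2 >= 0,
# so m = 2n = 4, A = [I_2; -I_2], and C(x) = [A^T; D(x)] is 6x4.
def C(x1, x2):
    return [[F(1), F(0), F(-1), F(0)],
            [F(0), F(1), F(0), F(-1)],
            [x1, F(0), F(0), F(0)],
            [F(0), x2, F(0), F(0)],
            [F(0), F(0), 1 - x1, F(0)],
            [F(0), F(0), F(0), 1 - x2]]

def L(x1, x2):   # [I - diag(x), I, I; -diag(x), I, I]
    return [[1 - x1, F(0), F(1), F(0), F(1), F(0)],
            [F(0), 1 - x2, F(0), F(1), F(0), F(1)],
            [-x1, F(0), F(1), F(0), F(1), F(0)],
            [F(0), -x2, F(0), F(1), F(0), F(1)]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

I4 = [[F(int(i == j)) for j in range(4)] for i in range(4)]
rng = random.Random(1)
pts = [(F(rng.randrange(-9, 9)), F(rng.randrange(-9, 9))) for _ in range(5)]
ok = all(matmul(L(*p), C(*p)) == I4 for p in pts)
print(ok)  # True: L(x)C(x) = I_4 at every sampled point
```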
Example 4.4. Consider the polyhedral set
\[
\{1-x_4 \ge 0,\ x_4-x_3 \ge 0,\ x_3-x_2 \ge 0,\ x_2-x_1 \ge 0,\ x_1+1 \ge 0\}.
\]
The equation L(x)C(x) = I_5 is satisfied by
\[
L(x) = \frac{1}{2}
\begin{bmatrix}
-x_1-1 & -x_2-1 & -x_3-1 & -x_4-1 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & -x_2-1 & -x_3-1 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & -x_2-1 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
1-x_1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}.
\]
Example 4.5. Consider the polyhedral set
\[
\{1+x_1 \ge 0,\ 1-x_1 \ge 0,\ 2-x_1-x_2 \ge 0,\ 2-x_1+x_2 \ge 0\}.
\]
The matrix L(x) satisfying L(x)C(x) = I_4 is
\[
\frac{1}{6}
\begin{bmatrix}
x_1^2-3x_1+2 & x_1x_2-x_2 & 4-x_1 & 2-x_1 & 1-x_1 & 1-x_1 \\
3x_1^2-3x_1-6 & 3x_2+3x_1x_2 & 6-3x_1 & -3x_1 & -3x_1-3 & -3x_1-3 \\
1-x_1^2 & -2x_2-x_1x_2-3 & x_1-1 & x_1+1 & x_1+2 & x_1+2 \\
1-x_1^2 & 3-x_1x_2-2x_2 & x_1-1 & x_1+1 & x_1+2 & x_1+2
\end{bmatrix}.
\]
5. General constraints

We now consider general nonlinear constraints, as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λi as polynomial functions in x on the set of critical points.
Suppose there are m equality and inequality constraints in total, i.e., E ∪ I = {1, ..., m}. If (x, λ) is a critical pair, then λi ci(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ1 ··· λm]^T satisfies the equation
\[
(5.1)\quad
\underbrace{
\begin{bmatrix}
\nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\
c_1(x) & 0 & \cdots & 0 \\
0 & c_2(x) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & c_m(x)
\end{bmatrix}}_{C(x)}
\ \lambda =
\begin{bmatrix}
\nabla f(x) \\ 0 \\ \vdots \\ 0
\end{bmatrix}.
\]
Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that
\[
(5.2)\quad L(x)\, C(x) = I_m,
\]
then we can get
\[
(5.3)\quad \lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x)\, \nabla f(x),
\]
where L1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.

Definition 5.1. The tuple c = (c1, ..., cm) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to: for each u ∈ C^n, if J(u) = {i1, ..., ik} (see (1.2) for the notation), then the gradients ∇c_{i1}(u), ..., ∇c_{ik}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that
\[
P(x)\, W(x) = I_t.
\]
Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ C^n,
\[
t = \mathrm{rank}\, I_t \le \mathrm{rank}\, W(u) \le t.
\]
So W(x) must have full column rank everywhere.
"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:
\[
W(x) = \begin{bmatrix} w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix}.
\]
Then the equation w1(x) = 0 has no complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ1(x) ∈ C[x]^s such that ξ1(x)^T w1(x) = 1. For each i = 2, ..., t, denote
\[
r_{1i}(x) = \xi_1(x)^T w_i(x);
\]
then (we use ∼ to denote row equivalence between matrices)
\[
W(x) \sim
\begin{bmatrix}
1 & r_{12}(x) & \cdots & r_{1t}(x) \\
w_1(x) & w_2(x) & \cdots & w_t(x)
\end{bmatrix}
\sim W_1(x) =
\begin{bmatrix}
1 & r_{12}(x) & \cdots & r_{1t}(x) \\
0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x)
\end{bmatrix},
\]
where, for each i = 2, ..., t,
\[
w_i^{(1)}(x) = w_i(x) - r_{1i}(x)\, w_1(x).
\]
So there exists P1(x) ∈ C[x]^{(s+1)×s} such that
\[
P_1(x)\, W(x) = W_1(x).
\]
Since W(x) and W1(x) are row equivalent, W1(x) must also have full column rank everywhere. Similarly, the polynomial equation
\[
w_2^{(1)}(x) = 0
\]
has no complex solution. Again by Hilbert's Weak Nullstellensatz [4], there exists ξ2(x) ∈ C[x]^s such that
\[
\xi_2(x)^T w_2^{(1)}(x) = 1.
\]
For each i = 3, ..., t, let r_{2i}(x) = ξ2(x)^T w_i^{(1)}(x); then
\[
W_1(x) \sim
\begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x)
\end{bmatrix}
\sim W_2(x) =
\begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x)
\end{bmatrix},
\]
where, for each i = 3, ..., t,
\[
w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x)\, w_2^{(1)}(x).
\]
Similarly, W1(x) and W2(x) are row equivalent, so W2(x) has full column rank everywhere. There exists P2(x) ∈ C[x]^{(s+2)×(s+1)} such that
\[
P_2(x)\, W_1(x) = W_2(x).
\]
Continuing this process, we finally get

    W2(x) ∼ ··· ∼ Wt(x) = [ 1  r_{12}(x)  r_{13}(x)  ···  r_{1t}(x)
                            0  1          r_{23}(x)  ···  r_{2t}(x)
                            0  0          1          ···  r_{3t}(x)
                            ⋮                         ⋱   ⋮
                            0  0          0          ···  1
                            0  0          0          ···  0         ].

Consequently, for each i = 1, 2, ..., t there exists Pi(x) ∈ C[x]^{(s+i)×(s+i−1)} such that

    Pt(x)P_{t−1}(x)···P1(x)W(x) = Wt(x).
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 19
Since Wt(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x)Wt(x) = I_t. Let

    P(x) = P_{t+1}(x)Pt(x)P_{t−1}(x)···P1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (where P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.
(ii) The conclusion follows directly from item (i).
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2); to the best of the author's knowledge, this question is mostly open. However, once a degree is chosen for L(x), such an L(x) can be determined by comparing coefficients on both sides of (5.2), which amounts to solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
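To illustrate the coefficient-comparison approach just described (this sketch is illustrative, not from the paper), take the single constraint 1 − x^2 ≥ 0, so that C(x) = [−2x, 1 − x^2]^T, and look for L(x) = [a + bx, c + dx]. Matching the coefficients of 1, x, x^2, x^3 in L(x)C(x) = 1 gives a 4×4 linear system, solved here in pure Python:

```python
# Solve for L(x) = [a + b*x, c + d*x] with L(x)C(x) = 1, where
# C(x) = [-2x, 1 - x^2]^T for the single constraint 1 - x^2 >= 0.
# Matching coefficients of 1, x, x^2, x^3 in
#   (a + b*x)(-2x) + (c + d*x)(1 - x^2) = 1
# yields the linear system A*[a, b, c, d]^T = [1, 0, 0, 0]^T.

def solve_linear(A, rhs):
    """Gaussian elimination with partial pivoting (pure Python)."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

# Rows: coefficients of 1, x, x^2, x^3; columns: unknowns a, b, c, d.
A = [
    [ 0.0,  0.0,  1.0,  0.0],   # constant term:  c = 1
    [-2.0,  0.0,  0.0,  1.0],   # x:   -2a + d = 0
    [ 0.0, -2.0, -1.0,  0.0],   # x^2: -2b - c = 0
    [ 0.0,  0.0,  0.0, -1.0],   # x^3: -d = 0
]
a, b, c, d = solve_linear(A, [1.0, 0.0, 0.0, 0.0])
# Solution: L(x) = [-x/2, 1], matching Example 5.3 with n = 1.
```

The recovered L(x) = [−x/2, 1] agrees with Example 5.3 specialized to one variable.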
Example 5.3. Consider the hypercube with the quadratic constraints

    1 − x1^2 ≥ 0,  1 − x2^2 ≥ 0,  ...,  1 − xn^2 ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −(1/2)diag(x)   I_n ].
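This identity can be checked numerically; the sketch below (illustrative Python, assuming C(x) stacks the gradients ∇c1(x), ..., ∇cn(x) on top of diag(c1(x), ..., cn(x)), as defined earlier in the paper) evaluates the residual of L(x)C(x) − I_n at random points:

```python
# Numerical check of L(x)C(x) = I_n for the hypercube constraints
# c_i = 1 - x_i^2, with L(x) = [-1/2 diag(x), I_n].
import random

def check_hypercube(n, trials=100):
    err = 0.0
    for _ in range(trials):
        x = [random.uniform(-2, 2) for _ in range(n)]
        for i in range(n):        # row i of L(x)C(x)
            for j in range(n):    # column j
                # gradient block: (-x_i/2) * d(c_j)/d(x_i), nonzero only if i == j
                grad = (-x[i] / 2) * (-2 * x[j] if i == j else 0.0)
                # identity block of L times diag(c): picks up c_j on the diagonal
                diag = (1 - x[j] ** 2) if i == j else 0.0
                target = 1.0 if i == j else 0.0
                err = max(err, abs(grad + diag - target))
    return err

max_err = check_hypercube(5)
```

On the diagonal, the two blocks contribute x_i^2 and 1 − x_i^2, which sum to 1, so the residual vanishes identically.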
Example 5.4. Consider the nonnegative portion of the unit sphere:

    x1 ≥ 0,  x2 ≥ 0,  ...,  xn ≥ 0,  x1^2 + ··· + xn^2 − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − xx^T     x·1_n^T        2x
             (1/2)x^T      −(1/2)1_n^T    −1  ].
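As a sanity check, the identity L(x)C(x) = I_{n+1} can be verified at random points; the Python sketch below (illustrative only) assumes C(x) stacks the gradients [e1, ..., en, 2x] over diag(x1, ..., xn, x^T x − 1):

```python
# Numerical check of L(x)C(x) = I_{n+1} for the nonnegative sphere portion.
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def check_sphere(n, trials=50):
    err = 0.0
    for _ in range(trials):
        x = [random.uniform(-2, 2) for _ in range(n)]
        s = sum(v * v for v in x)
        # L(x): (n+1) x (2n+1), blocks [I - xx^T, x*1^T, 2x; x^T/2, -1^T/2, -1]
        L = [[(1.0 if i == j else 0.0) - x[i] * x[j] for j in range(n)]
             + [x[i]] * n + [2 * x[i]] for i in range(n)]
        L.append([v / 2 for v in x] + [-0.5] * n + [-1.0])
        # C(x): (2n+1) x (n+1), gradients stacked over diag of constraint values
        C = [[(1.0 if i == j else 0.0) for j in range(n)] + [2 * x[i]] for i in range(n)]
        C += [[x[i] if i == j else 0.0 for j in range(n)] + [0.0] for i in range(n)]
        C.append([0.0] * n + [s - 1.0])
        P = matmul(L, C)
        for i in range(n + 1):
            for j in range(n + 1):
                err = max(err, abs(P[i][j] - (1.0 if i == j else 0.0)))
    return err

max_err = check_sphere(4)
```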
Example 5.5. Consider the set defined by

    1 − x1^3 − x2^4 ≥ 0,  1 − x3^4 − x4^3 ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

    L(x) = [ −x1/3   −x2/4   0       0       1   0
             0       0       −x3/4   −x4/3   0   1 ].
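The identity L(x)C(x) = I_2 for this example can likewise be verified numerically (illustrative Python; C(x) is assumed to stack the two constraint gradients over diag(c1, c2)):

```python
# Numerical check of L(x)C(x) = I_2 for the constraints
# c1 = 1 - x1^3 - x2^4 and c2 = 1 - x3^4 - x4^3.
import random

def check_ex55(trials=100):
    err = 0.0
    for _ in range(trials):
        x1, x2, x3, x4 = (random.uniform(-1.5, 1.5) for _ in range(4))
        c1 = 1 - x1**3 - x2**4
        c2 = 1 - x3**4 - x4**3
        # Columns of C(x): gradient of c_j stacked over the j-th diagonal entry
        C = [[-3*x1**2, 0.0], [-4*x2**3, 0.0], [0.0, -4*x3**3], [0.0, -3*x4**2],
             [c1, 0.0], [0.0, c2]]
        L = [[-x1/3, -x2/4, 0.0, 0.0, 1.0, 0.0],
             [0.0, 0.0, -x3/4, -x4/3, 0.0, 1.0]]
        for i in range(2):
            for j in range(2):
                v = sum(L[i][k] * C[k][j] for k in range(6))
                err = max(err, abs(v - (1.0 if i == j else 0.0)))
    return err

max_err = check_ex55()
```

For instance, the (1,1) entry is x1^3 + x2^4 + c1 = 1, as required.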
Example 5.6. Consider the quadratic set defined by

    1 − x1x2 − x2x3 − x1x3 ≥ 0,  1 − x1^2 − x2^2 − x3^2 ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

    [ 25x1^3 + 10x1^2x2 + 40x1x2^2 − 25x1 − 2x3      −25x1^3 − 10x1^2x2 − 40x1x2^2 + (49/2)x1 + 2x3
      −15x1^2x2 + 10x1x2^2 + 20x3x1x2 − 10x1          15x1^2x2 − 10x1x2^2 − 20x3x1x2 + 10x1 − x2/2
      25x3x1^2 − 20x1x2^2 + 10x3x1x2 + 2x1           −25x3x1^2 + 20x1x2^2 − 10x3x1x2 − 2x1 − x3/2
      1 − 20x1x3 − 10x1^2 − 20x1x2                    20x1x2 + 20x1x3 + 10x1^2
      −50x1^2 − 20x2x1                                50x1^2 + 20x2x1 + 1                            ].
We would like to remark that a polynomial tuple c = (c1, ..., cm) is generically nonsingular, and that Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d1, ..., dm, there exists an open dense subset U of D := R[x]_{d1} × ··· × R[x]_{dm} such that every tuple c = (c1, ..., cm) ∈ U is nonsingular. Indeed, U can be chosen to be a Zariski open subset of D, i.e., the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof uses resultants and discriminants, for which we refer to [29].

First, let J1 be the set of all tuples (i1, ..., i_{n+1}) with 1 ≤ i1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i1}, ..., c_{i_{n+1}} such that, if Res(c_{i1}, ..., c_{i_{n+1}}) ≠ 0, then the equations

    c_{i1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solution. Define

    F1(c) = ∏_{(i1,...,i_{n+1}) ∈ J1} Res(c_{i1}, ..., c_{i_{n+1}}).
For the case m ≤ n, we have J1 = ∅ and we simply let F1(c) = 1. Clearly, if F1(c) ≠ 0, then no more than n of the polynomials c1, ..., cm have a common complex zero.
Second, let J2 be the set of all tuples (j1, ..., jk) with k ≤ n and 1 ≤ j1 < ··· < jk ≤ m. When one of c_{j1}, ..., c_{jk} has degree greater than one, the discriminant ∆(c_{j1}, ..., c_{jk}) is a polynomial in the coefficients of c_{j1}, ..., c_{jk} such that, if ∆(c_{j1}, ..., c_{jk}) ≠ 0, then the equations

    c_{j1}(x) = ··· = c_{jk}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j1}, ..., c_{jk} at u are linearly independent. When all of c_{j1}, ..., c_{jk} have degree one, the discriminant of the tuple (c_{j1}, ..., c_{jk}) is not a single polynomial, but we can define ∆(c_{j1}, ..., c_{jk}) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

    F2(c) = ∏_{(j1,...,jk) ∈ J2} ∆(c_{j1}, ..., c_{jk}).
Clearly, if F2(c) ≠ 0, then no n or fewer polynomials of c1, ..., cm have a singular common complex zero.
Last, let F(c) = F1(c)F2(c) and

    U = {c = (c1, ..., cm) ∈ D : F(c) ≠ 0}.

Note that U is a Zariski open subset of D, and it is open and dense in D. For all c ∈ U, no more than n of c1, ..., cm can have a common complex zero, and for any k polynomials (k ≤ n) of c1, ..., cm, if they have a common complex zero, say u, then their gradients at u are linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically.
6 Numerical examples
This section gives examples of applying the new relaxations (3.8)-(3.9) to solve the optimization problem (1.1), using Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved with the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.
The polynomials pi in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c1, ..., cm. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p1, ..., pm) to be the product L1(x)∇f(x), i.e.,

    pi = ( L1(x)∇f(x) )_i.

In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for every ℓ) and the constraint qualification condition holds.
By Theorem 3.3, we have fk = fmin for all k big enough if fc = fmin and any one of the conditions i)-iii) there holds. Typically, it might be inconvenient to check these conditions; however, in computation we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage: when there are finitely many global minimizers, Theorem 3.4 proves that (3.17) is an appropriate criterion for detecting convergence. It is satisfied in all our examples except Examples 6.1, 6.7 and 6.9 (which have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and those given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

    min   x1x2(10 − x3)
    s.t.  x1 ≥ 0,  x2 ≥ 0,  x3 ≥ 0,  1 − x1 − x2 − x3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x1, x2, x3) with x1x2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 6.1

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       2    |  −0.0521,  0.6841            |  −0.0521,      0.1922
       3    |  −0.0026,  0.2657            |  −3·10^{-8},   0.2285
       4    |  −0.0007,  0.6785            |  −6·10^{-9},   0.4431
       5    |  −0.0004,  1.6105            |  −2·10^{-9},   0.9567

² Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^n xi(1 − xi)^2 + Σ_{i≠j} xi^2 xj ∈ IQ(ceq, cin).
Example 6.2. Consider the optimization problem

    min   x1^4 x2^2 + x1^2 x2^4 + x3^6 − 3x1^2 x2^2 x3^2 + (x1^4 + x2^4 + x3^4)
    s.t.  x1^2 + x2^2 + x3^2 ≥ 1.

The matrix polynomial is L(x) = [ (1/2)x1  (1/2)x2  (1/2)x3  −1 ]. The objective f is the sum of the positive definite form x1^4 + x2^4 + x3^4 and the Motzkin polynomial

    M(x) = x1^4 x2^2 + x1^2 x2^4 + x3^6 − 3x1^2 x2^2 x3^2.

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive, and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c1(x)p1(x) = (x1^2 + x2^2 + x3^2 − 1)( 3M(x) + 2(x1^4 + x2^4 + x3^4) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value is fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 2. Computational results for Example 6.2

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       3    |  −∞,            0.4466       |  0.1111,  0.1169
       4    |  −∞,            0.4948       |  0.3333,  0.3499
       5    |  −2.1821·10^5,  1.1836       |  0.3333,  0.6530
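As a quick sanity check on the reported optimum (an illustrative Python sketch, not part of the paper's MATLAB computation), one can evaluate the objective at the 8 candidate minimizers (±1/√3, ±1/√3, ±1/√3), which are feasible since their squared norm equals 1:

```python
# Evaluate f = Motzkin(x) + x1^4 + x2^4 + x3^4 at the 8 reported minimizers.
from itertools import product
from math import sqrt

def f(x1, x2, x3):
    motzkin = x1**4 * x2**2 + x1**2 * x2**4 + x3**6 - 3 * x1**2 * x2**2 * x3**2
    return motzkin + x1**4 + x2**4 + x3**4

t = 1 / sqrt(3)
vals = [f(s1 * t, s2 * t, s3 * t) for s1, s2, s3 in product([1, -1], repeat=3)]
# At these points the Motzkin part vanishes, leaving x1^4 + x2^4 + x3^4 = 1/3.
```

All 8 values equal 1/3, matching fmin = 1/3.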
Example 6.3. Consider the optimization problem

    min   x1x2 + x2x3 + x3x4 − 3x1x2x3x4 + (x1^3 + ··· + x4^3)
    s.t.  x1, x2, x3, x4 ≥ 0,  1 − x1 − x2 ≥ 0,  1 − x3 − x4 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x1   −x2    0      0     1 1 0 0 1 0
      −x1    1−x2   0      0     1 1 0 0 1 0
      0      0      1−x3   −x4   0 0 1 1 0 1
      0      0      −x3    1−x4  0 0 1 1 0 1
      −x1    −x2    0      0     1 1 0 0 1 0
      0      0      −x3    −x4   0 0 1 1 0 1 ].

The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0, and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(ceq, cin) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.
³ This is because −c1^2 p1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c1(x)^2 p1(x)^2 ≥ 0} is compact.
⁴ This is because 1 − x1^2 − x2^2 belongs to the quadratic module of (x1, x2, 1 − x1 − x2) and 1 − x3^2 − x4^2 belongs to the quadratic module of (x3, x4, 1 − x3 − x4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       3    |  −2.9·10^{-5},   0.7335      |  −6·10^{-7},   0.6091
       4    |  −1.4·10^{-5},   2.5055      |  −8·10^{-8},   2.7423
       5    |  −1.4·10^{-5},  12.7092      |  −5·10^{-8},  13.7449
Example 6.4. Consider the polynomial optimization problem

    min_{x∈R^2}   x1^2 + 50x2^2
    s.t.  x1^2 − 1/2 ≥ 0,  x2^2 − 2x1x2 − 1/8 ≥ 0,  x2^2 + 2x1x2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The first column of L(x) is

    [ (8/5)x1^3 + (1/5)x1
      (288/5)x2x1^4 − (16/5)x1^3 − (124/5)x2x1^2 + (8/5)x1 − 2x2
      −(288/5)x2x1^4 − (16/5)x1^3 + (124/5)x2x1^2 + (8/5)x1 + 2x2 ],
and the second column of L(x) is

    [ −(8/5)x1^2x2 + (4/5)x2^3 − (1/10)x2
      (288/5)x1^3x2^2 + (16/5)x1^2x2 − (142/5)x1x2^2 − (9/20)x1 − (8/5)x2^3 + (11/5)x2
      −(288/5)x1^3x2^2 + (16/5)x1^2x2 + (142/5)x1x2^2 + (9/20)x1 − (8/5)x2^3 + (11/5)x2 ].
The objective is coercive, so fmin is achieved at a critical point. The minimum value is fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       3    |   6.7535,  0.4611            |   56.7500,  0.1309
       4    |   6.9294,  0.2428            |  112.6517,  0.2405
       5    |   8.8519,  0.3376            |  112.6517,  0.2167
       6    |  16.5971,  0.4703            |  112.6517,  0.3788
       7    |  35.4756,  0.6536            |  112.6517,  0.4537
Example 6.5. Consider the optimization problem

    min_{x∈R^3}   x1^3 + x2^3 + x3^3 + 4x1x2x3 − ( x1(x2^2 + x3^2) + x2(x3^2 + x1^2) + x3(x1^2 + x2^2) )
    s.t.  x1 ≥ 0,  x1x2 − 1 ≥ 0,  x2x3 − 1 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x1x2   0    0   x2   x2   0
      x1       0    0   −1   −1   0
      −x1      x2   0   1    0    −1 ].

The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and the global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       2    |  −∞,            0.4129       |  −∞,      0.1900
       3    |  −7.8184·10^6,  0.4641       |  0.9492,  0.3139
       4    |  −2.0575·10^4,  0.6499       |  0.9492,  0.5057
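For this example the identity L(x)C(x) = I_3 can again be checked numerically (illustrative Python; C(x) is assumed to stack the transposed Jacobian of c = (x1, x1x2 − 1, x2x3 − 1) over diag(c)):

```python
# Numerical check of L(x)C(x) = I_3 for Example 6.5.
import random

def check_ex65(trials=100):
    err = 0.0
    for _ in range(trials):
        x1, x2, x3 = (random.uniform(-2, 2) for _ in range(3))
        c = [x1, x1 * x2 - 1, x2 * x3 - 1]
        # C(x): rows 1-3 hold the gradients as columns, rows 4-6 are diag(c)
        C = [[1.0, x2, 0.0], [0.0, x1, x3], [0.0, 0.0, x2],
             [c[0], 0.0, 0.0], [0.0, c[1], 0.0], [0.0, 0.0, c[2]]]
        L = [[1 - x1 * x2, 0, 0, x2, x2, 0],
             [x1, 0, 0, -1, -1, 0],
             [-x1, x2, 0, 1, 0, -1]]
        for i in range(3):
            for j in range(3):
                v = sum(L[i][k] * C[k][j] for k in range(6))
                err = max(err, abs(v - (1.0 if i == j else 0.0)))
    return err

max_err = check_ex65()
```

For example, the (2,2) entry is x1·x1... more precisely x1x2 − (x1x2 − 1) = 1, and all off-diagonal entries cancel identically.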
Example 6.6. Consider the optimization problem (x0 = 1)

    min_{x∈R^4}   x^T x + Σ_{i=0}^{4} ∏_{j≠i} (xi − xj)
    s.t.  x1^2 − 1 ≥ 0,  x2^2 − 1 ≥ 0,  x3^2 − 1 ≥ 0,  x4^2 − 1 ≥ 0.

The matrix polynomial is L(x) = [ (1/2)diag(x)  −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and the 11 global minimizers

    (1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
    (1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
    (−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       3    |  −∞,             1.1377      |  3.5480,   1.1765
       4    |  −6.6913·10^4,   4.7677      |  4.0000,   3.0761
       5    |  −21.3778,      22.9970      |  4.0000,  10.3354
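Since all listed minimizers have coordinates ±1, the value 4 and the count of 11 minimizers can be cross-checked by enumerating the 16 sign vectors (this only certifies the minimum over these vertices, not over all of R^4); an illustrative Python sketch:

```python
# Enumerate x in {1,-1}^4 for the objective of Example 6.6 (with x0 = 1).
from itertools import product
from functools import reduce

def f66(x):
    y = (1,) + tuple(x)  # prepend x0 = 1
    second = sum(
        reduce(lambda a, b: a * b, (y[i] - y[j] for j in range(5) if j != i), 1)
        for i in range(5)
    )
    return sum(v * v for v in x) + second

vals = {x: f66(x) for x in product([1, -1], repeat=4)}
min_val = min(vals.values())                 # -> 4
n_min = sum(1 for v in vals.values() if v == min_val)   # -> 11
```

Whenever both values +1 and −1 occur at least twice among (x0, ..., x4), every product ∏_{j≠i}(xi − xj) contains a zero factor, so the objective reduces to x^T x = 4; the remaining 5 sign vectors give the value 20.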
Example 6.7. Consider the optimization problem

    min_{x∈R^3}   x1^4 x2^2 + x2^4 x3^2 + x3^4 x1^2 − 3x1^2 x2^2 x3^2 + x2^2
    s.t.  x1 − x2x3 ≥ 0,  −x2 + x3^2 ≥ 0.

The matrix polynomial is

    L(x) = [ 1     0    0  0  0
             −x3   −1   0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are the points (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       3    |  −∞,            0.6144       |  −∞,            0.3418
       4    |  −1.0909·10^7,  1.0542       |  −3.9476,       0.7180
       5    |  −942.6772,     1.6771       |  −3·10^{-9},    1.4607
       6    |  −0.0110,       3.3532       |  −8·10^{-10},   3.1618
Example 6.8. Consider the optimization problem

    min_{x∈R^4}   x1^2(x1 − x4)^2 + x2^2(x2 − x4)^2 + x3^2(x3 − x4)^2
                  + 2x1x2x3(x1 + x2 + x3 − 2x4) + (x1 − 1)^2 + (x2 − 1)^2 + (x3 − 1)^2
    s.t.  x1 − x2 ≥ 0,  x2 − x3 ≥ 0.

The matrix polynomial is

    L(x) = [ 1  0  0  0  0  0
             1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is therefore coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and the minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       2    |  −∞,             0.3984      |  −0.3360,  0.9321
       3    |  −∞,             0.7634      |   0.9413,  0.5240
       4    |  −6.4896·10^5,   4.5496      |   0.9413,  1.7192
       5    |  −3.1645·10^3,  24.3665      |   0.9413,  8.1228
Example 6.9. Consider the optimization problem

    min_{x∈R^4}   (x1 + x2 + x3 + x4 + 1)^2 − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
    s.t.  0 ≤ x1, ..., x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value is fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that 4 − Σ_{i=1}^4 xi^2 = Σ_{i=1}^4 ( xi(1 − xi)^2 + (1 − xi)(1 + xi^2) ) ∈ IQ(ceq, cin).
Table 9. Computational results for Example 6.9

    order |  w/o LME: lower bound, time  |  with LME: lower bound, time
      2   |  −0.0279,      0.2262        |  −5·10^{-6},   1.1835
      3   |  −0.0005,      0.4691        |  −6·10^{-7},   1.6566
      4   |  −0.0001,      3.1098        |  −2·10^{-7},   5.5234
      5   |  −4·10^{-5},  16.5092        |  −6·10^{-7},  19.7320
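The algebraic identity in footnote 5, which certifies that IQ(ceq, cin) is archimedean for the box constraints, can be verified numerically at random points (an illustrative Python sketch):

```python
# Check the footnote-5 identity for Example 6.9:
#   4 - sum_i x_i^2 = sum_i ( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) ).
import random

def lhs(x):
    return 4 - sum(v * v for v in x)

def rhs(x):
    return sum(v * (1 - v) ** 2 + (1 - v) * (1 + v * v) for v in x)

max_err = max(
    abs(lhs(x) - rhs(x))
    for x in ([random.uniform(-3, 3) for _ in range(4)] for _ in range(200))
)
```

Indeed, each summand simplifies to 1 − xi^2, so both sides agree as polynomials.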
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x0 = 1)

    min_{x∈R^n}   Σ_{0≤i≤j≤k≤n} c_{ijk} xi xj xk
    s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time differs a bit. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

    n         |   9       10      11      12       13       14
    w/o LME   |  1.2569  2.5619  6.3085  15.8722  35.1675  78.4111
    with LME  |  1.9714  3.8288  8.2519  20.0310  37.6373  82.4778
7 Discussions
7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, ..., gℓ). Its preordering is the set

    Preord(cin, ψ) = Σ_{r1,...,rℓ ∈ {0,1}} g1^{r1} ··· gℓ^{rℓ} · Σ[x].
The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)   f'_{pre,k} = min   ⟨f, y⟩
                     s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{ceq}(y) = 0,  L^{(k)}_{φ}(y) = 0,
                           L^{(k)}_{g1^{r1}···gℓ^{rℓ}}(y) ⪰ 0  for all r1, ..., rℓ ∈ {0, 1},
                           y ∈ R^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)   f_{pre,k} = max   γ
                    s.t.  f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.
Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

    f_{pre,k} = f'_{pre,k} = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = fmin for all k big enough.
Proof. The proof is very similar to Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f̃(x) ≡ 0 on the set

    K3 = {x ∈ R^n | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0}.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f̃^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same.
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m; hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is singular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? To the best of the author's knowledge, this question is mostly open.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1; however, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each application, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

    λ = ( C(x)^T C(x) )^{-1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose numerator and denominator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgement The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973 He would like to thank the anonymous referees forvaluable comments on improving the paper
References
[1] D Bertsekas Nonlinear programming second edition Athena Scientific 1995[2] J Bochnak M Coste and M-F Roy Real algebraic geometry Springer 1998[3] F Bugarin D Henrion and J Lasserre Minimizing the sum of many rational functions
Math Programming Comput 8(2016) pp 83ndash111[4] D Cox J Little and D OrsquoShea Ideals varieties and algorithms An introduction to com-
putational algebraic geometry and commutative algebra Third edition Undergraduate Textsin Mathematics Springer New York 1997
[5] R Curto and L Fialkow Truncated K-moment problems in several variables Journal ofOperator Theory 54(2005) pp 189-226
[6] J Demmel J Nie and V Powers Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals Journal of Pure and Applied Algebra 209 (2007) no 1 pp 189–200
[7] E de Klerk and M Laurent On the Lasserre hierarchy of semidefinite programming re-laxations of convex polynomial optimization problems SIAM Journal On Optimization 21(2011) pp 824-832
[8] E de Klerk J Lasserre M Laurent and Z Sun Bound-constrained polynomial optimizationusing only elementary calculations Mathematics of Operations Research to appear
[9] E de Klerk M Laurent and Z Sun Convergence analysis for Lasserrersquos measure-based hier-archy of upper bounds for polynomial optimization Mathematical Programming to appear
[10] E de Klerk R Hess and M Laurent Improved convergence rates for Lasserre-type hi-erarchies of upper bounds for box-constrained polynomial optimization Preprint 2016arXiv160303329[mathOC]
[11] G-M Greuel and G Pfister A Singular introduction to commutative algebra Springer 2002[12] S He Z Luo J Nie and S Zhang Semidefinite relaxation bounds for indefinite homogeneous
quadratic optimization SIAM J Optim 19 (2) pp 503ndash523[13] D Henrion J Lasserre and J Loefberg GloptiPoly 3 moments optimization and semidef-
inite programming Optimization Methods and Software 24 (2009) no 4-5 pp 761ndash779[14] D Henrion and JB Lasserre Detecting global optimality and extracting solutions in Glop-
tiPoly Positive polynomials in control 293ndash310 Lecture Notes in Control and Inform Sci312 Springer Berlin 2005
[15] D Jibetean and E de Klerk Global optimization of rational functions A semidefinite pro-gramming approach Mathematical Programming 106 (2006) no 1 pp 93ndash109
[16] J Kollar Sharp effective Nullstellensatz J Amer Math Soc 1 (1988) no 4 pp 963–975
[17] JB Lasserre Global optimization with polynomials and the problem of moments SIAM J
Optim 11(3) pp 796ndash817 2001[18] J Lasserre M Laurent and P Rostalski Semidefinite characterization and computation of
zero-dimensional real radical ideals Foundations of Computational Mathematics 8(2008)no 5 pp 607ndash647
[19] J Lasserre A new look at nonnegativity on closed sets and polynomial optimization SIAMJ Optim 21 (2011) pp 864ndash885
[20] JB Lasserre Convexity in semi-algebraic geometry and polynomial optimization SIAM JOptim 19 (2009) pp 1995-2014
[21] JB Lasserre Moments positive polynomials and their applications Imperial College Press2009
[22] JB Lasserre Introduction to polynomial and semi-algebraic optimization Cambridge Uni-versity Press Cambridge 2015
[23] JB Lasserre KC Toh and S Yang A bounded degree SOS hierarchy for polynomial opti-mization Euro J Comput Optim to appear
[24] M Laurent Semidefinite representations for finite varieties Mathematical Programming 109(2007) pp 1ndash26
[25] M Laurent Sums of squares moment matrices and optimization over polynomials Emerg-ing Applications of Algebraic Geometry Vol 149 of IMA Volumes in Mathematics and itsApplications (Eds M Putinar and S Sullivant) Springer pages 157-270 2009
[26] M Laurent Optimization over polynomials Selected topics Proceedings of the InternationalCongress of Mathematicians 2014
[27] J Nie J Demmel and B Sturmfels Minimizing polynomials via sum of squares over thegradient ideal Mathematical Programming Ser A 106 (2006) no 3 pp 587ndash606
[28] J Nie J Demmel and M Gu Global minimization of rational functions and the nearestGCDs Journal of Global Optimization 40 (2008) no 4 pp 697ndash718
[29] J Nie Discriminants and Nonnegative Polynomials Journal of Symbolic ComputationVol 47 no 2 pp 167ndash191 2012
[30] J Nie An exact Jacobian SDP relaxation for polynomial optimization Mathematical Pro-gramming Ser A 137 (2013) pp 225ndash255
[31] J Nie Polynomial optimization with real varieties SIAM J Optim Vol 23 No 3 pp1634-1646 2013
[32] J Nie Certifying convergence of Lasserrersquos hierarchy via flat truncation Mathematical Pro-gramming Ser A 142 (2013) no 1-2 pp 485ndash510
[33] J Nie Optimality conditions and finite convergence of Lasserrersquos hierarchy MathematicalProgramming Ser A 146 (2014) no 1-2 pp 97ndash121
[34] J Nie Linear optimization with cones of moments and nonnegative polynomials Mathemat-ical Programming Ser B 153 (2015) no 1 pp 247ndash274
[35] P A Parrilo and B Sturmfels Minimizing polynomial functions In S Basu and L Gonzalez-Vega editors Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathe-matics and Computer Science volume 60 of DIMACS Series in Discrete Mathematics andComputer Science pages 83-99 AMS 2003
[36] M Putinar Positive polynomials on compact semi-algebraic sets Ind Univ Math J 42(1993) 969ndash984
[37] B Reznick Some concrete aspects of Hilbertrsquos 17th problem In Contemp Math Vol 253pp 251-272 American Mathematical Society 2000
[38] M Schweighofer Global optimization of polynomials using gradient tentacles and sums ofsquares SIAM J Optim 17 (2006) no 3 pp 920ndash942
[39] C Scheiderer Positivity and sums of squares A guide to recent results Emerging Appli-cations of Algebraic Geometry (M Putinar S Sullivant eds) IMA Volumes Math Appl149 Springer 2009 pp 271ndash324
[40] J F Sturm SeDuMi 102 A MATLAB toolbox for optimization over symmetric conesOptim Methods Softw 11 amp 12 (1999) pp 625ndash653
[41] B Sturmfels Solving systems of polynomial equations CBMS Regional Conference Series inMathematics 97 American Mathematical Society Providence RI 2002
[42] H H Vui and P T Son Global optimization of polynomials using the truncated tangencyvariety and sums of squares SIAM J Optim 19 (2008) no 2 pp 941ndash951
Department of Mathematics University of California at San Diego 9500 Gilman
Drive La Jolla CA USA 92093
E-mail address njwmathucsdedu
for f on each subvariety, and then construct a single one for f over the entire set of critical points.
Proof of Theorem 3.3. Clearly, every point in the complex variety

    K1 = {x ∈ C^n | ceq(x) = 0, φ(x) = 0}

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on Kc = K1 ∩ R^n, say, v1 < ··· < vN. Up to shifting f by a constant, we can further assume that fc = 0. Clearly, fc equals one of v1, ..., vN, say vt = fc = 0.
Case I: Assume IQ(ceq, cin) + IQ(φ, ψ) is archimedean. Let

    I = Ideal(ceq, φ)

be the critical ideal. Note that K1 = VC(I). The variety VC(I) is a union of irreducible subvarieties, say, V1, ..., Vℓ. If Vi ∩ R^n ≠ ∅, then f is a real constant on Vi, which equals one of v1, ..., vN; this is implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of VC(I):

    Ti = K1 ∩ {f(x) = vi}  (i = t, ..., N).

Let T_{t−1} be the union of the irreducible subvarieties Vi such that either Vi ∩ R^n = ∅ or f ≡ vj on Vi with vj < vt = fc. Then it holds that

    VC(I) = T_{t−1} ∪ Tt ∪ ··· ∪ TN.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, It, ..., IN ⊆ R[x] such that

    I = I_{t−1} ∩ It ∩ ··· ∩ IN

and Ti = VC(Ii) for all i = t−1, t, ..., N. Denote the semialgebraic set

(3.14)    S = {x ∈ R^n | cin(x) ≥ 0, ψ(x) ≥ 0}.
For i = t−1, we have VR(I_{t−1}) ∩ S = ∅, because v1, ..., v_{t−1} < fc. By the Positivstellensatz [2, Corollary 4.4.3], there exists p0 ∈ Preord(cin, ψ)¹ satisfying 2 + p0 ∈ I_{t−1}. Note that 1 + p0 > 0 on VR(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(cin, ψ) is archimedean, because I ⊆ I_{t−1} and

    IQ(ceq, cin) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(cin, ψ).

By Theorem 2.1, we have

    p1 := 1 + p0 ∈ I_{t−1} + Qmod(cin, ψ).

Then 1 + p1 = 2 + p0 ∈ I_{t−1}, so there exists p2 ∈ Qmod(cin, ψ) such that

    −1 ≡ p1 ≡ p2  mod I_{t−1}.

Since f = (f/4 + 1)^2 − 1·(f/4 − 1)^2, we have

    f ≡ σ_{t−1} := (f/4 + 1)^2 + p2 (f/4 − 1)^2  mod I_{t−1}.

So, when k is big enough, σ_{t−1} ∈ Qmod(cin, ψ)_{2k}.
¹ It is the preordering of the polynomial tuple (cin, ψ); see §7.1.
For i = t, we have vt = 0, and f(x) vanishes on VC(It). By Hilbert's Strong Nullstellensatz [4], there exists an integer mt > 0 such that f^{mt} ∈ It. Define the polynomial

    st(ε) := √ε · Σ_{j=0}^{mt−1} binom(1/2, j) ε^{-j} f^j.

Then we have

    D1 := ε(1 + ε^{-1} f) − ( st(ε) )^2 ≡ 0  mod It.

This is because, in the subtraction defining D1, after expanding (st(ε))^2 all the terms f^j with j < mt are cancelled, and f^j ∈ It for j ≥ mt. So D1 ∈ It. Let σt(ε) = st(ε)^2; then f + ε − σt(ε) = D1 and

(3.15)    f + ε − σt(ε) = Σ_{j=0}^{mt−2} bj(ε) f^{mt+j}

for some real scalars bj(ε) depending on ε.
For each i = t+1, ..., N, we have vi > 0, and f(x)/vi − 1 vanishes on VC(Ii). By Hilbert's Strong Nullstellensatz [4], there exists an integer mi > 0 such that (f/vi − 1)^{mi} ∈ Ii. Let

    si := √vi · Σ_{j=0}^{mi−1} binom(1/2, j) (f/vi − 1)^j.

As in the case i = t, we can similarly show that f − si^2 ∈ Ii. Let σi = si^2; then f − σi ∈ Ii.
Note that VC(Ii) ∩ VC(Ij) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, ..., aN ∈ R[x] such that

    a_{t−1}^2 + ··· + aN^2 − 1 ∈ I,    ai ∈ ⋂_{j ∈ {t−1,...,N}, j≠i} Ij.

For ε > 0, denote the polynomial

    σ_ε := σt(ε) at^2 + Σ_{j ∈ {t−1,...,N}, j≠t} (σj + ε) aj^2;

then

    f + ε − σ_ε = (f + ε)(1 − a_{t−1}^2 − ··· − aN^2)
                  + Σ_{i ∈ {t−1,...,N}, i≠t} (f − σi) ai^2 + (f + ε − σt(ε)) at^2.
For each i 6= t f minus σi isin Ii so
(f minus σi)a2i isin
N⋂
j=tminus1
Ij = I
Hence there exists k1 gt 0 such that
(f minus σi)a2i isin I2k1 (t 6= i isin tminus 1 N)
Since f + ǫ minus σt(ǫ) isin It we also have
(f + ǫ minus σt(ǫ))a2t isin
N⋂
j=tminus1
Ij = I
Moreover by the equation (315)
(f + ǫ minus σt(ǫ))a2t =
mtminus2sum
j=0
bj(ǫ)fmt+ja2t
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 11
Each f^{m_t+j} a_t² ∈ I, since f^{m_t+j} ∈ I_t. So there exists k_2 > 0 such that, for all ε > 0,

(f + ε − σ_t(ε)) a_t² ∈ I_{2k_2}.

Since 1 − a_{t−1}² − ··· − a_N² ∈ I, there also exists k_3 > 0 such that, for all ε > 0,

(f + ε)(1 − a_{t−1}² − ··· − a_N²) ∈ I_{2k_3}.

Hence, if k* ≥ max{k_1, k_2, k_3}, then we have

f(x) + ε − σ_ε ∈ I_{2k*}

for all ε > 0. By the construction, the degrees of all σ_i and a_i are independent of ε. So σ_ε ∈ Qmod(c_in, ψ)_{2k*} for all ε > 0, if k* is big enough. Note that

I_{2k*} + Qmod(c_in, ψ)_{2k*} = IQ(c_eq, c_in)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ f_c − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ f_c, so f_{k*} = f_c. Moreover, since f_k is monotonically increasing, we must have f_k = f_c for all k ≥ k*.
Case II: Assume IQ(c_eq, c_in) is archimedean. Because

IQ(c_eq, c_in) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ),

the set IQ(c_eq, c_in) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.
Case III: Suppose Assumption 3.2 holds. Let ϕ_1, …, ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Let

s := s_t + ··· + s_N, where each s_i = (v_i − f_c)(ϕ_i(f))².

Then s ∈ Σ[x]_{2k_4} for some integer k_4 > 0. Let

f̂ := f − f_c − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(c_in, ψ) such that

f̂^{2ℓ} + q ∈ Ideal(c_eq, φ).

This is because, by Assumption 3.2, f̂(x) ≡ 0 on the set

K_2 := {x ∈ ℝⁿ : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}.

It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ ℕ and q = b_0 + ρb_1 (b_0, b_1 ∈ Σ[x]) such that f̂^{2ℓ} + q ∈ Ideal(c_eq, φ). By Assumption 3.2, ρ ∈ Qmod(c_in, ψ), so we have q ∈ Qmod(c_in, ψ).

For all ε > 0 and τ > 0, we have f̂ + ε = φ_ε + θ_ε, where

φ_ε := −τ ε^{1−2ℓ} (f̂^{2ℓ} + q),  θ_ε := ε(1 + f̂/ε + τ(f̂/ε)^{2ℓ}) + τ ε^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k_5 such that, for all ε > 0,

φ_ε ∈ Ideal(c_eq, φ)_{2k_5},  θ_ε ∈ Qmod(c_in, ψ)_{2k_5}.

Hence, we can get

f − (f_c − ε) = φ_ε + σ_ε,

where σ_ε := θ_ε + s ∈ Qmod(c_in, ψ)_{2k_5} for all ε > 0. Note that

IQ(c_eq, c_in)_{2k_5} + IQ(φ, ψ)_{2k_5} = Ideal(c_eq, φ)_{2k_5} + Qmod(c_in, ψ)_{2k_5}.
For all ε > 0, γ = f_c − ε is feasible in (3.9) for the order k_5, so f_{k_5} ≥ f_c. Because of (3.11) and the monotonicity of f_k, we have f_k = f'_k = f_c for all k ≥ k_5.
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is f_c, and the optimal value of (1.1) is f_min. If f_min is achievable at a critical point, then f_c = f_min. In Theorem 3.3, we have shown that f_k = f_c for all k big enough, where f_k is the optimal value of (3.9). The value f_c or f_min is often not known. How do we detect the tightness f_k = f_c in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16) d := ⌈deg(c_eq, c_in, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17) rank M_t(y*) = rank M_{t−d}(y*),

then f_k = f_c and we can get r := rank M_t(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers; it was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.

ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., f_c < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = f_c.

iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17), for some t ∈ [d, k], when k is big enough.
Proof. By Assumption 3.1, u is a critical point if and only if c_eq(u) = 0, φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

{x ∈ ℝⁿ : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(c_eq, φ) + Qmod(ρ). By Assumption 3.2,

Ideal(c_eq, φ) + Qmod(ρ) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, f achieves finitely many values on K_c, so (3.7) must achieve its optimal value f_c. By Theorem 3.3, we know that f_k = f'_k = f_c for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).
iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say u_1, …, u_r, such that f_k = f(u_i) for each i. Clearly, f_k = f(u_i) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)  min f(x)  s.t.  c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its kth order Lasserre relaxation is

(3.19)  γ'_k := min ⟨f, y⟩  s.t.  ⟨1, y⟩ = 1, M_k(y) ⪰ 0, L^{(k)}_{c_eq}(y) = 0, L^{(k)}_φ(y) = 0, L^{(k)}_ρ(y) ⪰ 0.

Its dual optimization problem is

(3.20)  γ_k := max γ  s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

γ_k = γ'_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough.

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].
4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

P := {x ∈ ℝⁿ | Ax − b ≥ 0},

where A = [a_1 ··· a_m]^T ∈ ℝ^{m×n} and b = [b_1 ··· b_m]^T ∈ ℝ^m. This corresponds to E = ∅, I = [m] and each c_i(x) = a_i^T x − b_i. Denote

(4.1)  D(x) := diag(c_1(x), …, c_m(x)),  C(x) := [A^T; D(x)],

where [A^T; D(x)] stacks A^T on top of D(x). The Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies

(4.2)  [A^T; D(x)] λ = [∇f(x); 0].

If rank A = m, we can express λ as

(4.3)  λ = (A A^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ ℝ[x]^{m×(n+m)} such that

(4.4)  L(x) C(x) = I_m,

then we can get

λ = L(x) [∇f(x); 0] = L_1(x) ∇f(x),
where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists and give a degree bound for it.
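When rank A = m, formula (4.3) can be checked on a small instance. The following sketch uses hypothetical data (not from the paper): f(x) = xᵀx minimized over the single active constraint eᵀx − 1 ≥ 0, for which stationarity ∇f(u) = Aᵀλ must hold at the minimizer.

```python
import numpy as np

# Toy instance with rank A = m = 1: minimize f(x) = ||x||^2 with the
# constraint a^T x - b >= 0 active, where a = e (all ones), b = 1.
n = 4
A = np.ones((1, n))            # one constraint row, full row rank
x_star = np.full(n, 1.0 / n)   # minimizer of ||x||^2 on the hyperplane e^T x = 1
grad_f = 2 * x_star

# (4.3): lambda = (A A^T)^{-1} A grad f(x)
lam = np.linalg.solve(A @ A.T, A @ grad_f)

# KKT stationarity: grad f(x*) = A^T lambda
assert np.allclose(grad_f, A.T @ lam)
```

Here λ = 2/n, as expected from eliminating x on the hyperplane.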
The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ ℂⁿ (also see Definition 5.1). This is equivalent to the condition that, for every u, if J(u) = {i_1, …, i_k} (see (1.2) for the notation), then a_{i_1}, …, a_{i_k} are linearly independent.
Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.
Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ ℝ[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.
For a subset I = {i_1, …, i_{m−n}} of [m], denote

c_I(x) := Π_{i∈I} c_i(x),  E_I(x) := c_I(x) · diag(c_{i_1}(x)^{−1}, …, c_{i_{m−n}}(x)^{−1}),
D_I(x) := diag(c_{i_1}(x), …, c_{i_{m−n}}(x)),  A_I := [a_{i_1} ··· a_{i_{m−n}}]^T.

(Each diagonal entry c_I(x)/c_{i_j}(x) of E_I(x) is a polynomial.) For the case that I = ∅ (the empty set), we set c_∅(x) = 1. Let

V := {I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n}.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that
(4.5)  L_I(x) C(x) = c_I(x) I_m.

The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

(4.6)
  rows J \ cols K:  [n]                        n + I                              n + [m]\I
  I:                0                          E_I(x)                             0
  [m]\I:            c_I(x)·(A_{[m]\I})^{−T}    −(A_{[m]\I})^{−T}(A_I)^T E_I(x)    0

Equivalently, the blocks of L_I are

L_I(I, [n]) = 0,  L_I(I, n+[m]\I) = 0,  L_I([m]\I, n+[m]\I) = 0,
L_I(I, n+I) = E_I(x),  L_I([m]\I, [n]) = c_I(x)(A_{[m]\I})^{−T},
L_I([m]\I, n+I) = −(A_{[m]\I})^{−T}(A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = L_I(x)C(x); then one can verify that

G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},  G(I, [m]\I) = 0,

G([m]\I, [m]\I) = [c_I(x)(A_{[m]\I})^{−T}  −(A_{[m]\I})^{−T} A_I^T E_I(x)] · [(A_{[m]\I})^T; 0] = c_I(x) I_n,

G([m]\I, I) = [c_I(x)(A_{[m]\I})^{−T}  −(A_{[m]\I})^{−T}(A_I)^T E_I(x)] · [A_I^T; D_I(x)] = 0.

This shows that the above L_I(x) satisfies (4.5).
Step II: We show that there exist real scalars ν_I satisfying

(4.7)  Σ_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.
• When m > n, let

(4.8)  N := {i ∈ [m] | rank A_{[m]\{i}} = n}.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]\{i} such that |I′| = m−n−1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(4.9)  Σ_{I′∈V_i} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, …, a_n} is linearly independent. So there exist scalars α_1, …, α_n such that

a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0 and hence A can be replaced by its first m−1 rows; so (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

{i : α_i ≠ 0} = {i_1, …, i_k}.

Then a_{i_1}, …, a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} := m. Since Ax − b is nonsingular, the linear system

c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, …, μ_{k+1} such that

μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This can be deduced from the echelon form of an inconsistent linear system.) Note that i_1, …, i_{k+1} ∈ N. For each j = 1, …, k+1, by (4.9),

Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = 1.

Then, we can get

1 = Σ_{j=1}^{k+1} μ_j c_{i_j}(x) = Σ_{j=1}^{k+1} μ_j c_{i_j}(x) · Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = Σ_{I = I′∪{i_j}, I′∈V_{i_j}, 1≤j≤k+1} ν^{(i_j)}_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.
Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)  L(x) := Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4) because

L(x)C(x) = Σ_{I∈V} ν_I L_I(x)C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m−n, so L(x) has degree ≤ m−n.
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x)C(x) = I_m for polyhedral sets.
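The coefficient-matching procedure can be carried out with a computer algebra system. A sketch for the one-dimensional box {x ≥ 0, 1 − x ≥ 0} (Python/SymPy is an assumption here; the paper's computations are done in MATLAB): treating the coefficients of an affine L(x) as unknowns, L(x)C(x) = I_2 becomes a linear system, whose solution recovers the n = 1 case of Example 4.3.

```python
import sympy as sp

x = sp.symbols('x')
# Constraints c1 = x >= 0, c2 = 1 - x >= 0; C(x) = [A^T; D(x)] as in (4.1)
C = sp.Matrix([[1, -1],
               [x,  0],
               [0, 1 - x]])

# Unknown L(x): a 2x3 matrix with affine entries a + b*x
coeffs = sp.symbols('a0:12')
L = sp.Matrix(2, 3, lambda i, j: coeffs[2 * (3 * i + j)]
              + coeffs[2 * (3 * i + j) + 1] * x)

# Matching coefficients of L(x)C(x) - I_2 in x gives a linear system
residual = (L * C - sp.eye(2)).expand()
eqs = []
for entry in residual:
    eqs.extend(sp.Poly(entry, x).all_coeffs())

sol = sp.solve(eqs, coeffs)   # linear in the unknown coefficients
L_sol = L.subs(sol)

# The solution is exactly the n = 1 case of Example 4.3
assert L_sol == sp.Matrix([[1 - x, 1, 1], [-x, 1, 1]])
assert (L_sol * C).expand() == sp.eye(2)
```

For larger instances one fixes a degree for L(x), as discussed above, and solves the resulting (possibly underdetermined) linear system.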
Example 4.2. Consider the simplicial set

{x_1 ≥ 0, …, x_n ≥ 0, 1 − e^T x ≥ 0}.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) =
  [ 1−x_1   −x_2   ···   −x_n   1 ··· 1 ]
  [ −x_1   1−x_2   ···   −x_n   1 ··· 1 ]
  [   ⋮       ⋮     ⋱      ⋮            ]
  [ −x_1    −x_2   ···  1−x_n   1 ··· 1 ]
  [ −x_1    −x_2   ···   −x_n   1 ··· 1 ],

whose last n+1 columns are all ones.
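The identity L(x)C(x) = I_{n+1} above can be verified symbolically; a small sketch for n = 3 (Python/SymPy is an assumption, any computer algebra system works):

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n+1}'))
c = list(x) + [1 - sum(x)]                       # c_i = x_i, c_{n+1} = 1 - e^T x
A = sp.Matrix.vstack(sp.eye(n), -sp.ones(1, n))  # constraint rows a_i
C = sp.Matrix.vstack(A.T, sp.diag(*c))           # C(x) = [A^T; D(x)] as in (4.1)

# L(x) from Example 4.2: entry (i, j) of the first n columns is delta_ij - x_j
# (last row: -x_j); the last n+1 columns are all ones.
L1 = sp.Matrix.vstack(sp.eye(n), sp.zeros(1, n)) - sp.ones(n + 1, 1) * x.T
L = sp.Matrix.hstack(L1, sp.ones(n + 1, n + 1))

assert (L * C).expand() == sp.eye(n + 1)
```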
Example 4.3. Consider the box constraint

{x_1 ≥ 0, …, x_n ≥ 0, 1 − x_1 ≥ 0, …, 1 − x_n ≥ 0}.

The equation L(x)C(x) = I_{2n} is satisfied by

L(x) =
  [ I_n − diag(x)   I_n   I_n ]
  [ −diag(x)        I_n   I_n ].
Example 4.4. Consider the polyhedral set

{1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0}.

The equation L(x)C(x) = I_5 is satisfied by

L(x) = (1/2) ·
  [ −x_1−1   −x_2−1   −x_3−1   −x_4−1   1 1 1 1 1 ]
  [ −x_1−1   −x_2−1   −x_3−1   1−x_4    1 1 1 1 1 ]
  [ −x_1−1   −x_2−1   1−x_3    1−x_4    1 1 1 1 1 ]
  [ −x_1−1   1−x_2    1−x_3    1−x_4    1 1 1 1 1 ]
  [ 1−x_1    1−x_2    1−x_3    1−x_4    1 1 1 1 1 ].
Example 4.5. Consider the polyhedral set

{1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0}.

The matrix L(x) satisfying L(x)C(x) = I_4 is

(1/6) ·
  [ x_1² − 3x_1 + 2    x_1x_2 − x_2        4 − x_1    2 − x_1    1 − x_1    1 − x_1 ]
  [ 3x_1² − 3x_1 − 6   3x_2 + 3x_1x_2      6 − 3x_1   −3x_1      −3x_1 − 3  −3x_1 − 3 ]
  [ 1 − x_1²           −2x_2 − x_1x_2 − 3  x_1 − 1    x_1 + 1    x_1 + 2    x_1 + 2 ]
  [ 1 − x_1²           3 − x_1x_2 − 2x_2   x_1 − 1    x_1 + 1    x_1 + 2    x_1 + 2 ].
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are m equality and inequality constraints in total, i.e.,

E ∪ I = {1, …, m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies the equation

(5.1)  C(x) λ = [∇f(x); 0; ··· ; 0],  where  C(x) :=
  [ ∇c_1(x)  ∇c_2(x)  ···  ∇c_m(x) ]
  [ c_1(x)   0        ···  0       ]
  [ 0        c_2(x)   ···  0       ]
  [ ⋮        ⋮        ⋱    ⋮       ]
  [ 0        0        ···  c_m(x)  ].

Let C(x) be as above. If there exists L(x) ∈ ℝ[x]^{m×(m+n)} such that

(5.2)  L(x) C(x) = I_m,

then we can get

(5.3)  λ = L(x) [∇f(x); 0] = L_1(x) ∇f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.
Definition 5.1. The tuple c = (c_1, …, c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ ℂⁿ.

Clearly, c being nonsingular is equivalent to the condition that, for each u ∈ ℂⁿ, if J(u) = {i_1, …, i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), …, ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For W(x) ∈ ℂ[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ ℂⁿ if and only if there exists P(x) ∈ ℂ[x]^{t×s} such that

P(x) W(x) = I_t.

Moreover, for W(x) ∈ ℝ[x]^{s×t}, we can choose P(x) ∈ ℝ[x]^{t×s} in the above.

(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ ℝ[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ ℂⁿ,

t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ ℂⁿ. Write W(x) in columns:

W(x) = [w_1(x)  w_2(x)  ···  w_t(x)].
Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ ℂ[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, …, t, denote

r_{1i}(x) := ξ_1(x)^T w_i(x);

then (using ∼ to denote row equivalence between matrices)

W(x) ∼ [ 1        r_{12}(x)  ···  r_{1t}(x) ;
         w_1(x)   w_2(x)     ···  w_t(x)   ]
     ∼ W_1(x) := [ 1   r_{12}(x)       ···  r_{1t}(x)      ;
                   0   w_2^{(1)}(x)    ···  w_t^{(1)}(x)  ],

where each (i = 2, …, t)

w_i^{(1)}(x) := w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ ℂ[x]^{(s+1)×s} such that

P_1(x) W(x) = W_1(x).
Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

w_2^{(1)}(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ ℂ[x]^s such that

ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, …, t, let r_{2i}(x) := ξ_2(x)^T w_i^{(1)}(x); then

W_1(x) ∼ [ 1   r_{12}(x)     r_{13}(x)     ···  r_{1t}(x)    ;
           0   1             r_{23}(x)     ···  r_{2t}(x)    ;
           0   w_2^{(1)}(x)  w_3^{(1)}(x)  ···  w_t^{(1)}(x) ]
       ∼ W_2(x) := [ 1   r_{12}(x)  r_{13}(x)     ···  r_{1t}(x)    ;
                     0   1          r_{23}(x)     ···  r_{2t}(x)    ;
                     0   0          w_3^{(2)}(x)  ···  w_t^{(2)}(x) ],

where each (i = 3, …, t)

w_i^{(2)}(x) := w_i^{(1)}(x) − r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ ℂ[x]^{(s+2)×(s+1)} such that

P_2(x) W_1(x) = W_2(x).
Continuing this process, we can finally get

W_2(x) ∼ ··· ∼ W_t(x) := [ 1  r_{12}(x)  r_{13}(x)  ···  r_{1t}(x) ;
                           0  1          r_{23}(x)  ···  r_{2t}(x) ;
                           0  0          1          ···  r_{3t}(x) ;
                                  ⋱
                           0  0          0          ···  1         ;
                           0  0          0          ···  0         ].

Consequently, there exists P_i(x) ∈ ℂ[x]^{(s+i)×(s+i−1)}, for i = 1, 2, …, t, such that

P_t(x) P_{t−1}(x) ··· P_1(x) W(x) = W_t(x).
Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ ℂ[x]^{t×(s+t)} such that P_{t+1}(x) W_t(x) = I_t. Let

P(x) := P_{t+1}(x) P_t(x) P_{t−1}(x) ··· P_1(x);

then P(x) W(x) = I_t. Note that P(x) ∈ ℂ[x]^{t×s}.

For W(x) ∈ ℝ[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (where P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i).
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing coefficients on both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints

{1 − x_1² ≥ 0, 1 − x_2² ≥ 0, …, 1 − x_n² ≥ 0}.

The equation L(x)C(x) = I_n is satisfied by

L(x) = [ −(1/2) diag(x)   I_n ].
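This identity can be verified symbolically. A small sketch for n = 3 (Python/SymPy as an assumption), with C(x) built as in (5.1):

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n+1}'))
c = [1 - xi**2 for xi in x]                 # c_i = 1 - x_i^2

# C(x) as in (5.1): gradient columns on top, diag(c_i) below
grad = sp.Matrix.hstack(*[sp.Matrix([sp.diff(ci, xi) for xi in x]) for ci in c])
C = sp.Matrix.vstack(grad, sp.diag(*c))

# L(x) = [-(1/2) diag(x), I_n] from Example 5.3
L = sp.Matrix.hstack(-sp.diag(*x) / 2, sp.eye(n))

assert (L * C).expand() == sp.eye(n)
```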
Example 5.4. Consider the nonnegative portion of the unit sphere

{x_1 ≥ 0, x_2 ≥ 0, …, x_n ≥ 0, x_1² + ··· + x_n² − 1 = 0}.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) =
  [ I_n − xx^T    x·1_n^T       2x ]
  [ (1/2)x^T     −(1/2)1_n^T    −1 ].
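A symbolic check of this L(x), again for n = 3 (Python/SymPy as an assumption):

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n+1}'))
c = list(x) + [sum(xi**2 for xi in x) - 1]   # x_i >= 0 and the sphere equality

grad = sp.Matrix.hstack(*[sp.Matrix([sp.diff(ci, xi) for xi in x]) for ci in c])
C = sp.Matrix.vstack(grad, sp.diag(*c))      # C(x) as in (5.1)

# L(x) from Example 5.4
top = sp.Matrix.hstack(sp.eye(n) - x * x.T, x * sp.ones(1, n), 2 * x)
bottom = sp.Matrix.hstack(x.T / 2, -sp.ones(1, n) / 2, sp.Matrix([[-1]]))
L = sp.Matrix.vstack(top, bottom)

assert (L * C).expand() == sp.eye(n + 1)
```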
Example 5.5. Consider the set

{1 − x_1³ − x_2⁴ ≥ 0, 1 − x_3⁴ − x_4³ ≥ 0}.

The equation L(x)C(x) = I_2 is satisfied by

L(x) =
  [ −x_1/3   −x_2/4   0        0        1   0 ]
  [ 0         0       −x_3/4   −x_4/3   0   1 ].
Example 5.6. Consider the quadratic set

{1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0, 1 − x_1² − x_2² − x_3² ≥ 0}.

The transpose L(x)^T of the matrix L(x) satisfying L(x)C(x) = I_2 is

  [ 25x_1³ + 10x_1²x_2 + 40x_1x_2² − 25x_1 − 2x_3     −25x_1³ − 10x_1²x_2 − 40x_1x_2² + (49/2)x_1 + 2x_3 ]
  [ −15x_1²x_2 + 10x_1x_2² + 20x_1x_2x_3 − 10x_1       15x_1²x_2 − 10x_1x_2² − 20x_1x_2x_3 + 10x_1 − x_2/2 ]
  [ 25x_1²x_3 − 20x_1x_2² + 10x_1x_2x_3 + 2x_1        −25x_1²x_3 + 20x_1x_2² − 10x_1x_2x_3 − 2x_1 − x_3/2 ]
  [ 1 − 20x_1x_3 − 10x_1² − 20x_1x_2                    20x_1x_2 + 20x_1x_3 + 10x_1² ]
  [ −50x_1² − 20x_1x_2                                  50x_1² + 20x_1x_2 + 1 ].
We would like to remark that a polynomial tuple c = (c_1, …, c_m) is generically nonsingular, and hence Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, …, d_m, there exists an open dense subset U of D := ℝ[x]_{d_1} × ··· × ℝ[x]_{d_m} such that every tuple c = (c_1, …, c_m) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let J_1 be the set of all (i_1, …, i_{n+1}) with 1 ≤ i_1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i_1}, …, c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, …, c_{i_{n+1}} such that, if Res(c_{i_1}, …, c_{i_{n+1}}) ≠ 0, then the equations

c_{i_1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

F_1(c) := Π_{(i_1,…,i_{n+1})∈J_1} Res(c_{i_1}, …, c_{i_{n+1}}).

For the case that m ≤ n, J_1 = ∅ and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n of the polynomials c_1, …, c_m have a common complex zero.

Second, let J_2 be the set of all (j_1, …, j_k) with k ≤ n and 1 ≤ j_1 < ··· < j_k ≤ m. When one of c_{j_1}, …, c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, …, c_{j_k}) is a polynomial in the coefficients of c_{j_1}, …, c_{j_k} such that, if ∆(c_{j_1}, …, c_{j_k}) ≠ 0, then the equations

c_{j_1}(x) = ··· = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every complex common solution u, the gradients of c_{j_1}, …, c_{j_k} at u are linearly independent. When all of c_{j_1}, …, c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, …, c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, …, c_{j_k}) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

F_2(c) := Π_{(j_1,…,j_k)∈J_2} ∆(c_{j_1}, …, c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer of the polynomials c_1, …, c_m have a singular complex common zero.

Last, let F(c) := F_1(c) F_2(c) and

U := {c = (c_1, …, c_m) ∈ D : F(c) ≠ 0}.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c_1, …, c_m can have a complex common zero, and for any k polynomials (k ≤ n) of c_1, …, c_m, if they have a complex common zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically.
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9), with Lagrange multiplier expressions, for solving the optimization problem (1.1). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.
The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, …, c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p_1, …, p_m) to be the product L_1(x)∇f(x), i.e.,

p_i = (L_1(x)∇f(x))_i.

In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sub-level set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
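This construction is easy to automate. A sketch (Python/SymPy as an assumption; the objective f below is hypothetical, chosen only for illustration) using the L(x) of Example 5.3:

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n+1}'))

# L(x) for the quadratic hypercube of Example 5.3; L1(x) is its first n columns
L1 = -sp.diag(*x) / 2

# A sample objective (an assumption for illustration, not from the paper)
f = x[0]**4 + x[1]**2 * x[2]**2 + x[0] * x[1]
grad_f = sp.Matrix([sp.diff(f, xi) for xi in x])

# p_i = (L1(x) grad f(x))_i, the polynomial Lagrange multiplier expressions
p = (L1 * grad_f).expand()
assert p[0] == sp.expand(-x[0] * (4 * x[0]**3 + x[1]) / 2)
```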
By Theorem 3.3, we have f_k = f_min for all k big enough, if f_c = f_min and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with standard Lasserre relaxations in [17]. The lower bounds given by relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

  min  x_1x_2(10 − x_3)
  s.t.  x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0, 1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     2    |  −0.0521,  0.6841           |  −0.0521,  0.1922
     3    |  −0.0026,  0.2657           |  −3·10⁻⁸,  0.2285
     4    |  −0.0007,  0.6785           |  −6·10⁻⁹,  0.4431
     5    |  −0.0004,  1.6105           |  −2·10⁻⁹,  0.9567
²Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^{n} x_i(1 − x_i)² + Σ_{i≠j} x_i² x_j ∈ IQ(c_eq, c_in).
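The polynomial identity in the footnote can be confirmed symbolically, e.g. for n = 3 (Python/SymPy as an assumption):

```python
import sympy as sp

n = 3
x = sp.symbols(f'x1:{n+1}')

# 1 - x^T x = (1 - e^T x)(1 + x^T x) + sum_i x_i (1 - x_i)^2 + sum_{i != j} x_i^2 x_j
lhs = 1 - sum(xi**2 for xi in x)
rhs = ((1 - sum(x)) * (1 + sum(xi**2 for xi in x))
       + sum(xi * (1 - xi)**2 for xi in x)
       + sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j))

assert sp.expand(lhs - rhs) == 0
```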
Example 6.2. Consider the optimization problem

  min  x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3² + (x_1⁴ + x_2⁴ + x_3⁴)
  s.t.  x_1² + x_2² + x_3² ≥ 1.

The matrix polynomial L(x) = [(1/2)x_1  (1/2)x_2  (1/2)x_3  −1]. The objective f is the sum of the positive definite form x_1⁴ + x_2⁴ + x_3⁴ and the Motzkin polynomial

M(x) := x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

c_1(x)p_1(x) = (x_1² + x_2² + x_3² − 1)(3M(x) + 2(x_1⁴ + x_2⁴ + x_3⁴)) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value f_min = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     3    |  −∞,           0.4466       |  0.1111,  0.1169
     4    |  −∞,           0.4948       |  0.3333,  0.3499
     5    |  −2.1821·10⁵,  1.1836       |  0.3333,  0.6530
Example 6.3. Consider the optimization problem

  min  x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1³ + ··· + x_4³)
  s.t.  x_1, x_2, x_3, x_4 ≥ 0, 1 − x_1 − x_2 ≥ 0, 1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

  [ 1−x_1   −x_2    0       0      1 1 0 0 1 0 ]
  [ −x_1   1−x_2    0       0      1 1 0 0 1 0 ]
  [  0       0     1−x_3   −x_4    0 0 1 1 0 1 ]
  [  0       0     −x_3   1−x_4    0 0 1 1 0 1 ]
  [ −x_1    −x_2    0       0      1 1 0 0 1 0 ]
  [  0       0     −x_3    −x_4    0 0 1 1 0 1 ].

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 3.
³This is because −c_1²p_1² ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c_1(x)²p_1(x)² ≥ 0} is compact.

⁴This is because 1 − x_1² − x_2² belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3² − x_4² belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4); see the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     3    |  −2.9·10⁻⁵,   0.7335        |  −6·10⁻⁷,   0.6091
     4    |  −1.4·10⁻⁵,   2.5055        |  −8·10⁻⁸,   2.7423
     5    |  −1.4·10⁻⁵,  12.7092        |  −5·10⁻⁸,  13.7449
Example 6.4. Consider the polynomial optimization problem

  min_{x∈ℝ²}  x_1² + 50x_2²
  s.t.  x_1² − 1/2 ≥ 0,  x_2² − 2x_1x_2 − 1/8 ≥ 0,  x_2² + 2x_1x_2 − 1/8 ≥ 0.

It is motivated from an example in [12, §3]. The first column of L(x) is

  [ (8/5)x_1³ + (1/5)x_1 ]
  [ (288/5)x_2x_1⁴ − (16/5)x_1³ − (124/5)x_2x_1² + (8/5)x_1 − 2x_2 ]
  [ −(288/5)x_2x_1⁴ − (16/5)x_1³ + (124/5)x_2x_1² + (8/5)x_1 + 2x_2 ],

and the second column of L(x) is

  [ −(8/5)x_1²x_2 + (4/5)x_2³ − (1/10)x_2 ]
  [ (288/5)x_1³x_2² + (16/5)x_1²x_2 − (142/5)x_1x_2² − (9/20)x_1 − (8/5)x_2³ + (11/5)x_2 ]
  [ −(288/5)x_1³x_2² + (16/5)x_1²x_2 + (142/5)x_1x_2² + (9/20)x_1 − (8/5)x_2³ + (11/5)x_2 ].

The objective is coercive, so f_min is achieved at a critical point. The minimum value f_min = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     3    |   6.7535,  0.4611           |   56.7500,  0.1309
     4    |   6.9294,  0.2428           |  112.6517,  0.2405
     5    |   8.8519,  0.3376           |  112.6517,  0.2167
     6    |  16.5971,  0.4703           |  112.6517,  0.3788
     7    |  35.4756,  0.6536           |  112.6517,  0.4537
Example 6.5. Consider the optimization problem

  min_{x∈ℝ³}  x_1³ + x_2³ + x_3³ + 4x_1x_2x_3 − (x_1(x_2² + x_3²) + x_2(x_3² + x_1²) + x_3(x_1² + x_2²))
  s.t.  x_1 ≥ 0, x_1x_2 − 1 ≥ 0, x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

  [ 1−x_1x_2    0     0    x_2   x_2    0 ]
  [   x_1       0     0    −1    −1     0 ]
  [  −x_1      x_2    0     1     0    −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant ℝ³₊, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     2    |  −∞,           0.4129       |  −∞,      0.1900
     3    |  −7.8184·10⁶,  0.4641       |  0.9492,  0.3139
     4    |  −2.0575·10⁴,  0.6499       |  0.9492,  0.5057
Example 6.6. Consider the optimization problem (with x_0 := 1)

  min_{x∈ℝ⁴}  x^T x + Σ_{i=0}^{4} Π_{j≠i}(x_i − x_j)
  s.t.  x_1² − 1 ≥ 0, x_2² − 1 ≥ 0, x_3² − 1 ≥ 0, x_4² − 1 ≥ 0.

The matrix polynomial L(x) = [(1/2)diag(x)  −I_4]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers:

(1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
(1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
(−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     3    |  −∞,           1.1377       |  3.5480,   1.1765
     4    |  −6.6913·10⁴,  4.7677       |  4.0000,   3.0761
     5    |  −21.3778,    22.9970       |  4.0000,  10.3354
Example 6.7. Consider the optimization problem

  min_{x∈ℝ³}  x_1⁴x_2² + x_2⁴x_3² + x_3⁴x_1² − 3x_1²x_2²x_3² + x_2²
  s.t.  x_1 − x_2x_3 ≥ 0,  −x_2 + x_3² ≥ 0.

The matrix polynomial L(x) =

  [  1      0    0   0   0 ]
  [ −x_3   −1    0   0   0 ].

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for standard Lasserre's
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     3    |  −∞,           0.6144       |  −∞,          0.3418
     4    |  −1.0909·10⁷,  1.0542       |  −3.9476,     0.7180
     5    |  −942.6772,    1.6771       |  −3·10⁻⁹,     1.4607
     6    |  −0.0110,      3.3532       |  −8·10⁻¹⁰,    3.1618
Example 68 Consider the optimization problem
minxisinR4
x21(x1 minus x4)2 + x22(x2 minus x4)
2 + x23(x3 minus x4)2+
2x1x2x3(x1 + x2 + x3 minus 2x4) + (x1 minus 1)2 + (x2 minus 1)2 + (x3 minus 1)2
st x1 minus x2 ge 0 x2 minus x3 ge 0
The matrix polynomial

  L(x) = [ 1  0  0  0  0  0
           1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

  (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8.
order k | lower bound (w/o LME) | time    | lower bound (with LME) | time
2       | −∞                    | 0.3984  | −0.3360                | 0.9321
3       | −∞                    | 0.7634  | 0.9413                 | 0.5240
4       | −6.4896·10^5          | 4.5496  | 0.9413                 | 1.7192
5       | −3.1645·10^3          | 24.3665 | 0.9413                 | 8.1228
Example 6.9. Consider the optimization problem

  min_{x∈R^4}  (x1 + x2 + x3 + x4 + 1)^2 − 4 (x1 x2 + x2 x3 + x3 x4 + x4 + x1)
  s.t.  0 ≤ x1, ..., x4 ≤ 1.
The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.^5 The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.
^5 Note that 4 − Σ_{i=1}^4 xi^2 = Σ_{i=1}^4 ( xi (1 − xi)^2 + (1 − xi)(1 + xi^2) ) ∈ IQ(ceq, cin).
26 JIAWANG NIE
Table 9. Computational results for Example 6.9.
order k | lower bound (w/o LME) | time    | lower bound (with LME) | time
2       | −0.0279               | 0.2262  | −5·10^−6               | 1.1835
3       | −0.0005               | 0.4691  | −6·10^−7               | 1.6566
4       | −0.0001               | 3.1098  | −2·10^−7               | 5.5234
5       | −4·10^−5              | 16.5092 | −6·10^−7               | 19.7320
For some polynomial optimization problems, standard Lasserre relaxations may converge fast; e.g., the lowest order relaxation is often tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but they might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x0 := 1)

  min_{x∈R^n}  Σ_{0≤i≤j≤k≤n} cijk xi xj xk
  s.t.  0 ≤ x ≤ 1,
where each coefficient cijk is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generated 10 random instances, and we show the average consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their consumed times differ a bit: (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10.
n        | 9      | 10     | 11     | 12      | 13      | 14
w/o LME  | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111
with LME | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778
7. Discussions
7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

  IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, ..., gℓ). Its preordering is the set

  Preord(cin, ψ) = Σ_{r1,...,rℓ ∈ {0,1}} g1^{r1} ··· gℓ^{rℓ} · Σ[x].
The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)  f'_{pre,k} = min ⟨f, y⟩
       s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{ceq}(y) = 0,  L^{(k)}_{φ}(y) = 0,
             L^{(k)}_{g1^{r1}···gℓ^{rℓ}}(y) ⪰ 0  for all r1, ..., rℓ ∈ {0,1},
             y ∈ R^{N^n_{2k}}.
Similar to (3.9), the dual optimization problem of the above is

(7.2)  f_{pre,k} = max γ  s.t.  f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

  f_{pre,k} = f'_{pre,k} = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = fmin for all k big enough.
Proof. The proof is very similar to Case III in the proof of Theorem 3.3, with f̂ = f − fc − s as defined there. Follow the same argument there. Without Assumption 3.2, we still have f̂(x) ≡ 0 on the set

  K3 := {x ∈ R^n | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0}.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f̂^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same. □
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = Im. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c fails to be nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each usage, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

  λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83-111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189-226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189-200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824-832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503-523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761-779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293-310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93-109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963-975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796-817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607-647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864-885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995-2014.
[21] J. B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1-26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157-270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587-606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697-718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167-191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225-255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634-1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485-510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97-121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247-274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83-99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969-984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251-272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920-942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271-324.
[40] J. F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625-653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941-951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
E-mail address: njw@math.ucsd.edu
For i = t, vt = 0 and f(x) vanishes on V_C(It). By Hilbert's Strong Nullstellensatz [4], there exists an integer mt > 0 such that f^{mt} ∈ It. Define the polynomial

  st(ǫ) = √ǫ · Σ_{j=0}^{mt−1} (1/2 choose j) ǫ^{−j} f^j.

Then we have that

  D1 := ǫ(1 + ǫ^{−1} f) − ( st(ǫ) )^2 ≡ 0  mod It.

This is because, in the subtraction defining D1, after expanding ( st(ǫ) )^2, all the terms f^j with j < mt are cancelled, and f^j ∈ It for j ≥ mt. So D1 ∈ It. Let σt(ǫ) = st(ǫ)^2; then f + ǫ − σt(ǫ) = D1 and

(3.15)  f + ǫ − σt(ǫ) = Σ_{j=0}^{mt−2} bj(ǫ) f^{mt+j}

for some real scalars bj(ǫ) depending on ǫ.
For each i = t+1, ..., N, vi > 0 and f(x)/vi − 1 vanishes on V_C(Ii). By Hilbert's Strong Nullstellensatz [4], there exists 0 < mi ∈ N such that (f/vi − 1)^{mi} ∈ Ii. Let

  si = √vi · Σ_{j=0}^{mi−1} (1/2 choose j) (f/vi − 1)^j.

Like for the case i = t, we can similarly show that f − si^2 ∈ Ii. Let σi = si^2; then f − σi ∈ Ii.
Note that V_C(Ii) ∩ V_C(Ij) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, ..., aN ∈ R[x] such that

  a_{t−1}^2 + ··· + aN^2 − 1 ∈ I,   ai ∈ ⋂_{i≠j∈{t−1,...,N}} Ij.

For ǫ > 0, denote the polynomial

  σǫ = σt(ǫ) at^2 + Σ_{t≠j∈{t−1,...,N}} (σj + ǫ) aj^2;

then

  f + ǫ − σǫ = (f + ǫ)(1 − a_{t−1}^2 − ··· − aN^2)
                + Σ_{t≠i∈{t−1,...,N}} (f − σi) ai^2 + (f + ǫ − σt(ǫ)) at^2.
For each i ≠ t, f − σi ∈ Ii, so

  (f − σi) ai^2 ∈ ⋂_{j=t−1}^{N} Ij = I.

Hence, there exists k1 > 0 such that

  (f − σi) ai^2 ∈ I_{2k1}   (t ≠ i ∈ {t−1, ..., N}).

Since f + ǫ − σt(ǫ) ∈ It, we also have

  (f + ǫ − σt(ǫ)) at^2 ∈ ⋂_{j=t−1}^{N} Ij = I.

Moreover, by the equation (3.15),

  (f + ǫ − σt(ǫ)) at^2 = Σ_{j=0}^{mt−2} bj(ǫ) f^{mt+j} at^2.
Each f^{mt+j} at^2 ∈ I, since f^{mt+j} ∈ It. So there exists k2 > 0 such that, for all ǫ > 0,

  (f + ǫ − σt(ǫ)) at^2 ∈ I_{2k2}.

Since 1 − a_{t−1}^2 − ··· − aN^2 ∈ I, there also exists k3 > 0 such that, for all ǫ > 0,

  (f + ǫ)(1 − a_{t−1}^2 − ··· − aN^2) ∈ I_{2k3}.

Hence, if k* ≥ max{k1, k2, k3}, then we have

  f(x) + ǫ − σǫ ∈ I_{2k*}

for all ǫ > 0. By the construction, the degrees of all σi and ai are independent of ǫ. So σǫ ∈ Qmod(cin, ψ)_{2k*} for all ǫ > 0, if k* is big enough. Note that

  I_{2k*} + Qmod(cin, ψ)_{2k*} = IQ(ceq, cin)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ fc − ǫ for all ǫ > 0. On the other hand, we always have f_{k*} ≤ fc. So f_{k*} = fc. Moreover, since fk is monotonically increasing, we must have fk = fc for all k ≥ k*.
Case II: Assume IQ(ceq, cin) is archimedean. Because

  IQ(ceq, cin) ⊆ IQ(ceq, cin) + IQ(φ, ψ),

the set IQ(ceq, cin) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.
Case III: Suppose Assumption 3.2 holds. Let ϕ1, ..., ϕN be real univariate polynomials such that ϕi(vj) = 0 for i ≠ j and ϕi(vj) = 1 for i = j. Let

  s = st + ··· + sN,  where each  si = (vi − fc)( ϕi(f) )^2.

Then s ∈ Σ[x]_{2k4} for some integer k4 > 0. Let

  f̂ = f − fc − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(cin, ψ) such that

  f̂^{2ℓ} + q ∈ Ideal(ceq, φ).

This is because, by Assumption 3.2, f̂(x) ≡ 0 on the set

  K2 = {x ∈ R^n : ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}.

It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ N and q = b0 + ρ b1 (b0, b1 ∈ Σ[x]) such that f̂^{2ℓ} + q ∈ Ideal(ceq, φ). By Assumption 3.2, ρ ∈ Qmod(cin, ψ), so we have q ∈ Qmod(cin, ψ).

For all ǫ > 0 and τ > 0, we have f̂ + ǫ = φǫ + θǫ, where

  φǫ = −τ ǫ^{1−2ℓ} ( f̂^{2ℓ} + q ),   θǫ = ǫ( 1 + f̂/ǫ + τ (f̂/ǫ)^{2ℓ} ) + τ ǫ^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k5 such that, for all ǫ > 0,

  φǫ ∈ Ideal(ceq, φ)_{2k5},   θǫ ∈ Qmod(cin, ψ)_{2k5}.

Hence, we can get

  f − (fc − ǫ) = φǫ + σǫ,

where σǫ := θǫ + s ∈ Qmod(cin, ψ)_{2k5} for all ǫ > 0. Note that

  IQ(ceq, cin)_{2k5} + IQ(φ, ψ)_{2k5} = Ideal(ceq, φ)_{2k5} + Qmod(cin, ψ)_{2k5}.
For all ǫ > 0, γ = fc − ǫ is feasible in (3.9) for the order k5, so f_{k5} ≥ fc. Because of (3.11) and the monotonicity of fk, we have fk = f'_k = fc for all k ≥ k5. □
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is fc, and the optimal value of (1.1) is fmin. If fmin is achievable at a critical point, then fc = fmin. In Theorem 3.3, we have shown that fk = fc for all k big enough, where fk is the optimal value of (3.9). The value fc or fmin is often not known; how do we detect the tightness fk = fc in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16)  d := ⌈deg(ceq, cin, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)  rank Mt(y*) = rank M_{t−d}(y*),

then fk = fc, and we can get r := rank Mt(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers; it was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) serves as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.

ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., fc < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then fk = fc.

iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17) for some t ∈ [d, k], when k is big enough.
Proof. By Assumption 3.1, u is a critical point if and only if ceq(u) = 0, φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

  {x ∈ R^n : ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(ceq, φ) + Qmod(ρ). By Assumption 3.2,

  Ideal(ceq, φ) + Qmod(ρ) ⊆ IQ(ceq, cin) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, f achieves finitely many values on Kc, so (3.7) must achieve its optimal value fc. By Theorem 3.3, we know that fk = f'_k = fc for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).
iii) If (3.17) holds, we can get r := rank Mt(y*) minimizers for (3.7) [5, 14], say u1, ..., ur, such that fk = f(ui) for each i. Clearly, fk = f(ui) ≥ fc. On the other hand, we always have fk ≤ fc. So fk = fc.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)  min f(x)  s.t.  ceq(x) = 0,  φ(x) = 0,  ρ(x) ≥ 0.

The optimal value of (3.18) is also fc. Its kth order Lasserre relaxation is

(3.19)  γ'_k = min ⟨f, y⟩
        s.t.  ⟨1, y⟩ = 1,  Mk(y) ⪰ 0,
              L^{(k)}_{ceq}(y) = 0,  L^{(k)}_{φ}(y) = 0,  L^{(k)}_{ρ}(y) ⪰ 0.

Its dual optimization problem is

(3.20)  γk = max γ  s.t.  f − γ ∈ Ideal(ceq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

  γk = γ'_k = fc

for all k big enough. Because ρ ∈ Qmod(cin, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. □
If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6] for such cases.
4. Polyhedral constraints
In this section, we assume the feasible set of (1.1) is the polyhedron

  P := {x ∈ R^n | Ax − b ≥ 0},

where A = [a1 ··· am]^T ∈ R^{m×n} and b = [b1 ··· bm]^T ∈ R^m. This corresponds to the case that E = ∅, I = [m], and each ci(x) = ai^T x − bi. Denote

(4.1)  D(x) = diag( c1(x), ..., cm(x) ),   C(x) = [ A^T ; D(x) ].

The Lagrange multiplier vector λ = [λ1 ··· λm]^T satisfies

(4.2)  [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

(4.3)  λ = (A A^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(4.4)  L(x) C(x) = Im,

then we can get

  λ = L(x) [ ∇f(x) ; 0 ] = L1(x) ∇f(x),
where L1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists, and we give a degree bound for it.

The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (also see Definition 5.1). This is equivalent to the following: for every u, if J(u) = {i1, ..., ik} (see (1.2) for the notation), then ai1, ..., aik are linearly independent.
Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.
Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.

For a subset I = {i1, ..., i_{m−n}} of [m], denote

  cI(x) = ∏_{i∈I} ci(x),   EI(x) = cI(x) · diag( ci1(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1} ),
  DI(x) = diag( ci1(x), ..., c_{i_{m−n}}(x) ),   AI = [ ai1 ··· a_{i_{m−n}} ]^T.

For the case that I = ∅ (the empty set), we set c∅(x) = 1. Let

  V = { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial LI(x) such that

(4.5)  LI(x) C(x) = cI(x) Im.
The matrix LI = LI(x) satisfying (45) can be given by the following 2times 3 blockmatrix (LI(J K) denotes the submatrix whose row indices are from J and whosecolumn indices are from K)
(46)
J K [n] n+ I n+ [m]II 0 EI(x) 0
[m]I cI(x) middot(A[m]I
)minusT minus(A[m]I
)minusT (AI
)TEI(x) 0
Equivalently the blocks of LI are
LI(I [n]
)= 0 LI
(I n+ [m]I
)= 0 LI
([m]I n+ [m]I
)= 0
LI(I n+ I
)= EI(x) LI
([m]I [n]
)= cI(x)
(A[m]I
)minusT
LI([m]I n+ I
)= minus
(A[m]I
)minusT (AI
)T
For each I isin V A[m]I is invertible The superscript minusT denotes the inverse of thetranspose Let G = LI(x)C(x) then one can verify that
G(I I) = EI(x)DI(x) = cI(x)Imminusn G(I [m]I) = 0
G([m]I [m]I) =[
cI(x)(A[m]I
)minusT minusAminusT[m]IA
TI EI(x)
] [(A[m]I
)T
0
]
= cI(x)In
G([m]I I) =[
cI(x)(A[m]I
)minusT minus(A[m]I
)minusT (AI
)TEI(x)
] [ ATIDI(x)
]
= 0
This shows that the above LI(x) satisfies (45)
Step II: We show that there exist real scalars νI satisfying

(4.7)  Σ_{I∈V} νI cI(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c∅(x) = 1, so (4.7) is clearly true.

• When m > n, let

(4.8)  N = { i ∈ [m] | rank A_{[m]\{i}} = n }.

For each i ∈ N, let Vi be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν(i)_{I′} satisfying

(4.9)  Σ_{I′∈Vi} ν(i)_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a1, ..., an} is linearly independent. So there exist scalars α1, ..., αn such that

  am = α1 a1 + ··· + αn an.

If all αi = 0, then am = 0, and hence A can be replaced by its first m − 1 rows, so (4.7) is true by induction. In the following, suppose at least one αi ≠ 0, and write

  {i : αi ≠ 0} = {i1, ..., ik}.

Then ai1, ..., aik, am are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

  ci1(x) = ··· = cik(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ1, ..., μ_{k+1} such that

  μ1 ci1(x) + ··· + μk cik(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This is implied by the echelon form of inconsistent linear systems.) Note that i1, ..., i_{k+1} ∈ N. For each j = 1, ..., k+1, by (4.9),

  Σ_{I′∈V_{ij}} ν(ij)_{I′} c_{I′}(x) = 1.

Then we can get

  1 = Σ_{j=1}^{k+1} μj c_{ij}(x) = Σ_{j=1}^{k+1} μj Σ_{I′∈V_{ij}} ν(ij)_{I′} c_{ij}(x) c_{I′}(x)
    = Σ_{I = I′∪{ij}, I′∈V_{ij}, 1≤j≤k+1} ν(ij)_{I′} μj cI(x).

Since each I′ ∪ {ij} ∈ V, (4.7) must be satisfied by some scalars νI.
Step III: For LI(x) as in (4.5), we construct L(x) as

(4.10)  L(x) = Σ_{I∈V} νI LI(x).

Clearly, L(x) satisfies (4.4), because, by (4.7),

  L(x) C(x) = Σ_{I∈V} νI LI(x) C(x) = Σ_{I∈V} νI cI(x) Im = Im.

Each LI(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x)C(x) = Im for polyhedral sets.
Example 4.2. Consider the simplicial set

  {x1 ≥ 0, ..., xn ≥ 0, 1 − e^T x ≥ 0}.

The equation L(x) C(x) = I_{n+1} is satisfied by

  L(x) = [ 1−x1   −x2    ···  −xn    1  ···  1
           −x1    1−x2   ···  −xn    1  ···  1
           ⋮                          ⋮
           −x1    −x2    ···  1−xn   1  ···  1
           −x1    −x2    ···  −xn    1  ···  1 ].
Example 4.3. Consider the box constraint

  {x1 ≥ 0, ..., xn ≥ 0, 1 − x1 ≥ 0, ..., 1 − xn ≥ 0}.

The equation L(x) C(x) = I_{2n} is satisfied by

  L(x) = [ In − diag(x)   In   In
           −diag(x)       In   In ].
Example 4.4. Consider the polyhedral set

  {1 − x4 ≥ 0, x4 − x3 ≥ 0, x3 − x2 ≥ 0, x2 − x1 ≥ 0, x1 + 1 ≥ 0}.

The equation L(x) C(x) = I5 is satisfied by

  L(x) = (1/2) [ −x1−1   −x2−1   −x3−1   −x4−1   1 1 1 1 1
                 −x1−1   −x2−1   −x3−1   1−x4    1 1 1 1 1
                 −x1−1   −x2−1   1−x3    1−x4    1 1 1 1 1
                 −x1−1   1−x2    1−x3    1−x4    1 1 1 1 1
                 1−x1    1−x2    1−x3    1−x4    1 1 1 1 1 ].
Example 4.5. Consider the polyhedral set

  {1 + x1 ≥ 0, 1 − x1 ≥ 0, 2 − x1 − x2 ≥ 0, 2 − x1 + x2 ≥ 0}.

The matrix L(x) satisfying L(x) C(x) = I4 is

  (1/6) [ x1^2 − 3x1 + 2    x1x2 − x2          4 − x1    2 − x1   1 − x1     1 − x1
          3x1^2 − 3x1 − 6   3x2 + 3x1x2        6 − 3x1   −3x1     −3x1 − 3   −3x1 − 3
          1 − x1^2          −2x2 − x1x2 − 3    x1 − 1    x1 + 1   x1 + 2     x1 + 2
          1 − x1^2          3 − 2x2 − x1x2     x1 − 1    x1 + 1   x1 + 2     x1 + 2 ].
5. General constraints

We consider general nonlinear constraints, as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λi as polynomial functions in x on the set of critical points.
Suppose there are totally m equality and inequality constraints, i.e.,

  E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λi ci(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ1 ··· λm]^T satisfies the equation

(5.1)  C(x) λ = [ ∇f(x) ; 0 ; ··· ; 0 ],   where   C(x) := [ ∇c1(x)  ∇c2(x)  ···  ∇cm(x)
                                                             c1(x)   0       ···  0
                                                             0       c2(x)   ···  0
                                                             ⋮                ⋱
                                                             0       0       ···  cm(x) ].
Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)  L(x) C(x) = Im,

then we can get

(5.3)  λ = L(x) [ ∇f(x) ; 0 ] = L1(x) ∇f(x),

where L1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.
Definition 5.1. The tuple c = (c1, ..., cm) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to the following: for each u ∈ C^n, if J(u) = {i1, ..., ik} (see (1.2) for the notation), then the gradients ∇ci1(u), ..., ∇cik(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

  P(x) W(x) = It.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x) W(x) = It, then for all u ∈ C^n,

  t = rank It ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

  W(x) = [ w1(x)  w2(x)  ···  wt(x) ].
Then the equation w1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ1(x) ∈ C[x]^s such that ξ1(x)^T w1(x) = 1. For each i = 2, ..., t, denote

  r1i(x) = ξ1(x)^T wi(x);

then (we use ∼ to denote row equivalence between matrices)

  W(x) ∼ [ 1       r12(x)  ···  r1t(x)
           w1(x)   w2(x)   ···  wt(x) ]
        ∼ W1(x) := [ 1  r12(x)      ···  r1t(x)
                     0  w2^(1)(x)   ···  wt^(1)(x) ],

where each (i = 2, ..., t)

  wi^(1)(x) = wi(x) − r1i(x) w1(x).

So there exists P1(x) ∈ C[x]^{(s+1)×s} such that

  P1(x) W(x) = W1(x).

Since W(x) and W1(x) are row equivalent, W1(x) must also have full column rank everywhere. Similarly, the polynomial equation

  w2^(1)(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ2(x) ∈ C[x]^s such that

  ξ2(x)^T w2^(1)(x) = 1.

For each i = 3, ..., t, let r2i(x) = ξ2(x)^T wi^(1)(x); then

  W1(x) ∼ [ 1  r12(x)      r13(x)      ···  r1t(x)
            0  1           r23(x)      ···  r2t(x)
            0  w2^(1)(x)   w3^(1)(x)   ···  wt^(1)(x) ]
         ∼ W2(x) := [ 1  r12(x)  r13(x)      ···  r1t(x)
                      0  1       r23(x)      ···  r2t(x)
                      0  0       w3^(2)(x)   ···  wt^(2)(x) ],

where each (i = 3, ..., t)

  wi^(2)(x) = wi^(1)(x) − r2i(x) w2^(1)(x).

Similarly, W1(x) and W2(x) are row equivalent, so W2(x) has full column rank everywhere. There exists P2(x) ∈ C[x]^{(s+2)×(s+1)} such that

  P2(x) W1(x) = W2(x).

Continuing this process, we can finally get

  W2(x) ∼ ··· ∼ Wt(x) := [ 1  r12(x)  r13(x)  ···  r1t(x)
                           0  1       r23(x)  ···  r2t(x)
                           0  0       1       ···  r3t(x)
                           ⋮                   ⋱
                           0  0       0       ···  1
                           0  0       0       ···  0 ].

Consequently, there exists Pi(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

  Pt(x) P_{t−1}(x) ··· P1(x) W(x) = Wt(x).
Since Wt(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x) Wt(x) = It. Let

  P(x) = P_{t+1}(x) Pt(x) P_{t−1}(x) ··· P1(x);

then P(x) W(x) = It. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (here P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). □
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), such an L(x) can be determined by comparing coefficients on both sides of (5.2); this amounts to solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints

1 − x1^2 ≥ 0,  1 − x2^2 ≥ 0,  . . . ,  1 − xn^2 ≥ 0.

The equation L(x)C(x) = In is satisfied by

L(x) = [ −(1/2) diag(x)   In ].
Example 5.4. Consider the nonnegative portion of the unit sphere

x1 ≥ 0,  x2 ≥ 0,  . . . ,  xn ≥ 0,  x1^2 + · · · + xn^2 − 1 = 0.

The equation L(x)C(x) = In+1 is satisfied by

L(x) = [ In − x x^T    x 1n^T         2x
         (1/2) x^T     −(1/2) 1n^T    −1 ].
Example 5.5. Consider the set

1 − x1^3 − x2^4 ≥ 0,  1 − x3^4 − x4^3 ≥ 0.

The equation L(x)C(x) = I2 is satisfied by

L(x) = [ −x1/3   −x2/4   0       0       1   0
         0       0       −x3/4   −x4/3   0   1 ].
Example 5.6. Consider the quadratic set

1 − x1x2 − x2x3 − x1x3 ≥ 0,  1 − x1^2 − x2^2 − x3^2 ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I2 is

[ 25x1^3 + 10x1^2x2 + 40x1x2^2 − 25x1 − 2x3     −25x1^3 − 10x1^2x2 − 40x1x2^2 + (49/2)x1 + 2x3
  −15x1^2x2 + 10x1x2^2 + 20x1x2x3 − 10x1         15x1^2x2 − 10x1x2^2 − 20x1x2x3 + 10x1 − x2/2
  25x1^2x3 − 20x1x2^2 + 10x1x2x3 + 2x1          −25x1^2x3 + 20x1x2^2 − 10x1x2x3 − 2x1 − x3/2
  1 − 20x1x3 − 10x1^2 − 20x1x2                   20x1x2 + 20x1x3 + 10x1^2
  −50x1^2 − 20x1x2                               50x1^2 + 20x1x2 + 1 ].
We would like to remark that a polynomial tuple c = (c1, . . . , cm) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d1, . . . , dm, there exists an open dense subset U of D = R[x]d1 × · · · × R[x]dm such that every tuple c = (c1, . . . , cm) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof uses resultants and discriminants, for which we refer to [29]. First, let J1 be the set of all (i1, . . . , in+1) with 1 ≤ i1 < · · · < in+1 ≤ m. The resultant Res(ci1, . . . , cin+1) [29, Section 2] is a polynomial in the coefficients of ci1, . . . , cin+1 such that if Res(ci1, . . . , cin+1) ≠ 0 then the equations

ci1(x) = · · · = cin+1(x) = 0

have no complex solutions. Define

F1(c) = ∏_{(i1,...,in+1) ∈ J1} Res(ci1, . . . , cin+1).

For the case m ≤ n, J1 = ∅ and we simply let F1(c) = 1. Clearly, if F1(c) ≠ 0, then no more than n of the polynomials c1, . . . , cm have a common complex zero.
Second, let J2 be the set of all (j1, . . . , jk) with k ≤ n and 1 ≤ j1 < · · · < jk ≤ m. When one of cj1, . . . , cjk has degree bigger than one, the discriminant ∆(cj1, . . . , cjk) is a polynomial in the coefficients of cj1, . . . , cjk such that if ∆(cj1, . . . , cjk) ≠ 0 then the equations

cj1(x) = · · · = cjk(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of cj1, . . . , cjk at u are linearly independent. When all of cj1, . . . , cjk have degree one, the discriminant of the tuple (cj1, . . . , cjk) is not a single polynomial, but we can define ∆(cj1, . . . , cjk) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

F2(c) = ∏_{(j1,...,jk) ∈ J2} ∆(cj1, . . . , cjk).

Clearly, if F2(c) ≠ 0, then no n or fewer of the polynomials c1, . . . , cm have a singular common complex zero.
Last, let F(c) = F1(c)F2(c) and

U = { c = (c1, . . . , cm) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c1, . . . , cm can have a common complex zero. For any k polynomials (k ≤ n) of c1, . . . , cm, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically.
6. Numerical examples
This section gives examples of using the new relaxations (3.8)-(3.9), with Lagrange multiplier expressions, for solving the optimization problem (1.1). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.
The polynomials pi in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c1, . . . , cm. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p1, . . . , pm) to be the product L1(x)∇f(x), i.e.,

pi = ( L1(x)∇f(x) )_i .

In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sub-level set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
By Theorem 3.3, we have fk = fmin for all k big enough, if fc = fmin and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied in all our examples except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

min  x1x2(10 − x3)
s.t.  x1 ≥ 0,  x2 ≥ 0,  x3 ≥ 0,  1 − x1 − x2 − x3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.2 Each feasible point (x1, x2, x3) with x1x2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 6.1

              w/o LME                 with LME
order k   lower bound    time     lower bound    time
   2        −0.0521     0.6841     −0.0521      0.1922
   3        −0.0026     0.2657     −3·10^−8     0.2285
   4        −0.0007     0.6785     −6·10^−9     0.4431
   5        −0.0004     1.6105     −2·10^−9     0.9567
2 Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^n x_i(1 − x_i)^2 + Σ_{i≠j} x_i^2 x_j ∈ IQ(ceq, cin).
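The footnote's membership claim rests on the polynomial identity 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_i x_i(1 − x_i)^2 + Σ_{i≠j} x_i^2 x_j, which can be confirmed symbolically; a small sympy check (n = 3, illustrative only):

```python
import sympy as sp

n = 3
x = sp.symbols('x1:%d' % (n + 1))
S2 = sum(xi**2 for xi in x)                      # x^T x
lhs = 1 - S2
rhs = (1 - sum(x)) * (1 + S2) \
    + sum(xi * (1 - xi)**2 for xi in x) \
    + sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j)
assert sp.expand(lhs - rhs) == 0                 # identity holds
```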
Example 6.2. Consider the optimization problem

min  x1^4 x2^2 + x1^2 x2^4 + x3^6 − 3 x1^2 x2^2 x3^2 + (x1^4 + x2^4 + x3^4)
s.t.  x1^2 + x2^2 + x3^2 ≥ 1.

The matrix polynomial L(x) = [ (1/2)x1  (1/2)x2  (1/2)x3  −1 ]. The objective f is the sum of the positive definite form x1^4 + x2^4 + x3^4 and the Motzkin polynomial

M(x) = x1^4 x2^2 + x1^2 x2^4 + x3^6 − 3 x1^2 x2^2 x3^2.

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

c1(x) p1(x) = (x1^2 + x2^2 + x3^2 − 1) ( 3M(x) + 2(x1^4 + x2^4 + x3^4) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.3 The minimum value fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 2. Computational results for Example 6.2

              w/o LME                     with LME
order k   lower bound        time     lower bound    time
   3         −∞             0.4466      0.1111      0.1169
   4         −∞             0.4948      0.3333      0.3499
   5     −2.1821·10^5       1.1836      0.3333      0.6530
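The factor 3M(x) + 2(x1^4 + x2^4 + x3^4) in c1(x)p1(x) comes from the construction pi = (L1(x)∇f(x))_i at the beginning of this section: here p1 = (1/2)x^T∇f(x), and Euler's identity for the homogeneous parts of f (degrees 6 and 4) gives p1 = 3M(x) + 2(x1^4 + x2^4 + x3^4). A sympy check (illustrative only):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2  # Motzkin polynomial
quartic = x1**4 + x2**4 + x3**4
f = M + quartic
grad = sp.Matrix([sp.diff(f, v) for v in (x1, x2, x3)])
L1 = sp.Matrix([[x1/2, x2/2, x3/2]])          # first n columns of L(x)
p1 = sp.expand((L1 * grad)[0, 0])             # p1 = (1/2) x^T grad f
assert p1 == sp.expand(3*M + 2*quartic)       # Euler's identity, degrees 6 and 4
```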
Example 6.3. Consider the optimization problem

min  x1x2 + x2x3 + x3x4 − 3x1x2x3x4 + (x1^3 + · · · + x4^3)
s.t.  x1, x2, x3, x4 ≥ 0,  1 − x1 − x2 ≥ 0,  1 − x3 − x4 ≥ 0.

The matrix polynomial L(x) is

[ 1−x1   −x2    0      0     1  1  0  0  1  0
  −x1    1−x2   0      0     1  1  0  0  1  0
  0      0      1−x3   −x4   0  0  1  1  0  1
  0      0      −x3    1−x4  0  0  1  1  0  1
  −x1    −x2    0      0     1  1  0  0  1  0
  0      0      −x3    −x4   0  0  1  1  0  1 ].

The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(ceq, cin) is archimedean.4 The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.
3 This is because −c1^2 p1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ), and the set {−c1(x)^2 p1(x)^2 ≥ 0} is compact.
4 This is because 1 − x1^2 − x2^2 belongs to the quadratic module of (x1, x2, 1 − x1 − x2), and 1 − x3^2 − x4^2 belongs to the quadratic module of (x3, x4, 1 − x3 − x4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

              w/o LME                     with LME
order k   lower bound       time      lower bound    time
   3     −2.9·10^−5        0.7335     −6·10^−7      0.6091
   4     −1.4·10^−5        2.5055     −8·10^−8      2.7423
   5     −1.4·10^−5       12.7092     −5·10^−8     13.7449
Example 6.4. Consider the polynomial optimization problem

min_{x∈R^2}  x1^2 + 50 x2^2
s.t.  x1^2 − 1/2 ≥ 0,  x2^2 − 2x1x2 − 1/8 ≥ 0,  x2^2 + 2x1x2 − 1/8 ≥ 0.
It is motivated by an example in [12, §3]. The first column of L(x) is

[ (8/5)x1^3 + (1/5)x1
  (288/5)x2x1^4 − (16/5)x1^3 − (124/5)x2x1^2 + (8/5)x1 − 2x2
  −(288/5)x2x1^4 − (16/5)x1^3 + (124/5)x2x1^2 + (8/5)x1 + 2x2 ]

and the second column of L(x) is

[ −(8/5)x1^2x2 + (4/5)x2^3 − (1/10)x2
  (288/5)x1^3x2^2 + (16/5)x1^2x2 − (142/5)x1x2^2 − (9/20)x1 − (8/5)x2^3 + (11/5)x2
  −(288/5)x1^3x2^2 + (16/5)x1^2x2 + (142/5)x1x2^2 + (9/20)x1 − (8/5)x2^3 + (11/5)x2 ].
The objective is coercive, so fmin is achieved at a critical point. The minimum value fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

              w/o LME                 with LME
order k   lower bound    time     lower bound    time
   3         6.7535     0.4611     56.7500      0.1309
   4         6.9294     0.2428    112.6517      0.2405
   5         8.8519     0.3376    112.6517      0.2167
   6        16.5971     0.4703    112.6517      0.3788
   7        35.4756     0.6536    112.6517      0.4537
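The value in Table 4 can be cross-checked by direct evaluation: at the minimizer (√(1/2), √(5/8) + √(1/2)), the first two constraints are active and the objective equals 56 + 3/4 + 25√5 ≈ 112.6517. A quick numerical sketch (not part of the paper's computation):

```python
from math import sqrt, isclose

x1 = sqrt(1/2)
x2 = sqrt(5/8) + sqrt(1/2)
f = x1**2 + 50*x2**2
assert isclose(f, 56 + 3/4 + 25*sqrt(5), rel_tol=1e-12)
# first two constraints are active at the minimizer, the third is inactive
assert abs(x1**2 - 1/2) < 1e-12
assert abs(x2**2 - 2*x1*x2 - 1/8) < 1e-12
assert x2**2 + 2*x1*x2 - 1/8 > 1
```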
Example 6.5. Consider the optimization problem

min_{x∈R^3}  x1^3 + x2^3 + x3^3 + 4x1x2x3 − ( x1(x2^2 + x3^2) + x2(x3^2 + x1^2) + x3(x1^2 + x2^2) )
s.t.  x1 ≥ 0,  x1x2 − 1 ≥ 0,  x2x3 − 1 ≥ 0.

The matrix polynomial L(x) is

[ 1−x1x2   0    0   x2   x2   0
  x1       0    0   −1   −1   0
  −x1      x2   0   1    0    −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

              w/o LME                     with LME
order k   lower bound        time     lower bound    time
   2         −∞             0.4129       −∞         0.1900
   3     −7.8184·10^6       0.4641      0.9492      0.3139
   4     −2.0575·10^4       0.6499      0.9492      0.5057
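The displayed L(x) can be verified symbolically against the assumed structure C(x) = [∇c1 ∇c2 ∇c3 ; diag(c1, c2, c3)]; a sympy sketch (illustrative only):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
X = (x1, x2, x3)
c = [x1, x1*x2 - 1, x2*x3 - 1]            # constraints of Example 6.5
grad = sp.Matrix(3, 3, lambda i, j: sp.diff(c[j], X[i]))
C = grad.col_join(sp.diag(*c))            # assumed structure, 6 x 3
L = sp.Matrix([
    [1 - x1*x2, 0,  0, x2, x2,  0],
    [x1,        0,  0, -1, -1,  0],
    [-x1,       x2, 0,  1,  0, -1],
])
assert (L * C).expand() == sp.eye(3)      # L(x) C(x) = I_3
```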
Example 6.6. Consider the optimization problem (x0 = 1)

min_{x∈R^4}  x^T x + Σ_{i=0}^4 Π_{j≠i} (xi − xj)
s.t.  x1^2 − 1 ≥ 0,  x2^2 − 1 ≥ 0,  x3^2 − 1 ≥ 0,  x4^2 − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

(1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
(1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
(−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6

              w/o LME                     with LME
order k   lower bound        time     lower bound     time
   3         −∞             1.1377      3.5480       1.1765
   4     −6.6913·10^4       4.7677      4.0000       3.0761
   5       −21.3778        22.9970      4.0000      10.3354
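As a sanity check on the reported minimizers, one can evaluate the objective at all 16 sign vectors: exactly the 11 listed ones attain the value 4, and the other 5 give 20. A small sketch (illustrative only):

```python
from itertools import product
from math import prod

def f(x):
    # objective of Example 6.6 with x0 = 1 appended
    y = (1,) + tuple(x)
    return sum(t * t for t in x) + sum(
        prod(y[i] - y[j] for j in range(5) if j != i) for i in range(5)
    )

vals = {p: f(p) for p in product([1, -1], repeat=4)}
argmin = [p for p, v in vals.items() if v == 4]
assert len(argmin) == 11                              # the 11 listed minimizers
assert (1, 1, 1, 1) in argmin and (-1, 1, -1, -1) in argmin
assert all(v == 20 for p, v in vals.items() if p not in argmin)
```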
Example 6.7. Consider the optimization problem

min_{x∈R^3}  x1^4 x2^2 + x2^4 x3^2 + x3^4 x1^2 − 3 x1^2 x2^2 x3^2 + x2^2
s.t.  x1 − x2x3 ≥ 0,  −x2 + x3^2 ≥ 0.

The matrix polynomial L(x) = [ 1     0    0  0  0
                               −x3   −1   0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0. The computational results for the standard Lasserre
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7

              w/o LME                     with LME
order k   lower bound        time     lower bound     time
   3         −∞             0.6144       −∞          0.3418
   4     −1.0909·10^7       1.0542     −3.9476       0.7180
   5      −942.6772         1.6771     −3·10^−9      1.4607
   6       −0.0110          3.3532     −8·10^−10     3.1618
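The AM-GM argument gives x1^4x2^2 + x2^4x3^2 + x3^4x1^2 ≥ 3x1^2x2^2x3^2, hence f ≥ x2^2 ≥ 0 everywhere; this can be sanity-checked by random sampling (illustrative only):

```python
import random

def f(x1, x2, x3):
    return x1**4*x2**2 + x2**4*x3**2 + x3**4*x1**2 - 3*x1**2*x2**2*x3**2 + x2**2

random.seed(0)
samples = [tuple(random.uniform(-2, 2) for _ in range(3)) for _ in range(10000)]
assert all(f(*s) >= -1e-9 for s in samples)   # AM-GM: f >= x2^2 >= 0
assert f(1.5, 0.0, 0.0) == 0.0                # a minimizer of the form (x1, 0, x3)
```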
Example 6.8. Consider the optimization problem

min_{x∈R^4}  x1^2(x1 − x4)^2 + x2^2(x2 − x4)^2 + x3^2(x3 − x4)^2
             + 2x1x2x3(x1 + x2 + x3 − 2x4) + (x1 − 1)^2 + (x2 − 1)^2 + (x3 − 1)^2
s.t.  x1 − x2 ≥ 0,  x2 − x3 ≥ 0.

The matrix polynomial L(x) = [ 1  0  0  0  0  0
                               1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

              w/o LME                     with LME
order k   lower bound        time     lower bound    time
   2         −∞             0.3984     −0.3360      0.9321
   3         −∞             0.7634      0.9413      0.5240
   4     −6.4896·10^5       4.5496      0.9413      1.7192
   5     −3.1645·10^3      24.3665      0.9413      8.1228
Example 6.9. Consider the optimization problem

min_{x∈R^4}  (x1 + x2 + x3 + x4 + 1)^2 − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
s.t.  0 ≤ x1, . . . , x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.5 The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.
5 Note that 4 − Σ_{i=1}^4 x_i^2 = Σ_{i=1}^4 ( x_i(1 − x_i)^2 + (1 − x_i)(1 + x_i^2) ) ∈ IQ(ceq, cin).
Table 9. Computational results for Example 6.9

              w/o LME                 with LME
order k   lower bound     time     lower bound     time
   2       −0.0279       0.2262    −5·10^−6       1.1835
   3       −0.0005       0.4691    −6·10^−7       1.6566
   4       −0.0001       3.1098    −2·10^−7       5.5234
   5       −4·10^−5     16.5092    −6·10^−7      19.7320
For some polynomial optimization problems, the standard Lasserre relaxations may converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x0 = 1)

min_{x∈R^n}  Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generated 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

n           9        10       11        12        13        14
w/o LME   1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
with LME  1.9714   3.8288   8.2519   20.0310   37.6373   82.4778
7. Discussions
7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, . . . , gℓ). Its preordering is the set

Preord(cin, ψ) = Σ_{r1,...,rℓ ∈ {0,1}} g1^{r1} · · · gℓ^{rℓ} Σ[x].
The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)   f'_{pre,k} = min  ⟨f, y⟩
        s.t.  ⟨1, y⟩ = 1,  L^(k)_{ceq}(y) = 0,  L^(k)_{φ}(y) = 0,
              L^(k)_{g1^{r1}···gℓ^{rℓ}}(y) ⪰ 0  for all r1, . . . , rℓ ∈ {0, 1},
              y ∈ R^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)   f_{pre,k} = max  γ
        s.t.  f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

f_{pre,k} = f'_{pre,k} = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = fmin for all k big enough.
Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f̃(x) ≡ 0 on the set

K3 = { x ∈ Rn | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f̃^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same.
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = Im. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for the critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If, for each usage, the degree bound in [16] is applied, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

λ = ( C(x)^T C(x) )^{-1} C(x)^T [ ∇f(x)
                                  0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgements. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for their valuable comments on improving the paper.
References
[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput. 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim. 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc. 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim. 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim. 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim. 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J. 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim. 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim. 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093
E-mail address: njw@math.ucsd.edu
Each f_{m_t+j} a_t^2 ∈ I, since f_{m_t+j} ∈ I_t. So there exists k2 > 0 such that, for all ǫ > 0,

( f + ǫ − σt(ǫ) ) a_t^2 ∈ I_{2k2}.

Since 1 − a_{t−1}^2 − · · · − a_N^2 ∈ I, there also exists k3 > 0 such that, for all ǫ > 0,

( f + ǫ )( 1 − a_{t−1}^2 − · · · − a_N^2 ) ∈ I_{2k3}.
Hence, if k* ≥ max{k1, k2, k3}, then we have

f(x) + ǫ − σǫ ∈ I_{2k*}

for all ǫ > 0. By the construction, the degrees of all σi and ai are independent of ǫ. So, σǫ ∈ Qmod(cin, ψ)_{2k*} for all ǫ > 0, if k* is big enough. Note that

I_{2k*} + Qmod(cin, ψ)_{2k*} = IQ(ceq, cin)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that fk* ≥ fc − ǫ for all ǫ > 0. On the other hand, we always have fk* ≤ fc. So fk* = fc. Moreover, since fk is monotonically increasing, we must have fk = fc for all k ≥ k*.
Case II: Assume IQ(ceq, cin) is archimedean. Because

IQ(ceq, cin) ⊆ IQ(ceq, cin) + IQ(φ, ψ),

the set IQ(ceq, cin) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.
Case III: Suppose Assumption 3.2 holds. Let ϕ1, . . . , ϕN be real univariate polynomials such that ϕi(vj) = 0 for i ≠ j and ϕi(vj) = 1 for i = j. Let

s = s_t + · · · + s_N,  where each  s_i = (v_i − fc) ( ϕi(f) )^2.

Then s ∈ Σ[x]_{2k4} for some integer k4 > 0. Let

f̃ = f − fc − s.
We show that there exist an integer ℓ > 0 and q ∈ Qmod(cin, ψ) such that

f̃^{2ℓ} + q ∈ Ideal(ceq, φ).

This is because, by Assumption 3.2, f̃(x) ≡ 0 on the set

K2 = { x ∈ Rn : ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }.

It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ N and q = b0 + ρ b1 (b0, b1 ∈ Σ[x]) such that f̃^{2ℓ} + q ∈ Ideal(ceq, φ). By Assumption 3.2, ρ ∈ Qmod(cin, ψ), so we have q ∈ Qmod(cin, ψ).

For all ǫ > 0 and τ > 0, we have f̃ + ǫ = φǫ + θǫ, where

φǫ = −τ ǫ^{1−2ℓ} ( f̃^{2ℓ} + q ),
θǫ = ǫ ( 1 + f̃/ǫ + τ (f̃/ǫ)^{2ℓ} ) + τ ǫ^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k5 such that, for all ǫ > 0,

φǫ ∈ Ideal(ceq, φ)_{2k5},   θǫ ∈ Qmod(cin, ψ)_{2k5}.
Hence, we can get

f − (fc − ǫ) = φǫ + σǫ,

where σǫ = θǫ + s ∈ Qmod(cin, ψ)_{2k5} for all ǫ > 0. Note that

IQ(ceq, cin)_{2k5} + IQ(φ, ψ)_{2k5} = Ideal(ceq, φ)_{2k5} + Qmod(cin, ψ)_{2k5}.

For all ǫ > 0, γ = fc − ǫ is feasible in (3.9) for the order k5, so fk5 ≥ fc. Because of (3.11) and the monotonicity of fk, we have fk = f'k = fc for all k ≥ k5.
3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is fc, and the optimal value of (1.1) is fmin. If fmin is achievable at a critical point, then fc = fmin. In Theorem 3.3, we have shown that fk = fc for all k big enough, where fk is the optimal value of (3.9). The value fc or fmin is often not known. How do we detect the tightness fk = fc in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let
(3.16)   d = ⌈ deg(ceq, cin, φ, ψ) / 2 ⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)   rank Mt(y*) = rank M_{t−d}(y*),

then fk = fc, and we can get r = rank Mt(y*) minimizers for (3.7) [5, 14, 32].
The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
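To see what the rank condition measures, consider the truncated moment sequence of a Dirac measure: for y = [u]_{2k}, every moment matrix is the rank-one matrix v v^T, where v is the vector of monomials evaluated at u, so the ranks stabilize and the flat truncation condition holds with r = 1. A small numpy sketch (n = 2, illustrative only):

```python
import numpy as np
from itertools import combinations_with_replacement

def mono_vec(u, k):
    # vector of all monomials in u = (u1, ..., un) of degree <= k
    n = len(u)
    vals = [1.0]
    for d in range(1, k + 1):
        for idx in combinations_with_replacement(range(n), d):
            v = 1.0
            for i in idx:
                v *= u[i]
            vals.append(v)
    return np.array(vals)

u = (0.5, -1.2)
M2 = np.outer(mono_vec(u, 2), mono_vec(u, 2))   # moment matrix of [u]_4
M1 = np.outer(mono_vec(u, 1), mono_vec(u, 1))   # lower-order moment matrix
r2 = np.linalg.matrix_rank(M2)
r1 = np.linalg.matrix_rank(M1)
assert r1 == r2 == 1     # ranks agree: flat truncation with r = 1 minimizer
```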
Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., fc < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then fk = fc.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17) for some t ∈ [d, k], when k is big enough.
Proof. By Assumption 3.1, u is a critical point if and only if ceq(u) = 0 and φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

{ x ∈ Rn : ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(ceq, φ) + Qmod(ρ). By Assumption 3.2,

Ideal(ceq, φ) + Qmod(ρ) ⊆ IQ(ceq, cin) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, f achieves finitely many values on Kc, so (3.7) must achieve its optimal value fc. By Theorem 3.3, we know that fk = f'k = fc for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 13
iii) If (3.17) holds, we can get $r = \operatorname{rank} M_t(y^*)$ minimizers for (3.7) [5, 14], say $u_1, \ldots, u_r$, such that $f_k = f(u_i)$ for each $i$. Clearly, $f_k = f(u_i) \ge f_c$. On the other hand, we always have $f_k \le f_c$. So $f_k = f_c$.
iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)   $\min \ f(x)$ s.t. $c_{eq}(x) = 0$, $\phi(x) = 0$, $\rho(x) \ge 0$.

The optimal value of (3.18) is also $f_c$. Its $k$th order Lasserre relaxation is

(3.19)   $\gamma'_k = \min \ \langle f, y \rangle$ s.t. $\langle 1, y \rangle = 1$, $M_k(y) \succeq 0$, $L^{(k)}_{c_{eq}}(y) = 0$, $L^{(k)}_{\phi}(y) = 0$, $L^{(k)}_{\rho}(y) \succeq 0$.

Its dual optimization problem is

(3.20)   $\gamma_k = \max \ \gamma$ s.t. $f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(\rho)_{2k}$.

By repeating the same proof as for Theorem 3.3(iii), we can show that

$\gamma_k = \gamma'_k = f_c$

for all $k$ big enough. Because $\rho \in \mathrm{Qmod}(c_{in}, \psi)$, each $y$ that is feasible for (3.8) is also feasible for (3.19). So, when $k$ is big, each $y^*$ is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By [32, Theorem 2.6], the condition (3.17) must be satisfied for some $t \in [d, k]$, when $k$ is big enough.
If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].
4 Polyhedral constraints
In this section we assume the feasible set of (11) is the polyhedron
P = x isin Rn | Axminus b ge 0
where A =[a1 middot middot middot am
]T isin Rmtimesn b =[b1 middot middot middot bm
]T isin Rm This corresponds
to that E = empty I = [m] and each ci(x) = aTi xminus bi Denote
(4.1)   $D(x) = \operatorname{diag}\big(c_1(x), \ldots, c_m(x)\big), \qquad C(x) = \begin{bmatrix} A^T \\ D(x) \end{bmatrix}.$

The Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies

(4.2)   $\begin{bmatrix} A^T \\ D(x) \end{bmatrix} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}.$

If $\operatorname{rank} A = m$, we can express $\lambda$ as

(4.3)   $\lambda = (AA^T)^{-1} A \,\nabla f(x).$

If $\operatorname{rank} A < m$, how can we express $\lambda$ in terms of $x$? In computation, we often prefer a polynomial expression. If there exists $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ such that

(4.4)   $L(x)\,C(x) = I_m,$

then we can get

$\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x)\,\nabla f(x),$
14 JIAWANG NIE
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. In this section, we characterize when such $L(x)$ exists and give a degree bound for it.
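Before the general case, the full-row-rank formula (4.3) can be sanity-checked numerically. The sketch below uses made-up data ($A$ and the multipliers $\lambda_0$ are not from the paper): whenever $\nabla f(x)$ lies in the row space of $A$, the formula recovers the multipliers.

```python
import numpy as np

# Numerical check of (4.3): if grad f(x) = A^T lam0 for some lam0, then
# lam = (A A^T)^{-1} A grad f(x) recovers lam0.  A, lam0 are made-up data.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])          # m = 2, n = 3, rank A = 2
lam0 = np.array([0.7, -1.3])              # "true" multipliers
g = A.T @ lam0                            # a gradient in the row space of A
lam = np.linalg.solve(A @ A.T, A @ g)     # formula (4.3)
print(np.allclose(lam, lam0))             # True
```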
The linear function $Ax - b$ is said to be nonsingular if $\operatorname{rank} C(u) = m$ for all $u \in \mathbb{C}^n$ (see also Definition 5.1). This is equivalent to the condition that, for every $u$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then $a_{i_1}, \ldots, a_{i_k}$ are linearly independent.
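Nonsingularity can be probed numerically by sampling, as in the following sketch. This gives evidence only, not a certificate (the rank condition is over all of $\mathbb{C}^n$, while random real points only test generic behavior); the simplex data below are made up.

```python
import numpy as np

# Heuristic check that Ax - b >= 0 is nonsingular: rank C(u) = m at
# randomly sampled points u.  Data: the simplex {x1 >= 0, x2 >= 0,
# 1 - x1 - x2 >= 0} in R^2.
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.0, -1.0])
m = A.shape[0]

def C_of(u):
    # C(u) = [A^T; D(u)] with D(u) = diag(Au - b), as in (4.1)
    return np.vstack([A.T, np.diag(A @ u - b)])

ok = all(np.linalg.matrix_rank(C_of(rng.standard_normal(2))) == m
         for _ in range(100))
print(ok)   # True for these simplex constraints
```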
Proposition 4.1. The linear function $Ax - b$ is nonsingular if and only if there exists a matrix polynomial $L(x)$ satisfying (4.4). Moreover, when $Ax - b$ is nonsingular, we can choose $L(x)$ in (4.4) with $\deg(L) \le m - \operatorname{rank} A$.
Proof. Clearly, if (4.4) is satisfied by some $L(x)$, then $\operatorname{rank} C(u) \ge m$ for all $u$. This implies that $Ax - b$ is nonsingular.

Next, assume that $Ax - b$ is nonsingular. We show that (4.4) is satisfied by some $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ with degree $\le m - \operatorname{rank} A$. Let $r = \operatorname{rank} A$. Up to a linear coordinate transformation, we can reduce $x$ to an $r$-dimensional variable. Without loss of generality, we can assume that $\operatorname{rank} A = n$ and $m \ge n$.
For a subset $I = \{i_1, \ldots, i_{m-n}\}$ of $[m]$, denote

$c_I(x) = \prod_{i \in I} c_i(x), \qquad E_I(x) = c_I(x) \cdot \operatorname{diag}\big(c_{i_1}(x)^{-1}, \ldots, c_{i_{m-n}}(x)^{-1}\big),$

$D_I(x) = \operatorname{diag}\big(c_{i_1}(x), \ldots, c_{i_{m-n}}(x)\big), \qquad A_I = [a_{i_1} \ \cdots \ a_{i_{m-n}}]^T.$

For the case that $I = \emptyset$ (the empty set), we set $c_\emptyset(x) = 1$. Let

$\mathcal{V} = \{ I \subseteq [m] : |I| = m - n, \ \operatorname{rank} A_{[m] \setminus I} = n \}.$

Step I: For each $I \in \mathcal{V}$, we construct a matrix polynomial $L_I(x)$ such that
(4.5)   $L_I(x)\,C(x) = c_I(x)\,I_m.$

The matrix $L_I = L_I(x)$ satisfying (4.5) can be given by the following $2 \times 3$ block matrix, where $L_I(J, K)$ denotes the submatrix whose row indices are from $J$ and whose column indices are from $K$:

(4.6)   $\begin{array}{c|ccc} J \backslash K & [n] & n + I & n + ([m] \setminus I) \\ \hline I & 0 & E_I(x) & 0 \\ [m] \setminus I & c_I(x) \cdot (A_{[m] \setminus I})^{-T} & -(A_{[m] \setminus I})^{-T} (A_I)^T E_I(x) & 0 \end{array}$

Equivalently, the blocks of $L_I$ are

$L_I\big(I, [n]\big) = 0, \quad L_I\big(I, n + ([m] \setminus I)\big) = 0, \quad L_I\big([m] \setminus I, n + ([m] \setminus I)\big) = 0,$

$L_I\big(I, n + I\big) = E_I(x), \quad L_I\big([m] \setminus I, [n]\big) = c_I(x)\,(A_{[m] \setminus I})^{-T},$

$L_I\big([m] \setminus I, n + I\big) = -(A_{[m] \setminus I})^{-T} (A_I)^T E_I(x).$
For each $I \in \mathcal{V}$, $A_{[m] \setminus I}$ is invertible. (The superscript $-T$ denotes the inverse of the transpose.) Let $G = L_I(x)\,C(x)$; then one can verify that

$G(I, I) = E_I(x)\,D_I(x) = c_I(x)\,I_{m-n}, \qquad G(I, [m] \setminus I) = 0,$

$G([m] \setminus I, [m] \setminus I) = \big[\, c_I(x)\,(A_{[m] \setminus I})^{-T} \ \ -(A_{[m] \setminus I})^{-T}(A_I)^T E_I(x) \,\big] \begin{bmatrix} (A_{[m] \setminus I})^T \\ 0 \end{bmatrix} = c_I(x)\,I_n,$

$G([m] \setminus I, I) = \big[\, c_I(x)\,(A_{[m] \setminus I})^{-T} \ \ -(A_{[m] \setminus I})^{-T}(A_I)^T E_I(x) \,\big] \begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix} = 0.$

This shows that the above $L_I(x)$ satisfies (4.5).
Step II: We show that there exist real scalars $\nu_I$ satisfying

(4.7)   $\sum_{I \in \mathcal{V}} \nu_I\, c_I(x) = 1.$

This can be shown by induction on $m$.

• When $m = n$, $\mathcal{V} = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.

• When $m > n$, let

(4.8)   $N = \{ i \in [m] \mid \operatorname{rank} A_{[m] \setminus \{i\}} = n \}.$

For each $i \in N$, let $\mathcal{V}_i$ be the set of all $I' \subseteq [m] \setminus \{i\}$ such that $|I'| = m - n - 1$ and $\operatorname{rank} A_{[m] \setminus (I' \cup \{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m] \setminus \{i\}}\, x - b_{[m] \setminus \{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying

(4.9)   $\sum_{I' \in \mathcal{V}_i} \nu^{(i)}_{I'}\, c_{I'}(x) = 1.$

Since $\operatorname{rank} A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \ldots, \alpha_n$ such that

$a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.$

If all $\alpha_i = 0$, then $a_m = 0$, and hence $A$ can be replaced by its first $m - 1$ rows; so (4.7) is true by the induction. In the following, suppose at least one $\alpha_i \ne 0$, and write

$\{ i : \alpha_i \ne 0 \} = \{ i_1, \ldots, i_k \}.$

Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system

$c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0$

has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that

$\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.$

(This is implied by the echelon form of inconsistent linear systems.) Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),

$\sum_{I' \in \mathcal{V}_{i_j}} \nu^{(i_j)}_{I'}\, c_{I'}(x) = 1.$

Then we can get

$1 = \sum_{j=1}^{k+1} \mu_j\, c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I' \in \mathcal{V}_{i_j}} \nu^{(i_j)}_{I'}\, c_{i_j}(x)\, c_{I'}(x) = \sum_{\substack{I = I' \cup \{i_j\},\ I' \in \mathcal{V}_{i_j},\ 1 \le j \le k+1}} \nu^{(i_j)}_{I'}\, \mu_j\, c_I(x).$

Since each $I' \cup \{i_j\} \in \mathcal{V}$, (4.7) must be satisfied by some scalars $\nu_I$.
Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as

(4.10)   $L(x) = \sum_{I \in \mathcal{V}} \nu_I\, L_I(x).$

Clearly, $L(x)$ satisfies (4.4), because

$L(x)\,C(x) = \sum_{I \in \mathcal{V}} \nu_I\, L_I(x)\,C(x) = \sum_{I \in \mathcal{V}} \nu_I\, c_I(x)\, I_m = I_m.$

Each $L_I(x)$ has degree $\le m - n$, so $L(x)$ has degree $\le m - n$.
Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \operatorname{rank} A$. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)C(x) = I_m$ for polyhedral sets.
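The coefficient-matching computation can be sketched symbolically. For the interval $\{x \ge 0,\ 1 - x \ge 0\}$ in $\mathbb{R}^1$ (the $n = 1$ case of Example 4.3), giving $L(x)$ unknown affine entries and matching powers of $x$ in $L(x)C(x) = I_2$ yields a small linear system:

```python
import sympy as sp

# Determine L(x) in (4.4) by matching coefficients, for the interval
# {x >= 0, 1 - x >= 0} in R^1, so A = [1, -1]^T and C(x) = [A^T; D(x)].
x = sp.symbols('x')
C = sp.Matrix([[1, -1], [x, 0], [0, 1 - x]])

a = sp.symbols('a0:6'); b = sp.symbols('b0:6')          # unknown coefficients
L = sp.Matrix(2, 3, [a[k] + b[k]*x for k in range(6)])  # affine entries

eqs = []
for entry in L * C - sp.eye(2):
    eqs += sp.Poly(entry, x).all_coeffs()               # match powers of x
sol = sp.solve(eqs, a + b, dict=True)[0]
print(L.subs(sol))            # Matrix([[1 - x, 1, 1], [-x, 1, 1]])
```

Here the solution happens to be unique and agrees with the $n = 1$ instance of the box-constraint matrix in Example 4.3.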
Example 4.2. Consider the simplicial set

$\{ x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - e^T x \ge 0 \}.$

The equation $L(x)C(x) = I_{n+1}$ is satisfied by

$L(x) = \begin{bmatrix} 1 - x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\ -x_1 & 1 - x_2 & \cdots & -x_n & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ -x_1 & -x_2 & \cdots & 1 - x_n & 1 & \cdots & 1 \\ -x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \end{bmatrix}.$
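This identity is easy to verify symbolically; a sketch for $n = 3$:

```python
import sympy as sp

# Symbolic check of Example 4.2 with n = 3: L(x) C(x) = I_{n+1}.
n = 3
xs = sp.Matrix(sp.symbols('x1:4'))
A = sp.Matrix.vstack(sp.eye(n), -sp.ones(1, n))   # rows: e_i and -e^T
cs = list(xs) + [1 - sum(xs)]
C = sp.Matrix.vstack(A.T, sp.diag(*cs))           # C(x) = [A^T; D(x)]
L = sp.Matrix.vstack(
    sp.Matrix.hstack(sp.eye(n) - sp.ones(n, 1) * xs.T, sp.ones(n, n + 1)),
    sp.Matrix.hstack(-xs.T, sp.ones(1, n + 1)))
print(sp.expand(L * C))                           # the identity matrix I_4
```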
Example 4.3. Consider the box constraint

$\{ x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - x_1 \ge 0, \ \ldots, \ 1 - x_n \ge 0 \}.$

The equation $L(x)C(x) = I_{2n}$ is satisfied by

$L(x) = \begin{bmatrix} I_n - \operatorname{diag}(x) & I_n & I_n \\ -\operatorname{diag}(x) & I_n & I_n \end{bmatrix}.$
Example 4.4. Consider the polyhedral set

$\{ 1 - x_4 \ge 0, \ x_4 - x_3 \ge 0, \ x_3 - x_2 \ge 0, \ x_2 - x_1 \ge 0, \ x_1 + 1 \ge 0 \}.$

The equation $L(x)C(x) = I_5$ is satisfied by

$L(x) = \frac{1}{2} \begin{bmatrix} -x_1 - 1 & -x_2 - 1 & -x_3 - 1 & -x_4 - 1 & 1 & 1 & 1 & 1 & 1 \\ -x_1 - 1 & -x_2 - 1 & -x_3 - 1 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\ -x_1 - 1 & -x_2 - 1 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\ -x_1 - 1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\ 1 - x_1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}.$
Example 4.5. Consider the polyhedral set

$\{ 1 + x_1 \ge 0, \ 1 - x_1 \ge 0, \ 2 - x_1 - x_2 \ge 0, \ 2 - x_1 + x_2 \ge 0 \}.$

The matrix $L(x)$ satisfying $L(x)C(x) = I_4$ is

$\frac{1}{6} \begin{bmatrix} x_1^2 - 3x_1 + 2 & x_1 x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\ 3x_1^2 - 3x_1 - 6 & 3x_2 + 3x_1 x_2 & 6 - 3x_1 & -3x_1 & -3x_1 - 3 & -3x_1 - 3 \\ 1 - x_1^2 & -2x_2 - x_1 x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\ 1 - x_1^2 & 3 - x_1 x_2 - 2x_2 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \end{bmatrix}.$
5. General constraints

We now consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.

Suppose there are totally $m$ equality and inequality constraints, i.e.,

$E \cup I = \{1, \ldots, m\}.$

If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in E \cup I$. So the Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies the equation

(5.1)   $\underbrace{\begin{bmatrix} \nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\ c_1(x) & 0 & \cdots & 0 \\ 0 & c_2(x) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & c_m(x) \end{bmatrix}}_{C(x)} \ \lambda \ = \ \begin{bmatrix} \nabla f(x) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$
Let $C(x)$ be as above. If there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ such that

(5.2)   $L(x)\,C(x) = I_m,$

then we can get

(5.3)   $\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x)\,\nabla f(x),$

where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such $L(x)$ exists.
Definition 5.1. The tuple $c = (c_1, \ldots, c_m)$ of constraining polynomials is said to be nonsingular if $\operatorname{rank} C(u) = m$ for every $u \in \mathbb{C}^n$.
Clearly, $c$ being nonsingular is equivalent to the condition that, for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \ldots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.
Proposition 5.2. (i) For each $W(x) \in \mathbb{C}[x]^{s \times t}$ with $s \ge t$, we have $\operatorname{rank} W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t \times s}$ such that

$P(x)\,W(x) = I_t.$

Moreover, for $W(x) \in \mathbb{R}[x]^{s \times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t \times s}$ in the above.

(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ satisfying (5.2).
Proof. (i) "$\Leftarrow$": If $P(x)W(x) = I_t$, then for all $u \in \mathbb{C}^n$,

$t = \operatorname{rank} I_t \le \operatorname{rank} W(u) \le t.$

So $W(x)$ must have full column rank everywhere.

"$\Rightarrow$": Suppose $\operatorname{rank} W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:

$W(x) = [\, w_1(x) \ \ w_2(x) \ \ \cdots \ \ w_t(x) \,].$
Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote

$r_{1i}(x) = \xi_1(x)^T w_i(x);$

then (we use $\sim$ to denote row equivalence between matrices)

$W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix} \sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},$

where, for each $i = 2, \ldots, t$,

$w_i^{(1)}(x) = w_i(x) - r_{1i}(x)\,w_1(x).$

So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1) \times s}$ such that

$P_1(x)\,W(x) = W_1(x).$
Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation

$w_2^{(1)}(x) = 0$

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that

$\xi_2(x)^T w_2^{(1)}(x) = 1.$

For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then

$W_1(x) \sim \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix} \sim W_2(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x) \end{bmatrix},$

where, for each $i = 3, \ldots, t$,

$w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x)\,w_2^{(1)}(x).$

Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2) \times (s+1)}$ such that

$P_2(x)\,W_1(x) = W_2(x).$
Continuing this process, we finally get

$W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & 1 & \cdots & r_{3t}(x) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}.$

Consequently, there exist $P_i(x) \in \mathbb{C}[x]^{(s+i) \times (s+i-1)}$, for $i = 1, 2, \ldots, t$, such that

$P_t(x)\,P_{t-1}(x) \cdots P_1(x)\,W(x) = W_t(x).$

Since the nonzero block of $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t \times (s+t)}$ such that $P_{t+1}(x)\,W_t(x) = I_t$. Let

$P(x) = P_{t+1}(x)\,P_t(x)\,P_{t-1}(x) \cdots P_1(x);$

then $P(x)\,W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t \times s}$.
For $W(x) \in \mathbb{R}[x]^{s \times t}$, we can replace $P(x)$ by $\big(P(x) + \overline{P(x)}\big)/2$ (where $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i).
In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, the matrix $L(x)$ can be determined by comparing coefficients on both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).
Example 5.3. Consider the hypercube, given by the quadratic constraints

$1 - x_1^2 \ge 0, \ \ 1 - x_2^2 \ge 0, \ \ \ldots, \ \ 1 - x_n^2 \ge 0.$

The equation $L(x)C(x) = I_n$ is satisfied by

$L(x) = \big[\, -\tfrac{1}{2}\operatorname{diag}(x) \ \ \ I_n \,\big].$
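With this $L(x)$, formula (5.3) gives $\lambda = -\tfrac{1}{2}\operatorname{diag}(x)\,\nabla f(x)$ on the critical set. A small numerical illustration (the objective below is made up, not from the paper):

```python
import numpy as np

# For the hypercube constraints c_i = 1 - x_i^2, formula (5.3) reads
# lambda_i = -x_i * (grad f(x))_i / 2.  We check KKT stationarity
# grad f = sum_i lambda_i grad c_i at the minimizer of the toy objective
# f(x) = x_1 + x_2 + x_3 over [-1, 1]^3.
x = -np.ones(3)                      # minimizer of f
grad_f = np.ones(3)
lam = -0.5 * x * grad_f              # L_1(x) grad f(x)
grad_c = -2 * x                      # grad c_i = -2 x_i e_i (diagonal)
print(np.allclose(grad_f, lam * grad_c), lam.min() >= 0)   # True True
```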
Example 5.4. Consider the nonnegative portion of the unit sphere:

$x_1 \ge 0, \ \ x_2 \ge 0, \ \ \ldots, \ \ x_n \ge 0, \ \ x_1^2 + \cdots + x_n^2 - 1 = 0.$

The equation $L(x)C(x) = I_{n+1}$ is satisfied by

$L(x) = \begin{bmatrix} I_n - xx^T & x\,\mathbf{1}_n^T & 2x \\ \tfrac{1}{2}x^T & -\tfrac{1}{2}\mathbf{1}_n^T & -1 \end{bmatrix}.$
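A symbolic check of this $L(x)$, sketched for $n = 3$:

```python
import sympy as sp

# Verify Example 5.4 with n = 3: L(x) C(x) = I_{n+1} for the constraints
# x_i >= 0 (i = 1..n) and x^T x - 1 = 0.
n = 3
x = sp.Matrix(sp.symbols('x1:4'))
grads = sp.Matrix.hstack(sp.eye(n), 2 * x)     # gradients of x_i and x^T x - 1
cs = list(x) + [(x.T * x)[0, 0] - 1]
C = sp.Matrix.vstack(grads, sp.diag(*cs))
L = sp.Matrix.vstack(
    sp.Matrix.hstack(sp.eye(n) - x * x.T, x * sp.ones(1, n), 2 * x),
    sp.Matrix.hstack(x.T / 2, -sp.ones(1, n) / 2, sp.Matrix([[-1]])))
print(sp.expand(L * C))                        # the identity matrix I_4
```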
Example 5.5. Consider the set

$1 - x_1^3 - x_2^4 \ge 0, \qquad 1 - x_3^4 - x_4^3 \ge 0.$

The equation $L(x)C(x) = I_2$ is satisfied by

$L(x) = \begin{bmatrix} -\tfrac{x_1}{3} & -\tfrac{x_2}{4} & 0 & 0 & 1 & 0 \\ 0 & 0 & -\tfrac{x_3}{4} & -\tfrac{x_4}{3} & 0 & 1 \end{bmatrix}.$
Example 5.6. Consider the quadratic set

$1 - x_1 x_2 - x_2 x_3 - x_1 x_3 \ge 0, \qquad 1 - x_1^2 - x_2^2 - x_3^2 \ge 0.$

The matrix $L(x)^T$ satisfying $L(x)C(x) = I_2$ is

$\begin{bmatrix} 25x_1^3 + 10x_1^2 x_2 + 40x_1 x_2^2 - 25x_1 - 2x_3 & -25x_1^3 - 10x_1^2 x_2 - 40x_1 x_2^2 + \tfrac{49x_1}{2} + 2x_3 \\ -15x_1^2 x_2 + 10x_1 x_2^2 + 20x_1 x_2 x_3 - 10x_1 & 15x_1^2 x_2 - 10x_1 x_2^2 - 20x_1 x_2 x_3 + 10x_1 - \tfrac{x_2}{2} \\ 25x_1^2 x_3 - 20x_1 x_2^2 + 10x_1 x_2 x_3 + 2x_1 & -25x_1^2 x_3 + 20x_1 x_2^2 - 10x_1 x_2 x_3 - 2x_1 - \tfrac{x_3}{2} \\ 1 - 20x_1 x_3 - 10x_1^2 - 20x_1 x_2 & 20x_1 x_2 + 20x_1 x_3 + 10x_1^2 \\ -50x_1^2 - 20x_1 x_2 & 50x_1^2 + 20x_1 x_2 + 1 \end{bmatrix}.$
We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $\mathcal{U}$ of $\mathcal{D} = \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1, \ldots, c_m) \in \mathcal{U}$ is nonsingular. Indeed, such $\mathcal{U}$ can be chosen to be a Zariski open subset of $\mathcal{D}$, i.e., it is the complement of a proper real variety of $\mathcal{D}$. Moreover, Assumption 3.1 holds for all $c \in \mathcal{U}$, i.e., it holds generically.
Proof. The proof uses resultants and discriminants, for which we refer to [29].

First, let $J_1$ be the set of all tuples $(i_1, \ldots, i_{n+1})$ with $1 \le i_1 < \cdots < i_{n+1} \le m$. The resultant $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that, if $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \ne 0$, then the equations

$c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0$

have no complex solutions. Define

$F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}).$

For the case that $m \le n$, $J_1 = \emptyset$, and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \ne 0$, then no more than $n$ of the polynomials $c_1, \ldots, c_m$ can have a common complex zero.

Second, let $J_2$ be the set of all tuples $(j_1, \ldots, j_k)$ with $k \le n$ and $1 \le j_1 < \cdots < j_k \le m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that, if $\Delta(c_{j_1}, \ldots, c_{j_k}) \ne 0$, then the equations

$c_{j_1}(x) = \cdots = c_{j_k}(x) = 0$

have no singular complex solution [29, Section 3], i.e., at every common complex solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximum minors of its Jacobian (a constant matrix). Define

$F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}).$

Clearly, if $F_2(c) \ne 0$, then no $n$ or fewer of the polynomials $c_1, \ldots, c_m$ can have a singular common complex zero.

Last, let $F(c) = F_1(c)\,F_2(c)$ and

$\mathcal{U} = \{ c = (c_1, \ldots, c_m) \in \mathcal{D} : F(c) \ne 0 \}.$

Note that $\mathcal{U}$ is a Zariski open subset of $\mathcal{D}$, and it is open dense in $\mathcal{D}$. For all $c \in \mathcal{U}$, no more than $n$ of $c_1, \ldots, c_m$ can have a common complex zero; and for any $k$ polynomials ($k \le n$) of $c_1, \ldots, c_m$, if they have a common complex zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in \mathcal{U}$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in \mathcal{U}$, so it holds generically.
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.
The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x)\nabla f(x)$, i.e.,

$p_i = \big( L_1(x)\,\nabla f(x) \big)_i.$

In all our examples, the global minimum value $f_{\min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sublevel set $\{f(x) \le \ell\}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{\min}$ for all $k$ big enough, if $f_c = f_{\min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proves that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (which have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

min $x_1 x_2 (10 - x_3)$
s.t. $x_1 \ge 0, \ x_2 \ge 0, \ x_3 \ge 0, \ 1 - x_1 - x_2 - x_3 \ge 0$.

The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{\min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1 x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{\min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

order k | lower bound (w/o LME) | time   | lower bound (with LME) | time
2       | -0.0521               | 0.6841 | -0.0521                | 0.1922
3       | -0.0026               | 0.2657 | -3·10⁻⁸                | 0.2285
4       | -0.0007               | 0.6785 | -6·10⁻⁹                | 0.4431
5       | -0.0004               | 1.6105 | -2·10⁻⁹                | 0.9567

² Note that $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_{i=1}^n x_i (1 - x_i)^2 + \sum_{i \ne j} x_i^2 x_j \in IQ(c_{eq}, c_{in})$.
Example 6.2. Consider the optimization problem

min $x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4)$
s.t. $x_1^2 + x_2^2 + x_3^2 \ge 1$.

The matrix polynomial is $L(x) = [\tfrac{1}{2}x_1 \ \ \tfrac{1}{2}x_2 \ \ \tfrac{1}{2}x_3 \ \ -1]$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial

$M(x) = x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2.$

Note that $M(x)$ is nonnegative everywhere but is not SOS [37]. Clearly, $f$ is coercive, and $f_{\min}$ is achieved at a critical point. The set $IQ(\phi, \psi)$ is archimedean, because

$c_1(x)\,p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) \big) = 0$

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value is $f_{\min} = \tfrac{1}{3}$, and there are 8 minimizers $(\pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}})$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{\min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

order k | lower bound (w/o LME) | time   | lower bound (with LME) | time
3       | -∞                    | 0.4466 | 0.1111                 | 0.1169
4       | -∞                    | 0.4948 | 0.3333                 | 0.3499
5       | -2.1821·10⁵           | 1.1836 | 0.3333                 | 0.6530
Example 6.3. Consider the optimization problem

min $x_1 x_2 + x_2 x_3 + x_3 x_4 - 3 x_1 x_2 x_3 x_4 + (x_1^3 + \cdots + x_4^3)$
s.t. $x_1, x_2, x_3, x_4 \ge 0, \ 1 - x_1 - x_2 \ge 0, \ 1 - x_3 - x_4 \ge 0$.

The matrix polynomial is

$L(x) = \begin{bmatrix} 1 - x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\ -x_1 & 1 - x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 - x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & -x_3 & 1 - x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\ -x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \end{bmatrix}.$

The feasible set is compact, so $f_{\min}$ is achieved at a critical point. One can show that $f_{\min} = 0$, and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $IQ(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because $-c_1^2 p_1^2 \in \mathrm{Ideal}(\phi) \subseteq IQ(\phi, \psi)$, and the set $\{-c_1(x)^2 p_1(x)^2 \ge 0\}$ is compact.

⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$; see the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

order k | lower bound (w/o LME) | time    | lower bound (with LME) | time
3       | -2.9·10⁻⁵             | 0.7335  | -6·10⁻⁷                | 0.6091
4       | -1.4·10⁻⁵             | 2.5055  | -8·10⁻⁸                | 2.7423
5       | -1.4·10⁻⁵             | 12.7092 | -5·10⁻⁸                | 13.7449
Example 6.4. Consider the polynomial optimization problem

min_{x∈R²} $x_1^2 + 50 x_2^2$
s.t. $x_1^2 - \tfrac{1}{2} \ge 0, \ x_2^2 - 2x_1 x_2 - \tfrac{1}{8} \ge 0, \ x_2^2 + 2x_1 x_2 - \tfrac{1}{8} \ge 0$.

It is motivated by an example in [12, §3]. The first column of $L(x)$ is

$\begin{bmatrix} \tfrac{8x_1^3}{5} + \tfrac{x_1}{5} \\[3pt] \tfrac{288 x_2 x_1^4}{5} - \tfrac{16 x_1^3}{5} - \tfrac{124 x_1^2 x_2}{5} + \tfrac{8x_1}{5} - 2x_2 \\[3pt] -\tfrac{288 x_2 x_1^4}{5} - \tfrac{16 x_1^3}{5} + \tfrac{124 x_1^2 x_2}{5} + \tfrac{8x_1}{5} + 2x_2 \end{bmatrix},$

and the second column of $L(x)$ is

$\begin{bmatrix} -\tfrac{8 x_1^2 x_2}{5} + \tfrac{4 x_2^3}{5} - \tfrac{x_2}{10} \\[3pt] \tfrac{288 x_1^3 x_2^2}{5} + \tfrac{16 x_1^2 x_2}{5} - \tfrac{142 x_1 x_2^2}{5} - \tfrac{9x_1}{20} - \tfrac{8x_2^3}{5} + \tfrac{11x_2}{5} \\[3pt] -\tfrac{288 x_1^3 x_2^2}{5} + \tfrac{16 x_1^2 x_2}{5} + \tfrac{142 x_1 x_2^2}{5} + \tfrac{9x_1}{20} - \tfrac{8x_2^3}{5} + \tfrac{11x_2}{5} \end{bmatrix}.$

The objective is coercive, so $f_{\min}$ is achieved at a critical point. The minimum value is $f_{\min} = 56 + \tfrac{3}{4} + 25\sqrt{5} \approx 112.6517$, and the minimizers are $\big( \pm\sqrt{1/2}, \ \pm(\sqrt{5/8} + \sqrt{1/2}) \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{\min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

order k | lower bound (w/o LME) | time   | lower bound (with LME) | time
3       |  6.7535               | 0.4611 |  56.7500               | 0.1309
4       |  6.9294               | 0.2428 | 112.6517               | 0.2405
5       |  8.8519               | 0.3376 | 112.6517               | 0.2167
6       | 16.5971               | 0.4703 | 112.6517               | 0.3788
7       | 35.4756               | 0.6536 | 112.6517               | 0.4537
Example 6.5. Consider the optimization problem

min_{x∈R³} $x_1^3 + x_2^3 + x_3^3 + 4x_1 x_2 x_3 - \big( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) \big)$
s.t. $x_1 \ge 0, \ x_1 x_2 - 1 \ge 0, \ x_2 x_3 - 1 \ge 0$.

The matrix polynomial is

$L(x) = \begin{bmatrix} 1 - x_1 x_2 & 0 & 0 & x_2 & x_2 & 0 \\ x_1 & 0 & 0 & -1 & -1 & 0 \\ -x_1 & x_2 & 0 & 1 & 0 & -1 \end{bmatrix}.$

The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{\min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{\min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

order k | lower bound (w/o LME) | time   | lower bound (with LME) | time
2       | -∞                    | 0.4129 | -∞                     | 0.1900
3       | -7.8184·10⁶           | 0.4641 | 0.9492                 | 0.3139
4       | -2.0575·10⁴           | 0.6499 | 0.9492                 | 0.5057
Example 6.6. Consider the optimization problem ($x_0 = 1$)

min_{x∈R⁴} $x^T x + \sum_{i=0}^4 \prod_{j \ne i} (x_i - x_j)$
s.t. $x_1^2 - 1 \ge 0, \ x_2^2 - 1 \ge 0, \ x_3^2 - 1 \ge 0, \ x_4^2 - 1 \ge 0$.

The matrix polynomial is $L(x) = [\tfrac{1}{2}\operatorname{diag}(x) \ \ -I_4]$. The first part of the objective is $x^T x$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{\min}$ is achieved at a critical point. In computation, we got $f_{\min} = 4.0000$ and 11 global minimizers:

$(1, 1, 1, 1), \ (1, -1, -1, 1), \ (1, -1, 1, -1), \ (1, 1, -1, -1),$
$(1, -1, -1, -1), \ (-1, -1, 1, 1), \ (-1, 1, -1, 1), \ (-1, 1, 1, -1),$
$(-1, -1, -1, 1), \ (-1, -1, 1, -1), \ (-1, 1, -1, -1).$

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{\min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

order k | lower bound (w/o LME) | time    | lower bound (with LME) | time
3       | -∞                    | 1.1377  | 3.5480                 | 1.1765
4       | -6.6913·10⁴           | 4.7677  | 4.0000                 | 3.0761
5       | -21.3778              | 22.9970 | 4.0000                 | 10.3354
Example 6.7. Consider the optimization problem

min_{x∈R³} $x_1^4 x_2^2 + x_2^4 x_3^2 + x_3^4 x_1^2 - 3x_1^2 x_2^2 x_3^2 + x_2^2$
s.t. $x_1 - x_2 x_3 \ge 0, \ -x_2 + x_3^2 \ge 0$.

The matrix polynomial is $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{\min} = 0$. The global minimizers are the points $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1 x_3 = 0$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{\min}$ for all $k \ge 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

order k | lower bound (w/o LME) | time   | lower bound (with LME) | time
3       | -∞                    | 0.6144 | -∞                     | 0.3418
4       | -1.0909·10⁷           | 1.0542 | -3.9476                | 0.7180
5       | -942.6772             | 1.6771 | -3·10⁻⁹                | 1.4607
6       | -0.0110               | 3.3532 | -8·10⁻¹⁰               | 3.1618
Example 6.8. Consider the optimization problem

min_{x∈R⁴} $x_1^2(x_1 - x_4)^2 + x_2^2(x_2 - x_4)^2 + x_3^2(x_3 - x_4)^2 + 2x_1 x_2 x_3 (x_1 + x_2 + x_3 - 2x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2$
s.t. $x_1 - x_2 \ge 0, \ x_2 - x_3 \ge 0$.

The matrix polynomial is $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{\min}$ is achieved at a critical point. In computation, we got $f_{\min} \approx 0.9413$ and a minimizer

$(0.5632, 0.5632, 0.5632, 0.7510).$

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{\min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

order k | lower bound (w/o LME) | time    | lower bound (with LME) | time
2       | -∞                    | 0.3984  | -0.3360                | 0.9321
3       | -∞                    | 0.7634  |  0.9413                | 0.5240
4       | -6.4896·10⁵           | 4.5496  |  0.9413                | 1.7192
5       | -3.1645·10³           | 24.3665 |  0.9413                | 8.1228
Example 6.9. Consider the optimization problem

min_{x∈R⁴} $(x_1 + x_2 + x_3 + x_4 + 1)^2 - 4(x_1 x_2 + x_2 x_3 + x_3 x_4 + x_4 + x_1)$
s.t. $0 \le x_1, \ldots, x_4 \le 1$.

The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{\min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value is $f_{\min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1 - t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in IQ(c_{eq}, c_{in})$.

Table 9. Computational results for Example 6.9

order k | lower bound (w/o LME) | time    | lower bound (with LME) | time
2       | -0.0279               | 0.2262  | -5·10⁻⁶                | 1.1835
3       | -0.0005               | 0.4691  | -6·10⁻⁷                | 1.6566
4       | -0.0001               | 3.1098  | -2·10⁻⁷                | 5.5234
5       | -4·10⁻⁵               | 16.5092 | -6·10⁻⁷                | 19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem ($x_0 = 1$)

min_{x∈Rⁿ} $\sum_{0 \le i \le j \le k \le n} c_{ijk}\, x_i x_j x_k$
s.t. $0 \le x \le 1$,

where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{\min}$. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their consumed time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

n        | 9      | 10     | 11     | 12      | 13      | 14
w/o LME  | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111
with LME | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{\min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

$IQ(c_{eq}, c_{in})_{2k} + IQ(\phi, \psi)_{2k} = \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(c_{in}, \psi)_{2k}.$

If we replace the quadratic module $\mathrm{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get even tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set

$\mathrm{Preord}(c_{in}, \psi) = \sum_{r_1, \ldots, r_\ell \in \{0, 1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x].$

The truncation $\mathrm{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\mathrm{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is

(7.1)   $f'^{pre}_k = \min \ \langle f, y \rangle$ s.t. $\langle 1, y \rangle = 1$, $L^{(k)}_{c_{eq}}(y) = 0$, $L^{(k)}_{\phi}(y) = 0$, $L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0$ for all $r_1, \ldots, r_\ell \in \{0, 1\}$, $y \in \mathbb{R}^{\mathbb{N}^n_{2k}}$.
Similar to (3.9), the dual optimization problem of the above is

(7.2)   $f^{pre}_k = \max \ \gamma$ s.t. $f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}$.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then

$f^{pre}_k = f'^{pre}_k = f_c$

for all $k$ sufficiently large. Therefore, if the minimum value $f_{\min}$ of (1.1) is achieved at a critical point, then $f^{pre}_k = f'^{pre}_k = f_{\min}$ for all $k$ big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set

$K_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ c_{in}(x) \ge 0, \ \psi(x) \ge 0 \}.$

By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \mathrm{Preord}(c_{in}, \psi)$ such that $f^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same.
72 Singular constraining polynomials. As shown in Proposition 52, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = Im. Hence the Lagrange multiplier λ can be expressed as in (53). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
73 Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 52? When c is linear, a degree bound is given in Proposition 41. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 52, Hilbert's Nullstellensatz is used t times; there exist sharp degree bounds for Hilbert's Nullstellensatz [16], and if the degree bound in [16] is used for each application, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
74 Rational representation of Lagrange multipliers. In (51), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

  λ = ( C(x)^T C(x) )^{-1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
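The rational expression above can be checked numerically on a toy instance. The following Python sketch (not part of the paper; the instance f = x1 with the single equality constraint c = x1^2 + x2^2 − 1 is an assumption for illustration) evaluates λ(x) = (C(x)^T C(x))^{-1} C(x)^T [∇f(x); 0]. At the critical points (±1, 0) it recovers the KKT multiplier 1/(2 x1).

```python
# Sketch (toy instance, not from the paper): rational multiplier expression
# lambda(x) = (C(x)^T C(x))^{-1} C(x)^T [grad f(x); 0]
# for f(x) = x1 with one equality constraint c(x) = x1^2 + x2^2 - 1.
def multiplier(x1, x2):
    c = x1**2 + x2**2 - 1.0          # constraint value
    grad_c = (2.0*x1, 2.0*x2)        # gradient of c
    grad_f = (1.0, 0.0)              # gradient of f = x1
    # C(x) is the single column [grad c; c], so the normal equation is scalar
    num = grad_c[0]*grad_f[0] + grad_c[1]*grad_f[1]   # C(x)^T [grad f; 0]
    den = grad_c[0]**2 + grad_c[1]**2 + c**2          # C(x)^T C(x)
    return num / den

# At the critical points (+-1, 0) of min x1 on the circle, the KKT
# condition grad f = lambda * grad c gives lambda = 1/(2 x1).
lam = multiplier(1.0, 0.0)
```

On the circle c(x) = 0 the denominator reduces to 4(x1^2 + x2^2) = 4, so the expression simplifies to x1/2 there, matching the KKT value.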
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References
[1] D. Bertsekas, Nonlinear Programming, second edition, Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy, Real Algebraic Geometry, Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre, Minimizing the sum of many rational functions, Math. Programming Comput., 8 (2016), pp. 83-111.
[4] D. Cox, J. Little and D. O'Shea, Ideals, Varieties and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition, Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow, Truncated K-moment problems in several variables, Journal of Operator Theory, 54 (2005), pp. 189-226.
[6] J. Demmel, J. Nie and V. Powers, Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals, Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189-200.
[7] E. de Klerk and M. Laurent, On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems, SIAM Journal on Optimization, 21 (2011), pp. 824-832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun, Bound-constrained polynomial optimization using only elementary calculations, Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun, Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization, Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent, Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization, Preprint, 2016. arXiv:1603.03329 [math.OC].
[11] G.-M. Greuel and G. Pfister, A Singular Introduction to Commutative Algebra, Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang, Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization, SIAM J. Optim., 19 (2), pp. 503-523.
[13] D. Henrion, J. Lasserre and J. Loefberg, GloptiPoly 3: moments, optimization and semidefinite programming, Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761-779.
[14] D. Henrion and J.B. Lasserre, Detecting global optimality and extracting solutions in GloptiPoly, Positive Polynomials in Control, pp. 293-310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk, Global optimization of rational functions: a semidefinite programming approach, Mathematical Programming, 106 (2006), no. 1, pp. 93-109.
[16] J. Kollar, Sharp effective Nullstellensatz, J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963-975.
[17] J.B. Lasserre, Global optimization with polynomials and the problem of moments, SIAM J. Optim., 11 (3), pp. 796-817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski, Semidefinite characterization and computation of zero-dimensional real radical ideals, Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607-647.
[19] J. Lasserre, A new look at nonnegativity on closed sets and polynomial optimization, SIAM J. Optim., 21 (2011), pp. 864-885.
[20] J.B. Lasserre, Convexity in semi-algebraic geometry and polynomial optimization, SIAM J. Optim., 19 (2009), pp. 1995-2014.
[21] J.B. Lasserre, Moments, Positive Polynomials and Their Applications, Imperial College Press, 2009.
[22] J.B. Lasserre, Introduction to Polynomial and Semi-Algebraic Optimization, Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang, A bounded degree SOS hierarchy for polynomial optimization, Euro. J. Comput. Optim., to appear.
[24] M. Laurent, Semidefinite representations for finite varieties, Mathematical Programming, 109 (2007), pp. 1-26.
[25] M. Laurent, Sums of squares, moment matrices and optimization over polynomials, Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157-270, 2009.
[26] M. Laurent, Optimization over polynomials: selected topics, Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels, Minimizing polynomials via sum of squares over the gradient ideal, Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587-606.
[28] J. Nie, J. Demmel and M. Gu, Global minimization of rational functions and the nearest GCDs, Journal of Global Optimization, 40 (2008), no. 4, pp. 697-718.
[29] J. Nie, Discriminants and nonnegative polynomials, Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167-191, 2012.
[30] J. Nie, An exact Jacobian SDP relaxation for polynomial optimization, Mathematical Programming, Ser. A, 137 (2013), pp. 225-255.
[31] J. Nie, Polynomial optimization with real varieties, SIAM J. Optim., Vol. 23, No. 3, pp. 1634-1646, 2013.
[32] J. Nie, Certifying convergence of Lasserre's hierarchy via flat truncation, Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485-510.
[33] J. Nie, Optimality conditions and finite convergence of Lasserre's hierarchy, Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97-121.
[34] J. Nie, Linear optimization with cones of moments and nonnegative polynomials, Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247-274.
[35] P.A. Parrilo and B. Sturmfels, Minimizing polynomial functions, in S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83-99, AMS, 2003.
[36] M. Putinar, Positive polynomials on compact semi-algebraic sets, Ind. Univ. Math. J., 42 (1993), pp. 969-984.
[37] B. Reznick, Some concrete aspects of Hilbert's 17th problem, in Contemp. Math., Vol. 253, pp. 251-272, American Mathematical Society, 2000.
[38] M. Schweighofer, Global optimization of polynomials using gradient tentacles and sums of squares, SIAM J. Optim., 17 (2006), no. 3, pp. 920-942.
[39] C. Scheiderer, Positivity and sums of squares: a guide to recent results, Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271-324.
[40] J.F. Sturm, SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones, Optim. Methods Softw., 11 & 12 (1999), pp. 625-653.
[41] B. Sturmfels, Solving Systems of Polynomial Equations, CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son, Global optimization of polynomials using the truncated tangency variety and sums of squares, SIAM J. Optim., 19 (2008), no. 2, pp. 941-951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu
For all ε > 0, γ = fc − ε is feasible in (39) for the order k5, so fk5 ≥ fc. Because of (311) and the monotonicity of fk, we have fk = f′k = fc for all k ≥ k5.
32 Detecting tightness and extracting minimizers. The optimal value of (37) is fc, and the optimal value of (11) is fmin. If fmin is achievable at a critical point, then fc = fmin. In Theorem 33, we have shown that fk = fc for all k big enough, where fk is the optimal value of (39). The value fc or fmin is often not known. How do we detect the tightness fk = fc in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (38) for the order k. Let

(316)  d = ⌈deg(ceq, cin, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(317)  rank Mt(y*) = rank M_{t−d}(y*),

then fk = fc and we can get r := rank Mt(y*) minimizers for (37) [5, 14, 32]. The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (317) can serve as a sufficient and necessary condition for detecting tightness. The case that (37) is infeasible (i.e., no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (38)-(39).
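For intuition on why the rank condition (317) stabilizes, the following Python sketch (a toy illustration, not the GloptiPoly implementation) builds the moment matrices of the truncated moment sequence y = [u]_{2k} of a single point u in R^2. Every Mt(y) is then the rank-one matrix vt(u)vt(u)^T, where vt(u) is the vector of monomials of degree ≤ t, so rank Mt = rank M_{t−d} = 1 and the test certifies r = 1 minimizer.

```python
# Illustration (sketch, toy instance): moment matrices of a one-atom
# truncated moment sequence all have rank one, so the flat-truncation
# rank test rank M_t = rank M_{t-d} is satisfied with r = 1.
def monomials(u, t):
    # monomial vector [u1^a * u2^b : a + b <= t] for u in R^2
    return [u[0]**a * u[1]**b for a in range(t + 1)
            for b in range(t + 1 - a)]

def numeric_rank(M, tol=1e-8):
    # numerical rank via Gaussian elimination with partial pivoting
    M = [row[:] for row in M]
    rank, rows, cols = 0, len(M), len(M[0])
    for col in range(cols):
        piv = max(range(rank, rows), key=lambda r: abs(M[r][col]), default=None)
        if piv is None or abs(M[piv][col]) < tol:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        for r in range(rank + 1, rows):
            f = M[r][col] / M[rank][col]
            for cc in range(col, cols):
                M[r][cc] -= f * M[rank][cc]
        rank += 1
    return rank

u = (0.3, -0.7)
ranks = []
for t in (1, 2):
    v = monomials(u, t)
    Mt = [[vi * vj for vj in v] for vi in v]   # M_t(y) = v_t(u) v_t(u)^T
    ranks.append(numeric_rank(Mt))
```

In actual computation y* comes from an SDP solver rather than a single point, and the rank is evaluated numerically with a tolerance, as in the `numeric_rank` helper above.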
Theorem 34. Under Assumption 31, the relaxations (38)-(39) have the following properties:
i) If (38) is infeasible for some order k, then no critical points satisfy the constraints cin ≥ 0, ψ ≥ 0, i.e., (37) is infeasible.
ii) Suppose Assumption 32 holds. If (37) is infeasible, then the relaxation (38) must be infeasible when the order k is big enough.
In the following, assume (37) is feasible (i.e., fc < +∞). Then, for all k big enough, (38) has a minimizer y*. Moreover:
iii) If (317) is satisfied for some t ∈ [d, k], then fk = fc.
iv) If Assumption 32 holds and (37) has finitely many minimizers, then every minimizer y* of (38) must satisfy (317) for some t ∈ [d, k] when k is big enough.
Proof. By Assumption 31, u is a critical point if and only if ceq(u) = 0, φ(u) = 0.
i) For every feasible point u of (37), the tms [u]2k (see §2 for the notation) is feasible for (38) for all k. Therefore, if (38) is infeasible for some k, then (37) must be infeasible.
ii) By Assumption 32, when (37) is infeasible, the set

  {x ∈ Rn : ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 443], it holds that −1 ∈ Ideal(ceq, φ) + Qmod(ρ). By Assumption 32,

  Ideal(ceq, φ) + Qmod(ρ) ⊆ IQ(ceq, cin) + IQ(φ, ψ).

Thus, for all k big enough, (39) is unbounded from above. Hence (38) must be infeasible, by weak duality.
When (37) is feasible, f achieves finitely many values on Kc, so (37) must achieve its optimal value fc. By Theorem 33, we know that fk = f′k = fc for all k big enough. For each minimizer u* of (37), the tms [u*]2k is a minimizer of (38).
iii) If (317) holds, we can get r := rank Mt(y*) minimizers for (37) [5, 14], say u1, ..., ur, such that fk = f(ui) for each i. Clearly, fk = f(ui) ≥ fc. On the other hand, we always have fk ≤ fc. So fk = fc.
iv) By Assumption 32, (37) is equivalent to the problem

(318)  min f(x)   s.t.   ceq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (318) is also fc. Its kth order Lasserre relaxation is

(319)  γ′k = min ⟨f, y⟩   s.t.   ⟨1, y⟩ = 1, Mk(y) ⪰ 0, L^{(k)}_{ceq}(y) = 0, L^{(k)}_{φ}(y) = 0, L^{(k)}_{ρ}(y) ⪰ 0.

Its dual optimization problem is

(320)  γk = max γ   s.t.   f − γ ∈ Ideal(ceq, φ)2k + Qmod(ρ)2k.

By repeating the same proof as for Theorem 33(iii), we can show that

  γk = γ′k = fc

for all k big enough. Because ρ ∈ Qmod(cin, ψ), each y feasible for (38) is also feasible for (319). So, when k is big, each y* is also a minimizer of (319). The problem (318) also has finitely many minimizers. By Theorem 26 of [32], the condition (317) must be satisfied for some t ∈ [d, k] when k is big enough.
If (37) has infinitely many minimizers, then the condition (317) is typically not satisfied. We refer to [25, §66].
4 Polyhedral constraints
In this section, we assume the feasible set of (11) is the polyhedron

  P = {x ∈ Rn | Ax − b ≥ 0},
where A = [a1 · · · am]^T ∈ R^{m×n} and b = [b1 · · · bm]^T ∈ R^m. This corresponds to E = ∅, I = [m] and each ci(x) = ai^T x − bi. Denote

(41)  D(x) = diag(c1(x), ..., cm(x)),   C(x) = [ A^T ; D(x) ].

The Lagrange multiplier vector λ = [λ1 · · · λm]^T satisfies

(42)  [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

(43)  λ = (AA^T)^{-1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(44)  L(x)C(x) = Im,

then we can get

  λ = L(x) [ ∇f(x) ; 0 ] = L1(x)∇f(x),
where L1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists and give a degree bound for it.
The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ Cn (also see Definition 51). This is equivalent to: for every u, if J(u) = {i1, ..., ik} (see (12) for the notation), then ai1, ..., aik are linearly independent.
Proposition 41. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (44). Moreover, when Ax − b is nonsingular, we can choose L(x) in (44) with deg(L) ≤ m − rank A.
Proof. Clearly, if (44) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.
Next, assume that Ax − b is nonsingular. We show that (44) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.
For a subset I = {i1, ..., i_{m−n}} of [m], denote

  cI(x) = Π_{i∈I} ci(x),   EI(x) = cI(x) · diag( ci1(x)^{-1}, ..., c_{i_{m−n}}(x)^{-1} ),

  DI(x) = diag( ci1(x), ..., c_{i_{m−n}}(x) ),   AI = [ai1 · · · a_{i_{m−n}}]^T.

For the case I = ∅ (the empty set), we set c∅(x) = 1. Let

  V = { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial LI(x) such that
(45)  LI(x)C(x) = cI(x)Im.

The matrix LI = LI(x) satisfying (45) can be given by the following 2×3 block matrix (LI(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

(46)
  J \ K     [n]                          n + I                              n + [m]\I
  I         0                            EI(x)                              0
  [m]\I     cI(x)·(A_{[m]\I})^{-T}       −(A_{[m]\I})^{-T}(AI)^T EI(x)      0

Equivalently, the blocks of LI are

  LI(I, [n]) = 0,   LI(I, n + [m]\I) = 0,   LI([m]\I, n + [m]\I) = 0,

  LI(I, n + I) = EI(x),   LI([m]\I, [n]) = cI(x)·(A_{[m]\I})^{-T},

  LI([m]\I, n + I) = −(A_{[m]\I})^{-T}(AI)^T EI(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = LI(x)C(x); then one can verify that

  G(I, I) = EI(x)DI(x) = cI(x)I_{m−n},   G(I, [m]\I) = 0,

  G([m]\I, [m]\I) = [ cI(x)(A_{[m]\I})^{-T}   −(A_{[m]\I})^{-T}(AI)^T EI(x) ] [ (A_{[m]\I})^T ; 0 ] = cI(x)In,

  G([m]\I, I) = [ cI(x)(A_{[m]\I})^{-T}   −(A_{[m]\I})^{-T}(AI)^T EI(x) ] [ (AI)^T ; DI(x) ] = 0.

This shows that the above LI(x) satisfies (45).
Step II: We show that there exist real scalars νI satisfying

(47)  Σ_{I∈V} νI cI(x) = 1.

This can be shown by induction on m.
• When m = n, V = {∅} and c∅(x) = 1, so (47) is clearly true.
• When m > n, let

(48)  N = { i ∈ [m] | rank A_{[m]\{i}} = n }.

For each i ∈ N, let Vi be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(49)  Σ_{I′∈Vi} ν^{(i)}_{I′} cI′(x) = 1.

Since rank A = n, we can generally assume that {a1, ..., an} is linearly independent. So there exist scalars α1, ..., αn such that

  am = α1 a1 + · · · + αn an.

If all αi = 0, then am = 0, and hence A can be replaced by its first m − 1 rows; so (47) is true by the induction. In the following, suppose at least one αi ≠ 0 and write

  { i : αi ≠ 0 } = { i1, ..., ik }.

Then ai1, ..., aik, am are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

  ci1(x) = · · · = cik(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence there exist real scalars μ1, ..., μ_{k+1} such that

  μ1 ci1(x) + · · · + μk cik(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This can be seen from the echelon form of an inconsistent linear system.) Note that i1, ..., i_{k+1} ∈ N. For each j = 1, ..., k + 1, by (49),

  Σ_{I′∈V_{ij}} ν^{(ij)}_{I′} cI′(x) = 1.

Then we can get

  1 = Σ_{j=1}^{k+1} μj c_{ij}(x) = Σ_{j=1}^{k+1} μj Σ_{I′∈V_{ij}} ν^{(ij)}_{I′} c_{ij}(x) cI′(x) = Σ_{I = I′∪{ij} : I′∈V_{ij}, 1≤j≤k+1} ν^{(ij)}_{I′} μj cI(x).

Since each I′ ∪ {ij} ∈ V, (47) must be satisfied by some scalars νI.
Step III: For LI(x) as in (45), we construct L(x) as

(410)  L(x) = Σ_{I∈V} νI LI(x).

Clearly, L(x) satisfies (44), because

  L(x)C(x) = Σ_{I∈V} νI LI(x)C(x) = Σ_{I∈V} νI cI(x)Im = Im.

Each LI(x) has degree ≤ m − n, so L(x) has degree ≤ m − n.
Proposition 41 characterizes when there exists L(x) satisfying (44). When it does, a degree bound for L(x) is m − rank A. Sometimes its degree can be smaller than that, as shown in Example 43. For given A, b, a matrix polynomial L(x) satisfying (44) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x)C(x) = Im for polyhedral sets.
Example 42. Consider the simplicial set

  { x1 ≥ 0, ..., xn ≥ 0, 1 − e^T x ≥ 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

  L(x) =
  [ 1−x1   −x2   · · ·   −xn    1  · · ·  1 ]
  [ −x1   1−x2   · · ·   −xn    1  · · ·  1 ]
  [   ⋮                    ⋮                ]
  [ −x1    −x2   · · ·  1−xn    1  · · ·  1 ]
  [ −x1    −x2   · · ·   −xn    1  · · ·  1 ]
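The identity L(x)C(x) = I_{n+1} of Example 42 can be verified numerically at random points. The following Python sketch (an illustration, not part of the paper) builds C(x) = [A^T; D(x)] for the simplex constraints with n = 3 and the displayed L(x):

```python
import random

# Numerical check (sketch) of Example 42: for the simplex constraints
# x >= 0, 1 - e^T x >= 0, the displayed L(x) satisfies L(x) C(x) = I_{n+1},
# where C(x) = [A^T; D(x)] as in (41).
n = 3
m = n + 1
A = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
A.append([-1.0] * n)                        # rows a_i: e_1,...,e_n and -e
b = [0.0] * n + [-1.0]                      # c_i(x) = a_i^T x - b_i

def C_of(x):
    c = [sum(A[i][j]*x[j] for j in range(n)) - b[i] for i in range(m)]
    top = [[A[i][j] for i in range(m)] for j in range(n)]            # A^T
    diag = [[c[i] if i == k else 0.0 for i in range(m)] for k in range(m)]
    return top + diag                        # (n+m) x m

def L_of(x):
    # row i (i < n): e_i^T - x^T in the first n columns; last row: -x^T;
    # the remaining m columns are all ones
    rows = []
    for i in range(m):
        first = [(1.0 if j == i else 0.0) - x[j] for j in range(n)]
        rows.append(first + [1.0] * m)
    return rows                              # m x (n+m)

x = [random.uniform(-2, 2) for _ in range(n)]
L, C = L_of(x), C_of(x)
prod = [[sum(L[i][k]*C[k][j] for k in range(n + m)) for j in range(m)]
        for i in range(m)]
err = max(abs(prod[i][j] - (1.0 if i == j else 0.0))
          for i in range(m) for j in range(m))
```

Since the identity is polynomial, holding at generic random points (up to round-off) is strong evidence of the exact algebraic identity.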
Example 43. Consider the box constraint

  { x1 ≥ 0, ..., xn ≥ 0, 1 − x1 ≥ 0, ..., 1 − xn ≥ 0 }.

The equation L(x)C(x) = I_{2n} is satisfied by

  L(x) = [ In − diag(x)   In   In ; −diag(x)   In   In ].
Example 44. Consider the polyhedral set

  { 1 − x4 ≥ 0, x4 − x3 ≥ 0, x3 − x2 ≥ 0, x2 − x1 ≥ 0, x1 + 1 ≥ 0 }.

The equation L(x)C(x) = I5 is satisfied by

  L(x) = (1/2) ·
  [ −x1−1   −x2−1   −x3−1   −x4−1   1 1 1 1 1 ]
  [ −x1−1   −x2−1   −x3−1   1−x4    1 1 1 1 1 ]
  [ −x1−1   −x2−1   1−x3    1−x4    1 1 1 1 1 ]
  [ −x1−1   1−x2    1−x3    1−x4    1 1 1 1 1 ]
  [ 1−x1    1−x2    1−x3    1−x4    1 1 1 1 1 ]
Example 45. Consider the polyhedral set

  { 1 + x1 ≥ 0, 1 − x1 ≥ 0, 2 − x1 − x2 ≥ 0, 2 − x1 + x2 ≥ 0 }.

The matrix L(x) satisfying L(x)C(x) = I4 is

  L(x) = (1/6) ·
  [ x1^2−3x1+2     x1x2−x2        4−x1    2−x1    1−x1     1−x1   ]
  [ 3x1^2−3x1−6    3x2+3x1x2      6−3x1   −3x1    −3x1−3   −3x1−3 ]
  [ 1−x1^2         −2x2−x1x2−3    x1−1    x1+1    x1+2     x1+2   ]
  [ 1−x1^2         3−x1x2−2x2     x1−1    x1+1    x1+2     x1+2   ]
5 General constraints
We consider general nonlinear constraints as in (11). The critical point conditions are in (18). We discuss how to express the Lagrange multipliers λi as polynomial functions in x on the set of critical points.
Suppose there are in total m equality and inequality constraints, i.e.,

  E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λi ci(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ1 · · · λm]^T satisfies the equation

(51)  C(x) λ = [ ∇f(x) ; 0 ; ... ; 0 ],   where   C(x) =
  [ ∇c1(x)   ∇c2(x)   · · ·   ∇cm(x) ]
  [ c1(x)    0        · · ·   0      ]
  [ 0        c2(x)    · · ·   0      ]
  [   ⋮                 ⋱            ]
  [ 0        0        · · ·   cm(x)  ]
Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(52)  L(x)C(x) = Im,

then we can get

(53)  λ = L(x) [ ∇f(x) ; 0 ] = L1(x)∇f(x),

where L1(x) consists of the first n columns of L(x). Clearly, (52) implies that Assumption 31 holds. This section characterizes when such L(x) exists.
Definition 51. The tuple c = (c1, ..., cm) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ Cn.

Clearly, c being nonsingular is equivalent to: for each u ∈ Cn, if J(u) = {i1, ..., ik} (see (12) for the notation), then the gradients ∇ci1(u), ..., ∇cik(u) are linearly independent. Our main conclusion is that (52) holds if and only if the tuple c is nonsingular.
Proposition 52. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ Cn if and only if there exists P(x) ∈ C[x]^{t×s} such that

  P(x)W(x) = It.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (52).
Proof. (i) "⇐": If P(x)W(x) = It, then for all u ∈ Cn,

  t = rank It ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.
"⇒": Suppose rank W(u) = t for all u ∈ Cn. Write W(x) in columns:

  W(x) = [ w1(x)   w2(x)   · · ·   wt(x) ].
Then the equation w1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ1(x) ∈ C[x]^s such that ξ1(x)^T w1(x) = 1. For each i = 2, ..., t, denote

  r1i(x) = ξ1(x)^T wi(x);

then (we use ∼ to denote row equivalence between matrices)

  W(x) ∼ [ 1   r12(x)   · · ·   r1t(x) ; w1(x)   w2(x)   · · ·   wt(x) ] ∼ W1(x) = [ 1   r12(x)   · · ·   r1t(x) ; 0   w2^(1)(x)   · · ·   wt^(1)(x) ],

where each (i = 2, ..., t)

  wi^(1)(x) = wi(x) − r1i(x)w1(x).

So there exists P1(x) ∈ C[x]^{(s+1)×s} such that

  P1(x)W(x) = W1(x).
Since W(x) and W1(x) are row equivalent, W1(x) must also have full column rank everywhere. Similarly, the polynomial equation

  w2^(1)(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ2(x) ∈ C[x]^s such that

  ξ2(x)^T w2^(1)(x) = 1.

For each i = 3, ..., t, let r2i(x) = ξ2(x)^T wi^(1)(x); then
  W1(x) ∼
  [ 1   r12(x)      r13(x)      · · ·   r1t(x)    ]
  [ 0   1           r23(x)      · · ·   r2t(x)    ]
  [ 0   w2^(1)(x)   w3^(1)(x)   · · ·   wt^(1)(x) ]
  ∼ W2(x) =
  [ 1   r12(x)   r13(x)      · · ·   r1t(x)    ]
  [ 0   1        r23(x)      · · ·   r2t(x)    ]
  [ 0   0        w3^(2)(x)   · · ·   wt^(2)(x) ]

where each (i = 3, ..., t)

  wi^(2)(x) = wi^(1)(x) − r2i(x)w2^(1)(x).

Similarly, W1(x) and W2(x) are row equivalent, so W2(x) has full column rank everywhere. There exists P2(x) ∈ C[x]^{(s+2)×(s+1)} such that

  P2(x)W1(x) = W2(x).

Continuing this process, we can finally get

  W2(x) ∼ · · · ∼ Wt(x) =
  [ 1   r12(x)   r13(x)   · · ·   r1t(x) ]
  [ 0   1        r23(x)   · · ·   r2t(x) ]
  [ 0   0        1        · · ·   r3t(x) ]
  [   ⋮                     ⋱            ]
  [ 0   0        0        · · ·   1      ]
  [ 0   0        0        · · ·   0      ]
Consequently, there exist Pi(x) ∈ C[x]^{(s+i)×(s+i−1)}, i = 1, 2, ..., t, such that

  Pt(x)P_{t−1}(x) · · · P1(x)W(x) = Wt(x).

Since Wt(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x)Wt(x) = It. Let

  P(x) = P_{t+1}(x)Pt(x)P_{t−1}(x) · · · P1(x);

then P(x)W(x) = It. Note that P(x) ∈ C[x]^{t×s}.
For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (where P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.
(ii) The conclusion is implied directly by the item (i)
In Proposition 52, there is no explicit degree bound for L(x) satisfying (52). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing coefficients of both sides of (52). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (52).
Example 53. Consider the hypercube with quadratic constraints

  { 1 − x1^2 ≥ 0, 1 − x2^2 ≥ 0, ..., 1 − xn^2 ≥ 0 }.

The equation L(x)C(x) = In is satisfied by

  L(x) = [ −(1/2)diag(x)   In ].
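This identity is easy to confirm numerically: with ∇ci = −2xi ei, row i of L(x)C(x) is xi^2 + (1 − xi^2) = 1 on the diagonal and 0 elsewhere. A short Python sketch (an illustration, not part of the paper):

```python
import random

# Numerical check (sketch) of Example 53: for constraints 1 - x_i^2 >= 0,
# L(x) = [-(1/2) diag(x), I_n] satisfies L(x) C(x) = I_n, where C(x)
# stacks the gradients grad c_i = -2 x_i e_i over diag(c_1,...,c_n).
n = 4
x = [random.uniform(-2, 2) for _ in range(n)]
c = [1.0 - xi**2 for xi in x]

# C(x): first n rows are the gradient block, last n rows are diag(c)
C = [[-2.0*x[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
C += [[c[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# L(x) = [-(1/2) diag(x), I_n]
L = [[-0.5*x[i] if i == j else 0.0 for j in range(n)] +
     [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

prod = [[sum(L[i][k]*C[k][j] for k in range(2*n)) for j in range(n)]
        for i in range(n)]
err = max(abs(prod[i][j] - (1.0 if i == j else 0.0))
          for i in range(n) for j in range(n))
```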
Example 54. Consider the nonnegative portion of the unit sphere

  { x1 ≥ 0, x2 ≥ 0, ..., xn ≥ 0, x1^2 + · · · + xn^2 − 1 = 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

  L(x) = [ In − xx^T   x·1n^T   2x ; (1/2)x^T   −(1/2)1n^T   −1 ].
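The identity of Example 54 can likewise be checked at random points. The following Python sketch (an illustration, not part of the paper) builds C(x) for the n nonnegativity constraints and the one sphere equality, and multiplies it by the displayed L(x):

```python
import random

# Numerical check (sketch) of Example 54: constraints x_i >= 0 (i = 1..n)
# and x^T x - 1 = 0. With c_{n+1} = x^T x - 1,
# L(x) = [ I_n - x x^T, x 1_n^T, 2x ; (1/2) x^T, -(1/2) 1_n^T, -1 ]
# satisfies L(x) C(x) = I_{n+1}.
n = 3
x = [random.uniform(-2, 2) for _ in range(n)]
s = sum(xi*xi for xi in x)

# C(x): gradient block [I_n, 2x] (n rows), then diag(x_1,...,x_n, s - 1)
C = [[1.0 if i == j else 0.0 for j in range(n)] + [2.0*x[i]]
     for i in range(n)]
C += [[x[i] if i == j else 0.0 for j in range(n)] + [0.0]
      for i in range(n)]
C.append([0.0]*n + [s - 1.0])

L = [[(1.0 if i == j else 0.0) - x[i]*x[j] for j in range(n)]  # I - x x^T
     + [x[i]]*n + [2.0*x[i]] for i in range(n)]                # x 1^T, 2x
L.append([0.5*x[j] for j in range(n)] + [-0.5]*n + [-1.0])

prod = [[sum(L[i][k]*C[k][j] for k in range(2*n + 1)) for j in range(n + 1)]
        for i in range(n + 1)]
err = max(abs(prod[i][j] - (1.0 if i == j else 0.0))
          for i in range(n + 1) for j in range(n + 1))
```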
Example 55. Consider the set

  { 1 − x1^3 − x2^4 ≥ 0, 1 − x3^4 − x4^3 ≥ 0 }.

The equation L(x)C(x) = I2 is satisfied by

  L(x) = [ −x1/3   −x2/4   0   0   1   0 ; 0   0   −x3/4   −x4/3   0   1 ].
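A quick numerical check of Example 55 (a Python sketch, not part of the paper): row 1 of L(x)C(x) gives (−x1/3)(−3x1^2) + (−x2/4)(−4x2^3) + c1 = x1^3 + x2^4 + c1 = 1, and similarly for row 2.

```python
import random

# Numerical check (sketch) of Example 55: for c1 = 1 - x1^3 - x2^4 and
# c2 = 1 - x3^4 - x4^3, the displayed L(x) satisfies L(x) C(x) = I_2.
x1, x2, x3, x4 = (random.uniform(-1.5, 1.5) for _ in range(4))
c1 = 1 - x1**3 - x2**4
c2 = 1 - x3**4 - x4**3
# Columns of C(x): [grad c_j ; c_j in diagonal position j]
C = [[-3*x1**2, 0.0],
     [-4*x2**3, 0.0],
     [0.0, -4*x3**3],
     [0.0, -3*x4**2],
     [c1, 0.0],
     [0.0, c2]]
L = [[-x1/3, -x2/4, 0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, -x3/4, -x4/3, 0.0, 1.0]]
prod = [[sum(L[i][k]*C[k][j] for k in range(6)) for j in range(2)]
        for i in range(2)]
err = max(abs(prod[i][j] - (1.0 if i == j else 0.0))
          for i in range(2) for j in range(2))
```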
Example 56. Consider the quadratic set

  { 1 − x1x2 − x2x3 − x1x3 ≥ 0, 1 − x1^2 − x2^2 − x3^2 ≥ 0 }.

The matrix L(x)^T satisfying L(x)C(x) = I2 is

  [ 25x1^3 + 10x1^2x2 + 40x1x2^2 − 25x1 − 2x3     −25x1^3 − 10x1^2x2 − 40x1x2^2 + (49/2)x1 + 2x3 ]
  [ −15x1^2x2 + 10x1x2^2 + 20x1x2x3 − 10x1        15x1^2x2 − 10x1x2^2 − 20x1x2x3 + 10x1 − x2/2  ]
  [ 25x1^2x3 − 20x1x2^2 + 10x1x2x3 + 2x1          −25x1^2x3 + 20x1x2^2 − 10x1x2x3 − 2x1 − x3/2  ]
  [ 1 − 20x1x3 − 10x1^2 − 20x1x2                  20x1x2 + 20x1x3 + 10x1^2                      ]
  [ −50x1^2 − 20x2x1                              50x1^2 + 20x2x1 + 1                           ]
We would like to remark that a polynomial tuple c = (c1, ..., cm) is generically nonsingular, and that Assumption 31 holds generically.

Proposition 57. For all positive degrees d1, ..., dm, there exists an open dense subset U of D = R[x]d1 × · · · × R[x]dm such that every tuple c = (c1, ..., cm) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 31 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof needs to use resultants and discriminants, for which we refer to [29].
First, let J1 be the set of all (i1, ..., i_{n+1}) with 1 ≤ i1 < · · · < i_{n+1} ≤ m. The resultant Res(ci1, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of ci1, ..., c_{i_{n+1}} such that if Res(ci1, ..., c_{i_{n+1}}) ≠ 0 then the equations

  ci1(x) = · · · = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

  F1(c) = Π_{(i1,...,i_{n+1})∈J1} Res(ci1, ..., c_{i_{n+1}}).

For the case m ≤ n, J1 = ∅ and we simply let F1(c) = 1. Clearly, if F1(c) ≠ 0, then no more than n of the polynomials c1, ..., cm have a common complex zero.
Second, let J2 be the set of all (j1, ..., jk) with k ≤ n and 1 ≤ j1 < · · · < jk ≤ m. When one of cj1, ..., cjk has degree bigger than one, the discriminant ∆(cj1, ..., cjk) is a polynomial in the coefficients of cj1, ..., cjk such that if ∆(cj1, ..., cjk) ≠ 0 then the equations

  cj1(x) = · · · = cjk(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of cj1, ..., cjk at u are linearly independent. When all of cj1, ..., cjk have degree one, the discriminant of the tuple (cj1, ..., cjk) is not a single polynomial, but we can define ∆(cj1, ..., cjk) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

  F2(c) = Π_{(j1,...,jk)∈J2} ∆(cj1, ..., cjk).

Clearly, if F2(c) ≠ 0, then no n or fewer of the polynomials c1, ..., cm have a singular complex common zero.
Last, let F(c) = F1(c)F2(c) and

  U = { c = (c1, ..., cm) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c1, ..., cm can have a common complex zero, and for any k polynomials (k ≤ n) of c1, ..., cm, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.
Since every c ∈ U is nonsingular, Proposition 52 implies (52), whence Assumption 31 is satisfied. Therefore, Assumption 31 holds for all c ∈ U, so it holds generically.
6 Numerical examples
This section gives examples of using the new relaxations (38)-(39) for solving the optimization problem (11), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with 2.90GHz CPU and 16.0G RAM. The relaxations (38)-(39) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.
The polynomials pi in Assumption 31 are constructed as follows. Order the constraining polynomials as c1, ..., cm. First, find a matrix polynomial L(x) satisfying (44) or (52). Let L1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p1, ..., pm) to be the product L1(x)∇f(x), i.e.,

  pi = ( L1(x)∇f(x) )_i.

In all our examples, the global minimum value fmin of (11) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
By Theorem 33, we have fk = fmin for all k big enough if fc = fmin and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (317) is more convenient for usage. When there are finitely many global minimizers, Theorem 34 proved that (317) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 61, 67 and 69 (they have infinitely many minimizers).
We compare the new relaxations (38)-(39) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (38)-(39) (using Lagrange multiplier expressions) are shown in the tables, and the computational time (in seconds) is also compared. The results for standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (38)-(39) are titled "with LME".
Example 61. Consider the optimization problem

  min x1x2(10 − x3)
  s.t. x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, 1 − x1 − x2 − x3 ≥ 0.

The matrix polynomial L(x) is given in Example 42. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 33 is satisfied.² Each feasible point (x1, x2, x3) with x1x2 = 0 is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (38)-(39) are in Table 1. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 61

  order k    w.o. LME (lower bound, time)    with LME (lower bound, time)
  2          −0.0521, 0.6841                 −0.0521, 0.1922
  3          −0.0026, 0.2657                 −3·10^{−8}, 0.2285
  4          −0.0007, 0.6785                 −6·10^{−9}, 0.4431
  5          −0.0004, 1.6105                 −2·10^{−9}, 0.9567
² Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^n xi(1 − xi)^2 + Σ_{i≠j} xi^2 xj ∈ IQ(ceq, cin).
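The polynomial identity in the footnote can be confirmed numerically at random points. A short Python sketch (an illustration, not part of the paper):

```python
import random

# Numerical check (sketch) of the footnote identity:
# 1 - x^T x = (1 - e^T x)(1 + x^T x) + sum_i x_i (1 - x_i)^2
#             + sum_{i != j} x_i^2 x_j,
# which exhibits 1 - x^T x as a member of IQ(ceq, cin) for the
# simplex constraints of Example 61.
n = 5
x = [random.uniform(-2, 2) for _ in range(n)]
s1 = sum(x)                        # e^T x
s2 = sum(xi*xi for xi in x)        # x^T x
rhs = (1 - s1) * (1 + s2) \
    + sum(xi * (1 - xi)**2 for xi in x) \
    + sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j)
err = abs((1 - s2) - rhs)
```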
Example 62. Consider the optimization problem

  min x1^4x2^2 + x1^2x2^4 + x3^6 − 3x1^2x2^2x3^2 + (x1^4 + x2^4 + x3^4)
  s.t. x1^2 + x2^2 + x3^2 ≥ 1.

The matrix polynomial L(x) = [ (1/2)x1   (1/2)x2   (1/2)x3   −1 ]. The objective f is the sum of the positive definite form x1^4 + x2^4 + x3^4 and the Motzkin polynomial

  M(x) = x1^4x2^2 + x1^2x2^4 + x3^6 − 3x1^2x2^2x3^2.

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

  c1(x)p1(x) = (x1^2 + x2^2 + x3^2 − 1)( 3M(x) + 2(x1^4 + x2^4 + x3^4) ) = 0

defines a compact set. So the condition i) of Theorem 33 is satisfied.³ The minimum value fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for standard Lasserre's relaxations and the new ones (38)-(39) are in Table 2. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 2. Computational results for Example 62

  order k    w.o. LME (lower bound, time)    with LME (lower bound, time)
  3          −∞, 0.4466                      0.1111, 0.1169
  4          −∞, 0.4948                      0.3333, 0.3499
  5          −2.1821·10^5, 1.1836            0.3333, 0.6530
Example 6.3. Consider the optimization problem

min  x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1^3 + ··· + x_4^3)
s.t. x_1, x_2, x_3, x_4 ≥ 0, 1 − x_1 − x_2 ≥ 0, 1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

[ 1−x_1   −x_2    0       0      1  1  0  0  1  0
  −x_1    1−x_2   0       0      1  1  0  0  1  0
  0       0       1−x_3   −x_4   0  0  1  1  0  1
  0       0       −x_3    1−x_4  0  0  1  1  0  1
  −x_1    −x_2    0       0      1  1  0  0  1  0
  0       0       −x_3    −x_4   0  0  1  1  0  1 ].

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 3.
³ This is because −c_1^2 p_1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ), and the set {−c_1(x)^2 p_1(x)^2 ≥ 0} is compact.
⁴ This is because 1 − x_1^2 − x_2^2 belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3^2 − x_4^2 belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3.

  order k |  w/o LME: lower bound, time   |  with LME: lower bound, time
     3    |  −2.9·10^{-5},   0.7335       |  −6·10^{-7},   0.6091
     4    |  −1.4·10^{-5},   2.5055       |  −8·10^{-8},   2.7423
     5    |  −1.4·10^{-5},  12.7092       |  −5·10^{-8},  13.7449
Example 6.4. Consider the polynomial optimization problem

min_{x∈R^2}  x_1^2 + 50x_2^2
s.t. x_1^2 − 1/2 ≥ 0, x_2^2 − 2x_1x_2 − 1/8 ≥ 0, x_2^2 + 2x_1x_2 − 1/8 ≥ 0.
It is motivated from an example in [12, §3]. The first column of L(x) is

[ (8x_1^3)/5 + x_1/5
  (288x_2x_1^4)/5 − (16x_1^3)/5 − (124x_2x_1^2)/5 + (8x_1)/5 − 2x_2
  −(288x_2x_1^4)/5 − (16x_1^3)/5 + (124x_2x_1^2)/5 + (8x_1)/5 + 2x_2 ],

and the second column of L(x) is

[ −(8x_1^2x_2)/5 + (4x_2^3)/5 − x_2/10
  (288x_1^3x_2^2)/5 + (16x_1^2x_2)/5 − (142x_1x_2^2)/5 − (9x_1)/20 − (8x_2^3)/5 + (11x_2)/5
  −(288x_1^3x_2^2)/5 + (16x_1^2x_2)/5 + (142x_1x_2^2)/5 + (9x_1)/20 − (8x_2^3)/5 + (11x_2)/5 ].
The objective is coercive, so f_min is achieved at a critical point. The minimum value f_min = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4.

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |   6.7535,  0.4611            |   56.7500,  0.1309
     4    |   6.9294,  0.2428            |  112.6517,  0.2405
     5    |   8.8519,  0.3376            |  112.6517,  0.2167
     6    |  16.5971,  0.4703            |  112.6517,  0.3788
     7    |  35.4756,  0.6536            |  112.6517,  0.4537
Example 6.5. Consider the optimization problem

min_{x∈R^3}  x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 − ( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) )
s.t. x_1 ≥ 0, x_1x_2 − 1 ≥ 0, x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

[ 1−x_1x_2   0     0   x_2   x_2   0
  x_1        0     0   −1    −1    0
  −x_1       x_2   0   1     0     −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5.

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     2    |  −∞,            0.4129       |  −∞,      0.1900
     3    |  −7.8184·10^6,  0.4641       |  0.9492,  0.3139
     4    |  −2.0575·10^4,  0.6499       |  0.9492,  0.5057
Example 6.6. Consider the optimization problem (x_0 := 1)

min_{x∈R^4}  x^T x + ∑_{i=0}^4 ∏_{j≠i} (x_i − x_j)
s.t. x_1^2 − 1 ≥ 0, x_2^2 − 1 ≥ 0, x_3^2 − 1 ≥ 0, x_4^2 − 1 ≥ 0.

The matrix polynomial L(x) = [ ½diag(x)  −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers
(1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
(1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
(−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6.

  order k |  w/o LME: lower bound, time   |  with LME: lower bound, time
     3    |  −∞,             1.1377       |  3.5480,   1.1765
     4    |  −6.6913·10^4,   4.7677       |  4.0000,   3.0761
     5    |  −21.3778,      22.9970       |  4.0000,  10.3354
Example 6.7. Consider the optimization problem

min_{x∈R^3}  x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 − 3x_1^2x_2^2x_3^2 + x_2^2
s.t. x_1 − x_2x_3 ≥ 0, −x_2 + x_3^2 ≥ 0.

The matrix polynomial L(x) =

[ 1     0   0  0  0
  −x_3  −1  0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for standard Lasserre's
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7.

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |  −∞,            0.6144       |  −∞,            0.3418
     4    |  −1.0909·10^7,  1.0542       |  −3.9476,       0.7180
     5    |  −942.6772,     1.6771       |  −3·10^{-9},    1.4607
     6    |  −0.0110,       3.3532       |  −8·10^{-10},   3.1618
Example 6.8. Consider the optimization problem

min_{x∈R^4}  x_1^2(x_1 − x_4)^2 + x_2^2(x_2 − x_4)^2 + x_3^2(x_3 − x_4)^2
             + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)^2 + (x_2 − 1)^2 + (x_3 − 1)^2
s.t. x_1 − x_2 ≥ 0, x_2 − x_3 ≥ 0.

The matrix polynomial L(x) =

[ 1  0  0  0  0  0
  1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).
The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8.

  order k |  w/o LME: lower bound, time   |  with LME: lower bound, time
     2    |  −∞,             0.3984       |  −0.3360,  0.9321
     3    |  −∞,             0.7634       |   0.9413,  0.5240
     4    |  −6.4896·10^5,   4.5496       |   0.9413,  1.7192
     5    |  −3.1645·10^3,  24.3665       |   0.9413,  8.1228
Example 6.9. Consider the optimization problem

min_{x∈R^4}  (x_1 + x_2 + x_3 + x_4 + 1)^2 − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
s.t. 0 ≤ x_1, ..., x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.
⁵ Note that 4 − ∑_{i=1}^4 x_i^2 = ∑_{i=1}^4 ( x_i(1 − x_i)^2 + (1 − x_i)(1 + x_i^2) ) ∈ IQ(c_eq, c_in).
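The footnote identity can be spot-checked numerically; each summand simplifies to (1 − x_i)(1 + x_i) = 1 − x_i^2. A sketch assuming Python with NumPy (an illustration, not part of the paper):

```python
import numpy as np

# Check: 4 - sum_i x_i^2 = sum_i ( x_i(1-x_i)^2 + (1-x_i)(1+x_i^2) )
rng = np.random.default_rng(2)
max_err = 0.0
for _ in range(200):
    x = rng.uniform(-3, 3, size=4)
    lhs = 4.0 - np.sum(x**2)
    rhs = np.sum(x * (1 - x) ** 2 + (1 - x) * (1 + x**2))
    max_err = max(max_err, abs(lhs - rhs))
print(max_err)
```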
Table 9. Computational results for Example 6.9.

  order k |  w/o LME: lower bound, time   |  with LME: lower bound, time
     2    |  −0.0279,      0.2262         |  −5·10^{-6},   1.1835
     3    |  −0.0005,      0.4691         |  −6·10^{-7},   1.6566
     4    |  −0.0001,      3.1098         |  −2·10^{-7},   5.5234
     5    |  −4·10^{-5},  16.5092         |  −6·10^{-7},  19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but they might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x_0 := 1)

min_{x∈R^n}  ∑_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
s.t. 0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10.

      n     |    9    |   10    |   11    |    12    |    13    |    14
  w/o LME   | 1.2569  | 2.5619  | 6.3085  | 15.8722  | 35.1675  | 78.4111
  with LME  | 1.9714  | 3.8288  | 8.2519  | 20.0310  | 37.6373  | 82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

Preord(c_in, ψ) := ∑_{r_1, ..., r_ℓ ∈ {0,1}} g_1^{r_1} ··· g_ℓ^{r_ℓ} · Σ[x].
The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)    f'_{pre,k} := min ⟨f, y⟩
         s.t. ⟨1, y⟩ = 1, L^{(k)}_{c_eq}(y) = 0, L^{(k)}_φ(y) = 0,
              L^{(k)}_{g_1^{r_1}···g_ℓ^{r_ℓ}}(y) ⪰ 0 for all r_1, ..., r_ℓ ∈ {0,1},
              y ∈ R^{N^n_{2k}}.
Similar to (3.9), the dual optimization problem of the above is

(7.2)    f_{pre,k} := max γ   s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

f_{pre,k} = f'_{pre,k} = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = f_min for all k big enough.
Proof. The proof is very similar to that of Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

K_3 := { x ∈ R^n | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is singular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x, for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each of those applications, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
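Both representations can be illustrated on a small instance. The sketch below (assuming Python with NumPy; it uses the box-constraint L(x) of Example 4.3 with n = 2, and is not part of the paper) builds a consistent right-hand side C(x)λ, mimicking a critical pair, and recovers λ by the rational formula and by the polynomial expression:

```python
import numpy as np

def C_box(x):
    # constraints c1 = x1, c2 = x2, c3 = 1 - x1, c4 = 1 - x2
    x1, x2 = x
    return np.array([
        [1, 0, -1,  0],
        [0, 1,  0, -1],
        [x1, 0, 0,  0],
        [0, x2, 0,  0],
        [0, 0, 1 - x1, 0],
        [0, 0, 0, 1 - x2],
    ], dtype=float)

def L_box(x):
    # L(x) = [I - diag(x), I, I; -diag(x), I, I] from Example 4.3
    x1, x2 = x
    return np.array([
        [1 - x1, 0, 1, 0, 1, 0],
        [0, 1 - x2, 0, 1, 0, 1],
        [-x1, 0, 1, 0, 1, 0],
        [0, -x2, 0, 1, 0, 1],
    ], dtype=float)

rng = np.random.default_rng(3)
x = rng.uniform(0.1, 0.9, 2)
lam_true = rng.uniform(-1, 1, 4)
g = C_box(x) @ lam_true               # consistent right-hand side, as at a critical pair

C = C_box(x)
lam_rational = np.linalg.solve(C.T @ C, C.T @ g)   # (C^T C)^{-1} C^T g
lam_poly = L_box(x) @ g                            # polynomial expression, L(x)C(x) = I
err = max(np.max(np.abs(lam_rational - lam_true)),
          np.max(np.abs(lam_poly - lam_true)))
print(err)
```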
Acknowledgements. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pages 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. González-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pages 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
E-mail address: njw@math.ucsd.edu
iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say u_1, ..., u_r, such that f_k = f(u_i) for each i. Clearly, f_k = f(u_i) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.
iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)    min f(x)   s.t.  c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its kth order Lasserre relaxation is

(3.19)    γ'_k := min ⟨f, y⟩
          s.t. ⟨1, y⟩ = 1, M_k(y) ⪰ 0,
               L^{(k)}_{c_eq}(y) = 0, L^{(k)}_φ(y) = 0, L^{(k)}_ρ(y) ⪰ 0.

Its dual optimization problem is

(3.20)    γ_k := max γ   s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

γ_k = γ'_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. □
If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].
4. Polyhedral constraints
In this section, we assume the feasible set of (1.1) is the polyhedron

P := { x ∈ R^n | Ax − b ≥ 0 },

where A = [a_1 ··· a_m]^T ∈ R^{m×n} and b = [b_1 ··· b_m]^T ∈ R^m. This corresponds to E = ∅, I = [m], and each c_i(x) = a_i^T x − b_i. Denote

(4.1)    D(x) := diag( c_1(x), ..., c_m(x) ),    C(x) := [ A^T ; D(x) ].
The Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies

(4.2)    [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

(4.3)    λ = (A A^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(4.4)    L(x) C(x) = I_m,

then we can get

λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),
where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such an L(x) exists, and we give a degree bound for it.
The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (also see Definition 5.1). This is equivalent to the condition that, for every u, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then a_{i_1}, ..., a_{i_k} are linearly independent.
Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.
Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.

For a subset I = {i_1, ..., i_{m−n}} of [m], denote

c_I(x) := ∏_{i∈I} c_i(x),    E_I(x) := c_I(x) · diag( c_{i_1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1} ),
D_I(x) := diag( c_{i_1}(x), ..., c_{i_{m−n}}(x) ),    A_I := [ a_{i_1} ··· a_{i_{m−n}} ]^T.

For the case I = ∅ (the empty set), we set c_∅(x) = 1. Let
V := { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5)    L_I(x) C(x) = c_I(x) · I_m.

The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

(4.6)
    J \ K  |  [n]                        |  n + I                             |  n + ([m]\I)
    I      |  0                          |  E_I(x)                            |  0
    [m]\I  |  c_I(x) · (A_{[m]\I})^{−T}  |  −(A_{[m]\I})^{−T} (A_I)^T E_I(x)  |  0

Equivalently, the blocks of L_I are

L_I( I, [n] ) = 0,   L_I( I, n + ([m]\I) ) = 0,   L_I( [m]\I, n + ([m]\I) ) = 0,
L_I( I, n + I ) = E_I(x),   L_I( [m]\I, [n] ) = c_I(x) (A_{[m]\I})^{−T},
L_I( [m]\I, n + I ) = −(A_{[m]\I})^{−T} (A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = L_I(x)C(x); then one can verify that

G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},    G(I, [m]\I) = 0,

G([m]\I, [m]\I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] [ (A_{[m]\I})^T ; 0 ] = c_I(x) I_n,

G([m]\I, I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] [ (A_I)^T ; D_I(x) ] = 0.

This shows that the above L_I(x) satisfies (4.5).
Step II: We show that there exist real scalars ν_I satisfying

(4.7)    ∑_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.
• When m > n, let

(4.8)    N := { i ∈ [m] | rank A_{[m]\{i}} = n }.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(4.9)    ∑_{I′∈V_i} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, ..., a_n} is linearly independent. So there exist scalars α_1, ..., α_n such that

a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows. So (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

{ i : α_i ≠ 0 } = { i_1, ..., i_k }.

Then a_{i_1}, ..., a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, ..., μ_{k+1} such that

μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

This can be implied by the echelon form for inconsistent linear systems. Note that i_1, ..., i_{k+1} ∈ N. For each j = 1, ..., k+1, by (4.9),

∑_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = 1.

Then we can get

1 = ∑_{j=1}^{k+1} μ_j c_{i_j}(x) = ∑_{j=1}^{k+1} μ_j ∑_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{i_j}(x) c_{I′}(x) = ∑_{I = I′∪{i_j} : I′∈V_{i_j}, 1≤j≤k+1} ν^{(i_j)}_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.
Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)    L(x) := ∑_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

L(x)C(x) = ∑_{I∈V} ν_I L_I(x)C(x) = ∑_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x)C(x) = I_m for polyhedral sets.
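The coefficient-matching computation can be sketched concretely. The code below assumes Python with NumPy; the degree-1 ansatz and the point-sampling way of imposing the coefficient equations are illustrative choices (not the paper's procedure), applied to the interval constraints x ≥ 0, 1 − x ≥ 0:

```python
import numpy as np

# constraints: c1 = x >= 0, c2 = 1 - x >= 0  (n = 1, m = 2)
def C_mat(x):
    return np.array([
        [1.0, -1.0],       # gradients of c1, c2
        [x,    0.0],       # diag(c1, c2)
        [0.0,  1.0 - x],
    ])

# ansatz: L(x) = L0 + x*L1 with L0, L1 in R^{2x3} (degree <= m - rank A = 1)
# Each entry of L(x)C(x) - I is a polynomial of degree <= 2, so forcing it to
# vanish at 5 sample points forces it to vanish identically.
pts = np.linspace(-1.0, 2.0, 5)
rows, rhs = [], []
for x in pts:
    Cx = C_mat(x)
    for i in range(2):
        for j in range(2):
            row = np.zeros(12)
            for k in range(3):
                row[3 * i + k] = Cx[k, j]          # coefficient of L0[i, k]
                row[6 + 3 * i + k] = x * Cx[k, j]  # coefficient of L1[i, k]
            rows.append(row)
            rhs.append(1.0 if i == j else 0.0)
theta, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
L0, L1 = theta[:6].reshape(2, 3), theta[6:].reshape(2, 3)

# verify the identity L(x)C(x) = I_2 at a fresh point
xt = 0.37
resid = np.max(np.abs((L0 + xt * L1) @ C_mat(xt) - np.eye(2)))
print(resid)
```

A valid L(x) exists here (Example 4.3 with n = 1 gives one explicitly), so the least-squares residual is zero and the recovered L(x) satisfies the identity everywhere.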
Example 4.2. Consider the simplicial set

{ x_1 ≥ 0, ..., x_n ≥ 0, 1 − e^T x ≥ 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ 1−x_1   −x_2    ···  −x_n    1  ···  1
         −x_1    1−x_2   ···  −x_n    1  ···  1
          ⋮       ⋮       ⋱    ⋮      ⋮       ⋮
         −x_1    −x_2    ···  1−x_n   1  ···  1
         −x_1    −x_2    ···  −x_n    1  ···  1 ].
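The identity can be verified numerically at random points. The sketch below, assuming Python with NumPy, builds C(x) and the displayed L(x) for n = 4 (an illustration, not part of the paper):

```python
import numpy as np

def LC_identity(x):
    n = len(x)
    # C(x): columns for c_i = x_i (i <= n) and c_{n+1} = 1 - e^T x
    C = np.zeros((2 * n + 1, n + 1))
    C[:n, :n] = np.eye(n)
    C[:n, n] = -1.0
    C[n:2 * n, :n] = np.diag(x)
    C[2 * n, n] = 1.0 - x.sum()
    # L(x) from Example 4.2: (e_i - x)^T rows, then -x^T, with trailing ones
    L = np.zeros((n + 1, 2 * n + 1))
    L[:, :n] = -np.tile(x, (n + 1, 1))
    L[:n, :n] += np.eye(n)
    L[:, n:] = 1.0
    return L @ C

rng = np.random.default_rng(4)
err = max(np.max(np.abs(LC_identity(rng.standard_normal(4)) - np.eye(5)))
          for _ in range(50))
print(err)
```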
Example 4.3. Consider the box constraint

{ x_1 ≥ 0, ..., x_n ≥ 0, 1 − x_1 ≥ 0, ..., 1 − x_n ≥ 0 }.

The equation L(x)C(x) = I_{2n} is satisfied by

L(x) = [ I_n − diag(x)   I_n   I_n
         −diag(x)        I_n   I_n ].
Example 4.4. Consider the polyhedral set

{ 1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0 }.

The equation L(x)C(x) = I_5 is satisfied by

L(x) = (1/2) ·
[ −x_1−1   −x_2−1   −x_3−1   −x_4−1   1  1  1  1  1
  −x_1−1   −x_2−1   −x_3−1   1−x_4    1  1  1  1  1
  −x_1−1   −x_2−1   1−x_3    1−x_4    1  1  1  1  1
  −x_1−1   1−x_2    1−x_3    1−x_4    1  1  1  1  1
  1−x_1    1−x_2    1−x_3    1−x_4    1  1  1  1  1 ].
Example 4.5. Consider the polyhedral set

{ 1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0 }.

The matrix L(x) satisfying L(x)C(x) = I_4 is

(1/6) ·
[ x_1^2 − 3x_1 + 2    x_1x_2 − x_2          4 − x_1    2 − x_1   1 − x_1     1 − x_1
  3x_1^2 − 3x_1 − 6   3x_2 + 3x_1x_2        6 − 3x_1   −3x_1     −3x_1 − 3   −3x_1 − 3
  1 − x_1^2           −2x_2 − x_1x_2 − 3    x_1 − 1    x_1 + 1   x_1 + 2     x_1 + 2
  1 − x_1^2           3 − x_1x_2 − 2x_2     x_1 − 1    x_1 + 1   x_1 + 2     x_1 + 2 ].
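The identity L(x)C(x) = I_4 for this example can also be checked numerically. A sketch assuming Python with NumPy (an illustration, not part of the paper):

```python
import numpy as np

def C_mat(x1, x2):
    # constraints c1 = 1+x1, c2 = 1-x1, c3 = 2-x1-x2, c4 = 2-x1+x2
    c = [1 + x1, 1 - x1, 2 - x1 - x2, 2 - x1 + x2]
    G = np.array([[1, -1, -1, -1],
                  [0,  0, -1,  1]], dtype=float)   # gradients as columns
    return np.vstack([G, np.diag(c)])

def L_mat(x1, x2):
    return (1 / 6) * np.array([
        [x1**2 - 3*x1 + 2,   x1*x2 - x2,       4 - x1,   2 - x1,  1 - x1,    1 - x1],
        [3*x1**2 - 3*x1 - 6, 3*x2 + 3*x1*x2,   6 - 3*x1, -3*x1,   -3*x1 - 3, -3*x1 - 3],
        [1 - x1**2,          -2*x2 - x1*x2 - 3, x1 - 1,  x1 + 1,  x1 + 2,    x1 + 2],
        [1 - x1**2,          3 - x1*x2 - 2*x2,  x1 - 1,  x1 + 1,  x1 + 2,    x1 + 2],
    ])

rng = np.random.default_rng(5)
err = 0.0
for _ in range(50):
    x1, x2 = rng.uniform(-2, 2, 2)
    err = max(err, np.max(np.abs(L_mat(x1, x2) @ C_mat(x1, x2) - np.eye(4))))
print(err)
```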
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are m equality and inequality constraints in total, i.e., E ∪ I = {1, ..., m}. If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies the equation

(5.1)    C(x) λ = [ ∇f(x) ; 0 ; ⋮ ; 0 ],  where  C(x) :=
         [ ∇c_1(x)   ∇c_2(x)   ···   ∇c_m(x)
           c_1(x)    0         ···   0
           0         c_2(x)    ···   0
           ⋮                   ⋱     ⋮
           0         0         ···   c_m(x) ].

Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)    L(x) C(x) = I_m,

then we can get

(5.3)    λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such an L(x) exists.
Definition 5.1. The tuple c = (c_1, ..., c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to the condition that, for each u ∈ C^n, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), ..., ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

P(x) W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} for the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ C^n,

t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

W(x) = [ w_1(x)  w_2(x)  ···  w_t(x) ].
Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, ..., t, denote

r_{1i}(x) := ξ_1(x)^T w_i(x);

then (we use ∼ to denote row equivalence between matrices)

W(x) ∼ [ 1       r_{12}(x)   ···  r_{1t}(x)
         w_1(x)  w_2(x)      ···  w_t(x) ]
     ∼ W_1(x) := [ 1  r_{12}(x)       ···  r_{1t}(x)
                   0  w_2^{(1)}(x)    ···  w_t^{(1)}(x) ],

where each (i = 2, ..., t)

w_i^{(1)}(x) := w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ R[x]^{(s+1)×s} such that

P_1(x) W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

w_2^{(1)}(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^s such that

ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) := ξ_2(x)^T w_i^{(1)}(x); then

W_1(x) ∼ [ 1  r_{12}(x)      r_{13}(x)      ···  r_{1t}(x)
           0  1              r_{23}(x)      ···  r_{2t}(x)
           0  w_2^{(1)}(x)   w_3^{(1)}(x)   ···  w_t^{(1)}(x) ]
       ∼ W_2(x) := [ 1  r_{12}(x)   r_{13}(x)      ···  r_{1t}(x)
                     0  1           r_{23}(x)      ···  r_{2t}(x)
                     0  0           w_3^{(2)}(x)   ···  w_t^{(2)}(x) ],

where each (i = 3, ..., t)

w_i^{(2)}(x) := w_i^{(1)}(x) − r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ C[x]^{(s+2)×(s+1)} such that

P_2(x) W_1(x) = W_2(x).

Continuing this process, we can finally get

W_2(x) ∼ ··· ∼ W_t(x) := [ 1  r_{12}(x)  r_{13}(x)  ···  r_{1t}(x)
                           0  1          r_{23}(x)  ···  r_{2t}(x)
                           0  0          1          ···  r_{3t}(x)
                           ⋮                        ⋱    ⋮
                           0  0          0          ···  1
                           0  0          0          ···  0 ].

Consequently, there exists P_i(x) ∈ R[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

P_t(x) P_{t−1}(x) ··· P_1(x) W(x) = W_t(x).
Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ R[x]^{t×(s+t)} such that P_{t+1}(x)W_t(x) = I_t. Let

P(x) := P_{t+1}(x) P_t(x) P_{t−1}(x) ··· P_1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (here P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by the item (i). □
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints

    1 − x_1^2 ≥ 0,  1 − x_2^2 ≥ 0,  ...,  1 − x_n^2 ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −(1/2) diag(x)   I_n ].
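As a sanity check (an illustration, not part of the paper's computations), the identity L(x)C(x) = I_n for the hypercube can be verified numerically at a point: here C(x) stacks the gradient columns ∇c_i(x) = −2x_i e_i on top of diag(c_1(x), ..., c_n(x)), following the construction in (5.1).

```python
import random

def check_hypercube_identity(n, x):
    # C(x) is (2n) x n: the top block has columns grad c_i = -2*x_i*e_i,
    # the bottom block is diag(c_1(x), ..., c_n(x)) with c_i = 1 - x_i^2.
    C = [[0.0] * n for _ in range(2 * n)]
    for i in range(n):
        C[i][i] = -2.0 * x[i]          # gradient block
        C[n + i][i] = 1.0 - x[i] ** 2  # constraint-value block
    # L(x) = [-(1/2) diag(x), I_n] as in Example 5.3
    L = [[0.0] * (2 * n) for _ in range(n)]
    for i in range(n):
        L[i][i] = -0.5 * x[i]
        L[i][n + i] = 1.0
    # the product L(x) C(x), expected to be the n x n identity
    return [[sum(L[i][k] * C[k][j] for k in range(2 * n)) for j in range(n)]
            for i in range(n)]

random.seed(0)
n = 4
x = [random.uniform(-2, 2) for _ in range(n)]
P = check_hypercube_identity(n, x)
assert all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(n) for j in range(n))
```

The check confirms that x_i^2 + (1 − x_i^2) = 1 on each diagonal entry, independently of the point x.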
Example 5.4. Consider the nonnegative portion of the unit sphere

    x_1 ≥ 0,  x_2 ≥ 0,  ...,  x_n ≥ 0,  x_1^2 + ··· + x_n^2 − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − x x^T    x 1_n^T         2x ]
           [ (1/2) x^T      −(1/2) 1_n^T    −1 ].
Example 5.5. Consider the set

    1 − x_1^3 − x_2^4 ≥ 0,  1 − x_3^4 − x_4^3 ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

    L(x) = [ −x_1/3   −x_2/4   0        0        1   0 ]
           [ 0        0        −x_3/4   −x_4/3   0   1 ].
Example 5.6. Consider the quadratic set

    1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0,  1 − x_1^2 − x_2^2 − x_3^2 ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

    [ 25x_1^3 + 10x_1^2x_2 + 40x_1x_2^2 − 25x_1 − 2x_3    −25x_1^3 − 10x_1^2x_2 − 40x_1x_2^2 + 49x_1/2 + 2x_3 ]
    [ −15x_1^2x_2 + 10x_1x_2^2 + 20x_3x_1x_2 − 10x_1      15x_1^2x_2 − 10x_1x_2^2 − 20x_3x_1x_2 + 10x_1 − x_2/2 ]
    [ 25x_3x_1^2 − 20x_1x_2^2 + 10x_3x_1x_2 + 2x_1        −25x_3x_1^2 + 20x_1x_2^2 − 10x_3x_1x_2 − 2x_1 − x_3/2 ]
    [ 1 − 20x_1x_3 − 10x_1^2 − 20x_1x_2                   20x_1x_2 + 20x_1x_3 + 10x_1^2 ]
    [ −50x_1^2 − 20x_2x_1                                 50x_1^2 + 20x_2x_1 + 1 ].
We would like to remark that a polynomial tuple c = (c_1, ..., c_m) is generically nonsingular, and hence Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, ..., d_m, there exists an open dense subset U of D = R[x]_{d_1} × ··· × R[x]_{d_m} such that every tuple c = (c_1, ..., c_m) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof needs to use resultants and discriminants, for which we refer to [29]. First, let J_1 be the set of all (i_1, ..., i_{n+1}) with 1 ≤ i_1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i_1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, ..., c_{i_{n+1}} such that if Res(c_{i_1}, ..., c_{i_{n+1}}) ≠ 0 then the equations

    c_{i_1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

    F_1(c) = ∏_{(i_1,...,i_{n+1}) ∈ J_1} Res(c_{i_1}, ..., c_{i_{n+1}}).

For the case m ≤ n, we have J_1 = ∅, and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n of the polynomials c_1, ..., c_m have a common complex zero.
Second, let J_2 be the set of all (j_1, ..., j_k) with k ≤ n and 1 ≤ j_1 < ··· < j_k ≤ m. When one of c_{j_1}, ..., c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, ..., c_{j_k}) is a polynomial in the coefficients of c_{j_1}, ..., c_{j_k} such that if ∆(c_{j_1}, ..., c_{j_k}) ≠ 0 then the equations

    c_{j_1}(x) = ··· = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j_1}, ..., c_{j_k} at u are linearly independent. When all of c_{j_1}, ..., c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, ..., c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, ..., c_{j_k}) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

    F_2(c) = ∏_{(j_1,...,j_k) ∈ J_2} ∆(c_{j_1}, ..., c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer of the polynomials c_1, ..., c_m have a singular common complex zero.
Last, let F(c) = F_1(c)F_2(c) and

    U = { c = (c_1, ..., c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c_1, ..., c_m can have a common complex zero, and for any k polynomials (k ≤ n) of c_1, ..., c_m, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.
The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, ..., c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of the first n columns. Then choose (p_1, ..., p_m) to be the product L_1(x)∇f(x), i.e.,

    p_i = ( L_1(x)∇f(x) )_i.

In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sub-level set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
By Theorem 3.3, we have f_k = fmin for all k big enough, if fc = fmin and any one of the conditions i)-iii) there holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables. The computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w/o. LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

    min   x_1x_2(10 − x_3)
    s.t.  x_1 ≥ 0,  x_2 ≥ 0,  x_3 ≥ 0,  1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = fmin for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 6.1

    order k | w/o. LME: lower bound, time | with LME: lower bound, time
       2    |    −0.0521,        0.6841   |    −0.0521,        0.1922
       3    |    −0.0026,        0.2657   |    −3·10^{−8},     0.2285
       4    |    −0.0007,        0.6785   |    −6·10^{−9},     0.4431
       5    |    −0.0004,        1.6105   |    −2·10^{−9},     0.9567
² Note that 1 − x^Tx = (1 − e^Tx)(1 + x^Tx) + Σ_{i=1}^n x_i(1 − x_i)^2 + Σ_{i≠j} x_i^2 x_j ∈ IQ(c_eq, c_in).
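The identity in this footnote can be spot-checked numerically. The sketch below (illustrative only, not part of the paper's computations) evaluates both sides of 1 − x^Tx = (1 − e^Tx)(1 + x^Tx) + Σ_i x_i(1 − x_i)^2 + Σ_{i≠j} x_i^2 x_j at random points:

```python
import random

def lhs(x):
    return 1.0 - sum(t * t for t in x)

def rhs(x):
    n = len(x)
    s1 = sum(x)                                  # e^T x
    s2 = sum(t * t for t in x)                   # x^T x
    term1 = (1.0 - s1) * (1.0 + s2)              # (1 - e^T x)(1 + x^T x)
    term2 = sum(t * (1.0 - t) ** 2 for t in x)   # sum_i x_i (1 - x_i)^2
    term3 = sum(x[i] ** 2 * x[j] for i in range(n)
                for j in range(n) if i != j)     # sum_{i != j} x_i^2 x_j
    return term1 + term2 + term3

random.seed(1)
for _ in range(100):
    x = [random.uniform(-3, 3) for _ in range(3)]
    assert abs(lhs(x) - rhs(x)) < 1e-9
```

Since both sides are polynomials of degree 3 in n variables, agreement on random points strongly indicates the identity; it can also be verified by expanding the right-hand side directly.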
Example 6.2. Consider the optimization problem

    min   x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 − 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4)
    s.t.  x_1^2 + x_2^2 + x_3^2 ≥ 1.

The matrix polynomial L(x) = [ (1/2)x_1  (1/2)x_2  (1/2)x_3  −1 ]. The objective f is the sum of the positive definite form x_1^4 + x_2^4 + x_3^4 and the Motzkin polynomial

    M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 − 3x_1^2x_2^2x_3^2.
Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c_1(x)p_1(x) = (x_1^2 + x_2^2 + x_3^2 − 1)( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = fmin for all k ≥ 4, up to numerical round-off errors.
Table 2. Computational results for Example 6.2

    order k | w/o. LME: lower bound, time | with LME: lower bound, time
       3    |    −∞,             0.4466   |     0.1111,        0.1169
       4    |    −∞,             0.4948   |     0.3333,        0.3499
       5    |    −2.1821·10^5,   1.1836   |     0.3333,        0.6530
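The algebra behind the identity in Example 6.2 can be spot-checked: with L_1(x) = [x_1/2, x_2/2, x_3/2], an Euler-type computation gives p_1 = (L_1(x)∇f(x))_1 = 3M(x) + 2(x_1^4 + x_2^4 + x_3^4). The sketch below (illustrative only, with a hand-computed gradient) verifies this at random points:

```python
import random

def grad_f(x1, x2, x3):
    # gradient of f = M(x) + x1^4 + x2^4 + x3^4, computed by hand
    return (4*x1**3*x2**2 + 2*x1*x2**4 - 6*x1*x2**2*x3**2 + 4*x1**3,
            2*x1**4*x2 + 4*x1**2*x2**3 - 6*x1**2*x2*x3**2 + 4*x2**3,
            6*x3**5 - 6*x1**2*x2**2*x3 + 4*x3**3)

random.seed(2)
for _ in range(50):
    x1, x2, x3 = (random.uniform(-2, 2) for _ in range(3))
    g = grad_f(x1, x2, x3)
    # p_1 = (L_1(x) grad f(x))_1 = (1/2) x^T grad f(x)
    p1 = 0.5 * (x1*g[0] + x2*g[1] + x3*g[2])
    M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2
    Q = x1**4 + x2**4 + x3**4
    assert abs(p1 - (3*M + 2*Q)) < 1e-8
```

The factor 3 on M and 2 on the quartic part reflect Euler's relation: (1/2)x^T∇ of a form of degree d contributes d/2 times the form.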
Example 6.3. Consider the optimization problem

    min   x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1^3 + ··· + x_4^3)
    s.t.  x_1, x_2, x_3, x_4 ≥ 0,  1 − x_1 − x_2 ≥ 0,  1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1   −x_2    0       0       1  1  0  0  1  0 ]
    [ −x_1    1−x_2   0       0       1  1  0  0  1  0 ]
    [ 0       0       1−x_3   −x_4    0  0  1  1  0  1 ]
    [ 0       0       −x_3    1−x_4   0  0  1  1  0  1 ]
    [ −x_1    −x_2    0       0       1  1  0  0  1  0 ]
    [ 0       0       −x_3    −x_4    0  0  1  1  0  1 ].

The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.
³ This is because −c_1^2 p_1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ), and the set {−c_1(x)^2 p_1(x)^2 ≥ 0} is compact.
⁴ This is because 1 − x_1^2 − x_2^2 belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3^2 − x_4^2 belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

    order k | w/o. LME: lower bound, time | with LME: lower bound, time
       3    |    −2.9·10^{−5},   0.7335   |    −6·10^{−7},     0.6091
       4    |    −1.4·10^{−5},   2.5055   |    −8·10^{−8},     2.7423
       5    |    −1.4·10^{−5},  12.7092   |    −5·10^{−8},    13.7449
Example 6.4. Consider the polynomial optimization problem

    min_{x∈R^2}   x_1^2 + 50x_2^2
    s.t.  x_1^2 − 1/2 ≥ 0,  x_2^2 − 2x_1x_2 − 1/8 ≥ 0,  x_2^2 + 2x_1x_2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The first column of L(x) is
    [ (8x_1^3)/5 + x_1/5 ]
    [ (288x_2x_1^4)/5 − (16x_1^3)/5 − (124x_2x_1^2)/5 + (8x_1)/5 − 2x_2 ]
    [ −(288x_2x_1^4)/5 − (16x_1^3)/5 + (124x_2x_1^2)/5 + (8x_1)/5 + 2x_2 ]
and the second column of L(x) is

    [ −(8x_1^2x_2)/5 + (4x_2^3)/5 − x_2/10 ]
    [ (288x_1^3x_2^2)/5 + (16x_1^2x_2)/5 − (142x_1x_2^2)/5 − (9x_1)/20 − (8x_2^3)/5 + (11x_2)/5 ]
    [ −(288x_1^3x_2^2)/5 + (16x_1^2x_2)/5 + (142x_1x_2^2)/5 + (9x_1)/20 − (8x_2^3)/5 + (11x_2)/5 ]
The objective is coercive, so fmin is achieved at a critical point. The minimum value fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

    order k | w/o. LME: lower bound, time | with LME: lower bound, time
       3    |     6.7535,        0.4611   |    56.7500,        0.1309
       4    |     6.9294,        0.2428   |   112.6517,        0.2405
       5    |     8.8519,        0.3376   |   112.6517,        0.2167
       6    |    16.5971,        0.4703   |   112.6517,        0.3788
       7    |    35.4756,        0.6536   |   112.6517,        0.4537
Example 6.5. Consider the optimization problem

    min_{x∈R^3}   x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 − ( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) )
    s.t.  x_1 ≥ 0,  x_1x_2 − 1 ≥ 0,  x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1x_2   0     0   x_2   x_2   0  ]
    [ x_1        0     0   −1    −1    0  ]
    [ −x_1       x_2   0   1     0     −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = fmin for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

    order k | w/o. LME: lower bound, time | with LME: lower bound, time
       2    |    −∞,             0.4129   |    −∞,             0.1900
       3    |    −7.8184·10^6,   0.4641   |     0.9492,        0.3139
       4    |    −2.0575·10^4,   0.6499   |     0.9492,        0.5057
Example 6.6. Consider the optimization problem (x_0 = 1)

    min_{x∈R^4}   x^Tx + Σ_{i=0}^4 ∏_{j≠i} (x_i − x_j)
    s.t.  x_1^2 − 1 ≥ 0,  x_2^2 − 1 ≥ 0,  x_3^2 − 1 ≥ 0,  x_4^2 − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I_4 ]. The first part of the objective is x^Tx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

    (1, 1, 1, 1),  (1, −1, −1, 1),  (1, −1, 1, −1),  (1, 1, −1, −1),
    (1, −1, −1, −1),  (−1, −1, 1, 1),  (−1, 1, −1, 1),  (−1, 1, 1, −1),
    (−1, −1, −1, 1),  (−1, −1, 1, −1),  (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = fmin for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6

    order k | w/o. LME: lower bound, time | with LME: lower bound, time
       3    |    −∞,             1.1377   |     3.5480,        1.1765
       4    |    −6.6913·10^4,   4.7677   |     4.0000,        3.0761
       5    |    −21.3778,      22.9970   |     4.0000,       10.3354
Example 6.7. Consider the optimization problem

    min_{x∈R^3}   x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 − 3x_1^2x_2^2x_3^2 + x_2^2
    s.t.  x_1 − x_2x_3 ≥ 0,  −x_2 + x_3^2 ≥ 0.

The matrix polynomial

    L(x) = [ 1      0    0  0  0 ]
           [ −x_3   −1   0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for the standard Lasserre
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = fmin for all k ≥ 5, up to numerical round-off errors.
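The AM-GM argument in Example 6.7 can be illustrated numerically: the inequality x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 ≥ 3x_1^2x_2^2x_3^2 gives f ≥ x_2^2 ≥ 0 everywhere, with f = 0 at the points (x_1, 0, x_3) satisfying x_1x_3 = 0. A quick sanity check (illustrative only, not part of the paper's computations):

```python
import random

def f(x1, x2, x3):
    return (x1**4 * x2**2 + x2**4 * x3**2 + x3**4 * x1**2
            - 3 * x1**2 * x2**2 * x3**2 + x2**2)

# AM-GM: a + b + c >= 3*(abc)^(1/3) applied to the three sextic terms
# gives f >= x2^2 >= 0 everywhere; spot-check at random points
random.seed(3)
for _ in range(1000):
    x = [random.uniform(-3, 3) for _ in range(3)]
    assert f(*x) >= -1e-9

# the minimizers (x1, 0, x3) with x1 * x3 = 0 attain f = 0 exactly
assert f(1.7, 0.0, 0.0) == 0.0
assert f(0.0, 0.0, -2.5) == 0.0
```

Random sampling is not a proof, of course; the exact lower bound comes from AM-GM as stated in the text.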
Table 7. Computational results for Example 6.7

    order k | w/o. LME: lower bound, time | with LME: lower bound, time
       3    |    −∞,             0.6144   |    −∞,             0.3418
       4    |    −1.0909·10^7,   1.0542   |    −3.9476,        0.7180
       5    |    −942.6772,      1.6771   |    −3·10^{−9},     1.4607
       6    |    −0.0110,        3.3532   |    −8·10^{−10},    3.1618
Example 6.8. Consider the optimization problem

    min_{x∈R^4}   x_1^2(x_1 − x_4)^2 + x_2^2(x_2 − x_4)^2 + x_3^2(x_3 − x_4)^2
                  + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)^2 + (x_2 − 1)^2 + (x_3 − 1)^2
    s.t.  x_1 − x_2 ≥ 0,  x_2 − x_3 ≥ 0.

The matrix polynomial

    L(x) = [ 1  0  0  0  0  0 ]
           [ 1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

    order k | w/o. LME: lower bound, time | with LME: lower bound, time
       2    |    −∞,             0.3984   |    −0.3360,        0.9321
       3    |    −∞,             0.7634   |     0.9413,        0.5240
       4    |    −6.4896·10^5,   4.5496   |     0.9413,        1.7192
       5    |    −3.1645·10^3,  24.3665   |     0.9413,        8.1228
Example 6.9. Consider the optimization problem

    min_{x∈R^4}   (x_1 + x_2 + x_3 + x_4 + 1)^2 − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
    s.t.  0 ≤ x_1, ..., x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.
⁵ Note that 4 − Σ_{i=1}^4 x_i^2 = Σ_{i=1}^4 ( x_i(1 − x_i)^2 + (1 − x_i)(1 + x_i^2) ) ∈ IQ(c_eq, c_in).
Table 9. Computational results for Example 6.9

    order k | w/o. LME: lower bound, time | with LME: lower bound, time
       2    |    −0.0279,        0.2262   |    −5·10^{−6},     1.1835
       3    |    −0.0005,        0.4691   |    −6·10^{−7},     1.6566
       4    |    −0.0001,        3.1098   |    −2·10^{−7},     5.5234
       5    |    −4·10^{−5},    16.5092   |    −6·10^{−7},    19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x_0 = 1)

    min_{x∈R^n}   Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
    s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

    n         |      9      10      11       12       13       14
    w/o. LME  | 1.2569  2.5619  6.3085  15.8722  35.1675  78.4111
    with LME  | 1.9714  3.8288  8.2519  20.0310  37.6373  82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

    Preord(c_in, ψ) = Σ_{r_1,...,r_ℓ ∈ {0,1}} g_1^{r_1} ··· g_ℓ^{r_ℓ} · Σ[x].
The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

    (7.1)  f'^{pre}_k = min   ⟨f, y⟩
                        s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{c_eq}(y) = 0,  L^{(k)}_φ(y) = 0,
                              L^{(k)}_{g_1^{r_1}···g_ℓ^{r_ℓ}}(y) ⪰ 0  for all r_1, ..., r_ℓ ∈ {0, 1},
                              y ∈ R^{N^n_{2k}}.
Similar to (3.9), the dual optimization problem of the above is

    (7.2)  f^{pre}_k = max   γ
                       s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.
Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

    f^{pre}_k = f'^{pre}_k = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f^{pre}_k = f'^{pre}_k = fmin for all k big enough.
Proof. The proof is very similar to Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

    K_3 = { x ∈ R^n | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □
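The preordering involves all 2^ℓ products g_1^{r_1}···g_ℓ^{r_ℓ} with r ∈ {0,1}^ℓ; this is how a solver would assemble the localizing constraints in (7.1). Their enumeration can be sketched as follows (illustrative only; the generator names are placeholders):

```python
from itertools import product

def preordering_products(g):
    """All products g_1^{r1} * ... * g_l^{rl} with r in {0,1}^l,
    each represented as the tuple of factors with r_i = 1."""
    l = len(g)
    return [tuple(g[i] for i in range(l) if r[i] == 1)
            for r in product((0, 1), repeat=l)]

# for l = 3 generators there are 2^3 = 8 products, from the empty
# product (the pure SOS part) up to the full product g1*g2*g3
prods = preordering_products(["g1", "g2", "g3"])
assert len(prods) == 8
assert () in prods and ("g1", "g2", "g3") in prods
```

The exponential count 2^ℓ is exactly why the preordering relaxations (7.1)-(7.2), though tighter, are more expensive than the quadratic-module relaxations (3.8)-(3.9).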
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m; hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is singular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for the critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. For each usage, if the degree bound in [16] is applied, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

    λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above one. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
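For a single constraint (m = 1), the rational representation reduces to a scalar formula. The following sketch (an illustration under the specific choice c(x) = 1 − x^Tx and a linear objective; not from the paper) evaluates it and checks it against the KKT equation at a known minimizer:

```python
import math

def lambda_rational(x, grad_f):
    """lambda = (C^T C)^{-1} C^T [grad f; 0] for the single constraint
    c(x) = 1 - x^T x, where C(x) = [grad c(x); c(x)] is (n+1) x 1."""
    c = 1.0 - sum(t * t for t in x)
    grad_c = [-2.0 * t for t in x]
    num = sum(gc * gf for gc, gf in zip(grad_c, grad_f))  # C^T [grad f; 0]
    den = sum(gc * gc for gc in grad_c) + c * c           # C^T C (a scalar)
    return num / den

# f(x) = x1 + ... + xn has the minimizer x = -e/sqrt(n) on the sphere,
# where the KKT equation grad f = lambda * grad c forces lambda = sqrt(n)/2
n = 4
x = [-1.0 / math.sqrt(n)] * n
lam = lambda_rational(x, [1.0] * n)
assert abs(lam - math.sqrt(n) / 2.0) < 1e-12
```

Even in this simplest case the denominator 4x^Tx + (1 − x^Tx)^2 has degree 4, illustrating why the general rational formula is expensive compared with a polynomial expression λ = L_1(x)∇f(x).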
Acknowledgement The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973 He would like to thank the anonymous referees forvaluable comments on improving the paper
References
[1] D Bertsekas Nonlinear programming second edition Athena Scientific 1995[2] J Bochnak M Coste and M-F Roy Real algebraic geometry Springer 1998[3] F Bugarin D Henrion and J Lasserre Minimizing the sum of many rational functions
Math Programming Comput 8(2016) pp 83ndash111[4] D Cox J Little and D OrsquoShea Ideals varieties and algorithms An introduction to com-
putational algebraic geometry and commutative algebra Third edition Undergraduate Textsin Mathematics Springer New York 1997
[5] R Curto and L Fialkow Truncated K-moment problems in several variables Journal ofOperator Theory 54(2005) pp 189-226
[6] J. Demmel, J. Nie and V. Powers, Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals, Journal of Pure and Applied Algebra 209 (2007), no. 1, pp. 189–200.
[7] E de Klerk and M Laurent On the Lasserre hierarchy of semidefinite programming re-laxations of convex polynomial optimization problems SIAM Journal On Optimization 21(2011) pp 824-832
[8] E de Klerk J Lasserre M Laurent and Z Sun Bound-constrained polynomial optimizationusing only elementary calculations Mathematics of Operations Research to appear
[9] E de Klerk M Laurent and Z Sun Convergence analysis for Lasserrersquos measure-based hier-archy of upper bounds for polynomial optimization Mathematical Programming to appear
[10] E de Klerk R Hess and M Laurent Improved convergence rates for Lasserre-type hi-erarchies of upper bounds for box-constrained polynomial optimization Preprint 2016arXiv160303329[mathOC]
[11] G-M Greuel and G Pfister A Singular introduction to commutative algebra Springer 2002[12] S He Z Luo J Nie and S Zhang Semidefinite relaxation bounds for indefinite homogeneous
quadratic optimization SIAM J Optim 19 (2) pp 503ndash523[13] D Henrion J Lasserre and J Loefberg GloptiPoly 3 moments optimization and semidef-
inite programming Optimization Methods and Software 24 (2009) no 4-5 pp 761ndash779[14] D Henrion and JB Lasserre Detecting global optimality and extracting solutions in Glop-
tiPoly Positive polynomials in control 293ndash310 Lecture Notes in Control and Inform Sci312 Springer Berlin 2005
[15] D Jibetean and E de Klerk Global optimization of rational functions A semidefinite pro-gramming approach Mathematical Programming 106 (2006) no 1 pp 93ndash109
[16] J. Kollár, Sharp effective Nullstellensatz, J. Amer. Math. Soc. 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre, Global optimization with polynomials and the problem of moments, SIAM J.
Optim 11(3) pp 796ndash817 2001[18] J Lasserre M Laurent and P Rostalski Semidefinite characterization and computation of
zero-dimensional real radical ideals Foundations of Computational Mathematics 8(2008)no 5 pp 607ndash647
[19] J Lasserre A new look at nonnegativity on closed sets and polynomial optimization SIAMJ Optim 21 (2011) pp 864ndash885
[20] JB Lasserre Convexity in semi-algebraic geometry and polynomial optimization SIAM JOptim 19 (2009) pp 1995-2014
[21] JB Lasserre Moments positive polynomials and their applications Imperial College Press2009
[22] JB Lasserre Introduction to polynomial and semi-algebraic optimization Cambridge Uni-versity Press Cambridge 2015
[23] JB Lasserre KC Toh and S Yang A bounded degree SOS hierarchy for polynomial opti-mization Euro J Comput Optim to appear
[24] M Laurent Semidefinite representations for finite varieties Mathematical Programming 109(2007) pp 1ndash26
[25] M Laurent Sums of squares moment matrices and optimization over polynomials Emerg-ing Applications of Algebraic Geometry Vol 149 of IMA Volumes in Mathematics and itsApplications (Eds M Putinar and S Sullivant) Springer pages 157-270 2009
[26] M Laurent Optimization over polynomials Selected topics Proceedings of the InternationalCongress of Mathematicians 2014
[27] J Nie J Demmel and B Sturmfels Minimizing polynomials via sum of squares over thegradient ideal Mathematical Programming Ser A 106 (2006) no 3 pp 587ndash606
[28] J Nie J Demmel and M Gu Global minimization of rational functions and the nearestGCDs Journal of Global Optimization 40 (2008) no 4 pp 697ndash718
[29] J Nie Discriminants and Nonnegative Polynomials Journal of Symbolic ComputationVol 47 no 2 pp 167ndash191 2012
[30] J Nie An exact Jacobian SDP relaxation for polynomial optimization Mathematical Pro-gramming Ser A 137 (2013) pp 225ndash255
[31] J Nie Polynomial optimization with real varieties SIAM J Optim Vol 23 No 3 pp1634-1646 2013
[32] J Nie Certifying convergence of Lasserrersquos hierarchy via flat truncation Mathematical Pro-gramming Ser A 142 (2013) no 1-2 pp 485ndash510
[33] J Nie Optimality conditions and finite convergence of Lasserrersquos hierarchy MathematicalProgramming Ser A 146 (2014) no 1-2 pp 97ndash121
[34] J Nie Linear optimization with cones of moments and nonnegative polynomials Mathemat-ical Programming Ser B 153 (2015) no 1 pp 247ndash274
[35] P A Parrilo and B Sturmfels Minimizing polynomial functions In S Basu and L Gonzalez-Vega editors Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathe-matics and Computer Science volume 60 of DIMACS Series in Discrete Mathematics andComputer Science pages 83-99 AMS 2003
[36] M Putinar Positive polynomials on compact semi-algebraic sets Ind Univ Math J 42(1993) 969ndash984
[37] B Reznick Some concrete aspects of Hilbertrsquos 17th problem In Contemp Math Vol 253pp 251-272 American Mathematical Society 2000
[38] M Schweighofer Global optimization of polynomials using gradient tentacles and sums ofsquares SIAM J Optim 17 (2006) no 3 pp 920ndash942
[39] C Scheiderer Positivity and sums of squares A guide to recent results Emerging Appli-cations of Algebraic Geometry (M Putinar S Sullivant eds) IMA Volumes Math Appl149 Springer 2009 pp 271ndash324
[40] J F Sturm SeDuMi 102 A MATLAB toolbox for optimization over symmetric conesOptim Methods Softw 11 amp 12 (1999) pp 625ndash653
[41] B Sturmfels Solving systems of polynomial equations CBMS Regional Conference Series inMathematics 97 American Mathematical Society Providence RI 2002
[42] H H Vui and P T Son Global optimization of polynomials using the truncated tangencyvariety and sums of squares SIAM J Optim 19 (2008) no 2 pp 941ndash951
Department of Mathematics University of California at San Diego 9500 Gilman
Drive La Jolla CA USA 92093
E-mail address njwmathucsdedu
where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists and give a degree bound for it.

The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (also see Definition 5.1). This is equivalent to the condition that, for every u, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then a_{i_1}, ..., a_{i_k} are linearly independent.

Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.
Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.

For a subset I = {i_1, ..., i_{m−n}} of [m], denote

    c_I(x) = ∏_{i∈I} c_i(x),   E_I(x) = c_I(x) · diag( c_{i_1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1} ),
    D_I(x) = diag( c_{i_1}(x), ..., c_{i_{m−n}}(x) ),   A_I = [ a_{i_1} ··· a_{i_{m−n}} ]^T.

For the case that I = ∅ (the empty set), we set c_∅(x) = 1. Let

    V = { I ⊆ [m] : |I| = m − n, rank A_{[m]∖I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that
    (4.5)  L_I(x) C(x) = c_I(x) I_m.

The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

    (4.6)
       K:          [n]                        n + I                              n + ([m]∖I)
    J = I:         0                          E_I(x)                             0
    J = [m]∖I:     c_I(x)·(A_{[m]∖I})^{−T}    −(A_{[m]∖I})^{−T}(A_I)^T E_I(x)    0

Equivalently, the blocks of L_I are

    L_I(I, [n]) = 0,   L_I(I, n + ([m]∖I)) = 0,   L_I([m]∖I, n + ([m]∖I)) = 0,
    L_I(I, n + I) = E_I(x),   L_I([m]∖I, [n]) = c_I(x)·(A_{[m]∖I})^{−T},
    L_I([m]∖I, n + I) = −(A_{[m]∖I})^{−T}(A_I)^T E_I(x).

For each I ∈ V, A_{[m]∖I} is invertible. (The superscript −T denotes the inverse of the transpose.) Let G = L_I(x)C(x); then one can verify that

    G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},   G(I, [m]∖I) = 0,

    G([m]∖I, [m]∖I) = [ c_I(x)(A_{[m]∖I})^{−T}   −(A_{[m]∖I})^{−T}(A_I)^T E_I(x) ] [ (A_{[m]∖I})^T ; 0 ] = c_I(x) I_n,

    G([m]∖I, I) = [ c_I(x)(A_{[m]∖I})^{−T}   −(A_{[m]∖I})^{−T}(A_I)^T E_I(x) ] [ (A_I)^T ; D_I(x) ] = 0.

This shows that the above L_I(x) satisfies (4.5).
Step II: We show that there exist real scalars ν_I satisfying

    (4.7)  Σ_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, we have V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.

• When m > n, let
    (4.8)  N = { i ∈ [m] | rank A_{[m]∖{i}} = n }.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]∖{i} such that |I′| = m − n − 1 and rank A_{[m]∖(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]∖{i}} x − b_{[m]∖{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

    (4.9)  Σ_{I′∈V_i} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, ..., a_n} is linearly independent. So there exist scalars α_1, ..., α_n such that

    a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows; so (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

    { i : α_i ≠ 0 } = { i_1, ..., i_k }.

Then a_{i_1}, ..., a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

    c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, ..., μ_{k+1} such that

    μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This can be implied by the echelon form for inconsistent linear systems.) Note that i_1, ..., i_{k+1} ∈ N. For each j = 1, ..., k + 1, by (4.9),

    Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = 1.
Then we can get

    1 = Σ_{j=1}^{k+1} μ_j c_{i_j}(x)
      = Σ_{j=1}^{k+1} μ_j Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{i_j}(x) c_{I′}(x)
      = Σ_{I=I′∪{i_j}, I′∈V_{i_j}, 1≤j≤k+1} ν^{(i_j)}_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.
Step III: For L_I(x) as in (4.5), we construct L(x) as

    (4.10)  L(x) = Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

    L(x) C(x) = Σ_{I∈V} ν_I L_I(x) C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given A, b, a matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x) with L(x)C(x) = I_m for polyhedral sets.
Example 4.2. Consider the simplicial set

    x_1 ≥ 0,  ...,  x_n ≥ 0,  1 − e^Tx ≥ 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ 1−x_1   −x_2    ···   −x_n    1  ···  1 ]
           [ −x_1    1−x_2   ···   −x_n    1  ···  1 ]
           [ ⋮       ⋮       ⋱     ⋮       ⋮       ⋮ ]
           [ −x_1    −x_2    ···   1−x_n   1  ···  1 ]
           [ −x_1    −x_2    ···   −x_n    1  ···  1 ].
Example 4.3. Consider the box constraints

    x_1 ≥ 0, . . . , x_n ≥ 0, 1 − x_1 ≥ 0, . . . , 1 − x_n ≥ 0.

The equation L(x)C(x) = I_{2n} is satisfied by

    L(x) = [ I_n − diag(x)   I_n   I_n
             −diag(x)        I_n   I_n ].
Example 4.4. Consider the polyhedral set

    1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0.

The equation L(x)C(x) = I_5 is satisfied by

    L(x) = (1/2) [ −x_1−1   −x_2−1   −x_3−1   −x_4−1   1 1 1 1 1
                   −x_1−1   −x_2−1   −x_3−1    1−x_4   1 1 1 1 1
                   −x_1−1   −x_2−1    1−x_3    1−x_4   1 1 1 1 1
                   −x_1−1    1−x_2    1−x_3    1−x_4   1 1 1 1 1
                    1−x_1    1−x_2    1−x_3    1−x_4   1 1 1 1 1 ].
Example 4.5. Consider the polyhedral set

    1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0.

The matrix L(x) satisfying L(x)C(x) = I_4 is

    (1/6) [ x_1²−3x_1+2    x_1x_2−x_2       4−x_1    2−x_1   1−x_1     1−x_1
            3x_1²−3x_1−6   3x_2+3x_1x_2     6−3x_1   −3x_1   −3x_1−3   −3x_1−3
            1−x_1²         −x_1x_2−2x_2−3   x_1−1    x_1+1   x_1+2     x_1+2
            1−x_1²         3−x_1x_2−2x_2    x_1−1    x_1+1   x_1+2     x_1+2 ].
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are m constraints (equalities and inequalities) in total, i.e.,

    E ∪ I = {1, . . . , m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = (λ_1, . . . , λ_m)^T satisfies the equation

(5.1)    C(x) λ = [ ∇f(x) ; 0 ],    where    C(x) = [ ∇c_1(x)   ∇c_2(x)   ···   ∇c_m(x)
                                                      diag( c_1(x), c_2(x), . . . , c_m(x) ) ]

is the (n+m) × m matrix obtained by stacking the gradients on top of the diagonal matrix of the constraining polynomials.
Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)    L(x)C(x) = I_m,

then we can get

(5.3)    λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.
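To make (5.3) concrete, here is a small numeric sketch (ours, not from the paper) using the box constraints of Example 4.3; the objective f(x) = ‖x − 2‖² is our own illustrative choice:

```python
# lambda = L_1(x) grad f(x) at a critical point, for the box 0 <= x_i <= 1.
import numpy as np

n = 3
f_grad = lambda x: 2.0 * (x - 2.0)          # f(x) = ||x - 2||^2, minimized over the box at x = (1,...,1)

def multipliers(x):
    # L_1(x) = [I - diag(x); -diag(x)]: the first n columns of L(x) in Example 4.3.
    L1 = np.vstack([np.eye(n) - np.diag(x), -np.diag(x)])
    return L1 @ f_grad(x)                   # multipliers of x_i >= 0, then of 1 - x_i >= 0

x_star = np.ones(n)                          # minimizer: the upper bounds are active
lam = multipliers(x_star)
print(lam[:n])                               # multipliers of x_i >= 0: all 0
print(lam[n:])                               # multipliers of 1 - x_i >= 0: all 2
```

One can check the KKT condition directly: ∇f(x*) = (−2, −2, −2) equals Σ λ_i ∇c_i(x*) = 2·Σ(−e_i).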
Definition 5.1. The tuple c = (c_1, . . . , c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.
Clearly, c being nonsingular is equivalent to the following: for each u ∈ C^n, if J(u) = {i_1, . . . , i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), . . . , ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

    P(x)W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ C^n,

    t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

    W(x) = [ w_1(x)  w_2(x)  ···  w_t(x) ].
Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, . . . , t, denote

    r_{1i}(x) = ξ_1(x)^T w_i(x);

then (using ∼ to denote row equivalence between matrices)

    W(x) ∼ [ 1        r_{12}(x)  ···  r_{1t}(x)
             w_1(x)   w_2(x)     ···  w_t(x)   ]
         ∼ W_1(x) = [ 1   r_{12}(x)      ···  r_{1t}(x)
                      0   w_2^{(1)}(x)   ···  w_t^{(1)}(x) ],

where, for each i = 2, . . . , t,

    w_i^{(1)}(x) = w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ C[x]^{(s+1)×s} such that

    P_1(x)W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

    w_2^{(1)}(x) = 0

does not have a complex solution. Again by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^{s+1} such that

    ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, . . . , t, let r_{2i}(x) = ξ_2(x)^T w_i^{(1)}(x); then

    W_1(x) ∼ [ 1   r_{12}(x)      r_{13}(x)      ···  r_{1t}(x)
               0   1              r_{23}(x)      ···  r_{2t}(x)
               0   w_2^{(1)}(x)   w_3^{(1)}(x)   ···  w_t^{(1)}(x) ]
           ∼ W_2(x) = [ 1   r_{12}(x)   r_{13}(x)      ···  r_{1t}(x)
                        0   1           r_{23}(x)      ···  r_{2t}(x)
                        0   0           w_3^{(2)}(x)   ···  w_t^{(2)}(x) ],

where, for each i = 3, . . . , t,

    w_i^{(2)}(x) = w_i^{(1)}(x) − r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere, and there exists P_2(x) ∈ C[x]^{(s+2)×(s+1)} such that

    P_2(x)W_1(x) = W_2(x).

Continuing this process, we eventually get

    W_2(x) ∼ · · · ∼ W_t(x) = [ 1   r_{12}(x)   r_{13}(x)   ···   r_{1t}(x)
                                0   1           r_{23}(x)   ···   r_{2t}(x)
                                0   0           1           ···   r_{3t}(x)
                                ⋮   ⋮           ⋮           ⋱     ⋮
                                0   0           0           ···   1
                                0   0           0           ···   0 ].

Consequently, there exist P_i(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, . . . , t, such that

    P_t(x)P_{t−1}(x) · · · P_1(x) W(x) = W_t(x).
Since W_t(x) is a unit upper triangular matrix polynomial (up to zero rows), there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x)W_t(x) = I_t. Let

    P(x) = P_{t+1}(x)P_t(x)P_{t−1}(x) · · · P_1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (where P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). □
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), the matrix can be determined by comparing coefficients on both sides of (5.2); this amounts to solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
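The coefficient-matching approach can be sketched as follows (our own illustration: one variable, the single constraint 1 − x² ≥ 0, and an ansatz of degree 1 for L):

```python
# Determine L(x) with L(x)C(x) = 1 by matching coefficients, via sympy.
import sympy as sp

x, a0, a1, b0, b1 = sp.symbols('x a0 a1 b0 b1')
c1 = 1 - x**2
C = sp.Matrix([sp.diff(c1, x), c1])               # C(x) = [c1'(x); c1(x)]
L = sp.Matrix([[a0 + a1*x, b0 + b1*x]])           # unknown 1 x 2 matrix polynomial

residual = sp.expand((L*C)[0, 0] - 1)             # must vanish identically in x
eqs = sp.Poly(residual, x).all_coeffs()           # linear equations in a0, a1, b0, b1
sol = sp.solve(eqs, [a0, a1, b0, b1])
print(L.subs(sol))                                # -> Matrix([[-x/2, 1]])
```

The unique solution of the linear system recovers L(x) = [−x/2, 1], i.e., the n = 1 case of the hypercube example below.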
Example 5.3. Consider the hypercube with quadratic constraints

    1 − x_1² ≥ 0, 1 − x_2² ≥ 0, . . . , 1 − x_n² ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −(1/2) diag(x)   I_n ].
Example 5.4. Consider the nonnegative portion of the unit sphere:

    x_1 ≥ 0, x_2 ≥ 0, . . . , x_n ≥ 0, x_1² + · · · + x_n² − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − xx^T    x 1_n^T       2x
             (1/2)x^T     −(1/2)1_n^T   −1 ].
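This identity mixes inequality and equality constraints, so it is a good consistency test; the sketch below (ours, for n = 3) verifies it symbolically:

```python
# Verify L(x)C(x) = I_{n+1} for the nonnegative sphere portion of Example 5.4.
import sympy as sp

n = 3
xs = sp.Matrix(sp.symbols(f'x1:{n+1}'))
c = list(xs) + [sum(xi**2 for xi in xs) - 1]          # x_i >= 0 and the sphere equality

grad = sp.Matrix([[sp.diff(cj, xi) for cj in c] for xi in xs])
C = grad.col_join(sp.diag(*c))                        # (2n+1) x (n+1)

ones = sp.ones(1, n)
L = sp.Matrix(sp.BlockMatrix([
    [sp.eye(n) - xs*xs.T, xs*ones,            2*xs],
    [xs.T/2,              -ones/2, sp.Matrix([[-1]])],
]))
assert sp.expand(L*C - sp.eye(n+1)) == sp.zeros(n+1)
print("L(x)C(x) = I_{n+1} verified")
```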
Example 5.5. Consider the set

    1 − x_1³ − x_2⁴ ≥ 0, 1 − x_3⁴ − x_4³ ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

    L(x) = [ −x_1/3   −x_2/4   0        0        1   0
             0        0        −x_3/4   −x_4/3   0   1 ].
Example 5.6. Consider the quadratic set

    1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0, 1 − x_1² − x_2² − x_3² ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

    [ 25x_1³ + 10x_1²x_2 + 40x_1x_2² − 25x_1 − 2x_3    −25x_1³ − 10x_1²x_2 − 40x_1x_2² + 49x_1² + 2x_3
      −15x_1²x_2 + 10x_1x_2² + 20x_1x_2x_3 − 10x_1      15x_1²x_2 − 10x_1x_2² − 20x_1x_2x_3 + 10x_1 − x_2²
      25x_1²x_3 − 20x_1x_2² + 10x_1x_2x_3 + 2x_1       −25x_1²x_3 + 20x_1x_2² − 10x_1x_2x_3 − 2x_1 − x_3²
      1 − 20x_1x_3 − 10x_1² − 20x_1x_2                  20x_1x_2 + 20x_1x_3 + 10x_1²
      −50x_1² − 20x_1x_2                                50x_1² + 20x_1x_2 + 1 ].
We would like to remark that a polynomial tuple c = (c_1, . . . , c_m) is generically nonsingular, and that Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, . . . , d_m, there exists an open dense subset U of D = R[x]_{d_1} × · · · × R[x]_{d_m} such that every tuple c = (c_1, . . . , c_m) ∈ U is nonsingular. Indeed, U can be chosen to be a Zariski open subset of D, i.e., the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof uses resultants and discriminants, for which we refer to [29]. First, let J_1 be the set of all (i_1, . . . , i_{n+1}) with 1 ≤ i_1 < · · · < i_{n+1} ≤ m. The resultant Res(c_{i_1}, . . . , c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, . . . , c_{i_{n+1}} such that, if Res(c_{i_1}, . . . , c_{i_{n+1}}) ≠ 0, then the equations

    c_{i_1}(x) = · · · = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

    F_1(c) = Π_{(i_1, . . . , i_{n+1}) ∈ J_1} Res(c_{i_1}, . . . , c_{i_{n+1}}).

For the case m ≤ n, J_1 = ∅, and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n of the polynomials c_1, . . . , c_m have a common complex zero.
Second, let J_2 be the set of all (j_1, . . . , j_k) with k ≤ n and 1 ≤ j_1 < · · · < j_k ≤ m. When one of c_{j_1}, . . . , c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, . . . , c_{j_k}) is a polynomial in the coefficients of c_{j_1}, . . . , c_{j_k} such that, if ∆(c_{j_1}, . . . , c_{j_k}) ≠ 0, then the equations

    c_{j_1}(x) = · · · = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j_1}, . . . , c_{j_k} at u are linearly independent. When all of c_{j_1}, . . . , c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, . . . , c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, . . . , c_{j_k}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

    F_2(c) = Π_{(j_1, . . . , j_k) ∈ J_2} ∆(c_{j_1}, . . . , c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer of the polynomials c_1, . . . , c_m have a singular common complex zero.
Last, let F(c) = F_1(c)F_2(c) and

    U = { c = (c_1, . . . , c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c_1, . . . , c_m can have a common complex zero, and for any k (k ≤ n) polynomials of c_1, . . . , c_m, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9), with Lagrange multiplier expressions, for solving the optimization problem (1.1). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.
The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, . . . , c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of its first n columns. Then choose the tuple (p_1, . . . , p_m) to be the product L_1(x)∇f(x), i.e.,

    p_i = ( L_1(x)∇f(x) )_i.
In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
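The recipe above is easy to automate. Here is a sketch (our own illustration; the objective f below is our choice, not one from the paper) for the hypercube constraints 1 − x_i² ≥ 0 of Example 5.3, where L_1(x) = −(1/2)diag(x):

```python
# Construct p_i = (L_1(x) grad f(x))_i symbolically.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**4 + x2**4 - 4*x1*x2                     # illustrative objective
L1 = -sp.diag(x1, x2) / 2                       # first n columns of L(x) for 1 - x_i^2 >= 0
grad_f = sp.Matrix([sp.diff(f, v) for v in (x1, x2)])
p = sp.expand(L1 * grad_f)
# p_1 = -2*x1**4 + 2*x1*x2,  p_2 = -2*x2**4 + 2*x1*x2
print(p.T)
```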
By Theorem 3.3, we have f_k = f_min for all k big enough if f_c = f_min and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions; however, in computation we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied in all our examples, except Examples 6.1, 6.7 and 6.9 (which have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The tables show the lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and by (3.8)-(3.9) (using Lagrange multiplier expressions), as well as the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

    min  x_1x_2(10 − x_3)
    s.t. x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0, 1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. Condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 6.1

    order k |  w/o LME: bound,  time |  with LME: bound,  time
       2    |  −0.0521,   0.6841     |  −0.0521,   0.1922
       3    |  −0.0026,   0.2657     |  −3·10⁻⁸,   0.2285
       4    |  −0.0007,   0.6785     |  −6·10⁻⁹,   0.4431
       5    |  −0.0004,   1.6105     |  −2·10⁻⁹,   0.9567
² Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^n x_i(1 − x_i)² + Σ_{i≠j} x_i² x_j ∈ IQ(c_eq, c_in).
Example 6.2. Consider the optimization problem

    min  x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3² + (x_1⁴ + x_2⁴ + x_3⁴)
    s.t. x_1² + x_2² + x_3² ≥ 1.

The matrix polynomial L(x) = [ (1/2)x_1  (1/2)x_2  (1/2)x_3  −1 ]. The objective f is the sum of the positive definite form x_1⁴ + x_2⁴ + x_3⁴ and the Motzkin polynomial

    M(x) = x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive, and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c_1(x)p_1(x) = (x_1² + x_2² + x_3² − 1) ( 3M(x) + 2(x_1⁴ + x_2⁴ + x_3⁴) ) = 0

defines a compact set. So condition i) of Theorem 3.3 is satisfied.³ The minimum value f_min = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.
Table 2. Computational results for Example 6.2

    order k |  w/o LME: bound,    time |  with LME: bound,  time
       3    |  −∞,           0.4466    |  0.1111,   0.1169
       4    |  −∞,           0.4948    |  0.3333,   0.3499
       5    |  −2.1821·10⁵,  1.1836    |  0.3333,   0.6530
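As an independent numeric sanity check of the reported minimum (a script of ours, not part of the paper's computation):

```python
# f = Motzkin polynomial + (x1^4 + x2^4 + x3^4) equals 1/3 at the
# eight points (+-1/sqrt(3), +-1/sqrt(3), +-1/sqrt(3)).
import itertools, math

def f(x1, x2, x3):
    motzkin = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2
    return motzkin + x1**4 + x2**4 + x3**4

for s in itertools.product([1, -1], repeat=3):
    u = [si / math.sqrt(3) for si in s]
    assert abs(f(*u) - 1/3) < 1e-12
print("f = 1/3 at all eight reported minimizers")
```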
Example 6.3. Consider the optimization problem

    min  x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1³ + · · · + x_4³)
    s.t. x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0, x_4 ≥ 0, 1 − x_1 − x_2 ≥ 0, 1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1   −x_2    0       0      1 1 0 0 1 0
      −x_1    1−x_2   0       0      1 1 0 0 1 0
      0       0       1−x_3   −x_4   0 0 1 1 0 1
      0       0       −x_3    1−x_4  0 0 1 1 0 1
      −x_1    −x_2    0       0      1 1 0 0 1 0
      0       0       −x_3    −x_4   0 0 1 1 0 1 ].

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. Condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.
³ This is because −c_1²p_1² ∈ Ideal(φ) ⊆ IQ(φ, ψ), and the set {−c_1(x)²p_1(x)² ≥ 0} is compact.
⁴ This is because 1 − x_1² − x_2² belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3² − x_4² belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

    order k |  w/o LME: bound,     time |  with LME: bound,  time
       3    |  −2.9·10⁻⁵,   0.7335      |  −6·10⁻⁷,   0.6091
       4    |  −1.4·10⁻⁵,   2.5055      |  −8·10⁻⁸,   2.7423
       5    |  −1.4·10⁻⁵,  12.7092      |  −5·10⁻⁸,  13.7449
Example 6.4. Consider the polynomial optimization problem

    min_{x∈R²}  x_1² + 50x_2²
    s.t. x_1² − 1/2 ≥ 0, x_2² − 2x_1x_2 − 1/8 ≥ 0, x_2² + 2x_1x_2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The first column of L(x) is
    [ (8x_1³)/5 + x_1/5
      (288x_2x_1⁴)/5 − (16x_1³)/5 − (124x_2x_1²)/5 + (8x_1)/5 − 2x_2
      −(288x_2x_1⁴)/5 − (16x_1³)/5 + (124x_2x_1²)/5 + (8x_1)/5 + 2x_2 ]

and the second column of L(x) is

    [ −(8x_1²x_2)/5 + (4x_2³)/5 − x_2/10
      (288x_1³x_2²)/5 + (16x_1²x_2)/5 − (142x_1x_2²)/5 − (9x_1)/20 − (8x_2³)/5 + (11x_2)/5
      −(288x_1³x_2²)/5 + (16x_1²x_2)/5 + (142x_1x_2²)/5 + (9x_1)/20 − (8x_2³)/5 + (11x_2)/5 ].
The objective is coercive, so f_min is achieved at a critical point. The minimum value f_min = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

    order k |  w/o LME: bound,  time |  with LME: bound,   time
       3    |   6.7535,  0.4611      |   56.7500,   0.1309
       4    |   6.9294,  0.2428      |  112.6517,   0.2405
       5    |   8.8519,  0.3376      |  112.6517,   0.2167
       6    |  16.5971,  0.4703      |  112.6517,   0.3788
       7    |  35.4756,  0.6536      |  112.6517,   0.4537
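A quick numeric check of the stated minimum and minimizers (our own script):

```python
# At (sqrt(1/2), sqrt(5/8) + sqrt(1/2)), the objective x1^2 + 50 x2^2 equals
# 56 + 3/4 + 25*sqrt(5) ~ 112.6517, and the first two constraints are active.
import math

x1 = math.sqrt(0.5)
x2 = math.sqrt(5/8) + math.sqrt(0.5)
fval = x1**2 + 50*x2**2

assert abs(fval - (56 + 0.75 + 25*math.sqrt(5))) < 1e-9
assert abs(x1**2 - 0.5) < 1e-12                 # x1^2 - 1/2 = 0   (active)
assert abs(x2**2 - 2*x1*x2 - 1/8) < 1e-9        # second constraint active
assert x2**2 + 2*x1*x2 - 1/8 > 0                # third constraint inactive
print(round(fval, 4))                           # 112.6517
```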
Example 6.5. Consider the optimization problem

    min_{x∈R³}  x_1³ + x_2³ + x_3³ + 4x_1x_2x_3 − ( x_1(x_2² + x_3²) + x_2(x_3² + x_1²) + x_3(x_1² + x_2²) )
    s.t. x_1 ≥ 0, x_1x_2 − 1 ≥ 0, x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1x_2   0     0   x_2   x_2   0
      x_1        0     0   −1    −1    0
      −x_1       x_2   0   1     0     −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R³₊, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and the global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

    order k |  w/o LME: bound,    time |  with LME: bound,  time
       2    |  −∞,           0.4129    |  −∞,       0.1900
       3    |  −7.8184·10⁶,  0.4641    |  0.9492,   0.3139
       4    |  −2.0575·10⁴,  0.6499    |  0.9492,   0.5057
Example 6.6. Consider the optimization problem (x_0 = 1)

    min_{x∈R⁴}  x^T x + Σ_{i=0}^{4} Π_{j≠i} (x_i − x_j)
    s.t. x_1² − 1 ≥ 0, x_2² − 1 ≥ 0, x_3² − 1 ≥ 0, x_4² − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers:

    (1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
    (1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
    (−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6

    order k |  w/o LME: bound,     time |  with LME: bound,  time
       3    |  −∞,            1.1377    |  3.5480,    1.1765
       4    |  −6.6913·10⁴,   4.7677    |  4.0000,    3.0761
       5    |  −21.3778,     22.9970    |  4.0000,   10.3354
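The count of 11 minimizers can be reproduced by brute force over the 16 sign patterns (our own script):

```python
# With x0 = 1, evaluate f(x) = x^T x + sum_{i=0}^{4} prod_{j != i} (x_i - x_j)
# over all sign vectors in {1, -1}^4 and count those with f = 4.
import itertools

def f(x):
    z = [1.0] + list(x)                       # prepend x0 = 1
    s = sum(xi*xi for xi in x)                # x^T x (over x1..x4 only)
    for i in range(5):
        p = 1.0
        for j in range(5):
            if j != i:
                p *= z[i] - z[j]
        s += p
    return s

minimizers = [x for x in itertools.product([1.0, -1.0], repeat=4)
              if abs(f(x) - 4.0) < 1e-12]
print(len(minimizers))                        # 11
```

The five excluded patterns are exactly those where some coordinate of (x_0, x) has a sign different from all the others, which makes one product term equal to 2⁴ = 16.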
Example 6.7. Consider the optimization problem

    min_{x∈R³}  x_1⁴x_2² + x_2⁴x_3² + x_3⁴x_1² − 3x_1²x_2²x_3² + x_2²
    s.t. x_1 − x_2x_3 ≥ 0, −x_2 + x_3² ≥ 0.

The matrix polynomial L(x) = [ 1     0    0  0  0
                               −x_3  −1   0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are the points (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for the standard Lasserre
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7

    order k |  w/o LME: bound,    time |  with LME: bound,   time
       3    |  −∞,           0.6144    |  −∞,         0.3418
       4    |  −1.0909·10⁷,  1.0542    |  −3.9476,    0.7180
       5    |  −942.6772,    1.6771    |  −3·10⁻⁹,    1.4607
       6    |  −0.0110,      3.3532    |  −8·10⁻¹⁰,   3.1618
Example 6.8. Consider the optimization problem

    min_{x∈R⁴}  x_1²(x_1 − x_4)² + x_2²(x_2 − x_4)² + x_3²(x_3 − x_4)²
                + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)² + (x_2 − 1)² + (x_3 − 1)²
    s.t. x_1 − x_2 ≥ 0, x_2 − x_3 ≥ 0.

The matrix polynomial L(x) = [ 1  0  0  0  0  0
                               1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and the minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

    order k |  w/o LME: bound,     time |  with LME: bound,  time
       2    |  −∞,            0.3984    |  −0.3360,   0.9321
       3    |  −∞,            0.7634    |   0.9413,   0.5240
       4    |  −6.4896·10⁵,   4.5496    |   0.9413,   1.7192
       5    |  −3.1645·10³,  24.3665    |   0.9413,   8.1228
Example 6.9. Consider the optimization problem

    min_{x∈R⁴}  (x_1 + x_2 + x_3 + x_4 + 1)² − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
    s.t. 0 ≤ x_1, . . . , x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. Condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.
⁵ Note that 4 − Σ_{i=1}^4 x_i² = Σ_{i=1}^4 ( x_i(1 − x_i)² + (1 − x_i)(1 + x_i²) ) ∈ IQ(c_eq, c_in).
Table 9. Computational results for Example 6.9

    order |  w/o LME: bound,   time |  with LME: bound,   time
      2   |  −0.0279,   0.2262      |  −5·10⁻⁶,    1.1835
      3   |  −0.0005,   0.4691      |  −6·10⁻⁷,    1.6566
      4   |  −0.0001,   3.1098      |  −2·10⁻⁷,    5.5234
      5   |  −4·10⁻⁵,  16.5092      |  −6·10⁻⁷,   19.7320
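A direct check that the whole segment of minimizers gives f = 0 (our own script):

```python
# The dehomogenized Horn form vanishes at (t, 0, 0, 1-t) for every t in [0, 1]:
# the square equals (t + (1-t) + 1)^2 = 4 and the subtracted part equals 4((1-t) + t) = 4.
def f(x1, x2, x3, x4):
    return (x1 + x2 + x3 + x4 + 1)**2 - 4*(x1*x2 + x2*x3 + x3*x4 + x4 + x1)

for k in range(11):
    t = k / 10
    assert abs(f(t, 0.0, 0.0, 1.0 - t)) < 1e-12
print("f vanishes on the segment (t, 0, 0, 1-t)")
```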
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest-order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but they might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x_0 = 1)

    min_{x∈Rⁿ}  Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
    s.t. 0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. Condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the relaxation order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9); the time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances and show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

    n        |    9    |   10    |   11    |   12     |   13     |   14
    w/o LME  |  1.2569 |  2.5619 |  6.3085 | 15.8722  | 35.1675  | 78.4111
    with LME |  1.9714 |  3.8288 |  8.2519 | 20.0310  | 37.6373  | 82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, . . . , g_ℓ). Its preordering is the set

    Preord(c_in, ψ) = Σ_{r_1, . . . , r_ℓ ∈ {0,1}} g_1^{r_1} · · · g_ℓ^{r_ℓ} Σ[x].
The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)    f′_{pre,k} = min ⟨f, y⟩
                     s.t. ⟨1, y⟩ = 1, L^{(k)}_{c_eq}(y) = 0, L^{(k)}_φ(y) = 0,
                          L^{(k)}_{g_1^{r_1} · · · g_ℓ^{r_ℓ}}(y) ⪰ 0 for all r_1, . . . , r_ℓ ∈ {0, 1},
                          y ∈ R^{N^n_{2k}}.
Similar to (3.9), the dual optimization problem of the above is

(7.2)    f_{pre,k} = max γ   s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.
Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

    f_{pre,k} = f′_{pre,k} = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f′_{pre,k} = f_min for all k big enough.
Proof. The proof is very similar to Case III in the proof of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

    K_3 = { x ∈ Rⁿ | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m; hence, the Lagrange multiplier vector λ can be expressed as in (5.3). However, if c is singular, such an L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, a degree bound for L(x) can be obtained: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and sharp degree bounds for Hilbert's Nullstellensatz exist [16]. If the degree bound in [16] is applied at each usage, a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

    λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator det( C(x)^T C(x) ) is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
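Numerically, this least-squares viewpoint is easy to try out. A sketch of ours (constraints 1 − x_i² ≥ 0 from Example 5.3; the objective f(x) = ‖x − 2‖² is our own illustrative choice):

```python
# Solve C(x) lambda = [grad f(x); 0] in the least-squares sense at a critical
# point, and compare with the polynomial expression lambda = L_1(x) grad f(x).
import numpy as np

def C(x):                                        # C(x) as in (5.1): gradients over diag
    return np.vstack([np.diag(-2*x), np.diag(1 - x**2)])

x = np.array([1.0, 1.0])                         # minimizer of ||x - 2||^2 over [-1, 1]^2
rhs = np.concatenate([2*(x - 2), np.zeros(2)])   # [grad f(x); 0]
lam, *_ = np.linalg.lstsq(C(x), rhs, rcond=None)

# Polynomial expression from Example 5.3: lambda_i = -(1/2) x_i * df/dx_i.
assert np.allclose(lam, -0.5 * x * 2*(x - 2))
print(lam)                                       # [1. 1.]
```

Both routes give λ = (1, 1), consistent with ∇f(x*) = λ_1∇c_1(x*) + λ_2∇c_2(x*) at x* = (1, 1).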
Acknowledgements. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for their valuable comments on improving the paper.
References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2008), no. 2, pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Löfberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. González-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar and S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093
E-mail address: njw@math.ucsd.edu
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 15
Step II: We show that there exist real scalars $\nu_I$ satisfying

(4.7) $\sum_{I \in V} \nu_I c_I(x) = 1.$

This can be shown by induction on $m$.
- When $m = n$, $V = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.
- When $m > n$, let

(4.8) $N = \{ i \in [m] \mid \operatorname{rank} A_{[m]\setminus\{i\}} = n \}.$

For each $i \in N$, let $V_i$ be the set of all $I' \subseteq [m]\setminus\{i\}$ such that $|I'| = m-n-1$ and $\operatorname{rank} A_{[m]\setminus(I' \cup \{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m]\setminus\{i\}}x - b_{[m]\setminus\{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying

(4.9) $\sum_{I' \in V_i} \nu^{(i)}_{I'} c_{I'}(x) = 1.$

Since $\operatorname{rank} A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \ldots, \alpha_n$ such that
$$a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.$$
If all $\alpha_i = 0$, then $a_m = 0$, and hence $A$ can be replaced by its first $m-1$ rows, so (4.7) is true by induction. In the following, suppose at least one $\alpha_i \ne 0$, and write
$$\{ i : \alpha_i \ne 0 \} = \{ i_1, \ldots, i_k \}.$$
Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system
$$c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0$$
has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that
$$\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.$$
(This can be deduced from the echelon form of an inconsistent linear system.) Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),
$$\sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'} c_{I'}(x) = 1.$$
Then we can get
$$1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'} c_{i_j}(x) c_{I'}(x) = \sum_{\substack{I = I' \cup \{i_j\} \\ I' \in V_{i_j},\; 1 \le j \le k+1}} \nu^{(i_j)}_{I'} \mu_j\, c_I(x).$$
Since each $I' \cup \{i_j\} \in V$, (4.7) must be satisfied by some scalars $\nu_I$.
16 JIAWANG NIE
Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as

(4.10) $L(x) = \sum_{I \in V} \nu_I c_I(x) L_I(x).$

Clearly, $L(x)$ satisfies (4.4), because
$$L(x)C(x) = \sum_{I \in V} \nu_I c_I(x) L_I(x) C(x) = \sum_{I \in V} \nu_I c_I(x) I_m = I_m.$$
Each $c_I(x)L_I(x)$ has degree $\le m-n$, so $L(x)$ has degree $\le m-n$. $\square$
Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \operatorname{rank} A$. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)$ with $L(x)C(x) = I_m$ for polyhedral sets.
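The coefficient-matching step just described can be sketched in a few lines: pick a degree-$(m-n)$ ansatz for $L(x)$, impose $L(x)C(x) = I_m$ at sample points (the identity is linear in the unknown coefficients), and solve by least squares. The numpy snippet below is an illustration under that ansatz (not code from the paper), for the simplex of Example 4.2 with $n = 2$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simplex in R^2 (Example 4.2 with n = 2): c1 = x1, c2 = x2, c3 = 1 - x1 - x2.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])  # rows are the gradients a_i
b = np.array([0.0, 0.0, -1.0])                         # c_i(x) = a_i . x - b_i
m, n = A.shape

def C(x):
    """The (n+m) x m matrix: gradients on top, diag(c(x)) below."""
    return np.vstack([A.T, np.diag(A @ x - b)])

def L(x, coef):
    """Degree-1 ansatz L(x) = L0 + x1*L1 + x2*L2 (degree <= m - n = 1)."""
    L0, L1, L2 = coef.reshape(3, m, m + n)
    return L0 + x[0] * L1 + x[1] * L2

# Impose L(x)C(x) = I_m at sample points; each entry is linear in the coefficients.
rows, rhs = [], []
for x in rng.normal(size=(40, n)):
    Cx, basis = C(x), np.array([1.0, x[0], x[1]])
    for i in range(m):
        for j in range(m):
            grad = np.zeros((3, m, m + n))
            grad[:, i, :] = np.outer(basis, Cx[:, j])  # d(LC)[i,j] / d(coef)
            rows.append(grad.ravel())
            rhs.append(1.0 if i == j else 0.0)
coef, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)

# Verify the identity at fresh points.
for x in rng.normal(size=(5, n)):
    assert np.allclose(L(x, coef) @ C(x), np.eye(m), atol=1e-6)
```

Here the degree bound $m - \operatorname{rank} A = 1$ from Proposition 4.1 is assumed for the ansatz; for larger instances one would enlarge the monomial basis accordingly.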
Example 4.2. Consider the simplicial set
$$\{x_1 \ge 0, \ldots, x_n \ge 0, \; 1 - e^Tx \ge 0\}.$$
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix} 1-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\ -x_1 & 1-x_2 & \cdots & -x_n & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ -x_1 & -x_2 & \cdots & 1-x_n & 1 & \cdots & 1 \\ -x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \end{bmatrix}.$$
Example 4.3. Consider the box constraint
$$\{x_1 \ge 0, \ldots, x_n \ge 0, \; 1 - x_1 \ge 0, \ldots, 1 - x_n \ge 0\}.$$
The equation $L(x)C(x) = I_{2n}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - \operatorname{diag}(x) & I_n & I_n \\ -\operatorname{diag}(x) & I_n & I_n \end{bmatrix}.$$
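As a quick numerical sanity check (an illustrative numpy sketch, not code from the paper), the identity $L(x)C(x) = I_{2n}$ for the box can be verified at random points:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4  # the box 0 <= x_i <= 1 has m = 2n constraints

def C(x):
    grads = np.hstack([np.eye(n), -np.eye(n)])   # gradients of x_i and 1 - x_i
    vals = np.concatenate([x, 1.0 - x])          # the constraint values
    return np.vstack([grads, np.diag(vals)])     # (3n) x (2n)

def L(x):
    I, D = np.eye(n), np.diag(x)
    return np.block([[I - D, I, I],
                     [   -D, I, I]])             # (2n) x (3n)

for _ in range(5):
    x = rng.normal(size=n)
    assert np.allclose(L(x) @ C(x), np.eye(2 * n))
```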
Example 4.4. Consider the polyhedral set
$$\{1 - x_4 \ge 0, \; x_4 - x_3 \ge 0, \; x_3 - x_2 \ge 0, \; x_2 - x_1 \ge 0, \; x_1 + 1 \ge 0\}.$$
The equation $L(x)C(x) = I_5$ is satisfied by
$$L(x) = \frac{1}{2}\begin{bmatrix} -x_1-1 & -x_2-1 & -x_3-1 & -x_4-1 & 1 & 1 & 1 & 1 & 1 \\ -x_1-1 & -x_2-1 & -x_3-1 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\ -x_1-1 & -x_2-1 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\ -x_1-1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\ 1-x_1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}.$$
Example 4.5. Consider the polyhedral set
$$\{1 + x_1 \ge 0, \; 1 - x_1 \ge 0, \; 2 - x_1 - x_2 \ge 0, \; 2 - x_1 + x_2 \ge 0\}.$$
The matrix $L(x)$ satisfying $L(x)C(x) = I_4$ is
$$\frac{1}{6}\begin{bmatrix} x_1^2 - 3x_1 + 2 & x_1x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\ 3x_1^2 - 3x_1 - 6 & 3x_2 + 3x_1x_2 & 6 - 3x_1 & -3x_1 & -3x_1 - 3 & -3x_1 - 3 \\ 1 - x_1^2 & -2x_2 - x_1x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\ 1 - x_1^2 & 3 - x_1x_2 - 2x_2 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \end{bmatrix}.$$
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.

Suppose there are in total $m$ equality and inequality constraints, i.e.,
$$E \cup I = \{1, \ldots, m\}.$$
If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in E \cup I$. So the Lagrange multiplier vector $\lambda = (\lambda_1, \ldots, \lambda_m)^T$ satisfies the equation

(5.1) $\underbrace{\begin{bmatrix} \nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\ c_1(x) & 0 & \cdots & 0 \\ 0 & c_2(x) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & c_m(x) \end{bmatrix}}_{C(x)} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$

Let $C(x)$ be as above. If there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ such that

(5.2) $L(x)C(x) = I_m,$

then we can get

(5.3) $\lambda = L(x)\begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x)\nabla f(x),$

where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such $L(x)$ exists.
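To make (5.1) and (5.3) concrete, here is a minimal numerical illustration on an invented toy instance (a sketch, not from the paper): minimize $f = x_1^2 + x_2^2$ subject to the single equality $c(x) = x_1 + x_2 - 1 = 0$. The minimizer is $x^* = (1/2, 1/2)$ with multiplier $\lambda = 1$; the constant matrix $L(x) = [1, 0, 0]$ satisfies $L(x)C(x) = I_1$, so (5.3) gives the polynomial expression $\lambda = 2x_1$.

```python
import numpy as np

def gradf(x):
    return 2.0 * x                      # f(x) = x1^2 + x2^2

def C(x):                               # the (n+m) x m matrix of (5.1), here 3 x 1
    return np.array([[1.0], [1.0], [x[0] + x[1] - 1.0]])

xstar = np.array([0.5, 0.5])            # the minimizer
rhs = np.concatenate([gradf(xstar), [0.0]])
lam, *_ = np.linalg.lstsq(C(xstar), rhs, rcond=None)  # solve C(x) lam = [grad f; 0]
assert np.isclose(lam[0], 1.0)

# L(x) = [1, 0, 0] satisfies L(x)C(x) = I_1 for all x, so lambda = L1(x).grad f = 2*x1.
L1 = np.array([[1.0, 0.0]])
assert np.isclose((L1 @ gradf(xstar))[0], lam[0])
```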
Definition 5.1. The tuple $c = (c_1, \ldots, c_m)$ of constraining polynomials is said to be nonsingular if $\operatorname{rank} C(u) = m$ for every $u \in \mathbb{C}^n$.

Clearly, $c$ being nonsingular is equivalent to the following: for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \ldots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.
Proposition 5.2. (i) For each $W(x) \in \mathbb{C}[x]^{s \times t}$ with $s \ge t$, $\operatorname{rank} W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t \times s}$ such that
$$P(x)W(x) = I_t.$$
Moreover, for $W(x) \in \mathbb{R}[x]^{s \times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t \times s}$ in the above.
(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ satisfying (5.2).
Proof. (i) "$\Leftarrow$": If $P(x)W(x) = I_t$, then for all $u \in \mathbb{C}^n$,
$$t = \operatorname{rank} I_t \le \operatorname{rank} W(u) \le t.$$
So $W(x)$ must have full column rank everywhere.

"$\Rightarrow$": Suppose $\operatorname{rank} W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:
$$W(x) = \begin{bmatrix} w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix}.$$
Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote
$$r_{1i}(x) = \xi_1(x)^T w_i(x);$$
then (we use $\sim$ to denote row equivalence between matrices)
$$W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix} \sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},$$
where each ($i = 2, \ldots, t$)
$$w_i^{(1)}(x) = w_i(x) - r_{1i}(x)w_1(x).$$
So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1) \times s}$ such that
$$P_1(x)W(x) = W_1(x).$$
Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
$$w_2^{(1)}(x) = 0$$
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
$$\xi_2(x)^T w_2^{(1)}(x) = 1.$$
For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then
$$W_1(x) \sim \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix} \sim W_2(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x) \end{bmatrix},$$
where each ($i = 3, \ldots, t$)
$$w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x)w_2^{(1)}(x).$$
Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2) \times (s+1)}$ such that
$$P_2(x)W_1(x) = W_2(x).$$
Continuing this process, we can finally get
$$W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & 1 & \cdots & r_{3t}(x) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}.$$
Consequently, there exists $P_i(x) \in \mathbb{C}[x]^{(s+i) \times (s+i-1)}$ for each $i = 1, 2, \ldots, t$ such that
$$P_t(x)P_{t-1}(x) \cdots P_1(x)W(x) = W_t(x).$$
Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t \times (s+t)}$ such that $P_{t+1}(x)W_t(x) = I_t$. Let
$$P(x) = P_{t+1}(x)P_t(x)P_{t-1}(x) \cdots P_1(x);$$
then $P(x)W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t \times s}$.

For $W(x) \in \mathbb{R}[x]^{s \times t}$, we can replace $P(x)$ by $\big(P(x) + \overline{P(x)}\big)/2$ (here $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). $\square$
In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, it can be determined by comparing coefficients on both sides of (5.2); this can be done by solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints
$$1 - x_1^2 \ge 0, \quad 1 - x_2^2 \ge 0, \quad \ldots, \quad 1 - x_n^2 \ge 0.$$
The equation $L(x)C(x) = I_n$ is satisfied by
$$L(x) = \begin{bmatrix} -\tfrac{1}{2}\operatorname{diag}(x) & I_n \end{bmatrix}.$$
Example 5.4. Consider the nonnegative portion of the unit sphere
$$x_1 \ge 0, \quad x_2 \ge 0, \quad \ldots, \quad x_n \ge 0, \quad x_1^2 + \cdots + x_n^2 - 1 = 0.$$
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - xx^T & x\mathbf{1}_n^T & 2x \\ \tfrac{1}{2}x^T & -\tfrac{1}{2}\mathbf{1}_n^T & -1 \end{bmatrix}.$$
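This $L(x)$ can be checked numerically (an illustrative numpy sketch, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3  # constraints: x_i >= 0 (i = 1..n) and x.x - 1 = 0, so m = n + 1

def C(x):
    grads = np.hstack([np.eye(n), 2.0 * x[:, None]])  # n x (n+1)
    vals = np.concatenate([x, [x @ x - 1.0]])
    return np.vstack([grads, np.diag(vals)])          # (2n+1) x (n+1)

def L(x):
    top = np.hstack([np.eye(n) - np.outer(x, x),
                     np.outer(x, np.ones(n)),
                     2.0 * x[:, None]])
    bot = np.hstack([0.5 * x[None, :], -0.5 * np.ones((1, n)), [[-1.0]]])
    return np.vstack([top, bot])                      # (n+1) x (2n+1)

for _ in range(5):
    x = rng.normal(size=n)
    assert np.allclose(L(x) @ C(x), np.eye(n + 1))
```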
Example 5.5. Consider the set
$$1 - x_1^3 - x_2^4 \ge 0, \quad 1 - x_3^4 - x_4^3 \ge 0.$$
The equation $L(x)C(x) = I_2$ is satisfied by
$$L(x) = \begin{bmatrix} -\tfrac{x_1}{3} & -\tfrac{x_2}{4} & 0 & 0 & 1 & 0 \\ 0 & 0 & -\tfrac{x_3}{4} & -\tfrac{x_4}{3} & 0 & 1 \end{bmatrix}.$$
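Again the identity can be checked numerically (an illustrative sketch with hand-coded gradients, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)

def C(x):
    x1, x2, x3, x4 = x
    c = np.array([1 - x1**3 - x2**4, 1 - x3**4 - x4**3])
    grads = np.array([[-3*x1**2, 0.0],
                      [-4*x2**3, 0.0],
                      [0.0, -4*x3**3],
                      [0.0, -3*x4**2]])          # 4 x 2, columns are grad c_i
    return np.vstack([grads, np.diag(c)])        # 6 x 2

def L(x):
    x1, x2, x3, x4 = x
    return np.array([[-x1/3, -x2/4, 0, 0, 1, 0],
                     [0, 0, -x3/4, -x4/3, 0, 1]])

for _ in range(5):
    x = rng.normal(size=4)
    assert np.allclose(L(x) @ C(x), np.eye(2))
```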
Example 5.6. Consider the quadratic set
$$1 - x_1x_2 - x_2x_3 - x_1x_3 \ge 0, \quad 1 - x_1^2 - x_2^2 - x_3^2 \ge 0.$$
The matrix $L(x)^T$ satisfying $L(x)C(x) = I_2$ is
$$\begin{bmatrix} 25x_1^3 + 10x_1^2x_2 + 40x_1x_2^2 - 25x_1 - 2x_3 & -25x_1^3 - 10x_1^2x_2 - 40x_1x_2^2 + \tfrac{49}{2}x_1 + 2x_3 \\ -15x_1^2x_2 + 10x_1x_2^2 + 20x_3x_1x_2 - 10x_1 & 15x_1^2x_2 - 10x_1x_2^2 - 20x_3x_1x_2 + 10x_1 - \tfrac{x_2}{2} \\ 25x_3x_1^2 - 20x_1x_2^2 + 10x_3x_1x_2 + 2x_1 & -25x_3x_1^2 + 20x_1x_2^2 - 10x_3x_1x_2 - 2x_1 - \tfrac{x_3}{2} \\ 1 - 20x_1x_3 - 10x_1^2 - 20x_1x_2 & 20x_1x_2 + 20x_1x_3 + 10x_1^2 \\ -50x_1^2 - 20x_2x_1 & 50x_1^2 + 20x_2x_1 + 1 \end{bmatrix}.$$
We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and that Assumption 3.1 holds generically.
Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $U$ of $D = \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1, \ldots, c_m) \in U$ is nonsingular. Indeed, such $U$ can be chosen as a Zariski open subset of $D$, i.e., it is the complement of a proper real variety of $D$. Moreover, Assumption 3.1 holds for all $c \in U$, i.e., it holds generically.
Proof. The proof needs to use resultants and discriminants, for which we refer to [29]. First, let $J_1$ be the set of all $(i_1, \ldots, i_{n+1})$ with $1 \le i_1 < \cdots < i_{n+1} \le m$. The resultant $\operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that if $\operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \ne 0$ then the equations
$$c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0$$
have no complex solutions. Define
$$F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}).$$
For the case $m \le n$, $J_1 = \emptyset$ and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \ne 0$, then no more than $n$ of the polynomials $c_1, \ldots, c_m$ have a common complex zero.

Second, let $J_2$ be the set of all $(j_1, \ldots, j_k)$ with $k \le n$ and $1 \le j_1 < \cdots < j_k \le m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that if $\Delta(c_{j_1}, \ldots, c_{j_k}) \ne 0$ then the equations
$$c_{j_1}(x) = \cdots = c_{j_k}(x) = 0$$
have no singular complex solution [29, Section 3], i.e., at every common complex solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximal minors of its Jacobian (a constant matrix). Define
$$F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}).$$
Clearly, if $F_2(c) \ne 0$, then no $n$ or fewer of the polynomials $c_1, \ldots, c_m$ have a singular complex common zero.

Last, let $F(c) = F_1(c)F_2(c)$ and
$$U = \{c = (c_1, \ldots, c_m) \in D : F(c) \ne 0\}.$$
Note that $U$ is a Zariski open subset of $D$, and it is open dense in $D$. For all $c \in U$, no more than $n$ of $c_1, \ldots, c_m$ can have a complex common zero, and for any $k$ polynomials ($k \le n$) of $c_1, \ldots, c_m$, if they have a complex common zero, say $u$, then their gradients at $u$ must be linearly independent. This means that every such $c$ is a nonsingular tuple.

Since every $c \in U$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in U$, so it holds generically. $\square$
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.
The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x)\nabla f(x)$, i.e.,
$$p_i = \big(L_1(x)\nabla f(x)\big)_i.$$

In all our examples, the global minimum value $f_{\min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sublevel set $\{f(x) \le \ell\}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{\min}$ for all $k$ big enough if $f_c = f_{\min}$ and any one of the conditions i)-iii) there holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all; the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem
$$\begin{array}{rl} \min & x_1x_2(10 - x_3) \\ \text{s.t.} & x_1 \ge 0, \; x_2 \ge 0, \; x_3 \ge 0, \; 1 - x_1 - x_2 - x_3 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{\min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{\min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1.

  order k | w/o LME (bound, time) | with LME (bound, time)
  2       | -0.0521, 0.6841       | -0.0521, 0.1922
  3       | -0.0026, 0.2657       | -3·10^{-8}, 0.2285
  4       | -0.0007, 0.6785       | -6·10^{-9}, 0.4431
  5       | -0.0004, 1.6105       | -2·10^{-9}, 0.9567
² Note that $1 - x^Tx = (1 - e^Tx)(1 + x^Tx) + \sum_{i=1}^n x_i(1 - x_i)^2 + \sum_{i \ne j} x_i^2 x_j \in \operatorname{IQ}(c_{eq}, c_{in})$.
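The membership claimed in this footnote rests on a polynomial identity, which can be checked symbolically (a sympy sketch, here for $n = 3$; not code from the paper):

```python
import sympy as sp

n = 3
x = sp.symbols(f'x1:{n + 1}')
lhs = 1 - sum(xi**2 for xi in x)
rhs = ((1 - sum(x)) * (1 + sum(xi**2 for xi in x))
       + sum(xi * (1 - xi)**2 for xi in x)
       + sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j))
assert sp.expand(lhs - rhs) == 0  # the identity holds
```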
Example 6.2. Consider the optimization problem
$$\begin{array}{rl} \min & x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\ \text{s.t.} & x_1^2 + x_2^2 + x_3^2 \ge 1. \end{array}$$
The matrix polynomial $L(x) = \begin{bmatrix} \tfrac12 x_1 & \tfrac12 x_2 & \tfrac12 x_3 & -1 \end{bmatrix}$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
$$M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2.$$
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive and $f_{\min}$ is achieved at a critical point. The set $\operatorname{IQ}(\phi, \psi)$ is archimedean, because
$$c_1(x)p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big(3M(x) + 2(x_1^4 + x_2^4 + x_3^4)\big) = 0$$
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{\min} = \tfrac13$, and there are 8 minimizers $(\pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}})$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{\min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2.

  order k | w/o LME (bound, time) | with LME (bound, time)
  3       | -∞, 0.4466            | 0.1111, 0.1169
  4       | -∞, 0.4948            | 0.3333, 0.3499
  5       | -2.1821·10^5, 1.1836  | 0.3333, 0.6530
Example 6.3. Consider the optimization problem
$$\begin{array}{rl} \min & x_1x_2 + x_2x_3 + x_3x_4 - 3x_1x_2x_3x_4 + (x_1^3 + \cdots + x_4^3) \\ \text{s.t.} & x_1, x_2, x_3, x_4 \ge 0, \; 1 - x_1 - x_2 \ge 0, \; 1 - x_3 - x_4 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix} 1-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\ -x_1 & 1-x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1-x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & -x_3 & 1-x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\ -x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \end{bmatrix}.$$
The feasible set is compact, so $f_{\min}$ is achieved at a critical point. One can show that $f_{\min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $\operatorname{IQ}(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.
³ This is because $-c_1^2p_1^2 \in \operatorname{Ideal}(\phi) \subseteq \operatorname{IQ}(\phi, \psi)$ and the set $\{-c_1(x)^2p_1(x)^2 \ge 0\}$ is compact.
⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$ and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3.

  order k | w/o LME (bound, time)  | with LME (bound, time)
  3       | -2.9·10^{-5}, 0.7335   | -6·10^{-7}, 0.6091
  4       | -1.4·10^{-5}, 2.5055   | -8·10^{-8}, 2.7423
  5       | -1.4·10^{-5}, 12.7092  | -5·10^{-8}, 13.7449
Example 6.4. Consider the polynomial optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^2} & x_1^2 + 50x_2^2 \\ \text{s.t.} & x_1^2 - \tfrac12 \ge 0, \; x_2^2 - 2x_1x_2 - \tfrac18 \ge 0, \; x_2^2 + 2x_1x_2 - \tfrac18 \ge 0. \end{array}$$
It is motivated by an example in [12, §3]. The first column of $L(x)$ is
$$\begin{bmatrix} \tfrac{8x_1^3}{5} + \tfrac{x_1}{5} \\[2pt] \tfrac{288x_2x_1^4}{5} - \tfrac{16x_1^3}{5} - \tfrac{124x_2x_1^2}{5} + \tfrac{8x_1}{5} - 2x_2 \\[2pt] -\tfrac{288x_2x_1^4}{5} - \tfrac{16x_1^3}{5} + \tfrac{124x_2x_1^2}{5} + \tfrac{8x_1}{5} + 2x_2 \end{bmatrix},$$
and the second column of $L(x)$ is
$$\begin{bmatrix} -\tfrac{8x_1^2x_2}{5} + \tfrac{4x_2^3}{5} - \tfrac{x_2}{10} \\[2pt] \tfrac{288x_1^3x_2^2}{5} + \tfrac{16x_1^2x_2}{5} - \tfrac{142x_1x_2^2}{5} - \tfrac{9x_1}{20} - \tfrac{8x_2^3}{5} + \tfrac{11x_2}{5} \\[2pt] -\tfrac{288x_1^3x_2^2}{5} + \tfrac{16x_1^2x_2}{5} + \tfrac{142x_1x_2^2}{5} + \tfrac{9x_1}{20} - \tfrac{8x_2^3}{5} + \tfrac{11x_2}{5} \end{bmatrix}.$$
The objective is coercive, so $f_{\min}$ is achieved at a critical point. The minimum value $f_{\min} = 56 + \tfrac34 + 25\sqrt{5} \approx 112.6517$, and the minimizers are $\big(\pm\sqrt{1/2}, \; \pm(\sqrt{5/8} + \sqrt{1/2})\big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{\min}$ for all $k \ge 4$, up to numerical round-off errors.
Table 4. Computational results for Example 6.4.

  order k | w/o LME (bound, time) | with LME (bound, time)
  3       | 6.7535, 0.4611        | 56.7500, 0.1309
  4       | 6.9294, 0.2428        | 112.6517, 0.2405
  5       | 8.8519, 0.3376        | 112.6517, 0.2167
  6       | 16.5971, 0.4703       | 112.6517, 0.3788
  7       | 35.4756, 0.6536       | 112.6517, 0.4537
Example 6.5. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 - \big(x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2)\big) \\ \text{s.t.} & x_1 \ge 0, \; x_1x_2 - 1 \ge 0, \; x_2x_3 - 1 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix} 1 - x_1x_2 & 0 & 0 & x_2 & x_2 & 0 \\ x_1 & 0 & 0 & -1 & -1 & 0 \\ -x_1 & x_2 & 0 & 1 & 0 & -1 \end{bmatrix}.$$
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{\min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{\min}$ for all $k \ge 3$, up to numerical round-off errors.
Table 5. Computational results for Example 6.5.

  order k | w/o LME (bound, time) | with LME (bound, time)
  2       | -∞, 0.4129            | -∞, 0.1900
  3       | -7.8184·10^6, 0.4641  | 0.9492, 0.3139
  4       | -2.0575·10^4, 0.6499  | 0.9492, 0.5057
Example 6.6. Consider the optimization problem (with $x_0 = 1$)
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x^Tx + \sum_{i=0}^4 \prod_{j \ne i}(x_i - x_j) \\ \text{s.t.} & x_1^2 - 1 \ge 0, \; x_2^2 - 1 \ge 0, \; x_3^2 - 1 \ge 0, \; x_4^2 - 1 \ge 0. \end{array}$$
The matrix polynomial $L(x) = \begin{bmatrix} \tfrac12\operatorname{diag}(x) & -I_4 \end{bmatrix}$. The first part of the objective is $x^Tx$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{\min}$ is achieved at a critical point. In computation, we got $f_{\min} = 4.0000$ and 11 global minimizers:
$$(1,1,1,1), \quad (1,-1,-1,1), \quad (1,-1,1,-1), \quad (1,1,-1,-1),$$
$$(1,-1,-1,-1), \quad (-1,-1,1,1), \quad (-1,1,-1,1), \quad (-1,1,1,-1),$$
$$(-1,-1,-1,1), \quad (-1,-1,1,-1), \quad (-1,1,-1,-1).$$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{\min}$ for all $k \ge 4$, up to numerical round-off errors.
Table 6. Computational results for Example 6.6.

  order k | w/o LME (bound, time) | with LME (bound, time)
  3       | -∞, 1.1377            | 3.5480, 1.1765
  4       | -6.6913·10^4, 4.7677  | 4.0000, 3.0761
  5       | -21.3778, 22.9970     | 4.0000, 10.3354
Example 6.7. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 - 3x_1^2x_2^2x_3^2 + x_2^2 \\ \text{s.t.} & x_1 - x_2x_3 \ge 0, \; -x_2 + x_3^2 \ge 0. \end{array}$$
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{\min} = 0$. The global minimizers are the points $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1x_3 = 0$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{\min}$ for all $k \ge 5$, up to numerical round-off errors.
Table 7. Computational results for Example 6.7.

  order k | w/o LME (bound, time) | with LME (bound, time)
  3       | -∞, 0.6144            | -∞, 0.3418
  4       | -1.0909·10^7, 1.0542  | -3.9476, 0.7180
  5       | -942.6772, 1.6771     | -3·10^{-9}, 1.4607
  6       | -0.0110, 3.3532       | -8·10^{-10}, 3.1618
Example 6.8. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x_1^2(x_1 - x_4)^2 + x_2^2(x_2 - x_4)^2 + x_3^2(x_3 - x_4)^2 \\ & + \, 2x_1x_2x_3(x_1 + x_2 + x_3 - 2x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\ \text{s.t.} & x_1 - x_2 \ge 0, \; x_2 - x_3 \ge 0. \end{array}$$
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{\min}$ is achieved at a critical point. In computation, we got $f_{\min} \approx 0.9413$ and the minimizer
$$(0.5632, \; 0.5632, \; 0.5632, \; 0.7510).$$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{\min}$ for all $k \ge 3$, up to numerical round-off errors.
Table 8. Computational results for Example 6.8.

  order k | w/o LME (bound, time) | with LME (bound, time)
  2       | -∞, 0.3984            | -0.3360, 0.9321
  3       | -∞, 0.7634            | 0.9413, 0.5240
  4       | -6.4896·10^5, 4.5496  | 0.9413, 1.7192
  5       | -3.1645·10^3, 24.3665 | 0.9413, 8.1228
Example 6.9. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1) \\ \text{s.t.} & 0 \le x_1, \ldots, x_4 \le 1. \end{array}$$
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{\min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value $f_{\min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1-t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.
⁵ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big( x_i(1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in \operatorname{IQ}(c_{eq}, c_{in})$.
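The identity behind this footnote is easy to verify symbolically (a sympy sketch, not code from the paper):

```python
import sympy as sp

x = sp.symbols('x1:5')
lhs = 4 - sum(xi**2 for xi in x)
rhs = sum(xi * (1 - xi)**2 + (1 - xi) * (1 + xi**2) for xi in x)
assert sp.expand(lhs - rhs) == 0  # each summand equals 1 - x_i^2
```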
Table 9. Computational results for Example 6.9.

  order | w/o LME (bound, time) | with LME (bound, time)
  2     | -0.0279, 0.2262       | -5·10^{-6}, 1.1835
  3     | -0.0005, 0.4691       | -6·10^{-7}, 1.6566
  4     | -0.0001, 3.1098       | -2·10^{-7}, 5.5234
  5     | -4·10^{-5}, 16.5092   | -6·10^{-7}, 19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (with $x_0 = 1$)
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^n} & \sum_{0 \le i \le j \le k \le n} c_{ijk}\, x_i x_j x_k \\ \text{s.t.} & 0 \le x \le 1, \end{array}$$
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{\min}$. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their consumed time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10.

  n        | 9      | 10     | 11     | 12      | 13      | 14
  w/o LME  | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111
  with LME | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{\min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
$$\operatorname{IQ}(c_{eq}, c_{in})_{2k} + \operatorname{IQ}(\phi, \psi)_{2k} = \operatorname{Ideal}(c_{eq}, \phi)_{2k} + \operatorname{Qmod}(c_{in}, \psi)_{2k}.$$
If we replace the quadratic module $\operatorname{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
$$\operatorname{Preord}(c_{in}, \psi) = \sum_{r_1, \ldots, r_\ell \in \{0,1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x].$$
The truncation $\operatorname{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\operatorname{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same relaxation order $k$, is

(7.1) $\begin{array}{rl} f'_{pre,k} = \min & \langle f, y \rangle \\ \text{s.t.} & \langle 1, y \rangle = 1, \; L^{(k)}_{c_{eq}}(y) = 0, \; L^{(k)}_{\phi}(y) = 0, \\ & L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \;\; \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\ & y \in \mathbb{R}^{\mathbb{N}^n_{2k}}. \end{array}$

Similar to (3.9), the dual optimization problem of the above is

(7.2) $\begin{array}{rl} f_{pre,k} = \max & \gamma \\ \text{s.t.} & f - \gamma \in \operatorname{Ideal}(c_{eq}, \phi)_{2k} + \operatorname{Preord}(c_{in}, \psi)_{2k}. \end{array}$

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.
Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
$$f_{pre,k} = f'_{pre,k} = f_c$$
for all $k$ sufficiently large. Therefore, if the minimum value $f_{\min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{\min}$ for all $k$ big enough.
Proof. The proof is very similar to Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
$$K_3 = \{x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \; \phi(x) = 0, \; c_{in}(x) \ge 0, \; \psi(x) \ge 0\}.$$
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \operatorname{Preord}(c_{in}, \psi)$ such that $f^{2\ell} + q \in \operatorname{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same. $\square$
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x) = I_m$; hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such an $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for $L(x)$. For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1; however, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each application, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained in this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
$$\lambda = \big(C(x)^TC(x)\big)^{-1}C(x)^T\begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}$$
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above one. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgements. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for their valuable comments on improving the paper.
References
[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput. 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties and algorithms: an introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC].
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim. 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc. 1 (1988), no. 4, pp. 963–975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim. 21 (2011), pp. 864–885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim. 19 (2009), pp. 1995–2014.
[21] J. B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (M. Putinar and S. Sullivant, eds.), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim. 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J. 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim. 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J. F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim. 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA.

E-mail address: njw@math.ucsd.edu
Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)    L(x) = ∑_{I∈V} ν_I c_I(x) L_I(x).

Clearly, L(x) satisfies (4.4), because

    L(x)C(x) = ∑_{I∈V} ν_I c_I(x) L_I(x)C(x) = ∑_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n.
Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x) with L(x)C(x) = I_m for polyhedral sets.
Example 4.2. Consider the simplicial set

    {x1 ≥ 0, ..., xn ≥ 0, 1 − e^T x ≥ 0}.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ 1−x1   −x2   ···   −xn   1 ··· 1 ]
           [ −x1   1−x2   ···   −xn   1 ··· 1 ]
           [  ⋮      ⋮     ⋱     ⋮    ⋮     ⋮ ]
           [ −x1   −x2   ···  1−xn   1 ··· 1 ]
           [ −x1   −x2   ···   −xn   1 ··· 1 ]
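Since the identity L(x)C(x) = I_{n+1} is purely algebraic, it can be spot-checked numerically. The sketch below (ours, not from the paper) does this for n = 2, building C(x) by stacking the constraint gradients over diag(c1, c2, c3) as in the later equation (5.1) and evaluating at random points:

```python
import numpy as np

# Constraints for n = 2: c1 = x1, c2 = x2, c3 = 1 - x1 - x2.
# C(x) stacks the gradients over diag(c1, c2, c3); L(x) is as displayed above.
rng = np.random.default_rng(0)
for _ in range(5):
    x1, x2 = rng.standard_normal(2)
    C = np.array([
        [1.0, 0.0, -1.0],
        [0.0, 1.0, -1.0],
        [x1,  0.0,  0.0],
        [0.0, x2,   0.0],
        [0.0, 0.0,  1.0 - x1 - x2],
    ])
    L = np.array([
        [1.0 - x1, -x2,      1.0, 1.0, 1.0],
        [-x1,      1.0 - x2, 1.0, 1.0, 1.0],
        [-x1,      -x2,      1.0, 1.0, 1.0],
    ])
    assert np.allclose(L @ C, np.eye(3))   # L(x) C(x) = I_3 at this point
```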
Example 4.3. Consider the box constraint

    {x1 ≥ 0, ..., xn ≥ 0, 1 − x1 ≥ 0, ..., 1 − xn ≥ 0}.

The equation L(x)C(x) = I_{2n} is satisfied by

    L(x) = [ I_n − diag(x)   I_n   I_n ]
           [ −diag(x)        I_n   I_n ]
Example 4.4. Consider the polyhedral set

    {1 − x4 ≥ 0, x4 − x3 ≥ 0, x3 − x2 ≥ 0, x2 − x1 ≥ 0, x1 + 1 ≥ 0}.

The equation L(x)C(x) = I_5 is satisfied by

    L(x) = (1/2) · [ −x1−1   −x2−1   −x3−1   −x4−1   1 1 1 1 1 ]
                   [ −x1−1   −x2−1   −x3−1   1−x4    1 1 1 1 1 ]
                   [ −x1−1   −x2−1   1−x3    1−x4    1 1 1 1 1 ]
                   [ −x1−1   1−x2    1−x3    1−x4    1 1 1 1 1 ]
                   [ 1−x1    1−x2    1−x3    1−x4    1 1 1 1 1 ]
Example 4.5. Consider the polyhedral set

    {1 + x1 ≥ 0, 1 − x1 ≥ 0, 2 − x1 − x2 ≥ 0, 2 − x1 + x2 ≥ 0}.

The matrix L(x) satisfying L(x)C(x) = I_4 is

    L(x) = (1/6) · [ x1² − 3x1 + 2    x1x2 − x2         4 − x1    2 − x1    1 − x1     1 − x1   ]
                   [ 3x1² − 3x1 − 6   3x2 + 3x1x2       6 − 3x1   −3x1      −3x1 − 3   −3x1 − 3 ]
                   [ 1 − x1²          −2x2 − x1x2 − 3   x1 − 1    x1 + 1    x1 + 2     x1 + 2   ]
                   [ 1 − x1²          3 − x1x2 − 2x2    x1 − 1    x1 + 1    x1 + 2     x1 + 2   ]
5. General constraints

We consider general nonlinear constraints as in (1.1); the critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λi as polynomial functions in x on the set of critical points.

Suppose there are m equality and inequality constraints in total, i.e.,

    E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λi ci(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ1, ..., λm]^T satisfies the equation

(5.1)    C(x) λ = [ ∇f(x); 0; ...; 0 ],   where   C(x) := [ ∇c1(x)   ∇c2(x)   ···   ∇cm(x) ]
                                                          [ c1(x)    0        ···   0      ]
                                                          [ 0        c2(x)    ···   0      ]
                                                          [ ⋮         ⋮        ⋱     ⋮      ]
                                                          [ 0        0        ···   cm(x)  ]

Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)    L(x) C(x) = I_m,

then we can get

(5.3)    λ = L(x) [ ∇f(x); 0 ] = L1(x) ∇f(x),

where L1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such an L(x) exists.
Definition 5.1. The tuple c = (c1, ..., cm) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ Cⁿ.

Clearly, c being nonsingular means that, for each u ∈ Cⁿ, if J(u) = {i1, ..., ik} (see (1.2) for the notation), then the gradients ∇ci1(u), ..., ∇cik(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ Cⁿ if and only if there exists P(x) ∈ C[x]^{t×s} such that

    P(x) W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).

Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ Cⁿ,

    t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ Cⁿ. Write W(x) in columns:

    W(x) = [ w1(x)   w2(x)   ···   wt(x) ].
Then the equation w1(x) = 0 has no complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ1(x) ∈ C[x]^s such that ξ1(x)^T w1(x) = 1. For each i = 2, ..., t, denote

    r1i(x) = ξ1(x)^T wi(x);

then (we use ∼ to denote row equivalence between matrices)

    W(x) ∼ [ 1       r12(x)   ···   r1t(x) ]   ∼   W1(x) := [ 1   r12(x)      ···   r1t(x)    ]
           [ w1(x)   w2(x)    ···   wt(x)  ]                [ 0   w2^(1)(x)   ···   wt^(1)(x) ]

where each (i = 2, ..., t)

    wi^(1)(x) = wi(x) − r1i(x) w1(x).

So there exists P1(x) ∈ C[x]^{(s+1)×s} such that

    P1(x) W(x) = W1(x).
Since W(x) and W1(x) are row equivalent, W1(x) must also have full column rank everywhere. Similarly, the polynomial equation

    w2^(1)(x) = 0

has no complex solution. Again by Hilbert's Weak Nullstellensatz [4], there exists ξ2(x) ∈ C[x]^s such that

    ξ2(x)^T w2^(1)(x) = 1.

For each i = 3, ..., t, let r2i(x) = ξ2(x)^T wi^(1)(x); then

    W1(x) ∼ [ 1   r12(x)      r13(x)      ···   r1t(x)    ]   ∼   W2(x) := [ 1   r12(x)   r13(x)      ···   r1t(x)    ]
            [ 0   1           r23(x)      ···   r2t(x)    ]                [ 0   1        r23(x)      ···   r2t(x)    ]
            [ 0   w2^(1)(x)   w3^(1)(x)   ···   wt^(1)(x) ]                [ 0   0        w3^(2)(x)   ···   wt^(2)(x) ]

where each (i = 3, ..., t)

    wi^(2)(x) = wi^(1)(x) − r2i(x) w2^(1)(x).

Similarly, W1(x) and W2(x) are row equivalent, so W2(x) has full column rank everywhere. There exists P2(x) ∈ C[x]^{(s+2)×(s+1)} such that

    P2(x) W1(x) = W2(x).

Continuing this process, we finally get

    W2(x) ∼ ··· ∼ Wt(x) := [ 1   r12(x)   r13(x)   ···   r1t(x) ]
                           [ 0   1        r23(x)   ···   r2t(x) ]
                           [ 0   0        1        ···   r3t(x) ]
                           [ ⋮                      ⋱     ⋮      ]
                           [ 0   0        0        ···   1      ]
                           [ 0   0        0        ···   0      ]

Consequently, there exist Pi(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

    Pt(x) Pt−1(x) ··· P1(x) W(x) = Wt(x).
Since Wt(x) is a unit upper triangular matrix polynomial, there exists Pt+1(x) ∈ C[x]^{t×(s+t)} such that Pt+1(x) Wt(x) = I_t. Let

    P(x) = Pt+1(x) Pt(x) Pt−1(x) ··· P1(x);

then P(x) W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (here P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). □
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing coefficients on both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
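The coefficient-matching procedure just described can be illustrated with a tiny sympy sketch (our own construction, not from the paper): for the single constraint c(x) = 1 − x², a degree-1 ansatz L(x) = [a(x), b(x)] turns L(x)C(x) = 1 into a linear system in the unknown coefficients.

```python
import sympy as sp

# Single constraint c(x) = 1 - x^2, so C(x) = [c'(x); c(x)] and we seek
# L(x) = [a(x), b(x)] with a(x) c'(x) + b(x) c(x) = 1.  The ansatz degree
# (here 1) is a guess; matching coefficients gives a linear system.
x, a0, a1, b0, b1 = sp.symbols('x a0 a1 b0 b1')
a = a0 + a1 * x
b = b0 + b1 * x
c = 1 - x**2
identity = sp.expand(a * sp.diff(c, x) + b * c - 1)
eqs = sp.Poly(identity, x).coeffs()          # coefficient of each power of x
sol = sp.solve(eqs, [a0, a1, b0, b1], dict=True)[0]
assert sol[a0] == 0 and sol[a1] == sp.Rational(-1, 2)
assert sol[b0] == 1 and sol[b1] == 0
```

The solution recovers L(x) = [−x/2, 1], which agrees with Example 5.3 below in the case n = 1.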
Example 5.3. Consider the hypercube with quadratic constraints

    {1 − x1² ≥ 0, 1 − x2² ≥ 0, ..., 1 − xn² ≥ 0}.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −(1/2) diag(x)   I_n ].
Example 5.4. Consider the nonnegative portion of the unit sphere

    {x1 ≥ 0, x2 ≥ 0, ..., xn ≥ 0, x1² + ··· + xn² − 1 = 0}.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − x x^T    x 1_n^T        2x ]
           [ (1/2) x^T      −(1/2) 1_n^T   −1 ].
Example 5.5. Consider the set

    {1 − x1³ − x2⁴ ≥ 0, 1 − x3⁴ − x4³ ≥ 0}.

The equation L(x)C(x) = I_2 is satisfied by

    L(x) = [ −x1/3   −x2/4   0       0       1   0 ]
           [ 0        0      −x3/4   −x4/3   0   1 ].
Example 5.6. Consider the quadratic set

    {1 − x1x2 − x2x3 − x1x3 ≥ 0, 1 − x1² − x2² − x3² ≥ 0}.
The matrix L(x)^T satisfying L(x)C(x) = I_2 is

    [ 25x1³ + 10x1²x2 + 40x1x2² − 25x1 − 2x3    −25x1³ − 10x1²x2 − 40x1x2² + 49x1² + 2x3  ]
    [ −15x1²x2 + 10x1x2² + 20x1x2x3 − 10x1      15x1²x2 − 10x1x2² − 20x1x2x3 + 10x1 − x2² ]
    [ 25x1²x3 − 20x1x2² + 10x1x2x3 + 2x1        −25x1²x3 + 20x1x2² − 10x1x2x3 − 2x1 − x3² ]
    [ 1 − 20x1x3 − 10x1² − 20x1x2               20x1x2 + 20x1x3 + 10x1²                   ]
    [ −50x1² − 20x1x2                           50x1² + 20x1x2 + 1                        ]
We would like to remark that a polynomial tuple c = (c1, ..., cm) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d1, ..., dm, there exists an open dense subset U of D = R[x]_{d1} × ··· × R[x]_{dm} such that every tuple c = (c1, ..., cm) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof uses resultants and discriminants, for which we refer to [29]. First, let J1 be the set of all (i1, ..., in+1) with 1 ≤ i1 < ··· < in+1 ≤ m. The resultant Res(ci1, ..., cin+1) [29, Section 2] is a polynomial in the coefficients of ci1, ..., cin+1 such that, if Res(ci1, ..., cin+1) ≠ 0, then the equations

    ci1(x) = ··· = cin+1(x) = 0

have no complex solutions. Define

    F1(c) = ∏_{(i1,...,in+1)∈J1} Res(ci1, ..., cin+1).

For the case m ≤ n, we have J1 = ∅, and we simply let F1(c) = 1. Clearly, if F1(c) ≠ 0, then no more than n of the polynomials c1, ..., cm have a common complex zero.

Second, let J2 be the set of all (j1, ..., jk) with k ≤ n and 1 ≤ j1 < ··· < jk ≤ m. When one of cj1, ..., cjk has degree bigger than one, the discriminant ∆(cj1, ..., cjk) is a polynomial in the coefficients of cj1, ..., cjk such that, if ∆(cj1, ..., cjk) ≠ 0, then the equations

    cj1(x) = ··· = cjk(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of cj1, ..., cjk at u are linearly independent. When all of cj1, ..., cjk have degree one, the discriminant of the tuple (cj1, ..., cjk) is not a single polynomial, but we can define ∆(cj1, ..., cjk) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

    F2(c) = ∏_{(j1,...,jk)∈J2} ∆(cj1, ..., cjk).

Clearly, if F2(c) ≠ 0, then no n or fewer of the polynomials c1, ..., cm have a singular common complex zero.

Last, let F(c) = F1(c) F2(c) and

    U = {c = (c1, ..., cm) ∈ D : F(c) ≠ 0}.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c1, ..., cm can have a common complex zero; and for any k polynomials (k ≤ n) of c1, ..., cm, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with the usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for computational results.
The polynomials pi in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c1, ..., cm. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L1(x) be the submatrix of L(x) consisting of the first n columns. Then choose (p1, ..., pm) to be the product L1(x)∇f(x), i.e.,

    pi = ( L1(x)∇f(x) )_i.

In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.

By Theorem 3.3, we have fk = fmin for all k big enough if fc = fmin and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all; the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, and the computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

    min   x1x2(10 − x3)
    s.t.  x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, 1 − x1 − x2 − x3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x1, x2, x3) with x1x2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     2    |  −0.0521,   0.6841           |  −0.0521,   0.1922
     3    |  −0.0026,   0.2657           |  −3·10⁻⁸,   0.2285
     4    |  −0.0007,   0.6785           |  −6·10⁻⁹,   0.4431
     5    |  −0.0004,   1.6105           |  −2·10⁻⁹,   0.9567
² Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + ∑_{i=1}^{n} x_i(1 − x_i)² + ∑_{i≠j} x_i² x_j ∈ IQ(ceq, cin).
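The identity in footnote 2 can be checked symbolically; the sketch below (ours, not from the paper) verifies it for n = 3:

```python
import sympy as sp

# Check: 1 - x^T x = (1 - e^T x)(1 + x^T x)
#                    + sum_i x_i (1 - x_i)^2 + sum_{i != j} x_i^2 x_j
xs = sp.symbols('x1:4')
xtx = sum(v**2 for v in xs)
rhs = (1 - sum(xs)) * (1 + xtx) \
    + sum(v * (1 - v)**2 for v in xs) \
    + sum(xi**2 * xj for xi in xs for xj in xs if xi is not xj)
assert sp.expand(rhs - (1 - xtx)) == 0   # both sides agree identically
```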
Example 6.2. Consider the optimization problem

    min   x1⁴x2² + x1²x2⁴ + x3⁶ − 3x1²x2²x3² + (x1⁴ + x2⁴ + x3⁴)
    s.t.  x1² + x2² + x3² ≥ 1.

The matrix polynomial L(x) = [ x1/2   x2/2   x3/2   −1 ]. The objective f is the sum of the positive definite form x1⁴ + x2⁴ + x3⁴ and the Motzkin polynomial

    M(x) = x1⁴x2² + x1²x2⁴ + x3⁶ − 3x1²x2²x3².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c1(x)p1(x) = (x1² + x2² + x3² − 1)( 3M(x) + 2(x1⁴ + x2⁴ + x3⁴) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |  −∞,            0.4466       |  0.1111,   0.1169
     4    |  −∞,            0.4948       |  0.3333,   0.3499
     5    |  −2.1821·10⁵,   1.1836       |  0.3333,   0.6530
Example 6.3. Consider the optimization problem

    min   x1x2 + x2x3 + x3x4 − 3x1x2x3x4 + (x1³ + ··· + x4³)
    s.t.  x1, x2, x3, x4 ≥ 0, 1 − x1 − x2 ≥ 0, 1 − x3 − x4 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x1   −x2    0      0     1 1 0 0 1 0 ]
    [ −x1    1−x2   0      0     1 1 0 0 1 0 ]
    [ 0      0      1−x3   −x4   0 0 1 1 0 1 ]
    [ 0      0      −x3    1−x4  0 0 1 1 0 1 ]
    [ −x1    −x2    0      0     1 1 0 0 1 0 ]
    [ 0      0      −x3    −x4   0 0 1 1 0 1 ]

The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(ceq, cin) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because −c1²p1² ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c1(x)²p1(x)² ≥ 0} is compact.
⁴ This is because 1 − x1² − x2² belongs to the quadratic module of (x1, x2, 1 − x1 − x2), and 1 − x3² − x4² belongs to the quadratic module of (x3, x4, 1 − x3 − x4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |  −2.9·10⁻⁵,   0.7335         |  −6·10⁻⁷,   0.6091
     4    |  −1.4·10⁻⁵,   2.5055         |  −8·10⁻⁸,   2.7423
     5    |  −1.4·10⁻⁵,  12.7092         |  −5·10⁻⁸,  13.7449
Example 6.4. Consider the polynomial optimization problem

    min_{x∈R²}   x1² + 50x2²
    s.t.         x1² − 1/2 ≥ 0,   x2² − 2x1x2 − 1/8 ≥ 0,   x2² + 2x1x2 − 1/8 ≥ 0.
It is motivated by an example in [12, §3]. The first column of L(x) is

    [ (8x1³ + x1)/5                                      ]
    [ (288x1⁴x2 − 16x1³ − 124x1²x2 + 8x1)/5 − 2x2        ]
    [ (−288x1⁴x2 − 16x1³ + 124x1²x2 + 8x1)/5 + 2x2       ]

and the second column of L(x) is

    [ (−8x1²x2 + 4x2³)/5 − x2/10                                   ]
    [ (288x1³x2² + 16x1²x2 − 142x1x2² − 8x2³ + 11x2)/5 − 9x1/20    ]
    [ (−288x1³x2² + 16x1²x2 + 142x1x2² − 8x2³ + 11x2)/5 + 9x1/20   ]
The objective is coercive, so fmin is achieved at a critical point. The minimum value fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |   6.7535,   0.4611           |   56.7500,   0.1309
     4    |   6.9294,   0.2428           |  112.6517,   0.2405
     5    |   8.8519,   0.3376           |  112.6517,   0.2167
     6    |  16.5971,   0.4703           |  112.6517,   0.3788
     7    |  35.4756,   0.6536           |  112.6517,   0.4537
Example 6.5. Consider the optimization problem

    min_{x∈R³}   x1³ + x2³ + x3³ + 4x1x2x3 − ( x1(x2² + x3²) + x2(x3² + x1²) + x3(x1² + x2²) )
    s.t.         x1 ≥ 0,   x1x2 − 1 ≥ 0,   x2x3 − 1 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x1x2   0    0   x2   x2   0  ]
    [ x1       0    0   −1   −1   0  ]
    [ −x1      x2   0   1    0    −1 ]
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R³₊, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     2    |  −∞,            0.4129       |  −∞,       0.1900
     3    |  −7.8184·10⁶,   0.4641       |  0.9492,   0.3139
     4    |  −2.0575·10⁴,   0.6499       |  0.9492,   0.5057
Example 6.6. Consider the optimization problem (x0 = 1)

    min_{x∈R⁴}   x^T x + ∑_{i=0}^{4} ∏_{j≠i} (x_i − x_j)
    s.t.         x1² − 1 ≥ 0,   x2² − 1 ≥ 0,   x3² − 1 ≥ 0,   x4² − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2) diag(x)   −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

    (1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
    (1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
    (−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |  −∞,            1.1377       |  3.5480,    1.1765
     4    |  −6.6913·10⁴,   4.7677       |  4.0000,    3.0761
     5    |  −21.3778,     22.9970       |  4.0000,   10.3354
Example 6.7. Consider the optimization problem

    min_{x∈R³}   x1⁴x2² + x2⁴x3² + x3⁴x1² − 3x1²x2²x3² + x2²
    s.t.         x1 − x2x3 ≥ 0,   −x2 + x3² ≥ 0.

The matrix polynomial L(x) = [ 1   0   0   0   0 ; −x3   −1   0   0   0 ]. By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are the points (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0. The computational results for the standard Lasserre
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |  −∞,            0.6144       |  −∞,         0.3418
     4    |  −1.0909·10⁷,   1.0542       |  −3.9476,    0.7180
     5    |  −942.6772,     1.6771       |  −3·10⁻⁹,    1.4607
     6    |  −0.0110,       3.3532       |  −8·10⁻¹⁰,   3.1618
Example 6.8. Consider the optimization problem

    min_{x∈R⁴}   x1²(x1 − x4)² + x2²(x2 − x4)² + x3²(x3 − x4)²
                 + 2x1x2x3(x1 + x2 + x3 − 2x4) + (x1 − 1)² + (x2 − 1)² + (x3 − 1)²
    s.t.         x1 − x2 ≥ 0,   x2 − x3 ≥ 0.

The matrix polynomial L(x) = [ 1   0   0   0   0   0 ; 1   1   0   0   0   0 ]. In the objective, the sum of the first four terms is a nonnegative form [37], while the sum of the last three terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     2    |  −∞,             0.3984      |  −0.3360,   0.9321
     3    |  −∞,             0.7634      |  0.9413,    0.5240
     4    |  −6.4896·10⁵,    4.5496      |  0.9413,    1.7192
     5    |  −3.1645·10³,   24.3665      |  0.9413,    8.1228
Example 6.9. Consider the optimization problem

    min_{x∈R⁴}   (x1 + x2 + x3 + x4 + 1)² − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
    s.t.         0 ≤ x1, ..., x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.
⁵ Note that 4 − ∑_{i=1}^{4} x_i² = ∑_{i=1}^{4} ( x_i(1 − x_i)² + (1 − x_i)(1 + x_i²) ) ∈ IQ(ceq, cin).
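The identity in footnote 5 can be confirmed symbolically, since each summand simplifies to 1 − x_i²; a short sympy check (ours, not from the paper):

```python
import sympy as sp

# Check: 4 - sum_i x_i^2 = sum_i ( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) )
xs = sp.symbols('x1:5')
rhs = sum(v * (1 - v)**2 + (1 - v) * (1 + v**2) for v in xs)
assert sp.expand(rhs - (4 - sum(v**2 for v in xs))) == 0
```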
Table 9. Computational results for Example 6.9

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     2    |  −0.0279,    0.2262          |  −5·10⁻⁶,    1.1835
     3    |  −0.0005,    0.4691          |  −6·10⁻⁷,    1.6566
     4    |  −0.0001,    3.1098          |  −2·10⁻⁷,    5.5234
     5    |  −4·10⁻⁵,   16.5092          |  −6·10⁻⁷,   19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x0 = 1)

    min_{x∈Rⁿ}   ∑_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
    s.t.         0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

    n        |   9      |  10      |  11      |  12       |  13       |  14
    w/o LME  |  1.2569  |  2.5619  |  6.3085  |  15.8722  |  35.1675  |  78.4111
    with LME |  1.9714  |  3.8288  |  8.2519  |  20.0310  |  37.6373  |  82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, ..., gℓ). Its preordering is the set

    Preord(cin, ψ) = ∑_{r1,...,rℓ ∈ {0,1}} g1^{r1} ··· gℓ^{rℓ} · Σ[x].
The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)    f′_{pre,k} := min  ⟨f, y⟩
         s.t.  ⟨1, y⟩ = 1,   L^{(k)}_{ceq}(y) = 0,   L^{(k)}_{φ}(y) = 0,
               L^{(k)}_{g1^{r1} ··· gℓ^{rℓ}}(y) ⪰ 0   for all r1, ..., rℓ ∈ {0, 1},
               y ∈ R^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)    f_{pre,k} := max  γ   s.t.   f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

    f_{pre,k} = f′_{pre,k} = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_{pre,k} = f′_{pre,k} = fmin for all k big enough.

Proof. The proof is very similar to Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

    K3 = {x ∈ Rⁿ | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0}.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same. □
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m; hence the Lagrange multiplier λ can be expressed as in (5.3). However, if c is singular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound of [16] is used at each step, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

    λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x); 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above one. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
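To make the contrast concrete, here is a small numeric sketch (ours, with the toy objective f = x and the constraint c = 1 − x² from Example 5.3): at a critical point where C(x) has full column rank, the rational formula above agrees with the polynomial expression λ = −(x/2)f′(x) from (5.3):

```python
import numpy as np

# Critical point of min x s.t. 1 - x^2 >= 0 is x = -1 (active, KKT lambda = 1/2).
x = -1.0
C = np.array([[-2.0 * x], [1.0 - x**2]])   # C(x) = [c'(x); c(x)], full column rank
rhs = np.array([1.0, 0.0])                 # [f'(x); 0] for f = x
# Rational representation: lambda = (C^T C)^{-1} C^T [grad f; 0]
lam_rational = float(np.linalg.lstsq(C, rhs, rcond=None)[0][0])
lam_poly = -x / 2.0 * 1.0                  # polynomial expression lambda = -x/2 * f'(x)
assert np.isclose(lam_rational, lam_poly) and np.isclose(lam_poly, 0.5)
```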
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for their valuable comments on improving the paper.
References
[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal On Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 29
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pages 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pages 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093
E-mail address: njw@math.ucsd.edu
5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.
Suppose there are totally m equality and inequality constraints, i.e.,

    E ∪ I = {1, . . . , m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1, . . . , λ_m]^T satisfies the equation
(5.1)    C(x) λ = [ ∇f(x); 0; · · · ; 0 ],

where C(x) is the (n + m) × m matrix polynomial

    C(x) = [ ∇c_1(x)  ∇c_2(x)  · · ·  ∇c_m(x) ]
           [  c_1(x)     0     · · ·     0    ]
           [    0      c_2(x)  · · ·     0    ]
           [    ⋮         ⋮      ⋱        ⋮   ]
           [    0         0    · · ·   c_m(x) ]
Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)    L(x) C(x) = I_m,

then we can get

(5.3)    λ = L(x) [ ∇f(x); 0 ] = L_1(x) ∇f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.
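As a concrete illustration (this instance is not from the paper), take n = 2 with the single ball constraint c_1(x) = 1 − x_1^2 − x_2^2 and the objective f = x_1 + x_2. Then L(x) = [−x_1/2, −x_2/2, 1] satisfies L(x)C(x) = I_1 identically in x, and at a critical pair the multiplier is recovered by (5.3). A short numerical check (assuming numpy is available):

```python
import numpy as np

def Cmat(x):
    # C(x) from (5.1) for the single constraint c1 = 1 - x1^2 - x2^2
    x1, x2 = x
    return np.array([[-2*x1], [-2*x2], [1 - x1**2 - x2**2]])

def Lmat(x):
    # satisfies L(x) C(x) = I_1 for every x
    x1, x2 = x
    return np.array([[-x1/2, -x2/2, 1.0]])

rng = np.random.default_rng(0)
for _ in range(10):                      # (5.2) holds identically in x
    x = rng.standard_normal(2)
    assert np.allclose(Lmat(x) @ Cmat(x), np.eye(1))

# at the minimizer of f = x1 + x2 on the unit ball:
x = np.array([-1.0, -1.0]) / np.sqrt(2)
gradf = np.array([1.0, 1.0])
lam = (Lmat(x)[:, :2] @ gradf)[0]        # (5.3): lambda = L1(x) grad f(x)
gradc = np.array([-2*x[0], -2*x[1]])
assert np.allclose(gradf, lam * gradc)   # grad f = lambda * grad c1
```

The first loop checks (5.2) as a polynomial identity (at random points); the second part checks that (5.3) reproduces the multiplier in the first-order optimality condition.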
Definition 5.1. The tuple c = (c_1, . . . , c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.
Clearly, c being nonsingular is equivalent to the condition that, for each u ∈ C^n with J(u) = {i_1, . . . , i_k} (see (1.2) for the notation), the gradients ∇c_{i_1}(u), . . . , ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
Proposition 5.2. (i) For each W(x) ∈ C[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

    P(x) W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).
Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ C^n,

    t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.
"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

    W(x) = [ w_1(x)  w_2(x)  · · ·  w_t(x) ].
Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, . . . , t, denote

    r_{1i}(x) = ξ_1(x)^T w_i(x);
then (we use ∼ to denote row equivalence between matrices)

    W(x) ∼ [   1      r_{12}(x)  · · ·  r_{1t}(x) ]
           [ w_1(x)    w_2(x)    · · ·   w_t(x)   ]

         ∼ W_1(x) = [ 1   r_{12}(x)     · · ·  r_{1t}(x)     ]
                    [ 0  w_2^{(1)}(x)   · · ·  w_t^{(1)}(x)  ],

where, for each i = 2, . . . , t,

    w_i^{(1)}(x) = w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ C[x]^{(s+1)×s} such that

    P_1(x) W(x) = W_1(x).
Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

    w_2^{(1)}(x) = 0

does not have a complex solution. Again by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^s such that

    ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, . . . , t, let r_{2i}(x) = ξ_2(x)^T w_i^{(1)}(x); then
    W_1(x) ∼ [ 1   r_{12}(x)     r_{13}(x)     · · ·  r_{1t}(x)    ]
             [ 0       1         r_{23}(x)     · · ·  r_{2t}(x)    ]
             [ 0  w_2^{(1)}(x)  w_3^{(1)}(x)   · · ·  w_t^{(1)}(x) ]

           ∼ W_2(x) = [ 1  r_{12}(x)   r_{13}(x)     · · ·  r_{1t}(x)    ]
                      [ 0      1       r_{23}(x)     · · ·  r_{2t}(x)    ]
                      [ 0      0      w_3^{(2)}(x)   · · ·  w_t^{(2)}(x) ],

where, for each i = 3, . . . , t,

    w_i^{(2)}(x) = w_i^{(1)}(x) − r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere, and there exists P_2(x) ∈ C[x]^{(s+2)×(s+1)} such that

    P_2(x) W_1(x) = W_2(x).
Continuing this process, we finally get

    W_2(x) ∼ · · · ∼ W_t(x) = [ 1  r_{12}(x)  r_{13}(x)  · · ·  r_{1t}(x) ]
                              [ 0      1      r_{23}(x)  · · ·  r_{2t}(x) ]
                              [ 0      0          1      · · ·  r_{3t}(x) ]
                              [ ⋮      ⋮          ⋮        ⋱        ⋮     ]
                              [ 0      0          0      · · ·      1     ]
                              [ 0      0          0      · · ·      0     ].

Consequently, there exist P_i(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, . . . , t, such that

    P_t(x) P_{t−1}(x) · · · P_1(x) W(x) = W_t(x).
Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x) W_t(x) = I_t. Let

    P(x) = P_{t+1}(x) P_t(x) P_{t−1}(x) · · · P_1(x);

then P(x) W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (here P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.
(ii) The conclusion is implied directly by item (i). □
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), its coefficients can be determined by comparing the coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints

    1 − x_1^2 ≥ 0,  1 − x_2^2 ≥ 0,  . . . ,  1 − x_n^2 ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −(1/2) diag(x)   I_n ].
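The remark above, that a candidate L(x) of a chosen degree is determined by a linear system, can be sketched numerically. The Python code below (numpy is assumed; evaluating the identity at sufficiently many random points is used as a substitute for literal coefficient comparison, which is equivalent for generic samples) recovers a degree-1 matrix L(x) with L(x)C(x) = I_2 for the hypercube constraints above with n = 2:

```python
import numpy as np

def Cmat(x):
    # C(x) from (5.1) for c = (1 - x1^2, 1 - x2^2)
    x1, x2 = x
    return np.array([[-2*x1, 0.0],
                     [0.0, -2*x2],
                     [1 - x1**2, 0.0],
                     [0.0, 1 - x2**2]])

basis = lambda x: np.array([1.0, x[0], x[1]])   # degree-1 monomial basis
nb, m, rows = 3, 2, 4    # basis size, #constraints, #rows of C(x)

rng = np.random.default_rng(0)
A, b = [], []
for _ in range(40):                   # sample points -> linear equations
    x = rng.standard_normal(2)
    C, mon = Cmat(x), basis(x)
    for i in range(m):                # equation (i,j) of L(x)C(x) = I_2
        for j in range(m):
            row = np.zeros(m * rows * nb)
            for k in range(rows):
                # entry L[i,k](x) = sum_t coef[i,k,t] * mon[t]
                idx = (i * rows + k) * nb
                row[idx:idx + nb] = mon * C[k, j]
            A.append(row)
            b.append(1.0 if i == j else 0.0)
coef, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
coef = coef.reshape(m, rows, nb)

def Lmat(x):
    return np.tensordot(coef, basis(x), axes=([2], [0]))

x = rng.standard_normal(2)            # verify at a fresh point
assert np.allclose(Lmat(x) @ Cmat(x), np.eye(2), atol=1e-6)
```

The sampled linear system is consistent here because a degree-1 solution exists (the L(x) of Example 5.3); the recovered L(x) need not coincide with it, but any exact solution of the sampled system satisfies the identity.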
Example 5.4. Consider the nonnegative portion of the unit sphere:

    x_1 ≥ 0,  x_2 ≥ 0,  . . . ,  x_n ≥ 0,  x_1^2 + · · · + x_n^2 − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − xx^T    x 1_n^T       2x ]
           [ (1/2) x^T   −(1/2) 1_n^T   −1 ].
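This L(x) can be verified numerically. The numpy sketch below (numpy assumed; the choice n = 5 is arbitrary) builds C(x) as in (5.1) and checks L(x)C(x) = I_{n+1} at a random point:

```python
import numpy as np

n = 5
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
ones = np.ones(n)

# C(x) from (5.1): columns for c_i = x_i (i <= n) and c_{n+1} = x^T x - 1
C = np.block([
    [np.eye(n),            (2*x).reshape(n, 1)],
    [np.diag(x),           np.zeros((n, 1))],
    [np.zeros((1, n)),     np.array([[x @ x - 1]])]])

# L(x) as given in Example 5.4
L = np.block([
    [np.eye(n) - np.outer(x, x),  np.outer(x, ones),     (2*x).reshape(n, 1)],
    [x.reshape(1, n) / 2,         -ones.reshape(1, n)/2, np.array([[-1.0]])]])

assert np.allclose(L @ C, np.eye(n + 1))
```

The check passes identically in x: the (1,1) block of L(x)C(x) is (I_n − xx^T) + x1_n^T diag(x) = I_n, and the remaining blocks cancel similarly.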
Example 5.5. Consider the set

    1 − x_1^3 − x_2^4 ≥ 0,  1 − x_3^4 − x_4^3 ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

    L(x) = [ −x_1/3  −x_2/4     0       0     1  0 ]
           [    0       0    −x_3/4  −x_4/3   0  1 ].
Example 5.6. Consider the quadratic set

    1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0,  1 − x_1^2 − x_2^2 − x_3^2 ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is
25x13 + 10 x1
2 x2 + 40x1 x22 minus 25 x1 minus 2x3 minus25x1
3 minus 10x12 x2 minus 40 x1 x2
2 +49 x1
2+ 2x3
minus15 x12 x2 + 10x1 x2
2 + 20 x3 x1 x2 minus 10x1 15 x12 x2 minus 10x1 x2
2 minus 20 x3 x1 x2 + 10x1 minus x22
25 x3 x12 minus 20x1 x2
2 + 10 x3 x1 x2 + 2x1 minus25 x3 x12 + 20x1 x2
2 minus 10 x3 x1 x2 minus 2x1 minus x32
1 minus 20 x1 x3 minus 10 x12 minus 20 x1 x2 20 x1 x2 + 20 x1 x3 + 10x1
2
minus50 x12 minus 20 x2 x1 50 x1
2 + 20 x2 x1 + 1
We would like to remark that a polynomial tuple c = (c_1, . . . , c_m) is generically nonsingular, and that Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, . . . , d_m, there exists an open dense subset U of D = R[x]_{d_1} × · · · × R[x]_{d_m} such that every tuple c = (c_1, . . . , c_m) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof uses resultants and discriminants, for which we refer to [29].

First, let J_1 be the set of all (i_1, . . . , i_{n+1}) with 1 ≤ i_1 < · · · < i_{n+1} ≤ m. The resultant Res(c_{i_1}, . . . , c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, . . . , c_{i_{n+1}} such that, if Res(c_{i_1}, . . . , c_{i_{n+1}}) ≠ 0, then the equations

    c_{i_1}(x) = · · · = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

    F_1(c) = ∏_{(i_1,...,i_{n+1}) ∈ J_1} Res(c_{i_1}, . . . , c_{i_{n+1}}).

For the case m ≤ n, J_1 = ∅ and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n of the polynomials c_1, . . . , c_m have a common complex zero.
Second, let J_2 be the set of all (j_1, . . . , j_k) with k ≤ n and 1 ≤ j_1 < · · · < j_k ≤ m. When one of c_{j_1}, . . . , c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, . . . , c_{j_k}) is a polynomial in the coefficients of c_{j_1}, . . . , c_{j_k} such that, if ∆(c_{j_1}, . . . , c_{j_k}) ≠ 0, then the equations

    c_{j_1}(x) = · · · = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j_1}, . . . , c_{j_k} at u are linearly independent. When all of c_{j_1}, . . . , c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, . . . , c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, . . . , c_{j_k}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

    F_2(c) = ∏_{(j_1,...,j_k) ∈ J_2} ∆(c_{j_1}, . . . , c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer polynomials of c_1, . . . , c_m have a singular common complex zero.
Last, let F(c) = F_1(c) F_2(c) and

    U = { c = (c_1, . . . , c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c_1, . . . , c_m can have a common complex zero; and if any k polynomials (k ≤ n) of c_1, . . . , c_m have a common complex zero, say u, then their gradients at u are linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9), with Lagrange multiplier expressions, for solving the optimization problem (1.1). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.
The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, . . . , c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p_1, . . . , p_m) to be the product L_1(x)∇f(x), i.e.,

    p_i = ( L_1(x) ∇f(x) )_i.
In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
By Theorem 3.3, we have f_k = f_min for all k big enough if f_c = f_min and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples except Examples 6.1, 6.7 and 6.9 (which have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables. The computational time (in seconds) is also compared. The results for standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

    min  x_1x_2(10 − x_3)
    s.t. x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0, 1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. They confirm that f_k = f_min for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 6.1

order k | w/o LME: lower bound |  time  | with LME: lower bound |  time
   2    |       −0.0521        | 0.6841 |        −0.0521        | 0.1922
   3    |       −0.0026        | 0.2657 |       −3·10^{-8}      | 0.2285
   4    |       −0.0007        | 0.6785 |       −6·10^{-9}      | 0.4431
   5    |       −0.0004        | 1.6105 |       −2·10^{-9}      | 0.9567
² Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^n x_i(1 − x_i)^2 + Σ_{i≠j} x_i^2 x_j ∈ IQ(c_eq, c_in).
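The membership claimed in this footnote rests on a polynomial identity, which is easy to confirm numerically. The sketch below (Python with numpy assumed; not part of the paper) verifies the identity at random points for n = 4:

```python
import numpy as np

# check: 1 - x^T x = (1 - e^T x)(1 + x^T x)
#                    + sum_i x_i (1 - x_i)^2 + sum_{i != j} x_i^2 x_j
rng = np.random.default_rng(0)
n = 4
for _ in range(100):
    x = rng.standard_normal(n)
    rhs = ((1 - x.sum()) * (1 + x @ x)
           + sum(xi * (1 - xi)**2 for xi in x)
           + sum(x[i]**2 * x[j]
                 for i in range(n) for j in range(n) if i != j))
    assert abs((1 - x @ x) - rhs) < 1e-9
```

Since both sides are polynomials of degree 3, agreement at sufficiently many generic points certifies the identity.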
Example 6.2. Consider the optimization problem

    min  x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 − 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4)
    s.t. x_1^2 + x_2^2 + x_3^2 ≥ 1.

The matrix polynomial is L(x) = [ (1/2)x_1  (1/2)x_2  (1/2)x_3  −1 ]. The objective f is the sum of the positive definite form x_1^4 + x_2^4 + x_3^4 and the Motzkin polynomial

    M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 − 3x_1^2x_2^2x_3^2.
Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c_1(x) p_1(x) = (x_1^2 + x_2^2 + x_3^2 − 1) ( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value is f_min = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. They confirm that f_k = f_min for all k ≥ 4, up to numerical round-off errors.
Table 2. Computational results for Example 6.2

order k | w/o LME: lower bound |  time  | with LME: lower bound |  time
   3    |          −∞          | 0.4466 |        0.1111         | 0.1169
   4    |          −∞          | 0.4948 |        0.3333         | 0.3499
   5    |    −2.1821·10^5      | 1.1836 |        0.3333         | 0.6530
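Here p_1 = (L_1(x)∇f(x))_1 = (1/2) x^T ∇f(x), and the factorization used above follows from Euler's identity applied to the homogeneous pieces of f (degree 6 for M, degree 4 for the quartic form). The numpy sketch below (numpy assumed) checks this, together with the stated minimum value:

```python
import numpy as np

def M(x):   # Motzkin polynomial
    x1, x2, x3 = x
    return x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2

def f(x):
    return M(x) + x[0]**4 + x[1]**4 + x[2]**4

def gradf(x):
    x1, x2, x3 = x
    return np.array([
        4*x1**3*x2**2 + 2*x1*x2**4 - 6*x1*x2**2*x3**2 + 4*x1**3,
        2*x1**4*x2 + 4*x1**2*x2**3 - 6*x1**2*x2*x3**2 + 4*x2**3,
        6*x3**5 - 6*x1**2*x2**2*x3 + 4*x3**3])

rng = np.random.default_rng(0)
for _ in range(50):
    x = rng.standard_normal(3)
    p1 = 0.5 * (x @ gradf(x))          # p1 = (L1(x) grad f(x))_1
    # Euler's identity on the homogeneous pieces of f:
    assert abs(p1 - (3*M(x) + 2*(x[0]**4 + x[1]**4 + x[2]**4))) < 1e-8
    assert M(x) >= -1e-9               # Motzkin polynomial is nonnegative

u = np.ones(3) / np.sqrt(3)            # one of the 8 minimizers
assert abs(f(u) - 1/3) < 1e-12
```

The last assertion confirms f_min = 1/3 at (1/√3, 1/√3, 1/√3): there M vanishes and the quartic form equals 1/3.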
Example 6.3. Consider the optimization problem

    min  x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1^3 + · · · + x_4^3)
    s.t. x_1, x_2, x_3, x_4 ≥ 0,  1 − x_1 − x_2 ≥ 0,  1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1   −x_2     0      0    1 1 0 0 1 0 ]
    [  −x_1  1−x_2     0      0    1 1 0 0 1 0 ]
    [    0      0    1−x_3   −x_4  0 0 1 1 0 1 ]
    [    0      0     −x_3  1−x_4  0 0 1 1 0 1 ]
    [  −x_1   −x_2     0      0    1 1 0 0 1 0 ]
    [    0      0     −x_3   −x_4  0 0 1 1 0 1 ].

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.
³ This is because −c_1^2 p_1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c_1(x)^2 p_1(x)^2 ≥ 0} is compact.
⁴ This is because 1 − x_1^2 − x_2^2 belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3^2 − x_4^2 belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

order k | w/o LME: lower bound |  time   | with LME: lower bound |  time
   3    |     −2.9·10^{-5}     |  0.7335 |       −6·10^{-7}      |  0.6091
   4    |     −1.4·10^{-5}     |  2.5055 |       −8·10^{-8}      |  2.7423
   5    |     −1.4·10^{-5}     | 12.7092 |       −5·10^{-8}      | 13.7449
Example 6.4. Consider the polynomial optimization problem

    min_{x∈R^2}  x_1^2 + 50x_2^2
    s.t.  x_1^2 − 1/2 ≥ 0,  x_2^2 − 2x_1x_2 − 1/8 ≥ 0,  x_2^2 + 2x_1x_2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The first column of L(x) is
8x13
5 + x1
5288 x2 x1
4
5 minus 16 x13
5 minus x2 x12 1245 + 8 x1
5 minus 2 x2minus 288x2 x1
4
5 minus 16 x13
5 + x2 x12 1245 + 8 x1
5 + 2 x2
and the second column of L(x) is
minus 8x12 x2
5 + 4x23
5 minus x2
10288x1
3 x22
5 + 16x12 x2
5 minus 142x1 x22
5 minus 9 x1
20 minus 8 x23
5 + 11x2
5
minus 288x13 x2
2
5 + 16x12 x2
5 + 142x1 x22
5 + 9 x1
20 minus 8 x23
5 + 11x2
5
The objective is coercive, so f_min is achieved at a critical point. The minimum value is f_min = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. They confirm that f_k = f_min for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

order k | w/o LME: lower bound |  time  | with LME: lower bound |  time
   3    |        6.7535        | 0.4611 |        56.7500        | 0.1309
   4    |        6.9294        | 0.2428 |       112.6517        | 0.2405
   5    |        8.8519        | 0.3376 |       112.6517        | 0.2167
   6    |       16.5971        | 0.4703 |       112.6517        | 0.3788
   7    |       35.4756        | 0.6536 |       112.6517        | 0.4537
Example 6.5. Consider the optimization problem

    min_{x∈R^3}  x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 − ( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) )
    s.t.  x_1 ≥ 0,  x_1x_2 − 1 ≥ 0,  x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1x_2   0    0   x_2  x_2    0 ]
    [    x_1     0    0   −1    −1    0 ]
    [   −x_1    x_2   0    1     0   −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. They confirm that f_k = f_min for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

order k | w/o LME: lower bound |  time  | with LME: lower bound |  time
   2    |          −∞          | 0.4129 |          −∞           | 0.1900
   3    |    −7.8184·10^6      | 0.4641 |        0.9492         | 0.3139
   4    |    −2.0575·10^4      | 0.6499 |        0.9492         | 0.5057
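The matrix polynomial of Example 6.5 can be verified against (5.1). The numpy sketch below (numpy assumed) restates C(x) for c = (x_1, x_1x_2 − 1, x_2x_3 − 1) and checks L(x)C(x) = I_3 at random points:

```python
import numpy as np

def C(x):
    # C(x) from (5.1): gradients of c1, c2, c3 on top, then diag(c)
    x1, x2, x3 = x
    return np.array([
        [1.0, x2, 0.0],
        [0.0, x1, x3],
        [0.0, 0.0, x2],
        [x1, 0.0, 0.0],
        [0.0, x1*x2 - 1, 0.0],
        [0.0, 0.0, x2*x3 - 1]])

def L(x):
    # the matrix polynomial given in Example 6.5
    x1, x2, x3 = x
    return np.array([
        [1 - x1*x2, 0.0, 0.0, x2, x2, 0.0],
        [x1, 0.0, 0.0, -1.0, -1.0, 0.0],
        [-x1, x2, 0.0, 1.0, 0.0, -1.0]])

rng = np.random.default_rng(2)
for _ in range(10):
    x = rng.standard_normal(3)
    assert np.allclose(L(x) @ C(x), np.eye(3))
```

Since each entry of L(x)C(x) − I_3 is a polynomial, vanishing at these generic sample points is strong evidence of the identity (and it can be confirmed by direct expansion).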
Example 6.6. Consider the optimization problem (with x_0 := 1)

    min_{x∈R^4}  x^T x + Σ_{i=0}^4 ∏_{j≠i} (x_i − x_j)
    s.t.  x_1^2 − 1 ≥ 0,  x_2^2 − 1 ≥ 0,  x_3^2 − 1 ≥ 0,  x_4^2 − 1 ≥ 0.

The matrix polynomial is L(x) = [ (1/2)diag(x)  −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers:

    (1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
    (1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
    (−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. They confirm that f_k = f_min for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6

order k | w/o LME: lower bound |  time   | with LME: lower bound |  time
   3    |          −∞          |  1.1377 |        3.5480         |  1.1765
   4    |    −6.6913·10^4      |  4.7677 |        4.0000         |  3.0761
   5    |      −21.3778        | 22.9970 |        4.0000         | 10.3354
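The value f_min = 4 and the count of 11 minimizers can be checked among the 16 sign vectors x ∈ {±1}^4 (a partial check: the candidates come from the constraints x_i^2 ≥ 1 and the quadratic term, while the full minimization over R^4 is what the relaxations certify). A pure-Python sketch:

```python
from itertools import product

def f(x):
    # objective of Example 6.6, with x0 = 1 prepended
    y = (1,) + x
    s = sum(xi * xi for xi in x)
    for i in range(5):
        p = 1
        for j in range(5):
            if j != i:
                p *= y[i] - y[j]
        s += p
    return s

vals = {x: f(x) for x in product((1, -1), repeat=4)}
fmin = min(vals.values())
minimizers = [x for x, v in vals.items() if v == fmin]
assert fmin == 4
assert len(minimizers) == 11
```

At each of the 11 minimizers every value among (x_0, . . . , x_4) is repeated, so every product ∏_{j≠i}(x_i − x_j) vanishes and f = x^T x = 4; the other five sign vectors give f = 20.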
Example 6.7. Consider the optimization problem

    min_{x∈R^3}  x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 − 3x_1^2x_2^2x_3^2 + x_2^2
    s.t.  x_1 − x_2x_3 ≥ 0,  −x_2 + x_3^2 ≥ 0.

The matrix polynomial is

    L(x) = [   1    0   0 0 0 ]
           [ −x_3  −1   0 0 0 ].

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are the points (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for standard Lasserre's
relaxations and the new ones (3.8)-(3.9) are in Table 7. They confirm that f_k = f_min for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7

order k | w/o LME: lower bound |  time  | with LME: lower bound |  time
   3    |          −∞          | 0.6144 |          −∞           | 0.3418
   4    |    −1.0909·10^7      | 1.0542 |       −3.9476         | 0.7180
   5    |     −942.6772        | 1.6771 |      −3·10^{-9}       | 1.4607
   6    |      −0.0110         | 3.3532 |      −8·10^{-10}      | 3.1618
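The AM-GM argument behind f_min = 0 is that the three sextic terms dominate 3x_1^2x_2^2x_3^2, so f(x) ≥ x_2^2 ≥ 0 for all x, while the value 0 is attained at the stated feasible points. A numerical sketch (numpy assumed):

```python
import numpy as np

def f(x):
    x1, x2, x3 = x
    return (x1**4*x2**2 + x2**4*x3**2 + x3**4*x1**2
            - 3*x1**2*x2**2*x3**2 + x2**2)

# AM-GM on the three sextic terms gives f(x) >= x2^2 >= 0 for all x
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.standard_normal(3)
    assert f(x) >= x[1]**2 - 1e-6

# points (x1, 0, x3) with x1 >= 0 and x1*x3 = 0 are feasible and attain 0
for x in ([0.7, 0.0, 0.0], [0.0, 0.0, -1.3]):
    x1, x2, x3 = x
    assert x1 - x2*x3 >= 0 and -x2 + x3**2 >= 0   # the two constraints
    assert f(x) == 0
```

Note that at a point (x_1, 0, x_3), the objective reduces to x_3^4x_1^2, which vanishes exactly when x_1x_3 = 0, matching the description of the minimizer set.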
Example 6.8. Consider the optimization problem

    min_{x∈R^4}  x_1^2(x_1 − x_4)^2 + x_2^2(x_2 − x_4)^2 + x_3^2(x_3 − x_4)^2
                 + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)^2 + (x_2 − 1)^2 + (x_3 − 1)^2
    s.t.  x_1 − x_2 ≥ 0,  x_2 − x_3 ≥ 0.

The matrix polynomial is

    L(x) = [ 1 0 0 0 0 0 ]
           [ 1 1 0 0 0 0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. They confirm that f_k = f_min for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

order k | w/o LME: lower bound |  time   | with LME: lower bound |  time
   2    |          −∞          |  0.3984 |       −0.3360         | 0.9321
   3    |          −∞          |  0.7634 |        0.9413         | 0.5240
   4    |    −6.4896·10^5      |  4.5496 |        0.9413         | 1.7192
   5    |    −3.1645·10^3      | 24.3665 |        0.9413         | 8.1228
Example 6.9. Consider the optimization problem

    min_{x∈R^4}  (x_1 + x_2 + x_3 + x_4 + 1)^2 − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
    s.t.  0 ≤ x_1, . . . , x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value is f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.
⁵ Note that 4 − Σ_{i=1}^4 x_i^2 = Σ_{i=1}^4 ( x_i(1 − x_i)^2 + (1 − x_i)(1 + x_i^2) ) ∈ IQ(c_eq, c_in).
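The identity in this footnote reduces, term by term, to x(1 − x)^2 + (1 − x)(1 + x^2) = 1 − x^2. It can be confirmed numerically (numpy assumed; not part of the paper):

```python
import numpy as np

# check: 4 - sum_i x_i^2 = sum_i ( x_i(1 - x_i)^2 + (1 - x_i)(1 + x_i^2) )
rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.standard_normal(4)
    rhs = sum(xi*(1 - xi)**2 + (1 - xi)*(1 + xi**2) for xi in x)
    assert abs((4 - x @ x) - rhs) < 1e-9
```

Each summand factors as (1 − x_i)(x_i(1 − x_i) + 1 + x_i^2) = (1 − x_i)(1 + x_i) = 1 − x_i^2, which sums to 4 − Σ_i x_i^2.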
Table 9. Computational results for Example 6.9

order k | w/o LME: lower bound |  time   | with LME: lower bound |  time
   2    |       −0.0279        |  0.2262 |      −5·10^{-6}       |  1.1835
   3    |       −0.0005        |  0.4691 |      −6·10^{-7}       |  1.6566
   4    |       −0.0001        |  3.1098 |      −2·10^{-7}       |  5.5234
   5    |      −4·10^{-5}      | 16.5092 |      −6·10^{-7}       | 19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (with x_0 := 1)

    min_{x∈R^n}  Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
    s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

    n    |    9    |   10   |   11   |   12    |   13    |   14
w/o LME  | 1.2569  | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111
with LME | 1.9714  | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.
If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, . . . , g_ℓ). Its preordering is the set

    Preord(c_in, ψ) = Σ_{r_1, . . . , r_ℓ ∈ {0,1}}  g_1^{r_1} · · · g_ℓ^{r_ℓ} · Σ[x].
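The 2^ℓ generator products appearing in this set can be enumerated mechanically. The Python sketch below (stdlib only; the polynomials g_i are an illustrative choice, not from the paper) evaluates all products g_1^{r_1} · · · g_ℓ^{r_ℓ} at a point:

```python
from itertools import product

def preordering_products(gs, x):
    """Evaluate the 2^l generator products g1^r1 * ... * gl^rl,
    for (r1, ..., rl) in {0,1}^l, at a point x."""
    vals = [g(x) for g in gs]
    out = []
    for r in product((0, 1), repeat=len(gs)):
        p = 1.0
        for ri, vi in zip(r, vals):
            if ri:
                p *= vi
        out.append(p)
    return out

# an illustrative tuple (g1, g2, g3) of inequality polynomials
gs = (lambda x: 1 - x[0]**2, lambda x: x[1], lambda x: 1 - x[0]*x[1])
vals = preordering_products(gs, (0.5, 0.25))
assert len(vals) == 2**3
assert vals[0] == 1.0                  # r = (0,0,0): the empty product
assert abs(vals[-1] - 0.75*0.25*0.875) < 1e-12   # r = (1,1,1)
```

In the relaxation (7.1), each of these 2^ℓ products contributes one localizing-matrix constraint, which is why the preordering-based hierarchy is tighter but also considerably larger than the quadratic-module one.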
The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)    f'_{pre,k} = min  ⟨f, y⟩
         s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{c_eq}(y) = 0,  L^{(k)}_φ(y) = 0,
               L^{(k)}_{g_1^{r_1} · · · g_ℓ^{r_ℓ}}(y) ⪰ 0  for all r_1, . . . , r_ℓ ∈ {0, 1},
               y ∈ R^{N^n_{2k}}.
Similar to (3.9), the dual optimization problem of the above is

(7.2)    f_{pre,k} = max  γ
         s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.
Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

    f_{pre,k} = f'_{pre,k} = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = f_min for all k big enough.
Proof. The proof is very similar to Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

    K_3 = { x ∈ R^n | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □
72 Singular constraining polynomials As shown in Proposition 52 if thetuple c of constraining polynomials is nonsingular then there exists a matrix poly-nomial L(x) such that L(x)C(x) = Im Hence the Lagrange multiplier λ can beexpressed as in (53) However if c is not nonsingular then such L(x) does notexist For such cases how can we express λ in terms of x for critical pairs (x λ)This question is mostly open to the best of the authorrsquos knowledge
73 Degree bound for L(x) For a nonsingular tuple c of constraining polyno-mials what is a good degree bound for L(x) in Proposition 52 When c is lineara degree bound is given in Proposition 41 However for nonlinear c an explicitdegree bound is not known Theoretically we can get a degree bound for L(x) Inthe proof of Proposition 52 the Hilbertrsquos Nullstellensatz is used for t times Thereexists sharp degree bounds for Hilbertrsquos Nullstellensatz [16] For each time of itsusage if the degree bound in [16] is used then a degree bound for L(x) can beeventually obtained However such obtained bound is enormous because the onein [16] is already exponential in the number of variables An interesting future workis to get a useful degree bound for L(x)
28 JIAWANG NIE
74 Rational representation of Lagrange multipliers In (51) the Lagrangemultiplier vector λ is determined by a linear equation Naturally one can get
λ =(
C(x)TC(x))minus1
C(x)T[nablaf(x)
0
]
when C(x) has full column rank This rational representation is expensive for us-age because its denominator is typically a high degree polynomial However λmight have rational representations other than the above Can we find a rationalrepresentation whose denominator and numerator have low degrees If this is pos-sible the methods for optimizing rational functions [3 15 28] can be applied Thisis an interesting question for future research
Acknowledgement The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973 He would like to thank the anonymous referees forvaluable comments on improving the paper
References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 29
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA, 92093
E-mail address: njw@math.ucsd.edu
Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, ..., t, denote

    r_{1i}(x) = ξ_1(x)^T w_i(x),

then (use ∼ to denote row equivalence between matrices)

\[
W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix}
\sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},
\]

where each (i = 2, ..., t)

    w_i^{(1)}(x) = w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ C[x]^{(s+1)×s} such that

    P_1(x) W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

    w_2^{(1)}(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^s such that

    ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) = ξ_2(x)^T w_i^{(1)}(x); then

\[
W_1(x) \sim \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x)
\end{bmatrix}
\sim W_2(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x)
\end{bmatrix},
\]

where each (i = 3, ..., t)

    w_i^{(2)}(x) = w_i^{(1)}(x) − r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ C[x]^{(s+2)×(s+1)} such that

    P_2(x) W_1(x) = W_2(x).

Continuing this process, we can finally get

\[
W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & 1 & \cdots & r_{3t}(x) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}.
\]

Consequently, there exists P_i(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

    P_t(x) P_{t−1}(x) · · · P_1(x) W(x) = W_t(x).

Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x) W_t(x) = I_t. Let

    P(x) = P_{t+1}(x) P_t(x) P_{t−1}(x) · · · P_1(x),

then P(x) W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i).
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
Example 5.3. Consider the hypercube with quadratic constraints

    1 − x_1^2 ≥ 0,  1 − x_2^2 ≥ 0,  ...,  1 − x_n^2 ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

\[
L(x) = \begin{bmatrix} -\tfrac{1}{2}\operatorname{diag}(x) & I_n \end{bmatrix}.
\]
Example 54 Consider the nonnegative portion of the unit sphere
x1 ge 0 x2 ge 0 xn ge 0 x21 + middot middot middot+ x2n minus 1 = 0
The equation L(x)C(x) = In+1 is satisfied by
L(x) =
[In minus xxT x1Tn 2x
12x
T minus 121
Tn minus1
]
Example 55 Consider the set
1minus x31 minus x42 ge 0 1minus x43 minus x34 ge 0
The equation L(x)C(x) = I2 is satisfied by
L(x) =
[minusx1
3 minusx2
4 0 0 1 00 0 minusx3
4 minusx4
3 0 1
]
Example 5.6. Consider the quadratic set

    1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0,  1 − x_1^2 − x_2^2 − x_3^2 ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

\[
\begin{bmatrix}
25x_1^3 + 10x_1^2x_2 + 40x_1x_2^2 - 25x_1 - 2x_3 \;&\; -25x_1^3 - 10x_1^2x_2 - 40x_1x_2^2 + 49x_1^2 + 2x_3 \\
-15x_1^2x_2 + 10x_1x_2^2 + 20x_3x_1x_2 - 10x_1 \;&\; 15x_1^2x_2 - 10x_1x_2^2 - 20x_3x_1x_2 + 10x_1 - x_2^2 \\
25x_3x_1^2 - 20x_1x_2^2 + 10x_3x_1x_2 + 2x_1 \;&\; -25x_3x_1^2 + 20x_1x_2^2 - 10x_3x_1x_2 - 2x_1 - x_3^2 \\
1 - 20x_1x_3 - 10x_1^2 - 20x_1x_2 \;&\; 20x_1x_2 + 20x_1x_3 + 10x_1^2 \\
-50x_1^2 - 20x_2x_1 \;&\; 50x_1^2 + 20x_2x_1 + 1
\end{bmatrix}.
\]
We would like to remark that a polynomial tuple c = (c_1, ..., c_m) is generically nonsingular, and that Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, ..., d_m, there exists an open dense subset U of D = R[x]_{d_1} × · · · × R[x]_{d_m} such that every tuple c = (c_1, ..., c_m) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let J_1 be the set of all (i_1, ..., i_{n+1}) with 1 ≤ i_1 < · · · < i_{n+1} ≤ m. The resultant Res(c_{i_1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, ..., c_{i_{n+1}} such that if Res(c_{i_1}, ..., c_{i_{n+1}}) ≠ 0 then the equations

    c_{i_1}(x) = · · · = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

    F_1(c) = ∏_{(i_1,...,i_{n+1}) ∈ J_1} Res(c_{i_1}, ..., c_{i_{n+1}}).

For the case that m ≤ n, J_1 = ∅ and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n polynomials of c_1, ..., c_m have a common complex zero.

Second, let J_2 be the set of all (j_1, ..., j_k) with k ≤ n and 1 ≤ j_1 < · · · < j_k ≤ m. When one of c_{j_1}, ..., c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, ..., c_{j_k}) is a polynomial in the coefficients of c_{j_1}, ..., c_{j_k} such that if ∆(c_{j_1}, ..., c_{j_k}) ≠ 0 then the equations

    c_{j_1}(x) = · · · = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j_1}, ..., c_{j_k} at u are linearly independent. When all c_{j_1}, ..., c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, ..., c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, ..., c_{j_k}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

    F_2(c) = ∏_{(j_1,...,j_k) ∈ J_2} ∆(c_{j_1}, ..., c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer polynomials of c_1, ..., c_m have a singular common complex zero.

Last, let F(c) = F_1(c)F_2(c) and

    U = { c = (c_1, ..., c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D and it is open dense in D. For all c ∈ U, no more than n of c_1, ..., c_m can have a common complex zero. For any k polynomials (k ≤ n) of c_1, ..., c_m, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically.
6. Numerical examples
This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for computational results.
The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, ..., c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of the first n columns. Then choose (p_1, ..., p_m) to be the product L_1(x)∇f(x), i.e.,

    p_i = ( L_1(x)∇f(x) )_i.
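The construction can be sketched numerically. Below, L_1(x) = −(1/2)diag(x) is the first-n-column block of L(x) from the hypercube of Example 5.3, and the objective f(x) = x_1 + ... + x_n is a made-up illustration; at the critical point x = (1, ..., 1), the values p_i(x) reproduce the Lagrange multipliers λ_i = −1/2 obtained from ∇f = Σ λ_i ∇c_i.

```python
import numpy as np

# Sketch of p_i = (L_1(x) grad f(x))_i for the hypercube constraints
# 1 - x_i^2 >= 0 of Example 5.3, with the illustrative objective
# f(x) = x_1 + ... + x_n (an assumption, not an example from the paper).
n = 3
x = np.ones(n)              # critical point: all constraints are active
grad_f = np.ones(n)         # gradient of f(x) = x_1 + ... + x_n
L1 = -0.5 * np.diag(x)      # first-n-column block of L(x)
p = L1 @ grad_f             # p_i(x) = -(1/2) x_i * df/dx_i
print(p)                    # -> [-0.5 -0.5 -0.5]
```

Since ∇c_i = −2x_i e_i, the stationarity condition at x = (1, ..., 1) gives λ_i = −1/2 directly, matching p_i(x) at this critical point.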
In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sub-level set {x : f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.

By Theorem 3.3, we have fk = fmin for all k big enough if fc = fmin and any one of the conditions i)-iii) there holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with standard Lasserre relaxations in [17]. The lower bounds given by relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables. The computational time (in seconds) is also compared. The results for standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

    min  x_1x_2(10 − x_3)
    s.t.  x_1 ≥ 0,  x_2 ≥ 0,  x_3 ≥ 0,  1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied (see footnote 2). Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 6.1

    order k |       w/o LME        |       with LME
            | lower bound    time  | lower bound    time
        2   |  -0.0521      0.6841 |  -0.0521      0.1922
        3   |  -0.0026      0.2657 |  -3·10^-8     0.2285
        4   |  -0.0007      0.6785 |  -6·10^-9     0.4431
        5   |  -0.0004      1.6105 |  -2·10^-9     0.9567
Footnote 2: Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^{n} x_i(1 − x_i)^2 + Σ_{i≠j} x_i^2 x_j ∈ IQ(ceq, cin).
Example 6.2. Consider the optimization problem

    min  x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 − 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4)
    s.t.  x_1^2 + x_2^2 + x_3^2 ≥ 1.

The matrix polynomial L(x) = [ (1/2)x_1  (1/2)x_2  (1/2)x_3  −1 ]. The objective f is the sum of the positive definite form x_1^4 + x_2^4 + x_3^4 and the Motzkin polynomial

    M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 − 3x_1^2x_2^2x_3^2.

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c_1(x)p_1(x) = (x_1^2 + x_2^2 + x_3^2 − 1)( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied (see footnote 3). The minimum value fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 2. Computational results for Example 6.2

    order k |        w/o LME          |      with LME
            | lower bound      time   | lower bound   time
        3   |  -∞             0.4466  |   0.1111     0.1169
        4   |  -∞             0.4948  |   0.3333     0.3499
        5   |  -2.1821·10^5   1.1836  |   0.3333     0.6530
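Since the Motzkin part vanishes at the reported minimizers, the value 1/3 comes entirely from x_1^4 + x_2^4 + x_3^4. This can be checked with plain Python, no SDP solver involved:

```python
import math

# Check for Example 6.2: at the minimizers (+-1/sqrt(3), +-1/sqrt(3), +-1/sqrt(3))
# the Motzkin polynomial vanishes and the objective equals 1/3.
def f(x1, x2, x3):
    motzkin = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2
    return motzkin + x1**4 + x2**4 + x3**4

t = 1.0 / math.sqrt(3.0)
print(abs(f(t, t, t) - 1.0/3.0) < 1e-12)   # -> True
```

By symmetry of f in sign changes of each variable, the same value is attained at all 8 sign patterns.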
Example 6.3. Consider the optimization problem

    min  x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1^3 + · · · + x_4^3)
    s.t.  x_1, x_2, x_3, x_4 ≥ 0,  1 − x_1 − x_2 ≥ 0,  1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

\[
\begin{bmatrix}
1-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1-x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1-x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1-x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.
\]

The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(ceq, cin) is archimedean (see footnote 4). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 3.
Footnote 3: This is because −c_1^2 p_1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c_1(x)^2 p_1(x)^2 ≥ 0} is compact.
Footnote 4: This is because 1 − x_1^2 − x_2^2 belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3^2 − x_4^2 belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

    order k |        w/o LME          |       with LME
            | lower bound      time   | lower bound     time
        3   |  -2.9·10^-5     0.7335  |  -6·10^-7      0.6091
        4   |  -1.4·10^-5     2.5055  |  -8·10^-8      2.7423
        5   |  -1.4·10^-5    12.7092  |  -5·10^-8     13.7449
Example 6.4. Consider the polynomial optimization problem

    min_{x ∈ R^2}  x_1^2 + 50x_2^2
    s.t.  x_1^2 − 1/2 ≥ 0,  x_2^2 − 2x_1x_2 − 1/8 ≥ 0,  x_2^2 + 2x_1x_2 − 1/8 ≥ 0.

It is motivated from an example in [12, §3]. The first column of L(x) is
\[
\begin{bmatrix}
\tfrac{8x_1^3}{5} + \tfrac{x_1}{5} \\[4pt]
\tfrac{288x_2x_1^4}{5} - \tfrac{16x_1^3}{5} - \tfrac{124x_2x_1^2}{5} + \tfrac{8x_1}{5} - 2x_2 \\[4pt]
-\tfrac{288x_2x_1^4}{5} - \tfrac{16x_1^3}{5} + \tfrac{124x_2x_1^2}{5} + \tfrac{8x_1}{5} + 2x_2
\end{bmatrix}
\]
and the second column of L(x) is
\[
\begin{bmatrix}
-\tfrac{8x_1^2x_2}{5} + \tfrac{4x_2^3}{5} - \tfrac{x_2}{10} \\[4pt]
\tfrac{288x_1^3x_2^2}{5} + \tfrac{16x_1^2x_2}{5} - \tfrac{142x_1x_2^2}{5} - \tfrac{9x_1}{20} - \tfrac{8x_2^3}{5} + \tfrac{11x_2}{5} \\[4pt]
-\tfrac{288x_1^3x_2^2}{5} + \tfrac{16x_1^2x_2}{5} + \tfrac{142x_1x_2^2}{5} + \tfrac{9x_1}{20} - \tfrac{8x_2^3}{5} + \tfrac{11x_2}{5}
\end{bmatrix}.
\]
The objective is coercive, so fmin is achieved at a critical point. The minimum value fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

    order k |      w/o LME        |      with LME
            | lower bound   time  | lower bound   time
        3   |   6.7535     0.4611 |   56.7500    0.1309
        4   |   6.9294     0.2428 |  112.6517    0.2405
        5   |   8.8519     0.3376 |  112.6517    0.2167
        6   |  16.5971     0.4703 |  112.6517    0.3788
        7   |  35.4756     0.6536 |  112.6517    0.4537
Example 6.5. Consider the optimization problem

    min_{x ∈ R^3}  x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 − ( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) )
    s.t.  x_1 ≥ 0,  x_1x_2 − 1 ≥ 0,  x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

\[
\begin{bmatrix}
1 - x_1x_2 & 0 & 0 & x_2 & x_2 & 0 \\
x_1 & 0 & 0 & -1 & -1 & 0 \\
-x_1 & x_2 & 0 & 1 & 0 & -1
\end{bmatrix}.
\]
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

    order k |        w/o LME          |     with LME
            | lower bound      time   | lower bound  time
        2   |  -∞             0.4129  |  -∞         0.1900
        3   |  -7.8184·10^6   0.4641  |  0.9492     0.3139
        4   |  -2.0575·10^4   0.6499  |  0.9492     0.5057
Example 6.6. Consider the optimization problem (x_0 = 1)

    min_{x ∈ R^4}  x^T x + Σ_{i=0}^{4} Π_{j≠i} (x_i − x_j)
    s.t.  x_1^2 − 1 ≥ 0,  x_2^2 − 1 ≥ 0,  x_3^2 − 1 ≥ 0,  x_4^2 − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

    (1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
    (1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
    (−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6

    order k |        w/o LME          |     with LME
            | lower bound      time   | lower bound   time
        3   |  -∞             1.1377  |  3.5480      1.1765
        4   |  -6.6913·10^4   4.7677  |  4.0000      3.0761
        5   |  -21.3778      22.9970  |  4.0000     10.3354
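The list of minimizers can be confirmed by brute force over the 16 sign vectors in {±1}^4; exactly the 11 points above attain the value 4. A short Python check (the objective is coded directly from the formula above, with x_0 = 1 prepended):

```python
import itertools

# Check for Example 6.6: among the 16 points of {+-1}^4, exactly 11 attain
# the minimum value 4 (matching the list of global minimizers).
def f(x):
    pts = [1.0] + list(x)              # x0 = 1 prepended
    quad = sum(v * v for v in x)       # the x^T x part
    prod_part = 0.0
    for i in range(5):
        term = 1.0
        for j in range(5):
            if j != i:
                term *= pts[i] - pts[j]
        prod_part += term
    return quad + prod_part

minimizers = [s for s in itertools.product([1, -1], repeat=4)
              if abs(f(s) - 4.0) < 1e-12]
print(len(minimizers))   # -> 11
```

The 5 remaining sign vectors (e.g., (−1, −1, −1, −1)) give the value 20, since one product term equals 16 there.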
Example 6.7. Consider the optimization problem

    min_{x ∈ R^3}  x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 − 3x_1^2x_2^2x_3^2 + x_2^2
    s.t.  x_1 − x_2x_3 ≥ 0,  −x_2 + x_3^2 ≥ 0.

The matrix polynomial

\[
L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}.
\]

By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7

    order k |        w/o LME          |      with LME
            | lower bound      time   | lower bound    time
        3   |  -∞             0.6144  |  -∞           0.3418
        4   |  -1.0909·10^7   1.0542  |  -3.9476      0.7180
        5   |  -942.6772      1.6771  |  -3·10^-9     1.4607
        6   |  -0.0110        3.3532  |  -8·10^-10    3.1618
Example 6.8. Consider the optimization problem

    min_{x ∈ R^4}  x_1^2(x_1 − x_4)^2 + x_2^2(x_2 − x_4)^2 + x_3^2(x_3 − x_4)^2
                   + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)^2 + (x_2 − 1)^2 + (x_3 − 1)^2
    s.t.  x_1 − x_2 ≥ 0,  x_2 − x_3 ≥ 0.

The matrix polynomial

\[
L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}.
\]

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

    order k |        w/o LME          |      with LME
            | lower bound      time   | lower bound   time
        2   |  -∞             0.3984  |  -0.3360     0.9321
        3   |  -∞             0.7634  |   0.9413     0.5240
        4   |  -6.4896·10^5   4.5496  |   0.9413     1.7192
        5   |  -3.1645·10^3  24.3665  |   0.9413     8.1228
Example 6.9. Consider the optimization problem

    min_{x ∈ R^4}  (x_1 + x_2 + x_3 + x_4 + 1)^2 − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
    s.t.  0 ≤ x_1, ..., x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied (see footnote 5). The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.
Footnote 5: Note that 4 − Σ_{i=1}^{4} x_i^2 = Σ_{i=1}^{4} ( x_i(1 − x_i)^2 + (1 − x_i)(1 + x_i^2) ) ∈ IQ(ceq, cin).
Table 9. Computational results for Example 6.9

    order   |       w/o LME        |      with LME
            | lower bound    time  | lower bound    time
        2   |  -0.0279      0.2262 |  -5·10^-6     1.1835
        3   |  -0.0005      0.4691 |  -6·10^-7     1.6566
        4   |  -0.0001      3.1098 |  -2·10^-7     5.5234
        5   |  -4·10^-5    16.5092 |  -6·10^-7    19.7320
For some polynomial optimization problems, standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x_0 = 1)

    min_{x ∈ R^n}  Σ_{0 ≤ i ≤ j ≤ k ≤ n} c_{ijk} x_i x_j x_k
    s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

        n    |    9       10      11      12       13       14
    w/o LME  | 1.2569  2.5619  6.3085  15.8722  35.1675  78.4111
    with LME | 1.9714  3.8288  8.2519  20.0310  37.6373  82.4778
7. Discussions
7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

    Preord(cin, ψ) = Σ_{r_1,...,r_ℓ ∈ {0,1}} g_1^{r_1} · · · g_ℓ^{r_ℓ} · Σ[x].
The truncation Preord(cin, ψ)_{2k} is similarly defined like Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

\[
(7.1)\qquad
\begin{array}{rl}
f^{\prime pre}_k \,=\, \min & \langle f, y \rangle \\
\text{s.t.} & \langle 1, y \rangle = 1,\ \ L^{(k)}_{c_{eq}}(y) = 0,\ \ L^{(k)}_{\phi}(y) = 0, \\
 & L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\
 & y \in \mathbb{R}^{N^n_{2k}}.
\end{array}
\]
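The price of the tighter relaxation is that the number of localizing conditions grows as 2^ℓ, one per product g_1^{r_1} · · · g_ℓ^{r_ℓ}. The enumeration of these index products can be sketched as follows (the symbols g1, g2, g3 are placeholders, not tied to any particular example):

```python
import itertools

# Sketch: the preordering-based relaxation imposes one localizing condition
# per product g_1^{r_1} * ... * g_l^{r_l} with r in {0,1}^l, i.e. 2^l of them.
l = 3
labels = []
for r in itertools.product([0, 1], repeat=l):
    term = "*".join(f"g{i+1}" for i in range(l) if r[i] == 1) or "1"
    labels.append(term)

print(len(labels))   # -> 8
print(labels[0])     # -> 1  (the empty product, giving the plain moment matrix)
```

By contrast, the quadratic-module relaxation (3.8) uses only the ℓ + 1 products with at most one factor, which is why the preordering hierarchy is tighter but much more expensive for large ℓ.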
Similar to (3.9), the dual optimization problem of the above is

\[
(7.2)\qquad
f^{pre}_k \,=\, \max\ \gamma
\quad \text{s.t.}\quad f - \gamma \in \operatorname{Ideal}(c_{eq}, \phi)_{2k} + \operatorname{Preord}(c_{in}, \psi)_{2k}.
\]

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

    f^{pre}_k = f^{\prime pre}_k = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f^{pre}_k = f^{\prime pre}_k = fmin for all k big enough.
Proof. The proof is very similar to the Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

    K_3 = { x ∈ R^n : ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same.
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
[36] M Putinar Positive polynomials on compact semi-algebraic sets Ind Univ Math J 42(1993) 969ndash984
[37] B Reznick Some concrete aspects of Hilbertrsquos 17th problem In Contemp Math Vol 253pp 251-272 American Mathematical Society 2000
[38] M Schweighofer Global optimization of polynomials using gradient tentacles and sums ofsquares SIAM J Optim 17 (2006) no 3 pp 920ndash942
[39] C Scheiderer Positivity and sums of squares A guide to recent results Emerging Appli-cations of Algebraic Geometry (M Putinar S Sullivant eds) IMA Volumes Math Appl149 Springer 2009 pp 271ndash324
[40] J F Sturm SeDuMi 102 A MATLAB toolbox for optimization over symmetric conesOptim Methods Softw 11 amp 12 (1999) pp 625ndash653
[41] B Sturmfels Solving systems of polynomial equations CBMS Regional Conference Series inMathematics 97 American Mathematical Society Providence RI 2002
[42] H H Vui and P T Son Global optimization of polynomials using the truncated tangencyvariety and sums of squares SIAM J Optim 19 (2008) no 2 pp 941ndash951
Department of Mathematics University of California at San Diego 9500 Gilman
Drive La Jolla CA USA 92093
E-mail address njwmathucsdedu
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 19
Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ R[x]^{t×(s+t)} such that P_{t+1}(x)W_t(x) = I_t. Let

P(x) = P_{t+1}(x)P_t(x)P_{t−1}(x) · · · P_1(x);

then P(x)W(x) = I_m. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (here P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). □
In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), its coefficients can be determined by comparing the coefficients of both sides of (5.2); this amounts to solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
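The coefficient comparison can be carried out with a computer algebra system. The sketch below (Python with sympy; the single constraint c1 = 1 − x1² − x2² and the degree-1 ansatz are illustrative assumptions) builds C(x) by stacking the gradient ∇c1 over c1 itself, as in the examples of this section, and solves the linear system coming from L(x)C(x) = I_1.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
a, b, d = sp.symbols('a b d')  # unknown coefficients of the ansatz

c1 = 1 - x1**2 - x2**2
# C(x) stacks the gradient of c1 over c1 itself (an assumption
# matching the structure of the examples in this section)
C = sp.Matrix([sp.diff(c1, x1), sp.diff(c1, x2), c1])
# degree-1 ansatz for the row vector L(x)
L = sp.Matrix([[a*x1, b*x2, d]])

residual = sp.expand((L * C)[0, 0] - 1)
# comparing coefficients of the residual against zero gives a linear system
eqs = sp.Poly(residual, x1, x2).coeffs()
sol = sp.solve(eqs, [a, b, d], dict=True)[0]
# sol gives a = b = -1/2, d = 1, i.e. L(x) = [-x1/2, -x2/2, 1]
```

The same recipe applies for any chosen ansatz degree; only the size of the linear system changes.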
Example 5.3. Consider the hypercube with quadratic constraints

1 − x1² ≥ 0, 1 − x2² ≥ 0, . . . , 1 − xn² ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

L(x) = [ −(1/2)diag(x)   I_n ].
Example 5.4. Consider the nonnegative portion of the unit sphere:

x1 ≥ 0, x2 ≥ 0, . . . , xn ≥ 0, x1² + · · · + xn² − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ I_n − xxᵀ     x·1_nᵀ       2x
         (1/2)xᵀ      −(1/2)1_nᵀ   −1 ].
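The block formula above can be verified symbolically. The sketch below (Python with sympy, for n = 3) assumes C(x) stacks the constraint gradients over diag(c1, . . . , c_{n+1}), consistent with the other examples of this section, and checks L(x)C(x) = I_{n+1}.

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n+1}'))   # column vector (x1, x2, x3)
ones = sp.ones(n, 1)

# constraints: x_i >= 0 (i = 1..n) and x^T x - 1 = 0
cons = list(x) + [(x.T * x)[0, 0] - 1]
grads = sp.Matrix.hstack(*[sp.Matrix([sp.diff(c, xi) for xi in x]) for c in cons])
C = sp.Matrix.vstack(grads, sp.diag(*cons))   # size (n + m) x m, with m = n + 1

# L(x) as displayed in Example 5.4
top = sp.Matrix.hstack(sp.eye(n) - x * x.T, x * ones.T, 2 * x)
bot = sp.Matrix.hstack(x.T / 2, -ones.T / 2, sp.Matrix([[-1]]))
L = sp.Matrix.vstack(top, bot)

product = sp.expand(L * C)   # equals the identity matrix I_{n+1}
```

All cancellations here are exact polynomial identities, so expansion alone suffices; no numerical evaluation is needed.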
Example 5.5. Consider the set

1 − x1³ − x2⁴ ≥ 0, 1 − x3⁴ − x4³ ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

L(x) = [ −x1/3   −x2/4    0       0      1   0
          0       0      −x3/4   −x4/3   0   1 ].
Example 5.6. Consider the quadratic set

1 − x1x2 − x2x3 − x1x3 ≥ 0, 1 − x1² − x2² − x3² ≥ 0.

The matrix L(x)ᵀ satisfying L(x)C(x) = I_2 is

[ 25x1³ + 10x1²x2 + 40x1x2² − 25x1 − 2x3     −25x1³ − 10x1²x2 − 40x1x2² + 49x1² + 2x3
  −15x1²x2 + 10x1x2² + 20x3x1x2 − 10x1        15x1²x2 − 10x1x2² − 20x3x1x2 + 10x1 − x2²
  25x3x1² − 20x1x2² + 10x3x1x2 + 2x1         −25x3x1² + 20x1x2² − 10x3x1x2 − 2x1 − x3²
  1 − 20x1x3 − 10x1² − 20x1x2                 20x1x2 + 20x1x3 + 10x1²
  −50x1² − 20x2x1                             50x1² + 20x2x1 + 1 ].
We would like to remark that a polynomial tuple c = (c1, . . . , cm) is generically nonsingular, and hence Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d1, . . . , dm, there exists an open dense subset U of D = R[x]_{d1} × · · · × R[x]_{dm} such that every tuple c = (c1, . . . , cm) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.
20 JIAWANG NIE
Proof. The proof uses resultants and discriminants, for which we refer to [29].

First, let J1 be the set of all tuples (i1, . . . , i_{n+1}) with 1 ≤ i1 < · · · < i_{n+1} ≤ m. The resultant Res(c_{i1}, . . . , c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i1}, . . . , c_{i_{n+1}} such that if Res(c_{i1}, . . . , c_{i_{n+1}}) ≠ 0, then the equations

c_{i1}(x) = · · · = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

F1(c) = ∏_{(i1,...,i_{n+1}) ∈ J1} Res(c_{i1}, . . . , c_{i_{n+1}}).

For the case m ≤ n, J1 = ∅ and we simply let F1(c) = 1. Clearly, if F1(c) ≠ 0, then no more than n of the polynomials c1, . . . , cm have a common complex zero.

Second, let J2 be the set of all tuples (j1, . . . , jk) with k ≤ n and 1 ≤ j1 < · · · < jk ≤ m. When one of c_{j1}, . . . , c_{jk} has degree bigger than one, the discriminant ∆(c_{j1}, . . . , c_{jk}) is a polynomial in the coefficients of c_{j1}, . . . , c_{jk} such that if ∆(c_{j1}, . . . , c_{jk}) ≠ 0, then the equations

c_{j1}(x) = · · · = c_{jk}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every complex common solution u, the gradients of c_{j1}, . . . , c_{jk} at u are linearly independent. When all of c_{j1}, . . . , c_{jk} have degree one, the discriminant of the tuple (c_{j1}, . . . , c_{jk}) is not a single polynomial, but we can define ∆(c_{j1}, . . . , c_{jk}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

F2(c) = ∏_{(j1,...,jk) ∈ J2} ∆(c_{j1}, . . . , c_{jk}).

Clearly, if F2(c) ≠ 0, then no n or fewer of the polynomials c1, . . . , cm have a singular complex common zero.

Last, let F(c) = F1(c)F2(c) and

U = {c = (c1, . . . , cm) ∈ D : F(c) ≠ 0}.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c1, . . . , cm can have a complex common zero. For any k polynomials (k ≤ n) of c1, . . . , cm, if they have a complex common zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □
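The roles of Res and ∆ in this proof can be illustrated on tiny univariate instances. In the sketch below (Python with sympy; the sample polynomials are illustrative assumptions), a nonvanishing resultant certifies that two polynomials share no complex zero, and a nonvanishing discriminant certifies that a polynomial has only simple zeros; this is a toy version of why F(c) ≠ 0 forces the tuple c to be nonsingular.

```python
import sympy as sp

t = sp.symbols('t')

# Res(p, q) != 0  <=>  p and q have no common complex zero
res_disjoint = sp.resultant(t**2 - 1, t**2 - 4, t)   # nonzero: zero sets are disjoint
res_shared = sp.resultant(t**2 - 1, t - 1, t)        # zero: t = 1 is a common zero

# discriminant(p) != 0  <=>  p has no double (singular) complex zero
disc_simple = sp.discriminant(t**2 - 1, t)           # nonzero: simple zeros +-1
disc_double = sp.discriminant((t - 1)**2, t)         # zero: double zero at t = 1
```

The products F1(c), F2(c) in the proof bundle all such certificates into two polynomials in the coefficients of c.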
6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.
The polynomials pi in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c1, . . . , cm. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L1(x) be the submatrix of L(x) consisting of the first n columns. Then choose (p1, . . . , pm) to be the product L1(x)∇f(x), i.e.,

pi = ( L1(x)∇f(x) )_i.

In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sub-level set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
By Theorem 3.3, we have fk = fmin for all k big enough, if fc = fmin and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem

min   x1x2(10 − x3)
s.t.  x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, 1 − x1 − x2 − x3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x1, x2, x3) with x1x2 = 0 is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 1. Computational results for Example 6.1

            w.o. LME                  with LME
order k     lower bound    time      lower bound    time
2           −0.0521        0.6841    −0.0521        0.1922
3           −0.0026        0.2657    −3·10⁻⁸        0.2285
4           −0.0007        0.6785    −6·10⁻⁹        0.4431
5           −0.0004        1.6105    −2·10⁻⁹        0.9567
² Note that 1 − xᵀx = (1 − eᵀx)(1 + xᵀx) + Σ_{i=1}^n xi(1 − xi)² + Σ_{i≠j} xi²xj ∈ IQ(ceq, cin).
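The polynomial identity in the footnote can be checked symbolically; the sketch below (Python with sympy, taking n = 3 for illustration) expands the right-hand side and confirms it equals 1 − xᵀx.

```python
import sympy as sp

n = 3
x = sp.symbols(f'x1:{n+1}')
ex = sum(x)                      # e^T x
xx = sum(xi**2 for xi in x)      # x^T x

rhs = ((1 - ex) * (1 + xx)
       + sum(xi * (1 - xi)**2 for xi in x)
       + sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j))

identity_holds = sp.expand(rhs - (1 - xx)) == 0
```

Since every summand on the right-hand side is manifestly in the quadratic module generated by the constraints, the identity gives the claimed membership in IQ(ceq, cin).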
Example 6.2. Consider the optimization problem

min   x1⁴x2² + x1²x2⁴ + x3⁶ − 3x1²x2²x3² + (x1⁴ + x2⁴ + x3⁴)
s.t.  x1² + x2² + x3² ≥ 1.

The matrix polynomial L(x) = [ ½x1  ½x2  ½x3  −1 ]. The objective f is the sum of the positive definite form x1⁴ + x2⁴ + x3⁴ and the Motzkin polynomial

M(x) = x1⁴x2² + x1²x2⁴ + x3⁶ − 3x1²x2²x3².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

c1(x)p1(x) = (x1² + x2² + x3² − 1)( 3M(x) + 2(x1⁴ + x2⁴ + x3⁴) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 2. Computational results for Example 6.2

            w.o. LME                    with LME
order k     lower bound      time      lower bound    time
3           −∞               0.4466    0.1111         0.1169
4           −∞               0.4948    0.3333         0.3499
5           −2.1821·10⁵      1.1836    0.3333         0.6530
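The product c1(x)p1(x) used above follows from the construction pi = (L1(x)∇f(x))_i: with L1(x) = [½x1, ½x2, ½x3], Euler's identity for homogeneous polynomials gives p1 = 3M(x) + 2(x1⁴ + x2⁴ + x3⁴). A verifying sketch in Python with sympy:

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')

M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2  # Motzkin polynomial
quartic = x1**4 + x2**4 + x3**4
f = M + quartic

grad = [sp.diff(f, v) for v in (x1, x2, x3)]
# p1 = (1/2) x^T grad f, per the construction with L1(x) = [x1/2, x2/2, x3/2]
p1 = sp.expand(sum(v * g for v, g in zip((x1, x2, x3), grad)) / 2)

# Euler's identity on the degree-6 and degree-4 parts:
euler_check = sp.expand(p1 - (3*M + 2*quartic)) == 0

# value of f at one of the 8 minimizers; simplifies to 1/3
fmin_candidate = f.subs({x1: 1/sp.sqrt(3), x2: 1/sp.sqrt(3), x3: 1/sp.sqrt(3)})
```

At the minimizers, M vanishes and the quartic part contributes exactly 1/3, matching the reported optimum.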
Example 6.3. Consider the optimization problem

min   x1x2 + x2x3 + x3x4 − 3x1x2x3x4 + (x1³ + · · · + x4³)
s.t.  x1, x2, x3, x4 ≥ 0, 1 − x1 − x2 ≥ 0, 1 − x3 − x4 ≥ 0.

The matrix polynomial L(x) is

[ 1−x1   −x2     0       0      1  1  0  0  1  0
  −x1    1−x2    0       0      1  1  0  0  1  0
  0      0       1−x3    −x4    0  0  1  1  0  1
  0      0       −x3     1−x4   0  0  1  1  0  1
  −x1    −x2     0       0      1  1  0  0  1  0
  0      0       −x3     −x4    0  0  1  1  0  1 ].

The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(ceq, cin) is archimedean.⁴ The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 3.
³ This is because −c1²p1² ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c1(x)²p1(x)² ≥ 0} is compact.
⁴ This is because 1 − x1² − x2² belongs to the quadratic module of (x1, x2, 1 − x1 − x2), and 1 − x3² − x4² belongs to the quadratic module of (x3, x4, 1 − x3 − x4). See the footnote in Example 6.1.
Table 3. Computational results for Example 6.3

            w.o. LME                    with LME
order k     lower bound     time       lower bound    time
3           −2.9·10⁻⁵       0.7335     −6·10⁻⁷        0.6091
4           −1.4·10⁻⁵       2.5055     −8·10⁻⁸        2.7423
5           −1.4·10⁻⁵       12.7092    −5·10⁻⁸        13.7449
Example 6.4. Consider the polynomial optimization problem

min_{x∈R²}   x1² + 50x2²
s.t.  x1² − 1/2 ≥ 0, x2² − 2x1x2 − 1/8 ≥ 0, x2² + 2x1x2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The first column of L(x) is

( (8x1³)/5 + x1/5,
  (288x2x1⁴)/5 − (16x1³)/5 − (124x2x1²)/5 + (8x1)/5 − 2x2,
  −(288x2x1⁴)/5 − (16x1³)/5 + (124x2x1²)/5 + (8x1)/5 + 2x2 )

and the second column of L(x) is

( −(8x1²x2)/5 + (4x2³)/5 − x2/10,
  (288x1³x2²)/5 + (16x1²x2)/5 − (142x1x2²)/5 − (9x1)/20 − (8x2³)/5 + (11x2)/5,
  −(288x1³x2²)/5 + (16x1²x2)/5 + (142x1x2²)/5 + (9x1)/20 − (8x2³)/5 + (11x2)/5 ).

The objective is coercive, so fmin is achieved at a critical point. The minimum value fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

            w.o. LME                  with LME
order k     lower bound    time      lower bound    time
3           6.7535         0.4611    56.7500        0.1309
4           6.9294         0.2428    112.6517       0.2405
5           8.8519         0.3376    112.6517       0.2167
6           16.5971        0.4703    112.6517       0.3788
7           35.4756        0.6536    112.6517       0.4537
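The closed-form optimum fmin = 56 + 3/4 + 25√5 can be confirmed by direct evaluation at one of the reported minimizers; a numeric sketch in Python:

```python
import math

def f(x1, x2):
    return x1**2 + 50*x2**2

def cons(x1, x2):
    # the three inequality constraints of Example 6.4
    return (x1**2 - 0.5, x2**2 - 2*x1*x2 - 0.125, x2**2 + 2*x1*x2 - 0.125)

x1 = math.sqrt(0.5)
x2 = math.sqrt(5/8) + math.sqrt(0.5)      # one of the four reported minimizers
fmin = 56 + 0.75 + 25*math.sqrt(5)        # closed-form value, about 112.6517

value = f(x1, x2)
feasible = all(c >= -1e-9 for c in cons(x1, x2))
```

At this point, the first and second constraints are active (zero up to rounding), which is consistent with the point being a constrained critical point.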
Example 6.5. Consider the optimization problem

min_{x∈R³}   x1³ + x2³ + x3³ + 4x1x2x3 − ( x1(x2² + x3²) + x2(x3² + x1²) + x3(x1² + x2²) )
s.t.  x1 ≥ 0, x1x2 − 1 ≥ 0, x2x3 − 1 ≥ 0.

The matrix polynomial L(x) is

[ 1−x1x2   0    0   x2   x2   0
  x1       0    0   −1   −1   0
  −x1      x2   0   1    0    −1 ].
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R³₊, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

            w.o. LME                    with LME
order k     lower bound      time      lower bound    time
2           −∞               0.4129    −∞             0.1900
3           −7.8184·10⁶      0.4641    0.9492         0.3139
4           −2.0575·10⁴      0.6499    0.9492         0.5057
Example 6.6. Consider the optimization problem (x0 = 1)

min_{x∈R⁴}   xᵀx + Σ_{i=0}^4 Π_{j≠i} (xi − xj)
s.t.  x1² − 1 ≥ 0, x2² − 1 ≥ 0, x3² − 1 ≥ 0, x4² − 1 ≥ 0.

The matrix polynomial L(x) = [ ½diag(x)  −I₄ ]. The first part of the objective is xᵀx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

(1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
(1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
(−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6

            w.o. LME                    with LME
order k     lower bound      time       lower bound    time
3           −∞               1.1377     3.5480         1.1765
4           −6.6913·10⁴      4.7677     4.0000         3.0761
5           −21.3778         22.9970    4.0000         10.3354
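With x0 = 1, the second part of the objective vanishes at every listed minimizer, since each product Π_{j≠i}(xi − xj) contains a zero factor, leaving f = xᵀx = 4. A short numeric sketch (the point (−1, −1, −1, −1), where f = 20, illustrates why it is absent from the list):

```python
def f(x):
    xs = (1.0,) + tuple(x)           # prepend x0 = 1
    quad = sum(v*v for v in xs[1:])  # x^T x over the decision variables
    sym = 0.0
    for i in range(5):
        term = 1.0
        for j in range(5):
            if j != i:
                term *= xs[i] - xs[j]
        sym += term
    return quad + sym

minimizers = [(1, 1, 1, 1), (1, -1, -1, 1), (1, -1, 1, -1), (1, 1, -1, -1),
              (1, -1, -1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1), (-1, 1, 1, -1),
              (-1, -1, -1, 1), (-1, -1, 1, -1), (-1, 1, -1, -1)]
values = [f(m) for m in minimizers]          # all equal 4.0
excluded = f((-1, -1, -1, -1))               # 20.0: not a minimizer
```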
Example 6.7. Consider the optimization problem

min_{x∈R³}   x1⁴x2² + x2⁴x3² + x3⁴x1² − 3x1²x2²x3² + x2²
s.t.  x1 − x2x3 ≥ 0, −x2 + x3² ≥ 0.

The matrix polynomial

L(x) = [ 1     0    0  0  0
         −x3   −1   0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are the points (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0. The computational results for standard Lasserre's
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7

            w.o. LME                    with LME
order k     lower bound      time      lower bound    time
3           −∞               0.6144    −∞             0.3418
4           −1.0909·10⁷      1.0542    −3.9476        0.7180
5           −942.6772        1.6771    −3·10⁻⁹        1.4607
6           −0.0110          3.3532    −8·10⁻¹⁰       3.1618
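The AM-GM step is: x1⁴x2² + x2⁴x3² + x3⁴x1² ≥ 3(x1⁴x2² · x2⁴x3² · x3⁴x1²)^{1/3} = 3x1²x2²x3², so f ≥ x2² ≥ 0, with equality exactly when x2 = 0 and x3⁴x1² = 0. A random spot-check (illustration only, not a proof):

```python
import random

def f(x1, x2, x3):
    return (x1**4*x2**2 + x2**4*x3**2 + x3**4*x1**2
            - 3*x1**2*x2**2*x3**2 + x2**2)

random.seed(0)
samples = [tuple(random.uniform(-2, 2) for _ in range(3)) for _ in range(1000)]
min_sampled = min(f(*s) for s in samples)     # nonnegative, by AM-GM
zero_value = f(0.7, 0.0, 0.0)                 # a minimizer of the form (x1, 0, x3)
```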
Example 6.8. Consider the optimization problem

min_{x∈R⁴}   x1²(x1 − x4)² + x2²(x2 − x4)² + x3²(x3 − x4)²
             + 2x1x2x3(x1 + x2 + x3 − 2x4) + (x1 − 1)² + (x2 − 1)² + (x3 − 1)²
s.t.  x1 − x2 ≥ 0, x2 − x3 ≥ 0.

The matrix polynomial

L(x) = [ 1  0  0  0  0  0
         1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

            w.o. LME                     with LME
order k     lower bound      time       lower bound    time
2           −∞               0.3984     −0.3360        0.9321
3           −∞               0.7634     0.9413         0.5240
4           −6.4896·10⁵      4.5496     0.9413         1.7192
5           −3.1645·10³      24.3665    0.9413         8.1228
Example 6.9. Consider the optimization problem

min_{x∈R⁴}   (x1 + x2 + x3 + x4 + 1)² − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
s.t.  0 ≤ x1, . . . , x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.
⁵ Note that 4 − Σ_{i=1}^4 xi² = Σ_{i=1}^4 ( xi(1 − xi)² + (1 − xi)(1 + xi²) ) ∈ IQ(ceq, cin).
Table 9. Computational results for Example 6.9

          w.o. LME                   with LME
order     lower bound    time       lower bound    time
2         −0.0279        0.2262     −5·10⁻⁶        1.1835
3         −0.0005        0.4691     −6·10⁻⁷        1.6566
4         −0.0001        3.1098     −2·10⁻⁷        5.5234
5         −4·10⁻⁵        16.5092    −6·10⁻⁷        19.7320
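That every point (t, 0, 0, 1 − t) with t ∈ [0, 1] is a global minimizer can be seen symbolically: substituting it into the objective gives (t + (1 − t) + 1)² − 4((1 − t) + t) = 4 − 4 = 0 identically in t. A sketch with sympy:

```python
import sympy as sp

x1, x2, x3, x4, t = sp.symbols('x1 x2 x3 x4 t')

f = (x1 + x2 + x3 + x4 + 1)**2 - 4*(x1*x2 + x2*x3 + x3*x4 + x4 + x1)
on_segment = f.subs({x1: t, x2: 0, x3: 0, x4: 1 - t})
value = sp.expand(on_segment)     # identically 0 in t
```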
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x0 = 1)

min_{x∈Rⁿ}   Σ_{0≤i≤j≤k≤n} c_{ijk} xi xj xk
s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time differs a bit. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

n           9        10       11       12        13        14
w.o. LME    1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
with LME    1.9714   3.8288   8.2519   20.0310   37.6373   82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, . . . , gℓ). Its preordering is the set

Preord(cin, ψ) = Σ_{r1,...,rℓ ∈ {0,1}} g1^{r1} · · · gℓ^{rℓ} · Σ[x].
The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)   f'^{pre}_k = min ⟨f, y⟩
        s.t.  ⟨1, y⟩ = 1, L^{(k)}_{ceq}(y) = 0, L^{(k)}_{φ}(y) = 0,
              L^{(k)}_{g1^{r1}···gℓ^{rℓ}}(y) ⪰ 0 for all r1, . . . , rℓ ∈ {0, 1},
              y ∈ R^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)   f^{pre}_k = max γ
        s.t.  f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.
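The difference from the quadratic module lies in the number of SOS multipliers: Qmod(g1, . . . , gℓ) uses ℓ + 1 generators, while the preordering uses one for each of the 2^ℓ products g1^{r1} · · · gℓ^{rℓ}. A sketch enumerating these generators for ℓ = 2 (the constraints g1, g2 here are illustrative assumptions):

```python
from itertools import product
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
g = [1 - x1**2, 1 - x2**2]        # hypothetical constraint tuple (g1, g2)

# generators multiplying the SOS terms in Preord(g1, g2): one per sign pattern r
preord_gens = []
for r in product([0, 1], repeat=len(g)):
    term = sp.Integer(1)
    for gi, ri in zip(g, r):
        term *= gi**ri
    preord_gens.append(sp.expand(term))
# -> [1, 1 - x2**2, 1 - x1**2, expand((1 - x1**2)*(1 - x2**2))]

qmod_gens = [sp.Integer(1)] + g   # the quadratic module uses only l + 1 generators
```

This exponential growth in generators is the price of the stronger convergence property stated in Theorem 7.1 below.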
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

f^{pre}_k = f'^{pre}_k = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f^{pre}_k = f'^{pre}_k = fmin for all k big enough.
Proof. The proof is very similar to Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

K3 = {x ∈ Rⁿ | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0}.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same. □
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x, for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each of these applications, then a degree bound for L(x) can eventually be obtained. However, the bound so obtained is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

λ = ( C(x)ᵀC(x) )⁻¹ C(x)ᵀ [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References
[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal On Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pages 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pages 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA, 92093
E-mail address: njw@math.ucsd.edu
20 JIAWANG NIE
Proof. The proof uses resultants and discriminants, for which we refer to [29].

First, let $J_1$ be the set of all tuples $(i_1, \ldots, i_{n+1})$ with $1 \le i_1 < \cdots < i_{n+1} \le m$. The resultant $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that if $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \ne 0$, then the equations
$$ c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0 $$
have no complex solutions. Define
$$ F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}). $$
For the case $m \le n$, $J_1 = \emptyset$ and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \ne 0$, then no more than $n$ of the polynomials $c_1, \ldots, c_m$ have a common complex zero.

Second, let $J_2$ be the set of all tuples $(j_1, \ldots, j_k)$ with $k \le n$ and $1 \le j_1 < \cdots < j_k \le m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that if $\Delta(c_{j_1}, \ldots, c_{j_k}) \ne 0$, then the equations
$$ c_{j_1}(x) = \cdots = c_{j_k}(x) = 0 $$
have no singular complex solution [29, Section 3], i.e., at every common complex solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximal minors of its Jacobian (a constant matrix). Define
$$ F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}). $$
Clearly, if $F_2(c) \ne 0$, then no $n$ or fewer of the polynomials $c_1, \ldots, c_m$ have a singular common complex zero.

Last, let $F(c) = F_1(c)F_2(c)$ and
$$ U = \{ c = (c_1, \ldots, c_m) \in D : F(c) \ne 0 \}. $$
Note that $U$ is a Zariski open subset of $D$, and it is open dense in $D$. For all $c \in U$, no more than $n$ of $c_1, \ldots, c_m$ can have a common complex zero. For any $k$ polynomials ($k \le n$) of $c_1, \ldots, c_m$, if they have a common complex zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in U$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in U$, so it holds generically.
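For intuition about the two quantities used in this proof, here is a small sympy check (an illustration added for this presentation, not part of the proof): the resultant of two univariate polynomials vanishes exactly when they share a complex root, and the discriminant vanishes exactly when a polynomial has a repeated root.

```python
import sympy as sp

x = sp.Symbol('x')

# Res(f, g) = 0 exactly when f and g have a common (complex) root.
f = x**2 - 1                              # roots: 1, -1
assert sp.resultant(f, x - 1, x) == 0     # shares the root x = 1
assert sp.resultant(f, x - 3, x) != 0     # no common root

# The discriminant detects singular (repeated) roots.
assert sp.discriminant((x - 1)**2, x) == 0   # double root at x = 1
assert sp.discriminant(x**2 - 1, x) == 4     # two simple roots
```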
6. Numerical examples
This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for computational results.
The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of the first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x)\nabla f(x)$, i.e.,
$$ p_i = \big( L_1(x)\nabla f(x) \big)_i. $$
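As a concrete sketch of this construction (illustrative code, not from the paper), take the single constraint $c_1 = x_1^2 + x_2^2 + x_3^2 - 1$ of Example 6.2 below: one can verify symbolically that $L(x)C(x) = I$ and that $p_1 = (L_1(x)\nabla f(x))_1 = 3M(x) + 2(x_1^4 + x_2^4 + x_3^4)$, by Euler's identity applied to the homogeneous pieces of $f$.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')

# Example 6.2 has the single constraint c1 = x1^2 + x2^2 + x3^2 - 1,
# so C(x) stacks the gradient of c1 on top of c1 itself.
c1 = x1**2 + x2**2 + x3**2 - 1
C = sp.Matrix([2*x1, 2*x2, 2*x3, c1])          # 4 x 1
L = sp.Matrix([[x1/2, x2/2, x3/2, -1]])        # 1 x 4, satisfies L*C = I_1
assert (L*C)[0, 0].expand() == 1

M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2
f = M + x1**4 + x2**4 + x3**4
grad = sp.Matrix([f.diff(v) for v in (x1, x2, x3)])
L1 = L[:, :3]                                  # first n = 3 columns of L
p1 = (L1*grad)[0, 0].expand()
# Euler's identity (x . grad M = 6M, x . grad Q = 4Q) gives p1 = 3M + 2Q
assert p1 == sp.expand(3*M + 2*(x1**4 + x2**4 + x3**4))
```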
In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sub-level set $\{f(x) \le \ell\}$ is compact for all $\ell$) and the constraint qualification condition holds.
By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables. The computational time (in seconds) is also compared. The results for standard Lasserre relaxations are titled "w/o LME" and those for the new relaxations (3.8)-(3.9) are titled "with LME".
Example 6.1. Consider the optimization problem
$$ \begin{array}{rl} \min & x_1x_2(10 - x_3) \\ \mathrm{s.t.} & x_1 \ge 0, \; x_2 \ge 0, \; x_3 \ge 0, \; 1 - x_1 - x_2 - x_3 \ge 0. \end{array} $$
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied (see footnote 2). Each feasible point $(x_1, x_2, x_3)$ with $x_1x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.
Table 1. Computational results for Example 6.1

order k | w/o LME: lower bound, time | with LME: lower bound, time
   2    |  -0.0521,  0.6841          |  -0.0521,     0.1922
   3    |  -0.0026,  0.2657          |  -3·10^{-8},  0.2285
   4    |  -0.0007,  0.6785          |  -6·10^{-9},  0.4431
   5    |  -0.0004,  1.6105          |  -2·10^{-9},  0.9567
Footnote 2. Note that $1 - x^Tx = (1 - e^Tx)(1 + x^Tx) + \sum_{i=1}^n x_i(1 - x_i)^2 + \sum_{i \ne j} x_i^2 x_j \in IQ(c_{eq}, c_{in})$.
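Since both sides of this footnote identity are polynomials of degree 3, the identity (for $n = 3$, say) can be verified exactly by evaluating on a sufficiently large integer grid. A quick check, added here as an illustration (not from the paper):

```python
from itertools import product

def lhs(x):                       # 1 - x^T x
    return 1 - sum(v*v for v in x)

def rhs(x):                       # the footnote's decomposition
    n = len(x)
    s = (1 - sum(x)) * (1 + sum(v*v for v in x))
    s += sum(v*(1 - v)**2 for v in x)
    s += sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j)
    return s

# Both sides have degree 3 in each variable, so agreement on a 7-point
# grid per variable proves the identity for n = 3.
for x in product(range(-3, 4), repeat=3):
    assert lhs(x) == rhs(x)
```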
Example 6.2. Consider the optimization problem
$$ \begin{array}{rl} \min & x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\ \mathrm{s.t.} & x_1^2 + x_2^2 + x_3^2 \ge 1. \end{array} $$
The matrix polynomial is $L(x) = \begin{bmatrix} \frac{1}{2}x_1 & \frac{1}{2}x_2 & \frac{1}{2}x_3 & -1 \end{bmatrix}$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
$$ M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2. $$
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive and $f_{min}$ is achieved at a critical point. The set $IQ(\phi, \psi)$ is archimedean, because
$$ c_1(x)p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big(3M(x) + 2(x_1^4 + x_2^4 + x_3^4)\big) = 0 $$
defines a compact set. So the condition i) of Theorem 3.3 is satisfied (see footnote 3). The minimum value $f_{min} = \frac{1}{3}$, and there are 8 minimizers $(\pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}})$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.
Table 2. Computational results for Example 6.2

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    |  -∞,            0.4466     |  0.1111,  0.1169
   4    |  -∞,            0.4948     |  0.3333,  0.3499
   5    |  -2.1821·10^5,  1.1836     |  0.3333,  0.6530
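A quick numeric check (an illustration, not from the paper) confirms the value $\frac{1}{3}$ at the eight reported minimizers, all of which lie on the constraint boundary:

```python
import math

def f(x1, x2, x3):
    M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2
    return M + x1**4 + x2**4 + x3**4

t = 1.0 / math.sqrt(3.0)
for s1 in (t, -t):
    for s2 in (t, -t):
        for s3 in (t, -t):
            assert abs(f(s1, s2, s3) - 1.0/3.0) < 1e-12   # f = 1/3
            # the constraint x1^2 + x2^2 + x3^2 >= 1 holds with equality
            assert abs(s1**2 + s2**2 + s3**2 - 1.0) < 1e-12
```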
Example 6.3. Consider the optimization problem
$$ \begin{array}{rl} \min & x_1x_2 + x_2x_3 + x_3x_4 - 3x_1x_2x_3x_4 + (x_1^3 + \cdots + x_4^3) \\ \mathrm{s.t.} & x_1, x_2, x_3, x_4 \ge 0, \; 1 - x_1 - x_2 \ge 0, \; 1 - x_3 - x_4 \ge 0. \end{array} $$
The matrix polynomial $L(x)$ is
$$ \begin{bmatrix}
1-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1-x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1-x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1-x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}. $$
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $IQ(c_{eq}, c_{in})$ is archimedean (see footnote 4). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.
Footnote 3. This is because $-c_1^2p_1^2 \in \mathrm{Ideal}(\phi) \subseteq IQ(\phi, \psi)$ and the set $\{-c_1(x)^2p_1(x)^2 \ge 0\}$ is compact.
Footnote 4. This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$ and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$; see footnote 2 in Example 6.1.
Table 3. Computational results for Example 6.3

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    |  -2.9·10^{-5},   0.7335    |  -6·10^{-7},   0.6091
   4    |  -1.4·10^{-5},   2.5055    |  -8·10^{-8},   2.7423
   5    |  -1.4·10^{-5},  12.7092    |  -5·10^{-8},  13.7449
Example 6.4. Consider the polynomial optimization problem
$$ \begin{array}{rl} \min_{x \in \mathbb{R}^2} & x_1^2 + 50x_2^2 \\ \mathrm{s.t.} & x_1^2 - \frac{1}{2} \ge 0, \; x_2^2 - 2x_1x_2 - \frac{1}{8} \ge 0, \; x_2^2 + 2x_1x_2 - \frac{1}{8} \ge 0. \end{array} $$
It is motivated by an example in [12, §3]. The first column of $L(x)$ is
$$ \begin{bmatrix} \frac{8x_1^3}{5} + \frac{x_1}{5} \\[4pt] \frac{288x_2x_1^4}{5} - \frac{16x_1^3}{5} - \frac{124x_2x_1^2}{5} + \frac{8x_1}{5} - 2x_2 \\[4pt] -\frac{288x_2x_1^4}{5} - \frac{16x_1^3}{5} + \frac{124x_2x_1^2}{5} + \frac{8x_1}{5} + 2x_2 \end{bmatrix} $$
and the second column of $L(x)$ is
$$ \begin{bmatrix} -\frac{8x_1^2x_2}{5} + \frac{4x_2^3}{5} - \frac{x_2}{10} \\[4pt] \frac{288x_1^3x_2^2}{5} + \frac{16x_1^2x_2}{5} - \frac{142x_1x_2^2}{5} - \frac{9x_1}{20} - \frac{8x_2^3}{5} + \frac{11x_2}{5} \\[4pt] -\frac{288x_1^3x_2^2}{5} + \frac{16x_1^2x_2}{5} + \frac{142x_1x_2^2}{5} + \frac{9x_1}{20} - \frac{8x_2^3}{5} + \frac{11x_2}{5} \end{bmatrix}. $$
The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value $f_{min} = 56 + \frac{3}{4} + 25\sqrt{5} \approx 112.6517$, and the minimizers are $\big(\pm\sqrt{1/2}, \; \pm(\sqrt{5/8} + \sqrt{1/2})\big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.
Table 4. Computational results for Example 6.4

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    |   6.7535,  0.4611          |   56.7500,  0.1309
   4    |   6.9294,  0.2428          |  112.6517,  0.2405
   5    |   8.8519,  0.3376          |  112.6517,  0.2167
   6    |  16.5971,  0.4703          |  112.6517,  0.3788
   7    |  35.4756,  0.6536          |  112.6517,  0.4537
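The closed form of $f_{min}$ and the activity pattern of the constraints at a minimizer can be checked numerically (a sketch added for illustration, not from the paper); the first two constraints are active, and the third equals $2 + \sqrt{5} > 0$:

```python
import math

def f(x1, x2):
    return x1**2 + 50*x2**2

x1 = math.sqrt(0.5)
x2 = math.sqrt(5.0/8.0) + math.sqrt(0.5)
val = f(x1, x2)
closed = 56 + 3.0/4.0 + 25*math.sqrt(5.0)
assert abs(val - closed) < 1e-9          # f_min = 56 + 3/4 + 25*sqrt(5)

# first two constraints are active at the minimizer, the third is slack
assert abs(x1**2 - 0.5) < 1e-12
assert abs(x2**2 - 2*x1*x2 - 1.0/8.0) < 1e-9
assert x2**2 + 2*x1*x2 - 1.0/8.0 > 1.0   # equals 2 + sqrt(5)
```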
Example 6.5. Consider the optimization problem
$$ \begin{array}{rl} \min_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 - \big(x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2)\big) \\ \mathrm{s.t.} & x_1 \ge 0, \; x_1x_2 - 1 \ge 0, \; x_2x_3 - 1 \ge 0. \end{array} $$
The matrix polynomial $L(x)$ is
$$ \begin{bmatrix} 1 - x_1x_2 & 0 & 0 & x_2 & x_2 & 0 \\ x_1 & 0 & 0 & -1 & -1 & 0 \\ -x_1 & x_2 & 0 & 1 & 0 & -1 \end{bmatrix}. $$
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.
Table 5. Computational results for Example 6.5

order k | w/o LME: lower bound, time | with LME: lower bound, time
   2    |  -∞,            0.4129     |  -∞,      0.1900
   3    |  -7.8184·10^6,  0.4641     |  0.9492,  0.3139
   4    |  -2.0575·10^4,  0.6499     |  0.9492,  0.5057
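The matrix $L(x)$ of Example 6.5 satisfies $L(x)C(x) = I_3$, where $C(x)$ stacks the constraint gradients on top of $\mathrm{diag}(c_1, c_2, c_3)$. This can be checked symbolically; a sympy sketch (an illustration, not from the paper):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
c = [x1, x1*x2 - 1, x2*x3 - 1]                 # the three constraints
grads = sp.Matrix([[1, x2, 0],
                   [0, x1, x3],
                   [0, 0, x2]])                # column j holds grad c_j
C = grads.col_join(sp.diag(*c))                # C(x) = [grads; diag(c)], 6 x 3
L = sp.Matrix([[1 - x1*x2, 0, 0, x2, x2,  0],
               [x1,        0, 0, -1, -1,  0],
               [-x1,      x2, 0,  1,  0, -1]])
assert (L*C).expand() == sp.eye(3)             # L(x) C(x) = I_3
```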
Example 6.6. Consider the optimization problem (with $x_0 = 1$)
$$ \begin{array}{rl} \min_{x \in \mathbb{R}^4} & x^Tx + \sum_{i=0}^4 \prod_{j \ne i}(x_i - x_j) \\ \mathrm{s.t.} & x_1^2 - 1 \ge 0, \; x_2^2 - 1 \ge 0, \; x_3^2 - 1 \ge 0, \; x_4^2 - 1 \ge 0. \end{array} $$
The matrix polynomial is $L(x) = \begin{bmatrix} \frac{1}{2}\mathrm{diag}(x) & -I_4 \end{bmatrix}$. The first part of the objective is $x^Tx$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and 11 global minimizers:
$$ (1, 1, 1, 1), \; (1, -1, -1, 1), \; (1, -1, 1, -1), \; (1, 1, -1, -1), $$
$$ (1, -1, -1, -1), \; (-1, -1, 1, 1), \; (-1, 1, -1, 1), \; (-1, 1, 1, -1), $$
$$ (-1, -1, -1, 1), \; (-1, -1, 1, -1), \; (-1, 1, -1, -1). $$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.
Table 6. Computational results for Example 6.6

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    |  -∞,             1.1377    |  3.5480,   1.1765
   4    |  -6.6913·10^4,   4.7677    |  4.0000,   3.0761
   5    |  -21.3778,      22.9970    |  4.0000,  10.3354
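Enumerating all sign vectors in $\{\pm 1\}^4$ (an illustration, not from the paper) recovers exactly the 11 listed minimizers: the five remaining sign vectors give $f = 20$, since one of the products $\prod_{j \ne i}(x_i - x_j)$ survives with value $2^4 = 16$.

```python
from itertools import product

def f(x):
    xs = [1.0] + list(x)                  # x0 = 1
    val = sum(v*v for v in x)             # x^T x = 4 on sign vectors
    for i in range(5):
        p = 1.0
        for j in range(5):
            if j != i:
                p *= xs[i] - xs[j]
        val += p
    return val

mins = [x for x in product([1.0, -1.0], repeat=4) if f(x) == 4.0]
assert len(mins) == 11                    # exactly the 11 listed minimizers
assert (1.0, 1.0, 1.0, 1.0) in mins
assert (-1.0, -1.0, -1.0, -1.0) not in mins   # f = 20 there
```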
Example 6.7. Consider the optimization problem
$$ \begin{array}{rl} \min_{x \in \mathbb{R}^3} & x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 - 3x_1^2x_2^2x_3^2 + x_2^2 \\ \mathrm{s.t.} & x_1 - x_2x_3 \ge 0, \; -x_2 + x_3^2 \ge 0. \end{array} $$
The matrix polynomial is $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1x_3 = 0$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \ge 5$, up to numerical round-off errors.
Table 7. Computational results for Example 6.7

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    |  -∞,             0.6144    |  -∞,           0.3418
   4    |  -1.0909·10^7,   1.0542    |  -3.9476,      0.7180
   5    |  -942.6772,      1.6771    |  -3·10^{-9},   1.4607
   6    |  -0.0110,        3.3532    |  -8·10^{-10},  3.1618
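The AM-GM argument says $x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 \ge 3x_1^2x_2^2x_3^2$, so $f \ge x_2^2 \ge 0$, with $f = 0$ on the claimed minimizers. A sampling check (an illustration, not from the paper):

```python
import random

def f(x1, x2, x3):
    return (x1**4*x2**2 + x2**4*x3**2 + x3**4*x1**2
            - 3*x1**2*x2**2*x3**2 + x2**2)

# claimed minimizers (x1, 0, x3) with x1 >= 0 and x1*x3 = 0 give f = 0
assert f(0.7, 0.0, 0.0) == 0.0
assert f(0.0, 0.0, 1.3) == 0.0

# AM-GM lower bound: f >= 0 everywhere (tolerance absorbs round-off)
random.seed(0)
for _ in range(2000):
    p = [random.uniform(-2.0, 2.0) for _ in range(3)]
    assert f(*p) > -1e-9
```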
Example 6.8. Consider the optimization problem
$$ \begin{array}{rl} \min_{x \in \mathbb{R}^4} & x_1^2(x_1 - x_4)^2 + x_2^2(x_2 - x_4)^2 + x_3^2(x_3 - x_4)^2 \\ & + \, 2x_1x_2x_3(x_1 + x_2 + x_3 - 2x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\ \mathrm{s.t.} & x_1 - x_2 \ge 0, \; x_2 - x_3 \ge 0. \end{array} $$
The matrix polynomial is $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$ and a minimizer
$$ (0.5632, 0.5632, 0.5632, 0.7510). $$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

order k | w/o LME: lower bound, time | with LME: lower bound, time
   2    |  -∞,             0.3984    |  -0.3360,  0.9321
   3    |  -∞,             0.7634    |   0.9413,  0.5240
   4    |  -6.4896·10^5,   4.5496    |   0.9413,  1.7192
   5    |  -3.1645·10^3,  24.3665    |   0.9413,  8.1228
Example 6.9. Consider the optimization problem
$$ \begin{array}{rl} \min_{x \in \mathbb{R}^4} & (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1) \\ \mathrm{s.t.} & 0 \le x_1, \ldots, x_4 \le 1. \end{array} $$
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied (see footnote 5). The minimum value $f_{min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1-t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.
Footnote 5. Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big( x_i(1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in IQ(c_{eq}, c_{in})$.
Table 9. Computational results for Example 6.9

order k | w/o LME: lower bound, time | with LME: lower bound, time
   2    |  -0.0279,     0.2262       |  -5·10^{-6},   1.1835
   3    |  -0.0005,     0.4691       |  -6·10^{-7},   1.6566
   4    |  -0.0001,     3.1098       |  -2·10^{-7},   5.5234
   5    |  -4·10^{-5}, 16.5092       |  -6·10^{-7},  19.7320
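Both claims of Example 6.9 are easy to check directly (an illustration, not from the paper): the segment of minimizers $(t, 0, 0, 1-t)$ gives $f = 2^2 - 4(t + (1-t)) = 0$, and the footnote identity is a degree-3 polynomial identity, so exact agreement on an integer grid proves it.

```python
from itertools import product

def f(x1, x2, x3, x4):
    return ((x1 + x2 + x3 + x4 + 1)**2
            - 4*(x1*x2 + x2*x3 + x3*x4 + x4 + x1))

# (t, 0, 0, 1-t) is a global minimizer for every t in [0, 1]
for k in range(11):
    t = k / 10.0
    assert abs(f(t, 0.0, 0.0, 1.0 - t)) < 1e-12

# footnote identity, checked exactly on an integer grid (degree 3, n = 4)
for x in product(range(-2, 3), repeat=4):
    lhs = 4 - sum(v*v for v in x)
    rhs = sum(v*(1 - v)**2 + (1 - v)*(1 + v*v) for v in x)
    assert lhs == rhs
```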
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (with $x_0 = 1$)
$$ \begin{array}{rl} \min_{x \in \mathbb{R}^n} & \sum_{0 \le i \le j \le k \le n} c_{ijk}\, x_ix_jx_k \\ \mathrm{s.t.} & 0 \le x \le 1, \end{array} $$
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{min}$. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their times are a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

n         |   9     |  10     |  11     |  12      |  13      |  14
w/o LME   |  1.2569 |  2.5619 |  6.3085 |  15.8722 |  35.1675 |  78.4111
with LME  |  1.9714 |  3.8288 |  8.2519 |  20.0310 |  37.6373 |  82.4778
7. Discussions
7.1. Tight relaxations using preorderings. When the global minimum value $f_{min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
$$ IQ(c_{eq}, c_{in})_{2k} + IQ(\phi, \psi)_{2k} = \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(c_{in}, \psi)_{2k}. $$
If we replace the quadratic module $\mathrm{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
$$ \mathrm{Preord}(c_{in}, \psi) = \Big\{ \sum_{r_1, \ldots, r_\ell \in \{0,1\}} g_1^{r_1} \cdots g_\ell^{r_\ell}\, \sigma_r \; : \; \sigma_r \in \Sigma[x] \Big\}. $$
The truncation $\mathrm{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\mathrm{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is

(7.1)   $f'_{pre,k} := \min \ \langle f, y \rangle$  s.t.  $\langle 1, y \rangle = 1$, $L^{(k)}_{c_{eq}}(y) = 0$, $L^{(k)}_{\phi}(y) = 0$, $L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0$ for all $r_1, \ldots, r_\ell \in \{0, 1\}$, $y \in \mathbb{R}^{\mathbb{N}^n_{2k}}$.

Similar to (3.9), the dual optimization problem of the above is

(7.2)   $f_{pre,k} := \max \ \gamma$  s.t.  $f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}$.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of its conditions i)-iii) is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
$$ f_{pre,k} = f'_{pre,k} = f_c $$
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{min}$ for all $k$ big enough.
Proof. The proof is very similar to Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
$$ K_3 = \{ x \in \mathbb{R}^n \; | \; c_{eq}(x) = 0, \; \phi(x) = 0, \; c_{in}(x) \ge 0, \; \psi(x) \ge 0 \}. $$
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \mathrm{Preord}(c_{in}, \psi)$ such that $f^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same.
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such an $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for the critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for $L(x)$. For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. For each of its usages, if the degree bound in [16] is applied, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
$$ \lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} $$
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput. 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC].
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim. 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: A semidefinite programming approach. Mathematical Programming 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc. 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim. 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim. 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: Selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim. 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J. 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim. 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: A guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim. 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093
E-mail address: njw@math.ucsd.edu
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 21
The polynomials pi in Assumption 31 are constructed as follows Order theconstraining polynomials as c1 cm First find a matrix polynomial L(x) satis-fying (44) or (52) Let L1(x) be the submatrix of L(x) consisting of the first ncolumns Then choose (p1 pm) to be the product L1(x)nablaf(x) ie
pi =(
L1(x)nablaf(x))
i
In all our examples the global minimum value fmin of (11) is achieved at a criticalpoint This is the case if the feasible set is compact or if f is coercive (iethe sub-level set f(x) le ℓ is compact for all ℓ) and the constraint qualification conditionholds
By Theorem 33 we have fk = fmin for all k big enough if fc = fmin and anyoneof its conditions i)-iii) holds Typically it might be inconvenient to check theseconditions However in computation we do not need to check them at all Indeedthe condition (317) is more convenient for usage When there are finitely manyglobal minimizers Theorem 34 proved that (317) is an appropriate criteria fordetecting convergence It is satisfied for all our examples except Examples 61 67and 69 (they have infinitely many minimizers)
We compare the new relaxations (38)-(39) with standard Lasserre relaxationsin [17] The lower bounds given by relaxations in [17] (without using Lagrangemultiplier expressions) and the lower bounds given by (38)-(39) (using Lagrangemultiplier expressions) are shown in the tables The computational time (in sec-onds) is also compared The results for standard Lasserre relaxations are titledldquowo LMErdquo and those for the new relaxations (38)-(39) are titled ldquowithLMErdquo
Example 61 Consider the optimization problem
min x1x2(10minus x3)st x1 ge 0 x2 ge 0 x3 ge 0 1minus x1 minus x2 minus x3 ge 0
The matrix polynomial L(x) is given in Example 42 Since the feasible set iscompact the minimum fmin = 0 is achieved at a critical point The condition ii) ofTheorem 33 is satisfied2 Each feasible point (x1 x2 x3) with x1x2 = 0 is a globalminimizer The computational results for standard Lasserrersquos relaxations and thenew ones (38)-(39) are in Table 1 It confirms that fk = fmin for all k ge 3 up tonumerical round-off errors
Table 1 Computational results for Example 61
order kwo LME with LME
lower bound time lower bound time2 minus00521 06841 minus00521 019223 minus00026 02657 minus3 middot 10minus8 022854 minus00007 06785 minus6 middot 10minus9 044315 minus00004 16105 minus2 middot 10minus9 09567
2Note that 1minus xT x = (1 minus eT x)(1 + xTx) +sumn
i=1xi(1 minus xi)
2 +sum
i6=j x2
i xj isin IQ(ceq cin)
22 JIAWANG NIE
Example 62 Consider the optimization problem
min x41x22 + x21x
42 + x63 minus 3x21x
22x
23 + (x41 + x42 + x43)
st x21 + x22 + x23 ge 1
The matrix polynomial L(x) =[12x1
12x2
12x3 minus1
] The objective f is the sum
of the positive definite form x41 + x42 + x43 and the Motzkin polynomial
M(x) = x41x22 + x21x
42 + x63 minus 3x21x
22x
23
Note that M(x) is nonnegative everywhere but not SOS [37] Clearly f is coerciveand fmin is achieved at a critical point The set IQ(φ ψ) is archimedean because
c1(x)p1(x) = (x21 + x22 + x23 minus 1)(3M(x) + 2(x41 + x42 + x43)
)= 0
defines a compact set So the condition i) of Theorem 33 is satisfied3 Theminimum value fmin = 1
3 and there are 8 minimizers (plusmn 1radic3 plusmn 1radic
3 plusmn 1radic
3) The
computational results for standard Lasserrersquos relaxations and the new ones (38)-(39) are in Table 2 It confirms that fk = fmin for all k ge 4 up to numericalround-off errors
Table 2 Computational results for Example 62
order kwo LME with LME
lower bound time lower bound time3 minusinfin 04466 01111 011694 minusinfin 04948 03333 034995 minus21821 middot 105 11836 03333 06530
Example 63 Consider the optimization problem
min x1x2 + x2x3 + x3x4 minus 3x1x2x3x4 + (x31 + middot middot middot+ x34)st x1 x2 x3 x4 ge 0 1minus x1 minus x2 ge 0 1minus x3 minus x4 ge 0
The matrix polynomial L(x) is
1minus x1 minusx2 0 0 1 1 0 0 1 0minusx1 1minus x2 0 0 1 1 0 0 1 0
0 0 1minus x3 minusx4 0 0 1 1 0 10 0 minusx3 1minus x4 0 0 1 1 0 1
minusx1 minusx2 0 0 1 1 0 0 1 00 0 minusx3 minusx4 0 0 1 1 0 1
The feasible set is compact so fmin is achieved at a critical point One can showthat fmin = 0 and the minimizer is the origin The condition ii) of Theorem 33is satisfied because IQ(ceq cin) is archimedean4 The computational results forstandard Lasserrersquos relaxations and the new ones (38)-(39) are in Table 3
3 This is because minusc21p21isin Ideal(φ) sube IQ(φψ) and the set minusc1(x)2p1(x)2 ge 0 is compact
4 This is because 1 minus x21minus x2
2belongs to the quadratic module of (x1 x2 1 minus x1 minus x2) and
1minusx23minusx2
4belongs to the quadratic module of (x3 x4 1minusx3minusx4) See the footnote in Example 61
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 23
Table 3 Computational results for Example 63
order kwo LME with LME
lower bound time lower bound time3 minus29 middot 10minus5 07335 minus6 middot 10minus7 060914 minus14 middot 10minus5 25055 minus8 middot 10minus8 274235 minus14 middot 10minus5 127092 minus5 middot 10minus8 137449
Example 64 Consider the polynomial optimization problem
minxisinR2
x21 + 50x22
st x21 minus 12 ge 0 x22 minus 2x1x2 minus 1
8 ge 0 x22 + 2x1x2 minus 18 ge 0
It is motivated from an example in [12 sect3] The first column of L(x) is
8x13
5 + x1
5288 x2 x1
4
5 minus 16 x13
5 minus x2 x12 1245 + 8 x1
5 minus 2 x2minus 288x2 x1
4
5 minus 16 x13
5 + x2 x12 1245 + 8 x1
5 + 2 x2
and the second column of L(x) is
minus 8x12 x2
5 + 4x23
5 minus x2
10288x1
3 x22
5 + 16x12 x2
5 minus 142x1 x22
5 minus 9 x1
20 minus 8 x23
5 + 11x2
5
minus 288x13 x2
2
5 + 16x12 x2
5 + 142x1 x22
5 + 9 x1
20 minus 8 x23
5 + 11x2
5
The objective is coercive so fmin is achieved at a critical point The minimum valuefmin = 56 + 34 + 25
radic5 asymp 1126517 and the minimizers are (plusmn
radic
12plusmn(radic
58 +radic
12)) The computational results for standard Lasserrersquos relaxations and thenew ones (38)-(39) are in Table 4 It confirms that fk = fmin for all k ge 4 up tonumerical round-off errors
Table 4 Computational results for Example 64
order kwo LME with LME
lower bound time lower bound time3 67535 04611 567500 013094 69294 02428 1126517 024055 88519 03376 1126517 021676 165971 04703 1126517 037887 354756 06536 1126517 04537
Example 65 Consider the optimization problem
minxisinR3
x31 + x32 + x33 + 4x1x2x3 minus(x1(x
22 + x23) + x2(x
23 + x21) + x3(x
21 + x22)
)
st x1 ge 0 x1x2 minus 1 ge 0 x2x3 minus 1 ge 0
The matrix polynomial L(x) is
1minus x1 x2 0 0 x2 x2 0x1 0 0 minus1 minus1 0
minusx1 x2 0 1 0 minus1
24 JIAWANG NIE
The objective is a variation of Robinsonrsquos form [37] It is a positive definite form overthe nonnegative orthant R3
+ so the minimum value is achieved at a critical point Incomputation we got fmin asymp 09492 and a global minimizer (09071 11024 09071)The computational results for standard Lasserrersquos relaxations and the new ones(38)-(39) are in Table 5 It confirms that fk = fmin for all k ge 3 up to numericalround-off errors
Table 5 Computational results for Example 65
order kwo LME with LME
lower bound time lower bound time2 minusinfin 04129 minusinfin 019003 minus78184 middot 106 04641 09492 031394 minus20575 middot 104 06499 09492 05057
Example 6.6. Consider the optimization problem ($x_0 = 1$)
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^4} & x^Tx + \sum_{i=0}^{4} \prod_{j\ne i} (x_i - x_j) \\
\mathrm{s.t.} & x_1^2 - 1 \ge 0,\ x_2^2 - 1 \ge 0,\ x_3^2 - 1 \ge 0,\ x_4^2 - 1 \ge 0.
\end{array}
\]
The matrix polynomial $L(x) = \begin{bmatrix} \frac{1}{2}\mathrm{diag}(x) & -I_4 \end{bmatrix}$. The first part of the objective is $x^Tx$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

(1, 1, 1, 1), (1, -1, -1, 1), (1, -1, 1, -1), (1, 1, -1, -1),
(1, -1, -1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1), (-1, 1, 1, -1),
(-1, -1, -1, 1), (-1, -1, 1, -1), (-1, 1, -1, -1).
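A minimal numeric sketch (with the convention $x_0 = 1$ as stated) checking that each listed point attains the value 4:

```python
def f(x):
    # objective of Example 6.6, with the convention x0 = 1
    y = (1,) + tuple(x)
    quad = sum(v*v for v in x)
    prod_part = 0
    for i in range(5):
        p = 1
        for j in range(5):
            if j != i:
                p *= y[i] - y[j]
        prod_part += p
    return quad + prod_part

minimizers = [
    (1, 1, 1, 1), (1, -1, -1, 1), (1, -1, 1, -1), (1, 1, -1, -1),
    (1, -1, -1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1), (-1, 1, 1, -1),
    (-1, -1, -1, 1), (-1, -1, 1, -1), (-1, 1, -1, -1),
]
assert len(minimizers) == 11
# at each point the product part vanishes, so f = x^T x = 4
assert all(f(x) == 4 for x in minimizers)
```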
The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = fmin for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6

order k | w/o LME lower bound |  time   | with LME lower bound |  time
      3 |                  -∞ | 1.1377  |               3.5480 | 1.1765
      4 |        -6.6913·10^4 | 4.7677  |               4.0000 | 3.0761
      5 |            -21.3778 | 22.9970 |               4.0000 | 10.3354
Example 6.7. Consider the optimization problem
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^3} & x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 - 3x_1^2x_2^2x_3^2 + x_2^2 \\
\mathrm{s.t.} & x_1 - x_2x_3 \ge 0,\ -x_2 + x_3^2 \ge 0.
\end{array}
\]
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1x_3 = 0$. The computational results for standard Lasserre's
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 25
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = fmin for all k ≥ 5, up to numerical round-off errors.
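The arithmetic-geometric mean argument used in this example can be spelled out in one line: applying $\frac{a+b+c}{3} \ge \sqrt[3]{abc}$ to the three squares gives

```latex
% AM-GM applied to the three even-degree terms of the objective
\[
x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2
\;\ge\; 3\sqrt[3]{(x_1^4x_2^2)(x_2^4x_3^2)(x_3^4x_1^2)}
\;=\; 3\sqrt[3]{x_1^6x_2^6x_3^6}
\;=\; 3x_1^2x_2^2x_3^2,
\]
```

so the objective is bounded below by $x_2^2 \ge 0$; it vanishes exactly when $x_2 = 0$ and $x_3^4x_1^2 = 0$, i.e., at the points $(x_1, 0, x_3)$ with $x_1x_3 = 0$ (and the constraints then reduce to $x_1 \ge 0$ and $x_3^2 \ge 0$).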
Table 7. Computational results for Example 6.7

order k | w/o LME lower bound |  time  | with LME lower bound |  time
      3 |                  -∞ | 0.6144 |                   -∞ | 0.3418
      4 |        -1.0909·10^7 | 1.0542 |              -3.9476 | 0.7180
      5 |           -942.6772 | 1.6771 |            -3·10^-9  | 1.4607
      6 |             -0.0110 | 3.3532 |            -8·10^-10 | 3.1618
Example 6.8. Consider the optimization problem
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^4} & x_1^2(x_1-x_4)^2 + x_2^2(x_2-x_4)^2 + x_3^2(x_3-x_4)^2 \\
& +\, 2x_1x_2x_3(x_1+x_2+x_3-2x_4) + (x_1-1)^2 + (x_2-1)^2 + (x_3-1)^2 \\
\mathrm{s.t.} & x_1 - x_2 \ge 0,\ x_2 - x_3 \ge 0.
\end{array}
\]
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).
The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

order k | w/o LME lower bound |  time   | with LME lower bound |  time
      2 |                  -∞ | 0.3984  |              -0.3360 | 0.9321
      3 |                  -∞ | 0.7634  |               0.9413 | 0.5240
      4 |        -6.4896·10^5 | 4.5496  |               0.9413 | 1.7192
      5 |        -3.1645·10^3 | 24.3665 |               0.9413 | 8.1228
Example 6.9. Consider the optimization problem
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^4} & (x_1+x_2+x_3+x_4+1)^2 - 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1) \\
\mathrm{s.t.} & 0 \le x_1, \ldots, x_4 \le 1.
\end{array}
\]
The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied (see footnote 5). The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1−t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.
5. Note that $4 - \sum_{i=1}^{4} x_i^2 = \sum_{i=1}^{4} \big( x_i(1-x_i)^2 + (1-x_i)(1+x_i^2) \big) \in \mathrm{IQ}(c_{eq}, c_{in})$.
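The identity in the footnote can be verified symbolically (a minimal sketch):

```python
import sympy as sp

x = sp.symbols('x1:5')  # x1, x2, x3, x4

# 4 - sum x_i^2  ==  sum ( x_i (1-x_i)^2 + (1-x_i)(1+x_i^2) )
lhs = 4 - sum(xi**2 for xi in x)
rhs = sum(xi*(1 - xi)**2 + (1 - xi)*(1 + xi**2) for xi in x)

assert sp.expand(lhs - rhs) == 0
```

Each summand expands to $1 - x_i^2$, which is why the two sides agree.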
Table 9. Computational results for Example 6.9

order k | w/o LME lower bound |  time   | with LME lower bound |  time
      2 |             -0.0279 | 0.2262  |            -5·10^-6  | 1.1835
      3 |             -0.0005 | 0.4691  |            -6·10^-7  | 1.6566
      4 |             -0.0001 | 3.1098  |            -2·10^-7  | 5.5234
      5 |            -4·10^-5 | 16.5092 |            -6·10^-7  | 19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast, e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem ($x_0 = 1$)
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^n} & \sum_{0 \le i \le j \le k \le n} c_{ijk}\, x_i x_j x_k \\
\mathrm{s.t.} & 0 \le x \le 1,
\end{array}
\]
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
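A sketch of how such a random instance can be built; numpy's standard normal generator stands in for MATLAB's randn, and the function and variable names here are illustrative, not from the paper:

```python
import numpy as np
from itertools import combinations_with_replacement

def random_cubic(n, seed=0):
    """Random coefficients c_ijk over 0 <= i <= j <= k <= n, with x0 = 1."""
    rng = np.random.default_rng(seed)
    triples = list(combinations_with_replacement(range(n + 1), 3))
    coeffs = rng.standard_normal(len(triples))  # analogue of MATLAB's randn
    def f(x):
        y = np.concatenate(([1.0], np.asarray(x, dtype=float)))  # y[0] = x0 = 1
        return sum(c * y[i]*y[j]*y[k] for c, (i, j, k) in zip(coeffs, triples))
    return f, triples

f, triples = random_cubic(3)
# number of monomials x_i x_j x_k with 0 <= i <= j <= k <= 3 is C(6, 3) = 20
assert len(triples) == 20
print(f([0.5, 0.5, 0.5]))  # objective value at an interior point
```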
Table 10. Consumed time (in seconds) for Example 6.10

       n |      9      10      11       12       13       14
 w/o LME | 1.2569  2.5619  6.3085  15.8722  35.1675  78.4111
with LME | 1.9714  3.8288  8.2519  20.0310  37.6373  82.4778
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed relaxations (3.8)-(3.9) for solving (3.7). Note that
\[
\mathrm{IQ}(c_{eq}, c_{in})_{2k} + \mathrm{IQ}(\phi, \psi)_{2k} = \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(c_{in}, \psi)_{2k}.
\]
If we replace the quadratic module $\mathrm{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
\[
\mathrm{Preord}(c_{in}, \psi) = \sum_{r_1, \ldots, r_\ell \in \{0,1\}} g_1^{r_1} \cdots g_\ell^{r_\ell}\, \Sigma[x].
\]
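The $2^\ell$ products $g_1^{r_1} \cdots g_\ell^{r_\ell}$ appearing in the preordering can be enumerated directly; the tuple $(g_1, g_2)$ below is an illustrative stand-in for $(c_{in}, \psi)$, not data from the paper:

```python
from functools import reduce
from itertools import product
import sympy as sp

x1, x2 = sp.symbols('x1 x2')

# an illustrative tuple (g1, g2) of inequality polynomials
g = [x1, 1 - x1**2 - x2**2]

# all 2^l products g1^r1 * ... * gl^rl with each ri in {0, 1}
prods = {}
for r in product([0, 1], repeat=len(g)):
    p = reduce(lambda a, b: a*b, (gi**ri for gi, ri in zip(g, r)), sp.Integer(1))
    prods[r] = sp.expand(p)

assert len(prods) == 2**len(g)
assert prods[(0, 0)] == 1                  # the plain SOS summand Sigma[x]
assert prods[(1, 1)] == sp.expand(x1*(1 - x1**2 - x2**2))
```

Each product indexes one localizing-matrix constraint, which is why the preordering relaxation has $2^\ell$ semidefinite blocks instead of the $\ell$ blocks of the quadratic module.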
The truncation $\mathrm{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\mathrm{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same relaxation order k, is
\[
(7.1)\qquad
\left\{
\begin{array}{rl}
f^{\prime\mathrm{pre}}_k := \min & \langle f, y \rangle \\
\mathrm{s.t.} & \langle 1, y \rangle = 1,\ \ L^{(k)}_{c_{eq}}(y) = 0,\ \ L^{(k)}_{\phi}(y) = 0, \\
& L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\
& y \in \mathbb{R}^{\mathbb{N}^n_{2k}}.
\end{array}
\right.
\]
Similar to (3.9), the dual optimization problem of the above is
\[
(7.2)\qquad
\left\{
\begin{array}{rl}
f^{\mathrm{pre}}_k := \max & \gamma \\
\mathrm{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}.
\end{array}
\right.
\]
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.
Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
\[
f^{\mathrm{pre}}_k = f^{\prime\mathrm{pre}}_k = f_c
\]
for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then $f^{\mathrm{pre}}_k = f^{\prime\mathrm{pre}}_k = f_{min}$ for all k big enough.
Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
\[
K_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0,\ \phi(x) = 0,\ c_{in}(x) \ge 0,\ \psi(x) \ge 0 \}.
\]
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \mathrm{Preord}(c_{in}, \psi)$ such that $f^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same. □
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that $L(x)C(x) = I_m$. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is singular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and sharp degree bounds for Hilbert's Nullstellensatz exist [16]. If the degree bound in [16] is applied at each usage, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get
\[
\lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}
\]
when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References
[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties and algorithms: An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2008), no. 2, pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: A semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: Selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: A guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093
E-mail address: njw@math.ucsd.edu
[36] M Putinar Positive polynomials on compact semi-algebraic sets Ind Univ Math J 42(1993) 969ndash984
[37] B Reznick Some concrete aspects of Hilbertrsquos 17th problem In Contemp Math Vol 253pp 251-272 American Mathematical Society 2000
[38] M Schweighofer Global optimization of polynomials using gradient tentacles and sums ofsquares SIAM J Optim 17 (2006) no 3 pp 920ndash942
[39] C Scheiderer Positivity and sums of squares A guide to recent results Emerging Appli-cations of Algebraic Geometry (M Putinar S Sullivant eds) IMA Volumes Math Appl149 Springer 2009 pp 271ndash324
[40] J F Sturm SeDuMi 102 A MATLAB toolbox for optimization over symmetric conesOptim Methods Softw 11 amp 12 (1999) pp 625ndash653
[41] B Sturmfels Solving systems of polynomial equations CBMS Regional Conference Series inMathematics 97 American Mathematical Society Providence RI 2002
[42] H H Vui and P T Son Global optimization of polynomials using the truncated tangencyvariety and sums of squares SIAM J Optim 19 (2008) no 2 pp 941ndash951
Department of Mathematics University of California at San Diego 9500 Gilman
Drive La Jolla CA USA 92093
E-mail address njwmathucsdedu
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 23
Table 3. Computational results for Example 6.3.

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |  −2.9·10^−5,  0.7335         |  −6·10^−7,  0.6091
     4    |  −1.4·10^−5,  2.5055         |  −8·10^−8,  2.7423
     5    |  −1.4·10^−5, 12.7092         |  −5·10^−8, 13.7449
Example 6.4. Consider the polynomial optimization problem

  min_{x ∈ R^2}  x1^2 + 50 x2^2
  s.t.  x1^2 − 1/2 ≥ 0,  x2^2 − 2 x1 x2 − 1/8 ≥ 0,  x2^2 + 2 x1 x2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The two columns of the matrix polynomial L(x) are vectors of polynomials in (x1, x2); their lengthy explicit entries are omitted here. The objective is coercive, so fmin is achieved at a critical point. The minimum value is

  fmin = 227/4 + 25√5 ≈ 112.6517,

and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 4. Computational results for Example 6.4.

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |   6.7535, 0.4611             |   56.7500, 0.1309
     4    |   6.9294, 0.2428             |  112.6517, 0.2405
     5    |   8.8519, 0.3376             |  112.6517, 0.2167
     6    |  16.5971, 0.4703             |  112.6517, 0.3788
     7    |  35.4756, 0.6536             |  112.6517, 0.4537
Example 6.5. Consider the optimization problem

  min_{x ∈ R^3}  x1^3 + x2^3 + x3^3 + 4 x1 x2 x3 − ( x1 (x2^2 + x3^2) + x2 (x3^2 + x1^2) + x3 (x1^2 + x2^2) )
  s.t.  x1 ≥ 0,  x1 x2 − 1 ≥ 0,  x2 x3 − 1 ≥ 0.

The matrix polynomial L(x) is

  [ 1 − x1 x2    0    0    x2    x2     0 ]
  [    x1        0    0   −1    −1      0 ]
  [   −x1       x2    0    1     0     −1 ]
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 5. Computational results for Example 6.5.

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     2    |  −∞, 0.4129                  |  −∞, 0.1900
     3    |  −7.8184·10^6, 0.4641        |  0.9492, 0.3139
     4    |  −2.0575·10^4, 0.6499        |  0.9492, 0.5057
Example 6.6. Consider the optimization problem (with the convention x0 = 1)

  min_{x ∈ R^4}  x^T x + Σ_{i=0}^{4} Π_{j ≠ i} (xi − xj)
  s.t.  x1^2 − 1 ≥ 0,  x2^2 − 1 ≥ 0,  x3^2 − 1 ≥ 0,  x4^2 − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2) diag(x), −I4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

  (1, 1, 1, 1),  (1, −1, −1, 1),  (1, −1, 1, −1),  (1, 1, −1, −1),
  (1, −1, −1, −1),  (−1, −1, 1, 1),  (−1, 1, −1, 1),  (−1, 1, 1, −1),
  (−1, −1, −1, 1),  (−1, −1, 1, −1),  (−1, 1, −1, −1).

The computational results for standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.
Table 6. Computational results for Example 6.6.

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |  −∞, 1.1377                  |  3.5480, 1.1765
     4    |  −6.6913·10^4, 4.7677        |  4.0000, 3.0761
     5    |  −21.3778, 22.9970           |  4.0000, 10.3354
Example 6.7. Consider the optimization problem

  min_{x ∈ R^3}  x1^4 x2^2 + x2^4 x3^2 + x3^4 x1^2 − 3 x1^2 x2^2 x3^2 + x2^2
  s.t.  x1 − x2 x3 ≥ 0,  −x2 + x3^2 ≥ 0.

The matrix polynomial

  L(x) = [  1    0   0  0  0 ]
         [ −x3  −1   0  0  0 ]

By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are (x1, 0, x3) with x1 ≥ 0 and x1 x3 = 0. The computational results for standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7.

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     3    |  −∞, 0.6144                  |  −∞, 0.3418
     4    |  −1.0909·10^7, 1.0542        |  −3.9476, 0.7180
     5    |  −942.6772, 1.6771           |  −3·10^−9, 1.4607
     6    |  −0.0110, 3.3532             |  −8·10^−10, 3.1618
Example 6.8. Consider the optimization problem

  min_{x ∈ R^4}  x1^2 (x1 − x4)^2 + x2^2 (x2 − x4)^2 + x3^2 (x3 − x4)^2
                 + 2 x1 x2 x3 (x1 + x2 + x3 − 2 x4) + (x1 − 1)^2 + (x2 − 1)^2 + (x3 − 1)^2
  s.t.  x1 − x2 ≥ 0,  x2 − x3 ≥ 0.

The matrix polynomial

  L(x) = [ 1  0  0  0  0  0 ]
         [ 1  1  0  0  0  0 ]

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

  (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8.

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     2    |  −∞, 0.3984                  |  −0.3360, 0.9321
     3    |  −∞, 0.7634                  |  0.9413, 0.5240
     4    |  −6.4896·10^5, 4.5496        |  0.9413, 1.7192
     5    |  −3.1645·10^3, 24.3665       |  0.9413, 8.1228
Example 6.9. Consider the optimization problem

  min_{x ∈ R^4}  (x1 + x2 + x3 + x4 + 1)^2 − 4 (x1 x2 + x2 x3 + x3 x4 + x4 + x1)
  s.t.  0 ≤ x1, ..., x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.^5 The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

^5 Note that 4 − Σ_{i=1}^{4} xi^2 = Σ_{i=1}^{4} ( xi (1 − xi)^2 + (1 − xi)(1 + xi^2) ) ∈ IQ(ceq, cin).
Table 9. Computational results for Example 6.9.

  order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
     2    |  −0.0279, 0.2262             |  −5·10^−6, 1.1835
     3    |  −0.0005, 0.4691             |  −6·10^−7, 1.6566
     4    |  −0.0001, 3.1098             |  −2·10^−7, 5.5234
     5    |  −4·10^−5, 16.5092           |  −6·10^−7, 19.7320
For some polynomial optimization problems, standard Lasserre relaxations may converge fast; e.g., the lowest order relaxation is often tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but they may take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (with the convention x0 = 1)

  min_{x ∈ R^n}  Σ_{0 ≤ i ≤ j ≤ k ≤ n}  c_{ijk} xi xj xk
  s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre relaxations and by (3.8)-(3.9). The time (in seconds) is shown in Table 10. For each n in the table, we generate 10 random instances and report the average consumed time. For all instances, standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time differs slightly: we can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10.

  n        |   9      |  10     |  11     |  12      |  13      |  14
  w/o LME  |  1.2569  |  2.5619 |  6.3085 |  15.8722 |  35.1675 |  78.4111
  with LME |  1.9714  |  3.8288 |  8.2519 |  20.0310 |  37.6373 |  82.4778
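A random instance of this family can be generated as follows. This is a sketch using NumPy's standard normal generator in place of MATLAB's randn; the monomial indexing 0 ≤ i ≤ j ≤ k ≤ n with x0 = 1 is assumed from the statement above:

```python
import numpy as np

def random_instance(n, seed=None):
    # One coefficient c_ijk per monomial x_i x_j x_k with 0 <= i <= j <= k <= n.
    rng = np.random.default_rng(seed)
    coeffs = {}
    for i in range(n + 1):
        for j in range(i, n + 1):
            for k in range(j, n + 1):
                coeffs[(i, j, k)] = rng.standard_normal()
    return coeffs

def evaluate(coeffs, x):
    y = np.concatenate(([1.0], x))  # prepend x0 = 1
    return sum(c * y[i] * y[j] * y[k] for (i, j, k), c in coeffs.items())

coeffs = random_instance(3, seed=0)
# Number of monomials of degree <= 3 in x1..xn equals C(n+3, 3) = 20 for n = 3.
assert len(coeffs) == 20
# At x = 0 only the constant monomial (0, 0, 0) survives.
assert abs(evaluate(coeffs, np.zeros(3)) - coeffs[(0, 0, 0)]) < 1e-12
print("ok")
```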
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

  IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, ..., gℓ). Its preordering is the set

  Preord(cin, ψ) = Σ_{r1, ..., rℓ ∈ {0,1}}  g1^{r1} ··· gℓ^{rℓ} · Σ[x].
The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)
  f'_{pre,k} = min  ⟨f, y⟩
               s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{ceq}(y) = 0,  L^{(k)}_{φ}(y) = 0,
                     L^{(k)}_{g1^{r1} ··· gℓ^{rℓ}}(y) ⪰ 0  for all r1, ..., rℓ ∈ {0, 1},
                     y ∈ R^{N^n_{2k}}.
Similar to (3.9), the dual optimization problem of the above is

(7.2)
  f_{pre,k} = max  γ
              s.t.  f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

  f_{pre,k} = f'_{pre,k} = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = fmin for all k big enough.
Proof. The proof is very similar to Case III in the proof of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

  K3 = { x ∈ R^n | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same. □
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = Im. Hence, the Lagrange multiplier vector λ can be expressed as in (5.3). However, if c is not nonsingular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for the critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, a degree bound for L(x) can be obtained: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and sharp degree bounds for Hilbert's Nullstellensatz exist [16]. If the degree bound in [16] is used for each of its usages, a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

  λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive to use, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms. An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Löfberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA

E-mail address: njw@math.ucsd.edu
[20] JB Lasserre Convexity in semi-algebraic geometry and polynomial optimization SIAM JOptim 19 (2009) pp 1995-2014
[21] JB Lasserre Moments positive polynomials and their applications Imperial College Press2009
[22] JB Lasserre Introduction to polynomial and semi-algebraic optimization Cambridge Uni-versity Press Cambridge 2015
[23] JB Lasserre KC Toh and S Yang A bounded degree SOS hierarchy for polynomial opti-mization Euro J Comput Optim to appear
[24] M Laurent Semidefinite representations for finite varieties Mathematical Programming 109(2007) pp 1ndash26
[25] M Laurent Sums of squares moment matrices and optimization over polynomials Emerg-ing Applications of Algebraic Geometry Vol 149 of IMA Volumes in Mathematics and itsApplications (Eds M Putinar and S Sullivant) Springer pages 157-270 2009
[26] M Laurent Optimization over polynomials Selected topics Proceedings of the InternationalCongress of Mathematicians 2014
[27] J Nie J Demmel and B Sturmfels Minimizing polynomials via sum of squares over thegradient ideal Mathematical Programming Ser A 106 (2006) no 3 pp 587ndash606
[28] J Nie J Demmel and M Gu Global minimization of rational functions and the nearestGCDs Journal of Global Optimization 40 (2008) no 4 pp 697ndash718
[29] J Nie Discriminants and Nonnegative Polynomials Journal of Symbolic ComputationVol 47 no 2 pp 167ndash191 2012
[30] J Nie An exact Jacobian SDP relaxation for polynomial optimization Mathematical Pro-gramming Ser A 137 (2013) pp 225ndash255
[31] J Nie Polynomial optimization with real varieties SIAM J Optim Vol 23 No 3 pp1634-1646 2013
[32] J Nie Certifying convergence of Lasserrersquos hierarchy via flat truncation Mathematical Pro-gramming Ser A 142 (2013) no 1-2 pp 485ndash510
[33] J Nie Optimality conditions and finite convergence of Lasserrersquos hierarchy MathematicalProgramming Ser A 146 (2014) no 1-2 pp 97ndash121
[34] J Nie Linear optimization with cones of moments and nonnegative polynomials Mathemat-ical Programming Ser B 153 (2015) no 1 pp 247ndash274
[35] P A Parrilo and B Sturmfels Minimizing polynomial functions In S Basu and L Gonzalez-Vega editors Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathe-matics and Computer Science volume 60 of DIMACS Series in Discrete Mathematics andComputer Science pages 83-99 AMS 2003
[36] M Putinar Positive polynomials on compact semi-algebraic sets Ind Univ Math J 42(1993) 969ndash984
[37] B Reznick Some concrete aspects of Hilbertrsquos 17th problem In Contemp Math Vol 253pp 251-272 American Mathematical Society 2000
[38] M Schweighofer Global optimization of polynomials using gradient tentacles and sums ofsquares SIAM J Optim 17 (2006) no 3 pp 920ndash942
[39] C Scheiderer Positivity and sums of squares A guide to recent results Emerging Appli-cations of Algebraic Geometry (M Putinar S Sullivant eds) IMA Volumes Math Appl149 Springer 2009 pp 271ndash324
[40] J F Sturm SeDuMi 102 A MATLAB toolbox for optimization over symmetric conesOptim Methods Softw 11 amp 12 (1999) pp 625ndash653
[41] B Sturmfels Solving systems of polynomial equations CBMS Regional Conference Series inMathematics 97 American Mathematical Society Providence RI 2002
[42] H H Vui and P T Son Global optimization of polynomials using the truncated tangencyvariety and sums of squares SIAM J Optim 19 (2008) no 2 pp 941ndash951
Department of Mathematics University of California at San Diego 9500 Gilman
Drive La Jolla CA USA 92093
E-mail address njwmathucsdedu
relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.
Table 7. Computational results for Example 6.7

                 w/o LME                     with LME
  order k   lower bound      time       lower bound     time
     3         −∞            0.6144        −∞           0.3418
     4      −1.0909·10^7     1.0542      −3.9476        0.7180
     5      −942.6772        1.6771      −3·10^−9       1.4607
     6      −0.0110          3.3532      −8·10^−10      3.1618
Example 6.8. Consider the optimization problem

    min_{x∈R^4}  x1^2(x1 − x4)^2 + x2^2(x2 − x4)^2 + x3^2(x3 − x4)^2
                   + 2x1x2x3(x1 + x2 + x3 − 2x4) + (x1 − 1)^2 + (x2 − 1)^2 + (x3 − 1)^2
    s.t.  x1 − x2 ≥ 0,  x2 − x3 ≥ 0.

The matrix polynomial L(x) = [ 1 0 0 0 0 0 ; 1 1 0 0 0 0 ]. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.
Table 8. Computational results for Example 6.8

                 w/o LME                     with LME
  order k   lower bound      time       lower bound     time
     2         −∞            0.3984      −0.3360        0.9321
     3         −∞            0.7634       0.9413        0.5240
     4      −6.4896·10^5     4.5496       0.9413        1.7192
     5      −3.1645·10^3    24.3665       0.9413        8.1228
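The identity L(x)C(x) = I_2 behind this choice of L(x) can be sanity-checked numerically. Below is a minimal Python sketch; the 6×2 layout of C(x) — the gradients of c1 = x1 − x2 and c2 = x2 − x3 stacked over diag(c1, c2) — is our reading of the construction in Section 5, and `check_LC` is a hypothetical helper name.

```python
import random

def check_LC(x):
    """Evaluate L(x) * C(x) for Example 6.8 at a point x in R^4."""
    x1, x2, x3, x4 = x
    c1, c2 = x1 - x2, x2 - x3          # constraints c1(x) >= 0, c2(x) >= 0
    # C(x): the two constraint gradients stacked over diag(c1, c2)  (6 x 2)
    C = [[1, 0], [-1, 1], [0, -1], [0, 0], [c1, 0], [0, c2]]
    # the (constant) matrix polynomial L(x) from Example 6.8  (2 x 6)
    L = [[1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0]]
    return [[sum(L[i][k] * C[k][j] for k in range(6)) for j in range(2)]
            for i in range(2)]

random.seed(0)
x = [random.uniform(-2.0, 2.0) for _ in range(4)]
assert check_LC(x) == [[1, 0], [0, 1]]  # L(x)C(x) = I_2 at a random point
```

Since L(x) has zeros in its last two columns, the product is constant in x, which is why the identity holds at every point.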
Example 6.9. Consider the optimization problem

    min_{x∈R^4}  (x1 + x2 + x3 + x4 + 1)^2 − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
    s.t.  0 ≤ x1, . . . , x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.^5 The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

^5 Note that 4 − Σ_{i=1}^4 xi^2 = Σ_{i=1}^4 ( xi(1 − xi)^2 + (1 − xi)(1 + xi^2) ) ∈ IQ(ceq, cin).
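Both numerical claims in Example 6.9 — that every point (t, 0, 0, 1 − t) attains the minimum value 0, and the polynomial identity behind footnote 5 — are easy to verify. A short Python sanity check (the function name `horn_obj` is ours):

```python
def horn_obj(x1, x2, x3, x4):
    # dehomogenized Horn form from Example 6.9
    return (x1 + x2 + x3 + x4 + 1) ** 2 \
        - 4 * (x1 * x2 + x2 * x3 + x3 * x4 + x4 + x1)

# every (t, 0, 0, 1 - t) with t in [0, 1] is a global minimizer, value 0
for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    assert abs(horn_obj(t, 0.0, 0.0, 1.0 - t)) < 1e-12

# footnote 5 rests on x(1-x)^2 + (1-x)(1+x^2) = (1-x)(1+x) = 1 - x^2;
# summing over i = 1..4 then gives 4 - (x1^2 + ... + x4^2)
for x in [-1.3, 0.0, 0.7, 2.5]:
    lhs = x * (1 - x) ** 2 + (1 - x) * (1 + x * x)
    assert abs(lhs - (1 - x * x)) < 1e-12
```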
Table 9. Computational results for Example 6.9

                 w/o LME                     with LME
  order k   lower bound      time       lower bound     time
     2      −0.0279          0.2262      −5·10^−6       1.1835
     3      −0.0005          0.4691      −6·10^−7       1.6566
     4      −0.0001          3.1098      −2·10^−7       5.5234
     5      −4·10^−5        16.5092      −6·10^−7      19.7320
For some polynomial optimization problems, the standard Lasserre relaxations might converge fast, e.g., the lowest-order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property but might take more computational time. The following is such a comparison.
Example 6.10. Consider the optimization problem (x0 = 1)

    min_{x∈R^n}  Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
    s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances and show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
Table 10. Consumed time (in seconds) for Example 6.10

  n            9        10       11        12        13        14
  w/o LME    1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
  with LME   1.9714   3.8288   8.2519   20.0310   37.6373   82.4778
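The random objective of Example 6.10 can be reproduced in Python, with `random.gauss` standing in for MATLAB's randn (the helper name `random_cubic` is ours; only the objective generation is sketched, not the semidefinite solve):

```python
import itertools
import random

def random_cubic(n, seed=0):
    """Gaussian coefficients c_ijk for 0 <= i <= j <= k <= n (with x0 = 1),
    returned together with the cubic objective as a callable."""
    rng = random.Random(seed)
    coef = {idx: rng.gauss(0.0, 1.0)
            for idx in itertools.combinations_with_replacement(range(n + 1), 3)}

    def f(x):
        xs = [1.0] + list(x)           # prepend x0 = 1
        return sum(c * xs[i] * xs[j] * xs[k] for (i, j, k), c in coef.items())

    return f, coef

f, coef = random_cubic(3)
assert len(coef) == 20                 # binom(3 + 3, 3) = 20 index triples for n = 3
value = f([0.5, 0.5, 0.5])             # objective value at an interior point of the box
```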
7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, . . . , gℓ). Its preordering is the set

    Preord(cin, ψ) = Σ_{r1,...,rℓ ∈ {0,1}} g1^{r1} ··· gℓ^{rℓ} · Σ[x].
The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)    f′pre,k = min ⟨f, y⟩
         s.t.  ⟨1, y⟩ = 1,  L^(k)_{ceq}(y) = 0,  L^(k)_{φ}(y) = 0,
               L^(k)_{g1^{r1} ··· gℓ^{rℓ}}(y) ⪰ 0  for all r1, . . . , rℓ ∈ {0, 1},
               y ∈ R^{N^n_{2k}}.
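The relaxation (7.1) thus carries one localizing constraint for each of the 2^ℓ products g1^{r1} ··· gℓ^{rℓ}. A small Python sketch of enumerating these products pointwise (the function name is ours; an actual implementation would build localizing matrices from polynomial products, not scalar values):

```python
import itertools

def preordering_products(g_values):
    """All 2^l products g1^{r1} * ... * gl^{rl} over r in {0,1}^l,
    evaluated at a point where the generators gi take the given values."""
    prods = []
    for r in itertools.product([0, 1], repeat=len(g_values)):
        p = 1.0
        for ri, gi in zip(r, g_values):
            if ri:
                p *= gi
        prods.append(p)
    return prods

vals = preordering_products([2.0, 3.0, 5.0])   # l = 3 generators
assert len(vals) == 8                          # 2^3 localizing constraints in (7.1)
assert sorted(vals) == [1.0, 2.0, 3.0, 5.0, 6.0, 10.0, 15.0, 30.0]
```

The exponential count 2^ℓ is exactly the price paid for replacing the quadratic module by the preordering.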
Similar to (3.9), the dual optimization problem of the above is

(7.2)    fpre,k = max γ   s.t.  f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.
Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

    fpre,k = f′pre,k = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then fpre,k = f′pre,k = fmin for all k big enough.
Proof. The proof is very similar to Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

    K3 = { x ∈ R^n | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same. □
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = Im. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is singular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each application, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

    λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above one. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
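As a concrete illustration of this least-squares expression, consider minimizing f = x1 + x2 over the disk c(x) = 1 − x1^2 − x2^2 ≥ 0. Assuming C(x) stacks ∇c(x) over the complementarity row c(x) — our reading of (5.1) — the single-constraint case reduces (C(x)^T C(x))^{−1} C(x)^T [∇f(x); 0] to a ratio of two dot products; the helper name is ours:

```python
import math

def lam_least_squares(C, b):
    # lambda = (C^T C)^{-1} C^T b for a single-column C (normal equations)
    num = sum(ci * bi for ci, bi in zip(C, b))
    den = sum(ci * ci for ci in C)
    return num / den

# at the minimizer x* = (-1/sqrt(2), -1/sqrt(2)) of x1 + x2 on the disk:
x1 = x2 = -1.0 / math.sqrt(2.0)
C = [-2.0 * x1, -2.0 * x2, 1.0 - x1 * x1 - x2 * x2]  # [grad c(x); c(x)]
b = [1.0, 1.0, 0.0]                                   # [grad f(x); 0]
lam = lam_least_squares(C, b)
assert abs(lam - 1.0 / math.sqrt(2.0)) < 1e-9         # matches grad f = lam * grad c
```

For m > 1 constraints, the same normal equations give the full vector λ; the point of Section 7.4 is precisely that clearing the denominator det(C(x)^T C(x)) is what makes this representation expensive.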
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References
[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. EURO J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99, AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
E-mail address: njw@math.ucsd.edu
TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 27
The truncation Preord(cin ψ)2k is similarly defined like Qmod(cin ψ)2k in Sec-tion 2 A tighter relaxation than (38) of the same order k is
(7.1)
\[
\left\{
\begin{array}{rl}
f'_{\mathrm{pre},k} := \min & \langle f, y \rangle \\[2pt]
\text{s.t.} & \langle 1, y \rangle = 1, \quad L^{(k)}_{c_{eq}}(y) = 0, \quad L^{(k)}_{\phi}(y) = 0, \\[2pt]
& L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \quad \forall\, r_1, \ldots, r_\ell \in \{0,1\}, \\[2pt]
& y \in \mathbb{R}^{\mathbb{N}^n_{2k}},
\end{array}
\right.
\]
where (g_1, ..., g_ℓ) denotes the tuple (c_in, ψ).
Similar to (3.9), the dual optimization problem of the above is
(7.2)
\[
f_{\mathrm{pre},k} := \max \; \gamma \quad \text{s.t.} \quad f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}.
\]
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.
Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then
\[
f_{\mathrm{pre},k} = f'_{\mathrm{pre},k} = f_c
\]
for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = f_min for all k big enough.
Proof. The proof is very similar to that of Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set
\[
K_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \; \phi(x) = 0, \; c_{in}(x) \ge 0, \; \psi(x) \ge 0 \}.
\]
By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □
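As a toy instance of the certificate form f^{2ℓ} + q ∈ Ideal(c_eq, φ) appearing in the proof (a hypothetical example for illustration, not from the paper): take f = x, c_eq = x², φ empty, and no inequalities, so f vanishes on K_3 = {x : x² = 0}. The certificate then holds with ℓ = 1 and q = 0, since f² = x² lies in the ideal generated by c_eq. The ideal membership can be checked by polynomial division:

```python
from sympy import symbols, div

x = symbols('x')
f = x            # objective; f vanishes on {x : x**2 == 0}
c_eq = x**2      # equality constraint defining K3
ell = 1          # exponent in the certificate f**(2*ell) + q
q = 0            # trivial preordering element

# f**(2*ell) + q lies in Ideal(c_eq) iff dividing by c_eq leaves remainder 0
quotient, remainder = div(f**(2 * ell) + q, c_eq, x)
print(remainder)  # 0, so the certificate holds
```

For nontrivial constraint tuples, checking membership in Ideal(c_eq, φ) would require a Gröbner basis reduction rather than single-divisor division; this minimal case needs only one generator.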
7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.
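A minimal hypothetical example of this obstruction (for illustration only): the single equality constraint c_1(x) = x_1^2 in one variable, which is singular at the critical point x_1 = 0 because ∇c_1(0) = 0. Since both c_1 and its gradient vanish at 0, every entry of C(x) vanishes there, so no matrix polynomial L(x) can satisfy L(x)C(x) = I identically:

```latex
% Hypothetical example: c_1(x) = x_1^2 is singular at the critical point x_1 = 0.
% Both c_1(0) = 0 and \nabla c_1(0) = 0, so C(0) is the zero matrix, and the
% identity L(x)C(x) = I_1 fails upon evaluation at x_1 = 0:
\[
  c_1(x) = x_1^2, \qquad \nabla c_1(x) = 2x_1, \qquad
  1 = I_1 = L(0)\,C(0) = 0, \quad \text{a contradiction.}
\]
```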
7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and sharp degree bounds for Hilbert's Nullstellensatz exist [16]. If the degree bound in [16] is applied at each usage, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get
\[
\lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}
\]
when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above one. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
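A small numerical sketch of this full-column-rank formula (a hypothetical example, not from the paper): minimize f(x) = x_1^2 + x_2^2 subject to the single equality constraint c_1(x) = x_1 + x_2 - 1. With no inequality constraints, the zero block in the right-hand side is absent and C(x) reduces to the column of constraint gradients; at the minimizer u = (1/2, 1/2), the formula recovers the multiplier λ_1 = 1, matching ∇f(u) = λ_1 ∇c_1(u).

```python
import numpy as np

# f(x) = x1^2 + x2^2,  c1(x) = x1 + x2 - 1 (single equality constraint)
def grad_f(x):
    return 2.0 * x

def C(x):
    # column of constraint gradients; constant here since c1 is linear
    return np.array([[1.0], [1.0]])

u = np.array([0.5, 0.5])      # the constrained minimizer
Cu = C(u)
# lambda = (C^T C)^{-1} C^T grad_f(u), valid since C(u) has full column rank
lam = np.linalg.solve(Cu.T @ Cu, Cu.T @ grad_f(u))
print(lam)  # [1.]
```

In this linear example the denominator polynomial det(C(x)^T C(x)) is a constant; for nonlinear constraints it is where the high-degree denominators discussed above arise.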
Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.
References
[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra. Third edition, Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99, AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.
Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA, 92093.
E-mail address: njw@math.ucsd.edu