CONVEX OPTIMIZATION FOR
SIGNAL PROCESSING PROBLEMS
LUI WING KIN
DOCTOR OF PHILOSOPHY
CITY UNIVERSITY OF HONG KONG
AUGUST 2009
CITY UNIVERSITY OF HONG KONG
Convex Optimization for Signal Processing Problems
Submitted to Department of Electronic Engineering
in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy
by
Lui Wing Kin
August 2009
To my parents.
Abstract
Convex optimization has been one of the most exciting research areas in optimization,
and it refers to minimizing a convex objective function subject to convex constraints.
By recognizing or formulating the optimization problem in convex form, the problem
can be solved efficiently. In the recent decade, convex optimization has become
an essential tool in engineering because of the benefits of two convex properties.
First, convex optimization gives a globally optimal solution which can be found effi-
ciently and reliably. Second, the optimization problem can be solved to any
desired accuracy using well-developed numerical methods. Once the convex problem
is formed, the problem can be claimed to be solved.
Source localization, sinusoidal parameter estimation, polynomial root-finding and
the determination of the capacity region, which are important areas of signal pro-
cessing, are tackled from a convex optimization perspective. The problems are first
formulated as optimization problems and then either relaxed or transformed into
convex problems to yield global solutions and high-fidelity approximations.
In source localization problems, the positions of targets, such as a sensor or a mobile
terminal, are the parameters of interest. Given position-bearing measurements, such
as time-of-arrival or time-difference-of-arrival, together with the known coordinates of
the receivers, the target positions can be obtained. These problems, especially time-of-
arrival based localization, have been extended to multiple sources in a collaborative
environment, which is called sensor network node localization. However, most of the
literature concentrates on the case where the anchor positions and the propagation speed
are perfectly known. In this thesis, node localization in the presence of uncertainties
in the anchor positions and/or the propagation speed is dealt with. Furthermore, source
localization under non-line-of-sight propagation, which introduces significant errors,
is addressed. Source localization using time-difference-of-arrival measurements is also
studied based on convex optimization.
Parameter estimators for several sinusoidal models, namely, the single complex/real
tone, multiple complex sinusoids, the single two-dimensional complex tone and the
polynomial phase signal, in the presence of additive Gaussian noise, are developed from
a convex perspective. The major difficulty in optimally determining the parameters is
that the corresponding maximum-likelihood estimators involve searching for the global
minimum or maximum of multi-modal cost functions because of the nonlinearity of the
frequencies in the observed signals. By relaxing the non-convex maximum-likelihood
formulations into semidefinite programs, high-fidelity approximate solutions are ob-
tained in a globally optimal fashion.
The problem of solving a polynomial equation has been a classical problem in
mathematics. Semidefinite relaxation, which is a branch of convex optimization, is
investigated to find the real roots of a real polynomial.
The determination of the capacity region of parallel Gaussian interference channels
is an open problem. The special case where the channel is one-sided is considered.
The sum capacity, as a cost function of the user powers, is shown to be convex. Exploiting the
inherent structure of the problem, a numerical algorithm is constructed to compute
the sum capacity.
Acknowledgments
I would like to express my gratitude to my supervisor, Dr. So, Hing Cheung, for his
great patience, guidance and support in both my personal life and research. He
introduced many interesting research topics to me and provided a lot of insightful
and inspirational comments, as well as proverbs for daily life, during the course of
this research.
I am also very grateful for the kind and valuable help of Prof. Chen, Ron Guanrong,
Dr. Ma, Wing-Kin, Dr. Shum, Kenneth Wing Ki, Dr. Sung, Albert Chi Wan,
Dr. Wong, Kwok-Wo and Dr. Yan, Wei (in alphabetical order). The knowledge and
experience they shared with me improved my work a lot, and their expertise in their
research areas broadened my view.
My thanks also go to my ex-colleagues and colleagues, Dr. Chan, Frankie Kit Wing,
Chan, Thomas Chin Tao, Liu, Michael Hongqing, Tawfiq Amin, Lo, Thomas Kai
Chun, Dr. Wu, Yuntao, and Zheng, Jason Jun (in alphabetical order). Frequent discus-
sions with them on research and leisure have been enjoyable and productive.
Finally, I would like to thank my family and friends. Their support and encourage-
ment are important to the completion of this work and will be forever remembered.
Mathematical Symbols
Specific Sets
C         complex numbers
C^n       complex column vectors of length n (n×1 matrices)
C^{m×n}   complex matrices of size m×n
H^{m×n}   Hankel matrices of size m×n
R         real numbers
R^n       real column vectors of length n (n×1 matrices)
R^{m×n}   real matrices of size m×n
R+        nonnegative real numbers
R++       positive real numbers
R^n+      nonnegative real column vectors of length n (n×1 matrices)
R^n++     positive real column vectors of length n (n×1 matrices)
S^n       symmetric matrices of size n×n
S^n+      symmetric positive semidefinite matrices of size n×n
S^n++     symmetric positive definite matrices of size n×n
Z         integers
Z+        nonnegative integers
Z++       positive integers
Vectors and Matrices
a         bold lower case symbol denoting a vector
A         bold upper case symbol denoting a matrix
A         calligraphic upper case symbol denoting a set
0_m       m×m zero matrix
0_{m×n}   m×n zero matrix
1_m       m×m matrix with all elements one
1_{m×n}   m×n matrix with all elements one
I_m       m×m identity matrix
Operators
[a]_i        ith element of a
[A]_{i,j}    (i, j)th entry of A
|A|          absolute value of A
|A|          number of candidates in set A
⌊A⌋          floor of A
‖a‖_n, ‖a‖   n-norm of a, any norm of a
A⋆           optimal value of A
A*           conjugate of A
x+           x if x > 0, otherwise 0
A^T          transpose of A
A^H          Hermitian transpose of A
A^{-1}       inverse of A
f^{-1}       inverse function of f
A^{1/2}      matrix square root of A
Â            estimate of A
∂/∂x         partial differentiation with respect to x
∂f           sub-differential of function f
∇f           vector differential of function f
a ≈ b        a, b ∈ R being approximately equal
A := B       A defined as B
a ⪰ b        a, b ∈ R^n: a element-wise greater than b
A ⪰ 0_n      A ∈ C^{n×n} being positive semidefinite
A ≻ 0_n      A ∈ C^{n×n} being positive definite
A ⊆ B        A being a subset of B
A ⊂ B        A being a proper subset of B
¬A           negation of A
A ∪ B        union of A and B
⋃ A_i        union of the A_i
A ∩ B        intersection of A and B
⋂ A_i        intersection of the A_i
a ⇒ b        statement a implying statement b
aff A                       affine hull of A (see (2.1.20))
B(x_c, r)                   Euclidean ball with radius r and center x_c (see (2.1.27))
arg max{a : a ∈ A}          argument of the largest element in A
arg min{a : a ∈ A}          argument of the smallest element in A
blkdiag(A_1, A_2, …, A_m)   block diagonal matrix with diagonal blocks A_1, A_2, …, A_m
conv A                      convex hull of A (see (2.1.21))
dom f                       domain of f
diag(a) ∈ C^{n×n}           diagonal matrix with a ∈ C^n as diagonal elements
diag(A, k)                  column vector with the kth diagonal elements of A ∈ C^{n×n}
epi f                       epigraph of function f (see (2.1.36))
exp(a)                      exponential function of a
E{A}                        expectation of A
inf{a : a ∈ A}              infimum of A
ln(a)                       natural logarithm of a ∈ R++
log(a)                      logarithm of a ∈ R++
max{i_1, i_2, …, i_m}       largest element among i_1 to i_m, for m ≥ 2
max{a : a ∈ A}              largest element in A
min{i_1, i_2, …, i_m}       smallest element among i_1 to i_m, for m ≥ 2
min{a : a ∈ A}              smallest element in A
x mod y                     x − ny, n = ⌊x/y⌋, with x > 0, y > 0
N(μ, C)                     Gaussian distribution with mean μ and covariance C
O(a)                        order of a
p({A_i}|{B_j})              probability of {A_i}, given {B_j}
perm1(S, m_i, m_j)          permutation operator on S ∈ C^{n×n} (see Appendix B.1)
perm2(S, m_i, m_j)          permutation operator on S ∈ C^{n×n} (see Appendix B.2)
relint A                    relative interior of A (see (2.2.5))
sign(a)                     sign of a
Toeplitz(x)                 symmetric or Hermitian Toeplitz matrix with x as its first row
tr(A)                       trace of A
vec(A)                      vectorization of A
Abbreviation
0 to L
2D        two-dimensional
3D        three-dimensional
AOA       angle-of-arrival
CRLB      Cramer-Rao lower bound
DPT       discrete polynomial transform
DTFT      discrete-time Fourier transform
ESDP      edge-based semidefinite programming
FIM       Fisher information matrix
FIR       finite impulse response
GA        genetic algorithm
GP        geometric programming
GPS       global positioning system
IC        interference channel
IID       independently and identically distributed
IQML      iterative quadratic maximum-likelihood
IW        iterative water-filling
KKT       Karush-Kuhn-Tucker
LLS       linear least-squares
LOS       line-of-sight
LS        least-squares
M to Z
MAP       maximum a posteriori
MCMC      Markov chain Monte Carlo
MDS       multidimensional scaling
ML        maximum-likelihood
MLE       maximum-likelihood estimator
MSE       mean square error
MSPE      mean square position error
NLOS      non-line-of-sight
NLS       nonlinear least-squares
NP-hard   nondeterministic polynomial-time hard
PDF       probability density function
PSD       positive semidefinite
RD        range-difference
RSS       received signal strength
SOCP      second-order cone programming
SDP       semidefinite programming
SDR       semidefinite relaxation
SNR       signal-to-noise ratio
TDOA      time-difference-of-arrival
TOA       time-of-arrival
TSWLS     two-step weighted least-squares
WLS       weighted least-squares
WSN       wireless sensor network
Table of Contents
Abstract ii
Acknowledgments v
Mathematical Symbols vi
Abbreviation ix
Table of Contents xi
List of Figures xiv
1 Introduction 1
1.1 Mathematical Optimization . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Conventional Optimization Problems . . . . . . . . . . . . . . . . . . 3
1.2.1 Least-Squares and Linear Programming . . . . . . . . . . . . . 3
1.2.2 Convex Optimization . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.3 Nonlinear Optimization . . . . . . . . . . . . . . . . . . . . . 8
1.3 Applications to Signal Processing Problems . . . . . . . . . . . . . . . 10
1.3.1 Source Localization . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3.2 Sinusoidal Parameter Estimation . . . . . . . . . . . . . . . . 12
1.3.3 Polynomial Root-Finding . . . . . . . . . . . . . . . . . . . . . 13
1.3.4 One-Sided Parallel Gaussian Interference Channels . . . . . . 13
1.4 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2 Preliminaries 15
2.1 Disciplines of Convex Optimization . . . . . . . . . . . . . . . . . . . 15
2.1.1 Optimization Theory . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.2 Convex Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.1.3 Numerical Computation . . . . . . . . . . . . . . . . . . . . . 29
2.2 Duality Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.2.1 The Lagrangian . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.2.2 Karush-Kuhn-Tucker Conditions . . . . . . . . . . . . . . . . 39
2.3 Types of Convex Optimization . . . . . . . . . . . . . . . . . . . . . . 39
2.3.1 Geometric Programming . . . . . . . . . . . . . . . . . . . . . 40
2.3.2 Conic Optimization . . . . . . . . . . . . . . . . . . . . . . . . 41
2.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3 Source Localization 46
3.1 TOA-Based Positioning . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.1.1 Sensor Network Localization . . . . . . . . . . . . . . . . . . . 48
3.1.2 Non-Line-of-Sight Environment . . . . . . . . . . . . . . . . . 77
3.2 TDOA-Based Positioning . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.2.1 Relaxation on LLS Based Estimator . . . . . . . . . . . . . . . 89
3.2.2 Relaxation on ML Based Estimator . . . . . . . . . . . . . . . 96
3.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4 Sinusoidal Parameter Estimation 104
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.2 Single Complex Tone . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.2.1 Periodogram Approach . . . . . . . . . . . . . . . . . . . . . . 106
4.2.2 ML Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.2.3 NLS Approach . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.3 Single Real Tone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.3.1 ML Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.3.2 NLS Approach . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.4 Multiple Complex Tones . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.5 Two-Dimensional Complex Tone . . . . . . . . . . . . . . . . . . . . . 122
4.5.1 Periodogram Approach . . . . . . . . . . . . . . . . . . . . . . 122
4.5.2 NLS Approach . . . . . . . . . . . . . . . . . . . . . . . . . . 124
4.6 Polynomial Phase Signal . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.6.1 Review of Discrete Polynomial Transform . . . . . . . . . . . 127
4.6.2 Relaxation on ML Function . . . . . . . . . . . . . . . . . . . 130
4.7 Fast Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
4.8 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
4.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5 Polynomial Root-Finding 146
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
5.2 Algorithm Development . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.3 Illustration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6 Gaussian Interference Channels 157
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
6.2 Channel Model and Problem Formulation . . . . . . . . . . . . . . . 159
6.3 Computation of the Sum Capacity . . . . . . . . . . . . . . . . . . . 164
6.3.1 First Subproblem . . . . . . . . . . . . . . . . . . . . . . . . . 165
6.3.2 Second Subproblem . . . . . . . . . . . . . . . . . . . . . . . . 166
6.3.3 Alternating Optimization . . . . . . . . . . . . . . . . . . . . 167
6.4 Comparison with Suboptimal Schemes . . . . . . . . . . . . . . . . . 168
6.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
7 Conclusions and Future Work 171
A Development of MAP Estimator 174
B Development of Permutation Operators 175
B.1 Development of perm1(S,mi,mj) . . . . . . . . . . . . . . . . . . . . 175
B.2 Development of perm2(S,mi,mj) . . . . . . . . . . . . . . . . . . . . 176
C Proof of Theorem 6.3 177
D Proof of Theorem 6.6 180
Bibliography 184
Publications 196
List of Figures
2.1 Geometric interpretation of epigraph form problem. . . . . . . . . . . 21
2.2 Some simple convex and non-convex sets. . . . . . . . . . . . . . . . . 22
2.3 Convex hull of the kidney shaped set. . . . . . . . . . . . . . . . . . . 24
2.4 Geometric interpretation of a convex cone. . . . . . . . . . . . . . . . 24
2.5 Convexity of functions. . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.6 Geometric interpretation of sub-level sets. . . . . . . . . . . . . . . . 28
2.7 Hierarchical representation of common convex optimization problems. 40
3.1 Intersection of circles gives the receiver location. . . . . . . . . . . . . 47
3.2 Geometry of sensor network. . . . . . . . . . . . . . . . . . . . . . . . 67
3.3 Single trial performance of the standard SDP algorithm in the presence
of anchor position uncertainty. . . . . . . . . . . . . . . . . . . . . . . 68
3.4 Single trial performance of the proposed SDP algorithm in the presence
of anchor position uncertainty. . . . . . . . . . . . . . . . . . . . . . . 69
3.5 Single trial performance of the proposed ESDP algorithm in the pres-
ence of anchor position uncertainty. . . . . . . . . . . . . . . . . . . . 69
3.6 Mean square position error versus σd² in the presence of anchor position
uncertainty. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.7 Mean square position error versus σi² at σd² = −50 dB. . . . . . . . . . 71
3.8 Single trial performance of the standard SDP algorithm for unknown
propagation speed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.9 Single trial performance of the proposed SDP algorithm for unknown
propagation speed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.10 Mean square position error versus σt²/c² for unknown propagation speed. 73
3.11 Mean square speed error versus σt²/c² for unknown propagation speed. 73
3.12 Single trial performance of the standard SDP algorithm in the presence
of combined uncertainties. . . . . . . . . . . . . . . . . . . . . . . . . 74
3.13 Single trial performance of the proposed SDP algorithm in the presence
of combined uncertainties. . . . . . . . . . . . . . . . . . . . . . . . . 75
3.14 Mean square position error versus σt²/co² in the presence of combined
uncertainties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.15 Mean square speed error versus σt²/co² in the presence of combined
uncertainties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.16 Illustration of two overlapped regions (along the line passing through
xi and xj). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.17 Illustration of the non-overlapped case (along the line passing through
xi and xj). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.18 LOS/NLOS detection probability. . . . . . . . . . . . . . . . . . . . . 85
3.19 Mean square position error versus qi. . . . . . . . . . . . . . . . . . . 85
3.20 Intersection of hyperbolas gives the target location. . . . . . . . . . . 87
3.21 Geometrical interpretation of Ri and R1,i. . . . . . . . . . . . . . . . . 91
3.22 Mean square position error versus σ² at x = [3.5, 2.5, 1.5]T m. . . . . 95
3.23 Mean square position error versus σ² at x = [5.5, 3.5, 1.5]T m. . . . . 96
3.24 Mean square position error versus σ² at x = [3.5, 2.5, 1.5]T m. . . . . 101
3.25 Mean square position error versus x-coordinate at σ² = 10 dBm². . . 102
4.1 Mean square error for of single complex sinusoid. . . . . . . . . . . 133
4.2 Mean square error for of single complex sinusoid. . . . . . . . . . . 134
4.3 Mean square error for of single complex sinusoid. . . . . . . . . . . 134
4.4 Mean square error for of single real sinusoid. . . . . . . . . . . . . . 135
4.5 Mean square error for of single real sinusoid. . . . . . . . . . . . . . 136
4.6 Mean square error for of single real sinusoid. . . . . . . . . . . . . . 136
4.7 Mean square error for 1 of multiple complex sinusoids. . . . . . . . . 137
4.8 Mean square error for 2 of multiple complex sinusoids. . . . . . . . . 138
4.9 Mean square error for 1 of multiple complex sinusoids. . . . . . . . . 138
4.10 Mean square error for 2 of multiple complex sinusoids. . . . . . . . . 139
4.11 Mean square error for 1 of multiple complex sinusoids. . . . . . . . . 139
4.12 Mean square error for 2 of multiple complex sinusoids. . . . . . . . . 140
4.13 Mean square error for of 2D single complex sinusoid. . . . . . . . . 140
4.14 Mean square error for of 2D single complex sinusoid. . . . . . . . . 141
4.15 Mean square error for of 2D single complex sinusoid. . . . . . . . . 141
4.16 Mean square error for of 2D single complex sinusoid. . . . . . . . . 142
4.17 Mean square error for of polynomial phase signal. . . . . . . . . . . 143
4.18 Mean square error for a0 of polynomial phase signal. . . . . . . . . . . 143
4.19 Mean square error for a1 of polynomial phase signal. . . . . . . . . . . 144
4.20 Mean square error for a2 of polynomial phase signal. . . . . . . . . . . 144
5.1 Intersection of f(x, y) = y + x + 1 and y − x^2 = 0. . . . . . . . . . . 150
5.2 f(x) = 0.1034x^7 + 0.1573x^6 + 0.4075x^5 + 0.4078x^4 + 0.0527x^3 + 0.9418x^2 +
0.15x + 0.3844. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.3 f(x) = 0.8959x^6 − 0.9791x^5 + 0.6537x^4 + 0.4208x^3 + 0.8830x^2 +
1.2610x + 0.7249. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.4 f(x) = 0.6813x4 + 0.3795x3 + 0.8318x2 + 0.5028x + 0.7095. . . . . . . 154
5.5 Mean absolute error versus polynomial degree n. . . . . . . . . . . . . 155
6.1 One-sided parallel Gaussian IC. . . . . . . . . . . . . . . . . . . . . . 158
6.2 Sum capacity of a single one-sided Gaussian IC, Ca, when the cross link is
weak (a = 0.5 ≤ 1). . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.3 Sum capacity of a single one-sided Gaussian IC, Ca, when the cross link is
strong (a = 2.5 > 1). . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.4 Comparison of three different transmission schemes. . . . . . . . . . . 169
CHAPTER 1
Introduction
In this chapter, some important classes of mathematical optimization problems and
the investigated signal processing problems are presented. The constraint specifications
of two classes of mathematical optimization, namely, linear programming and
convex optimization, are introduced in Section 1.1. Conventional optimization prob-
lems, that is, least-squares, linear programs, convex optimization approaches and
nonlinear optimization methods, are presented in Section 1.2. Being a super-set of
classical optimization methods, such as least-squares and linear programming, con-
vex optimization extends our ability to solve richer classes of optimization problems.
By reviewing the nonlinear optimization approaches, it will be seen that convex
optimization also plays an important role even if the problems are non-convex. The
tackled signal processing applications are described in Section 1.3. Finally, the thesis
organization is provided in Section 1.4.
1.1 Mathematical Optimization
The mathematical optimization problem [1], or in short optimization problem, can
be expressed as
    minimize_x   f0(x)
    subject to   fi(x) ≤ bi,  i = 1, 2, …, m.        (1.1.1)

The optimization variables are encapsulated in a vector x = [x1, x2, …, xn]^T. The
objective function, the inequality constraint functions and their bounds are denoted
by f0 : R^n → R, fi : R^n → R, i = 1, 2, …, m, and b1, b2, …, bm, respectively. The
optimal vector, or a solution of (1.1.1), is symbolized by x⋆, which corresponds
to the smallest objective value among all vectors that satisfy the constraints; that
is, for any vector z ∈ R^n with f1(z) ≤ b1, f2(z) ≤ b2, …, fm(z) ≤ bm, the inequality
f0(z) ≥ f0(x⋆) holds.
The classes of optimization problems can be characterized by particular forms of
the objective function as well as the constraint functions. An important example is the
linear program [2], where the objective and constraint functions f0, f1, …, fm are
linear, satisfying

    fi(αx + βy) = αfi(x) + βfi(y),  i = 0, 1, …, m,        (1.1.2)

for all x, y ∈ R^n and all α, β ∈ R. If any of the objective and constraint
functions fails to fulfill the linearity requirement, the problem becomes a nonlinear
program.
In this thesis, a class of optimization problems in the context of convex optimiza-
tion [3] is investigated. This refers to an optimization problem where the objective
and constraint functions are convex [4], that is,

    fi(αx + βy) ≤ αfi(x) + βfi(y),  i = 0, 1, …, m,        (1.1.3)

for all x, y ∈ R^n and all α, β ∈ R+ with α + β = 1. From (1.1.2) and (1.1.3), it
can be observed that convexity is more general than linearity: the inequality replaces
the more restrictive equality, and it needs to hold only for certain values of α and β,
namely those satisfying α + β = 1. Since any linear program is a convex optimization
problem, convex optimization is considered a generalization of linear programming.
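The defining inequality (1.1.3) can be probed numerically on random sample points. The following Python sketch (NumPy only; the helper name is illustrative) tests the inequality for f(x) = ‖x‖₂², which is convex, and for −‖x‖₂², which is not. Note that random sampling can only refute convexity, never prove it.

```python
import numpy as np

def is_convex_on_samples(f, dim, trials=1000, seed=0):
    """Probe the convexity inequality (1.1.3):
    f(a*x + (1-a)*y) <= a*f(x) + (1-a)*f(y) for a in [0, 1].
    Random sampling can refute convexity but never prove it."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x = rng.standard_normal(dim)
        y = rng.standard_normal(dim)
        a = rng.uniform()                    # alpha in [0, 1], beta = 1 - alpha
        lhs = f(a * x + (1 - a) * y)
        rhs = a * f(x) + (1 - a) * f(y)
        if lhs > rhs + 1e-9:                 # tolerance for floating-point error
            return False                     # violating pair found: not convex
    return True                              # no violation found

# f(x) = ||x||^2 is convex; -||x||^2 is concave, so the probe rejects it.
print(is_convex_on_samples(lambda v: np.dot(v, v), dim=3))   # True
print(is_convex_on_samples(lambda v: -np.dot(v, v), dim=3))  # False
```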
An algorithm that computes a solution of a class of optimization problems, to a
certain accuracy, is called a solution method. Since the 1940s, many researchers have
devoted themselves to developing such algorithms for solving various kinds of op-
timization problems, analyzing their properties, and implementing them in
well-developed software packages [5]. However, the forms of the objective and con-
straint functions, together with the number of variables and constraints, limit our
ability to solve (1.1.1). In some specially structured optimization problems, such as
sparse problems, where each constraint function depends on only a small number of
the variables, the solving effort decreases as the sparsity [6] of the problem increases.
In general, however, even if the objective and constraint functions are smooth,
the optimization problem (1.1.1) can be extremely difficult to solve. There are
exceptions: some classes of optimization problems are friendly, with effective
algorithms to deal with them even if the problem is large in scale, say, with thousands
of variables and constraints. Two important and famous instances are the least-squares
problem [7] and, as previously mentioned, the linear program, and they are introduced
in the following section. Another less famous exceptional class, which is the super-
class of the previous two classes of problems, is convex optimization, and it has
recently been one of the most active and exciting research areas in optimization. Like
least-squares and linear programming, convex optimization problems come with
reliable and efficient algorithms to find the solutions.
1.2 Conventional Optimization Problems
In this section, the classical optimization problems, namely, least-squares and linear
programming, which are essentially two sub-sets of the convex optimization problem,
are reviewed. After introducing the convex optimization problem and its relation
to least-squares and linear programs, general nonlinear optimization is presented.
As a compromise between computational effectiveness and solution quality in
nonlinear programming, convex optimization acts as a reliable alternative and can
play an important role even when the original problem is non-convex.
1.2.1 Least-Squares and Linear Programming
Two widely adopted subclasses of convex optimization, namely, least-squares and
linear programs are described in this subsection.
Least-Squares
A least-squares problem is an optimization problem without constraints, i.e., m = 0,
whose objective function f0 is a sum of squares of terms of the form ai^T x − bi:

    minimize_x   f0(x) = ‖Ax − b‖₂² = Σ_{i=1}^{k} (ai^T x − bi)²,        (1.2.1)

where A ∈ R^{k×n} with k ≥ n, {ai^T} are the rows of A, and the optimization variables
are stored in the vector x ∈ R^n. From (1.2.1), the solution satisfies the set of linear
equations
    (A^T A)x = A^T b,        (1.2.2)
where the closed-form solution is

    x⋆ = (A^T A)^{-1} A^T b,

with x⋆ being the optimal vector. For least-squares problems, well-developed software
and algorithms [8] exist, and their solving time is approximately proportional to n²k
for a known coefficient matrix A. In some cases, the computation can be further sped
up by exploiting special structure in A. If the matrix A is sparse, meaning that
it has far fewer than kn nonzero entries, the least-squares problem can be solved in
less than O(n²k) computational time. In general, least-squares is a mature technology:
the solution is computed automatically once the problem is recognized or reformulated
in least-squares form.
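The normal equations (1.2.2) and the closed-form solution can be illustrated in a few lines of NumPy. This is a minimal sketch with synthetic data; in practice a QR/SVD-based routine such as numpy.linalg.lstsq is numerically preferable to forming A^T A explicitly.

```python
import numpy as np

# Overdetermined system: A is k x n with k >= n (here 20 x 3).
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true                              # noiseless data, so LS recovers x_true

# Normal equations (1.2.2): (A^T A) x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# QR/SVD-based library routine, numerically preferable to forming A^T A.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_normal, x_true))        # True
print(np.allclose(x_lstsq, x_normal))       # True
```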
Recognizing a least-squares problem is also straightforward: the problem has a
quadratic objective function and no constraints. Two techniques are employed to
increase its flexibility, namely, weighted least-squares and regularization.
Weighted Least-Squares  The weighted least-squares formulation can be given as

    minimize_x   (Ax − b)^T W (Ax − b),        (1.2.3)

where W is a positive definite weighting matrix. Here, the entries of W are chosen
to reflect the importance of the corresponding entries of (Ax − b)(Ax − b)^T.
The analytical solution of (1.2.3) is x⋆ = (A^T W A)^{-1} A^T W b. One common
application is the estimation problem, where the parameter vector of interest x is
estimated from measurements corrupted by noise, with W^{-1} being the known
noise covariance matrix.
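A small numerical sketch of the weighted least-squares solution x⋆ = (A^T W A)^{-1} A^T W b, with W chosen as the inverse of a synthetic, heteroscedastic noise covariance (all data below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 2))
x_true = np.array([2.0, -1.0])

# Heteroscedastic noise: each measurement has its own (known) variance.
noise_var = np.linspace(0.1, 2.0, 30)
b = A @ x_true + rng.standard_normal(30) * np.sqrt(noise_var)

# W is the inverse noise covariance: noisier measurements get smaller weights.
W = np.diag(1.0 / noise_var)

# Closed form of (1.2.3): x* = (A^T W A)^{-1} A^T W b.
x_wls = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)
print(x_wls)                                # close to x_true = [2, -1]
```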
Regularization In regularization, extra terms are added to the cost function. For
example, a sum of squares of the variables can be added to the objective function:
    minimize_x   Σ_{i=1}^{k} (ai^T x − bi)² + λ Σ_{i=1}^{n} xi²,        (1.2.4)

where λ > 0 is a user-defined parameter and x = [x1, x2, …, xn]^T. Large values of
xi, i = 1, 2, …, n, are penalized by the extra term, and hence the solution entries tend
to be small. The parameter λ gives the user the freedom to choose the trade-off
between making the original objective Σ_{i=1}^{k} (ai^T x − bi)² small while keeping
Σ_{i=1}^{n} xi² not too big. Weighted least-squares and regularization are two common
modifications of least-squares optimization, but they can also be employed in other
classes of optimization methods, such as convex optimization.
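The regularized problem (1.2.4) also has a closed-form solution, obtained by adding the regularization parameter times the identity to the normal equations. The sketch below (synthetic data; the parameter is written as lam) shows how the solution norm shrinks as the parameter grows:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((15, 5))
b = rng.standard_normal(15)

def ridge(A, b, lam):
    """Minimize ||Ax - b||^2 + lam * sum(x_i^2), cf. (1.2.4).
    Closed form: x = (A^T A + lam * I)^{-1} A^T b."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

# Larger lam penalizes large entries of x, shrinking the solution norm.
for lam in (0.0, 1.0, 100.0):
    print(lam, np.linalg.norm(ridge(A, b, lam)))
```

With lam = 0 the solution coincides with the ordinary least-squares solution.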
Linear Programming
Another classical optimization problem is linear programming, where the objective
and constraint functions are linear:

    minimize_x   c^T x
    subject to   ai^T x ≤ bi,  i = 1, 2, …, m,        (1.2.5)

where c, a1, a2, …, am ∈ R^n are known parameter vectors and b1, b2, …, bm ∈ R
are scalar parameters, which vary with the particular linear programming problem.
Unlike the least-squares problem, there is no simple analytical formula for the solution,
but well-developed and efficient algorithms, namely, Dantzig's simplex method [9] and
interior-point methods [10, 11], are available to provide numerical solutions. The exact
number of operations involved in solving a linear programming problem cannot be
evaluated in advance, but a rigorous bound can be estimated when interior-point
methods are used. The complexity is around O(n²m), under the assumption that
m ≥ n. The above-mentioned methods can handle large-scale problems; by exploiting
sparsity or other structure, problems with even more variables can be solved in practice.
The form of (1.2.5) can be found directly in some applications, but in many other cases,
a transformation is necessary. A simple example is the Chebyshev approximation
problem [12]:

    minimize_x   max_i |ai^T x − bi|,  i = 1, 2, …, k,        (1.2.6)

where x ∈ R^n is the variable vector, and the known parameters are denoted by
a1, a2, …, ak ∈ R^n and b1, b2, …, bk ∈ R. For both least-squares and linear pro-
grams, the objective function evaluates the terms ai^T x − bi, i = 1, 2, …, k.
In a linear program such as Chebyshev approximation, the problem involves the ab-
solute values of these terms, while its least-squares counterpart involves the sum of
their squares. Another important distinction is the non-differentiability of the objective
function in (1.2.6), whereas the least-squares objective in (1.2.1) is quadratic and hence
differentiable. Nevertheless, the Chebyshev approximation problem can be transformed
into the following linear program:

    minimize_{t,x}   t
    subject to       ai^T x − bi ≤ t,     i = 1, 2, …, k,
                     −(ai^T x − bi) ≤ t,  i = 1, 2, …, k,        (1.2.7)

with t ∈ R and x ∈ R^n, which is ready to be solved. Reducing a problem to a linear
program requires more knowledge than recognizing a least-squares problem. Nevertheless,
like the least-squares problem, once the problem is transformed into or recognized as a
linear program, well-developed software [13] can be applied.
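The transformation from (1.2.6) to (1.2.7) can be exercised with an off-the-shelf LP solver. The sketch below uses scipy.optimize.linprog with the stacked variable z = [x; t]; the helper name chebyshev_fit and the data are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def chebyshev_fit(A, b):
    """Solve min_x max_i |a_i^T x - b_i| via the LP (1.2.7).
    Stacked variable z = [x; t]; the objective is to minimize t."""
    k, n = A.shape
    c = np.r_[np.zeros(n), 1.0]                  # minimize t
    A_ub = np.block([[A, -np.ones((k, 1))],      #  a_i^T x - b_i <= t
                     [-A, -np.ones((k, 1))]])    # -(a_i^T x - b_i) <= t
    b_ub = np.r_[b, -b]
    # x and t are free variables (linprog defaults to x >= 0).
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n + 1))
    return res.x[:n], res.x[n]                   # (x*, minimax residual t*)

A = np.array([[1.0, 1.0], [1.0, -1.0], [2.0, 1.0], [0.5, 3.0]])
b = np.array([1.0, 0.0, 2.0, 1.5])
x_opt, t_opt = chebyshev_fit(A, b)
print(x_opt, t_opt)
```

At the optimum, t equals the largest absolute residual, which is no larger than that of the least-squares solution.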
1.2.2 Convex Optimization
Convex optimization has been one of the most active and exciting research areas
in optimization recently, and it refers to minimizing an objective function, which is
convex but not necessarily differentiable, subject to convex constraints. The problem
is formulated as
minimizex
f0(x)
subject to fi(x) bi, i = 1, 2, ,m,(1.2.8)
with convex functions f0, f1, , fm : Rn R, satisfying (1.1.3). The least-squaresproblem of (1.2.1) and linear programming problem of (1.2.5) are the special cases
of the convex optimization problem of (1.2.8). Like linear programs, convex problems in general have no analytical solution, but there are effective methods to solve them. The interior-point method [10, 11] is one of the most practical solvers for convex problems: it solves (1.2.8) to a specified accuracy in a number of steps or iterations, with an operation count that does not exceed a polynomial of the problem dimensions [14]. Ignoring the structure of the problem, such as sparsity, each
step requires the order of
$$
\max\{n^3,\; n^2 m,\; F\} \tag{1.2.9}
$$
operations [15], where $F$ is the cost of evaluating the first and second derivatives of the objective and constraint functions $f_0, f_1, \ldots, f_m$, which are assumed to be differentiable. Interior-point methods for solving convex optimization problems are relatively mature. As with least-squares and linear programs, exploiting problem sparsity makes extremely large-scale problems solvable in practice.
The usage of convex optimization is much the same as that of least-squares and linear programming. By recognizing or formulating the optimization problem in convex form, it can be solved efficiently, just like solving least-squares and linear programming problems. However, recognizing a convex function or transforming a problem into a convex program can be difficult, and the technique is much more sophisticated than recognizing a least-squares problem or a linear program. When a problem is cast into convex form, the structure of the optimal solution, which often reveals design insights, can be identified with rigorous optimality conditions and a duality theory [16, 17]. Furthermore, numerical algorithms are available to compute the solution, so the study of convex optimization focuses on the formulation techniques. Once the convex problem is formed, it can be regarded as solved.
In the recent decade, convex optimization has become an essential tool in engineering because of two beneficial properties of convexity:
1. Convex optimization gives a globally optimal solution which can be found effi-
ciently and reliably.
2. The optimization problem can be computed within any desired accuracy using
well-developed numerical methods.
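A small numerical illustration of the first property (my own sketch, not from the thesis): for a smooth convex objective, a generic descent method reaches the same global minimizer from any starting point. The particular logistic-plus-quadratic objective below is an arbitrary choice:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 4))

def f(x):
    # smooth and strictly convex: a logistic-type loss plus a quadratic term
    return np.sum(np.logaddexp(0.0, A @ x)) + 0.5 * np.dot(x, x)

# BFGS started from several random points lands on the same minimizer,
# which for a convex problem is the global one.
sols = [minimize(f, rng.standard_normal(4), method="BFGS").x for _ in range(5)]
spread = max(np.linalg.norm(s - sols[0]) for s in sols)
```

For a non-convex objective, the same experiment would generally scatter the returned points across several local minima.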
The solutions of practical engineering problems, which are often large in scale, can then be obtained in a relatively reliable and efficient manner. Convex optimization, to a certain extent, provides an indispensable modern computational tool, which extends classical best-fit problem solvers such as least-squares and linear programming. A much larger and richer class of problems has been enabled by this optimization technique [18]. Breakthroughs in algorithms for solving convex problems [10, 11, 19-21] and advances in computing power equip us to solve these kinds of problems. New engineering applications are being proposed from almost every discipline, for instance, control [22-25], circuit design [26, 27], computer science [28], and signal processing [16, 17, 29-32].
1.2.3 Nonlinear Optimization
With nonlinear constraint functions and/or objective function, which may not be convex, the problem is said to be a nonlinear optimization or nonlinear programming problem. Unfortunately, there is no algorithm able to solve the general optimization problem stated in (1.1.1) with nonlinear and non-convex constraint functions or objective function. Even problems with a few variables can be extremely challenging, and exhaustive search may be a must. Stochastic algorithms are blooming in recent optimization research. Examples include the genetic algorithm (GA) [33], implemented as a computer simulation in which a population of abstract representations (called chromosomes or the genotype of the genome) of candidate solutions (called individuals, creatures, or phenotypes) evolves toward better solutions; the jumping genes evolutionary algorithm [34], which introduces a genetic operator called jumping genes transposition to increase the ability of GA to find extreme solutions; particle swarm optimization [35], an algorithm modeled on swarm intelligence that searches a solution space; ant colony optimization [36], a probabilistic technique for solving computational problems that can be reduced to finding good paths through graphs; Markov chain Monte Carlo (MCMC) [37], a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its equilibrium distribution; and particle filters [38], which are usually used to estimate Bayesian models, are the sequential analogue of MCMC batch methods, and are often similar to importance sampling methods. However, these algorithms fail to guarantee global convergence within a reasonable time, and they are computationally intensive. In general, methods for nonlinear optimization involve one of two compromises: settling for local convergence, or sacrificing computational efficiency.
In local optimization, seeking the global minimizer $x^\star$, which minimizes the objective function over all feasible solutions, is not the major concern. Instead, a point that is only locally optimal is obtained; that is, the obtained solution is not guaranteed to produce the smallest objective value over all feasible points. Local optimization methods are fast and are able to handle large-scale problems. The only requirement for a local optimization method such as Newton's method [39] is a differentiable objective function and constraints. Therefore, local optimization is widely adopted in engineering design applications. However, apart from the possibility of local convergence, the methods require an initial estimate of the optimization variable $x$, meaning that the quality of the solution depends heavily on how far the initial guess is from the globally optimal point. Local optimization methods are also sensitive to the user-defined parameters of the algorithm, hence careful adjustments are required for a particular family of problems. The procedure of applying a local optimization method involves choosing a suitable algorithm, adjusting the algorithm parameters, and finding a high-quality initial guess or a method for generating one.
Comparing convex optimization and local optimization for nonlinear optimization problems, the former provides a unique globally optimal solution and is able to handle even a non-differentiable objective function or constraints, although relaxation may be needed. To achieve the globally optimal solution using exhaustive search in the general optimization problem of (1.1.1), efficiency is inescapably sacrificed: the solving time is expected to grow exponentially with the problem size. When the number of problem variables is small and the computation time is not critical, it may be worthwhile to find the true global optimum. Examples are worst-case analysis [40] and verification of an expensive or safety-critical system in engineering design [41].
On the other hand, convex optimization also plays an important role even when the problem is non-convex. First, the solution obtained from convex optimization can be a high-quality initial estimate for local optimization: the original problem is approximated as a convex problem, the approximated problem is solved in a relatively straightforward manner, and the solution is then used as the starting point for the local optimization method. Second, some nondeterministic polynomial-time hard (NP-hard) [42] combinatorial problems can be approximated in convex formulation, and the approximated problem can be solved in polynomial time. Finally, in some applications, the exact solution is not important, and a lower bound on the optimal value is required at relatively low computational cost. Two methods that fulfill such a requirement are relaxation, which essentially replaces each non-convex constraint with a looser but convex constraint, and Lagrangian relaxation, where the Lagrangian dual problem [20] is solved instead of the primal problem, as the primal and dual solutions are not always the same. The dual problem is convex and it provides a lower bound on the optimal value of the non-convex problem.
To conclude, convex optimization plays an important role in nonlinear optimization problems. Some signal processing problems are introduced in this thesis as examples to illustrate the contribution of convex optimization.
1.3 Applications to Signal Processing Problems
In this thesis, some important signal processing problems are addressed from the convex point of view. With these enhanced classes of optimization, new constraints or requirements can be added to the original problems, and new insights can be gained through duality theory. However, the convex optimization technique also has limitations. When the problem is non-convex in nature and it is impossible to make it convex, there is no option but to relax the problem, and the performance is unavoidably degraded by such relaxation. The degradation in estimation accuracy heavily depends on the relaxation technique applied. Guidelines can be outlined, but there is no standard procedure to follow; the dependence on empirical experience is therefore another major limitation in applying convex optimization. Also, it is difficult to devise performance analyses for convex formulations, as there are no analytic solutions in general. In this thesis, the operation of convex reformulation is demonstrated on several signal processing problems.
Source localization [43], sinusoidal parameter estimation [44], polynomial root
finding [45] and the determination of the capacity region [46] will be tackled from a convex optimization perspective. The problems are first formulated as optimization problems, and they are either relaxed or transformed into convex problems to yield high-fidelity global solutions.
The optimization problem described in (1.1.1) is an abstract problem of making the best possible choice of a vector in $\mathbb{R}^n$ from a set of candidate choices. The variable $x$ represents the choice made; the constraints $f_i(x) \le b_i$, $i = 1, 2, \ldots, m$, denote the firm requirements or specifications that limit the possible choices; and the objective function represents the cost of choosing $x$. A solution $x^\star$ of the optimization problem (1.1.1) corresponds to a choice that has the minimum cost.
1.3.1 Source Localization
In the localization problem, the positions of targets, such as a sensor node or mobile terminal, are the parameters of interest, given position-bearing measurements, such as time-of-arrival (TOA), time-difference-of-arrival (TDOA) or angle-of-arrival, together with the known coordinates of receivers. For single-source TOA-based positioning, which is the simplest case of localization, the case of $m$ receivers with known positions at $\mathbf{x}_i = [x_i, y_i]^T$, $i = 1, 2, \ldots, m$, is considered, and the target is located at the unknown position $\mathbf{x} = [x, y]^T$. It is worth noting that this can be directly upgraded to the 3-dimensional case by including the $z$-coordinate in $\mathbf{x}_i$ and $\mathbf{x}$. The distance between the target and the $i$th receiver, which is obtained from multiplying the known propagation speed by the corresponding TOA measurement, is
$$
d_i = \|\mathbf{x} - \mathbf{x}_i\|_2 + \epsilon_i, \quad i = 1, 2, \ldots, m, \tag{1.3.1}
$$
where $\epsilon_i \in \mathbb{R}$ is the error of the $i$th measurement. Assuming that the noise is a zero-mean white Gaussian process, the maximum-likelihood (ML) estimator is [47]:
$$
\begin{array}{ll}
\underset{\mathbf{x},\,\{r_i\}}{\text{minimize}} & \displaystyle\sum_{i=1}^{m} (r_i - d_i)^2 \\
\text{subject to} & r_i = \|\mathbf{x} - \mathbf{x}_i\|_2, \quad i = 1, 2, \ldots, m,
\end{array} \tag{1.3.2}
$$
where the norm function with equality constraint does not fit the convex framework, and hence (1.3.2) is a non-convex problem. With the convex optimization technique of semidefinite relaxation, Cheung et al. [48] proposed a convex program formulation for this single-source case.
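For intuition only (this sketch is not the semidefinite relaxation developed in the thesis, and the receiver and source coordinates below are made up), a noiseless instance of the ML cost in (1.3.2) can be minimized with a generic nonlinear least-squares solver when a reasonable initial guess is available; the convex formulations exist precisely to remove this dependence on initialization:

```python
import numpy as np
from scipy.optimize import least_squares

# Receiver positions (known) and a hypothetical source at (2, 3).
X = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
source = np.array([2.0, 3.0])
d = np.linalg.norm(X - source, axis=1)       # noiseless TOA-derived ranges

def residuals(x):
    # r_i(x) = ||x - x_i||_2 - d_i, the terms of the ML cost (1.3.2)
    return np.linalg.norm(X - x, axis=1) - d

est = least_squares(residuals, x0=np.array([5.0, 5.0])).x
```

Started far from the true position, or with heavy noise, such a local solver may instead return a local minimum, which motivates the relaxation approach of Chapter 3.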
The problem has been extended to multiple sources in a collaborative environment, which corresponds to sensor network node localization, and the related convex formulations are presented by Biswas et al. [49, 50]. However, they concentrate on the case where the anchor positions and the propagation speed are perfectly known, which is not valid in some applications. In this thesis, node localization in the presence of uncertainties in anchor positions and/or propagation speed is tackled. Furthermore, source localization in non-line-of-sight (NLOS) propagation [51], which contributes significant error, is addressed. Source localization using TDOA measurements is also studied based on convex optimization. The development of these kinds of problems under relaxation techniques is presented in Chapter 3.
1.3.2 Sinusoidal Parameter Estimation
The simplest case of the problem, namely, estimation of the parameters of a single real sinusoid [52], is first considered, and its discrete-time signal model is
$$
x(i) = \alpha \cos(\omega i + \phi) + q(i), \quad i = 1, 2, \ldots, m, \tag{1.3.3}
$$
where $\alpha \in \mathbb{R}_{++}$, $\omega \in (0, \pi)$ and $\phi \in [0, 2\pi)$ are unknown but deterministic constants which represent the tone amplitude, frequency and phase, respectively, while the noise $q(i)$ is assumed to be a zero-mean white process with unknown variance. The objective is to find $\alpha$, $\omega$ and $\phi$ given the $m$ samples of $\{x(i)\}$. Similar to (1.3.2) and (1.1.1), the ML sinusoidal parameter estimator is
$$
\begin{array}{ll}
\underset{\alpha,\,\omega,\,\phi,\,\{s(i)\}}{\text{minimize}} & \displaystyle\sum_{i=1}^{m} (s(i) - x(i))^2 \\
\text{subject to} & s(i) = \alpha \cos(\omega i + \phi), \quad i = 1, 2, \ldots, m,
\end{array} \tag{1.3.4}
$$
where $s(i)$ is the estimate of the noise-free signal. The constraint functions involve the nonlinear operator $\cos(\cdot)$, which corresponds to a nonlinear and non-convex optimization problem, but relaxation can be utilized to yield a practically solvable problem. With the same principle, other related signal models, that is, the single/multiple complex tone [53-55], single two-dimensional complex tone [56] and polynomial phase signal [57, 58], are taken into consideration in Chapter 4.
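As a quick sanity check (my own sketch with arbitrary parameter values, not the relaxation-based estimators of Chapter 4), the nonlinear least-squares cost in (1.3.4) can be minimized locally once the frequency is coarsely initialized from the periodogram peak:

```python
import numpy as np
from scipy.optimize import least_squares

m = 200
i = np.arange(1, m + 1)
alpha, omega, phi = 1.5, 0.7, 0.4            # hypothetical true parameters
x = alpha * np.cos(omega * i + phi)          # noiseless samples of (1.3.3)

# Coarse frequency estimate from the periodogram peak (FFT bin k -> 2*pi*k/m).
k = np.argmax(np.abs(np.fft.rfft(x)))
w0 = 2.0 * np.pi * k / m

def residuals(p):
    a, w, ph = p                             # residual terms of the cost (1.3.4)
    return a * np.cos(w * i + ph) - x

a_hat, w_hat, ph_hat = least_squares(residuals, x0=[1.0, w0, 0.0]).x
```

Without the periodogram initialization, the highly oscillatory cost surface in $\omega$ would trap the local solver; this sensitivity is one motivation for the convex reformulations.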
1.3.3 Polynomial Root-Finding
The problem of solving a polynomial equation is to find $x$ that satisfies
$$
p(x) = p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n = 0, \tag{1.3.5}
$$
where $p_i$ is the $i$th real coefficient of the polynomial. To transform it into an optimization problem like (1.1.1), $|p(x)|^2$ is minimized, hence the root-finding problem becomes
$$
\begin{array}{ll}
\underset{x,\,\bar{\mathbf{x}}}{\text{minimize}} & |\mathbf{p}^T \bar{\mathbf{x}}|^2 \\
\text{subject to} & \bar{x}_i = x^i, \quad i = 0, 1, \ldots, n,
\end{array} \tag{1.3.6}
$$
where the vector $\mathbf{p} = [p_0, p_1, \ldots, p_n]^T$ stores the coefficients and $\bar{\mathbf{x}} = [x^0, x^1, \ldots, x^n]^T$. The challenging polynomial equality constraints, $\bar{x}_i = x^i$, $i = 0, 1, \ldots, n$, in (1.3.6) make the problem nonlinear, hence relaxation is needed to solve it. The relaxed optimization problem under the convex technique is developed in Chapter 5.
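For comparison (an aside, not the convex-relaxation approach of Chapter 5), the classical way to solve (1.3.5) is via the eigenvalues of the companion matrix, which is what `numpy.roots` implements; each returned root drives the objective $|p(x)|^2$ of (1.3.6) to zero. The cubic below is an arbitrary example:

```python
import numpy as np

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3); numpy.roots expects
# the coefficients ordered from the highest degree down.
coeffs = [1.0, -6.0, 11.0, -6.0]
roots = np.sort(np.roots(coeffs).real)

# Evaluate |p(x)|^2 at the recovered roots (np.polyval also uses high-first order).
residual = np.abs(np.polyval(coeffs, roots)) ** 2
```

The companion-matrix route handles one polynomial at a time; the optimization viewpoint of (1.3.6) is what opens the door to the relaxations studied later.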
1.3.4 One-Sided Parallel Gaussian Interference Channels
The determination of the capacity region of $L$ parallel Gaussian interference channels [59] is an open problem. By considering a one-sided situation, which is a special case of the channel, the sum capacity is shown to be a convex function of the two users' powers, and the optimization problem can be expressed as:
$$
\begin{array}{ll}
\underset{\mathbf{p}_1,\,\mathbf{p}_2}{\text{maximize}} & \displaystyle\sum_{l=1}^{L} C_a^{(l)}\big(p_1^{(l)}, p_2^{(l)}\big) \\
\text{subject to} & \mathbf{p} \in \mathcal{P},
\end{array} \tag{1.3.7}
$$
where $\mathbf{p}_1, \mathbf{p}_2 \in \mathbb{R}^L$ denote the power allocations of the two users, namely, user 1 and user 2, $\mathbf{p}$ is the power allocation for $(\mathbf{p}_1, \mathbf{p}_2)$, $\mathcal{P}$ is the feasible set of the power allocation, and $C_a(p_1, p_2)$, which is convex but non-differentiable, is the sum capacity function of a single one-sided Gaussian interference channel. Unlike the previous problems, where relaxation is the major technique, exploiting the inherent structure in (1.3.7) is the major concept for dealing with this sum capacity problem. Based on our finding about the inherent structure, a numerical algorithm is proposed to compute the sum capacity. The details of the algorithm as well as the formulation of the sum capacity problem are presented in Chapter 6.
1.4 Thesis Organization
The organization of this thesis is as follows. A brief introduction to convex optimization techniques and theories is given in Chapter 2: the disciplines of convex optimization, including optimization theory, convex analysis and numerical computation, together with duality theory and the types of convex optimization, are presented. Then, the signal processing problems, namely, source localization, sinusoidal parameter estimation, polynomial root finding and the determination of the capacity region, as well as their common applications and the development of convex formulations, are presented in Chapters 3 to 6, respectively. TOA- and TDOA-based localization is studied in Chapter 3. For TOA-based localization, algorithms for sensor network localization with uncertainties in anchor positions and/or propagation speed are developed. Furthermore, the semidefinite relaxation formulation for identifying NLOS measurements is derived. For TDOA source localization, two relaxations based on linear least-squares and ML are proposed. Sinusoidal parameter estimation, including the single complex/real tone, multiple complex tone, two-dimensional complex tone and polynomial phase signal, is presented in Chapter 4. Based on relaxations of the periodogram, ML and nonlinear least-squares, algorithms are proposed under the convex framework. The application of convex optimization to polynomial root-finding is described in Chapter 5. In Chapter 6, the sum capacity of a Gaussian interference channel is investigated; the algorithm to obtain the sum capacity is proposed and the related theoretical proofs are presented. Finally, conclusions and possible future work are presented in Chapter 7.
CHAPTER 2
Preliminaries
In this chapter, some background information about convex optimization theory is introduced. The disciplines of convex optimization, duality theory and the types of convex optimization are presented in the following sections.
2.1 Disciplines of Convex Optimization
Inheriting the wisdom accumulated in three disciplines, namely, convex analysis [60-62], optimization theory [63-66] and numerical computation [6, 67-69], convex optimization can be regarded as a fusion of these research topics.
In this section, the three disciplines are briefly introduced in order to give a better understanding of the scope of convex optimization. Optimization theory, which contributes notations and problem forms to convex optimization, is presented. Convex analysis equips engineers with the ability to distinguish and handle convex sets and convex functions; these two convex objects are critical for researchers to recognize convex problems or to transform problems into the convex framework. Numerical computation is vital in solving the problem in an efficient and reliable manner. These are the three pillars of convex optimization.
2.1.1 Optimization Theory
In this subsection, optimization theory is introduced. Convex optimization adopts its notation and representation from the general optimization discipline. To better understand the meaning of a convex problem, the general notation and terms of optimization theory are presented. Finally, some tricks for handling difficult functions, developed in the field of general optimization, are introduced.

Optimization is the study of seeking a set of real or integer solutions that minimize or maximize a real objective function in a systematic manner over an allowed set [1]. Mathematically, it can be written as (1.1.1), where an equality constraint is converted to two inequality constraints and the bounds $b_i$ are included in the
constraint functions. More specifically, the constraints can be divided into inequality and equality constraints; here (1.1.1) can be rewritten as:
$$
\begin{array}{ll}
\underset{x}{\text{minimize}} & f_0(x) \\
\text{subject to} & f_i(x) \le 0, \quad i = 1, 2, \ldots, m, \\
& h_i(x) = 0, \quad i = 1, 2, \ldots, p,
\end{array} \tag{2.1.1}
$$
which describes the problem of finding the optimization variable $x \in \mathbb{R}^n$ minimizing the objective function $f_0(x)$, $f_0 : \mathbb{R}^n \to \mathbb{R}$, within the feasible set satisfying the conditions $f_i(x) \le 0$, $i = 1, 2, \ldots, m$, and $h_i(x) = 0$, $i = 1, 2, \ldots, p$, which refer to the inequality constraints and the equality constraints, respectively. The functions $f_i : \mathbb{R}^n \to \mathbb{R}$ and $h_i : \mathbb{R}^n \to \mathbb{R}$ are called inequality constraint functions and equality constraint functions, respectively. The allowed set of the problem is defined as the intersection of the domains of all constraint functions:
$$
\mathcal{D} = \bigcap_{i=0}^{m} \operatorname{dom} f_i \;\cap\; \bigcap_{i=1}^{p} \operatorname{dom} h_i, \tag{2.1.2}
$$
where a point $x \in \mathcal{D}$ is called feasible and $\mathcal{D}$ is the feasible set or constraint set. The problem of (2.1.1) is said to be feasible if at least one point satisfies all the constraints, and infeasible otherwise.
Optimization theory provides a standard mathematical formulation of problems. By restricting the constraints and the objective function to comply with the convexity requirement, which is revealed by convex analysis, the problem can be said to be convex.

The optimal value of (2.1.1) is denoted by $o^\star$ and is defined as
$$
o^\star = \inf\{f_0(x) \mid f_i(x) \le 0,\; i = 1, 2, \ldots, m,\; h_i(x) = 0,\; i = 1, 2, \ldots, p\}, \tag{2.1.3}
$$
where $o^\star \in [-\infty, \infty]$. The value $o^\star = \infty$ if the problem is infeasible, while $o^\star = -\infty$ when there are points $x_k$ with $f_0(x_k) \to -\infty$, in which case (2.1.1) is unbounded below. The optimal point $x^\star$ is defined by $f_0(x^\star) = o^\star$, which means that substituting the solution into the objective produces the lowest value over the feasible set.
In some problems, optimal points may not be unique, and they are defined as a set
$$
X_{\mathrm{opt}} = \left\{x \,\middle|\, f_i(x) \le 0,\; i = 1, 2, \ldots, m,\; h_i(x) = 0,\; i = 1, 2, \ldots, p,\; f_0(x) = o^\star \right\}, \tag{2.1.4}
$$
where if $X_{\mathrm{opt}}$ is a non-empty set, (2.1.1) is solvable, and unsolvable otherwise, including the situation of being unbounded below. A suboptimal solution $x$ with $f_0(x) \le o^\star + \epsilon$, where $\epsilon > 0$, is called $\epsilon$-suboptimal for (2.1.1). A locally optimal $x$ is defined by
$$
f_0(x) = \inf\left\{f_0(z) \,\middle|\, f_i(z) \le 0,\; i = 1, 2, \ldots, m,\; h_i(z) = 0,\; i = 1, 2, \ldots, p,\; \|z - x\|_2 \le R \right\}, \tag{2.1.5}
$$
where $R > 0$ and $z \in \mathbb{R}^n$ is the variable vector. Let $x$ be a feasible point; if $f_i(x) = 0$, the inequality constraint $f_i(x) \le 0$ is said to be active at $x$, and inactive otherwise. Inactive constraints are redundant in the sense that the feasible set does not change if such a constraint is omitted.
In some problems, the objective is identically zero or some constant; these are classified as feasibility problems:
$$
\begin{array}{ll}
\underset{x}{\text{minimize}} & 0 \\
\text{subject to} & f_i(x) \le 0, \quad i = 1, 2, \ldots, m, \\
& h_i(x) = 0, \quad i = 1, 2, \ldots, p,
\end{array} \tag{2.1.6}
$$
where the feasibility problem is to find a feasible point that satisfies all the constraints; otherwise it is unsolvable.
The optimization problem in (2.1.1) is in standard form. The convention of the standard form is to place zeros on the right-hand side of the inequality and equality constraints. For instance, the equality constraint $g_i(x) = \tilde{g}_i(x)$ is represented as $h_i(x) = 0$, where $h_i(x) = g_i(x) - \tilde{g}_i(x)$, and the inequality $f_i(x) \ge 0$ is expressed as $-f_i(x) \le 0$. A maximization problem can be replaced by minimizing the negative objective function $-f_0(x)$ subject to the constraints.

Unsurprisingly, there are intractable problems in practice, but some of them are computationally friendly. Engineers can always make their lives easier by transforming complicated problems into solvable programs. Several techniques developed within optimization theory are introduced below, namely, substitution of variables, transformation of functions, insertion of slack variables, divide-and-conquer, and employing the epigraph problem form.
Substitution of Variables
It is assumed that a function $\phi : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one with $\phi(\operatorname{dom} \phi) \supseteq \mathcal{D}$, where $\mathcal{D}$ is the problem domain. The functions $\tilde{f}_i$ and $\tilde{h}_i$ are defined as
$$
\tilde{f}_i(z) = f_i(\phi(z)), \quad i = 0, 1, \ldots, m, \qquad \tilde{h}_i(z) = h_i(\phi(z)), \quad i = 1, 2, \ldots, p. \tag{2.1.7}
$$
Substituting $x = \phi(z)$ in (2.1.1) gives
$$
\begin{array}{ll}
\underset{z}{\text{minimize}} & \tilde{f}_0(z) \\
\text{subject to} & \tilde{f}_i(z) \le 0, \quad i = 1, 2, \ldots, m, \\
& \tilde{h}_i(z) = 0, \quad i = 1, 2, \ldots, p,
\end{array} \tag{2.1.8}
$$
with $z \in \mathbb{R}^n$. The two problems characterized by (2.1.1) and (2.1.8) are equivalent. If (2.1.1) can be solved efficiently with solution $x$, then the solution of (2.1.8) is obtained through the inverse function $z = \phi^{-1}(x)$. This technique helps us to change the target variables: in some situations, a problem is difficult to solve in terms of particular variables, but easy to solve in terms of others.
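A toy instance of this technique (my own example, not from the text): minimizing $f(x) = (\log x - 2)^2$ over $x > 0$ carries an awkward domain restriction, but substituting $x = \phi(z) = e^z$ yields the unconstrained quadratic $(z - 2)^2$, whose minimizer maps back through $\phi$:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Original problem: minimize (log x - 2)^2 over x > 0.
# After substituting x = phi(z) = exp(z), the problem in z is unconstrained:
g = lambda z: (z - 2.0) ** 2

z_star = minimize_scalar(g).x       # solve in the transformed variable
x_star = np.exp(z_star)             # map the solution back via x = phi(z)
```

The recovered $x^\star = e^2$ solves the original constrained problem, with the positivity constraint absorbed by the change of variables.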
Transformation of Functions
Suppose that $\psi_0 : \mathbb{R} \to \mathbb{R}$ is monotone increasing, $\psi_1, \psi_2, \ldots, \psi_m : \mathbb{R} \to \mathbb{R}$ fulfill $\psi_i(u) \le 0$ if and only if $u \le 0$, and $\psi_{m+1}, \psi_{m+2}, \ldots, \psi_{m+p} : \mathbb{R} \to \mathbb{R}$ satisfy $\psi_i(u) = 0$ if and only if $u = 0$. The composite functions $\tilde{f}_i$ and $\tilde{h}_i$ are defined as
$$
\tilde{f}_i(x) = \psi_i(f_i(x)), \quad i = 0, 1, \ldots, m, \qquad \tilde{h}_i(x) = \psi_{m+i}(h_i(x)), \quad i = 1, 2, \ldots, p, \tag{2.1.9}
$$
and the problem is transformed into
$$
\begin{array}{ll}
\underset{x}{\text{minimize}} & \tilde{f}_0(x) \\
\text{subject to} & \tilde{f}_i(x) \le 0, \quad i = 1, 2, \ldots, m, \\
& \tilde{h}_i(x) = 0, \quad i = 1, 2, \ldots, p,
\end{array} \tag{2.1.10}
$$
where the feasible set remains unchanged. This technique assists in handling some complicated functions by wrapping them with a monotone increasing function. The problem may become solvable after applying such an alteration, without changing the optimal point. A subclass of convex programming, geometric programming (GP) [70], is essentially based on this technique, and GP will be reviewed in Section 2.3.1.
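A one-line numerical check of this idea (an illustration of the GP-style log transform, with an arbitrary objective of my choosing): since $\log(\cdot)$ is monotone increasing, $f$ and $\log f$ share the same minimizer:

```python
import numpy as np
from scipy.optimize import minimize_scalar

f = lambda x: np.exp((x - 1.0) ** 2)     # positive, awkward-to-handle objective
g = lambda x: np.log(f(x))               # monotone wrap: reduces to (x - 1)^2

x_f = minimize_scalar(f, bounds=(-5.0, 5.0), method="bounded").x
x_g = minimize_scalar(g, bounds=(-5.0, 5.0), method="bounded").x
```

Both runs return the same minimizer $x = 1$, although the transformed objective is a far better-behaved (here, quadratic) function.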
Insertion of Slack Variables
Inserting slack variables $s_i \ge 0$ is a procedure to convert an inequality constraint to an equality constraint and vice versa, that is, $f_i(x) \le 0 \Leftrightarrow f_i(x) + s_i = 0$ for some $s_i \ge 0$. The transformed optimization problem is
$$
\begin{array}{ll}
\underset{x,\,s}{\text{minimize}} & f_0(x) \\
\text{subject to} & s_i \ge 0, \quad i = 1, 2, \ldots, m, \\
& f_i(x) + s_i = 0, \quad i = 1, 2, \ldots, m, \\
& h_i(x) = 0, \quad i = 1, 2, \ldots, p,
\end{array} \tag{2.1.11}
$$
where $x \in \mathbb{R}^n$ and $s = [s_1, s_2, \ldots, s_m]^T$ are the vector containing the optimization variables and the vector storing the slack variables, respectively. In (2.1.11), there are $n + m$ variables, $m$ inequality constraints, which limit $s_i$, $i = 1, 2, \ldots, m$, to be nonnegative, and $m + p$ equality constraints. The technique is useless in terms of simplifying the original problem, but it helps us to evaluate how tight an approximate problem is. In Section 1.2.3 (see page 10), relaxation was mentioned, which replaces constraints with looser but convex constraints; equality constraints are usually replaced by inequality constraints, so by introducing slack variables, it is possible to evaluate how good the approximation is: the smaller the value of $\sum_{i=1}^{m} s_i$, the tighter the relaxation.
Divide-and-Conquer
In some optimization problems, there holds
$$
\inf_{x,y} f(x, y) = \inf_{x} \tilde{f}(x), \tag{2.1.12}
$$
where
$$
\tilde{f}(x) = \inf_{y} f(x, y), \tag{2.1.13}
$$
with two sets of optimization variables encapsulated in $x$ and $y$. It is possible to minimize a function by first minimizing over some of the variables, and then minimizing over the remaining ones. In the standard form of (2.1.1), it is assumed that the variable $x \in \mathbb{R}^n$ is partitioned as $x = (x_1, x_2)$ with $x_1 \in \mathbb{R}^{n_1}$, $x_2 \in \mathbb{R}^{n_2}$ such that $n_1 + n_2 = n$. Then, by grouping the constraints according to the partitioned variables, the following problem can be obtained:
$$
\begin{array}{ll}
\underset{x_1,\,x_2}{\text{minimize}} & f_0(x_1, x_2) \\
\text{subject to} & f_i(x_1) \le 0, \quad i = 1, 2, \ldots, m_1, \\
& \tilde{f}_i(x_2) \le 0, \quad i = 1, 2, \ldots, m_2,
\end{array} \tag{2.1.14}
$$
where the constraints $f_i$, $i = 1, 2, \ldots, m_1$, and $\tilde{f}_i$, $i = 1, 2, \ldots, m_2$, are independent, in the sense that each constraint function depends only on $x_1$ or $x_2$. The variable $x_2$ is first minimized out, and $\tilde{f}_0$ is defined in terms of $x_1$ only:
$$
\tilde{f}_0(x_1) = \inf\left\{ f_0(x_1, z) \,\middle|\, \tilde{f}_i(z) \le 0,\; i = 1, 2, \ldots, m_2 \right\}, \tag{2.1.15}
$$
with variable $z \in \mathbb{R}^{n_2}$ representing the inner minimization over $x_2$. The problem of (2.1.14) is equivalent to
$$
\begin{array}{ll}
\underset{x_1}{\text{minimize}} & \tilde{f}_0(x_1) \\
\text{subject to} & f_i(x_1) \le 0, \quad i = 1, 2, \ldots, m_1,
\end{array} \tag{2.1.16}
$$
where the number of variables is smaller than that in (2.1.14). The computation time grows with the number of variables; indeed, from (1.2.9), the per-step computation time is a cubic function of the number of variables involved. Dividing the variables and performing the optimizations one by one is thus a way to reduce the computation time.
Employing Epigraph Form
The standard form of (2.1.1) can be represented by the following epigraph form:
$$
\begin{array}{ll}
\underset{x,\,t}{\text{minimize}} & t \\
\text{subject to} & f_0(x) - t \le 0, \\
& f_i(x) \le 0, \quad i = 1, 2, \ldots, m, \\
& h_i(x) = 0, \quad i = 1, 2, \ldots, p,
\end{array} \tag{2.1.17}
$$
where $x \in \mathbb{R}^n$ and $t \in \mathbb{R}$ are the variables. The problem is to minimize $t$ over the epigraph of $f_0$ subject to the constraints on $x$. This technique has been demonstrated by the transformation from (1.2.6) to (1.2.7), where the non-differentiable objective function in (1.2.6) is replaced by its epigraph. When $n = 1$ is considered, $x$ is a scalar denoted by $x$, and the optimization problem in $(x, t)$ can be interpreted geometrically. This is illustrated in Figure 2.1, where the optimal point is denoted by $(x^\star, t^\star)$.
Figure 2.1: Geometric interpretation of epigraph form problem.
Optimization theory endows convex optimization with notations, symbols and techniques. The notations and symbols of optimization theory are adopted by convex optimization, so that convex optimization does not need to develop its own language. The techniques for transforming optimization problems assist in solving some apparently intractable problems.

In the next subsection, the set and function properties, which fall within the discipline of convex analysis, are the focus. With knowledge about sets and functions, it is possible to distinguish the different properties of constraint sets and objective functions.
2.1.2 Convex Analysis
The origins of convex analysis date back to antiquity, but the name itself appeared only in the 1960s [4]. There have been many recent discoveries in convex analysis, so it is a fusion of classical and modern topics. Closely related to geometry and deeply connected with analysis, convex analysis has stimulated much interest recently because of its vast applications in mathematics, mathematical physics and economics. As a branch of mathematics, convex analysis is devoted to the study of convex sets and convex functions. In the following subsections, brief explanations of these two major convex objects, namely, convex sets and convex functions, are presented.
Convex Set
A set $S \subseteq \mathbb{R}^n$ is said to be a convex set if it contains the line segment joining any two of its points, that is, if for any $x, y \in S$ and any $\theta$ with $0 \le \theta \le 1$, $\theta x + (1 - \theta) y \in S$. Figure 2.2 shows four 2-dimensional examples. The leftmost rounded shape and the four-sided shape, whose boundaries are included, are convex. The irregularly shaped set and the separated set on the right-hand side are not convex, because part of a line segment joining two of their points is not contained in the set. From the geometric point of view, a convex set is always bulging outward, with no dents or kinks in it: every point in the set can be seen from every other point along the corresponding straight line, which lies in the set.

Figure 2.2: Some simple convex and non-convex sets.
In the following subsections, some common and important convex sets, namely, lines and line segments, affine sets, convex hulls, convex cones, hyperplanes and half-spaces, and Euclidean balls and ellipsoids, as well as some sets constructed by convexity-preserving operations, are described.
Lines and Line Segments A line or line segment can be represented as
$$
y = \theta x_1 + (1 - \theta) x_2, \tag{2.1.18}
$$
with $x_1 \ne x_2 \in \mathbb{R}^n$; for $\theta \in [0, 1]$ this form represents the line segment between the points $x_1$ and $x_2$, while $\theta \in \mathbb{R}$ gives the whole line. In addition, $y$ can be expressed in the form
$$
y = x_2 + \theta(x_1 - x_2), \tag{2.1.19}
$$
where $y$ is the sum of the base point $x_2$ and the direction $x_1 - x_2$ scaled by $\theta$.
Affine Sets If for any $x_1, x_2 \in C$, $C \subseteq \mathbb{R}^n$, and any $\theta \in \mathbb{R}$, $\theta x_1 + (1 - \theta) x_2 \in C$, then $C$ is defined as affine; in other words, $C$ contains the affine combination of any two of its points. The idea can be generalized to more than two points. A point $\theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k$, where $\theta_1 + \theta_2 + \cdots + \theta_k = 1$, is defined as an affine combination of the points $x_1, x_2, \ldots, x_k$, and the set of all affine combinations of points in a set $C \subseteq \mathbb{R}^n$ is called the affine hull of $C$:
$$
\operatorname{aff} C = \{\theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k \mid x_1, x_2, \ldots, x_k \in C,\; \theta_1 + \theta_2 + \cdots + \theta_k = 1\}, \tag{2.1.20}
$$
where the affine hull is the smallest affine set containing $C$.
Convex Hulls The convex hull of a set $C$, denoted by $\operatorname{conv} C$, is the set of all convex combinations of points in $C$:
$$
\operatorname{conv} C = \{\theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k \mid x_i \in C,\; \theta_i \ge 0,\; i = 1, 2, \ldots, k,\; \theta_1 + \theta_2 + \cdots + \theta_k = 1\}, \tag{2.1.21}
$$
where $\operatorname{conv} C$ is always convex, and it is the smallest convex set that contains $C$: if $G$ is any convex set that contains $C$, then $\operatorname{conv} C \subseteq G$. Figure 2.3 illustrates the definition of the convex hull.
Convex Cones A convex cone $C \subseteq \mathbb{R}^n$ contains all rays emerging from the origin passing through its points, and all line segments joining any points on those rays, that is,
$$
x, y \in C,\; \theta_1, \theta_2 \ge 0 \implies \theta_1 x + \theta_2 y \in C, \tag{2.1.22}
$$
Figure 2.3: Convex hull of the kidney shaped set.
Figure 2.4: Geometric interpretation of a convex cone.
where the geometrical interpretation is shown in Figure 2.4.
Besides, the nonnegative orthant $\mathbb{R}^n_+$ is a convex cone. The set of symmetric positive semidefinite matrices, $\mathbb{S}^n_+ = \{X \in \mathbb{S}^n \mid X \succeq 0\}$, is also a convex cone, since any positive combination of positive semidefinite matrices is positive semidefinite; hence $\mathbb{S}^n_+$ is called the positive semidefinite cone.
Hyperplanes and Half-Spaces A hyperplane is defined as the set containing all possible $x$:

$\{x \mid a^T x = b\}$, (2.1.23)

with $a \in \mathbb{R}^n$, $a \neq 0$ and $b \in \mathbb{R}$, and it can be alternatively expressed as

$\{x \mid a^T(x - x_0) = 0\}$, (2.1.24)

where $a$ is the normal vector and $x_0$ lies on the hyperplane. As a hyperplane contains the line segment joining any two of its points, it is a convex set.
In a similar manner, a half-space is defined as

$\{x \mid a^T x \leq b\}$, (2.1.25)

with $a \in \mathbb{R}^n$, $a \neq 0$ and $b \in \mathbb{R}$, and the alternative representation is

$\{x \mid a^T(x - x_0) \leq 0\}$, (2.1.26)

where $a$ is the normal vector, $x_0$ lies on the boundary, and the set represents the region on one side of the boundary. The less-than-or-equal sign, $\leq$, can be replaced with greater-than-or-equal, $\geq$, to represent the region on the other side. Both half-spaces are convex in nature, as every line segment joining any two points of the set also lies within the set.
Euclidean Balls and Ellipsoids A Euclidean ball in $\mathbb{R}^n$ is defined as

$B(x_c, r) = \{x \mid \|x - x_c\|_2 \leq r\} = \{x \mid (x - x_c)^T (x - x_c) \leq r^2\}$, (2.1.27)

where the radius is denoted by $r > 0$ and $x_c$ is the center of the ball. An alternative representation for the Euclidean ball is

$B(x_c, r) = \{x_c + ru \mid \|u\|_2 \leq 1\}$, (2.1.28)

which also contains the points within a sphere of radius $r$ centered at $x_c$. The Euclidean ball is a convex set: when $\|x_1 - x_c\|_2 \leq r$, $\|x_2 - x_c\|_2 \leq r$ and $0 \leq \theta \leq 1$ are considered, then

$\|\theta x_1 + (1-\theta)x_2 - x_c\|_2 = \|\theta(x_1 - x_c) + (1-\theta)(x_2 - x_c)\|_2 \leq \theta\|x_1 - x_c\|_2 + (1-\theta)\|x_2 - x_c\|_2 \leq r$. (2.1.29)
Apart from the Euclidean ball, another similar set is the ellipsoid, which is defined as

$\mathcal{E} = \{x \mid (x - x_c)^T P^{-1} (x - x_c) \leq 1\}$, (2.1.30)

where $P = P^T \succ 0$ is a symmetric positive definite matrix and $x_c \in \mathbb{R}^n$ is the center of the ellipsoid. The lengths of the semi-axes of $\mathcal{E}$ are given by $\sqrt{\lambda_i}$, where $\lambda_i$, $i = 1, 2, \ldots, n$, are the eigenvalues of $P$. A Euclidean ball is a special case of an ellipsoid with $P = r^2 I_n$. An alternative representation of an ellipsoid is

$\mathcal{E} = \{x_c + Au \mid \|u\|_2 \leq 1\}$, (2.1.31)

where $A = P^{1/2}$ is a square and nonsingular matrix.
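As a quick numerical sanity check (a minimal sketch; the shape matrix $P$ and center $x_c$ below are arbitrary example data), any point generated from the image representation (2.1.31) also satisfies the quadratic-form definition (2.1.30):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary positive definite shape matrix P and center xc (example data).
M = rng.standard_normal((3, 3))
P = M @ M.T + 3.0 * np.eye(3)     # P = P^T > 0 by construction
xc = np.array([1.0, -2.0, 0.5])

# A = P^{1/2} via the symmetric eigendecomposition.
w, V = np.linalg.eigh(P)
A = V @ np.diag(np.sqrt(w)) @ V.T

# Sample points x = xc + A u with ||u||_2 <= 1 (shrink any u outside the ball).
u = rng.standard_normal((100, 3))
u /= np.maximum(np.linalg.norm(u, axis=1, keepdims=True), 1.0)
x = xc + u @ A.T                  # image representation (2.1.31)

# Verify the quadratic-form definition (2.1.30) for every sample.
Pinv = np.linalg.inv(P)
q = np.einsum('ij,jk,ik->i', x - xc, Pinv, x - xc)
print(np.all(q <= 1.0 + 1e-9))    # → True
```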
Sets Constructed by Convexity Preservation Operations Preservation of
convexity is one of the most important properties. There are many convexity preser-
vation operations and the study of them is still an active research area [71,72]. Some
simple examples are scalar multiplication, vector sum and linear transformation, but the most important operation is intersection, as it lets us combine different convex sets into a new one.

Let $\mathcal{A}$ be an arbitrary index set and $\{S_\alpha \mid \alpha \in \mathcal{A}\}$ a collection of convex sets; then the intersection $\bigcap_{\alpha \in \mathcal{A}} S_\alpha$ is also convex.
Examples of this preservation are everywhere. For instance, a polyhedron, which is the intersection of a finite number of half-spaces, is a convex set by the intersection principle. Another example of a convex set is the solution set of the linear matrix inequality

$F(x) = A_0 + x_1 A_1 + \cdots + x_m A_m \succeq 0$, (2.1.32)

where $A_0, A_1, \ldots, A_m \in \mathbb{S}^n$ are the coefficient matrices. The solution set $\{x \mid F(x) \succeq 0\}$ is convex, since it is the inverse image of the positive semidefinite cone under an affine mapping of $x$.
After presenting convex sets, which always correspond to the constraints and the feasible solution sets of problems, another convex object, the convex function, which always relates to the objective function in a convex problem, is described.
Convex Function
A function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if its domain, $\operatorname{dom} f$, is convex and for all $x, y \in \operatorname{dom} f$ and $\theta \in [0, 1]$ it satisfies $f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta)f(y)$. If $f$ is convex, then $-f$ is concave, and some examples, which consider $n = 1$, are shown in Figure 2.5.
Figure 2.5: Convexity of functions.
An example of a convex function is $x^2$, whose domain is $\mathbb{R}$, while $\log(x)$ is concave, with $x$ defined on $\mathbb{R}_{++}$. To fit the criterion that the objective function should return $+\infty$ for infeasible solutions, a convex function $f$ is always extended as

$\tilde{f}(x) = \begin{cases} f(x), & x \in \operatorname{dom} f \\ +\infty, & x \notin \operatorname{dom} f \end{cases}$, (2.1.33)

where $\tilde{f}$ still satisfies the basic definition of convexity. In this thesis, the same symbol is used for $f$ and its extension, so all convex functions are assumed to be extended.
Some important properties about the convex function are reviewed in the following
sections, namely, first-order and second-order conditions, sub-level sets, epigraph and
operations that preserve convexity.
First-Order and Second-Order Conditions If $f$ is a differentiable function, then $f$ is convex if and only if $\operatorname{dom} f$ is convex and

$f(y) \geq f(x) + \nabla f(x)^T (y - x)$, (2.1.34)

where (2.1.34) holds for all $x, y \in \operatorname{dom} f \subseteq \mathbb{R}^n$. The inequality of (2.1.34) shows that if $\nabla f(x) = 0_{n \times 1}$, then for all $y \in \operatorname{dom} f$, $f(y) \geq f(x)$; hence $x$ is a global minimizer of the function $f$.

It is assumed that the Hessian, $\nabla^2 f$, of the function $f$ exists at each point in $\operatorname{dom} f$. Then $f$ is convex if and only if $\operatorname{dom} f$ is convex and its Hessian is positive semidefinite, that is, $\nabla^2 f(x) \succeq 0$ for all $x \in \operatorname{dom} f$.

These conditions provide simple ways to determine whether a differentiable or twice-differentiable function is convex or not.
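The second-order condition translates directly into a numerical test. The sketch below (the quadratic test function $f(x) = x^T Q x + b^T x$ and both matrices are hypothetical examples) declares the quadratic convex exactly when the smallest eigenvalue of its constant Hessian is nonnegative:

```python
import numpy as np

def quadratic_is_convex(Q):
    """Second-order condition for f(x) = x^T Q x + b^T x,
    whose Hessian is the constant matrix Q + Q^T."""
    H = Q + Q.T                              # Hessian of x^T Q x
    return bool(np.linalg.eigvalsh(H).min() >= 0)

# Convex example: positive definite Q.
Q1 = np.array([[2.0, 0.5], [0.5, 1.0]])
# Non-convex example: indefinite Q (one negative eigenvalue).
Q2 = np.array([[1.0, 0.0], [0.0, -1.0]])

print(quadratic_is_convex(Q1), quadratic_is_convex(Q2))  # → True False
```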
Sub-Level Sets The $\alpha$-sub-level set of $f : \mathbb{R}^n \to \mathbb{R}$ is defined as

$C_\alpha = \{x \in \operatorname{dom} f \mid f(x) \leq \alpha\}$, (2.1.35)

and $C_\alpha$ of a convex function is convex for every $\alpha$. Figure 2.6 shows the geometric interpretation of the sub-level sets of a function $f : \mathbb{R} \to \mathbb{R}$.

Figure 2.6: Geometric interpretation of sub-level sets.

This property helps us identify quasi-convex functions, whose sub-level sets are all convex. Such functions may contain stationary points that are overlooked by the first-order or second-order conditions even when the function is differentiable. For some non-convex functions with local minima, these points can be excluded with a well-designed sub-level set.
Epigraph It is assumed that $f : \mathbb{R}^n \to \mathbb{R}$ is a function. Its graph is defined as $\{(x, f(x)) \mid x \in \operatorname{dom} f\}$, which is a subset of $\mathbb{R}^{n+1}$, and the epigraph of $f$ is defined as

$\operatorname{epi} f = \{(x, t) \mid x \in \operatorname{dom} f,\ f(x) \leq t\}$, (2.1.36)

which is also a subset of $\mathbb{R}^{n+1}$. The epigraph is a linkage between convex sets and convex functions, as a function is convex if and only if its epigraph is a convex set, which equips us to determine whether a function is convex by its epigraph.
Operations that Preserve Convexity The common convexity preservation op-
erations are nonnegative weighted sum, composition with an affine mapping and
point-wise maximum and supremum.
If $f_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, 2, \ldots, m$, are convex, then the nonnegative weighted sum

$f = \sum_{i=1}^{m} w_i f_i$ (2.1.37)

is also convex with $w_i \geq 0$, $i = 1, 2, \ldots, m$.

Next, a composition with an affine mapping $g : \mathbb{R}^m \to \mathbb{R}$ is considered with

$g(x) = f(Ax + b)$, (2.1.38)

where $f : \mathbb{R}^n \to \mathbb{R}$, $A \in \mathbb{R}^{n \times m}$, $b \in \mathbb{R}^n$ and $\operatorname{dom} g = \{x \mid Ax + b \in \operatorname{dom} f\}$. If $f$ is convex, then $g$ is also convex.

If $f_1$ and $f_2$ are convex functions and $f$ is defined as

$f(x) = \max\{f_1(x), f_2(x)\}$, (2.1.39)

with $\operatorname{dom} f = \operatorname{dom} f_1 \cap \operatorname{dom} f_2$, then $f$ is convex, and hence the point-wise maximum

$f(x) = \max\{f_1(x), f_2(x), \ldots, f_m(x)\}$ (2.1.40)

is also convex.
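The point-wise maximum rule can be spot-checked numerically. In the minimal sketch below (the two convex functions $f_1(x) = x^2$ and $f_2(x) = |x| + 1$ are illustrative choices), the convexity inequality $f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta)f(y)$ holds for $f = \max\{f_1, f_2\}$ on randomly sampled chords:

```python
import numpy as np

rng = np.random.default_rng(1)

f1 = lambda x: x**2                       # convex
f2 = lambda x: np.abs(x) + 1.0            # convex
f  = lambda x: np.maximum(f1(x), f2(x))   # point-wise maximum (2.1.40)

# Randomly sample chords and check the convexity inequality for f.
x = rng.uniform(-5, 5, 10000)
y = rng.uniform(-5, 5, 10000)
theta = rng.uniform(0, 1, 10000)

lhs = f(theta * x + (1 - theta) * y)
rhs = theta * f(x) + (1 - theta) * f(y)
print(np.all(lhs <= rhs + 1e-9))  # → True
```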
The study of convex objects enables us to recognize or cast a problem to fit into the convex framework. With knowledge of the requirements on the constraints and the objective function of an optimization problem, a better understanding of how to relax or solve the problem can be achieved. Convex analysis contributes a systematic means of analyzing the convexity of the optimization problem and the assurance of optimality.
2.1.3 Numerical Computation
Numerical computation, which devotes much attention to matrix calculation, is the study of algorithms that perform mathematical computations on computers. Numerical computation also plays an important role in engineering and computational science problems, such as image and signal processing, computational finance, material science simulations, structural biology, data mining, bioinformatics and fluid dynamics. The software packages for numerical computation rely on the development, analysis and implementation of state-of-the-art algorithms for solving various numerical linear algebra problems, in large part because of the role of matrices in finite difference and finite element methods. For example, common problems are LU decomposition, QR decomposition, singular value decomposition, eigenvalue computation, the interior-point method and conic optimization. Numerical computation provides computational support for convex optimization. A convex optimization problem can be solved efficiently with the development of numerical computation algorithms or even some convex optimization software packages such as YALMIP [73], CVX [74], CVXOPT [75], SeDuMi [76], SDPT3 [77-79], etc.
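As a concrete illustration of such numerical support (a minimal sketch using SciPy's `linprog`; the two-variable linear program below is made-up example data), a convex problem in standard form can be handed to an off-the-shelf solver:

```python
import numpy as np
from scipy.optimize import linprog

# Example LP: maximize x + y  (i.e. minimize -x - y)
# subject to  x + 2y <= 4,  3x + y <= 6,  x, y >= 0.
c = np.array([-1.0, -1.0])
A_ub = np.array([[1.0, 2.0], [3.0, 1.0]])
b_ub = np.array([4.0, 6.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])

# The solver returns the global optimum of this convex problem:
# the vertex (1.6, 1.2) with objective value -2.8.
print(res.x, res.fun)
```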
Some numerical methods for tackling convex problems are reviewed in the follow-
ing sections, namely, ellipsoid method, sub-gradient method, cutting-plane methods
and interior-point methods.
Ellipsoid Method
The ellipsoid method is an algorithm for solving convex optimization problems. It
was introduced by Shor [80], Nemirovsky [81], and Yudin [82] in 1972, and used by
Khachiyan [83] to prove the polynomial-time solvability of linear programs [84]. At
that time, the ellipsoid method was the only algorithm for solving linear programs whose runtime was provably polynomial. The algorithm encloses the minimizer of a convex function in a sequence of ellipsoids whose volume decreases at each iteration.
First, a convex problem in the standard form of (2.1.1) is considered, and it is assumed that every equality constraint $h_i$ is replaced by two inequalities. An arbitrary initial ellipsoid $\mathcal{E}_0 \subseteq \mathbb{R}^n$ can be defined, which is

$\mathcal{E}_0 = \{z \mid (z - x_0)^T P_0^{-1} (z - x_0) \leq 1\}$, (2.1.41)

where $z \in \mathbb{R}^n$ and $x_0$ is the center of $\mathcal{E}_0$. At the $k$th iteration, the point $x_k \in \mathbb{R}^n$ is located at the center of $\mathcal{E}_k$:

$\mathcal{E}_k = \{z \mid (z - x_k)^T P_k^{-1} (z - x_k) \leq 1\}$. (2.1.42)
Then, the cutting-plane oracle is defined, which is given by a sub-gradient of $f_0$ at $x_k$, $g_{k+1} \in \mathbb{R}^n$, satisfying

$g_{k+1}^T (x^\star - x_k) \leq 0$, (2.1.43)

so the optimal point $x^\star$ satisfies

$x^\star \in \mathcal{E}_k \cap \{z \mid g_{k+1}^T (z - x_k) \leq 0\}$, (2.1.44)
where $\mathcal{E}_{k+1}$ is chosen as the minimal-volume ellipsoid containing this half of $\mathcal{E}_k$, and hence the solution $x^\star$. The update is given by

$x_{k+1} = x_k - \dfrac{1}{n+1} P_k \tilde{g}_{k+1}$ (2.1.45)

$P_{k+1} = \dfrac{n^2}{n^2 - 1}\left(P_k - \dfrac{2}{n+1} P_k \tilde{g}_{k+1} \tilde{g}_{k+1}^T P_k\right)$, (2.1.46)

where

$\tilde{g}_{k+1} = \dfrac{1}{\sqrt{g_{k+1}^T P_k g_{k+1}}}\, g_{k+1}$. (2.1.47)
For each iteration, the feasibility of $x_k$ is checked. If $x_k$ is feasible, a sub-gradient $g_{k+1}$ of $f_0$ is chosen that satisfies

$g_{k+1}^T (x^\star - x_k) + f_0(x_k) - f^{(k)}_{\mathrm{best}} \leq 0$, (2.1.48)

where $f^{(k)}_{\mathrm{best}}$ is the smallest objective value over the $k$ feasible iterations. Otherwise $x_k$ is infeasible and violates the $j$th constraint, and the feasible set of $z$ is updated as

$g_{(j)}^T (z - x_k) + f_j(x_k) \leq 0$, (2.1.49)

where $g_{(j)}$ is a sub-gradient of $f_j$ at $x_k$. Finally, the stopping criterion is given by

$\sqrt{g_{k+1}^T P_k g_{k+1}} \leq \epsilon \implies f_0(x_k) - f(x^\star) \leq \epsilon$, (2.1.50)

where $\epsilon$ is the tolerable error.
Although the algorithm provides polynomial runtime and a clear stopping criterion, the interior-point method and variants of the simplex algorithm are much faster than the ellipsoid method, in both theory and practice.
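The updates (2.1.45)-(2.1.47) are straightforward to implement for the unconstrained differentiable case, where the gradient serves as the sub-gradient. Below is a minimal sketch (the function names and the test problem $f(x) = \|x - c\|_2^2$ are illustrative choices, not from the thesis):

```python
import numpy as np

def ellipsoid_method(f, grad, x0, P0, max_iter=300):
    """Minimize a convex function via the ellipsoid updates (2.1.45)-(2.1.47).

    x0, P0 describe an initial ellipsoid known to contain the minimizer.
    Requires dimension n >= 2 (the factor n^2/(n^2 - 1) is undefined for n = 1).
    """
    n = x0.size
    x, P = x0.astype(float).copy(), P0.astype(float).copy()
    x_best, f_best = x.copy(), f(x)
    for _ in range(max_iter):
        g = grad(x)                       # (sub)gradient at the current center
        gPg = g @ P @ g
        if gPg <= 1e-24:                  # ellipsoid has collapsed; stop
            break
        g_t = g / np.sqrt(gPg)            # normalized sub-gradient (2.1.47)
        Pg = P @ g_t
        x = x - Pg / (n + 1)              # center update (2.1.45)
        P = (n**2 / (n**2 - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(Pg, Pg))  # (2.1.46)
        if f(x) < f_best:                 # track the best value so far
            f_best, x_best = f(x), x.copy()
    return x_best, f_best

# Illustrative problem: f(x) = ||x - c||_2^2 with gradient 2(x - c).
c = np.array([1.0, -2.0])
f = lambda x: np.sum((x - c)**2)
grad = lambda x: 2.0 * (x - c)

x_best, f_best = ellipsoid_method(f, grad, x0=np.zeros(2), P0=100.0 * np.eye(2))
print(np.round(x_best, 3))  # converges to c
```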
Sub-Gradient Method
Sub-gradient methods are algorithms for solving convex optimization problems. Orig-
inally developed by Shor [80] in the 1960s and 1970s, sub-gradient methods can be
used for a non-differentiable objective function [85]. When the objective function is
differentiable, sub-gradient methods for unconstrained problems use the same search direction as the method of steepest descent.
Although sub-gradient methods can be much slower than interior-point methods and Newton's method in practice, they can be applied immediately to a far wider variety of problems and require much less memory. Moreover, by combining the sub-gradient method with primal or dual decomposition techniques, it is sometimes possible to develop a simple distributed solver.
A convex function $f : \mathbb{R}^n \to \mathbb{R}$ with $\operatorname{dom} f \subseteq \mathbb{R}^n$ is considered. The sub-gradient method at the $k$th iteration is

$x_{k+1} = x_k - \alpha_k g_k$, (2.1.51)

where $g_k$ is a sub-gradient of $f$ at $x_k$ and $\alpha_k > 0$ is the step size, which is governed by step size rules such as the constant step size $\alpha_k = \alpha$ and the constant step length $\alpha_k = \gamma / \|g_k\|_2$, where $\gamma = \|x_{k+1} - x_k\|_2$. For differentiable $f$, $g_k$ equals $\nabla f$ at $x_k$. A list $f^{(k)}_{\mathrm{best}}$ that keeps track of the smallest objective function value over $k$ valid iterations is also kept, that is,

$f^{(k)}_{\mathrm{best}} = \min\{f^{(k-1)}_{\mathrm{best}}, f(x_k)\}$. (2.1.52)
For a convex problem, the sub-gradient algorithm with constant step size or constant step length is guaranteed to converge to within some range of the optimal value, hence

$\lim_{k \to \infty} f^{(k)}_{\mathrm{best}} - f^\star \leq \epsilon$, (2.1.53)

where $f^\star$ and $\epsilon$ are the optimal value and the error, respectively. The sub-gradient method can be extended to handle the problem in the form of (2.1.1), where constraints are present, but the sub-gradient is modified as

$g_k \in \begin{cases} \partial f_0(x_k), & f_i(x_k) \leq 0,\ i = 1, 2, \ldots, m \\ \partial f_j(x_k), & f_j(x_k) > 0 \end{cases}$, (2.1.54)
where $\partial f$ is the sub-differential of $f$. If $x_k$ is feasible, the algorithm uses an objective sub-gradient; otherwise the algorithm chooses a sub-gradient of the violated constraint $f_j$.
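A minimal sketch of the unconstrained update (2.1.51), here with a diminishing step size rather than the constant rules above (the test function $f(x) = |x_1 - 1| + |x_2 + 2|$ is an illustrative choice; its sub-gradient is the componentwise sign):

```python
import numpy as np

def subgradient_method(f, subgrad, x0, iters=2000):
    """Sub-gradient iteration (2.1.51) with diminishing steps alpha_k = 1/(k+1),
    tracking the running best value f_best as in (2.1.52)."""
    x = x0.astype(float).copy()
    f_best, x_best = f(x), x.copy()
    for k in range(iters):
        g = subgrad(x)
        x = x - (1.0 / (k + 1)) * g        # update (2.1.51)
        if f(x) < f_best:                  # running best (2.1.52)
            f_best, x_best = f(x), x.copy()
    return x_best, f_best

# Non-differentiable test function f(x) = |x1 - 1| + |x2 + 2|, minimum 0 at c.
c = np.array([1.0, -2.0])
f = lambda x: np.abs(x - c).sum()
subgrad = lambda x: np.sign(x - c)         # a valid sub-gradient everywhere

x_best, f_best = subgradient_method(f, subgrad, np.array([5.0, 5.0]))
print(f_best < 1e-2)  # the best value approaches the optimum 0
```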
Cutting-Plane Methods
In mathematics, more specifically in optimization, the cutting-plane method [86] is an umbrella term for optimization methods which iteratively refine a feasible set or objective function by means of linear inequalities, termed cuts. Such procedures are popularly used to find integer solutions to mixed integer linear programming problems, as well as to solve general, not necessarily differentiable, convex optimization problems. The use of cutting-planes to solve mixed integer linear programming problems was introduced by Gomory [87].
Cutting-plane methods for mixed integer linear programming start by solving the non-integer linear program, the relaxation of the given integer program. The obtained optimum is tested for being an integer solution. If it is not, there is guaranteed to exist a linear inequality that separates the optimum from the convex hull of the true feasible set. Finding such an inequality is the separation problem, and a found inequality is a cut. A cut can be added to the relaxed linear program to cut off the current non-integer solution. This process is repeated until an optimal integer solution is found.
Cutting-plane methods for general convex continuous optimization and variants are known under various names: Kelley's method [88], the Kelley-Cheney-Goldstein method [89], and bundle methods [90]. They are popularly used for non-differentiable convex minimization, where a convex objective function and its sub-gradient can be evaluated efficiently but the usual gradient methods for differentiable optimization cannot be used. This situation is most typical for the concave maximization of Lagrangian dual functions. Another common situation is the application of the Dantzig-Wolfe decomposition [91] to a structured
optimization problem in which formulations with an exponential number of variables
are obtained. Generating these variables on demand by means of delayed column
generation is identical to performing a cutting-plane method on the respective dual problem.
Here, the basic cutting-plane method with the simplest Gomory cut is illustrated. It is assumed that an admissible basic solution $x \in \mathbb{R}^n$ is known, satisfying

$\begin{bmatrix} B & F \end{bmatrix} \begin{bmatrix} x_b \\ x_f \end{bmatrix} = b$, (2.1.55)

with $x = [x_b^T\ x_f^T]^T$, where $x_b$ denotes the first $n_b$ elements, and hence

$B x_b + F x_f = b \implies x_b = B^{-1} b - B^{-1} F x_f$. (2.1.56)

Defining $\bar{b} = B^{-1} b$ and $\bar{A} = B^{-1} F$ yields

$x_b + \bar{A} x_f = \bar{b}$, i.e., $x_i + [\bar{A} x_f]_i = \bar{b}_i$, $i = 1, 2, \ldots, n_b$, (2.1.57)

with $x = [x_1, x_2, \ldots, x_n]^T$ and $\bar{b} = [\bar{b}_1, \bar{b}_2, \ldots, \bar{b}_{n_b}]^T$. Then, since $x_i \geq 0$, $i = 1, 2, \ldots, n$, the inequalities

$x_i + [\lfloor\bar{A}\rfloor x_f]_i \leq \bar{b}_i$, $i = 1, 2, \ldots, n_b$, (2.1.58)

are constructed, where $\lfloor\cdot\rfloor$ denotes the element-wise floor. Without losing integer solutions, the right hand side is rounded down to an integer:

$x_i + [\lfloor\bar{A}\rfloor x_f]_i \leq \lfloor\bar{b}_i\rfloor$, $i = 1, 2, \ldots, n_b$. (2.1.59)

By subtracting (2.1.59) from (2.1.57), the following integer formulation of the Gomory cut is obtained:

$[(\bar{A} - \lfloor\bar{A}\rfloor) x_f]_i \geq \bar{b}_i - \lfloor\bar{b}_i\rfloor$, $i = 1, 2, \ldots, n_b$. (2.1.60)
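A small numeric sketch of (2.1.56)-(2.1.60) (the matrices $B$, $F$ and the vector $b$ below are made-up example data, not from the thesis): given a basis, the fractional parts of $\bar{A}$ and $\bar{b}$ give the cut coefficients directly:

```python
import numpy as np

# Hypothetical basis split of the constraints [B F][xb; xf] = b.
B = np.array([[2.0, 1.0], [1.0, 3.0]])
F = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([7.0, 8.0])

# (2.1.56): b_bar = B^{-1} b, A_bar = B^{-1} F.
b_bar = np.linalg.solve(B, b)
A_bar = np.linalg.solve(B, F)

# Fractional parts give the Gomory cut (2.1.60):
#   [(A_bar - floor(A_bar)) xf]_i >= b_bar_i - floor(b_bar_i).
cut_lhs = A_bar - np.floor(A_bar)
cut_rhs = b_bar - np.floor(b_bar)

# The fractional parts always lie in [0, 1); any row with cut_rhs > 0
# cuts off the current fractional basic solution xb = b_bar (xf = 0).
print(cut_lhs, cut_rhs)
```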
Although the Gomory cut was proposed in the 1960s, it was forgotten for almost thirty years. Many experts, including Gomory himself, considered it impractical, both for numerical stability reasons and because it was dismissed as ineffective, since many rounds of cuts are needed to make progress in the objective. The tide turned in the mid-90s, when Cornuejols et al. [92] showed Gomory cuts to be very effective in combination with branch-and-cut, together with ways to overcome the numerical instabilities. Nowadays, all commercial solvers for mixed integer linear programming, which refers to the minimization or maximization of a linear function subject to linear and integrality constraints, use Gomory cuts.
There exist many more general cuts for mixed integer programs. Gomory's cuts, however, are very efficient to generate from a simplex tableau, whereas many other types of cuts are either expensive or even NP-hard to separate. Among the other general cuts for mixed integer linear programming, lift-and-project [93] dominates Gomory cuts.
Interior-Point Methods
Interior-point methods, also referred to as barrier methods, are a certain class of algorithms for solving linear and nonlinear convex optimization problems.

These algorithms were inspired by Karmarkar's algorithm [94], developed in 1984 for linear programming. The basic element of the method is a self-concordant barrier function used to encode the convex set. Contrary to the simplex method [91], an interior-point method reaches an optimal solution by traversing the interior of the feasible region.
Any convex optimization problem can be transformed into minimizing (or max-
imizing) a linear function over a convex set. The idea of encoding the feasible set
using a barrier and design