Implementation of Multiplicative Weights and Multi-Commodity Flows: Parallel and Randomized Approximation Algorithms
BY
Ajay Shankar Bidyarthy
ON
16th July 2012
Report
Submitted to the Systems Research Laboratory
TATA Research Development and Design Center (TRDDC)
54-B, Hadapsar Industrial Estate, Pune, MH, 411013, India.
Under the Guidance
of
Dr. Rajiv Raman
TATA Research Development and Design Center, Pune
Dr. Dilys Thomas
TATA Research Development and Design Center, Pune
1 Implementation of a parallel and randomized algorithm for skew-symmetric matrix games
1.1 Problem Statement
Given an (m,n)-matrix game A = [a_ij] ∈ [−1, 1], we give a serial implementation of a parallel randomized algorithm which computes a pair of ε-optimal strategies in (4/ε²) log n iterations, where n is the size of the matrix A. For any accuracy ε > 0, the expected sequential running time of the suggested algorithm is O((1/ε²) n log n). If we assume m = n, the randomized algorithm achieves an almost quadratic expected speedup relative to any deterministic method [1].
1.2 Introduction
Let A = [a_ij] be an (m,n)-matrix game of value v⋆ = min {v | Ax ≤ ve, eᵀx = 1, x ≥ 0} = max {v | Aᵀy ≥ ve, eᵀy = 1, y ≥ 0}, where e is the vector of all ones. This value is always achieved for a pair of optimal strategies x⋆ ∈ ℝⁿ, y⋆ ∈ ℝᵐ. Let the elements of the payoff matrix A lie in some interval, say A = [a_ij] ∈ [−1, 1].
A pair of strategies x, y is ε-optimal for A if
Ax ≤ (v⋆ + ε)e, eᵀx = 1, x ≥ 0 and Aᵀy ≥ (v⋆ − ε)e, eᵀy = 1, y ≥ 0. (⋆)
Define U(t) = AX(t) ∈ ℝᵐ and, for symmetric matrix games (A = −Aᵀ), V(t) = −U(t) at every iteration t. The algorithm is a simple randomized variant of the deterministic method (where players play pure best responses) in which indices i(t) ∈ {1, ..., n} or j(t) ∈ {1, ..., m} are selected with probability proportional to exp{−εV_i(t)/2} or exp{+εU_j(t)/2} for the minimizing and maximizing player, respectively. For symmetric games, the probability of selecting an index i(t) is given by the Gibbs distribution p_i(t) = exp{εU_i(t)/2} / Σ_{j=1}^n exp{εU_j(t)/2}.
1.3 Implementation and experiments
Let us consider a skew-symmetric matrix game for which A = −Aᵀ. In this case m = n, the value of the game is v⋆ = 0, the sets of optimal strategies of both players coincide, and (⋆) simplifies to the problem of computing an x such that Ax ≤ εe, x ∈ S = {x ∈ ℝⁿ | eᵀx = 1, x ≥ 0}.
The algorithm given below finds an ε-optimal strategy x for A in O(f(n)) time with probability ≥ 1/2, where f(n) is a polynomial. We assume that ε ∈ (0, 1], n = m and n, m ≥ 8.
Result: ε-optimal strategy vector x
Initialize: X = 0, U = 0, X, U ∈ ℝⁿ, p = e/n, t = 0.
repeat
    Iteration count: t = t + 1.
    Random direction: pick a random k ∈ {1, ..., n} with probability p_k.
    X-update: X_k = X_k + 1.
    U-update: for i = 1, ..., n do in parallel: U_i = U_i + a_ik.
    p-update: for i = 1, ..., n do in parallel: p_i = p_i exp{(ε/2) a_ik} / Σ_{j=1}^n p_j exp{(ε/2) a_jk}.
    Stopping criterion: if U/t ≤ εe then output x = X/t and halt.
until halted.
Algorithm 1: Parallel and randomized approximation algorithm for skew-symmetric matrix games
Result: k ∈ {1, ..., n} chosen with probability p_k
for i = 1, ..., n do
    calculate the cumulative sum S_i = p_1 + p_2 + ... + p_i.
end
Generate a uniform random number z ∈ [0, 1].
Pick the index k ∈ {1, ..., n} such that S_{k−1} < z ≤ S_k.
Algorithm 2: Algorithm for picking a random k ∈ {1, ..., n} with probability p_k
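To make Algorithms 1 and 2 concrete, here is a minimal Matlab sketch combining them; the function name gk_skew and its interface are ours, not the report's, and it assumes a skew-symmetric A with entries in [−1, 1]:

% Sketch of Algorithms 1 and 2 combined (names ours, not the report's).
function x = gk_skew(A, epsilon)
    n = size(A, 1);
    X = zeros(n, 1);                 % cumulative pure-strategy counts
    U = zeros(n, 1);                 % running payoffs, U = A*X
    p = ones(n, 1) / n;              % Gibbs distribution, initially uniform
    t = 0;
    while true
        t = t + 1;
        % Algorithm 2: pick k with probability p(k) via cumulative sums
        k = find(cumsum(p) >= rand(), 1, 'first');
        X(k) = X(k) + 1;                          % X-update
        U = U + A(:, k);                          % U-update (parallel in theory)
        p = p .* exp((epsilon / 2) * A(:, k));    % p-update (parallel in theory)
        p = p / sum(p);                           % renormalize
        if all(U / t <= epsilon)                  % stopping criterion U/t <= eps*e
            x = X / t;
            return
        end
    end
end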
Theorem:
1. The stopping criterion U/t ≤ εe guarantees the ε-optimality of the output vector x = X/t [1].
2. The algorithm halts within t⋆ = 4ε⁻² ln n iterations with probability ≥ 1/2 [1].
Corollary:
An ε-optimal strategy for a symmetric matrix game A_{n×n} ∈ [−1, 1] can be computed in O(ε⁻² log n) expected iterations of this algorithm, since the algorithm can be run repeatedly for at most t⋆ = 4ε⁻² ln n iterations per run, succeeding each time with probability at least 1/2, until the stopping criterion is satisfied.
Let f(n) = (1/ε²) ln n. Then
E[t⋆] = (1/2) f(n) + (1/4) · 2 f(n) + (1/8) · 3 f(n) + ... + (1/2ⁱ) · i f(n) + ... = Σ_{i=1}^∞ (i/2ⁱ) f(n) = f(n) Σ_{i=1}^∞ i/2ⁱ ≤ 4 f(n).
The bound follows since 1 + x + x² + x³ + ... = 1/(1−x); differentiating both sides with respect to x gives 1 + 2x + 3x² + ... = 1/(1−x)², which equals 1/(1 − 1/2)² = 4 at x = 1/2. □
Theorem:
An ε-optimal solution for a given symmetric n × n matrix game A = [a_ij] ∈ [−1, 1] can be computed in O(ε⁻² log² n) expected time on an n/log n-processor EREW PRAM ([1], Theorem 2).
1.4 Analysis and results
log n dependence:
Figure 1 shows the number of iterations required to achieve an accuracy of ε. The plot shows logarithmic growth in the size n of the matrix game. For n = 100, 200, ..., 1000 and ε = 0.3, 0.4, 0.5, the required number of iterations varies from 7 to 40. The algorithm outputs an ε-optimal strategy (output vector x) in at most (4/ε²) log n iterations.
Figure 1: Plot of the number of iterations (t⋆) with respect to the size of the matrix game A_{n×n} to achieve an accuracy of ε.
n log n dependence:
Figure 2 shows the time taken to achieve an accuracy of ε. The plot shows an n log n dependence on the size n of the matrix game. For n = 100, 200, ..., 1000 and ε = 0.3, 0.4, 0.5, the time required to obtain an ε-optimal strategy varies from 0.01 seconds to 5 seconds.
Figure 2: Plot of time taken with respect to the size of the matrix A_{n×n} to achieve an accuracy of ε.
1/ε² dependence for the number of iterations:
Figure 3 shows the number of iterations required to achieve various accuracies ε for a fixed matrix A_{n×n}. The plot shows a 1/ε² dependence on the accuracy ε. For ε = 0.02, 0.04, ..., 0.2 and n = 100, 200, 300, the required number of iterations decays from about 55000 to 0. The algorithm outputs an ε-optimal strategy (output vector x) in at most (4/ε²) log n iterations.
Figure 3: Plot of the number of iterations (t⋆) with respect to accuracy ε for a fixed size of the matrix A_{n×n}.
1/ε² dependence for time taken:
Figure 4 shows the time taken to achieve various accuracies ε for a fixed matrix A_{n×n}. The plot shows a 1/ε² dependence on the accuracy ε. For ε = 0.01, 0.02, ..., 1 and n = 100, 200, 300, the time required to obtain an ε-optimal strategy decays from 110 seconds to 0 seconds. The algorithm outputs an ε-optimal strategy (output vector x) in at most (4/ε²) log n iterations.
Figure 4: Plot of time taken with respect to accuracy ε for a fixed size of the matrix A_{n×n}.
1.5 Conclusion
The algorithm maintains the invariants X(t)/t ∈ S = {x ∈ ℝⁿ | eᵀx = 1, x ≥ 0} and U(t) = AX(t), where X(t), U(t) denote the n-vectors X, U at the end of iteration t ≥ 1. This immediately implies that the stopping criterion of the algorithm guarantees the ε-optimality of the output vector x. Due to the limited-precision arithmetic in our implementation, we have verified the assertion condition AX(t) − U(t) < O(10⁻⁸), which is extremely small. Thus we have AX(t) − U(t) ≈ 0, i.e. AX(t) = U(t).
⋆ ⋆ ⋆
2 Performance analysis of the M.D. Grigoriadis and L.G. Khachiyan (GK) algorithm of "A sublinear-time randomized approximation algorithm for matrix games" versus SeDuMi in Matlab
2.1 Introduction
SeDuMi (Self-Dual-Minimization) is an add-on for Matlab which lets us solve optimization problems with linear, quadratic and semidefiniteness constraints. It implements the self-dual embedding technique for optimization over self-dual homogeneous cones. The self-dual embedding technique essentially makes it possible to solve certain optimization problems in a single phase, leading either to an optimal solution or to a certificate of infeasibility. Complex-valued data and variables can also be input to SeDuMi. Large-scale optimization problems are solved efficiently by exploiting sparsity.
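For reference, SeDuMi is driven through a single function call; the sketch below shows only the generic interface, with placeholder cone sizes rather than the exact matrix-game encoding used in our experiments:

% Generic SeDuMi call: min c'*x subject to A*x = b, x in the cone K.
% The cone structure below is a placeholder, not the report's encoding.
K.l = 5;                             % five nonnegative scalar variables
K.s = 10;                            % one 10 x 10 semidefinite block
[x, y, info] = sedumi(A, b, c, K);   % x: primal, y: dual, info: status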
2.2 Analysis and results
Figure 5 shows the precision-time trade-off: the error accuracy ε with respect to the time taken by SeDuMi on a matrix game with matrix A_{5000×5000}.
Figure 5: Plot of the precision-time trade-off (matrix A_{5000×5000}): error accuracy ε with respect to time taken by the SeDuMi Matlab toolbox.
Figure 6 shows the precision-time trade-off (matrix A_{5000×5000}): the error accuracy ε with respect to the time taken by the GK algorithm and by the SeDuMi Matlab toolbox. GK is faster than SeDuMi for small enough error accuracies ε (say ε = 0.01) and large matrix games A_{n×n} (say n = 20000).
Figure 6: Plot of the precision-time trade-off (matrix A_{5000×5000}): error accuracy ε with respect to time taken by the GK algorithm and the SeDuMi Matlab toolbox.
⋆ ⋆ ⋆
3 Beating simplex for fractional packing and covering linear
programs
3.1 Introduction
We are given a constraint matrix with n non-zeros, r rows, and c columns. We give a serial implementation of an approximation algorithm for packing and covering linear programs, considering linear programs with non-negative coefficients. The algorithm we implement computes, with high probability, feasible primal and dual solutions whose costs are within a factor of 1 + ε of OPT (the optimal cost) in time O(n + (r + c) log n/ε²), where n is the number of non-zero entries in the constraint matrix and r + c is the number of rows plus columns (i.e., constraints plus variables).
A packing problem is a linear program of the form max{c·x : Ax ≤ b, x ∈ P}, where the entries of the constraint matrix A are non-negative and P is a convex polytope admitting an optimization oracle. A covering problem is of the form min{c·x : Ax ≥ b, x ∈ P}.
Without loss of generality, by the transformation A′_ij = A_ij/(b_i a_j) we assume that the primal and dual programs are of the restricted forms max{|x| : Ax ≤ 1, x ≥ 0} and min{|x̂| : Aᵀx̂ ≥ 1, x̂ ≥ 0}, respectively.
3.2 Algorithm
On input (A ∈ [0, 1]^{r×c}, ε)
Result: feasible primal x⋆ and dual x̂⋆
Initialize: vectors x = 0, x̂ = 0; scalar N = ⌈2 ln(rc)/ε²⌉.
repeat
    Let p_i = (1 + ε)^{A_i x} and p̂_j = (1 − ε)^{(Aᵀx̂)_j}.
    Choose random indices j′ and i′ respectively from the probability distributions p̂/|p̂| and p/|p|.
    Increase x_{j′} and x̂_{i′} each by 1.
until max_i A_i x ≥ N;
Return x⋆ = x/(max_i A_i x) and x̂⋆ = x̂/(min_j (Aᵀx̂)_j).
Algorithm 3: Algorithm for packing and covering linear programs.
The algorithm we implement is a very simple and slow algorithm, essentially a variant of Grigoriadis and Khachiyan's algorithm [1] given above. For this we assume A_ij ∈ [0, 1], and we analyze the running time. The algorithm demonstrates the use of coupled random increments to the primal and dual. At the end of the iterations the algorithm returns a (1 − 2ε)-approximate primal-dual pair with high probability.
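A minimal Matlab sketch of Algorithm 3 follows; the function name is ours, and the hats lost from the pseudocode above are written out as xhat (dual vector) and phat (column weights):

% Sketch of Algorithm 3 (coupled random increments); names ours.
% Assumes A in [0,1]^{r x c}.
function [xp, xd] = pack_cover(A, epsilon)
    [r, c] = size(A);
    x = zeros(c, 1);  xhat = zeros(r, 1);
    N = ceil(2 * log(r * c) / epsilon^2);
    while max(A * x) < N
        p    = (1 + epsilon) .^ (A * x);      % row weights p_i
        phat = (1 - epsilon) .^ (A' * xhat);  % column weights phat_j
        j = find(cumsum(phat) >= rand() * sum(phat), 1, 'first');
        i = find(cumsum(p)    >= rand() * sum(p),    1, 'first');
        x(j)    = x(j) + 1;                   % primal increment
        xhat(i) = xhat(i) + 1;                % coupled dual increment
    end
    xp = x / max(A * x);                      % feasible primal
    xd = xhat / min(A' * xhat);               % feasible dual
end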
log n dependence:
Figure 7 shows the number of iterations required to achieve an accuracy of ε. The plot shows logarithmic growth in the size of the constraint matrix A_{r×c}. For r × c = 10000, 20000, ..., 80000 and ε = 0.3, 0.4, 0.5, the required number of iterations varies from 50 to 500. The algorithm outputs a (1 − 2ε)-approximate primal-dual pair with high probability.
Figure 7: Plot of the number of iterations (t⋆) with respect to the size of the matrix A_{r×c} to achieve an accuracy of ε.
n log n dependence:
Figure 8 shows the time taken to achieve an accuracy of ε. The plot shows an n log n dependence on the size of the constraint matrix A_{r×c}. For r × c = 10000, 20000, ..., 80000 and ε = 0.3, 0.4, 0.5, the time required to obtain a (1 − 2ε)-approximate primal-dual pair with high probability varies from 0.02 seconds to 3 seconds.
Figure 8: Plot of time taken with respect to the size of the matrix A_{r×c} to achieve an accuracy of ε.
1/ε² dependence for the number of iterations:
Figure 9 shows the number of iterations required to achieve various accuracies ε for a fixed matrix A_{r×c}. The plot shows a 1/ε² dependence on the accuracy ε. For ε = 0.05, 0.1, ..., 0.5, r = 50, 100, 150 and c = 100, 200, 300, the required number of iterations decays from 16000 to 0. The algorithm outputs a (1 − 2ε)-approximate primal-dual pair with high probability.
Figure 9: Plot of the number of iterations (t⋆) with respect to accuracy ε for a fixed size of the matrix A_{r×c}.
1/ε² dependence for time taken:
Figure 10 shows the time taken to achieve various accuracies ε for a fixed matrix A_{r×c}. The plot shows a 1/ε² dependence on the accuracy ε. For ε = 0.05, 0.1, ..., 0.5, r = 50, 100, 150 and c = 100, 200, 300, the time required to obtain a (1 − 2ε)-approximate primal-dual pair with high probability decays from 60 seconds to 0 seconds.
Figure 10: Plot of time taken with respect to accuracy ε for a fixed size of the matrix A_{r×c}.
3.3 Precision issues
The data type double uses 8 bytes of memory and ranges up to about ±1.7E+308, with roughly 15 significant decimal digits. In the third step of the algorithm we compute p_i and p̂_j. For ε < 0.03, r = 200 and c = 400, p_i becomes greater than 1.7E+308, a very large number, and hence overflows to p_i = ∞; similarly, p̂_j becomes less than 1.7E−308, a very small number, and hence underflows to p̂_j = 0. For comparison, the number of atoms in the observable universe is estimated to be between 4 × 10⁷⁹ and 10⁸¹.
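One standard remedy, which we note here as a suggestion rather than as part of the report's implementation, is to keep the weights in the log domain and subtract the maximum before exponentiating; the normalized distribution p/|p| is unchanged by this shift:

% Log-domain weights (suggested workaround, not the report's code).
% p_i = (1+eps)^(A_i x) overflows double once (A*x)*log(1+eps) > ~709.
u = A * x;                    % row loads A_i x
logp = u * log(1 + epsilon);  % log p_i
p = exp(logp - max(logp));    % shift logs: largest weight becomes 1
p = p / sum(p);               % same distribution p/|p|, no overflow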
3.4 Conclusion
This algorithm returns a (1 − 2ε)-approximate primal-dual pair, i.e. feasible primal and dual solutions x⋆ and x̂⋆ such that |x⋆| ≥ (1 − 2ε)|x̂⋆|, with probability at least 1 − 1/(rc). This probability approaches 1 when r and c are taken to be very large. At the end of the iterations the algorithm terminates with the termination condition max_i A_i x ≥ N, which implies (1 − 2ε) max_i A_i x ≤ min_j (Aᵀx̂)_j. Despite the limited-precision arithmetic in our implementation, we have verified the assertion condition on yes instances, with |x| = |x̂|, implying the performance guarantee.
⋆ ⋆ ⋆
4 Implementation of Fast Algorithms for Approximate Semidefinite Programming using the Multiplicative Weights Update Method
4.1 Introduction
Semidefinite programming (SDP) solves the following problem:
min C • X
such that A_j • X ≥ b_j, for j = 1, 2, ..., m,
X ⪰ 0.  (1)
Here X ∈ ℝ^{n×n} is a matrix of variables and A_1, A_2, ..., A_m ∈ ℝ^{n×n}. For n × n matrices A and B, A • B is their inner product (A • B = Σ_ij A_ij B_ij), treating them as vectors in ℝ^{n²}, and A ⪰ 0 is notation for "A is positive semidefinite".
The first polynomial-time algorithm, an approximation algorithm that computes the solution up to any desired accuracy ε in time polynomial in log(1/ε), was via the Ellipsoid method. Faster interior-point methods were later given by Alizadeh [8], with running time Õ(√m (m + n³) L) (Õ suppresses polylog(mn/ε) factors), where L is an input-size parameter, and by Nesterov and Nemirovskii [9].
Our implementation uses the modified multiplicative weights technique, which can handle some high-width SDPs. The algorithm we have implemented assumes a feasibility version of the SDP (1): we implicitly perform a binary search on the optimum, and the objective is converted to a constraint in the standard way. One additional constraint is assumed, a bound on the trace of the solution:
A_j • X ≥ b_j for j = 1, 2, ..., m,
Σ_i X_ii ≤ R,
X ⪰ 0.  (2)
For SDP relaxations it is natural to put an upper bound on the trace Tr(X) = Σ_i X_ii. We solve the SDP approximately up to a given tolerance ε, by which we mean that either we find a solution X which satisfies all the constraints up to an additive error of ε, i.e. A_j • X − b_j ≥ −ε for j = 1, 2, ..., m, or we conclude correctly that the SDP is infeasible.
Multiplicative weights update method: We perform several iterations, indexed by time t. Associate a non-negative weight w_j^{(t)} with constraint j, where Σ_j w_j^{(t)} = 1. A high current weight for a constraint indicates that it was not satisfied well in the past, and therefore should receive higher priority in the next step. Thus the optimization problem for the next step is
max Σ_j w_j^{(t)} (A_j • X − b_j)
subject to Σ_i X_ii ≤ R, X ⪰ 0,
which is an eigenvalue problem, since the optimum is attained at an X of rank 1. Update the weights w_j according to the usual multiplicative update rule, for some constant β:
w_j^{(t+1)} = w_j^{(t)} (1 − β(A_j • X_t − b_j)) / S_t,
where X_t is the solution to the eigenvalue problem, expressed as a rank-1 PSD matrix, at time t, and S_t is the normalization factor making the weights sum to 1:
S_t = Σ_j w_j^{(t)} (1 − β(A_j • X_t − b_j)).
If β is small enough, the average (1/T) Σ_{t=1}^T X_t is guaranteed to converge to a near-feasible solution of the original SDP, assuming a feasible solution exists.
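The rank-1 claim can be checked directly: with w fixed, Σ_j w_j (A_j • X − b_j) = C_w • X − wᵀb with C_w = Σ_j w_j A_j, so over {Tr(X) ≤ R, X ⪰ 0} the optimum is attained at X = R vvᵀ for a top unit eigenvector v of C_w. A small Matlab sketch (variable names ours):

% Inner step as an eigenvalue problem (variable names ours).
Cw = zeros(n);
for j = 1:m
    Cw = Cw + w(j) * Aj{j};            % Cw = sum_j w_j A_j
end
[v, ~] = eigs(Cw, 1, 'largestreal');   % top unit eigenvector of Cw
Xstar = R * (v * v');                  % rank-1 maximizer (if lambda_max >= 0)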
4.2 Algorithm
Objective: max A • X
subject to X_ii ≤ 1 for i = 1, 2, ..., n,
X ⪰ 0,
assuming diag(A) ≥ 0. Let N ≥ n be the number of non-zero entries of A; our implementation gets a multiplicative 1 − O(ε) approximation to the optimum value of the SDP.
On input (n × n matrices A_j and scalars b_j for j = 1, 2, ..., m, ε, R)
Result: Feasible PSD matrix X
Initialize: w_j such that Σ_j w_j = 1, numiter = n log n / ε², β = ε/2.
while T < numiter do
    Compute C: C = Σ_{j=1}^m w_j^{(t)} (A_j − (b_j/R) I).
    if C is negative definite then
        report that the SDP is infeasible and stop
    else
        compute the largest eigenvector of C, say v_1.
    end
    Compute X_t: X_t = v_1 v_1ᵀ.
    Update w_j: w_j^{(t+1)} = w_j^{(t)} (1 − β(A_j • X_t − b_j)) / S_t, where
    S_t = Σ_{j=1}^m w_j^{(t)} (1 − β(A_j • X_t − b_j)).
    Update T: T = T + 1.
end
Compute X: X = (Σ_{t=1}^T X_t) / T, where T = number of iterations (numiter), and return feasible X.
Algorithm 4: Decision-making algorithm for semidefinite programming using the multiplicative weights update method
The algorithm above returns a feasible PSD matrix X and the corresponding number of satisfiable constraints A_j • X ≥ b_j, j = 1, 2, ..., m.
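A compact Matlab sketch of Algorithm 4 follows; the function name and the cell-array convention for the constraints are ours, and like the pseudocode above it uses the rank-1 candidate X_t = v_1 v_1ᵀ:

% Sketch of Algorithm 4 (primal-only MW SDP feasibility check); names ours.
function [X, feasible] = ahk_sdp(Aj, b, R, epsilon)
    m = numel(Aj);  n = size(Aj{1}, 1);
    w = ones(m, 1) / m;                          % weights, sum to 1
    beta = epsilon / 2;
    numiter = ceil(n * log(n) / epsilon^2);
    X = zeros(n);  feasible = true;
    for T = 1:numiter
        C = zeros(n);
        for j = 1:m
            C = C + w(j) * (Aj{j} - (b(j) / R) * eye(n));
        end
        [v, lam] = eigs(C, 1, 'largestreal');    % largest eigenpair of C
        if lam < 0
            feasible = false;  return            % C is negative definite
        end
        Xt = v * v';                             % rank-1 candidate X_t
        loss = cellfun(@(Ak) sum(sum(Ak .* Xt)), Aj(:)) - b(:);  % Aj.Xt - bj
        w = w .* (1 - beta * loss);              % multiplicative update
        w = w / sum(w);                          % divide by S_t
        X = X + Xt;
    end
    X = X / numiter;                             % averaged candidate
end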
4.3 Analysis and results
Figure 11 plots the number of satisfiable constraints versus the number of constraint matrices m, for constraints A_j of size n × n, where n = 10, 20, ..., 100 and j = 1, 2, ..., m = 100. The parameter R, the upper bound on Tr(X), is taken to be 10 and 100, with correspondingly scaled constant factors for the b_j's. Parameters like R and the constant factor of the b_j's can change the decidability of given SDP optimization problems, and our decision-making algorithm AHK decides SDP optimization problems accordingly.
[Figure 11 shows two panels of the number of satisfiable constraints vs. the number of constraint matrices m, with parameters n = 100, m = 100, gaps = 10, for (R = 10, const = 100) and (R = 100, const = 1000) respectively.]
Figure 11: Plot of the number of satisfiable constraints vs. the number of constraint matrices.
Lemma 1 (Section 3 of [4]): Suppose we have a procedure that, given a matrix C ∈ ℝ^{n×n} and a tolerance δ > 0, computes a unit vector x which satisfies xᵀCx ≥ −δ in time T_ev(C, δ), or declares correctly that C is negative definite. Then using this procedure once with C = Σ_{j=1}^m w_j (A_j − (b_j/R) I) and δ = ε/(2R), we obtain an implementation of X (the Oracle).
Lemma 2 (Section 3 of [4]): Let C ∈ ℝ^{n×n} be a matrix with N non-zero entries and eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_n. Let δ > 0 be a given error parameter, and let γ = max{λ_1/(λ_1 + |λ_n|), δ/|λ_n|}. Then the Lanczos algorithm with a random start, applied to the matrix C + ΛI, yields with high probability a unit vector x which satisfies xᵀCx ≥ −δ, or declares correctly that C is negative definite, in time T_ev(C, δ) = Õ(N/√γ). The parameter γ is of course unknown, and this also shows that the running time of the Oracle depends on the sparsity of C, i.e. the number of non-zero entries in it.
Lemma 3 (Section 3 of [4]): Let C ∈ ℝ^{n×n} be a matrix with N non-zero entries and let S = Σ_ij |C_ij|. Let δ > 0 be a given error parameter. Then there is a randomized procedure which runs in O(N) time and with high probability produces a matrix C′ such that C′ has O(√n S/δ) non-zero entries and, for all unit vectors x ∈ ℝⁿ, |xᵀCx − xᵀC′x| ≤ O(δ).
4.4 Correctness of the decision-making algorithm "AHK", a primal-only multiplicative weights SDP solver
We have verified the correctness of the decision-making algorithm "AHK", a primal-only multiplicative weights SDP solver implemented in Matlab, by checking its solutions against those obtained from other solvers such as SeDuMi, SDPA, SDPLR and CSDP; all agree for a given SDP optimization problem with a min/max objective and m constraints.
SDP:
Max A_0 • X
subject to A_j • X = b_j, j = 1, 2, ..., m,
X ⪰ 0, X is n × n PSD,
where m is the number of constraint matrices and n is the size of the constraint square matrices A_j and of X.
"AHK" solves (original problem):
Max A_0 • X
subject to A_j • X ≥ b_j for j = 1, 2, ..., m,
X ⪰ 0, X is n × n PSD.
SDPA, SeDuMi and the other solvers solve (translated SDPs):
Max A_0 • X
subject to A_j • X − y_j = b_j,
where the new PSD variable Y looks like
Y = [ X_{n×n}  0_{n×m} ; 0_{m×n}  diag(y_1, ..., y_m) ] ⪰ 0,
so that Y is an (n+m) × (n+m) PSD matrix. Y being PSD ensures that y_1, y_2, ..., y_m are all non-negative and that X is PSD. This is because det(Y − λI) = (λ − y_1)(λ − y_2)...(λ − y_m) · det(X − λI) = 0; all roots being non-negative implies y_1 ≥ 0, y_2 ≥ 0, ..., y_m ≥ 0, and all roots of det(X − λI) = 0 are non-negative, as X is PSD.
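In Matlab the embedding is one line, and the PSD property can be checked numerically:

% Embedding the slacks: Y is PSD iff X is PSD and all y_j >= 0.
Y = blkdiag(X, diag(y));     % (n+m) x (n+m) block-diagonal matrix
isPSD = min(eig(Y)) >= 0;    % numerical PSD check (up to round-off)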
Steps to check the correctness of the decision-making algorithm "AHK":
1. Solve the SDP optimization problem using the decision-making algorithm "AHK" and check how many constraints are satisfiable (feasible); call this number K.
2. Solve the SDP optimization problem using SDPA, SeDuMi and the others with exactly those K constraints and the objective A_0. Find the optimal value, say A0Opt.
3. Now give the SDP optimization problem to the decision-making algorithm "AHK" with exactly those K constraints plus one extra constraint built from A_0 (i.e. Max A_0 • X = A0Opt).
Thus we now have K constraints + A_0 • X ≥ A0Opt ± δ, for a small δ ≥ 0; we may assume δ = ε. The decision-making algorithm "AHK" should then give
A_0 • X ≥ A0Opt − δ : feasible (satisfiable), and
A_0 • X ≥ A0Opt + δ : infeasible (not satisfiable).
Feasibility of A_0 • X ≥ A0Opt − δ and infeasibility of A_0 • X ≥ A0Opt + δ together guarantee that the decision-making algorithm "AHK" is correct and gives the correct primal optimal objective value of SDP optimization problems.
We check the decision-making algorithm "AHK"'s correctness on the following example.
Input for the decision-making algorithm "AHK":
n = 2, m = 3,
A_0 = [ −11 0 ; 0 23 ], A_1 = [ 10 4 ; 4 0 ], A_2 = [ 0 0 ; 0 −8 ], A_3 = [ 0 −8 ; −8 −2 ], b = (−48, −8, −20)ᵀ.
Decision-making algorithm "AHK" output: K = 3 (all three constraints satisfiable (feasible)).
Give this SDP optimization problem to SDPA, SeDuMi and the other solvers. SDPA, SeDuMi, etc. output:
objValPrimal = +2.3000000262935881e+01
objValDual = +2.2999999846605416e+01
Now give the SDP optimization problem back to the decision-making algorithm "AHK" with 4 constraints. For n = 2, m = 3 + 1 = K + 1, ε = 0.01, R ≥ 11 (since Trace(X) ≤ R), δ = 0.01 and A0Opt = 23.0, "AHK" outputs:
A_0 • X ≥ A0Opt + δ : infeasible (not satisfiable),
A_0 • X ≥ A0Opt − δ : feasible (satisfiable).
4.5 Solving optimization problems using the decision-making algorithm AHK
The decision-making algorithm AHK is a deterministic algorithm when considered theoretically (numerical stability and other issues left aside).
The only parameter under our control is R (Trace(X) ≤ R). We selected R = 100 in all the experiments.
The multiplicative-weights parameter β (AHK05, Section 1.1, page 3, column 1) is selected by us as β = min(0.01, ε/2), and the maximum number of iterations before timeout is max(1000, 16 ln n/ε²). The only other case in which we report an error is when C becomes negative semidefinite during the iterations (C is defined on page 6 of AHK05). The effective optimal values written in the table represent the effective lower bound of the optimal value obtainable for each epsilon accuracy.
SDPA package SDP:
Min b·x
subject to Σ_{j=1}^m A_j x_j − A_0 = X,
X ⪰ 0, X is n × n PSD (Primal)
and
Max A_0 • Y
subject to A_j • Y = b_j, for j = 1, 2, ..., m,
Y ⪰ 0, Y is n × n PSD (Dual)
Decision-making algorithm AHK SDP:
Max A_0 • X
subject to A_j • X ≥ b_j, for j = 1, 2, ..., m,
X ⪰ 0,
Trace(X) ≤ R, X is n × n PSD.
SDPA package SDP converted to the decision-making algorithm AHK SDP:
Min −A_0 • X
subject to
A_j • X ≥ b_j − ε,
−A_j • X ≥ −b_j − ε,
for j = 1, 2, ..., m,
X ⪰ 0,
Trace(X) ≤ R, X is n × n PSD.
To get a lower bound on the optimum value from this decision-making algorithm AHK, we add one more constraint, namely the objective constraint matrix: the (2m+1)-th constraint is −A_0 • X ≥ −(A0Opt ± δ), where A0Opt is the optimal objective value.
That is, to get a lower bound on the optimum objective value, AHK decides the 2m constraints plus one objective constraint −A_0 • X ≥ −(A0Opt ± δ), where A0Opt is the guessed optimum (here minimum) objective value of the given SDP.
Hence the AHK decision algorithm solves and decides 2m + 1 constraints, correctly certifies A0Opt, and correctly gives a lower bound on the optimum objective value of a given SDP.
For the given optimization problems (m is the number of constraints and n is the size of each constraint square matrix), the table below reports the effective lower bound of the optimal objective value, the number of iterations, and the time taken to get this lower bound (optimization problems taken from SDPLIB 1.2).
Results: (m = number of constraints, for an n × n PSD matrix X, with inequalities satisfied up to additive error eps.) Each iteration is the selection of the eigenvector e corresponding to the largest eigenvalue λ of the new weighted cumulative constraint matrix by the Lanczos method, and the addition of e eᵀ to the current solution.
Problem (m, n) | Optimal objective value | ε = δ = 1 | ε = δ = 0.1 | ε = δ = 0.01
(each cell: effective lower bound of the optimal objective value, iterations, time taken in seconds)
hinf1 (13, 14)  | 2.0326  | -1.5, 1, 0.0175  | 1.6, 190, 1.286     | 4.7, 2442, 28.61
hinf2 (13, 16)  | 10.967  | -1.5, 1, 0.0131  | 1.6, 180, 1.2995    | 4.7, 2407, 31.433
hinf3 (13, 16)  | 56.9    | -5.9, 1, 0.0137  | -2.7, 884, 8.1968   | -0.2, 4882, 125.5089
hinf4 (13, 16)  | 274.764 | -17.2, 1, 0.0143 | -2.7, 145, 1.055    | -0.2, 2434, 32.034
hinf5 (13, 16)  | 363     | -1, 1, 0.0137    | --, --, --          | 188.8, 131, 0.947
hinf6 (13, 14)  | 449.0   | -10.7, 1, 0.0138 | -2.6, 187, 1.37     | 14.6, 1088, 10.6997
hinf7 (13, 16)  | 391.0   | -27, 1, 0.0134   | -2.6, 74, 0.4967    | --, --, --
hinf8 (13, 16)  | 116.0   | -17.6, 1, 0.130  | -2.7, 149, 1.0176   | -0.2, 2507, 32.55
hinf9 (13, 16)  | 236.25  | -26.2, 1, 0.129  | -2.7, 96, 0.6517    | -0.2, 1775, 19.53
hinf10 (21, 18) | 109.0   | -35.1, 1, 0.278  | -4.3, 75, 0.6218    | -0.4, 1375, 16.8
hinf11 (31, 22) | 65.9    | -40.8, 1, 0.0279 | -6.3, 76, 0.805     | -0.6, 1344, 21.089
hinf12 (43, 24) | 0.2     | -16.5, 1, 0.272  | -8.6, 703, 10.794   | -0.8, 2876, 73.204
hinf13 (57, 30) | 46.0    | -3.8, 1, 0.0363  | -1.8, 3540, 160.998 | -0.5, 21493, 4390.1
hinf14 (73, 34) | 13.0    | -2.9, 1, 0.0483  | -1.3, 4821, 381.61  | -0.1, 7819, 734.56
hinf15 (91, 37) | 25.0    | -3.2, 1, 0.071   | -1.5, 3567, 296.6   | -0.1, 6734, 924.75
⋆ ⋆ ⋆
5 Data Streaming and Online Algorithms: Matrix Multiplicative Weights
5.1 Introduction
For a symmetric matrix A, we let λ_1(A) ≥ λ_2(A) ≥ ... ≥ λ_n(A) denote its eigenvalues. Imagine a matrix generalization of the usual 2-player zero-sum game. One player chooses a unit vector v ∈ S^{n−1}. The other player chooses a matrix M such that 0 ⪯ M ⪯ I. The first player has to pay the second vᵀMv = M • vvᵀ. We allow the first player to choose his vector from a distribution D over S^{n−1}. We are interested in the expected loss of the first player, that is, E_D[vᵀMv] = M • E_D[vvᵀ]. The matrix P = E_D[vvᵀ] is a density matrix: it is positive semidefinite and has trace 1. (Density matrices appear in quantum computation.) Consider an online version of this game where the first player has to react to an external adversary who picks a matrix M at each step; this is called an observed event. An online algorithm for the first player chooses a density matrix P^{(t)} and observes the event matrix M^{(t)} in each round t = 1, 2, ..., T. After T rounds, the best fixed vector for the first player in hindsight is the unit vector v which minimizes the total loss Σ_{t=1}^T vᵀM^{(t)}v. This is minimized when v is the unit eigenvector of Σ_{t=1}^T M^{(t)} corresponding to the smallest eigenvalue.
Our goal is to implement an algorithm whose total expected loss over the T rounds is not much more than the minimum loss λ_n(Σ_{t=1}^T M^{(t)}).
5.2 Matrix Multiplicative Weights Algorithm
On input (event matrices M^{(t)} and the number of observations T)
Result: Minimum payoff, total payoff, upper bound, worst payoff, random payoff
Initialize: Fix an ε < 1/2, and let ε′ = −ln(1 − ε).
for t = 1, 2, ... do
    Compute W^{(t)} = (1 − ε)^{Σ_{τ=1}^{t−1} M^{(τ)}} = exp(−ε′ Σ_{τ=1}^{t−1} M^{(τ)}).
    Use the density matrix P^{(t)} = W^{(t)} / Tr(W^{(t)}) and observe the event M^{(t)}.
end
Algorithm 5: Data streaming and online algorithm: matrix multiplicative weights
Theorem (Arora and Kale [5]):
The Matrix Multiplicative Weights Update algorithm generates density matrices P^{(1)}, P^{(2)}, ..., P^{(T)} such that Σ_{t=1}^T M^{(t)} • P^{(t)} ≤ (1 + ε) λ_n(Σ_{t=1}^T M^{(t)}) + (ln n)/ε.
We have implemented the matrix multiplicative weights algorithm using the matrix exponential function.
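A minimal Matlab sketch of Algorithm 5 using expm; the driver and variable names are ours, and M is assumed to be a cell array of event matrices with 0 ⪯ M^{(t)} ⪯ I:

% Sketch of Algorithm 5 with the matrix exponential (driver ours).
epsilon = 0.1;  epsp = -log(1 - epsilon);
n = size(M{1}, 1);
S = zeros(n);                       % running sum of observed events
totalLoss = 0;
for t = 1:numel(M)
    W = expm(-epsp * S);            % W^(t) = (1-eps)^(sum of past M)
    P = W / trace(W);               % density matrix: PSD with trace 1
    totalLoss = totalLoss + sum(sum(M{t} .* P));   % loss M^(t) . P^(t)
    S = S + M{t};                   % observe the event
end
minLoss = min(eig(S));              % hindsight optimum lambda_n(sum_t M^(t))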
Problem:
You have to provide a density matrix P (a positive semidefinite matrix with trace 1, as appears in quantum computation) at every unit of time. After you provide the matrix, your strategy is evaluated against the current set of observations. At every unit of time, the environment gives you a positive semidefinite matrix M with 0 ⪯ M ⪯ I, and you incur a cost P • M. Your aim is to minimize the total cost when the game is repeated T times; T is hopefully large, say at least one hundred.
5.3 Our experimental results
1. As T becomes large, the matrix-exponential strategy gives a density that is quite reasonable.
• For 100 × 100 matrices and T = 100, it pays a 60% premium over the [clairvoyant] optimum, which drops to a 10% premium over the [clairvoyant] optimum for T = 1000.
• A random density matrix, on the other hand, costs at least a factor of roughly four.
2. The result is interesting because the largest-eigenvector strategy, which can incur the maximum payoff (worst-case algorithm analysis), grows at least linearly in the matrix dimension and can cost a factor of, say, 100 more. So the worst strategy can be very expensive, while on average we are doing very well.
This can be seen as a data-streaming/online problem, since at every instant we store only O(n²) data (independent of T).
Figure 12 plots the total actual payoff with respect to the size of the PSD matrices for three fixed error accuracies ε, with T = 500 rounds. For matrices A_{n×n}, n = 100, 200, ..., 1000, the three fixed error accuracies ε = 0.05, 0.1, 0.2, and after T = 500 rounds, the total actual payoff remains between 1.12 and 1.23.
Figure 12: Plot of total actual payoff with respect to the size of the PSD matrices for three different fixed ε error accuracies, with T = 500 rounds.
Figure 13 plots the random payoff with respect to the size of the PSD matrices for three fixed error accuracies ε, with T = 500 rounds. For matrices A_{n×n}, n = 100, 200, ..., 1000, ε = 0.05, 0.1, 0.2, and after T = 500 rounds, the random payoff remains between 2.8 and 8.
Figure 13: Plot of random payoff with respect to the size of the PSD matrices for three different fixed ε error accuracies, with T = 500 rounds.
Figure 14 plots the random payoff divided by the total actual payoff with respect to the size of the PSD matrices for three fixed error accuracies ε, with T = 500 rounds. For matrices A_{n×n}, n = 100, 200, ..., 1000, ε = 0.05, 0.1, 0.2, and after T = 500 rounds, the ratio remains between 2.2 and 6.6; that is, the total actual payoff is 2.2 to 6.6 times smaller than the random payoff.
Figure 14: Plot of random payoff divided by total actual payoff with respect to the size of the PSD matrices for three different fixed ε error accuracies, with T = 500 rounds.
Figure 15 plots the worst payoff with respect to the size of the PSD matrices for three fixed error accuracies ε, with T = 500 rounds. For matrices A_{n×n}, n = 100, 200, ..., 1000, ε = 0.05, 0.1, 0.2, and after T = 500 rounds, the worst payoff remains between 300 and 3300.
Figure 15: Plot of worst payoff with respect to the size of the PSD matrices for three different fixed ε error accuracies, with T = 500 rounds.
Figure 16 plots the upper bound with respect to the size of the PSD matrices for three fixed error accuracies ε, with T = 500 rounds. For matrices A_{n×n}, n = 100, 200, ..., 1000, ε = 0.05, 0.1, 0.2, and after T = 500 rounds, the upper bound remains between 10 and 900.
Figure 16: Plot of upper bound with respect to the size of the PSD matrices for three different fixed ε error accuracies, with T = 500 rounds.
Figure 17 plots the total actual payoff with respect to the error accuracy ε for three fixed sizes of PSD matrices, with T = 500 rounds. For error accuracies ε = 0.01, 0.02, ..., 0.1, the three fixed matrices A_{n×n}, n = 100, 300, 500, and after T = 500 rounds, the total actual payoff remains between 1.1 and 1.8.
Figure 17: Plot of total actual payoff with respect to ε error accuracy for three different fixed sizes of PSD matrices, with T = 500 rounds.
Figure 18 plots the random payoff with respect to the error accuracy ε for three fixed sizes of PSD matrices, with T = 500 rounds. For error accuracies ε = 0.01, 0.02, ..., 0.1, the three fixed matrices A_{n×n}, n = 100, 300, 500, and after T = 500 rounds, the random payoff remains between 1.7 and 6.3.
Figure 18: Plot of random payoff with respect to ε error accuracy for three different fixed sizes of PSD matrices, with T = 500 rounds.
Figure 19 plots the random payoff divided by the total actual payoff with respect to the error accuracy ε for three fixed sizes of PSD matrices, with T = 500 rounds. For error accuracies ε = 0.01, 0.02, ..., 0.1, the three fixed matrices A_{n×n}, n = 100, 300, 500, and after T = 500 rounds, the ratio remains between 1.5 and 5.2; that is, the total actual payoff is 1.5 to 5.2 times smaller than the random payoff.
Figure 19: Plot of random payoff divided by total actual payoff with respect to ε error accuracy for three different fixed sizes of PSD matrices, with T = 500 rounds.
Figure 20 plots the worst payoff with respect to the error accuracy ε for three fixed sizes of PSD matrices, with T = 500 rounds. For error accuracies ε = 0.01, 0.02, ..., 0.1, the three fixed matrices A_{n×n}, n = 100, 300, 500, and after T = 500 rounds, the worst payoff remains between 300 and 1700.
Figure 20: Plot of worst payoff with respect to ε error accuracy for three different fixed sizes of PSD matrices, with T = 500 rounds.
Figure 21 plots the upper bound with respect to the error accuracy ε for three fixed sizes of PSD matrices, with T = 500 rounds. For error accuracies ε = 0.01, 0.02, ..., 0.1, the three fixed matrices A_{n×n}, n = 100, 300, 500, and after T = 500 rounds, the upper bound remains between 32 and 2100.
Figure 21: Plot of upper bound with respect to ε error accuracy for three different fixed sizes of PSD matrices, with T = 500 rounds.
⋆ ⋆ ⋆
6 Implementation of a primal-dual approach for approximately solving SDPs
6.1 Introduction
This section implements a general primal-dual algorithm to compute a near-optimal solution to any SDP, and not just the SDPs used in approximation algorithms.
A general SDP with n² variables, an n × n matrix variable X and m constraints, and its dual, can be written as follows:
Max C • X
subject to A_j • X ≤ b_j, for j = 1, 2, ..., m,
X ⪰ 0 (Primal)
and
Min b·y
subject to Σ_{j=1}^m A_j y_j ⪰ C,
y ≥ 0 (Dual)
where y = {y_1, y_2, ..., y_m} are the dual variables and b = {b_1, b_2, ..., b_m}. As in the case of LPs, strong duality holds for SDPs under very mild conditions, always satisfied by the SDPs considered here, and the optima of the two programs coincide. A linear program is the special case in which all the matrices involved are diagonal.
For notational ease we assume that A_1 = I and b_1 = R. This serves to bound the trace of the solution, Trace(X) ≤ R, and is thus a simple scaling constraint. It is very naturally present in SDP relaxations of combinatorial optimization problems. Our implementation of this algorithm uses binary search to reduce optimization to feasibility. Let α be the algorithm's current guess for the optimum value of the SDP. The implementation tries either to construct a PSD matrix that is primal feasible and has value ≥ α, or a dual feasible solution whose value is at most (1 + δ)α for some arbitrarily small δ > 0.
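The binary search itself is routine; here is a sketch assuming a hypothetical feasibility routine (name ours) that returns true when a primal solution of value ≥ α is found:

% Binary search on alpha (hypothetical feasibility call; names ours).
% lo and hi are assumed to bracket the optimum value of the SDP.
while hi - lo > delta * max(lo, 1)
    alpha = (lo + hi) / 2;
    if primal_dual_sdp(alpha)    % primal feasible with value >= alpha
        lo = alpha;
    else                         % dual certificate: value <= (1 + delta)*alpha
        hi = alpha;
    end
end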
6.2 Algorithm
As is usual in primal-dual algorithms, the algorithm starts with a trivial candidate for a primal solution, in this case the trivial PSD matrix of trace R (possibly infeasible), X^{(1)} = (R/n) I. Then it iteratively generates candidate primal solutions X^{(1)}, X^{(2)}, .... At every step it tries to improve X^{(t)} to obtain X^{(t+1)}, and in this it has help from an auxiliary algorithm, called the ORACLE, that tries to certify the validity of the current X^{(t)} as follows: the ORACLE searches for a vector y from the polytope D_α = {y : y ≥ 0, b·y ≤ α} such that
Σ_{j=1}^m (A_j • X^{(t)}) y_j − (C • X^{(t)}) ≥ 0. (⋆)
If the ORACLE succeeds in finding such a y, then we claim X^{(t)} is either primal infeasible or has value C • X^{(t)} ≤ α. The reason is that otherwise
Σ_{j=1}^m (A_j • X^{(t)}) y_j − (C • X^{(t)}) ≤ Σ_{j=1}^m b_j y_j − (C • X^{(t)}) < α − α = 0,
which would contradict (⋆). Thus y implicitly contains some useful information for improving the candidate primal X^{(t)}, and we use y to update X^{(t)} via a matrix exponential update rule. Our observation about matrix exponentials ensures that the new matrix X^{(t+1)} is also PSD. If the ORACLE declares that there is no vector y ∈ D_α which satisfies (⋆), then it can easily be checked using linear programming duality that X^{(t)} must be a primal feasible solution of objective value at least α.
Implementing the ORACLE for the Primal-Dual SDP: calculating y via the ORACLE search.
We need
y ≥ 0, b_1 y_1 + b_2 y_2 + ... + b_m y_m ≤ α  (1)
Σ_{j=1}^m (A_j • X^{(t)}) y_j − (C • X^{(t)}) ≥ 0.  (2)
From (1),
y_1 = (α − Σ_{j=2}^m b_j y_j − ε) / b_1.  (3)
Substituting (3) in (2):
(A_1 • X^{(t)}) y_1 + Σ_{j=2}^m (A_j • X^{(t)}) y_j ≥ C • X^{(t)}
(A_1 • X^{(t)}) (α − Σ_{j=2}^m b_j y_j − ε)/b_1 + Σ_{j=2}^m (A_j • X^{(t)}) y_j ≥ C • X^{(t)}
(A_1 • X^{(t)}) α/b_1 − (A_1 • X^{(t)}) Σ_{j=2}^m b_j y_j / b_1 − ε (A_1 • X^{(t)})/b_1 + Σ_{j=2}^m (A_j • X^{(t)}) y_j ≥ C • X^{(t)}
Σ_{j=2}^m (A_j • X^{(t)}) y_j − (A_1 • X^{(t)}) Σ_{j=2}^m b_j y_j / b_1 ≥ C • X^{(t)} + ε (A_1 • X^{(t)})/b_1 − (A_1 • X^{(t)}) α/b_1
Σ_{j=2}^m (A_j • X^{(t)} − (A_1 • X^{(t)}) b_j / b_1) y_j ≥ C • X^{(t)} + ε (A_1 • X^{(t)})/b_1 − (A_1 • X^{(t)}) α/b_1.
Say
d_j = A_j • X^{(t)} − (A_1 • X^{(t)}) b_j / b_1,
c⋆ = ε (A_1 • X^{(t)})/b_1 − (A_1 • X^{(t)}) α/b_1.
Therefore
Σ_{j=2}^m d_j y_j ≥ C • X^{(t)} + c⋆ = C_New.  (4)
For all d_j < 0, set
y_j = 0, 2 ≤ j ≤ m.  (5)
Let K of the d_j's, 2 ≤ j ≤ m, be ≥ 0, say d_{l1}, d_{l2}, ..., d_{lK}, and let S = d_{l1} + d_{l2} + ... + d_{lK}. Set
y_{l1} = y_{l2} = ... = y_{lK} = C_New/S + ε.  (6)
The others were set to zero in (5), and y_1 is calculated from (3). Hence (3), (5) and (6) give the y returned by the ORACLE.
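A direct Matlab transcription of (3), (5) and (6) follows; the variable names are ours, with AjX(j) holding A_j • X^{(t)} and CX holding C • X^{(t)}:

% ORACLE search via (3), (5), (6); AjX(j) = Aj . X^(t), CX = C . X^(t).
d = AjX(2:m) - AjX(1) * b(2:m) / b(1);    % d_j for j = 2..m
cstar = epsilon * AjX(1) / b(1) - AjX(1) * alpha / b(1);
Cnew = CX + cstar;
y = zeros(m, 1);
idx = find(d >= 0) + 1;                   % indices l1,...,lK with d_j >= 0
S = sum(d(d >= 0));
y(idx) = Cnew / S + epsilon;              % (6); the rest stay zero by (5)
y(1) = (alpha - b(2:m)' * y(2:m) - epsilon) / b(1);   % (3)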
The important point here is that the desired y need not be dual feasible; in fact the ORACLE can ignore the PSD constraint, and its task consists of solving an LP with just one non-trivial constraint (the others are just non-negativity constraints). Thus one may hope to implement the ORACLE efficiently and, furthermore, even find y with nice properties, so that the algorithm makes fast progress towards feasibility. The Primal-Dual SDP algorithm we have implemented depends on the width parameter of the ORACLE. This is the smallest ρ ≥ 0 such that for every primal candidate X, the vector y ∈ D_α returned by the ORACLE satisfies
‖Σ_j A_j y_j − C‖ ≤ ρ.
Primal-Dual Algorithm for SDP
On input (A_j, b_j, C, for j = 1, 2, ..., m, where the A_j's are square n × n matrices)
Result: feasible primal PSD matrix X and dual y ≥ 0
Assumption: A_1 = I and b_1 = R, where Trace(X) ≤ R; binary search: α is the algorithm's current guess for the optimum value of the SDP.
Initialize: Set X^{(1)} = (R/n) I. Let ε = δα/(2ρR), ε′ = −ln(1 − ε), and T = 8ρ²R² ln(n)/(δ²α²).
for t = 1, 2, ..., T do
    Run the ORACLE with candidate solution X^{(t)}.
    if the ORACLE fails then
        stop and output X^{(t)}.
    else
        let y^{(t)} be the vector generated by the ORACLE.
    end
    Let M^{(t)} = (Σ_{j=1}^m A_j y_j^{(t)} − C + ρI)/(2ρ).
    Compute W^{(t+1)} = (1 − ε)^{Σ_{τ=1}^t M^{(τ)}} = exp(−ε′ Σ_{τ=1}^t M^{(τ)}).
    Set X^{(t+1)} = R W^{(t+1)} / Trace(W^{(t+1)}) and continue.
end
Algorithm 6: Primal-dual approach for approximately solving SDPs
If all the matrices in question were diagonal, the SDP would reduce to an LP, and the algorithm would reduce precisely to the standard Multiplicative Weights algorithm for LPs.
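Putting it together, here is a sketch of the main loop of Algorithm 6; the oracle call is hypothetical, returning the empty vector when it fails:

% Sketch of Algorithm 6's main loop; oracle(X, alpha) is a hypothetical
% routine returning y in D_alpha satisfying (star), or [] on failure.
epsilon = delta * alpha / (2 * rho * R);
epsp = -log(1 - epsilon);
T = ceil(8 * rho^2 * R^2 * log(n) / (delta^2 * alpha^2));
X = (R / n) * eye(n);
Msum = zeros(n);
for t = 1:T
    y = oracle(X, alpha);
    if isempty(y)
        break                        % X is primal feasible with value >= alpha
    end
    My = zeros(n);
    for j = 1:m
        My = My + y(j) * Aj{j};      % sum_j A_j y_j
    end
    Mt = (My - C + rho * eye(n)) / (2 * rho);   % scale losses into [0, I]
    Msum = Msum + Mt;
    W = expm(-epsp * Msum);          % matrix exponential update
    X = R * W / trace(W);            % next trace-R candidate
end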
Theorem (Arora and Kale [5]):
Let e_1 = {1, 0, ..., 0}. In the Primal-Dual SDP algorithm, assume that the ORACLE never fails for T = 8ρ²R² ln(n)/(δ²α²) iterations. Let y = (δα/R) e_1 + (1/T) Σ_{t=1}^T y^{(t)}. Then y is a feasible dual solution with objective value at most (1 + δ)α.
6.3 Analysis and results
6.4 Correctness of the primal-dual approach for approximately solving SDPs: "AK" multiplicative weights SDP solver
6.5 Solving optimization problems using the primal-dual approach for approximately solving SDPs: "AK" multiplicative weights SDP solver
⋆ ⋆ ⋆
7 References
1. M.D. Grigoriadis and L.G. Khachiyan. A sublinear-time randomized approximation algorithm for matrix games. Operations Research Letters, 18(2):53-58, 1995.
2. Christos Koufogiannakis and Neal E. Young. Beating simplex for fractional packing and covering linear programs. In the 48th IEEE Symposium on Foundations of Computer Science (FOCS 2007); corrected version, FOCS 2008.
3. Michael Luby and Noam Nisan. A parallel approximation algorithm for positive linear programming. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, STOC '93, pages 448-457, New York, NY, USA, 1993. ACM.
4. S. Arora, E. Hazan and S. Kale. Fast algorithms for approximate semidefinite programming using the multiplicative weights update method. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, pages 339-348, 2005.
5. S. Arora and S. Kale. A combinatorial, primal-dual approach to semidefinite programs. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, pages 227-236, 2007.
6. Rahul Jain and Penghui Yao. A parallel approximation algorithm for positive semidefinite programming. In FOCS, pages 463-471, 2011.
7. Richard Peng and Kanat Tangwongsan. Faster and simpler width-independent parallel algorithms for positive semidefinite programming. CoRR abs/1201.5135, 2012.
8. F. Alizadeh. Interior point methods in semidefinite programming with applications to combinatorial optimization. SIAM J. Optim., 5(1):13-51, 1995.
9. Y. Nesterov and A. Nemirovskii. Interior Point Polynomial Methods in Convex Programming: Theory and Applications. SIAM, Philadelphia, 1994.
10. Paul Christiano, Jonathan A. Kelner, Aleksander Madry, Daniel Spielman and Shang-Hua Teng. Electrical flows, Laplacian systems, and faster approximation of maximum flow in undirected graphs. CoRR abs/1010.2921, 2010.