Upload
shada
View
36
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Due Giorni di Algebra Lineare Numerica Bologna, 6-7 marzo, 2007. On the role of Linear Algebra in the development of Interior Point algorithms and software. D. di Serafino, Second University of Naples, Italy [email protected] joint work with S. Cafieri, École Polytechnique, Paris - PowerPoint PPT Presentation
Citation preview
On the role of Linear Algebra in the development of Interior Point algorithms
and software
Due Giorni di Algebra Lineare NumericaBologna, 6-7 marzo, 2007
D. di Serafino, Second University of Naples, [email protected]
joint work with
S. Cafieri, École Polytechnique, ParisM. D’Apuzzo, V. De Simone, Second University of Naples
G. Toraldo, University of Naples “Federico II”
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
2
Outline
Motivations
KKT system and IP methods
Iterative solution of KKT system Constraint Preconditioner Adaptive termination control
Numerical experiments
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
3
Motivations
ubiquitous in optimization algorithms critical issue in an effective implementation of
optimization methods
H symmetric, D diagonal semidefinite positive, J full rank
dc
yx
DJJH T
n m
n
m
symmetric indefinte
Focus on IP methods
KKT system
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
4
Development of an IP-based software package for solving
large-scale (convex) Quadratic Programming problems
PRQP - Potential Reduction software for Quadratic Programming problems
x, y, , t primal and dual variables, s,, z slack variables
, , , ,spsd 21EI21 nmmmJJQ nmnmnn
Motivations
0),,,( , s.t.
21),,,(max
tzyxctzJyJQx
tudybQxxtyxpTe
Ti
TTTT
0),,(
, s.t.21)(min
vsxux
d,xJbsxJ
xcQxxxq
e
i
TT
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
5
KKT system and IP methods
type of problem different blocks formulation of the method in the KKT system
large-scale problems sparse direct or iterative solvers local convexity of the problem KKT matrix inertia control
accuracy requirements inexact solution of the KKT system adaptive stopping criteriaincreasing ill conditioning as the iterates approach the solution preconditioning techniques
OPTIMIZATION LINEAR ALGEBRA
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
6
Infeasible primal-dual PR framework
reduce the infeasibility at least at the same rate
as the duality gap
0, 0 corresponding to the initial point w0
n
iii
n
i
m
jjjii
TTT tsyzxtsyzxtzsyx11 1
)ln()ln()ln()ln(),,,,,( min
barrier functions
potential function (Tanabe,1988; Todd & Ye, 1990)
duality gap
00
0),,,,,(~
σσ
tszyxw
ctzJyJHxr
dxJruxrbsxJr
rrrrσ
Te
Tid
Eppip
dppp
, ,
),,,(
321
321 2
(Kojima, Mizuno & Todd, 1995)
primal infeasibilitydual infeasibility
= 0 feasible version
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
7
PR basic steps
1. Given the current interior iterate w=(x,y,s,z,,v,t), apply a Newton step to the perturbed KKT conditions
2. Update w
with suitably chosen
)/()( gwwG
www
parameteron perturbati /conditions KKT theofJacobian )(
wG
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
8
Newton system reduction: KKT system
2
1
2
1
bb
xx
DJJH T
equality + bound constraints:
inequality + bound constraints:
ineq. + eq. + bound constraints:
TVZXEEQH 11 , E accounts for bound constraints
0
e
Te
JJH
DJJH
i
Ti
Di pd 000
e
ii
Te
Ti
JDJ
JJH
D pd
bound constraints only (J = I): )(
211
2
21
11
bxTVxTbVbxH
diagonal
condensed system
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
9
Inherent ill conditioning
approaching the solution, some entries of D and E may become very
large, producing an increasing ill conditioning in the
matrix
DJJEQ T
it is crucial to use preconditioning strategies
0001SY
D
TVZXE 11
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
10
KKT system solvers: direct vs iterative
Direct solvers Widely used in well-established IP software
(e.g. Mosek, PCx, Loqo, OOQP, KNITRO-Interior/Direct, IPOPT) Ill conditioning not a severe problem
(S. Wright, 1997; M. Wright, 1998) Computational cost may become prohibitive for
large-scale problems
Iterative solvers Increasing attention by IP community in the last 15
years (implemented in KNITRO-Interior/CG, HOPDM, PRQP) Require suitable preconditioning techniques Adaptive termination criteria
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
11
Indefinite preconditioner with the same structure as the KKT matrix
M “simple” approximation to H such that is spd on ker(Je). Common choice: M diagonal.
CP variants investigated by many researchers(Axelsson, 1979; Golub & Wathen, 1998; Luksan & Vlcek, 1998; Keller et al., 2000; Rozloznik & Simoncini, 2002; Durazzi & Ruggiero, 2003; Gondzio et al., 2004; Dollar et al., 2006-2007; di Serafino et al., 2007; Forsgren et al., 2007; …)
DJJM
PT
Constraint Preconditioner (CP)
DJJH
KT
iiTi
JDJM 1
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
12
P-1K has at least m unit eigenvalues If rank(D)=p, then P-1K has additional m-p unit eigenvalues The remaining eigenvalues are real positive (bounds are
available)(Keller at al., 2000; Durazzi & Ruggiero, 2003; Bergamaschi et al., 2004; Dollar, 2007)
When the iterate approaches the solution, if q entries of D get close to zero, then additional q eigenvalues tend to be clustered around 1
We expect that the preconditioner increases its effectiveness as the IP method progresses
CP: spectral properties
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
13
CP: use of Conjugate Gradient (CG) method
(Gould, Hribar, Nocedal, 2001; Rozloznik, Simoncini, 2002; Cafieri, D’Apuzzo, De Simone, di Serafino, 2007; Dollar, 2007)
No breakdown Convergence in at most n-m+p iterations,
p=rank(D)
Starting guess such that
0
22
21021
01
bb
xx
JDJ
e
ii
CP+CG applied to KKT system behaves as CG applied to a reduced system with matrix using as preconditioner (Z basis of ker(Je))
),,( 022
021
01
xxx
Preconditioned Projected CG
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
14
Building CP Explicit LBLT factorization of P (B block diagonal with
1x1 or 2x2 blocks) LBLT factorization of the Schur complement of -M
Implicit Schilders factorization of P (Dollar et al., 2006)
T
TT
LJMI
DM
LJMI
DJJM
P0
1
001 00
00
TTM
LBLJJMDS000
1
May still account for a large part of the computational cost!“Requiring a factorization of P may still be considered a disadvantage,
and methods which avoid this are urgently needed ” (Gould, Orban, Toint, Acta Numerica, 2005)
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
15
Reducing the cost of building CP
Always set the (2,2)-block to 0(Dollar, Gould, Schilders, Wathen, 2007)
Approximate J by dropping away entries below a prescribed tolerance and outside a fixed band (case D=0 )(Bergamaschi, Gondzio, Venturin, Zilli, 2006)
Approximate the Schur complement via incomplete Cholesky factorization (case D=0)(Benzi, Simoncini, 2006)
Reuse CP for some IP iterations (Cafieri, D’Apuzzo, De Simone, di Serafino, 2007b)
DJJM
PT
TJJM 1
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
16
Reusing CP
The idea is not new (condensed system: Carpenter & Shanno, 1993; Karmakar & Ramakrishnan, 1991)
Reusing CP use an approximate CP
CG cannot be used, apply SQMR
DJJM
PT
~~
~
000~~~ 1SYD
computed at a previous outer step
)~~~~(~ 11 TVZXQdiagM
~,~,~,~,~,~ ZYTVSX
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
17
Reusing CP: spectral properties (work in progress)
pjDDrankmpDrank )~( ,)(
has an eigenvalue at 1 with multiplicity at least 2m-p-j at most 2j eigenvalues with nonzero imaginary part
The eigenvalues λ satify
KP 1~
pmpD
D i
}}
000
HIJR T )(
12/12/12/1 2 ,~ ,~~ RDJHIJRCMJJMHMH TT
,~~~ 1 RRJMJDS TT
CHCHmaxmaxminmin
,min,min
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
18
Reusing CP
CP is reused until its effectiveness deteriorates the number of inner iterations must not increase too much the number of outer steps for which CP is reused must be bounded
by taking into account the increasing ill conditioning and the accuracy requirements
k = k+1 factorize Pk
apply to the KKT system SQMR with Pk
j = k; l = 0while (iit(k) ≤ α ∙ iit(j) and l ≤ β ∙ lmax) do
apply to the KKT system SQMR with P j
k = k+1; l = l+1endwhilelmax = l
while (PR stopping criterion not satisfied) do…
…endwhile
iit(k) = # inner iter. at outer step k
= 2, = 1 (by numerical experiments)
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
19
Adaptive stopping criteria, according to the quality of the IP iterate (to reduce the number of inner iterations)Basic idea: at early outer iterations low accuracy in solving the KKT system as the iterates approach the optimal solution the accuracy requirement grows up
Regard the IP method as an inexact Newton method:,)( )()()()( kkkk whrr 0 , 10
step IP )(
kk
k
residual of the KKT system
Termination control
perturbation of the KKT cond. of the original
pb.
KKT cond. of the original pb.
chosen suitably 1/
)(
)()()(
k
kkkr
1 ,)( )()()()()( kkkkk whrr
condition for convergence
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
20
Termination control
polynomial convergence
solution KKT approx. theof residual ,43
rr
00 // Cafieri, D’Apuzzo, De Simone, di Serafino, Toraldo, 2007c
such that exists )(length step a
|)ln|)2(( , : )1,0( mnKKkK k
0 , ,)()( wwwww
mnmn 22solution opt. ,|||| 1,0 0, ),,,0,,,,( **0 wweeeeeew
)1(
Inexact PR convergence
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
21
Termination control: theory + comput. study
CG/SQMR stopping criterion:
Theory
Computational study Further reduce the number
of inner iterations Avoid slowdown in
decreasing the infeasibility
TOLr
43
TOL
strategies safeguardwith
otherwise,
if, ,min
ρτ
TOLεδTOLρ
τTOL
PRPR
|);)(|1/( xqδ 10 ;10 ,1
(Cafieri, D’Apuzzo, De Simone, di Serafino, 2007d)
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
22
Software architecture of PRQP
linear algebra kernels
PRQP core
CG SQMR MA27 (LBLT)
BLAS
SIF, AMPL interfaces
Pre/post-processing utilities
CP ICFS
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
23
Testing details
PROBLEM n m1 m2
AUG3DQP* 27543 0 8000
CVXQP1 10000 0 5000
CVXQP2 10000 0 2500
CVXQP3 10000 0 7500
GOULDQP3 19999 0 9999
LISWET5* 10002 1000 0
QPBAND 50000 25000 0
CVXQP1-M 10000 5000 0
CVXQP2-M 10000 2500 0
CVXQP3-M 10000 7500 0
Selected CUTEr problems
sparsity of Hessian and constr. matrices ≥ 99% * = diagonal Hessian
PR stopping criteria:
δ = relative duality gap, σ = relative infeasibility# PR iterations ≤ 80
Computational environment 2.53 GHz Pentium IV, 1.256 GB RAM, 128 KB L1 cache Linux Ubuntu 4.1.2, g77 3.4.6
and gcc 4.1.3 compilers
10 ,10 87 σδ
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
24
PRQP: selected results
Direct CG + CP
PROBLEM PR IT FACT TIME
SOLVE TIME
TOTALTIME
PR/GC IT
CP FACT.TIME
SOLVETIME
TOTALTIME
AUG3DQP* 19 3.04e+
1 2.72e-1 3.11e+1 19/19 3.20e+1 7.28e-1 3.35e+1
CVXQP1 20 1.16e+4 5.27e+0 1.16e+4 20/351 1.21e+1 3.84e+0 1.63e+1
CVXQP2 19 3.03e+3 2.15e+0 3.03e+3 20/393 1.48e-1 1.10e+0 1.45e+0
CVXQP3 16 2.63e+4 8.01e+0 2.63e+4 16/228 6.47e+1 5.76e+0 7.10e+1
GOULDQP3 21 1.32e+
0 8.35e-2 1.82e+0 21/90 5.47e-1 6.70e-1 1.79e+0
LISWET5* 25 6.16e-1 3.82e-2 8.77e-1 25/25 6.27e-1 2.49e-1 1.13e+0
QPBAND 14 1.37e+0 1.16e-1 2.32e+0 14/901 7.56e-1 1.74e+1 1.91e+1
CVXQP1-M 26 2.79e+
4 1.29e+1 2.79e+4 25/423 1.23e+1 4.30e+0 1.69e+1
CVXQP2-M 24 4.32e+
3 3.19e+0 4.32e+3 24/415 1.82e-1 1.19e+0 1.62e+0
CVXQP3-M 30 4.73e+
4 1.40e+1 4.72e+4 30/606 1.06e+2 1.36e+1 1.20e+2
* = diagonal Hessian
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
25
CG + CP SQMR + reused CP
PROBLEM PR/CGIT
CP FACT.TIME
SOLVETIME
TOTALTIME
PR/QMRIT
CP FACT.TIME
SOLVETIME
TOTALTIME
AUG3DQP* 19/19 3.20e+1 7.28e-1 3.35e+
1 19/300 1.57e+1
7.02e+0
2.31e+1
CVXQP1 20/351 1.21e+1
3.84e+0
1.63e+1 20/738 3.64e+
08.72e+
01.26e+
1
CVXQP2 20/393 1.48e-1 1.10e+0
1.45e+0 20/614 5.96e-2 2.15e+
02.40e+
0
CVXQP3 16/228 6.47e+1
5.76e+0
7.10e+1 16/607 1.88e+
11.53e+
13.42e+
1
LISWET5* 25/25 6.27e-1 2.49e-1 1.13e+0 25/200 2.99e-1 1.28e+
01.81e+
0
QPBAND 14/901 7.56e-1 1.74e+1
1.91e+1 14/1531 3.73e-1 3.81e+
13.92e+
1
CVXQP1-M 25/423 1.23e+1
4.30e+0
1.69e+1 25/650 5.92e+
0 6.81e+
01.30e+
1
CVXQP2-M 24/415 1.82e-1 1.19e+0
1.62e+0 24/624 9.44e-2 2.20e+
02.53e+
0
CVXQP3-M 30/606 1.06e+2
1.36e+1
1.20e+2 30/858 4.50e+
11.96e+
16.50e+
1
PRQP: selected results
* = diagonal Hessian
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
26
Future work
Devising and experimenting new strategies for CP approximation
Applying and analysing CPs in the context of IP methods for nonconvex (quadratic) programming
Developing and experimenting inertia-controlling iterative solvers
Thanks for your attention!