Efficient Parallel Software for Large-Scale Semidefinite Programs
Makoto Yamashita @ Tokyo-Tech, Katsuki Fujisawa @ Chuo University
MSC 2010 @ Yokohama [2010/09/08]
Outline
1. Semidefinite Programming (SDP)
2. Conversion of the stability condition for differential inclusions to an SDP
3. Primal-Dual Interior-Point Methods and their parallel implementation
4. Numerical Results
Many Applications of SDP
Control Theory: Stability Condition for Differential Inclusions, Discrete-Time Optimal Control Problems
Via SDP relaxation: Polynomial Optimization Problems, Sensor Network Problems, Quadratic Assignment Problems
Quantum Chemistry / Quantum Information
Large SDPs ⇒ Parallel Solver
From the stability condition for differential inclusions (SCDI) to a standard SDP
Does the solution remain in a bounded region?
Yes, if a suitable linear matrix inequality is feasible [Boyd et al.]
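The condition referenced here can be stated as follows; this is the standard formulation from Boyd et al. for a linear differential inclusion, with the usual symbols assumed (the slide's exact notation may differ):

```latex
% Linear differential inclusion:
\dot{x}(t) = A(t)\,x(t), \qquad A(t) \in \mathrm{Conv}\{A_1, \dots, A_L\}.
% Every trajectory remains bounded if there exists a common matrix P with
P \succ O, \qquad A_i^{\mathsf{T}} P + P A_i \preceq O \quad (i = 1, \dots, L),
% since V(x) = x^{\mathsf{T}} P x is then nonincreasing along every trajectory.
```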
SDP from SCDI
A feasible solution ⇒ boundedness of the solution
The condition is translated into a standard-form SDP by, e.g., YALMIP [J. Löfberg].
Discrete-Time Optimal Control Problems
This problem [Coleman et al.] can be formulated as an SDP via SparsePOP [Kim et al.].
Primal-Dual Interior-Point Methods
Solve both the primal and the dual simultaneously in polynomial time
Many software packages have been developed: SDPA [Yamashita et al.], SDPT3 [Toh et al.], SeDuMi [Sturm], CSDP [Borchers]
Algorithmic Framework of Primal-Dual Interior-Point Methods
[Figure: the feasible region, with the central path leading from an initial point via target points to the optimal solution; each iteration computes a search direction and a step length chosen to keep the interior property]
Most of the computation time is consumed by computing the search direction.
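The step-length rule that keeps the interior property can be sketched as below. This is a generic eigenvalue-based rule under the assumption that the iterate `X` is positive definite; the function name `max_step` and the damping factor `gamma` are illustrative, not SDPA's actual API.

```python
import numpy as np

def max_step(X, dX, gamma=0.9):
    """Largest (damped) alpha such that X + alpha*dX stays positive definite."""
    L = np.linalg.cholesky(X)                # X = L L^T, X assumed PD
    Linv = np.linalg.inv(L)
    M = Linv @ dX @ Linv.T                   # X + a*dX > 0  iff  I + a*M > 0
    lmin = np.linalg.eigvalsh(M).min()
    if lmin >= 0:
        return 1.0                           # a full step keeps the interior
    return min(1.0, gamma * (-1.0 / lmin))   # damp to stay strictly inside
```

The damping factor below 1 keeps the next iterate strictly inside the cone, which is exactly the "interior property" the slide refers to.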
Bottlenecks in PDIPM and SDPARA
To obtain the search direction, we solve two bottleneck steps: 1. ELEMENTS (forming the Schur complement matrix) 2. CHOLESKY (factorizing it)
In SDPARA, parallel computation is applied to these two bottlenecks.
Problem   ELEMENTS   CHOLESKY   Total
SCDI         22228       1593   23986
DTOC           668       1992    2713
(seconds on a Xeon X5460, 3.16 GHz)
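A minimal sketch of the two bottleneck steps, assuming an HKM-type element formula B[k,l] = Tr(Z^{-1} A_k X A_l); the exact formula and scaling used inside SDPA may differ. The data matrices here are small random symmetric matrices, and the iterates are identities for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 4                                   # matrix size, number of constraints
A = [rng.standard_normal((n, n)) for _ in range(m)]
A = [(Ak + Ak.T) / 2 for Ak in A]             # symmetric data matrices A_1..A_m
X = np.eye(n)                                 # current primal iterate (PD)
Zinv = np.eye(n)                              # inverse of the dual iterate

# ELEMENTS: form the m x m Schur complement matrix B row by row.
B = np.empty((m, m))
for k in range(m):
    T = Zinv @ A[k] @ X                       # reused across row k (row-wise scheme)
    for l in range(m):
        B[k, l] = np.trace(T @ A[l])

# CHOLESKY: factorize B to solve the linear system for the search direction.
Lfac = np.linalg.cholesky(B)
```

Note that the intermediate product `T` is shared by every element of row k, which is the reuse that makes a row-wise computation scheme attractive.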
Nonzero pattern of the Schur complement matrix (B)
SCDI: fully dense Schur complement matrix. DTOC: sparse Schur complement matrix.
Exploitation of Sparsity in SDPA
We choose among the computational formulas F1, F2, F3 row by row, depending on the sparsity of the data matrices
We keep this scheme in the parallel computation
Row-wise distribution for the dense Schur complement matrix
When 4 CPUs are available, each CPU computes only its assigned rows
No communication between CPUs; efficient memory management
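The row-wise distribution can be sketched as a simple cyclic assignment of rows to CPUs; the actual SDPARA mapping may differ, but the key point is the same: every entry of a row belongs to exactly one CPU, so ELEMENTS needs no inter-CPU communication.

```python
m, ncpu = 10, 4                     # rows of the Schur complement, CPUs

# Cyclic row-wise distribution: CPU c owns rows {c, c+ncpu, c+2*ncpu, ...}.
rows_of = {c: [i for i in range(m) if i % ncpu == c] for c in range(ncpu)}

# Each CPU would then compute only its own rows of B, independently:
#   for i in rows_of[my_rank]: compute row i of B
```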
Formula-Cost-Based distribution for the sparse Schur complement matrix
[Figure: per-row formula costs 147, 137, 124, 98, 53, 48, 43, 29, 24, 22, 21, 17 distributed over 4 CPUs]
Load on each CPU: CPU1: 195, CPU2: 187, CPU3: 189, CPU4: 192 (average: 190.75)
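A formula-cost-based distribution can be sketched with the standard greedy longest-processing-time heuristic over the per-row cost estimates shown above; SDPARA's actual assignment rule may differ, so the resulting loads here are illustrative.

```python
import heapq

# Per-row formula-cost estimates (the example values shown above).
costs = [147, 137, 124, 98, 53, 48, 43, 29, 24, 22, 21, 17]
ncpu = 4

# Greedy LPT: give the next-largest row to the currently least-loaded CPU.
heap = [(0, c) for c in range(ncpu)]          # (current load, cpu id)
heapq.heapify(heap)
assignment = {c: [] for c in range(ncpu)}
for cost in sorted(costs, reverse=True):
    load, c = heapq.heappop(heap)
    assignment[c].append(cost)
    heapq.heappush(heap, (load + cost, c))

loads = [sum(rows) for rows in assignment.values()]
```

The heuristic balances the total cost (763) to within a few percent of the average, mirroring the near-equal per-CPU loads on the slide.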
Parallel Computation for CHOLESKY
We employ ScaLAPACK [Blackford et al.] for the dense case and MUMPS [Amestoy et al.] for the sparse case
Different data storage schemes enhance the parallel Cholesky factorization
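Why storage and ordering matter for sparse Cholesky can be illustrated with a classic toy example (this assumes nothing about MUMPS or ScaLAPACK internals): an "arrow" matrix factored with its dense row first fills in completely, while the reversed ordering produces no fill-in at all.

```python
import numpy as np

n = 6
A = 4.0 * np.eye(n)
A[0, 1:] = 1.0
A[1:, 0] = 1.0                        # dense first row/column: "arrow" matrix

bad = np.linalg.cholesky(A)           # dense row first -> trailing block fills in

perm = np.arange(n)[::-1]             # reorder so the dense row comes last
good = np.linalg.cholesky(A[np.ix_(perm, perm)])   # no fill-in in this order

def nnz(L, tol=1e-12):
    """Count entries of the factor larger than a small tolerance."""
    return int((np.abs(L) > tol).sum())
```

With the good ordering the factor keeps only the original nonzero pattern (diagonal plus one dense row), while the bad ordering fills the entire lower triangle.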
Computation time on SDP [SCDI1]
#processors      1      2      4      8     16
TOTAL        24410  12211   6186   3165   1625
ELEMENTS     21748  10992   5516   2755   1387
CHOLESKY      2372    988    524    305    167
(seconds; Xeon X5460, 3.16 GHz, 48 GB memory/node)
Speedup on 16 processors: Total 15.02x, ELEMENTS 15.67x, CHOLESKY 14.20x
ELEMENTS attains high scalability
Computation time on SDP [DTOC1]
#processors      1      2      4      8     16
TOTAL         2746   1601   1121    898    566
ELEMENTS       486    267    125     64     36
CHOLESKY      2206   1297    965    807    508
(seconds; Xeon X5460, 3.16 GHz, 48 GB memory/node)
Speedup on 16 processors: Total 4.85x, ELEMENTS 13.50x, CHOLESKY 4.34x
Parallelizing the sparse Cholesky factorization is difficult, while ELEMENTS still scales well.
Comparison with PCSDP [Ivanov et al]
1. SDPARA is faster than PCSDP
2. SDPARA scales better
3. Only SDPARA can solve DTOC
(times in seconds; O.M. = out of memory)
Concluding Remarks & Future works
1. SDP has many applications, including control theory
2. SDPARA solves large-scale SDPs effectively by parallel computation
3. Appropriate parallel computation is the key to the SDPARA implementation
Future work: improving the multi-threading for the sparse Schur complement matrix