Efficient Parallel Software for Large-Scale Semidefinite Programs. Makoto Yamashita @ Tokyo-Tech, Katsuki Fujisawa @ Chuo University. MSC 2010 @ Yokohama [2010/09/08]


1

Efficient Parallel Software for Large-Scale Semidefinite Programs

Makoto Yamashita @ Tokyo-TechKatsuki Fujisawa @ Chuo University

MSC 2010 @ Yokohama [2010/09/08]

2

Outline

1. Semidefinite Programming
2. Conversion of the stability condition for differential inclusions to an SDP
3. Primal-Dual Interior-Point Methods and their parallel implementation
4. Numerical Results

3

Many Applications of SDP

Control Theory
- Stability Condition for Differential Inclusions
- Discrete-Time Optimal Control Problem

Via SDP relaxation
- Polynomial Optimization Problem
- Sensor Network Problem
- Quadratic Assignment Problem

Quantum Chemistry/Information

Large SDP ⇒ Parallel Solver

4

Standard form of SDP
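The formula on this slide did not survive the transcript. As a sketch in the notation commonly used by SDPA-family solvers (not copied from the slide), the standard primal-dual pair is:

```latex
\begin{aligned}
\text{(P)}\quad & \min_{X} \; C \bullet X
  && \text{s.t. } A_p \bullet X = b_p \;\; (p = 1, \dots, m), \; X \succeq O \\
\text{(D)}\quad & \max_{y,\, S} \; \sum_{p=1}^{m} b_p y_p
  && \text{s.t. } \sum_{p=1}^{m} y_p A_p + S = C, \; S \succeq O
\end{aligned}
```

Here the data are symmetric matrices C, A_1, …, A_m and a vector b; A • B denotes the trace inner product, and X ⪰ O means X is positive semidefinite.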

5

Stability condition for differential inclusions to standard SDP

Does the solution remain in a bounded region?

Yes, if a common Lyapunov-type matrix inequality holds [Boyd et al.].
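The condition of [Boyd et al.] can be sketched as follows. For a linear differential inclusion with vertex matrices A_1, …, A_L (these symbols are illustrative; the slide's own equation was lost), the trajectories remain bounded if a common quadratic Lyapunov function exists:

```latex
\exists\, P \succ O \;:\quad A_i^{T} P + P A_i \preceq O \quad (i = 1, \dots, L)
```

Then V(x) = x^T P x is nonincreasing along every trajectory, so x(t) stays inside the ellipsoid {x : x^T P x ≤ x(0)^T P x(0)}.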

6

For this inequality to hold, it suffices to bound the condition number of P, which turns the problem into an SDP.

Conversion to SDP

7

SDP from SCDI

A feasible solution ⇒ boundedness of the trajectories.

Some translation to the standard form is required, by e.g. YALMIP [J. Löfberg].
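The feasibility check behind this slide can be illustrated numerically. The following is a minimal stdlib-only sketch (the matrices A1, A2 and the candidate P are invented examples, not from the talk): it verifies A_i^T P + P A_i ⪯ 0 for each vertex matrix, which certifies boundedness.

```python
# Illustrative check of the Lyapunov-type condition for a 2x2 example.
# A1, A2 (vertex matrices) and P (candidate) are hypothetical examples.

def matmul(X, Y):
    """2x2 matrix product for matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def lyap_lhs(A, P):
    """Compute M = A^T P + P A."""
    AtP, PA = matmul(transpose(A), P), matmul(P, A)
    return [[AtP[i][j] + PA[i][j] for j in range(2)] for i in range(2)]

def is_neg_semidef_2x2(M, tol=1e-9):
    """A symmetric 2x2 matrix is negative semidefinite iff
    trace(M) <= 0 and det(M) >= 0."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    return a + c <= tol and a * c - b * b >= -tol

A1 = [[-1.0, 0.5], [-0.5, -1.0]]
A2 = [[-2.0, 1.0], [-1.0, -1.0]]
P = [[1.0, 0.0], [0.0, 1.0]]  # P = I happens to work for this example

bounded = all(is_neg_semidef_2x2(lyap_lhs(A, P)) for A in (A1, A2))
print(bounded)  # True: x^T P x is nonincreasing, so trajectories stay bounded
```

In practice one does not guess P: the SDP solver searches for P itself, and tools such as YALMIP perform the translation to standard form automatically.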

8

Discrete-Time Optimal Control Problems

This problem [Coleman et al.] can be formulated as an SDP via SparsePOP [Kim et al.].

9

Primal-Dual Interior-Point Methods

Solve both the Primal and the Dual simultaneously, in polynomial time.

Many software packages have been developed: SDPA [Yamashita et al.], SDPT3 [Toh et al.], SeDuMi [Sturm], CSDP [Borchers].

10

Algorithmic Framework of Primal-Dual Interior-Point Methods

[Figure: within the Feasible Region, iterates move from an Initial Point along the Central Path toward the Optimal Solution; each iteration computes a Target Point, a Search Direction, and a Step Length that keeps the interior property.]

Most of the computation time is consumed by computing the Search Direction.

11

Bottlenecks in PDIPMand SDPARA

To obtain the search direction, we solve a linear system via two bottleneck phases:
1. ELEMENTS (forming the Schur complement matrix)
2. CHOLESKY (its Cholesky factorization)

In SDPARA, parallel computation is applied to these two bottlenecks.

Computation time (seconds; Xeon 5460, 3.16GHz):

Problem   ELEMENTS   CHOLESKY   Total
SCDI         22228       1593   23986
DTOC           668       1992    2713
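Both bottlenecks arise from the Schur complement system that determines the search direction. As a sketch (the exact element formula depends on the chosen search direction; the form below corresponds to a KSH/HKM-type direction):

```latex
B \,\mathrm{d}y = r, \qquad
B_{pq} = A_p \bullet \left( X A_q S^{-1} \right) \quad (p, q = 1, \dots, m)
```

ELEMENTS evaluates the m² entries B_{pq}; CHOLESKY factorizes B to solve for dy.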

12

Nonzero pattern ofSchur complement matrix (B)

SCDI: fully dense Schur complement matrix. DTOC: sparse Schur complement matrix.

13

Exploitation of Sparsityin SDPA

We choose the computation formula row by row (three formulae F-1, F-2, F-3, selected according to the sparsity of the data matrices).

We keep this row-wise scheme in the parallel computation.

14

Row-wise distribution for dense Schur complement matrix

With 4 CPUs available, each CPU computes only its assigned rows.

No communication between CPUs; efficient memory management.
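The row-wise distribution above can be sketched in a few lines (NUM_CPUS and N are illustrative values, not from the talk):

```python
# Sketch of a cyclic row-wise distribution for a dense Schur complement matrix.
NUM_CPUS = 4
N = 10  # dimension of the Schur complement matrix B (example value)

def assigned_rows(cpu, n=N, p=NUM_CPUS):
    """Rows computed by one CPU under a cyclic row-wise distribution."""
    return [i for i in range(n) if i % p == cpu]

# Each CPU evaluates only its own rows of B; a row depends only on the data
# matrices and the current iterate (X, S), so no inter-CPU communication is
# needed, and each CPU stores just its rows (efficient memory management).
parts = [assigned_rows(c) for c in range(NUM_CPUS)]
print(parts[0])  # rows of CPU 0: [0, 4, 8]
```

A cyclic assignment is one simple choice; the next slide refines it by weighting rows with their actual formula costs.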

15

Formula-Cost-Based distribution for the sparse Schur complement matrix

Row costs (example): 147, 48, 29, 21, 137, 43, 124, 22, 98, 17, 53, 24

Load on each CPU: CPU1: 195, CPU2: 187, CPU3: 189, CPU4: 192 (average: 190.75)
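A cost-based assignment like the one on this slide can be sketched with a greedy rule: give the most expensive unassigned row to the currently least-loaded CPU. This is an illustrative sketch only, not necessarily SDPARA's exact assignment algorithm, so the resulting loads differ slightly from the slide's.

```python
import heapq

# Row costs from the slide's example.
ROW_COSTS = [147, 48, 29, 21, 137, 43, 124, 22, 98, 17, 53, 24]

def balance(costs, num_cpus=4):
    """Greedy load balancing: largest-cost rows first, each to the
    currently least-loaded CPU (tracked with a min-heap)."""
    heap = [(0, cpu) for cpu in range(num_cpus)]  # (load, cpu)
    heapq.heapify(heap)
    assignment = {cpu: [] for cpu in range(num_cpus)}
    for row in sorted(range(len(costs)), key=lambda r: -costs[r]):
        load, cpu = heapq.heappop(heap)
        assignment[cpu].append(row)
        heapq.heappush(heap, (load + costs[row], cpu))
    return assignment

assignment = balance(ROW_COSTS)
loads = [sum(ROW_COSTS[r] for r in rows) for rows in assignment.values()]
print(sum(loads))  # 763 in total, i.e. an average of 190.75 per CPU
```

The point, as on the slide, is that the per-CPU loads end up close to the average even though individual row costs vary by an order of magnitude.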

16

Parallel Computation for CHOLESKY

We employ ScaLAPACK [Blackford et al.] for the dense case and MUMPS [Amestoy et al.] for the sparse case.

A data storage layout matched to each solver enhances the parallel Cholesky factorization.

17

Problems for Numerical Results

16 nodes; Xeon X5460 (3.16GHz), 48GB memory per node

18

Computation time on SDP [SCDI1]

(seconds; Xeon X5460, 3.16GHz, 48GB memory/node)

#processors      1      2     4     8    16
TOTAL        24410  12211  6186  3165  1625
ELEMENTS     21748  10992  5516  2755  1387
CHOLESKY      2372    988   524   305   167

Speedup on 16 processors: TOTAL 15.02 times, ELEMENTS 15.67 times, CHOLESKY 14.20 times

ELEMENTS attains high scalability

19

Computation time on SDP [DTOC1]

(seconds; Xeon X5460, 3.16GHz, 48GB memory/node)

#processors     1     2     4    8   16
TOTAL        2746  1601  1121  898  566
ELEMENTS      486   267   125   64   36
CHOLESKY     2206  1297   965  807  508

Speedup on 16 processors: TOTAL 4.85 times, ELEMENTS 13.50 times, CHOLESKY 4.34 times

Parallel sparse Cholesky factorization is difficult to scale; ELEMENTS still attains high speedup.

20

Comparison with PCSDP [Ivanov et al]

1. SDPARA is faster than PCSDP
2. The scalability of SDPARA is higher
3. Only SDPARA can solve the DTOC problems

(Times in seconds; O.M.: out of memory)

21

Concluding Remarks & Future works

1. SDP has many applications, including control theory

2. SDPARA solves large-scale SDPs effectively by parallel computation

3. Appropriate parallel computation is the key to the SDPARA implementation

Future work: improvement of multi-threading for the sparse Schur complement matrix