Symmetric-pattern multifrontal factorization

Page 1: Symmetric-pattern multifrontal factorization

Symmetric-pattern multifrontal factorization

[Figure: the 9 × 9 matrix A, its graph G(A), and its elimination tree T(A)]

Page 2: Symmetric-pattern multifrontal factorization

Symmetric-pattern multifrontal factorization

For each node of T from leaves to root:
• Sum own row/col of A with children’s Update matrices into Frontal matrix
• Eliminate current variable from Frontal matrix, to get Update matrix
• Pass Update matrix to parent

[Figure: elimination tree T(A) and graph G(A)]
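The leaves-to-root loop above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `multifrontal_lu` and the `pending` list are hypothetical, the matrix is stored densely, no pivoting is done (every pivot is assumed nonzero), the nonzero pattern is assumed symmetric, and single nodes are used rather than supernodes.

```python
import numpy as np

def multifrontal_lu(A):
    """Sketch of multifrontal LU (no pivoting, symmetric nonzero pattern assumed)."""
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    pending = [[] for _ in range(n)]   # update matrices waiting at each tree node
    for j in range(n):                 # parent index > child index, so this is leaves to root
        # Frontal index set: j, its later neighbours in A, and the children's indices.
        idx = {j} | {i for i in range(j + 1, n) if A[i, j] != 0 or A[j, i] != 0}
        for rows, _ in pending[j]:
            idx |= set(rows)
        idx = sorted(idx)
        pos = {g: k for k, g in enumerate(idx)}
        F = np.zeros((len(idx), len(idx)))
        F[0, :] = A[j, idx]            # own row of A
        F[:, 0] = A[idx, j]            # own column of A
        for rows, Uc in pending[j]:    # sum in the children's update matrices
            p = [pos[g] for g in rows]
            F[np.ix_(p, p)] += Uc
        U[j, j] = F[0, 0]
        if len(idx) > 1:
            rest = idx[1:]
            l = F[1:, 0] / F[0, 0]     # eliminate variable j from the frontal matrix
            U[j, rest] = F[0, 1:]
            L[rest, j] = l
            update = F[1:, 1:] - np.outer(l, F[0, 1:])
            pending[rest[0]].append((rest, update))   # pass update matrix to the parent
    return L, U

# Small example with a symmetric pattern and a dominant diagonal.
A = np.array([[4, 0, 1, 0, 0],
              [0, 4, 1, 0, 1],
              [2, 1, 5, 0, 0],
              [0, 0, 0, 3, 1],
              [0, 2, 0, 1, 6]], dtype=float)
L, U = multifrontal_lu(A)
```

In this sketch the parent of node j is simply the smallest index left in its update matrix, which is why processing the nodes in index order visits children before parents.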

Page 3: Symmetric-pattern multifrontal factorization

Symmetric-pattern multifrontal factorization

F1 = A1 => U1
• Frontal matrix F1: rows and columns 1, 3, 7
• Update matrix U1: rows and columns 3, 7

[Figure: elimination tree T(A) and graph G(A)]

Page 4: Symmetric-pattern multifrontal factorization

Symmetric-pattern multifrontal factorization

F1 = A1 => U1
• Frontal matrix F1: rows and columns 1, 3, 7
• Update matrix U1: rows and columns 3, 7

F2 = A2 => U2
• Frontal matrix F2: rows and columns 2, 3, 9
• Update matrix U2: rows and columns 3, 9

[Figure: elimination tree T(A) and graph G(A)]

Page 5: Symmetric-pattern multifrontal factorization

Symmetric-pattern multifrontal factorization

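F1 = A1 => U1
• Frontal matrix F1: rows and columns 1, 3, 7
• Update matrix U1: rows and columns 3, 7

F2 = A2 => U2
• Frontal matrix F2: rows and columns 2, 3, 9
• Update matrix U2: rows and columns 3, 9

F3 = A3+U1+U2 => U3
• Frontal matrix F3: rows and columns 3, 7, 8, 9
• Update matrix U3: rows and columns 7, 8, 9

[Figure: elimination tree T(A) and graph G(A)]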

Page 6: Symmetric-pattern multifrontal factorization

Symmetric-pattern multifrontal factorization

[Figure: elimination tree T(A), the filled matrix L+U, and the filled graph G+(A)]

Page 7: Symmetric-pattern multifrontal factorization

Symmetric-pattern multifrontal factorization

• Really uses supernodes, not nodes
• All arithmetic happens on dense square matrices
• Needs extra memory for a stack of pending update matrices
• Potential parallelism:
  1. between independent tree branches
  2. parallel dense ops on frontal matrix

[Figure: elimination tree T(A) and graph G(A)]

Page 8: Symmetric-pattern multifrontal factorization

MUMPS: distributed-memory multifrontal [Amestoy, Duff, L’Excellent, Koster, Tuma]

• Symmetric-pattern multifrontal factorization
• Parallelism both from tree and by sharing dense ops
• Dynamic scheduling of dense op sharing
• Symmetric preordering
• For nonsymmetric matrices:
  • optional weighted matching for heavy diagonal
  • expand nonzero pattern to be symmetric
  • numerical pivoting only within supernodes if possible (doesn’t change pattern)
  • failed pivots are passed up the tree in the update matrix

Page 9: Symmetric-pattern multifrontal factorization

SuperLU-dist: GE with static pivoting [Li, Demmel]

• Target: Distributed-memory multiprocessors

• Goal: No pivoting during numeric factorization

Page 10: Symmetric-pattern multifrontal factorization

SuperLU-dist: Distributed static data structure

Process(or) mesh (2 × 3):
0 1 2
3 4 5

[Figure: block cyclic matrix layout; the blocks of L and U are labeled with their owning process in the repeating pattern 0 1 2 / 3 4 5]
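A block cyclic layout maps block (i, j) of the matrix to a process by wrapping the block indices around the mesh. The sketch below assumes the 2 × 3 mesh shown above with row-major process numbering; the function name `owner` and its parameters are hypothetical and are not taken from the SuperLU-dist interface.

```python
def owner(block_row, block_col, mesh_rows=2, mesh_cols=3):
    """Rank of the process that owns block (block_row, block_col).

    Assumes ranks are numbered row by row on a mesh_rows x mesh_cols mesh.
    """
    return (block_row % mesh_rows) * mesh_cols + (block_col % mesh_cols)

# Owners of the first 4 x 6 blocks of the matrix.
layout = [[owner(i, j) for j in range(6)] for i in range(4)]
for row in layout:
    print(*row)
```

Because the mapping depends only on the block indices, the distributed data structure can be fixed before numeric factorization starts.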

Page 11: Symmetric-pattern multifrontal factorization

GESP: Gaussian elimination with static pivoting

• $PA = LU$
• Sparse, nonsymmetric A
• P is chosen numerically in advance, not by partial pivoting!
• After choosing P, can permute PA symmetrically for sparsity: $Q(PA)Q^T = LU$

Page 12: Symmetric-pattern multifrontal factorization

SuperLU-dist: GE with static pivoting [Li, Demmel]

• Target: Distributed-memory multiprocessors
• Goal: No pivoting during numeric factorization

1. Permute A unsymmetrically to have large elements on the diagonal (using weighted bipartite matching)
2. Scale rows and columns to equilibrate
3. Permute A symmetrically for sparsity
4. Factor A = LU with no pivoting, fixing up small pivots: if $|a_{ii}| < \varepsilon \cdot \|A\|$ then replace $a_{ii}$ by $\varepsilon^{1/2} \cdot \|A\|$
5. Solve for x using the triangular factors: Ly = b, Ux = y
6. Improve solution by iterative refinement
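Step 4 can be illustrated with a dense toy factorization. This is a sketch under assumptions, not the SuperLU-dist code: the function name `lu_static_pivoting` is hypothetical, the matrix is dense, the 1-norm is used for ||A|| (the slide does not say which norm), and ε is taken to be double-precision machine epsilon.

```python
import numpy as np

def lu_static_pivoting(A, eps=np.finfo(float).eps):
    """Dense LU with no row exchanges; tiny pivots are replaced instead of swapped."""
    W = np.array(A, dtype=float)
    n = W.shape[0]
    norm_A = np.linalg.norm(W, 1)          # assumption: 1-norm of A
    n_fixed = 0
    for k in range(n):
        if abs(W[k, k]) < eps * norm_A:    # pivot too small: perturb it
            W[k, k] = np.sqrt(eps) * norm_A
            n_fixed += 1
        W[k + 1:, k] /= W[k, k]
        W[k + 1:, k + 1:] -= np.outer(W[k + 1:, k], W[k, k + 1:])
    L = np.tril(W, -1) + np.eye(n)
    U = np.triu(W)
    return L, U, n_fixed

# A zero pivot in position (1, 1) is replaced rather than pivoted away.
A = np.array([[0.0, 1.0],
              [1.0, 1.0]])
L, U, n_fixed = lu_static_pivoting(A)
E = L @ U - A                              # perturbation introduced by the fix-up
```

The factors then belong to a slightly perturbed matrix, which is why the solve in step 5 is followed by iterative refinement in step 6.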

Page 14: Symmetric-pattern multifrontal factorization

Row permutation for heavy diagonal [Duff, Koster]

• Represent A as a weighted, undirected bipartite graph (one node for each row and one node for each column)
• Find matching (set of independent edges) with maximum product of weights
• Permute rows to place matching on diagonal
• Matching algorithm also gives a row and column scaling to make all diagonal elements = 1 and all off-diagonal elements <= 1 (in absolute value)

[Figure: a 5 × 5 matrix A with its bipartite graph, and the row-permuted matrix PA with the matching on the diagonal]
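For a tiny matrix the maximum-product matching can be found by brute force, which makes the objective concrete. This is not the Duff and Koster algorithm, which solves a weighted bipartite matching problem efficiently; the function name `heavy_diagonal_rows` is hypothetical, and the sketch tries all n! row permutations and does not compute the scaling.

```python
from itertools import permutations
from math import prod

def heavy_diagonal_rows(A):
    """Return perm where row perm[j] of A becomes row j of PA.

    Brute force: maximises the product of |diagonal entries| of PA.
    """
    n = len(A)
    best = max(permutations(range(n)),
               key=lambda p: prod(abs(A[p[j]][j]) for j in range(n)))
    return list(best)

A = [[0, 2, 0],
     [3, 0, 1],
     [0, 1, 4]]
perm = heavy_diagonal_rows(A)
PA = [A[i] for i in perm]          # rows reordered so the matching sits on the diagonal
diagonal = [PA[i][i] for i in range(len(PA))]
print(perm, diagonal)
```

Maximising the product is equivalent to minimising the sum of -log|a_ij| over the matched entries, which is what allows an assignment-problem formulation.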

Page 19: Symmetric-pattern multifrontal factorization

Iterative refinement to improve solution

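Iterate:
• $r = b - A x$
• $\mathrm{backerr} = \max_i \left( |r_i| / (|A| \cdot |x| + |b|)_i \right)$
• if $\mathrm{backerr} < \varepsilon$ or $\mathrm{backerr} > \mathrm{lasterr}/2$ then stop iterating
• solve $L U \, dx = r$
• $x = x + dx$
• $\mathrm{lasterr} = \mathrm{backerr}$
• repeat

Usually 0 – 3 steps are enough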
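The refinement loop translates almost line for line into NumPy. In this sketch the name `refine` and the `solve` callback are hypothetical: `solve(v)` stands for applying the computed factors, that is $(LU)^{-1} v$. It also assumes that `lasterr` starts at infinity, that no component of $|A| \cdot |x| + |b|$ is zero, and it adds a `max_steps` guard that is not on the slide.

```python
import numpy as np

def refine(A, b, solve, eps=np.finfo(float).eps, max_steps=10):
    """Iterative refinement driven by the componentwise backward error."""
    x = solve(b)
    lasterr = np.inf                     # assumption: no previous error on the first pass
    steps = 0
    for _ in range(max_steps):
        r = b - A @ x
        backerr = np.max(np.abs(r) / (np.abs(A) @ np.abs(x) + np.abs(b)))
        if backerr < eps or backerr > lasterr / 2:
            break                        # converged, or no longer halving the error
        x = x + solve(r)                 # solve L*U*dx = r, then x = x + dx
        lasterr = backerr
        steps += 1
    return x, backerr, steps

# An inexact solver stands in for factors computed with perturbed pivots.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
A_perturbed = A + 1e-3 * np.ones((3, 3))
x, backerr, steps = refine(A, b, lambda v: np.linalg.solve(A_perturbed, v))
```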

Page 20: Symmetric-pattern multifrontal factorization

Convergence analysis of iterative refinement

Let $C = I - A(LU)^{-1}$ [ so $A = (I - C) \cdot (LU)$ ]

$x_1 = (LU)^{-1} b$
$r_1 = b - A x_1 = (I - A(LU)^{-1}) b = C b$
$dx_1 = (LU)^{-1} r_1 = (LU)^{-1} C b$
$x_2 = x_1 + dx_1 = (LU)^{-1} (I + C) b$
$r_2 = b - A x_2 = (I - (I - C) \cdot (I + C)) b = C^2 b$
. . .
In general, $r_k = b - A x_k = C^k b$

Thus $r_k \to 0$ if |largest eigenvalue of C| < 1.
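The identity for the residuals can be checked numerically on a small example. In this sketch the matrix `M` is a stand-in for $(LU)^{-1}$, built as the inverse of a slightly perturbed A; the matrices and variable names are illustrative assumptions only.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
delta = 0.1 * np.diag([1.0, -1.0, 1.0])
M = np.linalg.inv(A + delta)           # plays the role of (LU)^{-1}
C = np.eye(3) - A @ M
b = np.array([1.0, 2.0, 3.0])

x = M @ b                              # x_1
residuals = []
for k in range(1, 5):
    r = b - A @ x                      # r_k
    residuals.append(r)
    x = x + M @ r                      # x_{k+1} = x_k + dx_k

predicted = [np.linalg.matrix_power(C, k) @ b for k in range(1, 5)]
spectral_radius = max(abs(np.linalg.eigvals(C)))
```

Here the spectral radius of C is well below 1, so each refinement step shrinks the residual by roughly that factor.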

Page 22: Symmetric-pattern multifrontal factorization

Directed graph

• A is square, unsymmetric, nonzero diagonal
• Edges from rows to columns
• Symmetric permutations $PAP^T$

[Figure: a 7 × 7 matrix A and its directed graph G(A)]

Page 23: Symmetric-pattern multifrontal factorization

Undirected graph, ignoring edge directions

• Overestimates the nonzero structure of A
• Sparse GESP can use symmetric permutations (min degree, nested dissection) of this graph

[Figure: the matrix $A + A^T$ and its undirected graph $G(A + A^T)$]

Page 24: Symmetric-pattern multifrontal factorization

Symbolic factorization of undirected graph

• Overestimates the nonzero structure of L+U

[Figure: chol($A + A^T$) and its filled graph $G^+(A + A^T)$]

Page 25: Symmetric-pattern multifrontal factorization

Symbolic factorization of directed graph

• Add fill edge a -> b if there is a path from a to b through lower-numbered vertices.
• Sparser than $G^+(A + A^T)$ in general.
• But what’s a good ordering for $G^+(A)$?

[Figure: the matrix A, its filled graph $G^+(A)$, and the filled matrix L+U]
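The fill rule can be sketched as vertex-by-vertex elimination: removing vertex k connects each of its higher-numbered predecessors to each of its higher-numbered successors. The function name `fill_graph` and the 4-vertex example are hypothetical. An edge (a, b) stands for a nonzero in row a, column b, and diagonal entries are left out.

```python
def fill_graph(edges, n):
    """Return the filled edge set of a directed graph on vertices 1..n."""
    filled = set(edges)
    for k in range(1, n + 1):
        preds = [a for (a, b) in filled if b == k and a > k]
        succs = [b for (a, b) in filled if a == k and b > k]
        for a in preds:
            for b in succs:
                if a != b:
                    filled.add((a, b))     # path a -> k -> b through a lower vertex
    return filled

edges = {(2, 1), (1, 3), (3, 4), (4, 2)}
directed = fill_graph(edges, 4)

# Same pattern with edge directions ignored, i.e. the pattern of A + A^T.
symmetric_edges = edges | {(b, a) for (a, b) in edges}
symmetrized = fill_graph(symmetric_edges, 4)

print(sorted(directed - edges))
print(len(directed), len(symmetrized))
```

On this example the directed model predicts 6 off-diagonal nonzeros in L+U, while the symmetrized model predicts 10.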

Page 26: Symmetric-pattern multifrontal factorization

Question: Preordering for GESP

• Use directed graph model, less well understood than symmetric factorization
• Symmetric: bottom-up, top-down, hybrids
• Nonsymmetric: mostly bottom-up
• Symmetric: finding the best ordering is NP-complete, but approximation theory is based on graph partitioning (separators)
• Nonsymmetric: no approximation theory is known; partitioning is not the whole story
• Good approximations and efficient algorithms both remain to be discovered