48
Sparse factorization using low rank submatrices Cleve Ashcraft LSTC [email protected] 2010 MUMPS User Group Meeting April 15-16, 2010 Toulouse, FRANCE ftp.lstc.com:outgoing/cleve/MUMPS10 Ashcraft.pdf 1

Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Sparse factorizationusing low rank submatrices

Cleve AshcraftLSTC

[email protected]

2010 MUMPS User Group MeetingApril 15-16, 2010

Toulouse, FRANCE

ftp.lstc.com:outgoing/cleve/MUMPS10 Ashcraft.pdf

1

Page 2: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

LSTCLivermore Software Technology Corporation

Livermore, California 94550

• Founded by John Hallquist of LLNL, 1980’s

• Public domain versions of DYNA and NIKE codes

• LS-DYNA : implicit/explicit,nonlinear finite element analysis code

• Multiphysics capabilities

– Fluid/structure interaction

– Thermal analysis

– Acoustics

– Electromagnetics

2

Page 3: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Multifrontal Algorithm

• very large, sparse LDLT and LDU factorizations

• tree structure organizes factor storage,solve and factor operations

• medium-to-large sparse linear systemslocated at each leaf node of the tree

• medium-sized dense linear systemslocated at each interior node of the tree

• dense matrix-matrix operationsat each interior node

• sparse matrix-matrix adds between nodes

3

Page 4: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Multifrontal tree

0

1

2

3

4 5

6

7

8

9

10

11

12

13

1415

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47 48

49

50

51

52

53

54

55

56

57

58 59

60

61

62

63

64

65

66

67

68

69

70

71

4

Page 5: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Multifrontal tree – polar representation

0

1

2

3

45

6

7

8

9

10

11

1213

14

15

1617

18 19 20 21

22

23

24

25

26 27

28

2930

31 32

33

34

35

36

37

38

39

40

41

42

43

44454647

48

49

5051

52

53

54

55

56

57

58

59

60

6162

6364

65

66

67

68 69

70

71

5

Page 6: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Multifrontal Algorithm

• 40M equations, present frontier at LSTC

• serial, SMP, MPP, hybrid, GPU

• large problems, > 1M - 10M dof,require out-of-core storage of factor entries,even on distributed memory systems

• IO cost largely hidden during the factorization

• IO cost dominant during the solves

• eigensolver =⇒ several right hand sides

• many applications, e.g., Newton’s method,have a single right hand side

6

Page 7: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Low Rank Approximations

• hierarchical matrices,Hackbusch, Bebendorf, Leborne, others

• semi-separable matrices, Gu, others

• submatrices are numerically rank deficient

• method of choice for Boundary Elements (BEM)

• now applied to Finite Elements (FEM)

• A ≈ XY TAm

n

= m X

r

Y Tn

r

• storage = r(m + n) vs mn

• reduction in ops =r

min(m,n)7

Page 8: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Multifrontal Algorithm + Low Rank Approximations

• At each leaf node in the multifrontal tree —use standard multifrontal

• At each interior node in the multifrontal tree —low rank matrix-matrix multiplies and sums

• Between nodes —low rank matrix sums

• Dramatic reduction in storage

• Dramatic reduction in operations

• Excellent approximation propertiesfor finite element operators

• Our experience is with potential equations,elasticity with solids and shells

8

Page 9: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Outline

• graph, tree, matrix perspectives

• experiments – 2-D potential equation

• low rank computations

• blocking strategies

• summary

9

Page 10: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

One subgraph

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

One subtree

ΩI ΩJ

@@

@@

@@@I

S

6

∂S

One submatrix

AΩI ,ΩIAΩI ,S

AΩI ,∂S

AΩJ ,ΩJAΩJ ,S AΩJ ,∂S

AS,ΩIAS,ΩJ

AS,S AS,∂S

A∂S,ΩIA∂S,ΩJ

A∂S,S A∂S,∂S

10

Page 11: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Portion of original matrix

0 200 400 600 800 1000 1200

0

200

400

600

800

1000

1200

nz = 9052

A ordered by domains, separator and external boundary

11

Page 12: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Portion of factor matrix

0 200 400 600 800 1000 1200

0

200

400

600

800

1000

1200

nz = 48899

L ordered by domains, separator and external boundary

12

Page 13: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Schur complement matrix

[AS,S AS,∂S

A∂S,S A∂S,∂S

]=

0 20 40 60 80 100 120 140 160

0

20

40

60

80

100

120

140

160

nz = 20659

Schur complement, separator and external boundary

13

Page 14: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Outline

• graph, tree, matrix perspectives

• experiments – 2-D potential equation

• low rank computations

• blocking strategies

• summary

14

Page 15: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Computational experiments

• Compute LS,S, L∂S,S and A∂S,∂S

• Find domain decompositions of S and ∂S

• Form block matrices, e.g., LS,S =∑

K≥J

LK,J

• Find singular value decompositions of each LK,J

• Collect all singular values

σ =∑

K≥J

min(|K|,|J |)∑

i=1

σ(K,J)i

• Split matrix LS,S = MS,S + NS,S using singularvalues σ.

• We want ‖NS,S‖F ≤ 10−14 ‖LS,S‖F

15

Page 16: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

255 × 255 diagonal block factor LS,S

43% dense, relative accuracy 10−14

LS,S =

0 50 100 150 200 250

0

50

100

150

200

250

16

Page 17: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

754 × 255 lower block factor L∂S,S

16% dense, relative accuracy 10−14

L∂S,S =

0 100 200

0

100

200

300

400

500

600

700

17

Page 18: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

754 × 754 update matrix A∂S,∂S

21% dense, relative accuracy 10−14

A∂S,∂S =

0 100 200 300 400 500 600 700

0

100

200

300

400

500

600

700

18

Page 19: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

255 × 255 factor matrix LS,S

storage vs accuracy

0 0.1 0.2 0.3 0.4 0.5 0.6−16

−14

−12

−10

−8

−6

−4

−2

0

fraction of dense storage

log 10

(1 −

||M

|| F /

||A|| F

)

approximating 255 x 255 separator factor matrix L3,3

2 x 2 blocks3 x 3 blocks4 x 4 blocks5 x 5 blocks6 x 6 blocks7 x 7 blocks8 x 8 blocks

19

Page 20: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

754 × 255 factor matrix L∂S,S

storage vs accuracy

0 0.05 0.1 0.15 0.2−16

−14

−12

−10

−8

−6

−4

−2

0

fraction of dense storage

log 10

(1 −

||M

|| F /

||A|| F

)

approximating 754 x 255 interface−separator factor matrix L4,3

3 x 1 blocks6 x 2 blocks9 x 3 blocks12 x 4 blocks

20

Page 21: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

754 × 754 update matrix A∂S,∂S

storage vs accuracy

0 0.1 0.2 0.3 0.4 0.5−16

−14

−12

−10

−8

−6

−4

−2

0

fraction of dense storage

log 10

(1 −

||M

|| F /

||A|| F

)

approximating 754 x 754 schur complement matrix Ahat44

2 x 2 blocks3 x 3 blocks4 x 4 blocks5 x 5 blocks6 x 6 blocks7 x 7 blocks8 x 8 blocks

21

Page 22: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Outline

• graph, tree, matrix perspectives

• experiments – 2-D potential equation

• low rank computations

• blocking strategies

• summary

22

Page 23: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

How to compute low rank submatrices ?

• SVD – singular value decomposition

A = UΣUT

where U and V are orthogonal and Σ is diagonal

• Gold standard, expensive, O(n3) ops

• QR factorization

AP = QR

where Q is orthogonal and R is upper triangular,and P is a permutation matrix

• Silver standard, moderate cost, O(rn2) ops

23

Page 24: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

row norms of R vs singular values of A

754 × 754 update matrix A∂S,∂S

94 × 94 submatrix sizenear, mid and far interactions

0 2 4 6 8 10 12 14 16 18 20−20

−18

−16

−14

−12

−10

−8

−6

−4

−2

0row norms of R versus singular values of 754 x 754 A using 8 segments

near ||R

i,:||

2

near σi

mid ||Ri,:

||2

mid σi

far ||Ri,:

||2

far σi

24

Page 25: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

SVD vs QR with column pivoting — Conclusions :

• Column pivoting QR factorization does well.

• Row norms of R track singular values σ

• The numerical rank of R is greater than needed,but not that much greater

• For more accuracy/less storagetwo sided orthogonal factorizations

– AP = ULV , U and V orthogonal, L triangular

– PAQ = UBV ,U and V orthogonal, B bidiagonal

track the singular values very closely.

25

Page 26: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Type of approximation of submatrices

• submatrix LI,J of L∂S,S, ‖LI,J‖F = 2.92 × 10−2

factorization numerical rank total entriesnone — 4900QR 20 2681

ULV 18 2680SVD 18 2538

• submatrix LI,J of L∂S,S, ‖LI,J‖F = 2.04 × 10−3

factorization numerical rank total entriesnone — 4900QR 14 1896

ULV 10 1447SVD 10 1410

26

Page 27: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Operations with low rank submatrices

A = UAV TA , B = UBV T

B , C = UCV TC

See Bebendorf, “Hierarchical Matrices”, Chapter 1

• Multiplication A = B C, rank(A) ≤ min(rank(B), rank(C))

A = UAV TA =

(UBV T

B

) (UCV T

C

)= BC

= UB

(V T

B UC

)V T

C

= UB

((V T

B UC

)V T

C

)

=(UB

(V T

B UC

))V T

C

• Addition A = B + C, rank(A) ≤ rank(B) + rank(C)

A = UAV TA =

(UBV T

B

)+

(UCV T

C

)= B + C

=[UB UC

] [VB VC

]T

27

Page 28: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Multiplication A = B C

A = UAV TA , B = UBV T

B , C = UCV TC

A = B C =

=(

(m × h) (h × l))(

(l × k) (k × n))

= =

= (m × min(h, k)) (min(h, k) × n)

28

Page 29: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Near-near matrix product

0 5 10 15 20 25 30 35

−16

−14

−12

−10

−8

−6

−4

−2

0

near−near interaction, L2,1

L2,1T

σ(L21)

σ(L21*L21T)

29

Page 30: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

mid-near matrix product

0 5 10 15 20 25 30 35

−16

−14

−12

−10

−8

−6

−4

−2

0

mid−near interaction, L2,1

L1,1T

σ(L21)σ(L11)

σ(L21*L11T)

30

Page 31: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

far-near matrix product

0 5 10 15 20 25 30 35

−16

−14

−12

−10

−8

−6

−4

−2

0

far−near interaction, L5,1

L6,1T

σ(L51)σ(L61)

σ(L51*L61T)

31

Page 32: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

mid-mid matrix product

2 4 6 8 10 12 14 16

−16

−14

−12

−10

−8

−6

−4

−2

0

mid−mid interaction, L1,1

L1,1T

σ(L11)

σ(L11*L11T)

32

Page 33: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

far-mid matrix product

2 4 6 8 10 12 14 16

−16

−14

−12

−10

−8

−6

−4

−2

0

mid−far interaction, L6,1

L1,1T

σ(L61)

σ(L61*L61T)

33

Page 34: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

far-far matrix product

1 2 3 4 5 6 7 8 9 10 11

−16

−14

−12

−10

−8

−6

−4

−2

0

far−far interaction, L6,1

L6,1T

σ(L61)

σ(L61*L61T)

34

Page 35: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Addition A = B + C

rank(A) ≤ rank(B) + rank(C)

A = UAV TA =

(UBV T

B

)+

(UCV T

C

)= B + C

=[UB UC

] [VB VC

]T

= (Q1R1) (Q2R2)T

= Q1

(R1R

T2

)QT

2

= Q1 (Q3R3) QT2

= (Q1Q3)(R3Q

T2

)

= (Q1Q3)(Q2R

T3

)T

• R1RT2 usually has low numerical rank

• examples follow35

Page 36: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

update matrix, diagonal blockA3,3 = L3,1L

T3,1 + L3,2L

T3,2 + L3,3L

T3,3

0 5 10 15 20 25

−16

−14

−12

−10

−8

−6

−4

−2

0

log 10

(σ)

update matrix −− self interaction

σ(near)σ(sum)σ(mid)σ(far)

36

Page 37: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

update matrix, mid-distance off-diagonal blockA5,3 = L5,1L

T3,1 + L5,2L

T3,2 + L5,3L

T3,3

0 2 4 6 8 10 12−20

−18

−16

−14

−12

−10

−8

−6

−4

−2

log 10

(σ)

update matrix −− mid−range interaction

σ(sum)σ(near)σ(mid)σ(far)

37

Page 38: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

update matrix, far-distance off-diagonal blockA7,3 = L7,1L

T3,1 + L7,2L

T3,2 + L7,3L

T3,3

0 2 4 6 8 10 12−20

−18

−16

−14

−12

−10

−8

−6

−4

−2

log 10

(σ)

update matrix −− far−range interaction

σ(sum)σ(near)σ(mid)σ(far)

38

Page 39: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Outline

• graph, tree, matrix perspectives

• experiments – 2-D potential equation

• low rank computations

• blocking strategies

• summary

39

Page 40: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Blocking Strategies

• Active data structures[LJ,J

L∂J,J A∂J,∂J

]

• partition J and ∂J independently

• for 2-d problems, J and ∂J are 1-d manifolds

• for 3-d problems, J and ∂J are 2-d manifolds

• we need mesh partitioning of the separatorand the boundary of a region J

• each index set of a partition is a segment

40

Page 41: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Segment partition

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1 segment wirebasket domain decomposition

41

Page 42: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

0 50 100 150 200 250

0

50

100

150

200

250

LJ,J =∑

σ,τ

Lσ,τ σ × τ ⊆ J × J

0 100 200

0

100

200

300

400

500

600

700

L∂J,J =∑

σ,τ

Lσ,τ σ × τ ⊆ ∂J × J

0 100 200 300 400 500 600 700

0

100

200

300

400

500

600

700

A∂J,∂J =∑

σ,τ

Aσ,τ σ×τ ⊆ ∂J×∂J

42

Page 43: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Strategy I

• The partition of ∂J (local to node J)conforms to the partitions of ancestors K,K ∩ ∂J 6= ∅

• Advantages :

– Update assembly is simplified

A(p(J))σ2,τ2 = A

(p(J))σ2,τ2 + A

(J)σ1,τ1

where σ1 ⊆ σ2, τ1 ⊆ τ2

– One destination for each A(J)σ1,τ1

• Disadvantages :

– Partition of ∂J can be fragmented,more segments, smaller size,less efficient storage

43

Page 44: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Strategy II

• The partition of ∂J (local to node J)need not conform to the partitions of ancestors

• Advantages :

– Partition of ∂J can be optimized bettersince ∂J is small and localized

• Disadvantages :

– Update assembly is more complex

A(p(J))σ1∩σ2,τ1∩τ2

= A(p(J))σ1∩σ2,τ1∩τ2

+ A(J)σ1∩σ2,τ1∩τ2

where σ1 ∩ σ2 6= ∅, τ1 ∩ τ2 6= ∅

– Several destinations for a submatrix A(J)σ1,τ1

44

Page 45: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Outline

• graph, tree, matrix perspectives

• experiments – 2-D potential equation

• low rank computations

• blocking strategies

• summary

45

Page 46: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Multifrontal tree Multifrontal Factorization

0

1

2

3

45

6

7

8

9

10

11

1213

14

15

1617

18 19 20 21

22

23

24

25

26 27

28

2930

31 32

33

34

35

36

37

38

39

40

41

42

43

44454647

48

49

5051

52

53

54

55

56

57

58

59

60

6162

6364

65

66

67

68 69

70

71

• each leaf node is ad-dimensional sparseFEM matrix

• each interior node isa (d − 1)-dimensionaldense BEM matrix

• use low rank storageand computation ateach interior node

46

Page 47: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Call to action

• progress to date in low rank factorizationsdriven by iterative methods

• work needed from direct methods community

• start from industrial strength multifrontal code

– serial, SMP, MPP, hybrid, GPU

– pivoting for stability, out-of-core,singular systems, null spaces

47

Page 48: Sparse factorization using low rank submatrices Cleve Ashcraft …mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf · 2017. 9. 30. · Multifrontal Algorithm •very large, sparse

Call to action

• Many challenges

– partition of space, partition of interface

– added programming complexityof low rank matrices

– challenges to pivot for stability

– challenges/opportunitiesto implement in parallel

• Payoff will be huge

– reduction in storage footprint

– reduction in computational work

– take direct methods to next levelof problem size

48