Hierarchical Solvers Eric Darve CM4 Summer School June 2016


Matrices and linear solvers

• How can we solve Ax = b?

• Direct methods: Gaussian elimination, LU and QR factorizations: O(n3)

• Iterative methods: GMRES, Conjugate Gradient, MINRES, etc.

Iterative methods

• Iterative methods can be very fast.

• They rely primarily on matrix-vector products Ax.

• If A is sparse this can be done very quickly.

• However, the convergence of iterative methods depends on the distribution of eigenvalues.

• So it may be quite slow in many instances.

Convergence analysis

• For symmetric positive definite matrices, the analysis is quite simplified.

• The key result is as follows:

$$\frac{\|e_n\|_A}{\|e_0\|_A} \le \inf_{p \in P_n} \; \max_{\lambda \in \Lambda(A)} |p(\lambda)|$$

where $e_n$ is the error at step $n$, $P_n$ is the set of polynomials of degree at most $n$ with $p(0) = 1$, and $\Lambda(A)$ is the set of all eigenvalues of $A$.

Canonical cases

• If all the eigenvalues are clustered around a few points (say around 1), then convergence is fast.

• Just place all the roots of $p$ inside each cluster of eigenvalues.

Ill-conditioned case

• Recall that $p(0) = 1$.

• So if some eigenvalues are very close to 0, while others are far away, it is difficult to minimize p(𝜆).
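To make the clustering effect concrete, here is a small numerical sketch (not from the slides; it uses a hand-written CG on diagonal test matrices, so the sizes and spectra below are illustrative assumptions):

```python
import numpy as np

def cg_iterations(diag, tol=1e-10, maxiter=10000):
    """Conjugate gradient on A = diag(diag) (SPD); returns iterations to converge."""
    n = len(diag)
    b = np.ones(n)
    x = np.zeros(n)
    r = b.copy()              # r = b - A @ x with x = 0
    p = r.copy()
    rs = r @ r
    for k in range(1, maxiter + 1):
        Ap = diag * p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.linalg.norm(b):
            return k
        p = r + (rs_new / rs) * p
        rs = rs_new
    return maxiter

rng = np.random.default_rng(0)
n = 500
# Eigenvalues clustered around 1: p can place all its roots near the cluster.
it_clustered = cg_iterations(1.0 + 1e-3 * rng.random(n))
# Eigenvalues spread over [1e-3, 1]: large condition number, slow convergence.
it_spread = cg_iterations(np.geomspace(1e-3, 1.0, n))
print(it_clustered, it_spread)
```

The clustered spectrum converges in a handful of iterations, while the spread spectrum needs hundreds, in line with the polynomial min-max bound.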

• For CG:

$$\frac{\|e_n\|_A}{\|e_0\|_A} \le 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^n \sim 2 \left( 1 - \frac{2}{\sqrt{\kappa}} \right)^n$$

Difficulty when the condition number 𝜅 is large.

Preconditioners

• Most engineering matrices are not well-conditioned and have eigenvalues that are not well distributed.

• To solve such systems, a preconditioner is required.

• The effect of the preconditioner will be to regroup the eigenvalues into a few clusters.

Hierarchical solvers

• Hierarchical solvers offer a bridge between direct and iterative solvers.

• They lead to efficient preconditioners suitable for iterative techniques.

• They are based on approximate direct factorizations of the matrix.

• Computational cost is O(n) for many applications (depending on properties of the matrix).

Cost of factorization

• The problem with direct methods and matrix factorization is that they lead to a large computational cost.

• Matrix of size n: cost is O(n3).

• This problem can be mitigated for sparse matrices with many zeros.

• Hierarchical solvers offer a trade-off between computational cost and accuracy for direct methods.

Factorization for sparse matrices

Assume we start from a sparse matrix and perform one step of a block LU factorization:

$$A = \begin{pmatrix} I & 0 \\ U A_{11}^{-1} & I \end{pmatrix} \begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} - U A_{11}^{-1} U^T \end{pmatrix} \begin{pmatrix} I & A_{11}^{-1} U^T \\ 0 & I \end{pmatrix}$$

The Schur complement block $A_{22} - U A_{11}^{-1} U^T$ may have a lot of new non-zero entries.

Sparsification

• Hierarchical methods attempt to maintain the sparsity of the matrix to prevent the fill-in we just discovered.
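The fill-in can be observed directly. This sketch (an assumed model problem, not the slides' code) performs one block elimination on a 2D Laplacian and counts the non-zeros of the Schur complement:

```python
import numpy as np
import scipy.sparse as sp

# 5-point Laplacian on an m-by-m grid; eliminate the first half of the unknowns.
m = 10
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))
A = (sp.kron(sp.identity(m), T) + sp.kron(T, sp.identity(m))).tocsr()
n = A.shape[0]
k = n // 2

A11 = A[:k, :k].toarray()
A12 = A[:k, k:].toarray()
A21 = A[k:, :k].toarray()
A22 = A[k:, k:].toarray()

# One step of block elimination: S = A22 - A21 A11^{-1} A12.
S = A22 - A21 @ np.linalg.solve(A11, A12)

nnz_before = np.count_nonzero(A22)
nnz_after = np.count_nonzero(np.abs(S) > 1e-12)
print(nnz_before, nnz_after)  # the elimination creates many new non-zeros
```

The interface nodes of the remaining block become fully coupled, which is exactly the fill-in the slides describe.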

• How does it work?

Low-rank

• The basic mechanism is to take advantage of the fact that dense blocks can often be approximated by a low-rank matrix.

• This is not always true though. We will investigate this in more detail during the tutorial session.

• Canonical case: for elliptic PDEs, this low-rank property is always observed for clusters of points in the mesh that are well-separated (à la fast multipole method).

What is a low-rank matrix?

[Figure: a matrix $A$ of rank $r$ written as the product of an $n \times r$ block ($r$ columns) and an $r \times n$ block ($r$ rows); the approximation may not be exact.]

LU

• LU factorizations are a great tool for low-rank matrices.

• Assume we have a low-rank matrix and we perform an LU factorization (with full pivoting).

• What happens?

LU for low-rank

[Figure: after $r$ steps of elimination, the trailing block of the matrix is all zero!]

Low-rank factorization

In fact, LU directly produces a factorization of the form $A = L_r \times U_r$, with an $n \times r$ L factor and an $r \times n$ U factor.

How can we use this?
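As a numerical sanity check of this claim (a sketch: SciPy's `lu` uses partial rather than full pivoting, which suffices for a generic random rank-$r$ matrix):

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(0)
n, r = 100, 5
# An exactly rank-r matrix (random example).
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))

P, L, U = lu(A)   # A = P @ L @ U, partial pivoting
# After r elimination steps the remaining Schur complement is numerically zero:
trailing = np.abs(U[r:, :]).max()
print(trailing)

# The first r columns of L and first r rows of U reproduce A:
assert np.allclose(A, P @ (L[:, :r] @ U[:r, :]))
```

So LU applied to a rank-$r$ matrix yields the $n \times r$ times $r \times n$ factorization directly.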

• Let’s see how we can apply this to remove entries in our matrix.

• Recall that the factorization leads to a lot of fill-in.

• LU comes to the rescue to restore sparsity!

Matrix with low-rank block

[Figure: a low-rank block in the matrix is turned into a new block of zeros; row transformations from LU create the new zero entries.]

Sparsity

The fast factorization scheme proceeds as follows:

• Perform a Cholesky or LU factorization.

• When a new fill-in occurs in a block corresponding to well-separated nodes (say in the mesh for a discretized PDE), use row transformations to sparsify the matrix.

This process allows factoring A into a product of completely sparse matrices!

Connection to multigrid

• This method can be connected to multigrid.

• Assume we partition our graph:

Sparse elimination

• Start a block elimination, following the cluster partitioning shown previously.

• Whenever fill-in occurs, we sparsify it.

• What does this mean?

Low-rank fill-in

[Figure: the fill-in block is low-rank; the rest of the matrix is sparse. Row and column transformations followed by a row/column permutation restore sparsity.]

Fine/Coarse

• Elimination of some nodes does not create any new fill-in: these are the multigrid fine nodes.

• The remaining nodes are the multigrid coarse nodes; they will be eliminated at the next round.

Benchmarks

Convergence of iterative methods

• Examples of convergence behavior.

• For conjugate gradient and symmetric positive definite matrices, the eigenvalues are real and positive. This leads to a simple convergence behavior, based on the condition number.

Unsymmetric systems

• For unsymmetric systems, convergence is more challenging.

• Condition number is still an important factor.

• However, clustering of the eigenvalues is critical.

• An interesting case is eigenvalues distributed on the unit circle.
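This case is easy to reproduce. The sketch below computes exact GMRES residuals by least squares over the Krylov space (an illustration, not a production GMRES); the cyclic shift matrix is orthogonal, so its condition number is 1, with all eigenvalues on the unit circle:

```python
import numpy as np

def gmres_residual(A, b, k):
    """Exact GMRES residual after k steps: least squares over the Krylov space.
    (Illustration only; real GMRES builds the basis with Arnoldi.)"""
    cols = [b]
    for _ in range(k - 1):
        cols.append(A @ cols[-1])
    Q, _ = np.linalg.qr(np.column_stack(cols))   # orthonormal Krylov basis
    c, *_ = np.linalg.lstsq(A @ Q, b, rcond=None)
    return np.linalg.norm(b - A @ (Q @ c))

n = 50
b = np.zeros(n); b[0] = 1.0

# Cyclic shift: orthogonal, condition number 1, eigenvalues on the unit circle.
C = np.roll(np.eye(n), 1, axis=0)
# Clustered spectrum: identity plus a small random perturbation.
M = np.eye(n) + 0.01 * np.random.default_rng(0).standard_normal((n, n))

res_circle = gmres_residual(C, b, 25)    # no progress before step n
res_cluster = gmres_residual(M, b, 25)   # converges very quickly
print(res_circle, res_cluster)
```

Despite the perfect condition number, the shift matrix makes no progress at all until the Krylov space spans the whole domain.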

• The condition number is 1. But convergence is still slow because of the lack of clustering.

[Figure: eigenvalues in the complex plane (Re vs. Im) and GMRES residual vs. iteration, from 10^2 down to 10^-6. When the eigenvalues are clustered, convergence is rapid; when they are spread over the unit circle, convergence is slow even though the condition number is small.]

Preconditioning benchmark

• Let’s see how this works in practice.

DISCRETE ORDINATES IN SIMULTANEOUS FORM

ARI FRANKEL

1. Bare minimum physics

The radiative transfer equation, solving for the intensity $I(x, y, z, \mathbf{s})$, is

$$\mathbf{s} \cdot \nabla I + \sigma_e I - \frac{\omega \sigma_e}{4\pi} \int_{4\pi} I \, d\Omega = (1 - \omega)\, \sigma_e I_b \quad (1)$$

where $\sigma_e = \frac{\pi d^2}{4} n$ with particle diameter $d$ and local number density $n$, and $\omega$ is a constant controlling how strong the scattering is (for our case, about 0.7). $I_b = (5.67 \times 10^{-8})\, T^4$ is the blackbody intensity.

Some other definitions:

The unit vector $\mathbf{s}$, defining the direction along which the radiation streams. Let $\theta$ be the polar angle, and $\phi$ be the azimuthal angle in spherical coordinates:

$$\mathbf{s} = \sin\theta\cos\phi\,\hat{x} + \sin\theta\sin\phi\,\hat{y} + \cos\theta\,\hat{z} = \xi\hat{x} + \eta\hat{y} + \mu\hat{z} \quad (2)$$

Note that the direction vector does not depend on space, whereas the coefficient $\sigma_e$ is a function of space.

The solid angle $\Omega$, where

$$\int_{4\pi} f(\mathbf{s}) \, d\Omega = \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \sin\theta \, d\theta \, d\phi \quad (3)$$

The RTE is a partial integro-differential equation, and it needs a boundary condition. At a surface with temperature $T_w$, emissivity $\epsilon_w$, and unit normal vector $\mathbf{n}$ pointing into the domain,

$$I(\vec{r}_{\mathrm{wall}}, \mathbf{s} \cdot \mathbf{n} > 0) = \epsilon_w (5.67 \times 10^{-8})\, T_w^4 + (1 - \epsilon_w) \int_{\mathbf{s} \cdot \mathbf{n} < 0} I \, |\mathbf{s} \cdot \mathbf{n}| \, d\Omega \quad (4)$$

The right-hand side integral is taken over all incoming directions. For PSAAP purposes, the surfaces do not reflect any energy, so we can simplify this down (for now) to

$$I(\vec{r}_{\mathrm{wall}}, \mathbf{s} \cdot \mathbf{n} > 0) = (5.67 \times 10^{-8})\, T_w^4 \quad (5)$$

In an earlier experimental setup, one side was going to be irradiated with an array of lamps. To model this, I was using the effective temperature of the lamps concentrated …

ILU preconditioning

[Figure: eigenvalues of the original matrix and of the ILU-preconditioned matrix in the complex plane; the preconditioned eigenvalues cluster near 1, but need to avoid 0.]

Boundary element method

• We solve the Helmholtz equation using the boundary element method.

• This uses an integral formulation.

Three geometries (w/ Toru Takahashi, Pieter Coulier)

[Figures: Point Jacobi vs. iFMM (H solver); frequency sweep; benchmark geometries "standing horse" and "TV in the living room".]

Indefinite systems (w/ Kai Yang)

• No good preconditioner exists for these problems.

• ILU and MG/AMG fail for these matrices.

• 𝜆 is chosen from the interval [𝜆min, 𝜆max] for the Laplacian.

• 2D Poisson with 10k points.

$$\Delta u - \lambda u = f$$

Convergence of H solver

[Figures: convergence for various eigenvalue shifts; frequency sweep at mid frequency (5𝜆) and high frequency (40𝜆).]

• Lorasp: hierarchical linear solver for sparse matrices. https://bitbucket.org/hadip/lorasp

• w/ Pieter Coulier: hierarchical linear solver for dense matrices; iFMM. Requires an FMM formulation (e.g., BEM, integral equation)

• w/ Toru Takahashi: fast Helmholtz solver using hierarchical matrices.

References

• Fast hierarchical solvers for sparse matrices using low-rank approximation; Hadi Pouransari, Pieter Coulier, Eric Darve; arXiv:1510.07363, http://arxiv.org/abs/1510.07363

• The inverse fast multipole method: using a fast approximate direct solver as a preconditioner for dense linear systems; Pieter Coulier, Hadi Pouransari, Eric Darve; arXiv:1508.01835 http://arxiv.org/abs/1508.01835

• Aminfar, A., and E. Darve. "A fast, memory efficient and robust sparse preconditioner based on a multifrontal approach with applications to finite-element matrices." Int. J. Num. Meth. Eng. (2016), doi:10.1002/nme.5196

Hands-on

• Log on to https://juliabox.org/

• Run sample code to see that everything works for you.

Lab 1: convergence of iterative solvers

• We create matrices with different eigenvalue distributions.

• The linear system is solved using GMRES.

Eigenvalue distributions

• Try out these different cases.

• What do you observe? How fast is the convergence? Can you explain your observations?

Distribution #1

Distribution #2
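If you want to experiment outside JuliaBox, here is a rough Python scaffold for the same lab idea (the `x_shift` parameter mimics the lab's, but the construction below is an assumption, not the lab's actual code):

```python
import numpy as np

def gmres_residual(A, b, k):
    """Exact GMRES residual after k steps, via least squares on the Krylov space."""
    cols = [b]
    for _ in range(k - 1):
        cols.append(A @ cols[-1])
    Q, _ = np.linalg.qr(np.column_stack(cols))   # orthonormal Krylov basis
    c, *_ = np.linalg.lstsq(A @ Q, b, rcond=None)
    return np.linalg.norm(b - A @ (Q @ c))

rng = np.random.default_rng(1)
n = 80
Qmat, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal basis
b = rng.standard_normal(n)

results = {}
for x_shift in (2.0, 0.05):
    # SPD matrix with spectrum [x_shift, x_shift + 1]; shifting the
    # eigenvalues toward 0 makes p(0) = 1 hard to reconcile with small |p|.
    A = Qmat @ np.diag(x_shift + np.linspace(0, 1, n)) @ Qmat.T
    results[x_shift] = gmres_residual(A, b, 12)
print(results)
```

Moving the spectrum toward the origin slows GMRES down dramatically, exactly the effect the lab asks you to produce with `x_shift`.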

Can you make GMRES converge very slowly by changing x_shift?

Distribution #3

This matrix corresponds to rotations in different planes. Try playing around with other eigenvalue distributions!

Hierarchical Matrices

• One fundamental property we use in hierarchical matrix calculations is that the Schur complement can be compressed during an LU/Cholesky factorization.

• Is that true in practice?

• What types of PDE satisfy this compression property?

• Let's investigate.

PDE solver

• Consider a regular mesh and a 5-point stencil for:

$$-k^2 T + \mathbf{e} \cdot \nabla T - D \nabla^2 T = \text{RHS}$$

• Let's do a Gaussian elimination (e.g., LU) on some part of the grid.

[Figure: schematic view of a 2D grid, partitioned into 9 subdomains.]

• Eliminate the center domain of the grid: an LU factorization where we eliminate the rows and columns associated with the center domain.

• Points on the left and right boundaries become all connected. This forms a dense block in the matrix.

• A key assumption in hierarchical solvers is that this matrix must have low-rank blocks. Is that in fact the case?

$$A = \begin{pmatrix} I & 0 \\ U A_{11}^{-1} & I \end{pmatrix} \begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} - U A_{11}^{-1} U^T \end{pmatrix} \begin{pmatrix} I & A_{11}^{-1} U^T \\ 0 & I \end{pmatrix}$$

Set up of benchmark

• Matrix of the system, focusing on the 3 clusters in the middle row (C: center; L: left; R: right):

$$\begin{pmatrix} A_{CC} & A_{CL} & A_{CR} \\ A_{LC} & A_{LL} & 0 \\ A_{RC} & 0 & A_{RR} \end{pmatrix}$$

• Let's eliminate $A_{CC}$.

Fill-in

$$\begin{pmatrix} A_{LL} - A_{LC} A_{CC}^{-1} A_{CL} & -A_{LC} A_{CC}^{-1} A_{CR} \\ -A_{RC} A_{CC}^{-1} A_{CL} & A_{RR} - A_{RC} A_{CC}^{-1} A_{CR} \end{pmatrix}$$

The diagonal blocks are modified self-interactions; the off-diagonal blocks are new fill-in between L and R.

Low-rank assumption

• For hierarchical solvers to be efficient, this block should be low-rank.

• Let’s test this.
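A sketch of such a test for the pure-diffusion case (Case #1 below), using an assumed 5-point Laplacian split into left/center/right strips; the fill-in block between L and R turns out to have numerical rank far below its size:

```python
import numpy as np
import scipy.sparse as sp

# 5-point Laplacian on an m-by-m grid, split into three vertical strips
# L, C, R of width m//3 each, so that L and R touch only through C.
m = 24
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))
A = (sp.kron(sp.identity(m), T) + sp.kron(T, sp.identity(m))).tocsr()

idx = np.arange(m * m).reshape(m, m)
w = m // 3
L = idx[:, :w].ravel()
C = idx[:, w:2 * w].ravel()
R = idx[:, 2 * w:].ravel()

ACC = A[C][:, C].toarray()
ALC = A[L][:, C].toarray()
ACR = A[C][:, R].toarray()

# Fill-in block between L and R after eliminating the center cluster C:
S_LR = -ALC @ np.linalg.solve(ACC, ACR)

sv = np.linalg.svd(S_LR, compute_uv=False)
rank = int(np.sum(sv > 1e-8 * sv[0]))
print(S_LR.shape, rank)   # numerical rank is far below the block size
```

For the diffusion-dominated case the singular values decay rapidly, which is what makes the sparsification step of hierarchical solvers effective.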

$$\begin{pmatrix} A_{LL} - A_{LC} A_{CC}^{-1} A_{CL} & -A_{LC} A_{CC}^{-1} A_{CR} \\ -A_{RC} A_{CC}^{-1} A_{CL} & A_{RR} - A_{RC} A_{CC}^{-1} A_{CR} \end{pmatrix}$$

Case #1: Pure diffusion equation

k = 0 # shift

D = 1 # diffusion

ex = 0 # velocities

ey = 0

Case #2: Convection dominated

k = 0 # shift

D = 0.01 # diffusion

ex: 0.1, 1, 1    # velocities
ey: 1, 1, 0.1

Case #3: Oscillatory system

D = 1 # diffusion

ex = ey = 0 # velocity

k: 0.1, 0.5, 2.5  # shifts