The Fast Multipole Boundary Element Method for Large-Scale Modelling in Computational Acoustics
Daniel Wilkes 23/05/2018
A global university Perth | Kalgoorlie | Dubai | Malaysia | Singapore
Outline
• The BEM (numerical implementation, computational cost)
• The FMBEM (multipoles, octree structure, fast matrix-vector products)
• Numerical results for acoustics, TS modelling, elastodynamics, NNI, etc.
• Parallelisation of the FMBEM algorithm (my attempt)
• Parallel FMBEM results
The Boundary Element Method (BEM)
• The BEM is a numerical method which solves the boundary integral form of a differential equation using BCs defined on the domain boundary
• In acoustics, this means we are solving the integral form of the Helmholtz PDE to calculate the pressure p and/or particle velocity u on the boundary
[Figure: an interior fluid and an infinite exterior fluid separated by the boundary surface, on which p/u are solved]
• Restrictions: the PDE must have a Green’s function (homogeneous media); BCs are needed
• Advantages: the solution is restricted to the domain surface; the surface can be arbitrary; can couple with other models (FEM, RANS); directly treats infinite domains
Surface solution can then be used to solve for pressure in domain
The BEM: numerical implementation
• The domain surface is discretised into N elements and the continuous surface integrals in the BIE are approximated as a sum of element integrals
• The numerical solution for each element depends on every other element in the mesh, i.e. N interactions for each of the N elements, giving O(N²) complexity
• This yields a matrix equation of the form Ax = b, which is solved for x. For example, with N = 16 elements the element integrals I assemble as

\[
\begin{bmatrix}
I_{1,1} & I_{2,1} & \cdots & I_{16,1} \\
I_{1,2} & I_{2,2} & \cdots & I_{16,2} \\
\vdots & \vdots & \ddots & \vdots \\
I_{1,16} & I_{2,16} & \cdots & I_{16,16}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{16} \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_{16} \end{bmatrix}
\]

where x collects the unknown surface quantities (e.g. p) and b the known incident-field/boundary data
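The O(N²) assembly can be sketched in a few lines of NumPy. This is a toy sketch only, not the actual BIE discretisation: the true entries involve singular element integrals, which are replaced here by the free-space Green's function evaluated between hypothetical element centroids, with a placeholder value on the diagonal.

```python
import numpy as np

def greens(k, x, y):
    """Free-space Helmholtz Green's function exp(ikr) / (4*pi*r)."""
    r = np.linalg.norm(x - y)
    return np.exp(1j * k * r) / (4 * np.pi * r)

rng = np.random.default_rng(0)
N, k = 100, 5.0
centroids = rng.random((N, 3))      # stand-ins for element centroids

# Every element interacts with every other element:
# O(N^2) entries to compute and store.
A = np.empty((N, N), dtype=complex)
for i in range(N):
    for j in range(N):
        # Placeholder diagonal: a real BEM evaluates singular
        # self-integrals here instead.
        A[i, j] = 1.0 if i == j else greens(k, centroids[i], centroids[j])

b = np.ones(N, dtype=complex)       # stand-in boundary-condition vector
x = np.linalg.solve(A, b)           # dense direct solve: O(N^3) operations
```

Even this toy version makes the scaling problem obvious: doubling N quadruples the matrix and storage, which is exactly what the FMBEM below avoids.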
The Fast Multipole BEM (FMBEM)
• The BEM results in an N × N coefficient matrix A which is complex, non-symmetric and fully populated, requiring O(N²) memory to build/store
• Iterative numerical solution also requires O(N²) operations per matrix-vector product: costly for large N
• The FMBEM utilises multipole expansions to allow for interactions between well separated groups of elements, instead of the pair-wise interactions of the BEM
[Figure: pair-wise BEM interactions vs. grouped FMBEM interactions]
• Group storage/interaction of expansions reduces both the memory and computational cost of the FMBEM
• The FMBEM algorithm requires: (1) a method to separate (factorise) Green’s function interactions; (2) a system to group elements
The FMBEM: Spherical Basis Functions
• The following spherical basis functions are defined as a basis for solutions to the Helmholtz equation:

\[ R_n^m(\mathbf{r}) = j_n(kr)\,Y_n^m(\theta,\varphi), \qquad n = 0, 1, 2, \ldots, \quad m = -n, \ldots, n \]
\[ S_n^m(\mathbf{r}) = h_n(kr)\,Y_n^m(\theta,\varphi), \qquad n = 0, 1, 2, \ldots, \quad m = -n, \ldots, n \]
\[ Y_n^m(\theta,\varphi) = (-1)^m \sqrt{\frac{2n+1}{4\pi}\,\frac{(n-|m|)!}{(n+|m|)!}}\; P_n^{|m|}(\cos\theta)\, e^{im\varphi} \]

where j_n and h_n are the spherical Bessel and Hankel functions and P_n^{|m|} are the associated Legendre functions
• These basis functions can be used to build factorised forms of the Green’s function:

\[ G(\mathbf{r}_1 + \mathbf{r}_2) = ik \sum_{n=0}^{\infty} \sum_{m=-n}^{n} R_n^{-m}(\mathbf{r}_1)\, S_n^{m}(\mathbf{r}_2), \qquad |\mathbf{r}_1| < |\mathbf{r}_2| \]

• Groups of S or R basis functions about a common ‘expansion centre’ c can each be summed into a single set of coefficients to represent many Green’s functions
[Figure: Re{R} isosurface, Re{S} isosurface and Re{Y} surface plots; vectors r1, r2, r1 + r2 and expansion centre c]
• Expansion sets can be ‘translated’ by shifting c, thus allowing S/R expansion sets to be reused or recombined/split into larger or smaller sets of elements
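The factorisation above can be checked numerically. A minimal sketch, with two assumptions: the equivalent expansion of G(r1 − r2) is verified rather than G(r1 + r2), and the sum over m of the R·S products is collapsed into Legendre polynomials via the spherical-harmonic addition theorem, which avoids any Y_n^m normalisation conventions:

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn, eval_legendre

k = 2.0
r1 = np.array([0.10, 0.20, 0.05])   # 'small' vector: |r1| < |r2| required
r2 = np.array([1.00, -0.50, 0.80])
a, b = np.linalg.norm(r1), np.linalg.norm(r2)
cos_gamma = r1 @ r2 / (a * b)       # angle between r1 and r2

# Direct evaluation of G = exp(ik|r1 - r2|) / (4*pi*|r1 - r2|)
d = np.linalg.norm(r1 - r2)
G_direct = np.exp(1j * k * d) / (4 * np.pi * d)

# Factorised (multipole) series: j_n carries the R part, h_n^(1) the
# S part, and the m-sum has been collapsed into P_n(cos gamma)
G_series = 0j
for n in range(40):
    h_n = spherical_jn(n, k * b) + 1j * spherical_yn(n, k * b)
    G_series += (2 * n + 1) * spherical_jn(n, k * a) * h_n \
                * eval_legendre(n, cos_gamma)
G_series *= 1j * k / (4 * np.pi)
```

The series converges geometrically with ratio roughly |r1|/|r2|, which is why the expansion is only valid for well separated groups: as |r1| approaches |r2| the truncation order needed blows up.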
The FMBEM: Binary Octree Structure
• The octree structure provides a systematic method for subdividing 3D space within the unit cube
[Figure: the unit cube on xyz axes subdivided into 8 child boxes numbered 0–7, with binary 0/1 labels on each axis]
• xyz coords are converted into binary integers and ‘bit interleaved’: each set of 3 bits indicates which of the 8 boxes contains the point, e.g. x bit = 1, y bit = 0, z bit = 1 gives box number 101 (binary) = 5
• Successive 3-bit groups subdivide each box into 8 smaller boxes
• Inherent relations to parent, children and neighbour boxes
• All points within the same box will have the same box number
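The bit-interleaving scheme can be sketched in a few lines (function names are illustrative, not from any particular FMBEM code):

```python
def box_index(x, y, z, depth):
    """Interleave one (x, y, z) bit per octree level into a box number.

    x, y, z are integer coordinates on a 2**depth grid. The x bit is
    taken as the most significant of each 3-bit group, so (x, y, z)
    bits (1, 0, 1) give box 0b101 = 5, matching the slide's example.
    """
    code = 0
    for level in range(depth - 1, -1, -1):
        code = (code << 3) \
             | (((x >> level) & 1) << 2) \
             | (((y >> level) & 1) << 1) \
             | ((z >> level) & 1)
    return code

def parent(code):
    """Dropping the last 3-bit group gives the parent box one level up."""
    return code >> 3

def children(code):
    """Appending each possible 3-bit group gives the 8 child boxes."""
    return [(code << 3) | c for c in range(8)]
```

The parent/children relations fall out of the encoding for free, which is what makes the octree traversal in the upward and downward FMBEM passes cheap.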
The FMBEM: Octree Group Interactions • The octree structure is used to build a summed multipole expansion set representing all
elements within each box on every level of the octree structure.
• Applying the multipole translations between all ‘well separated’ octree boxes/expansion sets calculates the BEM surface integral for all elements simultaneously.
• Element interactions for neighbouring boxes on the lowest level of the octree cannot be treated with multipole expansions: BEM is used for these interactions
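Which boxes count as neighbours (and therefore stay in the BEM near field) can be determined by decoding box numbers back to grid coordinates. A sketch, assuming the x-most-significant bit-interleaving convention above; names are illustrative:

```python
def decode(code, depth):
    """Invert the bit-interleaving: recover (x, y, z) grid coordinates."""
    x = y = z = 0
    for level in range(depth):
        bits = (code >> (3 * level)) & 7   # 3-bit group for this level
        x |= ((bits >> 2) & 1) << level
        y |= ((bits >> 1) & 1) << level
        z |= (bits & 1) << level
    return x, y, z

def are_neighbours(code_a, code_b, depth):
    """Boxes are neighbours (or identical) if they differ by at most one
    grid cell along every axis; such pairs cannot use multipole
    expansions and are treated with the standard BEM."""
    pa, pb = decode(code_a, depth), decode(code_b, depth)
    return all(abs(a - b) <= 1 for a, b in zip(pa, pb))
```

All remaining box pairs on a level are well separated and interact through translated expansion sets instead of pair-wise element integrals.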
Fast Matrix-Vector Products using the FMBEM
• The FMM is used to calculate the ‘far field’ FF part of the BEM matrix-vector products: corresponding to well separated groups of elements
• Multipole expansions are not valid in the ‘near field’ NF i.e. when 2 groups of elements are not sufficiently separated. The standard BEM is used to treat this part of the matrix.
[Figure: FMBEM far-field (FF) and BEM near-field (NF) parts of the matrix]
• The FMBEM gives the FF contribution of the Ax matrix-vector product (i.e. simultaneously calculated for all solution points) without needing to explicitly build or store the FF part of A
• The FMBEM reduces the computational/memory complexities of the BEM from O(N²) to O(N log N) for the matrix-vector product and O(N) for the memory storage
FMBEM Numerical Examples: Acoustic Scattering
Acoustic scattering of a 3 kHz plane wave impinging the BeTSSi II generic submarine pressure hull at broadside incidence for a rigid B.C. Mesh has 1.5×10⁶ dof.
• FMBEM required ~3 h to solve on 6 cores (25 GB RAM)
• Equivalent BEM matrix would require 2.37×10¹² ops per mat-vec product and ~38 TB of storage space
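The op-count and storage figures for the equivalent dense BEM matrix follow directly from the mesh size (assuming complex double-precision entries at 16 bytes each):

```python
N = 1_538_580                     # degrees of freedom in the BeTSSi mesh
ops_per_matvec = N ** 2           # one multiply-add per matrix entry
bytes_per_entry = 16              # complex double: 2 x 8-byte floats
storage_tb = N ** 2 * bytes_per_entry / 1e12

# ops_per_matvec is about 2.37e12 and storage_tb about 37.9,
# matching the '~38 TB' quoted above
```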
Real (top) and imaginary (bottom) components of the total pressure for a k = 150 plane wave scattered from a rigid sphere (1.5×10⁶ dof/elements). Dashed line: FMBEM; black markers: analytic solution
FMBEM TS Modelling
TS vs. incident angle for the BeTSSi II generic submarine pressure hull at 3 kHz* (various numerical models)
*M. Ostberg, ‘Target echo strength modelling at FOI, including results from the BeTSSi II workshop’, Proceedings of the 39th Scandinavian Symposium on Physical Acoustics, Geilo, Norway, Jan. 31 – Feb. 3 2016
FMBEM TS results (red line) plotted over other results
FMBEM: Other Flavours
Total x (left) and z (right) displacement for a kₛa = 2π P-wave scattered by an ellipsoidal canyon (xyz semi-axes a, 3a, a) at θ = 30°
Dual FMBEM (left) and PAFEC (right) pressures for a k = 35 plane wave scattering from a simplified model of the damping plate. Relative error norm = 22.44%
Dual FMBEM (left) and PAFEC (right) displacements for a k = 35 plane wave scattering from a simplified model of the damping plate. Relative error norm = 29.14%
• Elastodynamic FMBEM • Unified FMBEM (FSI) • FEM-FMBEM, modal analysis • Radiated sound power, NNI
FMBEM: Parallelisation
• A general parallelisation strategy for the matrix-vector product is to distribute the matrix row-wise and replicate the vector across workers (a), resulting in a sub-vector on each worker and no inter-worker communication
• Parallelisation of the FMBEM instead requires column-wise distribution of the matrix (b). Vector components must then be transferred between workers to recover the full vector.
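The column-wise distribution and its communication cost can be sketched with a few simulated "workers" in one NumPy process (illustrative only; a real implementation would exchange the partial vectors over MPI or similar):

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 12, 3                          # matrix size and number of workers
A = rng.standard_normal((N, N))
x = rng.standard_normal(N)

# (b) column-wise distribution: worker p owns a block of columns
# and the matching slice of the vector x
col_blocks = np.array_split(np.arange(N), P)

# each worker computes a full-length partial product from local data only
partials = [A[:, cols] @ x[cols] for cols in col_blocks]

# recovering the full vector requires summing the partials across
# workers: this is the inter-worker communication step that the
# row-wise scheme avoids
y = np.sum(partials, axis=0)
```

The banded sorting and row-compression described next exist precisely to shrink the non-zero footprint of each `partials` vector, and with it the volume of this exchange.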
FMBEM: Parallelisation
• The FMBEM does not require a specific ordering of the dofs (b), which can result in an arbitrary distribution of the sparse near-field matrix (d) and FMBEM far-field interactions
• By sorting the dofs according to their octree box number (a), the sparse near-field (c) and FMBEM far-field interactions become a diagonally banded and block-structured matrix
• This diagonal banding can significantly reduce the inter-worker communication
FMBEM: Parallelisation
• The diagonally banded matrix results in a number of empty rows in each column-wise distributed block (a). The resulting vectors on each worker are then sparsely populated.
• The distributed column-wise blocks of the matrix can be row-compressed (b): the output vectors then only require a small component of data to be transferred between workers. Data transfer is proportional to the separation of the worker indices.
FMBEM: Parallelisation
• Column-wise distribution of the matrix distributes the FMBEM translation operations across workers. The majority of the required multipole expansion sets reside on the worker.
• Expansion sets on other workers are translated/summed and then simultaneously exchanged between workers in a single data transfer step.
• The parallel FMBEM also requires parallelisation of: the sparse NF matrix, the SAI preconditioner and the fGMRES solver
• The resulting algorithm thus applies the entire FMBEM set-up and solution in parallel
Parallel FMBEM Results
Sphere, ka = 100, 201,404 elements

                      Enforced 1-core   1-core   2-cores   4-cores
Precompute time (s)         162           158      110       113
Near-field time (s)        1345           606      739       417
SAI time (s)                312           293      182       141
fGMRES time (s)            6219          4925     3698      2573
Memory (GB)                 6.2           6.2      7.3      10.5
fGMRES speed-up             x1           x1.26    x1.68     x2.42
Parallel efficiency          -           100%      84%       60%
BeTSSi II submarine, 3 kHz, 1,538,580 elements

                      1-core   3-cores
Precompute time (s)     1172      1220
Near-field time (s)     4717      3864
SAI time (s)            1048      1278
fGMRES time (s)         4715      2447
fGMRES speed-up          x1       x1.92
Parallel efficiency       -        64%
Parallel efficiency of other FMBEMs:
• Giraud (2006): 4,851,900 dofs
• Lopez-Fernandez (2011): 1,009,392 dofs
Thank you for listening! Questions?