The Fast Multipole Boundary Element Method for Large-Scale Modelling in Computational Acoustics
Daniel Wilkes 23/05/2018
A global university Perth | Kalgoorlie | Dubai | Malaysia | Singapore
Outline
• The BEM (numerical implementation, computational cost)
• The FMBEM (multipoles, octree structure, fast matrix-vector products)
• Numerical results for acoustics, TS modelling, elastodynamics, NNI, etc.
• Parallelisation of the FMBEM algorithm (my attempt)
• Parallel FMBEM results
The Boundary Element Method (BEM)
• The BEM is a numerical method which solves the boundary integral form of a differential equation using BCs defined on the domain boundary
• In acoustics, this means we are solving the integral form of the Helmholtz PDE to calculate the pressure p and/or particle velocity u on the boundary
[Figure: an interior fluid and an infinite exterior fluid separated by the boundary surface, on which p/u are solved]
• Restrictions: the PDE must have a Green’s function (homogeneous media); BCs are needed
• Advantages: the solution is restricted to the domain surface; the surface can be arbitrary; can couple with other models (FEM, RANS); directly treats infinite domains
Surface solution can then be used to solve for pressure in domain
The BEM: numerical implementation
• The domain surface is discretised into N elements and the continuous surface integrals in the BIE are approximated as a sum of element integrals
• The numerical solution for each element depends on every other element in the mesh, i.e. N interactions for each of the N elements, giving O(N²) complexity
• This yields a matrix equation of the form Ax = b, which is solved for x. For example, with N = 16 elements the element integrals I assemble as

\[
\begin{bmatrix}
I_{1,1} & I_{2,1} & \cdots & I_{16,1} \\
I_{1,2} & I_{2,2} & \cdots & I_{16,2} \\
\vdots & \vdots & \ddots & \vdots \\
I_{1,16} & I_{2,16} & \cdots & I_{16,16}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{16} \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_{16} \end{bmatrix}
\]

where x collects the unknown surface quantities (e.g. p) and b the known incident-field/boundary data
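The O(N²) assembly can be sketched in a few lines of NumPy. This is a toy sketch only, not the actual BIE discretisation: the true entries involve singular element integrals, which are replaced here by the free-space Green's function evaluated between hypothetical element centroids, with a placeholder value on the diagonal.

```python
import numpy as np

def greens(k, x, y):
    """Free-space Helmholtz Green's function exp(ikr) / (4*pi*r)."""
    r = np.linalg.norm(x - y)
    return np.exp(1j * k * r) / (4 * np.pi * r)

rng = np.random.default_rng(0)
N, k = 100, 5.0
centroids = rng.random((N, 3))      # stand-ins for element centroids

# Every element interacts with every other element:
# O(N^2) entries to compute and store.
A = np.empty((N, N), dtype=complex)
for i in range(N):
    for j in range(N):
        # Placeholder diagonal: a real BEM evaluates singular
        # self-integrals here instead.
        A[i, j] = 1.0 if i == j else greens(k, centroids[i], centroids[j])

b = np.ones(N, dtype=complex)       # stand-in boundary-condition vector
x = np.linalg.solve(A, b)           # dense direct solve: O(N^3) operations
```

Even this toy version makes the scaling problem obvious: doubling N quadruples the matrix and storage, which is exactly what the FMBEM below avoids.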
The Fast Multipole BEM (FMBEM)
• The BEM results in an N × N coefficient matrix A which is complex, non-symmetric and fully populated, requiring O(N²) memory to build/store
• Iterative numerical solution also requires O(N²) operations per matrix-vector product: costly for large N
• The FMBEM utilises multipole expansions to allow for interactions between well separated groups of elements, instead of the pair-wise interactions of the BEM
[Figure: pair-wise BEM interactions vs. grouped FMBEM interactions]
• Group storage/interaction of expansions reduces both the memory and computational cost of the FMBEM
• The FMBEM algorithm requires: (1) a method to separate (factorise) Green’s function interactions; (2) a system to group elements
The FMBEM: Spherical Basis Functions
• The following spherical basis functions are defined as a basis for solutions to the Helmholtz equation:

\[ R_n^m(\mathbf{r}) = j_n(kr)\,Y_n^m(\theta,\varphi), \qquad n = 0, 1, 2, \ldots, \quad m = -n, \ldots, n \]
\[ S_n^m(\mathbf{r}) = h_n(kr)\,Y_n^m(\theta,\varphi), \qquad n = 0, 1, 2, \ldots, \quad m = -n, \ldots, n \]
\[ Y_n^m(\theta,\varphi) = (-1)^m \sqrt{\frac{2n+1}{4\pi}\,\frac{(n-|m|)!}{(n+|m|)!}}\; P_n^{|m|}(\cos\theta)\, e^{im\varphi} \]

where j_n and h_n are the spherical Bessel and Hankel functions and P_n^{|m|} are the associated Legendre functions
• These basis functions can be used to build factorised forms of the Green’s function:

\[ G(\mathbf{r}_1 + \mathbf{r}_2) = ik \sum_{n=0}^{\infty} \sum_{m=-n}^{n} R_n^{-m}(\mathbf{r}_1)\, S_n^{m}(\mathbf{r}_2), \qquad |\mathbf{r}_1| < |\mathbf{r}_2| \]

• Groups of S or R basis functions about a common ‘expansion centre’ c can each be summed into a single set of coefficients to represent many Green’s functions
[Figure: Re{R} isosurface, Re{S} isosurface and Re{Y} surface plots; vectors r1, r2, r1 + r2 and expansion centre c]
• Expansion sets can be ‘translated’ by shifting c, thus allowing S/R expansion sets to be reused or recombined/split into larger or smaller sets of elements
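The factorisation above can be checked numerically. A minimal sketch, with two assumptions: the equivalent expansion of G(r1 − r2) is verified rather than G(r1 + r2), and the sum over m of the R·S products is collapsed into Legendre polynomials via the spherical-harmonic addition theorem, which avoids any Y_n^m normalisation conventions:

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn, eval_legendre

k = 2.0
r1 = np.array([0.10, 0.20, 0.05])   # 'small' vector: |r1| < |r2| required
r2 = np.array([1.00, -0.50, 0.80])
a, b = np.linalg.norm(r1), np.linalg.norm(r2)
cos_gamma = r1 @ r2 / (a * b)       # angle between r1 and r2

# Direct evaluation of G = exp(ik|r1 - r2|) / (4*pi*|r1 - r2|)
d = np.linalg.norm(r1 - r2)
G_direct = np.exp(1j * k * d) / (4 * np.pi * d)

# Factorised (multipole) series: j_n carries the R part, h_n^(1) the
# S part, and the m-sum has been collapsed into P_n(cos gamma)
G_series = 0j
for n in range(40):
    h_n = spherical_jn(n, k * b) + 1j * spherical_yn(n, k * b)
    G_series += (2 * n + 1) * spherical_jn(n, k * a) * h_n \
                * eval_legendre(n, cos_gamma)
G_series *= 1j * k / (4 * np.pi)
```

The series converges geometrically with ratio roughly |r1|/|r2|, which is why the expansion is only valid for well separated groups: as |r1| approaches |r2| the truncation order needed blows up.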
The FMBEM: Binary Octree Structure
• The octree structure provides a systematic method for subdividing 3D space within the unit cube
[Figure: the unit cube on xyz axes subdivided into 8 child boxes numbered 0–7, with binary 0/1 labels on each axis]
• xyz coords are converted into binary integers and ‘bit interleaved’: each set of 3 bits indicates which of the 8 boxes contains the point, e.g. x bit = 1, y bit = 0, z bit = 1 gives box number 101 (binary) = 5
• Successive 3-bit groups subdivide each box into 8 smaller boxes
• Inherent relations to parent, children and neighbour boxes
• All points within the same box will have the same box number
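The bit-interleaving scheme can be sketched in a few lines (function names are illustrative, not from any particular FMBEM code):

```python
def box_index(x, y, z, depth):
    """Interleave one (x, y, z) bit per octree level into a box number.

    x, y, z are integer coordinates on a 2**depth grid. The x bit is
    taken as the most significant of each 3-bit group, so (x, y, z)
    bits (1, 0, 1) give box 0b101 = 5, matching the slide's example.
    """
    code = 0
    for level in range(depth - 1, -1, -1):
        code = (code << 3) \
             | (((x >> level) & 1) << 2) \
             | (((y >> level) & 1) << 1) \
             | ((z >> level) & 1)
    return code

def parent(code):
    """Dropping the last 3-bit group gives the parent box one level up."""
    return code >> 3

def children(code):
    """Appending each possible 3-bit group gives the 8 child boxes."""
    return [(code << 3) | c for c in range(8)]
```

The parent/children relations fall out of the encoding for free, which is what makes the octree traversal in the upward and downward FMBEM passes cheap.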
The FMBEM: Octree Group Interactions • The octree structure is used to build a summed multipole expansion set representing all
elements within each box on every level of the octree structure.
• Applying the multipole translations between all ‘well separated’ octree boxes/expansion sets calculates the BEM surface integral for all elements simultaneously.
• Element interactions for neighbouring boxes on the lowest level of the octree cannot be treated with multipole expansions: BEM is used for these interactions
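Which boxes count as neighbours (and therefore stay in the BEM near field) can be determined by decoding box numbers back to grid coordinates. A sketch, assuming the x-most-significant bit-interleaving convention above; names are illustrative:

```python
def decode(code, depth):
    """Invert the bit-interleaving: recover (x, y, z) grid coordinates."""
    x = y = z = 0
    for level in range(depth):
        bits = (code >> (3 * level)) & 7   # 3-bit group for this level
        x |= ((bits >> 2) & 1) << level
        y |= ((bits >> 1) & 1) << level
        z |= (bits & 1) << level
    return x, y, z

def are_neighbours(code_a, code_b, depth):
    """Boxes are neighbours (or identical) if they differ by at most one
    grid cell along every axis; such pairs cannot use multipole
    expansions and are treated with the standard BEM."""
    pa, pb = decode(code_a, depth), decode(code_b, depth)
    return all(abs(a - b) <= 1 for a, b in zip(pa, pb))
```

All remaining box pairs on a level are well separated and interact through translated expansion sets instead of pair-wise element integrals.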
Fast Matrix-Vector Products using the FMBEM
• The FMM is used to calculate the ‘far field’ FF part of the BEM matrix-vector products: corresponding to well separated groups of elements
• Multipole expansions are not valid in the ‘near field’ NF i.e. when 2 groups of elements are not sufficiently separated. The standard BEM is used to treat this part of the matrix.
[Figure: FMBEM far-field (FF) and BEM near-field (NF) parts of the matrix]
• The FMBEM gives the FF contribution of the Ax matrix-vector product (i.e. simultaneously calculated for all solution points) without needing to explicitly build or store the FF part of A
• The FMBEM reduces the computational/memory complexities of the BEM from O(N²) to O(N log N) for the matrix-vector product and O(N) for the memory storage
FMBEM Numerical Examples: Acoustic Scattering
Acoustic scattering of a 3 kHz plane wave impinging the BeTSSi II generic submarine pressure hull at broadside incidence for a rigid B.C. Mesh has 1.5×10⁶ dof.
• FMBEM required ~3 h to solve on 6 cores (25 GB RAM)
• Equivalent BEM matrix would require 2.37×10¹² ops per mat-vec product and ~38 TB of storage space
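The op-count and storage figures for the equivalent dense BEM matrix follow directly from the mesh size (assuming complex double-precision entries at 16 bytes each):

```python
N = 1_538_580                     # degrees of freedom in the BeTSSi mesh
ops_per_matvec = N ** 2           # one multiply-add per matrix entry
bytes_per_entry = 16              # complex double: 2 x 8-byte floats
storage_tb = N ** 2 * bytes_per_entry / 1e12

# ops_per_matvec is about 2.37e12 and storage_tb about 37.9,
# matching the '~38 TB' quoted above
```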
Real (top) and imaginary (bottom) components of the total pressure for a k = 150 plane wave scattered from a rigid sphere (1.5×10⁶ dof/elements). Dashed line: FMBEM; black markers: analytic solution
FMBEM TS Modelling
TS vs. incident angle for the BeTSSi II generic submarine pressure hull at 3 kHz* (various numerical models)
*M. Ostberg, ‘Target echo strength modelling at FOI, including results from the BeTSSi II workshop’, Proceedings of the 39th Scandinavian Symposium on Physical Acoustics, Geilo, Norway, Jan. 31 – Feb. 3 2016
FMBEM TS results (red line) plotted over other results
FMBEM: Other Flavours
Total x (left) and z (right) displacement for a kₛa = 2π P-wave scattered by an ellipsoidal canyon (xyz semi-axes a, 3a, a) at θ = 30°
Dual FMBEM (left) and PAFEC (right) pressures for a k = 35 plane wave scattering from a simplified model of the damping plate. Relative error norm = 22.44%
Dual FMBEM (left) and PAFEC (right) displacements for a k = 35 plane wave scattering from a simplified model of the damping plate. Relative error norm = 29.14%
• Elastodynamic FMBEM • Unified FMBEM (FSI) • FEM-FMBEM, modal analysis • Radiated sound power, NNI
FMBEM: Parallelisation
• A general parallelisation strategy for the matrix-vector product is to distribute the matrix row-wise and replicate the vector across workers (a), resulting in a sub-vector on each worker and no inter-worker communication
• Parallelisation of the FMBEM instead requires column-wise distribution of the matrix (b). Vector components must then be transferred between workers to recover the full vector.
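The column-wise distribution and its communication cost can be sketched with a few simulated "workers" in one NumPy process (illustrative only; a real implementation would exchange the partial vectors over MPI or similar):

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 12, 3                          # matrix size and number of workers
A = rng.standard_normal((N, N))
x = rng.standard_normal(N)

# (b) column-wise distribution: worker p owns a block of columns
# and the matching slice of the vector x
col_blocks = np.array_split(np.arange(N), P)

# each worker computes a full-length partial product from local data only
partials = [A[:, cols] @ x[cols] for cols in col_blocks]

# recovering the full vector requires summing the partials across
# workers: this is the inter-worker communication step that the
# row-wise scheme avoids
y = np.sum(partials, axis=0)
```

The banded sorting and row-compression described next exist precisely to shrink the non-zero footprint of each `partials` vector, and with it the volume of this exchange.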
FMBEM: Parallelisation
• The FMBEM does not require a specific ordering of the dofs (b), which can result in an arbitrary distribution of the sparse near-field matrix (d) and FMBEM far-field interactions
• By sorting the dofs according to their octree box number (a), the sparse near-field (c) and FMBEM far-field interactions become a diagonally banded and block-structured matrix
• This diagonal banding can significantly reduce the inter-worker communication
FMBEM: Parallelisation
• The diagonally banded matrix results in a number of empty rows in each column-wise distributed block (a). The resulting vectors on each worker are then sparsely populated.
• The distributed column-wise blocks of the matrix can be row-compressed (b): the output vectors then only require a small component of data to be transferred between workers. Data transfer is proportional to the separation of the worker indices.
FMBEM: Parallelisation
• Column-wise distribution of the matrix distributes the FMBEM translation operations across workers. The majority of the required multipole expansion sets reside on the worker.
• Expansion sets on other workers are translated/summed and then simultaneously exchanged between workers in a single data transfer step.
• The parallel FMBEM also requires parallelisation of: the sparse NF matrix, the SAI preconditioner and the fGMRES solver
• The resulting algorithm thus applies the entire FMBEM set-up and solution in parallel
Parallel FMBEM Results
Sphere, ka = 100, 201,404 elements

                      Enforced 1-core   1-core   2-cores   4-cores
Precompute time (s)         162           158      110       113
Near-field time (s)        1345           606      739       417
SAI time (s)                312           293      182       141
fGMRES time (s)            6219          4925     3698      2573
Memory (GB)                 6.2           6.2      7.3      10.5
fGMRES speed-up             x1           x1.26    x1.68     x2.42
Parallel efficiency          -           100%      84%       60%
BeTSSi II submarine, 3 kHz, 1,538,580 elements

                      1-core   3-cores
Precompute time (s)     1172      1220
Near-field time (s)     4717      3864
SAI time (s)            1048      1278
fGMRES time (s)         4715      2447
fGMRES speed-up          x1       x1.92
Parallel efficiency       -        64%
Parallel efficiency of other FMBEMs:
• Giraud (2006): 4,851,900 dofs
• Lopez-Fernandez (2011): 1,009,392 dofs
Thank you for listening! Questions?