Page 1: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

July 2011 High-order schemes for high-frequency Helmholtz equation 1

Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Dan Gordon, Computer Science, University of Haifa

Rachel Gordon, Aerospace Eng.


Page 2: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes


Background on Helmholtz equation

The CARP-CG parallel algorithm

Comparative results using low- and high-order finite difference schemes

Page 3: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

The Helmholtz Equation

Eqn: -Δu - k²u = g (k = "wave no.")

c = speed of sound, f = frequency

Wavelength: λ = c/f = 2π/k

No. of grid pts per wavelength: Ng = λ/h, h = mesh size
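The relations between wave number, wavelength, and mesh size can be sketched numerically. The snippet below is only an illustration: the function name `grid_parameters` and the unit-length side are assumptions, and the values k = 300 and Ng = 15 are taken from the experiments reported later in the talk.

```python
import math

def grid_parameters(k, ng, side=1.0):
    """Return (wavelength, mesh size h, grid intervals per side).

    Hypothetical helper: assumes a square domain of the given side length.
    """
    wavelength = 2.0 * math.pi / k   # lambda = 2*pi/k
    h = wavelength / ng              # Ng = lambda/h  =>  h = lambda/Ng
    n_side = round(side / h)         # number of mesh intervals along one side
    return wavelength, h, n_side

wavelength, h, n_side = grid_parameters(k=300.0, ng=15)
print(f"lambda = {wavelength:.5f}, h = {h:.6f}, intervals per side = {n_side}")
print(f"approx. unknowns on a unit square: {n_side * n_side}")
```

Under these assumptions the unknown count comes out at roughly half a million, the same order as the problem sizes quoted for the test problems.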

Shifted Laplacian approach:
– Bayliss, Goldstein & Turkel, 1983
– Erlangga, Vuik & Oosterlee, 2004/06 introduced imaginary shift: -Δu - (α + iβ)k²u = g

Page 4: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

The Helmholtz Equation

Some other approaches:
– Elman, Ernst & O'Leary, 2001
– Plessix & Mulder, 2003
– Duff, Gratton, Pinel & Vasseur, 2007
– Bollhöfer, Grote & Schenk, 2009
– Osei-Kuffuor & Saad, 2010

This work: hi-order schemes following
– Singer & Turkel, 2006
– Erlangga & Turkel, 2011 (to appear)

Page 5: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Difficulties with Helmholtz

High frequencies ⇒ small diagonal

2nd order schemes require many grid points/wavelength

"Pollution effect": high frequency requires more than a fixed number of grid points/wavelength (Babuška & Sauter, 2000)

⇒ high-order schemes required

Page 6: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

CARP: block-parallel Kaczmarz

Given: Ax = b

"Normal equations": AAᵀy = b, x = Aᵀy

Kaczmarz algorithm (1937), "KACZ", is SOR on the normal equations

Relaxation parameter of KACZ is the usual relax. par. of SOR

Cyclic relax. par.: each eq. gets its own relax. par.
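The statement that KACZ is SOR on the normal equations can be checked on a small example. The sketch below is illustrative only: the function names, the random test matrix, and the fixed relaxation parameter are assumptions. It runs a Kaczmarz sweep on Ax = b alongside an SOR sweep on AAᵀy = b and confirms that x = Aᵀy holds after every sweep.

```python
import numpy as np

def kaczmarz_sweep(A, b, x, omega):
    """One forward KACZ sweep: relaxed projection onto each equation in turn."""
    x = x.copy()
    for i in range(A.shape[0]):
        a = A[i]
        x += omega * (b[i] - a @ x) / (a @ a) * a
    return x

def sor_sweep_normal_eqs(A, b, y, omega):
    """One SOR sweep on A A^T y = b (in-place updates give Gauss-Seidel ordering)."""
    M = A @ A.T
    y = y.copy()
    for i in range(len(b)):
        y[i] += omega * (b[i] - M[i] @ y) / M[i, i]
    return y

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 3.0 * np.eye(5)   # arbitrary test matrix
b = rng.standard_normal(5)

y = np.zeros(5)
x = A.T @ y
for _ in range(3):
    x = kaczmarz_sweep(A, b, x, omega=1.5)
    y = sor_sweep_normal_eqs(A, b, y, omega=1.5)
    print(np.allclose(x, A.T @ y))   # the two iterations stay in step
```

Note that the Kaczmarz form never builds AAᵀ explicitly; it only needs one row of A at a time.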

Page 7: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

KACZ: Geometric Description

[Figure: geometric description of KACZ; labels: eq. 1, eq. 2, eq. 3]


Page 8: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

CARP: Component-Averaged Row Projections

A block-parallel version of KACZ

Equations divided into blocks (not necessarily disjoint)

Initial estimate: vector x = (x1,…,xn)

Suppose component x1 appears in 3 blocks

x1 is “cloned” as y1, z1, t1 in the different blocks

Perform a KACZ iteration on each block (independently, in parallel)

Page 9: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

CARP – Explanation (cont)

The internal iterations in each block produce 3 new values for the clones of x1: y1’, z1’, t1’

The next iterative value of x1 is x1’ = (y1’ + z1’ + t1’)/3

The next iterate is x’ = (x1’, ..., xn’)

Repeat iterations as needed for convergence
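The clone-and-average step can be sketched serially as follows. This is a minimal illustration, not the authors' implementation: the function names, the tridiagonal test matrix, and the two-block split are assumptions. Each block sweeps over its own equations starting from a private copy of x, and every component is then averaged over the blocks whose equations involve it.

```python
import numpy as np

def kacz_sweep(A, b, x, omega):
    """One KACZ sweep over the rows of A, starting from a copy of x."""
    x = x.copy()
    for i in range(A.shape[0]):
        a = A[i]
        x += omega * (b[i] - a @ x) / (a @ a) * a
    return x

def carp_iteration(A, b, x, blocks, omega=1.5):
    """One CARP step: independent block sweeps, then component averaging."""
    n = A.shape[1]
    total = np.zeros(n)
    count = np.zeros(n)
    for rows in blocks:                       # independent: could run in parallel
        uses = np.any(A[rows] != 0, axis=0)   # components appearing in this block
        x_local = kacz_sweep(A[rows], b[rows], x, omega)   # sweep on the clones
        total[uses] += x_local[uses]
        count[uses] += 1
    x_new = x.copy()
    touched = count > 0
    x_new[touched] = total[touched] / count[touched]   # average the clones
    return x_new

# Hypothetical test problem: tridiagonal system split into two row blocks.
n = 8
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.arange(1.0, n + 1.0)
blocks = [np.arange(0, 4), np.arange(4, 8)]

x = np.zeros(n)
for _ in range(300):
    x = carp_iteration(A, b, x, blocks)
print(np.max(np.abs(A @ x - b)))   # residual after 300 CARP iterations
```

In this example only components 3 and 4 are shared between the two blocks, so only they have two clones to average; all other components keep the value computed by their single owning block.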


Page 10: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

CARP as Domain Decomposition

[Figure: domain A and domain B with grid points x0, x1, y1; labels: external grid point of A, clone of x1]

Note: domains may overlap

Page 11: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Overview of CARP

[Figure: domain A and domain B]

KACZ in some superspace (with cyclic relaxation)

Page 12: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Convergence of CARP

Averaging Lemma: the component-averaging operations of CARP are equivalent to KACZ row-projections in a certain superspace (with relaxation parameter = 1)

CARP is equivalent to KACZ in the superspace, with cyclic relaxation parameters – known to converge

Page 13: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

CARP Applications

Elliptic PDEs w/large convection term result in stiff linear systems (large off-diagonal elements)
– CARP very robust on such systems, compared to leading solver & preconditioner combinations
– Downside: Not always efficient

Electron tomography (ET) – joint work with J.-J. Fernández

Page 14: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

CARP-CG: CG acceleration of CARP

CARP is KACZ in some superspace (with cyclic relaxation parameters)

Björck & Elfving (1979): developed CGMN, which is a (sequential) CG-acceleration of KACZ (double sweep, fixed relax. parameter)

We extended this result to allow cyclic relaxation parameters

Result: CARP-CG
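The idea behind CG acceleration of a double KACZ sweep can be sketched on one processor. The code below is an illustrative reading of the CGMN idea, not the authors' code: the function names and the test matrix are assumptions, and for simplicity the last row is projected twice at the turn of the sweep. A forward-plus-backward sweep maps x to Qx + Rb with Q symmetric, so CG can be applied to (I - Q)x = Rb.

```python
import numpy as np

def double_sweep(A, b, x, omega):
    """Forward then backward KACZ sweep; the map is affine: x -> Q x + R b."""
    x = x.copy()
    m = A.shape[0]
    sq_norms = np.einsum("ij,ij->i", A, A)
    for i in list(range(m)) + list(range(m - 1, -1, -1)):
        x += omega * (b[i] - A[i] @ x) / sq_norms[i] * A[i]
    return x

def cgmn(A, b, omega=1.5, tol=1e-10, max_iter=500):
    """CG on (I - Q) x = R b, using only double sweeps (matrix-free)."""
    zero_rhs = np.zeros_like(b)
    apply_I_minus_Q = lambda v: v - double_sweep(A, zero_rhs, v, omega)
    x = np.zeros(A.shape[1])
    r = double_sweep(A, b, x, omega)      # initial residual R b, since x = 0
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        q = apply_I_minus_Q(p)
        alpha = rs / (p @ q)
        x += alpha * p
        r -= alpha * q
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(1)
A = 4.0 * np.eye(6) + rng.uniform(-1.0, 1.0, size=(6, 6))   # arbitrary test matrix
b = rng.standard_normal(6)
x = cgmn(A, b)
print(np.max(np.abs(A @ x - b)))
```

The same structure would carry over to CARP-CG by replacing the double sweep with a double CARP sweep, which is where the extension to cyclic relaxation parameters is needed.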

Page 15: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

CARP-CG: Properties

Same robustness as CARP

Very significant improvement in performance on stiff linear systems derived from elliptic PDEs

Very competitive runtime compared to leading solver/preconditioner combinations on systems derived from convection-dominated PDEs

Highly scalable on Helmholtz eqns

Page 16: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

CARP-CG: Properties

On one processor, CARP-CG is identical to CGMN

Particularly useful on systems with LARGE off-diagonal elements
– example: convection-dominated PDEs

Discontinuous coefficients are handled without requiring domain decomposition (DD)

Page 17: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Robustness of CARP-CG

KACZ inherently "normalizes" the eqns (eqn i is divided by ‖Aᵢ‖₂)

Normalization is generally useful for discontinuous coefficients

After normalization, the diagonal elements of AAᵀ are all 1, and strictly greater than the off-diagonal elements

This is not diagonal dominance, but it makes the normal eqns manageable

Also: when diag of A decreases, sum of off-diag of AAᵀ decreases.
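The effect of row normalization on AAᵀ is easy to verify numerically. In the sketch below, the helper name `normalize_rows` and the random matrix with one badly scaled row (standing in for a discontinuous coefficient) are assumptions made for illustration.

```python
import numpy as np

def normalize_rows(A, b):
    """Divide each equation by the 2-norm of its row of A."""
    norms = np.linalg.norm(A, axis=1)
    return A / norms[:, None], b / norms

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6))
A[0] *= 1.0e4                      # one equation on a very different scale
b = rng.standard_normal(6)

A_n, b_n = normalize_rows(A, b)
G = A_n @ A_n.T                    # matrix of the normal equations
off_diag = G - np.diag(np.diag(G))

print(np.allclose(np.diag(G), 1.0))      # unit diagonal
print(np.max(np.abs(off_diag)) < 1.0)    # off-diagonal entries below 1 in magnitude
```

The bound on the off-diagonal entries follows from the Cauchy-Schwarz inequality and is strict as long as no two rows of A are parallel.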

Page 18: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Experiments with Hi-Order

Relax. par. = 1.5 for all problems

2nd, 4th & 6th order central difference schemes, following
– Singer & Turkel, 2006
– Erlangga & Turkel, 2011

Hi-order schemes: 9-pt. stencil

Complex eqns: separated real & imag., interleaved equations (following Day & Heroux, 2001)
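One way to separate a complex system into interleaved real equations is sketched below. It is an assumption about the layout, not a transcription of the authors' code: each complex entry a + ib becomes the 2×2 real block [[a, -b], [b, a]], and each complex unknown becomes two adjacent real unknowns. The function names are hypothetical.

```python
import numpy as np

def complex_to_interleaved_real(A, b):
    """Real-equivalent system with real and imaginary parts interleaved."""
    J = np.array([[0.0, -1.0], [1.0, 0.0]])
    # Block (i, j) is Re(a_ij) * I + Im(a_ij) * J = [[a, -b], [b, a]].
    A_real = np.kron(A.real, np.eye(2)) + np.kron(A.imag, J)
    b_real = np.column_stack([b.real, b.imag]).ravel()
    return A_real, b_real

def interleaved_to_complex(v):
    """Inverse of the interleaving: [Re x0, Im x0, Re x1, ...] -> complex x."""
    return v[0::2] + 1j * v[1::2]

rng = np.random.default_rng(3)
A = 4.0 * np.eye(4) + rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
b = rng.standard_normal(4) + 1j * rng.standard_normal(4)

A_real, b_real = complex_to_interleaved_real(A, b)
x = interleaved_to_complex(np.linalg.solve(A_real, b_real))
print(np.allclose(A @ x, b))
```

Interleaving keeps the real and imaginary equations of one grid point next to each other, so the sparsity pattern of the real system mirrors that of the complex one.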

Page 19: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Problem 1 (with analytic sol'n)

Based on Erlangga & Turkel, 2011

Eqn: (Δ + k²)u = 0, on [-0.5,0.5]×[0,1]

Bndry conditions: Dirichlet on 3 sides:
– u = 0 for x = -0.5 and x = 0.5
– u = cos(πx) for y = 0

Sommerfeld: u_y + iβu = 0 for y = 1, β² = k² - π²

Analytic solution: u = cos(πx)exp(-iβy)

Grid points per wavelength: Ng = 9, 12, 15, 18

Approx. 186,000 – 742,000 complex variables

One processor

k = 300
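The analytic solution can be sanity-checked numerically, taking u = cos(πx)·exp(-iβy) with β² = k² - π². The sketch below uses simple central differences with an assumed step size and an arbitrarily chosen interior point; it is a consistency check, not part of the original experiments.

```python
import numpy as np

k = 300.0
beta = np.sqrt(k**2 - np.pi**2)

def u(x, y):
    """Assumed analytic solution of Problem 1."""
    return np.cos(np.pi * x) * np.exp(-1j * beta * y)

def helmholtz_residual(x, y, h=1e-4):
    """(Laplacian + k^2) u, with a 5-point central-difference Laplacian."""
    lap = (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4.0 * u(x, y)) / h**2
    return lap + k**2 * u(x, y)

def sommerfeld_residual(x, h=1e-4):
    """u_y + i*beta*u on the side y = 1, with a central difference for u_y."""
    u_y = (u(x, 1.0 + h) - u(x, 1.0 - h)) / (2.0 * h)
    return u_y + 1j * beta * u(x, 1.0)

x0, y0 = 0.2, 0.6   # arbitrary interior point
rel_pde = abs(helmholtz_residual(x0, y0)) / (k**2 * abs(u(x0, y0)))
rel_bc = abs(sommerfeld_residual(x0)) / (beta * abs(u(x0, 1.0)))
print(rel_pde, rel_bc)                    # small, limited by the difference step
print(abs(u(-0.5, y0)), abs(u(0.5, y0)))  # Dirichlet sides: essentially zero
```

Both relative residuals are small compared with the individual terms, which is consistent with the stated equation and boundary conditions.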

Page 20: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Prob. 1: rel-res for 2nd, 4th, 6th order schemes

Page 21: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Prob. 1: rel-err for 2nd, 4th, 6th order schemes

Page 22: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Prob. 1: rel-err for 2nd, 4th, 6th order schemes

Page 23: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Problem 2 (with analytic soln)

Eqn: Δu + k²u = 0

Domain: [0,1]×[0,1]

Analytic sol'n: u = sin(πx)cos(βy), β² = k² - π²

Dirichlet bndry cond determined by u on the boundaries

Grid points per wavelength: Ng = 9 to 18

Approx. 186,000 – 742,000 real variables

One processor

k = 300

2nd, 4th, 6th order schemes

Page 24: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Prob. 2: rel-res for 2nd, 4th, 6th order schemes

Page 25: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Prob. 2: rel-err for 2nd, 4th, 6th order schemes

Page 26: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Prob. 2: rel-err, 6th order, Ng=9–18

Page 27: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Problem 3 (no analytic soln)

Eqn: Δu + k²u = 0

Domain: [0,1]×[0,1]

Bndry cond on y = 0: discontinuity at midpt.: u(0.5,0) = 1, u(x,0) = 0 for x ≠ 0.5

Other sides: 1st order absorbing

Approx. 515,000 complex variables

Grid points per wavelength: Ng = 15

One processor

k = 300

2nd, 4th, 6th order schemes

Page 28: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Problem 3: evaluating the error

No analytic solution

Run 6th order scheme to

Saved result as “true” solution

Compared results of 2nd, 4th and 6th order schemes with the “true” solution

Page 29: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Prob. 3: rel-err for 2nd, 4th, 6th order schemes

Page 30: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Parallel Performance, 1 to 16 Proc.

# proc      1     2     4     8    12    16
Prob 1   2881  3516  4634  6125  4478  4983
Prob 2   3847  3981  4328  4774  5561  5691
Prob 3   7344  7378  7441  7572  7710  7842

No. of iterations for rel-res = 10^-7, 6th order, Ng = 15, ~515,000 var.

Page 31: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Parallel Performance, 1 to 16 Proc.

# proc             1    2    4    8   12   16
rel-res = 10^-4  288  163   87   50   41   37
rel-res = 10^-7  810  459  243  139  113  103

Problem 3: time (s), 6th order scheme, Ng = 15, ~515,000 var.

Times taken on a 12-node cluster, 2 quad proc. per node
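Speedup and parallel efficiency follow directly from the Problem 3 timings in the table above. The short sketch below only does that arithmetic; the function name and the dictionary layout are illustrative choices.

```python
procs = [1, 2, 4, 8, 12, 16]
times = {
    "rel-res 1e-4": [288, 163, 87, 50, 41, 37],     # seconds, from the table
    "rel-res 1e-7": [810, 459, 243, 139, 113, 103],
}

def speedup_and_efficiency(procs, t):
    """Return (p, speedup, efficiency) relative to the one-processor time."""
    return [(p, t[0] / tp, t[0] / (tp * p)) for p, tp in zip(procs, t)]

for label, t in times.items():
    print(label)
    for p, s, e in speedup_and_efficiency(procs, t):
        print(f"  {p:2d} proc: speedup {s:5.2f}, efficiency {e:4.2f}")
```

Since the iteration counts for Problem 3 change little with the number of processors, these figures mostly reflect the per-iteration cost of the parallel run.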

Page 32: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Prob. 2 & 3: rel-res for 1 to 16 processors

Page 33: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes


Hi-freq Helmholtz requires hi-order schemes

CARP-CG is applicable to hi-freq Helmholtz with hi-order schemes

Parallel and simple

General-purpose – for problems with large off-diagonal elements and discontinuous coefficients

Page 34: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Other Potential Applications

Hi-order schemes for Helmholtz in homog & heterog 3D domains

Maxwell equations

Other physics equations

Saddle-point problems

Circuit problems

Linear solver in some eigenvalue


Page 35: Parallel solution of high-frequency Helmholtz equation using high-order finite difference schemes

Publications and Software

CARP: SIAM J Sci Comp 2005

CGMN: ACM Trans Math Software 2008

Microscopy: J Parallel & Distr Comp 2008

Large convection + discont coef: CMES 2009

CARP-CG: Parallel Comp 2010

Normalization for discont coef: J Comp & Appl Math 2010

CARP-CG software:

