
Geophys. J. Int. (0000) 000, 000–000

Dispersion, cost optimization, and symplectic time integration in spectral-element based wave propagation

Jean-Paul Ampuero⋆ and Tarje Nissen-Meyer†

SUMMARY

We present dispersion properties of spectral element methods (SEM) for the wave equation, with emphasis on practical issues for the numerical modelling of seismic wave propagation. Additionally, we suggest a quantitative method to select the polynomial order $p$, the element size $h$ and the timestep $\Delta t$ in order to optimize the computational cost for a given accuracy goal. We show that in the conventional implementations of SEM with second-order time stepping algorithms (e.g. Newmark scheme or centered difference) the dispersion error is dominated by the time discretization error, and the spectral convergence of the space discretization remains unexploited. We reduce that overall dispersion error by applying a family of high-order symplectic time schemes with optimal memory and CPU cost. We illustrate the advantages in applications ranging from local to global scale seismic wave propagation, sketching a strategy for maximizing cost efficiency even for cases where no reference solution exists.

1 INTRODUCTION

An increasing number of problems in seismology require the computation of synthetic seismograms with particularly high accuracy for waves travelling over many wavelengths and very long distances. For instance, the R3 and R4 phases (Rayleigh waves propagating twice around the globe) can improve data coverage and provide valuable information about the Earth structure.

Numerical methods for seismic wave propagation have come a long way in the last three decades, now accommodating even the most complex physical frameworks such as poroelastic, solid-fluid wave propagation through complex domains (the entire globe, strong topography, strong internal discontinuities) with various constitutive relationships (e.g. attenuation, anisotropy) while considering gravitation and rotation effects as well as dynamic earthquake rupture. In many instances, sufficiently accurate 3D models of the Earth at either local or global scale have allowed these methods to match recorded data with astonishing precision. Most of these methods scale extremely well on large supercomputers and count amongst the most computationally intensive applications in science today. All these developments are promising, but challenges remain: meshing complex structures is a formidable, unresolved topic, as are absorbing boundaries. Most of all, more and more users of these methods are detached from the numerical intricacies of each method, and the task of choosing physically correct and computationally optimal input settings has not been presented in a quantitative and generally accessible manner. We attempt to fill this gap for the spectral-element method, one of the most widely used methods in computational seismology, with a practical guideline for choosing appropriate simulation parameters.

The spectral element method (SEM) is favored amongst many owing to its combination of geometrical flexibility, high accuracy, explicit time marching and efficient parallelization. However, under typically applied simulation conditions, SEM synthetics of R3 and R4 phases show appreciable delay errors (figure).

Here we are concerned with the two competing goals that determine the performance of a simulation: maximizing accuracy while minimizing the computational cost. Empirical rules of thumb are usually applied to select the polynomial order and number of nodes per wavelength, as given for instance in the work of Seriani & Priolo (1994) for an acoustic Chebyshev implicit SEM. From recent reviews (Komatitsch et al. 2005; Chaljub et al. 2006) it appears that some available results on the theoretical analysis of dispersion properties of the SEM have had little exposure in the computational seismology community and have not been translated into practical guidelines. We attempt to fill that gap by expanding upon results by Thompson & Pinsky (1994) and Fauqueux (2003). We also compare to results by Ihlenburg & Babuska (1997) and Ainsworth (2004) for the p-version of the finite element method (p-FEM), which differs from the SEM by the usage of exact quadrature rules.

...

We show that current implementations of SEM are often dominated by dispersion error due to the time discretization.

In the computational seismology literature the dominance of the time integration error and the potential advantages of high-order time stepping have been noted several times [Komatitsch and Vilotte reply 1999; Komatitsch and Tromp, 1999; Bielak reply 2005]. Here we quantify these statements.

We investigate some higher order time discretization algorithms, in particular symplectic methods.

.....


2 COMPUTATIONAL CHALLENGES IN SEISMOLOGY

With the rapid advance of high-performance hardware, computational seismology is on the verge of resolving almost all crucial frequencies represented in high-quality data at scales spanning 4 orders of magnitude, ranging from hydrofractures to global waves. Moreover, full-wave techniques are becoming a standard tool to solve the forward problem for the purpose of constructing the mapping between data and model in tomographic inversions for 3D structure (e.g. Tape et al. 2009; Fichtner et al. 2009; Nissen-Meyer et al. 2007a). In this case, possibly hundreds to thousands of simulations need to be undertaken for the exact same background model, only changing source and receiver characteristics.

More fundamental than the mere optimization of computational cost in inverse problems is the fact that errors in predicting the forward solution propagate through the inverse framework as a first-order effect, and more prominently than errors in partial derivatives/sensitivity kernels ?. Considering that dispersion errors grow with traveled distance, this introduces a potentially non-negligible bias into the data space which may significantly alter the resultant tomographic model away from the maximum-likelihood model in a non-trivial manner ?. Given that tomographic datasets often span some orders of magnitude in distances, this issue needs to be considered carefully by choosing a target error commensurate with the maximum travel path.

It is thus essential to optimize numerical settings to guarantee sufficiently precise wavefields at minimal cost for the specific setting. In this section, we present some realistic cases of seismic-wave propagation at various scales which necessitate simulations over many wavelengths at considerably high accuracy, assuming that appropriate hardware devices are available (for instance, wave propagation over 100 wavelengths in all directions in a 3D spectral-element code equals about $4 \times 10^8$ degrees of freedom requiring 150 GB RAM).

Passive-source monitoring of hydrofracturing from within boreholes is a seismic technique to detect high-frequency signals emanating from fluid-filled cracks in the context of hydrocarbon extraction (Rentsch et al. 2007). Frequencies may reach 150 Hz and are recorded in boreholes at 1 to 2 km depth, up to 1 km away from the cracks, with S-wave velocities of 1.5 km/s. In real-time monitoring cases, synthetic modeling may provide a valuable asset, e.g. to locate these sources in an inversion, requiring propagation over 100 wavelengths in 3D.

In active-source exploration seismology, the approximate scale of interest is confined to about 20 km laterally and 10 km vertically to illuminate geologic structures containing hydrocarbon deposits. These highly complex crustal regions may contain an ocean layer, sediments and bedrock, as well as high-velocity salt diapirs with velocity variations reaching up to an order of magnitude. Additionally, the geometry may be extremely complex with quasi-vertical faults intersecting sedimentary deposit layers, intrusions, or lenses. One of the major challenges for accurate seismic simulations is the correct model representation, i.e. meshing. In any element-based technique, one attempts to honor all major velocity discontinuities, resulting in potentially highly deformed elements of variable size. In such settings, thin layers limit the ability to model at high spatial polynomial order, such that the commonly employed $p = 4$ approximation may not be exceeded even if optimal simulation parameters would suggest otherwise. Given a propagation distance of 20 km, shallow shear-wave velocities as low as 2 km/s and active-source frequencies up to 30 Hz, we obtain wavelengths of 70 m, i.e. propagation over up to 300 wavelengths.

In seismic hazard analysis, one is typically concerned with peak ground velocities due to local earthquakes and their impact on infrastructure. Scales may for instance cover regions such as the Los Angeles area, the Taipei basin, or the Anatolian fault, i.e. on the order of 100 km, and buildings resonate at a few Hz. In these settings, shallow sedimentary layers have low velocities and may act as wave guides to the most destructive waves, hence wavelengths of less than 1 km should be modelled accurately, resulting in more than 100 propagated wavelengths.

Global-scale body waves carry detectable seismic periods down to 1 second in specific cases, but more routinely down to 5 seconds. A compressional wave that travels through the mantle, outer as well as inner core for up to 13,000 km is denoted PKIKP, and assumes wavelengths ranging from 30 km to 70 km, such that simulations need to be accurate out to distances exceeding 200 wavelengths.

For low-frequency surface-wave tomography, typical periods range from 35 to 175 seconds at epicentral distances between 30° and 150°. For major-arc paths, propagating up to 330°, a typical period range is 70 to 175 seconds. With an average phase velocity of 4 km/s these numbers give $5 \le L/\lambda \le 130$.

All these “grand-challenge” settings require accurate simulations over more than 100 wavelengths, and this shall be the basis for discussing the potential necessity to use higher-order time schemes. In the next section, we show how our results can be applied to global-scale wave propagation in a homogeneous model with variable grid size as well as in realistic background models such as PREM, offering guidance on how to sensibly choose the most cost-effective time scheme for a given resolution and target error, even in a heterogeneous realm. Clearly, the challenge is to define generic selection criteria for cases in which no reference solutions are known and hence no error values can be deduced from simulations. We accommodate this by a series of test simulations to decide upon a time discretization scheme and other parameters by means of relative comparisons.

In such heterogeneous settings, our compact treatment for homogeneous model and grid spacing does not directly apply, but it is reasonable to assume that dispersion errors of different time integration schemes behave relatively similarly. The time step, however, may be controlled by geometrical as well as velocity constraints: meshing results in fiercely suboptimal spacing-to-velocity ratios in large portions of the domain, the stable regime is controlled by the smallest ratios, and the wavefield is hence heavily oversampled elsewhere. Consequently, even a lower-order time scheme may be sufficient for some high-end settings owing to the decreased time step, but this can in no way be generalized. In the ever-increasing high-resolution limit, this effect again becomes less important as the mesh becomes more representative of the continuous structure, such that thin layers are no longer the driving meshing constraint. Hence in any high-resolution limit for large-scale wave propagation, higher-order schemes shall gain importance in terms of optimizing performance.

2.1 Choice of target error

The choice of a reasonable target error of the numerical simulation is problem-dependent and, most importantly, related to how well the synthetic compares to data. If the background model is known to cause significant discrepancies, then a lax error is sufficient, but a rule of thumb may be adopted as follows. Waveform analyses often deal with pulse shifts down to about a tenth of the dominant period, and this often represents the typical regime of observational uncertainties. The numerical dispersion error should therefore be an order of magnitude less than this fraction at the furthest propagated distance. At 100 wavelengths, this would for instance be $\epsilon \le 10^{-4}$. Observational uncertainties for high-quality phase velocity data range from 3% to 20% of the wave period, the largest errors being for the shortest periods (from Table 2 of Ekstrom et al. (1997) this is $\epsilon = \sigma_A/2\pi$).¶ Synthetic waveforms are involved in the matching filter technique for measuring phase velocities (Ekstrom et al. 1997), which is based on cross-correlation like the delay misfit metric used in our figures. It is hence desirable to generate synthetics at least one order of magnitude more precise than the estimates of observational uncertainties quoted above. This means a typical target accuracy $\epsilon = 0.3\%$ at distances up to $L = 130\lambda$.

3 DISPERSION PROPERTIES OF THE SEM

3.1 Definitions and generalities

Numerical solutions to wave propagation introduce two main intrinsic errors: dispersion and dissipation. For a monochromatic plane wave these are defined as the phase error and the amplitude error, respectively. In this study we focus on dispersion errors for two reasons. First, when combined with non-dissipative timestepping, the SEM introduces virtually no numerical attenuation (exactly zero attenuation on element vertices). Second, at very long propagation distances the numerical errors in synthetic waveforms are dominated by the mismatch introduced by time delay (dispersion) errors, a phenomenon known in the FEM literature as the pollution effect [e.g. Thompson review 2006].

For the sake of clarifying the relevant formulae, we will consider a non-dispersive medium with wave speed $c$ governed by the scalar wave equation, and a monochromatic plane wave with frequency $\omega$ and wavenumber $k = \omega/c$. In practical wave propagation problems the wavefield is obviously not monochromatic. Nevertheless, the results described here are applicable by identifying $\omega$ with the dominant frequency $\omega_0$ defined in Appendix E.

¶ See also Gudmundsson et al GJI 1990, cited by Deal et al JGR 1999a. See Davies PEPI 1992 for the effect of earthquake mislocations.

The phase velocity $\omega/k$ of the numerical solution obtained after space and time discretization is generally different from $c$. We quantify the dispersion error by the relative phase velocity misfit between the numerical and the theoretical wave solutions:

$$\epsilon \doteq -\frac{\omega/k - c}{c}. \qquad (1)$$

In the remainder of the paper, subscripts $x$ and $t$ shall indicate parameters dependent on space and time discretizations, respectively. See Table 1 for definitions. We will see that $\epsilon$ depends on the order of the space and time discretization schemes (SEM polynomial order $p$ and time integration order $q$) and on the non-dimensional numbers $\kappa = k\Delta x$ and $\Omega = \omega\Delta t$, where $\Delta x$ is the element size and $\Delta t$ is the time step. However, it does not depend on the total travel time $T$.

We are particularly interested in the asymptotic properties of the dispersion error at small values of $\kappa_x$ and $\Omega_x$ [TNM: WHY, AND WHAT ARE “SMALL” VALUES]. In this asymptotic regime, $\epsilon$ can be decomposed as the sum of an error induced by the space discretization, $\epsilon_x$, and an error induced by the time discretization, $\epsilon_t$:‖

$$\epsilon = \epsilon_x + \epsilon_t \qquad (2)$$

In the following sections we will study both components separately.

In practice, we can measure the time-delay misfit $\Delta T$ between a synthetic seismogram and the exact analytical solution by a standard, sub-sample precision cross-correlation time delay estimate technique (Appendix E). The dispersion error $\epsilon$ is related to the time delay by

$$\epsilon = \frac{\Delta T}{T} \qquad (3)$$

Because the phase velocity error $\epsilon$ is independent of $T$, (3) implies that the time delay grows proportionally to the travel time. We prefer local error metrics, evaluated at a given station, over a global error metric, defined as a weighted integral over the whole Earth surface, because the latter obscures the linear increase of the dispersion error with propagation distance.

The number of nodes per wavelength, $C_x$, is defined as in Seriani & Priolo (1994) by

$$C_x \doteq \frac{p\lambda}{\Delta x} = \frac{2\pi p}{k\Delta x}. \qquad (4)$$

‖ This decomposition can be justified as follows. An approximate phase velocity $c'$ is obtained by solving the wave equation after space and time discretization. The dispersion error is defined as

$$\epsilon = -\frac{c' - c}{c}$$

We denote by $c_h$ the semi-discrete phase velocity obtained by numerically solving the wave equation in the frequency domain after space discretization. Considering $c' - c = (c' - c_h) + (c_h - c)$ and assuming $c_h \approx c$:

$$\epsilon = -\frac{c' - c_h}{c_h} - \frac{c_h - c}{c}$$


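Table 1. Description of symbols used throughout this paper.

| Symbol | Units | Definition (eq. ref.) | Description |
|---|---|---|---|
| Background medium and source | | | |
| $c$ | [m/s] | | homogeneous medium velocity |
| $L$ | [m] | | source-receiver distance |
| $T$ | [s] | $T = L/c$ | total traveltime |
| $\omega$ | [Hz] | | angular dominant source frequency |
| $k$ | [m$^{-1}$] | $k = \omega/c$ | angular wavenumber |
| $\lambda$ | [m] | $\lambda = 2\pi/k$ | dominant source wavelength |
| $u(t)$ | [m] | | displacement time series at distance $L$ |
| Spatial (spectral-element) discretization | | | |
| $p$ | - | | polynomial order |
| $\Delta x$ | [m] | | element size |
| $C_x$ | - | $C_x = p\lambda/\Delta x$ | grid points per dominant wavelength |
| $A_x(p)$ | - | | factor dependent on $p$ |
| $\kappa_x$ | - | $\kappa_x = k\Delta x$ | dimensionless wavenumber |
| $\Omega_x$ | - | $\Omega_x = \omega\Delta x/c$ | dimensionless frequency |
| Temporal discretization | | | |
| $\Delta T$ | [s] | $\Delta T = T_{num} - T_{ref}$ | time delay |
| $q$ | - | | order of temporal scheme |
| $C_t(q)$ | - | | parameter for temporal stability |
| $\Delta t$ | [s] | $\max(\omega)\,\Delta t < C_t$ | numerical time step |
| $A_t(q)$ | - | | scheme-dependent error factor |
| $\omega_0$ | [s$^{-1}$] | $\omega_0 = \left( \|u^{1+q/2}\| / \|u\| \right)^{2/q}$ | cross-correlation based dominant frequency |
| Relative numerical dispersion error | | | |
| $\epsilon_x$ | - | $\epsilon_x = A_x (k\Delta x)^{2p}$ | error due to spatial discretization |
| $\epsilon_t$ | - | $\epsilon_t = A_t (\omega\Delta t)^q$ | error due to temporal discretization |
| $\epsilon$ | - | $\epsilon = \epsilon_x + \epsilon_t = \Delta T/T$ | overall error |
| Computational cost | | | |
| $E_x(p)$ | - | | multiplications per element |
| $\Gamma$ | - | $\Gamma = E_x(p) / \left[ (k\Delta x)^D\, \omega\Delta t \right]$ | cost of a spectral-element scheme ($D$: dimension) |
| $n$ | - | | force evaluations per time step |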

3.2 Dispersion error due to space discretization

We focus here on the dispersion properties of a semi-discrete 1D wave equation in which space has been discretized by the SEM but no time discretization is assumed. This is equivalent to solving the wave equation in the frequency domain. It is shown in Appendix A that the 1D case provides an upper bound for the dispersion error in higher dimensions. The analysis is done for homogeneous media and constant element size $\Delta x$. We will touch upon the limits of applicability imposed by these assumptions later in the paper.

In a non-dispersive medium $\omega/k = c$, but waves in the semi-discrete problem follow a non-linear numerical dispersion relation of the form (Ainsworth 2004):

$$\cos(k\Delta x) = R_p(\omega\Delta x/c) \qquad (5)$$

where $R_p$ is a rational function parameterized by the polynomial order $p$. A compact expression for $R_p$ is given by Ainsworth (2004) for the p-FEM at arbitrary $p$. However, no similar expression is yet available for the SEM. Closed forms of $R_p$ for low orders are derived in Cohen (2002) and Fauqueux (2003), and summarized in Table 2. Dispersion curves can be obtained numerically by the procedure introduced by Thompson & Pinsky (1994), and these are shown in Figure 1 for a usual range $3 \le p \le 8$.

[Figure 1: six panels, NGLL = 4 to 9. Horizontal axis: Re(k h) and Im(k h); vertical axis: ω h/c; top axis: nodes per wavelength. Legend: Re(k h), Im(k h), exact.]

Figure 1. Dispersion curves for SEM, including complex wavenumbers. Stability of explicit time schemes is related to the highest propagating frequency (red circle, the end of the last optical branch). The red cross indicates half that frequency.

[Figure 2: two panels. Vertical axis: ∆T/T, dispersion error; horizontal axis: k h, with a top axis in elements per wavelength. Curves are labelled 4 to 9.]

Figure 2. Dispersion error for SEM ($3 \le p \le 8$) as a function of wavenumber or number of elements per wavelength (right) and as a function of number of nodes per wavelength (left). The dashed lines in the left plot are the asymptotic behavior given by equations 6 and 7.

Table 2. Discrete dispersion relation

| $p$ | $R_p(\Omega_x)$ | $\epsilon_x$ |
|---|---|---|
| 1 | $1 - \Omega_x^2/2$ | $\kappa^2/24$ |
| 2 | $\dfrac{48 - 22\,\Omega_x^2 + \Omega_x^4}{48 + 2\,\Omega_x^2}$ | $\kappa^4/2880$ |
| 3 | $\dfrac{3600 - 1680\,\Omega_x^2 + 92\,\Omega_x^4 - \Omega_x^6}{3600 + 120\,\Omega_x^2 + 2\,\Omega_x^4}$ | $\kappa^6/604800$ |

The asymptotic behavior of the dispersion error $\epsilon_x$ was obtained analytically by Mulder (1999), for $k\Delta x \ll p$:

$$\epsilon_x \approx A_x(p)\,(k\Delta x)^{2p} \qquad (6)$$

with

$$A_x(p) = \frac{1}{2p(2p+1)} \left( \frac{p!}{(2p)!} \right)^2. \qquad (7)$$

This is readily verified by the analytical dispersion relations at low $p$ in Table 2, as the leading order term of the series expansion of $(\cos(x) - R_p(x))/x^2$ at low $x$. Relations (6) and (7) are confirmed for the numerical dispersion curves at higher $p$ in Figure 2. This result is of practical importance to quantify the quality of SEM and to compare it to other methods. We see for instance, by comparing (7) to the analogous results by Ainsworth (2004) for the p-FEM, that the dispersion error in SEM is always $p$ times smaller than in p-FEM, and of opposite sign, in agreement with the observations at low $p$ by Thompson & Pinsky (1994).
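The check against Table 2 can be sketched in a few lines of Python. This is only an illustration and not part of the original analysis: the function names (`A_x`, `cos_series`, `leading_ratio`) and the sample point x = 1/50 are choices made here, and exact rational arithmetic is used solely to avoid round-off when subtracting nearly equal numbers.

```python
from fractions import Fraction
from math import factorial


def A_x(p):
    """Leading coefficient of eq. (7), as an exact fraction."""
    return Fraction(1, 2 * p * (2 * p + 1)) * Fraction(factorial(p), factorial(2 * p)) ** 2


def cos_series(x, terms=30):
    """Truncated Taylor series of cos(x) in exact rational arithmetic."""
    return sum(Fraction((-1) ** n, factorial(2 * n)) * x ** (2 * n) for n in range(terms))


# Rational functions R_p of Table 2, written in terms of y = x**2.
R = {
    1: lambda y: 1 - y / 2,
    2: lambda y: (48 - 22 * y + y**2) / (48 + 2 * y),
    3: lambda y: (3600 - 1680 * y + 92 * y**2 - y**3) / (3600 + 120 * y + 2 * y**2),
}


def leading_ratio(p, x=Fraction(1, 50)):
    """(cos(x) - R_p(x)) / x**2 divided by the asymptotic estimate A_x(p) * x**(2p)."""
    eps = (cos_series(x) - R[p](x * x)) / x**2
    return float(eps / (A_x(p) * x ** (2 * p)))


for p in (1, 2, 3):
    print(p, A_x(p), leading_ratio(p))
```

For each $p$ the printed ratio should be close to 1, with a deviation of relative order $x^2$ coming from the next term of the expansion.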

The spectral convergence of the SEM upon p-refinement is demonstrated by rewriting the dispersion error (6) as

$$\epsilon_x \approx \frac{(\pi e / 2C_x)^{2p}}{4p(2p+1)}, \qquad (8)$$

where we have employed Stirling’s formula, $n! \approx \sqrt{2\pi}\, n^{n+1/2} e^{-n}$, to approximate $A_x(p)$. The error decays faster than exponentially as a function of $p$, provided that

$$C_x > e\pi/2 \approx 4.27. \qquad (9)$$

Similar conditions on the minimum number of nodes per wavelength $G$ are usually quoted as guaranteeing high quality SEM simulations. However, (9) is only a condition to enter the super-exponential convergence regime. For practical purposes a more meaningful and useful quantity is the minimum $C_x$ required to achieve a prescribed accuracy $\epsilon_x$. This is plotted in Figure ?? and reported in Table 3, based on Equations (6) and (7). Clearly, $C_x$ strongly depends on the accuracy goal and on the polynomial order.

Note that by definition (3) $\epsilon_x$ has been normalized by the total travel time. If one prescribes instead the absolute error $\Delta T$ it appears that a larger $C_x$ is needed for larger propagation distances. It is useful to express the dispersion error in terms of the dominant period $T_0 = 2\pi/\omega$ and the number of wavelengths travelled $T/T_0$:

$$\epsilon = \frac{\Delta T}{T_0} \Big/ \frac{T}{T_0}. \qquad (10)$$

With $\epsilon_x = 10^{-4}$ one could achieve for instance propagation over distances of 100 wavelengths with numerical time delays of 1% of a wave period.

Table 3. Number of nodes per wavelength $C_x$ required in SEM to achieve an accuracy $\epsilon_x$ (dispersion error assuming perfect time integration) for different polynomial orders $p$.

| $\epsilon_x$ \ $p$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $10^{-3}$ | 40.6 | 9.6 | 6.5 | 5.5 | 5 | 4.7 | 4.5 | 4.4 | 4.4 | 4.3 |
| $10^{-4}$ | 128 | 17.2 | 9.5 | 7.3 | 6.3 | 5.7 | 5.4 | 5.1 | 5 | 4.8 |
| $10^{-5}$ | 406 | 30.5 | 14 | 9.7 | 7.9 | 6.9 | 6.3 | 5.9 | 5.6 | 5.4 |

For a global accuracy $\epsilon$ the numbers in Figure ?? and Table 3 are only lower bounds as they do not account for the additional error due to time discretization, studied in the next paragraph.
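The entries of Table 3 follow from inverting (6) for $k\Delta x$ and inserting the result into (4). The sketch below illustrates this, together with the conversion (10) from a tolerated delay to a target $\epsilon$; the function names are illustrative choices and not taken from any of the codes discussed in this paper.

```python
from math import factorial, pi


def A_x(p):
    """Coefficient of eq. (7)."""
    return (factorial(p) / factorial(2 * p)) ** 2 / (2 * p * (2 * p + 1))


def nodes_per_wavelength(eps_x, p):
    """Invert eq. (6) for k*dx, then apply eq. (4): C_x = 2*pi*p / (k*dx)."""
    kdx = (eps_x / A_x(p)) ** (1.0 / (2 * p))
    return 2 * pi * p / kdx


def target_error(delay_fraction, n_wavelengths):
    """Eq. (10): tolerated delay (fraction of a period) over travelled wavelengths."""
    return delay_fraction / n_wavelengths


# A delay of 1% of a period after 100 wavelengths corresponds to eps = 1e-4.
eps = target_error(0.01, 100)
for p in range(1, 11):
    print(p, round(nodes_per_wavelength(eps, p), 1))
```

These values account for the space discretization only, so they remain lower bounds in the same sense as Table 3.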

3.3 Dispersion error due to time discretization

The analysis above gives the dispersion error that one would get if time discretization were perfect. Unfortunately the time schemes introduce additional dispersion errors. In general dissipation errors may also be present but we are mainly interested here in conservative schemes. For a general scheme of order $q$ the error behaves as

$$\epsilon_t \approx A_t\,(\omega\Delta t)^q. \qquad (11)$$

The usual centered difference scheme is second order accurate, $q = 2$, and its dimensionless factor is $A_t = 1/24$. Within the second order class it is the most accurate scheme.
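The factor 1/24 can be checked numerically on a single harmonic oscillator, which is a simplified stand-in for one mode of the semi-discrete problem (an assumption of this sketch, not a statement about any particular SEM code). Applied to $\ddot{u} = -\omega^2 u$, the centered difference scheme advances the phase by $\theta$ per step with $\sin(\theta/2) = \Omega/2$, where $\Omega = \omega\Delta t$. The sketch reports the magnitude of the relative phase error, not its sign under definition (1).

```python
from math import asin


def centered_difference_phase_error(Omega):
    """Relative phase error per step, theta/Omega - 1, of the centered difference scheme.

    Omega = omega * dt must not exceed 2 (the stability limit of the scheme).
    """
    theta = 2.0 * asin(0.5 * Omega)
    return theta / Omega - 1.0


for Omega in (0.4, 0.2, 0.1, 0.05):
    err = centered_difference_phase_error(Omega)
    # The last column should approach 1/24 = 0.041666... as Omega decreases.
    print(Omega, err, err / Omega**2)
```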

3.4 Stability

Explicit time integration schemes are conditionally stable and the timestep $\Delta t$ must satisfy

$$\Delta t < \Delta t_{crit} = \frac{C_t}{\sqrt{D}\,\Omega_m} \frac{\Delta x}{c}, \qquad (12)$$

where $\Delta t_{crit}$ is the critical timestep, $D$ the dimension of the problem, $C_t$ a stability number that depends only on the time scheme and $\Omega_m$ a spectral radius that depends only on the space discretization scheme. The spectral radius is the highest eigen-frequency of the dimensionless 1D wave equation ($c = 1$) discretized by spectral elements with $\Delta x = 1$. In terms of the dispersion relation, $\Omega_m$ is the largest solution of

$$R_p(\Omega_m) = \cos(p\pi) = (-1)^p. \qquad (13)$$

In general $\Omega_m$ is inversely proportional to the smallest node spacing. The Gauss-Lobatto-Legendre nodes of SEM are clustered at the edge of the elements with spacing $\propto 1/p^2$, so we expect $\Omega_m \propto p^2$. Values for $p \le 9$ are reported in Table 4 and are well approximated (better than 1% for $p \ge 4$) by

$$\Omega_m(p) \approx 0.64\,p^2 + 0.57\,p + 1. \qquad (14)$$

Compared to the p-FEM (Ainsworth 2004) the SEM has the advantage of a smaller spectral radius, hence larger timesteps can be used.
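As a rough illustration of (12) and (14), the sketch below compares the fit with the tabulated spectral radii of Table 4 and evaluates the critical timestep. The default $C_t = 2$ is an assumption made here (it is the classical stability number of the centered difference scheme); other time schemes have different values, and the function names are hypothetical.

```python
from math import sqrt

# Spectral radii Omega_m as listed in Table 4 (p = 1..9).
OMEGA_M_TABLE = {
    1: 2.0, 2: 2.0 * sqrt(6.0), 3: sqrt(42.0 + 6.0 * sqrt(29.0)),
    4: 13.55, 5: 19.80, 6: 27.38, 7: 36.27, 8: 46.42, 9: 57.87,
}


def omega_m_fit(p):
    """Quadratic fit of eq. (14)."""
    return 0.64 * p**2 + 0.57 * p + 1.0


def omega_m(p):
    """Tabulated value when available, otherwise the fit of eq. (14)."""
    return OMEGA_M_TABLE.get(p, omega_m_fit(p))


def critical_timestep(dx, c, p, D, C_t=2.0):
    """Eq. (12). C_t = 2 is an assumed value for the centered difference scheme."""
    return C_t / (sqrt(D) * omega_m(p)) * dx / c


for p in range(1, 10):
    print(p, round(OMEGA_M_TABLE[p], 2), round(omega_m_fit(p), 2))

# Example: 1 km elements, 4 km/s wave speed, p = 4, 3D.
print(critical_timestep(dx=1000.0, c=4000.0, p=4, D=3))
```

In a heterogeneous mesh the smallest ratio $\Delta x/c$ over all elements would control the timestep, so the homogeneous estimate above is only a starting point.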

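Table 4. Spectral radius $\Omega_m$ in SEM for different polynomial orders $p$.

| $p$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| $\Omega_m$ | 2 | $2\sqrt{6}$ | $\sqrt{42 + 6\sqrt{29}}$ | 13.55 | 19.80 | 27.38 | 36.27 | 46.42 | 57.87 |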

3.5 Overall dispersion error

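Within the asymptotic regime, the total dispersion error combining space and time discretizations is

$$\epsilon = A_x(p)\,(k\Delta x)^{2p} + A_t\,(\omega\Delta t)^q. \qquad (15)$$

In common practice the time step is set to a moderate fraction of the critical value (12), i.e. $\Delta t = \gamma\,\Delta t_{crit}$ with $\gamma \lesssim 1$. We can write $\epsilon$ in terms of $\kappa_x = k\Delta x$ as

$$\epsilon \approx A_x(p)\,\kappa_x^{2p} + A_t \left( \frac{\gamma\, C_t}{\sqrt{D}\,\Omega_m(p)} \right)^q \kappa_x^q \qquad (16)$$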

Both components of the dispersion error are compared in Figure 4 for $p = 8$. The error is dominated by the time-discretization error as soon as more than 4 nodes per wavelength are employed. At $C_x = 10$, $\epsilon_t$ is about six orders of magnitude larger than $\epsilon_x$.

In Figure 5 we illustrate, for $D = 3$, $\gamma = 0.5$ and a range of values of $p$ and $C_x$, that in virtually no case of practical relevance ($\epsilon \ll 1\%$) is the time discretization error $\epsilon_t$ negligible with respect to the space discretization error $\epsilon_x$. In simulations requiring reasonable accuracy, $\epsilon < 10^{-3}$, with the usual setting $\gamma \lesssim 1$, the dispersion error is largely dominated by the low order time discretization errors and the advantages of spectral convergence of the SEM are wasted. In such situations $\epsilon_t$ must be controlled by reducing the timestep $\Delta t$. For instance, for $p = 6$, $C_x = 6$, to achieve an error of 1% of a period over a propagation distance of 100 wavelengths ($\epsilon = 10^{-4}$) one needs to reduce $\epsilon_t$ by a factor of 10. Because $\epsilon_t \propto \Delta t^2$ for a second order scheme, this requires a $\sqrt{10} \approx 3$-fold reduction of $\Delta t$ and implies a $\approx 3$-fold increase in computational cost. We will see in later sections that a more efficient error reduction can be achieved with higher order time schemes ($q > 2$).
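The relative size of the two terms in (16) can be explored with a short script. The sketch below is hedged in several ways: it uses the fit (14) for $\Omega_m$, assumes $C_t = 2$ and $A_t = 1/24$ (centered difference), and the choice $D = 1$, $\gamma = 1$ in the example is an assumption made here, since the settings behind Figure 4 are not stated in the text.

```python
from math import factorial, pi, sqrt


def A_x(p):
    """Coefficient of eq. (7)."""
    return (factorial(p) / factorial(2 * p)) ** 2 / (2 * p * (2 * p + 1))


def omega_m(p):
    """Spectral radius from the fit of eq. (14)."""
    return 0.64 * p**2 + 0.57 * p + 1.0


def dispersion_errors(p, C_x, D=3, gamma=0.5, q=2, A_t=1.0 / 24.0, C_t=2.0):
    """Space and time terms of eq. (16) for C_x nodes per wavelength.

    C_t = 2 and A_t = 1/24 are assumed values for the centered difference scheme.
    """
    kappa = 2.0 * pi * p / C_x                             # k*dx from eq. (4)
    eps_x = A_x(p) * kappa ** (2 * p)
    Omega = gamma * C_t / (sqrt(D) * omega_m(p)) * kappa   # omega*dt at dt = gamma*dt_crit
    eps_t = A_t * Omega**q
    return eps_x, eps_t


eps_x, eps_t = dispersion_errors(p=8, C_x=10, D=1, gamma=1.0)
print(eps_x, eps_t, eps_t / eps_x)
```

With these assumed settings the ratio $\epsilon_t/\epsilon_x$ at $p = 8$, $C_x = 10$ comes out at roughly $10^6$, in line with the statement above; other choices of $D$ and $\gamma$ shift it by about an order of magnitude.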

4 COMPUTATIONAL COST

Our goal in this section is to derive quantitative rules for a performance-based design of SEM simulations: how to set the parameters $\Delta x$ and $p$ in order to satisfy a given error tolerance at a minimum computational cost.

4.1 Definition of the computational cost

We focus on the computational cost quantified as CPU time cost (how long does it take to run a simulation?). Memory costs will be discussed briefly in 4.2. We adopt the floating point operation count as a proxy for the CPU time cost. Although imperfect¶¶, this rough approach seems adequate for our illustrative purposes, as verified by our benchmarking results in Figure 6.

[Figure 3: horizontal axis εX, vertical axis G; curves labelled 2, 3, 4, 6, 10, 15.]

Figure 3. Minimal number of nodes per wavelength $C_x$ required to achieve a specified accuracy $\epsilon_x$ (dispersion error due to space discretization). Each curve is for a different polynomial order $p$ (see legend).

Table 5. Number of multiplications per element per node, $E_x(p)/(p+1)^D$, for the computation of internal forces, and storage per node, in some implementations of SEM.

| Problem | Code | $E_x(p)/(p+1)^D$ | Storage per node |
|---|---|---|---|
| 1D | SEMLAB | $p + 1$ | $p + 4$ |
| 2D SH | SEM2DPACK | $4(p + 2)$ | 6 |
| 2D P-SV | SEM2DPACK | $8(p + 3)$ | 16 |
| 2.5D monopole | AXISEM | $8(p + 7)$ | |
| 2.5D di/quadrupole | AXISEM | $12(p + 9)$ | |
| 3D | SPECFEM3D | $9(2p + 11)$ | 22 |

The number of floating point operations in SEM is dominated by the evaluation of the internal forces. The number $E_x(p)$ of multiplications involved per element in this operation is reported in Table 5 for some open source implementations of SEM‖‖. In the table we show $E_x(p)/(p+1)^D$ to reflect the number of operations per element per node in dimension $D$. Other implementations may differ in their balance between optimization of memory and optimization of operation count, but the leading term $O(p^{D+1})$ in $E_x(p)$ reported here cannot be reduced further.

¶¶ From J.M. Dennis (2005, Ph.D dissertation): “Historically, iterative solvers have been designed to achieve the best numerical accuracy for a given number of floating-point operations. However, this approach ignores the cost of memory access, which has not seen nearly as rapid of an improvement as floating-point costs. To reduce the time to solution, we need to address both the numerical efficiency and memory efficiency of an iterative algorithm.”

‖‖ SEMLAB and SEM2DPACK are available online at www.sg.geophys.ethz.ch/geodynamics/ampuero/software.html, SPECFEM3D is available at www.geodynamics.org.

The total computational cost for wave propagation in a domain of linear dimension $L$ and total travel time $T$ is the product of the number of elements, the cost per element and the number of timesteps,

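$$\mathrm{cost} = E_x(p) \left( \frac{L}{\Delta x} \right)^D \frac{T}{\Delta t}. \qquad (17)$$

Assuming $T \approx L/c$, it can be rewritten as

$$\mathrm{cost} = (kL)^{1+D} \times \Gamma, \qquad (18)$$

where

$$\Gamma = \frac{E_x(p)}{(k\Delta x)^D\, \omega\Delta t}. \qquad (19)$$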

The first term in (18) is entirely due to the physical dimensions of the problem whereas the term $\Gamma$ encapsulates the effect of the numerical resolution. In the remainder we will refer to $\Gamma$ as the computational cost.
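The equivalence of (17) and (18) under $T = L/c$ can be confirmed numerically. In the sketch below the operation count $E_x(p) = 9(2p+11)(p+1)^3$ is taken from the 3D entry of Table 5; the numerical values in the example (wave speed, frequency, element size, timestep) are arbitrary illustrative inputs, not recommended settings.

```python
from math import pi


def E_x_3d(p):
    """Multiplications per element, 3D entry of Table 5: 9(2p+11) per node, (p+1)^3 nodes."""
    return 9 * (2 * p + 11) * (p + 1) ** 3


def total_cost(E, L, dx, T, dt, D):
    """Eq. (17): elements x cost per element x number of timesteps."""
    return E * (L / dx) ** D * T / dt


def gamma_cost(E, kdx, wdt, D):
    """Eq. (19): resolution-dependent cost factor Gamma."""
    return E / (kdx**D * wdt)


# Illustrative numbers only: 1 Hz waves at 3 km/s over 300 km, p = 4, 3D.
c, f, L, D, p = 3000.0, 1.0, 3.0e5, 3, 4
omega = 2.0 * pi * f
k = omega / c
dx, dt = 500.0, 0.01
T = L / c

cost_17 = total_cost(E_x_3d(p), L, dx, T, dt, D)
cost_18 = (k * L) ** (1 + D) * gamma_cost(E_x_3d(p), k * dx, omega * dt, D)
print(cost_17, cost_18)
```

Both expressions agree up to round-off, which is why only $\Gamma$ needs to be compared when choosing between discretization settings for a fixed physical problem.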

Figure 6 reports the accuracy achieved and the CPU time spent in a series of simulations with a large range of $\Delta x$ and $p$ values. The benchmark problem is 2D SH wave propagation in unbounded homogeneous media with a point force source and Ricker source time function. It was solved with the SEM2DPACK code, using second order Newmark timestepping. Compiler, compiling options and processor are indicated above Figure 6-c. The dispersion error $\Delta T/T$ is measured with respect to the analytical solution, at an epicentral distance much larger than the dominant wavelength, $L = 20\lambda$. At this distance the dispersion error is dominated by the timestepping error, which translates into a second order decay of $\Delta T/T$ as a function of $\Delta x$ (Figure 6-a). Up to a multiplicative factor common to all $\Delta x$ and $p$, the relations between CPU time, dispersion error, $\Delta x$ and $p$ are well captured by our theoretical expectations. This demonstrates the adequacy of our proxy $\Gamma$ for the CPU time cost.


[Figure 4: two panels for P = 8. Top: phase velocity / wave speed versus h/λ (wavelengths per element), with a top axis P λ/h (nodes per wavelength). Bottom: ∆T/T (relative dispersion error) versus h/λ. Legend: ∆T/T; ∆T/T = A(p) (kh)^{2P}; ∆T/T = B (C kh/Ωmax)^2; ∆T/T ∼ round-off.]

Figure 4. Phase velocity curve (top) and dispersion error (bottom) for SEM with $p = 8$ as a function of resolution (wavelengths per element or nodes per wavelength). The green dashed line is the theoretical estimate, the black dashed line is the time-discretization error of the centered difference scheme.

4.2 Memory cost

What if we can afford large CPU times but we have limited memory resources? The memory requirements of SEM in terms of number of floats stored per node are given in Table 5. These include elastic, quadrature and coordinate mapping coefficients or combinations thereof, and displacement, velocity and acceleration fields. The memory requirement per node does not depend explicitly on $p$ or $\Delta t$. The total memory cost is proportional to the total number of nodes

$$\mathrm{cost} \propto (p/k\Delta x)^D. \qquad (20)$$

Assuming $\epsilon_t$ can be made arbitrarily small, the memory cost to achieve an accuracy $\epsilon$ is

$$\mathrm{cost} \propto p^D \left( A_x(p)/\epsilon \right)^{D/2p}. \qquad (21)$$

For the practical range ǫ < 10−3 the optimal value popt

is very high (> 20), even higher than the values found byoptimizing the CPU cost. Again the minimum is very flatand if the mesh is required to have ∆x < λ one must allowfor lower popt.

5 OPTIMAL SIMULATION PARAMETERS

5.1 Time domain

[Figure 5: time discretization error $\epsilon_t$ versus space discretization error $\epsilon_x$ on logarithmic axes, one curve per number of nodes per wavelength (curve labels G = 4.5, 5, 6, 10), squares marking $p = 2$ to $p = 19$, dashed line $\epsilon_t = \epsilon_x$.]

Figure 5. Space and time discretization components ($\epsilon_x$ and $\epsilon_t$) of the dispersion error in 3D SEM with second order timestepping. Each curve is drawn for a given number of nodes per wavelength ($C_x$). Each square corresponds to a different polynomial order $p$, from 2 to 19. Lower errors are for lower $p$, as labeled for the curve $C_x = 4.5$. The timestep has been set to $\Delta t = \Delta t_{crit}/2$. Curves are rarely well below the dashed line: there is virtually no case of practical interest ($\epsilon \ll 10^{-2}$) where $\epsilon_t \ll \epsilon_x$. For small $\epsilon$ the dispersion error is dominated by $\epsilon_t$ (the upper dotted lines indicate $\epsilon_t/\epsilon_x = 10$ and $100$).

Considering both components of the dispersion error (15), for a fixed accuracy goal $\epsilon$ there is an optimal set of parameters $p$, $\Delta x$ and $\Delta t$ that minimize the computational cost. In Appendix B we show that the optimal element size and timestep, for given space and time discretization orders $p$ and $q$, are

\[ \Delta x_{opt} = \frac{\lambda}{2\pi} \left( \frac{qD}{qD + 2p} \, \frac{\epsilon}{A_x(p)} \right)^{1/2p}, \tag{22} \]

\[ \Delta t_{opt} = \frac{\lambda/c}{2\pi} \left( \frac{2p}{qD + 2p} \, \frac{\epsilon}{A_t} \right)^{1/q}. \tag{23} \]
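As an illustration of how Equations (22) and (23) might be evaluated in practice, the following Python sketch computes $\Delta x_{opt}$ and $\Delta t_{opt}$ and checks that the two error components add up to the target $\epsilon$. The function name `optimal_dx_dt` and the numerical inputs are assumptions made for this example; in particular the value used for $A_x$ is a placeholder, not the $A_x(p)$ of this paper.

```python
import math

def optimal_dx_dt(wavelength, wave_speed, eps, p, q, D, A_x, A_t):
    """Element size and timestep from equations (22) and (23)."""
    denom = q * D + 2 * p
    dx = wavelength / (2 * math.pi) * (q * D / denom * eps / A_x) ** (1.0 / (2 * p))
    dt = (wavelength / wave_speed) / (2 * math.pi) * (2 * p / denom * eps / A_t) ** (1.0 / q)
    return dx, dt

# Illustrative inputs only: A_x is a placeholder value (assumption),
# A_t = 1/24 is the centered-difference prefactor quoted in the text.
lam, c = 1.0, 1.0
eps, p, q, D = 1e-3, 6, 2, 3
A_x, A_t = 1e-6, 1.0 / 24.0

dx, dt = optimal_dx_dt(lam, c, eps, p, q, D, A_x, A_t)

# Recompute both error components from the leading-order error model.
k = 2 * math.pi / lam
omega = k * c
eps_x = A_x * (k * dx) ** (2 * p)
eps_t = A_t * (omega * dt) ** q
print(dx, dt, eps_x + eps_t)
```

At the optimum the error budget is split in fixed proportions, $\epsilon_x/\epsilon = qD/(qD+2p)$ and $\epsilon_t/\epsilon = 2p/(qD+2p)$, independently of the prefactors.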

The associated cost

\[ \Gamma = \left[ \left( 1 + \frac{2p}{qD} \right) A_x(p) \right]^{D/2p} \left[ \left( 1 + \frac{qD}{2p} \right) A_t \right]^{1/q} \frac{E_x(p)}{\epsilon^{\frac{D}{2p} + \frac{1}{q}}} \tag{24} \]

is plotted in Figure 7 for the 3D SEM. As the beneficial effect of the spectral convergence is counterbalanced by the linear increase of cost per node upon $p$-refinement (Table 5), there is always an optimal $p_{opt}$ that minimizes the cost.

The values of $p_{opt}$ found in Figure 7 are high ($p_{opt} > 10$) compared to the range employed in current SEM simulations. This contrast is exacerbated as the accuracy requirements become more stringent. However, the minimum of $\Gamma$ is very flat, and employing values of $p$ within the usual range ($p < 10$) does not imply a significant loss of efficiency.

In realistic situations mesh design issues can limit $p$ to low values. A mesh must often honor features of the geological model with length scales somewhat smaller than one wavelength, implying $\Delta x < \lambda$. From the number of elements per wavelength $\lambda/\Delta x_{opt}$ plotted in Figure 7-c, the constraint $\lambda/\Delta x > 1$ restricts the useful range to $p \le 7$.

The optimization described above can also be hindered by constraints on the timestep. In heterogeneous media, due to the use of conformal meshes, $\Delta t$ is controlled by the small element faces of fast materials in contact with slow materials. The timestep can be much shorter than expected in a uniform grid, by a factor of the order of the highest velocity contrast. For instance, from Figure 7-d it appears that the optimal parameters for $\epsilon = 10^{-4}$ are appropriate for velocity contrasts up to 5 only if $p < 10$.

5.2 Frequency domain

Increasing efforts are currently devoted towards realizing 3D waveform tomographic inversion [e.g. Bleibinhaus et al, JGR 2007]. Waveform tomography is classically based on frequency domain solvers, which do not involve time discretization. By considering only the space discretization component $\epsilon_x$ of the dispersion error, we can address the cost-accuracy trade-offs of SEM for frequency domain solvers. Figure 8 shows the computational cost as a function of polynomial order $p$, assuming an optimal choice of element size $\Delta x$ for a given accuracy level and $p$. The minimum of each curve (indicated by circles) gives the optimal value of $p$ for a given accuracy level. These optimal $p$ values are significantly lower than in the presence of time integration error (Figure 7).

6 HIGHER ORDER SYMPLECTIC SCHEMES

6.1 Is higher order timestepping needed?

By definition higher order time schemes yield a faster decrease of the error as the timestep is decreased (see 11). However, this comes at a cost. For instance, the schemes we will study here involve $n_f$ evaluations of internal forces per timestep, with larger $n_f$ for higher $q$. The computational cost per element is $\Gamma_T = n_f T/\Delta t$. For a given accuracy $\epsilon$ (11) it can be written as

\[ \Gamma_T = n_f \left( A_t/\epsilon \right)^{1/q}. \tag{25} \]

[Figure 6: top left, dispersion error $\Delta T/T$ versus $h$; bottom left, CPU time versus $h$; right, CPU time versus $\Delta T/T$; curves for $p = 3, 5, 7, 9$. Annotation: Processor: Xeon 3GHz. Compiler: ifort 9.1. Flags: −O3 −ip −ipo −arch SSE2 −tpp7 −xN.]

Figure 6. Computational cost and accuracy as a function of $\Delta x$ and $p$ for a point source with Ricker source time function in a 2D SH homogeneous elastic space, solved with the SEM2DPACK code and explicit Newmark time integration. The dispersion error $\Delta T/T$ is evaluated with respect to the analytical solution at epicentral distance $L = 20\lambda$. Each curve corresponds to a different polynomial order $p$ (see labels). Symbols indicate simulation results, dashed lines the theoretical predictions. Top-left plot: note the second order decay of error as a function of $\Delta x$, because the error from the second order time integration dominates. Bottom-left: our CPU time predictions derived in Equation 19, up to a common multiplicative factor. Right: cost/accuracy performance curves. Note that performance keeps improving with increasing $p$, consistent with our predictions of a very large optimal $p$.

This is a good indicator of the relative efficiency of different time discretization schemes. A similar indicator comes by inspection of the optimal computational cost (24). Higher order time schemes are more efficient than the usual central difference scheme ($q = 2$, $A_t = 1/24$) for all accuracies $\epsilon$ only when

\[ n_f A_t^{1/q} < 1/\sqrt{24}. \tag{26} \]

We have not yet found a scheme that satisfies this condition (and there might be a fundamental reason for that). Nevertheless, all schemes are more efficient than centered differences below some accuracy level, when

\[ \epsilon < \left( \sqrt{24}\, n_f A_t^{1/q} \right)^{-\frac{2q}{q-2}}. \tag{27} \]

For many of the schemes we have studied so far this condition is met in the range $\epsilon < 0.1$. So it is advantageous to apply high order time schemes even when the accuracy goal is not very stringent.

Table 6. Properties of some symplectic algorithms: number of stages $n$, number of force evaluations $n_f$, order $q$ of the dispersion error, its prefactor $A_t$ and stability number $C_t$.

Name     n    nf    q    At           Ct
PV       1    1     2    1/24         2
CPV      1    2     4    1/720        2√3
PFR      3    3     4    0.0661431    1.5734
PEFRL    4    4     4    1/12500      2.97633
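The accuracy threshold of Equation (27) can be evaluated directly from the entries of Table 6. The short Python sketch below does so; the function names `time_cost` and `crossover_accuracy` are introduced here for illustration only, and the scheme parameters are simply those listed in the table.

```python
import math

# (name, n_f, q, A_t) as listed in Table 6.
SCHEMES = [
    ("PV", 1, 2, 1.0 / 24.0),
    ("CPV", 2, 4, 1.0 / 720.0),
    ("PFR", 3, 4, 0.0661431),
    ("PEFRL", 4, 4, 1.0 / 12500.0),
]

def time_cost(n_f, q, A_t, eps):
    """Cost indicator of equation (25)."""
    return n_f * (A_t / eps) ** (1.0 / q)

def crossover_accuracy(n_f, q, A_t):
    """Right-hand side of equation (27); only meaningful for q > 2."""
    base = math.sqrt(24.0) * n_f * A_t ** (1.0 / q)
    return base ** (-2.0 * q / (q - 2.0))

# Skip PV itself: it is the q = 2 reference scheme.
for name, n_f, q, A_t in SCHEMES[1:]:
    print(name, "is cheaper than PV for eps below", crossover_accuracy(n_f, q, A_t))
```

With the Table 6 values this gives thresholds of roughly 0.078 for CPV, 3.2e-4 for PFR and 0.085 for PEFRL, which is in line with the statement that the condition is often met for $\epsilon < 0.1$.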


[Figure 7: four panels versus polynomial order (0 to 20): (a) 3D computational cost $\Gamma$, (b) nodes per $\lambda$, (c) elements per $\lambda$, (d) $\Delta t/\Delta t_c$. Legend: $\epsilon$ = 0.001, 0.0001, 1e−05.]

Figure 7. (a) Optimal computational cost (number of multiplications in the evaluation of internal forces) of the 3D SEM as a function of polynomial order $p$. Each curve is for a different dispersion accuracy $\epsilon$ given in the legend. Some attributes of the optimal settings: (b) number of nodes per wavelength, (c) number of elements per wavelength and (d) ratio of timestep to critical timestep. The circles indicate the optimal polynomial order for each $\epsilon$.

[Figure 8: 3D total computational cost versus polynomial order (0 to 20). Legend: $\epsilon$ = 10%, 1%, 0.1%, 0.01%.]

Figure 8. Theoretical cost-performance for an SEM frequency domain solver (equivalent to assuming perfect time integration). The computational cost is shown as a function of polynomial order $p$. Each curve corresponds to a given accuracy level (dispersion error indicated in the legend). We assume an optimal choice of element size $\Delta x$ for a given accuracy and $p$. The optimal $p$ (circles) is lower than in the case with imperfect time integration.


[Figure 9: cost $\Gamma_T$ versus time discretization error $\epsilon_t$ on logarithmic axes. Legend entries as extracted: PVML, SO4m5, MLSB3AO6m7, KLs17odr8a, SSs35o10.]

Figure 9. Cost/accuracy curves for different symplectic schemes. Their stability limits are marked by black squares. High order schemes are generally more efficient (lower cost $\Gamma_T$ for a given accuracy $\epsilon_t$) than the usual centered difference scheme (PV).

6.2 Generalities [... in progress...]

In the absence of intrinsic attenuation the mechanical energy is conserved in seismic wave propagation. As the SEM space discretization does not introduce numerical attenuation∗ it is desirable to preserve this feature in the time discretization too. However, most high-order time integration schemes, like Runge-Kutta or Adams-Bashforth, are dissipative, leading to a cumulative distortion of the amplitude of the waveform spectrum. In astrophysics and molecular dynamics, where numerical accuracy and energy conservation over very long time-scales are essential goals, symplectic methods have been developed and are widely used. More recently, these algorithms have been applied with success in FEM for electromagnetics (Rieben et al. 2004)∗∗. Symplectic methods belong to the larger class of geometric integrators, designed to preserve certain geometrical invariants of the exact flow of a differential equation. Symplectic methods in Hamiltonian mechanics preserve areas in the phase space, a property coined the linear symplectic structure (see Donnelly & Rogers (2005) for an introduction, and Marsden and Ratiu (2002) for a full coverage of geometrical mechanics)∗∗∗. "In some cases, symplecticity reduces to well known and easy to understand principles, including the Betti reciprocity principle and other well-known reciprocity principles in mechanics" [Lew et al 2004].

Symplectic integrators are not explicitly designed to guarantee the exact conservation of mechanical energy. However, they conserve a Hamiltonian function that is close to the original Hamiltonian. Therefore they display no long-time energy drift: energy oscillates around its exact value with oscillation amplitude decreasing with the order of the method. These methods preserve the symplectic structure in a global sense. Multi-symplectic methods, like the symplectic Gauss-Legendre Runge-Kutta methods, that instead guarantee local conservation are still a matter of research (Frank et al. 2006). Their implementation and design are complex, as they entail a simultaneous design of space and time discretizations, and their cost efficiency is yet to be demonstrated, as these are generally implicit methods.

∗ Well, at least at the inter-element nodes; see Thompson and Pinsky for the interior nodes...
∗∗ More references on Maxwell equations: Hirono et al 1997; Lu and Schmid 2001; Piperno 2006; Sha et al 2008.
∗∗∗ Other general references: Simo et al 1992; Sanz-Serna and Calvo 1994; Hairer, Lubich and Wanner, "Geometric numerical integration" 2002; Leimkuhler and Reich 2004; Holder and Leimkuhler 2001; Hardy et al 1999.

Symplectic schemes conserve symplecticity and momentum. A different choice is to achieve conservation of energy and momentum [e.g. Gonzalez and Simo 1996]. High-order methods in this class include (continuous) Galerkin methods, which are unfortunately implicit. Note that energy and symplecticity cannot be conserved together if the time step is constant [Ge and Marsden 1988]. (Asynchronous) variational integrators [Kane et al 1999; Lew et al 2004] fill this gap by using locally adaptive timesteps, at the cost of larger implementation complexity, especially in parallel computing.

Linear multi-step methods are symplectic when symmetric and irreducible [Eirola and Sanz-Serna 1992] but for high order (multi-stage) they have poor performance in terms of stability and memory requirements. If symplecticity is not required, linear multi-step methods ...

For algorithms with fixed time steps the theorem of Ge and Marsden (1988) has led to a general division of algorithms into those that are energy-momentum preserving and those that are symplectic-momentum preserving. Space-time asynchronous variational integrators (AVI) have been developed to bridge this gap (Kane, Marsden and Ortiz, 1999; Lew, Marsden, Ortiz and West 2004), but their implementation is quite involved. AVI are thought to be most efficient for computing statistical properties of complex dynamic systems.

Chen [2007] compared some third and fourth order methods for the wave equation (with applications using pseudo-spectral space discretization): Lax-Wendroff methods (modified equation approach: spatial derivatives replace high-order temporal derivatives), symplectic Nystrom methods (Runge-Kutta specialized to second order ODEs and symplectic) and symplectic splitting methods (like the ones considered here). Disadvantages of Lax-Wendroff: not self-starting. ∗ ∗ ∗ Disadvantages of Nystrom: memory, as it requires storage of $n$ auxiliary variables.

In summary, our choice of focusing on symplectic methods is based on the following advantages: straightforward implementation (including self-starting), no additional memory requirements and available numerical analysis results.

We apply the symplectic time integration to the semi-discretized, finite-dimensional system of ODEs which results from the spatial discretization by the SEM and which, ignoring solid-fluid boundaries for the sake of simplicity, is of the type

\[ M\ddot{q} + Kq = f, \tag{28} \]

where $M$ is the diagonal mass matrix, $K$ the non-diagonal stiffness matrix, $q$ the displacement vector, $f$ the source term, and superscripted dots denote differentiation with respect to time.

Let us furthermore define the conserved total energy (e.g. Nissen-Meyer et al 2008) after the source duration ($f = 0$) as

\[ H = \frac{1}{2} v^T M v + \frac{1}{2} q^T K q, \tag{29} \]

where $v = \dot{q}$ is the velocity, superscript $T$ denotes the transpose, and the two terms are the kinetic and potential contributions, respectively. Using the displacement $q$ and momentum $p = Mv$ as phase space variables of the system, we obtain

\[ \partial_p H = M^{-1} p = \dot{q}, \tag{30} \]

\[ \partial_q H = Kq = -M\ddot{q} = -\dot{p}, \tag{31} \]

where we have used eq. (28) in eq. (31). These two equations are the common canonical equations of motion within the Hamiltonian framework, and as such the semi-discretized ODE system (28) is in a form suitable for symplectic time integration.

Symplectic time integration can also be applied to the formulation of the elastodynamic problem as an infinite-dimensional Hamiltonian system of PDEs (for a discussion of technicalities related to infinite-dimensionality, see [Marsden and Ratiu, 2002]). We denote by $q(x, t)$ the displacement field and $p(x, t) = \rho\, \partial q/\partial t$ the momentum field. We define the elastic potential

\[ V(q) = \frac{1}{2} \epsilon_{ij} C_{ijkl} \epsilon_{kl} \tag{32} \]

and the Hamiltonian

\[ H(q, p) = \iiint \left[ \frac{1}{2}\, p \cdot p/\rho + V(q) \right] dx^3. \tag{33} \]

Note this is a separable Hamiltonian: $H(q, p) = T(p) + V(q)$ (kinetic and potential energies). The elastodynamic equations

\[ \rho \frac{\partial^2 u}{\partial t^2} = \mathrm{div}(c \cdot \nabla u), \tag{34} \]

where $u(x, t)$ is the displacement field and $c(x)$ the elasticity tensor, are equivalent to

\[ \frac{\partial q_i}{\partial t} = \frac{\partial H}{\partial p_i}, \tag{35} \]

\[ \frac{\partial p_i}{\partial t} = -\frac{\partial H}{\partial q_i} \tag{36} \]

(these are functional derivatives). An alternative approach is to first do the spatial discretization, making sure the resulting system of ODEs is Hamiltonian, then apply symplectic time integration (see discussion in Lu and Schmid, 2001).

∗ ∗ ∗ Some modified equation algorithms can be obtained as special cases of the symplectic algorithms with force gradient. A corrected 4th order scheme with only 2 force evaluations can be obtained within the "force-gradient" family of Omelyan et al. (2003), setting $\xi = 1/24$ in their equation 24. This is in contrast to their proposed value $\xi = 1/12$, which yields only second order. This algorithm can also be obtained by the modified equation approach [Cohen et al. 2001]. It has $A_t = -1/720$ [Scuro and Chin 2005] and $C_t = \sqrt{(3)}$ [Cohen et al 2001].

6.3 Some symplectic algorithms

We restrict ourselves to explicit algorithms. We consider only time-reversible schemes to ensure that the reciprocity theorem can be applied to the computation of Green's functions and to self-adjoint methods. We focus on schemes with $n$ stages involving $n$ or more evaluations of $Kd$, i.e. the family AB...BA of Omelyan et al. (2002; 2003) ∗ ∗ ∗.

The update of displacement and velocity fields, $d$ and $v$, from time $t$ to $t + \Delta t$ is composed of an $n$-stage sub-stepping iteration, for $k = 1$ to $n$:

    t ← t + a_k ∆t
    d ← d + a_k ∆t v
    v ← v + b_k ∆t M^{-1} [−K d + f(t)]

and a closing stage:

    t ← t + a_{n+1} ∆t
    d ← d + a_{n+1} ∆t v

This family of algorithms requires $n$ evaluations of internal forces $Kd$ per global timestep. The implementation is remarkably simple: it takes a few lines of coding to modify existing SEM solvers, and it does not require additional memory storage. Time-reversibility is guaranteed by imposing the following symmetries on the coefficients: $a_k = a_{n+2-k}$ and $b_k = b_{n+1-k}$. Some known algorithms are listed in Appendix C.
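The sub-stepping iteration above can be sketched in a few lines of Python. This is a minimal illustration for a scalar harmonic oscillator, not code from an SEM solver: the names `symplectic_step` and `oscillator_error` and the callable `force` (standing for $M^{-1}[-Kd + f(t)]$) are assumptions of this example, while the PV and PFR coefficients are those listed in Appendix C.

```python
import math

def symplectic_step(d, v, t, dt, a, b, force):
    """Advance (d, v, t) by one global timestep.

    a holds n+1 displacement coefficients, b holds n velocity coefficients,
    and force(d, t) returns M^{-1} [-K d + f(t)].
    """
    n = len(b)
    for k in range(n):
        t = t + a[k] * dt
        d = d + a[k] * dt * v
        v = v + b[k] * dt * force(d, t)
    # Closing stage: displacement update only.
    t = t + a[n] * dt
    d = d + a[n] * dt * v
    return d, v, t

# Coefficients (a, b) as listed in Appendix C.
PV = ([0.5, 0.5], [1.0])
theta = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
PFR = ([theta / 2, (1 - theta) / 2, (1 - theta) / 2, theta / 2],
       [theta, 1 - 2 * theta, theta])

def oscillator_error(scheme, dt, nsteps):
    """Displacement error at the final time for d'' = -d, d(0) = 1, v(0) = 0."""
    a, b = scheme
    d, v, t = 1.0, 0.0, 0.0
    for _ in range(nsteps):
        d, v, t = symplectic_step(d, v, t, dt, a, b, lambda x, s: -x)
    return abs(d - math.cos(t))

print("PV :", oscillator_error(PV, 0.05, 200))
print("PFR:", oscillator_error(PFR, 0.05, 200))
```

In an SEM code `d` and `v` would be global arrays and `force` would wrap the internal force evaluation, so that only the loop over the coefficients changes with respect to a second-order solver.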

It might seem useful, especially for non-linear problems, to restrict also to forward symplectic integrators, i.e. with positive sub-steps ($\sum_{1}^{n} a_k > 0$). This requires an additional sub-stage involving a $Kd$ evaluation (the third order operator in equation 5.2 of Scuro & Chin (2005) and in equation 7 of Omelyan et al. (2003)). WILL STUDY THIS LATER

7 EXAMPLES AND APPLICATIONS

7.1 1D illustrative example

To compare the performance of different time schemes we study the propagation of a plane wave in a 1D elastic medium. The source time function is a Ricker wavelet. To efficiently explore long propagation distances we solve the test problem with a 1D SEM code with free boundaries. We take $p = 6$ and 12 nodes per dominant wavelength. Given that the Ricker wavelet has significant power up to 2.5 times its dominant frequency, we have only 4.8 nodes per wavelength at the end of the source spectrum. A coarser grid gives visible high frequency ringing. To be fair, the Newmark simulation employed a short, sub-critical timestep such that the total cost is similar to that of the 4-stage PEFRL simulation, i.e. we used $C_t^{Newmark} = C_t^{PEFRL}/4 = 3/4$ instead of $C_t^{Newmark} = 2$.

Figures 12 and 13 show the results of the comparison. We define two measures of misfit, for amplitude and travel time respectively, that are analogous to usual misfit functions in observational global seismology. The travel time misfit is a time delay estimate relative to the analytical waveform measured by standard cross-correlation, with quadratic interpolation of the cross-correlation maximum, on time windows of width 3× the dominant period of the source and centered at the analytical arrival time. The amplitude misfit is the rms waveform error relative to the analytical waveform, computed after time shift by the travel time misfit estimated above and normalized by the norm of the analytical waveform. As expected the amplitude and travel time errors, shown in Figure 12, increase linearly with propagation distance. The travel time errors for the PEFRL are more than an order of magnitude lower than for the Newmark scheme. If a larger, critical timestep were used for Newmark, as usual, its error would be higher by a factor $(2 \times 4/3)^2 \approx 7$. Figure 13 shows seismograms at a large epicentral distance, 100× the dominant wavelength. The waveform distortion and delay for the Newmark scheme are visible to the naked eye whereas the PEFRL waveform shows a barely apparent misfit.

∗ ∗ ∗ They call "stage" each of the A or B; we call "stage" each sequence AB and we do not count the last A.

[Figure 10: two panels versus travel distance / dominant wavelength: amplitude misfit (top) and time delay / dominant period (bottom), for PEFRL and Newmark. Panel title: number of nodes per dominant wavelength = 12.0, number of nodes per maximum wavelength = 4.8.]

Figure 10. Comparison of PEFRL and Newmark schemes. Amplitude and travel time misfits as a function of epicentral distance.

[Figure 11: velocity seismograms versus time / dominant period at distance / dominant wavelength = 100.0, for the reference, PEFRL and Newmark solutions.]

Figure 11. Comparison of PEFRL and Newmark schemes. Velocity seismograms in the 1D test at epicentral distance $L = 100\lambda$.

7.2 Global wave propagation

We use the 2D spectral-element method developed by Nissen-Meyer et al. (2007b; 2008), which solves the 3D elastodynamic wave equation for spherically symmetric reference models. In a first experiment, we verify the error predictions developed above with a spherical version of Lamb's problem (vertical single force above a half-space) and focus on the Rayleigh wave, cross-correlating the pulse for different SEM settings with the reference normal-mode summation solution at various distances. Figure 14 shows seismograms distributed over an epicentral range of 180°, using Newmark and symplectic time schemes; the black line denotes the window selected for the cross-correlation of the surface-wave misfits. Figure 15 shows dispersion errors $\epsilon$ for the surface waves with the classical Newmark scheme and with fourth- and sixth-order symplectic schemes. The numerical settings are quite typical: 7 nodes per wavelength, $p = 6$, timestep close to critical. The accuracy target is not met with the second order Newmark scheme (we can linearly extrapolate the plot to $L = 130\lambda$). In contrast the 4th order PEFRL algorithm gives the desired accuracy. The same accuracy can be achieved with Newmark if the timestep is reduced by a factor > 8, but this would become > 3× more expensive than PEFRL.

As a second experiment, we focus on more realistic settings and simulate global waves from a moment-tensor (explosion) source situated at 30 km depth, for a dominant period of 10 seconds, through a PREM earth model. The method we use is restricted to spherically symmetric background models, but for the analysis here this poses no limitation compared to full 3D models: lateral heterogeneities are usually weak perturbations to the radial structure, and are mostly ignored when meshing. As such, the mesh complexity depends entirely on how radial structures are accommodated. As shown in Figure 17, the simulations are repeated for the Newmark scheme with an almost-critical time step, i.e. the conventional choice, the 4th- and 6th-order symplectic schemes, and the Newmark scheme with a time step lowered such that its computation time equals that of the fourth order scheme.

8 DISCUSSION

A staggered-grid $p$-th order 3D finite difference method needs $24p + 9$ multiplications per node per timestep (e.g. [Graves 1996] for $p = 2$ and 4) whereas a 3D SEM (on a regular grid) requires $18p + 36$. However, the order of the dispersion error in finite differences (assuming perfect time integration) is $p$ whereas in SEM it is $2p$, so for a given dispersion order $p'_x$ the costs are $24p'_x + 9$ and $9p'_x + 36$ respectively. Hence SEM clearly competes with FDM even in the simple geometries that FDM can handle.
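The comparison can be tabulated with a few lines of Python. The function names `fd_cost` and `sem_cost` are introduced here for illustration; they only encode the two operation counts quoted above, expressed as a function of the dispersion order $p'_x$.

```python
def fd_cost(order):
    """Multiplications per node per timestep, staggered-grid FD of dispersion order `order`."""
    return 24 * order + 9

def sem_cost(order):
    """Same count for a 3D SEM whose dispersion order 2p equals `order`."""
    return 9 * order + 36

for order in (2, 4, 8, 12):
    print(order, fd_cost(order), sem_cost(order))
```

Equating the two expressions gives $15 p'_x = 27$, so under these counts SEM needs fewer multiplications per node for any dispersion order $p'_x \geq 2$.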

9 CONCLUSIONS

...

References

Ainsworth, M., 2004. Discrete dispersion relation for hp-version finite element approximation at high wave number, SIAM J. Num. An., 42, 553–575.

Chaljub, E., Komatitsch, D., Vilotte, J.-P., Capdeville, Y., Valette, B., & Festa, G., 2006. Spectral element analysis in seismology, in Advances in Wave Propagation in Heterogeneous Media, "Advances in Geophysics" series, eds R.-S. Wu & V. Maupin, Elsevier, in press.

Chin, S. A. & Scuro, S. R., 2005. Exact evolution of time-reversible symplectic integrators and their phase errors for the harmonic oscillator, Phys. Lett. A, 342, 397–403.

Cohen, G. C., 2002. Higher-order numerical methods for transient wave equations, Springer-Verlag, Berlin, Heidelberg, New York.

Donnelly, D. & Rogers, E., 2005. Symplectic integrators: An introduction, Am. J. Phys., 73, 938–945.

Ekstrom, G., Tromp, J., & Larson, E. W. F., 1997. Measurements and global models of surface wave propagation, J. Geophysical Research-solid Earth, 102, 8137–8157.

Fauqueux, S., 2003. Elements finis mixtes spectraux et couches absorbantes parfaitement adaptees pour la propagation d'ondes elastiques en regime transitoire, Ph.D. thesis, Universite Paris IX, Dauphine, Paris.

Fichtner, A., Kennett, B., Igel, H., & Bunge, H.-P., 2009. Full seismic waveform tomography for upper-mantle structure in the australasian region using adjoint methods, Geophys. J. Int., 179, 1703–1725.


[Figure 12: two panels versus travel distance / dominant wavelength: amplitude misfit (top) and time delay / dominant period (bottom), for PEFRL and Newmark. Panel title: number of nodes per dominant wavelength = 12.0, number of nodes per maximum wavelength = 4.8.]

Figure 12. Comparison of PEFRL and Newmark schemes. Amplitude and travel time misfits as a function of epicentral distance.

[Figure 13: velocity seismograms versus time / dominant period at distance / dominant wavelength = 100.0, for the reference, PEFRL and Newmark solutions.]

Figure 13. Comparison of PEFRL and Newmark schemes. Velocity seismograms in the 1D test at epicentral distance $L = 100\lambda$.


[Figure 14: longitudinal-component seismograms, time [s] (0 to 12000) versus epicentral distance [°] (0 to 180). Panel title: Newmark − SYMPLECTIC, long. comp. Legend: Newmark dt = 0.4, Newmark dt = 0.15, symplectic 4th order, symplectic 6th order.]

Figure 14. Longitudinal displacement seismograms for the spherical version of Lamb's problem, i.e. a vertical single force above a homogeneous half-space. Shown are 25 epicentral distances until the antipode, and 4 simulations for Newmark and symplectic time schemes (see legend) at dominant source period of 20 seconds.

[Figure 15: three panels of surface-wave phase misfit versus epicentral distance [wavelengths], for $u_\theta$ with $T_0$ = 40 s, 30 s and 25 s. Legend: symplectic 4th, dt = 0.6; Newmark dt = 0.4; Newmark dt = 0.15; Newmark dt = 0.05; Newmark dt = 0.015.]

Figure 15. Comparison of dispersion error for PEFRL and Newmark schemes for global surface wave propagation with an axisymmetric SEM formulation.

[Figure 16: top, longitudinal-component seismograms, time [s] (0 to 3500) versus epicentral distance [°] (0 to 180), for Newmark (dt = 0.4 and 0.15) and 4th- and 6th-order symplectic schemes. Bottom, $u_\theta$ [m] at 135°, Newmark (dashed) versus symplectic (solid), with scaled difference traces: 1× (Newmark dt = 0.4 − symplectic 6th), 10× (Newmark dt = 0.15 − symplectic 6th) and 50× (symplectic 4th − symplectic 6th).]

Figure 16. Comparison of dispersion error for PEFRL and Newmark schemes for global surface wave propagation with an axisymmetric SEM formulation.

Forest, E. & Ruth, R. D., 1990. 4th-order symplectic integration, Physica D, 43, 105–117.

Frank, J., Moore, B. E., & Reich, S., 2006. Linear PDEs and numerical methods that preserve a multisymplectic conservation law, SIAM J. Sci. Comp., 28, 260–277.

Ihlenburg, F. & Babuska, I., 1997. Finite element solution of the Helmholtz equation with high wave number .2. The h-p version of the FEM, SIAM J. Num. An., 34, 315–358.

Komatitsch, D., Tsuboi, S., & Tromp, J., 2005. The spectral-element method in seismology, in AGU Geophysical Monograph on Seismic Data Analysis and Imaging with Local Arrays, eds A. Levander & G. Nolet, vol. 157, pp. 205–227, AGU, Washington DC, USA.

Mulder, W. A., 1999. Spurious modes in finite-element discretizations of the wave equation may not be all that bad, Appl. Num. Math., 30, 425–445.

Nissen-Meyer, T., Dahlen, F. A., & Fournier, A., 2007a. Spherical-earth Frechet sensitivity kernels, Geophys. J. Int., 168, 1051–1066.

Nissen-Meyer, T., Fournier, A., & Dahlen, F. A., 2007b. A 2-D spectral-element method for computing spherical-earth seismograms–I. Moment-tensor source, Geophys. J. Int., 168, 1067–1093.

Nissen-Meyer, T., Fournier, A., & Dahlen, F. A., 2008. A 2-D spectral-element method for computing spherical-earth seismograms–II. Background models, Geophys. J. Int., accepted.

Omelyan, I. P., Mryglod, I. M., & Folk, R., 2002. Optimized Forest-Ruth- and Suzuki-like algorithms for integration of motion in many-body systems, Comp. Phys. Comm., 146, 188–202.

Omelyan, I. P., Mryglod, I. M., & Folk, R., 2003. Symplectic analytically integrable decomposition algorithms: classification, derivation, and application to molecular dynamics, quantum and celestial mechanics simulations, Comp. Phys. Comm., 151, 272–314.

Rentsch, S., Buske, S., Luth, S., & Shapiro, S. A., 2007. Fast location of seismicity: A migration-type approach with application to hydraulic-fracturing data, Geophysics, 72, S33–S40.

Rieben, R., White, D., & Rodrigue, G., 2004. High-order symplectic integration methods for finite element solutions to time dependent Maxwell equations, IEEE Trans. Ant. Prop., 52, 2190–2195.

Scuro, S. R. & Chin, S. A., 2005. Forward symplectic integrators and the long-time phase error in periodic motions, Phys. Rev. E, 71, Art. No. 056703.

Seriani, G. & Priolo, E., 1994. Spectral element method for acoustic wave simulation in heterogeneous media, Finite Elements in Analysis and Design, 16, 337–348.

Tape, C., Liu, Q., Maggi, A., & Tromp, J., 2009. Adjoint tomography of the southern california crust, Science, 325, 988–992.

Thompson, L. L. & Pinsky, P. M., 1994. Complex wave-number fourier-analysis of the p-version finite-element method, Comp. Mech., 13, 255–275.

APPENDIX A: GRID DISPERSION IN HIGHER DIMENSIONS

The dispersion properties for higher dimensions $D$ can be obtained from the 1D results of Section 3.2. In SEM the polynomial bases for $D > 1$ are tensor products of the 1D basis. We denote by $\kappa_i$ the components of the normalized wavenumber vector, and by $\kappa$ its amplitude:

\[ \kappa^2 = \sum_{i=1}^{D} \kappa_i^2. \tag{A1} \]

It can be shown (Ainsworth 2004) that the dispersion relation is

\[ \sum_{i=1}^{D} \Omega_i^2(\kappa_i) = \Omega_x^2, \tag{A2} \]

where each $\Omega_i$ satisfies a relation analogous to (5):

\[ \cos(\kappa_i) = R_p(\Omega_i). \tag{A3} \]

The result (6) can then be applied to each component of the error:

\[ \frac{\Omega_i - \kappa_i}{\kappa_i} \approx A_x(p)\, \kappa_i^{2p}. \tag{A4} \]

The asymptotic behavior of the dispersion error is found by combining (A1), (A2) and (A4). For instance in 2D for a propagation angle $\theta$ we find:

\[ \epsilon_x \approx A_x(p)\, \kappa^{2p} \left[ (\cos\theta)^{2p+2} + (\sin\theta)^{2p+2} \right]. \tag{A5} \]

[Figures 18 and 19: seismogram panels versus time (900 to 1500 s), Newmark versus symplectic: $u_\theta$ [m] at 127.5° with phases labeled PP, Pdiff and PKIKP, and $u_r$ [m] at 157.5° with phases labeled PP, Pdiff, PKP and PKIKP. Traces: symplectic 4th and 6th order (dt = 0.15), Newmark (dt = 0.1, 0.0375, 0.01), and scaled differences with respect to the 6th-order symplectic solution.]

Figure 18. PP at 127

Figure 19. PKIKP and PKP at 157

The resulting shape of the asymptotic error contours and the azimuthal behavior of the ratio $\epsilon_x(\theta)/\epsilon_x(\theta = 0)$ are plotted in Figure A1. Clearly the error for given $\kappa$ is larger along the principal axes of the grid. The ratio between the maximum and minimum error,

\[ \frac{\epsilon_x(\theta = 0)}{\epsilon_x(\theta = \pi/4)} = 2^p, \tag{A6} \]

increases rapidly with increasing $p$. The 1D analysis provides the maximal error among all propagation directions.
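The anisotropy ratio (A6) follows from the bracketed angular factor in (A5) and can be checked numerically. The Python sketch below samples that factor over one quadrant; the function names `anisotropy_factor` and `max_min_ratio` and the sampling density are choices made for this illustration.

```python
import math

def anisotropy_factor(theta, p):
    """Bracketed angular term of equation (A5)."""
    return math.cos(theta) ** (2 * p + 2) + math.sin(theta) ** (2 * p + 2)

def max_min_ratio(p, nsamples=3601):
    """Ratio of largest to smallest factor over 0 <= theta <= pi/2."""
    values = [anisotropy_factor(0.5 * math.pi * i / (nsamples - 1), p)
              for i in range(nsamples)]
    return max(values) / min(values)

for p in (2, 4, 8):
    print(p, max_min_ratio(p), 2 ** p)
```

The maximum is reached along the grid axes and the minimum along the diagonal, where the factor equals $2 \cdot 2^{-(p+1)} = 2^{-p}$.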

APPENDIX B: MINIMIZATION OF THE COMPUTATIONAL COST

For given $p$, $q$ and $D$ we seek the parameters $\Delta x_{opt}$ and $\Delta t_{opt}$ that minimize the computational cost (19) under the constraint of a prescribed dispersion accuracy goal (15). We must distinguish two cases, according to the sign of the leading order coefficient $A_t$ of the time discretization error.


[Figure A1: left, isocontours in the plane ($\kappa_x/\kappa$, $\kappa_y/\kappa$); right, polar plot versus azimuth (0° to 330°). Legend values: 2, 4, 6, 10, 15.]

Figure A1. Grid anisotropy in 2D SEM for different polynomial orders $p$ (see legends). Left: shape of the isocontours of asymptotic dispersion error $\epsilon_x$ as a function of the two components of the wavevector ($\kappa_x$, $\kappa_y$). Right: azimuthal variation of $\epsilon_x(\theta)/\epsilon_x(0)$. At a fixed $\kappa$ the error is maximum along the principal axes of the grid.

B1 Positive $A_t$

We define $\Delta x^*$ (also written $h^*$) as the element size that yields a spatial discretization error equal to the target error $\epsilon$, and adopt a similar definition for $\Delta t^*$:

$$A_x(p)\,(k\Delta x^*)^{2p} = |A_t|\,(\omega\Delta t^*)^{q} = \epsilon \tag{B1}$$

The constraint (15) is rewritten as

$$\left(\frac{\Delta x}{\Delta x^*}\right)^{2p} + \left(\frac{\Delta t}{\Delta t^*}\right)^{q} = 1 \tag{B2}$$

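We define $\Gamma^*$ as the cost associated with $\Delta x = \Delta x^*$ and $\Delta t = \Delta t^*$ and write the relative cost as

$$\frac{\Gamma}{\Gamma^*} = \left(\frac{\Delta x^*}{\Delta x}\right)^{D}\frac{\Delta t^*}{\Delta t} \tag{B3}$$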

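We apply the constraint (B2) to eliminate $\Delta x/\Delta x^*$ from the previous expression:

$$\frac{\Gamma}{\Gamma^*} = \left(1 - \left(\frac{\Delta t}{\Delta t^*}\right)^{q}\right)^{-\frac{D}{2p}}\frac{\Delta t^*}{\Delta t} \tag{B4}$$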

Minimizing $\Gamma/\Gamma^*$ as a function of $\Delta t/\Delta t^*$, it is readily found that

$$\frac{\Delta t_{\rm opt}}{\Delta t^*} = \left(\frac{2p}{qD + 2p}\right)^{1/q} \tag{B5}$$
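The step from (B4) to (B5) can be spelled out as follows. This is a short worked derivation, writing $r = \Delta t/\Delta t^*$ as a shorthand introduced here only for brevity. The relative cost diverges both for $r \to 0$ and for $r \to 1$, so the single stationary point in $(0,1)$ is the minimum.

```latex
\begin{align*}
\ln\frac{\Gamma}{\Gamma^*} &= -\frac{D}{2p}\,\ln\!\left(1 - r^{q}\right) - \ln r, \\
\frac{d}{dr}\,\ln\frac{\Gamma}{\Gamma^*} &= \frac{qD}{2p}\,\frac{r^{q-1}}{1 - r^{q}} - \frac{1}{r} = 0
\quad\Longrightarrow\quad qD\,r^{q} = 2p\left(1 - r^{q}\right), \\
r^{q} &= \frac{2p}{qD + 2p}.
\end{align*}
```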

Then from (B2) we get

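$$\frac{\Delta x_{\rm opt}}{\Delta x^*} = \left(\frac{qD}{qD + 2p}\right)^{1/2p} \tag{B6}$$

For usual values of p, q and D we note that $\Delta x_{\rm opt} \approx \Delta x^*$ and $\Delta t_{\rm opt} \approx \Delta t^*$.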
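To get a feel for how close the optimum is to $(\Delta x^*, \Delta t^*)$, the following minimal sketch evaluates (B5) and (B6) and cross-checks (B5) against a brute-force search over the relative cost (B4). The function names and the sample values p = 4, q = 2, D = 3 are assumptions chosen for illustration, not values prescribed by the text.

```python
def optimal_ratios(p, q, D):
    """Return (dx_opt/dx*, dt_opt/dt*) from (B6) and (B5)."""
    denom = q * D + 2 * p
    dt_ratio = (2 * p / denom) ** (1.0 / q)
    dx_ratio = (q * D / denom) ** (1.0 / (2 * p))
    return dx_ratio, dt_ratio

def relative_cost(dt_ratio, p, q, D):
    """Relative cost of (B4) as a function of dt/dt* in the open interval (0, 1)."""
    return (1.0 - dt_ratio ** q) ** (-D / (2.0 * p)) / dt_ratio

def brute_force_dt_ratio(p, q, D, n=100000):
    """Grid search for the minimizer of (B4); used only as a cross-check of (B5)."""
    candidates = [(i + 0.5) / n for i in range(n)]
    return min(candidates, key=lambda r: relative_cost(r, p, q, D))

p, q, D = 4, 2, 3  # illustrative values only
dx_ratio, dt_ratio = optimal_ratios(p, q, D)
print("dx_opt/dx* =", dx_ratio)
print("dt_opt/dt* =", dt_ratio, "grid search:", brute_force_dt_ratio(p, q, D))
print("constraint (B2):", dx_ratio ** (2 * p) + dt_ratio ** q)
```

Printing the two ratios for the orders of interest shows directly how far the optimum sits from $\Delta x^*$ and $\Delta t^*$ in a given setting.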

B2 Negative $A_t$

When $A_t < 0$, the leading-order terms of $\epsilon_t$ and $\epsilon_x$ are of opposite sign. In 1D, $\Delta x$ and $\Delta t$ can be chosen such that these two terms cancel each other at a given frequency (or at all frequencies if $q = 2p$). In higher dimensions this exact compensation is not possible over all propagation directions. Confining our discussion to 2D, the $L^\infty$ norm of $\epsilon(\theta) = \epsilon_x(\theta) + \epsilon_t$, defined by $\|\epsilon\|_\infty = \max_\theta |\epsilon(\theta)|$, is minimized by setting

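$$\epsilon_t = -\frac{\max_\theta \epsilon_x(\theta) + \min_\theta \epsilon_x(\theta)}{2} \tag{B7}$$

which gives

$$\|\epsilon\|_\infty = \frac{\max_\theta \epsilon_x(\theta) - \min_\theta \epsilon_x(\theta)}{2} \tag{B8}$$

From Equation (A6):

$$\epsilon_t = -\frac{1 + 2^{-p}}{2}\,A_x(p)\,(k\Delta x)^{2p} \tag{B9}$$

and

$$\|\epsilon\|_\infty = \frac{1 - 2^{-p}}{2}\,A_x(p)\,(k\Delta x)^{2p} \tag{B10}$$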

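Setting $\|\epsilon\|_\infty = \epsilon$, this leads to

$$\frac{h_{\rm opt}}{h^*} = \left(\frac{2^{p+1}}{2^{p} - 1}\right)^{1/2p} \tag{B11}$$

$$\frac{\Delta t_{\rm opt}}{\Delta t^*} = \left(\frac{2^{p} + 1}{2^{p} - 1}\right)^{1/q} \tag{B12}$$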
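The minimax argument of this subsection can be checked numerically. The sketch below works with errors normalized by $A_x(p)(k\Delta x)^{2p}$, takes the angular dependence from (A5), and evaluates the closed forms (B9) to (B12). Function names and the angular grid resolution are illustrative assumptions.

```python
import math

def eps_x(theta, p):
    """Spatial error (A5), normalized by A_x(p) (k dx)^(2p)."""
    return math.cos(theta) ** (2 * p + 2) + math.sin(theta) ** (2 * p + 2)

def linf_norm(eps_t, p, n=2000):
    """max over theta of |eps_x(theta) + eps_t| on a uniform grid of [0, pi/2]."""
    return max(abs(eps_x(i * math.pi / (2 * n), p) + eps_t) for i in range(n + 1))

def minimax_eps_t(p):
    """Normalized time error of (B9), which balances the two extreme directions."""
    return -(1.0 + 2.0 ** -p) / 2.0

def optimal_ratios_negative_At(p, q):
    """Return (h_opt/h*, dt_opt/dt*) from (B11) and (B12)."""
    h_ratio = (2.0 ** (p + 1) / (2.0 ** p - 1.0)) ** (1.0 / (2 * p))
    dt_ratio = ((2.0 ** p + 1.0) / (2.0 ** p - 1.0)) ** (1.0 / q)
    return h_ratio, dt_ratio

p, q = 4, 4  # illustrative values only
e_t = minimax_eps_t(p)
print("norm at minimax eps_t:", linf_norm(e_t, p), "closed form (B10):", (1 - 2.0 ** -p) / 2)
print("norm with eps_t shifted:", linf_norm(e_t + 0.01, p), linf_norm(e_t - 0.01, p))
print("h_opt/h*, dt_opt/dt*:", optimal_ratios_negative_At(p, q))
```

Both ratios are larger than one: when $A_t < 0$, the partial cancellation allows coarser elements and larger time steps than the values $h^*$ and $\Delta t^*$ defined from each error taken separately.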


APPENDIX C: SOME SYMPLECTIC INTEGRATION SCHEMES

Position Verlet (PV)

$$a_1 = a_2 = 1/2, \qquad b_1 = 1 \tag{C1}$$

Corrected Position Verlet (CPV)

$$a_1 = a_2 = 1/2, \qquad b_1 = 1, \qquad c_1 = 1/24$$

Position Forest-Ruth (PFR)

By Forest & Ruth (1990).

$$a_1 = a_4 = \theta/2, \qquad b_1 = b_3 = \theta$$

$$a_2 = a_3 = (1 - \theta)/2, \qquad b_2 = 1 - 2\theta$$

with

$$\theta = \frac{1}{2 - \sqrt[3]{2}}$$

Position extended Forest-Ruth-like (PEFRL)

By Omelyan et al. (2002).

$$a_1 = a_5 = \xi, \qquad b_1 = b_4 = 1/2 - \lambda$$

$$a_2 = a_4 = \chi, \qquad b_2 = b_3 = \lambda$$

$$a_3 = 1 - 2(\chi + \xi)$$

with

$$\xi = 0.1786178958448091$$

$$\lambda = -0.2123418310626054$$

$$\chi = -0.06626458266981849$$
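To illustrate how these coefficient tables are used, here is a minimal sketch that integrates a harmonic oscillator with PV, PFR and PEFRL. It assumes that each stage consists of a position update weighted by $a_k$ followed by a velocity update weighted by $b_k$, with a final position update weighted by $a_{n+1}$; this ordering is inferred from the "position" naming and from the matrix product (D2), and is not stated explicitly in this appendix. CPV is left out because its $c_1$ correction adds an extra term to the velocity update that this simple loop does not model. All function and variable names are illustrative.

```python
import math

THETA = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
XI, LAM, CHI = 0.1786178958448091, -0.2123418310626054, -0.06626458266981849

# name -> (a_1 .. a_{n+1}, b_1 .. b_n), copied from the tables above
SCHEMES = {
    "PV": ([0.5, 0.5], [1.0]),
    "PFR": ([THETA / 2, (1 - THETA) / 2, (1 - THETA) / 2, THETA / 2],
            [THETA, 1 - 2 * THETA, THETA]),
    "PEFRL": ([XI, CHI, 1 - 2 * (CHI + XI), CHI, XI],
              [0.5 - LAM, LAM, LAM, 0.5 - LAM]),
}

def step(d, v, dt, a, b, accel):
    """Advance displacement d and velocity v by one time step dt."""
    for ak, bk in zip(a, b):      # n stages: position update, then velocity update
        d += ak * dt * v
        v += bk * dt * accel(d)
    d += a[-1] * dt * v           # closing position update with a_{n+1}
    return d, v

def error(name, dt, omega=1.0, t_end=10.0):
    """Displacement error at t_end for d(0) = 1, v(0) = 0 (exact solution cos(omega t))."""
    a, b = SCHEMES[name]
    accel = lambda x: -omega ** 2 * x
    n = round(t_end / dt)
    d, v = 1.0, 0.0
    for _ in range(n):
        d, v = step(d, v, dt, a, b, accel)
    return abs(d - math.cos(omega * n * dt))

# Halving dt should reduce the error by about 2^q, with q the order of the scheme.
for name in SCHEMES:
    e1, e2 = error(name, 0.1), error(name, 0.05)
    print(name, e1, e2, "ratio:", e1 / e2)
```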

APPENDIX D: PROPERTIES OF SYMPLECTIC ALGORITHMS

To study the stability we apply the matrix method described by Chin & Scuro (2005) and related results therein. We define the amplification matrix A of a time integration scheme as

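$$\begin{bmatrix} d_{t+\Delta t} \\ \Delta t\, v_{t+\Delta t} \end{bmatrix} = A \begin{bmatrix} d_t \\ \Delta t\, v_t \end{bmatrix} \tag{D1}$$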

For the n-stage algorithms studied here

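$$A = T_{n+1} \prod_{k=n}^{1} V_k T_k \tag{D2}$$

$$T_k = \begin{bmatrix} 1 & a_k \\ 0 & 1 \end{bmatrix} \tag{D3}$$

$$V_k = \begin{bmatrix} 1 & 0 \\ -b_k\Omega_x^2 + 2c_k\Omega_x^4 & 1 \end{bmatrix} \tag{D4}$$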

For time-reversible symplectic integrators it takes the form

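$$A = \begin{bmatrix} g & \tau \\ -\mu & g \end{bmatrix} \tag{D5}$$

with

$$\det(A) = g^2 + \mu\tau = 1. \tag{D6}$$

The coefficients $g$, $\mu$ and $\tau$ are polynomials of $\Omega_x^2$ with order $n$ when $c_k = 0$, and order $2n$ otherwise. The eigenvalues of A are:

$$\rho_\pm = g \pm i\sqrt{\mu\tau} \tag{D7}$$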

When the stability criterion $\mu\tau > 0$, or equivalently $|g| < 1$, is met, the eigenvalues lie on the unit circle in the complex plane: symplectic algorithms do not introduce numerical dissipation [see also Simo et al. 1992, section A.1.3]. The complete dispersion relation in terms of the non-dimensional approximate frequency $\Omega_x^* = \omega^*\Delta t$ is

$$\cos(\Omega_x^*) = g(\Omega_x) \tag{D8}$$

The parameters $A_t$ and $q$ in the asymptotic behavior of the dispersion error $\epsilon_t = (\Omega_x^* - \Omega_x)/\Omega_x$ can be obtained from the relation above by analytical expansion (at high orders with the aid of symbolic software) or by numerical fit.

The stability number $C_t$ is given by the critical frequency, the smallest positive real solution of

$$|g(\Omega_x)| = 1 \tag{D9}$$

For low orders, $g$ and $C_t$ can be computed explicitly in closed form. For $q > 4$ we resort to numerical computation, applying equations (D2)-(D5).
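The numerical route mentioned above can be sketched as follows, assuming NumPy is available. The code assembles the amplification matrix from (D2) to (D4), estimates $q$ and $A_t$ from the dispersion relation (D8) with a two-point fit, and locates the stability limit (D9) with a simple upward scan. The function names, the fit frequencies and the scan step are illustrative choices; the scan only detects the first grid point where $|g|$ exceeds one, so it is a rough estimate rather than a root finder.

```python
import numpy as np

def amplification_matrix(a, b, c, Omega):
    """A = T_{n+1} V_n T_n ... V_1 T_1, following (D2)-(D4)."""
    A = np.eye(2)
    for ak, bk, ck in zip(a, b, c):  # zip stops after the n stages
        T = np.array([[1.0, ak], [0.0, 1.0]])
        V = np.array([[1.0, 0.0], [-bk * Omega**2 + 2.0 * ck * Omega**4, 1.0]])
        A = V @ T @ A
    T_last = np.array([[1.0, a[-1]], [0.0, 1.0]])
    return T_last @ A

def g_of(a, b, c, Omega):
    """Half the trace of A, which equals g in the form (D5)."""
    return 0.5 * np.trace(amplification_matrix(a, b, c, Omega))

def dispersion_error(a, b, c, Omega):
    """eps_t = (Omega* - Omega) / Omega, with cos(Omega*) = g(Omega) as in (D8)."""
    return (np.arccos(g_of(a, b, c, Omega)) - Omega) / Omega

def fit_order(a, b, c, Omega1=0.1, Omega2=0.05):
    """Two-point fit of eps_t ~ A_t * Omega^q at small Omega; returns (q, A_t)."""
    e1 = dispersion_error(a, b, c, Omega1)
    e2 = dispersion_error(a, b, c, Omega2)
    q = np.log(abs(e1 / e2)) / np.log(Omega1 / Omega2)
    return q, e2 / Omega2**q

def stability_number(a, b, c, Omega_max=10.0, step=1e-3):
    """Last scanned Omega before |g| first exceeds 1; None if no crossing is found."""
    Omega = step
    while Omega < Omega_max:
        if abs(g_of(a, b, c, Omega)) > 1.0:
            return Omega - step
        Omega += step
    return None

# Position Verlet and Corrected Position Verlet from Appendix C
pv = ([0.5, 0.5], [1.0], [0.0])
cpv = ([0.5, 0.5], [1.0], [1.0 / 24.0])
for name, (a, b, c) in (("PV", pv), ("CPV", cpv)):
    print(name, "q, A_t:", fit_order(a, b, c), "C_t:", stability_number(a, b, c))
```

For PV the matrix product gives $g = 1 - \Omega_x^2/2$ and for CPV $g = 1 - \Omega_x^2/2 + \Omega_x^4/24$, so the printed values can be compared with these closed forms.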

APPENDIX E: TRAVEL TIME DELAY ESTIMATES

We adopt a travel time error metric based on a cross-correlation technique for time delay estimation usually applied in global tomography studies (e.g. [Marquering, Dahlen and Nolet, 1999]). Let $u(t)$ denote a reference analytical seismogram, properly tapered over a selected time window $[t_1, t_2]$ to isolate a particular seismic phase of interest. Let $u_{\rm SEM}(t)$ denote a similar signal extracted from the seismogram computed by the SEM. The cross-correlation function between $u(t)$ and $u_{\rm SEM}(t)$ is defined by

$$C(\delta t) = \int_{t_1}^{t_2} u(t')\,u_{\rm SEM}(t' + \delta t)\,dt' \tag{E1}$$

The time delay $\Delta T$ is measured as the time lag $\delta t$ that maximizes $C(\delta t)$. In practice, the cross-correlation function is only available at the discrete timesteps, $C_n = C(n\Delta t)$. A sub-sample precision estimate of $\Delta T$ is obtained by quadratic interpolation near the peak of $C_n$.
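The measurement described above can be sketched as follows, assuming NumPy and two already tapered, equally sampled traces. The function name and the Gaussian test pulse are illustrative; the parabola through the three samples around the peak of $C_n$ is one common way to implement the quadratic interpolation.

```python
import numpy as np

def time_delay(u, u_sem, dt):
    """Lag that maximizes the discrete form of (E1), refined by a parabola through the peak."""
    # c[j] approximates C(lag_j * dt) = sum_i u[i] * u_sem[i + lag_j] * dt
    c = np.correlate(u_sem, u, mode="full") * dt
    lags = np.arange(-(len(u) - 1), len(u_sem))  # lag in samples for each entry of c
    j = int(np.argmax(c))
    shift = 0.0
    if 0 < j < len(c) - 1:
        y0, y1, y2 = c[j - 1], c[j], c[j + 1]
        curvature = y0 - 2.0 * y1 + y2
        if curvature != 0.0:
            shift = 0.5 * (y0 - y2) / curvature  # vertex of the interpolating parabola
    return (lags[j] + shift) * dt

# Synthetic check: a Gaussian pulse and a copy delayed by a fraction of a time step.
dt = 0.01
t = np.arange(0.0, 4.0, dt)
true_delay = 0.0137
u = np.exp(-((t - 2.0) / 0.1) ** 2)
u_sem = np.exp(-((t - 2.0 - true_delay) / 0.1) ** 2)
print("measured delay:", time_delay(u, u_sem, dt), "imposed delay:", true_delay)
```

With this sign convention a positive delay means that the SEM signal arrives later than the reference.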

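If the waveform misfit $\Delta u(t) = u(t) - u_{\rm SEM}(t)$ is small, an explicit first-order estimate of the time delay is [Marquering, Dahlen and Nolet, 1999]:

$$\Delta T = \frac{\int \dot u(t)\,\Delta u(t)\,dt}{\int \dot u(t)^2\,dt}. \tag{E2}$$

The perturbed seismogram can be written in the frequency domain as

$$u(\omega) + \Delta u(\omega) = u(\omega)\,e^{i\Delta\omega T}. \tag{E3}$$

Noting that

$$\Delta u(\omega) \approx i\,\Delta\omega\,T\,u(\omega) \tag{E4}$$

and applying Plancherel's theorem, (E2) yields

$$\frac{\Delta T}{T} \approx \frac{\int \omega\,|u(\omega)|^2\,\Delta\omega\,d\omega}{\|\dot u\|^2}. \tag{E5}$$

Considering the asymptotic behavior of the dispersion error for a single plane wave, $\Delta\omega/\omega \approx A_t(\omega\Delta t)^q$, we obtain a compact expression,

$$\frac{\Delta T}{T} \approx B\,(\omega_0\Delta t)^q, \tag{E6}$$

where the dominant frequency $\omega_0$ is defined by

$$\omega_0 \doteq \left(\frac{\|u^{(1+q/2)}\|}{\|\dot u\|}\right)^{2/q}, \tag{E7}$$

with overdots denoting time derivatives and $u^{(m)}$ the $m$-th time derivative. For instance, for second-order schemes ($q = 2$):

$$\omega_0 = \frac{\|\ddot u\|}{\|\dot u\|} \tag{E8}$$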

This is a very accurate estimate in 1D, where the plane wave approximation is exact. In higher dimensions the seismograms are a mixture of plane waves travelling in different directions and a departure from the estimate above is expected. However, this effect is small enough in the cases we have studied [DOUBLE CHECK].
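The first-order estimate (E2) and the dominant frequency (E8) can be evaluated on sampled signals as sketched below, assuming NumPy. The sketch reads the quantities in (E2) and (E8) as involving time derivatives of $u$, which is an interpretation made here on the basis of (E7) and of dimensional consistency. Derivatives are approximated with centered finite differences, and the function names and test signal are illustrative.

```python
import numpy as np

def first_order_delay(u, u_sem, dt):
    """First-order delay estimate: integral of u' * (u - u_sem) over integral of u'^2."""
    du = u - u_sem
    udot = np.gradient(u, dt)    # centered finite-difference time derivative
    return np.sum(udot * du) / np.sum(udot ** 2)

def dominant_frequency_q2(u, dt):
    """Dominant angular frequency for q = 2, taken as ||u''|| / ||u'||."""
    udot = np.gradient(u, dt)
    uddot = np.gradient(udot, dt)
    return np.linalg.norm(uddot) / np.linalg.norm(udot)

def wave_packet(t, f0=1.0, t0=20.0, width=3.0):
    """Illustrative test signal: a sine of frequency f0 under a Gaussian envelope."""
    return np.exp(-((t - t0) / width) ** 2) * np.sin(2.0 * np.pi * f0 * t)

dt = 0.005
t = np.arange(0.0, 40.0, dt)
true_delay = 0.002
u = wave_packet(t)
u_sem = wave_packet(t - true_delay)
print("first-order delay:", first_order_delay(u, u_sem, dt), "imposed:", true_delay)
print("dominant frequency:", dominant_frequency_q2(u, dt), "carrier:", 2.0 * np.pi)
```

For a narrow-band signal such as this wave packet, the dominant frequency is close to the carrier frequency; broader spectra shift it towards higher frequencies because of the derivative weighting.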
