HIGH-ORDER TIME-ACCURACY USING APPROXIMATE JACOBIANS FOR THE COMPRESSIBLE NAVIER-STOKES EQUATIONS

Richard P. Dwight†, Ursula Mayer† and Safdar Abbas†

† C2A2S2E, Institute for Aerodynamics and Flow Technology, German Aerospace Center (DLR), Braunschweig. (Contact: [email protected])

ABSTRACT

It is well established that the most efficient implicit methods for solving compressible Navier-Stokes problems to steady state are approximations of Newton methods, using simplified Jacobians and inexact solution of the linear system at each iteration. Two such methods are described, which are cheap and stable for large CFL numbers. However, because of the approximate nature of the implicit operators they cannot normally be applied directly to time-accurate calculations, but must be wrapped within a dual-time iteration, at the cost of efficiency. Here a time-accurate Runge-Kutta (RK) method is presented which overcomes this limitation. The explicit RK method is constructed in such a way that an arbitrary implicit term may be added at each stage without loss of order of accuracy in time. This term may then be chosen to stabilize the iteration at low cost - for which the previously mentioned approximate Newton operators are appropriate. The scheme is a special case of a Rosenbrock W-method. Significant performance gains are demonstrated over dual-time, and the need for time-step control is identified.

1. INTRODUCTION

The level of detail and complexity demanded of numerical simulation is increasing continuously in many engineering fields, providing a constant pressure for more efficient algorithms. In computational aerodynamics time-accurate flow computations are becoming more important as the simulation of maneuvers, and the use of DES and LES turbulence models become practical. However, unsteady calculations require disproportionate computing resources in comparison to stationary problems.

Efficiency improvements of flow solvers in aerodynamics have concentrated principally on the rapid solution of steady flows using such techniques as Full Approximation Storage (FAS) multigrid and approximate (and inexact) Newton solvers. As a result these algorithms have become refined and well understood. When unsteady simulations are required, they are typically performed using an implicit dual-time algorithm [10], where an analog of the non-linear steady problem is solved at each time-step using the aforementioned steady solvers. The result is an orders-of-magnitude higher cost for unsteady simulation, which has precluded its application in many areas. Explicit methods on the other hand are cheap per time-step, but suffer from severe restrictions on time-step size for stiff problems.

The class of semi-implicit methods attempts to combine the benefits of explicit and implicit methods by treating those terms in the governing equations that limit stability implicitly, and the remaining terms explicitly [20, 21]. A simple example is the second-order Adams-Bashforth-Crank-Nicolson (ABCN) method for the convection-diffusion equation, which treats the convective terms with Adams-Bashforth and the diffusive terms with Crank-Nicolson. However such a splitting is only beneficial if the implicit part takes a simple form that is easy to solve, and in any case a clean splitting of terms is rarely possible in practice, where stability may be limited by different terms in different regions of the domain.

A more general semi-implicit approach is required, and a promising set of candidate methods are Rosenbrock W-methods, originally due to Steihaug and Wolfbrandt [17], of which a special case was applied recently to the incompressible Navier-Stokes equations by Nikitin [14, 13]. Rosenbrock schemes are implicit Runge-Kutta methods for stiff non-linear problems that attempt to replace the non-linear sub-iteration conventionally used to achieve implicitness with the exact solution of a linear problem involving the exact Jacobian [9]. Rosenbrock W-methods then relax the requirement that the Jacobian be exact and the linear system solved exactly.

The particular scheme considered here is based on a three-stage third-order explicit Runge-Kutta (RK) scheme, whose structure is carefully chosen such that it is possible to add an arbitrary implicit term to each stage without compromising the order of accuracy of the method. The implicit terms may therefore be chosen to stabilize the iteration at low cost, and are not required to satisfy any accuracy requirements. In particular the highly tuned approximate Newton methods developed in the pursuit of efficient schemes for steady problems may then be used as stabilization terms for unsteady problems.

In this article third- and fourth-order Rosenbrock W-methods are developed, referred to as RW3 and RW4 respectively, their stability is theoretically demonstrated, error estimators are derived, and they are applied to unsteady compressible Euler and Navier-Stokes problems. The implicit operator is critical to the efficiency of the scheme, and two choices are considered: the LU-SGS scheme, which is extremely cheap and unconditionally stable for (steady) inviscid flow problems [22, 6], and a recently proposed operator from Rossow [16, 18], which is more expensive per iteration than LU-SGS but also more stable for viscous problems.

For the numerical investigations all schemes were implemented within the DLR TAU-Code [7], a finite-volume flow solver which enjoys extensive use within the European aerospace industry.

2. GOVERNING EQUATIONS AND SPATIAL DISCRETIZATION

The 2d Favre averaged Navier-Stokes equations in conservation form may be written

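$$\frac{\partial w}{\partial t} + \frac{\partial}{\partial x_i} f^c_i(w) - \frac{\partial}{\partial x_i} f^v_i(w) = \frac{\partial w}{\partial t} + R(w) = 0, \tag{1}$$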

where summation convention is applied and the conservative state vector is defined by $w = (\rho, \rho u, \rho v, \rho E)^T$. The vectors of convective fluxes are given by

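$$f^c_x = \begin{pmatrix} \rho u \\ \rho u u + p \\ \rho u v \\ \rho H u \end{pmatrix}, \qquad f^c_y = \begin{pmatrix} \rho v \\ \rho v u \\ \rho v v + p \\ \rho H v \end{pmatrix}, \tag{2}$$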

and the vector of viscous fluxes by

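$$f^v_x = \begin{pmatrix} 0 \\ \tau_{xx} \\ \tau_{xy} \\ U_i \tau_{xi} - \kappa (\nabla T)_x \end{pmatrix}, \qquad f^v_y = \begin{pmatrix} 0 \\ \tau_{yx} \\ \tau_{yy} \\ U_i \tau_{yi} - \kappa (\nabla T)_y \end{pmatrix}, \tag{3}$$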

where $U = (u, v)^T$ is the velocity vector, $H = E + p/\rho$ is the total enthalpy, and $\tau(w, \nabla w)$ is the viscous shear stress tensor

$$\tau = \mu \left( \nabla U + \nabla U^T - \frac{2}{3} \nabla \cdot U \, I \right), \tag{4}$$

where µ and κ represent the viscosity and thermal conductivity respectively. The equation of state for a calorically perfect gas, $p = (\gamma - 1)\rho e$, where $e$ is the internal energy, closes the system. Only laminar flow will be considered here.

These equations are discretized separately in space and time. The spatial discretization is a second-order accurate unstructured cell-vertex finite-volume method, using linear solution reconstruction within cells and an upwind flux for the convective terms at cell interfaces. Starting from the integral form of the equations on a control volume $\Omega$ with boundary $\partial\Omega$ and volume $|\Omega|$:

$$\int_{\Omega} \frac{\partial w}{\partial t} \, d\Omega + \int_{\partial\Omega} \left[ f^c(w) - f^v(w) \right] \cdot n \, d(\partial\Omega) = 0,$$

the exact fluxes $f(w)$ are replaced by numerical fluxes $f(w_L, w_R; n)$ in order to treat the discontinuity between left and right states, $w_L$ and $w_R$, across control volumes. By employing support points at the centroid of cells the volume integral may be eliminated, so that for a faceted control volume $\Omega_i$ we have

$$|\Omega_i| \frac{\partial w_i}{\partial t} = -\sum_{j \in N(i)} f(w_L, w_R; n_{ij}) - \sum_{j \in B(i)} f_b(w_L; n_j) = -R_i(w), \tag{5}$$

where $N(i)$ is the set of neighbours of $\Omega_i$, $B(i)$ the set of boundary faces, $f_b$ a numerical boundary flux, and $R$ therefore represents the complete spatial discretization.

Here the particular case of the Roe flux is considered [15]:

$$f^c_{\mathrm{Roe}}(w_L, w_R; n_{ij}) = \frac{1}{2} \left( f^c(w_L) + f^c(w_R) \right) - \frac{1}{2} |A| (w_R - w_L),$$

where $A = \partial f^c / \partial w$ is the convective flux-Jacobian evaluated at the Roe-averaged state. The Jacobian used in the LU-SGS implicit method will be based on the Lax-Friedrichs flux, namely:

$$f^c_{\mathrm{L\text{-}F}}(w_L, w_R; n_{ij}) = \frac{1}{2} \left( f^c(w_L) + f^c(w_R) \right) - \frac{1}{2} |\lambda| (w_R - w_L), \tag{6}$$

where λ is the maximum eigenvalue of $A$.

The spatial solution gradients are obtained using the Green-Gauss formula, applied to each component $\phi$ of $w$:

$$\nabla \phi_i \approx \frac{1}{2 |\Omega_i|} \sum_{j \in N(i)} (\phi_i + \phi_j) \, n_{ij}, \tag{7}$$

limited using the Venkatakrishnan limiter [19] for the reconstruction of $w_L$ and $w_R$. The calculation of the viscous fluxes on a face employs a central average of the gradients from the two neighbouring cells.
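As an illustration of the Lax-Friedrichs flux (6), the following is a minimal sketch for the one-dimensional Euler equations, not the TAU-Code implementation. It assumes a face normal pointing from left to right, γ = 1.4, and takes |λ| as the larger of |u| + c over the two states; the function names are hypothetical.

```python
import numpy as np

GAMMA = 1.4  # ratio of specific heats (assumed value, air)


def pressure(w):
    """Pressure from the conservative state w = (rho, rho*u, rho*E)."""
    rho, mom, ener = w
    return (GAMMA - 1.0) * (ener - 0.5 * mom * mom / rho)


def euler_flux(w):
    """Exact convective flux f^c(w) in 1D: (rho*u, rho*u*u + p, rho*H*u)."""
    rho, mom, ener = w
    u = mom / rho
    p = pressure(w)
    return np.array([mom, mom * u + p, (ener + p) * u])


def max_wave_speed(w):
    """Largest eigenvalue magnitude |u| + c of the flux Jacobian."""
    rho, mom, _ = w
    return abs(mom / rho) + np.sqrt(GAMMA * pressure(w) / rho)


def lax_friedrichs_flux(w_left, w_right):
    """Central average of the fluxes minus scalar dissipation, as in (6)."""
    lam = max(max_wave_speed(w_left), max_wave_speed(w_right))
    central = 0.5 * (euler_flux(w_left) + euler_flux(w_right))
    return central - 0.5 * lam * (w_right - w_left)
```

For identical left and right states the dissipation term vanishes and the numerical flux reduces to the exact flux, which is the consistency property required of any numerical flux.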

3. ROSENBROCK W-METHODS

The third-order Rosenbrock W-method (RW3) developed and applied by Nikitin [14, 13] is described in relation to a general autonomous ordinary differential equation (5). The scheme is based on a particular third-order explicit Runge-Kutta scheme (here termed RK3), to which at each stage an arbitrary implicit stabilization term is added, whose order is the same as the truncation error of the stage, and which may therefore be regarded as a perturbation or error term. The explicit scheme is:

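$$|\Omega| \frac{w' - w^n}{\Delta t} = -\frac{2}{3} R(w^n), \tag{8}$$

$$|\Omega| \frac{w'' - w^n}{\Delta t} = -\frac{1}{3} R(w^n) - \frac{1}{3} R(w'), \tag{9}$$

$$|\Omega| \frac{w^{n+1} - w^n}{\Delta t} = -\frac{1}{4} R(w^n) - \frac{3}{4} R(w''), \tag{10}$$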

which is a special case of the one-parameter family of third-order three-stage schemes with RK coefficient tableau

$$\begin{array}{c|ccc} 0 & & & \\ \frac{2}{3} & \frac{2}{3} & & \\ \frac{2}{3} & \frac{2}{3} - \frac{1}{4b} & \frac{1}{4b} & \\ \hline & \frac{1}{4} & \frac{3}{4} - b & b \end{array}$$

with $b = 3/4$ [3]. The construction of the semi-implicit method relies upon two features of this scheme: that the abscissas of the second and third stages are identical, so that $w'$ and $w''$ are evaluated at the same time increment $t + \frac{2}{3}\Delta t$, and that the term $R(w')$ does not appear in the final quadrature for $w^{n+1}$.

Observe that the following relations hold for the above scheme:

$$w' = w(t^n + \tfrac{2}{3}\Delta t) + O(\Delta t^2),$$

$$w'' = w(t^n + \tfrac{2}{3}\Delta t) + O(\Delta t^3),$$

$$w^{n+1} = w(t^{n+1}) + O(\Delta t^4).$$
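The explicit scheme (8)-(10) can be sketched in a few lines, together with a numerical check of its order. This is an illustrative sketch only: it assumes |Ω| = 1, and the scalar test problem dw/dt = -w² with w(0) = 1 (exact solution 1/(1 + t)) and all function names are choices made here, not taken from the paper.

```python
import numpy as np


def rk3_step(R, w, dt):
    """One step of the explicit RK3 scheme for dw/dt = -R(w), with |Omega| = 1."""
    w1 = w - dt * (2.0 / 3.0) * R(w)               # stage (8):  w'
    w2 = w - dt * (R(w) + R(w1)) / 3.0             # stage (9):  w''
    return w - dt * (0.25 * R(w) + 0.75 * R(w2))   # stage (10): w^{n+1}


def rk3_error(n_steps, t_end=1.0):
    """Error at t_end for the assumed test problem dw/dt = -w**2, w(0) = 1."""
    R = lambda w: w ** 2
    w = np.array([1.0])
    dt = t_end / n_steps
    for _ in range(n_steps):
        w = rk3_step(R, w, dt)
    return abs(w[0] - 1.0 / (1.0 + t_end))


def rk3_observed_order(n_steps):
    """Order estimated from the error ratio when the step is halved."""
    return np.log2(rk3_error(n_steps) / rk3_error(2 * n_steps))
```

Calling `rk3_observed_order(20)` should return a value close to 3, since halving the step reduces the global error of a third-order scheme by roughly a factor of eight.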

This scheme is now perturbed by adding an implicit term to each stage as follows:

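$$|\Omega| \frac{w' - w^n}{\Delta t} = -\frac{2}{3} R(w^n) - \gamma L (w' - w^n), \tag{11}$$

$$|\Omega| \frac{w'' - w^n}{\Delta t} = -\frac{1}{3} R(w^n) - \frac{1}{3} R(w') - \gamma L (w'' - w'), \tag{12}$$

$$|\Omega| \frac{w^{n+1} - w^n}{\Delta t} = -\frac{1}{4} R(w^n) - \frac{3}{4} R(w'') - \gamma L (w^{n+1} - \tilde{w}^{n+1}), \tag{13}$$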

where the linear operator $L$ is some approximation of the flux Jacobian $\partial R / \partial w$ and also incorporates some inexact solution of the linear system at each stage. The constant γ is a positive coefficient that controls the relative strength of the implicit terms. Finally, $\tilde{w}^{n+1}$ is a second-order accurate approximation of the solution at the full time increment $t + \Delta t$.

To see why the addition of $L$ does not degrade the order of the method, first consider the right-hand side (RHS) of (11), which is obviously an $O(\Delta t)$ perturbation of the RHS of (8), which results in an $O(\Delta t^2)$ variation in $w'$ and $R(w')$, i.e. the order of accuracy of the first stage has not been reduced. Analogously the RHS of (12) is an $O(\Delta t^2)$ perturbation of the RHS of (9), resulting in an $O(\Delta t^3)$ variation in $w''$ and $R(w'')$, and again there is no reduction in accuracy. Note that this was only possible because $w'$ and $w''$ were evaluated at the same time increment, otherwise $w'' - w'$ would have been $O(\Delta t)$.

Finally, assuming that $\tilde{w}^{n+1} = w(t^{n+1}) + O(\Delta t^3)$, the RHS of equation (13) is an $O(\Delta t^3)$ perturbation of the RHS of (10), causing an $O(\Delta t^4)$ variation in $w^{n+1}$. Thus at each stage and overall the order of accuracy is retained. In particular the local error $w^{n+1} - w(t^{n+1})$ is still $O(\Delta t^4)$, and the RW3 scheme is third-order in time.

It remains to define $\tilde{w}^{n+1}$, a second-order accurate approximation to the solution at the full time increment, which may be chosen as

$$|\Omega| \frac{\tilde{w}^{n+1} - w^n}{\Delta t} = -\frac{1}{4} R(w^n) - \frac{3}{4} R(w') - \gamma L (\tilde{w}^{n+1} - \hat{w}^{n+1}), \tag{14}$$

where $\hat{w}^{n+1}$ is itself an $O(\Delta t)$ approximation of $w(t^{n+1})$, which is chosen as

$$\hat{w}^{n+1} = \frac{3}{2} \left( \alpha w' + (1 - \alpha) w'' \right) - \frac{1}{2} w^n, \tag{15}$$

where the real parameter α can be chosen arbitrarily.

Of course, although the formal order of the method is preserved, the addition of effectively arbitrary terms must negatively influence absolute accuracy. It makes sense therefore to choose as small a value for γ as possible, bearing in mind that the method is more implicit, and therefore more stable, for large values of γ. There is a compromise to be made for each problem, each time-step and each choice of $L$. For example, if the time-step is in a region for which the original explicit method is stable, then γ = 0 is very likely to be optimal, as the added "error" has been reduced as far as possible. Since the RW3 scheme is also about a factor of four more expensive than the baseline explicit scheme RK3 per time-step (given the use of the LU-SGS operator for $L$), it is therefore only attractive for time-steps greater than about four times the maximum stable explicit time-step.
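The four solves of one RW3 step, (11), (12), (14) and (13), can be sketched as follows. This is a sketch under stated assumptions rather than the solver implementation: |Ω| = 1, the operator L is a small dense matrix inverted exactly with `numpy.linalg.solve` (the paper uses inexact operators such as LU-SGS), the default α = 0.5 is an arbitrary choice, and the function names and test problem are hypothetical. Each stage is rearranged so that the same matrix I/∆t + γL acts on the unknown increment.

```python
import numpy as np


def rw3_step(R, L, w, dt, gamma=1.0, alpha=0.5):
    """One RW3 step for dw/dt = -R(w); returns (w^{n+1}, second-order estimate)."""
    M = np.eye(w.size) / dt + gamma * L   # (|Omega|/dt) I + gamma L, |Omega| = 1
    Rn = R(w)
    # (11): M (w' - w^n) = -2/3 R(w^n)
    w1 = w + np.linalg.solve(M, -(2.0 / 3.0) * Rn)
    R1 = R(w1)
    # (12): M (w'' - w') = -1/3 (R(w^n) + R(w')) - (w' - w^n)/dt
    w2 = w1 + np.linalg.solve(M, -(Rn + R1) / 3.0 - (w1 - w) / dt)
    # (15): first-order extrapolation to the full time increment
    w_hat = 1.5 * (alpha * w1 + (1.0 - alpha) * w2) - 0.5 * w
    # (14): second-order approximation at the full time increment
    rhs = -0.25 * Rn - 0.75 * R1 - (w_hat - w) / dt
    w_tilde = w_hat + np.linalg.solve(M, rhs)
    # (13): final third-order update
    rhs = -0.25 * Rn - 0.75 * R(w2) - (w_tilde - w) / dt
    w_new = w_tilde + np.linalg.solve(M, rhs)
    return w_new, w_tilde


def rw3_error(n_steps, t_end=1.0):
    """Error for the assumed test dw/dt = -w**2, w(0) = 1, with inexact L = 1.5."""
    R = lambda w: w ** 2
    L = np.array([[1.5]])   # deliberately not the exact Jacobian 2w
    w = np.array([1.0])
    dt = t_end / n_steps
    for _ in range(n_steps):
        w, _unused = rw3_step(R, L, w, dt)
    return abs(w[0] - 1.0 / (1.0 + t_end))


def rw3_observed_order(n_steps):
    """Order estimated from the error ratio when the step is halved."""
    return np.log2(rw3_error(n_steps) / rw3_error(2 * n_steps))
```

With γ = 0 the matrix reduces to I/∆t and the step coincides with the explicit scheme (8)-(10). For small time-steps `rw3_observed_order` should approach 3 even though L is not the true Jacobian, which is the property the construction is designed to deliver.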

In Section 3.1 it will be shown that for linear $R$, and $L$ the exact Newton operator, the scheme is A-stable for $\gamma \ge \frac{1}{3}$. But while it has been noted that LU-SGS is unconditionally stable for steady problems, this is only true for γ close to unity. These issues will end up reducing the absolute accuracy of the scheme for large time-steps, and it therefore becomes desirable to develop Rosenbrock W-methods of higher order. In Section 3.2 therefore a fourth-order scheme is derived.

3.1. Stability Analysis

Linear stability analysis may be carried out for RW3 in the idealized case that the stabilizing implicit operator is the exact Newton operator [14]. The right-hand side of the system of ODEs is assumed to be linear and to coincide with the implicit operator $L = -R$. A rational approximation to the discrete system is determined from equations (11) to (14) as

$$w^{n+1} = Q(\Delta t \cdot L) \, w^n, \tag{16}$$

where the matrix-valued rational function $Q(z)$ is:

$$Q(z) = (1 - \gamma z)^{-4} \Big[ 1 - (4\gamma - 1) z + (\gamma - \tfrac{1}{2})(6\gamma - 1) z^2 - (4\gamma^3 - 6\gamma^2 + 2\gamma - \tfrac{1}{6}) z^3 + \gamma (\gamma - \tfrac{1}{2})(\gamma^2 - 2\gamma + \tfrac{1}{3}) z^4 \Big].$$

The scheme is A-stable if the inequality

$$|Q(z)| \le 1$$

is satisfied for all $z$ in the left complex half-plane, which may be shown to hold provided

$$\gamma \ge \frac{1}{3}.$$

The use of an approximate implicit operator $L$ will have a significant effect on stability, so that this result is unlikely to hold in practice. The stability of the scheme with an implicit operator based on a first-order scheme and a second-order residual might be analysed to some extent with von Neumann analysis [6], but as we are also concerned with absolute accuracy, in the following we concentrate on numerical investigations. In particular these will show that for large time-steps the scheme is only stable for γ close to unity.

3.2. Extrapolation to a Fourth-Order Method (RW4)

The generalization of the method presented above to higher orders of accuracy is highly desirable, not principally because of the improved asymptotic behaviour, but because it is likely to also improve absolute accuracy for finite time-steps, which given the nature of the RW3 scheme could be low. The construction of higher-order accurate Rosenbrock W-methods with few stages is, however, a formally very difficult problem. For a third-order scheme there are only 8 order conditions to satisfy, but for fourth-order 21, fifth-order 58, and sixth-order 166 [9]. The construction of such schemes would nevertheless be of utmost importance for the effectiveness of Rosenbrock W-methods, and is a self-contained mathematical task that could be performed independently of numerical investigations.

In lieu of an immediate alternative, the order of an existing scheme can always be incremented by one by applying Richardson extrapolation, incurring a cost of three evaluations of the original scheme. In this case therefore a fourth-order scheme that is roughly equivalent in cost to a 12-stage Runge-Kutta method is obtained.

In particular Richardson extrapolation based on the solutions $w^{n+1}$, obtained from $w^n$ in one step of the underlying scheme of size $\Delta t$, and $w^{(n+\frac{1}{2})+\frac{1}{2}}$, obtained in two steps of size $\frac{1}{2}\Delta t$, gives:

$$\bar{w}^{n+1} = \frac{1}{2^p - 1} \left( 2^p \, w^{(n+\frac{1}{2})+\frac{1}{2}} - w^{n+1} \right), \tag{17}$$

where $p = 3$ is the order of the underlying scheme, and the extrapolated solution $\bar{w}^{n+1}$ is of order $p + 1$ in $\Delta t$.

Note that the extrapolation will only be accurate if the underlying scheme achieves its theoretical order $p$ at the given $\Delta t$, though it will tend to improve the absolute accuracy of the method even if this is not the case. Also the stability of the extrapolation is only well understood for non-stiff problems. For stiff problems, a non-polynomial remainder term in the error may become unbounded, even if the stability properties of the method are otherwise good [1, 5, 8, 12]. This may have an influence for high-Reynolds number viscous cases.
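The extrapolation (17) is independent of the underlying one-step scheme, which the following sketch makes explicit. As a simplifying assumption, the explicit RK3 scheme (8)-(10) with |Ω| = 1 stands in for RW3 as the base scheme; the test problem dw/dt = -w² with w(0) = 1 and the function names are hypothetical choices for illustration.

```python
import numpy as np


def rk3_step(R, w, dt):
    """Base scheme: explicit RK3 (8)-(10) with |Omega| = 1 (stand-in for RW3)."""
    w1 = w - dt * (2.0 / 3.0) * R(w)
    w2 = w - dt * (R(w) + R(w1)) / 3.0
    return w - dt * (0.25 * R(w) + 0.75 * R(w2))


def richardson_step(step, R, w, dt, p):
    """Combine one step of size dt with two steps of size dt/2, as in (17)."""
    w_full = step(R, w, dt)
    w_half = step(R, step(R, w, 0.5 * dt), 0.5 * dt)
    return (2 ** p * w_half - w_full) / (2 ** p - 1)


def extrapolated_error(n_steps, t_end=1.0):
    """Error at t_end for the assumed test problem dw/dt = -w**2, w(0) = 1."""
    R = lambda w: w ** 2
    w = np.array([1.0])
    dt = t_end / n_steps
    for _ in range(n_steps):
        w = richardson_step(rk3_step, R, w, dt, p=3)
    return abs(w[0] - 1.0 / (1.0 + t_end))


def extrapolated_order(n_steps):
    """Order estimated from the error ratio when the step is halved."""
    return np.log2(extrapolated_error(n_steps) / extrapolated_error(2 * n_steps))
```

Each extrapolated step calls the base scheme three times, which is the factor of three in cost noted above; `extrapolated_order(20)` should be close to 4 for this smooth, non-stiff problem.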

3.3. Local Error Estimation and Time-Step Control

Given that the Rosenbrock W-methods may have unpredictable error behaviour due to the implicit term, error estimation and time-step control are more important than for more standard time-integration schemes.

A local error estimation and a time-step control algorithm based on embedded lower-order schemes [2, 14] can be implemented for the third- and fourth-order Runge-Kutta methods and used to attempt to achieve a given local error with minimal computational effort. An estimate of the local error can be determined very easily and cheaply, since two approximations of $w(t^{n+1})$ of two different orders are already available in both schemes.

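For RW3 we obtain $w^{n+1}$ from the execution of the final stage, and $\tilde{w}^{n+1}$ from execution of the additional stage of the semi-implicit scheme:

$$w^{n+1} = w(t^{n+1}) + O(\Delta t^4), \qquad \tilde{w}^{n+1} = w(t^{n+1}) + O(\Delta t^3). \tag{18}$$

The local error $\varepsilon$ and the estimate of the local error $\hat{\varepsilon}$ are defined respectively as

$$\varepsilon = \| w^{n+1} - w(t^{n+1}) \| \approx O(\Delta t^4), \tag{19}$$

$$\hat{\varepsilon} = \| w^{n+1} - \tilde{w}^{n+1} \| \approx O(\Delta t^3), \tag{20}$$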

whereby $\| \cdot \|$ is some appropriate norm.

Similarly for RW4 we take the result of the Richardson extrapolation, $\bar{w}^{n+1}$, which is of $O(\Delta t^5)$, and the result of the last stage of RW3, $w^{n+1}$, which is of $O(\Delta t^4)$:

$$\bar{w}^{n+1} = w(t^{n+1}) + O(\Delta t^5), \qquad w^{n+1} = w(t^{n+1}) + O(\Delta t^4). \tag{21}$$

The local error can again be estimated applying an appropriate norm,

$$\hat{\varepsilon} = \| \bar{w}^{n+1} - w^{n+1} \|. \tag{22}$$

Since the order of the error estimate is in both cases smaller than the order of the true error, for sufficiently small $\Delta t$ the inequality $\varepsilon < \hat{\varepsilon}$ holds, and the estimate $\hat{\varepsilon}$ can be regarded as an upper bound on the local error $\varepsilon$.

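An improved time-step for the current iteration, $\Delta t_{\mathrm{actual}}$, can now be found by multiplying the actual time-step with a factor λ computed from the error estimate $\hat{\varepsilon}$, so that

$$\Delta t_{\mathrm{actual}} = \lambda \cdot \Delta t, \qquad \lambda = \left( \frac{\tau}{\hat{\varepsilon}} \right)^{\frac{1}{p+1}}, \tag{23}$$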

where τ is a given error tolerance and $p$ is the order of the particular Runge-Kutta scheme. If it turns out that $\lambda < \lambda_{\min}$ (which is here taken to be $\lambda_{\min} = 0.5$), then the step that was taken using $\Delta t$ is considered to have been unsuccessful, and is repeated starting with the stored solution $w^n$ and using the smaller time-step $\Delta t_{\mathrm{actual}} = \lambda_{\min} \cdot \Delta t$. Otherwise the step is considered to have been successful and the time-step for the next step is computed as

$$\Delta t_{\mathrm{new}} = \bar{\lambda} \cdot \Delta t, \qquad \bar{\lambda} = \min\{\lambda, \lambda_{\max}\}, \tag{24}$$

where the upper limit on λ is set to $\lambda_{\max} = 1.5$, to prevent the time-step exploding if the error estimator happens to be very close to zero.
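The accept/reject logic of (23) and (24) can be written as a small function. This is a sketch of the control rule as described above; the function name, the return convention, and the handling of an error estimate that is exactly zero (treated here as λ = λmax) are assumptions made for illustration.

```python
def control_time_step(dt, err_est, tol, p, lam_min=0.5, lam_max=1.5):
    """Return (accepted, dt_next) for a step of size dt with error estimate err_est.

    If the step is rejected, dt_next is the reduced step with which to repeat it;
    if accepted, dt_next is the step size proposed for the following step.
    """
    if err_est > 0.0:
        lam = (tol / err_est) ** (1.0 / (p + 1))   # equation (23)
    else:
        lam = lam_max                              # assumed: zero estimate, grow by the cap
    if lam < lam_min:
        return False, lam_min * dt                 # repeat from the stored w^n
    return True, min(lam, lam_max) * dt            # equation (24)
```

For example, with p = 3 an estimate equal to the tolerance leaves the time-step unchanged, an estimate far below the tolerance increases it by the capped factor 1.5, and an estimate far above the tolerance rejects the step and halves the time-step.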

4. APPROXIMATE NEWTON OPERATORS

Two particular implicit operators are considered for the study of the RW schemes, LU-SGS and the Rossow operator. In essence there are two choices to be made for any approximate implicit operator: the manner in which the Jacobian is approximated, and the manner in which the linear system is solved. These two choices should not be made independently. On the most basic level it makes no sense, for example, to combine a very simple representation of the Jacobian with a very accurate solution of the linear system, or vice versa.

Both schemes described here are based on Jacobians of first-order accurate discretizations of the convective terms, denoted $A_1$, rather than Jacobians of second-order discretizations, $A_2$. This choice has many advantages. Because of the smaller stencil $A_1$ has considerably less fill-in (on a regular tetrahedral mesh in 3d a cell has 15 immediate neighbours and about 77 next-neighbours), and is therefore significantly cheaper to construct and store. $A_1$ is much better conditioned, with more diagonal dominance; Jacobi or Gauss-Seidel iterations may be used, whereas applied to $A_2$ they usually diverge without Krylov stabilization. Using $A_2$ gives something close to a Newton method, so that a start-up technique is needed. Lastly it may be shown that using $A_1$ with a second-order spatial discretization for hyperbolic problems gives an iterative method (the classical defect-correction method) that smoothes low-frequency solution error effectively, giving rapid convergence of the solution [4].

For the upwind scheme described in Section 2 there is a natural choice of first-order discretization (and hence $A_1$) obtained by using piecewise constant solution reconstruction rather than piecewise linear.

4.1. Lower-Upper Symmetric Gauss-Seidel (LU-SGS)

This method was developed in [6] based on work of Yoon and Jameson [22], to be an implicit method exceptionally cheap in CPU time per iteration and memory costs. In fact the scheme is cheaper in both respects than a 3-stage explicit Runge-Kutta scheme in the case of the second-order finite volume method of the DLR TAU-Code. Nonetheless it is also unconditionally stable in practice for inviscid problems.

This is achieved simply by observing the form of the convective Jacobian: the residual of the convective terms and boundary conditions for a first-order discretization may be written

$$R_i = \sum_{j \in N(i)} f^c(w_i, w_j; n_{ij}) + \sum_{j \in B(i)} f_b(w_i; n_j), \tag{25}$$

whereby any consistent conservative numerical flux f may be written in dissipation form

$$f(w_L, w_R; n_{ij}) = \frac{1}{2} \left( f^c(w_L) + f^c(w_R) \right) \cdot n_{ij} - \frac{1}{2} d(w_L, w_R; n_{ij}), \tag{26}$$

where d is some dissipation operator. Substituting into (25) gives

$$R_i = \frac{1}{2} f^c(w_i) \cdot \sum_{j \in N(i)} n_{ij} + \frac{1}{2} \sum_{j \in N(i)} f^c(w_j) \cdot n_{ij} - \frac{1}{2} \sum_{j \in N(i)} d(w_i, w_j; n_{ij}) + \sum_{j \in B(i)} f_b(w_i; n_j).$$

However for a closed control volume $i$, not touching any boundaries, the sum of the normal vectors must equal zero:

$$\sum_{j \in N(i)} n_{ij} = 0.$$

The block-diagonal entries of the Jacobian ∂Ri/∂wi, are then just

$$\frac{\partial R_i}{\partial w_i} = -\frac{1}{2} \sum_{j \in N(i)} \frac{\partial d(w_i, w_j; n_{ij})}{\partial w_i}, \tag{27}$$

for a control-volume with no boundary faces. The diagonal blocks are the only block entries that must be explicitly inverted during a Jacobi or Gauss-Seidel sweep, thus by choosing a numerical flux with a particularly simple form of dissipation (for the Jacobian only) the cost of the iteration may be reduced dramatically. In particular for the Lax-Friedrichs flux (6)

$$\frac{\partial d_{\mathrm{L\text{-}F}}(w_i, w_j; n_{ij})}{\partial w_i} = |\lambda| I,$$

so inversion is trivial and storage of the diagonal requires storage of only one floating-point value per control volume. In order to preserve this simple form, boundary conditions and viscous Jacobians are accounted for with their eigenvalues only.

Given the simplicity of this Jacobian, a highly accurate linear solver is not required, and symmetric Gauss-Seidel (SGS) is applied. In principle SGS provides communication between every pair of nodes in the grid within one iteration, thereby removing any Courant-Friedrichs-Lewy (CFL) restriction. Also, investigations have shown that a single SGS iteration provides the best compromise between stability and iteration cost [6]. This iteration may be most conveniently written as an approximate factorization of the linear system $A_1 x = b$:

$$(D + L) \cdot D^{-1} \cdot (D + U) \, x = b,$$

where $D$, $L$ and $U$ are the diagonal, lower-triangular and upper-triangular entries of $A_1$, the LU-SGS Jacobian, and $x$ is the approximate solution.
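The approximate factorization amounts to one forward and one backward sweep. The sketch below applies it to a small dense matrix with a scalar diagonal; it is illustrative only and does not reproduce the matrix-free, edge-based implementation of an unstructured solver. The function name and the test matrix are assumptions.

```python
import numpy as np


def lu_sgs_solve(A, b):
    """Solve (D + L) D^{-1} (D + U) x = b for the splitting A = L + D + U."""
    n = len(b)
    d = np.diag(A)
    # Forward sweep: (D + L) y = b, using the strictly lower part of A.
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - A[i, :i] @ y[:i]) / d[i]
    # Backward sweep: (D + U) x = D y, using the strictly upper part of A.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (d[i] * y[i] - A[i, i + 1:] @ x[i + 1:]) / d[i]
    return x


# Small diagonally dominant example (assumed data).
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x = lu_sgs_solve(A, b)
```

Since $(D + L) D^{-1} (D + U) = A_1 + L D^{-1} U$, the result satisfies the original system only up to the factorization error $L D^{-1} U x$, which is small when the matrix is strongly diagonally dominant.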

4.2. The Rossow Operator

The main features of this operator are described only briefly here; the interested reader is directed to [18] and [16].

The Jacobian of the operator is based on the first-order Roe flux, giving full diagonal blocks in the Jacobian (as $\partial d / \partial w$ is full) that require inversion, but also giving a more accurate representation of the spatial discretization. The accuracy is further improved by using reconstructed values of the solution on the control volume faces in the Jacobian, a hybrid of a first- and second-order Jacobian that does not increase the fill-in. In order to make the method less costly the Jacobian is constructed in primitive variables, for which the Roe flux Jacobian terms take a simple form (the flux balance continues to be computed in conservative variables to ensure numerical conservation).

The resulting linear system is solved with multiple symmetric Gauss-Seidel sweeps (in the numerical examples that follow, 3), with direct inversion of the diagonal blocks using e.g. LU factorization.

This operator was initially developed for convergence to a steady state, and was found to be most efficient and robust when applied as a residual smoother (in the sense of the Jameson-Schmidt-Turkel (JST) scheme [11]) at each stage of an explicit (non-time-accurate) Runge-Kutta scheme [18]. Hence it is a promising candidate for a RW implicit operator.

5. NUMERICAL RESULTS

The stabilized Runge-Kutta methods of 3rd (RW3) and 4th order (RW4), as well as their explicit counterparts RK3 and RK4, have been applied to several two-dimensional problems. They are compared with existing time-accurate time integration methods: a 2nd order two-stage explicit Runge-Kutta method (RK2) and fully implicit second- and third-order dual-time schemes, denoted DUAL2 and DUAL3 respectively in the following. For these schemes at each time-step the inner iteration is converged to a point where integrated forces on the geometry are no longer varying within the accuracy tolerances of the spatial discretization.

The aim of these comparisons is to determine the stability and relative efficiency of the various schemes. Efficiency will be measured in terms of CPU time required to achieve a given accuracy, where accuracy will always be determined with respect to a highly converged reference solution (here obtained with a high-order explicit Runge-Kutta scheme). In this context stability restrictions may be seen as placing a lower limit on the CPU time required to obtain any given accuracy. For example, as already discussed, the RW schemes will always be less accurate than the corresponding explicit schemes for any given time step, but will have a greater stability, allowing less accurate but cheap solutions not possible with the explicit scheme.

In the figures in the following sections the stability limits of the methods will be shown implicitly by termination of the error curve at large time steps.

5.1. An Expansion Fan in a Shock tube on a Structured Grid

In the first test-case a Riemann problem of an expansion fan in a shock tube is simulated. The one-dimensional solution is modeled on the two-dimensional structured grid shown in Figure 1, which has cells varying in aspect ratio from 1 : 1 on the lower wall to 1 : 100 near the upper wall. This irregular cell distribution was chosen in an attempt to model a typical computational grid, where cell sizes and aspect ratios vary significantly over the domain. Choosing an isotropic grid would favour explicit methods, as local CFL conditions would lead to broadly similar time steps over the entire domain.

Figure 1: Part of structured shock tube grid for the Riemann problem test case. Cell aspect ratio reaches 1 : 100.

The flow is initialized as a discontinuity at x = 0, and after a short time the normalized pressure and density are as shown in Figure 2. The solution is a pure expansion fan; the presence of a shock in the solution was deliberately avoided in order to concentrate on the time-accuracy aspects of the problem. The AUSM-DV scheme was used to second-order in space. Because the LU-SGS scheme is unconditionally stable for this case, and significantly cheaper than the Rossow scheme, only Rosenbrock W-methods with LU-SGS were used.

[Plot: normalized ρ and p against X; curves: ρ - Density, p - Pressure.]

Figure 2: Normalized density and pressure distributions of the expansion fan shortly after initialization.

A convergence study on the time step ∆t for the various schemes is depicted in Figure 3. The L2-error of the pressure over all points in the field is plotted logarithmically against the time-step, so that the gradient of each curve is the achieved order of that method.

[Plot: L2 error in p against time-step; curves: DUAL2, DUAL3, RK2, RW3, RW4, RW3 optimal γ.]

Figure 3: L2-norm error in pressure component of solution against ∆t for the shock tube.

[Plot: L2 error in p against CPU time; curves: DUAL2, DUAL3, RK2, RW3, RW4.]

Figure 4: L2-norm error in pressure component of solution against CPU time for the shock tube.

The explicit scheme RK2 is plotted in the range starting with the largest stable time-step to the point where the error is dominated by floating-point precision errors. Both fully implicit dual-time schemes are unconditionally stable, but require a considerable amount of computational time, and are therefore plotted in a range starting with a time step close to the upper limit for obtaining a reasonable solution, down to the point where the calculation is too expensive to be practical in any engineering situation. In contrast the RW results may be readily calculated for the entire time step range, as shown. Over the range for which the explicit third- and fourth-order Runge-Kutta schemes were stable the resulting error was again too small to estimate reliably.

For each scheme the expected order of convergence is obtained. Both semi-implicit methods RW3 and RW4 start with a convergence of approximately second-order, which increases for smaller time steps to the formal asymptotic order of 3 and 4 respectively. This variation of order is an expected consequence of the addition of the potentially large implicit "error" terms. While they certainly decay at the theoretical rate as ∆t → 0, for finite and large ∆t there is no reason why they may not dominate any other term present. In practical situations this could make it difficult to be certain that the method was behaving time-accurately at all, hence the necessity for the step-size control of Section 3.3. It is also confirmed in these results that for any given time-step the explicit scheme is more accurate than the RW schemes, again a consequence of artificially increasing the level of truncation error.

The curve labeled “RW3 optimal γ” shows an attempt to reduce error by choosing the smallest value of the coefficient γ for which the RW3 method is stable for each given time-step. Unfortunately the order could not be improved, as γ could only be reduced to a value significantly less than one as the stability limit of the explicit method was approached. This is likely a consequence of the LU-SGS scheme, which is unconditionally stable only for γ close to one, rather than a feature of the RW3 scheme itself, hence alternative implicit operators may well modify this result.

The more practically relevant information is CPU time against solution error, which is shown in Figure 4. The CPU times were not measured for each run individually, as variations in load and implementational differences tend to make the results too noisy. Rather the relative costs of individual iterations were estimated on the basis of algorithmic complexity, and the results were verified against representative solver runs. The base unit was taken to be the cost of one LU-SGS iteration. The dual-time schemes are very costly and need 50-100 LU-SGS iterations per time-step. RK2 is - with a CPU-time that approximately equals 1.5 LU-SGS iterations - the cheapest scheme. RW3 takes about 6 and RW4 about 18 LU-SGS iterations for one time-step. Based on these factors the L2-errors, multiplied by the number of corresponding LU-SGS iterations, are plotted against the approximate CPU time in Figure 4.
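One reading of this cost model is that the approximate CPU time of a run is the number of time-steps multiplied by the per-step cost in LU-SGS units. The sketch below encodes the per-step costs quoted above; the function names and the two dual-time entries (taken as the ends of the quoted 50-100 range) are assumptions made for illustration only.

```python
# Per-time-step cost in units of one LU-SGS iteration, as quoted in the text.
# The two dual-time entries are the ends of the quoted 50-100 range.
COST_PER_STEP = {
    "RK2": 1.5,
    "RW3": 6.0,
    "RW4": 18.0,
    "DUAL (low)": 50.0,
    "DUAL (high)": 100.0,
}

def cpu_cost(scheme, n_steps):
    """Approximate cost of a run, in LU-SGS iterations."""
    return n_steps * COST_PER_STEP[scheme]

def break_even_step_ratio(expensive, cheap):
    """Factor by which the costlier scheme must enlarge its time-step
    to match the total cost of the cheaper one over the same interval."""
    return COST_PER_STEP[expensive] / COST_PER_STEP[cheap]

print(cpu_cost("RW3", 1000))                  # cost of 1000 RW3 steps
print(break_even_step_ratio("RW3", "RK2"))    # step-size gain RW3 needs over RK2
print(break_even_step_ratio("RW4", "RW3"))    # step-size gain RW4 needs over RW3
```

Under these factors RW3 must take time-steps at least 4 times larger than RK2 before it is cheaper for the same simulated interval, and RW4 at least 3 times larger than RW3.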

The explicit scheme RK2 provides by far the highest accuracy for a given time-step, but the serious deficiency of the low stability limit becomes obvious as well, even for the relatively low level of anisotropy in this case. If an L2-error of order one is required for example, rather than the less realistic 10^-5, then the RW schemes represent a saving of a factor 10 in CPU time over the RK2. The semi-implicit schemes in fact achieve a similar performance to the dual-time schemes.

The simulation of the expansion fan demonstrates very well the extension of the stability limit compared to fully explicit methods. The following test cases show that a significant speed up over implicit dual-time schemes may be achieved in addition.

5.2. Euler Flow over a Forward Facing Step on a Rectangular Grid

Inviscid flow over a forward facing step, impulsively starting from rest to a Mach number M = 3.0 is simulated in two dimensions on a uniform rectangular grid, shown in Figure 5.


Figure 5: Partially developed flow over an impulsively started forward facing step at Mach 3, with computational grid.

A shock wave emerges in front of the step and eventually grows sufficiently to be reflected at the channel walls. After some time the flow achieves a stationary behavior. The pressure distribution of the partially developed flow after a single shock reflection is also depicted in Figure 5.

As before a convergence study of all implemented schemes is shown in Figure 6, where the L2-error in pressure is plotted logarithmically against the time-step. Again only LU-SGS is applied with RW. In this case the RW methods meet their theoretical order of convergence immediately. Most interestingly the presence of the convergence curves for the explicit RK schemes RK3 and RK4 allows direct inspection of the magnitude of the error contributed by the implicit stabilization, and it is seen to be about three orders of magnitude in the L2-error. This large decrease in accuracy would be of more concern were it not for the fact that the fully implicit schemes have a similar disadvantage over the explicit schemes. As it stands the RW4 scheme achieves approximately the same accuracy for any given time-step as the second- and third-order dual-time methods. We might therefore expect a significant gain in efficiency in terms of CPU time.

The improvement in efficiency for both RW methods is demonstrated in Figure 7. The accuracy of the commonly applied dual-time methods for the two largest step sizes may be regarded as an approximate reference point for typical engineering accuracy. In this range the RW schemes provide a reduction in CPU-time of a factor of 3 to 10 over the fully implicit methods. Notable is that the RW4 scheme loses much of its large accuracy advantage over RW3, due to its high per-step expense, indicating that a fourth-order RW method with a significantly lower number of stages would be very desirable. RK3 and RK4 perform excellently here, primarily as a consequence of the uniform cell size - and hence uniform time-step.

The results of the forward facing step problem demonstrate the potential of the presented methods regarding stability and efficiency compared to fully implicit methods.

Figure 6: L2-norm error in pressure component of solution against ∆t for forward-facing step. (Axes: time-step, L2 error in p; legend: DUAL2, DUAL3, RK3, RK4, RW3, RW4.)

Figure 7: L2-norm error in pressure component of solution against CPU time for forward-facing step. (Axes: CPU time, L2 error in p; legend: DUAL2, DUAL3, RK3, RK4, RW3, RW4.)

5.3. Von Karman Vortex Street

The final test-case is simulation of the classical von Karman vortex street induced by a circular cylinder. This necessarily viscous case is performed at a Reynolds number of 1000, and in the grid shown in Figure 8 the resolution of the boundary-layer using anisotropic structured cells is visible. The flow is unsteady by virtue of a self-excited instability, which causes periodic separation of vortices from the top and bottom of the cylinder, resulting in a periodic vertical force on the cylinder shown in Figure 9. Because the calculations are started from a steady state, and because the oscillation is self-excited, it is difficult to compare solutions for different time-steps to a reference solution. Small differences near the start of the calculation can result in solutions being out-of-phase at a future time. Therefore the amplitude and frequency obtained with various time-steps are compared instead. A second-order accurate discretization employing Roe’s flux is used.
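As an illustration of how amplitude and frequency might be extracted from a sampled CL history, the sketch below uses upward crossings of the mid-level with linear interpolation. The function name, the crossing-based approach and the synthetic sine signal are all assumptions for illustration; the post-processing actually used for the results here is not specified.

```python
import math

def amplitude_and_frequency(times, values):
    """Estimate amplitude and frequency of a periodic signal.

    Amplitude is half the peak-to-peak range; frequency is taken from the
    mean spacing of upward crossings of the mid-level.
    """
    mid = 0.5 * (max(values) + min(values))
    crossings = []
    for i in range(1, len(values)):
        a = values[i - 1] - mid
        b = values[i] - mid
        if a < 0.0 <= b:
            # Linear interpolation for the crossing time within the interval.
            frac = -a / (b - a)
            crossings.append(times[i - 1] + frac * (times[i] - times[i - 1]))
    if len(crossings) < 2:
        raise ValueError("need at least two upward crossings")
    period = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    amplitude = 0.5 * (max(values) - min(values))
    return amplitude, 1.0 / period

# Synthetic signal (not simulation data): amplitude 1.2, frequency 5.
times = [i / 2000.0 for i in range(2001)]
values = [1.2 * math.sin(2.0 * math.pi * 5.0 * t + 0.3) for t in times]
amplitude, frequency = amplitude_and_frequency(times, values)
print(amplitude, frequency)
```

Both quantities are insensitive to a phase shift of the signal, which is why they remain comparable between runs that drift out of phase.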

The behaviour of the amplitude against time-step and CPU time is shown in Figures 10 and 11 respectively, while the corresponding behaviour of the frequency is shown in Figures 12 and 13. Error is no longer shown as it is subject to too much noise, rather the solution values are plotted directly. Also plotted are vertical lines indicating time-steps corresponding to 25, 50 and 100 time-steps per period, which is a typical range of time-steps employed in practice with second-order fully implicit methods.

The results for amplitude and frequency are broadly similar, giving confidence that these values are representative of the time accuracy of the solution as a whole. The first observation is that RW with LU-SGS is no longer unconditionally stable. It is however still stable for time-steps 100 times greater than the maximum stable time-step of RK3, at about 100 steps per oscillation. RW with the Rossow operator is in this case unconditionally stable as expected given the more accurate Jacobian, but also significantly more accurate than RW with LU-SGS. The cause for this large discrepancy in accuracy is not understood, and it is as yet unclear if it arises for other cases. In any event the advantage is substantially reduced when comparisons are made on the basis of CPU time, where the low cost LU-SGS makes up the difference.

Figure 8: Computational grid for the von Karman vortex street test case.

Figure 9: Unsteady behaviour of CL starting from a stationary solution. (Axes: normalized time, CL.)

For this case the explicit scheme is extremely ineffective, requiring about 10000 time-steps per period before it becomes stable, and then achieving lower accuracy than the RW methods for the same time-step. This might be explained by the close proximity to the stability boundary of the scheme causing greater errors than would otherwise be expected.

As for the other test-cases, dual-time is seen to be more accurate for a given time-step, but its high cost makes it less efficient than either RW scheme in terms of CPU time. The gain is perhaps a factor of 3-10 in CPU time, depending on the desired level of error. However this case in particular highlights the need for effective error control with RW schemes, as conventional guidelines of 25-100 time-steps per period for reasonable accuracy no longer apply reliably, and accuracy can decrease very rapidly if the time-step is too large.
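For orientation, the elementary step-size controller common in the ODE literature is sketched below: given a local error estimate err that scales as ∆t^(q+1), the new step is ∆t multiplied by a safety factor times (tol/err)^(1/(q+1)), with the growth and shrink factors bounded. This is a generic textbook controller under assumed parameter names and default values, not necessarily the procedure of Section 3.3, and its reliability for RW schemes is precisely what remains to be investigated.

```python
def propose_time_step(dt, err_estimate, tol, est_order,
                      safety=0.9, min_factor=0.2, max_factor=5.0):
    """Return (new_dt, accepted) for an elementary step-size controller.

    est_order is q, where the local error estimate is assumed to scale
    as dt**(q + 1). All default parameter values are illustrative.
    """
    if err_estimate <= 0.0:
        # No measurable error: grow by the largest allowed factor.
        return dt * max_factor, True
    factor = safety * (tol / err_estimate) ** (1.0 / (est_order + 1))
    factor = min(max_factor, max(min_factor, factor))
    return dt * factor, err_estimate <= tol

# Error 8 times the tolerance with q = 2: step rejected and roughly halved.
new_dt, accepted = propose_time_step(1.0e-3, 8.0e-6, 1.0e-6, 2)
print(new_dt, accepted)

# Error far below tolerance: step accepted, growth capped at max_factor.
grown_dt, grown_ok = propose_time_step(1.0e-3, 1.0e-12, 1.0e-6, 2)
print(grown_dt, grown_ok)
```

The cap on the growth factor matters in the present setting, since accuracy was seen to deteriorate very rapidly once the time-step becomes too large.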

6. CONCLUSIONS

A class of time-integration methods for PDEs, the Rosenbrock W-methods, has been introduced as a potential alternative to existing fully explicit, fully implicit (dual-time) and semi-implicit schemes. A particular third-order scheme has been shown theoretically to maintain its order of accuracy in time for an arbitrary choice of implicit operator, opening the possibility of using existing approximate operators developed for steady problems in time-accurate iterations. It has been shown numerically to be competitive with or superior to dual-time and explicit Runge-Kutta methods for several inviscid and viscous compressible flow problems.

Given the novelty of the method, there is a great deal of further work to be done - it is hoped that the present results give an impression of the promise of the method, and will encourage further investigation. Firstly the development of Rosenbrock W-methods of fourth- and higher-order with a low number of stages will provide an immediate improvement in accuracy and efficiency, complementing any other developments. There is a requirement for new approximate implicit operators, built specifically to the requirements of stability and low cost (without necessarily giving good convergence in steady cases). Candidates for high-Reynolds number Navier-Stokes problems might be line-implicit schemes, with lines constructed across the boundary-layer, thereby specifically tackling the main source of stiffness in the problem. Unconditionally stable implicit operators for problems with turbulence modeling are also a challenge. The reliability of conventional error estimation and time-step control algorithms should be investigated as they apply to RW schemes, given the peculiar modification to the truncation error. Eventually Rosenbrock W-methods might be used as the implicit part of semi-implicit schemes, thereby limiting the negative effect of the approximate implicit operator on accuracy.

Figure 10: Amplitude of CL oscillation against ∆t for von Karman case. (Axes: time-step, amplitude; legend: DUAL2, RK3, RW3 LU-SGS, RW3 Rossow; vertical lines at Period/25, Period/50 and Period/100.)

Figure 11: Amplitude of CL oscillation against CPU time for von Karman case. (Axes: CPU time, amplitude; legend: DUAL2, RK3, RW3 LU-SGS, RW3 Rossow.)

ACKNOWLEDGMENTS

Thank you to Volker Hannemann, who contributed much of the original motivation, the shock-tube case and many helpful suggestions. Also thanks to Ralf Heinrich for providing the von Karman test-case and an exact Riemann solver for setting up the shock-tube case.

Figure 12: Frequency of CL oscillation against ∆t for von Karman case. (Axes: time-step, frequency; legend: DUAL2, RK3, RW3 LU-SGS, RW3 Rossow; vertical lines at Period/25, Period/50 and Period/100.)

Figure 13: Frequency of CL oscillation against CPU time for von Karman case. (Axes: CPU time, frequency; legend: DUAL2, RK3, RW3 LU-SGS, RW3 Rossow.)

References

[1] G. Bader and P. Deuflhard. A semi-implicit mid-point rule for stiff systems of ordinary differential equations. Numerische Mathematik, 41:373–398, 1983.

[2] J. Blazek. Computational Fluid Dynamics: Principles and Applications. Elsevier Science, 2001. ISBN 0-08-043009-0.

[3] John Butcher. Numerical Methods for Ordinary Differential Equations. Wiley, 2nd edition, 2008. ISBN: 0470723351.

[4] J.-A. Desideri and P. Hemker. Convergence analysis of the defect-correction iteration for hyperbolic problems. SIAM Journal on Scientific Computing, 16(1):88–118, 1995.

[5] P. Deuflhard and F. Bornemann. Scientific Computing with Ordinary Differential Equations, volume 42 of Texts in Applied Mathematics. Springer, 2002.

[6] R.P. Dwight. Efficiency Improvements of RANS-Based Analysis and Optimization using Implicit and Adjoint Methods on Unstructured Grids. PhD thesis, School of Mathematics, University of Manchester, 2006.

[7] T. Gerhold, M. Galle, O. Friedrich, and J. Evans. Calculation of complex 3D configurations employing the DLR TAU-Code. In AIAA Paper Series, AIAA-1997-0167, 1997.

[8] E. Hairer and C. Lubich. Extrapolation of stiff differential equations. Numerische Mathematik, 52:377–400, 1988.

[9] Ernst Hairer and Gerhard Wanner. Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems. Springer, 3rd edition, 1996. ISBN: 3540604529.

[10] A. Jameson. Time dependent calculations using multigrid with applications to unsteady flows past airfoils and wings. AIAA Paper, AIAA-91-1596, 1991.

[11] A. Jameson, W. Schmidt, and E. Turkel. Numerical solutions of the Euler equations by finite volume methods using Runge-Kutta time-stepping schemes. In AIAA Paper Series, AIAA-1981-1259, 1981.

[12] P. Kaps, S.W.H. Poon, and T.D. Bui. Rosenbrock methods for stiff ODEs: A comparison of Richardson extrapolation and embedding technique. Computing, 34:17–40, 1984.

[13] U. Mayer and R. Dwight. A fourth order semi-implicit Runge-Kutta method for the compressible Euler equations. Institute Report AS IB-124-2007/3, ISSN 1614-7790, DLR, April 2007.


[14] N. Nikitin. Third-order-accurate semi-implicit Runge-Kutta scheme for incompressible Navier-Stokes equations. International Journal for Numerical Methods in Fluids, 51:221–233, 2006.

[15] P.L. Roe. Characteristic-based schemes for the Euler equations. Annual Review of Fluid Mechanics, 18:337–365, 1986.

[16] C.-C. Rossow. Convergence acceleration for solving the compressible Navier-Stokes equations. 43rd AIAA Aerospace Sciences Meeting and Exhibit, Reno, Nevada, 2005. AIAA-2005-0094, 2005.

[17] T. Steihaug and A. Wolfbrandt. An attempt to avoid exact Jacobian and nonlinear equations in the numerical solution of stiff differential equations. Mathematics of Computation, 33:521–534, 1979.

[18] R.C. Swanson, E. Turkel, C.-C. Rossow, and V. Vatsa. Convergence acceleration for multistage time-stepping schemes. AIAA Paper, 2005.

[19] V. Venkatakrishnan. Convergence to steady state solutions of the Euler equations on unstructured grids with limiters. Journal of Computational Physics, 118:120–130, 1995.

[20] J.J. Yoh and X. Zhong. New hybrid Runge-Kutta methods for unsteady reactive flow simulation. AIAA Journal, 42(8):1593–1600, 2004.

[21] J.J. Yoh and X. Zhong. New hybrid Runge-Kutta methods for unsteady reactive flow simulation: Applications. AIAA Journal, 42(8):1601–1611, 2004.

[22] S. Yoon and A. Jameson. An LU-SSOR scheme for the Euler and Navier-Stokes equations. AIAA Journal, 26:1025–1026, 1988.
