34
INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS Int. J. Numer. Meth. Fluids 2004; 00:1–6 Prepared using fldauth.cls [Version: 2002/09/18 v1.01] The Independent Set Perturbation Adjoint Method: A new method of differentiating mesh based fluids models F. Fang *1 , C. C. Pain 1 , I. M. Navon 2 , G. J. Gorman 1 , M. D. Piggott 1 , P. A. Allison 1 1 Applied Modelling and Computation Group, Department of Earth Science and Engineering, South Kensington Campus, Imperial College London, SW7 2AZ, UK URL: http://amcg.ese.imperial.ac.uk 2 Department of Scientific Computing, Florida State University, Tallahassee, FL, 32306-4120, USA URL: http://people.scs.fsu.edu/vnavon/index.html SUMMARY A new scheme for differentiating complex mesh-based numerical models (e.g. finite element models), the Independent Set Perturbation Adjoint method (ISP-Adjoint), is presented. Differentiation of the matrices and source terms making up the discrete forward model is realized by a graph colouring approach (forming independent sets of variables) combined with a perturbation method to obtain gradients in numerical discretisations. This information is then convolved with the ‘mathematical adjoint’ which uses the transpose matrix of the discrete forward model. The adjoint code is simple to implement even with complex non-linear parameterizations, discretisation methods and governing equations. Importantly, the adjoint code is independent of the implementation of the forward code. Copyright c 2004 John Wiley & Sons, Ltd.

The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS

Int. J. Numer. Meth. Fluids 2004; 00:1–6 Prepared using fldauth.cls [Version: 2002/09/18 v1.01]

The Independent Set Perturbation Adjoint Method: A new

method of differentiating mesh based fluids models

F. Fang∗1, C. C. Pain1, I. M. Navon2, G. J. Gorman1, M. D. Piggott1, P. A. Allison1

1 Applied Modelling and Computation Group,

Department of Earth Science and Engineering,

South Kensington Campus,

Imperial College London, SW7 2AZ, UK

URL: http://amcg.ese.imperial.ac.uk

2 Department of Scientific Computing,

Florida State University,

Tallahassee, FL, 32306-4120, USA

URL: http://people.scs.fsu.edu/vnavon/index.html

SUMMARY

A new scheme for differentiating complex mesh-based numerical models (e.g. finite element models),

the Independent Set Perturbation Adjoint method (ISP-Adjoint), is presented. Differentiation of the

matrices and source terms making up the discrete forward model is realized by a graph colouring

approach (forming independent sets of variables) combined with a perturbation method to obtain

gradients in numerical discretisations. This information is then convolved with the ‘mathematical

adjoint’ which uses the transpose matrix of the discrete forward model. The adjoint code is simple

to implement even with complex non-linear parameterizations, discretisation methods and governing

equations. Importantly, the adjoint code is independent of the implementation of the forward code.

Copyright c© 2004 John Wiley & Sons, Ltd.

Page 2: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

2 F. FANG ET. AL.

This greatly reduces the effort required to implement the adjoint model and maintain it as the forward

model continues to be developed; as compared with more traditional approaches such as applying

automatic differentiation tools. The approach can be readily extended to reduced order models. The

method is applied to a one-dimensional Burgers’ equation problem, with a highly non-linear. high-

resolution discretisation method; and to a two-dimensional, non-linear, finite element model of an

idealised ocean gyre.

keywords: Automatic differentiation; optimisation; adjoint; finite element, POD; reduced order

models. Copyright c© 2004 John Wiley & Sons, Ltd.

1. Introduction

Adjoints to computational codes have been used for diverse applications including; data

assimilation [1, 2, 3, 4], sensitivity analysis [5, 6], adaptive observations [7] and mesh based error

measures [8, 9]. However, differentiating discrete numerical models is generally impractical to

perform by hand and remains challenging when using automatic differentiation tools (AD)

(More details on AD at http://www.autodiff.org ). Automatic differentiation also restricts the

use of many programming features, further increasing the difficulty of developing a forward

and adjoint model. Both approaches often require intimate knowledge of the forward code in

order to design a hand differentiation method or to alter the automatically produced code so

that it works efficiently [10, 11, 12]. The time it takes to conduct this task may be greater

than writing the forward model and there are also computer science problems arising from

the complexity of the computation of derivatives and memory usage [13, 14]. These challenges

often preclude the use of gradient or adjoint variational (4D-VAR; three spatial dimensions

plus time) data assimilation methods, sensitivity based methods or any other method that

uses gradients of key functionals with respect to the control variables (controls), e.g. material

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 3: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 3

properties or initial/boundary conditions of flow models.

A forward model may have an actively developed dynamic core or utilize rapidly changing

and complex parameterizations. Therefore, the application of consistent or exact gradient

methods of differentiation to the discrete problem tends to lag behind developments in the

forward model code. Despite this there have been major successes in differentiating large

models [15, 16, 17, 18, 19]. Automatic differentiation methods have been utilized on stable

models at compile time after code modifications [20, 21, 22, 23]. In addition, Marta [13]

developed a selective automatic differentiation approach where AD is used to compute only

certain terms of the discrete adjoint equations, and not to differentiate the entire solver.

The ISP-Adjoint approach outlined here is largely independent of implementation details

of the forward model. The modelling software only needs to be modified so that it can

accommodate the adjoint, e.g. the formation of the ‘mathematical adjoint’ transpose matrices

of the forward problem, as well as the accommodation of the data structures necessary

to provide access to the forward solution when time marching backwards in the gradient

calculation. Similar approaches may be applied to reduced order models such as proper

orthogonal decomposition (POD) surrogates of full models.

This approach can also be used to help accelerate the matrix equation assembly process

on the assumption that the discretised system of equations has a polynomial representation

and can thus be formed by a summation of matrices. If the system of equations can not be

represented as a non-linear polynomial it may still be approximated in this way. Thus the fast

assembly methods may be applied but to preserve accuracy the matrices must be periodically

reformed in the time marching scheme.

The remainder of this paper is set out as follows. The next section outlines the model

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 4: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

4 F. FANG ET. AL.

governing equations, followed by a description of how discrete system assembly may be

performed using graph colouring methods. This is then followed by the extension to reduced

order POD models and the extension of the method to non-quadratic discrete systems of

equations. The gradient of the discrete system (ISP-Adjoint) method is outlined and the

method is applied to two problems: the Burgers equation; and the Navier-Stokes equations

applied to ocean gyre circulation. Finally, conclusions are drawn.

2. Governing equations

Two systems of partial differential equations used to illustrate the application of the method

are the Burgers’ equation and the Navier-Stokes equations.

The one-dimensional (1D) Burgers’ equation is defined as:

∂u

∂t+ u

∂u

∂x− ν ∂

2u

∂x2= 0. (1)

where u is the velocity, x is the spatial coordinate, t is the time coordinate and ν is the viscosity

coefficient.

The 3D Navier-Stokes continuity and momentum equations are considered in the form:

∇ · u = 0, (2)

∂u∂t

= −u · ∇u− fk× u +∇ · τ −∇p, (3)

where u = (ux, uy, uz)T is the velocity vector, x = (x, y, z)T are the orthogonal Cartesian

coordinates, p is the perturbation pressure (p = p/ρ0, ρ0 is the constant reference density), f

represents the Coriolis inertial force and k = (0, 0, 1)T . The stress tensor τ is used to represent

viscous terms. The pressure variable is split into the non-geostrophic and geostrophic parts

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 5: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 5

which are solved separately. This allows the accurate representation of hydrostatic/geostrophic

balance (for details see [24]).

3. Discrete system assembly using graph colouring methods

In this section a discrete forward model is derived for a high-order discrete equation based on

polynomials. The matrices associated with the nonlinear terms can be expressed by a set of

sub-matrices. For time dependent problems these sub-matrices are time-independent and can

be easily differentiated to form adjoint systems of equations.

3.1. Discrete model for incompressible fluids

The discrete model for incompressible fluids (equations (2) and (3)) calculations can be written

for finite element methods as explained in [24] as:

Cn+1Tun+1 = 0, (4)

En+1un+1 = Bn+1un + Cn+1pn+1 + sn+1, (5)

where n is time level and s includes the discretised sources, initial and boundary conditions

and body forces. The matrices En+1 and Bn+1 are nonlinear in u and Cn+1 is the matrix

associated with the pressure vector:

pn+1 = (pn+11 , pn+1

2 , . . . , pn+1M )T .

M is the number of pressure variables or basis functions. The vector of velocities is defined as

un = (unxT ,uny

T ,unzT )T ,

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 6: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

6 F. FANG ET. AL.

with sub-vectors associated with the velocity components,

unx = (uxn1 , uxn2 , . . . , ux

nN )T ,

uny = (uyn1 , uyn2 , . . . , uy

nN )T ,

unz = (uzn1 , uzn2 , . . . , uz

nN )T .

N is the number of velocity basis functions and all subscripts indicate nodal values of the

associated variables.

3.2. Graph colouring

Graph colouring methods are commonly used to model the dependency between different

subtasks or data. Here we define the graph Gr = (Vg, Eg), where the vertex set, Vg, are the

nodes of the finite element mesh (i.e. the rows of the matrix), and the edge set, Eg, is defined

by the union of finite element stencils. The chromatic number, χ(Gr), is bounded by:

ω(Gr) 6 χ(Gr) 6 ∆(Gr) + 1,

where ω(Gr) is the clique number and ∆(Gr) is the maximum vertex degree. However, the

actual number of colours used to colour the graph will be Nc ≥ χ(Gr).

We seek independent sets, where the finite element stencil is defined by the non-zero nodes

in∫

ΩNiQkNjdΩ, where Ω is the computational domain and Ni, Nj are finite basis functions

and Qk (1 ≤ k ≤ Q, Q is the number of basis functions Qk) is the finite basis function of a

variable q and does not interact with other nodes than k of a different colour. Depending on

the form of Qk the matrix stencil, and therefore the vertex degree of the graph, will be larger.

This would be the case for a material property or non-linearity which would be expanded

q =∑Qk=1Qkqk where q = (q1, q2, ..., qQ)T .

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 7: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7

3.3. Matrix representation of non-linear discrete systems

For a non-linearity with a polynomial representation in the discrete system (e.g. the Navier-

Stokes equations discretised with a continuous Galerkin method) the matrix equations that

are formed from a summation of matrices. For illustration, we will consider here: quadratic

discrete systems; high-order polynomial discrete systems; and non-polynomial systems.

3.3.1. Quadratic discrete systems As discussed above, a quadratic non-linearity in the

discrete system with can be written in a matrix of the form:

Qqij =∫

Ω

NiqNjdΩ, (6)

where q is the velocity. The application to other finite element matrices such as those formed

from inertia or advection terms of the form∫

ΩNiun · ∇NjdΩ follow along similar lines. The

matrix Qqcij in the colouring scheme can be expressed:

Qqcij =

∫Ω

NibcNjdΩ, (7)

where,

bc =Q∑k=1

Qkbck, bck =

1 if node k is of colour c;

0 at other nodes.(8)

Applying the same approach as outlined here for matrix Q to matrices En+1 and Bn+1 in

equation (5) the matrix Q can be constructed by a set of sub-matrices using graph colouring,

Qqij =Nc∑c=1

Qcqijqcij = Qqij

+Nc∑c=1

Qcqij

(qcij − qcij), (9)

where qcij is the value of q at a node of colour c neighboring both nodes i and j (see figure 1),

N is the number of nodes. The chromatic number, Nc, is four for 2D linear element cases, see

figure 1. Note that it is useful to write the matrix (9) as a perturbation from a mean matrix

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 8: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

8 F. FANG ET. AL.

Qqij=∫

ΩNiqNjdΩ =

∑Nc

c=1 Qcqijqcij in order to linearise about a solution q =

∑Qk=1Qkqk,

e.g. the mean of q or the most recently available value of q.

For a quadratic discrete model like a finite element Bubnov-Galerkin discretisation of the

Navier-Stokes equations [26], the matrices Qq, Qcq are time-independent similar to matrices

En+1 and Bn+1 in equation (5). The matrices Qq, Qcq can be constructed by sending down

different values of the non-linear terms into the matrix assembly routines.

3.3.2. High-order polynomial discrete systems For a high-order discrete model, for example

a cubic discrete model, the matrix Quq can therefore be written:

Qrqij =∫

Ω

NirqNjdΩ, (10)

where the velocity q =∑Qk=1Qkqk. The above is repeated but assuming a constant r and thus

Qrqij = Qrqij+Nc∑c=1

Qcrqij

(qcij − qcij), (11)

where

Qrqcij =

∫Ω

NirbcNjdΩ, Qrq

c

ij=∫

Ω

NirqNjdΩ. (12)

3.3.3. Non-polynomial systems The extension to non-polynomial discrete systems which are

for example obtained by certain discretisation methods (e.g. upwind methods, discontinuous

Galerkin (D)G, flux limiting methods, or model parameterisation like Large Eddy Simulation

(LES) is realised by linearising the system about q say, if q is the non-linear variable. The

linearisation realised through q may have to be re-done every so often into a time dependent

simulation in order to achieve accuracy. So the matrices may be formed by a summation of

matrices. These make the matrices easy to differentiate and fast to assemble but it involves

an approximation. A similar approach may also be applied to reduced order models to help in

the rapid assembly of their equations as described below.

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 9: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 9

4. Discrete system of reduced order model equations

The aim behind reduced order models is to find a relatively small number of basis functions

(few relative to the full model) that can accurately represent the system dynamics. These basis

functions are often obtained from the full forward model results by applying a singular value

decomposition analysis to a number of snapshots of the forward solution to determine the

most energetic modes of the system and in this way optimally obtain basis functions. This is

the approach taken in POD to obtain the basis functions, see [27]. In this section the matrices

are generated by a summation of matrices similar to the colouring method used above for the

finite element equations.

4.1. POD reduced order model

Once the reduced order basis functions are obtained the variables u can be expressed

as an expansion of the POD basis functions Φux 1, . . . ,ΦuxPu, Φuy 1

, . . . ,ΦuyPu and

Φuz 1, . . . ,ΦuzPu (here Pu is the number of POD bases for velocity):

u(t, x, y, z) = u +Pu∑m=1

NΦum (x)aum(t), (13)

and pressure is decomposed into p = png + pg in which png, pg are the non-geostrophic and

geostrophic pressures (see [27] for details associated with pg) and the POD basis functions

Φp1, . . . ,ΦpPp (here Pp is the number of POD bases) for non-geostrophic pressure are used

to obtain:

png(t, x, y, z) = png +Pp∑m=1

Φp m(x)ap m(t), (14)

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 10: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

10 F. FANG ET. AL.

while the geostrophic pressure can be represented by a summation of the two sets of geostrophic

basis functions Φg ux m and Φg uy m, which are obtained by Φux m and Φuy m

[27]:

pg(t, x, y, z) = pg +Pu∑m=1

Φg ux m(x)aux m(t) +Pu∑m=1

Φg uy m(x)auy m

(t). (15)

where aum = (aux m, auy m, auz m)T and ap m are the coefficients to be determined for velocity

u = (ux, uy, uz)T and p respectively, png, pg and u are the means of the ensemble of

snapshots for the variables png, pg and u(t) respectively, NΦum is a diagonal matrix including

the POD basis for the velocity variable u, in the finite element method, the POD basis

Φum(x) =∑Ni=1NiΦumi and Φp m(x) =

∑Ni=1NiΦpm i, where Ni is the basis function in

the finite element and N is the number of nodes in the computational domain.

Substituting (13), (14) and (15) into (2) and (3), and taking the POD basis function Φp and

Φu as the test function for (2) and (3) respectively, then integrating over the computational

domain Ω, ∫Ω

NΦl F

((u +

Pu∑m=1

NΦum (x))aum, p, t, x

)dΩ, (16)

with F defined as in (2) and (3), where NΦl = Φp while for the continuity equation (2) and

NΦl = NΦu

l = diag(Φux l,Φuy l,Φuz l) for the Navier-Stokes equation (3). The POD reduced

order model is then obtained:

∂al∂t

=⟨F

((u +

Pu∑m=1

NΦum (x))aum, p, t, x

), NΦu

l

⟩, (17)

subject to the initial condition

au l(0) =⟨

(u(0,x)− u(x))T ,NΦu

l

⟩, (18)

where < ·, · > is the canonical inner product in L2 norm, a = (au, ap)T , 1 ≤ l ≤ Pp for the

continuity equation (2) while 1 ≤ l ≤ Pu for Navier-Stokes equation (3).

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 11: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 11

4.2. Discrete POD model

Here the POD reduced order model for the incompressible continuity and Navier-Stokes

equations, can be expressed as:⟨N

Φp

l ,∇ · (Pu∑m=1

NΦum (x)aum(t))

⟩= 0, (19)

and

∂aul(t)∂t

+⟨

NΦu

l ,

((u +

Pu∑m=1

NΦum (x)aum(t)) · ∇(u +

Pu∑m=1

NΦum aum(t))

)⟩+

⟨NΦu

l ,

(fk× (u +

Pu∑m=1

NΦum (x)aum(t))

)⟩+

⟨NΦu

l ,

(∇p− µ∇2(u +

Pu∑m=1

NΦum (x)aum(t))

)⟩= 0, (20)

where f represents the Coriolis inertial force, µ is the kinematic viscosity and k = (0, 0, 1)T .

The discrete model of (19) and (20) at the time level n + 1 can be written in a general form

in a subspace:

Cn+1TPOD an+1

u = 0, (21)

and

En+1PODan+1

u = Bn+1PODanu + Cn+1

PODan+1p + sn+1

POD, (22)

where En+1POD, Bn+1

POD (associated with the nonlinear term) and Cn+1POD ( associated with the

pressure term) are the matrices including all the discretisation of (19) and (20), sn+1POD is a

discretised source term.

The (m, l)th (here 1 ≤ m, l ≤ 3Pu ) entry of the matrices En+1POD and Bn+1

POD can be constructed

by the entries of the matrices En+1 and Bn+1 in the full model (5):

En+1POD ml =

3N∑i=1

3N∑j=1

En+1ij DΦu

miDΦu

l j , (23)

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 12: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

12 F. FANG ET. AL.

Bn+1POD ml =

3N∑i=1

3N∑j=1

Bn+1ij DΦu

miDΦu

l j , (24)

and

sn+1POD m =

3N∑i=1

sn+1i DΦu

mi, (25)

where En+1ij and Bn+1

ij are the (i, j)th entry of the matrices En+1 and Bn+1 respectively which

in the colouring scheme can be formed by summation of matrices (see (9)), sn+1i is the ith

entry of s, DΦumi and DΦu

l j are the ith and jth entries of the diagonal matrices DΦum and DΦu

l

respectively,

DΦuxm = diag(Φuxm 1, . . . ,ΦuxmN ), (26)

DΦuym = diag(Φuym 1

, . . . ,ΦuymN ), (27)

DΦuzm = diag(Φuzm 1, . . . ,ΦuzmN ). (28)

For quadratic discrete systems and a large reduction of the model space the matrices EPOD

and BPOD are most efficiently formed by the a summation of matrices in a manner similar to

that used to form the finite element equations using a colouring method. However, due to the

global nature of the POD basis functions the number of colours is equal to the number of basis

functions. Thus splitting them into the mean matricies EsubPOD, BsubPOD and matricies

perturbing the system from these EsubPODxm, BsubPODxm (and similarly for y and z) to

obtain:

En+1POD = EsubPOD+

Pu∑m=1

EsubPODxmauxn+1m +

Pu∑m=1

EsubPODymauyn+1m

+Pu∑m=1

EsubPODzmauzn+1m ,

(29)

Bn+1POD = BsubPOD+

Pu∑m=1

BsubPODxmauxn+1m +

Pu∑m=1

BsubPODymauyn+1m

+Pu∑m=1

BsubPODzmauzn+1m .

(30)

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 13: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 13

This decomposition enables any differentiation of these matrices to be easily performed as well

as for the reduced model to be rapidly formed every time step.

5. Discrete systems of adjoint equations

A consistent adjoint model can be derived directly from a set of matrices in the forward model.

The discrete forward model can be expressed in a general form in a subspace:

AΦ = s, (31)

A ∈ RN Nt×N Nt is the matrix making up the discretisation in the forward model where N

is the number of basis functions (e.g. finite element or POD), and Nt is the number of time

levels; Φ = (φ1, φ2, . . . , φN Nt) is the the vector of state variables; s = s(m) is a discretised

source term — the controls of discretised control variables are m = (m1,m2, ...,mC)T in which

C are the number of controls. For a nonlinear simulation, the matrix A can be calculated by

the colouring approach (9).

In general A is a function of the controls m and state variables Φ that is A = A(m,Φ).

Differentiating (31) with respect to the control variable ml, the tangent linear model (TLM)

can be expressed

∂A∂ml

Φ + A∂Φ∂ml

=∂s∂ml

, (32)

The gradient of a cost functional (J) can be expressed as:

∂J

∂ml=(∂Φ∂ml

)T∂J

∂Φ. (33)

The TLM (32) can be re-expressed:

A∂Φ∂ml

= − ∂A∂ml

Φ +∂s∂ml

. (34)

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 14: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

14 F. FANG ET. AL.

It will be convenient to use:

∂A∂ml

Φ =[∂A∂ml

]direct

Φ + G∂Φ∂ml

, (35)

in which G = (g1,g2, ...,gN ) where gk = ∂A∂φk

Φ. Also[∂A∂ml

]direct

is defined as the direct

differentiation of A with respect to ml only

∂A∂ml

Φ =[∂A∂ml

]direct

Φ +N∑k=1

∂A∂φk

Φ∂φk∂ml

, (36)

whereN∑k=1

∂A∂φk

Φ∂φk∂ml

=N∑k=1

gk∂φk∂ml

= G∂Φ∂ml

. (37)

Therefore, the perturbation of the state variables with respective to the controls can be

calculated as: (∂Φ∂ml

)T=(−[∂A∂ml

]direct

Φ +∂s∂ml

)T(A + G)−T . (38)

Substituting (38) into (33), yields

∂J

∂ml=(−[∂A∂ml

]direct

Φ +∂s∂ml

)T(A + G)−T

∂J

∂Φ. (39)

Defining (A + G)−T ∂J∂Φ = Φ∗, the adjoint model is,

(A + G)TΦ∗ =∂J

∂Φ. (40)

The gradient of the cost function can therefore be calculated using

∂J

∂ml=(−[∂A∂ml

]direct

Φ +∂s∂ml

)TΦ∗. (41)

Notice that differentiation of the matrix A is achieved in a particularly simply way if the

matrices are formed for a polynomial system as described in the previous section. Thus from

(11): [∂Auqij

∂ml

]direct

= Acuqij

bclij , (42)

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 15: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 15

in which cl is the colour of node l and bclij = 1 if nodes i and j are neighbours to node l

otherwise bclij = 0. The equality becomes an approximation when the discrete matrix A has

been approximated by a polynomial system. However, even for more complex problems the

colouring approach can still be used to simplify the differentiation as described in the next

section.

5.1. Independent Set Perturbation Adjoint (ISP-Adjoint)

The terms[∂A∂ml

]direct

Φ, ∂s∂ml

in equation (41) can be derived by hand or automatic

differentiation in the traditional approach to forming discrete or exact gradients of the

functional of interest, J . An alternative method for calculating the gradient is to apply a

small perturbation, ∆ml, to each of the controls ml. In vector form the perturbation is a

vector of length C with the only non-zero at the lth entry, i.e. ∆m = (0, ..., 0,∆ml, 0, ..., 0)T .

Taking this approach the gradient is expressed as:

A(m + ∆ml,Φ)−A(m,Φ)∆ml

Φ =A(m + ∆ml,Φ)Φ−A(m,Φ)Φ

∆ml, (43)

s(m + ∆ml)− s(m)∆ml

. (44)

Combining these, the key gradient in (41) can be written as:

(−A(m + ∆ml,Φ)Φ + s(m + ∆ml))− (−A(m,Φ)Φ + s(m))∆ml

. (45)

Moreover, perturbations of a number of variables ml at a given time can be taken without

affecting the gradients. However, these must form an independent set of variables, i.e. one

variable perturbation should not affect the results of perturbing any other variable. This can

be done for the source s and for the matrix A. In fact the matrix need not be formed as what

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 16: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

16 F. FANG ET. AL.

is required is matrix vector multiplication:

A(m + ∆ml)−A(m)∆ml

Φ.

One can use a higher order Taylor series expansion than first order to form approximations to

−[∂A∂ml

]direct

Φ +∂s∂ml

,

from perturbations in ml, but this requires more computational effort. However, as long as

∆ml is small enough (ignoring round off errors) it will be accurate enough. A good example

is the use of A(m+∆ml,Φ)−A(m−∆ml,Φ)2∆ml

which will exactly represent a quadratic A. It should

be noted that for quadratic equations (e.g. Burgers’ and the Navier-Stokes equation) and with

quadratic discretisations arbitrary large ∆ml can be taken as the results are independent of

the size of ∆ml as A and s are linear in m.

The final aspect of the ISP-Adjoint method is the calculation of the matrix G used in

the adjoint calculation which is again formed using the independent set colouring approach

outlined previously. Suppose the perturbation vector is ∆Φk = (0, ..., 0,∆φk, 0, ..., 0)T for a

perturbation ∆φk of the kth entry in Φ then the columns of G are:

gk =∂A∂φk

Φ ≈ A(m,Φ + ∆Φk)−A(m,Φ)∆φk

Φ =A(m,Φ + ∆Φk)Φ−A(m,Φ)Φ

∆φk. (46)

Notice that this equation has similar form to (43) and thus the same algorithm, and therefore

same source code, may be used to evaluate both. Suppose all perturbations are equal ∆φk = ε

then:

G =1ε

(G′ −G), (47)

where G = (AΦ,AΦ, ...,AΦ) and G′ = (g′1,g′2, ...,g

′N ) with

g′k = A(m,Φ + ∆Φk)Φ. (48)

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 17: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 17

In an implementation of the formation of the matrix G one needs to perform matrix vector

multiplications involving Nc vectors. Thus, the entire matrix G′ can be stored within Nc

vectors of length N Nt. The matrix in this form would then be distributed to the most

convenient matrix form of the discrete system of equations. By construction, G has a similar

sparsity pattern as A.

An efficient implementation of the ISP-Adjoint method can be realised by:

• Using a discretisation or assembly subroutine that does not form the matrices but simply

performs matrix vector multiplication while assembling the equations.

• Enabling this subroutine to perform a matrix vector multiplication with a number of

solution perturbations simultaneously, thus to form G for finite element one only needs

to integrate across each element once.

• Reducing the number of independent sets or colours by introducing more variables in

the solution Φ. For example, by severing the link between elements in continuous finite

element discretisations and perturbing on this new system. This method we call the

Reduced ISP-Adjoint or ISP-Adjoint which, to round off error, produces identical results

to the ISP-Adjoint method. Moreover, a general finite element code, for example, needs

no modification to the assembly subroutines as simply a different linked list for the global

solution variables associated with each element is past to these subroutines.

• Performing the matrix vector multiplication GTΦ∗ without storing matrix GT .

The last step can be realised because in time marching methods typically all of the adjoint

solution operates in a lower block structure in which each block is associated with a solution

variable at a particular time step. This means that one can solve the adjoint equations by

marching backwards in time and the matrix G has zeros on the block diagonals of the matrix.

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 18: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

18 F. FANG ET. AL.

Thus, matrix vector multiplication,

GTΦ∗ (49)

can be placed on the right hand side of the matrix solution for the adjoint in equation (40). In

this case the matrix GT need not be formed and the matrix vector multiplication in equation

(49) involving G′TΦ∗ can be made highly efficient for the kth row using:

(G′TΦ∗)k = g′Tk Φ∗ = Φ∗T (A(m,Φ + ∆Φk)Φ). (50)

In practice the perturbations associated with each node or variable k say are grouped in terms

of colours and in this way a number of these matrix vector multiplications in (50) can be

calculated concurrently, i.e. form the dot product of g′c with the part of Φ∗ associated with

node k and form (G′TΦ∗)k for each row k and for colour c in which g′c =∑k∈ colour c g′k.

6. ISP-Adjoint applied to Burgers equation

A backward Euler time stepping method at time level n and centered on control volume i for

the Burgers’ equation (1) results in the following schemes:

1. With central differencing in space

un+1i − uni

∆t+ uni

un+1i+1 − u

n+1i−1

2∆x+ ν−un+1

i+1 + 2un+1i − un+1

i−1

(∆x)2= 0; (51)

2. With first-order upwind discretisation in space:

un+1i − uni

∆t+ max(0,

12∆x

(uni + uni−1))un+1i−1 + min(0,

12∆x

(uni + uni−1))un+1i

+ max(0,1

2∆x(uni+1 + uni ))un+1

i + min(0,1

2∆x(uni+1 + uni ))un+1

i+1

−uni+1 − uni−1

2∆xun+1i + ν

−un+1i+1 + 2un+1

i − un+1i−1

(∆x)2= 0; (52)

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 19: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 19

3. With high-resolution discretisation in space:

un+1i − uni

∆t− uni− 1

2(ξni− 1

2un+1i−1 + (1− ξni− 1

2)un+1i ) + uni+ 1

2(ξni+ 1

2un+1i + (1− ξni+ 1

2)un+1i+1 )

−uni+1 − uni−1

2∆xun+1i + ν

−un+1i+1 + 2un+1

i − un+1i−1

(∆x)2= 0. (53)

To determine in the high-resolution method where to apply first-order instead of high-order

fluxes we need to be able to detect an extrema and if there is to be a smooth transition

between the two fluxes we need to quantify how close to an extrema the solution is. This is

achieved here using the normalised variable diagram NVD method [28]. Suppose unu, unc , u

nd

are ordered consecutively and unf is the face value between control volumes (CV’s) c and d. If

f = i− 12 and unf > 0 then subscripts u = i− 2, c = i− 1, d = i and if unf < 0 then subscripts

u = i+ 1, c = i, d = i− 1 and uni− 1

2= 1

2 (uni−1 + uni ). For a monotonic solution,

unf =unf − unuund − unu

∈ [0, 1], (54)

which is the non-dimensional high order face value. The non-dimensional upwind face value is

unc =unc − unuund − unu

. (55)

The idea is to obtain a linear combination unf of unf and unc such that unf equals unc when there

exists an extrema (unc /∈ [0, 1] and the scheme becomes first order) and unf moves smoothly

between this and the high order flux unf . unf is calculated from

unf =unf − unuund − unu

. (56)

The flux limited solution is

unf = unf (und − unu) + unu. (57)

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 20: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

20 F. FANG ET. AL.

unf in this equation is calculated the normalised variable diagram, that is:

unf =

min1, 2unc ,maxunc ,unf , if unc ∈ (0, 1);

unc , otherwise.

(58)

Thus at an extrema the first order non-oscillatory method will be applied:

ξnf =

12 (1− eun

f−unc

unf−un

c) + 1

2 , ifunf > 0,

1− ( 12 (1− eun

f−unc

unf−un

c) + 1

2 ), otherwise.

Using the notation of the previous section the global matrix A has the structure:

A =

P1

1∆tI P2

. . . . . .

1∆tI PNt

(59)

with Nt the number of time levels and the matrix G has the structure:

G =

. . .

K2

. . . . . .

KNt

(60)

Thus the matrix equation (40) can be solved by time marching backwards in time. For

the central difference discretisation the Kn is diagonal with diagonal contributions Knii =

un+1i+1 −u

n+1i−1

2∆x and for the upwind discretisation it is distributed with non-zero values of

Kni i−1,Kn

i i,Kni i+1 taken from the coefficients of uni−1, uni , uni+1 respectively of the advection

term in equation (52).

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 21: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 21

For the central difference scheme (51) the matrices Kn are diagonal and thus only a single

colour (Nc = 1) is necessary to determine their values. The upwind difference scheme (52) has

a lower bi-diagonal matrices Kn for positive advection that is when uni+ 1

2> 0 (assuming the

cells are ordered consecutivitly from left to right) and two colours (Nc = 2) are needed to form

the matrices Kn. When there is a mixed sign then the matrix has a bandwidth of three with a

maximum of three non-zero’s per row and in this case three colours (Nc = 3) are needed. Due

to the use of the far field upwind values in the flux limiting calulcaiton of the high-resolution

method (53) it’s matrices Kn have a bandwidth of five with five non-zeros’ per row and thus

five colours (Nc = 5) are necessary and are used here to form the matrices Kn. If, instead

of linearising the previous time level value of the solution, one was to iterate to convergence

within a time step in order to calculate the non-linear terms at the future time level then the

matrices Kn would appear on the block diagonal of the system of equations occupying the

same place as Pn in the system of adjoint equations.

For a unit 1D domain and zero boundary conditions at both ends and space and time steps

of ∆t = 0.01, ∆x = 0.1 (11 cells) and a functional J = 12∆x

∑Ni=1(uNt

i )2 whose sensitivity with

respect to the initial conditions (u0j = exp(− (xj−0.5)2

0.12 ) for all control volume center positions

xj ∈ [0, 1], see figure 2a), is sought and x is the 1D spacial variable. The number of time

step for the 11 cell mesh is Nt = 50. As with all calculations presented in this paper they are

performed in double precision. The value of ∆u = 10−6 is used in the ISP-Adjoint sensitivity

calculation. There is also a fine mesh with 100 cells and ∆t = 0.001, ∆x = 0.01, Nt = 500.

The initial and final solution at the final time level are shown in figure 2a and 2b along with

the adjoint at the start of time n = 1 figure 2c and the sensitivity ∂J∂u0

iin figure 2d. The adjoint

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 22: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

22 F. FANG ET. AL.

solution essentially advects the results from its source,

∂J

∂uNti

= ∆xuNti ,

(which mirrors the forward solution at adjoint time level Nt) backwards through the domain.

This propagates the kinetic energy information at the final time level to the initial conditions

where the importance of the initial conditions contribution to J can be determined.

It should be noted that the accuracy of the sensitivity/gradient is such that no difference

can be observed visually between the exact gradient (obtained by taking perturbations of the

individual variables in the initial conditions with a perturbation of 10−10) and the gradients

shown in figure 2d. The maximum error in the sensitivity ∂J∂u0

iwhich is normalised by the

maximum value of ∂J∂u0

k,∀k is shown for the high-resolution method versus the perturbation

∆u used in the ISP-Adjoint calculation in figure 3. The sensitivity calculated by the ISP-

Adjoint becomes more accurate as ∆u is reduced until it gets to the point where round off

error interfers with its accuracy and its rate of decrease slows and eventually the error starts

to increase. It should be born in mind that due to the highly non-linear nature of the high-

resolution method the matrix A is not linear in un ∀n and thus the perturbation used in the

ISP-Adjoint calculation is an approximation. The central difference method (51) has a linear

A in un ∀n and thus the ISP-Adjoint is exact (to computer round off error). In addition,

the first order upwind scheme in (52) is ’mostly’ exact. It is only non-exact and non-linear

when a face value of the advection velocity (used to advect the velocity un+1) switches sign

on application of the perturbation ∆u which is not very likely to happen at least for small

∆u relative the the size of u. In addition, when it does happen the advection velocity will be

small and will thus contribute little to the gradient calculation.

The accuracy of the adjoint code can be ascertained through a gradient check [29], where

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 23: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 23

2 1 2

4 3

2 1 2

1 4 3

1

4 3 4 3

3

9

6

3

9

1

7

4

1

7

2

8

5

2

8 9

3

6

9

3

7

1

4

7

1

3

9

1

7

2

8 9

3

6 6 4 5

(a) (b) (c)

Figure 1. Extended colouring for material properties and non-linearities. (a) Graph colouring scheme

for an element-wise material property or non-linearity. Four colours are needed for 2-D linear

quadrilateral elements. (b) Graph colouring scheme for a node-wise material property or non-linearity.

Nine colours are needed for 2-D linear quadrilateral elements. (c) For the case shown in (b) the dotted

line indicates the nodes that are directly linked to both nodes 1 and 2. These are the nodes that their

needs to be an independent set for and there must be coloured with a different number than 1 and 2.

the quantity w(α) is calculated and it is verified that it satisfies w(α) = 1+O(α). The quantity

w is defined as

w(a) =J(u0 + ah)− J(u0)

ahT∇J(u0), (61)

where, the cost function J = 12∆x

∑Ni=1(uNt

i )2,∇J(u0) = ∂J(u0)∂u0 and h is an arbitrary vector of

unit length (here h = ∇J/‖∇J‖2). The vector u0 = (u01, u

02, ..., u

0N )T of initial control volume

values of velocity represents the control variables, here the initial velocity u0 =∑Nj=1Mju

0j

with basis function Mj unity in j and zero elsewhere. For values of a that are small, but not

too close to the machine zero, it is expected that w(a) should be close to unity. To quantify

the gradient accuracy, figure 4 shows the variation of the function w with respect to a. It can

be seen that, as required, the function w(a) is close to unity when a varies between 10−3 –

10−12. Notice that the gradient of the first order upwind scheme seems independent of the

perturbation which is because it is ’mostly’ independent of the size of the perturbation ∆u.

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 24: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

24 F. FANG ET. AL.

0 0.2 0.4 0.6 0.8 1x

0

0.2

0.4

0.6

0.8

1

u

fine mesh11 cell mesh

0 0.2 0.4 0.6 0.8 1x

0

0.1

0.2

0.3

0.4

0.5

0.6

u

fine meshcentralupwindhigh-res

(a) (b)

0 0.2 0.4 0.6 0.8 1x

0

0.0001

0.0002

0.0003

0.0004

0.0005

u*

fine meshcentralupwindhigh-res

0 0.2 0.4 0.6 0.8 1x

0

0.01

0.02

0.03

0.04

0.05

sens

itivi

ty

fine meshcentralupwindhigh-res

(c) (d)

Figure 2. Burgers equation solution (a) initial conditions for a coarse (11 cells) and fine (101 cells)

mesh (b) the forward solution at the end of time t = 0.5 and for the central difference method, the

upwind method and the high resolution methods along with a fine mesh high resolution result. (c)

corresponding adjoint and (d) sensitivities for the four simulations shown in (b).

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 25: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 25

1e-08 1e-06 0.0001 0.01perturbation in u

1e-08

1e-06

0.0001

0.01

norm

alis

ed e

rror

in s

ensi

tivity

high-res. - double prec.high-res. - single prec.central scheme - single prec.high-res. - 2nd order double prec.high res. - 2nd order single prec.

Figure 3. The normalised maximum error in the sensitivity for the coarse mesh calculation against

varying ∆u in the ISP-Adjoint calculation. The error is normalised with the maximum magnitude of

the sensitivity/gradient and the error is determined for both single and double precision. The central

scheme has an exact gradient (ignoring round off error) and thus its minimum value gives the best

error achievable using single precision. Also shown are the normalised errors in the high-resolution

gradients for single and double precision when a second order finite difference method is used - that

is A(m+∆ml)−A(m−∆ml)2∆ml

, A(m+∆φk)−A(m−∆φk)2∆φk

instead of A(m+∆ml)−A(m)∆ml

, A(m+∆φk)−A(m)∆φk

.

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 26: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

26 F. FANG ET. AL.

(a) (b)

Figure 4. The accuracy of the adjoint model (a) First order unwinding; (b) Flux limiting. Equation

(61): w(a) =`J(u0 + ah)− J(u0)

´/ahT∇J(u0) (here h = ∇J/‖∇J‖2) is used. The cost function J

is defined as the kinematic energy at t = 0.5 and the control variables are the initial velocities u0.

This properly may also be shared with other schemes that have a switch in the discretisation

depending on if information is going into or out of a cell or element e.g. Discontinuous Galerkin

methods, see [30]. It can thus be concluded, at least for this model problem set-up, that the

adjoint model developed here, is correct and accurate.

7. ISP-Adjoint applied to the Munk gyre

The approach for sensitivity analysis is applied to an idealised ocean gyre problem discretised

with a reduced order model. The sensitivity of the kinetic energy at the end of the simulation

with respect to the initial conditions will be sought. Although this is a relatively easily

differentiated problem because the descrete system of reduced equations are quadratic and the

matricies that change every time step are given by equations (29) and (30) it does demonstrate

how easily applied the ISP-Adjoint is. Since the system of equations are quadratic any sized

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 27: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 27

perturbation can be applied and thus a unity pertabation is applied to all variables in the

differentiation.

7.1. Matrix equation structure for fluids discretisations

The matrix that is solved at time level n typically has a structure which is a block matrix

representation of the discrete system 4:

Pn =

En Cn

CnT 0

(62)

and

Hn =

Bn 0

0 0

, (63)

and the forward model during the whole simulation period is therefore:

AΦ =

P1

H2 P2

. . . . . .

HNt PNt

Φ1

Φ2

...

ΦNt

=

s1

s2

...

sNt

, (64)

and the corresponding adjoint model is:

(AT + GT )Φ∗ =

P1T H2T

P2T . . .

. . . HNtT

PNtT

+ GT

Φ∗1

Φ∗2

...

Φ∗Nt

=

∂J∂Φ1

∂J∂Φ2

...

∂J∂ΦNt

. (65)

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 28: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

28 F. FANG ET. AL.

in which the forward and adjoint solutions are Φn = (unT ,pnT )T and Φ∗n = (u∗nT ,p∗nT )T

respectively.

7.2. Application to the Munk gyre

The ISP-Adjoint approach is tested in a computational domain of horizontal dimensions, 1000

km by 1000 km with a depth of H = 500 m. The wind forcing on the free surface is given by

τy = τ0cos(πy/L), τx = 0.0, (66)

where τx and τy are the wind stresses on the free surface along the x and y directions

respectively, and L = 1000 km. A maximum zonal wind stress of τ0 = 0.1 N m−1 is applied in

the latitudinal (y) direction. The Coriolis terms are taken into account using the beta-plane

approximation (f = βy) where β = 1.8×10−11 and the reference density is ρ0 = 1000 kg m−1.

The problem is non-dimensionalised with the maximum Sverdrup balance velocity

βHρ0v =∂τ

∂y≤ τ0π

L⇒ v ≤ 3.5× 10−2m s−1 (67)

(and so the velocity scale U = 3.5×10−2 m s−1 is used here), and the length scale is L = 1000

km. Time is non-dimensionalised with T = LU . The spin-up period is 0.1512 (50 days). The

equilibrium state at 50 days is taken as the initial state for both the full and reduced models.

The snapshots are collected from the results obtained in the full model during the simulation

period [50, 150] days. The time step is 3.78× 10−4, equivalent to 3h. Incorporating the beta-

plane approximation yields a non-dimensional β∗ = L2βU = 514.3. The non-dimensional wind

stress (applied as a body force here averaged over the depth of the domain) takes the same

cosine of latitude profile with τ∗0 = τ0L(U2ρ0H) = 163.3. No-slip boundary conditions are applied

to the lateral boundaries. The Reynolds number is defined as Re = ULν (here the kinematic

viscosity is 140 m2s−1). It should be pointed out that since the discrete system of equations

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 29: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 29

is quadratic application of the ISP-Adjoint is exact (to computational round off) for both the

reduced order and full system of finite element equations.

The POD bases are constructed by the snapshots which are obtained from the numerical

solutions by forcing the full forward model with the initial velocity (the background flow).

Forty snapshots with 35 POD bases for the velocity field u, v, w and pressure are chosen

which capture more than 99.5% of energy (calculated by the first 35 leading eigenvalues).

In ocean modelling, the pressure term also plays an important role in the geostrophic balance.

In this study, taking into account the role of the pressure term in both the POD-Galerkin model

and the geostrophic balance, the pressure in the momentum equations is divided into two parts:

p = png + pg. To accurately represent geostrophic pressure, its basis functions are split into

two sets: Φnguxand Φnguy

which are associated with the ux- and uy-velocity components.

The geostrophic pressure can be obtained from a quadratic finite element representation while

linear finite element representations are used for the velocity components. Furthermore the

geostrophic pressure can be represented by a summation of the two sets of geostrophic basis

functions, which are calculated by solving the elliptic equations using a conjugate gradient

iterative method (for details see [31]).

In this experiment, the cost function J is defined as the kinematic energy at the final time

level (n = Nt or 200 days):

J =12

∫Ω

||uNt ||22 dΩ. (68)

The sensitivity analysis of J with respect to the initial velocity u0 has been carried out, and

the corresponding results are shown in figure 5.

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 30: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

30 F. FANG ET. AL.

(a) (b) (c)

Figure 5. (a) Initial velocity vector (t = 100 days); (b) the velocity vector at t = 200 days on a regular

mesh; (c) Adjoint sensitivity/gradient of the final kinetic energy at t = 200 days with respect to the

initial velocity.

8. ConclusionsA new method of differentiating potentially complex mesh based numerical models is presented

and applied to simple Burgers equation and ocean gyre example problems. This method is

referred to here as the Independent Set Perturbation Adjoint approach (ISP-Adjoint) since

differentiation of the matrices and sources of the discrete forward finite element or control

volume model is realized by colouring variables to form a number of independent sets of

variables.

Three discretisation methods central difference, first order upwind, and high-resolution

schemes were applied to discretise Burgers equation to demonstrate that exactly the same

ISP-Adjoint code can switch between discretisations or parameterizations without changing

anything associated with the adjoint or gradient calculations - high resolution methods are

often used for parameterizations or models e.g. for turbulence. A reduced order Monk gyre

problem is solved here to demonstrate the application of the ISP-Adjoint to more complex

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 31: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 31

fluid flow problems.

The ISP-Adjoint method could make more tractable the formation of arbitrarily complex

multi-physics mesh-based (4D-VAR) models. It can enable low and high order adjoints of

models to be manipulated much more freely so that quicker progress can be made in data

assimilation, sensitivity and uncertainty analysis, error analysis, adaptive observations and a

whole host of other adjoint applications.

Acknowledgements

This work was carried out under funding from the UK’s Natural Environment Research Council

(projects NER/A/S/2003/00595, NE/C52101X/1 and NE/C51829X/1), the Engineering and

Physical Sciences Research Council (GR/R60898), the Leverhulme Trust (F/07058/AB),

and the USA National Science Foundation (NSF, grants ATM-0201808 and CCF-0635162),

and with support from the Imperial College High Performance Computing Service and the

Grantham Institute for Climate Change. Many thanks to Tamzin Mole for help with the

technical writing up of this work.

REFERENCES

1. A. M. Moore, N. S. Cooper, and D. L. T. Anderson. Initialisation and data assimilation in models of the

Indian Ocean. J. Phys. Oceanogr. 1987; 17:1965–1977.

2. F. X. Le Dimet and I. M. Navon. Variational and optimization methods in meteorology: A review,

1988. Technical Report: Early review on variational data assimilation, SCRI report No 144, pp88.

http://people.scs.fsu.edu/ navon/freqreq.html.

3. M. Wenzel, J. Schroter, and D. Olbers. The annual cycle of the global ocean circulation as determined

by 4D VAR data assimilation. Prog. Oceanog. 2001; 48(1):73–119.

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 32: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

32 F. FANG ET. AL.

4. Y. Q. Zhu and I. M. Navon. Impact of parameter estimation on the performance of the FSU global

spectral model using its full-physics adjoint. Mon. Weath. Rev., 127(7):1497–1517, 1999.

5. D. G. Cacuci, M. I. Bujor, and I. M. Navon. Sensitivity and Uncertainty Analysis: Applications to Large-

scale Systems Vol 2, CRC, 2005.

6. B. F. Sanders, N. D. Katopodes. Adjoint sensitivity analysis for shallow water wave control. J. Eng.

Mech., 126:909-919, 2000.

7. D. N. Daescu and I. M. Navon. Adaptive observations in the context of 4D-Var data assimilation.

Meteorol. Atmos. Phys. 2004; 85(4):205–226.

8. N. A.Pierce, and M. B.Giles. Adjoint recovery of superconvergent functionals from PDE approximations.

SIAM Rev., 42:247–264, 2000.

9. P. W. Power, C. C. Pain, M. D. Piggott, G. J. Gorman, F. Fang, D. P. Marshall, and A. J. H.

Goddard. Adjoint goal-based error norms for adaptive mesh ocean modelling. Ocean Modell., 15: 3–

38, doi:10.1016/j.ocemod.2006.05.001, 2006.

10. Martins J.R.R.A., Alonso J.J., and Reuther J.J. A coupled-adjoint sensitivity analysis method for high-

fidelity aero-structural design. Optim. Eng., 6(1):33–62, 2005.

11. Mohammadi B., Male J.M., and Rostaing-Schmidt N. Automatic differentiation in direct and

reverse modes: application to optimum shapes design in fluid mechanics. In Computational

Differentiation:Techniques, Applications, and Tools, edited by M. Berz, C.H. Bischof, G.F. Corliss and A.

Griewank:309–318, 1996(SIAM:Philadelphia, PA).

12. Sherman L.L, Taylor A.C., Green L.L, Newman P.A., and Hou G.W. Vamshi Mohan Korivi. First- and

second-order aerodynamic sensitivity derivatives via automatic differentiation with incremental iterative

methods. J. Comput. Phys., 129(2):307–331, 1996.

13. Marta A.C., Mader C.A., Martins J. R. R. A., Van der Weide E., and Alonso J.J. A methodology for

the development of discrete adjoint solvers using automatic differentiation tools. International Journal

of Computational Fluid Dynamics, 21(9–10):307–327, 2007.

14. Boudjemaa R., Cox M.G., Forbes A.B., and Harris P.M. Automatic differentiation and its application in

metrology. Advanced Mathematical and Computational Tools in Metrology VI, edited by P Ciarlini, M

G Cox, F Pavese and G B Rossi:170–179, 2004.

15. Baalousha H. and Kongeter J. Stochastic modelling and risk analysis of groundwater pollution using form

coupled with automatic differentiation. Advances in Water Resources, 29:1815–1832, 2006.

16. Glowinski R. and Toivanen J. A multigrid preconditioner and automatic differentiation for non-equilibrium

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 33: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 33

radiation diffusion problems. Journal of Computational Physics, 207:3545–374, 2005.

17. Green L.L., Newman P.A., and Haigler K.J. Sensitivity derivatives for advanced cfd algorithm and viscous

modeling parameters via automatic differentiatio. Journal of Computational Physics, 125(2):313–324,

1996.

18. Bischof C.H., Corliss G.F., Green L., Griewank A., Haigler K., and Newman P. Automatic differentiation

of advanced CFD codes for multidisciplinary design. Computing Systems in Engineering, 3(6):625–637,

1992.

19. Liu Z. and Sandu A. On the properties of discrete adjoints of numerical methods for the advection

equation. International journal for numerical methods in fluids, 56(7):769–803, 2008.

20. Reuther J., Alonso J.J., Jameson A., Rimlinger M., and Saunders D. Constrained multipoint aerodynamic

shape optimization using an adjoint formulation and parallel computers: part i. j. aircraft. J. Aircraft,

36(1):51–60, 1999.

21. Reuther J., Alonso J.J., Jameson A., Rimlinger M., and Saunders D. Constrained multipoint aerodynamic

shape optimization using an adjoint formulation and parallel computers: part ii. j. aircraft. J. Aircraft,

36(1):61–74, 1999.

22. Heimbach P., Hill C., and Giering R. An efficient exact adjoint of the parallel MIT general circulation

model, generated via automatic differentiation. Future Generation Computer Systems, Elsevier,

21(8):1356–1371, 2005.

23. Stammer D., Wunsch C., Giering R., Eckert C., Heimbach P., Marotzke J., Adcroft A., Hill C.N., and

Marshall J. The global ocean circulation during 1992-1997, estimated from ocean observations and a

general circulation model. J. Geophys. Res., 107(C9):1–27, 2002.

24. M. D. Piggott, G. J. Gorman, C. C. Pain, P. A. Allison, A. S. Candy, B. T. Martin, and M. R. Wells.

A new computational framework for multi-scale ocean modelling based on adapting unstructured meshes.

Int. J. Numer. Methods Fluids, 56:1003–1015, 2008.

25. M. T. Jones, P. E. Plassmann. A parallel graph coloring heuristic. SIAM Journal on Scientific Computing,

14(3):654–669, 1993.

26. R. Ford, C. C. Pain, M. D. Piggott, A. J. H. Goddard, C. R. E. de Oliveira, and A. P. Umbleby.

A nonhydrostatic finite-element model for three-dimensional stratified oceanic flows. Part I: Model

formulation. Mon. Weath. Rev. 2004; 132(12):2816–2831.

27. F. Fang, C. C. Pain, I. M. Navon, M. D. Piggott, G. J. Gorman, P. E. Farrell, P. Allison, and A. J. H.

Goddard. A POD reduced-order 4D-Var adaptive mesh ocean modelling approach. International Journal

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls

Page 34: The Independent Set Perturbation Adjoint Method: A new ...inavon/pubs/isp-adjoint28.pdfTHE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7 3.3. Matrix representation of non-linear discrete

34 F. FANG ET. AL.

for Numerical Methods in Fluids, DOI: 10.1002/fld.1911., 2008.

28. B. P. Leonard. The ULTIMATE conservative difference scheme applied to unsteady one-dimensional

advection, Computing Methods in Apllied Mechanics and Engineering, 88:17-74, 1991.

29. I. M. Navon, X. Zou, J. Derber, and J. Sela. Variational data assimilation with an adiabatic version of

the NMC Spectral Model. Mon. Weath. Rev., 120:55–79, 1992.

30. R. T. Ackroyd. Finite Element Methods for Particle Transport: Applications to Reactor and Radiation

Physics Research Studies in Particle and Nuclear Technology, 6, Taylor & Francis Group, London, 1997.

31. F. Fang, C. C. Pain, I. M. Navon, M. D. Piggott, G. J. Gorman, P. Allison, and A. J. H. Goddard. Reduced

order modelling of an adaptive mesh ocean model. International Journal for Numerical Methods in Fluids,

DOI: 10.1002/fld.1841., 2008.

Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6

Prepared using fldauth.cls