Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS
Int. J. Numer. Meth. Fluids 2004; 00:1–6 Prepared using fldauth.cls [Version: 2002/09/18 v1.01]
The Independent Set Perturbation Adjoint Method: A new
method of differentiating mesh based fluids models
F. Fang∗1, C. C. Pain1, I. M. Navon2, G. J. Gorman1, M. D. Piggott1, P. A. Allison1
1 Applied Modelling and Computation Group,
Department of Earth Science and Engineering,
South Kensington Campus,
Imperial College London, SW7 2AZ, UK
URL: http://amcg.ese.imperial.ac.uk
2 Department of Scientific Computing,
Florida State University,
Tallahassee, FL, 32306-4120, USA
URL: http://people.scs.fsu.edu/vnavon/index.html
SUMMARY
A new scheme for differentiating complex mesh-based numerical models (e.g. finite element models),
the Independent Set Perturbation Adjoint method (ISP-Adjoint), is presented. Differentiation of the
matrices and source terms making up the discrete forward model is realized by a graph colouring
approach (forming independent sets of variables) combined with a perturbation method to obtain
gradients in numerical discretisations. This information is then convolved with the ‘mathematical
adjoint’ which uses the transpose matrix of the discrete forward model. The adjoint code is simple
to implement even with complex non-linear parameterizations, discretisation methods and governing
equations. Importantly, the adjoint code is independent of the implementation of the forward code.
Copyright c© 2004 John Wiley & Sons, Ltd.
2 F. FANG ET. AL.
This greatly reduces the effort required to implement the adjoint model and maintain it as the forward
model continues to be developed; as compared with more traditional approaches such as applying
automatic differentiation tools. The approach can be readily extended to reduced order models. The
method is applied to a one-dimensional Burgers’ equation problem, with a highly non-linear. high-
resolution discretisation method; and to a two-dimensional, non-linear, finite element model of an
idealised ocean gyre.
keywords: Automatic differentiation; optimisation; adjoint; finite element, POD; reduced order
models. Copyright c© 2004 John Wiley & Sons, Ltd.
1. Introduction
Adjoints to computational codes have been used for diverse applications including; data
assimilation [1, 2, 3, 4], sensitivity analysis [5, 6], adaptive observations [7] and mesh based error
measures [8, 9]. However, differentiating discrete numerical models is generally impractical to
perform by hand and remains challenging when using automatic differentiation tools (AD)
(More details on AD at http://www.autodiff.org ). Automatic differentiation also restricts the
use of many programming features, further increasing the difficulty of developing a forward
and adjoint model. Both approaches often require intimate knowledge of the forward code in
order to design a hand differentiation method or to alter the automatically produced code so
that it works efficiently [10, 11, 12]. The time it takes to conduct this task may be greater
than writing the forward model and there are also computer science problems arising from
the complexity of the computation of derivatives and memory usage [13, 14]. These challenges
often preclude the use of gradient or adjoint variational (4D-VAR; three spatial dimensions
plus time) data assimilation methods, sensitivity based methods or any other method that
uses gradients of key functionals with respect to the control variables (controls), e.g. material
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 3
properties or initial/boundary conditions of flow models.
A forward model may have an actively developed dynamic core or utilize rapidly changing
and complex parameterizations. Therefore, the application of consistent or exact gradient
methods of differentiation to the discrete problem tends to lag behind developments in the
forward model code. Despite this there have been major successes in differentiating large
models [15, 16, 17, 18, 19]. Automatic differentiation methods have been utilized on stable
models at compile time after code modifications [20, 21, 22, 23]. In addition, Marta [13]
developed a selective automatic differentiation approach where AD is used to compute only
certain terms of the discrete adjoint equations, and not to differentiate the entire solver.
The ISP-Adjoint approach outlined here is largely independent of implementation details
of the forward model. The modelling software only needs to be modified so that it can
accommodate the adjoint, e.g. the formation of the ‘mathematical adjoint’ transpose matrices
of the forward problem, as well as the accommodation of the data structures necessary
to provide access to the forward solution when time marching backwards in the gradient
calculation. Similar approaches may be applied to reduced order models such as proper
orthogonal decomposition (POD) surrogates of full models.
This approach can also be used to help accelerate the matrix equation assembly process
on the assumption that the discretised system of equations has a polynomial representation
and can thus be formed by a summation of matrices. If the system of equations can not be
represented as a non-linear polynomial it may still be approximated in this way. Thus the fast
assembly methods may be applied but to preserve accuracy the matrices must be periodically
reformed in the time marching scheme.
The remainder of this paper is set out as follows. The next section outlines the model
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
4 F. FANG ET. AL.
governing equations, followed by a description of how discrete system assembly may be
performed using graph colouring methods. This is then followed by the extension to reduced
order POD models and the extension of the method to non-quadratic discrete systems of
equations. The gradient of the discrete system (ISP-Adjoint) method is outlined and the
method is applied to two problems: the Burgers equation; and the Navier-Stokes equations
applied to ocean gyre circulation. Finally, conclusions are drawn.
2. Governing equations
Two systems of partial differential equations used to illustrate the application of the method
are the Burgers’ equation and the Navier-Stokes equations.
The one-dimensional (1D) Burgers’ equation is defined as:
∂u
∂t+ u
∂u
∂x− ν ∂
2u
∂x2= 0. (1)
where u is the velocity, x is the spatial coordinate, t is the time coordinate and ν is the viscosity
coefficient.
The 3D Navier-Stokes continuity and momentum equations are considered in the form:
∇ · u = 0, (2)
∂u∂t
= −u · ∇u− fk× u +∇ · τ −∇p, (3)
where u = (ux, uy, uz)T is the velocity vector, x = (x, y, z)T are the orthogonal Cartesian
coordinates, p is the perturbation pressure (p = p/ρ0, ρ0 is the constant reference density), f
represents the Coriolis inertial force and k = (0, 0, 1)T . The stress tensor τ is used to represent
viscous terms. The pressure variable is split into the non-geostrophic and geostrophic parts
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 5
which are solved separately. This allows the accurate representation of hydrostatic/geostrophic
balance (for details see [24]).
3. Discrete system assembly using graph colouring methods
In this section a discrete forward model is derived for a high-order discrete equation based on
polynomials. The matrices associated with the nonlinear terms can be expressed by a set of
sub-matrices. For time dependent problems these sub-matrices are time-independent and can
be easily differentiated to form adjoint systems of equations.
3.1. Discrete model for incompressible fluids
The discrete model for incompressible fluids (equations (2) and (3)) calculations can be written
for finite element methods as explained in [24] as:
Cn+1Tun+1 = 0, (4)
En+1un+1 = Bn+1un + Cn+1pn+1 + sn+1, (5)
where n is time level and s includes the discretised sources, initial and boundary conditions
and body forces. The matrices En+1 and Bn+1 are nonlinear in u and Cn+1 is the matrix
associated with the pressure vector:
pn+1 = (pn+11 , pn+1
2 , . . . , pn+1M )T .
M is the number of pressure variables or basis functions. The vector of velocities is defined as
un = (unxT ,uny
T ,unzT )T ,
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
6 F. FANG ET. AL.
with sub-vectors associated with the velocity components,
unx = (uxn1 , uxn2 , . . . , ux
nN )T ,
uny = (uyn1 , uyn2 , . . . , uy
nN )T ,
unz = (uzn1 , uzn2 , . . . , uz
nN )T .
N is the number of velocity basis functions and all subscripts indicate nodal values of the
associated variables.
3.2. Graph colouring
Graph colouring methods are commonly used to model the dependency between different
subtasks or data. Here we define the graph Gr = (Vg, Eg), where the vertex set, Vg, are the
nodes of the finite element mesh (i.e. the rows of the matrix), and the edge set, Eg, is defined
by the union of finite element stencils. The chromatic number, χ(Gr), is bounded by:
ω(Gr) 6 χ(Gr) 6 ∆(Gr) + 1,
where ω(Gr) is the clique number and ∆(Gr) is the maximum vertex degree. However, the
actual number of colours used to colour the graph will be Nc ≥ χ(Gr).
We seek independent sets, where the finite element stencil is defined by the non-zero nodes
in∫
ΩNiQkNjdΩ, where Ω is the computational domain and Ni, Nj are finite basis functions
and Qk (1 ≤ k ≤ Q, Q is the number of basis functions Qk) is the finite basis function of a
variable q and does not interact with other nodes than k of a different colour. Depending on
the form of Qk the matrix stencil, and therefore the vertex degree of the graph, will be larger.
This would be the case for a material property or non-linearity which would be expanded
q =∑Qk=1Qkqk where q = (q1, q2, ..., qQ)T .
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 7
3.3. Matrix representation of non-linear discrete systems
For a non-linearity with a polynomial representation in the discrete system (e.g. the Navier-
Stokes equations discretised with a continuous Galerkin method) the matrix equations that
are formed from a summation of matrices. For illustration, we will consider here: quadratic
discrete systems; high-order polynomial discrete systems; and non-polynomial systems.
3.3.1. Quadratic discrete systems As discussed above, a quadratic non-linearity in the
discrete system with can be written in a matrix of the form:
Qqij =∫
Ω
NiqNjdΩ, (6)
where q is the velocity. The application to other finite element matrices such as those formed
from inertia or advection terms of the form∫
ΩNiun · ∇NjdΩ follow along similar lines. The
matrix Qqcij in the colouring scheme can be expressed:
Qqcij =
∫Ω
NibcNjdΩ, (7)
where,
bc =Q∑k=1
Qkbck, bck =
1 if node k is of colour c;
0 at other nodes.(8)
Applying the same approach as outlined here for matrix Q to matrices En+1 and Bn+1 in
equation (5) the matrix Q can be constructed by a set of sub-matrices using graph colouring,
Qqij =Nc∑c=1
Qcqijqcij = Qqij
+Nc∑c=1
Qcqij
(qcij − qcij), (9)
where qcij is the value of q at a node of colour c neighboring both nodes i and j (see figure 1),
N is the number of nodes. The chromatic number, Nc, is four for 2D linear element cases, see
figure 1. Note that it is useful to write the matrix (9) as a perturbation from a mean matrix
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
8 F. FANG ET. AL.
Qqij=∫
ΩNiqNjdΩ =
∑Nc
c=1 Qcqijqcij in order to linearise about a solution q =
∑Qk=1Qkqk,
e.g. the mean of q or the most recently available value of q.
For a quadratic discrete model like a finite element Bubnov-Galerkin discretisation of the
Navier-Stokes equations [26], the matrices Qq, Qcq are time-independent similar to matrices
En+1 and Bn+1 in equation (5). The matrices Qq, Qcq can be constructed by sending down
different values of the non-linear terms into the matrix assembly routines.
3.3.2. High-order polynomial discrete systems For a high-order discrete model, for example
a cubic discrete model, the matrix Quq can therefore be written:
Qrqij =∫
Ω
NirqNjdΩ, (10)
where the velocity q =∑Qk=1Qkqk. The above is repeated but assuming a constant r and thus
Qrqij = Qrqij+Nc∑c=1
Qcrqij
(qcij − qcij), (11)
where
Qrqcij =
∫Ω
NirbcNjdΩ, Qrq
c
ij=∫
Ω
NirqNjdΩ. (12)
3.3.3. Non-polynomial systems The extension to non-polynomial discrete systems which are
for example obtained by certain discretisation methods (e.g. upwind methods, discontinuous
Galerkin (D)G, flux limiting methods, or model parameterisation like Large Eddy Simulation
(LES) is realised by linearising the system about q say, if q is the non-linear variable. The
linearisation realised through q may have to be re-done every so often into a time dependent
simulation in order to achieve accuracy. So the matrices may be formed by a summation of
matrices. These make the matrices easy to differentiate and fast to assemble but it involves
an approximation. A similar approach may also be applied to reduced order models to help in
the rapid assembly of their equations as described below.
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 9
4. Discrete system of reduced order model equations
The aim behind reduced order models is to find a relatively small number of basis functions
(few relative to the full model) that can accurately represent the system dynamics. These basis
functions are often obtained from the full forward model results by applying a singular value
decomposition analysis to a number of snapshots of the forward solution to determine the
most energetic modes of the system and in this way optimally obtain basis functions. This is
the approach taken in POD to obtain the basis functions, see [27]. In this section the matrices
are generated by a summation of matrices similar to the colouring method used above for the
finite element equations.
4.1. POD reduced order model
Once the reduced order basis functions are obtained the variables u can be expressed
as an expansion of the POD basis functions Φux 1, . . . ,ΦuxPu, Φuy 1
, . . . ,ΦuyPu and
Φuz 1, . . . ,ΦuzPu (here Pu is the number of POD bases for velocity):
u(t, x, y, z) = u +Pu∑m=1
NΦum (x)aum(t), (13)
and pressure is decomposed into p = png + pg in which png, pg are the non-geostrophic and
geostrophic pressures (see [27] for details associated with pg) and the POD basis functions
Φp1, . . . ,ΦpPp (here Pp is the number of POD bases) for non-geostrophic pressure are used
to obtain:
png(t, x, y, z) = png +Pp∑m=1
Φp m(x)ap m(t), (14)
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
10 F. FANG ET. AL.
while the geostrophic pressure can be represented by a summation of the two sets of geostrophic
basis functions Φg ux m and Φg uy m, which are obtained by Φux m and Φuy m
[27]:
pg(t, x, y, z) = pg +Pu∑m=1
Φg ux m(x)aux m(t) +Pu∑m=1
Φg uy m(x)auy m
(t). (15)
where aum = (aux m, auy m, auz m)T and ap m are the coefficients to be determined for velocity
u = (ux, uy, uz)T and p respectively, png, pg and u are the means of the ensemble of
snapshots for the variables png, pg and u(t) respectively, NΦum is a diagonal matrix including
the POD basis for the velocity variable u, in the finite element method, the POD basis
Φum(x) =∑Ni=1NiΦumi and Φp m(x) =
∑Ni=1NiΦpm i, where Ni is the basis function in
the finite element and N is the number of nodes in the computational domain.
Substituting (13), (14) and (15) into (2) and (3), and taking the POD basis function Φp and
Φu as the test function for (2) and (3) respectively, then integrating over the computational
domain Ω, ∫Ω
NΦl F
((u +
Pu∑m=1
NΦum (x))aum, p, t, x
)dΩ, (16)
with F defined as in (2) and (3), where NΦl = Φp while for the continuity equation (2) and
NΦl = NΦu
l = diag(Φux l,Φuy l,Φuz l) for the Navier-Stokes equation (3). The POD reduced
order model is then obtained:
∂al∂t
=⟨F
((u +
Pu∑m=1
NΦum (x))aum, p, t, x
), NΦu
l
⟩, (17)
subject to the initial condition
au l(0) =⟨
(u(0,x)− u(x))T ,NΦu
l
⟩, (18)
where < ·, · > is the canonical inner product in L2 norm, a = (au, ap)T , 1 ≤ l ≤ Pp for the
continuity equation (2) while 1 ≤ l ≤ Pu for Navier-Stokes equation (3).
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 11
4.2. Discrete POD model
Here the POD reduced order model for the incompressible continuity and Navier-Stokes
equations, can be expressed as:⟨N
Φp
l ,∇ · (Pu∑m=1
NΦum (x)aum(t))
⟩= 0, (19)
and
∂aul(t)∂t
+⟨
NΦu
l ,
((u +
Pu∑m=1
NΦum (x)aum(t)) · ∇(u +
Pu∑m=1
NΦum aum(t))
)⟩+
⟨NΦu
l ,
(fk× (u +
Pu∑m=1
NΦum (x)aum(t))
)⟩+
⟨NΦu
l ,
(∇p− µ∇2(u +
Pu∑m=1
NΦum (x)aum(t))
)⟩= 0, (20)
where f represents the Coriolis inertial force, µ is the kinematic viscosity and k = (0, 0, 1)T .
The discrete model of (19) and (20) at the time level n + 1 can be written in a general form
in a subspace:
Cn+1TPOD an+1
u = 0, (21)
and
En+1PODan+1
u = Bn+1PODanu + Cn+1
PODan+1p + sn+1
POD, (22)
where En+1POD, Bn+1
POD (associated with the nonlinear term) and Cn+1POD ( associated with the
pressure term) are the matrices including all the discretisation of (19) and (20), sn+1POD is a
discretised source term.
The (m, l)th (here 1 ≤ m, l ≤ 3Pu ) entry of the matrices En+1POD and Bn+1
POD can be constructed
by the entries of the matrices En+1 and Bn+1 in the full model (5):
En+1POD ml =
3N∑i=1
3N∑j=1
En+1ij DΦu
miDΦu
l j , (23)
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
12 F. FANG ET. AL.
Bn+1POD ml =
3N∑i=1
3N∑j=1
Bn+1ij DΦu
miDΦu
l j , (24)
and
sn+1POD m =
3N∑i=1
sn+1i DΦu
mi, (25)
where En+1ij and Bn+1
ij are the (i, j)th entry of the matrices En+1 and Bn+1 respectively which
in the colouring scheme can be formed by summation of matrices (see (9)), sn+1i is the ith
entry of s, DΦumi and DΦu
l j are the ith and jth entries of the diagonal matrices DΦum and DΦu
l
respectively,
DΦuxm = diag(Φuxm 1, . . . ,ΦuxmN ), (26)
DΦuym = diag(Φuym 1
, . . . ,ΦuymN ), (27)
DΦuzm = diag(Φuzm 1, . . . ,ΦuzmN ). (28)
For quadratic discrete systems and a large reduction of the model space the matrices EPOD
and BPOD are most efficiently formed by the a summation of matrices in a manner similar to
that used to form the finite element equations using a colouring method. However, due to the
global nature of the POD basis functions the number of colours is equal to the number of basis
functions. Thus splitting them into the mean matricies EsubPOD, BsubPOD and matricies
perturbing the system from these EsubPODxm, BsubPODxm (and similarly for y and z) to
obtain:
En+1POD = EsubPOD+
Pu∑m=1
EsubPODxmauxn+1m +
Pu∑m=1
EsubPODymauyn+1m
+Pu∑m=1
EsubPODzmauzn+1m ,
(29)
Bn+1POD = BsubPOD+
Pu∑m=1
BsubPODxmauxn+1m +
Pu∑m=1
BsubPODymauyn+1m
+Pu∑m=1
BsubPODzmauzn+1m .
(30)
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 13
This decomposition enables any differentiation of these matrices to be easily performed as well
as for the reduced model to be rapidly formed every time step.
5. Discrete systems of adjoint equations
A consistent adjoint model can be derived directly from a set of matrices in the forward model.
The discrete forward model can be expressed in a general form in a subspace:
AΦ = s, (31)
A ∈ RN Nt×N Nt is the matrix making up the discretisation in the forward model where N
is the number of basis functions (e.g. finite element or POD), and Nt is the number of time
levels; Φ = (φ1, φ2, . . . , φN Nt) is the the vector of state variables; s = s(m) is a discretised
source term — the controls of discretised control variables are m = (m1,m2, ...,mC)T in which
C are the number of controls. For a nonlinear simulation, the matrix A can be calculated by
the colouring approach (9).
In general A is a function of the controls m and state variables Φ that is A = A(m,Φ).
Differentiating (31) with respect to the control variable ml, the tangent linear model (TLM)
can be expressed
∂A∂ml
Φ + A∂Φ∂ml
=∂s∂ml
, (32)
The gradient of a cost functional (J) can be expressed as:
∂J
∂ml=(∂Φ∂ml
)T∂J
∂Φ. (33)
The TLM (32) can be re-expressed:
A∂Φ∂ml
= − ∂A∂ml
Φ +∂s∂ml
. (34)
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
14 F. FANG ET. AL.
It will be convenient to use:
∂A∂ml
Φ =[∂A∂ml
]direct
Φ + G∂Φ∂ml
, (35)
in which G = (g1,g2, ...,gN ) where gk = ∂A∂φk
Φ. Also[∂A∂ml
]direct
is defined as the direct
differentiation of A with respect to ml only
∂A∂ml
Φ =[∂A∂ml
]direct
Φ +N∑k=1
∂A∂φk
Φ∂φk∂ml
, (36)
whereN∑k=1
∂A∂φk
Φ∂φk∂ml
=N∑k=1
gk∂φk∂ml
= G∂Φ∂ml
. (37)
Therefore, the perturbation of the state variables with respective to the controls can be
calculated as: (∂Φ∂ml
)T=(−[∂A∂ml
]direct
Φ +∂s∂ml
)T(A + G)−T . (38)
Substituting (38) into (33), yields
∂J
∂ml=(−[∂A∂ml
]direct
Φ +∂s∂ml
)T(A + G)−T
∂J
∂Φ. (39)
Defining (A + G)−T ∂J∂Φ = Φ∗, the adjoint model is,
(A + G)TΦ∗ =∂J
∂Φ. (40)
The gradient of the cost function can therefore be calculated using
∂J
∂ml=(−[∂A∂ml
]direct
Φ +∂s∂ml
)TΦ∗. (41)
Notice that differentiation of the matrix A is achieved in a particularly simply way if the
matrices are formed for a polynomial system as described in the previous section. Thus from
(11): [∂Auqij
∂ml
]direct
= Acuqij
bclij , (42)
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 15
in which cl is the colour of node l and bclij = 1 if nodes i and j are neighbours to node l
otherwise bclij = 0. The equality becomes an approximation when the discrete matrix A has
been approximated by a polynomial system. However, even for more complex problems the
colouring approach can still be used to simplify the differentiation as described in the next
section.
5.1. Independent Set Perturbation Adjoint (ISP-Adjoint)
The terms[∂A∂ml
]direct
Φ, ∂s∂ml
in equation (41) can be derived by hand or automatic
differentiation in the traditional approach to forming discrete or exact gradients of the
functional of interest, J . An alternative method for calculating the gradient is to apply a
small perturbation, ∆ml, to each of the controls ml. In vector form the perturbation is a
vector of length C with the only non-zero at the lth entry, i.e. ∆m = (0, ..., 0,∆ml, 0, ..., 0)T .
Taking this approach the gradient is expressed as:
A(m + ∆ml,Φ)−A(m,Φ)∆ml
Φ =A(m + ∆ml,Φ)Φ−A(m,Φ)Φ
∆ml, (43)
s(m + ∆ml)− s(m)∆ml
. (44)
Combining these, the key gradient in (41) can be written as:
(−A(m + ∆ml,Φ)Φ + s(m + ∆ml))− (−A(m,Φ)Φ + s(m))∆ml
. (45)
Moreover, perturbations of a number of variables ml at a given time can be taken without
affecting the gradients. However, these must form an independent set of variables, i.e. one
variable perturbation should not affect the results of perturbing any other variable. This can
be done for the source s and for the matrix A. In fact the matrix need not be formed as what
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
16 F. FANG ET. AL.
is required is matrix vector multiplication:
A(m + ∆ml)−A(m)∆ml
Φ.
One can use a higher order Taylor series expansion than first order to form approximations to
−[∂A∂ml
]direct
Φ +∂s∂ml
,
from perturbations in ml, but this requires more computational effort. However, as long as
∆ml is small enough (ignoring round off errors) it will be accurate enough. A good example
is the use of A(m+∆ml,Φ)−A(m−∆ml,Φ)2∆ml
which will exactly represent a quadratic A. It should
be noted that for quadratic equations (e.g. Burgers’ and the Navier-Stokes equation) and with
quadratic discretisations arbitrary large ∆ml can be taken as the results are independent of
the size of ∆ml as A and s are linear in m.
The final aspect of the ISP-Adjoint method is the calculation of the matrix G used in
the adjoint calculation which is again formed using the independent set colouring approach
outlined previously. Suppose the perturbation vector is ∆Φk = (0, ..., 0,∆φk, 0, ..., 0)T for a
perturbation ∆φk of the kth entry in Φ then the columns of G are:
gk =∂A∂φk
Φ ≈ A(m,Φ + ∆Φk)−A(m,Φ)∆φk
Φ =A(m,Φ + ∆Φk)Φ−A(m,Φ)Φ
∆φk. (46)
Notice that this equation has similar form to (43) and thus the same algorithm, and therefore
same source code, may be used to evaluate both. Suppose all perturbations are equal ∆φk = ε
then:
G =1ε
(G′ −G), (47)
where G = (AΦ,AΦ, ...,AΦ) and G′ = (g′1,g′2, ...,g
′N ) with
g′k = A(m,Φ + ∆Φk)Φ. (48)
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 17
In an implementation of the formation of the matrix G one needs to perform matrix vector
multiplications involving Nc vectors. Thus, the entire matrix G′ can be stored within Nc
vectors of length N Nt. The matrix in this form would then be distributed to the most
convenient matrix form of the discrete system of equations. By construction, G has a similar
sparsity pattern as A.
An efficient implementation of the ISP-Adjoint method can be realised by:
• Using a discretisation or assembly subroutine that does not form the matrices but simply
performs matrix vector multiplication while assembling the equations.
• Enabling this subroutine to perform a matrix vector multiplication with a number of
solution perturbations simultaneously, thus to form G for finite element one only needs
to integrate across each element once.
• Reducing the number of independent sets or colours by introducing more variables in
the solution Φ. For example, by severing the link between elements in continuous finite
element discretisations and perturbing on this new system. This method we call the
Reduced ISP-Adjoint or ISP-Adjoint which, to round off error, produces identical results
to the ISP-Adjoint method. Moreover, a general finite element code, for example, needs
no modification to the assembly subroutines as simply a different linked list for the global
solution variables associated with each element is past to these subroutines.
• Performing the matrix vector multiplication GTΦ∗ without storing matrix GT .
The last step can be realised because in time marching methods typically all of the adjoint
solution operates in a lower block structure in which each block is associated with a solution
variable at a particular time step. This means that one can solve the adjoint equations by
marching backwards in time and the matrix G has zeros on the block diagonals of the matrix.
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
18 F. FANG ET. AL.
Thus, matrix vector multiplication,
GTΦ∗ (49)
can be placed on the right hand side of the matrix solution for the adjoint in equation (40). In
this case the matrix GT need not be formed and the matrix vector multiplication in equation
(49) involving G′TΦ∗ can be made highly efficient for the kth row using:
(G′TΦ∗)k = g′Tk Φ∗ = Φ∗T (A(m,Φ + ∆Φk)Φ). (50)
In practice the perturbations associated with each node or variable k say are grouped in terms
of colours and in this way a number of these matrix vector multiplications in (50) can be
calculated concurrently, i.e. form the dot product of g′c with the part of Φ∗ associated with
node k and form (G′TΦ∗)k for each row k and for colour c in which g′c =∑k∈ colour c g′k.
6. ISP-Adjoint applied to Burgers equation
A backward Euler time stepping method at time level n and centered on control volume i for
the Burgers’ equation (1) results in the following schemes:
1. With central differencing in space
un+1i − uni
∆t+ uni
un+1i+1 − u
n+1i−1
2∆x+ ν−un+1
i+1 + 2un+1i − un+1
i−1
(∆x)2= 0; (51)
2. With first-order upwind discretisation in space:
un+1i − uni
∆t+ max(0,
12∆x
(uni + uni−1))un+1i−1 + min(0,
12∆x
(uni + uni−1))un+1i
+ max(0,1
2∆x(uni+1 + uni ))un+1
i + min(0,1
2∆x(uni+1 + uni ))un+1
i+1
−uni+1 − uni−1
2∆xun+1i + ν
−un+1i+1 + 2un+1
i − un+1i−1
(∆x)2= 0; (52)
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 19
3. With high-resolution discretisation in space:
un+1i − uni
∆t− uni− 1
2(ξni− 1
2un+1i−1 + (1− ξni− 1
2)un+1i ) + uni+ 1
2(ξni+ 1
2un+1i + (1− ξni+ 1
2)un+1i+1 )
−uni+1 − uni−1
2∆xun+1i + ν
−un+1i+1 + 2un+1
i − un+1i−1
(∆x)2= 0. (53)
To determine in the high-resolution method where to apply first-order instead of high-order
fluxes we need to be able to detect an extrema and if there is to be a smooth transition
between the two fluxes we need to quantify how close to an extrema the solution is. This is
achieved here using the normalised variable diagram NVD method [28]. Suppose unu, unc , u
nd
are ordered consecutively and unf is the face value between control volumes (CV’s) c and d. If
f = i− 12 and unf > 0 then subscripts u = i− 2, c = i− 1, d = i and if unf < 0 then subscripts
u = i+ 1, c = i, d = i− 1 and uni− 1
2= 1
2 (uni−1 + uni ). For a monotonic solution,
unf =unf − unuund − unu
∈ [0, 1], (54)
which is the non-dimensional high order face value. The non-dimensional upwind face value is
unc =unc − unuund − unu
. (55)
The idea is to obtain a linear combination unf of unf and unc such that unf equals unc when there
exists an extrema (unc /∈ [0, 1] and the scheme becomes first order) and unf moves smoothly
between this and the high order flux unf . unf is calculated from
unf =unf − unuund − unu
. (56)
The flux limited solution is
unf = unf (und − unu) + unu. (57)
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
20 F. FANG ET. AL.
unf in this equation is calculated the normalised variable diagram, that is:
unf =
min1, 2unc ,maxunc ,unf , if unc ∈ (0, 1);
unc , otherwise.
(58)
Thus at an extrema the first order non-oscillatory method will be applied:
ξnf =
12 (1− eun
f−unc
unf−un
c) + 1
2 , ifunf > 0,
1− ( 12 (1− eun
f−unc
unf−un
c) + 1
2 ), otherwise.
Using the notation of the previous section the global matrix A has the structure:
A =
P1
1∆tI P2
. . . . . .
1∆tI PNt
(59)
with Nt the number of time levels and the matrix G has the structure:
G =
. . .
K2
. . . . . .
KNt
(60)
Thus the matrix equation (40) can be solved by time marching backwards in time. For
the central difference discretisation the Kn is diagonal with diagonal contributions Knii =
un+1i+1 −u
n+1i−1
2∆x and for the upwind discretisation it is distributed with non-zero values of
Kni i−1,Kn
i i,Kni i+1 taken from the coefficients of uni−1, uni , uni+1 respectively of the advection
term in equation (52).
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 21
For the central difference scheme (51) the matrices Kn are diagonal and thus only a single
colour (Nc = 1) is necessary to determine their values. The upwind difference scheme (52) has
a lower bi-diagonal matrices Kn for positive advection that is when uni+ 1
2> 0 (assuming the
cells are ordered consecutivitly from left to right) and two colours (Nc = 2) are needed to form
the matrices Kn. When there is a mixed sign then the matrix has a bandwidth of three with a
maximum of three non-zero’s per row and in this case three colours (Nc = 3) are needed. Due
to the use of the far field upwind values in the flux limiting calulcaiton of the high-resolution
method (53) it’s matrices Kn have a bandwidth of five with five non-zeros’ per row and thus
five colours (Nc = 5) are necessary and are used here to form the matrices Kn. If, instead
of linearising the previous time level value of the solution, one was to iterate to convergence
within a time step in order to calculate the non-linear terms at the future time level then the
matrices Kn would appear on the block diagonal of the system of equations occupying the
same place as Pn in the system of adjoint equations.
For a unit 1D domain and zero boundary conditions at both ends and space and time steps
of ∆t = 0.01, ∆x = 0.1 (11 cells) and a functional J = 12∆x
∑Ni=1(uNt
i )2 whose sensitivity with
respect to the initial conditions (u0j = exp(− (xj−0.5)2
0.12 ) for all control volume center positions
xj ∈ [0, 1], see figure 2a), is sought and x is the 1D spacial variable. The number of time
step for the 11 cell mesh is Nt = 50. As with all calculations presented in this paper they are
performed in double precision. The value of ∆u = 10−6 is used in the ISP-Adjoint sensitivity
calculation. There is also a fine mesh with 100 cells and ∆t = 0.001, ∆x = 0.01, Nt = 500.
The initial and final solution at the final time level are shown in figure 2a and 2b along with
the adjoint at the start of time n = 1 figure 2c and the sensitivity ∂J∂u0
iin figure 2d. The adjoint
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
22 F. FANG ET. AL.
solution essentially advects the results from its source,
∂J
∂uNti
= ∆xuNti ,
(which mirrors the forward solution at adjoint time level Nt) backwards through the domain.
This propagates the kinetic energy information at the final time level to the initial conditions
where the importance of the initial conditions contribution to J can be determined.
It should be noted that the accuracy of the sensitivity/gradient is such that no difference
can be observed visually between the exact gradient (obtained by taking perturbations of the
individual variables in the initial conditions with a perturbation of 10−10) and the gradients
shown in figure 2d. The maximum error in the sensitivity ∂J∂u0
iwhich is normalised by the
maximum value of ∂J∂u0
k,∀k is shown for the high-resolution method versus the perturbation
∆u used in the ISP-Adjoint calculation in figure 3. The sensitivity calculated by the ISP-
Adjoint becomes more accurate as ∆u is reduced until it gets to the point where round off
error interfers with its accuracy and its rate of decrease slows and eventually the error starts
to increase. It should be born in mind that due to the highly non-linear nature of the high-
resolution method the matrix A is not linear in un ∀n and thus the perturbation used in the
ISP-Adjoint calculation is an approximation. The central difference method (51) has a linear
A in un ∀n and thus the ISP-Adjoint is exact (to computer round off error). In addition,
the first order upwind scheme in (52) is ’mostly’ exact. It is only non-exact and non-linear
when a face value of the advection velocity (used to advect the velocity un+1) switches sign
on application of the perturbation ∆u which is not very likely to happen at least for small
∆u relative the the size of u. In addition, when it does happen the advection velocity will be
small and will thus contribute little to the gradient calculation.
The accuracy of the adjoint code can be ascertained through a gradient check [29], where
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 23
2 1 2
4 3
2 1 2
1 4 3
1
4 3 4 3
3
9
6
3
9
1
7
4
1
7
2
8
5
2
8 9
3
6
9
3
7
1
4
7
1
3
9
1
7
2
8 9
3
6 6 4 5
(a) (b) (c)
Figure 1. Extended colouring for material properties and non-linearities. (a) Graph colouring scheme
for an element-wise material property or non-linearity. Four colours are needed for 2-D linear
quadrilateral elements. (b) Graph colouring scheme for a node-wise material property or non-linearity.
Nine colours are needed for 2-D linear quadrilateral elements. (c) For the case shown in (b) the dotted
line indicates the nodes that are directly linked to both nodes 1 and 2. These are the nodes that their
needs to be an independent set for and there must be coloured with a different number than 1 and 2.
the quantity w(α) is calculated and it is verified that it satisfies w(α) = 1+O(α). The quantity
w is defined as
w(a) =J(u0 + ah)− J(u0)
ahT∇J(u0), (61)
where, the cost function J = 12∆x
∑Ni=1(uNt
i )2,∇J(u0) = ∂J(u0)∂u0 and h is an arbitrary vector of
unit length (here h = ∇J/‖∇J‖2). The vector u0 = (u01, u
02, ..., u
0N )T of initial control volume
values of velocity represents the control variables, here the initial velocity u0 =∑Nj=1Mju
0j
with basis function Mj unity in j and zero elsewhere. For values of a that are small, but not
too close to the machine zero, it is expected that w(a) should be close to unity. To quantify
the gradient accuracy, figure 4 shows the variation of the function w with respect to a. It can
be seen that, as required, the function w(a) is close to unity when a varies between 10−3 –
10−12. Notice that the gradient of the first order upwind scheme seems independent of the
perturbation which is because it is ’mostly’ independent of the size of the perturbation ∆u.
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
24 F. FANG ET. AL.
0 0.2 0.4 0.6 0.8 1x
0
0.2
0.4
0.6
0.8
1
u
fine mesh11 cell mesh
0 0.2 0.4 0.6 0.8 1x
0
0.1
0.2
0.3
0.4
0.5
0.6
u
fine meshcentralupwindhigh-res
(a) (b)
0 0.2 0.4 0.6 0.8 1x
0
0.0001
0.0002
0.0003
0.0004
0.0005
u*
fine meshcentralupwindhigh-res
0 0.2 0.4 0.6 0.8 1x
0
0.01
0.02
0.03
0.04
0.05
sens
itivi
ty
fine meshcentralupwindhigh-res
(c) (d)
Figure 2. Burgers equation solution (a) initial conditions for a coarse (11 cells) and fine (101 cells)
mesh (b) the forward solution at the end of time t = 0.5 and for the central difference method, the
upwind method and the high resolution methods along with a fine mesh high resolution result. (c)
corresponding adjoint and (d) sensitivities for the four simulations shown in (b).
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 25
1e-08 1e-06 0.0001 0.01perturbation in u
1e-08
1e-06
0.0001
0.01
norm
alis
ed e
rror
in s
ensi
tivity
high-res. - double prec.high-res. - single prec.central scheme - single prec.high-res. - 2nd order double prec.high res. - 2nd order single prec.
Figure 3. The normalised maximum error in the sensitivity for the coarse mesh calculation against
varying ∆u in the ISP-Adjoint calculation. The error is normalised with the maximum magnitude of
the sensitivity/gradient and the error is determined for both single and double precision. The central
scheme has an exact gradient (ignoring round off error) and thus its minimum value gives the best
error achievable using single precision. Also shown are the normalised errors in the high-resolution
gradients for single and double precision when a second order finite difference method is used - that
is A(m+∆ml)−A(m−∆ml)2∆ml
, A(m+∆φk)−A(m−∆φk)2∆φk
instead of A(m+∆ml)−A(m)∆ml
, A(m+∆φk)−A(m)∆φk
.
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
26 F. FANG ET. AL.
(a) (b)
Figure 4. The accuracy of the adjoint model (a) First order unwinding; (b) Flux limiting. Equation
(61): w(a) =`J(u0 + ah)− J(u0)
´/ahT∇J(u0) (here h = ∇J/‖∇J‖2) is used. The cost function J
is defined as the kinematic energy at t = 0.5 and the control variables are the initial velocities u0.
This properly may also be shared with other schemes that have a switch in the discretisation
depending on if information is going into or out of a cell or element e.g. Discontinuous Galerkin
methods, see [30]. It can thus be concluded, at least for this model problem set-up, that the
adjoint model developed here, is correct and accurate.
7. ISP-Adjoint applied to the Munk gyre
The approach for sensitivity analysis is applied to an idealised ocean gyre problem discretised
with a reduced order model. The sensitivity of the kinetic energy at the end of the simulation
with respect to the initial conditions will be sought. Although this is a relatively easily
differentiated problem because the descrete system of reduced equations are quadratic and the
matricies that change every time step are given by equations (29) and (30) it does demonstrate
how easily applied the ISP-Adjoint is. Since the system of equations are quadratic any sized
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 27
perturbation can be applied and thus a unity pertabation is applied to all variables in the
differentiation.
7.1. Matrix equation structure for fluids discretisations
The matrix that is solved at time level n typically has a structure which is a block matrix
representation of the discrete system 4:
Pn =
En Cn
CnT 0
(62)
and
Hn =
Bn 0
0 0
, (63)
and the forward model during the whole simulation period is therefore:
AΦ =
P1
H2 P2
. . . . . .
HNt PNt
Φ1
Φ2
...
ΦNt
=
s1
s2
...
sNt
, (64)
and the corresponding adjoint model is:
(AT + GT )Φ∗ =
P1T H2T
P2T . . .
. . . HNtT
PNtT
+ GT
Φ∗1
Φ∗2
...
Φ∗Nt
=
∂J∂Φ1
∂J∂Φ2
...
∂J∂ΦNt
. (65)
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
28 F. FANG ET. AL.
in which the forward and adjoint solutions are Φn = (unT ,pnT )T and Φ∗n = (u∗nT ,p∗nT )T
respectively.
7.2. Application to the Munk gyre
The ISP-Adjoint approach is tested in a computational domain of horizontal dimensions, 1000
km by 1000 km with a depth of H = 500 m. The wind forcing on the free surface is given by
τy = τ0cos(πy/L), τx = 0.0, (66)
where τx and τy are the wind stresses on the free surface along the x and y directions
respectively, and L = 1000 km. A maximum zonal wind stress of τ0 = 0.1 N m−1 is applied in
the latitudinal (y) direction. The Coriolis terms are taken into account using the beta-plane
approximation (f = βy) where β = 1.8×10−11 and the reference density is ρ0 = 1000 kg m−1.
The problem is non-dimensionalised with the maximum Sverdrup balance velocity
βHρ0v =∂τ
∂y≤ τ0π
L⇒ v ≤ 3.5× 10−2m s−1 (67)
(and so the velocity scale U = 3.5×10−2 m s−1 is used here), and the length scale is L = 1000
km. Time is non-dimensionalised with T = LU . The spin-up period is 0.1512 (50 days). The
equilibrium state at 50 days is taken as the initial state for both the full and reduced models.
The snapshots are collected from the results obtained in the full model during the simulation
period [50, 150] days. The time step is 3.78× 10−4, equivalent to 3h. Incorporating the beta-
plane approximation yields a non-dimensional β∗ = L2βU = 514.3. The non-dimensional wind
stress (applied as a body force here averaged over the depth of the domain) takes the same
cosine of latitude profile with τ∗0 = τ0L(U2ρ0H) = 163.3. No-slip boundary conditions are applied
to the lateral boundaries. The Reynolds number is defined as Re = ULν (here the kinematic
viscosity is 140 m2s−1). It should be pointed out that since the discrete system of equations
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 29
is quadratic application of the ISP-Adjoint is exact (to computational round off) for both the
reduced order and full system of finite element equations.
The POD bases are constructed by the snapshots which are obtained from the numerical
solutions by forcing the full forward model with the initial velocity (the background flow).
Forty snapshots with 35 POD bases for the velocity field u, v, w and pressure are chosen
which capture more than 99.5% of energy (calculated by the first 35 leading eigenvalues).
In ocean modelling, the pressure term also plays an important role in the geostrophic balance.
In this study, taking into account the role of the pressure term in both the POD-Galerkin model
and the geostrophic balance, the pressure in the momentum equations is divided into two parts:
p = png + pg. To accurately represent geostrophic pressure, its basis functions are split into
two sets: Φnguxand Φnguy
which are associated with the ux- and uy-velocity components.
The geostrophic pressure can be obtained from a quadratic finite element representation while
linear finite element representations are used for the velocity components. Furthermore the
geostrophic pressure can be represented by a summation of the two sets of geostrophic basis
functions, which are calculated by solving the elliptic equations using a conjugate gradient
iterative method (for details see [31]).
In this experiment, the cost function J is defined as the kinematic energy at the final time
level (n = Nt or 200 days):
J =12
∫Ω
||uNt ||22 dΩ. (68)
The sensitivity analysis of J with respect to the initial velocity u0 has been carried out, and
the corresponding results are shown in figure 5.
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
30 F. FANG ET. AL.
(a) (b) (c)
Figure 5. (a) Initial velocity vector (t = 100 days); (b) the velocity vector at t = 200 days on a regular
mesh; (c) Adjoint sensitivity/gradient of the final kinetic energy at t = 200 days with respect to the
initial velocity.
8. ConclusionsA new method of differentiating potentially complex mesh based numerical models is presented
and applied to simple Burgers equation and ocean gyre example problems. This method is
referred to here as the Independent Set Perturbation Adjoint approach (ISP-Adjoint) since
differentiation of the matrices and sources of the discrete forward finite element or control
volume model is realized by colouring variables to form a number of independent sets of
variables.
Three discretisation methods central difference, first order upwind, and high-resolution
schemes were applied to discretise Burgers equation to demonstrate that exactly the same
ISP-Adjoint code can switch between discretisations or parameterizations without changing
anything associated with the adjoint or gradient calculations - high resolution methods are
often used for parameterizations or models e.g. for turbulence. A reduced order Monk gyre
problem is solved here to demonstrate the application of the ISP-Adjoint to more complex
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 31
fluid flow problems.
The ISP-Adjoint method could make more tractable the formation of arbitrarily complex
multi-physics mesh-based (4D-VAR) models. It can enable low and high order adjoints of
models to be manipulated much more freely so that quicker progress can be made in data
assimilation, sensitivity and uncertainty analysis, error analysis, adaptive observations and a
whole host of other adjoint applications.
Acknowledgements
This work was carried out under funding from the UK’s Natural Environment Research Council
(projects NER/A/S/2003/00595, NE/C52101X/1 and NE/C51829X/1), the Engineering and
Physical Sciences Research Council (GR/R60898), the Leverhulme Trust (F/07058/AB),
and the USA National Science Foundation (NSF, grants ATM-0201808 and CCF-0635162),
and with support from the Imperial College High Performance Computing Service and the
Grantham Institute for Climate Change. Many thanks to Tamzin Mole for help with the
technical writing up of this work.
REFERENCES
1. A. M. Moore, N. S. Cooper, and D. L. T. Anderson. Initialisation and data assimilation in models of the
Indian Ocean. J. Phys. Oceanogr. 1987; 17:1965–1977.
2. F. X. Le Dimet and I. M. Navon. Variational and optimization methods in meteorology: A review,
1988. Technical Report: Early review on variational data assimilation, SCRI report No 144, pp88.
http://people.scs.fsu.edu/ navon/freqreq.html.
3. M. Wenzel, J. Schroter, and D. Olbers. The annual cycle of the global ocean circulation as determined
by 4D VAR data assimilation. Prog. Oceanog. 2001; 48(1):73–119.
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
32 F. FANG ET. AL.
4. Y. Q. Zhu and I. M. Navon. Impact of parameter estimation on the performance of the FSU global
spectral model using its full-physics adjoint. Mon. Weath. Rev., 127(7):1497–1517, 1999.
5. D. G. Cacuci, M. I. Bujor, and I. M. Navon. Sensitivity and Uncertainty Analysis: Applications to Large-
scale Systems Vol 2, CRC, 2005.
6. B. F. Sanders, N. D. Katopodes. Adjoint sensitivity analysis for shallow water wave control. J. Eng.
Mech., 126:909-919, 2000.
7. D. N. Daescu and I. M. Navon. Adaptive observations in the context of 4D-Var data assimilation.
Meteorol. Atmos. Phys. 2004; 85(4):205–226.
8. N. A.Pierce, and M. B.Giles. Adjoint recovery of superconvergent functionals from PDE approximations.
SIAM Rev., 42:247–264, 2000.
9. P. W. Power, C. C. Pain, M. D. Piggott, G. J. Gorman, F. Fang, D. P. Marshall, and A. J. H.
Goddard. Adjoint goal-based error norms for adaptive mesh ocean modelling. Ocean Modell., 15: 3–
38, doi:10.1016/j.ocemod.2006.05.001, 2006.
10. Martins J.R.R.A., Alonso J.J., and Reuther J.J. A coupled-adjoint sensitivity analysis method for high-
fidelity aero-structural design. Optim. Eng., 6(1):33–62, 2005.
11. Mohammadi B., Male J.M., and Rostaing-Schmidt N. Automatic differentiation in direct and
reverse modes: application to optimum shapes design in fluid mechanics. In Computational
Differentiation:Techniques, Applications, and Tools, edited by M. Berz, C.H. Bischof, G.F. Corliss and A.
Griewank:309–318, 1996(SIAM:Philadelphia, PA).
12. Sherman L.L, Taylor A.C., Green L.L, Newman P.A., and Hou G.W. Vamshi Mohan Korivi. First- and
second-order aerodynamic sensitivity derivatives via automatic differentiation with incremental iterative
methods. J. Comput. Phys., 129(2):307–331, 1996.
13. Marta A.C., Mader C.A., Martins J. R. R. A., Van der Weide E., and Alonso J.J. A methodology for
the development of discrete adjoint solvers using automatic differentiation tools. International Journal
of Computational Fluid Dynamics, 21(9–10):307–327, 2007.
14. Boudjemaa R., Cox M.G., Forbes A.B., and Harris P.M. Automatic differentiation and its application in
metrology. Advanced Mathematical and Computational Tools in Metrology VI, edited by P Ciarlini, M
G Cox, F Pavese and G B Rossi:170–179, 2004.
15. Baalousha H. and Kongeter J. Stochastic modelling and risk analysis of groundwater pollution using form
coupled with automatic differentiation. Advances in Water Resources, 29:1815–1832, 2006.
16. Glowinski R. and Toivanen J. A multigrid preconditioner and automatic differentiation for non-equilibrium
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
THE INDEPENDENT SET PERTURBATION ADJOINT METHOD 33
radiation diffusion problems. Journal of Computational Physics, 207:3545–374, 2005.
17. Green L.L., Newman P.A., and Haigler K.J. Sensitivity derivatives for advanced cfd algorithm and viscous
modeling parameters via automatic differentiatio. Journal of Computational Physics, 125(2):313–324,
1996.
18. Bischof C.H., Corliss G.F., Green L., Griewank A., Haigler K., and Newman P. Automatic differentiation
of advanced CFD codes for multidisciplinary design. Computing Systems in Engineering, 3(6):625–637,
1992.
19. Liu Z. and Sandu A. On the properties of discrete adjoints of numerical methods for the advection
equation. International journal for numerical methods in fluids, 56(7):769–803, 2008.
20. Reuther J., Alonso J.J., Jameson A., Rimlinger M., and Saunders D. Constrained multipoint aerodynamic
shape optimization using an adjoint formulation and parallel computers: part i. j. aircraft. J. Aircraft,
36(1):51–60, 1999.
21. Reuther J., Alonso J.J., Jameson A., Rimlinger M., and Saunders D. Constrained multipoint aerodynamic
shape optimization using an adjoint formulation and parallel computers: part ii. j. aircraft. J. Aircraft,
36(1):61–74, 1999.
22. Heimbach P., Hill C., and Giering R. An efficient exact adjoint of the parallel MIT general circulation
model, generated via automatic differentiation. Future Generation Computer Systems, Elsevier,
21(8):1356–1371, 2005.
23. Stammer D., Wunsch C., Giering R., Eckert C., Heimbach P., Marotzke J., Adcroft A., Hill C.N., and
Marshall J. The global ocean circulation during 1992-1997, estimated from ocean observations and a
general circulation model. J. Geophys. Res., 107(C9):1–27, 2002.
24. M. D. Piggott, G. J. Gorman, C. C. Pain, P. A. Allison, A. S. Candy, B. T. Martin, and M. R. Wells.
A new computational framework for multi-scale ocean modelling based on adapting unstructured meshes.
Int. J. Numer. Methods Fluids, 56:1003–1015, 2008.
25. M. T. Jones, P. E. Plassmann. A parallel graph coloring heuristic. SIAM Journal on Scientific Computing,
14(3):654–669, 1993.
26. R. Ford, C. C. Pain, M. D. Piggott, A. J. H. Goddard, C. R. E. de Oliveira, and A. P. Umbleby.
A nonhydrostatic finite-element model for three-dimensional stratified oceanic flows. Part I: Model
formulation. Mon. Weath. Rev. 2004; 132(12):2816–2831.
27. F. Fang, C. C. Pain, I. M. Navon, M. D. Piggott, G. J. Gorman, P. E. Farrell, P. Allison, and A. J. H.
Goddard. A POD reduced-order 4D-Var adaptive mesh ocean modelling approach. International Journal
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls
34 F. FANG ET. AL.
for Numerical Methods in Fluids, DOI: 10.1002/fld.1911., 2008.
28. B. P. Leonard. The ULTIMATE conservative difference scheme applied to unsteady one-dimensional
advection, Computing Methods in Apllied Mechanics and Engineering, 88:17-74, 1991.
29. I. M. Navon, X. Zou, J. Derber, and J. Sela. Variational data assimilation with an adiabatic version of
the NMC Spectral Model. Mon. Weath. Rev., 120:55–79, 1992.
30. R. T. Ackroyd. Finite Element Methods for Particle Transport: Applications to Reactor and Radiation
Physics Research Studies in Particle and Nuclear Technology, 6, Taylor & Francis Group, London, 1997.
31. F. Fang, C. C. Pain, I. M. Navon, M. D. Piggott, G. J. Gorman, P. Allison, and A. J. H. Goddard. Reduced
order modelling of an adaptive mesh ocean model. International Journal for Numerical Methods in Fluids,
DOI: 10.1002/fld.1841., 2008.
Copyright c© 2004 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 2004; 00:1–6
Prepared using fldauth.cls