28
ESAIM: COCV 18 (2012) 1122–1149 ESAIM: Control, Optimisation and Calculus of Variations DOI: 10.1051/cocv/2011192 www.esaim-cocv.org ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION Pedro Morin 1 , Ricardo H. Nochetto 2 , Miguel S. Pauletti 3 and Marco Verani 4 Abstract. We examine shape optimization problems in the context of inexact sequential quadratic programming. Inexactness is a consequence of using adaptive finite element methods (AFEM) to approx- imate the state and adjoint equations (via the dual weighted residual method), update the boundary, and compute the geometric functional. We present a novel algorithm that equidistributes the errors due to shape optimization and discretization, thereby leading to coarse resolution in the early stages and fine resolution upon convergence, and thus optimizing the computational effort. We discuss the ability of the algorithm to detect whether or not geometric singularities such as corners are genuine to the problem or simply due to lack of resolution – a new paradigm in adaptivity. Mathematics Subject Classification. 49M25, 65M60. Received July 1st, 2011. Revised September 19, 2011 Published online 16 January 2012. 1. Shape optimization as adaptive sequential quadratic programming We consider shape optimization problems for partial differential equations (PDE) that can be formulated as follows: we denote with u = u(Ω) the solution of a PDE in the domain Ω of R d (d 2), Bu(Ω)= f, (1.1) Keywords and phrases. Shape optimization, adaptivity, mesh refinement/coarsening, smoothing. Partially supported by UNL through GRANT CAI+D 062-312, by CONICET through Grant PIP 112-200801-02182, by MinCyT of Argentina through Grant PICT 2008-0622 and by Argentina-Italy bilateral project “Innovative numerical methods for industrial problems with complex and mobile geometries”. Partially supported by NSF grants DMS-0505454 and DMS-0807811. Partially supported by NSF grants DMS-0505454 and DMS-0807811, and by Award No. KUS-C1-016-04, made by King Abdullah University of Science and Technology (KAUST). Partially supported by Italian MIUR PRIN 2008 “Analisi e sviluppo di metodi numerici avanzati per EDP” and by Argentina- Italy bilateral project “Innovative numerical methods for industrial problems with complex and mobile geometries”. 1 Departamento deMatem´atica, Facultad deIngenier´ ıa Qu´ ımicaand Instituto de Matem´atica Aplicada del Litoral, Universidad Nacional del Litoral, CONICET, Santa Fe, Argentina. [email protected]; www.imal.santafe-conicet.gov.ar/pmorin 2 Department of Mathematics and Institute for Physical Science and Technology, University of Maryland, College Park, USA. [email protected]; www.math.umd.edu/~rhn 3 Department of Mathematics and Institute for Applied Mathematics and Computational Science, Texas A&M University, College Station, 77843 TX, USA. [email protected]; www.math.tamu.edu/~pauletti 4 MOX – Modelling and Scientific Computing – Dipartimento di Matematica “F. Brioschi”, Politecnico di Milano, Milano, Italy. [email protected]; mox.polimi.it/~verani Article published by EDP Sciences c EDP Sciences, SMAI 2012

Adaptive finite element method for shape optimization∗

  • Upload
    others

  • View
    17

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Adaptive finite element method for shape optimization∗

ESAIM: COCV 18 (2012) 1122–1149 ESAIM: Control, Optimisation and Calculus of VariationsDOI: 10.1051/cocv/2011192 www.esaim-cocv.org

ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION ∗

Pedro Morin1, Ricardo H. Nochetto

2, Miguel S. Pauletti

3

and Marco Verani4

Abstract. We examine shape optimization problems in the context of inexact sequential quadraticprogramming. Inexactness is a consequence of using adaptive finite element methods (AFEM) to approx-imate the state and adjoint equations (via the dual weighted residual method), update the boundary,and compute the geometric functional. We present a novel algorithm that equidistributes the errorsdue to shape optimization and discretization, thereby leading to coarse resolution in the early stagesand fine resolution upon convergence, and thus optimizing the computational effort. We discuss theability of the algorithm to detect whether or not geometric singularities such as corners are genuine tothe problem or simply due to lack of resolution – a new paradigm in adaptivity.

Mathematics Subject Classification. 49M25, 65M60.

Received July 1st, 2011. Revised September 19, 2011Published online 16 January 2012.

1. Shape optimization as adaptive sequential quadratic programming

We consider shape optimization problems for partial differential equations (PDE) that can be formulated asfollows: we denote with u = u(Ω) the solution of a PDE in the domain Ω of Rd (d ≥ 2),

Bu(Ω) = f, (1.1)

Keywords and phrases. Shape optimization, adaptivity, mesh refinement/coarsening, smoothing.

∗ Partially supported by UNL through GRANT CAI+D 062-312, by CONICET through Grant PIP 112-200801-02182, byMinCyT of Argentina through Grant PICT 2008-0622 and by Argentina-Italy bilateral project “Innovative numerical methodsfor industrial problems with complex and mobile geometries”.Partially supported by NSF grants DMS-0505454 and DMS-0807811.Partially supported by NSF grants DMS-0505454 and DMS-0807811, and by Award No. KUS-C1-016-04, made by King AbdullahUniversity of Science and Technology (KAUST).Partially supported by Italian MIUR PRIN 2008 “Analisi e sviluppo di metodi numerici avanzati per EDP” and by Argentina-Italy bilateral project “Innovative numerical methods for industrial problems with complex and mobile geometries”.1 Departamento de Matematica, Facultad de Ingenierıa Quımica and Instituto de Matematica Aplicada del Litoral, UniversidadNacional del Litoral, CONICET, Santa Fe, Argentina. [email protected];www.imal.santafe-conicet.gov.ar/pmorin2 Department of Mathematics and Institute for Physical Science and Technology, University of Maryland, College Park, [email protected]; www.math.umd.edu/~rhn3 Department of Mathematics and Institute for Applied Mathematics and Computational Science, Texas A&M University,College Station, 77843 TX, USA. [email protected]; www.math.tamu.edu/~pauletti4 MOX – Modelling and Scientific Computing – Dipartimento di Matematica “F. Brioschi”, Politecnico di Milano, Milano, [email protected]; mox.polimi.it/~verani

Article published by EDP Sciences c© EDP Sciences, SMAI 2012

Page 2: Adaptive finite element method for shape optimization∗

ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION 1123

which we call the state equation. Given a cost functional J [Ω] = J [Ω, u(Ω)], which depends on Ω itself and thesolution u(Ω) of the state equation, we consider the minimization problem

Ω∗ ∈ Uad : J [Ω∗, u(Ω∗)] = infΩ∈Uad

J [Ω, u(Ω)], (1.2)

where Uad is a set of admissible domains in Rd. We view this as a constrained minimization problem, (1.1)being the constraint. The goal of this paper is, assuming the existence of a local minimizer Ω of (1.1)–(1.2),to formulate and test a practical and efficient computational algorithm that adaptively builds a sequence ofdomains Ωkk≥0 converging to Ω. Coupling adaptivity with shape optimization seems to be important butrather novel.

To achieve this goal we will define an adaptive sequential quadratic Programming algorithm (ASQP). Tomotivate and briefly describe the ideas underlying ASQP, we need the concept of shape derivative δΩJ [Ω;v] ofJ [Ω] in the direction of a velocity v, which usually satisfies

δΩJ [Ω;v] =∫

Γ

g(Ω)v dS = 〈g(Ω), v〉Γ , (1.3)

where v = v·ν is the normal component of v to Γ = ∂Ω, the boundary of Ω, and g(Ω) is the Riesz representationof the shape derivative. We postpone the precise definition of (1.3) until Section 2.2. We will see later that g(Ω)depends on u(Ω) and on the solution z(Ω) of an adjoint equation. We present ASQPin two steps: we first introducean infinite dimensional sequential quadratic programming (∞-SQP) algorithm, which is an ideal but impracticalalgorithm, and next we discuss its adaptive finite dimensional version, which is responsible for the inexact natureof ASQPthat renders it practical.

Exact SQP algorithm. To describe the∞-SQPalgorithm, we let Ωk be the current iterate and Ωk+1 be the newone. We let Γk := ∂Ωk and V(Γk) be a Hilbert space defined on Γk. We further let AΓk

[·, ·] : V(Γk)×V(Γk)→ R

be a symmetric continuous and coercive bilinear form, which induces the norm ‖ · ‖V(Γk) and gives rise tothe elliptic selfadjoint operator Ak on Γk defined by 〈Akv, w〉Γk

= AΓk[v, w]. We then consider the following

quadratic model Qk : V(Γk)→ R of J [Ω] at Ωk

Qk(w) := J [Ωk] + δΩJ [Ωk;w] +12AΓk

[w, w]. (1.4)

We denote by vk the minimizer of Qk(w), which satisfies

vk ∈ V(Γk) : AΓk[vk, w] = −〈gk, w〉Γk

∀w ∈ V(Γk), (1.5)

with gk := g(Ωk). It is easy to check that vk given by (1.5) is the unique minimizer of Qk(w) and the coercivityof the form AΓk

(·, ·) implies that vk is an admissible descent direction, i.e. δΩJ [Ωk;vk] < 0, unless vk = 0, inwhich case we are at a stationary point of J [Ω]. We remark that using (1.5) is classical in the literature of shapeoptimization; see e.g. [9, 11, 15, 22].

Once vk has been found on Γk, we need to determine a vector field vk in Ωk so that vk ·νk = vk on Γk, alongwith a suitable stepsize μ so that the updated domain Ωk+1 = Ωk +μvk := y ∈ Rd : y = x+μvk(x), x ∈ Ωkgives a significant decrease of the functional value J [Ωk]. We are now ready to introduce the exact (infinitedimensional) sequential quadratic programming algorithm (∞-SQP) for solving the constrained optimizationproblem (1.1)–(1.2):

∞-SQP AlgorithmGiven the initial domain Ω0, set k = 0 and repeat the following steps:

(1) Compute uk = u(Ωk) by solving (1.1)(2) Compute the Riesz representation gk = g(Ωk) of (1.3)(3) Compute the search direction vk by solving (1.5) and extend it to vk

(4) Determine the stepsize μk by line search(5) Update: Ωk+1 = Ωk + μkvk; k := k + 1

Page 3: Adaptive finite element method for shape optimization∗

1124 P. MORIN ET AL.

The ∞-SQP algorithm is not feasible as it stands, because it requires the exact computation of the followingquantities at each iteration:

• the solution uk to the state equation (1.1);• the solution zk to the adjoint equation which in turn defines gk;• the solution vk to problem (1.5);• the values of the functional J in the line search routine, which in turn depend on uk.

Adaptive SQP algorithm (ASQP). In order to obtain a practical algorithm, we replace all of the above non-computable operations by finite approximations. This leads to the adaptive sequential quadratic Programmingalgorithm (ASQP), which adjusts and balances the accuracies of the various approximations along the iteration.It is worth noticing that the adaptive algorithm has to deal with two distinct main sources of error: theapproximation of the PDE (PDE error) and the approximation of the domain geometry (geometric error). Weobserve that the approximation of (1.1) and the values of the functional J and of its derivative relate to the PDEError, whereas the approximation of (1.5) and domain update lead to the geometric error. Since it is wastefulto impose a PDE error finer than the expected geometric error, we devise a natural mechanism to balance thecomputational effort.

The ASQP algorithm is an iteration of the form

Ek → APPROXJ→ SOLVE→ RIESZ→ DIRECTION→ LINESEARCH → UPDATE→ Ek+1,

where Ek = Ek(Ωk, Sk, Vk) is the total error incurred in at step k, Sk = Sk(Ωk) is the finite element space definedon Ωk and Vk = Vk(Γk) is the finite element space defined on the boundary Γk. We now describe briefly eachmodule along with the philosophy behind ASQP. Let Gk be an approximation to the shape derivative g(Ωk), letvk ∈ V(Γk) be the exact solution of (1.5) on Γk and let Vk ∈ Vk(Γk) be its finite element approximation.

The discrepancy between vk and Vk leads to the geometric error. Upon using a first order Taylor expansionaround Ωk, together with (1.5) for the exact velocity vk, we obtain∣∣J [Ωk + μkVk]− J [Ωk + μkvk]

∣∣ ≈ μk

∣∣δΩJ [Ωk; Vk − vk]∣∣ = μk

∣∣AΓk[vk, Vk − vk]

∣∣ ≤ μk‖vk‖Γk‖vk − Vk‖Γk

.

Motivated by this expression, we now define two modules, APPROXJ and DIRECTION, in which adaptivity iscarried out. These modules are driven by different adaptive strategies and corresponding different tolerances,and tolerance parameters γ (PDE) and θ (geometry). Their relative values allow for different distributions ofthe computational effort in dealing with the PDE and the geometry.

The routine DIRECTION enriches/coarsens the space Vk to control the quality of the descent direction

‖Vk − vk‖Γk≤ θ‖Vk‖Γk

, (1.6)

which implies that〈Vk, vk〉Γk

‖Vk‖Γk‖vk‖Γk

≥√

1− θ2,

and thus the cosine of the angle between Vk and vk (as elements of V(Γk)) is bounded below by cosπ/6 =√3/2 provided θ ≤ 1/2. This guarantees that the angle between the directions Vk and vk is ≤ π/6. Besides

(1 − θ)‖Vk‖Γk≤ ‖vk‖Γk

≤ (1 + θ)‖Vk‖Γk, which implies a geometric error proportional to μk‖Vk‖2Γk

, namely∣∣J [Ωk + μkVk]− J [Ωk + μkvk]∣∣ ≤ δμk‖Vk‖2Γk

, (1.7)

with δ := θ(1 + θ) ≤ 32θ. Adaptivity in the module DIRECTION is guided by a posteriori estimators for the

energy error given by the bilinear form AΓk[·, ·]. In the applications of Sections 5 and 6, Ak is related to the

Laplace-Beltrami operator over Γk.

Page 4: Adaptive finite element method for shape optimization∗

ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION 1125

On the other hand, the module APPROXJ enriches/coarsens the space Sk to control the error in the approximatefunctional value Jk[Ωk + μkVk] to the prescribed tolerance γμk‖Vk‖2Γk

,∣∣J [Ωk + μkVk]− Jk[Ωk + μkVk]∣∣ ≤ γμk‖Vk‖2Γk

, (1.8)

where γ = 12 − δ ≥ δ prevents excessive numerical resolution relative to the geometric one; this is feasible if

θ ≤ 1/5. Adaptivity in APPROXJ is guided by the dual weighted residual method (DWR) [3, 6], taylored to theapproximation of the functional value J , instead of the usual energy estimators.

The remaining modules perform the following tasks. The module SOLVE finds finite element solutions Uk ∈ Sk

of (1.1) and Zk ∈ Sk of an adjoint equation (necessary for the computation of the shape derivative gk = g(Ωk)),while RIESZ builds on Sk an approximation Gk to gk. Finally, the module LINESEARCH finds an optimal stepsizeμ while using, if necessary, Lagrange multipliers to enforce domain constraints present in the definition of Uad.

Energy decrease. The triangle inequality, in conjunction with conditions (1.7) and (1.8), yields∣∣J [Ωk + μkVk]− Jk[Ωk + μkvk]∣∣ ≤ 1

2μk‖Vk‖2Γk

, (1.9)

which is a bound on the local error incurred in at step k. However, the exact energy decrease reads

J [Ωk]− J [Ωk + μkvk] ≈ −μkδΩJ [Ωk;vk] = μkAΓk[vk, vk] = μk‖vk‖2Γk

≥ (1 − θ)2μk‖Vk‖2Γk, (1.10)

and leads to the further constraint (1− θ)2 > 12 to guarantee the energy decrease

Jk[Ωk + μkVk] < J [Ωk].

Consistency. If ASQP converges to a stationary point, i.e. μk‖Vk‖2Γk→ 0 as k → ∞, then the modules

DIRECTION and APPROXJ approximate the descent direction Vk and functional J [Ωk] increasingly better ask → ∞, as dictated by (1.6) and (1.8). In other words, this imposes dynamic error tolerance and progressiveimprovement in approximating Uk, Zk and Gk as k →∞.

We observe that (1.8) is not a very demanding test for DWR. So we expect coarse meshes at the beginning,and a combination of refinement and coarsening later as DWR detects geometric singularities, such as corners,and sorts out whether they are genuine to the problem or just due to lack of numerical resolution. This aspect ofour approach is a novel paradigm in adaptivity, resorts to ideas developed in [8], and is documented in Sections 5and 6.

Prior work. The idea of coupling FEM, a posteriori error estimators and optimal design error estimators toefficiently solve shape optimization problems is not new. The pioneering work [4] presents an iterative scheme,where the Zienkiewicz-Zhu error indicator and the L2 norm of the shape gradient are both used at each iterationto improve the PDE error and the geometric error, respectively. However, the algorithm in [4] does not resort toany dynamically changing tolerance, that would allow, as it happens for ASQP, to produce coarse meshes at thebeginning of the iteration and a combination of geometric and PDE refinement/coarsening later on. Moreover, [4]does not distinguish between fake and genuine geometric singularities that may arise on the domain boundaryduring the iteration process, and does not allow the former to disappear. More recently, the use of adaptivemodules for the numerical approximation of PDEs has been employed by several authors [2, 29, 30] to improvethe accuracy of the solution of shape optimization problems. However, in these papers the critical issue oflinking the adaptive PDE approximation with an adaptive procedure for the numerical treatment of the domaingeometry is absent. We address this linkage below.

Outline. The rest of this paper is organized as follows. In Section 2 we introduce the Lagrangian L for theconstrained minimization problem (1.1)–(1.2) and derive the adjoint equation and shape derivative of L. InSection 3 we introduce the finite element discretization along with a brief summary of DWR and a novel error

Page 5: Adaptive finite element method for shape optimization∗

1126 P. MORIN ET AL.

estimate. In Section 4 we present in detail the ASQP algorithm, and discuss its several building blocks. We nextapply ASQP to two benchmark problems for viscous incompressible fluids governed by the Stokes equations.We examine drag minimization in Section 5 and aortic-coronary by-pass optimization in Section 6. In bothsections we derive the shape derivative as well as the full expression of the dual weighted residual estimate. Wealso document the performance of ASQPwith several interesting numerical simulations, which were implementedwithin ALBERTA [31] and postprocessed with PARAVIEW [19]. We end this paper in Section 7 with someconclusions.

2. Lagrangian formalism

2.1. State and adjoint equations

We consider a (nonlinear) functional J [Ω, u(Ω)] depending on a domain Ω and the solution u = u(Ω) of astate equation, which is a (linear) PDE defined in Ω. In strong form it reads Bu = f and in weak form can bewritten as follows:

u ∈ S : B[u, w] = 〈f, w〉 ∀w ∈ S. (2.1)

Here S is a Hilbert space, S∗ is its dual, B : S→ S∗ is a linear isomorphism, and B is the corresponding bilinearform. Therefore, B is continuous and satisfies the inf-sup condition

infw∈S

supv∈S

B[v, w]‖v‖‖w‖ = inf

v∈S

supw∈S

B[v, w]‖v‖‖w‖ > 0.

If f ∈ S∗, then (2.1) has a unique solution u = u(Ω). Our goal is to minimize J [Ω, u(Ω)] always maintainingthe state constraint (2.1) in the process. To this end, we introduce the Lagrangian

L[Ω, u, z] := J [Ω, u]−B[u, z] + 〈f, z〉, (2.2)

for u, z ∈ S. The adjoint variable z is a Lagrange multiplier for (2.1).The first order stationarity conditions, namely the state and adjoint equations for (u, z), read

〈δzL[Ω, u, z], w〉 = 0, (2.3)

〈δuL[Ω, u, z], w〉 = 0, (2.4)

for all test functions w ∈ S. Equations (2.3) and (2.4) imply respectively for all w ∈ S

u = u(Ω) ∈ S : B[u, w] = 〈f, w〉, (2.5)z = z(Ω) ∈ S : B[w, z] = 〈δuJ [Ω, u], w〉, (2.6)

which are the weak forms of state equation Bu = f and adjoint equation B∗z = δuJ [Ω, u]. Therefore, if weenforce (2.5), then the Lagrangian reduces to the cost functional

L[Ω, u, z] = J [Ω, u] (2.7)

no matter whether Ω is a minimizer or not. This is useful to construct a descent direction for J [Ω, u] viaL[Ω, u, z], perhaps using a discrete gradient flow.

2.2. Shape derivatives

To construct a descent direction we need δΩJ [Ω, u], which may not necessarily vanish unless we are alreadyat a stationary point. We now recall a basic rule for shape differentiation. If φ = φ(x) does not depend on Ωand

J [Ω] =∫

Ω

φdx

Page 6: Adaptive finite element method for shape optimization∗

ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION 1127

then the shape derivative of I[Ω] in the direction V is given by [32], Propositions 2.45, 2.50 and (2.145),

〈δΩJ [Ω],V〉 =∫

Γ

φV dS (2.8)

where V = V · ν is the normal velocity to Γ . This is unfortunately not enough: L also involves integrals offunctions which solve PDE in Ω, such as u(Ω) and z(Ω). If φ(Ω, x) also depends on Ω, then

〈δΩJ [Ω],V〉 =∫

Ω

φ′(Ω;V) +∫

Γ

φV dS (2.9)

where φ′(Ω;V) stands for the shape derivative of φ(Ω, x) in the direction V [32], Sections 2.31–2.33. Thisrequires computing the shape derivatives of the state and adjoint variables in the direction V, namely u′(Ω;V)and z′(Ω;V), which will be solutions of elliptic boundary value problems.

To render the discussion concrete, let u(Ω) solve the Dirichlet problem

Bu(Ω) = f in Ω, u(Ω) = on ∂Ω, (2.10)

with B a linear second order selfadjoint operator in S and f ∈ L2(Rd), ∈ H1(Rd) independent of Ω. The shapederivative u′(V) := u′(Ω;V) is the solution to the following Dirichlet problem [32]

Bu′(V) = 0 in Ω, u′(V) = −∇(u− ) ·V on ∂Ω. (2.11)

To obtain this expression it is first necessary to extend u(Ω) to a larger domain, for example by setting it equalto outside Ω and thus u(Ω) ∈ H1(Rd); see details in [12, 32].

Finally, the shape derivative of J [Ω, u(Ω)] can be computed by means of the usual chain rule. For example,if φ(Ω, x) = 1

2u(Ω, x)2 with u(Ω, x) solution to (2.10), then (2.9) yields

δΩJ [Ω;V] =∫

Ω

u(Ω)u′(V) dx +∫

∂Ω

12u(Ω)2V dS. (2.12)

The computation of δΩL[Ω, u(Ω), z(Ω)] resorts once more to the chain rule

〈δΩL[Ω, u(Ω), z(Ω)],V〉 = 〈δΩL[Ω, u, z],V〉+ 〈δuL[Ω, u, z], u′(V)〉 + 〈δzL[Ω, u, z], z′(V)〉, (2.13)

where on the right-hand side we regarded the variables Ω, u, z as independent. If either u′(V) or z′(V) wereadmissible test functions, then either the second or third term would vanish in light of (2.3) and (2.4). However,this is not the case when u, z satisfy a Dirichlet problem such as (2.10) and their shape derivatives u′(V), z′(V)have a non-vanishing trace dictated by (2.11).

There is however an approach to circumvent computing u′(V), z′(V) provided the ultimate goal is to obtainδΩJ [Ω;V] [1, 10]. Since such an approach hinges on suitably modifying the Lagrangian L of (2.2), which playsalso a vital role in deriving the estimates for DWR of Section 3, we do not adopt it here but briefly discuss itnow. If we append to J [Ω, u] the PDE and boundary conditions in (2.10) via Lagrange multipliers z, ξ, we endup with the modified Lagrangian

L[Ω, u, z, ξ] := J [Ω, u] +∫

Ω

(Bu− f)z dx +∫

∂Ω

(u− )ξ ds,

with functions u, z, ξ defined in Rd but independent of Ω. If u = u(Ω) is the solution of (2.10), then J [Ω, u(Ω)] =L[Ω, u(Ω), z, ξ] for all z, ξ. Hence, using the chain rule yields

δΩJ [Ω;V] = 〈δΩL[Ω, u, z, ξ] |u=u(Ω),V〉+ 〈δuL[Ω, u, z, ξ] |u=u(Ω), u′(V)〉

Page 7: Adaptive finite element method for shape optimization∗

1128 P. MORIN ET AL.

for all z, ξ. Since z(Ω), ξ(Ω) are solutions of 〈δuL[Ω, u(Ω), z(Ω), ξ(Ω)], v〉 = 0 for all v, we see that

δΩJ [Ω;V] = 〈δΩL[Ω, u, z, ξ] |u=u(Ω),z=z(Ω),ξ=ξ(Ω),V〉.

Using the PDE satisfied by u′(V) and z′(V), and suitable regularity assumptions, (2.13) can be rewritten asa duality pairing on the deformable part Γ of ∂Ω [32], Section 2.11 and Theorem 2.27,

〈δΩL[Ω, u(Ω), z(Ω)],V〉 = 〈g, V 〉Γ . (2.14)

We view g, which concentrates on Γ and pairs with the normal component V of V, as a Riesz representativeof the shape derivative of L. In Sections 5 and 6, we examine two examples for the Stokes flow, carry out thesecalculations in detail, and give explicit expressions for g.

3. Dual weighted residual method

We now want to evaluate the PDE error using finite element methods (FEM). Therefore, we assume thatthe domain Ω is fixed and omit it as argument in both J and L; thus L[Ω, u, z] = L[u, z]. We recall that if weenforce the state equation (2.5), then (2.7) holds as well.

Given a conforming and shape-regular triangulation T of Ω, for any T ∈ T we denote by hT := |T | 1d its size.Let ST ⊂ S be a finite element subspace that satisfies the discrete inf-sup condition

infW∈ST

supV ∈ST

B[V, W ]‖V ‖‖W‖ = inf

V ∈ST

supW∈ST

B[V, W ]‖V ‖‖W‖ ≥ β > 0,

with β independent of T . This yields existence and uniqueness of the Galerkin solutions to the following finiteelement approximations of (2.5)–(2.6)

U ∈ ST : B[U, W ] = 〈f, W 〉 ∀W ∈ ST , (3.1)

Z ∈ ST : B[W, Z] = 〈δuJ [U ], W 〉 ∀W ∈ ST , (3.2)

which are stationary points of L in ST . It remains to introduce the primal and dual residuals for w ∈ S

R(U, Z; w) := 〈δzL[U, Z], w〉 = 〈f, w〉 −B[U, w], (3.3)R∗(U, Z; w) := 〈δuL[U, Z], w〉 = 〈δuJ [U ], w〉 −B[w, Z]. (3.4)

In view of (3.1)–(3.2), these residuals satisfy Galerkin orthogonality

R(U, Z; W ) = R∗(U, Z; W ) = 0 ∀ W ∈ ST . (3.5)

The error J [u]−J [U ] = L[u, z]−L[U, Z] can be estimated in terms of the residuals R and R∗. This leads to thefollowing error representation formula, whose proof can be found in [3,6,16]. We present it here for completeness.

Proposition 3.1 (error representation). The following a posteriori expression for J [u]− J [U ] is valid

J [u]− J [U ] =12R(U, Z; z −Wz) +

12R∗(U, Z; u−Wu) + E ∀ Wz, Wu ∈ ST (3.6)

where the remainder term E = E(u, z, U, Z) is given by

E =12

∫ 1

0

〈δ3uJ [su + (1− s)U ], e, e, e〉 s(s− 1) ds (3.7)

with e = u−U the primal error. In addition, if J is a linear functional, then the two residuals R, R∗ are equal,namely R(U, Z; z) = R∗(U, Z; u), and

J [u]− J [U ] = R(U, Z; z −Wz) ∀ Wz ∈ ST . (3.8)

Page 8: Adaptive finite element method for shape optimization∗

ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION 1129

Proof. By the fundamental theorem of Calculus

L[u, z]− L[U, Z] =∫ 1

0

(〈δuL[s(u, z) + (1− s)(U, Z)], u− U〉+ 〈δzL[s(u, z) + (1− s)(U, Z)], z − Z〉

)ds.

The trapezoidal rule, in conjunction with the fact that δuL[u, z] = δzL[u, z] = 0, yields

L[u, z]− L[U, Z] =12〈δuL[(U, Z], u− U〉+ 1

2〈δzL[U, Z], z − Z〉+ E,

where E satisfies (3.7) by direct computation. The equality (3.6) follows from (2.7), (3.3), (3.4), and (3.5). Toprove (3.8), we observe that 〈δuJ [u], w〉 = J [w] if J is linear, whence (2.6) and (3.5) yield

J [u]− J [U ] = 〈δuJ [u], u− U〉 = B[u− U, z] = R(U, Z; z −W ) = R∗(U, Z; u−W ) ∀ W ∈ ST .

This completes the proof.

The Dual Weighted Residual method (DWR) consists of splitting R, R∗ into element contributions

R(U, Z; w) =∑T∈T〈r(U, Z), w〉T + 〈j(U, Z), w〉∂T , R∗(U, Z; w) =

∑T∈T〈r∗(U, Z), w〉T + 〈j∗(U, Z), w〉∂T

where r(U, Z) = f − BU , r∗(U, Z) = δuJ [U ] − B∗Z are the interior residuals, or strong form of the PDE, andj(U, Z), j∗(U, Z) are the jump residuals. They are both computable since they depend only on the computeddiscrete solutions U and Z. In most applications, the duality pairings 〈·, ·〉T , 〈·, ·〉∂T appearing in the last twoexpressions are just the L2(T ), L2(∂T ) inner products, respectively. Consequently, the first two terms in (3.6)yield the (constant-free) bounds

|R(U, Z; z −Wz)| ≤∑T∈T‖r(U, Z)‖L2(T )‖z −Wz‖L2(T ) + ‖j(U, Z)‖L2(∂T )‖z −Wz‖L2(∂T ),

|R∗(U, Z; u−Wu)| ≤∑T∈T‖r∗(U, Z)‖L2(T )‖u−Wu‖L2(T ) + ‖j∗(U, Z)‖L2(∂T )‖u−Wu‖L2(∂T ),

(3.9)

and the quantities ‖z−Wz‖L2(T ), ‖u−Wu‖L2(T ) as well as those on ∂T are regarded as local weights. Estimatingthese weights requires knowing the state and adjoint variables u and z, and finding suitable quasi-interpolantsWu and Wz. We present now a novel local interpolation estimate for a given function v (=u, z) expressed interms of jumps of the discontinuous Lagrange interpolant ΠT v of v plus a higher order remainder. Similarestimates without justification are proposed in [6] for polynomial degree 1.

Lemma 3.2 (local interpolation estimate). Let m ≥ 1 be the polynomial degree, d = 2 be the dimension, andv ∈ Hm+2(N (T )) where N (T ) is a discrete neighborhood of T ∈ T . There exist constants C1, C2 > 0, solelydependent on mesh regularity, so that

‖v −ΠT v‖L2(N (T )) + h1/2T ‖v −ΠT v‖L2(∂T ) ≤ C1

m∑j=0

hj+1/2T ‖[[DjΠT v]]‖L2(∂T ) + C2h

m+2T |v|Hm+2(N (T )), (3.10)

where [[·]] denotes jump accross interelement sides.

Proof. We scale to the reference element T , where the desired estimate contains no powers of meshsize. We thenproceed by contradiction: assume there is a sequence vn ∈ Hm+2(N (T )) so that

‖vn − Πvn‖L2(N (T )) = 1,

m∑j=0

‖[[DjΠvn]]‖L2(∂T ) + |vn|Hm+2(N (T )) → 0

Page 9: Adaptive finite element method for shape optimization∗

1130 P. MORIN ET AL.

as n → ∞. For a subsequence, still labeled vn, we have that vn → v ∈ Hm+2(N (T )) weakly and thus stronglyin Hm+1(N (T )) and pointwise. The latter yields convergence Πvn → Πv in L2(N (T )) as well as convergenceof [[DjΠvn]]→ [[DjΠv]] in L2(∂T ) for 0 ≤ j ≤ m because Πvn is a piecewise polynomial of degree ≤ m. Hence,[[DjΠv]] = 0 on ∂T and Πv is a global polynomial of degree ≤ m in N (T ).

On the other hand, the fact that |v|Hm+2(N (T )) = 0 implies that v is a polynomial of degree ≤ m + 1 in

N (T ). Therefore, v − Πv vanishes at the 12 (m + 1)(m + 2) canonical nodes of T . Moreover, v − Πv vanishes at

the additional 32m(m + 1) canonical nodes outside T but in N (T ). Since 3

2m(m + 1) ≥ m + 2 for all m ≥ 1, weinfer that v − Πv = 0 in N (T ), whence v is a global polynomial of degree ≤ m. This contradicts the property‖v − Πv‖L2(N (T )) = 1 and proves the asserted estimate for ‖v −ΠT v‖L2(N (T )).

The same reasoning applies to ‖v − Πv‖L2(∂T )), and thus concludes the proof.

Except in degenerate situations, the remainder in (3.10) is asymptotically of higher order, whence

‖v −ΠT v‖L2(N (T )) + h1/2T ‖v −ΠT v‖L2(∂T )

m∑j=0

hj+1/2T ‖[[DjΠT v]]‖L2(∂T ), (3.11)

as hT → 0. This may be viewed as a discrete version of the celebrated Bramble-Hilbert estimate. Unfortunately,however, the remainder in (3.10) cannot in general be removed. The estimate (3.11) is not really computablebecause it requires knowing v. If V ∈ ST is a Galerkin approximation of v ∈ S, then we expect its behavior tobe similar to that of ΠT v, which leads to the heuristic bound

‖v − V ‖L2(N (T )) + h1/2T ‖v − V ‖L2(∂T )

m∑j=0

hj+1/2T ‖[[DjV ]]‖L2(∂T ). (3.12)

Combining (3.6) and (3.9) with (3.12) we end up with the a posteriori upper bound

|J [u]− J [U ]| ∑T∈T

η(T ) (3.13)

with element indicator

η(T ) =(h

1/2T ‖r(U, Z)‖L2(T ) + ‖j(U, Z)‖L2(∂T )

) m∑j=0

hjT ‖[[DjZ]]‖L2(∂T )

+(h

1/2T ‖r∗(U, Z)‖L2(T ) + ‖j∗(U, Z)‖L2(∂T )

) m∑j=0

hjT ‖[[DjU ]]‖L2(∂T ),

(3.14)

which is the bound proposed in [6] for m = 1. One important drawback of this bound, discussed in [6], is thefact that there are unknown interpolation constants in it. This is less severe in the present context because weare mostly concerned with the correct distribution of spatial degrees of freedom rather than accurate bounds.Hence, for our purposes, the heuristic bound (3.13) is justified.

4. The adaptive sequential quadratic programming algorithm

In this section we describe the modules pertaining to the ASQP algorithm. Recall that k ≥ 1 stands forthe adaptive counter and Ωk is the current domain produced by ASQP with deformable boundary Γk. LetSk = STk

(Ωk) and Vk = VTk(Γk) be the finite element spaces on the bulk and boundary, which are compatible

and fully determined by one underlying mesh Tk of Ωk.

Page 10: Adaptive finite element method for shape optimization∗

ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION 1131

We define ASQP as follows:

Adaptive sequential quadratic programming algorithm (ASQP)

Given the initial domain Ω0, a triangulation T0 of Ω0, and the parameter 0 < θ ≤ 15,

set γ = 12 − θ(1 + θ), k = 0, ε0 = +∞, μ0 = 1 and repeat the following steps:

(1) [Tk, Uk, Zk, Jk, Gk] = APPROXJ(Ωk, Tk, εk)

(2) [Vk, Tk] = DIRECTION(Ωk, Tk, Gk, θ)

(3) [Ωk+1, Tk+1, μk+1] = LINESEARCH(Ωk, Tk,Vk, Jk, μk)

(4) εk+1 := γμk+1‖Vk‖2Γk; k ← k + 1.

In theory this algorithm is an infinite loop giving a more acurate approximation as the iterations progress, butin practice we implement a stopping criteria in LINESEARCH. In the next few subsections, we describe in detaileach module of ASQP.

4.1. The module APPROXJ

This is a typical adaptive loop based on a posteriori error estimators, in which the domain Ω remains fixed.In this context we use the goal oriented estimators alluded to in Section 3 and refine and coarsen separately.The module APPROXJ is defined as follows:

[T∗, U∗, Z∗, J∗, G∗] = APPROXJ(Ω, T , ε)do

[U, Z] = SOLVE(Ω, T )η(T )T∈T = ESTIMATE(U, Z, ST )[R, C] = MARK(T , η(T )T∈T )if (η(T ) > ε)

[T , C] = REFINE(T ,R)elseif (η(C) < δε)T = COARSEN(T , C)

endifwhile (η(T ) > ε)T∗ = T ; U∗ = U; Z∗ = ZJ∗ = EVALJ(Ω, T∗, U∗)G∗ = RIESZ(Ω, T∗, U∗, Z∗)

The module SOLVE computes the solution to the primal and dual discrete problems (3.1)–(3.2). The moduleESTIMATE determines the local indicators η(T ), T ∈ T of the DWR method given by (3.14).

The module MARK selects some elements of T and assigns them to the set R of elements marked for refinementor to the set C of elements marked for coarsening. In both cases, MARK uses the maximum strategy which turns outto be more local and thus effective than others in this application. In fact, given parameters 0 < δ− δ+ < 1,we let η∗ = maxT∈T η(T ) and apply the rules:

η(T ) > δ+η∗ ⇒ T ∈ R; η(T ) < δ−η∗ ⇒ T ∈ C.The module REFINE subdivides the elements in the set R via bisection, and perhaps a few more elements to

keep conformity of T ; REFINE also updates the set C which may have been affected by refinements. In contrast,the module COARSEN deals with the set of elements C selected for coarsening. Alternation of REFINE and COARSENis crucial in this context, in which geometric singularities detected early on may disappear as the algorithmprogresses towards the optimal shape. We illustrate this new paradigm with simulations in Sections 5 and 6.

Finally, the module EVALJ evaluates the functional J [Ω, U∗] on the updated mesh T∗, whereas RIESZ computesa finite element approximation G∗ to the shape derivative g(Ω).

Page 11: Adaptive finite element method for shape optimization∗

1132 P. MORIN ET AL.

4.2. The module DIRECTION

Given a tolerance θ ≤ 1/5, an approximate shape derivative G, and a domain Ω described through a trian-gulation T , the call

[V∗, T∗] = DIRECTION(Ω, T , G, θ)

finds an approximate descent direction V∗ and an updated mesh T∗ as follows: we let V(Γ ) be a Hilbert spaceover Γ , AΓ : V(Γ )× V(Γ )→ R a continuous and coercive bilinear form, and define the exact descent directionv as

v ∈ V(Γ ) : AΓ (v, w) = −〈G, w〉Γ ∀ w ∈ V(Γ ),

i.e., v is the weak solution of Av = −G on a smooth surface Γ being approximated by Γ . Let VT be the finiteelement space over the restriction of the mesh T to the boundary Γ of Ω, and let V satisfy

V ∈ VT : AΓ (V, W ) = −〈G, W 〉Γ ∀ W ∈ VT .

The module DIRECTION then performs (stationary) adaptivity through an alternation of refinement and coars-ening so that on the output mesh T∗, the finite element solution V∗ ∈ V∗ := VT∗ satisfies

‖V∗ − v‖V(Γ ) ≤ θ‖V∗‖V(Γ ) (4.1)

and is also a descent direction because (4.1) controls the angle between v and V∗; see (1.6) in Section 1. Thechoice of AΓ is critical to obtain a sequence of relatively smooth domains and avoid instabilities [15]. We havesuccessfully implemented the weighted Laplace-Beltrami bilinear form AΓ defined by

AΓ (v, w) =∫

Γ

ρ∇Γ v∇Γ w + vw dS ∀v, w ∈ V(Γ ) = H1(Γ );

see [9,11,22] for alternative choices of the bilinear form AΓ . The weight ρ depends on the optimization problemunder study, as well as on its relative scales (see Sects. 5 and 6). The error control (4.1) is achieved by resortingto residual a posteriori error estimates for the H1(Γ )-norm [13, 14, 21]. More precisely, if V ∈ VT denotes theGalerkin approximation to v on a mesh T we define the Laplace-Beltrami (LB) error indicator by [21]

η2Γ (T ; V ) := h2

T ‖ − ρΔΓ V + V −G‖2L2(T ) + hT ‖[[ρ∇Γ V ]]‖2L2(∂T ) + ‖ν − νT ‖2L∞(T )‖√

ρ∇Γ V + V ‖2L2(T ),

for T a surface element of T contained in Γ . These indicators satisfy ‖v − V ‖2V(Γ ) ≤ C

∑T⊂Γ η2

Γ (T ; V ). Thefirst two terms are the usual indicators for a reaction-diffusion equation, whereas the last one is a geometricindicator, that takes into account the error in approximating the domain through ν − νT . Here ν denotes theexact normal to the smooth surface Γ being approximated by Γ and νT denotes the normal of the discretesurface. Since we do not have access to the exact smooth surface Γ we estimate ‖ν − νT ‖L∞(T ) by the jumpterm ‖[[νT ]]‖L∞(∂T ), and the computable estimator reads

η2Γ (T ; V ) := h2

T ‖ − ρΔΓ V + V −G‖2L2(T ) + hT ‖[[ρ∇Γ V ]]‖2L2(∂T ) + ‖[[νT ]]‖2L∞(∂T )‖√

ρ∇Γ V + V ‖2L2(T ).

For two-dimensional domains with polygonal boundaries, a simpler upper bound is obtained as follows. Sinceboth the Clement or Scott-Zhang interpolation operators are H1-stable, they are used in deriving a posteriorierror estimates to approximate test functions in H1 and thus get the appropriate powers of hT in η2

Γ (T ; V );see [14, 21]. When the underlying (boundary) mesh is one-dimensional one can resort, instead, to Lagrangeinterpolation because H1 is embedded in the space of continuous functions. It turns out that the jump [[ρ∇Γ V ]]is multiplied by the test function minus its interpolant and evaluated at vertices. Since the Lagrange interpolantcoincides with the function at vertices, for one dimensional boundary meshes the term [[ρ∇Γ V ]] drops out andwe obtain the following simpler estimator:

η2Γ (T ; V ) := h2

T ‖ − ρΔΓ V + V −G‖2L2(T ) + ‖[[νT ]]‖2L∞(∂T )‖√

ρ∇Γ V + V ‖2L2(T ).

Page 12: Adaptive finite element method for shape optimization∗

ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION 1133

With this definition of a posteriori error indicators we execute a loop of the form

SOLVE → ESTIMATE → MARK → REFINE/COARSEN

with MARK based on the maximum strategy, as described in Section 4.1, until the last discrete solution V∗ satisfiesC

∑T⊂Γ η2

Γ (T ; V∗) ≤ θ2‖V∗‖V(Γ ), thereby giving (4.1).To advance the domain we need a vector velocity V∗ such that V∗ is its normal component. Since Γ is

piecewise polynomial, the normal ν is discontinuous. We thus define an average vector velocity V∗, and outputof DIRECTION, as follows

V∗ ∈ Vd∗ : 〈V∗, ϕ〉Γ = 〈V∗, ν · ϕ〉Γ ∀ϕ ∈ Vd

∗, (4.2)

as in [5, 15].

4.3. The module LINESEARCH

Given a domain Ω described by a triangulation T , a vector velocity V, the functional value J [Ω], and theprevious timestep μ, the LINESEARCHmodule computes a new timestep μ∗ and updates both the domain Ω toΩ∗ and the mesh T to T∗ as follows:

[Ω∗, T∗, μ∗] = LINESEARCH(Ω, T ,V, J, μ)

m = GEOSTEP(V) %find max possible geometric stepμ = min(μ, m)Jold = J, J = TRYSTEP(μ, Ω, T ,V)if (Jold < J) %energy is not decreasing reduce time step

[success, μ] = DECREASESTEP(J, μ, Ω, T ,V)if (success == false) %we reached the stopping criteriabreak end if

else %energy is decreasing, can we get better?[success, μ] = TRYDECREASE(J, μ, Ω, T ,V)if ( success == false )

[success, μ] = TRYINCREASE(J, μ, Ω, T ,V)end if

end if[Ω∗, T∗] = UPDATE(Ω, T ,V, μ), μ∗ = μ

The use of the module GEOSTEP is a mechanism to avoid mesh distortion due to tangential motion of nodes,i.e. we need to control the effect of the tangential component (I− ν ⊗ ν)V of V. Such a control boils down toa geometric restriction of steplength: the output m of GEOSTEP is the largest admissible steplength that avoidsnode crossing and is computed as follows. If ρT is the diameter of the largest inscribed ball in T ∈ T , and z is ageneric boundary node, then we let d(z) be the nodal function that takes the minimum of ρT over all T ∈ T thatshare z. The quantity ϑ d(z)

|(I−ν⊗ν)V(z)| gives the largest steplength allowed for node z to move without entanglingthe mesh, provided ϑ ≤ 1/2, and represents a worst case scenario; m is thus the smallest of those values for allboundary nodes z. Practice suggests that ϑ = 1/3 is a good choice for linear meshes whereas ϑ = 1/6 is a safechoice for quadratic meshes controlled by the hybrid method of [23].

The module TRYSTEP finds the energy of a deformation of Ω by μV corresponding to a given timestep μ asfollows:

J∗ = TRYSTEP(μ, Ω, T ,V)

[Ω∗, T∗] = UPDATE(μ, Ω, T ,V)[U∗, Z∗] = SOLVE(Ω∗, T∗)J∗ = EVALJ(Ω∗, T∗, U∗)

Page 13: Adaptive finite element method for shape optimization∗

1134 P. MORIN ET AL.

Here the module UPDATE advances the domain to the new configuration Ω∗ and updates the mesh T to T∗.This is done as follows:

[Ω∗, T∗] = UPDATE(Ω, T ,V, μ)

Ω∗ = Ωx = x + μV(x) ∀x ∈ ∂Ω∗MESHOPTIMIZE(Ω∗)

We first move the boundary using V and then we move the interior nodes using the mesh smoothing routineMESHOPTIMIZE that optimizes the location of the star center nodes trying to improve their quality; see [23] fordetails.

The module SOLVE finds primal and dual solutions of (3.1)–(3.2) on the new finite element space S∗. Finally,EVALJ evaluates the new functional J∗ = J [Ω∗, U∗].

The modules TRYDECREASE and TRYINCREASE decrement or increment the timestep as long as the energykeeps decreasing, and use the parameters 0 < a < 1 < b provided by the user. The module DECREASESTEP hasa built-in stopping mechanism: when the energy cannot be reduced anymore while keeping the timestep abovea threshold timestep μ0 the algorithm stops.

[success, μ∗] = TRYDECREASE(J, μ, Ω, T ,V)

%Reduce μ while energy keeps decreasingμ∗ = μ, success = falsedo

Jold = J, μ∗ = a ∗ μ∗

J = TRYSTEP(μ∗, Ω, T ,V)while (J < Jold)μ∗ = μ∗/aif (μ∗ < μ) success = true;

[success, μ∗] = TRYINCREASE(J, μ, Ω, T ,V)

%increment μ while energy keeps decreasingμ∗ = μ, success = falsedo

Jold = J, μ∗ = b ∗ μ∗

J = TRYSTEP(μ∗, Ω, T ,V)while (J < Jold)μ∗ = μ∗/bif (μ∗ > μ) success = true

[success, μ∗] = DECREASESTEP(J, μ, Ω, T ,V)

%decrease μ until we reduce energy or stop criteriaμ∗ = μ, success = falsedo

Jold = J, μ∗ = a ∗ μ∗

J = TRYSTEP(μ∗, Ω, T ,V)while (J > Jold) and (μ∗ > μ0)if (μ∗ > μ0) success = true;

Remark 4.1 (Wolfe-Armijo conditions). Even though it looks very simple minded, our linesearch and back-tracking algorithm is very robust. We have also tried to use the celebrated Armijo-Wolfe conditions, but theirbehavior was not so robust because it depends on having a reliable computation of the functional derivative inaddition to the functional itself.

Page 14: Adaptive finite element method for shape optimization∗

ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION 1135

4.4. Geometrically consistent mesh modification

After the final UPDATE in LINESEARCH some post-processing takes place to ensure a healthy and geometricallysound mesh for the next iteration. It involves a geometrically consistent relocation of the newly created nodes(if any) and a mesh quality check with the possibility of remeshing if certain threshold is not satisfied. Belowand in the next subsection we explain each process.

The presence of corners (or kinks) on the deformable boundary Γk is usually problematic. First, the scalarproduct AΓk

(·, ·) of (1.5) includes a Laplace-Beltrami regularization term (ρ > 0) which stabilizes the boundaryupdate but cannot remove kinks because Vk is smooth. Secondly, DWR regards kinks as true singularities andtries to refine around them accordingly. The combination of these two effects leads to numerical artifacts (earformation) and halt of computations. The geometrically consistent mesh modification (GCMM) method of [8]circumvents this issue. Assuming that a piecewise polynomial approximation Hk to the vector curvature of Γk

is available, [8] provides a method to place the nodes after a mesh modification such as refinement, coarseningor smoothing takes place. The method requires transfering Hk to an intermediate modified mesh and yields thenew position Xk of the free boundary from the fundamental geometric identity −ΔΓk

Xk = Hk. This preservesgeometric consistency – which is violated by simply interpolating Γk – as well as accuracy [8], and rounds kinks.For “fake” kinks (e.g. initial corners) the effect of GCMM permits the optimization flow to get rid of the kinks, ahighly desirable outcome. For “genuine” kinks, i.e. those that exist in the optimal shape, the optimization flowmay in principle conflict with the smoothing effects of GCMM. Since GCMM is only applied when adaptivitytakes place, the optimization flow dominates overall. Therefore, genuine corners of the optimal shape beingstable are increasingly better resolved despite the fact that descent directions are smooth – and so are all theintermediate shapes. The numerical curvature of genuine corners thus depends on the local meshsize and theregularization parameter ρ.

Since in the present context a vector curvature approximation Hk is not directly available, as required in [8],we resort to the techniques described in [5, 24] and we use a star average to construct a continuous normal νk

from the discontinuous element normals. In fact, for each boundary node x we define νk(x) as the normalizationof 1

|ω|∑

T⊂ω |T |nT , where ω is the set of elements T belonging to the boundary triangulation and sharing xas a vertex and nT is the element normal. Having the nodal values, this defines a unique piecewise linear andglobally continuous function. We next consider the scalar approximation to mean curvature Hk = divΓk

νk, letHk = Hkνk, and proceed as in [8].

4.5. Remeshing

A rule of thumb for dealing with complicated domain deformations is that remeshing is indispensable andunavoidable. Our approach is to use remeshing only when necessary for the continuation of the simulation. Atthe end of each iteration we check the mesh quality, and if it falls below a given threshold remeshing takes place.In the drag simulation of Section 5.6, for example, 4 remeshings were necesary in 180 iterations, whereas theless complex deformations for the bypass simulations of Section 6 did not require remeshing.

A disadvantage of remeshing in the context of adaptivity is that most mesh generators create unstructuredmeshes. Since the hierarchical mesh structure is then lost, coarsening cannot be performed beyond the structureof the new (macro) mesh. This problem could be easily solved by using a hierarchical mesh generator, but itsdiscussion is beyond the scope of this work.

4.6. Volume constraint

If the definition of Uad involves a fixed-volume constraint V0, like in Section 5, such a constraint is enforcedas follows in the module UPDATE of LINESEARCH. Given a descent direction V (from the module DIRECTION)for the unconstrained energy J [Ω] with |Ω| = V0, then [Ω∗, T∗] =UPDATE(Ω, T , V, μ) returns a new domain Ω∗with the same prescribed volume |Ω∗| = V0 and intuitively a smaller associated energy J [Ω∗]; the latter is inturn ultimately checked in LINESEARCH. The module UPDATE proceeds by moving the boundary of Ω by μV and

Page 15: Adaptive finite element method for shape optimization∗

1136 P. MORIN ET AL.

Ω

ΓsΓin Γout

Γw

Γw

Figure 1. Domain Ω for drag minimization for Stokes flow: Ω ⊂ Rd, d ≥ 2, is a boundeddomain with its boundary subdivided into an inflow part Γin, an outflow part Γout, a partconsidered as walls Γw, and an obstacle Γs which is the deformable part to be optimized.

then projects it onto the manifold Uad of shapes with the fixed volume V0. This projection is not arbitrary butbased on the augmented energy functional

J [Ω] = J [Ω]− λ(|Ω| − V0

),

where λ is a Lagrange multiplier to ensure that |Ω| = V0. The shape derivative of J in the direction V is givenby

〈δΩJ [Ω],V〉 = 〈δΩJ [Ω],V〉 − λ〈ν,V〉,

and a gradient flow would choose to move along the descent direction V = A−1Γ (−δΩJ [Ω]) to decrease the

unconstrained energy J [Ω]. Once the boundary of Ω has been deformed using μV and a new configuration Ω∗reached, it seems natural to evolve the domain with the normal flow

ddλ

X = ν in Ω(t); X(0) = Id in Ω(0) = Ω∗,

until we find a zero of the scalar function f(λ) = |Ω(λ)| − V0. Such a zero can be found via a Newton methodwith step δλ = − f(λ)

f ′(λ) . Since f ′(λ) = |∂Ω(λ)|, we now have the two ingredients of the following algorithm,namely a Newton correction of λ (starting from λ = 0) and a normal update of Ω(λ):

Ω = Ω∗

while(

|Ω|−V0V0

> ε)

δλ = − |Ω|−V0|∂Ω| %compute newton step

ν = AVERAGENORMAL(∂Ω); Ω = Ω + δλν %update the domainend while

Here ε is a given tolerance for the Newton method whereas the function AVERAGENORMAL(∂Ω) computes acontinuous normal ν over the piecewise polynomial boundary ∂Ω, with nodal unit length. To this end, it usesthe same averaging procedure (4.2) on stars described in Section 4.4 (see also [5]). Notice that in view of theuse of AVERAGENORMAL which changes the normal at each iteration, it seems to be more appropriate to refer tothe above algorithm as to a quasi-Newton scheme.

5. Drag minimization for stokes flow

5.1. The stokes problem

We consider the flow around an obstacle described by the following Stokes equations. Let Ω ⊂ Rd, d ≥ 2 bea bounded domain as depicted in Figure 1.

Page 16: Adaptive finite element method for shape optimization∗

ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION 1137

Let the velocity u := u(Ω) and the pressure p := p(Ω) solve the following problem:

− div(T(u, p)) = 0 in Ω

div u = 0 in Ω

u = ud on Γin ∪ Γs ∪ Γw

T(u, p)ν = 0 on Γout

(5.1)

where T(u, p) := 2με(u)− pI is the Cauchy tensor with ε(u) = ∇u+∇uT

2 , μ > 0 is the viscosity, and

ud =v∞ on Γin

0 on Γw ∪ Γs,

with v∞ = V∞v∞, v∞ being the unit vector pointing in the direction of the incoming flow and V∞ a scalarfunction.

In order to state a weak formulation of this problem, we introduce the following bilinear forms:

a[·, ·] : [H1(Ω)]d × [H1(Ω)]d → R, a[u,v] := 2μ

∫Ω

ε(u) : ε(v) dx,

b[·, ·] : L2(Ω)× [H1(Ω)]d → R, b[p,v] := −∫

Ω

p div v dx.

(5.2)

We let Γd := Γin ∪ Γs ∪ Γw be the Dirichlet boundary, introduce the affine manifolds

[H1Γd

(Ω)]d = u ∈ [H1(Ω)]d : u = 0 on Γd,ud ⊕ [H1

Γd(Ω)]d = u ∈ [H1(Ω)]d : u = ud on Γd,

and set S(v) := (v ⊕ [H1Γd

(Ω)]d)× L2(Ω). The weak formulation of the Stokes problem (5.1) reads

(u, p) ∈ S(ud) : B[(u, p), (v, q)] = 0 ∀(v, q) ∈ S(0), (5.3)

whereB[(u, p), (v, q)] := a[u,v] + b[p,v] + b[q,u]. (5.4)

5.2. Cost functional and Lagrangian

We let the cost functional measuring the obstacle drag be

I[Ω, (u, p)] := −∫

Γs

(T(u, p)ν) · v∞ dS, (5.5)

where (u, p) solves (5.3). We would like to minimize the linear boundary functional I subject to the stateconstraint (5.3) among all admissible configurations with fixed volume that can be obtained by piecewise smoothperturbations of the obstacle boundary Γs [25, 26]. We thus introduce the functional with Lagrange multiplierλ ∈ R and |Ω| =

∫Ω dx

J [Ω, (u, p), λ] := I[Ω, (u, p)] + λ(|Ω| − |Ω0|

). (5.6)

It is useful to rewrite I as a volume integral. This helps derive, as well as compute, the adjoint equation andshape derivative of I. We introduce a function Φ∞ ∈ [H1(Ω)]d such that

Φ∞ =−v∞ in N (Γs)

0 on Γw ∪ Γin,

Page 17: Adaptive finite element method for shape optimization∗

1138 P. MORIN ET AL.

where N (Γs) is a neighborhood of Γs. The traction-free boundary condition T(u, p)ν = 0 on Γout and Gausstheorem yield

I[Ω, (u, p)] =∫

∂Ω

(T(u, p)ν) ·Φ∞ dS =∫

Ω

div(T(u, p)Φ∞) dx

=∫

Ω

div(T(u, p)) ·Φ∞ + T(u, p) : ∇Φ∞ dx = a[u,Φ∞] + b[p,Φ∞].(5.7)

According to (2.2), the Lagrangian is defined as follows for all (v, r) ∈ S(0):

L[Ω, (u, p), (v, r), λ] := J [Ω, (u, p), λ] −B[(u, p), (v, r)]= a[u,Φ∞ − v] + b[p,Φ∞ − v]− b[r,u] + λ

(|Ω| − |Ω0|

)= B[(u, p), (Φ∞ − v,−r)] + λ

(|Ω| − |Ω0|

)= B[(u, p), (z, q)] + λ

(|Ω| − |Ω0|

),

(5.8)

where z = Φ∞ − v and q = −r are the adjoint variables.

5.3. Adjoint equation

We now derive the adjoint equation from the Lagrangian (5.8).

Lemma 5.1 (adjoint equation for (5.8)). The adjoint pair (z, q) satisfies the weak equation

(z, q) ∈ S(Φ∞) : B[(w, s), (z, q)] = 0 ∀(w, s) ∈ S(0), (5.9)

as well as the strong form− div(T(z, q)) = 0 in Ω

div z = 0 in Ω

z = Φ∞ on Γin ∪ Γs ∪ Γw

T(z, q)ν = 0 on Γout.

(5.10)

Proof. Differentiate (5.8) with respect to (u, p) to arrive at

〈δ(u,p)L[Ω, (u, p), (z, q), λ], (w, s)〉 = B[(w, s), (z, q)] = 0 ∀(w, s) ∈ S(0),

which is (5.9). The strong form (5.10) results from (5.9) by integration by parts.

5.4. Shape derivative

We now compute the shape derivative δΩL[Ω, (u, p), (z, q), λ], recalling that u, p, z, q depend on Ω, usingthe rules described in Section 2.2. See also [7, 22] for similar results.

Lemma 5.2 (shape derivative of (5.8)). Let (u, p) be the solution to (5.3) and (z, q) be the solution to (5.9).The shape derivative of L[Ω, (u, p)(Ω), (v, q)(Ω), λ(Ω)] in the direction V is given by⟨

δΩL[Ω, (u, p)(Ω), (z, q)(Ω), λ(Ω)],V⟩

= −2μ

∫Γs

ε(u) : ε(z) V dS + λ

∫Γs

V dS. (5.11)

Proof. In view of (5.8), we have

L[Ω, (u, p)(Ω), (z, q)(Ω), λ(Ω)] = a[u, z] + b[p, z] + b[q,u] + λ (|Ω| − |Ω0|)

=∫

Ω

(2με(u) : ε(z) − p div z− q div u) dx + λ

(∫Ω

dx−∫

Ω0

dx

).

Page 18: Adaptive finite element method for shape optimization∗

ADAPTIVE FINITE ELEMENT METHOD FOR SHAPE OPTIMIZATION 1139

Invoking (2.13), and recalling that div u = div z = 0, we deduce⟨δΩL[Ω, (u, p)(Ω), (v, q)(Ω), λ(Ω)],V

⟩=

∫Γs

(2με(u) : ε(z) + λ

)V dS

+∫

Ω

(2με(u′) : ε(z)− p′ div z− q div u′

)dx

+∫

Ω

(2με(u) : ε(z′)− q′ div u− p div z′

)dx

+ λ′(|Ω| − |Ω0|

),

(5.12)

with (u′, p′) = (u′(V), p′(V)) and (z′, q′) = (z′(V), q′(V)) the shape derivatives. Moreover, we have∫Ω

(2με(u′) : ε(z) − p′ div z− q div u′

)dx = B[(u′, p′), (z, q)] (5.13)∫

Ω

(2με(u) : ε(z′)− q′ div u− p div z′

)dx = B[(u, p), (z′, q′)]. (5.14)

Recalling (2.10) and (2.11), the shape derivatives (u′, p′) and (z′, q′) satisfy the boundary value problems

− div(T(u′, p′)) = 0 in Ω

div u′ = 0 in Ω

u′ = 0 on Γin ∪ Γw

u′ = −∇uν V on Γs

T(u′, p′)ν = 0 on Γout

(5.15)

and− div(T(z′, q′)) = 0 in Ω

div z′ = 0 in Ω

z′ = 0 on Γin ∪ Γw

z′ = −∇(z−Φ∞)ν V on Γs

T(z′, q′)ν = 0 on Γout.

(5.16)

Since the pair (u′, p′) /∈ S(0), combining (5.13) with (5.15) we obtain

B[(u′, p′), (z, q)] = −∫

Γs

(T(z, q)ν

)·(∇uν

)V dS.

We now exploit the fact that u = 0 on Γs to write(T(z, q)ν

)·(∇uν

)= T(z, q) : ∇u. (5.17)

By the definition of the Cauchy tensor T(z, q),

T(z, q) : ∇u = 2με(z) : ε(u)− q div u = 2με(z) : ε(u),

whenceB[(u′, p′), (z, q)] = −2μ

∫Γs

ε(z) : ε(u)V dS.

Similarly, using now (5.16) in conjunction with the fact that z = Φ∞ is constant in N (Γs) we infer that

B[(u, p), (z′, q′)] = −∫

Γs

(T(u, p)ν

)·(∇zν

)V dS = −2μ

∫Γs

ε(u) : ε(z)V dS.

Inserting the last two expressions into (5.12), and realizing that |Ω| = |Ω0|, we conclude the asserted expressionfor the shape derivative of the Lagrangian.

Page 19: Adaptive finite element method for shape optimization∗

1140 P. MORIN ET AL.

5.5. Dual weighted residual estimator

Let T be a conforming and shape regular triangulation of Ω. Let UT ×QT ⊂ [H1(Ω)]d × L2(Ω) be a stablepair of finite element spaces [18] for the Stokes equations so that UT (resp. QT ) contains polynomials of degree≤ m (resp. ≤ m− 1) for m ≥ 1. Let

UT (v) = U ∈ UT : U = v on Γd (5.18)

where we assume v ∈ UT and set ST (v) := UT (v) × QT . The finite element approximation to the Stokesproblem (5.3) reads

(U, P ) ∈ ST (ud) : B[(U, P ), (W, Φ)] = 0 ∀(W, Φ) ∈ ST (0), (5.19)

where we assume ud ∈ UT .We next evaluate the PDE error induced by the finite element method. To this end, we assume that the

domain Ω is fixed, whence J [Ω, (u, p), λ] = I[(u, p)]. In particular, we are interested in deriving an a posteriorierror estimate for the quantity

∣∣I[(u, p)] − I[(U, P )]∣∣ where (u, p) is the solution to (5.3) and (U, P ) that of

(5.19). Applying the abstract theory of the Dual Weighted Residual method presented in Section 3 we obtainthe following result. Even though a similar estimate has been derived in [17], we present the proof now forcompleteness.

Lemma 5.3 (DWR estimate for the drag). The following error estimate holds∣∣J [Ω, (u, p), λ]− J [Ω, (U, P ), λ]∣∣ ≤ ∑

T∈T

(‖r(U, P )‖L2(T )‖z− Z‖L2(T )

+ ‖j(U, P )‖L2(∂T )‖z− Z‖L2(∂T ) + ‖ρ(U)‖L2(T )‖q −Q‖L2(T )

),

(5.20)

where (Z, Q) is the finite element approximation to the solution (z, q) of the adjoint problem (5.10) and

r(U, P )|T := − divT(U, P ), ρ(U)|T := div U,

j(U, P )|S :=

⎧⎪⎨⎪⎩12 [T(U, P )ν], S ⊂ ∂Ω,

T(U, P )ν, S ⊂ Γout,

0 otherwise,

where [·] denotes the jump across the interelement side S.

Proof. Applying (3.8) to (5.6), with linear functional I obeying (5.7), yields

J [Ω, (u, p), λ]− J [Ω, (U, P ), λ] = B[(U, P ), (z − Z, q −Q)].

Integrating B[(U, P ), (z−Z, q−Q)] by parts over the elements T ∈ T , and collecting the boundary terms fromadjacent elements to form jumps, we obtain the expression

B[(U, P ), (z − Z, q −Q)] =∑T∈T

∫T

r(U, P )(z − Z) + ρ(U)(q −Q) dx +∫

∂T

j(U, P )(z− Z) dS.

Applying the Cauchy-Schwarz inequality leads to (5.20), which is consistent with (3.9).

Figure 2. Drag optimization: snapshots at iterations 0, 10, 28, 50, 130 and 174 of the evolution of a non-convex obstacle in a channel flow using the ASQP algorithm to find the optimal rugby-ball shape that minimizes its drag. The flow is modeled as a stationary Stokes fluid. The obstacle is constrained to maintain its initial volume. Taylor-Hood finite elements with m = 2 (quadratics for velocity and linears for pressure) are employed for approximating both the state and adjoint problems. For the boundary we consider the Laplace-Beltrami operator with ρ = 0.05. It is worth noticing that the initial refinement due to the presence of (non-genuine) corners on the initial shape disappears later on, and new refinement appears around the (genuine) corners of the optimal shape. This is the combined effect of DWR (Sect. 3) and GCMM (Sect. 4.4).

In view of the discussion following Lemma 3.2, especially (3.13)–(3.14), for the simulations below we use the following heuristic bound in the module ESTIMATE of the adaptive algorithm ASQP:
$$
\bigl|J[\Omega,(u,p),\lambda] - J[\Omega,(U,P),\lambda]\bigr| \;\lesssim\; \sum_{T\in\mathcal T} \eta_T(T),
$$
where the explicit element indicators are given by
$$
\eta_T(T) := \Bigl(h_T^{1/2}\,\|r(U,P)\|_{L^2(T)} + \|j(U,P)\|_{L^2(\partial T)}\Bigr)\sum_{j=1}^{m} h_T^{j}\,\|[D^j Z]\|_{L^2(\partial T)} + h_T^{1/2}\,\|\rho(U)\|_{L^2(T)}\sum_{j=0}^{m-1} h_T^{j}\,\|[D^j Q]\|_{L^2(\partial T)}.
$$
Since the various terms $h_T^j\|[D^j Z]\|_{L^2(\partial T)}$ in this heuristic bound are expected to be of the same order, except for pathological situations, we just take j = 1 in our numerical implementation.
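As an illustration of how such indicators feed the marking step, the following sketch (plain NumPy, with hypothetical array names) evaluates the j = 1 truncation of ηT and applies a simple maximum-strategy marking for refinement and coarsening; the thresholds are illustrative and not taken from the paper.

```python
import numpy as np

def eta_drag(h, r, j, rho, jumpZ, jumpQ):
    """Heuristic element indicators for the drag functional (j = 1 truncation):
    eta_T = (h^{1/2}*||r|| + ||j||) * h*||[D Z]|| + h^{1/2}*||rho|| * h*||[D Q]||.
    All inputs are per-element numpy arrays of the corresponding norms."""
    return (np.sqrt(h)*r + j) * h*jumpZ + np.sqrt(h)*rho * h*jumpQ

def mark(eta, ref_frac=0.7, coars_frac=0.05):
    """Maximum-strategy marking: refine elements whose indicator exceeds
    ref_frac * max(eta), coarsen those below coars_frac * max(eta).
    The two fractions are illustrative tuning parameters."""
    top = eta.max()
    refine  = np.flatnonzero(eta > ref_frac * top)
    coarsen = np.flatnonzero(eta < coars_frac * top)
    return refine, coarsen

# Example with synthetic data for a mesh of 1000 elements.
rng = np.random.default_rng(0)
n = 1000
h = np.full(n, 0.05)
eta = eta_drag(h, rng.random(n), rng.random(n), rng.random(n),
               rng.random(n), rng.random(n))
refine, coarsen = mark(eta)
print(len(refine), "elements marked for refinement,", len(coarsen), "for coarsening")
```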

5.6. Numerical experiments

We perform the numerical simulations in two dimensions. In Figure 2 we show a sequence of meshes, starting from a non-convex initial obstacle and arriving at the optimal rugby-ball shape with the same volume [25,26].

This simulation is rather demanding due to the non-convexity and non-smoothness of the initial obstacle shape. In Figure 2 we observe its evolution under the ASQP algorithm. It can be seen that DWR finds large estimators close to the corners of the initial shape, and forces some refinement in order to control the PDE error. Afterwards, using the direction dictated by the shape derivative δΩJ[Ω] and the scalar product AΓ(·,·), as well as GCMM, ASQP smooths out those corners and straightens out the non-convex part of Γ. A few iterations later the elements that were initially around the original corners are coarsened due to the current smoothness of Γ, and the elements around the newly formed corners are strongly refined to reduce both the PDE and geometric errors (see Fig. 3).

Table 1. Drag optimization: adaptivity statistics of the ASQP method for the first 35 iterations. The number of marked elements for refinement/coarsening is denoted with LB for Laplace-Beltrami and DWR for the dual weighted residual method. After iteration 35 the first remeshing (R) occurs, triggered by a failed mesh quality check. Subsequent remeshings occurred after iterations 62, 100 and 154.

Iteration:  2 3 4 5 8 9 10 17 18 23 26 28 30 31 32 33 35 | R
DWR-ref:    220 216 94 79 153 80 225 11 241 94 | R
DWR-coars:  34 51 54 48 90 778 63 43 23 135 | R
LB-ref:     11 16 19 19 13 19 36 7 37 6 10 62 | R
LB-coars:   2 8 2 1 | R

In Table 1 we show numerical data that document some aspects of the practical behavior of ASQP. During the first 10 iterations the refinement process dominates the coarsening procedure; this is mostly due to the detection of the initial corners. At iteration 28 a strong coarsening occurs in response to the flattening of the initially highly refined corners. The method is thus able to detect and coarsen fake corners (see Fig. 3). The first remeshing occurs after iteration 35; subsequent remeshings occurred after iterations 62, 100 and 154.
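The statistics in Table 1 reflect the interplay of the two estimators with the remeshing safeguard. The following schematic loop mimics that interplay with stub routines; every helper below is a placeholder standing in for a real finite element code, and the structure is a reading of the reported behavior, not the authors' ASQP implementation.

```python
# Schematic ASQP-style outer loop: DWR and LB estimators drive refinement and
# coarsening, and a failed mesh-quality check triggers remeshing.  All helpers
# are stubs; NOT the authors' implementation.
import random

def dwr_indicators(mesh):   return [random.random() for _ in mesh]          # stub
def lb_indicators(mesh):    return [random.random() for _ in mesh]          # stub
def refine(mesh, marked):   return mesh + [0.0] * len(marked)               # stub
def coarsen(mesh, marked):  return mesh[: max(4, len(mesh) - len(marked))]  # stub
def deform_boundary(mesh):  return mesh                                     # stub: one SQP step
def mesh_quality_ok(mesh):  return random.random() > 0.03                   # stub
def remesh(mesh):           return [0.0] * len(mesh)                        # stub

def asqp_loop(mesh, iters=50, theta_ref=0.7, theta_coars=0.05):
    for k in range(iters):
        mesh = deform_boundary(mesh)                      # geometric step
        for estimate in (dwr_indicators, lb_indicators):  # PDE and geometric errors
            eta = estimate(mesh)
            top = max(eta)
            ref   = [i for i, e in enumerate(eta) if e > theta_ref * top]
            coars = [i for i, e in enumerate(eta) if e < theta_coars * top]
            mesh = refine(mesh, ref)
            mesh = coarsen(mesh, coars)
        if not mesh_quality_ok(mesh):                     # safeguard: full remesh
            mesh = remesh(mesh)
            print(f"remeshing after iteration {k}")
    return mesh

asqp_loop([0.0] * 100)
```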

6. Optimization of an aortic-coronary by-pass

6.1. Cost functional, Lagrangian, and adjoint equation

We consider now a model of blood flow through an aortic-coronary by-pass. Let Ω ⊂ R^d, d ≥ 2, be a bounded domain as depicted in Figure 4. Let the velocity-pressure pair (u, p) = (u(Ω), p(Ω)) solve the Stokes problem (5.1) in strong form or (5.3) in weak form, and let the finite element pair (U, P) solve the Galerkin problem (5.19). We are interested in the total dissipated energy in Ω, which is given by
$$
I[\Omega,(u,p)] := 2\mu \int_\Omega |\varepsilon(u)|^2\,dx. \qquad (6.1)
$$

It is worth noticing that in two dimensions I measures the vorticity of an incompressible flow. The corresponding minimization problem has been considered in [27] to find the optimal design of an aorto-coronary by-pass (see also [28]). We supplement (6.1) with a penalization of the perimeter of Γs and thus consider, for ε > 0 fixed,

J [Ω, (u, p)] := I[Ω, (u, p)] + ε|Γs|. (6.2)

According to (2.2), the Lagrangian associated with this shape optimization problem reads

L[Ω, (u, p), (z, q)] := J [Ω, (u, p)]−B[(u, p), (z, q)]. (6.3)

Lemma 6.1 (adjoint equation for (6.3)). The adjoint pair (z, q) satisfies the weak form

(z, q) ∈ S(0) : B[(w, s), (z, q)] = 4μ〈ε(u), ε(w)〉 ∀(w, s) ∈ S(0), (6.4)


Figure 3. Drag optimization: detection of genuine geometric singularities (iterations 0, 5, 20, 32, 50 and 160). Zoom of the evolution of the non-convex obstacle towards the optimal shape that minimizes the drag with a given volume. The geometric singularities given by the artificial corners of the initial shape are quickly detected by the ASQP method. They are refined by the combined effect of the LB and DWR error estimates, and smoothed out by the energy-minimizing iterations and GCMM (first three frames). A few iterations later the elements that were initially around the original corners are coarsened due to the current smoothness of Γ (fourth frame). As the genuine singularity of the problem (the corner of the rugby ball) appears, the ASQP method is able to recognize it and to refine around it (last two frames) so as to improve both the PDE and geometric approximation.


Figure 4. Domain Ω for coronary by-pass shape optimization: Ω ⊂ R^d, d ≥ 2, is a bounded domain with boundary split into an inflow part Γin, an outflow part Γout, a part considered as a wall Γw, and a deformable part Γs, which is the main design variable. The end points of Γs connecting to Γw are fixed and are not part of the optimization.


as well as the strong form
$$
\begin{aligned}
-\operatorname{div}\bigl(T(z,q)\bigr) &= -4\mu\,\operatorname{div}\bigl(\varepsilon(u)\bigr) &&\text{in }\Omega, \\
\operatorname{div} z &= 0 &&\text{in }\Omega, \\
z &= 0 &&\text{on }\Gamma_d, \\
T(z,q)\,\nu &= 4\mu\,\varepsilon(u)\,\nu &&\text{on }\Gamma_{\mathrm{out}}.
\end{aligned}
\qquad (6.5)
$$

Proof. It suffices to differentiate L in (6.3) with respect to (u, p) to get (6.4), and to integrate (6.4) by parts to obtain (6.5).
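The discrete counterpart of (6.4) differs from the state solve only in its right-hand side 4μ(ε(U), ε(w)); since the Stokes form B is symmetric in its two velocity-pressure arguments, the same form can be reused. A hedged continuation of the FEniCS sketch given after (5.19), reusing W, B, eps, mu, U, w and the boundary markers defined there, with homogeneous Dirichlet data for the adjoint velocity; again an illustration, not the authors' code.

```python
# Discrete adjoint of (6.4): same bilinear form B (symmetric for Stokes), with
# right-hand side 4*mu*(eps(U), eps(w)).  Continues the state sketch above;
# illustration only, not the paper's implementation.
from dolfin import Function, DirichletBC, Constant, inner, dx, solve

rhs = 4.0*mu*inner(eps(U), eps(w))*dx     # derivative of I at the discrete state U

# Homogeneous Dirichlet data for the adjoint velocity on the whole Dirichlet part
bcs_adj = [DirichletBC(W.sub(0), Constant((0.0, 0.0)), inflow),
           DirichletBC(W.sub(0), Constant((0.0, 0.0)), walls)]

ZQ = Function(W)
solve(B == rhs, ZQ, bcs_adj)              # discrete adjoint (Z, Q)
Z, Q = ZQ.split()
```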

6.2. Shape derivative

We now compute δΩL[Ω, (u, p)(Ω), (z, q)(Ω)] for L in (6.3).

Lemma 6.2 (shape derivative of (6.3)). Let (u, p) be the solution of (5.3) and (z, q) be that of (6.4). The shape derivative of L[Ω, (u, p)(Ω), (z, q)(Ω)] in the direction V is given by
$$
\langle \delta_\Omega L[\Omega,(u,p)(\Omega),(z,q)(\Omega)],\,V\rangle = 2\mu\int_{\Gamma_s} \varepsilon(u):\bigl(\varepsilon(z)-\varepsilon(u)\bigr)\,V\,dS + \varepsilon\int_{\Gamma_s} \kappa\,V\,dS, \qquad (6.6)
$$

where κ is the mean curvature of Γs.

Proof. In view of (6.3), we can write

$$
L[\Omega,(u,p)(\Omega),(z,q)(\Omega)] = \int_\Omega \bigl(2\mu|\varepsilon(u)|^2 + 2\mu\,\varepsilon(u):\varepsilon(z) - p\,\operatorname{div} z - q\,\operatorname{div} u\bigr)\,dx + \varepsilon\int_{\Gamma_s} dS.
$$

Applying the chain rule (2.13) we end up with

$$
\begin{aligned}
\langle \delta_\Omega L[\Omega,(u,p)(\Omega),(z,q)(\Omega)],\,V\rangle
&= \int_{\Gamma_s}\bigl(2\mu|\varepsilon(u)|^2 + 2\mu\,\varepsilon(u):\varepsilon(z) - p\,\operatorname{div} z - q\,\operatorname{div} u\bigr)\,V\,dS \\
&\quad + \int_\Omega 4\mu\,\varepsilon(u):\varepsilon(u')\,dx \\
&\quad + \int_\Omega \bigl(2\mu\,\varepsilon(u'):\varepsilon(z) - p'\,\operatorname{div} z - q\,\operatorname{div} u'\bigr)\,dx \\
&\quad + \int_\Omega \bigl(2\mu\,\varepsilon(u):\varepsilon(z') - p\,\operatorname{div} z' - q'\,\operatorname{div} u\bigr)\,dx \\
&\quad + \varepsilon\int_{\Gamma_s} \kappa\,V\,dS,
\end{aligned}
$$

where the shape derivatives (u′, p′) = (u′(V), p′(V)) and (z′, q′) = (z′(V), q′(V)) satisfy (5.15) and (5.16) respectively, except that
$$
-\operatorname{div}\bigl(T(z',q')\bigr) = -4\mu\,\operatorname{div}\bigl(\varepsilon(u')\bigr) \quad\text{in }\Omega. \qquad (6.7)
$$

We next examine each term separately. We first observe that integration by parts yields
$$
\int_\Omega 4\mu\,\varepsilon(u):\varepsilon(u')\,dx = -4\mu\int_\Omega \operatorname{div}\bigl(\varepsilon(u)\bigr)\cdot u'\,dx + 4\mu\int_{\partial\Omega} \bigl(\varepsilon(u)\,\nu\bigr)\cdot u'\,dS. \qquad (6.8)
$$

Employing (6.5), integrating by parts, and using the weak form of (5.15), we infer that
$$
\begin{aligned}
-4\mu\int_\Omega \operatorname{div}\bigl(\varepsilon(u)\bigr)\cdot u'\,dx
&= \int_\Omega T(z,q):\nabla u'\,dx - \int_{\partial\Omega}\bigl(T(z,q)\,\nu\bigr)\cdot u'\,dS \\
&= B[(u',p'),(z,q)] + \int_{\Gamma_s}\bigl(\nabla u\,\nu\bigr)\cdot\bigl(T(z,q)\,\nu\bigr)\,V\,dS - \int_{\Gamma_{\mathrm{out}}}\bigl(T(z,q)\,\nu\bigr)\cdot u'\,dS.
\end{aligned}
$$


Since B[(u′, p′), (z, q)] = 0 for (z, q) ∈ S(0), we deduce
$$
-4\mu\int_\Omega \operatorname{div}\bigl(\varepsilon(u)\bigr)\cdot u'\,dx = 2\mu\int_{\Gamma_s}\varepsilon(u):\varepsilon(z)\,V\,dS - \int_{\Gamma_{\mathrm{out}}}\bigl(T(z,q)\,\nu\bigr)\cdot u'\,dS,
$$

where we have used, as in (5.17), that (∇uν)·(T(z,q)ν) = 2μ ε(u):ε(z). We observe now that the last term cancels with a corresponding term in (6.8) because T(z,q)ν = 4με(u)ν according to (6.5). On the other hand, in light of (5.15), the remaining integral over ∂Ω\Γout in (6.8) becomes
$$
4\mu\int_{\partial\Omega\setminus\Gamma_{\mathrm{out}}}\bigl(\varepsilon(u)\,\nu\bigr)\cdot u'\,dS = -4\mu\int_{\Gamma_s}\bigl(\varepsilon(u)\,\nu\bigr)\cdot\bigl(\nabla u\,\nu\bigr)\,V\,dS = -4\mu\int_{\Gamma_s}\varepsilon(u):\varepsilon(u)\,V\,dS,
$$
where we have used again the argument in (5.17) to write (ε(u)ν)·(∇uν) = ε(u):ε(u). This implies
$$
\int_\Omega 4\mu\,\varepsilon(u):\varepsilon(u')\,dx = 2\mu\int_{\Gamma_s}\varepsilon(u):\varepsilon(z)\,V\,dS - 4\mu\int_{\Gamma_s}\varepsilon(u):\varepsilon(u)\,V\,dS.
$$

Similarly, since (z, q) ∈ S(0), the weak form of (5.15) yields
$$
\int_\Omega \bigl(2\mu\,\varepsilon(u'):\varepsilon(z) - p'\,\operatorname{div} z - q\,\operatorname{div} u'\bigr)\,dx = B[(u',p'),(z,q)] = 0,
$$
whereas, proceeding as before,
$$
\begin{aligned}
\int_\Omega \bigl(2\mu\,\varepsilon(u):\varepsilon(z') - p\,\operatorname{div} z' - q'\,\operatorname{div} u\bigr)\,dx
&= B[(u,p),(z',q')] \\
&= \int_{\Gamma_s}\bigl(T(u,p)\,\nu\bigr)\cdot z'\,dS \\
&= -\int_{\Gamma_s}\bigl(T(u,p)\,\nu\bigr)\cdot\bigl(\nabla z\,\nu\bigr)\,V\,dS \\
&= -2\mu\int_{\Gamma_s}\varepsilon(u):\varepsilon(z)\,V\,dS
\end{aligned}
$$
because of (5.17); note that we do not resort to (6.7) because (z′, q′) is viewed as a test function. Collecting the expressions above we arrive easily at the desired formula (6.6).
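Formula (6.6) is what the optimization step consumes: the scalar density g = 2μ ε(u):(ε(z) − ε(u)) + εκ on Γs yields a descent direction and the predicted first-order change of the Lagrangian. The sketch below assumes this density has already been sampled at the boundary vertices of a polygonal Γs; the array names and the mass-lumped quadrature are illustrative, and the plain L²(Γs) descent used here merely stands in for the scalar product AΓ(·,·) actually used by the algorithm.

```python
import numpy as np

def descent_from_density(g, edge_len):
    """Given the shape-derivative density g = 2*mu*eps(u):(eps(z)-eps(u)) + eps_per*kappa
    sampled at the N vertices of Gamma_s, and the lengths of the N-1 boundary edges,
    return a normal descent velocity V = -g and the predicted first-order change
    <delta_Omega L, V> ~ -sum_i g_i^2 * w_i with mass-lumped vertex weights w_i."""
    w = np.zeros_like(g)
    w[:-1] += 0.5 * edge_len          # each edge contributes half its length
    w[1:]  += 0.5 * edge_len          # to each of its two endpoints
    V = -g                            # steepest descent in L2(Gamma_s)
    predicted_change = -np.sum(g**2 * w)
    return V, predicted_change

# Toy example: 5 boundary vertices, uniform edges of length 0.1.
g = np.array([0.2, -0.1, 0.05, 0.3, -0.2])
V, dL = descent_from_density(g, np.full(4, 0.1))
print(V, dL)
```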

6.3. Dual weighted residual estimator

We now estimate the PDE error induced by the finite element approximation and assume that the domain Ω is fixed; thus J[Ω, (u, p)] = I[(u, p)].

Lemma 6.3 (DWR estimate for the by-pass). The following error estimate holds:
$$
\begin{aligned}
\bigl|J[\Omega,(u,p)] - J[\Omega,(U,P)]\bigr| \le \frac12\sum_{T\in\mathcal T}\Bigl(
&\|r^*(U,Z,Q)\|_{L^2(T)}\,\|u-U\|_{L^2(T)} + \|j^*(U,Z,Q)\|_{L^2(\partial T)}\,\|u-U\|_{L^2(\partial T)} \\
&+ \|\rho^*(Z)\|_{L^2(T)}\,\|p-P\|_{L^2(T)} + \|r(U,P)\|_{L^2(T)}\,\|z-Z\|_{L^2(T)} \\
&+ \|j(U,P)\|_{L^2(\partial T)}\,\|z-Z\|_{L^2(\partial T)} + \|\rho(U)\|_{L^2(T)}\,\|q-Q\|_{L^2(T)}\Bigr),
\end{aligned}
\qquad (6.9)
$$


where (U, P) and (Z, Q) are the finite element approximations of the state problem (5.1) and the adjoint problem (6.5), respectively, and the residuals are given by
$$
r^*(U,Z,Q)|_T := -4\mu\,\operatorname{div}\bigl(\varepsilon(U)\bigr) + \operatorname{div} T(Z,Q), \qquad \rho^*(Z)|_T := \operatorname{div} Z,
$$
$$
r(U,P)|_T := \operatorname{div} T(U,P), \qquad \rho(U)|_T := \operatorname{div} U,
$$
$$
j^*(U,Z,Q)|_S :=
\begin{cases}
\tfrac12\,[[\,4\mu\,\varepsilon(U)\,\nu - T(Z,Q)\,\nu\,]], & S \not\subset \partial\Omega,\\
4\mu\,\varepsilon(U)\,\nu - T(Z,Q)\,\nu, & S \subset \Gamma_{\mathrm{out}},\\
0, & \text{otherwise},
\end{cases}
$$
$$
j(U,P)|_S :=
\begin{cases}
-\tfrac12\,[[\,T(U,P)\,\nu\,]], & S \not\subset \partial\Omega,\\
-T(U,P)\,\nu, & S \subset \Gamma_{\mathrm{out}},\\
0, & \text{otherwise},
\end{cases}
$$
with [[·]] denoting the jump across the interelement side S.

Proof. In view of (3.3) and (3.4) for J obeying (6.2), we can write, for all (w, s) ∈ S,
$$
R\bigl((U,P),(Z,Q);(w,s)\bigr) = -B[(U,P),(w,s)],
$$
$$
R^*\bigl((U,P),(Z,Q);(w,s)\bigr) = 4\mu\int_\Omega \varepsilon(U):\varepsilon(w)\,dx - B[(w,s),(Z,Q)].
$$
Therefore, applying (3.6) and realizing that the remainder E = 0 because J is quadratic, we obtain
$$
J[\Omega,(u,p)] - J[\Omega,(U,P)] = \frac12\Bigl(4\mu\int_\Omega \varepsilon(U):\varepsilon(u-U)\,dx - B[(u-U,p-P),(Z,Q)] - B[(U,P),(z-Z,q-Q)]\Bigr).
$$

Splitting the integrals over the elements T ∈ $\mathcal T$, integrating by parts, and collecting the boundary terms from adjacent elements to form jumps yields
$$
\begin{aligned}
J[\Omega,(u,p)] - J[\Omega,(U,P)] = \frac12\sum_{T\in\mathcal T}\Bigl(
&\int_T r^*(U,Z,Q)\cdot(u-U) + \int_{\partial T} j^*(U,Z,Q)\cdot(u-U) + \int_T \rho^*(Z)\,(p-P) \\
&+ \int_T r(U,P)\cdot(z-Z) + \int_{\partial T} j(U,P)\cdot(z-Z) + \int_T \rho(U)\,(q-Q)\Bigr).
\end{aligned}
$$

On using the Cauchy-Schwarz inequality we obtain the asserted estimate (6.9).

In view of the discussion following Lemma 3.2, especially (3.13), for the simulations below we use the following heuristic bound in the module ESTIMATE of the adaptive algorithm ASQP:
$$
\bigl|J[\Omega,(u,p)] - J[\Omega,(U,P)]\bigr| \;\lesssim\; \sum_{T\in\mathcal T}\bigl(\eta_T(T) + \eta^*_T(T)\bigr),
$$
where
$$
\eta_T(T) := \Bigl(h_T^{1/2}\,\|r(U,P)\|_{L^2(T)} + \|j(U,P)\|_{L^2(\partial T)}\Bigr)\sum_{j=1}^{m} h_T^{j}\,\|[D^j Z]\|_{L^2(\partial T)} + h_T^{1/2}\,\|\rho(U)\|_{L^2(T)}\sum_{j=0}^{m-1} h_T^{j}\,\|[D^j Q]\|_{L^2(\partial T)},
$$
$$
\eta^*_T(T) := \Bigl(h_T^{1/2}\,\|r^*(U,Z,Q)\|_{L^2(T)} + \|j^*(U,Z,Q)\|_{L^2(\partial T)}\Bigr)\sum_{j=1}^{m} h_T^{j}\,\|[D^j U]\|_{L^2(\partial T)} + h_T^{1/2}\,\|\rho^*(Z)\|_{L^2(T)}\sum_{j=0}^{m-1} h_T^{j}\,\|[D^j P]\|_{L^2(\partial T)}.
$$

Figure 5. Coronary by-pass: optimal configurations obtained with different penalization parameters: ε = 0.5×10⁻⁵ (top-left), 0.8×10⁻⁵ (top-right), 1.0×10⁻⁵ (bottom-left) and 5.0×10⁻⁵ (bottom-right). Taylor-Hood finite elements with m = 2 are employed for approximating the state and adjoint problems, and for the boundary we consider the pure Laplace-Beltrami operator. Small values of the penalization parameter yield a Taylor-patch-like geometry [33] (top-left), while large values of ε lead to by-pass configurations similar to those in [20] (bottom-right). Corresponding energy plots are shown in Figure 6.

6.4. Numerical experiments

We perform the numerical simulations in two dimensions. We now elaborate on the iterations that led to the meshes shown in Figure 5 and explore the different behavior of the algorithm when changing the perimeter penalization parameter ε in the cost functional. In Figure 5 we depict the optimal by-pass configurations obtained with the following penalization parameters: ε = 0.5×10⁻⁵, 0.8×10⁻⁵, 1.0×10⁻⁵ and 5.0×10⁻⁵. Small values of ε lead to a Taylor-patch-like geometry [33], while large values of ε lead to by-pass configurations similar to those in [20]. Remeshing was unnecessary for any of these by-pass simulations.

We observe the effect of DWR, which refines at the junction between the deformable curve Γs and the wall Γw, a reentrant corner of Ω. DWR is also sensitive to changes of boundary conditions and thus refines at the corners of the outflow boundary Γout, where the traction-free condition changes to no-slip. Since the heuristic element indicators ηT(T) and η*T(T) of DWR contain terms of the form $h_T^j\|[D^j Z]\|_{L^2(\partial T)}$ and $h_T^j\|[D^j U]\|_{L^2(\partial T)}$, which are expected to be of the same order except for pathological cases, we take j = 1 in our numerical experiments.


Figure 6. Coronary by-pass optimization: histories of convergence of the energy functional J[Ωk, (Uk, Pk)(Ωk)] in terms of the iteration counter k for different values of the perimeter penalization parameter (left: ε = 0.5×10⁻⁵, 0.8×10⁻⁵, 1.0×10⁻⁵; right: 5.0×10⁻⁵). Figure 5 displays the corresponding optimal configurations.

In Figure 6 we show the expected monotone behavior of J[Ωk, (Uk, Pk)(Ωk)] versus the number of iterations k for a discrete gradient flow, for the four values of ε listed above.
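The monotone decay in Figure 6 is characteristic of a descent method with step-size control. As a generic illustration only (not necessarily the mechanism built into ASQP), an Armijo-type backtracking rule that enforces a decrease of J at every accepted step can be sketched as follows; evaluate_J and deform are hypothetical hooks into the shape and PDE solvers.

```python
def backtracking_step(shape, V, dJ, evaluate_J, deform,
                      t0=1.0, beta=0.5, c=1e-4, max_iter=20):
    """Armijo-type backtracking: shrink the step t until
    J(deformed shape) <= J(shape) + c * t * dJ, where dJ < 0 is the predicted
    first-order change <delta_Omega J, V>.  Returns the accepted shape and step."""
    J0 = evaluate_J(shape)
    t = t0
    for _ in range(max_iter):
        trial = deform(shape, V, t)
        if evaluate_J(trial) <= J0 + c * t * dJ:
            return trial, t
        t *= beta                      # halve the step and try again
    return shape, 0.0                  # no admissible step found: keep the shape

# Toy check on a 1D "shape": minimize J(x) = (x - 2)^2 with V = -J'(x).
evaluate_J = lambda x: (x - 2.0) ** 2
deform = lambda x, V, t: x + t * V
x = 0.0
V = -2.0 * (x - 2.0)                   # negative gradient
x, t = backtracking_step(x, V, dJ=-V * V, evaluate_J=evaluate_J, deform=deform)
print(x, t)
```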

7. Conclusions and further comments

• We develop an adaptive sequential quadratic programming (ASQP) algorithm for shape optimization, which dynamically changes the tolerance and equidistributes the computational effort between the PDE and geometry approximation. This leads to rather coarse meshes in the early stages of the optimization, when the shape is still far from optimal and full numerical accuracy would be wasted.

• We give a formula that relates the PDE and geometric errors dynamically and depends on the best domain deformation to perturb the energy functional J[Ω]. Such a deformation is characterized via shape differential calculus. The dual weighted residual (DWR) method controls the PDE error, while the Laplace-Beltrami (LB) error indicator deals with the geometric error associated with domain deformations. This appears to be the first work combining all these critical ingredients.

• We exploit the geometrically consistent mesh modification (GCMM) algorithms of [8] within ASQP. This allows for detection and removal of kinks that are fake (numerical artifacts) but not of those that are genuine to the problem. This is a new paradigm in adaptivity and its resolution is a crucial contribution. It is important to notice that both DWR and LB would insist on refining kinks regardless of their nature in the absence of a robust detection and correction mechanism.

• We apply the ASQP algorithm to two relevant problems governed by the stationary Stokes equation. We first examine a drag minimization problem, which has been the subject of intense research in the literature but without including adaptivity. We next study an aorto-coronary by-pass model. In both cases we discuss the ingredients of ASQP in detail and document its performance with interesting simulations. It is worth realizing that the applicability of ASQP goes far beyond viscous fluids.

• We present a novel interpolation error estimate which measures regularity in terms of jumps of the interpolant plus a higher order correction term (see Lemma 3.2). This may be viewed as a discrete Bramble-Hilbert lemma and justifies asymptotic expressions used in the DWR literature.


References

[1] G. Allaire, Conception optimale de structures. Springer-Verlag, Berlin (2007).
[2] P. Alotto, P. Girdinio, P. Molfino and M. Nervi, Mesh adaption and optimization techniques in magnet design. IEEE Trans. Magn. 32 (1996) 2954–2957.
[3] W. Bangerth and R. Rannacher, Adaptive Finite Element Methods for Differential Equations. Birkhäuser (2003).
[4] N.V. Banichuk, A. Falk and E. Stein, Mesh refinement for shape optimization. Struct. Optim. 9 (1995) 46–51.
[5] E. Bänsch, P. Morin and R.H. Nochetto, Surface diffusion of graphs: variational formulation, error analysis and simulation. SIAM J. Numer. Anal. 42 (2004) 773–799.
[6] R. Becker and R. Rannacher, An optimal control approach to a posteriori error estimation in finite element methods. Acta Numer. 10 (2001) 1–102.
[7] J.A. Bello, E. Fernández-Cara, J. Lemoine and J. Simon, The differentiability of the drag with respect to the variations of a Lipschitz domain in a Navier-Stokes flow. SIAM J. Control Optim. 35 (1997) 626–640.
[8] A. Bonito, R.H. Nochetto and M.S. Pauletti, Geometrically consistent mesh modification. SIAM J. Numer. Anal. 48 (2010) 1877–1899.
[9] M. Burger, A framework for the construction of level set methods for shape optimization and reconstruction. Interfaces Free Bound. 5 (2003) 301–329.
[10] J. Céa, Conception optimale ou identification de formes: calcul rapide de la dérivée directionnelle de la fonction coût. RAIRO Modél. Math. Anal. Numér. 20 (1986) 371–402.
[11] F. de Gournay, Velocity extension for the level-set method and multiple eigenvalues in shape optimization. SIAM J. Control Optim. 45 (2006) 343–367.
[12] M.C. Delfour and J.-P. Zolésio, Shapes and Geometries. SIAM Advances in Design and Control 22 (2011).
[13] A. Demlow, Higher-order finite element methods and pointwise error estimates for elliptic problems on surfaces. SIAM J. Numer. Anal. 47 (2009) 805–827.
[14] A. Demlow and G. Dziuk, An adaptive finite element method for the Laplace-Beltrami operator on implicitly defined surfaces. SIAM J. Numer. Anal. 45 (2007) 421–442.
[15] G. Dogan, P. Morin, R.H. Nochetto and M. Verani, Discrete gradient flows for shape optimization and applications. Comput. Methods Appl. Mech. Engrg. 196 (2007) 3898–3914.
[16] M. Giles and E. Süli, Adjoint methods for PDEs: a posteriori error analysis and postprocessing by duality. Acta Numer. 11 (2002) 145–236.
[17] M. Giles, M. Larson, J.M. Levenstam and E. Süli, Adaptive error control for finite element approximation of the lift and drag coefficients in viscous flow. Technical Report 1317 (1997) http://eprints.maths.ox.ac.uk/1317/.
[18] V. Girault and P.A. Raviart, Finite Element Methods for Navier-Stokes Equations: Theory and Algorithms. Springer Series in Computational Mathematics 5, Springer-Verlag, Berlin (1986).
[19] A. Henderson, ParaView Guide, A Parallel Visualization Application. Kitware Inc. (2007).
[20] M. Lei, J.P. Archie and C. Kleinstreuer, Computational design of a bypass graft that minimizes wall shear stress gradients in the region of the distal anastomosis. J. Vasc. Surg. 25 (1997) 637–646.
[21] K. Mekchay, P. Morin and R.H. Nochetto, AFEM for the Laplace-Beltrami operator on graphs: design and conditional contraction property. Math. Comp. 80 (2011) 625–648.
[22] B. Mohammadi and O. Pironneau, Applied shape optimization for fluids. Oxford University Press, Oxford (2001).
[23] M.S. Pauletti, Parametric AFEM for geometric evolution equations and coupled fluid-membrane interaction. Ph.D. thesis, University of Maryland, College Park, ProQuest LLC, Ann Arbor, MI (2008).
[24] M.S. Pauletti, Second order method for surface restoration. Submitted.
[25] O. Pironneau, On optimum profiles in Stokes flow. J. Fluid Mech. 59 (1973) 117–128.
[26] O. Pironneau, On optimum design in fluid mechanics. J. Fluid Mech. 64 (1974) 97–110.
[27] A. Quarteroni and G. Rozza, Optimal control and shape optimization of aorto-coronaric bypass anastomoses. Math. Models Methods Appl. Sci. 13 (2003) 1801–1823.
[28] G. Rozza, Shape design by optimal flow control and reduced basis techniques: applications to bypass configurations in haemodynamics. Ph.D. thesis, École Polytechnique Fédérale de Lausanne (2005).
[29] J.R. Roche, Adaptive method for shape optimization. 6th World Congresses of Structural and Multidisciplinary Optimization, Rio de Janeiro (2005).
[30] A. Schleupen, K. Maute and E. Ramm, Adaptive FE-procedures in shape optimization. Struct. Multidisc. Optim. 19 (2000) 282–302.
[31] A. Schmidt and K.G. Siebert, Design of Adaptive Finite Element Software: The Finite Element Toolbox ALBERTA. Lecture Notes in Computational Science and Engineering 42, Springer, Berlin (2005).
[32] J. Sokolowski and J.-P. Zolésio, Introduction to Shape Optimization. Springer-Verlag, Berlin (1992).
[33] R.S. Taylor, A. Loh, R.J. McFarland, M. Cox and J.F. Chester, Improved technique for polytetrafluoroethylene bypass grafting: long-term results using anastomotic vein patches. Br. J. Surg. 79 (1992) 348–354.