A stable unstructured finite volume method for parallel large-scale viscoelastic fluid flow calculations

Journal of Non-Newtonian Fluid Mechanics 166 (2011) 779–791


Mehmet SAHIN ⇑

Faculty of Aeronautics and Astronautics, Astronautical Engineering Department, Istanbul Technical University, Maslak, Istanbul 34469, Turkey

Article info

Article history:
Received 11 October 2010
Received in revised form 30 March 2011
Accepted 30 March 2011
Available online 12 April 2011

Keywords:
Oldroyd-B model
Unstructured finite volume method
Large-scale computation
Non-nested multigrid method
Flow past a cylinder in a channel
Flow around a rigid sphere falling in a tube

0377-0257/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.jnnfm.2011.03.010

⇑ Tel.: +90 212 285 3106; fax: +1 90 212 285 3139.
E-mail address: [email protected]

Abstract

A new stable unstructured finite volume method is presented for parallel large-scale simulation of viscoelastic fluid flows. The numerical method is based on the side-centered finite volume method, in which the velocity vector components are defined at the mid-point of each cell face while the pressure and the extra stress tensor are defined at the element centroids. The present arrangement of the primitive variables leads to a stable numerical scheme and does not require any ad-hoc modifications in order to enhance the pressure–velocity–stress coupling. The log-conformation representation proposed in [R. Fattal, R. Kupferman, Constitutive laws for the matrix–logarithm of the conformation tensor, J. Non-Newtonian Fluid Mech. 123 (2004) 281–285] has been implemented in order to improve the limiting Weissenberg numbers in the proposed finite volume method. The time stepping algorithm decouples the calculation of the polymeric stress by solution of a hyperbolic constitutive equation from the evolution of the velocity and pressure fields by solution of a generalized Stokes problem. The resulting algebraic linear systems are solved using the FGMRES(m) Krylov iterative method with the restricted additive Schwarz preconditioner for the extra stress tensor and the geometric non-nested multilevel preconditioner for the Stokes system. The implementation of the preconditioned iterative solvers is based on the PETSc library in order to improve the efficiency of the parallel code. The present numerical algorithm is validated for the Kovasznay flow, the flow of an Oldroyd-B fluid past a confined circular cylinder in a channel and the three-dimensional flow of an Oldroyd-B fluid around a rigid sphere falling in a cylindrical tube. Parallel large-scale calculations are presented with up to 523,094 quadrilateral elements in two dimensions and 1,190,376 hexahedral elements in three dimensions.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Large-scale numerical simulations of viscoelastic fluid flows have gained particular interest due to their wide range of application areas. Although significant progress has been made in the solution of incompressible viscoelastic fluid flows, the development of more advanced numerical algorithms in terms of accuracy, stability, convergence and required computer power, for both steady-state simulations and fully implicit time integration of the incompressible viscoelastic fluid flow equations, remains an active research topic.

In the past two decades, considerable effort has been devoted to the development of robust and stable numerical algorithms. In order to enhance numerical stability, Perera and Walters [42] introduced the elastic viscous split stress (EVSS) formulation in a finite difference method. Guenette and Fortin [23] proposed the so-called discrete elastic viscous split stress (DEVSS) method in a mixed FEM implementation. Sun et al. [51] proposed an adaptive viscoelastic stress splitting formulation (AVSS) and its applications: the streamline integration (AVSS/SI) and the streamline upwind Petrov–Galerkin (AVSS/SUPG) methods. Oliveira et al. [38] developed a collocated finite volume method on non-orthogonal grids; the velocity–stress–pressure decoupling was removed by using an interpolation similar to that of Rhie and Chow [44].

In the present paper, a new stable unstructured finite volume method is presented for parallel large-scale simulation of viscoelastic fluid flows. The numerical method is based on the side-centered finite volume method, in which the velocity vector components are defined at the mid-point of each cell face while the pressure and the extra stress tensor are defined at the element centroids. The present arrangement of the primitive variables leads to a stable numerical scheme and does not require any ad-hoc modifications in order to enhance the pressure–velocity–stress coupling. This approach was initially used by Hwang [27] and Rida et al. [45] for the solution of the incompressible Navier–Stokes equations on unstructured triangular meshes. Hwang [27] pointed out several important computational merits of this grid arrangement. Rida et al. [45] called this scheme the side-centered finite volume method and reported superior convergence properties compared to the semi-staggered approach. Therefore, the present side-centered finite volume method is implemented for large-scale viscoelastic fluid flow simulations rather than the semi-staggered finite volume algorithm given in [49]. Rannacher and Turek [43] used the same approach within the finite element framework by employing the stable non-conforming $\tilde{Q}_1/Q_0$ finite element pair, which is a quadrilateral counterpart of the well-known non-conforming triangular Stokes element of Crouzeix and Raviart [15]. The most appealing feature of this finite element pair is the availability of efficient multigrid solvers which are sufficiently robust even on non-uniform and highly anisotropic meshes. Although the fully staggered approach combined with a multigrid method also leads to a very robust numerical algorithm, neither the reconstruction of the velocity components on unstructured staggered grids nor the computation of the inter-grid transfer operators in multigrid is straightforward. The use of all the velocity vector components significantly simplifies the numerical discretization of the governing equations on unstructured grids as well as the implementation of physical boundary conditions. The present arrangement of the primitive variables can be applied to any non-overlapping convex polygon, which is very important for the treatment of more complex configurations. In the present work, special attention is given to satisfying the continuity equation exactly within each element, so that the sum of the discrete continuity equations reduces exactly to the domain boundary, which is important for global mass conservation.

Although significant progress has been made towards robust and stable numerical algorithms for the solution of viscoelastic fluid flows, most numerical methods lose convergence at small or moderate Weissenberg numbers, which limits their applications; this is the so-called High Weissenberg Number Problem (HWNP). Recently, a log representation of the conformation tensor was proposed by Fattal and Kupferman [22]. In this approach, the governing constitutive equation is written in terms of the logarithm of the conformation tensor. This representation ensures the positive definiteness of the conformation tensor and captures sharp elastic stress layers which are exponential in nature. Hulsen et al. [26] showed that the log-conformation formulation improves the stability of numerical methods by applying the DEVSS/DG method to simulate the flow of Oldroyd-B and Giesekus fluids past a cylinder in a channel. Coronado et al. [17] presented a simple alternate form of the log-conformation formulation in which the conformation tensor is replaced by the matrix exponential. Kane et al. [30] compared four different implementations of the log-conformation formulation on the flow around a circular cylinder. The authors pointed out that the original log-conformation formulation [22] is an excellent choice despite its complex implementation. Therefore, the original log-conformation formulation of [22] has been implemented in order to improve the limiting Weissenberg numbers in the proposed finite volume method.

The parallel computation of viscoelastic flows is essential for bringing modern computational power up to the task of carrying out the calculations required by complex constitutive equations. Dou and Phan-Thien [19] implemented a parallel unstructured finite volume algorithm for the flow of a PTT fluid around a cylinder on a distributed computing environment through the Parallel Virtual Machine (PVM). Caola et al. [11] presented a highly parallel time integration method for calculating viscoelastic flows with the DEVSS-G/DG finite element discretization. The method is based on an operator splitting time integration method that decouples the calculation of the polymeric stress by solution of a hyperbolic constitutive equation from the evolution of the velocity and pressure fields by solution of a generalized Stokes problem. The Stokes-like system is solved by using the BiCGStab Krylov iterative method preconditioned with the block complement and additive level method (BCALM). Dimakopoulos [18] presented the parallelization of a fully implicit and stable finite element algorithm with relatively low memory requirements for the accurate simulation of time-dependent free-surface flows of multimode viscoelastic liquids. Kim et al. [33] obtained high-resolution solutions of viscoelastic flow problems based on the DEVSS-G/SUPG formulation. An adaptive incomplete LU (AILU) preconditioner with variable reordering was used for the coupled solution of the linearized system of equations. Castillo et al. [12] developed a fully coupled parallel finite element method for solving three-dimensional viscoelastic free surface flows. A Krylov subspace method with a sparse approximate inverse preconditioner (SPAI) was used for the solution of the linearized system of equations. As in the work of Caola et al. [11], we use a time-splitting technique which decouples the solution of the extra stress from the evolution of the velocity and pressure fields by solution of a generalized Stokes problem. Although this decoupling limits the allowable time step, both steps can be solved efficiently by using preconditioned Krylov subspace methods. Here, the Stokes-like system is solved using the FGMRES(m) Krylov iterative method [47] preconditioned with a non-nested multigrid method. The present preconditioner is essential for a parallel scalable viscoelastic flow solver, because it is well known that one-level methods (e.g., Jacobi, Gauss–Seidel, incomplete LU, etc.) lead to non-scalable solvers: the number of iterations increases as the number of sub-domains is increased.

Multigrid methods [24,54] are known to be the most efficient numerical techniques for solving large-scale problems that arise in numerical simulations of physical phenomena, because their computational cost and memory requirements scale linearly with the number of degrees of freedom. The basic idea of the multigrid method is to carry out iterations on a fine grid and then progressively transfer the flow field variables and residuals to a series of coarser grids. On the coarser grids, the low frequency errors become high frequency ones and they can be easily annihilated by simple explicit methods. There are various possible strategies for implementing a multigrid algorithm on unstructured meshes [37]. One of the most successful multigrid techniques has been the use of non-nested coarse and fine levels. In this approach, coarse grid levels are created independently from the finer meshes, and flow variables, residuals and corrections are transferred back and forth between the various grid levels in a multigrid cycle. To reduce the memory requirement of the multigrid method we use a more aggressive coarsening method similar to that of Lin et al. [34,35]. In this study, we investigate the applicability of the multigrid technique for the coupled iterative solution of the momentum and continuity equations. In the coupled approach, the momentum and continuity equations are solved simultaneously. This leads to more robust solution techniques compared to decoupled solution techniques of the SIMPLE, SIMPLER, etc. type, as well as to preconditioned iterative solvers based on block factorization techniques [20,31,21,3,48]. An extensive review of fully coupled iterative solvers for the incompressible Navier–Stokes equations may be found in [50]. However, it is not possible to apply the standard multigrid methods with classical smoothing techniques (e.g., Jacobi, Gauss–Seidel, etc.) for the coupled iterative solution of the momentum and continuity equations because of the zero block in the saddle point problem [7,46,50]. In order to avoid the zero block in the saddle point problem, we use an upper triangular right preconditioner which results in a scaled discrete Laplacian instead of a zero block in the original system. Simple smoothers of the Jacobi, Gauss–Seidel, etc. type can then be used directly. The implementation of the preconditioned Krylov subspace algorithm, the matrix–matrix multiplication and the multilevel preconditioner were carried out using the PETSc (Portable, Extensible Toolkit for Scientific Computation) software package [5] developed at Argonne National Laboratory. The computational meshes are partitioned into a set of sub-domains using the METIS library [29].

Fig. 1. Two-dimensional unstructured mesh with a dual control volume.

Fig. 2. Three-dimensional unstructured mesh with a dual control volume.

The proposed numerical method is validated initially for the Kovasznay flow [32], and the spatial convergence of the method is established for a Newtonian fluid. The method is then applied to the classical benchmark problem of the flow of a viscoelastic fluid past a confined circular cylinder in a channel, which has been investigated by many researchers [1,2,11,17,19,26,30,33,39,58]. The problem is known to be very difficult due to the very thin extra stress boundary layer on the cylinder surface and in the wake behind the cylinder. In the present work, we employ very high mesh resolution in these regions, allowing very accurate solutions at minimum cost, in order to study mesh convergence in the wake of the cylinder. The numerical results at We = 0.7 indicate that mesh convergence is achieved in the wake of the cylinder. However, the numerical results at higher Weissenberg numbers indicate that no steady-state solution is possible for an Oldroyd-B fluid beyond We = 0.8. Finally, the numerical method is applied to the three-dimensional flow of a viscoelastic fluid around a rigid sphere falling in a cylindrical tube [10,13,36,40,55–57]. The two-dimensional axisymmetric results available in the literature confirm the present three-dimensional algorithm.

This article is organized as follows: The governing equations and the proposed finite volume algorithm are given in Section 2. The present numerical algorithm is tested on parallel machines and verified for the Kovasznay flow, the flow of an Oldroyd-B fluid past a confined circular cylinder in a channel and the three-dimensional flow of an Oldroyd-B fluid around a rigid sphere falling in a cylindrical tube in Section 3. Finally, the conclusions and discussion are presented in Section 4.

2. Mathematical and numerical formulation

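The governing equations for the incompressible Oldroyd-B fluid flow in the Cartesian coordinate system can be written in dimensionless form as follows: the continuity equation

\[ -\nabla \cdot \mathbf{u} = 0 \tag{1} \]

the momentum equations

\[ Re\left[\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}\right] + \nabla p = \beta \nabla^2 \mathbf{u} + \nabla\cdot\mathbf{T} \tag{2} \]

and the constitutive equation for the Oldroyd-B model

\[ We\left[\frac{\partial \mathbf{T}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{T} - (\nabla\mathbf{u})^{\top}\cdot\mathbf{T} - \mathbf{T}\cdot\nabla\mathbf{u}\right] = (1-\beta)\left(\nabla\mathbf{u} + \nabla\mathbf{u}^{\top}\right) - \mathbf{T} \tag{3} \]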

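In these equations, $\mathbf{u}$ represents the velocity vector, $p$ is the pressure and $\mathbf{T}$ is the extra stress tensor. The non-dimensional parameters are the Reynolds number $Re$, the Weissenberg number $We$ and the viscosity ratio $\beta$. Integrating the differential Eqs. (1) and (3) over an unstructured quadrilateral/hexahedral element $\Omega_e$ with boundary $\partial\Omega_e$ gives

\[ -\oint_{\partial\Omega_e} \mathbf{n}\cdot\mathbf{u}\, dS = 0 \tag{4} \]

\[ We\left[\int_{\Omega_e}\left(\frac{\partial \mathbf{T}}{\partial t} - (\nabla\mathbf{u})^{\top}\cdot\mathbf{T} - \mathbf{T}\cdot\nabla\mathbf{u}\right) dV + \oint_{\partial\Omega_e} (\mathbf{n}\cdot\mathbf{u})\,\mathbf{T}\, dS\right] = (1-\beta)\oint_{\partial\Omega_e} (\mathbf{u}\mathbf{n} + \mathbf{n}\mathbf{u})\, dS - \int_{\Omega_e} \mathbf{T}\, dV \tag{5} \]

and integrating the momentum Eq. (2) over an arbitrary dual control volume $\Omega_d$ with boundary $\partial\Omega_d$ yields

\[ Re\int_{\Omega_d} \frac{\partial \mathbf{u}}{\partial t}\, dV + Re\oint_{\partial\Omega_d} (\mathbf{n}\cdot\mathbf{u})\,\mathbf{u}\, dS + \oint_{\partial\Omega_d} \mathbf{n}\, p\, dS = \beta\oint_{\partial\Omega_d} \mathbf{n}\cdot\nabla\mathbf{u}\, dS + \oint_{\partial\Omega_d} \mathbf{n}\cdot\mathbf{T}\, dS \tag{6} \]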

Here $\mathbf{n}$ is the outward unit normal vector, $V$ is the control volume and $S$ is the control volume surface area. Fig. 1 illustrates two typical neighboring quadrilateral elements with an arbitrary dual control volume for the momentum Eq. (6), constructed by connecting the element centroids to the common vertices shared by both quadrilateral elements. The velocity vector components are defined at the mid-point of each cell face, while the pressure and the extra stress tensor are defined at the element centroids. The three-dimensional hexahedral elements with a dual control volume are shown in Fig. 2. In the following subsection, we restrict ourselves to the numerical discretization of two-dimensional viscoelastic fluid flows; the extension to three dimensions is straightforward.

2.1. Numerical discretization

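The momentum equations along the x- and y-directions are discretized over the dual finite volume shown in Fig. 1, and the discretization involves only the right and left elements that share the common edge where the components of the velocity vector are discretized. The discrete contribution from the right cell shown in Fig. 1 is given below for each term of the momentum equation along the x-direction.

The time derivative

\[ Re\left[\frac{9u_1^{n+1} + u_2^{n+1} + u_3^{n+1} + u_6^{n+1}}{12\Delta t}\right] A_{123} - Re\left[\frac{9u_1^{n} + u_2^{n} + u_3^{n} + u_6^{n}}{12\Delta t}\right] A_{123} \tag{7} \]

The convective term

\[ Re\left[u_{12}^{n}\Delta y_{12} - v_{12}^{n}\Delta x_{12}\right] u_{12}^{n+1} + Re\left[u_{23}^{n}\Delta y_{23} - v_{23}^{n}\Delta x_{23}\right] u_{23}^{n+1} \tag{8} \]

The pressure term

\[ \left[\frac{p_1^{n+1} + p_2^{n+1}}{2}\right]\Delta y_{12} + \left[\frac{p_2^{n+1} + p_3^{n+1}}{2}\right]\Delta y_{23} \tag{9} \]

The viscous term

\[ -\beta\left(\frac{\partial u}{\partial x}\right)_{12}^{n+1}\Delta y_{12} - \beta\left(\frac{\partial u}{\partial x}\right)_{23}^{n+1}\Delta y_{23} + \beta\left(\frac{\partial u}{\partial y}\right)_{12}^{n+1}\Delta x_{12} + \beta\left(\frac{\partial u}{\partial y}\right)_{23}^{n+1}\Delta x_{23} \tag{10} \]

The extra stress term

\[ -(T_{xx})_{12}^{n+1}\Delta y_{12} - (T_{xx})_{23}^{n+1}\Delta y_{23} + (T_{xy})_{12}^{n+1}\Delta x_{12} + (T_{xy})_{23}^{n+1}\Delta x_{23} \tag{11} \]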

Here, $A_{123}$ is the area enclosed by the points $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}_3$, $\Delta t$ is the time step, $\Delta x_{12} = x_2 - x_1$, $\Delta x_{23} = x_3 - x_2$, $\Delta y_{12} = y_2 - y_1$, $\Delta y_{23} = y_3 - y_2$, the values $u_{12}$, $u_{23}$, $v_{12}$ and $v_{23}$ are the velocity vector components defined at the mid-point of each dual volume face, and $p_1$, $p_2$ and $p_3$ are the pressure values at the points $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}_3$, respectively. The velocity vector components $u_{12}$, $u_{23}$, $v_{12}$ and $v_{23}$ are computed using least-squares interpolations [4,6]. As an example,

\[ u_{12} = \beta\left[u_1 + \nabla u_1\cdot\mathbf{r}_1\right] + (1-\beta)\left[u_2 + \nabla u_2\cdot\mathbf{r}_2\right] \tag{12} \]

where $\beta$ is a weight factor determining the type of convection scheme used (not to be confused with the viscosity ratio in Eq. (2)), $\nabla u_1$ and $\nabla u_2$ are the gradients of the velocity component at the locations where $u_1$ and $u_2$ are defined, and $\mathbf{r}_1$ and $\mathbf{r}_2$ are the distance vectors from those locations to the mid-point of the dual control volume face. In the present work, we employ only $\beta = 0.5$, which corresponds to the central least-squares interpolation. For evaluating the gradient terms $\nabla u_1$ and $\nabla u_2$, a least-squares procedure is used in which the velocity data is assumed to behave linearly. Referring to Fig. 1 as an example, the following system can be constructed for the term $\nabla u_1$:

\[ \begin{bmatrix} \Delta x_{21} & \Delta y_{21} \\ \Delta x_{31} & \Delta y_{31} \\ \Delta x_{41} & \Delta y_{41} \\ \Delta x_{51} & \Delta y_{51} \end{bmatrix} \begin{bmatrix} \dfrac{\partial u}{\partial x} \\[2mm] \dfrac{\partial u}{\partial y} \end{bmatrix} = \begin{bmatrix} u_2 - u_1 \\ u_3 - u_1 \\ u_4 - u_1 \\ u_5 - u_1 \end{bmatrix} \tag{13} \]

This overdetermined system of linear equations may be solved in a least-squares sense using the normal equation approach, in which both sides are multiplied by the transpose of the coefficient matrix. The modified system is solved using the singular value decomposition provided by the Intel Math Kernel Library in order to avoid the numerical difficulties associated with solving linear systems with near rank deficiency. The pressure values at $\mathbf{x}_1$ and $\mathbf{x}_3$ as well as the velocity values at these nodes are computed in a similar manner. The velocity gradients at the dual control volume edge mid-points can then be computed from Green's theorem:

\[ \frac{\partial u}{\partial x} = \frac{1}{A}\oint_{\partial\Omega_c} u\, dy \tag{14} \]

\[ \frac{\partial u}{\partial y} = -\frac{1}{A}\oint_{\partial\Omega_c} u\, dx \tag{15} \]

where the covolume $\Omega_c$, with area $A$, represents one quarter of a quadrilateral element, chosen such that the dual control volume edge is aligned with one of the covolume diagonals, and the line integrals on the right-hand side of Eqs. (14) and (15) are evaluated using the mid-point rule on each of the covolume faces. The procedure used here for the evaluation of viscous fluxes is very similar to the method described by Hwang [27]. Although the gradients can also be computed through the least-squares procedure shown above, the result is less accurate than the gradient values calculated with Green's theorem [4]. The contribution from the left cell is calculated in a similar manner. The discretization of the momentum equation along the y-direction follows the same ideas very closely and is therefore not repeated here. The continuity Eq. (4) is integrated within each quadrilateral element and evaluated using the mid-point rule on each of the element faces:

\[ -\sum_{j=1}^{4}\left[u_j^{n+1}\Delta y_j - v_j^{n+1}\Delta x_j\right] = 0 \tag{16} \]

where $\Delta x_j$ and $\Delta y_j$ are the element edge lengths along the x- and y-directions, respectively, and $u_j$ and $v_j$ are the velocity vector components defined at the mid-point of each quadrilateral element face. The constitutive equation for the Oldroyd-B fluid is discretized as in [49] within each element, assuming that the extra stresses $\mathbf{T}_i$ and the velocity gradients $\nabla\mathbf{u}_i$ are constant:

\[ We\left[\frac{\mathbf{T}_i^{n+1} - \mathbf{T}_i^{n}}{\Delta t} A_e - (\nabla\mathbf{u}_i^{n})^{\top}\cdot\mathbf{T}_i^{n+1} A_e - \mathbf{T}_i^{n+1}\cdot\nabla\mathbf{u}_i^{n} A_e + \sum_{j=1}^{4}\left[u_j^{n}\Delta y_j - v_j^{n}\Delta x_j\right]\mathbf{T}_j^{n+1}\right] = (1-\beta)\left[\nabla\mathbf{u}_i^{n+1} + (\nabla\mathbf{u}_i^{n+1})^{\top}\right] A_e - \mathbf{T}_i^{n+1} A_e \tag{17} \]

where $A_e$ is the area of the quadrilateral element and $\mathbf{T}_j$ is the value of the extra stress tensor at the face centers of the quadrilateral element. The extra stresses are extrapolated to the boundaries of the finite volume elements using the second-order upwind least-squares interpolation described above in order to maintain stability for the hyperbolic constitutive equation. The time-dependent finite volume discretisation of the above equations leads to a linear system of equations of the form:

\[ \begin{bmatrix} A_{\tau\tau} & A_{\tau u} & 0 \\ A_{u\tau} & A_{uu} & A_{up} \\ 0 & A_{pu} & 0 \end{bmatrix} \begin{bmatrix} \tau \\ u \\ p \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ 0 \end{bmatrix} \tag{18} \]

The above linear system of algebraic equations has to be solved at each time step.
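The two gradient reconstructions used in this subsection, the least-squares system of Eq. (13) and the Green's theorem formulas of Eqs. (14) and (15), can be illustrated with a minimal NumPy sketch. It is not the implementation used in the paper: `numpy.linalg.lstsq` (which relies on an SVD-based LAPACK driver) stands in here for the Intel Math Kernel Library routine, and the function names, argument layout and counter-clockwise vertex ordering are assumptions made for illustration only.

```python
import numpy as np


def lsq_gradient(x0, u0, xn, un):
    """Least-squares gradient at x0 from neighbour values, as in Eq. (13).

    x0: (2,) location where the gradient is sought, u0: value there.
    xn: (k, 2) neighbour locations, un: (k,) neighbour values.
    """
    d = np.asarray(xn, dtype=float) - np.asarray(x0, dtype=float)  # rows [dx_k1, dy_k1]
    du = np.asarray(un, dtype=float) - u0                          # rows u_k - u_1
    # lstsq works on the rectangular system directly through an SVD, so a
    # nearly rank-deficient stencil does not require the normal equations.
    grad, *_ = np.linalg.lstsq(d, du, rcond=None)
    return grad


def green_gradient(vertices, u_mid):
    """Green's theorem gradient, Eqs. (14)-(15), using the mid-point rule.

    vertices: (n, 2) polygon vertices, assumed counter-clockwise.
    u_mid:    (n,) values of u at the mid-point of edge k -> k+1.
    """
    v = np.asarray(vertices, dtype=float)
    nxt = np.roll(v, -1, axis=0)
    dx = nxt[:, 0] - v[:, 0]
    dy = nxt[:, 1] - v[:, 1]
    # Shoelace formula for the (signed) polygon area.
    area = 0.5 * np.sum(v[:, 0] * nxt[:, 1] - nxt[:, 0] * v[:, 1])
    u_mid = np.asarray(u_mid, dtype=float)
    return np.array([np.sum(u_mid * dy), -np.sum(u_mid * dx)]) / area
```

For a linear field both functions return the exact gradient, which makes a convenient sanity check before applying them to computed velocity data.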

2.2. The log conformation formulation

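The conformation tensor is a quantity that describes the internal microstructure of polymer molecules at the continuum level. The relation between the conformation tensor $\boldsymbol{\sigma}$ and the extra stress tensor is given by

\[ \boldsymbol{\sigma} = \mathbf{I} + \frac{We}{1-\beta}\,\mathbf{T} \tag{19} \]

The constitutive equation for the Oldroyd-B fluid in terms of the conformation tensor can be written as

\[ We\left[\frac{\partial \boldsymbol{\sigma}}{\partial t} + (\mathbf{u}\cdot\nabla)\boldsymbol{\sigma} - (\nabla\mathbf{u})^{\top}\cdot\boldsymbol{\sigma} - \boldsymbol{\sigma}\cdot\nabla\mathbf{u}\right] = -(\boldsymbol{\sigma} - \mathbf{I}) \tag{20} \]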
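As a quick consistency check (a sketch added for the reader, with the conformation tensor written as $\boldsymbol{\sigma}$), Eq. (20) can be recovered by solving Eq. (19) for the extra stress and substituting it into Eq. (3). The identity tensor is constant, so its material derivative vanishes:

\[
\mathbf{T} = \frac{1-\beta}{We}\left(\boldsymbol{\sigma} - \mathbf{I}\right)
\]

\[
(1-\beta)\left[\frac{\partial \boldsymbol{\sigma}}{\partial t} + (\mathbf{u}\cdot\nabla)\boldsymbol{\sigma} - (\nabla\mathbf{u})^{\top}\cdot\boldsymbol{\sigma} - \boldsymbol{\sigma}\cdot\nabla\mathbf{u} + (\nabla\mathbf{u})^{\top} + \nabla\mathbf{u}\right] = (1-\beta)\left(\nabla\mathbf{u} + \nabla\mathbf{u}^{\top}\right) - \frac{1-\beta}{We}\left(\boldsymbol{\sigma} - \mathbf{I}\right)
\]

The velocity gradient terms cancel on both sides, and multiplying by $We/(1-\beta)$ gives

\[
We\left[\frac{\partial \boldsymbol{\sigma}}{\partial t} + (\mathbf{u}\cdot\nabla)\boldsymbol{\sigma} - (\nabla\mathbf{u})^{\top}\cdot\boldsymbol{\sigma} - \boldsymbol{\sigma}\cdot\nabla\mathbf{u}\right] = -\left(\boldsymbol{\sigma} - \mathbf{I}\right)
\]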

Table 1
The two-level multigrid method. $P$ and $R$ are the prolongation and restriction operators, respectively.

1. Perform $m_1$ steps of an iterative method for solving $Ax = b$
2. Compute the residual $r = b - Ax$
3. Solve for the error on the coarse level, $RAPe = A_c e = Rr$
4. Compute the new value $x = x + Pe$
5. Perform $m_1$ steps of an iterative method for solving $Ax = b$
6. Check convergence. If $\|r\|_2 > rtol$, go to 1.

The conformation tensor is symmetric and positive definite. Unless special care is taken, the conformation tensor may lose this property at high Weissenberg numbers and the numerical solution will soon diverge. Recently, a log-conformation formulation was proposed by Fattal and Kupferman [22], in which the constitutive equation is rewritten in terms of the logarithm of the conformation tensor through eigenvalue computations, $\boldsymbol{\Psi} = \log\boldsymbol{\sigma} = \mathbf{R}\,\log\boldsymbol{\Lambda}\,\mathbf{R}^{\top}$. This representation ensures the positive definiteness of the conformation tensor and captures sharp elastic stress layers which are exponential in nature. It is shown in [22] that the gradient of a divergence-free velocity field can be decomposed into the anti-symmetric tensors $\boldsymbol{\Omega}$ (pure rotation) and $\mathbf{N}$, and a symmetric tensor $\mathbf{B}$ which commutes with the conformation tensor:

\[ \nabla\mathbf{u} = \boldsymbol{\Omega} + \mathbf{B} + \mathbf{N}\boldsymbol{\sigma}^{-1} \tag{21} \]

By inserting the decomposition (21) into the constitutive Eq. (20) and replacing the conformation tensor with the new variable $\boldsymbol{\Psi}$, the new set of equations to be solved can be written as follows:

\[ We\left[\frac{\partial \boldsymbol{\Psi}}{\partial t} + (\mathbf{u}\cdot\nabla)\boldsymbol{\Psi} - (\boldsymbol{\Omega}\boldsymbol{\Psi} - \boldsymbol{\Psi}\boldsymbol{\Omega}) - 2\mathbf{B}\right] = -\left(\mathbf{I} - e^{-\boldsymbol{\Psi}}\right) \tag{22} \]

\[ Re\left[\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}\right] + \nabla p = \beta\nabla^2\mathbf{u} + \frac{1-\beta}{We}\,\nabla\cdot\left(e^{\boldsymbol{\Psi}} - \mathbf{I}\right) \tag{23} \]

\[ -\nabla\cdot\mathbf{u} = 0 \tag{24} \]

The discretization of the above equations follows the ideas presented in Section 2.1 very closely and is therefore not repeated here.
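The change of variable between the conformation tensor and its logarithm can be sketched per element with a symmetric eigenvalue decomposition. The snippet below is a minimal NumPy illustration under the relations stated above (the logarithm through eigenvalues, and the stress recovered from Eq. (19)); the function names are hypothetical and it does not reproduce the decomposition of Eq. (21).

```python
import numpy as np


def log_conformation(sigma):
    """Psi = R log(Lambda) R^T for a symmetric positive definite sigma."""
    lam, R = np.linalg.eigh(sigma)            # sigma = R diag(lam) R^T
    if np.any(lam <= 0.0):
        raise ValueError("conformation tensor is not positive definite")
    return (R * np.log(lam)) @ R.T            # scales column j of R by log(lam_j)


def exp_conformation(psi):
    """sigma = exp(Psi); positive definite for any symmetric Psi."""
    mu, R = np.linalg.eigh(psi)
    return (R * np.exp(mu)) @ R.T


def extra_stress(psi, We, beta):
    """T = (1 - beta) / We * (exp(Psi) - I), obtained from Eq. (19)."""
    return (1.0 - beta) / We * (exp_conformation(psi) - np.eye(psi.shape[0]))
```

Because the exponential of any symmetric matrix has strictly positive eigenvalues, evolving the logarithm and mapping back in this way cannot produce an indefinite conformation tensor, which is the property the formulation relies on.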

2.3. Iterative solver

In practice, the iterative solution of Eq. (18) does not converge very quickly and it is rather difficult to construct robust preconditioners for the whole coupled system. Therefore, we decouple the system by using a time-splitting technique which separates the calculation of the extra stresses from the evolution of the velocity and pressure fields, the latter being obtained by solving a generalized Stokes problem:

\[ \left[A_{\tau\tau}\right]\left[\tau\right] = \left[b_1 - A_{\tau u}u\right] \tag{25} \]

\[ \begin{bmatrix} A_{uu} & A_{up} \\ A_{pu} & 0 \end{bmatrix} \begin{bmatrix} u \\ p \end{bmatrix} = \begin{bmatrix} b_2 - A_{u\tau}\tau \\ 0 \end{bmatrix} \tag{26} \]
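The structure of one decoupled step can be sketched on small dense matrices. This is only an illustration of Eqs. (25) and (26): dense direct solves replace the preconditioned Krylov solvers of the paper, the velocity in Eq. (25) is assumed to be the one from the previous time level, and all names are hypothetical.

```python
import numpy as np


def split_step(A_tt, A_tu, A_ut, A_uu, A_up, A_pu, b1, b2, u_old):
    """One decoupled step: Eq. (25) for the stress, then Eq. (26)."""
    # Stress update with the velocity frozen at its previous value (assumption).
    tau = np.linalg.solve(A_tt, b1 - A_tu @ u_old)
    # Generalized Stokes problem with the new stress on the right-hand side.
    n_u, n_p = A_up.shape
    K = np.block([[A_uu, A_up], [A_pu, np.zeros((n_p, n_p))]])
    rhs = np.concatenate([b2 - A_ut @ tau, np.zeros(n_p)])
    sol = np.linalg.solve(K, rhs)
    return tau, sol[:n_u], sol[n_u:]
```

The two solves are independent linear problems, which is what allows a different preconditioner to be chosen for each of them.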

However, it is not possible to apply the standard multigrid methods with classical smoothing techniques (e.g., Jacobi, Gauss–Seidel) for the coupled iterative solution of the momentum and continuity equations because of the zero block in the saddle point problem [7,46,50]. This could be overcome by applying those smoothers to the squared system, which is symmetric and positive definite. The smoothing properties for the squared system were analyzed in [53] and the convergence rate of the multigrid method has been proven to be $O(1/m)$, where $m$ is the number of smoothing steps. However, the method has not become popular due to its low efficiency. Although the Vanka smoother [52] is very effective in smoothing out errors, it is very difficult to implement in an efficient manner. In the present paper, we use an upper triangular right preconditioner which results in a scaled discrete Laplacian instead of a zero block in the original system. The modified system becomes

\[ \begin{bmatrix} A_{uu} & A_{up} \\ A_{pu} & 0 \end{bmatrix} \begin{bmatrix} I & -A_{up} \\ 0 & I \end{bmatrix} \begin{bmatrix} q \\ p \end{bmatrix} = \begin{bmatrix} A_{uu} & A_{up} - A_{uu}A_{up} \\ A_{pu} & -A_{pu}A_{up} \end{bmatrix} \begin{bmatrix} q \\ p \end{bmatrix} = \begin{bmatrix} b_2 - A_{u\tau}\tau \\ 0 \end{bmatrix} \tag{27} \]

and the zero block is replaced with $-A_{pu}A_{up}$, which is a scaled discrete Laplacian. Unfortunately, this leads to a significant increase in the number of non-zero elements due to the matrix–matrix multiplication. However, it is possible to replace the $-A_{up}$ block in the upper triangular right preconditioner with a computationally less expensive matrix, $-\hat{A}_{up}$. The calculations indicate that the largest contribution to the pressure gradients in the momentum equations comes from the right and left elements that share the common edge/face where the components of the velocity vector are discretized. Therefore, we use only the contribution from these two elements for the $-\hat{A}_{up}$ matrix, which leads to at most three non-zero entries per row. Although this approximation does not change the convergence rate of the iterative solver significantly, it leads to a significant reduction in computing time and memory requirement.
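The effect of the upper triangular right preconditioner in Eq. (27) can be checked numerically on small dense blocks. The sketch below forms the saddle point matrix, the right preconditioner and their product; it uses the full $A_{up}$ block rather than the cheaper approximation described above, and the function name is hypothetical.

```python
import numpy as np


def right_preconditioned_stokes(A_uu, A_up, A_pu):
    """Return the saddle point matrix K, the right preconditioner P and K @ P."""
    n_u, n_p = A_up.shape
    K = np.block([[A_uu, A_up], [A_pu, np.zeros((n_p, n_p))]])
    # Upper triangular right preconditioner of Eq. (27): [u; p] = P [q; p].
    P = np.block([[np.eye(n_u), -A_up], [np.zeros((n_p, n_u)), np.eye(n_p)]])
    return K, P, K @ P
```

After solving the modified system for $(q, p)$, the velocity follows from $u = q - A_{up}p$, so the change of variable costs only one extra matrix–vector product.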

The multilevel preconditioner is based on a multiplicative non-nested multigrid method with one V-cycle. In this multigrid method, coarse grid levels are created independently from the finer meshes, and flow variables, residuals and corrections are transferred back and forth between the various grid levels in a multigrid cycle. The basic two-level multigrid method is described in Table 1. However, for the application of the non-nested restriction and prolongation operators, one needs to know the coarse element containing the centroid of each element and of each edge/face on the fine level. For this purpose, a quadtree/octree data structure (see, for example, [8]) can be constructed on the coarse mesh. The quadtree consists of a sequence of recursively divided squares superimposed on a region. Searching the coarse mesh for a given point then involves determining the quadrant (or octant in three dimensions) in which the point is contained and testing whether any of the associated coarse mesh elements contains the point. This algorithm is very effective when searching for a single point or several arbitrary points. However, the procedure does not use the information from previously computed neighboring points when multiple related points are involved. Thus, we introduce the idea of the level set renumbering algorithm on the fine mesh. We start with the first element, and level-1 is defined as the set of elements connected to the vertices of the first element. The next level is found by considering all new neighbors of level-1. This procedure is repeated until all the elements are assigned to a level, and the elements are renumbered in ascending order based on their levels. The use of the local level set renumbering algorithm ensures that there is at least one neighboring fine element with a previously computed coarse element number. Therefore, we already know the associated coarse elements without the use of a quadtree search algorithm, and the present search algorithm only involves testing whether any of several associated coarse mesh elements contains the point. As a result, it is possible to find the target coarse element numbers within several iterations. Each coarse level element is assigned to the processor that owns the largest number of fine level elements within it, and load balancing on the coarse level is generally well ensured. To reduce the memory requirement of the multigrid method we use a more aggressive coarsening method similar to the work of Lin et al. [34,35]. In order to reduce the complexity of the data structure, the velocity vector components are defined at vertices on the coarse grid levels. The restricted additive Schwarz preconditioner with the FGMRES(m) Krylov iterative method [47] is used as a smoother for the multilevel preconditioner, and either the successive over-relaxation (SOR) preconditioner or the incomplete LU preconditioner with no fill-in is employed within each partitioned sub-block. The implementation of the preconditioned Krylov subspace algorithm, the matrix–matrix multiplication and the multilevel preconditioner were carried out using the PETSc [5] software package developed at Argonne National Laboratory. The METIS library [29] is used to decompose the flow domain into a set of sub-domains.
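The two-level cycle of Table 1 can be sketched on a one-dimensional Poisson problem. This is a deliberately simplified stand-in for the method of the paper: the grids are nested rather than non-nested, damped Jacobi replaces the Schwarz/FGMRES smoother, the coarse problem is solved directly, and the stopping test is taken relative to $\|b\|_2$, which is an assumption. All function names are hypothetical.

```python
import numpy as np


def poisson_1d(n):
    """Dense 1D Poisson matrix on n interior points of the unit interval."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2


def prolongation(nc):
    """Linear interpolation from nc coarse to 2*nc + 1 fine interior points."""
    P = np.zeros((2 * nc + 1, nc))
    for j in range(nc):
        i = 2 * j + 1                 # fine index of coarse point j
        P[i, j] = 1.0
        P[i - 1, j] = 0.5
        P[i + 1, j] = 0.5
    return P


def jacobi(A, x, b, steps, omega=2.0 / 3.0):
    """Damped Jacobi sweeps used as a simple smoother."""
    d = np.diag(A)
    for _ in range(steps):
        x = x + omega * (b - A @ x) / d
    return x


def two_level(A, b, P, R, m1=2, rtol=1e-8, max_cycles=50):
    """Two-level multigrid iteration following the steps of Table 1."""
    x = np.zeros_like(b)
    Ac = R @ A @ P                                  # coarse operator R A P
    for cycle in range(1, max_cycles + 1):
        x = jacobi(A, x, b, m1)                     # 1. pre-smoothing
        r = b - A @ x                               # 2. residual
        e = np.linalg.solve(Ac, R @ r)              # 3. coarse level solve
        x = x + P @ e                               # 4. correction
        x = jacobi(A, x, b, m1)                     # 5. post-smoothing
        if np.linalg.norm(b - A @ x) <= rtol * np.linalg.norm(b):
            return x, cycle                         # 6. converged
    return x, max_cycles
```

With `P = prolongation(nc)` and `R = 0.5 * P.T`, the cycle count stays essentially independent of the fine grid size, which is the property that motivates multilevel preconditioning over one-level methods.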

Fig. 3. The spatial convergence of the error ($\|\mathbf{u} - \mathbf{u}_{exact}\|_2/\sqrt{N_u}$) with mesh refinement ($\Delta h = 1/\sqrt{N_e}$) for the Kovasznay flow at $Re = 40$. [Plot of Error versus $\Delta h$ on logarithmic axes; legend: MAC scheme, structured meshes (U1 to U5), unstructured meshes (M1 to M5).]

3. Numerical results

In this section, the proposed numerical algorithm is tested on several parallel machines and verified for the Kovasznay flow, the flow of an Oldroyd-B fluid past a confined circular cylinder in a channel and the three-dimensional flow of an Oldroyd-B fluid around a rigid sphere falling in a cylindrical tube. The present two-dimensional calculations are performed on the SGI Altix 3000 (1300 MHz, Itanium 2) machine with 32 nodes available at the Faculty of Aeronautics and Astronautics of ITU, and on the computing facilities at TUBITAK ULAKBIM, High Performance and Grid Computing Center. The three-dimensional calculations are carried out on the Anodolu (Intel Xeon 2.33 GHz) machine at the National Center for High Performance Computing of Turkey using 128 nodes. The present numerical results are obtained by using the time-splitting technique given in Section 2.3.

3.1. Kovasznay flow

Kovasznay flow [32] is an analytical solution of the two-dimensional steady-state Navier–Stokes equations and it is used here to establish the spatial order of convergence for a Newtonian fluid. The spatial domain in which Kovasznay's solution is defined is taken here as the unit square $[-0.5, 0.5] \times [-0.5, 0.5]$. The analytical solution has the following form:

$$u(x,y) = 1 - e^{\lambda x}\cos(2\pi y) \tag{28}$$

$$v(x,y) = \frac{\lambda}{2\pi}\, e^{\lambda x}\sin(2\pi y) \tag{29}$$

$$p(x,y) = \frac{1 - e^{2\lambda x}}{2} \tag{30}$$

$$\lambda = \frac{Re}{2} - \sqrt{\frac{Re^2}{4} + 4\pi^2} \tag{31}$$
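As a quick sanity check of Eqs. (28)–(31), the following sketch evaluates the Kovasznay solution and estimates the velocity divergence by central differences. The function names and the finite-difference step are illustrative choices and are not taken from the paper.

```python
import math

def kovasznay_lambda(reynolds=40.0):
    """Parameter lambda of Eq. (31)."""
    return reynolds / 2.0 - math.sqrt(reynolds**2 / 4.0 + 4.0 * math.pi**2)

def kovasznay(x, y, reynolds=40.0):
    """Return (u, v, p) of the Kovasznay solution, Eqs. (28)-(30)."""
    lam = kovasznay_lambda(reynolds)
    u = 1.0 - math.exp(lam * x) * math.cos(2.0 * math.pi * y)
    v = lam / (2.0 * math.pi) * math.exp(lam * x) * math.sin(2.0 * math.pi * y)
    p = 0.5 * (1.0 - math.exp(2.0 * lam * x))
    return u, v, p

def divergence(x, y, h=1.0e-5, reynolds=40.0):
    """Central-difference estimate of du/dx + dv/dy at the point (x, y)."""
    dudx = (kovasznay(x + h, y, reynolds)[0] - kovasznay(x - h, y, reynolds)[0]) / (2.0 * h)
    dvdy = (kovasznay(x, y + h, reynolds)[1] - kovasznay(x, y - h, reynolds)[1]) / (2.0 * h)
    return dudx + dvdy

print(kovasznay_lambda(40.0))   # negative for any positive Reynolds number
print(divergence(0.1, 0.2))     # should be close to zero (incompressible field)
```

Analytically, $\partial u/\partial x = -\lambda e^{\lambda x}\cos(2\pi y)$ and $\partial v/\partial y = \lambda e^{\lambda x}\cos(2\pi y)$ cancel exactly, so the printed divergence only reflects the truncation and round-off error of the difference quotient.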

For the present validation case, the Reynolds number is taken to be 40. In order to establish the spatial convergence of the method, an h-refinement study is performed on both uniform Cartesian meshes and unstructured quadrilateral meshes. For uniform Cartesian meshes, five different meshes are employed: mesh U1 with 21 × 21 node points, mesh U2 with 41 × 41 node points, mesh U3 with 81 × 81 node points, mesh U4 with 161 × 161 node points and mesh U5 with 321 × 321 node points. For unstructured quadrilateral meshes the following meshes are considered: mesh M1 with 314 node points and 273 elements, mesh M2 with 1133 node points and 1052 elements, mesh M3 with 4393 node points and 4232 elements, mesh M4 with 17,635 node points and 17,314 elements and mesh M5 with 66,591 node points and 65,950 elements. The successive meshes are generated using the mapping and paving algorithms provided within the CUBIT mesh generation environment [9]. The details of meshes U1 to U5 and M1 to M5 are provided in Table 2.

Table 2
The description of quadrilateral meshes used for the Newtonian Kovasznay flow.

Structured mesh

| Mesh | Number of nodes | Number of elements | DOF |
|------|-----------------|--------------------|-----|
| U1 | 441 | 400 | 2080 |
| U2 | 1681 | 1600 | 8160 |
| U3 | 6561 | 6400 | 32,320 |
| U4 | 25,921 | 25,600 | 128,640 |
| U5 | 103,041 | 102,400 | 513,280 |

The error measure is taken to be:

$$\mathrm{Error} = \frac{\|u - u_{exact}\|_2}{\sqrt{N_u}} \tag{32}$$

where $N_u$ is the number of edges. The mesh spacing $\Delta h$ on the unstructured quadrilateral mesh is defined as

$$\Delta h = \frac{1}{\sqrt{N_e}} \tag{33}$$

where $N_e$ is the number of elements. The convergence of the error measure with mesh spacing is shown in Fig. 3, and the error measure decays at an algebraic rate as the mesh is refined. On a log–log scale, the expected rate of convergence appears as a straight line. The central difference approximation to the convective term gives a straight line with an algebraic convergence rate of $O(\Delta h^2)$ on both structured Cartesian and unstructured grids. The present error measure is also compared with the results obtained from the MAC scheme [25]. Although the present numerical scheme does not show any significant improvement in the error measure, it is capable of treating more complex configurations than the MAC scheme.
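The quantities entering Eqs. (32) and (33) can be computed from the mesh sizes in Table 2, and the slope of the error curve on a log–log plot gives the observed order of convergence. The sketch below is illustrative only: the helper names are hypothetical, the edge count relies on Euler's formula for a planar mesh without holes, and the two error values are synthetic (constructed to follow $C\,\Delta h^2$), not data from the paper.

```python
import math

def mesh_spacing(num_elements):
    """Mesh spacing of Eq. (33): dh = 1 / sqrt(N_e)."""
    return 1.0 / math.sqrt(num_elements)

def edge_count(num_nodes, num_elements):
    """Edges of a planar mesh without holes, from Euler's formula V - E + F = 1."""
    return num_nodes + num_elements - 1

def observed_order(dh_coarse, err_coarse, dh_fine, err_fine):
    """Slope of the error curve between two meshes on a log-log plot."""
    return math.log(err_coarse / err_fine) / math.log(dh_coarse / dh_fine)

# Element counts of meshes U1 and U2 taken from Table 2.
dh_u1 = mesh_spacing(400)
dh_u2 = mesh_spacing(1600)

# Synthetic errors that follow err = C * dh**2 exactly (NOT data from the paper).
err_u1 = 0.5 * dh_u1**2
err_u2 = 0.5 * dh_u2**2

print(dh_u1, dh_u2)
print(observed_order(dh_u1, err_u1, dh_u2, err_u2))
print(edge_count(441, 400))  # number of velocity locations N_u on mesh U1
```

For mesh U1 the edge count agrees with the direct count for a 20 × 20 Cartesian grid, 2 × 20 × 21 = 840 edges.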

Table 2 (continued)

Unstructured mesh

| Mesh | Number of nodes | Number of elements | DOF |
|------|-----------------|--------------------|-----|
| M1 | 314 | 273 | 1445 |
| M2 | 1133 | 1052 | 5420 |
| M3 | 4393 | 4232 | 21,480 |
| M4 | 17,635 | 17,314 | 82,765 |
| M5 | 66,591 | 65,950 | 331,030 |

3.2. Performance of Stokes solver

Efficient numerical solution of Stokes flow is a serious bottleneck in performing parallel large-scale viscoelastic numerical simulations. To illustrate the performance of the present two-level preconditioned iterative solver given in Section 2.3, an algorithmic scaling study is presented for the two- and three-dimensional lid-driven cavity problem on the SGI Altix 3000 (1300 MHz, Itanium 2) machine. The calculations are performed on uniform Cartesian meshes with 501 × 501 and 1001 × 1001 resolutions in two dimensions and 51 × 51 × 51 and 101 × 101 × 101 resolutions in three dimensions. The performance analysis has been carried out by considering the one- and two-level preconditioned iterative solvers and is mainly focused on the Stokes flow. Special attention is given to fitting the partitioned data into the local physical memory of the computational nodes. In these calculations, the relative residual tolerance is set to $10^{-8}$.

3.2.1. Two-dimensional performance

Table 3 presents the results for the algorithmic scaling of the present one- and two-level preconditioned iterative solvers for the two-dimensional lid-driven Stokes flow in a square enclosure using the present numerical algorithm. In addition, the algorithmic scaling of the one-level preconditioned iterative solver is presented for the classical MAC scheme [25]. The symbol "–" denotes a calculation that does not converge within a reasonable time or for which the local physical memory of the computational nodes is insufficient. The standard one-level iterative solver is based on the restricted additive Schwarz method with the flexible GMRES(200) algorithm [47]. A block-incomplete factorization coupled with the reverse Cuthill–McKee ordering [16] is used within each partitioned sub-domain. As may be seen from Table 3, the calculations indicate a significant improvement in the computation time with an increase in the incomplete LU factorization level. However, it is clear that the one-level iterative solver does not scale well for the present mesh sizes. The standard two-level iterative solver is based on the two-level non-nested geometric multigrid preconditioner with the flexible GMRES(200) algorithm [47]. The restricted additive Schwarz preconditioner is used as a smoother, and either the successive over-relaxation (SOR) preconditioner or the block-incomplete factorization with no fill-in is employed within each sub-block. On the coarse level, the block-incomplete factorization with no fill-in is employed. In these two-level calculations, the coarsening ratio is constant between the grid levels and is set to $1:8^2$. As may be seen, the standard two-level solver with the SOR and ILU(0) preconditioners converges within several iterations and the number of iterations is nearly independent of the problem size. Although we have used only two-level preconditioned iterative solvers, the numerical results indicate relatively good scaling properties.
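The scaling behaviour discussed above can be quantified directly from the entries of Table 3. The sketch below is a simple post-processing illustration; the dictionary layout and function names are hypothetical, and reading the two mesh sizes as a weak-scaling pair (roughly constant load per processor) is an interpretation rather than a statement from the paper.

```python
# (processors, iterations, seconds) on the 501 x 501 and 1001 x 1001 meshes,
# copied from Table 3 for the present method.
results = {
    "1-level ILU(5)": ((8, 396, 158.0), (32, 2156, 784.0)),
    "2-level MG-SOR": ((8, 19, 37.0), (32, 19, 42.0)),
    "2-level MG-ILU": ((8, 12, 30.0), (32, 15, 36.0)),
}

def nodes_per_processor(n, procs):
    """Mesh nodes per processor for an n x n node mesh."""
    return n * n / procs

def scaling_summary(small, large):
    """Return (iteration growth factor, time ratio small/large)."""
    (_, it_small, t_small), (_, it_large, t_large) = small, large
    return it_large / it_small, t_small / t_large

for name, (small, large) in results.items():
    growth, efficiency = scaling_summary(small, large)
    print(f"{name}: iterations x{growth:.2f}, time ratio {efficiency:.2f}")

# The load per processor is nearly the same for both runs.
print(nodes_per_processor(501, 8), nodes_per_processor(1001, 32))
```

A time ratio close to one together with an iteration growth factor close to one is what good algorithmic scaling looks like in this setting; the one-level solver departs from both.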

Table 3
The change of iteration number and computation time for the two-dimensional lid-driven Stokes flow using the one- and two-level preconditioned iterative methods given in Section 2.2 on an SGI Altix 3000 (1300 MHz, Itanium 2) for the present method and the MAC scheme [25]. The relative residual is set to rtol = $10^{-8}$.

| Method | Precond. | Proc. Num. (501 × 501) | Iter. Num. (501 × 501) | Time (s) (501 × 501) | Proc. Num. (1001 × 1001) | Iter. Num. (1001 × 1001) | Time (s) (1001 × 1001) |
|--------|----------|------|------|------|------|------|------|
| 1-Level present | ILU(0) | 8 | – | – | 32 | – | – |
| | ILU(1) | 8 | 2860 | 577 | 32 | – | – |
| | ILU(2) | 8 | 1300 | 353 | 32 | 5905 | 1645 |
| | ILU(3) | 8 | 757 | 248 | 32 | 3544 | 1078 |
| | ILU(4) | 8 | 528 | 182 | 32 | 3349 | 1126 |
| | ILU(5) | 8 | 396 | 158 | 32 | 2156 | 784 |
| 2-Level present | MG-SOR | 8 | 19 | 37 | 32 | 19 | 42 |
| | MG-ILU | 8 | 12 | 30 | 32 | 15 | 36 |
| 1-Level MAC | ILU(0) | 8 | – | – | 32 | – | – |
| | ILU(1) | 8 | 3540 | 444 | 32 | – | – |
| | ILU(2) | 8 | 1368 | 189 | 32 | 8075 | 1138 |
| | ILU(3) | 8 | 581 | 84 | 32 | 3896 | 582 |
| | ILU(4) | 8 | 462 | 70 | 32 | 2186 | 354 |
| | ILU(5) | 8 | 371 | 63 | 32 | 1669 | 286 |

There are two main issues for the algorithmic scaling of an iterative solver. The first issue is to keep the number of iterations constant as the number of sub-domains is increased. It is well known that one-level methods cause an increase in the number of iterations as the number of sub-domains is increased. Two-level methods can remedy the situation by keeping the coarsening ratio constant between the fine and coarse levels [34,35]. Lin et al. [34] reported that a two-level preconditioner is optimally convergent for the given fine-to-coarse grid ratio of $8^2$ in two dimensions and $8^3$ in three dimensions. The second issue is that the computation time required for each iteration should scale linearly with the number of unknowns. This is more difficult to achieve for two-level methods due to a relatively large coarse mesh LU factorization. Therefore, one must pay attention to the coarse grid solve time. Here, the coarse mesh solution time is significantly improved using the restricted additive Schwarz preconditioner with a block-incomplete factorization with no fill-in. However, for larger problems, it may be necessary to introduce additional levels in order to keep the coarse level relatively cheap. In addition to the present calculations, the standard one-level preconditioned iterative solver is applied to the classical MAC scheme [25] in order to compare its computation time with the present numerical algorithm. The calculations indicate that the MAC scheme is relatively cheaper than the present numerical algorithm. It is well known that unstructured finite volume methods cannot compete in cost with structured numerical methods, including the MAC scheme. The main reason is that the unstructured finite volume solvers need to construct a dual volume and have to interpolate the fluxes to the dual volume faces. Meanwhile, the coefficients of the MAC scheme can be constructed directly with no mesh information. In addition, the number of entries per row is significantly larger for the unstructured finite volume methods. Finally, the convective–diffusion submatrix is approximately two times larger since we employ all the velocity components.

3.2.2. Three-dimensional performance

Table 4 presents similar results for the algorithmic scaling for the three-dimensional lid-driven cubic Stokes flow. Although an increase in the level of the block-incomplete factorization on each partitioned sub-domain significantly reduces the number of required iterations, this is not possible for large-scale three-dimensional calculations due to the prohibitively large physical memory requirement. The standard two-level preconditioned iterative solver with a constant coarsening ratio of $1:5^3$ is employed for the present three-dimensional calculations. The convergence properties of the three-dimensional results are very similar to those of the two-dimensional calculations. The total solution time of the standard two-level preconditioned iterative solver with the SOR smoother is approximately 154 seconds for the Stokes flow on a uniform 101 × 101 × 101 mesh. Approximately 72 seconds of this computation time is spent on the construction of the linear system, 10 seconds on the matrix–matrix multiplication in Eq. (27) and 8 seconds on the construction of the intergrid transfer operators. Therefore, the solution time is comparable with the time required for constructing the linear system.

Table 4
The change of iteration number and computation time for the three-dimensional cubic lid-driven Stokes flow using the one- and two-level preconditioned iterative methods given in Section 2.2 on an SGI Altix 3000 (1300 MHz, Itanium 2) for the present method. The relative residual is set to rtol = $10^{-8}$.

| Method | Precond. | Proc. Num. (51 × 51 × 51) | Iter. Num. (51 × 51 × 51) | Time (s) (51 × 51 × 51) | Proc. Num. (101 × 101 × 101) | Iter. Num. (101 × 101 × 101) | Time (s) (101 × 101 × 101) |
|--------|----------|------|------|------|------|------|------|
| 1-Level present | ILU(0) | 4 | 361 | 244 | 32 | 1279 | 767 |
| | ILU(1) | 4 | – | – | 32 | – | – |
| 2-Level present | MG-SOR | 4 | 14 | 133 | 32 | 14 | 154 |
| | MG-ILU | 4 | 11 | 129 | 32 | 11 | 147 |
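The timing breakdown quoted for the 101 × 101 × 101 case can be tallied as follows. This is only bookkeeping on the numbers given in the text; treating the quoted components as disjoint and attributing the remainder to the iterative solve is an assumption made here for illustration.

```python
# Timings in seconds for the two-level solver with the SOR smoother
# on the 101 x 101 x 101 mesh, as quoted in the text.
total_time = 154.0
setup_times = {
    "linear system construction": 72.0,
    "matrix-matrix multiplication, Eq. (27)": 10.0,
    "intergrid transfer operators": 8.0,
}

def remaining_time(total, parts):
    """Time left after subtracting the listed components (assumed disjoint)."""
    return total - sum(parts.values())

solve_time = remaining_time(total_time, setup_times)
print(f"remaining (iterative solve): {solve_time:.0f} s")
print(f"ratio to system construction: {solve_time / setup_times['linear system construction']:.2f}")
```

Under this reading, the time left for the iterative solve is of the same order as the time needed to assemble the linear system, which is the point made in the text.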

Although preconditioned iterative solvers based on block factorization techniques [3,20,21,31,48] have been implemented as in [48] in addition to the one- and two-level preconditioned iterative methods, they do not perform as well as the present two-level preconditioned iterative solution technique. As an example, the three-dimensional lid-driven cavity flow at $Re = 20$ is solved on a uniform Cartesian 65 × 65 × 65 mesh using 6 Oseen iterations with zero initial value and rtol = $10^{-5}$. The implemented block-based least squares commutator (LSC) preconditioner [20] requires 972 seconds on the SGI Altix 3000 machine with 32 nodes. Here, the smaller subproblems are solved inexactly, similar to [21], using the HYPRE BoomerAMG solver [28]. However, the same test case in [21] required 391 seconds using the block-based pressure convection–diffusion (PCD) preconditioner [31] on the ASCI Red machine (333 MHz, Intel Pentium II Xeon) with 100 nodes. The difference in computation time is mainly due to the size of the convective–diffusion subproblem in our approach, which is approximately three times larger since we use all the components of the velocity vector. In addition, the LSC preconditioner requires the solution of the scaled discrete Laplacian subproblem two times for each outer iteration, while the PCD preconditioner requires only one. On the other hand, the standard two-level preconditioned iterative method for the first approach requires 195 seconds on the SGI Altix 3000 with 32 nodes and indicates a substantial improvement in the computation time.

Fig. 4. The computational coarse mesh M1 for an Oldroyd-B fluid past a confined circular cylinder in a channel with $R/H = 0.5$. The total number of nodes is 35,815 and the total number of quadrilateral elements is 35,313.

3.3. Oldroyd-B fluid past a confined circular cylinder

The parallel algorithm described in Section 2 is used to compute the two-dimensional viscoelastic flow past a confined circular cylinder in a channel [1,2,11,17,19,26,30,33,39,58]. For this flow problem, we consider a circular cylinder of radius $R$ positioned symmetrically between two parallel plates separated by a distance $2H$. The blockage ratio $R/H$ is taken equal to 0.5 and the computational domain extends a distance $12R$ upstream and downstream of the cylinder. The dimensionless parameters are the Reynolds number $Re = \rho\langle U\rangle R/\eta$, the Weissenberg number $We = \lambda\langle U\rangle/R$ and the viscosity ratio $\beta = \eta_s/\eta$. The physical parameters are the density $\rho$, the average velocity at the inlet $\langle U\rangle$, the relaxation time $\lambda$, the zero-shear-rate viscosity of the fluid $\eta$ and the solvent viscosity $\eta_s$. The viscosity ratio $\beta$ is chosen to be 0.59, which is the value used in the benchmarks for the Oldroyd-B fluid. In this work, fully developed velocity boundary conditions are imposed at the inlet and natural (traction-free) boundary conditions are imposed at the outlet boundary. No-slip velocity boundary conditions are imposed on all solid walls. The extra stresses are computed everywhere within the computational domain and their boundary conditions are introduced through their fluxes, using the analytical values at the inlet boundary.
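The inlet conditions mentioned above can be illustrated with the standard fully developed channel solution for an Oldroyd-B fluid: a parabolic velocity profile with polymeric stresses $T_{xy} = (1-\beta)\dot\gamma$, $T_{xx} = 2\,We\,(1-\beta)\dot\gamma^2$ and $T_{yy} = 0$. The sketch below is an assumption-laden illustration, not the paper's code: it assumes lengths scaled by $R$, velocities by $\langle U\rangle$ and stresses by $\eta\langle U\rangle/R$, so that the channel half-height is $H/R = 2$; the function name is hypothetical.

```python
def inlet_state(y, weissenberg, beta=0.59, half_height=2.0):
    """Fully developed inlet velocity and polymeric stresses at height y.

    Assumed scaling: lengths by R, velocities by <U>, stresses by eta*<U>/R,
    with the channel centre line at y = 0 and the walls at y = +/- half_height.
    Returns (u, T_xx, T_xy, T_yy).
    """
    u = 1.5 * (1.0 - (y / half_height) ** 2)      # parabolic profile, mean value 1
    shear_rate = -3.0 * y / half_height**2         # du/dy
    t_xy = (1.0 - beta) * shear_rate
    t_xx = 2.0 * weissenberg * (1.0 - beta) * shear_rate**2
    t_yy = 0.0
    return u, t_xx, t_xy, t_yy

# Values on the centre line and at the upper wall for We = 0.7.
print(inlet_state(0.0, 0.7))
print(inlet_state(2.0, 0.7))
```

The normal stress $T_{xx}$ grows linearly with $We$ and quadratically with the shear rate, so it is largest at the channel walls and vanishes on the centre line.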

In the present work, three different meshes are employed: coarse mesh M1 with 35,815 node points and 35,313 elements, medium mesh M2 with 141,148 node points and 140,147 elements and fine mesh M3 with 565,122 node points and 563,121 elements. The successive meshes are generated by doubling the number of grid points on the boundaries of the previous mesh and using the square root of the stretching factors of the previous mesh. As may be seen in Fig. 4, the mesh is highly stretched on the cylinder surface, on the walls and in the wake behind the cylinder in order to resolve very strong stress gradients. The details of the mesh characteristics are provided in Table 5. In order to validate our code, the flow of an Oldroyd-B fluid past a confined circular cylinder in a channel is solved at a Weissenberg number of 0.7. The convergence of $T_{xx}$ with mesh refinement is given in Fig. 5 on the cylinder surface and along the center line in the wake. As seen in Fig. 5, mesh convergence is evident at $We = 0.7$ both on the cylinder surface and along the center line in the wake. In the literature, there are some results of mesh refinement studies for the extra stress along the center line in the wake. However, these numerical results are not mesh convergent at $We = 0.7$ even though they clearly indicate a mesh convergence trend. The computed stress component $T_{xx}$ on the cylinder surface and along the center line in the wake is compared in Fig. 6 with the results of Yurun et al. [58], Hulsen et al. [26] and Afonso et al. [2]. The extreme values of $T_{xx}$ are provided together with the other results available in the literature in Table 6. The results of Yurun et al. [58] were obtained using a Galerkin/least-squares hp finite element method. Hulsen et al. [26] used the DEVSS/DG formulation in a FEM context with the log-conformation formulation. Afonso et al. [2] employed a structured collocated FVM based on the log-conformation formulation with a time-marching pressure-correction algorithm on highly refined meshes at the rear stagnation region. The comparison in Fig. 6 shows excellent agreement on the cylinder surface with the results of Yurun et al. [58] and Hulsen et al. [26]. The low value of Afonso et al. [2] is due to the relatively large mesh size employed on the cylinder surface, and the authors presented numerical results similar to ours with mesh refinement. However, there is a slight difference along the center line in the wake and our results are shifted slightly downstream. This is mainly due to the extremely small mesh size required in the wake behind the cylinder. Nevertheless, our results in the wake indicate a remarkable agreement with the one-dimensional DG calculation of Hulsen et al. in Fig. 6 of [26] (M4-1D) along the center line in the wake. The one-dimensional DG calculation of the authors was obtained on a very fine mesh by starting from the back stagnation point and using the u-velocity component from the FEM calculation.

Table 5
The description of quadrilateral meshes used for an Oldroyd-B fluid past a confined circular cylinder in a channel with $R/H = 0.5$. $\Delta r_{min}$ is the minimum normal mesh spacing and $\Delta S_{min}$ and $\Delta S_{max}$ are the minimum and maximum tangential mesh spacing on the cylinder surface.

| Mesh | Number of nodes | Number of elements | DOF | $\Delta r_{min}/R$ | $\Delta S_{min}/R$ | $\Delta S_{max}/R$ |
|------|-----------------|--------------------|-----|------|------|------|
| M1 | 35,815 | 35,313 | 283,508 | 0.00494 | 0.0004 | 0.031415 |
| M2 | 141,148 | 140,147 | 1,123,178 | 0.00241 | 0.0002 | 0.015707 |
| M3 | 565,122 | 563,121 | 4,508,970 | 0.00119 | 0.0001 | 0.007853 |

[Figure 5: $T_{xx}$ versus $x/R$ for meshes M1, M2 and M3.]

Fig. 5. The convergence of $T_{xx}$ with mesh refinement on the cylinder surface and in the cylinder wake at $We = 0.7$ with $\beta = 0.59$ for an Oldroyd-B fluid.

Table 6
The comparison of the maximum value of the extra stress tensor component $T_{xx}$ at the cylinder wall and in the wake region at $We = 0.7$ for an Oldroyd-B fluid with $\beta = 0.59$.

| Authors | Maximum value of $T_{xx}$ at the cylinder wall | Maximum value of $T_{xx}$ in the wake region |
|---------|------|------|
| Present (M3) | 106.80 | 42.33 |
| Yurun et al. [58] | 106.77 | 40.05 |
| Chauvière and Owens [14] | 106.4 | 37.1 |
| Kim et al. [33] | 107.7 | 38.8 |
| Hulsen et al. (M7) [26] | 107.73 | 38.92 |
| Afonso et al. [2] | 100.98 | 40.79 |

[Figure 6: $T_{xx}$ versus $x/R$ for mesh M3 and the results of Yurun et al., Hulsen et al. and Afonso et al.]

Fig. 6. The comparison of $T_{xx}$ with the numerical results of Yurun et al. [58], Hulsen et al. [26] and Afonso et al. [2] on the cylinder surface and in the cylinder wake at $We = 0.7$ with $\beta = 0.59$ for an Oldroyd-B fluid.
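The DOF column of Table 5 can be cross-checked against the variable arrangement of the method (velocity components on element faces, pressure and extra stress at element centroids). The accounting below is an inference made for illustration, not a formula stated in the paper: it assumes two velocity unknowns per edge and four unknowns per element (pressure, $T_{xx}$, $T_{xy}$, $T_{yy}$), with the edge count obtained from Euler's formula for a planar mesh with one hole (the cylinder).

```python
def edge_count(num_nodes, num_elements, num_holes=0):
    """Edges of a planar mesh from Euler's formula V - E + F = 1 - holes."""
    return num_nodes + num_elements - 1 + num_holes

def viscoelastic_dof_2d(num_nodes, num_elements, num_holes=1):
    """Assumed accounting: 2 velocity components per edge plus pressure and
    three stress components (T_xx, T_xy, T_yy) per element."""
    edges = edge_count(num_nodes, num_elements, num_holes)
    return 2 * edges + 4 * num_elements

# (nodes, elements) of meshes M1 to M3 from Table 5.
meshes = {"M1": (35815, 35313), "M2": (141148, 140147), "M3": (565122, 563121)}
for name, (nodes, elements) in meshes.items():
    print(name, viscoelastic_dof_2d(nodes, elements))
```

Under these assumptions the computed counts can be compared line by line with the DOF column of Table 5.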

The calculations at higher Weissenberg numbers led to more surprising results. The calculations at a Weissenberg number of 0.8 converged to steady-state solutions on all meshes, and the variation of $T_{xx}$ with mesh refinement is given in Fig. 7 on the cylinder surface and along the center line in the wake. At this Weissenberg number, there is no tendency toward mesh convergence in the stress profiles. For the calculations at $We = 0.9$, the numerical solution becomes even worse since the extra stress along the center line in the wake initially grows exponentially with time and no steady-state solution can be found anymore on meshes M2 and M3. The variation of the extra stress RMS value with iteration number is given in Fig. 8 on meshes M1 to M3. As may be seen, the RMS value initially grows exponentially on meshes M2 and M3. Then the solution becomes time-dependent, depending on the mesh resolution and time step. Although the value of the extra stress may reach quite high values such as $1 \times 10^4$ during the initial growth, the solution field is still quite smooth and the velocity field is divergence-free. In addition, we observe an extremely low pressure field at the stagnation point behind the cylinder, behaving like a singular point. It should be noted that the critical Weissenberg number at which the solution becomes time-dependent is relatively low compared to the previous results in the literature; it is 1.4 for the results of Hulsen et al. [26] and 1.0 for the results of Afonso et al. [2]. At this point, we are not sure whether the extra stress along the center line in the wake should exhibit unbounded exponential growth with time or lead to a time-dependent solution for the present two-dimensional calculations. However, the calculations on the fine mesh M3 show a smoother initial exponential growth, indicating the first case. In the literature, there are also large discrepancies for the stresses in the wake region; some researchers have suggested that there may be some numerical artifacts beyond $We = 0.7$. For example, Yurun et al. [58] suggested that the flow of the Oldroyd-B fluid may have about the same limiting Deborah number as the UCM fluid and that solutions with higher De may be numerical artifacts. Owens and Phillips [41] pointed out that there is a singular point of extra stress somewhere between $We = 0.7$ and 0.8 and that mesh convergence cannot be obtained beyond the singular point. The present large-scale calculations also indicate that no steady-state solution is possible for an Oldroyd-B fluid beyond $We = 0.8$.

[Figure 7: $T_{xx}$ versus $x/R$ on the successive meshes.]

Fig. 7. The convergence of $T_{xx}$ with mesh refinement on the cylinder surface and in the cylinder wake at $We = 0.8$ with $\beta = 0.59$ for an Oldroyd-B fluid.

[Figure 8: RMS value of the extra stress tensor versus iteration number on a logarithmic scale.]

Fig. 8. The RMS convergence for the extra stress tensor at $We = 0.9$ with $\beta = 0.59$ for an Oldroyd-B fluid past a confined cylinder ($\Delta t = 0.01$).
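One way to characterise the initial exponential growth of the RMS value at $We = 0.9$ is to fit a straight line to its logarithm over the growth phase. The sketch below shows such a least-squares fit on a synthetic history; the growth rate of 1.5 and the amplitude are invented for the illustration and are not values from the paper, and only the time step $\Delta t = 0.01$ is taken from the caption of Fig. 8.

```python
import math

def exponential_growth_rate(times, values):
    """Least-squares slope of ln(value) versus time."""
    logs = [math.log(v) for v in values]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    numerator = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    denominator = sum((t - t_mean) ** 2 for t in times)
    return numerator / denominator

dt = 0.01  # time step quoted in the caption of Fig. 8
times = [dt * k for k in range(0, 1000, 100)]

# Synthetic RMS history growing like exp(1.5 t); NOT data from the paper.
rms = [1.0e-4 * math.exp(1.5 * t) for t in times]

print(exponential_growth_rate(times, rms))
```

A fitted rate that stays positive and roughly constant over a long window would point to sustained exponential growth, whereas a rate that drops toward zero would be consistent with saturation into a bounded time-dependent state.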

Table 7
The comparison of the dimensionless drag coefficient for an Oldroyd-B fluid past a confined circular cylinder in a channel.

| We | M1 | M2 | M3 | Hulsen et al. [26] | Yurun et al. [58] |
|----|----|----|----|------|------|
| 0.0 | – | – | – | 132.358 | 132.36 |
| 0.1 | 130.301 | 130.384 | 130.387 | 130.363 | 130.36 |
| 0.2 | 126.566 | 126.647 | 126.649 | 126.626 | 126.62 |
| 0.3 | 123.133 | 123.213 | 123.215 | 123.193 | 123.19 |
| 0.4 | 120.535 | 120.613 | 120.615 | 120.596 | 120.59 |
| 0.5 | 118.772 | 118.848 | 118.849 | 118.836 | 118.83 |
| 0.6 | 117.722 | 117.797 | 117.798 | 117.792 | 117.77 |
| 0.7 | 117.263 | 117.337 | 117.339 | 117.340 | 117.32 |
| 0.8 | 117.297 | 117.374 | 117.376 | 117.373 | 117.36 |

Although the convergence of the drag coefficient with mesh refinement is not considered to be a very good indicator of accuracy, the values of the steady-state drag coefficient are tabulated in Table 7 and compared with several other results available in the literature. The total drag on the cylinder surface may be expressed as follows:

$$F_x = -\oint p\,dy + \beta\left(\oint \frac{\partial u}{\partial x}\,dy - \oint \frac{\partial u}{\partial y}\,dx\right) + \oint T_{xx}\,dy - \oint T_{xy}\,dx \tag{34}$$

The computed drag coefficient values are in relatively good agreement with the other results available in the literature.
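The level of agreement can be quantified with a relative difference between the present fine-mesh (M3) drag coefficients and a reference set from Table 7. The sketch below uses the values of Hulsen et al. [26] as the reference; the choice of reference and the helper name are illustrative.

```python
# We -> (present M3, Hulsen et al. [26]) drag coefficients from Table 7.
drag = {
    0.1: (130.387, 130.363),
    0.2: (126.649, 126.626),
    0.3: (123.215, 123.193),
    0.4: (120.615, 120.596),
    0.5: (118.849, 118.836),
    0.6: (117.798, 117.792),
    0.7: (117.339, 117.340),
    0.8: (117.376, 117.373),
}

def relative_difference(value, reference):
    """Absolute difference normalised by the reference value."""
    return abs(value - reference) / abs(reference)

worst_we = max(drag, key=lambda we: relative_difference(*drag[we]))
for we, (present, reference) in drag.items():
    print(f"We = {we:.1f}: {100.0 * relative_difference(present, reference):.4f} %")
print("largest relative difference at We =", worst_we)
```

For these two data sets the relative difference stays below 0.02% over the whole range of Weissenberg numbers listed.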

3.4. Oldroyd-B Fluid around a sphere falling in a cylindrical tube

The two-dimensional axisymmetric viscoelastic flow of an Oldroyd-B fluid around a sphere falling in a cylindrical tube is one of the classical benchmark problems in non-Newtonian fluid mechanics and has been studied extensively by many researchers [10,13,36,40,55–57]. The parallel algorithm described in Section 2 is validated by solving this particular benchmark problem in three dimensions using all-hexahedral elements. For this problem, we consider a rigid sphere of radius $R_s$ falling with a terminal velocity $U_s$ along the axis of a cylindrical tube of radius $R_t$. The ratio of sphere to tube radius is taken to be 0.5 and the computational domain spans from $x = -12R_s$ to $x = 12R_s$ with the sphere located at $x = 0$. The dimensionless parameters are the Reynolds number $Re = \rho U_s R_s/\eta$, the Weissenberg number $We = \lambda U_s/R_s$ and the viscosity ratio $\beta = \eta_s/\eta$. The physical parameters are the density $\rho$, the terminal velocity $U_s$, the relaxation time $\lambda$, the zero-shear-rate viscosity of the fluid $\eta$ and the solvent viscosity $\eta_s$. The viscosity ratio $\beta$ is chosen to be 0.50, which is the value used in the classical benchmark problem for the Oldroyd-B fluid. The associated boundary conditions are the prescribed unit uniform velocity along the x-axis at the inflow and on the circular tube wall, no-slip boundary conditions on the sphere surface and natural (traction-free) boundary conditions at the outflow. The extra stresses are computed everywhere as in two dimensions and their boundary conditions are imposed through their fluxes, using the analytical values at the inlet boundary.

Table 7 (continued)
The comparison of the dimensionless drag coefficient for an Oldroyd-B fluid past a confined circular cylinder in a channel.

| We | Owens et al. [39] | Afonso et al. [2] | Kim et al. [33] | Caola et al. [11] |
|----|------|------|------|------|
| 0.0 | 132.357 | – | 132.354 | 132.384 |
| 0.1 | – | – | 130.359 | – |
| 0.2 | – | – | 126.622 | – |
| 0.3 | – | – | 123.118 | – |
| 0.4 | – | – | 120.589 | – |
| 0.5 | 118.827 | 118.818 | 118.824 | 118.763 |
| 0.6 | 117.775 | 117.774 | 117.774 | – |
| 0.7 | 117.291 | 117.323 | 117.315 | – |
| 0.8 | 117.237 | 117.364 | 117.351 | – |

The unstructured computational mesh shown partially in Fig. 9 is used in our calculations. The mesh consists of 1,214,542 nodes and 1,190,376 hexahedral elements, leading to a total of 19,117,980 degrees of freedom (DOF). As may be seen in Fig. 9, the mesh is highly stretched on the sphere surface, on the walls and in the wake region behind the sphere in order to resolve very strong stress gradients. The normal mesh spacing on the sphere surface is set to $2.6 \times 10^{-3}$ while the minimum tangential mesh spacing is approximately equal to $8 \times 10^{-3}$. There are 11,580 quadrilateral elements on the sphere surface. The mesh is created using the mapping, paving and sweeping algorithms available within the CUBIT mesh generation environment [9]. The METIS library [29] is used to partition the mesh into 128 sub-domains. The computed u-velocity component isosurfaces with streamtrace plot are given in Fig. 10 at $We = 0.6$ with $\beta = 0.5$. The isosurfaces are plotted directly from the element face center values by constructing a new mesh using element face centroids. As may be seen, the maximum value of the u-velocity component occurs between the sphere and the tube walls. In addition, the computed stress component $T_{xx}$ isosurfaces with contour plots on the $y = 0$ plane (red lines) and on solid walls (black lines) are shown in Fig. 11. For this case, the computed $T_{xx}$ values are relatively low compared to the two-dimensional cylinder calculations. The computed stress component $T_{xx}$ is given in Fig. 12 on the sphere surface and along the centerline of the cylindrical tube, and it is compared with the numerical results of Owens and Phillips [40] on the sphere surface. The extreme values of $T_{xx}$ are also provided together with the other results available in the literature in Table 8. The results of Owens and Phillips [40] were obtained using a spectral element method; Lunsmann et al. [36] used an EVSS/FEM formulation. The present maximum value of the $T_{xx}$ extra stress component differs by only 2.70% from the value computed by Owens and Phillips [40] and by 1.18% from the value of Lunsmann et al. [36]. However, it should be noted that the present extreme value of $T_{xx}$ corresponds to the value at the hexahedral element centroids next to the sphere surface. The present three-dimensional calculation required approximately 10,789 seconds on the Anodolu machine with 128 nodes. Although we would like to do additional calculations as in two dimensions, the decoupled time integration method given in Section 2 significantly limits the allowable time step on the present fine mesh, making the calculations very expensive. In the future, we will develop a fully-coupled iterative solver in order to further improve the computational efficiency of the present numerical algorithm.

Fig. 9. Partial view of the computational mesh for a sphere falling in a circular tube ($R_s/R_t = 0.5$). The mesh is highly refined in the wake region. The total number of nodes is 1,214,542 and the total number of hexahedral elements is 1,190,376.

Fig. 10. The computed u-velocity component isosurfaces with streamtrace plot for an Oldroyd-B fluid around a falling sphere in a circular tube at $We = 0.6$ with $\beta = 0.5$. The contour levels are 0.0, 0.4, 0.8, 1.2 and 1.6.

Fig. 11. The computed $T_{xx}$ extra stress tensor component isosurfaces with contour plots on the $y = 0$ plane (red lines) and on solid walls (black lines) for an Oldroyd-B fluid around a falling sphere in a circular tube at $We = 0.6$ with $\beta = 0.5$. The contour levels are 0.1, 1, 2, 4 and 8.

[Figure 12: $T_{xx}$ versus $x/R$ for the present results and those of Owens and Phillips.]

Fig. 12. The comparison of $T_{xx}$ with the results of Owens and Phillips [40] on the sphere surface for an Oldroyd-B fluid around a falling sphere in a circular tube at $We = 0.6$ with $\beta = 0.5$.

Table 8
The comparison of the maximum value of the extra stress tensor component $T_{xx}$ on the sphere surface and in the wake region at $We = 0.6$ for an Oldroyd-B fluid with $\beta = 0.50$.

| Authors | Maximum value of $T_{xx}$ on the sphere surface | Maximum value of $T_{xx}$ in the wake region |
|---------|------|------|
| Present | 34.73 | 5.12 |
| Lunsmann et al. [36] | 35.14 | – |
| Owens and Phillips [40] | 35.67 | – |
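The percentage differences quoted for the maximum $T_{xx}$ on the sphere surface can be reproduced from the entries of Table 8. The sketch below normalises by the present value, which is an inference from the quoted numbers rather than a definition given in the paper; the helper name is hypothetical.

```python
# Maximum T_xx on the sphere surface at We = 0.6, from Table 8.
present = 34.73
lunsmann = 35.14
owens_phillips = 35.67

def percent_difference(reference, value):
    """Difference between value and reference in percent of the reference."""
    return 100.0 * abs(value - reference) / abs(reference)

# Normalising by the present value is an assumption made for this illustration.
print(f"vs Owens and Phillips [40]: {percent_difference(present, owens_phillips):.2f} %")
print(f"vs Lunsmann et al. [36]:    {percent_difference(present, lunsmann):.2f} %")
```

Normalising by the reference values instead would give slightly smaller percentages, so the choice of denominator matters at the second decimal place.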

4. Conclusions

A new stable unstructured finite volume method is presented for the parallel large-scale solution of viscoelastic fluid flows with exact mass conservation within each element. The present arrangement of the primitive variables leads to a stable numerical scheme and it does not require any ad-hoc modifications in order to enhance the pressure–velocity–stress coupling. The time stepping algorithm used decouples the calculation of the polymeric stress by solution of a hyperbolic constitutive equation from the evolution of the velocity and pressure fields by solution of a generalized Stokes problem. The resulting algebraic linear systems are solved using the FGMRES(m) Krylov iterative method with the restricted additive Schwarz preconditioner for the extra stress tensor and the geometric non-nested multilevel preconditioner for the Stokes system. The present multilevel preconditioner for the Stokes system is essential for parallel scalable viscoelastic flow computations. This is because, as is well known, one-level methods lead to non-scalable solvers since they cause an increase in the number of iterations as the number of sub-domains is increased. However, it is not possible to apply the standard multigrid methods with classical smoothing techniques (e.g., Jacobi, Gauss–Seidel) for the coupled iterative solution of the momentum and continuity equations because of the zero block in the saddle point problem. In order to avoid the zero block in the saddle point problem, we use an upper triangular right preconditioner which results in a scaled discrete Laplacian instead of a zero block in the original system. The log-conformation representation proposed in [22] has been implemented in order to simulate three-dimensional viscoelastic instabilities at high Weissenberg numbers in our future work. The implementation of the preconditioned Krylov subspace algorithm, the matrix–matrix multiplication and the multilevel preconditioner was carried out using the PETSc software package [5] developed at Argonne National Laboratory for improving the efficiency of the parallel code. The present numerical algorithm is validated for the Kovasznay flow, the flow of an Oldroyd-B fluid past a confined circular cylinder in a channel and the three-dimensional flow of an Oldroyd-B fluid around a rigid sphere falling in a cylindrical tube. The numerical results for the flow of an Oldroyd-B fluid past a confined circular cylinder at $We = 0.7$ indicate that mesh convergence is achieved in the wake of the cylinder. However, the numerical results at higher Weissenberg numbers indicate that no steady-state solution is possible for an Oldroyd-B fluid beyond $We = 0.8$. Although we employ very high mesh resolution in the wake region, allowing very accurate solutions at minimum cost, in order to study mesh convergence in the wake of the cylinder, the decoupled time integration method significantly limits the allowable time step. Therefore, we will develop a fully-coupled iterative solver in order to further improve the computational efficiency of the present numerical algorithm.

Acknowledgments

The author gratefully acknowledges the use of the Chimera machine at the Faculty of Aeronautics and Astronautics at ITU, the computing resources provided by the National Center for High Performance Computing of Turkey (UYBHM) under Grant No. 10752009 and the computing facilities at TUBITAK ULAKBIM, High Performance and Grid Computing Center.

References

[1] M.A. Alves, F.T. Pinho, P.J. Oliveira, The flow of viscoelastic fluids past acylinder: finite-volume high-resolution methods, J. Non-Newtonian FluidMech. 97 (2001) 207–232.


[2] A. Afonso, P.J. Oliveira, F.T. Pinho, M.A. Alves, The log-conformation tensor approach in the finite-volume method framework, J. Non-Newtonian Fluid Mech. 157 (2009) 55–65.

[3] R. Amit, C.A. Hall, T.A. Porsching, An application of network theory to the solution of implicit Navier–Stokes difference equations, J. Comput. Phys. 40 (1981) 183–201.

[4] W.K. Anderson, D.L. Bonhaus, An implicit upwind algorithm for computing turbulent flows on unstructured grids, Comp. Fluids 23 (1994) 1–21.

[5] S. Balay, K. Buschelman, V. Eijkhout, W.D. Gropp, D. Kaushik, M.G. Knepley, L.C. McInnes, B.F. Smith, H. Zhang, PETSc Users Manual, ANL-95/11, Mathematics and Computer Science Division, Argonne National Laboratory, 2004. Available from: <http://www-unix.mcs.anl.gov/petsc/petsc-2/>.

[6] T.J. Barth, A 3-D upwind Euler solver for unstructured meshes, AIAA Paper 91-1548-CP, 1991.

[7] M. Benzi, G.H. Golub, J. Liesen, Numerical solution of saddle point problems, Acta Numer. 14 (2005) 1–137.

[8] M.L. Bittencourt, C.C. Douglas, R.A. Feijóo, Nonnested multigrid methods for linear problems, Numer. Methods Partial Diff. Eqs. 17 (2001) 313–331.

[9] T.D. Blacker, S. Benzley, S. Jankovich, R. Kerr, J. Kraftcheck, R. Kerr, P. Knupp, R. Leland, D. Melander, R. Meyers, S. Mitchell, J. Shepard, T. Tautges, D. White, CUBIT Mesh Generation Environment Users Manual, vol. 1, Sandia National Laboratories, Albuquerque NM, 1999.

[10] C. Bodart, M.J. Crochet, The time-dependent flow of a viscoelastic fluid around a sphere, J. Non-Newtonian Fluid Mech. 54 (1994) 303–329.

[11] A.E. Caola, Y.L. Joo, R.C. Armstrong, R.A. Brown, Highly parallel time integration of viscoelastic flows, J. Non-Newtonian Fluid Mech. 100 (2001) 191–216.

[12] Z. Castillo, X. Xie, D.C. Sorensen, M. Embree, M. Pasquali, Parallel solution of large-scale free surface viscoelastic flows via sparse approximate inverse preconditioning, J. Non-Newtonian Fluid Mech. 157 (2009) 44–54.

[13] C. Chauvière, R.G. Owens, How accurate is your solution? Error indicators for viscoelastic flow calculations, J. Non-Newtonian Fluid Mech. 95 (2000) 1–33.

[14] C. Chauvière, R.G. Owens, A new spectral element method for the reliable computation of viscoelastic flow, Comput. Methods Appl. Mech. Eng. 190 (2001) 3999–4018.

[15] M. Crouzeix, P.A. Raviart, Conforming and nonconforming finite element methods for solving the stationary Stokes equations, RAIRO Anal. Numer. 7 (1973) 33–76.

[16] E. Cuthill, J. McKee, Reducing the bandwidth of sparse symmetric matrices, in: 24th ACM National Conference, 1969, pp. 157–172.

[17] O.M. Coronado, D. Arora, M. Behr, M. Pasquali, A simple method for simulating general viscoelastic fluid flows with an alternate log-conformation formulation, J. Non-Newtonian Fluid Mech. 147 (2007) 189–199.

[18] Y. Dimakopoulos, An efficient parallel and fully implicit algorithm for the simulation of transient free-surface flows of multimode viscoelastic liquids, J. Non-Newtonian Fluid Mech. 165 (2010) 409–424.

[19] H.-S. Dou, N. Phan-Thien, Parallelisation of an unstructured finite volume code with PVM: viscoelastic flow around a cylinder, J. Non-Newtonian Fluid Mech. 77 (1998) 21–51.

[20] H.C. Elman, Preconditioning for the steady-state Navier–Stokes equations with low viscosity, SIAM J. Sci. Comput. 20 (1999) 1299–1316.

[21] H.C. Elman, V.E. Howle, J.N. Shadid, R.S. Tuminaro, A parallel block multi-level preconditioner for the 3D incompressible Navier–Stokes equations, J. Comput. Phys. 187 (2003) 504–523.

[22] R. Fattal, R. Kupferman, Constitutive laws for the matrix–logarithm of the conformation tensor, J. Non-Newtonian Fluid Mech. 123 (2004) 281–285.

[23] R. Guenette, M. Fortin, A new mixed finite element method for computing viscoelastic flows, J. Non-Newtonian Fluid Mech. 60 (1995) 27–52.

[24] W. Hackbusch, Multigrid Methods and Applications, Springer-Verlag, Heidelberg, 1985.

[25] F.H. Harlow, J.E. Welch, Numerical calculation of time-dependent viscous incompressible flow of fluid with free surface, Phys. Fluids 8 (1965) 2182–2189.

[26] M.A. Hulsen, R. Fattal, R. Kupferman, Flow of viscoelastic fluids past a cylinder at high Weissenberg number: stabilized simulations using matrix logarithms, J. Non-Newtonian Fluid Mech. 127 (2005) 27–39.

[27] Y.H. Hwang, Calculations of incompressible flow on a staggered triangle grid. Part I: Mathematical formulation, Numer. Heat Transfer B 27 (1995) 323–1995.

[28] R. Falgout, A. Baker, E. Chow, V.E. Henson, E. Hill, J. Jones, T. Kolev, B. Lee, J. Painter, C. Tong, P. Vassilevski, U.M. Yang, Users manual, HYPRE High Performance Preconditioners, UCRL-MA-137155 DR, Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, 2002. Available from: <http://www.llnl.gov/CASC/hypre/>.

[29] G. Karypis, V. Kumar, A fast and high quality multilevel scheme for partitioning irregular graphs, SIAM J. Sci. Comput. 20 (1998) 359–392.

[30] A. Kane, R. Guénette, A. Fortin, A comparison of four implementations of the log-conformation formulation for viscoelastic fluid flows, J. Non-Newtonian Fluid Mech. 164 (2009) 45–50.

[31] D. Kay, D. Loghin, A.J. Wathen, A preconditioner for the steady-state Navier–Stokes equations, SIAM J. Sci. Comput. 24 (2002) 237–256.

[32] L.I.G. Kovasznay, Laminar flow behind a two-dimensional grid, Proc. Camb. Philos. Soc. 44 (1948) 58–62.

[33] J.M. Kim, C. Kim, K.H. Ahn, S.J. Lee, An efficient iterative solver and high-resolution computations of the Oldroyd-B fluid flow past a confined cylinder, J. Non-Newtonian Fluid Mech. 123 (2004) 161–173.

[34] P.T. Lin, M. Sala, J.N. Shadid, R.S. Tuminaro, Performance of a geometric and an algebraic multilevel preconditioner for incompressible flow with transport, in: Proceedings of Computational Mechanics WCCM VI in Conjunction with APCOM’04, Beijing, China, September 5–10, 2004.

[35] P.T. Lin, M. Sala, J.N. Shadid, R.S. Tuminaro, Performance of fully-coupled algebraic multilevel domain decomposition preconditioners for incompressible flow and transport, Int. J. Numer. Methods Eng. 19 (2004) 1–10.

[36] W.J. Lunsmann, L. Genieser, R.C. Armstrong, R.A. Brown, Finite element analysis of steady viscoelastic flow around a sphere in a tube: calculations with constant viscosity models, J. Non-Newtonian Fluid Mech. 48 (1993) 63–99.

[37] D.J. Mavriplis, Multigrid solution strategies for adaptive meshing problems, NASA/CP-3316, 1995.

[38] P.J. Oliveira, F.T. Pinho, G.A. Pinto, Numerical simulation of non-linear elastic flows with a general collocated finite-volume method, J. Non-Newtonian Fluid Mech. 79 (1998) 1–43.

[39] R.G. Owens, C. Chauvière, T.N. Phillips, A locally-upwinded spectral technique (LUST) for viscoelastic flows, J. Non-Newtonian Fluid Mech. 108 (2002) 49–71.

[40] R.G. Owens, T.N. Phillips, Steady viscoelastic flow past a sphere using spectral elements, Int. J. Numer. Methods Eng. 39 (1996) 1517–1534.

[41] R.G. Owens, T.N. Phillips, Computational Rheology, Imperial College Press, London, 2002.

[42] M.G.N. Perera, K. Walters, Long-range memory effects in flows involving abrupt changes in geometry, J. Non-Newtonian Fluid Mech. 2 (1977) 49–81.

[43] R. Rannacher, S. Turek, A simple nonconforming element, Numer. Methods Partial Diff. Eqs. 8 (1992) 97–111.

[44] C.M. Rhie, W.L. Chow, Numerical study of the turbulent flow past an airfoil with trailing edge separation, AIAA J. 21 (1983) 1525–1532.

[45] S. Rida, F. McKenty, F.L. Meng, M. Reggio, A staggered control volume scheme for unstructured triangular grids, Int. J. Numer. Methods Fluids 25 (1997) 697–717.

[46] M. Rozloznik, Saddle point problems, iterative solution and preconditioning: a short overview, in: I. Marek (Ed.), Proceedings of the XVth Summer School Software and Algorithms of Numerical Mathematics, University of West Bohemia, Pilsen, 2003, pp. 97–108.

[47] Y. Saad, A flexible inner-product preconditioned GMRES algorithm, SIAM J. Sci. Statist. Comput. 14 (1993) 461–469.

[48] M. Sahin, A preconditioned semi-staggered dilation-free finite volume method for the incompressible Navier–Stokes equations on all-hexahedral elements, Int. J. Numer. Methods Fluids 49 (2005) 959–974.

[49] M. Sahin, H.J. Wilson, A semi-staggered dilation-free finite volume method for the numerical solution of viscoelastic fluid flows on all-hexahedral elements, J. Non-Newtonian Fluid Mech. 147 (2007) 79–91.

[50] P.R. Schunk, M.A. Heroux, R.R. Rao, T.A. Baer, S.R. Subia, A.C. Sun, Iterative solvers and preconditioners for fully-coupled finite element formulations of incompressible fluid mechanics and related transport problems, SAND2001-3512J, Sandia National Laboratories, Albuquerque, New Mexico, 2001.

[51] J. Sun, N. Phan-Thien, R.I. Tanner, An adaptive viscoelastic stress splitting scheme and its applications: AVSS/SI and AVSS/SUPG, J. Non-Newtonian Fluid Mech. 65 (1996) 75–91.

[52] S.P. Vanka, Block-implicit multigrid solutions of Navier–Stokes equations in primitive variables, J. Comput. Phys. 65 (1986) 138–158.

[53] R. Verfürth, A multilevel algorithm for mixed problems, SIAM J. Numer. Anal. 21 (1984) 264–271.

[54] P. Wesseling, An Introduction to Multigrid Methods, John Wiley & Sons, New York, 1992.

[55] F. Yurun, Solution behavior of the falling sphere problem in viscoelastic flows, Acta Mech. Sinica 19 (2003) 394–408.

[56] F. Yurun, Limiting behavior of the solutions of a falling sphere in a tube filled with viscoelastic fluids, J. Non-Newtonian Fluid Mech. 110 (2003) 77–102.

[57] F. Yurun, M.J. Crochet, High-order finite element methods for steady viscoelastic flows, J. Non-Newtonian Fluid Mech. 57 (1995) 283–311.

[58] F. Yurun, R.I. Tanner, N. Phan-Thien, Galerkin/least-square finite element methods for steady viscoelastic flows, J. Non-Newtonian Fluid Mech. 84 (1999) 233–256.
