
Compiled lecture notes

Peter W Möller

Göteborg — October 16, 2018


Table of Contents

1 The Finite Element Equations
1.1 Preliminaries
1.2 Spring Structures
1.2.1 The Spring Element
1.2.2 Connected Springs
1.2.3 Example
1.2.4 Solving the FE Equations
1.2.5 Problems
1.3 Truss Structures
1.3.1 A Two–Dimensional Bar Element
1.3.2 Example
1.3.3 Solving the FE Equations
1.3.4 Calculating the Bar Forces
1.3.5 Problems

2 Second Order Problems in One Dimension
2.1 Modelling
2.1.1 The Pre–Tensioned String
2.1.2 Linear Elasticity
2.1.3 One–Dimensional Heat Flow
2.2 The Weak Problem
2.3 Finite Element Formulation
2.4 Some Notes on the FE Equations
2.5 Some Examples
2.5.1 Homogenous Dirichlet Conditions
2.5.2 Non–Homogenous Dirichlet Data
2.5.3 Natural Boundary Conditions
2.5.4 Robin Conditions
2.6 Element Approximations
2.6.1 Conform Elements
2.6.2 Nonconforming Elements
2.6.3 Change of Basis
2.6.4 Lagrange Elements
2.7 A Note on the Load Vectors

3 Abstract Formulation of the Finite Element Method
3.1 Vector Spaces
3.1.1 Vector Norms
3.1.2 Inner Products
3.2 Function Spaces
3.2.1 Function Norms
3.2.2 Inner Products
3.3 Abstract Formulation
3.3.1 The Variational Problem
3.3.2 Finite Element Formulation
3.3.3 The Energy Norm

4 The Minimization Problem
4.1 Preliminaries
4.2 Minimization Problems
4.2.1 The Continuous Minimization Problem
4.2.2 The Discrete Minimization Problem
4.2.3 Potential Energy
4.3 Stiff Approximations

5 Error Estimation and Adaptivity
5.1 Error Sources in Engineering Computations
5.2 Best Approximation
5.3 Galerkin Orthogonality and u_h as a Best Approximation
5.4 Error in Terms of the Energy Norm
5.5 An Equation for the Error
5.6 Mesh Refinement
5.6.1 h–refinement
5.6.2 p–refinement
5.6.3 Hierarchical Basis Functions
5.7 A Residual Based a posteriori Error Estimate
5.8 Adaptivity

6 Convergence
6.1 One Dimensional Problems
6.2 Two Dimensional Problems
6.3 Effects of Singularities
6.4 Rate of Convergence
6.4.1 Example
6.5 p–refinements

References


1. The Finite Element Equations

1.1 Preliminaries

In the present text we will recognize the finite element method as a technique to approximate the solutions of boundary value problems (BVP), i.e. differential equations with boundary conditions that ensure that the solution is unique. We remark that in engineering practice the differential equations are commonly of order 2m with m = 1 or m = 2. For instance, the exceedingly important Poisson's equation as well as the Navier elasticity equations are of order 2 (m = 1); the biharmonic equation (used e.g. to model plate bending) and the Euler–Bernoulli beam equation (the elastic line) are both of order 4 (m = 2).

The first step is to recast the BVP into a variational problem, by means of a variational formulation. In this process, the order of the highest derivative is reduced from 2m to m. Hence, if we deal with a BVP with a second order differential equation, the variational problem will involve first derivatives of the unknown only. For this reason, the BVP and its variational form are sometimes referred to as the strong and weak forms, respectively. Note, however, that there is no approximation involved in recasting the BVP, so both problems have the same solution u(x), but since the derivatives of u are of lower order in the weak problem it is easier to find an approximate solution.

[Figure: overview of the solution process — the strong form of the continuous problem (boundary value problem: differential equation(s) and boundary conditions; unknown: one or more functions u(x)) is recast by the variational formulation into the weak form (variational problem; integral equation(s); unknown: one or more functions), which the finite element formulation turns into the discrete finite element problem (algebraic system of equations Ka = f; unknown: node variables a = [a1 a2 … an]^T).]

In the subsequent step, the finite element formulation, an approximation is introduced. A number of so called basis functions N_i(x) (sometimes referred to as 'shape functions') are constructed, and the unknown function is approximated as a linear combination of these. If we denote the FE–approximation by u_h, we thus have

\[ u \approx u_h = N_1(x)a_1 + N_2(x)a_2 + \dots + N_n(x)a_n = \sum_{i=1}^{n} N_i(x)\,a_i \tag{1.1} \]

The coefficients a_i, i = 1, 2, …, n, in the linear combination are known as the node variables; they are to be calculated so that u_h becomes as good an approximation of u as possible (in a well defined sense). Next u_h is substituted for u in the variational problem and we observe that we do not have any unknown function any longer, but instead a discrete number of yet unknown node variables. Thus, we need n equations to solve for the node variables; as it turns out, such a system of equations may be established from the variational problem, and one obtains

\[ \mathbf{K}\mathbf{a} = \mathbf{f}, \qquad
\begin{bmatrix} K_{11} & K_{12} & \dots & K_{1n} \\ K_{21} & K_{22} & \dots & K_{2n} \\ \vdots & \vdots & & \vdots \\ K_{n1} & K_{n2} & \dots & K_{nn} \end{bmatrix}
\begin{Bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{Bmatrix} =
\begin{Bmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{Bmatrix} \tag{1.2} \]

where K and f are denoted the structure stiffness matrix and the structure load vector, respectively. Having solved the system Ka = f for the node variables a, we have an approximation of u available according to Eq. (1.1).

Remark: The notations stiffness matrix and load vector reflect the fact that the practical use of the finite element method was developed by structural engineers in the aviation industry, where it is natural to think of stiffnesses and force–displacement relations. Note though, that for this historic reason, the same notations are used even if the original BVP does not describe an elasticity or mechanical problem.

Remark: In an elasticity problem, the FE equations Ka = f may be thought of as a generalization of the ordinary spring equation.

The boundary value problem being solved is defined on some domain Ω with boundary Γ. The variational problem will embrace integrals over Ω (and possibly curve integrals along Γ), and both the stiffness matrix and the load vector are established by solving these integrals. In practice Ω is divided into subdomains Ω^e — 'elements' — and the required integrations are performed over the elements; by adding up the integrals over the elements, one obtains the integrals over the complete domain Ω. The illustration to the right depicts a subdivision of a two dimensional domain into triangular shaped elements.

Integrations over an element result in an element stiffness matrix K^e and an element load vector f^e. Subsequently these are 'added' together in a so called assembly process to eventually yield the structure stiffness matrix K and the structure load vector f, respectively.

In this chapter we will circumvent the tasks of integration to obtain element stiffness matrices and element load vectors, but adhere to a couple of problems where these can be established more directly. Hence we focus on the assembly process, i.e. how to construct the stiffness matrix K and the load vector f in Eq. (1.2) from element matrices K^e and vectors f^e, respectively. We will also see how to handle the equation system (1.2) in the common case when some of the node variables in the solution vector a should have prescribed values.

In the next chapter we shall delve into the finite element method for a second order boundary value problem, and describe the variational formulation as well as the subsequent finite element formulation. The resulting element quantities (K^e and f^e) are assembled to a system of equations (Eq. (1.2)) in exactly the same manner as is described below; actually, this will be the case in any FE–problem. For this reason, it is of some importance to get a grip on the assembly process.

1.2 Spring Structures

In this section we will deal with force–displacement relations for assemblages of linear elastic springs. The presentation is restricted to the special case when all displacements are in one direction, so that the position of any point may be described by a single real number. In a subsequent section we expand the concept to the more general case where the displacement is a vector, i.e. situations where connection points between structural parts may displace in more than one direction.

1.2.1 The Spring Element

Consider a spring, with spring stiffness k, that transfers load in its axial direction only. The axial force is denoted N and is positive in tension. A linear elastic spring gets an elongation

\[ \delta = \delta_1 + \delta_2 \tag{1.3} \]

and the force is proportional to the change of length

\[ N = k\,\delta \tag{1.4} \]

[Figure: a spring with stiffness k loaded by the axial force N at both ends; the two ends move out by δ1 and δ2, respectively.]

The constant of proportionality, k, is the spring stiffness and is a constitutive property. Note that a tensile force (N > 0) yields an extension and a positive value of δ, while if the force is compressive (N < 0) we get δ < 0 which corresponds to a reduction of the spring length. Also, by elastic we mean that the spring regains its original length (δ = 0) when it is unloaded.

The force–displacement relation Eq. (1.4) may be thought of as a constitutive equation. As such it is not directly viable to use when we study the connections of springs in a structure built up from two or more springs. If, for instance, a point i between two springs in a serial connection deflects δ_i, it gives a positive contribution to the extension of one of the springs, but a negative contribution to the other spring.

To facilitate the analysis of a spring structure, we introduce a horizontal coordinate x and let it be positive to the right. Forces and displacements are then defined to be positive in the positive x–direction. Furthermore, we use superscript e for quantities that pertain to an element. Our spring element has two ends, which we simply denote 1 and 2, respectively. To each end we associate a force P_i^e and a displacement a_i^e (i = 1, 2).

Comparing to the notations above, we see that a_1^e = −δ_1 and a_2^e = δ_2, so Eq. (1.3) may now be written

\[ \delta = a_2^e - a_1^e \tag{1.5} \]

Further we note that

\[ P_1^e = -N, \qquad P_2^e = N \tag{1.6} \]

Substituting Eqs. (1.5) and (1.6) into Eq. (1.4), one gets

\[ P_1^e = k\left( a_1^e - a_2^e \right), \qquad P_2^e = -k\left( a_1^e - a_2^e \right) \tag{1.7} \]

or

\[ \begin{bmatrix} k & -k \\ -k & k \end{bmatrix} \begin{Bmatrix} a_1^e \\ a_2^e \end{Bmatrix} = \begin{Bmatrix} P_1^e \\ P_2^e \end{Bmatrix} \tag{1.8} \]

Using the notation

\[ \mathbf{K}^e = k \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \qquad \mathbf{f}^e = \begin{Bmatrix} P_1^e \\ P_2^e \end{Bmatrix}, \qquad \mathbf{a}^e = \begin{Bmatrix} a_1^e \\ a_2^e \end{Bmatrix} \tag{1.9} \]

we thus have

\[ \mathbf{K}^e \mathbf{a}^e = \mathbf{f}^e \tag{1.10} \]

Here K^e and f^e are the element stiffness matrix and the element load vector, respectively. The vector a^e contains the degrees of freedom, or node variables, on the element. Recall that superscript e means that we deal with a particular element, so the subscript numbering (1 and 2) is local: first and second degree of freedom on the element.

Element stiffness matrices are always square and have one row and one column for each node variable on the element. It is observed that the element stiffness matrix is symmetric. This is also often the case when we approximate solutions of BVPs with FEM: if the governing differential equation embraces only even order derivatives of the unknown function(s) and the so called Galerkin method is utilized, the element stiffness matrices become symmetric. Symmetric K^e–matrices will yield a symmetric structure stiffness matrix K.
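For later reference, the element matrix of Eq. (1.9) is trivial to generate in code. The MATLAB function below is a minimal sketch in the spirit of CALFEM's spring1e; the function name and interface are illustrative, not the CALFEM definitions:

    function Ke = spring_element(k)
    % SPRING_ELEMENT  Element stiffness matrix of a linear spring, Eq. (1.9).
    %   k  : spring stiffness
    %   Ke : 2-by-2 element stiffness matrix
    Ke = k*[ 1 -1;
            -1  1];
    end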

1.2.2 Connected Springs

Let us next consider two or more spring elements connected in series and/or in parallel; an example is depicted below.

[Figure: five springs with stiffnesses k1–k5 connected in series and in parallel; the connection points (nodes) carry the displacements a1–a5 and the applied forces f1–f5.]

In an assembly like this, each spring will be an element, so there are five elements in this example; these are only implicitly numbered in the illustration, in that the stiffness of the i:th element has been denoted k_i. The connection points (and end points) of the springs, marked by white circles, are the nodes, and the node variables a_i represent displacements.

Given all node displacements, the position of the structure is known and all spring elongations — and thus spring forces — are easily calculated. For instance, the elongation of the second element is a_4 − a_2 and the spring force hence k_2(a_4 − a_2). The spring forces have to be in equilibrium with the applied forces f_i, i = 1, …, 5; note that the external forces are applied at the nodes and have been numbered in the same order as the displacements, so that f_j and a_j are the load and displacement, respectively, in the j:th node.

Let us now collect the node displacements and forces in vectors, viz. a node displacement vector and a structure load vector

\[ \mathbf{a} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 & a_5 \end{bmatrix}^T, \qquad \mathbf{f} = \begin{bmatrix} f_1 & f_2 & f_3 & f_4 & f_5 \end{bmatrix}^T \tag{1.11} \]

Our task is now to find a force–displacement relation on stiffness form

\[ \mathbf{K}\mathbf{a} = \mathbf{f} \tag{1.12} \]

where K is a structure stiffness matrix; it is realized that the stiffness matrix in this example has dimension 5, since we have five node displacements and five node forces. As it turns out, the structure stiffness matrix may be established from the element stiffnesses. This is accomplished in three steps as follows.

  • For each element: find the element stiffness matrix K^e and the element load vector f^e

  • Apply compatibility: express the element degrees of freedom a^e in the structure variables a

  • Enforce equilibrium at each node

The second and third steps are called assembling and this is typically done one element at a time, e.g. in Matlab one would use a for–loop.

1.2.3 Example

The assembly process is here illustrated and described for a somewhat less elaborate example than the one used above. Consider two springs with stiffnesses k_1 and k_2, respectively, connected in series. The two elements and the three nodes are numbered from left to right as shown in the figure.

[Figure: two springs k1 and k2 in series; elements 1 and 2, nodes 1, 2, 3 with displacements a1, a2, a3 and nodal forces f1, f2, f3; each element has its own local end forces P_i^e and displacements a_i^e.]

There are thus three node displacements and three nodal forces

\[ \mathbf{a} = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}^T, \qquad \mathbf{f} = \begin{bmatrix} f_1 & f_2 & f_3 \end{bmatrix}^T \tag{1.13} \]

so the structure stiffness matrix K (cf. Eq. (1.12)) is a 3 × 3 matrix.

We take on the first step in establishing the structure stiffness matrix and thus consider the elements, one at a time, without regard to their respective location in the structure. From Eq. (1.9) we obtain

\[ \mathbf{K}^{e=1} = \begin{bmatrix} k_1 & -k_1 \\ -k_1 & k_1 \end{bmatrix}, \qquad \mathbf{f}^{e=1} = \begin{Bmatrix} P_1^{e=1} \\ P_2^{e=1} \end{Bmatrix}, \qquad \mathbf{a}^{e=1} = \begin{Bmatrix} a_1^{e=1} \\ a_2^{e=1} \end{Bmatrix} \tag{1.14} \]

and

\[ \mathbf{K}^{e=2} = \begin{bmatrix} k_2 & -k_2 \\ -k_2 & k_2 \end{bmatrix}, \qquad \mathbf{f}^{e=2} = \begin{Bmatrix} P_1^{e=2} \\ P_2^{e=2} \end{Bmatrix}, \qquad \mathbf{a}^{e=2} = \begin{Bmatrix} a_1^{e=2} \\ a_2^{e=2} \end{Bmatrix} \tag{1.15} \]

Now that the element quantities are at hand, we are set to start the assembling. First we take compatibility into account (the second step in the procedure outlined above); this suggests that we are to identify the relations between the element degrees of freedom and the structure degrees of freedom. For the first element we have

\[ a_1^{e=1} = a_1, \qquad a_2^{e=1} = a_2 \tag{1.16} \]

so Eq. (1.10) may be written

\[ \mathbf{K}^{e=1} \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix} = \mathbf{f}^{e=1}, \qquad \text{i.e.} \qquad \begin{bmatrix} k_1 & -k_1 \\ -k_1 & k_1 \end{bmatrix} \begin{Bmatrix} a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} P_1^{e=1} \\ P_2^{e=1} \end{Bmatrix} \tag{1.17} \]

The second element connects nodes 2 and 3 so

\[ a_1^{e=2} = a_2, \qquad a_2^{e=2} = a_3 \tag{1.18} \]

so for this element Eq. (1.10) becomes

\[ \mathbf{K}^{e=2} \begin{Bmatrix} a_2 \\ a_3 \end{Bmatrix} = \mathbf{f}^{e=2}, \qquad \text{i.e.} \qquad \begin{bmatrix} k_2 & -k_2 \\ -k_2 & k_2 \end{bmatrix} \begin{Bmatrix} a_2 \\ a_3 \end{Bmatrix} = \begin{Bmatrix} P_1^{e=2} \\ P_2^{e=2} \end{Bmatrix} \tag{1.19} \]

If we expand the element degree of freedom vectors to the structure vector a, i.e. from [a1 a2]^T to [a1 a2 a3]^T for the first element and analogously for element 2, Eqs. (1.17) and (1.19) may be written

\[ \begin{bmatrix} k_1 & -k_1 & 0 \\ -k_1 & k_1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{Bmatrix} a_1 \\ a_2 \\ a_3 \end{Bmatrix} = \begin{Bmatrix} P_1^{e=1} \\ P_2^{e=1} \\ 0 \end{Bmatrix}, \qquad
\begin{bmatrix} 0 & 0 & 0 \\ 0 & k_2 & -k_2 \\ 0 & -k_2 & k_2 \end{bmatrix} \begin{Bmatrix} a_1 \\ a_2 \\ a_3 \end{Bmatrix} = \begin{Bmatrix} 0 \\ P_1^{e=2} \\ P_2^{e=2} \end{Bmatrix} \tag{1.20} \]

or

\[ \tilde{\mathbf{K}}^{e} \mathbf{a} = \tilde{\mathbf{f}}^{e} \tag{1.21} \]

for the respective elements. Here K̃^e is the expanded element stiffness matrix while f̃^e is the expanded element load vector. For instance, for element 1 we have

\[ \tilde{\mathbf{K}}^{e=1} = \begin{bmatrix} k_1 & -k_1 & 0 \\ -k_1 & k_1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad \tilde{\mathbf{f}}^{e=1} = \begin{Bmatrix} P_1^{e=1} \\ P_2^{e=1} \\ 0 \end{Bmatrix} \tag{1.22} \]

To conclude the assembling process, we next establish equilibrium conditions for the nodes (third step in the process outlined at the end of the previous subsection). Recalling Newton's third law we have the following situation:

[Figure: free bodies of nodes 1, 2 and 3; node j is acted on by the external force f_j and by the end forces P_i^e of the adjoining spring elements k1 and k2, with directions reversed according to Newton's third law.]

For nodes 1 through 3 we find the equilibrium equations

\[ f_1 - P_1^{e=1} = 0, \qquad f_2 - P_2^{e=1} - P_1^{e=2} = 0, \qquad f_3 - P_2^{e=2} = 0 \tag{1.23} \]

Using vectors, we may write these

\[ \begin{Bmatrix} f_1 \\ f_2 \\ f_3 \end{Bmatrix} = \begin{Bmatrix} P_1^{e=1} \\ P_2^{e=1} \\ 0 \end{Bmatrix} + \begin{Bmatrix} 0 \\ P_1^{e=2} \\ P_2^{e=2} \end{Bmatrix}, \qquad \text{i.e.} \qquad \mathbf{f} = \tilde{\mathbf{f}}^{e=1} + \tilde{\mathbf{f}}^{e=2} \tag{1.24} \]

Thus, the structure load vector is obtained as the sum of the expanded element load vectors. However, as seen in Eqs. (1.20) and (1.21), an expanded element load vector may be expressed as the product of the expanded element stiffness matrix and the structure degree of freedom vector a. Hence, Eq. (1.24) becomes

\[ \mathbf{f} = \left( \tilde{\mathbf{K}}^{e=1} + \tilde{\mathbf{K}}^{e=2} \right)\mathbf{a} \tag{1.25} \]

or

\[ \mathbf{K}\mathbf{a} = \mathbf{f} \tag{1.26} \]

where

\[ \mathbf{K} = \tilde{\mathbf{K}}^{e=1} + \tilde{\mathbf{K}}^{e=2} = \begin{bmatrix} k_1 & -k_1 & 0 \\ -k_1 & k_1 + k_2 & -k_2 \\ 0 & -k_2 & k_2 \end{bmatrix} \tag{1.27} \]
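In a computer implementation one does not form the expanded matrices explicitly; instead the 2 × 2 element matrix is added directly into the rows and columns of K that correspond to the element's global degrees of freedom. A minimal MATLAB sketch of this assembly loop for the two–spring example (the variable names and the numerical stiffness values are illustrative only):

    % Two springs in series: 3 global dofs, 2 elements (cf. Eq. (1.27))
    k    = [2 3];          % illustrative element stiffnesses k1, k2
    edof = [1 2;           % global dofs of element 1
            2 3];          % global dofs of element 2
    K = zeros(3);
    for e = 1:2
        Ke   = k(e)*[1 -1; -1 1];            % element matrix, Eq. (1.9)
        dofs = edof(e,:);
        K(dofs,dofs) = K(dofs,dofs) + Ke;    % add the element contribution
    end
    % K now equals Eq. (1.27) with k1 = 2, k2 = 3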

1.2.4 Solving the FE Equations

As seen, our model problem is described by

\[ \begin{bmatrix} k_1 & -k_1 & 0 \\ -k_1 & k_1 + k_2 & -k_2 \\ 0 & -k_2 & k_2 \end{bmatrix} \begin{Bmatrix} a_1 \\ a_2 \\ a_3 \end{Bmatrix} = \begin{Bmatrix} f_1 \\ f_2 \\ f_3 \end{Bmatrix} \tag{1.28} \]

The first thing that should be noted is that the determinant of the stiffness matrix is zero:

\[ \det(\mathbf{K}) = k_1(k_1+k_2)k_2 - k_2^2 k_1 - k_2 k_1^2 = 0 \tag{1.29} \]

Hence, the matrix does not have an inverse (K^{-1} does not exist) and Eq. (1.28) does not have a unique solution. To see this, consider the displacement vector c = [c c c]^T for any arbitrary constant c; it represents a rigid body displacement. The vector Kc is then a zero vector

\[ \mathbf{K}\mathbf{c} = \begin{Bmatrix} k_1 c - k_1 c \\ -k_1 c + (k_1+k_2)c - k_2 c \\ -k_2 c + k_2 c \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \\ 0 \end{Bmatrix} \tag{1.30} \]

Adding the zero vector Kc to the left hand side of Eq. (1.26), we have Ka + Kc = f or

\[ \mathbf{K}\left( \mathbf{a} + \mathbf{c} \right) = \mathbf{f} \tag{1.31} \]

so if we have found any solution a such that Ka = f, a + c will also be a solution for any value of c.

Many, albeit not all, engineering problems that are solved by finite element methods yield singular stiffness matrices, and it is necessary to invoke boundary conditions that ensure a unique solution in each case. In elasticity problems for instance, one must make certain that the body cannot displace without deformations occurring. We will contemplate boundary conditions in somewhat more detail in the next chapter, where boundary value problems are discussed. As for now, it is enough to note that in our spring problem it is sufficient to fix the position of one node in order to suppress rigid body translation; let us study an example.

[Figure: the two–spring structure (k1, k2) with node 1 fixed and acted on by the reaction force f1, and with a force P applied at node 3.]

Let us keep the leftmost node fixed; this is the same thing as saying that we measure all displacements relative to node 1. We hence have a_1 = 0, but f_1 will be an unknown 'reaction' force that will depend on the loads applied to the other nodes — the magnitude of f_1 will become exactly what is required to keep a_1 = 0. Let us now apply a force P at node 3, but no force at node 2. We then have the three equations

\[ \begin{bmatrix} k_1 & -k_1 & 0 \\ -k_1 & k_1 + k_2 & -k_2 \\ 0 & -k_2 & k_2 \end{bmatrix} \begin{Bmatrix} 0 \\ a_2 \\ a_3 \end{Bmatrix} = \begin{Bmatrix} f_1 \\ 0 \\ P \end{Bmatrix} \tag{1.32} \]

with the three unknowns a_2, a_3, and f_1.

Remark: In this trivial example f_1 is easily calculated from an equilibrium equation (f_1 + P = 0), but in many cases this is not possible, since one commonly has more unknown reaction forces than available equilibrium equations (so–called statically indeterminate problems). For this reason, we stick to Eq. (1.32) to solve for all the unknowns, as is required in the general case.

We thus have unknowns both in the left hand side and in the right hand side of the equation system (1.32). The normal way to tackle this is to partition the system into one that embraces the unknown node variables (left hand side unknowns) only, and a second system that involves the right hand side unknowns. Using that a_1 = 0, we here obtain

\[ \begin{bmatrix} k_1 + k_2 & -k_2 \\ -k_2 & k_2 \end{bmatrix} \begin{Bmatrix} a_2 \\ a_3 \end{Bmatrix} = \begin{Bmatrix} 0 \\ P \end{Bmatrix}, \qquad f_1 = -k_1 a_2 \tag{1.33} \]

Solving first for the two node variables and subsequently for the reaction force, we get

\[ \begin{Bmatrix} a_2 \\ a_3 \end{Bmatrix} = P \begin{Bmatrix} 1/k_1 \\ (k_1+k_2)/(k_1 k_2) \end{Bmatrix}, \qquad f_1 = -P \tag{1.34} \]

The reader is here encouraged to verify that Eq. (1.34) solves Eq. (1.33).
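This partition–and–solve procedure is what a routine such as CALFEM's solveq automates. A minimal MATLAB sketch of the same steps for Eq. (1.32), with illustrative numerical values:

    k1 = 2; k2 = 3; P = 5;                 % illustrative values
    K = [ k1   -k1     0;
         -k1  k1+k2  -k2;
           0   -k2    k2 ];                % Eq. (1.27)
    f = [0; 0; P];                         % known applied loads
    fixed = 1;  free = [2 3];              % a1 = 0 is prescribed
    a = zeros(3,1);
    a(free) = K(free,free) \ f(free);      % solve Eq. (1.33) for a2, a3
    f1 = K(fixed,:)*a;                     % reaction force, equals -P (Eq. (1.34))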

1.2.5 Problems

Exercise: Study the linear spring system example in Ch. 9 of the CALFEM manual. Pay particular attention to the data structures edof (element degrees of freedom) that defines the topology (i.e. defines elements in terms of the global degrees of freedom) and bc (boundary conditions) that may be used to prescribe values to node variables.

Get familiar with the CALFEM functions assem (to assemble element stiffness matrices K^e into the structure stiffness matrix), solveq that solves linear equation systems like Eq. (1.32) where there may be unknowns both in the left and right hand side, and extract that retrieves element–wise solutions a^e from the global solution vector a.

Exercise:

[Figure: a spring assembly where each spring has stiffness k, loaded by a force F.]

Define the problem illustrated above in terms of elements and nodes: number the elements, number the nodes. Define the degree of freedom vector a and the structure load vector f. How many rows and columns will the structure stiffness matrix K have?

Solve the problem with Matlab/Calfem; set k = 1 and F = 1. Use the functions spring1e and spring1s to establish K^e and calculate the spring forces, respectively. Did the results (reaction force, spring forces, and node displacements) come out as expected?

1.3 Truss Structures

The systems attended to above were one dimensional insofar that we only considered displacements in one direction. If we connect springs together such that there is an angle between them, we may build up structures that exhibit stiffnesses in several directions and thus have load carrying capacity in more than a single direction. In such cases, we have to consider force–displacement relations in each possible direction of motion. In a truss structure the stiffnesses are provided by the axial stiffness EA/L of elastic bars; constructions like that are quite common in e.g. roofs and bridges and we will pay some attention to the two dimensional case in this section. The extension to a three dimensional case is straightforward.

[Figure: springs connected at angles form a structure that carries load in both the x– and y–direction; the joints have displacement components u_i, u_{i+1}, etc. A truss is built in the same way from bars, each with axial stiffness EA/L.]

1.3.1 A Two–Dimensional Bar Element

The axial stiffness EA/L of a bar depends on its cross section area A, the length L, and Young's modulus E of the material. As the previously discussed linear spring, a bar carries an axial load only. The axial force is denoted N and is positive in tension; the force is proportional to the elongation δ of the bar

\[ N = \frac{EA}{L}\,\delta \tag{1.35} \]

[Figure: a bar with axial stiffness EA/L loaded by the axial force N at both ends; the ends move out by δ1 and δ2, respectively.]

where

\[ \delta = \delta_1 + \delta_2 \tag{1.36} \]

We now introduce a coordinate x̄ along the bar axis and let the element forces and displacements be positive in the positive coordinate direction. As before, superscript e is used for quantities that pertain to an element; furthermore, an over–bar is used to indicate that an entity pertains to the element local coordinate x̄. To each end of our bar element we associate an axial force P̄_i^e and a related displacement ā_i^e (i = 1, 3).

Comparing to the notations above, we see that ā_1^e = −δ_1 and ā_3^e = δ_2, so Eq. (1.36) may now be written

\[ \delta = \bar{a}_3^e - \bar{a}_1^e \tag{1.37} \]

Further we note that

\[ \bar{P}_1^e = -N, \qquad \bar{P}_3^e = N \tag{1.38} \]

Substituting Eqs. (1.37) and (1.38) into Eq. (1.35), one gets

\[ \bar{P}_1^e = \frac{EA}{L}\left( \bar{a}_1^e - \bar{a}_3^e \right), \qquad \bar{P}_3^e = -\frac{EA}{L}\left( \bar{a}_1^e - \bar{a}_3^e \right) \tag{1.39} \]

or

\[ \frac{EA}{L} \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{Bmatrix} \bar{a}_1^e \\ \bar{a}_3^e \end{Bmatrix} = \begin{Bmatrix} \bar{P}_1^e \\ \bar{P}_3^e \end{Bmatrix} \tag{1.40} \]

[Figure: the bar element with local coordinate x̄ and stiffness EA/L; each end carries the element forces P̄1^e–P̄4^e and displacements ā1^e–ā4^e.]

Note that with k = EA/L, this element matrix is identical to that of the spring element Eq. (1.8). However, we now have to introduce degrees of freedom perpendicular to the bar axis, in order to accommodate joint displacements in arbitrary directions when the bar is a member of a truss. Let ā_2^e and ā_4^e denote the displacements orthogonal to the bar axis; we also introduce the associated forces P̄_2^e and P̄_4^e. Notice that for small displacements perpendicular to the bar, we do not get any deformation so the element does not provide any stiffness in this direction. Hence, Eq. (1.40) may be prolonged to

\[ \frac{EA}{L} \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{Bmatrix} \bar{a}_1^e \\ \bar{a}_2^e \\ \bar{a}_3^e \\ \bar{a}_4^e \end{Bmatrix} = \begin{Bmatrix} \bar{P}_1^e \\ \bar{P}_2^e \\ \bar{P}_3^e \\ \bar{P}_4^e \end{Bmatrix} \tag{1.41} \]

or

\[ \bar{\mathbf{K}}^e \bar{\mathbf{a}}^e = \bar{\mathbf{f}}^e \tag{1.42} \]

where

\[ \bar{\mathbf{K}}^e = \frac{EA}{L} \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
\bar{\mathbf{f}}^e = \begin{Bmatrix} \bar{P}_1^e \\ \bar{P}_2^e \\ \bar{P}_3^e \\ \bar{P}_4^e \end{Bmatrix}, \qquad
\bar{\mathbf{a}}^e = \begin{Bmatrix} \bar{a}_1^e \\ \bar{a}_2^e \\ \bar{a}_3^e \\ \bar{a}_4^e \end{Bmatrix} \tag{1.43} \]

Equation (1.42) provides the element force–displacement relation for the two dimensional bar element. Note though that the element stiffness matrix and the element load vector are established in a coordinate system that is local to the element. For each bar in a truss we may set up the element stiffness matrix and element load vector according to Eq. (1.43), but as was previously seen we have to utilize compatibility and equilibrium to assemble the element quantities into the structure system of equations Ka = f. In doing so, forces and displacements have to be relative to a common frame of reference. To this end we introduce a global coordinate system (x, y) and set forth to transform forces and displacements from the local system to the common global system. Let a_1^e and a_2^e denote the displacements in the x– and y–direction at the first end of the bar, while indices 3 and 4 are used for the other end. The element forces are labelled in the same order: P_1^e, P_2^e, etc. Further, let θ be the angle between the global x–axis and the element coordinate x̄; the angle is measured from x to x̄ in counter clock–wise direction.

[Figure: the bar element rotated an angle θ relative to the global x–axis, with the global force components P1^e–P4^e and displacement components a1^e–a4^e at the two ends.]

Let us first see how to change displacement variables. First consider a non–zero displacement in the x–direction at end one: a_1^e ≠ 0. Its components in the local (x̄, ȳ)–system obviously are

\[ \bar{a}_1^e = a_1^e \cos\theta, \qquad \bar{a}_2^e = -a_1^e \sin\theta \tag{1.44} \]

Likewise a non–zero y–displacement at end one, a_2^e ≠ 0, has components

\[ \bar{a}_1^e = a_2^e \sin\theta, \qquad \bar{a}_2^e = a_2^e \cos\theta \tag{1.45} \]

in the local system. Thus, taken together, we find that the displacement in the first end of the rod transforms as

\[ \bar{a}_1^e = a_1^e \cos\theta + a_2^e \sin\theta, \qquad \bar{a}_2^e = -a_1^e \sin\theta + a_2^e \cos\theta \tag{1.46} \]

Here it should be evident that the transformation for the displacement of the second end is

\[ \bar{a}_3^e = a_3^e \cos\theta + a_4^e \sin\theta, \qquad \bar{a}_4^e = -a_3^e \sin\theta + a_4^e \cos\theta \tag{1.47} \]

Equations (1.46) and (1.47) may be summarized in matrix form

\[ \begin{Bmatrix} \bar{a}_1^e \\ \bar{a}_2^e \\ \bar{a}_3^e \\ \bar{a}_4^e \end{Bmatrix} =
\begin{bmatrix} \cos\theta & \sin\theta & 0 & 0 \\ -\sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & \cos\theta & \sin\theta \\ 0 & 0 & -\sin\theta & \cos\theta \end{bmatrix}
\begin{Bmatrix} a_1^e \\ a_2^e \\ a_3^e \\ a_4^e \end{Bmatrix} \tag{1.48} \]

or

\[ \bar{\mathbf{a}}^e = \mathbf{L}\mathbf{a}^e \tag{1.49} \]

where

\[ \mathbf{L} = \begin{bmatrix} \cos\theta & \sin\theta & 0 & 0 \\ -\sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & \cos\theta & \sin\theta \\ 0 & 0 & -\sin\theta & \cos\theta \end{bmatrix}, \qquad
\mathbf{a}^e = \begin{Bmatrix} a_1^e \\ a_2^e \\ a_3^e \\ a_4^e \end{Bmatrix} \tag{1.50} \]

Substituting the transformation Eq. (1.49) into Eq. (1.42), we have

\[ \bar{\mathbf{K}}^e \mathbf{L}\mathbf{a}^e = \bar{\mathbf{f}}^e \tag{1.51} \]

We now turn our attention to the element forces; first we note that the axial forces P̄_1^e and P̄_3^e have components

\[ P_1^e = \bar{P}_1^e \cos\theta, \qquad P_2^e = \bar{P}_1^e \sin\theta \tag{1.52} \]

and

\[ P_3^e = \bar{P}_3^e \cos\theta, \qquad P_4^e = \bar{P}_3^e \sin\theta \tag{1.53} \]

respectively, in the (x, y)–system. The (x, y)–components of the transverse forces P̄_2^e and P̄_4^e are

\[ P_1^e = -\bar{P}_2^e \sin\theta, \qquad P_2^e = \bar{P}_2^e \cos\theta \tag{1.54} \]

and

\[ P_3^e = -\bar{P}_4^e \sin\theta, \qquad P_4^e = \bar{P}_4^e \cos\theta \tag{1.55} \]

Equations (1.52)–(1.55) sum up to

\[ \begin{Bmatrix} P_1^e \\ P_2^e \\ P_3^e \\ P_4^e \end{Bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & \cos\theta & -\sin\theta \\ 0 & 0 & \sin\theta & \cos\theta \end{bmatrix}
\begin{Bmatrix} \bar{P}_1^e \\ \bar{P}_2^e \\ \bar{P}_3^e \\ \bar{P}_4^e \end{Bmatrix} \tag{1.56} \]

or

\[ \mathbf{f}^e = \mathbf{L}^T \bar{\mathbf{f}}^e \tag{1.57} \]

where

\[ \mathbf{f}^e = \begin{bmatrix} P_1^e & P_2^e & P_3^e & P_4^e \end{bmatrix}^T \tag{1.58} \]

Now pre–multiply Eq. (1.51) by L^T to obtain

\[ \mathbf{L}^T \bar{\mathbf{K}}^e \mathbf{L}\mathbf{a}^e = \mathbf{L}^T \bar{\mathbf{f}}^e \tag{1.59} \]

so with Eq. (1.57) and the notation

\[ \mathbf{K}^e = \mathbf{L}^T \bar{\mathbf{K}}^e \mathbf{L} \tag{1.60} \]

we have the element force–displacement relation

\[ \mathbf{K}^e \mathbf{a}^e = \mathbf{f}^e \tag{1.61} \]

in global components. With K̄^e and L given by Eqs. (1.43) and (1.50) we find

\[ \mathbf{K}^e = \frac{EA}{L}
\begin{bmatrix} c_\theta^2 & c_\theta s_\theta & -c_\theta^2 & -c_\theta s_\theta \\
c_\theta s_\theta & s_\theta^2 & -c_\theta s_\theta & -s_\theta^2 \\
-c_\theta^2 & -c_\theta s_\theta & c_\theta^2 & c_\theta s_\theta \\
-c_\theta s_\theta & -s_\theta^2 & c_\theta s_\theta & s_\theta^2 \end{bmatrix} \tag{1.62} \]

where we used the notations

\[ c_\theta = \cos\theta, \qquad s_\theta = \sin\theta \tag{1.63} \]

Now that we are able to establish the bar element relation Eq. (1.61) with forces and displacements in terms of a coordinate system common to all elements, it is possible for us to assemble element matrices and vectors to the structure equation system Ka = f. To this end, compatibility and equilibrium conditions are utilized in the same manner as was done for the spring structures in the previous section. The procedure is illustrated by an example.
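Equation (1.62) is also the natural starting point for an implementation: given the two node coordinates of a bar, the length and the direction cosines follow directly, and the global element matrix can be filled in. The MATLAB function below is a minimal sketch in the spirit of CALFEM's bar2e; the name and argument list are illustrative, not the CALFEM interface:

    function Ke = bar2d_element(ex, ey, E, A)
    % BAR2D_ELEMENT  Global stiffness matrix of a 2D bar element, Eq. (1.62).
    %   ex = [x1 x2], ey = [y1 y2] : global coordinates of the element nodes
    %   E, A                       : Young's modulus and cross-section area
    dx = ex(2) - ex(1);  dy = ey(2) - ey(1);
    L  = sqrt(dx^2 + dy^2);            % element length
    c  = dx/L;  s = dy/L;              % cos(theta) and sin(theta), Eq. (1.63)
    Ke = (E*A/L)*[ c*c  c*s -c*c -c*s;
                   c*s  s*s -c*s -s*s;
                  -c*c -c*s  c*c  c*s;
                  -c*s -s*s  c*s  s*s ];
    end

With ex = [0 4] and ey = [0 3] (element 1 of the example below, in units of L) this reproduces Eq. (1.67).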

1.3.2 Example

We consider a simple truss that is built from two bars with lengths 5L and 3L, respectively. Both members have axial stiffness EA and a horizontal force P acts at the joint that connects the two members. Our task is to calculate the joint displacements and the reaction forces.

[Figure: the two–bar truss; bar 1 of length 5L and bar 2 of length 3L meet at the loaded joint, the horizontal distance between the supports being 4L. Elements 1 and 2 (encircled), nodes 1, 2 and 3; each node carries two degrees of freedom and two force components, numbered a1, a2 and f1, f2 at node 1, a3, a4 and f3, f4 at node 2, and a5, a6 and f5, f6 at node 3.]

First we number the two elements (encircled numbers) and the three nodes according to the figure. Next the degrees of freedom a_i, i = 1, …, 6, and associated external forces f_i are defined; there are two degrees of freedom and, thus, two force components in each node. Our aim is to establish and solve

\[ \mathbf{K}\mathbf{a} = \mathbf{f} \tag{1.64} \]

where

\[ \mathbf{a} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 & a_5 & a_6 \end{bmatrix}^T \tag{1.65} \]

contains the node variables (joint displacement components) and

\[ \mathbf{f} = \begin{bmatrix} f_1 & f_2 & f_3 & f_4 & f_5 & f_6 \end{bmatrix}^T \tag{1.66} \]

is a vector with the associated node forces; it follows that the structure stiffness matrix K is a 6 by 6 matrix (one column for each displacement component and one row for each node force). It is recognized that some of the displacement components are known by means of the

given supports, and that the corresponding force components in the vector f are unknown reaction (or support) forces. At this stage we settle with noting that both a and f will embrace unknown vector components, as was the case with the spring system previously elaborated on, and we return to this when it is time to solve the system Eq. (1.64).

Let us now construct the force–displacement relation Eq. (1.64) for the structure by assembly of element relations, according to the three steps outlined at the end of Sec. 1.2.2. We thus consider one element at a time and number the element degrees of freedom a_i^e and element forces P_i^e, i = 1, 2, 3, 4, on the two elements. Notice that the order in which the numbering is done matters: it has to follow the ordering that was used to derive the element stiffness matrix Eq. (1.62), cf. the illustration in Sec. 1.3.1. Hence, we need to number the degrees of freedom first (i = 1, 2) at the first node of the element, and then (i = 3, 4) at the second node; at each node the x–degree is numbered first (i = 1, 3), and thereafter the y–degree (i = 2, 4).

The order in which we numbered the two elements and the three nodes above was quite arbitrary. Irrespective of that numbering, each single element has two nodes and we are free to choose which one to use as the 'first node on the element'. In this example, see illustration below, we have chosen nodes 1 and 3 to be the first and second node, respectively, on element 1; for element 2, node 2 is the first node, while node 3 is the second.

[Figure: the two elements with their local degree of freedom numbering; element forces P1^e–P4^e and displacements a1^e–a4^e for e = 1 and e = 2, and the angles θ1 and θ2 measured from the global x–axis to the respective element axes x̄.]

Although we are free to use any of the two nodes on an element as the 'first node', it is essential to realize that the choice affects the value of the angle θ between the global x–axis and the local x̄–axis, since the latter runs from the first towards the second element node (cf. illustration above). Hence, for element 1 we have cos θ = cos θ1 = 4/5 and sin θ1 = 3/5, so with the bar length 5L the element stiffness matrix becomes (Eqs. (1.62) and (1.63))

\[ \mathbf{K}^{e=1} = \frac{EA}{5L}\cdot\frac{1}{5^2}
\begin{bmatrix} 16 & 12 & -16 & -12 \\ 12 & 9 & -12 & -9 \\ -16 & -12 & 16 & 12 \\ -12 & -9 & 12 & 9 \end{bmatrix}
= \frac{EA}{375L}
\begin{bmatrix} 48 & 36 & -48 & -36 \\ 36 & 27 & -36 & -27 \\ -48 & -36 & 48 & 36 \\ -36 & -27 & 36 & 27 \end{bmatrix} \tag{1.67} \]

and the element force–displacement relation Eq. (1.61) is

\[ \frac{EA}{375L}
\begin{bmatrix} 48 & 36 & -48 & -36 \\ 36 & 27 & -36 & -27 \\ -48 & -36 & 48 & 36 \\ -36 & -27 & 36 & 27 \end{bmatrix}
\begin{Bmatrix} a_1^{e=1} \\ a_2^{e=1} \\ a_3^{e=1} \\ a_4^{e=1} \end{Bmatrix} =
\begin{Bmatrix} P_1^{e=1} \\ P_2^{e=1} \\ P_3^{e=1} \\ P_4^{e=1} \end{Bmatrix} \tag{1.68} \]

For the second element, with length 3L, cos θ = cos θ2 = 0 and sin θ2 = 1, we find

\[ \mathbf{K}^{e=2} = \frac{EA}{3L}
\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 \end{bmatrix} \tag{1.69} \]

so Eq. (1.61) reads

\[ \frac{EA}{375L}
\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 125 & 0 & -125 \\ 0 & 0 & 0 & 0 \\ 0 & -125 & 0 & 125 \end{bmatrix}
\begin{Bmatrix} a_1^{e=2} \\ a_2^{e=2} \\ a_3^{e=2} \\ a_4^{e=2} \end{Bmatrix} =
\begin{Bmatrix} P_1^{e=2} \\ P_2^{e=2} \\ P_3^{e=2} \\ P_4^{e=2} \end{Bmatrix} \tag{1.70} \]

We have now established the element stiffness matrices and element load vectors and are ready to tackle the second step in the three stage process outlined in Sec. 1.2.2. By inspection we find the compatibility equations

\[ \begin{Bmatrix} a_1^{e=1} \\ a_2^{e=1} \\ a_3^{e=1} \\ a_4^{e=1} \end{Bmatrix} =
\begin{Bmatrix} a_1 \\ a_2 \\ a_5 \\ a_6 \end{Bmatrix}, \qquad
\begin{Bmatrix} a_1^{e=2} \\ a_2^{e=2} \\ a_3^{e=2} \\ a_4^{e=2} \end{Bmatrix} =
\begin{Bmatrix} a_3 \\ a_4 \\ a_5 \\ a_6 \end{Bmatrix} \tag{1.71} \]

Expanding the element degrees of freedom vectors to the structure degrees of freedom (Eq. (1.65)), we obtain

\[ \frac{EA}{375L}
\begin{bmatrix} 48 & 36 & 0 & 0 & -48 & -36 \\ 36 & 27 & 0 & 0 & -36 & -27 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ -48 & -36 & 0 & 0 & 48 & 36 \\ -36 & -27 & 0 & 0 & 36 & 27 \end{bmatrix}
\begin{Bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \\ a_6 \end{Bmatrix} =
\begin{Bmatrix} P_1^{e=1} \\ P_2^{e=1} \\ 0 \\ 0 \\ P_3^{e=1} \\ P_4^{e=1} \end{Bmatrix}
\qquad \text{or} \qquad \tilde{\mathbf{K}}^{e=1}\mathbf{a} = \tilde{\mathbf{f}}^{e=1} \tag{1.72} \]

from Eq. (1.68), and

\[ \frac{EA}{375L}
\begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 125 & 0 & -125 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -125 & 0 & 125 \end{bmatrix}
\begin{Bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \\ a_6 \end{Bmatrix} =
\begin{Bmatrix} 0 \\ 0 \\ P_1^{e=2} \\ P_2^{e=2} \\ P_3^{e=2} \\ P_4^{e=2} \end{Bmatrix}
\qquad \text{or} \qquad \tilde{\mathbf{K}}^{e=2}\mathbf{a} = \tilde{\mathbf{f}}^{e=2} \tag{1.73} \]

from Eq. (1.70), respectively.

The third and final step required to obtain the structure force–displacement relation Ka = f involves (force) equilibrium at the nodes. Our situation is depicted to the right; note that we invoked Newton's 3rd law to get the correct directions for the element forces P_j^{e=i}. We find the equilibrium conditions

[Figure: free bodies of nodes 1, 2 and 3 with the external forces f1–f6 and the element end forces P_j^{e=1} and P_j^{e=2} acting on them.]

\[ \begin{aligned}
&\text{Node 1:} & f_1 &= P_1^{e=1}, & f_2 &= P_2^{e=1} \\
&\text{Node 2:} & f_3 &= P_1^{e=2}, & f_4 &= P_2^{e=2} \\
&\text{Node 3:} & f_5 &= P_3^{e=1} + P_3^{e=2}, & f_6 &= P_4^{e=1} + P_4^{e=2}
\end{aligned} \tag{1.74} \]

Collecting these equations in vectors, we may write

\[ \begin{Bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \end{Bmatrix} =
\begin{Bmatrix} P_1^{e=1} \\ P_2^{e=1} \\ 0 \\ 0 \\ P_3^{e=1} \\ P_4^{e=1} \end{Bmatrix} +
\begin{Bmatrix} 0 \\ 0 \\ P_1^{e=2} \\ P_2^{e=2} \\ P_3^{e=2} \\ P_4^{e=2} \end{Bmatrix} \tag{1.75} \]

or in terms of the structure load vector (Eq. (1.66)) and the expanded element load vectors (Eqs. (1.72) and (1.73))

\[ \mathbf{f} = \tilde{\mathbf{f}}^{e=1} + \tilde{\mathbf{f}}^{e=2} \tag{1.76} \]

Since f̃^{e=i} = K̃^{e=i} a, cf. Eqs. (1.72) and (1.73), we hence have

\[ \mathbf{f} = \tilde{\mathbf{f}}^{e=1} + \tilde{\mathbf{f}}^{e=2} = \tilde{\mathbf{K}}^{e=1}\mathbf{a} + \tilde{\mathbf{K}}^{e=2}\mathbf{a} = \left( \tilde{\mathbf{K}}^{e=1} + \tilde{\mathbf{K}}^{e=2} \right)\mathbf{a} \tag{1.77} \]

or

\[ \mathbf{K}\mathbf{a} = \mathbf{f} \tag{1.78} \]

where

\[ \mathbf{K} = \tilde{\mathbf{K}}^{e=1} + \tilde{\mathbf{K}}^{e=2} = \frac{EA}{375L}
\begin{bmatrix} 48 & 36 & 0 & 0 & -48 & -36 \\ 36 & 27 & 0 & 0 & -36 & -27 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 125 & 0 & -125 \\ -48 & -36 & 0 & 0 & 48 & 36 \\ -36 & -27 & 0 & -125 & 36 & 152 \end{bmatrix} \tag{1.79} \]

1.3.3 Solving the FE Equations

We have established the force–displacement relation

\[ \frac{EA}{375L}
\begin{bmatrix} 48 & 36 & 0 & 0 & -48 & -36 \\ 36 & 27 & 0 & 0 & -36 & -27 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 125 & 0 & -125 \\ -48 & -36 & 0 & 0 & 48 & 36 \\ -36 & -27 & 0 & -125 & 36 & 152 \end{bmatrix}
\begin{Bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \\ a_6 \end{Bmatrix} =
\begin{Bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \end{Bmatrix} \tag{1.80} \]

for our example problem. Prior to solving the equations, we must sort out which force and displacement variables are known, and which are yet unknown, very much the same way as was done in the spring system example. In the present case, nodes 1 and 2 are fixed to the ground, so we know some node displacements to be zero, viz. a_1 = a_2 = a_3 = a_4 = 0; the corresponding node forces (f_i, i = 1, 2, 3, 4) are unknown support forces. At node 3 there is a horizontal force P, but the vertical force component is zero, so we know that f_5 = P and f_6 = 0 while the displacements a_5 and a_6 are yet unknown. Hence, Eq. (1.80) becomes

\[ \frac{EA}{375L}
\begin{bmatrix} 48 & 36 & 0 & 0 & -48 & -36 \\ 36 & 27 & 0 & 0 & -36 & -27 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 125 & 0 & -125 \\ -48 & -36 & 0 & 0 & 48 & 36 \\ -36 & -27 & 0 & -125 & 36 & 152 \end{bmatrix}
\begin{Bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ a_5 \\ a_6 \end{Bmatrix} =
\begin{Bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ P \\ 0 \end{Bmatrix} \tag{1.81} \]

It is observed that we have unknowns in both the left and right hand sides. As with the spring structures, one solves for the unknowns in the left hand side first. The last two equations in Eq. (1.81) are

\[ \frac{EA}{375L} \begin{bmatrix} 48 & 36 \\ 36 & 152 \end{bmatrix} \begin{Bmatrix} a_5 \\ a_6 \end{Bmatrix} = \begin{Bmatrix} P \\ 0 \end{Bmatrix} \tag{1.82} \]

while the first four read

\[ \begin{Bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \end{Bmatrix} = \frac{EA}{375L}
\begin{bmatrix} -48 & -36 \\ -36 & -27 \\ 0 & 0 \\ 0 & -125 \end{bmatrix}
\begin{Bmatrix} a_5 \\ a_6 \end{Bmatrix} \tag{1.83} \]

Hence, having solved for the unknown node variables (Eq. (1.82)), we obtain the reaction forces from Eq. (1.83). One finds

\[ \begin{Bmatrix} a_5 \\ a_6 \end{Bmatrix} = \frac{PL}{4EA} \begin{Bmatrix} 38 \\ -9 \end{Bmatrix}, \qquad
\begin{Bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \end{Bmatrix} = \frac{P}{4} \begin{Bmatrix} -4 \\ -3 \\ 0 \\ 3 \end{Bmatrix} \tag{1.84} \]

The reader is encouraged to verify that Eq. (1.84) solves Eqs. (1.82)–(1.83). Also note that the truss is in equilibrium since the net force and torque are zero.

[Figure: free body of the complete truss showing the applied load P and the support reactions of magnitude P and 3P/4, with the dimensions 4L and 3L.]
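The verification is also easily scripted. The MATLAB sketch below solves the partitioned system numerically with the illustrative values EA = 1, L = 1 and P = 1, and reproduces Eq. (1.84):

    EA = 1; L = 1; P = 1;                            % illustrative values
    K = EA/(375*L)*[ 48  36  0    0  -48  -36;
                     36  27  0    0  -36  -27;
                      0   0  0    0    0    0;
                      0   0  0  125    0 -125;
                    -48 -36  0    0   48   36;
                    -36 -27  0 -125   36  152 ];     % Eq. (1.79)
    f = [0; 0; 0; 0; P; 0];
    free = [5 6];  fixed = 1:4;
    a = zeros(6,1);
    a(free)   = K(free,free)\f(free);                % a5 = 38/4, a6 = -9/4 (times PL/EA)
    reactions = K(fixed,:)*a;                        % [-1 -0.75 0 0.75]*P, cf. Eq. (1.84)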

1.3.4 Calculating the Bar Forces

Solving the FE–equations (1.78)/(1.80), we find the node displacements (and support forces), but these are rarely of any major interest. In practice we want to find the axial forces N in the bars or, equivalently, the axial stresses σ = N/A, since these determine whether the truss will be able to withstand the load or not. However, once the node displacements are known, it is possible to calculate the axial extension, and thus the axial force, of each bar in the truss.

From Eqs. (1.38) and (1.39) we have that P̄_3^e = N and P̄_3^e = −(EA/L)(ā_1^e − ā_3^e), so that

\[ N = \frac{EA}{L}\left( \bar{a}_3^e - \bar{a}_1^e \right) \tag{1.85} \]

Using Eqs. (1.46) and (1.47) to transform the node displacements to the global coordinate system (x, y), we find

\[ N = \frac{EA}{L}\left[ \left( a_3^e - a_1^e \right)\cos\theta + \left( a_4^e - a_2^e \right)\sin\theta \right] \tag{1.86} \]

and note that the element degrees of freedom a_i^e may be retrieved from the solution vector a.

For element 1 in the above truss example, we have the element length 5L, cos θ = 4/5, sin θ = 3/5, a_1^e = a_1 = 0, a_2^e = a_2 = 0, a_3^e = a_5 = 38PL/(4EA), and a_4^e = a_6 = −9PL/(4EA); substitution into Eq. (1.86) gives us

\[ N = \frac{EA}{5L}\left[ \left( \frac{38PL}{4EA} - 0 \right)\frac{4}{5} + \left( \frac{-9PL}{4EA} - 0 \right)\frac{3}{5} \right] = \frac{5P}{4} \tag{1.87} \]

The reader is encouraged to show that Eq. (1.86) yields N = −3P/4 for the second bar element in the example problem.

Remark: Usually the primary unknown (i.e. the node variables) is of minor interest when boundary value problems (differential equations with boundary conditions) are solved with FE, but some post–processing is required to obtain the quantities sought for, similarly to what we did above to find N. For instance, in an elasticity problem the node variables represent displacements, but one wants to find stresses or strains which are obtained as combinations of first derivatives of the displacements. In a heat flow problem the node variables represent temperatures, while the heat flow is obtained as combinations of first derivatives of the temperature.
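Such post–processing is what CALFEM's bar2s performs for the bar element. A minimal MATLAB sketch of the computation in Eq. (1.86) — the function name and interface here are illustrative, not the CALFEM definitions:

    function N = bar2d_force(ex, ey, E, A, ae)
    % BAR2D_FORCE  Axial force in a 2D bar element, Eq. (1.86).
    %   ex, ey : global node coordinates of the element
    %   ae     : element displacement vector [a1 a2 a3 a4]' in global components
    dx = ex(2) - ex(1);  dy = ey(2) - ey(1);
    L  = sqrt(dx^2 + dy^2);
    c  = dx/L;  s = dy/L;
    N  = (E*A/L)*( (ae(3) - ae(1))*c + (ae(4) - ae(2))*s );
    end

For element 1 of the example (ex = [0 4], ey = [0 3] in units of L, and ae = [0 0 38/4 -9/4]'·PL/EA) this returns 5P/4, in agreement with Eq. (1.87).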

1.3.5 Problems

Exercise: Study the plane truss example exs3 in Ch. 9 of the CALFEM manual. Note in particular how the geometry of an element is defined; the x– and y–coordinates of the two element nodes are supplied in vectors ex and ey, respectively. This information is sufficient to calculate the element length L as well as the cosine and sine of the orientation angle θ, cf. Eqs. (1.62–63).

The function bar2e returns the element stiffness matrix for a bar element (Eq. (1.62)), while bar2s is used to calculate bar forces N once the node variables have been calculated (cf. Eq. (1.86)).

Exercise: Use MATLAB/CALFEM to analyse the example problem in Sec. 1.3.2. Verify the solution Eq. (1.84) and the axial forces in the two bars.

Exercise: Example exs4 in Ch. 9 of the CALFEM manual is a plane truss consisting of 10 bars. Note how the assembly process is carried out in a for–loop.

Also pay attention to the tedious and error prone work to supply the element node coordinates in matrices Ex and Ey (one row per element). For instance, there are 5 bar elements that share node 3, so its x– and y–coordinates are actually given 5 times. At the end of exs4, it is shown how the CALFEM function coordxtr may be used to obtain the element coordinates Ex and Ey; learn how to use this function — it will save you a lot of work.

Exercise: The depicted structure is loaded with a force P = 20 kN. Let L = 1.5 m, E = 210 GPa and calculate the displacement of the loaded joint, given that the cross–section area of each bar is A = 2124 mm² (answer: u_x = −0.7 mm, u_y = −3.9 mm). Then set A = 1530 mm² and calculate the new displacement components (answer: u_x = −0.9 mm, u_y = −5.4 mm).

[Figure: a plane truss built from bars, with joints spaced a distance L apart and the force P applied at one of the joints.]

  • Number elements and nodes; here each bar is an element and the joints are nodes. Number the degrees of freedom; there are 2 degrees of freedom in each node, viz. the displacement in x– and y–direction, respectively. Be systematic! The degrees of freedom (joint displacements) are the primary unknowns in the problem; once calculated, everything else (such as bar forces) is easily obtained.

  • Look up the function coordxtr in the CALFEM manual and create the matrices Edof (5 columns), Coord (2 columns) and Dof (2 columns). Then use coordxtr to obtain the element coordinates Ex and Ey. Using eldraw2, you can draw the truss. This gives a check that you got the geometry input data correct.

  • Use a for–loop to create the element stiffness matrix (function bar2e) for one element at a time and assemble (assem) it to the structure stiffness matrix K (note that K has to be created as a zeros matrix before you invoke the for–loop). Now create a structure load vector f and insert the load P. Use solveq to solve the equation system Ka = f — do not forget to supply the boundary conditions. The solution a contains all node displacements (x and y in the order you numbered the degrees of freedom). The function extract gives us the displacements element–wise in a matrix (one row per element) and eldisp2 may thereafter be used to draw the deformed structure.

  • Using bar2s, you may compute the axial force N in any element (bar).

2. Second Order Problems in One Dimension

In this chapter we will study one–dimensional problems, i.e. problems that involve a single independent variable x only. The differential equations we work with will have the form

\[ -\frac{d}{dx}\left( D(x)\,\frac{du}{dx} \right) = f(x) \tag{2.1} \]

where f is a given forcing function, D > 0 is some constitutive function (also known), and u(x) is the primary unknown. The differential equation will be defined over some interval and we impose boundary conditions at the interval ends, in order to ensure that the solution is unique. Thus, we deal with boundary value problems (BVP).

Remark: We refer to u as primary unknown since it appears as the unknown function in the BVP, but it usually is of little interest in itself; in practice some derivative of the unknown is sought for. For instance, in an elasticity problem (an axially loaded linear elastic rod) u(x) represents the axial displacement at x, while one usually is more interested in the internal axial force N(x) = D(x) du/dx. Here du/dx = ε is the strain and D(x) = EA(x) is the axial stiffness.

The first step in using finite elements to approximate the solution of a BVP is to cast the problem into a variational form. The variational problem is derived from the BVP but no approximation is involved, so both problems have the same solution. One advantage with the variational form is that the derivatives of the unknown function are of lower order than in the BVP; for this reason, the BVP is sometimes named the strong form of the problem, while the variational problem is called the weak form. Since the derivatives are of lower order, the regularity requirements are lower and it is easier to approximate the unknown.

In the second step we introduce an approximation u ≈ u_h in the variational problem. The approximation function u_h does not involve any unknown functions, but does embrace a number of variables a_i, i = 1, 2, …, called the node variables. In solving the problem, we wish to calculate node variable values so that u_h is as good an approximation as possible (in some sense) to the unknown. We name this second step the FE–formulation.

[Figure: overview of the two steps — the strong form (BVP: differential equation and boundary conditions; solution u(x)) is recast by the variational formulation, without approximation, into the weak form (variational problem; solution u(x)); the FE–formulation then introduces the approximation and yields the discrete FE problem Ka = f with solution u_h(x) = Na = Σ_i N_i(x) a_i ≈ u(x).]

We deal with the variational formulation and the FE–formulation in sections 2.2 and 2.3, respectively, but first we shall describe some problems whose mathematical description is given by Eq. (2.1).

2.1 Modelling

Prior to going into the details about how to approximate the solution of the differential equation (2.1) by means of the finite element method, we provide a brief engineering background. In this section we describe three different problems whose solutions are found by solving an equation of the type (2.1).

2.1.1 The Pre–Tensioned String

Our first model problem is the transversely loaded elastic string. Thus, we consider an elastic string that is pre–tensioned by a force S and let the x–axis be aligned with the undeflected string. When a transverse load with intensity q(x) (force/length) is applied, we get a displacement w(x) in the z–direction. We now seek the force–displacement relation. It is assumed that the displacement is small compared to the length of the string and that the angle θ ≪ 1 so that dw/dx = tan θ ≈ θ. Under these assumptions, S will be constant along the string.

[Figure: left, the pre–tensioned string with tension S, transverse load q(x) and deflection w(x); right, a free body of a segment of length Δx with the load qΔx, the string force S at an angle θ(x), and its horizontal and vertical components H(x), V(x) and H(x+Δx), V(x+Δx).]

Now consider equilibrium of a segment of length Δx according to the right part of the illustration above. At each end of the segment, the pre–tension force S is decomposed into vertical and horizontal components. Horizontal equilibrium shows that the horizontal component H of S has to be constant (H(x) = H(x+Δx)); furthermore, it is seen that H = S cos θ ≈ S and that V = H tan θ = H dw/dx, i.e. we have

\[ \frac{dV}{dx} = H\,\frac{d^2 w}{dx^2} \tag{2.2} \]

Vertical equilibrium requires that V(x+Δx) − V(x) + qΔx = 0; dividing by Δx, we get, in the limit as Δx → 0, that

\[ -\frac{dV}{dx} = q \tag{2.3} \]

Equations (2.2) and (2.3) give us the force–deflection relation −H d²w/dx² = q or, since H ≈ S,

\[ -\frac{d^2 w}{dx^2} = \frac{q}{S} \tag{2.4} \]

which is of the type Eq. (2.1).

Formally, given q and S we could solve Eq. (2.4) by integration. This yields two constants of integration, so two boundary conditions, one at each end, are necessary to find a unique solution. Since we have a second order differential equation, the conditions involve the unknown w and/or its first derivative dw/dx. A couple of examples are depicted in the illustration below.

[Figure: left, a string of length L fixed at both ends, giving the BVP −d²w/dx² = q/S, 0 < x < L, with w(0) = 0 and w(L) = 0; right, the symmetric half of the same problem (with H = S), giving −d²w/dx² = q/S, 0 < x < L/2, with w(0) = 0 and dw/dx = 0 at x = L/2.]
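As a simple special case, take the load intensity q to be constant; integrating Eq. (2.4) twice and using the boundary conditions of the first problem gives

\[
-\frac{d^2 w}{dx^2} = \frac{q}{S}, \quad w(0) = w(L) = 0
\qquad\Longrightarrow\qquad
w(x) = \frac{q}{2S}\,x\,(L - x),
\]

so the largest deflection, w(L/2) = qL²/(8S), occurs at mid–span; it grows with the load and decreases with the pre–tension S. This closed–form solution is also a handy check for FE–approximations later on.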

2.1.2 Linear Elasticity

Next we consider a one dimensional elasticity problem, viz. an axially loaded elastic bar. We let the x–axis be aligned along a bar with cross–sectional area A(x). The material is assumed to be linear elastic with Young's modulus (modulus of elasticity) E. We also introduce the volume load K_x(x) (force/volume) acting along the x–axis; for instance, if we want to consider the weight of the bar, with gravity acting in positive x–direction, we would have K_x = ρg. Due to the loading each cross–section of the bar will displace along the x–axis; we denote this displacement u(x).

[Figure: a bar aligned with the x–axis, with cross–sectional area A(x), volume load K_x(x) and axial displacement u(x).]

In the problem described above, E, A, and K_x are assumed to be known quantities, while u(x) is the primary unknown and we wish to establish a force–displacement relation. To this end we invoke three basic relations: a kinematic relation, a constitutive law, and an equilibrium requirement.

Kinematics. First we will find out how the displacement relates to the deformation of the bar. Consider two nearby cross–sections that in the un–displaced configuration are located at x and x + Δx, respectively. When some load is applied to the bar, the displacements become u(x) and u(x+Δx), respectively. The original length of the segment between the two cross–sections is Δx, while the length in the deflected position is Δx + u(x+Δx) − u(x). Thus the extension is u(x+Δx) − u(x)

and the relative elongation is [u(x+Δx) − u(x)]/Δx. We now define the axial strain ε, or deformation, as the relative elongation in the limit Δx → 0

\[ \varepsilon = \frac{du}{dx} \tag{2.5} \]

The deformation of the material causes an interior axial force N(x) to develop; the axial stress, denoted σ, may here be defined as σ = N/A, i.e. as force per unit area. Note that the axial force and the stress are positive in tension (ε > 0) and negative in compression (ε < 0). The relation between stress and strain is given by some constitutive law. If the material behaves linearly elastic, the stress is proportional to the deformation

\[ \sigma = E\varepsilon \tag{2.6} \]

which is known as Hooke's law; the constant of proportionality E, named Young's modulus or the modulus of elasticity, is a material parameter.

Equilibrium requires that the sum of the forces in the x–direction cancels out. Once again we study a small segment Δx of the bar, acted on by the section forces N(x) = (σA)(x) and N(x+Δx) = (σA)(x+Δx) and by the volume load K_x A·Δx, and find that

\[ (\sigma A)(x+\Delta x) + K_x A\,\Delta x - (\sigma A)(x) = 0 \]

Rearranging terms and dividing by Δx, one obtains −[(σA)(x+Δx) − (σA)(x)]/Δx = K_x A and, as Δx → 0,

\[ -\frac{d}{dx}\left[ \sigma A \right] = K_x A \tag{2.7} \]

which is our equilibrium condition.

Combining Eqs. (2.5) and (2.6), we get the stress in terms of the displacement as σ = E du/dx. Inserting into Eq. (2.7), we obtain the force–displacement relation

\[ -\frac{d}{dx}\left( EA\,\frac{du}{dx} \right) = K_x A \tag{2.8} \]

Hence, with D(x) = EA(x) and f(x) = (K_x A)(x), Eq. (2.1) is a mathematical model for an axially loaded elastic bar and once again we need two boundary conditions, one at each end, to ensure a unique solution. Since we have a second order differential equation, the boundary conditions may be given on the unknown function u and/or on its first derivative du/dx. A couple of examples follow.

[Figure: a bar clamped at x = 0, loaded by the distributed load K_x A(x) and by the end force P at x = L, where the section force is N(L).]

First consider a bar of length L and with axial stiffness EA(x). The bar is clamped at its left end, loaded by a distributed load K_x A(x) (force/length), and by a force P at the right end. The mathematical model is given by Eq. (2.8) and we need two boundary conditions. At the left end we have trivially that u(0) = 0, since the clamping means that the axial displacement is constrained, but at

the other end the displacement is unknown (until we have solved the problem). Thus we

need to find a condition on the derivative ; this can be accomplished by considering equi-

librium

(2.9)

Now, we have that and from Hooke’s law that , so that . Using the kin-

ematic relation Eq. (2.5), we hence obtain . Substituting into Eq. (2.9), we

get the desired boundary condition . The unknown is thus the solution of

the boundary value problem

(2.10)

As a second example we consider the same barproblem as above, but now the structure is sup-

ported by an elastic spring with stiffness at its

right end. Apart from the external force and the

normal force we now also have a spring force

acting at . The equilibrium equation (2.9)

thus becomes

(2.11)

Once again using , we now find the BVP

(2.12)

It may be worthwhile to somewhat contemplate the two BVP:s Eqs. (2.10) and (2.12). We havehere introduced the three types of boundary conditions that may appear in conjunction withsecond order differential equations. The first type, also known as a Dirichlet condition, is when

the value of the primary unknown is prescribed, such as in the two examples above.

In the second type we have a prescribed value of the derivative , as at in Eq. (2.10);

this type is named a Neumann condition. The third type, Danckwerts or Robin condition,

assigns a value to a linear combination of and on the boundary, as at in Eq.

(2.12).

u L( )

xd

du

P N L( )– 0=

N σA= σ Eε= N EAε=

N L( ) EAxd

du

x L=

=

xd

du

x L=

P

EA L( )----------------= u x( )

xd

d– EA

xd

duKxA= 0 x L< <

u 0( ) 0=

xd

du

x L=

P

EA L( )----------------=

0 L

EA x( )

KxA x( ) P

x

PN L( )

ku L( )u L( )

k

k

P

N L( )ku L( ) x L=

P N L( )– ku L( )– 0=

N L( ) EAxd

du

x L=

=

xd

d– EA

xd

duKxA= 0 x L< <

u 0( ) 0=

xd

du

x L=

k

EA L( )---------------- u L( )+

P

EA L( )----------------=

u 0( ) 0=

xd

dux L=

uxd

dux L=

Page 36: Compiled lecture notes

— 36 —

The boundary condition types will affect the FE–approximation in different ways. Dirichlet

conditions will modify the stiffness matrix in the equation system , since some of the

node variables in are known from the boundary condition, while the corresponding ele-

ments in the load vector will be yet unknown ‘reaction forces’. Hence, we will have

unknowns both in the left and right hand sides of the equation system when Dirichlet condi-tions are at hand.

Neumann conditions on the other hand, represents some type of loading as seen in Eq.

(2.10), and will end up in the load vector . It will be seen later, in the weak form of the BVP,

that Neumann conditions will be ‘combined’ with the differential equation into a single equa-tion. When we approximate the solution of the BVP by a finite element solution of this weakproblem, we will hence also approximate any Neumann condition. Conditions of this type aretherefore referred to as natural boundary conditions. In contrast to this, Dirichlet conditionsare called essential; these are strictly enforced.

Finally, a Robin condition will affect both the stiffness matrix and the load vector . This

will be elaborated on in an example later on.

2.1.3 One–Dimensional Heat Flow

Our third and final example of a problem described by thedifferential equation (2.1) is heat energy flow in one

dimension. Here the primary unknown function is a

temperature while the loading (right hand side of Eq.

(2.1)) will be denoted and is a supplied heat energy per

unit volume and time . The heat source may for

instance be a chemical reaction, such as e.g. when concrete solidifies through hydration, ordue to a radioactive decay as in a fuel rod in a nuclear reactor. In many other problems we

simply have and the energy flow is only driven by at temperature difference

between the ends of the studied interval ( ), such as the temperature difference

between the inside and outside of a building wall.

The heat energy flow per unit time through an area is denoted and is defined as

positive in the positive –direction. According to Fouriers law, the heat flow is proportional to

the temperature gradient , and is directed against it (i.e. the heat flow is directed from a

warmer towards a cooler region)

(2.13)

(cf. Hooke’s law); here, is a material property, viz. the thermal conductivity.

Let us now consider energy balance for a slice of length along

the domain. The heat energy inflow per time unit is

(2.14)

while the energy outflow during a unit time is found as

(2.15)

K Ka f=

a

f

f

K f

A x( )

x

x1x2

Q x( )

q x( )u x( )

°C[ ]Q x( )

J

m3

s⋅--------------

Q 0≡ q x( )x x1 x2,[ ]∈

q x( ) J

m2

s⋅--------------

x

xd

du

q kxd

du–=

kJ

m s °C⋅ ⋅----------------------

Q x( )

q x ∆x+( )q x( )

x x ∆x+

∆x

Qinflow Q x( ) A x( ) ∆x⋅ ⋅ q x( )A x( )+=

Qoutflow q x ∆x+( )A x ∆x+( )=

Page 37: Compiled lecture notes

— 37 —

One realize that if the in– and outflows differ, the temperature will change with time inside

the studied volume . The temperature increases in case , while

yields a decrease in temperature. At stationary conditions, i.e. when the tem-

perature does not change with time, we must have ; from Eqs. (2.14) and

(2.15) we then obtain and in the limit

(2.16)

Substituting the Fourier law Eq. (2.13) into Eq. (2.16), we arrive at the heat supply–tempera-ture relation

(2.17)

which is an equation of the form Eq. (2.1).

In solving the governing differential equation (2.17), one will find two constants of integration,so as with the previous examples we need two boundary conditions to find a unique solution.In case the temperature is known on any end of the considered domain, one will of course

supply a boundary condition expressed in ; that would be a Dirichlet, or essential, condi-

tion. If the heat flow across a boundary is known, it is possible to give a condition on the

first derivative of the unknown function , since (Eq. (2.13)) . Hence, this yields a

Neumann, or natural, boundary condition. A special case is when it is known that there is noheat flow across a boundary, which occurs at a point of symmetry or if a perfect insulation is

assumed; one then has a homogeneous Neumann condition .

A Robin condition, where a combination of and is prescribed, will be at hand in a case of

convective heat exchange. Here the heat flow will be proportional to the difference between

the boundary temperature and the temperature of the surrounding media. An example

is given in the figure below, where the constant of proportionality is denoted ;

note that both and are unknown at and that the Robin condition only prescribes a

relation between them.

A x( ) ∆x⋅ Qinflow Qoutflow>

Qinflow Qoutflow<

u Qinflow Qoutflow=

qA( ) x ∆x+( ) qA( ) x( )–

∆x---------------------------------------------------------- QA= ∆x 0→

xd

dqA[ ] QA=

xd

dkA

xd

du– QA=

u

q

uxd

du q–

k------=

xd

du0=

uxd

du

q

u u0

α J

m2 °C s⋅ ⋅

-------------------------

uxd

dux h=

u 20°C=

q k h( )xd

du

x h=

– α u h( ) u0–( )= =

u0xd

dk

xd

du– 0= 0 x h< <

u 0( ) 20°C=

xd

du

x h=

αk h( )----------u h( )+

αk h( )----------u0=

(surrounding

x

h0

temperature)

Page 38: Compiled lecture notes

— 38 —

2.2 The Weak Problem

When one uses the finite element method to approximate the solution of a differential equa-

tion, the equation must have a unique solution or otherwise the structure stiffness matrix

becomes singular and the FE equation system cannot be solved numerically. Hence, in

solving a second order equation like Eq. (2.1), we need two boundary conditions in order toobtain a well defined solution. In effect, we are actually approximating the solution of someboundary value problem (BVP), e.g. like the ones given as examples above. We remark that atleast one Dirichlet or a Robin condition is required to pin out a unique solution, i.e. two Neu-mann conditions will not be sufficient.

The first step in formulating a finite element method for a BVP is to recast it into its weakform. This process is referred to as the variational formulation, which involves the introduc-

tion of an almost arbitrary test function ; (in engineering textbooks is frequently

called ‘weight function’). By using integration by parts, we may reduce the highest order

derivative of the unknown function , which makes it easier to construct an approximation.

The fact that the derivative order is lower in the variational problem than in the original BVP,warrants the term ‘weak problem’ or ‘weak formulation’. In our case the formulation requiresthe functions to have first derivatives only, while the original BVP explicitly defines a secondderivative (cf. Eq. (2.1)); the latter formulation (the BVP) is analogously termed the ‘strong for-mulation’.

Let us perform a variational formulation of aBVP. As a model problem we choose the onedimensional elasticity problem described by Eq.(2.10), repeated here for convenience

(2.18)

Here is a given constant and we recall that and are known functions;

is our primary unknown.

We now introduce a test function . Note that this is not something unknown, but we

are free to choose just about any function, although, as will be seen, there are some restric-tions on permissible choices. We multiply both sides of the differential equation by the testfunction and integrate over the interval on which the problem is defined, to get

(2.19)

The left hand side is now integrated by parts, which yields

(2.20)

Here, the boundary terms that comes out from the integration have been moved to the righthand side. Note that we only face first derivatives in Eq. (2.20), while the original problem Eq.

(18) involved a second derivative of the unknown .

K

Ka f=

v x( ) v x( )

u

x

0 L

EA

KxA

P

xd

d– EA

xd

duKxA= 0 x L< <

u 0( ) 0=

xd

du

x L=

P

EA L( )----------------=

P D x( ) EA= f x( ) KxA=

u u x( )=

v v x( )=

vxd

dEA

xd

duxd

0

L

∫– vKxA xd

0

L

∫=

xd

dvEA

xd

duxd

0

L

∫ vKxA xd

0

L

∫ vEAxd

du

0

L

+=

u

Page 39: Compiled lecture notes

— 39 —

This far we have only worked on the differential equation and it is now time to involve theother two equations in Eq. (2.18), i.e. we now consider the boundary conditions. To this endwe expand the second term in the right hand side of Eq. (2.20)

(2.21)

Here we could make use of the condition at , but at the derivative is unknown.

Note that is proportional to the reaction force due to the ‘clamping’ , and can

be computed once we have solved for , but is unknown at this stage. The model problem we

selected is actually statically determined and we could calculate the reaction force (and thus

) through a simple equilibrium equation, but in a more general problem this is not pos-

sible.

Now, to get a unique solution we need to know the conditions at the boundary such as e.g.

expressed in the BVP (2.18), but in Eq. (2.20) we can only specify the condition at , while

the term at remain unknown. In order to resolve this obstacle, we restrain ourselves to

only select test functions that vanish at , i.e. the considered test functions all satisfies

. Imposing this restriction on Eq. (2.21) and substituting the result into Eq. (2.20), we

arrive at the variational problem (weak form of (2.18))

Find such that (2.22)

Apart from what have already been mentioned, two things should be noted in Eq. (2.22).

First, we have omitted the boundary term , so the integral equation is valid only

when this term is zero; this is ensured by restricting the formulation to test functions that

satisfies . The second remark is on the essential boundary condition : it was

never used in obtaining the integral equation in Eq. (2.22) and, hence, we must state itexplicitly just as with the restriction on the test function. The integral equation in (2.22)

stems from the differential equation and the natural boundary condition (at ) of Eq.

(2.18); the essential boundary condition (at ) as well as the restriction on the test func-

tion have to be stated explicitly. Without the statements and , the variational

formulation is incomplete and of little use since the solution either does not exist or is notunique.

To be yet somewhat more strict, we mention that the test function has to be regular enoughfor the integrals to exist. It can be shown that this will be the case if the integral of thesquared functions exists and that the integral of the squared first derivative exists, i.e. if

(2.23)

vEAxd

du

0

L

v L( ) EAxd

du

x L=

v 0( ) EAxd

du

x 0=

– Pv L( ) v 0( ) EAxd

du

x 0=

–= =

x L= x 0=xd

du

xd

du

x 0=

u 0( ) 0=

u

xd

du

x 0=

x L=

x 0=

x 0=

v 0( ) 0=

u xd

dvEA

xd

duxd

0

L

∫ vKxA xd

0

L

∫ Pv L( )+=

u 0( ) 0= v 0( ) 0=

v 0( ) EAxd

du

x 0=

v 0( ) 0= u 0( ) 0=

x L=

x 0=

u 0( ) 0= v 0( ) 0=

v2

xd

0

L

∫ ∞<xd

dv

2

xd

0

L

∫ ∞<

Page 40: Compiled lecture notes

— 40 —

Also, these regularity conditions has to be satisfied by the solution as well. A nice way to

express the variational problem Eq. (2.22), including the regularity conditions, is obtained ifwe define the function space

(2.24)

In other words, is the ‘set of functions’ that satisfies the essential boundary condition and

is regular enough for the integral equation in (2.22) to be meaningful. The variational problemcan now be stated as:

Find such that (2.25)

Notice how the conditions and are expressed here.

Briefly consider the BVP Eq. (2.18) once again. The right hand side of the differential equation

( ) and of the Neumann condition ( ) represents ‘loads’ on the system, while the unknown

( ) represents the system response. We may think of as the collection of all responses that

are a solution of the BVP for some function in combination with some constant .

2.3 Finite Element Formulation

Carefully observe that no approximation has been invoked to recast the BVP Eq. (2.18) intothe variational problem Eq. (2.25) (or (2.22)). In particular this means that both problemshave the same solution. Hence, if we want to approximate the solution of the BVP, we may aswell try to approximate the solution of the variational problem. As mentioned in the prelimi-naries of this chapter, it is less complicated to approximate the weak problem since it onlyrequires that our approximation has a first derivative, while the BVP Eq. (2.18) calls for a welldefined second derivative.

Using the finite element method, we first construct a set of basis functions ,

. The approximation, denoted , is then constructed as a linear combinations of

the basis functions

(2.26)

The coefficients in this construct are called node variables; if we collect these in a column

vector and the basis functions in a row vector

(2.27)

we may write

(2.28)

u

V v: v 0( ) 0 v2

xd

0

L

∫ ∞< xd

dv

2

xd

0

L

∫ ∞<=

=

V

u V∈xd

dvEA

xd

duxd

0

L

∫ vKxA xd

0

L

∫ Pv L( )+= v V∈∀

u 0( ) 0= v 0( ) 0=

KxA P

u V

KxA P

Ni Ni x( )=

i 1 2 … n, , ,= uh

u uh≈ N1a1 N2a2 … Nnan+ + + Niai

i 1=

n

∑= =

ai

N N1 N2 … Nn= a

a1

a2

…an

=

u uh≈ Na=

Page 41: Compiled lecture notes

— 41 —

When this approximation is substituted into the integral in Eq. (2.25), we need to evaluate

; using Eq. (2.26), we obtain

(2.29)

where we introduced

(2.30)

Replacing by the FE–approximation in Eq. (2.25), we thus get

(2.31)

Note that, at this stage, the restriction on and , viz. , is pro tem ignored;

restrictions imposed by Dirichlet (or essential) conditions are dealt with when the resulting

equation system is to be solved, as will be illustrated in some examples in a subse-

quent section.

In Eq. (2.31) there is no unknown function, but we have unknown node variables ,

. Hence, we need equations to solve for ; we obtain these by selecting differ-

ent test functions . For each choice of the integrals in Eq. (2.31) may be evaluated to yield

an equation where the node variables appear as unknowns. While several candidates for testfunctions indeed are present, one almost exclusively uses the so called Galerkin method inthe context of finite elements. This means that the basis functions are used as test functions

to give the equations

(2.32)

We remark that the Galerkin method gives us a ‘best approximation’ in a well defined anddesirable way, which is why it is so extensively used; one needs very good reasons not to useit. In a later chapter we shall return to this topic and show why the Galerkin method works sowell.

If we collect the equations (Eq. (2.32)) row–wise, we have

(2.33)

or somewhat more compact

xd

duh

xd

duh

xd

dN1a1 xd

dN2a2 …

xd

dNnan+ + +

xd

dNiai

i 1=

n

∑ Ba= = =

Bxd

dN

xd

dN1

xd

dN2 …xd

dNn= =

u

xd

dvEAB xd

0

L

a vKxA xd

0

L

∫ Pv L( )+=

u v u 0( ) v 0( ) 0= =

Ka f=

n ai

i 1 2 … n, , ,= n a n

v v

n

xd

dNiEAB xd

0

L

a NiKxA xd

0

L

∫ PNi L( )+= i 1 2 … n, , ,=

n

xd

d

N1

N2

…Nn

EAB xd

0

L

a

N1

N2

…Nn

KxA xd

0

L

∫ P

N1 L( )

N2 L( )

…Nn L( )

+=

Page 42: Compiled lecture notes

— 42 —

(2.34)

With the notation

(2.35)

we may finally write

(2.36)

Solving for the node variable vector , we have the approximate solution according to Eq.

(2.28). As previously remarked, one is usually more interested in the derivative of the solu-tion. Here its approximation is obtained as

(2.37)

with defined by Eq. (2.30). For instance, in our one dimensional model problem Eq. (2.18)

we would most likely be interested in the axial force in the bar. Here we would have

(2.38)

We have seen that the i:th equation in Eq. (2.36) was obtained from the test function choice

, and we also note that the system of equations is linear. Since Eq. (2.31) is satisfied

with selected as anyone of the basis function, it is hence satisfied if the test function is cho-

sen to be any linear combination of basis functions. This makes it possible to express the FE–formulation in a manner similar to the variational problem Eq. (2.25). To this end we definethe finite element space

(2.39)

i.e. all functions that can be written as a linear combination of the selected basis functions

are in . Hence, the name basis functions: they constitute a basis for . We now have

the FE–formulation

Find such that (2.40)

To see that this expression is identical to the FE–formulation Eq. (2.34), we first note that

‘Find ’ means that we are looking for a function that can be expressed as a linear com-

bination of the basis functions: . Substituting into the FE formulation Eq. (2.40), we

get

BT

EAB xd

0

L

a NT

KxA xd

0

L

∫ PNT

L( )+=

K BT

EAB xd

0

L

∫= f NT

KxA xd

0

L

∫ PNT

L( )+=

Ka f=

a

xd

du

xd

duh

xd

dNa Ba= =≈

B

N x( )

N x( ) σ x( )A x( ) EA x( )ε x( ) EA x( )xd

duEA x( )Ba≈= = =

v Ni=

v

Vh v: v ciNi

i 1=

n

∑=

=

Ni x( ) Vh Vh

uh Vh∈xd

dvEA

xd

duhxd

0

L

∫ vKxA xd

0

L

∫ Pv L( )+= v Vh∈∀

uh Vh∈

uh Na=

Page 43: Compiled lecture notes

— 43 —

(2.41)

cf. Eq. (2.31). Next we realize that ‘ ’ states that the equation should hold for any arbi-

trary selected test function that can be written as a linear combination of the basis functions.

Choosing arbitrary coefficients , , we consequently have

(2.42)

With the column vector

(2.43)

we may express Eq. (2.42) as

(2.44)

and we get the derivative

(2.45)

Substituting Eqs. (2.44) and (2.45) into Eq. (2.41), we get

(2.46)

or, subsequent to rearranging terms,

(2.47)

Hence, and the vector inside the curly brackets are orthogonal; however, is arbitrary so

the latter has to be a zero vector (since the zero vector is the only vector that is orthogonal toall other vectors)

(2.48)

xd

dvEAB xd

0

L

a vKxA xd

0

L

∫ Pv L( )+= v Vh∈∀

v Vh∈∀

ci i 1 2 … n, , ,=

v c1N1 c2N2 … cnNn+ + +=

c c1 c2 … cn

T=

v Nc cT

NT

= =

xd

dv

xd

dc1N1 c2N2 … cnNn+ + +[ ] c

T

xd

dN1

xd

dN2

xd

dNn

cT

BT

= = =

cT

BT

EAB xd

0

L

a cT

NT

KxA xd

0

L

∫ PcT

NT

L( )+=

cT

BT

EAB xd

0

L

a NT

KxA xd

0

L

∫– PNT

L( )–

0=

c c

BT

EAB xd

0

L

a NT

KxA xd

0

L

∫– PNT

L( )– 0=

Page 44: Compiled lecture notes

— 44 —

which is identical to Eq. (2.34). Thus, the FE–formulation Eq. (2.40) is merely an alternativemanner to express Eq. (2.34).

Prior to going into some more details, we remark that if the selected basis

functions are in the admissible space, , that is if the basis func-

tions satisfies all essential boundary conditions and are regular enoughfor the integrals to be finite valued, the finite element space will be a sub-

space of : . In particular, if is in , i.e. can be expressed as a

linear combination of the basis functions, the Galerkin method will give the exact solution

, which actually happens in a few special cases. Otherwise, will be as close to as

possible in a well defined sense; we will dwell into this in a later chapter.

2.4 Some Notes on the FE–Equations

Let us somewhat contemplate the FE–discretization Eq. (2.34) of our model problem Eq.(2.18). First we notice that two integrals over the domain on which our problem is defined (i.e.

the interval ) appear in the equation. It is readily seen that these two terms stem from

the differential equation in the BVP Eq. (2.18). In addition to this, one notice that a third termis present in Eq. (2.34), and that this term originates from the natural (or Neumann) condi-

tion at . Hence, in the FE–equation the differential equation and natural boundary con-

ditions are ‘blended together’; as a consequence, when FE is used to approximate the solutionof some differential equation, Neumann conditions will be approximated as well. In contrastto this, essential (or Dirichlet) conditions will be exactly satisfied.

Our second comment concerns the manner in which the leftand right hand sides of Eq. (2.34) are evaluated. Ratherthan directly calculating the integrals, the domain is dividedinto elements and the integrations takes part over one ele-ment at a time. The illustration (right) shows a subdivision

of the domain into elements with nodes; in

this case we have two nodes on each element, and there will be two basis functions, viz.

and , defined on each element along with associated node variables and . If we define

(2.49)

cf. Eqs, (2.27) and (2.30), the integrals evaluated on the i:th element yield an element stiff-

ness matrix and an element load vector

(2.50)

cf. Eq (2.35). These element quantities are then assembled to produce the structure stiffness

matrix and (part of) the structure load vector , see Eqs. (2.35) and (2.36). The assembly

process was detailed in the previous chapter.

Carefully note that assemblage of the element load vectors yields , i.e. the first term

of according to Eq. (2.35). Thus, the assembly of element quantities results in the contribu-

tion to the left and right hand sides of the governing differential equation (Eq. (2.18)) only,

V

u

Vh

uh

Ni V∈ i∀

V Vh V⊂ u Vh

uh u= uh u

0 x L< <

x L=

x1 2 n-1

x1 0= xn L=

0 x L< < n 1– n

N1

e

N2

ea1

ea2

e

ae a1

e

a2

e= N

e

N1

eN2

e= Be

xd

dNe

xd

dN1

e

xd

dN2

e

= =

Ke

fe

Ke

BeT

EABe

xd

xi

xi 1+

∫= fe

NeT

KxA xd

xi

xi 1+

∫=

K f

NT

KxA xd

0

L

f

Page 45: Compiled lecture notes

— 45 —

and subsequent to this one has to insert the boundary term to eventually get the completestructure load vector.

In BVP:s with two or three independent variables the integrals that stem from the differentialequation, will be over areas or volumes, respectively. In these cases, Neumann conditionsappear as line or surface integrals in the expression for the structure load vector. Once again,the assembly process will only account for the integrals that originate from the differential

equation, so the boundary condition part of still needs to be handled independently.

Next we note that essential (or Dirichlet) conditions do not appear explicitly in the FE–equa-tions (2.34)–(2.36). Instead, this type of condition rules out certain basis function and test

functions. In practice one ignores these conditions when is established, but they are

taken care of when the equation system is solved. As was seen in a couple of examples in Ch.

1, essential conditions result in that we end up with unknowns in both (node variables)

and the load vector ; the latter represents reaction forces. Further examples that illustrates

how to handle Dirichlet conditions will be given in the subsequent section.

Our model problem Eq. (2.18) does not involve any Robin condition, i.e. a boundary condition

involves both the unknown and its first derivative . We provide an example of this at the

end of the following section, and it will be seen that it will give a contribution to the load vec-

tor and the stiffness matrix .

2.5 Some Examples

In this section we will work through a few numerical examples. Our intention is to illustratethe notes made in the previous section, and in particular show how the various types ofboundary conditions are taken into account. We shall also disclose some properties of finiteelement approximations.

2.5.1 Homogenous Dirichlet Conditions

Use FE to approximate the solution of

(2.51)

Use four linear elements with size and compare the FE–approximation and its deriv-

ative with the exact solution

(2.52)

Remark: The BVP (2.51) may be thought of as describing an elastic bar with axial stiffness

and length , that is loaded by a frictional force ; see Eq. (2.8). The

two Dirichlet conditions mean that both ends of the bar are fixed. m

Let be the FE–approximation. The FE–space consists of all functions that may be

expressed as a linear combination of the basis functions used to construct ; these have to

be square integrable, have square integrable first derivative, and satisfy the two Dirichlet con-

ditions (i.e. ), cf. Eq. (2.24). The FE–formulation becomes

f

Ka f=

a

f

uxd

du

f K

x2

2

d

d u– 1 x+( ) 2–

= 0 x 1< <

u 0( ) 0=

u 1( ) 0=

h 0.25=

u x( ) 1 x+

2x

------------ln=xd

du1 x+( ) 1–

2( )ln–=

EA 1= L 1= KxA 1 x+( ) 2–=

uh u≈ Vh v

uh

v 0( ) v 1( ) 0= =

Page 46: Compiled lecture notes

— 46 —

Find such that (2.53)

where we used that since the boundary terms were dropped out.

Now we divide the interval into four linear elements, each of length :

By linear elements, we mean that our basis functions are first degree polynomials. In thiscase, the end points of each element will be nodes. Thus we have 5 nodes and to each node

we associate a basis function . If we collect the basis functions in a row vector and the

node variables in a column vector

(2.54)

the FE–approximation may be written . Carefully note the manner in which the basis

functions are constructed: takes the function value at the i:th node, and is in all

other nodes. Although we have 5 basis functions, there are only 2 basis functions that are

non–zero on any one element. Let denote the coordinate of the i:th node. We the have

(2.55)

Hence, the node variables have themeaning of the (approximate) value ofthe unknown function in the nodes.The illustration to the right depicts the

graph of some function ; notice in

particular how the manner in whichthe basis functions were constructed

ensures that the function becomes –

continuous, i.e. all derivatives up to the 0:th derivative (the function itself) are continuous.

The derivative will be piece wise constant (and hence square integrable), but its second

derivative will be zero inside the elements; it is recognized that the piece wise linear function

easily can be used to approximate the weak problem as in Eq. (2.53), but we run into some

problems if try to insert it into the strong form Eq. (2.51).

The integrals in the FE–formulation Eq. (2.53) are now calculated element wise

uh Vh∈xd

dv

xd

duhxd

0

1

∫v

1 x+( )2------------------- xd

0

1

∫= v∀ Vh∈

v 0( ) v 1( ) 0= = vEAxd

du

0

1

h 0.25=

x3 1 2⁄=x1 0= x2 1 4⁄= x4 3 4⁄= x5 1=

N1 N2N3 N4 N5

1 1 1 1 1

Ni x( ) N

a

N N1 N2 N3 N4 N5= a a1 a2 a3 a4 a5

T=

uh Na=

Ni x( ) 1 0

xi

u xi( ) uh xi( )≈ N xi( )a N1 xi( ) … Ni xi( ) …

a1

…ai

0 … 1 …

a1

…ai

ai= = = =

x1 0= x5 1=

uh

a1

a2

a3

a4

a5uh

C0

xd

duh

uh

Page 47: Compiled lecture notes

— 47 —

(2.56)

Let us number the elements from left to right. The approximation on one element can be writ-ten

(2.57)

where we used superscript to emphasize the numbering is local on the element (first and

second basis function on the element, etc.). For instance, on the i:th element we have

(2.58)

In the CALFEM toolbox, the connectivity between local and global numbering is specified (bythe user) in an array Edof. We can now construct the basis functions and their first deriva-tives for the i:th element

(2.59)

where and are the coordinates of the nodes on the element, and is the ele-

ment length. Substituting the approximation into the left hand side of Eq. (2.56), we get

(2.60)

Now select the test function according to Galerkin: first and then . For each

choice we calculate the left hand side of Eq. (2.56), and we collect the results in a column vec-tor to get

(2.61)

It is noted that the element stiffness matrix in this particular example, depends on the ele-

ment length only. We have used a unit value of the constitutive function (cf. Eq. (2.1)),

viz. in our interpretation of the BVP (2.51) as a model for an elastic bar; with

any other constant value, it is easily seen that we would obtain

xd

dv

xd

duhxd

xi

xi 1+

∫i 1=

4

∑v

1 x+( )2------------------- xd

xi

xi 1+

∫i 1=

4

∑=

uh N1

ea1

eN2

ea2

e+ N1

eN2

ea1

e

a2

eN

ea

e= = =

e

N1

eN2

e Ni Ni 1+=

a1

e

a2

e

ai

ai 1+

=

xixi 1+

N1

eNi= N2

eNi 1+=

11

N1

e xi 1+ x–

h--------------------=

xd

dN1

e1–

h------=

N2

e x xi–

h------------=

xd

dN2

e1

h---=

xi xi 1+ h xi 1+ xi–=

xd

dv

xd

dN1

e

a1

e

xd

dN2

e

a2

e+

xd

xi

xi 1+

∫ xd

dv

xd

dN1

e

xd

dN2

e

xd

xi

xi 1+

∫ ae

=

v N1

e= v N2

e=

xd

dN1

e

xd

dN2

e xd

dN1

e

xd

dN2

e

xd

xi

xi 1+

∫ ae 1

h2

-----1 1–

1– 1xd

xi

xi 1+

ae 1

h---

1 1–

1– 1a

eK

ea

e= = =

Ke

h D x( )D x( ) EA 1= =

Page 48: Compiled lecture notes

— 48 —

(2.62)

Remark: In case the constitutive property is not constant, we would have to integrate to

find the element stiffness matrix

(2.63)

cf. Eq. (2.61). This is rarely done in practice, but instead one divides the interval into ele-

ments that are small enough for being well approximated by a constant on each element.

m

Since all four elements have the same length in our example, Eq. (2.61) gives the elementstiffness matrix for all the elements and we have thus completed the evaluation of the left

hand side of Eq. (2.56), although it must be observed that differ between the elements,

which has to be accounted for when the summation (Eq. (2.56)) is done. Numbering the ele-ments from left to right, we have

(2.64)

as described by Eq. (2.58).

Next we calculate the element load vectors, i.e. terms in the right hand side of Eq. (2.56). Forthe first element, we have

(2.65)

where we used Eq. (2.59) to find explicit expressions for the two test functions and

The contributions from elements 2, 3 and 4 are obtained analogously

(2.66)

It is now time to assemble (expand and add) the element stiffness matrices and the elementload vectors, respectively. Consider for instance element number two; if the element matrix

and vector are expanded to the structure vector , we get

Ke EA

h-------

1 1–

1– 1=

D x( )

Ke 1

h2

-----1 1–

1– 1D x( ) xd

xi

xi 1+

=

D

ae

ae a1

a2

a2

a3

a3

a4

a4

a5

, , ,=

fe N1

e

N2

e1 x+( ) 2–

xd

0

1 4⁄

∫ 0.10743

0.09257= =

v N1

e=

v N2

e=

fe N1

e

N2

e1 x+( ) 2–

xd

1 4⁄

1 2⁄

∫ 0.07071

0.06262= = f

e 0.05006

0.04517= f

e 0.03730

0.03413=

a a1 … a5

T=

Page 49: Compiled lecture notes

— 49 —

(2.67)

where and are the expanded element stiffness matrix and the expanded element load

vector, respectively. We do this expansion for all four elements, whereby the summation of theleft and right hand sides in Eq. (2.56) yield

(2.68)

and

(2.69)

respectively. Thus, our equation system reads

(2.70)

Carefully note that we have so far only worked on the differential equation in the BVP Eq.(2.51); the two boundary conditions have not been taken into account. As been mentionedabove, Dirichlet conditions (essential boundary conditions) are not taken into account until it

is time to solve the FE–equation system , at which stage we also have to invoke the

restrictions on the test functions. To satisfy the boundary conditions we obviously must pre-

scribe and , since the node variables are the (approximate) function values in the

nodes (Eq. (2.55)). Inserting this into Eq. (2.70), we note that it appears that we have five

equations but only three unknowns ( , , and ). To resolve this we note that the five

equations have been obtained from Eq. (2.53) by selecting the test function in five different

ways, viz. , , so that the first equation ( ) was obtained with

in Eq. (2.53), while yielded the second equation ( ), etc.

However, the FE–formulation has been obtained from the BVP (2.51) via the variational prob-

Kee

a1

h---

0 0 0 0 0

0 1 1– 0 0

0 1– 1 0 0

0 0 0 0 0

0 0 0 0 0

a1

a2

a3

a4

a5

= fee

0

0.07071

0.06262

0

0

=

Kee

fee

1

h---

1 1– 0 0 0

1– 1 0 0 0

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

0 1 1– 0 0

0 1– 1 0 0

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

0 0 1 1– 0

0 0 1– 1 0

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

0 0 0 1 1–

0 0 0 1– 1

+ + +

a1

a2

a3

a4

a5

1

h---

1 1– 0 0 0

1– 2 1– 0 0

0 1– 2 1– 0

0 0 1– 2 1–

0 0 0 1– 1

a1

a2

a3

a4

a5

=

0.10743

0.09257

0

0

0

0

0.07071

0.06262

0

0

0

0

0.05006

0.04517

0

0

0

0

0.03730

0.03413

+ + +

0.1074

0.1633

0.1127

0.0825

0.0341

=

Ka f=

1

h---

1 1– 0 0 0

1– 2 1– 0 0

0 1– 2 1– 0

0 0 1– 2 1–

0 0 0 1– 1

a1

a2

a3

a4

a5

0.1074

0.1633

0.1127

0.0825

0.0341

=

Ka f=

a1 0= a5 0=

a2 a3 a4

v

v Ni= i 1 2 3 4 5, , , ,=1

h--- a1 a2–( ) 0.1074=

v N1= v N2=1

h--- a1– 2a2 a3–+( ) 0.1633=

Page 50: Compiled lecture notes

— 50 —

lem: the differential equation has been multiplied by a test function and integrated over the

interval; partial integration of the left hand side results in

(2.71)

Since and are unknown, we restricted the choice of test functions to those that

satisfy and so as to get rid of the boundary terms, consequently obtaining

the FE–formulation Eq. (2.53). Now, we have that and , so the choices

and are not allowed, i.e. the first and fifth equation in the system (2.70) are not

valid. Hence we now have

(2.72)

which is three equations with three unknowns. Observing that the first and fifth columns are

multiplied by zero and inserting , we finally get

(2.73)

One finds

(2.74)

and gets the approximation

(2.75)

Now that we have found the node variables and have, the unknown boundary terms in Eq.(2.71) may be calculated. Substituting our FE–approximation and choice of test functionsinto the equation, we find

(2.76)

where

v

xd

dv

xd

duxd

0

1

∫v

1 x+( )2------------------- xd

0

1

∫ vxd

du

0

1

+=

xd

du

x 0=xd

du

x 1=

v 0( ) 0= v 1( ) 0=

N1 0( ) 1= N5 1( ) 1=

v N1= v N5=

1

h---

1 1– 0 0 0

1– 2 1– 0 0

0 1– 2 1– 0

0 0 1– 2 1–

0 0 0 1– 1

0

a2

a3

a4

0

0.1074

0.1633

0.1127

0.0825

0.0341

=

h1

4---=

8 4– 0

4– 8 4–

0 4– 8

a2

a3

a4

0.1633

0.1127

0.0825

=

a2 0.0499= a3 0.0589= a4 0.0398=

uh Na= a 0.0000 0.0499 0.0589 0.0398 0.0000T

=

vxd

du

0

1

xd

dv

xd

duxd

0

1

∫v

1 x+( )2------------------- xd

0

1

∫–= R⇒ BT

B xad

0

1

∫ NT 1

1 x+( )2------------------- xd

0

1

∫–=

Page 51: Compiled lecture notes

— 51 —

(2.77)

Hence, we get

(2.78)

Note that may be thought of as a residual and that the non–zero values are the ‘reaction

forces’ due to the prescribed node variables. Thus, the five equations in the system Eq. (2.72)

do in effect embrace five unknowns: the node variables ( , , and ) and the two reaction

forces; the latter two have been eliminated by invoking the restriction on the

test function, but may always be calculated according to Eq. (2.78) once the node variables

have been solved for.

We also evaluate ; once again we point out that the derivative usually is of greater interest

that the function itself. On any single element, the approximation is given by Eq. (2.57). Thuswe get

(2.79)

where we introduced

(2.80)

With the solution Eq. (2.75), , and according to Eq. (2.64) for our four elements, we

find

(2.81)

for elements 1 trough 4, respectively.

R NT

1( )xd

duh

x 1=

NT

0( )xd

duh

x 0=

0

0

0

0

1

xd

duh

x 1=

1

0

0

0

0

xd

duh

x 0=

xd

duh

x 0=

0

0

0

xd

duh

x 1=

= = =

R Ka f–

0.3068–

0

0

0

0.1932–

= =

R

a2 a3 a4

v 0( ) v 1( ) 0= =

a

xd

duh

xd

duh

xd

dNe

ae

Bea

e= =

Be

xd

dNe

xd

dN1

e

xd

dN2

e1–

h------

1

h---= = =

1

h--- 4= a

e

xd

duh0.1994, 0.0362, 0.0776, – 0.1591–=

Page 52: Compiled lecture notes

— 52 —

Graphs of the functions and as well as their first derivatives, are shown below.

Prior to proceeding to the next example, let us pose a few remarks on the above results.

(1) Possibly the most obvious observation is that it appears that and coincide at the

nodes, i.e. we get exact node variable values. While this certainly is true in the present exam-ple, it is not a property of the FE–method, but the result rather stems from a feature of thedifferential equation we solved. In general finite elements do not yield exact function values atthe nodes.

Second (2) we note that the FE–approximation is consistently ‘too small’. This is a general fea-ture and hence carry over to other differential equations considered in this text; it is not to

say that in each and every point, but the FE–approximation is ‘smaller than’ the exact

solution ‘on average’. For this reason, the FE solutions is said to be too stiff. We will investi-gate this property in somewhat more detail in a later chapter, where the terms ‘on average’and ‘smaller than’ will be clarified. However, we remark that the accuracy of the approxima-tion depend on how many node variables that are used in the discretization of the problem.

The more elements, the better the approximation. In fact, it can be shown that as the

element size , i.e. the FE–approximation converges to the exact solution as the element

sizes tend to zero. Element sizes may of course be different for various elements and the

notation should be understood as ‘the size of the largest element goes to zero’.

Third (3), note the different scales of the respective ordinates. The maximum error is

about while the maximum error in the approximation of the derivative is about 10 times

larger. The fact that the derivative of the solution is worse approximated that the functionitself, is a general property and may be considered as bad news, since derivatives are usuallyof larger practical interest than the primary unknown function. In the present example forinstance, if we interpret the BVP (2.51) as a one dimensional elasticity problem with axial

stiffness , we get the axial force in the bar

(2.82)

which normally would be of greater concern than the displacement .

Next (4) we remark that the derivative appears to be best approximated close to the ele-

ment centres. This is indeed so, and is a feature that carries over to problems in two andthree dimensions; when linear basis functions are utilized to approximate solutions to a BVPthat involve second order (partial) differential equations, first derivatives are best approxi-

u uh

0 0.2 0.4 0.6 0.8 10

0.01

0.02

0.03

0.04

0.05

0.06

x

Exact solution u and FE approximation uh

uu

h

0 0.2 0.4 0.6 0.8 1−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5First derivative: Exact and FE approximation

x

du/dxdu

h/dx

u uh

uh u<

uh u→

h 0→h

h 0→

u uh–

0.01

EA 1=

N σA EAε EAxd

du

xd

du= = = =

u

xd

du

Page 53: Compiled lecture notes

— 53 —

mated at element mid–points. For instance in two– and three–dimensional elasticity prob-lems, one is usually interested in the stress state, and since stresses are combinations ofvarious first derivatives of the primary unknown (the displacements), they become most accu-rate in the vicinity of the element centres.

Our fifth (5) remark concerns the location of the largest errors. Inspecting the right hand fig-ure above, we note that the error (in first derivative) is largest toward the left end of the inter-val simply because the exact derivative changes more rapidly here. This is also an attributethat is found in higher dimensions: the error become largest in regions where second deriva-tives of the exact solution are large. In practice this means that it is of some importance to befamiliar with the type of problem being solved. Solving an elasticity problem, for instance, onewould want to use more elements (smaller element size) in regions where one expects rapidchances in one or more stress components.

Next (6) consider the sum of the node forces

(2.83)

see e.g. Eq. (2.69) or (2.70), and the sum of the reaction forces as given by Eq. (2.78)

(2.84)

It is noticed that the system is in equilibrium. This will always be the case for the FE–meth-ods described in this text.

As a final (7) observation, we point out that the node

force corresponds to the jumps in first derivatives;

actually, to be more precise, node forces are equal to the

jumps in , i.e. the normal force, but in this example

we set . Thus we have in effect equilibrium at the nodes. As an example, we illustrate

this with the forces acting at node 2 (where an equilibrium equation has a rounding error inthe fourth decimal). This node equilibrium property is also present in two and three dimen-sional problems.

2.5.2 Non–Homogenous Dirichlet Data

In the example problem we elaborated on in detail above there were Dirichlet conditions only,but no Neumann or Robin condition. The FE–equation system Eq. (2.70) was establishedwithout regard to the Dirichlet condition, so that the left and right hand sides arises from theleft and right hand sides of the governing differential equation Eq. (2.51). The boundary con-

ditions were taken into account when solving , in that variables at boundary nodes

where assigned values according to the Dirichlet conditions and that equations obtained fromtest functions that do not satisfy homogenous Dirichlet conditions were omitted from theequation system, cf. Eq. (2.72). This way of handling essential boundary conditions is notparticular to one dimensional problems, but the procedure is the same for two and threedimensional BVP:s.

Before we dwell into how Neumann and Robin conditions are approximated, let us see how todeal with non–homogenous Dirichlet condition. To this end, we make a slight change to theprevious example problem Eq. (2.51) and consider

f1 f2 f3 f4 f5+ + + + 0.5=

R1 R2 R3 R4 R5+ + + + R1 R5+ 0.5–= =

R1R5

f1 f2f3

f4 f5

N EAxd

duh0.1994= =

N 0.0362=

f2 0.1633=

fi

EAxd

duh

EA 1=

Ka f=

Page 54: Compiled lecture notes

— 54 —

(2.85)

i.e. we have here changed the condition at . To obtain the weak form of the problem, we

multiply the differential equation by a test function and integrate over the interval. Sub-

sequent to integration by parts we obtain

(2.86)

which is identical to Eq. (2.71) (or Eq. (2.20) (with , , and )). Since

and are unknown, we restrict ourselves to test functions such that

. Hence, we are lead to the variational problem

Find such that (2.87)

Note that the Dirichlet (essential) conditions are stated explicitly, since they are not part ofthe integral equation, cf. Eq. (2.22). Also observe that the test function has to satisfy homoge-

nous Dirichlet conditions although the condition at is non–homogenous. This is of

course because the boundary term in Eq. (2.86) has been omitted in the weak formulation(2.87).

We next set forth to approximate the solution by using four linear elements, each of length

:

With the approximation , where the basis function vector and the node variable

vector define by Eq. (2.54), substituted into the weak problem (2.87), we once again arrive

at Eq. (2.56) (repeated here for convenience)

(2.88)

Cautiously observe that we here neglected the Dirichlet boundary conditions and the restric-tions on the test function — as before, these are taken into account when the FE–equations

are solved. Hence, our FE model is identical to that in the previous example, so

becomes as before, i.e. as Eq. (2.70):

x2

2

d

d u– 1 x+( ) 2–

= 0 x 1< <

u 0( ) 0=

u 1( ) 0.1=

x 1=

v x( )

xd

dv

xd

duxd

0

1

∫ v 1 x+( ) 2–xd

0

1

∫ vxd

du

0

1

+=

L 1= EA 1= KxA 1 x+( ) 2–=

xd

du

x 0=xd

du

x 1=

v 0( ) v 1( ) 0= =

uxd

dv

xd

duxd

0

1

∫v

1 x+( )2------------------- xd

0

1

∫=

u 0( ) 0= u 1( ) 0.1=

v 0( ) 0= v 1( ) 0=

x 1=

h 0.25=

x3 1 2⁄=x1 0= x2 1 4⁄= x4 3 4⁄= x5 1=

N1 N2N3 N4 N5

1 1 1 1 1

u uh≈ Na= N

a

xd

dv

xd

duhxd

xi

xi 1+

∫i 1=

4

∑v

1 x+( )2------------------- xd

xi

xi 1+

∫i 1=

4

∑=

Ka f=

Page 55: Compiled lecture notes

— 55 —

(2.89)

As in the previous example we have five equations and the i:th equation has been obtained by

the test function choice in Eq. (2.88). Since and , the first and fifth

equations are not valid. Furthermore, from the boundary conditions we have that and

, so we now have

(2.90)

Multiplying the first column by and the fifth by , we get (with )

(2.91)

which has the solution

(2.92)

and we get the approximation

(2.93)

Now that we have found the node variables and have, the unknown boundary terms that weomitted from Eq. (2.86) by restricting our choice of test functions to those that fulfil

, may now be calculated as described in Eq. (2.78)

(2.94)

and the approximation of the derivative of the primary unknown becomes (see Eq. (2.79))

(2.95)

The exact solution of the BVP Eq. (2.85) and its first derivative, viz.

1

h---

1 1– 0 0 0

1– 2 1– 0 0

0 1– 2 1– 0

0 0 1– 2 1–

0 0 0 1– 1

a1

a2

a3

a4

a5

0.1074

0.1633

0.1127

0.0825

0.0341

=

v Ni= N1 0( ) 1= N5 1( ) 1=

a1 0=

a5 0.1=

1

h---

1 1– 0 0 0

1– 2 1– 0 0

0 1– 2 1– 0

0 0 1– 2 1–

0 0 0 1– 1

0

a2

a3

a4

0.1

0.1074

0.1633

0.1127

0.0825

0.0341

=

a1 0= a5 0.1= h 0.25=

8 4– 0

4– 8 4–

0 4– 8

a2

a3

a4

0.1633

0.1127

0.0825

0.1

0

0

4–

0.1633

0.1127

0.4825

= =

a2 0.0749= a3 0.1089= a4 0.1148=

uh Na= a 0.0000 0.0749 0.1089 0.1148 0.1000T

=

v 0( ) v 1( ) 0= =

R Ka f–

0.4069–

0

0

0

0.0932–

= =

xd

duh0.2995, 0.1362, 0.0234, 0.0591–=

Page 56: Compiled lecture notes

— 56 —

(2.96)

are depicted together with the FE– approximations (Eq. 2.93)) and (Eq. (2.95)) below.

Exercise At the end of the previous subsection, Sec. 2.5.1, we made 7 remarks on the FE–approximation; confirm that these apply to the present example.

2.5.3 Natural Boundary Conditions

We continue our exploration of different types of boundary conditions with an example thatinvolves a Neumann (or natural) condition. To this end we consider the very same BVP as

before, but change the condition at

(2.97)

Here the exact solution and its first derivative are

(2.98)

Remark: As before, the BVP may be thought of as

describing an elastic bar with axial stiffness

and length , that is loaded by a frictional force

; see Eq. (2.8). The boundary conditions

mean that the left end is fixed while the right end is

loaded by a force m

u x( ) x

10------

1 x+

2x

------------ln+=xd

du 1

10------ 1 x+( )+

1–2( )ln–=

uh xd

duh

0 0.2 0.4 0.6 0.8 10

0.02

0.04

0.06

0.08

0.1

0.12

x

Exact solution u and FE approximation uh

uu

h

0 0.2 0.4 0.6 0.8 1−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6First derivative: Exact and FE approximation

x

du/dxdu

h/dx

x 1=

x2

2

d

d u– 1 x+( ) 2–

= 0 x 1< <

u 0( ) 0=

xd

du

x 1=

0.15–=

u x( ) 1 x+( )ln13x

20---------–=

xd

du1 x+( ) 1– 13

20------–=

0 L 1=

EA x( ) 1=

KxA x( ) 1 x+( ) 2–=

P

x

P

N L( ) EAxd

du

x L=

–= =

EA 1=

L 1=

KxA 1 x+( ) 2–=

P 0.15–=

Page 57: Compiled lecture notes

— 57 —

To obtain the weak form of the problem, we proceed as before: multiply the differential equa-

tion by a test function and integrate over the interval. Subsequent to integration by parts

we obtain

(2.99)

Now resolve the boundary term

(2.100)

Since is known, the Neumann condition could be tucked in here. However, as in the

previous examples, is unknown and represents some reaction force. To get rid of the

latter term, we restrict our choice of test functions to those that satisfies . Hence we

are lead to the following variational problem

Find such that (2.101)

cf. Eq. (2.22). Alternatively, with

(2.102)

we may express the weak problem as in Eq. (2.25)

Find such that (2.103)

The formulations (101) and (103) are identical, except that we have explicitly stated the regu-larity requirements in the latter expression. In either case, the variational equation stemsfrom both the differential equation and the natural boundary condition, while the essential

condition ( ) is sated separately.

Let us now approximate the solution by a finite element method. As in the foregoing examples

we use four linear elements, each of length :

v x( )

xd

dv

xd

duxd

0

1

∫ v 1 x+( ) 2–xd

0

1

∫ vxd

du

0

1

+=

vxd

du

0

1

v 1( )xd

du

x 1=

v 0( )xd

du

x 0=

– 0.15v 1( )– v 0( )xd

du

x 0=

–= =

xd

du

x 1=

xd

du

x 0=

v 0( ) 0=

u xd

dv

xd

duxd

0

1

∫v

1 x+( )2------------------- xd

0

1

∫ 0.15v 1( )–=

u 0( ) 0= v 0( ) 0=

V v: v 0( ) 0 v2

xd

0

1

∫ ∞< xd

dv

2

xd

0

1

∫ ∞<=

=

u V∈xd

dv

xd

duxd

0

1

∫v

1 x+( )2------------------- xd

0

1

∫ 0.15v 1( )–= v V∈∀

u 0( ) 0=

h 0.25=

x3 1 2⁄=x1 0= x2 1 4⁄= x4 3 4⁄= x5 1=

N1 N2N3 N4 N5

1 1 1 1 1

Page 58: Compiled lecture notes

— 58 —

so that our approximation is with the basis function vector and the node varia-

ble vector define in Eq. (2.54). The approximation is substituted into the weak problem

(101) and we arrive at

(2.104)

Apart from the trailing boundary term , this is identical to the two previous FE–for-

mulations, see Eqs. (2.56) and (2.88). Hence it is clear that we will once again obtain Eq.(2.70), but the load vector in the right hand side will now get a contribution from the Neu-mann condition. Let us examine this term in some detail: it is recalled that the FE–equationsare obtained by using the basis functions as test function, so that the i:th equation is

obtained from the choice . In the current example we have 5 basis functions and it is

realized that the second term in the right hand side of Eq. (2.104) will generate a vector

(2.105)

cf. Eqs. (2.32) and (2.33). Here we note that , while all other basis functions take on

zero values at . Thus, Eq. (2.104) becomes

(2.106)

Now that we have obtained the FE–equations, we apply the Dirichlet condition and

the associated restriction on the test function; here we have that and that the first

equation is not valid, since the choice does not comply to the restriction . One

gets

(2.107)

and with

u uh≈ Na= N

a

xd

dv

xd

duhxd

xi

xi 1+

∫i 1=

4

∑v

1 x+( )2------------------- xd

xi

xi 1+

∫i 1=

4

∑ 0.15v 1( )–=

0.15v 1( )–

v Ni=

0.15v 1( ) 0.15

N1 1( )

N2 1( )

N3 1( )

N4 1( )

N5 1( )

N5 1( ) 1=

x 1=

1

h---

1 1– 0 0 0

1– 2 1– 0 0

0 1– 2 1– 0

0 0 1– 2 1–

0 0 0 1– 1

a1

a2

a3

a4

a5

0.1074

0.1633

0.1127

0.0825

0.0341

0

0

0

0

0.15

–=

u 0( ) 0=

a1 0=

v N1= v 0( ) 0=

1

h---

1 1– 0 0 0

1– 2 1– 0 0

0 1– 2 1– 0

0 0 1– 2 1–

0 0 0 1– 1

0

a2

a3

a4

a5

0.1074

0.1633

0.1127

0.0825

0.1159–

=

h 0.25=

Page 59: Compiled lecture notes

— 59 —

(2.108)

which has the solution

(2.109)

Hence, we have the approximation

(2.110)

The unknown boundary term that was omitted from Eq. (2.99) by restricting our choice of

test functions to those that fulfil , may now be calculated as described by Eq. (2.78)

(2.111)

The approximation of the derivative of the primary unknown becomes (see Eq. (2.79))

(2.112)

The exact solution of the BVP and its first derivative (Eq. (2.98)) are compared to the FE–approximations Eqs. (2.110) and (2.112) in the graphs below.

Carefully note that while the essential boundary condition is exactly satisfied, the natu-

ral condition at is approximated only: we have and . This

is of course because the Neumann condition is merged with the differential equation in theweak formulation and subsequently approximated in the FE–formulation, while the Dirichlet

8 4– 0 0

4– 8 4– 0

0 4– 8 4–

0 0 4– 4

a2

a3

a4

a5

0.1633

0.1127

0.0825

0.1159–

=

a2 0.0607= a3 0.0805= a4 0.0721= a5 0.0432=

uh Na= a 0.0000 0.0607 0.0805 0.0721 0.0432T

=

v 0( ) 0=

R Ka f–

0.3500–

0

0

0

0

= =

xd

duh0.2426, 0.0793, 0.0334– , 0.1159–=

0 0.2 0.4 0.6 0.8 10

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

x

Exact solution u and FE approximation uh

uu

h

0 0.2 0.4 0.6 0.8 1−0.15

−0.1

−0.05

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35First derivative: Exact and FE approximation

x

du/dxdu

h/dx

u 0( )

x 1=xd

du

x 1=

0.15–=xd

duh

x 1=

0.1159–=

Page 60: Compiled lecture notes

— 60 —

condition is exactly enforced by prescribing node variables when the FE–equations are solved.As previously noted, the FE approximation gets better the more elements that are used in theanalysis, and converges to the exact solution as element sizes tends to zero — see theremarks made at the end of Sec. 2.5.1.

2.5.4 Robin Conditions

Our final example in this section will demonstrate how a boundary condition that includesboth the unknown function and its first derivative is dealt with. We have already seen thatessential conditions are taken into account when solving the FE–equations, while natural

conditions will contribute to the structure load vector . Here we will find that a Robin condi-

tion affects both the left and right hand side of . As a model problem we consider the

very same BVP as before, but apply a Robin condition at

(2.113)

and find the exact solution and its first derivative as

(2.114)

Remark: The reader is encouraged to compare Eq. (2.113) to the BVP Eq. (2.12). One findsthat the mathematical model may be thought of as describing an elastic bar with axial stiff-

ness and length , that is loaded by a frictional force . The boundary

conditions mean that the left end is fixed while the right end is loaded by a force

and supported by a linear spring with stiffness m

The weak form of the differential equation is of course obtained as in the earlier examples

(2.115)

Breaking down the boundary term, we here find

(2.116)

Here we inserted the condition at , but as in the previous examples is unknown

and represents some reaction force; hence, we restrict our choice of test functions to those

that satisfies , so as to get rid of the unknown boundary term. We are consequently

lead to the variational problem

f

Ka f=

x 1=

x2

2

d

d u– 1 x+( ) 2–

= 0 x 1< <

u 0( ) 0=

xd

du

x 1=

0.05– 0.5u 1( )–=

u x( ) 1 x+( )ln1.1 2ln+( )x

3------------------------------–=

xd

du1 x+( ) 1– 1.1 2ln+( )

3---------------------------–=

EA 1= L 1= KxA 1 x+( ) 2–=

P 0.05–=

k 0.5=

xd

dv

xd

duxd

0

1

∫ v 1 x+( ) 2–xd

0

1

∫ vxd

du

0

1

+=

vxd

du

0

1

v 1( )xd

du

x 1=

v 0( )xd

du

x 0=

– v 1( ) 0.05– 0.5u 1( )–[ ] v 0( )xd

du

x 0=

–= =

x 1=xd

du

x 0=

v 0( ) 0=

Page 61: Compiled lecture notes

— 61 —

Find u, with u(0) = 0, such that (2.117)

or, with the function space V defined by Eq. (2.102),

Find u ∈ V such that (2.118)

Note that the function value u(1) is unknown and therefore kept in the left hand side of the variational equation; it will be approximated along with the function u(x). Hence, the Robin condition affects both the left and right hand sides of the weak problem. As before, the essential condition (u(0) = 0) is stated separately and will be dealt with when solving the FE–equations.

As in the previous examples we use four linear elements, each of length h = 0.25, to approximate the solution of the variational problem.

Substituting the approximation u ≈ u_h = Na (see Eq. (2.54)) into the weak problem (2.117), we obtain

(2.119)

It is recognized that the first term in the left and right hand side are the same as in our previous examples, see Eq. (2.56) or (2.88), so we will once again obtain a stiffness matrix and load vector as in Eqs. (2.70) and (2.89). However, the stiffness matrix K will be modified by the term 0.5 v(1) u_h(1), and the load vector gets a contribution −0.05 v(1) — both these additional terms stem from the Robin condition. The loading term was dealt with in our previous example, when we investigated how Neumann conditions are treated, see Eq. (2.105). Here we get

(2.120)

The 5 choices for the test function v are of course the same for the left hand side term, so we obtain

Find u with u(0) = 0 such that
∫₀¹ (dv/dx)(du/dx) dx + 0.5 v(1) u(1) = ∫₀¹ v (1+x)⁻² dx − 0.05 v(1)    for all v with v(0) = 0     (2.117)

Find u ∈ V such that
∫₀¹ (dv/dx)(du/dx) dx + 0.5 v(1) u(1) = ∫₀¹ v (1+x)⁻² dx − 0.05 v(1)    ∀ v ∈ V     (2.118)

The nodes are placed at x₁ = 0, x₂ = 1/4, x₃ = 1/2, x₄ = 3/4, x₅ = 1.
[Figure: the five piecewise linear basis functions N₁, …, N₅, each equal to 1 at its own node]

With u ≈ u_h = N a,
Σ_{i=1}^{4} ∫_{xᵢ}^{xᵢ₊₁} (dv/dx)(du_h/dx) dx + 0.5 v(1) u_h(1) = Σ_{i=1}^{4} ∫_{xᵢ}^{xᵢ₊₁} v (1+x)⁻² dx − 0.05 v(1)     (2.119)

0.05 v(1)  →  0.05 [N₁(1)  N₂(1)  N₃(1)  N₄(1)  N₅(1)]ᵀ = 0.05 [0  0  0  0  1]ᵀ = [0  0  0  0  0.05]ᵀ     (2.120)

(2.121)

Hence, with test functions according to the Galerkin method, Eq. (2.119) reads

(2.122)

With a₁ = 0 to satisfy the boundary condition at x = 0, and h = 1/4, we have

(2.123)

where we also omitted the first equation, since the choice v = N₁ does not comply with the requirement v(0) = 0 and thus is not valid. In effect we have

(2.124)

The solution

(2.125)

gives us the approximation

(2.126)

From Eq. (2.78) we now get an approximation of the unknown boundary term as

0.5 v(1) u_h(1)  →  0.5 [N₁(1) … N₅(1)]ᵀ [N₁(1) … N₅(1)] a = 0.5 [0 0 0 0 0; 0 0 0 0 0; 0 0 0 0 0; 0 0 0 0 0; 0 0 0 0 1] a     (2.121)

( (1/h) [1 −1 0 0 0; −1 2 −1 0 0; 0 −1 2 −1 0; 0 0 −1 2 −1; 0 0 0 −1 1] + [0 0 0 0 0; 0 0 0 0 0; 0 0 0 0 0; 0 0 0 0 0; 0 0 0 0 0.5] ) a = [0.1074  0.1633  0.1127  0.0825  0.0341]ᵀ − [0  0  0  0  0.05]ᵀ     (2.122)

[4 −4 0 0 0; −4 8 −4 0 0; 0 −4 8 −4 0; 0 0 −4 8 −4; 0 0 0 −4 4.5] [0  a₂  a₃  a₄  a₅]ᵀ = [0.1074  0.1633  0.1127  0.0825  −0.0159]ᵀ     (2.123)

[8 −4 0 0; −4 8 −4 0; 0 −4 8 −4; 0 0 −4 4.5] [a₂  a₃  a₄  a₅]ᵀ = [0.1633  0.1127  0.0825  −0.0159]ᵀ     (2.124)

a₂ = 0.0737,  a₃ = 0.1066,  a₄ = 0.1113,  a₅ = 0.0954     (2.125)

u_h = N a,   a = [0.0000  0.0737  0.1066  0.1113  0.0954]ᵀ     (2.126)

(2.127)

The derivative of the FE–approximation, Eq. (2.79), becomes

(2.128)

The exact solution of the BVP and its first derivative, Eq. (2.114), are compared to the FE–approximations in the graphs below

In the current example, the FE–equation (2.119) embraces the differential equation and the Robin condition at x = 1. For that reason we must expect that the boundary equation will be approximated along with the differential equation, just as is the case with natural boundary conditions. According to our Robin condition, we should have

(2.129)

Using our FE–approximation Eqs. (2.126) and (2.128), we find

(2.130)

so the boundary condition is indeed only approximately satisfied. Once again we mention that the approximation will converge to the exact solution when more elements are used for the approximation.
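The assembly and solution steps above are easy to check with a few lines of code. The sketch below is a minimal illustration assuming NumPy is available; the helper names are illustrative, not part of the text. It builds the reduced system of Eq. (2.124), solves it, and evaluates the boundary residual of Eq. (2.130).

```python
import numpy as np

h = 0.25                              # element length: four linear elements on [0, 1]
nodes = np.linspace(0.0, 1.0, 5)

def trapz(y, x):                      # simple trapezoidal rule
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def hat(i, x):                        # global hat function centred at node i
    return np.clip(1.0 - np.abs(x - nodes[i]) / h, 0.0, None)

# Reduced stiffness matrix (row/column of the fixed node removed);
# the entry 4.5 = 4 + 0.5 carries the spring contribution from the Robin condition.
K = (1.0 / h) * np.array([[ 2, -1,  0,  0],
                          [-1,  2, -1,  0],
                          [ 0, -1,  2, -1],
                          [ 0,  0, -1,  1]], dtype=float)
K[-1, -1] += 0.5

# Consistent load vector for f(x) = (1+x)^(-2); the last entry also gets the -0.05 end load.
x = np.linspace(0.0, 1.0, 20001)
f = np.array([trapz(hat(i, x) * (1.0 + x) ** -2, x) for i in range(1, 5)])
f[-1] += -0.05

a = np.linalg.solve(K, f)             # should reproduce a2..a5 = 0.0737, 0.1066, 0.1113, 0.0954
print(np.round(a, 4))

# Check the Robin condition du/dx + 0.5 u + 0.05 = 0 at x = 1 for the FE solution
duh_end = (a[-1] - a[-2]) / h
print(round(duh_end + 0.5 * a[-1] + 0.05, 4))   # about 0.0341, cf. Eq. (2.130)
```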

Let us also check for equilibrium. The nodal forces sums up to

(2.131)

see Eq. (2.123), while the sum of the reaction forces as given by Eq. (2.127) is

(2.132)

R = K a − f = [−0.4023  0  0  0  0]ᵀ     (2.127)

du_h/dx = 0.2949,  0.1316,  0.0189,  −0.0636     (2.128)

[Figure: Exact solution u and FE approximation u_h, 0 ≤ x ≤ 1]
[Figure: First derivative: exact du/dx and FE approximation du_h/dx, 0 ≤ x ≤ 1]

du/dx|_{x=1} + 0.5 u(1) + 0.05 = 0     (2.129)

du_h/dx|_{x=1} + 0.5 u_h(1) + 0.05 = −0.0636 + 0.5·0.0954 + 0.05 = 0.0341     (2.130)

f₁ + f₂ + f₃ + f₄ + f₅ = 0.45     (2.131)

R₁ + R₂ + R₃ + R₄ + R₅ = R₁ = −0.4023     (2.132)

Thus, at a quick glance it appears that we do not have equilibrium. However, the term 0.5 u(1) in the Robin condition represents a force (e.g. from a spring support) that we have not accounted for, cf. the BVP Eq. (2.12). Here we have

(2.133)

and equilibrium is in effect at hand.

2.6 Element Approximations

We have investigated the finite element method for a one dimensional second order problem in some detail, and in particular scrutinized how different kinds of boundary conditions affect the solution process. By now it should be clear that the FE–solution is an approximation only (except for Dirichlet conditions, which are satisfied exactly), but we also pointed out that the approximation converges towards the exact solution when the element size tends to zero, i.e. when more and more elements are used in the analysis. When dealing with partial differential equations (PDE) involving two or three independent variables, we will see that the outlined methodology, including the treatment of boundary conditions, is very similar, although some slightly more elaborate mathematics is called for. However, it was never detailed how to construct the basis functions; the utilized functions Eq. (2.59) may appear chosen ad hoc, and we may ask ourselves if any set of functions will do the job as long as they are regular enough for the integrals (Eq. (2.56)) to exist. The short answer to this question is no, and in this section we outline the requirements to obtain monotone convergence for second order problems. The demands are essentially the same for second order PDEs in two and three dimensions.

2.6.1 Conform Elements

Monotone convergence is guaranteed if the elements are complete and compatible. By conform

elements, we mean those that are both complete and compatible.

An element is complete if it is possible to choose node variable values such that the approximation u_h becomes an arbitrary constant, and to choose (other) node variable values so that the first derivative du_h/dx becomes an arbitrary constant. Obviously, the minimum requirement is that the basis functions defined on an element must be able to describe an arbitrary first order polynomial, but there may also be other terms

(2.134)

where the αⱼᵢ are constants. In view of the differential equation as a mathematical model of a one dimensional elasticity problem, see e.g. the BVP Eq. (2.10) or (2.12), one may understand completeness as if the element must be able to describe an arbitrary rigid body displacement and an arbitrary constant strain. The linear element used in the examples in the previous section embraces an arbitrary linear polynomial but nothing more; it is thus the simplest possible complete element that can be used.

0.5 u(1) ≈ 0.5 u_h(1) = 0.5·a₅ = 0.0477,    so that    Σᵢ fᵢ + R₁ − 0.5 u_h(1) = 0.45 − 0.4023 − 0.0477 = 0     (2.133)

[Figure: free body of the bar, with the reaction R₁ at the left end, the nodal forces f₁, …, f₅, and the spring force 0.5 u_h(1) at the right end]

Conform = Complete + Compatible

u_h |_(element i) = α₀ᵢ + α₁ᵢ x + (maybe other terms)     (2.134)

A function is said to be Cⁿ–continuous if all its derivatives up to order n are continuous. In this context, the 0:th derivative is the function itself. Elements for second order differential equations are compatible if the approximation u_h is C⁰–continuous. In the examples we obtained C⁰–continuity by the manner in which the basis functions were constructed:

We placed a node at each end point of each element. Basis functions were constructed to have function value 1 at one node, and 0 at all other nodes. The node variables aᵢᵉ then have the meaning of function values at the nodes (cf. Eq. (2.55)). When elements are joined together they share a node, and hence the approximation u_h has the same function value on the two elements at the common node. Thus, the FE–approximation becomes C⁰–continuous.

2.6.2 Nonconforming Elements

The element with two nodes and two linear basis functions that was used in the examples of the previous section is conform, i.e. complete and compatible, and will therefore give us monotone convergence; the errors in the approximations are reduced as the number of elements used is increased. Conforming elements for BVPs involving second order differential equations are quite easy to construct also in 2– and 3–dimensional settings, and there is usually no good reason to abandon these well behaved element types. However, higher order partial differential equations require a higher degree of continuity for the elements to be compatible, and one runs into difficulties in 2 and 3 dimensions. For this reason it is quite common to encounter nonconforming elements in certain types of problems, especially in problems that involve bending of plates and shells.

Carefully observe that nonconforming approximations still have to be complete; completeness is a necessary requirement. Thus, a non–conform element is one that does not satisfy the compatibility condition. For this reason the words conform and compatible are sometimes used as synonyms in some text books, and some confusion may arise here.

We shall not dwell on nonconforming finite element methods in this introductory text, but settle with pointing out that such methods do exist. Nonconforming elements may or may not converge, and if convergence is obtained, it is not necessarily monotone. A so called 'patch test' has been proposed to investigate whether a mesh of non–conform elements will give convergence or not. The reader is directed to Strang and Fix (1973) for details on this topic.

[Figure: two adjacent linear elements on [xᵢ, xᵢ₊₁] and [xᵢ₊₁, xᵢ₊₂]. On each element the approximation is a₁ᵉN₁ᵉ + a₂ᵉN₂ᵉ; at the shared node the two element basis functions join into the global hat function Nᵢ₊₁ = N₂^(e=i) ∪ N₁^(e=i+1), the node variables satisfy aᵢ₊₁ = a₂^(e=i) = a₁^(e=i+1), and u_h = aᵢNᵢ + aᵢ₊₁Nᵢ₊₁ + aᵢ₊₂Nᵢ₊₂ takes the same value on both elements at the common node.]

2.6.3 Change of Basis

Completeness requires that a linear combination of basis functions on an element should be able to represent an arbitrary first order polynomial. This could of course be expressed in the standard basis 1, x as was done in Eq. (2.134), but we prefer to use a basis where each function has value 1 in one node and is 0 in all other nodes, since such a construction facilitates C⁰–continuity and, thus, compatibility. In the following we show how to make a change of basis, so as to obtain the desired basis functions from the standard basis. For the linear element this may appear to be unnecessary, since the basis functions Nᵢᵉ are easily obtained by inspection, as was done without comments in Eq. (2.59). However, the very same procedure may be used in more complicated situations, and we will later on use it to obtain the basis functions for a 2 dimensional element.

Let us thus consider an element on the interval [xᵢ, xᵢ₊₁] and have a node on each end of the element. Hence, there are two linear basis functions and two node variables on the element. In standard basis we have

(2.135)

where we introduced

(2.136)

However, we want to have

(2.137)

with node variables and basis functions such that

(2.138)

Equating Eqs. (2.135) and (2.137) we get

(2.139)

and in particular at the nodes we have

(2.140)

[Figure: a linear element on the interval [xᵢ, xᵢ₊₁], with node variables a₁ᵉ and a₂ᵉ at its end points]

u_h = α₀ + α₁ x = a₁ᵉ N₁ᵉ + a₂ᵉ N₂ᵉ     (2.135)

α̃ = [α₀  α₁]ᵀ,    N = [1  x]     (2.136)

u_h = Nᵉ aᵉ     (2.137)

aᵉ = [a₁ᵉ  a₂ᵉ]ᵀ,    Nᵉ = [N₁ᵉ  N₂ᵉ]

N₁ᵉ(xᵢ) = 1,  N₁ᵉ(xᵢ₊₁) = 0,    N₂ᵉ(xᵢ) = 0,  N₂ᵉ(xᵢ₊₁) = 1     (2.138)

Nᵉ aᵉ = N α̃     (2.139)

Nᵉ(xᵢ) aᵉ = N(xᵢ) α̃   ⟹   [1  0] [a₁ᵉ  a₂ᵉ]ᵀ = [1  xᵢ] [α₀  α₁]ᵀ
Nᵉ(xᵢ₊₁) aᵉ = N(xᵢ₊₁) α̃   ⟹   [0  1] [a₁ᵉ  a₂ᵉ]ᵀ = [1  xᵢ₊₁] [α₀  α₁]ᵀ     (2.140)

In matrix form we thus have

(2.141)

Defining

(2.142)

we may write

(2.143)

Given the node variables, we may now solve for the standard basis coefficients

(2.144)

The matrix inverse is

(2.145)

where we introduced the element length . Substituting Eq. (2.144) into Eq.

(2.139), we get

(2.146)

and find the requested basis functions

(2.147)

which are the ones given by Eq. (2.59) and that were used in the numerical examples. The reader is invited to check that the conditions Eq. (2.138) are satisfied by (2.147).
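The change of basis is also easy to carry out symbolically. The following is a minimal sketch assuming SymPy is available; the symbol names x_i and x_ip1 are illustrative. It reproduces Eq. (2.147) and verifies the nodal conditions Eq. (2.138).

```python
import sympy as sp

x, xi, xip1 = sp.symbols('x x_i x_ip1')

N = sp.Matrix([[1, x]])                  # standard basis, Eq. (2.136)
C = sp.Matrix([[1, xi], [1, xip1]])      # Eq. (2.142)

Ne = sp.simplify(N * C.inv())            # Eq. (2.147)
print(Ne)                                # [(x_ip1 - x)/(x_ip1 - x_i), (x - x_i)/(x_ip1 - x_i)]

# Nodal conditions Eq. (2.138): value 1 at the own node, 0 at the other
print(sp.simplify(Ne.subs(x, xi)))       # [1, 0]
print(sp.simplify(Ne.subs(x, xip1)))     # [0, 1]
```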

This far we have only used a linear element with two nodes and two basis functions, but one may of course consider to utilize more elaborate elements. In practice only polynomials are used as basis functions, so a natural extension would be to adopt quadratic functions. In that case we have 3 nodes and basis functions and, as before, each basis function is constructed to have value 1 in one node and 0 in the other ones. Also as before, we place nodes at the element ends in order to get a C⁰–continuous approximation; the third node will be some chosen point interior to the element. The so called C–matrix method that was adopted in the previous subsection may now be used to obtain the desired basis Nᵉ = [N₁ᵉ  N₂ᵉ  N₃ᵉ] from a transformation of the standard basis 1, x, x². In this case we have (cf. Eqs. (2.135–136))

[1 0; 0 1] [a₁ᵉ  a₂ᵉ]ᵀ = [1 xᵢ; 1 xᵢ₊₁] [α₀  α₁]ᵀ     (2.141)

C = [1 xᵢ; 1 xᵢ₊₁]     (2.142)

aᵉ = C α̃     (2.143)

α̃ = C⁻¹ aᵉ     (2.144)

C⁻¹ = 1/(xᵢ₊₁ − xᵢ) [xᵢ₊₁ −xᵢ; −1 1] = (1/h) [xᵢ₊₁ −xᵢ; −1 1]     (2.145)

Nᵉ aᵉ = N C⁻¹ aᵉ     (2.146)

Nᵉ = N C⁻¹ = [1  x] (1/h) [xᵢ₊₁ −xᵢ; −1 1] = [ (xᵢ₊₁ − x)/h    (x − xᵢ)/h ]     (2.147)

[Figure: a quadratic element with nodes at xᵢ, xᵢ₊₁ and xᵢ₊₂; each of the three basis functions N₁ᵉ, N₂ᵉ, N₃ᵉ has the value 1 at its own node and 0 at the other two]

(2.148)

and Eq. (2.141) becomes

(2.149)

By taking the inverse of the matrix

(2.150)

we obtain the result in analogy with Eq. (2.147)

(2.151)

In this text we avoid undertaking the tedious task of establishing the inverse of the C–matrix, but the reader is referred to Sec. 7.2.2 of Ottosen and Petersson (1992).
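For a concrete element the inversion is, however, a one-liner in a computer algebra system. The sketch below assumes SymPy and, for illustration only, places the three nodes at 0, 1/2 and 1; it produces the quadratic basis functions via Eq. (2.151) and checks the nodal conditions.

```python
import sympy as sp

x = sp.symbols('x')
nodes = [sp.Rational(0), sp.Rational(1, 2), sp.Rational(1)]    # x_i, x_{i+1}, x_{i+2} (example values)

N = sp.Matrix([[1, x, x**2]])                                  # standard basis, Eq. (2.148)
C = sp.Matrix([[1, xk, xk**2] for xk in nodes])                # Eq. (2.150)

Ne = sp.expand(N * C.inv())                                    # Eq. (2.151)
print(Ne)                                                      # [2x^2 - 3x + 1, -4x^2 + 4x, 2x^2 - x]

# Each basis function equals 1 at its own node and 0 at the other two
for xk in nodes:
    print([Ne[j].subs(x, xk) for j in range(3)])
```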

2.6.4 Lagrange Elements

By now it is realized that the size of the C–matrix grows with the number of basis functions. For instance, if we used cubic basis functions, the standard basis is 1, x, x², x³ and C is 4 × 4; as another example we mention that an element that is quite commonly used in 3D elasticity problems has 20 nodes with quadratic basis functions, and thus will result in a 20 × 20 C–matrix (see Ottosen and Petersson (1992) Sec. 19.4 for an illustration). It is hence obvious that the C–matrix method has some limitations and is far from always a practical means to establish the basis functions for an element type.

We shall now consider a more direct method to establish basis functions and derive element stiffness matrices. While we at this stage stick to the one dimensional case, it will later be seen that the procedure is easily extended to manage basis functions of two and three variables.

As it turns out, it is beneficial to establish the basis functions in terms of a local coordinate system; here we use r to denote the local coordinate. Thus, for an element in the interval xᵢ ≤ x ≤ xᵢ₊₁ we introduce an r–coordinate and, furthermore, scale it such that r = −1 at the left element end and r = +1 at the right end. For any coordinate x inside the element, we then have a corresponding coordinate r

u_h = α₀ + α₁ x + α₂ x² = [1  x  x²] [α₀  α₁  α₂]ᵀ = N α̃     (2.148)

[1 0 0; 0 1 0; 0 0 1] [a₁ᵉ  a₂ᵉ  a₃ᵉ]ᵀ = [1 xᵢ xᵢ²; 1 xᵢ₊₁ xᵢ₊₁²; 1 xᵢ₊₂ xᵢ₊₂²] [α₀  α₁  α₂]ᵀ     (2.149)

C = [1 xᵢ xᵢ²; 1 xᵢ₊₁ xᵢ₊₁²; 1 xᵢ₊₂ xᵢ₊₂²]     (2.150)

Nᵉ = N C⁻¹ = [1  x  x²] C⁻¹     (2.151)

[Figure: an element occupying xᵢ ≤ x ≤ xᵢ₊₁, with the local coordinate r running from r = −1 at the left end to r = +1 at the right end and x(r) = (xᵢ₊₁ + xᵢ)/2 + (xᵢ₊₁ − xᵢ)/2 · r]

(2.152)

also inside the element; here, as before, h = xᵢ₊₁ − xᵢ is the element length. The reader is encouraged to verify that r = −1 maps to the left end x = xᵢ, that r = 1 maps to xᵢ₊₁, and that the midpoint r = 0 maps to the element centre (xᵢ₊₁ + xᵢ)/2.

In order to establish the basis functions in the local coordinate r, the Lagrange polynomials come in handy. Given a set of n distinct points rᵢ, i = 1, …, n, the i:th Lagrange polynomial is

(2.153)

Note that the numerator embraces n − 1 factors that all are linear in r. Hence, lᵢ⁽ⁿ⁻¹⁾(r) is a polynomial of order n − 1, and the factors in the numerator ensure that it takes on zero value in all points except at point rᵢ; in addition, the denominator has been constructed so that the function value is 1 at rᵢ. We conclude that

(2.154)

and with n distinct points we have n Lagrange polynomials lᵢ⁽ⁿ⁻¹⁾, i = 1, …, n, of degree n − 1. These are exactly the features we want our basis functions to exhibit.

Let us detail a couple of examples and see how to establish the element stiffness matrix. With n = 2 and the points r₁ = −1 and r₂ = 1 we find the two linear (n − 1 = 1) polynomials

(2.155)

Similarly, we find three quadratic (n − 1 = 2) polynomials when n = 3. Using r₁ = −1, r₂ = 0 and r₃ = 1, one finds

(2.156)

x(r) = (xᵢ₊₁ + xᵢ)/2 + (h/2) r     (2.152)

l_i^(n−1)(r) = [(r − r₁)(r − r₂)⋯(r − rᵢ₋₁)(r − rᵢ₊₁)⋯(r − r_n)] / [(rᵢ − r₁)(rᵢ − r₂)⋯(rᵢ − rᵢ₋₁)(rᵢ − rᵢ₊₁)⋯(rᵢ − r_n)]     (2.153)

l_i^(n−1)(r_j) = 1 if j = i,   0 if j ≠ i     (2.154)

l₁^(1)(r) = (r − r₂)/(r₁ − r₂) = (1 − r)/2,    l₂^(1)(r) = (r − r₁)/(r₂ − r₁) = (1 + r)/2     (2.155)

l₁^(2)(r) = (r − r₂)(r − r₃)/[(r₁ − r₂)(r₁ − r₃)] = r(r − 1)/2
l₂^(2)(r) = (r − r₁)(r − r₃)/[(r₂ − r₁)(r₂ − r₃)] = (1 − r)(1 + r)
l₃^(2)(r) = (r − r₁)(r − r₂)/[(r₃ − r₁)(r₃ − r₂)] = r(r + 1)/2     (2.156)
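The defining formula (2.153) translates directly into code. The sketch below is plain Python with illustrative names; it evaluates the i:th Lagrange polynomial for an arbitrary point set and reproduces the values of Eqs. (2.155)–(2.156).

```python
def lagrange(i, points, r):
    """Evaluate the i:th Lagrange polynomial (1-based i) for the given points at r, cf. Eq. (2.153)."""
    ri = points[i - 1]
    value = 1.0
    for j, rj in enumerate(points, start=1):
        if j != i:
            value *= (r - rj) / (ri - rj)
    return value

linear = [-1.0, 1.0]             # n = 2, Eq. (2.155)
quadratic = [-1.0, 0.0, 1.0]     # n = 3, Eq. (2.156)

print(lagrange(1, linear, -1.0), lagrange(1, linear, 1.0))        # 1.0 0.0
print(lagrange(2, quadratic, 0.0), lagrange(2, quadratic, 1.0))   # 1.0 0.0
print(lagrange(3, quadratic, 0.5))                                # r(r+1)/2 = 0.375
```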


Below the graphs of the Lagrange polynomials in Eqs. (2.155) and (2.156) are depicted.

Notice that in both examples we chose point locations that included the interval ends: r₁,₂ = ∓1 in Eq. (2.155) and r₁,₃ = ∓1 in Eq. (2.156), respectively. This ensures that we have nodes in the element end points when we use the coordinate transformation (2.152) and use the Lagrange polynomials as basis functions. Thus, we will have a C⁰–continuous approximation and therefore a compatible element.

Prior to establishing explicit expressions for element stiffness matrices, we briefly repeat the FE–formulation of a BVP that involves the differential equation Eq. (2.1); here we let the equation be defined on the interval 0 < x < L. First we multiply both sides of the equation by an (almost) arbitrary test function v. Integration by parts then yields

(2.157)

cf. Eqs (2.19) and (2.20). Here we only pay attention to the left hand side, since it will give us the stiffness matrix. We introduce the FE–approximation u_h as a linear combination of chosen basis functions

(2.158)

where N is a row vector with the basis functions (see Eq. (2.27)).

Thus, we may write

(2.159)

where

(2.160)

[Figure: Linear Lagrange polynomials l₁^(1)(r) and l₂^(1)(r) on −1 ≤ r ≤ 1]
[Figure: Quadratic Lagrange polynomials l₁^(2)(r), l₂^(2)(r) and l₃^(2)(r) on −1 ≤ r ≤ 1]

∫₀ᴸ (dv/dx) D(x) (du/dx) dx = ∫₀ᴸ v f dx + [v D du/dx]₀ᴸ     (2.157)

u ≈ u_h = Σ_{i=1}^{n} Nᵢ(x) aᵢ = N a     (2.158)

du/dx ≈ du_h/dx = B a     (2.159)

B = dN/dx = [dN₁/dx  dN₂/dx  …  dN_n/dx]     (2.160)

see Eqs. (2.29) and (2.30). The approximation is substituted into the left hand side of Eq. (2.157) and the test function is selected according to the Galerkin method, viz. v = N₁, N₂, …, N_n; collecting the results row–wise, we obtain

(2.161)

As we have seen, the integration is carried out element wise to generate element stiffness matrices Kᵉ that are subsequently assembled to produce K. For an element with end points xᵢ and xᵢ₊₁ we hence calculate

(2.162)

where, rather than B, we use

(2.163)

i.e. only the basis functions that are non–zero on the element are involved here. We also used D to denote the value of the constitutive function D(x) on the element. As we mentioned earlier, the constitutive property (if not constant) is rarely integrated, but one would use elements that are small enough for D(x) to be approximately constant on each element.

Since we use a local coordinate r to construct our basis functions (e.g. Eq. (2.155) or (2.156)), we want to carry out the required integration in that coordinate, which calls for a change of variables. From the transformation of variables Eq. (2.152) we have

(2.164)

so that

(2.165)

and since x(−1) = xᵢ and x(1) = xᵢ₊₁, Eq. (2.162) becomes

(2.166)

The chain rule is invoked to calculate the derivatives dNᵢᵉ/dx that are needed to establish Bᵉ

(2.167)

∫₀ᴸ Bᵀ D B dx · a = K a     (2.161)

Kᵉ = ∫_{xᵢ}^{xᵢ₊₁} Bᵉᵀ D Bᵉ dx     (2.162)

Bᵉ = [dN₁ᵉ/dx  dN₂ᵉ/dx  …]     (2.163)

dx/dr = h/2     (2.164)

dx = (h/2) dr     (2.165)

Kᵉ = (D h/2) ∫₋₁¹ Bᵉᵀ Bᵉ dr     (2.166)

dNᵢᵉ/dr = (dx/dr)(dNᵢᵉ/dx) = (h/2) dNᵢᵉ/dx     (2.167)

Thus, Eq. (2.163) gives us

(2.168)

For an element with two nodes, we would use the two linear Lagrange polynomials Eq.(2.155), in which case

(2.169)

and the element stiffness matrix Eq. (2.166) is

(2.170)

which indeed is the matrix we used in the examples of Sec. 2.5.

Using the three Lagrange polynomials Eq. (2.156) as basis functions, we have that

(2.171)

Hence, the stiffness matrix for an element with three nodes and quadratic basis functionsbecomes

(2.172)

The three node variables on the element do of course represent the value of the FE–approximation u_h at the points x = xᵢ (r = −1), x = (xᵢ₊₁ + xᵢ)/2 (r = 0), and x = xᵢ₊₁ (r = 1), respectively, i.e. the coordinates used to establish the Lagrange functions in Eq. (2.156).
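Both element stiffness matrices above can be recovered by evaluating the integral in Eq. (2.166) numerically. The sketch below assumes NumPy and uses Gauss quadrature on the local coordinate; the numerical values D = 1 and h = 0.25 are chosen for illustration only.

```python
import numpy as np

def Ke_linear(D, h, n_gauss=2):
    """Two-node element stiffness, cf. Eqs. (2.166) and (2.169)-(2.170)."""
    r, w = np.polynomial.legendre.leggauss(n_gauss)          # Gauss points on [-1, 1]
    Ke = np.zeros((2, 2))
    for rk, wk in zip(r, w):
        Be = (1.0 / h) * np.array([[-1.0, 1.0]])             # Eq. (2.169), constant in r
        Ke += wk * (D * h / 2.0) * Be.T @ Be
    return Ke

def Ke_quadratic(D, h, n_gauss=3):
    """Three-node element stiffness, cf. Eqs. (2.171)-(2.172)."""
    r, w = np.polynomial.legendre.leggauss(n_gauss)
    Ke = np.zeros((3, 3))
    for rk, wk in zip(r, w):
        Be = (1.0 / h) * np.array([[2*rk - 1.0, -4*rk, 2*rk + 1.0]])   # Eq. (2.171)
        Ke += wk * (D * h / 2.0) * Be.T @ Be
    return Ke

print(Ke_linear(1.0, 0.25) * 0.25)          # [[1, -1], [-1, 1]], i.e. (D/h)*[1 -1; -1 1]
print(Ke_quadratic(1.0, 0.25) * 3 * 0.25)   # [[7, -8, 1], [-8, 16, -8], [1, -8, 7]]
```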

2.7 A Note on the Load Vectors

Here we will briefly contemplate the part of the load vector that stems from the right hand side f = f(x) of the differential equation Eq. (2.1). If the equation is defined on the interval 0 < x < L, and with the basis functions collected in a row vector

(2.173)

we have the structure load vector

Bᵉ = [dN₁ᵉ/dx  dN₂ᵉ/dx  …] = (2/h) [dN₁ᵉ/dr  dN₂ᵉ/dr  …]     (2.168)

Bᵉ = (2/h) [−1/2   1/2] = (1/h) [−1   1]     (2.169)

Kᵉ = (D h/2)(1/h)² [1 −1; −1 1] ∫₋₁¹ dr = (D/h) [1 −1; −1 1]     (2.170)

Bᵉ = (2/h) [r − 1/2   −2r   r + 1/2] = (1/h) [(2r − 1)   −4r   (2r + 1)]     (2.171)

Kᵉ = (D h/2)(1/h)² ∫₋₁¹ [(2r−1)²  (4r−8r²)  (4r²−1); (4r−8r²)  16r²  −(4r+8r²); (4r²−1)  −(4r+8r²)  (2r+1)²] dr = D/(3h) [7 −8 1; −8 16 −8; 1 −8 7]     (2.172)

N = [N₁  N₂  …  N_n]     (2.173)

(2.174)

It should be recognized that the load vector also may get contributions from Neumann and/orRobin boundary conditions, cf. Eq. (2.35) or the examples in Secs. 2.5.3 and 2.5.4, but in the

present setting we are only interested in the contribution from the forcing function in

the differential equation.

By construction, the :th component of the vector is

(2.175)

and it follows that the sum of the vector components is

(2.176)

Now recall the necessary condition of completeness, Sec. 2.6.1; in particular it must be possi-

ble to combine basis functions on an element so as to get an arbitrary constant, say ,

(2.177)

The manner in which we construct our basis functions, viz. a basis function has the value 1 in one node and is zero in all other nodes, makes Eq. (2.177) hold if all coefficients αᵢ are chosen to be c: αᵢ = c, i = 1, 2, …. As a special case we have, with c = 1, that the sum of the basis functions on an element equals 1

(2.178)

The reader is encouraged to verify that this identity holds for the two linear Lagrange polynomials Eq. (2.155) as well as for the three quadratic polynomials in Eq. (2.156). Since the basis functions on any one element sum up to unity, it immediately follows that the sum of all basis functions on all elements adds up to 1

(2.179)

Hence, Eq. (2.176) becomes

(2.180)

i.e. the components of the load vector add up to the load resultant. As an illustrative example

one may take the sample problem in Sec. 2.5.1, where f = (1+x)⁻², see Eq. (2.51). Viewing the BVP as a model of a one dimensional elasticity problem, we have that f = KₓA represents a

f = ∫₀ᴸ Nᵀ f dx     (2.174)

fᵢ = ∫₀ᴸ Nᵢ f dx     (2.175)

Σ_{i=1}^{n} fᵢ = ∫₀ᴸ (N₁ + N₂ + … + N_n) f dx     (2.176)

Σᵢ αᵢ Nᵢᵉ = c     (2.177)

Σᵢ Nᵢᵉ = 1     (2.178)

N₁ + N₂ + … + N_n = 1     (2.179)

Σ_{i=1}^{n} fᵢ = ∫₀ᴸ f dx     (2.180)

frictional load (force/length) along a bar with unit length (0 < x < 1). A force KₓA dx = (1+x)⁻² dx is hence applied along a length dx of the rod, so the total load is

(2.181)

which indeed equals the sum of the structure load vector components, see Eq. (2.83).

As previously explained and illustrated, the structure load vector is in practice obtained byassembly of element load vectors

(2.182)

where Nᵉ is a row vector with the basis functions that have non–zero values on the element, cf. Eq. (2.50). Analogous to the case with the structure load vector, we thus find the i:th component of fᵉ as

(2.183)

and since the basis functions on the element add up to , see Eq. (2.178), we deduce that the

sum of the element load vector components is the load resultant on the element

(2.184)

Element load vectors calculated according to Eq. (2.182) are said to be consistent. For instance, the element load vectors Eqs. (2.65) and (2.66) used in the example Sec. 2.5.1 are consistent load vectors. As was seen in that example, some rather elaborate integrals may arise, and most commercial FE–programs do not allow the user to specify a general function f, but admit some constant value only. One may then for instance take the load resultant F = ∫_{xᵢ}^{xᵢ₊₁} f dx on the element and split it between the nodes on the element. For our element with two nodes and two linear basis functions we would obtain

(2.185)

Element load vectors calculated this way are said to be lumped and the error introduced willbe smaller the more elements that are used in the analysis.
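The difference between consistent and lumped element load vectors is easy to see numerically. The sketch below assumes NumPy and, for illustration, uses the first element of the sample problem, with f(x) = (1+x)⁻² on [0, 0.25].

```python
import numpy as np

xi, xip1 = 0.0, 0.25
h = xip1 - xi

def trapz(y, x):                            # simple trapezoidal rule
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def f(x):
    return (1.0 + x) ** -2

x = np.linspace(xi, xip1, 20001)
N1 = (xip1 - x) / h                         # linear element basis functions, Eq. (2.147)
N2 = (x - xi) / h

fe_consistent = np.array([trapz(N1 * f(x), x),
                          trapz(N2 * f(x), x)])        # Eq. (2.182)

F = trapz(f(x), x)                          # load resultant on the element
fe_lumped = 0.5 * F * np.array([1.0, 1.0])             # Eq. (2.185)

print(fe_consistent, fe_consistent.sum())   # components differ, but their sum equals F
print(fe_lumped, F)
```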

F = ∫₀¹ (1+x)⁻² dx = [−(1+x)⁻¹]₀¹ = 1/2     (2.181)

fᵉ = ∫_{xᵢ}^{xᵢ₊₁} Nᵉᵀ f dx     (2.182)

fᵢᵉ = ∫_{xᵢ}^{xᵢ₊₁} Nᵢᵉ f dx     (2.183)

Σᵢ fᵢᵉ = ∫_{xᵢ}^{xᵢ₊₁} f dx     (2.184)

fᵉ = (F/2) [1  1]ᵀ     (2.185)

3. Abstract Formulation of the Finite Element Method

This far we have encountered a number of boundary value problems (BVP) that include various differential equations. Our exposé started with a one–dimensional 2nd order problem and went on with 2nd order problems in a two–dimensional setting, where we first studied Poisson's equation and subsequently the elasticity problem (where there are two or three differential equations with displacements represented by two or three unknown functions). In addition, a one–dimensional 4th order problem, viz. the Euler–Bernoulli beam equation, has been dealt with. In each case we have derived a weak problem that has the same solution, u, as the original BVP and subsequently derived a finite element formulation that yields an approximate solution u_h.

In subsequent chapters we will be concerned with the error in the approximation, e = u − u_h, and how the error diminishes as we increase the number of elements and nodes to improve the approximation. Yet another question is why we use the Galerkin method; there are obviously many other ways in which we could choose test functions to obtain an equation system to solve for the node variables. All these issues could of course be studied one BVP at a time, but as it turns out there are a few common properties, shared by all the weak forms that we have encountered, that matter in these contexts. This invokes a so–called abstract formulation of the finite element method: a formulation that includes all the BVPs and FE formulations that we have encountered. Hence, by working with this notation, all proofs and conclusions made will be valid for all problems that we have studied this far.

The abstract formulation calls for the use of inner products and norms of functions. We intro-duce these concepts in Sec. 3.2, but first we briefly study the same concept in a more familiarsetting, viz. vectors.

3.1 Vector Spaces

Let us consider vectors in , that is vectors with components. We are familiar with the

notation

(3.1)

to express a vector , and it is then implicit that each component is associated with a

basis vector . Hence, Eq. (3.1) may be thought of as a shorthand for

(3.2)

Another way to think about this, is that the point in represents a vector; it

may be visualized as an arrow that extends from the origin to the coordinate .

Remark: In this text we assume that the basis vectors are linear independent, i.e. that

none of the basis vectors can be expressed as a linear combination of the other basis vectors.

Only then do the basis span . m

Next we consider a linear system of equations

(3.3)

where the right hand side is a given vector, and the coefficient matrix likewise are

known; furthermore, we stick to the case where the matrix inverse exists, so that there is

v = [v₁  v₂  …  v_n]ᵀ     (3.1)

v = v₁e₁ + v₂e₂ + … + v_ne_n     (3.2)

K a = f     (3.3)

a unique vector a ∈ Rⁿ that solves Eq. (3.3). Assume now that we have a vector v_j ∈ Rⁿ that is thought to be a reasonable approximation of the solution a — how to obtain such an approximation is not an issue here, but we mention that there is a wealth of iterative solution methods that generate sequences of approximate solutions v_j, j = 1, 2, …, that in some sense converge to a: v_j → a as j → ∞. Substituting v_j into Eq. (3.3), we will not retrieve the right hand side, but rather some approximation f_j of it

right hand side, but rather some approximation of it

(3.4)

and we are interested to investigate how good the approximation is. To this end, we sub-

tract Eq. (3.4) from Eq. (3.3)

(3.5)

If we define the error by

(3.6)

and the residual

(3.7)

we hence have an ‘error equation’

(3.8)

It is here recognized that the residual may be thought of as an 'out of balance' force; f is the load vector that causes the response a, while f_j is the load required to obtain the approximation v_j. Moreover, the residual is easily computed, since f is a given vector and f_j is obtained from Eq. (3.4), i.e.

(3.9)

In contrast to this, the error vector e_j, Eq. (3.6), is unknown (unless we have solved for a, in which case an approximation v_j is of no interest). Hence, it seems plausible to estimate the error by inspecting the 'size' of the out of balance force r_j. In this and other cases where one needs to determine the 'size' of a vector, vector norms provide such measuring instruments.
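The distinction is easy to see in code: the residual r_j is computable from K, f and the current iterate, while the error e_j requires the exact solution. The sketch below assumes NumPy; a Jacobi iteration is used only as a convenient way of generating the sequence v_j, and the matrix and load vector are illustrative.

```python
import numpy as np

K = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
f = np.array([1.0, 2.0, 3.0])
a = np.linalg.solve(K, f)          # exact solution; normally unknown

v = np.zeros(3)                    # Jacobi iteration producing approximations v_j
D = np.diag(K)
for j in range(10):
    v = (f - (K @ v - D * v)) / D
    r = f - K @ v                  # residual, Eq. (3.9) -- computable
    e = a - v                      # error,    Eq. (3.6) -- unknown in practice
    print(j, np.linalg.norm(r), np.linalg.norm(e))
```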

3.1.1 Vector Norms

The norm of a vector v ∈ Rⁿ is denoted ‖v‖ and is a real number that is a function of the vector components: ‖·‖ : Rⁿ → R. A real valued function of the vector components is a norm if it satisfies

1. ‖v‖ ≥ 0  and  ‖v‖ = 0 ⟺ v = 0 (3.10)

2. ‖αv‖ = |α| ‖v‖, where α ∈ R (3.11)

K v_j = f_j     (3.4)

K a − K v_j = f − f_j   ⟹   K (a − v_j) = f − f_j     (3.5)

e_j = a − v_j     (3.6)

r_j = f − f_j     (3.7)

K e_j = r_j     (3.8)

r_j = f − K v_j     (3.9)

3. ‖v + w‖ ≤ ‖v‖ + ‖w‖   (v, w ∈ Rⁿ) (3.12)

The last equation, known as the triangle inequality, is given a geometrical interpretation by the illustration of the vectors v, w and v + w.

Commonly used vector norms include the 1–, 2–, and infinity norms, given by

(3.13)

(3.14)

and

(3.15)

respectively; it is left as an exercise to the reader to verify that these three norms satisfy the requirements Eqs. (3.10)–(3.12). The 1–norm is sometimes referred to as the 'taxicab' norm, since it measures the size of v by the number of 'blocks' one has to travel in a rectangular grid. The 2–norm may be understood as the length of the vector by means of a generalization of the Pythagorean theorem; it is therefore also known as the Euclidean vector norm.

With p = 1, p = 2, and p → ∞ the norms Eqs. (3.13)–(3.15) are special cases of the p–norm

(3.16)

which is implemented in Matlab by means of the function norm(v,p); if the second argument (p) is omitted, Matlab uses p = 2 to calculate the norm of v.
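The same norms are one line each in NumPy, mirroring the Matlab call mentioned above; a minimal sketch with an illustrative vector:

```python
import numpy as np

v = np.array([3.0, -4.0, 12.0])

print(np.linalg.norm(v, 1))        # 1-norm  ('taxicab'):  19.0
print(np.linalg.norm(v, 2))        # 2-norm  (Euclidean):  13.0
print(np.linalg.norm(v, np.inf))   # infinity norm:        12.0
print(np.linalg.norm(v))           # the default is the 2-norm, as in Matlab
```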

3.1.2 Inner Products

An inner product of two vectors, say and , is a real valued function of the compo-

nents of the two vectors and denoted ; . A function of two vectors is an

inner product, if it is

1. symmetric (3.17)

2. linear where (3.18)

3. positive definite and (3.19)

Note that if is linear in its first argument according to Eq. (3.18), then it is also linear in

the second argument due to the symmetry Eq. (3.17).

[Figure: the triangle inequality ‖v + w‖ ≤ ‖v‖ + ‖w‖ illustrated with the vectors v, w and v + w]

‖v‖₁ = Σ_{i=1}^{n} |vᵢ|     (3.13)

‖v‖₂ = ( Σ_{i=1}^{n} vᵢ² )^{1/2}     (3.14)

‖v‖_∞ = maxᵢ |vᵢ|     (3.15)

‖v‖_p = ( Σ_{i=1}^{n} |vᵢ|ᵖ )^{1/p}     (3.16)

(v, w) = (w, v)     (3.17)

(α₁v₁ + α₂v₂, w) = α₁(v₁, w) + α₂(v₂, w),    α₁, α₂ ∈ R     (3.18)

(v, v) ≥ 0  and  (v, v) = 0 ⟺ v = 0     (3.19)

The probably most common example of an inner product, is the scalar product

(3.20)

In particular, if the vectors are said to be orthogonal to each other. We also remark

that the scalar product is closely connected to the Euclidean vector norm Eq. (3.14), since

(3.21)

It is quite common with situations where the presence of an inner product suggests the use aa related norm.

As another example of an inner product we may consider an matrix . If the matrix is

symmetric and positive definite, the function

(3.22)

satisfies Eqs. (3.17)–(3.19) and is hence an inner product. The vectors v and w are said to be K–orthogonal in case vᵀKw = 0. The vector norm associated to the inner product Eq. (3.22) is of course

(3.23)

3.2 Function Spaces

We now turn our attention to functions in some function space V: v(x) ∈ V. Here (x) is used to denote the independent variable(s), irrespective of how many there are. Hence, if the functions are defined on some area in the (x, y)–plane or over some volume, our notation should be understood as (x) = (x, y) and (x) = (x, y, z), respectively. Furthermore, Ω denotes the domain of definition and Γ its boundary. For functions of a single variable, Ω thus is an interval (e.g. x₁ < x < x₂) and Γ is the end points.

As an example we may consider the Poisson equation on a domain Ω, with homogeneous Dirichlet conditions on the boundary (i.e. u = 0 on Γ). The weak problem does then involve functions in the space

functions in the space

(3.24)

that is functions that satisfy homogeneous Dirichlet conditions and that are sufficiently regu-lar for involved integrals to exist.

Another example that we have encountered at several occasions is the finite element space

. Here a function if it can be expressed as a linear combination of the basis func-

tions , , i.e. we have

(3.25)

(v, w) = vᵀw = Σ_{i=1}^{n} vᵢwᵢ     (3.20)

‖v‖₂ = √( (v, v) )     (3.21)

(v, w) = vᵀKw     (3.22)

‖v‖_K = √( vᵀKv )     (3.23)

V = { v : v = 0 on Γ,   ∫_Ω v² dΩ < ∞,   ∫_Ω (∇v)ᵀ∇v dΩ < ∞ }     (3.24)

V_h = { v : v = Σ_{i=1}^{n} Nᵢ(x) cᵢ }     (3.25)

(cᵢ ∈ R). As previously noted, if all basis functions are in V (Nᵢ ∈ V), the FE–space is a subspace of all admissible functions: V_h ⊂ V.

Next we consider a boundary value problem that involves a differential equation with a linear differential operator D(·)

(3.26)

where u = u(x) ∈ V is the unknown (which may be a vector valued function, such as e.g. the displacements in a two–dimensional elasticity problem) and f = f(x) is a given load function.

The differential operator may for instance be

(3.27)

in which case Eq. (3.26) is the Poisson equation. The operator is linear if

(3.28)

We encounter linear operators only, in the course.

Assume now that we have a FE–approximation u_hj ∈ V_hj ⊂ V of the exact solution u; by e.g. splitting all element lengths in half we may refine our discretization and improve the approximation. Successive refinements will produce a sequence of approximate solutions u_hj, j = 1, 2, …, that we suppose converges to u in some sense: u_hj → u as j → ∞. Substituting u_hj into the governing differential equation (3.26), we will not retrieve the right hand side, but rather some approximate load function f_j, i.e.

rather some approximate load function , i.e.

(3.29)

and we are interested to investigate how good the approximation is. To this end, we

subtract Eq. (3.29) from Eq. (3.26)

(3.30)

where we used the fact that is linear. If we define the error by

(3.31)

and the residual

(3.32)

we hence have an ‘error equation’

(3.33)

It is here recognized that the residual may be thought of as an ‘out of balance’ force; is the

load function that causes the response , while is the load required to obtain the approxi-

mation . Moreover, the residual may be calculated, since is a given function and is

obtain from Eq. (3.29), i.e.

D(u) = f   in Ω     (3.26)

D(·) = −div[ D ∇(·) ]     (3.27)

D(u₁ + u₂) = D(u₁) + D(u₂)     (3.28)

D(u_hj) = f_j     (3.29)

D(u) − D(u_hj) = f − f_j   ⟹   D(u − u_hj) = f − f_j     (3.30)

e_j = u − u_hj     (3.31)

r_j = f − f_j     (3.32)

D(e_j) = r_j     (3.33)

(3.34)

In contrast to this, the error function e_j ∈ V, Eq. (3.31), is unknown. Hence, it seems plausible to estimate the error by inspecting the 'size' of the out of balance force r_j = r_j(x). In this and other cases where one needs to determine the 'size' of a function, function norms provide such measuring instruments.

3.2.1 Function Norms

The norm of a function v ∈ V is denoted ‖v‖ and is a real valued functional (a 'function of the function'): ‖·‖ : V → R. A real valued functional is a norm if it satisfies

1. ‖v‖ ≥ 0  and  ‖v‖ = 0 ⟺ v = 0 (3.35)

2. ‖αv‖ = |α| ‖v‖, where α ∈ R (3.36)

3. ‖v + w‖ ≤ ‖v‖ + ‖w‖   (v, w ∈ V) (3.37)

For the boundary value problems we have encountered, it turns out that the so called energynorm is natural to use. The energy norm is introduced in Sec. 3.3.3; here we settle with acouple of other examples in order to illustrate the concept of function norms.

The possibly most frequently encountered norm in various contexts is the L²–norm

(3.38)

for functions defined over a domain Ω. Notice how this may be conceived as an analogue of the Euclidean vector norm Eq. (3.14).

By including squared first derivatives, we obtain the H¹–norm

(3.39)

which is closely related to the energy norm that we will adhere to.

We remark that these norms may be utilized for vector valued functions, e.g. v = [vₓ(x, y)  v_y(x, y)]ᵀ, in which case v² is to be taken as

(3.40)

and the gradient operator ∇ = [∂/∂x  ∂/∂y]ᵀ (in a two dimensional case) is replaced by

r_j = f − D(u_hj)     (3.34)

‖v‖_{L²(Ω)} = ( ∫_Ω v² dΩ )^{1/2}     (3.38)

‖v‖_{H¹(Ω)} = ( ∫_Ω ( v² + (∇v)ᵀ∇v ) dΩ )^{1/2}     (3.39)

v² = vᵀv = vₓ² + v_y²     (3.40)
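In one dimension the L²– and H¹–norms of Eqs. (3.38)–(3.39) can be approximated with any quadrature rule. A minimal sketch (NumPy assumed) for the illustrative choice v(x) = sin(πx) on (0, 1), where both norms are known in closed form:

```python
import numpy as np

def trapz(y, x):                    # simple trapezoidal rule
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x  = np.linspace(0.0, 1.0, 20001)
v  = np.sin(np.pi * x)
dv = np.pi * np.cos(np.pi * x)

L2 = np.sqrt(trapz(v**2, x))              # Eq. (3.38): sqrt(1/2) ~ 0.7071
H1 = np.sqrt(trapz(v**2 + dv**2, x))      # Eq. (3.39): sqrt(1/2 + pi^2/2) ~ 2.331

print(L2, H1)
```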


(3.41)

3.2.2 Inner Products

An inner product of two functions, say v ∈ V and w ∈ V, is a real valued function of the two functions and denoted (v, w); (·,·) : V × V → R. A function of two functions is an inner product, if it is

1. symmetric (3.42)

2. linear, with α₁, α₂ ∈ R (3.43)

3. positive definite (3.44)

Note that if (·,·) is linear in its first argument according to Eq. (3.43), then it is also linear in the second argument due to the symmetry Eq. (3.42).

The probably most common example of an inner product, is the scalar product

(3.45)

cf. Eq. (3.20). In particular, if the functions are said to be orthogonal to each other.

We also remark that the scalar product is closely connected to the –norm Eq. (3.38), since

(3.46)

As in the case with vector spaces, it is quite common with situations where the presence of aninner product suggests the use a a related norm.

As another example we may take

(3.47)

which satisfies Eqs. (3.42)–(3.44) and is hence an inner product; the norm associated to thisinner product is of course

(3.48)

Once again, if the functions are said to be –orthogonal.

∇̃ = [∂/∂x  0;   0  ∂/∂y;   ∂/∂y  ∂/∂x]     (3.41)

(v, w) = (w, v)     (3.42)

(α₁v₁ + α₂v₂, w) = α₁(v₁, w) + α₂(v₂, w),    α₁, α₂ ∈ R     (3.43)

(v, v) ≥ 0  and  (v, v) = 0 ⟺ v = 0     (3.44)

(v, w) = ∫_Ω v w dΩ     (3.45)

‖v‖_{L²(Ω)} = √( (v, v) )     (3.46)

(v, w)_{H¹} = ∫_Ω ( v w + (∇v)ᵀ∇w ) dΩ     (3.47)

‖v‖_{H¹(Ω)} = √( (v, v)_{H¹} )     (3.48)

3.3 Abstract Formulation

3.3.1 The Variational Problem

Here we start off with studying a boundary value problem with thePoisson equation and a homogeneous Dirichlet condition only

(3.49)

where the constitutive matrix is symmetric and positive definite

(3.50)

The fact that a homogenous condition only is taken into account may appear to be a limita-tion, but due to the linearity of the differential operator this it is not the case. To see this,consider the non–homogeneous case

(3.51)

where is a given function. Now take any twice differentiable function

such that on ; substitution into (3.51) yields

(3.52)

Now if , we have found a solution ( ); otherwise we subtract Eq. (3.52) from (3.51)

to get

(3.53)

so that with

(3.54)

we have

(3.55)

Solving the BVP (3.55), we thus find the solution of Eq. (3.51) as

(3.56)

Hence, restricting our presentation to the homogeneous BVP Eq. (3.49) (or equivalently Eq.(3.55)), is in effect not a limitation.

[Figure: a two–dimensional domain Ω with boundary Γ and outward normal n, in the (x, y)–plane]

−div[ D ∇u ] = Q  in Ω,    u = 0 on Γ     (3.49)

Dᵀ = D,    xᵀDx > 0   (x ≠ 0)     (3.50)

−div[ D ∇u ] = Q  in Ω,    u = g on Γ     (3.51)

−div[ D ∇u₁ ] = Q₁  in Ω,    u₁ = g on Γ     (3.52)

−div[ D ∇(u − u₁) ] = Q − Q₁  in Ω,    u − u₁ = 0 on Γ     (3.53)

u₂ = u − u₁,    Q₂ = Q − Q₁     (3.54)

−div[ D ∇u₂ ] = Q₂  in Ω,    u₂ = 0 on Γ     (3.55)

u = u₁ + u₂     (3.56)

Let v = v(x, y) be a sufficiently regular test function, such that v = 0 on Γ. The variational problem, i.e. the weak form of Eq. (3.49), may then be written

(3.57)

Here it is immediately recognized that the right hand side is a scalar product on Ω and thus may be expressed as (Q, v), see Eq. (3.45). Let us also define

(3.58)

With V defined in Eq. (3.57), the variational problem may now be stated in abstract form

(3.59)

The one dimensional analogue of the BVP Eq. (3.49) reads

(3.60)

where is a given constitutive function and represents some given load. This type of

BVP was extensively studied in Ch. 2 where we found the weak form

(3.61)

Once again we recognize the right hand side as the scalar product (Q, v) on Ω = { x : x₁ < x < x₂ }.

Defining

(3.62)

we find that the variational problem may be stated as in Eq. (3.59) (with V given in Eq. (3.61)).

v v x y,( )= v 0= Γ

Find u V∈ v : v 0 on Γ= v2 Ω ∞<d

Ω∫ v∇( )T

v∇( ) Ω ∞<d

Ω∫, ,

such that=

v∇( )TD u∇( ) Ωd

Ω∫ vQ Ωd

Ω∫= v V∈∀

ΩQ v,( )

a w v,( ) v∇( )TD w∇( ) Ωd

Ω∫= v w V∈,

V

Find u V∈ such that

a u v,( ) Q v,( )= v V∈∀

xd

dD

xd

du– Q= x1 x x2< <

u x1( ) 0 u x2( ) 0=,=

D 0> Q

Find u V∈ v : v x1( ) v x2( ) 0== v2

xd

x1

x2

∫ ∞<xd

dv

2

xd

x1

x2

∫ ∞<, ,

such that=

xd

dvD

xd

duxd

x1

x2

∫ vQ xd

x1

x2

∫= v V∈∀

Q v,( ) Ω x1 x x2< < =

a w v,( )xd

dvD

xd

dwxd

x1

x2

∫=

V

Page 84: Compiled lecture notes

— 84 —

(3.63)

where once again D > 0 is a known constitutive function (e.g. the bending stiffness EI of an Euler–Bernoulli beam) and Q is some known loading (e.g. a transverse force intensity); see Ch. 17 in Ottosen and Petersson (1992) for details. With

(3.64)

the variational problem is

(3.65)

Again the right hand side is the scalar product (Eq. (3.45)) and if we define

(3.66)

Eq. (3.59), with according to Eq. (3.64), is our variational problem.

Finally we mention a BVP where the primary unknown is a vector valued function, such as the displacement u = [uₓ  u_y]ᵀ in a two dimensional elasticity problem; we intentionally avoid boldface vector notation here, and thus simply write u. With the differential operator ∇̃ defined in Eq. (3.41) the BVP is

(3.67)

where D is the symmetric and positive definite Hooke matrix, while Q = [bₓ  b_y]ᵀ is a given volume load on Ω. Note that there are two differential equations here and that u = 0 should be understood as [uₓ  u_y]ᵀ = [0  0]ᵀ. With test functions v = [vₓ  v_y]ᵀ and the function space V defined in Eq. (3.57), the weak problem reads

(3.68)

Here the right hand side may be comprehended as the scalar product of two vector valuedfunctions, i.e. we adopt the notation

d²/dx²( D d²u/dx² ) = Q,   x₁ < x < x₂;    u(x₁) = u(x₂) = 0,   du/dx|_{x=x₁} = du/dx|_{x=x₂} = 0     (3.63)

V = { v : v(x₁) = v(x₂) = 0,  dv/dx|_{x=x₁} = dv/dx|_{x=x₂} = 0,  ∫ v² dx < ∞,  ∫ (dv/dx)² dx < ∞,  ∫ (d²v/dx²)² dx < ∞ }     (3.64)

Find u ∈ V such that   ∫_{x₁}^{x₂} (d²v/dx²) D (d²u/dx²) dx = ∫_{x₁}^{x₂} v Q dx   ∀ v ∈ V     (3.65)

a(w, v) = ∫_{x₁}^{x₂} (d²v/dx²) D (d²w/dx²) dx     (3.66)

−∇̃ᵀ D ∇̃ u = Q  in Ω,    u = 0 on Γ     (3.67)

Find u ∈ [V]² such that   ∫_Ω (∇̃v)ᵀ D ∇̃u dΩ = ∫_Ω vᵀQ dΩ   ∀ v ∈ [V]²     (3.68)

(3.69)

It is left as an exercise to verify that according to Eq. (3.69) satisfies the conditions Eqs.

(3.42)–(3.44), and hence is an inner product. Additionally we define

(3.70)

and conclude that the variational problem Eq. (3.68) may be written as Eq. (3.59) if we

replace by , i.e. the abstract formulation is

(3.71)

3.3.2 Finite Element Formulation

When we look for an approximate solution to a boundary value problem by means of a finite

element method, we substitute the approximation into the weak form, and solve the

problem in a subspace . When the Galerkin method is utilized, both and the test

function(s) are in the FE–space , which is spanned by selected basis functions

(3.72)

Applying this to the variational problem Eq. (3.59), we get the FE formulation

(3.73)

where the inner product is defined in Eq. (3.58), (3.62) or (3.66), depending on which

differential equation we have in the BVP. The notation means that

(3.74)

where

(3.75)

are vectors with the basis functions and node variables, respectively, while means

that the equation should hold for any choice of coefficients in

(3.76)

(Q, v) = ∫_Ω vᵀQ dΩ = ∫_Ω ( vₓbₓ + v_y b_y ) dΩ     (3.69)

a(w, v) = ∫_Ω (∇̃v)ᵀ D ∇̃w dΩ     (3.70)

Find u ∈ [V]² such that   a(u, v) = (Q, v)   ∀ v ∈ [V]²     (3.71)

V_h = { v : v = Σ_{i=1}^{n} Nᵢcᵢ }     (3.72)

Find u_h ∈ V_h such that   a(u_h, v) = (Q, v)   ∀ v ∈ V_h     (3.73)

u_h = N₁(x)a₁ + N₂(x)a₂ + … + N_n(x)a_n = N a     (3.74)

N = [N₁ … N_n],    a = [a₁ … a_n]ᵀ     (3.75)

v = N₁(x)c₁ + N₂(x)c₂ + … + N_n(x)c_n = N c,    c = [c₁ … c_n]ᵀ     (3.76)

Here, as previously explained, should be taken as in cases where the BVP is

defined on domains with two or more independent variables.

In case there are two unknown functions in the BVP, such as is the case in the elasticityproblem Eq. (3.67), we obtain the FE formulation from the variational problem Eq. (3.71), i.e.we have

(3.77)

Again we have , but with

(3.78)

and with node variables in the column vector ; the vector with test functions is here

with in Eq. (3.78) and the coefficients in arbitrary.

3.3.3 The Energy Norm

Let us return to the definition of Eq. (3.58). First we notice that the constitutive matrix

is symmetric, so

(3.79)

Hence, . We also have that

(3.80)

so is linear in its first argument. Third, we see that

(3.81)

since the constitutive matrix is positive definite; the product Eq. (3.81) equals zero only if ∇v = 0. Thus, a(v, v) = 0 implies that the gradient is the zero vector everywhere in Ω, in which case v is a constant. However, v ∈ V so v = 0 on Γ and, therefore, a(v, v) = 0 implies v = 0. We may now conclude that a(·,·) satisfies Eqs. (3.42)–(3.44), and thus is an inner product on V.

In an elasticity problem a(w, w) is proportional to the elastic energy (or strain energy) stored in the domain, due to an imposed displacement w ∈ V (or w ∈ [V]²). For this reason a(w, v) is sometimes referred to as an energy product (irrespective of the type of problem at hand). In particular, if a(w, v) = 0 the functions are said to be energy orthogonal, or a–orthogonal.

Since a(·,·) is an inner product, it is natural to use the so called energy norm

(3.82)

whenever one wishes to determine the 'size' of a function w ∈ V, or vector valued function w ∈ [V]². For the Poisson equation we hence use

Ni x( ) Ni x y …, ,( )

Find uh Vh[ ]2∈ such that

a uh v,( ) Q v,( )= v Vh[ ]2∈∀

uh Na=

NN1 0 … Nn 0

0 N1 … 0 Nn

=

2n a

v Nc= N 2n c

a w v,( )D

v∇( )TD w∇( )[ ]

Tw∇( )T

v∇( )TD[ ]

Tw∇( )T

DT

v∇ w∇( )TD v∇( )= = =

a w v,( ) a v w,( )=

α1w1 α2w2+( )∇ α1 w1∇ α2 w2∇+=

a * *,( )

v∇( )TD v∇ 0≥

v∇ 0= a v v,( ) 0= Ωv v V∈ v 0= Γ v∇ 0= v 0=

a * *,( ) V

a w w,( )

w V∈ w V[ ]2∈ a w v,( )

a w v,( ) 0= a

a * *,( )

w a a w w,( )=

w V∈

w V[ ]2∈

Page 87: Compiled lecture notes

— 87 —

(3.83)

and in a two dimensional elasticity problem one has

(3.84)

where is some displacement vector. In the 2nd and 4th order one dimen-

sional problems that we have encountered, the energy norm is of course

(3.85)

and

(3.86)

respectively.

‖w‖_a = ( ∫_Ω (∇w)ᵀ D (∇w) dΩ )^{1/2}     (3.83)

‖w‖_a = ( ∫_Ω (∇̃w)ᵀ D (∇̃w) dΩ )^{1/2}     (3.84)

‖w‖_a = ( ∫_{x₁}^{x₂} D (dw/dx)² dx )^{1/2}     (3.85)

‖w‖_a = ( ∫_{x₁}^{x₂} EI (d²w/dx²)² dx )^{1/2}     (3.86)

4. The Minimization Problem

4.1 Preliminaries

Consider a boundary value problem that involves any of the differential equations that wehave encountered this far. We restrict this presentation to homogeneous Dirichlet data, butrecall that this in effect is no limitation. Hence we discuss

(4.1)

where D(·) is a linear differential operator, u = u(x) ∈ V is the unknown (which may be a vector valued function, such as e.g. the displacements in a two–dimensional elasticity problem) and b = b(x) is a given load function. Furthermore, we use the notation (x) in a generic manner, in so far that it denotes all the independent coordinates (e.g. it is to be understood as (x, y) in a two dimensional problem, such as the one symbolically depicted above).

The weak version of the BVP is given by the variational problem

(4.2)

Here, as before, is the space of functions that satisfies essential boundary conditions and

that are sufficiently regular.

We first show that Eq. (4.2) has a unique solution. To this end, assume that the functions

and solves the variational problem, so that

(4.3)

and

(4.4)

Subtracting Eq. (4.4) from Eq. (4.3), we get

(4.5)

or, since is linear in its first argument

(4.6)

Selecting , we find that , i.e. that . It follows that

, so any solution to the variational problem Eq. (4.2) is unique.

In a finite element formulation we have a finite element space V_h that is spanned by a set of selected basis functions N₁, N₂, …, N_n. We have a conform method if Nᵢ ∈ V (∀ i), in which case V_h ⊂ V. The FE formulation is then simply the variational problem posed in the subspace, i.e.

(4.7)

[Figure: a two–dimensional domain Ω with boundary Γ in the (x, y)–plane]

D(u) = b  in Ω,    u = 0 on Γ     (4.1)

Find u ∈ V such that   a(u, v) = (b, v)   ∀ v ∈ V     (4.2)

a(u₁, v) = (b, v)   ∀ v ∈ V     (4.3)

a(u₂, v) = (b, v)   ∀ v ∈ V     (4.4)

a(u₁, v) − a(u₂, v) = 0   ∀ v ∈ V     (4.5)

a(u₁ − u₂, v) = 0   ∀ v ∈ V     (4.6)

Find u_h ∈ V_h such that   a(u_h, v) = (b, v)   ∀ v ∈ V_h     (4.7)

With, say, the Poisson equation, we have

(4.8)

Substituting the FE approximation

(4.9)

and an arbitrary function

(4.10)

into Eq. (4.8), one gets

(4.11)

where

(4.12)

Thus, with

(4.13)

Eq. (4.7) reads

(4.14)

and since c is arbitrary, the expression in parentheses must be a zero vector (Ka − f = 0), so we have the familiar

(4.15)

By now we also know that the results Eqs. (4.11), (4.14), and (4.15) will be the same no matter which differential equation (of the ones we have encountered) is at hand; constitutive data (D) and the B–matrix will of course differ between the various problems, but our equations look the same.

a(u_h, v) = ∫_Ω (∇v)ᵀ D (∇u_h) dΩ,    (b, v) = ∫_Ω v b dΩ     (4.8)

u_h = [N₁(x)  N₂(x)  …  N_n(x)] [a₁  a₂  …  a_n]ᵀ = N a     (4.9)

v = [N₁(x)  N₂(x)  …  N_n(x)] [c₁  c₂  …  c_n]ᵀ = N c = cᵀNᵀ     (4.10)

a(u_h, v) = cᵀ ∫_Ω Bᵀ D B dΩ a,    (b, v) = cᵀ ∫_Ω Nᵀ b dΩ     (4.11)

B = ∇N = [∂N₁/∂x … ∂N_n/∂x;   ∂N₁/∂y … ∂N_n/∂y]     (4.12)

K = ∫_Ω Bᵀ D B dΩ,    f = ∫_Ω Nᵀ b dΩ     (4.13)

cᵀ( K a − f ) = 0     (4.14)

K a = f     (4.15)

4.2 Minimization Problems

4.2.1 The Continuous Minimization Problem

Let us now define a quadratic functional Π : V → R:

Definition: (4.16)

i.e. Π takes a function v ∈ V as its argument and returns a real number.

We will now show that the solution u of the variational problem Eq. (4.2) makes Π stationary (in fact minimizes Π over V). To this end, choose any function v ∈ V and construct

(4.17)

We then have

(4.18)

Using the fact that the inner products are linear in both arguments, we obtain

(4.19)

where, in the last step, we also used that a(·,·) is symmetric so that a(w, u) = a(u, w). Here we recognize that the first two terms in the last equality equal Π(u); also, since w ∈ V and u is the solution of the variational problem Eq. (4.2), we have that a(u, w) = (b, w), so the last two terms add up to zero. Thus we have

(4.20)

where the inequality follows from the fact that a(·,·) is positive definite: a(w, w) ≥ 0. It is observed that Π(v) = Π(u) only if a(w, w) = 0, i.e. when w = 0, and thus v = u according to Eq. (4.17).

We conclude that the (unique) solution u of the variational problem Eq. (4.2) minimizes Π; we are hence led to the minimization problem

( ) (4.21)

4.2.2 The Discrete Minimization Problem

At this stage we have three problem formulations that all have the same solution : the

boundary value problem (4.1), the variational problem (4.2), and the minimization problem

(4.21) (with defined in Eq. (4.16)). It is also recognized that the minimization problem offers

an alternative means to find an approximate solution to the BVP. Rather than to derive a FEformulation from the variational problem, we may solve the minimization problem in the FE

Π(v) = ½ a(v, v) − (b, v),    v ∈ V     (4.16)

w = v − u   (w ∈ V)     (4.17)

Π(v) = Π(u + w) = ½ a(u + w, u + w) − (b, u + w)     (4.18)

Π(v) = ½ ( a(u, u + w) + a(w, u + w) ) − (b, u + w)
     = ½ ( a(u, u) + a(u, w) + a(w, u) + a(w, w) ) − (b, u + w)
     = ½ a(u, u) − (b, u) + ½ a(w, w) + a(u, w) − (b, w)     (4.19)

Π(v) = Π(u) + ½ a(w, w) ≥ Π(u)     (4.20)

Find u ∈ V such that   Π(u) ≤ Π(v)   ∀ v ∈ V    ( Π(u) = Π(v) iff v = u )     (4.21)

space , i.e. find the function that minimizes . Thus we consider the discrete mini-

mization problem

(4.22)

Remark: Any function v ∈ V_h may be defined by a discrete number of variables cᵢ, i = 1, 2, …, n (cf. Eq. (4.10)), so Π may here be viewed as a function of n variables. In contrast to this, the argument v in Eq. (4.21) is a function that may be changed continuously.

Let us now substitute v according to Eq. (4.10) into Π(v)

(4.23)

From Eq. (4.11) we see that

(4.24)

so with Eq. (4.13), we have

(4.25)

where we now consider as a function of the variables in the vector .

To find the vector c that minimizes Π, we calculate the variation ∂Π; at the minimum or any stationary point, one has ∂Π = 0. Let d be an arbitrary vector and ε ≪ 1 a real number. Then

(4.26)

The first term in the right hand side is

(4.27)

The first two terms in the last expression amount to Π(c), while the third term may be omitted for sufficiently small ε. In addition we have that cᵀKd = dᵀKc, since K = Kᵀ. Hence,

(4.28)

Substituting Eq. (4.28) into Eq. (4.26), we find

(4.29)

so Π is stationary when c solves

(4.30)

Find u_h ∈ V_h such that   Π(u_h) ≤ Π(v)   ∀ v ∈ V_h     (4.22)

Π(v) = ½ a(Nc, cᵀNᵀ) − (b, cᵀNᵀ)     (4.23)

a(Nc, cᵀNᵀ) = cᵀ ∫_Ω Bᵀ D B dΩ c,    (b, cᵀNᵀ) = cᵀ ∫_Ω Nᵀ b dΩ     (4.24)

Π(c) = ½ cᵀKc − cᵀf     (4.25)

∂Π(c) = Π(c + εd) − Π(c)     (4.26)

Π(c + εd) = ½ (c + εd)ᵀ K (c + εd) − (c + εd)ᵀ f
          = ½ ( cᵀKc + ε cᵀKd + ε dᵀKc + ε² dᵀKd ) − cᵀf − ε dᵀf
          = ½ cᵀKc − cᵀf + (ε²/2) dᵀKd + (ε/2)( cᵀKd + dᵀKc ) − ε dᵀf     (4.27)

Π(c + εd) = Π(c) + ε dᵀ( Kc − f )     (4.28)

∂Π(c) = ε dᵀ( Kc − f )     (4.29)

K c = f     (4.30)

i.e. for v = u_h, where u_h is the FE approximation of the variational problem with test functions according to Galerkin (otherwise v is not in V_h).

To investigate the nature of the stationary point, we calculate the second variation

(4.31)

Since K is positive definite we have

(4.32)

everywhere and conclude that the stationary point is a minimum.
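The conclusion that Π(c) = ½cᵀKc − cᵀf is minimized exactly by the solution of Kc = f is easy to confirm numerically. The sketch below assumes NumPy; the matrix and load vector are illustrative, and the minimizer is perturbed in a few random directions.

```python
import numpy as np

rng = np.random.default_rng(0)

K = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])        # symmetric positive definite
f = np.array([1.0, 2.0, 3.0])

def Pi(c):
    return 0.5 * c @ K @ c - c @ f        # Eq. (4.25)

c_star = np.linalg.solve(K, f)            # stationary point, Eq. (4.30)

for _ in range(5):
    d = rng.standard_normal(3)
    eps = 1e-3
    # each difference equals (eps^2 / 2) d^T K d > 0, so c_star is a minimum (cf. Eqs. (4.31)-(4.32))
    print(Pi(c_star + eps * d) - Pi(c_star))
```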

4.2.3 Potential Energy

A functional Π as defined in Eq. (4.16) exists for all the boundary value problems we have encountered this far, and the exact solution to the BVP minimizes Π. In mechanics this has a special meaning, so let us take a look at the elasticity problem. First consider a one dimensional case, e.g. a bar of length L and axial stiffness EA. Here b is the intensity [force/length] of an external load, and v(x) is an axial displacement that satisfies homogeneous essential boundary conditions. We have

(4.33)

By virtue of kinematics, dv/dx = ε(v) is the strain caused by the displacement v; adopting Hooke's law, we see that Eε(v) is the stress σ(v). We may therefore write

(4.34)

The factor σ(v)ε(v)/2 has dimension [force/area] or [energy/volume] and is referred to as the strain energy density. In the first term of the right hand side, the strain energy density is summed over the volumes A dx, and the result is the elastic energy or strain energy. In the second term, b dx is the force resultant over a length dx, and v b dx may be understood as some energy (or work) done by the external load; in mechanics, the second integral, including the minus sign, is named the load potential. The sum of the two quantities is called the potential energy

Potential energy:   Π = elastic energy + load potential

In this context, the minimization problem Eq. (4.21) is known as the principle of minimum

potential energy. It states that at (stable and static) equilibrium , the potential energy

attains a minimum. Any disturbance from this state will inevitably increase .

While the description above concerned a one dimensional case, the concept immediatelyextends to elasticity problems in two and three dimensions. In a two dimensional case

defined on a domain with thickness we have


$\Pi(\mathbf{v}) = \tfrac{1}{2}\int_\Omega (\tilde{\nabla}\mathbf{v})^T\mathbf{D}\,\tilde{\nabla}\mathbf{v}\,t\,d\Omega - \int_\Omega \mathbf{v}^T\mathbf{b}\,t\,d\Omega$   (4.35)

where $\mathbf{v} = [v_x\;\; v_y]^T$ now is a vector valued function with the displacement components and $\mathbf{b} = [b_x\;\; b_y]^T$ is an external volume load [force/volume]. The kinematic relation is

$\boldsymbol{\varepsilon}(\mathbf{v}) = \tilde{\nabla}\mathbf{v} = [\varepsilon_x(\mathbf{v})\;\; \varepsilon_y(\mathbf{v})\;\; \gamma_{xy}(\mathbf{v})]^T$   (4.36)

while Hooke's law gives the corresponding stress vector

$\boldsymbol{\sigma}(\mathbf{v}) = \mathbf{D}\boldsymbol{\varepsilon}(\mathbf{v}) = [\sigma_{xx}(\mathbf{v})\;\; \sigma_{yy}(\mathbf{v})\;\; \sigma_{xy}(\mathbf{v})]^T$   (4.37)

Hence Eq. (4.35) may be written

$\Pi(\mathbf{v}) = \tfrac{1}{2}\int_\Omega \boldsymbol{\varepsilon}^T(\mathbf{v})\,\boldsymbol{\sigma}(\mathbf{v})\,t\,d\Omega - \int_\Omega \mathbf{v}^T\mathbf{b}\,t\,d\Omega$   (4.38)

in analogy with Eq. (4.34); again, the first term in the right hand side is the strain energy, while the second term (including the minus sign) is the load potential.

We conclude this section with an example where the discrete version Eq. (4.25) is utilized. The stiffness is provided by a linear spring $k$ while the external load is the gravity on a particle with mass $m$. There is a single degree of freedom, $c$, that defines the vertical position of the particle. If we select the origin such that the spring is un-deformed at $c = 0$, the potential energy of the system is

$\Pi(c) = \tfrac{1}{2}kc^2 - mgc$   (4.39)

(The reader probably recognizes the first term as the potential energy of an elastic spring that has been elongated or compressed a length $c$.) To find the minimum we calculate the derivative

$\dfrac{d\Pi}{dc} = kc - mg$   (4.40)

and find the equilibrium position

$\dfrac{d\Pi}{dc} = 0 \;\Rightarrow\; c = \dfrac{mg}{k}$   (4.41)

Since

$\dfrac{d^2\Pi}{dc^2} = k > 0$   (4.42)


$\Pi(mg/k) = -\dfrac{(mg)^2}{2k}$ is a minimum.

Remark: From Eq. (4.40) we see that $-d\Pi/dc$ equals the sum of the forces on the particle

$-\dfrac{d\Pi}{dc} = mg - kc$   (4.43)

This is no coincidence, but holds in general; from Eq. (4.29) we find

$-\nabla\Pi = -\dfrac{\partial\Pi}{\partial(\varepsilon\mathbf{d})} = \mathbf{f} - \mathbf{K}\mathbf{c}$   (4.44)

and the principle of minimum potential energy states that $\nabla\Pi = \mathbf{0}$ at equilibrium.
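The spring example can be checked with a one-line minimization; the sketch below assumes arbitrary values of k, m and g and compares the numerical minimizer of Eq. (4.39) with the closed-form results (4.41) and the minimum value above.

import numpy as np
from scipy.optimize import minimize_scalar

k, m, g = 100.0, 2.0, 9.81                  # assumed spring stiffness [N/m], mass [kg], gravity [m/s^2]

Pi = lambda c: 0.5 * k * c**2 - m * g * c   # potential energy, Eq. (4.39)

c_eq = minimize_scalar(Pi).x
print(c_eq, m * g / k)                      # both give the equilibrium position mg/k
print(Pi(c_eq), -(m * g)**2 / (2 * k))      # minimum value -(mg)^2/(2k)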

4.3 Stiff Approximations

We have found that the exact solution $u$ of the BVP Eq. (4.1) minimises $\Pi$ (Eq. (4.16)) in $V$, and that the Galerkin FE approximation $u_h$ minimizes the same functional in $V_h$. One may conclude that the potential energy in the FE solution cannot be lower than in the exact solution, i.e.

$\Pi(u) \le \Pi(u_h)$   (4.45)

In particular, if $u$ may be expressed as a linear combination of the basis functions, the Galerkin method will give the exact solution

$u_h = u$ if $u \in V_h$   (4.46)

For the approximation we have

$\Pi(u_h) = \Pi(\mathbf{a}) = \tfrac{1}{2}\mathbf{a}^T\mathbf{K}\mathbf{a} - \mathbf{a}^T\mathbf{f}$   (4.47)

or, with $\mathbf{K}\mathbf{a} = \mathbf{f}$,

$\Pi(u_h) = -\tfrac{1}{2}\mathbf{a}^T\mathbf{f} = -\tfrac{1}{2}\mathbf{a}^T\mathbf{K}\mathbf{a}$   (4.48)

Substituting into Eq. (4.45) gives

$\Pi(u) \le -\tfrac{1}{2}\mathbf{a}^T\mathbf{f}$   (4.49)

One realizes that for a given load vector $\mathbf{f}$, the calculated node variables $\mathbf{a}$ become too small in an average sense. In terms of an elasticity problem, the displacements are smaller than the exact solution, which is why the FE approximation is said to be too stiff.
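A small numerical illustration of the inequality (4.45) in the discrete setting (the matrix and load below are made-up data, not from any particular mesh): restricting the minimization of $\Pi$ to a subspace of the node variables, which is in effect what a coarser FE space does, cannot yield a lower potential energy than the unrestricted minimum.

import numpy as np

K = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
f = np.array([1.0, 1.0, 1.0])

Pi = lambda c: 0.5 * c @ K @ c - c @ f

a_full = np.linalg.solve(K, f)               # minimizer over all three variables
# "coarse" space: the last variable is forced to zero (fewer degrees of freedom)
a_sub = np.zeros(3)
a_sub[:2] = np.linalg.solve(K[:2, :2], f[:2])

print(Pi(a_full) <= Pi(a_sub))               # True: the restricted solution has higher potential energy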


5. Error Estimation and Adaptivity

5.1 Error Sources in Engineering Computations

We have described the finite element method as a method to approximate solutions of boundary value problems. Indeed, there are a few special cases in which one obtains the exact solution, but in general $u_h$ will to some extent deviate from $u$. It is consequently of some importance to be familiar with conditions that influence the accuracy of the approximation. From a practical point of view, however, the error in the approximation is not the sole concern; in an engineering context there is some physical problem at hand and the boundary value problem (or its weak form) is considered a model of it. Accordingly, the mathematical model is an approximation of a physical actuality, so the FE-solutions are in a sense approximations of idealizations.

In a text on the finite element method, the interest is naturally on the accuracy of the approximation vis-à-vis the exact solution of the mathematical model, but it should be clear that there are several possible origins of errors in engineering practice. For this reason, we will briefly dwell upon error sources in this section. The view taken on the topic in this presentation is summarized by the illustration below.

Modelling errors. Consider the elasticity problem depicted to the right. We may for instance want to know the deflection and rotation of the free end, as well as the stress state in the structure. To this end one could solve the full three-dimensional elasticity equations, but there are several other alternatives. In case one assumes a plane stress or plane strain state, it suffices to consider a cross-section as indicated by the red rectangle. Other alternatives are to resort to some plate or beam theory, such as e.g. Kirchhoff or Euler–Bernoulli. The various mathematical models will yield different results, but any two models that both are realistic will produce results that resemble each other.

The point we try to make is that when we select a mathematical model, e.g. some particular differential equation, to analyse a situation, we implicitly make conjectures about the properties of the physical problem, since such assumptions are made when the model is derived. Consider for instance the Navier (or Navier–Cauchy) equations of elasticity; here the equilibrium is established in the original, un-displaced, configuration, the strain–displacement relations are linearized, and Hooke's law is invoked. Hence, it is assumed that the displacements, rotations and deformations are 'small', and that the stresses are proportional to the deformations. If the equations are to be reduced to a two dimensional problem, some

[Illustration: physical problem, mathematical model (continuous problem), numerical model (discrete problem), finite precision computations, and observations (experiments, measurements, etc.); modelling errors, discretization errors, numerical errors and experimental errors arise between the respective stages.]

further assumption has to be made, e.g. that stresses in the eliminated direction are zero (or 'small').

In this context, one may also note that several other elasticity problems (plate bending, shell theory, etc.) may be derived from the Navier equations. Hence, these problems are subject not only to the assumptions made in the derivation of the elasticity equations in three dimensions, but to further presumptions (usually kinematic). Euler–Bernoulli beam theory, for instance, assumes that planar cross-sections remain planar during deformation of the beam, and that shear deformations are negligible.

Discrepancies between the assumptions made in order to derive a mathematical model and the actual properties of the physical problem are examples of modelling errors. Further errors of this type are introduced by boundary conditions. For instance, a cantilever beam may be modelled as clamped at one end, while in practice any support has some flexibility and will, hence, permit some displacement and rotation. A load may be modelled as a concentrated force (as in the illustration above), but will in practice have some distribution.

Uncertainty in data is also an example of modelling errors. It may be that exact values of material properties (e.g. Young's modulus, yield strength, etc.) are not known, that the exact magnitudes of loads are unknown, and that the geometry is only known to some degree of accuracy.

Discretization errors. In the mathematical model we have one or several unknown continuous functions that constitute the primary unknowns in the problem. The functions may for instance represent a displacement or temperature field, from which some more interesting secondary unknowns, such as stresses or heat energy flows, may be calculated. Using finite elements to approximate the solution $u(x)$, we replace the unknown(s) by linear combination(s) of selected basis functions $N_i(x)$:

$u(x) \approx u_h(x) = \sum_i a_i N_i(x)$   (5.1)

The finite element approximation therefore includes a discrete number of node variables $a_i$ to be calculated. For this reason, the mathematical model is sometimes referred to as the continuous problem, whereas the FE-problem is a discrete one. In practice, the basis functions are defined when we divide the domain $\Omega$ of the problem into elements of some chosen type. This process is known as a discretization of the problem and will in general introduce so called discretization errors.

Hence, the difference between the solutions of the continuous and the discrete problems, respectively, viz.

$e(x) = u(x) - u_h(x)$   (5.2)

is a discretization error; notice that, defined this way, the error is a function of the spatial coordinates. It should likewise be observed that $u$ and $u_h$ may be defined on slightly different domains: if the boundary value problem is posed on a domain $\Omega$ with curved boundaries, it could be that $\Omega$ cannot be exactly covered by the elements. A circular cut-out may for instance be approximated by a polygonal curve with piece-wise linear or quadratic edges. This may also be regarded as a discretization error.


Numerical errors. Discretization of the continuous problem yields an equation system $\mathbf{K}\mathbf{a} = \mathbf{f}$ which defines the node variables $a_i$ (collected in the solution vector $\mathbf{a}$); solving for $\mathbf{a}$, we obtain our FE-approximation according to Eq. (5.1). In any problem of practical interest, the number of node variables is so large that it is not workable to establish and solve the system of equations by hand, but we use a computer to calculate the stiffness matrix $\mathbf{K}$ and the load vector $\mathbf{f}$ as well as to solve for the node variables. Since computers typically use 8 bytes to store a number, it will be represented by 15 or 16 decimal digits. Hence, irrational and most rational numbers cannot be exactly represented. Furthermore, if two numbers, each with (say) 16 significant decimals, are multiplied together the result is a number with 32 decimals, but the computer delivers a 16 digit number. A rounding-off error affects only the last digit in the resulting number, but when further calculations are done, the error propagates and subsequent results become less accurate. It is noted that many millions of operations (multiplications etc.) may be needed to establish and solve the FE equation system; the following post-processing step (calculation of e.g. stresses) requires even more operations.

Another type of numerical error, viz. truncation error, arises when we calculate element matrices and vectors by numerical integration. Using $n$ Gauss integration points, we actually approximate the integrand by a polynomial of degree $2n - 1$; as has been seen, the polynomial is a truncated Taylor series representation of the integrand.

Due to truncation and rounding errors, both the stiffness matrix and load vector will have some errors, so we end up with an equation system

$(\mathbf{K} + \delta\mathbf{K})(\mathbf{a} + \delta\mathbf{a}) = \mathbf{f} + \delta\mathbf{f}$   (5.3)

rather than $\mathbf{K}\mathbf{a} = \mathbf{f}$. Of course, solving for the node variable values introduces further rounding errors.

A somewhat more detailed account of numerical errors may be found in e.g. Wiberg (ed.) (1975), but is omitted here. Note, however, that some errors may be reduced at the expense of increasing others. For instance, both truncation and discretization errors are reduced if we increase the number of elements in the mesh (invoke more node variables), but this yields a larger system of equations and thus requires more computational operations, which will result in an amplified rounding error.
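How strongly rounding errors are amplified when solving $\mathbf{K}\mathbf{a} = \mathbf{f}$ depends on the conditioning of the matrix. The sketch below uses the Hilbert matrix, a standard and deliberately extreme example of an ill-conditioned matrix (it is not an FE stiffness matrix), to show how 64-bit rounding errors can grow far beyond machine precision in the computed solution.

import numpy as np
from scipy.linalg import hilbert

n = 12
K = hilbert(n)                       # notoriously ill-conditioned symmetric matrix
a_exact = np.ones(n)
f = K @ a_exact                      # load vector consistent with a known solution

a = np.linalg.solve(K, f)            # solved in 64-bit floating point arithmetic
print(np.linalg.cond(K))             # condition number of order 1e16
print(np.max(np.abs(a - a_exact)))   # rounding errors amplified far beyond 1e-16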

Experimental errors. Our knowledge of the physical world stems from observations and measurements, but since measurements are error prone our data will be imprecise. Disregarding clumsiness and blunders, experimental errors are commonly divided into two groups, viz. systematic and random errors, respectively. In order to comprehend the principal difference, it may be fruitful to think about accuracy and precision. Measurements are accurate if the recorded values are close to some 'true' value, and the precision is high if repeated measurements yield about the same results. Note that high precision does not imply accuracy (and vice versa).


Systematic errors originate from the manner in which the observations are done and result in consistently too high (or too low) readings. Thus, systematic errors are those that affect accuracy. A standard example is badly calibrated instruments. Consider for instance a thermometer with an offset scale; it may show that water starts boiling at $102\,°\mathrm{C}$ and freezes at $2\,°\mathrm{C}$, no matter how many times the experiment is repeated.

Another source of systematic errors is the measurement itself, since the instrumentation may affect what is measured. We could measure, say, the temperature of a liquid by sticking a thermometer in it, but if our instrument doesn't have the same temperature as the liquid it will change the temperature. Hence, we will not measure the temperature of the liquid, but rather the temperature of the fluid with a thermometer added. As an additional example we could consider vibration measurement by attaching accelerometers to a structure (picture). In this case we will effectively add mass to the structure and therefore, to some extent, change its dynamic properties. Thus we will measure the vibrations of the system 'structure + accelerometers'.

Measurements subject to random errors differ from each other due to random, unpredictable variations in the measurement process. Thus, this error type affects precision and stems from limitations imposed by the precision of the utilized equipment. As examples, we mention unpredictable variations in temperature, mechanical vibrations, or voltage, all of which may affect the measuring equipment. Another source of random errors is uncertainty in reading and interpolating the scale of a measuring device.

No matter the source of the uncertainty, to be labelled 'random' an uncertainty must have the property that the fluctuations from some true value are equally likely to be positive or negative. It follows that this type of error may be reduced by statistical methods. For instance, one may simply take the average of repeated measurements. Note, though, that such a procedure cannot be used to detect or reduce systematic errors. The only manners in which we may detect the presence of systematic errors are to carefully check the calibration of our measurement devices, to compare our results with other experiments, and to change the way the measurements are done (e.g. using different instrumentation).

5.2 Best Approximation

From the previous section we conclude that there are several sources of errors in engineering analysis, but our subsequent concern will be the discretization error as expressed by Eq. (5.2), i.e. the difference between the solution $u$ of the mathematical model and the FE-approximation $u_h \approx u$. Before dwelling into this, we describe the concept of Best Approximation in a more general context.

A fundamental problem in Approximation theory is to find the best approximation to a given function. We define this property here since it shall later be shown that the Galerkin FE-approximation $u_h$ is a best approximation of $u$.

Let $V$ be a linear space equipped with a norm $\|\cdot\|$ and let $W \subset V$ be a finite dimensional subspace; the requirement for $W$ to be finite dimensional ensures that it has a basis. For a given $v \in V$, a $w^* \in W$ is the Best Approximation of $v$ out of the subspace $W$ if

$\|v - w^*\| \le \|v - w\| \quad \forall w \in W$   (5.4)


Geometrically: given a $v \in V$, the best approximation in a subspace $W$ is the member $w^*$ that has the shortest distance to $v$, as measured by the selected norm, of all members of the subspace.

Note that different norms will in general yield different best approximations. Quite often the definition of the space $V$ suggests an appropriate norm. For instance, if $V$ is the space of functions that are square-integrable over an interval, say $[0, 1]$, i.e. functions $v$ that satisfy $\int_0^1 v^2\,dx < \infty$, it comes natural to use the $L_2$-norm

$\|v\|_2 = \left(\int_0^1 v^2\,dx\right)^{1/2}$   (5.5)

and the best approximation satisfies $\|v - w^*\|_2 \le \|v - w\|_2$, which is known as a least squares approximation.

Example: Let $v = \ln(x)$ and let $w = mx + c$ be a linear polynomial. We shall find the coefficients $m$ and $c$ such that $w$ becomes the best approximation, with respect to the $L_2$-norm, of $v$ on the interval $[1, 2]$. If we define

$F(m, c) = \|v - w\|_2^2 = \int_1^2 [\ln(x) - (mx + c)]^2\,dx$   (5.6)

we may solve the problem by calculating $m$ and $c$ such that $\nabla F = \mathbf{0}$. Thus

$\int_1^2 2[\ln(x) - (mx + c)]\,x\,dx = 0, \qquad \int_1^2 2[\ln(x) - (mx + c)]\,dx = 0$   (5.7)

i.e. $4\ln 2 - \tfrac{3}{2} - \tfrac{14m}{3} - 3c = 0$ and $4\ln 2 - 2 - 3m - 2c = 0$, and one finds

$m = 9 - 12\ln 2 \approx 0.6822, \qquad c = 20\ln 2 - \tfrac{29}{2} \approx -0.6371$   (5.8)

Let us also try to find the best approximation in the infinity norm. Define the error by

$e(x) = \ln(x) - (mx + c)$   (5.9)

We are then to find $m$ and $c$ so that $\max|e(x)|$ is minimized. Let $E = e(\theta)$ denote the largest error, attained at an interior point $\theta$; at the interval ends we then have $e(1) = e(2) = -E$, as indi-


cated by the principal sketch to the right. In addition we have that $de/dx = 0$ at $x = \theta$, since $e$ has a maximum at $x = \theta$. Thus, we have the four equations

$E = \ln(\theta) - m\theta - c, \qquad -E = -m - c, \qquad -E = \ln 2 - 2m - c, \qquad \tfrac{1}{\theta} - m = 0$   (5.10)

with the four unknowns $E$, $\theta$, $m$, and $c$. Solving for the latter two, we get

$m = \ln 2 \approx 0.6931, \qquad c = -\dfrac{\ln(\ln 4) + 1}{2} \approx -0.6633$   (5.11)

The two best approximations, Eqs. (5.8) and (5.11), are depicted in the figure below.

[Figure: the best approximations $w^* = m^*x + c^*$ of $v = \ln(x)$ on $[1, 2]$, in the $L_2$-norm and in the $\infty$-norm, plotted together with $\ln(x)$.]

Problem: Find the first order polynomials $w = mx + c$ that best approximate $v = e^x$ on the interval $[0, 1]$ with respect to the $L_2$-norm and the infinity norm, respectively. Answer:

$m = 18 - 6e$, $c = 4e - 10$ ($L_2$); \qquad $m = e - 1$, $c = \dfrac{e - m\ln m}{2}$ (infinity norm)   (5.12)
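The coefficients in Eqs. (5.8) and (5.11) can be checked numerically. The sketch below is a brute-force verification (not part of the original example): the integrals are approximated on a fine grid, and generic optimizers are used for the least squares and minimax fits, respectively.

import numpy as np
from scipy.optimize import minimize

x = np.linspace(1.0, 2.0, 2001)
dx = x[1] - x[0]
v = np.log(x)

# L2 (least squares) fit: minimize the discretized integral of (ln x - (m x + c))^2
err_L2 = lambda p: np.sum((v - (p[0] * x + p[1]))**2) * dx
m2, c2 = minimize(err_L2, x0=[1.0, 0.0]).x
print(m2, 9 - 12 * np.log(2))                 # Eq. (5.8): m = 9 - 12 ln 2
print(c2, 20 * np.log(2) - 14.5)              #            c = 20 ln 2 - 29/2

# infinity-norm (minimax) fit: minimize max |ln x - (m x + c)|
err_inf = lambda p: np.max(np.abs(v - (p[0] * x + p[1])))
mi, ci = minimize(err_inf, x0=[1.0, 0.0], method="Nelder-Mead").x
print(mi, np.log(2))                          # Eq. (5.11): m = ln 2 (approximately recovered)
print(ci, -(1 + np.log(np.log(4))) / 2)       #             c = -(1 + ln(ln 4))/2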

5.3 Galerkin Orthogonality and $u_h$ as a Best Approximation

Our first task is to show an orthogonality property of the (discretization) error in conforming FE-approximations. A conforming FE-method is one in which the chosen basis functions satisfy the Dirichlet conditions and are sufficiently regular for the integrals in the weak form to exist; hence, a FE-method is conforming if we have $N_i \in V$ for all basis functions, where $V$ is the space of functions in which the weak problem is posed. In particular, in such cases we will have $V_h \subset V$, where $V_h$ is the space of functions that can be expressed as linear combinations of $N_i$, $i = 1, 2, \dots$. Hence, a function $v(x)$ will be a member of $V_h$ if $v = \sum_i v_iN_i$ for some choice of coefficients $v_i$.


Now consider a boundary value problem (BVP) on a domain $\Omega$ with boundary $\Gamma = \Gamma_D \cup \Gamma_N$; here $\Gamma_D$ and $\Gamma_N$ denote the parts of the boundary where we have Dirichlet and Neumann conditions, respectively:

$\mathcal{D}(u) = f$ in $\Omega$, $\quad l(u) = h$ on $\Gamma_N$, $\quad u = 0$ on $\Gamma_D$   (5.13)

where $\mathcal{D}(\cdot)$ and $l(\cdot)$ are differential operators, $f$ and $h$ are given functions (that typically represent some loading), and $u$ is the unknown. The BVP (5.13) may be any of the ones we have previously encountered, e.g. some elasticity or heat flow problem. The weak form of Eq. (5.13) is

Find $u \in V$ such that $a(u, v) = (v, f) + \langle v, h \rangle \quad \forall v \in V$   (5.14)

(see Ch. 3). Let us now introduce a conforming FE-approximation $u_h = \sum_i N_ia_i \approx u$ and let $V_h$ denote the space spanned by the basis functions $N_i$. The Galerkin FE-formulation then reads

Find $u_h \in V_h \subset V$ such that $a(u_h, v) = (v, f) + \langle v, h \rangle \quad \forall v \in V_h$   (5.15)

Next we subtract the FE-formulation Eq. (5.15) from the variational problem Eq. (5.14) to get

$a(u, v) - a(u_h, v) = 0 \quad \forall v \in V_h$   (5.16)

Note that since $V_h \subset V$ the result (Eq. (5.16)) is valid for functions $v$ in $V_h$ only; in fact, if the FE-space $V_h$ is not a subspace of $V$ (as is the case in non-conforming FE-methods used for e.g. plate and shell analysis), it is not possible to combine Eqs. (5.14) and (5.15) to obtain (5.16) in this manner. Using the fact that $a(\cdot,\cdot)$ is linear in its first argument, we can write Eq. (5.16) as

$a(u - u_h, v) = 0 \quad \forall v \in V_h$   (5.17)

or, since $u - u_h = e$ is our discretization error,

$a(e, v) = 0 \quad \forall v \in V_h$   (5.18)


Hence, the error is orthogonal (with respect to the inner product $a(\cdot,\cdot)$) to all functions $v \in V_h$. This property is called Galerkin orthogonality; the discretization error $e = u - u_h$ is said to be $a$-orthogonal, or energy orthogonal, to the finite element space $V_h$. A geometric interpretation is mapped out to the right: here we let a plane represent the space $V$, while a line in the plane represents the subspace $V_h$; the functions $u$, $u_h$ and $e = u - u_h$ are visualized by vectors.

[Figure: Galerkin orthogonality, with $u = u_h + e$ and $u_h$ the orthogonal projection of $u$ onto $V_h$.]

It appears that we may think of $u_h$ as the orthogonal projection of $u$ onto $V_h$, so that any change of the approximation within $V_h$ (i.e. along the grey line) will increase the error. Indeed, using the Galerkin orthogonality, we can show that $u_h$ is a best approximation in the sense described in the previous section.

It has been seen (Sec. 3.3.3) that the energy norm $\|v\|_a = \sqrt{a(v, v)}$ is natural to use in the boundary value problems that we have run across during the course; we will prove that $\|u - u_h\|_a \le \|u - v\|_a$, $\forall v \in V_h$, i.e. that of all functions in $V_h$, $u_h$ is the best approximation of the exact solution $u$ to the boundary value problem. To this end, choose an arbitrary function $v \in V_h$, that is a function such that $v(x) = \sum_i N_i(x)v_i$ for arbitrary coefficients $v_i$. Now construct $w = u_h - v$. Note that $w = \sum_i N_i(x)a_i - \sum_i N_i(x)v_i = \sum_i N_i(x)(a_i - v_i)$, so that $w \in V_h$. Using the fact that $a(\cdot,\cdot)$ is linear in both arguments and symmetric, we have

$\|u - v\|_a^2 = \|u - u_h + w\|_a^2 = a(u - u_h + w,\ u - u_h + w) = a(u - u_h,\ u - u_h) + 2a(u - u_h,\ w) + a(w, w) = \|u - u_h\|_a^2 + 2a(e, w) + \|w\|_a^2$

where we also used that $u - u_h = e$. Now, since $w \in V_h$ we have that $a(e, w) = 0$ (Galerkin orthogonality); furthermore, we have $\|w\|_a^2 \ge 0$, so

$\|u - v\|_a^2 = \|u - u_h\|_a^2 + 2a(e, w) + \|w\|_a^2 \ge \|u - u_h\|_a^2$

We conclude that

$\|u - u_h\|_a \le \|u - v\|_a \quad \forall v \in V_h$   (5.19)

Note that equality is obtained only in case $\|w\|_a = 0$, i.e. when $v \equiv u_h$.


Example: We consider a bar that is fixed at its left end and loaded by a distributed load with constant intensity (force/length) $f$. Its axial stiffness $EA$ is constant and the length is $L$. The solution of the BVP

$-\dfrac{d}{dx}\left(EA\dfrac{du}{dx}\right) = f$, $0 < x < L$; $\quad u(0) = 0$, $\quad \left.\dfrac{du}{dx}\right|_{x=L} = 0$   (5.20)

yields the axial displacement $u$ (and thereafter the stress $\sigma = E\,du/dx$). One easily verifies that

$u(x) = \dfrac{fL^2}{2EA}\left(2\dfrac{x}{L} - \left(\dfrac{x}{L}\right)^2\right)$   (5.21)

solves Eq. (5.20).

The weak problem is given by

Find $u \in V$ such that $a(u, v) = (v, f) \quad \forall v \in V$   (5.22)

where $V = \{v : v(0) = 0,\ \int_0^L (v^2 + (dv/dx)^2)\,dx < \infty\}$ and

$a(u, v) = \int_0^L \dfrac{dv}{dx}\,EA\,\dfrac{du}{dx}\,dx, \qquad (v, f) = \int_0^L v f\,dx$

Next consider a FE-approximation with a single linear element: $u \approx u_h = a_1N_1 + a_2N_2$, where $N_2(x) = x/L$. Formally we have

Find $u_h \in V_h \subset V$ such that $a(u_h, v) = (v, f) \quad \forall v \in V_h$   (5.23)

which yields

$\dfrac{EA}{L}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \dfrac{fL}{2}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$   (5.24)

The essential boundary condition enforces $a_1 = 0$, and since $N_1(0) \ne 0$, i.e. $N_1$ is not in $V$, the first equation in Eq. (5.24) is not valid. Hence, we have $a_2 = \dfrac{fL^2}{2EA}$ and thus $u_h = a_1N_1 + a_2N_2 = \dfrac{fL^2}{2EA}\cdot\dfrac{x}{L}$; we now have the discretization error

$e = u - u_h = \dfrac{fL^2}{2EA}\left(\dfrac{x}{L} - \left(\dfrac{x}{L}\right)^2\right)$   (5.25)

Since $N_1$ is not in $V$, the FE-space $V_h$ will in effect consist of functions that can be written as $v_2N_2(x)$; choose an arbitrary such function and evaluate


$a(e, v) = \int_0^L v_2\dfrac{dN_2}{dx}\,EA\,\dfrac{de}{dx}\,dx = v_2\dfrac{EA}{L}\int_0^L \dfrac{fL}{2EA}\left(1 - 2\dfrac{x}{L}\right)dx = v_2\dfrac{fL}{2}\left[\dfrac{x}{L} - \left(\dfrac{x}{L}\right)^2\right]_0^L = 0$   (5.26)

Hence, the discretization error is energy orthogonal to $V_h$.

5.4 Error in Terms of the Energy Norm

It has been shown, see Ch. 4, that the bilinear form $a(\cdot,\cdot)$ may be understood as an energy measure. For instance, in an elasticity problem $a(u, u)$ is twice the elastic energy (or strain energy) in the exact solution of the BVP, while $a(u_h, u_h) = \mathbf{a}^T\mathbf{K}\mathbf{a}$ is twice the elastic energy in the FE-approximation. Note that $u$ and $u_h$ are (possibly vector valued) functions of the spatial coordinates, but the energy measure is merely a real number. Since the error in energy $a(u, u) - a(u_h, u_h)$ is a single number, it furnishes a convenient means to evaluate the accuracy of the FE-approximation. We shall now show how this entity is related to the discretization error $e = u - u_h$. First we utilize that $a(\cdot,\cdot)$ is symmetric and linear in its arguments to obtain

$a(e, e) = a(u - u_h,\ u - u_h) = a(u, u) + a(u_h, u_h) - 2a(u, u_h)$   (5.27)

Now the last term is recast as

$a(u, u_h) = a(u_h + e,\ u_h) = a(u_h, u_h) + a(e, u_h)$   (5.28)

and it is noted that since $u_h \in V_h$, we have $a(e, u_h) = 0$ due to Galerkin orthogonality (see Eq. (5.18)). Substituting Eq. (5.28) into (5.27), we obtain

$a(e, e) = a(u, u) + a(u_h, u_h) - 2a(u_h, u_h) = a(u, u) - a(u_h, u_h)$   (5.29)

Consequently, the energy in the error, $a(e, e)$, equals the error in energy ($a(u, u) - a(u_h, u_h)$). Taking the square root of the left and right hand sides of Eq. (5.29), we find the energy norm of the discretization error

$\|e\|_a = \sqrt{a(u, u) - a(u_h, u_h)}$   (5.30)

and we recall that this may be understood as a measure of the 'size' of the function $e$.

Example: Let us return to the bar problem in the previous section. With $u(x)$ according to Eq. (5.21) we have

$a(u, u) = \int_0^L \dfrac{du}{dx}\,EA\,\dfrac{du}{dx}\,dx = EA\int_0^L\left(\dfrac{fL}{EA}\left(1 - \dfrac{x}{L}\right)\right)^2 dx = \dfrac{f^2L^3}{3EA}$   (5.31)

Likewise, with a single linear element we obtained $u_h(x) = \dfrac{fL^2}{2EA}\cdot\dfrac{x}{L}$, so

$a(u_h, u_h) = \int_0^L \dfrac{du_h}{dx}\,EA\,\dfrac{du_h}{dx}\,dx = EA\int_0^L\left(\dfrac{fL}{2EA}\right)^2 dx = \dfrac{f^2L^3}{4EA}$   (5.32)


whereafter Eq. (5.30) yields

$\|e\|_a = fL\sqrt{\dfrac{L}{EA}}\sqrt{\dfrac{1}{3} - \dfrac{1}{4}} = fL\sqrt{\dfrac{L}{12EA}} \approx 0.289\,fL\sqrt{\dfrac{L}{EA}}$   (5.33)

Remark: Rather than using the procedure in Eq. (5.32) to evaluate $a(u_h, u_h)$, one may use the load vector $\mathbf{f}$ and node variables $\mathbf{a}$ to calculate the inner product:

$a(u_h, u_h) = \int_0^L \dfrac{du_h}{dx}\,EA\,\dfrac{du_h}{dx}\,dx = \mathbf{a}^T\!\int_0^L \dfrac{d\mathbf{N}^T}{dx}EA\dfrac{d\mathbf{N}}{dx}\,dx\;\mathbf{a} = \mathbf{a}^T\mathbf{K}\mathbf{a} = \mathbf{a}^T\mathbf{f} = \dfrac{fL^2}{2EA}\begin{bmatrix} 0 \\ 1 \end{bmatrix}^T\dfrac{fL}{2}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \dfrac{f^2L^3}{4EA}$   (5.34)

Let us also evaluate the 'size' of the error relative to that of the exact solution $u$; using Eqs. (5.31) and (5.33), we get

$\dfrac{\|e\|_a}{\|u\|_a} = \dfrac{\|e\|_a}{\sqrt{a(u, u)}} = \dfrac{\sqrt{1/12}}{\sqrt{1/3}} = 0.50$   (5.35)

Thus, our estimate says that the error in the approximation is 50%. Notice that the energy norm is evaluated by using the first derivatives of the functions, cf. Eqs. (5.30)–(5.32); it is the square root of the 'sum' of squared derivatives. We may therefore perceive $\|e\|_a$ as a measure of the error in the first derivative. From an engineering point of view, this is good news, since derivatives (e.g. stresses, strains, or heat energy flows) usually are of greater interest than the primary unknown (e.g. displacements or temperatures). Let us briefly investigate how the error is reduced if we increase the number of elements. Using two linear elements, each of size $h = L/2$, to solve the bar problem, we obtain the equation system ($\mathbf{K}\mathbf{a} = \mathbf{f}$)

$\dfrac{2EA}{L}\begin{bmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \dfrac{fL}{4}\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$   (5.36)

and find the solution

$\mathbf{a} = \dfrac{fL^2}{8EA}\begin{bmatrix} 0 & 3 & 4 \end{bmatrix}^T$   (5.37)

and subsequently

$a(u_h, u_h) = \mathbf{a}^T\mathbf{f} = \dfrac{fL^2}{8EA}\begin{bmatrix} 0 \\ 3 \\ 4 \end{bmatrix}^T\dfrac{fL}{4}\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} = \dfrac{5f^2L^3}{16EA}$   (5.38)

The energy norm of the error, Eq. (5.30), becomes

$\|e\|_a = fL\sqrt{\dfrac{L}{EA}}\sqrt{\dfrac{1}{3} - \dfrac{5}{16}} = \dfrac{fL}{2}\sqrt{\dfrac{L}{12EA}} \approx 0.144\,fL\sqrt{\dfrac{L}{EA}}$   (5.39)

so that


$\dfrac{\|e\|_a}{\|u\|_a} = \dfrac{\|e\|_a}{\sqrt{a(u, u)}} = \dfrac{\sqrt{1/48}}{\sqrt{1/3}} = 0.25$   (5.40)

Whence, it appears that halving the element size reduces the error in the first derivative by a factor 2. In a subsequent chapter on convergence we shall see that this is no fluke, but will always be the case when we use linear elements to solve second order PDEs (provided that the exact solution is sufficiently smooth).
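The relative errors 0.50 and 0.25 above are easy to reproduce. The sketch below assembles the clamped-free bar with n equal linear elements (unit data f = EA = L = 1 are assumed for convenience), evaluates a(u_h, u_h) via a^T f as in Eq. (5.34), and forms the relative energy-norm error from Eqs. (5.30), (5.31) and (5.35).

import numpy as np

def bar_energy_error(n, EA=1.0, L=1.0, f=1.0):
    """Relative energy-norm error for the clamped-free bar with n linear elements."""
    h = L / n
    K = np.zeros((n + 1, n + 1))
    F = np.zeros(n + 1)
    for e in range(n):                              # assemble element contributions
        K[e:e+2, e:e+2] += EA / h * np.array([[1, -1], [-1, 1]])
        F[e:e+2] += f * h / 2 * np.array([1, 1])
    a = np.zeros(n + 1)
    a[1:] = np.linalg.solve(K[1:, 1:], F[1:])       # enforce u(0) = 0
    a_uu = f**2 * L**3 / (3 * EA)                   # a(u,u), Eq. (5.31)
    a_uhuh = a @ F                                  # a(u_h,u_h) = a^T f, Eq. (5.34)
    return np.sqrt((a_uu - a_uhuh) / a_uu)

print(bar_energy_error(1))    # 0.50, cf. Eq. (5.35)
print(bar_energy_error(2))    # 0.25, cf. Eq. (5.40)
print(bar_energy_error(4))    # 0.125: halving h halves the error in the first derivative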

5.5 An Equation for the Error

Consider a case where we have obtained a FE-approximation $u_h$ of the solution $u$ of a BVP such as the one given by Eq. (5.13); although not necessary, for brevity we deal with the case where there are homogeneous Dirichlet conditions only. Thus, we have an approximate solution of the BVP

$\mathcal{D}(u) = f$ in $\Omega$, $\quad u = 0$ on $\Gamma$   (5.41)

available. In case we have estimated the discretization error $e = u - u_h$ and found it too large (as measured by e.g. the energy norm), or otherwise suspect that $u_h$ is not good enough for our purposes, we need to change our discretization (i.e. our finite element mesh) so as to improve the approximation. Enhancing the discretization is called mesh refinement, and can be done in various manners as will be described below. First, however, let us further investigate what is required. To this end we substitute the FE-approximation into the BVP to obtain

$\mathcal{D}(u_h) = f_h$ in $\Omega$, $\quad u_h = 0$ on $\Gamma$   (5.42)

Note that since the basis functions have to satisfy essential boundary conditions, the boundary equation is exactly satisfied. However, when the differential operator $\mathcal{D}$ acts on $u_h$ we obtain a function $f_h$ that in general will differ from the right hand side $f$ of the PDE in Eq. (5.41). Seeing that the FE-approximation $u_h$ is the exact solution of the BVP Eq. (5.42), one may take the view that we have approximated the original problem Eq. (5.41) by replacing the loading function $f$.

Remark: In case a Neumann condition is present, we would have an additional expression in Eq. (5.42), viz. $l(u_h) = h_h$ on $\Gamma_N$ (cf. Eq. (5.13)). Thus, the boundary load $h$ is replaced by $h_h$; as noted earlier, natural boundary conditions are approximated along with the differential equation(s).

Remark: If $\mathcal{D}$ is a second order differential operator one would normally use $C^0$-continuous elements to approximate the BVP (5.41). This means that $u_h$ has discontinuous first derivatives, so that $\mathcal{D}(u_h)$ does not exist in the classical sense. However, by resorting to Dirac functions and distributions, $\mathcal{D}(u_h)$ is a legitimate operation and the BVP Eq. (5.42) is mathematically well defined.

In order to explore the difference between Eqs. (5.41) and (5.42), we subtract the latter from the former. Because the differential operators we consider are all linear, we have that $\mathcal{D}(u) - \mathcal{D}(u_h) = \mathcal{D}(u - u_h)$, so we get


$\mathcal{D}(u - u_h) = f - f_h$ in $\Omega$, $\quad u - u_h = 0$ on $\Gamma$   (5.43)

Here we recognize $u - u_h$ as the discretization error $e$; the difference between the two loading functions, i.e. $f - f_h$, may be thought of as a residual (unbalanced force) and is denoted

$r = f - f_h$   (5.44)

Hence, the discretization error is the solution of the BVP

$\mathcal{D}(e) = r$ in $\Omega$, $\quad e = 0$ on $\Gamma$   (5.45)

Observe that the residual is known, since $f$ is a given function and $f_h$ may be calculated as $f_h = \mathcal{D}(u_h)$ (see Eq. (5.42)).

To summarize, we have approximated the solution of the BVP Eq. (5.41) and found that the error is given by Eq. (5.45), which is a BVP that is similar to the original problem. Only the right hand sides in the PDEs differ. Let us therefore approximate the discretization error by solving Eq. (5.45) with a finite element method. The weak form of the problem reads

Find $e \in V$ such that $a(e, v) = (v, r) \quad \forall v \in V$   (5.46)

see Eq. (5.14). Here we note that if the test function is chosen from the subspace $V_h \subset V$, with $V_h$ spanned by the basis functions that were used for the FE-approximation $u_h$ of Eq. (5.41), we get $a(e, v) = 0$ due to Galerkin orthogonality. We conclude that $(v, r) = 0$ $\forall v \in V_h$, or equivalently

$(N_i, r) = 0, \quad i = 1, 2, \dots, n$   (5.47)

We may thus view the Galerkin FE-method as a 'weighted residual method': the residual 'weighted' with the basis functions $N_i$ is made zero in the sense of Eq. (5.47). For this reason, the basis functions are frequently called weight functions in engineering texts on finite elements.

Remark: Rather than using the basis functions $N_i$ as 'weight' functions, one may make other choices. In Ch. 8 of the text book by Ottosen and Petersson (1992) various alternatives are described; for instance, the choice $v = r$ yields the least squares method explained in Ottosen and Petersson Sec. 8.4 (where a different notation is used for the residual).

Example: Once again we consider the bar problem given by Eq. (5.20). Here we have the differential operator $\mathcal{D} = -\dfrac{d}{dx}EA\dfrac{d}{dx}$, so with a FE-approximation $u_h$ the residual becomes

$r = f - f_h = f - \mathcal{D}(u_h) = f - \left(-\dfrac{d}{dx}EA\dfrac{du_h}{dx}\right) = f + \dfrac{d}{dx}EA\dfrac{du_h}{dx}$   (5.48)

cf. Eqs. (5.42) and (5.44). The weighted residual method Eq. (5.47) then yields


$(N_i, r) = \int_0^L N_i\left(f + \dfrac{d}{dx}EA\dfrac{du_h}{dx}\right)dx = 0, \quad i = 1, \dots, n$   (5.49)

or

$-\int_0^L N_i\dfrac{d}{dx}EA\dfrac{du_h}{dx}\,dx = \int_0^L N_i f\,dx, \quad i = 1, \dots, n$

Integrating the left hand side by parts, we find

$\int_0^L \dfrac{dN_i}{dx}EA\dfrac{du_h}{dx}\,dx - \left[N_iEA\dfrac{du_h}{dx}\right]_0^L = \int_0^L N_i f\,dx, \quad i = 1, \dots, n$   (5.50)

Here the boundary term is zero, since $\left.\dfrac{du_h}{dx}\right|_{x=L} = 0$ is prescribed by the natural boundary condition, while $N_i(0) = 0$ is imposed by the Dirichlet condition (see Eq. (5.20)). Hence, with $u_h = \mathbf{N}\mathbf{a}$ (where $\mathbf{N} = [N_1(x)\ \dots\ N_n(x)]$ is a row vector with the basis functions, and $\mathbf{a} = [a_1\ \dots\ a_n]^T$ is a column vector with the node variables) we have

$\int_0^L \dfrac{dN_i}{dx}EA\dfrac{d\mathbf{N}}{dx}\,dx\;\mathbf{a} = \int_0^L N_i f\,dx, \quad i = 1, \dots, n$   (5.51)

which indeed is the FE equation system $\mathbf{K}\mathbf{a} = \mathbf{f}$.

5.6 Mesh Refinement

If one solves the BVP Eq. (5.45) using finite elements, an approximation $e_h \approx e$ of the error is obtained. An improved approximation of the original BVP is then available as

$u \approx u_{h2} = u_h + e_h$   (5.52)

As was discovered in the previous section, the basis functions used to calculate $u_h$ cannot be used to solve Eq. (5.45), since the residual $r$ is orthogonal to the basis functions with respect to the inner product $(\cdot,\cdot)$. Thus, we need a new FE-discretization to improve our approximation. For this reason, one normally does not solve for an approximate solution $e_h \approx e$ of Eq. (5.45), but solves for $u_{h2}$ with a FE mesh that embraces more node variables than was used for $u_h$. If needed, an approximation of the discretization error $e = u - u_h$ may then be calculated as

$e \approx e_h = u_{h2} - u_h$   (5.53)

There are two ways in which a FE mesh can be refined. Either one employs more elements or retains the element subdivision but switches to elements with higher order polynomial basis functions. These approaches are referred to as the $h$- and $p$-methods, respectively; $h$ refers to element size, while $p$ denotes polynomial order.


5.6.1 $h$-refinement

$h$-refinement in a one dimensional problem is illustrated in the figure to the right. At top we see an interval discretized by two linear elements; below that, the mesh has been refined by dividing each element into two new ones. Refined this way, the original nodes will be present in the new mesh. Note, however, that the associated basis functions will change. For instance, the function labelled $N_1$ in the illustration takes on the value $1$ in the node at $x = 0$ and varies linearly to $0$ at $x = 0.50$ in the original mesh, but linearly to $0$ at $x = 0.25$ in the refined mesh. The fact that the previous nodes are retained in the new mesh assures that the new FE approximation is at least as good as the previous one, since the previous $u_h$ can be expressed by the new basis functions.

In two and three dimensional problems, $h$-refinement normally involves dividing all edges of an element. For instance, a triangular element can be divided into 4 triangles if we insert new nodes on the 3 edges. The pde-toolbox in Matlab uses this technique (illustrated in the figure below).

[Figures: a two-element mesh and its h-refined four-element mesh, showing $u$, $u_h$ and the basis functions $N_1$, $N_2$; and a triangular mesh refined by subdividing every triangle into four.]
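A minimal sketch of uniform h-refinement of a one-dimensional mesh (pure bookkeeping, independent of any particular problem): every element is split in two by inserting its midpoint, and the original nodes reappear in the refined mesh, which is what guarantees that the old approximation is contained in the new FE space.

import numpy as np

def refine_1d(nodes):
    """Split every element [x_i, x_{i+1}] in two by inserting its midpoint."""
    mid = 0.5 * (nodes[:-1] + nodes[1:])
    return np.sort(np.concatenate([nodes, mid]))

coarse = np.array([0.0, 0.5, 1.0])         # two linear elements
fine = refine_1d(coarse)                   # four elements
print(fine)                                # [0.  0.25 0.5  0.75 1. ]
print(np.all(np.isin(coarse, fine)))       # True: old nodes retained in the refined mesh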

5.6.2 $p$-refinement

As in the $h$-method, $p$-refinement involves adding nodes to the mesh. However, the element subdivision is retained, so the nodes are now added to the existing elements and the basis functions are replaced by polynomials of a higher degree. A one dimensional case is depicted to the right; here two elements with linear basis functions are replaced by two elements with quadratic basis functions. In order to accomplish this, a node has to be added to each element (green dots), but the original nodes (and element subdivision) are kept unchanged. As in the case of $h$-refinement, all basis functions are changed, but by keeping the original nodes in the refined mesh the original FE approximation can be expressed by the new set of basis functions.

$p$-refinement can also be done in two and three dimensional problems. As an example (right), a 3-noded triangular element with linear ($p = 1$) basis functions may be refined to an element with a quadratic ($p = 2$) approximation by adding a node on each element edge; by adding another 4 nodes to the element, one may adopt an approximation that is a complete cubic polynomial.

5.6.3 Hierarchical Basis Functions

The finite element approximation is a linear combination of $n$ basis functions $N_i$

$u_{hc} = N_1a_1 + N_2a_2 + \dots + N_na_n$   (5.54)

where the coefficients $a_i$ are the node variables; here we have adopted a subscript $c$ for 'coarse', as opposed to $f$ which will be used to denote a refined ('fine') FE approximation (below). When the approximation is substituted into the variational problem and the basis functions are used as test functions (Galerkin's method), we obtain a system with $n$ equations

$\mathbf{K}_{cc}\mathbf{a}_c = \mathbf{f}_c$   (5.55)

[Figures: the two-element mesh p-refined to quadratic elements, showing $u$, $u_h$ and the new basis functions; triangular elements with $p = 1$, $2$, $3$ approximations and the corresponding complete polynomial terms $1$; $x$, $y$; $x^2$, $xy$, $y^2$; $x^3$, $x^2y$, $xy^2$, $y^3$; ...]

that allows us to solve for the unknown node variables $\mathbf{a}_c = [a_1\ \dots\ a_n]^T$.

We can use either $h$- or $p$-refinement to improve the approximation. Using the conventional approach as described in the previous sub-sections, we replace all basis functions by new ones and construct the refined approximation

$u_{hf} = N_1a_1 + N_2a_2 + \dots + N_na_n + N_{n+1}a_{n+1} + N_{n+2}a_{n+2} + \dots + N_{n+m}a_{n+m}$   (5.56)

Since we introduce more nodes in the mesh and use the conventional method to define basis functions (that is, to let each basis function take on the value $1$ in one node and the value $0$ in all other nodes on the element), we have that the new functions $N_i$ differ from the original ones. Hence the conventional approach means that we not only introduce more basis functions, but we do in effect also replace the existing ones, and so get an entirely new system of equations

$\mathbf{K}_{ff}\mathbf{a}_f = \mathbf{f}_f$   (5.57)

Even if the existing solution on the coarse grid, Eq. (5.55), has been found not to be accurate enough, we have done some work and do have some information about the solution of the BVP, but it is difficult to exploit this when solving Eq. (5.57). Some of the node variables in $\mathbf{a}_f$ will indeed be the same as the ones in $\mathbf{a}_c$ (if the original nodes have been kept), but neither the new stiffness matrix, nor the new load vector, is easily compared to the original ones. As an alternative approach, one may keep the original approximation Eq. (5.54) and simply add what is missing to procure the refined approximation Eq. (5.56). This avenue is referred to as hierarchical refinement, and the basis functions so engaged are accordingly called hierarchical basis functions. Hence we now have

$u_{hf} = N_1a_1 + N_2a_2 + \dots + N_na_n + N_{n+1}a_{n+1} + N_{n+2}a_{n+2} + \dots + N_{n+m}a_{n+m}$   (5.58)

where $N_1, \dots, N_n$ are the same as the original basis functions in Eq. (5.54), while the hierarchical basis functions $N_{n+1}, \dots, N_{n+m}$ have been constructed so that the approximation becomes identical to that of Eq. (5.56), albeit with a different basis. Since the original basis is kept unchanged, the previous equation system Eq. (5.55) will be found as a partitioned part of the new system of equations; the $m$ hierarchical basis functions yield $m$ additional columns in the stiffness matrix, and the associated node variables are collected in a vector $\mathbf{a}_h = [a_{n+1}\ \dots\ a_{n+m}]^T$. We get another $m$ equations by using the hierarchical functions as test functions (Galerkin's method). Hence, with hierarchical refinement we have

$\mathbf{K}_{ff} = \begin{bmatrix} \mathbf{K}_{cc} & \mathbf{K}_{ch} \\ \mathbf{K}_{hc} & \mathbf{K}_{hh} \end{bmatrix}, \qquad \mathbf{a}_f = \begin{bmatrix} \mathbf{a}_c \\ \mathbf{a}_h \end{bmatrix}, \qquad \mathbf{f}_f = \begin{bmatrix} \mathbf{f}_c \\ \mathbf{f}_h \end{bmatrix}$   (5.59)

where $\mathbf{K}_{hc} = \mathbf{K}_{ch}^T$, so the new equation system, Eq. (5.57), may be written

$\begin{bmatrix} \mathbf{K}_{cc} & \mathbf{K}_{ch} \\ \mathbf{K}_{hc} & \mathbf{K}_{hh} \end{bmatrix}\begin{bmatrix} \mathbf{a}_c \\ \mathbf{a}_h \end{bmatrix} = \begin{bmatrix} \mathbf{f}_c \\ \mathbf{f}_h \end{bmatrix}$   (5.60)

Here the node variables $\mathbf{a}_c$ are identical to those in Eq. (5.55), but their respective values may be somewhat different due to the coupling $\mathbf{K}_{ch}$ to the hierarchical node variables $\mathbf{a}_h$. Also note


that when we solve the initial system Eq. (5.55), it may be claimed that we actually solve the refined system Eq. (5.60) with the hierarchical variables prescribed to zero, $\mathbf{a}_h = \mathbf{0}$.

There are some advantages with hierarchical refinements as compared to the conventional approach. First we note that the original structure stiffness matrix $\mathbf{K}_{cc}$ and structure load vector $\mathbf{f}_c$ appear in the refined equation system (5.60), so the computational work involved in the numerical integration to establish these need not be repeated. We are simply required to calculate a few more columns ($\mathbf{K}_{ch}$ and $\mathbf{K}_{hh}$) in order to obtain the new stiffness matrix $\mathbf{K}_{ff}$, and some more load vector elements $\mathbf{f}_h$. Second, we notice that we have a good estimate of the $\mathbf{a}_c$ part of the new solution vector, since Eq. (5.55) has already been solved, e.g. by Gaussian elimination or matrix factorization. There are several powerful iterative equation solvers that can be utilized to solve Eq. (5.60) with very little computational effort, using the current values of $\mathbf{a}_c$ as a start vector and using the factorization of $\mathbf{K}_{cc}$ as a so called pre-conditioner, see e.g. Möller (1994). Third, and of greatest interest here (see next section), the hierarchical approach to mesh refinement allows us to do a simple estimation of the discretization error and of how this error is distributed over the computational domain.

A few graphic examples of hierarchical functions are shown below, but we refrain from giving explicit expressions for the various functions; the interested reader may consult Möller (1994) and various sources given therein.
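The block structure of Eq. (5.60) can be exploited as sketched below. The small matrices are made-up placeholders; in practice $\mathbf{K}_{cc}$ and $\mathbf{f}_c$ are reused unchanged, while only the coupling and hierarchical blocks require new numerical integration.

import numpy as np

# previously assembled coarse system, Eq. (5.55) -- reused as is
Kcc = np.array([[4.0, -1.0], [-1.0, 4.0]])
fc  = np.array([1.0, 2.0])
ac  = np.linalg.solve(Kcc, fc)

# new coupling and hierarchical blocks (placeholder values)
Kch = np.array([[0.5], [0.2]])
Khh = np.array([[3.0]])
fh  = np.array([0.4])

# assemble and solve the partitioned system, Eq. (5.60)
Kff = np.block([[Kcc, Kch], [Kch.T, Khh]])
ff  = np.concatenate([fc, fh])
af  = np.linalg.solve(Kff, ff)

print(ac)    # coarse solution; also a good start vector for an iterative solver
print(af)    # refined solution [a_c_new, a_h]; a_c_new differs only slightly from a_c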

Let us first consider hierarchical $h$-refinement. Using the conventional approach, we would introduce a new node and divide an existing element into two new ones as described and depicted in the previous section. With the hierarchical approach, one would rather keep the original basis functions ($N_1$ and $N_2$) and just add a function $N_{\mathrm{hier}}$ as shown in the figure to the right. The so modified element has three basis functions that can mimic any function that can be expressed by two linear elements. Notice that no new node is explicitly introduced; the new node variable will rather have the meaning of a 'relative' function value, as indicated by the vertical hatched line in the figure.


Using hierarchical $p$-refinement, we would go about in a manner analogous to above. We keep the two original linear basis functions and introduce a hierarchical function $N_{\mathrm{hier}}$ that is a quadratic polynomial; as above, the new function is zero in the nodes of the element. The three basis functions ($N_1$, $N_2$ and $N_{\mathrm{hier}}$) on this element can be used to describe the very same function $u_h$ as the 'conventional' quadratic element that was used for $p$-refinement in the previous section, but, once again, the new 'node' variable is not associated with any node and rather has the meaning of a relative function value (see vertical hatched line in the illustration).

Hierarchical basis functions can also be used in two and three dimensional problems. An example is provided below: here one, viz. $N_4^e$, of the four basis functions of an isoparametric bi-linear element is depicted in its parent domain $(\xi, \eta)$, along with three examples of hierarchical basis functions for $p$-refinement.

Functions like these may be constructed in a systematic manner, both on triangular and quadrilateral elements, and explicit expressions may be found in the literature, see e.g. Möller (1994).

[Figures: the hierarchical functions $N_{\mathrm{hier}}$ for h- and p-refinement of a linear element, together with $N_1$, $N_2$, $u$ and $u_h$; the bi-linear basis function $N_4^e$ on the parent domain $(\xi, \eta)$; and three hierarchical p-refinement functions on the parent domain.]

Notice that in two and three dimensional problems, hierarchical basis functions may take on non-zero values at element boundaries; this is for instance the case for two of the three hierarchical functions shown above. Adding such a function to an element approximation, we have to add a corresponding function to the approximation on any adjacent element in order to preserve $C^0$-continuity; see illustration to the right.

5.7 A Residual Based a posteriori Error Estimate

Hierarchical basis functions furnish a simple and convenient means to estimate the discretization error in a FE-approximation, in terms of an energy measure. The idea is to estimate how much the energy would change if a single (hierarchical) basis function were included in the approximation, and by summing the contributions from several such functions we get an estimate of the error. Moreover, since each basis function has local support, i.e. takes on non-zero values on one or a few elements only, one also gets an estimate of how the error is distributed over the domain; the contribution to the energy error from a basis function will stem from the few elements where the function has support. Knowledge of the error distribution is of course very useful if we want to change the discretization, since we want to refine the mesh in areas where the error is largest.

Let us now study a FE-approximation with $n$ basis functions, and assume that we have established and solved the FE-equations

$\mathbf{K}_{cc}\mathbf{a}_c = \mathbf{f}_c$   (5.61)

so that we have the approximation

$u \approx u_{hc} = \sum_{j=1}^n N_ja_j = \mathbf{N}\mathbf{a}_c$   (5.62)

We now consider adding a hierarchical basis function $N_i$ to the approximation, so that a refined approximation

$u_{hf} = \sum_{j=1}^n N_ja_{j,\mathrm{new}} + N_ia_i = [\mathbf{N}\ \ N_i]\begin{bmatrix} \mathbf{a}_{c,\mathrm{new}} \\ a_i \end{bmatrix}$   (5.63)

is obtained. The potential energy changes from $\Pi(u_{hc})$ to $\Pi(u_{hf})$ and we seek to estimate the difference. Note that the node variables in Eq. (5.63) are the very same as in Eq. (5.62) (since $N_i$ is a hierarchical basis function), but the actual numerical values of the variables may change due to the inclusion of the new function; hence the subscript 'new'. Substituting the refined approximation into the weak form of the problem and using the new basis function as an additional test function, we obtain an additional row and column in the stiffness matrix such that


$\begin{bmatrix} \mathbf{K}_{cc} & \mathbf{K}_{ci} \\ \mathbf{K}_{ic} & K_{ii} \end{bmatrix}\begin{bmatrix} \mathbf{a}_{c,\mathrm{new}} \\ a_i \end{bmatrix} = \begin{bmatrix} \mathbf{f}_c \\ f_i \end{bmatrix}$   (5.64)

where $\mathbf{K}_{ic} = \mathbf{K}_{ci}^T$. From the last equation in Eq. (5.64), we get an estimate of the new node variable

$a_i = \dfrac{f_i - \mathbf{K}_{ic}\mathbf{a}_{c,\mathrm{new}}}{K_{ii}} \approx \dfrac{f_i - \mathbf{K}_{ic}\mathbf{a}_c}{K_{ii}}$   (5.65)

where we assumed that the original node variables do not change very much, $\mathbf{a}_{c,\mathrm{new}} \approx \mathbf{a}_c$, or rather that $\mathbf{K}_{ic}\mathbf{a}_{c,\mathrm{new}} \approx \mathbf{K}_{ic}\mathbf{a}_c$.

We are now in a position where we may estimate the change in potential energy that would arise if the new basis function $N_i$ is adopted in the new approximation Eq. (5.63). This is a reasonable measure, since we know that the exact solution of the BVP minimizes the potential energy over all admissible functions $V$, and that a conforming FE-approximation with the Galerkin method minimizes the same quantity in the FE-space $V_h$. Since $V_h \subset V$ we can tell that $\Pi(u_{hc}) \ge \Pi(u)$. Furthermore, it is noted that by selecting appropriate values for the node variables in Eq. (5.63), we may reproduce Eq. (5.62), i.e. we can get $u_{hf} = u_{hc}$, and thus conclude that $\Pi(u_{hc}) \ge \Pi(u_{hf})$. In other words, the new approximation Eq. (5.63) embraces the same node variables as the previous one (Eq. (5.62)) plus an additional variable, so calculating the node variables so as to minimize the potential energy cannot possibly result in a higher value than before. We have

$\Pi(u_{hc}) = \tfrac{1}{2}\mathbf{a}_c^T\mathbf{K}_{cc}\mathbf{a}_c - \mathbf{a}_c^T\mathbf{f}_c$   (5.66)

and

$\Pi(u_{hf}) = \tfrac{1}{2}\begin{bmatrix} \mathbf{a}_{c,\mathrm{new}} \\ a_i \end{bmatrix}^T\begin{bmatrix} \mathbf{K}_{cc} & \mathbf{K}_{ci} \\ \mathbf{K}_{ic} & K_{ii} \end{bmatrix}\begin{bmatrix} \mathbf{a}_{c,\mathrm{new}} \\ a_i \end{bmatrix} - \begin{bmatrix} \mathbf{a}_{c,\mathrm{new}} \\ a_i \end{bmatrix}^T\begin{bmatrix} \mathbf{f}_c \\ f_i \end{bmatrix} = \tfrac{1}{2}\mathbf{a}_{c,\mathrm{new}}^T\mathbf{K}_{cc}\mathbf{a}_{c,\mathrm{new}} - \mathbf{a}_{c,\mathrm{new}}^T\mathbf{f}_c + \mathbf{K}_{ic}\mathbf{a}_{c,\mathrm{new}}\,a_i + \tfrac{1}{2}K_{ii}a_i^2 - f_ia_i$   (5.67)

where we used that $\mathbf{a}_{c,\mathrm{new}}^T\mathbf{K}_{ci} = \mathbf{K}_{ic}\mathbf{a}_{c,\mathrm{new}}$. Next we introduce a so called error indicator $\eta_i^2$

$\eta_i^2 = \Pi(u_{hc}) - \Pi(u_{hf}) \ge 0$   (5.68)

Hence, the indicator value is the expected change in potential energy due to the inclusion of the basis function $N_i$ in the FE-approximation. Notice that the indicator gives a measure of a local error, since the basis function is non-zero on one or a few elements only. Substituting Eqs. (5.66) and (5.67) into Eq. (5.68) and once again making the approximation $\mathbf{a}_{c,\mathrm{new}} \approx \mathbf{a}_c$, we get

$\eta_i^2 = (f_i - \mathbf{K}_{ic}\mathbf{a}_c)\,a_i - \tfrac{1}{2}K_{ii}a_i^2$   (5.69)


Finally, the estimated value for $a_i$ according to Eq. (5.65) is inserted into Eq. (5.69) to get

$\eta_i^2 = \dfrac{(f_i - \mathbf{K}_{ic}\mathbf{a}_c)^2}{2K_{ii}}$   (5.70)

Remark: As pointed out above, when solving Eq. (5.61) for $\mathbf{a}_c$ we may claim that we actually solve the refined system Eq. (5.64) with the variable $a_i$ prescribed to zero. The last equation in (5.64) then gives us the corresponding residual (or 'reaction force') as

$r_i = \mathbf{K}_{ic}\mathbf{a}_c - f_i$   (5.71)

When we release the constraint such that the variable slowly grows from zero to $a_i$, the residual decreases from $r_i$ to zero; the work done is then negative, thus decreasing the energy in the system

$\Delta\Pi_i = -\dfrac{r_ia_i}{2}$   (5.72)

Substituting Eqs. (5.65) and (5.71) into (5.72) gives us Eq. (5.70)

$\eta_i^2 = \Delta\Pi_i$   (5.73)

The error indicator $\eta_i^2$ is an estimate of how much the potential energy will decrease if a basis function $N_i$ is included in the FE-approximation, but it does not tell very much about the error

$\Delta\Pi = \Pi(u_{hc}) - \Pi(u)$   (5.74)

in the available FE-solution. To get a grip on this error we need to calculate error indicator values for several possible new basis functions, and sum these values. If, say, our current solution is based on linear elements, we could calculate error indicator values for all possible quadratic functions and approximate $\Delta\Pi$ by the sum. Hence, if we have calculated $\eta_i^2$ by considering $m$ new basis functions, we adopt the approximation

$\Delta\Pi \approx \sum_{i=1}^m \eta_i^2$   (5.75)

We have previously mentioned that it comes natural to use the energy norm for the type of problems encountered in the present course, so we conclude this section by showing how the error estimate Eq. (5.75) is related to the energy norm of the discretization error $e = u - u_{hc}$.

For simplicity we consider a case of Dirichlet conditions in the governing BVP; any Neumann condition would add terms to the subsequent expressions, but does not void the conclusions. First we note that the potential energy in the exact solution is

$\Pi(u) = \tfrac{1}{2}a(u, u) - (u, f)$   (5.76)

ai

ηi

2 fi Kicac–( )2

2Ki i

------------------------------=

ac

ai

ri Kicac fi–=

ai

ri

∆Πi

ai ri

∆Πi

r– iai

2-----------=

ηi

2 ∆Πi=

ηi

2

Ni

∆Π Π uhc( ) Π u( )–=

∆Π ηi

2

m

∆Π ηi

2

i 1=

m

∑≈

e u uhc–=

Π u( ) 1

2---a u u,( ) u f,( )–=

Page 119: Compiled lecture notes

— 119 —

and observe that the variational problem leads to

$a(u, u) = (u, f)$   (5.77)

Hence, we have

$\Pi(u) = -\tfrac{1}{2}a(u, u)$   (5.78)

By a similar derivation, using that the FE-formulation gives $a(u_{hc}, u_{hc}) = (u_{hc}, f)$, we find the potential energy in our FE-approximation as

$\Pi(u_{hc}) = -\tfrac{1}{2}a(u_{hc}, u_{hc})$   (5.79)

Hence, from Eqs. (5.78) and (5.79) we conclude

$\Delta\Pi = \Pi(u_{hc}) - \Pi(u) = \tfrac{1}{2}\left(a(u, u) - a(u_{hc}, u_{hc})\right)$   (5.80)

and since the error in energy equals the energy of the discretization error, $a(u, u) - a(u_{hc}, u_{hc}) = a(e, e)$ (see Sec. 5.4), we have

$\Delta\Pi = \tfrac{1}{2}a(e, e)$   (5.81)

and thus

$\|e\|_a = \sqrt{a(e, e)} = \sqrt{2\Delta\Pi} \approx \sqrt{2\sum_{i=1}^m \eta_i^2}$   (5.82)
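A sketch of how the indicators of Eq. (5.70) and the estimate of Eq. (5.82) are evaluated once the coarse solution is available. The loop assumes that, for each candidate hierarchical function i, the coupling row K_ic, the diagonal term K_ii and the load term f_i have been integrated beforehand; the numbers below are placeholders, not results from an actual mesh.

import numpy as np

def error_estimate(Kcc, fc, candidates):
    """candidates: list of (K_ic, K_ii, f_i) for each candidate hierarchical function."""
    ac = np.linalg.solve(Kcc, fc)
    etas = []
    for Kic, Kii, fi in candidates:
        r_i = Kic @ ac - fi                    # residual 'reaction force', Eq. (5.71)
        etas.append(r_i**2 / (2.0 * Kii))      # error indicator, Eq. (5.70)
    e_energy = np.sqrt(2.0 * sum(etas))        # estimate of ||e||_a, Eq. (5.82)
    return np.array(etas), e_energy

Kcc = np.array([[4.0, -1.0], [-1.0, 4.0]])     # placeholder coarse system
fc = np.array([1.0, 2.0])
cands = [(np.array([0.5, 0.2]), 3.0, 0.4),
         (np.array([0.1, 0.6]), 2.5, 0.7)]
etas, est = error_estimate(Kcc, fc, cands)
print(etas, est)    # the indicators show where refinement pays off; est approximates ||e||_a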


5.8 Adaptivity

In this section we shall briefly describe the concept of adaptive FE-analysis and provide some examples thereof. First, however, we contemplate how a FE-computation conventionally is carried out.

Conceptually a FE-calculation may be understood as consisting of three steps: preparing the necessary input data, establishing and solving the FE-equations ($\mathbf{K}\mathbf{a} = \mathbf{f}$), and finally proceeding with some additional calculations to find and print out sought-for secondary unknowns (e.g. mechanical stresses, using calculated node displacements and specified material data). In major FE-systems, these steps are handled by different modules, usually referred to as a pre-processor, a solver, and a post-processor. Sometimes these modules are bundled together into a single program such that the user may not be aware of the different steps, but frequently one has to actually run three different programs (i.e. (1) a preprocessor that prepares an input file for (2) a solver, which in turn delivers results to (3) a post-processor). Such a conventional FE-analysis is depicted by the white boxes in the flowchart provided to the right.

In addition to the above, a solver in an adaptive FE-program will estimate the discretization error and its distribution over the domain. If the error is estimated to be larger than some threshold value specified by the user, the program changes the mesh (using $h$- or $p$-refinement) such that the discretization becomes denser (i.e. uses more node variables) in regions where the error is large. A new FE equation system is subsequently established and solved. This process, illustrated by the tinted boxes with blue frames in the flowchart, is repeated until the discretization error becomes sufficiently small.

The reader may be familiar with the pde-toolbox in MATLAB; it offers adaptivity with $h$-refinements as described above, except that the user cannot supply an error threshold. Instead the refinement loop is terminated after a selected number of refinements or when the number of employed elements exceeds a specified limit, whichever occurs first.

A few examples of adaptive FE-calculations are provided in the following pages.

[Flowchart: Start → Define the problem and read input; pre–processing: geometry (elements and nodes), material data, loads, boundary conditions, etc. → Numerical integration and assembly: establish $Ka = f$ → Solve the equations to get $a$ → Estimate the error and its distribution → Error < threshold? If No: refine the mesh and return to the assembly step; if Yes: post–processing: calculate secondary unknowns (reaction forces, stresses and strains, heat flows, etc.) → Write results → Stop. The white boxes constitute a conventional analysis; the error estimation and mesh refinement boxes are the additions in an adaptive program.]


Example adopted from Möller (1994): a laminated composite plate with a prescribed axial strain $\varepsilon_z$. A quarter of a cross section is studied, and one expects to get elevated interlaminar stresses close to the edge of the plate. It can be seen that the adaptive $h$–method refines the mesh in that particular region.


Example adopted from Möller (1994): 3D adaptive $h$–refinement in an elasticity problem; an L–shaped cantilever structure is subjected to a vertical load at its free end. Stress concentrations arise close to the clamping and at the re–entrant edge.


This example shows adaptive $p$–refinements with hierarchical functions. Here a simple cantilever beam is modelled by 2D elasticity and plane stress. Note that there are two primary unknowns (viz. the horizontal and vertical displacements) in the problem; the arrows in the illustrations indicate the approximation that has been refined in the respective cases. Adopted from Möller (1994).


6. Convergence

We have seen that if the approximation $u_h$ has been found not to be good enough, we can improve the accuracy by using higher order polynomials as basis functions ($p$–method) or using more elements ($h$–method) in constructing $u_h$. In both cases we introduce more degrees of freedom (node variables), and provided that this is done so that the current approximation $u_h$ is in the new finite element space $V_{h,\mathrm{new}}$, we are guaranteed that the new solution cannot be worse in terms of minimum potential energy: $\Pi(u_h) \ge \Pi(u_{h,\mathrm{new}}) \ge \Pi(u)$.

Notice that the above reasoning implies that we are using a conforming method, since only then are we assured that the finite element space $V_h$ is a subspace of the space of admissible functions, i.e. $V_h \subset V$. For second order problems, the FE–method is conforming if $u_h$ is $C^0$–continuous, while $C^1$–continuity is required for a fourth order differential equation. In addition, the elements have to be complete: it should be possible to choose node variable values such that $u_h$ becomes an arbitrary constant on the element and to choose (other) values so that the first derivative(s) become arbitrary constant(s); fourth order problems also require that $u_h$ can represent arbitrary constant second derivatives on the element.

In the subsequent presentation we concentrate on the $h$–method for second order problems; the $p$–method is briefly treated in a concluding section. Our goal is to find out how the discretization error $e = u - u_h$ behaves when the element size $h$ is changed.

6.1 One Dimensional Problems

We consider a piecewise linear finite element approximation $u_h$ of an exact solution $u$ of a boundary value problem. Now, focusing on one element we have

(6.1)  $u_h(x) = a_i N_i(x) + a_{i+1} N_{i+1}(x) = \alpha_0 + \alpha_1 x$

while a Taylor series expansion of the exact solution around a point $x_0$ gives

(6.2)  $u(x) = u(x_0) + \frac{du}{dx}\Big|_{x_0}(x-x_0) + \frac{1}{2}\frac{d^2u}{dx^2}\Big|_{x_0}(x-x_0)^2 + \frac{1}{6}\frac{d^3u}{dx^3}\Big|_{x_0}(x-x_0)^3 + \ldots$

Here we have implicitly assumed that $u$ is sufficiently smooth to have such a series expansion; this will be elaborated on a bit later. We also remark that for any coordinate $x$ within the element, one has

(6.3)  $|x - x_0| \le h$

We observe that the node variables $a_i$ and $a_{i+1}$ (or the coefficients $\alpha_0$ and $\alpha_1$ in the standard basis) can be calculated so that $u_h$ agrees with the first two terms in the right hand side of Eq. (6.2); that is not necessarily what a conforming FE–method yields, but we know that $u_h$ is


the best possible approximation in the FE–space in the sense that the discretization error $e = u - u_h$ is energy orthogonal to the basis functions. Hence we can write

(6.4)  $e(x) = \frac{1}{2}\frac{d^2u}{dx^2}\Big|_{x_0}(x-x_0)^2 + \frac{1}{6}\frac{d^3u}{dx^3}\Big|_{x_0}(x-x_0)^3 + \ldots$

Now, if $h$ (and, hence, $|x-x_0|$) is small, the first term in the right hand side of Eq. (6.4) will dominate, and we have

(6.5)  $|e(x)| = Ch^2$

where the constant $C$ depends on the exact solution (and thus normally will be unknown); note that $e(x)$ is a function and $|e(x)|$ should be understood as some appropriate norm.

In engineering practice, the first derivative(s) of the solution are of more interest than the solution itself. For instance, in an elasticity problem the stresses (first derivatives of the displacements) are usually of more interest than the displacements $u = [u_x\;u_y]^T$, and in a heat transfer problem the heat flux (first derivatives of temperature) may be more important than the temperature $u$. For the linear element, we have $du_h/dx = \alpha_1$, c.f. Eq. (6.1), while $du/dx$ may be expressed by a Taylor series analogous to Eq. (6.2). By the same reasoning that gave us Eq. (6.4), we realise that

(6.6)  $\left|\frac{de}{dx}\right| = Ch$

where $\frac{de}{dx} = \frac{du}{dx} - \frac{du_h}{dx}$ is the error in the first derivative; note that we use $C$ as a generic symbol for a constant — the constant in Eq. (6.5) is not the same as the constant in Eq. (6.6). Also, once again, since $de/dx$ is a function, $|de/dx|$ should be understood as some appropriate norm. As has been seen, it is natural to associate the energy norm $\|\cdot\|_a$ with the types of boundary value problems and FE–methods that we have encountered. Taking the bar problem (one dimensional elasticity) as an example, we have

(6.7)  $\|e\|_a = \sqrt{a(e,e)} = \left(\int_0^L EA(x)\left(\frac{de}{dx}\right)^2 dx\right)^{1/2}$

where $EA(x) > 0$, $\forall x$, is a constitutive function; loosely speaking: 'the square root of the (weighted) summed square of $de/dx$'. Thus, Eq. (6.6) may be written

(6.8)  $\|e\|_a = Ch$

Let us now consider elements with quadratic basis functions so that

(6.9)  $u_h = \alpha_0 + \alpha_1 x + \alpha_2 x^2, \qquad \frac{du_h}{dx} = \alpha_1 + 2\alpha_2 x$


It is now seen that the error in $u_h$ will be in the fourth term of the right hand side of Eq. (6.2), while the error in $du_h/dx$ is the third term of the series expansion of $du/dx$, so Eqs. (6.5) and (6.8) now become

(6.10)  $|e(x)| = Ch^3, \qquad \|e\|_a = Ch^2$

It appears that if the basis functions are polynomials of degree $p$ we have the estimates

(6.11)  $|e(x)| = Ch^{p+1}, \qquad \|e\|_a = Ch^p$

From the derivation above, it is clear that the constants $C$ depend on the first omitted terms in the Taylor series expansions of the exact solution $u$ and its derivative $du/dx$. Thus $C$ depends on the polynomial order $p$ and on the exact solution; since the latter in general is not known, we cannot use the estimates to evaluate the discretization error in a FE–approximation, but the estimates will tell us how the error will change if we change the element size $h$. For instance, with $p = 1$ we see that the error $|e(x)|$ will be reduced by a factor $4$ if we double the number of elements by splitting them in halves ($h \to h/2$); the error in the first derivative, $\|e\|_a$, will be halved.

Consider for instance the problem solved in lecture 3 (see Sec. 2.5.1), where we used 4 linear elements to approximate the solution of a boundary value problem. The figure below shows the exact derivative $du/dx$ and the finite element approximation $du_h/dx$; the latter is constant in each element. It is here conceivable how the error — e.g. the jumps in $du_h/dx$ between the elements — will be halved if we split each element into two new ones.

[Figure: exact and FE derivatives, $du/dx$ and $du_h/dx$, plotted versus $x$ for the example of Sec. 2.5.1.]

Estimates like those in Eq. (6.11) are called a priori estimates, since they will tell us something about how the error in a FE–solution $u_h$ will behave, before we have calculated $u_h$. This is as opposed to the a posteriori estimate $\|e\|_a \approx \sqrt{2\sum_i \eta_i^2}$ discussed in Sec. 5.7, which calls for a FE–approximation to work on.

Also note that $p$ is a fixed constant in Eq. (6.11), so the estimates cannot be used to figure out how the error will change if we change the polynomial order (e.g. switch from elements with linear basis functions ($p = 1$), to elements with a quadratic approximation ($p = 2$)). This is because the constants $C$ stem from the omitted terms in the Taylor series and, hence, will change if we change $p$.
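A common practical use of Eq. (6.11) is to estimate the observed rate of convergence from two computations on different meshes. A small sketch (not from the notes; the error values are made up) is:

```matlab
% Observed convergence rate from two meshes, using ||e||_a = C*h^p, i.e.
% p = log(e1/e2)/log(h1/h2). The error values are invented for illustration.
h = [1/4  1/8];                        % element sizes of the two meshes
e = [0.1443  0.0722];                  % corresponding energy-norm errors
p = log(e(1)/e(2)) / log(h(1)/h(2))    % observed rate, here close to 1
```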

6.2 Two Dimensional Problems

Having examined how the discretization error $e$ behaves in a one dimensional problem when the element size $h$ is changed, we now turn our attention to the case where we have two


independent variables. With $u = u(x,y)$ being the exact solution and $u_h = u_h(x,y)$ a FE–approximation, the discretization error $e = u - u_h$ is a function of $x$ and $y$. Here we have two first derivatives, viz. with respect to $x$ and $y$, respectively, and we will consider the gradient of the error,

$\nabla e = \begin{bmatrix} \dfrac{\partial e}{\partial x} & \dfrac{\partial e}{\partial y} \end{bmatrix}^T = \begin{bmatrix} \dfrac{\partial u}{\partial x} - \dfrac{\partial u_h}{\partial x} & \dfrac{\partial u}{\partial y} - \dfrac{\partial u_h}{\partial y} \end{bmatrix}^T$

i.e. the error in the first derivatives.

Here we should pay some attention to problems with more than one unknown function, such as for instance a two dimensional elasticity problem where the unknown is a vector valued function $u = [u_x(x,y)\;\; u_y(x,y)]^T$. Here, the discretization error is a vector $e = [e_x\;\; e_y]^T = [(u_x - u_{xh})\;\; (u_y - u_{yh})]^T$ and rather than using the gradient operator, it seems more natural to use $\tilde{\nabla}$. Hence, we would use

$\tilde{\nabla} e = \begin{bmatrix} \partial/\partial x & 0 \\ 0 & \partial/\partial y \\ \partial/\partial y & \partial/\partial x \end{bmatrix}\begin{bmatrix} e_x \\ e_y \end{bmatrix} = \begin{bmatrix} \partial e_x/\partial x \\ \partial e_y/\partial y \\ \partial e_x/\partial y + \partial e_y/\partial x \end{bmatrix}$

to evaluate the error in the first derivatives. This could be thought of as the strains in the error function $e(x,y)$. In the subsequent presentation we consider the Laplace problem where we have a single unknown function (e.g. a temperature $u = u(x,y)$), and the associated energy norm of the error, $\|e\|_a = \sqrt{a(e,e)} = \left(\int_\Omega (\nabla e)^T D (\nabla e)\, d\Omega\right)^{1/2}$, where the constitutive matrix $D$ is symmetric and positive definite, is a measure of the errors in the first derivatives in a FE–approximation. However, it should be recognized that the results that will follow will also apply to the (2D) elasticity problem if we use the associated energy norm $\|e\|_a = \left(\int_\Omega (\tilde{\nabla} e)^T D (\tilde{\nabla} e)\, d\Omega\right)^{1/2}$, where the Hooke matrix $D$ is symmetric and positive definite, to evaluate the error in first derivatives.

Now consider a discretization of a domain $\Omega$ into elements, and in particular we focus on an element that occupies a region $\Omega_e$. We select a point $(x_0, y_0)$ in the element and let $h$ denote some characteristic element size. The Taylor series expansion of the unknown becomes

(6.12)  $u(x,y) = u(x_0,y_0) + \frac{\partial u}{\partial x}(x-x_0) + \frac{\partial u}{\partial y}(y-y_0)$
        $\qquad + \frac{1}{2}\frac{\partial^2 u}{\partial x^2}(x-x_0)^2 + \frac{\partial^2 u}{\partial x\,\partial y}(x-x_0)(y-y_0) + \frac{1}{2}\frac{\partial^2 u}{\partial y^2}(y-y_0)^2$
        $\qquad + \frac{1}{6}\frac{\partial^3 u}{\partial x^3}(x-x_0)^3 + \frac{1}{2}\frac{\partial^3 u}{\partial x^2\partial y}(x-x_0)^2(y-y_0) + \frac{1}{2}\frac{\partial^3 u}{\partial x\,\partial y^2}(x-x_0)(y-y_0)^2 + \frac{1}{6}\frac{\partial^3 u}{\partial y^3}(y-y_0)^3 + \ldots$

where all derivatives are evaluated at $(x_0, y_0)$,


and it is recognized that $|x - x_0| \le h$ and $|y - y_0| \le h$, since $(x_0, y_0) \in \Omega_e$.

Let us now turn our attention to the FE–approximation $u_h$. It will be a complete polynomial of degree $p$ if it includes all polynomial terms of the form $x^m y^n$, for integers $m$ and $n$ such that $m + n \le p$. Hence, for $u_h$ to be a polynomial of degree $p = 1$, it must contain (in terms of the standard basis) $1, x, y$, so that $u_h = \alpha_1 + \alpha_2 x + \alpha_3 y\,(+$ possibly higher order terms$)$. Similarly, $p = 2$ would require the basis $1, x, y, x^2, xy, y^2$, so that $u_h = \alpha_1 + \alpha_2 x + \alpha_3 y + \alpha_4 x^2 + \alpha_5 xy + \alpha_6 y^2\,(+$ possibly higher order terms$)$. The concept is best illustrated by a Pascal triangle with polynomial terms:

$1$
$x \quad y$
$x^2 \quad xy \quad y^2$
$x^3 \quad x^2y \quad xy^2 \quad y^3$
$x^4 \quad x^3y \quad x^2y^2 \quad xy^3 \quad y^4$
$x^5 \quad x^4y \quad x^3y^2 \quad x^2y^3 \quad xy^4 \quad y^5$
$\ldots$

(terms down to the second, third and fourth row correspond to $p = 1$, $p = 2$ and $p = 3$, respectively).

The simplest possible element is the 3–node triangle, for which we have $u_h = \alpha_1 + \alpha_2 x + \alpha_3 y$, and we see that the first three terms in Eq. (6.12) can be exactly reproduced. Hence, the error is in the omitted second order terms, so we have

(6.13)  $|e(x,y)| = Ch^2$

As for the first derivatives we first note that $\nabla u_h = [\alpha_2 \;\; \alpha_3]^T$, so considering Taylor series representations of the respective first derivatives, one expects the errors to be in the linear terms. Thus,

(6.14)  $\|e\|_a = Ch$

For a 4–node quadrilateral we have $u_h = \alpha_1 + \alpha_2 x + \alpha_3 y + \alpha_4 xy$. While this expression contains a second order term, it is complete up to $p = 1$ only, since only one of the three second order terms (second row of Eq. (6.12)) is accounted for, so we expect Eq. (6.13) to hold (although with a different constant $C$). The gradient becomes $\nabla u_h = [(\alpha_2 + \alpha_4 y) \;\; (\alpha_3 + \alpha_4 x)]^T$. Since $\partial u_h/\partial x$ lacks a linear term in $x$, we expect $\partial u/\partial x - \partial u_h/\partial x \sim h$; the error in the $y$–derivative will behave similarly, since the second component of the gradient does not include a linear term in $y$. Thus, Eqs. (6.13) and (6.14) hold for the 4–node quadrilateral as well as for the 3–node triangle, i.e. elements for which $p = 1$.

Let us now investigate elements with complete second order polynomial approximations ($p = 2$). The simplest such element is the 6–node triangle, which features the approximation $u_h = \alpha_1 + \alpha_2 x + \alpha_3 y + \alpha_4 x^2 + \alpha_5 xy + \alpha_6 y^2$. It is seen that this is enough to exactly represent the first six terms in Eq. (6.12), so the error stems from the cubic terms (third row in Eq. (6.12)). Furthermore, the gradient components $\nabla u_h = [(\alpha_2 + 2\alpha_4 x + \alpha_5 y) \;\; (\alpha_3 + \alpha_5 x + 2\alpha_6 y)]^T$ are complete polynomials of degree $1$, so the error in first derivatives is expected to be of order $h^2$. Another


common element with quadratic basis functions is the 8–node Serendipity element, which uses the approximation $u_h = \alpha_1 + \alpha_2 x + \alpha_3 y + \alpha_4 x^2 + \alpha_5 xy + \alpha_6 y^2 + \alpha_7 x^2 y + \alpha_8 xy^2$. It embraces two, but not all four, cubic terms, so like for the 6–node triangle, we must expect the leading terms in the error $e = u - u_h$ to be proportional to $h^3$. The gradient components include quadratic terms, but are complete up to degree $1$ only: $\nabla u_h = [(\alpha_2 + 2\alpha_4 x + \alpha_5 y + 2\alpha_7 xy + \alpha_8 y^2) \;\; (\alpha_3 + \alpha_5 x + 2\alpha_6 y + \alpha_7 x^2 + 2\alpha_8 xy)]^T$; hence, the errors in first derivatives are proportional to $h^2$. Hence, for the $p = 2$ elements we have

(6.15)  $|e(x,y)| = Ch^3, \qquad \|e\|_a = Ch^2$

cf. Eq. (6.10). As in the case with the one dimensional elements, we conclude that if the basis functions constitute a complete basis for polynomials of orders up to $p$ we get the estimates

(6.16)  $|e(x,y)| = Ch^{p+1}, \qquad \|e\|_a = Ch^p$

For instance, if we use linear elements ($p = 1$) to solve an elasticity problem, we anticipate that the errors in calculated stresses (first derivatives) should be halved if we double the number of elements in each direction, $h \to h/2$.

A note on $u_h$ vs. the Taylor series representation of $u$. In the derivations above, we compared the FE–approximation in standard basis, e.g. $u_h = \alpha_1 + \alpha_2 x + \alpha_3 y$ for the 3–node triangle, with a Taylor series representation of the exact solution. In doing so, we investigated how many leading terms in the series could be exactly represented by appropriate selection of values for the polynomial coefficients $\alpha_i$, and concluded that the error stems from the remaining terms. However, in practice we use basis functions $N_1^e, N_2^e, N_3^e, \ldots$ to construct $u_h$, rather than the standard basis $1, x, y, \ldots$, and the associated node variables $a_i^e$ are calculated such that the potential energy is minimized. Once the node variables are known, we can always express the approximation on the element in terms of the standard basis by means of a transformation (change of basis). For instance, for a 3–node triangle one would have $u_h = a_1^e N_1^e + a_2^e N_2^e + a_3^e N_3^e = \alpha_1 + \alpha_2 x + \alpha_3 y$; see the $C$–matrix method for a derivation of the relation between $a_i^e$ and $\alpha_j$. Hence, we cannot expect the actual coefficients $\alpha_j$ to be such that the leading terms in the Taylor series are exactly matched, but in general we will have an error also in those low order terms and it is not obvious that an estimate like $\|e\|_a = Ch^p$ holds. In order to resolve this obstacle, let us consider the function $u_I = \alpha_1 + \alpha_2 x + \alpha_3 y + \ldots$, where the polynomial coefficients are chosen so as to fit the function to the leading terms in the Taylor series Eq. (6.12); $u_I$ is known as the interpolant to $u$. Also define the error $e_I = u - u_I$. From the above it should then be obvious that $\|e_I\|_a = Ch^p$ holds. Now recall the Galerkin orthogonality: the discretization error $e = u - u_h$ is $a$–orthogonal to all basis functions; the FE–approximation is such that $\|e\|_a$ is minimized. Thus, we may argue that $\|e\|_a \le \|e_I\|_a = Ch^p$, see Eq.


(5.19), and claim that the estimates Eq. (6.16) hold even though $u_h$, in general, does not coincide with the leading terms in the Taylor series representation of $u$.

6.3 Effect of Singularities

In the foregoing we have tacitly assumed that the exact solution $u$ actually has a Taylor series representation, but this is not always the case. In some practical problems there are one or several points in which $u$ is singular; at such points, some derivative of $u$ will be infinite, a Taylor series does not exist, and the rate of convergence may be reduced. If the $\lambda$:th derivative of $u$ tends to infinity at some point, the estimate Eq. (6.16) is modified to become

(6.17)  $\|e\|_a = Ch^{\min(p,\,\lambda-\varepsilon)}$

with $\varepsilon$ arbitrarily small, but positive. The number $\lambda$ may be taken as a measure of the strength of the singularity — the smaller the number, the stronger the singularity. It should be noted that $\lambda$ is not necessarily an integer, but may very well be a fractional number. Thus, in a case (see below) where $\lambda = \tfrac{1}{2}$, we claim that all derivatives of $u$ up to, but not including, the $\tfrac{1}{2}$:th derivatives exist. It is beyond the scope of this presentation to dwell on fractional derivatives, but we emphasize that these are well defined in mathematics (you may ask Mikael Enelund about this).

Let $(r, \theta)$ be a polar coordinate system with its origin at a singular point. Then it can be shown that the exact solution is

(6.18)  $u(r,\theta) = r^{\lambda} f(r,\theta)$

in the vicinity of the point ($r$ sufficiently small); here $f$ is smooth, i.e. continuously differentiable. Hence, a first derivative of $u$ is expected to be proportional to $r^{\lambda-1}$, so if $\lambda - 1 < 0$ the derivative tends to infinity when $r \to 0$.


One common design that yields a singular solution is domains with re–entrant corners. Here $\lambda = \pi/\omega < 1$, where $\omega$ denotes the interior opening angle.

[Figure: domain $\Omega$ with a re–entrant corner of interior opening angle $\omega$, with polar coordinates $(r, \theta)$ placed at the corner.]

The extreme case $\omega = 2\pi$, corresponding to a sharp crack, is for instance studied in LEFM (linear elastic fracture mechanics); here $\lambda = \tfrac{1}{2}$, so close to the crack tip the displacements vary as $\sqrt{r}$, cf. Eq. (6.18), and the stresses as $1/\sqrt{r}$. Indeed, consulting some text book on the subject, e.g. Ref. [], one finds

$u_x = \frac{(1+\nu)K_I}{4\pi E}\sqrt{2\pi r}\left[(2\kappa - 1)\cos\frac{\theta}{2} - \cos\frac{3\theta}{2}\right] \qquad u_y = \frac{(1+\nu)K_I}{4\pi E}\sqrt{2\pi r}\left[(2\kappa + 1)\sin\frac{\theta}{2} - \sin\frac{3\theta}{2}\right]$

$\sigma_{xx} = \frac{K_I}{\sqrt{2\pi r}}\cos\frac{\theta}{2}\left[1 - \sin\frac{\theta}{2}\sin\frac{3\theta}{2}\right] \qquad \sigma_{yy} = \frac{K_I}{\sqrt{2\pi r}}\cos\frac{\theta}{2}\left[1 + \sin\frac{\theta}{2}\sin\frac{3\theta}{2}\right] \qquad \sigma_{xy} = \frac{K_I}{\sqrt{2\pi r}}\cos\frac{\theta}{2}\sin\frac{\theta}{2}\cos\frac{3\theta}{2}$

where $E$, $\nu$, $\kappa$, and $K_I$ are constants depending on the material and boundary conditions, if the load is perpendicular to the crack surface ($\theta = 0$); similar expressions may be found for the case when the load is parallel or transverse to the crack.
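For a concrete number (a small worked example added here; the interior angle is assumed, not taken from the notes), consider a typical re–entrant corner with $\omega = 3\pi/2$, i.e. a 270° corner:

$\lambda = \frac{\pi}{\omega} = \frac{\pi}{3\pi/2} = \frac{2}{3} \approx 0.67$

so, according to Eq. (6.17), both linear and quadratic elements will then converge with the reduced rate $\|e\|_a = Ch^{2/3 - \varepsilon}$ near such a corner.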


A characteristic example is provided below. A rectangular domain with a re–entrant corner is horizontally loaded such that $\sigma_{xx} = 1\ \mathrm{MPa}$ at the left and right edges. Using pdetool with adaptivity in Matlab, one obtained the von Mises stress $25\ \mathrm{MPa}$ at the corner. Zooming in at the corner, we see that very small elements have been generated here; this is important, since if larger elements are used, the singularity will 'pollute' the solution in regions away from the corner. Also note that the exact stress is infinite at the corner, so further refinement of the mesh will simply increase the calculated von Mises stress. In practice, the material will exhibit some non–linear behaviour in the vicinity of the corner, so the linear analysis can only predict the response away from this area provided that the non–linear region is small enough and that it is well isolated with small elements. Should we be interested in the state close to the corner, another material model (i.e. other than Hooke's law) has to be invoked.

[Figure: adaptively refined mesh and von Mises stress field (colour scale 5–20 MPa) for the domain with the re–entrant corner; the smallest elements are concentrated at the corner.]


Another cause for singular points in solutions is abrupt changes in boundary conditions. This may for instance be produced by a discontinuity in a Neumann condition as illustrated in the example below, where a singularity in the shear stress $\sigma_{xy}$ is caused by an abrupt change in boundary load.

[Figure: shear stress $\sigma_{xy}$ (values up to about $2.5\cdot 10^6$) and deformed shape; the boundary load changes abruptly from $\sigma_{yy} = -10\ \mathrm{MPa}$, $\sigma_{xy} = 0$ to $\sigma_{yy} = \sigma_{xy} = 0$, and a symmetry condition is used on one edge.]

Similarly, singularities may be instigated at points on a boundary where there is a change from Dirichlet to Neumann conditions. This is depicted in the following two examples.

[Figures: von Mises stress (values up to about $2.2\cdot 10^{10}$) and deformed shape, shown in overview and zoomed in near the point where the boundary condition changes from the prescribed displacement $u_y = -0.1\ \mathrm{mm}$ to the traction–free condition $\sigma_{xy} = \sigma_{yy} = 0$; a symmetry condition is used on one edge.]


Yet another situation that may yield singular points in the exact solution is encountered when there is a discontinuity in constitutive data (material properties). This typically occurs in the interface between different materials, in the vicinity of the domain boundary. An example, adopted from Möller (1994), is provided below; here we see a laminate consisting of four orthotropic carbon/epoxy plies with the fibres oriented in different directions $\theta_z$ to the longitudinal axis. The stresses, due to an axial load $P$, have been calculated (by an adaptive FE–program) along the interface between the two uppermost laminas and it can be seen that the interlaminar shear stress $\sigma_{yz}$ exhibits a singular behaviour at the domain boundary.

[Figures: the laminate cross section with the stress components $\sigma_{zz}$, $\sigma_{xz}$, $\sigma_{yz}$, $\sigma_{xx}$ plotted along the ply interface, and von Mises stress plots (values up to about 120) shown in overview and zoomed in near the free edge.]


Finally we mention that point sources give singular points in the exact solution. For instance, in an elasticity problem one gets an infinite stress at points where concentrated forces are applied. This situation may be recognized as a special case of abrupt changes in boundary conditions.

6.4 Rate of Convergence

We will provide an example to exemplify convergence in energy norm, but prior to that it may be elucidating to graphically envisage the relation expressed by Eq. (6.17). Graphs for this type of relation are usually best shown in logarithmic scales. Taking the logarithm of both sides of Eq. (6.17), we get

(6.19)  $\log\|e\|_a = \log C + k\,\log h, \qquad k = \min(p, \lambda)$

(Here, for convenience, we omitted the arbitrary positive number $\varepsilon$ and write $\lambda$ rather than $\lambda - \varepsilon$.) Thus, the $\log\|e\|_a$–$\log h$ graph will be a straight line with slope $k$; this slope is called the rate of convergence. In the absence of singularities, i.e. for $\lambda$ large enough, the rate of convergence is given by the polynomial order $p$ of the element basis functions, as shown by the left figure below. On the other hand, in the presence of a singularity such that $\lambda < p$, the singularity will govern the rate of convergence. The right figure below shows the convergence rate for linear and quadratic elements for a case where $\lambda = \tfrac{1}{2}$; the slope of both lines is $k = \tfrac{1}{2}$, as opposed to the slopes $k = 1$ and $k = 2$ that would have been obtained in the absence of singularities (left figure).

[Figures: log–log plots of $\|e\|_a$ versus decreasing element length $h$ for linear ($p = 1$) and quadratic ($p = 2$) elements; left: convergence for $\lambda > 2$, right: convergence for $\lambda < 1$ ($\lambda = 1/2$).]
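In practice the rate $k$ in Eq. (6.19) is obtained by fitting a straight line to computed $(h, \|e\|_a)$ pairs in the log–log plane. A small sketch (with made-up data, not from the notes) is:

```matlab
% Least-squares estimate of the convergence rate k in Eq. (6.19):
% log||e||_a = log C + k*log h. The data pairs are invented for illustration.
h = [1/4  1/8  1/16  1/32];            % element sizes
e = [0.290 0.145 0.072 0.036];         % energy-norm errors (made up)
c = polyfit(log(h), log(e), 1);        % straight-line fit in log-log scale
k = c(1)                               % slope = observed rate, here close to 1
```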


6.4.1 Example

Consider a string of length $2L$ that is pre–tensioned by a force $S$. If a transverse distributed load $p(x)$ is applied, the displacement $u(x)$ is obtained as the solution of $-\frac{d^2u}{dx^2} = \frac{p}{S}$ ($-L < x < L$), subjected to $u(-L) = u(L) = 0$ (provided that the displacement is small compared to the length of the string).

[Figure: pre–tensioned string of length $2L$ with tension $S$, transverse load $p(x)$ and deflection $u(x)$.]

Now consider the case $L = 1$ and $p/S = 1$. Utilizing the symmetry, we obtain the boundary value problem (BVP)

(6.20)  $-\frac{d^2u}{dx^2} = 1, \quad -1 < x < 0; \qquad u(-1) = 0, \quad \frac{du}{dx}\Big|_{x=0} = 0$

which has the solution $u(x) = \tfrac{1}{2}(1 - x^2)$. It is seen that the essential boundary condition $u(-1) = 0$ is satisfied, and since $du/dx = -x$ the natural condition is satisfied as well; yet another differentiation yields $d^2u/dx^2 = -1$, so it is verified that the BVP is satisfied. (It may be noted that $u_{\max} = 1/2$ hardly can be regarded as small compared to $L = 1$, but we are interested in the mathematics here and do not care about the physics; the problem is linear so you may set $p/S = 0.01$ to get $u(x) = \tfrac{1}{200}(1 - x^2)$ and, hence, $u_{\max} = 1/200$, if that makes you feel better.)

The weak form of the problem is: Find $u \in V$ such that

(6.21)  $a(u,v) = (v,1) \quad \forall v \in V, \qquad V = \left\{ v:\; v(-1) = 0,\ \int_{-1}^{0} v^2\,dx < \infty,\ \int_{-1}^{0}\left(\frac{dv}{dx}\right)^2 dx < \infty \right\}$

where

(6.22)  $a(u,v) = \int_{-1}^{0} \frac{dv}{dx}\frac{du}{dx}\,dx, \qquad (v,1) = \int_{-1}^{0} v\cdot 1\,dx$

the energy in the exact solution is thus

(6.23)  $a(u,u) = \int_{-1}^{0} (-x)^2\,dx = \frac{1}{3}$
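A quick numerical check of Eq. (6.23) (a two-line verification added here, not in the original notes):

```matlab
% Check a(u,u) = int_{-1}^{0} (du/dx)^2 dx with du/dx = -x, cf. Eq. (6.23).
auu = integral(@(x) x.^2, -1, 0)   % returns 0.3333... = 1/3
```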


Let us now introduce a finite element approximation $u \approx u_h = Na$, where $N = [N_1(x)\;\ldots\;N_n(x)]$ is a row vector with selected basis functions and $a = [a_1\;\ldots\;a_n]^T$ is a column vector with the associated (and yet unknown) node variables. Substituting into the weak problem and applying the Galerkin method, we obtain

(6.24)  $\int_{-1}^{0} \frac{dv}{dx}\,B\,dx\; a = \int_{-1}^{0} v\,dx, \qquad v = N_1, \ldots, N_n$

where $B = \dfrac{dN}{dx} = \left[\dfrac{dN_1}{dx}\;\ldots\;\dfrac{dN_n}{dx}\right]$. Collecting the equations row–wise, we have

(6.25)  $\int_{-1}^{0} B^T B\,dx\; a = \int_{-1}^{0} N^T dx$

or

(6.26)  $Ka = f, \qquad K = \int_{-1}^{0} B^T B\,dx, \qquad f = \int_{-1}^{0} N^T dx$

where $K$ and $f$ are the structure stiffness matrix and the structure load vector, respectively.

Now consider a discretization of the interval $[-1, 0]$ into linear elements, and study one of these with node coordinates $x_i$ and $x_{i+1}$; the element length is $h = x_{i+1} - x_i$. There are two basis functions on the element, viz.

(6.27)  $N_1^e = \frac{x_{i+1} - x}{h}, \qquad N_2^e = \frac{x - x_i}{h}$

with associated node variables $a_1^e$ and $a_2^e$. Using $N^e = [N_1^e\;\;N_2^e]$ and $a^e = [a_1^e\;\;a_2^e]^T$, we have the approximation $u_h = N^e a^e$ on the element. Also, with the notation $B^e = \dfrac{dN^e}{dx} = \dfrac{1}{h}[-1\;\;1]$ we can now calculate the contribution to the stiffness matrix and load vector

(6.28)  $K^e = \int_{x_i}^{x_{i+1}} B^{eT} B^e\,dx = \frac{1}{h}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \qquad f^e = \int_{x_i}^{x_{i+1}} N^{eT}\,dx = \frac{h}{2}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$

Solution with 1 element  With a single element we have $h = 1$ and the element matrix and vector Eq. (6.28) are also the structure stiffness matrix and structure load vector according to Eq. (6.26). Numbering our node variables from left to right, we obtain the equations


(6.29)  $\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} a_1 = 0 \\ a_2 \end{bmatrix} = \begin{bmatrix} 1/2 \\ 1/2 \end{bmatrix}$

where we prescribed the first node variable so as to fulfil the essential boundary condition. Also note that the first equation is not valid, since it was obtained by selecting the first basis function as test function: $v = N_1$. However, $N_1(-1) = 1$, so this function is not in the trial space $V$ — see the definition of $V$ in Eq. (6.21). The second equation (6.29) yields $a_2 = 1/2$. We may now calculate the energy in the FE–approximation

(6.30)  $a(u_h, u_h) = a^T K a = a^T f = \begin{bmatrix} 0 & 1/2 \end{bmatrix}\begin{bmatrix} 1/2 \\ 1/2 \end{bmatrix} = \frac{1}{4}$

(the boundary term in $f$ vanishes because of the homogeneous natural condition). Now, since the energy in the error $e = u - u_h$ equals the error in energy, Eqs. (6.23) and (6.30) give us

(6.31)  $a(e,e) = a(u,u) - a(u_h,u_h) = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$

and thus the energy norm of the error

(6.32)  $\|e\|_a = \sqrt{a(e,e)} = \frac{1}{\sqrt{12}} \approx 0.2887$

Solution with 2 elements  With equal size we have $h = 1/2$ for both elements and Eq. (6.28) gives $K^e = \begin{bmatrix} 2 & -2 \\ -2 & 2 \end{bmatrix}$ and $f^e = \begin{bmatrix} 1/4 \\ 1/4 \end{bmatrix}$. Once again numbering elements and the variables from left to right we have $a^e = [a_1\;\;a_2]^T$ and $a^e = [a_2\;\;a_3]^T$ for the first and second element, respectively, so assembly gives

(6.33)  $\begin{bmatrix} 2 & -2 & 0 \\ -2 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 0 & 2 & -2 \\ 0 & -2 & 2 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 1/4 \\ 1/4 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1/4 \\ 1/4 \end{bmatrix} \;\Rightarrow\; \begin{bmatrix} 2 & -2 & 0 \\ -2 & 4 & -2 \\ 0 & -2 & 2 \end{bmatrix}\begin{bmatrix} a_1 = 0 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 1/4 \\ 1/2 \\ 1/4 \end{bmatrix}$

where we took the essential boundary condition into account. For the same reason as before, the first of these equations is invalid; the second and third equations (6.33) give $a_2 = 3/8$ and $a_3 = 1/2$. We get the energy

(6.34)  $a(u_h, u_h) = a^T f = \begin{bmatrix} 0 & 3/8 & 1/2 \end{bmatrix}\begin{bmatrix} 1/4 \\ 1/2 \\ 1/4 \end{bmatrix} = \frac{5}{16}$

Using Eqs. (6.23) and (6.34), we evaluate

(6.35)  $a(e,e) = a(u,u) - a(u_h,u_h) = \frac{1}{3} - \frac{5}{16} = \frac{1}{48}$


and, hence

(6.36)  $\|e\|_a = \sqrt{a(e,e)} = \frac{1}{2\sqrt{12}} \approx 0.1443$

We see, Eqs. (6.32) and (6.36), that doubling the number of elements ($h \to h/2$) reduces the error (as measured by the energy norm) by a factor $2$. This is in accordance with the theory, which says that, in the absence of any singularity, we should have $\|e\|_a = Ch^p$ when the basis functions are polynomials that are complete up to degree $p$ (and the constant $C$ depends on the exact solution $u$); in our case we have $p = 1$.

The Matlab/Calfem script file converge.m (available for download from the course web–page) solves the problem with $1, 2, 4, \ldots, 512$ elements and plots the energy norm of the respective errors versus the element lengths in a log–log scale; it is readily seen that we get a straight line with slope $p = 1$.
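The course script itself is not reproduced here, but a small stand-alone script in the same spirit (a sketch, assembling the linear elements of Eq. (6.28) and using $a(u,u) = 1/3$ from Eq. (6.23)) could look as follows:

```matlab
% Stand-alone convergence study in the spirit of converge.m: solve -u'' = 1
% on (-1,0), u(-1) = 0, u'(0) = 0 with n linear elements and evaluate
% ||e||_a = sqrt(a(u,u) - a(uh,uh)), cf. Eqs. (6.23) and (6.31).
nels = 2.^(0:9);                       % n = 1, 2, 4, ..., 512
err  = zeros(size(nels));
for k = 1:numel(nels)
  n = nels(k);  h = 1/n;
  K = zeros(n+1);  f = zeros(n+1,1);
  for i = 1:n                          % assembly, cf. Eq. (6.28)
    K(i:i+1,i:i+1) = K(i:i+1,i:i+1) + [1 -1; -1 1]/h;
    f(i:i+1)       = f(i:i+1) + h/2*[1; 1];
  end
  a = zeros(n+1,1);
  a(2:end) = K(2:end,2:end) \ f(2:end);       % a(1) = 0 (essential BC)
  err(k) = sqrt(1/3 - a'*f);                  % ||e||_a, cf. Eq. (6.31)
end
loglog(1./nels, err, 'o-'), grid on
xlabel('Element length h'), ylabel('Energy norm of the error ||e||_a')
% err(1) ~ 0.2887 and err(2) ~ 0.1443 reproduce Eqs. (6.32) and (6.36),
% and the plotted line has slope 1.
```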

6.5 p–refinement

Having obtained a FE–solution that is deemed too coarse, one would normally use more elements to improve the approximation, while one hardly ever considers changing element types. Thus, it is standard procedure to utilize the so called $h$–method in order to refine one's approximation, and Eq. (6.17) shows how we should expect the discretization error to behave. There are, however, adaptive FE–programs that employ the $p$–method in order to reduce the discretization error, so it is of some interest to compare the two approaches to each other in terms of convergence rates.

The relation between the energy norm of the discretization error and the discretization parameters ($h$ and $p$), as expressed by Eq. (6.17), is only valid for constant $p$; changing the polynomial order of the basis functions, we do not only get a different exponent in the right



hand side of the equation, but the constant $C$ will take on a different value as well. Hence, as articulated earlier, Eq. (6.17) is valid in the context of $h$–refinement only and one has to work out some other expression to reveal how the error is affected by $p$–refinement. In the latter case, the element size $h$ will be constant since the element subdivision is unchanged. Therefore, in order to compare the convergence rates of the two approaches ($h$– and $p$–refinements, respectively), we need to express the discretization error as a function of some variable that is present in both cases. An obvious such variable is the number of degrees of freedom (i.e. the number of node variables), $N_{dof}$, in the discretization, which henceforth will be considered.

Let us thus express Eq. (6.17) in terms of the number of degrees of freedom employed in the discretization. First consider a one dimensional problem posed on an interval $x_l < x < x_r$. With $N_{el}$ elements, the average element size is $h = \dfrac{x_r - x_l}{N_{el}} = \dfrac{1}{N_{el}}$ (where we, without loss of generality, used the interval length $x_r - x_l$ as the basis for length measurement). With $N_{noel} \ge 2$ nodes on each element and $N_{ndof}$ variables in each node, we will have $N_{dof} = N_{ndof}\,[(N_{noel} - 1)N_{el} + 1]$. With successive refinements, i.e. a growing $N_{el}$, the trailing constant can be ignored, so with $N_{el} = h^{-1}$ we have $N_{dof} \sim h^{-1}$ or $h = C N_{dof}^{-1}$. Hence, Eq. (6.17) can be written

(6.37)  $\|e\|_a = C N_{dof}^{-\min(p,\,\lambda-\varepsilon)}$

In the case of two independent variables, we can measure the element size in different directions, e.g. by length measures $h_x$ and $h_y$; however, in our a priori estimate the value of the length measure is of little interest, but focus is on how the discretization error changes when the element size is changed (e.g. halved in both directions). For this reason we will keep the notation $h$ without subscript to denote a typical element size and note that in the two dimensional case the number of elements will be proportional to $\dfrac{1}{h}\times\dfrac{1}{h} = h^{-2}$. Counting vertex nodes only, we thus have $N_{dof} = N_{ndof}\left(\dfrac{1}{h} + 1\right)^2 \approx Ch^{-2}$ node variables. Substituting into Eq. (6.17), we find

(6.38)  $\|e\|_a = C N_{dof}^{-\min(p,\,\lambda-\varepsilon)/2}$

It should now be easy to see that in a three dimensional problem one has $N_{dof} \sim h^{-3}$, so

(6.39)  $\|e\|_a = C N_{dof}^{-\min(p,\,\lambda-\varepsilon)/3}$

The reader is reminded that the formulas (Eqs. (6.37)–(6.39)) are still valid for the $h$–method only. Corresponding formulas for the $p$–refinement are elaborate to derive and are left out of this presentation, but the reader is referred to Ref. [] and []. However, we state the main results: the rate of convergence of the $p$–method is at least as good as for the $h$–method; moreover, if the exact solution exhibits any singular point, the $p$–method converges twice as fast as the $h$–method provided that nodes are placed at singular points. One may realise that, given the typical situations that yield singular points (see above), it is actually difficult to envision a situation where there is no node in such a point.
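As a small worked illustration of Eq. (6.38) (added here; it covers the smooth case with no singularity), consider linear elements ($p = 1$) in two dimensions:

$N_{dof} \approx Ch^{-2} \;\Rightarrow\; h \to h/2 \text{ gives } N_{dof} \to 4N_{dof}, \qquad \|e\|_a = C N_{dof}^{-1/2} \to \tfrac{1}{2}\|e\|_a$

so quadrupling the number of degrees of freedom halves the error, i.e. a slope of $-\tfrac{1}{2}$ in a $\log\|e\|_a$–$\log N_{dof}$ diagram.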


The graph below provides an example; in a two dimensional problem with $\lambda = \tfrac{1}{2}$, $h$–refinements give a slope $-\lambda/2 = -\tfrac{1}{4}$ in a logarithmic $\|e\|_a$–$N_{dof}$ graph, cf. Eq. (6.38), while the $p$–method yields a twice as steep slope ($-\lambda = -\tfrac{1}{2}$).

[Figure: convergence of $h$– and $p$–methods in 2D for $\lambda = 0.5$; log–log plot of $\|e\|_a$ versus $N_{dof}$ ($10^2$ to $10^5$), where the $p$–refinement curve descends twice as steeply as the $h$–refinement curve.]


References

Bathe, K–J. and Wilson, E. L., Numerical Methods in Finite Element Analysis, Prentice–Hall, Englewood Cliffs, NJ, 1976.

Chen, Z., The Finite Element Method. Its Fundamentals and Applications in Engineering, World Scientific Publishing, Hackensack, NJ, 2011.

Ciarlet, P. G., The Finite Element Method for Elliptic Problems, North–Holland, Amsterdam, 1978.

Johnson, C., Numerical Solution of Partial Differential Equations by the Finite Element Method, Studentlitteratur, Lund, 1987.

Ottosen, N. and Petersson, H., Introduction to the Finite Element Method, Prentice–Hall Europe, 1992.

Möller, P. W., Procedures in Adaptive Finite Element Analysis (dissertation), Department of Structural Mechanics, Chalmers University of Technology, Göteborg, 1994.

Samuelsson, A. and Wiberg, N–E., Finita Elementmetodens Grunder, Studentlitteratur, Lund, 1988.

Strang, G. and Fix, G. F., An Analysis of the Finite Element Method, Prentice–Hall, Englewood Cliffs, NJ, 1973.

Wiberg, N–E. (ed.), Finita Elementmetoden — En datoranpassad beräkningsmetod för ingenjörsproblem, LiberLäromedel, Lund, 1975.