On the origin and early history of functional analysis

U.U.D.M. Project Report 2008:1

Degree project in mathematics, 30 credits
Supervisor and examiner: Sten Kaijser

January 2008

Department of Mathematics, Uppsala University


Jens Lindström

Abstract

In this report we will study the origins and history of functional analysis up until 1918. We begin by studying ordinary and partial differential equations in the 18th and 19th centuries to see why there was a need to develop the concepts of functions and limits. We will see how a general theory of infinite systems of equations and determinants by Helge von Koch was used in Ivar Fredholm’s 1900 paper on the integral equation

$$\varphi(s) = f(s) + \lambda \int_a^b K(s, t)\,\varphi(t)\,dt \tag{1}$$

which resulted in a vast study of integral equations.

One of the most enthusiastic followers of Fredholm and integral equation theory was David Hilbert, and we will see how he further developed the theory of integral equations and spectral theory.

The concept introduced by Fredholm to study sets of transformations, or operators, made Maurice Fréchet realize that the focus should be shifted from particular objects to sets of objects and the algebraic properties of these sets. This led him to introduce abstract spaces, and we will see how he introduced the axioms that define them.

Finally, we will investigate how the Lebesgue theory of integration was used by Frigyes Riesz, who was able to connect the theories of Fredholm, Fréchet and Lebesgue to form a general theory, and a new discipline of mathematics, now known as functional analysis.

Acknowledgements

First of all, I would like to express my sincerest gratitude to Sten Kaijser, not only for supervising this thesis, but also for being my mentor during my years at the university. If it were not for him, I would have followed my original plan and studied theoretical philosophy instead of mathematics. For preventing this, I am grateful. Secondly, I am grateful for the help of Gunnar Berg, who provided me with helpful comments and criticism to improve this thesis. Finally, I thank Olivier for interesting conversations and for help with French translations and bad grammar.

Contents

1 Introduction

2 Differential equations
   2.1 Linear ordinary differential equations
   2.2 Partial differential equations
   2.3 Spectral theory

3 Integral equations
   3.1 Origins in applications
   3.2 Potential theory and electrostatics
   3.3 The connection between Differential and Integral equations

4 Passing to the infinite
   4.1 Pre-linear algebra
   4.2 Infinite systems and determinants
   4.3 General theory

5 The concept of space in the 19th century

6 Fredholm on Integral equations

7 Hilbert on Spectral theory

8 Finalizing the concept of space
   8.1 A new way of mathematics – Abstraction prior to problem solving
   8.2 Adding structure to abstract spaces – The introduction of Topology

9 Fréchet on metric spaces
   9.1 Synthetic geometry – Euclidean geometry in function spaces

10 Lebesgue on Integration theory

11 The creation of modern Functional Analysis
   11.1 Spectral theory of compact operators

A Solution of the Dirichlet and Neumann problems by Fredholm’s method
   A.1 The interior Dirichlet problem
   A.2 The exterior Dirichlet problem
   A.3 The interior Neumann problem
   A.4 The exterior Neumann problem


1 Introduction

Functional analysis is the branch of mathematics where vector spaces and operators on them are in focus. In linear algebra, the discussion is about finite dimensional vector spaces over any field of scalars. The functions are linear mappings which can be viewed as matrices with scalar entries. If the functions are mappings from a vector space to itself, the functions are called operators and they are represented by square matrices. In functional analysis, the vector spaces are in general infinite dimensional and not all operators on them can be represented by matrices. Hence the theory becomes more complicated, but nonetheless there are many similarities.

Functional analysis has its origin in ordinary and partial differential equations, and in the beginning of the 20th century it started to form a discipline of its own via integral equations. However, for a long time there were doubts whether the mathematical theory was rich enough. Despite the efforts of many prominent mathematicians, it was not certain that there were sufficiently many functionals to support a good theory, and it was not until the 1920s that the question was finally settled with the celebrated Hahn–Banach theorem.

Seen from the modern point of view, functional analysis can be considered as a generalization of linear algebra. However, from a historical point of view, the theory of linear algebra was not developed enough to provide a basis for functional analysis at its time of creation. Thus, to study the history of functional analysis we need to investigate which concepts of mathematics needed to be completed in order to get a theory rigorous enough to support it. Those concepts turn out to be functions, limits and set theory.

For a long time, the definition of a function was due to Euler in his Introductio in Analysin Infinitorum from 1748, which read: “A function of a variable quantity is an analytic expression composed in any way whatsoever of the variable quantity and numbers or constant quantities.” For a detailed discussion about the problems concerning this definition, see [11]. For the purpose of this report, it is enough to say that the entire focus of this definition is on the function itself, and the properties of this particular function. What led to the success of functional analysis was that the focus was lifted from the function and shifted to the algebraic properties of sets of functions: the algebraization of analysis. The process of algebraization led mathematicians to study sets of functions where the functions are nothing more than abstract points in the set.

At the same time as the theory is very concrete and applicable to physical problems, it can be presented in a very abstract way. Some proofs and results are significantly simplified by introducing the axiom of choice, Zorn’s lemma or the Baire category theorem, some of the most abstract concepts in set theory. The main theorems are

1. The Hahn–Banach theorem by Hans Hahn and Stefan Banach, which states that there are sufficiently many continuous functionals on every normed space to make the theory of dual spaces and adjoint operators interesting

2. The uniform boundedness principle or the Banach–Steinhaus theorem by Banach and Hugo Steinhaus, which states that for any family of continuous linear operators on a Banach space, pointwise boundedness is equivalent to boundedness in the operator norm

3. The open mapping theorem or the Banach–Schauder theorem by Banach and Juliusz Paweł Schauder, which states that every surjective continuous linear operator between two Banach spaces is an open mapping


2 Differential equations

2.1 Linear ordinary differential equations

Due to the unclear notion of a function at the end of the 18th and the beginning of the 19th century, one thought of a function in the same way as we today would think of an analytic function. That is, it was assumed that around each point $x_0$, the function was equal to a power series in $x - x_0$. Taking derivatives of this function amounted to taking derivatives of the terms in the series expansion. In general, convergence was not considered. The common recipe for solving a differential equation

$$y^{(n)} = F(x, y, y', y'', \ldots, y^{(n-1)}) \tag{1}$$

would then be to substitute the power series $y = \sum_{k=0}^{\infty} c_k (x - x_0)^k$ and its termwise derivatives into (1). Identifying the series on both sides would then determine $c_k$, for $k \geq n$, as a function of $c_0, c_1, \ldots, c_{k-1}$.

The usefulness of this method was restricted to rather simple differential equations such as the linear equation $y' = a(x)y + b(x)$, for which the solution had been known since the 17th century. It was not until after 1760 that a general study of ordinary linear differential equations of arbitrary order began. [5]
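The recipe can be made concrete for the linear equation $y' = a(x)y + b(x)$. The following is a minimal sketch and not a historical reconstruction: it assumes $x_0 = 0$ and that $a$ and $b$ are given by their power-series coefficients, and the function name `series_coefficients` is hypothetical. Identifying the coefficient of $x^k$ on both sides gives $(k+1)c_{k+1} = \sum_{j=0}^{k} a_j c_{k-j} + b_k$.

```python
from fractions import Fraction


def series_coefficients(a, b, c0, n_terms):
    """Coefficients c_0..c_{n_terms-1} of a formal power-series solution
    y = sum c_k x^k of y' = a(x) y + b(x), expanded around x0 = 0.

    a, b: lists of power-series coefficients (missing entries count as 0).
    """
    def coeff(seq, k):
        return Fraction(seq[k]) if k < len(seq) else Fraction(0)

    c = [Fraction(c0)]
    for k in range(n_terms - 1):
        # Coefficient of x^k in a(x) y(x) is a Cauchy product.
        conv = sum(coeff(a, j) * c[k - j] for j in range(k + 1))
        # Coefficient of x^k in y' is (k + 1) c_{k+1}.
        c.append((conv + coeff(b, k)) / (k + 1))
    return c


# y' = y with y(0) = 1: the coefficients are 1/k!, the series of exp(x).
print(series_coefficients([1], [], 1, 5))
```

As in the historical method, the sketch is purely formal: it produces coefficients and says nothing about convergence.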

2.2 Partial differential equations

In the 18th century the development was triggered by physical problems, and one of the best examples of this fact is the theory of partial differential equations. In 1747, Jean le Rond d’Alembert (1717 – 1783) published a paper which proposed a solution to the vibrating string problem. Since the displacement of any point on the string depends on both time and position, a function describing the shape of the string must depend on two variables, $y = f(x, t)$. d’Alembert considered the string to be composed of infinitely many small parts, each with infinitely small mass, and used Newton’s laws of motion to derive a partial differential equation for the shape of the string, now called the wave equation,

$$\frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2}, \tag{2}$$

where $c$ is a known function of $x$, constant if the mass of the string is homogeneous. d’Alembert considered the special case when $c^2 = 1$, and by the change of variables $X = x - t$, $Y = x + t$ he reduced (2) to

$$\frac{\partial^2 y}{\partial X \partial Y} = 0. \tag{3}$$

From (3) d’Alembert concluded that the solution of (2) was $y(x, t) = f(x - t) + g(x + t)$, where $f$ and $g$ are arbitrary twice differentiable functions. [11]
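That every function of the form $f(x - t) + g(x + t)$ satisfies (2) with $c^2 = 1$ can be checked numerically. The sketch below is only an illustration under assumptions of our own: the helper names `dalembert` and `wave_residual` are hypothetical, the derivatives are replaced by central differences, and the two functions chosen for $f$ and $g$ are arbitrary smooth examples.

```python
import math


def dalembert(f, g):
    """Return y(x, t) = f(x - t) + g(x + t)."""
    return lambda x, t: f(x - t) + g(x + t)


def wave_residual(y, x, t, h=1e-3):
    """Central-difference estimate of y_tt - y_xx at (x, t), i.e. c^2 = 1."""
    y_tt = (y(x, t + h) - 2 * y(x, t) + y(x, t - h)) / h**2
    y_xx = (y(x + h, t) - 2 * y(x, t) + y(x - h, t)) / h**2
    return y_tt - y_xx


# Two arbitrary smooth choices of f and g (illustrative only).
y = dalembert(math.sin, lambda s: math.exp(-s * s))
print(abs(wave_residual(y, 0.3, 0.7)))  # small, up to rounding error
```

A function that is not of this form, such as $x^2 t$, gives a residual that is clearly different from zero.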

This caused quite a controversy because of the prevailing conception that a function had to be something very concrete, and not “arbitrary”. Thus already at this point we see the need for an abstraction of the concept of a function to a level where the function itself is not important, but rather the collection of functions with abstract properties, in this case the property of being twice differentiable.

There was also another approach to the vibrating string problem, which began already in 1715 with Brook Taylor (1685 – 1731) but took almost 150 years to mature. By direct arguments, without using (2), he concluded that when $c$ is constant, the functions

$$u_n(x, t) = \sin\frac{n\pi x}{a} \cos\frac{n\pi t}{a}, \quad n \geq 1 \tag{4}$$

represented the vibrations of the string, where the value of $n$ decided the tone (with $n = 1$ representing the fundamental tone and $n = 2, 3, \ldots$ the harmonics¹). This led Daniel Bernoulli in 1750 to propose the general solution as a series

$$u(x, t) = \sum_{n=1}^{\infty} a_n \sin\frac{n\pi x}{a} \cos\left(\frac{n\pi c}{a}(t - \beta_n)\right) \tag{5}$$

for suitable values of $a_n$ and $\beta_n$. [5]

To proceed from here we need to make a note about further progress on the notion of a function. When Leonhard Euler (1707 – 1783) studied (2), he concluded that $u(x, t)$ was determined once the two functions

$$u(x, 0) = \varphi(x) \quad \text{and} \quad \frac{\partial u}{\partial t}(x, 0) = \psi(x)$$

were prescribed. Note that these are functions of a single variable. Using this fact, Euler was able to give a geometric construction equivalent to $u(x, t)$ being explicitly given by

$$u(x, t) = \frac{1}{2}\bigl(\varphi(x - t) + \varphi(x + t)\bigr) + \frac{1}{2} \int_{-t}^{t} \psi(x - \xi)\,d\xi.$$

Now it was a well-known fact from experiments that the function $\varphi$ could look quite terrible. For example, there could be points where there are no derivatives. This forced Euler to extend the notion of a function from what he called “continuous” (analytical) to the more general notion of “mechanical”. Euler does not define explicitly what he means by a mechanical function, but it seems that it would mean piecewise twice differentiable in our terminology. [5]

From (5) Euler was led to the conclusion that any mechanical function defined on an interval $-a \leq x \leq a$ could be represented as a series

$$\frac{a_0}{2} + a_1\cos\frac{\pi x}{a} + b_1\sin\frac{\pi x}{a} + a_2\cos\frac{2\pi x}{a} + b_2\sin\frac{2\pi x}{a} + \ldots$$

where each term in the series is a continuous (analytic) function. However, Euler could not imagine that this sum of continuous functions could be anything but a continuous function. Euler’s opinions were shared by most of the mathematicians of his time, and no progress was made until the work of Joseph Fourier (1768 – 1830) on the theory of heat. [5]

2.3 Spectral theory

The work of Fourier on the theory of heat triggered not only the development of trigonometric series, which forced mathematicians to consider even more carefully what a function is and what convergence means, but it also gave birth to spectral theory, which is a central concept in functional analysis.

Fourier studied the “cooling off” problem for a solid sphere of radius $r$, which with spherical symmetry gives the partial differential equation

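$$\frac{\partial u}{\partial t} = k\left(\frac{\partial^2 u}{\partial x^2} + \frac{2}{x}\frac{\partial u}{\partial x}\right), \tag{6}$$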

¹ The fact that if a vibrating string is cut in half, one will hear a tone which is one octave higher, was already known by Pythagoras. [23]

with the boundary condition that $u(x, t)$ remains finite when $x$ tends to 0 and satisfies the relation

$$\frac{\partial u}{\partial x} + hu = 0 \quad \text{for } x = r \text{ and for all } t, \tag{7}$$

where $h$ and $k$ are constants. Using the method of separation of variables, Fourier proved that the function

$$u(x, t) = \frac{1}{x} \exp(-k\lambda^2 t) \sin(\lambda x) \tag{8}$$

is a solution, where the parameter $\lambda$ is a solution of the transcendental equation

$$\frac{\lambda r}{\tan \lambda r} = 1 - hr \tag{9}$$

and that equation (9) has infinitely many real roots $\lambda_n$ tending to $+\infty$. In order to obtain a solution to (6) with boundary condition (7) such that $u(x, 0) = f(x)$ for a given function $f(x)$, he expressed $xf(x)$ as a series $\sum_{n=1}^{\infty} c_n \sin(\lambda_n x)$ and proved the relations²

$$\int_0^r \sin(\lambda_n x)\sin(\lambda_m x)\,dx = 0 \quad \text{for } n \neq m \tag{10}$$

from which he deduced that

$$c_n = \frac{\int_0^r x f(x) \sin(\lambda_n x)\,dx}{\int_0^r \sin^2(\lambda_n x)\,dx}. \tag{11}$$

As always with Fourier, no rigorous proofs or justifications were given, not even that the series actually converged to $xf(x)$. [5]
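The roots of (9) and the relations (10) are easy to examine numerically. The following sketch is an illustration only and makes assumptions of its own: it takes $hr > 0$, rewrites (9) as $F(z) = z\cos z - (1 - hr)\sin z = 0$ with $z = \lambda r$, and uses bisection and Simpson's rule; the function names `fourier_roots` and `inner` are hypothetical.

```python
import math


def fourier_roots(h, r, count):
    """First `count` positive roots lambda_n of (lambda r)/tan(lambda r) = 1 - h r.

    Assumes h * r > 0, so that F(z) = z cos z - (1 - h r) sin z changes sign
    on (0, pi) and on every (k pi, (k + 1) pi); each root is found by bisection.
    """
    kappa = 1.0 - h * r

    def F(z):
        return z * math.cos(z) - kappa * math.sin(z)

    roots = []
    for k in range(count):
        lo, hi = (k * math.pi if k else 1e-9), (k + 1) * math.pi
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if F(lo) * F(mid) <= 0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi) / r)
    return roots


def inner(lam_n, lam_m, r, steps=2000):
    """Composite Simpson approximation of the integral in (10)."""
    dx = r / steps
    total = 0.0
    for i in range(steps + 1):
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        x = i * dx
        total += w * math.sin(lam_n * x) * math.sin(lam_m * x)
    return total * dx / 3


lams = fourier_roots(h=1.0, r=1.0, count=3)
print(lams)                          # for h r = 1 these are odd multiples of pi/2
print(inner(lams[0], lams[1], 1.0))  # close to 0
```

The case $hr = 1$ is chosen because (9) then reduces to $\cos(\lambda r) = 0$, so the roots are known in closed form and the output can be checked by hand.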

The ideas and results of Fourier in the 1820’s were further developed and put on a more rigorous basis by Siméon Denis Poisson (1781 – 1840), which in turn led Charles-François Sturm (1803 – 1855), in 1836, and Joseph Liouville (1809 – 1882), in 1837, to develop a general theory which included all of Fourier’s work.

They began with the study of the second order differential equation

$$y'' - q(x)y + \lambda y = 0, \tag{12}$$

where $q$ is a real-valued continuous function, now referring to “continuous” in the usual way³, on a closed interval $[a, b]$ and $\lambda$ is a complex parameter. They first began to study the boundary problem

$$\begin{aligned} y(a)\cos(\alpha) - y'(a)\sin(\alpha) &= 0 \\ y(b)\cos(\beta) - y'(b)\sin(\beta) &= 0, \end{aligned} \tag{13}$$

where $\alpha$ and $\beta$ are positive constants. Their problem was then to decide for which $\lambda$ the problem has non-trivial solutions. Stated in our language, the problem is to find the eigenvalues⁴ $\lambda$ and the corresponding eigenfunctions.

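They continued with a remark which had essentially already been made by Poisson. If $\lambda$ and $\mu$ are two different eigenvalues with eigenfunctions $u$ and $v$, then the relations

$$\begin{aligned} u'' - qu + \lambda u &= 0 \\ v'' - qv + \mu v &= 0 \end{aligned}$$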

² This is what we now call an orthogonality relation, a word which was never used by Fourier. [5]
³ The definition used for continuity at this time was due to Cauchy, 1821. [11]
⁴ From here on I will use the terms eigenvalue and eigenfunction despite the fact that we have not actually proved their existence yet, and that those words were not used until Hilbert, 1904.

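give that

$$u''v - v''u + (\lambda - \mu)uv = 0,$$

which by integrating both sides gives

$$\begin{aligned}
\int_a^b \bigl(u''v - v''u + (\lambda - \mu)uv\bigr)\,dx
&= \int_a^b (u''v - v''u)\,dx + \int_a^b (\lambda - \mu)uv\,dx \\
&= [u'v - v'u]_a^b + (\lambda - \mu)\int_a^b u(x)v(x)\,dx \\
&= (\lambda - \mu)\int_a^b u(x)v(x)\,dx \\
&= 0
\end{aligned} \tag{14}$$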

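because of (13). An immediate consequence of this is that all eigenvalues are real. If $\lambda$ were an eigenvalue that is not real, we could replace $\mu$ by $\bar{\lambda}$ and $v$ by $\bar{u}$ in (14), and since $\lambda \neq \bar{\lambda}$ we would get $\int_a^b |u(x)|^2\,dx = 0$, which would imply that $u \equiv 0$ on $[a, b]$, a contradiction to the assumptions.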

In a rather long and cumbersome paper from 1836, Sturm proves the existence of eigenvalues. We will also prove this fact, but in a reformulation following [5]. We begin by studying the equation $y'' + q(x)y = 0$ and, in the usual way, write it as a system of first order equations by introducing $y_1 = y$, $y_2 = y'$, which gives the system

$$\begin{aligned} y_1' &= y_2 \\ y_2' &= -q(x)y_1. \end{aligned} \tag{15}$$

Now we introduce two new functions $r$ and $\theta$ such that $y_1 = r\sin(\theta)$ and $y_2 = r\cos(\theta)$, which turns (15) into

$$\begin{aligned} r' &= (1 - q(x))\,r\sin(\theta)\cos(\theta) \\ \theta' &= \cos^2(\theta) + q(x)\sin^2(\theta). \end{aligned} \tag{16}$$

If we apply this change of variables to the equation (12) we get the equation

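$$\theta' = \cos^2(\theta) + (\lambda - q(x))\sin^2(\theta), \tag{17}$$

and if we assume that a solution $\omega(x, \lambda)$ exists such that $\omega(a, \lambda) = \alpha$, then the eigenvalues $\lambda$ are the solutions of the equations

$$\omega(b, \lambda) = \beta + n\pi \quad \text{for } n \in \mathbb{Z}. \tag{18}$$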

One of Sturm’s comparison theorems, which is found in the same paper, then shows that for each $x \in \,]a, b[$ the function $\lambda \mapsto \omega(x, \lambda)$ is strictly increasing, and it follows from (17) that if $\omega(x, \lambda) = k\pi$ for some integer $k$ then $\frac{\partial\omega}{\partial x}(x, \lambda) = 1$. [5]

These are the results that Sturm needed for his conclusion that equation (18) has one and only one solution $\lambda_n$ for each $n \geq 1$ and no solutions for $n \leq 0$, and finally that the corresponding eigenfunctions $u_n$ have exactly $n$ zeroes in the interval $[a, b]$. One year later, Liouville continued Sturm’s work and gave generalizations of the works of Fourier and Poisson. One of his main results for our purposes is that in (14) he replaced $\lambda$ and $\mu$ by $\lambda_n$ and $\lambda_m$, from which it follows that

$$\int_a^b u_n(x)u_m(x)\,dx = 0 \quad \text{for } n \neq m.$$

This is another orthogonality result for the eigenfunctions, but neither Liouville nor Sturm used this word. These results are now known as Sturm–Liouville theory for certain types of partial differential equations.
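The characterization of the eigenvalues through (17) and (18) also lends itself to numerical experiment. The sketch below is an illustration under our own assumptions and is not Sturm's argument: it integrates (17) with the classical Runge–Kutta method and solves (18) by bisection, relying on the monotonicity in $\lambda$ stated above. The function names are hypothetical, and the test case $q = 0$ on $[0, \pi]$ with $\alpha = \beta = 0$ (the limiting case $y(0) = y(\pi) = 0$) is chosen only because its eigenvalues $1, 4, 9, \ldots$ are known in closed form.

```python
import math


def pruefer_angle(lam, q, a, b, alpha, steps=1000):
    """Integrate theta' = cos^2(theta) + (lam - q(x)) sin^2(theta) from a to b
    with theta(a) = alpha, using the classical fourth-order Runge-Kutta method."""
    def rhs(x, th):
        return math.cos(th) ** 2 + (lam - q(x)) * math.sin(th) ** 2

    h = (b - a) / steps
    x, th = a, alpha
    for _ in range(steps):
        k1 = rhs(x, th)
        k2 = rhs(x + h / 2, th + h * k1 / 2)
        k3 = rhs(x + h / 2, th + h * k2 / 2)
        k4 = rhs(x + h, th + h * k3)
        th += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return th


def eigenvalue(n, q, a, b, alpha, beta, lo, hi, tol=1e-9):
    """Solve omega(b, lam) = beta + n*pi for lam by bisection on [lo, hi],
    relying on omega(b, .) being increasing in lam."""
    target = beta + n * math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pruefer_angle(mid, q, a, b, alpha) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Test case with a known answer: q = 0 on [0, pi] with y(0) = y(pi) = 0.
q0 = lambda x: 0.0
print([eigenvalue(n, q0, 0.0, math.pi, 0.0, 0.0, 0.0, 50.0) for n in (1, 2, 3)])
```

The bracket $[0, 50]$ is an assumption that happens to contain the first few eigenvalues of this test case; for another $q$ it would have to be chosen accordingly.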


3 Integral equations

The theory of integral equations provided mathematicians with three essential concepts which were of great importance not only to the development of functional analysis, but also to a richer and more general theory of other areas of mathematics. Some of them were perhaps quite unexpected, like algebra and group theory. These are:

1. Solution to the Dirichlet problem, and thus connecting the theory of differential and integral equations

2. Passing from finite to infinite systems of equations

3. Developing the notion of infinite spaces and function spaces

We will try to track these ideas, which lead us a few steps closer to functional analysis and to a better understanding of mathematics in general.

3.1 Origins in applications

As for partial differential equations, the theory of integral equations has its cradle in applications, mainly astronomy and electrostatics. The study of planetary motions led mathematicians to successions of equations of the form

$$\begin{aligned}
y'_{1,i} &= f_{1,i}(x, a_1, \ldots, a_n) \\
y'_{2,i} &= f_{2,i}(x, y_{1,1}, \ldots, y_{1,n}) \\
y'_{3,i} &= f_{3,i}(x, y_{1,1}, \ldots, y_{1,n}, y_{2,1}, \ldots, y_{2,n}) \\
&\;\;\vdots
\end{aligned} \qquad \text{for } i = 1, 2, \ldots, n$$

where all the right-hand sides are known functions. This problem was hence reduced to quadratures. No attempts to justify the procedure mathematically were made since it gave a satisfactory answer to observations. Yet it is an example of an iterative process for solving large systems of equations or successive equations.

The question of whether a general differential equation has solutions, even though explicit solutions cannot be given, was raised and answered by Augustin Louis Cauchy (1789 – 1857), who proved some existence theorems on differential equations. In a paper published in 1835 he considered a method like the one outlined above for the partial differential equation

$$\frac{\partial U}{\partial t} = \sum_{i=1}^{p} A_i(t, x_1, \ldots, x_p)\frac{\partial U}{\partial x_i}, \tag{1}$$

where the problem was to find a solution which reduces to a given function $u(x_1, \ldots, x_p)$ for $t = 0$. Cauchy transformed (1), by considering $x_1, \ldots, x_p$ as parameters, to

$$U(t, x_1, \ldots, x_p) = u(x_1, \ldots, x_p) + \int_0^t \left(\sum_{i=1}^{p} A_i(s, x_1, \ldots, x_p)\frac{\partial U}{\partial x_i}\right)ds \tag{2}$$

which he was able to solve using the method of successive approximations. He started with $U_0 = u$ and defined

$$U_n(t, x_1, \ldots, x_p) = u(x_1, \ldots, x_p) + \int_0^t \left(\sum_{i=1}^{p} A_i(s, x_1, \ldots, x_p)\frac{\partial U_{n-1}}{\partial x_i}\right)ds \tag{3}$$

which he could prove converged to a solution when the $A_i$ are analytic functions. [5]

In the previously mentioned paper from 1837, Liouville independently used a similar method for the differential equation $y'' = f(x)y$ on $[a, b]$ with boundary condition $y'(a) - hy(a) = 0$. He started his recursive definition with $y_0(x) = 1 + h(x - a)$ and considered the series

$$y = y_0 + y_1 + \ldots + y_n + \ldots$$

where $y_n$ is determined by

$$y_n(x) = \int_a^x dt \int_a^t f(s)y_{n-1}(s)\,ds.$$

On the question of convergence, Liouville proved that $|y_n(x)| \leq c^n (x - a)^{2n}/(2n)!$, which implies that the series converges for every $x$. Liouville then continued, without further justification, by assuming that $y(x)$ is twice differentiable as the limit of twice differentiable functions. This was a common misconception among mathematicians at the time, since the notion of uniform convergence did not yet exist.
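Liouville's recursion can be carried out exactly when $f$ is a polynomial. The following is a minimal sketch under assumptions of our own: it takes $a = 0$, represents polynomials as coefficient lists (lowest degree first), and uses hypothetical helper names. For $f = 1$ and $h = 0$ the terms should be $x^{2n}/(2n)!$, the series of $\cosh x$.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a_i in enumerate(p):
        for j, b_j in enumerate(q):
            out[i + j] += a_i * b_j
    return out


def poly_int(p):
    """Antiderivative vanishing at 0 (coefficient lists, lowest degree first)."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(p)]


def liouville_terms(f, h, n_terms):
    """Terms y_0, ..., y_{n_terms-1} of Liouville's series for y'' = f(x) y
    on an interval starting at a = 0, with y_0(x) = 1 + h x."""
    f = [Fraction(c) for c in f]
    terms = [[Fraction(1), Fraction(h)]]
    for _ in range(n_terms - 1):
        # y_n is the double integral from 0 of f * y_{n-1}.
        terms.append(poly_int(poly_int(poly_mul(f, terms[-1]))))
    return terms


# f = 1, h = 0: y'' = y, y(0) = 1, y'(0) = 0, so the terms are x^(2n)/(2n)!
for term in liouville_terms([1], 0, 3):
    print(term)
```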

An interesting remark is that Liouville gave another definition of $y$ as

$$y = y_0 + \int_a^x dt \int_a^t f(s)y(s)\,ds$$

which he could have written, but did not, as

$$y = y_0 + \int_a^x (x - t)f(t)y(t)\,dt$$

thus giving him the first example of what is now called a Volterra integral equation of the second kind. [5]
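The passage from the iterated integral to the single integral is an interchange of the order of integration over the triangle $a \leq s \leq t \leq x$. The short derivation below is added for the reader's convenience and is not taken from Liouville's paper:

```latex
\begin{align*}
\int_a^x dt \int_a^t f(s)\,y(s)\,ds
  &= \int_a^x \left( \int_s^x dt \right) f(s)\,y(s)\,ds \\
  &= \int_a^x (x - s)\,f(s)\,y(s)\,ds ,
\end{align*}
```

and renaming the integration variable $s$ to $t$ gives the kernel $(x - t)f(t)$.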

3.2 Potential theory and electrostatics

When Daniel Bernoulli (1700 – 1782) and Adrien-Marie Legendre (1752 – 1833) studied Newtonian attraction they arrived at expressions such as

$$\Omega(x, y, z) = \mu \iiint_V \frac{\rho(\xi, \eta, \zeta)\,d\xi\,d\eta\,d\zeta}{\sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}} \tag{4}$$

for a point with a certain mass under the influence of a solid $V$ with density $\rho$. Pierre-Simon Laplace (1749 – 1827) went on to show that this rather intimidating function satisfies the simple relation

$$\Delta\Omega \overset{\text{def}}{=} \frac{\partial^2\Omega}{\partial x^2} + \frac{\partial^2\Omega}{\partial y^2} + \frac{\partial^2\Omega}{\partial z^2} = 0, \tag{5}$$

for $(x, y, z) \notin V$, which has since played an important role in governing stationary phenomena in, for example, hydrostatics, the theory of heat and electrostatics.

The theory of partial differential equations had until now been in an embryonic stage and had not gone through such drastic development as, for example, the theory of ordinary differential equations. This was about to change when George Green (1793 – 1841) in 1828 published a paper on partial differential equations with general boundary conditions.


The concern of the paper was electrostatics and what he called potential functions, which were not only of the type (4), but also

$$\Omega(M) = \iint_\Sigma \frac{\rho(P)}{MP}\,d\sigma(P) \tag{6}$$

where $\Sigma$ is a smooth surface, $\rho$ a continuous function on $\Sigma$ and $d\sigma$ the element of area on $\Sigma$. These would later be called simple layer potentials. The motivation for his results was based on experiments showing that on conductors, the electric charges are concentrated on the surface. He discovered his famous theorem when he studied in which potentials the surface density function $\rho$ would define the relation

$$\iiint_V (u\Delta v - v\Delta u)\,d\omega = \iint_\Sigma \left(v\frac{\partial u}{\partial n} - u\frac{\partial v}{\partial n}\right)d\sigma, \tag{7}$$

where $\Sigma$ is a smooth surface limiting a bounded volume $V$, $u$ and $v$ are twice differentiable functions in a neighborhood of $V$ and $\frac{\partial u}{\partial n}$, $\frac{\partial v}{\partial n}$ are the derivatives along the exterior normal of $\Sigma$. Green considered a function $u$ with the following two properties,

1. $u$ is twice differentiable at all points different from some point $M$ in $V$

2. $u(P) - (1/MP)$ is bounded when $P \to M$,

to which he applied (7) on $V$ with a small ball centered at $M$ removed. By letting the radius of this small ball tend to 0 he obtained the formula

$$4\pi v(M) + \iiint_V (u\Delta v - v\Delta u)\,d\omega = \iint_\Sigma \left(v\frac{\partial u}{\partial n} - u\frac{\partial v}{\partial n}\right)d\sigma, \tag{8}$$

and finally by taking $u(P) = 1/MP$ he obtained

$$4\pi v(M) = \iint_\Sigma \left(v\frac{\partial(1/r)}{\partial n} - \frac{1}{r}\frac{\partial v}{\partial n}\right)d\sigma, \quad \text{with } r(P) = MP. \tag{9}$$

This is an integral formula for solving the Laplace equation, $\Delta v = 0$, when $v$ and $\frac{\partial v}{\partial n}$ are known on $\Sigma$. Inspired by a paper published by Poisson in 1820, Green realized that he could generalize (8) to a general domain $V$ by replacing $u$ with a function $G(M, P)$ with the following properties:

1. $G$ is twice differentiable in $V \times V$ when $P \neq M$ and $\frac{\partial G}{\partial n}$ exists on $\Sigma$

2. $G(M, P) - 1/MP$ is bounded when $P \to M$

3. $G(M, P) = 0$ when $M$ is in $V$ and $P$ is on $\Sigma$

4. When $M$ is fixed in $V$, then $G$ as a function of $P$ satisfies the Laplace equation in $V \setminus \{M\}$

Green could not formally prove the existence of such a function, but the existence was well motivated by experiments. This function is now called the Green’s function in his honour and is an essential part of almost all theory of differential equations.


3.3 The connection between Differential and Integral equations

After the papers by Green on electrostatics that provided a theory of partial differential equations with general boundary conditions, the embryo of the theory of partial differential equations awoke and started to grow rapidly. Carl Friedrich Gauss (1777 – 1855) had been interested in the Laplace equation very early, both in two and three variables, in connection with his work on complex numbers and astronomy, and already in 1813 he published some special cases of the Green formula (7) and used the word “potential” for the function (4). [5]

Gauss’ work on potential theory led him to a fundamental result. He studied potentials of the type (6) with $\rho \geq 0$, together with a function $U$ continuous on $\Sigma$, and showed that if

$$\iint_\Sigma (\Omega - 2U)\rho\,d\sigma$$

is minimal with respect to all possible choices of $\rho$, then $\Omega - U$ is constant on $\Sigma$. By adding a suitable constant to $\Omega$, Gauss was able to solve the problem of finding a function $u$ such that

1. $u$ is harmonic in $V$

2. $u$ is continuous in $\bar{V} = V \cup \Sigma$

3. $u = U$ on $\Sigma$ for a given function $U$

This problem was later studied by William Thomson (1824 – 1907), later known as Lord Kelvin, around 1847 and by Gustav Lejeune Dirichlet (1805 – 1859) at about the same time, and it bears the name of Dirichlet’s problem due to Bernhard Riemann (1826 – 1866). The solution of this problem also implies the existence of a Green’s function: consider the function $u(M, P)$, harmonic in $V$ as a function of $P$, which takes the values $-1/MP$ on $\Sigma$. Then the Green’s function is $G(M, P) = u(M, P) + (1/MP)$, provided that $\frac{\partial G}{\partial n}$ exists and is continuous on $\Sigma$.

Thomson and Dirichlet used a technique based on methods of the calculus of variations, and it was later picked up and used by Riemann in his legendary 1851 doctoral thesis on holomorphic functions and Riemann surfaces. It provided him with the necessary tools for proving existence of solutions of differential equations based on existence theorems for these harmonic functions. This thesis gained great attention and was studied in detail by many mathematicians, and they found out that the results rested on a few unproven facts taken for granted by both Dirichlet and Riemann. In the early 1870’s, these facts were proven wrong in such a way that the very foundations of the calculus of variations were shaken. Karl Weierstrass (1815 – 1897) undertook the task of saving the reputation of the calculus of variations by basically reconsidering everything and putting it back on solid ground, thus saving many results based on those methods. However, one of the problems that escaped his efforts was the Dirichlet problem. It took another 30 years until David Hilbert (1862 – 1943), in 1899, was finally able to justify the use of the methods used by Dirichlet and Riemann. [5]

In 1877, Carl Neumann (1832 – 1925) published a paper on the Dirichlet problem and boundary value problems for the Laplace equation under certain conditions, where he introduced what is now known as the Fredholm integral equation of the second kind. This paper is rather unclear and contains a logical gap which was not closed until Henri Lebesgue (1875 – 1941) discovered and corrected it in 1937. To tie together the results on general linear partial differential equations, Laplace equations and Dirichlet problems with integral equations, we will instead follow [5] and turn our attention to a paper by Hermann Amandus Schwarz (1843 – 1921) on the vibrating membrane problem, published in 1885.

By the same principles as for the vibrating string problem (compare section 2.2) one can deduce that if $z = u(x, y, t)$ is the equation of the surface at time $t$, then $u$ satisfies

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial^2 u}{\partial t^2} \tag{10}$$

for suitable units of time and length, and small vibrations. If one looks for solutions of the form $u(x, y, t) = v(x, y)w(t)$ one finds a solution for $v$ by solving¹

$$\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \lambda v = 0, \tag{11}$$

for a suitable constant $\lambda$. This equation (11) was successfully studied by Heinrich Weber (1842 – 1913) in 1869, who proved interesting eigenvalue properties and orthogonality relations, which implied basically the same properties as for the vibrating string problem. The problem with Weber’s solution is that it used methods of the calculus of variations which were considered rather suspicious by Weierstrass among others. Hence his results were not fully accepted until Schwarz in 1885 published a long paper on minimal surfaces which used entirely new methods to obtain the same, and even more general, results.

Schwarz considered a type of equation slightly more general than (11), namely

$$\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \lambda^2 pv = 0 \tag{12}$$

in a domain $D$ with a continuous function $p > 0$. His topic of interest was not the eigenvalues $\lambda^2$ of (12), but a Dirichlet problem for the equation

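$$\Delta w + \xi pw = 0 \tag{13}$$

depending on the parameter $\xi$, and restricted to the case where $w = 1$ on the boundary $\partial D$ of $D$. He expressed the solution as a power series in $\xi$,

$$w = w_0 + \xi w_1 + \xi^2 w_2 + \ldots, \tag{14}$$

where he took $w_0 = 1$ and $w_n$ such that $w_n = 0$ on $\partial D$ for $n \geq 1$. Substituting (14) into (13) and comparing powers of $\xi$, the $w_n$ are defined inductively by

$$\Delta w_n + pw_{n-1} = 0 \quad \text{for } n \geq 1. \tag{15}$$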

In order to continue he assumed that there exists a Green’s function², $G(M, P)$, for the domain $D$. This would imply that for any function $f$, continuous in $D$, the equation

$$\Delta w + f = 0 \tag{16}$$

has a unique solution vanishing on $\partial D$, and also that the solution is given by

$$w(M) = \frac{1}{2\pi} \iint_D f(P)G(M, P)\,dx\,dy. \tag{17}$$

This means that the $w_n$ in equation (15) are explicitly given by

$$w_n(M) = \frac{1}{2\pi} \iint_D p(P)w_{n-1}(P)G(M, P)\,d\omega. \tag{18}$$
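The mechanism of (16) to (18) can be tried out in a one-dimensional analogue. This is an assumption made purely for illustration and is not Schwarz's two-dimensional setting: on the interval $(0, 1)$ the problem $-w'' = f$, $w(0) = w(1) = 0$ has the Green's function $G(x, s) = x(1 - s)$ for $x \leq s$ and $s(1 - x)$ for $s \leq x$, with no factor $1/2\pi$. The helper names are hypothetical.

```python
import math


def green_1d(x, s):
    """Green's function of -w'' = f on (0, 1) with w(0) = w(1) = 0."""
    return x * (1 - s) if x <= s else s * (1 - x)


def apply_green(values, n):
    """Approximate w(x_i) = integral of G(x_i, s) values(s) ds on the grid
    x_i = i / n using the trapezoidal rule (the integrand vanishes at s = 0, 1)."""
    h = 1.0 / n
    return [
        h * sum(green_1d(i * h, j * h) * values[j] for j in range(1, n))
        for i in range(n + 1)
    ]


n = 100
w = [1.0] * (n + 1)          # w_0 = 1
w = apply_green(w, n)        # w_1 solves -w'' = 1 with zero boundary values
print(w[n // 2])             # compare with x (1 - x) / 2 at x = 1/2

# Schwarz-type iteration w_n = G[w_{n-1}] with p = 1: the ratio of successive
# iterates approaches the smallest eigenvalue of -w'' = xi w, which is pi^2.
for _ in range(10):
    w_next = apply_green(w, n)
    ratio = w[n // 2] / w_next[n // 2]
    w = w_next
print(ratio, math.pi ** 2)
```

In this analogue the iteration isolates the smallest eigenvalue $\pi^2$ of $-w'' = \xi w$ with zero boundary values, which mirrors the way Schwarz's successive functions $w_n$ lead to the smallest eigenvalue of the membrane problem.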

¹ This equation is also called Helmholtz’s equation.
² If there exists a Green’s function, then it is unique. Schwarz himself had already proven the existence of Green’s functions in extensive cases, which motivated his decision to assume that one existed.

The only thing left now is to prove the convergence of (14) for small enough $\xi$, which he did by an ingenious use of the inequality that bears his name. The proof of this fact brought him one step further than those who had studied this problem before him. First he proved that $w$ is a solution of (13) and is equal to 1 on $\partial D$. Second he also proved that when $\xi = 1/\sqrt{c}$, for a well-defined constant $c$, the terms in (14) tend uniformly³ to a limit $U$ which is not identically zero in $D$, but vanishes on $\partial D$, and is a solution to

$$\Delta w + (1/c)pw = 0.$$

This means that he had proven the existence of the smallest eigenvalue $\lambda^2 = 1/c$ of (13), together with a corresponding eigenfunction.

What is interesting from our perspective is that if we write $w = w_0 + \xi v$ and solve equation (13) using the explicit formula (17) we get as a solution for $v$

$$v(M) = g(M) + \frac{\xi}{2\pi} \iint_D p(P)G(M, P)v(P)\,d\omega \tag{19}$$

with

$$g(M) = \frac{1}{2\pi} \iint_D G(M, P)p(P)\,d\omega,$$

where (19) would later be known as a Fredholm integral equation of the second kind.

³ The concept of uniform convergence was given by Weierstrass in a series of lectures in the beginning of the 1850’s. [11]

4 Passing to the infinite

4.1 Pre-linear algebra

Mathematicians have been concerned with solving systems of equations in two or three variables for thousands of years, two variables representing a problem in the plane and three variables a problem in space. Equations in more than three variables were regarded with suspicion because they would no longer represent a “real” problem. As an example, we can take Heron’s formula, dating back over 2000 years. Given a triangle with sides $a$, $b$ and $c$, Heron’s formula gives the area of this triangle as $A = \sqrt{s(s - a)(s - b)(s - c)}$ where $s = \frac{a + b + c}{2}$. This formula raised suspicion among philosophers since it involves the multiplication of four numbers. One number alone represents a distance, two distances multiplied give an area and a distance multiplied by an area gives a volume, but how does one represent the multiplication of two areas? [10]
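For the modern reader the formula is unproblematic, as a short computation shows. The sketch below is only an illustration; the function name `heron_area` is our own.

```python
import math


def heron_area(a, b, c):
    """Area of a triangle with side lengths a, b, c via Heron's formula."""
    s = (a + b + c) / 2          # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


print(heron_area(3, 4, 5))  # right triangle with legs 3 and 4: area 6.0
```

The product under the square root, here $6 \cdot 3 \cdot 2 \cdot 1 = 36$, is exactly the "multiplication of two areas" that had no geometric interpretation.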

The introduction of the Cartesian coordinate system in the 17th century allowed mathematicians to clearly envision the geometrical representation of equations and systems of equations in one, two or three variables, and some started to consider a similar concept of geometry in any number of variables. In 1844 there could have been a major breakthrough concerning the concept of geometry generalized to any finite system of unknowns, but there was not. This was Die lineale Ausdehnungslehre, ein neuer Zweig der Mathematik by Hermann Grassmann (1809 – 1877), a German high-school teacher. In this book he gives a treatment of what we today refer to as linear algebra. The book was persistently rejected for almost a century because of the way it was written. The precise language of mathematics which we use today was not available to Grassmann and, due to its level of abstraction, the book is written in an intricate and philosophical way which made it almost unreadable. Several well-known mathematicians of the time (Möbius, Dedekind, ...) tried to read it and to realize its importance, but failed. The negative response he received made him publish a second version in 1862, Die Ausdehnungslehre: Vollständig und in strenger Form bearbeitet, but this version did not have any influence on the mathematical community either.

By the end of the 19th century all the basic theorems of linear algebra had been proven, but they were presented in unclear notation and in terms of bilinear forms instead of vectors and matrices. Thus they did not give a sufficient basis for a generalization towards functional analysis. The development was also in almost the exact reverse order to the “logical” order which is taught in linear algebra courses today: Linear systems of equations → Determinants → Bilinear and quadratic forms → Matrices → Vector spaces.

4.2 Infinite systems and determinants

When studying the first occurrences of infinite systems it is evident that there was no general theory under consideration. Almost all such problems arose while studying differential equations and representing the solutions as power series. As we have seen, the technique was to assume that a convergent power series for the solution existed and to substitute this into the differential equation, taking termwise derivatives and identifying the coefficients. This gives an infinite system of equations in infinitely many unknowns and the problem is to find a recursive relation for the coefficients.
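As a small illustration of the power series technique just described, the following Python sketch treats the equation y'' + y = 0 with y(0) = 0, y'(0) = 1. The choice of equation and the function names are assumptions made for this example only; they are not taken from the historical sources. Identifying the coefficient of x^k after termwise differentiation gives the recursion (k+2)(k+1)c_{k+2} + c_k = 0, which determines all coefficients from the first two.

```python
from fractions import Fraction
import math


def series_coefficients(n_terms):
    """Coefficients c_k of y = sum c_k x^k for y'' + y = 0, y(0)=0, y'(0)=1."""
    c = [Fraction(0), Fraction(1)]  # c_0 = y(0), c_1 = y'(0)
    for k in range(n_terms - 2):
        # Coefficient of x^k in y'' + y = 0:
        # (k+2)(k+1) c_{k+2} + c_k = 0.
        c.append(-c[k] / ((k + 2) * (k + 1)))
    return c


def evaluate(coeffs, x):
    """Evaluate the truncated power series at x."""
    return sum(float(ck) * x ** k for k, ck in enumerate(coeffs))


coeffs = series_coefficients(20)
print(coeffs[:6])                          # first few exact coefficients
print(evaluate(coeffs, 1.0), math.sin(1.0))  # truncated series vs. sin(1)
```

In this example the infinite system decouples into a two-term recursion, which is exactly the favourable situation; the systems discussed next do not admit such a recursion.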

To find the first attempts at a general theory we return to Fourier and his 1822 treatise Théorie analytique de la chaleur, in which he uses a more sophisticated method for an infinite system where no recursive formula is available. The problem was to determine the coefficients $a_n$ for the series

\[
  \sum_{n=1}^{\infty} a_n \cos((2n-1)x) \tag{1}
\]

such that the function represented by this series will be constant for $-\pi/2 < x < \pi/2$. To solve this problem, he considered a more general case, namely when a series similar to (1) equaled an analytic function with a Maclaurin expansion only containing odd powers of x,

\[
  f(x) = \sum_{n=1}^{\infty} (-1)^{n+1} A_{2n-1} \frac{x^{2n-1}}{(2n-1)!}, \tag{2}
\]

where the $A_k$:s are known, which led him to consider a series expansion only involving sin(nx) for n = 1, 2, . . .. That is, he assumed that

\[
  f(x) = \sum_{n=1}^{\infty} a_n \sin(nx) \tag{3}
\]

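and he wanted to find the coefficients $a_n$. By putting x = 0 and taking derivatives of (2) and (3), he obtained the infinite system

\[
\begin{aligned}
  A_1 &= \sum_{n=1}^{\infty} n\, a_n \\
  A_3 &= \sum_{n=1}^{\infty} n^3 a_n \\
      &\;\;\vdots \\
  A_{2k-1} &= \sum_{n=1}^{\infty} n^{2k-1} a_n \\
      &\;\;\vdots
\end{aligned}
\qquad \text{for } k = 1, 2, \dots. \tag{4}
\]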

He began by solving the system (4) for the first m equations in the first m unknowns. This gave him a set of solutions $a_n^{(m)}$ and his task was to determine $\lim_{m\to\infty} a_n^{(m)}$ for n = 1, 2, . . ..

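By lengthy and cumbersome calculations, Fourier arrived at the solution

\[
\begin{aligned}
  \tfrac{1}{2} a_1 &= A_1 - A_3\left(\frac{\pi^2}{3!} - 1\right) + A_5\left(\frac{\pi^4}{5!} - \frac{\pi^2}{3!} + 1\right) - \dots, \\
  -\tfrac{2}{2} a_2 &= A_1 - A_3\left(\frac{\pi^2}{3!} - \frac{1}{2^2}\right) + A_5\left(\frac{\pi^4}{5!} - \frac{1}{2^2}\frac{\pi^2}{3!} + \frac{1}{2^4}\right) - \dots, \\
  \tfrac{3}{2} a_3 &= A_1 - A_3\left(\frac{\pi^2}{3!} - \frac{1}{3^2}\right) + A_5\left(\frac{\pi^4}{5!} - \frac{1}{3^2}\frac{\pi^2}{3!} + \frac{1}{3^4}\right) - \dots, \\
  &\;\;\vdots
\end{aligned}
\]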

As usual, he did not give any justifications of his procedures and there are plenty of results which would require a more careful investigation. These results, however, are not that important in themselves but rather an illuminating example of the fact that passing to the infinite was necessary for the development of analysis.
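Fourier's truncation procedure for the system (4) can be tried out numerically. The sketch below is only an illustration under stated assumptions: the test function f(x) = x (so that A_1 = 1 and all other A_{2k-1} = 0), the helper names and the use of exact rational arithmetic are choices made here, not part of Fourier's text. It solves the first m equations in the first m unknowns and shows how slowly the truncated solutions approach their limits.

```python
from fractions import Fraction


def solve_exact(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination over the rationals."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick any row with a non-zero entry in this column as pivot row.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def truncated_fourier_system(A, m):
    """First m equations of (4) in the first m unknowns; A[k-1] holds A_{2k-1}."""
    matrix = [[n ** (2 * k - 1) for n in range(1, m + 1)]
              for k in range(1, m + 1)]
    return solve_exact(matrix, A[:m])


# Assumed test case f(x) = x, i.e. A_1 = 1 and A_3 = A_5 = ... = 0.
for m in (2, 5, 10, 20):
    a = truncated_fourier_system([1] + [0] * (m - 1), m)
    print(m, float(a[0]), float(a[1]))
```

For this test case the truncated system is of Vandermonde type and one finds a_1^(m) = 2m/(m+1), which tends to 2, the first sine coefficient of x on (−π, π). The convergence is slow, which gives some feeling for why the limiting process needed careful justification.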

The work done by Fourier on infinite systems went unnoticed for half a century since it was not the major concern of his papers. According to Frigyes Riesz (1880 – 1956) there was only one paper (published in 1828 by an Italian mathematician named G. Piola) on Fourier's method.


The next time this method was used was in 1870 by Theodor Kötteritzsch in a paper that had a general system of infinitely many unknowns under consideration. The advance of this paper is that, under certain conditions, he was able to solve the infinite system (4), which he pointed out is of importance when finding Fourier coefficients¹.

It took another 15 years until general infinite systems were considered again, but this time with much greater success since it triggered Henri Poincaré (1854 – 1912) to give the theory a rigorous treatment. This happened in two papers, published in 1884 by Paul Appell (1855 – 1930) and in 1886 by George William Hill² (1838 – 1914). The paper by Appell considered the problem of finding coefficients of a power series for certain elliptic and periodic functions. His technique was the same as the one used by Fourier 62 years earlier, but this time it caught the attention of Poincaré, which is the main reason for the importance of his paper. Poincaré realized the usefulness of the method, but it was unclear under which assumptions the method could be applied.

Poincaré started by considering an infinite sequence of complex numbers $a_n$ with $|a_{n+1}| > |a_n|$ and $\lim_{n\to\infty} |a_n| = \infty$, and he wanted to find a sequence $A_n$ such that

\[
  \sum_{n=1}^{\infty} A_n a_n^p = 0, \quad \text{for } p = 0, 1, \dots. \tag{5}
\]

This system is similar to the one that Appell considered, but in general it has no solutions and Poincaré started to investigate under which assumptions one can solve the system (5). By a theorem of Weierstrass there exists an entire function F which has simple zeroes precisely at the $a_n$:s, and Poincaré assumed that this function F can be written as

\[
  F(z) = \prod_{n=1}^{\infty} \left(1 - \frac{z}{a_n}\right). \tag{6}
\]

If $c_n$ is a sequence of concentric circles such that the radius $r_n$ of $c_n$ satisfies $|a_{n-1}| < r_n < |a_n|$, then Poincaré could state that the system (5) has a solution $A_n$ if

\[
  \lim_{n\to\infty} \oint_{c_n} \frac{z^p}{F(z)}\, dz = 0 \tag{7}
\]

for every p. If this requirement is fulfilled, then by (7), the system (5) has a solution and it is given by

\[
  A_i = \frac{-a_i}{\prod_{n=1,\, n \neq i}^{\infty} \left(1 - \dfrac{a_i}{a_n}\right)}, \tag{8}
\]

since $A_i$ is a residue of $(F(z))^{-1}$ at $a_i$. Unfortunately this solution is not unique. Let

\[
  S_p = \sum_{n=1}^{\infty} |A_n a_n^p|
\]

and $\lambda_p$ be a sequence such that

\[
  \sum_{p=0}^{\infty} |\lambda_p S_p| < \infty,
\]

¹ In this paper he even uses the word Fourier coefficients, but he does not mention the work by Fourier on infinite systems. [2]

² This paper was written already in 1877, but it did not reach Europe and Poincaré until 1886.


then

\[
  B_i = A_i \sum_{p=1}^{\infty} \lambda_p a_i^p
\]

will also be a solution of (5).
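Formula (8) can be checked numerically on a truncated sequence. The following sketch is an illustration only: the sequence a_n = 2^n, the truncation to N = 30 terms and the function names are assumptions made here and do not come from Poincaré's paper. The sequence was chosen because the coefficients A_n then decay very quickly, so the sums in (5) are dominated by their first few terms.

```python
def poincare_coefficients(a):
    """A_i = -a_i / prod_{n != i} (1 - a_i / a_n), formula (8), for a finite list a."""
    coeffs = []
    for i, ai in enumerate(a):
        prod = 1.0
        for n, an in enumerate(a):
            if n != i:
                prod *= 1.0 - ai / an
        coeffs.append(-ai / prod)
    return coeffs


N = 30                                     # truncation level (assumption)
a = [2.0 ** n for n in range(1, N + 1)]    # assumed example sequence a_n = 2^n
A = poincare_coefficients(a)


def moment(p):
    """Left-hand side of (5), truncated to the first N terms."""
    return sum(An * an ** p for An, an in zip(A, a))


for p in range(3):
    print(p, moment(p))   # each value should be zero up to rounding error
```

A word of caution: for a finite list of N distinct numbers the sums vanish exactly for all p ≤ N − 2 by partial fractions, whatever the numbers are. The growth condition (7) is what guarantees that this survives the passage to infinitely many terms, and the code above does not test that condition.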

One year later, Hill's paper caught Poincaré's attention. In his astronomical studies, Hill had come across the differential equation

\[
  D^2 w = \theta w \tag{9}
\]

where D denotes the differential operator $-i\,d/dt$. He supposed that in (9),

\[
  \theta = \sum_{k=-\infty}^{\infty} \theta_k \zeta^{2k} \tag{10}
\]

where $\zeta = e^{it}$, $\theta_{-k} = \theta_k$ for k = 1, 2, . . . and that there existed a solution for (9) of the form

\[
  w = \sum_{k=-\infty}^{\infty} b_k \zeta^{c+2k} \tag{11}
\]

where all $b_k$:s are constants. By substituting (10) and (11) into (9), Hill obtained the infinite system of homogeneous equations

\[
\begin{array}{l}
  \qquad\vdots \\
  \cdots + [-2]\,b_{-2} - \theta_{-1} b_{-1} - \theta_2 b_0 - \theta_3 b_1 - \theta_4 b_2 - \cdots = 0 \\
  \cdots - \theta_1 b_{-2} + [-1]\,b_{-1} - \theta_1 b_0 - \theta_2 b_1 - \theta_3 b_2 - \cdots = 0 \\
  \cdots - \theta_2 b_{-2} - \theta_1 b_{-1} + [0]\,b_0 - \theta_1 b_1 - \theta_2 b_2 - \cdots = 0 \\
  \qquad\vdots
\end{array}
\tag{12}
\]

where $[k] = (c + 2k)^2 - \theta_0$, k = 0, ±1, ±2, . . . . [2]

It seems that he was unaware of Fourier's method for solving such systems, but Hill used a similar argument by considering the first p equations and putting $b_m = 0$ for m < −p and m > p. Then by letting p → ∞ he obtained a solution of (9). [5]

Poincaré wrote regarding Hill's results that:³ ”The solution adopted by M. HILL is as original as it is bold ... . Did one have the right to set the determinant of these equations equal to zero? M. HILL ventured to do so and it was a very daring thing to do; until then an infinite number of linear equations had never been considered [sic!]; determinants of infinite order had never been studied; no one even knew how to define them, and it was not certain that it was possible to give a precise meaning to this notion. I must add, however, for sake of completeness, that M. KOTERITZSCH had touched on the subject .... But his paper was hardly known in the scientific world and in any case was not known to M. HILL. ...

But it is not enough to be daring; daring must be justified by success. M. HILL successfully avoided all the traps that surrounded him; and let no one say that in proceeding this way he exposed himself to the most glaring errors; no, if the method had not been legitimate, he would have been immediately warned, because he would have arrived at a numerical result completely different from that given by observations.”

These words were not written until 1905, but they still seem to reflect the excitement Poincaré felt when he decided to justify the method used by Hill who, after all, was an astronomer.

³ Cited from [2].


4.3 General theory

The first one to give a broad and general theory for infinite matrices and determinants was Helge von Koch (1870 – 1924), beginning in 1891 with his interest in Fuchs' equation,

\[
  P(y) = \frac{d^n y}{dx^n} + P_2(x)\frac{d^{n-1} y}{dx^{n-1}} + \dots + P_n(x)\, y = 0, \tag{13}
\]

where each $P_r(x)$ can be represented by a Laurent expansion

\[
  P_r(x) = \sum_{\lambda=-\infty}^{\infty} \alpha_{r\lambda}\, x^{\lambda} \quad \text{for } r = 2, 3, \dots, n, \tag{14}
\]

valid in some annulus A about the origin. It was already known that a general solution

\[
  y = \sum_{\lambda=-\infty}^{\infty} g_{\lambda}\, x^{\lambda+\varrho} \tag{15}
\]

existed which was convergent in A. Von Koch's problem was to find a general formula for the coefficients $g_\lambda$ and $\varrho$ in (15). His investigations led him to consider an infinite matrix of the same type as Poincaré's and, using Poincaré's results, von Koch was able to give explicit formulas for $g_\lambda$ and $\varrho$ under certain restrictive assumptions.

One year later, von Koch returned to the problem in order to weaken the restrictions in his assumptions. He began by studying the infinite array $A = (A_{ik})$ for i, k = . . . , −2, −1, 0, 1, 2, . . . and denoting

\[
  D_m = \det(A_{ik}) \quad \text{for } i, k = -m, \dots, m. \tag{16}
\]

The determinant D of A is then $\lim_{m\to\infty} D_m$ provided that the limit exists and is finite, otherwise the determinant of A is said to be divergent. The main diagonal of A is $A_{ii}$ for i = −∞, . . . , ∞. $A_{00}$ is called the origin. From these definitions it follows that there are denumerably many infinite matrices, all with the same main diagonal, but the determinant is not fixed until an origin has been chosen. Thus von Koch's task was to prove that if D exists for some choice of origin, then it will exist and be the same for all choices of origin.

To establish that the convergence of $D_m$ is independent of the choice of origin, he proved the following theorem:

Theorem 4.1. Let D be an infinite determinant. Then in order that D converge, it is sufficient that the product of the elements on the main diagonal converge absolutely, and that the sum of the elements off the diagonal also converge absolutely.

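Proof. Define $a_{ik}$ by setting $A_{ik} = \delta_{ik} + a_{ik}$ for i, k = −∞, . . . , ∞, where $\delta_{ik}$ is the Kronecker delta. Then by assumption we have that

\[
  S = \sum_{i=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} |a_{ik}| < \infty \tag{17}
\]

and hence that

\[
  \bar{P} = \prod_{i=-\infty}^{\infty} \left(1 + \sum_{k=-\infty}^{\infty} |a_{ik}|\right) < e^{S} < \infty. \tag{18}
\]

Now let

\[
  P_m = \prod_{i=-m}^{m} \left(1 + \sum_{k=-m}^{m} a_{ik}\right)
\]

and

\[
  \bar{P}_m = \prod_{i=-m}^{m} \left(1 + \sum_{k=-m}^{m} |a_{ik}|\right),
\]

from which one can deduce that

\[
  |D_{m+p} - D_m| \le \bar{P}_{m+p} - \bar{P}_m. \tag{19}
\]

We now have that since (18) is convergent, so is $\bar{P}_m$, which implies the convergence of $D_m$ by (19). Any determinant where $a_{ik}$ satisfies (17) is said to be in normal form.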
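The estimate (19) can be observed on a concrete array. The sketch below is a hedged illustration: the entries a_ik = 2^(−|i|−|k|), the function names and the use of exact rational arithmetic are assumptions made for this example and are not von Koch's. With this choice the double sum (17) equals 9, and because the perturbation has rank one the finite determinants have the closed form D_m = 1 + Σ_{|i| ≤ m} 4^(−|i|), which tends to 8/3.

```python
from fractions import Fraction


def det_exact(rows):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det          # a row swap changes the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return det


def a(i, k):
    """Assumed example perturbation a_ik = 2^(-|i|-|k|); it satisfies (17)."""
    return Fraction(1, 2 ** (abs(i) + abs(k)))


def D(m):
    """Finite section D_m = det(delta_ik + a_ik), i, k = -m..m, as in (16)."""
    idx = range(-m, m + 1)
    return det_exact([[(1 if i == k else 0) + a(i, k) for k in idx] for i in idx])


def P_bar(m):
    """The majorising product with absolute values used in (19)."""
    idx = range(-m, m + 1)
    prod = Fraction(1)
    for i in idx:
        prod *= 1 + sum(abs(a(i, k)) for k in idx)
    return prod


for m in range(5):
    print(m, float(D(m)), float(P_bar(m)))
```

The printed determinants increase towards 8/3, while the differences of consecutive products dominate the differences of consecutive determinants, as (19) asserts.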

To prove that the limit of a convergent determinant is independent of the choice of origin we begin by defining

\[
  D_{mn} = \det(A_{ik}) \quad \text{for } i, k = -n, \dots, m
\]

and

\[
  \bar{P}_{mn} = \prod_{i=-n}^{m} \left(1 + \sum_{k=-n}^{m} |a_{ik}|\right).
\]

Note that $D_{pp}$ and $\bar{P}_{pp}$ are the same as $D_p$ and $\bar{P}_p$ respectively as above. Now we can state the theorem we need for the final part:

Theorem 4.2. Let A be in normal form. Then $\lim_{m\to\infty,\ n\to\infty} D_{mn} = D$.

Proof. By theorem 4.1 we know that D is finite and that (18) holds. For any pair (m, n) let p = max(m, n). Then we have, as before

\[
  |D_{pp} - D_{mn}| \le \bar{P}_{pp} - \bar{P}_{mn}.
\]

The right hand side can be made arbitrarily small for sufficiently large m and n, and hence p, because of the convergence of (18). The triangle inequality then gives the desired result.

Under the assumption that A is in normal form, von Koch deduced several properties of D. The most important of these, from our perspective, is the possibility to expand D by minors. Suppose we want to expand by minors at the ith row. To determine the coefficients of $A_{ik}$ von Koch replaced $A_{ik}$ by zero for i ≠ k and $A_{ik}$ by one for i = k in A and calculated the resulting determinant, which he denoted

\[
  \operatorname{adj}(A_{ik}) = \binom{i}{k} = \alpha_{ik} = \frac{\partial D}{\partial A_{ik}}. \tag{20}
\]

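The $\alpha_{ik}$:s are called minors or subdeterminants of order one. As in the case of finite determinants we have that

\[
  D = \sum_{k=-\infty}^{\infty} A_{ik}\alpha_{ik}, \qquad \sum_{i=-\infty}^{\infty} A_{ij}\alpha_{ik} = 0 \quad \text{for } j \neq k
\]

and

\[
  \sum_{k=-\infty}^{\infty} A_{jk}\alpha_{ik} = 0 \quad \text{for } j \neq i.
\]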


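Using these ideas, von Koch continued to expand D by two rows, say i and m, thus obtaining the subdeterminant of order two as

\[
  \operatorname{adj}\begin{vmatrix} A_{ik} & A_{in} \\ A_{mk} & A_{mn} \end{vmatrix}
  = \frac{\partial^2 D}{\partial A_{ik}\, \partial A_{mn}}
  = \begin{pmatrix} i & m \\ k & n \end{pmatrix}. \tag{21}
\]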

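Analogously, the determinant can now be written as⁴

\[
  D = \sum_{k<n} \sum_{n=-\infty}^{\infty}
  \begin{vmatrix} A_{ik} & A_{in} \\ A_{mk} & A_{mn} \end{vmatrix}
  \begin{pmatrix} i & m \\ k & n \end{pmatrix}.
\]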

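Inductively, one can continue and obtain the subdeterminant of order r, by rows $i_1, \dots, i_r$ and columns $k_1, \dots, k_r$, as

\[
  \operatorname{adj}
  \begin{vmatrix}
    A_{i_1k_1} & \dots & A_{i_1k_r} \\
    A_{i_2k_1} & \dots & A_{i_2k_r} \\
    \vdots & \ddots & \vdots \\
    A_{i_rk_1} & \dots & A_{i_rk_r}
  \end{vmatrix}
  = \begin{pmatrix} i_1 & i_2 & \dots & i_r \\ k_1 & k_2 & \dots & k_r \end{pmatrix}
\]

and thus writing D as

\[
  \sum_{k_1} \sum_{k_2} \cdots \sum_{k_r}
  \begin{vmatrix}
    A_{i_1k_1} & \dots & A_{i_1k_r} \\
    A_{i_2k_1} & \dots & A_{i_2k_r} \\
    \vdots & \ddots & \vdots \\
    A_{i_rk_1} & \dots & A_{i_rk_r}
  \end{vmatrix}
  \begin{pmatrix} i_1 & i_2 & \dots & i_r \\ k_1 & k_2 & \dots & k_r \end{pmatrix},
\]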

where $k_1 < k_2 < \dots < k_r$ with $-\infty < k_r < \infty$. Von Koch pointed out that there are no restrictions on calculating D by expanding either rows or columns, but any combination of rows and columns can be used, even infinite sets.

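Von Koch's final expression for D is given by

\[
  D = 1 + \sum_{p=-\infty}^{\infty} a_{pp}
  + \sum_{p<q} \begin{vmatrix} a_{pp} & a_{pq} \\ a_{qp} & a_{qq} \end{vmatrix}
  + \sum_{p<q<r} \begin{vmatrix} a_{pp} & a_{pq} & a_{pr} \\ a_{qp} & a_{qq} & a_{qr} \\ a_{rp} & a_{rq} & a_{rr} \end{vmatrix}
  + \cdots \tag{22}
\]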

where the largest summation index in each term is to range over all integers. This expression is of the utmost importance to us since it will later be known as the Fredholm determinant which Ivar Fredholm (1866 – 1927) used to solve his integral equations. [2]
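For a finite matrix the expansion (22) is the classical statement that det(I + a) equals one plus the sum of all principal minors of a. The following sketch checks this on a small example; the 3 × 3 matrix and the function names are assumptions chosen for illustration only.

```python
from fractions import Fraction
from itertools import combinations


def det_exact(rows):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return det


def principal_minor_sum(a):
    """1 + sum of all principal minors of a: the finite analogue of (22)."""
    n = len(a)
    total = Fraction(1)                       # the leading term 1
    for r in range(1, n + 1):
        for idx in combinations(range(n), r):  # index sets p < q < ...
            total += det_exact([[a[i][k] for k in idx] for i in idx])
    return total


a = [[1, 2, 0],
     [3, -1, 4],
     [0, 5, 2]]                               # assumed example matrix
identity_plus_a = [[(1 if i == k else 0) + a[i][k] for k in range(3)]
                   for i in range(3)]
print(principal_minor_sum(a), det_exact(identity_plus_a))
```

Both printed numbers agree. In the infinite case the same sum has infinitely many terms of every order, and the normal form condition (17) is what makes it converge.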

⁴ We use the notation $\begin{vmatrix} A_{ik} & A_{in} \\ A_{mk} & A_{mn} \end{vmatrix} = \det\begin{pmatrix} A_{ik} & A_{in} \\ A_{mk} & A_{mn} \end{pmatrix}$.


5 The concept of space in the 19th century

Set-theoretic concepts are not new in mathematics. They were used by Aristotle and even earlier. Naively one can say that as soon as you make a statement valid for a collection of objects, you are really talking about sets. For example the Pythagorean theorem makes a statement about all right triangles, which of course form a set. In the beginning of the 19th century the word class was commonly used to denote a collection of objects having the same property, but it took until the mid 19th century, with George Boole (1815 – 1864) and Georg Cantor (1845 – 1918), to formalize the ideas and introduce notations for calculating with these concepts.

Since Die Ausdehnungslehre by Grassmann received little attention, we will follow [3] and consider Riemann as the one who introduced the concept of space. In his famous 1851 doctoral thesis, Grundlagen für eine allgemeine Theorie der Functionen einer veränderlichen complexen Grösse, one reads¹:

”The totality of the functions forms a connected domain closed in itself [ein zusammenhängendes in sich abgeschlossenes Gebiet], since each of these functions can go over continuously into every other . . ..”

It is obvious that Riemann understood what we today mean by a function space. In his even more famous talk, Ueber die Hypothesen, welche der Geometrie zu Grunde liegen, he advanced further and introduced a notion of geometry for classes of objects, even infinite classes²:

”But there also exist manifolds in which the determination of location [die Ortsbestimmung] requires not a finite number but either an infinite sequence or a continuum of determinations of quantities [. . . sondern entweder eine unendliche Reihe oder eine stetige Mannigfaltigkeit von Grössenbestimmungen erfordert]. Such a manifold, for instance, is formed by the possible determinations of a function for a given domain.”

This talk was published in 1868 by Richard Dedekind (1831 – 1916), two years after Riemann's untimely death, and after that the ideas introduced by Riemann slowly began to be understood and accepted.

At this time another revolution in mathematics began, mainly due to Weierstrass and his school, which culminated in Felix Klein's (1849 – 1925) Erlanger Programm. The task was to tidy up mathematics. According to Weierstrass, mathematics lacked rigor and relied too much on intuition and physical observations. This triggered the mathematical community to give more rigorous proofs and motivations, as well as making definitions and axioms clearer.

Concerning the concept of space one man picked up all these ideas, but in the same way as Die Ausdehnungslehre by Grassmann was persistently ignored, another potential revolution was ignored. This was in 1888 when Giuseppe Peano (1858 – 1932) published his book Calcolo geometrico secondo l'Ausdehnungslehre di H. Grassmann preceduto dalle operazioni della logica deduttiva. This book is interesting in several ways. Among other things, the symbols ∪, ∩ and ∈, representing union, intersection and an element belonging to a set, respectively, were introduced. Regarding the concept of space, we give a passage from chapter IX where Peano defined a linear space³.

¹ Cited from [4].

² Cited from [4].

³ Axioms 1–3 cited from http://www-history.mcs.st-and.ac.uk/HistTopics/Abstract_linear_spaces.html, 2007–10–11, and 4 freely translated from [14].


1. (a = b) if and only if (b = a), if (a = b) and (b = c) then (a = c).

2. The sum of two objects a and b is defined, i.e. an object is defined denoted by a + b, also belonging to the system, which satisfies: If (a = b) then (a + c = b + c), a + b = b + a, a + (b + c) = (a + b) + c, and the common value of the last equality is denoted by a + b + c.

3. If a is an object of the system and m a positive integer, then we understand by ma the sum of m objects equal to a. It is easy to see that for objects a, b, . . . of the system and positive integers m, n, . . . one has if (a = b) then (ma = mb), m(a + b) = ma + mb, (m + n)a = ma + na, m(na) = mna, 1a = a. We suppose that for any real number m the notation ma has a meaning such that the preceding equations are valid.

4. We suppose that there exists an object, which we call the zero object, denoted by θ, such that 0a = θ for all a belonging to the system.

He continued to define linear dependence, independence and the dimension of the linear system as the maximum number of linearly independent objects in the linear system. He even gives the example:⁴

”If one considers only functions of degree n, then these functions form a linear system with n + 1 dimensions, the entire functions of arbitrary degree form a linear system with infinitely many dimensions.”

Compare this definition to a definition of a linear space in any modern textbook in linear algebra. Unfortunately, this book was neither well known nor appreciated by other mathematicians, and they kept ignoring the importance of linear spaces, as well as the set theoretic notations, for about another 20 years. The reason why Peano was not given the recognition he deserved during his active period is a very interesting and sad story. One reason is that he published much of his work in Italian in the University of Turin's own journal, which was not very well known. Another reason is that his teaching position in higher analysis was questioned by the university because of his use of logical symbols and his conviction that rigor was to be put before intuition. The academic disputes ultimately resulted in Peano's dismissal in favor of Guido Fubini (1879 – 1943) and in Peano's works not getting the credit they deserved. For a detailed discussion, see [13].

The same year, 1888, Dedekind published Was sind und was sollen die Zahlen? which gave the extensive study of the real numbers that was needed in order to solve the problems regarding the definition of a function⁵, convergence and limits. This provided mathematicians with the necessary rigor to perform the algebraization of analysis, and thus led to the beginning of modern functional analysis.

⁴ Cited from http://www-history.mcs.st-and.ac.uk/HistTopics/Abstract_linear_spaces.html, 2007-10-12.

⁵ The definition of a function, as being a mapping from an arbitrary set into another arbitrary set, seems to have appeared for the first time in this book. [5]


6 Fredholm on Integral equations

With all the ideas and concepts developed at the end of the 19th century, mathematics and especially functional analysis was ready to enter the 20th century. Following [5] we find the major steps in four fundamental papers:

• Fredholm 1900 on Integral equations

• Lebesgue 1902 on Integration theory

• Hilbert 1906 on Spectral theory

• Maurice Frechet (1878 – 1973) 1906 on Metric spaces

Yet each of these papers alone was not enough, and it took F. Riesz to understand the connection between them and to create a general theory. When discussing these papers we will not follow the chronological order, but discuss the paper by Lebesgue last, since the work of Hilbert is based on that of Fredholm, and that of Fréchet on that of Hilbert. Otherwise the chain of events would be broken.

The 1900 paper by Fredholm is entitled On a new method for the solution of Dirichlet's problem¹ and, of course, concerns the solution of Dirichlet's problem. It is easy to dismiss this paper as only dealing with this specific problem, but the theory and results in it are very deep. Inspired by a visit to France in 1899, where he met and worked with both Poincaré and Jacques Hadamard (1865 – 1963), he returned home and succeeded in improving the results of all his predecessors.

Integral equations had been solved before Fredholm. The first who was able to give a complete solution to an integral equation was Niels Henrik Abel (1802 – 1829) concerning a problem in mechanics. After him followed successful attempts by Vito Volterra (1860 – 1940), Neumann and Poincaré. The general recipe until Fredholm was to replace the integral with finite Riemann sums, and then pass to the limit. Consider the equation

\[
  \varphi(s) = f(s) + \int_0^1 K(s,t) f(t)\, dt, \tag{1}
\]

where ϕ and K are known, and f is to be determined. Partition the interval [0, 1] into n subintervals by the points $x_0, x_1, \dots, x_n$ with $x_p = p/n$. Set $\varphi(x_p) = \varphi_p$, $K(x_p, x_q) = K_{pq}$ and $f(x_p) = f_p$. Then by substituting into (1) one obtains the finite system of equations

\[
  f_p + \frac{1}{n} \sum_{k=0}^{n} K_{pk} f_k = \varphi_p \quad \text{for } p = 0, 1, \dots, n. \tag{2}
\]

If we fix n and let $f_k^{(n)}$ be a solution of (2), then by plotting $(x_k, f_k^{(n)})$, one obtains a polygonal solution curve. By letting n → ∞ the system (2) should go over into (1) and the polygonal curve should represent a solution curve of (1), but it is of course this limiting process that causes problems.
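The recipe of replacing the integral by a Riemann sum can be sketched in a few lines of Python. Everything specific in the sketch is an assumption made for illustration: the kernel K(s, t) = st, the right-hand side ϕ(s) = 4s/3 (for which f(s) = s solves the integral equation exactly), and the helper names. The system solved is the finite system (2).

```python
def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting in floating point."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):          # back substitution
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def discretized_solution(kernel, phi, n):
    """Solve f_p + (1/n) sum_k K_pk f_k = phi_p for p = 0..n, as in (2)."""
    xs = [p / n for p in range(n + 1)]
    matrix = [[(1.0 if p == k else 0.0) + kernel(xs[p], xs[k]) / n
               for k in range(n + 1)] for p in range(n + 1)]
    return xs, solve_linear(matrix, [phi(x) for x in xs])


def kernel(s, t):
    return s * t              # assumed example kernel


def phi(s):
    return 4.0 * s / 3.0      # chosen so that f(s) = s is the exact solution


for n in (10, 100):
    xs, f = discretized_solution(kernel, phi, n)
    print(n, f[-1])           # approximation of f(1) = 1
```

The values of the polygonal solution at s = 1 approach the exact value 1 only at a rate of roughly 1/n, which reflects the crude Riemann sum used in (2).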

Fredholm's goal was to obtain a complete theory for the integral equation²

\[
  \varphi(s) = f(s) + \lambda \int_a^b K(s,t) f(t)\, dt \tag{3}
\]

¹ The original title is Sur une nouvelle méthode pour la résolution du problème de Dirichlet. This was a preliminary article, and the full version entitled Sur une classe d'équations fonctionnelles was published in Acta Mathematica, 1903.

² We do not follow Fredholm's original notations here.


where ϕ is a given continuous function on [a, b], K is bounded and piecewise continuous on [a, b] × [a, b], λ is a complex parameter³ and f is the unknown. Though he does not explicitly describe the methods used by Volterra and von Koch, it is evident that he was well aware of them. Fredholm's procedure is composed of three ideas:

1.) Replacing the integral in (3) by Riemann sums and thus obtaining a system of equations,

\[
  f(x_j) + \frac{\lambda(b-a)}{n} \sum_{k=1}^{n} K(x_j, x_k) f(x_k) = \varphi(x_j) \quad \text{for } j = 1, 2, \dots, n. \tag{4}
\]

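2.) Writing the determinant of the resulting system by use of von Koch's formula as (compare section 4.3)

\[
  1 + \frac{\lambda(b-a)}{n} \sum_{k=1}^{n} K(x_k, x_k)
  + \frac{\lambda^2 (b-a)^2}{2!\, n^2} \sum_{k_1, k_2}
  \begin{vmatrix} K(x_{k_1}, x_{k_1}) & K(x_{k_1}, x_{k_2}) \\ K(x_{k_2}, x_{k_1}) & K(x_{k_2}, x_{k_2}) \end{vmatrix}
  + \cdots
\]

and letting n → ∞. By denoting

\[
  K\begin{pmatrix} x_1 & x_2 & \dots & x_m \\ y_1 & y_2 & \dots & y_m \end{pmatrix}
  = \begin{vmatrix}
    K(x_1, y_1) & K(x_1, y_2) & \dots & K(x_1, y_m) \\
    K(x_2, y_1) & K(x_2, y_2) & \dots & K(x_2, y_m) \\
    \vdots & \vdots & \ddots & \vdots \\
    K(x_m, y_1) & K(x_m, y_2) & \dots & K(x_m, y_m)
  \end{vmatrix} \tag{5}
\]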

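he writes the determinant of the integral equation (3) as

\[
  \Delta(\lambda) = 1 + \lambda \int_a^b K(x,x)\, dx
  + \frac{\lambda^2}{2!} \int_a^b \int_a^b K\begin{pmatrix} x_1 & x_2 \\ x_1 & x_2 \end{pmatrix} dx_1\, dx_2 + \cdots \tag{6}
\]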

and it remains to

3.) prove the uniform convergence of (6) in any closed and bounded subset of the complex plane. For this it is sufficient to give a good upper bound for the determinant (5). By a theorem of Hadamard⁴, Fredholm showed that

\[
  \left| K\begin{pmatrix} x_1 & x_2 & \dots & x_m \\ y_1 & y_2 & \dots & y_m \end{pmatrix} \right| \le \sqrt{m^m}\, M^m,
\]

where $M = \max_{p,q} |K_{pq}|$, from which it follows that (6) converges since

\[
  \sum_{m=1}^{\infty} \frac{1}{m!} \sqrt{m^m}\, M^m < \infty.
\]
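The passage from the finite determinants of step 2.) to the limit Δ(λ) can be observed numerically. The sketch below is an illustration under stated assumptions: the kernel K(s, t) = min(s, t) on [0, 1] is chosen here because its eigenvalues are known in closed form, and the product over these eigenvalues gives Δ(λ) = cosh(√λ) for this kernel; the function names are likewise hypothetical. The code computes the determinant of the discretized system (4) and compares it with that value.

```python
import math


def det_float(rows):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in rows]
    n, det = len(m), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def discrete_fredholm_det(kernel, lam, a, b, n):
    """Determinant of the n x n system (4) with nodes x_k = a + k(b-a)/n."""
    h = (b - a) / n
    xs = [a + k * h for k in range(1, n + 1)]
    return det_float([[(1.0 if j == k else 0.0) + lam * h * kernel(xs[j], xs[k])
                       for k in range(n)] for j in range(n)])


def kernel_min(s, t):
    return min(s, t)          # assumed example kernel


for n in (10, 20, 40, 80):
    print(n, discrete_fredholm_det(kernel_min, 1.0, 0.0, 1.0, n), math.cosh(1.0))
```

The discrete determinants approach cosh(1) as n grows, again at a rate of roughly 1/n. The first terms of the series (6) for this kernel, 1 + 1/2 + 1/24 + . . ., match the Taylor expansion of cosh(√λ) at λ = 1.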

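Fredholm's next step was to apply Cramer's rule to the system (4) and again let n → ∞. Expanding by the first row then gives

\[
  K\begin{pmatrix} s & x_1 & \dots & x_m \\ t & x_1 & \dots & x_m \end{pmatrix}
  = K(s,t)\, K\begin{pmatrix} x_1 & \dots & x_m \\ x_1 & \dots & x_m \end{pmatrix}
  - \cdots + (-1)^m K(s, x_m)\, K\begin{pmatrix} x_1 & x_2 & \dots & x_m \\ t & x_1 & \dots & x_{m-1} \end{pmatrix}. \tag{7}
\]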

³ Note that in connection with spectral properties, both Fredholm and Hilbert (chapter 7) studied I − λK instead of λI − K as we do today. Hence their spectra of operators are different from ours.

⁴ Hadamard's theorem states that $|\det(A)|^2 \le \prod_{i=1}^{n} \left( \sum_{j=1}^{n} |a_{ij}|^2 \right)$.


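To simplify things, Fredholm defined the minor,

\[
\begin{aligned}
  \Delta(s,t;\lambda) = K(s,t) &+ \lambda \int_a^b K\begin{pmatrix} s & x_1 \\ t & x_1 \end{pmatrix} dx_1 + \cdots \\
  &+ \frac{\lambda^m}{m!} \int_a^b \cdots \int_a^b K\begin{pmatrix} s & x_1 & \dots & x_m \\ t & x_1 & \dots & x_m \end{pmatrix} dx_1\, dx_2 \dots dx_m + \cdots
\end{aligned}
\tag{8}
\]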

and replaced each integrand by its expression (7), which gives the relation

\[
  \Delta(s,t;\lambda) = K(s,t)\Delta(\lambda) - \lambda \int_a^b K(s,\xi)\, \Delta(\xi,t;\lambda)\, d\xi. \tag{9}
\]

He then introduced the function

\[
  \Phi(s) = \varphi(s)\Delta(\lambda) - \lambda \int_a^b \Delta(s,\xi;\lambda)\, \varphi(\xi)\, d\xi \tag{10}
\]

and from (9) it follows that

\[
  \Phi(s) + \lambda \int_a^b K(s,t)\, \Phi(t)\, dt = \varphi(s)\Delta(\lambda). \tag{11}
\]

The conclusion from (11) is that if ∆(λ) ≠ 0 then f(s) = Φ(s)/∆(λ) is a solution of (3). [3]
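The formulas (6) and (8)–(11) can be followed by hand on a very simple kernel. The following worked example is not taken from Fredholm's paper; the kernel K(s, t) = st on [0, 1] and the right-hand side ϕ(s) = 4s/3 are assumptions chosen so that every integral is elementary. Since every determinant (5) of order two or higher vanishes for this kernel, both series terminate after their first terms.

```latex
% Assumed example: K(s,t) = st on [0,1]. All determinants (5) with m >= 2 vanish.
\begin{align*}
  \Delta(\lambda) &= 1 + \lambda \int_0^1 x^2\, dx = 1 + \frac{\lambda}{3},
  \qquad \Delta(s,t;\lambda) = st, \\
  % check of (9): the right-hand side reduces to st
  K(s,t)\Delta(\lambda) - \lambda \int_0^1 s\xi \cdot \xi t\, d\xi
    &= st\left(1 + \frac{\lambda}{3}\right) - \frac{\lambda}{3}\, st = st, \\
  % (10) with phi(s) = 4s/3 and lambda = 1
  \Phi(s) &= \frac{4s}{3}\cdot\frac{4}{3} - s \int_0^1 \xi \cdot \frac{4\xi}{3}\, d\xi
    = \frac{16s}{9} - \frac{4s}{9} = \frac{4s}{3}, \\
  f(s) &= \frac{\Phi(s)}{\Delta(1)} = \frac{4s/3}{4/3} = s, \\
  % direct verification in (3)
  f(s) + \int_0^1 st \cdot t\, dt &= s + \frac{s}{3} = \frac{4s}{3} = \varphi(s).
\end{align*}
```

In this example ∆(λ) vanishes only at λ = −3, which is the single value of the parameter for which the homogeneous equation has a non-trivial solution, namely f(s) = s.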

Fredholm did not stop there. He went on and proved that

\[
  \frac{d\Delta(\lambda)}{d\lambda} = \int_a^b \Delta(s,s;\lambda)\, ds \tag{12}
\]

from which he could deduce that if $\lambda_0$ is a zero of order ν of ∆(λ), then for a suitable choice of ϕ, the function Φ(s) cannot be divisible by a power of $\lambda - \lambda_0$ higher than $(\lambda - \lambda_0)^{\nu-1}$. That is, if $\Phi(s) = (\lambda - \lambda_0)^k \Phi_1(s)$ then from (11) one has that

\[
  \Phi_1(s) + \lambda_0 \int_a^b K(s,t)\, \Phi_1(t)\, dt = 0, \tag{13}
\]

which means that if (13) has no non-trivial solutions, then $\Delta(\lambda_0) \neq 0$ and, for $\lambda = \lambda_0$, there exists a unique solution of (3). By taking $\lambda_0 = 1$ and using the properties of double layer potentials one can deduce that (13) has no non-trivial solutions⁵, and hence the existence and uniqueness for the solutions of the Dirichlet problem is proved.

Despite the startling results in this paper, the methods that Fredholm used were not very original. They relied heavily on von Koch's theory of infinite determinants and on taking limits of finite systems. In the revised paper of 1903, Sur une classe d'équations fonctionnelles, he introduced a method which was many years ahead of its time. [3]

We again consider the equation (3),

\[
  \varphi(s) = f(s) + \lambda \int_a^b K(s,t) f(t)\, dt,
\]

⁵ For a domain with sufficiently smooth boundary, in Fredholm's case, three times continuously differentiable.


but this time viewed as a transformation, depending on the kernel K, of the unknown function f into the known function ϕ. If we denote this transformation by $f \to S_{\lambda K} f$, we have that $S_{\lambda K} f = \varphi$ with

\[
  S_K f(s) = f(s) + \int_a^b K(s,t) f(t)\, dt
\]

where λ = 1. For two kernels K and K′ we write the composition $S_K S_{K'} = S_{K''}$ with

\[
  K''(s,t) = K(s,t) + K'(s,t) + \int_a^b K(s,\xi)\, K'(\xi,t)\, d\xi. \tag{14}
\]

Suppose that ∆(λ) ≠ 0 and define⁶

\[
  R(s,t;\lambda) = -\Delta(s,t;\lambda)/\Delta(\lambda),
\]

then

\[
  S_{\lambda K} S_R = S_R S_{\lambda K} = \mathrm{Id},
\]

and thus again we have that a necessary and sufficient condition for the existence and uniqueness of a solution of (3) is that ∆(λ) ≠ 0.

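To determine what happens if ∆(λ) = 0, Fredholm generalized (12) to

\[
  \frac{d^m \Delta(\lambda)}{d\lambda^m} = \int_a^b \cdots \int_a^b
  \Delta\!\left(\begin{matrix} s_1 & \dots & s_m \\ s_1 & \dots & s_m \end{matrix};\lambda\right) ds_1 \dots ds_m \tag{15}
\]

from which he deduced that if ∆(λ) = 0, then there exists an integer m such that

\[
  \Delta\!\left(\begin{matrix} s_1 & \dots & s_m \\ s_1 & \dots & s_m \end{matrix};\lambda\right)
\]

is not identically equal to zero. Let m be the smallest such integer, which is exactly the order of λ as a zero of ∆, then the m solutions of (13),

\[
\begin{aligned}
  \Phi_1(s) &= \frac{\Delta\begin{pmatrix} s & s_2 & \dots & s_m \\ t_1 & t_2 & \dots & t_m \end{pmatrix}}{\Delta\begin{pmatrix} s_1 & \dots & s_m \\ t_1 & \dots & t_m \end{pmatrix}} \\
  \Phi_2(s) &= \frac{\Delta\begin{pmatrix} s_1 & s & s_3 & \dots & s_m \\ t_1 & t_2 & t_3 & \dots & t_m \end{pmatrix}}{\Delta\begin{pmatrix} s_1 & \dots & s_m \\ t_1 & \dots & t_m \end{pmatrix}} \\
  &\;\;\vdots
\end{aligned}
\]

are linearly independent and every other solution of (13) is a linear combination of the $\Phi_j$:s for 1 ≤ j ≤ m. He concluded the paper by showing that for two kernels K and K′, with corresponding determinants $\Delta_K$ and $\Delta_{K'}$, the word determinant is justified by the fact that for the composed kernel K″, one has $\Delta_{K''} = \Delta_K \Delta_{K'}$. [6]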

⁶ R(s, t; λ) is later called the resolvent kernel by Hilbert. [9]


The success of this paper is not only due to the solution of a classical problem, but also to its originality. What analysis needed was the introduction of algebra and group theory. Indeed, the idea used by Fredholm is to consider the set of all transformations (operators) which have a non-zero determinant, and to realize that they form a group under composition. Then by using algebraic properties of groups and group actions, he could say something about the underlying problem in analysis. Hence this paper is not only a forerunner of functional analysis, but of all spectral theory and operator theory, in particular operator algebras.


7 Hilbert on Spectral theory

The work of Fredholm got immediate attention from mathematicians all over the world, and of those, Hilbert was one of the most enthusiastic. It made him drop almost everything that he was doing, and turn his attention to the theory of integral equations. He even proposed at his seminar in Göttingen that Fredholm's results on integral equations could lead to a solution of the Riemann hypothesis. Hilbert hoped that the Riemann zeta-function, suitably completed to an entire function, could be expressed as the determinant of an integral equation with symmetric kernel, but unfortunately no one has been able to find such a representation yet. However, it clearly expresses Hilbert's faith and enthusiasm in the methods invented by Fredholm.

During the years 1904–1906, Hilbert published six papers on integral equations which were later all put together in a single volume entitled Grundzüge einer allgemeinen Theorie der Integralgleichungen. Of these papers, the first and fourth are of main interest to us. From what I can see, he began by taking one step back, to make sure the methods used prior to him were rigorous enough, and then took two steps forward. He started by returning to the transformation of the integral equation into a finite system of equations (compare chapter 6, equation (4)), under the restriction that the kernel is symmetric, and then taking limits. One might ask why he bothered to do so when it had been considered already by Volterra and Poincaré. The answer is probably that, under the assumption of a symmetric kernel, he was able to obtain much more precise results for this special case than he could using the previous general methods.

Let the kernel K(s, t) of the integral equation

\[
  \varphi(s) = f(s) + \lambda \int_a^b K(s,t) f(t)\, dt \tag{1}
\]

be a real, symmetric and continuous function. Then the associated matrix $(K(y_k, y_j))$ is symmetric and it is also the matrix of the quadratic form $\sum_{j,k} K(y_k, y_j)\xi_k \xi_j$. Thus Hilbert was faced with the problem of taking limits of this quadratic form. Under these assumptions he was able to show that the roots of the Fredholm determinant are indeed real, as foreseen already by Poincaré. If we write these zeroes as a sequence $(\lambda_n)$ with multiplicity counted, then for each n there exists an eigenfunction, $\varphi_n$, such that

\[
  \int_a^b \varphi_m(t)\varphi_n(t)\, dt = 0 \quad \text{for } m \neq n.
\]

Now normalize the $\varphi_n$ subject to

\[
\int_a^b \varphi_n(t)^2\, dt = 1
\]

and define, for each continuous function x on [a, b], the Fourier coefficients

\[
(x, \varphi_n) = \int_a^b x(t)\, \varphi_n(t)\, dt.
\]

Then Hilbert proved that

\[
\int_a^b \int_a^b K(s,t)\, x(s)\, y(t)\, ds\, dt = \sum_n \frac{1}{\lambda_n}\, (x, \varphi_n)(y, \varphi_n) \tag{2}
\]

for any two continuous functions x and y. Note that this is a generalization of the principal axis theorem. An interesting remark is that he showed that the right-hand side of (2) is uniformly convergent for arbitrary continuous functions x and y subject only to¹

\[
\int_a^b x(t)^2\, dt \leq 1 \quad \text{and} \quad \int_a^b y(t)^2\, dt \leq 1.
\]

We have already seen how Hilbert generalized and abstracted existing concepts, but his purpose in these papers was, on the contrary, to deal with applications; abstraction was for Hilbert a tool to solve concrete problems. He even wrote that²

“...the systematic building of a general theory of integral equations for the whole of analysis, especially for the theory of the definite integral and the theory of the development of arbitrary functions in an infinite series, besides for the theory of linear differential equations and analytic functions, as well as for potential theory and calculus of variations, is of the greatest importance, and that, the most noteworthy result is that the developability of a function in [a series] of eigenfunctions belonging to an integral equation of the second kind is evidently dependent on the solvability of the corresponding integral equation of the first kind.”

Before we continue we need to make a remark about eigenvalues. Both Fredholm and Hilbert studied eigenvalues in the sense that the operator λK − I is not invertible, rather than, as we do now, the operator K − λI. This means that a λ is an eigenvalue in the sense of Fredholm and Hilbert if and only if 1/λ is an eigenvalue in our sense.
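As a small numerical illustration of this remark, the following sketch checks, in finite dimensions, that the zeroes λ of the determinant of I − λK are the reciprocals of the eigenvalues µ of K in the modern sense, and that the finite-dimensional analogue of (2) holds. It is not taken from any of the sources: a hypothetical 2×2 symmetric matrix stands in for the kernel, and the helper names sym2_eigen, fredholm_det and bilinear are assumptions made for this example only.

```python
import math

def sym2_eigen(a, b, c):
    """Eigenpairs (mu, phi) of the symmetric matrix [[a, b], [b, c]], assuming b != 0."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    pairs = []
    for mu in (mean + radius, mean - radius):
        v = (b, mu - a)                      # solves the first row of (K - mu I) v = 0
        length = math.hypot(v[0], v[1])
        pairs.append((mu, (v[0] / length, v[1] / length)))
    return pairs

def dot(u, v):
    """Finite inner product (u, v) = sum_p u_p v_p."""
    return sum(up * vp for up, vp in zip(u, v))

def bilinear(K, x, y):
    """Bilinear form K(x, y) = sum_{p,q} k_pq x_p y_q."""
    return sum(K[p][q] * x[p] * y[q] for p in range(2) for q in range(2))

def fredholm_det(K, lam):
    """Determinant of I - lam*K for a 2x2 matrix K."""
    return (1 - lam * K[0][0]) * (1 - lam * K[1][1]) - lam * lam * K[0][1] * K[1][0]

K = [[2.0, 1.0], [1.0, 2.0]]                 # hypothetical symmetric "kernel"
pairs = sym2_eigen(2.0, 1.0, 2.0)            # modern eigenvalues mu
fredholm_lams = [1.0 / mu for mu, _ in pairs]  # eigenvalues in the Fredholm-Hilbert sense

x, y = (1.0, -2.0), (0.5, 3.0)
lhs = bilinear(K, x, y)
# Finite analogue of (2): sum over n of (1/lambda_n) (x, phi_n)(y, phi_n).
rhs = sum((1.0 / lam) * dot(x, phi) * dot(y, phi)
          for lam, (_, phi) in zip(fredholm_lams, pairs))
```

Here lhs and rhs agree up to rounding, and fredholm_det vanishes (again up to rounding) at each value in fredholm_lams.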

Hilbert went on by showing that the set of $\lambda_n$ is infinite unless K(x, y) is a finite linear combination of functions of the form u(x)v(y), and that the resolvent kernel $R(s,t;\mu)$ has eigenvalues $\lambda_n - \mu$ with the corresponding eigenfunctions $\varphi_n/(\lambda_n - \mu)$. Thus one has the relation

\[
R(s,t;\mu) - R(s,t;\nu) = (\mu - \nu) \int_a^b R(s,\xi;\mu)\, R(\xi,t;\nu)\, d\xi
\]

for µ and ν different from $\lambda_n$. Finally, he proved that if a function f can be written as

\[
f(s) = \int_a^b K(s,t)\, g(t)\, dt \tag{3}
\]

then the corresponding Fourier expansion

\[
f(s) = \sum_n (f, \varphi_n)\, \varphi_n(s)
\]

is absolutely and uniformly convergent and one has the Parseval relation

\[
\int_a^b f(s)^2\, ds = \sum_n (f, \varphi_n)^2.
\]

The restriction to functions of type (3) was later removed by Erhard Schmidt (1876–1959), a student of Hilbert, in his 1905 dissertation. [5]

¹ Unit balls in a Hilbert space!

² See [3].

Hilbert’s restriction to symmetric kernels had its origin in applications to analytical problems, such as the Dirichlet problem, which is his concern in papers two and three. We will not deal with those papers here, since they did not contribute from our perspective. Instead we will turn our attention to his fourth paper, which many consider to be a masterpiece and the best paper Hilbert ever wrote. According to [5], “it is by the depth and novelty of its ideas a turning point in the history of functional analysis, and indeed deserves to be considered as the very first paper published in that discipline.”

It was here that Hilbert took a step back, abandoned the integral equation point of view, and returned to the older concept of finite systems of equations, but with a new twist. This was because Hilbert was about to realize that the whole theory of integral equations can be contained as a special case in this older theory.

Let $(w_n)$ be a complete orthonormal system of continuous functions on [a, b] and suppose that a continuous function f is a solution of (1) for λ = 1. If we consider the Fourier coefficients

\[
x_p = \int_a^b f(s)\, w_p(s)\, ds, \qquad
k_{pq} = \int_a^b \int_a^b K(s,t)\, w_p(s)\, w_q(t)\, ds\, dt, \qquad
b_p = \int_a^b \varphi(s)\, w_p(s)\, ds,
\]

then the $x_p$, p = 1, 2, . . ., satisfy the infinite system of equations

\[
x_p + \sum_{q=1}^{\infty} k_{pq}\, x_q = b_p \quad \text{for } p = 1, 2, \ldots, \tag{4}
\]

and because of the Bessel inequality we have

\[
\sum_p x_p^2 < \infty, \qquad \sum_{p,q} k_{pq}^2 < \infty \qquad \text{and} \qquad \sum_p b_p^2 < \infty. \tag{5}
\]

The twist is then to consider this process in the reverse order. Assume that we have a solution $(x_p)$ of (4), with conditions (5) satisfied, and let $k_q(s) = \int_a^b K(s,t)\, w_q(t)\, dt$. Then the functions $k_q$ are continuous and satisfy

\[
\sum_p k_p(s)^2 \leq \int_a^b K(s,t)^2\, dt,
\]

which means that the series $u(s) = \sum_p x_p k_p(s)$ is absolutely and uniformly convergent, and hence that u is continuous and satisfies $(u, w_p) = b_p - x_p$. Now if f = ϕ − u, then from $(f, w_p) = x_p$ and the completeness of $(w_p)$ it follows that f is a solution of (1) with λ = 1.

After this rather standard procedure, according to [5], Hilbert ventured where no one had ever gone before:

1. He exclusively considered sequences $x = (x_p)$, p = 1, 2, . . ., of real numbers such that $\sum_p x_p^2 < \infty$.

2. He dropped all restrictions on the double sequence $k_{pq}$ except that $k_{pq} = k_{qp}$.

3. The center of attention was no longer solutions of (4), but the bilinear symmetric form

\[
K(x, y) = \sum_{p,q=1}^{n} k_{pq}\, x_p y_q, \tag{6}
\]

for which he wanted to pass to the limit.

It is in Hilbert’s study of those infinite bilinear forms that we find the core of modern functional analysis.

To form the theory of infinite bilinear forms, he returned again to the integral equation

\[
f(s) = \varphi(s) - \lambda \int_a^b K(s,t)\, \varphi(t)\, dt \tag{7}
\]

which he transformed as before to the finite system of equations³

\[
f_p = \varphi_p - \lambda \sum_{q=1}^{n} K_{pq}\, \varphi_q \quad \text{for } p = 1, 2, \ldots, n. \tag{8}
\]

He defined the inner product of two vectors $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ as

\[
(x, y) = \sum_{p=1}^{n} x_p y_p \tag{9}
\]

which he applied to (8) and thus formed the system, involving bilinear forms,

\[
(u, f) = (u, \varphi) - \lambda\, (u, K\varphi). \tag{10}
\]

With these definitions, he started his study of the infinite bilinear form

\[
K(x, y) = \sum_{p,q=1}^{\infty} k_{pq}\, x_p y_q \tag{11}
\]

and in particular the form

\[
(x, y) - \lambda K(x, y),
\]

where the parameter λ played a more significant role than even Fredholm could imagine.

Hilbert’s first task was to construct a resolvent, K(λ; x, y), for the form K (compare chapter 6) which satisfies

\[
K(\lambda; x, y) - \lambda K(K(\lambda; x, y)) = (x, y)
\]

in a way which will be made clear later.

After the construction of this resolvent he further generalized the principal axis theorem to infinite quadratic forms, and finally applied the theory of infinite bilinear forms to infinite systems of equations.

With K(x, y) defined as in (11), the n-section of K is defined as

\[
K_n(x, y) = \sum_{p,q=1}^{n} k_{pq}\, x_p y_q
\]

and also

\[
(x, y)_n = \sum_{p=1}^{n} x_p y_p,
\]

with corresponding quadratic forms, K(x, x) and (x, x), and their n-sections

\[
K_n(x, x) = \sum_{p,q=1}^{n} k_{pq}\, x_p x_q \qquad \text{and} \qquad (x, x)_n = \sum_{p=1}^{n} x_p^2.
\]

³ For simplicity, suppose that the interval of integration, [a, b], is [0, 1].

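To the form

\[
(x, x)_n - \lambda K_n(x, x)
\]

there is the associated determinant

\[
D_n(\lambda) =
\begin{vmatrix}
1 - \lambda k_{11} & -\lambda k_{12} & \cdots & -\lambda k_{1n} \\
-\lambda k_{21} & 1 - \lambda k_{22} & \cdots & -\lambda k_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
-\lambda k_{n1} & -\lambda k_{n2} & \cdots & 1 - \lambda k_{nn}
\end{vmatrix}
\]

which is a polynomial of degree n in λ with real coefficients and real zeroes $\lambda_1^{(n)}, \lambda_2^{(n)}, \ldots, \lambda_n^{(n)}$. The zeroes of this polynomial are called the eigenvalues of $K_n$ and the set of all eigenvalues is called the spectrum of $K_n$.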

The resolvent $K_n(\lambda; x, y)$ and its partial fraction decomposition are written

\[
K_n(\lambda; x, y) = \sum_{i=1}^{n} \frac{L_i^{(n)}(x)\, L_i^{(n)}(y)}{1 - \dfrac{\lambda}{\lambda_i^{(n)}}}
\]

where the $L_i^{(n)}(x)$ form an orthonormal set of eigenforms. That is, $L_i^{(n)}(x) = (\varphi_i, x)$ where $\varphi_i$ is a normalized eigenvector associated with the eigenvalue $\lambda_i^{(n)}$ of $K_n(x, y)$. Assume for simplicity that these eigenvalues $\lambda_i^{(n)}$ have multiplicity one. The product⁴ of the forms $A_n(x, y) = \sum_{p,q=1}^{n} a_{pq} x_p y_q$ and $B_n(x, y) = \sum_{p,q=1}^{n} b_{pq} x_p y_q$ is denoted and defined as

\[
A_n(x, \cdot)\, B_n(\cdot, y) = \sum_{p,q,r=1}^{n} a_{pq}\, b_{qr}\, x_p y_r.
\]

Note that this is the form associated with the matrix product AB and hence we will call it the product form. For simplicity we denote $A_n(x, \cdot)\, B_n(\cdot, y)$ by A(B(x, y)) instead.
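The claim that the product form is the form of the matrix product AB can be checked directly. The sketch below is an illustration only, with hypothetical 2×2 coefficient matrices and helper names (bilinear, product_form, matmul) that do not come from the sources; it evaluates the triple sum and compares it with the bilinear form of AB.

```python
def bilinear(M, x, y):
    """Value of the form sum_{p,q} m_pq x_p y_q."""
    n = len(M)
    return sum(M[p][q] * x[p] * y[q] for p in range(n) for q in range(n))

def product_form(A, B, x, y):
    """Faltung A(x,.)B(.,y) = sum_{p,q,r} a_pq b_qr x_p y_r."""
    n = len(A)
    return sum(A[p][q] * B[q][r] * x[p] * y[r]
               for p in range(n) for q in range(n) for r in range(n))

def matmul(A, B):
    """Ordinary matrix product AB of two square matrices."""
    n = len(A)
    return [[sum(A[p][q] * B[q][r] for q in range(n)) for r in range(n)]
            for p in range(n)]

# Hypothetical coefficients, chosen as integers so the comparison is exact.
A = [[1, 2], [0, 1]]
B = [[3, 0], [1, 1]]
x, y = [1, -1], [2, 5]

faltung = product_form(A, B, x, y)
via_matrix = bilinear(matmul(A, B), x, y)
```

Both faltung and via_matrix evaluate to the same number, which is all the identification with the matrix product asserts.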

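For forms K with the spectrum of $K_n$ uniformly bounded, Hilbert defined the functions $X_p^{(n)}(\lambda)$ as

\[
X_p^{(n)}(\lambda) =
\begin{cases}
0 & \text{for } \lambda \leq \lambda_p^{(n)}, \\
\left(L_p^{(n)}(x)\right)^2 \left(\lambda - \lambda_p^{(n)}\right) & \text{for } \lambda > \lambda_p^{(n)},
\end{cases}
\qquad p = 1, 2, \ldots, n,
\]

and

\[
X^{(n)}(\lambda) = \sum_{p=1}^{n} X_p^{(n)}(\lambda).
\]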

A variable $x = (x_1, x_2, \ldots)$ is called a distinguished variable if x has exactly one non-zero component, which equals one, or exactly two non-zero components, which both equal $1/\sqrt{2}$; the set of all distinguished variables is denoted by $x^{(k)}$. Hilbert then showed that a subsequence $X^{(m_j)}(\lambda)$ of $X^{(n)}(\lambda)$ is uniformly convergent in λ for all distinguished variables x and that $X^{(n)}(\lambda)$ is a quadratic form in x with coefficients depending on λ. Furthermore, if p and q are fixed, then the coefficient $X_{pq}^{(m_j)}(\lambda)$ of $x_p x_q$ in $X^{(m_j)}(\lambda)$ converges to a continuous function of λ, say $X_{pq}(\lambda)$.

⁴ By tradition called the Faltung (convolution) of the forms.

Finally, from all the above, Hilbert defined the quadratic form

\[
X(\lambda) = \sum_{p,q=1}^{\infty} X_{pq}(\lambda)\, x_p x_q.
\]

From the beginning, this form was only defined for the distinguished set of variables $x^{(k)}$ and the interval [a, b] containing the spectra of $K_n$, but Hilbert extended the definition of X(λ) to all real λ and all variables x by taking linear combinations of the distinguished variables and extending X by linearity. Hilbert denoted the value of X(λ) at the distinguished variable $x^{(k)}$ by $X(\lambda)_k$ and proved that these values have left ($\tau_k^-(\lambda)$) and right ($\tau_k^+(\lambda)$) derivatives with respect to λ for all k and all real λ, and that they are non-decreasing functions of λ. The set of λ for which there is a k such that $\tau_k^-(\lambda) \neq \tau_k^+(\lambda)$ is countable and is called the point or discontinuous spectrum of K; its elements are the eigenvalues of K.

For those λ not in the point spectrum of K, Hilbert defined the quadratic form

\[
\tau(\lambda) = \sum_{p,q=1}^{\infty} \tau_{pq}(\lambda)\, x_p x_q
\]

where $\tau_{pq}$ is the common value of the right and left derivatives of $X_{pq}$ at λ. For an element $\lambda_h$ of the point spectrum of K, the quadratic eigenform belonging to $\lambda_h$ is defined as

\[
E_h(x, x) = E_h = \sum_{p,q=1}^{\infty} \left( \tau_{pq}^{+}(\lambda_h) - \tau_{pq}^{-}(\lambda_h) \right) x_p x_q,
\]

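where the right and left derivatives exist and are not equal at $\lambda_h$ by assumption. These eigenforms satisfy $\sum_p E_p \leq (x, x)$ and from these forms Hilbert defined the two additional forms

\[
\varepsilon(\lambda) = \sum_{\lambda_p < \lambda} E_p(x, x)
\]

and

\[
\eta(\lambda) = \sum_{\lambda_p < \lambda} (\lambda - \lambda_p)\, E_p(x, x),
\]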

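where η also has right and left derivatives at every point, which are equal everywhere except for those λ belonging to the point spectrum of K. Next, $\varrho(\lambda) \overset{\text{def}}{=} X(\lambda) - \eta(\lambda)$ is proved to be a continuously differentiable function of λ, which is used to define the spectral form of K as

\[
\sigma(\lambda) \overset{\text{def}}{=} \sum_{p,q=1}^{\infty} \sigma_{pq}(\lambda)\, x_p x_q = \frac{d\varrho}{d\lambda} = \tau(\lambda) - \varepsilon(\lambda).
\]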

The set of real λ such that every neighborhood of λ contains points λ′ with σ(λ) ≢ σ(λ′), as forms in x, is called the line or continuous spectrum of K. The union of the point and continuous spectrum is called the spectrum of K. From the assumption that the spectrum of $K_n$ is uniformly bounded and the fact that, outside the spectrum of K, the coefficients of X are linear and those of ε are constant, it follows that the spectrum of K is contained in some finite interval. Finally, by combining many results, Hilbert was able to obtain the resolvent of K as

\[
K(\lambda; x, x) = \sum_p E_p(x, x) \left( 1 - \frac{\lambda}{\lambda_p} \right)^{-1} + \int_s \left( 1 - \frac{\lambda}{\mu} \right)^{-1} d\sigma(\mu), \tag{12}
\]

where the sum is taken over all $E_p$ for which $\lambda_p$ is in the point spectrum of K, and s is its continuous spectrum. How he obtained this result is outside the scope of this work, and I refer to [3] or [9] for a detailed discussion.

To sharpen his results, Hilbert went on by defining the concept of bounded forms. These are the forms for which there exists a non-negative number M such that |K(x, y)| < M whenever (x, x) < 1 and (y, y) < 1. This definition extends in a natural way to linear forms $L(x) = \sum_{p=1}^{\infty} l_p x_p$ and is obviously equivalent to the modern definition of boundedness.

At the same time he also introduced continuity of a functional in infinitely many variables. Such a functional F is continuous at $a = (a_1, a_2, \ldots)$ if

\[
\lim_{(\varepsilon_1^2 + \varepsilon_2^2 + \cdots) \to 0} F(a_1 + \varepsilon_1, a_2 + \varepsilon_2, \ldots) = F(a_1, a_2, \ldots).
\]

From this definition it is obvious that any bounded linear form is continuous. These definitions made it possible to extend his previous results to the case where the spectrum of the kernel K has infinity as a point of accumulation. They also made it possible to prove the spectral radius theorem, which in this case states that the spectrum of K is bounded away from zero by $M^{-1}$, where M is the smallest bound for K. This means that λ = 0 is not an eigenvalue, but it can happen that the absolute values of the eigenvalues tend to infinity.

Following Hilbert ([9], p. 137, Satz 32), we summarize these results in a major theorem.

Theorem 7.1. Let K(x, x) be a bounded quadratic form in the infinitely many variables $x = (x_1, x_2, \ldots)$. The resolvent K(λ; x, x) of K(x, x) is a single well determined quadratic form

\[
K(\lambda; x, x) = \sum_{p,q} k_{pq}(\lambda)\, x_p x_q
\]

whose coefficients are regular analytic functions for all λ outside the spectrum of K. For such λ, the resolvent is a bounded form; it represents, for all arbitrary values of the infinitely many variables $x_1, x_2, \ldots$, an analytic function of λ.

The resolvent permits, for arbitrary values of the infinitely many variables $x_1, x_2, \ldots$ and sufficiently small λ, the power series representation

\[
K(\lambda; x, x) = (x, x) + \lambda K(x, x) + \lambda^2 K^2(x, x) + \cdots.
\]

Furthermore, for arbitrary values of the infinitely many variables and for all λ outside the spectrum of K, the resolvent satisfies the partial fraction representation

\[
K(\lambda; x, x) = \sum_{(p,\infty)} E_p(x, x) \left( 1 - \frac{\lambda}{\lambda_p} \right)^{-1} + \int_s \frac{d\sigma(\mu)}{1 - \dfrac{\lambda}{\mu}},
\]

where the sum is taken over the entire point spectrum of K, namely extending over all eigenvalues, if necessary with the inclusion of the eigenvalue ∞. $E_p$ denotes the quadratic eigenform belonging to $\lambda_p$; it is a bounded form which is negative for no set of values of the variables $x_1, x_2, \ldots$. The spectral form σ(λ) is a bounded form of the infinitely many variables $x_1, x_2, \ldots$, and indeed represents, for each of these sets of variables, a function which is continuous with respect to λ. Moreover it increases with increasing λ inside the continuous spectrum s – except for special values of $x_1, x_2, \ldots$ – but remains constant in every interval outside of s.

In particular, the following equations are satisfied:

\[
(x, x) = \sum_{(p,\infty)} E_p + \int_s d\sigma(\mu)
\]

and

\[
K(x, x) = \sum_{(p,\infty)} \frac{E_p}{\lambda_p} + \int_s \frac{d\sigma(\mu)}{\mu}.
\]

The resolvent K(λ; x, y) is related to K(x, y) through the relation

\[
K(\lambda; x, y) - \lambda K(K(\lambda; x, y)) = (x, y)
\]

which is satisfied for all λ outside the spectrum of K.
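In finite dimensions the power series and the resolvent relation of Theorem 7.1 reduce to statements about matrices: the resolvent form has the matrix $(I - \lambda K)^{-1}$, the power series is its Neumann series, and the relation reads $R - \lambda K R = I$. The following sketch checks this numerically. It is an illustration under stated assumptions, not Hilbert's construction: the 2×2 matrix K, the value of λ and the helper names (matmul, identity, neumann_resolvent) are all hypothetical.

```python
def matmul(A, B):
    """Ordinary matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_resolvent(K, lam, terms=60):
    """Partial sum I + lam*K + lam^2*K^2 + ... + lam^terms*K^terms."""
    n = len(K)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = [[lam * v for v in row] for row in matmul(power, K)]
        total = [[t + p for t, p in zip(trow, prow)]
                 for trow, prow in zip(total, power)]
    return total

# Hypothetical symmetric matrix with eigenvalues 3 and 1,
# so the series converges for |lam| < 1/3.
K = [[2.0, 1.0], [1.0, 2.0]]
lam = 0.1
R = neumann_resolvent(K, lam)

# Largest entry of R - lam*K*R - I, i.e. the defect in the resolvent relation.
KR = matmul(K, R)
residual = max(abs(R[i][j] - lam * KR[i][j] - (1.0 if i == j else 0.0))
               for i in range(2) for j in range(2))
```

The residual is zero up to rounding, and R agrees with the explicit inverse of I − λK.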

To illustrate the concepts introduced by Hilbert in these papers, it is interesting to compare them with the basic concepts of spectral theory as presented in a standard textbook on functional analysis. For example, [12] (p. 370–371) gives the following definitions.

Let X ≠ {0} be a complex normed space and T : D(T) → X a linear operator with domain D(T) ⊂ X. With T we associate the operator

\[
T_\lambda = T - \lambda I
\]

where λ is a complex number and I the identity operator on D(T). If $T_\lambda$ has an inverse, we denote it by $R_\lambda(T)$, that is,

\[
R_\lambda(T) = T_\lambda^{-1} = (T - \lambda I)^{-1}
\]

and call it the resolvent operator of T or, simply, the resolvent of T. The text then goes on with

Definition (Regular value, resolvent set, spectrum). Let X ≠ {0} be a complex normed space and T : D(T) → X a linear operator with domain D(T) ⊂ X. A regular value λ of T is a complex number such that

(R1) $R_\lambda(T)$ exists,

(R2) $R_\lambda(T)$ is bounded,

(R3) $R_\lambda(T)$ is defined on a set which is dense in X.

The resolvent set ρ(T) of T is the set of all regular values λ of T. Its complement σ(T) = C \ ρ(T) in the complex plane C is called the spectrum of T, and a λ ∈ σ(T) is called a spectral value of T. Furthermore, the spectrum σ(T) is partitioned into three disjoint sets as follows:

• The point or discrete spectrum $\sigma_p(T)$ is the set of λ such that $R_\lambda(T)$ does not exist. A λ ∈ $\sigma_p(T)$ is called an eigenvalue of T.

• The continuous or line spectrum $\sigma_c(T)$ is the set of λ such that $R_\lambda(T)$ exists and satisfies (R3) but not (R2), that is, $R_\lambda(T)$ is unbounded.

• The residual spectrum $\sigma_r(T)$ is the set of λ such that $R_\lambda(T)$ exists (and may be bounded or not) but does not satisfy (R3), that is, the domain of $R_\lambda(T)$ is not dense in X.

As we can see, all concepts introduced by Hilbert are used, though in terms of operators on Hilbert spaces instead of bilinear and quadratic forms, but the notations are still the same almost 100 years later.

Despite the importance of the discussions above, we have not yet arrived at the most important part concerning the future development of functional analysis. This is the concept of complete continuity. A function of infinitely many variables $F(x_1, x_2, \ldots)$ is called completely continuous at $a = (a_1, a_2, \ldots)$ if

\[
\lim_{\substack{\varepsilon_1 \to 0,\ \varepsilon_2 \to 0,\ \ldots \\ \sum \varepsilon_i^2 < 1}} F(a_1 + \varepsilon_1, a_2 + \varepsilon_2, \ldots) = F(a_1, a_2, \ldots) \tag{13}
\]

whenever $\varepsilon_1, \varepsilon_2, \ldots$ run through any sequence $\varepsilon_1^{(k)}, \varepsilon_2^{(k)}, \ldots$ having the single limit

\[
\lim_{k \to \infty} \varepsilon_1^{(k)} = 0, \qquad \lim_{k \to \infty} \varepsilon_2^{(k)} = 0, \qquad \ldots. \tag{14}
\]

Hilbert used this definition to prove several sufficient conditions for a quadratic form to be completely continuous, of which the most important from our perspective is that the coefficients of K satisfy $\sum_{p,q=1}^{\infty} k_{pq}^2 < \infty$. He proved that $\lim_{n \to \infty} K_n(x, x) = K(x, x)$ uniformly for any completely continuous quadratic form, where, as before, $K_n$ is the n-section of K. He continued by showing numerous results on completely continuous forms, such as that they attain their maximum value on closed and bounded sets, that their continuous spectrum is empty, and that their eigenvalues have no finite point of accumulation. This led him to a further generalization of the principal axis theorem.

Theorem 7.2. If K is a completely continuous bounded form, then it can be brought into the following representation through an orthogonal substitution:

\[
K(x, x) = \sum_j k_j\, x_j^2
\]

where the $k_j$ are reciprocal eigenvalues.

It has even been said that this classification of operators having a pure point spectrum with no finite point of accumulation is one of his finest results, since these are in some sense the easiest class of operators apart from the finite dimensional ones.
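The difference between a form that is merely bounded and one that is completely continuous can be seen on the unit vectors $e_k$, whose components each tend to zero as k grows while $(e_k, e_k) = 1$ throughout. The sketch below is an illustration only: it compares the identity form (x, x) with the hypothetical diagonal form $\sum_p x_p^2 / p$, whose coefficients satisfy the square-summability condition mentioned above. Sequences are truncated to N components, and the function names are assumptions made for this example.

```python
def unit_vector(k, n):
    """The k-th unit vector e_k, truncated to n components (1-based k)."""
    return [1.0 if p == k else 0.0 for p in range(1, n + 1)]

def identity_form(x):
    """(x, x) = sum_p x_p^2: bounded, but not completely continuous."""
    return sum(xp * xp for xp in x)

def diagonal_form(x):
    """K(x, x) = sum_p x_p^2 / p, whose coefficients satisfy sum_p 1/p^2 < infinity."""
    return sum(xp * xp / p for p, xp in enumerate(x, start=1))

N = 1000
ks = [1, 10, 100, 1000]
identity_values = [identity_form(unit_vector(k, N)) for k in ks]
diagonal_values = [diagonal_form(unit_vector(k, N)) for k in ks]
```

Along this sequence the identity form stays equal to 1, so it does not tend to its value 0 at the origin, whereas the diagonal form takes the values 1/k and does.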


8 Finalizing the concept of space

8.1 A new way of mathematics – Abstraction prior to problem solving

In chapter 7, the modern reader recognizes almost all aspects of what is now called a Hilbert space, and in particular the Hilbert space $\ell^2$, which played an essential role in Hilbert’s investigations of bilinear and quadratic forms. It seems clear that when Hilbert created his theories, he had Euclidean geometry in mind. This can in particular be seen in connection with his set of distinguished variables, which were chosen such that they would, in today’s notation, have norm one. The same idea applies to the completely continuous forms. In some sense they are the forms which preserve lengths and distances. It is not by coincidence that Hilbert worked with this intuition. At the same time a new concept emerged in mathematics in general – the concept of structure.

Until the middle of the 19th century, mathematics had been something very concrete. The problems dealt with concerned particular objects, such as numbers, points, curves, areas, volumes, surfaces and so on, and the manipulation of these objects had relied heavily on which type of object was under consideration. Around 1840, some mathematicians began to see that the manipulations of these objects did not depend on the nature of the objects, but rather on which rules could be applied, and on how those rules could be applied to numerous different kinds of objects. However, these ideas had to wait another 50 years to mature, and it was not until Cantor had created his set theory that serious investigations could begin. By 1895 the definition of a group on an arbitrary set had been given by Weber in his famous Lehrbuch der Algebra, which was the starting point for an abstraction and axiomatization of algebra, and by 1920 all fundamental notions of algebra had been defined.

In analysis there was no similar development at this time. The central concepts of limits, convergence and continuity had been defined relative to special objects such as curves, surfaces or functions, and no one had considered how they could be generalized to arbitrary sets. Both Fredholm and Hilbert had intentionally avoided this question by claiming that they were interested in explicitly solving integral equations without abstracting concepts for the purpose of abstraction alone. This was about to change when Fréchet went in the complete opposite direction and did everything for the sole purpose of abstraction.

8.2 Adding structure to abstract spaces – The introduction of Topology

To understand how and why Fréchet developed his ideas we need to look at the mathematical environment in Paris around 1900. Paris was the brilliant center of science and mathematics. The old and established mathematicians were still active and provided great knowledge. We had Camille Jordan (1838–1922) and Charles Hermite (1822–1901), of whom Poincaré was a student and Charles Émile Picard (1856–1941) the son-in-law. Édouard Goursat (1858–1936) and Hadamard were about to gain international recognition, and among all these great minds, Fréchet was a student of Hadamard. At the same time as young scientists made pilgrimages to Paris with new ideas, Fréchet could use his talent and all this knowledge at his disposal to further develop some of their ideas. For sure he was not the first to bear the ideas of abstraction in mind, but he was the first with the right amount of youth, naivety and knowledge to formalize all these ideas, and maybe most important of all – he had Hadamard. According to [4], Hadamard was very modest and did not publish very much, and one can only speculate how much of Fréchet’s work is owed to Hadamard.

The first attempt to give a structure to sets of functions was made by Weierstrass in a series of lectures in 1879. He introduced a concept of two functions being “close”: two functions ψ(x) and Ψ(x) are close if, for every x in some interval I, we have that

\[
|\psi(x) - \Psi(x)| < \varepsilon \quad \text{and} \quad \left| \frac{d^k \psi}{dx^k} - \frac{d^k \Psi}{dx^k} \right| < \varepsilon \quad \text{for } k = 1, 2, \ldots, p.
\]

The functions ψ and Ψ are then said to be in an ε-neighborhood of order p. The importance of this definition lies not in its applications, but in the fact that it gave sufficient structure to a set of functions to make the concepts of limits and continuity meaningful. This was improved by the Italians Giulio Ascoli (1843–1896) and Cesare Arzelà (1847–1912) when they tried to extend the work by Cantor on sets of points to sets of curves or functions. In particular, they were interested in sequences of lines and their limits. This led them to the concept of equicontinuity of families of functions, and to the result that a sequence of continuous functions has a uniformly convergent subsequence if the sequence is equicontinuous and bounded. A corollary of this statement is that an equicontinuous and bounded sequence of functions has a subsequence $(f_n)$ such that

\[
\lim_{n \to \infty} \int_a^b f_n(x)\, dx = \int_a^b \lim_{n \to \infty} f_n(x)\, dx.
\]

Arzelà also answered the question of (Riemann) integrability of the limit of a sequence of functions. Let $(f_n)$ be a sequence of functions defined on some interval [a, b]. Then $(f_n)$ is said to converge quasi-uniformly¹ to a function f if for every ε > 0 and every positive integer N one can find an N′ > N such that for each x ∈ [a, b] there is an integer $n_x$, with $N \leq n_x \leq N'$, such that

\[
|f(x) - f_{n_x}(x)| < \varepsilon. \tag{1}
\]

The integrability condition can now be stated as follows: Let $(f_n)$ be a bounded sequence that converges pointwise to f. If all $f_n$ are Riemann integrable over [a, b], then f is Riemann integrable over [a, b] if and only if $(f_n)$ converges quasi-uniformly to f on [a, b].

¹ The difference between quasi-uniform and uniform convergence is that (1) need not hold for all N′ > N.

9 Fréchet on metric spaces

Fréchet began his investigations already in 1904 with a paper which can be considered as an aperitif to his 1906 thesis Sur quelques points du calcul fonctionnel. The thesis is divided into two parts, of which the first deals with abstraction and the second with applications. Fréchet had big ambitions with his project. He hoped that his generalization of analysis would include all previous work by Fredholm and Hilbert as special cases, and even the work by Cantor on point sets. We cite from [3] his motivation for undertaking this task:

“The present work is a tentative first [effort] to establish systematically certain fundamental principles of the Functional Calculus, and then to apply them to certain concrete examples.”

Interestingly, it is this procedure that he felt he had to motivate, by further writing²:

“In proceeding thus, it happens that certain demonstrations are made more difficult because one does without some [of the] more concrete representation[s]. But that which is lost in this way, is largely regained in dispensing with the repetition, several times, of different forms of the same reasoning. One often gains thereby from seeing more clearly that which was essential in the demonstrations ... from the simplifications, and in the freeing [of the proofs] from that which only depends on the particular nature of the elements considered. It is this which we are going to try to do for the Functional Calculus and in particular for the theory of abstract sets.”

Fréchet based his work on two considerations in order to obtain maximum generality: first the notions of Cantor on set theory, and second a characterization of limit. In general a limit in his sets would not be defined, but rather be characterized by two properties, similar to the characterization of group multiplication. The class of sets for which this concept of limit is introduced is called L, and a set E will belong to the class L if, given any infinite sequence of elements $A_1, A_2, \ldots$ of E chosen at random, it is possible to determine whether or not there exists a unique element A (called the limit of $(A_n)$ when it exists) subject to the following conditions:

I. If $A_i = A$ for i = 1, 2, . . ., then the limit is A itself.

II. If A is the limit of $(A_n) = A_1, A_2, \ldots$, then A is the limit of every subsequence $A_{n_1}, A_{n_2}, \ldots$ of $(A_n)$.

In the coming discussion we will assume that all sets under consideration are of class L. That is, all sets have a limit defined and all theory and structure of these sets are compatible with this limit.

We begin by giving several important definitions. The derived set of a set E, denoted by E′, is the set of points which are limits of sequences belonging to E. The set E is closed if E′ ⊂ E and perfect if E′ = E. A is an interior point of E if A is not the limit of any sequence in the complement of E. A set E is called compact if either E has finitely many elements or if every infinite subset of E has at least one limit element. If E is both compact and closed, it is called extremal. These concepts have changed very little since Fréchet defined them. Compact and extremal sets in the sense of Fréchet are now known as relatively sequentially compact and sequentially compact sets, respectively.

² Cited from [3].

He continued by considering real valued functions (functionals) defined on a set E of class L, by him called functional operations, and generalized the notion of continuity. A functional U is said to be continuous in E at A if

\[
\lim_{n \to \infty} U(A_n) = U(A)
\]

for all sequences $(A_n)$ in E such that the limit of $(A_n)$ is A.

Using these definitions he was able to show the extreme value theorem and the intermediate value theorem. Uniform convergence of functionals is defined in a natural way: a sequence of functionals $(U_n)$ converges uniformly to U if, for every ε > 0, $|U_n(A) - U(A)| < \varepsilon$ for all sufficiently large n, independently of A. This allowed him to define compactness for sets of functionals and to generalize equicontinuity for sequences of functionals. A set $\mathcal{U}$ of functionals defined on E is compact if every infinite subset of $\mathcal{U}$ contains a sequence $(U_n)$ of functionals which converges uniformly on E. The set $\mathcal{U}$ is called equicontinuous if, given any ε > 0, there exists an N such that for all n > N we have $|U(A) - U(A_n)| < \varepsilon$ for all $U \in \mathcal{U}$, whenever $(A_n)$ converges to A. The definitions above made it possible for him to prove the Arzelà-Ascoli theorem:

Theorem 9.1 (Arzelà-Ascoli). A necessary and sufficient condition that a set $\mathcal{U}$ of functionals, all defined and continuous on the same extremal set E, be compact is that they be uniformly bounded and equicontinuous on E.

To add further structure to his sets, Fréchet wanted to find a subclass of L such that if E belongs to this subclass and D ⊂ E, then D′ would always be closed. For this purpose he introduced the subclass V for which a neighbourhood has been defined. A neighbourhood³ is a real valued function (A, B) defined on all pairs A, B satisfying the following three properties:

1) (A, B) = (B, A) ≥ 0

2) (A, B) = 0 if and only if A = B

3) There is a real valued function f(ε) which tends to zero with ε such that if (A, B) < ε and (B, C) < ε, then (A, C) ≤ f(ε)

A set E will be of class V if it has a neighbourhood defined on it. This allowed him to define convergence in terms of neighbourhoods, i.e. $(A_n)$ converges to A if the neighbourhoods $(A_n, A)$ converge to zero. These definitions enabled him to show that all derived sets of subsets of a set of class V are closed, and from then on he only considered sets belonging to the class V. He also introduced another definition of major importance for the development of mathematics: a set E of class V is called separable if it contains a denumerable set whose derived set is the entire set E.

Fréchet went on by generalizing more concepts of the real line. First, a sequence $(A_n)$ is called a Cauchy sequence if for every ε > 0 there is an n such that for all p > 0 we have that $(A_n, A_{n+p}) < \varepsilon$. A set of class V is called normal if it is perfect, separable and if every Cauchy sequence has a limit. In particular, a normal set of class V, in which all Cauchy sequences have a limit in the set, is what we now call a complete space. The word complete (vollständig) was first used by Felix Hausdorff (1868–1942) in 1914. Fréchet himself used the word complete for a different purpose. [20]

³ Note that the usage of the word “neighbourhood” is quite different from that of today.

Finally, Fréchet made one final specialization of the class V. This time he replaced condition 3 of the definition of a neighbourhood with the condition that for any elements A, B and C of E, we have that

\[
(A, B) \leq (A, C) + (C, B).
\]

The sets satisfying these three conditions are said to be of class $\mathcal{E}$ and the real valued function (A, B) is called an écart on $\mathcal{E}$. The reason for introducing this is that he wanted to classify the extremal sets C of class $\mathcal{E}$. Thus, in 1906, the modern definition of a metric space⁴ was born and has not changed since.

In the second part of his thesis, Fréchet dealt with very concrete sets of different objects and defined écarts on them. One example is the Fréchet metric (which is used in the study of $C^\infty$ functions)

\[
(x, y) = \sum_{p=1}^{\infty} \frac{1}{p!}\, \frac{|x_p - y_p|}{1 + |x_p - y_p|},
\]

where $x = (x_1, x_2, \ldots)$ and $y = (y_1, y_2, \ldots)$ are sequences of real numbers. Another important example is the set of real valued functions continuous on some interval I, where the limit is taken to be the uniform limit, with écart

\[
(f, g) = \max_{x \in I} |f(x) - g(x)|.
\]

Today this is called the maximum norm and it was well-known even in 1906 that convergence in this norm is uniform. We will not go deeper into the second part of Fréchet’s thesis, since it is a bit outside the scope of this work and does not serve the purpose of motivating the future development. For those interested in Fréchet I can warmly recommend the great articles [20], [21] and [22].
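The two écarts above are easy to experiment with numerically. The following sketch is an illustration under stated assumptions rather than anything from Fréchet's thesis: the infinite sum is truncated to the components supplied, the maximum over the interval is approximated on a finite grid, and the function names frechet_ecart and uniform_ecart are hypothetical.

```python
import math

def frechet_ecart(x, y):
    """Truncated Frechet ecart: sum over p >= 1 of (1/p!) |x_p - y_p| / (1 + |x_p - y_p|)."""
    total = 0.0
    for p, (xp, yp) in enumerate(zip(x, y), start=1):
        d = abs(xp - yp)
        total += d / (1.0 + d) / math.factorial(p)
    return total

def uniform_ecart(f, g, grid):
    """Grid approximation of max over the interval of |f(t) - g(t)|."""
    return max(abs(f(t) - g(t)) for t in grid)

# Hypothetical sample sequences, truncated to three components.
x, y, z = [0.0, 1.0, 2.0], [1.0, 1.0, 1.0], [3.0, -1.0, 0.5]
d_xy = frechet_ecart(x, y)
d_yz = frechet_ecart(y, z)
d_xz = frechet_ecart(x, z)

# Uniform ecart between t and t^2 on [0, 1], sampled at 101 equally spaced points.
grid = [i / 100.0 for i in range(101)]
d_uniform = uniform_ecart(lambda t: t, lambda t: t * t, grid)
```

For these samples the triangle inequality d_xz ≤ d_xy + d_yz can be checked directly, and every value of frechet_ecart stays below e − 1, since each term is smaller than 1/p!.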

9.1 Synthetic geometry – Euclidean geometry in function spaces

In 1908, Eliakim Hastings Moore (1862–1932) published a paper entitled On a form of general analysis with applications to linear differential and integral equations. In this paper he followed in the footsteps of Fréchet and had the same ambition: to include all theory of finite linear systems, infinite linear systems in infinitely many unknowns, integral equations and the work of Hilbert as special cases in his general analysis. His ambitions failed and this paper had almost no influence on the European mathematical community. The reason for this failure has been described in terms of everything from political and social to individualistic and notational. For a somewhat detailed discussion, I refer to [18]. Still, I think that Hellinger and Toeplitz summarize it best when they say that “... solution theory is not accomplished through such axiomatic formulation ...”. Simply put, at this time there was no need for further abstractions. [3]

Schmidt had more success with his approach to the recently developed theories. His aim was to simplify Hilbert's proofs and to generalize some of his results. This resulted in one of his greatest successes, the introduction of geometry into what he called function spaces, which appeared in 1908 in Über die Auflösung linearer Gleichungen mit unendlich vielen Unbekannten. Schmidt started by investigating the set of infinite sequences, which Hilbert had used as the domain of definition for his quadratic forms, but this time Schmidt also allowed complex numbers. Thus, the functions in his space are really sequences of complex numbers $z = \{z_p\}$ with

$$\sum_{p=1}^{\infty} |z_p|^2 < \infty.$$

4 The name metric was given later by F. Hausdorff [4].

He introduced, for what seems to be the first time, the notation ||z|| for what was later to become the norm of z, defined by

$$\|z\|^2 = \sum_{p=1}^{\infty} z_p \bar{z}_p.$$

Remember that Hilbert introduced the notation

$$(z, w) = \sum_{p=1}^{\infty} z_p w_p$$

from which it follows that5

$$\|z\| = \sqrt{(z, \bar{z})}.$$

Two vectors z and w are said to be orthogonal if and only if (z, w) = 0 and (w, z) = 0, from which the generalized Pythagorean theorem follows, and moreover that orthogonal systems of vectors are linearly independent. From our perspective, one important result based on this fact is that an orthogonal system of functions continuous on some interval is countable. He continued by proving the Bessel and Schwarz inequalities, from which the triangle inequality follows, but he does not seem to have been aware that his norm has exactly the same properties as Fréchet's écart.

Schmidt defined a sequence of elements $z_p$ to converge strongly to z if $\|z_p - z\| \to 0$ as $p \to \infty$, and a strong Cauchy sequence as a sequence satisfying $\|z_p - z_q\| \to 0$ as $p, q \to \infty$. He then showed that every strong Cauchy sequence converges strongly to some element z in the space of sequences, i.e. that this space is complete.

Maybe the most interesting part of this paper is about closed subspaces and projections of elements onto subspaces. A subset A of Schmidt's sequence space H is called a closed subspace of H if it is a closed subset of H and if it satisfies the property of algebraic closure. That is, if $w_1$ and $w_2$ are any elements of A, then $a_1 w_1 + a_2 w_2$ is an element of A for all complex numbers $a_1$ and $a_2$. Let A be a fixed closed subspace of H. Schmidt constructed what he called perpendicular functions6. He proved that for any element z of H there are unique elements $w_1$ and $w_2$ such that $z = w_1 + w_2$, where $w_1 \in A$ and $w_2$ is orthogonal to A. In modern terminology, $w_1$ is the projection of z onto A, and the result is known as the projection theorem. Some results regarding $w_2$ are proven, of which one of the most interesting is that $\|w_2\| = \min \|y - z\|$, where y ranges over the elements of A, and that this minimum is attained only for $y = w_1$. Based on this fact, Schmidt called $\|w_2\|$ the distance7 between z and A. [3]
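The decomposition into a projection and a perpendicular part can be illustrated in a finite dimensional stand-in for the sequence space. The sketch below is only an illustration under stated assumptions: vectors are truncated to four complex coordinates, the subspace A is the span of two hypothetical vectors, the modern conjugate-linear inner product is used, and the helper names (`inner`, `gram_schmidt`, `project`) are invented for the example. The orthonormalisation step is the procedure that today carries Schmidt's name.

```python
import math

def inner(z, w):
    """Modern inner product (z, w) = sum of z_p * conj(w_p) on finite sequences."""
    return sum(zp * complex(wp).conjugate() for zp, wp in zip(z, w))

def norm(z):
    return math.sqrt(inner(z, z).real)

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise a list of vectors; (nearly) dependent vectors are dropped."""
    basis = []
    for v in vectors:
        r = list(v)
        for e in basis:
            c = inner(r, e)
            r = [rp - c * ep for rp, ep in zip(r, e)]
        n = norm(r)
        if n > tol:
            basis.append([rp / n for rp in r])
    return basis

def project(z, spanning_vectors):
    """Split z = w1 + w2 with w1 in the span and w2 orthogonal to it."""
    w1 = [0j] * len(z)
    for e in gram_schmidt(spanning_vectors):
        c = inner(z, e)
        w1 = [w + c * ep for w, ep in zip(w1, e)]
    w2 = [zp - wp for zp, wp in zip(z, w1)]
    return w1, w2

A = [[1, 1j, 0, 0], [0, 1, 1, 0]]   # hypothetical vectors spanning the subspace A
z = [2, 1 - 1j, 3j, 4]              # hypothetical element to be projected
w1, w2 = project(z, A)
distance = norm(w2)                 # the distance between z and A
```

Any other element y of the span gives a value of ||y − z|| that is at least `distance`, which is the minimum property described above.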

Then Schmidt suddenly stopped. Having defined the distance between a point and a closed subset, he did not go on to define the distance between any two elements of the space in terms of the norm. The reason is a mystery, since one can see that he was familiar with Fréchet's work and he had proven, but not realized, that his norm has all the properties of an écart. One might also wonder why he chose the name function space for the space of sequences. A possibility is that he had in mind that every sequence $\{z_p\}$ satisfying $\sum_{p=1}^{\infty} |z_p|^2 < \infty$ would define the coefficients in some series expansion of a function, but that he was not able to prove it.

5 Notice the slight difference compared to the modern definition $(z, w) = \sum_{p=1}^{\infty} z_p \bar{w}_p$.

6 Remember that a function is an element in the space, i.e. a sequence.

7 Entfernung.

10 Lebesgue on Integration theory

Ever since Cantor created set theory, mathematicians had been struggling to associate numbers with sets, which would in some sense measure the set. Intuitively this number should always be zero for the empty set, and grow with bigger sets. During the 1880's the Italians Ulisse Dini (1845 – 1918) and Volterra were investigating the relationship between integrability in Dirichlet's sense1 and in Riemann's sense. Hermann Hankel (1839 – 1873) had proposed a theorem, with a proof, stating that functions continuous everywhere except on a nowhere dense set are necessarily integrable. Dini doubted this, but he could not come up with a counterexample. Dini's skepticism was proven right by Volterra, who proved the existence of what we today would call a nowhere dense set with positive outer content, from which it followed that Hankel's theorem was false.

The usefulness of measuring sets began to gain recognition, and in the early 1880's it spread to Germany, where Paul Du Bois-Reymond (1831 – 1889) named sets of content zero2 integrable systems of points, to distinguish them from other nowhere dense sets. In 1882, Axel Harnack (1851 – 1888) introduced a notion similar to that of a property holding almost everywhere. Two functions, f and g, were said to be equal in general if for every δ > 0 the set of points x such that |f(x) − g(x)| > δ is discrete.

Cantor himself, in 1884, tried to define content for subsets of n-dimensional Euclidean space, without much success. His definition relied on the assumption that a certain multiple integral was well defined, a fact that was not sufficiently justified until the work of Jordan in 1892 on multiple integrals. Even with that assumption justified, the distinction between a set and its closure was not clear enough, with the result that the content of a disjoint union of two sets was not in general the sum of their contents, a property which is fundamental and in some sense even defining. After this failure, Cantor lost interest in contents and turned his attention to other areas. Harnack, on the other hand, did not lose interest but picked out the most promising parts of Cantor's work, reconsidered the definition of content, and thought about what would happen if one allowed infinite coverings, by intervals, of a set in n-dimensional Euclidean space. He writes: "in a certain sense, every 'countable' point set has the property that all its points can be enclosed in intervals whose sum [of lengths] is arbitrarily small." That is, Harnack seems to have been the first to consider this property of countable sets, the property that made Émile Borel (1871 – 1956) aware of the usefulness of a theory of measure. [8]
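Harnack's observation about countable sets can be made concrete with the usual modern argument: enclose the k-th point in an interval of length ε/2^k, so that the total length stays below ε. The sketch below is an illustration and not Harnack's own construction; the enumeration of the rationals in [0, 1] and the function names are assumptions, and only finitely many points are covered, although the bound on the total length holds for every number of points.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """Enumerate the first `count` distinct rationals in [0, 1] by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def covering(points, eps):
    """Cover the k-th point (k = 1, 2, ...) by an open interval of length eps / 2**k."""
    intervals = []
    for k, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (k + 1)
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = covering(points, eps)

# Exact arithmetic: the lengths form a geometric series with sum eps * (1 - 2**-50).
total_length = sum(b - a for a, b in intervals)
```

Exact rational arithmetic is used so that the total length can be compared with ε without rounding error.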

In Borel's 1894 doctoral thesis, and later more extensively in 1898, we find the first definition of a countably additive measure, and of what were later to become Borel sets. For those (Borel) sets he defined the (Borel) measure m with the key property that

$$m\Big( \bigcup_{k=1}^{\infty} A_k \Big) = \sum_{k=1}^{\infty} m(A_k)$$

for disjoint (Borel) sets $A_k$, the additive property that Cantor had failed to obtain when defining his measure.

Judging from comments made by Weierstrass, he was never satisfied with the Riemann definition of an integral. In correspondence with Du Bois-Reymond concerning his discovery that Dirichlet's condition for integrability was not sufficient for Riemann

1 That the points of discontinuity form a nowhere dense set. [8]

2 In the 1870's it was unclear which sets were negligible in connection with integration. There were essentially three candidates: sets of Cantor's first species, nowhere dense sets, and sets that can be enclosed in a finite number of intervals of arbitrarily small total length. The last are what we refer to as sets of content zero. [8]

integrability, Weierstrass responded that Dirichlet surely had in mind another, more general, definition than that of Riemann. Weierstrass suggested that Dirichlet had in mind an extension of Cauchy's definition to functions with infinitely many points of discontinuity. Since Hankel had proven that the points of continuity of an integrable function form a dense set, the partition of any interval [a, b] could always be taken such that in the Cauchy sum, $S = \sum f(t_i)(x_i - x_{i-1})$, the $t_i$ are continuity points of f. Working in this direction it should then be possible to extend the integral to a larger class of integrable functions. Weierstrass himself worked out a definition of an integral, which he communicated in a letter to his friend and student Sofia Kovalevskaya (1850 – 1891). The main idea of Weierstrass is to take any interval [a, b] and to assume that in each arbitrarily small part of this interval there are points where the function is defined. For each of these points where the function is defined, erect the ordinate. These ordinates need not overlap continuously, and hence the integral cannot be defined as the area filled up by these ordinates. If we let each of these ordinates be surrounded by a rectangle whose base is δ, then these rectangles overlap, and if we consider the set of those points that are in some rectangle, it is seen that they form a continuum. This continuum has a content $S_\delta$ which is a function of δ. It can be shown that this content decreases with decreasing δ and hence has a limit as δ → 0. Then define

$$\int_a^b f(x)\,dx = \lim_{\delta \to 0} S_\delta.$$

This definition is justified since it coincides with the usual definition for continuous functions. However, there were other problems with this definition. As Volterra pointed out, it does not make integration additive. Weierstrass had thus encountered the same problem as Cantor when trying to define these new concepts. [8]

It was not until 1902 that all these problems were definitively solved in Lebesgue's famous doctoral thesis, Intégrale, longueur, aire. It is here we find all the familiar notions of integration theory, such as measures, Lebesgue measure, sets of measure zero, measurable functions, almost everywhere and, of course, the Lebesgue integral. The only thing missing in this thesis is the fundamental theorem of calculus,

$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x) \quad \text{almost everywhere.} \qquad (1)$$

He knew that a theorem like this should hold, but at the time of his dissertation he was not able to prove it, and it took him another year before he was able to give a complete proof. [17]

It took another few years for Lebesgue's work to mature, and for other mathematicians to realize the importance and usefulness of the Lebesgue integral. In particular, it was now possible to take limits under the integral sign under very general assumptions,

$$\lim_{n \to \infty} \int f_n(x)\,dx = \int \lim_{n \to \infty} f_n(x)\,dx,$$

which was an important step in putting the theory of orthogonal series and Fourier series on a rigorous basis.

Let e be a measurable set and f a function which takes the values f(x) for x ∈ e and 0 elsewhere. The value of the integral is called the integral of f(x) on e and is written

$$F(e) = \int_e f(x)\,dx.$$

With this definition, we can state that if $e_1, e_2, \ldots$ are disjoint, measurable sets and f is Lebesgue integrable on $e_1 \cup e_2 \cup \ldots$, then

$$F(e_1 \cup e_2 \cup \ldots) = F(e_1) + F(e_2) + \cdots.$$

That is, F(e) is a countably additive set function, which is what both Weierstrass and Cantor had been searching for. [16]

11 The creation of modern Functional Analysis

With all these new ideas and concepts, and with the mathematical community divided between abstraction and problem solving, the world waited for someone to come up with a unifying theory. The one who should have credit for this unification is F. Riesz, a Hungarian mathematician working as a high-school teacher at the turn of the century. After he finished his thesis in 1902 (the same year as Lebesgue), he went to Göttingen, where he met Hilbert and became good friends with Schmidt. After his stay in Göttingen he went to Paris, where he made friends with Borel and Lebesgue, so Riesz was indeed the right man to come up with a unifying theory. Back in Hungary, he started working on functional analysis, inspired by his visits. When he wrote about concrete problems he wrote in German and published in German periodicals, and when he wrote about abstract theories he wrote in French and published in French periodicals. [7]

In 1906, Riesz had become sufficiently acquainted with Lebesgue's work to understand that by combining it with Fréchet's work on abstract spaces, he could greatly improve some results by Schmidt. He observed that in the space of continuous functions on some interval I with metric max |f(x) − g(x)| for x ∈ I, the concept of an orthogonal system of continuous functions could be generalized to any orthogonal system of functions, as long as they were integrable in some sense. Since Schmidt had proven that any such system is countable, why not use the integrability condition in the Lebesgue sense, since it was well compatible with countability?

As an example, Riesz considered all bounded and Lebesgue integrable functions defined on a Lebesgue measurable set E with the distance defined as

$$\Big( \int_E \big( f(x) - g(x) \big)^2\,dx \Big)^{1/2},$$

and the convention that all functions with $\int_E |f(x)|\,dx = 0$ are identified with the function everywhere identically equal to zero on E. Thus we see the origin of the important $L^p$-space theory, where in this case p = 2. [8]

Earlier, Hilbert had studied integral equations of the form

$$f(s) = \phi(s) + \int_a^b K(s, t)\,\phi(t)\,dt \qquad (1)$$

where f and K had been assumed to be continuous. With the new theory of Lebesgue integration, Riesz wanted to see if he could extend Hilbert's results to more general functions. His study of equation (1) led to the question of whether the generalized Fourier coefficients of f could be determined relative to a given orthonormal system of functions, $\{\phi_p\}$. Conversely, he was also interested in finding under which circumstances a given sequence of numbers, $\{a_p\}$, was the set of Fourier coefficients of some function f relative to an orthonormal system of functions $\{\phi_p\}$. Pierre Fatou (1878 – 1929) had shown that a necessary condition for this was that the sequence be square summable. Maybe it was also sufficient? This question had not been of much interest before the introduction of the Lebesgue integral, since a positive answer would have seemed very unlikely. However, Riesz (and, independently, Ernst Fischer (1875 – 1954)) was able to give a complete answer to this question with the following celebrated theorem.

Theorem 11.1 (Riesz-Fischer theorem). If $\{\phi_p\}$ is an orthonormal system of square Lebesgue integrable functions defined on some interval [a, b] and if $\{a_p\}$ is a square summable sequence of real numbers, then there exists a square Lebesgue integrable function1 f defined on [a, b] such that

$$a_p = \int_a^b f(x)\,\phi_p(x)\,dx$$

if and only if

$$\sum_{p=1}^{\infty} a_p^2 < \infty.$$

The necessity in this theorem follows from Bessel's inequality and was known to be valid for such functions. Riesz began with the classic case when the orthonormal system is the set of trigonometric functions and the interval [a, b] is [0, 2π]. To establish the theorem in this case, he formed a trigonometric series,

$$\sum_{p=1}^{\infty} a_p \phi_p(x),$$

where the $\phi_p$ are of the form $\frac{1}{p\pi}\cos(px)$ or $\frac{1}{p\pi}\sin(px)$ and $\{a_p\}$ is the given sequence of numbers, and proved that it converges uniformly to a continuous function of bounded variation with derivatives almost everywhere. The function f is then defined to be this derivative where it exists, and to have arbitrary values on sets of measure zero. This f is then shown to be measurable and square Lebesgue integrable and to have the desired Fourier coefficients $a_p$. Thus the theorem is proved when the orthonormal system is the sequence of trigonometric functions.

To prove the general case, Riesz considered a system of infinitely many equations in infinitely many unknowns,

$$a_p = \sum_{q=1}^{\infty} x_q b_{pq}, \quad \text{for } p = 1, 2, \ldots, \qquad (2)$$

where $\{a_p\}$ is the given square summable sequence of numbers, $x_p$ are the unknowns and

$$b_{pq} = \int_0^{2\pi} \psi_p(x)\,\phi_q(x)\,dx. \qquad (3)$$

In the last equation (3), $\{\phi_p\}$ is the orthonormal system of trigonometric functions and $\{\psi_p\}$ is an arbitrary orthonormal system. It was known that if

$$\sum_{r=1}^{\infty} b_{pr} b_{qr} = \delta_{pq} \quad \text{for } p, q = 1, 2, \ldots,$$

then (2) had a square summable solution $x = (x_1, x_2, \ldots)$ given by

$$x_q = \sum_{p=1}^{\infty} b_{pq} a_p. \qquad (4)$$

1 The fact that the function f is square Lebesgue integrable was not proved in the first version of this theorem. [3]

By Fatou's theorem2, Riesz could then prove that

$$\sum_{r=1}^{\infty} b_{pr} b_{qr} = \sum_{r=1}^{\infty} \int_0^{2\pi} \psi_p(x)\,\phi_r(x)\,dx \int_0^{2\pi} \psi_q(x)\,\phi_r(x)\,dx = \int_0^{2\pi} \psi_p(x)\,\psi_q(x)\,dx = \delta_{pq} \quad \text{for } p, q = 1, 2, \ldots$$

and hence the $b_{pq}$ satisfy the correct conditions, which establishes the validity of (4), and hence the $x_p$, p = 1, 2, . . ., can be considered as known.

The special case when $\{\phi_p\}$ is the system of trigonometric functions then ensured that there is a measurable and square Lebesgue integrable function f such that

$$\int_0^{2\pi} f(x)\,\phi_q(x)\,dx = x_q. \qquad (5)$$

By substituting (5) and (3) into (2) he obtained

$$a_p = \sum_{q=1}^{\infty} \int_0^{2\pi} f(x)\,\phi_q(x)\,dx \int_0^{2\pi} \psi_p(x)\,\phi_q(x)\,dx = \int_0^{2\pi} f(x)\,\psi_p(x)\,dx,$$

where again the last equality is valid due to Fatou's theorem. Thus he had proved that $a_p$ is the pth Fourier coefficient of f with respect to the arbitrary orthonormal system $\{\psi_p\}$. The final adjustment that needed to be made is a change of variable to obtain the result for any interval [a, b]. [3]

If the orthonormal system $\{\psi_p\}$ is complete, then the coefficients $b_{pq}$ determine the solution $x_p$ uniquely, and hence f is unique up to an additive function with zero integral. This means that for a fixed, complete orthonormal system we have a one-to-one correspondence between the set of measurable and square Lebesgue integrable functions and the set of square summable sequences.

Less than a month after this publication by Riesz, E. Fischer published basically the same result in the same journal. In 1904, Fischer had published some papers on the Parseval identity for Riemann integrable functions. During this work he had come across an unsuccessful attempt by Harnack to prove that if $S_n$ denotes the nth partial sum of a Fourier series of an integrable function and

$$\lim_{m, n \to \infty} \int_0^{2\pi} (S_n - S_m)^2\,dx = 0,$$

2 If $\{\vartheta_p\}$ is any given orthogonal system, then for arbitrary functions h and g we have that

$$\int_a^b h(x)\,g(x)\,dx = \sum_{p=1}^{\infty} \int_a^b h(x)\,\vartheta_p(x)\,dx \int_a^b g(x)\,\vartheta_p(x)\,dx.$$

then there would exist a limit function $g(x) = \lim_{n \to \infty} S_n(x)$ in general (almost everywhere). It was probably this that led Fischer to introduce what he called mean convergence, defined as follows: let Ω denote the class of square Lebesgue integrable functions on [a, b], and suppose that $f_n \in \Omega$ for n = 1, 2, . . .. Then the sequence of functions $\{f_n\}$ is said to converge in the mean if

$$\lim_{m, n \to \infty} \int_a^b (f_n - f_m)^2\,dx = 0.$$

The sequence $\{f_n\}$ is said to converge in the mean to a function f ∈ Ω if

$$\lim_{n \to \infty} \int_a^b (f - f_n)^2\,dx = 0.$$

Using this definition, Fischer could state his main theorem as

Theorem 11.2. If $\{f_n\}$ converges in the mean, then there exists an f in Ω such that $\{f_n\}$ converges in the mean to f.

That is, the space Ω is complete in the mean3. From now on, we agree to denote Ω by $L^2[a, b]$ or, if there is no danger of confusion regarding the interval involved, just by $L^2$. From this theorem, the Riesz-Fischer theorem follows as a corollary, since given any square summable sequence $\{a_p\}$, the functions $f_n = \sum_{p=1}^{n} a_p \psi_p$ converge in the mean.
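The reason why the partial sums converge in the mean is that, by orthonormality, the mean square distance between two partial sums equals a tail of the series of squared coefficients. The sketch below checks this numerically under stated assumptions: the orthonormal system is taken to be sin(px)/√π on [0, 2π], the coefficients are the hypothetical choice a_p = 1/p, and the integral is evaluated by an equally spaced rectangle rule, which is accurate to rounding error for trigonometric polynomials of this degree. The function names are invented for the example.

```python
import math

def phi(p, x):
    """p-th function of the orthonormal system sin(p x) / sqrt(pi) on [0, 2 pi]."""
    return math.sin(p * x) / math.sqrt(math.pi)

def partial_sum(a, n, x):
    """S_n(x) = sum_{p=1}^{n} a_p * phi_p(x); a[0] holds a_1."""
    return sum(a[p - 1] * phi(p, x) for p in range(1, n + 1))

def mean_square_gap(a, m, n, samples=512):
    """Rectangle-rule value of the integral over [0, 2 pi] of (S_n - S_m)^2."""
    h = 2.0 * math.pi / samples
    return h * sum((partial_sum(a, n, k * h) - partial_sum(a, m, k * h)) ** 2
                   for k in range(samples))

a = [1.0 / p for p in range(1, 41)]        # square summable coefficients a_p = 1/p
gap = mean_square_gap(a, 10, 40)           # integral of (S_40 - S_10)^2
tail = sum(ap ** 2 for ap in a[10:40])     # a_11^2 + ... + a_40^2
```

Since the tails of a convergent series tend to zero, `gap` can be made as small as desired by taking both indices large, which is exactly the condition of convergence in the mean.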

The Riesz-Fischer theorem had immediate influence. Riesz himself showed that the integral equation of the second kind,

$$f(s) = \phi(s) + \int_a^b K(s, t)\,\phi(t)\,dt,$$

could be completely solved under the more relaxed assumptions that $f \in L^2[I]$ and $K \in L^2[I \times I]$, where I = [a, b]. It allowed Fréchet to determine the compact sets in $L^2$, and by considering the metric

$$(f, g) = \int_a^b \big( f(x) - g(x) \big)^2\,dx, \qquad (6)$$

where two functions are identified if they differ only on a set of measure zero, Fréchet could prove the following theorem:

Theorem 11.3. For every continuous, linear functional U defined on $L^2[a, b]$ (with the metric (6)), there is a function $u(x) \in L^2[a, b]$ such that for every $f \in L^2[a, b]$,

$$U(f) = \int_a^b f(x)\,u(x)\,dx.$$

These results are in some sense the unification of the work of Fredholm, Hilbert, Fréchet and Lebesgue. They not only showed that two apparently different sets, $l^2$ and $L^2$, could actually be completely identified; even more importantly, they showed how problem solving led to abstraction, and how these abstractions actually included all

3 I.e. the metric $d(f, g) = \big( \int_a^b (f(x) - g(x))^2\,dx \big)^{1/2}$.

previous work. Hence both Hilbert and Fréchet were right when one claimed that problem solving was the essential part, and the other that abstraction was the essential part. There could not have been a better man to realize this than F. Riesz, who in the beginning worked independently of both the German and the French school, and later had training in both.

11.1 Spectral theory of compact operators

Already in 1907 when F. Riesz studied equations of the form

$$\int_a^b g_n(x)\,f(x)\,dx = a_n \quad \text{for } n = 1, 2, \ldots,$$

where $\{a_n\}$ is a given sequence, $\{g_n\}$ is a sequence of functions that is not necessarily orthonormal, and the problem is to determine f when integration is in the Lebesgue sense, he was led to consider $L^p$ spaces and their relation to $L^q$ spaces. However, it took until 1910 before the theory was sufficiently developed to become commonly accepted and usable. [5]

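His main tools for completing the theory were the Hölder inequalities

$$\sum_{i=1}^{n} |a_i b_i| \le \Big( \sum_{i=1}^{n} |a_i|^p \Big)^{1/p} \Big( \sum_{i=1}^{n} |b_i|^q \Big)^{1/q} \qquad (7)$$

or

$$\Big| \int_M f(x)\,g(x)\,dx \Big| \le \Big( \int_M |f(x)|^p\,dx \Big)^{1/p} \Big( \int_M |g(x)|^q\,dx \Big)^{1/q}, \qquad (8)$$

where 1/p + 1/q = 1, and the Minkowski inequalities

$$\Big( \sum_{i=1}^{n} |a_i + b_i|^p \Big)^{1/p} \le \Big( \sum_{i=1}^{n} |a_i|^p \Big)^{1/p} + \Big( \sum_{i=1}^{n} |b_i|^p \Big)^{1/p} \qquad (9)$$

or

$$\Big( \int_M |f(x) + g(x)|^p\,dx \Big)^{1/p} \le \Big( \int_M |f(x)|^p\,dx \Big)^{1/p} + \Big( \int_M |g(x)|^p\,dx \Big)^{1/p}, \qquad (10)$$

where M is the region of integration. This allowed Riesz to define the space $L^p$ as the set of all functions f, measurable on a set M, for which $|f|^p$ is integrable.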
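The finite versions (7) and (9) are easy to test on random data. The following sketch is a numerical sanity check and in no way a proof; the helper names, the sample sizes and the choice of exponents are assumptions made for the illustration.

```python
import random

def p_norm(v, p):
    """(sum of |v_i|^p)^(1/p) for a finite sequence v."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def holder_gap(a, b, p):
    """Right side minus left side of inequality (7); q is the conjugate exponent."""
    q = p / (p - 1.0)
    return p_norm(a, p) * p_norm(b, q) - sum(abs(s * t) for s, t in zip(a, b))

def minkowski_gap(a, b, p):
    """Right side minus left side of inequality (9)."""
    return p_norm(a, p) + p_norm(b, p) - p_norm([s + t for s, t in zip(a, b)], p)

rng = random.Random(0)   # fixed seed, so that the run is reproducible
gaps = []
for p in (1.5, 2.0, 3.0, 7.0):
    for _ in range(200):
        a = [rng.uniform(-5, 5) for _ in range(8)]
        b = [rng.uniform(-5, 5) for _ in range(8)]
        gaps.append(holder_gap(a, b, p))
        gaps.append(minkowski_gap(a, b, p))

# Both inequalities predict that no gap is negative (up to rounding error).
smallest_gap = min(gaps)
```

For p = 2 and proportional sequences the Hölder gap vanishes, which is the equality case of the Schwarz inequality mentioned earlier.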

Taking the set M to be the closed interval [a, b], Riesz defined strong convergence of a sequence of functions $\{f_n\}$ to a function f in the mean of order p as

$$\lim_{n \to \infty} \int_a^b |f_n(x) - f(x)|^p\,dx = 0.$$

The sequence $\{f_n\}$ is said to converge weakly to f if there is a constant M such that

$$\int_a^b |f_n(x)|^p\,dx < M$$

for every n and if for every x ∈ [a, b] we have

$$\lim_{n \to \infty} \int_a^x \big( f_n(t) - f(t) \big)\,dt = 0.$$

He proved that if $\{f_n\}$ converges weakly to f, then for every $g \in L^q$

$$\lim_{n \to \infty} \int_a^b \big( f(x) - f_n(x) \big)\,g(x)\,dx = 0, \qquad (11)$$

and noted in passing that if $\{f_n\}$ and f are such that (11) is satisfied for every $g \in L^q$, then $\{f_n\}$ converges weakly to f. Condition (11) is the modern definition of weak convergence, which is completely equivalent to that of Riesz. [3]
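A standard example separating the two notions of convergence, not taken from Riesz's paper, is f_n(x) = sin(nx) on [0, 2π] with p = 2. The sequence converges weakly to f = 0, but not strongly. The sketch below checks the three quantities involved with a midpoint rule; the test function g(x) = x², the evaluation point x = 1 and all function names are assumptions made for the illustration.

```python
import math

def integrate(func, a, b, samples=20000):
    """Midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / samples
    return h * sum(func(a + (k + 0.5) * h) for k in range(samples))

def diagnostics(n, a=0.0, b=2.0 * math.pi):
    """Return (energy, primitive, pairing) for f_n(x) = sin(n x) and the limit f = 0."""
    f_n = lambda x: math.sin(n * x)
    g = lambda x: x * x                                   # one fixed test function
    energy = integrate(lambda x: f_n(x) ** 2, a, b)       # integral of |f_n - 0|^2
    primitive = integrate(f_n, a, 1.0)                    # integral of f_n from a to x = 1
    pairing = integrate(lambda x: f_n(x) * g(x), a, b)    # integral in (11), up to sign
    return energy, primitive, pairing

results = {n: diagnostics(n) for n in (1, 10, 100)}
```

The first entry stays equal to π for every n, so there is no strong convergence, while the second and third entries shrink like 1/n, in agreement with the two conditions for weak convergence and with (11).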

With this new machinery, he again set out to study an eigenvalue problem, which turned out to be one of the most fruitful so far. According to [5] (p. 145–146), the resulting paper "is one of the most beautiful [papers] ever written; it is entirely geometric in language and spirit, and so perfectly adapted to its goal that it has never been superseded and that Riesz' proofs can still be transcribed almost verbatim." The paper is entitled Über lineare Funktionalgleichungen and was published in Acta Mathematica4 in 1918.

In finite dimensional linear algebra, an operator is a linear map from a vector space to itself, represented by a square matrix. Given an arbitrary finite dimensional linear operator, one asks: what can we do with it? In some cases the operator permits a complete eigenvalue decomposition; it is diagonalizable. Unfortunately that is not always the case. If the vector space is complex, then there is a basis such that the corresponding matrix is upper triangular. Let T be a linear operator on a finite dimensional complex vector space V. A basis of V is called a Jordan basis for T if T, with respect to this basis, has a corresponding block diagonal matrix

$$\begin{pmatrix} A_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & A_m \end{pmatrix}, \qquad (12)$$

where each $A_j$ is an upper triangular matrix of the form

$$A_j = \begin{pmatrix} \lambda_j & 1 & & 0 \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_j \end{pmatrix}.$$

In each $A_j$, the diagonal consists of some eigenvalue $\lambda_j$ of T, the superdiagonal consists of only 1's and all other entries are zero. If an operator J is of the form (12) it is said to be in Jordan normal form, and the existence of such a form is given by

Theorem 11.4. Let V be a finite dimensional complex vector space. If T is a linear operator on V, then there is a basis of V that is a Jordan basis for T. That is, any operator on a finite dimensional complex vector space can be brought to Jordan normal form by a change of basis.

That V is a complex vector space is essential, since there are linear operators on real vector spaces which do not have any eigenvalues, for example a rotation in the plane. This theorem was first proven by Jordan in 1870. [1]

Riesz never adopted Hilbert's method of dealing with bilinear forms; instead he took the same view as Fredholm and dealt with operators. The natural question was then what one could do if given an arbitrary linear infinite dimensional operator, or which

4 This paper was first published in 1916 in Hungarian, but did not gain any recognition until it was published again in German two years later.

assumptions one had to impose on the operator in order to get "good" properties, like the Jordan normal form in the finite dimensional case. For simplicity he considered the set of continuous functions on the interval [a, b], but he claimed that the theory is easily generalizable ([15] p. 71, cited from [4]):

"The restriction to continuous functions made in this paper is not essential. The reader familiar with the more recent investigations on various function spaces will recognize immediately the general applicability of the method; he will also notice that certain among those, such as the square integrable functions and Hilbert space of infinitely many dimensions, still admit simplifications, whereas the seemingly simpler case treated here may be regarded as a test case for the general applicability."

To start his investigations, he began with a few definitions ([15] p. 72). The set of all continuous functions on the interval [a, b] is called a function space, and the norm of f, denoted ||f||, is the maximum of |f(x)|. Hence the norm is in general positive, and zero only when f is identically zero. Furthermore we have that

$$\|c f(x)\| = |c|\,\|f(x)\|; \qquad \|f_1 + f_2\| \le \|f_1\| + \|f_2\|.$$

By the distance between $f_1$ and $f_2$ we mean the norm $\|f_1 - f_2\| = \|f_2 - f_1\|$. Convergence of a sequence of functions $\{f_n\}$ to a limit function f is then understood as $\|f_n - f\| \to 0$ when n → ∞. If f, $f_1$, $f_2$ are in this function space then so are cf and $f_1 + f_2$, and if $\{f_n\}$ is a convergent sequence in this space, then the limit function f is also in this space. That is, this space of functions is a normed space which is complete with respect to the topology of strong convergence, exactly as defined by Stefan Banach (1892 – 1945) four years later. A transformation T of an element f in this space to a uniquely determined element T[f] in this space is called a linear transformation if it is distributive and bounded, i.e. if for all f, $f_1$, $f_2$ and every constant c we have

T [cf ] = cT [f ]; T [f1 + f2] = T [f1] + T [f2]

and there exists a constant M such that, for all f,

||T [f ]|| ≤M ||f ||.

It follows immediately from this definition that T maps any bounded sequence $\{f_n\}$ of functions to a bounded sequence of functions, and from

||T [fn]− T [f ]|| = ||T [fn − f ]|| ≤M ||fn − f ||

we see that every such T is continuous. A sequence of functions $\{f_n\}$ is called compact, a term due to Fréchet, if every subsequence has a convergent subsequence. Hence every compact sequence is bounded, but in general the converse is not true, as can easily be seen from the sequence $f_n(x) = x^n$ for x ∈ [0, 1]. The sequence is bounded but not compact, since every subsequence tends pointwise to a function which is discontinuous at x = 1.
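The failure of compactness for the powers of x on [0, 1] can also be seen from the distances in the maximum norm: the distance between x^n and x^(2n) equals 1/4 for every n (it is attained where x^n = 1/2), so no subsequence can be a Cauchy sequence in this norm. The sketch below approximates these distances on a grid; the function name and the grid size are assumptions made for the illustration.

```python
def sup_distance(m, n, samples=20001):
    """Grid approximation of max over [0, 1] of |x**m - x**n|."""
    step = 1.0 / (samples - 1)
    return max(abs((k * step) ** m - (k * step) ** n) for k in range(samples))

# Distance between the n-th and the 2n-th function for a few n.
quarter_gaps = [sup_distance(n, 2 * n) for n in (1, 5, 50)]

# For fixed m the distance to x**n approaches 1 as n grows.
far_gap = sup_distance(1, 1000)

# All functions have maximum norm 1, so the sequence is bounded.
norms = [sup_distance(n, 0) if False else max((k / 1000.0) ** n for k in range(1001))
         for n in (1, 5, 50)]
```

Every function in the sequence has norm 1, while the mutual distances do not tend to zero, which is the behaviour described in the text.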

A transformation T that maps every bounded sequence to a compact sequence is called completely continuous. An example of a completely continuous transformation is T[f] = f(a), which maps every function to a constant. As an example of a transformation which is not completely continuous, one can take the identity transformation E[f] = f. From the definition it follows that if T, $T_1$ and $T_2$ are completely continuous and c is any constant, then so are cT and $T_1 + T_2$. Hence the completely continuous transformations form a linear space.

The reason for introducing all these notations and definitions is to be able to treat the eigenvalue problem

$$\phi(x) - \lambda K\big( \phi(x) \big) = f(x), \qquad (13)$$

where f is known, ϕ is the unknown and K is a symmetric (bounded) linear transformation in $L^2$. Fredholm and Hilbert studied this equation under the assumption that the functions involved were continuous on the interval [a, b]. With the new theory of integration, Riesz wanted to improve the results of Fredholm and Hilbert by relaxing the assumptions on the functions and making (13) solvable for a larger class of functions. It turned out that the most successful way to study equation (13) was to let the functions involved be of class $L^2$, and that is why Riesz restricted himself to $L^2$ from here on, but the theory is applicable in $L^p$ for any p > 1.

For the parameter λ, Riesz proved the spectral radius theorem, which says that if $|\lambda| < \frac{1}{\|K\|}$ then (13) has a solution which is unique up to a null function. This is proved by showing that the transformation T = E − λK, where E is the identity transformation on $L^2$, is invertible. Furthermore, he went on to show that if $K\big( f(x) \big)$ is real whenever f(x) is real, then for at least one of the two integrals

$$\int_a^b \Big| \phi(x) \pm \frac{1}{\|K\|} K\big( \phi(x) \big) \Big|^2 dx$$

there is a sequence $\{\phi_n(x)\}$ with

$$\int_a^b |\phi_n(x)|^2\,dx = 1 \quad \text{for } n = 1, 2, \ldots, \qquad (14)$$

such that

$$\lim_{n \to \infty} \int_a^b \Big| \phi_n(x) \pm \frac{1}{\|K\|} K\big( \phi_n(x) \big) \Big|^2 dx = 0. \qquad (15)$$

Riesz was hence faced with the problem of determining which properties one has to impose on K in order for the equation

$$\int_a^b \Big| \phi(x) \pm \frac{1}{\|K\|} K\big( \phi(x) \big) \Big|^2 dx = 0 \qquad (16)$$

to have non-trivial solutions. From the boundedness of K it follows that if $f_n$ converges strongly to f then $K(f_n)$ converges strongly to K(f). From (15), Riesz could conclude that there is a subsequence $\{\phi_{n_j}(x)\}$ of $\{\phi_n(x)\}$ with $\phi_{n_j}$ converging weakly to ϕ. The limit ϕ would then satisfy (16). The problem is that every such subsequence $\{\phi_{n_j}\}$ could converge to a null function, which would result in only trivial solutions. However, that situation cannot occur if K is assumed to be completely continuous, which was Riesz' motivation for introducing the concept of completely continuous operators. If K is assumed to be completely continuous, then by the inequality

$$\Big( \int_a^b \big| \phi'_n(x) \big|^2 dx \Big)^{1/2} - \Big( \int_a^b \Big| \frac{1}{\|K\|} K\big( \phi'_n(x) \big) \Big|^2 dx \Big)^{1/2} \le \Big( \int_a^b \Big| \phi'_n(x) - \frac{1}{\|K\|} K\big( \phi'_n(x) \big) \Big|^2 dx \Big)^{1/2},$$

and the equations (14) and (15), we see that

$$\lim_{j \to \infty} \int_a^b \big| K\big( \phi_{n_j}(x) \big) \big|^2 dx = \|K\|^2.$$

Since the sequence $K(\phi_{n_j})$ converges strongly to some $K(\phi_0)$ we have that5

$$\lim_{j \to \infty} \int_a^b \big| K(\phi_{n_j}) \big|^2 dx = \int_a^b \big| K\big( \phi_0(x) \big) \big|^2 dx = \|K\|^2,$$

but since K is bounded we also have

$$\|K\|^2 = \int_a^b \big| K\big( \phi_0(x) \big) \big|^2 dx \le \|K\|^2 \int_a^b |\phi_0(x)|^2\,dx$$

and hence that

$$\int_a^b |\phi_0(x)|^2\,dx \ge 1,$$

which means that $\phi_0(x)$ is not a null function. Thus it is proved that (16) has non-trivial solutions, and a sufficient condition is that K is completely continuous. [3]
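Both halves of the discussion of equation (13) can be illustrated on a simple example, which is an assumption of this sketch and not taken from Riesz: the kernel k(s, t) = st on [0, 1], for which the operator has norm 1/3. For |λ| < 3 the solution is obtained by successive approximation, that is, by summing the series for the inverse of E − λK, and at λ = 3 the homogeneous equation has the normalized non-trivial solution ϕ0(s) = √3 s. The integral is discretized by the midpoint rule and all function names are hypothetical.

```python
import math

def make_grid(nodes):
    """Midpoints of `nodes` equal subintervals of [0, 1], with their spacing h."""
    h = 1.0 / nodes
    return h, [(j + 0.5) * h for j in range(nodes)]

def apply_operator(kernel, values, grid, h):
    """Midpoint-rule approximation of (K phi)(s) = integral of kernel(s, t) phi(t) dt."""
    return [h * sum(kernel(s, t) * v for t, v in zip(grid, values)) for s in grid]

def neumann_solve(kernel, f, lam, nodes=100, iterations=60):
    """Iterate phi <- f + lam * K(phi); converges when |lam| * ||K|| < 1."""
    h, grid = make_grid(nodes)
    rhs = [f(s) for s in grid]
    phi = rhs[:]
    for _ in range(iterations):
        k_phi = apply_operator(kernel, phi, grid, h)
        phi = [r + lam * kv for r, kv in zip(rhs, k_phi)]
    return grid, phi

kernel = lambda s, t: s * t                 # assumed example kernel, ||K|| = 1/3

# Equation (13) with f(s) = s and lam = 1 has the exact solution phi(s) = 1.5 * s.
grid, phi = neumann_solve(kernel, lambda s: s, lam=1.0)
max_error = max(abs(v - 1.5 * s) for s, v in zip(grid, phi))

# At lam = 1/||K|| = 3 the function phi0(s) = sqrt(3) * s solves phi0 - 3 K(phi0) = 0.
h, grid0 = make_grid(100)
phi0 = [math.sqrt(3.0) * s for s in grid0]
residual = max(abs(p - 3.0 * kp)
               for p, kp in zip(phi0, apply_operator(kernel, phi0, grid0, h)))
phi0_energy = h * sum(p * p for p in phi0)  # approximates the integral of |phi0|^2
```

The operator in this example has rank one and is therefore completely continuous, which is why a non-trivial solution of the homogeneous equation exists at λ = 1/||K||.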

Riesz went on to show that this method is applicable to any λ for which there is a sequence of functions $\{\phi_n\}$ satisfying (14) and (15). These values of λ are called the eigenvalues of K and the non-trivial solutions are called eigenfunctions. Phrased in modern language, Riesz had proved that the continuous spectrum of a real symmetric compact operator in $L^2$ is empty. He closed the discussion about completely continuous operators by proving the Hilbert decomposition theorem for real symmetric completely continuous operators, which states that

$$K\big( f(x) \big) = \sum_{i=1}^{\infty} \frac{1}{\lambda_i} K_i\big( f(x) \big),$$

where the $K_i$ are certain transformations similar to projections and the sum is taken over all eigenvalues. Hence with the completely continuous operators we have an analogy with the finite dimensional case and the Jordan normal form.

This paper is without doubt one of the most significant in the history of functional analysis. It finally settled the analogy with finite dimensional linear algebra and introduced or developed almost every important concept that Banach axiomatized four years later. There are even more astonishing features of this paper that I have not dealt with here, among them the introduction of the adjoint operator T∗ of T, which is used to study inverses of operators. For a more detailed discussion, I refer the reader to [3] or [4], and for a thorough investigation of spectral theory in particular, see [19].

From this point on, the development of functional analysis was explosive. Between 1920 and 1932, both Banach and Hans Hahn (1879 – 1934) published their books on the subject, which contained the major theorems of functional analysis: the Hahn–Banach theorem, the uniform boundedness theorem and the open mapping theorem. With the complete axiomatization of Banach and Hilbert spaces, along with the new rigor of quantum mechanics, many prominent mathematicians turned their attention to functional analysis and developed the theory, and the physicists found uses for it. Thus the establishment was complete and there could be no question about the usefulness of this new theory.

⁵ By an unquoted result by Riesz in the same paper.

A Solution of the Dirichlet and Neumann problems by Fredholm’s method

Because of the significance of Fredholm’s work on the Dirichlet problem we will give it a more careful investigation, following [16]. In the Dirichlet problem one seeks a harmonic function which is continuous on a domain and reduces to a given function on the boundary. The Neumann problem is similar, but instead of prescribing the value of the solution on the boundary of the domain, one prescribes the value of its normal derivative. We will limit ourselves to a domain in the plane which is bounded by a simple closed curve C with continuous curvature and parametrized by arc length. We will refer to an interior or exterior problem depending on whether the domain under consideration is the interior Di or the exterior De of C.

A.1 The interior Dirichlet problem

In the interior problem we seek a function u(P) which is harmonic in Di and whose limit when the point P tends to a point on C is equal to a given continuous function g(s), that is

\[
u_i(s) = g(s). \tag{17}
\]

Following the classical method by Neumann we try to find a harmonic function u of the form

\[
u(P) = \int_C \mu(t) \, \frac{\partial}{\partial n_t} \log \frac{1}{r_{Pt}} \, dt = \int_C \mu(t) \, \frac{\cos(r_{Pt}, n_t)}{r_{Pt}} \, dt. \tag{18}
\]

That is, u is sought as the potential of a double layer µ(t) distributed over C, where rPt is the distance from the point P to the point t on C and nt is the interior normal at the point t of C.

When this double layer is continuous it is known that the potential u is harmonic in Di and De, but discontinuous when we cross C. The interior and exterior limits are then related by

\[
u_i(s) = u(s) + \pi\mu(s), \qquad u_e(s) = u(s) - \pi\mu(s). \tag{19}
\]

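The normal derivatives are continuous,

\[
\left( \frac{\partial u}{\partial n} \right)_i = \left( \frac{\partial u}{\partial n} \right)_e ,
\]

and, by (17) and (19), µ must satisfy the integral equation

\[
\frac{1}{\pi} g(s) = \mu(s) + \int_C K(s,t)\mu(t) \, dt, \tag{20}
\]

where

\[
K(s,t) = \frac{1}{\pi} \frac{\partial}{\partial n_t} \log \frac{1}{r_{st}} = \frac{1}{\pi} \frac{\cos(r_{st}, n_t)}{r_{st}} .
\]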

The kernel K(s, t) is continuous not only for s ≠ t, but also on the diagonal s = t. If we denote the rectangular coordinates of the point s on C by x(s) and y(s), these functions are twice continuously differentiable by hypothesis and they satisfy

\[
\lim_{s,t\to s_0} K(s,t)
= \frac{1}{\pi} \lim_{s,t\to s_0} \frac{(y(s)-y(t))\,x'(t) - (x(s)-x(t))\,y'(t)}{(x(s)-x(t))^2 + (y(s)-y(t))^2}
= \frac{1}{\pi} \cdot \frac{y''(s_0)x'(s_0) - x''(s_0)y'(s_0)}{2}
= \frac{k(s_0)}{2\pi},
\]

where k(s0) is the curvature of C at the point s0. Hence we are allowed to apply the Fredholm theory.
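The value on the diagonal can be checked numerically. The following is only a sanity check under stated assumptions: the curve is a hypothetical ellipse described by an angle parameter instead of arc length, so the unit interior normal is obtained by normalising the tangent, and K carries the factor 1/π from its definition, which gives the curvature divided by 2π as the expected limit.

```python
import numpy as np

A_AXIS, B_AXIS = 2.0, 1.0  # hypothetical semi-axes, counterclockwise ellipse

def point(th):
    return np.array([A_AXIS * np.cos(th), B_AXIS * np.sin(th)])

def tangent(th):
    return np.array([-A_AXIS * np.sin(th), B_AXIS * np.cos(th)])

def kernel(th_s, th_t):
    """K(s, t) = (1/pi) cos(r_st, n_t) / r_st, with n_t the unit interior normal."""
    d = point(th_s) - point(th_t)
    tx, ty = tangent(th_t)
    normal = np.array([-ty, tx]) / np.hypot(tx, ty)
    return (normal @ d) / (np.pi * (d @ d))

def curvature(th):
    """Curvature of the ellipse at parameter value th."""
    speed2 = (A_AXIS * np.sin(th)) ** 2 + (B_AXIS * np.cos(th)) ** 2
    return A_AXIS * B_AXIS / speed2 ** 1.5

th0 = 0.7
near_diagonal = kernel(th0 + 1e-4, th0)      # s very close to t
expected_limit = curvature(th0) / (2 * np.pi)
```

The two numbers `near_diagonal` and `expected_limit` should differ only by a quantity of the order of the step between the two parameter values.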

From the Fredholm theory it follows that either the non-homogeneous equation (20) has a continuous solution µ(s) for any continuous function g(s), or the homogeneous equation

\[
v(s) + \int_C K(s,t)v(t) \, dt = 0 \tag{21}
\]

has a continuous solution v(s) ≢ 0. If we study the last case (21), we see that it is not possible. From (17), (20) and (21) it follows that for the potential v(P) corresponding to the double layer v(s) we have vi(s) ≡ 0, which implies v(P) ≡ 0 in Di, since a harmonic function attains its extremal values on the boundary. Hence we also have

\[
\left( \frac{\partial v}{\partial n} \right)_i = \left( \frac{\partial v}{\partial n} \right)_e \equiv 0.
\]

Moreover, since the potential is harmonic in De we can apply Green’s formula

\[
\iint_{D_e} \left( v_x^2 + v_y^2 \right) dx \, dy = - \int_C v_e \left( \frac{\partial v}{\partial n} \right)_e dt,
\]

which implies that vx = vy ≡ 0 in De and hence that v is constant in De. Since v is zero at infinity, this constant value is necessarily zero. Hence it follows that ve(s) ≡ 0 and from (19) that

\[
v(s) = \frac{1}{2\pi} \left( v_i(s) - v_e(s) \right) \equiv 0,
\]

which means that the homogeneous equation has only the solution v(s) ≡ 0. Hence we have proved

Theorem A.1. The interior Dirichlet problem has a solution for every continuous function g(s) given on the boundary.
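The existence argument can be illustrated by solving equation (20) numerically. The sketch below is not the method of Fredholm or of [16]: it uses a Nyström discretisation with the trapezoidal rule, which is a later numerical technique, on a hypothetical ellipse described by an angle parameter. The function name, the semi-axes and the grid size are assumptions made for the illustration.

```python
import numpy as np

def solve_interior_dirichlet(g, a=2.0, b=1.0, n=128):
    """Return u(px, py) for the double layer potential solving (20) on an ellipse."""
    th = 2 * np.pi * np.arange(n) / n
    x, y = a * np.cos(th), b * np.sin(th)
    dx, dy = -a * np.sin(th), b * np.cos(th)   # derivatives w.r.t. the angle
    w = 2 * np.pi / n                          # trapezoidal weight

    # K(s, t) times the arc-length element at t; rows index s, columns index t.
    X = x[:, None] - x[None, :]
    Y = y[:, None] - y[None, :]
    r2 = X ** 2 + Y ** 2
    np.fill_diagonal(r2, 1.0)                  # avoid 0/0; diagonal is set below
    K = (Y * dx[None, :] - X * dy[None, :]) / (np.pi * r2)
    # Diagonal limit: curvature / (2 pi) times the arc-length element.
    np.fill_diagonal(K, a * b / (2 * np.pi * (dx ** 2 + dy ** 2)))

    # Discrete version of mu + int K mu dt = g / pi.
    mu = np.linalg.solve(np.eye(n) + w * K, g(x, y) / np.pi)

    def u(px, py):
        # Double layer potential (18) evaluated at an interior point.
        X, Y = px - x, py - y
        return w * np.sum(mu * (Y * dx - X * dy) / (X ** 2 + Y ** 2))

    return u

# Boundary data taken from a known harmonic function, so that the computed
# potential can be compared with the exact solution inside the ellipse.
u = solve_interior_dirichlet(lambda x, y: x ** 2 - y ** 2)
value_inside = u(0.5, 0.2)
exact_inside = 0.5 ** 2 - 0.2 ** 2
```

Since the boundary data is the restriction of x² − y², the computed `value_inside` should be close to `exact_inside` for points that are not too close to C.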

A.2 The exterior Dirichlet problem

Analogously with the interior problem, the exterior problem leads to the equation

\[
\frac{1}{\pi} g(s) = \mu(s) - \int_C K(s,t)\mu(t) \, dt, \tag{22}
\]

and the solutions of the corresponding homogeneous equation, obtained by setting the left-hand side to zero, are necessarily constant. However, they need not be zero. Hence the number of linearly independent solutions is equal to one. The same is true for the adjoint equation

\[
\varrho(s) - \int_C K(t,s)\varrho(t) \, dt = 0.
\]

Let ϱ0(s) be a solution of the adjoint equation such that all other solutions are multiples of ϱ0(s). Then it is necessary and sufficient for (22) to have a solution that

\[
\int_C g(s)\varrho_0(s) \, ds = 0.
\]

Now determine a constant c such that g1(s) = g(s) − c is orthogonal to ϱ0(s), and denote by u1 the potential that corresponds to the solution of (22) with g replaced by g1. Then u = u1 + c is a solution of the exterior problem and hence we have

Theorem A.2. The exterior Dirichlet problem has a solution for every continuous function g(s) given on the boundary.
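The one dimensional null space and the role of the constant c can be observed on a discretised version of the exterior equation. As before, this is a sketch under assumptions and not part of the argument in [16]: the curve is a hypothetical ellipse in an angle parameter, the integral is replaced by the trapezoidal rule, the boundary data is an arbitrary smooth example, and the left null vector q of the matrix stands in for ϱ0 multiplied by the arc-length element.

```python
import numpy as np

a, b, n = 2.0, 1.0, 128                       # hypothetical ellipse and grid size
th = 2 * np.pi * np.arange(n) / n
x, y = a * np.cos(th), b * np.sin(th)
dx, dy = -a * np.sin(th), b * np.cos(th)

# Discretised integral operator with kernel K(s, t), rows s and columns t.
X, Y = x[:, None] - x[None, :], y[:, None] - y[None, :]
r2 = X ** 2 + Y ** 2
np.fill_diagonal(r2, 1.0)                     # avoid 0/0; diagonal is set below
M = (Y * dx[None, :] - X * dy[None, :]) / (np.pi * r2)
np.fill_diagonal(M, a * b / (2 * np.pi * (dx ** 2 + dy ** 2)))
M *= 2 * np.pi / n

B = np.eye(n) - M                             # homogeneous exterior equation: B mu = 0
U, sing, _ = np.linalg.svd(B)                 # singular values in decreasing order
ones_residual = np.linalg.norm(B @ np.ones(n))  # constants solve B mu = 0

q = U[:, -1]                                  # left null vector of B
g = np.exp(x) * np.cos(y)                     # hypothetical boundary data
c = (g @ q) / q.sum()                         # makes g - c orthogonal to q
mu, *_ = np.linalg.lstsq(B, (g - c) / np.pi, rcond=None)
solve_residual = np.linalg.norm(B @ mu - (g - c) / np.pi)
```

In this sketch exactly one singular value of `B` is at rounding level, the constant vector is annihilated by `B`, and the shifted data `g - c` gives a consistent system even though `B` is singular.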

A.3 The interior Neumann problem

The methods used to solve the Dirichlet problem can also be applied to the Neumann problem, but instead of considering solutions of type (18) we seek a single layer potential,

\[
u(P) = \int_C \varrho(t) \log \frac{1}{r_{Pt}} \, dt.
\]

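This potential is harmonic on both Di and De and even continuous on C, but its normal derivatives are discontinuous on C. Hence we have the relations

\[
u_i = u_e
\]

and

\[
\left( \frac{\partial u}{\partial n_s} \right)_i + \pi\varrho(s) = \left( \frac{\partial u}{\partial n_s} \right)_e - \pi\varrho(s) = \int_C \varrho(t) \left( \frac{\partial}{\partial n_s} \log \frac{1}{r_{st}} \right) dt.
\]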

Expressing the interior Neumann problem as

\[
\left( \frac{\partial u}{\partial n_s} \right)_i = h(s)
\]

then leads to the integral equation

\[
-\frac{1}{\pi} h(s) = \varrho(s) - \int_C \varrho(t)K(t,s) \, dt, \tag{23}
\]

where the kernel K(t, s) is the adjoint of that encountered when studying the Dirichlet problem. Thus using the same argument as for the Dirichlet problem, we can conclude that (23) has a solution if and only if h(s) is orthogonal to 1.

Theorem A.3. The interior Neumann problem has a solution for every continuous function h(s) such that

\[
\int_C h(s) \, ds = 0.
\]

A.4 The exterior Neumann problem

The exterior Neumann problem leads by exactly the same arguments to the equation

\[
-\frac{1}{\pi} h(s) = \varrho(s) + \int_C \varrho(t)K(t,s) \, dt. \tag{24}
\]

The corresponding homogeneous equation has no non-trivial solutions, since the adjoint homogeneous equation has none, as we have seen in the Dirichlet problem. Thus we have

Theorem A.4. The exterior Neumann problem has a solution for every given continuous function h(s).

References

[1] Axler, Sheldon. Linear Algebra Done Right. Springer Science+Business Media, Inc, New York, 1996.

[2] Bernkopf, Michael. A history of infinite matrices. Archive for History of Exact Sciences, 4(4):308–358, 1968.

[3] Bernkopf, Michael. The development of function spaces with particular reference to their origins in integral equation theory. Archive for History of Exact Sciences, 3(1):1–96, 1975.

[4] Birkhoff, Garrett and Kreyszig, Erwin. The Establishment of Functional Analysis. Historia Mathematica, 11:258–321, 1984.

[5] Dieudonne, Jean. History of Functional Analysis. North–Holland, Amsterdam, 1981.

[6] Fredholm, Ivar. Sur une classe d’equations fonctionnelles. Acta Mathematica, 27(1):365–390, 1903.

[7] Gray, J. D. The shaping of the Riesz representation theorem: A chapter in the history of analysis. Archive for History of Exact Sciences, 31(2):127–187, 1984.

[8] Hawkins, Thomas. Lebesgue’s Theory of Integration: Its Origins and Development. The University of Wisconsin Press, Madison, Milwaukee, London, 1970.

[9] Hilbert, David. Grundzuge einer allgemeinen Theorie der linearen Integralgleichungen. B.G. Teubner, Leipzig und Berlin, 1912.

[10] Johansson, Bo Goran. Matematikens historia. Studentlitteratur AB, Lund, 2004.

[11] Katz, Victor. The History of Mathematics: An Introduction. Addison–Wesley, Reading, 1998.

[12] Kreyszig, Erwin. Introductory Functional Analysis with Applications. John Wiley and Sons, New York, 1989.

[13] Luciano, Erika. At the Origins of Functional Analysis: G. Peano and M. Gramegna on Ordinary Differential Equations. Revue d’Histoire des Mathematiques, 12(1):35–79, 2006.

[14] Monna, A.F. Functional Analysis in Historical Perspective. Oosthoek Publishing Company, Utrecht, 1973.

[15] Riesz, Friedrich. Uber lineare Funktionalgleichungen. Acta Mathematica, 41(1):71–98, 1918.

[16] Riesz, Frigyes and Sz.-Nagy, Bela. Functional Analysis. Frederick Ungar Publishing Co, New York, 1955.

[17] Saxe, Karen. Beginning Functional Analysis. Springer–Verlag, New York, 2002.

[18] Siegmund-Schultze, Reinhard. Eliakim Hastings Moore’s “General Analysis”. Archive for History of Exact Sciences, 52(1):51–89, 1998.

[19] Steen, Lynn Arthur. Highlights in the History of Spectral Theory. American Mathematical Monthly, 80(4):359–381, 1973.

[20] Taylor, Angus E. A study of Maurice Frechet: I. His early work on point set theory and the theory of functionals. Archive for History of Exact Sciences, 27(3):233–295, 1982.

[21] Taylor, Angus E. A study of Maurice Frechet: II. Mainly about his work on general topology, 1909–1928. Archive for History of Exact Sciences, 34(4):279–380, 1985.

[22] Taylor, Angus E. A study of Maurice Frechet: III. Frechet as analyst, 1909–1930. Archive for History of Exact Sciences, 37(1):25–76, 1987.

[23] Vretblad, Anders. Fourier Analysis and Its Applications. Springer–Verlag, New York, 2005.
