
Lecture Notes Winter ’07

Applied Functional Analysis

Lecture 1

What is functional analysis all about?

Recall linear algebra. The general form of a system of d linear equations in d unknowns is

\[ x - Ax = y \]

where $A \in \mathbb{R}^{d \times d}$ is a real matrix and $y \in \mathbb{R}^d$ is a given vector. The vector $x \in \mathbb{R}^d$ is to be determined. Borrowing geometrical terminology, we call the vectors x, y points. The right (= efficient) framework in which to treat such problems is that of vector spaces ($\mathbb{R}^d$) and linear mappings (operators) (A).

Now note that a point $x = (x_1, \dots, x_d)$ can be viewed as a function $x : \{1, \dots, d\} \to \mathbb{R}$, i.e.,

\[ \mathbb{R}^d = \{\, x \mid x : \{1, \dots, d\} \to \mathbb{R} \,\}, \]

the set of all real-valued functions on a set of d elements. So our “points” are actually functions (on a finite set). This leads us to our first answer:

Functional analysis is the “right” (= efficient) setting to treat problems (equations), where the sought-after object is a function on a non-finite set.

Let us look at an example.

Let $k : [0,1] \times [0,1] \to \mathbb{R}$ and $y : [0,1] \to \mathbb{R}$ be continuous functions. Consider the integral equation

\[ x(t) - \int_0^1 k(t,s)\, x(s)\, ds = y(t) \qquad (t \in [0,1]). \]

The unknown “point” x is now a continuous function on [0, 1]. Recall that the set

\[ C[0,1] := \{\, x \mid x : [0,1] \to \mathbb{R} \text{ continuous} \,\} \]


is again a vector space with respect to the pointwise definition

(f +g)(t) := f(t)+g(t), (λf)(t) := λf(t) (f, g ∈ C[0, 1], λ ∈ R, t ∈ [0, 1])

of sum and scalar multiple. So we are still in the setting of vector spaces! But this space is not finite-dimensional:

1.1 Fact The functions $1, t, t^2, t^3, \dots$ in C[0, 1] are linearly independent.

Proof. See LN 1.3.3

So we can amend our first answer.

Functional analysis develops and applies the theory of infinite-dimensional vector spaces.

Now, by abbreviating

\[ (Ax)(t) := \int_0^1 k(t,s)\, x(s)\, ds \qquad (t \in [0,1],\ x \in C[0,1]) \]

the above integral equation becomes

x−Ax = y

as in the linear algebra problem above. Moreover, the mapping

A : C[0, 1] −→ C[0, 1]

is linear (check it!), so the analogy is perfect.

One may try to solve the integral equation by rewriting it as a fixed point problem

\[ x = F(x) := Ax + y \]

and try to determine $x = (I - A)^{-1} y$ via an iterative method:

\[ x_0 := 0, \qquad x_{n+1} = F(x_n) = A x_n + y. \]

This leads to the approximants

\[ 0,\quad y,\quad y + Ay,\quad y + Ay + A^2 y,\quad \dots \]

and hence to the “geometric” series

\[ x = \sum_{n=0}^{\infty} A^n y. \]


Here one needs some notion of convergence (topology) in the infinite-dimensional space C[0, 1]. Actually, one may wish to have a notion of convergence not only in the set of points, i.e. within C[0, 1], but even in the set of operators. With suitable such notions defined and under suitable conditions, one would expect the formula

\[ (I - A)^{-1} = \sum_{n=0}^{\infty} A^n \]

to hold, as if A were a number (geometric series). As we shall see, this is indeed possible and comes under the name Neumann series.
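The iteration $x_{n+1} = Ax_n + y$ can be tried out in the finite-dimensional setting of the opening paragraph. The following is only a minimal sketch: the 2×2 matrix `A`, the vector `y` and the helper names `mat_vec` and `neumann_iterate` are hypothetical choices made for illustration, and the matrix is picked so that its absolute row sums are at most 0.5, which makes the iteration converge.

```python
def mat_vec(A, x):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def neumann_iterate(A, y, steps):
    """Run x_{n+1} = A x_n + y from x_0 = 0 and return x_steps.

    x_steps equals the partial sum y + Ay + ... + A^(steps-1) y.
    """
    x = [0.0] * len(y)
    for _ in range(steps):
        x = [axi + yi for axi, yi in zip(mat_vec(A, x), y)]
    return x


# Hypothetical data: absolute row sums of A are 0.5, so powers of A decay.
A = [[0.2, 0.3], [0.1, 0.4]]
y = [1.0, 2.0]

x = neumann_iterate(A, y, steps=200)

# The residual of x - Ax = y should be at the level of rounding errors.
residual = [xi - axi - yi for xi, axi, yi in zip(x, mat_vec(A, x), y)]
print(x, residual)
```

For this particular choice one can solve x − Ax = y by hand and compare with the iterate; in infinite dimensions the same loop makes sense as soon as a notion of convergence is available.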

There is a second reason why topology/convergence is an indispensable ingredient. In linear algebra you can represent your points (solutions to equations) as

\[ x = \sum_{j=1}^{d} \alpha_j b_j \]

where $b_1, \dots, b_d$ is a basis which is nicely adapted to the problem you are studying. (Think of a basis of eigenvectors in the case that the matrix A is diagonalizable.) You cannot do the same in infinite dimensions, since algebraic bases are inaccessible. But you can hope to represent your solutions x as series

\[ x = \sum_{j=0}^{\infty} \alpha_j b_j \]

where the $b_j$ are “nice” (e.g. again eigenvectors of the linear operator A). But now you again need a notion of convergence to make sense of the infinite sum.

However, even if you have such a notion, there may be no limit, simply because the space is “too small” (as in Q, where there is no limit to the sequence defined by $x_0 = 0$, $x_{n+1} := (2 + x_n)/(1 + x_n)$). So one also needs some sort of completeness.

Now, although there are situations in functional analysis where one needs some general topology, in the parts that concern us here we need only some less abstract theory. Namely, we shall use the language of metric spaces, and the main metrics that we use are induced by so-called norms on vector spaces.
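The sequence in Q mentioned above can be computed exactly. The sketch below uses Python's `fractions.Fraction` so that every term really is a rational number; the helper name `iterate` and the number of steps are illustrative assumptions. The squares of the terms approach 2, but no term squares to exactly 2, which is the sense in which the limit is missing from Q.

```python
from fractions import Fraction


def iterate(n):
    """Return x_0, ..., x_n for x_0 = 0, x_{k+1} = (2 + x_k) / (1 + x_k)."""
    x = Fraction(0)
    terms = [x]
    for _ in range(n):
        x = (2 + x) / (1 + x)
        terms.append(x)
    return terms


terms = iterate(12)

# Distance of x_k^2 from 2, computed exactly in Q.
gaps = [abs(t * t - 2) for t in terms]

for t, g in zip(terms[:6], gaps[:6]):
    print(t, float(g))
```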

Some metric space theory

1.2 Definition Let Ω be a set. A metric on Ω is a mapping d : Ω × Ω −→ [0,∞) satisfying the following three conditions:


1) d(x, y) = d(y, x) for all x, y ∈ Ω (symmetry);

2) d(x, y) = 0 iff x = y, for all x, y ∈ Ω (definiteness);

3) d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ Ω (triangle inequality).

Metrics are to measure distances.

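1.3 Examples (1) The natural euclidean distance on Ω := $\mathbb{R}^d$ (different d’s!)

\[ d(x,y) := |x - y| := \Big( \sum_{j=1}^{d} |x_j - y_j|^2 \Big)^{1/2} \qquad (x, y \in \mathbb{R}^d). \]

(2) The discrete metric on any set Ω given by

\[ d(x,y) := \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \neq y. \end{cases} \]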

(3) Many more below.
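The two examples are easy to experiment with. The following sketch implements both metrics and checks the three conditions of Definition 1.2 by brute force on a small sample of points; the sample and the helper name `check_metric_axioms` are hypothetical, and a finite check of this kind is of course an illustration, not a proof.

```python
import itertools
import math


def euclidean(x, y):
    """Euclidean distance between two points of R^d given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def discrete(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1


def check_metric_axioms(d, points, tol=1e-12):
    """Test symmetry, definiteness and the triangle inequality on a sample."""
    for x, y, z in itertools.product(points, repeat=3):
        if abs(d(x, y) - d(y, x)) > tol:
            return False
        if (d(x, y) == 0) != (x == y):
            return False
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False
    return True


# Hypothetical sample points in R^2.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (-1.5, 0.5)]

print(check_metric_axioms(euclidean, sample))
print(check_metric_axioms(discrete, sample))
```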

Each metric induces a notion of convergence: let (xn)n ⊂ Ω be a sequence in Ω and x ∈ Ω a point. Then we say that xn → x (with respect to the metric d) if d(xn, x) → 0, i.e.,

∀ ε > 0 ∃ N ∈ N : d(xn, x) < ε (n ≥ N).

If this is so, the point x is called the limit of the sequence (xn)n. Limits are unique: if xn → x and xn → x′ then by the triangle inequality

d(x, x′) ≤ d(x, xn) + d(xn, x′) → 0 as n →∞

whence d(x, x′) = 0. By definiteness, x = x′.

1.4 Exercise What does a convergent sequence look like in the discrete metric?

A Cauchy sequence with respect to a metric d is a sequence (xn)n ⊂ Ω such that d(xn, xm) → 0 as n, m → ∞, i.e.,

∀ ε > 0 ∃ N ∈ N : d(xn, xm) < ε (n, m ≥ N).

A convergent sequence is always a Cauchy sequence, since by the triangle inequality one has

d(xn, xm) ≤ d(xn, x) + d(x, xm) (n, m ∈ N).


In general, the converse is false: there are metric spaces with non-convergent Cauchy sequences. (Example: Q with the natural euclidean metric.) A metric is called complete if every Cauchy sequence converges. The metrics in the examples above are complete metrics.

If a metric d is given on Ω then to x ∈ Ω and ε > 0 one defines the ball of radius ε around x

\[ B_\varepsilon(x) := \{\, y \in \Omega \mid d(x,y) < \varepsilon \,\}. \]

This ball is also called the ε-neighbourhood of x and written $N_\varepsilon(x)$ (see LN 2.2.3 and p. 31).

1.5 Exercise What does a ball look like in the discrete metric? In the euclidean metric?

A subset O ⊂ Ω is called open if for each x ∈ O there is an r > 0 such that $N_r(x)$ ⊂ O. (This means, intuitively, that around each point of O there is still space within O. So if you tremble while pointing at x ∈ O you are still in O, but it depends on the point x how much you are allowed to tremble.) We define

\[ \tau_\Omega := \{\, O \subset \Omega \mid O \text{ is open} \,\} \]

and call it the topology defined by the metric d.

1.6 Lemma ∅, Ω are open. If $(O_\iota)_\iota$ is any collection of open sets, then $\bigcup_\iota O_\iota$ is open. If O, W are open then O ∩ W is open. Each ball $N_r(x)$ is open!

Proof. The first two properties are trivial. Suppose that x ∈ O ∩ W. Then there are ε, δ > 0 such that $N_\varepsilon(x)$ ⊂ O and $N_\delta(x)$ ⊂ W. Let r := min(ε, δ). Then $N_r(x) \subset N_\varepsilon(x) \cap N_\delta(x) \subset O \cap W$. Since x ∈ O ∩ W was arbitrary, O ∩ W is open.

Now fix x ∈ Ω and r > 0, and take y ∈ $N_r(x)$. Let ε := d(x, y), so that 0 ≤ ε < r. We claim that $N_{r-\varepsilon}(y) \subset N_r(x)$. By the triangle inequality, for each z ∈ $N_{r-\varepsilon}(y)$

\[ d(x,z) \le d(x,y) + d(y,z) = \varepsilon + d(y,z) < \varepsilon + (r - \varepsilon) = r. \]

Hence z ∈ $N_r(x)$ as claimed.

A set F ⊂ Ω is called closed if Ω \ F = F^c is open. Attention: usually there are sets which are neither closed nor open, and there may be sets that are both closed and open! (Try to think of examples using the metrics mentioned above. Make sure that you understand that for intervals of the real line, the new notions of open and closed are exactly the same as the common ones.)

1.7 Lemma A set F ⊂ Ω is closed iff for every convergent sequence xn → x in Ω such that all xn ∈ F it follows that also x ∈ F.


Proof. Suppose F is closed and xn → x with all xn ∈ F, but x ∉ F. Then x ∈ O := F^c, which is open. So there is r > 0 such that $B_r(x)$ ⊂ O. This means that d(x, y) ≥ r for all y which are not in O, that is, in F. But since d(xn, x) → 0, eventually xn ∉ F, a contradiction.

To prove the other direction, assume that F is not closed. That means that O := F^c is not open. So there is a point x ∈ O such that $B_r(x) \not\subset O$ for every r > 0. In particular one can find for each n ∈ N a point xn ∉ O with d(xn, x) < 1/n. So xn → x and xn ∈ F = O^c. But x ∉ F.

A subset A of a metric space (Ω, d) becomes a metric space in its own right by restricting the metric d to A. This just means that one measures distances in A as one measures them in Ω, but simply forgets about all the points of Ω \ A.

1.8 Lemma Let A be a subset of a complete metric space Ω. Then A is closed in Ω iff A (as a metric space in its own right) is complete.

Proof. Suppose that A is closed in Ω and let (xn)n ⊂ A be a Cauchy sequence. By completeness of Ω, there is a limit x, i.e. xn → x ∈ Ω. By closedness and Lemma 1.7, x ∈ A. Hence A is complete.

Conversely, suppose that A is complete, take (xn)n ⊂ A and suppose that xn → x for some element x ∈ Ω. We have to show that x ∈ A. Since (xn)n converges in Ω, it is a Cauchy sequence; so it is a Cauchy sequence in A. But A is complete, and therefore there is a limit xn → a ∈ A. But then also xn → a within the metric space Ω, and since limits are unique, x = a ∈ A. By Lemma 1.7, A is closed.

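1.9 Lemma Let (Ω, d) be a metric space and A ⊂ Ω. Then

\[ \overline{A} := \{\, x \in \Omega \mid \exists (x_n)_n \subset A,\ x_n \to x \,\} = \{\, x \in \Omega \mid \forall \varepsilon > 0 : B_\varepsilon(x) \cap A \neq \emptyset \,\} \]

and $\overline{A}$ is closed.

Proof. If $x \in \overline{A}$ then by definition there is a sequence (xn)n in A converging to x. So for given ε > 0 one has d(xn, x) < ε eventually. In particular $B_\varepsilon(x) \cap A \neq \emptyset$. On the other hand, if $B_\varepsilon(x) \cap A \neq \emptyset$ for all ε > 0, then pick $x_n \in B_{1/n}(x) \cap A$ and observe that xn → x. This shows $x \in \overline{A}$.

To see that $\overline{A}$ is closed, take $x \in \Omega \setminus \overline{A}$. By what we have seen before, there must be an ε > 0 such that $B_\varepsilon(x) \cap A = \emptyset$. Since $B_\varepsilon(x)$ is open, every point of it has a ball around it that does not meet A, and this implies $B_\varepsilon(x) \subset \Omega \setminus \overline{A}$. So $\Omega \setminus \overline{A}$ is open, hence $\overline{A}$ is closed.

It is clear that $A \subset \overline{A}$ (constant sequences!). By Lemma 1.7, any closed set B ⊃ A must contain $\overline{A}$. Hence by Lemma 1.9, $\overline{A}$ is the “smallest” closed set containing A. We therefore call $\overline{A}$ the closure of A. The set A is closed iff $A = \overline{A}$.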

The concept of closure is absolutely central in applied functional analysis, because it abstracts the process of approximation by simple objects, as in the Fourier expansion of a complicated function

\[ f(x) = \sum_{k=-\infty}^{\infty} a_k e^{-ikx}. \]

1.10 Definition Let Ω′, Ω be two metric spaces (with metrics d′, d, say). A mapping f : Ω −→ Ω′ is called continuous at x ∈ Ω, if the implication

\[ (x_n)_n \subset \Omega,\ x_n \to x \quad \Longrightarrow \quad f(x_n) \to f(x) \]

holds. The mapping f is simply called continuous if it is continuous at every point x ∈ Ω. (See LN 4.1.1, 4.2.1, 4.1.6).

1.11 Lemma A mapping f : Ω −→ Ω′ is continuous iff $f^{-1}(U)$ is open in Ω for each open set U ⊂ Ω′.

Proof. “if”: Take a sequence xn → x and ε > 0. The ball $U := B_\varepsilon(f(x))$ is open in Ω′, hence by hypothesis $V := f^{-1}(U)$ is open in Ω. Since clearly x ∈ V, one finds δ > 0 such that $B_\delta(x)$ ⊂ V. Since (xn)n converges to x, one has $x_n \in B_\delta(x)$ eventually. Hence $f(x_n) \in f(B_\delta(x)) \subset f(V) \subset U = B_\varepsilon(f(x))$ eventually. This shows that f(xn) → f(x).

“only if”: Take an open U ⊂ Ω′ and set $V := f^{-1}(U)$. We prove that F := Ω \ V is closed. Take a sequence (xn)n ⊂ F such that xn → x ∈ Ω. By hypothesis, f(xn) → f(x) in Ω′. But since xn ∉ V, f(xn) ∉ U. Now, G := Ω′ \ U is closed (since U is open) and hence f(x) = lim_n f(xn) ∈ G, by Lemma 1.7. This implies x ∈ F. Again by Lemma 1.7 we conclude that F is closed, whence V is open.

1.12 Lemma Let (Ω, d) be a metric space. Then the metric itself is a continuous mapping d : Ω × Ω −→ R. This means that if xn → x and yn → y then d(xn, yn) → d(x, y).

Proof. Since d(xn, yn) ≤ d(xn, x) + d(x, y) + d(y, yn) and d(x, y) ≤ d(x, xn) + d(xn, yn) + d(yn, y) one has

|d(xn, yn)− d(x, y)| ≤ d(xn, x) + d(yn, y) → 0.



Lecture 2

Normed spaces

The metric spaces we are usually occupied with are subsets of normed vector spaces.

2.1 Definition (LN 5.1.1). Let E be a linear (= vector) space. A mapping ‖·‖ : E −→ R is called a norm if it has the following properties:

1) ‖x‖ ≥ 0 for all x ∈ E, and ‖x‖ = 0 if and only if x = 0 (definiteness).

2) ‖λx‖ = |λ| ‖x‖ for all x ∈ E, λ ∈ R (homogeneity).

3) ‖x + y‖ ≤ ‖x‖+ ‖y‖ for all x, y ∈ E (triangle inequality).

Before we give examples, here are two simple properties:

2.2 Lemma One has ‖−x‖ = ‖x‖ and |‖x‖ − ‖y‖| ≤ ‖x− y‖ for all x, y ∈ E.

Proof. From 2) one finds ‖−x‖ = ‖(−1)x‖ = |−1| ‖x‖ = ‖x‖. The second is LN Exercise 5.3.2.

The closed unit ball that comes with a norm is

\[ B_E := \{\, x \in E \mid \|x\| \le 1 \,\}. \]

A subset M ⊂ E is bounded if there is c > 0 such that $M \subset cB_E$, i.e., if ‖x‖ ≤ c for all x ∈ M.

As first examples of norms on $E = \mathbb{R}^d$ we have

\[ \|x\|_1 := \sum_{j=1}^{d} |x_j| \quad \text{and} \quad \|x\|_\infty := \max_{j=1,\dots,d} |x_j| \]

for $x = (x_1, \dots, x_d) \in \mathbb{R}^d$ (see Exercise LN 5.3.1).

2.3 Exercise Make a sketch of the unit balls of $\|\cdot\|_1$ and $\|\cdot\|_\infty$ if d = 2!

Each norm ‖·‖ on a linear space E induces a metric via

d(x, y) := ‖x− y‖ (x, y ∈ E).

Indeed: d(x, y) ≥ 0 is obvious; d(x, x) = ‖x − x‖ = ‖0‖ = 0, and if 0 = d(x, y) = ‖x − y‖ then x − y = 0, i.e. x = y. This is 2) in the definition of a metric. For 1), note that d(y, x) = ‖y − x‖ = ‖−(x − y)‖ = ‖x − y‖ = d(x, y). Finally, the triangle inequality comes from

d(x, y) = ‖x − y‖ = ‖(x − z) + (z − y)‖ ≤ ‖x − z‖ + ‖z − y‖ = d(x, z) + d(z, y).

In the example $(\mathbb{R}^d, \|\cdot\|_\infty)$ convergence of a sequence with respect to the induced metric is simply convergence in all coordinates!

2.4 Proposition Let (E, ‖·‖) be a normed space. Then the addition, scalar multiplication and the norm mapping itself are continuous (with respect to the metric induced by the norm). That is to say: if xn → x, yn → y in E and λn → λ in R, then

\[ x_n + y_n \to x + y, \qquad \lambda_n x_n \to \lambda x \qquad \text{and} \qquad \|x_n\| \to \|x\|. \]

Proof. Continuity of addition follows from

d(xn + yn, x + y) = ‖(xn + yn) − (x + y)‖ = ‖(xn − x) + (yn − y)‖ ≤ ‖xn − x‖ + ‖yn − y‖ = d(xn, x) + d(yn, y) → 0.

For the scalar multiplication note that

λnxn − λx = (λn − λ)(xn − x) + λ(xn − x) + (λn − λ)x

and taking norms and using the triangle inequality

d(λnxn, λx) ≤ |λn − λ| ‖xn − x‖+ |λ| ‖xn − x‖+ |λn − λ| ‖x‖ → 0.

For the last statement note that by Lemma 2.2 |‖xn‖ − ‖x‖| ≤ ‖x− xn‖ → 0.

As a corollary we prove

2.5 Corollary Let (E, ‖·‖) be a normed vector space. Then for every x ∈ E, c > 0 the set

\[ x + cB_E = \{\, y \in E \mid \|y - x\| \le c \,\} \]

is closed and

\[ N_c(x) = \{\, y \in E \mid \|y - x\| < c \,\} \]

is open.

Proof. If (yn)n is a sequence in the first set with yn → y, then yn − x → y − x, whence c ≥ ‖yn − x‖ → ‖y − x‖ and so ‖y − x‖ ≤ c. By Lemma 1.7 this shows the closedness of the first set. The second set is open, as we know from the theory of metric spaces.

Convergence is a qualitative notion, whereas a metric quantifies this notion. So in general, two different metrics may lead to the same notion of convergence, i.e. to the same convergent sequences. (Example: d and 2d.) In this case, the two metrics are said to be equivalent.

Likewise, two norms ‖·‖, |||·||| on a linear space E are called equivalent if the induced metrics are equivalent, i.e., if for any sequence (xn)n ⊂ E and point x ∈ E one has

‖xn − x‖ → 0 if and only if |||xn − x||| → 0.

2.6 Lemma Let ‖·‖, |||·||| be two norms on a linear space E. Then the following statements are equivalent:

(i) There is a constant m > 0 such that ‖x‖ ≤ m |||x||| for all x ∈ E.

(ii) For all (xn)n ⊂ E, x ∈ E:

|||xn − x||| → 0 implies ‖xn − x‖ → 0.

Proof. The implication (i)⇒(ii) is trivial. Suppose that (i) does not hold. Then for each n ∈ N there is xn ∈ E such that ‖xn‖ > n |||xn|||. Then xn ≠ 0 for all n, and we can form $y_n := n^{-1}\, |||x_n|||^{-1}\, x_n$. Then

\[ |||y_n||| = n^{-1}\, |||x_n|||^{-1}\, |||x_n||| = n^{-1} \to 0 \]

but

\[ \|y_n\| = n^{-1}\, |||x_n|||^{-1}\, \|x_n\| > n^{-1}\, |||x_n|||^{-1}\, n\, |||x_n||| = 1 \]

and this violates (ii) (with x = 0).

2.7 Exercise What does the estimate (i) say geometrically, when considering unit balls?

By Lemma 2.6, the two norms are equivalent if and only if one finds constants m, M > 0 such that

m ‖x‖ ≤ |||x||| ≤ M ‖x‖ (x ∈ E).

In geometrical terms, norms are equivalent if the unit balls are, after appropriate scaling, contained in each other.
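For the two norms on $\mathbb{R}^d$ introduced earlier such constants are easy to write down: $\|x\|_\infty \le \|x\|_1 \le d\,\|x\|_\infty$. The following sketch checks this two-sided estimate on randomly chosen vectors. The helper names, the dimension 5 and the fixed seed are illustrative assumptions, and a random test is an illustration only.

```python
import random


def norm_1(x):
    """Sum of the absolute values of the coordinates."""
    return sum(abs(t) for t in x)


def norm_inf(x):
    """Largest absolute value among the coordinates."""
    return max(abs(t) for t in x)


def equivalence_holds(x):
    """Check norm_inf(x) <= norm_1(x) <= d * norm_inf(x) for d = len(x)."""
    d = len(x)
    return norm_inf(x) <= norm_1(x) <= d * norm_inf(x)


random.seed(0)  # fixed seed so that the experiment is reproducible
samples = [[random.uniform(-10, 10) for _ in range(5)] for _ in range(1000)]
all_ok = all(equivalence_holds(x) for x in samples)
print(all_ok)
```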

The main result of this section is that finite-dimensional spaces are, in a sense, boring.

2.8 Theorem Let E be a finite-dimensional linear space. Then all norms on E are equivalent.

Proof. By choosing a basis $e_1, \dots, e_d$ in E we may assume that $E = \mathbb{R}^d$ and the $e_j$ are the canonical basis vectors. Let ‖·‖ be any norm on $\mathbb{R}^d$ and let

\[ c := \sum_{j=1}^{d} \|e_j\| > 0. \]

Then for arbitrary $x \in E = \mathbb{R}^d$

\[ \|x\| = \Big\| \sum_{j=1}^{d} x_j e_j \Big\| \le \sum_{j=1}^{d} \|x_j e_j\| = \sum_{j=1}^{d} |x_j|\, \|e_j\| \le c\, \|x\|_\infty. \]

This yields in particular that the norm mapping ‖·‖ is continuous on $\mathbb{R}^d$ (with respect to the metric induced by $\|\cdot\|_\infty$), since

\[ \big|\, \|x\| - \|y\| \,\big| \le \|x - y\| \le c\, \|x - y\|_\infty \qquad (x, y \in \mathbb{R}^d). \]

Now set $c' := \inf\{\, \|x\| \mid \|x\|_\infty = 1 \,\}$ and pick a sequence (xn)n with $\|x_n\|_\infty = 1$ and ‖xn‖ → c′. The set

\[ \{\, x \in \mathbb{R}^d \mid \|x\|_\infty = 1 \,\} \]

is compact (closed and bounded), hence by passing to a subsequence we may suppose that there is x with $\|x\|_\infty = 1$ and $\|x_n - x\|_\infty \to 0$. Then ‖x‖ = c′, as ‖·‖ is continuous. Since x ≠ 0, c′ ≠ 0. Clearly we have

\[ c'\, \|y\|_\infty \le \|y\| \le c\, \|y\|_\infty \qquad (y \in \mathbb{R}^d) \]

and this was what we needed to prove.

2.9 Definition A normed vector space (E, ‖·‖) is called a Banach space if the metric induced by the norm is complete, i.e. if every Cauchy sequence has a limit.

A consequence of Theorem 2.8 is that every finite-dimensional normed space is a Banach space. Indeed, all norms are equivalent, and so have the same Cauchy sequences. (Can you see this?) Hence we can pick a convenient one, and then completeness is known (see LN 3.3.7).

One can describe completeness of the norm also in terms of series. As in finite dimensions, a formal sum

\[ \sum_{j=1}^{\infty} x_j \]

in a normed vector space E is said to be convergent if the sequence of partial sums

\[ s_n := \sum_{j=1}^{n} x_j \qquad (n \in \mathbb{N}) \]

is convergent; if this is the case then

\[ \sum_{j=1}^{\infty} x_j := s := \lim_{n \to \infty} s_n \]

denotes its limit; moreover it is easy to see (LN 4.3.2) that then $\lim_{n \to \infty} x_n = 0$, since

\[ \|x_n\| = \|s_n - s_{n-1}\| \le \|s_n - s\| + \|s_{n-1} - s\| \]

for all n ∈ N. A formal sum as above is called absolutely convergent if

\[ \sum_{j=1}^{\infty} \|x_j\| < \infty. \]

As is already known from one dimension, a convergent sum need not be absolutely convergent. The converse is true if the space is complete, and actually characterizes completeness.

2.10 Lemma Let (E, ‖·‖) be a normed space. Then E is complete if and only if every absolutely convergent series in E is convergent.

Proof. Let the (formal) series be given by $s = \sum_{n=0}^{\infty} x_n$. If this is absolutely convergent then for n ≥ m

\[ \|s_n - s_m\| = \Big\| \sum_{k=m+1}^{n} x_k \Big\| \le \sum_{k=m+1}^{n} \|x_k\| \to 0 \]

by the (repeated) triangle inequality. This implies that (sn)n is a Cauchy sequence, hence convergent if E is complete.

To prove the converse, suppose that (yn)n is a Cauchy sequence. Pick a subsequence $(y_{n_k})_k$ such that

\[ \| y_{n_{k+1}} - y_{n_k} \| \le 2^{-k} \qquad (k \in \mathbb{N}). \]

If $x_k := y_{n_{k+1}} - y_{n_k}$ then clearly $\sum_k x_k$ is absolutely convergent. By assumption, the series is convergent, i.e., $x = \sum_{k=0}^{\infty} x_k$ exists. But this means that

\[ x = \lim_m s_m = \lim_m \sum_{k=0}^{m} x_k = \lim_m \sum_{k=0}^{m} \big( y_{n_{k+1}} - y_{n_k} \big) = \lim_m y_{n_{m+1}} - y_{n_0}. \]

This shows that $y := \lim_m y_{n_m}$ exists. But since

\[ \|y_n - y\| \le \|y_n - y_{n_m}\| + \|y_{n_m} - y\| \]

and (yn)n is a Cauchy sequence, also $\lim_n y_n = y$.

The ℓp-spaces

We shall now see important examples of infinite-dimensional Banach spaces, so-called sequence spaces.

2.11 Definition Let x = (xn)n be a sequence of real numbers. For 1 ≤ p < ∞ we define the p-norm of x by

\[ \|x\|_p := \Big( \sum_{j=1}^{\infty} |x_j|^p \Big)^{1/p} \in [0, \infty]. \]

If $\|x\|_p < \infty$ we call x p-summable. The set of all p-summable sequences is denoted by

\[ \ell^p = \ell^p(\mathbb{N}) := \{\, x = (x_n)_{n \in \mathbb{N}} \mid \|x\|_p < \infty \,\}. \]

So $\ell^p$ is a subset of the set of all functions N → R. It is not at all obvious that $\ell^p$ is a vector space and $\|\cdot\|_p$ is a norm on it, but that is what we are going to prove in the following. Eventually we shall see that $\ell^p$ is in fact a Banach space.

It is easy to see that for any sequence $x = (x_1, x_2, \dots)$ and real number λ one has

\[ \|x\|_p = 0 \iff x = 0 \qquad \text{and} \qquad \|\lambda x\|_p = |\lambda|\, \|x\|_p \]

(with the convention 0 · ∞ = 0). That accounts for the first two properties of a norm. The triangle inequality for the p-norm is called Minkowski’s inequality.

2.12 Theorem Let $x = (x_1, \dots)$ and $y = (y_1, \dots)$ be real sequences and 1 ≤ p < ∞. Then

\[ \|x + y\|_p \le \|x\|_p + \|y\|_p \]

(in the sense that ∞ + a = ∞).

Let us note that from the theorem (and the remarks before it) it follows that $\ell^p$ is a vector space and $\|\cdot\|_p$ is a norm on it.

To prove the theorem we need the following lemma from calculus.


2.13 Lemma Let a, b ≥ 0 be real numbers and 1 ≤ p < ∞. Then

\[ (a + b)^p = \inf_{t \in (0,1)} \Big( t^{1-p} a^p + (1 - t)^{1-p} b^p \Big). \]

Proof. One defines $\varphi(t) := t^{1-p} a^p + (1 - t)^{1-p} b^p$ for t ∈ (0, 1) and uses the usual calculus methods to compute its extremum.

Employing the lemma we proceed as follows: we may assume that $\|x\|_p, \|y\|_p < \infty$. Fix t ∈ (0, 1). Then for every j ∈ N

\[ |x_j + y_j|^p \le (|x_j| + |y_j|)^p \le t^{1-p} |x_j|^p + (1 - t)^{1-p} |y_j|^p. \]

Summing up yields

\[ \|x + y\|_p^p = \sum_{j=1}^{\infty} |x_j + y_j|^p \le t^{1-p} \sum_{j=1}^{\infty} |x_j|^p + (1 - t)^{1-p} \sum_{j=1}^{\infty} |y_j|^p = t^{1-p} \|x\|_p^p + (1 - t)^{1-p} \|y\|_p^p. \]

Now this holds for every t ∈ (0, 1), and taking the infimum yields, again by the lemma,

\[ \|x + y\|_p^p \le \inf_{t \in (0,1)} \Big( t^{1-p} \|x\|_p^p + (1 - t)^{1-p} \|y\|_p^p \Big) = \big( \|x\|_p + \|y\|_p \big)^p. \]

Taking p-th roots completes the proof.
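The calculus lemma behind this proof can be explored numerically. For a, b > 0 the usual computation gives t = a/(a + b) as the critical point of φ, where φ takes the value $(a + b)^p$. The sketch below compares this with a crude grid search; the sample values of `a`, `b`, `p` and the grid size are arbitrary illustrative choices.

```python
def phi(t, a, b, p):
    """The function phi(t) = t^(1-p) a^p + (1-t)^(1-p) b^p from Lemma 2.13."""
    return t ** (1 - p) * a ** p + (1 - t) ** (1 - p) * b ** p


# Hypothetical sample values.
a, b, p = 2.0, 3.0, 2.5

# Critical point obtained from phi'(t) = 0 (valid for a, b > 0).
t_star = a / (a + b)
value_at_min = phi(t_star, a, b, p)

# Crude grid search over (0, 1) for comparison.
grid_min = min(phi(k / 1000, a, b, p) for k in range(1, 1000))

print(value_at_min, (a + b) ** p, grid_min)
```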

2.14 Theorem Let 1 ≤ p < ∞. Then $\ell^p$ is complete, i.e., a Banach space.

Proof. Let us change notation a bit. Instead of sequence notation, we use function notation. So the elements of $\ell^p$ are functions f : N → R, with the n-th component denoted by f(n).

Now take a Cauchy sequence $f_1, f_2, f_3, \dots$ in $\ell^p$. Note that each $f_n$ is now a function on N. The proof follows a standard procedure: first find the limit function by using a weaker notion of convergence (usually pointwise), then prove that the limit function is really the limit with respect to the norm.

Fix j ∈ N. Then obviously

\[ |f_n(j) - f_m(j)| \le \|f_n - f_m\|_p \qquad (n, m \in \mathbb{N}). \]

Hence the sequence $(f_n(j))_n$ is a Cauchy sequence in R. By the completeness of R, it has a limit, say

\[ f(j) := \lim_{n \to \infty} f_n(j). \]

This yields a candidate f : N −→ R for the limit of the sequence $(f_n)_n$. But we still have to prove that $f \in \ell^p$ and $\|f - f_n\|_p \to 0$.

Fix ε > 0 and M = M(ε) ∈ N such that $\|f_n - f_m\|_p \le \varepsilon$ if n, m ≥ M. For fixed N ∈ N we obtain

\[ \sum_{j=1}^{N} |f_n(j) - f_m(j)|^p \le \sum_{j=1}^{\infty} |f_n(j) - f_m(j)|^p = \|f_n - f_m\|_p^p \le \varepsilon^p \]

for all n, m ≥ M. Letting m → ∞ yields

\[ \sum_{j=1}^{N} |f_n(j) - f(j)|^p \le \varepsilon^p \]

for all n ≥ M and all N ∈ N. Letting N → ∞ gives

\[ \|f_n - f\|_p^p = \sum_{j=1}^{\infty} |f_n(j) - f(j)|^p \le \varepsilon^p, \]

i.e. $\|f_n - f\|_p \le \varepsilon$ for n ≥ M. In particular, by Minkowski’s inequality,

\[ \|f\|_p \le \|f_M - f\|_p + \|f_M\|_p \le \varepsilon + \|f_M\|_p < \infty \]

whence $f \in \ell^p$. Moreover, since ε > 0 was arbitrary, $\|f_n - f\|_p \to 0$ as n → ∞, as desired.

The scale of $\ell^p$-spaces is not yet complete. There is another one.

2.15 Definition For a sequence f : N −→ R, the ∞-norm is defined as

\[ \|f\|_\infty = \sup_{n \in \mathbb{N}} |f(n)| \]

and the space $\ell^\infty$ is defined as

\[ \ell^\infty := \{\, f : \mathbb{N} \to \mathbb{R} \mid \|f\|_\infty < \infty \,\}. \]

2.16 Fact The space $\ell^\infty$ is a Banach space with the norm $\|\cdot\|_\infty$.

Proof. Exercise LN 5.3.6


Hölder’s Inequality

We finally provide an important computational tool when dealing with $\ell^p$-spaces. Recall that one can multiply sequences f, g : N −→ R pointwise by

(fg)(n) := f(n) · g(n) (n ∈ N).

For 1 ≤ p ≤ ∞ there is a unique number q ∈ [1,∞] such that

\[ \frac{1}{p} + \frac{1}{q} = 1 \]

(writing 1/∞ = 0). This is called the dual exponent to p (and sometimes denoted by p′).

2.17 Theorem Let 1 ≤ p ≤ ∞ and let q ∈ [1,∞] be the dual exponent. If $f \in \ell^p$ and $g \in \ell^q$ then $fg \in \ell^1$ and

\[ \|fg\|_1 = \sum_{j=1}^{\infty} |f(j)|\, |g(j)| \le \|f\|_p\, \|g\|_q. \]

This is called Hölder’s inequality. In the case p = q = 2 it is called the Cauchy–Schwarz inequality. We leave the proof in the case that p = 1, q = ∞ to the reader. The proof in the case p, q ∈ (1,∞) is similar to the proof of the Minkowski inequality; it rests on a lemma from calculus.

2.18 Lemma Let p, q ∈ (1,∞) such that 1/p + 1/q = 1, and let a, b ≥ 0. Then

\[ ab = \inf_{t > 0} \Big( \frac{t^p}{p}\, a^p + \frac{t^{-q}}{q}\, b^q \Big). \]

Proof. This again is elementary calculus.

Now let us complete the proof of Hölder’s inequality. We may suppose without loss of generality that f(j), g(j) ≥ 0 for all j ∈ N. Fix t ∈ (0,∞). Then

\[ f(j)\, g(j) \le \frac{t^p}{p}\, f(j)^p + \frac{t^{-q}}{q}\, g(j)^q \]

for all j ∈ N, by the lemma. Summing up yields

\[ \|fg\|_1 \le \frac{t^p}{p}\, \|f\|_p^p + \frac{t^{-q}}{q}\, \|g\|_q^q \qquad (t > 0). \]

Now we take the infimum over t > 0 and obtain by the lemma

\[ \|fg\|_1 \le \inf_{t > 0} \Big( \frac{t^p}{p}\, \|f\|_p^p + \frac{t^{-q}}{q}\, \|g\|_q^q \Big) = \|f\|_p\, \|g\|_q \]

as desired.
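Hölder’s inequality can be tested on finitely supported sequences, which lie in every $\ell^p$. The sketch below is an illustration under that assumption: the lists `f` and `g` are hypothetical data standing for sequences that vanish after the fourth entry, and `holder_gap` is an illustrative helper returning the right-hand side minus the left-hand side.

```python
def p_norm(f, p):
    """p-norm of a finitely supported sequence given as a list (1 <= p < inf)."""
    return sum(abs(t) ** p for t in f) ** (1.0 / p)


def holder_gap(f, g, p):
    """Return ||f||_p * ||g||_q - ||fg||_1 for the dual exponent q of p > 1."""
    q = p / (p - 1.0)
    lhs = sum(abs(s * t) for s, t in zip(f, g))
    return p_norm(f, p) * p_norm(g, q) - lhs


# Hypothetical finitely supported sequences.
f = [1.0, -2.0, 0.5, 3.0]
g = [0.3, 1.0, -4.0, 2.0]

for p in (1.5, 2.0, 3.0):
    print(p, holder_gap(f, g, p))
```

In the Cauchy–Schwarz case p = q = 2 the gap vanishes (up to rounding) when g = f, which is one way to see that the inequality cannot be improved in general.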



Lecture 3

Bounded functions

Let I denote any (nonempty) set, and

\[ F(I) := \{\, f \mid f : I \to \mathbb{R} \,\} \]

the set of all real-valued functions on the set I. This is a vector space with the usual pointwise definitions

(f + g)(x) := f(x) + g(x), (λf)(x) := λf(x)

for x ∈ I, f, g ∈ F(I), λ ∈ R.

There is a natural notion of convergence on F(I), the so-called pointwise convergence. We say that a sequence of functions (fn)n on I converges pointwise to a function f : I −→ R, if

fn(x) → f(x) in R as n →∞

for every x ∈ I. In logical notation

∀x ∈ I ∀ε > 0 ∃N = N(x, ε) : |fn(x)− f(x)| ≤ ε (n ≥ N).

This is an important notion, but unless I is finite, it does not come from a norm on F(I) (and if I is not countable, then it does not even come from a metric). Since functional analysis needs norms to work with, it pays the price of not working on the space of all functions but on certain subspaces of it.

Consider the case I = N. Then the space $\ell^p$ is a subspace of F(I) and it has a norm turning it into a Banach space. Moreover,

\[ |f(n)| \le \|f\|_p \qquad (f \in \ell^p(\mathbb{N}),\ n \in \mathbb{N}) \]

shows that convergence in norm implies pointwise convergence.

3.1 Exercise Can you think of a simple example of a sequence $(f_n)_n \subset \ell^p$ such that fn → 0 pointwise but $\|f_n\|_p = 1$ for all n?

The most important example of such a subspace is

\[ B(I) := \{\, f \in F(I) \mid \|f\|_\infty := \sup_{x \in I} |f(x)| < \infty \,\}, \]

the space of bounded functions.

3.2 Lemma The set B(I) is a subspace of F(I) and

\[ \|f\|_\infty := \sup_{x \in I} |f(x)| \qquad (f \in B(I)) \]

defines a norm on it.

Proof. See LN p. 48.

Let f, g ∈ B(I). How can we interpret the distance d(f, g) = ‖f − g‖∞? Well, let ε > 0. Then d(f, g) ≤ ε says that

|f(x)− g(x)| ≤ ε

for all x ∈ I. So fn → f in the norm ‖·‖∞ may be written as

∀ε > 0 ∃N = N(ε) : |fn(x)− f(x)| ≤ ε (n ≥ N,x ∈ I).

Compare this with “pointwise convergence” above: there the “speed of convergence” may depend on the point x, whereas here it does not! The chosen N depends on ε but it is the same (= “uniform”) for every x ∈ I. Therefore we say that (fn) converges to f uniformly (on I) and sometimes call the norm ‖·‖∞ the uniform norm.
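A standard example that separates the two notions is $f_n(x) = x^n$ on I = [0, 1), which is chosen here purely for illustration and does not appear in the text above. At each fixed x the values tend to 0, but evaluating $f_n$ at the moving point 1 − 1/n shows that the supremum over I does not become small. A minimal sketch:

```python
def f(n, x):
    """The n-th function of the example sequence, f_n(x) = x**n."""
    return x ** n


# Pointwise: at the fixed point x = 0.9 the values tend to 0.
pointwise = [f(n, 0.9) for n in (1, 10, 100, 1000)]

# Not uniform: at the moving point x = 1 - 1/n the value stays at least 1/4,
# so sup_x |f_n(x) - 0| cannot tend to 0.
witness = [f(n, 1 - 1 / n) for n in (2, 10, 100, 1000)]

print(pointwise)
print(witness)
```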

3.3 Theorem The space B(I) endowed with the norm ‖·‖∞ is complete, i.e. a Banach space.

Proof. The proof is the same as in the case I = N, where $B(\mathbb{N}) = \ell^\infty$ as introduced in the previous lecture. See also LN 6.3.3. We give an alternative proof based on series, using Lemma 2.10 from the last lecture.

Suppose (fn)n ⊂ B(I) such that $\sum_n \|f_n\|_\infty =: M < \infty$. The trivial inequalities

\[ |f(x)| \le \|f\|_\infty \qquad (x \in I) \]

show that uniform convergence (= convergence in ‖·‖∞) implies pointwise convergence, so we can find the limit by looking at each point separately. Now for fixed x ∈ I,

\[ \sum_n |f_n(x)| \le \sum_n \|f_n\|_\infty = M < \infty \]

whence

\[ f(x) = \lim_n \sum_{j=1}^{n} f_j(x) \]

exists for every x ∈ I. Since for every x ∈ I

\[ |f(x)| \le \sum_n |f_n(x)| \le \sum_n \|f_n\|_\infty = M < \infty, \]

f is indeed a bounded function (M does not depend on x!). Moreover,

\[ \Big| \Big( f - \sum_{j=1}^{n} f_j \Big)(x) \Big| = \Big| f(x) - \sum_{j=1}^{n} f_j(x) \Big| \le \sum_{j=n+1}^{\infty} |f_j(x)| \le \sum_{j=n+1}^{\infty} \|f_j\|_\infty \]

for each x ∈ I, and taking the supremum yields

\[ \Big\| f - \sum_{j=1}^{n} f_j \Big\|_\infty \le \sum_{j=n+1}^{\infty} \|f_j\|_\infty \to 0 \qquad \text{as } n \to \infty. \]

This completes the proof.

Remark: We have proved that if $\sum_n \|f_n\|_\infty < \infty$ then the sum $\sum_n f_n(x)$ converges absolutely and uniformly for x ∈ I. This is often called the Weierstrass M-test.
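As an illustration of the M-test, consider the functions $f_j(x) = \sin(jx)/j^2$ on I = R, an example chosen here and not taken from the notes. Then $\|f_j\|_\infty \le 1/j^2$, and the tail estimate from the proof gives $\sup_x |S_N(x) - S_n(x)| \le \sum_{j > n} 1/j^2 \le 1/n$ for the partial sums $S_n$. The sketch below checks this on a hypothetical grid of sample points.

```python
import math


def partial_sum(n, x):
    """S_n(x) = sum_{j=1}^{n} sin(j x) / j^2."""
    return sum(math.sin(j * x) / j ** 2 for j in range(1, n + 1))


def sup_gap(n, N, grid):
    """Largest value of |S_N(x) - S_n(x)| over the sample points in grid."""
    return max(abs(partial_sum(N, x) - partial_sum(n, x)) for x in grid)


# Hypothetical sample points covering roughly one period.
grid = [k * 0.1 for k in range(64)]

# Compare S_n with a much longer partial sum S_2000.
gaps = {n: sup_gap(n, 2000, grid) for n in (10, 100)}
print(gaps)
```

The observed gaps are bounded by 1/n independently of the sample point, which is the uniformity that the remark refers to.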

Continuous functions

Suppose now that the set I is not only a set, but also a metric space, with metric d. (Think of I being a subinterval of R, if you like.) Then we have the space

\[ C(I) := \{\, f \in F(I) \mid f \text{ is continuous} \,\} \]

and it is easy to see that this is a subspace of F(I), i.e. sums and scalar multiples of continuous functions are continuous. (Exercise!)

As with the space of all functions, this space in general has no natural norm attached to it. But as before, we can restrict ourselves to an even smaller subspace

BC(I) := C(I) ∩ B(I)

the space of bounded and continuous functions.

3.4 Lemma (See LN 6.2.1)

The space BC(I) is a closed subset of B(I) (with respect to the uniform norm).

Proof. This is what is called a “3ε”-argument. Suppose (fn)n ⊂ BC(I) and fn → f uniformly on I, for some bounded function f ∈ B(I). We fix an arbitrary x ∈ I and have to show that f is continuous at x. By the scalar triangle inequality we may write

\[ |f(x) - f(y)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(y)| + |f_n(y) - f(y)| \le 2\, \|f_n - f\|_\infty + |f_n(x) - f_n(y)| \]

for all y ∈ I and n ∈ N. Given ε > 0 choose n so large that ‖fn − f‖∞ < ε. For this n, since fn is continuous at x, we find δ > 0 such that

y ∈ I, d(x, y) < δ ⇒ |fn(x) − fn(y)| < ε.

Then

\[ |f(x) - f(y)| \le 2\, \|f_n - f\|_\infty + |f_n(x) - f_n(y)| < 3\varepsilon \]

for all y ∈ I with d(x, y) < δ.

By general metric space theory (see Lemma 1.8 in Lecture 1), we can conclude that BC(I) is complete, i.e. a Banach space, with respect to the uniform norm. If the metric space I is compact, e.g. a closed finite interval [a, b], then by the Weierstrass theorem, every continuous function on I is already bounded. Hence in this case BC(I) = C(I). Let us sum up what we know so far.

3.5 Corollary (See LN 6.3.4)

If I is any set, B(I) is a Banach space. If I is a metric space then BC(I) is a Banach space. And C[a, b] is a Banach space for [a, b] a compact subinterval of R.

The theorem of Picard–Lindelöf (Cauchy–Lipschitz)

In this section we encounter our first real “application” of functional analysis: we shall show that the general first-order initial value problem

\[ y' = F(x, y), \qquad y(x_0) = y_0 \qquad\qquad (3.1) \]

has a unique local solution, under mild conditions on the function F.

For the time being we make the following assumptions:

a) $(x_0, y_0) \in \mathbb{R}^2$.

b) There are numbers a, b > 0 such that $F : I_a \times J_b \to \mathbb{R}$ is continuous,

where we abbreviate $I_a := [x_0 - a, x_0 + a]$ and $J_b := [y_0 - b, y_0 + b]$. (The setting is one-dimensional, because we want to focus on the functional analysis ingredients; once understood, one can easily cover the higher-dimensional case.) Let us point out that mere continuity of F is too weak for our approach to work, but we shall come to the right additional condition later on. Since F is continuous and the rectangle $I_a \times J_b$ is closed and bounded, F is certainly bounded, so

\[ M := \sup\{\, |F(x, y)| \mid x \in I_a,\ y \in J_b \,\} < \infty. \]

(We shall need this number later.)


A solution of (3.1) is given by a differentiable function $f : I_\alpha \to \mathbb{R}$ for some 0 < α ≤ a, such that $f(x_0) = y_0$, $f(x) \in J_b$ for all $x \in I_\alpha$ and

\[ \frac{d}{dx} f(x) = F(x, f(x)) \qquad (x \in I_\alpha). \]

Integrating the equation and using $f(x_0) = y_0$ yields the integral equation

\[ f(x) = y_0 + \int_{x_0}^{x} F(t, f(t))\, dt \qquad (x \in I_\alpha) \]

and this is even an equivalent formulation in the sense that each continuous function f which satisfies the integral equation is also a solution of our original problem (by the fundamental theorem of calculus).

The idea for the solution now consists in viewing the integral equation as a fixed point problem in (a closed subset A of) the space $C(I_\alpha)$, i.e., we write the equation as f = Tf where

\[ (Tf)(x) := y_0 + \int_{x_0}^{x} F(t, f(t))\, dt \qquad (x \in I_\alpha) \]

and try to find a fixed “point” of T by the naive intuition that the sequence

\[ f_0 \equiv y_0, \qquad f_{n+1} := T f_n \]

should converge to it. This requires some effort:

1) One has to apply T over and over again, so one should find α > 0 and a set $A \subset C(I_\alpha)$ such that Tf is defined for f ∈ A and Tf ∈ A again.

2) One has to choose A in such a way that the sequence (fn)n converges and its limit f lies in A.

3) One needs the notion of convergence in such a way that the mapping T is continuous, because only then can one conclude that the limit f is indeed a fixed point of T:

\[ Tf = T\big( \lim_{n \to \infty} f_n \big) = \lim_{n \to \infty} T f_n = \lim_{n \to \infty} f_{n+1} = f. \]

Write

\[ A_\alpha := \{\, f \in C(I_\alpha) \mid f(x) \in J_b \ \text{for all } x \in I_\alpha \,\}. \]

If α ≤ a then for $f \in A_\alpha$ the function Tf as above is well-defined. Moreover,

\[ |Tf(x) - y_0| \le \Big| \int_{x_0}^{x} F(t, f(t))\, dt \Big| \le \alpha\, \sup\{\, |F(x, y)| \mid x \in I_\alpha,\ y \in J_b \,\} \le \alpha M. \]

Hence if α ≤ b/M and $f \in A_\alpha$ then $Tf \in A_\alpha$ again. So for small α > 0 the set $A_\alpha$ is invariant under the “operator” T, and one can iterate the operation of T on it.
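To see the iteration $f_{n+1} := Tf_n$ at work, take the concrete right-hand side F(x, y) = y with $x_0 = 0$, $y_0 = 1$; this example is an assumption made for illustration and is not taken from the notes. Then every iterate is a polynomial, and T acts on the list of coefficients by exact integration. The helper names below are hypothetical.

```python
from fractions import Fraction


def picard_step(coeffs, y0):
    """Apply (Tf)(x) = y0 + integral_0^x f(t) dt to a polynomial f.

    coeffs[k] is the coefficient of x**k. This encodes the special case
    F(x, y) = y and x0 = 0 only.
    """
    integrated = [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]
    return [Fraction(y0)] + integrated


def picard_iterates(n, y0=1):
    """Return the coefficients of f_n, starting from the constant f_0 = y0."""
    f = [Fraction(y0)]
    for _ in range(n):
        f = picard_step(f, y0)
    return f


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


print(picard_iterates(4))
print(float(evaluate(picard_iterates(15), Fraction(1, 2))))
```

In this example the iterates are exactly the Taylor polynomials of the exponential function, so the last line approximates exp(1/2).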


3.6 Lemma The set $A_\alpha$ is a closed subset of the Banach space $C(I_\alpha)$ (with the uniform norm). Hence it is a complete metric space in its own right.

Proof. If $(f_n)_n \subset A_\alpha$ and fn → f uniformly on $I_\alpha$ then also pointwise. Hence

\[ f(x) = \lim_{n \to \infty} f_n(x) \in J_b \]

for all $x \in I_\alpha$, as $J_b$ is a closed interval.

We are going to show that T has a unique fixed point on Iα if α is smallenough, and if F satisfies the following additional hypothesis:

c) $F$ satisfies a Lipschitz condition in the second variable, i.e., there is some $L > 0$ such that

$$|F(x, y) - F(x, y')| \le L\,|y - y'| \qquad (x \in I_a,\ y, y' \in J_b).$$

Under this hypothesis we have

$$|(Tf)(x) - (Tg)(x)| \le \left|\int_{x_0}^{x} \bigl(F(t, f(t)) - F(t, g(t))\bigr)\, dt\right| \le \alpha \sup\{|F(t, f(t)) - F(t, g(t))| \mid t \in I_\alpha\} \le \alpha L\, \|f - g\|_\infty$$

for all $x \in I_\alpha$, hence

$$\|Tf - Tg\|_\infty \le \alpha L\, \|f - g\|_\infty$$

for all $f, g \in A_\alpha$. If $\alpha < 1/L$ then $q := \alpha L < 1$ and one can apply the following theorem from metric space theory.

3.7 Theorem (Banach’s contraction principle)

Let $(\Omega, d)$ be a complete metric space and let $T : \Omega \to \Omega$ be a mapping such that there exists $q < 1$ with

$$d(Tx, Ty) \le q\, d(x, y) \qquad (x, y \in \Omega).$$

Then $T$ has a unique fixed point $x$. Moreover, starting from any point $x_0 \in \Omega$, the sequence given by $x_{n+1} := T x_n$, $n \in \mathbb{N}$, converges to $x$.

Proof. Uniqueness: If $x, y$ are fixed points then $d(x, y) = d(Tx, Ty) \le q\, d(x, y)$. Since $q < 1$ one must have $d(x, y) = 0$, i.e., $x = y$.

Existence: Start with any $x_0 \in \Omega$ and define the iteration $x_{n+1} := T x_n$. By hypothesis, $T$ is evidently continuous. Hence if $(x_n)_n$ converges to $x$, say, then $Tx = x$, i.e., $x$ is a fixed point. (See the argument above.) To prove convergence, since $\Omega$ is complete, it suffices to prove that the sequence is Cauchy. This is done as follows:

$$d(x_{n+m}, x_n) \le \sum_{j=1}^{m} d(x_{n+j}, x_{n+j-1}) = \sum_{j=1}^{m} d(T^{n+j} x_0, T^{n+j-1} x_0) \le \sum_{j=1}^{m} q^{n+j-1}\, d(Tx_0, x_0) \le d(Tx_0, x_0) \sum_{j=n}^{\infty} q^j$$

and this is small when $n$ is large, independently of $m \ge 1$.

(One can refine the proof to give error estimates, see LN Appendix B.1.4.)
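The iteration and the geometric tail estimate from the proof can be tried out numerically. The following is a minimal sketch, assuming the real line with $d(x, y) = |x - y|$ and the map $T = \cos$ on $[0, 1]$ (both chosen here purely for illustration; the helper name `fixed_point` is hypothetical). The stopping rule uses the bound $d(x_n, x) \le \frac{q^n}{1-q}\, d(Tx_0, x_0)$, which follows from the estimate above by letting $m \to \infty$.

```python
import math

def fixed_point(T, x0, q, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = T(x_n) until the a priori bound guarantees |x_n - x*| <= tol.

    q is an (assumed) contraction constant of T with 0 <= q < 1.
    Returns the last iterate x_n and the number n of applications of T.
    """
    x = T(x0)
    d0 = abs(x - x0)              # d(T x0, x0)
    n = 1                         # invariant: x == x_n
    # Geometric tail from the proof: d(x_n, x*) <= q^n / (1 - q) * d(T x0, x0)
    while q ** n / (1.0 - q) * d0 > tol and n < max_iter:
        x = T(x)
        n += 1
    return x, n

# cos maps [0, 1] into itself and |cos'(x)| = |sin x| <= sin(1) < 1 there,
# so q = sin(1) is a valid contraction constant on [0, 1].
x_star, steps = fixed_point(math.cos, 1.0, q=math.sin(1.0))
```

The bound is usually pessimistic: the true error shrinks with the local contraction rate near the fixed point, which can be much smaller than the global constant $q$.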

Let me add a last thing. Among the hypotheses on $F$, continuity is certainly natural, but the Lipschitz condition may seem odd at first glance. But it is not! Indeed, suppose that $F$ is not only continuous but $C^1$, i.e., continuously differentiable on $I_a \times J_b$. Then

$$|F(x, y) - F(x, y')| \le \int_0^1 \bigl|\partial_2 F(x, y' + t(y - y'))\,(y - y')\bigr|\, dt \le \sup\{|\partial_2 F(s, t)| \mid s \in I_a,\ t \in J_b\}\; |y - y'|$$

and this is the Lipschitz condition. Hence we obtain:

3.8 Corollary Let F : Ia × Jb −→ R be continuously differentiable. Then the

initial value problem

y′ = F (x, y), y(x0) = y0

has a unique local solution.
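To see the Picard iteration $f_0 \equiv y_0$, $f_{n+1} := T f_n$ at work, here is a small discretised sketch. It is an illustration under assumptions not made in the text: the interval is restricted to the right half $[x_0, x_0 + \alpha]$, the integral in $T$ is replaced by the cumulative trapezoidal rule on a uniform grid, and the test problem $y' = y$, $y(0) = 1$ (with solution $e^x$ and Lipschitz constant $L = 1$) is chosen for convenience. The name `picard` is hypothetical.

```python
import math

def picard(F, x0, y0, alpha, n_iter=20, n_grid=2001):
    """Approximate Picard iterates f_{k+1} = T f_k on [x0, x0 + alpha] on a uniform grid.

    (Tf)(x) = y0 + int_{x0}^{x} F(t, f(t)) dt is approximated by the cumulative
    trapezoidal rule, so this is a discretised stand-in for T, not T itself.
    """
    h = alpha / (n_grid - 1)
    xs = [x0 + i * h for i in range(n_grid)]
    f = [y0] * n_grid                       # f_0 = y0 (constant function)
    for _ in range(n_iter):
        g = [F(x, y) for x, y in zip(xs, f)]
        Tf = [y0]
        for i in range(1, n_grid):
            Tf.append(Tf[-1] + 0.5 * h * (g[i - 1] + g[i]))
        f = Tf
    return xs, f

# y' = y, y(0) = 1 on [0, 0.5]; here alpha * L = 0.5 < 1, so T is a contraction.
xs, f = picard(lambda x, y: y, x0=0.0, y0=1.0, alpha=0.5)
err = max(abs(fi - math.exp(xi)) for xi, fi in zip(xs, f))
```

For this particular problem the exact Picard iterates are the Taylor polynomials of $e^x$, so the remaining error `err` is dominated by the quadrature, not by the iteration.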



Lecture 4

Completeness and denseness

We introduce the important concept of a set being dense in a metric space.

4.1 Lemma and Definition

Let (Ω, d) be a metric space, and A ⊂ Ω. Then the following assertions are

equivalent:

(i) $\overline{A} = \Omega$, i.e., the closure of $A$ is all of $\Omega$;

(ii) Every f ∈ Ω is the limit of a sequence (fn)n with fn ∈ A, n ∈ N;

(iii) For every f ∈ Ω and ε > 0 there is g ∈ A such that d(f, g) < ε.

If one of these conditions holds, then A is said to be dense in Ω.

Proof. This is immediate from Lemma 1.9.

This concept is already known from the example of the rational numbers Q being dense in the reals. But we now have an important new example.

4.2 Example For a natural number $n \in \mathbb{N}$ we denote by $e_n \in \mathcal{F}(\mathbb{N})$ the sequence which is 0 except for the $n$-th coordinate, where it is 1. In function notation,

$$e_n(m) = \begin{cases} 1, & n = m, \\ 0, & n \ne m \end{cases} \qquad (m \in \mathbb{N}).$$

The linear span of the vectors $e_n$, $n \in \mathbb{N}$, is the set

$$c_{00} := \operatorname{span}\{e_n \mid n \in \mathbb{N}\} \subset \mathcal{F}(\mathbb{N})$$

of all finite sequences. (Hence a sequence is finite iff it is eventually 0, i.e., of the form $f = (x_1, \dots, x_n, 0, 0, \dots)$.)

Clearly, $c_{00}$ is a subspace of $\ell^p$, but it does not exhaust it. Indeed, if $0 < q < 1$ then $f := (q, q^2, q^3, \dots)$ is certainly not a finite sequence, but $f \in \ell^p$, since

$$\|f\|_p^p = \sum_{n=1}^{\infty} |f(n)|^p = \sum_{n=1}^{\infty} q^{np} = \frac{q^p}{1 - q^p} < \infty.$$

We claim that $c_{00}$ is dense in $\ell^p$ (with respect to the $p$-norm, of course), if $1 \le p < \infty$. In fact, we show more: for each $f \in \ell^p$ one has

$$f = \lim_{N\to\infty} \sum_{n=1}^{N} f(n)\, e_n = \sum_{n=1}^{\infty} f(n)\, e_n$$

as a limit in the $p$-norm. (This is the reason why the collection $\{e_n \mid n \in \mathbb{N}\}$ is also called the standard basis of $\ell^p$.)

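Proof. Clearly $c_{00} \subset \ell^p$. If $f \in \ell^p$ then

$$\Bigl\| f - \sum_{n=1}^{N} f(n)\, e_n \Bigr\|_p^p = \sum_{n=N+1}^{\infty} |f(n)|^p \to 0 \qquad (N \to \infty),$$

since this is the tail of a convergent series.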

And that’s it.
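For the geometric sequence $f = (q, q^2, q^3, \dots)$ from Example 4.2 the truncation error can be computed explicitly, which makes the density statement concrete. The sketch below (helper names are hypothetical) compares a finite partial sum of the tail with the closed form $\bigl(q^{(N+1)p}/(1 - q^p)\bigr)^{1/p}$ obtained from the geometric series; the number of summed terms is an assumption that is only adequate when $q$ is not too close to 1.

```python
def tail_norm(q, p, N, terms=200):
    """p-norm of f - sum_{n<=N} f(n) e_n for f(n) = q**n, using `terms` tail entries."""
    s = sum((q ** n) ** p for n in range(N + 1, N + 1 + terms))
    return s ** (1.0 / p)

def tail_norm_exact(q, p, N):
    """Closed form from the geometric series: (q^{(N+1)p} / (1 - q^p))^{1/p}."""
    return (q ** ((N + 1) * p) / (1.0 - q ** p)) ** (1.0 / p)

# Distance in the 2-norm between f and its truncation after N entries.
errors = [tail_norm(0.5, 2, N) for N in (1, 5, 10, 20)]
```

The distances in `errors` decrease geometrically in $N$, so the finite sequences $\sum_{n=1}^{N} f(n) e_n \in c_{00}$ approximate $f$ as closely as desired.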

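4.3 Exercise Let us consider the case $p = \infty$. Denote by $\mathbf{1}$ the sequence which is constantly equal to 1. Show that within $\ell^\infty$, $\mathbf{1} \notin \overline{c_{00}}$, and so the result above does not hold for $p = \infty$. Describe $\overline{c_{00}}$ (the closure within $\ell^\infty$).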

A second important example is the so-called Weierstrass approximation theorem.

4.4 Theorem Let $[a, b]$ be a compact subinterval of the reals and consider the Banach space $C[a, b]$ with the sup-norm. Then the space of polynomials

$$\mathbb{R}[t] = P := \operatorname{span}\{t^n \mid n \ge 0\}$$

is dense in $C[a, b]$.

Proof. See LN 8.1.5.
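One classical way to produce approximating polynomials explicitly uses the Bernstein polynomials $B_n f(t) = \sum_{k=0}^{n} f(k/n) \binom{n}{k} t^k (1-t)^{n-k}$ on $[0, 1]$. Whether the proof in LN 8.1.5 proceeds this way is not claimed here; the following is only an illustrative sketch, with the sup-norm approximated by sampling on a grid and with hypothetical helper names.

```python
import math

def bernstein(f, n):
    """Return the n-th Bernstein polynomial B_n f of f on [0, 1] as a callable."""
    coeffs = [f(k / n) * math.comb(n, k) for k in range(n + 1)]
    def B(t):
        return sum(c * t ** k * (1 - t) ** (n - k) for k, c in enumerate(coeffs))
    return B

def sup_error(f, g, m=1000):
    """Approximate the sup-norm distance of f and g on [0, 1] by sampling m + 1 points."""
    return max(abs(f(i / m) - g(i / m)) for i in range(m + 1))

f = lambda t: abs(t - 0.5)          # continuous, but not differentiable at 1/2
errs = [sup_error(f, bernstein(f, n)) for n in (4, 16, 64, 256)]
```

The sampled errors shrink as $n$ grows, though slowly: for this $f$ the error at the kink $t = 1/2$ decays only like $n^{-1/2}$.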

Remark. In the official Lecture Notes, this theorem (LN 8.1.5) is termed “Stone–Weierstrass”, which is not quite correct. Indeed, the actual Stone–Weierstrass theorem is a far-reaching (and powerful) generalization of Weierstrass’ original result. It is not very difficult, but needs some preparations which would lead us too far astray.

Another remark. Note that not every continuous function can be represented by a power series. Hence the monomials $t^n$ do not play the same role in $C[a, b]$ as the standard basis vectors $e_n$ in $\ell^p$!

We know from Lemma 1.8 that within a Banach space $E$, a subspace $F$ is closed if and only if it is itself complete, i.e., a Banach space. Hence we obtain from the above examples that $c_{00}$ is not complete with respect to any $p$-norm, and the space of polynomials $P$ is not complete with respect to the sup-norm. We consider another example of a non-complete space.


4.5 Example On $E := C[a, b]$ we define $\|f\|_1$ by

$$\|f\|_1 := \int_a^b |f(x)|\, dx \qquad (f \in C[a, b]).$$

This is indeed a norm: homogeneity and triangle inequality are trivial. (Do it in detail if you do not see it at first glance.) For definiteness, suppose that $f \ne 0$. Then there is $x_0 \in (a, b)$ where $f(x_0) \ne 0$, i.e., $|f(x_0)| > 0$. By continuity, there are $\varepsilon, \delta > 0$ such that $[x_0 - \delta, x_0 + \delta] \subset [a, b]$ and

$$|x - x_0| \le \delta \ \Rightarrow\ |f(x)| \ge \varepsilon.$$

But then

$$\|f\|_1 \ge \int_{x_0 - \delta}^{x_0 + \delta} |f(x)|\, dx \ge 2\delta\varepsilon > 0.$$

We claim that $E = C[a, b]$ with respect to the norm $\|\cdot\|_1$ is not complete. The example is the sequence of functions $(f_n)_n$ on $E = C[-1, 1]$ given in LN 8.2.2. It is shown there that this sequence is Cauchy in $\|\cdot\|_1$, and claimed that it has no limit.

We give a formal proof that this claim is true. Suppose that the limit is $f$. Fix $0 < a < 1$ and consider the restriction mapping $g \mapsto R(g) := g|_{[a,1]}$. The mapping $R : C[-1, 1] \to C[a, 1]$ is linear and one has

$$\|Rg\|_{1,[a,1]} = \int_a^1 |g(x)|\, dx \le \int_{-1}^{1} |g(x)|\, dx = \|g\|_1 \qquad (g \in C[-1, 1]).$$

This shows that $R$ is continuous with respect to the 1-norms. For $n$ large enough, $R f_n = \mathbf{1}_{[a,1]}$ (the constant function 1 on $[a, 1]$), and so

$$\mathbf{1}_{[a,1]} = \lim_n \mathbf{1}_{[a,1]} = \lim_n R f_n = R(\lim_n f_n) = Rf.$$

Hence $f = 1$ on $[a, 1]$. Similarly one shows that $f = -1$ on $[-1, -a]$. Since $a$ was arbitrary, $f$ is discontinuous at 0, a contradiction.

In general, each non-complete normed space $F$ can be completed in the following sense: there is a Banach space $E$ such that $F$ can be viewed as a dense subspace of $E$ and the norm of $F$ is the restriction of the norm on $E$. However, this is an abstract construction. In the case of $\|\cdot\|_1$ on $C[a, b]$ there is a more concrete way. This is the theory of Lebesgue integration.

Lebesgue Integration Theory II

It is not possible here to give a thorough introduction to this important field. Some information is in LN Appendix C, but we shall give an independent presentation.

The central objects in the Lebesgue theory are so-called measurable functions, a notion which builds upon the basic notion of a measure. In the Riemann theory one uses partitions of the domain interval to obtain approximations of the integral:

$$\sum_{j=1}^{n} f(t_{j-1})\,(t_j - t_{j-1}) \to \int_a^b f(x)\, dx.$$

Lebesgue’s idea was not to partition the domain, but the range of the function. This leads to sets of the form

$$\{t_{j-1} \le f < t_j\} := \{x \in I \mid f(x) \in [t_{j-1}, t_j)\}$$

and if one has a good notion of the (1-dim) “volume” $\lambda(A)$ of such sets $A$, one may use

$$\sum_{j=1}^{n} t_{j-1}\, \lambda\bigl(\{t_{j-1} \le f < t_j\}\bigr)$$

as an approximation (from below) of the integral, where $\lambda(\{t_{j-1} \le f < t_j\})$ denotes in a reasonable way the “1-dimensional volume (content)” of the set $\{t_{j-1} \le f < t_j\}$. So the first task was to make precise this notion of a volume.
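As a small worked instance of partitioning the range, take $f(x) = x^2$ on $I = [0, 1]$ (an example chosen here, not taken from the text). Since $f$ is increasing, the set $\{t_{j-1} \le f < t_j\}$ is the interval $[\sqrt{t_{j-1}}, \sqrt{t_j})$, whose length is known without any measure theory. The sketch below evaluates the lower sum for the range partition $t_j = j/n$; the function name is hypothetical.

```python
import math

def lebesgue_lower_sum(n):
    """Lower sum  sum_j t_{j-1} * lambda({t_{j-1} <= f < t_j})  for f(x) = x^2 on [0, 1].

    The range [0, 1] is cut at t_j = j/n; because f is increasing,
    {t_{j-1} <= f < t_j} = [sqrt(t_{j-1}), sqrt(t_j)), of length sqrt(t_j) - sqrt(t_{j-1}).
    """
    total = 0.0
    for j in range(1, n + 1):
        t_prev, t_cur = (j - 1) / n, j / n
        total += t_prev * (math.sqrt(t_cur) - math.sqrt(t_prev))
    return total

sums = [lebesgue_lower_sum(n) for n in (10, 100, 1000)]
```

The sums increase towards $\int_0^1 x^2\, dx = 1/3$ and stay below it, with an error of at most $1/n$, because the approximating step function differs from $f$ by less than the mesh of the range partition.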

It turned out that there is no chance to find a reasonable¹ notion of volume for all subsets of $\mathbb{R}$. But one can do so for a large subclass, the so-called Borel sets. To define these, we need a preparatory notion.

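4.6 Definition Let $\Omega$ be any set. A σ-algebra² is a collection $\mathcal{A} \subset \mathcal{P}(\Omega)$ of subsets of $\Omega$, such that the following hold:

1) $\emptyset, \Omega \in \mathcal{A}$.

2) If $A, B \in \mathcal{A}$ then $A \cup B,\ A \cap B,\ A \setminus B \in \mathcal{A}$.

3) If $(A_n)_{n\in\mathbb{N}} \subset \mathcal{A}$, then $\bigcup_{n\in\mathbb{N}} A_n,\ \bigcap_{n\in\mathbb{N}} A_n \in \mathcal{A}$.

(One can do with a more restricted set of axioms.)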

If $\Omega$ is a metric space, there is a “smallest” σ-algebra which contains all open sets³ (and hence, by 1) and 2), also all closed sets). This is called the Borel σ-algebra $\mathcal{B}(\Omega)$.

¹ “Reasonable” means that the requirements (1)–(4) of Theorem 4.8 below are satisfied.

² The “σ” refers to the fact that somewhere a union of countably many sets is lurking around. This is purely conventional. The corresponding symbol for countable intersections is “δ”. Do not bother too much about this!

³ If you feel uneasy with such a vague notion, here is the mathematical way to formalize it: If you have an arbitrary family $(\mathcal{A}_\iota)_{\iota\in J}$ of σ-algebras on $\Omega$, then the intersection $\bigcap_{\iota\in J} \mathcal{A}_\iota$ is again a σ-algebra. If $\mathcal{E}$ denotes the collection of open sets on $\Omega$, then

$$\sigma(\mathcal{E}) := \bigcap \{\mathcal{A} \mid \mathcal{A} \text{ is a σ-algebra and } \mathcal{E} \subset \mathcal{A}\}$$

is the “smallest” σ-algebra containing $\mathcal{E}$.

4.7 Lemma The Borel σ-algebra $\mathcal{B}(\mathbb{R})$ of $\mathbb{R}$ contains all (= open, half-open, closed, bounded/unbounded) intervals.

The Borel σ-algebra $\mathcal{B}(I)$ of an interval $I$ is just $\{A \in \mathcal{B}(\mathbb{R}) \mid A \subset I\}$.

Proof. Just for example: $[a, b) = \bigcap_{n} (a - 1/n, b)$, and the intervals $(a - 1/n, b)$ are open, hence Borel sets.

4.8 Theorem and Definition

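There is a unique mapping $\lambda : \mathcal{B}(\mathbb{R}) \to [0, \infty]$ satisfying the following conditions:

1) $\lambda(\emptyset) = 0$;

2) $\lambda(A \cup B) = \lambda(A) + \lambda(B)$ if $A, B \in \mathcal{B}(\mathbb{R})$, $A \cap B = \emptyset$;

3) $\lambda\bigl(\bigcup_{n=1}^{\infty} A_n\bigr) = \sum_{n=1}^{\infty} \lambda(A_n)$ if $(A_n)_{n\in\mathbb{N}} \subset \mathcal{B}(\mathbb{R})$ and $A_n \cap A_m = \emptyset$ $(n \ne m)$;

4) $\lambda([a, b)) = b - a$ if $a, b \in \mathbb{R}$, $a \le b$.

This mapping is called the (one-dimensional) Lebesgue measure. It has the following additional properties:

a) (regularity)

$$\lambda(A) = \inf\{\lambda(O) \mid A \subset O,\ O \text{ open}\} = \sup\{\lambda(K) \mid K \subset A,\ K \text{ compact}\}$$

for all $A \in \mathcal{B}(\mathbb{R})$.

b) (translation invariance)

$$\lambda(A + c) = \lambda(A) \qquad (A \in \mathcal{B}(\mathbb{R}),\ c \in \mathbb{R}).$$

(Property 2) is called the finite additivity of the “set function” $\lambda$, and 3) is called σ-additivity. Actually, 2) is superfluous, as it is implied by 1) and 3) (Exercise).)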

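4.9 Exercise Prove that the measure is monotone, i.e., if $A \subset B$ are Borel then $\lambda(A) \le \lambda(B)$. Prove that if $A_n \in \mathcal{B}(I)$ and $A_1 \subset A_2 \subset \dots$ then $\lambda\bigl(\bigcup_{n=1}^{\infty} A_n\bigr) = \sup_{n\in\mathbb{N}} \lambda(A_n)$.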

Having the notion of the measure, we can define “measurable functions”.


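4.10 Definition Let $I \subset \mathbb{R}$ be an interval. A function $f : I \to [-\infty, \infty]$ is called (Borel) measurable if the set

$$\{a \le f < b\} \quad \bigl(= f^{-1}[a, b) = \{x \in I \mid f(x) \in [a, b)\}\bigr)$$

is a Borel set, for every choice $a, b \in \mathbb{R}$, $a \le b$. We set

$$\mathcal{M}(I) := \{f : I \to \mathbb{R} \mid f \text{ is measurable}\}, \qquad \mathcal{M}_+(I) := \{f : I \to [0, \infty] \mid f \text{ is measurable}\}.$$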

4.11 Examples Continuous functions are measurable. (We omit the proof, but it is easy.) If $A \in \mathcal{B}(\mathbb{R})$ then its characteristic function $\mathbf{1}_A$ defined by

$$\mathbf{1}_A(x) := \begin{cases} 1 & (x \in A) \\ 0 & (x \notin A) \end{cases}$$

is measurable, since $\{a \le \mathbf{1}_A < b\}$ is one of $\emptyset, A, A^c, \mathbb{R}$, depending on whether 0 or 1 is or is not in the interval $[a, b)$.

The class of measurable functions has some nice properties.

4.12 Lemma Let I be an interval. Then the following assertions hold.

a) If $f, g \in \mathcal{M}_+(I)$, $\alpha \ge 0$, then $f + g,\ \alpha f \in \mathcal{M}_+(I)$.

b) If $f, g \in \mathcal{M}(I)$ and $\alpha, \beta \in \mathbb{R}$, then $\alpha f + \beta g \in \mathcal{M}(I)$.

c) If $f, g : I \to [-\infty, \infty]$ are measurable, then $-f$, $\min\{f, g\}$, $\max\{f, g\}$ are measurable.

d) If $f_n : I \to [-\infty, \infty]$ is measurable for each $n \in \mathbb{N}$ and $f_n \to f$ pointwise, then $f$ is also measurable.

This is some work, but not hard. The last property is certainly a stunning fact, as neither the continuous nor the Riemann-integrable functions are closed under pointwise limits.

4.13 Exercise Let $f : I \to [-\infty, \infty]$ be measurable. Show that $\{f = \infty\}$, $\{f = -\infty\}$, $\{f \ge a\}$ and $\{f < a\}$ are Borel sets for every $a \in \mathbb{R}$.

For a measurable function $f : I \to [0, \infty]$ one can define its integral in a three-step procedure. The basic intuition is still that an integral over a positive function is the same as the area under its graph. So

$$\int_I \mathbf{1}_A\, d\lambda := \lambda(A)$$

seems reasonable. Then, an integral should be linear, hence we set

$$\int_I f\, d\lambda := \sum_{j=1}^{n} \alpha_j\, \lambda(A_j) \in [0, \infty]$$

if

$$f = \sum_{j=1}^{n} \alpha_j \mathbf{1}_{A_j} \qquad (n \in \mathbb{N},\ \alpha_j \ge 0,\ A_j \in \mathcal{B}(I)).$$

A function $f$ which is a linear combination of characteristic functions as above is called a simple function. Since a representation of a simple function as a linear combination of characteristic functions is not unique, one has to show that different representations lead to the same value of the integral!

The last step is by approximation. For arbitrary measurable $f : I \to [0, \infty]$ define

$$\int_I f\, d\lambda := \sup_{n\in\mathbb{N}} \int_I g_n\, d\lambda \in [0, \infty],$$

where $(g_n)_n$ is any sequence of simple functions such that $0 \le g_n \le g_{n+1} \nearrow f$ (pointwise).

Of course one has to ensure two things here: first, the value of the integral should be independent of the approximating sequence. (Can be done.) Second, for each f there should exist at least one such approximating sequence. The latter is true by using Lebesgue’s original idea of partitioning the range of the function.

4.14 Lemma Let $f : I \to [0, \infty]$ be measurable. Then there exists a sequence of simple functions $(g_n)_n$ such that

$$0 \le g_n \le g_{n+1} \nearrow f \quad \text{(pointwise)} \qquad (n \to \infty).$$

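Proof. First step: For $n \in \mathbb{N}$ define the simple function $f_n$ by

$$f_n := \sum_{k=0}^{n-1} \sum_{j=1}^{n} \Bigl(k + \frac{j-1}{n}\Bigr)\, \mathbf{1}_{\{k + \frac{j-1}{n} \le f < k + \frac{j}{n}\}} + n\, \mathbf{1}_{\{f \ge n\}}.$$

Then $0 \le f_n \le f$, $f_n = n$ on $\{n \le f \le \infty\}$ and $|f - f_n| \le 1/n$ on $\{0 \le f < n\}$, as one can see by simply inserting a point $x \in I$ and distinguishing the (disjoint) cases $f(x) \ge n$ and $k + (j-1)/n \le f(x) < k + j/n$, $1 \le j \le n$, $0 \le k \le n - 1$. In particular, $f_n \to f$ pointwise.

Second step: Define $g_n := \max\{f_1, \dots, f_n\}$. Then each $g_n$ is again a simple function, $g_n \le g_{n+1}$, and $f_n \le g_n \le f$, so that $g_n \nearrow f$ pointwise.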
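Evaluated at a point $x$, the function $f_n$ from the first step is simply $f(x)$ rounded down to the grid $\{0, 1/n, 2/n, \dots\}$ and capped at $n$. The following sketch implements this pointwise description together with $g_n = \max\{f_1, \dots, f_n\}$; the choice $f(x) = e^x$ on sample points in $[0, 3]$ is an assumption made only for illustration, and the helper names are hypothetical.

```python
import math

def f_n(f, n):
    """n-th simple function from the proof: f rounded down to multiples of 1/n, capped at n."""
    def fn(x):
        y = f(x)
        return float(n) if y >= n else math.floor(n * y) / n
    return fn

def g_n(f, n):
    """g_n := max{f_1, ..., f_n}, which is increasing in n by construction."""
    fs = [f_n(f, k) for k in range(1, n + 1)]
    return lambda x: max(fk(x) for fk in fs)

f = lambda x: math.exp(x)            # a positive measurable (here continuous) function
xs = [i / 100 for i in range(301)]   # sample points in [0, 3]
gap = max(f(x) - g_n(f, 30)(x) for x in xs)
```

Taking the running maximum is what turns the merely convergent sequence $(f_n)_n$ into a monotone one; the individual $f_n$ need not be increasing in $n$.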


4.15 Theorem The integral for positive measurable functions has the following properties ($f, g \in \mathcal{M}_+(I)$, $\alpha \ge 0$):

a) (Additivity and homogeneity)

$$\int_I (f + g)\, d\lambda = \int_I f\, d\lambda + \int_I g\, d\lambda, \qquad \int_I \alpha f\, d\lambda = \alpha \int_I f\, d\lambda.$$

b) (Monotonicity)

$$f \le g \ \Rightarrow\ \int_I f\, d\lambda \le \int_I g\, d\lambda.$$

c) (Beppo Levi, Monotone Convergence Theorem) Let $(f_n)_{n\in\mathbb{N}} \subset \mathcal{M}_+(I)$ be such that $0 \le f_1 \le f_2 \le \dots$, and suppose that $f_n \to f$ pointwise. Then

$$\int_I f\, d\lambda = \lim_{n\to\infty} \int_I f_n\, d\lambda.$$

Property c) is the real novelty compared to the Riemann approach, as there is no comparable result in the Riemann theory. The class of measurable functions is closed with respect to pointwise limits, and this is an advantage that neither the class of continuous functions nor the class of Riemann-integrable functions has.

We conclude this lecture with a last definition.

4.16 Definition A measurable function $f : I \to [-\infty, \infty]$ is called (Lebesgue) integrable if

$$\int_I |f|\, d\lambda < \infty.$$

We denote by

$$\mathcal{L}^1(I, \lambda) := \{f \in \mathcal{M}(I) \mid f \text{ is integrable}\}$$

the set of real-valued integrable functions.

(Note that if $f$ is measurable, then also $|f| = \max\{f, -f\}$ is measurable, hence the integral above is defined.)

4.17 Lemma The set $\mathcal{L}^1(I, \lambda)$ is a vector space, and

$$\|f\|_1 := \int_I |f|\, d\lambda \qquad (f \in \mathcal{L}^1(I, \lambda))$$

is homogeneous and satisfies the triangle inequality.

Proof. Homogeneity: As $|\alpha f| = |\alpha|\,|f|$ pointwise, integrating yields

$$\int_I |\alpha f|\, d\lambda = \int_I |\alpha|\,|f|\, d\lambda = |\alpha| \int_I |f|\, d\lambda$$

and the latter is finite by hypothesis. Hence $\alpha f$ is integrable and $\|\alpha f\|_1 = |\alpha|\, \|f\|_1$.

Triangle inequality: One has $|f + g| \le |f| + |g|$ pointwise. Integrating yields, by Theorem 4.15 a) and b),

$$\int_I |f + g|\, d\lambda \le \int_I (|f| + |g|)\, d\lambda = \int_I |f|\, d\lambda + \int_I |g|\, d\lambda,$$

and as the right side is a sum of finite numbers, it is finite. Hence $f + g$ is integrable and $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$.

We shall see in the next lecture what hinders $\|\cdot\|_1$ from being a proper norm.

Literature. A really good book to consult (if one wishes to do so) is: Inder K. Rana, An introduction to measure and integration, GSM 45, Amer. Math. Soc., Providence, 2005. (EWI-Library Delft: TW - 28-01 Rana)



Lecture 5

Null sets

Recall that we considered the “almost-norm”

$$\|f\|_1 := \int_I |f|\, d\lambda \qquad (f \in \mathcal{L}^1(I, \lambda))$$

and showed that it is almost a norm. To see what is missing for it to be a proper norm, we need a new concept.

5.1 Definition A set $A \subset \mathbb{R}$ is called a (Lebesgue) null set or negligible if there is a Borel set $N \in \mathcal{B}(\mathbb{R})$ such that $A \subset N$ and $\lambda(N) = 0$.

Note that in general a null set need not be a Borel set.

5.2 Lemma Null sets have the following properties:

a) If A is a null set and B ⊂ A then B is also a null set.

b) If $A_n$ is a null set ($n \in \mathbb{N}$), then $\bigcup_n A_n$ is a null set.

Proof. a) is immediate from the definition. b) follows from the so-called σ-subadditivity of the measure, i.e., the inequality

$$\lambda\Bigl(\bigcup_{n=1}^{\infty} B_n\Bigr) \le \sum_{n=1}^{\infty} \lambda(B_n)$$

which holds for every sequence $(B_n)_n$ of Borel sets. (The proof is elementary measure theory, see the book of Rana.)

5.3 Example A one-point set (= singleton) $\{a\}$ is a null set. Indeed, for all $n \in \mathbb{N}$,

$$\{a\} \subset \bigl[a, a + \tfrac{1}{n}\bigr)$$

which implies $0 \le \lambda(\{a\}) \le \lambda([a, a + 1/n)) = 1/n$ for all $n \in \mathbb{N}$. Since each set can be written as the union of its singletons, Lemma 5.2 implies that each countable set is a null set. In particular, $\mathbb{Q}$ is a null set.

5.4 Exercise Prove that $\lambda([a, b]) = b - a = \lambda((a, b]) = \lambda((a, b))$ for all $a, b \in \mathbb{R}$ with $a \le b$. Show that if $A, B$ are Borel sets and $(A \setminus B) \cup (B \setminus A)$ is a null set, then $\lambda(A) = \lambda(B)$.


It may seem as if countable sets are the only null sets, but this is far from being true. A prominent example of an uncountable null set is Cantor’s “middle thirds” set, also called Cantor dust. (See the book of Rana, if you want more details.)

Null sets appear in a natural way in integration theory:

5.5 Lemma Let $f : I \to [-\infty, \infty]$ be measurable. Then the following assertions hold.

a) $\int_I |f|\, d\lambda = 0$ if and only if the set $\{f \ne 0\} = \{|f| > 0\}$ is a null set.

b) If $\int_I |f|\, d\lambda < \infty$, then the set $\{|f| = \infty\}$ is a null set.

One says that two functions $f, g$ are equal almost everywhere (abbreviated by “$f = g$ a.e.” or “$f \sim_\lambda g$”) if the set

$$\{f \ne g\} := \{x \in I \mid f(x) \ne g(x)\}$$

is a null set.

More generally, let $P$ be a property of points of $I$. Then $P$ is said to hold almost everywhere, or for almost all $x \in I$, if the set

$$\{x \in I \mid P \text{ does not hold for } x\}$$

is a null set. For example, “$f \ge 0$ almost everywhere” means that $f(x) \ge 0$ everywhere except for $x$ from a certain set of measure zero. The statements of the last lemma can therefore be rephrased as:

$$\|f - g\|_1 = 0 \ \Leftrightarrow\ f = g \text{ a.e.}, \qquad \|f\|_1 < \infty \ \Rightarrow\ |f| < \infty \text{ a.e.}$$

5.6 Lemma The relation $\sim_\lambda$ (“is equal almost everywhere to”) is an equivalence relation on $\mathcal{F}(I)$.

Proof. Obviously one has $f \sim_\lambda f$ for every $f$, since $\{f \ne f\} = \emptyset$ is a null set. Symmetry is trivial. Let us show transitivity. Suppose that $f = g$ almost everywhere and $g = h$ almost everywhere. Then there are measurable null sets $N_1, N_2$ such that $\{f \ne g\} \subset N_1$ and $\{g \ne h\} \subset N_2$. Hence $\{f \ne h\} \subset \{f \ne g\} \cup \{g \ne h\} \subset N_1 \cup N_2$, and this is a null set by Lemma 5.2.

Very often in integration theory, functions are considered to be equal if they are merely equal almost everywhere. This corresponds to speaking not about the functions themselves but about their equivalence classes.⁴ However, this is often not made explicit by notation, i.e., one writes $f$ for both the function and its equivalence class. Usually this does not lead to confusion. Since e.g. $f = f_1$ a.e. and $g = g_1$ a.e. implies that $f + g = f_1 + g_1$ a.e. (check it!), one may add equivalence classes by adding representatives:

$$[f] + [g] := [f + g].$$

The same is true for products and scalar multiples and even countable suprema and infima. Hence in practice there is almost no difference to working with functions. (But one cannot forget it altogether. For example, if one talks about equivalence classes, the value $f(x_0)$ of $f$ at a point $x_0$ is not defined, as singletons are null sets.) The next result shows that also the integral is independent of the choice of the representatives.

5.7 Lemma Let $f, g : I \to [0, \infty]$ and suppose that $f = g$ a.e. Then

$$\int_I f\, d\lambda = \int_I g\, d\lambda.$$

Proof. Suppose first that $f, g$ are real-valued. Then $f = |f| \le |f - g| + |g|$ and hence $\int f \le \int |f - g| + \int |g| = \int |g| = \int g$, since $\int |f - g| = 0$ by Lemma 5.5 a). Interchanging the roles of $f, g$ yields $\int g \le \int f$, whence equality.

In the general case consider $f_n := \min\{f, n\}$ and $g_n := \min\{g, n\}$. Then $f_n \nearrow f$ and $g_n \nearrow g$ pointwise. Moreover, as $f = g$ a.e., also $f_n = g_n$ a.e., and since these functions are real-valued, $\int f_n = \int g_n$. The Monotone Convergence Theorem now implies that $\int f = \int g$, too.

⁴ I include a short comment about equivalence classes. If $\sim$ is an equivalence relation on a set $X$, then to each $x \in X$ one can define its equivalence class $[x] := \{y \in X \mid x \sim y\}$, the set of all elements of $X$ that are equivalent to $x$. Two such classes are either equal or disjoint: $[x] = [a]$ iff $x \sim a$. The set $X/\!\sim$ denotes the set of equivalence classes. Each member of an equivalence class is called a representative for it. Suppose you have a binary operation $*$ on $X$ such that if $x \sim a$ and $y \sim b$ then $x * y \sim a * b$. Then you can induce this operation on $X/\!\sim$ by defining

$$[x] * [y] := [x * y].$$

By hypothesis, this definition does not depend on the choice of representatives, hence is a good definition.

The same mechanism works with functions of more variables and with relations in general. The standard examples are from algebra: if $V$ is a vector space and $U \subset V$ is a subspace, then $x \sim y$ iff $x - y \in U$ is an equivalence relation compatible with the vector space operations, and the space of equivalence classes, denoted by $V/U$, becomes a vector space (“quotient space”) and the mapping $x \mapsto [x]$ (“canonical surjection”) becomes linear. Other examples of this type are: quotient groups $G/N$, rings, modules. Actually, our situation is of the vector space type: $V := \mathcal{F}(I)$ is the big vector space and $U := \{f : I \to \mathbb{R} \mid f = 0 \text{ a.e.}\}$ is the subspace.

The last lemma shows that one can integrate the classes as if they were functions. But also, it allows one to define the integral for a non-measurable function $f : I \to [0, \infty]$, as long as it is equal almost everywhere to a measurable one.

The space $L^1(I)$ and the integral

Let us define

$$L^1(I) := L^1(I, \lambda) := \mathcal{L}^1(I, \lambda)/\!\sim_\lambda,$$

the space of equivalence classes of integrable functions. (Since we only use the Lebesgue measure, we omit the explicit reference, and often write $L^1(I)$ instead of $L^1(I, \lambda)$.) As said above, this is a vector space in a natural way. On $L^1(I)$ we consider (as before)

$$\|f\|_1 := \int_I |f|\, d\lambda \qquad (f \in L^1(I)).$$

(But now $f$ is an equivalence class! The definition is independent of the representative by Lemma 5.7 above.) This is now a proper norm(!) and we shall see below that this space is a Banach space.

For a general function $f$ we define its positive part and negative part by

$$f^+ := \max\{f, 0\} \quad \text{and} \quad f^- := \max\{-f, 0\}.$$

Then $|f| = f^+ + f^-$ and $f = f^+ - f^-$. Given $f \in L^1(I)$, its integral

$$\int_I f\, d\lambda := \int_I f^+\, d\lambda - \int_I f^-\, d\lambda$$

is a well-defined real number. (Note that $f^+, f^- \le |f|$, whence both integrals on the right side are finite.)

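5.8 Lemma The integral $(f \mapsto \int_I f\, d\lambda) : L^1(I) \to \mathbb{R}$ has the following properties:

a) (Linearity) $\int_I (\alpha f + \beta g)\, d\lambda = \alpha \int_I f\, d\lambda + \beta \int_I g\, d\lambda$;

b) (Monotonicity) $f \le g \ \Rightarrow\ \int_I f\, d\lambda \le \int_I g\, d\lambda$;

c) (Triangle inequality) $\left|\int_I f\, d\lambda\right| \le \int_I |f|\, d\lambda$.

Proof. This is elementary by splitting $f$ and $g$ into positive and negative parts and using Theorem 4.15. As an example, consider c):

$$\left|\int_I f\, d\lambda\right| = \left|\int_I f^+\, d\lambda - \int_I f^-\, d\lambda\right| \le \left|\int_I f^+\, d\lambda\right| + \left|\int_I f^-\, d\lambda\right| = \int_I f^+\, d\lambda + \int_I f^-\, d\lambda = \int_I (f^+ + f^-)\, d\lambda = \int_I |f|\, d\lambda.$$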

What is the connection to the Riemann-integral?

5.9 Theorem If $f : [a, b] \to \mathbb{R}$ is Riemann-integrable, then it is Lebesgue integrable and

$$\int_a^b f(x)\, dx = \int_{[a,b]} f\, d\lambda.$$

(We cheat a bit here: a Riemann-integrable function need not be Borel-measurable, but it coincides almost everywhere with a Borel-measurable one.)

The last theorem gives us permission to write $\int_a^b f(x)\, dx$ instead of $\int_{[a,b]} f\, d\lambda$.

The space Lp(I), 1 < p < ∞

Fix $1 \le p < \infty$. For a measurable function $f : I \to [-\infty, \infty]$ we define

$$\|f\|_p := \Bigl(\int_I |f|^p\, d\lambda\Bigr)^{1/p}$$

with the convention that $\infty^{1/p} := \infty$. As in the case $p = 1$ we form

$$\mathcal{L}^p(I, \lambda) := \{f \in \mathcal{M}(I) \mid \|f\|_p < \infty\}$$

and

$$L^p(I) := L^p(I, \lambda) := \mathcal{L}^p(I, \lambda)/\!\sim_\lambda,$$

the space of (equivalence classes of) $p$-integrable functions.

5.10 Lemma Let $1 \le p < \infty$. The space $L^p(I)$ is a vector space and $\|\cdot\|_p$ is a norm on it.

Proof. We begin with definiteness: $\|0\|_p = 0$ is obvious; and if $\|f\|_p = 0$ then $\int_I |f|^p\, d\lambda = 0$, hence $|f|^p = 0$ almost everywhere. This implies $f = 0$ almost everywhere. So $f = 0$ in the space of equivalence classes.


Homogeneity is easy. Let us prove the triangle inequality. (This is analogous to the sequence case.) Let $f, g \in L^p(I)$. The calculus Lemma 2.13 tells us that

$$|f + g|^p \le (|f| + |g|)^p \le t^{1-p}\, |f|^p + (1 - t)^{1-p}\, |g|^p$$

pointwise on $I$, for every $t \in (0, 1)$. Integrating yields

$$\|f + g\|_p^p \le t^{1-p}\, \|f\|_p^p + (1 - t)^{1-p}\, \|g\|_p^p$$

for every $t \in (0, 1)$. Taking the infimum over $t$ finally yields

$$\|f + g\|_p^p \le \inf_{t\in(0,1)} \Bigl( t^{1-p}\, \|f\|_p^p + (1 - t)^{1-p}\, \|g\|_p^p \Bigr) = \bigl(\|f\|_p + \|g\|_p\bigr)^p$$

and this shows both $f + g \in L^p(I)$ and the triangle inequality.

The triangle inequality for the p-norm is also called Minkowski’s inequality.
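The key identity behind the proof, $\inf_{t\in(0,1)} \bigl(t^{1-p} a^p + (1-t)^{1-p} b^p\bigr) = (a + b)^p$ for $a, b > 0$, can be checked numerically. The sketch below evaluates the function on a grid and at the candidate minimiser $t^* = a/(a+b)$; that $t^*$ is the minimiser is a standard calculus fact stated here as an assumption, since Lemma 2.13 is not reproduced in this part of the notes. The helper names and the sample values of $a, b, p$ are hypothetical.

```python
def phi(t, a, b, p):
    """The function t -> t^(1-p) a^p + (1-t)^(1-p) b^p appearing in the proof."""
    return t ** (1 - p) * a ** p + (1 - t) ** (1 - p) * b ** p

def inf_on_grid(a, b, p, m=10_000):
    """Minimise phi over the grid t = 1/m, ..., (m-1)/m (an upper bound for the infimum)."""
    return min(phi(i / m, a, b, p) for i in range(1, m))

a, b, p = 2.0, 3.0, 3.0
t_star = a / (a + b)                  # assumed minimiser
target = (a + b) ** p                 # claimed value of the infimum
grid_min = inf_on_grid(a, b, p)
```

With these sample values $\varphi(t^*) = 8/0.16 + 27/0.36 = 125 = 5^3$, in agreement with the identity.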

The most important result about $L^p$-spaces is the so-called Lebesgue’s Dominated Convergence theorem (LDC). It relates convergence in the $p$-norm and pointwise convergence almost everywhere.

5.11 Theorem (LDC) Let $(f_n)_{n\in\mathbb{N}} \subset L^p(I)$ and suppose that $f(x) := \lim_{n\to\infty} f_n(x)$ exists for almost all $x \in I$. If there is $0 \le g \in L^p(I)$ such that $|f_n| \le g$ almost everywhere, for each $n \in \mathbb{N}$, then $f \in L^p(I)$ and $\|f_n - f\|_p \to 0$. If $p = 1$, we have also

$$\lim_{n\to\infty} \int_I f_n\, d\lambda = \int_I f\, d\lambda.$$

Remark: Note that $f$ need not be defined everywhere, but being defined almost everywhere it determines a unique equivalence class with respect to $\sim_\lambda$.

Another remark: The theorem is false without the “domination” by a function $g$: Let $f_n := \mathbf{1}_{[n,n+1]}$; then $f_n \to 0$ pointwise everywhere on $\mathbb{R}$, but $\int_{\mathbb{R}} f_n\, d\lambda = 1$ for all $n \in \mathbb{N}$. (A different example on a finite interval is LN Appendix C.4.3.(a).)

Last remark: If $f_n \to f$ in the $p$-norm, then $f_n$ need not converge to $f$ almost everywhere. In fact it may diverge everywhere (see LN Appendix C, C.4.3.(b)). However, the following is often good enough to work with.

5.12 Theorem Let $1 \le p < \infty$, $(f_n)_n \subset L^p(I)$ and $f_n \to f$ in the $p$-norm. Then there is a subsequence $(f_{n_k})_{k\in\mathbb{N}}$ such that $f_{n_k} \to f$ almost everywhere as $k \to \infty$.

Using LDC we prove that the space $L^p$ is complete.


5.13 Theorem Let $I$ be any interval and $1 \le p < \infty$. Then the space $L^p(I, \lambda)$ is complete with respect to $\|\cdot\|_p$.

Proof. We shall show that each absolutely convergent series is convergent. By Lemma 2.10, this is equivalent to completeness.

Let $f_n \in L^p(I)$, $n \in \mathbb{N}$, and suppose that $\sum_n \|f_n\|_p < \infty$. Define the positive measurable functions

$$g_N := \sum_{n=1}^{N} |f_n| \quad \text{and} \quad g := \sup_N g_N = \sum_{n=1}^{\infty} |f_n|.$$

($g$ might assume the value $\infty$, but we shall show shortly that this happens only on a null set.) Since $g_N^p \nearrow g^p$ pointwise, the Monotone Convergence Theorem yields

$$\int_I g^p\, d\lambda = \lim_N \int_I g_N^p\, d\lambda.$$

Taking $p$-th roots and applying Minkowski’s inequality we have

$$\|g\|_p = \lim_N \|g_N\|_p \le \lim_N \sum_{n=1}^{N} \|f_n\|_p = \sum_{n=1}^{\infty} \|f_n\|_p < \infty$$

by hypothesis. This shows that

$$\int_I g^p\, d\lambda = \|g\|_p^p < \infty,$$

which in turn implies that $\{g = \infty\} = \{g^p = \infty\}$ is a null set, by Lemma 5.5 b). So $g$ is finite almost everywhere, and this means two things: first that $g \in L^p(I)$, and second that the sum $f(x) := \sum_{n=1}^{\infty} f_n(x)$ is (absolutely) convergent for almost all $x \in I$.

So we are left to show that $f = \sum_{n=1}^{\infty} f_n$ in the $p$-norm. To do this, we use (LDC). One has, as shown above, $f = \lim_N \sum_{n=1}^{N} f_n$ pointwise almost everywhere. But also

$$\Bigl|\sum_{n=1}^{N} f_n\Bigr| \le \sum_{n=1}^{N} |f_n| \le g$$

almost everywhere, for every $N$. Since the dominating function $g$ is in $L^p$, (LDC) tells us that in fact

$$\Bigl\| f - \sum_{n=1}^{N} f_n \Bigr\|_p \to 0$$

as $N \to \infty$.

A remark about notation. Note that single points are null sets. Hence $L^p[a, b] = L^p(a, b] = L^p[a, b) = L^p(a, b)$.


The space L∞(I)

As in the sequence case, there is also a space corresponding to $p = \infty$. For a measurable function $f : I \to [-\infty, \infty]$ we let

$$\operatorname{esssup}(|f|) := \|f\|_\infty := \inf\{c > 0 \mid |f| \le c \text{ almost everywhere}\}$$

with the convention that $\|f\|_\infty = \infty$ if there is no such $c$. (The symbol “esssup” stands for essential supremum.) A function such that $\|f\|_\infty < \infty$ is called essentially bounded. We set

$$\mathcal{L}^\infty(I, \lambda) := \{f \in \mathcal{M}(I) \mid f \text{ is essentially bounded}\}$$

and

$$L^\infty(I) := L^\infty(I, \lambda) := \mathcal{L}^\infty(I, \lambda)/\!\sim_\lambda.$$

5.14 Theorem The space $L^\infty(I)$ is a vector space, and $\|\cdot\|_\infty$ is a complete norm on it.

Some remarks. Note that $\|\cdot\|_\infty$ is not a norm on $\mathcal{L}^\infty(I, \lambda)$, for the same reason as in the case $p < \infty$: The function $\mathbf{1}_{\{x_0\}}$ is not 0 but, since $\{x_0\}$ is a null set, $\|\mathbf{1}_{\{x_0\}}\|_\infty = 0$.

Note also that there is now an ambiguity of the symbol $\|\cdot\|_\infty$: we used it also to denote the uniform norm. If one keeps this in mind, there will be no confusion. For a continuous function $f$, both meanings of $\|f\|_\infty$ coincide!

5.15 Exercise Show that if $f : I \to [-\infty, \infty]$ is a measurable function then $|f(x)| \le \|f\|_\infty$ for almost all $x \in I$.

5.16 Exercise Show that if $f : I \to \mathbb{R}$ is bounded and continuous, then $\operatorname{esssup}(|f|) = \sup\{|f(x)| \mid x \in I\}$.

Hölder’s Inequality

As in the sequence case we can derive an inequality for products of $L^p$-functions. Up to now, we have never used (nor stated) the fact that products of measurable functions are measurable. But this is what we need here.

5.17 Lemma If $f, g : I \to [0, \infty]$ are measurable, then $fg$ is measurable too. (We use the convention that $\alpha \cdot \infty = \infty$ if $\alpha > 0$ and $0 \cdot \infty = 0$.)

Proof. If $f, g$ are both simple functions, this is trivial, since $\mathbf{1}_A \mathbf{1}_B = \mathbf{1}_{A\cap B}$ for all sets $A, B$. The general result follows by approximation (Lemma 4.14).

By looking at positive and negative parts, one proves that products of real-valued measurable functions are measurable. Now we can state Hölder’s inequality.

5.18 Theorem (Hölder’s inequality) Let $p, q \in [1, \infty]$ be such that $1/p + 1/q = 1$. If $f \in L^p(I)$ and $g \in L^q(I)$ then $fg \in L^1(I)$ and

$$\left|\int_I fg\, d\lambda\right| \le \|f\|_p\, \|g\|_q.$$

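Proof. Since $\left|\int_I fg\, d\lambda\right| \le \int_I |f|\,|g|\, d\lambda$ one may suppose that $f, g \ge 0$. The cases $p, q \in \{1, \infty\}$ are simple and left as exercise, so we may suppose that $p, q \in (1, \infty)$. We do the same thing as in the sequence case. Inserting $a = f(x)$, $b = g(x)$ in the calculus Lemma 2.18,

$$f(x)\, g(x) = \inf_{t>0} \Bigl( \frac{t^p}{p} f(x)^p + \frac{t^{-q}}{q} g(x)^q \Bigr) \qquad (x \in I).$$

Thus for fixed $t > 0$ and all $x \in I$

$$f(x)\, g(x) \le \frac{t^p}{p} f(x)^p + \frac{t^{-q}}{q} g(x)^q.$$

Integrating yields

$$\int_I f(x) g(x)\, d\lambda \le \frac{t^p}{p} \int_I f(x)^p\, d\lambda + \frac{t^{-q}}{q} \int_I g(x)^q\, d\lambda$$

for all $t > 0$. Taking the infimum over $t > 0$, again by the calculus lemma from above,

$$\int_I f(x) g(x)\, d\lambda \le \inf_{t>0} \Bigl( \frac{t^p}{p} \int_I f(x)^p\, d\lambda + \frac{t^{-q}}{q} \int_I g(x)^q\, d\lambda \Bigr) = \Bigl(\int_I f(x)^p\, d\lambda\Bigr)^{1/p} \Bigl(\int_I g(x)^q\, d\lambda\Bigr)^{1/q} = \|f\|_p\, \|g\|_q.$$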
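A quick numerical sanity check of Hölder’s inequality on $I = [0, 1]$ may help to fix the roles of $p$ and $q$. In the sketch below the integrals are replaced by midpoint Riemann sums (an approximation, and an assumption of this example), and the functions $f(x) = x$, $g(x) = 1 + x^2$ as well as the exponent $p = 3$ are arbitrary choices; the helper names are hypothetical.

```python
def lp_norm(f, p, n=10_000):
    """Midpoint-rule approximation of the L^p norm of f on [0, 1]."""
    h = 1.0 / n
    return (sum(abs(f((i + 0.5) * h)) ** p for i in range(n)) * h) ** (1.0 / p)

def integral(f, n=10_000):
    """Midpoint-rule approximation of the integral of f over [0, 1]."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h

p = 3.0
q = p / (p - 1.0)                     # conjugate exponent: 1/p + 1/q = 1
f = lambda x: x
g = lambda x: 1.0 + x * x
lhs = abs(integral(lambda x: f(x) * g(x)))   # |int f g|
rhs = lp_norm(f, p) * lp_norm(g, q)          # ||f||_p * ||g||_q
```

Here the exact left side is $\int_0^1 (x + x^3)\, dx = 3/4$. Since Hölder’s inequality also holds for finite weighted sums, the comparison `lhs <= rhs` is valid for the discretised quantities as well, not just in the limit.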

5.19 Exercise Prove Hölder’s inequality for the case p = 1, q = ∞.

Density Results

Do you still remember where our whole discussion of Lebesgue integration started? It was the problem of finding a natural completion of the space $(C[a, b], \|\cdot\|_1)$. Now, finally, we are able to do this. Note that as continuous functions are measurable and bounded, $C[a, b] \subset L^\infty(a, b) \subset L^1(a, b)$. Moreover, the symbol $\|f\|_1$ is unambiguous, as the Lebesgue integral is a generalization of the Riemann integral:

$$\int_a^b |f(x)|\, dx = \|f\|_1 = \int_{(a,b)} |f|\, d\lambda.$$

We also know that $L^1(a, b)$ is complete (Theorem 5.13).

5.20 Theorem The space $C[a, b]$ is dense in $L^p(a, b)$, $1 \le p < \infty$. More generally, the space $C(I) \cap L^p(I)$ is dense in $L^p(I)$, for each interval $I \subset \mathbb{R}$, $1 \le p < \infty$.

(I omit the proof for now, but you will certainly find it in Rana’s book.)

When one wants to prove density of a certain set/subspace, two important principles allow one to use already known density results.

Density Principle 1 Let $(\Omega, d)$ be a metric space and let $A, B \subset \Omega$. If $A$ is dense in $\Omega$ and $A \subset \overline{B}$, then $B$ is dense in $\Omega$.

Proof. This is simply the triangle inequality (what else?!). Let $x \in \Omega$ be arbitrary. For given $\varepsilon > 0$ we have to find an element of $B$ that is $\varepsilon$-close to $x$. By assumption we find $a \in A$ such that $d(a, x) < \varepsilon/2$, and, as $a \in \overline{B}$, we find $b \in B$ such that $d(a, b) < \varepsilon/2$. The triangle inequality yields $d(b, x) < \varepsilon$.

Density Principle 2 Let $E$ be a vector space, $\|\cdot\|_1$ and $\|\cdot\|_2$ norms on $E$ such that there is a constant $c > 0$ with $\|x\|_1 \le c\, \|x\|_2$ $(x \in E)$. If $A$ is dense in $E$ with respect to $\|\cdot\|_2$ then $A$ is also dense in $E$ with respect to $\|\cdot\|_1$.

Proof. Let $x \in E$ be arbitrary. By hypothesis there is a sequence $(a_n)_n \subset A$ such that $\|a_n - x\|_2 \to 0$. Hence also $\|a_n - x\|_1 \to 0$, by the norm estimate.

5.21 Exercise (LN 8.3.6) Show that the polynomials $P$ are dense in $L^p(a, b)$, $1 \le p < \infty$.

5.22 Corollary The space

C10 [a, b] := f ∈ C1[a, b] | f(a) = 0 = f(b)

is dense in Lp(a, b), 1 ≤ p < ∞.


Proof. We treat the case [a, b] = [0, 1], the general case being a slight modification. By Assignment 1, Exercise 4 d), the space $C^1_0[0, 1]$ is dense in

\[ F := \{ f \in C[0, 1] \mid f(0) = 0 = f(1) \} \]

with respect to the uniform norm. But one has $\|f\|_p \le \|f\|_\infty$ for all f ∈ C[0, 1], and so density in the uniform norm implies density in the p-norm (Density Principle 2). By Density Principle 1 it remains to show that F is dense in C[0, 1] with respect to the p-norm. Let f ∈ C[0, 1] be arbitrary, and define for n ≥ 2

\[ \varphi_n(t) := \begin{cases} nt & (t \in [0, 1/n]) \\ 1 & (t \in [1/n, 1 - 1/n]) \\ n(1 - t) & (t \in [1 - 1/n, 1]). \end{cases} \]

Let fn := fϕn. Then fn ∈ F, fn → f pointwise on (0, 1) (in particular: almost everywhere), and |fn| = |fϕn| ≤ |f|. Hence (LDC) implies ‖fn − f‖p → 0. If you do not like (LDC), then you can estimate ‖fn − f‖p by hand!
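The convergence claimed at the end of the proof can also be observed numerically. The following Python sketch is only an illustration under stated assumptions: the sample function f ≡ 1, the midpoint rule, and the helper names `phi` and `p_norm_error` are not part of the notes. For f ≡ 1 a direct computation gives $\|f\varphi_n - f\|_p^p = 2/(n(p+1))$, which the code should reproduce approximately.

```python
import math
from typing import Callable


def phi(n: int, t: float) -> float:
    """Cut-off function of the proof (n >= 2): linear ramps near 0 and 1, equal to 1 in between."""
    if t <= 1.0 / n:
        return n * t
    if t >= 1.0 - 1.0 / n:
        return n * (1.0 - t)
    return 1.0


def p_norm_error(f: Callable[[float], float], n: int, p: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of ||f * phi_n - f||_p on [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += abs(f(t) * phi(n, t) - f(t)) ** p * h
    return total ** (1.0 / p)


if __name__ == "__main__":
    one = lambda t: 1.0  # hypothetical sample function; not in F since one(0) = 1
    for n in (2, 10, 100):
        # numerical value next to the hand-computed value sqrt(2 / (3 n)) for p = 2
        print(n, p_norm_error(one, n, p=2.0), math.sqrt(2.0 / (3.0 * n)))
```

The error decays like $n^{-1/p}$, so the approximation is slow for large p, which fits the fact that the statement fails for p = ∞.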



Lecture 6

Bilinear forms

We now come to the most important examples of normed/Banach spaces, those with a geometrical structure that is familiar from finite-dimensional Euclidean space. The fundamental notion is that of a form.

6.1 Definition Let E be a real vector space. A (bilinear) form on E is a mapping

a : E × E −→ R

such that a(f, ·) and a(·, f) are linear mappings for every f ∈ E.

Thus a form satisfies the equations

a(λf + µg, h) = λa(f, h) + µa(g, h)

a(h, λf + µg) = λa(h, f) + µa(h, g)

for f, g, h ∈ E, λ, µ ∈ R.

6.2 Definition Let a be a bilinear form on E. Then its adjoint form is given by

a∗(f, g) = a(g, f) (f, g ∈ E).

The form a is called symmetric if a = a∗, i.e.

a(f, g) = a(g, f) (f, g ∈ E),

and positive (semi-definite) if it is symmetric and

a(f, f) ≥ 0 (f ∈ E).

A symmetric form is called non-degenerate if the implication

a(f, g) = 0 for all g ∈ E ⇒ f = 0

holds for all f ∈ E. A symmetric form is called positive definite or an inner product or a scalar product if the implication

f ≠ 0 ⇒ a(f, f) > 0

holds.


Note that, by definition, a positive form is always symmetric, and an inner product is positive.

We shall deal mostly with inner products, but the general notion of a form is very important in applications. Indeed, it is central in the functional-analytic treatment of so-called elliptic partial differential equations.

6.3 Examples a) The generic bilinear form on $E = \mathbb{R}^d$ is given by

\[ a(x, y) = \sum_{i,j=1}^{d} a_{ij} x_i y_j \qquad (x, y \in \mathbb{R}^d) \]

where A = (aij)i,j is a d × d matrix. The form is symmetric iff the matrix is so, i.e. iff aij = aji for all i, j. If A = I is the identity matrix, we obtain the standard inner product

\[ (x \mid y) = \sum_{i=1}^{d} x_i y_i \qquad (x, y \in \mathbb{R}^d) \]

on $\mathbb{R}^d$. As a second example consider d = 2 and

\[ a(x, y) = x_1 y_2 + x_2 y_1 \qquad (x, y \in \mathbb{R}^2). \]

This form is obviously symmetric and (not so obviously) non-degenerate. However, it is not positive, and both canonical unit vectors x = e1, e2 satisfy a(x, x) = 0.
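The claims about the second example in a) can be checked by a few evaluations. The sketch below is an illustration only; the names `form`, `is_symmetric`, `swap_witness` and `SWAP` are hypothetical helpers, not notation from the notes. Non-degeneracy is made visible by the observation that for f = (f1, f2) the vector g = (f2, f1) gives a(f, g) = f1² + f2², which is non-zero unless f = 0.

```python
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[float]]
Vector = Sequence[float]


def form(A: Matrix, x: Vector, y: Vector) -> float:
    """Bilinear form a(x, y) = sum over i, j of a_ij * x_i * y_j."""
    return sum(A[i][j] * x[i] * y[j] for i in range(len(x)) for j in range(len(y)))


def is_symmetric(A: Matrix) -> bool:
    """The form is symmetric iff a_ij = a_ji for all i, j."""
    d = len(A)
    return all(A[i][j] == A[j][i] for i in range(d) for j in range(d))


def swap_witness(f: Vector) -> Tuple[float, float]:
    """For a(x, y) = x1*y2 + x2*y1: g = (f2, f1) yields a(f, g) = f1^2 + f2^2."""
    return (f[1], f[0])


SWAP = ((0, 1), (1, 0))  # matrix of the form a(x, y) = x1*y2 + x2*y1

if __name__ == "__main__":
    e1, e2 = (1, 0), (0, 1)
    print(is_symmetric(SWAP))                       # symmetry of the matrix
    print(form(SWAP, e1, e1), form(SWAP, e2, e2))   # both unit vectors give 0
    print(form(SWAP, (1, -1), (1, -1)))             # a negative value: not positive
    f = (3, -2)
    print(form(SWAP, f, swap_witness(f)))           # f1^2 + f2^2, non-zero for f != 0
```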

b) Let $m \in \ell^\infty$ and define

\[ a(f, g) := \sum_{n=1}^{\infty} f(n) g(n) m(n) \]

for $f, g \in \ell^2$. This defines a symmetric form on $\ell^2$ which is positive iff m(n) ≥ 0 for all n ∈ N.

c) Let m ∈ C[a, b]. Then

\[ a(f, g) := \int_a^b f(x) g(x) m(x)\,dx \]

defines a symmetric form on C[a, b]. The form is positive iff m(x) ≥ 0 for all x ∈ [a, b]. (Exercise.)


d) More generally, let $m \in L^\infty(a, b)$. Then

\[ a(f, g) := \int_a^b f(x) g(x) m(x)\,dx \]

defines a symmetric form on $L^2(a, b)$. It is positive if m ≥ 0 a.e. (In fact, the converse is also true, but we would need more results in integration theory to prove it.)

Suppose a is a symmetric form. Then one can make the simple calculation

a(f + g, f + g) = a(f, f) + 2a(f, g) + a(g, g).

Replacing g by −g yields

a(f − g, f − g) = a(f, f)− 2a(f, g) + a(g, g)

and subtracting the second from the first we arrive at

\[ a(f, g) = \tfrac{1}{4} \bigl( a(f + g, f + g) - a(f - g, f - g) \bigr). \]

This important identity is called the polarization identity.⁵

If, however, you add the two identities above, you get

a(f + g, f + g) + a(f − g, f − g) = 2a(f, f) + 2a(g, g)

which is called parallelogram identity (see Exercise 6.8 below).
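Both identities are easy to test on matrix forms with integer entries. The sketch below is an illustration with hypothetical helper names (`form`, `polarization`, `parallelogram_sides`) and exact rational arithmetic. It also shows what happens when the symmetry assumption is dropped: the polarization formula then only recovers the symmetric part of the form.

```python
from fractions import Fraction
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[int]]
Vector = Sequence[int]


def form(A: Matrix, x: Vector, y: Vector) -> int:
    """Bilinear form a(x, y) = sum over i, j of a_ij * x_i * y_j (integer data assumed)."""
    return sum(A[i][j] * x[i] * y[j] for i in range(len(x)) for j in range(len(y)))


def add(x: Vector, y: Vector) -> Tuple[int, ...]:
    return tuple(a + b for a, b in zip(x, y))


def sub(x: Vector, y: Vector) -> Tuple[int, ...]:
    return tuple(a - b for a, b in zip(x, y))


def polarization(A: Matrix, f: Vector, g: Vector) -> Fraction:
    """Right-hand side of the polarization identity: (a(f+g, f+g) - a(f-g, f-g)) / 4."""
    s, d = add(f, g), sub(f, g)
    return Fraction(form(A, s, s) - form(A, d, d), 4)


def parallelogram_sides(A: Matrix, f: Vector, g: Vector) -> Tuple[int, int]:
    """Left- and right-hand side of the parallelogram identity."""
    s, d = add(f, g), sub(f, g)
    return form(A, s, s) + form(A, d, d), 2 * form(A, f, f) + 2 * form(A, g, g)


if __name__ == "__main__":
    A = ((2, 1), (1, -3))          # symmetric sample matrix (an assumption of this sketch)
    f, g = (1, 2), (3, -1)
    print(form(A, f, g), polarization(A, f, g))   # the two values agree
    print(parallelogram_sides(A, f, g))           # both sides agree
    N = ((0, 1), (0, 0))           # non-symmetric matrix: a(x, y) = x1 * y2
    print(form(N, (1, 0), (0, 1)), polarization(N, (1, 0), (0, 1)))  # values differ
```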

A cornerstone of the theory of forms is the so-called (generalized) Cauchy–Schwarz–Bunyakowski⁶ inequality. It shows why positive forms play a central role.

6.4 Theorem (Generalized Cauchy–Schwarz inequality) Let a, b be two symmetric forms on a vector space E such that

\[ |a(f, f)| \le b(f, f) \qquad (f \in E). \]

Then

\[ |a(f, g)| \le \sqrt{b(f, f)}\,\sqrt{b(g, g)} \qquad (f, g \in E). \]

⁵ As a consequence one sees that a symmetric form a is already determined by the values of its associated quadratic form qa(x) := a(x, x), x ∈ E.

⁶ For historical reasons, the name Bunyakowski is often omitted.


Proof. By applying first the polarization identity for a and then the parallelogram identity for b we obtain

\[ \begin{aligned} 4\,|a(f, g)| &= |a(f + g, f + g) - a(f - g, f - g)| \\ &\le |a(f + g, f + g)| + |a(f - g, f - g)| \\ &\le b(f + g, f + g) + b(f - g, f - g) \\ &= 2 b(f, f) + 2 b(g, g). \end{aligned} \]

We now choose t > 0 and replace f by tf and g by t⁻¹g in the above computation. As a(tf, t⁻¹g) = t t⁻¹ a(f, g) = a(f, g), we obtain

\[ |a(f, g)| = |a(tf, t^{-1} g)| \le \frac{t^2}{2}\, b(f, f) + \frac{t^{-2}}{2}\, b(g, g) \qquad (t > 0). \]

From the calculus Lemma 2.18 with p = q = 2 we conclude

\[ |a(f, g)| \le \inf_{t > 0} \Bigl( \frac{t^2}{2}\, b(f, f) + \frac{t^{-2}}{2}\, b(g, g) \Bigr) = \sqrt{b(f, f)}\,\sqrt{b(g, g)}, \]

which was to be proved.
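The scaling trick in the proof can be traced on a concrete pair of forms. In the sketch below (an illustration; the names `a`, `b`, `bound` and `optimal_t` are hypothetical), a is the form x1y2 + x2y1 on R² and b is the standard inner product, so that |a(f, f)| = 2|f1 f2| ≤ f1² + f2² = b(f, f). When b(f, f) and b(g, g) are both positive, the infimum over t is attained at t⁴ = b(g, g)/b(f, f).

```python
import math
from typing import Sequence

Vector = Sequence[float]


def a(f: Vector, g: Vector) -> float:
    """Symmetric, non-positive form a(f, g) = f1*g2 + f2*g1 on R^2."""
    return f[0] * g[1] + f[1] * g[0]


def b(f: Vector, g: Vector) -> float:
    """Standard inner product on R^2; it dominates a on the diagonal."""
    return f[0] * g[0] + f[1] * g[1]


def bound(t: float, f: Vector, g: Vector) -> float:
    """Upper bound (t^2 / 2) b(f, f) + (t^-2 / 2) b(g, g) from the proof, for t > 0."""
    return 0.5 * t ** 2 * b(f, f) + 0.5 * t ** -2 * b(g, g)


def optimal_t(f: Vector, g: Vector) -> float:
    """Minimizer of t -> bound(t, f, g); requires f != 0 and g != 0."""
    return (b(g, g) / b(f, f)) ** 0.25


if __name__ == "__main__":
    f, g = (1.0, 2.0), (3.0, 1.0)
    t = optimal_t(f, g)
    print(abs(a(f, g)))                       # left-hand side
    print(bound(1.0, f, g), bound(t, f, g))   # bound before and after optimizing t
    print(math.sqrt(b(f, f) * b(g, g)))       # equals the optimized bound
```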

6.5 Corollary Let a be a positive form on a vector space E and f ∈ E. Then

a(f, f) = 0 ⇔ a(f, g) = 0 ∀ g ∈ E.

In particular, a is non-degenerate if and only if it is positive definite, i.e. an inner product.

Proof. The direction ⇐ is trivial by choosing g = f, and the direction ⇒ is just the generalized Cauchy–Schwarz inequality (applied with b = a).

Inner product spaces

Let us now suppose that we are given a linear space E and a fixed inner product on it. There are different conventions to denote such an inner product, for example

〈f, g〉, (f | g)

or simply (f, g) as LN does. This last convention has the disadvantage that it is the same as for the ordered pair (f, g), and this may cause considerable confusion. Hence in these lecture notes we stick to the notation 〈f, g〉. We define

\[ \|f\| := \sqrt{\langle f, f \rangle} \qquad (f \in E) \]

and will soon see that this is a norm. (Can you see definiteness and homogeneity already?)


6.6 Corollary (Cauchy–Schwarz Inequality) Let (E, 〈·, ·〉) be an inner product space. Then for f, g ∈ E

\[ |\langle f, g \rangle| \le \|f\|\,\|g\| \tag{6.2} \]

with equality if and only if f, g are linearly dependent.

Proof. The inequality is a direct consequence of Theorem 6.4, with a = b = 〈·, ·〉. There is a different (shorter and more direct) approach in LN 7.2.3, where the additional statement is also proved.

6.7 Corollary Let (E, 〈·, ·〉) be an inner product space. Then $\|f\| := \sqrt{\langle f, f \rangle}$ defines a norm on E. The scalar product is continuous with respect to this norm.⁷

Proof. Definiteness and homogeneity are simple. For the triangle inequality, we apply Cauchy–Schwarz to obtain

\[ \|f + g\|^2 = \|f\|^2 + 2 \langle f, g \rangle + \|g\|^2 \le \|f\|^2 + 2 \|f\|\,\|g\| + \|g\|^2 = (\|f\| + \|g\|)^2 \]

for all f, g ∈ E. We prove the continuity of the scalar product. Let ‖fn − f‖ → 0 and ‖gn − g‖ → 0. By bilinearity,

\[ \langle f_n, g_n \rangle - \langle f, g \rangle = \langle f_n - f, g_n - g \rangle + \langle f, g_n - g \rangle + \langle f_n - f, g \rangle, \]

and taking absolute values, by Cauchy–Schwarz, we obtain

\[ |\langle f_n, g_n \rangle - \langle f, g \rangle| \le \|f_n - f\|\,\|g_n - g\| + \|f\|\,\|g_n - g\| + \|f_n - f\|\,\|g\| \to 0. \]

6.8 Exercise Interpreting ‖x‖ as the length of the vector x, draw a picture to interpret the parallelogram law geometrically. Are you able to deduce this by geometrical means (in $H = \mathbb{R}^2$, of course)?

The inner product spaces which are complete with respect to the induced norm have their own name.

6.9 Definition An inner product space (E, 〈·, ·〉) which is a Banach space with respect to the norm induced by the inner product is called a Hilbert space.

⁷ The latter contains the assertion of LN 7.2.4.


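6.10 Examples We just list the most important ones.

a) $H = \mathbb{R}^d$ with respect to any inner product. (All norms on $\mathbb{R}^d$ are equivalent!)

b) $H = \ell^2(\mathbb{N})$ with respect to the inner product

\[ \langle f, g \rangle = \sum_{n=1}^{\infty} f(n) g(n) \qquad (f, g \in \ell^2). \]

The induced norm is just the 2-norm:

\[ \sqrt{\langle f, f \rangle} = \Bigl( \sum_{n=1}^{\infty} |f(n)|^2 \Bigr)^{1/2} = \|f\|_2. \]

c) $H = L^2(I)$ with respect to the inner product

\[ \langle f, g \rangle = \int_I f(x) g(x)\,dx \qquad (f, g \in L^2(I)). \]

The induced norm is just the 2-norm:

\[ \sqrt{\langle f, f \rangle} = \Bigl( \int_I |f(x)|^2\,dx \Bigr)^{1/2} = \|f\|_2. \]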

The Sobolev Space H1(a, b)

We come to an example of a different type, but one that is very important in applications.

6.11 Definition Let $f \in L^2(a, b)$. A function $g \in L^2(a, b)$ is called a weak derivative of f if

\[ \int_a^b g(x) \varphi(x)\,dx = - \int_a^b f(x) \varphi'(x)\,dx \]

for all functions $\varphi \in C^1_0[a, b]$.

Remark. The space $C^1_0[a, b]$, defined at the end of Lecture 5, is called the space of test functions.

Second Remark. If $f \in C^1[a, b]$ then g = f′ (usual derivative) is a weak derivative (integration by parts!).

Third Remark. A weak derivative, if it exists, is unique.

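Proof. Suppose that $g_1, g_2 \in L^2(a, b)$ are weak derivatives of $f \in L^2(a, b)$. Define g := g1 − g2. Then by hypothesis

\[ \begin{aligned} \langle g, \varphi \rangle &= \int_a^b g_1(x) \varphi(x)\,dx - \int_a^b g_2(x) \varphi(x)\,dx \\ &= - \int_a^b f(x) \varphi'(x)\,dx - \Bigl( - \int_a^b f(x) \varphi'(x)\,dx \Bigr) = 0 \end{aligned} \]

for all $\varphi \in C^1_0[a, b]$. Since $C^1_0[a, b]$ is dense in $L^2(a, b)$ (Corollary 5.22) and the inner product is continuous, 〈g, h〉 = 0 for all $h \in L^2(a, b)$. Choosing h = g gives ‖g‖ = 0, and this implies that g = 0.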

A last Remark. A weak derivative need not exist. An example is $H = L^2(-1, 1)$ and therein $f = \mathbf{1}_{(0,1)}$. (The proof of the claim is a not too easy exercise, maybe a challenge. But you can do it!)
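The defining identity of Definition 6.11 can be tried out numerically on (−1, 1). The sketch below is an illustration under assumptions that are not in the notes: the test function φ(x) = (1 − x)(1 + x)², the midpoint rule, and the helper names `integrate` and `weak_derivative_defect`. It suggests that f(x) = |x| has the weak derivative sign(x), and that for the indicator function of (0, 1) the obvious candidate g = 0 violates the identity. A single test function proves neither existence nor non-existence; it only makes the definition concrete.

```python
from typing import Callable

Func = Callable[[float], float]


def integrate(h: Func, a: float, b: float, steps: int = 20000) -> float:
    """Midpoint rule on [a, b]; steps is even so that 0 is a grid point of [-1, 1]."""
    width = (b - a) / steps
    return sum(h(a + (k + 0.5) * width) for k in range(steps)) * width


def weak_derivative_defect(f: Func, g: Func, phi: Func, dphi: Func,
                           a: float = -1.0, b: float = 1.0) -> float:
    """Return int g*phi dx + int f*phi' dx; this is 0 if the identity holds for phi."""
    return (integrate(lambda x: g(x) * phi(x), a, b)
            + integrate(lambda x: f(x) * dphi(x), a, b))


# Hypothetical test function in C^1_0[-1, 1] and its derivative.
phi: Func = lambda x: (1.0 - x) * (1.0 + x) ** 2
dphi: Func = lambda x: (1.0 + x) * (1.0 - 3.0 * x)

absolute: Func = lambda x: abs(x)
sign: Func = lambda x: 1.0 if x > 0 else -1.0
indicator: Func = lambda x: 1.0 if 0.0 < x < 1.0 else 0.0
zero: Func = lambda x: 0.0

if __name__ == "__main__":
    print(weak_derivative_defect(absolute, sign, phi, dphi))   # close to 0
    print(weak_derivative_defect(indicator, zero, phi, dphi))  # close to -phi(0) = -1
```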

6.12 Definition The space $H^1(a, b)$ is defined as

\[ H^1(a, b) := \{ f \in L^2(a, b) \mid f \text{ has a weak derivative} \} \]

and is called the first Sobolev space. For $f \in H^1(a, b)$ we denote by f′ its weak derivative.

On $H^1(a, b)$ we define the inner product

\[ \langle f, g \rangle_{H^1} := \langle f, g \rangle_{L^2} + \langle f', g' \rangle_{L^2} = \int_a^b f(x) g(x)\,dx + \int_a^b f'(x) g'(x)\,dx \]

for $f, g \in H^1(a, b)$.

6.13 Exercise Check that this is indeed an inner product.

If one writes out the norm of $f \in H^1(a, b)$ one obtains:

\[ \|f\|_{H^1} = \bigl( \|f\|_2^2 + \|f'\|_2^2 \bigr)^{1/2}. \]

This shows that convergence of a sequence (fn)n in $H^1(a, b)$ is the same as the $L^2$-convergence of (fn)n and of (f′n)n.

6.14 Theorem The first Sobolev space $H^1(a, b)$ is a Hilbert space.

Proof. Suppose that $(f_n)_n \subset H^1(a, b)$ is a Cauchy sequence, i.e. $\|f_n - f_m\|_{H^1} \to 0$ as n, m → ∞. As we observed already, this means that both $\|f_n - f_m\|_2 \to 0$ and $\|f_n' - f_m'\|_2 \to 0$ as n, m → ∞, which is nothing else than to say that both sequences (fn)n and (f′n)n are Cauchy sequences in $L^2(a, b)$. By the completeness of $L^2(a, b)$, there are functions $f, g \in L^2(a, b)$ such that

\[ \|f_n - f\|_2 \to 0 \quad \text{and} \quad \|f_n' - g\|_2 \to 0 \]

as n → ∞. It suffices to show that g is a weak derivative of f, for then $f \in H^1(a, b)$ and $\|f_n - f\|_{H^1} \to 0$. In order to establish this, let $\varphi \in C^1_0[a, b]$. Then by the continuity of the scalar product with respect to its induced norm (Corollary 6.7 above)

\[ \langle g, \varphi \rangle_{L^2} = \lim_n \langle f_n', \varphi \rangle_{L^2} = \lim_n - \langle f_n, \varphi' \rangle_{L^2} = - \langle f, \varphi' \rangle_{L^2}. \]

(We used that f′n is a weak derivative of fn, for each n ∈ N.)


Lecture 7

Orthonormal systems and bases

The most important aspect of Hilbert space theory is the presence of geometry in the form of orthogonality.

7.1 Definition Let (H, 〈·, ·〉) be an inner product space. Two elements f, g ∈ H are said to be orthogonal, in symbols: f ⊥ g, if 〈f, g〉 = 0. For a non-empty subset A ⊂ H we define its orthogonal space as

\[ A^\perp := \{ f \in H \mid f \perp a \ \ \forall\, a \in A \} \]

and we write f ⊥ A if f ∈ A⊥.

7.2 Lemma Let A ⊂ H be non-empty. Then $A^\perp = \overline{\operatorname{span}(A)}^{\,\perp}$, where the bar denotes the closure, is a closed subspace of H.⁸

Proof. Suppose that x, y ⊥ A and λ, µ ∈ R. Then for every a ∈ A:

〈λx + µy, a〉 = λ 〈x, a〉 + µ 〈y, a〉 = 0

(bilinearity of the inner product), whence λx + µy ⊥ a. Hence λx + µy ⊥ A. This shows that A⊥ is a subspace. If (xn)n ⊂ A⊥ and xn → x then 0 = 〈xn, a〉 → 〈x, a〉 for every a ∈ A, since the inner product is continuous. Thus x ∈ A⊥, and hence A⊥ is closed. The same arguments (now applied in the second argument of the inner product) show that if x ⊥ A then x is orthogonal to the closure of span(A), whence $A^\perp \subset \overline{\operatorname{span}(A)}^{\,\perp}$, the reverse inclusion being trivial.

7.3 Lemma (Pythagoras) Let x, y ∈ H. Then x ⊥ y if and only if ‖x + y‖² = ‖x‖² + ‖y‖². More generally, let x1, . . . , xn be pairwise orthogonal elements of H. Then

\[ \|x_1 + \dots + x_n\|^2 = \|x_1\|^2 + \dots + \|x_n\|^2. \]

Proof. One has

\[ \|x + y\|^2 = \langle x + y, x + y \rangle = \|x\|^2 + 2 \langle x, y \rangle + \|y\|^2 \]

from which the first assertion follows. The second is just a little induction argument since, by the previous lemma, $x_{j+1} \perp x_1 + \dots + x_j$ for all j = 1, . . . , n − 1.

⁸ Recall that for a subset A of a vector space E its linear span is

\[ \operatorname{span}(A) := \Bigl\{ \sum_{j=1}^{n} \lambda_j a_j \;\Big|\; n \in \mathbb{N},\ \lambda_j \in \mathbb{R},\ a_j \in A \ (j = 1, \dots, n) \Bigr\}, \]

also called the subspace generated by A.

You find non-trivial examples of orthogonal elements in inner product spaces in LN Exercise 7.4.2 and 7.4.4.

Consider the space $H = \ell^2$ and therein the sequence (en)n of unit vectors as defined in Example 4.2. We have seen there that each element of H has a series representation $x = \sum_n \lambda_n e_n$. The coefficient λn can be found from x by taking the scalar product of x with en:

\[ \lambda_n = \langle x, e_n \rangle_{\ell^2} \qquad (n \in \mathbb{N}). \]

We aim at showing that such a representation is possible in any Hilbert space. The key is the following concept.

7.4 Definition Let H be a Hilbert space. A collection of vectors (ei)i∈I is called an orthonormal system (ONS) if

\[ \langle e_i, e_j \rangle = \delta_{ij} := \begin{cases} 0 & i \ne j \\ 1 & i = j. \end{cases} \]

(The δ-symbol here is conventional, and is called the Kronecker delta.) In particular $\|e_i\| = \sqrt{\langle e_i, e_i \rangle} = 1$ for all i ∈ I.⁹ For a vector x ∈ H we call the scalar

\[ \langle x, e_i \rangle \]

the i-th (abstract) Fourier coefficient and the (formal) series

\[ \sum_{i \in I} \langle x, e_i \rangle e_i \]

its (abstract) Fourier series. We shall show that in any Hilbert space such a series always converges¹⁰ and that, in case the ONS is maximal in a certain sense,

\[ x = \sum_{i \in I} \langle x, e_i \rangle e_i \]

holds. Finally, we shall give a concrete example in the case that $H = L^2(0, 2\pi)$.

⁹ Please note that I here denotes just any indexing set and not an interval, as sometimes in previous lectures.

¹⁰ As I is a general index set, one has first to make sense of this.


Bessel’s inequality

We begin with a preparatory result.

7.5 Lemma Suppose that e1, . . . , en is an orthonormal system and λ1, . . . , λn ∈ R. If $x := \sum_{j=1}^{n} \lambda_j e_j$ then

a) $\langle x, e_k \rangle = \sum_{j=1}^{n} \lambda_j \langle e_j, e_k \rangle = \lambda_k$ (k = 1, . . . , n).

b) $\|x\|^2 = \sum_{j=1}^{n} |\lambda_j|^2 = \sum_{j=1}^{n} |\langle x, e_j \rangle|^2$.

Proof. This is easy, the second being just Pythagoras’ theorem.

Note that by a) each (finite) ONS is a linearly independent set. So it is a basis for its linear span.

7.6 Lemma (Orthogonal Projection, Part 1) Let e1, . . . , en be a finite ONS in the inner product space H and denote by F its linear span:

\[ F := \operatorname{span}\{ e_j \mid j = 1, \dots, n \}. \]

For x ∈ H we define

\[ P_F x := \sum_{j=1}^{n} \langle x, e_j \rangle e_j. \]

Then $P_F x$ is the unique element y ∈ F satisfying x − y ⊥ F. One has

\[ \|x\|^2 = \|x - P_F x\|^2 + \sum_{j=1}^{n} |\langle x, e_j \rangle|^2. \]

Proof. Clearly $P_F x \in F$. For every k ∈ {1, . . . , n} we have

\[ \langle P_F x, e_k \rangle = \langle x, e_k \rangle \]

by Lemma 7.5. Hence $x - P_F x \perp e_k$ for all k, and so $x - P_F x \perp F$. If $y = \sum_{j=1}^{n} \lambda_j e_j \in F$ is such that x − y ⊥ F, then again by Lemma 7.5,

\[ \lambda_k = \langle y, e_k \rangle = \langle x, e_k \rangle \]

for all k and hence $y = P_F x$. This proves uniqueness. The last assertion is just Pythagoras:

\[ \|x\|^2 = \|x - P_F x\|^2 + \|P_F x\|^2 = \|x - P_F x\|^2 + \sum_{j=1}^{n} |\langle x, e_j \rangle|^2. \]

This finishes the proof.

The vector PF x is called the orthogonal projection of x onto F .
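In R³ with the standard inner product the formula for the projection can be evaluated directly. The following sketch is an illustration; the helper names `inner` and `project` and the sample data are assumptions, not part of the notes. It computes the projection from the Fourier coefficients and checks both the orthogonality of the residual and the norm identity of Lemma 7.6.

```python
from typing import List, Sequence

Vector = Sequence[float]


def inner(x: Vector, y: Vector) -> float:
    """Standard inner product on R^d."""
    return sum(a * b for a, b in zip(x, y))


def project(x: Vector, ons: Sequence[Vector]) -> List[float]:
    """P_F x = sum_j <x, e_j> e_j for a finite orthonormal system (e_j)."""
    coeffs = [inner(x, e) for e in ons]
    return [sum(c * e[k] for c, e in zip(coeffs, ons)) for k in range(len(x))]


if __name__ == "__main__":
    ons = [(1.0, 0.0, 0.0), (0.0, 0.6, 0.8)]   # sample ONS spanning a plane F
    x = (2.0, 1.0, 3.0)
    p = project(x, ons)
    residual = [xi - pi for xi, pi in zip(x, p)]
    print(p)                                     # the projection P_F x
    print([inner(residual, e) for e in ons])     # residual is orthogonal to F (up to rounding)
    # ||x||^2 versus ||x - P_F x||^2 + sum of squared Fourier coefficients
    print(inner(x, x), inner(residual, residual) + sum(inner(x, e) ** 2 for e in ons))
```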


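7.7 Corollary (Bessel inequality) Let (ei)i∈I be an ONS in an inner product space H and let x ∈ H. Then the set $I_x := \{ i \in I \mid \langle x, e_i \rangle \ne 0 \}$ is at most countable and

\[ \sum_{i \in I} |\langle x, e_i \rangle|^2 := \sup_{J \subset I,\ J \text{ finite}} \ \sum_{j \in J} |\langle x, e_j \rangle|^2 \le \|x\|^2. \]

Proof. By the previous lemma,

\[ \sum_{j \in J} |\langle x, e_j \rangle|^2 \le \|x\|^2 \]

for every finite subset J ⊂ I. This proves Bessel’s inequality. Consider the set

\[ I_n := \{ i \in I \mid |\langle x, e_i \rangle|^2 \ge 1/n \}. \]

Then if J ⊂ In is finite,

\[ \frac{\# J}{n} \le \sum_{j \in J} |\langle x, e_j \rangle|^2 \le \|x\|^2, \]

and hence $\# I_n \le n \|x\|^2$ is finite. Therefore, $I_x = \bigcup_{n \in \mathbb{N}} I_n$ is at most countable.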

Complete orthonormal systems

As we want to represent x as $x = \sum_{i \in I} \langle x, e_i \rangle e_i$, we have to look now at convergence issues.

7.8 Theorem Let (ej)j∈N be an ONS in a Hilbert space H, and let λ : N −→ R. Then the following assertions are equivalent:

(i) The sum $\sum_{j=1}^{\infty} \lambda_j e_j$ converges in H.

(ii) $\lambda \in \ell^2$.

Proof. Let $s_n := \sum_{j=1}^{n} \lambda_j e_j$. Since H is a Hilbert space, convergence of the sum is equivalent to (sn)n being a Cauchy sequence. But with m ≥ n we have

\[ \|s_m - s_n\|^2 = \Bigl\| \sum_{j=n+1}^{m} \lambda_j e_j \Bigr\|^2 = \sum_{j=n+1}^{m} \|\lambda_j e_j\|^2 = \sum_{j=n+1}^{m} \lambda_j^2 \]

by Pythagoras. Thus (sn)n is a Cauchy sequence if and only if λ is square-summable.


7.9 Lemma Let $\lambda \in \ell^2$ and (en)n∈N be an ONS in a Hilbert space H. Then the sum $\sum_n \lambda_n e_n$ is independent of the order of the summands, i.e.

\[ \sum_{n=1}^{\infty} \lambda_n e_n = \sum_{n=1}^{\infty} \lambda_{\pi(n)} e_{\pi(n)} \]

for every permutation (= bijective map) π : N −→ N.

Proof. To be filled in.

Remark: The property shown in the previous lemma is called unconditional summability. By a famous theorem of Riemann, in finite-dimensional spaces unconditional summability and absolute summability are the same. Here we can see that this is no longer true in infinite dimensions: Let (en)n be an ONS in a Hilbert space H and λ : N −→ R. The previous results show that the series $\sum_n \lambda_n e_n$ converges unconditionally if and only if $\lambda \in \ell^2$. However, by definition it converges absolutely if

\[ \sum_n |\lambda_n| = \sum_n |\lambda_n|\,\|e_n\| = \sum_n \|\lambda_n e_n\| < \infty, \]

hence if and only if $\lambda \in \ell^1$. Since the sequence (1/n)n is square summable but not absolutely summable, the series

\[ \sum_{n=1}^{\infty} \frac{1}{n}\, e_n \]

converges unconditionally but not absolutely in H.
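The two kinds of behaviour can be seen in the partial sums. By Pythagoras, the squared distance between two partial sums of the series with coefficients 1/n is a tail of Σ 1/j², while the sum of the norms of the terms is a harmonic sum. The sketch below is an illustration; the function names `squared_tail` and `norm_partial_sum` are hypothetical.

```python
import math


def squared_tail(n: int, m: int) -> float:
    """||s_m - s_n||^2 = sum_{j=n+1}^{m} 1/j^2 for the series sum_j (1/j) e_j."""
    return sum(1.0 / j ** 2 for j in range(n + 1, m + 1))


def norm_partial_sum(m: int) -> float:
    """sum_{j=1}^{m} ||(1/j) e_j|| = sum_{j=1}^{m} 1/j, the m-th harmonic number."""
    return sum(1.0 / j for j in range(1, m + 1))


if __name__ == "__main__":
    for m in (10 ** 3, 10 ** 4, 10 ** 5):
        # the tails beyond n = 100 stay below 1/100, the harmonic sums exceed log(m)
        print(m, squared_tail(100, m), norm_partial_sum(m), math.log(m))
```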

By Bessel’s inequality, if (ei)i∈I is an ONS in a Hilbert space, the sum

\[ \sum_{i \in I} \langle x, e_i \rangle e_i \]

is meaningful in the following sense: apart from at most countably many i all terms are equal to zero; for the remaining i you can choose any enumeration

\[ \{ i \in I \mid \langle x, e_i \rangle \ne 0 \} = I_x = \{ i_1, i_2, \dots \} \]

and then form the sum $\sum_k \langle x, e_{i_k} \rangle e_{i_k}$. This exists, by Theorem 7.8, since by Bessel’s inequality the sequence $(\langle x, e_{i_n} \rangle)_{n \in \mathbb{N}}$ is square summable. The result is then also independent of the enumeration, by Lemma 7.9. (If you find this a bit too challenging, no problem. Just think of all ONS’s as countable ones with a given enumeration.)

We now turn to the case that this so-defined sum is actually equal to the vector x itself.


7.10 Definition and Theorem

An ONS (ei)i∈I in a Hilbert space H is called maximal or complete or an orthonormal basis if one of the following equivalent conditions holds:

1) $\{ e_i \mid i \in I \}^\perp = \{0\}$.

2) $x = \sum_{i \in I} \langle x, e_i \rangle e_i$ for every x ∈ H.

3) $\|x\|^2 = \sum_{i \in I} |\langle x, e_i \rangle|^2$ for every x ∈ H. (Parseval identity)

4) The space $\operatorname{span}\{ e_i \mid i \in I \}$ is dense in H.

Proof. 1) ⇒ 2): Fix x ∈ H. Without loss of generality we may assume Ix = N. Define $P_n x := \sum_{j=1}^{n} \langle x, e_j \rangle e_j$. As explained above, $y := \lim_{n \to \infty} P_n x$ exists in H. Now, by Lemma 7.5, x − Pn x ⊥ ek if n ≥ k, and so x − y ⊥ ek for every k ∈ Ix. But x − Pn x ⊥ ei for i ∉ Ix, trivially! So x − y ⊥ ei for all i ∈ I, and thus, by hypothesis, x − y = 0.

2) ⇔ 3): Without loss of generality we may suppose that I = Ix = N. Then

\[ \Bigl\| x - \sum_{j=1}^{n} \langle x, e_j \rangle e_j \Bigr\|^2 + \sum_{j=1}^{n} |\langle x, e_j \rangle|^2 = \|x\|^2 \]

for each n ∈ N. The equivalence of 2) and 3) is immediate.

2) ⇒ 4): This is trivial.

4) ⇒ 1): By Lemma 7.2 above

\[ \{ e_i \mid i \in I \}^\perp = \overline{\operatorname{span}\{ e_i \mid i \in I \}}^{\,\perp} = H^\perp = \{0\}. \]

The proof is complete.

7.11 Example In the Hilbert space $H = \ell^2$ the canonical unit vectors en as introduced in Lecture 4 form a complete ONS.

Separable spaces and the Gram-Schmidt procedure

Of course, the question arises whether a complete ONS always exists. This is indeed the case, by an application of a result which is called Zorn’s Lemma and is very close to a fundamental set-theoretic axiom, the so-called Axiom of Choice. Moreover, some other set-theoretic reasoning also reveals that two complete ONS’s must be of the same “size” (cardinality). This is a little relief when one does general theory, but in practice one is not interested in existence proofs but in concrete examples. And in the course of this lecture (almost) all ONS’s are countable.

There is a class of spaces where one can construct a complete ONS explicitly. These are the separable ones. The concept is not bound to Hilbert spaces.


7.12 Definition A normed space E is called separable if there is a sequence of finite-dimensional subspaces (Fn)n∈N of E such that

\[ F_n \subset F_{n+1} \ \ (n \in \mathbb{N}) \quad \text{and} \quad \overline{\bigcup_{n \in \mathbb{N}} F_n} = E. \]

7.13 Exercise Use the fact that Q is dense in R to prove that a space E is separable if and only if E contains a countable dense set. (This shows that our definition of separability is coherent with the usual definition in metric space theory.)

7.14 Examples 1) The Banach space C[a, b] is separable, since by Weierstrass’ theorem, the subspaces $F_n := \operatorname{span}\{ t^k \mid k = 0, \dots, n \}$ satisfy the requirements of Definition 7.12.

2) If 1 ≤ p < ∞, $L^p(a, b)$ and $\ell^p$ are separable.

3) If H is a Hilbert space with a countable complete ONS, then H is separable.

Now let H be a separable Hilbert space (with sequence of subspaces Fn). By adding subspaces if necessary, we may suppose dim Fn = n. We choose successively vectors $x_n \in F_n \setminus F_{n-1}$, n ∈ N (with F0 := {0}). From this we manufacture an orthogonal system by letting

\[ y_n := x_n - P_{F_{n-1}} x_n. \]

Then yn ⊥ Fn−1 and yn ≠ 0 for all n ∈ N, by choice of the xn. Finally we define

\[ e_n := y_n / \|y_n\|. \]

Then (en)n∈N is an ONS. Since

\[ \operatorname{span}\{ e_1, \dots, e_n \} = \operatorname{span}\{ x_1, \dots, x_n \} = F_n \]

for all n, this ONS is complete. (This process is called the Gram-Schmidt procedure.)
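For finitely many vectors in R^d the procedure just described can be written out in a few lines. The sketch below is an illustration with hypothetical names (`inner`, `gram_schmidt`) and the standard inner product; it subtracts from each x_n its projection onto the span of the previously constructed e_1, …, e_{n−1} and then normalizes, exactly as in the text. The tolerance used to detect linear dependence is an assumption of the sketch.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def inner(x: Vector, y: Vector) -> float:
    """Standard inner product on R^d."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(vectors: Sequence[Vector], tol: float = 1e-12) -> List[List[float]]:
    """Turn linearly independent x_1, ..., x_n into an ONS e_1, ..., e_n with equal spans."""
    ons: List[List[float]] = []
    for x in vectors:
        # y_n = x_n - P_{F_{n-1}} x_n, with P computed from the ONS found so far
        y = list(x)
        for e in ons:
            c = inner(x, e)
            y = [yk - c * ek for yk, ek in zip(y, e)]
        norm = math.sqrt(inner(y, y))
        if norm <= tol:
            raise ValueError("vectors are (numerically) linearly dependent")
        ons.append([yk / norm for yk in y])   # e_n = y_n / ||y_n||
    return ons


if __name__ == "__main__":
    es = gram_schmidt([(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)])
    for i, ei in enumerate(es):
        # rows of the Gram matrix; should be close to the rows of the identity matrix
        print([round(inner(ei, ej), 12) for ej in es])
```

In floating-point arithmetic this classical variant can lose orthogonality for badly conditioned input; that numerical issue is outside the scope of the notes.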

To sum up, we have established the following theorem.

7.15 Theorem Let H be a Hilbert space. Then H is separable if and only if there exists an at most countable complete ONS in H.

7.16 Exercise Let (en)n∈N be a complete ONS in the Hilbert space H and let (fi)i∈I be a second ONS. Prove that I is also countable.


Classical Fourier Series

We now come to an important example of a complete ONS in the space $H = L^2(0, 2\pi)$.

7.17 Lemma Consider the Hilbert space $L^2(0, 2\pi)$. Then

(1) sin nx ⊥ cos mx (n, m ∈ N0)

(2) sin nx ⊥ sin mx (n, m ∈ N, n ≠ m)

(3) cos nx ⊥ cos mx (n, m ∈ N, n ≠ m)

(4) $\|\cos nx\|_2^2 = \|\sin nx\|_2^2 = \pi$ (n ∈ N)

Proof. The easiest way to see this is to consider the complex-valued functions $e_n(x) := e^{inx}$, n ∈ Z. Then if n ≠ 0

\[ \int_0^{2\pi} e_n(x)\,dx = \frac{e^{inx}}{in} \Big|_0^{2\pi} = \frac{1 - 1}{in} = 0 \]

by the fundamental theorem of calculus, hence

\[ \int_0^{2\pi} e_n(x)\,dx = \begin{cases} 0 & (n \ne 0) \\ 2\pi & (n = 0). \end{cases} \]

Now take n, m ∈ Z and observe that

\[ \begin{aligned} e^{i(n-m)x} = e^{inx} e^{-imx} &= (\cos nx + i \sin nx)(\cos mx - i \sin mx) \\ &= (\cos nx \cos mx + \sin nx \sin mx) + i (\sin nx \cos mx - \sin mx \cos nx). \end{aligned} \]

If n ≠ m then integrating yields

\[ \begin{aligned} 0 &= \int_0^{2\pi} \cos nx \cos mx\,dx + \int_0^{2\pi} \sin nx \sin mx\,dx \\ 0 &= \int_0^{2\pi} \sin nx \cos mx\,dx - \int_0^{2\pi} \sin mx \cos nx\,dx. \end{aligned} \]

Replacing n by −n in these equations establishes (2) and (3), and (1) in the case n ≠ m. (Note that cos(−x) = cos x and sin(−x) = − sin x.) Moreover, letting m = −n in the second equation above yields

\[ 2 \int_0^{2\pi} \sin nx \cos nx\,dx = 0 \]

which is the missing case in (1). On the other hand, m = −n in the first yields

\[ \int_0^{2\pi} (\cos nx)^2\,dx = \int_0^{2\pi} (\sin nx)^2\,dx, \]

and this amounts to (4) when using the well-known formula (cos nx)² + (sin nx)² = 1: the two integrals add up to 2π, so each of them equals π.
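The orthogonality relations of the lemma are easy to confirm numerically. The sketch below is an illustration; the function name `l2_inner` and the use of the midpoint rule with 4096 nodes are assumptions. For products of low-order trigonometric functions the midpoint rule on a full period is accurate up to rounding, so the printed values should be very close to 0 and π.

```python
import math
from typing import Callable

Func = Callable[[float], float]


def l2_inner(u: Func, v: Func, steps: int = 4096) -> float:
    """Midpoint-rule approximation of the L^2(0, 2*pi) inner product of u and v."""
    h = 2.0 * math.pi / steps
    return sum(u((k + 0.5) * h) * v((k + 0.5) * h) for k in range(steps)) * h


def sin_n(n: int) -> Func:
    return lambda x: math.sin(n * x)


def cos_n(n: int) -> Func:
    return lambda x: math.cos(n * x)


if __name__ == "__main__":
    print(l2_inner(sin_n(2), cos_n(3)))            # (1): close to 0
    print(l2_inner(sin_n(2), cos_n(2)))            # (1) with n = m: close to 0
    print(l2_inner(sin_n(2), sin_n(5)))            # (2): close to 0
    print(l2_inner(cos_n(1), cos_n(4)))            # (3): close to 0
    print(l2_inner(cos_n(3), cos_n(3)), math.pi)   # (4): close to pi
```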

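By the previous lemma, the so-called trigonometric system

\[ \Bigl\{ \frac{1}{\sqrt{2\pi}} \Bigr\} \cup \Bigl\{ \frac{\cos nx}{\sqrt{\pi}} \;\Big|\; n \in \mathbb{N} \Bigr\} \cup \Bigl\{ \frac{\sin nx}{\sqrt{\pi}} \;\Big|\; n \in \mathbb{N} \Bigr\} \]

is an orthonormal system in $H = L^2(0, 2\pi)$. The corresponding abstract Fourier series of a function $f \in L^2(0, 2\pi)$ is

\[ \frac{\alpha_0}{\sqrt{2\pi}} + \frac{1}{\sqrt{\pi}} \sum_{n=1}^{\infty} \bigl( \alpha_n \cos nx + \beta_n \sin nx \bigr) \]

where $\alpha_0 = \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} f(s)\,ds$ and

\[ \alpha_n = \frac{1}{\sqrt{\pi}} \int_0^{2\pi} f(s) \cos(ns)\,ds, \qquad \beta_n = \frac{1}{\sqrt{\pi}} \int_0^{2\pi} f(s) \sin(ns)\,ds \qquad (n \in \mathbb{N}) \]

are its Fourier coefficients.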

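7.18 Theorem The trigonometric system is maximal. Hence

\[ f = \frac{\alpha_0}{\sqrt{2\pi}} + \frac{1}{\sqrt{\pi}} \sum_{n=1}^{\infty} \bigl( \alpha_n \cos nx + \beta_n \sin nx \bigr) \]

for every $f \in L^2(0, 2\pi)$, where the sum converges in the L²-sense.

Proof. In LN, Exercise 9.4.5, it is shown that if $f \in C^1_0[0, 2\pi]$, then

\[ f(x) = \frac{\alpha_0}{\sqrt{2\pi}} + \frac{1}{\sqrt{\pi}} \sum_{n=1}^{\infty} \bigl( \alpha_n \cos nx + \beta_n \sin nx \bigr) \]

uniformly, i.e., in the L∞-sense. Since

\[ \|g\|_2 \le \sqrt{2\pi}\, \|g\|_\infty \qquad (g \in L^2(0, 2\pi)) \]

the Fourier series converges also in the L²-norm. Hence $C^1_0[0, 2\pi]$ is contained in the closure of the span of the trigonometric system. As we know that $C^1_0[0, 2\pi]$ is dense in $H = L^2(0, 2\pi)$, by the Density Principle 1 (after Theorem 5.20) we see that 4) in Definition 7.10 is satisfied for the trigonometric system.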
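The L²-convergence can be made quantitative for a concrete function. For the sample function f(x) = x on (0, 2π) (an assumption of this sketch, as are the names below) integration by parts gives the inner products with the elements of the trigonometric system in closed form: the coefficient for the constant function has square 2π³, the cosine coefficients vanish, and the n-th sine coefficient has square 4π/n². By the norm identity of Lemma 7.6, the squared L²-error of the N-th partial sum is ‖f‖² minus the sum of the squared coefficients used.

```python
import math

# Closed-form data for the sample function f(x) = x on (0, 2*pi).
NORM_SQ = 8.0 * math.pi ** 3 / 3.0   # ||f||^2 = integral of s^2 over (0, 2*pi)
CONST_COEFF_SQ = 2.0 * math.pi ** 3  # <f, 1/sqrt(2*pi)>^2


def sine_coeff_sq(n: int) -> float:
    """<f, sin(n x)/sqrt(pi)>^2 = (2*pi/n)^2 / pi = 4*pi / n^2."""
    return 4.0 * math.pi / n ** 2


def squared_l2_error(N: int) -> float:
    """||f - P_N f||^2 for the partial sum using the constant and sin(x), ..., sin(N x)."""
    return NORM_SQ - CONST_COEFF_SQ - sum(sine_coeff_sq(n) for n in range(1, N + 1))


if __name__ == "__main__":
    for N in (1, 10, 100, 1000):
        print(N, squared_l2_error(N))   # decreases towards 0, roughly like 4*pi/N
```

That the error tends to 0 is exactly the Parseval identity for this f, since Σ 1/n² = π²/6.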



Lecture 8

Best approximations

Let H be an inner product space, F ⊂ H a linear subspace and x ∈ H. In the case that F is finite-dimensional we found, with the help of an orthonormal basis of F (which may be constructed by Gram-Schmidt), a vector $y = P_F x$ with the following two properties: 1) y ∈ F and 2) y − x ⊥ F. We showed that these properties actually determine the vector y uniquely. (In particular, y does not depend on the orthonormal basis we used to find it.)

We shall now see that in a Hilbert space we have such a mapping $P_F$ not only for finite-dimensional subspaces, but for any closed subspace F. Geometric intuition lets us expect that the vector $P_F x$ is closest to x among all vectors in F, i.e., minimizes the distance to x. This leads us to the next (very general) definition.

8.1 Definition Let (Ω, d) be a metric space, A ⊂ Ω and x ∈ Ω. Then

\[ d(x, A) := \inf\{ d(x, a) \mid a \in A \} \]

is called the distance of x to the set A. Each point a ∈ A such that d(x, A) = d(x, a) is called a best approximation of x in A.

A best approximation a of x in A is thus a minimizer of the function

\[ (a \longmapsto d(x, a)) : A \longrightarrow \mathbb{R}_+. \]

In general such minimizers do not necessarily exist, and when they exist, they are not necessarily unique.

8.2 Exercise Let (Ω, d) be a metric space, A ⊂ Ω and x ∈ Ω. Show that $x \in \overline{A}$ (the closure of A) if and only if d(x, A) = 0.

8.3 Examples a) By the previous exercise, if A is not closed then for $x \in \overline{A} \setminus A$ there cannot be a best approximation in A: since d(x, A) = 0, a best approximation a would satisfy d(x, a) = 0 and a ∈ A, and hence x = a ∈ A, which is false by choice of x. A special case of this is $A = c_{00}$, the space of finite sequences, with $\Omega = \ell^2$ and x = (1/n)n∈N, see LN 11.1.7.


b) In Banach spaces different from Hilbert spaces best approximations may fail to exist, even if the set A is closed. Here is an example. We consider $E = \ell^\infty$ and $x := (1 + 1/n)_{n \in \mathbb{N}} \in \ell^\infty$. The set A is defined by

\[ A := \{ e_1,\ e_1 + e_2,\ e_1 + e_2 + e_3,\ \dots \} \]

where en denotes the n-th canonical unit vector. (Hence e1 + · · · + en is the sequence that starts with n ones and then continues with zeroes.) Any two distinct vectors in A have ‖·‖∞-distance equal to 1, therefore any sequence in A that converges in E is eventually constant. Consequently, A is closed. Now

\[ d(x, e_1 + \dots + e_n) = \|x - (e_1 + \dots + e_n)\|_\infty = \sup\{ 1 + 1/m \mid m > n \} = 1 + 1/(n + 1) > 1, \]

but $d(x, A) = \inf_m (1 + 1/m) = 1$. No best approximation of x exists in A.

c) One can find a closed subspace F of a Banach space E and a vector x ∈ E such that there is no best approximation of x in F. We shall meet an example soon.

d) See LN 11.1.2 for an example in two dimensions showing that a best approximation need not be unique.

What does this have to do with our original question of orthogonal projection?

8.4 Lemma Let H be an inner product space, let F ⊂ H be a closed linear subspace and let x ∈ H and y ∈ F. The following assertions are equivalent:

(i) x − y ⊥ F.

(ii) ‖x − y‖ = d(x, F), i.e. y is a best approximation of x in F. In other words: y minimizes the map $(z \longmapsto \|x - z\|) : F \longrightarrow \mathbb{R}_+$.

Proof. Since F is a linear subspace of H and y ∈ F, we have

\[ F = \{ y - tz \mid z \in F,\ t > 0 \}. \]

Therefore

\[ \begin{aligned} \|x - y\| = d(x, F) &\iff \|x - y\|^2 \le \|x - f\|^2 \quad \forall\, f \in F \\ &\iff \|x - y\|^2 \le \|x - y + tz\|^2 \quad \forall\, z \in F,\ t > 0 \\ &\iff 0 \le 2 \langle x - y, tz \rangle + \|tz\|^2 \quad \forall\, z \in F,\ t > 0 \\ &\iff 0 \le t \|z\|^2 + 2 \langle x - y, z \rangle \quad \forall\, z \in F,\ t > 0 \\ &\iff 0 \le \langle x - y, z \rangle \quad \forall\, z \in F. \end{aligned} \]

(For the last equivalence let t → 0.) Since in the last line we can always replace z by −z, we see that this holds

\[ \iff \langle x - y, z \rangle = 0 \quad \forall\, z \in F \]

as desired.

So the problem of finding an orthogonal projection reduces to the problem of minimizing a distance.

8.5 Definition A closed subset A of a normed vector space E is called convex if

\[ x, y \in A \ \Rightarrow\ \tfrac{1}{2}(x + y) \in A. \]

(This means: for any two points in A the midpoint of the line segment joining them is also a point of A.)¹¹

8.6 Theorem Let H be a Hilbert space, and let A be a non-empty closed convex subset of H. Furthermore, let x ∈ H. Then there is a unique vector $P_A x := y \in A$ with

\[ \|x - y\| = d(x, A). \]

Proof. Let us define $d := d(x, A)^2 = \inf\{ \|x - a\|^2 \mid a \in A \}$. For y, z ∈ A we have by the parallelogram identity

\[ \begin{aligned} \|y - z\|^2 = \|(y - x) - (z - x)\|^2 &= 2 \|y - x\|^2 + 2 \|z - x\|^2 - 4 \Bigl\| \frac{y + z}{2} - x \Bigr\|^2 \\ &\le 2 \|y - x\|^2 + 2 \|z - x\|^2 - 4d \end{aligned} \]

since (y + z)/2 ∈ A as A is convex. If both z, y minimize ‖· − x‖² then ‖x − y‖² = d = ‖x − z‖² and we obtain

\[ \|y - z\|^2 \le 2d + 2d - 4d = 0 \]

whence y = z. This proves uniqueness. To show existence, let (yn)n be a minimizing sequence in A, i.e. yn ∈ A and $d_n := \|x - y_n\|^2 \searrow d$. For m ≥ n we replace y, z by yn, ym in the above and obtain

\[ \|y_n - y_m\|^2 \le 2 \|y_n - x\|^2 + 2 \|y_m - x\|^2 - 4d = 2 d_n + 2 d_m - 4d \le 4 (d_n - d). \]

Hence if ε > 0 is given and we choose N ∈ N so large that dn ≤ d + ε/4 for all n ≥ N, then

\[ \|y_n - y_m\|^2 \le \varepsilon \quad \forall\, m \ge n \ge N. \]

Therefore, (yn)n is a Cauchy sequence in A, and as H is supposed to be a Hilbert space, i.e. complete, there is a limit $y := \lim_n y_n$. As A is closed and yn ∈ A for all n, we have y ∈ A too. But the norm is continuous, and so

\[ \|y - x\|^2 = \lim_n \|x - y_n\|^2 = \lim_n d_n = d, \]

and we have found our desired minimizer.

¹¹ More generally, a subset A of a vector space E is called convex if for any pair of points x, y ∈ A one also has tx + (1 − t)y ∈ A for all t ∈ [0, 1]. If E is a normed space and A is closed, then our definition is equivalent to this.
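A simple instance of the theorem, not taken from the notes, is the closed unit cube A = [0, 1]^d in R^d with the Euclidean norm. It is closed and convex, and since the squared distance is a sum over the coordinates, the minimizer can be computed coordinate by coordinate by clamping each entry to [0, 1]. The sketch below (hypothetical names `project_box` and `dist`) compares the distance to the clamped point with the distances to randomly chosen points of the cube.

```python
import math
import random
from typing import List, Sequence

Vector = Sequence[float]


def project_box(x: Vector, lo: float = 0.0, hi: float = 1.0) -> List[float]:
    """Best approximation of x in the cube [lo, hi]^d: clamp every coordinate."""
    return [min(max(xi, lo), hi) for xi in x]


def dist(x: Vector, y: Vector) -> float:
    """Euclidean distance on R^d."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


if __name__ == "__main__":
    random.seed(0)                       # fixed seed so that runs are reproducible
    x = (1.5, -0.2, 0.3)
    p = project_box(x)
    best_random = min(dist(x, [random.random() for _ in x]) for _ in range(10000))
    print(p, dist(x, p), best_random)    # no sampled point of the cube is closer than p
```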

Remark. Our formulation of Theorem 8.6 is not the most general. Instead of requiring H to be a Hilbert space, one can do with the mere assumption that H is a (possibly non-complete) inner product space, as long as one knows that the convex set A is not just closed in H but complete with respect to the induced metric. Compare this to the formulation of LN, Theorem 11.1.6 and the discussion in LN 11.1.7.

Another Remark. We proved Theorem 8.6 for convex sets whereas LN just sticks to linear subspaces. We shall not make use of the more general version. However, it is very important in applications, as any look into a book on approximation theory or numerical analysis will reveal.

Orthogonal projection

Let us return to the case where the convex set A is actually a closed subspace F of H.

8.7 Theorem and Definition (Orthogonal Projection, Part 2) Let F be a closed subspace of a Hilbert space H. For x ∈ H we call the vector $P_F x$ the orthogonal projection of x onto F. It is determined by either one of the following conditions:

1) $P_F x \in F$ and $P_F x - x \perp F$.

2) $P_F x \in F$ and $\|P_F x - x\| = d(x, F)$, i.e. $P_F x$ is a (the) best approximation of x in F.

3) $P_F x \in F$ and $\langle x, e_i \rangle = \langle P_F x, e_i \rangle$ for all i ∈ I, with (ei)i∈I being any complete ONS of F.

4) $P_F x$ is the Fourier series

\[ P_F x = \sum_{i \in I} \langle x, e_i \rangle e_i \]

of x with respect to any complete ONS (ei)i∈I of F.


Proof. (See also LN 11.2.8.) The equivalence of 1) and 2) was shown in Lemma 8.4. Uniqueness follows from Theorem 8.6. Let (ei)i∈I be any complete ONS of F. The equivalence of 3) and 4) follows directly from Theorem and Definition 7.10. Define

\[ P x := \sum_{i \in I} \langle x, e_i \rangle e_i \qquad (x \in H). \]

As in the discussion preceding Theorem 7.10 we see that the vector Px is well-defined: Bessel’s inequality shows that there are at most countably many i ∈ I with 〈x, ei〉 ≠ 0. Choosing any enumeration of these we may assume I ⊂ N; again Bessel’s inequality yields that the sequence (〈x, ei〉)i∈I is square summable, and since we are in a Hilbert space, the sum converges by Theorem 7.8. As the sum is a limit of linear combinations of the ei and F is closed, Px ∈ F. By linearity and continuity of the inner product,

\[ \langle P x, e_k \rangle = \sum_{i \in I} \langle x, e_i \rangle \langle e_i, e_k \rangle = \sum_{i \in I} \langle x, e_i \rangle \delta_{ik} = \langle x, e_k \rangle \]

for every k ∈ I. Hence Px − x ⊥ ek for all k and therefore

\[ P x - x \perp \overline{\operatorname{span}\{ e_k \mid k \in I \}} = F \]

as the ONS is complete. Hence $Px = P_F x$ by the characterization 1).

8.8 Corollary Let H be a Hilbert space and F a closed linear subspace of H. Then the orthogonal projection $P_F : H \longrightarrow F$ is linear and satisfies

\[ \|P_F x\| \le \|x\| \qquad (x \in H). \]

Proof. This is immediate if one can use a complete ONS for F to construct $P_F$. If one wants to avoid this, one can argue as follows. Let λ, µ ∈ R and x, y ∈ H. Since F is a subspace, $\lambda P_F x + \mu P_F y \in F$. Moreover

\[ (\lambda x + \mu y) - (\lambda P_F x + \mu P_F y) = \lambda (x - P_F x) + \mu (y - P_F y) \perp F \]

(Lemma 7.2). By uniqueness $\lambda P_F x + \mu P_F y = P_F(\lambda x + \mu y)$, and linearity is proved. By Pythagoras, for any x ∈ H,

\[ \|P_F x\|^2 \le \|P_F x\|^2 + \|x - P_F x\|^2 = \|x\|^2. \]

Taking square roots yields $\|P_F x\| \le \|x\|$.


8.9 Corollary Let F be a closed linear subspace of a Hilbert space H. Then $(F^\perp)^\perp = F$.

Proof. Since F ⊥ F⊥ we have F ⊂ (F⊥)⊥. Conversely, take x ∈ (F⊥)⊥. Then x ⊥ F⊥ and $P_F x \perp F^\perp$ (since $P_F x \in F$). Therefore $x - P_F x \perp F^\perp$; but also $x - P_F x \in F^\perp$, and so $x - P_F x$ is orthogonal to itself, hence 0. Thus $x = P_F x \in F$.

The orthogonal projection leads us to a natural decomposition of a Hilbertspace.

8.10 Definition Let E be a normed vector space and U, V linear subspaces. If

E = u + v | u ∈ U, v ∈ V =: U + V, and U ∩ V = 0

then we call E the algebraic direct sum of U and V and write

E = U ⊕ V.

In this case, if E is a normed space and U, V are closed then E is called the(topological) direct sum of U, V .

8.11 Exercise Let U, V be linear subspaces of a vector space E. Show that E = U ⊕ V if and only if every vector x ∈ E can be written in a unique way as a sum x = u + v with u ∈ U, v ∈ V. In this case, mappings P_U : E −→ U and P_V : E −→ V are defined such that x = P_U x + P_V x for each x ∈ E. Show that the mappings P_U and P_V are linear. They are called the canonical projections onto U, V associated with the direct sum decomposition E = U ⊕ V.

(See LN Remark 11.2.2; the linearity is done like in the proof of Corollary 8.8.)

8.12 Theorem Let F be a closed linear subspace of a Hilbert space H. Then F⊥

is a closed subspace of H and

H = F ⊕ F⊥.

The orthogonal projection P_F is the canonical projection of H onto F associated with this decomposition.

Proof. That F⊥ is always a closed linear subspace of H is Lemma 7.2. Furthermore F ∩ F⊥ = {0} by the definiteness of the inner product. And for every x ∈ H we have

x = PF x + (x− PF x) ∈ F + F⊥,

giving H = F + F⊥ and the last assertion.
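
To see the decomposition at work in the simplest possible case, here is a small Python sketch for H = R³ with the standard inner product. It is only an illustration: the orthonormal pair spanning F, the vector x and the helper names dot and project are my own choices and not part of the lecture.

```python
import math

def dot(u, v):
    """Standard inner product on R^d."""
    return sum(a * b for a, b in zip(u, v))

def project(x, ons):
    """Return P x = sum_i <x, e_i> e_i for an orthonormal list `ons`."""
    p = [0.0] * len(x)
    for e in ons:
        c = dot(x, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

# Hypothetical data: F is spanned by an orthonormal pair in R^3.
s = 1 / math.sqrt(2)
ons = [[s, s, 0.0], [0.0, 0.0, 1.0]]
x = [3.0, 1.0, 2.0]

px = project(x, ons)                   # the component of x in F
r = [a - b for a, b in zip(x, px)]     # x - P_F x, the component in F-perp
```

Here r = x − P_F x comes out orthogonal to both basis vectors, and ‖P_F x‖² + ‖r‖² agrees with ‖x‖² up to rounding, as Pythagoras predicts.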


Minimal norm problems

Suppose you have a closed subspace G of a Hilbert space H and a vector f0 ∉ G, and you want to find an element m ∈ M := f0 − G with minimal norm; equivalently, you want to find a g0 ∈ G such that ‖f0 − g0‖ is minimal. Such a g0 is the best approximation of f0 in G and hence satisfies m = f0 − g0 ⊥ G. Thus g0 = P_G f0 and

m := f0 − P_G f0 = P_{G⊥} f0.

In LN 11.2.10 one considers the Hilbert space H := H¹(0,∞) (which is nowhere defined in LN!) and therein the set

M := {f ∈ H¹(0,∞) | f(0) = 1}.

It is used without proof that H¹(0,∞) is continuously embedded into the space BC[0,∞). From this it follows that

G := {f ∈ H¹(0,∞) | f(0) = 0}

is a closed linear subspace of H. Choosing any function f0 ∈ M one can then write M = f0 − G.

One wants to find an element m in M with minimal norm. This means m ⊥ G and m(0) = 1. In LN 11.2.10 it is shown that m(t) := e^{−t} does the job; in the proof it is used (without proof) that one can integrate by parts within H¹(0,∞).

If this example is mysterious to you, do not worry. There are too many things not yet defined or unproved. More accessible examples are in the Exercises LN 11.3.3.



Lecture 9

Bounded linear operators

We now leave Hilbert spaces for a short time and look at some concepts valid on general Banach spaces. Recall that a mapping

T : E −→ F

between vector spaces E,F is called linear if

T (αx + βy) = αT (x) + βT (y) (α, β ∈ R, x, y ∈ E).

9.1 Exercise Let E be a finite dimensional vector space with basis b1, . . . , bd, and let T : E −→ F be a linear mapping, where F is any vector space. Show that T is uniquely determined by the values Tbj, j = 1, . . . , d. Conversely, show that for any vectors c1, . . . , cd ∈ F there is a (unique) linear mapping T : E −→ F such that Tbj = cj, j = 1, . . . , d.

Linear mappings are often called (linear) operators; following LN 10.1.4, I try to reserve the word “operator” for the case where E = F, but be aware that this is not a generally accepted practice.¹² In contrast to this, it is common practice to write “Tx” instead of “T(x)”. Furthermore, linear mappings with F = R are called (linear) functionals (see LN 10.1.5). We denote by

L(E, F) := {T : E −→ F | T linear}

the set of all linear maps between the vector spaces E and F, with the abbreviation L(E) := L(E, E). The following lemma allows one to construct new linear mappings from known ones.

9.2 Lemma Let E, F, G be vector spaces, let T, S : E −→ F and R : F −→ G be linear mappings, and let α, β ∈ R. Then the mappings (αT + βS) : E −→ F and RS : E −→ G, defined by

(αT + βS)(x) := αTx + βSx, (RS)(x) := R(S(x)) (x ∈ E)

are also linear.

Proof. Exercise in Linear Algebra.

Associated with a linear mapping T : E −→ F are its nullspace or kernel N(T) := {x ∈ E | Tx = 0} = T⁻¹{0} and its range or image R(T) := {Tx | x ∈ E}.

¹² See LN Example 10.3.3 where LN does not follow its own terminological convention.

9.3 Exercise Show that N(T) is a linear subspace of E and R(T) is a linear subspace of F (LN 10.4.2 a)). Show that T is injective (one-to-one) if and only if N(T) = {0} (see LN 10.3.7).

The (obviously linear) mapping T : E −→ F defined by Tx := 0 for all x ∈ E is simply denoted by 0. If E = F, the identity operator I = Id is defined by Id(x) = x for all x ∈ E. The identity operator is also trivially linear.

A linear mapping T : E −→ F is called invertible if it is bijective (i.e., one-to-one and onto). In this case its inverse mapping T⁻¹ : F −→ E exists and is also linear (see Exercise LN 10.4.2).

So far the concepts are linear-algebraic. But as we have normed spaces, we ask when a linear mapping is continuous. To characterize continuity of a linear map, we need the notion of boundedness.

9.4 Lemma Let E,F be normed spaces and T : E −→ F linear. Then

sup{‖Tx‖ | x ∈ E, ‖x‖ ≤ 1} = inf{M ≥ 0 | ‖Tx‖ ≤ M‖x‖ for all x ∈ E}

(with the convention that inf ∅ := ∞).¹³

Proof. Define A := {‖Tx‖ | x ∈ E, ‖x‖ ≤ 1} and

B := {M ≥ 0 | ‖Tx‖ ≤ M‖x‖ for all x ∈ E},

and define α := sup A and β := inf B.

Suppose that β < ∞ and M ∈ B. Then ‖Tx‖ ≤ M‖x‖ ≤ M for all x ∈ E such that ‖x‖ ≤ 1. Thus α = sup A ≤ M. This being true for all M ∈ B we conclude that α ≤ β = inf B. In particular, α < ∞.

Suppose that α < ∞. Then for every x ≠ 0, ‖x/‖x‖‖ = 1 and hence ‖T(x/‖x‖)‖ ∈ A. This implies ‖T(x/‖x‖)‖ ≤ α. Multiplying this by ‖x‖, by the linearity of T and the homogeneity of the norm we obtain

‖Tx‖ = ‖x‖ ‖T(x/‖x‖)‖ ≤ α‖x‖.

This argument worked for all x ≠ 0, but the inequality is trivially true for x = 0. Hence α ∈ B. This implies β < ∞ and β ≤ α.

Bringing the two arguments together we see that α < ∞ iff β < ∞, and in that case α = β. (If both are infinite, the asserted identity holds trivially.)

¹³ Note that in ‖Tx‖ we mean the norm on F, and in ‖x‖ we mean the norm on E.


9.5 Definition A linear map T : E −→ F between normed spaces E, F is called bounded if the number

‖T‖ := sup{‖Tx‖ | x ∈ E, ‖x‖ ≤ 1}

is finite. This number is called the (operator) norm of T. If ‖T‖ ≤ 1, then T is called a (linear) contraction.

Remark: In general the supremum in the definition of ‖T‖ is not a maximum, i.e. there are cases where there is no x ≠ 0 in E such that ‖Tx‖ = ‖T‖‖x‖.

By Lemma 9.4, T is bounded iff there exists M ≥ 0 such that ‖Tx‖ ≤ M‖x‖ for all x ∈ E. Moreover, from the proof of Lemma 9.4 we know that one can choose M = ‖T‖. Thus we obtain the following.

9.6 Corollary Let T : E −→ F be a bounded linear map. Then

‖Tx‖ ≤ ‖T‖ ‖x‖ (x ∈ E).

The connection with continuity is given by the next result.

9.7 Theorem Let E,F be normed spaces and T : E −→ F linear. The following

assertions are equivalent:

(i) T is continuous.

(ii) T is continuous at 0.

(iii) T is bounded.

(iv) There is a constant M such that ‖Tx‖ ≤ M ‖x‖ (x ∈ E).

Proof. The implication (i) =⇒ (ii) is trivial. To prove the implication (iv) =⇒ (i), take a convergent sequence xn → x in E. Then

‖Txn − Tx‖ = ‖T(xn − x)‖ ≤ M‖xn − x‖ → 0.

Hence ‖Txn − Tx‖ → 0 by sandwiching. (See Assignment 1, Exercise 1b).)

The equivalence (iv) ⇐⇒ (iii) is Lemma 9.4.

Suppose that (iv) does not hold. Then for each n ∈ N there exists xn ∈ E such that ‖Txn‖ > n‖xn‖. In particular xn ≠ 0 and we may define

\[ y_n := \frac{x_n}{n \|x_n\|} \qquad (n \in \mathbb{N}). \]

Then ‖yn‖ = n⁻¹, whence yn → 0. But

\[ \|T y_n\| = \frac{1}{n \|x_n\|} \|T x_n\| > \frac{n \|x_n\|}{n \|x_n\|} = 1. \]

Hence (Tyn)n does not converge to 0. Consequently, T is not continuous at 0. By contraposition, we thus have established the implication (ii) =⇒ (iv).

A bounded linear mapping is uniquely determined by its values on a dense subspace.

9.8 Lemma (Density principle) Let E, F be normed spaces and let E0 be a dense subspace of E.

a) Let T : E −→ F be a bounded operator and let T0 = T|_{E0} be its restriction to E0. Then T0 is bounded and ‖T0‖ = ‖T‖.

b) Let T, S : E −→ F be two bounded operators. If Sx = Tx for all x ∈ E0, then S = T.

Proof. a) If T is bounded on E, then ‖Tx‖ ≤ ‖T‖‖x‖ for all x ∈ E, in particular for all x ∈ E0. Hence T0 is bounded with ‖T0‖ ≤ ‖T‖. On the other hand, one has ‖Tx‖ = ‖T0x‖ ≤ ‖T0‖‖x‖ for all x ∈ E0. Approximating a general element x ∈ E by a sequence (xn)n ⊂ E0 we clearly have ‖Txn‖ ≤ ‖T0‖‖xn‖ for all n, and then in the limit ‖Tx‖ ≤ ‖T0‖‖x‖, since T is bounded (= continuous) and the norm is continuous. Hence ‖T‖ ≤ ‖T0‖. This proves a).

b) follows from a): by hypothesis we have (S − T)0 := (S − T)|_{E0} = 0. Hence ‖S − T‖ = ‖(S − T)0‖ = 0. This clearly implies that S − T = 0, i.e. S = T.

9.9 Lemma Let E,F be normed spaces and T : E −→ F be a bounded linear

map. Then N (T ) is a closed linear subspace of E.

Proof. We have already seen that N(T) is a linear subspace. The one-point set A := {0} is closed in F and N(T) is the inverse image of A under the continuous mapping T. Hence N(T) is closed by metric space theory. See also Assignment 1, Exercise 1c).

The range of a bounded linear map is not always closed.

9.10 Example Consider E = C[a, b], F = L²[a, b], and T : E −→ F the inclusion mapping given by Tf = f.¹⁴ Since

\[ \|Tf\|_2 = \Big( \int_a^b |f(s)|^2 \, ds \Big)^{1/2} \le (b-a)^{1/2} \, \|f\|_\infty \qquad (f \in E) \]

we have that T is bounded. Its range R(T) is all of C[a, b], which is dense in F but not the whole of F. Thus R(T) is not closed in F.

¹⁴ Note that T is not the identity mapping, since the spaces E, F are different!


Examples of bounded operators and their norms

A) The easiest operators are certainly the zero operator 0 and the identity mapping Id. As

‖0x‖ = ‖0‖ = 0

for all x ∈ E, one has ‖0‖ = 0. As Id x = x for all x ∈ E, one has ‖Id‖ = sup{‖x‖ | ‖x‖ ≤ 1} = 1 (provided E ≠ {0}).

Suppose F is a normed space and y ∈ F is fixed. Consider the mapping

T := (t ↦ ty) : R¹ −→ F.

The map T is obviously linear, and since ‖Tt‖ = ‖ty‖ = |t|‖y‖ one has ‖T‖ ≤ ‖y‖. Since T(1) = y and |1| = 1, one then concludes that ‖T‖ = ‖y‖.

B) Consider an arbitrary linear mapping T : R^d −→ F, where F is any normed space. Then for any x ∈ R^d:

\[ \|Tx\| = \Big\| T\Big( \sum_{j=1}^d x_j e_j \Big) \Big\| = \Big\| \sum_{j=1}^d x_j T e_j \Big\| \le \sum_{j=1}^d |x_j| \, \|T e_j\| \le \Big( \max_{j=1,\dots,d} |x_j| \Big) \sum_{j=1}^d \|T e_j\| = C \, \|x\|_\infty \]

with C := Σ_{j=1}^d ‖Te_j‖. This shows that T is bounded. However, it is in general not true that ‖T‖ = Σ_{j=1}^d ‖Te_j‖. Note that we worked with the norm ‖·‖∞ on R^d; since all norms on R^d are equivalent, the boundedness of T does not depend on the chosen norm. However, the actual value of ‖T‖ does very well depend on the chosen norm!
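
The gap between the constant C and the actual norm can be seen in a tiny example. The following Python sketch assumes F = R² with the sup-norm (my choice, for illustration only) and uses the standard fact that in this case the operator norm of a matrix is its maximal absolute row sum; the helper names are hypothetical.

```python
def sup_norm(v):
    """The norm ||v||_inf on R^d."""
    return max(abs(c) for c in v)

def crude_bound(matrix):
    """C = sum_j ||T e_j||_inf; the columns of the matrix are the vectors T e_j."""
    d = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(d)]
    return sum(sup_norm(col) for col in columns)

def norm_inf_to_inf(matrix):
    """Operator norm with the sup-norm on both sides: maximal absolute row sum."""
    return max(sum(abs(a) for a in row) for row in matrix)

identity = [[1.0, 0.0], [0.0, 1.0]]
c = crude_bound(identity)          # the constant C from the estimate
n = norm_inf_to_inf(identity)      # the actual operator norm
```

For the identity on R² the estimate gives C = 2, whereas ‖Id‖ = 1, so the constant C is only an upper bound.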

Since every finite dimensional vector space is isomorphic to R^d for some d (choose a basis), and since all norms on R^d are equivalent, we obtain:

9.11 Theorem Let E, F be normed spaces and T : E −→ F linear. If dim E < ∞ then T is bounded.

C) Let us turn to more concrete forms of linear mappings. If E is a proper function space, i.e., E is a subspace of F(Ω) for some set Ω, then one may consider the point-evaluation

δω := (f ↦ f(ω)) : E −→ R

at a fixed point ω ∈ Ω. (This is also called the Dirac functional at ω.) This is certainly linear, and may or may not be bounded. For example, if E = B(Ω) is the space of all bounded functions with the sup-norm, then

|δω(f)| = |f(ω)| ≤ ‖f‖∞

by definition of ‖f‖∞. Hence

δω : B(Ω) −→ R

is bounded, with norm ‖δω‖ ≤ 1. To see that ‖δω‖ = 1 we choose f = 1: clearly ‖1‖∞ = 1 and δω(1) = 1(ω) = 1, whence ‖δω‖ = 1.

The same example works with E = BC(Ω) if Ω is a metric space, in particular with E = C[a, b] (LN 10.1.3 f and 10.2.3 f). However, point-evaluations on E = (C[a, b], ‖·‖1) are not continuous, see LN 10.2.8 and Assignment 1, Exercise 4b).
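
One can watch this failure of continuity numerically. The sketch below takes [a, b] = [0, 1], ω = 0 and the tent functions f_n(t) = max(0, 1 − nt); these choices, the midpoint rule and the helper names are assumptions made for illustration, not the argument of LN 10.2.8.

```python
def tent(n):
    """f_n(t) = max(0, 1 - n t): continuous on [0, 1] with f_n(0) = 1."""
    return lambda t: max(0.0, 1.0 - n * t)

def norm_1(f, steps=100_000):
    """Midpoint-rule approximation of the integral of |f| over [0, 1]."""
    h = 1.0 / steps
    return h * sum(abs(f((k + 0.5) * h)) for k in range(steps))

# Quotients |delta_0(f_n)| / ||f_n||_1 for growing n.
ratios = [abs(tent(n)(0.0)) / norm_1(tent(n)) for n in (1, 10, 100)]
```

Since f_n(0) = 1 while the integral of f_n is 1/(2n), the quotients grow like 2n, so no constant M as in Theorem 9.7 (iv) can exist for the norm ‖·‖1.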

D) We look briefly at restrictions. Suppose that [c, d] ⊂ [a, b] are intervals. Then the mapping

T := (f ↦ f|_{[c,d]}) : C[a, b] −→ C[c, d]

is linear. It is bounded for the sup-norm, since obviously

‖Tf‖∞ = sup{|f(x)| | x ∈ [c, d]} ≤ sup{|f(x)| | x ∈ [a, b]} = ‖f‖∞

for each f ∈ C[a, b]. This gives ‖T‖ ≤ 1. Inserting the function 1, which has sup-norm 1, yields ‖T1‖∞ = 1 and hence ‖T‖ = 1.

Similar remarks apply to the restriction mapping

(f ↦ f|_{[c,d]}) : L^p(a, b) −→ L^p(c, d)

for any 1 ≤ p < ∞.

E) On E = ℓ^p we consider the left shift L and the right shift R defined by

L(x1, x2, x3, . . . ) := (x2, x3, . . . ),
R(x1, x2, x3, . . . ) := (0, x1, x2, x3, . . . ).

Thus (Lx)(n) := x(n + 1) for n ∈ N, and (Rx)(n) = 0 if n = 1 and (Rx)(n) = x(n − 1) if n ≥ 2. It is easy to see that

‖Rx‖p = ‖x‖p (x ∈ ℓ^p)

and this is what we call an isometry. It follows immediately that ‖R‖_{ℓ^p→ℓ^p} = 1. For the left shift we obtain for x ∈ ℓ^p

\[ \|Lx\|_p^p = \sum_{n=1}^\infty |(Lx)(n)|^p = \sum_{n=1}^\infty |x(n+1)|^p = \sum_{n=2}^\infty |x(n)|^p \le \sum_{n=1}^\infty |x(n)|^p = \|x\|_p^p \]

which implies that ‖L‖ ≤ 1. However, certainly LRx = x for all x, and so ‖Rx‖ = ‖x‖ = ‖LRx‖. Fixing x ≠ 0 and defining y := Rx/‖Rx‖p we get ‖y‖p = 1 and ‖Ly‖p = 1, thus showing that ‖L‖_{ℓ^p→ℓ^p} = 1.

Note that since LR = Id, R is injective and L is surjective. But R is not surjective, and L is not injective.
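
For finitely supported sequences the two shifts are easy to play with. In the following Python sketch a list stands for a sequence that is zero beyond its last entry; this encoding and the function names are assumptions made for the illustration.

```python
def left_shift(x):
    """(Lx)(n) = x(n + 1): drop the first entry."""
    return list(x[1:])

def right_shift(x):
    """(Rx)(1) = 0 and (Rx)(n) = x(n - 1) for n >= 2: prepend a zero."""
    return [0.0] + list(x)

def p_norm(x, p):
    """The norm of a finitely supported sequence in l^p."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

x = [3.0, -4.0, 1.0]
lr = left_shift(right_shift(x))    # equals x, since LR = Id
rl = right_shift(left_shift(x))    # the first entry is lost, so RL is not Id
e1 = [1.0, 0.0, 0.0]               # nonzero, but L e1 is the zero sequence
```

The vector e1 shows that L is not injective, and every sequence in the range of R starts with 0, so R is not surjective.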

F) Consider the space E = C[a, b] and a function m ∈ C[a, b]. This function induces a linear(!) operator Mm on E by multiplication:

Mm f := mf (f ∈ C[a, b]).

Since

|(Mm f)(x)| = |m(x)f(x)| = |m(x)| |f(x)| ≤ ‖m‖∞ ‖f‖∞

one sees that Mm is bounded and ‖Mm‖ ≤ ‖m‖∞. On the other hand, Mm 1 = m and ‖1‖∞ = 1. Hence

‖Mm‖ ≥ ‖Mm 1‖∞ = ‖m‖∞.

Hence we conclude that ‖Mm‖ = ‖m‖∞.

9.12 Exercise Let E := {f ∈ C[0, 1] | f(1) = 0}, with the supremum norm. This is a closed linear subspace of C[0, 1] as it is the kernel of the bounded functional δ1 = (f ↦ f(1)), see above. Consider the multiplication operator T defined by (Tf)(x) = xf(x), x ∈ [0, 1]. Show that ‖Tf‖∞ < 1 for every f ∈ E such that ‖f‖∞ ≤ 1, but nevertheless ‖T‖ = 1. This is an example showing that in the definition of the operator norm

‖T‖ = sup{‖Tx‖ | ‖x‖ ≤ 1}

the supremum need not be a maximum!

As a second example of a multiplication operator we take a bounded sequence m ∈ ℓ^∞ and p ∈ [1,∞) and consider the operator Am : ℓ^p −→ ℓ^p given by

(Am f)(n) := m(n)f(n) (n ∈ N, f ∈ ℓ^p).

Since the sequence m is bounded, we obtain

\[ \|A_m f\|_p^p = \sum_{n \in \mathbb{N}} |m(n) f(n)|^p \le \sum_{n \in \mathbb{N}} \|m\|_\infty^p \, |f(n)|^p = \|m\|_\infty^p \, \|f\|_p^p \]

for every f ∈ ℓ^p. Therefore Am is bounded and ‖Am‖ ≤ ‖m‖∞. To see that actually ‖Am‖ = ‖m‖∞ we observe that

‖Am en‖p = ‖m(n)en‖p = |m(n)| ‖en‖p = |m(n)|,

where en is the n-th unit vector. Hence ‖Am‖ ≥ |m(n)| for every n, and thus ‖Am‖ ≥ sup_n |m(n)| = ‖m‖∞.

9.13 Exercise Choose the multiplier sequence in such a way that the norm ‖Am‖ is not attained. (See LN 10.2.6 d.)

G) A last example. Let I ⊂ R be any interval and consider the space E = L¹(I). On E the integral is a linear functional

\[ \varphi := \Big( f \mapsto \int_I f(x) \, dx \Big) : L^1(I) \to \mathbb{R}. \]

By the triangle inequality for integrals,

\[ \Big| \int_I f(x) \, dx \Big| \le \int_I |f(x)| \, dx = \|f\|_1 \]

for all f ∈ E and hence ϕ is bounded with ‖ϕ‖ ≤ 1. But on taking a positive function f ≥ 0 one obviously has

ϕ(f) = ‖f‖1

and so ‖ϕ‖ = 1.


Appendix: General remarks about linear mappings

(This material is purely optional. To be read on a long winter’s evening, by those people who agree (or at least sympathize with the opinion) that abstraction, once absorbed, makes things easier.)

The following remarks are to review our list of examples from a more abstract viewpoint. As with, e.g., continuous or differentiable functions in calculus, we identify linear mappings most easily by combining two things:

a) A list of somehow elementary examples.

b) A list of principles by which new examples can be constructed from known ones.

Here, our list of principles is short and is basically contained in Lemma 9.2 above. One more principle, forming so-called “adjoint” operators, will come soon.

Let us therefore turn to the elementary examples. As I said in Lecture 1, functional analysis deals mostly with vector spaces of scalar functions defined on a set Ω, i.e., with subspaces of F(Ω, R). Actually, one should correct this statement a little and replace scalar functions by functions with values in a vector space E. So let us consider spaces of the form

F(Ω, E) := {f | f : Ω −→ E}

where Ω is a set and E is a vector space. Addition and scalar multiplication in F(Ω, E) are, as usual, defined pointwise:

(αf + βg)(ω) := αf(ω) + βg(ω) (α, β ∈ R, f, g ∈ F(Ω, E), ω ∈ Ω).

Now, here are some instances of linear mappings I consider as “elementary” or “basic”.

1) (Base space transformation) Suppose one has a mapping ϕ : Ω −→ Ω′. Then a linear mapping

Tϕ : F(Ω′, E) −→ F(Ω, E)

is induced by defining Tϕ f := f ∘ ϕ.

A special case is the left shift operator L, where Ω = Ω′ = N and ϕ := (n ↦ n + 1), see E) above.
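
In programming terms, Tϕ is just composition from the right. A minimal Python sketch, in which sequences are modelled as functions on the positive integers and the name induced is hypothetical:

```python
def induced(phi):
    """Return T_phi, the mapping that sends f to the composition f o phi."""
    return lambda f: (lambda w: f(phi(w)))

# The left shift arises from phi(n) = n + 1 on N = {1, 2, 3, ...}.
left_shift = induced(lambda n: n + 1)

x = lambda n: 1.0 / n      # the sequence (1, 1/2, 1/3, ...)
lx = left_shift(x)         # the sequence (1/2, 1/3, 1/4, ...)
```

Here lx(n) returns x(n + 1), which is exactly the formula (Lx)(n) = x(n + 1) from E).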

2) (Restriction) This is a special case of the previous one. One takes a subset Ω′ of Ω and considers as base space transformation the inclusion mapping

ϕ := (ω ↦ ω) : Ω′ −→ Ω.

The induced linear mapping on the function spaces is then Tϕ f = f ∘ ϕ = f|_{Ω′}, the restriction of f to Ω′:

(f ↦ f|_{Ω′}) : F(Ω, E) −→ F(Ω′, E).

A special case of restriction is the mapping

(f ↦ f|_{[c,d]}) : C[a, b] −→ C[c, d]

considered above in D).

As another special case, take Ω′ := {ω}, a one-point set. Then the restriction is nothing else than the point evaluation

δω := (f ↦ f(ω)) : F(Ω, E) −→ E,

see C) above.

3) (Extension) One takes a subset Ω′ of Ω and considers the mapping

(f ↦ f̃) : F(Ω′, E) −→ F(Ω, E)

defined by

\[ \tilde f(\omega) = \begin{cases} f(\omega), & \omega \in \Omega', \\ 0, & \omega \in \Omega \setminus \Omega'. \end{cases} \]

This is linear, by the pointwise definition of the vector space operations in spaces of the form F(Ω, E).

As an example consider the subset N of Z. If f ∈ ℓ^p(N) then f̃ is the (double) sequence

(. . . , 0, . . . , 0, f(1), f(2), . . . )

and obviously f̃ ∈ ℓ^p(Z) (appropriately defined). This gives a linear mapping ℓ^p(N) −→ ℓ^p(Z) which is even isometric:

‖f̃‖_{ℓ^p(Z)} = ‖f‖_{ℓ^p(N)} (f ∈ ℓ^p(N)).

4) (Multiplication) This comes in a lot of instances. For example, take a fixed function m : Ω −→ R and define Mm : F(Ω, E) −→ F(Ω, E) by

(Mm x)(ω) := m(ω)x(ω) (ω ∈ Ω).

(Note that this is well defined, as m(ω) ∈ R and x(ω) ∈ E.)

A special case was given above in F) as the multiplication operator on C[a, b].

A little more generally, take a function m : Ω −→ L(E, F) and define

Mm : F(Ω, E) −→ F(Ω, F)

by (Mm x)(ω) := m(ω)[x(ω)] for each ω. One again shows easily that Mm is linear.

A special case of this arises by taking m to be constant: m(ω) := T for all ω ∈ Ω and a fixed linear map T : E −→ F. Then one obtains the multiplication operator

(x ↦ Tx) : F(Ω, E) −→ F(Ω, F)

given by (Tx)(ω) = T(x(ω)), ω ∈ Ω.

5) (Integral) If one knows general measure theory, one can form, associated with a so-called measure space (Ω, Σ, µ), a space L¹(Ω, Σ, µ) and on it a linear functional

\[ f \mapsto \int_\Omega f(x) \, \mu(dx) \qquad (f \in L^1(\Omega, \Sigma, \mu)), \]

the integral. We know a special case of this, where Ω = I is an interval, µ = λ is the Lebesgue measure and Σ = B(I) is the Borel σ-algebra on I. (See also G) above.)

6) (Factor spaces) If E is a vector space and F is a linear subspace then one can define an equivalence relation on E by x ∼ y iff x − y ∈ F. This relation is compatible with the vector space operations, and therefore the space E/F is a vector space such that the so-called canonical surjection

s := (x ↦ [x]) : E −→ E/F

is linear. (Here [x] denotes the equivalence class of x ∈ E. As a set, [x] = x + F = {x + f | f ∈ F}.)

We encountered a special case of this in Lecture 5: E = ℒ¹(I, B, λ) and F := {f | f = 0 almost everywhere}. Then E/F = L¹(I, B, λ).

9.14 Exercise How can the right shift operator R on F(N) be constructed from the above examples?

A last remark: nothing is said here about continuity of these linear mappings on certain normed spaces. That has to be considered individually. As an example look at the point evaluation (Dirac) functional (f ↦ f(0)) : F([0, 1]) −→ R: when restricted to bounded functions with the sup-norm, it is continuous; when restricted to integrable functions with the integral norm, it is not continuous.



Lecture 10

Introduction to integral operators

We take a closer look at a certain class of linear operators, the so-called integral operators. What they are and why they are important is shown in the following examples.

A) Given f ∈ L²(0, 1), consider the initial value problem

u′ = f, u(0) = 0.

The derivative here is to be read in the weak sense. We have seen in the assignments that the functions from H¹(0, 1) are actually continuous up to the boundary, hence we can try to find a solution u ∈ H¹(0, 1). And in fact, again from the assignments we know the solution:

\[ u(t) = Jf(t) = \int_0^t f(s) \, ds \qquad (t \in [0,1]). \]

This can be written in the form

\[ u(t) = \int_0^1 G(s,t) f(s) \, ds \qquad (t \in [0,1]), \]

where

\[ G(s,t) = \begin{cases} 1, & s \le t, \\ 0, & s > t \end{cases} \]

is called the associated Green’s function.

B) Given f ∈ L²(0, 1), consider the boundary value problem

u′′ = −f, u(0) = u(1) = 0.

We look for solutions in the space

H²(0, 1) := {u ∈ H¹(0, 1) | u′ ∈ H¹(0, 1)}.

(Again, H²(0, 1) ⊂ C[0, 1] and hence the point-evaluations u(0) and u(1) are meaningful.) By Exercises 3 and 4 from Assignment 2 we find successively

u′(t) = −Jf(t) + c, u(t) = −J²f(t) + tc + d (t ∈ [0, 1])

for certain constants c, d ∈ R. By using the boundary conditions we obtain d = 0 and

\[ c = J^2 f(1) = \int_0^1 (1-s) f(s) \, ds. \]

Here we used that¹⁵

\[ J^2 f(t) = \int_0^t \int_0^s f(r) \, dr \, ds = \int_0^t (t-s) f(s) \, ds. \]

By inserting c back into the formula for u one obtains

\[ u(t) = -\int_0^t (t-s) f(s) \, ds + t \int_0^1 (1-s) f(s) \, ds = \int_0^1 \big[ (s-t) \mathbf{1}_{\{s \le t\}}(t,s) + t(1-s) \big] f(s) \, ds = \int_0^1 G(t,s) f(s) \, ds \]

with Green’s function

\[ G(t,s) = \begin{cases} s(1-t), & s \le t, \\ t(1-s), & t \le s. \end{cases} \]
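
As a plausibility check of this Green’s function, one can take f = 1, for which the boundary value problem has the solution u(t) = t(1 − t)/2. The Python sketch below approximates the integral by the midpoint rule; the quadrature, the step count and the function names are my assumptions, not part of the lecture.

```python
def green(t, s):
    """Green's function of u'' = -f, u(0) = u(1) = 0, as derived above."""
    return s * (1.0 - t) if s <= t else t * (1.0 - s)

def solve(f, t, steps=2000):
    """Midpoint-rule approximation of u(t) = int_0^1 G(t, s) f(s) ds."""
    h = 1.0 / steps
    return h * sum(green(t, (k + 0.5) * h) * f((k + 0.5) * h)
                   for k in range(steps))

f = lambda s: 1.0               # constant right-hand side
u = lambda t: solve(f, t)       # approximate solution
exact = lambda t: t * (1.0 - t) / 2.0
```

Comparing u(t) with exact(t) at a few points of [0, 1] shows agreement up to quadrature and rounding errors.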

In general, an operator A is an integral operator if there is a so-called kernel k, that is a function

k : [a, b] × [a, b] −→ R

such that A is given by

\[ (Af)(s) := \int_a^b k(s,t) f(t) \, dt \qquad (s \in [a,b]) \]

for functions f where this definition is meaningful. Depending on how good/bad the function k is, this works on different function spaces.

10.1 Exercise Suppose that k : [a, b]² −→ R is continuous. Show that the kernel k induces a bounded integral operator on C[a, b].

In our lectures we put our focus on integral operators on the Hilbert space H = L²(a, b). Unfortunately, we need here some more information from measure and integration theory, the so-called Fubini/Tonelli theorem.

Product spaces

We take two intervals I, J ⊂ R and look at the metric space Ω = I × J. As for any metric space, one can form its Borel algebra B(I × J), which is the “smallest” σ-algebra which contains all open sets. (Probably, any subset of I × J you could think of is a Borel set.) A function f : I × J −→ [−∞,∞] is called (product) measurable if the set

{a ≤ f < b} = f⁻¹[a, b)

is a Borel set, for all pairs a, b ∈ R, a < b. (This is totally the same as Definition 4.10.) All continuous functions are measurable. Lemmas 4.12 and 4.14 carry over to measurable functions on I × J.

¹⁵ You have two ways of seeing this: either you know that integration by parts works in H¹(a, b), or you establish the formula for f ∈ C[a, b] first and then use that C[a, b] is dense in L²[a, b], and J is continuous.

10.2 Lemma Let f : I × J −→ [−∞,∞] be measurable. Then for every a ∈ I and

b ∈ J the functions

f(a, ·) : J −→ [−∞,∞] and f(·, b) : I −→ [−∞,∞]

are both measurable.

The following theorem is the cornerstone to the integration theory on productspaces.

10.3 Theorem (Tonelli) Let f : I × J −→ [0,∞] be positive and measurable. Then the new functions

F1 : I −→ [0,∞] and F2 : J −→ [0,∞]

defined by

\[ F_1(x) := \int_J f(x,y) \, dy \quad (x \in I), \qquad F_2(y) := \int_I f(x,y) \, dx \quad (y \in J) \]

are again measurable. Moreover,

\[ \int_I \int_J f(x,y) \, dy \, dx = \int_I F_1(x) \, dx = \int_J F_2(y) \, dy = \int_J \int_I f(x,y) \, dx \, dy. \]

The Tonelli theorem allows one to introduce 2-dimensional Lebesgue measure by setting

λ2(A) := ∫_R ∫_R 1_A(x, y) dx dy (A ∈ B(R × R))

(this is called “Cavalieri’s principle”), but we shall not go into this. More important is the next result, which looks very similar, but has some characteristic differences.


10.4 Theorem (L^p-Fubini) Let f : I × J −→ R be measurable and 1 ≤ p < ∞. Suppose that

\[ \int_I \Big[ \int_J |f(x,y)| \, dy \Big]^p dx < \infty. \]

Then for almost every x ∈ I the function f(x, ·) is in L¹(J); moreover, the (almost everywhere defined) function

\[ F_1(x) := \int_J f(x,y) \, dy \qquad (x \in I) \]

is an element of L^p(I).

In the case p = 1 one has also f(·, y) ∈ L¹(I) for almost every y ∈ J and the almost everywhere defined function

\[ F_2(y) := \int_I f(x,y) \, dx \qquad (y \in J) \]

is an element of L¹(J). Moreover,

\[ \int_I \int_J f(x,y) \, dy \, dx = \int_I F_1(x) \, dx = \int_J F_2(y) \, dy = \int_J \int_I f(x,y) \, dx \, dy. \]

Proof. I cannot give a full proof here. But let us try at least to understand why the theorem is meaningful. The function |f| : I × J −→ [0,∞] is measurable, since f is. Tonelli yields that the function G defined by

\[ G(x) := \int_J |f(x,y)| \, dy \]

is measurable, whence also the function G^p. So the hypotheses make sense. Now, from the hypothesis

\[ \int_I G^p(x) \, dx = \int_I \Big( \int_J |f(x,y)| \, dy \Big)^p dx < \infty \]

we conclude by Lemma 5.5 that G^p < ∞ almost everywhere. Taking the p-th root then implies that for almost all x ∈ I

\[ \int_J |f(x,y)| \, dy = G(x) < \infty, \]

i.e. f(x, ·) ∈ L¹(J). The rest is now done by decomposing f into positive and negative parts, and applying Tonelli to each of those.

Fubini’s theorem is a consequence of Tonelli’s, but whereas Tonelli strictly stays within the realm of functions, Fubini’s theorem plays more in the field of equivalence classes of functions: in general, the “functions” F1, F2 are really only defined almost everywhere; but as such they still determine equivalence classes of functions with respect to equality almost everywhere.


Integral operators of Hilbert–Schmidt type

A measurable function k : I × J −→ R is called a Hilbert–Schmidt integral kernel if

\[ \int_I \int_J |k(x,y)|^2 \, dy \, dx < \infty. \]

We shall show that k defines a bounded integral operator L²(J) −→ L²(I).

Take f ∈ L²(J). Then the function F : I × J −→ R defined by

F(x, y) := k(x, y)f(y)

is measurable (believe me, but what else would you expect?); furthermore, by Cauchy–Schwarz

\[ \int_J |F(x,y)| \, dy = \int_J |k(x,y) f(y)| \, dy \le \Big[ \int_J |k(x,y)|^2 \, dy \Big]^{1/2} \|f\|_2 \]

for all x ∈ I. Hence

\[ \int_I \Big[ \int_J |F(x,y)| \, dy \Big]^2 dx \le \int_I \int_J |k(x,y)|^2 \, dy \, dx \; \|f\|_2^2. \]

By the L²-Fubini, F(x, ·) = k(x, ·)f(·) ∈ L¹(J) for almost all x ∈ I, and

\[ Af(x) := \int_J k(x,y) f(y) \, dy \]

defines an element Af of L²(I). Moreover,

\[ \|Af\|_2^2 = \int_I \Big| \int_J k(x,y) f(y) \, dy \Big|^2 dx \le \int_I \Big[ \int_J |k(x,y) f(y)| \, dy \Big]^2 dx \le \int_I \int_J |k(x,y)|^2 \, dy \, dx \; \|f\|_2^2. \]

This shows that

\[ \|Af\|_2 \le \Big( \int_I \int_J |k(x,y)|^2 \, dy \, dx \Big)^{1/2} \|f\|_2 \qquad (f \in L^2(J)). \]

Hence A : L²(J) −→ L²(I) is bounded with

\[ \|A\| \le \Big( \int_I \int_J |k(x,y)|^2 \, dy \, dx \Big)^{1/2}. \]

(In general there is not equality here. See Lecture 14, Appendix.) We call A = Ak a Hilbert–Schmidt operator with (integral) kernel k.
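
The finite-dimensional analogue, with sums in place of integrals, already shows that the estimate need not be sharp. In the following Python sketch (an illustration under my own choice of matrix and helper names) the kernel is the 2 × 2 identity matrix.

```python
import math

def hs_norm(a):
    """Square root of the sum of all squared entries of the matrix a."""
    return math.sqrt(sum(c * c for row in a for c in row))

def apply(a, f):
    """Matrix-vector product, the discrete analogue of (Af)(x)."""
    return [sum(c * fj for c, fj in zip(row, f)) for row in a]

def norm_2(v):
    """Euclidean norm."""
    return math.sqrt(sum(c * c for c in v))

a = [[1.0, 0.0], [0.0, 1.0]]
samples = [[1.0, 0.0], [3.0, 4.0], [-2.0, 5.0]]
# For each sample f: the pair (||Af||_2, ||a||_HS * ||f||_2).
pairs = [(norm_2(apply(a, f)), hs_norm(a) * norm_2(f)) for f in samples]
```

Here the Hilbert–Schmidt bound is √2, while ‖Af‖2 = ‖f‖2 for every f, so the operator norm is 1.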


10.5 Exercise Here is the analogous situation for sequence spaces: Let a : N × N −→ R be an infinite matrix such that

\[ \|a\|_{HS} := \Big( \sum_{i,j \in \mathbb{N}} |a(i,j)|^2 \Big)^{1/2} < \infty. \]

Show that a induces a linear operator A on ℓ²(N) by

\[ (Af)(n) := \sum_{j=1}^\infty a(n,j) f(j) \]

and ‖A‖_{ℓ²→ℓ²} ≤ ‖a‖_HS.

The space BL(E,F )

We now have a closer look at the space L(E, F), defined in Lecture 9. The set L(E, F) is a vector space in a natural way: by Lemma 9.2, if T, S ∈ L(E, F) and α, β ∈ R, then also αT + βS ∈ L(E, F). It is easy to establish that the usual vector space axioms (associative laws, distributive laws) are satisfied.¹⁶ A further consequence of linearity are the distributive laws

R(S + T) = RS + RT and (Q + R)S = QS + RS

where R, Q ∈ L(F, G) and S, T ∈ L(E, F), with E, F, G being vector spaces. In particular, the space L(E) carries the structure of an algebra (vector space with an associative bilinear multiplication on it). That means, for example, that for T ∈ L(E) one can form its powers T^n defined by T^n := T · · · T (n times, with T^0 := Id), or polynomials of T by

\[ p(T) := \sum_{j=0}^n a_j T^j \quad \text{if} \quad p(t) = \sum_{j=0}^n a_j t^j \in \mathbb{R}[t]. \]
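
For matrices, i.e. operators on R^d, such a polynomial can be evaluated by Horner’s scheme. A minimal Python sketch, with hypothetical helper names and a nilpotent 2 × 2 matrix as made-up test data:

```python
def mat_mul(a, b):
    """Product of two square matrices of the same size."""
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def poly_of(coeffs, t):
    """Evaluate p(T) = a_0 Id + a_1 T + ... + a_n T^n, coeffs = [a_0, ..., a_n]."""
    d = len(t)
    result = [[0.0] * d for _ in range(d)]
    for a in reversed(coeffs):
        result = mat_mul(result, t)    # multiply the running value by T
        for i in range(d):
            result[i][i] += a          # add a_j * Id
    return result

t = [[0.0, 1.0], [0.0, 0.0]]           # T^2 = 0
p_t = poly_of([1.0, 2.0, 3.0], t)      # p(t) = 1 + 2t + 3t^2
```

Since T² = 0 for this matrix, p(T) reduces to Id + 2T.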

For normed spaces E,F let us define

BL(E, F) := {T : E −→ F | T is linear and bounded}

with the abbreviation BL(E) := BL(E, E).

10.6 Lemma Let E, F, G be normed spaces. Then BL(E, F) is a subspace of L(E, F) and the operator norm ‖·‖ defined by

‖T‖ = ‖T‖_{E→F} := sup{‖Tx‖_F | ‖x‖_E ≤ 1}

is a norm on BL(E, F). Moreover,

‖RS‖ ≤ ‖R‖‖S‖ (S ∈ BL(E, F), R ∈ BL(F, G)).

¹⁶ A more conceptual way to this result is to show first that the space F(Ω, F) of all mappings from a set Ω into a vector space F is a vector space with respect to pointwise defined addition and scalar multiplication. Then Lemma 9.2 shows that if Ω = E is also a vector space, L(E, F) is a subspace of F(E, F).

Proof. We have seen that the zero operator 0 is bounded with norm equal to zero. Let T ∈ BL(E, F) such that ‖T‖ = 0. Then, for x ∈ E,

‖Tx‖ ≤ ‖T‖‖x‖ = 0 · ‖x‖ = 0,

i.e. ‖Tx‖ = 0, that is Tx = 0. Thus T = 0.

Let T, S ∈ BL(E, F). For x ∈ E arbitrary,

‖(T + S)x‖ = ‖Tx + Sx‖ ≤ ‖Tx‖ + ‖Sx‖ ≤ ‖T‖‖x‖ + ‖S‖‖x‖ = (‖T‖ + ‖S‖)‖x‖.

As x was arbitrary, by the definition of the operator norm it follows that ‖T + S‖ ≤ ‖T‖ + ‖S‖. In particular, as ‖T‖, ‖S‖ are finite, so is ‖T + S‖, and hence S + T ∈ BL(E, F).

Let T ∈ BL(E, F) and α ∈ R. Then for x ∈ E

‖(αT)x‖ = ‖α(Tx)‖ = |α|‖Tx‖ ≤ |α|‖T‖‖x‖.

This shows that αT ∈ BL(E, F) with ‖αT‖ ≤ |α|‖T‖. If α ≠ 0, apply this to α⁻¹ and αT in place of α and T to get ‖T‖ = ‖α⁻¹(αT)‖ ≤ |α|⁻¹‖αT‖, and hence ‖αT‖ = |α|‖T‖. If α = 0 this equality is trivial.

Finally, suppose that S ∈ BL(E, F) and R ∈ BL(F, G). Then for x ∈ E arbitrary we obtain

‖(RS)x‖ = ‖R(Sx)‖ ≤ ‖R‖‖Sx‖ ≤ ‖R‖‖S‖‖x‖.

This shows that RS ∈ BL(E, G) and ‖RS‖ ≤ ‖R‖‖S‖.

10.7 Corollary The multiplication of operators

[(S, T ) 7−→ ST ] : BL(F,G)× BL(E,F ) −→ BL(E,G)

is continuous: From ‖Tn − T‖ → 0 and ‖Sn − S‖ → 0 it follows that

‖SnTn − ST‖ → 0.

Proof. This is very similar to already known examples (scalar product on Hilbert spaces, scalar multiplication on a normed space). We write

(S − Sn)(T − Tn) = ST + SnTn − SnT − STn

and (S − Sn)T = ST − SnT, S(T − Tn) = ST − STn. Hence

(S − Sn)(T − Tn) − (S − Sn)T − S(T − Tn) = SnTn − ST.

Therefore

‖SnTn − ST‖ ≤ ‖(S − Sn)(T − Tn)‖ + ‖(S − Sn)T‖ + ‖S(T − Tn)‖
≤ ‖S − Sn‖‖T − Tn‖ + ‖S − Sn‖‖T‖ + ‖S‖‖T − Tn‖

and the right-hand side tends to 0 as n → ∞.

10.8 Theorem Let E be a normed space and F be a Banach space. Then BL(E, F) with the operator norm is also a Banach space.

Proof. Suppose that (Tn)n is a sequence in BL(E, F), Cauchy with respect to the operator norm. For all x ∈ E, n, m ∈ N we have

‖Tn x − Tm x‖ = ‖(Tn − Tm)x‖ ≤ ‖Tn − Tm‖‖x‖.

Since ‖Tn − Tm‖ → 0 as n, m → ∞, we have that (Tn x)n is a Cauchy sequence in F, for every x ∈ E. Hence

Tx := lim_{n→∞} Tn x

exists in F, as F is complete. It is an exercise to show that T is linear (LN 12.4.1). Let us prove that T is bounded and Tn → T in operator norm.

Fix ε > 0 and take an N ∈ N such that ‖Tn − Tm‖ ≤ ε for all n, m ≥ N. This means that

‖Tn x − Tm x‖ ≤ ε (x ∈ E, ‖x‖ ≤ 1, n, m ≥ N).

If x ∈ B_E is fixed (B_E denoting the closed unit ball of E), by letting m → ∞ in the above estimate, and noting that the norm is continuous, we get

‖Tn x − Tx‖ ≤ ε (x ∈ B_E, n ≥ N).

This can be abbreviated to

‖Tn − T‖ ≤ ε (n ≥ N).

This shows that ‖Tn − T‖ → 0, and since each Tn is bounded, we get that T = Tn − (Tn − T) is also bounded.

Remark: The proof is pretty much the same as that of the completeness of B(Ω) with respect to the sup-norm. In fact, the operator norm of T is nothing else than the sup-norm of the restriction of T to the unit ball of E.


The Neumann series

Remember our very first lecture: we asked for the solvability of the equation

x−Ax = y

where A was an integral operator on E = C[0, 1] and y ∈ C[0, 1] is given. The iteration x_{n+1} := Ax_n + y with x_0 := 0 led us to the series

\[ \sum_{n=0}^\infty A^n y \]

as a possible solution. This is a general fact from abstract operator theory.

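10.9 Lemma Let E be a normed space and A ∈ BL(E). If y ∈ E is such that the series Σ_{n=0}^∞ A^n y converges in E, then its sum x, say, satisfies x − Ax = y.

Proof. We insert A into the polynomial identity

\[ (1-t) \sum_{j=0}^n t^j = 1 - t^{n+1} = \Big( \sum_{j=0}^n t^j \Big) (1-t) \]

to obtain

\[ (\mathrm{Id} - A) \sum_{j=0}^n A^j = \mathrm{Id} - A^{n+1} = \Big( \sum_{j=0}^n A^j \Big) (\mathrm{Id} - A). \]

Now we apply the first identity to the vector y ∈ E and get

\[ (\mathrm{Id} - A) \sum_{j=0}^n A^j y = y - A^{n+1} y. \]

By hypothesis,

\[ x := \lim_{n \to \infty} \sum_{j=0}^n A^j y \]

exists in E. Thus we also have lim_{n→∞} A^{n+1} y = 0, as the terms of a convergent series tend to 0. We now take limits in the above identity; since Id − A is a bounded operator, we obtain

\[ x - Ax = (\mathrm{Id} - A) x = \lim_{n \to \infty} (\mathrm{Id} - A) \sum_{j=0}^n A^j y = \lim_{n \to \infty} \big( y - A^{n+1} y \big) = y - \lim_{n \to \infty} A^{n+1} y = y. \]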
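
Note that the lemma asks only for convergence of the series, not for ‖A‖ < 1. The following Python sketch runs the iteration x_{n+1} = Ax_n + y for a nilpotent 2 × 2 matrix, where the series is a finite sum; the matrix, the vector and the helper names are made up for illustration.

```python
def apply(a, x):
    """Matrix-vector product A x."""
    return [sum(c * xj for c, xj in zip(row, x)) for row in a]

def iterate(a, y, steps):
    """Run x_{n+1} = A x_n + y starting from x_0 = 0."""
    x = [0.0] * len(y)
    for _ in range(steps):
        x = [ax + yi for ax, yi in zip(apply(a, x), y)]
    return x

a = [[0.0, 5.0], [0.0, 0.0]]    # A^2 = 0, although A is far from a contraction
y = [1.0, 1.0]
x = iterate(a, y, steps=5)      # stabilizes at y + A y after two steps
residual = [xi - axi - yi for xi, axi, yi in zip(x, apply(a, x), y)]
```

The residual x − Ax − y vanishes exactly, although the entries of A are large.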

How can we ensure that the series above converges? Well, let’s make it abso-lutely convergent.


10.10 Theorem Let E be a Banach space and let A ∈ BL(E) such that ‖A‖ < 1.

Then the operator Id−A is invertible and its inverse is given by the (abso-

lutely) convergent series

(Id−A)−1 =∞∑

n=0

An.

This series is called the Neumann series for A.

Proof. As $\|A\| < 1$ one has

\[ \sum_{n=0}^{\infty} \|A^n\| \le \sum_{n=0}^{\infty} \|A\|^n < \infty \]

and thus the Neumann series converges absolutely. Since $E$ is a Banach space, by Theorem 10.8 $BL(E)$ is a Banach space as well, and therefore $S := \sum_{n=0}^{\infty} A^n$ exists in $BL(E)$. The lemma above shows that $(\mathrm{Id}-A)S = \mathrm{Id}$, whence $\mathrm{Id}-A$ is surjective and $S$ is injective. But $S(\mathrm{Id}-A) = (\mathrm{Id}-A)S$ since $S = \lim_n S_n$, where $S_n$ is the $n$-th partial sum, which satisfies $(\mathrm{Id}-A)S_n = S_n(\mathrm{Id}-A)$. Altogether we have shown that $S$ is the inverse of $\mathrm{Id}-A$.
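The statement of Theorem 10.10 can be tried out numerically in finite dimensions. The following is a minimal sketch, not part of the lecture: it assumes $E = \mathbb{R}^2$ with the sup norm (so that the operator norm is the maximal absolute row sum), and the matrix `A` as well as all helper names are hypothetical choices made for illustration.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    """Matrix product a*b for matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def row_sum_norm(a):
    """Operator norm induced by the sup norm: maximal absolute row sum."""
    return max(sum(abs(v) for v in row) for row in a)

def neumann_partial_sum(a, n):
    """Partial sum S_n = Id + A + ... + A^n of the Neumann series."""
    total = identity(len(a))
    power = identity(len(a))
    for _ in range(n):
        power = mat_mul(power, a)
        total = mat_add(total, power)
    return total

# Hypothetical example matrix with row-sum norm 0.5 < 1.
A = [[0.2, 0.3], [0.1, 0.4]]
S = neumann_partial_sum(A, 60)

# (Id - A) S_n - Id = -A^(n+1), which should be tiny for n = 60.
residual = mat_sub(mat_mul(mat_sub(identity(2), A), S), identity(2))
print(row_sum_norm(A), row_sum_norm(residual))
```

The residual equals $-A^{n+1}$ by the identity used in the proof of Lemma 10.9, so its norm is bounded by $\|A\|^{n+1}$.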

Look at Example LN 12.2.4 for a concrete application of the Neumann series to an integral equation (see footnote 17).

In the context of linear algebra, a linear mapping $T : E \to F$ between two vector spaces is called invertible if it is bijective. It is easy to see that the inverse mapping $T^{-1}$ is then also linear (Exercise LN 10.4.2).
In the context of normed spaces, we call a bounded linear mapping $T$ invertible if it is bijective and its inverse $T^{-1}$ is again bounded. In general, the boundedness of $T^{-1}$ is really an additional requirement, but on Banach spaces it is automatically satisfied.

10.11 Theorem (Banach) Let $E, F$ be Banach spaces and $T : E \to F$ be a bounded linear map. If $T$ is bijective, then $T^{-1}$ is also bounded, i.e., $T$ is invertible.

We cannot give the proof here; it rests on the theorem of Baire about sets in complete metric spaces. The proofs are tricky, but elementary.

17 However, note that in LN 12.2.4 one actually does not need the convergence in norm of the Neumann series. The series applied to the concrete vector x4 is anyway finite, hence trivially convergent. So Lemma 10.9 applies and the actual norm of the integral operator is irrelevant here. The equation (12.2.1) can be solved with pure linear algebra and does not require any analysis.


10.12 Exercise (See also Exercise LN 12.4.8.) Show that the set

\[ \{A \in BL(E) \mid A \text{ is invertible}\} \]

is open in $BL(E)$ when $E$ is a Banach space. (Hint: Let $T$ be a bounded, invertible operator on $E$ and let $S \in BL(E)$ be such that $\|S - T\| < \|T^{-1}\|^{-1}$. Write

\[ S = \bigl(\mathrm{Id} - (T - S)T^{-1}\bigr)\,T \]

and use the Neumann series to prove that $S$ is invertible. What is its inverse?)



Lecture 11

The dual space

11.1 Definition Given a normed space $E$, its dual space is $E^* := BL(E, \mathbb{R})$, the space of all bounded linear functionals on $E$.

11.2 Example Suppose $E$ is finite-dimensional. Then we know that every linear mapping from $E$ into another normed space is bounded. In particular, every linear functional on $E$ is bounded. By choosing a basis, one can identify $E \cong \mathbb{R}^d$ as column vectors, say. Then $E^* \cong \mathbb{R}^d$ as well (row vectors).

By the results of the previous lecture, since $\mathbb{R}$ is complete, we know that $E^*$ is a Banach space with norm

\[ \|\varphi\| = \sup\{\,|\varphi(x)| \mid x \in E,\ \|x\| \le 1\,\}. \]

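Given any non-zero linear functional $\varphi : E \to \mathbb{R}$, its null-space (kernel)

\[ \mathcal{N}(\varphi) = \{x \in E \mid \varphi(x) = 0\} \]

is a hyperspace, i.e., it is a linear subspace of $E$ and there is exactly one dimension missing to get the whole space $E$. Indeed, let $x_0 \in E$ be such that $\varphi(x_0) \ne 0$ (which exists by the assumption $\varphi \ne 0$). Then if $x \in E$ is arbitrary,

\[ \varphi\Bigl(x - \frac{\varphi(x)}{\varphi(x_0)}\,x_0\Bigr) = \varphi(x) - \varphi\Bigl(\frac{\varphi(x)}{\varphi(x_0)}\,x_0\Bigr) = \varphi(x) - \frac{\varphi(x)}{\varphi(x_0)}\,\varphi(x_0) = 0. \]

So $x - \frac{\varphi(x)}{\varphi(x_0)}x_0 \in \mathcal{N}(\varphi)$, i.e., $x \in \mathbb{R}x_0 + \mathcal{N}(\varphi)$. As $x$ was arbitrary, this yields

\[ E = \mathbb{R}x_0 \oplus \mathcal{N}(\varphi). \]

Conversely, any decomposition of $E$ as a direct sum

\[ E = \mathbb{R}x_0 \oplus U \]

for some non-zero vector $x_0$ and some subspace $U$ determines a unique linear functional $\varphi : E \to \mathbb{R}$ with null-space $U$ and $\varphi(x_0) = 1$.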

11.3 Lemma Let $\varphi : E \to \mathbb{R}$ be a linear functional. Then $\varphi$ is bounded if and only if $\mathcal{N}(\varphi)$ is closed. In this case

\[ |\varphi(x)| = \|\varphi\|\,\mathrm{dist}(x, \mathcal{N}(\varphi)) \]

for all $x \in E$.


Proof. Define $F := \mathcal{N}(\varphi)$. We may suppose that $\varphi \ne 0$, i.e., $F \ne E$.
Suppose first that $\varphi$ is bounded. Then it is continuous and hence its kernel is closed. Moreover, if $x \in E$ and $y \in F$ we have

\[ |\varphi(x)| = |\varphi(x - y)| \le \|\varphi\|\,\|x - y\|; \]

taking the infimum over all $y \in F$ we obtain

\[ |\varphi(x)| \le \mathrm{dist}(x, F)\,\|\varphi\| \qquad (x \in E). \]

Conversely, suppose that $F = \mathcal{N}(\varphi)$ is closed. Take vectors $x \in E$, $y \in E \setminus F$. Then $\varphi(y) \ne 0$ and, as above,

\[ z := x - \frac{\varphi(x)}{\varphi(y)}\,y \in F. \]

Consequently,

\[ |\varphi(y)|\,\mathrm{dist}(x, F) \le |\varphi(y)|\,\|x - z\| = |\varphi(y)|\,\Bigl\|\frac{\varphi(x)}{\varphi(y)}\,y\Bigr\| = |\varphi(x)|\,\|y\|. \]

Thus

\[ (*) \qquad |\varphi(y)|\,\mathrm{dist}(x, F) \le |\varphi(x)|\,\|y\| \]

is true for all $x, y \in E$ (if $y \in F$ it is trivially true). As $\varphi \ne 0$, fix $x_0 \in E$ such that $\varphi(x_0) = 1$. Then $x_0 \notin F$ and, as $F$ is supposed to be closed, $\mathrm{dist}(x_0, F) > 0$. Now $(*)$ with $x$ replaced by $x_0$ shows that $\varphi$ is bounded with $\|\varphi\| \le \mathrm{dist}(x_0, F)^{-1}$. Taking in $(*)$ the supremum over all $y$ such that $\|y\| \le 1$ finally yields

\[ \|\varphi\|\,\mathrm{dist}(x, F) \le |\varphi(x)| \qquad (x \in E) \]

as desired.
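The distance formula of Lemma 11.3 can be checked by hand in a small case. The sketch below is only an illustration under stated assumptions: $E = \mathbb{R}^2$ with the Euclidean norm, $\varphi(x) = a_1x_1 + a_2x_2$ for a hypothetical vector `a`, so that $\|\varphi\| = \|a\|_2$ by Cauchy–Schwarz. The distance to the kernel is computed independently, by projecting onto the line $\mathcal{N}(\varphi)$.

```python
import math

def phi(x, a):
    """The functional phi(x) = <x, a> on R^2."""
    return a[0] * x[0] + a[1] * x[1]

def dist_to_kernel(x, a):
    """Euclidean distance from x to N(phi), computed by orthogonal projection.

    The kernel of phi is the line spanned by the unit vector d = (-a2, a1)/|a|.
    """
    norm_a = math.hypot(a[0], a[1])
    d = (-a[1] / norm_a, a[0] / norm_a)
    t = x[0] * d[0] + x[1] * d[1]          # coordinate of the projection
    return math.hypot(x[0] - t * d[0], x[1] - t * d[1])

# Hypothetical data for illustration.
a = (3.0, 4.0)
x = (2.0, -1.0)

norm_phi = math.hypot(a[0], a[1])           # operator norm of phi
lhs = abs(phi(x, a))                        # |phi(x)|
rhs = norm_phi * dist_to_kernel(x, a)       # ||phi|| * dist(x, N(phi))
print(lhs, rhs)
```

Both printed numbers should agree up to rounding, in line with $|\varphi(x)| = \|\varphi\|\,\mathrm{dist}(x, \mathcal{N}(\varphi))$.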

Remark. In general, a hyperplane of a vector space $E$ is a set $M$ of the form $M = x_0 + U$, where $U$ is a hyperspace, i.e., the kernel of a non-zero linear functional $\varphi$. There are two cases: either $x_0 \in U$ or $x_0 \notin U$. In the former case $M = U = \mathcal{N}(\varphi)$. In the latter, one can multiply $\varphi$ by a scalar to have $\varphi(x_0) = 1$. By Lemma 11.3, $M$ is closed if and only if $\varphi$ is bounded. (See LN 13.1.2–13.1.5.)

The dual space is an important theoretical tool. It is so already in finite dimensions (i.e., Linear Algebra), where the notion underlies the discussions about row rank and column rank of a matrix. In finite dimensions we have $\dim E = \dim E^*$. In infinite dimensions, such a statement is also true, but quite irrelevant (both dimensions are simply “infinite”). More important is to “identify” the dual space of a concretely given Banach space $E$ with another well-known space $F$. This is done by means of a so-called duality, a bilinear mapping

b : E × F −→ R.

The easiest example here is provided by the $\ell^p$-spaces. Let $1 \le p < \infty$, let $q$ be its dual exponent, and consider $b : \ell^p \times \ell^q \to \mathbb{R}$ defined by

\[ b(f, g) := \sum_{n=1}^{\infty} f(n)g(n) \qquad (f \in \ell^p,\ g \in \ell^q). \]

This is well-defined by Hölder’s inequality, with

\[ |b(f, g)| \le \|f\|_p\,\|g\|_q \qquad (f \in \ell^p,\ g \in \ell^q). \]

Thus each $g \in \ell^q$ determines a bounded linear functional $\varphi_g := b(\cdot, g)$ on $\ell^p$, and one has $\|\varphi_g\| \le \|g\|_q$. The next theorem says that the mapping

\[ \Phi = (g \mapsto \varphi_g) : \ell^q \to (\ell^p)^* \]

is an isometric isomorphism (see footnote 18).

11.4 Theorem Let $1 \le p < \infty$, and let $q$ be its dual exponent. Then for every $\varphi \in (\ell^p)^*$ there is a unique $g \in \ell^q$ such that $\varphi = \varphi_g$. Moreover, $\|\varphi_g\| = \|g\|_q$.

Proof. We deal with the case $p = 1$ and leave the other case as an exercise. Let $(e_n)_n$ be the canonical unit vectors. If $\varphi = \varphi_g$ we certainly have $\varphi_g(e_n) = b(e_n, g) = g(n)$ for all $n$, whence $g$ is uniquely determined by $\varphi_g$.
Given $\varphi \in (\ell^1)^*$, we therefore define $g$ by

\[ g(n) := \varphi(e_n) \qquad (n \in \mathbb{N}). \]

Indeed, $g \in \ell^\infty$ since

\[ |g(n)| = |\varphi(e_n)| \le \|\varphi\|\,\|e_n\| = \|\varphi\| \qquad (n \in \mathbb{N}), \]

which, by taking the supremum over $n$, yields $\|g\|_\infty \le \|\varphi\|$. Now we have two bounded linear functionals, $\varphi$ and $\varphi_g$, on $\ell^1$, and they coincide on the unit vectors. Hence they coincide on the dense subspace $c_{00} = \mathrm{span}\{e_n \mid n \in \mathbb{N}\}$. By Lemma 9.8 we conclude that $\varphi = \varphi_g$. In particular, $\|g\|_\infty \le \|\varphi_g\|$. Combined with the converse inequality $\|\varphi_g\| \le \|g\|_\infty$ shown above, we finally have $\|\varphi_g\| = \|g\|_\infty$.

18 To clarify terminology: an algebraic isomorphism $T$ is just a bijective linear mapping; a (topological) isomorphism is an algebraic isomorphism such that $T$ and $T^{-1}$ are both continuous (= bounded). A linear mapping $T : E \to F$ between normed spaces $E, F$ is isometric if $\|Tx\|_F = \|x\|_E$ for all $x \in E$. Such a mapping is continuous and injective. If it is also surjective, it is then an isometric isomorphism.
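To make the duality between $\ell^1$ and $\ell^\infty$ concrete, here is a small sketch restricted to finitely supported sequences (elements of $c_{00}$), which is an assumption made only so that everything is computable. The sequences `f` and `g` and the helper names are hypothetical. The code recovers $g(n) = \varphi_g(e_n)$ from the functional and checks Hölder's estimate for $p = 1$.

```python
def pairing(f, g):
    """The duality b(f, g) = sum_n f(n) g(n) for finitely supported sequences."""
    return sum(fn * gn for fn, gn in zip(f, g))

def norm_1(f):
    return sum(abs(v) for v in f)

def norm_sup(g):
    return max(abs(v) for v in g)

def unit_vector(n, length):
    """Canonical unit vector e_n (0-based index) truncated to the given length."""
    return [1.0 if k == n else 0.0 for k in range(length)]

# Hypothetical finitely supported sequences.
g = [0.5, -2.0, 1.5, 0.25]
f = [1.0, -0.5, 0.25, 2.0]

# Evaluating phi_g on the unit vectors recovers g, as in the uniqueness argument.
recovered = [pairing(unit_vector(n, len(g)), g) for n in range(len(g))]

# Hoelder for p = 1: |b(f, g)| <= ||f||_1 * ||g||_inf.
value = pairing(f, g)
bound = norm_1(f) * norm_sup(g)
print(recovered, value, bound)
```

Since each $e_n$ has norm one in $\ell^1$, the values $|\varphi_g(e_n)|$ already approach $\|g\|_\infty$, which is the reason the norm of $\varphi_g$ cannot be smaller than $\|g\|_\infty$.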

Remark. The dual space of $\ell^\infty$ cannot be identified with $\ell^1$, cf. LN 13.3.3.
Another remark. An analogue of Theorem 11.4 holds for the spaces $L^p(I)$ (or, more generally, for spaces $L^p(\Omega)$, where $\Omega$ is any measure space), cf. LN 13.3.4.
A third remark. The identification of the dual of $E = C[a, b]$, or, more generally, of $E = C(\Omega)$ where $\Omega$ is any compact topological space, is furnished by a famous theorem of Riesz. It says that functionals $\varphi$ on $C(\Omega)$ correspond to “regular signed Borel measures” $\mu$ on $\Omega$, and the duality is given by integration:

\[ (f, \mu) \mapsto \int_\Omega f \, d\mu. \]

One needs abstract integration theory here. In the concrete case $\Omega = [a, b]$ one can also identify $C[a, b]^*$ with the space of (suitably normalized) functions of bounded variation, and the duality is given by

\[ (f, g) \mapsto \int_a^b f(x) \, dg(x), \]

the so-called Riemann–Stieltjes integration. See also LN 13.3.5 for the special example of a Dirac functional $\varphi = \delta_c$ and its connection to the theory of distributions.

The Riesz–Fréchet theorem

Let us consider the special case of a Hilbert space $E = H$. We already have a natural duality of $H$ with itself, given by the scalar product:

\[ \langle \cdot, \cdot \rangle : H \times H \to \mathbb{R}. \]

For $y \in H$ let us denote by $\varphi_y$ the functional $\varphi_y := \langle \cdot, y \rangle$. Then

\[ \|\varphi_y\| = \sup_{\|x\| \le 1} |\langle x, y \rangle| = \|y\|. \]

Proof. By Cauchy–Schwarz we have $|\langle x, y \rangle| \le \|x\|\,\|y\|$ for all $x \in H$, hence $\|\varphi_y\| \le \|y\|$. If $y = 0$ the assertion is trivially true; if $y \ne 0$ we plug in $y/\|y\|$ to obtain

\[ \|\varphi_y\| \ge |\varphi_y(y/\|y\|)| = \frac{\langle y, y \rangle}{\|y\|} = \|y\|, \]

concluding the proof.

It is natural to ask whether $H$ can be identified with $H^*$ via the mapping $y \mapsto \varphi_y$. This is indeed the case.


11.5 Theorem (Riesz–Fréchet) Let $H$ be a Hilbert space and let $\varphi$ be a bounded linear functional on $H$. Then there exists a unique $y \in H$ such that

\[ \varphi(x) = \langle x, y \rangle \qquad (x \in H). \]

Moreover, $\|\varphi\|_{H^*} = \|y\|_H$.

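Proof. (See also LN 13.2.3.) Uniqueness: If $\varphi_{y_1} = \varphi_{y_2}$ then

\[ \langle x, y_1 - y_2 \rangle = \langle x, y_1 \rangle - \langle x, y_2 \rangle = \varphi_{y_1}(x) - \varphi_{y_2}(x) = 0 \qquad (x \in H). \]

Hence $y_1 - y_2 \perp H$, which is only possible if $y_1 = y_2$.
Existence: We may suppose that $\varphi \ne 0$. We have to find $y$ such that $\varphi = \varphi_y$. Now, IF there is such a $y$, we certainly have

\[ x \in \mathcal{N}(\varphi) \iff \varphi(x) = 0 \iff \langle x, y \rangle = 0 \iff x \perp y \]

for all $x \in H$. Hence $\mathcal{N}(\varphi) = \{y\}^\perp$. This gives us a clue how to find $y$: take any non-zero vector $y_0 \in H$ such that $y_0 \perp \mathcal{N}(\varphi)$. Such a vector must exist by our knowledge about orthogonal decompositions of $H$ (see footnote 19). Define

\[ y := \frac{\varphi(y_0)}{\|y_0\|^2}\, y_0. \]

By construction $\varphi_y = 0 = \varphi$ on $\mathcal{N}(\varphi)$. But also

\[ \varphi_y(y_0) = \langle y_0, y \rangle = \frac{\varphi(y_0)}{\|y_0\|^2}\,\langle y_0, y_0 \rangle = \varphi(y_0). \]

Since $H = \mathbb{R}y_0 \oplus F$ with $F := \mathcal{N}(\varphi)$, we conclude that $\varphi = \varphi_y$.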

19 Choose any vector $x_0 \in H$ such that $\varphi(x_0) \ne 0$, i.e., $x_0 \notin F := \mathcal{N}(\varphi)$. Then let $y_0 := x_0 - P_F x_0$, where $P_F : H \to F$ is the orthogonal projection onto the closed(!) subspace $F$. Then $y_0 \in F^\perp$ and $y_0 \ne 0$ since $x_0 \notin F$.

We shall give a typical application in the field of (partial) differential equations. (Actually, there are no partial derivatives here, but that is only because we are working in dimension one.) Consider, for given $f \in L^2(a, b)$, the boundary value problem (“Poisson’s equation”)

\[ u'' = -f, \qquad u(a) = u(b) = 0 \]

to be solved within $H^2(a, b)$. From Lecture 10 we already know how to do this with the help of an integral operator; here we present the so-called “variational” method. Since the differential equation above is to be understood in the weak sense, we can equivalently write

\[ \langle \varphi', u' \rangle_{L^2} = \langle \varphi, f \rangle_{L^2} \qquad (\varphi \in C^1_0[a, b]). \]

The idea is now readily sketched. Define

\[ H^1_0(a, b) := \{u \in H^1(a, b) \mid u(a) = u(b) = 0\}. \]

This is a closed subspace of $H^1(a, b)$ since $H^1(a, b) \subset C[a, b]$ continuously (as shown in Assignment 2) and the point evaluations are bounded on $C[a, b]$. Thus $H := H^1_0(a, b)$ is a Hilbert space with respect to the (induced) scalar product

\[ \langle f, g \rangle_H := \langle f', g' \rangle_{L^2} + \langle f, g \rangle_{L^2} = \int_a^b f'(x)g'(x)\,dx + \int_a^b f(x)g(x)\,dx. \]

We may leave out the second summand here:

11.6 Lemma (Poincaré’s inequality) There is a constant $c$ depending on $b - a$ such that

\[ \|u\|_{L^2} \le c\,\|u'\|_{L^2} \]

for all $u \in H^1_0(a, b)$.

Note that a similar estimate cannot hold for $H^1(a, b)$, since for a constant function $f$ one has $f' = 0$.

Proof. We have shown in Assignment 2 that the integration operator $J : L^2[a, b] \to L^2[a, b]$ is bounded. Now if $u \in H^1_0(a, b)$ then (also by Assignment 2) $u = Ju'$, and so

\[ \|u\|_{L^2} = \|Ju'\|_{L^2} \le \|J\|_{L^2 \to L^2}\,\|u'\|_{L^2}. \]

As a consequence, we see that $(u \mid v) := \langle u', v' \rangle_{L^2}$ is an inner product on $H$, and the norm induced by it is equivalent to the original norm. Hence $(H, (\cdot \mid \cdot))$ is a Hilbert space. The Poincaré inequality also shows that the inclusion mapping

\[ (v \mapsto v) : H = H^1_0(a, b) \to L^2(a, b) \]

is continuous. Hence the linear functional

\[ \Phi := (v \mapsto \langle v, f \rangle_{L^2}) : H \to \mathbb{R} \]

is bounded. By the Riesz–Fréchet theorem (Theorem 11.5) there exists a unique $u \in H$ such that

\[ \langle v', u' \rangle_{L^2} = (v \mid u) = \Phi(v) = \langle v, f \rangle_{L^2} \]

for all $v \in H^1_0(a, b)$, in particular for all $v \in C^1_0[a, b]$, whence $u$ is a solution of our original problem. (One can show that $C^1_0[a, b]$ is dense in $H^1_0(a, b)$; from this it follows that the $u$ found is also the only solution.)
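The variational formulation also suggests a numerical method: restrict the identity $\langle v', u' \rangle_{L^2} = \langle v, f \rangle_{L^2}$ to a finite-dimensional subspace of $H^1_0$. The following is a rough sketch of such a Galerkin scheme, not taken from the lecture. It assumes $(a, b) = (0, 1)$, piecewise linear “hat” functions on a uniform grid, and the load integrals $\int f \varphi_i$ approximated by $h\,f(x_i)$ (exact for constant $f$); all names are hypothetical.

```python
def solve_dirichlet(f, n):
    """Galerkin approximation of u'' = -f, u(0) = u(1) = 0, with n interior nodes.

    With hat functions the stiffness matrix <phi_i', phi_j'> is tridiagonal:
    2/h on the diagonal and -1/h next to it. The system is solved by the
    Thomas algorithm (Gaussian elimination for tridiagonal matrices).
    """
    h = 1.0 / (n + 1)
    nodes = [(i + 1) * h for i in range(n)]
    diag = 2.0 / h
    off = -1.0 / h
    rhs = [h * f(x) for x in nodes]        # approximate load vector

    # Forward elimination.
    c = [0.0] * n                          # modified super-diagonal
    d = [0.0] * n                          # modified right-hand side
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom

    # Back substitution.
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return nodes, u

# For f = 1 the exact solution is u(x) = x(1 - x)/2.
nodes, u = solve_dirichlet(lambda x: 1.0, 9)
errors = [abs(ui - xi * (1.0 - xi) / 2.0) for xi, ui in zip(nodes, u)]
print(max(errors))
```

For constant $f$ the computed nodal values coincide with the exact solution up to rounding, because second differences of a quadratic polynomial are exact.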

Remark. The previous example is just a one-dimensional version of the so-called Dirichlet principle in arbitrary dimensions. I sketch this without giving proofs. One starts with a bounded, open set $\Omega \subset \mathbb{R}^d$. On $\Omega$ one considers $d$-dimensional Lebesgue measure and the Hilbert space $L^2(\Omega)$. Then one looks at the Poisson problem

\[ \Delta u = -f, \qquad u\big|_{\partial\Omega} = 0. \]

Here, $\Delta$ is the Laplace operator defined by

\[ \Delta u = \sum_{j=1}^{d} \frac{\partial^2 u}{\partial x_j^2}. \]

In the classical case, $f \in C(\overline{\Omega})$, and one wants a solution $u \in C^2(\Omega) \cap C(\overline{\Omega})$ satisfying literally the PDE above. The functional analytic way to tackle this is a two-step procedure: first find a solution within $L^2(\Omega)$, where the derivatives are interpreted in a “weak” manner, then try to find conditions on $f$ such that this solution is a classical solution. The second step belongs properly to the realm of PDE, but the first step can be done by our abstract functional analysis methods.

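As the space of test functions one takes

\[ C^1_0(\Omega) := \{\varphi \in C^1(\overline{\Omega}) \mid \varphi\big|_{\partial\Omega} = 0\}. \]

A weak gradient of a function $f \in L^2(\Omega)$ is a $d$-tuple $g = (g_1, \dots, g_d) \in L^2(\Omega)^d$ such that

\[ \int_\Omega f(x)\,\frac{\partial \varphi}{\partial x_j}(x)\,dx = -\int_\Omega g_j(x)\,\varphi(x)\,dx \qquad (j = 1, \dots, d) \]

for all $\varphi \in C^1_0(\Omega)$. One proves that such a weak gradient is unique, and writes $\nabla f = g$ and $\partial f/\partial x_j := g_j$. One defines

\[ H^1(\Omega) := \{u \in L^2(\Omega) \mid u \text{ has a weak gradient}\}, \]

which is a Hilbert space for the scalar product

\[ \langle u, v \rangle_{H^1} = \langle u, v \rangle_{L^2} + \int_\Omega \langle \nabla u(x), \nabla v(x) \rangle_{\mathbb{R}^d}\,dx = \int_\Omega u(x)v(x)\,dx + \sum_{j=1}^{d} \int_\Omega \frac{\partial u}{\partial x_j}(x)\,\frac{\partial v}{\partial x_j}(x)\,dx. \]

The boundary condition is incorporated into a closed subspace $H^1_0(\Omega)$ of $H^1(\Omega)$:

\[ H^1_0(\Omega) = \overline{C^1_0(\Omega)} \]

(closure within $H^1(\Omega)$). One then shows Poincaré’s inequality:

\[ \int_\Omega |u|^2\,dx \le c \int_\Omega |\nabla u|^2\,dx \qquad (u \in H^1_0(\Omega)) \]

for some constant $c$ depending on $\Omega$. In the end, Riesz–Fréchet is applied to obtain a unique $u \in H^1_0(\Omega)$ such that

\[ \int_\Omega \langle \nabla u(x), \nabla \varphi(x) \rangle\,dx = \int_\Omega f(x)\,\varphi(x)\,dx \qquad (\varphi \in C^1_0(\Omega)), \]

that is, $\Delta u = -f$ in a weak sense. (Of course one would like to have $u \in H^2(\Omega)$, but this will be true only if $\Omega$ is sufficiently regular, e.g., if $\partial\Omega$ is smooth.)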

The adjoint operator

Let $H$ be a Hilbert space and $A \in BL(H)$. For $x \in H$ consider the linear functional

\[ y \mapsto \langle Ay, x \rangle \qquad (y \in H). \]

It is bounded since

\[ |\langle Ay, x \rangle| \le \|Ay\|\,\|x\| \le \|A\|\,\|y\|\,\|x\| \]

for all $y \in H$. By the Riesz–Fréchet theorem (Theorem 11.5) there exists a unique vector $A^*x$, say, such that this functional is equal to $\varphi_{A^*x}$, i.e., such that

\[ \langle Ay, x \rangle = \langle y, A^*x \rangle \qquad (y \in H). \]

Moreover, $\|A^*x\| = \|\varphi_{A^*x}\| \le \|A\|\,\|x\|$.
This defines a new mapping $A^* : H \to H$, called the adjoint of $A$. We shall consider this in more detail in the next lecture.


Lecture 12

Adjoint operators continued

We recall where we stopped last time. Let $H$ be a Hilbert space and $A \in BL(H)$ be a bounded linear operator on $H$. We can compute the norm of $A$ as

\[ \|A\| = \sup_{\|x\| \le 1} \|Ax\| = \sup_{\|x\| \le 1}\,\sup_{\|y\| \le 1} |\langle Ax, y \rangle|. \]

(See the discussion preceding the Riesz–Fréchet theorem, Theorem 11.5.)

Now, with $A \in BL(H)$ we have associated, by means of the Riesz–Fréchet theorem, a mapping $A^* : H \to H$, called the adjoint of $A$, defined implicitly by

\[ \langle x, A^*y \rangle = \langle Ax, y \rangle \qquad (x, y \in H). \]

Here are some easy properties.

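12.1 Lemma Let $A \in BL(H)$. Then $A^* \in BL(H)$ as well and $\|A^*\| = \|A\|$. Moreover,

\[ H = \overline{\mathcal{R}(A^*)} \oplus \mathcal{N}(A) \]

is an orthogonal decomposition.

Proof. Let $x, y \in H$, $\alpha, \beta \in \mathbb{R}$. Then

\[ \langle z, A^*(\alpha x + \beta y) \rangle = \langle Az, \alpha x + \beta y \rangle = \alpha \langle Az, x \rangle + \beta \langle Az, y \rangle = \alpha \langle z, A^*x \rangle + \beta \langle z, A^*y \rangle = \langle z, \alpha A^*x + \beta A^*y \rangle \]

for all $z \in H$. Thus $A^*(\alpha x + \beta y) = \alpha A^*x + \beta A^*y$. This shows linearity. The norm identity is shown as follows:

\[ \|A\| = \sup_{\|x\| \le 1}\,\sup_{\|y\| \le 1} |\langle Ax, y \rangle| = \sup_{\|y\| \le 1}\,\sup_{\|x\| \le 1} |\langle x, A^*y \rangle| = \|A^*\|. \]

To prove the last assertion, observe that

\[ x \perp \overline{\mathcal{R}(A^*)} \iff x \perp \mathcal{R}(A^*) \iff \langle x, A^*y \rangle = 0 \text{ for all } y \iff \langle Ax, y \rangle = 0 \text{ for all } y \iff Ax = 0 \iff x \in \mathcal{N}(A). \]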

Let us look at some examples.

12.2 Examples (See also LN 14.2.4.)


1) If $H = \mathbb{R}^d$ is finite-dimensional with the canonical scalar product, then $A \in BL(H)$ is given by a $d \times d$-matrix $M = (a_{ij})_{i,j}$. An easy computation shows that $A^*$ then corresponds to the transposed matrix $M^t = (a_{ji})_{i,j}$.

2) Consider the shifts $L, R$ on $H = \ell^2(\mathbb{N})$. Then

\[ \langle Re_n, e_k \rangle = \langle e_{n+1}, e_k \rangle = \delta_{n+1,k} = \delta_{n,k-1} = \langle e_n, Le_k \rangle \]

for all $n, k \in \mathbb{N}$. Since $\mathrm{span}\{e_m \mid m \in \mathbb{N}\}$ is dense in $H$, by (bi-)linearity and continuity we conclude that $\langle Rx, y \rangle = \langle x, Ly \rangle$ for all $x, y \in H$, whence $L^* = R$ and $R^* = L$.

3) Suppose $H = L^2(a, b)$ and $k : [a, b]^2 \to \mathbb{R}$ is a kernel satisfying

\[ \|k\|_2 := \Bigl(\int_a^b \int_a^b |k(s, t)|^2\,ds\,dt\Bigr)^{1/2} < \infty. \]

Such a kernel is called a Hilbert–Schmidt kernel, and the associated integral operator $A = A_k : L^2(a, b) \to L^2(a, b)$ given by

\[ (A_k f)(t) = \int_a^b k(t, s)f(s)\,ds \qquad (f \in L^2(a, b)) \]

is called a Hilbert–Schmidt operator on $H$. Let us compute $A^*$.

Given $f, g \in L^2(a, b)$, the function $(t, s) \mapsto g(t)k(t, s)f(s)$ is product-measurable, and by applying Cauchy–Schwarz twice

\[ \begin{aligned} \int_a^b \int_a^b |g(t)k(t, s)f(s)|\,ds\,dt &= \int_a^b |g(t)| \int_a^b |k(t, s)f(s)|\,ds\,dt \\ &\le \int_a^b |g(t)| \Bigl(\int_a^b |k(t, s)|^2\,ds\Bigr)^{1/2} \|f\|_2\,dt \\ &= \int_a^b |g(t)| \Bigl(\int_a^b |k(t, s)|^2\,ds\Bigr)^{1/2} dt\;\|f\|_2 \\ &\le \|f\|_2\,\|g\|_2 \Bigl(\int_a^b \int_a^b |k(t, s)|^2\,ds\,dt\Bigr)^{1/2} < \infty. \end{aligned} \]

So the hypotheses of the $L^1$-Fubini theorem hold, hence we may conclude that

\[ \begin{aligned} \langle A^*g, f \rangle = \langle g, Af \rangle &= \int_a^b g(t) \int_a^b k(t, s)f(s)\,ds\,dt \\ &= \int_a^b \int_a^b g(t)k(t, s)f(s)\,ds\,dt = \int_a^b \int_a^b g(t)k(t, s)f(s)\,dt\,ds \\ &= \int_a^b \Bigl(\int_a^b g(t)k(t, s)\,dt\Bigr) f(s)\,ds. \end{aligned} \]

This yields

\[ (A^*g)(t) = \int_a^b k^*(t, s)\,g(s)\,ds \qquad (g \in L^2(a, b)) \]

where $k^*(t, s) := k(s, t)$ is the adjoint kernel. So we see that $A^*$ is a Hilbert–Schmidt operator, too.
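The first of these examples is easy to test by direct computation. The sketch below is only an illustration with a hypothetical $3 \times 3$ matrix `M` and hypothetical vectors; it checks the defining identity $\langle Ax, y \rangle = \langle x, A^*y \rangle$ with $A^*$ given by the transposed matrix.

```python
def apply(m, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def inner(x, y):
    """Canonical scalar product on R^d."""
    return sum(a * b for a, b in zip(x, y))

# Hypothetical data for illustration.
M = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0],
     [4.0, 0.0, 1.0]]
x = [1.0, -2.0, 0.5]
y = [0.0, 1.0, 2.0]

lhs = inner(apply(M, x), y)              # <Ax, y>
rhs = inner(x, apply(transpose(M), y))   # <x, A^t y>
print(lhs, rhs)
```

The same pattern carries over, at least formally, to the Hilbert–Schmidt case: transposing the matrix corresponds to swapping the two arguments of the kernel.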

Finally, we now have a mapping $(A \mapsto A^*) : BL(H) \to BL(H)$.

12.3 Theorem Let $H$ be a Hilbert space, and $A, B \in BL(H)$, $\alpha, \beta \in \mathbb{R}$. Then

a) $(\alpha A + \beta B)^* = \alpha A^* + \beta B^*$.

b) $(A^*)^* = A$.

c) $(AB)^* = B^*A^*$.

Proof. For b) take $x, y \in H$ and compute

\[ \langle x, (A^*)^*y \rangle = \langle A^*x, y \rangle = \langle y, A^*x \rangle = \langle Ay, x \rangle = \langle x, Ay \rangle. \]

Since this is true for all $x$, we obtain $(A^*)^*y = Ay$, and since this is true for all $y \in H$ we have $(A^*)^* = A$. For a), c) see LN Exercise 14.3.2.

Preview: The spectral theorem

One of the most important results in finite-dimensional linear algebra is the so-called spectral theorem. You may not have called it this, but you certainly know the statement:

Spectral Theorem (Matrix Version).

If $A$ is a real, symmetric $d \times d$-matrix then there is an orthogonal matrix $O$ such that $O^{-1}AO$ is a diagonal matrix:

\[ O^{-1}AO = \mathrm{diag}(\lambda_1, \dots, \lambda_d). \]

The $\lambda_i$ are the eigenvalues of $A$, and the columns of $O$ form an orthonormal basis of $\mathbb{R}^d$ consisting of eigenvectors. Avoiding the matrix terminology, one can state the result in an equivalent form:

Spectral Theorem (Finite-dimensional Version).

Let $A$ be a linear mapping on a $d$-dimensional Hilbert space $H$ such that $A^* = A$. Then there is an orthonormal basis $(e_j)_{j=1,\dots,d}$ of $H$ and real numbers $(\lambda_j)_{j=1,\dots,d}$ such that $Ae_j = \lambda_j e_j$ for all $j = 1, \dots, d$. The mapping $A$ is then given as

\[ Ax = \sum_{j=1}^{d} \lambda_j \langle x, e_j \rangle e_j \qquad (x \in H). \]

Our goal is to see which hypotheses are needed to obtain an analogous statement if $H$ is an infinite-dimensional Hilbert space. Hence we look for operators $A$ on a Hilbert space $H$ which allow a representation

\[ (*) \qquad Ax = \sum_{k=0}^{\infty} \lambda_k \langle x, e_k \rangle e_k \qquad (x \in H), \]

where the $\lambda_k$ are real scalars and $(e_n)_n$ forms an orthonormal system of $H$.

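To get an idea what we can expect, let us analyse this situation further. So suppose that $A$ is a bounded operator given by $(*)$. Then we have $Ae_k = \lambda_k e_k$ for all $k$, and hence the $\lambda_k$ are eigenvalues with eigenvectors $e_k$. Moreover, for $x, y \in H$,

\[ \langle Ax, y \rangle = \Bigl\langle \sum_{k=0}^{\infty} \lambda_k \langle x, e_k \rangle e_k,\; y \Bigr\rangle = \sum_{k=0}^{\infty} \lambda_k \langle x, e_k \rangle \langle e_k, y \rangle = \Bigl\langle x,\; \sum_{k=0}^{\infty} \lambda_k \langle y, e_k \rangle e_k \Bigr\rangle = \langle x, Ay \rangle. \]

This shows that $A = A^*$, i.e., $A$ is self-adjoint.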

We shall investigate self-adjoint operators and some general spectral theory (generalized eigenvalues) later. For now, let us turn to another aspect of the formula $(*)$. The series should at least be convergent for each individual $x$. Unfortunately, this so-called “strong convergence” of the series does not lead to a satisfying general spectral theorem. But we shall be successful if we aim at a stronger notion of convergence here.
Suppose again that $A$ is given by $(*)$. Define for $n \in \mathbb{N}$ the operator $A_n$ by

\[ A_n x := \sum_{k=0}^{n} \lambda_k \langle x, e_k \rangle e_k \qquad (x \in H). \]

Then $A_n$ is a finite-dimensional operator, i.e., its range has finite dimension. Moreover,

\[ \|A_n x - Ax\|^2 = \Bigl\| \sum_{k=n+1}^{\infty} \lambda_k \langle x, e_k \rangle e_k \Bigr\|^2 = \sum_{k=n+1}^{\infty} |\lambda_k \langle x, e_k \rangle|^2 \le \|x\|^2 \sup_{k \ge n+1} |\lambda_k|^2 \]

for all $x \in H$. This shows

\[ \|A_n - A\| \le \sup_{k \ge n+1} |\lambda_k| \qquad (n \in \mathbb{N}). \]


On the other hand, if $k \ge n + 1$ then $A_n e_k = 0$ and hence

\[ |\lambda_k| = \|Ae_k\| = \|Ae_k - A_n e_k\| \le \|A - A_n\|\,\|e_k\| = \|A - A_n\|, \]

which shows $\sup_{k \ge n+1} |\lambda_k| \le \|A_n - A\|$. Combining both inequalities we arrive at

\[ \|A_n - A\| = \sup_{k \ge n+1} |\lambda_k| \qquad (n \in \mathbb{N}), \]

which implies

\[ \lim_{n \to \infty} \|A_n - A\| = 0 \iff \lim_{n \to \infty} |\lambda_n| = 0. \]

Hence if we want $A$ to be given by $(*)$ with this type of convergence, $A$ must necessarily be approximable in operator norm by operators with finite-dimensional range. This leads to the notion of compact operator, which we shall study now.
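The identity $\|A_n - A\| = \sup_{k \ge n+1} |\lambda_k|$ can be illustrated with a diagonal operator. The sketch below makes two simplifying assumptions that are not in the text: the space is cut off to the first `N` coordinates of $\ell^2$, and the eigenvalues are the hypothetical choice $\lambda_k = 1/(k+1)$, $k = 0, 1, \dots$ It shows that the tail operator $A - A_n$ attains the bound at the unit vector $e_{n+1}$ and stays below it on another unit vector.

```python
import math

N = 50                                       # truncation dimension (assumption)
lam = [1.0 / (k + 1) for k in range(N)]      # hypothetical eigenvalues, k = 0..N-1

def apply_tail(x, n):
    """(A - A_n)x for the diagonal operator: only coordinates k >= n+1 survive."""
    return [lam[k] * x[k] if k >= n + 1 else 0.0 for k in range(N)]

def norm(x):
    return math.sqrt(sum(v * v for v in x))

def tail_sup(n):
    """sup of |lambda_k| over k >= n+1 (within the truncation)."""
    return max(abs(lam[k]) for k in range(n + 1, N))

n = 4
e_next = [1.0 if k == n + 1 else 0.0 for k in range(N)]    # unit vector e_{n+1}
flat = [1.0 / math.sqrt(N)] * N                            # another unit vector

attained = norm(apply_tail(e_next, n))       # equals |lambda_{n+1}|
smaller = norm(apply_tail(flat, n))          # stays below the supremum
print(tail_sup(n), attained, smaller)
```

Because $\lambda_k \to 0$ here, `tail_sup(n)` decreases to zero as `n` grows, which is exactly the situation in which the finite-rank operators $A_n$ approximate $A$ in operator norm.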

Compact operators

The following theorem of real analysis plays a central role.

12.4 Theorem (Bolzano–Weierstrass, one-dimensional version) Let $(x_n)_{n \in \mathbb{N}} \subset \mathbb{R}$ be a bounded sequence. Then $(x_n)_{n \in \mathbb{N}}$ has a convergent subsequence.

To recall terminology: a subsequence $s' = (y_n)_n$ of a sequence $s = (x_n)_n$ is given by a strictly increasing map

\[ \sigma : \mathbb{N} \to \mathbb{N} \quad\text{with}\quad \sigma(1) < \sigma(2) < \dots \]

such that $y_n = x_{\sigma(n)}$ for all $n \in \mathbb{N}$. It is an easy exercise to show that if a sequence converges, then each subsequence converges as well. Also, a subsequence of a subsequence is a subsequence.

Proof. One constructs a subsequence which is Cauchy, by the following method: There are $a, b$ such that $x_n \in [a, b]$ for all $n$. Divide the interval $I_0 := [a, b]$ into two halves. Select as $I_1$ a half which contains $x_n$ for infinitely many $n$. Then divide $I_1$ and continue this way. This gives a sequence of nested intervals $I_0 \supset I_1 \supset \dots$ with lengths converging to $0$. Now pick successively $n_k$ such that $n_k < n_{k+1}$ and $x_{n_k} \in I_k$. The sequence $(x_{n_k})_k$ must be a Cauchy sequence and hence convergent.

The one-dimensional result extends readily to the finite-dimensional case.


12.5 Theorem (Bolzano–Weierstrass, finite-dimensional version) Let $E$ be a finite-dimensional normed space, and let $(x_n)_{n \in \mathbb{N}} \subset E$ be a bounded sequence. Then $(x_n)_{n \in \mathbb{N}}$ has a convergent subsequence.

Proof. By choosing a basis one may suppose that $E = \mathbb{R}^d$. Convergence of $s = (x_n)_n$ is the same as convergence in each coordinate. Since the $x_n = (x_{n,1}, \dots, x_{n,d})$ are uniformly bounded, each sequence $(x_{n,j})_{n \in \mathbb{N}}$ is a bounded real sequence ($j = 1, \dots, d$). Then by the one-dimensional Bolzano–Weierstrass theorem, one finds a subsequence $s^{(1)}$ with converging 1st coordinates. Then we choose a subsequence $s^{(2)}$ of $s^{(1)}$ with converging 2nd coordinates. Continuing this way, we arrive at a subsequence $s^{(d)}$ with converging $d$-th coordinate, and which is a subsequence of all the other sequences. So $s^{(d)}$ converges in all coordinates, i.e., is convergent.

The following example shows that the Bolzano–Weierstrass theorem fails in infinite dimensions.

12.6 Example Let $H$ be a Hilbert space and let $(x_n)_n$ be a bounded sequence of pairwise orthogonal vectors with $\delta := \inf_n \|x_n\| > 0$. Then for $n \ne m$,

\[ \|x_n - x_m\|^2 = \|x_n\|^2 + \|x_m\|^2 \ge 2\delta^2 \]

(by Pythagoras), and hence $\|x_n - x_m\| \ge \sqrt{2}\,\delta$. Thus $(x_n)_n$ contains no convergent subsequence.
In particular, if $H$ is infinite-dimensional, then the Bolzano–Weierstrass theorem does not hold for $H$. (Choose $x_n = e_n$ for some ONS $(e_n)_n$.)

12.7 Definition Let $E, F$ be Banach spaces. A bounded linear mapping $A \in BL(E, F)$ is called compact if for every bounded sequence $(x_n)_n \subset E$ the sequence $(Ax_n)_n$ has a subsequence that converges in $F$.

A bounded linear operator $A \in BL(E, F)$ is called finite-dimensional or a finite rank operator if $\dim \mathcal{R}(A) < \infty$.

12.8 Corollary If $A$ is a finite-dimensional operator, then $A$ is compact.

This follows immediately from the definition and the Bolzano–Weierstrass theorem.

Let us introduce the space

\[ C(E, F) := \{A \in BL(E, F) \mid A \text{ is compact}\}. \]


12.9 Theorem Let $E, F, G$ be Banach spaces. Then $C(E, F)$ is a closed linear subspace of $BL(E, F)$. Moreover, if $A : E \to F$ is compact and $C \in BL(F, G)$ and $D \in BL(G, E)$, then $CA$ and $AD$ are compact.

Proof. Take $A, B \in C(E, F)$, scalars $\lambda, \mu$, and a bounded sequence $(x_n)_n \subset E$. By passing to a subsequence (see footnote 20), we may suppose that $a := \lim_n Ax_n$ exists. Passing again to a subsequence we may suppose that $b := \lim_n Bx_n$ exists, too. Hence $\lim_n (\lambda Ax_n + \mu Bx_n) = \lambda a + \mu b$ also exists. This shows that $\lambda A + \mu B$ is again compact.

The proof that $CA$ and $AD$ are both compact is an easy exercise.

Suppose that $\|A_n - A\| \to 0$ and each $A_n$ is compact. We have to show that $A$ is compact as well. This is a so-called “diagonal argument”, a quite funny thing if you see it for the first time.
Let $s_0 := (f_n)_n$ be a sequence in $E$ with $\|f_n\| \le M$ for all $n$. We can find a subsequence $s_1 := (f^1_n)_n$ of $s_0$ such that $y_1 := \lim_n A_1 f^1_n$ exists. Then we find a subsequence $s_2 := (f^2_n)_n$ of $s_1$ such that $y_2 := \lim_n A_2 f^2_n$ exists. Repeating this we construct successive subsequences

\[ s_m = (f^m_n)_n \quad\text{with existing limit}\quad y_m := \lim_n A_m f^m_n. \]

Consider now the “diagonal sequence” $s := (f^n_n)_n$. This is certainly a subsequence of the original sequence $s_0$, but more is true: for each $m$, the “tail” $(f^n_n)_{n \ge m}$ is a subsequence of $s_m$. Hence

\[ \lim_n A_m f^n_n = y_m \]

holds for all $m \in \mathbb{N}$. We claim that $(Af^n_n)_n$ is convergent. To show this, since $F$ is a Banach space, it suffices to show that this sequence is Cauchy. Fix $\varepsilon > 0$ and fix $N$ such that $2M\,\|A_N - A\| \le \varepsilon$. Then

\[ \begin{aligned} \|Af^n_n - Af^m_m\| &\le \|(A - A_N)f^n_n\| + \|A_N f^n_n - A_N f^m_m\| + \|(A_N - A)f^m_m\| \\ &\le \|A - A_N\|\,\|f^n_n\| + \|A_N f^n_n - A_N f^m_m\| + \|A_N - A\|\,\|f^m_m\| \\ &\le 2\,\|A_N - A\|\,M + \|A_N f^n_n - A_N f^m_m\| \le \varepsilon + \|A_N f^n_n - A_N f^m_m\| \end{aligned} \]

for all $n, m$. But for large enough $n, m$, the second summand satisfies $\|A_N f^n_n - A_N f^m_m\| \le \varepsilon$ as well, and we are done.

20 This kind of argument appears frequently. One first finds a subsequence $(x_{k_n})_n$ with the desired property. Then one renames this sequence as $(x_n)_n$, because one wants to avoid too many indices. This procedure may be iterated, but one always gets a subsequence of the original sequence.


12.10 Theorem Let $H$ be a Hilbert space and $A \in BL(H)$. Then $A$ is compact if and only if there exists a sequence of finite-dimensional operators $A_n$ such that $\|A_n - A\| \to 0$.

Remark. One direction is an easy consequence of Corollary 12.8 and Theorem 12.9. The other direction is stated here without proof, but for self-adjoint operators it will follow from the spectral theorem. Theorem 12.10 provides the justification for LN, which defines compact operators as limits of finite-rank operators (see LN 12.3).

Another Remark. The non-trivial implication of Theorem 12.10 is not true in general Banach spaces. This was for a long time one of the most famous open problems, posed originally by Mazur (a colleague of Banach in Lwów) in the 1920’s and finally resolved by Per Enflo in 1972. He was awarded a goose for that.

12.11 Theorem Let $k : [a, b]^2 \to \mathbb{R}$ be a Hilbert–Schmidt kernel and $A_k$ be the associated Hilbert–Schmidt integral operator. Then $A_k$ is compact.

Proof. See Theorem 12.17 below for a rigorous proof. Compare also to LN 12.3.4.

Appendix: Abstract Hilbert–Schmidt operators

Let us start with a small lemma.

12.12 Lemma Let $(e_n)_n$ and $(f_n)_n$ be two complete orthonormal systems in the Hilbert space $H$, and $A \in BL(H)$. Then

\[ \sum_n \|Ae_n\|^2 = \sum_{n,m} |\langle Ae_n, f_m \rangle|^2 = \sum_m \|A^*f_m\|^2 \]

as values in $[0, \infty]$.

Proof. Summing of positive numbers is independent of the order of summation. Hence

\[ \sum_{n,m} |\langle Ae_n, f_m \rangle|^2 = \sum_n \sum_m |\langle Ae_n, f_m \rangle|^2 = \sum_n \|Ae_n\|^2 \]

by Parseval’s theorem. Reversing the summation and noting that $\langle Ae_n, f_m \rangle = \langle e_n, A^*f_m \rangle$ yields

\[ \sum_{n,m} |\langle Ae_n, f_m \rangle|^2 = \sum_m \sum_n |\langle e_n, A^*f_m \rangle|^2 = \sum_m \|A^*f_m\|^2. \]

This proves the claim.

Lemma 12.12 shows that the number

\[ \|A\|_{HS} := \Bigl(\sum_n \|Ae_n\|^2\Bigr)^{1/2} \]

does not depend on the choice of the orthonormal basis $(e_n)_n$. Indeed, one may apply the lemma two times, first as it stands and then with $(e_n)_n$ replaced by $(f_n)_n$, to compute

\[ \sum_n \|Ae_n\|^2 = \sum_m \|A^*f_m\|^2 = \sum_n \|Af_n\|^2. \]

12.13 Definition Let $H$ be a (separable) Hilbert space with a complete orthonormal system $(e_n)_n$. A bounded linear operator $A \in BL(H)$ is called an (abstract) Hilbert–Schmidt operator if

\[ \|A\|_{HS}^2 := \sum_{n=1}^{\infty} \|Ae_n\|^2 < \infty. \]

The lemma from above directly shows that $\|A^*\|_{HS} = \|A\|_{HS}$.

12.14 Exercise Show that $\|A + B\|_{HS} \le \|A\|_{HS} + \|B\|_{HS}$. (Hint: Minkowski’s inequality for $\ell^2$.)

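12.15 Lemma Let $A$ be a Hilbert–Schmidt operator. Then $\|A\| \le \|A\|_{HS}$.

Proof. Let $x \in H$. Then

\[ \|Ax\|^2 = \Bigl\| A \sum_n \langle x, e_n \rangle e_n \Bigr\|^2 = \Bigl\| \sum_n \langle x, e_n \rangle Ae_n \Bigr\|^2 \le \Bigl( \sum_n |\langle x, e_n \rangle|\,\|Ae_n\| \Bigr)^2 \le \Bigl( \sum_n |\langle x, e_n \rangle|^2 \Bigr) \Bigl( \sum_n \|Ae_n\|^2 \Bigr) = \|x\|^2\,\|A\|_{HS}^2 \]

by the Cauchy–Schwarz (= Hölder’s) inequality in $\ell^2$. Taking square roots yields $\|Ax\| \le \|A\|_{HS}\,\|x\|$, and since $x \in H$ was arbitrary, we obtain $\|A\| \le \|A\|_{HS}$.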
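In finite dimensions the Hilbert–Schmidt norm is the familiar Frobenius norm of a matrix, and the estimate of Lemma 12.15 can be observed directly. The following sketch assumes $H = \mathbb{R}^2$ with the canonical basis; the matrix `M` and the test vectors are hypothetical.

```python
import math

def apply(m, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in m]

def norm(x):
    return math.sqrt(sum(v * v for v in x))

def hs_norm(m):
    """Hilbert-Schmidt norm: square root of the sum of ||A e_n||^2 over the basis."""
    d = len(m[0])
    total = 0.0
    for n in range(d):
        e_n = [1.0 if j == n else 0.0 for j in range(d)]
        total += norm(apply(m, e_n)) ** 2
    return math.sqrt(total)

# Hypothetical data for illustration.
M = [[1.0, 2.0], [3.0, 4.0]]
vectors = [[1.0, 1.0], [1.0, -1.0], [0.3, 2.0]]

for x in vectors:
    # ||Ax|| on the left, the bound ||A||_HS * ||x|| on the right.
    print(norm(apply(M, x)), hs_norm(M) * norm(x))
```

In each printed pair the first number should not exceed the second, which is the inequality $\|Ax\| \le \|A\|_{HS}\,\|x\|$ from the proof.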

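We want to show that Hilbert–Schmidt operators are compact. To this end, let $A$ be an abstract Hilbert–Schmidt operator. By completeness of the ONS and since $A$ is bounded, we may write

\[ Ax = A \sum_n \langle x, e_n \rangle e_n = \sum_n \langle x, e_n \rangle Ae_n \qquad (x \in H). \]

Define the finite-dimensional operator $A_n$ by

\[ A_n x := \sum_{k=1}^{n} \langle x, e_k \rangle Ae_k \qquad (x \in H). \]

12.16 Lemma With the notations from above,

\[ \|A_n - A\| \le \|A_n - A\|_{HS} \to 0 \qquad (n \to \infty). \]

In particular, a Hilbert–Schmidt operator is compact.

Proof. From Lemma 12.15 we know that $\|A_n - A\| \le \|A_n - A\|_{HS}$. But $(A_n - A)e_k = 0$ for $k \le n$ and $(A_n - A)e_k = -Ae_k$ for $k > n$, so

\[ \|A_n - A\|_{HS}^2 = \sum_{k=n+1}^{\infty} \|Ae_k\|^2 \to 0 \qquad (n \to \infty) \]

as claimed.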

Last, we show that our Hilbert–Schmidt integral operators are indeed abstractHilbert–Schmidt operators.

12.17 Theorem Let $I \subset \mathbb{R}$ be an interval and $k : I \times I \to \mathbb{R}$ a measurable mapping such that

\[ \int_I \int_I |k(t, s)|^2\,ds\,dt < \infty. \]

Let $A_k$ be the associated integral operator on $H = L^2(I)$. Then $A_k$ is Hilbert–Schmidt, and

\[ \|A_k\|_{HS}^2 \le \int_I \int_I |k(t, s)|^2\,ds\,dt. \]

(One has actually equality in the last line, but that is beyond our means.) In particular, $A_k$ is compact.

Proof. Let $(e_n)_n$ be a complete ONS in $L^2(I)$. The class of Hilbert–Schmidt kernels is in itself an inner product space, namely the space

\[ E := L^2(I \times I) \]

of (equivalence classes of) measurable functions $k : I \times I \to \mathbb{R}$ such that

\[ \|k\|_2^2 = \int_I \int_I |k(t, s)|^2\,ds\,dt < \infty. \]

The inner product on $E$ is given by

\[ \langle k, l \rangle := \int_I \int_I k(t, s)\,l(t, s)\,ds\,dt, \]

which is well-defined by Tonelli, Fubini and two applications of Cauchy–Schwarz. Now, it is simple to check that the functions $e_n \otimes e_m$ defined by

\[ (e_n \otimes e_m)(t, s) := e_n(t)\,e_m(s) \qquad (s, t \in I) \]

form an ONS in $E$. Hence,

\[ \langle A_k e_n, e_m \rangle = \int_I \int_I k(t, s)\,e_m(t)\,e_n(s)\,ds\,dt = \langle k, e_m \otimes e_n \rangle_E \]

is nothing else than the Fourier coefficient of $k \in E$ with respect to $e_m \otimes e_n$. Lemma 12.12 together with Bessel’s inequality in the space $E$ then yields

\[ \|A_k\|_{HS}^2 = \sum_{n,m} |\langle A_k e_n, e_m \rangle_H|^2 = \sum_{n,m} |\langle k, e_m \otimes e_n \rangle_E|^2 \le \|k\|_E^2 = \int_I \int_I |k(t, s)|^2\,ds\,dt, \]

which was to be proved.

Remark. It can be shown that the space of abstract Hilbert–Schmidt operators is again a Hilbert space, with the scalar product given by

\[ \langle A, B \rangle := \sum_n \langle Ae_n, Be_n \rangle. \]

(The sum is absolutely convergent by two applications of Cauchy–Schwarz, first in $H$, then in $\ell^2$.)
Another Remark. With some more information from Integration Theory one can show that the space $L^2(I \times I)$ introduced above is complete, and one even has equality

\[ \|A_k\|_{HS}^2 = \int_I \int_I |k(t, s)|^2\,ds\,dt \]

for all $k \in L^2(I \times I)$.
Last Remark. With even more information from Integration Theory one can show that every abstract Hilbert–Schmidt operator $A$ on $L^2(I)$ is of the form $A = A_k$ for some integral kernel $k \in L^2(I \times I)$.



Lecture 13

Some general spectral theory

The main point in the spectral theorem (yet to state/prove) is an eigenvaluedecomposition. We recall (from linear algebra) that λ ∈ R is called an eigen-value of A if there is 0 6= x ∈ H such that Ax = λx. The subspace N (A−λ Id)is called the corresponding eigenspace and each nonzero member of it is calleda corresponding eigenvector. In finite dimensions, the eigenvalues tell a greatdeal of the story of an operator. This is due to the fact, that one has

λ Id−A injective ⇐⇒ λ Id−A surjective

for every linear operator A on Rd. Such a statement is not true in infinitedimensions: On H = `2 the right shift is injective but not surjective, and theleft shift is surjective but not injective. This leads to the general notion ofspectrum.

13.1 Definition Let E be a Banach space and A ∈ BL(E). A scalar λ is called a regular value of A if A − λ Id is invertible, i.e. if it is bijective and its inverse is bounded (footnote 21). If λ is a regular value, the operator

\[
R(\lambda, A) := (\lambda\,\mathrm{Id} - A)^{-1}
\]

is called the resolvent of A at the point λ. The set of regular values,

\[
\varrho(A) := \{\lambda \mid A - \lambda\,\mathrm{Id} \text{ is invertible}\},
\]

is called the resolvent set and its complement

\[
\sigma(A) := \{\lambda \mid A - \lambda\,\mathrm{Id} \text{ is not invertible}\}
\]

is called the spectrum of A.

So if λ is an eigenvalue of A then it is certainly in the spectrum, and this is already the end of the story in finite dimensions. However, the example of the right shift shows that in the infinite-dimensional case there are usually spectral values which are not eigenvalues.

The Neumann series gives us a handy criterion for regular values.

13.2 Theorem Let E be a nontrivial Banach space and A ∈ BL(E). Then the following assertions hold.

a) If |λ| > ‖A‖ then λ ∈ ϱ(A) and

\[
R(\lambda, A) = \sum_{n=0}^{\infty} \lambda^{-n-1} A^n .
\]

b) σ(A) is closed, ϱ(A) is open and

\[
\operatorname{dist}(\lambda, \sigma(A)) \ge \frac{1}{\|R(\lambda, A)\|} \qquad (\lambda \in \varrho(A)).
\]

Footnote 21: Actually, since E is supposed to be a Banach space, the inverse is automatically bounded, by Banach's Theorem 10.11.

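Proof. I shall write I instead of Id in this proof.

a) Let |λ| > ‖A‖. Then $\|\lambda^{-1}A\| = |\lambda|^{-1}\|A\| < 1$, and hence $I - \lambda^{-1}A$ is invertible by the Neumann series. But

\[
\lambda I - A = \lambda (I - \lambda^{-1} A)
\]

and this shows that λI − A is invertible with

\[
R(\lambda, A) = (\lambda I - A)^{-1} = \lambda^{-1} (I - \lambda^{-1} A)^{-1} = \sum_{n=0}^{\infty} \lambda^{-(n+1)} A^n .
\]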
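The formula in a) can be tried out in finite dimensions. The following is only a minimal sketch and not part of the notes: the 2×2 matrix `A`, the value `lam` and all function names are illustrative assumptions, chosen so that |λ| > ‖A‖ holds.

```python
# Sketch: partial sums of the Neumann series for the resolvent of a 2x2 matrix.
# The matrix A and the scalar lam are illustrative choices, not from the notes.

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def neumann_resolvent(A, lam, terms):
    """Partial sum  sum_{n < terms} lam^(-n-1) A^n  of the Neumann series."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]            # A^0 = I
    for n in range(terms):
        coeff = lam ** (-(n + 1))
        for i in range(2):
            for j in range(2):
                total[i][j] += coeff * power[i][j]
        power = matmul(power, A)
    return total

def exact_resolvent(A, lam):
    """(lam*I - A)^(-1) via the explicit inverse of a 2x2 matrix."""
    a, b = lam - A[0][0], -A[0][1]
    c, d = -A[1][0], lam - A[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[0.5, 0.25], [0.0, 0.5]]    # its operator norm is below 1
lam = 2.0                         # hence |lam| > ||A|| and the series converges
approx = neumann_resolvent(A, lam, 60)
exact = exact_resolvent(A, lam)
max_error = max(abs(approx[i][j] - exact[i][j])
                for i in range(2) for j in range(2))
```

Here `max_error` measures how far the 60-term partial sum is from the exact inverse; since the terms decay geometrically with ratio at most ‖A‖/|λ|, it should be at the level of rounding errors.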

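b) Fix λ ∈ ϱ(A) and suppose that

\[
\|(\lambda - \mu) R(\lambda, A)\| = |\mu - \lambda| \, \|R(\lambda, A)\| < 1.
\]

By the Neumann series, we know that the operator I − (λ − µ)R(λ, A) is invertible. But

\[
\mu I - A = (\lambda I - A) - (\lambda - \mu) I = (\lambda I - A)\,[I - (\lambda - \mu) R(\lambda, A)].
\]

Hence µI − A is invertible, and its inverse is given by

\[
R(\mu, A) = [I - (\lambda - \mu) R(\lambda, A)]^{-1} (\lambda I - A)^{-1} = \sum_{n=0}^{\infty} (\lambda - \mu)^n R(\lambda, A)^{n+1}.
\]

In particular, µ ∈ ϱ(A). By contraposition, if µ ∈ σ(A) then $|\mu - \lambda| \ge \|R(\lambda, A)\|^{-1}$, and taking the infimum over these µ we obtain $\operatorname{dist}(\lambda, \sigma(A)) \ge \|R(\lambda, A)\|^{-1}$.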

13.3 Exercise Let E be a Banach space, A ∈ BL(E). Show that

\[
R(\lambda, A) - R(\mu, A) = (\mu - \lambda) R(\lambda, A) R(\mu, A) \qquad (\lambda, \mu \in \varrho(A)).
\]

(This identity is called the resolvent identity.)

Remark. Computing spectra of operators is a big issue in functional analysis. (See LN 15.2.3 and LN 15.2.4 for the spectra of the left and the right shift.) Very often, asymptotic properties of dynamical systems are determined by the spectral properties of the underlying operators ("Stability Theory"). We do not go into this here, but stress the fact that spectral theory has very applied facets.

The notion closest to an eigenvalue is that of an “approximate” eigenvalue.


13.4 Definition Let E be a Banach space and A ∈ BL(E). The scalar λ is called an approximate eigenvalue of A if there exists $(x_n)_n \subset E$ such that $\|x_n\| = 1$ for all n ∈ N and $\|Ax_n - \lambda x_n\| \to 0$.

An approximate eigenvalue is certainly a spectral point, and every eigenvalue is an approximate eigenvalue. But the following examples show that in general there are spectral values that are not approximate eigenvalues and approximate eigenvalues that are not eigenvalues.

13.5 Examples 1) Consider the integration operator J on E = C[a, b]. Then J is certainly not bijective (as its range consists of C¹-functions), but it is injective: if Jf = 0 then f = (Jf)′ = 0. Hence 0 is not an eigenvalue but a spectral value. Even more, we can use the sequence $(f_n)_n$ with $f_n(t) := t^n$ (say on [a, b] = [0, 1], where $\|f_n\|_\infty = 1$ and $\|Jf_n\|_\infty = 1/(n+1)$) to see that 0 is an approximate eigenvalue.

2) Consider the right shift R on H = ℓ². Then R is an isometry, but not surjective, and hence 0 ∈ σ(R). But 0 is not an approximate eigenvalue, since ‖Rx‖ = ‖x‖ for all x.

3) Let H = L²(a, b) and A ∈ BL(H) given by

\[
(Af)(t) = t f(t) \qquad (t \in (a, b)).
\]

We claim that every λ₀ ∈ [a, b] is an approximate eigenvalue. Indeed, for n ∈ N we can find $f_n$ such that

\[
\|f_n\|_2 = 1 \quad\text{and}\quad f_n(t) = 0 \quad (|t - \lambda_0| \ge 1/n).
\]

(Choose $f_n := c_n 1_{[\lambda_0 - 1/n, \lambda_0 + 1/n]}$ with a suitable constant $c_n$.) Then

\[
\|Af_n - \lambda_0 f_n\|_2^2 = \int_a^b |t - \lambda_0|^2 |f_n(t)|^2 \,dt \le \frac{4}{n^2} \int_a^b |f_n(t)|^2 \,dt = \frac{4}{n^2} \to 0.
\]

It is easy to see that no λ ∈ [a, b] is an eigenvalue (Exercise LN 15.4.2) and that σ(A) = [a, b].
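The estimate in Example 3) can be checked numerically. The sketch below is an illustration only: it assumes (a, b) = (0, 1), takes λ₀ in the interior so that the indicator interval fits into (a, b), uses $c_n^2 = n/2$, and the function name `residual_norm` is hypothetical. Under these assumptions a direct calculation gives $\|Af_n - \lambda_0 f_n\|_2 = 1/(\sqrt{3}\,n)$, which is compatible with the bound 2/n from the text.

```python
import math

def residual_norm(lam0, n, samples=20000):
    """||A f_n - lam0 f_n||_2 for (Af)(t) = t f(t) and
    f_n = c_n * indicator of [lam0 - 1/n, lam0 + 1/n]
    (the interval is assumed to lie inside (a, b))."""
    left, right = lam0 - 1.0 / n, lam0 + 1.0 / n
    c_sq = n / 2.0                       # c_n^2, chosen so that ||f_n||_2 = 1
    h = (right - left) / samples
    total = 0.0
    for k in range(samples):
        t = left + (k + 0.5) * h         # midpoint rule
        total += (t - lam0) ** 2 * c_sq * h
    return math.sqrt(total)

# The residuals shrink like 1/n, so lam0 = 0.5 is an approximate eigenvalue.
residuals = [residual_norm(0.5, n) for n in (2, 10, 100)]
```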

The compact operators show a special behaviour here.

13.6 Theorem Let A be a compact operator on a Hilbert space and let λ ≠ 0 be an approximate eigenvalue of A. Then λ is an eigenvalue and N(A − λ Id) is finite-dimensional.

Proof. By definition, there is a sequence $(x_n)_n \subset H$ such that $\|x_n\| = 1$ for all n and $\|Ax_n - \lambda x_n\| \to 0$. As A is compact, by passing to a subsequence we may suppose that $y := \lim_n Ax_n$ exists. Consequently

\[
\|\lambda x_n - y\| \le \|\lambda x_n - A x_n\| + \|A x_n - y\| \to 0.
\]

Thus $\|y\| = \lim_n \|\lambda x_n\| = \lim_n |\lambda| \|x_n\| = |\lambda| \ne 0$. Moreover,

\[
Ay = A(\lim_n \lambda x_n) = \lambda \lim_n A x_n = \lambda y,
\]

which shows that λ is an eigenvalue with eigenvector y. Suppose that F := N(A − λ Id) is infinite-dimensional. Then there is an infinite ONS $(e_n)_{n \in N}$ in F. For n ≠ m,

\[
\|Ae_n - Ae_m\| = \|\lambda e_n - \lambda e_m\| = |\lambda| \, \|e_n - e_m\| = |\lambda| \sqrt{2}.
\]

Since λ ≠ 0, the sequence $(Ae_n)_n$ does not have a convergent subsequence, which contradicts the compactness of A.

Remark. A real square matrix may not have any real eigenvalue. This is due to the fact that the eigenvalues of a matrix are the zeros of the characteristic polynomial, and there may be no real zeros. As an example consider the operator

\[
A := \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
\]

on R². However, if one passes to complex Banach spaces the spectrum is never empty.
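For the rotation matrix in the remark, the characteristic polynomial is z² + 1. A small sketch (the helper name `eigenvalues_2x2` is an assumption for illustration) computes its roots with complex arithmetic and shows that the discriminant is negative, so there is no real eigenvalue, while over C the eigenvalues are ±i.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Roots of the characteristic polynomial z^2 - (a + d) z + (ad - bc)
    of the matrix [[a, b], [c, d]], together with its discriminant."""
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    root = cmath.sqrt(disc)              # complex square root, works for disc < 0
    return (trace + root) / 2, (trace - root) / 2, disc

z1, z2, disc = eigenvalues_2x2(0, -1, 1, 0)
has_real_eigenvalue = disc >= 0          # False for the rotation matrix
```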

So already in finite dimensions, a satisfying spectral theory exists only if one works in a complex setting. This is even more true in infinite dimensions: the resolvent mapping is an analytic function, i.e., around each point it has a power series representation (see the proof of Theorem 13.2); and we know even from real analysis that the "true" meaning and full power of analytic functions is revealed only by extension into the complex domain.

This is the reason why LN introduces complex spaces in LN Chapter 14. However, while this is absolutely reasonable when one really is interested in general spectral theory, for our goal (the spectral theorem for compact self-adjoint operators) it is not necessary at all. Please just be aware that the spectral theory we included here is absolutely fragmentary and there is much more to say about it on complex Banach spaces (the definition of which you find in LN Chapter 14).

Self-adjoint operators

We now look at a very important class of operators on a Hilbert space H.

13.7 Definition An operator A ∈ BL(H) is called self-adjoint if A* = A. A self-adjoint operator is called positive (notation: A ≥ 0) if

\[
\langle Ax, x\rangle \ge 0 \qquad (x \in H).
\]


13.8 Examples 1) Let F ⊂ H be a closed linear subspace of H and let $P = P_F$ be the orthogonal projection onto F. Then P is self-adjoint, P ≥ 0 and ‖P‖ ≤ 1.

Proof. For any vectors x, y ∈ H we have Px ⊥ y − Py. Hence

\[
\langle Px, y\rangle = \langle Px, Py + (y - Py)\rangle = \langle Px, Py\rangle = \langle Px + (x - Px), Py\rangle = \langle x, Py\rangle
\]

for all x, y ∈ H. Hence P* = P. Since P² = P, it follows that

\[
\langle Px, x\rangle = \langle PPx, x\rangle = \langle Px, Px\rangle \ge 0
\]

for all x ∈ H, whence P ≥ 0. Finally, by Pythagoras,

\[
\|x\|^2 = \|Px\|^2 + \|x - Px\|^2 \ge \|Px\|^2 \qquad (x \in H),
\]

whence ‖P‖ ≤ 1.

2) If $A = A_k$ is a Hilbert–Schmidt operator on H = L²(a, b) then A is self-adjoint iff the kernel k is symmetric, i.e.

\[
k(t, s) = k(s, t) \qquad (a \le s, t \le b).
\]

(As the operator A is not changed by changing the kernel k on (2-dimensional) null sets, the stated equality needs to be satisfied only "almost everywhere" on [a, b] × [a, b].) Unfortunately, there is no simple criterion for the kernel k characterizing the positivity of A.

3) Suppose again that H = L²(a, b), but now the operator A is given by multiplication

\[
(Af)(t) := m(t) f(t) \qquad (t \in (a, b),\ f \in L^2(a, b))
\]

for some fixed real-valued function m ∈ L^∞(a, b). Then A is self-adjoint, as

\[
\langle Af, g\rangle = \int_a^b [m(t) f(t)]\, g(t) \,dt = \int_a^b f(t)\, [m(t) g(t)] \,dt = \langle f, Ag\rangle
\]

for all f, g ∈ H. One has A ≥ 0 if and only if m(t) ≥ 0 almost everywhere. (It is quite elementary to show this if m is a continuous function, but one needs measure theory to prove it in general.)

If A, B are two self-adjoint operators, one writes A ≤ B instead of B − A ≥ 0. Furthermore, if α ∈ R, one often writes α instead of α Id.


13.9 Lemma Let A, B be self-adjoint and C ∈ BL(H) arbitrary. If A, B ≥ 0 then A + B ≥ 0 and C*AC ≥ 0. Moreover, if A ≥ 0 one has

\[
\|Ax\|^2 = \langle A^2 x, x\rangle \le \|A\| \, \langle Ax, x\rangle \qquad (x \in H).
\]

Proof. Since (A + B)* = A* + B* = A + B and (C*AC)* = C*A*C** = C*AC, both operators are self-adjoint. Moreover ⟨(A + B)x, x⟩ = ⟨Ax, x⟩ + ⟨Bx, x⟩ ≥ 0 and ⟨C*ACx, x⟩ = ⟨ACx, Cx⟩ ≥ 0 for all x ∈ H.

For the additional statement write α := ‖A‖ and I := Id. Then we have A, αI − A ≥ 0. Hence

\[
\alpha^2 A - \alpha A^2 = A(\alpha I - A)A + (\alpha I - A)A(\alpha I - A) \ge 0
\]

by what we have proved before. If α = 0 there is nothing more to prove. Else we may divide by α to obtain A² ≤ αA, and that is what we wanted.
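The inequality ‖Ax‖² ≤ ‖A‖⟨Ax, x⟩ can be observed on a small example. This is a sketch under assumptions: the positive symmetric 2×2 matrix `A` is an arbitrary illustrative choice, its norm is computed from the closed formula for the eigenvalues of a symmetric 2×2 matrix, and only unit vectors on a grid of angles are tested.

```python
import math

def apply(A, x):
    """Matrix-vector product for a 2x2 matrix."""
    return [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def sym_eigenvalues(A):
    """Eigenvalues of a symmetric 2x2 matrix, smallest first."""
    mean = (A[0][0] + A[1][1]) / 2
    radius = math.hypot((A[0][0] - A[1][1]) / 2, A[0][1])
    return mean - radius, mean + radius

A = [[2.0, 1.0], [1.0, 2.0]]             # symmetric with eigenvalues 1 and 3
low, high = sym_eigenvalues(A)           # low > 0 means A >= 0
norm_A = max(abs(low), abs(high))

# gap(x) = ||A|| <Ax, x> - ||Ax||^2 should be nonnegative for every x.
gaps = []
for k in range(360):
    theta = 2 * math.pi * k / 360
    x = [math.cos(theta), math.sin(theta)]
    Ax = apply(A, x)
    gaps.append(norm_A * dot(Ax, x) - dot(Ax, Ax))
worst_gap = min(gaps)
```

For this particular matrix the gap vanishes (up to rounding) exactly in the direction of the eigenvector belonging to the eigenvalue ‖A‖.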

For a self-adjoint operator A ∈ BL(H) we define its numerical range by

\[
W(A) := \{\langle Ax, x\rangle \mid \|x\| = 1\}
\]

and the numerical radius by

\[
w(A) := \sup\{|\langle Ax, x\rangle| \mid x \in H,\ \|x\| = 1\}.
\]

Since by Cauchy–Schwarz |⟨Ax, x⟩| ≤ ‖Ax‖ ‖x‖ ≤ ‖A‖ for all x ∈ H with ‖x‖ = 1, one has

\[
W(A) \subset [-\|A\|, \|A\|] \quad\text{and}\quad w(A) \le \|A\|.
\]

We aim at proving the following theorem.

13.10 Theorem Let A be a self-adjoint operator on the Hilbert space H. Then the following assertions hold.

a) One has

\[
w(A) = \sup\{|\langle Ax, x\rangle| \mid x \in H,\ \|x\| = 1\} = \|A\|.
\]

b) Either ‖A‖ or −‖A‖ is an approximate eigenvalue of A.

c) If λ, µ are distinct eigenvalues then

\[
N(\lambda\,\mathrm{Id} - A) \perp N(\mu\,\mathrm{Id} - A).
\]

d) If $H = G \oplus G^\perp$ is an orthogonal decomposition of H such that A(G) ⊂ G, then $A(G^\perp) \subset G^\perp$.


Proof. a) (See LN 15.3.4) Let α := w(A). Then −α Id ≤ A ≤ α Id. Consider the symmetric(!) bilinear forms a, b defined by

\[
a(x, y) := \langle Ax, y\rangle, \qquad b(x, y) := \alpha \langle x, y\rangle
\]

for x, y ∈ H. Then |a(x, x)| ≤ b(x, x) for all x ∈ H, and by the generalized Cauchy–Schwarz inequality (Theorem 6.4) we obtain

\[
|\langle Ax, y\rangle| = |a(x, y)| \le \sqrt{b(x, x)} \sqrt{b(y, y)} = \alpha \|x\| \|y\|
\]

for all x, y ∈ H. This shows that ‖A‖ ≤ α. Since w(A) ≤ ‖A‖ was already observed above, we are done.

b) (See LN 15.3.5) By what we have already proved we can find a sequence $(x_n)_n$ such that $\|x_n\| = 1$ for all n and $|\langle Ax_n, x_n\rangle| \to \alpha := w(A) = \|A\|$. By passing to a subsequence and from A to −A if necessary, we may suppose that actually $\langle Ax_n, x_n\rangle \to \alpha$. But if we replace A by α Id − A ≥ 0 in Lemma 13.9, we obtain

\[
\|(\alpha\,\mathrm{Id} - A) x_n\|^2 \le \|\alpha\,\mathrm{Id} - A\| \, \langle \alpha x_n - A x_n, x_n\rangle = \|\alpha\,\mathrm{Id} - A\| \, (\alpha - \langle A x_n, x_n\rangle) \to 0.
\]

This shows that α is an approximate eigenvalue.

c) Suppose that λ, µ are scalars and x, y ∈ H such that Ax = λx and Ay = µy. Then

\[
(\lambda - \mu) \langle x, y\rangle = \langle \lambda x, y\rangle - \langle x, \mu y\rangle = \langle Ax, y\rangle - \langle x, Ay\rangle = 0
\]

since A = A*. Hence λ ≠ µ implies that x ⊥ y.

d) Finally, let x ∈ G, $y \in G^\perp$. Then Ax ∈ G and hence ⟨x, Ay⟩ = ⟨Ax, y⟩ = 0. As x ∈ G was arbitrary, $Ay \in G^\perp$.
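Part a) can be illustrated in R² by comparing the numerical radius, sampled over unit vectors, with the operator norm. The matrix `A` below and the function names are assumptions made for this sketch; the matrix is symmetric with eigenvalues 2 and −3, so here it is −‖A‖ that appears as an eigenvalue, in line with part b).

```python
import math

def quadratic_form(A, theta):
    """<Ax, x> for the unit vector x = (cos theta, sin theta)."""
    x = (math.cos(theta), math.sin(theta))
    Ax = (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])
    return Ax[0] * x[0] + Ax[1] * x[1]

def numerical_radius(A, samples=3600):
    """Approximates w(A) = sup |<Ax, x>| over unit vectors by sampling angles
    in [0, pi); x and -x give the same value, so half a circle suffices."""
    return max(abs(quadratic_form(A, math.pi * k / samples))
               for k in range(samples))

def operator_norm_symmetric(A):
    """||A|| = largest |eigenvalue| for a symmetric 2x2 matrix."""
    mean = (A[0][0] + A[1][1]) / 2
    radius = math.hypot((A[0][0] - A[1][1]) / 2, A[0][1])
    return max(abs(mean - radius), abs(mean + radius))

A = [[1.0, 2.0], [2.0, -2.0]]            # symmetric, eigenvalues 2 and -3
w_A = numerical_radius(A)
norm_A = operator_norm_symmetric(A)
```

The sampled value `w_A` can only underestimate w(A), and the underestimate is of the order of the squared grid spacing.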

Appendix: Spectral Characterization of Positivity

We end this section with some additional information about the spectral values of self-adjoint operators.

13.11 Theorem Let A be a self-adjoint operator on the Hilbert space H. Then σ(A) consists entirely of approximate eigenvalues.

Proof. We argue by contraposition: let λ be a scalar that is not an approximate eigenvalue of A; we show that λ ∉ σ(A). By passing to A − λ Id we may suppose that λ = 0. I claim that then there exists c > 0 such that

\[
\|x\| \le c \, \|Ax\| \qquad (x \in H).
\]

Indeed: otherwise, for every n ∈ N one finds $x_n$ such that $n \|Ax_n\| < \|x_n\|$. This yields $x_n \ne 0$ and $A(x_n / \|x_n\|) \to 0$, which implies that 0 is an approximate eigenvalue. The norm estimate above now implies that R(A) is closed (a little argument involving Cauchy sequences is hidden here) and that A is injective. Hence N(A) = {0}, and by Lemma 12.1 we know that $H = \overline{R(A)}$. Thus $R(A) = \overline{R(A)} = H$ and so A is bijective. But the norm estimate also implies that $A^{-1}$ is bounded with $\|A^{-1}\| \le c$. Hence 0 ∉ σ(A).

13.12 Corollary Let A be a self-adjoint operator. Then A ≥ 0 if and only if σ(A) ⊂ [0, ∞).

Proof. Suppose that A ≥ 0 and λ ∈ σ(A). By the theorem, there are $x_n \in H$ such that $\|x_n\| = 1$ and $\lambda x_n - A x_n \to 0$. But then

\[
\lambda = \lambda \langle x_n, x_n\rangle = \langle \lambda x_n - A x_n, x_n\rangle + \langle A x_n, x_n\rangle \ge \langle \lambda x_n - A x_n, x_n\rangle \to 0,
\]

whence λ ≥ 0. On the other hand, suppose that σ(A) ⊂ [0, ∞) and define α₀ := inf W(A). As in the proof of Theorem 13.10, part b), one shows that α₀ is an approximate eigenvalue. Namely, one picks a sequence $(x_n)_n$ with $\|x_n\| = 1$ and such that $\langle Ax_n, x_n\rangle \to \alpha_0$. As A − α₀ Id ≥ 0, we estimate by Lemma 13.9

\[
\|(A - \alpha_0\,\mathrm{Id}) x_n\|^2 \le \|A - \alpha_0\| \, \langle (A - \alpha_0\,\mathrm{Id}) x_n, x_n\rangle = \|A - \alpha_0\| \, (\langle A x_n, x_n\rangle - \alpha_0) \to 0.
\]

This shows that α₀ is an approximate eigenvalue, hence a spectral value, and thus α₀ ≥ 0. Hence A ≥ 0.

This corollary is an important result for the application of Hilbert space theory to Quantum Mechanics.


Lecture 14

The Spectral Theorem

In this section we shall prove the spectral theorem for compact self-adjoint operators on a Hilbert space.

14.1 Lemma Let H be a Hilbert space and A ≠ 0 a compact self-adjoint operator on H. Then the following assertions hold.

a) ‖A‖ or −‖A‖ is an eigenvalue with finite-dimensional eigenspace.

b) For every ε > 0 there are only finitely many eigenvalues λ of A such that |λ| ≥ ε:

\[
\#\{\lambda \mid |\lambda| \ge \varepsilon,\ N(\lambda\,\mathrm{Id} - A) \ne \{0\}\} < \infty.
\]

Proof. We know that ‖A‖ or −‖A‖ is an approximate eigenvalue, by Theorem 13.10. Since A is compact and A ≠ 0, by Theorem 13.6, this must be an eigenvalue with finite-dimensional eigenspace.

For the second statement, suppose towards a contradiction that for some ε > 0 there are infinitely many eigenvalues λ with |λ| ≥ ε. Then one can pick a sequence $(\lambda_n)_n$ of pairwise distinct such eigenvalues. For each n, choose a unit vector $e_n \in H$ such that $Ae_n = \lambda_n e_n$. Then $(e_n)_n$ forms an ONS, since by Theorem 13.10 the eigenspaces with respect to different eigenvalues are pairwise orthogonal. Hence

\[
\|Ae_n - Ae_m\|^2 = \|\lambda_n e_n - \lambda_m e_m\|^2 = |\lambda_n|^2 + |\lambda_m|^2 \ge 2\varepsilon^2
\]

if n ≠ m by Pythagoras, and hence $(Ae_n)_n$ does not have any convergent subsequence. This contradicts the compactness of A.

Remark. If A = 0 then the whole space H is the eigenspace to the eigenvalue λ = 0 = ‖A‖.

Now let A be any compact and self-adjoint operator on the Hilbert space H. We suppose in addition that dim H = ∞, the case of a finite-dimensional space being an easy modification.

Define $H_1 := H$, $B_1 := A$. By Lemma 14.1 there is a vector $e_1$ and a scalar $\lambda_1$ such that

\[
\|e_1\| = 1, \quad |\lambda_1| = \|B_1\| \quad\text{and}\quad B_1 e_1 = \lambda_1 e_1.
\]

Define $H_2 := \{e_1\}^\perp$, i.e. $H_1 = H_2 \oplus R e_1$ is an orthogonal decomposition. From Theorem 13.10 d) it follows that $B_1 H_2 \subset H_2$. So we may define

\[
B_2 := B_1\big|_{H_2}.
\]

Since dim H = ∞ we must have dim $H_2$ = ∞, too. Moreover, $B_2^* = B_2$ since

\[
\langle B_2^* x, y\rangle_{H_2} = \langle A^* x, y\rangle_H = \langle Ax, y\rangle_H = \langle B_2 x, y\rangle_{H_2} \qquad (x, y \in H_2).
\]

Furthermore, $B_2$ is compact, since A is. Hence the pair $(H_2, B_2)$ satisfies the same conditions as the pair $(H_1, B_1)$, and so we can iterate the procedure.

Continuing in this manner we find sequences $(e_n)_n$ of vectors of H and $(\lambda_n)_n$ of scalars and a (decreasing) sequence of closed subspaces $(H_n)_n$ such that $H_1 = H$ and

1) $H_n = H_{n+1} \oplus R e_n$ is an orthogonal decomposition;

2) $\|e_n\| = 1$ and $|\lambda_n| = \|A|_{H_n}\|$;

3) $Ae_n = \lambda_n e_n$

for all n ∈ N. Since $H_{n+1} \subset H_n$ it follows from 2) that $|\lambda_{n+1}| \le |\lambda_n|$, i.e. the sequence $(|\lambda_n|)_n$ is decreasing. But each $\lambda_n$ is an eigenvalue of A, and by Lemma 14.1 we must have $\lim_n \lambda_n = 0$. It follows from 1) and 2) that $(e_n)_n$ is an ONS.

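Define the finite-dimensional operator $A_n$ by

\[
A_n x := \sum_{j=1}^{n} \lambda_j \langle x, e_j\rangle e_j = A\Big( \sum_{j=1}^{n} \langle x, e_j\rangle e_j \Big) = A P_{F_n} x \qquad (x \in H),
\]

where $F_n := \operatorname{span}\{e_1, \dots, e_n\}$. I claim that $\|A_n - A\| \to 0$. Indeed, it follows from the construction that

\[
H = H_1 = H_2 \oplus R e_1 = H_3 \oplus R e_2 \oplus R e_1 = \dots = H_{n+1} \oplus R e_n \oplus \dots \oplus R e_1 = H_{n+1} \oplus \operatorname{span}\{e_1, \dots, e_n\} = H_{n+1} \oplus F_n
\]

is an orthogonal decomposition, whence $F_n^\perp = H_{n+1}$. Therefore,

\[
\|(A - A_n)x\| = \|Ax - A P_{F_n} x\| = \|A(\mathrm{Id} - P_{F_n})x\| = \|A P_{F_n^\perp} x\| \le \|A|_{H_{n+1}}\| \, \|P_{H_{n+1}} x\| \le |\lambda_{n+1}| \, \|P_{H_{n+1}}\| \, \|x\| \le |\lambda_{n+1}| \, \|x\|
\]

for all x ∈ H, by 2). This yields $\|A_n - A\| \le |\lambda_{n+1}|$, and this tends to zero as n → ∞.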
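In finite dimensions the construction above can be imitated numerically. The sketch below is not the method of the notes but a stand-in under stated assumptions: it finds an eigenvalue of largest modulus by power iteration (which needs that eigenvalue to dominate strictly in modulus and a start vector not orthogonal to its eigenvector), and it replaces the restriction to the orthogonal complement by the deflation B − λ⟨·, e⟩e, which agrees with B on {e}⊥ and vanishes on Re. The matrix, the start vector and all names are illustrative.

```python
import math

def matvec(B, x):
    return [sum(B[i][j] * x[j] for j in range(len(x))) for i in range(len(B))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def dominant_eigenpair(B, start, iterations=500):
    """Power iteration: returns (lam, e) with ||e|| = 1, B e ~ lam e and
    |lam| ~ ||B|| for symmetric B (see the assumptions in the lead-in)."""
    x = start[:]
    for _ in range(iterations):
        y = matvec(B, x)
        norm = math.sqrt(dot(y, y))
        x = [v / norm for v in y]
    return dot(matvec(B, x), x), x       # Rayleigh quotient and unit vector

def spectral_decomposition(A, start):
    """Repeat: pick (lam_n, e_n) for the current operator, then deflate."""
    size = len(A)
    B = [row[:] for row in A]
    pairs = []
    for _ in range(size):
        lam, e = dominant_eigenpair(B, start)
        pairs.append((lam, e))
        B = [[B[i][j] - lam * e[i] * e[j] for j in range(size)]
             for i in range(size)]
    return pairs

A = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
pairs = spectral_decomposition(A, [1.0, 0.5, 0.25])
eigenvalues = [lam for lam, _ in pairs]  # decreasing in modulus
```

For this matrix the exact eigenvalues are 2 + √2, 2 and 2 − √2, and the computed vectors form an orthonormal system, mirroring properties 1) to 3).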

Most of the work is done, the rest is cosmetics. The following lemma shows in particular that in our sequence we find every non-zero eigenvalue of A.


14.2 Lemma In the situation and with the notions from above, the following statements hold:

a) The set $\{e_n \mid n \in N,\ \lambda_n \ne 0\}$ is a complete ONS for $\overline{R(A)}$, i.e.

\[
\overline{R(A)} = \overline{\operatorname{span}}\{e_n \mid n \in N,\ \lambda_n \ne 0\}.
\]

b) If λ ≠ 0 is an eigenvalue of A with multiplicity k := dim N(λ Id − A), then $\#\{n \mid \lambda_n = \lambda\} = k$.

c) $\sigma(A) \setminus \{0\} = \{\lambda_n \mid n \in N,\ \lambda_n \ne 0\}$.

d) If dim H = ∞ then 0 ∈ σ(A) (but 0 is not necessarily an eigenvalue).

Proof. Define $I := \{n \in N \mid \lambda_n \ne 0\}$.

a) For n ∈ I, i.e. $\lambda_n \ne 0$, we may write $e_n = A(\lambda_n^{-1} e_n) \in R(A)$. On the other hand, we know that for every x

\[
Ax = \sum_{n=1}^{\infty} \lambda_n \langle x, e_n\rangle e_n = \sum_{n \in N,\ \lambda_n \ne 0} \lambda_n \langle x, e_n\rangle e_n = \sum_{n \in I} \lambda_n \langle x, e_n\rangle e_n,
\]

so Ax lies in the closed span of $\{e_n \mid n \in I\}$.

b) Let λ ≠ 0 be a scalar, define F := N(λ Id − A). Suppose that F ≠ {0}, i.e. λ is an eigenvalue. Then clearly F ⊂ R(A) (as before). Now, for every n ∈ N, if $\lambda \ne \lambda_n$ then $F \perp e_n$ (by Theorem 13.10). Hence

\[
F \subset \{e_n \mid n \in I,\ \lambda_n \ne \lambda\}^\perp = \operatorname{span}\{e_n \mid n \in I,\ \lambda_n = \lambda\} \subset F,
\]

where the orthogonal complement is taken within $\overline{R(A)}$, and therefore $F = \operatorname{span}\{e_n \mid n \in I,\ \lambda_n = \lambda\}$.

c) We give two proofs. One uses Theorem 13.11 from the appendix to Lecture 13, which says that every spectral value λ ∈ σ(A) must be an approximate eigenvalue; so if λ ≠ 0, it must be an eigenvalue by Theorem 13.6 and hence is one of the $\lambda_n$, by b).

The second proof is much longer than the first. We have seen in b) that every non-zero eigenvalue of A appears as one of the $\lambda_n$. We have to show that no other non-zero spectral values can occur. Suppose that λ ≠ 0 is not an eigenvalue; we have to show that we can find a continuous inverse to λ Id − A. We make use of the orthogonal decomposition $H = N(A) \oplus \overline{R(A)}$ (see Lemma 12.1) and define

\[
By := \begin{cases} \displaystyle\sum_{n=1}^{\infty} \frac{\langle y, e_n\rangle}{\lambda - \lambda_n}\, e_n & \text{if } y \in \overline{R(A)}, \\[2ex] \lambda^{-1} y & \text{if } y \in N(A). \end{cases}
\]

It remains to show that this sum always converges, that B is bounded, and that in fact B(λ Id − A) = (λ Id − A)B = Id. These arguments can be found in LN 16.2.2.


d) Suppose that dim H = ∞ and 0 ∉ σ(A). Then A is bijective and $A^{-1}$ is bounded. Hence

\[
1 = \|e_n\| = \|A^{-1} A e_n\| \le \|A^{-1}\| \, \|\lambda_n e_n\| = \|A^{-1}\| \, |\lambda_n| \to 0,
\]

which is a contradiction.

We now bring together all that we have learned so far. We also cover the case that H is finite dimensional.

14.3 Theorem (Spectral Theorem for compact self-adjoint operators) Let H be a Hilbert space, dim H = ∞, and let A ∈ BL(H) be self-adjoint and compact. Then there exist real scalars $(\lambda_n)_{n=1}^{\infty}$ and vectors $(e_n)_{n=1}^{\infty}$ in H such that the following assertions hold.

a) $|\lambda_n| \ge |\lambda_{n+1}|$ (n ∈ N).

b) $\lim_{n \to \infty} |\lambda_n| = 0$.

c) $\{e_n \mid \lambda_n \ne 0\}$ is a complete ONS for $\overline{R(A)}$.

d) $\sigma(A) = \{0\} \cup \{\lambda_n \mid n \in N,\ \lambda_n \ne 0\}$.

e) $\#\{n \mid \lambda_n = \lambda\} = \dim N(\lambda\,\mathrm{Id} - A)$ for every λ ≠ 0.

f) One has

\[
Ax = \sum_{n=1}^{\infty} \lambda_n \langle x, e_n\rangle e_n \qquad (x \in H).
\]

In particular, $Ae_n = \lambda_n e_n$ for every n.

Application: Dirichlet boundary conditions

In Lecture 10 we considered the Poisson problem

\[
u'' = -f, \qquad u \in H^2(0, 1) \cap H^1_0(0, 1)
\]

and showed that for given f ∈ L²(0, 1) the solution u is given by u = Af where $A = A_k$ is the Hilbert–Schmidt operator associated with the kernel k given by

\[
k(t, s) := \begin{cases} s(1 - t), & s \le t, \\ t(1 - s), & t \le s. \end{cases}
\]

A different way of writing the operator A is

\[
Af(t) = [J^2 f(1)]\, t - J^2 f(t) \qquad (t \in [0, 1]).
\]

Since k(t, s) = k(s, t), the operator A is self-adjoint, and as Hilbert–Schmidt operators are compact (Theorem 12.11), we can apply the Spectral Theorem. The following lemma shows that the eigenvalue problem of A is equivalent to the eigenvalue problem of the Laplace operator with Dirichlet boundary conditions. (Here the ominous "Laplace operator" is nothing more than the second derivative, and "Dirichlet boundary conditions" means just that the functions should vanish at the boundary.)

14.4 Lemma The operator A is injective. Let λ ≠ 0 and 0 ≠ u ∈ L²(0, 1). Then Au = λu if and only if u ∈ C²[0, 1], λ > 0 and u satisfies

\[
u'' = -\frac{1}{\lambda} u, \qquad u(0) = 0 = u(1).
\]

Proof. Since we know that J maps L²(0, 1) into H¹(0, 1) such that (Ju)′ = u, we see that Au ∈ H²(0, 1) with (Au)″ = −u for all u ∈ L²(0, 1). If Au = 0, then u = −(Au)″ = 0, and hence A is injective.

Now suppose that Au = λu for some u ∈ L²(0, 1) and λ ≠ 0. We know (e.g. from Assignment 2) that Ju is continuous, hence J²u ∈ C¹[0, 1]. This implies that $u = \lambda^{-1} Au \in C^1$. But then u ∈ C²[0, 1], by the same argument as before. (Obviously one can repeat this and conclude that u ∈ C^∞[0, 1].) Moreover, differentiating yields that u satisfies the differential equation

\[
u'' = -\frac{1}{\lambda} u.
\]

Moreover,

\[
\lambda u(0) = (Au)(0) = 0 = (Au)(1) = \lambda u(1),
\]

which yields u(0) = 0 = u(1). To see that λ > 0 if u ≠ 0 we write

\[
\|u\|^2 = \langle u, u\rangle = -\lambda \langle u'', u\rangle = \lambda \langle u', u'\rangle
\]

since $u \in C^1_0[0, 1]$. Since ⟨u′, u′⟩ ≥ 0 always, either λ > 0 or u = 0.

Conversely, suppose that u has the stated properties. Then from λu″ = −u it follows by integrating twice that

\[
\lambda u(t) = -J^2 u(t) + ct + d \qquad (t \in [0, 1])
\]

for certain constants c, d. The condition u(0) = 0 yields d = 0, and the condition u(1) = 0 yields c = J²u(1), whence λu = Au as desired.


To analyze the problem further, let u, λ satisfy the equivalent conditions stated in the lemma. As is known from the elementary theory of differential equations, the solution u must have the form

\[
u(t) = \alpha \cos\Big(\frac{t}{\sqrt{\lambda}}\Big) + \beta \sin\Big(\frac{t}{\sqrt{\lambda}}\Big)
\]

for some constants α, β. The boundary conditions u(0) = u(1) = 0 give α = 0 and, since we want a non-trivial solution,

\[
\sin(1/\sqrt{\lambda}) = 0.
\]

This limits the possibilities of λ to

\[
\lambda = \frac{1}{\pi^2},\ \frac{1}{4\pi^2},\ \dots,\ \frac{1}{n^2\pi^2},\ \dots
\]

So we have found our eigenvalue sequence and the associated ONS (note that $\int_0^1 \sin^2(n\pi t)\,dt = 1/2$, which fixes the normalizing factor):

\[
\lambda_n = \frac{1}{n^2\pi^2}, \qquad e_n(t) = \sqrt{2}\,\sin(n\pi t) \qquad (t \in [0, 1],\ n \in N).
\]

Since A is injective, $\overline{R(A)} = H$ and the system $(e_n)_{n=1}^{\infty}$ is indeed a complete ONS for L²(0, 1). Moreover, the operator can be written as

\[
\int_0^1 k(t, s) f(s)\,ds = (Af)(t) = \sum_{n=1}^{\infty} \Big( \frac{2}{n^2\pi^2} \int_0^1 f(s) \sin(n\pi s)\,ds \Big) \sin(n\pi t).
\]
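The two descriptions of A (as an integral operator and as an eigenfunction expansion) can be compared numerically. The following sketch rests on assumptions: it uses the normalized eigenfunctions √2 sin(nπt) with eigenvalues 1/(n²π²), approximates all integrals by the midpoint rule, truncates the series after 50 terms, and the function names are hypothetical. For f = 1 the exact solution of u″ = −1, u(0) = u(1) = 0 is t(1 − t)/2, which serves as a reference value.

```python
import math

def green_kernel(t, s):
    """Kernel k(t, s) of the Dirichlet solution operator on [0, 1]."""
    return s * (1 - t) if s <= t else t * (1 - s)

def apply_kernel(f, t, samples=2000):
    """(Af)(t) = int_0^1 k(t, s) f(s) ds, approximated by the midpoint rule."""
    h = 1.0 / samples
    return sum(green_kernel(t, (j + 0.5) * h) * f((j + 0.5) * h)
               for j in range(samples)) * h

def apply_series(f, t, terms=50, samples=2000):
    """Truncated expansion sum_n lam_n <f, e_n> e_n(t) with
    lam_n = 1 / (n pi)^2 and e_n = sqrt(2) sin(n pi .)."""
    h = 1.0 / samples
    total = 0.0
    for n in range(1, terms + 1):
        integral = sum(f((j + 0.5) * h) * math.sin(n * math.pi * (j + 0.5) * h)
                       for j in range(samples)) * h
        total += 2.0 / (n * math.pi) ** 2 * integral * math.sin(n * math.pi * t)
    return total

one = lambda s: 1.0
via_kernel = apply_kernel(one, 0.3)      # should be close to 0.3 * 0.7 / 2
via_series = apply_series(one, 0.3)      # truncated series, same value
```

The same `apply_kernel` can be used to check the eigenvalue relation, e.g. that applying it to s ↦ sin(3πs) returns approximately sin(3πt)/(9π²).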

Application: The heat equation on an interval

In this section we look at the following partial differential equation (initial-boundary value problem) on [0, ∞) × [0, 1]:

\[
\begin{aligned}
\partial_t u(t, x) &= \partial_{xx} u(t, x) && ((t, x) \in (0, \infty) \times (0, 1)) \\
u(t, 0) &= u(t, 1) = 0 && (t > 0) \\
u(0, x) &= f(x) && (x \in (0, 1))
\end{aligned}
\]

where f : (0, 1) → R is a given initial datum. This is the one-dimensional heat equation with Dirichlet boundary conditions. If f is continuous it is reasonable to speak of a so-called "classical" solution, i.e., a function $u \in C([0, \infty) \times [0, 1]) \cap C^{1,2}((0, \infty) \times (0, 1))$ that solves the PDE in the ordinary sense. However, the most successful strategy is to allow for a very weak notion of solution (in order to make it easy to find one) and then in a second step investigate under which conditions on f this solution is a classical one.


To find a reasonable candidate for a solution, one shifts the problem from PDEs to functional analysis. We want our "solution" u to be a function

\[
u : [0, \infty) \to L^2(0, 1) \quad\text{satisfying}\quad u(0) = f \in L^2(0, 1).
\]

We define its "time derivative" $u_t$ to be a function $u_t : [0, \infty) \to L^2(0, 1)$ such that

\[
\frac{d}{dt} \langle u(t), v\rangle = \langle u_t(t), v\rangle
\]

for all v ∈ L²(0, 1). Moreover, one actually wants $u(t) \in H^2(0, 1) \cap H^1_0(0, 1)$ satisfying $(u(t))'' = u_t(t)$ for all t > 0.

Given this, using the operator A from above (the solution operator of the Poisson problem with Dirichlet boundary conditions), we see that $-A(u_t(t)) = u(t)$ for all t. Writing out the operator A in its associated Fourier expansion gives

\[
u(t) = -A(u_t(t)) = \sum_n -\lambda_n \langle u_t(t), e_n\rangle e_n.
\]

Since we know that $(e_n)_n$ is a complete ONS in L²(0, 1) we can replace u(t) on the left by its Fourier expansion to get

\[
\sum_n \langle u(t), e_n\rangle e_n = \sum_n -\lambda_n \langle u_t(t), e_n\rangle e_n,
\]

and comparing Fourier coefficients we arrive at

\[
\langle u(t), e_n\rangle = -\lambda_n \langle u_t(t), e_n\rangle \qquad (n \in N,\ t > 0).
\]

Employing our definition of the time derivative above leads to the following infinite system of linear ODEs:

\[
\frac{d}{dt} \langle u(t), e_n\rangle = -\frac{1}{\lambda_n} \langle u(t), e_n\rangle, \qquad \langle u(0), e_n\rangle = \langle f, e_n\rangle \qquad (n \in N).
\]

This is clearly satisfied by setting

\[
u(t) := \sum_{n \in N} e^{-t/\lambda_n} \langle f, e_n\rangle e_n \qquad (t \ge 0).
\]

It is now a quite tedious but manageable exercise in analysis to prove that the series actually defines a smooth function on (0, ∞) × [0, 1] which satisfies the heat equation. Moreover, the initial condition is met in the sense that $\lim_{t \searrow 0} u(t) = f$ in L²(0, 1), but one can say more depending on whether f is continuous or has even higher regularity.
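To get a feeling for the series solution, one can evaluate it for a concrete initial datum. The sketch below is an illustration under assumptions: it takes f(x) = x(1 − x), uses the Dirichlet eigenpairs λ_n = 1/(n²π²), e_n = √2 sin(nπ·), and relies on the standard sine series x(1 − x) = Σ over odd n of 8/(nπ)³ sin(nπx), so that ⟨f, e_n⟩e_n(x) = 8/(nπ)³ sin(nπx) for odd n and 0 for even n. The function name and the truncation at 200 terms are choices made here.

```python
import math

def heat_solution(t, x, terms=200):
    """Truncated series u(t, x) = sum_n exp(-t / lam_n) <f, e_n> e_n(x)
    for the initial datum f(x) = x (1 - x) on [0, 1]."""
    total = 0.0
    for n in range(1, terms + 1, 2):                  # only odd n contribute
        decay = math.exp(-t * (n * math.pi) ** 2)     # exp(-t / lam_n)
        total += 8.0 / (n * math.pi) ** 3 * decay * math.sin(n * math.pi * x)
    return total

initial_mid = heat_solution(0.0, 0.5)    # approximates f(0.5) = 0.25
later_mid = heat_solution(0.1, 0.5)      # dominated by the n = 1 mode
```

For t > 0 the higher modes are damped extremely fast, which is the mechanism behind the smoothness of the solution mentioned above; at t = 0.1 the value at x = 0.5 is already essentially the first term 8/π³ · e^{−0.1π²}.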

Remark. The method sketched here is a step into the fascinating field of Evolution Equations, where one transforms finite-dimensional PDEs into ODEs in diverse Banach spaces and applies functional analytic methods in order to solve them or to study the asymptotic behaviour or other properties of their solutions. The matter is in the intersection of many fields of theoretical and applied mathematics, physics, biology, economics, engineering.

Appendix: Further applications

We shall look at the consequences of the spectral theorem when applied to some other Hilbert–Schmidt operators on L²(0, 1) associated with the differential equation

\[
u'' = -f
\]

on the interval (0, 1) with different boundary conditions. The most interesting (and commonly addressed) of these are the Neumann and the periodic boundary conditions. Moreover, we shall determine the norm of the integration operator J, which leads to mixed boundary conditions. Finally we shall compute the best constant in Poincaré's inequality. After a reformulation of the problem, this will take us back to the Dirichlet situation from above.

Application: Neumann boundary conditions

Now we consider an operator that is associated with the problem

\[
u'' = -f, \qquad u'(0) = u'(1) = 0.
\]

The boundary condition "vanishing derivative" is often called the Neumann boundary condition. Define

\[
g(s) := \frac{(1 - s)^2}{2} - \frac{1}{6} \qquad (s \in [0, 1])
\]

and the operator A by

\[
Au := -J^2 P u + \langle u, g\rangle 1 \qquad (u \in L^2(0, 1)),
\]

where Pu := u − ⟨u, 1⟩1 is the orthogonal projection onto $\{1\}^\perp$. Note that

\[
\langle g, 1\rangle = \int_0^1 \Big( \frac{(1 - s)^2}{2} - \frac{1}{6} \Big) ds = 0.
\]

If one writes out the operator, one finds that A is induced by the kernel function

\[
k(t, s) = \frac{s^2 + t^2}{2} + \frac{1}{3} - \max(s, t) \qquad (s, t \in [0, 1]),
\]

showing that A is indeed a compact self-adjoint operator, and Au ∈ H²(0, 1) with (Au)″ = −Pu for every u ∈ L²(0, 1).


14.5 Lemma a) N(A) = R1.

b) Let 0 ≠ λ ∈ R and 0 ≠ u ∈ L²(0, 1). Then Au = λu if and only if u ∈ C²[0, 1], λ > 0 and u satisfies

\[
u'' = -\frac{1}{\lambda} u, \qquad u'(0) = u'(1) = 0.
\]

Proof. Since g ⊥ 1 and P1 = 0, we have A1 = 0. Conversely, if Au = 0 then Pu = −(Au)″ = 0, hence u is a constant.

Now suppose that Au = λu for some u ∈ L²(0, 1) and λ ≠ 0. As above we can conclude that u ∈ C²[0, 1] and λu″ = −u. Moreover, λu′ = (Au)′ = −JPu and therefore

\[
\lambda u'(0) = -JPu(0) = 0, \quad\text{and}\quad \lambda u'(1) = -JPu(1) = -\langle 1, Pu\rangle = 0.
\]

The positivity of λ now follows similarly to the Dirichlet case. Since u′(0) = u′(1) = 0, we can integrate by parts:

\[
\|u\|^2 = \langle u, u\rangle = -\lambda \langle u'', u\rangle = \lambda \langle u', u'\rangle,
\]

which shows that u ≠ 0 implies λ > 0.

Conversely, suppose that u satisfies the stated conditions. Then

\[
\langle 1, u\rangle = -\lambda \int_0^1 u''(s)\,ds = -\lambda [u'(1) - u'(0)] = 0,
\]

hence u ⊥ 1, i.e. Pu = u. Integrating two times yields

\[
\lambda u(t) = -J^2 u(t) + ct + d \qquad (t \in [0, 1])
\]

for some constants c, d ∈ R. Taking the derivative once shows

\[
\lambda u'(t) = -Ju(t) + c
\]

and since u′(0) = 0 we obtain c = 0. On the other hand, we know that Ju(1) = ⟨1, u⟩ = 0, hence

\[
0 = \lambda Ju(1) = -J^3 u(1) + d\, J1(1) = -J^3 u(1) + d.
\]

This amounts to

\[
d = J^3 u(1) = \int_0^1 \frac{(1 - s)^2}{2} u(s)\,ds.
\]

Again using u = Pu we can write

\[
\lambda u = -J^2 P u + \langle Pu, h\rangle 1 = -J^2 P u + \langle u, Ph\rangle 1 = -J^2 P u + \langle u, g\rangle 1 = Au
\]

with $h(s) = \tfrac{1}{2}(1 - s)^2$; indeed, Ph = h − ⟨h, 1⟩1 = h − 1/6 = g.

As in the Dirichlet case, let us analyse the problem further and suppose that u, λ satisfy the equivalent conditions stated in the lemma. Again, from the elementary theory of differential equations we know that the solution u must have the form

\[
u(t) = \alpha \cos\Big(\frac{t}{\sqrt{\lambda}}\Big) + \beta \sin\Big(\frac{t}{\sqrt{\lambda}}\Big)
\]

for some constants α, β. The boundary conditions u′(0) = u′(1) = 0 give β = 0 and, since we want a non-trivial solution, again we arrive at

\[
\sin(1/\sqrt{\lambda}) = 0
\]

with the possibilities

\[
\lambda = \frac{1}{\pi^2},\ \frac{1}{4\pi^2},\ \dots,\ \frac{1}{n^2\pi^2},\ \dots
\]

So we have found our eigenvalue sequence and the associated ONS (the factor √2 again comes from $\int_0^1 \cos^2(n\pi t)\,dt = 1/2$):

\[
\lambda_n = \frac{1}{n^2\pi^2}, \qquad e_n(t) = \sqrt{2}\,\cos(n\pi t) \qquad (t \in [0, 1],\ n \in N).
\]

Since N(A) = R1, the system

\[
\{1\} \cup \{e_n \mid n \in N\}
\]

is a complete ONS for L²(0, 1). Moreover, the operator A can be written as

\[
\int_0^1 k(t, s) f(s)\,ds = (Af)(t) = \sum_{n=1}^{\infty} \Big( \frac{2}{n^2\pi^2} \int_0^1 f(s) \cos(n\pi s)\,ds \Big) \cos(n\pi t).
\]
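The kernel of the Neumann operator can be tested against the two facts just derived: constants are mapped to zero, and cos(nπ·) is an eigenfunction with eigenvalue 1/(n²π²). The sketch below approximates the integral by the midpoint rule; the function names and the sample point t = 0.3 are illustrative assumptions.

```python
import math

def neumann_kernel(t, s):
    """k(t, s) = (s^2 + t^2)/2 + 1/3 - max(s, t) on [0, 1] x [0, 1]."""
    return (s * s + t * t) / 2 + 1.0 / 3 - max(s, t)

def apply_neumann(f, t, samples=4000):
    """(Af)(t) = int_0^1 k(t, s) f(s) ds, approximated by the midpoint rule."""
    h = 1.0 / samples
    return sum(neumann_kernel(t, (j + 0.5) * h) * f((j + 0.5) * h)
               for j in range(samples)) * h

t0 = 0.3
image_of_one = apply_neumann(lambda s: 1.0, t0)                    # close to 0
image_of_cos = apply_neumann(lambda s: math.cos(2 * math.pi * s), t0)
expected_cos = math.cos(2 * math.pi * t0) / (4 * math.pi ** 2)     # lam_2 * cos
```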

Application: Periodic boundary conditions

Let us write $A_0$ instead of A for the operator which is associated with Dirichlet boundary conditions (see above). Again, we let Pu = u − ⟨u, 1⟩1 be the orthogonal projection onto $\{1\}^\perp$. By Lemma 13.9 we know that the operator

\[
A := P A_0 P
\]

is self-adjoint and positive. If we write out the left P here we obtain

\[
Au = A_0 P u - \langle A_0 P u, 1\rangle 1 = -J^2 P u + J^2 P u(1)\, t - \langle A_0 P u, 1\rangle 1,
\]

so (Au)″ = −Pu in the weak sense.

14.6 Exercise Work out a kernel function k that induces the operator A.


We try to determine the eigenvalues of A.

14.7 Lemma a) N(A) = R1.

b) Fix λ ≠ 0 and 0 ≠ u ∈ L²(0, 1). Then Au = λu if and only if u ∈ C²[0, 1] and u satisfies

\[
u'' = -\frac{1}{\lambda} u, \qquad u(0) = u(1), \quad u'(0) = u'(1).
\]

Moreover, this is the case if and only if

\[
\lambda = \frac{1}{4\pi^2 n^2} \quad\text{and}\quad u(t) = \alpha \cos(2\pi n t) + \beta \sin(2\pi n t) \qquad (t \in (0, 1))
\]

for some n ∈ N and some α, β ∈ R.

Proof. Suppose that Au = 0. Then −Pu = (Au)″ = 0 and hence u must be a constant function. On the other hand, $A1 = P A_0 P 1 = 0$. Hence λ = 0 is an eigenvalue with eigenspace R1.

Now suppose λ ≠ 0. If u ≠ 0 and Au = λu then by Theorem 13.10, u ⊥ 1 and so Pu = u. Moreover, as above one sees that u ∈ C²[0, 1]. This yields

\[
\lambda u = A_0 u - \langle A_0 u, 1\rangle 1
\]

and so u(0) = u(1). Also

\[
u'(1) - u'(0) = \langle u'', 1\rangle = -\lambda^{-1} \langle u, 1\rangle = 0
\]

and hence u has periodic boundary conditions. Conversely, if u ∈ C²[0, 1] with $u'' = -\lambda^{-1} u$ and periodic boundary conditions, one easily establishes that u ⊥ 1 and Au = λu.

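Consequently, we get the same general solution as before,

\[
u(t) = \alpha \cos\Big(\frac{t}{\sqrt{\lambda}}\Big) + \beta \sin\Big(\frac{t}{\sqrt{\lambda}}\Big)
\]

for some constants α, β. The boundary conditions u(0) = u(1) and u′(1) = u′(0) imply that

\[
\begin{aligned}
\alpha &= \alpha \cos(1/\sqrt{\lambda}) + \beta \sin(1/\sqrt{\lambda}), \\
\beta &= \beta \cos(1/\sqrt{\lambda}) - \alpha \sin(1/\sqrt{\lambda}).
\end{aligned}
\]

This means that (α, β) is a non-trivial fixed vector of the rotation matrix

\[
\begin{pmatrix} \cos(1/\sqrt{\lambda}) & \sin(1/\sqrt{\lambda}) \\ -\sin(1/\sqrt{\lambda}) & \cos(1/\sqrt{\lambda}) \end{pmatrix},
\]

which is only possible if this matrix is the identity matrix. Hence necessarily

\[
\cos(1/\sqrt{\lambda}) = 1 \quad\text{and}\quad \sin(1/\sqrt{\lambda}) = 0,
\]

that is $1/\sqrt{\lambda} \in 2\pi Z$. In this case, any choice of α, β is possible, and hence one obtains a sequence of eigenvalues with associated two-dimensional eigenspaces:

\[
\lambda_n = \frac{1}{4\pi^2 n^2}, \qquad e_{n1}(t) = \sqrt{2}\,\cos(2\pi n t), \qquad e_{n2}(t) = \sqrt{2}\,\sin(2\pi n t) \qquad (n \in N).
\]

The spectral theorem tells us that the system of functions

\[
1,\ e_{11},\ e_{12},\ e_{21},\ e_{22},\ \dots
\]

is a complete ONS of L²(0, 1) (a fact that we know already from Theorem 7.18), and that the operator A can be written as

\[
(Af)(t) = \sum_{n=1}^{\infty} \Big[ \frac{1}{2\pi^2 n^2} \int_0^1 \cos(2 n \pi s) f(s)\,ds \, \cos(2\pi n t) + \frac{1}{2\pi^2 n^2} \int_0^1 \sin(2 n \pi s) f(s)\,ds \, \sin(2\pi n t) \Big]
\]

for every f ∈ L²(0, 1).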

Application: The norm of the integration operator

Several times we have encountered the operator J given by integration:

\[
(Jf)(t) := \int_a^t f(s)\,ds \qquad (t \in [a, b]).
\]

This operator is often also called the Volterra operator. It is quite easy to compute its norm when considered as acting on C[a, b] with the supremum norm. (It is equal to b − a, i.e. to one on [0, 1].) But what is the norm of J when considered as an operator on L²(a, b)? Of course, J is an integral operator with kernel

\[
k(t, s) = 1_{\{s \le t\}}(t, s)
\]

and so one can estimate

\[
\|J\|^2 \le \|J\|_{HS}^2 = \int_a^b \int_a^b |k(t, s)|^2 \,ds\,dt = \int_a^b \int_a^t ds\,dt = \int_a^b (t - a)\,dt = \frac{(b - a)^2}{2},
\]

which gives $\|J\|_{L^2 \to L^2} \le (b - a)/\sqrt{2}$. But we shall see that we do not have equality here.

The idea is to use the Spectral Theorem. However, J is not a self-adjoint operator (proof?) and so one has to use a trick, based on the following lemma.

14.8 Lemma Let A be an arbitrary bounded operator on a Hilbert space. Then A*A and AA* are positive self-adjoint operators with norm

\[
\|A^* A\| = \|A A^*\| = \|A\|^2.
\]

Proof. Apply Lemma 13.9 to the positive operator Id with C := A to see that A*A = A* Id A is positive and self-adjoint. Clearly

\[
\|A^* A\| \le \|A^*\| \, \|A\| = \|A\|^2
\]

by Lemma 12.1. But on the other hand

\[
\|Ax\|^2 = \langle Ax, Ax\rangle = \langle A^* A x, x\rangle \le \|A^* A x\| \, \|x\| \le \|A^* A\| \, \|x\|^2
\]

for all x ∈ H, by Cauchy–Schwarz. Hence ‖A‖² ≤ ‖A*A‖, by definition of the norm. The statements about AA* = (A*)*(A*) follow by applying these results to A* in place of A.
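For matrices the identity ‖A*A‖ = ‖A‖² is easy to observe. The sketch below is only an illustration with assumed names and an arbitrarily chosen non-symmetric 2×2 matrix; the operator norms are approximated by sampling unit vectors on a grid of angles, and for a real matrix the adjoint is the transpose.

```python
import math

def norm_by_sampling(M, samples=3600):
    """Approximates ||M|| = sup ||Mx|| over unit vectors x in R^2 by sampling
    x = (cos th, sin th) for th in [0, pi); x and -x give the same value."""
    best = 0.0
    for k in range(samples):
        th = math.pi * k / samples
        x = (math.cos(th), math.sin(th))
        y = (M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1])
        best = max(best, math.hypot(y[0], y[1]))
    return best

def transpose_times(M):
    """M^T M for a real 2x2 matrix."""
    return [[sum(M[k][i] * M[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1.0, 2.0], [0.0, 1.0]]             # not symmetric
norm_A = norm_by_sampling(A)
norm_AtA = norm_by_sampling(transpose_times(A))
```

For this matrix one can check by hand that AᵀA has largest eigenvalue 3 + 2√2 = (1 + √2)², so ‖A‖ = 1 + √2.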

To apply the lemma we look at J∗.

14.9 Lemma We have

\[
J^* f = \langle f, 1\rangle 1 - Jf \qquad (f \in L^2(a, b)).
\]

Proof. Take f, g ∈ C[a, b]. Then

\[
\langle J^* f, g\rangle = \langle f, Jg\rangle = \int_a^b f(t) \int_a^t g(s)\,ds\,dt = \int_a^b g(s) \int_s^b f(t)\,dt\,ds = \int_a^b [\langle f, 1\rangle - Jf(s)]\, g(s)\,ds = \langle \langle f, 1\rangle 1 - Jf, g\rangle
\]

by a simple integration by parts. This establishes that the two bounded operators J* and ⟨·, 1⟩1 − J coincide on C[a, b], and since this space is dense in L²(a, b), these operators must coincide everywhere, i.e. are equal (see Lemma 9.8).

(You can also perform the above computation with general f, g ∈ L²(a, b), but this requires an application of Fubini's theorem, see also Example 12.2c) and LN Exercise 14.3.6.)


By the above lemma, the operator A := JJ* is given by

\[
Af(t) = JJ^* f(t) = \langle f, 1\rangle (t - a) - J^2 f(t) = \int_a^b (t - a) f(s)\,ds - \int_a^t (t - s) f(s)\,ds = \int_a^b \min(t - a, s - a) f(s)\,ds
\]

for f ∈ L²(a, b), hence is induced by the kernel k(t, s) := min(t − a, s − a). (See also LN 16.3.2.) By the Spectral Theorem, since A is positive, the norm of A is equal to the largest eigenvalue of A.

14.10 Lemma Fix λ ≠ 0 and 0 ≠ u ∈ L²(a, b). Then Au = λu if and only if u ∈ C²[a, b] satisfies

$$u'' = -\frac{1}{\lambda} u, \qquad u(a) = 0 = u'(b).$$

Moreover, this is the case if and only if

$$\lambda = \left(\frac{2(b-a)}{(2n-1)\pi}\right)^2 \quad\text{and}\quad u(t) = \beta \sin\left(\frac{(2n-1)\pi(t-a)}{2(b-a)}\right)$$

for some n ∈ N and some constant β ≠ 0.

Proof. Suppose that Au = λu. Since Ju is already a continuous function (see Assignment 2), J²u is a C¹-function. So u = λ⁻¹Au is C¹; but then J²u is C² and hence u ∈ C². Moreover, clearly λu′′ = −u (this is the fundamental theorem of calculus), λu(a) = (Au)(a) = 0 and

$$\lambda u'(b) = (Au)'(b) = \langle u, \mathbf{1}\rangle - Ju(b) = 0.$$

This yields one implication. To prove the other, suppose that u ∈ C²[a, b] satisfies λu′′ = −u and u(a) = 0 = u′(b). Then integrating twice yields

$$\lambda u = -J^2 u + c(t-a) + d$$

for some constants c, d. The boundary condition u(a) = 0 implies d = 0, and taking one derivative yields

$$0 = \lambda u'(b) = -Ju(b) + c,$$

which gives c = Ju(b) = 〈u,1〉. Together we have indeed λu = Au.

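Elementary analysis teaches us that in the above situation one must have

$$u(t) = \alpha \cos\left(\frac{t-a}{\sqrt{\lambda}}\right) + \beta \sin\left(\frac{t-a}{\sqrt{\lambda}}\right)$$

for some constants α, β. The boundary condition u(a) = 0 gives α = 0, hence β ≠ 0 because u ≠ 0, and

$$0 = u'(b) = \frac{\beta}{\sqrt{\lambda}} \cos\left(\frac{b-a}{\sqrt{\lambda}}\right).$$

This is the case if and only if

$$\frac{b-a}{\sqrt{\lambda}} = \frac{(2n-1)\pi}{2}$$

for some n ∈ N. (Note that our eigenvalues must be positive!)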
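The eigenpairs derived in this proof can be tested directly against the kernel min(t − a, s − a). The following sketch assumes the interval [1, 3], a midpoint-rule quadrature and the first three modes; the function name `apply_A` and all numerical parameters are hypothetical. The mode index is called `k` in the code to keep it apart from the number of quadrature cells.

```python
import math

def apply_A(u, a, b, t, cells=4000):
    """Midpoint-rule approximation of (Au)(t) = int_a^b min(t-a, s-a) u(s) ds."""
    h = (b - a) / cells
    total = 0.0
    for j in range(cells):
        s = a + (j + 0.5) * h
        total += min(t - a, s - a) * u(s)
    return h * total

a, b = 1.0, 3.0
max_residual = 0.0
for k in (1, 2, 3):
    omega = (2 * k - 1) * math.pi / (2 * (b - a))
    lam = 1.0 / omega ** 2                   # equals (2(b-a)/((2k-1)pi))^2
    u = lambda t, w=omega: math.sin(w * (t - a))
    for i in range(11):                      # sample points across [a, b]
        t = a + i * (b - a) / 10
        residual = abs(apply_A(u, a, b, t) - lam * u(t))
        max_residual = max(max_residual, residual)

print(max_residual)                          # small if Au = lambda * u holds
```

The residual is limited only by the quadrature error, which supports the sine form of the eigenfunctions obtained from the boundary condition u(a) = 0.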

Now, back to our original question: we look for the biggest eigenvalue of A and find

$$\|J\|^2 = \|JJ^*\| = \|A\| = \left(\frac{2(b-a)}{\pi}\right)^2,$$

and that gives

$$\|J\| = \frac{2(b-a)}{\pi},$$

which is slightly smaller than (b − a)/√2 = ‖J‖_HS.
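The value 2(b − a)/π can be reproduced numerically by a power iteration on a discretised version of J∗J. The sketch below is only an illustration under stated assumptions: a midpoint discretisation with 1000 cells, the interval [0, 2], 60 iterations, and hypothetical helper names.

```python
import math

def J_apply(x, h):
    """Midpoint discretisation of (Jf)(t) = integral of f from a to t."""
    out, acc = [], 0.0
    for v in x:
        out.append(acc + 0.5 * h * v)
        acc += h * v
    return out

def J_adjoint_apply(x, h):
    """Transpose of J_apply: integrates from t up to b instead."""
    return J_apply(x[::-1], h)[::-1]

def norm_of_J(a, b, cells=1000, iterations=60):
    """Estimate the L2 operator norm of J by power iteration on J*J."""
    h = (b - a) / cells
    x = [1.0] * cells
    for _ in range(iterations):
        y = J_adjoint_apply(J_apply(x, h), h)
        scale = math.sqrt(sum(v * v for v in y))
        x = [v / scale for v in y]           # x now has Euclidean norm 1
    return math.sqrt(sum(v * v for v in J_apply(x, h)))

a, b = 0.0, 2.0
estimate = norm_of_J(a, b)
exact = 2 * (b - a) / math.pi
hs_bound = (b - a) / math.sqrt(2)

print(estimate, exact, hs_bound)
```

The estimate should sit very close to 2(b − a)/π and visibly below the Hilbert–Schmidt bound (b − a)/√2. The iteration converges quickly because the second eigenvalue of A is one ninth of the first.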

Application: The best constant in Poincaré’s inequality

In Lecture 11 we encountered Poincaré’s inequality:

$$\|u\|_2 \le c \, \|u'\|_2 \qquad (u \in H^1_0(a,b)).$$

We saw there that one can choose c = ‖J‖, which we now know is ‖J‖ = 2(b − a)/π. However, it is still not clear what the best constant

$$c_0 := \inf\{c \ge 0 \mid \|u\|_2 \le c \, \|u'\|_2 \text{ for all } u \in H^1_0(a,b)\}$$

could be. We shall see now that actually

$$c_0 = \frac{b-a}{\pi}.$$

The first step to achieve this is to reformulate the inequality.

14.11 Lemma The space C¹₀[a, b] ∩ C²[a, b] is dense in H¹₀(a, b) (in the norm of H¹(a, b), of course).

Proof. Fix f ∈ H¹₀(a, b). By Assignment 2, Exercise 4c), one finds g ∈ L²(a, b) and c ∈ R such that f = Jg + c. Since f(a) = f(b) = 0, one finds c = 0 and g ⊥ 1. By Corollary 5.22, there is a sequence gₙ ∈ C¹[a, b] such that ‖gₙ − g‖₂ → 0. Define

$$f_n(t) := Jg_n(t) - \langle g_n, \mathbf{1}\rangle \, \frac{t-a}{b-a}.$$

Then clearly fₙ ∈ C¹₀[a, b] ∩ C²[a, b]. We claim that ‖fₙ − f‖_{H¹} → 0. This is equivalent to the two statements

$$\|f_n - f\|_2 \to 0 \quad\text{and}\quad \|f_n' - f'\|_2 \to 0.$$

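Since gₙ → g in L², by continuity of J and the scalar product we have

$$\langle g_n, \mathbf{1}\rangle \to \langle g, \mathbf{1}\rangle = 0 \quad\text{and}\quad Jg_n \to Jg = f$$

in the L²-norm. Hence

$$\|f_n' - f'\|_2 \le \|g_n - g\|_2 + \frac{|\langle g_n, \mathbf{1}\rangle|}{\sqrt{b-a}} \to 0$$

and, with h(t) := (t − a)/(b − a),

$$\|f_n - f\|_2 \le \|J\| \, \|g_n - g\|_2 + |\langle g_n, \mathbf{1}\rangle| \, \|h\|_2 \to 0$$

as claimed.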

The lemma shows that the least constant c₀ in Poincaré’s inequality is determined by taking just functions from C¹₀[a, b] ∩ C²[a, b], i.e.,

$$c_0 = \inf\{c \mid \|u\|_2 \le c \, \|u'\|_2 \text{ for all } u \in C^1_0[a,b] \cap C^2[a,b]\}.$$

But now remember that we have

$$C^1_0[a,b] \cap C^2[a,b] = \{A_0 f \mid f \in C[a,b]\}$$

where A₀ is the operator considered above, associated with Dirichlet boundary conditions. (Actually, we did it on (0, 1), but the arguments on the general interval (a, b) are analogous, with obvious changes.) Hence we look at the least c such that

$$\|A_0 f\|^2 \le c^2 \langle (A_0 f)', (A_0 f)'\rangle = c^2 \langle A_0 f, -(A_0 f)''\rangle = c^2 \langle A_0 f, f\rangle$$

for all f ∈ C[a, b]. Since C[a, b] is dense in L²(a, b) and A₀ is a bounded operator on this space, we look actually for the least c such that

$$\|A_0 f\|^2 \le c^2 \langle A_0 f, f\rangle \qquad (f \in L^2(a,b)).$$


Since A₀ ≥ 0, Lemma 13.9 shows that c² = ‖A₀‖ is a valid choice, whence c₀² ≤ ‖A₀‖. Conversely, by Cauchy–Schwarz,

$$\langle A_0 f, f\rangle \le \|A_0 f\| \, \|f\| \le \|A_0\| \, \|f\|^2$$

for all f ∈ L²(a, b). Combined with ‖A₀f‖² ≤ c₀² 〈A₀f, f〉, this implies that ‖A₀‖² ≤ c₀² ‖A₀‖, hence ‖A₀‖ ≤ c₀². So we have identified our desired quantity as

$$c_0^2 = \|A_0\|.$$

Now, the spectral theorem tells us that ‖A₀‖ equals the absolute value of the biggest eigenvalue of A₀, which is (b − a)²/π², with corresponding eigenfunction

$$e_1(t) = \sqrt{\frac{b-a}{2}} \, \sin\left(\frac{\pi(t-a)}{b-a}\right) \qquad (t \in [a,b]).$$

(Adapt the considerations about the operator on (0, 1) from above.) Hence indeed c₀ = (b − a)/π, with the function e₁ as extremal case.
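To see the best constant at work, one can compare the quotient ‖u‖₂ / ‖u′‖₂ for a few functions vanishing at both endpoints. The sketch below assumes the interval [1, 4], a midpoint quadrature, and three hand-picked test functions (the first sine mode, the parabola (t − a)(b − t), and the second sine mode); these choices and the helper names are hypothetical. The first sine mode should attain (b − a)/π, and the others should stay below it.

```python
import math

def l2_norm(func, a, b, cells=20000):
    """Midpoint-rule approximation of the L2(a, b) norm of func."""
    h = (b - a) / cells
    return math.sqrt(h * sum(func(a + (i + 0.5) * h) ** 2 for i in range(cells)))

def poincare_ratio(u, du, a, b):
    """Quotient |u|_2 / |u'|_2 for a function u with derivative du."""
    return l2_norm(u, a, b) / l2_norm(du, a, b)

a, b = 1.0, 4.0
L = b - a
w = math.pi / L

ratio_sine = poincare_ratio(
    lambda t: math.sin(w * (t - a)),
    lambda t: w * math.cos(w * (t - a)), a, b)

ratio_parabola = poincare_ratio(
    lambda t: (t - a) * (b - t),
    lambda t: (a + b) - 2 * t, a, b)

ratio_second_mode = poincare_ratio(
    lambda t: math.sin(2 * w * (t - a)),
    lambda t: 2 * w * math.cos(2 * w * (t - a)), a, b)

print(ratio_sine, ratio_parabola, ratio_second_mode, L / math.pi)
```

For the parabola the quotient is (b − a)/√10 by direct integration, which is only slightly smaller than (b − a)/π, so a crude test function already comes close to the extremal case.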



Contents

Introduction and Metric Space Theory
  What is functional analysis all about?
  Some metric space theory

Normed spaces and Banach spaces
  Normed spaces
  The ℓp-spaces
  Hölder’s inequality

The space of bounded and continuous functions
  Bounded Functions
  Continuous functions
  The theorem of Picard–Lindelöf

Completeness, Denseness and Integration Theory
  Completeness and Denseness
  Lebesgue Integration

Lebesgue Integration II
  Null sets
  The space L1(I)
  The spaces Lp(I), 1 < p < ∞
  The space L∞(I)
  Hölder’s Inequality
  Density results

Hilbert spaces
  Bilinear forms
  Inner product spaces
  The Sobolev Space H1(a, b)

Abstract Fourier Expansions
  Orthonormal systems and bases
  Bessel’s inequality
  Complete orthonormal systems
  Separable spaces and the Gram–Schmidt procedure
  Classical Fourier Series

Orthogonal Projections
  Best approximations
  Orthogonal Projection
  Minimal norm problems

Bounded Linear Operators I
  Bounded linear operators
  Examples of bounded operators and their norms
  Appendix: General remarks about linear mappings

Bounded Linear Operators II
  Introduction to integral operators
  Product Spaces
  Integral operators of Hilbert–Schmidt type
  The space BL(E,F)
  The Neumann series

Dual Spaces
  The dual space
  The Riesz–Fréchet theorem
  The adjoint operator

Adjoints and Compact Operators
  Adjoint operators continued
  Preview: The spectral theorem
  Compact operators
  Appendix: Abstract Hilbert–Schmidt operators

Spectral Theory of Selfadjoint Operators
  Some general spectral theory
  Self-adjoint operators
  Appendix: Spectral Characterization of Positivity

The Spectral Theorem
  The Spectral Theorem
  Application: Dirichlet boundary conditions
  Application: The heat equation on an interval
  Application: Neumann boundary conditions
  Application: Periodic boundary conditions
  Application: The norm of the integration operator
  Application: The best constant in Poincaré’s inequality
