Intro Real Anal

Introduction to Real Analysis

Lee Larson

University of Louisville

January 19, 2015

About This Document

I often teach the MATH 501-502: Introduction to Real Analysis course at theUniversity of Louisville. These are notes I’ve compiled from those experiences. Theycover the basic ideas of analysis on the real line. The course is intended for a mix ofmostly senior mathematics majors and beginning graduate students.

The notes are updated quite often. The date of the version you’re reading is atthe bottom-left of most pages. The latest version is available for download at theWeb address math.louisville.edu/⇠lee/ira.

Feel free to use these notes for any purpose, as long as you give me blame orcredit. In return, I only ask you to tell me about mistakes. Also appreciated areany suggestions and opinions. I can be contacted using the email address on theWeb page referenced above.

January 19, 2015

i

Contents

About This Document i

Chapter 1. Basic Ideas 1-11. Sets 1-12. Algebra of Sets 1-23. Indexed Sets 1-44. Functions and Relations 1-55. Cardinality 1-106. Exercises 1-12

Chapter 2. The Real Numbers 2-11. The Field Axioms 2-12. The Order Axiom 2-33. The Completeness Axiom 2-54. Comparisons of Q and R 2-85. Exercises 2-10

Chapter 3. Sequences 3-11. Basic Properties 3-12. Monotone Sequences 3-43. Subsequences and the Bolzano-Weierstrass Theorem 3-64. Lower and Upper Limits of a Sequence 3-75. The Nested Interval Theorem 3-96. Cauchy Sequences 3-97. Exercises 3-12

Chapter 4. Series 4-11. What is a Series? 4-12. Positive Series 4-33. Absolute and Conditional Convergence 4-104. Rearrangements of Series 4-125. Exercises 4-15

Chapter 5. The Topology of R 5-11. Open and Closed Sets 5-12. Relative Topologies and Connectedness 5-43. Covering Properties and Compactness on R 5-54. More Small Sets 5-85. Exercises 5-13

Chapter 6. Limits of Functions 6-1

i

1. Basic Definitions 6-12. Unilateral Limits 6-43. Continuity 6-54. Unilateral Continuity 6-85. Continuous Functions 6-106. Uniform Continuity 6-127. Exercises 6-13

Chapter 7. Di↵erentiation 7-11. The Derivative at a Point 7-12. Di↵erentiation Rules 7-23. Derivatives and Extreme Points 7-54. Di↵erentiable Functions 7-65. Applications of the Mean Value Theorem 7-96. Exercises 7-14

Chapter 8. Integration 8-11. Partitions 8-12. Riemann Sums 8-23. Darboux Integration 8-44. The Integral 8-75. The Cauchy Criterion 8-96. Properties of the Integral 8-107. The Fundamental Theorem of Calculus 8-138. Change of Variables 8-169. Integral Mean Value Theorems 8-1910. Exercises 8-20

Chapter 9. Sequences of Functions 9-11. Pointwise Convergence 9-12. Uniform Convergence 9-33. Metric Properties of Uniform Convergence 9-54. Series of Functions 9-65. Continuity and Uniform Convergence 9-66. Integration and Uniform Convergence 9-117. Di↵erentiation and Uniform Convergence 9-138. Power Series 9-15

Chapter 10. Fourier Series 10-11. Trigonometric Polynomials 10-12. The Riemann Lebesgue Lemma 10-33. The Dirichlet Kernel 10-34. Dini’s Test for Pointwise Convergence 10-55. The Fejer Kernel 10-76. Fejer’s Theorem 10-107. Exercises 10-11

Appendix. Bibliography A-1

Index A-1

January 19, 2015 http://math.louisville.edu/⇠lee/ira

Appendix. Index A-2


CHAPTER 1

Basic Ideas

In the end, all mathematics can be boiled down to logic and set theory. Becauseof this, any careful presentation of fundamental mathematical ideas is inevitablycouched in the language of logic and sets. This chapter defines enough of thatlanguage to allow the presentation of basic real analysis. Much of it will be familiarto you, but look at it anyway to make sure you understand the notation.

1. Sets

Set theory is a large and complicated subject in its own right. There is no timein this course to touch on any but the simplest parts of it. Instead, we’ll just lookat a few topics from what is often called “naive set theory.”

We begin with a few definitions.A set is a collection of objects called elements. Usually, sets are denoted by the

capital letters A, B, · · · , Z. A set can consist of any type and number of elements.Even other sets can be elements of a set. The sets dealt with here usually have realnumbers as their elements.

If a is an element of the set A, we write a 2 A. If a is not an element of the setA, we write a /2 A.

If all the elements of A are also elements of B, then A is a subset of B. In thiscase, we write A ⇢ B or B � A. In particular, notice that whenever A is a set, thenA ⇢ A.

Two sets A and B are equal, if they have the same elements. In this case wewrite A = B. It is easy to see that A = B i↵ A ⇢ B and B ⇢ A. Establishing thatboth of these containments are true is the most common way to show that two setsare equal.

If A ⇢ B and A 6= B, then A is a proper subset of B. In cases when this isimportant, it is written A ( B instead of just A ⇢ B.

There are several ways to describe a set.A set can be described in words such as “P is the set of all presidents of the

United States.” This is cumbersome for complicated sets.All the elements of the set could be listed in curly braces as S = {2, 0, a}. If

the set has many elements, this is impractical, or impossible.More common in mathematics is set builder notation. Some examples are

P = {p : p is a president of the United states}= {Washington, Adams, Je↵erson, · · · , Clinton, Bush, Obama}

and

A = {n : n is a prime number} = {2, 3, 5, 7, 11, · · · }.

1-1

1-2 CHAPTER 1. BASIC IDEAS

In general, the set builder notation defines a set in the form

{formula for a typical element : objects to plug into the formula}.

A more complicated example is the set of perfect squares:

S = {n2 : n is an integer} = {0, 1, 4, 9, · · · }.

The existence of several sets will be assumed. The simplest of these is theempty set, which is the set with no elements. It is denoted as ;. The natural

numbers is the set N = {1, 2, 3, · · · } consisting of the positive integers. The setZ = {· · · , �2, �1, 0, 1, 2, · · · } is the set of all integers. ! = {n 2 Z : n � 0} ={0, 1, 2, · · · } is the nonnegative integers. Clearly, ; ⇢ A, for any set A and

; ⇢ N ⇢ ! ⇢ Z.

Definition 1.1. Given any set A, the power set of A, written P(A), is the setconsisting of all subsets of A; i. e.,

P(A) = {B : B ⇢ A}.

For example, P({a, b}) = {;, {a}, {b}, {a, b}}. Also, for any set A, it is alwaystrue that ; 2 P(A) and A 2 P(A). If a 2 A, it is never true that a 2 P(A), but itis true that {a} ⇢ P(A). Make sure you understand why!

An amusing example is P(;) = {;}. (Don’t confuse ; with {;}! The former isempty and the latter has one element.) Now, consider

P(;) = {;}P(P(;)) = {;, {;}}P(P(P(;))) = {;, {;}, {{;}}, {;, {;}}}

After continuing this n times, for some n 2 N, the resulting set,

P(P(· · · P(;) · · · )),is very large. In fact, since a set with k elements has 2k elements in its power set,

there 22

22

= 65, 536 elements after only five iterations of the example. Beyondthis, the numbers are too large to print. Number sequences such as this one aresometimes called tetrations.

2. Algebra of Sets

Let A and B be sets. There are four common binary operations used on sets.1

The union of A and B is the set containing all the elements in either A or B:

A [ B = {x : x 2 A _ x 2 B}.

The intersection of A and B is the set containing the elements contained inboth A and B:

A \ B = {x : x 2 A ^ x 2 B}.

1In the following, some logical notation is used. The symbol _ is the logical nonexclusive “or.”The symbol ^ is the logical “and.” Their truth tables are as follows:

^ T F

T T F

F F F

_ T F

T T T

F T F


2. ALGEBRA OF SETS 1-3

A B

A B

A B

A B

A

A ∆ B

BA B

A \ B

Figure 1. These are Venn diagrams showing the four standard binaryoperations on sets. In this figure, the set which results from the operationis shaded.

The di↵erence of A and B is the set of elements in A and not in B:

A \ B = {x : x 2 A ^ x /2 B}.

The symmetric di↵erence of A and B is the set of elements in one of the sets,but not the other:

A�B = (A [ B) \ (A \ B).

Another common set operation is complementation. The complement of a set Ais usually thought of as the set consisting of all elements which are not in A. But,a little thinking will convice you this is not a meaningful definition because thecollection of elements not in A is not a precisely understood collection. To makesense of the complement of a set, there must be a well-defined universal set U whichcontains all the sets in question. Then the complement of a set A ⇢ U is Ac = U \A.It is usually the case that the universal set U is evident from the context in whichit is used.

With these operations, an extensive algebra for the manipulation of sets canbe developed. It’s usually done hand in hand with formal logic because the twosubjects share much in common. These topics are studied as part of Boolean algebra.

Several examples of set algebra are given in the following theorem and its corollary.

Theorem 1.2. Let A, B and C be sets.

(a) A \ (B [ C) = (A \ B) \ (A \ C)(b) A \ (B \ C) = (A \ B) [ (A \ C)



Proof. (a) This is proved as a sequence of equivalences.2

x 2 A \ (B [ C) () x 2 A ^ x /2 (B [ C)

() x 2 A ^ x /2 B ^ x /2 C

() (x 2 A ^ x /2 B) ^ (x 2 A ^ x /2 C)

() x 2 (A \ B) \ (A \ C)

(b) This is also proved as a sequence of equivalences.

x 2 A \ (B \ C) () x 2 A ^ x /2 (B \ C)

() x 2 A ^ (x /2 B _ x /2 C)

() (x 2 A ^ x /2 B) _ (x 2 A ^ x /2 C)

() x 2 (A \ B) [ (A \ C)

⇤Theorem 1.2 is a version of a group of set equations which are often called

DeMorgan’s Laws. The more usual statement of DeMorgan’s Laws is in Corollary 1.3.Corollary 1.3 is an obvious consequence of Theorem 1.2 when there is a universalset to make the complementation well-defined.

Corollary 1.3 (DeMorgan’s Laws). Let A and B be sets.

(a) (A [ B)c = Ac \ Bc

(b) (A \ B)c = Ac [ Bc

3. Indexed Sets

We often have occasion to work with large collections of sets. For example, wecould have a sequence of sets A

1

, A2

, A3

, · · · , where there is a set An

associatedwith each n 2 N. In general, let ⇤ be a set and suppose for each � 2 ⇤ there is a setA

�

. The set {A�

: � 2 ⇤} is called a collection of sets indexed by ⇤. In this case, ⇤is called the indexing set for the collection.

Example 1.1. For each n 2 N, let An

= {k 2 Z : k2 n}. Then

A1

= A2

=A3

= {�1, 0, 1}, A4

= {�2, �1, 0, 1, 2}, · · · ,

A50

= {�7, �6, �5, �4, �3, �2, �1, 0, 1, 2, 3, 4, 5, 6, 7}, · · ·is a collection of sets indexed by N.

Two of the basic binary operations can be extended to work with indexedcollections. In particular, using the indexed collection from the previous paragraph,we define [

�2⇤

A�

= {x : x 2 A�

for some � 2 ⇤}

and \

�2⇤

A�

= {x : x 2 A�

for all � 2 ⇤}.

DeMorgan’s Laws can be generalized to indexed collections.

2The logical symbol () is the same as “if, and only if.” If A and B are any two statements,then A () B is the same as saying A implies B and B implies A. It is also common to use i↵in this way.


4. FUNCTIONS AND RELATIONS 1-5

Theorem 1.4. If {B�

: � 2 ⇤} is an indexed collection of sets and A is a set,

then

A \[

�2⇤

B�

=\

�2⇤

(A \ B�

)

and

A \\

�2⇤

B�

=[

�2⇤

(A \ B�

).

Proof. The proof of this theorem is Exercise 1.3. ⇤

4. Functions and Relations

4.1. Tuples. When listing the elements of a set, the order in which they arelisted is unimportant; e. g., {e, l, v, i, s} = {l, i, v, e, s}. If the order in which n itemsare listed is important, the list is called an n-tuple. (Strictly speaking, an n-tuple isnot a set.) We denote an n-tuple by enclosing the ordered list in parentheses. Forexample, if x

1

, x2

, x3

, x4

are 4 items, the 4-tuple (x1

, x2

, x3

, x4

) is di↵erent from the4-tuple (x

2

, x1

, x3

, x4

).Because they are used so often, 2-tuples are called ordered pairs and a 3-tuple

is called an ordered triple.

Definition 1.5. The Cartesian product of two sets A and B is the set of allordered pairs

A ⇥ B = {(a, b) : a 2 A ^ b 2 B}.

Example 1.2. If A = {a, b, c} and B = {1, 2}, then

A ⇥ B = {(a, 1), (a, 2), (b, 1), (b, 2), (c, 1), (c, 2)}.

and

B ⇥ A = {(1, a), (1, b), (1, c), (2, a), (2, b), (2, c)}.

Notice that A ⇥ B 6= B ⇥ A because of the importance of order in the ordered pairs.

A useful way to visualize the Cartesian product of two sets is as a table. TheCartesian product A ⇥ B from Example 1.2 is listed as the entries of the followingtable.

1 2a (a, 1) (a, 2)b (b, 1) (b, 2)c (c, 1) (c, 2)

Of course, the common Cartesian plane from your analytic geometry course isnothing more than a generalization of this idea of listing the elements of a Cartesianproduct as a table.

The definition of Cartesian product can be extended to the case of more thantwo sets. If {A

1

, A2

, · · · , An

} are sets, then

A1

⇥ A2

⇥ · · · ⇥ An

= {(a1

, a2

, · · · , an

) : ak

2 Ak

for 1 k n}is a set of n-tuples. This is often written as

nY

k=1

Ak

= A1

⇥ A2

⇥ · · · ⇥ An

.



4.2. Relations.

Definition 1.6. If A and B are sets, then any R ⇢ A ⇥ B is a relation from Ato B. If (a, b) 2 R, we write aRb.

In this case,

dom (R) = {a : (a, b) 2 R}is the domain of R and

ran (R) = {b : (a, b) 2 R}is the range of R.

In the special case when R ⇢ A ⇥ A, for some set A, there is some additionalterminology.

R is symmetric, if aRb () bRa.R is reflexive, if aRa whenever a 2 dom (A).R is transitive, if aRb ^ bRc =) aRc.R is an equivalence relation on A, if it is symmetric, reflexive and transitive.

Example 1.3. Let R be the relation on Z ⇥ Z defined by aRb () a b.Then R is reflexive and transitive, but not symmetric.

Example 1.4. Let R be the relation on Z ⇥ Z defined by aRb () a < b.Then R is transitive, but neither reflexive nor symmetric.

Example 1.5. Let R be the relation on Z ⇥ Z defined by aRb () a2 = b2.In this case, R is an equivalence relation. It is evident that aRb i↵ b = a or b = �a.

4.3. Functions.

Definition 1.7. A relation R ⇢ A ⇥ B is a function if

aRb1

^ aRb2

=) b1

= b2

.

If f ⇢ A ⇥ B is a function and dom (f) = A, then we usually write f : A ! Band use the usual notation f(a) = b instead of afb.

If f : A ! B is a function, the usual intuitive interpretation is to regard fas a rule that associates each element of A with a unique element of B. It’s notnecessarily the case that each element of B is associated with something from A;i.e., B may not be ran (f).

Example 1.6. Define f : N ! Z by f(n) = n2 and g : Z ! Z by g(n) = n2.In this case ran (f) = {n2 : n 2 N} and ran (g) = ran (f) [ {0}. Notice that eventhough f and g use the same formula, they are actually di↵erent functions.

Definition 1.8. If f : A ! B and g : B ! C, then the composition of g withf is the function g � f : A ! C defined by g � f(a) = g(f(a)).

In Example 1.6, g � f(n) = g(f(n)) = g(n2) = (n2)2 = n4 makes sense for alln 2 N, but f � g is undefined at n = 0.

There are several important types of functions.

Definition 1.9. A function f : A ! B is a constant function, if ran (f) has asingle element; i. e., there is a b 2 B such that f(a) = b for all a 2 A. The functionf is surjective (or onto B), if ran (f) = B.



A B

a

bc

f

f

f

A B

a

bc

g

g

g

Figure 2. These diagrams show two functions, f : A ! B and g : A !B. The function g is injective and f is not because f(a) = f(c).

In a sense, constant and surjective functions are the opposite extremes. Aconstant function has the smallest possible range and a surjective function has thelargest possible range. Of course, a function f : A ! B can be both constant andsurjective, if B has only one element.

Definition 1.10. A function f : A ! B is injective (or one-to-one), iff(a) = f(b) implies a = b.

The terminology “one-to-one” is very descriptive because such a functionuniquely pairs up the elements of its domain and range. An illustration of thisdefinition is in Figure 2. In Example 1.6, f is injective while g is not.

Definition 1.11. A function f : A ! B is bijective, if it is both surjective andinjective.

A bijective function can be visualized as uniquely pairing up all the elements ofA and B. Some authors, favoring less pretentious language, use the more descriptiveterminology one-to-one correspondence instead of bijection. This pairing up of theelements from each set is like counting them and finding they have the same numberof elements. Given any two sets, no matter how many elements they have, theintuitive idea is they have the same number of elements if, and only if, there is abijection between them.

The following theorem shows that this property of counting the number ofelements works in a familiar way. (Its proof is left as an easy exercise.)

Theorem 1.12. If f : A ! B and g : B ! C are bijections, then g � f : A ! Cis a bijection.

4.4. Inverse Functions.



A B

f

f —1

f —1

f

Figure 3. This is one way to visualize a general invertible function.First f does something to a and then f

�1 undoes it.

Definition 1.13. If f : A ! B, C ⇢ A and D ⇢ B, then the image of C is theset f(C) = {f(a) : a 2 C}. The inverse image of D is the set f�1(D) = {a : f(a) 2D}.

Definitions 1.11 and 1.13 work together in the following way. Suppose f : A ! Bis bijective and b 2 B. The fact that f is surjective guarantees that f�1({b}) 6= ;.Since f is injective, f�1({b}) contains only one element, say a, where f(a) = b. Inthis way, it is seen that f�1 is a rule that assigns each element of B to exactly oneelement of A; i. e., f�1 is a function with domain B and range A.

Definition 1.14. If f : A ! B is bijective, the inverse of f is the functionf�1 : B ! A with the property that f�1 � f(a) = a for all a 2 A and f � f�1(b) = bfor all b 2 B.

There is some ambiguity in the meaning of f�1 between 1.13 and 1.14. Theformer is an operation working with subsets of A and B; the latter is a functionworking with elements of A and B. It’s usually clear from the context which meaningis being used.

Example 1.7. Let A = N and B be the even natural numbers. If f : A ! Bis f(n) = 2n and g : B ! A is g(n) = n/2, it is clear f is bijective. Sincef � g(n) = f(n/2) = 2n/2 = n and g � f(n) = g(2n) = 2n/2 = n, we see g = f�1.(Of course, it is also true that f = g�1.)

Example 1.8. Let f : N ! Z be defined by

f(n) =

((n � 1)/2, n odd,

�n/2, n even

It’s quite easy to see that f is bijective and

f�1(n) =

(2n + 1, n � 0,

�2n, n < 0

Given any set A, it’s obvious there is a bijection f : A ! A and, if g : A ! B is abijection, then so is g�1 : B ! A. Combining these observations with Theorem 1.12,an easy theorem follows.

Theorem 1.15. Let S be a collection of sets. The relation on S defined by

A ⇠ B () there is a bijection f : A ! B

is an equivalence relation.



4.5. Schroder-Bernstein Theorem. The following theorem is a powerfultool in set theory, and shows that a seemingly intuitively obvious statement issometimes di�cult to verify. It will be used in Section 5.

Theorem 1.16 (Schroder-Bernstein3). Let A and B be sets. If there are injective

functions f : A ! B and g : B ! A, then there is a bijective function h : A ! B.

B

A1

A2

A3

B1

B2

B3

B4

B5

A4 · · ·

· · ·

A5

A

f(A)

Figure 4. Here are the first few steps from the construction used inthe proof of Theorem 1.16.

Proof. Let B1

= B\f(A). If Bk

⇢ B is defined for some k 2 N, let Ak

= g(Bk

)and B

k+1

= f(Ak

). This inductively defines Ak

and Bk

for all k 2 N. Use thesesets to define A =

Sk2N A

k

and h : A ! B as

h(x) =

(g�1(x), x 2 A

f(x), x 2 A \ A.

It must be shown that h is well-defined, injective and surjective.To show h is well-defined, let x 2 A. If x 2 A \ A, then it is clear h(x) = f(x) is

defined. On the other hand, if x 2 A, then x 2 Ak

for some k. Since x 2 Ak

= g(Bk

),we see h(x) = g�1(x) is defined. Therefore, h is well-defined.

To show h is injective, let x, y 2 A with x 6= y.If both x, y 2 A or x, y 2 A\ A, then the assumptions that g and f are injective,

respectively, imply h(x) 6= h(y).The remaining case is when x 2 A and y 2 A \ A. Suppose x 2 A

k

andh(x) = h(y). If k = 1, then h(x) = g�1(x) 2 B

1

and h(y) = f(y) 2 f(A) = B \ B1

.This is clearly incompatible with the assumption that h(x) = h(y). Now, supposek > 1. Then there is an x

1

2 B1

such that

x = g � f � g � f � · · · � f � g| {z }k � 1 f ’s and k g’s

(x1

).

This impliesh(x) = g�1(x) = f � g � f � · · · � f � g| {z }

k � 1 f ’s and k � 1 g’s

(x1

) = f(y)

3This is often called the Cantor-Schroder-Bernstein or Cantor-Bernstein Theorem, despite thefact that it was apparently first proved by Richard Dedekind.



so thaty = g � f � g � f � · · · � f � g| {z }

k � 2 f ’s and k � 1 g’s

(x1

) 2 Ak�1

⇢ A.

This contradiction shows that h(x) 6= h(y). We conclude h is injective.To show h is surjective, let y 2 B. If y 2 B

k

for some k, then h(Ak

) = g�1(Ak

) =B

k

shows y 2 h(A). If y /2 Bk

for any k, y 2 f(A) because B1

= B \ f(A), andg(y) /2 A, so y = h(x) = f(x) for some x 2 A. This shows h is surjective. ⇤

The Schroder-Bernstein theorem has many consequences, some of which are atfirst a bit unintuitive, such as the following theorem.

Corollary 1.17. There is a bijective function h : N ! N ⇥ N

Proof. If f : N ! N ⇥ N is f(n) = (n, 1), then f is clearly injective. On theother hand, suppose g : N ⇥ N ! N is defined by g((a, b)) = 2a3b. The uniquenessof prime factorizations guarantees g is injective. An application of Theorem 1.16yields h. ⇤

To appreciate the power of the Schroder-Bernstein theorem, try to find anexplicit bijection h : N ! N ⇥ N.

5. Cardinality

There is a way to use sets and functions to formalize and generalize how wecount. For example, suppose we want to count how many elements are in the set{a, b, c}. The natural way to do this is to point at each element in succession andsay “one, two, three.” What is really happening is that we’re defining a bijectivefunction between {a, b, c} and the set {1, 2, 3}. This idea can be generalized.

Definition 1.18. Given n 2 N, the set n = {1, 2, · · · , n} is called an initial

segment of N. The trivial initial segment is 0 = ;. A set S has cardinality n, ifthere is a bijective function f : S ! n. In this case, we write card (S) = n.

The cardinalities defined in Definition 1.18 are called the finite cardinal numbers.They correspond to the everyday counting numbers we usually use. The idea canbe generalized still further.

Definition 1.19. Let A and B be two sets. If there is an injective functionf : A ! B, we say card (A) card (B).

According to Theorem 1.16, the Schroder-Bernstein Theorem, if card (A) card (B) and card (B) card (A), then there is a bijective function f : A ! B. Asexpected, in this case we write card (A) = card (B). When card (A) card (B), butno such bijection exists, we write card (A) < card (B). Theorem 1.15 shows thatcard (A) = card (B) is an equivalence relation between sets.

The idea here, of course, is that card (A) = card (B) means A and B have thesame number of elements and card (A) < card (B) means A is a smaller set than B.This simple intuitive understanding has some surprising consequences when the setsinvolved do not have finite cardinality.

In particular, the set A is countably infinite, if card (A) = card (N). In this case,it is common to write card (N) = @

0

.4 When card (A) @0

, then A is said to be a

4The symbol @ is the Hebrew letter “aleph” and @

0

is usually pronounced “aleph nought.”


5. CARDINALITY 1-11

countable set. In other words, the countable sets are those having finite or countablyinfinite cardinality.

Example 1.9. Let f : N ! Z be defined as

f(n) =

(n+1

2

, when n is odd

1 � n

2

, when n is even.

It’s easy to show f is a bijection, so card (N) = card (Z) = @0

.

Theorem 1.20. Suppose A and B are countable sets.

(a) A ⇥ B is countable.

(b) A [ B is countable.

Proof. (a) This is a consequence of Theorem 1.17.(b) This is Exercise 1.20. ⇤An alert reader will have noticed from previous examples that

@0

= card (Z) = card (!) = card (N) = card (N ⇥ N) = card (N ⇥ N ⇥ N) = · · ·A logical question is whether all sets either have finite cardinality, or are countablyinfinite. That this is not so is seen by letting S = N in the following theorem.

Theorem 1.21. If S is a set, card (S) < card (P(S)).

Proof. Noting that 0 = card (;) < 1 = card (P(;)), the theorem is true whenS is empty.

Suppose S 6= ;. Since {a} 2 P(S) for all a 2 S, it follows that card (S) card (P(S)). Therefore, it su�ces to prove there is no surjective function f : S !P(S).

To see this, assume there is such a function f and let T = {x 2 S : x /2 f(x)}.Since f is surjective, there is a t 2 S such that f(t) = T . Either t 2 T or t /2 T .

If t 2 T = f(t), then the definition of T implies t /2 T , a contradiction. Onthe other hand, if t /2 T = f(t), then the definition of T implies t 2 T , anothercontradiction. These contradictions lead to the conclusion that no such function fcan exist. ⇤

A set S is said to be uncountably infinite, or just uncountable, if @0

< card (S).Theorem 1.21 implies @

0

< card (P(N)), so P(N) is uncountable. In fact, the sameargument implies

@0

= card (N) < card (P(N)) < card (P(P(N))) < · · ·So, there are an infinite number of distinct infinite cardinalities.

In 1874 Georg Cantor [5] proved card (R) = card (P(N)) and card (R) > @0

,where R is the set of real numbers. (A version of Cantor’s theorem appears inTheorem 2.25 below.) This naturally led to the question whether there are sets Ssuch that @

0

< card (S) < card (R). Cantor spent many years trying answer thisquestion and never succeeded. His assumption that no such sets exist came to becalled the continuum hypothesis.

The importance of the continuum hypothesis was highlighted by David Hilbertat the 1900 International Congress of Mathematicians in Paris, when he put it firston his famous list of the 23 most important open problems in mathematics. KurtGodel proved in 1940 that the continuum hypothesis cannot be disproved using



standard set theory, but he did not prove it was true. In 1963 it was proved byPaul Cohen that the continuum hypothesis is actually unprovable as a theorem instandard set theory.

So, the continuum hypothesis is a statement with the strange property that itis neither true nor false within the framework of ordinary set theory. This meansthat in the standard axiomatic development of set theory, the continuum hypothesis,or a suitable negation of it, can be taken as an additional axiom without causingany contradictions. The technical terminology is that the continuum hypothesis isindependent of the axioms of set theory.

The proofs of these theorems are extremely di�cult and entire broad areas ofmathematics were invented just to make their proofs possible. Even today, thereare some deep philosophical questions swirling around them. A more technicalintroduction to many of these ideas is contained in the book by Ciesielski [7].A nontechnical and very readable history of the e↵orts by mathematicians tounderstand the continuum hypothesis is the book by Aczel [1].

6. Exercises

1.1. If a set S has n elements for n 2 !, then how many elements are in P(S)?

1.2. Prove that for any sets A and B,

(a) A = (A \ B) [ (A \ B)(b) A [ B = (A \ B) [ (B \ A) [ (A \ B) and that the sets A \ B, B \ A and

A \ B are pairwise disjoint.(c) A \ B = A \ Bc.

1.3. Prove Theorem 1.4.

1.4. For any sets A, B, C and D,

(A ⇥ B) \ (C ⇥ D) = (A \ C) ⇥ (B \ D)

and

(A ⇥ B) [ (C ⇥ D) ⇢ (A [ C) ⇥ (B [ D).

Why does equality not hold in the second expression?


1.6. Suppose R is an equivalence relation on A. For each x 2 A define Cx

={y 2 A : xRy}. Prove that if x, y 2 A, then either C

x

= Cy

or Cx

\ Cy

= ;. (Thecollection {C

x

: x 2 A} is the set of equivalence classes induced by R.)

1.7. If f : A ! B and g : B ! C are bijections, then so is g � f : A ! C.

1.8. Prove or give a counter example: f : X ! Y is injective i↵ whenever A andB are disjoint subsets of Y , then f�1(A) \ f�1(B) = ;.

1.9. If f : A ! B is bijective, then f�1 is unique.

1.10. Prove that f : X ! Y is surjective i↵ for each subset A ⇢ X, Y \ f(A) ⇢f(X \ A).


6. EXERCISES 1-13

1.11. Suppose that Ak

is a set for each positive integer k.

(a) Show that x 2 T1n=1

(S1

k=n

Ak

) i↵ x 2 Ak

for infinitely many sets Ak

.(b) Show that x 2 S1

n=1

(T1

k=n

Ak

) i↵ x 2 Ak

for all but finitely many of thesets A

k

.

The setT1

n=1

(S1

k=n

Ak

) from (a) is often called the superior limit of the sets Ak

andS1

n=1

(T1

k=n

Ak

) is often called the inferior limit of the sets Ak

.

1.12. Given two sets A and B, it is common to let AB denote the set of all

functions f : B ! A. Prove that for any set A, card⇣2A

⌘= card (P(A)). This is

why many authors use 2A as their notation for P(A).

1.13. Let S be a set. Prove the following two statements are equivalent:

(a) S is infinite; and,(b) there is a proper subset T of S and a bijection f : S ! T .

This statement is often used as the definition of when a set is infinite.

1.14. If S is an infinite set, then there is a countably infinite collection of nonemptypairwise disjoint infinite sets T

n

, n 2 N such that S =S

n2N Tn

.

1.15. Without using the Schroder-Bernstein theorem, find a bijection f : [0, 1] !(0, 1).

1.16. If f : [0, 1) ! (0, 1) and g : (0, 1) ! [0, 1) are given by f(x) = x + 1 andg(x) = x, then the proof of the Schroder-Bernstein theorem yields what bijectionh : [0, 1) ! (0, 1)?

1.17. Find a function f : R \ {0} ! R \ {0} such that f�1 = 1/f .

1.18. Find a bijection f : [0, 1) ! (0, 1).

1.19. If f : A ! B and g : B ! A are functions such that f � g(x) = x for allx 2 B and g � f(x) = x for all x 2 A, then f�1 = g.

1.20. If A and B are sets such that card (A) = card (B) = @0

, then card (A [ B) =@

0

.

1.21. Using the notation from the proof of the Schroder-Bernstein Theorem, letA = [0, 1), B = (0, 1), f(x) = x + 1 and g(x) = x. Determine h(x).

1.22. Using the notation from the proof of the Schroder-Bernstein Theorem, letA = N, B = Z, f(n) = n and

g(n) =

(1 � 3n, n 0

3n � 1, n > 0.

Calculate h(6) and h(7).

1.23. Suppose that in the statement of the Schroder-Bernstein theorem A = B = Zand f(n) = g(n) = 2n. Following the procedure in the proof yields what function h?



1.24. If {An

: n 2 N} is a collection of countable sets, thenS

n2N An

is countable.

1.25. If A and B are sets, the set of all functions f : A ! B is often denoted by

BA. If S is a set, prove that card⇣2S

⌘= card (P(S)).

1.26. If @0

card (S)), then there is an injective function f : S ! S that is notsurjective.

1.27. If card (S) = @0

, then there is a sequence of pairwise disjoint sets Tn

, n 2 Nsuch that card (T

n

) = @0

for every n 2 N andS

n2N Tn

= S.


CHAPTER 2

The Real Numbers

This chapter concerns what can be thought of as the rules of the game: theaxioms of the real numbers. These axioms imply all the properties of the realnumbers and, in a sense, any set satisfying them is uniquely determined to be thereal numbers.

The axioms are presented here as rules without very much justification. Otherapproaches can be used. For example, a common approach is to begin with thePeano axioms — the axioms of the natural numbers — and build up to the realnumbers through several “completions” of the natural numbers. It’s also possible tobegin with the axioms of set theory to build up the Peano axioms as theorems andthen use those to prove our axioms as further theorems. No matter how it’s done,there are always some axioms at the base of the structure and the rules for the realnumbers are the same, whether they’re axioms or theorems.

We choose to start at the top because the other approaches quickly turn intoa long and tedious labyrinth of technical exercises without much connection toanalysis.

1. The Field Axioms

These first six axioms are called the field axioms because any object satisfyingthem is called a field. They give the algebraic properties of the real numbers.

A field is a nonempty set F along with two binary operations, multiplication⇥ : F ⇥ F ! F and addition + : F ⇥ F ! F satisfying the following axioms.1

Axiom 1 (Associative Laws). If a, b, c 2 F, then (a + b) + c = a + (b + c) and(a ⇥ b) ⇥ c = a ⇥ (b ⇥ c).

Axiom 2 (Commutative Laws). If a, b 2 F, then a + b = b + a and a ⇥ b = b ⇥ a.

Axiom 3 (Distributive Law). If a, b, c 2 F, then a ⇥ (b + c) = (a ⇥ b) + (a ⇥ c).

Axiom 4 (Existence of identities). There are 0, 1 2 F such that a + 0 = a anda ⇥ 1 = a, for all a 2 F.

Axiom 5 (Existence of an additive inverse). For each a 2 F there is �a 2 Fsuch that a + (�a) = 0.

1Given a set A, a function f : A ⇥ A ! A is called a binary operation. In other words, abinary operation is just a function with two arguments. The standard notations of +(a, b) = a+ b

and ⇥(a, b) = a ⇥ b are used here. The symbol ⇥ is unfortunately used for both the Cartesianproduct and the field operation, but the context in which it’s used removes the ambiguity.

2-1

2-2 CHAPTER 2. THE REAL NUMBERS

Axiom 6 (Existence of a multiplicative inverse). For each a 2 F \ {0} there isa�1 2 F such that a ⇥ a�1 = 1.

Although these axioms seem to contain most properties of the real numbers wenormally use, they don’t characterize the real numbers; they just give the rules forarithmetic. There are other fields besides the real numbers and studying them is alarge part of most abstract algebra courses.

Example 2.1. From elementary algebra we know that the rational numbers

Q = {p/q : p 2 Z ^ q 2 N}form a field. It is shown in Theorem 2.12 that

p2 /2 Q, so Q doesn’t contain all the

real numbers.

Example 2.2. Let F = {0, 1, 2} with addition and multiplication calculatedmodulo 3. The addition and multiplication tables are as follows.

+ 0 1 20 0 1 21 1 2 02 2 0 1

⇥ 0 1 20 0 0 01 0 1 22 0 2 1

It is easy to check that the field axioms are satisfied. This field is usually called Z3

.

The following theorems, containing just a few useful properties of fields, arepresented mostly as examples showing how the axioms are used. More completedevelopments can be found in any beginning abstract algebra text.

Theorem 2.1. The additive and multiplicative identities of a field F are unique.

Proof. Suppose e1

and e2

are both multiplicative identities in F. Then

e1

= e1

⇥ e2

= e2

,

so the multiplicative identity is unique. The proof for the additive identity isessentially the same. ⇤

Theorem 2.2. Let F be a field. If a, b 2 F with b 6= 0, then �a and b�1

are

unique.

Proof. Suppose b1

and b2

are both multiplicative inverses for b 6= 0. Then,using Axioms 4 and 1,

b1

= b1

⇥ 1 = b1

⇥ (b ⇥ b2

) = (b1

⇥ b) ⇥ b2

= 1 ⇥ b2

= b2

.

This shows the multiplicative inverse in unique. The proof is essentially the samefor the additive inverse. ⇤

There are many other properties of fields which could be proved here, but theycorrespond to the usual properties of the real numbers learned in beginning algebra,so we omit them. Some of them are in the exercises at the end of this chapter.

From now on, the standard notations for algebra will usually be used; e. g., wewill allow ab instead of a ⇥ b and a/b instead of a ⇥ b�1. The reader may also usethe standard facts she learned from elementary algebra.


2. THE ORDER AXIOM 2-3

2. The Order Axiom

The axiom of this section gives the order and metric properties of the realnumbers. In a sense, the following axiom adds some geometry to a field.

Axiom 7 (Order axiom.). There is a set P ⇢ F such that

(a) If a, b 2 P , then a + b, ab 2 P .(b) If a 2 F, then exactly one of the following is true: a 2 P , �a 2 P ora = 0.

Any field F satisfying the axioms so far listed is naturally called an ordered field.

Of course, the set P is known as the set of positive elements of F. Using Axiom 7(ii),we see that F is divided into three pairwise disjoint sets: P , {0} and {�x : x 2 P}.The latter of these is, of course, the set of negative elements from F. The followingdefinition introduces familiar notation for order.

Definition 2.3. We write a < b or b > a, if b � a 2 P . The meanings of a band b � a are now as expected.

Notice that a > 0 () a = a � 0 2 P and a < 0 () �a = 0 � a 2 P , soa > 0 and a < 0 agree with our usual notions of positive and negative.

Our goal is to capture all the properties of the real numbers with the axioms.The order axiom eliminates many fields from consideration. For example, Exercise2.5 shows the field Z

3

of Example 2.2 is not an ordered field. On the other hand,facts from elementary algebra imply Q is an ordered field, so the first seven axiomsstill don’t “capture” the real numbers.

Following are a few standard properties of ordered fields.

Theorem 2.4. Let F be an ordered field and a 2 F. a 6= 0 i↵ a2 > 0.

Proof. ()) If a > 0, then a2 > 0 by Axiom 7(a). If a < 0, then �a > 0 byAxiom 7(b) and a2 = 1a2 = (�1)(�1)a2 = (�a)2 > 0.

(() Since 02 = 0, this is obvious. ⇤Theorem 2.5. If F is an ordered field and a, b, c 2 F, then

(a) a < b () a + c < b + c,(b) a < b ^ b < c =) a < c,(c) a < b ^ c > 0 =) ac < bc,(d) a < b ^ c < 0 =) ac > bc.

Proof. (a) a < b () b�a 2 P () (b+c)� (a+c) 2 P () a+c < b+c.(b) By supposition, both b � a, c � b 2 P . Using the fact that P is closed under

addition, we see (b � a) + (c � b) = c � a 2 P . Therefore, c > a.(c) Since b � a 2 P and c 2 P and P is closed under multiplication, c(b � a) =

cb � ca 2 P and, therefore, ac < bc.(d) By assumption, b � a, �c 2 P . Apply part (c) and Exercise 2.1. ⇤Theorem 2.6 (Two Out of Three Rule). Let F be an ordered field and a, b, c 2 F.

If ab = c and any two of a, b or c are positive, then so is the third.

Proof. If a > 0 and b > 0, then Axiom 7(a) implies c > 0. Next, supposea > 0 and c > 0. In order to force a contradiction, suppose b 0. In this case,Axiom 7(b) shows

0 a(�b) = �(ab) = �c < 0,

which is impossible. ⇤



Corollary 2.7. Let F be an ordered field and a 2 F. If a > 0, then a�1 > 0.If a < 0, then a�1 < 0.

Proof. The proof is Exercise 2.2. ⇤Suppose a > 0. Since 1a = a, Theorem 2.6 implies 1 > 0. Applying Theorem

2.5, we see that 1 + 1 > 1 > 0. It’s clear that by induction, we can find a copy of Nembedded in any ordered field. Similarly, Z and Q also have unique copies in anyordered field.

The standard notation for intervals will be used on an ordered field, F; i. e.,(a, b) = {x 2 F : a < x < b}, (a, 1) = {x 2 F : a < x}, [a, b] = {x 2 F : a x b},etc. It’s also useful to start visualizing F as a standard number line.

2.1. Metric Properties. The order axiom on a field F allows us to introducethe idea of the distance between points in F. To do this, we begin with the followingfamiliar definition.

Definition 2.8. Let F be an ordered field. The absolute value function on F isa function | · | : F ! F defined as

|x| =

(x, x � 0

�x, x < 0.

The most important properties of the absolute value function are contained inthe following theorem.

Theorem 2.9. Let F be an ordered field and x, y 2 F. Then

(a) |x| � 0 and |x| = 0 () x = 0;(b) |x| = | � x|;(c) �|x| x |x|;(d) |x| y () �y x y; and,(e) |x + y| |x| + |y|.

Proof. (a) The fact that |x| � 0 for all x 2 F follows from Axiom 7(b).Since 0 = �0, the second part is clear.

(b) If x � 0, then �x 0 so that | � x| = �(�x) = x = |x|. If x < 0, then�x > 0 and |x| = �x = | � x|.

(c) If x � 0, then �|x| = �x x = |x|. If x < 0, then �|x| = �(�x) = x <�x = |x|.

(d) This is left as Exercise 2.3.(e) Add the two sets of inequalities �|x| x |x| and �|y| y |y| to see

�(|x| + |y|) x + y |x| + |y|. Now apply (d).⇤

From studying analytic geometry and calculus, we are used to thinking of |x�y|as the distance between the numbers x and y. This notion of a distance betweentwo points of a set can be generalized.

Definition 2.10. Let S be a set and d : S ⇥ S ! R satisfy

(a) for all x, y 2 S, d(x, y) � 0 and d(x, y) = 0 () x = y,(b) for all x, y 2 S, d(x, y) = d(y, x), and(c) for all x, y, z 2 S, d(x, z) d(x, y) + d(y, z).

Then the function d is a metric on S. The pair (S, d) is called a metric space.


3. THE COMPLETENESS AXIOM 2-5

A metric is a function which defines the distance between any two points of aset.

Example 2.3. Let S be a set and define d : S ⇥ S ! S by

d(x, y) =

(1, x 6= y

0, x = y.

It can readily be verified that d is a metric on S. This simplest of all metrics iscalled the discrete metric and it can be defined on any set. It’s not often useful.

Theorem 2.11. If F is an ordered field, then d(x, y) = |x � y| is a metric on F.

Proof. Use parts (a), (b) and (e) of Theorem 2.9. ⇤The metric on F derived from the absolute value function is called the standard

metric on F. There are other metrics sometimes defined for specialized purposes,but we won’t have need of them.

3. The Completeness Axiom

All the axioms given so far are obvious from beginning algebra, and, on thesurface, it’s not obvious they haven’t captured all the properties of the real numbers.Since Q satisfies them all, the following theorem shows that we’re not yet done.

Theorem 2.12. There is no ↵ 2 Q such that ↵2 = 2.

Proof. Assume to the contrary that there is ↵ 2 Q with ↵2 = 2. Then thereare p, q 2 N such that ↵ = p/q with p and q relatively prime. Now,

(4)

✓p

q

◆2

= 2 =) p2 = 2q2

shows p2 is even. Since the square of an odd number is odd, p must be even; i. e.,p = 2r for some r 2 N. Substituting this into (4), shows 2r2 = q2. The sameargument as above establishes q is also even. This contradicts the assumption thatp and q are relatively prime. Therefore, no such ↵ exists. ⇤

Since we suspectp

2 is a perfectly fine number, there’s still something missingfrom the list of axioms. Completeness is the missing idea.

The Completeness Axiom is somewhat more complicated than the previousaxioms, and several definitions are needed in order to state it.

3.1. Bounded Sets.

Definition 2.13. A subset S of an ordered field F is bounded above, if thereexists M 2 F such that M � x for all x 2 S. A subset S of an ordered field F isbounded below, if there exists m 2 F such that m x for all x 2 S. The elementsM and m are called an upper bound and lower bound for S, respectively. If S isbounded both above and below, it is a bounded set.

There is no requirement in the definition that the upper and lower bounds for aset are elements of the set. They can be elements of the set, but typically are not.For example, if N = (�1, 0), then [0, 1) is the set of all upper bounds for N , butnone of them is in N . On the other hand, if T = (�1, 0], then [0, 1) is again theset of all upper bounds for T , but in this case 0 is an upper bound which is also an



element of T . An extreme case is ;. A vacuous argument shows every element of Fis both an upper and lower bound, but obviously none of them is in ;.

A set need not have upper or lower bounds. For example N = (�1, 0) has nolower bounds, while P = (0, 1) has no upper bounds. The integers, Z, has neitherupper nor lower bounds. If S has no upper bound, it is unbounded above and, ifit has no lower bound, then it is unbounded below. In either case, it is usually justsaid to be unbounded.

If M is an upper bound for the set S, then every x � M is also an upper boundfor S. Considering some simple examples should lead you to suspect that amongthe upper bounds for a set, there is one that is best in the sense that everythinggreater is an upper bound and everything less is not an upper bound. This is thebasic idea of completeness.

Definition 2.14. Suppose F is an ordered field and S is bounded above in F.A number B 2 F is called a least upper bound of S if

(a) B is an upper bound for S, and(b) if ↵ is any upper bound for S, then B ↵.

If S is bounded below in F, then a number b 2 F is called a greatest lower bound ofS if

(a) b is a lower bound for S, and(b) if ↵ is any lower bound for S, then b � ↵.

Theorem 2.15. If F is an ordered field and A ⇢ F is nonempty, then A has at

most one least upper bound and at most one greatest lower bound.

Proof. Suppose u1

and u2

are both least upper bounds for A. Since u1

andu

2

are both upper bounds for A, u1

u2

u1

=) u1

= u2

. The proof of theother case is similar. ⇤

Definition 2.16. If A ⇢ F is nonempty and bounded above, then the leastupper bound of A is written lub A. When A is not bounded above, we writelub A = 1. When A = ;, then lubA = �1.

If A ⇢ F is nonempty and bounded below, then the greatest lower of A iswritten glb A. When A is not bounded below, we write glb A = �1. When A = ;,then glb A = 1.2

Notice that the symbol “1” is not an element of F. Writing lub A = 1 is justa convenient way to say A has no upper bounds. Similarly lub ; = �1 tells us ;has every real number as an upper bound.

Theorem 2.17. Let A ⇢ F and ↵ 2 F. ↵ = lub A i↵ (↵, 1) \ A = ; and for

all " > 0, (↵ � ", ↵] \ A 6= ;. Similarly, ↵ = glb A i↵ (�1, ↵) \ A = ; and for all

" > 0, [↵, ↵ + ") \ A 6= ;.Proof. We will prove the first statement, concerning the least upper bound.

The second statement, concerning the greatest lower bound, follows similarly.()) If x 2 (↵, 1) \ A, then ↵ cannot be an upper bound of A, which is a

contradiction. If there is an " > 0 such that (↵ � ", ↵] \ A = ;, then from above, we

2Some people prefer the notation supA and inf A instead of lubA and glbA, respectively.They stand for the supremum and infimum of A.


3. THE COMPLETENESS AXIOM 2-7

conclude (↵ � ", 1) \ A = ;. This implies ↵ � "/2 is an upper bound for A whichis less than ↵ = lub A. This contradiction shows (↵ � ", ↵] \ A 6= ;.

(() The assumption that (↵, 1)\A = ; implies ↵ � lub A. On the other hand,suppose lub A < ↵. By assumption, there is an x 2 (lub A, ↵) \ A. This is clearly acontradiction, since lub A < x 2 A. Therefore, ↵ = lub A. ⇤

An eagle-eyed reader may wonder why the intervals in Theorem 2.17 are (↵�", ↵]and [↵, ↵ + ") instead of (↵ � ", ↵) and (↵, ↵ + "). Just consider the case A = {↵}to see that the theorem fails when the intervals are open. When lub A /2 A orglb A /2 A, the intervals can be open, as shown in the following corollary.

Corollary 2.18. If A is bounded above and ↵ = lub A /2 A, then for all " > 0,(↵ � ", ↵) \ A is an infinite set. Similarly, if A is bounded below and � = glb A /2 A,

then for all " > 0, (�, � + ") \ A is an infinite set.

Proof. Let " > 0. According to Theorem 2.17, there is an x1

2 (↵ � ", ↵] \ A.By assumption, x

1

< ↵. We continue by induction. Suppose n 2 N and xn

hasbeen chosen to satisfy x

n

2 (↵ � ", ↵) \ A. Using Theorem 2.17 as before tochoose x

n+1

2 (xn

, ↵) \ A. The set {xn

: n 2 N} is infinite and contained in(↵ � ", ↵) \ A. ⇤

When F = Q, Theorem 2.12 shows there is no least upper bound for A = {x :x2 < 2} in Q. In a sense, Q has a hole where this least upper bound should be.Adding the following completeness axiom enlarges Q to fill in the holes.

Axiom 8 (Completeness). Every nonempty set which is bounded above has aleast upper bound.

This is the final axiom. Any field F satisfying all eight axioms is called acomplete ordered field. We assume the existence of a complete ordered field, R,called the real numbers.

In naive set theory it can be shown that if F1

and F2

are both completeordered fields, then they are the same, in the following sense. There exists a uniquebijective function i : F

1

! F2

such that i(a + b) = i(a) + i(b), i(ab) = i(a)i(b) anda < b () i(a) < i(b). Such a function i is called an order isomorphism. Theexistence of such an order isomorphism shows that R is essentially unique. Morereading on this topic can be done in some advanced texts [9, 10].

Every statement about upper bounds has a dual statement about lower bounds.A proof of the following dual to Axiom 8 is left as an exercise.

Corollary 2.19. Every nonempty set which is bounded below has a greatest

lower bound.

In Section 4 it will be shown that there is an x 2 R satisfying x2 = 2. This willshow R removes the deficiency of Q highlighted by Theorem 2.12. The CompletenessAxiom plugs up the holes in Q.

3.2. Some Consequences of Completeness. The property of completenessis what separates analysis from geometry and algebra. It requires the use of approxi-mation, infinity and more dynamic visualizations than algebra or classical geometry.The rest of this course is largely concerned with applications of completeness.

Theorem 2.20 (Archimedean Principle ). If a 2 R, then there exists na

2 Nsuch that n

a

> a.



Proof. If the theorem is false, then a is an upper bound for N. Let � = lubN.According to Theorem 2.17 there is an m 2 N such that m > � � 1. But, this is acontradiction because � = lubN < m + 1 2 N. ⇤

Some other variations on this theme are in the following corollaries.

Corollary 2.21. Let a, b 2 R with a > 0.

(a) There is an n 2 N such that an > b.(b) There is an n 2 N such that 0 < 1/n < a.(c) There is an n 2 N such that n � 1 a < n.

Proof. (a) Use Theorem 2.20 to find n 2 N where n > b/a.(b) Let b = 1 in part (a).(c) Theorem 2.20 guarantees that S = {n 2 N : n > a} 6= ;. If n is the least

element of this set, then n � 1 /2 S and n � 1 a < n. ⇤Corollary 2.22. If I is any interval from R, then I \ Q 6= ; and I \ Qc 6= ;.Proof. Left as an exercise. ⇤A subset of R which intersects every interval is said to be dense in R. Corollary

2.22 shows both the rational and irrational numbers are dense.

4. Comparisons of Q and R

All of the above still does not establish that Q is di↵erent from R. In Theorem2.12, it was shown that the equation x2 = 2 has no solution in Q. The followingtheorem shows x2 = 2 does have solutions in R. Since a copy of Q is embedded inR, it follows, in a sense, that R is bigger than Q.

Theorem 2.23. There is a positive ↵ 2 R such that ↵2 = 2.

Proof. Let S = {x > 0 : x2 < 2}. Then 1 2 S, so S 6= ;. If x � 2, thenTheorem 2.5(c) implies x2 � 4 > 2, so S is bounded above. Let ↵ = lub S. It willbe shown that ↵2 = 2.

Suppose first that ↵2 < 2. This assumption implies (2 � ↵2)/(2↵ + 1) > 0.According to Corollary 2.21, there is an n 2 N large enough so that

0 <1

n<

2 � ↵2

2↵ + 1=) 0 <

2↵ + 1

n< 2 � ↵2.

Therefore,✓

↵ +1

n

◆2

= ↵2 +2↵

n+

1

n2

= ↵2 +1

n

✓2↵ +

1

n

◆

< ↵2 +(2↵ + 1)

n< ↵2 + (2 � ↵2) = 2

contradicts the fact that ↵ = lub S. Therefore, ↵2 � 2.Next, assume ↵2 > 2. In this case, choose n 2 N so that

0 <1

n<

↵2 � 2

2↵=) 0 <

2↵

n< ↵2 � 2.

Then✓

↵ � 1

n

◆2

= ↵2 � 2↵

n+

1

n2

> ↵2 � 2↵

n> ↵2 � (↵2 � 2) = 2,


4. COMPARISONS OF Q AND R 2-9

↵1

= .↵1(1) ↵1

(2) ↵1

(3) ↵1

(4) ↵1

(5) . . .↵

2

= .↵2

(1) ↵2(2) ↵2

(3) ↵2

(4) ↵2

(5) . . .↵

3

= .↵3

(1) ↵3

(2) ↵3(3) ↵3

(4) ↵3

(5) . . .↵

4

= .↵4

(1) ↵4

(2) ↵4

(3) ↵4(4) ↵4

(5) . . .↵

5

= .↵5

(1) ↵5

(2) ↵5

(3) ↵5

(4) ↵5(5) . . ....

......

......

...

Figure 1. The proof of Theorem 2.25 is called the “diagonal argument’”because it constructs a new number z by working down the main diagonalof the array shown above, making sure z(n) 6= ↵

n

(n) for each n 2 N.

again contradicts that ↵ = lub S.Therefore, ↵2 = 2. ⇤Theorem 2.12 leads to the obvious question of how much bigger R is than Q.

First, note that since N ⇢ Q, it is clear that card (Q) � @0

. On the other hand,every q 2 Q has a unique reduced fractional representation q = m(q)/n(q) withm(q) 2 Z and n(q) 2 N. This gives an injective function f : Q ! Z ⇥ N defined byf(q) = (m(q), n(q)), and we conclude card (Q) card (Z ⇥ N) = @

0

. The followingtheorem ensues.

Theorem 2.24. card (Q) = @0

.

In 1874, Georg Cantor first showed that R is not countable. The following proofis his famous diagonal argument from 1891.

Theorem 2.25. card (R) > @0

Proof. It su�ces to prove that card ([0, 1]) > @0

. If this is not true, then thereis a bijection ↵ : N ! [0, 1]; i.e.,

(5) [0, 1] = {↵n

: n 2 N}.

Each x 2 [0, 1] can be written in the decimal form x =P1

n=1

x(n)/10n wherex(n) 2 {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} for each n 2 N. This decimal representation is notnecessarily unique. For example,

1

2=

5

10=

4

10+

1X

n=2

9

10n.

In such a case, there is a choice of x(n) so it is constantly 9 or constantly 0 fromsome N onward. When given a choice, we will always opt to end the number with astring of nines. With this convention, the decimal representation of x is unique.

Define z 2 [0, 1] by choosing z(n) 2 {d 2 ! : d 8} such that z(n) 6= ↵n

(n).Let z =

P1n=1

z(n)/10n. Since z 2 [0, 1], there is an n 2 N such that z = ↵(n).But, this is impossible because z(n) di↵ers from ↵

n

in the nth decimal place. Thiscontradiction shows card ([0, 1]) > @

0

. ⇤Around the turn of the twentieth century these then-new ideas about infinite

sets were very controversial in mathematics. This is because some of these ideas arevery unintuitive. For example, the rational numbers are a countable set and theirrational numbers are uncountable, yet between every two rational numbers are an



uncountable number of irrational numbers and between every two irrational numbersthere are a countably infinite number of rational numbers. It would seem there areeither too few or too many gaps in the sets to make this possible. Such a seeminglyparadoxical situation flies in the face of our intuition, which was developed withfinite sets in mind.

This brings us back to the discussion of cardinalities and the ContinuumHypothesis at the end of Section 5. Most of the time, people working in realanalysis assume the Continuum Hypothesis is true. With this assumption andTheorem 2.25 it follows that whenever A ⇢ R, then either card (A) @

0

orcard (A) = card (R) = card (P(N)).3 Since P(N) has many more elements than N,any countable subset of R is considered to be a small set, in the sense of cardinality,even if it is infinite. This works against the intuition of many beginning studentswho are not used to thinking of Q, or any other infinite set as being small. But itturns out to be quite useful because the fact that the union of a countably infinitenumber of countable sets is still countable can be exploited in many ways.4

In later chapters, other useful small versus large dichotomies will be found.

5. Exercises

2.1. Prove that if a, b 2 F, where F is a field, then (�a)b = �(ab) = a(�b).

2.2. Prove Corollary 2.7. If a > 0, then so is a�1. If a < 0, then so is a�1.

2.3. Prove |x| y i↵ �y x y.

2.4. If S ⇢ R is bounded above, then

lub S = glb {x : x is an upper bound for S}.

2.5. Prove there is no set P ⇢ Z3

which makes Z3

into an ordered field.

2.6. If ↵ is an upper bound for S and ↵ 2 S, then ↵ = lub S.

2.7. Let A and B be subsets of R that are bounded above. Define A+B = {a+ b :a 2 A ^ b 2 B}. Prove that lub (A + B) = lub A + lub B.

2.8. If A ⇢ Z is bounded below, then A has a least element.

2.9. If F is an ordered field and a 2 F such that 0 a < " for every " > 0, thena = 0.

2.10. Let x 2 R. Prove |x| < " for all " > 0 i↵ x = 0.

2.11. If p is a prime number, then the equation x2 = p has no rational solutions.

2.12. If p is a prime number and " > 0, then there are x, y 2 Q such thatx2 < p < y2 < x2 + ".

3Since @

0

is the smallest infinite cardinal, @1

is used to denote the smallest uncountablecardinal. You will also see card (R) = c, where c is the old-style German letter c, standing for the“cardinality of the continuum.” Assuming the continuum hypothesis, it follows that @

0

< @

1

= c.4See Problem 24 on page 1-13.


5. EXERCISES 2-11

2.13. If a < b, then (a, b) \ Q 6= ;.

2.14. If q 2 Q and a 2 R \ Q, then q + a 2 R \ Q. Moreover, if q 6= 0, thenaq 2 R \ Q.

2.15. Prove that if a < b, then there is a q 2 Q such that a <p

2q < b.

2.16. Prove Corollary 2.22.

2.17. If F is an ordered field and x1

, x2

, . . . , xn

2 F for some n 2 N, then��

nX

i=1

xi

�� nX

i=1

|xi

|.(8)

2.18. Prove Corollary 2.19.

2.19. Prove card (Qc) = c.

2.20. If A ⇢ R and B = {x : x is an upper bound for A}, then lub (A) = glb (B).


CHAPTER 3

Sequences

1. Basic Properties

Definition 3.1. A sequence is a function a : N ! R.

Instead of using the standard function notation of a(n) for sequences, it isusually more convenient to write the argument of the function as a subscript, a

n

.

Example 3.1. Let the sequence an

= 1 � 1/n. The first three elements area1

= 0, a2

= 1/2, a3

= 2/3, etc.

Example 3.2. Let the sequence bn

= 2n. Then b1

= 2, b2

= 4, b3

= 8, etc.

Example 3.3. If a and r are constants, then a sequence given by c1

= a,c2

= ar, c3

= ar2 and in general cn

= arn�1 is called a geometric sequence. Thenumber r is called the ratio of the sequence. Staying away from the trivial caseswhere a = 0 or r = 0, a geometric sequence can always be recognized by noticingthat a

n+1

a

n

= r for all n 2 N. Example 3.2 is a geometric sequence with a = r = 2.

Example 3.4. If a and d are constants, then a sequence of the form dn

=a + (n � 1)d is called an arithmetic sequence. Another way of looking at this is thatdn

is an arithmetic sequence if dn+1

� dn

= d for all n 2 N.

Example 3.5. Some sequences are not defined by an explicit formula, but aredefined recursively. This is an inductive method of definition in which successiveterms of the sequence are defined by using other terms of the sequence. The mostfamous of these is the Fibonacci sequence. To define the Fibonacci sequence, f

n

,let f

1

= 0, f2

= 1 and for n > 2, let fn

= fn�2

+ fn�1

. The first few terms are0, 1, 1, 2, 3, 5, 8, . . . . There actually is a simple formula that directly gives f

n

, butwe leave its derivation as Exercise 3.5.

It’s often inconvenient for the domain of a sequence to be N, as required byDefinition 3.1. For example, the sequence beginning 1, 2, 4, 8, . . . can be written20, 21, 22, 23, . . . . Written this way, it’s natural to let the sequence function be 2n

with domain !. As long as there is a simple substitution to write the sequencefunction in the form of Definition 3.1, there’s no reason to adhere to the letter of thelaw. In general, the domain of a sequence can be any set of the form {n 2 Z : n � N}for some N 2 Z.

Definition 3.2. A sequence an

is bounded if {an

: n 2 N} is a bounded set.This definition is extended in the obvious way to bounded above and bounded below.

The sequence of Example 3.1 is bounded, but the sequence of Example 3.2 isnot, although it is bounded below.

3-1

3-2 CHAPTER 3. SEQUENCES


converges to L 2 R if for all " > 0 there existsan N 2 N such that whenever n � N , then |a

n

� L| < ". If a sequence does notconverge, then it is said to diverge.

When an

converges to L, we write limn!1 a

n

= L, or often, more simply,an

! L.

Example 3.6. Let an

= 1 � 1/n be as in Example 3.1. We claim an

! 1. Tosee this, let " > 0 and choose N 2 N such that 1/N < ". Then, if n � N

|an

� 1| = |(1 � 1/n) � 1| = 1/n 1/N < ",

so an

! 1.

Example 3.7. The sequence bn

= 2n of Example 3.2 diverges. To see this,suppose not. Then there is an L 2 R such that b

n

! L. If " = 1, there must bean N 2 N such that |b

n

� L| < " whenever n � N . Choose n � N . |L � 2n| < 1implies L < 2n + 1. But, then

bn+1

� L = 2n+1 � L > 2n+1 � (2n + 1) = 2n � 1 � 1 = ".

This violates the condition on N . We conclude that for every L 2 R there exists an" > 0 such that for no N 2 N is it true that whenever n � N , then |b

n

� L| < ".Therefore, b

n

diverges.


diverges to 1 if for every B > 0 there is anN 2 N such that n � N implies a

n

> B. The sequence an

is said to diverge to �1if �a

n

diverges to 1.When a

n

diverges to 1, we write limn!1 a

n

= 1, or often, more simply,an

! 1.

A common mistake is to forget that an

! 1 actually means the sequencediverges in a particular way. Don’t be fooled by the suggestive notation into treating1 as a number!

Example 3.8. It is easy to prove that the sequence an

= 2n of Example 3.2diverges to 1.

Theorem 3.5. If an

! L, then L is unique.

Proof. Suppose an

! L1

and an

! L2

. Let " > 0. According to Definition3.2, there exist N

1

, N2

2 N such that n � N1

implies |an

� L1

| < "/2 and n � N2

implies |an

� L2

| < "/2. Set N = max{N1

, N2

}. If n � N , then

|L1

� L2

| = |L1

� an

+ an

� L2

| |L1

� an

| + |an

� L2

| < "/2 + "/2 = ".

Since " is an arbitrary positive number an application of Exercise 3.10 showsL

1

= L2

. ⇤Theorem 3.6. a

n

! L i↵ for all " > 0, the set {n : an

/2 (L � ", L + ")} is

finite.

Proof. ()) Let " > 0. According to Definition 3.2, there is an N 2 N suchthat {a

n

: n � N} ⇢ (L�", L+"). Then {n : an

/2 (L�", L+")} ⇢ {1, 2, . . . , N �1},which is finite.

(() Let " > 0. By assumption {n : an

/2 (L � ", L + ")} is finite, so letN = max{n : a

n

/2 (L � ", L + ")} + 1. If n � N , then an

2 (L � ", L + "). ByDefinition 3.2, a

n

! L. ⇤


1. BASIC PROPERTIES 3-3

Corollary 3.7. If an

converges, then an

is bounded.

Proof. Suppose an

! L. According to Theorem 3.6 there are a finite numberof terms of the sequence lying outside (L � 1, L + 1). Since any finite set is bounded,the conclusion follows. ⇤

The converse of this theorem is not true. For example, an

= (�1)n is bounded,but does not converge. The main use of Corollary 3.7 is as a quick first check to seewhether a sequence might converge. It’s usually pretty easy to determine whether asequence is bounded. If it isn’t, it must diverge.

The following theorem lets us analyze some complicated sequences by breakingthem down into combinations of simpler sequences.

Theorem 3.8. Let an

and bn

be sequences such that an

! A and bn

! B.

Then

(a) an

+ bn

! A + B,

(b) an

bn

! AB, and

(c) an

/bn

! A/B as long as bn

6= 0 for all n 2 N and B 6= 0.

Proof. (a) Let " > 0. There are N1

, N2

2 N such that n � N1

im-plies |a

n

� A| < "/2 and n � N2

implies |bn

� B| < "/2. DefineN = max{N

1

, N2

}. If n � N , then

|(an

+ bn

) � (A + B)| |an

� A| + |bn

� B| < "/2 + "/2 = ".

Therefore an

+ bn

! A + B.(b) Let " > 0 and ↵ > 0 be an upper bound for |a

n

|. Choose N1

, N2

2 N suchthat n � N

1

=) |an

�A| < "/2(|B|+1) and n � N2

=) |bn

�B| < "/2↵.If n � N = max{N

1

, N2

}, then

|an

bn

� AB| = |an

bn

� an

B + an

B � AB| |a

n

bn

� an

B| + |an

B � AB|= |a

n

||bn

� B| + ||B||an

� A|< ↵

"

2↵+ |B| "

2(|B| + 1)

< "/2 + "/2 = ".

(c) First, notice that it su�ces to show that 1/bn

! 1/B, because part (b) ofthis theorem can be used to achieve the full result.

Let " > 0. Choose N 2 N so that the following two conditions aresatisfied: n � N =) |b

n

| > |B|/2 and |bn

� B| < B2"/2. Then, whenn � N , ��

1

bn

� 1

B

�� =��B � b

n

bn

B

�� <��

B2"/2

(B/2)B

�� = ".

Therefore 1/bn

! 1/B.⇤

If you’re not careful, you can easily read too much into the previous theoremand try to use its converse. Consider the sequences a

n

= (�1)n and bn

= �an

.Their sum, a

n

+ bn

= 0, product an

bn

= �1 and quotient an

/bn

= �1 all converge,but the original sequences diverge.



It is often easier to prove that a sequence converges by comparing it with aknown sequence than it is to analyze it directly. For example, a sequence suchas a

n

= sin2 n/n3 can easily be seen to converge to 0 because it is dominated by1/n3. The following theorem makes this idea more precise. It’s called the SandwichTheorem here, but is also called the Squeeze, Pinching, Pliers or ComparisonTheorem in di↵erent texts.

Theorem 3.9 (Sandwich Theorem). Suppose an

, bn

and cn

are sequences such

that an

bn

cn

for all n 2 N.(a) If a

n

! L and cn

! L, then bn

! L.(b) If b

n

! 1, then cn

! 1.

(c) If bn

! �1, then an

! �1.

Proof. (a) Let " > 0. There is an N 2 N large enough so that whenn � N , then L � " < a

n

and cn

< L + ". These inequalities implyL � " < a

n

bn

cn

< L + ". Theorem 3.6 shows cn

! L.(b) Let B > 0 and choose N 2 N so that n � N =) b

n

> B. Thencn

� bn

> B whenever n � N . This shows cn

! 1.(c) This is essentially the same as part (b).

⇤

2. Monotone Sequences

One of the problems with using the definition of convergence to prove a givensequence converges is the limit of the sequence must be known in order to verify thesequence converges. This gives rise in the best cases to a “chicken and egg” problemof somehow determining the limit before you even know the sequence converges. Inthe worst case, there is no nice representation of the limit to use, so you don’t evenhave a “target” to shoot at. The next few sections are ultimately concerned withremoving this deficiency from Definition 3.2, but some interesting side-issues areexplored along the way.

Not surprisingly, we begin with the simplest case.


is increasing, if an+1

� an

for all n 2 N. It isstrictly increasing if a

n+1

> an

for all n 2 N.A sequence a

n

is decreasing, if an+1

an

for all n 2 N. It is strictly decreasing

if an+1

< an

for all n 2 N.If a

n

is any of the four types listed above, then it is said to be a monotone

sequence.

Notice the and � in the definitions of increasing and decreasing sequences,respectively. Many calculus texts use strict inequalities because they seem to bettermatch the intuitive idea of what an increasing or decreasing sequence should do.For us, the non-strict inequalities are more convenient.

Theorem 3.11. A bounded monotone sequence converges.

Proof. Suppose an

is a bounded increasing sequence, L = lub {an

: n 2 N}and " > 0. Clearly, a

n

L for all n 2 N. According to Theorem 2.17, thereexists an N 2 N such that a

N

> L � ". Because the sequence is increasing,L � a

n

� aN

> L � " for all n � N . This shows an

! L.If a

n

is decreasing, let bn

= �an

and apply the preceding argument. ⇤


2. MONOTONE SEQUENCES 3-5

The key idea of this proof is the existence of the least upper bound of the sequencewhen viewed as a set. This means the Completeness Axiom implies Theorem 3.11.In fact, it isn’t hard to prove Theorem 3.11 also implies the Completeness Axiom,showing they are equivalent statements. Because of this, Theorem 3.11 is often usedas the Completeness Axiom on R instead of the least upper bound property we usedin Axiom 8.

Example 3.9. The sequence en

=�1 + 1

n

�n

converges.

Looking at the first few terms of this sequence, e1

= 2, e2

= 2.25, e3

⇡ 2.37,e4

⇡ 2.44, it seems to be increasing. To show this is indeed the case, fix n 2 N anduse the binomial theorem to expand the product as

en

=nX

k=0

✓n

k

◆1

nk

(9)

and

en+1

=n+1X

k=0

✓n + 1

k

◆1

(n + 1)k.(10)

For 1 k n, the kth term of (9) is

✓n

k

◆1

nk

=n(n � 1)(n � 2) · · · (n � (k � 1))

k!nk

=1

k!

n � 1

n

n � 2

n· · · n � k + 1

n

=1

k!

✓1 � 1

n

◆✓1 � 2

n

◆· · ·✓

1 � k � 1

n

◆

<1

k!

✓1 � 1

n + 1

◆✓1 � 2

n + 1

◆· · ·✓

1 � k � 1

n + 1

◆

=(n + 1)n(n � 1)(n � 2) · · · (n + 1 � (k � 1))

k!(n + 1)k

=

✓n + 1

k

◆1

(n + 1)k,

which is the kth term of (10). Since (10) also has one more positive term in thesum, it follows that e

n

< en+1

, and the sequence en

is increasing.Noting that 1/k! 1/2k�1 for k 2 N, equation (9) implies

✓n

k

◆1

nk

=n!

k!(n � k)!

1

nk

=n � 1

n

n � 2

n· · · n � k + 1

n

1

k!

<1

k!

1

2k�1

.



Substituting this into (9) yields

en

=nX

k=0

✓n

k

◆1

nk

< 1 + 1 +1

2+

1

4+ · · · +

1

2n�1

= 1 +1 � 1

2

n

1 � 1

2

< 3,

so en

is bounded.Since e

n

is increasing and bounded, Theorem 3.11 implies en

converges. Ofcourse, you probably remember from your calculus course that e

n

! e ⇡ 2.71828.

Theorem 3.12. An unbounded monotone sequence diverges to 1 or �1,

depending on whether it is increasing or decreasing, respectively.

Proof. Suppose an

is increasing and unbounded. If B > 0, the fact that an

isunbounded yields an N 2 N such that a

N

> B. Since an

is increasing, an

� aN

> Bfor all n � N . This shows a

n

! 1.The proof when the sequence decreases is similar. ⇤

3. Subsequences and the Bolzano-Weierstrass Theorem

Definition 3.13. Let an

be a sequence and � : N ! N be a function suchthat m < n implies �(m) < �(n); i.e., � is a strictly increasing sequence of naturalnumbers. Then b

n

= a � �(n) = a�(n)

is a subsequence of an

.

The idea here is that the subsequence bn

is a new sequence formed from anold sequence a

n

by possibly leaving terms out of an

. In other words, we see all theterms of b

n

must also appear in an

, and they must appear in the same order.

Example 3.10. Let �(n) = 3n and an

be a sequence. Then the subsequencea�(n)

looks likea3

, a6

, a9

, . . . , a3n

, . . .

The subsequence has every third term of the original sequence.

Example 3.11. If an

= sin(n⇡/2), then some possible subsequences are

bn

= a4n+1

=) bn

= 1,

cn

= a2n

=) cn

= 0,

anddn

= an

2 =) dn

= (1 + (�1)n+1)/2.

Theorem 3.14. an

! L i↵ every subsequence of an

converges to L.

Proof. ()) Suppose � : N ! N is strictly increasing, as in the precedingdefinition. With a simple induction argument, it can be seen that �(n) � n for alln. (See Exercise 3.8.)

Now, suppose an

! L and bn

= a�(n)


. If " > 0, thereis an N 2 N such that n � N implies a

n

2 (L � ", L + "). From the precedingparagraph, it follows that when n � N , then b

n

= a�(n)

= am

for some m � n. So,bn

2 (L � ", L + ") and bn

! L.(() Since a

n

is a subsequence of itself, it is obvious that an

! L. ⇤


4. LOWER AND UPPER LIMITS OF A SEQUENCE 3-7

The main use of Theorem 3.14 is not to show that sequences converge, but, ratherto show they diverge. It gives two strategies for doing this: find two subsequencesconverging to di↵erent limits, or find a divergent subsequence. In Example 3.11, thesubsequences b

n

and cn

demonstrate the first strategy, while dn

demonstrates thesecond.

Even if the original sequence diverges, it is possible there are convergent subse-quences. For example, consider the divergent sequence a

n

= (�1)n. In this case, an

diverges, but the two subsequences a2n

= 1 and a2n+1

= �1 are constant sequences,so they converge.

Theorem 3.15. Every sequence has a monotone subsequence.

Proof. Let an

be a sequence and T = {n 2 N : m > n =) am

� an

}. Thereare two cases to consider, depending on whether T is finite.

First, assume T is infinite. Define �(1) = min T and assuming �(n) is defined,set �(n + 1) = min T \ {�(1), �(2), . . . , �(n)}. This inductively defines a strictlyincreasing function � : N ! N. The definition of T guarantees a

�(n)

is an increasingsubsequence of a

n

.Now, assume T is finite. Let �(1) = max T + 1. If �(n) has been chosen for

some n > max T , then the definition of T implies there is an m > �(n) such thatam

a�(n)

. Set �(n + 1) = m. This inductively defines the strictly increasingfunction � : N ! N such that a

�(n)

is a decreasing subsequence of an

. ⇤If the sequence in Theorem 3.15 is bounded, then the corresponding monotone

subsequence is also bounded. Recalling Theorem 3.11, we see the following.

Theorem 3.16 (Bolzano-Weierstrass). Every bounded sequence has a convergent

subsequence.

4. Lower and Upper Limits of a Sequence

There are an uncountable number of strictly increasing functions � : N ! N,so every sequence a

n

has an uncountable number of subsequences. If an

converges,then Theorem 3.14 shows all of these subsequences converge to the same limit. It’salso apparent that when a

n

! 1 or an

! �1, then all its subsequences diverge inthe same way. When a

n

does not converge or diverge to ±1, the situation is a bitmore di�cult.

Example 3.12. Let Q = {qn

: n 2 N} and ↵ 2 R. Since every intervalcontains an infinite number of rational numbers, it is possible to choose �(1) =min{k : |q

k

� ↵| < 1}. In general, assuming �(n) has been chosen, choose �(n +1) = min{k > �(n) : |q

k

� ↵| < 1/n}. Such a choice is always possible becauseQ\(↵�1/n, ↵+1/n)\{q

k

: k �(n)} is infinite. This induction yields a subsequenceq�(n)

of qn

converging to ↵.

If an

is a sequence and bn

is a convergent subsequence of an

with bn

! L,then L is called an accumulation point of a

n

. A convergent sequence has only oneaccumulation point, but a divergent sequence may have many accumulation points.As seen in Example 3.12, a sequence may have all of R as its set of accumulationpoints.

To make some sense out of this, suppose an

is a bounded sequence, andTn

= {ak

: k � n}. Define

`n

= glb Tn

and µn

= lub Tn

.



Because Tn

� Tn+1

, it follows that for all n 2 N,

`1

`n

`n+1

µn+1

µn

µ1

.(11)

This shows `n

is an increasing sequence bounded above by µ1

and µn

is a decreasingsequence bounded below by `

1

. Theorem 3.11 implies both `n

and µn

converge. If`n

! ` and µn

! µ, (11) shows for all n,

`n

` µ µn

.(12)

Suppose bn

! � is any convergent subsequence of an

. From the definitions of`n

and µn

, it is seen that `n

bn

µn

for all n. Now (12) shows ` � µ.The normal terminology for ` and µ is given by the following definition.

Definition 3.17. Let an

be a sequence. If an

is bounded below, then the lowerlimit of a

n

islim inf a

n

= limn!1 glb {a

k

: k � n}.

If an

is bounded above, then the upper limit of an

is

lim sup an

= limn!1 lub {a

k

: k � n}.

When an

is unbounded, the lower and upper limits are set to appropriate infinitevalues, while recalling the familiar warnings about 1 not being a number.

Example 3.13. Define

an

=

(2 + 1/n, n odd

1 � 1/n, n even.

Then

µn

= lub {ak

: k � n} =

(2 + 1/n, n odd

2 + 1/(n + 1), n even# 2

and

`n

= glb {ak

: k � n} =

(1 � 1/n, n even

1 � 1/(n + 1), n even" 1.

So,lim sup a

n

= 2 > 1 = lim inf an

.

Suppose an

is bounded above and both µn

and µ are as in the discussionpreceding the definition. Choose �(1) so a

�(1)

> µ1

� 1. If �(n) has been chosen forsome n 2 N, then choose �(n + 1) > �(n) to satisfy

µn

� a�(n+1)

> lub Tn+1

� 1/n = un+1

� 1/n.

This inductively defines a subsequence a�(n)

! µ = lim sup an

, where the conver-gence is guaranteed by Theorem 3.9, the Sandwich Theorem.

In the cases when lim sup an

= 1 and lim sup an

= �1, it is left to the readerto show there is a subsequence b

n

! lim sup an

.Similar arguments can be made for lim inf a

n

.To summarize: If � is an accumulation point of a

n

, then

lim inf an

� lim sup an

.

In case an

is bounded, both lim inf an

and lim sup an

are accumulation points of an

and an

converges i↵ lim inf an

= limn!1 a

n

= lim sup an

.The following theorem has been proved.


3: Cauchy Sequences 3-9

Theorem 3.18. Let an

be a sequence.

(a) If � is an accumulation point of an

, then lim inf an

� lim sup an

.

(b) There are subsequences of an

converging to lim inf an

and lim sup an

.

This shows, whenever they are finite, lim inf an

and lim sup an

are accu-

mulation points of an

.

(c) lim inf an

= lim sup an

i↵ an

converges.

5. The Nested Interval Theorem

Definition 3.19. A collection of sets {Sn

: n 2 N} is said to be nested, ifSn+1

⇢ Sn

for all n 2 N.

Theorem 3.20 (Nested Interval Theorem). If {In

= [an

, bn

] : n 2 N} is a

nested collection of closed intervals such that limn!1(b

n

� an

) = 0, then there is

an x 2 R such that

Tn2N I

n

= {x}.Proof. Since the intervals are nested, it’s clear that a

n

is an increasing sequencebounded above by b

1

and bn

is a decreasing sequence bounded below by a1

. ApplyingTheorem 3.11 twice, we find there are ↵, � 2 R such that a

n

! ↵ and bn

! �.We claim ↵ = �. To see this, let " > 0 and use the “shrinking” condition on

the intervals to pick N 2 N so that bN

� aN

< ". The nestedness of the intervalsimplies a

N

an

< bn

bN

for all n � N . Therefore

aN

lub {an

: n � N} = ↵ bN

and aN

glb {bn

: n � N} = � bN

.

This shows |↵ � �| |bN

� aN

| < ". Since " > 0 was chosen arbitrarily, we conclude↵ = �.

Let x = ↵ = �. It remains to show thatT

n2N In

= {x}.First, we show that x 2 T

n2N In

. To do this, fix N 2 N. Since an

increases tox, it’s clear that x � a

N

. Similarly, x bN

. Therefore x 2 [aN

, bN

]. Because Nwas chosen arbitrarily, it follows that x 2 T

n2N In

.Next, suppose there are x, y 2 T

n2N In

and let " > 0. Choose N 2 N such thatbN

� aN

< ". Then {x, y} ⇢ Tn2N I

n

⇢ [aN

, bN

] implies |x � y| < ". Since " waschosen arbitrarily, we see x = y. Therefore

Tn2N I

n

= {x}. ⇤Example 3.14. If I

n

= (0, 1/n] for all n 2 N, then the collection {In

: n 2 N}is nested, but

Tn2N I

n

= ;. This shows the assumption that the intervals be closedin the Nested Interval Theorem is necessary.

Example 3.15. If In

= [n, 1) then the collection {In

: n 2 N} is nested,but

Tn2N I

n

= ;. This shows the assumption that the lengths of the intervals bebounded is necessary. (It will be shown in Corollary 5.11 that when their lengthsdon’t go to 0, then the intersection is nonempty, but the uniqueness of x is lost.)

6. Cauchy Sequences

Often the biggest problem with showing that a sequence converges using thetechniques we have seen so far is that we must know ahead of time to what itconverges. This is the “chicken and egg” problem mentioned above. An escape fromthis dilemma is provided by Cauchy sequences.


is a Cauchy sequence if for all " > 0 there isan N 2 N such that n, m � N implies |a

n

� am

| < ".

1

January 19, 2015

c�Lee Larson ([email protected])



This definition is a bit more subtle than it might at first appear. It sort of saysthat all the terms of the sequence are close together from some point onward. Theemphasis is on all the terms from some point onward. To stress this, first considera negative example.

Example 3.16. Suppose an

=P

n

k=1

1/k for n 2 N. There’s a trick for showingthe sequence a

n

diverges. First, note that an

is strictly increasing. For any n 2 N,consider

a2

n�1

=2

n�1X

k=1

1

k=

n�1X

j=0

2

j�1X

k=0

1

2j + k

>

n�1X

j=0

2

j�1X

k=0

1

2j+1

=n�1X

j=0

1

2=

n

2! 1

Hence, the subsequence a2

n�1

is unbounded and the sequence an

diverges. (To seehow this works, write out the first few sums of the form a

2

n�1

.)On the other hand, |a

n+1

� an

| = 1/(n + 1) ! 0 and indeed, if m is fixed,|a

n+m

� sn

| ! 0. This makes it seem as though the terms are getting close together,as in the definition of a Cauchy sequence. But, s

n

is not a Cauchy sequence, asshown by the following theorem.

Theorem 3.22. A sequence converges i↵ it is a Cauchy sequence.

Proof. ()) Suppose an

! L and " > 0. There is an N 2 N such that n � Nimplies |a

n

� L| < "/2. If m, n � N , then

|am

� an

| = |am

� L + L � an

| |am

� L| + |L � am

| < "/2 + "/2 = ".

This shows an

is a Cauchy sequence.(() Let a

n

be a Cauchy sequence. First, we claim that an

is bounded. To seethis, let " = 1 and choose N 2 N such that n, m � N implies |a

n

� am

| < 1. In thiscase, a

N

� 1 < an

< aN

+ 1 for all n � N , so {an

: n � N} is a bounded set. Theset {a

n

: n < N}, being finite, is also bounded. Since {an

: n 2 N} is the union ofthese two bounded sets, it too must be bounded.

Because an

is a bounded sequence, Theorem 3.16 implies it has a convergentsubsequence b

n

= a�(n)

! L. Let " > 0 and choose N 2 N so that n, m � N implies|a

n

� am

| < "/2 and |bn

� L| < "/2. If n � N , then �(n) � n � N and

|an

� L| = |an

� bn

+ bn

� L| |a

n

� bn

| + |bn

� L|= |a

n

� a�(n)

| + |bn

� L|< "/2 + "/2 = ".

Therefore, an

! L. ⇤The fact that Cauchy sequences converge is yet another equivalent version

of completeness. In fact, most advanced texts define completeness as “Cauchysequences converge.” This is convenient in general spaces because the definition of aCauchy sequence only needs the metric on the space and none of its other structure.

A typical example of the usefulness of Cauchy sequences is given below.

Definition 3.23. A sequence xn

is contractive if there is a c 2 (0, 1) such that|x

k+1

� xk

| c|xk

� xk�1

| for all k > 1. c is called the contraction constant.


3: Cauchy Sequences 3-11

Theorem 3.24. If a sequence is contractive, then it converges.

Proof. Let xk

be a contractive sequence with contraction constant c 2 (0, 1).We first claim that if n 2 N, then

(13) |xn

� xn+1

| cn�1|x1

� x2

|.This is proved by induction. When n = 1, the statement is

|x1

� x2

| c0|x1

� x2

| = |x1

� x2

|,which is trivially true. Suppose that |x

n

� xn+1

| cn�1|x1

� x2

| for some n 2 N.Then, from the definition of a contractive sequence and the induction hypothesis,

|xn+1

� xn+2

| c|xn

� xn+1

| c�cn�1|x

1

� x2

|� = cn|x1

� x2

|.This shows the claim is true in the case n + 1. Therefore, by induction, the claim istrue for all n 2 N.

To show xn

is a Cauchy sequence, let " > 0. Since cn ! 0, we can choose N 2 Nso that

(14)cN�1

(1 � c)|x

1

� x2

| < ".

Let n > m � N . Then

|xn

� xm

| = |xn

� xn�1

+ xn�1

� xn�2

+ xn�2

� · · · � xm+1

+ xm+1

� xm

| |x

n

� xn�1

| + |xn�1

� xn�2

| + · · · + |xm+1

� xm

|Now, use (13) on each of these terms.

cn�2|x1

� x2

| + cn�3|x1

� x2

| + · · · + cm�1|x1

� x2

|= |x

1

� x2

|(cn�2 + cn�3 + · · · + cm�1)

Apply the formula for a geometric sum.

= |x1

� x2

|cm�1

1 � cn�m

1 � c

< |x1

� x2

|cm�1

1 � c(15)

Use (14) to estimate the following.

|x1

� x2

|cN�1

1 � c

< |x1

� x2

| "

|x1

� x2

|= "

This shows xn

is a Cauchy sequenceand must converge by Theorem 3.22. ⇤Example 3.17. Let �1 < r < 1 and define the sequence s

n

=P

n

k=0

rk. (Youno doubt recognize this as the geometric series from your calculus course.) If r = 0,the convergence of s

n

is trivial. So, suppose r 6= 0. In this case,

|sn+1

� sn

||s

n

� sn�1

| =

��rn+1

rn

�� = |r| < 1

and sn

is contractive. Theorem 3.24 implies sn

converges.



Example 3.18. Suppose f(x) = 2 + 1/x, a1

= 2 and an+1

= f(an

) for n 2 N.It is evident that a

n

� 2 for all n. Some algebra gives��an+1

� an

an

� an�1

�� =��f(f(a

n�1

)) � f(an�1

)

f(an�1

) � an�1

�� =1

1 + 2an�1

1

5.

This shows an

is a contractive sequence and, according to Theorem 3.24, an

! Lfor some L � 2. Since, a

n+1

= 2 + 1/an

, taking the limit as n ! 1 of both sidesgives L = 2 + 1/L. A bit more algebra shows L = 1 +

p2.

L is called a fixed point of the function f ; i.e. f(L) = L. Many approximationtechniques for solving equations involve such iterative techniques depending uponcontraction to find fixed points.

The calculations in the proof of Theorem 3.24 give the means to approximatethe fixed point to within an allowable error. Looking at line (15), notice

|xn

� xm

| < |x1

� x2

|cm�1

1 � c.

Let n ! 1 in this inequality to arrive at the error estimate

|L � xm

| |x1

� x2

|cm�1

1 � c.(16)

In Example 3.18, a1

= 2, a2

= 5/2 and c 1/5. Suppose we want to approximateL to 5 decimal places of accuracy. This means we need |a

n

� L| < 5 ⇥ 10�6. Using(16), with m = 9 shows

|a1

� a2

|cm�1

1 � c 1.6 ⇥ 10�6.

Some arithmetic gives a9

⇡ 2.41421. The calculator value of L = 1 +p

2 ⇡2.414213562, confirming our estimate.

7. Exercises

3.1. Let the sequence an

=6n � 1

3n + 2. Use the definition of convergence for a

sequence to show an

converges.

3.2. If an

is a sequence such that a2n

! L and a2n+1

! L, then an

! L.

3.3. Let an

be a sequence such that a2n

! A and a2n

� a2n�1

! 0. Then an

! A.

3.4. If an

is a sequence of positive numbers converging to 0, thenp

an

! 0.

3.5. Find examples of sequences an

and bn

such that an

! 0 and bn

! 1 suchthat

(a) an

bn

! 0(b) a

n

bn

! 1(c) lim

n!1 an

bn

does not exist, but an

bn

is bounded.(d) Given c 2 R, a

n

bn

! c.

3.6. If xn

and yn

are sequences such that limn!1 x

n

= L 6= 0 and limn!1 x

n

yn

exists, then limn!1 y

n

exists.


7. EXERCISES 3-13

3.7. Determine the limit of an

= n

pn!. (Hint: If n is even, then n! > (n/2)n/2.)

3.8. If � : N ! N is strictly increasing, then �(n) � n for all n 2 N.

3.9. Every unbounded sequence contains a monotonic subsequence.

3.10. Find a sequence an

such that given x 2 [0, 1], there is a subsequence bn

ofan

such that bn

! x.

3.11. A sequence an

converges to 0 i↵ |an

| converges to 0.

3.12. Define the sequence an

=p

n for n 2 N. Show that |an+1

� an

| ! 0, but an

is not a Cauchy sequence.

3.13. Suppose a sequence is defined by a1

= 0, a1

= 1 and an+1

= 1

2

(an

+ an�1

)for n � 2. Prove a

n

converges, and determine its limit.

3.14. If the sequence an

is defined recursively by a1

= 1 and an+1

=p

an

+ 1,then show a

n

converges and determine its limit.

3.15. If an

is a sequence such that limn!1 |a

n+1

/an

| = ⇢ < 1, then an

! 0.

3.16. Prove that the sequence an

= n3/n! converges.

3.17. Let an

and bn

be sequences. Prove that both sequences an

and bn

convergei↵ both a

n

+ bn

and an

� bn

converge.

3.18. Let an

be a bounded sequence. Prove that given any " > 0, there is aninterval I with length " such that {n : a

n

2 I} is infinite. Is it necessary that an

bebounded?

3.19. A sequence an

converges in the mean if an

= 1

n

Pn

k=1

ak

converges. Provethat if a

n

! L, then an

! L, but the converse is not true.

3.20. Find a sequence xn

such that for all n 2 N there is a subsequence of xn

converging to n.

3.21. If an

is a Cauchy sequence whose terms are integers, what can you say aboutthe sequence?

3.22. Show an

=P

n

k=0

1/k! is a Cauchy sequence.

3.23. If an

is a sequence such that every subsequence of an

has a furthersubsequence converging to L, then a

n

! L.

3.24. If a, b 2 (0, 1), then show n

pan + bn ! max{a, b}.

3.25. If 0 < ↵ < 1 and sn

is a sequence satisfying |sn+1

| < ↵|sn

|, then sn

! 0.

3.26. If c � 1 in the definition of a contractive sequence, can the sequenceconverge?



3.27. If an

is a convergent sequence and bn

is a sequence such that |am

� an

| �|b

m

� bn

| for all m, n 2 N, then bn

converges.

3.28. If an

� 0 for all n 2 N and an

! L, thenp

an

! pL.

3.29. If an

is a Cauchy sequence and bn


such that bn

! L,then a

n

! L.

3.30. Let an

be a sequence. an

! L i↵ lim sup an

= L = lim inf an

.

3.31. Is lim sup(an

+ bn

) = lim sup an

+ lim sup bn

?

3.32. If an

is a sequence of positive numbers, then lim inf an

= lim sup 1/an

.

3.33. an

= 1/n is not contractive.

3.34. The equation x3 � 4x + 2 = 0 has one real root lying between 0 and 1.Find a sequence of rational numbers converging to this root. Use this sequence toapproximate the root to five decimal places.

3.35. Approximate a solution of x3 � 5x + 1 = 0 to within 10�4 using a Cauchysequence.

3.36. Prove or give a counterexample: If an

! L and � : N ! N is bijective, thenbn

= a�(n)

converges. Note that bn

might not be a subsequence of an

. (bn

is calleda rearrangement of a

n

.)


CHAPTER 4

Series

Given a sequence an

, in many contexts it is natural to ask about the sum of allthe numbers in the sequence. If only a finite number of the a

n

are nonzero, thisis trivial—and not very interesting. If an infinite number of the terms aren’t zero,the path becomes less obvious. Indeed, it’s even somewhat questionable whether itmakes sense at all to add an infinite number of numbers.

There are many approaches to this question. The method given below is themost common technique. Others are mentioned in the exercises.

1. What is a Series?

The idea behind adding up an infinite collection of numbers is a reduction tothe well-understood idea of a sequence. This is a typical approach in mathematics:reduce a question to a previously solved problem.

Definition 4.1. Given a sequence an

, the series having an

as its terms is thenew sequence

sn

=nX

k=1

ak

= a1

+ a2

+ · · · + an

.

The numbers sn

are called the partial sums of the series. If sn

! S 2 R, then theseries converges to S. This is normally written as

1X

k=1

ak

= S.

Otherwise, the series diverges.

The notationP1

n=1

an

is understood to stand for the sequence of partial sumsof the series with terms a

n

. When there is no ambiguity, this is often abbreviatedto just

Pan

.

Example 4.1. If an

= (�1)n for n 2 N, then s1

= �1, s2

= �1 + 1 = 0,s3

= �1 + 1 � 1 = �1 and in general

sn

=(�1)n � 1

2does not converge because it oscillates between �1 and 0. Therefore, the seriesP

(�1)n diverges.

Example 4.2 (Geometric Series). Recall that a sequence of the form an

= c rn�1

is called a geometric sequence. It gives rise to a series1X

n=1

c rn�1 = c + cr + cr2 + cr3 + · · ·

4-1

4-2 Series

called a geometric series. The number r is called the ratio of the series.Suppose a

n

= rn�1 for r 6= 1. Then,

s1

= 1, s2

= 1 + r, s3

= 1 + r + r2, . . .

In general, it can be shown by induction (or even long division) that

sn

=nX

k=1

ak

=nX

k=1

rk�1 =1 � rn

1 � r.(23)

The convergence of sn

in (23) depends on the value of r. Letting n ! 1, it’sapparent that s

n

diverges when |r| > 1 and converges to 1/(1 � r) when |r| < 1.When r = 1, s

n

= n ! 1. When r = �1, it’s essentially the same as Example 4.1,and therefore diverges. In summary,

1X

n=1

c rn�1 =c

1 � r

for |r| < 1, and diverges when |r| � 1. This is called a geometric series with ratio r.

Figure 1. Stepping to the wall.

2 1 01/2distance from wall

steps

In some cases, the geometric series has an intuitively plausible limit. If youstart two meters away from a wall and keep stepping halfway to the wall, no numberof steps will get you to the wall, but a large number of steps will get you as close tothe wall as you want. (See Figure 1.) So, the total distance stepped has limitingvalue 2. The total distance after n steps is the nth partial sum of a geometric serieswith ratio r = 1/2 and c = 1.

Example 4.3 (Harmonic Series). The seriesP1

n=1

1/n is called the harmonic

series. It was shown in Example 3.16 that the harmonic series diverges.

Example 4.4. The terms of the sequence

an

=1

n2 + n, n 2 N.

can be decomposed into partial fractions as

an

=1

n� 1

n + 1.

If sn

is the series having an

as its terms, then s1

= 1/2 = 1 � 1/2. We claim thatsn

= 1 � 1/(n + 1) for all n 2 N. To see this, suppose sk

= 1 � 1/(k + 1) for somek 2 N. Then

sk+1

= sk

+ ak+1

= 1 � 1

k + 1+

✓1

k + 1� 1

k + 2

◆= 1 � 1

k + 2


Positive Series 4-3

and the claim is established by induction. Now it’s easy to see that1X

n=1

1

n2 + n= lim

n!1

✓1 � 1

n + 2

◆= 1.

This is an example of a telescoping series. The name is apparently based on theidea that the middle terms of the series cancel, causing the series to collapse like ahand-held telescope.

The following theorem is an easy consequence of the properties of sequencesshown in Theorem 3.8.

Theorem 4.2. Let

Pan

and

Pbn

be convergent series.

(a) If c 2 R, thenP

c an

= cP

an

.

(b)P

(an

+ bn

) =P

an

+P

bn

.

(c) an

! 0

Proof. Let An

=P

n

k=1

ak

and Bn

=P

n

k=1

bk

be the sequences of partialsums for each of the two series. By assumption, there are numbers A and B whereA

n

! A and Bn

! B.(a)

Pn

k=1

c ak

= cP

n

k=1

ak

= cAn

! cA.

(b)P

n

k=1

(ak

+ bk

) =P

n

k=1

ak

+P

n

k=1

bk

= An

+ Bn

! A + B.

(c) For n > 1, an

=P

n

k=1

ak

�Pn�1

k=1

ak

= An

� An�1

! A � A = 0. ⇤Notice that the first two parts of Theorem 4.2 show that the set of all convergent

series is closed under linear combinations.Theorem 4.2(c) is most useful because its contrapositive provides the most basic

test for divergence.

Corollary 4.3 (Going to Zero Test). If an

6! 0, thenP

an

diverges.

Many have made the mistake of reading too much into Corollary 4.3. It canonly be used to show divergence. When the terms of a series do tend to zero, thatdoes not guarantee convergence. Example 4.3, shows Theorem 4.2(c) is necessary,but not su�cient for convergence.

Another useful observation is that the partial sums of a convergent series are aCauchy sequence. The Cauchy criterion for sequences can be rephrased for series asthe following theorem, the proof of which is Exercise 4.4.

Theorem 4.4 (Cauchy Criterion). Let

Pan

be a series. The following state-

ments are equivalent.

(a)P

an

converges.

(b) For every " > 0 there is an N 2 N such that whenever n � m � N ,

then ��

nX

i=m

ai

�� < ".

2. Positive Series

Most of the time, it is very hard or impossible to determine the exact limit ofa convergent series. We must satisfy ourselves with determining whether a seriesconverges, and then approximating its sum. For this reason, the study of series


4-4 Series

usually involves learning a collection of theorems that might answer whether a givenseries converges, but don’t tell us to what it converges. These theorems are usuallycalled the convergence tests. The reader probably remembers a battery of such testsfrom her calculus course. There is a myriad of such tests, and the standard ones arepresented in the next few sections, along with a few of those less widely used.

Since convergence of a series is determined by convergence of the sequence of itspartial sums, the easiest series to study are those with well-behaved partial sums.Series with monotone sequences of partial sums are certainly the simplest suchseries.

Definition 4.5. The seriesP

an

is a positive series, if an

� 0 for all n.

The advantage of a positive series is that its sequence of partial sums is non-negative and increasing. Since an increasing sequence converges if and only if it isbounded above, there is a simple criterion to determine whether a positive seriesconverges. All of the standard convergence tests for positive series exploit thiscriterion.

2.1. The Most Common Convergence Tests. All beginning calculus coursescontain several simple tests to determine whether positive series converge. Most ofthem are presented below.

2.1.1. Comparison Tests. The most basic convergence tests are the comparisontests. In these tests, the behavior of one series is inferred from that of anotherseries. Although they’re easy to use, there is one often fatal catch: in order to usea comparison test, you must have a known series to which you can compare themystery series. For this reason, a wise mathematician collects example series for hertoolbox. The more samples in the toolbox, the more powerful are the comparisontests.

Theorem 4.6 (Comparison Test). Suppose

Pan

and

Pbn

are positive series

with an

bn

for all n.

(a) If

Pbn

converges, then so does

Pan

.

(b) If

Pan

diverges, then so does

Pbn

.

Proof. Let An

and Bn

be the partial sums ofP

an

andP

bn

, respectively. Itfollows from the assumptions that A

n

and Bn

are increasing and for all n 2 N,

(24) An

Bn

.

IfP

bn

= B, then (24) implies B is an upper bound for An

, andP

an

converges.On the other hand, if

Pan

diverges, An

! 1 and the Sandwich Theorem3.9(b) shows B

n

! 1. ⇤

Example 4.5. Example 4.3 shows thatP

1/n diverges. If p 1, then 1/np �1/n, and Theorem 4.6 implies

P1/np diverges.

Example 4.6. The seriesP

sin2 n/2n converges because

sin2 n

2n 1

2n

for all n and the geometric seriesP

1/2n = 1.


Positive Series 4-5

a1

+ a2

+ a3| {z }

2a2

+ a4

+ a5

+ a6

+ a7| {z }

4a4

+ a8

+ a9

+ · · · + a15| {z }

8a8

+a16

+ · · ·

a1|{z}

�a2

+ a2

+ a3| {z }

�2a4

+ a4

+ a5

+ a6

+ a7| {z }

�4a8

+ a8

+ a9

+ · · · + a15| {z }

�8a16

+a16

+ · · ·

Figure 2. This diagram shows the groupings used in inequality (25).

Theorem 4.7 (Cauchy’s Condensation Test1). Suppose an

is a decreasing

sequence of nonnegative numbers. Then

Xan

converges i↵

X2na

2

n

converges.

Proof. Since an

is decreasing, for n 2 N,

2

n+1�1X

k=2

n

ak

2na2

n 22

n�1X

k=2

n�1

ak

.(25)

(See Figure 2.1.1.) Adding for 1 n m gives

2

m+1�1X

k=2

ak

mX

k=1

2ka2

k 22

m�1X

k=1

ak

and the theorem follows from the Comparison Test. ⇤

Example 4.7 (p-series). For fixed p 2 R, the seriesP

1/np is called a p-series.The special case when p = 1 is the harmonic series. Notice

X 2n

(2n)p=X�

21�p

�n

is a geometric series with ratio 21�p, so it converges only when 21�p < 1. Since21�p < 1 only when p > 1, it follows from the Cauchy Condensation Test that thep-series converges when p > 1 and diverges when p 1. (Of course, the divergencehalf of this was already known from Example 4.5.)

The p-series are often useful for the Comparison Test, and also occur in manyareas of advanced mathematics such as harmonic analysis and number theory.

Theorem 4.8 (Limit Comparison Test). Suppose

Pan

and

Pbn

are positive

series with

↵ = lim infan

bn

lim supan

bn

= �.(26)

(a) If ↵ 2 (0, 1) and

Pan


Pbn

, and if

Pbn


Pan

.

(b) If � 2 (0, 1) and

Pbn


Pan

, and if

Pan


Pbn

.

1The seriesP

2na2

n is sometimes called the condensed series associated withP

a

n

.


4-6 Series

Proof. To prove (a), suppose ↵ > 0. There is an N 2 N such that

n � N =) ↵

2<

an

bn

.(27)

If n > N , then (27) gives

↵

2

nX

k=N

bk

<

nX

k=N

ak

(28)

Part (a) now follows from the comparison test.The proof of (b) is similar. ⇤

The following easy corollary is the form this test takes in most calculus books.It’s easier to use than Theorem 4.8 and su�ces most of the time.

Corollary 4.9 (Limit Comparison Test). Suppose

Pan

and

Pbn

are positive

series with

↵ = limn!1

an

bn

.(29)

If ↵ 2 (0, 1), thenP

an

and

Pbn

either both converge or both diverge.

Example 4.8. To test the seriesX 1

2n � nfor convergence, let

an

=1

2n � nand b

n

=1

2n.

Then

limn!1

an

bn

= limn!1

1/(2n � n)

1/2n= lim

n!12n

2n � n= lim

n!11

1 � n/2n= 1 2 (0, 1).

SinceP

1/2n = 1, the original series converges by the Limit Comparison Test.

2.1.2. Geometric Series-Type Tests. The most important series is undoubtedlythe geometric series. Several standard tests are basically comparisons to geometricseries.

Theorem 4.10 (Root Test). Suppose

Pan

is a positive series and

⇢ = lim sup a1/n

n

.

If ⇢ < 1, thenP

an

converges. If ⇢ > 1, thenP

an

diverges.

Proof. First, suppose ⇢ < 1 and r 2 (⇢, 1). There is an N 2 N so that a1/n

n

< rfor all n � N . This is the same as a

n

< rn for all n � N . Using this, it follows thatwhen n � N ,

nX

k=1

ak

=N�1X

k=1

ak

+nX

k=N

ak

<

N�1X

k=1

ak

+nX

k=N

rk.

SinceP

n

k=N

rk is a partial sum of a geometric series with ratio r < 1, it mustconverge.

If ⇢ > 1, there is an increasing sequence of integers kn

! 1 such that a1/k

n

k

n

> 1for all n 2 N. This shows a

k

n

> 1 for all n 2 N. By Theorem 4.3,P

an

diverges. ⇤


Positive Series 4-7

Example 4.9. For any x 2 R, the seriesP |xn|/n! converges. To see this, note

that according to Exercise 4.7,✓ |xn|

n!

◆1/n

=|x|

(n!)1/n! 0 < 1.

Applying the Root Test shows the series converges.

Example 4.10. Consider the p-seriesP

1/n andP

1/n2. The first divergesand the second converges. Since n1/n ! 1 and n2/n ! 1, it can be seen that when⇢ = 1, the Root Test in inconclusive.

Theorem 4.11 (Ratio Test). Suppose

Pan

is a positive series. Let

r = lim infn!1

an+1

an

lim supn!1

an+1

an

= R.

If R < 1, thenP

an

converges. If r > 1, thenP

an

diverges.

Proof. First, suppose R < 1 and ⇢ 2 (R, 1). There exists N 2 N such thatan+1

/an

< ⇢ whenever n � N . This implies an+1

< ⇢an

whenever n � N . Fromthis it’s easy to prove by induction that a

N+m

< ⇢maN

whenever m 2 N. It followsthat, for n > N ,

nX

k=1

ak

=NX

k=1

ak

+nX

k=N+1

ak

=NX

k=1

ak

+n�NX

k=1

aN+k

<

NX

k=1

ak

+n�NX

k=1

aN

⇢k

<

NX

k=1

ak

+aN

⇢

1 � ⇢.

Therefore, the partial sums ofP

an

are bounded, andP

an

converges.If r > 1, then choose N 2 N so that a

n+1

> an

for all n � N . It’s now apparentthat a

n

6! 0. ⇤

In calculus books, the ratio test usually takes the following simpler form.

Corollary 4.12 (Ratio Test). Suppose

Pan

is a positive series. Let

r = limn!1

an+1

an

.

If r < 1, thenP

an

converges. If r > 1, thenP

an

diverges.

From a practical viewpoint, the ratio test is often easier to apply than the roottest. But, the root test is actually the stronger of the two in the sense that thereare series for which the ratio test fails, but the root test succeeds. (See Exercise4.11, for example.) This happens because

lim infan+1

an

lim inf a1/n

n

lim sup a1/n

n

lim supan+1

an

.(30)


4-8 Series

To see this, note the middle inequality is always true. To prove the right-hand

inequality, choose r > lim sup an+1

/an

. It su�ces to show lim sup a1/n

n

r. As inthe proof of the ratio test, a

n+k

< rkan

. This implies

an+k

< rn+k

an

rn,

which leads to

a1/(n+k)

n+k

< r⇣a

n

rn

⌘1/(n+k)

.

Finally,

lim sup a1/n

n

= lim supk!1

a1/(n+k)

n+k

lim supk!1

r⇣a

n

rn

⌘1/(n+k)

= r.

The left-hand inequality is proved similarly.

2.2. Kummer-Type Tests. Most times the simple tests of the preceding sec-tion su�ce. However, more di�cult series require more delicate tests. There dozensof other, more specialized, convergence tests. Several of them are consequences ofthe following theorem.

Theorem 4.13 (Kummer’s Test). Suppose

Pan

is a positive series, pn

is a

sequence of positive numbers and

(31) ↵ = lim inf

✓pn

an

an+1

� pn+1

◆ lim sup

✓pn

an

an+1

� pn+1

◆= �

If ↵ > 0, thenP

an

converges. If

P1/p

n

diverges and � < 0, thenP

an

diverges.

Proof. Let sn

=P

n

k=1

ak

, suppose ↵ > 0 and choose r 2 (0, ↵). There mustbe an N > 1 such that

pn

an

an+1

� pn+1

> r, 8n � N.

Rearranging this gives

(32) pn

an

� pn+1

an+1

> ran+1

, 8n � N.

For M > N , (32) implies

MX

n=N

(pn

an

� pn+1

an+1

) >

MX

n=N

ran+1

pN

aN

� pM+1

aM+1

> r(sM

� sN�1

)

pN

aN

� pM+1

aM+1

+ rsN�1

> rsM

pN

aN

+ rsN�1

r> s

M

Since N is fixed, the left side is an upper bound for sM

, and it follows thatP

an

converges.Next suppose

P1/p

n

diverges and � < 0. There must be an N 2 N such that

pn

an

an+1

� pn+1

< 0, 8n � N.

This implies

pn

an

< pn+1

an+1

, 8n � N.


Kummer-Type Tests 4-9

Therefore, pn

an

> pN

aN

whenever n > N and

an

> pN

aN

1

pn

, 8n � N.

Because N is fixed andP

1/pn

diverges, the Comparison Test showsP

an

diverges.⇤

Kummer’s test is powerful. In fact, it can be shown that, given any positiveseries, a judicious choice of the sequence p

n

can always be made to determine whetherit converges. (See Exercise 4.18, [15] and [14].) But, as stated, Kummer’s test isnot very useful because choosing p

n

for a given series is often di�cult. Experiencehas led to some standard choices that work with large classes of series. For example,Exercise 4.9 asks you to prove the choice p

n

= 1 for all n reduces Kummer’s test tothe standard ratio test. Other useful choices are shown in the following theorems.

Theorem 4.14 (Raabe’s Test). Let

Pan

be a positive series such that an

> 0for all n. Define

↵ = lim supn!1

n

✓an

an+1

� 1

◆� lim inf

n!1 n

✓an

an+1

� 1

◆= �

If ↵ > 1, thenP

an

converges. If � < 1, thenP

an

diverges.

Proof. Let pn

= n in Kummer’s test, Theorem 4.13. ⇤When Raabe’s test is inconclusive, there are even more delicate tests, such as

the theorem given below.

Theorem 4.15 (Bertrand’s Test). Let

Pan

be a positive series such that an

> 0for all n. Define

↵ = lim infn!1 ln n

✓n

✓an

an+1

� 1

◆� 1

◆ lim sup

n!1ln n

✓n

✓an

an+1

� 1

◆� 1

◆= �.

If ↵ > 1, thenP

an

converges. If � < 1, thenP

an

diverges.

Proof. Let pn

= n ln n in Kummer’s test. ⇤Example 4.11. Consider the series

Xan

=X

nY

k=1

2k

2k + 1

!p

.(33)

It’s of interest to know for what values of p it converges.An easy computation shows that a

n+1

/an

! 1, so the ratio test is inconclusive.Next, try Raabe’s test. Manipulating

limn!1 n

✓✓an

an+1

◆p

� 1

◆= lim

n!1

⇣2n+3

2n+2

⌘p

� 1

1

n

it becomes a 0/0 form and can be evaluated with L’Hospital’s rule.2

limn!1

n2

⇣3+2n

2+2n

⌘p

p

(1 + n) (3 + 2n)=

p

2.

2See §5.2.


4-10 Series

From Raabe’s test, Theorem 4.14, it follows that the series converges when p > 2and diverges when p < 2. Raabe’s test is inconclusive when p = 2.

Now, suppose p = 2. Consider

limn!1 ln n

✓n

✓an

an+1

� 1

◆� 1

◆= � lim

n!1 ln n(4 + 3 n)

4 (1 + n)2= 0

and Bertrand’s test, Theorem 4.15, shows divergence.The series (33) converges only when p > 2.

3. Absolute and Conditional Convergence

The tests given above are for the restricted case when a series has positiveterms. If the stipulation that the series be positive is thrown out, things becomesconsiderably more complicated. But, as is often the case in mathematics, someproblems can be attacked by reducing them to previously solved cases. The followingdefinition and theorem show how to do this for some special cases.

Definition 4.16. LetP

an

be a series. IfP |a

n

| converges, thenP

an

isabsolutely convergent. If it is convergent, but not absolutely convergent, then it isconditionally convergent.

SinceP |a

n

| is a positive series, the preceding tests can be used to determine itsconvergence. The following theorem shows that this is also enough for convergenceof the original series.

Theorem 4.17. If

Pan

is absolutely convergent, then it is convergent.

Proof. Let " > 0. Theorem 4.4 yields an N 2 N such that when n � m � N ,

" >

nX

k=m

|ak

| ��

nX

k=m

ak

�� 0.

Another application Theorem 4.4 finishes the proof. ⇤

Example 4.12. The seriesP

(�1)n+1/n is called the alternating harmonic

series. Since the harmonic series diverges, we see the alternating harmonic series isnot absolutely convergent.

On the other hand, if sn

=P

n

k=1

(�1)k+1/k, then

s2n

=nX

k=1

✓1

2k � 1� 1

2k

◆=

nX

k=1

1

2k(2k � 1)

is a positive series that converges. Since |s2n

� s2n�1

| = 1/2n ! 0, it’s clearthat s

2n�1

must also converge to the same limit. Therefore, sn

converges andP(�1)n+1/n is conditionally convergent. (See Figure 3.)

To summarize: absolute convergence implies convergence, but convergence doesnot imply absolute convergence.

There are a few tests that address conditional convergence. Following are themost well-known.

Theorem 4.18 (Abel’s Test). Let an

and bn

be sequences satisfying

(a) sn

=P

n

k=1

ak

is a bounded sequence.

(b) bn

� bn+1

, 8n 2 N


Absolute and Conditional Convergence 4-11

0 5 10 15 20 25 30 35

0.2

0.4

0.6

0.8

1.0

Figure 3. This plot shows the first 35 partial sums of the alternatingharmonic series. It converges to ln 2 ⇡ 0.6931, which is the level of thedashed line. Notice how the odd partial sums decrease to ln 2 and theeven partial sums increase to ln 2.

(c) bn

! 0

Then

Pan

bn

converges.

To prove this theorem, the following lemma is needed.

Lemma 4.19 (Summation by Parts). For every pair of sequences an

and bn

,

nX

k=1

ak

bk

= bn+1

nX

k=1

ak

�nX

k=1

(bk+1

� bk

)kX

`=1

a`

Proof. Let sn

=P

n

k=1

ak

and s0

= 0. Then

nX

k=1

ak

bk

=nX

k=1

(sk

� sk�1

)bk

=nX

k=1

sk

bk

�nX

k=1

sk�1

bk

=nX

k=1

sk

bk

�

nX

k=1

sk

bk+1

� sn

bn+1

!

= bn+1

nX

k=1

ak

�nX

k=1

(bk+1

� bk

)kX

`=1

a`

⇤

Proof. To prove the theorem, suppose |Pn

k=1

ak

| < M for all n 2 N. Let" > 0 and choose N 2 N such that b

N

< "/2M . If N m < n, use Lemma 4.19 to


4-12 Series

write��

nX

`=m

a`

b`

�� =

��

nX

`=1

a`

b`

�m�1X

`=1

a`

b`

��

=

��bn+1

nX

`=1

a`

�nX

`=1

(b`+1

� b`

)`X

k=1

ak

�

bm

m�1X

`=1

a`

�m�1X

`=1

(b`+1

� b`

)`X

k=1

ak

!��

Using (a) gives

(bn+1

+ bm

)M + MnX

`=m

|b`+1

� b`

|

Now, use (b) to see

= (bn+1

+ bm

)M + M

nX

`=m

(b`

� b`+1

)

and then telescope the sum to arrive at

= (bn+1

+ bm

)M + M(bm

� bn+1

)

= 2Mbm

< 2M"

2M< "

This showsP

n

`=1

a`

b`

satisfies Theorem 4.4, and therefore converges. ⇤

There’s one special case of this theorem that’s most often seen in calculus texts.

Corollary 4.20 (Alternating Series Test). If cn

decreases to 0, then the seriesP(�1)n+1c

n

converges.

Proof. Let an

= (�1)n+1 and bn

= cn

in Theorem 4.18. ⇤

A series such as that in Corollary 4.20 is called an alternating series. Moreformally, if a

n

is a sequence such that an

/an+1

< 0 for all n, thenP

an

is analternating series. Informally, it just means the series alternates between positiveand negative terms.

Example 4.13. Corollary 4.20 provides another way to prove the alternatingharmonic series in Example 4.12 converges. Figure 3 shows how the partial sumsbounce up and down across the sum of the series.

4. Rearrangements of Series

We want to use our standard intuition about adding lists of numbers whenworking with series. But, this intuition has been formed by working with finite sumsand does not always work with series.


Rearrangements of Series 4-13

Example 4.14. SupposeP

(�1)n+1/n = � so thatP

(�1)n+12/n = 2�. It’seasy to show � > 1/2. Consider the following calculation.

2� =X

(�1)n+1

2

n

= 2 � 1 +2

3� 1

2+

2

5� 1

3+ · · ·

Rearrange and regroup.

= (2 � 1) � 1

2+

✓2

3� 1

3

◆� 1

4+

✓2

5� 1

5

◆� 1

6+ · · ·

= 1 � 1

2+

1

3� 1

4+ · · ·

= �

So, � = 2� with � 6= 0. Obviously, rearranging and regrouping of this series is aquestionable thing to do.

In order to carefully consider the problem of rearranging a series, a precisedefinition is needed.

Definition 4.21. Let � : N ! N be a bijection andP

an

be a series. The newseries

Pa�(n)

is a rearrangement of the original series.

The problem with Example 4.14 is that the series is conditionally convergent.Such examples cannot happen with absolutely convergent series. For the most part,absolutely convergent series behave as we are intuitively led to expect.

Theorem 4.22. IfP

an

is absolutely convergent and

Pa�(n)

is a rearrangement

of

Pan

, then

Pa�(n)

=P

an

.

Proof. LetP

an

= s and " > 0. Choose N 2 N so that N m < n impliesPn

k=m

|ak

| < ". Choose M � N such that

{1, 2, . . . , N} ⇢ {�(1), �(2), . . . , �(M)}.

If P > M , then ��

PX

k=1

ak

�PX

k=1

a�(k)

�� 1X

k=N+1

|ak

| "

and both series converge to the same number. ⇤When a series is conditionally convergent, the result of a rearrangement is

unpredictable. This is shown by the following theorem.

Theorem 4.23 (Riemann Rearrangement). If

Pan

is conditionally convergent

and c 2 R [ {�1, 1}, then there is a rearrangement � such that

Pa�(n)

= c.

To prove this, the following lemma is needed.

Lemma 4.24. If

Pan

is conditionally convergent and

bn

=

(an

, an

> 0

0, an

0and c

n

=

(�a

n

, an

< 0

0, an

� 0,

then both

Pbn

and

Pcn

diverge.


4-14 Series

Proof. SupposeP

bn

converges. By assumption,P

an

converges, so Theorem4.2 implies X

cn

=X

bn

�X

an

converges. Another application of Theorem 4.2 showsX

|an

| =X

bn

+X

cn

converges. This is a contradiction of the assumption thatP

an

is conditionallyconvergent, so

Pbn

cannot converge.A similar contradiction arises under the assumption that

Pcn

converges. ⇤Proof. (Theorem 4.23) Let b

n

and cn

be as in Lemma 4.24 and define thesubsequence a+

n

of bn

by removing those terms for which bn

= 0 and an

6= 0. Definethe subsequence a�

n

of cn

by removing those terms for which cn

= 0. The seriesP1n=1

a+

n

andP1

n=1

a�n

are still divergent because only terms equal to zero havebeen removed from b

n

and cn

.Now, let c 2 R and m

0

= n0

= 0. According to Lemma 4.24, we can define thenatural numbers

m1

= min{n :nX

k=1

a+

k

> c} and n1

= min{n :m1X

k=1

a+

k

+nX

`=1

a�`

< c}.

If mp

and np

have been chosen for some p 2 N, then define

mp+1

= min

8<

:n :pX

k=0

m

k+1X

`=m

k

+1

a+

`

�n

k+1X

`=n

k

+1

a�`

!+

nX

`=m

p

+1

a+

`

> c

9=

;

and

np+1

= min

(n :

pX

k=0

m

k+1X

`=m

k

+1

a+

`

�n

k+1X

`=n

k

+1

a�`

!

+

n

p+1X

`=m

p

+1

a+

`

�nX

`=n

p

+1

a�`

< c

9=

; .

Consider the series

a+

1

+ a+

2

+ · · · + a+

m1� a�

1

� a�2

� · · · � a�n1

(34)

+ a+

m1+1

+ a+

m1+2

+ · · · + a+

m2� a�

n1+1

� a�n1+2

� · · · � a�n2

+ a+

m2+1

+ a+

m2+2

+ · · · + a+

m3� a�

n2+1

� a�n2+2

� · · · � a�n3

+ · · ·It is clear this series is a rearrangement of

P1n=1

an

and the way in which mp

andnp

were chosen guarantee that

0 <

p�1X

k=0

0

@m

k+1X

`=m

k

+1

a+

`

�n

kX

`=n

k

+1

a�`

+

m

pX

k=m

p

+1

a+

k

1

A� c a+

m

p

and

0 < c �pX

k=0

m

k+1X

`=m

k

+1

a+

`

�n

kX

`=n

k

+1

a�`

! a�

n

p


5. EXERCISES 4-15

Since both a+

m

p

! 0 and a�n

p

! 0, the result follows from the Squeeze Theorem.The argument when c is infinite is left as Exercise 4.32. ⇤

5. Exercises


4.2. IfP1

n=1

an

is a convergent positive series, then doesP1

n=1

1

1+a

n

converge?

4.3. The series1X

n=1

(an

� an+1

) converges i↵ the sequence an

converges.

4.4. Prove or give a counter example: IfP |a

n

| converges, then nan

! 0.

4.5. If the series a1

+ a2

+ a3

+ · · · converges to S, then so does

a1

+ 0 + a2

+ 0 + 0 + a3

+ 0 + 0 + 0 + a4

+ · · · .(35)

4.6. IfP1

n=1

an

converges and bn

is a bounded monotonic sequence, thenP1n=1

an

bn

converges.

4.7. Let xn

be a sequence with range {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Prove thatP1n=1

xn

10�n converges.

4.8. Write 6.17272727272 · · · as a fraction.

4.9. Prove the ratio test by setting pn

= 1 for all n in Kummer’s test.

4.10. IfP1

n=1

an

andP1

n=1

bn

are positive series and limn!1

an

bn

exists and is not

zero, then both series converge or both series diverge. (This is called the limitcomparison test.)

4.11. Consider the series

1 + 1 +1

2+

1

2+

1

4+

1

4+ · · · = 4.

Show that the ratio test is inconclusive for this series, but the root test gives apositive answer.

4.12. Does1X

n=2

1

n(ln n)2converge?

4.13. Does1

3+

1 ⇥ 2

3 ⇥ 5+

1 ⇥ 2 ⇥ 3

3 ⇥ 5 ⇥ 7+

1 ⇥ 2 ⇥ 3 ⇥ 4

3 ⇥ 5 ⇥ 7 ⇥ 9+ · · ·

converge?

4.14. For what values of p does✓

1

2

◆p

+

✓1 · 3

2 · 4

◆p

+

✓1 · 3 · 5

2 · 4 · 6

◆p

+ · · ·


4-16 Series

converge?

4.15. Find sequences an

and bn

satisfying:

(a) an

> 0, 8n 2 N and an

! 0;(b) B

n

=P

n

k=1

bk

is a bounded sequence; and,(c)

P1n=1

an

bn

diverges.

4.16. Let an

be a sequence such that a2n

! A and a2n

� a2n�1

! 0. Thenan

! A.

4.17. Prove Bertrand’s test, Theorem 4.15.

4.18. LetP

an

be a positive series. Prove thatP

an

converges if and only if thereis a sequence of positive numbers p

n

and ↵ > 0 such that

limn!1 p

n

an

an+1

� pn+1

= ↵.

(Hint: If s =P

an

and sn

=P

n

k=1

ak

, then let pn

= (s � sn

)/an

.)

4.19. Prove thatP1

n=0

x

n

n!

converges for all x 2 R.

4.20. Find all values of x for whichP1

k=0

k2(x + 3)k converges.

4.21. For what values of x does the series1X

n=1

(�1)n+1x2n�1

2n � 1(40)

converge?

4.22. For what values of x does1X

n=1

(x + 3)n

n4nconverge absolutely, converge

conditionally or diverge?

4.23. For what values of x does1X

n=1

n + 6

n2(x � 1)nconverge absolutely, converge

conditionally or diverge?

4.24. For what positive values of ↵ doesP1

n=1

↵nn↵ converge?

4.25. Prove thatX

cosn⇡

3sin

⇡

nconverges.

4.26. For a seriesP1

k=1

an

with partial sums sn

, define

�n

=1

n

nX

k=1

sn

.

Prove that ifP1

k=1

an

= s, then �n

! s. Find an example where �n

converges, butP1k=1

an

does not. (If �n

converges, the sequence is said to be Cesaro summable.)


5. EXERCISES 4-17

4.27. If an

is a sequence with a subsequence bn

, thenP1

n=1

bn

is a subseries

ofP1

n=1

an

. Prove that if every subseries ofP1

n=1

an

converges, thenP1

n=1

an

converges absolutely.

4.28. IfP1

n=1

an

is a convergent positive series, then so isP1

n=1

a2

n

. Give anexample to show the converse is not true.

4.29. Prove or give a counter example: IfP1

n=1

an

converges, thenP1

n=1

a2

n

converges.

4.30. For what positive values of ↵ doesP1

n=1

↵nn↵ converge?

4.31. If an

� 0 for all n 2 N and there is a p > 1 such that limn!1 npa

n

existsand is finite, then

P1n=1

an

converges. Is this true for p = 1?

4.32. Finish the proof of Theorem 4.23.

4.33. Leonhard Euler started with the equationx

x � 1+

x

1 � x= 0,

transformed it to1

1 � 1/x+

x

1 � x= 0,

and then used geometric series to write it as

· · · +1

x2

+1

x+ 1 + x + x2 + x3 + · · · = 0.(44)

Show how Euler did his calculation and find his mistake.


CHAPTER 5

The Topology of R

1. Open and Closed Sets

Definition 5.1. A set G ⇢ R is open if for every x 2 G there is an " > 0 suchthat (x � ", x + ") ⇢ G. A set F ⇢ R is closed if F c is open.

The idea is that about every point of an open set, there is some room inside theset on both sides of the point. It is easy to see that any open interval (a, b) is anopen set because if a < x < b and " = min{x � a, b � x}, then (x � ", x + ") ⇢ (a, b).

Open half-lines are also open sets. For example, let x 2 (a, 1) and " = x � a.Then (x � ", x + ") ⇢ (a, 1).

A singleton set {a} is closed. To see this, suppose x 6= a and " = |x � a|. Thena /2 (x � ", x + "), and {a}c must be open. The definition of a closed set implies{a} is closed.

A common mistake is to assume all sets are either open or closed. Most sets areneither open nor closed. For example, if S = [a, b) for some numbers a < b, then nomatter the size of " > 0, neither (a � ", a + ") nor (b � ", b + ") are contained in Sor Sc.

Theorem 5.2. (a) If {G�

: � 2 ⇤} is a collection of open sets, thenS�2⇤

G�

is open.

(b) If {Gk

: 1 k n} is a finite collection of open sets, then

Tn

k=1

Gk

is

open.

(c) Both ; and R are open.

Proof. (a) If x 2 S�2⇤

G�

, then there is a �x

2 ⇤ such that x 2 G�

x

.Since G

�

x

is open, there is an " > 0 such that x 2 (x � ", x + ") ⇢ G�

x

⇢S�2⇤

G�

. This showsS

�2⇤

G�

is open.(b) If x 2 Tn

k=1

Gk

, then x 2 Gk

for 1 k n. For each Gk

there is an "k

such that (x � "k

, x + "k

) ⇢ Gk

. Let " = min{"k

: 1 k n}. Then(x � ", x + ") ⇢ G

k

for 1 k n, so (x � ", x + ") ⇢ Tn

k=1

Gk

. ThereforeTn

k=1

Gk

is open.(c) ; is open vacuously. R is obviously open.

⇤1.1. Topological Spaces. The preceding theorem provides the starting point

for a fundamental area of mathematics called topology. The properties of the opensets of R motivate the following definition.

Definition 5.3. For X a set, not necessarily a subset of R, let T ⇢ P(X). Theset T is called a topology on X if {;, X} ⇢ T and T is closed under arbitrary unionsand finite intersections. The pair (X, T ) is called a topological space. The elementsof T are the open sets of the topological space. The closed sets of the topologicalspace are those sets whose complements are open.

5-1

5-2 CHAPTER 5. THE TOPOLOGY OF R

If O = {G ⇢ R : G is open}, then Theorem 5.2 shows (R, O) is a topologicalspace called the standard topology on R. While the standard topology is the mostwidely used one, there are many other possible topologies on R. For example,R = {(a, 1) : a 2 R} [ {R, ;} is a topology on R called the right ray topology. Thecollection F = {S ⇢ R : Sc is finite} [ {;} is called the finite complement topology.The study of topologies is a huge subject, further discussion of which would take ustoo far afield. There are many fine books on the subject ([13]) to which one canrefer.

Applying DeMorgan’s laws to the parts of Theorem 5.2 immediately yields thefollowing, which is sometimes also used as the definition of the standard topology.

Corollary 5.4. (a) If {F�

: � 2 ⇤} is a collection of closed sets, thenT�2⇤

G�

is closed.

(b) If {Fk

: 1 k n} is a finite collection of closed sets, then

Sn

k=1

Fk

is

closed.

(c) Both ; and R are closed.

Surprisingly, ; and R are both open and closed. They are the only subsets ofthe standard topology with this dual personality. Other topologies may have manysets that are both open and closed. Such sets in a topological space are often calledclopen.

1.2. Limit Points and Closure.

Definition 5.5. x0

is a limit point

1 of S ⇢ R if for every " > 0,

(x0

� ", x0

+ ") \ S \ {x0

} 6= ;.

The derived set of S is

S0 = {x : x is a limit point of S}.

A point x0

2 S \ S0 is an isolated point of S.

Notice that limit points of S need not be elements of S, but isolated points of Smust be elements of S. In a sense, limit points and isolated points are at oppositeextremes. The definitions can be restated as follows:

x0

is a limit point of S i↵ 8" > 0 (S \ (x0

� ", x0

+ ") \ {x0

} 6= ;)

x0

2 S is an isolated point of S i↵ 9" > 0 (S \ (x0

� ", x0

+ ") \ {x0

} = ;)

Example 5.1. If S = (0, 1], then S0 = [0, 1] and S has no isolated points.

Example 5.2. If T = {1/n : n 2 Z \ {0}}, then T 0 = {0} and all points of Tare isolated points of T .

Theorem 5.6. x0

is a limit point of S i↵ there is a sequence xn

2 S \ {x0

}such that x

n

! x0

.

Proof. ()) For each n 2 N choose xn

2 S \ (x0

� 1/n, x0

+1/n) \ {x0

}. Then|x

n

� x0

| < 1/n for all n 2 N, so xn

! x0

.(() Suppose x

n

is a sequence from xn

2 S \{x0

} converging to x0

. If " > 0, thedefinition of convergence for a sequence yields an N 2 N such that whenever n � N ,

1This use of the term limit point is not universal. Some authors use the term accumulationpoint. Others use condensation point, although this is more often used for those cases when everyneighborhood of x

0

intersects S in an uncountable set.


1. OPEN AND CLOSED SETS 5-3

then xn

2 S \ (x0

� ", x0

+ ") \ {x0

}. This shows S \ (x0

� ", x0

+ ") \ {x0

} 6= ;,and x

0

must be a limit point of S. ⇤

There is some common terminology making much of this easier to state. Ifx

0

2 R and G is an open set containing x0

, then G is called a neighborhood of x0

.The observations given above can be restated in terms of neighborhoods.

Corollary 5.7. Let S ⇢ R.(a) x

0

is a limit point of S i↵ every neighborhood of x0

contains an infinite

number of points from S.(b) x

0

2 S is an isolated point of S i↵ there is a neighborhood of x0

containing only a finite number of points from S.

Following is a generalization of Theorem 3.16.

Theorem 5.8 (Bolzano-Weierstrass Theorem). If S ⇢ R is bounded and infinite,

then S0 6= ;.Proof. For the purposes of this proof, if I = [a, b] is a closed interval, let

IL = [a, (a + b)/2] be the closed left half of I and IR = [(a + b)/2, b] be the closedright half of I.

Suppose S is a bounded and infinite set. The assumption that S is boundedimplies the existence of an interval I

1

= [�B, B] containing S. Since S is infinite,at least one of the two sets IL

1

\ S or IR1

\ S is infinite. Let I2

be either IL1

or IR1

such that I2

\ S is infinite.If I

n

is such that In

\ S is infinite, let In+1

be either ILn

or IRn

, where In+1

\ Sis infinite.

In this way, a nested sequence of intervals, In

for n 2 N, is defined such thatIn

\ S is infinite for all n 2 N and the length of In

is B/2n�2 ! 0. According tothe Nested Interval Theorem, there is an x

0

2 R such thatT

n2N In

= {x0

}.To see that x

0

is a limit point of S, let " > 0 and choose n 2 N so thatB/2n�2 < ". Then x

0

2 In

⇢ (x0

� ", x0

+ "). Since In

\ S is infinite, we seeS \ (x

0

� ", x0

+ ") \ {x0

} 6= ;. Therefore, x0

is a limit point of S. ⇤

Theorem 5.9. A set S ⇢ R is closed i↵ it contains all its limit points.

Proof. ()) Suppose S is closed and x0

is a limit point of S. If x0

/2 S, thenSc open implies the existence of " > 0 such that (x

0

� ", x0

+ ") \ S = ;. Thiscontradicts the fact that x

0

is a limit point of S. Therefore, x0

2 S, and S containsall its limit points.

(() Since S contains all its limit points, if x0

/2 S, there must exist an " > 0such that (x

0

� ", x0

+ ") \ S = ;. It follows from this that Sc is open. Therefore Sis closed. ⇤

Definition 5.10. The closure of a set S is the set S = S [ S0.

For the set S of Example 5.1, S = [0, 1]. In Example 5.2, T = {1/n : n 2Z \ {0}} [ {0}. According to Theorem 5.9, the closure of any set is a closed set.A useful way to think about this is that S is the smallest closed set containing S.This is made more precise in Exercise 5.2.

Following is a generalization of Theorem 3.20.



Corollary 5.11. If {Fn

: n 2 N} is a nested collection of nonempty closed

and bounded sets, then

Tn2N F

n

6= ;.Proof. Form a sequence x

n

by choosing xn

2 Fn

for each n 2 N. Since the Fn

are nested, {xn

: n 2 N} ⇢ F1

, and the boundedness of F1

implies xn

is a boundedsequence. An application of Theorem 3.16 yields a subsequence y

n

of xn

such thatyn

! y. It su�ces to prove y 2 Fn

for all n 2 N.To do this, fix n

0

2 N. Because yn

is a subsequence of xn

and xn0 2 F

n0 , it iseasy to see y

n

2 Fn0 for all n � n

0

. Using the fact that yn

! y, we see y 2 F 0n0

.Since F

n0 is closed, Theorem 5.9 shows y 2 Fn0 . ⇤

2. Relative Topologies and Connectedness

2.1. Relative Topologies. Another useful topological notion is that of arelative or subspace topology. In our case, this amounts to using the standardtopology on R to generate a topology on a subset of R. The definition is as follows.

Definition 5.12. Let X ⇢ R. The set S ⇢ X is relatively open in X, if thereis a set G, open in R, such that S = G \ X. The set T ⇢ X is relatively closed inX, if there is a set F , closed in R, such that S = F \ X. (If there is no chance forconfusion, the simpler terminology open in X and closed in X is sometimes used.)

It is left as exercises to show that if X ⇢ R and S consists of all relatively opensubsets of X, then (X, S) is a topological space and T is relatively closed in X, ifX \ T 2 S. (See Exercises 5.10 and 5.11.)

Example 5.3. Let X = [0, 1]. The subsets [0, 1/2) = X \ (�1, 1/2) and(1/4, 1] = X \ (1/4, 2) are both relatively open in X.

Example 5.4. If X = Q, then {x 2 Q : �p2 < x <

p2} = (�p

2,p

2) \ Q =[�p

2,p

2] \ Q is clopen relative to Q.

2.2. Connected Sets. One place where the relative topologies are useful isin relation to the following definition.

Definition 5.13. A set S ⇢ R is disconnected if there are two open intervals Uand V such that U \ V = ;, U \ S 6= ;, V \ S 6= ; and S ⇢ U [ V . Otherwise, it isconnected. The sets U \ S and V \ S are said to be a separation of S.

In other words, S is disconnected if it can be written as the union of two disjointand nonempty sets that are both relatively open in S. Since both these sets arecomplements of each other relative to S, they are both clopen in S. This, in turn,implies S is disconnected if it has a proper relatively clopen subset.

Example 5.5. Let S = {x} be a set containing a single point. S is connectedbecause there cannot exist nonempty disjoint open sets U and V such that S\U 6= ;and S \ V 6= ;. The same argument shows that ; is connected.

Example 5.6. If S = [�1, 0) [ (0, 1], then U = (�2, 0) and V = (0, 2) are opensets such that U \ V = ;, U \ S 6= ;, V \ S 6= ; and S ⇢ U [ V . This shows S isdisconnected.

Example 5.7. The sets U = (�1,p

2) and V = (p

2, 1) are open sets suchthat U \ V = ;, U \ Q 6= ;, V \ Q 6= ; and Q ⇢ U [ V = R \ {p

2}. This showsQ is disconnected. In fact, the only connected subsets of Q are single points. Setswith this property are often called totally disconnected.


Section 3: Covering Properties and Compactness on R 5-5

The notion of connectedness is not really very interesting on R because theconnected sets are exactly what one would expect. It becomes more complicated inhigher dimensional spaces.

Theorem 5.14. A nonempty set S ⇢ R is connected i↵ it is either a single

point or an interval.

Proof. ()) If S is not a single point or an interval, there must be numbersr < s < t such that r, t 2 S and s /2 S. In this case, the sets U = (�1, s) andV = (s, 1) are a disconnection of S.

(() It was shown in Example 5.5 that a set containing a single point is connected.So, assume S is an interval.

Suppose S is not connected with U and V forming a disconnection of S. Chooseu 2 U \ S and v 2 V \ S. There is no generality lost by assuming u < v, so that[u, v] ⇢ S.

Let A = {x : [u, x) ⇢ U}.We claim A 6= ;. To see this, use the fact that U is open to find " 2 (0, v � u)

such that (u � ", u + ") ⇢ U . Then u < u + "/2 < v, so u + "/2 2 A.Define w = lub A.Since v 2 V it is evident u < w v and w 2 S.If w 2 U , then u < w < v and there is " 2 (0, v�w) such that (w�", w+") ⇢ U

and [u, w + ") ⇢ S because w + " < v. This clearly contradicts the definition of w,so w /2 U .

If w 2 V , then there is an " > 0 such that (w � ", w] ⇢ V . In particular, thisshows w = lub A w � " < w. This contradiction forces the conclusion that w /2 V .

Now, putting all this together, we see w 2 S ⇢ U [ V and w /2 U [ V . This is aclear contradiction, so we’re forced to conclude there is no separation of S. ⇤

3. Covering Properties and Compactness on R

3.1. Open Covers.

Definition 5.15. Let S ⇢ R. A collection of open sets, O = {G�

: � 2 ⇤}, isan open cover of S if S ⇢ S

G2O G. If O0 ⇢ O is also an open cover of S, then O0 isan open subcover of S from O.

Example 5.8. Let S = (0, 1) and O = {(1/n, 1) : n 2 N}. We claim that O isan open cover of S. To prove this, let x 2 (0, 1). Choose n

0

2 N such that 1/n0

< x.Then

x 2 (1/n0

, 1) ⇢[

n2N(1/n, 1) =

[

G2OG.

Since x is an arbitrary element of (0, 1), it follows that (0, 1) =S

G2O G.Suppose O0 is any infinite subset of O and x 2 (0, 1). Since O0 is infinite, there

exists an n 2 N such that x 2 (1/n, 1) 2 O0. The rest of the proof proceeds asabove.

On the other hand, if O0 is a finite subset of O, then let M = max{n : (1/n, 1) 2O0}. If 0 < x < 1/M , it is clear that x /2 S

G2O0 G, so O0 is not an open cover of(0, 1).

2

January 19, 2015




Example 5.9. Let T = [0, 1) and 0 < " < 1. If

O = {(1/n, 1) : n 2 N} [ {(�", ")},

then O is an open cover of T .It is evident that any open subcover of T from O must contain (�", "), because

that is the only element of O which contains 0. Choose n 2 N such that 1/n < ".Then O0 = {(�", "), (1/n, 1)} is an open subcover of T from O which contains onlytwo elements.

Theorem 5.16 (Lindelof Property). If S ⇢ R and O is any open cover of S,then O contains a subcover with a countable number of elements.

Proof. Let O = {G�

: � 2 ⇤} be an open cover of S ⇢ R. Since O is an opencover of S, for each x 2 S there is a �

x

2 ⇤ and numbers px

, qx

2 Q satisfyingx 2 (p

x

, qx

) ⇢ G�

x

2 O. The collection T = {(px

, qx

) : x 2 S} is an open cover ofS.

Thinking of the collection T = {(px

, qx

) : x 2 S} as a set of ordered pairs ofrational numbers, it is seen that card (T ) card (Q ⇥ Q) = @

0

, so T is countable.For each interval I 2 T , choose a �

I

2 ⇤ such that I ⇢ G�

I

. Then

S ⇢[

I2TI ⇢

[

I2TG

�

I

shows O0 = {G�

I

: I 2 T } ⇢ O is an open subcover of S from O. Also, card (O0) card (T ) @

0

, so O0 is a countable open subcover of S from O. ⇤

Corollary 5.17. Any open subset of R can be written as a countable union of

pairwise disjoint open intervals.

Proof. Let G be open in R. For x 2 G let ↵x

= glb {y : (y, x] ⇢ G} and�x

= lub {y : [x, y) ⇢ G}. The fact that G is open implies ↵x

< x < �x

. DefineIx

= (↵x

, �x

).Then I

x

⇢ G. To see this, suppose x < w < �x

. Choose y 2 (w, �x

). Thedefinition of �

x

guarantees w 2 (x, y) ⇢ G. Similarly, if ↵x

< w < x, it follows thatw 2 G.

This shows O = {Ix

: x 2 G} has the property that G =S

x2G

Ix

.Suppose x, y 2 G and I

x

\ Iy

6= ;. There is no generality lost in assumingx < y. In this case, there must be a w 2 (x, y) such that w 2 I

x

\ Iy

. We knowfrom above that both [x, w] ⇢ G and [w, y] ⇢ G, so [x, y] ⇢ G. It follows that↵x

= ↵y

< x < y < �x

= �y

and Ix

= Iy

.From this we conclude O consists of pairwise disjoint open intervals.To finish, apply Theorem 5.16 to extract a countable subcover from O. ⇤

Corollary 5.17 can also be proved by a di↵erent strategy. Instead of usingTheorem 5.16 to extract a countable subcover, we could just choose one rationalnumber from each interval in the cover. The pairwise disjointness of the intervals inthe cover guarantee this will give a bijection between O and a subset of Q. Thismethod has the advantage of showing O itself is countable from the start.

3.2. Compact Sets. There is a class of sets for which the conclusion ofLindelof’s theorem can be strengthened.


Section 3: Covering Properties and Compactness on R 5-7

Definition 5.18. An open cover O of a set S is a finite cover, if O has only afinite number of elements. The definition of a finite subcover is analogous.

Definition 5.19. A set K ⇢ R is compact, if every open cover of K contains afinite subcover.

Theorem 5.20 (Heine-Borel). A set K ⇢ R is compact i↵ it is closed and

bounded.

Proof. ()) Suppose K is unbounded. The collection O = {(�n, n) : n 2 N}is an open cover of K. If O0 is any finite subset of O, then

SG2O0 G is a bounded

set and cannot cover the unbounded set K. This shows K cannot be compact, andevery compact set must be bounded.

Suppose K is not closed. According to Theorem 5.9, there is a limit point x of Ksuch that x /2 K. Define O = {[x�1/n, x+1/n]c : n 2 N}. Then O is a collection ofopen sets and K ⇢ S

G2O G = R\{x}. Let O0 = {[x�1/ni

, x+1/ni

]c : 1 i N}be a finite subset of O and M = max{n

i

: 1 i N}. Since x is a limit point of K,there is a y 2 K \ (x�1/M, x+1/M). Clearly, y /2 S

G2O0 G = [x�1/M, x+1/M ]c,so O0 cannot cover K. This shows every compact set must be closed.

(() Let K be closed and bounded and let O be an open cover of K. ApplyingTheorem 5.16, if necessary, we can assume O is countable. Thus, O = {G

n

: n 2 N}.For each n 2 N, define

Fn

= K \n[

i=1

Gi

= K \n\

i=1

Gc

i

.

Then Fn

is a sequence of nested, bounded and closed subsets of K. Since O coversK, it follows that \

n2NFn

⇢ K \[

n2NG

n

= ;.

According to the Corollary 5.11, the only way this can happen is if Fn

= ; for somen 2 N. Then K ⇢ S

n

i=1

Gi

, and O0 = {Gi

: 1 i n} is a finite subcover of Kfrom O. ⇤

Compactness shows up in several di↵erent, but equivalent ways on R. We’vealready seen several of them, but their equivalence is not obvious. The followingtheorem shows a few of the most common manifestations of compactness.

Theorem 5.21. Let K ⇢ R. The following statements are equivalent to each

other.

(a) K is compact.

(b) K is closed and bounded.

(c) Every infinite subset of K has a limit point in K.

(d) Every sequence {an

: n 2 N} ⇢ K has a subsequence converging to an

element of K.

(e) If Fn

is a nested sequence of nonempty relatively closed subsets of K, thenTn2N F

n

6= ;.Proof. (a) () (b) is the Heine-Borel Theorem, Theorem 5.20.That (b))(c) is the Bolzano-Weierstrass Theorem, Theorem 5.8.(c))(d) is contained in the sequence version of the Bolzano-Weierstrass theorem,

Theorem 3.16.



(d))(e) Let Fn

be as in (e). For each n 2 N, choose an

2 Fn

. By assumption,an

has a convergent subsequence bn

! b. Each Fn

contains a tail of the sequencebn

, so b 2 F 0n

⇢ Fn

for each n. Therefore, b 2 Tn2N F

n

, and (e) follows.(e))(b). Suppose K is such that (e) is true.Let F

n

= K \ ((�1, �n] [ [n, 1)). Then Fn

is a sequence of sets whichare relatively closed in K such that

Tn2N F

n

= ;. If K is unbounded, thenFn

6= ;, 8n 2 N, and a contradiction of (e) is evident. Therefore, K must bebounded.

If K is not closed, then there must be a limit point x of K such that x /2 K.Define a sequence of relatively closed and nested subsets of K by F

n

= [x � 1/n, x +1/n] \ K for n 2 N. Then

Tn2N F

n

= ;, because x /2 K. This contradiction of (e)shows that K must be closed. ⇤

These various ways of looking at compactness have been given di↵erent names bytopologists. Property (c) is called limit point compactness and (d) is called sequential

compactness. There are topological spaces in which various of the equivalences donot hold.

4. More Small Sets

This is an advanced section that can be omitted.

We’ve already seen one way in which a subset of R can be considered small—ifits cardinality is at most @

0

. Such sets are small in the set-theoretic sense. Thissection shows how sets can be considered small in the metric and topological senses.

4.1. Sets of Measure Zero. An interval is the only subset of R for whichmost people could immediately come up with some sort of measure — its length.This idea of measuring a set by length can be generalized. For example, we knowevery open set can be written as a countable union of open intervals, so it is naturalto assign the sum of the lengths of its component intervals as the measure of theset. Discounting some technical di�culties, such as components with infinite length,this is how the Lebesgue measure of an open set is defined. It is possible to assign ameasure to more complicated sets, but we’ll only address the special case of setswith measure zero, sometimes called Lebesgue null sets.

Definition 5.22. A set S ⇢ R has measure zero if given any " > 0 there is asequence (a

n

, bn

) of open intervals such that

S ⇢[

n2N(a

n

, bn

) and1X

n=1

(bn

� an

) < ".

Such sets are small in the metric sense.

Example 5.10. If S = {a} contains only one point, then S has measure zero.To see this, let " > 0. Note that S ⇢ (a � "/4, a + "/4) and this single interval haslength "/2 < ".

There are complicated sets of measure zero, as we’ll see later. For now, we’llstart with a simple theorem.

Theorem 5.23. If {Sn

: n 2 N} is a countable collection of sets of measure

zero, then

Sn2N S

n

has measure zero.


4. MORE SMALL SETS 5-9

Proof. Let " > 0. For each n, let {(an,k

, bn,k

) : k 2 N} be a collection ofintervals such that

Sn

⇢[

k2N(a

n,k

, bn,k

) and1X

k=1

(bn,k

� an,k

) <"

2n.

Then[

n2NSn

⇢[

n2N

[

k2N(a

n,k

, bn,k

) and1X

n=1

1X

k=1

(bn,k

� an,k

) <

1X

n=1

"

2n= ".

⇤Combining this with Example 5.10 gives the following corollary.

Corollary 5.24. Every countable set has measure zero.

The rational numbers is a large set in the sense that every interval contains arational number. But we now see it is small in both the set theoretic and metricsenses because it is countable and of measure zero.

Uncountable sets of measure zero are constructed in Section 4.3.There is some standard terminology associated with sets of measure zero. If a

property P is true, except on a set of measure zero, then it is often said “P is truealmost everywhere” or “almost every point satisfies P .” It is also said “P is true ona set of full measure.” For example, “Almost every real number is irrational’.’ or“The irrational numbers are a set of full measure.”

4.2. Dense and Nowhere Dense Sets. We begin by considering a way thata set can be considered topologically large in an interval. If I is any interval, recallfrom Corollary 2.22 that I \ Q 6= ; and I \ Qc 6= ;. An immediate consequence ofthis is that every real number is a limit point of both Q and Qc. In this sense, therational and irrational numbers are both uniformly distributed across the numberline. This idea is generalized in the following definition.

Definition 5.25. Let A ⇢ B ⇢ R. A is said to be dense in B, if B ⇢ A.

Both the rational and irrational numbers are dense in every interval. Corollary5.17 shows the rational and irrational numbers are dense in every open set. It’snot hard to construct other sets dense in every interval. For example, the set ofdyadic numbers, D = {p/2q : p, q 2 Z}, is dense in every interval — and dense inthe rational numbers.

On the other hand, Z is not dense in any interval because it’s closed and containsno interval. If A ⇢ B, where B is an open set, then A is not dense in B, if Acontains any interval-sized gaps.

Theorem 5.26. Let A ⇢ B ⇢ R. A is dense in B i↵ whenever I is an open

interval such that I \ B 6= ;, then I \ A 6= ;.Proof. ()) Assume there is an open interval I such that I \ B 6= ; and

I \ A = ;. If x 2 I \ B, then I is a neighborhood of x that does not intersect A.Definition 5.5 shows x /2 A0 ⇢ A, a contradiction of the assumption that B ⇢ A.This contradiction implies that whenever I \ B 6= ;, then I \ A 6= ;.

(() If x 2 B \ A = A, then x 2 A. Assume x 2 B \ A. By assumption, foreach n 2 N, there is an x

n

2 (x � 1/n, x + 1/n) \ A. Since xn

! x, this showsx 2 A0 ⇢ A. It now follows that B ⇢ A. ⇤



If B ⇢ R and I is an open interval with I \ B 6= ;, then I \ B is often called aportion of B. The previous theorem says that A is dense in B i↵ every portion of Bintersects A.

If A being dense in B is thought of as A being a large subset of B, then perhapswhen A is not dense in B, it can be thought of as a small subset. But, thinking ofA as being small when it is not dense isn’t quite so clear when it is noticed that Acould still be dense in some portion of B, even if it isn’t dense in B. To make Abe a truly small subset of B in the topological sense, it should not be dense in anyportion of B. The following definition gives a way to assure this is true.

Definition 5.27. Let A ⇢ B ⇢ R. A is said to be nowhere dense in B if B \ Ais dense in B.

The following theorem shows that a nowhere dense set is small in the sensementioned above because it fails to be dense in any part of B.

Theorem 5.28. Let A ⇢ B ⇢ R. A is nowhere dense in B i↵ for every open

interval I such that I \ B 6= ;, there is an open interval J ⇢ I such that J \ B 6= ;and J \ A = ;.

Proof. ()) Let I be an open interval such that I \ B 6= ;. By assumption,B \ A is dense in B, so Theorem 5.26 implies I \ (B \ A) 6= ;. If x 2 I \ (B \ A),then there is an open interval J such that x 2 J ⇢ I and J \ A = ;. Since A ⇢ A,this J satisfies the theorem.

(() Let I be an open interval with I \ B 6= ;. By assumption, there is an openinterval J ⇢ I such that J \ A = ;. It follows that J \ A = ;. Theorem 5.26 impliesB \ A is dense in B. ⇤

Example 5.11. Let G be an open set that is dense in R. If I is any openinterval, then Theorem 5.26 implies I \ G 6= ;. Because G is open, if x 2 I \ G,then there is an open interval J such that x 2 J ⇢ G. Now, Theorem 5.28 showsGc is nowhere dense.

The nowhere dense sets are topologically small in the following sense.

Theorem 5.29 (Baire). If I is an open interval, then I cannot be written as a

countable union of nowhere dense sets.

Proof. Let An

be a sequence of nowhere dense subsets of I. According toTheorem 5.28, there is a bounded open interval J

1

⇢ I such that J1

\ A1

= ;. Byshortening J

1

a bit at each end, if necessary, it may be assumed that J1

\ A1

= ;.Assume J

n

has been chosen for some n 2 N. Applying Theorem 5.28 again, choosean open interval J

n+1

as above so Jn+1

\ An+1

= ;. Corollary 5.11 shows

I \[

n2NA

n

�\

n2NJn

6= ;

and the theorem follows. ⇤Theorem 5.29 is called the Baire category theorem because of the terminology

introduced by Rene-Louis Baire in 1899.3 He said a set was of the first category, ifit could be written as a countable union of nowhere dense sets. An easy example of

3Rene-Louis Baire (1874-1932) was a French mathematician. He proved the Baire categorytheorem in his 1899 doctoral dissertation.


4. MORE SMALL SETS 5-11

such a set is any countable set, which is a countable union of singletons. All othersets are of the second category.4 Theorem 5.29 can be stated as “Any open intervalis of the second category.” Or, more generally, as “Any nonempty open set is of thesecond category.”

A set is called a G�

set, if it is the countable intersection of open sets. It iscalled an F

�

set, if it is the countable union of closed sets. DeMorgan’s laws showthat the complement of an F

�

set is a G�

set and vice versa. It’s evident that anycountable subset of R is an F

�

set, so Q is an F�

set.On the other hand, suppose Q is a G

�

set. Then there is a sequence of open setsG

n

such that Q =T

n2N Gn

. Since Q is dense, each Gn

must be dense and Example5.11 shows Gc

n

is nowhere dense. From DeMorgan’s law, R = Q[Sn2N Gc

n

, showingR is a first category set and violating the Baire category theorem. Therefore, Q isnot a G

�

set.Essentially the same argument shows any countable subset of R is a first category

set. The following protracted example shows there are uncountable sets of the firstcategory.

4.3. The Cantor Middle-Thirds Set. One particularly interesting exampleof a nowhere dense set is the Cantor Middle-Thirds set, introduced by the Germanmathematician Georg Cantor in 1884.5 It has many strange properties, only a fewof which will be explored here.

To start the construction of the Cantor Middle-Thirds set, let C0

= [0, 1] andC

1

= I1

\ (1/3, 2/3) = [0, 1/3] [ [2/3, 1]. Remove the open middle thirds of theintervals comprising C

1

, to get

C2

=

0,

1

9

�[2

9,1

3

�[2

3,7

9

�[8

9, 1

�.

Continuing in this way, if Cn

consists of 2n pairwise disjoint closed intervals each oflength 3�n, construct C

n+1

by removing the open middle third from each of thoseclosed intervals, leaving 2n+1 closed intervals each of length 3�(n+1). This gives anested sequence of closed sets C

n

each consisting of 2n closed intervals of length3�n. (See Figure 1.) The Cantor Middle-Thirds set is

C =\

n2NC

n

.

Corollaries 5.4 and 5.11 show C is closed and nonempty. In fact, the latter isapparent because {0, 1/3, 2/3, 1} ⇢ C

n

for every n. At each step in the construction,2n open middle thirds, each of length 3�(n+1) were removed from the intervalscomprising C

n

. The total length of the open intervals removed was1X

n=0

2n

3n+1

=1

3

1X

n=0

✓2

3

◆n

= 1.

Because of this, Example 5.11 implies C is nowhere dense in [0, 1].

4Baire did not define any categories other than these two. Some authors call first categorysets meager sets, so as not to make students fruitlessly wait for definitions of third, fourth andfifth category sets.

5Cantor’s original work [6] is reprinted with an English translation in Edgar’s Classics onFractals [8]. Cantor only mentions his eponymous set in passing and it had actually been presentedearlier by others.



Figure 1. Shown here are the first few steps in the construction of theCantor Middle-Thirds set.

C is an example of a perfect set; i.e., a closed set all of whose points are limitpoints of itself. (See Exercise 5.23.) Any closed set without isolated points is perfect.The Cantor Middle-Thirds set is interesting because it is an example of a perfectset without any interior points. Many people call any bounded perfect set withoutinterior points a Cantor set. Most of the time, when someone refers to the Cantorset, they mean C.

There is another way to view the Cantor set. Notice that at the nth stage ofthe construction, removing the middle thirds of the intervals comprising C

n

removesthose points whose base 3 representation contains the digit 1 in position n + 1.Then,

C =

(c =

1X

n=1

cn

3n: c

n

2 {0, 2})

.(47)

So, C consists of all numbers c 2 [0, 1] that can be written in base 3 without usingthe digit 1.6

If c 2 C, then (47) shows c =P1

n=1

c

n

3

n

for some sequence cn

with range in{0, 2}. Moreover, every such sequence corresponds to a unique element of C. Define� : C ! [0, 1] by

�(c) = �

1X

n=1

cn

3n

!=

1X

n=1

cn

/2

2n.(48)

Since cn

is a sequence from {0, 2}, then cn

/2 is a sequence from {0, 1} and �(c) canbe considered the binary representation of a number in [0, 1]. According to (47), itfollows that � is a surjection and

�(C) =

( 1X

n=1

cn

/2

2n: c

n

2 {0, 2})

=

( 1X

n=1

bn

2n: b

n

2 {0, 1})

= [0, 1].

Therefore, card (C) = card ([0, 1]) > @0

.The Cantor set is topologically small because it is nowhere dense and large from

the set-theoretic viewpoint because it is uncountable.The Cantor set is also a set of measure zero. To see this, let C

n

be as in theconstruction of the Cantor set given above. Then C ⇢ C

n

and Cn

consists of 2n

pairwise disjoint closed intervals each of length 3�n. Their total length is (2/3)n.Given " > 0, choose n 2 N so (2/3)n < "/2. Each of the closed intervals comprising

6Notice that 1 =P1

n=1

2/3n, 1/3 =P1

n=2

2/3n, etc.


5. EXERCISES 5-13

Cn

can be placed inside a slightly longer open interval so the sums of the lengths ofthe 2n open intervals is less than ".

5. Exercises

5.1. If G is an open set and F is a closed set, then G \ F is open and F \ G isclosed.

5.2. Let S ⇢ R and F = {F : F is closed and S ⇢ F}. Prove S =T

F2F F . This

proves that S is the smallest closed set containing S.

5.3. If S and T are subsets of R, then S [ T = S [ T .

5.4. If S is a finite subset of R, then S is closed.

5.5. Q is neither open nor closed.

5.6. A set S ⇢ R is open i↵ @S \ S = ;. (@S is the set of boundary points of S.)

5.7. Find a sequence of open sets Gn

such thatT

n2N Gn

is neither open norclosed.

5.8. An open set G is called regular if G = (G)�. Find an open set that is notregular.

5.9. Let R = {(x, 1) : x 2 R} and T = R [ {R, ;}. Prove that (R, T ) is atopological space. This is called the right ray topology on R.

5.10. If X ⇢ R and S is the collection of all sets relatively open in X, then (X, S)is a topological space.

5.11. If X ⇢ R and G is an open set, then X \ G is relatively closed in X.

5.12. For any set S, let F = {T ⇢ S : card (S \ T ) @0

} [ {;}. Then (S, F) is atopological space. This is called the finite complement topology.

5.13. An uncountable subset of R must have a limit point.

5.14. If S ⇢ R, then S0 is closed.

5.15. Prove that the set of accumulation points of any sequence is closed.

5.16. Prove any closed set is the set of accumulation points for some sequence.

5.17. If an

is a sequence such that an

! L, then {an

: n 2 N} [ {L} is compact.

5.18. If F is closed and K is compact, then F \ K is compact.

5.19. If {K↵

: ↵ 2 A} is a collection of compact sets, thenT

↵2A

K↵

is compact.



5.20. Prove the union of a finite number of compact sets is compact. Give anexample to show this need not be true for the union of an infinite number of compactsets.

5.21. (a) Give an example of a set S such that S is disconnected, but S [ {1} isconnected. (b) Prove that 1 must be a limit point of S.

5.22. If K is compact and V is open with K ⇢ V , then there is an open set Usuch that K ⇢ U ⇢ U ⇢ V .

5.23. If C is the Cantor middle-thirds set, then C = C 0.

5.24. If x 2 R and K is compact, then there is a z 2 K such that |x � z| =glb{|x � y| : y 2 K}. Is z unique?

5.25. If K is compact and O is an open cover of K, then there is an " > 0 suchthat for all x 2 K there is a G 2 O with (x � ", x + ") ⇢ G.

5.26. Let f : [a, b] ! R be a function such that for every x 2 [a, b] there is a�x

> 0 such that f is bounded on (x � �x

, x + �x

). Prove f is bounded.

5.27. Is the function defined by (48) a bijection?

5.28. If A is nowhere dense in an interval I, then A contains no interval.

5.29. Use the Baire category theorem to show R is uncountable.


CHAPTER 6

Limits of Functions

1. Basic Definitions

Definition 6.1. Let D ⇢ R, x0

be a limit point of D and f : D ! R. Thelimit of f(x) at x

0

is L, if for each " > 0 there is a � > 0 such that when x 2 Dwith 0 < |x � x

0

| < �, then |f(x) � L| < ". When this is the case, we writelim

x!x0 f(x) = L.

It should be noted that the limit of f at x0

is determined by the values of fnear x

0

and not at x0

. In fact, f need not even be defined at x0

.

Figure 1. This figure shows a way to think about the limit. The graphof f must stay inside the box (x

0

� �, x

0

+ �) ⇥ (L � ", L + "), exceptpossibly the point (x

0

, f(x0

)).

A useful way of rewording the definition is to say that limx!x0 f(x) = L i↵

for every " > 0 there is a � > 0 such that x 2 (x0

� �, x0

+ �) \ D \ {x0

} impliesf(x) 2 (L � ", L + "). This can also be succinctly stated as

8" > 0 9� > 0 ( f ( (x0

� �, x0

+ �) \ D \ {x0

} ) ⇢ (L � ", L + ") ) .

Example 6.1. If f(x) = c is a constant function and x0

2 R, then for anypositive numbers " and �,

x 2 (x0

� �, x0

+ �) \ D \ {x0

} ) |f(x) � c| = |c � c| = 0 < ".

This shows the limit of every constant function exists at every point, and the limitis just the value of the function.

1January 19, 2015


6-1

6-2 CHAPTER 6. LIMITS OF FUNCTIONS

2-2

4

8

Figure 2. The function from Example 6.3. Note that the graph is aline with one “hole” in it.

Example 6.2. Let f(x) = x, x0

2 R, and " = � > 0. Then

x 2 (x0

� �, x0

+ �) \ D \ {x0

} ) |f(x) � f(x0

)| = |x � x0

| < � = ".

This shows that the identity function has a limit at every point and its limit is justthe value of the function at that point.

Example 6.3. Let f(x) = 2x

2�8

x�2

. In this case, the implied domain of f isD = R \ {2}. We claim that lim

x!2

f(x) = 8.To see this, let " > 0 and choose � 2 (0, "/2). If 0 < |x � 2| < �, then

|f(x) � 8| =

��2x2 � 8

x � 2� 8

�� = |2(x + 2) � 8| = 2|x � 2| < ".

Example 6.4. Let f(x) =p

x + 1. Then the implied domain of f is D =[�1, 1). We claim that lim

x!�1

f(x) = 0.To see this, let " > 0 and choose � 2 (0, "2). If 0 < x � (�1) = x + 1 < �, then

|f(x) � 0| =p

x + 1 <p

� <p

"2 = ".

Figure 3. The function f(x) = |x|/x from Example 6.5.

Example 6.5. If f(x) = |x|/x for x 6= 0, then limx!0

f(x) does not exist. (SeeFigure 3.) To see this, suppose lim

x!0

f(x) = L, " = 1 and � > 0. If L � 0 and�� < x < 0, then f(x) = �1 < L�". If L < 0 and 0 < x < �, then f(x) = 1 > L+".These inequalities show for any L and every � > 0, there is an x with 0 < |x| < �such that |f(x) � L| > ".


1. BASIC DEFINITIONS 6-3

0.2 0.4 0.6 0.8 1.0

-1.0

-0.5

0.5

1.0

Figure 4. This is the function from Example 6.6. The graph shownhere is on the interval [0.03, 1]. There are an infinite number of oscillationsfrom �1 to 1 on any open interval containing the origin.

There is an obvious similarity between the definition of limit of a sequence andlimit of a function. The following theorem makes this similarity explicit, and givesanother way to prove facts about limits of functions.

Theorem 6.2. Let f : D ! R and x0

be a limit point of D. limx!x0 f(x) = L

i↵ whenever xn

is a sequence from D \ {x0

} such that xn

! x0

, then f(xn

) ! L.

Proof. ()) Suppose limx!x0 f(x) = L and x

n


}such that x

n

! x0

. Let " > 0. There exists a � > 0 such that |f(x) � L| < "whenever x 2 (x � �, x + �) \ D \ {x

0

}. Since xn

! x0

, there is an N 2 N such thatn � N implies 0 < |x

n

� x0

| < �. In this case, |f(xn

) � L| < ", showing f(xn

) ! L.(() Suppose whenever x

n


} such that xn

! x0

,then f(x

n

) ! L, but limx!x0 f(x) 6= L. Then there exists an " > 0 such that for

all � > 0 there is an x 2 (x0

� �, x0

+ �) \ D \ {x0

} such that |f(x) � L| � ". Inparticular, for each n 2 N, there must exist x

n

2 (x0

� 1/n, x0

+ 1/n) \ D \ {x0

}such that |f(x

n

) � L| � ". Since xn

! x0

, this is a contradiction. Therefore,lim

x!x0 f(x) = L. ⇤Theorem 6.2 is often used to show a limit doesn’t exist. Suppose we want to

show limx!x0 f(x) doesn’t exist. There are two strategies: find a sequence x

n

! x0

such that f(xn

) has no limit; or, find two sequences yn

! x0

and zn

! x0

suchthat f(y

n

) and f(zn

) converge to di↵erent limits. Either way, the theorem showslim

x!x0 fails to exist.In Example 6.5, we could choose x

n

= (�1)n/n so f(xn

) oscillates between �1and 1. Or, we could choose y

n

= 1/n = �zn

so f(xn

) ! 1 and f(zn

) ! �1.

Example 6.6. Let f(x) = sin(1/x), an

= 1

n⇡

and bn

= 2

(4n+1)⇡

. Then an

# 0,

bn

# 0, f(an

) = 0 and f(bn

) = 1 for all n 2 N. An application of Theorem 6.2 showslim

x!0

f(x) does not exist. (See Figure 4.)

Theorem 6.3 (Squeeze Theorem). Suppose f , g and h are all functions defined

on D ⇢ R with f(x) g(x) h(x) for all x 2 D. If x0

is a limit point of D and

limx!x0 f(x) = lim

x!x0 h(x) = L, then limx!x0 g(x) = L.

Proof. Let xn

be any sequence from D \ {x0

} such that xn

! x0

. Accordingto Theorem 6.2, both f(x

n

) ! L and h(xn

) ! L. Since f(xn

) g(xn

) h(xn

), anapplication of the Sandwich Theorem for sequences shows g(x

n

) ! L. Now, anotheruse of Theorem 6.2 shows lim

x!x0 g(x) = L. ⇤



0.02 0.04 0.06 0.08 0.10

-0.10

-0.05

0.05

0.10

Figure 5. This is the function from Example 6.7. The bounding linesy = x and y = �x are also shown. There are an infinite number ofoscillations between �x and x on any open interval containing the origin.

Example 6.7. Let f(x) = x sin(1/x). Since �1 sin(1/x) 1 when x 6= 0, wesee that �x x sin(1/x) x for x 6= 0. Since lim

x!0

x = limx!0

�x = 0, Theorem6.3 implies lim

x!0

x sin(1/x) = 0. (See Figure 5.)

Theorem 6.4. Suppose f : D ! R and g : D ! R and x0

is a limit point of

D. If limx!x0 f(x) = L and lim

x!x0 g(x) = M , then

(a) limx!x0(f + g)(x) = L + M ,

(b) limx!x0(af)(x) = aL, 8x 2 R,

(c) limx!x0(fg)(x) = LM , and

(d) limx!x0(1/f)(x) = 1/L, as long as L 6= 0.

Proof. Suppose an


} converging to x0

. ThenTheorem 6.2 implies f(a

n

) ! L and g(an

) ! M . (a)-(d) follow at once from thecorresponding properties for sequences. ⇤

Example 6.8. Let f(x) = 3x + 2. If g1

(x) = 3, g2

(x) = x and g3

(x) = 2, thenf(x) = g

1

(x)g2

(x) + g3

(x). Examples 6.1 and 6.2 along with parts (a) and (c) ofTheorem 6.4 immediately show that for every x 2 R, lim

x!x0 f(x) = f(x0

).

In the same manner as Example 6.8, it can be shown for every rational functionf(x), that lim

x!x0 f(x) = f(x0

) whenever f(x0

) exists.

2. Unilateral Limits

Definition 6.5. Let f : D ! R and x0

be a limit point of (�1, x0

) \ D.f has L as its left-hand limit at x

0

if for all " > 0 there is a � > 0 such thatf((x

0

� �, x0

) \ D) ⇢ (L � ", L + "). In this case, we write limx"x0 f(x) = L.


Section 3: Continuity 6-5

Let f : D ! R and x0

be a limit point of D \ (x0

, 1). f has L as its right-handlimit at x

0

if for all " > 0 there is a � > 0 such that f(D\(x0

, x0

+�)) ⇢ (L�", L+").In this case, we write lim

x#x0 f(x) = L.2

These are called the unilateral or one-sided limits of f at x0

. When they aredi↵erent, the graph of f is often said to have a “jump” at x

0

, as in the followingexample.

Example 6.9. As in Example 6.5, let f(x) = |x|/x. Then limx#0

f(x) = 1 andlim

x"0

f(x) = �1. (See Figure 3.)

In parallel with Theorem 6.2, the one-sided limits can also be reformulated interms of sequences.


.

(a) Let x0

be a limit point of D \ (x0

, 1). limx#x0 f(x) = L i↵ whenever x

n

is a sequence from D \ (x0

, 1) such that xn

! x0

, then f(xn

) ! L.(b) Let x

0

be a limit point of (�1, x0

) \ D. limx"x0 f(x) = L i↵ whenever x

n

is a sequence from (�1, x0

) \ D such that xn

! x0

, then f(xn

) ! L.

The proof of Theorem 6.6 is similar to that of Theorem 6.2 and is left to thereader.


be a limit point of D.

limx!x0

f(x) = L () limx"x0

f(x) = L = limx#x0

f(x)

Proof. This proof is left as an exercise. ⇤Theorem 6.8. If f : (a, b) ! R is monotone, then both unilateral limits of f

exist at every point of (a, b).

Proof. To be specific, suppose f is increasing and x0

2 (a, b). Let " > 0and L = lub {f(x) : a < x < x

0

}. According to Corollary 2.18, there must exist anx 2 (a, x

0

) such that L � " < f(x) L. Define � = x0

� x. If y 2 (x0

� �, x0

), thenL � " < f(x) f(y) L. This shows lim

x"x0 f(x) = L.The proof that lim

x#x0 f(x) exists is similar.To handle the case when f is decreasing, consider �f instead of f . ⇤

3. Continuity


2 D. f is continuous at x0

if forevery " > 0 there exists a � > 0 such that when x 2 D with |x � x

0

| < �, then|f(x) � f(x

0

)| < ". The set of all points at which f is continuous is denoted C(f).

Several useful ways of rephrasing this are contained in the following theorem.They are analogous to the similar statements made about limits. Proofs are left tothe reader.


2 D. The following statements are

equivalent.

(a) x0

2 C(f),

2Calculus books often use the notation limx"x0 f(x) = lim

x!x0� f(x) and limx#x0 f(x) =

limx!x0+ f(x).



Figure 6. The function f is continuous at x0

, if given any " > 0 thereis a � > 0 such that the graph of f does not cross the top or bottom ofthe dashed rectangle (x

0

� �, x

0

+ d)⇥ (f(x0

)� ", f(x0

) + ").

(b) For all " > 0 there is a � > 0 such that

x 2 (x0

� �, x0

+ �) \ D ) f(x) 2 (f(x0

) � ", f(x0

) + "),

(c) For all " > 0 there is a � > 0 such that

f((x0

� �, x0

+ �) \ D) ⇢ (f(x0

) � ", f(x0

) + ").

Example 6.10. Define

f(x) =

(2x

2�8

x�2

, x 6= 2

8, x = 2.

It follows from Example 6.3 that 2 2 C(f).

There is a subtle di↵erence between the treatment of the domain of the functionin the definitions of limit and continuity. In the definition of limit, the “targetpoint,” x

0

is required to be a limit point of the domain, but not actually be anelement of the domain. In the definition of continuity, x

0

must be in the domain ofthe function, but does not have to be a limit point. To see a consequence of thisdi↵erence, consider the following example.

Example 6.11. If f : Z ! R is an arbitrary function, then C(f) = Z. To seethis, let n

0

2 Z, " > 0 and � = 1. If x 2 Z with |x � n0

| < �, then x = n0

. It followsthat |f(x) � f(n

0

)| = 0 < ", so f is continuous at n0

.

This leads to the following theorem.


2 D. If x0

is a limit point of D, then

x0

2 C(f) i↵ limx!x0 f(x) = f(x

0

). If x0

is an isolated point of D, then x0

2 C(f).

Proof. If x0

is isolated in D, then there is an � > 0 such that (x0

� �, x0

+�) \ D = {x

0

}. For any " > 0, the definition of continuity is satisfied with this �.Next, suppose x

0

2 D0.The definition of continuity says that f is continuous at x

0

i↵ for all " > 0 thereis a � > 0 such that when x 2 (x

0

� �, x0

+ �)\D, then f(x) 2 (f(x0

)� ", f(x0

)+ ").The definition of limit says that lim

x!x0 f(x) = f(x0

) i↵ for all " > 0 there is a� > 0 such that when x 2 (x

0

��, x0

+�)\D\{x0

}, then f(x) 2 (f(x0

)�", f(x0

)+").


Section 3: Continuity 6-7

Comparing these two definitions, it is clear that x0

2 C(f) implies

limx!x0

f(x) = f(x0

).

On the other hand, suppose limx!x0 f(x) = f(x

0

) and " > 0. Choose �according to the definition of limit. When x 2 (x

0

� �, x0

+ �) \ D \ {x0

}, thenf(x) 2 (f(x

0

) � ", f(x0

) + "). It follows from this that when x = x0

, then f(x) �f(x

0

) = f(x0

) � f(x0

) = 0 < ". Therefore, when x 2 (x0

� �, x0

+ �) \ D, thenf(x) 2 (f(x

0

) � ", f(x0

) + "), and x0

2 C(f), as desired. ⇤Example 6.12. If f(x) = c, for some c 2 R, then Example 6.1 and Theorem

6.11 show that f is continuous at every point.

Example 6.13. If f(x) = x, then Example 6.2 and Theorem 6.11 show that fis continuous at every point.

Corollary 6.12. Let f : D ! R and x0

2 D. x0

2 C(f) i↵ whenever xn

is a

sequence from D with xn

! x0

, then f(xn

) ! f(x0

).

Proof. Combining Theorem 6.11 with Theorem 6.2 shows this to be true. ⇤

Example 6.14 (Dirichlet Function). Suppose

f(x) =

(1, x 2 Q0, x /2 Q

.

For each x 2 Q, there is a sequence of irrational numbers converging to x, and foreach y 2 Qc there is a sequence of rational numbers converging to y. Corollary 6.12shows C(f) = ;.

Example 6.15 (Salt and Pepper Function). Since Q is a countable set, it canbe written as a sequence, Q = {q

n

: n 2 N}. Define

f(x) =

(1/n, x = q

n

,

0, x 2 Qc.

If x 2 Q, then x = qn

, for some n and f(x) = 1/n > 0. There is a sequence xn

from Qc such that xn

! x and f(xn

) = 0 6! f(x) = 1/n. Therefore C(f) \ Q = ;.On the other hand, let x 2 Qc and " > 0. Choose N 2 N large enough so

that 1/N < " and let � = min{|x � qn

| : 1 n N}. If |x � y| < �, thereare two cases to consider. If y 2 Qc, then |f(y) � f(x)| = |0 � 0| = 0 < ". Ify 2 Q, then the choice of � guarantees y = q

n

for some n > N . In this case,|f(y) � f(x)| = f(y) = f(q

n

) = 1/n < 1/N < ". Therefore, x 2 C(f).This shows that C(f) = Qc.

It is a consequence of the Baire category theorem that there is no function fsuch that C(f) = Q. Proving this would take us too far afield.

The following theorem is an almost immediate consequence of Theorem 6.4.

Theorem 6.13. Let f : D ! R and g : D ! R. If x0

2 C(f) \ C(g), then

(a) x0

2 C(f + g),(b) x

0

2 C(↵f), 8↵ 2 R,(c) x

0

2 C(fg), and(d) x

0

2 C(f/g) when g(x0

) 6= 0.



Corollary 6.14. If f is a rational function, then f is continuous at each point

of its domain.

Proof. This is a consequence of Examples 6.12 and 6.13 used with Theorem6.13. ⇤

Theorem 6.15. Suppose f : Df

! R and g : Dg

! R such that f(Df

) ⇢ Dg

.

If there is an x0

2 C(f) such that f(x0

) 2 C(g), then x0

2 C(g � f).

Proof. Let " > 0 and choose �1

> 0 such that

g((f(x0

) � �1

, f(x0

) + �1

) \ Dg

) ⇢ (g � f(x0

) � ", g � f(x0

) + ").

Choose �2

> 0 such that

f((x0

� �2

, x0

+ �2

) \ Df

) ⇢ (f(x0

) � �1

, f(x0

) + �1

).

Then

g � f((x0

� �2

, x0

+ �2

) \ Df

) ⇢ g((f(x0

) � �1

, f(x0

) + �1

) \ Dg

)

⇢ (g � f(x0

) � �2

, g � f(x0

) + �2

) \ Df

).

Since this shows Theorem 6.10(c) is satisfied at x0

with the function g � f , it followsthat x

0

2 C(g � f). ⇤Example 6.16. If f(x) =

px for x � 0, then according to Problem 6.8,

C(f) = [0, 1). Theorem 6.15 shows f � f(x) = 4p

x is continuous on [0, 1).In similar way, it can be shown by induction that f(x) = xm/2

n

is continuouson [0, 1) for all m, n 2 Z.

4. Unilateral Continuity


2 D. f is left-continuous at x0

if forevery " > 0 there is a � > 0 such that f((x

0

� �, x0

] \ D) ⇢ (f(x0

) � ", f(x0

) + ").Let f : D ! R and x

0

2 D. f is right-continuous at x0

if for every " > 0 thereis a � > 0 such that f([x

0

, x0

+ �) \ D) ⇢ (f(x0

) � ", f(x0

) + ").

Example 6.17. Let the floor function be

bxc = max{n 2 Z : n x}and the ceiling function be

dxe = min{n 2 Z : n � x}.

The floor function is right-continuous, but not left-continuous at each integer, andthe ceiling function is left-continuous, but not right-continuous at each integer.


2 D. x0

2 C(f) i↵ f is both right and

left-continuous at x0

.

Proof. The proof of this theorem is left as an exercise. ⇤According to Theorem 6.7, when f is monotone on an interval (a, b), the

unilateral limits of f exist at every point. In order for such a function to becontinuous at x

0

2 (a, b), it must be the case that

limx"x0

f(x) = f(x0

) = limx#x0

f(x).

If either of the two equalities is violated, the function is not continuous at x0

.


Section 4: Unilateral Continuity 6-9

f(x) = x

2�4

x�2

1 2 43

1

2

3

4

5

6

7

Figure 7. The function from Example 6.19. Note that the graph is aline with one “hole” in it. Plugging up the hole removes the discontinuity.

In the case, when limx"x0 f(x) 6= lim

x#x0 f(x), it is said that a jump discontinuity

occurs at x0

.

Example 6.18. The function

f(x) =

(|x|/x, x 6= 0

0, x = 0.

has a jump discontinuity at x = 0.

In the case when limx"x0 f(x) = lim

x#x0 f(x) 6= f(x0

), it is said that f has aremovable discontinuity at x

0

. The discontinuity is called “removable” because inthis case, the function can be made continuous at x

0

by merely redefining its valueat the single point, x

0

, to be the value of the two one-sided limits.

Example 6.19. The function f(x) = x

2�4

x�2

is not continuous at x = 2 because 2is not in the domain of f . Since lim

x!2

f(x) = 4, if the domain of f is extended toinclude 2 by setting f(2) = 4, then this extended f is continuous everywhere. (SeeFigure 7.)

Theorem 6.18. If f : (a, b) ! R is monotone, then (a, b) \ C(f) is countable.

Proof. In light of the discussion above and Theorem 6.7, it is apparent thatthe only types of discontinuities f can have are jump discontinuities.

To be specific, suppose f is increasing and x0

, y0

2 (a, b) \ C(f) with x0

< y0

.In this case, the fact that f is increasing implies

limx"x0

f(x) < limx#x0

f(x) limx"y0

f(x) < limx#y0

f(x).

This implies that for any two x0

, y0

2 (a, b) \ C(f), there are disjoint open intervals,Ix0 = (lim

x"x0 f(x), limx#x0 f(x)) and I

y0 = (limx"y0 f(x), lim

x#y0 f(x)). For eachx 2 (a, b) \ C(f), choose q

x

2 Ix

\ Q. Because of the pairwise disjointness of theintervals {I

x

: x 2 (a, b) \ C(f)}, this defines an bijection between (a, b) \ C(f) anda subset of Q. Therefore, (a, b) \ C(f) must be countable.

A similar argument holds for a decreasing function. ⇤Theorem 6.18 implies that a monotone function is continuous at “nearly every”

point in its domain. Characterizing the points of discontinuity as countable is thebest that can be hoped for, as seen in the following example.



Example 6.20. Let D = {dn

: n 2 N} be a countable set and define Jx

= {n :dn

< x}. The function

(49) f(x) =

(0, J

x

= ;P

n2J

x

1

2

n

Jx

6= ; .

is increasing and C(f) = Dc. The proof of this statement is left as Exercise 6.9.

5. Continuous Functions

Up until now, continuity has been considered as a property of a function at apoint. There is much that can be said about functions continuous everywhere.

Definition 6.19. Let f : D ! R and A ⇢ D. We say f is continuous on A ifA ⇢ C(f). If D = C(f), then f is continuous.

Continuity at a point is, in a sense, a metric property of a function because itmeasures relative distances between points in the domain and image sets. Continuityon a set becomes more of a topological property, as shown by the next few theorems.

Theorem 6.20. f : D ! R is continuous i↵ whenever G is open in R, thenf�1(G) is relatively open in D.

Proof. ()) Assume f is continuous on D and let G be open in R. Letx 2 f�1(G) and choose " > 0 such that (f(x) � ", f(x) + ") ⇢ G. Using thecontinuity of f at x, we can find a � > 0 such that f((x � �, x + �) \ D) ⇢ G. Thisimplies that (x � �, x + �) \ D ⇢ f�1(G). Because x was an arbitrary element off�1(G), it follows that f�1(G) is open.

(() Choose x 2 D and let " > 0. By assumption, the set f�1((f(x) �", f(x) + ") is relatively open in D. This implies the existence of a � > 0 suchthat (x � �, x + �) \ D ⇢ f�1((f(x) � ", f(x) + "). It follows from this thatf((x � �, x + �) \ D) ⇢ (f(x) � ", f(x) + "), and x 2 C(f). ⇤

A function as simple as any constant function demonstrates that f(G) need notbe open when G is open. Defining f : [0, 1) ! R by f(x) = sin x tan�1 x showsthat the image of a closed set need not be closed because f([0, 1)) = (�⇡/2, ⇡/2).

Theorem 6.21. If f is continuous on a compact set K, then f(K) is compact.

Proof. Let O be an open cover of f(K) and I = {f�1(G) : G 2 O}. ByTheorem 6.20, I is a collection of sets which are relatively open in K. Since I coversK, I is an open cover of K. Using the fact that K is compact, we can choose a finitesubcover of K from I, say {G

1

, G2

, . . . , Gn

}. There are {H1

, H2

, . . . , Hn

} ⇢ O suchthat f�1(H

k

) = Gk

for 1 k n. Then

f(K) ⇢ f

0

@[

1kn

Gk

1

A =[

1kn

Hk

.

Thus, {H1

, H2

, . . . , H3

} is a subcover of f(K) from O. ⇤

Several of the standard calculus theorems giving properties of continuous func-tions are consequences of Corollary 6.21. In a calculus course, K is usually a compactinterval, [a, b].


Section 5: Continuous Functions 6-11

Corollary 6.22. If f : K ! R is continuous and K is compact, then f is

bounded.

Proof. By Theorem 6.21, f(K) is compact. Now, use the Bolzano-Weierstrasstheorem to conclude f is bounded. ⇤

Corollary 6.23 (Maximum Value Theorem). If f : K ! R is continuous and

K is compact, then there are m, M 2 K such that f(m) f(x) f(M) for all

x 2 K.

Proof. According to Theorem 6.21 and the Bolzano-Weierstrass theorem, f(K)is closed and bounded. Because of this, glb f(K) 2 f(K) and lub f(K) 2 f(K). Itsu�ces to choose m 2 f�1(glb f(K)) and M 2 f�1(lub f(K)). ⇤

Corollary 6.24. If f : K ! R is continuous and invertible and K is compact,

then f�1 : f(K) ! K is continuous.

Proof. Let G be open in K. According to Theorem 6.20, it su�ces to showf(G) is open in f(K).

To do this, note that K \G is compact, so by Theorem 6.21, f(K \G) is compact,and therefore closed. Because f is injective, f(G) = f(K) \ f(K \ G). This showsf(G) is open in f(K). ⇤

Theorem 6.25. If f is continuous on an interval I, then f(I) is an interval.

Proof. If f(I) is not connected, there must exist two disjoint open sets, Uand V , such that f(I) ⇢ U [ V and f(I) \ U 6= ; 6= f(I) \ V . In this case, Theorem6.20 implies f�1(U) and f�1(V ) are both open. They are clearly disjoint andf�1(U) \ I 6= ; 6= f�1(V ) \ I. But, this implies f�1(U) and f�1(V ) disconnect I,which is a contradiction. Therefore, f(I) is connected. ⇤

Corollary 6.26 (Intermediate Value Theorem). If f : [a, b] ! R is continuous

and ↵ is between f(a) and f(b), then there is a c 2 [a, b] such that f(c) = ↵.

Proof. This is an easy consequence of Theorem 6.25 and Theorem 5.14. ⇤Definition 6.27. A function f : D ! R has the Darboux property if whenever

a, b 2 D and � is between f(a) and f(b), then there is a c between a and b suchthat f(c) = �.

Calculus texts usually call the Darboux property the intermediate value property.Corollary 6.26 shows that a function continuous on an interval has the Darbouxproperty. The next example shows continuity is not necessary for the Darbouxproperty to hold.

Example 6.21. The function

f(x) =

(sin 1/x, x 6= 0

0, x = 0

is not continuous, but does have the Darboux property. (See Figure 4.) It can beseen from Example 6.6 that 0 /2 C(f).

To see f has the Darboux property, choose two numbers a < b.If a > 0 or b < 0, then f is continuous on [a, b] and Corollary 6.26 su�ces to

finish the proof.



On the other hand, if 0 2 [a, b], then there must exist an n 2 Z such that both2

(4n+1)⇡

, 2

(4n+3)⇡

2 [a, b]. Since f( 2

(4n+1)⇡

) = 1, f( 2

(4n+3)⇡

) = �1 and f is continuous

on the interval between them, we see f([a, b]) = [�1, 1], which is the entire range off . The claim now follows.

6. Uniform Continuity

Most of the ideas contained in this section will not be needed until we begindeveloping the properties of the integral in Chapter 8.

Definition 6.28. A function f : D ! R is uniformly continuous if for all " > 0there is a � > 0 such that when x, y 2 D with |x � y| < �, then |f(x) � f(y)| < ".

The idea here is that in the ordinary definition of continuity, the � in thedefinition depends on both the " and the x at which continuity is being tested. Withuniform continuity, � only depends on "; i. e., the same � works uniformly acrossthe whole domain.

Theorem 6.29. If f : D ! R is uniformly continuous, then it is continuous.

Proof. This proof is left as Exercise 6.31. ⇤

The converse is not true.

Example 6.22. Let f(x) = 1/x on D = (0, 1) and " > 0. It’s clear that f iscontinuous on D. Let � > 0 and choose m, n 2 N such that m > 1/� and n � m > ".If x = 1/m and y = 1/n, then 0 < y < x < � and f(y) � f(x) = n � m > ".Therefore, f is not uniformly continuous.

Theorem 6.30. If f : D ! R is continuous and D is compact, then f is

uniformly continuous.

Proof. Suppose f is not uniformly continuous. Then there is an " > 0 such thatfor every n 2 N there are x

n

, yn

2 D with |xn

� yn

| < 1/n and |f(xn

) � f(yn

)| � ".An application of the Bolzano-Weierstrass theorem yields a subsequence x

n

k

of xn

such that xn

k

! x0

2 D.Since f is continuous at x

0

, there is a � > 0 such that whenever x 2 (x0

��, x

0

+ �) \ D, then |f(x) � f(x0

)| < "/2. Choose nk

2 N such that 1/nk

< �/2 andxn

k

2 (x0

� �/2, x0

+ �/2). Then both xn

k

, yn

k

2 (x0

� �, x0

+ �) and

" |f(xn

k

) � f(yn

k

)| = |f(xn

k

) � f(x0

) + f(x0

) � f(yn

k

)| |f(x

n

k

) � f(x0

)| + |f(x0

) � f(yn

k

)| < "/2 + "/2 = ",

which is a contradiction.Therefore, f must be uniformly continuous. ⇤

The following corollary is an immediate consequence of Theorem 6.30.

Corollary 6.31. If f : [a, b] ! R is continuous, then f is uniformly continuous.

Theorem 6.32. Let D ⇢ R and f : D ! R. If f is uniformly continuous and

xn

is a Cauchy sequence from D, then f(xn

) is a Cauchy sequence..

Proof. The proof is left as Exercise 6.37. ⇤


7. EXERCISES 6-13

Uniform continuity is necessary in Theorem 6.32. To see this, let f : (0, 1) ! Rbe f(x) = 1/x and x

n

= 1/n. Then xn

is a Cauchy sequence, but f(xn

) = n is not.This idea is explored in Exercise 6.32.

It’s instructive to think about the converse to Theorem 6.32. Let f(x) = x2,defined on all of R. Since f is continuous everywhere, Corollary 6.12 shows f mapsCauchy sequences to Cauchy sequences. On the other hand, in Exercise 6.36, itis shown that f is not uniformly continuous. Therefore, the converse to Theorem6.32 is false. Those functions mapping Cauchy sequences to Cauchy sequences aresometimes said to be Cauchy continuous. The converse to Theorem 6.32 can betweaked to get a true statement.

Theorem 6.33. Let f : D ! R where D is bounded. If f is Cauchy continuous,

then f is uniformly continuous.

Proof. Suppose f is not uniformly continuous. Then there is an " > 0 andsequences x

n

and yn

from D such that |xn

� yn

| < 1/n and |f(xn

) � f(yn

)| � ".Since D is bounded, the sequence x

n

is bounded and the Bolzano-Weierstrasstheorem gives a Cauchy subsequence, x

n

k

. The new sequence

zk

=

(xn(k+1)/2

k odd

yn

k/2k even

is easily shown to be a Cauchy sequence. But, f(zk

) is not a Cauchy sequence,since |f(z

k

) � f(zk+1

)| � " for all odd k. This contradicts the fact that f isCauchy continuous. We’re forced to conclude the assumption that f is not uniformlycontinuous is false. ⇤

7. Exercises

6.1. Prove limx!�2

(x2 + 3x) = �2.

6.2. Give examples of functions f and g such that neither function has a limit ata, but f + g does. Do the same for fg.

6.3. Let f : D ! R and a 2 D0.

limx!a

f(x) = L () limx"a

f(x) = limx#a

f(x) = L

6.4. Find two functions defined on R such that

0 = limx!0

(f(x) + g(x)) 6= limx!0

f(x) + limx!0

g(x).

6.5. If limx!a

f(x) = L > 0, then there is a � > 0 such that f(x) > 0 when

0 < |x � a| < �.

6.6. If Q = {qn

: n 2 N} is an enumeration of the rational numbers and

f(x) =

(1/n, x = q

n

0, x 2 Qc

then limx!a

f(x) = 0, for all a 2 Qc.



6.7. Use the definition of continuity to show f(x) = x2 is continuous everywhereon R.

6.8. Prove that f(x) =p

x is continuous on [0, 1).

6.9. If f is defined as in (49), then D = C(f)c.

6.10. If f : R ! R is monotone, then there is a countable set D such thatthe values of f can be altered on D in such a way that the altered function isleft-continuous at every point of R.

6.11. Does there exist an increasing function f : R ! R such that C(f) = Q?

6.12. If f : R ! R and there is an ↵ > 0 such that |f(x) � f(y)| ↵|x � y| for allx, y 2 R, then show that f is continuous.

6.13. Suppose f and g are each defined on an open interval I, a 2 I anda 2 C(f) \ C(g). If f(a) > g(a), then there is an open interval J such thatf(x) > g(x) for all x 2 J .

6.14. If f, g : (a, b) ! R are continuous, then G = {x : f(x) < g(x)} is open.

6.15. If f : R ! R and a 2 C(f) with f(a) > 0, then there is a neighborhood Gof a such that f(G) ⇢ (0, 1).

6.16. Let f and g be two functions which are continuous on a set D ⇢ R. Proveor give a counter example: {x 2 D : f(x) > g(x)} is relatively open in D.

6.17. If f, g : R ! R are functions such that f(x) = g(x) for all x 2 Q andC(f) = C(g) = R, then f = g.

6.18. Let I = [a, b]. If f : I ! I is continuous, then there is a c 2 I such thatf(c) = c.

6.19. Find an example to show the conclusion of Problem 18 fails if I = (a, b).

6.20. If f and g are both continuous on [a, b], then {x : f(x) g(x)} is compact.

6.21. If f : [a, b] ! R is continuous, not constant,

m = glb {f(x) : a x b} and M = lub {f(x) : a x b},

then f([a, b]) = [m, M ].

6.22. Suppose f : R ! R is a function such that every interval has points at whichf is negative and points at which f is positive. Prove that every interval has pointswhere f is not continuous.

6.23. If f : [a, b] ! R has a limit at every point, then f is bounded. Is this truefor f : (a, b) ! R?

6.24. Give an example of a bounded function f : R ! R with a limit at no point.


7. EXERCISES 6-15

6.25. If f : R ! R is continuous and periodic, then there are xm

, xM

2 R suchthat f(x

m

) f(x) f(xM

) for all x 2 R. (A function f is periodic, if there is ap > 0 such that f(x + p) = f(x) for all x 2 R.)

6.26. A set S ⇢ R is disconnected i↵ there is a continuous f : S ! R such thatf(S) = {0, 1}.

6.27. If f : R ! R satisfies f(x + y) = f(x) + f(y) for all x and y and 0 2 C(f),then f is continuous.

6.28. Assume that f : R ! R is such that f(x+y) = f(x)f(y) for all x, y 2 R. If fhas a limit at zero, prove that either lim

x!0

f(x) = 1 or f(x) = 0 for all x 2 R \ {0}.

6.29. If g : R ! R satisfies g(x + y) = g(x)g(y) for all x, y 2 R and 0 2 C(g), theng is continuous.

6.30. If F ⇢ R is closed, then there is an f : R ! R such that F = C(f)c.

6.31. If f : [a, b] ! R is uniformly continuous, then f is continuous.

6.32. Prove that an unbounded function on a bounded open interval cannot beuniformly continuous.

6.33. If f : D ! R is uniformly continuous on a bounded set D, then f is bounded.


6.35. Every polynomial of odd degree has a root.

6.36. Show f(x) = x2, with domain R, is not uniformly continuous.



CHAPTER 7

Di↵erentiation

1. The Derivative at a Point

Definition 7.1. Let f be a function defined on a neighborhood of x0

. f isdi↵erentiable at x

0

, if the following limit exists:

f 0(x0

) = limh!0

f(x0

+ h) � f(x0

)

h.

Define D(f) = {x : f 0(x) exists}.

The standard notations for the derivative will be used; e. g., f 0(x), df(x)

dx

, Df(x),etc.

An equivalent way of stating this definition is to note that if x0

2 D(f), then

f 0(x0

) = limx!x0

f(x) � f(x0

)

x � x0

.

(See Figure 1.)This can be interpreted in the standard way as the limiting slope of the secant

line as the points of intersection approach each other.

Example 7.1. If f(x) = c for all x and some c 2 R, then

limh!0

f(x0

+ h) � f(x0

)

h= lim

h!0

c � c

h= 0.

So, f 0(x) = 0 everywhere.

Example 7.2. If f(x) = x, then

limh!0

f(x0

+ h) � f(x0

)

h= lim

h!0

x0

+ h � x0

h= lim

h!0

h

h= 1.

So, f 0(x) = 1 everywhere.

Theorem 7.2. For any function f , D(f) ⇢ C(f).

Proof. Suppose x0

2 D(f). Then

limx!x0

|f(x) � f(x0

)| = limx!x0

��f(x) � f(x

0

)

x � x0

(x � x0

)

��

= f 0(x0

) 0 = 0.

This shows limx!x0 f(x) = f(x

0

), and x0

2 C(f). ⇤

Of course, the converse of Theorem 7.2 is not true.

1January 19, 2015


7-1

7-2 CHAPTER 7. DIFFERENTIATION

Figure 1. These graphs illustrate that the two standard ways of writingthe di↵erence quotient are equivalent.

Example 7.3. The function f(x) = |x| is continuous on R, but

limh#0

f(0 + h) � f(0)

h= 1 = � lim

h"0

f(0 + h) � f(0)

h,

so f 0(0) fails to exist.

Theorem 7.2 and Example 7.3 show that di↵erentiability is a strictly strongercondition than continuity. For a long time most mathematicians believed thatevery continuous function must certainly be di↵erentiable at some point. In thenineteenth century, several researchers, most notably Bolzano and Weierstrass,presented examples of functions continuous everywhere and di↵erentiable nowhere.2

It has since been proved that, in a technical sense, the “typical” continuous functionis nowhere di↵erentiable [4]. So, contrary to the impression left by many beginningcalculus classes, di↵erentiability is the exception rather than the rule, even forcontinuous functions..

2. Di↵erentiation Rules

Following are the standard rules for di↵erentiation learned in every beginningcalculus course.

Theorem 7.3. Suppose f and g are functions such that x0

2 D(f) \ D(g).

(a) x0

2 D(f + g) and (f + g)0(x0

) = f 0(x0

) + g0(x0

).(b) If a 2 R, then x

0

2 D(af) and (af)0(x0

) = af 0(x0

).(c) x

0

2 D(fg) and (fg)0(x0

) = f 0(x0

)g(x0

) + f(x0

)g0(x0

).(d) If g(x

0

) 6= 0, then x0

2 D(f/g) and

✓f

g

◆0(x

0

) =f 0(x

0

)g(x0

) � f(x0

)g0(x0

)

(g(x0

))2.

2Bolzano presented his example in 1834, but it was little noticed. The 1872 example ofWeierstrass is more well-known [2]. A translation of Weierstrass’ original paper [16] is presentedby Edgar [8]. Weierstrass’ example is not very transparent because it depends on trigonometricseries. Many more elementary constructions have since been made. One such will be presented inExample 9.5.


2. DIFFERENTIATION RULES 7-3

Proof. (a)

limh!0

(f + g)(x0

+ h) � (f + g)(x0

)

h

= limh!0

f(x0

+ h) + g(x0

+ h) � f(x0

) � g(x0

)

h

= limh!0

✓f(x

0

+ h) � f(x0

)

h+

g(x0

+ h) � g(x0

)

h

◆= f 0(x

0

) + g0(x0

)

(b)

limh!0

(af)(x0

+ h) � (af)(x0

)

h= a lim

h!0

f(x0

+ h) � f(x0

)

h= af 0(x

0

)

(c)

limh!0

(fg)(x0

+ h) � (fg)(x0

)

h= lim

h!0

f(x0

+ h)g(x0

+ h) � f(x0

)g(x0

)

h

Now, “slip a 0” into the numerator and factor the fraction.

= limh!0

f(x0

+ h)g(x0

+ h) � f(x0

)g(x0

+ h) + f(x0

)g(x0

+ h) � f(x0

)g(x0

)

h

= limh!0

✓f(x

0

+ h) � f(x0

)

hg(x

0

+ h) + f(x0

)g(x

0

+ h) � g(x0

)

h

◆

Finally, use the definition of the derivative and the continuity of f and g at x0

.

= f 0(x0

)g(x0

) + f(x0

)g0(x0

)

(d) It will be proved that if g(x0

) 6= 0, then (1/g)0(x0

) = �g0(x0

)/(g(x0

))2. Thisstatement, combined with (c), yields (d).

limh!0

(1/g)(x0

+ h) � (1/g)(x0

)

h= lim

h!0

1

g(x0

+ h)� 1

g(x0

)h

= limh!0

g(x0

) � g(x0

+ h)

h

1

g(x0

+ h)g(x0

)

= � g0(x0

)

(g(x0

)2

Plug this into (c) to see✓

f

g

◆0(x

0

) =

✓f

1

g

◆0(x

0

)

= f 0(x0

)1

g(x0

)+ f(x

0

)�g0(x

0

)

(g(x0

))2

=f 0(x

0

)g(x0

) � f(x0

)g0(x0

)

(g(x0

))2.

⇤Combining Examples 7.1 and 7.2 with Theorem 7.3, the following theorem is

easy to prove.

Corollary 7.4. A rational function is di↵erentiable at every point of its

domain.



Theorem 7.5 (Chain Rule). If f and g are functions such that x0

2 D(f) and

f(x0

) 2 D(g), then x0

2 D(g � f) and (g � f)0(x0

) = g0 � f(x0

)f 0(x0

).

Proof. Let y0

= f(x0

). By assumption, there is an open interval J containingf(x

0

) such that g is defined on J . Since J is open and x0

2 C(f), there is an openinterval I containing x

0

such that f(I) ⇢ J .Define h : J ! R by

h(y) =

8<

:

g(y) � g(y0

)

y � y0

� g0(y0

), y 6= y0

0, y = y0

.

Since y0

2 D(g), we see

limy!y0

h(y) = limy!y0

g(y) � g(y0

)

y � y0

� g0(y0

) = g0(y0

) � g0(y0

) = 0 = h(y0

),

so y0

2 C(h). Now, x0

2 C(f) and f(x0

) = y0

2 C(h), so Theorem 6.15 impliesx

0

2 C(h � f). In particular

(53) limx!x0

h � f(x) = 0.

From the definition of h � f for x 2 I with f(x) 6= f(x0

), we can solve for

(54) g � f(x) � g � f(x0

) = (h � f(x) + g0 � f(x0

))(f(x) � f(x0

)).

Notice that (54) is also true when f(x) = f(x0

). Divide both sides of (54) by x�x0

,and use (53) to obtain

limx!x0

g � f(x) � g � f(x0

)

x � x0

= limx!x0

(h � f(x) + g0 � f(x0

))f(x) � f(x

0

)

x � x0

= (0 + g0 � f(x0

))f 0(x0

)

= g0 � f(x0

)f 0(x0

).

⇤Theorem 7.6. Suppose f : [a, b] ! [c, d] is continuous and invertible. If

x0

2 D(f) and f 0(x0

) 6= 0 for some x0

2 (a, b), then f(x0

) 2 D(f�1) and�f�1

�0(f(x

0

)) = 1/f 0(x0

).

Proof. Let y0

= f(x0

) and suppose yn

is any sequence in f([a, b]) \ {y0

}converging to y

0

and xn

= f�1(yn

). By Theorem 6.24, f�1 is continuous, so

x0

= f�1(y0

) = limn!1 f�1(y

n

) = limn!1 x

n

.

Therefore,

limn!1

f�1(yn

) � f�1(y0

)

yn

� y0

= limn!1

xn

� x0

f(xn

) � f(x0

)=

1

f 0(x0

).

⇤Example 7.4. It follows easily from Theorem 7.3 that f(x) = x3 is di↵erentiable

everywhere with f 0(x) = 3x2. Define g(x) = 3p

x. Then g(x) = f�1(x). Supposeg(y

0

) = x0

for some y0

2 R. According to Theorem 7.6,

g0(y0

) =1

f 0(x0

)=

1

3x2

0

=1

3(g(y0

))2=

1

3( 3p

y0

)2=

1

3y2/3

0

.


3. DERIVATIVES AND EXTREME POINTS 7-5

In the same manner as Example 7.4, the following corollary can be proved.

Corollary 7.7. Suppose q 2 Q, f(x) = xq

and D is the domain of f . Then

f 0(x) = qxq�1

on the set

(D, when q � 1

D \ {0}, when q < 1.

3. Derivatives and Extreme Points

As is learned in calculus, the derivative is a powerful tool for determining thebehavior of functions. The following theorems form the basis for much of di↵erentialcalculus. First, we state a few familiar definitions.

Definition 7.8. Suppose f : D ! R and x0

2 D. f is said to have a relative

maximum at x0

if there is a � > 0 such that f(x) f(x0

) for all x 2 (x0

��, x0

+�)\D.f has a relative minimum at x

0

if �f has a relative maximum at x0

. If f has eithera relative maximum or a relative minimum at x

0

, then it is said that f has a relative

extreme value at x0

.

The absolute maximum of f occurs at x0

if f(x0

) � f(x) for all x 2 D. Thedefinitions of absolute minimum and absolute extreme are analogous.

Examples like f(x) = x on (0, 1) show that even the nicest functions neednot have relative extrema. Corollary 6.23 shows that if D is compact, then anycontinuous function defined on D assumes both an absolute maximum and anabsolute minimum on D.

Theorem 7.9. Suppose f : (a, b) ! R. If f has a relative extreme value at x0

and x0

2 D(f), then f 0(x0

) = 0.

Proof. Suppose f(x0

) is a relative maximum value of f . Then there must bea � > 0 such that f(x) f(x

0

) whenever x 2 (x0

� �, x0

+ �). Since f 0(x0

) exists,

(55) x 2 (x0

� �, x0

) =) f(x) � f(x0

)

x � x0

� 0 =) f 0(x0

) = limx"x0

f(x) � f(x0

)

x � x0

� 0

and

(56) x 2 (x0

, x0

+�) =) f(x) � f(x0

)

x � x0

0 =) f 0(x0

) = limx#x0

f(x) � f(x0

)

x � x0

0.

Combining (55) and (56) shows f 0(x0

) = 0.If f(x

0

) is a relative minimum value of f , apply the previous argument to�f . ⇤

Theorem 7.9 is, of course, the basis for much of a beginning calculus course. Iff : [a, b] ! R, then the extreme values of f occur at points of the set

C = {x 2 (a, b) : f 0(x) = 0} [ {x 2 [a, b] : f 0(x) does not exist}.

The elements of C are often called the critical points or critical numbers of f on[a, b]. To find the maximum and minimum values of f on [a, b], it su�ces to find itsmaximum and minimum on the smaller set C, which is finite in elementary calculuscourses.



4. Di↵erentiable Functions

Di↵erentiation becomes most useful when a function has a derivative at eachpoint of an interval.

Definition 7.10. The function f is di↵erentiable on an open interval I ifI ⇢ D(f). If f is di↵erentiable on its domain, then it is said to be di↵erentiable. Inthis case, the function f 0 is called the derivative of f .

The fundamental theorem about di↵erentiable functions is the Mean ValueTheorem. Following is its simplest form.

Lemma 7.11 (Rolle’s Theorem). If f : [a, b] ! R is continuous on [a, b],di↵erentiable on (a, b) and f(a) = 0 = f(b), then there is a c 2 (a, b) such that

f 0(c) = 0.

Proof. Since [a, b] is compact, Corollary 6.23 implies the existence of xm

, xM

2[a, b] such that f(x

m

) f(x) f(xM

) for all x 2 [a, b]. If f(xm

) = f(xM

), then f isconstant on [a, b] and any c 2 (a, b) satisfies the lemma. Otherwise, either f(x

m

) < 0or f(x

M

) > 0. If f(xm

) < 0, then xm

2 (a, b) and Theorem 7.9 implies f 0(xm

) = 0.If f(x

M

) > 0, then xM

2 (a, b) and Theorem 7.9 implies f 0(xM

) = 0. ⇤Rolle’s Theorem is just a stepping-stone on the path to the Mean Value Theorem.

Two versions of the Mean Value Theorem follow. The first is a version more generalthan the one given in most calculus courses. The second is the usual version.4

Theorem 7.12 (Cauchy Mean Value Theorem). If f : [a, b] ! R and g : [a, b] !R are both continuous on [a, b] and di↵erentiable on (a, b), then there is a c 2 (a, b)such that

g0(c)(f(b) � f(a)) = f 0(c)(g(b) � g(a)).

Proof. Let

h(x) = (g(b) � g(a))(f(a) � f(x)) + (g(x) � g(a))(f(b) � f(a)).

Because of the assumptions on f and g, h is continuous on [a, b] and di↵erentiableon (a, b) with h(a) = h(b) = 0. Theorem 7.11 yields a c 2 (a, b) such that h0(c) = 0.Then

0 = h0(c) = �(g(b) � g(a))f 0(c) + g0(c)(f(b) � f(a))

=) g0(c)(f(b) � f(a)) = f 0(c)(g(b) � g(a)).

⇤

Corollary 7.13 (Mean Value Theorem). If f : [a, b] ! R is continuous on

[a, b] and di↵erentiable on (a, b), then there is a c 2 (a, b) such that f(b) � f(a) =f 0(c)(b � a).

Proof. Let g(x) = x in Theorem 7.12. ⇤Many of the standard theorems of beginning calculus are easy consequences of

the Mean Value Theorem. For example, following are the usual theorems aboutmonotonicity.

3

January 19, 2015


4Theorem 7.12 is also often called the Generalized Mean Value Theorem.


Section 4: Di↵erentiable Functions 7-7

Figure 2. This is a “picture proof” of Corollary 7.13.

Theorem 7.14. Suppose f : (a, b) ! R is a di↵erentiable function. f is

increasing on (a, b) i↵ f 0(x) � 0 for all x 2 (a, b). f is decreasing on (a, b) i↵

f 0(x) 0 for all x 2 (a, b).

Proof. Only the first assertion is proved because the proof of the second ispretty much the same with all the inequalities reversed.

()) If x, y 2 (a, b) with x 6= y, then the assumption that f is increasing gives

f(y) � f(x)

y � x� 0 =) f 0(x) = lim

y!x

f(y) � f(x)

y � x� 0.

(() Let x, y 2 (a, b) with x < y. According to Theorem 7.13, there is a c 2 (x, y)such that f(y)� f(x) = f 0(c)(y �x) � 0. This shows f(x) f(y), so f is increasingon (a, b). ⇤

Corollary 7.15. Let f : (a, b) ! R be a di↵erentiable function. f is constant

i↵ f 0(x) = 0 for all x 2 (a, b).

It follows from Theorem 7.2 that every di↵erentiable function is continuous.But, it’s not true that a derivative need be continuous.

Example 7.5. Let

f(x) =

(x2 sin 1

x

, x 6= 0

0, x = 0.

We claim f is di↵erentiable everywhere, but f 0 is not continuous.To see this, first note that when x 6= 0, the standard di↵erentiation formulas

give that f 0(x) = 2x sin(1/x) � cos(1/x). To calculate f 0(0), choose any h 6= 0.Then ��

f(h)

h

�� =��h2 sin(1/h)

h

�� h2

h

�� = |h|

and it easily follows from the definition of the derivative and the Squeeze Theorem(Theorem 6.3) that f 0(0) = 0.

Let xn

= 1/2⇡n for n 2 N. Then xn

! 0 and

f 0(xn

) = 2xn

sin(1/xn

) � cos(1/xn

) = �1

for all n. Therefore, f 0(xn

) ! �1 6= 0 = f 0(0), and f 0 is not continuous at 0.



But, derivatives do share one useful property with continuous functions; they sat-isfy an intermediate value property. Compare the following theorem with Corollary6.26.

Theorem 7.16 (Darboux’s Theorem). If f is di↵erentiable on an open set

containing [a, b] and � is between f 0(a) and f 0(b), then there is a c 2 [a, b] such that

f 0(c) = �.

Proof. If f 0(a) = f 0(b), then c = a satisfies the theorem. So, we may as wellassume f 0(a) 6= f 0(b). There is no generality lost in assuming f 0(a) < f 0(b), for,otherwise, we just replace f with g = �f .

Figure 3. This could be the function h of Theorem 7.16.

Let h(x) = f(x)��x so that D(f) = D(h) and h0(x) = f 0(x)��. In particular,this implies h0(a) < 0 < h0(b). Because of this, there must be an " > 0 small enoughso that

h(a + ") � h(a)

"< 0 =) h(a + ") < h(a)

andh(b) � h(b � ")

"> 0 =) h(b � ") < h(b).

(See Figure 3.) In light of these two inequalities and Theorem 6.23, there mustbe a c 2 (a, b) such that h(c) = glb {h(x) : x 2 [a, b]}. Now Theorem 7.9 gives0 = h0(c) = f 0(c) � �, and the theorem follows. ⇤

Here’s an example showing a possible use of Theorem 7.16.

Example 7.6. Let

f(x) =

(0, x 6= 0

1, x = 0.

Theorem 7.16 implies f is not a derivative.

A more striking example is the following

Example 7.7. Define

f(x) =

(sin 1

x

, x 6= 0

1, x = 0and g(x) =

(sin 1

x

, x 6= 0

�1, x = 0.

Since

f(x) � g(x) =

(0, x 6= 0

2, x = 0

does not have the intermediate value property, at least one of f or g is not aderivative. (Actually, neither is a derivative because f(x) = �g(�x).)


5. APPLICATIONS OF THE MEAN VALUE THEOREM 7-9

5. Applications of the Mean Value Theorem

In the following sections, the standard notion of higher order derivatives is used.To make this precise, suppose f is defined on an interval I. The function f itself canbe written f (0). If f is di↵erentiable, then f 0 is written f (1). Continuing inductively,if n 2 !, f (n) exists on I and x

0

2 D(f (n)), then f (n+1)(x0

) = df (n)(x0

)/dx.

5.1. Taylor’s Theorem. The motivation behind Taylor’s theorem is the at-tempt to approximate a function f near a number a by a polynomial. The polynomialof degree 0 which does the best job is clearly p

0

(x) = f(a). The best polynomial ofdegree 1 is the tangent line to the graph of the function p

1

(x) = f(a) + f 0(a)(x � a).Continuing in this way, we approximate f near a by the polynomial p

n

of degree n

such that f (k)(a) = p(k)

n

(a) for k = 0, 1, . . . , n. A simple induction argument showsthat

(57) pn

(x) =nX

k=0

f (k)(a)

k!(x � a)k.

This is the well-known Taylor polynomial of f at a.Many students leave calculus with the mistaken impression that (57) is the

important part of Taylor’s theorem. But, the important part of Taylor’s theoremis the fact that in many cases it is possible to determine how large n must be toachieve a desired accuracy in the approximation of f ; i. e., the error term is theimportant part.

Theorem 7.17 (Taylor’s Theorem). If f is a function such that f, f 0, . . . , f (n)

are continuous on [a, b] and f (n+1)

exists on (a, b), then there is a c 2 (a, b) such

that

f(b) =nX

k=0

f (k)(a)

k!(b � a)k +

f (n+1)(c)

(n + 1)!(b � a)n+1.

Proof. Let the constant ↵ be defined by

(58) f(b) =nX

k=0

f (k)(a)

k!(b � a)k +

↵

(n + 1)!(b � a)n+1

and define

F (x) = f(b) �

nX

k=0

f (k)(x)

k!(b � x)k +

↵

(n + 1)!(b � x)n+1

!.

From (58) we see that F (a) = 0. Direct substitution in the definition of F showsthat F (b) = 0. From the assumptions in the statement of the theorem, it is easyto see that F is continuous on [a, b] and di↵erentiable on (a, b). An application ofRolle’s Theorem yields a c 2 (a, b) such that

0 = F 0(c) = �✓

f (n+1)(c)

n!(b � c)n � ↵

n!(b � c)n

◆=) ↵ = f (n+1)(c),

as desired. ⇤

5

January 19, 2015




2 4 6 8

-4

-2

2

4

n = 2

n = 4

n = 6

n = 8

n = 10

n = 20

y = cos(x)

Figure 4. Here are several of the Taylor polynomials for the functioncos(x) graphed along with cos(x).

Now, suppose f is defined on an open interval I with a, x 2 I. If f is n + 1times di↵erentiable on I, then Theorem 7.17 implies there is a c between a and xsuch that

f(x) = pn

(x) + Rf

(n, x, a),

where Rf

(n, x, a) = f

(n+1)(c)

(n+1)!

(x � a)n+1 is the error in the approximation.6

Example 7.8. Let f(x) = cos x. Suppose we want to approximate f(2) to 5decimal places of accuracy. Since it’s an easy point to work with, we’ll choose a = 0.Then, for some c 2 (0, 2),

(59) |Rf

(n, 2, 0)| =|f (n+1)(c)|(n + 1)!

2n+1 2n+1

(n + 1)!.

A bit of experimentation with a calculator shows that n = 12 is the smallest n suchthat the right-hand side of (59) is less than 5 ⇥ 10�6. After doing some arithmetic,it follows that

p12

(2) = 1 � 22

2!+

24

4!� 26

6!+

28

8!� 210

10!+

212

12!= �27809

66825⇡ �0.41614.

is a 5 decimal place approximation to cos(2).

But, things don’t always work out the way we might like. Consider the followingexample.

Example 7.9. Suppose

f(x) =

(e�1/x

2

, x 6= 0

0, x = 0.

6There are several di↵erent formulas for the error. The one given here is sometimes calledthe Lagrange form of the remainder. In Example 8.4 a form of the remainder using integrationinstead of di↵erentiation is derived.



In Example 7.11 below it is shown that f is di↵erentiable to all orders everywhereand f (n)(0) = 0 for all n � 0. With this function the Taylor polynomial centered at0 gives a useless approximation.

5.2. L’Hospital’s Rules and Indeterminate Forms. According to Theo-rem 6.4,

limx!a

f(x)

g(x)=

limx!a

f(x)

limx!a

g(x)

whenever limx!a

f(x) and limx!a

g(x) both exist and limx!a

g(x) 6= 0. But, itis easy to find examples where both lim

x!a

f(x) = 0 and limx!a

g(x) = 0 andlim

x!a

f(x)/g(x) exists, as well as similar examples where limx!a

f(x)/g(x) failsto exist. Because of this, such a limit problem is said to be in the indeterminate

form 0/0. The following theorem allows us to determine many such limits.

Theorem 7.18 (Easy L’Hospital’s Rule). Suppose f and g are each continuous

on [a, b], di↵erentiable on (a, b) and f(b) = g(b) = 0. If g0(x) 6= 0 on (a, b) and

limx"b f 0(x)/g0(x) = L, where L could be infinite, then lim

x"b f(x)/g(x) = L.

Proof. Let x 2 [a, b), so f and g are continuous on [x, b] and di↵erentiable on(x, b). Cauchy’s Mean Value Theorem, Theorem 7.12, implies there is a c(x) 2 (x, b)such that

f 0(c(x))g(x) = g0(c(x))f(x) =) f(x)

g(x)=

f 0(c(x))

g0(c(x)).

Since x < c(x) < b, it follows that limx"b c(x) = b. This shows that

L = limx"b

f 0(x)

g0(x)= lim

x"bf 0(c(x))

g0(c(x))= lim

x"bf(x)

g(x).

⇤Several things should be noted about this proof. First, there is nothing special

about the left-hand limit used in the statement of the theorem. It could just aseasily be written in terms of the right-hand limit. Second, if lim

x!a

f(x)/g(x) isnot of the indeterminate form 0/0, then applying L’Hospital’s rule will usually givea wrong answer. To see this, consider

limx!0

x

x + 1= 0 6= 1 = lim

x!0

1

1.

Another case where the indeterminate form 0/0 occurs is in the limit at infinity.That L’Hopital’s rule works in this case can easily be deduced from Theorem 7.18.

Corollary 7.19. Suppose f and g are di↵erentiable on (a, 1) and

limx!1 f(x) = lim

x!1 g(x) = 0.

If g0(x) 6= 0 on (a, 1) and limx!1 f 0(x)/g0(x) = L, where L could be infinite, then

limx!1 f(x)/g(x) = L.

Proof. There is no generality lost by assuming a > 0. Let

F (x) =

(f(1/x), x 2 (0, 1/a]

0, x = 0and G(x) =

(g(1/x), x 2 (0, 1/a]

0, x = 0.

Thenlimx#0

F (x) = limx!1 f(x) = 0 = lim

x!1 g(x) = limx#0

G(x),



so both F and G are continuous at 0. It follows that both F and G are continuouson [0, 1/a] and di↵erentiable on (0, 1/a) with G0(x) = �g0(x)/x2 6= 0 on (0, 1/a)and lim

x#0

F 0(x)/G0(x) = limx!1 f 0(x)/g0(x) = L. The rest follows from Theorem

7.18. ⇤The other standard indeterminate form arises when

limx!1 f(x) = 1 = lim

x!1 g(x).

This is called an 1/1 indeterminate form. It is often handled by the followingtheorem.

Theorem 7.20 (Hard L’Hospital’s Rule). Suppose that f and g are di↵erentiable

on (a, 1) and g0(x) 6= 0 on (a, 1). If

limx!1 f(x) = lim

x!1 g(x) = 1 and limx!1

f 0(x)

g0(x)= L 2 R [ {�1, 1},

then

limx!1

f(x)

g(x)= L.

Proof. First, suppose L 2 R and let " > 0. Choose a1

> a large enough sothat ��

f 0(x)

g0(x)� L

�� < ", 8x > a1

.

Since limx!1 f(x) = 1 = lim

x!1 g(x), we can assume there is an a2

> a1

suchthat both f(x) > 0 and g(x) > 0 when x > a

2

. Finally, choose a3

> a2

such thatwhenever x > a

3

, then f(x) > f(a2

) and g(x) > g(a2

).Let x > a

3

and apply Cauchy’s Mean Value Theorem, Theorem 7.12, to f andg on [a

2

, x] to find a c(x) 2 (a2

, x) such that

(60)f 0(c(x))

g0(c(x))=

f(x) � f(a2

)

g(x) � g(a2

)=

f(x)⇣1 � f(a2)

f(x)

⌘

g(x)⇣1 � g(a2)

g(x)

⌘ .

If

h(x) =1 � g(a2)

g(x)

1 � f(a2)

f(x)

,

then (60) impliesf(x)

g(x)=

f 0(c(x))

g0(c(x))h(x).

Since limx!1 h(x) = 1, there is an a

4

> a3

such that whenever x > a4

, then|h(x) � 1| < ". If x > a

4

, then��f(x)

g(x)� L

�� =��f 0(c(x))

g0(c(x))h(x) � L

��

=

��f 0(c(x))

g0(c(x))h(x) � Lh(x) + Lh(x) � L

��

��f 0(c(x))

g0(c(x))� L

�� |h(x)| + |L||h(x) � 1|< "(1 + ") + |L|" = (1 + |L| + ")"



can be made arbitrarily small through a proper choice of ". Therefore

limx!1 f(x)/g(x) = L.

The case when L = 1 is done similarly by first choosing a B > 0 and adjusting(60) so that f 0(x)/g0(x) > B when x > a

1

. A similar adjustment is necessary whenL = �1. ⇤

There is a companion corollary to Theorem 7.20 which is proved in the sameway as Corollary 7.19.

Corollary 7.21. Suppose that f and g are continuous on [a, b] and di↵eren-

tiable on (a, b) with g0(x) 6= 0 on (a, b). If

limx#a

f(x) = limx#a

g(x) = 1 and limx#a

f 0(x)

g0(x)= L 2 R [ {�1, 1},

then

limx#a

f(x)

g(x)= L.

Example 7.10. If ↵ > 0, then limx!1 ln x/x↵ is of the indeterminate form

1/1. Taking derivatives of the numerator and denominator yields

limx!1

1/x

↵x↵�1

= limx!1

1

↵x↵

= 0.

Theorem 7.20 now implies limx!1 ln x/x↵ = 0, and therefore ln x increases more

slowly than any positive power of x.

Example 7.11. Let f be as in Example 7.9. It is clear f (n)(x) exists whenevern 2 ! and x 6= 0. We claim f (n)(0) = 0. To see this, we first prove that

limx!0

e�1/x

2

xn

= 0, 8n 2 Z.(61)

When n 0, (61) is obvious. So, suppose (61) is true whenever m n for somen 2 !. Making the substitution u = 1/x, we see

limx#0

e�1/x

2

xn+1

= limu!1

un+1

eu2 .(62)

Since

limu!1

(n + 1)un

2ueu2 = limu!1

(n + 1)un�1

2eu2 =n + 1

2limx#0

e�1/x

2

xn�1

= 0

by the inductive hypothesis, Theorem 7.20 gives (62) in the case of the right-handlimit. The left-hand limit is handled similarly. Finally, (61) follows by induction.

When x 6= 0, a bit of experimentation can convince the reader that f (n)(x)

is of the form pn

(1/x)e�1/x

2

, where pn

is a polynomial. Induction and repeatedapplications of (61) establish that f (n)(0) = 0 for n 2 !.



6. Exercises

7.1. If

f(x) =

(x2, x 2 Q0, otherwise

,

then show D(f) = {0} and find f 0(0).

7.2. Let f be a function defined on some neighborhood of x = a with f(a) = 0.Prove f 0(a) = 0 if and only if a 2 D(|f |).7.3. If f is defined on an open set containing x

0

, the symmetric derivative of f atx

0

is defined as

fs(x0

) = limh!0

f(x0

+ h) � f(x0

� h)

2h.

Prove that if f 0(x) exists, then so does fs(x). Is the converse true?

7.4. Let G be an open set and f 2 D(G). If there is an a 2 G such thatlim

x!a

f 0(x) exists, then limx!a

f 0(x) = f 0(a).

7.5. Suppose f is continuous on [a, b] and f 00 exists on (a, b). If there is anx

0

2 (a, b) such that the line segment between (a, f(a)) and (b, f(b)) contains thepoint (x

0

, f(x0

)), then there is a c 2 (a, b) such that f 00(c) = 0.

7.6. If � = {f : f = F 0 for some F : R ! R}, then � is closed under addition andscalar multiplication. (This shows the di↵erentiable functions form a vector space.)

7.7. If

f1

(x) =

(1/2, x = 0

sin(1/x), x 6= 0

and

f2

(x) =

(1/2, x = 0

sin(�1/x), x 6= 0,

then at least one of f1

and f2

is not in �.

7.8. Prove or give a counter example: If f is continuous on R and di↵erentiableon R \ {0} with lim

x!0

f 0(x) = L, then f is di↵erentiable on R.

7.9. Suppose f is di↵erentiable everywhere and f(x+y) = f(x)f(y) for all x, y 2 R.Show that f 0(x) = f 0(0)f(x) and determine the value of f 0(0).

7.10. If I is an open interval, f is di↵erentiable on I and a 2 I, then there is asequence a

n

2 I \ {a} such that an

! a and f 0(an

) ! f 0(a).

7.11. Use the definition of the derivative to findd

dx

px.

7.12. Let f be continuous on [0, 1) and di↵erentiable on (0, 1). If f(0) = 0 and|f 0(x)| < |f(x)| for all x > 0, then f(x) = 0 for all x � 0.


6. EXERCISES 7-15

7.13. Suppose f : R ! R is such that f 0 is continuous on [a, b]. If there is ac 2 (a, b) such that f 0(c) = 0 and f 00(c) > 0, then f has a local minimum at c.

7.14. Prove or give a counter example: If f is continuous on R and di↵erentiableon R \ {0} with lim

x!0

f 0(x) = L, then f is di↵erentiable on R.

7.15. Let f be continuous on [a, b] and di↵erentiable on (a, b). If f(a) = ↵ and|f 0(x)| < � for all x 2 (a, b), then calculate a bound for f(b).

7.16. Suppose that f : (a, b) ! R is di↵erentiable and f 0 is bounded. If xn

is asequence from (a, b) such that x

n

! a, then f(xn

) converges.

7.17. Let G be an open set and f 2 D(G). If there is an a 2 G such thatlim

x!a

f 0(x) exists, then limx!a

f 0(x) = f 0(a).

7.18. Prove or give a counter example: If f 2 D((a, b)) such that f 0 is bounded,then there is an F 2 C([a, b]) such that f = F on (a, b).

7.19. Show that f(x) = x3 + 2x + 1 is invertible on R and, if g = f�1, then findg0(1).

7.20. Suppose that I is an open interval and that f 00(x) � 0 for all x 2 I. If a 2 I,then show that the part of the graph of f on I is never below the tangent line tothe graph at (a, f(a)).

7.21. Suppose f is continuous on [a, b] and f 00 exists on (a, b). If there is anx

0

2 (a, b) such that the line segment between (a, f(a)) and (b, f(b)) contains thepoint (x

0

, f(x0

)), then there is a c 2 (a, b) such that f 00(c) = 0.

7.22. Let f be defined on a neighborhood of x.

(a) If f 00(x) exists, then

limh!0

f(x � h) � 2f(x) + f(x + h)

h2

= f 00(x).

(b) Find a function f where this limit exists, but f 00(x) does not exist.

7.23. If f : R ! R is di↵erentiable everywhere and is even, then f 0 is odd. Iff : R ! R is di↵erentiable everywhere and is odd, then f 0 is even.7

7.24. Prove that ��sin x �✓

x � x3

6+

x5

120

◆�� <1

5040when |x| 1.

7A function g is even if g(�x) = g(x) for every x and it is odd if g(�x) = �g(x) for everyx. The terms are even and odd because this is how g(x) = x

n behaves when n is an even or oddinteger, respectively.


CHAPTER 8

Integration

Contrary to the impression given by most calculus courses, there are many waysto define integration. The one given here is called the Riemann integral or theRiemann-Darboux integral, and it is the one most commonly presented to calculusstudents.

1. Partitions

A partition of the interval [a, b] is a finite set P ⇢ [a, b] such that {a, b} ⇢ P .The set of all partitions of [a, b] is denoted part([a, b]). Basically, a partition shouldbe thought of as a way to divide an interval into a finite number of subintervals bychoosing some points where it is divided.

If P 2 part([a, b]), then the elements of P can be ordered in a list as a = x0

<x

1

< · · · < xn

= b. The adjacent points of this partition determine n compactintervals of the form IP

k

= [xk�1

, xk

], 1 k n. If the partition is clear from thecontext, we write I

k

instead of IPk

. It’s clear that these intervals only intersect attheir common endpoints and there is no requirement they have the same length.

Since it’s inconvenient to always list each part of a partition, we’ll use thepartition of the previous paragraph as the generic partition. Unless it’s necessarywithin the context to specify some other form for a partition, assume any partitionis the generic partition. (See Figure 1.)

If I is any interval, its length is written |I|. Using the notation of the previousparagraph, it follows that

nX

k=1

|Ik

| =nX

k=1

(xk

� xk�1

) = xn

� x0

= b � a.

The norm of a partition P is

kPk = max{|IPk

| : 1 k n}.

In other words, the norm of P is just the length of the longest subinterval determinedby P . If |I

k

| = kPk for every Ik

, then P is called a regular partition.Suppose P, Q 2 part([a, b]). If P ⇢ Q, then Q is called a refinement of P . When

this happens, we write P ⌧ Q. In this case, it’s easy to see that P ⌧ Q implies

x1

a b

x0

x2

x3

x4

x5

I1

I2

I3

I4

I5

Figure 1. The generic partition with five subintervals.

8-1

8-2 CHAPTER 8. INTEGRATION

kPk � kQk. It also follows at once from the definitions that P [ Q 2 part([a, b])with P ⌧ P [ Q and Q ⌧ P [ Q. The partition P [ Q is called the common

refinement of P and Q.

2. Riemann Sums

Let f : [a, b] ! R and P 2 part([a, b]). Choose x⇤k

2 Ik

for each k. The set{x⇤

k

: 1 k n} is called a selection from P . The expression

R (f, P, x⇤k

) =nX

k=1

f(x⇤k

)|Ik

|

is the Riemann sum for f with respect to the partition P and selection x⇤k

. TheRiemann sum is the usual first step toward integration in a calculus course and canbe visualized as the sum of the areas of rectangles with height f(x⇤

k

) and width |Ik

|— as long as the rectangles are allowed to have negative area when f(x⇤

k

) < 0. (SeeFigure 2.)

Notice that given a particular function f and partition P , there are an uncount-ably infinite number of di↵erent possible Riemann sums, depending on the selectionx⇤k

. This sometimes makes working with Riemann sums quite complicated.

Example 8.1. Suppose f : [a, b] ! R is the constant function f(x) = c. IfP 2 part([a, b]) and {x⇤

k

: 1 k n} is any selection from P , then

R (f, P, x⇤k

) =nX

k=1

f(x⇤k

)|Ik

| = c

nX

k=1

|Ik

| = c(b � a).

Example 8.2. Suppose f(x) = x on [a, b]. Choose any P 2 part([a, b]) wherekPk < 2(b � a)/n. (Convince yourself this is always possible.1) Make two specific

1This is with the generic partition

a bx1 x

2

x3

x4x

0

x⇤1

x⇤2

x⇤3

x⇤4

y = f(x)

Figure 2. The Riemann sum R (f, P, x⇤k

) is the sum of the areas of therectangles in this figure. Notice the right-most rectangle has negativearea because f(x⇤

4

) < 0.


Riemann Sums 8-3

selections l⇤k

= xk�1

and r⇤k

= xk

. If x⇤k

is any other selection from P , thenl⇤k

x⇤k

r⇤k

and the fact that f is increasing on [a, b] gives

R (f, P, l⇤k

) R (f, P, x⇤k

) R (f, P, r⇤k

) .

With this in mind, consider the following calculation.

R (f, P, r⇤k

) � R (f, P, l⇤k

) =nX

k=1

(r⇤k

� l⇤k

)|Ik

|(68)

=nX

k=1

(xk

� xk�1

)|Ik

|

=nX

k=1

|Ik

|2

nX

k=1

kPk2

= nkPk2

<4(b � a)2

n

This shows that if a partition is chosen with a small enough norm, all the Riemannsums for f over that partition will be close to each other.

In the special case when P is a regular partition, |Ik

| = (b � a)/n, rk

=a + k(b � a)/n and

R (f, P, r⇤k

) =nX

k=1

rk

|Ik

|

=nX

k=1

✓a +

k(b � a)

n

◆b � a

n

=b � a

n

na +

b � a

n

nX

k=1

k

!

=b � a

n

✓na +

b � a

n

n(n + 1)

2

◆

=b � a

2

✓an � 1

n+ b

n + 1

n

◆.

In the limit as n ! 1, this becomes the familiar formula (b2 �a2)/2, for the integralof f(x) = x over [a, b].

Definition 8.1. The function f is Riemann integrable on [a, b], if there exists anumber R(f) such that for all " > 0 there is a � > 0 so that whenever P 2 part([a, b])with kPk < �, then

|R(f) � R (f, P, x⇤k

) | < "

for any selection x⇤k

from P .

Theorem 8.2. If f : [a, b] ! R and R(f) exists, then R(f) is unique.



Proof. Suppose R1

(f) and R2

(f) both satisfy the definition and " > 0. Fori = 1, 2 choose �

i

> 0 so that whenever kPk < �i

, then

|Ri

(f) � R (f, P, x⇤k

) | < "/2,

as in the definition above. If P 2 part([a, b]) so that kPk < �1

^ �2

, then

|R1

(f) � R2

(f)| |R1

(f) � R (f, P, x⇤k

) | + |R2

(f) � R (f, P, x⇤k

) | < "

and it follows R1

(f) = R2

(f). ⇤

Theorem 8.3. If f : [a, b] ! R and R(f) exists, then f is bounded.

Proof. Left as Exercise 8.1. ⇤

3. Darboux Integration

As mentioned above, a di�culty with handling Riemann sums is there are anuncountably so many di↵erent ways to choose partitions and selections that workingwith them is unwieldy. One way to resolve this problem was shown in Example 8.2,where it was easy to find largest and smallest Riemann sums associated with eachpartition. However, that’s not always a straightforward calculation, so to use thatidea, a little more care must be taken.

Definition 8.4. Let f : [a, b] ! R be bounded and P 2 part([a, b]). For eachIk

determined by P , let

Mk

= lub {f(x) : x 2 Ik

} and mk

= glb {f(x) : x 2 Ik

}.

The upper and lower Darboux sums for f on [a, b] are

D(f, P ) =nX

k=1

Mk

|Ik

| and D(f, P ) =nX

k=1

mk

|Ik

|.

The following theorem is the fundamental relationship between Darboux sums.Pay careful attention because it’s the linchpin holding everything together!

Theorem 8.5. If f : [a, b] ! R is bounded and P, Q 2 part([a, b]) with P ⌧ Q,

then

D(f, P ) D(f, Q) D(f, Q) D(f, P ).

Proof. Let P be the generic partition and let Q = P [ {x}, where x 2(x

k0�1

, xx0) for some k

0

. Clearly, P ⌧ Q. Let

Ml

= lub {f(x) : x 2 [xk0�1

, x]}m

l

= glb {f(x) : x 2 [xk0�1

, x]}M

r

= lub {f(x) : x 2 [x, xk0 ]}

mr

= glb {f(x) : x 2 [x, xk0 ]}

Then

mk0 m

l

Ml

Mk0 and m

k0 mr

Mr

Mk0


Darboux Integration 8-5

so that

mk0 |Ik0 | = m

k0 (|[xk0�1

, x]| + |[x, xk0 ]|)

ml

|[xk0�1

, x]| + mr

|[x, xk0 ]|

Ml

|[xk0�1

, x]| + Mr

|[x, xk0 ]|

Mk0 |[xk0�1

, x]| + Mk0 |[x, x

k0 ]|= M

k0 |Ik0 |.This implies

D(f, P ) =nX

k=1

mk

|Ik

|

=X

k 6=k0

mk

|Ik

| + mk0 |Ik0 |

X

k 6=k0

mk

|Ik

| + ml

|[xk0�1

, x]| + mr

|[x, xk0 ]|

= D(f, Q)

D(f, Q)

=X

k 6=k0

Mk

|Ik

| + Ml

|[xk0�1

, x]| + Mr

|[x, xk0 ]|

nX

k=1

Mk

|Ik

|

= D(f, P )

The argument given above shows that the theorem holds if Q has one more pointthan P . Using induction, this same technique also shows the theorem holds when Qhas an arbitrarily larger number of points than P . ⇤

The main lesson to be learned from Theorem 8.5 is that refining a partitioncauses the lower Darboux sum to increase and the upper Darboux sum to decrease.Moreover, if P, Q 2 part([a, b]) and f : [a, b] ! [�B, B], then,

�B(b � a) D(f, P ) D(f, P [ Q) D(f, P [ Q) D(f, Q) B(b � a).

Therefore every Darboux lower sum is less than or equal to every Darboux upper

sum. Consider the following definition with this in mind.

Definition 8.6. The upper and lower Darboux integrals of a bounded functionf : [a, b] ! R are

D(f) = glb {D(f, P ) : P 2 part([a, b])}and

D(f) = lub {D(f, P ) : P 2 part([a, b])},

respectively.

As a consequence of the observations preceding the definition, it follows thatD(f) � D(f) always. In the case D(f) = D(f), the function is said to be Darboux

integrable on [a, b], and the common value is written D(f).The following is obvious.



Corollary 8.7. A bounded function f : [a, b] ! R is Darboux integrable if and

only if for all " > 0 there is a P 2 part([a, b]) such that D(f, P ) � D(f, P ) < ".

Which functions are Darboux integrable? The following corollary gives a firstapproximation to an answer.

Corollary 8.8. If f 2 C([a, b]), then D(f) exists.

Proof. Let " > 0. According to Corollary 6.31, f is uniformly continuous, sothere is a � > 0 such that whenever x, y 2 [a, b] with |x�y| < �, then |f(x)�f(y)| <"/(b � a). Let P 2 part([a, b]) with kPk < �. By Corollary 6.23, in each subintervalIi

determined by P , there are x⇤i

, y⇤i

2 Ii

such that

f(x⇤i

) = lub {f(x) : x 2 Ii

} and f(y⇤i

) = glb {f(x) : x 2 Ii

}.

Since |x⇤i

� y⇤i

| |Ii

| < �, we see 0 f(x⇤i

)� f(y⇤i

) < "/(b�a), for 1 i n. Then

D(f) � D(f) D(f, P ) � D(f, P )

=nX

i=1

f(x⇤i

)|Ii

| �nX

i=1

f(y⇤i

)|Ii

|

=nX

i=1

(f(x⇤i

) � f(y⇤i

))|Ii

|

<"

b � a

nX

i=1

|Ii

|

= "

and the corollary follows. ⇤This corollary should not be construed to imply that only continuous functions

are Darboux integrable. In fact, the set of integrable functions is much moreextensive than only the continuous functions. Consider the following example.

Example 8.3. Let f be the salt and pepper function of Example 6.15. It wasshown that C(f) = Qc. We claim that f is Darboux integrable over any compactinterval [a, b].

To see this, let " > 0 and N 2 N so that 1/N < "/2(b � a). Let

{qk

i

: 1 i m} = {qk

: 1 k N} \ [a, b]

and choose P 2 part([a, b]) such that kPk < "/2m. Then

D(f, P ) =nX

`=1

lub {f(x) : x 2 I`

}|I`

|

=X

q

k

i

/2I

`

lub {f(x) : x 2 I`

}|I`

| +X

q

k

i

2I

`

lub {f(x) : x 2 I`

}|I`

|

1

N(b � a) + mkPk

<"

2(b � a)(b � a) + m

"

2m

= ".

Since f(x) = 0 whenever x 2 Qc, it follows that D(f, P ) = 0. Therefore, D(f) =D(f) = 0 and D(f) = 0.


The Integral 8-7

4. The Integral

There are now two di↵erent definitions for the integral. It would be embarassing,if they gave di↵erent answers. The following theorem shows they’re really di↵erentsides of the same coin.2

Theorem 8.9. Let f : [a, b] ! R.(a) R(f) exists i↵ D(f) exists.

(b) If R(f) exists, then R(f) = D(f).

Proof. (a) (=)) Suppose R(f) exists and " > 0. By Theorem 8.3, f isbounded. Choose P 2 part([a, b]) such that

|R(f) � R (f, P, x⇤k

) | < "/4

for all selections x⇤k

from P . From each Ik

, choose xk

and xk

so that

Mk

� f(xk

) <"

4(b � a)and f(x

k

) � mk

<"

4(b � a).

Then

D(f, P ) � R (f, P, xk

) =nX

k=1

Mk

|Ik

| �nX

k=1

f(xk

)|Ik

|

=nX

k=1

(Mk

� f(xk

))|Ik

|

<"

4(b � a)(b � a) =

"

4.

In the same way,

R (f, P, xk

) � D(f, P ) < "/4.

Therefore,

D(f) � D(f)

= glb {D(f, Q) : Q 2 part([a, b])} � lub {D(f, Q) : Q 2 part([a, b])} D(f, P ) � D(f, P )

<⇣R (f, P, x

k

) +"

4

⌘�⇣R (f, P, x

k

) � "

4

⌘

|R (f, P, xk

) � R (f, P, xk

)| +"

2

< |R (f, P, xk

) � R(f)| + |R(f) � R (f, P, xk

) | +"

2< "

Since " is an arbitrary positive number, this shows D(f) exists and equals R(f),which is part (b) of the theorem.

2Theorem 8.9 shows that the two integrals presented here are the same. But, there are manyother integrals, and not all of them are equivalent. For example, the well-known Lebesgue integralincludes all Riemann integrable functions, but not all Lebesgue integrable functions are Riemannintegrable. The Denjoy integral is another extension of the Riemann integral which is not the sameas the Lebesgue integral. For more discussion of this, see [9].



((=) Suppose f : [a, b] ! [�B, B], D(f) exists and " > 0. Since D(f) exists,there is a P

1

2 part([a, b]), with points a = p0

< · · · < pm

= b, such that

D(f, P1

) � D(f, P1

) <"

2.

Set � = "/8mB. Choose P 2 part([a, b]) with kPk < � and let P2

= P [ P1

. SinceP

1

⌧ P2

, according to Theorem 8.5,

D(f, P2

) � D(f, P2

) <"

2.

Thinking of P as the generic partition, the interiors of its intervals (xi�1

, xi

)may or may not contain points of P

1

. For 1 i n, let

Qi

= {xi�1

, xi

} [ (P1

\ (xi�1

, xi

)) 2 part(Ii

).

If P1

\(xi�1

, xi

) = ;, then D(f, P ) and D(f, P2

) have the term Mi

|Ii

| in commonbecause Q

i

= {xi�1

, xi

}.Otherwise, P

1

\ (xi�1

, xi

) 6= ; and

D(f, Qi

) � �BkP2

k � �BkPk > �B�.

Since P1

has m � 1 points in (a, b), there are at most m � 1 of the Qi

not containedin P .

This leads to the estimate

D(f, P ) � D(f, P2

) = D(f, P ) �nX

i=1

D(f, Qi

) < (m � 1)2B� <"

4.

In the same way,

D(f, P2

) � D(f, P ) < (m � 1)2B� <"

4.

Putting these estimates together yields

D(f, P ) � D(f, P ) =�D(f, P ) � D(f, P

2

)�

+�D(f, P

2

) � D(f, P2

)�

+ (D(f, P2

) � D(f, P ))

<"

4+

"

2+

"

4= "

This shows that, given " > 0, there is a � > 0 so that kPk < � implies

D(f, P ) � D(f, P ) < ".

Since

D(f, P ) D(f) D(f, P ) and D(f, P ) R(f, P, x⇤i

) D(f, P )

for every selection x⇤i

from P , it follows that |R(f, P, x⇤i

)�D(f)| < " when kPk < �.We conclude f is Riemann integrable and R(f) = D(f). ⇤

From Theorem 8.9, we are justified in using a single notation for both R(f) and

D(f). The obvious choice is the familiarRb

a

f(x) dx, or, more simply,Rb

a

f .When proving statements about the integral, it’s convenient to switch back and

forth between the Riemann and Darboux formulations. Given f : [a, b] ! R thefollowing three facts summarize much of what we know.

(1)Rb

a

f exists i↵ for all " > 0 there is a � > 0 and an ↵ 2 R such that wheneverP 2 part([a, b]) and x⇤

i

is a selection from P , then |R (f, P, x⇤i

) � ↵| < ".

In this caseRb

a

f = ↵.


The Cauchy Criterion 8-9

(2)Rb

a

f exists i↵ 8" > 09P 2 part([a, b])�D(f, P ) � D(f, P ) < "

�

(3) For any P 2 part([a, b]) and selection x⇤i

from P ,

D(f, P ) R (f, P, x⇤i

) D(f, P ).

5. The Cauchy Criterion

We now face a conundrum. In order to show thatRb

a

f exists, we must knowits value. It’s often very hard to determine the value of an integral, even if theintegral exists. We’ve faced this same situation before with sequences. The basicdefinition of convergence for a sequence, Definition 3.2, requires the limit of thesequence be known. The path out of the dilemma in the case of sequences was theCauchy criterion for convergence, Theorem 3.22. The solution is the same here,with a Cauchy criterion for the existence of the integral.

Theorem 8.10 (Cauchy Criterion). Let f : [a, b] ! R. The following statements

are equivalent.

(a)Rb

a

f exists.

(b) Given " > 0 there exists P 2 part([a, b]) such that if P ⌧ Q1

and

P ⌧ Q2

, then

|R (f, Q1

, x⇤k

) � R (f, Q2

, y⇤k

)| < "(69)

for any selections from Q1

and Q2

.

Proof. (=)) AssumeRb

a

f exists. According to Definition 8.1, there is a � > 0

such that whenever P 2 part([a, b]) with kPk < �, then | R b

a

f � R (f, P, x⇤i

) | < "/2for every selection. If P ⌧ Q

1

and P ⌧ Q2

, then kQ1

k < �, kQ2

k < � and a simpleapplication of the triangle inequality shows

|R (f, Q1

, x⇤k

) � R (f, Q2

, y⇤k

)|

��R (f, Q

1

, x⇤k

) �Z

b

a

f

��+

��

Zb

a

f � R (f, Q2

, y⇤k

)

�� < ".

((=) Let " > 0 and choose P 2 part([a, b]) satisfying (b) with "/2 in place of ".We first claim that f is bounded. To see this, suppose it is not. Then it must be

unbounded on an interval Ik0 determined by P . Fix a selection {x⇤

k

2 Ik

: 1 k n}and let y⇤

k

= x⇤k

for k 6= k0

with y⇤k0

any element of Ik0 . Then

"

2> |R (f, P, x⇤

k

) � R (f, P, y⇤k

)| =��f(x⇤

k0) � f(y⇤

k0)�� |I

k0 | .

But, the right-hand side can be made bigger than "/2 with an appropriate choiceof y⇤

k0because of the assumption that f is unbounded on I

k0 . This contradictionforces the conclusion that f is bounded.

Thinking of P as the generic partition and using mk

and Mk

as usual withDarboux sums, for each k, choose x⇤

k

, y⇤k

2 Ik

such that

Mk

� f(x⇤k

) <"

4n|Ik

| and f(y⇤k

) � mk

<"

4n|Ik

| .



With these selections,

D(f, P ) � D(f, P )

= D(f, P ) � R (f, P, x⇤k

) + R (f, P, x⇤k

) � R (f, P, y⇤k

) + R (f, P, y⇤k

) � D(f, P )

=nX

k=1

(Mk

� f(x⇤k

))|Ik

| + R (f, P, x⇤k

) � R (f, P, y⇤k

) +nX

k=1

(f(y⇤k

) � mk

)|Ik

|

nX

k=1

|Mk

� f(x⇤k

)| |Ik

| + |R (f, P, x⇤k

) � R (f, P, y⇤k

)| +

��

nX

k=1

(f(y⇤k

) � mk

)|Ik

|��

<

nX

k=1

"

4n|Ik

| |Ik| + |R (f, P, x⇤k

) � R (f, P, y⇤k

)| +nX

k=1

"

4n|Ik

| |Ik|

<"

4+

"

2+

"

4< "

Corollary 8.7 implies D(f) exists and Theorem 8.9 finishes the proof. ⇤

Corollary 8.11. If

Rb

a

f exists and [c, d] ⇢ [a, b], thenRd

c

f exists.

Proof. Let P0

= {a, b, c, d} 2 part([a, b]) and " > 0. Using Theorem 8.10,choose a partition P

"

such that P0

⌧ P"

and whenever P"

⌧ P and P"

⌧ P 0, then

|R (f, P, x⇤k

) � R (f, P 0, y⇤k

) | < ".

Let P 1

"

= P"

\ [a, c], P 2

"

= P"

\ [c, d] and P 3

"

= P"

\ [d, b]. Suppose P 2

"

⌧ Q1

andP 2

"

⌧ Q2

. Then P 1

"

[ Qi

[ P 3

"

for i = 1, 2 are refinements of P"

and

|R (f, Q1

, x⇤k

) � R (f, Q2

, y⇤k

) | =

|R �f, P 1

"

[ Q1

[ P 3

"

, x⇤k

�� R �f, P 1

"

[ Q2

[ P 3

"

, y⇤k

� | < "

for any selections. An application of Theorem 8.10 showsRb

a

f exists. ⇤

6. Properties of the Integral

Theorem 8.12. If

Rb

a

f and

Rb

a

g both exist, then

(a) If ↵, � 2 R, thenRb

a

(↵f +�g) exists and

Rb

a

(↵f +�g) = ↵Rb

a

f +�Rb

a

g.

(b)Rb

a

fg exists.

(c)Rb

a

|f | exists.

Proof. (a) Let " > 0. If ↵ = 0, in light of Example 8.1, it is clear ↵f isintegrable. So, assume ↵ 6= 0, and choose a partition P

f

2 part([a, b]) such thatwhenever P

f

⌧ P , then

��R (f, P, x⇤k

) �Z

b

a

f

�� <"

2|↵| .


Properties of the Integral 8-11

Then

��R (↵f, P, x⇤k

) � ↵

Zb

a

f

�� =

��

nX

k=1

↵f(x⇤k

)|Ik

| � ↵

Zb

a

f

��

= |↵|��

nX

k=1

f(x⇤k

)|Ik

| �Z

b

a

f

��

= |↵|��R (f, P, x⇤

k

) �Z

b

a

f

��

< |↵| "

2|↵|=

"

2.

This shows ↵f is integrable andRb

a

↵f = ↵Rb

a

f .Assuming � 6= 0, in the same way, we can choose a P

g

2 part([a, b]) such thatwhen P

g

⌧ P , then

��R (g, P, x⇤k

) �Z

b

a

g

�� <"

2|�| .

Let P"

= Pf

[ Pg

be the common refinement of Pf

and Pg

, and suppose P"

⌧ P .Then

|R (↵f + �g, P, x⇤k

) �

↵

Zb

a

f + �

Zb

a

g

!|

|↵|��R (f, P, x⇤

k

) �Z

b

a

f

��+ |�|��R (g, P, x⇤

k

) �Z

b

a

g

�� < "

for any selection. This shows ↵f +�g is integrable andRb

a

(↵f +�g) = ↵Rb

a

f +�Rb

a

g.

(b) Claim: IfRb

a

h exists, then so doesRb

a

h2

To see this, suppose first that 0 h(x) M on [a, b]. If M = 0, the claim istrivially true, so suppose M > 0. Let " > 0 and choose P 2 part([a, b]) such that

D(h, P ) � D(h, P ) "

2M.

For each 1 k n, let

mk

= glb {h(x) : x 2 Ik

} lub {h(x) : x 2 Ik

} = Mk

.

Since h � 0,

m2

k

= glb {h(x)2 : x 2 Ik

} lub {h(x)2 : x 2 Ik

} = M2

k

.



Using this, we see

D(h2, P ) � D(h2, P ) =nX

k=1

(M2

k

� m2

k

)|Ik

|

=nX

k=1

(Mk

+ mk

)(Mk

� mk

)|Ik

|

2M

nX

k=1

(Mk

� mk

)|Ik

|!

= 2M�D(h, P ) � D(h, P )

�

< ".

Therefore, h2 is integrable when h � 0.If h is not nonnegative, let m = glb {h(x) : a x b}. Then h � m � 0, and

h � m is integrable by (a). From the claim, (h � m)2 is integrable. Since

h2 = (h � m)2 + 2mh � m2,

it follows from (a) that h2 is integrable.Finally, fg = 1

4

((f + g)2 � (f � g)2) is integrable by the claim and (a).

(c) Claim: If h � 0 is integrable, then so isp

h.To see this, let " > 0 and choose P 2 part([a, b]) such that

D(h, P ) � D(h, P ) < "2.

For each 1 k n, let

mk

= glb {p

h(x) : x 2 Ik

} lub {p

h(x) : x 2 Ik

} = Mk

.

and define

A = {k : Mk

� mk

< "} and B = {k : Mk

� mk

� "}.

Then

(70)X

k2A

(Mk

� mk

)|Ik

| < "(b � a).

Using the fact that mk

� 0, we see that Mk

� mk

Mk

+ mk

, andX

k2B

(Mk

� mk

)|Ik

| 1

"

X

k2B

(Mk

+ mk

)(Mk

� mk

)|Ik

|(71)

=1

"

X

k2B

(M2

k

� m2

k

)|Ik

|

1

"

�D(h, P ) � D(h, P )�

< "

Combining (70) and (71), it follows that

D(p

h, P ) � D(p

h, P ) < "(b � a) + " = "((b � a) + 1)

can be made arbitrarily small. Therefore,p

h is integrable.Since |f | =

pf2 an application of (b) and the claim su�ce to prove (c). ⇤

Theorem 8.13. If

Rb

a

f exists, then


The Fundamental Theorem of Calculus 8-13

(a) If f � 0 on [a, b], thenRb

a

f � 0.

(b) | R b

a

f | R b

a

|f |(c) If a c b, then

Rb

a

f =Rc

a

f +Rb

c

f .

Proof. (a) Since all the Riemann sums are nonnegative, this follows at once.

(b) It is always true that |f | ± f � 0 and |f | � f � 0, so by (a),Rb

a

(|f | + f) � 0

andRb

a

(|f | � f) � 0. Rearranging these shows � R b

a

f Rb

a

|f | andRb

a

f Rb

a

|f |.Therefore, | R b

a

f | R b

a

|f |, which is (b).(c) By Corollary 8.11, all the integrals exist. Let " > 0 and choose P

l

2part([a, c]) and P

r

2 part([c, b]) such that whenever Pl

⌧ Ql

and Pr

⌧ Qr

, then,

��R (f, Ql

, x⇤k

) �Z

c

a

f

�� <"

2and

��R (f, Qr

, y⇤k

) �Z

b

c

f

�� <"

2.

If P = Pl

[ Pr

and Q = Ql

[ Qr

, then P, Q 2 part([a, b]) and P ⌧ Q. The triangleinequality gives

��R (f, Q, x⇤k

) �Z

c

a

f �Z

b

c

f

�� < ".

Since every refinement of P has the form Ql

[ Qr

, part (c) follows. ⇤

There’s some notational trickery that can be played here. IfRb

a

f exists, then

we defineRa

b

f = � R b

a

f . With this convention, it can be shownRb

a

f =Rc

a

f +Rb

c

fno matter the order of a, b and c, as long as at least two of the integrals exist. (SeeProblem 8.4.)

7. The Fundamental Theorem of Calculus

Theorem 8.14 (Fundamental Theorem of Calculus 1). Suppose f, F : [a, b] ! Rsatisfy

(a)Rb

a

f exists

(b) F 2 C([a, b]) \ D((a, b))(c) F 0(x) = f(x), 8x 2 (a, b)

Then

Rb

a

f = F (b) � F (a).

Proof. Let " > 0 and choose P"

2 part([a, b]) such that whenever P"

⌧ P ,then ��R (f, P, x⇤

k

) �Z

b

a

f

�� < ".

for every selection from P . On each interval [xk�1

, xk

] determined by P , thefunction F satisfies the conditions of the Mean Value Theorem. (See Corollary 7.13.)Therefore, for each k, there is an c

k

2 (xk�1

, xk

) such that

F (xk

) � F (xk�1

) = F 0(ck

)(xk

� xk�1

) = f(ck

)|Ik

|.



So,��

Zb

a

f � (F (b) � F (a))

�� =

��

Zb

a

f �nX

k=1

(F (xk

) � F (xk�1

)

��

=

��

Zb

a

f �nX

k=1

f(ck

)|Ik

|��

=

��

Zb

a

f � R (f, P, ck

)

��< "

and the theorem follows. ⇤

Example 8.4. Suppose f and its first n derivatives are all continuous on [a, b].There is a function R

n

(x, t) such that

Rn

(x, t) = f(x) �n�1X

k=0

f (k)(t)

k!(x � t)k

for a t b. Di↵erentiating both sides of the equation with respect to t, note thatthe right-hand side telescopes, so the result is

d

dtR

n

(x, t) = � (x � t)n�1

(n � 1)!f (n)(t).

Using Theorem 8.14 and the fact that Rn

(x, x) = 0 gives

Rn

(x, c) = Rn

(x, c) � Rn

(x, x)

=

Zc

x

d

dtR

n

(x, t) dt

=

Zx

c

(x � t)n�1

(n � 1)!f (n)(t) dt,

which is the integral form of the remainder from Taylor’s formula.

Corollary 8.15 (Integration by Parts). If f, g 2 C([a, b]) \ D((a, b)) and both

f 0g and fg0are integrable on [a, b], then

Zb

a

fg0 +

Zb

a

f 0g = f(b)g(b) � f(a)g(a).

Proof. Use Theorems 7.3(c) and 8.14. ⇤

SupposeRb

a

f exists. By Corollary 8.11, f is integrable on every interval [a, x],for x 2 [a, b]. This allows us to define a function F : [a, b] ! R as F (x) =

Rx

a

f ,called the indefinite integral of f on [a, b].

Theorem 8.16 (Fundamental Theorem of Calculus 2). Let f be integrable on

[a, b] and F be the indefinite integral of f . Then F 2 C([a, b]) and F 0(x) = f(x)whenever x 2 C(f) \ (a, b).


The Fundamental Theorem of Calculus 8-15

x x + h

M

m

Figure 3. This figure illustrates a “box” argument showing

limh!0

1h

Zx+h

x

f = f(x).

Proof. To show F 2 C([a, b]), let x0

2 [a, b] and " > 0. SinceRb

a

f exists,there is an M > lub {|f(x)| : a x b}. Choose 0 < � < "/M and x 2(x

0

� �, x0

+ �) \ [a, b]. Then

|F (x) � F (x0

)| =

��Z

x

x0

f

�� M |x � x0

| < M� < "

and x0

2 C(F ).Let x

0

2 C(f)\(a, b) and " > 0. There is a � > 0 such that x 2 (x0

��, x0

+�) ⇢(a, b) implies |f(x) � f(x

0

)| < ". If 0 < h < �, then��F (x

0

+ h) � F (x0

)

h� f(x

0

)

�� =

��1

h

Zx0+h

x0

f � f(x0

)

��

=

��1

h

Zx0+h

x0

(f(t) � f(x0

)) dt

��

1

h

Zx0+h

x0

|f(t) � f(x0

)| dt

<1

h

Zx0+h

x0

" dt

= ".

This shows F 0+

(x0

) = f(x0

). It can be shown in the same way that F 0�(x

0

) = f(x0

).Therefore F 0(x

0

) = f(x0

). ⇤

The right picture makes Theorem 8.16 almost obvious. Consider Figure 3.Suppose x 2 C(f) and " > 0. There is a � > 0 such that

f((x � d, x + d) \ [a, b]) ⇢ (f(x) � "/2, f(x) + "/2).

Letm = glb {fy : |x � y| < �} lub {fy : |x � y| < �} = M.



Apparently M � m < " and for 0 < h < �,

mh Z

x+h

x

f Mh =) m F (x + h) � F (x)

h M.

Since M � m ! 0 as h ! 0, a “squeezing” argument shows

limh#0

F (x + h) � F (x)

h= f(x).

A similar argument establishes the limit from the left and F 0(x) = f(x).It’s easy to read too much into the Fundamental Theorem of Calculus. We

are tempted to start thinking of integration and di↵erentiation as opposites ofeach other. But, this is far from the truth. The operations of integration andantidi↵erentiation are di↵erent operations, that happen to sometimes be tied togetherby the Fundamental Theorem of Calculus. Consider the following examples.

Example 8.5. Let

f(x) =

(|x|/x, x 6= 0

0, x = 0

It’s easy to prove that f is integrable over any compact interval, and that F (x) =Rx

�1

f = |x| � 1 is an indefinite integral of f . But, F is not di↵erentiable at x = 0and f is not a derivative, according to Theorem 7.16.

Example 8.6. Let

f(x) =

(x2 sin 1

x

2 , x 6= 0

0, x = 0

It’s straightforward to show that f is di↵erentiable and

f 0(x) =

(2x sin 1

x

2 � 2

x

cos 1

x

2 , x 6= 0

0, x = 0

Since f 0 is unbounded near x = 0, it follows from Theorem 8.3 that f 0 is notintegrable over any interval containing 0.

Example 8.7. Let f be the salt and pepper function of Example 6.15. It was

shown in Example 8.3 thatRb

a

f = 0 on any interval [a, b]. If F (x) =Rx

0

f , thenF (x) = 0 for all x and F 0 = f only on C(f) = Qc.

8. Change of Variables

Integration by substitution works side-by-side with the Fundamental Theorem ofCalculus in the integration section of any calculus course. Most of the time calculusbooks require all functions in sight to be continuous. In that case, a substitutiontheorem is an easy consequence of the Fundamental Theorem and the Chain Rule.(See Exercise 8.7.) More general statements are true, but they are harder to prove.

Theorem 8.17. If f and g are functions such that

(a) g is strictly monotone on [a, b],(b) g is continuous on [a, b],

(c) both

Rg(b)

g(a)

f and

Rb

a

(f � g)g0exist,


8. CHANGE OF VARIABLES 8-17

then

Zg(b)

g(a)

f =

Zb

a

(f � g)g0.(72)

Proof. Suppositions (a) and (b) show g is a bijection from [a, b] to an interval[c, d]. The correspondence between the endpoints depends on whether g is increasingor decreasing.

Let " > 0.From (d) and Definition 8.1, there is a �

1

> 0 such that whenever P 2 part([a, b])with kPk < �

1

, then��R ((f � g)g0, P, x⇤

i

) �Z

b

a

(f � g)g0�� <

"

2(73)

for any selection from P . Choose P1

2 part([a, b]) such that kP1

k < �1

.Using the same argument, there is a �

2

> 0 such that whenever Q 2 part([c, d])with kQk < �

2

, then��R (f, Q, x⇤

i

) �Z

d

c

f

�� <"

2(74)

for any selection from Q. As above, choose Q1

2 part([c, d]) such that kQ1

k < �2

.Setting P

2

= P1

[ {g�1(x) : x 2 Q1

} and Q2

= P1

[ {g(x) : x 2 P1

}, it isapparent that P

1

⌧ P2

, Q1

⌧ Q2

, kP2

k kP1

k < �1

, kQ2

k kQ1

k < �2

andQ

2

= {g(x) : x 2 P2

}. From (73) and (74), it follows that��

Zb

a

(f � g)g0 � R ((f � g)g0, P2

, x⇤i

)

�� <"

2

and(75)��

Zd

c

f � R (f, Q2

, y⇤i

)

�� <"

2

for any selections from P2

and Q2

.Label the points of P

2

as a = x1

< x2

< · · · < xn

= b and those of Q2

asc = y

0

< y1

< · · · < yn

= d. From (b), (c) and the Mean Value Theorem, for each i,choose c

i

2 (xi�1

, xi

) such that

g(xi

) � g(xi�1

) = g0(ci

)(xi

� xi�1

).(76)

Notice that {ci

: 1 i n} is a selection from P2

.First, assume g is strictly increasing. In this case g(x

i

) = yi

for 0 i n andg(c

i

) 2 (yi�1

, yi

) for 0 < i n, so g(ci

) is a selection from Q2

.��

Zg(b)

g(a)

f �Z

b

a

(f � g)g0��

=

��

Zg(b)

g(a)

f � R (f, Q2

, g(ci

)) + R (f, Q2

, g(ci

)) �Z

b

a

(f � g)g0��

��

Zg(b)

g(a)

f � R (f, Q2

, g(ci

))

��+

��R (f, Q2

, g(ci

)) �Z

b

a

(f � g)g0��



Use the triangle inequality and (75). Expand the second Riemann sum.

<"

2+

��

nX

i=1

f(g(ci

)) (g(xi

) � g(xi�1

)) �Z

b

a

(f � g)g0��

Apply the Mean Value Theorem, as in (76), and then use (75).

="

2+

��

nX

i=1

f(g(ci

))g0(ci

) (xi

� xi�1

) �Z

b

a

(f � g)g0��

="

2+

��R ((f � g)g0, P2

, ci

) �Z

b

a

(f � g)g0��

<"

2+

"

2= "

and (72) follows.Now assume g is strictly decreasing on [a, b]. The proof is much the same as above,

except the bookkeeping is trickier because order is reversed instead of preservedby g. This means g(x

k

) = yn�k

when 0 k n and g(cn�k+1

) 2 (yk�1

, yk

) for0 < k n. Therefore, g(c

n�k+1

) is a selection from Q2

. From the Mean ValueTheorem,

yk

� yk�1

= g(xn�k

) � g(xn�k+1

)

= �(g(xn�k+1

) � g(xn�k

))(77)

= �g0(cn�k+1

)(xn�k+1

� xn�k

),

where ck

2 (xk�1

, xk

) is as above. The rest of the proof is much like the case wheng is increasing.

��

Zg(b)

g(a)

f �Z

b

a

(f � g)g0��

=

��Z

g(a)

g(b)

f + R (f, Q2

, g(cn�k+1

)) � R (f, Q2

, g(cn�k+1

)) �Z

b

a

(f � g)g0��

��Z

g(b)

g(a)

f + R (f, Q2

, g(cn�k+1

))

��+

��R (f, Q2

, g(cn�k+1

)) �Z

b

a

(f � g)g0��

Use (75), expand the second Riemann sum and apply (77).

<"

2+

��nX

k=1

f(g(cn�k+1

))(yk

� yk�1

) �Z

b

a

(f � g)g0��

="

2+

��

nX

k=1

f(g(cn�k+1

))g0(cn�k+1

)(xn�k+1

� xn�k

) �Z

b

a

(f � g)|g0|��


Integral Mean Value Theorems 8-19

Reverse the order of the sum and use (75).

="

2+

��

nX

k=1

f(g(ck

))g0(ck

)(xk

� xk�1

) �Z

b

a

(f � g)g0��

="

2+

��R ((f � g)g0, P2

, ck

) �Z

b

a

(f � g)g0��

<"

2+

"

2= "

The theorem has been proved. ⇤

Example 8.8. Suppose we want to calculateR

1

�1

p1 � x2 dx. Using the notation

of Theorem 8.17, let f(x) =p

1 � x2, g(x) = sin x and [a, b] = [�⇡/2, ⇡/2]. In thiscase, g is an increasing function. Then (72) becomes

Z1

�1

p1 � x2 dx =

Zsin(⇡/2)

sin(�⇡/2)

p1 � x2 dx

=

Z⇡/2

�⇡/2

p1 � sin2 x cos x dx

=

Z⇡/2

�⇡/2

cos2 d dx

=⇡

2.

On the other hand, it can also be done with a decreasing function. If g(x) = cos xand [a, b] = [0, ⇡], then

Z1

�1

p1 � x2 dx =

Zcos 0

cos⇡

p1 � x2 dx

= �Z

cos⇡

cos 0

p1 � x2 dx

= �Z

⇡

0

p1 � cos2 x(� sin x) dx

=

Z⇡

0

p1 � cos2 x sin x dx

=

Z⇡

0

sin2 x dx

=⇡

2

9. Integral Mean Value Theorems

Theorem 8.18. Suppose f, g : [a, b] ! R are such that

(a) g(x) � 0 on [a, b],(b) f is bounded and m f(x) M for all x 2 [a, b], and

(c)Rb

a

f and

Rb

a

fg both exist.



There is a c 2 [m, M ] such that

Zb

a

fg = c

Zb

a

g.

Proof. Obviously,

m

Zb

a

g Z

b

a

fg M

Zb

a

g.(78)

IfRb

a

g = 0, we’re done. Otherwise, let

c =

Rb

a

fgRb

a

g.

ThenRb

a

fg = cRb

a

g and from (78), it follows that m c M . ⇤Corollary 8.19. Let f and g be as in Theorem 8.18, but additionally assume

f is continuous. Then there is a c 2 (a, b) such that

Zb

a

fg = f(c)

Zb

a

g.

Proof. This follows from Theorem 8.18 and Corollaries 6.23 and 6.26. ⇤Theorem 8.20. Suppose f, g : [a, b] ! R are such that

(a) g(x) � 0 on [a, b],(b) f is bounded and m f(x) M for all x 2 [a, b], and

(c)Rb

a

f and

Rb

a

fg both exist.

There is a c 2 [a, b] such that

Zb

a

fg = m

Zc

a

g + M

Zb

c

g.

Proof. For a x b let

G(x) = m

Zx

a

g + M

Zb

x

g.

By Theorem 8.16, G 2 C([a, b]) and

glb G G(b) = m

Zb

a

g Z

b

a

fg M

Zb

a

g = G(a) lub G.

Now, apply Corollary 6.26 to find c where G(c) =Rb

a

fg. ⇤

10. Exercises

8.1. If f : [a, b] ! R and R(f) exists, then f is bounded.

8.2. Let

f(x) =

(1, x 2 Q0, x /2 Q

.

(a) Use Definition 8.1 to show f is not integrable on any interval.(b) Use Definition 8.6 to show f is not integrable on any interval.


10. EXERCISES 8-21

8.3. CalculateR

5

2

x2 using the definition of integration.

8.4. If at least two of the integrals exist, thenZ

b

a

f =

Zc

a

f +

Zb

c

f

no matter the order of a, b and c.

8.5. If ↵ > 0, f : [a, b] ! [↵, �] andRb

a

f exists, thenRb

a

1/f exists.

8.6. If f : [a, b] ! [0, 1) is continuous and D(f) = 0, then f(x) = 0 for allx 2 [a, b].

8.7. In the statement of Theorem 8.17, make the additional assumption that f iscontinuous. Use the Fundamental Theorem of Calculus to give an easier proof.


CHAPTER 9

Sequences of Functions

1. Pointwise Convergence

We have accumulated much experience working with sequences of numbers. Thenext level of complexity is sequences of functions. This chapter explores several waysthat sequences of functions can converge to another function. The basic startingpoint is contained in the following definitions.

Definition 9.1. Suppose S ⇢ R and for each n 2 N there is a functionfn

: S ! R. The collection {fn

: n 2 N} is a sequence of functions defined on S.

For each fixed x 2 S, fn

(x) is a sequence of numbers, and it makes sense to askwhether this sequence converges. If f

n

(x) converges for each x 2 S, a new functionf : S ! R is defined by

f(x) = limn!1 f

n

(x).

The function f is called the pointwise limit of the sequence fn

, or, equivalently, itis said f

n

converges pointwise to f . This is abbreviated fn

S�!f , or simply fn

! f ,if the domain is clear from the context.

Example 9.1. Let

fn

(x) =

8><

>:

0, x < 0

xn, 0 x < 1

1, x � 1

.

Then fn

! f where

f(x) =

(0, x < 1

1, x � 1.

(See Figure 1.) This example shows that a pointwise limit of continuous functionsneed not be continuous.

Example 9.2. For each n 2 N, define fn

: R ! R by

fn

(x) =nx

1 + n2x2

.

(See Figure 2.) Clearly, each fn

is an odd function and lim|x|!1 fn

(x) = 0. A bitof calculus shows that f

n

(1/n) = 1/2 and fn

(�1/n) = �1/2 are the extreme valuesof f

n

. Finally, if x 6= 0,

|fn

(x)| =

��nx

1 + n2x2

�� <��

nx

n2x2

�� =��

1

nx

��

implies fn

! 0. This example shows that functions can remain bounded away from0 and still converge pointwise to 0.

9-1

9-2 CHAPTER 9. SEQUENCES OF FUNCTIONS

0.2 0.4 0.6 0.8 1.0

0.2

0.4

0.6

0.8

1.0

Figure 1. The first ten functions from the sequence of Example 9.1.

!3 !2 !1 1 2 3

!0.4

!0.2

0.2

0.4

Figure 2. The first four functions from the sequence of Example 9.2.

Example 9.3. Define fn

: R ! R by

fn

(x) =

8><

>:

22n+4x � 2n+3, 1

2

n+1 < x < 3

2

n+2

�22n+4x + 2n+4, 3

2

n+2 x < 1

2

n

0, otherwise

To figure out what this looks like, it might help to look at Figure 3.The graph of f

n

is a piecewise linear function supported on [1/2n+1, 1/2n] andthe area under the isoceles triangle of the graph over this interval is 1. Therefore,R

1

0

fn

= 1 for all n.If x > 0, then whenever x > 1/2n, we have f

n

(x) = 0. From this it follows thatfn

! 0.The lesson to be learned from this example is that it may not be true that

limn!1

R1

0

fn

=R

1

0

limn!1 f

n

.

Example 9.4. Define fn

: R ! R by

fn

(x) =

(n

2

x2 + 1

2n

, |x| 1

n

|x|, |x| > 1

n

.


2. UNIFORM CONVERGENCE 9-3

0.2 0.4 0.6 0.8 1.0

10

20

30

40

50

60

Figure 3. The first four functions from the sequence of Example 9.3.

!1.0 !0.5 0.5 1.0

0.2

0.4

0.6

0.8

1.0

Figure 4. The first ten functions f from Example 9.4.

(See Figure 4.) The parabolic section in the center was chosen so fn

(±1/n) = 1/nand f 0

n

(±1/n) = ±1. This splices the sections together at (±1/n, ±1/n) so fn

isdi↵erentiable everywhere. It’s clear f

n

! |x|, which is not di↵erentiable at 0.This example shows that the limit of di↵erentiable functions need not be

di↵erentiable.

The examples given above show that continuity, integrability and di↵erentiabilityare not preserved in the pointwise limit of a sequence of functions. To have anyhope of preserving these properties, a stronger form of convergence is needed.

2. Uniform Convergence

Definition 9.2. The sequence fn

: S ! R converges uniformly to f : S ! Ron S, if for each " > 0 there is an N 2 N so that whenever n � N and x 2 S, then|f

n

(x) � f(x)| < ".In this case, we write f

n

S�!�! f , or simply fn

�!�! f , if the set S is clear fromthe context.



The di↵erence between pointwise and uniform convergence is that with pointwiseconvergence, the convergence of f

n

to f can vary in speed at each point of S. Withuniform convergence, the speed of convergence is roughly the same all across S.Uniform convergence is a stronger condition to place on the sequence f

n

thanpointwise convergence in the sense of the following theorem.

Theorem 9.3. If fn

S�!�! f , then fn

S�!f .

Proof. Let x0

2 S and " > 0. There is an N 2 N such that when n � N , then|f(x) � f

n

(x)| < " for all x 2 S. In particular, |f(x0

) � fn

(x0

)| < " when n � N .This shows f

n

(x0

) ! f(x0

). Since x0

2 S is arbitrary, it follows that fn

! f . ⇤The first three examples given above show the converse to Theorem 9.3 is false.

There is, however, one interesting and useful case in which a partial converse is true.

Definition 9.4. If fn

S�!f and fn

(x) " f(x) for all x 2 S, then fn

increases tof on S. If f

n

S�!f and fn

(x) # f(x) for all x 2 S, then fn

decreases to f on S. Ineither case, f

n

is said to converge to f monotonically.

The functions of Example 9.4 decrease to |x|. Notice that in this case, theconvergence is also happens to be uniform. The following theorem shows Example9.4 to be an instance of a more general phenomenon.

Theorem 9.5 (Dini’s Theorem). If

(a) S is compact,

(b) fn

S�!f monotonically,

(c) fn

2 C(S) for all n 2 N, and(d) f 2 C(S),

then fn

�!�! f .

Proof. There is no loss of generality in assuming fn

# f , for otherwise weconsider �f

n

and �f . With this assumption, if gn

= fn

� f , then gn

is a sequenceof continuous functions decreasing to 0. It su�ces to show g

n

�!�! 0.To do so, let " > 0. Using continuity and pointwise convergence, for each x 2 S

find an open set Gx

containing x and an Nx

2 N such that gN

x

(y) < " for all y 2 Gx

.Notice that the monotonicity condition guarantees g

n

(y) < " for every y 2 Gx

andn � N

x

.

a b

f(x)

f(x) + ε

f(x ε

fn(x)

Figure 5. |fn

(x) � f(x)| < " on [a, b], as in Definition 9.2.


3. METRIC PROPERTIES OF UNIFORM CONVERGENCE 9-5

The collection {Gx

: x 2 S} is an open cover for S, so it must contain a finitesubcover {G

x

i

: 1 i n}. Let N = max{Nx

i

: 1 i n} and choose m � N . Ifx 2 S, then x 2 G

x

i

for some i, and 0 gm

(x) gN

(x) gN

i

(x) < ". It followsthat g

n

�!�! 0. ⇤

3. Metric Properties of Uniform Convergence

If S ⇢ R, let B(S) = {f : S ! R : f is bounded}. For f 2 B(S), definekfk

S

= lub {|f(x)| : x 2 S}. (It is abbreviated to kfk, if the domain S is clear fromthe context.) Apparently, kfk � 0, kfk = 0 () f ⌘ 0 and, if g 2 B(S), thenkf � gk = kg � fk. Moreover, if h 2 B(S), then

kf � gk = lub {|f(x) � g(x)| : x 2 S} lub {|f(x) � h(x)| + |h(x) � g(x)| : x 2 S} lub {|f(x) � h(x)| : x 2 S} + lub {|h(x) � g(x)| : x 2 S}= kf � hk + kh � gk

Combining all this, it follows that kf � gk is a metric1 on B(S).The definition of uniform convergence implies that for a sequence of bounded

functions fn

: S ! R,

fn

�!�! f () kfn

� fk ! 0.

Because of this, the metric kf � gk is often called the uniform metric or the sup-

metric. Many ideas developed using the metric properties of R can be carried overinto this setting. In particular, there is a Cauchy criterion for uniform convergence.

Definition 9.6. Let S ⇢ R. A sequence of functions fn

: S ! R is a Cauchy

sequence under the uniform metric, if given " > 0, there is an N 2 N such that whenm, n � N , then kf

n

� fm

k < ".

Theorem 9.7. Let fn

2 B(S). There is a function f 2 B(S) such that fn

�!�! fi↵ f

n

is a Cauchy sequence in B(S).

Proof. ()) Let fn

�!�! f and " > 0. There is an N 2 N such that n � Nimplies kf

n

� fk < "/2. If m � N and n � N , then

kfm

� fn

k kfm

� fk + kf � fn

k <"

2+

"

2= "

shows fn

is a Cauchy sequence.(() Suppose f

n

is a Cauchy sequence in B(S) and " > 0. Choose N 2 N sothat when kf

m

� fn

k < " whenever m � N and n � N . In particular, for a fixedx

0

2 S and m, n � N , |fm

(x0

) � fn

(x0

)| kfm

� fn

k < " shows the sequencefn

(x0

) is a Cauchy sequence in R and therefore converges. Since x0

is an arbitrarypoint of S, this defines an f : S ! R such that f

n

! f .Finally, if m, n � N and x 2 S the fact that |f

n

(x) � fm

(x)| < " gives

|fn

(x) � f(x)| = limm!1 |f

n

(x) � fm

(x)| ".

This shows that when n � N , then kfn

� fk ". We conclude that f 2 B(S) andfn

�!�! f . ⇤

1Definition 2.10



A collection of functions S is said to be complete under uniform convergence,if every Cauchy sequence in S converges to a function in S. Theorem 9.7 showsB(S) is complete under uniform convergence. We’ll see several other collections offunctions that are complete under uniform convergence.

4. Series of Functions

The definitions of pointwise and uniform convergence are extended in the naturalway to series of functions. If

P1k=1

fk

is a series of functions defined on a set S,then the series converges pointwise or uniformly, depending on whether the sequenceof partial sums, s

n

=P

n

k=1

fk

converges pointwise or uniformly, respectively. It isabsolutely convergent or absolutely uniformly convergent, if

P1n=1

|fn

| is convergentor uniformly convergent on S, respectively.

The following theorem is obvious and its proof is left to the reader.

Theorem 9.8. Let

P1n=1

fn

be a series of functions defined on S. If

P1n=1

fn

is absolutely convergent, then it is convergent. If

P1n=1

fn

is absolutely uniformly

convergent, then it is uniformly convergent.

The following theorem is a restatement of Theorem 9.5 for series.

Theorem 9.9. If

P1n=1

fn

is a series of nonnegative continuous functions

converging pointwise to a continuous function on a compact set S, thenP1

n=1

fn

converges uniformly on S.

A simple, but powerful technique for showing uniform convergence of series isthe following.

Theorem 9.10 (Weierstrass M-Test). If fn

: S ! R is a sequence of functions

and Mn

is a sequence nonnegative numbers such that kfn

kS

Mn

for all n 2 Nand

P1n=1

Mn

converges, then

P1n=1

fn

is absolutely uniformly convergent.

Proof. Let " > 0 and sn

be the sequence of partial sums ofP1

n=1

|fn

|. Thereis an N 2 N such that when n > m � N , then

Pn

k=m

Mk

< ". So,

ksn

� sm

k ��

nX

k=m+1

|fk

|��

nX

k=m+1

kfk

k nX

k=m

Mk

< ".

This shows sn

is a Cauchy sequence in B(S) and must converge according toTheorem 9.7. ⇤

5. Continuity and Uniform Convergence

Theorem 9.11. If fn

: S ! R such that each fn

is continuous at x0

and

fn

S�!�! f , then f is continuous at x0

.

Proof. Let " > 0. Since fn

�!�! f , there is an N 2 N such that whenevern � N and x 2 S, then |f

n

(x) � f(x)| < "/3. Because fN

is continuous at x0

, thereis a � > 0 such that x 2 (x

0

� �, x0

+ �) \ S implies |fN

(x) � fN

(x0

)| < "/3. Usingthese two estimates, it follows that when x 2 (x

0

� �, x0

+ �) \ S,

|f(x) � f(x0

)| = |f(x) � fN

(x) + fN

(x) � fN

(x0

) + fN

(x0

) � f(x0

)| |f(x) � f

N

(x)| + |fN

(x) � fN

(x0

)| + |fN

(x0

) � f(x0

)|< "/3 + "/3 + "/3 = ".

Therefore, f is continuous at x0

. ⇤


5. CONTINUITY AND UNIFORM CONVERGENCE 9-7

The following corollary is immediate from Theorem 9.11.

Corollary 9.12. If fn

is a sequence of continuous functions converging uni-

formly to f on S, then f is continuous.

Example 9.1 shows that continuity is not preserved under pointwise convergence.Corollary 9.12 establishes that if S ⇢ R, then C(S) is complete under the uniformmetric.

The fact that C([a, b]) is closed under uniform convergence is often usefulbecause, given a “bad” function f 2 C([a, b]), it’s often possible to find a sequencefn

of “good” functions in C([a, b]) converging uniformly to f .

Theorem 9.13 (Weierstrass Approximation Theorem). If f 2 C([a, b]), thenthere is a sequence of polynomials p

n

�!�! f .

To prove this theorem, we first need a lemma.

Lemma 9.14. For n 2 N let cn

=⇣R

1

�1

(1 � t2)n dt⌘�1

and

kn

(t) =

(cn

(1 � t2)n, |t| 1

0, |t| > 1.

(See Figure 6.) Then

(a) kn

(t) � 0 on [�1, 1] for all n 2 N;(b)

R1

�1

kn

= 1 for all n 2 N; and,(c) if 0 < � < 1, then k

n

�!�! 0 on [�1, ��] [ [�, 1].

!1.0 !0.5 0.5 1.0

0.2

0.4

0.6

0.8

1.0

1.2

Figure 6. Here are the graphs of kn

(t) for n = 1, 2, 3, 4, 5.

Proof. Parts (a) and (b) follow easily from the definition of kn

.To prove (c) first note that

1 =

Z1

�1

kn

�Z

1/

pn

�1/

pn

cn

(1 � t2)n dt � cn

2pn

✓1 � 1

n

◆n

.



Since�1 � 1

n

�n " 1

e

, it follow that there is an ↵ > 0 such that cn

< ↵p

n.2 Letting� 2 (0, 1) and � t 1,

kn

(t) kn

(�) ↵p

n(1 � �2)n ! 0

by L’Hospital’s Rule. Since kn

is an even function, this establishes (c). ⇤

A sequence of functions satisfying conditions such as those in Lemma 9.14 iscalled a convolution kernel or a Dirac sequence.3 Several such kernels play a keyrole in the study of Fourier series, as we will see in Theorems 10.5 and 10.8.

We now turn to the proof of the theorem.

Proof. There is no generality lost in assuming [a, b] = [0, 1], for otherwise weconsider the linear change of variables g(x) = f((b�a)x+a). Similarly, we can assumef(0) = f(1) = 0, for otherwise we consider g(x) = f(x) � ((f(1) � f(0))x + f(0),which is a polynomial added to f . We can further assume f(x) = 0 when x /2 [0, 1].

Set

pn

(x) =

Z1

�1

f(x + t)kn

(x) dt.(79)

To see pn

is a polynomial, change variables in the integral using u = x + t to arriveat

pn

(x) =

Zx+1

x�1

f(u)kn

(u � x) du =

Z1

0

f(u)kn

(x � u) du,

because f(x) = 0 when x /2 [0, 1]. Notice that kn

(x � u) is a polynomial in u withcoe�cients being polynomials in x, so integrating f(u)k

n

(x � u) yields a polynomialin x. (Just try it for a small value of n and a simple function f !)

Use (79) and Lemma 9.14(b) to see for � 2 (0, 1) that

(80) |pn

(x) � f(x)| =

��Z

1

�1

f(x + t)kn

(t) dt � f(x)

��

=

��Z

1

�1

(f(x + t) � f(x))kn

(t) dt

��

Z

1

�1

|f(x + t) � f(x)|kn

(t) dt

=

Z�

��

|f(x + t) � f(x)|kn

(t) dt +

Z

�<|t|1

|f(x + t) � f(x)|kn

(t) dt.

We’ll handle each of the final integrals in turn.

2It is interesting to note that with a bit of work using complex variables one can prove

c

n

=�(n+ 3/2)p

⇡ �(n+ 1)=

n+ 1/2

n

⇥

n� 1/2

n� 1⇥

n� 3/2

n� 2⇥ · · ·⇥

3/2

1.

3Given two functions f and g defined on R, the convolution of f and g is the integral

f ? g(x) =

Z 1

�1f(t)g(x� t) dt.

The term convolution kernel is used because such kernels typically replace g in the convolutiongiven above, as can be seen in the proof of the Weierstrass approximation theorem.


5. CONTINUITY AND UNIFORM CONVERGENCE 9-9

0.5 1.0 1.5 2.0 2.5 3.0 3.5

0.2

0.4

0.6

0.8

1.0

Figure 7. s0

, s1

and s2

from Example 9.5.

Let " > 0 and use the uniform continuity of f to choose a � 2 (0, 1) such thatwhen |t| < �, then |f(x + t) � f(x)| < "/2. Then, using Lemma 9.14(b) again,

Z�

��

|f(x + t) � f(x)|kn

(t) dt <"

2

Z�

��

kn

(t) dt <"

2(81)

According to Lemma 9.14(c), there is an N 2 N so that when n � N and |t| � �,then k

n

(t) < "

8(kfk+1)(1��)

. Using this, it follows that

(82)

Z

�<|t|1

|f(x + t) � f(x)|kn

(t) dt

=

Z ��

�1

|f(x + t) � f(x)|kn

(t) dt +

Z1

�

|f(x + t) � f(x)|kn

(t) dt

2kfkZ ��

�1

kn

(t) dt + 2kfkZ

1

�

kn

(t) dt

< 2kfk "

8(kfk + 1)(1 � �)(1 � �) + 2kfk "

8(kfk + 1)(1 � �)(1 � �) =

"

2

Combining (81) and (82), it follows from (80) that |pn

(x)�f(x)| < " for all x 2 [0, 1]and p

n

�!�! f . ⇤

Corollary 9.15. If f 2 C([a, b]) and " > 0, then there is a polynomial p such

that kf � pk[a,b]

< ".

The theorems of this section can also be used to construct some striking examplesof functions with unwelcome behavior. Following is perhaps the most famous.

Example 9.5. There is a continuous f : R ! R that is di↵erentiable nowhere.

Proof. Thinking of the canonical example of a continuous function that failsto be di↵erentiable at a point–the absolute value function–we start with a “sawtooth”function. (See Figure 5.)

s0

(x) =

(x � 2n, 2n x < 2n + 1, n 2 Z2n + 2 � x, 2n + 1 x < 2n + 2, n 2 Z

Notice that s0

is continuous and periodic with period 2 and maximum value 1.Compress it both vertically and horizontally:

sn

(x) =

✓3

4

◆n

sn

(4nx) , n 2 N.

Each sn

is continuous and periodic with period pn

= 2/4n and ksn

k = (3/4)n.



0.5 1.0 1.5 2.0

0.5

1.0

1.5

2.0

2.5

3.0

Figure 8. The nowhere di↵erentiable function f from Example 9.5.

Finally, the desired function is

f(x) =1X

n=0

sn

(x).

Since ksn

k = (3/4)n, the Weierstrass M -test implies the series defining f isuniformly convergent and Corollary 9.12 shows f is continuous on R. We will showf is di↵erentiable nowhere.

Let x 2 R, m 2 N and hm

= 1/(2 · 4m).If n > m, then h

m

/pn

= 4n�m�1 2 N, so sn

(x ± hm

) � sn

(x) = 0 and

f(x ± hm

) � f(x)

±hm

=mX

k=0

sk

(x ± hm

) � sk

(x)

±hm

.(83)

On the other hand, if n < m, then a worst-case estimate is that��sn

(x ± hm

) � sn

(x)

hm

�� ✓

3

4

◆n

/

✓1

4n

◆= 3n.

This gives��

m�1X

k=0

sk

(x ± hm

) � sk

(x)

±hm

�� m�1X

k=0

��sk

(x ± hm

) � sk

(x)

±hm

��

3m � 1

3 � 1<

3m

2.(84)

Since sm

is linear on intervals of length 4�m = 2 · hm

with slope ±3m on thoselinear segments, at least one of the following is true:

��sm

(x + hm

) � s(x)

hm

�� = 3m or

��sm

(x � hm

) � s(x)

�hm

�� = 3m.(85)

Suppose the first of these is true. The argument is essentially the same in the secondcase.


6. INTEGRATION AND UNIFORM CONVERGENCE 9-11

Using (83), (84) and (85), the following estimate ensues

��f(x + h

m

) � f(x)

hm

�� =

��

1X

k=0

sk

(x + hm

) � sk

(x)

hm

��

=

��

mX

k=0

sk

(x + hm

) � sk

(x)

hm

��

��sm

(x + hm

) � s(x)

hm

��m�1X

k=0

��sk

(x ± hm

) � sk

(x)

±hm

��

> 3m � 3m

2=

3m

2.

Since 3m/2 ! 1, it is apparent f 0(x) does not exist. ⇤

There are many other constructions of nowhere di↵erentiable continuous func-tions. The first was published by Weierstrass [16] in 1872, although it was knownin the folklore sense among mathematicians earlier than this. (There is an Englishtranslation of Weierstrass’ paper in [8].) In fact, it is now known in a technicalsense that the “typical” continuous function is nowhere di↵erentiable [4].

6. Integration and Uniform Convergence

One of the recurring questions with integrals is when it is true that

limn!1

Zfn

=

Zlimn!1 f

n

.

This is often referred to as “passing the limit through the integral.” At some pointin her career, any student of advanced analysis or probability theory will be temptedto just blithely pass the limit through. But functions such as those of Example9.3 show that some care is needed. A common criterion for doing so is uniformconvergence.

Theorem 9.16. If fn

: [a, b] ! R such that

Rb

a

fn

exists for each n and fn

�!�! fon [a, b], then

Zb

a

f = limn!1

Zb

a

fn

.

Proof. Some care must be taken in this proof, because there are actually twothings to prove. Before the equality can be shown, it must be proved that f isintegrable.



To show that f is integrable, let " > 0 and N 2 N such that kf�fN

k < "/3(b�a).If P 2 part([a, b]), then

|R (f, P, x⇤k

) � R (fN

, P, x⇤k

) | = |nX

k=1

f(x⇤k

)|Ik

| �nX

k=1

fN

(x⇤k

)|Ik

||(86)

= |NX

k=1

(f(x⇤k

) � fN

(x⇤k

))|Ik

||

NX

k=1

|f(x⇤k

) � fN

(x⇤k

)||Ik

|

<"

2(b � a))

nX

k=1

|Ik

|

="

3

According to Theorem 8.10, there is a P 2 part([a, b]) such that wheneverP ⌧ Q

1

and P ⌧ Q2

, then

|R (fN

, Q1

, x⇤k

) � R (fN

, Q2

, y⇤k

) | <"

3.(87)

Combining (86) and (87) yields

|R (f, Q1

, x⇤k

) � R (f, Q2

, y⇤k

)|= |R (f, Q

1

, x⇤k

) � R (fN

, Q1

, x⇤k

) + R (fN

, Q1

, x⇤k

)

�R (fN

, Q1

, x⇤k

) + R (fN

, Q2

, y⇤k

) � R (f, Q2

, y⇤k

)| |R (f, Q

1

, x⇤k

) � R (fN

, Q1

, x⇤k

)| + |R (fN

, Q1

, x⇤k

) � R (fN

, Q1

, x⇤k

)|+ |R (f

N

, Q2

, y⇤k

) � R (f, Q2

, y⇤k

)|<

"

3+

"

3+

"

3= "

Another application of Theorem 8.10 shows that f is integrable.Finally,

��

Zb

a

f �Z

b

a

fN

�� =

��

Zb

a

(f � fN

)

�� <Z

b

a

"

3(b � a)=

"

3

shows thatRb

a

fn

! Rb

a

f . ⇤

Corollary 9.17. If

P1n=1

fn

is a series of integrable functions converging

uniformly on [a, b], thenZ

b

a

1X

n=1

fn

=1X

n=1

Zb

a

fn

Combining Theorem 9.16 with Dini’s Theorem, gives the following.


is a sequence of continuous functions converging mono-

tonically to a continuous function f on [a, b], thenRb

a

fn

! Rb

a

f .


7. DIFFERENTIATION AND UNIFORM CONVERGENCE 9-13

7. Di↵erentiation and Uniform Convergence

The relationship between uniform convergence and di↵erentiation is somewhatmore complex than those we’ve already examined. First, because there are twosequences involved, f

n

and f 0n

, either of which may converge or diverge at a point;and second, because di↵erentiation is more “delicate” than continuity or integration.

Example 9.4 is an explicit example of a sequence of di↵erentiable functionsconverging uniformly to a function which is not di↵erentiable at a point. Thederivatives of the functions from that example converge pointwise to a functionthat is not a derivative. The Weierstrass Approximation Theorem and Example9.5 push this to the extreme by showing the existence of a sequence of polynomialsconverging uniformly to a continuous nowhere di↵erentiable function.

The following theorem starts to shed some light on the situation.

Theorem 9.19. If fn

is a sequence of derivatives defined on [a, b] and fn

�!�! f ,then f is a derivative. Moreover, if the sequence F

n

2 C([a, b]) satisfies F 0n

= fn

and Fn

(a) = 0, then Fn

�!�! F where F 0 = f .

Proof. For each n, let Fn

be an antiderivative of fn

. By considering Fn

(x) �Fn

(a), if necessary, there is no generality lost with the assumption that Fn

(a) = 0.Let " > 0. There is an N 2 N such that

m, n � N =) kfm

� fn

k <"

b � a.

If x 2 [a, b] and m, n � N , then the Mean Value Theorem and the assumption thatFm

(a) = Fn

(a) = 0 yield a c 2 [a, b] such that

|Fm

(x) � Fn

(x)| = |(Fm

(x) � Fn

(x)) � (Fm

(a) � Fn

(a))|= |f

m

(c) � fn

(c)| |x � a| kfm

� fn

k(b � a) < ".

This shows Fn

is a Cauchy sequence in C([a, b]) and there is an F 2 C([a, b]) withFn

�!�! F .It su�ces to show F 0 = f . To do this, several estimates are established.Let M 2 N so that

m, n � M =) kfm

� fn

k <"

3.

Notice this implies

kf � fn

k "

3, 8n � M.(88)

For such m, n � M and x, y 2 [a, b] with x 6= y, another application of theMean Value Theorem gives��Fn

(x) � Fn

(y)

x � y� F

m

(x) � Fm

(y)

x � y

��

=1

|x � y| |(Fn

(x) � Fm

(x)) � (Fn

(y) � Fm

(y))|

=1

|x � y| |fn

(c) � fm

(c)| |x � y| kfn

� fm

k <"

3.

Letting m ! 1, it follows that��Fn

(x) � Fn

(y)

x � y� F (x) � F (y)

x � y

�� "

3, 8n � M.(89)



Fix n � M and x 2 [a, b]. Since F 0n

(x) = fn

(x), there is a � > 0 so that��Fn

(x) � Fn

(y)

x � y� f

n

(x)

�� <"

3, 8y 2 (x � �, x + �) \ {x}.(90)

Finally, using (89), (90) and (88), we see��F (x) � F (y)

x � y� f(x)

��

=

��F (x) � F (y)

x � y� F

n

(x) � Fn

(y)

x � y

+Fn

(x) � Fn

(y)

x � y� f

n

(x) + fn

(x) � f(x)

��

��F (x) � F (y)

x � y� F

n

(x) � Fn

(y)

x � y

��

+

��Fn

(x) � Fn

(y)

x � y� f

n

(x)

��+ |fn

(x) � f(x)|

<"

3+

"

3+

"

3= ".

This establishes that

limy!x

F (x) � F (y)

x � y= f(x),

as desired. ⇤

Corollary 9.20. If Gn

2 C([a, b]) is a sequence such that G0n

�!�! g and

Gn

(x0

) converges for some x0

2 [a, b], then Gn

�!�! G where G0 = g.

Proof. Let Fn

and F be as in Theorem 9.20. Then F 0 � G0 = gn

� gn

= 0, soG

n

= Fn

+ ↵n

for some constant ↵n

. In particular, ↵n

= Gn

(x0

) � Fn

(x0

) ! ↵, forsome ↵ 2 R, because both G

n

and Fn

converge at x0

.Let " > 0. Since both F

n

and ↵n

are Cauchy sequences, there exists an N 2 Nsuch that

m, n � N =) kFn

� Fm

k <"

2and |↵

n

� ↵m

| <"

2.

If m, n � N and x 2 [a, b], then

|Gn

(x) � Gm

(x)| = |(Fn

(x) + ↵n

) � (Fm

(x) + ↵m

)| |F

n

(x) � Fm

(x)| + |↵n

� ↵m

|< kF

n

� Fm

k +"

2< ".

This shows Gn

is a Cauchy sequence in C([a, b]) and therefore Gn

�!�! G. It is clearthat G = F + ↵, so G0 = F 0 = g. ⇤


is a sequence of di↵erentiable functions defined on [a, b]such that

P1k=1

f(x0

) exists for some x0

2 [a, b] andP1

k=1

f 0n

converges uniformly,

then 1X

k=1

f

!0=

1X

k=1

f 0

Proof. Left as an exercise. ⇤


8. POWER SERIES 9-15

8. Power Series

8.1. The Radius and Interval of Convergence. One place where uniformconvergence plays a key role is with power series. Recall the definition.

Definition 9.22. A power series is a function of the form

f(x) =1X

n=0

an

(x � c)n.(91)

The domain of f is the set of all x at which the series converges. The constant c iscalled the center of the series.

To determine the domain of (91), let x 2 R \ {c} and use the root test to seethe series converges when

lim sup |an

(x � c)n|1/n = |x � c| lim sup |an

|1/n < 1

and diverges when|x � c| lim sup |a

n

|1/n > 1.

If r lim sup |an

|1/n 1 for some r � 0, then these inequalities imply (91) is absolutelyconvergent when |x � c| < r. In other words, if

R = lub {r : r lim sup |an

|1/n < 1},(92)

then the domain of (91) is an interval of radius R centered at c. The root test givesno information about convergence when |x � c| = R. This R is called the radius

of convergence of the power series. Assuming R > 0, the open interval centeredat c with radius R is called the interval of convergence. It may be di↵erent fromthe domain of the series because the series may converge at one endpoint or bothendpoints of the interval of convergence.

The ratio test can also be used to determine the radius of convergence, but, asshown in (30), it will not work as often as the root test. When it does,

R = lub {r : r lim sup

��an+1

an

�� < 1}.(93)

This is usually easier to compute than (92), and both will give the same value for R.

Example 9.6. Calling to mind Example 4.2, it is apparent the geometric powerseries

P1n=0

xn has center 0, radius of convergence 1 and domain (�1, 1).

Example 9.7. For the power seriesP1

n=1

2n(x + 2)n/n, we compute

lim sup

✓2n

n

◆1/n

= 2 =) R =1

2.

Since the series diverges when x = �2± 1

2

, it follows that the interval of convergenceis (�5/2, �3/2).

Example 9.8. The power seriesP1

n=1

xn/n has interval of convergence (�1, 1)and domain [�1, 1). Notice it is not absolutely convergent when x = �1.

Example 9.9. The power seriesP1

n=1

xn/n2 has interval of convergence (�1, 1),domain [�1, 1] and is absolutely convergent on its whole domain.

The preceding is summarized in the following theorem.



Theorem 9.23. Let the power series be as in (91) and R be given by either

(92) or (93).

(a) If R = 0, then the domain of the series is {c}.(b) If R > 0 the series converges at x when |c � x| < R and diverges at xwhen |c�x| > R. In the case when c = 1, the series converges everywhere.

(c) If R 2 (0, 1), then the series may converge at none, one or both of

c � R and c + R.

8.2. Uniform Convergence of Power Series. The partial sums of a powerseries are a sequence of polynomials converging pointwise on the domain of theseries. As has been seen, pointwise convergence is not enough to say much aboutthe behavior of the power series. The following theorem opens the door to a lotmore.

Theorem 9.24. A power series converges absolutely and uniformly on compact

subsets of its interval of convergence.

Proof. There is no generality lost in assuming the series has the form of(91) with c = 0. Let the radius of convergence be R > 0 and K be a compactsubset of (�R, R) with ↵ = lub {|x| : x 2 K}. Choose r 2 (↵, R). If x 2 K, then|a

n

xn| < |an

rn| for n 2 N. SinceP1

n=0

|an

rn| converges, the Weierstrass M -testshows

P1n=0

an

xn is absolutely and uniformly convergent on K. ⇤The following two corollaries are immediate consequences of Corollary 9.12 and

Theorem 9.16, respectively.

Corollary 9.25. A power series is continuous on its interval of convergence.

Corollary 9.26. If [a, b] is an interval contained in the interval of convergence

for the power series

P1n=0

an

(x � c)n, thenZ

b

a

1X

n=0

an

(x � c)n =1X

n=0

an

Zb

a

(x � c)n.

The next question is: What about di↵erentiability?Notice that the continuity of the exponential function and L’Hospital’s Rule

give

limn!1 n1/n = lim

n!1 exp

✓ln n

n

◆= exp

✓limn!1

ln n

n

◆= exp(0) = 1.

Therefore, for any sequence an

,

lim sup(nan

)1/n = lim sup n1/na1/n

n

= lim sup a1/n

n

.(94)

Now, suppose the power seriesP1

n=0

an

xn has a nontrivial interval of conver-gence, I. Formally di↵erentiating the power series term-by-term gives a new powerseries

P1n=1

nan

xn�1. According to (94) and Theorem 9.23, the term-by-termdi↵erentiated series has the same interval of convergence as the original. Its partialsums are the derivatives of the partial sums of the original series and Theorem 9.24guarantees they converge uniformly on any compact subset of I. Corollary 9.21shows

d

dx

1X

n=0

an

xn =1X

n=0

d

dxan

xn =1X

n=1

nan

xn�1, 8x 2 I.

This process can be continued inductively to obtain the same results for all higherorder derivatives. We have proved the following theorem.



Theorem 9.27. If f(x) =P1

n=0

an

(x � c)n is a power series with nontrivial

interval of convergence, I, then f is di↵erentiable to all orders on I with

f (m)(x) =1X

n=m

n!

(n � m)!an

(x � c)n�m.(95)

Moreover, the di↵erentiated series has I as its interval of convergence.

8.3. Taylor Series. Suppose f(x) =P1

n=0

an

xn has I = (�R, R) as itsinterval of convergence for some R > 0. According to Theorem 9.27,

f (m)(0) =m!

(m � m)!am

=) am

=f (m)(0)

m!, 8m 2 !.

Therefore,

f(x) =1X

n=0

f (n)(0)

n!xn, 8x 2 I.

This is a remarkable result! It shows that the values of f on I are completelydetermined by its values on any neighborhood of 0. This is summarized in thefollowing theorem.

Theorem 9.28. If a power series f(x) =P1

n=0

an

(x � c)n has nontrivial

interval of convergence I, then

f(x) =1X

n=0

f (n)(c)

n!(x � c)n, 8x 2 I.(96)

The series (96) is called the Taylor series

4 for f centered at c. The Taylorseries can be formally defined for any function that has derivatives of all orders at c,but, as Example 7.9 shows, there is no guarantee it will converge to the functionanywhere except at c. Taylor’s Theorem 7.17 can be used to examine the questionof pointwise convergence.

8.4. The Endpoints of the Interval of Convergence. We have seen thatat the endpoints of its interval of convergence a power series may diverge or evenabsolutely converge. A natural question when it does converge is the following:What is the relationship between the value at the endpoint and the values insidethe interval of convergence?

Theorem 9.29 (Abel). If f(x) =P1

n=0

an

(x � c)n has a finite radius of

convergence R > 0 and f(c + R) exists, then c + R 2 C(f).

Proof. It can be assumed c = 0 and R = 1. There is no loss of generality witheither of these assumptions because otherwise just replace f(x) with f((x + c)/R).

4When c = 0, it is often called the Maclaurin series for f .



Set s = f(1), s�1

= 0 and sn

=P

n

k=0

ak

for n 2 !. For |x| < 1,nX

k=0

ak

xk =nX

k=0

(sk

� sk�1

)xk

=nX

k=0

sk

xk �nX

k=1

sk�1

xk

= sn

xn +n�1X

k=0

sk

xk � x

n�1X

k=0

sk

xk

= sn

xn + (1 � x)n�1X

k=0

sk

xk

When n ! 1, since sn

is bounded,

f(x) = (1 � x)1X

k=0

sk

xk.(97)

Since (1 � x)P1

n=0

xn = 1, (97) implies

|f(x) � s| =

��(1 � x)1X

k=0

(sk

� s)xk

�� .(98)

Let " > 0. Choose N 2 N such that |sN

� s| < "/2, � 2 (0, 1) so

�

NX

k=0

|sn

� s| < "/2.

and 1 � � < x < 1. With these choices, (98) becomes

|f(x) � s| ��(1 � x)

NX

k=0

(sk

� s)xk

��+

��(1 � x)1X

k=N+1

(sk

� s)xk

��

< �

NX

k=0

|sk

� s| +"

2

��(1 � x)1X

k=N+1

xk

�� <"

2+

"

2= "

It has been shown that limx"1

f(x) = f(1), so 1 2 C(f). ⇤Abel’s theorem opens up a more general idea for the summation of series.

Suppose, as in the proof of the theorem, f(x) =P1

n=0

an

xn with interval ofconvergence (�1, 1). If lim

x"1

f(x) exists, then this limit is called the Abel sum ofthe coe�cients of the series. In this case, the following notation is used

limx"1

f(x) = A1X

n=0

an

.

From Theorem 9.29, it is clear that whenP1

n=0

an

exists, thenP1

n=0

an

=AP1

n=0

an

. The converse of this statement may not be true.

Example 9.10. Notice that1X

n=0

(�x)n =1

1 + x,



has interval of convergence (�1, 1) and diverges when x = 1. (It is the sum1 � 1 + 1 � 1 + · · · .) But,

A

1X

n=0

(�1)n = limx"1

1

1 + x=

1

2.

Finally, here is an example showing the power of these techniques.

Example 9.11. The series1X

n=0

(�1)nx2n =1

1 + x2

has (�1, 1) as its interval of convergence and (0, 1] as its domain. If 0 |x| < 1,then Corollary 9.17 justifies

arctan(x) =

Zx

0

dt

1 + t2=

Zx

0

1X

n=0

(�1)nt2n dt =1X

n=0

(�1)n

2n + 1x2n+1.

This series for the arctangent converges by the alternating series test when x = 1,so Theorem 9.29 implies

1X

n=0

(�1)n

2n + 1= lim

x"1

arctan(x) = arctan(1) =⇡

4.(99)


CHAPTER 10

Fourier Series

In the late eighteenth century, it was well-known that complicated functionscould sometimes be approximated by a sequence of polynomials. Some of the leadingmathematicians at that time, including such luminaries as Daniel Bernoulli, Eulerand d’Alembert studied the possibility of using sequences of trigonometric functionsfor approximation. In 1807, this idea opened into a huge area of research whenJoseph Fourier used series of sines and cosines to solve several outstanding partialdi↵erential equations of physics.1

In particular, he used series of the form

1X

n=0

an

cos nx + bn

sin nx

to approximate his solutions. Series of this form are called trigonometric series,

and the ones derived from Fourier’s methods are called Fourier series. Much ofthe mathematical research done in the nineteenth and early twentieth century wasdevoted to understanding the convergence of Fourier series. This chapter presentsnothing more than the tip of that huge iceberg.

1. Trigonometric Polynomials

Definition 10.1. A function of the form

p(x) =nX

k=0

↵k

cos kx + �k

sin kx(100)

is called a trigonometric polynomial. The largest value of k such that |↵k

|+�k

| 6= 0 isthe degree of the polynomial. Denote by T the set of all trigonometric polynomials.

Evidently, all functions in T are 2⇡-periodic and T is closed under additionand multiplication by real numbers. Indeed, it is a real vector space, in the sense oflinear algebra and the set {sin nx : n 2 N} [ {cos nx : n 2 !} is a basis for T .

The following theorem can be proved using integration by parts or trigonometricidentities.

Theorem 10.2. If m, n 2 Z, thenZ

⇡

�⇡

sin mx cos nx dx = 0,(101)

1Fourier’s methods can be seen in most books on partial di↵erential equations, such as [3]. Forexample, see solutions of the heat and wave equations using the method of separation of variables.

10-1

10-2 CHAPTER 10. FOURIER SERIES

Z⇡

�⇡

sin mx sin nx dx =

8><

>:

0, m 6= n

0, m = 0 or n = 0

⇡ m = n 6= 0

(102)

and

Z⇡

�⇡

cos mx cos nx dx =

8><

>:

0, m 6= n

2⇡ m = n = 0

⇡ m = n 6= 0

.(103)

If p(x) is as in (100), then Theorem 10.2 shows

2⇡↵0

=

Z⇡

�⇡

p(x) dx, �0

= 0

and for n > 0,

⇡↵n

=

Z⇡

�⇡

p(x) cos nx dx, ⇡�n

=

Z⇡

�⇡

p(x) cos nx dx.

Combining these, it follows that if

an

=1

⇡

Z⇡

�⇡

p(x) cos nx dx and bn

=1

⇡

Z⇡

�⇡

p(x) sin nx dx

for n 2 !, then

p(x) =a0

2+

1X

n=1

an

cos nx + bn

sin nx.(104)

(Remember that all but a finite number of the an

and bn

are 0!)At this point, the logical question is whether this same method can be used to

represent a more general 2⇡-periodic function. For any function f , integrable on[�⇡, ⇡], the coe�cients can be defined as above; i.e., for n 2 !,

an

=1

⇡

Z⇡

�⇡

f(x) cos nx dx and bn

=1

⇡

Z⇡

�⇡

f(x) sin nx dx.(105)

The problem is whether and in what sense an equation such as (104) might betrue. This turns out to be a very deep and di�cult question with no short answer.2

Because we don’t know whether equality in the sense of (104) is true, the usualpractice is to write

f(x) ⇠ a0

2+

1X

n=1

an

cos nx + bn

sin nx,(106)

indicating that the series on the right is calculated from the function on the leftusing (105). The series is called the Fourier series for f .

There are at least two fundamental questions arising from (106): Does theFourier series of f converge to f? Can f be recovered from its Fourier series, evenif the Fourier series does not converge to f? These are often called the convergence

2Many people, including me, would argue that the study of Fourier series has been themost important area of mathematical research over the past two centuries. Huge mathematicaldisciplines, including set theory, measure theory and harmonic analysis trace their lineage back tobasic questions about Fourier series. Even after centuries of study, research in this area continuesunabated.


3. THE DIRICHLET KERNEL 10-3

and representation questions, respectively. The next few sections will give somepartial answers.

2. The Riemann Lebesgue Lemma

We learned early in our study of series that the first and simplest convergencetest is to check whether the terms go to zero. For Fourier series, this is always thecase.

Theorem 10.3 (Riemann-Lebesgue Lemma). If f is a function such that

Rb

a

fexists, then

lim↵!1

Zb

a

f(t) cos ↵t dt = 0 and lim↵!1

Zb

a

f(t) sin ↵t dt = 0.

Proof. Since the two limits have similar proofs, only the first will be proved.Let " > 0 and P be the generic partition of [a, b] satisfying

0 <

Zb

a

f � D(f, P ) <"

2.

For mi

= glb {f(x) : xi�1

< x < xi

}, define a function g on [a, b] by g(x) = mi

when xi�1

x < xi

and g(b) = mn

. Note thatRb

a

g = D(f, P ) so

0 <

Zb

a

(f � g) <"

2.

Since f � g,��

Zb

a

f(t) cos ↵t dt

�� =

��

Zb

a

(f(t) � g(t)) cos ↵t dt +

Zb

a

g(t) cos ↵t dt

��

��

Zb

a

(f(t) � g(t)) cos ↵t dt

��+

��

Zb

a

g(t) cos ↵t dt

��

Z

b

a

(f � g) +

��1

↵

nX

i=1

mi

((sin(↵xi

) � sin(↵xi�1

))

��

Z

b

a

(f � g) +2

↵

nX

i=1

|mi

|

The first term on the right is less than "/2 and the second can be made less than"/2 by choosing ↵ large enough. ⇤

Corollary 10.4. If f is integrable on [�⇡, ⇡] with an

and bn

the Fourier

coe�cients of f , then an

! 0 and bn

! 0.

3. The Dirichlet Kernel

Suppose f is integrable on [�⇡, ⇡] and 2⇡-periodic on R, so the Fourier seriesof f exists. The partial sums of the Fourier series are written as s

n

(f, x), or moresimply s

n

(x) when there is only one function in sight. To be more precise,

sn

(f, x) =a0

2+

nX

k=1

(ak

cos kx + bk

sin kx) .

We begin with the following calculation.



-p -p

2

p

2p

-3

3

6

9

12

15

Figure 1. The Dirichlet kernel Dn

(s) for n = 1, 4, 7.

sn

(x) =a0

2+

nX

k=1

(ak

cos kx + bk

sin kx)

=1

2⇡

Z⇡

�⇡

f(t) dt +nX

k=1

1

⇡

Z⇡

�⇡

(f(t) cos kt cos kx + f(t) sin kt sin kx) dt

=1

2⇡

Z⇡

�⇡

f(t)nX

k=1

(1 + 2(cos kt cos kx + sin kt sin kx)) dt

=1

2⇡

Z⇡

�⇡

f(t)nX

k=1

(1 + 2 cos k(x � t)) dt

Substitute s = x � t and use the assumption that f is 2⇡-periodic.

=1

2⇡

Z⇡

�⇡

f(x � s)

1 + 2

nX

k=1

cos ks

!ds(107)

The sequence of trigonometric polynomials from within the integral,

Dn

(s) = 1 + 2nX

k=1

cos ks,(108)

is called the Dirichlet kernel. Its properties will prove useful for determining thepointwise convergence of Fourier series.

Theorem 10.5. The Dirichlet kernel has the following properties.

(a) Dn

(s) is an even 2⇡-periodic function for each n 2 N.(b) D

n

(0) = 2n + 1 for each n 2 N.(c) |D

n

(s)| 2n + 1 for each n 2 N and all s.

(d)1

2⇡

Z⇡

�⇡

Dn

(s) ds = 1 for each n 2 N.


4. DINI’S TEST FOR POINTWISE CONVERGENCE 10-5

(e) Dn

(s) =sin(n + 1/2)s

sin s/2for each n 2 N and s not an integer multiple of

⇡.

Proof. Properties (a)–(d) follow from the definition of the kernel.The proof of property (e) uses some trigonometric manipulation. Suppose n 2 N

and s 6= m⇡ for any m 2 Z.

Dn

(s) = 1 + 2nX

k=1

cos ks

Use the facts that the cosine is even and the sine is odd.

=nX

k=�n

cos ks +cos s

2

sin s

2

nX

k=�n

sin ks

=1

sin s

2

nX

k=�n

⇣sin

s

2cos ks + cos

s

2sin ks

⌘

=1

sin s

2

nX

k=�n

sin(k +1

2)s

This is a telescoping sum.

=sin(n + 1

2

)s

sin s

2

⇤

According to (107),

sn

(f, x) =1

2⇡

Z⇡

�⇡

f(x � t)Dn

(t) dt.

This is similar to a situation we’ve seen before within the proof of the Weierstrassapproximation theorem, Theorem 9.13. The integral given above is a convolution

integral similar to that used in the proof of Theorem 9.13, although the Dirichletkernel isn’t a convolution kernel in the sense of Lemma 9.14.

4. Dini’s Test for Pointwise Convergence

Theorem 10.6 (Dini’s Test). Let f : R ! R be a 2⇡-periodic function integrable

on [�⇡, ⇡] with Fourier series given by (106). If there is a � > 0 and s 2 R such

that

Z�

0

��f(x + t) + f(x � t) � 2s

t

�� dt < 1,

then

a0

2+

1X

k=1

(ak

cos kx + bk

cos kx) = s.



-p -p

2

p

2p 2 p

-p

-p

2

p

2

p

Figure 2. This plot shows the function of Example 10.1 and s8

(x)for that function.

Proof. Since Dn

is even,

sn

(x) =1

2⇡

Z⇡

�⇡

f(x � t)Dn

(t) dt

=1

2⇡

Z0

�⇡

f(x � t)Dn

(t) dt +1

2⇡

Z⇡

0

f(x � t)Dn

(t) dt

=1

2⇡

Z⇡

0

(f(x + t) + f(x � t)) Dn

(t) dt.

By Theorem 10.5(d) and (e),

sn

(x) � s =1

2⇡

Z⇡

0

(f(x + t) + f(x � t) � 2s) Dn

(t) dt

=1

2⇡

Z⇡

0

f(x + t) + f(x � t) � 2s

t· t

sin t

2

· sin(n +1

2)t dt.

Since t/ sin t

2

is bounded on (0, ⇡), Theorem 10.3 shows sn

(x) � s ! 0. Now useCorollary 8.11 to finish the proof. ⇤

Example 10.1. Suppose f(x) = x for �⇡ < x ⇡ and is 2⇡-periodic on R.Since f is odd, a

n

= 0 for all n. Integration by parts gives bn

= (�1)n+12/n forn 2 N. Therefore,

f ⇠1X

n=1

(�1)n+1

2

nsin nx.

For x 2 (�⇡, ⇡), let 0 < � < min{⇡ � x, ⇡ + x}. (This is just the distance from x toclosest endpoint of (�⇡, ⇡).) Using Dini’s test, we see

Z�

0

��f(x + t) + f(x � t) � 2x

t

�� dt =

Z�

0

��x + t + x � t � 2x

t

�� dt = 0 < 1,


5. THE FEJER KERNEL 10-7

so1X

n=1

(�1)n+1

2

nsin nx = x for � ⇡ < x < ⇡.(109)

In particular, when x = ⇡/2, (109) gives another way to derive (99). When x = ⇡,the series converges to 0, which is the middle of the “jump” for f .

This behavior of converging to the middle of a jump discontinuity is typical. Tosee this, denote the one-sided limits of f at x by

f(x�) = limt"x

f(t) and f(x+) = limt#x

f(t),

and suppose f has a jump discontinuity at x with

s =f(x�) + f(x+)

2.

Guided by Dini’s test, considerZ

�

0

��f(x + t) + f(x � t) � 2s

t

�� dt

=

Z�

0

��f(x + t) + f(x � t) � f(x�) � f(x+)

t

�� dt

Z

�

0

��f(x + t) � f(x+)

t

�� dt +

Z�

0

��f(x � t) � f(x�)

t

�� dt

If both of the integrals on the right are finite, then the integral on the left is alsofinite. This amounts to a proof of the following corollary.

Corollary 10.7. Suppose f : R ! R is 2⇡-periodic and integrable on [�⇡, ⇡].If both one-sided limits exist at x and there is a � > 0 such that both

Z�

0

��f(x + t) � f(x+)

t

�� dt < 1 and

Z�

0

��f(x � t) � f(x�)

t

�� dt < 1,

then the Fourier series of f converges to

f(x�) + f(x+)

2.

The Dini test given above provides a powerful condition su�cient to ensure thepointwise convergence of a Fourier series. There is a plethora of ever more abstruseconditions that can be proved in a similar fashion to show pointwise convergence.

The problem is complicated by the fact that there are continuous functionswith Fourier series divergent at a point and integrable functions with Fourier seriesdiverging everywhere [11]. Such examples are much too far into the deep water tobe presented here.

5. The Fejer Kernel

Since pointwise convergence of the partial sums seems complicated, why notchange the rules of the game? Instead of looking at the sequence of partial sums,consider a rolling average instead:

�n

(f, x) =1

n + 1

nX

k=0

sn

(f, x).



The trigonometric polynomials �n

(f, x) are called the Cesaro means of the partialsums. If lim

n!1 �n

(f, x) exists, then the Fourier series for f is said to be (C, 1)summable at x. The idea is that this averaging will “smooth out” the partial sums,making them more nicely behaved. It is not hard to show that if s

n

(f, x) convergesat some x, then �

n

(f, x) will converge to the same thing. But there are sequencesfor which �

n

(f, x) converges and sn

(f, x) does not. (See Exercises 3.19 and 4.26.)As with s

n

(x), we’ll simply write �n

(x) instead of �n

(f, x), when it is clearwhich function is being considered.

We start with a calculation.

�n

(x) =1

n + 1

nX

k=0

sk

(x)

=1

n + 1

nX

k=0

1

2⇡

Z⇡

�⇡

f(x � t)Dk

(t) dt

=1

2⇡

Z⇡

�⇡

f(x � t)1

n + 1

nX

k=0

Dk

(t) dt(*)

=1

2⇡

Z⇡

�⇡

f(x � t)1

n + 1

nX

k=0

sin(k + 1/2)t

sin t/2dt

=1

2⇡

Z⇡

�⇡

f(x � t)1

(n + 1) sin2 t/2

nX

k=0

sin t/2 sin(k + 1/2)t dt

Use the identity 2 sin A sin B = cos(A � B) � cos(A + B).

=1

2⇡

Z⇡

�⇡

f(x � t)1

(n + 1) sin2 t/2

nX

k=0

(cos kt � cos(k + 1)t) dt

The sum telescopes.

=1

2⇡

Z⇡

�⇡

f(x � t)1

(n + 1) sin2 t/2(1 � cos(n + 1)t) dt

Use the identity 2 sin2 A = 1 � cos 2A.

=1

2⇡

Z⇡

�⇡

f(x � t)1

(n + 1)

✓sin n+1

2

t

sin t

2

◆2

dt(**)

The Fejer kernel is the sequence of functions highlighted above; i.e.,

Kn

(t) =1

(n + 1)

✓sin n+1

2

t

sin t

2

◆2

, n 2 N.(110)

Comparing the lines labeled (*) and (**) in the previous calculation, we see anotherform for the Fejer kernel is

Kn

(t) =1

n + 1

nX

k=0

Dk

(t).(111)


5. THE FEJER KERNEL 10-9

-p - p2

p2

p

2

4

6

8

10

Figure 3. A plot of K5

(t), K8

(t) and K10

(t).

Once again, we’re confronted with a convolution integral containing a kernel:

�n

(x) =1

2⇡

Z⇡

�⇡

f(x � t)Kn

(t) dt.

Theorem 10.8. The Fejer kernel has the following properties.

3

(a) Kn

(t) is an even 2⇡-periodic function for each n 2 N.(b) K

n

(0) = n + 1 for each n 2 !.(c) K

n

(t) � 0 for each n 2 N.(d) 1

2⇡

R⇡

�⇡

Kn

(t) dt = 1 for each n 2 !.

(e) If 0 < � < ⇡, then Kn

�!�! 0 on [�⇡, �] [ [�, ⇡].

(f) If 0 < � < ⇡, thenR�

�⇡

Kn

(t) dt ! 0 and

R⇡

�

Kn

(t) dt ! 0.

Proof. Theorem 10.5 and (111) imply (a), (b) and (d). Equation (110) implies(c).

Let � be as in (e). In light of (a), it su�ces to prove (e) for the interval [�, ⇡].Noting that sin t/2 is decreasing on [�, ⇡], it follows that for � t ⇡,

Kn

(t) =1

(n + 1)

✓sin n+1

2

t

sin t

2

◆2

1

(n + 1)

✓1

sin t

2

◆2

1

(n + 1)

1

sin2

�

2

! 0

It follows that Kn

�!�! 0 on [�, ⇡] and (e) has been proved.

3Compare this theorem with Lemma 9.14.



Theorem 9.16 and (e) imply (f). ⇤

6. Fejer’s Theorem

Theorem 10.9 (Fejer). If f : R ! R is 2⇡-periodic, integrable on [�⇡, ⇡] andcontinuous at x, then �

n

(x) ! f(x).

Proof. Since f is 2⇡-periodic andR⇡

�⇡

Rf(t) dt exists, so does

R⇡

�⇡

R(f(x �

t) � f(x)) dt. Theorem 8.3 gives an M > 0 so |f(x � t) � f(x)| < M for all t.Let " > 0 and choose � > 0 such that |f(x) � f(y)| < "/3 whenever |x � y| < �.

By Theorem 10.8, there is an N 2 N so that whenever n � N ,

1

2⇡

Z�

�⇡

Kn

(t) dt <"

3Mand

1

2⇡

Z⇡

�

Kn

(t) dt <"

3M.

We start calculating.

|�n

(x) � f(x)| =

��1

2⇡

Z⇡

�⇡

f(x � t)Kn

(t) dt � 1

2⇡

Z⇡

�⇡

f(x)Kn

(t) dt

��

=1

2⇡

��Z

⇡

�⇡

(f(x � t) � f(x))Kn

(t) dt

��

=1

2⇡

��

Z ��

�⇡

(f(x � t) � f(x))Kn

(t) dt +

Z�

��

(f(x � t) � f(x))Kn

(t) dt

+

Z⇡

�

(f(x � t) � f(x))Kn

(t) dt

��

��

1

2⇡

Z ��

�⇡

(f(x � t) � f(x))Kn

(t) dt

��+

��1

2⇡

Z�

��

(f(x � t) � f(x))Kn

(t) dt

��

+

��1

2⇡

Z⇡

�

(f(x � t) � f(x))Kn

(t) dt

��

<M

2⇡

Z ��

�⇡

Kn

(t) dt +1

2⇡

Z�

��

|f(x � t) � f(x)|Kn

(t) dt +M

2⇡

Z⇡

�

Kn

(t) dt

< M"

3M+

"

3

1

2⇡

Z�

��

Kn

(t) dt + M"

3M< "

This shows �n

(x) ! f(x). ⇤Theorem 10.9 gives a partial solution to the representation problem.

Corollary 10.10. Suppose f and g are continuous and 2⇡-periodic on R. If fand g have the same Fourier coe�cients, then they are equal.

Proof. By assumption, �n

(f, t) = �n

(g, t) for all n and Theorem 10.9 implies

0 = �n

(f, t) � �n

(g, t) ! f � g.

⇤In the case of continuous functions, the convergence is uniform, rather than

pointwise.

Theorem 10.11 (Fejer). If f : R ! R is 2⇡-periodic and continuous, then

�n

(x) �!�! f(x).


7. EXERCISES 10-11

-p -p

2

p

2p 2 p

-p

-p

2

p

2

p

-p -p

2

p

2p 2 p

-p

-p

2

p

2

p

Figure 4. These plots illustrate the functions of Example 10.2. Onthe left are shown f(x), s

8

(x) and �

8

(x). On the right are shown f(x),�

3

(x), �5

(x) and �

20

(x). Compare this with Figure 2.

Proof. By Exercise 10.3, f is uniformly continuous. This can be used to showthe calculation within the proof of Theorem 10.9 do not depend on x. The detailsare left as Exercise 10.5. ⇤

A perspicacious reader will have noticed the similarity between Theorem 10.11and the Weierstrass approximation theorem, Theorem 9.13. In fact, the Weierstrassapproximation theorem can be proved from Theorem 10.11 using power series andTheorem 9.24.

Example 10.2. As in Example 10.1, let f(x) = x for �⇡ < x ⇡ and extendf to be periodic on R with period 2⇡. Figure 4 shows the di↵erence between theFejer and classical methods of summation. Notice that the Fejer sums remain muchmore smoothly a�xed to the function.

7. Exercises


10.2. Suppose f is integrable on [�⇡, ⇡]. If f is even, then the Fourier series forf has no sine terms. If f is odd, then the Fourier series for f has no cosine terms.

10.3. If f : R ! R is periodic with period p and continuous on [0, p], then f isuniformly continuous.

10.4. The function g(t) = t/ sin(t/2) is undefined whenever t = 2n⇡ for somen 2 Z. Show that it can be redefined on the set {2n⇡ : n 2 Z} to be periodic anduniformly continuous on R.


10.6. Prove the Weierstrass approximation theorem using Taylor series andTheorem 9.24.


Bibliography

[1] A.D. Aczel. The mystery of the Aleph: mathematics, the Kabbalah, and the search for infinity.Washington Square Press, 2001.

[2] Carl B. Boyer. The History of the Calculus and Its Conceptual Development. Dover Publica-tions, Inc., 1959.

[3] James Ward Brown and Ruel V Churchill. Fourier series and boundary value problems.McGraw-Hill, New York, 8th ed edition, 2012.

[4] Andrew Bruckner. Di↵erentiation of Real Functions, volume 5 of CRM Monograph Series.American Mathematical Society, 1994.

[5] Georg Cantor. Uber eine Eigenschaft des Inbegri↵es aller reelen algebraishen Zahlen. Journalfur die Reine und Angewandte Mathematik, 77:258–262, 1874.

[6] Georg Cantor. De la puissance des ensembles parfait de points. Acta Math., 4:381–392, 1884.[7] Krzysztof Ciesielski. Set Theory for the Working Mathematician. Number 39 in London

Mathematical Society Student Texts. Cambridge University Press, 1998.[8] Gerald A. Edgar, editor. Classics on Fractals. Addison-Wesley, 1993.[9] James Foran. Fundamentals of Real Analysis. Marcel-Dekker, 1991.

[10] Nathan Jacobson. Basic Algebra I. W. H. Freeman and Company, 1974.[11] Yitzhak Katznelson. An Introduction to Harmonic Analysis. Cambridge University Press,

Cambridge, UK, 3rd ed edition, 2004.[12] Charles C. Mumma. n! and The Root Test. Amer. Math. Monthly, 93(7):561, August-

September 1986.[13] James R. Munkres. Topology; a first course. Prentice-Hall, Englewood Cli↵s, N.J., 1975.[14] Hans Samelson. More on Kummer’s test. The American Mathematical Monthly, 102(9):817–

818, 1995.[15] Jingcheng Tong. Kummer’s test gives characterizations for convergence or divergence of all

positive series. The American Mathematical Monthly, 101(5):450–452, 1994.[16] Karl Weierstrass. Uber continuirliche Functionen eines reelen Arguments, di fur keinen Werth

des Letzteren einen bestimmten Di↵erentialquotienten besitzen, volume 2 of Mathematische

werke von Karl Weierstrass, pages 71–74. Mayer & Muller, Berlin, 1895.

A-1

Index

| ⇧ |, absolute value, 2-4@

0

, cardinality of N, 1-10�!

�!

uniform convergence, 9-3S

c, complement of S, 1-3c, cardinality of R, 2-10D(.) Darboux integral, 8-5D(.). lower Darboux integral, 8-5D(., .) lower Darboux sum, 8-4D(.). upper Darboux integral, 8-5D(., .) upper Darboux sum, 8-4\, set di↵erence, 1-3D

n

(t) Dirichlet kernel, 10-42, element, 1-162, not an element, 1-1;, empty set, 1-2() , logically equivalent, 1-4K

n

(t) Fejer kernel, 10-9B

A, all functions f : A ! B, 1-13glb , greatest lower bound, 2-6i↵, if and only if, 1-4=) , implies, 1-4<, , >, �, 2-31, infinity, 2-6\, intersection, 1-3^, logical and, 1-2_, logical or, 1-2lub , least upper bound, 2-6n, initial segment, 1-10N, natural numbers, 1-2!, nonnegative integers, 1-2part([a, b]) partitions of [a, b], 8-1! pointwise convergence, 9-1P(A), power set, 1-2⇧, indexed product, 1-5R, real numbers, 2-7R(.) Riemann integral, 8-3R (., ., .) Riemann sum, 8-2⇢, subset, 1-1(, proper subset, 1-1�, superset, 1-1), proper superset, 1-1�, symmetric di↵erence, 1-3⇥, product (Cartesian or real), 1-5, 2-1

T trigonometric polynomials, 10-1[, union, 1-2Z, integers, 1-2Abel sum, 9-18Abel’s test, 4-10absolute value, 2-4almost every, 5-9alternating harmonic series, 4-10Alternating Series Test, 4-12and ^, 1-2Archimedean Principle, 2-7axioms of R

additive inverse, 2-1associative laws, 2-1commutative laws, 2-1completeness, 2-7distributive law, 2-1identities, 2-1multiplicative inverse, 2-1order, 2-3

Baire category theorem, 5-10Baire, Rene-Louis, 5-10Bertrand’s test, 4-9Bolzano-Weierstrass Theorem, 3-7, 5-3bound

lower, 2-5upper, 2-5

bounded, 2-5above, 2-5below, 2-5

Cantor middle-thirds set, 5-11Cantor, Georg, 1-11

diagonal argument, 2-9cardinality, 1-10

countably infinite, 1-10finite, 1-10uncountably infinite, 1-11

Cartesian product, 1-5Cauchy

condensation test, 4-4criterion, 4-3Mean Value Theorem, 7-6

A-2

Index A-3

sequence, 3-9Cauchy continuous, 6-13ceiling function, 6-8clopen set, 5-2closed set, 5-1closure of a set, 5-3Cohen, Paul, 1-12compact, 5-7

equivalences, 5-7comparison test, 4-4completeness, 2-5composition, 1-6connected set, 5-4continuous, 6-5

Cauchy, 6-13left, 6-8right, 6-8uniformly, 6-12

continuum hypothesis, 1-11, 2-10convergence

pointwise, 9-1uniform, 9-3

convolution kernel, 9-8critical number, 7-5critical point, 7-5

Darbouxintegral, 8-5lower integral, 8-5lower sum, 8-4Theorem, 7-8upper integral, 8-5upper sum, 8-4

DeMorgan’s Laws, 1-4dense set, 2-8, 5-9

irrational numbers, 2-8rational numbers, 2-8

derivative, 7-1chain rule, 7-3rational function, 7-3

derived set, 5-2diagonal argument, 2-9di↵erentiable function, 7-6Dini’s test, 10-5Dini’s Theorem, 9-4Dirac sequence, 9-8Dirichlet function, 6-7Dirichlet kernel, 10-4disconnected set, 5-4

extreme point, 7-5

Fejer Kernel, 10-9Fejer’s theorem, 10-10field, 2-1

complete ordered, 2-7field axioms, 2-1finite cover, 5-6floor function, 6-8

Fourie series, 10-2full measure, 5-9function, 1-6

bijective, 1-7composition, 1-6constant, 1-6di↵erentiable, 7-6even, 7-15image of a set, 1-7injective, 1-7inverse, 1-8inverse image of a set, 1-7odd, 7-15one-to-one, 1-7onto, 1-6salt and pepper, 6-7surjective, 1-6

Fundamental Theorem of Calculus, 8-13,8-14

geometric sequence, 3-1Godel, Kurt, 1-11greatest lower bound, 2-6

Heine-Borel Theorem, 5-7Hilbert, David, 1-11

indexed collection of sets, 1-4indexing set, 1-4

infinity 1, 2-6initial segment, 1-10integers, 1-2integral

Cauchy criterion, 8-9change of variables, 8-16

integration by parts, 8-14intervals, 2-4irrational numbers, 2-8isolated point, 5-2

Kummer’s test, 4-8

L’Hospital’s Ruleeasy, 7-11hard, 7-12

least upper bound, 2-6left-hand limit, 6-4limit

left-hand, 6-4right-hand, 6-4unilateral, 6-4

limit comparison test, 4-5limit point, 5-2limit point compact, 5-7Lindelof property, 5-6

Maclaurin series, 9-17meager, 5-11Mean Value Theorem, 7-6metric, 2-5


A-4 Index

discrete, 2-5space, 2-4standard, 2-5

n-tuple, 1-5natural numbers, 1-2Nested Interval Theorem, 3-9nested sets, 3-9, 5-3nowhere dense, 5-10

open cover, 5-5finite, 5-6

open set, 5-1or _, 1-2order isomorphism, 2-7ordered field, 2-3ordered pair, 1-5ordered triple, 1-5

partition, 8-1common refinement, 8-1generic, 8-1norm, 8-1refinement, 8-1selection, 8-2

Peano axioms, 2-1perfect set, 5-11portion of a set, 5-10power series, 9-15

center, 9-15domain, 9-15geometric, 9-15interval of convergence, 9-15Maclaurin, 9-17radius of convergence, 9-15Taylor, 9-17

power set, 1-2

Raabe’s test, 4-9ratio test, 4-7rational function, 6-8rational numbers, 2-2, 2-8real numbers, R, 2-7relation, 1-6

domain, 1-6equivalence, 1-6function, 1-6range, 1-6reflexive, 1-6symmetric, 1-6transitive, 1-6

relative maximum, 7-5relative minimum, 7-5relative topology, 5-4relatively closed set, 5-4relatively open set, 5-4Riemann

integral, 8-3Rearrangement Theorem, 4-13

sum, 8-2right-hand limit, 6-4Rolle’s Theorem, 7-6root test, 4-6

salt and pepper function, 6-7, 8-16Sandwich Theorem, 3-4Schroder-Bernstein Theorem, 1-9sequence, 3-1

accumulation point, 3-7bounded, 3-1bounded above, 3-1bounded below, 3-1Cauchy, 3-9contractive, 3-10convergent, 3-2decreasing, 3-4divergent, 3-2Fibonacci, 3-1functions, 9-1geometric, 3-1increasing, 3-4lim inf, 3-8limit, 3-2lim sup, 3-8monotone, 3-4recursive, 3-1subsequence, 3-6

sequentially compact, 5-7series, 4-1

Abel’s test, 4-10absolutely convergent, 4-10alternating, 4-12alternating harmonic, 4-10Alternating Series Test, 4-12Bertrand’s test, 4-9Cauchy Criterion, 4-3Cauchy’s condensation test, 4-4Cesaro summability, 4-16comparison test, 4-4conditionally convergent, 4-10convergent, 4-1divergent, 4-1Fourier, 10-2geometric, 4-1harmonic, 4-2Kummer’s test, 4-8limit comparison test, 4-5p-series, 4-5partial sums, 4-1positive, 4-4Raabe’s test, 4-9ratio test, 4-7rearrangement, 4-13root test, 4-6subseries, 4-17summation by parts, 4-11telescoping, 4-3


Index A-5

terms, 4-1set, 1-1

clopen, 5-2closed, 5-1compact, 5-7complement, 1-3complementation, 1-3dense, 5-9di↵erence, 1-3element, 1-1empty set, 1-2equality, 1-1F

�

, 5-11G

�

, 5-11intersection, 1-2meager, 5-11nowhere dense, 5-10open, 5-1perfect, 5-11proper subset, 1-1subset, 1-1symmetric di↵erence, 1-3union, 1-2

subcover, 5-5subspace topology, 5-4summation by parts, 4-11

Taylor series, 9-17Taylor’s Theorem, 7-9

integral remainder, 8-14thm:riemannlebesgue, 10-3topology, 5-2

finite complement, 5-13relative, 5-13right ray, 5-2, 5-13standard, 5-2

totally disconnected set, 5-4trigonometric polynomial, 10-1

unbounded, 2-6uniform continuity, 6-12uniform metric, 9-5

Cauchy sequence, 9-5complete, 9-6

unilateral limit, 6-4

WeierstrassApproximation Theorem, 9-7, 10-11M-Test, 9-6


Documents

Intro Real Anal