
03 Analysis Lecture 1

Lecture 2: Real Analysis, Part 1

Matthew Rognlie

August 20, 2013

1 Metric spaces

1.1 Motivation for metric spaces

As we discussed earlier, the definitions of the real numbers and other fields, taken alone, are not sufficient for many purposes. For instance, it was awkward to show that there exists a solution in R to x^n = y when y > 0, even though existence is obvious when the function x^n is graphed. To translate such “obvious” insights into mathematically solid proofs, we need the machinery of analysis.

Perhaps the most fundamental concept in analysis is distance. Yet this is not always easy to define. For two real numbers x and y, we probably want the distance to be |x − y|. For two complex numbers w and z, we could also write the distance as |w − z|. But another perfectly valid notion of distance would be |Re w − Re z| + |Im w − Im z|. Similarly, for two elements x and y of R^n, we could define the distance to be |x − y| = √((x_1 − y_1)^2 + … + (x_n − y_n)^2). But we could also define the distance to be the sum |x_1 − y_1| + … + |x_n − y_n|.

The picture is even more complicated when we think about richer spaces. Suppose that we are trying to measure the distance between two bounded functions f : [0, 1] → R and g : [0, 1] → R. We could define this to be:

• sup_{x∈[0,1]} |f(x) − g(x)|, the least upper bound of the distance between f and g.

• sup_{x∈[0,1]\M, M small} |f(x) − g(x)|, the least upper bound of the distance between f and g, possibly disregarding some “small” set of points M.

• (∫_0^1 |f(x) − g(x)|^α dx)^{1/α}, an average of distances |f(x) − g(x)| from 0 to 1.¹

It turns out that in various situations, all three notions of distance above are useful. It’s much easier to deal with these varied cases, however, if we have a general theory of distance and the properties derived from it. This leads us to the concept of metric spaces.

¹We haven’t formally defined the integral yet, and we won’t for a while, but assuming you’ve seen it in some form before you get the idea...


1.2 Definition of metric spaces

Definition 1.1 (Metric space). A metric space (X, d) consists of a set X, whose elements we call points, along with a function d : X × X → R we call a metric. For all points p, q, and r in X, the metric d must obey the following properties:

(a) d(p, q) ≥ 0, with d(p, q) = 0 iff p = q.

(b) d(p, q) = d(q, p).

(c) d(p, r) ≤ d(p, q) + d(q, r). (The triangle inequality.)

Examples:

• R^n is a metric space with d(x, y) = |x − y|.

• R^n is also a metric space with d(x, y) = ∑_{i=1}^n |x_i − y_i|. When n = 2, this is sometimes called the “Manhattan” or “taxicab” metric, because it tells you the distance you’d need to walk between x and y on a rectangular grid of streets. This metric agrees with the previous one only when n = 1.

• R^n is also a metric space with d(x, y) = max_i |x_i − y_i|. (Note that one set can be endowed with many different metrics!)

• For any set Y, the set X of all functions f : Y → [0, 1] is a metric space with d(f, g) = sup_{y∈Y} |f(y) − g(y)|.

• For any space X, we can define the discrete metric

d(x, y) = 1 if x ≠ y, and d(x, y) = 0 if x = y.

• If d is a metric for space X, then for 0 < α < 1 we can replace it with d′(x, y) = d(x, y)^α to obtain another valid metric for space X. (If α > 1, we might violate condition (c), the triangle inequality.)
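The examples above can be spot-checked numerically. The following is a minimal sketch, not part of the original notes: the function names (`d_euclid`, `d_taxicab`, `d_max`, `d_discrete`, `d_squared`, `satisfies_axioms`) and the sample points are illustrative assumptions. It tests the three properties of a metric on every triple from a small finite sample, which can expose a violation but can never prove that a function is a metric.

```python
import itertools
import math

def d_euclid(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_taxicab(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def d_max(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def d_discrete(x, y):
    return 0 if x == y else 1

def d_squared(x, y):
    # Euclidean distance raised to the power alpha = 2 (> 1).
    return d_euclid(x, y) ** 2

def satisfies_axioms(d, points, tol=1e-12):
    """Check positivity, symmetry and the triangle inequality on all triples."""
    for p, q, r in itertools.product(points, repeat=3):
        if d(p, q) < 0 or (d(p, q) == 0) != (p == q):
            return False
        if abs(d(p, q) - d(q, p)) > tol:
            return False
        if d(p, r) > d(p, q) + d(q, r) + tol:
            return False
    return True

# Hypothetical sample points in R^2 (three of them collinear on purpose).
sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 3.0), (-2.0, 0.5)]

for metric in (d_euclid, d_taxicab, d_max, d_discrete, d_squared):
    print(metric.__name__, satisfies_axioms(metric, sample))
```

On this sample only `d_squared` fails: with the collinear points (0, 0), (1, 0), (2, 0), the squared distance from the first to the third is 4, which exceeds 1 + 1. This is a concrete instance of how an exponent α > 1 can break the triangle inequality.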

1.3 Open and closed sets

Definition 1.2 (Open and closed balls). Let (X, d) be a metric space, and x ∈ X. The open ball B_r(x) of radius r at x is defined to be the set

B_r(x) = {y ∈ X : d(x, y) < r}

The closed ball C_r(x) of radius r at x is

C_r(x) = {y ∈ X : d(x, y) ≤ r}


Definition 1.3 (Open and closed intervals). In R, the open interval (a, b) is defined to be the set {x ∈ R : a < x < b}. The closed interval [a, b] is defined to be the set {x ∈ R : a ≤ x ≤ b}. (A closed interval includes its endpoints; an open interval does not.)

Sometimes we see “half-open, half-closed” intervals of the form (a, b] or [a, b). These are defined in the natural way, as {x ∈ R : a < x ≤ b} and {x ∈ R : a ≤ x < b}, respectively.

We also see intervals of the forms (a, ∞), (−∞, a), [a, ∞), or (−∞, a]. These are defined, again naturally, as {x ∈ R : a < x}, {x ∈ R : x < a}, {x ∈ R : a ≤ x}, and {x ∈ R : x ≤ a}, respectively.

In all definitions below, (X, d) is a metric space and A is a subset of X.

Definition 1.4 (Interior points and interior). We say x ∈ A is an interior point of A if there exists some r > 0 such that the open ball B_r(x) is a subset of A. The interior Int A consists of all x ∈ A that are interior points of A.

Definition 1.5 (Open sets). We say that A is open if every x ∈ A is an interior point of A. (Equivalently, A is open if A = Int A.)

Definition 1.6 (Neighborhood, open neighborhood). We say that A is an open neighborhood of x if A is open and x ∈ A. We say that B is a neighborhood of x if there is some open set A such that x ∈ A ⊂ B. (Sometimes the term “neighborhood” is used more narrowly to mean what we call an “open neighborhood”.)

Definition 1.7 (Limit points, isolated points, closure, and boundary). We say x is a limit point of A if for every r > 0, there exists some y ≠ x such that y ∈ A and d(x, y) < r. (Equivalently, x is a limit point if for every r > 0, the intersection B_r(x) ∩ A contains a point y ≠ x.) We denote the set of all limit points of A by A′. We say x is an isolated point of A if x ∈ A but x is not a limit point of A.

The closure Ā of A consists of A plus all its limit points: Ā = A ∪ A′. The boundary ∂A of A is the closure minus the set of interior points: ∂A = Ā \ (Int A).

Definition 1.8 (Closed sets). We say A is closed if it contains all its limit points. (Equivalently, A is closed if A = Ā.)

Definition 1.9 (Bounded sets). We say A is bounded if there is a real number M such that for all p, q ∈ A, d(p, q) < M.

Examples. One of the simplest examples of an open set is the open ball.

Proposition 1.10. In any metric space (X, d), and for any x ∈ X and r > 0, the open ball B_r(x) is an open set.

Proof. Suppose y ∈ B_r(x), and let s = d(y, x), so that r − s > 0. Then for any z ∈ B_{r−s}(y) we have d(z, x) ≤ d(z, y) + d(y, x) < (r − s) + s = r, so that z ∈ B_r(x). We conclude that B_{r−s}(y) ⊂ B_r(x), and thus y is an interior point of B_r(x). Since this is true for all y ∈ B_r(x), B_r(x) is open. (See the diagram below for a visualization of this proof in R^2.)

[Figure: the open ball B_r(x) in R^2, with a point y at distance s from x and the smaller ball B_{r−s}(y) around it; labels x, y, r, s, r − s.]

Additional examples.

• Open intervals (a, b) are open sets in R, and closed intervals [a, b] are closed sets in R.

• In any metric space (X, d), all open balls B_r(x) are open sets, and all closed balls C_r(x) are closed sets.

• In any metric space (X, d), both the full set X and the empty set ∅ are both open and closed.

• The subset of all integers in R is closed. The subset of all rationals in R is not closed. (For instance, √2 is a limit point of the set {1, 1.4, 1.41, 1.414, …} of rationals, and it is not rational.) In fact, as we will prove later, the closure of Q in R is R itself.

• In any metric space, all finite subsets are closed.


We illustrate several of the concepts below for the case of an open ball A in R^2. Note that since A is open, the interior is just A. Furthermore, every point of A is a limit point.

[Figure, three panels: Our set A; Its closure Ā; Its boundary ∂A.]

1.4 Properties of open and closed sets

There is a very nice relationship between open and closed sets.

Proposition 1.11. Let (X, d) be a metric space. If A is an open set, then X \ A is a closed set, and if A is a closed set, then X \ A is an open set.

Proof. First, suppose that A is open. No x ∈ A can be a limit point of X \ A, because the openness of A implies that for some r > 0, B_r(x) ⊂ A, and consequently B_r(x) ∩ (X \ A) = ∅. Therefore X \ A contains all its limit points, and is closed.

Now suppose that A is closed, and take any x ∈ X \ A. Since A is closed, x cannot be a limit point of A. Therefore there exists some r > 0 such that B_r(x) ∩ A = ∅, or equivalently B_r(x) ⊂ X \ A, so that x is an interior point of X \ A. Since this is true for any x ∈ X \ A, we conclude that X \ A is an open set.

The figure below illustrates how the complement of an open ball A in R^2 is closed. A does not contain its boundary (a circle), while its complement does.

[Figure, two panels: A; R^2 \ A.]

Proposition 1.12.

(a) For any collection {G_α} of open sets, ∪_α G_α is open.

(b) For any finite collection G_1, …, G_n of open sets, ∩_i G_i is open.

(c) For any collection {F_α} of closed sets, ∩_α F_α is closed.

(d) For any finite collection F_1, …, F_n of closed sets, ∪_i F_i is closed.

Proof. For (a), consider any x ∈ ∪_α G_α. Choose some G_{α′} such that x ∈ G_{α′}. Since G_{α′} is open, x is an interior point of G_{α′}. But then it is certainly an interior point of ∪_α G_α ⊃ G_{α′} too! Since this is true for all x ∈ ∪_α G_α, we conclude that ∪_α G_α is open.

For (b), consider any x ∈ ∩_i G_i. For each i, there is some r_i > 0 such that B_{r_i}(x) ⊂ G_i. Let r = min_i r_i, which is positive because the collection is finite. Then B_r(x) ⊂ G_i for all i, implying that B_r(x) ⊂ ∩_i G_i. We conclude that ∩_i G_i is open.

For (c) and (d), we apply (a), (b), Proposition 1.11, and De Morgan’s laws. In particular, for (c) we observe that X \ (∩_α F_α) = ∪_α (X \ F_α). Since each X \ F_α is open, ∪_α (X \ F_α) is open by (a), and therefore ∩_α F_α is closed.

Similarly, for (d) we note that X \ (∪_i F_i) = ∩_i (X \ F_i). Since each X \ F_i is open, ∩_i (X \ F_i) is open by (b), and therefore ∪_i F_i is closed.

[Figure. Illustration of (a): the union of progressively larger open balls (with limit radius 1) is another open ball.]

[Figure. Illustration of (b)’s failure when the collection is infinite: the intersection of progressively smaller open balls (with limit radius 1) is a closed ball, not open.]
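A one-dimensional analogue of this failure can be checked with exact arithmetic. The sketch below is an illustration rather than part of the notes: it assumes the shrinking open sets are the intervals G_n = (−1 − 1/n, 1 + 1/n) in R, and the helper names `in_G` and `excluded_by` are made up for the example. The point 1 lies in every G_n, yet for any r > 0 the nearby point 1 + r/2 is missing from some G_n, so no ball around 1 fits inside the intersection.

```python
import math
from fractions import Fraction

def in_G(n, x):
    """Membership in the open interval G_n = (-1 - 1/n, 1 + 1/n)."""
    return -1 - Fraction(1, n) < x < 1 + Fraction(1, n)

def excluded_by(x):
    """For x > 1, return an index n with x not in G_n.

    x is outside G_n as soon as 1 + 1/n <= x, i.e. n >= 1/(x - 1).
    """
    return math.ceil(1 / (x - 1))

# The point 1 belongs to every G_n that we test.
print(all(in_G(n, Fraction(1)) for n in range(1, 10_000)))

# But points arbitrarily close to 1 on the right are eventually excluded.
for r in (Fraction(1, 2), Fraction(1, 100), Fraction(1, 10**6)):
    x = 1 + r / 2
    n = excluded_by(x)
    print(f"r = {r}: the point {x} is not in G_{n}")
```

So 1 belongs to the intersection of all the G_n but is not an interior point of it, which is consistent with the intersection being the closed interval [−1, 1].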

It is important to note that whether a set A is open or closed (or neither) depends on the metric space X in which A is situated. For instance, the open interval A = (0, 1) is not closed in X = R, but it is closed in X = R \ {0, 1}, the real line with 0 and 1 removed, and it is certainly closed in X = A. In general, when the specific metric space in which we are working is not clear, we clarify matters by saying that A is open or closed relative to a space Y. For instance, we say that A = (0, 1) is open relative to Y = R, and both open and closed relative to Y = R \ {0, 1}.

If we have a particular set A, can we say anything about when A will be open or closed relative to a space Y? The following theorem is sometimes useful.

Proposition 1.13. Suppose Y ⊂ X. A subset A of Y is open relative to Y if and only if A = Y ∩ B for some open subset B ⊂ X.

Note that this result implies that if A ⊂ Y ⊂ X, and A is open relative to X, then A is open relative to Y as well.

We will define many concepts using metric spaces, and most of these concepts (though not all) can be defined entirely in terms of open sets (or, equivalently, closed sets). Such properties are called topological properties.² Sometimes there are multiple ways to assign a metric to some set X under which the open sets are exactly the same, implying that all topological properties are identical under the two metrics.

Definition 1.14 (Topologically equivalent metrics). Suppose that d_1 and d_2 are metrics on the same set X, and let us use B_r(x; d) to denote the open ball of radius r around point x when using the metric d. We say d_1 and d_2 are topologically equivalent if for any x ∈ X and r > 0, there exist r′ > 0 and r′′ > 0 such that

B_{r′}(x; d_1) ⊂ B_r(x; d_2) (1.1)

B_{r′′}(x; d_2) ⊂ B_r(x; d_1) (1.2)

²“Topology” is an area of mathematics that generalizes many of the concepts we’re defining with metric spaces. In topology, we give a set some structure by choosing which sets are open (subject to some rules), rather than deriving all the structure from a metric.

Proposition 1.15. Topological equivalence is an equivalence relation on metrics.

Proposition 1.16. If two metrics d_1 and d_2 are topologically equivalent, then a subset A ⊂ X is open using metric d_1 iff it is open using metric d_2.

Proof. By definition, A is open using metric d_1 if for any x ∈ A, there exists some open ball B_r(x; d_1) ⊂ A. Under our definition of topological equivalence, for any such x there is then some open ball B_{r′′}(x; d_2) ⊂ B_r(x; d_1) ⊂ A as well, implying that A is open using d_2.

The converse, that A open using metric d_2 implies A open using metric d_1, follows by the same argument.

The concept of topological equivalence can be extremely useful. Consider the following three metrics in R^n:

• The Euclidean metric d_E(x, y) = √(∑_i (x_i − y_i)^2).

• The Manhattan (or taxicab) metric d_T(x, y) = ∑_i |x_i − y_i|.

• The “max” metric d_M(x, y) = max_i |x_i − y_i|.

Although these metrics look very different, it turns out that they are all topologically equivalent. The easiest way to prove this is to show that both the Euclidean and Manhattan metrics are equivalent to the max metric.

Proposition 1.17. d_E, d_T, and d_M are topologically equivalent.

Proof. Take any x, y ∈ R^n. Then d_M(x, y) ≤ d_E(x, y) and d_E(x, y) ≤ n · d_M(x, y). It follows that for any x ∈ R^n and r > 0, B_r(x; d_E) ⊂ B_r(x; d_M), and B_{r/n}(x; d_M) ⊂ B_r(x; d_E), so that d_E and d_M are equivalent.

The same argument is true for d_T and d_M, simply replacing d_E above with d_T.
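The two inequalities that drive this proof are easy to test on random data. The following sketch is an illustration only: the helper names, the sampling range, and the tolerance `tol` (which absorbs floating-point rounding) are assumptions, not something taken from the notes. It checks that d_M ≤ d_E ≤ n · d_M and d_M ≤ d_T ≤ n · d_M on random pairs of points in R^n.

```python
import math
import random

def d_euclid(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_taxicab(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def d_max(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def check_bounds(n, trials=1000, tol=1e-9, seed=0):
    """Return True if both chains of inequalities hold on random pairs in R^n."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-10, 10) for _ in range(n)]
        y = [rng.uniform(-10, 10) for _ in range(n)]
        m, e, t = d_max(x, y), d_euclid(x, y), d_taxicab(x, y)
        if not (m <= e + tol and e <= n * m + tol):
            return False
        if not (m <= t + tol and t <= n * m + tol):
            return False
    return True

print([check_bounds(n) for n in (1, 2, 5, 20)])
```

A passing check is only evidence, not a proof, but it makes the geometric picture concrete: a d_M-ball of radius r/n always sits inside the d_E-ball and the d_T-ball of radius r.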

The topological equivalence of d_E, d_T, and d_M is depicted below in the case of R^2. Open balls under d_E have a circular boundary, while open balls under d_T have a diamond-shaped boundary and open balls under d_M have a square boundary. (Think about why.) Following Definition 1.14, the topological equivalence of d_E and d_M boils down to the fact that we can place a circle inside any square and a square inside any circle; the topological equivalence of d_T and d_M boils down to the fact that we can place a diamond inside any square and a square inside any diamond. This is depicted below.

[Figure, two panels: Equivalence of d_E and d_M; Equivalence of d_T and d_M.]

The concept of topological equivalence is relevant in a number of other interesting cases, of which the following is only one example:

Proposition 1.18. For any metric d on a space X, if we define the new metric d̄ by

d̄(x, y) = min(d(x, y), 1) (1.3)

then d and d̄ are topologically equivalent.

Proof. In Definition 1.14, let r′ = r′′ = min(r, 1) for all r and x. Then both conditions for topological equivalence are satisfied.

The interesting thing about this proposition is that it demonstrates that topological properties are inherently “local”; we can truncate the values of all distances beyond a certain level without changing which sets are open. (Of course, we can replace the “1” in this proposition with any positive constant and the proof will be the same.) In particular, the notion of boundedness (Definition 1.9) means very little in this context, since any metric is equivalent to a metric that is bounded.
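To make the truncation concrete, here is a small sketch on the real line. It is an illustration under stated assumptions: `d` is taken to be the standard metric |x − y|, `d_bar` is the truncated metric from (1.3), and the grid of test points and radii is arbitrary. It checks that balls of radius at most 1 contain the same points under both metrics, while the truncated metric never exceeds 1.

```python
def d(x, y):
    """Standard metric on R."""
    return abs(x - y)

def d_bar(x, y):
    """Truncated metric min(d(x, y), 1)."""
    return min(d(x, y), 1)

points = [k / 4 for k in range(-40, 41)]   # grid on [-10, 10]
radii = [0.1, 0.5, 1.0]                    # radii with r <= 1

# For r <= 1, y is in B_r(x; d) exactly when y is in B_r(x; d_bar).
same_small_balls = all(
    (d(x, y) < r) == (d_bar(x, y) < r)
    for r in radii for x in points for y in points
)

# Under d_bar no two points are more than 1 apart.
bounded = all(d_bar(x, y) <= 1 for x in points for y in points)

print(same_small_balls, bounded, d(-10, 10), d_bar(-10, 10))
```

Small balls are what determine openness, so agreeing on them is enough for the two metrics to have the same open sets, even though one metric is bounded and the other is not.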

1.5 Sequences and subsequences

Definition 1.19 (Sequence). A sequence in a metric space X is a function p : N → X whose domain is the set of natural numbers. We usually denote the value of p at n by p_n instead of p(n), and we denote the sequence as a whole by writing (p_n) or by listing elements (p_1, p_2, p_3, …).

Definition 1.20 (Convergence). A sequence (p_n) in X converges to p ∈ X if for every ε > 0, there exists an N such that d(p_n, p) < ε (or equivalently, p_n ∈ B_ε(p)) for all n ≥ N.³

³Note that in this definition, the order of quantifiers is very important! We require that for every ε > 0, there exists an N such that d(p_n, p) < ε for all n ≥ N. This is much weaker than requiring that there exists a single N such that for all ε > 0, d(p_n, p) < ε for n ≥ N. Indeed, the latter would require that the sequence be constant starting at N: p_N = p_{N+1} = p_{N+2} = ….
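The dependence of N on ε can be seen explicitly for the sequence p_n = 1/n with limit 0. The sketch below is an illustration, not part of the notes: the function name `threshold` is hypothetical, and exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
import math
from fractions import Fraction

def threshold(eps):
    """Smallest N such that 1/n < eps for every n >= N (p_n = 1/n, p = 0)."""
    return math.floor(1 / eps) + 1

for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 1000)):
    N = threshold(eps)
    # Every term checked from index N onward is within eps of the limit 0 ...
    assert all(Fraction(1, n) < eps for n in range(N, N + 1000))
    # ... and N is minimal: the term just before it is not.
    assert N == 1 or Fraction(1, N - 1) >= eps
    print(f"eps = {eps}: N = {N}")
```

Each ε gets its own N, and N grows without bound as ε shrinks. No single N works for every ε at once, which is exactly the distinction the footnote draws.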


Definition 1.21 (Limit). The point p in the previous definition is called the limit of the sequence (p_n), and we write lim p_n = p or p_n → p.

Proposition 1.22. Let (p_n) be a sequence in a metric space X. Then:

(a) (p_n) converges to p ∈ X iff for every neighborhood A of p, there exists an N such that p_n ∈ A for all n ≥ N. (The latter is sometimes taken as an alternative definition of convergence. Note that it implies that convergence is a topological property, and therefore that convergence is identical under topologically equivalent metrics.)

(b) If p, p′ ∈ X and both p_n → p and p_n → p′, then p = p′. (If a sequence converges, its limit is unique.)

(c) If (p_n) converges, then (p_n) is bounded.

(d) If A ⊂ X and p is a limit point of A, then there exists some sequence (p_n) consisting of points in A such that p_n → p.

Proof.

(a) This proof is mainly a matter of stating the definitions. Let A be any neighborhood of p. By definition, there is an open set C ⊂ A containing p, and some ε > 0 such that B_ε(p) ⊂ C ⊂ A. If p_n → p, then there exists some N such that for all n ≥ N, p_n ∈ B_ε(p) ⊂ A, as desired.

Conversely, suppose that for any neighborhood A of p, there exists some N such that for all n ≥ N, p_n ∈ A. Then for any ε > 0, let A = B_ε(p) be our neighborhood. (This is a valid choice because B_ε(p) is open, by Proposition 1.10.) We then have an N such that for all n ≥ N, p_n ∈ B_ε(p) ⇐⇒ d(p_n, p) < ε, as desired.

(b) For arbitrary ε > 0, there is some N such that d(p_n, p) < ε for all n ≥ N, and some N′ such that d(p_n, p′) < ε for all n ≥ N′. Take N′′ = max(N, N′). Then we have d(p, p′) ≤ d(p, p_{N′′}) + d(p_{N′′}, p′) < 2ε. We can choose ε to be arbitrarily small, and therefore the only possible value for d(p, p′) is 0. We conclude that p = p′.

(c) If p_n → p, then there exists some N such that for all n ≥ N, d(p_n, p) < 1. Now define M = max_{n∈{1,…,N−1}} d(p_n, p_N) (taking M = 0 if N = 1). Then for all n ∈ {1, …, N − 1}, we have d(p_n, p) ≤ d(p_n, p_N) + d(p_N, p) < M + 1, and also for all n ≥ N we have d(p_n, p) < 1 ≤ M + 1. Every term therefore lies within M + 1 of p, so by the triangle inequality d(p_n, p_m) < 2(M + 1) for all n and m. We conclude that (p_n) is bounded.

(d) Since p is a limit point of A, for any n there exists some p_n ∈ A such that d(p_n, p) < 1/n. Let (p_n) be our sequence. By construction, it converges to p.

Definition 1.23. For any sequence (p_n), consider a strictly increasing sequence of natural numbers (n_k). Then the sequence (p_{n_k}) is called a subsequence of (p_n).


Examples.

• In X = R with the standard metric d(x, y) = |x − y|, the sequence p_n = 1/n has limit 0. The sequence p_n = n does not converge, and neither does the sequence p_n = sin n. The latter demonstrates that although all convergent sequences are bounded, not all bounded sequences are convergent.

• In the space of functions f : R → [−1, 1] with the metric d(f, g) = sup_x |f(x) − g(x)|, the sequence of functions f_n(x) = 1/n converges to the function f(x) = 0. The sequence of functions

f_n(x) = 0 if x ≤ n, and f_n(x) = 1 if x > n

does not converge under this metric, even though for each individual point x, f_n(x) → 0.

• For any X, if we use the discrete metric (defined by d(x, y) = 0 if x = y and d(x, y) = 1 if x ≠ y), then p_n → p iff there is some N such that p_n = p for all n ≥ N.

Bounded sequences in R (and R^k) have an especially nice property.

Proposition 1.24 (Bolzano-Weierstrass). Every bounded sequence (x_n) in R contains a convergent subsequence.⁴

Proof. Since the sequence is bounded, there exists some M_0 such that for any pair of elements x_n, x_m in the sequence, |x_n − x_m| < M_0. Let M = M_0 + |x_1|; then for any x_n in the sequence, |x_n| ≤ |x_n − x_1| + |x_1| < M_0 + |x_1| = M. Thus the sequence is contained in the interval [−M, M].

Split [−M, M] into two equally sized intervals [−M, 0] and [0, M]. Since the sequence has infinitely many terms in [−M, M], it must have infinitely many terms in at least one of these two smaller intervals. Subdivide that interval into two more equally sized intervals and repeat the same logic. Continuing this process, we will obtain a sequence of nested closed intervals A_m = [a_m, b_m] (giving our initial interval the index m = 0), each of which contains infinitely many terms, where a_m ≤ a_{m+1} < b_{m+1} ≤ b_m (i.e. A_{m+1} ⊂ A_m), and each interval is half the size of the previous one: b_m − a_m = 2^{1−m} M.

We argue that sup_m a_m = inf_m b_m. Suppose otherwise. First, if sup_m a_m < inf_m b_m, then we have b_k − a_k ≥ inf_m b_m − sup_m a_m > 0 for all k, contradicting the fact that b_k − a_k = 2^{1−k} M → 0 as k → ∞. Second, if sup_m a_m > inf_m b_m, then by definition y = (sup_m a_m + inf_m b_m)/2 cannot be either an upper bound for (a_m) or a lower bound for (b_m), so that there exists some a_{k_1} > y and some b_{k_2} < y. But by construction, it is impossible to have a_{k_1} > b_{k_2} for any k_1, k_2. We conclude that sup_m a_m = inf_m b_m, and we denote this common value by x. Note that x ∈ [a_m, b_m] for all m.

Now let us construct a subsequence (x_{k_m}) of (x_n) according to the following procedure. For each [a_m, b_m], let us pick some x_{k_m} ∈ [a_m, b_m] such that k_m > k_1, …, k_{m−1}. (This is always possible, since by construction there are infinitely many members of (x_n) in each interval [a_m, b_m].) Since x and x_{k_m} both lie in [a_m, b_m], we have |x_{k_m} − x| ≤ b_m − a_m = 2^{1−m} M.

Therefore, for any ε > 0, there is some N such that |x_{k_m} − x| ≤ 2^{1−m} M < ε for all m ≥ N. We conclude that x_{k_m} → x, as desired.

⁴Here R is given the standard metric d(x, y) = |x − y|.

Proposition 1.25 (Bolzano-Weierstrass for R^k). Every bounded sequence (x_n) in R^k contains a convergent subsequence.

Proof. Since the sequence is bounded under a suitable metric⁵ in R^k, one can show that the sequence formed by each coordinate i = 1, …, k of the elements x_n is bounded under the usual metric in R. Now take the sequence of first coordinates (x_{n,1}). By Proposition 1.24, this sequence has a convergent subsequence. Let us take the subsequence of (x_n) with only the indices of this convergent subsequence, and relabel it simply (x_n). Now take the sequence of second coordinates (x_{n,2}). Again, this has a convergent subsequence. Take the corresponding subsequence of (x_n), which is a subsubsequence of the original sequence. Continue for the third coordinate, and so on, for i = 3, …, k. By the end of this process, we will have a subsubsub...subsequence of the original sequence, such that each coordinate converges to some value, and therefore the full sequence (x_n) converges to some x. But, of course, a subsubsub...subsequence is also simply a subsequence, meaning that we have found the desired subsequence.

The above proof may be confusing, and for concreteness we offer the following example. Suppose that we have some bounded sequence (x_n) in R^2. The sequence of first coordinates x_{1,1}, x_{2,1}, x_{3,1}, … is bounded, and therefore has some convergent subsequence. Suppose that this subsequence happens to be (x_{2,1}, x_{4,1}, x_{6,1}, …). Let’s take this as a subsequence of the entire original sequence (x_n), so that we have (x_2, x_4, x_6, …). Now let’s consider the sequence of second coordinates of this subsequence, (x_{2,2}, x_{4,2}, x_{6,2}, …). Since this sequence is bounded, it too has some convergent subsequence. Suppose that this subsequence happens to be (x_{4,2}, x_{8,2}, x_{12,2}, …). Again we take this as a subsequence of the entire sequence, so that we have (x_4, x_8, x_12, …). Now, by construction, both the first and second coordinates of this subsequence converge, and therefore the subsequence itself converges. We have found a convergent subsequence of (x_n), and may pat ourselves on the back.

This trick (repeatedly taking subsequences of subsequences to obtain a subsequence with nice properties) is very common in analysis.

⁵Here the metric can be any of the three metrics mentioned in the previous section: the Euclidean metric d_E, the Manhattan (taxicab) metric d_T, or the max metric d_M. This is not completely general, however: there could be other metrics topologically equivalent to these three where this theorem is not true. Indeed, Proposition 1.18 tells us that we can define a metric where the entire space R^k is bounded, and certainly we cannot always find a convergent subsequence of a sequence that is free to roam around all of R^k.

1.6 Completeness

Is it possible, in general, to show that a sequence converges even if we can’t identify the specific value to which it converges? In other words, can we sometimes show nonconstructively that a sequence converges?

One possibility is the following. Suppose that we do know that the values of a sequence (x_n) eventually become very close to each other, as defined in the following manner.

Definition 1.26 (Cauchy sequence). A sequence (x_n) is a Cauchy sequence if for any ε > 0, there exists some N such that for any n, m ≥ N, d(x_n, x_m) < ε.

If (x_n) is a Cauchy sequence, can we show that (x_n) converges to some x? This seems intuitive: if the subsequent values of x_n eventually get very close to each other as in Definition 1.26, it is only natural to suppose that they converge on some specific value. Yet this is not always true. If X = (0, 1) and x_n = 1/(n + 1), for instance, then (x_n) is a Cauchy sequence but does not converge to any value in X.⁶

Proposition 1.27. All convergent sequences are Cauchy sequences, but not vice versa.

If in some cases Cauchy sequences do necessarily converge, however, it would be very useful to know. This motivates us to define the notion of completeness.

Definition 1.28 (Completeness). A metric space X is complete if all Cauchy sequences inX converge.

What metric spaces are complete? The most important example is R, or more generally R^k. Proofs of completeness for many other metric spaces build upon the basic fact that R^k is complete. This fact, in turn, can be proven using the Bolzano-Weierstrass theorem (Proposition 1.25) in the last section. First we prove two useful facts about Cauchy sequences.

Proposition 1.29. The elements of a Cauchy sequence form a bounded set.

Proof. Let (x_n) be a Cauchy sequence. By definition, there exists some N such that for all n, m ≥ N, d(x_n, x_m) < 1. Define M_0 = max_{n=1,…,N−1} d(x_n, x_N) (with M_0 = 0 if N = 1). Then d(x_n, x_N) < M_0 + 1 for every n, so for any n and m the triangle inequality gives d(x_n, x_m) ≤ d(x_n, x_N) + d(x_N, x_m) < 2(M_0 + 1). Writing M = 2(M_0 + 1), we see that the set {x_n} is bounded.

Proposition 1.30. Let (x_n) be a Cauchy sequence, and suppose it has a convergent subsequence (x_{n_m}) with limit x. Then (x_n) converges to x as well.

Proof. Take any ε > 0. Since (x_n) is a Cauchy sequence, there exists some N such that for all m, n ≥ N, d(x_m, x_n) < ε/2. Moreover, by the convergence of (x_{n_m}), there exists some M_0 such that for all m ≥ M_0, d(x_{n_m}, x) < ε/2.

Let us define M such that M ≥ M_0 and n_M ≥ N. Then for all n ≥ n_M, we may write:

d(x_n, x) ≤ d(x_n, x_{n_M}) + d(x_{n_M}, x)
< ε/2 + ε/2
= ε

and therefore x_n → x, as desired.

Now the main result is easy.

Proposition 1.31. R^k is complete.

Proof. Let (x_n) be a Cauchy sequence in R^k. By Proposition 1.29, we know that (x_n) is bounded. Applying Bolzano-Weierstrass (Proposition 1.25), we infer that (x_n) has a convergent subsequence. It follows from Proposition 1.30 that (x_n) converges.

⁶If X were a subset of the larger space Y = R, then (x_n) would converge to 0, which is in Y. But we are considering an example where the entire space is just X.

There is an important relationship between the properties of completeness and closedness.

Proposition 1.32. Let X be a metric space. Then:

(a) If X is complete and Y ⊂ X is closed, then the subspace Y is also complete.

(b) If Y is a complete subspace of X, then Y must be closed.

Proof.

(a) Let (x_n) be any Cauchy sequence in Y. By the completeness of X, it converges to some limit x ∈ X. If x_n = x for some n, then x ∈ Y immediately. Otherwise every ball B_r(x) contains some x_n ≠ x, so x is a limit point of Y; since Y is closed, it contains its limit points, and therefore x ∈ Y. Either way (x_n) converges in Y, and we conclude that Y is complete.

(b) Suppose to the contrary that Y is not closed. Then there exists some limit point x ∈ X of Y such that x ∉ Y. By Proposition 1.22(d), there is a sequence (x_n) in Y such that x_n → x. Since it converges in X, (x_n) is a Cauchy sequence (Proposition 1.27), and it lies in Y; but by uniqueness of limits it does not converge to any element of Y, which contradicts the completeness of Y. Therefore Y must be closed.

One important example of a set that is not complete is Q. (In fact, R is sometimes defined as the “completion” of Q.)

Proposition 1.33. Q is not complete.

Proof. There are many possible quick proofs. For instance, we showed earlier that √2 is not an element of Q. But the sequence of decimal approximations (1, 1.4, 1.41, …) (which are rational) converges to √2 in R, implying that Q is not closed in R. Therefore by Proposition 1.32, Q cannot be complete.

We close with a nice application of completeness, which is tremendously useful in economics (particularly in macroeconomics). We will use it later to prove the Inverse Function Theorem, one of the nicest results in calculus.

Proposition 1.34 (Contraction mapping theorem). Let X be a complete metric space, and suppose that f : X → X is such that d(f(x), f(y)) ≤ α · d(x, y) for all x, y ∈ X, where 0 < α < 1. (We say that such an f is a contraction mapping.) Then f has a unique fixed point, i.e. a unique point x ∈ X such that f(x) = x.


Proof. To show the existence of a fixed point, let us start with any x_1 ∈ X. We will define a sequence with first element x_1 and subsequent elements defined inductively by x_{n+1} = f(x_n). (So that x_2 = f(x_1), x_3 = f(f(x_1)), etc.)

Note that d(x_n, x_{n+1}) = d(f(x_{n−1}), f(x_n)) ≤ α d(x_{n−1}, x_n). If we denote d(x_1, x_2) by c, we have d(x_n, x_{n+1}) ≤ α^{n−1} c. In fact, for n < m we have:

d(x_n, x_m) ≤ d(x_n, x_{n+1}) + d(x_{n+1}, x_{n+2}) + … + d(x_{m−1}, x_m) (1.4)

≤ α^{n−1} c + α^n c + … + α^{m−2} c (1.5)

≤ α^{n−1} (1 + α + α^2 + …) c (1.6)

= (α^{n−1} / (1 − α)) c (1.7)

Since α < 1, α^{n−1}/(1 − α) → 0 as n → ∞. Therefore, for any ε > 0, by picking a sufficiently high N we can ensure that for N ≤ n < m, d(x_n, x_m) < ε. We conclude that (x_n) is a Cauchy sequence. It follows from the completeness of X that (x_n) converges to some x ∈ X.

We will now argue that f (x) = x. For any ε > 0, there exists some N such thatd(xn, x) < ε and d(xn, xn+1) < ε for all n ≥ N. This implies:

d(x, f (x)) ≤ d(x, xn) + d(xn, xn+1) + d(xn+1, f (x)) (1.8)≤ d(x, xn) + d(xn, xn+1) + α · d(xn, x) (1.9)< (2 + α)ε (1.10)

Since ε can be made arbitrarily small, we must have d(x, f(x)) = 0, which implies f(x) = x.

Finally, we argue that this is the unique fixed point of f. Suppose to the contrary that for some y ≠ x, we have f(y) = y. Then d(x, y) = d(f(x), f(y)) ≤ α · d(x, y), which is impossible since α < 1 and d(x, y) > 0.
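The existence argument is constructive: iterating f from any starting point converges to the fixed point. Below is a minimal sketch of that iteration, assuming the example map f = cos on the complete space [0, 1] (an illustration chosen here, not taken from the lecture). The helper name `fixed_point_iteration` and the tolerance are hypothetical, and the contraction constant sin(1) relies on the mean value theorem, which these notes have not yet introduced.

```python
import math

def fixed_point_iteration(f, x1, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = f(x_n) until successive terms are within tol."""
    x = x1
    for n in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter steps")

# Assumption: cos maps [0, 1] into itself, and |cos'(x)| = |sin x| <= sin(1) < 1
# on that interval, so cos is a contraction there with alpha = sin(1).
alpha = math.sin(1.0)
fixed, steps = fixed_point_iteration(math.cos, x1=1.0)
print(fixed, steps, abs(math.cos(fixed) - fixed))
```

Starting from a different point in [0, 1] should lead to the same limit, which is what uniqueness of the fixed point predicts.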

1.7 Continuity

What happens when we evaluate a function at each point in a convergent sequence? In particular, if p_n → p, will f(p_n) → f(p) as well? Intuition suggests that this is true for sufficiently “nice” functions. For instance, if p_n → p, then p_n^2 → p^2, |p_n| → |p|, and √p_n → √p.

Yet for some “bad” functions, it does not hold. Consider the function f : R → R defined such that f(x) = 0 for all x ≤ 0 and f(x) = 1 for all x > 0. Then 1/n → 0, but f(1/n) ↛ f(0), since f(1/n) = 1 for all n but f(0) = 0. Intuitively, this is because the function f “jumps” from 0 to 1; points x_1 and x_2 that are arbitrarily close to each other nevertheless produce values f(x_1) and f(x_2) that differ by 1.
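The jump can be seen numerically. This is only an illustrative sketch of the step function described above; the function name `step` is a label chosen here.

```python
def step(x: float) -> int:
    """The 'bad' function: 0 for x <= 0 and 1 for x > 0."""
    return 0 if x <= 0 else 1

inputs = [1 / n for n in range(1, 8)]   # the sequence 1/n, which tends to 0
values = [step(x) for x in inputs]      # f(1/n) for each n

# Every f(1/n) equals 1, while f(0) equals 0.
print(values, step(0))
```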

How can we rule out such jumps, and formally capture the difference between “good” functions and “bad” ones? Our answer will be the notion of continuity, but first we must define the limit of a function.


Definition 1.35. Let f : X → Y be a function from a metric space (X, d_X) to a metric space (Y, d_Y). Suppose further that p is a limit point of X. Then we say that the limit of the function f(x) as x → p is q if for every ε > 0, there exists a δ > 0 such that

\[ d_Y(f(x), q) < \varepsilon \tag{1.11} \]

for all x ∈ X such that 0 < d_X(x, p) < δ. In this case, we write that f(x) → q as x → p, or

\[ \lim_{x \to p} f(x) = q \tag{1.12} \]

Limits of functions can be equivalently defined in a number of other ways. For instance, there is a close relationship to the concept of the limit of a sequence.

Proposition 1.36. Suppose the conditions in the first two sentences of Definition 1.35 hold. The following are equivalent:

• lim_{x→p} f(x) = q.

• For every sequence (p_n) in X such that p_n ≠ p and lim_{n→∞} p_n = p, lim_{n→∞} f(p_n) = q.

Proof. Suppose the first statement is true. Then for any ε > 0, there exists a δ > 0 such that d_Y(f(x), q) < ε for all x ∈ X such that 0 < d_X(x, p) < δ. Moreover, since lim_{n→∞} p_n = p and p_n ≠ p, there exists an N such that for all n ≥ N, 0 < d_X(p_n, p) < δ. Therefore, for all n ≥ N, we also have d_Y(f(p_n), q) < ε. We conclude that f(p_n) → q.

Conversely, suppose the second statement is true, and suppose the first statement is not true. (We will show by contradiction that it must be true.) Then there exists some ε > 0 such that for all δ > 0, there is some y ∈ X where 0 < d_X(y, p) < δ yet d_Y(f(y), q) ≥ ε. Let us define δ_n = 1/n, and for each δ_n find the corresponding y with this property, labeling it y_n. By construction, y_n ≠ p and y_n → p, but the sequence (f(y_n)) cannot converge to q, since d_Y(f(y_n), q) ≥ ε for all n. Yet the second statement implies that (f(y_n)) must converge to q, which is a contradiction. We conclude that the first statement must be true as well.

Note that the equivalence in Proposition 1.36 implies that the uniqueness result for limits of sequences in Proposition 1.22 carries over to limits of functions.

Proposition 1.37. If lim_{x→p} f(x) = q and lim_{x→p} f(x) = q′, then q = q′. (In other words, limits of functions must be unique.)

Now we are in a good position to define continuity. As desired, it will rule out situations where p_n → p yet f(p_n) ↛ f(p).

Definition 1.38. A function f : X → Y from a metric space (X, d_X) to a metric space (Y, d_Y) is continuous at p ∈ X if for any ε > 0, there exists a δ > 0 such that for all p′ ∈ X where d_X(p, p′) < δ, we also have d_Y(f(p), f(p′)) < ε. When the function f is continuous at all p ∈ X, we say simply that f is continuous.

Proposition 1.39. Let f be a function from a metric space (X, d_X) to a metric space (Y, d_Y), where p ∈ X. Then (a) and (b) below are equivalent. Further, if p is a limit point of X, (a), (b), and (c) are equivalent.⁷

⁷ Otherwise, as we note from Definition 1.35, the concept of function limit is not defined.


(a) f is continuous at p.

(b) For any sequence (p_n) in X converging to p, (f(p_n)) converges to f(p).

(c) lim_{x→p} f(x) = f(p).

Proof. This follows from Definition 1.35 and Proposition 1.36.

Note that since continuity can be expressed as a statement about convergence, which we already showed to be a topological property, continuity (both for f at a particular point p and for the function f in general) is also a topological property.

Indeed, one of the nicest features of continuity for a function is that it is equivalent to a statement about the inverse images of open sets.

Proposition 1.40 (Open set characterization of continuity). f : X → Y is continuous iff for any open set V in Y, f^{−1}(V) is open in X.

Proof. Suppose that f : X → Y is continuous. Then for any open set V in Y and point p ∈ f^{−1}(V), where q = f(p) ∈ V, there exists some ε > 0 such that B_ε(q; d_Y) ⊂ V. By the definition of continuity, there exists some δ > 0 such that B_δ(p; d_X) ⊂ f^{−1}(B_ε(q; d_Y)) ⊂ f^{−1}(V). Thus, any such p is an interior point of f^{−1}(V), and f^{−1}(V) is open.

Conversely, suppose that for any open V in Y, f^{−1}(V) is open in X. For any ε > 0 and p ∈ X, consider V = B_ε(f(p); d_Y). Then f^{−1}(V) is open, implying that p is an interior point of f^{−1}(V), i.e. that there exists some δ > 0 such that p ∈ B_δ(p; d_X) ⊂ f^{−1}(B_ε(f(p); d_Y)). This implies that f(B_δ(p; d_X)) ⊂ B_ε(f(p); d_Y), which for arbitrary ε is equivalent to the definition of continuity.

Corollary 1.41. f : X → Y is continuous iff for any closed set V in Y, f−1(V) is closed in X.

Proof. Apply Theorem 1.11 to Proposition 1.40.

These characterizations of continuity are useful, but it would still be extremely tedious to apply them directly to show that a complicated function is continuous. How can we simplify the process? One nice result, which allows us to build up continuous functions from the composition of other continuous functions, is the following.

Proposition 1.42 (Composition of continuous functions). Let X, Y, Z be metric spaces, and let f : X → Y and g : Y → Z be functions. If f is continuous at some p ∈ X and g is continuous at f(p) ∈ Y, then g ∘ f is continuous at p ∈ X.

If f and g are continuous functions, then g ◦ f is a continuous function.

Proof. Continuity of g at f(p) implies that for all ε > 0, there exists an η > 0 such that for all q such that d_Y(q, f(p)) < η, d_Z(g(q), (g ∘ f)(p)) < ε. Continuity of f at p implies that there exists a δ > 0 such that for all p′ such that d_X(p′, p) < δ, d_Y(f(p′), f(p)) < η.

Combining these results, we find that for all p′ such that d_X(p′, p) < δ, d_Z((g ∘ f)(p′), (g ∘ f)(p)) < ε. Since this holds for some δ for arbitrary ε, we have demonstrated continuity of g ∘ f at p.

The second statement of the proposition (for continuous functions) follows immediately from the first statement of the proposition, which can be used to show that if f is continuous at every point of X and g is continuous at every point of Y, then g ∘ f is continuous at every point p ∈ X (i.e. g ∘ f is continuous).

Alternatively, the second statement follows from Proposition 1.40. If f and g are continuous, then for any open V ⊂ Z, (g ∘ f)^{−1}(V) = f^{−1}(g^{−1}(V)) is open as well.

Even with this proposition in hand, we need to know that some basic functions are continuous before we can assemble more complicated continuous functions. Two exceedingly basic functions are the identity and the constant function.

Proposition 1.43.

(a) f : X → X defined by f (x) = x is continuous.

(b) f : X → Y defined by f (x) = c for some constant c ∈ Y is continuous.

Proof.

(a) Given ε > 0, take δ = ε: since f(x) = x, d(f(x), f(x′)) = d(x, x′) < ε whenever d(x, x′) < δ.

(b) Given ε > 0, d(f(x), f(x′)) = d(c, c) = 0 < ε for any values of x and x′, so any δ > 0 works.

Now, if (for instance) we want to show that functions from X to R are continuous, we need to know that the basic arithmetic operations are continuous. This is mildly tedious to verify, but we have:

Proposition 1.44. Let f and g be continuous functions from X to R. Then f + g, f g, and f/g are continuous on X. (The latter only if g(x) ≠ 0 for all x ∈ X.)

Now we can verify that some functions are indeed continuous. For instance, we can show that all polynomials p : R → R are continuous.

Proposition 1.45. Let p : R → R be a polynomial: p(x) = x^n + a_1 x^{n−1} + . . . + a_n. Then p is continuous.

Proof. First we prove that x^k is continuous for any nonnegative integer k. We do so by induction. This is true for the base case k = 0, where x^k takes the constant value 1, by (b) of Proposition 1.43. Furthermore, if it is true for x^{k−1}, then it is true for x^k = x^{k−1} · x by (a) of Proposition 1.43 (which shows that x is continuous) and then Proposition 1.44. Thus it is true for all nonnegative integers k.

Now, a x^k is continuous for any constant a and nonnegative integer k by (b) of Proposition 1.43 and Proposition 1.44. Finally, p(x) = x^n + a_1 x^{n−1} + . . . + a_n is continuous by repeated application of Proposition 1.44.
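To see why building continuity up from simple pieces is preferable, it may help to compare with a direct verification from Definition 1.38. The following worked example is an illustration added here, not part of the original proof: it checks continuity of f(x) = x^2 at an arbitrary point p ∈ R, with the particular choice of δ being one convenient option among many.

```latex
% Direct epsilon-delta check that f(x) = x^2 is continuous at p.
% Given \varepsilon > 0, choose
%   \delta = \min\left(1, \frac{\varepsilon}{1 + 2|p|}\right).
% Then for |x - p| < \delta:
\begin{align*}
|x^2 - p^2| &= |x - p| \, |x + p| \\
            &\le |x - p| \, \bigl(|x - p| + 2|p|\bigr) \\
            &< \delta \, (1 + 2|p|) \\
            &\le \varepsilon .
\end{align*}
```

Even for this simple polynomial the choice of δ depends on p, and higher powers quickly become messier, which is why the inductive argument via Propositions 1.43 and 1.44 is more convenient.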


1.8 Connectedness

We observed earlier that in some cases the existence of a solution to an equation is “obvious”. For instance, if f(x) = x^n, where n ∈ N, and y > 0, then it seems clear that there should be some solution to f(x) = y. After all, f(0) = 0 < y, and f(y + 1) > y, so somewhere between 0 and y + 1 we expect to see a solution f(x) = y. Since f is continuous, it can’t jump past y!

Yet even with the notion of continuity in hand, we cannot quite transform this intuition into an actual proof. To do that, we need the concept of connectedness.

The key property that allows us to infer the existence of a solution to f(x) = y is that the set A = [0, y + 1] is connected. Intuitively speaking, A has no gaps that separate it into two or more separate pieces. If it did, we could not be confident that a solution existed in A. For instance, if n = 2 and y = 4, then f(0) = 0 < y and f(5) = 25 > y, and it is reasonable to expect a solution x ∈ [0, 5]. (Indeed, such a solution is x = 2.) But it would not be reasonable to expect a solution x ∈ [0, 1] ∪ [4, 5], because although f attains values below and above y for x in this set, it might only attain y itself in the “gap” (1, 4).

We translate this discussion into a formal concept by defining what it means for a metric space to be connected.

Definition 1.46 (Connected space). A metric space X is connected if it is not the union of two disjoint nonempty open subsets.

Suppose we want to define connectedness for a subset E of a metric space X, rather than a metric space itself. Then we say the following.

Definition 1.47 (Connected subset). A subset E of a metric space X is connected if it cannot be written as the union of two nonempty subsets A and B that are separated. A and B are separated if A ∩ B̄ and Ā ∩ B are both empty, where Ā and B̄ denote the closures of A and B in X.

In fact, the latter definition is really a restatement of the former. We can therefore speak of sets as being inherently connected or not connected (given a certain metric), regardless of the metric space in which they are situated.

Proposition 1.48. Let E be a subset of a metric space X. Then E is connected in the sense of Definition 1.46 when viewed as its own metric space iff E is connected in the sense of Definition 1.47 when viewed as a subset of X.

Proof. Suppose E is not connected under Definition 1.46. Then it can be written as the union of two disjoint nonempty subsets C and D, which are open relative to E. Since C ⊂ E \ D and D ⊂ E \ C, and E \ D and E \ C are closed relative to E, we know that C̄ ∩ E ⊂ E \ D and D̄ ∩ E ⊂ E \ C.⁸ Therefore (C̄ ∩ E) ∩ D and C ∩ (D̄ ∩ E) are both empty, which since C, D ⊂ E is equivalent to simply C̄ ∩ D and C ∩ D̄ being empty. Thus C and D are separated, and E is not connected under Definition 1.47.

Conversely, suppose that E is not connected under Definition 1.47. Then there exist nonempty A and B with A ∪ B = E such that A ∩ B̄ and Ā ∩ B are empty. This implies Ā ∩ E = A and B̄ ∩ E = B. Theorem 1.13 now implies that A and B are both closed relative to E. Since A ∪ B = E and A ∩ B = ∅, this is equivalent to saying that A and B are both open relative to E, and this means that E is not connected under Definition 1.46.

⁸ This step deserves a little explanation. Since C ⊂ E \ D and D ⊂ E \ C, and E \ D and E \ C are closed relative to E, we can conclude that C̄ ⊂ E \ D and D̄ ⊂ E \ C, where closures are taken relative to E. But the closures C̄ and D̄ taken relative to the entire metric space X may add some limit points of E that are not in E. We recover the closures relative to E from the closures relative to X by taking the intersection with E.

As defined above, the concept of connectedness probably seems abstract and mysterious. How does a definition stated only in terms of open and closed sets relate to our intuition of connectedness? This is not easy to answer, but we can obtain some intuition by seeing which subsets of R are connected.

Proposition 1.49. Let E be a subset of R. E is connected iff for any x, y ∈ E and z ∈ R such that x < z < y, z ∈ E as well.

Proof. Suppose that there exist x, y ∈ E and z ∉ E with x < z < y. Then if we take A = E ∩ (−∞, z) and B = E ∩ (z, ∞), A and B are disjoint, nonempty (x ∈ A and y ∈ B), have union E, and are both open in E by Theorem 1.13, so E is not connected by Definition 1.46.

Suppose conversely that E is not connected. Then there exist nonempty A, B ⊂ R such that A ∪ B = E, Ā ∩ B = ∅ and A ∩ B̄ = ∅. Choose some x ∈ A and y ∈ B such that (without loss of generality) x < y. Now let z = sup(A ∩ [x, y]). Then z ∈ Ā, which implies that z ∉ B. In particular z ≠ y, so x ≤ z < y.

If z ∉ A, then we have z ∉ E and x < z < y, and we are done.

If z ∈ A, then z ∉ B̄, which means that there exists some z′ ∉ B such that z < z′ < y. By construction of z, z′ ∈ (z, y] ⇒ z′ ∉ A. Then x < z′ < y and z′ ∉ E, and we are done.

Corollary 1.50. Let E be a subset of R containing more than one point. E is connected iff it is an interval taking one of 9 possible forms (where a, b ∈ R and a < b):

• [a, b], (a, b), [a, b), or (a, b].

• [a, ∞), (a, ∞), (−∞, a), or (−∞, a].

• (−∞, ∞) (i.e. R)

Proof. Given Proposition 1.49, we must prove that if x, y ∈ E and x < z < y imply z ∈ E, then E must take one of these forms. We consider four cases.

Case 1. E does not have a lower bound or an upper bound. In this case, for any z ∈ R, there must exist x ∈ E sufficiently small that x < z and y ∈ E sufficiently large that y > z, so that x < z < y and therefore z ∈ E. The only possibility, then, is that E = (−∞, ∞) = R.

Case 2. E does not have a lower bound, but it has an upper bound. In this case, let a = sup E. For any z < a, there exists some y ∈ E such that z < y ≤ a (otherwise z would be an upper bound of E too). Further, since E has no lower bound there exists some x such that x < z and x ∈ E. Thus z < a ⇒ z ∈ E. It is possible that either a ∈ E or a ∉ E, and these two cases correspond to (−∞, a] and (−∞, a), respectively.

Case 3. E does not have an upper bound, but it has a lower bound. The logic in this case is the same as in case 2, and the two possibilities are [a, ∞) and (a, ∞).


Case 4. E has both an upper bound and a lower bound. Let a = inf E and b = sup E. For any a < z < b, there must exist some x, y ∈ E such that a ≤ x < z and z < y ≤ b. Thus a < z < b ⇒ z ∈ E. It is possible that either a ∈ E or a ∉ E, and similarly it is possible that either b ∈ E or b ∉ E, leaving us with the four possibilities [a, b], (a, b), [a, b), and (a, b].

The converse, that if E takes one of these forms then x, y ∈ E and x < z < y imply z ∈ E, is clear enough that we will not prove it.

It is reassuring that connectedness has such a simple interpretation in R. In fact, this will be very useful to us in the future, even when we are dealing with more complicated metric spaces, because often we look at functions from such spaces to R.

We can get some more intuition by looking at when sets in R^2 are connected. In particular, we look at when the sets resulting from various unions of open or closed balls are connected.

[Figure: four panels showing unions of balls in R^2, labeled “Connected”, “Not Connected”, “Connected”, “Not Connected”.]

The following proposition relates continuity of functions and connectedness in an especially useful way.

Proposition 1.51. If f : X → Y is a continuous function and A ⊂ X is connected, then f(A) is connected.


Proof. Let us define B = f(A), and view f as a function from A to B. We describe openness and closedness relative to A or B, and ignore the spaces X and Y.

Now, B is not connected if it is the union of two disjoint nonempty open subsets C_1 and C_2. But in this case, B = C_1 ∪ C_2 ⇒ A = f^{−1}(C_1) ∪ f^{−1}(C_2), and C_1 ∩ C_2 = ∅ ⇒ f^{−1}(C_1) ∩ f^{−1}(C_2) = ∅. Furthermore, since f is continuous, f^{−1}(C_1) and f^{−1}(C_2) are both open, and since C_1 and C_2 are nonempty subsets of B = f(A), both preimages are nonempty. Thus if B is not connected, A is also not connected, because it is the union of disjoint nonempty open sets f^{−1}(C_1) and f^{−1}(C_2).

We conclude that the contrapositive is true: if A is connected, B is connected.

We are now ready to prove the result that motivated our definition of connectedness.

Proposition 1.52. Suppose that X is connected, f : X → R is continuous, and for some x, y ∈ X, f(x) = a and f(y) = b where a < b. Then for any c ∈ (a, b), there exists some z ∈ X such that f(z) = c.

Proof. Since X is connected, Proposition 1.51 implies that f(X) ⊂ R is connected. Then Proposition 1.49 implies that for any a ∈ f(X), b ∈ f(X), c ∈ R where a < c < b, c ∈ f(X) as well; that is, there exists some z ∈ X such that f(z) = c.

As desired, this allows us to prove the existence of a solution to xn = y quite easily.

Corollary 1.53. For any n ∈ N and y ∈ R where y > 0, the equation f(x) = x^n = y has a solution x ∈ R.

Proof. f is continuous, R is connected, and f(0) < y < f(y + 1). It follows immediately from Proposition 1.52 above that there exists some x ∈ R such that f(x) = y.
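The corollary only asserts existence, but the bracketing f(0) < y < f(y + 1) used in the proof also suggests a way to approximate a solution. The following is a sketch using bisection, a standard numerical technique that is not part of the lecture; the function name `nth_root` and the tolerance are hypothetical choices, and the sketch ignores floating-point overflow for very large y or n.

```python
def nth_root(y: float, n: int, tol: float = 1e-12) -> float:
    """Approximate a solution of x**n = y on [0, y + 1] by bisection."""
    if y <= 0 or n < 1:
        raise ValueError("need y > 0 and n >= 1")
    lo, hi = 0.0, y + 1.0              # f(lo) = 0 < y < f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid == lo or mid == hi:     # interval cannot shrink further in floats
            break
        if mid ** n < y:
            lo = mid                   # invariant: f(lo) < y
        else:
            hi = mid                   # invariant: f(hi) >= y
    return (lo + hi) / 2

root = nth_root(4.0, 2)
print(root)
```

At every step the solution stays trapped between `lo` and `hi`, which mirrors the intuition that a continuous f cannot jump past y on a connected interval.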

Suppose that we want to show that some space X other than R is connected. How could we go about proving it? One way is to show that the space X is the image, under a continuous function, of another space we already know to be connected; then we can apply Proposition 1.51. Another way is to use the concept of path connectedness.

Definition 1.54. Let X be a metric space. We say that X is path connected if for any two points x, y ∈ X, there exists some continuous function (a “path”) f from [0, 1] to X such that f(0) = x and f(1) = y.

In other words, X is path connected if we can draw a continuous path between any two points.

Proposition 1.55. If X is path connected, it is connected. The converse is not true: there exist connected sets that are not path connected.

Path connectedness is a stronger property than connectedness, but in some cases it may be easier to prove. At the very least, it is more intuitive: the idea that a space is “connected” if you can draw a path in that space between any two points rings true.

A third way to show that X is connected is to write X as the union of overlapping subsets that we already know to be connected. Formally:

Proposition 1.56. Let X be a metric space. Suppose that {G_α} is a collection of connected sets such that X = ∪_α G_α, and there is some x such that x ∈ G_α for all α. Then X is connected as well.


1.9 Compactness

The concept of connectedness, along with the Intermediate Value Theorem, gives us an easy way to prove the existence of solutions to equations. We might also be interested, however, in proving the existence of maxima and minima⁹ of functions. This is a key issue in economics, where we often seek to solve optimization problems: if we’re seeking the optimum, we want to know that it actually exists! This goal will motivate the notion of compactness, and the Extreme Value Theorem.

First, we can piece together some useful intuition. Suppose that f is a continuous function on R. There is certainly no way that we can guarantee that f achieves a maximum and minimum on R; if f(x) = x, for instance, then f(R) is not bounded. We might hope that if A ⊂ R is a bounded subset of R, then f(A) will be bounded, but this is also not true. If we define f : (0, 1) → R by f(x) = x^{−1}, then (0, 1) is bounded but f((0, 1)) is not; it has no maximum. (And being bounded, of course, is no guarantee that the function has a maximum: if f : R → R is defined by f(x) = arctan x, then f(R) = (−π/2, π/2), and f has no maximum or minimum over its domain.)

There are some sets, however, on which a continuous f always seems to have a maximum and minimum. For instance, take A = [0, 1]. In contrast to the A′ = (0, 1) case, no matter how we draw a continuous f on A, f will inevitably have a maximum and minimum. (Try it!) This is for good reason: A is one of the most basic examples of a compact set, and we will show that any continuous function on a compact set achieves both a maximum and minimum.

We start with the standard definition of compactness in terms of open sets. It will take some time to work our way to the point where we can state and prove the Extreme Value Theorem.

Definition 1.57 (Compact set). For any set E in a metric space X, an open cover of E is a collection {G_α} of open subsets of X such that E ⊂ ∪_α G_α. (The open sets G_α “cover” E.)

Now consider any subset A ⊂ X. We say that A is compact if every open cover of A has a finite subcover. This means that if {G_α} is an open cover of A, then there must be finitely many indices α_1, . . . , α_n such that A ⊂ G_{α_1} ∪ · · · ∪ G_{α_n}.
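A concrete open cover with no finite subcover may help make the definition less abstract. The following worked example is an illustration added here (the cover G_n is a choice made for this example): it shows that the open interval (0, 1) in R is not compact.

```latex
% An open cover of (0,1) with no finite subcover.
\[
G_n = \left(\tfrac{1}{n},\, 1\right), \quad n = 2, 3, 4, \dots
\qquad\Longrightarrow\qquad
\bigcup_{n \ge 2} G_n = (0, 1).
\]
% Any finite subfamily G_{n_1}, \dots, G_{n_k} has union
\[
G_{n_1} \cup \dots \cup G_{n_k} = \left(\tfrac{1}{N},\, 1\right),
\qquad N = \max(n_1, \dots, n_k),
\]
% which misses the point 1/N \in (0,1). Hence no finite subfamily covers (0,1).
```

Each point x ∈ (0, 1) lies in G_n as soon as 1/n < x, so the G_n do cover (0, 1); the failure is only in extracting finitely many of them.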

Like connectedness, one of the nice features of compactness is that the compactness of a set A does not depend on the metric space in which A is situated. In particular, let us say that A is compact relative to X if the requirements of Definition 1.57 are satisfied for the metric space X. Then we have the following formal result.

Proposition 1.58. Suppose that A ⊂ X and A ⊂ Y. Then A is compact relative to X iff A is compact relative to Y.

⁹ By a maximum of a function f, we mean a value c such that f(x) ≤ c for all x in the domain, and such that f(x) = c for some x. The definition of minimum is the same, except that ≤ is changed to ≥.

Proof. First we consider the case where A ⊂ Y ⊂ X. Suppose that A is compact relative to X, and let {G_α} be a collection of sets, open relative to Y, such that A ⊂ ∪_α G_α. By Theorem 1.13, there are corresponding sets H_α, open relative to X, such that G_α = Y ∩ H_α for all α. Since A is compact relative to X, the open cover {H_α} has a finite subcover, and we can write

A ⊂ H_{α_1} ∪ · · · ∪ H_{α_n}

Taking the intersection of Y with both sides of this inclusion, using the fact that A ⊂ Y ⇒ A ∩ Y = A, and also using set distributivity (recall the first lecture), we have:

A ⊂ (H_{α_1} ∩ Y) ∪ · · · ∪ (H_{α_n} ∩ Y)

A ⊂ G_{α_1} ∪ · · · ∪ G_{α_n}

Therefore, an open cover relative to Y, {G_α}, has a finite subcover G_{α_1} ∪ · · · ∪ G_{α_n} as well. We conclude that A is compact relative to Y.

Conversely, suppose that A is compact relative to Y, and let {H_α} be a collection of open subsets of X that covers A. Now write G_α = Y ∩ H_α. Then the G_α are open relative to Y, and form an open cover of A. Compactness of A relative to Y implies the existence of a finite subcover G_{α_1} ∪ · · · ∪ G_{α_n} ⊃ A. Since H_α ⊃ G_α for all α, H_{α_1} ∪ · · · ∪ H_{α_n} ⊃ A as well, and H_{α_1}, . . . , H_{α_n} is a finite subcover of A relative to X. Thus A is compact relative to X as well.

Finally, we use this result to prove the slightly more general fact that if A ⊂ X and A ⊂ Y, then A is compact relative to X iff A is compact relative to Y. Let Z = X ∪ Y. Then since A ⊂ X ⊂ Z, our result shows that A is compact relative to X iff A is compact relative to Z. It shows the same for Y and Z. By transitivity, A is compact relative to X iff A is compact relative to Y.

Also like connectedness, compactness is preserved by continuous functions.

Proposition 1.59. If f : X → Y is a continuous function and A ⊂ X is compact, then f(A) is compact.

Proof. Let {H_α} be an open cover of f(A). Since f is continuous, each f^{−1}(H_α) is open, and therefore {f^{−1}(H_α)} is an open cover of A. Since A is compact, however, there is a finite subcover f^{−1}(H_{α_1}) ∪ · · · ∪ f^{−1}(H_{α_n}) ⊃ A. It follows that H_{α_1} ∪ · · · ∪ H_{α_n} ⊃ f(A), so that there is a finite subcover of our original open cover for f(A) as well. We conclude that f(A) is compact.

All compact sets are closed and bounded.

Proposition 1.60. Let X be a metric space, and let A ⊂ X be compact. Then A is a closed subset of X. Moreover, any closed subset B of A is also compact.

Proposition 1.61. Let A ⊂ X be compact. Then A is a bounded set.

Proof. Fix any x ∈ A (if A is empty there is nothing to prove), and consider the open cover consisting of the open balls {B_n(x)} for all n ∈ N. Compactness implies that this cover has a finite subcover, and since the balls are nested there is therefore some maximal N such that A ⊂ B_N(x). This implies that A is bounded.


Not all closed and bounded sets are compact, however. For instance, remember that I showed that all metrics have a topologically equivalent metric that has a maximum value of 1, under which any set is bounded. For any space X under this metric, X itself is closed and bounded. If closedness and boundedness implied compactness, then all spaces would be compact. This clearly cannot be true.

As an explicit example, consider any infinite metric space X that uses the discrete metric, where d(x, y) = 1 if x ≠ y and d(x, y) = 0 if x = y. X is closed and bounded. Yet it certainly is not compact: if we take the open cover consisting of the open balls B_{1/2}(x) = {x} for all x ∈ X, there is no finite subcover. Compact implies closed and bounded, but not the other way around.

There is an alternate definition of compactness, called sequential compactness, that is arguably more intuitive than regular compactness, and is also equivalent to regular compactness in metric spaces.

Definition 1.62. Let X be a metric space, and let A ⊂ X. Then A is sequentially compact if for any sequence (x_n) of x_n ∈ A, there is a subsequence converging to some x ∈ A.

Proposition 1.63. A space X is sequentially compact iff it is compact.

Proof. We will only prove that compactness implies sequential compactness. The other direction is somewhat more difficult, though not impossibly so.

Suppose that X is compact. Take any sequence (x_n) in X. If (x_n) has a limit point in X, then by Proposition 1.22 it has a subsequence that converges in X, and we are done. If, on the other hand, it does not have a limit point in X, then the subset {x_n} is closed, and itself compact by Proposition 1.60. For each x_n, pick some r_n such that B_{r_n}(x_n) does not contain any x_m in the sequence other than x_n; this is possible since no point is a limit point. Then the open cover of {x_n} consisting of the open balls B_{r_n}(x_n) does not have any finite subcover, because each open ball covers exactly one x_n in the set. This contradicts the compactness of {x_n}.

Corollary 1.64. If a metric space X is compact, it is complete.

Proof. Suppose we have any Cauchy sequence in X. Then by sequential compactness of X, the Cauchy sequence has a convergent subsequence. It follows from Proposition 1.30 that the Cauchy sequence itself converges.

A few minutes ago we discussed how even though compactness implies closedness and boundedness, the converse is not generally true. We now provide one incredibly important caveat to this discussion. In R^k, under the standard metric, subsets are compact iff they are closed and bounded. This result makes compactness a very useful concept in real analysis.

Theorem 1.65 (Heine-Borel theorem). Subsets of Rk are compact iff they are closed and bounded.

Proof. We already know that compactness implies closedness and boundedness; we must prove the other direction. If A ⊂ R^k is bounded, then by the Bolzano-Weierstrass Theorem (Theorem 1.25), any sequence (x_n) in A has a convergent subsequence. Since A is closed, it contains its limit points, so this convergent subsequence must converge to a point in A. But this is precisely the definition of sequential compactness. It now follows from Proposition 1.63 that A is compact.


Finally, we prove the Extreme Value Theorem.

Theorem 1.66 (Extreme value theorem). Suppose that X is compact and f : X → R is continuous. Then f attains a maximum and minimum on X.

Proof. Since f is continuous, f(X) is compact by Proposition 1.59. By Propositions 1.60 and 1.61, f(X) is a closed and bounded subset of R. Let m = inf f(X) and M = sup f(X); boundedness implies that both m and M are finite. Additionally, since both m and M belong to the closure of f(X), closedness implies that m and M are elements of f(X). Thus f attains a minimum m and a maximum M on X.

Specializing this to R, we see the Extreme Value Theorem as we normally encounter it in calculus.

Corollary 1.67. Let [a, b] be a closed interval in R. If f : [a, b] → R is continuous, then f attains a maximum and minimum on [a, b].

Proof. [a, b] is compact by Theorem 1.65. The result then follows from Theorem 1.66.

Compactness allows us to prove a stronger form of continuity, called uniform continuity.

Definition 1.68. A function f : X → Y is uniformly continuous if for any ε > 0, there exists some δ > 0 such that for all x, x′ ∈ X, d_X(x, x′) < δ ⇒ d_Y(f(x), f(x′)) < ε.¹⁰

Proposition 1.69. Let X be a compact metric space and f : X → Y be continuous. Then f isuniformly continuous.

Proof. Take any ε > 0. Since f is continuous, for each x ∈ X there exists some δ_x > 0 such that d_X(x, x′) < δ_x ⇒ d_Y(f(x), f(x′)) < ε/2.

Now take the open cover of X consisting of the open balls B_{δ_x/2}(x) for all x ∈ X. Since X is compact, there is a finite subcover of balls centered around points x_1, . . . , x_n. Let δ = min(δ_{x_1}, . . . , δ_{x_n})/2.

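Take any x, x′ ∈ X such that d_X(x, x′) < δ. x is in some ball B_{δ_{x_i}/2}(x_i) in the finite subcover, where 1 ≤ i ≤ n. Then d_Y(f(x_i), f(x)) < ε/2. Further, observe that

\begin{align*}
d_X(x_i, x') &\le d_X(x_i, x) + d_X(x, x') \\
&< \delta_{x_i}/2 + \delta \\
&\le \delta_{x_i}/2 + \delta_{x_i}/2 \\
&= \delta_{x_i}
\end{align*}

Thus d_Y(f(x_i), f(x′)) < ε/2 as well. We conclude that

\begin{align*}
d_Y(f(x), f(x')) &\le d_Y(f(x), f(x_i)) + d_Y(f(x_i), f(x')) \\
&< \varepsilon/2 + \varepsilon/2 \\
&= \varepsilon
\end{align*}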

as desired.
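Compactness cannot simply be dropped from Proposition 1.69. The following worked example is an illustration added here, not taken from the lecture: the function f(x) = 1/x is continuous on the non-compact set (0, 1) but is not uniformly continuous there. The particular points x and x′ are one convenient choice.

```latex
% f(x) = 1/x on (0,1) fails uniform continuity with \varepsilon = 1.
% Given any \delta > 0, set
%   x = \min(\delta, \tfrac{1}{2}), \qquad x' = x/2 ,
% so that x, x' \in (0,1). Then
\begin{align*}
|x - x'| &= \frac{x}{2} < \delta, \\
|f(x) - f(x')| &= \left|\frac{1}{x} - \frac{2}{x}\right| = \frac{1}{x} \ge 2 > 1 = \varepsilon .
\end{align*}
% So no single \delta works for all pairs of points in (0,1).
```

At any fixed point p ∈ (0, 1) a suitable δ exists, but it must shrink as p approaches 0, which is exactly the dependence on the point that uniform continuity rules out.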

¹⁰ Note that the definition of uniform continuity closely resembles the definition of continuity, except that the quantifiers are switched. With regular continuity, for all x there exists an appropriate δ; with uniform continuity, there exists a δ that works for all x. Since uniform continuity switches the quantifiers to place the existential quantifier first, it is a stronger concept.
